Why doesn't my code work for "price_criterion"?

My Code:

cheap = affordable_apps["Price"] < 5
reasonable = affordable_apps["Price"] >= 5
cheap_mean = affordable_apps[cheap].mean()["Price"]
affordable_apps.loc[cheap, "price_criterion"] = affordable_apps[cheap].apply(lambda row: 1 if row["Price"] < cheap_mean else 0)

What I expected to happen:
The same result as running:
affordable_apps.loc[cheap, "price_criterion"] = affordable_apps["Price"].apply( lambda price: 1 if price < cheap_mean else 0 )

What actually happened:

TypeError Traceback (most recent call last)
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

TypeError: an integer is required

During handling of the above exception, another exception occurred:

KeyError Traceback (most recent call last)
<ipython-input-1-151ce880c421> in <module>()
     40 reasonable = affordable_apps["Price"] >= 5
     41 cheap_mean = affordable_apps[cheap].mean()["Price"]
---> 42 affordable_apps.loc[cheap, "price_criterion"] = affordable_apps[cheap].apply(lambda row: 1 if row["Price"] < cheap_mean else 0)

/dataquest/system/env/python3/lib/python3.4/site-packages/pandas/core/frame.py in apply(self, func, axis, broadcast, raw, reduce, args, **kwds)
   4875                         f, axis,
   4876                         reduce=reduce,
-> 4877                         ignore_failures=ignore_failures)
   4878             else:
   4879                 return self._apply_broadcast(f, axis)

/dataquest/system/env/python3/lib/python3.4/site-packages/pandas/core/frame.py in _apply_standard(self, func, axis, ignore_failures, reduce)
   4971             try:
   4972                 for i, v in enumerate(series_gen):
-> 4973                     results[i] = func(v)
   4974                     keys.append(v.name)
   4975             except Exception as e:

<ipython-input-1-151ce880c421> in <lambda>(row)
     40 reasonable = affordable_apps["Price"] >= 5
     41 cheap_mean = affordable_apps[cheap].mean()["Price"]
---> 42 affordable_apps.loc[cheap, "price_criterion"] = affordable_apps[cheap].apply(lambda row: 1 if row["Price"] < cheap_mean else 0)

/dataquest/system/env/python3/lib/python3.4/site-packages/pandas/core/series.py in __getitem__(self, key)
    621         key = com._apply_if_callable(key, self)
    622         try:
--> 623             result = self.index.get_value(self, key)
    624 
    625             if not is_scalar(result):

/dataquest/system/env/python3/lib/python3.4/site-packages/pandas/core/indexes/base.py in get_value(self, series, key)
   2558         try:
   2559             return self._engine.get_value(s, k,
-> 2560                                           tz=getattr(series.dtype, 'tz', None))
   2561         except KeyError as e1:
   2562             if len(self) > 0 and self.inferred_type in ['integer', 'boolean']:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

KeyError: ('Price', 'occurred at index App')

Can someone please explain why my code doesn’t work in this scenario? Thank you!

Hi @spi
Basically, you're passing a sub-series that contains boolean values instead of an integer, so pandas can't apply the lambda function.

Hi @alegiraldo666, thank you! Doesn’t .apply work on affordable_apps[cheap] and not just cheap though? I thought that it’s not working on boolean values but a filtered dataframe instead

Yeah, you are right. But you already filtered the data when you used .loc, so in my opinion you need to pass an integer rather than the filtered data again.
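
For what it's worth, the last line of the traceback, KeyError: ('Price', 'occurred at index App'), seems to point at a different cause. DataFrame.apply defaults to axis=0, so the lambda receives each column in turn (starting with App) as a Series indexed by row labels, and row["Price"] then fails because "Price" is not one of those labels. Passing axis=1 hands the lambda one row at a time instead. Below is a minimal sketch of that reading. The sample data is made up for illustration (only the column names come from the post), and cheap_mean is computed from the Price column directly because newer pandas versions may refuse to average text columns.

```python
import pandas as pd

# Hypothetical stand-in for affordable_apps; the values are invented.
affordable_apps = pd.DataFrame({
    "App": ["A", "B", "C", "D"],
    "Price": [1.0, 2.0, 4.5, 6.0],
})

cheap = affordable_apps["Price"] < 5
# Mean of the cheap prices only: (1.0 + 2.0 + 4.5) / 3 = 2.5
cheap_mean = affordable_apps.loc[cheap, "Price"].mean()

# Default axis=0: the lambda is called once per COLUMN, so "row" is really
# a column Series indexed by row labels, and row["Price"] raises KeyError.
try:
    affordable_apps[cheap].apply(lambda row: 1 if row["Price"] < cheap_mean else 0)
    default_axis_failed = False
except KeyError:
    default_axis_failed = True

# axis=1: the lambda is called once per ROW, so row["Price"] is a scalar.
affordable_apps.loc[cheap, "price_criterion"] = affordable_apps[cheap].apply(
    lambda row: 1 if row["Price"] < cheap_mean else 0, axis=1
)

# A vectorized comparison on the Price column gives the same flags without apply.
vectorized = (affordable_apps.loc[cheap, "Price"] < cheap_mean).astype(int)
```

The version in "What I expected to happen" works because Series.apply always passes one value at a time, so there is no axis to get wrong. Rows where cheap is False are left as NaN in price_criterion in both versions.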