Question about new 'price_criterion' column

Hi, I tried to use the following code for the exercise, but it doesn’t work. Could someone please tell me why? Thank you very much!

My Code:

cheap = affordable_apps["Price"] < 5
reasonable = affordable_apps["Price"] >= 5

cheap_mean = affordable_apps[cheap]['Price'].mean()
affordable_apps['price_criterion'] = affordable_apps[cheap].apply(lambda row: 1 if row['Price'] < cheap_mean else 0)

What I expected to happen:
Create a new ‘price_criterion’ column that contains 1 where the price is lower than cheap_mean and 0 otherwise.

What actually happened:

TypeError Traceback (most recent call last)
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

TypeError: an integer is required

During handling of the above exception, another exception occurred:

KeyError Traceback (most recent call last)
<ipython-input-1-01ac4bdbeb87> in <module>()
      3 
      4 cheap_mean = affordable_apps[cheap]['Price'].mean()
----> 5 affordable_apps['price_criterion'] = affordable_apps[cheap].apply(lambda row: 1 if row['Price'] < cheap_mean else 0)

/dataquest/system/env/python3/lib/python3.4/site-packages/pandas/core/frame.py in apply(self, func, axis, broadcast, raw, reduce, args, **kwds)
   4875                         f, axis,
   4876                         reduce=reduce,
-> 4877                         ignore_failures=ignore_failures)
   4878             else:
   4879                 return self._apply_broadcast(f, axis)

/dataquest/system/env/python3/lib/python3.4/site-packages/pandas/core/frame.py in _apply_standard(self, func, axis, ignore_failures, reduce)
   4971             try:
   4972                 for i, v in enumerate(series_gen):
-> 4973                     results[i] = func(v)
   4974                     keys.append(v.name)
   4975             except Exception as e:

<ipython-input-1-01ac4bdbeb87> in <lambda>(row)
      3 
      4 cheap_mean = affordable_apps[cheap]['Price'].mean()
----> 5 affordable_apps['price_criterion'] = affordable_apps[cheap].apply(lambda row: 1 if row['Price'] < cheap_mean else 0)

/dataquest/system/env/python3/lib/python3.4/site-packages/pandas/core/series.py in __getitem__(self, key)
    621         key = com._apply_if_callable(key, self)
    622         try:
--> 623             result = self.index.get_value(self, key)
    624 
    625             if not is_scalar(result):

/dataquest/system/env/python3/lib/python3.4/site-packages/pandas/core/indexes/base.py in get_value(self, series, key)
   2558         try:
   2559             return self._engine.get_value(s, k,
-> 2560                                           tz=getattr(series.dtype, 'tz', None))
   2561         except KeyError as e1:
   2562             if len(self) > 0 and self.inferred_type in ['integer', 'boolean']:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

KeyError: ('Price', 'occurred at index App')


Hi @lhc1412,

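The issue is in the last line of your code. When .apply is called on a DataFrame, it passes each column (not each row) to the lambda by default, so row['Price'] raises the KeyError.
Apply your lambda function directly to the 'Price' column instead, as mentioned in Step 2: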

affordable_apps['price_criterion'] = affordable_apps[cheap]['Price'].apply(
    lambda x: 1 if x < cheap_mean else 0)
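
For anyone who wants to try this outside the exercise, here is a minimal sketch of the same fix. The four-row DataFrame is made-up toy data standing in for the real affordable_apps, so the numbers are only illustrative.

```python
import pandas as pd

# Hypothetical toy data; the real affordable_apps comes from the exercise.
affordable_apps = pd.DataFrame({
    "App": ["a", "b", "c", "d"],
    "Price": [1.0, 2.0, 4.5, 8.0],
})

cheap = affordable_apps["Price"] < 5
cheap_mean = affordable_apps[cheap]["Price"].mean()  # mean of 1.0, 2.0, 4.5

# Series.apply passes each price to the lambda as a plain value (x).
affordable_apps["price_criterion"] = affordable_apps[cheap]["Price"].apply(
    lambda x: 1 if x < cheap_mean else 0
)

print(affordable_apps)
```

Rows outside the cheap mask (here the app priced at 8.0) are not part of the applied Series, so pandas fills them with NaN when it aligns the result on the index.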

Hello @lhc1412,


This looks good:

cheap = affordable_apps["Price"] < 5
reasonable = affordable_apps["Price"] >= 5

cheap_mean = affordable_apps[cheap]['Price'].mean()

I would like to focus on this line:

affordable_apps['price_criterion'] = affordable_apps[cheap].apply(lambda row: 1 if row['Price'] < cheap_mean else 0)

I suppose you want to .apply the condition to the whole column rather than to the filtered subset.
When you do affordable_apps['price_criterion'] = affordable_apps[cheap].apply(...), you’re subsetting the DataFrame’s rows with the boolean mask cheap, so any row outside the mask gets no value (NaN) in the new column, while you should be using the whole column.

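Remove the mask cheap, and pass axis=1 so that the lambda receives each row. By default, DataFrame.apply passes each column instead, which is what raises the KeyError at index App. It should look like:
affordable_apps['price_criterion'] = affordable_apps.apply(lambda row: 1 if row['Price'] < cheap_mean else 0, axis=1)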
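
As a rough illustration of how the axis argument changes what the lambda receives, here is a small sketch on made-up toy data (not the exercise's dataset). The vectorized comparison at the end is an alternative I am adding for comparison and is not part of the exercise.

```python
import pandas as pd

# Hypothetical toy data standing in for the exercise's affordable_apps.
affordable_apps = pd.DataFrame({
    "App": ["a", "b", "c", "d"],
    "Price": [1.0, 2.0, 4.5, 8.0],
})
cheap = affordable_apps["Price"] < 5
cheap_mean = affordable_apps[cheap]["Price"].mean()

# Default axis=0: the lambda receives each COLUMN as a Series, so looking up
# the label "Price" inside the "App" column fails with a KeyError.
try:
    affordable_apps.apply(lambda row: 1 if row["Price"] < cheap_mean else 0)
    raised_key_error = False
except KeyError:
    raised_key_error = True

# axis=1: the lambda receives each ROW, so row["Price"] is a single price.
affordable_apps["price_criterion"] = affordable_apps.apply(
    lambda row: 1 if row["Price"] < cheap_mean else 0, axis=1
)

# Vectorized alternative that avoids apply altogether.
vectorized = (affordable_apps["Price"] < cheap_mean).astype(int)

print(affordable_apps)
```

With the whole DataFrame, every row gets a 0 or 1, including the apps priced at 5 or more.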


Before wrapping up, may I also suggest a small tweak to:

cheap = affordable_apps["Price"] < 5
reasonable = affordable_apps["Price"] >= 5

Instead of creating another boolean-mask reasonable, try using ~cheap for filtering.
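
A quick sketch of the ~cheap idea, again on made-up toy data rather than the exercise's dataset:

```python
import pandas as pd

# Hypothetical toy data standing in for the exercise's affordable_apps.
affordable_apps = pd.DataFrame({
    "App": ["a", "b", "c", "d"],
    "Price": [1.0, 2.0, 4.5, 8.0],
})

cheap = affordable_apps["Price"] < 5
reasonable = ~cheap  # element-wise NOT of the boolean mask

print(affordable_apps[reasonable])
```

One caveat: the two forms only match when there are no missing prices. A NaN price is False under both < 5 and >= 5, so ~cheap would mark it True while the explicit >= 5 mask would not.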


Hope this clarifies!