if x < cheap_mean:
What I expected to happen:
Create a new column called ‘price_criterion’ which applies 1 or 0 values based on the price
What actually happened:
- The new column does not show up in the affordable_apps table at all
- We have been taught to use df["new column"] = … as our basis when creating a new column. Why are we told to use .loc[row, column] in this case? What difference does it make?
Answering this requires some understanding of how pandas works internally (the underlying code).
The approach of using affordable_apps[cheap]["price_criterion"] is called chained indexing. To keep it simple and brief: affordable_apps[cheap] returns a temporary object, so the assignment can end up creating the column on that temporary object and not an actual column in affordable_apps.
This behavior is also related to the SettingWithCopyWarning mentioned earlier in the content.
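To make the chained-indexing problem concrete, here is a minimal sketch. The tiny DataFrame, the "Price" column name, and the threshold of 3 are made up for illustration and are not the course's actual data; the warning is silenced only to keep the output clean.

```python
import warnings
import pandas as pd

# Hypothetical stand-in for affordable_apps (illustrative values only)
affordable_apps = pd.DataFrame({"Price": [0.99, 2.99, 4.99, 1.49]})
cheap = affordable_apps["Price"] < 3  # boolean mask; threshold is an assumption

with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # hide the chained-assignment warning
    # Chained indexing: affordable_apps[cheap] builds a temporary object,
    # and the new column is attached to that temporary object.
    affordable_apps[cheap]["price_criterion"] = 0

# The original DataFrame never received the column.
column_created = "price_criterion" in affordable_apps.columns
print(column_created)  # False
```

The temporary object is discarded right after the assignment, which is why the column never appears in the original table.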
In the end, it comes down to how pandas is designed to behave. Using .loc, as indicated, lets you index rows with the mask and create (or later modify) the new column in a single indexing operation on the original DataFrame.
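As a rough sketch of the .loc version, assuming the same made-up DataFrame and "Price" column as a stand-in for the real data (the lambda mirrors the 1-or-0 rule from the question and is not necessarily the exact code from the exercise):

```python
import pandas as pd

# Hypothetical stand-in for affordable_apps (illustrative values only)
affordable_apps = pd.DataFrame({"Price": [0.99, 2.99, 4.99, 1.49]})
cheap = affordable_apps["Price"] < 3  # boolean mask; threshold is an assumption
cheap_mean = affordable_apps.loc[cheap, "Price"].mean()

# One .loc call selects the masked rows AND names the new column,
# so the assignment is made on affordable_apps itself.
affordable_apps.loc[cheap, "price_criterion"] = affordable_apps["Price"].apply(
    lambda price: 1 if price < cheap_mean else 0
)

print(affordable_apps)
```

Rows outside the mask are left as NaN in the new column, since only the rows selected by cheap are assigned a value.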
If you wish to get a deeper understanding of this, you can check out these two resources -