Row filtering dataset

Screen Link:

My Code:

data['demographics'] = data['demographics'][data['demographics']['schoolyear'] == 20112012]

My doubt is: why do we need to write data['demographics'] more than once?
I understand that we are filtering the demographics CSV, but why does data['demographics'] have to be repeated?

It wouldn’t be necessary to repeat data['demographics'] if data were a dataframe. But in this mission data is actually a dictionary whose values are dataframes. Here data['demographics'] is not a column of a dataframe but an entire dataframe. So any time you’d like to refer to one of its columns, you first have to refer to the corresponding value of the data dictionary, and only then to a specific column.
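To make the role of each data['demographics'] visible, here is a minimal sketch that splits the one-liner into steps. The small dataframe and its DBN values are made up for illustration; only the schoolyear column and the 20112012 value come from the mission code.

```python
import pandas as pd

# Hypothetical stand-in for the mission's data: a dict whose values are dataframes.
data = {
    "demographics": pd.DataFrame({
        "schoolyear": [20052006, 20112012, 20112012],
        "DBN": ["01M015", "01M015", "01M019"],  # made-up sample values
    })
}

# 1. Dictionary lookup, then column lookup: build a True/False mask per row.
mask = data["demographics"]["schoolyear"] == 20112012

# 2. Dictionary lookup again: select the rows of that dataframe where the mask is True.
filtered = data["demographics"][mask]

# 3. Store the filtered dataframe back under the same dictionary key.
data["demographics"] = filtered

print(type(data).__name__)        # dict
print(len(data["demographics"]))  # 2
```

Each step needs data['demographics'] because that expression is the dataframe itself; data on its own is only the container.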

If we first saved it to its own variable, demographics, the code would look more familiar:
demographics = demographics[demographics['schoolyear'] == 20112012]
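One thing to keep in mind with that variant, shown as a rough sketch with made-up sample rows: reassigning the demographics variable does not change what is stored in the dictionary, so the filtered dataframe has to be written back if the rest of the code reads from data.

```python
import pandas as pd

# Hypothetical sample data; only the column name and the filter value come from the thread.
data = {"demographics": pd.DataFrame({"schoolyear": [20052006, 20112012, 20112012]})}

# Pull the dataframe out of the dictionary into its own variable.
demographics = data["demographics"]

# Familiar single-dataframe filtering; this rebinds the name to a new, filtered dataframe.
demographics = demographics[demographics["schoolyear"] == 20112012]

# The dictionary still holds the original, unfiltered dataframe at this point.
rows_before = len(data["demographics"])  # 3

# Write the filtered dataframe back under the same key.
data["demographics"] = demographics
rows_after = len(data["demographics"])   # 2

print(rows_before, rows_after)
```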

Hope that makes it clear.


Makes a lot of sense!


You are welcome!

