Placement of Parentheses

Hello,
I have a question about why the parentheses and brackets are placed where they are in Mission 5 of the “Exploring Data with Pandas: Intermediate” course.

Screen Link:

The correct code is:
null_previous_rank = f500.loc[f500.loc[:,"previous_rank"].isnull()][["company","rank","previous_rank"]]

I tried doing this:
null_previous_rank = f500.loc[:,"previous_rank"].isnull()[["company","rank","previous_rank"]]

I don’t understand why we need the outer brackets (i.e., why the .isnull() expression has to be enclosed in f500.loc[ ]).

My code produces the following error:

<ipython-input-1-e7a1f58459bc> in <module>
      9 f500.loc[f500["previous_rank"] == 0, "previous_rank"] = np.nan
     10 ##INITIAL CODE ENDS
---> 11 null_previous_rank = f500.loc[:,"previous_rank"].isnull()[["company","rank","previous_rank"]]

/dataquest/system/env/python3/lib/python3.8/site-packages/pandas/core/series.py in __getitem__(self, key)
    908             key = check_bool_indexer(self.index, key)
    909 
--> 910         return self._get_with(key)
    911 
    912     def _get_with(self, key):

/dataquest/system/env/python3/lib/python3.8/site-packages/pandas/core/series.py in _get_with(self, key)
    956                 return self._get_values(key)
    957 
--> 958             return self.loc[key]
    959 
    960         return self.reindex(key)

/dataquest/system/env/python3/lib/python3.8/site-packages/pandas/core/indexing.py in __getitem__(self, key)
   1766 
   1767             maybe_callable = com.apply_if_callable(key, self.obj)
-> 1768             return self._getitem_axis(maybe_callable, axis=axis)
   1769 
   1770     def _is_scalar_access(self, key: Tuple):

/dataquest/system/env/python3/lib/python3.8/site-packages/pandas/core/indexing.py in _getitem_axis(self, key, axis)
   1952                     raise ValueError("Cannot index with multidimensional key")
   1953 
-> 1954                 return self._getitem_iterable(key, axis=axis)
   1955 
   1956             # nested tuple slicing

/dataquest/system/env/python3/lib/python3.8/site-packages/pandas/core/indexing.py in _getitem_iterable(self, key, axis)
   1593         else:
   1594             # A collection of keys
-> 1595             keyarr, indexer = self._get_listlike_indexer(key, axis, raise_missing=False)
   1596             return self.obj._reindex_with_indexers(
   1597                 {axis: [keyarr, indexer]}, copy=True, allow_dups=True

/dataquest/system/env/python3/lib/python3.8/site-packages/pandas/core/indexing.py in _get_listlike_indexer(self, key, axis, raise_missing)
   1550             keyarr, indexer, new_indexer = ax._reindex_non_unique(keyarr)
   1551 
-> 1552         self._validate_read_indexer(
   1553             keyarr, indexer, o._get_axis_number(axis), raise_missing=raise_missing
   1554         )

/dataquest/system/env/python3/lib/python3.8/site-packages/pandas/core/indexing.py in _validate_read_indexer(self, key, indexer, axis, raise_missing)
   1638             if missing == len(indexer):
   1639                 axis_name = self.obj._get_axis_name(axis)
-> 1640                 raise KeyError(f"None of [{key}] are in the [{axis_name}]")
   1641 
   1642             # We (temporarily) allow for some missing keys with .loc, except in

KeyError: "None of [Index(['company', 'rank', 'previous_rank'], dtype='object')] are in the [index]"

Hi!
I would start with the output you want this code to produce.

You want to see all the companies that have a null value in the previous_rank column, along with their current and previous ranks. So, you need the rows that meet a certain condition.

To get them, we need to use a boolean mask. A boolean mask is a Series of the same length as the original Series you check against the condition, where each value is either True or False depending on whether the corresponding original value meets the condition.

So, what condition are you looking for?
1. You are looking for null values. The pandas Series.isnull() method checks whether each value of a Series is null.
2. The null values should be in the f500["previous_rank"] column. So, we apply the .isnull() method to this column, getting:

f500["previous_rank"].isnull()
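To make the mask concrete, here is a minimal sketch on a small made-up frame. The four rows below are hypothetical stand-ins, not the real f500 data:

```python
import numpy as np
import pandas as pd

# Hypothetical toy version of f500 (the real dataset is much larger).
f500 = pd.DataFrame(
    {
        "company": ["A", "B", "C", "D"],
        "rank": [1, 2, 3, 4],
        "previous_rank": [2.0, np.nan, 1.0, np.nan],
    }
)

# The mask is a Series of True/False values, one per row of f500.
mask = f500["previous_rank"].isnull()
print(mask.tolist())  # [False, True, False, True]
```

Note that the mask contains only True/False values. It has no "company" or "rank" columns of its own.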

The line above is only a condition (a.k.a. a boolean mask). Now we have to apply it to the dataframe, which we do with the .loc[ ] indexer:

f500.loc[f500["previous_rank"].isnull()]

Thus we get all the rows where previous_rank is null.
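A small sketch of this step, again on a hypothetical toy frame rather than the real dataset. The selected rows keep their original index labels:

```python
import numpy as np
import pandas as pd

# Hypothetical toy version of f500.
f500 = pd.DataFrame(
    {
        "company": ["A", "B", "C", "D"],
        "rank": [1, 2, 3, 4],
        "previous_rank": [2.0, np.nan, 1.0, np.nan],
    }
)

mask = f500["previous_rank"].isnull()

# Passing the mask to .loc keeps only the rows where the mask is True.
null_rows = f500.loc[mask]
print(null_rows["company"].tolist())  # ['B', 'D']
print(null_rows.index.tolist())       # [1, 3]
```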

Only now can you select the columns that contain the companies’ names and their previous and current ranks:

f500.loc[f500["previous_rank"].isnull()][["company","rank","previous_rank"]]
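This also explains the KeyError in the original attempt: without the outer f500.loc[ ], the column list is applied to the boolean Series itself, so pandas looks for "company", "rank" and "previous_rank" among the Series' index labels (the row labels) and finds none. The sketch below reproduces this on a hypothetical toy frame (the "revenues" values are made up) and also shows a one-step .loc[rows, columns] form that should give the same result as the chained version:

```python
import numpy as np
import pandas as pd

# Hypothetical toy version of f500.
f500 = pd.DataFrame(
    {
        "company": ["A", "B", "C", "D"],
        "rank": [1, 2, 3, 4],
        "revenues": [10, 20, 30, 40],
        "previous_rank": [2.0, np.nan, 1.0, np.nan],
    }
)
cols = ["company", "rank", "previous_rank"]
mask = f500.loc[:, "previous_rank"].isnull()

# The original attempt: indexing the mask (a Series) with column names.
raised = False
try:
    mask[cols]
except KeyError as err:
    raised = True
    print("KeyError:", err)

# Chained version from the answer: filter rows, then pick columns.
chained = f500.loc[mask][cols]

# Equivalent one-step version: rows and columns in a single .loc call.
one_step = f500.loc[mask, cols]
print(one_step)
```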

Hi @ksenia.kustanovich,
Thanks so much for the reply! I just had an additional question: what is the difference between boolean indexing and boolean masking?

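I’d say these two concepts are, if not the same, then very closely related: the boolean mask is the Series of True/False values, and boolean indexing is the act of using that mask to select rows.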
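One way to see the relationship, sketched on a hypothetical toy frame: the mask is just data you can store in a variable, while boolean indexing is what happens when you pass that mask to [ ] or .loc[ ].

```python
import numpy as np
import pandas as pd

# Hypothetical toy version of f500.
f500 = pd.DataFrame(
    {
        "company": ["A", "B", "C", "D"],
        "previous_rank": [2.0, np.nan, 1.0, np.nan],
    }
)

# Boolean mask: a Series of True/False values.
mask = f500["previous_rank"].isnull()

# Boolean indexing: using the mask to select rows (two equivalent spellings).
via_brackets = f500[mask]
via_loc = f500.loc[mask]
print(via_brackets.equals(via_loc))  # True
```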

Hi @ksenia.kustanovich. Thanks for your help. I’ve been going over what you wrote and playing around with the code in the mission. I think I’m there in terms of understanding my issue. Thank you again for your very thorough response!
