.loc - why does it throw an error

Hey there folks :) I got a bit brave and started playing around with mission 7 in Pandas. Basically, I thought that omitting .loc was just a shortcut. Is this right? In the second line of my code I used .loc but got an error (included at the bottom of this message), yet the third line, which also uses .loc, is correct. I don't quite understand why.

```
countries = f500["country"]
revenues_years = f500.loc[["revenues","years_on_global_500_list"]]
ceo_to_sector = f500.loc[:,"ceo":"sector"]
```

the correct code is:

```
countries = f500["country"]
revenues_years = f500[["revenues", "years_on_global_500_list"]]
ceo_to_sector = f500.loc[:, "ceo":"sector"]
```

Aren't the double brackets just part of the shortcut when omitting .loc?

error message:
KeyError Traceback (most recent call last)
/dataquest/system/env/python3/lib/python3.4/site-packages/pandas/core/indexing.py in _has_valid_type(self, key, axis)
1505 if not ax.contains(key):
-> 1506 error()
1507 except TypeError as e:

/dataquest/system/env/python3/lib/python3.4/site-packages/pandas/core/indexing.py in error()
1500 .format(key=key,
-> 1501 axis=self.obj._get_axis_name(axis)))
1502

KeyError: 'the label [revenues] is not in the [index]'

During handling of the above exception, another exception occurred:

KeyError Traceback (most recent call last)
in ()
1 countries = f500["country"]
----> 2 revenues_years = f500.loc["revenues","years_on_global_500_list"]
3 ceo_to_sector = f500.loc[:,"ceo":"sector"]

/dataquest/system/env/python3/lib/python3.4/site-packages/pandas/core/indexing.py in __getitem__(self, key)
1365 except (KeyError, IndexError):
1366 pass
-> 1367 return self._getitem_tuple(key)
1368 else:
1369 # we by definition only have the 0th axis

/dataquest/system/env/python3/lib/python3.4/site-packages/pandas/core/indexing.py in _getitem_tuple(self, tup)
856 def _getitem_tuple(self, tup):
857 try:
--> 858 return self._getitem_lowerdim(tup)
859 except IndexingError:
860 pass

/dataquest/system/env/python3/lib/python3.4/site-packages/pandas/core/indexing.py in _getitem_lowerdim(self, tup)
989 for i, key in enumerate(tup):
990 if is_label_like(key) or isinstance(key, tuple):
--> 991 section = self._getitem_axis(key, axis=i)
992
993 # we have yielded a scalar ?

/dataquest/system/env/python3/lib/python3.4/site-packages/pandas/core/indexing.py in _getitem_axis(self, key, axis)
1624
1625 # fall thru to straight lookup
-> 1626 self._has_valid_type(key, axis)
1627 return self._get_label(key, axis=axis)
1628

/dataquest/system/env/python3/lib/python3.4/site-packages/pandas/core/indexing.py in _has_valid_type(self, key, axis)
1512 raise
1513 except:
-> 1514 error()
1515
1516 return True

/dataquest/system/env/python3/lib/python3.4/site-packages/pandas/core/indexing.py in error()
1499 raise KeyError(u"the label [{key}] is not in the [{axis}]"
1500 .format(key=key,
-> 1501 axis=self.obj._get_axis_name(axis)))
1502
1503 try:

KeyError: 'the label [revenues] is not in the [index]'

KeyError Traceback (most recent call last)
in ()
1 countries = f500["country"]
----> 2 revenues_years = f500.loc[["revenues","years_on_global_500_list"]]
3 ceo_to_sector = f500.loc[:,"ceo":"sector"]

/dataquest/system/env/python3/lib/python3.4/site-packages/pandas/core/indexing.py in __getitem__(self, key)
1371
1372 maybe_callable = com._apply_if_callable(key, self.obj)
-> 1373 return self._getitem_axis(maybe_callable, axis=axis)
1374
1375 def _is_scalar_access(self, key):

/dataquest/system/env/python3/lib/python3.4/site-packages/pandas/core/indexing.py in _getitem_axis(self, key, axis)
1614 raise ValueError('Cannot index with multidimensional key')
1615
-> 1616 return self._getitem_iterable(key, axis=axis)
1617
1618 # nested tuple slicing

/dataquest/system/env/python3/lib/python3.4/site-packages/pandas/core/indexing.py in _getitem_iterable(self, key, axis)
1113
1114 if self._should_validate_iterable(axis):
-> 1115 self._has_valid_type(key, axis)
1116
1117 labels = self.obj._get_axis(axis)

/dataquest/system/env/python3/lib/python3.4/site-packages/pandas/core/indexing.py in _has_valid_type(self, key, axis)
1470 raise KeyError(
1471 u"None of [{key}] are in the [{axis}]".format(
-> 1472 key=key, axis=self.obj._get_axis_name(axis)))
1473 else:
1474

KeyError: "None of [['revenues', 'years_on_global_500_list']] are in the [index]"

I struggled a lot with .loc when I was first learning Pandas. I think the summary table on that mission screen page is SUPER helpful. Also, I was looking over my personal notes and I found this:


df.loc['row_label', 'column_label']

  • you can use : in place of row or column label to select all the rows/columns
  • Note: df.loc[:, 'column_name'] is exactly the same as df['column_name'].
    The second one is the more common usage.
  • To select multiple rows/columns, replace the row/column label with a list of selected rows/columns
    • example: df.loc[:, ['col1', 'col2']] or df[['col1', 'col2']]
    • example: column slice df.loc[:, 'col1':'col5']
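  • To select rows by label, use .loc. The : for columns can be left out, so df.loc['row_label'] works, but .loc itself can't. (Plain df[...] only touches rows when you give it a slice or a boolean mask.)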
  • row axis == ‘index axis’
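
Here's roughly how those notes look on a tiny DataFrame. This is only a sketch: `df`, its column names, and its row labels are made up for illustration and are not the f500 data.

```python
import pandas as pd

# Hypothetical toy data; the labels and values are invented for illustration.
df = pd.DataFrame(
    {
        "col1": [1, 2, 3],
        "col2": [4, 5, 6],
        "col3": [7, 8, 9],
    },
    index=["row_a", "row_b", "row_c"],
)

# Single column: the .loc form and the bracket shortcut return the same Series.
single_loc = df.loc[:, "col1"]
single_short = df["col1"]

# List of columns: the .loc form needs ":" in the row position.
multi_loc = df.loc[:, ["col1", "col2"]]
multi_short = df[["col1", "col2"]]

# Column slice by label; with .loc the end label is included.
col_slice = df.loc[:, "col1":"col2"]

# Rows by label: .loc is required, but the column part can be omitted.
one_row = df.loc["row_a"]
two_rows = df.loc[["row_a", "row_c"]]
```

The thing to notice is that whatever comes first inside `.loc[...]` is matched against row labels, which is why the column forms all start with `:`.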

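Maybe that helps. I think the reason you got the error is that .loc treats the first thing you give it as row labels. With f500.loc[["revenues", "years_on_global_500_list"]] it looks for those names in the row index, and they aren't there because they're column names. Putting : in the row position, as in f500.loc[:, ["revenues", "years_on_global_500_list"]], tells .loc that the list is for the columns.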
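
To make that concrete, here is a small sketch that reproduces the same kind of KeyError and then fixes it. `f500_demo` is a hypothetical stand-in with invented company names and numbers; only the two column names are taken from the question.

```python
import pandas as pd

# Hypothetical stand-in for f500; row labels and values are made up.
f500_demo = pd.DataFrame(
    {"revenues": [100, 200], "years_on_global_500_list": [5, 10]},
    index=["Company A", "Company B"],
)

cols = ["revenues", "years_on_global_500_list"]

# A list in the first position is looked up in the row index,
# so column names there raise a KeyError.
try:
    f500_demo.loc[cols]
    raised = False
except KeyError:
    raised = True

# With ":" for the rows, the same list selects columns.
revenues_years = f500_demo.loc[:, cols]

# The bracket shortcut from the accepted answer gives the same result.
same_as_shortcut = revenues_years.equals(f500_demo[cols])
```

So the double brackets aren't part of the shortcut by themselves; they just hold a list of labels, and what matters is whether that list lands in the row position or the column position.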


Hey there april.g :)
Oh yeah, so the "mandatory" use of .loc is mostly related to rows rather than columns, is that about right?