Screen Link: https://app.dataquest.io/m/381/exploring-data-with-pandas%3A-fundamentals/5/method-chaining
zero_previous_rank = f500.loc[:,"previous_rank"].value_counts().loc['0']
What I expected to happen:
I expected an output of 33, but got a KeyError. I don’t understand why it wouldn’t accept “0”, which looks like the label of the row that I want when I look at
It seems to me that the correct answer (below) could give a wrong answer (though happens not to) as we don’t know a priori that the “0” row is in the 0 index. For example the “159” row has index 1.
zero_previous_rank = f500["previous_rank"].value_counts().loc[0]
Welcome to the Community!
Everything is OK with your piece of code, except for that “0”. Here it’s the index of the first row of the series, just according to your code. This 0 is an integer, not a string, here. Hence, you should pass the integer 0 to loc to select the first (well, actually the “zero-th”, considering Python’s indexing) row of that series.
Thanks @Elena_Kosourova !
I’m still a little confused. The line
zero_previous_rank = f500.loc[:,"previous_rank"].value_counts().loc[490]
returns 1, despite the fact that the series is shorter than 490. So it seems like it is returning the element with the label 490 not with index 490 (because then it should throw an error).
So say I for some reason wanted the element with index 1. It seems like the only way to do this would be to print
Name: previous_rank, Length: 468, dtype: int64
then observe that the 2nd “row” is labeled with 159 and then evaluate
So is there a better way to select from a series by index? Or am I wrong in thinking that loc is using the label and not the index - though in that case how would I look up by label (if the label is an int).
I’ll start with your last question:
The series f500["previous_rank"].value_counts() is not very illustrative, so let’s consider another one whose labels are also integers.
whose output is
Name: years_on_global_500_list, Length: 23, dtype: int64
Try the following methods in your terminal and see what they return:
Can you tell now which method selects an element of a series by its label and which one by its index?
If it seems somewhat complicated, don’t worry. Selecting by index is covered in the next course of the path, Exploring Data with pandas: Intermediate.
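For anyone who wants to experiment without the f500 data, here is a minimal sketch of the label-versus-position distinction. The small Series below is made up for illustration (its labels 0, 159 and 490 only mimic the kind of integer labels discussed in this thread); .loc and .iloc are the standard pandas accessors.

```python
import pandas as pd

# Hypothetical stand-in for a value_counts() result with integer labels
counts = pd.Series([33, 2, 1], index=[0, 159, 490], name="previous_rank")

# .loc looks up by label: the row whose label is 490
by_label = counts.loc[490]

# .iloc looks up by position: the second row, whatever its label is
by_position = counts.iloc[1]

# Position 490 does not exist in a three-row Series, so .iloc raises IndexError
try:
    counts.iloc[490]
    out_of_bounds = False
except IndexError:
    out_of_bounds = True

print(by_label, by_position, out_of_bounds)
```

With this made-up data, .loc[490] finds a row even though the Series has only three rows, while .iloc[490] fails, which matches the behavior described in the question.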
And now coming back to your original question:
Why use the index, 0, instead of the label, “0”
I guess you were confused by the fact that in this mission .loc takes an integer as its argument and not a string. But no matter what type of value you pass, it’s still going to be a label, not an index.
pandas.Series.value_counts() returns a Series whose values are the counts of unique values and whose labels are those unique values of the Series on which you call .value_counts(). So, if the original values were strings, the labels of the .value_counts() Series will be strings, and if they were integers, they will remain integers.
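A short sketch of this point, using made-up values rather than the real f500 column: the same data counted once as integers and once as strings gives different label types, and only the matching type works with .loc.

```python
import pandas as pd

# Hypothetical ranks; the real f500["previous_rank"] column is assumed to hold integers
int_ranks = pd.Series([0, 0, 159, 490, 0])
str_ranks = int_ranks.astype(str)

int_counts = int_ranks.value_counts()  # labels are integers: 0, 159, 490
str_counts = str_ranks.value_counts()  # labels are strings: "0", "159", "490"

zero_from_int = int_counts.loc[0]      # integer label on an integer index
zero_from_str = str_counts.loc["0"]    # string label on a string index

# A string label on the integer-labeled Series raises KeyError
try:
    int_counts.loc["0"]
    raised_key_error = False
except KeyError:
    raised_key_error = True

print(zero_from_int, zero_from_str, raised_key_error)
```

Checking the index dtype of the value_counts() result (for example with int_counts.index.dtype) is a quick way to see which kind of label .loc expects.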