Statistics. Frequency distribution 5 of 13

Hi. I am having trouble understanding the solution to this problem. My code is below:
"""
import pandas as pd

wnba = pd.read_csv('wnba.csv')
proportions = wnba["Age"].value_counts(normalize=True)
percentages = proportions * 100

proportion_25 = proportions[25]
percentage_30 = percentages.loc[30]

percentage_over_30 = percentages.loc[30:].sum()

percentage_below_23 = percentages.loc[:23].sum()
"""

However, the solution provided is:

"""
import pandas as pd

wnba = pd.read_csv('wnba.csv')
percentages = wnba['Age'].value_counts(normalize=True).sort_index() * 100
proportion_25 = percentages[25] / 100
percentage_30 = percentages[30]
percentage_over_30 = percentages.loc[30:].sum()
percentage_below_23 = percentages.loc[:23].sum()
"""
Why should I use “sort_index()” in this solution? I know what sort_index() does, but I can’t see why it is necessary here, since I am using “.loc”, which selects by label. Can someone please explain?

Hi @teemakai

I have attached a notebook below that shows why we have to sort the index, and what happens when you use loc with and without sorting it.
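
In short: value_counts() returns its result ordered by frequency, not by age, and on an unsorted index a slice like .loc[30:] runs from the row labelled 30 to the last row in whatever order the rows happen to be stored. Here is a minimal sketch with made-up ages (not the wnba.csv data; the variable names are just for illustration) that shows the difference:

```python
import pandas as pd

# Made-up ages for illustration only; this is NOT the wnba.csv data.
ages = pd.Series([25, 25, 25, 25, 30, 30, 30, 22, 22, 34], name="Age")

# value_counts() orders rows by frequency (most common first),
# so the index labels are not in age order.
unsorted_pct = ages.value_counts(normalize=True) * 100
sorted_pct = unsorted_pct.sort_index()

print(unsorted_pct.index.tolist())  # [25, 30, 22, 34]
print(sorted_pct.index.tolist())    # [22, 25, 30, 34]

# Unsorted: the slice starts at the row labelled 30 and takes every row
# stored after it, which here includes age 22.
wrong_over_30 = unsorted_pct.loc[30:].sum()
# Sorted: the slice keeps only labels 30 and above.
right_over_30 = sorted_pct.loc[30:].sum()

# The same problem in the other direction: everything stored up to the
# row labelled 22, versus only labels up to 22.
wrong_up_to_22 = unsorted_pct.loc[:22].sum()
right_up_to_22 = sorted_pct.loc[:22].sum()

print(round(wrong_over_30, 2), round(right_over_30, 2))
print(round(wrong_up_to_22, 2), round(right_up_to_22, 2))
```

So .loc does use labels, but only to find where the slice starts and stops. Which rows fall in between depends on the order of the index, and that is why the solution calls sort_index() first.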

Let me know if anything is unclear.

loc[] method check.ipynb (8.5 KB)
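
One more thing you may run into while experimenting: on an unsorted index, slicing with loc only works when the boundary label is actually present. If the label is missing, pandas raises a KeyError, whereas a sorted index simply slices at the position where the label would fall. A small sketch, again with made-up ages rather than the real dataset:

```python
import pandas as pd

# Made-up ages for illustration only; note that 23 never appears.
ages = pd.Series([25, 25, 25, 25, 30, 30, 30, 22, 22, 34], name="Age")
unsorted_pct = ages.value_counts(normalize=True) * 100
sorted_pct = unsorted_pct.sort_index()

# A quick way to check whether label slices will behave like range queries.
print(unsorted_pct.index.is_monotonic_increasing)  # False
print(sorted_pct.index.is_monotonic_increasing)    # True

# Sorted index: a missing boundary label is fine, the slice keeps labels <= 23.
below_23_sorted = sorted_pct.loc[:23].sum()

# Unsorted index: the same slice fails because label 23 cannot be located.
try:
    unsorted_pct.loc[:23]
    missing_label_error = None
except KeyError as exc:
    missing_label_error = exc

print(round(below_23_sorted, 2))
print(type(missing_label_error).__name__)
```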