Statistics Fundamentals in Python


  1. In the first mission (Sampling), screen 8, I did not understand this code:

print(wnba['Games Played'].value_counts(bins=3, normalize=True) * 100)

How are the percentages calculated here?

  2. Why are only the intervals below selected?
    (22.0, 32.0] 72.727273
    (12.0, 22.0] 18.181818
    (1.969, 12.0] 9.090909
    Name: Games Played, dtype: fl

Hi! Two parameters of value_counts() are doing the heavy lifting here: normalize=True produces the proportions behind the percentages, and bins=3 produces the intervals.

  1. A percentage is a proportion multiplied by 100. A proportion is the number of items in a category divided by the total number of items in the series.
    When normalize=True is passed to value_counts(), it returns these proportions instead of raw counts. The * 100 in your code then turns the proportions into percentages.
  2. The bins=3 parameter takes the numerical data in the series and groups it into 3 equal-width intervals that span the smallest to the largest value. value_counts() then counts how many values fall into each interval rather than counting each distinct value. This is why you see exactly 3 intervals in your result: the data was grouped into 3 intervals, the values in each interval were counted, and the counts were converted to proportions. The lowest edge is shown as 1.969 because pandas places it slightly below the smallest value in the data, so that the smallest value still falls inside the first interval, which is open on the left.
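
If it helps to see both steps spelled out, here is a minimal sketch. The `games` series is made-up data (it is not the actual wnba sample), chosen only so that the proportions come out the same as in your output. I am also assuming that pd.cut() with bins=3 is a fair way to reproduce the grouping by hand.

```python
import pandas as pd

# Made-up stand-in for wnba['Games Played'] (NOT the real sample):
# 11 values ranging from 2 to 32.
games = pd.Series([2, 14, 20, 25, 28, 30, 30, 31, 32, 32, 32],
                  name="Games Played")

# The one-liner from the question: bin, count, normalize, scale to percent.
percentages = games.value_counts(bins=3, normalize=True) * 100
print(percentages)

# The same result by hand.
# Step 1: assign each value to one of 3 equal-width intervals.
binned = pd.cut(games, bins=3)

# Step 2: count how many values landed in each interval.
counts = binned.value_counts()

# Step 3: divide each count by the total to get proportions,
# then multiply by 100 to get percentages.
manual = counts / len(games) * 100
print(manual)
```

With this data the three intervals hold 8, 2 and 1 values out of 11, which gives roughly 72.73%, 18.18% and 9.09%.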

I hope that helps!