Probability and Statistics: course_1 - mission: Frequency distributions

Hey Guys!

I am Carlos and I want to share an idea about an exercise in this mission: Probability and Statistics: course_1 - mission: Frequency distributions. I came up with it using what I learned in previous courses, and I think it is worth sharing even though it is really simple.

The suggested answer is the following:

import pandas as pd

wnba = pd.read_csv('wnba.csv')
intervals = pd.interval_range(start = 0, end = 600, freq = 60)
gr_freq_table_10 = pd.Series([0 for _ in range(10)], index = intervals)

# For each value, find the interval that contains it and add 1 to its count
for value in wnba['PTS']:
    for interval in intervals:
        if value in interval:
            gr_freq_table_10.loc[interval] += 1
            break

My idea is the following:

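import pandas as pd

wnba = pd.read_csv('wnba.csv')
intervals = pd.interval_range(start = 0, end = 600, freq = 60)

init_zeros = 10 * [0]
gr_freq_table_10 = pd.Series(init_zeros, index = intervals)

for interval in intervals:
    # Boolean mask: True for values in (left, right], assuming whole-number PTS
    filter_table = wnba['PTS'].between(interval.left + 1, interval.right)
    # The number of rows that pass the filter is the frequency of the interval
    gr_freq_table_10.loc[interval] = wnba['PTS'][filter_table].shape[0]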

Instead of two nested loops, I use only one. After some research I found that each interval produced by pd.interval_range exposes its boundaries as attributes (left and right). I filter the series using the boundaries of each interval, and then I count the elements in the interval as the number of rows in the filtered series. The intervals are open on the left and closed on the right by default, and between() includes both ends, so I start the filter at interval.left + 1. This only matches the suggested answer as long as PTS holds whole numbers.
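
To make the idea easy to try without wnba.csv, here is a minimal sketch on a small made-up series (the values in pts are hypothetical and only stand in for wnba['PTS']). It runs the single-loop approach and, for comparison, a loop-free variant based on pd.cut. That variant is not part of the exercise, just one possible alternative that should give the same table.

```python
import pandas as pd

# Hypothetical stand-in for wnba['PTS']; the real values come from wnba.csv
pts = pd.Series([2, 60, 61, 120, 121, 300, 584, 600], name='PTS')

# Ten intervals of width 60: (0, 60], (60, 120], ..., (540, 600]
intervals = pd.interval_range(start=0, end=600, freq=60)

# Single-loop approach: filter by the boundaries of each interval
loop_table = pd.Series(0, index=intervals)
for interval in intervals:
    in_interval = pts.between(interval.left + 1, interval.right)
    # Summing a boolean mask counts the True values
    loop_table.loc[interval] = in_interval.sum()

# Loop-free alternative: pd.cut assigns every value to its interval,
# value_counts counts them, and sort_index restores the interval order
cut_table = pd.cut(pts, bins=intervals).value_counts().sort_index()

print(loop_table)
print(cut_table)
```

Both tables should contain the same counts. A value of exactly 0 would fall outside every interval in both versions, because the first interval is open on the left.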

Please let me know if I should post this in another category. I am new to Dataquest, but I am loving the courses!

Best regards,

Carlos.
