Plotting filtered & summed values of a dataframe


By accident, I wrote the wrong code for this question, plotting what I thought would be the sum of registered bikes for each weekday:

plt.bar(bike_sharing['weekday'], bike_sharing['registered'])
plt.xticks(ticks=[0, 1, 2, 3, 4, 5, 6], labels=['Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday'], rotation=30)
plt.show()

After I submitted the answer and saw the expected result, I understood my mistake. I should have plotted: plt.bar(weekday_averages['weekday'], weekday_averages['registered'])

But what I don’t understand is why the y-axis values with the code above are at roughly 5,500 to 7,000:

[Image: bikes]

I expected them to be much higher since, as I understood it, I plotted the sum of all registered bikes by weekday.

I double-checked the summed values with print(bike_sharing.groupby('weekday').sum()['registered']), which returns values from about 30,000 to 42,000 for each day.

Why do the bars show values from 5,500 to 7,000 and not the expected summed values? Or rather: what did I plot with the code above, and how would I plot the summed values for each day?

Sorry if this question is phrased awkwardly. This is my first post here :slight_smile:

You are not plotting the sum.

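The above effectively shows the maximum value of registered for each weekday. Matplotlib is not aggregating anything here: plt.bar draws one bar per row of the dataframe, so all bars for the same weekday start at zero and are drawn on top of each other, and only the tallest one stays visible. I can’t, as of now, find documentation which states this explicitly.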

But, if you run

print(bike_sharing.groupby('weekday').max()["registered"])

You will see the values seem to match the plot.
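To see where the maximum comes from, here is a minimal sketch on a made-up dataframe (toy and its numbers are hypothetical stand-ins for bike_sharing, not the real data). It counts the rectangles plt.bar creates and records the tallest one at each x position:

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for bike_sharing: several rows per weekday
toy = pd.DataFrame({
    "weekday":    [0, 0, 0, 1, 1, 1],
    "registered": [10, 40, 25, 5, 30, 15],
})

fig, ax = plt.subplots()
bars = ax.bar(toy["weekday"], toy["registered"])

# One rectangle is created per row, not one per weekday
n_bars = len(bars)

# Find the tallest rectangle at each x position; this is the bar you end up seeing
visible = {}
for rect in bars:
    x = round(rect.get_x() + rect.get_width() / 2)  # centre of the bar
    visible[x] = max(visible.get(x, 0.0), float(rect.get_height()))

# Compare with the grouped maximum and the grouped sum
group_max = toy.groupby("weekday")["registered"].max().to_dict()
group_sum = toy.groupby("weekday")["registered"].sum().to_dict()

print(n_bars)
print(visible)
print(group_max)
print(group_sum)
```

The tallest rectangle per weekday matches the grouped maximum, not the grouped sum, which is why the plot in the question tops out far below the summed values.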

To plot the sums, you will have to group the data by weekday, aggregate registered with sum, and plot that result, similar to how weekday_averages is used in the exercise code for the averages.
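A minimal sketch of that approach could look like this. The small bike_sharing dataframe is made up so the snippet runs on its own (in the exercise the real one is already loaded), and weekday_sums is just a name I picked by analogy with weekday_averages:

```python
import matplotlib.pyplot as plt
import pandas as pd

# Made-up sample data; replace with the real bike_sharing dataframe
bike_sharing = pd.DataFrame({
    "weekday":    [0, 1, 2, 3, 4, 5, 6, 0, 1, 2, 3, 4, 5, 6],
    "registered": [10, 20, 30, 40, 50, 60, 70, 1, 2, 3, 4, 5, 6, 7],
})

# One row per weekday, holding the sum of registered for that day
weekday_sums = bike_sharing.groupby("weekday", as_index=False)["registered"].sum()

# Now there is exactly one bar per weekday, so nothing overlaps
plt.bar(weekday_sums["weekday"], weekday_sums["registered"])
plt.xticks(
    ticks=[0, 1, 2, 3, 4, 5, 6],
    labels=["Sunday", "Monday", "Tuesday", "Wednesday",
            "Thursday", "Friday", "Saturday"],
    rotation=30,
)
plt.show()
```

Passing as_index=False keeps weekday as a regular column, so the result can be indexed the same way as weekday_averages in the question.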
