Why do we use .agg() with the function dif(group) if we are not performing multiple aggregations?


The solution to this exercise uses:

import numpy as np

# Group by region and select the column to aggregate
grouped = happiness2015.groupby('Region')
happy_grouped = grouped['Happiness Score']

# Custom aggregation: gap between a group's maximum and its mean
def dif(group):
    return group.max() - group.mean()

happy_mean_max = happy_grouped.agg([np.mean, np.max])
mean_max_dif = happy_grouped.agg(dif)

I understand that .agg() is used for multiple aggregations, which is why I believe something like

happy_mean_max = happy_grouped.agg([np.mean, np.max, dif])

would be correct too.
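
For what it's worth, a quick check on a small made-up DataFrame suggests the list form does accept a custom function alongside the built-in ones. This is only a sketch: the data below is invented (not the real happiness2015), and the string names 'mean' and 'max' stand in for np.mean and np.max.

```python
import pandas as pd

# Made-up stand-in for happiness2015; the values are illustrative only.
df = pd.DataFrame({
    "Region": ["A", "A", "B", "B", "B"],
    "Happiness Score": [1.0, 3.0, 4.0, 6.0, 8.0],
})
happy_grouped = df.groupby("Region")["Happiness Score"]

def dif(group):
    # Gap between the group's maximum and its mean
    return group.max() - group.mean()

# Built-in aggregations (by name) and the custom function in one list;
# the custom function's column is labelled with its name, "dif".
happy_mean_max_dif = happy_grouped.agg(["mean", "max", dif])
print(happy_mean_max_dif)
```

The result has one row per region and one column per aggregation.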

What I do not understand is this: when computing mean_max_dif, why do we use .agg() if we only want to perform one aggregation, i.e. dif?
When I tried it without .agg(), I got an error. Or maybe there is a way that I do not know of.

Thanks for all the help :slight_smile:

Hey @swar.joshi

I believe this thread will help you with your concern.

Read here
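
In short, .agg() is not limited to multiple aggregations. It is the method that hands a function to each group, so it is also how you apply a single custom function. Here is a minimal sketch with made-up data (the tiny DataFrame is an assumption standing in for happiness2015). I am guessing the error came from something like happy_grouped.dif(); if so, that fails because dif is your own function and not a method of the groupby object.

```python
import pandas as pd

# Made-up stand-in for happiness2015; the values are illustrative only.
df = pd.DataFrame({
    "Region": ["A", "A", "B", "B", "B"],
    "Happiness Score": [1.0, 3.0, 4.0, 6.0, 8.0],
})
happy_grouped = df.groupby("Region")["Happiness Score"]

def dif(group):
    return group.max() - group.mean()

# .agg() accepts a single function and calls it once per group.
via_agg = happy_grouped.agg(dif)

# .apply() gives the same values for a function returning one number per group.
via_apply = happy_grouped.apply(dif)

# Built-in groupby methods can also be combined directly, no custom function needed.
via_builtin = happy_grouped.max() - happy_grouped.mean()

# A custom function is not an attribute of the groupby object, so this fails.
try:
    happy_grouped.dif()
    raised = False
except AttributeError:
    raised = True

print(via_agg)
```

So there are ways to do it without .agg(), but for a custom function, passing it to .agg() (or .apply()) is the usual route.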

Let me know how it goes.