# What is the logic here in the agg function?

I don't really understand the logical relationship among the 3 lines below. First, we define a function called `dif`. Then we pass this function to `agg` to get `mean_max_dif`, which returns one column. In other posts, a community moderator explained that if we don't put `()` after the function name, it means the function is called by itself. What does "itself" mean here? If we don't pass any parameter, it means no value, correct?
How does `happy_mean_max` link to the other 2 lines? Why do we need it? Thank you!!

My Code:

```python
import numpy as np

def dif(group):
    return (group.max() - group.mean())

# happy_grouped is the grouped 'Happiness Score' column
happy_mean_max = happy_grouped.agg([np.mean, np.max])
mean_max_dif = happy_grouped.agg(dif)
```

Let me use my code to explain.

One good thing about `agg` is that you can pass multiple functions to it and it does all the calculations in a single call.

You group `happiness2015` by `Region` and take the `Happiness Score` column from it as `happy_grouped`, which makes `happy_grouped` a grouped Series (a `SeriesGroupBy` object).
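To make the grouping step concrete, here is a minimal sketch on a tiny made-up table. The regions `A` and `B` and their scores are invented for illustration and are not the real `happiness2015` data. It suggests that selecting a column from a groupby gives a grouped object, which only turns into one value per region once you aggregate it.

```python
import pandas as pd

# Hypothetical stand-in for happiness2015; the values are made up.
happiness2015 = pd.DataFrame({
    "Region": ["A", "A", "B", "B", "B"],
    "Happiness Score": [7.0, 5.0, 4.0, 6.0, 8.0],
})

grouped = happiness2015.groupby("Region")
happy_grouped = grouped["Happiness Score"]

# Name of the object's type: it is a grouped object, not yet a plain Series
kind = type(happy_grouped).__name__

# Aggregating collapses each group to one value, so we get one row per region
means = happy_grouped.mean()
```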

There is no built-in function that calculates and returns the difference between the max and the mean. So we wrote the function `dif`, which shows that we can pass custom functions of our own to `agg`.
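On the question of leaving out the parentheses: as far as I understand it, writing `dif` without `()` hands the function itself to `agg`, and `agg` then calls it for you, once per region, with that region's values as `group`. The sketch below uses the same kind of made-up data (an assumption, not the real dataset) and compares `agg(dif)` with calling `dif` by hand on each group.

```python
import pandas as pd

# Hypothetical data; the values are made up for illustration.
scores = pd.DataFrame({
    "Region": ["A", "A", "B", "B", "B"],
    "Happiness Score": [7.0, 5.0, 4.0, 6.0, 8.0],
})
happy_grouped = scores.groupby("Region")["Happiness Score"]

def dif(group):
    # group is the Series of scores for one region
    return group.max() - group.mean()

# Pass the function object (no parentheses); agg supplies the argument
mean_max_dif = happy_grouped.agg(dif)

# The same thing done by hand: call dif ourselves on every group
manual = {name: dif(group) for name, group in happy_grouped}
```

Writing `happy_grouped.agg(dif())` instead would try to call `dif` immediately with no argument and raise a `TypeError`, because `group` would be missing.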

`happy_mean_max` holds the mean and maximum values of `Happiness Score` for the respective regions, so we expect as many rows as there are regions. For `mean_max_dif`, the `dif` function is passed into `agg` alongside mean and max.

```python
import numpy as np

grouped = happiness2015.groupby('Region')
happy_grouped = grouped['Happiness Score']

def dif(group):
    return (group.max() - group.mean())

# Mean and max of the score for each region (one row per region)
happy_mean_max = happiness2015.groupby('Region')['Happiness Score'].agg((np.mean, np.max))

# Mean, max and dif together, then keep only the 'dif' column
mean_max_dif = happiness2015.groupby('Region')['Happiness Score'].agg((np.mean, np.max, dif))['dif']
```

Your explanation is clear, thank you! But there is one last thing I need to understand about the last line. You already put `dif` in `agg()`, so why do you still need to put `['dif']` at the end? Thank you again!

`mean_max_dif = happiness2015.groupby('Region')['Happiness Score'].agg((np.mean, np.max, dif))['dif']`

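The question only requires me to return the `dif` column. Passing several functions to `agg` returns a table with one column per function, so `['dif']` at the end selects just that column.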
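A rough sketch of the difference, again on made-up data (an assumption, not the real `happiness2015` table). I use the string names `"mean"` and `"max"` here instead of `np.mean` and `np.max`, which is my own substitution; the result should have the same shape.

```python
import pandas as pd

# Hypothetical data; the values are made up for illustration.
scores = pd.DataFrame({
    "Region": ["A", "A", "B", "B", "B"],
    "Happiness Score": [7.0, 5.0, 4.0, 6.0, 8.0],
})
happy_grouped = scores.groupby("Region")["Happiness Score"]

def dif(group):
    return group.max() - group.mean()

# Several functions: a DataFrame with one column per function
table = happy_grouped.agg(["mean", "max", dif])

# ['dif'] keeps only that column, as a Series
dif_column = table["dif"]

# Passing dif alone gives the same values without the extra columns
dif_only = happy_grouped.agg(dif)
```

So `agg((np.mean, np.max, dif))['dif']` and `agg(dif)` should hold the same numbers; the first just computes two extra columns and then discards them.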