For loop mechanism in Pandas

Hi All,

I was trying to understand a piece of code that runs a loop with pandas, and I would like to understand the looping mechanism below.
If we are looping over the regions list, how is it that inside the for loop we can reference the entire happiness2015 DataFrame?

import pandas as pd

# happiness2015 is assumed to be a DataFrame that was loaded earlier
mean_happiness = {}
regions = happiness2015['Region'].unique()
for r in regions:
    # keep only the rows whose Region equals r
    region_group = happiness2015[happiness2015['Region'] == r]
    region_mean = region_group['Happiness Score'].mean()
    mean_happiness[r] = region_mean



Hello DataScience_Raul, how are you doing? Welcome to the Dataquest community.

I'll try to make things clear for you.

What you did by referring to the whole happiness2015 DataFrame inside the loop is filter the data. This is a pandas filtering technique called boolean indexing. The expression happiness2015['Region'] == r compares every value in the Region column with r and returns a Series of True/False values, one per row. Passing that Series to happiness2015[...] keeps only the rows where it is True, that is, the rows whose Region equals the current value of r. The loop repeats this for each unique value in the Region column, so every pass works on a different subset of the same DataFrame.
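If it helps, here is a minimal sketch that splits the filter into separate steps. I don't have your data here, so the small DataFrame below is a made-up stand-in for happiness2015 (the Country column, the region names and the scores are all invented for illustration); only the Region and Happiness Score column names come from your code.

```python
import pandas as pd

# Hypothetical stand-in for happiness2015 (made-up values, not the real dataset).
happiness2015 = pd.DataFrame({
    "Country": ["A", "B", "C", "D"],
    "Region": ["North", "South", "North", "South"],
    "Happiness Score": [7.0, 5.0, 6.0, 4.0],
})

# Step 1: comparing the whole column with one value gives one True/False per row.
mask = happiness2015["Region"] == "North"

# Step 2: indexing the DataFrame with that mask keeps only the True rows.
north_rows = happiness2015[mask]

# Step 3: the mean is computed on the filtered rows only.
north_mean = north_rows["Happiness Score"].mean()

# The loop repeats the same three steps once per unique region.
mean_happiness = {}
for r in happiness2015["Region"].unique():
    region_group = happiness2015[happiness2015["Region"] == r]
    mean_happiness[r] = region_group["Happiness Score"].mean()

print(mask.tolist())
print(mean_happiness)
```

Printing mask on its own is usually the quickest way to see why the full DataFrame appears inside the loop: the loop variable r only decides which rows are marked True on each pass.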

If you still have doubts about this, I suggest reviewing course 1 of Step 2 (Exploring Data with pandas: Intermediate). And if my explanation wasn't clear, just let me know, OK?