Question about the Working With Missing And Duplicate Data mission

Hi everyone, I hope y’all are doing well.

In the 12th screen of the Working With Missing And Duplicate Data mission, we wanted to replace the missing values in the happiness score column. So, we calculated the mean for the happiness score column for all the regions. After that, it was stated that the mean is too high or low for some regions, which I understand, so we are better off dropping the rows with missing values.
My question is: why didn’t we replace the missing values in the happiness score column with the mean happiness score of the region each country belongs to?

Can you please share the link to the corresponding mission? Thanks!

https://app.dataquest.io/m/347/working-with-missing-and-duplicate-data/12/dropping-rows

Yep, that’s a good idea to consider! You could do that. Technically it wouldn’t be inherently wrong or anything - it’s certainly one measure worth considering. The mission just explores another way of getting that analysis done.
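
In case it helps, here is a minimal sketch of what filling by regional mean could look like in pandas. The dataframe and the column names (`COUNTRY`, `REGION`, `HAPPINESS SCORE`, `HAPPINESS SCORE FILLED`) are made up for illustration and may not match the mission's data exactly, so adjust them to whatever your dataframe uses.

```python
import pandas as pd

# Toy data invented for illustration; not the mission's dataset.
df = pd.DataFrame({
    "COUNTRY": ["A", "B", "C", "D", "E"],
    "REGION": ["North", "North", "North", "South", "South"],
    "HAPPINESS SCORE": [7.0, 6.0, None, 4.0, None],
})

# For every row, the mean score of that row's region (missing values are skipped).
region_mean = df.groupby("REGION")["HAPPINESS SCORE"].transform("mean")

# Fill only the missing scores, each with its own region's mean.
df["HAPPINESS SCORE FILLED"] = df["HAPPINESS SCORE"].fillna(region_mean)

print(df)
```

`transform("mean")` returns a Series aligned with the original rows, which is why it can be passed straight to `fillna`.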

The thing is that even if you’d done that, it wouldn’t have changed the mean happiness score of any region, because adding values equal to a group’s mean leaves that mean unchanged. It would still have changed the overall mean across all regions, since regions with more missing rows would then count for more.
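
A quick way to convince yourself of this is with a few made-up numbers. The sketch below uses plain Python and invented scores (the region names and the helpers `observed` and `fill_with_mean` are just for illustration), so treat it as a sanity check of the arithmetic rather than anything from the mission.

```python
from statistics import mean

# Invented scores; None marks a missing value.
scores = {
    "North": [7.0, 6.0, None],
    "South": [4.0, None, None],
}

def observed(values):
    """Return only the non-missing values."""
    return [v for v in values if v is not None]

def fill_with_mean(values):
    """Replace each missing value with the mean of the observed ones."""
    m = mean(observed(values))
    return [m if v is None else v for v in values]

filled = {region: fill_with_mean(vals) for region, vals in scores.items()}

# Per-region means are identical before and after filling.
for region in scores:
    print(region, mean(observed(scores[region])), mean(filled[region]))

# The overall mean moves, because South now contributes three values instead of one.
overall_before = mean(v for vals in scores.values() for v in observed(vals))
overall_after = mean(v for vals in filled.values() for v in vals)
print(overall_before, overall_after)
```

With these numbers each region keeps its mean (6.5 and 4.0), while the overall mean drops from about 5.67 to 5.25.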

What the mission says about missing data tending to coincide with lower-than-average happiness scores probably holds true within a single region too, so a regional mean could still overestimate the missing scores.