Groupby gives True/False result

Screen Link: https://app.dataquest.io/m/310/guided-project%3A-finding-the-best-markets-to-advertise-in/5/spending-money-for-learning

My Code:

countries_mean = survey_clean.groupby('CountryLive')['MonthlySpent'].mean()

What I expected to happen: Get a table that shows the mean amount students spent monthly on learning, grouped by country.

What actually happened: I got a True/False result:

CountryLive
False    0.681818
True     0.908453
Name: MonthlySpent, dtype: float64
What am I doing wrong here? How can I get an overview of the money students spent on average by country?

Thanks a lot in advance for your help!

Try this instead:

countries_mean = survey_clean.groupby('CountryLive').mean()['MonthlySpent']

If this doesn’t work, try rerunning all the code from the start.

Finally, if that also doesn’t work, please share the Python file. Something might be wrong in the earlier code.

Hello @Amaryllis,
Please share your notebook so we can assist further, as the problem seems to be in the earlier code.

Hello @pablajaspreet94,

Both ways should give the same output. Actually, the better way is


This would take more running time so


Here’s my notebook: FindingTheBestMarketsToAdvertiseIn.ipynb (93.7 KB)

(I’ve kept working despite this error, but you don’t need to look at the later steps and errors.)


Hello @Amaryllis, if you look at your code here:

# remove the null values in the new column and the `CountryLive` column
survey_clean['MonthlySpent'] = survey_clean['MonthlySpent'].notnull().copy()
survey_clean['CountryLive'] = survey_clean['CountryLive'].notnull().copy()

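you did not remove the null values. You only checked whether each value is not null, which returns True if a value is not null and False if it is null. Assigning that result back replaced both columns with booleans.
That is why your groupby result has only True and False as groups instead of country names.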
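A minimal sketch that reproduces the problem, assuming a small made-up DataFrame (the name `toy` and its values are hypothetical, not the survey data):

```python
import pandas as pd

# Hypothetical toy data standing in for the survey; the values are made up.
toy = pd.DataFrame({
    "CountryLive": ["India", None, "Canada", "India"],
    "MonthlySpent": [10.0, 20.0, None, 30.0],
})

# notnull() returns a boolean Series; assigning it back overwrites the data.
toy["MonthlySpent"] = toy["MonthlySpent"].notnull()
toy["CountryLive"] = toy["CountryLive"].notnull()

# The group keys are now True/False, and each mean is just the share of
# rows in that group whose MonthlySpent was not null.
result = toy.groupby("CountryLive")["MonthlySpent"].mean()
print(result)
```

The means here are fractions between 0 and 1, not amounts of money, which matches the shape of the output in the question.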


To remove the null values, index the DataFrame with the boolean Series or use the pandas dropna() function. The best option here is the DataFrame.dropna() method with subset=['MonthlySpent', 'CountryLive']:

survey_clean = survey_clean.dropna(subset=['MonthlySpent', 'CountryLive'])
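As a rough illustration of both options, here is a sketch on a small made-up DataFrame (the names `toy`, `mask`, `by_mask` and `by_dropna` are hypothetical). It assumes the columns still hold the original values, so the cell that overwrote them with `notnull()` would need to be removed and the notebook rerun first.

```python
import pandas as pd

# Hypothetical toy data standing in for the survey; the values are made up.
toy = pd.DataFrame({
    "CountryLive": ["India", None, "Canada", "India", "Canada"],
    "MonthlySpent": [10.0, 20.0, None, 30.0, 50.0],
})

# Option 1: build a boolean mask and use it to select rows,
# rather than assigning it to a column.
mask = toy["MonthlySpent"].notnull() & toy["CountryLive"].notnull()
by_mask = toy[mask]

# Option 2: drop rows with a null in either column.
by_dropna = toy.dropna(subset=["MonthlySpent", "CountryLive"])

# Both options keep the same rows, so the groupby now uses country names.
countries_mean = by_dropna.groupby("CountryLive")["MonthlySpent"].mean()
print(countries_mean)
```

Either way the original values are kept, so the mean is computed on the actual spending figures.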

Thanks a lot! This makes sense of course :slight_smile:


Hey @Amaryllis, if @info.victoromondi’s answer solved the issue, please consider giving it a like and marking it as the solution. :slightly_smiling_face:

GUIDELINE #2: Accept and mark answer as Solution

If you find a reply that answers your question satisfactorily, please mark it as Solution. Doing so will help:

  • Other learners who are searching for the same problem find the solution faster
  • The Learning Assistant program: by marking the answer as the solution, you directly help the person who helped you.