Guided Project: Finding the Best Markets to Advertise In - Difference in monthly spend null count and subsequent calculations

Hi everyone,

I’m having an issue with this guided project, specifically in the step where we compute monthly spend for students and drop missing values from the data set. At this point in the project, no other values have been dropped, so I’m scratching my head as to where the difference could come from.

My code:

df_survey["MonthsProgramming"].replace(0, 1, inplace=True)
df_survey["MonthlySpend"] = df_survey["MoneyForLearning"] / df_survey["MonthsProgramming"]

This yields a null count of 1995 from the call to the sum() method.

The solution workbook:

# Replace 0s with 1s to avoid division by 0
fcc_good['MonthsProgramming'].replace(0,1, inplace = True)

# New column for the amount of money each student spends each month
fcc_good['money_per_month'] = fcc_good['MoneyForLearning'] / fcc_good['MonthsProgramming']

This yields a null count of 675, and as a result the following calculations (average by country, etc.) are all off. Since the data is still unmodified at this point in the project, I’m a bit dumbfounded as to where the difference comes from. I imagine the pandas replace() method works the same in both cases, as does Python’s general handling of NaN values. Any help would be greatly appreciated!


Hi Thomas, welcome to the community!

The solution notebook shows fewer null values because it looks like a step from screen 4 was missed in your code:

To make sure you’re working with a representative sample, drop all the rows where participants didn’t answer what role they are interested in. Where a participant didn’t respond, we can’t know for sure what their interests are, so it’s better if we leave out this category of participants.
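
For illustration, here is a minimal sketch of that step on made-up data. The column name JobRoleInterest and the helper monthly_spend_nulls are assumptions on my part (neither appears in the code above), so adjust them to match your dataset. The idea is only to show that filtering out the non-respondents first changes the null count in the monthly spend column.

```python
import numpy as np
import pandas as pd

# Toy data for illustration only; "JobRoleInterest" is an assumed column name.
df_survey = pd.DataFrame({
    "JobRoleInterest": ["Web Developer", None, "Data Scientist", None, "Game Developer"],
    "MoneyForLearning": [100.0, np.nan, 0.0, 50.0, np.nan],
    "MonthsProgramming": [0.0, 6.0, 4.0, np.nan, 10.0],
})


def monthly_spend_nulls(df):
    """Hypothetical helper: count nulls in the computed monthly spend."""
    # Replace 0s with 1s to avoid division by 0 (returns a new Series)
    months = df["MonthsProgramming"].replace(0, 1)
    spend = df["MoneyForLearning"] / months
    return int(spend.isnull().sum())


# Null count on the full, unfiltered survey
nulls_before = monthly_spend_nulls(df_survey)

# Keep only rows where the participant answered the role question
fcc_good = df_survey[df_survey["JobRoleInterest"].notnull()].copy()

# Null count after dropping the non-respondents
nulls_after = monthly_spend_nulls(fcc_good)

print(nulls_before, nulls_after)
```

On this toy data the unfiltered frame has three null monthly spend values and the filtered one has only one, which is the same kind of gap as between 1995 and 675.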

I hope this helps!

Good catch! I excluded them when computing the counts and relative frequencies, so my tables matched, but I never actually dropped them from the dataset. Thanks for the quick answer.
