Hi everyone,
I’m having an issue with this guided project, specifically in the step where we compute monthly spend for students and drop missing values from the data set. At this point in the project, no other values have been dropped so I’m scratching my head as to where the difference could be.
My code:
df_survey["MonthsProgramming"].replace(0, 1, inplace=True)
df_survey["MonthlySpend"] = df_survey["MoneyForLearning"] / df_survey["MonthsProgramming"]
df_survey["MonthlySpend"].isnull().sum()
Yields 1995 as the result of the call to the sum() method.
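For reference, here is a minimal sketch of how NaN propagates through this kind of division. It uses a tiny made-up frame called `toy` (the values are invented for illustration and are not survey data). A row ends up NaN in the quotient whenever either input column is NaN, so the count depends entirely on which rows are present in the frame:

```python
import numpy as np
import pandas as pd

# Toy data, invented for illustration only (not the survey data).
toy = pd.DataFrame({
    "MoneyForLearning": [100.0, np.nan, 50.0, 0.0, np.nan],
    "MonthsProgramming": [0.0, 5.0, np.nan, 2.0, np.nan],
})

# Assign the result back instead of relying on in-place modification
# of a column selection.
toy["MonthsProgramming"] = toy["MonthsProgramming"].replace(0, 1)
toy["MonthlySpend"] = toy["MoneyForLearning"] / toy["MonthsProgramming"]

# Rows where at least one of the two inputs is missing.
either_missing = toy[["MoneyForLearning", "MonthsProgramming"]].isnull().any(axis=1)

n_missing_spend = int(toy["MonthlySpend"].isnull().sum())
n_either_missing = int(either_missing.sum())
print(n_missing_spend, n_either_missing)
```

In this toy case the two counts agree, which suggests that a different NaN count in the quotient points to different input rows rather than to replace() or the division behaving differently.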
The solution workbook:
# Replace 0s with 1s to avoid division by 0
fcc_good['MonthsProgramming'].replace(0,1, inplace = True)
# New column for the amount of money each student spends each month
fcc_good['money_per_month'] = fcc_good['MoneyForLearning'] / fcc_good['MonthsProgramming']
fcc_good['money_per_month'].isnull().sum()
Yields 675, and as a result the following calculations (average by country, etc.) are all off. Since the data is still unmodified at this point in the project, I’m a bit dumbfounded as to where the difference comes from. I imagine the pandas replace() method works the same in both cases, as does the way pandas handles NaN values. Any help would be greatly appreciated!
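One way to test the assumption that both frames hold the same rows would be to compare their row counts and per-column NaN counts side by side. Below is a minimal sketch with a hypothetical helper `summarize` and made-up stand-in frames `full` and `subset`; in the project these would be df_survey and fcc_good, and I have not confirmed what fcc_good actually contains at this step:

```python
import numpy as np
import pandas as pd

def summarize(frame, cols):
    """Return the row count and the NaN count of each listed column."""
    summary = {"rows": len(frame)}
    for col in cols:
        summary[col] = int(frame[col].isnull().sum())
    return summary

# Stand-in frames, invented for illustration only.
full = pd.DataFrame({
    "MoneyForLearning": [100.0, np.nan, 50.0, np.nan],
    "MonthsProgramming": [1.0, 5.0, np.nan, 3.0],
})
subset = full.iloc[:2]  # pretend an earlier step kept only some rows

cols = ["MoneyForLearning", "MonthsProgramming"]
full_summary = summarize(full, cols)
subset_summary = summarize(subset, cols)
print(full_summary)
print(subset_summary)
```

If the row counts differ, the two frames are not the same data, and the NaN counts of the derived column would be expected to differ as well.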
Thomas