Not seeing an NaN after updating Dissatisfied

Screen Link: Learn data science with Python and R projects

My Code: https://github.com/capncrockett/Dataquest/blob/production/emp_exit_surveys/exit_surveys.ipynb

def update_vals(val):
    if pd.isnull(val):
        return np.nan
    elif val == '-':
        return False
    else:
        return True
    
    
tafe_resignations['dissatisfied'] = tafe_resignations[['Contributing Factors. Dissatisfaction', 'Contributing Factors. Job Dissatisfaction']].applymap(update_vals).any(axis=1, skipna=False)
tafe_resignations_up = tafe_resignations.copy()


tafe_resignations_up['dissatisfied'].value_counts()




#COURSE SOLUTION DOESN'T RETURN NaN's EITHER!!!

# Update the values in the contributing factors columns to be either True, False, or NaN
def update_vals(x):
    if x == '-':
        return False
    elif pd.isnull(x):
        return np.nan
    else:
        return True
tafe_resignations['dissatisfied'] = tafe_resignations[['Contributing Factors. Dissatisfaction', 'Contributing Factors. Job Dissatisfaction']].applymap(update_vals).any(1, skipna=False)
tafe_resignations_up = tafe_resignations.copy()

# Check the unique values after the updates
tafe_resignations_up['dissatisfied'].value_counts(dropna=False)

What I expected to happen: As per the solution key https://github.com/dataquestio/solutions/blob/master/Mission348Solutions.ipynb at Out [20]

False    241
True      91
NaN        8
Name: dissatisfied, dtype: int64

What actually happened: For BOTH my solution and a copy-paste of the entire DQ solution:

False    241
True      99
Name: dissatisfied, dtype: int64

I really don’t know what is happening to the NaN values that existed. They get lumped in as True, which is definitely not helpful. If anything, they should be False. The weirdest thing is that my solution AND the DQ solution produce the same results, yet the solution notebook posted on GitHub must have rendered differently. Help?
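
One possible piece of the puzzle is that NaN is a non-zero float, so Python treats it as truthy. The sketch below uses plain Python only (no pandas) to show that idea. `any_skipping_nan` is a hypothetical helper written for illustration; it is not how pandas implements `skipna`.

```python
import math

nan = float("nan")  # np.nan is this same kind of Python float

# NaN is a non-zero float, so it is truthy.
print(bool(nan))          # True

# A plain any() over a row therefore counts NaN as True.
print(any([False, nan]))  # True


def any_skipping_nan(values):
    """Hypothetical helper: ignore NaN entries, then reduce with any()."""
    return any(
        v for v in values
        if not (isinstance(v, float) and math.isnan(v))
    )


print(any_skipping_nan([False, nan]))  # False
print(any_skipping_nan([nan, nan]))    # False
print(any_skipping_nan([True, nan]))   # True
```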

In your code, the one on GitHub, you set skipna to True.

In the solution code, it’s set to False.

That seems to be the only difference.

Something must have gone wrong when you were trying to copy things over. I tried that piece of code from the solution and got the expected output.

Thanks for the reply. Yes, the one on GitHub does have it set to True because I wound up going with it. Changing that one parameter seems to lump all the NaN values into the False category, which is the intended end result.

When I copy-pasted everything over from the solution, I actually started an entirely new notebook, so I’m not really sure how that could’ve happened. You’re saying you copied it straight over and got the expected results? Then something is wrong on my end.
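
For anyone who wants to check the skipna difference discussed above without the full dataset, here is a minimal sketch on a made-up four-row frame. The column names `factor_a` and `factor_b` and all the values are hypothetical stand-ins for the two mapped columns. The pandas documentation for `DataFrame.any` says that with `skipna=False` NA values are treated as True, which would be consistent with the 99 True count above (91 + 8). The exact result may depend on the pandas version, so the sketch only prints it. The last step shows one possible way to keep an explicit NaN category, assuming a row should be NaN only when both columns are missing. That rule is an assumption, not necessarily what the course solution intends.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the two columns after update_vals has been applied:
# every cell is already True, False, or NaN.
factors = pd.DataFrame(
    {
        "factor_a": [True, False, np.nan, np.nan],
        "factor_b": [False, False, False, np.nan],
    },
    dtype=object,
)

# skipna=True ignores missing cells, so an all-NaN row reduces to False.
skip_na = factors.any(axis=1, skipna=True)
print(skip_na.value_counts(dropna=False))

# skipna=False keeps NaN in the reduction; output may vary by pandas version.
keep_na = factors.any(axis=1, skipna=False)
print(keep_na.value_counts(dropna=False))

# Assumed rule: mark a row as NaN only when every column in it is missing.
all_missing = factors.isna().all(axis=1)
three_way = skip_na.astype(object).mask(all_missing, np.nan)
print(three_way.value_counts(dropna=False))
```

With this approach the NaN rows are marked by an explicit mask, so the result does not depend on how `any` handles missing values.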