Guided Project: Clean And Analyze Employee Exit Surveys - Step 7 / 11

The instruction in step 7 states: Use the df.any() method as described above to create a dissatisfied column in BOTH the tafe_resignations and dete_resignations dataframes.

After reviewing the pandas documentation for .any(), I am stumped on how to create a new column based on only a subset of the columns in the dataframe.

I have tried both:
tafe_resignations['dissatisfied'] = tafe_resignations.any(axis=1, skipna=False)
tafe_resignations['dissatisfied'] = tafe_resignations.any(axis=1, bool_only=True, skipna=False)

The statements run, but .any() appears to be applied across all the columns. How do I get the calculation to run only on the columns I am interested in? I have looked everywhere and cannot find an answer, which usually means I am missing something.

Thanks in advance.
Rich

Hi Rich,

From what I understand, you’ll need to select the columns you want before calling .any(), i.e.

tafe_resignations['dissatisfied'] = tafe_resignations[['column1', 'column2']].any(axis=1, skipna=False)
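Here is a small sketch of the difference on a toy dataframe. The dataframe and the column names (id, factor_a, factor_b) are made up for illustration and are not the real survey columns, so swap in the ones from the project.

```python
import pandas as pd

# Toy stand-in for tafe_resignations; all column names here are hypothetical.
df = pd.DataFrame({
    "id": [101, 102, 103],
    "factor_a": [True, False, False],
    "factor_b": [False, False, True],
})

# Calling .any() on the whole dataframe also looks at "id".
# A non-zero id counts as truthy, so every row comes out True.
all_columns = df.any(axis=1)

# Selecting the columns first limits .any() to just those columns.
cols = ["factor_a", "factor_b"]
df["dissatisfied"] = df[cols].any(axis=1, skipna=False)

print(df["dissatisfied"].tolist())  # [True, False, True]
```

The double brackets in df[cols] return a new dataframe holding only those columns, so .any(axis=1) never sees the rest.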

Hope that helps

Thanks Austin! That worked perfectly. Appreciate the quick response.