Thresh is not working

Screen Link:
https://app.dataquest.io/m/348/guided-project%3A-clean-and-analyze-employee-exit-surveys/8/combine-the-data

My Code:

combined_updated = combined.dropna(thresh = 500, axis =1).copy()

What I expected to happen:
Recall that we still have some columns left in the dataframe that we don’t need to complete our analysis. Use the DataFrame.dropna() method to drop any columns with less than 500 non null values.

  • Remember that you can drop columns with less than a certain number of non null values with the thresh parameter.
  • Assign the result to combined_updated

What actually happened:

TypeError                                 Traceback (most recent call last)
<ipython-input-84-e3a4614c7e0f> in <module>()
----> 1 combined_updated = combined.dropna(thresh = 500, axis =1).copy()

/dataquest/system/env/python3/lib/python3.4/site-packages/pandas/core/series.py in dropna(self, axis, inplace, **kwargs)
   2984         if kwargs:
   2985             raise TypeError('dropna() got an unexpected keyword '
-> 2986                             'argument "{0}"'.format(list(kwargs.keys())[0]))
   2987 
   2988         axis = self._get_axis_number(axis or 0)

TypeError: dropna() got an unexpected keyword argument "thresh"

Can you please share your notebook displaying this behavior?

Judging by the error message, combined is a Series, so you’re actually calling pandas.Series.dropna, which doesn’t support thresh.
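To illustrate the difference, here is a minimal sketch on made-up toy data (the column names mostly_full and mostly_empty are hypothetical and are not the survey columns). It shows thresh working on a DataFrame and the same keyword being rejected by a Series:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, not the employee exit survey data.
df = pd.DataFrame({
    "mostly_full": [1, 2, 3, np.nan],
    "mostly_empty": [np.nan, np.nan, np.nan, 4],
})

# DataFrame.dropna accepts thresh: with axis=1, keep only the columns
# that have at least 3 non-null values.
kept = df.dropna(thresh=3, axis=1)
kept_columns = list(kept.columns)

# Series.dropna has no thresh parameter, so the same keyword raises TypeError.
series = df["mostly_full"]
try:
    series.dropna(thresh=3)
    series_error = None
except TypeError as exc:
    series_error = str(exc)

print(kept_columns)
print(series_error)
```

The exact wording of the TypeError message may differ between pandas versions, but the cause is the same as in the traceback above.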

Hello Bruno, the notebook is attached. Please search for the thresh part. I appreciate it, since I’m totally stuck here :confused:

BR,Basics.ipynb (90.6 KB)

My suspicion is confirmed. I don’t think the object combined is what you think it is; it’s a Series.

Take a look at it and trace back to find the first mistake. (Hint: What do dete_resignations_up and tafe_resignations_up look like?)
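One way to do that tracing is to print the type and shape of each intermediate object. The sketch below is only an illustration: kind is a hypothetical helper, and frame and column are toy stand-ins for the objects in the notebook.

```python
import pandas as pd

def kind(obj):
    """Return a short label with the object's type name and shape."""
    return "{} {}".format(type(obj).__name__, obj.shape)

# Hypothetical stand-ins for the notebook's intermediate objects.
frame = pd.DataFrame({"id": [1, 2, 3], "dissatisfied": [True, False, True]})
column = frame["dissatisfied"]

labels = [kind(frame), kind(column)]
for label in labels:
    print(label)
```

In the notebook, calling the same kind of check on dete_resignations_up, tafe_resignations_up, and combined should show where a DataFrame first turns into a Series.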

Hmm, I still can’t get it. I’m just confused, since the project solution says otherwise.

Did you look at dete_resignations_up? Did you look at combined? The objects in the solution are much different.

Here’s a snippet from the solution:

dete_resignations['dissatisfied'] = dete_resignations[['job_dissatisfaction',
       'dissatisfaction_with_the_department', 'physical_work_environment',
       'lack_of_recognition', 'lack_of_job_security', 'work_location',
       'employment_conditions', 'work_life_balance',
       'workload']].any(1, skipna=False)
dete_resignations_up = dete_resignations.copy()

And here’s the equivalent snippet from your solution:

dete_resignations_up=dete_resignations['dissatisfied'].copy()

Hi @federico1, as @Bruno said above, line 62 of your code reads:

dete_resignations_up=dete_resignations['dissatisfied'].copy()

This means you basically copied the series dete_resignations['dissatisfied'] into dete_resignations_up.

What you should have done is simply dete_resignations_up = dete_resignations.copy(). This way, you are copying the dataframe and not the dissatisfied column.
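Here is a rough sketch of why this matters downstream. It assumes combined is built with pd.concat from the two *_up objects (that step is not shown in this thread), and dete and tafe are hypothetical toy frames rather than the real survey data:

```python
import pandas as pd

# Hypothetical toy stand-ins for dete_resignations and tafe_resignations.
dete = pd.DataFrame({"id": [1, 2], "dissatisfied": [True, False]})
tafe = pd.DataFrame({"id": [3, 4], "dissatisfied": [False, True]})

# Copying only the column gives two Series, and concatenating them
# row-wise gives another Series, which has no thresh option in dropna.
wrong = pd.concat(
    [dete["dissatisfied"].copy(), tafe["dissatisfied"].copy()],
    ignore_index=True,
)

# Copying the whole frames keeps every column, so the result is a
# DataFrame and dropna(thresh=..., axis=1) works as expected.
right = pd.concat([dete.copy(), tafe.copy()], ignore_index=True)
right_trimmed = right.dropna(thresh=3, axis=1)

print(type(wrong).__name__)
print(type(right).__name__)
```

With the toy data, both columns of right have four non-null values, so thresh=3 keeps them both.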

I got the same error as federico1, but I think I have now solved it based on your answer. Thanks so much, Bruno :blush: