SettingWithCopyWarning is killing me, please help

Hi, I am desperate for some help.

While working on my Guided Project: Clean And Analyze Employee Exit Survey, I stumbled upon a famous “SettingWithCopyWarning” error, for the second time already in my code. I don’t understand how to get rid of it or even why it’s there.

  1. First, it appeared when I tried cleaning one column to keep just the year (e.g. my aim was to turn something like “08/2010” into “2010.0”). I tried with and without .loc, but I always get the warning. Here is my code line:

dete_resignation.loc[:, "cease_date"] = dete_resignation["cease_date"].str.split(pat="/").str[-1].astype("float")

  2. The second time, it appeared when I just wanted to create a new column, which is basically the difference between two columns in my dataframe. The code is this:

dete_resignation["institute_service"] = dete_resignation["cease_date"] - dete_resignation["dete_start_date"]

Here is also the working version of my project, to get the entire picture…

Clean And Analyze Employee Exit Surveys.ipynb (1.7 MB)

I am really looking forward to some help


@ merimus Merimus


Hope it will help you.

Frankly it is killing me as well.

I am stuck on a different piece of code from step 9. I cannot avoid this error. Tried copy, tried resetting index and copy. Tried intermediate values and copy. Every time I want to assign it back to institute_service column in combined it gives me the error.

resetting index as there were some duplicate values

combined_updated.reset_index(drop=True, inplace=True)

Option 1: extracting cleaned/converted years back into the column and get error

combined_updated['institute_service'] = combined_updated['institute_service'].astype(str).str.extract(r'(\d+)').astype(float)

Option 2: getting result loaded into intermediate df

flt_institute_service = combined_updated['institute_service'].astype(str).str.extract(r'(\d+)').astype(float)

trying to assign copy of the intermediate value back to the column - the same error!

combined_updated['institute_service'] = flt_institute_service.copy()

Option 3: extracting intermediate into Series and trying to assign back to the column. The same error!

flt_institute_service = combined_updated['institute_service'].astype(str).str.extract(r'(\d+)').astype(float)

trying to assign copy of the intermediate value back to the column - the same error!

a = pd.Series(flt_institute_service.loc[:,0])
combined_updated['institute_service'] = a

I know that the index is unique. The shape/length is matching. Why is it still throwing an error?

Thank you!

Hi!
First of all, it’s not an error, it’s a warning. Here is a really good explanation about it

Second, if you try this:
combined_updated['institute_service'] = combined_updated['institute_service'].astype(str).str.extract(r'(\d+)').astype(float).copy()

Does the warning still show?

Hello,

Yes, the warning is still there:
/dataquest/system/env/python3/lib/python3.4/site-packages/ipykernel/__main__.py:1: FutureWarning: currently extract(expand=None) means expand=False (return Index/Series/DataFrame) but in a future version of pandas this will be changed to expand=True (return DataFrame)
if __name__ == '__main__':
/dataquest/system/env/python3/lib/python3.4/site-packages/ipykernel/__main__.py:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
if __name__ == '__main__':


Hi…

Can you try passing expand=True to the extract() function and check.

Thanks.

Thank you. The warning changed slightly: the expand part disappeared.

/dataquest/system/env/python3/lib/python3.4/site-packages/ipykernel/__main__.py:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
if __name__ == '__main__':

Hi! To avoid the warning, you need to apply .copy() before you try to make changes. I’m not talking about every case in which the warning can appear, only about this project. In this project the warning mostly comes from chained assignment, described in the article @raisa.jerin.sristy79 shared above. What leads to the warning?
First, you create a new dataframe from the original dataframe based on some condition (the code below is an example from my project):
dete_resignations = dete_survey_updated[dete_survey_updated['separationtype'].str.contains(r'Resignation')]

Then you continue working with dete_resignations, and at some point you need to change it:

dete_resignations['cease_date'] = dete_resignations['cease_date'].str.split('/').str[-1]

This is where pandas gets confused. When you created dete_resignations by filtering, pandas did not treat it as a fully independent dataframe. It keeps track of it as a slice of the original, which may behave as a “view” or as a copy containing only the rows that meet your condition. So pandas can’t assure you where the changes are going to end up: only in the new dataframe, or in the original one as well. That is why it issues the SettingWithCopyWarning.
So, what should have been done to avoid the warning? You should create a copy, yes, but at the moment you create the new dataframe, not at the moment you make the changes:

dete_resignations = dete_survey_updated[dete_survey_updated['separationtype'].str.contains(r'Resignation')].copy()
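As a rough illustration of the copy-at-creation pattern, here is a minimal sketch on a toy dataframe. The rows below are invented stand-ins for the real survey data (only the column names come from the thread), so treat it as an assumption-laden example, not the project's actual code.

```python
import pandas as pd

# Toy stand-in for dete_survey_updated; the values are made up for illustration.
dete_survey_updated = pd.DataFrame({
    "separationtype": [
        "Resignation-Other reasons",
        "Age Retirement",
        "Resignation-Move overseas",
    ],
    "cease_date": ["08/2010", "2012", "05/2013"],
})

mask = dete_survey_updated["separationtype"].str.contains(r"Resignation")

# .copy() at creation time makes dete_resignations an independent dataframe,
# so later assignments are unambiguous.
dete_resignations = dete_survey_updated[mask].copy()

# Keep only the year part and convert it to float.
dete_resignations["cease_date"] = (
    dete_resignations["cease_date"].str.split("/").str[-1].astype(float)
)

print(dete_resignations["cease_date"].tolist())
print(dete_survey_updated["cease_date"].tolist())
```

The second print is there to check that the original dataframe keeps its string dates after the change to the copy.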

I strongly recommend reading the article provided, and maybe watching a couple of videos on YouTube where this issue is explained in more detail and more technically than I do.


Hi Ksenia,

Apologies for late response - life took over :blush:

Thank you for your detailed response. I will definitely read through the article with more attention once again.

I have tried a few permutations, and for some reason the chained .copy() didn’t work, but this did:

combined_updated = combined_updated.copy()

# This effectively overwrites the old dataframe with the new one. I was slightly surprised that it worked, because I would have thought it was trying to reference itself, but the internals probably track views and dataframes separately. Or copy() creates an intermediate object, and the name is then “swapped” to point to this new object.

combined_updated['institute_service'] = combined_updated['institute_service'].astype(str).str.extract('(\d+)', expand=False).astype(float)
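One plausible reading of why the rebinding works: copy() returns a new, independent dataframe, and the assignment simply points the name combined_updated at that new object. The sketch below checks this on a toy dataframe. The `source` frame, its `keep` column, and the `filtered` name are hypothetical and only exist for illustration.

```python
import pandas as pd

# Hypothetical parent dataframe; values are made up for illustration.
source = pd.DataFrame({
    "institute_service": ["3-4", "Less than 1 year", "11-20", "More than 20 years"],
    "keep": [True, True, False, True],
})

# Filtering produces the kind of object that can trigger the warning later.
combined_updated = source[source["keep"]]
filtered = combined_updated  # keep a second name for the filtered object

# Rebind the name to an independent copy.
combined_updated = combined_updated.copy()
print(combined_updated is filtered)  # False: the name now refers to a new object

# Extract the first run of digits and convert it to float.
combined_updated["institute_service"] = (
    combined_updated["institute_service"]
    .astype(str)
    .str.extract(r"(\d+)", expand=False)
    .astype(float)
)

print(combined_updated["institute_service"].tolist())
```

The parent `source` keeps its original strings, since the assignment only touches the copy.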

Thank you once again for your response!

Cheers


Hi. I was stuck on this exact issue and tried your recommended input, and the future warning is gone. What exactly did it do?

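As per the warning message:

It states that the default, extract(expand=None), currently behaves like expand=False but will change to expand=True in a future version of pandas. Passing expand explicitly, e.g. expand=True, removes that ambiguity, so the FutureWarning goes away.
The expand parameter, if True, returns a DataFrame with one column per capture group. If False, it returns a Series/Index if there is one capture group, or a DataFrame if there are multiple capture groups.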
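To make the difference between the two settings concrete, here is a small sketch with one capture group. The sample strings are invented for illustration.

```python
import pandas as pd

# Made-up sample values resembling the institute_service column.
s = pd.Series(["3-4", "11-20", "7"])

# One capture group with expand=False: the result is a Series.
as_series = s.str.extract(r"(\d+)", expand=False)

# The same pattern with expand=True: the result is a one-column DataFrame,
# and the unnamed group gets the column label 0.
as_frame = s.str.extract(r"(\d+)", expand=True)

print(type(as_series).__name__)
print(type(as_frame).__name__)
print(as_frame.columns.tolist())
```

With expand=True the extracted values sit in column 0, which is why an earlier post in this thread used .loc[:, 0] to get a Series back.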