Clean And Analyze Employee Exit Surveys Step 5

Why am I getting a SettingWithCopyWarning?

Screen Link: https://app.dataquest.io/m/348/guided-project%3A-clean-and-analyze-employee-exit-surveys/5/verify-the-data

My Code:

dete_resignations['cease_date'] = dete_resignations['cease_date'].str.split('/').str[-1]
dete_resignations['cease_date'] = dete_resignations['cease_date'].astype(float)

What I expected to happen:
The code did what I expected:

dete_resignations['cease_date'].value_counts()
2013.0    146
2012.0    129
2014.0     22
2010.0      2
2006.0      1
Name: cease_date, dtype: int64

What actually happened:

/dataquest/system/env/python3/lib/python3.4/site-packages/ipykernel/__main__.py:1: SettingWithCopyWarning:


A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy

/dataquest/system/env/python3/lib/python3.4/site-packages/ipykernel/__main__.py:2: SettingWithCopyWarning:


A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy

This is explained in Step 4:

In this step, note that you may see what is known as a SettingWithCopy Warning. This won’t prevent your code from running properly but it’s just letting you know that whatever operation you’re doing is trying to be set on a copy of a slice from a dataframe. We’ll include instructions below to get around this.

And in the instructions:

Use the DataFrame.copy() method on the result to avoid the SettingWithCopy Warning.

That particular warning, SettingWithCopyWarning, is expected. To avoid it, call the copy() method on the filtered DataFrame, as the instruction above points out.
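A minimal sketch of what that instruction means, assuming dete_resignations was created by filtering a larger DataFrame. The name dete_survey, the rows, and the separationtype filter below are made up for illustration and may not match the project data:

```python
import pandas as pd

# Made-up stand-in for the survey data; real columns and values may differ.
dete_survey = pd.DataFrame({
    "separationtype": [
        "Resignation-Other reasons",
        "Age Retirement",
        "Resignation-Other employer",
    ],
    "cease_date": ["08/2012", "2013", "05/2013"],
})

# Filtering returns a subset of dete_survey. Calling .copy() on that result
# makes it an independent DataFrame, so later column assignments do not
# raise SettingWithCopyWarning.
is_resignation = dete_survey["separationtype"].str.contains("Resignation")
dete_resignations = dete_survey[is_resignation].copy()

# Keep only the year and convert it to float.
dete_resignations["cease_date"] = (
    dete_resignations["cease_date"].str.split("/").str[-1].astype(float)
)
print(dete_resignations["cease_date"].tolist())
```

The copy() call belongs on the line that creates dete_resignations, not on the lines that later modify cease_date.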

Thank you @the_doctor, but I don’t really understand what it means to use df.copy() on the result.

I am stuck with the same warning. From the documentation, I can’t tell whether I want a deep copy or not (setting the deep parameter to True or False), and I don’t understand the difference. Also, for the rest of the mission, should I keep the same name for the dataframe or use a new name for the copy?

Hi @jane.buchan,

I have created a new topic to explain this:

Let us know if you have any questions on it.

Best,
Sahil
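On the deep question, a small sketch with made-up data. deep=True is the default for DataFrame.copy(), so calling .copy() with no arguments already returns an independent DataFrame. A shallow copy (deep=False) shares data with the original, and how changes propagate between the two depends on the pandas version. Whether to reuse the same variable name or pick a new one is a matter of preference; the names below are only for illustration:

```python
import pandas as pd

# Made-up data for illustration only.
original = pd.DataFrame({"cease_date": ["08/2012", "05/2013"]})

# deep=True is the default, so the copy gets its own data.
independent = original.copy()

# Changing the copy leaves the original untouched.
independent["cease_date"] = independent["cease_date"].str[-4:].astype(float)

print(independent["cease_date"].tolist())
print(original["cease_date"].tolist())
```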

I had trouble understanding this issue as well. I didn’t really understand the documentation so I found this video by Data School to be extremely helpful. Hope this helps!

I don’t see any SettingWithCopy Warning when I execute the same code. Why is that?

Please create a separate post for your question since it’s not relevant to this particular post.

I used the code below and it worked fine. Setting expand=True makes str.extract return a DataFrame, and I then call copy() on the result:

dete_resignations['cease_date'] = dete_resignations['cease_date'].str.extract(r"([1-2][0-9]{3})", expand=True).astype(float).copy()

Then restart the kernel and run all cells so that the change in the last line of code takes effect.
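A hedged variant of the same idea on made-up data: with a single capture group, expand=False returns a Series, which can be assigned straight back to a column. Note that a .copy() at the end of the right-hand side copies only the extracted values; whether the warning appears depends on how dete_resignations itself was created.

```python
import pandas as pd

# Made-up date strings in the formats seen in this thread.
dates = pd.Series(["08/2012", "2013", "05/2013"])

# With one capture group and expand=False, str.extract returns a Series
# holding the first four-digit year found in each string.
years = dates.str.extract(r"([1-2][0-9]{3})", expand=False).astype(float)
print(years.tolist())
```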

I am stuck as well. My code is the same, but I get a different error message:

dete_resignations['cease_date'] = dete_resignations['cease_date'].str.split('/').str[-1]
dete_resignations['cease_date'] = dete_resignations['cease_date'].astype('float')
AttributeError: Can only use .str accessor with string values, which use np.object_ dtype in pandas

I have already tried a few things, including different code (such as the one suggested above), and it is still not running. I would really appreciate any feedback.

Thank you!
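One possible cause, assuming the cell had already been run once in the same session: after the first run cease_date holds floats, so running .str.split on it again raises this AttributeError. Restarting the kernel and running all cells once is one way out. Another is a guard like the sketch below, where year_as_float is a hypothetical helper and the data is made up:

```python
import pandas as pd

# Made-up data standing in for the real column.
dete_resignations = pd.DataFrame({"cease_date": ["08/2012", "2013", "05/2013"]})


def year_as_float(column):
    """Return the year as float; skip string handling if already numeric."""
    if pd.api.types.is_numeric_dtype(column):
        return column.astype(float)
    return column.str.split("/").str[-1].astype(float)


# Running the conversion twice, as happens when a cell is re-executed,
# no longer fails on the second pass.
dete_resignations["cease_date"] = year_as_float(dete_resignations["cease_date"])
dete_resignations["cease_date"] = year_as_float(dete_resignations["cease_date"])
print(dete_resignations["cease_date"].tolist())
```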