I’ve completed the Guided Project: Clean And Analyze Employee Exit Surveys. To tidy it up, I wanted to get rid of the SettingWithCopyWarning messages, but I haven’t been able to resolve them. I also ran into a couple of extra issues while trying, so I thought I’d ask here. Any help would be appreciated.
The 9th challenge out of 11 requires us to clean up the institute_service column:
# Extracts the number of years in the institute from the column by using the specified pattern and
# converting the extracted value to float
combined_updated.loc[:, "institute_service"] = combined_updated["institute_service"].astype(dtype='str').str.extract(pattern)
print(combined_updated["institute_service"])
While trying to resolve the warning, and based on the suggestion in the warning itself, I used .loc. The output was:
3     NaN
5     NaN
8     NaN
9     NaN
11    NaN
       ..
696   NaN
697   NaN
698   NaN
699   NaN
701   NaN
Name: institute_service, Length: 651, dtype: float64
However, if I use plain column assignment instead:
combined_updated["institute_service"] = combined_updated["institute_service"].astype(dtype='str').str.extract(pattern)
print(combined_updated["institute_service"])
The output is as follows:
3        7
5       18
8        3
9       15
11       3
      ...
696      5
697      1
698    NaN
699      5
701      3
Name: institute_service, Length: 651, dtype: object

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
Why do I get all NaN values when I assign through .loc, even though that is what the warning suggests?
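One possible cause, sketched here under assumptions rather than confirmed against the project data: with its default `expand=True`, `str.extract` returns a DataFrame whose single column is labelled `0`, not a Series. Assignment through `.loc` aligns a DataFrame value on column labels, and a column labelled `0` does not match `institute_service`, which could leave NaN everywhere. The small frame and the `(\d+)` pattern below are hypothetical stand-ins, since the real data and pattern are not shown above.

```python
import pandas as pd

# Hypothetical stand-in for combined_updated; the real data is not shown.
df = pd.DataFrame(
    {
        "id": [3, 5, 8],
        "institute_service": ["7", "17-20", "3-4"],
    },
    index=[3, 5, 8],
)

pattern = r"(\d+)"  # assumed pattern; the original is not shown

# Default expand=True: the result is a DataFrame whose only column is labelled 0.
extracted_frame = df["institute_service"].astype("str").str.extract(pattern)
print(type(extracted_frame).__name__, list(extracted_frame.columns))

# expand=False with a single capture group: the result is a Series,
# which aligns on the row index only.
extracted_series = (
    df["institute_service"].astype("str").str.extract(pattern, expand=False)
)
print(type(extracted_series).__name__)

# Assign the Series (converted to float) on an independent copy of the frame.
fixed = df.copy()
fixed["institute_service"] = extracted_series.astype("float")
print(fixed["institute_service"])
```

If this is what is happening, passing `expand=False` so that a Series is assigned should avoid the label mismatch.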
I’m also unsure why I keep getting the warning. I used the following code to make sure the combined dataframe is built from fresh copies of the DETE and TAFE dataframes, but the warning still appears.
dete_resignations_up = dete_resignations.copy(deep=True)
tafe_resignations_up = tafe_resignations.copy(deep=True)
Any idea why this might be?
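A guess at the cause, offered as a sketch only: the warning is tied to the object being assigned to, so copying the input dataframes may not help if `combined_updated` is itself produced later by a filtering step such as `dropna` or boolean indexing. The derivation step below is hypothetical, since how `combined_updated` was created is not shown above, and the tiny `combined` frame is made-up illustration data. Taking an explicit `.copy()` at the point where `combined_updated` is created makes it independent of its parent.

```python
import warnings

import pandas as pd

# Made-up illustration data; not the project's data.
combined = pd.DataFrame(
    {
        "institute_service": ["7", None, "3-4", "15"],
        "unused": [None, None, None, 1.0],
    }
)

# Hypothetical derivation step: filter rows, then copy explicitly so that
# combined_updated is an independent DataFrame rather than a slice.
combined_updated = combined.dropna(subset=["institute_service"]).copy()

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Plain column assignment on the independent copy.
    combined_updated["institute_service"] = (
        combined_updated["institute_service"]
        .astype("str")
        .str.extract(r"(\d+)", expand=False)  # assumed pattern
    )

# Collect any SettingWithCopyWarning raised during the assignment.
copy_warnings = [w for w in caught if "SettingWithCopy" in w.category.__name__]
print(len(copy_warnings))
print(combined_updated["institute_service"])
```

The parent frame `combined` is left untouched by the assignment, which is the point of copying at that step.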