While trying to generate a new column from a sliced and cleaned column in a pandas DataFrame, I keep getting this warning:
“A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead”
I am working on the 6th Guided Project: Employee Exit Survey, in the Data Cleaning Module.
The relevant code is the following:
    yrs_str = combined_updated['institute_service'].astype('str')  # convert to string
    patt = r"([0-9]+)[-\.]?([0-9]+)?"  # regular expression pattern to extract up to 2 numbers
    yrs_extr = yrs_str.str.extract(patt)  # extract numbers
    yrs_cln = yrs_extr.dropna(how='all')  # remove all rows with both elements NaN - 88 rows (ref above)
    # fill single-column NaN with '0' >> convert to float >> apply calc_yrs function
    yrs_calc = yrs_cln.fillna('0').astype('float').apply(calc_yrs, axis=1)
    combined_updated.loc[:, 'service_cat'] = yrs_calc.map(carr_stage).copy()
I do understand that the Series “yrs_calc” I generated does not have the same length as the DataFrame, because I removed the 88 all-NaN rows from it. Nevertheless, I would expect to be able to generate a new column with the .loc assignment quoted above.
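To illustrate what I expect from that assignment, here is a minimal sketch on made-up data. The DataFrame `df` and the column names `years` and `doubled` are hypothetical and not part of the project. As far as I understand, a Series that lost rows through dropna is aligned on its index labels when it is assigned, so the dropped rows should simply end up as NaN in the new column.

```python
import pandas as pd

# Hypothetical toy data standing in for combined_updated.
df = pd.DataFrame({"years": [1.0, 3.0, None, 11.0]})

# dropna removes row 2, so the result keeps index labels 0, 1 and 3 only.
shorter = df["years"].dropna() * 2

# Assignment aligns on index labels; the missing label 2 is filled with NaN.
df.loc[:, "doubled"] = shorter

print(df)
```

In this sketch the shorter Series does not raise an error; the row that was dropped just has no value in the new column.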
But I still get the “copy of a slice” warning. Therefore I assume that I have misunderstood the explanations given at the link below:
I would appreciate a hint or explanation of where I failed to apply the slicing properly.
Thanks for reading.