combined_updated['institute_service_up'] = combined_updated['institute_service'].astype('str').str.extract(r'(\d+)')
combined_updated['institute_service_up'] = combined_updated['institute_service_up'].astype('float')
What I expected to happen:
For the code to run
What actually happened:
KeyError on 'institute_service'
Where is the KeyError coming from? I’ve uploaded my full project below.
Guided Project Clean And Analyze Employee Exit Surveys.tar (699 KB)
The step that drops null values from the combined dataframe has left very few columns, and the column in question was deleted along with the rest. Look at the output of "combined_updated.head()" in code cell 9.
You may need to rework that step and the ones before it, so that this column does not get dropped from the combined dataframe.
Thank you for the response, but I’m still stuck. My code is the same as the solution, but it still won’t work. Please help.
DETE has a column "institue_service" and TAFE has a column "institute service". Because the names differ, they form two separate columns when the dataframes are combined. Each of those columns has fewer non-null values than the threshold, so both get dropped.
Apologies, but I just can’t seem to find the problem. I’ve looked at every instance of ‘institute’ and checked for spelling, but they all look right to me. Can you point me to the specific lines that are causing the error?
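The effect described in the previous reply can be reproduced on a tiny example. This is only a sketch: the dataframes, their values, and the threshold of 4 are made up for illustration and are not taken from the uploaded project. It shows that when the two frames spell the column differently, concatenation keeps both spellings as separate, half-empty columns, and a non-null threshold then removes both.

```python
import pandas as pd

# Hypothetical stand-ins for the two survey dataframes (made-up values).
dete = pd.DataFrame({"id": [1, 2, 3], "institute_service": ["1-2", "3-4", "5-6"]})
tafe = pd.DataFrame({"id": [4, 5, 6], "institute service": ["7-10", "1-2", "3-4"]})

# The names differ (underscore vs. space), so concat keeps two columns,
# each filled only for the rows that came from its own dataframe.
combined = pd.concat([dete, tafe], ignore_index=True)
non_null_counts = combined.notnull().sum()
print(non_null_counts)

# Keep only columns with at least 4 non-null values (illustrative threshold).
combined_updated = combined.dropna(thresh=4, axis=1)
print(combined_updated.columns.tolist())

# At this point combined_updated['institute_service'] would raise a KeyError.
```

Checking `combined.notnull().sum()` right before the drop step is a quick way to see which columns are about to disappear.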
Are you sure about this? Have you tried printing all the column names from each dataset?
Try this just before the code cell where you combine the two datasets, then use the “Restart Run All Cells” command to execute the entire notebook.
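A minimal sketch of that check follows. The names `dete_resignations` and `tafe_resignations` are assumptions standing in for whatever the two dataframes are called in the notebook, and the empty frames here exist only so the snippet runs on its own. Besides printing the names, it lists the columns that appear in one dataframe but not the other, which makes a near-miss spelling easier to spot.

```python
import pandas as pd

# Hypothetical placeholders; in the notebook, use the real dataframes instead.
dete_resignations = pd.DataFrame(columns=["id", "institute_service"])
tafe_resignations = pd.DataFrame(columns=["id", "institute service"])

# Print every column name from each dataset.
print(dete_resignations.columns.tolist())
print(tafe_resignations.columns.tolist())

# Columns present in only one of the two dataframes.
only_in_dete = sorted(set(dete_resignations.columns) - set(tafe_resignations.columns))
only_in_tafe = sorted(set(tafe_resignations.columns) - set(dete_resignations.columns))
print("Only in DETE:", only_in_dete)
print("Only in TAFE:", only_in_tafe)
```

Any column that is meant to be shared but shows up in one of the "only in" lists is a candidate for the mismatch.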
I am having the same problem, but do not know how to prevent this from happening.
Update: I got it to work. Way back when I updated the tafe dataframe, I misspelled the column name in the step that was supposed to create the institute_service column. The rename did not change the name, so it did not work as intended.
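One way a misspelling like this can go unnoticed is that `DataFrame.rename` ignores keys it cannot find by default. The sketch below assumes the rename was done with a mapping. The original column name "Length of Service" and its misspelled variant are invented for illustration and may not match the real TAFE columns. Passing `errors="raise"`, which is available in recent pandas versions, turns the silent no-op into an error.

```python
import pandas as pd

# Hypothetical TAFE frame; the column name is made up for this example.
tafe = pd.DataFrame({"Length of Service": ["1-2", "3-4"]})

# The key is misspelled, so rename finds nothing to change and
# returns the dataframe with its columns untouched, without any warning.
renamed = tafe.rename(columns={"Lenght of Service": "institute_service"})
print(renamed.columns.tolist())

# With errors="raise", the missing key is reported instead of ignored.
try:
    tafe.rename(columns={"Lenght of Service": "institute_service"}, errors="raise")
    raised = False
except KeyError as exc:
    raised = True
    print("rename failed:", exc)
```

Printing the column names right after a rename is another cheap way to confirm that it actually took effect.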