On step 2 of the ‘Creating a Kaggle Workflow’ guided project, I’ve created a function that applies several processing functions to the train and holdout dataframes.
    def preprocessing(df):
        df = process_missing(df)
        df = process_age(df)
        df = process_fare(df)
        df = process_titles(df)
        df = process_cabin(df)
        columns = ['Age_categories', 'Fare_categories', 'Title', 'Cabin_type', 'Sex']
        for col in columns:
            df = create_dummies(df, col)
        return df

    train = preprocessing(train)
    holdout = preprocessing(holdout)
However, when I run the above function, a long error message pops up that includes the following text (I’ve included only part of the error message since I can’t copy and paste the whole thing).
    KeyError                                  Traceback (most recent call last)
    <ipython-input-4-5b9da2c53e03> in <module>()
         29     return df
         30
    ---> 31 train = pre_process(train)
         32 holdout = pre_process(holdout)

    <ipython-input-4-5b9da2c53e03> in pre_process(df)
         21     df = process_fare(df)
         22     df = process_titles(df)
    ---> 23     df = process_cabin(df)
         24
         25     for col in ["Age_categories","Fare_categories",

    <ipython-input-3-85f0c13ce57d> in process_cabin(df)
         47     train process_cabin(train)
         48     """
    ---> 49     df["Cabin_type"] = df["Cabin"].str
         50     df["Cabin_type"] = df["Cabin_type"].fillna("Unknown")
         51     df = df.drop('Cabin',axis=1)

    KeyError: 'Cabin'
I can’t figure out what is wrong with the code. I’ve also copied and pasted the code from the solution notebook and the same KeyError message pops up (https://github.com/dataquestio/solutions/blob/master/Mission188Solution.ipynb).
Any help would be very much appreciated.
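For anyone trying to reproduce this, here is a minimal sketch of one way this exact KeyError can arise. It assumes the cell was run more than once, so that `train` had already been processed and no longer had a `Cabin` column (the traceback shows `process_cabin` dropping it on line 51). The `process_cabin` below is a simplified stand-in, not the course's helper: the `.str[0]` indexing and the three-row sample data are assumptions made for illustration.

```python
import pandas as pd


def process_cabin(df):
    # Simplified stand-in for the course helper (assumption): take the first
    # letter of the cabin code, label missing values, then drop the original column.
    df["Cabin_type"] = df["Cabin"].str[0]
    df["Cabin_type"] = df["Cabin_type"].fillna("Unknown")
    df = df.drop("Cabin", axis=1)
    return df


# Hypothetical sample data standing in for the Titanic train set.
train = pd.DataFrame({"Cabin": ["C85", None, "E46"]})

# First run works and removes the 'Cabin' column.
train = process_cabin(train)
print(list(train.columns))

# Second run on the already-processed dataframe fails, because 'Cabin' is gone.
second_run_error = None
try:
    train = process_cabin(train)
except KeyError as err:
    second_run_error = err
print(repr(second_run_error))
```

If this is the cause, checking `'Cabin' in train.columns` before calling the function would confirm it, and re-reading the original data before re-running the preprocessing cell should avoid the error.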