Hi,
I am trying to update two DataFrames, train and holdout. I wrote two nested for-loops to do this, but it is not working the way I expected. Please help me understand the logic.
Screen Link:
https://app.dataquest.io/m/186/feature-preparation%2C-selection-and-engineering/1/introduction
My Code:
def create_dummies(df, column_name):
    dummies = pd.get_dummies(df[column_name], prefix=column_name)
    df = pd.concat([df, dummies], axis=1)
    return df

df_cat = [train, holdout]
categories = ['Age_categories', 'Pclass', 'Sex']

for d in df_cat:
    for c in categories:
        d = create_dummies(d, c)
    print(d.columns)

print(train.columns)
print(holdout.columns)
What I expected to happen:
I was expecting the datasets train and holdout to be updated with the new columns, but that has not happened. When I print the variable 'd', I can see that the modification has happened, but it is not reflected in the actual datasets.
What actually happened:
The actual datasets train and holdout are not updated.
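Minimal reproduction:
Here is a small self-contained sketch of the same behaviour. The data is made up for illustration (it is not the real Titanic data), and add_all_dummies is a hypothetical helper name, not something from the course. As far as I can tell, the loop variable d is simply rebound to the new DataFrame returned by create_dummies, so the names train and holdout keep pointing at the originals; assigning the returned DataFrame back to those names appears to be one way around it.

```python
import pandas as pd

def create_dummies(df, column_name):
    # pd.concat returns a NEW DataFrame; the one passed in is left untouched.
    dummies = pd.get_dummies(df[column_name], prefix=column_name)
    return pd.concat([df, dummies], axis=1)

# Made-up stand-in data (assumption: not the real course datasets).
train = pd.DataFrame({"Pclass": [1, 2, 3], "Sex": ["male", "female", "male"]})
holdout = pd.DataFrame({"Pclass": [3, 1, 2], "Sex": ["female", "male", "female"]})
categories = ["Pclass", "Sex"]

# Same pattern as in my code: `d` is rebound to a new object on each pass,
# so `train` and `holdout` still refer to the original DataFrames.
for d in [train, holdout]:
    for c in categories:
        d = create_dummies(d, c)

columns_after_loop = list(train.columns)  # still only the original columns

# Hypothetical helper: build up the new DataFrame and return it,
# so the caller can assign it back to the original name.
def add_all_dummies(df, column_names):
    for c in column_names:
        df = create_dummies(df, c)
    return df

train = add_all_dummies(train, categories)
holdout = add_all_dummies(holdout, categories)
print(train.columns.tolist())
```

With this version the dummy columns show up on train and holdout, because the result of each call is stored back under the original name instead of only in the loop variable.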