I am trying to update two DataFrames, train and holdout, by adding dummy columns to each. I wrote a pair of nested for-loops to do this, but it is not working the way I expected. Please help me understand the logic.
    def create_dummies(df, column_name):
        dummies = pd.get_dummies(df[column_name], prefix=column_name)
        df = pd.concat([df, dummies], axis=1)
        return df

    df_cat = [train, holdout]
    categories = ['Age_categories', 'Pclass', 'Sex']

    for d in df_cat:
        for c in categories:
            d = create_dummies(d, c)
        print(d.columns)

    print(train.columns)
    print(holdout.columns)
What I expected to happen:
I expected train and holdout to be updated with the new dummy columns, but that has not happened. When I print the loop variable 'd', I can see that the new columns were added to it. However, the change is not reflected in the actual datasets.
What actually happened:
The actual datasets train and holdout are not updated: print(train.columns) and print(holdout.columns) still show only the original columns.
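To make this easier to reproduce, here is a minimal self-contained sketch. The toy data, the helper add_all_dummies, and the variable unchanged_columns are made up for illustration and are not part of my real code. As far as I can tell, the line d = create_dummies(d, c) only points the name d at the new DataFrame returned by pd.concat, so train and holdout keep referring to the original objects. The second half shows one possible workaround, assigning the returned frames back to the original names; I am not sure it is the best approach.

```python
import pandas as pd


def create_dummies(df, column_name):
    # pd.concat returns a new DataFrame; the input df is not modified.
    dummies = pd.get_dummies(df[column_name], prefix=column_name)
    return pd.concat([df, dummies], axis=1)


# Toy stand-ins for the real datasets (assumed values, illustration only).
train = pd.DataFrame({"Pclass": [1, 2, 3], "Sex": ["male", "female", "male"]})
holdout = pd.DataFrame({"Pclass": [3, 1, 2], "Sex": ["female", "female", "male"]})
categories = ["Pclass", "Sex"]

# Same pattern as in the question: d is rebound to a new object each time,
# while the names train and holdout still point at the original frames.
for d in [train, holdout]:
    for c in categories:
        d = create_dummies(d, c)

unchanged_columns = list(train.columns)


def add_all_dummies(df, columns):
    # Hypothetical helper: apply create_dummies for every column in turn
    # and return the final result to the caller.
    for c in columns:
        df = create_dummies(df, c)
    return df


# Possible workaround: assign the returned frames back to the names.
train = add_all_dummies(train, categories)
holdout = add_all_dummies(holdout, categories)

print(unchanged_columns)
print(list(train.columns))
```

With this version the first print shows only the original columns, and the second shows the dummy columns as well, because the returned DataFrames are stored back into train and holdout.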