Solution Recommendation

Screen Link:

Hi guys,

I noticed that the recommended solution uses RFECV to identify the best features, but unfortunately this results in a set of columns with a high level of multicollinearity, e.g. ‘Title_Miss’, ‘Title_Mr’, ‘Title_Mrs’, ‘Sex_female’, ‘Sex_male’. Sex_female and Sex_male are exact complements of each other, and the Title columns largely encode the same information. Instead of these 5 feature columns, we could have used just one: Sex_female, Sex_male, or Title_Mr.
Wouldn’t that have been better?
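
To show what I mean, here is a rough sketch of how the overlap could be checked. The data below is made up for illustration (it is not the real Titanic set), and `drop_correlated` is a hypothetical helper of mine, not something from the recommended solution. It simply keeps a column only if it is not almost perfectly correlated with a column that was already kept.

```python
import pandas as pd

# Toy data, invented for illustration only (not the real Titanic set).
toy = pd.DataFrame({
    "Sex":   ["male", "female", "female", "male", "female", "male"],
    "Title": ["Mr", "Mrs", "Miss", "Mr", "Miss", "Mr"],
})

# One-hot encode, giving Sex_female, Sex_male, Title_Miss, Title_Mr, Title_Mrs.
dummies = pd.get_dummies(toy, columns=["Sex", "Title"], dtype=int)

# Pairwise Pearson correlations between the dummy columns.
corr = dummies.corr()
print(corr.round(2))


def drop_correlated(df, threshold=0.95):
    """Hypothetical helper: keep a column only if its absolute correlation
    with every already-kept column is below `threshold`."""
    abs_corr = df.corr().abs()
    kept = []
    for col in df.columns:
        if all(abs_corr.loc[col, k] < threshold for k in kept):
            kept.append(col)
    return df[kept]


reduced = drop_correlated(dummies)
print(list(reduced.columns))
```

In this toy frame Title_Mr happens to be identical to Sex_male because every male row has the title Mr. In the real data the match would not be exact (there are other titles), so the correlations would be high rather than perfect, and the 0.95 threshold is just an arbitrary choice on my part.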