Feature selection

This page is kinda confusing, and I’d be grateful for an explanation.
On the previous page I learnt about feature selection based on the variance of the features.

On the current page, however, I’m assuming that the Open Porch SF column was dropped because it has the lowest variance, but there was no explicit explanation of why this happened.

Can someone explain why we’re dropping the Open Porch SF column? Thanks in advance.

Link: https://app.dataquest.io/m/236/feature-selection/6/final-model

You are assuming correctly, and the previous page explained why we do this:

When the values in a feature column have low variance, they don’t meaningfully contribute to the model’s predictive capability.

That’s why we are dropping this column.
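
To make this concrete, here is a minimal sketch of how per-column variances can be compared. It assumes pandas, and the toy data below is made up for illustration; only the column name Open Porch SF comes from the lesson. The columns are min-max rescaled to [0, 1] first, which is one common way to make variances comparable across different units, though I can't confirm it is exactly what the lesson does.

```python
import pandas as pd

# Hypothetical toy data (20 rows); only the name "Open Porch SF" is from the lesson.
df = pd.DataFrame({
    "Open Porch SF": [0] * 19 + [100],            # almost every row has the same value
    "Gr Liv Area": list(range(900, 2900, 100)),   # values spread evenly over the range
})

# Min-max rescale each column to [0, 1] so variances are comparable.
rescaled = (df - df.min()) / (df.max() - df.min())

# Sample variance per column, lowest first.
variances = rescaled.var().sort_values()
print(variances)
```

The column where nearly all rows share one value ends up with the lower variance, so it tells the model very little about how one row differs from another.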

Does that answer your question or did I miss the point?

But its variance wasn’t exactly zero; it was 0.014. Plus, the feature with the second-lowest variance, 0.019, wasn’t removed.

It doesn’t need to be exactly zero, just low enough. Exactly how low the variance has to be before you drop the column is a judgement call. This answer may shed some additional light on the topic.
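
As a rough sketch of what such a judgement call looks like in code: the values 0.014 and 0.019 are the ones quoted in this thread, while the cutoff of 0.015, the names feature_b and feature_c, and the 0.060 value are assumptions chosen only for illustration.

```python
import pandas as pd

# 0.014 and 0.019 are the variances quoted in this thread;
# "feature_b", "feature_c", and 0.060 are hypothetical placeholders.
variances = pd.Series({
    "Open Porch SF": 0.014,
    "feature_b": 0.019,
    "feature_c": 0.060,
})

cutoff = 0.015  # assumed cutoff; picking this number is the judgement call

# Columns strictly below the cutoff are dropped, the rest are kept.
to_drop = variances[variances < cutoff].index.tolist()
to_keep = variances[variances >= cutoff].index.tolist()

print("drop:", to_drop)
print("keep:", to_keep)
```

With a cutoff anywhere between 0.014 and 0.019, only Open Porch SF falls below it, which would explain why the second-lowest feature stays in.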
