How do we know how low a feature's variance must be before it is dropped?

Screen Link:
https://app.dataquest.io/m/236/feature-selection/6/final-model

In this screen, one variable was dropped because it has the lowest variance of all. But how was it decided that its variance was low enough for it to be dropped?

Hi @SanchitSinghal

I haven’t done this mission yet, and I don’t know which track or course it belongs to.

But from what I understand of the mission's content, we compare the variances of all the features with one another. If you sort the results in descending order, the feature that comes last is the one we want to drop from the model.
It does not rely on a generic rule that the variance must be above some fixed threshold.
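To make the "sort and drop the last one" idea concrete, here is a minimal sketch. The column names and numbers are made up for illustration and are not the mission's housing data. I'm also assuming the features are first rescaled to a 0 to 1 range so that their variances are comparable, which may or may not match exactly what the mission does.

```python
from statistics import variance

# Hypothetical toy data, NOT the dataset used in the mission.
features = {
    "col_a": [1.0, 2.0, 3.0, 4.0, 10.0],
    "col_b": [0.0, 5.0, 5.0, 5.0, 10.0],
    "col_c": [0.0, 50.0, 100.0, 25.0, 75.0],
}

def min_max_rescale(values):
    """Rescale values to the 0-1 range (assumed preprocessing step)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Sample variance of each rescaled feature.
variances = {
    name: variance(min_max_rescale(vals)) for name, vals in features.items()
}

# Sort in descending order; the last entry has the lowest variance.
ranked = sorted(variances.items(), key=lambda kv: kv[1], reverse=True)
lowest_feature = ranked[-1][0]
kept = [name for name, _ in ranked[:-1]]

print(ranked)
print("drop:", lowest_feature, "keep:", kept)
```

The decision here is purely relative: whichever feature ends up last in the ranking is the candidate to drop, no matter what its absolute variance is.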

If you look at this table from the content, the lowest value is for Open Porch SF.
image

To strengthen that, the following excerpt highlights the difference in variance between Open Porch SF and Full Bath, compared with the difference between Full Bath and Garage Area:

image

Full Bath - Open Porch SF => 0.018621 - 0.013938 = 0.004683
Garage Area - Full Bath => 0.020347 - 0.018621 = 0.001726

The difference in variance is larger in the former case than in the latter, so Open Porch SF stands out as the least variable feature among those available in the dataset.
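The same arithmetic can be reproduced in a few lines of Python. This sketch uses only the three variance values quoted in this post; the variable names are my own.

```python
# Variances quoted in this post, sorted in descending order.
sorted_variances = [
    ("Garage Area", 0.020347),
    ("Full Bath", 0.018621),
    ("Open Porch SF", 0.013938),
]

# Gap between each feature and the next one down in the ranking.
gaps = {}
for (upper_name, upper_var), (lower_name, lower_var) in zip(
    sorted_variances, sorted_variances[1:]
):
    gaps[f"{upper_name} - {lower_name}"] = round(upper_var - lower_var, 6)

for pair, gap in gaps.items():
    print(pair, "=>", gap)
```

The larger gap is the one just above Open Porch SF, which is what makes it look like the odd one out at the bottom of the list.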
