Min-Max Feature Scaling vs. Standardization


I’d like to know the difference between min-max feature scaling, mentioned in the Ranking Customers task of the Fuzzy Language in Data Science mission (https://app.dataquest.io/m/466/fuzzy-language-in-data-science/6/ranking-customers), and standardization, mentioned in Using Standardization for Comparisons, which is part of the Z-Scores mission (https://app.dataquest.io/m/309/z-scores/8/using-standardization-for-comparisons). Both are in the Data Analyst Path.

I don’t have a background in mathematics, so I fear there might be a simple answer to this question that I’m not seeing. On the other hand, I tried to solve the task in the fuzzy language mission with z-scores and it worked!

I used the following code, which is pretty similar to the one contained in the answer key of the Dataquest platform, to “rescale”.

scaled_tran = (best_churn["nr_of_transactions"]-best_churn["nr_of_transactions"].min())/(best_churn["nr_of_transactions"].max()-best_churn["nr_of_transactions"].min())
best_churn["scaled_tran"] = scaled_tran

scaled_amount = (best_churn["amount_spent"]-best_churn["amount_spent"].min())/(best_churn["amount_spent"].max()-best_churn["amount_spent"].min())
best_churn["scaled_amount"] = scaled_amount
best_churn["score"] = ((scaled_tran*0.5) + (scaled_amount*0.5))*100
best_churn.sort_values("score",ascending = False).head(10)

To find the z-scores, I did:

test = best_churn.copy()
test["z_score"] = (best_churn["nr_of_transactions"]-best_churn["nr_of_transactions"].mean())/best_churn["nr_of_transactions"].std()
test["z_score_two"] = (best_churn["amount_spent"]-best_churn["amount_spent"].mean())/best_churn["amount_spent"].std()
test["final"] = (test["z_score"]+test["z_score_two"])/2
test.sort_values("final",ascending = False).head(10)

What I expected to happen: as far as I understand from the formula for min-max scaling, the distance from a given value to the minimum value (X - min(X)) is expressed as a proportion of the range of the scale (max(X) - min(X)), much like converting from Celsius to Fahrenheit in high school.

As for the standardization, it’s finding the number of standard deviations that fit in the “distance” between a given value and the mean.
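To make the two formulas concrete, here is a minimal sketch on made-up numbers. The helper names `min_max_scale` and `standardize` and the `values` series are assumptions for illustration only; they are not part of the Dataquest exercise.

```python
import pandas as pd


def min_max_scale(s: pd.Series) -> pd.Series:
    # Distance from the minimum, as a fraction of the full range.
    return (s - s.min()) / (s.max() - s.min())


def standardize(s: pd.Series) -> pd.Series:
    # Distance from the mean, in units of the sample standard deviation
    # (pandas uses ddof=1 by default).
    return (s - s.mean()) / s.std()


# Hypothetical values, chosen only to keep the arithmetic easy to follow.
values = pd.Series([10.0, 20.0, 30.0, 40.0, 50.0])

scaled = min_max_scale(values)
z = standardize(values)

print(scaled.tolist())  # [0.0, 0.25, 0.5, 0.75, 1.0]
print(z.round(3).tolist())
```

The min-max result always runs from 0 to 1, while the z-scores are centered on 0 with a standard deviation of 1 and have no fixed minimum or maximum.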

I get that in both cases I’m finding a common value that allows for comparison between two scales, and this is what I expected from both techniques.

Still, what are the differences between them, and is there anything I should watch for when using one over the other?

I’m not sure what you mean by it working, but the ranks aren’t the same for test and for best_churn. The eighth customer is different in each of them.
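As a rough illustration of how the two rankings can diverge, here is a small sketch on invented data. The `customers` frame and its values are hypothetical and are not the best_churn data from the mission; only the column names are borrowed from it.

```python
import pandas as pd

# Hypothetical customers: F has far more transactions than anyone else,
# while amount_spent only takes two values.
customers = pd.DataFrame(
    {
        "nr_of_transactions": [2, 4, 6, 8, 10, 60],
        "amount_spent": [100, 500, 100, 500, 500, 100],
    },
    index=list("ABCDEF"),
)


def min_max_scale(s: pd.Series) -> pd.Series:
    return (s - s.min()) / (s.max() - s.min())


def standardize(s: pd.Series) -> pd.Series:
    return (s - s.mean()) / s.std()


# Equal-weight average of the two rescaled columns, as in the question.
min_max_score = customers.apply(min_max_scale).mean(axis=1)
z_score = customers.apply(standardize).mean(axis=1)

min_max_rank = min_max_score.sort_values(ascending=False).index.tolist()
z_rank = z_score.sort_values(ascending=False).index.tolist()

print(min_max_rank)  # ['E', 'D', 'B', 'F', 'C', 'A']
print(z_rank)        # ['F', 'E', 'D', 'B', 'C', 'A']
```

Both scores are weighted sums of the raw columns. Min-max divides each column by its range and standardization divides it by its standard deviation, so the order can change whenever those two ratios differ between columns, as they do here because of customer F.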

It really depends on what you’re trying to do. Standardization doesn’t necessarily reduce the order of magnitude of the values. Large values may still be very large and small values very small, so when you add two different scales, you can still run into the same issues depicted on this screen.
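A small sketch of that point, using two made-up columns (`outlier_col` and `even_col` are assumptions for illustration): after standardization, one column can still reach much larger values than the other, whereas min-max scaling caps both at 1.

```python
import pandas as pd


def min_max_scale(s: pd.Series) -> pd.Series:
    return (s - s.min()) / (s.max() - s.min())


def standardize(s: pd.Series) -> pd.Series:
    return (s - s.mean()) / s.std()


# Hypothetical columns: one with a single extreme value, one evenly split.
outlier_col = pd.Series([0.0] * 99 + [1000.0])
even_col = pd.Series([0.0, 1.0] * 50)

z_outlier = standardize(outlier_col)
z_even = standardize(even_col)

print(z_outlier.max())  # about 9.9
print(z_even.max())     # just under 1

# Min-max scaling puts both columns on exactly the same 0-to-1 range.
print(min_max_scale(outlier_col).max(), min_max_scale(even_col).max())  # 1.0 1.0
```

If these two standardized columns were added together, the extreme value in the first would contribute roughly ten times as much as any value in the second.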
