# Z-scores_Converting Back from Z-scores

My Code:

mean = 50
st_dev = 10
houses["dis"] = houses["z_merged"].apply(lambda z: z * st_dev + mean)
mean_transformed = houses["dis"].mean()
stdev_transformed = houses["dis"].std(ddof=0)


What I expected to happen:
I don’t understand this point:

“We are actually free to choose any values we want for the mean and standard deviation. We want some more intuitive values for our two standardized distributions of index values, so let’s choose mean = 50 and standard deviation = 10.”


On what basis were these values chosen?

Yeah, this was not clear to me either as to why those values were more intuitive and how to identify the values through intuition in such a case.

@Sahil If it’s not too much trouble, is it possible to have a response from the content author(s) about this?

I think in this case it’s trying to set up something like a percent (out of 100) scale. If 50 is the ‘middle’ or mean value, then anything greater than 50 is higher quality and anything lower than 50 is lower quality. That kind of scale leaves room for really good values to be near (or maybe even exceed) 100, and for really poor values to be near 0.

What I was a little confused about is why we’re switching to ddof=0 again, but I’m guessing it’s because we’re treating the data as the entire population again and not as a sample, which is where ddof=1 applies.
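To make the ddof point concrete, here is a small sketch using only Python's standard library. The numbers are made up for illustration and are not from the houses data. `statistics.pstdev` divides by N, which is what `ddof=0` asks pandas to do, while `statistics.stdev` divides by N - 1, which matches `ddof=1`.

```python
import statistics

# Made-up values for illustration only (not from the houses dataset).
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# Population standard deviation: divide by N (the ddof=0 case).
population_sd = statistics.pstdev(values)

# Sample standard deviation: divide by N - 1 (the ddof=1 case).
sample_sd = statistics.stdev(values)

print("population (ddof=0):", population_sd)
print("sample     (ddof=1):", sample_sd)
```

The sample version is always a little larger for the same data, and only the population version gives back exactly the standard deviation that was chosen for the transformation.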

To answer this, we have to start with the previous screen. Initially, the index values of the two companies were like this:

   index_1   index_2  SalePrice
0      NaN -0.411111     215000
1    38.05       NaN     105000
2      NaN -0.888889     172000
3    39.44       NaN     244000
4      NaN -0.690000     189900

The initial problem was that the measurement system used by company 1 (index_1) was not comparable to the system used by company 2 (index_2). So we standardized both by transforming them into z-scores.

However, now the issue is that our values look like this:

        z_1       z_2
0       NaN  0.429742
1 -0.935920       NaN
2       NaN -0.114456
3  0.786063       NaN
4       NaN  0.112082

While these values are good enough for comparison, they are not easy to communicate to non-technical audiences. So here, we have used the formula x = zσ + μ to convert each z-score into something that a non-technical audience can easily understand. Our choice of μ = 50 and σ = 10 is somewhat arbitrary (more on that below). This is what our new values look like:

0    54.297418
1    40.640797
2    48.855438
3    57.860626
4    51.120821
Name: transformed, dtype: float64
Min:  29.217360116843054 Max:  121.37299126210257
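As a quick check of the formula, here is a minimal sketch in plain Python that applies x = zσ + μ to the five z-scores shown earlier (merged into one list) and then converts them back with z = (x - μ) / σ. The helper names `from_z` and `to_z` are made up for this example.

```python
MEAN = 50
ST_DEV = 10

def from_z(z, mean=MEAN, st_dev=ST_DEV):
    """Convert a z-score to the new scale: x = z * sigma + mu."""
    return z * st_dev + mean

def to_z(x, mean=MEAN, st_dev=ST_DEV):
    """Convert a value on the new scale back to a z-score."""
    return (x - mean) / st_dev

# The five z-scores from the z_1 / z_2 table, merged row by row.
z_scores = [0.429742, -0.935920, -0.114456, 0.786063, 0.112082]

transformed = [from_z(z) for z in z_scores]
recovered = [to_z(x) for x in transformed]

for z, x in zip(z_scores, transformed):
    print(f"z = {z:+.6f}  ->  x = {x:.5f}")
```

The results agree with the first five transformed values listed above up to the precision of the displayed z-scores, and converting back returns the original z-scores.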


It’s not completely arbitrary, though. We have to ensure that the new values make sense to a non-technical audience. For example, if we use μ = 10 and σ = 50, the scores are not intuitive, because some of them become negative and the range is very wide:

0    31.487092
1   -36.796016
2     4.277190
3    49.303132
4    15.604103
Name: transformed, dtype: float64
Min:  -93.91319941578475 Max:  366.86495631051287


Here, we have to experiment with a few values to ensure that the minimum is at least greater than 0, which makes the scale more intuitive. While there are many ways to find an intuitive range of values, we can use an approach similar to the one above. Here is what I would suggest doing:

1. Define the value range (e.g. 0 - 100)
2. Set the middle value of the range as the mean (e.g. μ = 50)
3. Play around with the standard deviation until the minimum and maximum values fall within the defined range (e.g. σ = 7.0054)
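Instead of trial and error, step 3 can also be solved directly. The sketch below is one possible way to do it, not necessarily how the value above was found. The function name `largest_sigma` is hypothetical, and the smallest and largest z-scores are backed out of the Min and Max reported earlier for μ = 50 and σ = 10. It assumes the smallest z-score is negative and the largest is positive.

```python
def largest_sigma(z_min, z_max, low=0.0, high=100.0):
    """Return (mu, sigma) with mu at the middle of [low, high] and the
    largest sigma that keeps every transformed value inside the range.

    Assumes z_min < 0 < z_max.
    """
    mu = (low + high) / 2
    # The largest z-score must not go above `high` ...
    sigma_for_max = (high - mu) / z_max
    # ... and the smallest z-score must not go below `low`.
    sigma_for_min = (mu - low) / -z_min
    return mu, min(sigma_for_max, sigma_for_min)

# Extreme z-scores recovered from the reported Min/Max (mu=50, sigma=10).
z_min = (29.217360116843054 - 50) / 10
z_max = (121.37299126210257 - 50) / 10

mu, sigma = largest_sigma(z_min, z_max)
print("mu:", mu, "sigma:", sigma)
print("new min:", mu + sigma * z_min, "new max:", mu + sigma * z_max)
```

This gives a σ of roughly 7.0055. Any value at or slightly below it, such as the 7.0054 used here, keeps the maximum just under 100.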

Here is what we would get with the above values:

0    53.010513
1    43.443504
2    49.198189
3    55.506683
4    50.785180
Name: transformed, dtype: float64
Min:  35.44092945625323 Max:  99.99963529875333


And if we round them, they look even better (Note: once we round, we can no longer recover the exact original values, so make sure to keep a copy of the unrounded values):

0    53
1    43
2    49
3    56
4    51
Name: transformed, dtype: int64
Min:  35 Max:  100
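To see why rounding cannot be undone, here is a small sketch with two made-up z-scores (not from the data) that end up with the same rounded score. It uses μ = 50 and σ = 10 purely to keep the arithmetic easy to follow.

```python
mean, st_dev = 50, 10

# Two different (made-up) z-scores.
z_a, z_b = 0.40, 0.43

# After transforming and rounding, both become the same score.
score_a = round(z_a * st_dev + mean)
score_b = round(z_b * st_dev + mean)

# Converting the rounded score back gives one z-score for both inputs,
# so the difference between z_a and z_b is lost.
recovered = (score_a - mean) / st_dev

print(score_a, score_b, recovered)
```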


Let me know if it’s still not clear. I would request the content author to comment on it.

Best,
Sahil

Hi,