Is Bessel's correction applied in the coding exercise of the z-score mission?

Screen Link:

My Code:

def zscore(val, arr):
    return (val-np.mean(arr))/np.std(arr, ddof=0)

What I expected to happen:

The Learn section covers the z-score and how to calculate it, and it mentions that we should apply Bessel’s correction when calculating the standard deviation. To my understanding, this would mean the ddof parameter of np.std() should be set to 1?

What actually happened:

But the function as shown above is accepted: it works with ddof set to 0.

So my question is: am I misunderstanding the formula for the z-score calculation, or is the correction not applied in the exercise?

Hello @takacs

Welcome to the community!

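Degrees of freedom come into play when you try to estimate statistics of a population from one of its samples. In np.std(), ddof stands for "delta degrees of freedom": the sum of squared deviations is divided by N - ddof, where N is the number of values. When you are calculating statistics with the full population data, no correction is needed, so ddof=0. When you are estimating from a sample, Bessel’s correction divides by N - 1, which is ddof=1.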

This article explains why this is done.
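To make the ddof distinction concrete, here is a minimal sketch comparing the two settings on a small made-up array (the values are illustrative only and are not taken from the mission):

```python
import numpy as np

# Made-up values for illustration; their mean is 5
data = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

n = len(data)
squared_dev = np.sum((data - data.mean()) ** 2)  # sum of squared deviations = 32

pop_std = np.std(data, ddof=0)     # divides by n (population)
sample_std = np.std(data, ddof=1)  # divides by n - 1 (Bessel's correction)

# Both match the formulas written out by hand
assert np.isclose(pop_std, np.sqrt(squared_dev / n))
assert np.isclose(sample_std, np.sqrt(squared_dev / (n - 1)))

print(pop_std)     # 2.0
print(sample_std)  # about 2.138, slightly larger than the population value
```

The corrected value is always a bit larger because the divisor is smaller, and the gap shrinks as the array gets longer.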

To find the standard deviation of a population we do not need Bessel’s correction, so your function will work (assuming we import numpy as np first). However, the second bullet point of the instructions asks us to make our function flexible enough to work for populations OR samples. To calculate the standard deviation for a sample we DO need Bessel’s correction, and since a function has no way to “detect” whether the array is a sample or a population, the instructions are implicitly asking for a third argument: is the array a sample or a population? The DQ answer used a Boolean value, making the input optional with a default value, but it’s your function, so feel free to use an argument that makes sense to you.
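One way this could look, as a rough sketch rather than the official DQ solution: the parameter name `sample` and its default of False are my own assumptions, and the test values are made up.

```python
import numpy as np

def zscore(val, arr, sample=False):
    """Return the z-score of val relative to arr.

    `sample` is a hypothetical flag name: pass True when arr is a sample,
    so that Bessel's correction (ddof=1) is applied to the standard deviation.
    """
    ddof = 1 if sample else 0
    return (val - np.mean(arr)) / np.std(arr, ddof=ddof)

# Made-up values for illustration; mean is 5, population std is 2
values = [2, 4, 4, 4, 5, 5, 7, 9]

print(zscore(9, values))               # population: (9 - 5) / 2 = 2.0
print(zscore(9, values, sample=True))  # sample: about 1.871
```

Because the flag defaults to False, a two-argument call like the one in the original question still behaves as before.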

I found the instructions for this step a little misleading, since they seem to ask for a function that takes in two arguments, but then later add criteria that necessitate a third argument. I’m commenting here because it seems this was unclear to the OP as well, and hopefully this can help out someone else down the line.

(Side note: I get a little frustrated by the way DQ frequently presents these practice problems in an almost intentionally garbled manner, or asks us to use some technique or method that we have never seen before. But the justification I have read here in the comments is that in the real world most projects are presented to data scientists in an unclear way and/or require a method or skill they aren’t familiar with, so we need to learn how to operate under those conditions. What do you think?)

-edited for clarity

They are not misleading. They have provided us with information that we need to use to write our code.
