Screen Link:
https://app.dataquest.io/m/106/significance-testing/8/p-value
My Code:
frequencies = []
for key in sampling_distribution:
    if key >= 2.52:
        frequencies.append(key)
p_value = np.sum(frequencies) / 1000
Is there any difference between my code and the solution?
frequencies = []
for sp in sampling_distribution.keys():
    if sp >= 2.52:
        frequencies.append(sampling_distribution[sp])
p_value = np.sum(frequencies) / 1000
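To compare the two versions myself, I put together a small check. The dictionary below is made up (it is not the real sampling_distribution from the screen), and I use the built-in sum instead of np.sum so the snippet runs without NumPy. It suggests the two loops are not equivalent: mine adds up the keys (the differences themselves), while the solution adds up the values (how many simulations produced each difference).

```python
# Hypothetical toy data: mean difference -> number of simulations that produced it
toy_distribution = {-3.0: 10, 0.5: 900, 2.52: 50, 3.0: 40}
total = sum(toy_distribution.values())  # 1000 simulations in this toy example

# My version: appends the key, i.e. the difference itself
my_frequencies = []
for key in toy_distribution:
    if key >= 2.52:
        my_frequencies.append(key)
my_p_value = sum(my_frequencies) / total  # (2.52 + 3.0) / 1000

# Solution's version: appends the count stored under that key
solution_frequencies = []
for sp in toy_distribution.keys():
    if sp >= 2.52:
        solution_frequencies.append(toy_distribution[sp])
solution_p_value = sum(solution_frequencies) / total  # (50 + 40) / 1000

print(my_p_value, solution_p_value)
```

If that reading is right, only the second number is a proportion of simulations, so my version probably does not compute a p-value at all.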
Why use sampling_distribution.keys() when iterating over a dictionary already iterates over its keys? Is it just to be extra clear, or am I missing something?
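As far as I can tell, the two forms of iteration visit the same keys in the same order. A quick check on a made-up dictionary (again, not the real sampling_distribution):

```python
# Hypothetical toy data standing in for sampling_distribution
toy_distribution = {-3.0: 10, 0.5: 900, 2.52: 50, 3.0: 40}

# Iterating over the dict directly yields its keys
keys_implicit = [k for k in toy_distribution]

# Iterating over .keys() yields the same keys
keys_explicit = [k for k in toy_distribution.keys()]

same_keys = keys_implicit == keys_explicit
print(same_keys)
```

So my guess is that .keys() is there only for readability.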
Also, we are trying to find the number of times a value of 2.52 or higher appeared in our simulations, where 2.52 is the observed difference in weight between the two groups in our test. So, when we check whether a key is greater than or equal to 2.52, should we also count keys that are <= -2.52? Something like this:
frequencies = []
for sp in sampling_distribution.keys():
    if sp >= 2.52 or sp <= -2.52:
        frequencies.append(sampling_distribution[sp])
p_value = np.sum(frequencies) / 1000
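To make the question concrete, here is a rough sketch of the two options side by side. The helper name p_value_from_distribution, its two_sided flag, and the toy dictionary are all my own inventions for illustration and are not part of the lesson. I am assuming the one-sided version is what the solution computes and the two-sided version is what I am asking about.

```python
def p_value_from_distribution(distribution, observed, two_sided=False):
    """Share of simulated differences at least as extreme as `observed`.

    `distribution` maps each simulated difference to how often it occurred.
    """
    total = sum(distribution.values())
    if two_sided:
        # Count differences at least as far from zero, in either direction
        extreme = sum(count for diff, count in distribution.items()
                      if abs(diff) >= abs(observed))
    else:
        # Count only differences at or above the observed value
        extreme = sum(count for diff, count in distribution.items()
                      if diff >= observed)
    return extreme / total


# Hypothetical toy data: mean difference -> number of simulations
toy_distribution = {-3.0: 10, 0.5: 900, 2.52: 50, 3.0: 40}

one_sided = p_value_from_distribution(toy_distribution, 2.52)
two_sided = p_value_from_distribution(toy_distribution, 2.52, two_sided=True)
print(one_sided, two_sided)
```

With this toy data the two-sided count also picks up the 10 simulations at -3.0, so it comes out larger than the one-sided one. Which of the two is appropriate presumably depends on whether the hypothesis is directional.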
Thanks.