# Significance Testing 106-8 questions

My Code:

```python
import numpy as np

frequencies = []
for key in sampling_distribution:
    if key >= 2.52:
        frequencies.append(key)

p_value = np.sum(frequencies) / 1000
```

Is there any difference from the solution?


```python
frequencies = []
for sp in sampling_distribution.keys():
    if sp >= 2.52:
        frequencies.append(sampling_distribution[sp])
p_value = np.sum(frequencies) / 1000
```

Why use `sampling_distribution.keys()` when we know that iterating over a dictionary already iterates over its keys? Is it just to be extra clear, or am I missing something?

Also, we are trying to find the number of times a value of `2.52` or higher appeared in our simulations, `2.52` being the observed difference in mean weight between the two groups in our test. So, when we check whether a key is greater than or equal to `2.52`, shouldn't we also count keys that are `<= -2.52`? Something like this:

```python
frequencies = []
for sp in sampling_distribution.keys():
    if sp >= 2.52 or sp <= -2.52:
        frequencies.append(sampling_distribution[sp])
p_value = np.sum(frequencies) / 1000
```
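For what it's worth, here is a runnable sketch of the two checks side by side. The toy `sampling_distribution` below is made up just for illustration (the real one comes from the 1000 permutation simulations in the mission):

```python
import numpy as np

# Hypothetical toy distribution: mean difference -> how often it occurred
sampling_distribution = {2.52: 3, 3.1: 1, -2.6: 2, 0.4: 500, -0.3: 494}
n_simulations = 1000

# One-tailed: only differences >= 2.52
one_tailed = [freq for diff, freq in sampling_distribution.items()
              if diff >= 2.52]
p_one = np.sum(one_tailed) / n_simulations

# Two-tailed: differences at least as extreme in either direction
two_tailed = [freq for diff, freq in sampling_distribution.items()
              if diff >= 2.52 or diff <= -2.52]
p_two = np.sum(two_tailed) / n_simulations

print(p_one, p_two)  # 0.004 0.006
```

The two-tailed version always gives a p-value at least as large as the one-tailed one, since it counts extreme differences in both directions.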

Thanks.

Hello @probot

I don’t know if I’m missing something, but your code appends the keys themselves, not their frequencies (the dictionary's values), so the sum at the end isn't a count of simulations.

To calculate the `p_value` we need to sum all frequencies whose keys match the hypothesis. In this case that's `sp >= 2.52`, where `2.52` is the mean difference we found in the previous mission.

Think of `p_value` as a fraction of the distribution that matches the hypothesis.
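Here's a small sketch of why appending keys versus values matters. The toy dictionary is hypothetical, just to make the two sums comparable:

```python
import numpy as np

# Hypothetical toy distribution: mean difference -> how often it occurred
sampling_distribution = {2.6: 4, 3.0: 2, 0.1: 994}

# Appending the keys (the original code) sums the differences themselves
keys_appended = [diff for diff in sampling_distribution if diff >= 2.52]
print(np.sum(keys_appended) / 1000)   # 0.0056 -- sum of differences, not a p-value

# Appending the values (the solution) sums the frequencies
freqs_appended = [sampling_distribution[diff] for diff in sampling_distribution
                  if diff >= 2.52]
print(np.sum(freqs_appended) / 1000)  # 0.006 -- fraction of 1000 simulations
```

Only the second number is a fraction of the simulations, which is what a p-value is.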

As for `for key in dict_name.keys()` versus `for key in dict_name`: in Python 3 the two are equivalent. Both iterate over the dictionary's keys, and the performance is essentially identical, so the choice doesn't depend on the size of the dictionary.

The `.keys()` form mainly makes the intent explicit, and the view object it returns is also handy when you need set-like operations on the keys. So the solution's version is a matter of readability, not speed.
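A quick sketch (toy dictionary, nothing from the mission) confirming that both loop forms visit the same keys in the same order:

```python
# Iterating a dict directly vs. via its keys() view
d = {"a": 1, "b": 2, "c": 3}

plain = [k for k in d]            # iterates over the keys
explicit = [k for k in d.keys()]  # same keys, same insertion order

print(plain == explicit)  # True
```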

And I think @kakoori's reply answers the other part of your question.
