hi @alegiraldo666

I hope I don’t confuse you more than you already are.

Let’s first make our assumptions:

H_0 : Words are distributed homogeneously across high-value and low-value questions

H_a : Words are not distributed homogeneously across high-value and low-value questions

I don’t know whether you are fully comfortable with alpha (the level of significance) and the p-value, so I will start from there.

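The p-value is the probability, assuming H_0 is true, of getting a test statistic at least as extreme as the one we observed; on the probability distribution it is the area in the tail beyond our test statistic. It is the **smallest level of significance** at which we can reject the null hypothesis.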

- if \alpha >= p-value : we can reject H_0
- if \alpha < p-value : we cannot reject H_0
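As a small illustration, the rule above can be written as a tiny Python function. The name `decide` and the example p-values are made up for this sketch; only the comparison itself comes from the rule.

```python
def decide(p_value, alpha=0.05):
    """Apply the rule above: reject H_0 only when alpha >= p-value."""
    if alpha >= p_value:
        return "reject H_0"
    return "fail to reject H_0"


# Hypothetical p-values, just to show both branches.
print(decide(0.03))        # reject H_0
print(decide(0.20))        # fail to reject H_0
print(decide(0.20, 0.25))  # reject H_0
```

Note that the data never changes here; only the alpha we are willing to accept changes the decision.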

(This is best understood graphically; you can either check images on Google or let me know if you want a clumsy hand-drawn one.)

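In this case we have X^2 = 0.4 and a p-value = 0.53. (Strictly speaking, that pair corresponds to 1 degree of freedom; with degrees of freedom = 9, a statistic of 0.4 would give a p-value very close to 1. The conclusion is the same either way.)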

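That means that if H_0 were true, we would still see a test statistic at least this large about 53% of the time, so the rejection region would have to cover more than half of the probability distribution. In order to reject H_0 we would need an alpha value of at least 0.53, far above the usual 0.05.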
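If you want to check the 0.53 yourself, here is a minimal sketch using only the standard library. It assumes 1 degree of freedom, which is the case where a statistic of 0.4 maps to a p-value of about 0.53; the function name `chi2_p_value_df1` is my own. With SciPy installed, `scipy.stats.chi2.sf(statistic, df)` would be the more usual route and works for any df.

```python
import math


def chi2_p_value_df1(statistic):
    """Upper-tail p-value of a chi-squared statistic, assuming df = 1."""
    # With df = 1, X^2 is the square of a standard normal Z, so
    # P(X^2 >= x) = P(|Z| >= sqrt(x)) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(statistic / 2))


p_value = chi2_p_value_df1(0.4)
print(round(p_value, 2))  # 0.53
print(0.05 >= p_value)    # False, so we cannot reject H_0 at alpha = 0.05
```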

Now let’s check the chi-squared probability table for \alpha = 0.05 and df = 9, which gives us a critical value of X^2 = 16.92 (for df = 1 the critical value would be 3.84). The test statistic we got is 0.40.

Comparing 16.92 with 0.4, we can say that our test statistic is way too low. We would need a chi-squared value of more than 16.92 in order to reject H_0.
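To see where the table value comes from, here is a rough standard-library sketch that recovers the critical value numerically. The helper names `chi2_sf` and `critical_value`, the series approximation, and the bisection bounds are all my own choices for illustration, not something from the project; a statistics library would normally do this for you.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X^2 >= x) for a chi-squared distribution."""
    if x <= 0:
        return 1.0
    a, z = df / 2, x / 2
    # Series for the regularized lower incomplete gamma function P(a, z).
    term = 1 / a
    total = term
    n = 1
    while term > total * 1e-16:
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(a * math.log(z) - z - math.lgamma(a))
    return max(0.0, 1 - lower)


def critical_value(alpha, df, upper=200.0):
    """Find x with P(X^2 >= x) = alpha by bisection (assumes x < upper)."""
    lo, hi = 0.0, upper
    for _ in range(100):
        mid = (lo + hi) / 2
        if chi2_sf(mid, df) > alpha:
            lo = mid  # tail area still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2


statistic = 0.4
for df in (1, 9):
    crit = critical_value(0.05, df)
    print(df, round(crit, 2), statistic > crit)
# 1 3.84 False
# 9 16.92 False
```

Either way, 0.4 is nowhere near the rejection region, which matches what the p-value told us.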

If you look at the table, we don’t even have a column where the upper-tail p-value is 0.5. Rejecting H_0 with a p-value this high would mean accepting roughly a 50% chance of being wrong, that is, of rejecting H_0 when it is actually true.

So a player can’t base their chances of winning Jeopardy on the assumption that all they need to focus on is the set of words used in high-value questions: we have no evidence that those words show up differently in high-value and low-value questions. (Help me correct myself here if I have made a mistake.)

Chi-squared table in image is here

Let me know if this didn’t help at all.