# What does the chi2_contingency function tell us?

I am confused about what the `chi2_contingency` function tells us.

In the example code from ‘Multi category chi-squared tests’ screen 6, we see:

```python
import numpy as np
from scipy.stats import chi2_contingency
observed = np.array([[5, 5], [10, 10]])

chisq_value, pvalue, df, expected = chi2_contingency(observed)
```

Question 1
How can a chi-squared value be calculated when we only pass in one table? A chi-squared statistic needs both observed and expected values.

Question 2
How does the function calculate the expected values? It has no idea what they should be. Who is to say that, for the example code, the expected values shouldn’t be `[[200, 1000], [pi, 666]]`?

Any help would be much appreciated.

According to the documentation:

The expected frequencies are computed based on the marginal sums under the assumption of independence; see `scipy.stats.contingency.expected_freq`.
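To make that concrete, here is a minimal sketch that rebuilds the expected table by hand from the row and column totals of the one observed table and compares it with what SciPy returns. The names `row_totals`, `col_totals`, `grand_total`, and `expected_manual` are my own, not something from the course or the SciPy docs.

```python
import numpy as np
from scipy.stats import chi2_contingency
from scipy.stats.contingency import expected_freq

observed = np.array([[5, 5], [10, 10]])

# Marginal sums: the totals of each row and each column.
row_totals = observed.sum(axis=1)   # [10, 20]
col_totals = observed.sum(axis=0)   # [15, 15]
grand_total = observed.sum()        # 30

# Under independence:
# expected[i, j] = row_totals[i] * col_totals[j] / grand_total
expected_manual = np.outer(row_totals, col_totals) / grand_total

# Compare with SciPy's helper and with chi2_contingency itself.
expected_scipy = expected_freq(observed)
chisq_value, pvalue, dof, expected = chi2_contingency(observed)

print(expected_manual)
print(np.allclose(expected_manual, expected_scipy))
print(np.allclose(expected_manual, expected))
print(chisq_value, pvalue, dof)
```

So the function never needs a second table: the expected counts are derived from the observed one. In this example the hand-built table matches the observed counts exactly, which is why the statistic comes out as 0.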

Hi, I have the same question too. I have read the documentation but I don’t understand the meaning of ‘computed based on the marginal sums under the assumption of independence’. Would you please explain a little bit more, thanks!
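One way to unpack that sentence, as my own sketch rather than the official wording: the "marginal sums" are the row totals and column totals of the table. "Independence" means that knowing the row tells you nothing about the column, so the probability of landing in a cell is the row's share of the data times the column's share. The notation below is introduced here for illustration: $O_{ij}$ is an observed count, $E_{ij}$ an expected count, $R_i$ a row total, $C_j$ a column total, and $N$ the grand total.

$$
E_{ij} = N \cdot \frac{R_i}{N} \cdot \frac{C_j}{N} = \frac{R_i \, C_j}{N},
\qquad
\chi^2 = \sum_{i,j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}
$$

For the table `[[5, 5], [10, 10]]`, the row totals are 10 and 20, the column totals are 15 and 15, and $N = 30$, so for instance $E_{11} = 10 \cdot 15 / 30 = 5$ and $E_{21} = 20 \cdot 15 / 30 = 10$. That is why arbitrary values like `[[200, 1000], [pi, 666]]` are ruled out: the expected table must have the same row and column totals as the observed one. Note that for a 2×2 table `chi2_contingency` also applies Yates' continuity correction by default, so its statistic can differ slightly from the plain formula above.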

A lot of the time we want to help, but without the exact link to the course it is sometimes impossible! Please follow the Dataquest guidelines.

@lhc1412: Here are the guidelines for your reference. Do attach a mission link and format your code accordingly so we can better understand your issue and better assist. Thanks!