Chi-squared tests lesson 4 timeout

Screen Link: https://app.dataquest.io/m/99/chi-squared-tests/4/generating-a-distribution

Your Code:

import numpy as np
chi_squared_values = []
n = 32561
expected = n / 2
for i in range(1000):
    rands = np.random.random(n,)
    male_count = sum(rands <= 0.5)
    female_count = sum(rands > 0.5)
    male_diff = pow(male_count - expected, 2) / expected
    female_diff = pow(female_count - expected, 2) / expected
    chi_squared_values.append(male_diff + female_diff)
plt.hist(chi_squared_values, bins=21)
plt.show()

What I expected to happen: I expected the program to run to completion

What actually happened: After about 2 minutes the connection broke and I received a timeout

Other details: It worked with a loop length of 100, but not with 1000
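
One possible contributor to the slowdown (an assumption, not something confirmed in this thread) is that Python's built-in `sum` walks a NumPy boolean array one element at a time, whereas NumPy's own reductions run in compiled code. A minimal sketch of the same simulation using `np.count_nonzero`; the helper name `chi_squared_sample` and the seed are illustrative only:

```python
import numpy as np

def chi_squared_sample(n=32561, rng=None):
    """Return one chi-squared value for a simulated 50/50 split of n rows."""
    rng = np.random.default_rng() if rng is None else rng
    expected = n / 2
    rands = rng.random(n)
    # count_nonzero is a NumPy reduction; built-in sum() would loop in Python
    male_count = np.count_nonzero(rands <= 0.5)
    female_count = n - male_count
    male_diff = (male_count - expected) ** 2 / expected
    female_diff = (female_count - expected) ** 2 / expected
    return male_diff + female_diff

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily for repeatability
chi_squared_values = [chi_squared_sample(rng=rng) for _ in range(1000)]
```

The resulting list can be passed to `plt.hist` exactly as in the code above.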


Hi @bennyp1,

I just checked it. It seems that any code other than an exact copy of the Dataquest solution is timing out. I will get this issue logged.

Best,
Sahil


Hi everyone!

This problem still seems to exist (in a different form). I don't get a timeout, but my histogram isn't plotted, so I can't pass the challenge. The copy/pasted solution code plots the histogram just fine :-/

Hi @tchintchie,

Whenever our system finds your answer to be exactly the same as the solution code, instead of checking the answer it automatically marks the screen as complete. This feature is implemented as a workaround for cases like this. Here is a gif illustrating the same.

I just tested a slightly modified solution code to bypass the above feature and it seems to be working correctly. Can you please send me the code you have used?

Thanks
Sahil


@bennyp1 and @tchintchie. I just completed the mission with a modified version of the solution and it plots my histogram as well.

chi_squared_values = []
from numpy.random import random
import matplotlib.pyplot as plt
expected_females = 16280.5
expected_males = 16280.5

for i in range(1000):
    random_values = random((32561,))
    random_values[random_values <0.5] = 0
    random_values[random_values>=0.5] = 1
    male_count = len(random_values[random_values == 0])
    female_count = len(random_values[random_values == 1])
    female_diff = (female_count - expected_females) ** 2 / (expected_females)
    male_diff = (male_count - expected_males) ** 2 / (expected_males)
    chi_squared = male_diff + female_diff
    chi_squared_values.append(chi_squared)

plt.hist(chi_squared_values)

I coded it in a slightly different way and was able to plot the graph and pass the mission without any timeout issue. But after checking the DQ solution and the solutions posted by others here, I wonder whether my code is less efficient, since I am using an additional for loop and an if-else condition. Any feedback is appreciated.

My Code

import numpy as np
import matplotlib.pyplot as plt
chi_squared_values = []
excepted_val = 32561 / 2

for i in range(1000):
    counter = {'male' : 0, 'female' : 0}
    for val in np.random.random(32561):
        if val < 0.5 :
            counter['male'] += 1
        else:
            counter['female'] += 1
  
    male_diff = (counter['male'] - excepted_val)**2 / excepted_val
    female_diff = (counter['female'] - excepted_val)**2 / excepted_val
    chi_squared_values.append(male_diff + female_diff)
        
plt.hist(chi_squared_values)
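
On the efficiency question: the inner Python loop touches each of the 32561 values individually on every one of the 1000 trials, which is generally much slower than letting NumPy do the counting. As a rough sketch of one alternative (a different technique from the code above, not the Dataquest solution), the count of values below 0.5 in each trial can be drawn directly from a binomial distribution, which removes both loops. The variable names and the seed here are illustrative:

```python
import numpy as np

n = 32561
expected_val = n / 2
rng = np.random.default_rng(1)  # fixed seed, chosen arbitrarily for repeatability

# The number of uniform draws below 0.5 out of n follows Binomial(n, 0.5),
# so one binomial draw per trial replaces the whole inner loop.
male_counts = rng.binomial(n, 0.5, size=1000)
female_counts = n - male_counts

# Same chi-squared formula as before, applied to all 1000 trials at once
male_diffs = (male_counts - expected_val) ** 2 / expected_val
female_diffs = (female_counts - expected_val) ** 2 / expected_val
chi_squared_values = male_diffs + female_diffs
```

`chi_squared_values` is then a NumPy array of 1000 values that `plt.hist` accepts directly.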