np.random.rand() gives different values than np.random.choice(). Why is that?

Screen Link:

My Code:
```python
import numpy as np
import matplotlib.pyplot as plt

chi_squared_values = []

for n in range(1000):
    # Draw 32561 values, each either 0 or 1
    random_numbers = np.random.choice([0, 1], size=32561, replace=True)
    binary_counts = np.bincount(random_numbers)
    male_count = binary_counts[0]
    female_count = binary_counts[1]
    # Expected count per category is 32561 / 2 = 16280.5
    male_diff = ((male_count - 16280.5) ** 2) / 16280.5
    female_diff = ((female_count - 16280.5) ** 2) / 16280.5
    chi_squared = male_diff + female_diff
    chi_squared_values.append(chi_squared)

plt.hist(chi_squared_values)
```
What I expected to happen:
The correct chi_squared_values

What actually happened:
Different values

I (sloppily) recreated the above code using np.random.rand() and I got the correct values, but I don’t understand what happens under the hood to give a different output. Correct code:
```python
import numpy as np
import matplotlib.pyplot as plt

chi_squared_values = []

for n in range(1000):
    # Draw 32561 floats, then convert each one to 0 or 1
    random_numbers = np.random.rand(32561)
    binary_numbers = []
    for number in random_numbers:
        if number >= 0.5:
            number = 1
            binary_numbers.append(number)
        else:
            number = 0
            binary_numbers.append(number)

    male_count = binary_numbers.count(0)
    female_count = binary_numbers.count(1)

    # Expected count per category is 32561 / 2 = 16280.5
    male_diff = ((male_count - 16280.5) ** 2) / 16280.5
    female_diff = ((female_count - 16280.5) ** 2) / 16280.5

    chi_squared = male_diff + female_diff
    chi_squared_values.append(chi_squared)

plt.hist(chi_squared_values)
```

numpy.random.choice takes a 1-D array and draws random samples from its elements. In your example, you passed [0, 1], so every value it returns is exactly 0 or 1.
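
A minimal sketch of that behaviour, using the same arguments as the question. The fixed seed and the `minlength=2` argument are assumptions added here so the run is repeatable and the count array always has two entries; neither appears in the original code.

```python
import numpy as np

np.random.seed(0)  # assumption: fixed seed, only so the run is repeatable

# Each draw is one of the supplied elements, so only 0 and 1 can appear.
draws = np.random.choice([0, 1], size=32561, replace=True)

# bincount tallies how often each non-negative integer occurs;
# minlength=2 guarantees a slot for both 0 and 1.
counts = np.bincount(draws, minlength=2)

print(np.unique(draws))      # the distinct values that were drawn
print(counts, counts.sum())  # per-value counts and their total
```

The two counts always add up to 32561, which is why the question's code can read them directly as `binary_counts[0]` and `binary_counts[1]`.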

numpy.random.rand generates random floating-point numbers drawn uniformly from the half-open interval [0, 1): 0 is a possible value, 1 is not.
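
As a rough illustration, the floats from `np.random.rand` have to be thresholded before they can be counted as two categories. The sketch below does that with a vectorized comparison instead of a Python loop; the seed and the variable names `floats`, `binary`, and `expected` are assumptions made for this example.

```python
import numpy as np

np.random.seed(0)  # assumption: fixed seed, only so the run is repeatable

# 32561 floats, each uniformly distributed on [0, 1)
floats = np.random.rand(32561)

# Threshold at 0.5: values >= 0.5 become 1, the rest become 0
binary = (floats >= 0.5).astype(int)

female_count = int(binary.sum())          # number of 1s
male_count = binary.size - female_count   # number of 0s

# Same chi-squared statistic as in the question
expected = 32561 / 2
chi_squared = ((male_count - expected) ** 2
               + (female_count - expected) ** 2) / expected
print(male_count, female_count, chi_squared)
```

Because half of the interval [0, 1) lies at or above 0.5, each thresholded value is 0 or 1 with equal probability, just like a draw from `np.random.choice([0, 1])`. Individual runs still differ because the random draws differ.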

That makes sense, thank you!