Screen Link: https://app.dataquest.io/m/283/sampling/5/simple-random-sampling
The instructions say to take 100 samples and, for each of the 100, sample 10 values. Does that mean 1000 samples?
* Using simple random sampling, take 100 samples from our WNBA data set, and for each sample measure the average points scored by a player during the 2016-2017 season. For *each* of the 100 iterations of a `for` loop:
    * Sample 10 values from the `PTS` column.
    * Compute the mean of this sample made of 10 values from the `PTS` column, and append the result to a list.
* To make your results reproducible, vary the `random_state` parameter of the `sample()` method with values between 0 and 99. For the first iteration of the `for` loop, `random_state` should equal 0, for the second iteration should equal 1, for the third should equal 2, and so on.
So 100 samples here means something like 100 lists, each made up of 10 elements.
For example, sample1 = (s1_1, s1_2, s1_3, …, s1_9, s1_10),
sample2 = (s2_1, s2_2, …, s2_9, s2_10), and so on.
Does that mean there are 1000 data points in total? From what I can see here, there are not that many.
It’s graphing the 100 means we got from the sampling, not the individual samples.
Think of it this way: say we have a big bag of marbles, each with a number on it, and we’re picking out 10 and finding the mean. Then we’re putting the marbles back in and reaching in a 2nd time to get 10 marbles. We repeat that 100 times. We might pick up the same marbles in subsequent samplings, so it’s not really 1000 distinct data points, but 100 different configurations of 10 marbles picked at a time.
In the end, we’re going to have a list of 100 means from our sampling. We’re graphing the means we got from each of those samples of 10. If you have a sample that happens to have larger numbers, the mean for that sample will be larger. The graph is showing how much the sample means can vary around the population mean when the sample size is small.
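Here is a rough sketch of that idea in code, in case it helps. I don’t have the WNBA file at hand, so `wnba` below is a made-up stand-in with random numbers in a `PTS` column (the 150 rows and the value range are arbitrary assumptions, not the real data). It only shows that the loop produces 100 means, and that far fewer than 1000 distinct rows get drawn.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the WNBA data set: 150 made-up rows, NOT the real data.
rng = np.random.default_rng(0)
wnba = pd.DataFrame({"PTS": rng.integers(0, 600, size=150)})

sample_means = []    # one mean per sample of 10
rows_drawn = set()   # row labels that showed up in at least one sample

for i in range(100):
    # 10 values from PTS; random_state goes 0, 1, ..., 99 for reproducibility
    sample = wnba["PTS"].sample(10, random_state=i)
    sample_means.append(sample.mean())
    rows_drawn.update(sample.index)

print(len(sample_means))   # number of means collected (one per iteration)
print(len(rows_drawn))     # distinct rows drawn; can never exceed len(wnba)
print(wnba["PTS"].mean(), min(sample_means), max(sample_means))
```

The last line prints the population mean next to the smallest and largest sample mean, which is the spread the graph is showing.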
I see, that explains it. Thanks so much @april.g
I was looking out of curiosity to see what people have been asking and I have stumbled upon your beautiful graph.
How did you insert the min and max values like that (arrows and text included)?
EDIT: I just pressed next screen and saw the graphs on the next page. Still, do you know how to do it? I tried with
`plt.annotate(max(sample_means), (range(1,101), sample_means))` but it doesn’t return anything.
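One possible way to get the arrows and labels, as a sketch rather than the exact code behind the course graph: `annotate` takes the label text first, and `xy` has to be a single (x, y) point, not the whole range of x values and the whole list of means. So you first look up where the max and min occur. The `sample_means` list below is made-up data standing in for the real 100 means, and the text offsets are arbitrary choices.

```python
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical stand-in for the 100 sample means from the exercise.
rng = np.random.default_rng(1)
sample_means = list(rng.normal(200, 40, size=100))
x = list(range(1, 101))

fig, ax = plt.subplots()
ax.scatter(x, sample_means)

# Positions of the largest and smallest mean in the list.
i_max = int(np.argmax(sample_means))
i_min = int(np.argmin(sample_means))

max_note = ax.annotate(
    f"max: {sample_means[i_max]:.1f}",      # label text
    xy=(x[i_max], sample_means[i_max]),     # the point the arrow points at
    xytext=(30, -30),                       # text position, offset from xy
    textcoords="offset points",
    arrowprops=dict(arrowstyle="->"),
)
min_note = ax.annotate(
    f"min: {sample_means[i_min]:.1f}",
    xy=(x[i_min], sample_means[i_min]),
    xytext=(30, 30),
    textcoords="offset points",
    arrowprops=dict(arrowstyle="->"),
)
```

In a notebook you would follow this with `plt.show()` to display the figure.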