The Mean : 8. Estimating the Population Mean

Screen Link:
https://app.dataquest.io/m/305/the-mean/8/estimating-the-population-mean

My Code:

mean_population = houses['SalePrice'].mean()
size = 5

for r in range(101):
   sample_price = houses['SalePrice'].sample(size, random_state=r)
   mean_sample = sample_price.mean()
   sampling_error = mean_population - mean_sample

   import matplotlib.pyplot as plt
   plt.scatter(len(sample_price), sampling_error)
   plt.axhline(0)
   plt.axvline(2930)
   plt.xlabel('Sample size')
   plt.ylabel('Sampling error')
   size += 29    

What I expected to happen:
My code generates the scatter plot, and it looks quite similar to the expected result when the two are compared.

What actually happened:
According to DQ, my plot does not match the expected result.


One difference I noticed between DQ's code and mine is that DQ plots outside the for loop.
I still cannot work out what the real issue is. Please share your feedback.


Hi @sreekanthac,

When you run the plotting code inside a loop, each iteration adds its plot elements to the figure as a new layer. The output may look the same, but to the answer checker the plot differs from the expected one.
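
As a rough sketch of the usual fix, the values can be collected in lists inside the loop and plotted once afterwards. The `houses` data below is a synthetic stand-in (an assumption, so that the snippet runs on its own), and the names `sample_sizes` and `sampling_errors` are only illustrative:

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for the course's `houses` DataFrame (2930 rows).
rng = np.random.default_rng(0)
houses = pd.DataFrame({"SalePrice": rng.integers(50_000, 500_000, size=2930)})

mean_population = houses["SalePrice"].mean()

sample_sizes = []
sampling_errors = []
size = 5
for r in range(101):
    sample_price = houses["SalePrice"].sample(size, random_state=r)
    sample_sizes.append(size)
    sampling_errors.append(mean_population - sample_price.mean())
    size += 29

# One scatter call, outside the loop, so all points end up in a single layer.
plt.scatter(sample_sizes, sampling_errors)
plt.axhline(0)
plt.axvline(2930)
plt.xlabel("Sample size")
plt.ylabel("Sampling error")
```

With the real `houses` data in place of the stand-in, the loop body stays the same; only the plotting moves out of it.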

In newer versions of matplotlib, each new layer is drawn in the next colour of the colour cycle, so points added by separate calls end up in different colours. Here is an example of what is happening:

plt.scatter(1, 2)
plt.scatter(2, 3)
plt.scatter(3, 4)

image

plt.scatter([1, 2, 3], [2, 3, 4])

image
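
To see the difference without relying on the colours, one option is to count the collections each approach leaves on its axes. This is only an illustration of the layering; how the answer checker actually compares plots is not shown here, and the names `ax_a` and `ax_b` are made up for the example:

```python
import matplotlib.pyplot as plt

# Three separate calls: each one adds its own collection to the axes.
fig_a, ax_a = plt.subplots()
ax_a.scatter(1, 2)
ax_a.scatter(2, 3)
ax_a.scatter(3, 4)

# One call with all the points: a single collection.
fig_b, ax_b = plt.subplots()
ax_b.scatter([1, 2, 3], [2, 3, 4])

print(len(ax_a.collections), len(ax_b.collections))  # prints: 3 1
```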

Best,
Sahil


Thank you for the explanation.
