Hi Marina,

I don’t have the code the content author used to generate those plots. However, here is some code that closely reproduces them:

**Stratified Sampling (Games Played), Sample = 10:**

```
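import pandas as pd
import matplotlib.pyplot as plt

# Assumes `wnba` is the WNBA DataFrame from the lesson, already loaded.
# Split the players into three strata by number of games played.
under_12 = wnba[wnba['Games Played'] <= 12]
btw_13_22 = wnba[(wnba['Games Played'] > 12) & (wnba['Games Played'] <= 22)]
over_23 = wnba[wnba['Games Played'] > 22]

proportional_sampling_means = []
for i in range(100):
    # Draw 1 + 2 + 7 = 10 players; random_state=i makes each sample reproducible.
    sample_under_12 = under_12['PTS'].sample(1, random_state=i)
    sample_btw_13_22 = btw_13_22['PTS'].sample(2, random_state=i)
    sample_over_23 = over_23['PTS'].sample(7, random_state=i)
    final_sample = pd.concat([sample_under_12, sample_btw_13_22, sample_over_23])
    proportional_sampling_means.append(final_sample.mean())

# Plot the 100 sample means against the population mean (horizontal line).
plt.scatter(range(1, 101), proportional_sampling_means)
plt.axhline(wnba['PTS'].mean())
plt.axis([-5, 105, 100, 350])
plt.show()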
```
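The per-stratum counts above (1, 2 and 7) are presumably meant to mirror each stratum's share of the dataset. As a rough sketch of how such counts could be derived, the hypothetical helper `proportional_allocation` below splits a sample size across strata using largest-remainder rounding. The stratum sizes are made up for illustration and are not the real WNBA counts.

```python
def proportional_allocation(strata_sizes, sample_size):
    """Split sample_size across strata in proportion to their sizes."""
    total = sum(strata_sizes.values())
    # Exact (fractional) share of the sample for each stratum.
    exact = {name: sample_size * n / total for name, n in strata_sizes.items()}
    # Start from the rounded-down shares.
    counts = {name: int(share) for name, share in exact.items()}
    leftover = sample_size - sum(counts.values())
    # Give the remaining slots to the strata with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda name: exact[name] - counts[name], reverse=True)
    for name in by_remainder[:leftover]:
        counts[name] += 1
    return counts


# Made-up stratum sizes, for illustration only.
strata_sizes = {'under_12': 14, 'btw_13_22': 29, 'over_23': 100}
allocation = proportional_allocation(strata_sizes, 10)
print(allocation)
```

With the real data you could pass `len(under_12)`, `len(btw_13_22)` and `len(over_23)` as the sizes instead.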

**Simple Random Sampling, Sample = 10:**

```
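sampling_means = []
for i in range(100):
    # Draw 10 players at random from the whole dataset.
    final_sample = wnba['PTS'].sample(10, random_state=i)
    sampling_means.append(final_sample.mean())

# Plot the 100 sample means against the population mean (horizontal line).
plt.scatter(range(1, 101), sampling_means)
plt.axhline(wnba['PTS'].mean())
plt.show()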
```

**Stratified Sampling (Minutes Played), Sample = 12:**

```
# Split the players into three strata by minutes played.
under_350 = wnba[wnba['MIN'] <= 350]
btw_350_700 = wnba[(wnba['MIN'] > 350) & (wnba['MIN'] <= 700)]
over_700 = wnba[wnba['MIN'] > 700]

proportional_sampling_means = []
for i in range(100):
    # Draw 4 players from each stratum, 12 in total.
    sample_under_350 = under_350['PTS'].sample(4, random_state=i)
    sample_btw_350_700 = btw_350_700['PTS'].sample(4, random_state=i)
    sample_over_700 = over_700['PTS'].sample(4, random_state=i)
    final_sample = pd.concat([sample_under_350, sample_btw_350_700, sample_over_700])
    proportional_sampling_means.append(final_sample.mean())

# Plot the 100 sample means against the population mean (horizontal line).
plt.scatter(range(1, 101), proportional_sampling_means)
plt.axhline(wnba['PTS'].mean())
plt.axis([-5, 105, 100, 350])
plt.show()
```

**Simple Random Sampling, Sample = 12:**

```
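sampling_means = []
for i in range(100):
    # Draw 12 players at random from the whole dataset.
    final_sample = wnba['PTS'].sample(12, random_state=i)
    sampling_means.append(final_sample.mean())

# Plot the 100 sample means against the population mean (horizontal line).
plt.scatter(range(1, 101), sampling_means)
plt.axhline(wnba['PTS'].mean())
plt.axis([-5, 105, 100, 350])
plt.show()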
```

**Stratified Sampling (Games Played), Sample = 12:**

```
# Split the players into three strata by number of games played.
under_12 = wnba[wnba['Games Played'] <= 12]
btw_13_22 = wnba[(wnba['Games Played'] > 12) & (wnba['Games Played'] <= 22)]
over_23 = wnba[wnba['Games Played'] > 22]

proportional_sampling_means = []
for i in range(100):
    # Draw 1 + 2 + 9 = 12 players; random_state=i makes each sample reproducible.
    sample_under_12 = under_12['PTS'].sample(1, random_state=i)
    sample_btw_13_22 = btw_13_22['PTS'].sample(2, random_state=i)
    sample_over_23 = over_23['PTS'].sample(9, random_state=i)
    final_sample = pd.concat([sample_under_12, sample_btw_13_22, sample_over_23])
    proportional_sampling_means.append(final_sample.mean())

# Plot the 100 sample means against the population mean (horizontal line).
plt.scatter(range(1, 101), proportional_sampling_means)
plt.axhline(wnba['PTS'].mean())
plt.axis([-5, 105, 100, 350])
plt.show()
```
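Since all of the snippets above follow the same pattern, here is a minimal sketch of how they could be folded into two reusable helpers, which also makes it easy to compare the spread of the sample means numerically instead of only by eye. The helper names (`stratified_sample_means`, `simple_sample_means`) and the `demo` DataFrame are my own inventions: `demo` is synthetic data standing in for `wnba` so the sketch runs on its own.

```python
import numpy as np
import pandas as pd


def stratified_sample_means(strata, sizes, n_samples=100):
    """Mean of each stratified sample; `strata` and `sizes` are parallel lists."""
    means = []
    for i in range(n_samples):
        # Sample each stratum separately, then pool the parts into one sample.
        parts = [s.sample(k, random_state=i) for s, k in zip(strata, sizes)]
        means.append(pd.concat(parts).mean())
    return means


def simple_sample_means(series, size, n_samples=100):
    """Mean of each simple random sample drawn from the whole series."""
    return [series.sample(size, random_state=i).mean() for i in range(n_samples)]


# Synthetic stand-in for the WNBA data (made-up values, not the real dataset).
rng = np.random.default_rng(0)
games = rng.integers(2, 33, size=143)
demo = pd.DataFrame({
    'Games Played': games,
    'PTS': games * 8 + rng.integers(0, 60, size=143),
})

strata = [
    demo.loc[demo['Games Played'] <= 12, 'PTS'],
    demo.loc[(demo['Games Played'] > 12) & (demo['Games Played'] <= 22), 'PTS'],
    demo.loc[demo['Games Played'] > 22, 'PTS'],
]
stratified = stratified_sample_means(strata, [1, 2, 7])
simple = simple_sample_means(demo['PTS'], 10)

# Standard deviation of the sample means: a smaller value means less sampling variability.
print('stratified std:', np.std(stratified))
print('simple std:', np.std(simple))
```

To use it with the real data, replace `demo` with `wnba` and pass the resulting lists to `plt.scatter` as in the snippets above.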

Hope this helps!

Best,

Sahil