Cluster Sampling: It's better to turn a list into a data frame than to use df1.append(df2)

teams = pd.Series(wnba['Team'].unique()).sample(4, random_state=0)
clusters = []

for team in teams:
    cluster = wnba[wnba['Team'] == team]
    clusters.append(cluster)  # collect each team's rows; without this, pd.concat gets an empty list

data = pd.concat(clusters, ignore_index=True)

sampling_error_height = wnba['Height'].mean() - data['Height'].mean()
sampling_error_age = wnba['Age'].mean() - data['Age'].mean()
sampling_error_BMI = wnba['BMI'].mean() - data['BMI'].mean()
sampling_error_points = wnba['PTS'].mean() - data['PTS'].mean()

Using lists is less resource intensive.

Isn’t it better to grow a list and transform it into a data frame than to grow a data frame via df1.append(df2)?

Yes, that approach would be more efficient in this case. I can’t confirm which one, if either, would be more memory intensive, but yours would definitely be faster.

Yes, df.append will get increasingly slow, because each call copies everything accumulated so far into a new data frame.
You can avoid lists altogether and just df.query the 4 teams you want.
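
Here is a minimal sketch of that last suggestion. The small wnba frame and its values are made up for illustration, and team_list is just a helper name; with the real data set only the filtering lines matter. It shows both a boolean mask with isin and the equivalent df.query call.

```python
import pandas as pd

# Toy stand-in for the wnba data set (values are invented for illustration).
wnba = pd.DataFrame({
    "Team": ["ATL", "ATL", "CHI", "CHI", "DAL", "DAL", "LA", "LA", "MIN", "MIN"],
    "Height": [180, 185, 190, 175, 188, 183, 191, 178, 186, 181],
})

# Sample 4 teams, as in the original post.
teams = pd.Series(wnba["Team"].unique()).sample(4, random_state=0)

# Boolean-mask version: keep rows whose team is one of the sampled teams.
data = wnba[wnba["Team"].isin(teams)].reset_index(drop=True)

# Equivalent df.query version; the @ prefix refers to a Python variable.
team_list = teams.tolist()
data_q = wnba.query("Team in @team_list").reset_index(drop=True)

sampling_error_height = wnba["Height"].mean() - data["Height"].mean()
```

One difference from the loop version: rows keep their original order instead of being grouped in the order the teams were sampled. That does not affect the means, so the sampling errors come out the same.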

