Screen Link:
My Code:
aggregate_means = {}
brand_freq = {}
prices_by_index = autos_brands.loc[:,'price']
#print(autos_brands.loc[2])
for row in autos_brands['brand']:
    add_row = 0
    if row in aggregate_means:
        aggregate_means[row]+=prices_by_index.iloc[add_row]
        brand_freq[row]+=1
    else:
        aggregate_means[row]=prices_by_index.iloc[add_row]
        brand_freq[row]=1
    add_row+=1
aggregate_sums = aggregate_means
for key in aggregate_sums:
    aggregate_means[key] = aggregate_sums[key]/brand_freq[key]
What I expected to happen:
I checked autos_brands.shape, and I checked that autos_brands['price'] has the same price values as the corresponding rows in autos. I expected to get the average price for each brand.
What actually happened:
I got 5000 for every brand’s average:
print(aggregate_means)
print(brand_freq)
print(prices_by_index)
Results in the following:
{'peugeot': 5000.0, 'bmw': 5000.0, 'volkswagen': 5000.0, 'smart': 5000.0, 'ford': 5000.0, 'seat': 5000.0, 'renault': 5000.0, 'mercedes_benz': 5000.0, 'audi': 5000.0, 'opel': 5000.0, 'mazda': 5000.0, 'toyota': 5000.0, 'nissan': 5000.0, 'fiat': 5000.0, 'skoda': 5000.0, 'citroen': 5000.0}
{'peugeot': 1413, 'bmw': 5201, 'volkswagen': 10157, 'smart': 684, 'ford': 3330, 'seat': 888, 'renault': 2272, 'mercedes_benz': 4586, 'audi': 4118, 'opel': 5155, 'mazda': 730, 'toyota': 609, 'nissan': 734, 'fiat': 1232, 'skoda': 772, 'citroen': 674}
0         5000.000000
1         8500.000000
2         8990.000000
3         4350.000000
4         1350.000000
...
49995    24900.000000
49996     1980.000000
49997    13200.000000
49998    22900.000000
49999     1250.000000
Name: price, Length: 42555, dtype: float64
autos_brands was created by trimming unwanted brands from my dataset, but I’ve matched the prices back to the original autos dataset to make sure I haven’t made any assignment errors along the way.
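A possible explanation, judging only from the output above: add_row = 0 comes right after the for line, so if it is indented inside the loop body it is reset on every pass and prices_by_index.iloc[add_row] always reads the first price (5000). Every brand's sum then becomes 5000 times its count, and dividing by the count gives 5000. The sketch below shows one way to avoid the manual counter. It uses a small made-up DataFrame as a stand-in for autos_brands (the values and the price_sums name are assumptions for illustration, not the real data) and cross-checks the loop against pandas' groupby.

```python
import pandas as pd

# Hypothetical stand-in for autos_brands; the index has gaps,
# as it would after filtering rows out of a larger dataset.
autos_brands = pd.DataFrame(
    {
        "brand": ["bmw", "audi", "bmw", "audi", "ford"],
        "price": [5000.0, 8500.0, 9000.0, 4500.0, 1350.0],
    },
    index=[0, 1, 3, 4, 7],
)

price_sums = {}
brand_freq = {}

# zip pairs each brand with the price from the same row,
# so no positional counter has to be maintained by hand.
for brand, price in zip(autos_brands["brand"], autos_brands["price"]):
    price_sums[brand] = price_sums.get(brand, 0.0) + price
    brand_freq[brand] = brand_freq.get(brand, 0) + 1

# Build the means in a separate dict so the sums are left intact.
aggregate_means = {
    brand: price_sums[brand] / brand_freq[brand] for brand in price_sums
}

# Cross-check against pandas' built-in grouping.
groupby_means = autos_brands.groupby("brand")["price"].mean().to_dict()

print(aggregate_means)
print(groupby_means)
```

If the loop is kept in its original form, moving add_row = 0 above the for line should have the same effect. Note also that aggregate_sums = aggregate_means does not copy the dictionary; both names refer to the same object, which is why the sketch builds the means in a new dict.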