Screen Link: https://app.dataquest.io/m/314/dictionaries-and-frequency-tables/13/filtering-for-the-intervals
from csv import reader

opened_file = open('AppleStore.csv')
read_file = reader(opened_file)
apps_data = list(read_file)

rating_count_tot = []
for row in apps_data[1:]:
    rating = float(row[5])
    rating_count_tot.append(rating)

max_rating = max(rating_count_tot)
min_rating = min(rating_count_tot)

rating_buckets = {'0 - 100,000': 0, '100,000 - 1,000,000': 0, '1,000,000 +': 0}
for rating in rating_count_tot:
    if rating <= 100000:
        rating_buckets['0 - 100,000'] += 1
    if 10000 < rating <= 1000000:
        rating_buckets['100,000 - 1,000,000'] += 1
    if rating > 1000000:
        rating_buckets['1,000,000 +'] += 1

values = rating_buckets.values()
total = sum(values)
print(len(rating_count_tot))
print(total)
What I expected to happen:
For these two print statements to output the same value:
print(len(rating_count_tot))
print(total)
What actually happened:
7197
7995
Essentially, I was checking that the length of my ratings list equals the sum of the bucket counts in my frequency table. That way I could be sure that I wrote the code correctly and accounted for every rating.
Any idea why the number is coming out differently?
Thanks!
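
One likely cause, reading the code as posted: the second condition uses 10000 as its lower bound while the first bucket ends at 100000, so any rating between 10,001 and 100,000 passes both `if` checks and is counted twice. The sketch below reproduces that effect on a small made-up list. `sample_ratings` and `count_buckets` are hypothetical names introduced for illustration, and the sample values are not from AppleStore.csv.

```python
# Hypothetical sample values; the real list is built from column 5 of AppleStore.csv.
sample_ratings = [0.0, 5000.0, 50000.0, 100000.0, 250000.0, 2000000.0]


def count_buckets(ratings, lower_bound):
    """Count ratings per bucket with independent `if` checks, as in the post.

    `lower_bound` is the lower limit of the middle bucket, so the same
    function can show both the posted value (10000) and the value that
    matches the bucket label (100000).
    """
    buckets = {'0 - 100,000': 0, '100,000 - 1,000,000': 0, '1,000,000 +': 0}
    for rating in ratings:
        if rating <= 100000:
            buckets['0 - 100,000'] += 1
        if lower_bound < rating <= 1000000:
            buckets['100,000 - 1,000,000'] += 1
        if rating > 1000000:
            buckets['1,000,000 +'] += 1
    return buckets


overlapping = count_buckets(sample_ratings, 10000)   # lower bound as posted
exclusive = count_buckets(sample_ratings, 100000)    # lower bound matching the label

# Number of ratings, total with overlapping buckets, total with exclusive buckets
print(len(sample_ratings), sum(overlapping.values()), sum(exclusive.values()))
```

With the posted lower bound, 50000.0 and 100000.0 each land in two buckets, so the bucket total exceeds the number of ratings. Changing the bound to 100000 makes the ranges mutually exclusive. Replacing the second and third `if` with `elif` would also guarantee each rating is counted at most once, although the mismatched bound would then put those ratings silently in the first bucket only.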