Hi @jithins123,
I have some promising news! Please see the code below:
genres_ios = freq_table(ios_final, -5)
for genre in genres_ios:
    total = 0
    len_genre = 0
    for app in ios_final:
        genre_app = app[-5]
        if genre_app == genre:
            n_ratings = float(app[5])
            total += n_ratings
            len_genre += 1
    avg_n_ratings = total / len_genre
    a_tuple = (avg_n_ratings, genre)
    a_list.append(a_tuple)
sorted(a_list, reverse = True)
The only problem now is the output. Here are the first several lines of the sorted list:
(86090.33333333333, 'Navigation'),
(86090.33333333333, 'Navigation'),
(74942.11111111111, 'Reference'),
(74942.11111111111, 'Reference'),
(71548.34905660378, 'Social Networking'),
(71548.34905660378, 'Social Networking'),
(57326.530303030304, 'Music'),
(57326.530303030304, 'Music'),
(52279.892857142855, 'Weather'),
(52279.892857142855, 'Weather'),
(39758.5, 'Book'),
(39758.5, 'Book'),
(33333.92307692308, 'Food & Drink'),
(33333.92307692308, 'Food & Drink'),
(31467.944444444445, 'Finance'),
(31467.944444444445, 'Finance'),
(28441.54375, 'Photo & Video'),
(28441.54375, 'Photo & Video'),
(28243.8, 'Travel'),
(28243.8, 'Travel'),
(26919.690476190477, 'Shopping'),
(26919.690476190477, 'Shopping'),
(23298.015384615384, 'Health & Fitness'),
(23298.015384615384, 'Health & Fitness'),
(23008.898550724636, 'Sports'),
(23008.898550724636, 'Sports'),
(22788.6696905016, 'Games'),
(22788.6696905016, 'Games'),
(21248.023255813954, 'News'),
(21248.023255813954, 'News'),
(21028.410714285714, 'Productivity'),
(21028.410714285714, 'Productivity'),
(18684.456790123455, 'Utilities'),
(18684.456790123455, 'Utilities'),
(16485.764705882353, 'Lifestyle'),
(16485.764705882353, 'Lifestyle'),
(16485.764705882353, 'Lifestyle'),
(16485.764705882353, 'Lifestyle'),
(16485.764705882353, 'Lifestyle'),
(14029.830708661417, 'Entertainment'),
(14029.830708661417, 'Entertainment'),
(7491.117647058823, 'Business'),
(7491.117647058823, 'Business'),
(7003.983050847458, 'Education'),
(7003.983050847458, 'Education'),
(4004.0, 'Catalogs'),
(4004.0, 'Catalogs'),
(612.0, 'Medical'),
(612.0, 'Medical')
Notice that everything is printed twice (and Lifestyle five times)? How do I get past this? There are also recurring genre entries after the last line of output I’ve included here. So the output is partially correct but not all the way there: I need to get rid of the duplicates and the extra entries. Once I do, I’ll have a sorted list without the duplicates!

Also, I looked at the solutions more closely and realized I made a mistake. What they’ve printed is not actually in order! I mistakenly thought it was. For whatever reason, though, the output they get and the output I get from the exact same code are not the same. That seems rather strange. Any thoughts on why this might be?
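To make the duplicate problem concrete, here is a minimal sketch of one thing I suspect: that a_list is created once (for example in an earlier notebook cell) and keeps growing every time I re-run the loop. This is only a guess. The toy data and the helper name genre_averages are made up for illustration and are not part of the guided project.

```python
# Hypothetical toy data: (genre, number of ratings). Not the real dataset.
apps = [("Games", 10), ("Games", 30), ("Music", 50)]

def genre_averages(apps, results):
    """Append one (average, genre) tuple per genre to `results`."""
    genres = {genre for genre, _ in apps}
    for genre in genres:
        counts = [n for g, n in apps if g == genre]
        results.append((sum(counts) / len(counts), genre))

# Assumption: the list is created once and the loop is run twice,
# as happens when a notebook cell is re-executed.
a_list = []
genre_averages(apps, a_list)
genre_averages(apps, a_list)   # second run appends the same tuples again
duplicated = sorted(a_list, reverse=True)
print(duplicated)

# Starting from a fresh list right before the loop avoids the repeats.
fresh = []
genre_averages(apps, fresh)
clean = sorted(fresh, reverse=True)
print(clean)
```

If this is what is happening, resetting a_list to an empty list immediately before the loop should remove the repeats, but I have not confirmed that this is the actual cause in my notebook.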
I also have one other question that I’ve been having a hard time figuring out with this code. The first line is genres_ios = freq_table(ios_final, -5). The freq_table function, when called, should generate percentages, no? Here’s that code for reference:
def freq_table(dataset, index):
    table = {}
    total = 0
    for row in dataset:
        total += 1
        value = row[index]
        if value in table:
            table[value] += 1
        else:
            table[value] = 1
    table_percentages = {}
    for key in table:
        percentage = (table[key] / total) * 100
        table_percentages[key] = percentage
    return table_percentages
Why doesn’t the new code also produce percentages alongside the averages we’re computing? Are the percentages being overwritten when we introduce the new loop to find the average for each genre? How is freq_table actually being used within the code that finds the averages?
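Here is a small sketch of my current understanding, in case it helps pin down where I’m going wrong. My guess is that freq_table still returns a dictionary of percentages, but a for loop over a dictionary only yields its keys (the genre names), so the percentage values are simply never read. The toy rows below are invented for illustration.

```python
def freq_table(dataset, index):
    """Return {value: percentage of rows having that value at `index`}."""
    table = {}
    total = 0
    for row in dataset:
        total += 1
        value = row[index]
        if value in table:
            table[value] += 1
        else:
            table[value] = 1
    table_percentages = {}
    for key in table:
        table_percentages[key] = (table[key] / total) * 100
    return table_percentages

# Hypothetical rows: [app name, genre]
toy_rows = [["a", "Games"], ["b", "Games"], ["c", "Music"], ["d", "Music"]]

genres = freq_table(toy_rows, -1)
print(genres)             # the percentages are still in the dictionary

# Iterating over a dict yields only its keys, so the loop sees genre names.
keys_seen = [genre for genre in genres]
print(keys_seen)
```

If that reading is right, nothing is overwritten: freq_table is just being used as a convenient way to get the list of unique genres.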
Thanks so much!