Guided Project: Profitable App Profiles for the App Store and Google Play Markets - Sorting Averages

This isn't part of the project assignment, but to gain a deeper understanding of Python I tried to sort the avg_ratings. I understand there are easier ways to do this, but from what I've seen so far we have to build a dictionary from the list and then turn the dictionary into a list of tuples to sort it. I've attempted that, but I'm not getting a sorted list as a result. Could you please explain why?

Screen Link:
https://app.dataquest.io/m/350/guided-project%3A-profitable-app-profiles-for-the-app-store-and-google-play-markets/12/most-popular-apps-by-genre-on-the-app-store

My Code:

ios_genres = freq_table(ios_complete, -5)

for genres in ios_genres:
    total = 0
    len_genre = 0
    for apps in ios_complete:
        genre_app = apps[-5]
        if genre_app == genres:
            ratings = float(apps[5])
            total += ratings
            len_genre += 1
    avg_ratings = total / len_genre
   
    sorted_genres = {}
    if genres in sorted_genres:
        sorted_genres[genres] += avg_ratings
    else:
        sorted_genres[genres] = avg_ratings
    
    display_genre = []
    for items in sorted_genres:
        tuple_genre = (sorted_genres[items], items)
        display_genre.append(tuple_genre)
    final_table = sorted(display_genre, reverse = True)

    print(final_table)

What I expected to happen:
Expected a sorted list

What actually happened:

[(10822.961077844311, 'Entertainment')]
[(20179.093023255813, 'Food & Drink')]
[(6266.333333333333, 'Education')]
[(15892.724137931034, 'News')]
[(47220.93548387097, 'Weather')]
[(18924.68896765618, 'Games')]
[(8498.333333333334, 'Book')]
[(56482.02985074627, 'Music')]
[(67447.9, 'Reference')]
[(18746.677685950413, 'Shopping')]
[(8978.308510638299, 'Lifestyle')]
[(6367.8, 'Business')]
[(53078.195804195806, 'Social Networking')]
[(20216.01785714286, 'Travel')]
[(27249.892215568863, 'Photo & Video')]
[(19053.887096774193, 'Productivity')]
[(459.75, 'Medical')]
[(20128.974683544304, 'Sports')]
[(13522.261904761905, 'Finance')]
[(19952.315789473683, 'Health & Fitness')]
[(25972.05, 'Navigation')]
[(1779.5555555555557, 'Catalogs')]
[(14010.100917431193, 'Utilities')]

Hello!

The first problem is that you initialize sorted_genres inside the for genres in ios_genres: loop, so each time the loop moves on to the next genre in ios_genres, the dictionary is reset to empty.
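To see the effect in isolation, here is a minimal sketch. The averages dictionary is made-up toy data, not the real App Store dataset, and the names inside and outside are only for illustration:

```python
# Hypothetical toy data standing in for the per-genre averages.
averages = {"Games": 4.0, "Music": 3.5, "Weather": 4.5}

# Dictionary created inside the loop: it is reset on every pass.
for genre in averages:
    inside = {}
    inside[genre] = averages[genre]
print(inside)   # only the last genre is left: {'Weather': 4.5}

# Dictionary created once, before the loop: entries accumulate.
outside = {}
for genre in averages:
    outside[genre] = averages[genre]
print(outside)  # all three genres are present
```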

Also, there’s no need to run the if statement below:

      if genres in sorted_genres:
            sorted_genres[genres] += avg_ratings
      else:
            sorted_genres[genres] = avg_ratings

Note that the code below already accumulates the total and the count each time the genre in the outer for matches apps[-5] in the inner for, and then computes the average once per genre:

               total += ratings
               len_genre += 1
        avg_ratings = total / len_genre

And if sorted_genres does not have a key for the genre yet, the line sorted_genres[genres] = avg_ratings will create it.

The code below should also be moved out of the for genres in ios_genres: loop, for the same reason as sorted_genres:


        display_genre = []
        for items in sorted_genres:
            tuple_genre = (sorted_genres[items], items)
            display_genre.append(tuple_genre)
        final_table = sorted(display_genre, reverse = True)

The code above is also missing the lines that print final_table in a readable way. These lines are:


    for row in final_table:
        print(row[1], ':', row[0])

Your code should be like this:


    sorted_genres = {}
    for genres in ios_genres:
        total = 0
        len_genre = 0
        for apps in ios_complete:
            genre_app = apps[-5]
            if genre_app == genres:
                ratings = float(apps[5])
                total += ratings
                len_genre += 1
        avg_ratings = total / len_genre
        sorted_genres[genres] = avg_ratings

    display_genre = []
    for items in sorted_genres:
        tuple_genre = (sorted_genres[items], items)
        display_genre.append(tuple_genre)
    final_table = sorted(display_genre, reverse = True)
    for row in final_table:
        print(row[1], ':', row[0])
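As for the easier methods mentioned in the question, one common option is to sort the dictionary items directly with a key function, which skips the intermediate list of tuples. This is only a sketch: the three values below are a hand-picked, rounded sample standing in for the real sorted_genres.

```python
# Hypothetical sample standing in for the real sorted_genres dictionary.
sorted_genres = {"Games": 18924.7, "Medical": 459.75, "Reference": 67447.9}

# items() yields (genre, average) pairs; the key function sorts by the average.
final_table = sorted(sorted_genres.items(), key=lambda pair: pair[1], reverse=True)

for genre, avg in final_table:
    print(genre, ':', avg)
```

The pairs stay in (genre, average) order here, so there is no need to swap them when printing.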

Hope this helps you!

Hi Octavio,

I am actually having a similar issue with the lesson before the final lesson in this project. The output when I run the code appears to be incorrect.

I will provide the code below:

prime_genre_freq = freq_table(app_store_dataset,11)

for genre in prime_genre_freq:
    total_ratings = 0 # store the sum of user ratings (the number of ratings) per genre
    len_genre = 0 # store number of apps specific to each genre

    for each_row in app_store_dataset[1:]:
        genre_app = each_row[11]
        #print(genre_app)
    
        if genre_app == genre:
           ratings = float(each_row[5])
           total_ratings += ratings
           len_genre += 1
        
avg_ratings = total_ratings / len_genre  
print(genre_app)
print(avg_ratings)

Here is the output:

{'Sports': 114, 'Reference': 64, 'Utilities': 248, 'Photo & Video': 349, 'Lifestyle': 144, 'Games': 3862, 'Finance': 104, 'Social Networking': 167, 'Health & Fitness': 180, 'Navigation': 46, 'Catalogs': 10, 'Business': 57, 'Productivity': 178, 'Shopping': 122, 'Education': 453, 'Travel': 81, 'Music': 138, 'News': 75, 'Medical': 23, 'Food & Drink': 63, 'Book': 112, 'Weather': 72, 'Entertainment': 535}
Food & Drink
7533.678504672897

It appears to match only Food & Drink and then display the average rating underneath. I am not sure why it is not displaying all genres in the Apple store data set. I thought it was an issue with my if statement, but it does not look too far off from the example you provided.

Thank you!

I guess this is an indentation problem, but it is difficult to tell since your code is not well formatted. Could you format all of it the way the red part is?

I apologize,

A simple copy and paste did not post the code as it is written in the Jupyter editor and this is my first time posting.

I think I fixed it for ya!

OK, no problem.

Try to put the following part inside the for genre in prime_genre_freq: loop:

avg_ratings = total_ratings / len_genre  
print(genre_app)
print(avg_ratings)

If it is outside the for loop, and since you are not storing the data in a list or a dictionary, it will only display the data for the last genre in prime_genre_freq.
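A small sketch of why that happens, using made-up rows instead of app_store_dataset (the names rows, genres and averages are assumptions for illustration only):

```python
# Hypothetical (genre, rating count) rows, not the real dataset.
rows = [("Games", 10.0), ("Music", 4.0), ("Games", 20.0)]
genres = ["Games", "Music"]

averages = {}
for genre in genres:
    total = 0
    count = 0
    for row_genre, rating in rows:
        if row_genre == genre:
            total += rating
            count += 1
    avg = total / count
    averages[genre] = avg  # stored inside the loop, once per genre

# After the loop, avg only holds the value from the last pass.
print(avg)       # 4.0 (Music)
print(averages)  # {'Games': 15.0, 'Music': 4.0}
```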

Octavio,

I appreciate the response and it does make sense as to why it was only outputting one genre.

However, when you move that snippet into the for genre in prime_genre_freq: loop, you get a ZeroDivisionError. I am guessing this happens because during the first iteration of the outer loop it tries to compute avg_ratings by dividing total_ratings by len_genre.

Can you post the current code?

prime_genre_freq = freq_table(app_store_dataset,11)

for genre in prime_genre_freq:
     total_ratings = 0
     len_genre = 0
     avg_ratings = total_ratings / len_genre

     print(genre_app)
     print(avg_ratings)

     for each_row in app_store_dataset[1:]:
         genre_app = each_row[11]
    
         if genre_app == genre:
            ratings = float(each_row[5])
            total_ratings += ratings
            len_genre += 1

Just as I thought: you put it inside the first for but before the second one, so len_genre is still zero when the division runs, and that is why you get a ZeroDivisionError.
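Here is a minimal reproduction of that error. The safe_average helper is hypothetical, added only to show one way to guard against an empty genre; it is not part of the guided project.

```python
total_ratings = 0
len_genre = 0

caught = False
try:
    # Dividing before the inner loop has counted anything.
    avg_ratings = total_ratings / len_genre
except ZeroDivisionError:
    caught = True
    print("len_genre is still 0, so the division fails")


def safe_average(total, count):
    """Hypothetical helper: return None when there is nothing to average."""
    return total / count if count else None


print(safe_average(30.0, 2))  # 15.0
print(safe_average(0, 0))     # None
```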

This is how it should be:

prime_genre_freq = freq_table(app_store_dataset,11)

for genre in prime_genre_freq:
    total_ratings = 0 # store the sum of user ratings (the number of ratings) per genre
    len_genre = 0 # store number of apps specific to each genre

    for each_row in app_store_dataset[1:]:
        genre_app = each_row[11]

        if genre_app == genre:
            ratings = float(each_row[5])
            total_ratings += ratings
            len_genre += 1

    avg_ratings = total_ratings / len_genre
    print(genre_app)
    print(avg_ratings)

Otavios,

I must still have something incorrect.

prime_genre_freq = freq_table(app_store_dataset,11)

for genre in prime_genre_freq:
    total_ratings = 0
    len_genre = 0

    for each_row in app_store_dataset[1:]:
        genre_app = each_row[11]

        if genre_app == genre:
            ratings = float(each_row[5])
            total_ratings += ratings
            len_genre += 1

    avg_ratings = total_ratings / len_genre
    print(genre_app)
    print(avg_ratings)

output:
Food & Drink
14026.929824561403
Food & Drink
22410.84375
Food & Drink
6863.822580645161
Food & Drink
14352.280802292264
Food & Drink
6161.763888888889
Food & Drink
13691.996633868463
Food & Drink
11047.653846153846
Food & Drink
45498.89820359281
Food & Drink
9913.172222222222
Food & Drink
11853.95652173913
Food & Drink
1732.5
Food & Drink
4788.087719298245
Food & Drink
8051.3258426966295
Food & Drink
18615.32786885246
Food & Drink
2239.2295805739514
Food & Drink
14129.444444444445
Food & Drink
28842.021739130436
Food & Drink
13015.066666666668
Food & Drink
592.7826086956521
Food & Drink
13938.619047619048
Food & Drink
5125.4375
Food & Drink
22181.027777777777
Food & Drink
7533.678504672897

Run this and post the output:

prime_genre_freq = freq_table(app_store_dataset,11)
print(prime_genre_freq)
print(freq_table)

output:
{'Sports': 114, 'Reference': 64, 'Utilities': 248, 'Photo & Video': 349, 'Lifestyle': 144, 'Games': 3862, 'Finance': 104, 'Social Networking': 167, 'Health & Fitness': 180, 'Navigation': 46, 'Catalogs': 10, 'Business': 57, 'Productivity': 178, 'Shopping': 122, 'Education': 453, 'Travel': 81, 'Music': 138, 'News': 75, 'Medical': 23, 'Food & Drink': 63, 'Book': 112, 'Weather': 72, 'Entertainment': 535}
<function freq_table at 0x7f08a922b378>

You are printing genre_app instead of genre.
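The reason every line said Food & Drink: after the inner loop finishes, genre_app still holds the genre of the last row in the dataset, whatever the outer loop is currently on. A toy sketch with made-up rows (rows, labels_wrong and labels_right are illustrative names only):

```python
# Hypothetical rows; the last one plays the role of the final dataset row.
rows = [["Games"], ["Music"], ["Food & Drink"]]

labels_wrong = []
labels_right = []
for genre in ["Games", "Music"]:
    for row in rows:
        genre_app = row[0]          # overwritten on every row
    labels_wrong.append(genre_app)  # always the last row's genre
    labels_right.append(genre)      # the genre the outer loop is on

print(labels_wrong)  # ['Food & Drink', 'Food & Drink']
print(labels_right)  # ['Games', 'Music']
```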

Well, I feel like an idiot.

I really appreciate you helping me out. I ran it again and it works as it should.

:man_facepalming: