Can anyone help interpret the code below? -- Most Popular Apps by Genre on the App Store

I don't understand this step at all. Can someone explain it in simple terms?
freq_table is the function we defined earlier. Does it return a table showing each category and its percentage? And do we assign its result to genres_ios in order to calculate the averages?

In the nested loop, why do we write if genre_app == genre? I thought genre was a whole row of genres_ios and genre_app was just one element, so how can they be equal?

genres_ios = freq_table(ios_final, -5)

for genre in genres_ios:
    total = 0
    len_genre = 0
    for app in ios_final:
        genre_app = app[-5]
        if genre_app == genre:
            n_ratings = float(app[5])
            total += n_ratings
            len_genre += 1
    avg_n_ratings = total / len_genre
    print(genre, ':', avg_n_ratings)

It took a while to search and find this question. Please describe appropriately with links.

Your dataset is a list of lists. The freq_table function takes a dataset (a list of lists) and an index. It loops over the rows one by one, reads the value at that index in each row, and counts in the table dictionary how many times each value occurs. It then converts each count into a percentage of the total number of rows and stores it in table_percentages. The function returns that dictionary, which maps each value to its percentage.

genres_ios = freq_table(ios_final, -5) means: take the dataset ios_final, look at index -5 of each row (the genre), and return a frequency table in percentages.
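
To make the return value concrete, here is a small sketch that runs the same freq_table on a made-up dataset. The name toy_apps and its two-column rows are invented for illustration; the real ios_final has more columns, which is why the original call uses index -5 rather than -1.

```python
def freq_table(dataset, index):
    table = {}
    total = 0

    # Count how many times each value appears at the given index
    for row in dataset:
        total += 1
        value = row[index]
        if value in table:
            table[value] += 1
        else:
            table[value] = 1

    # Convert each count into a percentage of all rows
    table_percentages = {}
    for key in table:
        table_percentages[key] = (table[key] / total) * 100

    return table_percentages


# Hypothetical toy rows: [app name, genre]
toy_apps = [
    ['App A', 'Games'],
    ['App B', 'Games'],
    ['App C', 'Music'],
    ['App D', 'Games'],
]

toy_genres = freq_table(toy_apps, -1)
print(toy_genres)  # {'Games': 75.0, 'Music': 25.0}
```

The result is a dictionary: the keys are the genres and the values are their percentages.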

for genre in genres_ios means: take each key (a genre name) in turn from the percentage dictionary genres_ios.

for app in ios_final means: take each row, called app, in turn from the ios_final list of lists.

genre_app = app[-5] means: pick the value at index -5 from the row called app. if genre_app == genre: checks whether the dictionary key genre is equal to the value you picked from the row. Both are single genre names, so they can be compared directly. If they are equal, compute n_ratings and update total and len_genre.

After the inner loop has gone through all the rows, calculate the average for that genre and print it.


def freq_table(dataset, index):
    table = {}
    total = 0
    
    for row in dataset:
        total += 1
        value = row[index]
        if value in table:
            table[value] += 1
        else:
            table[value] = 1
    
    table_percentages = {}
    for key in table:
        percentage = (table[key] / total) * 100
        table_percentages[key] = percentage 
    
    return table_percentages


genres_ios = freq_table(ios_final, -5)

for genre in genres_ios:
    total = 0
    len_genre = 0
    for app in ios_final:
        genre_app = app[-5]
        if genre_app == genre:            
            n_ratings = float(app[5])
            total += n_ratings
            len_genre += 1
    avg_n_ratings = total / len_genre
    print(genre, ':', avg_n_ratings)

Thank you @monorienaghogho!! I feel like I understand 80% of this. So for xxx in a dictionary means that the computer takes only the keys from the dictionary, while for xxx in a list means that the computer takes each whole line/row from the list. Did I get that right?

I still don't really understand genres_ios = freq_table(ios_final, -5). Is the reason we do this that we only need the unique genres from the frequency table, and we don't really need the percentages here? Did I get that right?

"So **for xxx in the dictionary** means that the computer only takes the key out from the dictionary, but **for xxx in the list** means that the computer takes the whole line/row from the list."

You understood this correctly.
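
A quick sketch of that difference, using made-up data (the names percentages and rows are just for illustration):

```python
# Hypothetical data: a dictionary like genres_ios and a list of lists like ios_final
percentages = {'Games': 75.0, 'Music': 25.0}
rows = [['App A', 'Games'], ['App C', 'Music']]

# Looping over a dictionary yields its keys only
dict_items = [item for item in percentages]
print(dict_items)  # ['Games', 'Music']

# Looping over a list of lists yields each whole row
list_items = [item for item in rows]
print(list_items)  # [['App A', 'Games'], ['App C', 'Music']]

# If you also want the values of a dictionary, loop over .items()
for genre, pct in percentages.items():
    print(genre, pct)
```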

genres_ios = freq_table(ios_final, -5): here you are calculating the frequency (as a percentage) of each genre in ios_final. The freq_table function does the work: it computes what percentage of the rows belongs to each genre and returns a dictionary with the genres as keys and the percentages as values. The averaging loop only iterates over the keys, so the percentages themselves are not used there.
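
On the question of whether only the unique genres matter: a rough sketch with made-up data suggests yes. Here toy_table stands in for what freq_table would return for the hypothetical toy_apps rows, and its keys are exactly the distinct genres found in those rows.

```python
# Hypothetical toy rows: [app name, genre]
toy_apps = [
    ['App A', 'Games'],
    ['App B', 'Games'],
    ['App C', 'Music'],
    ['App D', 'Games'],
]

# Stand-in for the dictionary freq_table would return for these rows
toy_table = {'Games': 75.0, 'Music': 25.0}

# The distinct genres, collected directly from the rows
unique_genres = set(row[-1] for row in toy_apps)

# The keys of the frequency table are the same distinct genres
same_genres = set(toy_table) == unique_genres
print(same_genres)  # True
```

So the frequency table is reused here mainly as a convenient list of the distinct genres to loop over.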

Thanks again, @monorienaghogho!!

Now I am trying to understand the summing part.
if genre_app == genre:
    rating_value = float(row[5])
    total += rating_value
    len_genre += 1
I take each rating_value from the ios_final dataset, add them all up, and store the sum in the total variable that I created. Did I get that right?

if genre_app == genre: take the value at index 5 of the row (the number of ratings, going by the name n_ratings in the original code) and convert it to a float. Add it to the running total (total += rating_value), then increase len_genre by 1. All of these updates happen only when genre_app == genre.
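
Here is a small trace of that accumulation for a single genre, using made-up rows. The three-column layout of toy_apps is invented for illustration, so the indexes are 1 and 2 instead of -5 and 5.

```python
# Hypothetical toy rows: [app name, genre, number of ratings as a string]
toy_apps = [
    ['App A', 'Games', '100'],
    ['App B', 'Music', '40'],
    ['App C', 'Games', '300'],
]

genre = 'Games'   # one key, as the outer loop would supply it
total = 0         # running sum of ratings for this genre
len_genre = 0     # number of apps seen in this genre

for app in toy_apps:
    genre_app = app[1]
    if genre_app == genre:
        n_ratings = float(app[2])
        total += n_ratings    # 100.0, then 400.0
        len_genre += 1        # 1, then 2

avg_n_ratings = total / len_genre
print(genre, ':', avg_n_ratings)  # Games : 200.0
```

The 'Music' row is skipped because its genre does not match, so it affects neither total nor len_genre.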