I don't understand this step at all. Can someone explain it in simple terms?
The freq_table function that we defined before returns a table that shows each category and its percentage, right? Do we assign its result to genres_ios in order to calculate the averages?
In the nested loop, why do we write if genre_app == genre? genre is the whole line of genres_ios and genre_app is just one element, so how can they be equal?
genres_ios = freq_table(ios_final, -5)
for genre in genres_ios:
    total = 0
    len_genre = 0
    for app in ios_final:
        genre_app = app[-5]
        if genre_app == genre:
            n_ratings = float(app[5])
            total += n_ratings
            len_genre += 1
    avg_n_ratings = total / len_genre
    print(genre, ':', avg_n_ratings)
It took a while to search and find this question. Please describe appropriately with links.
Your dataset is a list of lists. Your freq_table function takes in a dataset (a list of lists) and an index. It loops over the rows one by one, reads the value at that index, and counts how many times each value occurs, storing the value as a key and its count in the table dictionary. After that, it takes each count in table and converts it into a percentage of the total number of rows. The function returns the dictionary of percentages.
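To make those two stages concrete, here is a minimal sketch that traces them on a tiny hand-made dataset. The rows in toy_rows and the names counts and percentages are made up for illustration, and the genre sits at index -1 rather than -5 only to keep the rows short.

```python
# Made-up rows for illustration: [app name, genre]
toy_rows = [
    ['App A', 'Games'],
    ['App B', 'Games'],
    ['App C', 'Music'],
    ['App D', 'Games'],
]

# Stage 1: count how many rows carry each genre
counts = {}
total = 0
for row in toy_rows:
    total += 1
    value = row[-1]
    if value in counts:
        counts[value] += 1
    else:
        counts[value] = 1

print(counts)  # {'Games': 3, 'Music': 1}

# Stage 2: turn each count into a percentage of all rows
percentages = {}
for key in counts:
    percentages[key] = (counts[key] / total) * 100

print(percentages)  # {'Games': 75.0, 'Music': 25.0}
```

The second dictionary is the kind of result that freq_table hands back: one key per distinct value, with its share of the rows as the value.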
genres_ios = freq_table(ios_final, -5) means: take the dataset ios_final, look at index -5 of each row, and return a frequency table in percentages.
for genre in genres_ios means: take each key, one at a time, from the percentage dictionary genres_ios.
for app in ios_final means: take each row app, one at a time, from the ios_final list of lists.
genre_app = app[-5] means: pick the value at index -5 from the row called app.
if genre_app == genre: means: check whether the key from the dictionary, genre, is equal to what you picked from the row called app. If it is, convert the rating count to a float as n_ratings, add it to total, and increase len_genre by 1.
After the inner loop has gone through all the rows, calculate the average for that genre and print it.
def freq_table(dataset, index):
    table = {}
    total = 0
    for row in dataset:
        total += 1
        value = row[index]
        if value in table:
            table[value] += 1
        else:
            table[value] = 1
    table_percentages = {}
    for key in table:
        percentage = (table[key] / total) * 100
        table_percentages[key] = percentage
    return table_percentages

genres_ios = freq_table(ios_final, -5)
for genre in genres_ios:
    total = 0
    len_genre = 0
    for app in ios_final:
        genre_app = app[-5]
        if genre_app == genre:
            n_ratings = float(app[5])
            total += n_ratings
            len_genre += 1
    avg_n_ratings = total / len_genre
    print(genre, ':', avg_n_ratings)
Thank you @monorienaghogho!! I feel like I understand 80% of this. So for xxx in the dictionary means that the computer only takes the key out from the dictionary, but for xxx in the list means that the computer takes the whole line/row from the list. Did I get it correctly?
I still don’t really understand genres_ios = freq_table(ios_final, -5). So the reason we do this is because we only need to take the unique genres from the frequency table, and we don’t really need the percentage numbers here. Did I get it right?
So **for xxx in the dictionary** means that the computer only takes the key out from the dictionary, but if **for xxx in the list** means that the computer takes the whole line/row from the list
You understood this correctly.
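A small sketch of that difference, using made-up data (the names genre_percentages, apps, keys_seen and rows_seen are only for illustration):

```python
# Made-up data for illustration
genre_percentages = {'Games': 75.0, 'Music': 25.0}   # a dictionary
apps = [['App A', 'Games'], ['App B', 'Music']]      # a list of lists

keys_seen = []
for genre in genre_percentages:
    keys_seen.append(genre)        # genre is a key, e.g. 'Games'

rows_seen = []
for app in apps:
    rows_seen.append(app)          # app is a whole row, e.g. ['App A', 'Games']

print(keys_seen)  # ['Games', 'Music']
print(rows_seen)  # [['App A', 'Games'], ['App B', 'Music']]

# To get the values of a dictionary as well, ask for them explicitly
for genre, pct in genre_percentages.items():
    print(genre, pct)
```

This is why genre and genre_app can be compared: both are plain strings, one taken from the dictionary's keys and one taken from a single position in a row.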
genres_ios = freq_table(ios_final, -5): here you are calculating the frequency (in percentage) of each genre in ios_final. It is freq_table that does the work. It computes, as a percentage, how often each genre occurs in ios_final, and it returns a dictionary with the genres as keys and the percentages as values.
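As far as the loop in this thread goes, only the keys of that dictionary are actually read, which matches the "unique genres" idea from the question. A minimal sketch, where the dictionary below is a hand-written stand-in for the function's return value and its numbers are made up:

```python
# Stand-in for what freq_table(ios_final, -5) might return; the numbers are made up
genres_ios = {'Games': 58.2, 'Entertainment': 7.9, 'Music': 2.0}

# Looping over the dictionary touches only the keys...
unique_genres = [genre for genre in genres_ios]
print(unique_genres)  # ['Games', 'Entertainment', 'Music']

# ...so the percentages are stored but never read inside such a loop
print(genres_ios['Games'])  # 58.2
```

In other words, the frequency table is being reused as a convenient list of every distinct genre.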
Thanks again, @monorienaghogho!!
Now I am trying to understand the sum up part.
if genre_app == genre:
    rating_value = float(row[5])
    total += rating_value
    len_genre += 1
I sum up all the rating_value values that I took from the ios_final dataset and put the sum into the total that I created. Did I get it correctly?
If genre_app == genre, take the rating value, which is at index 5 of the row, and convert it to a float. Update total by adding the rating value (total += rating_value), then increase len_genre by 1. All these updates only take place if genre_app == genre.
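Here is a minimal sketch of that accumulation for a single genre. The rows in toy_rows are made up, and the genre and rating count sit at indexes 0 and 1 instead of -5 and 5 only to keep the example short.

```python
# Made-up rows: index 0 is the genre, index 1 the number of ratings (as text, like CSV data)
toy_rows = [
    ['Games', '100'],
    ['Music', '40'],
    ['Games', '300'],
]

genre = 'Games'
total = 0        # a plain number, not a dictionary
len_genre = 0

for row in toy_rows:
    genre_app = row[0]
    if genre_app == genre:               # skipped for the 'Music' row
        rating_value = float(row[1])     # '100' -> 100.0
        total += rating_value
        len_genre += 1

avg_n_ratings = total / len_genre
print(total, len_genre, avg_n_ratings)  # 400.0 2 200.0
```

Only the two 'Games' rows reach the three indented lines, so total ends up as the sum of their rating counts and len_genre as the number of 'Games' apps.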