Python For Loops Screen 5/11

Screen Link:

My Code:

rating_sum = 0
for row in app_data_set:
    rating = row[-1]
    rating_sum = rating_sum + rating
avg_rating = rating_sum / rating

What I expected to happen:
The correct answer of 4.3

What actually happened:

Variable avg_rating is larger than we expected. It should have value 4.3 but has value 4.777777777777778 instead.

I looked at the answer for the problem and it said I should have used

avg_rating = rating_sum / len(app_data_set)
instead of
avg_rating = rating_sum / rating

Why would I use len(app_data_set) instead of rating and how should I have known to use len?

Thanks for taking a look at my question


Hi @sean.m.silvamiramon

It looks like you are trying to compute the average of the rating column in app_data_set. To calculate an average:

  1. Calculate the sum (like you did with rating_sum)
  2. Count the number of ratings (you can think of this as counting how many times you added a rating to rating_sum)
  • This is where len(app_data_set) comes in handy. It returns the number of rows in app_data_set, and since each row contributes exactly one rating, that is also the number of ratings.
  • Read more about calculating averages here
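
The two steps above can be sketched as follows. This is only a minimal illustration, assuming a tiny made-up data set whose last column already holds numeric ratings; the app names, the values, and the n_ratings counter are hypothetical and not taken from the course data.

```python
# Hypothetical sample: each row ends with a numeric rating.
app_data_set = [
    ['App A', 3.5],
    ['App B', 4.5],
    ['App C', 4.0],
]

rating_sum = 0
n_ratings = 0  # counts how many ratings were added to rating_sum
for row in app_data_set:
    rating = row[-1]
    rating_sum = rating_sum + rating
    n_ratings = n_ratings + 1

# One rating is added per row, so the manual count equals the number of rows.
print(n_ratings == len(app_data_set))

avg_rating = rating_sum / len(app_data_set)
print(avg_rating)
```

Counting by hand and calling len(app_data_set) give the same number here, which is why len is the shorter way to get the divisor.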

Here is an additional example.

Pretend you are looping through the sample from app_data_set below:

# The rating is the 8th value in each row (index 7): '3.5' and '4.5'
app_data_set = [
    ['284882215', 'Facebook', '389879808', 'USD', '0.0', '2974676', '212', '3.5', '3.5', '95.0', '4+', 'Social Networking', '37', '1', '29', '1'],
    ['389801252', 'Instagram', '113954816', 'USD', '0.0', '2161558', '1289', '4.5', '4.0', '10.23', '12+', 'Photo & Video', '37', '0', '29', '1']
]

Now let's loop through the data set.

On the first iteration of the loop:

rating_sum = 0
for row in app_data_set:
    # In the 1st row of the sample above, the rating is '3.5'.
    # It is stored as a string in this sample, so float() converts it to a number.
    rating = float(row[7])
    rating_sum = rating_sum + rating

Now rating_sum is 3.5.
Let's go to the second iteration of the same loop:

# rating_sum is now 3.5 because you added 3.5 to it in the first iteration

for row in app_data_set:
    # In the 2nd row of the sample, the rating is '4.5'.
    rating = float(row[7])
    rating_sum = rating_sum + rating

Now rating_sum is 3.5 + 4.5 = 8.0

Now if you want to compute the average, you would take the rating_sum of 8.0 and divide it by
the number of rating entries which is 2 in our example.
rating_sum is 8.0
number of ratings counted is 2

avg_rating = rating_sum / len(app_data_set)

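So you need len(app_data_set) because it counts the number of rows, and therefore the number of ratings, in app_data_set for you. The variable rating, by contrast, only holds the rating of the last row the loop visited, so dividing by it does not give an average.

In this example I only walked through two iterations, but app_data_set has thousands of rows, so the for loop keeps going until it has gone through the entire app_data_set to calculate rating_sum.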
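
To see why dividing by rating gives the wrong answer, here is a small sketch that reuses the two sample ratings from the example (3.5 and 4.5). Storing them as plain numbers in one-column rows is a simplifying assumption, and wrong_avg is a hypothetical name used only for comparison.

```python
# The two sample ratings, simplified to one numeric column per row (an assumption).
app_data_set = [[3.5], [4.5]]

rating_sum = 0
for row in app_data_set:
    rating = row[-1]
    rating_sum = rating_sum + rating

# After the loop, rating still holds only the last row's rating, not a count.
print(rating)

wrong_avg = rating_sum / rating               # divides by the last rating (4.5)
avg_rating = rating_sum / len(app_data_set)   # divides by the number of rows (2)
print(wrong_avg, avg_rating)
```

The two results differ because rating is a single data value, while len(app_data_set) is the number of values that went into rating_sum.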
