Lists and For Loops exercise 6

Hello Everyone,
I hope you are well. I am a new member, and I am hoping someone can help me make sense of the answer below. I am currently learning about Lists and For Loops. I understood the previous lectures; however, since there are so many rows, I am confused about how to retrieve data points located in different rows. All the rows are stored in app_data_set, so it seems there is no way for me to select row by row, and after seeing the answer I feel even more confused. How does Python know which row and index I am referring to?

instructions:

  1. In the code editor, we’ve already stored the five rows as lists in separate variables. Group together the five lists in a list of lists. Assign the resulting list of lists to a variable named app_data_set .
  2. Compute the average rating of the apps by retrieving the right data points from the app_data_set list of lists.
  • The rating is the last element of each row. You’ll need to sum up the ratings and then divide by the number of ratings.
  • Assign the result to a variable named avg_rating .

answer:

row_1 = ['Facebook', 0.0, 'USD', 2974676, 3.5]
row_2 = ['Instagram', 0.0, 'USD', 2161558, 4.5]
row_3 = ['Clash of Clans', 0.0, 'USD', 2130805, 4.5]
row_4 = ['Temple Run', 0.0, 'USD', 1724546, 4.5]
row_5 = ['Pandora - Music & Radio', 0.0, 'USD', 1126879, 4.0]
app_data_set = [row_1, row_2, row_3, row_4, row_5]
avg_rating = (app_data_set[0][-1] + app_data_set[1][-1] +
app_data_set[2][-1] + app_data_set[3][-1] +
app_data_set[4][-1]) / 5

How does Python know which row and index I am referring to?

I appreciate your time and I look forward to seeing your comments.
Best,
Diana

@puerta.d Hello Diana!!! Welcome to the community!! :wave: Look at that! I am around. Shall we work through this together? I have a quick question. When you say: how does python know which row and index I am referring to, which line(s) of the code (the solution you pasted) above are you referring to? The avg_rating line?


Hello Yemi,
Thank you for replying. Exactly!
For instance, in app_data_set[0][-1] I understand that [0] is the Facebook row and [-1] is 3.5. But for the rest of the code, since the five rows are grouped together in the list called app_data_set, how can I know which row and index I am referring to? Now that all the rows are in one list, do the index positions change?

avg_rating = (app_data_set[0][-1] + app_data_set[1][-1] +
app_data_set[2][-1] + app_data_set[3][-1] +
app_data_set[4][-1]) / 5


So, I am going to copy and paste the code here and we will break down everything happening. Ready? This is going to be fun! :grinning:

row_1 = ['Facebook', 0.0, 'USD', 2974676, 3.5]
row_2 = ['Instagram', 0.0, 'USD', 2161558, 4.5]
row_3 = ['Clash of Clans', 0.0, 'USD', 2130805, 4.5]
row_4 = ['Temple Run', 0.0, 'USD', 1724546, 4.5]
row_5 = ['Pandora - Music & Radio', 0.0, 'USD', 1126879, 4.0]

app_data_set = [row_1, row_2, row_3, row_4, row_5]

avg_rating = (app_data_set[0][-1] + app_data_set[1][-1] + app_data_set[2][-1] + app_data_set[3][-1] + app_data_set[4][-1]) / 5

Step 1: Simple lists

Okay. So we defined 5 rows. Each of those rows is a list and contains 5 elements (we know that because we count them and we get 5 elements in each list).
Now, if I want to retrieve an item from any of those lists, I index the list. Remember that Python list indexes start at 0, meaning the first item is at index 0, the second at index 1, and so on. So:

  • To retrieve the name of the app from row_1:
    The name of the app is the first item in the list. We will do: row_1[0]
  • To retrieve the price of the app from row_3:
    The price of the app is the second item in the list. So: row_3[1]
  • To retrieve the rating of the app from row_5:
    The rating is the last item of the list. So: row_5[4]. But because it is the last item, we can also use negative indexing. Negative indexes count backwards from the end of the list and start at -1 rather than 0, so -1 is the last item, -2 the second to last, and so on. We then end up doing row_5[-1] to get the last item of the row (the rating).

So, with our rows alone that’s how we will retrieve items. So far so good?
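Here is a minimal sketch of Step 1 that you can run yourself. The rows are copied from the exercise; the variable names on the left (name_1, price_3, and so on) are just made up for illustration.

```python
# Three of the rows from the exercise
row_1 = ['Facebook', 0.0, 'USD', 2974676, 3.5]
row_3 = ['Clash of Clans', 0.0, 'USD', 2130805, 4.5]
row_5 = ['Pandora - Music & Radio', 0.0, 'USD', 1126879, 4.0]

name_1 = row_1[0]         # first item: the app name
price_3 = row_3[1]        # second item: the price
rating_5 = row_5[4]       # fifth item: the rating, using a positive index
rating_5_neg = row_5[-1]  # the same item, counted from the end

print(name_1)
print(price_3)
print(rating_5, rating_5_neg)
```

Both row_5[4] and row_5[-1] point at the same element, so the last line prints the same rating twice.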

Step 2: List of lists

Now, we put all those 5 rows in another list to make the app dataset, and we end up with:

app_data_set = [row_1, row_2, row_3, row_4, row_5]

If I want to retrieve a row from app_data_set I will do what we are now so good at doing! app_data_set[0] will return row_1, app_data_set[1] will return row_2 etc…

Suppose I want to get the rating of row_4 using app_data_set. In app_data_set, row_4 is the fourth element, so its index is 3. And from the previous step we already know that, within each row, the rating is the last element, at index 4 (or -1). So, I could do:

  1. Get the row from app_data_set
    row_4 = app_data_set[3]
  2. Retrieve the rating from the row
    rating = row_4[4] or rating = row_4[-1]

But I can also do all this in a single line: rating = app_data_set[3][4] or rating = app_data_set[3][-1]

Whether I do it in two lines or one I will get the same results. How are we doing? Good?
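As a quick check of Step 2, here is a small sketch comparing the two-line and one-line versions for the rating of row_4. The names fourth_row, rating_two_steps and rating_one_step are only for illustration.

```python
row_1 = ['Facebook', 0.0, 'USD', 2974676, 3.5]
row_2 = ['Instagram', 0.0, 'USD', 2161558, 4.5]
row_3 = ['Clash of Clans', 0.0, 'USD', 2130805, 4.5]
row_4 = ['Temple Run', 0.0, 'USD', 1724546, 4.5]
row_5 = ['Pandora - Music & Radio', 0.0, 'USD', 1126879, 4.0]

app_data_set = [row_1, row_2, row_3, row_4, row_5]

# Two steps: first pick the row, then pick the item inside that row
fourth_row = app_data_set[3]
rating_two_steps = fourth_row[-1]

# One step: the first [] picks the row, the second [] picks the item
rating_one_step = app_data_set[3][-1]

print(rating_two_steps, rating_one_step)

# The outer list holds the very same row objects, so the indexes
# inside each row do not change when the rows are grouped together
print(app_data_set[3] is row_4)
```

The first pair of brackets always applies to app_data_set (which row), and the second pair applies to whatever row came back (which item in that row).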

Practice this part by getting, using app_data_set, the name of row_1, the price of row_5 and the number of ratings from row_2. Feel free to ask me to post the answers if you want to check them against yours.

Step 3: Calculating the average rating

If we want to calculate the average rating we have to:

  1. Get all the rows
  2. For each row get the rating
  3. Add up the ratings and divide the sum by the total number of rows (we already know we have 5 rows)

To get each row and the rating of each row we could do it in 2 steps (get the row and then use the row to get the rating) but from Step 2 we know how to do it in a single step.

So to get the rating of row_1 located at index 0 in app_data_set we will do: app_data_set[0][-1], for the rating of row_2 located at index 1 in app_data_set we will do: app_data_set[1][-1], for the rating of row_3 located at index 2 in app_data_set we will do: app_data_set[2][-1] etc…

And we will just add all those ratings and divide them by 5. We get this line of code:

avg_rating = (app_data_set[0][-1] + app_data_set[1][-1] + app_data_set[2][-1] + app_data_set[3][-1] + app_data_set[4][-1]) / 5

We could also go a bit more slowly if we wished by getting the sum first and dividing next:

rating_sum = app_data_set[0][-1] + app_data_set[1][-1] + app_data_set[2][-1] + app_data_set[3][-1] + app_data_set[4][-1]

avg_rating = rating_sum / 5
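Since this lesson is about lists and for loops, the same average could also be computed with a loop instead of writing out every index by hand. This is only a sketch of an alternative, not the answer the exercise expects, and using len(app_data_set) in place of the hard-coded 5 is my own choice.

```python
row_1 = ['Facebook', 0.0, 'USD', 2974676, 3.5]
row_2 = ['Instagram', 0.0, 'USD', 2161558, 4.5]
row_3 = ['Clash of Clans', 0.0, 'USD', 2130805, 4.5]
row_4 = ['Temple Run', 0.0, 'USD', 1724546, 4.5]
row_5 = ['Pandora - Music & Radio', 0.0, 'USD', 1126879, 4.0]

app_data_set = [row_1, row_2, row_3, row_4, row_5]

rating_sum = 0
for row in app_data_set:
    # On each pass, row is one of the five lists; row[-1] is its rating
    rating_sum += row[-1]

# Divide by the number of rows rather than a hard-coded 5
avg_rating = rating_sum / len(app_data_set)
print(avg_rating)
```

Here the loop plays the role of the first index (it hands you each row in turn), and row[-1] plays the role of the second index.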

So, does all that make more sense? Let me know if something is confusing.


Thank you so much for taking the time to explain this to me! Yes, it makes more sense now!
I appreciate it!

Best regards,
Diana


@Yemi You are the BEST ; )


Hahaha… That’s nice of you. Thanks. :pray: I am very happy you are satisfied with the answer.

Hello,

I am getting an error on exercise 6 that avg_rating is greater than expected.

The code is the same as the answer given.

Can you help with this error?

Thank you


Hi Christopher. The error is an order-of-operations error. Since there aren’t any parentheses around the additions in your expression, Python does the division first (app_data_set[4][-1] / 5) and then the additions. To get Python to add everything first and then divide, wrap the whole sum in parentheses. Easy fix!
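To see the difference, here is a small sketch that computes the value both ways. It assumes the failing expression was the answer's line with the parentheses left out; the names without_parens and with_parens are only for illustration.

```python
row_1 = ['Facebook', 0.0, 'USD', 2974676, 3.5]
row_2 = ['Instagram', 0.0, 'USD', 2161558, 4.5]
row_3 = ['Clash of Clans', 0.0, 'USD', 2130805, 4.5]
row_4 = ['Temple Run', 0.0, 'USD', 1724546, 4.5]
row_5 = ['Pandora - Music & Radio', 0.0, 'USD', 1126879, 4.0]

app_data_set = [row_1, row_2, row_3, row_4, row_5]

# No parentheses: only the last rating is divided by 5,
# then the other four ratings are added on top
without_parens = (app_data_set[0][-1] + app_data_set[1][-1] +
                  app_data_set[2][-1] + app_data_set[3][-1] +
                  app_data_set[4][-1] / 5)

# Parentheses around the sum: add all five ratings, then divide
with_parens = (app_data_set[0][-1] + app_data_set[1][-1] +
               app_data_set[2][-1] + app_data_set[3][-1] +
               app_data_set[4][-1]) / 5

print(without_parens)  # roughly 17.8, far too large for a rating out of 5
print(with_parens)     # roughly 4.2
```

The outer parentheses in without_parens are only there so the expression can span several lines; they do not change the order in which Python evaluates the division and the additions.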


Please explain why we include [2] in the average

https://app.dataquest.io/m/312/lists-and-for-loops/4/retrieving-multiple-list-elements

Thanks

@maroof_adeoye

avg_rating = (fb_rating_data[2] + insta_rating_data[2] + pandora_rating_data[2]) / 3

The lists fb_rating_data, insta_rating_data, and pandora_rating_data each have their respective rating value stored at index position 2. To access the rating, we therefore have to index each of these lists at position 2.

To illustrate why, let’s look closer at how one of these lists was created to begin with:

row_1 = ['Facebook', 0.0, 'USD', 2974676, 3.5]
fb_rating_data = [row_1[0], row_1[-2], row_1[-1]] 

The list fb_rating_data is made by indexing the elements at indexes 0, -2, and -1 from the row_1 list.

fb_rating_data is therefore actually:
['Facebook', 2974676, 3.5]

The 3.5 figure is the rating from fb_rating_data that we’re trying to retrieve.

Thanks blueberrypudding85
Are you saying that the rating value in each list is in ‘USD’, and because the index for USD is 2, we therefore use index 2? This is how I understand it as a beginner in data science. Please confirm my understanding.

Also because we are talking about average, it means it has to have value to it!

Hi,
I am also studying this part. As I understand it, app_data_set[0] in the avg_rating line refers to row_1 inside app_data_set, and the [-1] that follows refers to the last element of row_1. You can read the other rows and data points in the same way.
Hope these can help you.

No, ‘USD’ has nothing to do with it. ‘USD’ is just an element of the list row_1 that didn’t make it into the list fb_rating_data. fb_rating_data is a second list that was made by handpicking only 3 elements from the original row_1 list.

The list fb_rating_data was made by indexing row_1 such that fb_rating_data now only has the following elements:
fb_rating_data = ['Facebook', 2974676, 3.5]

‘Facebook’, 2974676, and 3.5 are in index positions 0, 1, and 2 respectively. To retrieve the string 'Facebook', you’d do fb_rating_data[0]. To retrieve the value 3.5 (which is the rating), you’d do fb_rating_data[2].
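A short sketch may make this concrete. It rebuilds fb_rating_data exactly as shown above and then compares index 2 in the two lists; the names rating and currency are only for illustration.

```python
row_1 = ['Facebook', 0.0, 'USD', 2974676, 3.5]

# Pick the name, the number of ratings and the rating out of row_1
fb_rating_data = [row_1[0], row_1[-2], row_1[-1]]
print(fb_rating_data)

# Index 2 of the new, shorter list is the rating
rating = fb_rating_data[2]

# Index 2 of the original row is the currency, a different element
currency = row_1[2]

print(rating)
print(currency)
```

The index 2 only has meaning relative to the list it is applied to: in the 3-element fb_rating_data it lands on the rating, while in the 5-element row_1 it lands on 'USD'.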


avg_rating = (app_data_set[0][-1] + app_data_set[1][-1] + app_data_set[2][-1] + app_data_set[3][-1] + app_data_set[4][-1]) /5