Dictionaries - lists of lists, indexing, proportions and percentages

Screen Link:

https://app.dataquest.io/m/314/dictionaries-and-frequency-tables/9/proportions-and-percentages
My Code:

opened_file = open('AppleStore.csv')
from csv import reader
read_file = reader(opened_file)
apps_data = list(read_file)

genre_counting = {}

for item in apps_data:
    # Header row?
    item[:1]
    genre = item[11]
    if genre in genre_counting:
        genre_counting[genre] += 1
    else:
        genre_counting[genre] = 1
        
print(genre_counting)

What I expected to happen:


Something is off with the dictionary. I am supposed to loop through apps_data excluding the header row. Not knowing exactly what table this comes from, I'm guessing that would be apps_data[1:].

I am having a hard time redoing these lessons as I cannot seem to visualize any table in my mind. Where did the list of lists come from at the start of the exercise? I recognize that we turned read_file into a list with list(read_file).


Hi Hunter. I hope the following is helpful to aid in visualizing the tables. Let’s say we have the following data:
id   item     price   Qty
1    cheese   3.50    1
2    banana   0.70    2
3    apple    0.85    4

What list() does is take the information and turn it into a list of lists that we can use. Every row of data becomes a list within it:

data = [['id', 'item', 'price', 'Qty'], ['1', 'cheese', '3.50', '1'], ['2', 'banana', '0.70', '2'], ['3', 'apple', '0.85', '4']]
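
If it helps to watch that conversion happen, here is a minimal sketch using the sample data above. To keep it self-contained I'm faking the file with io.StringIO (my own stand-in, not something from the lesson), which behaves like the object that open() returns for a CSV file with this content.

```python
import io
from csv import reader

# Stand-in for open('some_file.csv'): the same text a CSV file would contain.
opened_file = io.StringIO(
    "id,item,price,Qty\n"
    "1,cheese,3.50,1\n"
    "2,banana,0.70,2\n"
    "3,apple,0.85,4\n"
)

read_file = reader(opened_file)  # a reader object, not the rows themselves
data = list(read_file)           # walks through the reader and collects every row

print(data[0])    # ['id', 'item', 'price', 'Qty']  (the header row)
print(data[1])    # ['1', 'cheese', '3.50', '1']    (the first data row)
print(len(data))  # 4  (header plus three data rows)
```

Printing read_file on its own only shows something like a reader object at a memory address. The rows become visible once list() has gone through them, and each line of the file ends up as one inner list.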

In the loop, we are going to go through each of the indices of data, which means we’re going to focus on just one row at a time. For every iteration, row is one of the lists that represents a row of the data. data[0] isn’t going to be helpful because it’s a header row, so we will want to skip it. We can skip it by looping through data[1:]. Within the loop, we are able to access any of the column data we want in that row, which could look like this:

for row in data[1:]:    # every row after the header, data[0]
    item = row[1]    # the item name is at index 1 of every row
    price = row[2]   # the price is at index 2 of every row
    qty = row[3]     # the quantity is at index 3 of every row

In the code snippet from the post, the error comes from the attempt to get rid of the header row. Because item is one row of apps_data, item[:1] slices the columns within that row (it returns just the first column) rather than skipping a row. (And in this case it had no effect at all, because the result wasn’t saved, as in item = item[:1].) To skip the header row, you instead want the loop to ignore it by iterating over apps_data[1:]. (The reason it’s [1:] and not [:1] is that the second version means “all elements up to, but not including, index 1”, which is only the first element.)
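
Putting that together, a corrected loop could look roughly like the sketch below. The rows are made up so the example runs on its own, and the genre sits at index 1 here rather than index 11 as in the original post. With the real file, only the data source and that index should need to change.

```python
# Made-up stand-in for apps_data: a header row followed by data rows.
apps_data = [
    ['name', 'genre'],
    ['App A', 'Games'],
    ['App B', 'Music'],
    ['App C', 'Games'],
]

print(apps_data[:1])  # [['name', 'genre']]  (only the header row)
print(apps_data[1:])  # every row after the header

genre_counting = {}
for item in apps_data[1:]:  # skip the header by slicing the outer list
    genre = item[1]         # item[11] in the original post's data
    if genre in genre_counting:
        genre_counting[genre] += 1
    else:
        genre_counting[genre] = 1

print(genre_counting)  # {'Games': 2, 'Music': 1}
```

Without the [1:] slice, the header's own label ('genre' in this made-up data) would be counted once as if it were a genre.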

I hope any of that helps at all (and wasn’t too long-winded! :wink: )

I do not understand the issue with calling list on read_file and that making a list of lists. You are indexing by row, I was indexing by item. Can you point me to the location where I learned that everything is separated by row rather than by column with the list function? Some of the confusion may have sprouted from an abbreviated view of the table.

This was the first lesson where we read in a csv file to a list of lists that I think you’re looking for: https://app.dataquest.io/m/312/lists-and-for-loops/7/opening-a-file

opened_file = open('AppleStore.csv')

from csv import reader
read_file = reader(opened_file)
print(read_file)

When I print read_file, is the output a pointer? There is not a visual table for me to call list on.