Request FEEDBACK on exercise 4 of module 'Python Data Analysis Basics Practice Problems'

Hello there! I just finished the exercise at this link:
https://app.dataquest.io/m/1000331/python-data-analysis-basics-practice-problems/4/analyzing-game-sales-3

  1. Read the game_sales.csv file and assign the rows (without the header) to a variable named games.
  2. Create a dictionary most_sold_per_year where each key is a year and each value is the name of the game with the most global sales in that year.
  3. Assign to a variable named most_sold_1991 the game with the most sales in 1991.

Here's my code. I just wanted to know whether it could be better or whether it's already fine :slight_smile:

import csv

def read_file(file):
    # Use a context manager so the file is closed after reading
    with open(file) as f:
        rows = list(csv.reader(f))
    return rows


def compute_most_soldx_year(games):
    most_sold_per_year={}
    for g in games:
        if g[2] not in most_sold_per_year:
            most_sold_per_year[g[2]]=[[g[0],float(g[-1])]]
        else:
            most_sold_per_year[g[2]].append([g[0],float(g[-1])])
 
    for i in most_sold_per_year:
        most_sold_per_year[i].sort(key=lambda x: x[1], reverse=True)
        del most_sold_per_year[i][1:]
    most_sold_per_year={k:v[0][0] for k,v in most_sold_per_year.items()}
    return most_sold_per_year

games=read_file('game_sales.csv')[1:]
most_sold_per_year=compute_most_soldx_year(games)
most_sold_1991=most_sold_per_year['1991']

Thanks in advance!

Hello @NicoGuglielmo,

That looks good! Well done!

@NicoGuglielmo: recategorized your topic.

I like the way you thought about this problem. To me, this feels more concise than the provided DQ solution, which in my eyes tries too hard to avoid built-in methods. A minor comment: you don't need this line of code:

>  del most_sold_per_year[i][1:]

because with your dictionary comprehension you are already pulling out the name of the game with the most sales in the respective year.

> most_sold_per_year={k:v[0][0] for k,v in most_sold_per_year.items()}
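To check this, here is a minimal sketch on made-up data (the game names, sales figures, and the helper `top_names` are hypothetical, not from the exercise). It runs the sort-then-comprehension step once with the `del` line and once without, and compares the results:

```python
def top_names(per_year, trim):
    """Sort each year's [name, sales] pairs by sales (descending)
    and return a dict of year -> name of the top pair."""
    # Work on a copy so both runs start from the same unsorted data
    per_year = {year: list(pairs) for year, pairs in per_year.items()}
    for year in per_year:
        per_year[year].sort(key=lambda x: x[1], reverse=True)
        if trim:
            del per_year[year][1:]  # the line in question
    # v[0] is the top pair after sorting, v[0][0] is its name
    return {k: v[0][0] for k, v in per_year.items()}

# Hypothetical sample data, for illustration only
sample = {
    '1991': [['Game A', 1.5], ['Game B', 4.2], ['Game C', 0.3]],
    '1992': [['Game D', 2.0], ['Game E', 1.1]],
}

with_del = top_names(sample, trim=True)
without_del = top_names(sample, trim=False)
print(with_del == without_del)  # True
```

Both runs give the same dictionary, because the comprehension only ever reads the first element of each sorted list.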

My solution below…

import csv

# Read all rows, then drop the header
with open('game_sales.csv') as f:
    data = list(csv.reader(f))
games = data[1:]

# Create dictionary with year as key and a list of (name, sales) tuples as value
most_sold_per_year = {}
for row in games:
    year = row[2]
    if year not in most_sold_per_year:
        most_sold_per_year[year] = [(row[0], row[-1])]
    else:
        most_sold_per_year[year].append((row[0], row[-1]))

# Sort values in dictionary for each year and keep only game with most sales
for y in most_sold_per_year:
    most_sold_per_year[y].sort(reverse=True, key=lambda x: float(x[1]))
    most_sold_per_year[y] = most_sold_per_year[y][0][0]

most_sold_1991 = most_sold_per_year['1991']
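
Since only the top game per year is needed, one possible variation (not from the thread, just a sketch) is to swap the full sort for the built-in `max()` with a `key`. The function name `most_sold_by_year` and the sample rows are hypothetical, and the sketch assumes the same column layout as above: name at index 0, year at index 2, global sales in the last column.

```python
from collections import defaultdict

def most_sold_by_year(rows):
    """Return a dict of year -> name of the game with the highest sales."""
    by_year = defaultdict(list)
    for row in rows:
        # Group (name, sales) pairs by year, converting sales to float
        by_year[row[2]].append((row[0], float(row[-1])))
    # max() picks the pair with the largest sales; [0] takes its name
    return {year: max(pairs, key=lambda p: p[1])[0]
            for year, pairs in by_year.items()}

# Hypothetical rows, for illustration only
rows = [
    ['Game A', 'X', '1991', '1.5'],
    ['Game B', 'X', '1991', '4.2'],
    ['Game D', 'Y', '1992', '2.0'],
]
result = most_sold_by_year(rows)
print(result['1991'])  # Game B
```

On ties, `max()` returns the first maximal pair it sees, which matches what a stable descending sort followed by `[0]` would pick.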

Thanks for the feedback!
I truly don't need the ‘del’ line, nice!

This is how I did it:

import csv
with open("game_sales.csv") as f:
    games = list(csv.reader(f))[1:]  # skip the header row

years = {}

for row in games:
    year = row[2]
    years[year] = 0

most_sold_per_year = {}

for year in years:
    for row in games:
        if row[2] == year and float(row[9]) > years[year]:
            years[year] = float(row[9])
            most_sold_per_year[year] = row[0]
            
most_sold_1991 = most_sold_per_year['1991'] 

Your solution is very cool. But I have a question: why do we use the [0][0] index on most_sold_per_year?
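
A small sketch of what the double index does, assuming (as in the solutions above) that each year's value is a list of name/sales pairs already sorted by sales in descending order. The values below are made up for illustration:

```python
# Hypothetical value for one year, already sorted by sales (highest first)
v = [['Game B', 4.2], ['Game A', 1.5], ['Game C', 0.3]]

first_pair = v[0]          # the pair with the highest sales
top_name = first_pair[0]   # the name inside that pair

print(first_pair)          # ['Game B', 4.2]
print(top_name)            # Game B
print(v[0][0] == top_name) # True
```

So the first `[0]` selects the best-selling pair for the year, and the second `[0]` selects the game's name from that pair.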

For some reason this problem really stumped me and the DQ solution was really hard to follow since they used functions that seemed really disjointed. But this approach made a lot of sense and helped me work through it. Thanks for posting!