Removing duplicate entries

Can somebody explain this part of Guided Project 1 to me? I am very lost right here

reviews_max = {}

for app in android:
    name = app[0]
    n_reviews = float(app[3])

    if name in reviews_max and reviews_max[name] < n_reviews:
        reviews_max[name] = n_reviews

    elif name not in reviews_max:
        reviews_max[name] = n_reviews

Hi @MHnt1026e3a37

Let’s go through the part you are stuck on:

  • First, to make things clear, ask yourself why we create a dictionary. The answer is that some apps appear more than once in the data set, each time with a different number of reviews. For any given app, we only want to keep the row with the highest number of reviews and remove the other entries. The dictionary records, for each app name, the highest number of reviews seen so far.
  • Now let’s go through the code:

reviews_max = {}  # this is the dictionary you create

for app in android:
    name = app[0]  # here you extract the column with the app name
    n_reviews = float(app[3])  # here you extract the column with the number of reviews
                               # and convert it to float

    if name in reviews_max and reviews_max[name] < n_reviews:
        reviews_max[name] = n_reviews
  • This part means: if the app name (the key) is already in the dictionary and reviews_max[name] (the value stored for that app) is less than n_reviews, then the current row has more reviews than any earlier row for the same app. In that case, we update the dictionary value to n_reviews, so the dictionary always holds the highest review count for each app.

    elif name not in reviews_max:
        reviews_max[name] = n_reviews
    
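  • This part means: if the app name is not in the dictionary yet, create a new key-value pair in the reviews_max dictionary, where the key is the app name and the value is n_reviews. If neither condition is true (the name is already there with an equal or higher count), nothing changes.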
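To see both branches in action, here is a small sketch you can run on its own. The sample rows are made up for illustration; the only things they share with the real data set are the app name at index 0 and the review count at index 3. The second loop, with `android_clean` and `already_added`, is an assumption on my side: it is one possible way to use the dictionary to drop the duplicate rows, not part of the snippet you posted.

```python
# Hypothetical sample rows: [app name, category, rating, number of reviews]
android = [
    ["Instagram", "SOCIAL", "4.5", "66577313"],
    ["Notes", "TOOLS", "4.1", "1200"],
    ["Instagram", "SOCIAL", "4.5", "66577446"],
    ["Instagram", "SOCIAL", "4.5", "66509917"],
]

reviews_max = {}

for app in android:
    name = app[0]
    n_reviews = float(app[3])

    if name in reviews_max and reviews_max[name] < n_reviews:
        # Seen before, and this row has more reviews: update the stored maximum
        reviews_max[name] = n_reviews
    elif name not in reviews_max:
        # First time we see this app: store its review count
        reviews_max[name] = n_reviews

print(reviews_max)
# {'Instagram': 66577446.0, 'Notes': 1200.0}

# Assumed follow-up step: keep only the row that matches the maximum.
android_clean = []
already_added = []  # guards against two rows sharing the same maximum

for app in android:
    name = app[0]
    n_reviews = float(app[3])

    if reviews_max[name] == n_reviews and name not in already_added:
        android_clean.append(app)
        already_added.append(name)

print(len(android_clean))
# 2
```

Notice that the fourth row (66509917 reviews) triggers neither branch: the name is already in the dictionary and its count is lower than the stored one, so the dictionary is left untouched.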

I hope this was helpful.