Which is the final cleaned dataset? There are 3 datasets that I can see

Screen Link:
App Profile Recommendation | Dataquest

My Code:

This is the first piece of code. I think I understand it: the idea is to create a dictionary that stores, for each app name, the highest number of reviews. What I don't understand is the second block of code.

reviews_max = {}

for app in android:
    name = app[0]
    n_reviews = float(app[3])
    
    if name in reviews_max and reviews_max[name] < n_reviews:
        reviews_max[name] = n_reviews
        
    elif name not in reviews_max:
        reviews_max[name] = n_reviews

My questions about the second block (shown further below):

  1. Why do we need this part? Isn’t the dictionary above good enough, since it has one entry per app name with the highest number of reviews?

  2. Why are we using the dictionary to clean the original dataset? What is in the original dataset that is not already in the dictionary? I think that’s the key to the question. Or are we saying that the dictionary (reviews_max) has duplicates? But how is that possible, since the first piece of code will not add a new entry for the same app, or will it?

  3. Finally, are we saying that android_clean is the final dataset we are targeting?

android_clean = []
already_added = []

for app in android:
    name = app[0]
    n_reviews = float(app[3])

    if (reviews_max[name] == n_reviews) and (name not in already_added):
        android_clean.append(app)
        already_added.append(name)

What I expected to happen:

I expected the first piece of code alone to do the work.

What actually happened:

The lesson uses two pieces of code.


I understand it a bit more now. So from the original dataset (which is a list), we have created a dictionary, which is now being used to scrub the original dataset to create a new list (and therefore a new database). My understanding is correct, right?
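
To check that understanding, here is a minimal sketch of the two steps on made-up data. The rows in `android` below are hypothetical (the real dataset has more columns); only the positions `app[0]` for the name and `app[3]` for the review count follow the code above.

```python
# Hypothetical toy rows: [name, category, rating, n_reviews]
android = [
    ["Instagram", "SOCIAL", "4.5", "100"],
    ["Instagram", "SOCIAL", "4.5", "250"],
    ["Instagram", "SOCIAL", "4.5", "250"],  # exact duplicate of the row above
    ["Maps", "TRAVEL", "4.3", "80"],
]

# Step 1: for each app name, remember the highest review count seen.
reviews_max = {}
for app in android:
    name = app[0]
    n_reviews = float(app[3])
    if name not in reviews_max or reviews_max[name] < n_reviews:
        reviews_max[name] = n_reviews

# Step 2: walk the original list again and keep the full row whose
# review count matches the maximum, at most once per app name.
android_clean = []
already_added = []
for app in android:
    name = app[0]
    n_reviews = float(app[3])
    if reviews_max[name] == n_reviews and name not in already_added:
        android_clean.append(app)
        already_added.append(name)

print(reviews_max)         # one number per name, no other columns
print(len(android_clean))  # one full row per name
```

If I read this right, `reviews_max` only holds a name and a number for each app, while `android_clean` holds the complete rows, one per app. The `already_added` check seems to be what stops the two identical 250-review rows from both being kept.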

A small query remains: why do we need the database in the form of a list? Isn’t the dictionary good enough to substitute for the original list? Or can’t a dictionary be considered a database, because it’s a programming concept and not an actual database? Is that the reason?
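
For what it's worth, the kind of dictionary I have in mind would map each name to its whole row rather than just to the review count, roughly like the sketch below. The toy rows and the name `best_row` are made up for illustration; this is not from the lesson.

```python
# Hypothetical toy rows: [name, category, rating, n_reviews]
android = [
    ["Instagram", "SOCIAL", "4.5", "100"],
    ["Instagram", "SOCIAL", "4.5", "250"],
    ["Maps", "TRAVEL", "4.3", "80"],
]

# Map each app name to the full row with the highest review count.
best_row = {}
for app in android:
    name = app[0]
    n_reviews = float(app[3])
    if name not in best_row or float(best_row[name][3]) < n_reviews:
        best_row[name] = app

# Turning the dictionary's values back into a list of rows.
android_clean = list(best_row.values())
print(android_clean)
```

Even in this version the result ends up converted back into a list of rows, so I am not sure whether the dictionary really replaces the list or is only a helper for building it.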