Hello,
This is my first time posting, so let me know if there are better ways for me to go about this. I started Dataquest with a little coding background (I went through Python Crash Course by Eric Matthes earlier this year).
I am on slide 5 of the first guided project, Removing Duplicate Entries: Part Two, and I am having a hard time putting into words what this code means and what it does. I cannot work it out just from looking at it and the surrounding context.
reviews_max = {}
for app in google:
    name = app[0]
    n_reviews = float(app[3])
    if name in reviews_max and reviews_max[name] < n_reviews:
        reviews_max[name] = n_reviews
    elif name not in reviews_max:
        reviews_max[name] = n_reviews
We are looping through the Google data set (without the header). We have created a dictionary (reviews_max) and pulled two values out of each row (name and n_reviews). Is the if statement checking whether name is already in reviews_max, and also whether the number of reviews stored in reviews_max is less than the number of reviews for that name in the current row of the Google data set? And if the name is not in reviews_max yet, then it is not a duplicate so far and should be added to reviews_max? (The condition reviews_max[name] < n_reviews is the part I need help with.)
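To make the question concrete, here is a small experiment on made-up rows. The app names, categories, and review counts below are hypothetical and not taken from the real data set; I am only assuming, as in the code above, that the name is at index 0 and the review count at index 3. It prints the dictionary after every row so the effect of reviews_max[name] < n_reviews can be seen step by step.

```python
# Hypothetical rows in the shape [name, category, rating, reviews].
# All values are made up for illustration.
google = [
    ["Instagram", "SOCIAL", "4.5", "100"],
    ["Instagram", "SOCIAL", "4.5", "250"],
    ["Instagram", "SOCIAL", "4.5", "180"],
    ["Notes", "TOOLS", "4.0", "30"],
]

reviews_max = {}
for app in google:
    name = app[0]
    n_reviews = float(app[3])
    if name in reviews_max and reviews_max[name] < n_reviews:
        # Name seen before, and this row has more reviews than the stored value.
        reviews_max[name] = n_reviews
    elif name not in reviews_max:
        # First time this name appears: store its review count.
        reviews_max[name] = n_reviews
    # Show the dictionary after each row.
    print(name, n_reviews, reviews_max)
```

If I trace this correctly, the stored value for "Instagram" goes 100.0, then 250.0, and the 180 row changes nothing because 250.0 < 180.0 is False.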
android_clean = []
already_added = []
for app in google:
    name = app[0]
    n_reviews = float(app[3])
    if (reviews_max[name] == n_reviews) and (name not in already_added):
        android_clean.append(app)
        already_added.append(name)
Then for this we created two lists. I understand, from the solutions, that there could be duplicates of an app that share the same highest number of reviews. I cannot figure out why you append app to android_clean and name to already_added.
Maybe I am overthinking this, but I am stumped. I am hoping someone can explain what reviews_max[name] < n_reviews means, as well as the two appends.
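A similar experiment for the two appends, again on made-up rows (hypothetical names and counts, with two rows tied at the highest review count). The reviews_max dictionary is written out by hand here as an assumption about what the first loop would produce for these rows, so that this snippet runs on its own.

```python
# Hypothetical rows in the shape [name, category, rating, reviews].
# Two "Instagram" rows tie at the highest review count on purpose.
google = [
    ["Instagram", "SOCIAL", "4.5", "250"],
    ["Instagram", "SOCIAL", "4.5", "250"],
    ["Instagram", "SOCIAL", "4.5", "100"],
    ["Notes", "TOOLS", "4.0", "30"],
]

# Assumed result of the first loop for the rows above (written by hand).
reviews_max = {"Instagram": 250.0, "Notes": 30.0}

android_clean = []   # whole rows that are kept
already_added = []   # names only, used to detect repeats
for app in google:
    name = app[0]
    n_reviews = float(app[3])
    if (reviews_max[name] == n_reviews) and (name not in already_added):
        android_clean.append(app)    # keep the full row
        already_added.append(name)   # remember that this name is taken

print(android_clean)
print(already_added)
```

From what I can see, android_clean ends up holding full rows while already_added holds only names, and the second 250 row is skipped because "Instagram" is already in already_added by then.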
Thank you so much,
Kevin