5. Removing Duplicate Entries: Part Two (Building the dictionary)

Hi,

Total beginner here!

I’m having trouble understanding the solution for building the reviews_max dictionary…I’m just not getting this concept in its entirety.

  • Why does reviews_max[name] = n_reviews for both if and elif statements?? How can name (a string) equal number of reviews (float/integer)…?
  • Why is n_reviews a float? Reviews are not in decimal points; they’re integers. Ratings are float.
  • How does this statement work – if name in reviews_max and reviews_max[name] < n_reviews
    How can name (a string) be less than number of reviews (float/integer)…?

Below is the solution…thanks in advance for your help!! Appreciate it.


reviews_max = {}

for app in android:
    name = app[0]
    n_reviews = float(app[3])
    
    if name in reviews_max and reviews_max[name] < n_reviews:
        reviews_max[name] = n_reviews
        
    elif name not in reviews_max:
        reviews_max[name] = n_reviews

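The line reviews_max[name] = n_reviews is not a comparison. It assigns a value to a dictionary key.

name is the key.
n_reviews is the value.
That is, n_reviews is stored in the dictionary under the key name, so reviews_max[name] < n_reviews compares two numbers, not a string and a number.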
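
To make the key/value idea concrete, here is a minimal sketch. The app name and the review counts are made up for illustration; they are not from the actual data set.

```python
# Hypothetical app name and review count, for illustration only.
reviews_max = {}
name = "Example App"
n_reviews = 120.0

# Assignment: store the number 120.0 under the key "Example App".
reviews_max[name] = n_reviews

# Lookup: reviews_max[name] gives back the stored number, not the string.
stored = reviews_max[name]

# So the comparison in the if statement is number < number.
is_smaller = reviews_max[name] < 150.0
```

Here reviews_max[name] is just another way of writing "the value currently stored for this app", which is why it can be compared with n_reviews.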

The idea behind the logic:

  1. Initialize the value for key name to n_reviews the first time that name is seen.
  2. Update the value for key name to n_reviews if the new value is greater than the value already in the dictionary.
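
The two steps can be traced on a tiny example. The rows below are invented and only assume the same layout as the solution code (name at index 0, review count as a string at index 3).

```python
# Hypothetical rows: index 0 is the app name, index 3 is the review count.
android = [
    ["App A", "", "", "100"],
    ["App B", "", "", "50"],
    ["App A", "", "", "250"],   # duplicate of App A with more reviews
    ["App A", "", "", "80"],    # duplicate of App A with fewer reviews
]

reviews_max = {}

for app in android:
    name = app[0]
    n_reviews = float(app[3])

    if name not in reviews_max:
        # Step 1: first time we see this name, store its count.
        reviews_max[name] = n_reviews
    elif reviews_max[name] < n_reviews:
        # Step 2: seen before, but this row has more reviews, so overwrite.
        reviews_max[name] = n_reviews

print(reviews_max)
```

After the loop, each name maps to the largest review count found among its duplicates; the row with 80 reviews never overwrites the stored 250.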

The code is ugly.
Why? We do not need to check on every iteration whether an initial value has already been entered.

Improved solution using defaultdict, which supplies a default value when a key isn’t found:

from collections import defaultdict

reviews_max = defaultdict(float) # default value if not found is 0.0 

for app in android:
    name, n_reviews = app[0], float(app[3])
    if reviews_max[name] < n_reviews:
        reviews_max[name] = n_reviews

Now the code is simpler and easier to read and understand.
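
One thing worth knowing about defaultdict, shown as a small sketch with a made-up app name: looking up a missing key does not raise KeyError. It calls float(), which returns 0.0, and stores that value under the key.

```python
from collections import defaultdict

reviews_max = defaultdict(float)

# The key is missing, so defaultdict calls float() and gets 0.0.
first_lookup = reviews_max["Example App"]

# The lookup itself created the key with that default value.
created = "Example App" in reviews_max
```

Because of this, the defaultdict version relies on review counts never being negative: any real count is at least the 0.0 default, and an app with zero reviews still ends up in the dictionary because the comparison created its entry.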


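As for why n_reviews is converted with float() rather than int(): I have no clue. Most likely it makes no practical difference, since review counts convert to float without any significant loss of precision.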
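
On the float question, a quick check suggests the choice does not change the result for whole-number review counts. The count below is a made-up string in the same form as the data set's review column.

```python
# Hypothetical review count as it would be read from the CSV (a string).
raw = "78158306"

as_float = float(raw)
as_int = int(raw)

# Both conversions represent the same number, so comparisons behave the same.
same_value = as_float == as_int

# A float stores whole numbers exactly up to 2**53, far above any review count.
exact_limit = float(2**53) == 2**53
```

So int(app[3]) would arguably be the more natural choice for counts, but float(app[3]) gives the same maximum for each app.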

Thank you, really helps me understand the answer better!

I’m getting a traceback on the n_reviews part of the code. I wasn’t expecting that, since I’m doing it basically the way the instructions say. What am I missing?
4 name = names[0]
5 n_reviews = float(names[3])
----> 6 if name in reviews_max and reviews_max[name] < n_reviews:
7 reviews_max[name]=[n_reviews]
8 if name not in reviews_max:

TypeError: unorderable types: list() < float()
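
Judging from the traceback alone, the likely cause is the square brackets on line 7: reviews_max[name]=[n_reviews] stores a one-element list, so the next time the same name comes up, line 6 compares a list with a float. A minimal sketch of the problem and the fix, using a made-up app name:

```python
reviews_max = {}
name, n_reviews = "Example App", 120.0

# With brackets, the stored value is a list, not a number.
reviews_max[name] = [n_reviews]

try:
    reviews_max[name] < 150.0        # list < float is not allowed in Python 3
    error = None
except TypeError as exc:
    error = exc                      # exact message wording depends on the Python version

# Without brackets, the stored value is a float and the comparison works.
reviews_max[name] = n_reviews
ok = reviews_max[name] < 150.0
```

Dropping the brackets so the line reads reviews_max[name] = n_reviews should make the comparison on line 6 work as in the original solution.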