Guided Project 1: Most Popular Apps by Genre on Google Play

Hi all,

I’m a little lost here. I had the same issue with the App Store data, but after playing around I somehow solved it (without really understanding why). Now I’m at the same point with the Google Play data. I don’t know why this happens, as the code is nearly identical (except for some different names I’ve assigned). Please help me understand what went wrong.

Screen Link: https://app.dataquest.io/m/350/guided-project%3A-profitable-app-profiles-for-the-app-store-and-google-play-markets/13/most-popular-apps-by-genre-on-google-play

My Code:


```
for category in and_genre:
    total = 0
    len_category = 0
    for app in and_clean_engl_free:
        category_app = app[1]
        if category_app == category:
            installs = app[5]
            installs = installs.replace("+","")
            installs = installs.replace(",","")
            installs = float(installs)
            total += installs
            len_category += 1
    avg_category = total / len_category
    print(category_app," : ",avg_category)
```

What I expected to happen:

ART_AND_DESIGN : 1986335.0877192982
AUTO_AND_VEHICLES : 647317.8170731707
BEAUTY : 513151.88679245283
BOOKS_AND_REFERENCE : 8767811.894736841
BUSINESS : 1712290.1474201474
COMICS : 817657.2727272727
COMMUNICATION : 38456119.167247385
DATING : 854028.8303030303
EDUCATION : 1833495.145631068
[...]

What actually happened: 
LIFESTYLE  :  1437816.2687861272
LIFESTYLE  :  8767811.894736841
LIFESTYLE  :  38456119.167247385
LIFESTYLE  :  638503.734939759
LIFESTYLE  :  7036877.311557789
LIFESTYLE  :  120550.61980830671
LIFESTYLE  :  253542.22222222222
LIFESTYLE  :  513151.88679245283
LIFESTYLE  :  5074486.197183099
LIFESTYLE  :  17840110.40229885
LIFESTYLE  :  24727872.452830188
LIFESTYLE  :  1924897.7363636363
LIFESTYLE  :  16787331.344927534
LIFESTYLE  :  1387692.475609756
LIFESTYLE  :  1833495.145631068
LIFESTYLE  :  4056941.7741935486
LIFESTYLE  :  9549178.467741935
LIFESTYLE  :  1986335.0877192982
LIFESTYLE  :  5201482.6122448975
LIFESTYLE  :  3697848.1731343283
LIFESTYLE  :  11640705.88235294
[...]

Hello Benni, welcome to the community!

This is where you’re running into an issue:

print(category_app," : ",avg_category)

When category_app is used in the print call, it holds whatever category_app = app[1] last assigned once the inner loop has finished running. That is the category of the last app in and_clean_engl_free, which happens to be 'LIFESTYLE', so that’s why you’re seeing it repeated on every line. The averages themselves are fine (the numbers in your output match the ones you expected); only the label is wrong. What we want to print is the name of the category we’re aggregating values for, which is category from and_genre.
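Here is a minimal sketch of the fix, assuming and_genre can be iterated to get category names and that each row keeps the category at index 1 and the install count at index 5, as in your code. The three sample rows and the averages dictionary are made up for illustration and are not part of the project data.

```python
# Hypothetical stand-in rows: only index 1 (category) and index 5 (installs) matter.
and_clean_engl_free = [
    ["App A", "LIFESTYLE", "", "", "", "1,000+"],
    ["App B", "BEAUTY", "", "", "", "500+"],
    ["App C", "LIFESTYLE", "", "", "", "3,000+"],
]
# Hypothetical list of categories; in the project this comes from your own data.
and_genre = ["BEAUTY", "LIFESTYLE"]

averages = {}  # illustrative helper so the results can be inspected later
for category in and_genre:
    total = 0
    len_category = 0
    for app in and_clean_engl_free:
        category_app = app[1]
        if category_app == category:
            # Strip "+" and "," so the install count can be converted to a number.
            installs = app[5].replace("+", "").replace(",", "")
            total += float(installs)
            len_category += 1
    avg_category = total / len_category
    averages[category] = avg_category
    # Print the outer loop variable, not category_app from the inner loop.
    print(category, ":", avg_category)
```

The only change that matters is the variable in the print call. Note that a category with no matching apps would still raise a ZeroDivisionError here, just as it would in the original loop.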

I hope this helps!