Guided project: Profitable app profiles; problem with creating dictionary

Screen Link: https://app.dataquest.io/m/350/guided-project%3A-profitable-app-profiles-for-the-app-store-and-google-play-markets/12/most-popular-apps-by-genre-on-the-app-store

genres_ios = freq_table(ios_final, -5)

for genre in genres_ios:
    total_ratings = 0
    length_genre = 0
    
    for row in ios_final:
        app_genre = row[-5]
        if app_genre == genre:
            no_of_rating = float(row[5])
            total_ratings += no_of_rating
            length_genre += 1
            
    avg_rating = total_ratings / length_genre
### Up to this point everything was fine

### Below I tried to do some experiments on my own, and the problem started
    genres_avg_rating = {} # A dictionary with genre and average rating
    if genre in genres_avg_rating:
        genres_avg_rating[genre] += avg_rating
    else:
        genres_avg_rating[genre] = avg_rating
        
    print(genres_avg_rating)

I wanted to create tuples containing the average rating and genre so that I could sort them using the sorted() function, so I needed to create a dictionary first. I tried to build it from the output of the for loop rather than from an existing list.
I wrote the code above (the last six lines) to create the dictionary, but it printed a separate dictionary for each genre instead of a single dictionary with all the genres and their average ratings.
Output:

{'Social Networking': 71548.34905660378}
{'Photo & Video': 28441.54375}
{'Games': 22788.6696905016}
{'Music': 57326.530303030304}
{'Reference': 74942.11111111111}
{'Health & Fitness': 23298.015384615384}
{'Weather': 52279.892857142855}
{'Utilities': 18684.456790123455}
{'Travel': 28243.8}
{'Shopping': 26919.690476190477}
{'News': 21248.023255813954}
{'Navigation': 86090.33333333333}
{'Lifestyle': 16485.764705882353}
{'Entertainment': 14029.830708661417}
{'Food & Drink': 33333.92307692308}
{'Sports': 23008.898550724636}
{'Book': 39758.5}
{'Finance': 31467.944444444445}
{'Education': 7003.983050847458}
{'Productivity': 21028.410714285714}
{'Business': 7491.117647058823}
{'Catalogs': 4004.0}
{'Medical': 612.0}

I know it’s not required by the lesson, but I really wanted to know. It’ll further improve my understanding of Python.

Any kind of help is highly appreciated.
Thank you for taking the time and effort to read the post.

Hi @tomahadi, welcome to the community!

To get it to create one dictionary, move this line out of the loop and put it before the big loop:

genres_avg_rating = {} # A dictionary with genre and average rating

Then un-indent the last print(genres_avg_rating) line so that it’s out of the loop (or you’ll get a giant wall of text!).
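
As a rough illustration of the two changes described above, here is a minimal sketch on made-up data. The `sample_apps` list and its two-column layout are assumptions for the example only; the real project reads the genre from `row[-5]` and the rating count from `row[5]` of `ios_final`. The last step shows one way to get the (average, genre) tuples for `sorted()` that the question mentions.

```python
# Hypothetical stand-in for ios_final: each row is [genre, rating_count].
sample_apps = [
    ["Games", "100"],
    ["Games", "300"],
    ["Music", "50"],
    ["Weather", "400"],
]

# Unique genres (the project gets these from freq_table instead).
genres = sorted({row[0] for row in sample_apps})

genres_avg_rating = {}  # created once, BEFORE the outer loop

for genre in genres:
    total_ratings = 0
    length_genre = 0
    for row in sample_apps:
        if row[0] == genre:
            total_ratings += float(row[1])
            length_genre += 1
    # Each genre is visited once, so a plain assignment is enough.
    genres_avg_rating[genre] = total_ratings / length_genre

# Printed once, AFTER the loop has finished.
print(genres_avg_rating)

# Build (average, genre) tuples so sorted() orders by the average.
sorted_genres = sorted(
    ((avg, genre) for genre, avg in genres_avg_rating.items()),
    reverse=True,
)
for avg, genre in sorted_genres:
    print(genre, ":", avg)
```

Creating the dictionary inside the loop resets it to empty on every pass, which is why the original version printed one single-entry dictionary per genre.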


It really helped. Thank you. Highly appreciate your support.

By the way, if you don’t mind.
I checked your LinkedIn profile. You were a mathematics teacher for almost ten years. That’s awesome, especially when it comes to data science.
I have a degree in Finance and Banking. I was not that good at math in school. That’s not because I couldn’t understand the concepts, but because I didn’t study well.
Now, I’m trying to learn data science on my own using online course materials like Dataquest. Is it going to be tough for me to master data science skills? Or is it not for someone like me?

Please forgive me if this isn’t the right place to ask the above question.
Thank you once again.

One of the things I liked about the DQ curriculum and method of learning is how hands-on it is. You’re learning things that you can apply and play with immediately. If you enjoy the process or the topic, it makes it easier to dive deeper into the subject. I’ve always enjoyed math, but it really came alive for me when I took physics because I could see how it connected with something in the real world. I never cared much for statistics, but when I see how it’s used I feel more confident that I can do better. I think the application and practice is key for learning anything.

What part of data science is attractive to you? Keep that in mind and push forward! At least up through the Data Analyst path, I don’t think there’s any math that would keep you from being able to progress (no calculus! :slight_smile: ). At the beginning while you’re learning programming, the math you need will revolve a lot around interpreting graphs and understanding frequency and percentages. There’s a separate section just for statistics that I think is explained pretty well. I haven’t gone into the more advanced topics in the Data Science path, so I’m not sure what math you would need there.

I’m not sure if that answers your question, but I hope it encourages you as you move along the path. The community is here to support you as well!

Hi @april.g,
Yes, you answered really well. Thank you.

Exactly, “the application and practice is key for learning anything”. I can’t agree with you more. Project based learning is the best method to learn. In fact, I, myself, learn much much better through hands-on learning.

Whenever I try to learn something, I subconsciously look for practical application of the new concept. In my life, so far, I’ve taken a good number of courses. All revolving around IT. I took courses on C, Java, Web design, JavaScript, Python, Swift. All on my own using online resources. But, I finished none of them. Partly because I had lost motivation when I got stuck. I just gave up. But the most important reason was the lack of practical application of the concepts, lack of project based lessons. And Dataquest really rocks in that aspect so far.

Now, I’m learning data analysis on Dataquest. I’m following their lessons regularly and diligently, and almost at the end of the fundamental course. First time in my life.

Thank you for your advice.

Hi @april.g,

I ran into a slightly different problem when trying to do a similar extension exercise so that I can sort the results. Below is my code:

genre_ios = freq_table(free_ios, -5)
display = {}

    for genre in genre_ios:
        total = 0 # store the sum of user ratings
        len_genre = 0 # number of apps specific to each genre
        for app in free_ios:
            genre_app = app[-5]
            if genre_app == genre:
                n_ratings = float(app[5])
                total += n_ratings
                len_genre += 1
        average_installs = total / len_genre
        # print(genre, ": ", average_installs)
        
    ### my problem starts here ###
            if genre not in display:
                display[genre] = average_installs

        print(display_table)

I was expecting to generate a dictionary of all genres with corresponding number of average installs. However, the output only has one genre listed, and it’s in fact a tuple:
[(26919.690476190477, 'Shopping')]

Please could you kindly take a look at what I might have done wrong?

Also, I tried to format the code above in a more readable way but couldn’t find a solution. Do you know how I can paste code in a better format in a community discussion?

Thank you very much for your time.

Hi @clee12005. I’m not getting the same results. I’m guessing it’s because of the formatting here, but I couldn’t get the code to run on my copy of the project because of indentation issues. When I adjusted those, the code printed as expected.

genre_ios = freq_table(free_ios, -5)
display = {}

for genre in genre_ios:
    total = 0 # store the sum of user ratings
    len_genre = 0 # number of apps specific to each genre
    for app in free_ios:
        genre_app = app[-5]
        if genre_app == genre:
            n_ratings = float(app[5])
            total += n_ratings
            len_genre += 1
    average_installs = total / len_genre
    print(genre, ": ", average_installs)       # i uncommented this part
        
    ### my problem starts here ###
    if genre not in display:
        display[genre] = average_installs
# I commented out the last line, what is display_table?

I didn’t get any tuples though. There could be an issue in a different part of the code. I can help you diagnose that better if you upload a copy of your .ipynb file.
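
One possible explanation, offered only as a guess: the posted code fills `display` but prints `display_table`, so the output may have come from a different variable left over from an earlier cell. The sketch below reproduces that situation with made-up values; the contents of `display_table` and the sample averages are assumptions, not taken from the actual notebook.

```python
display = {}

# Hypothetical leftover from an earlier notebook cell (an assumption).
display_table = [(26919.690476190477, "Shopping")]

# Made-up (genre, average) pairs standing in for the real loop results.
for genre, average_installs in [("Games", 200.0), ("Music", 50.0)]:
    if genre not in display:
        display[genre] = average_installs

print(display_table)  # the old list of tuples, untouched by the loop
print(display)        # the dictionary the loop actually filled
```

If that is what happened, restarting the kernel would clear the leftover variable, and printing `display` would show the full dictionary.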

As for copying and pasting code here, put a line with three backticks (the backtick key is usually next to the 1 key on the keyboard) before your code and another line with three backticks after it.
[Image: example of code wrapped in three backticks]

Hi @april.g - thank you very much for your answer! The problem was somehow solved after I restarted the kernel. You were probably right in pointing out that the issue lay somewhere else in the code.

And thank you so much for your detailed answer in showing me how to copy & paste the code!