Guided Project Roadblock

Hello world,

My question is about the Guided Project: Profitable App Profiles for the App Store and Google Play Markets, section 9: Most Common Apps by Genre…

I understand Dataquest shouldn’t hold your hand all the way through the course, but I was at a loss on where to even begin based on these instructions:

  1. Inspect both data sets and identify the columns you could use to generate frequency tables to find out what are the most common genres in each market.

Frankly, I’m

def freq_table(dataset, index):
    # Count how many times each value appears in the given column.
    table = {}
    total = 0

    for row in dataset:
        total += 1
        value = row[index]
        if value in table:
            table[value] += 1
        else:
            table[value] = 1

    # Convert the raw counts to percentages of all rows.
    table_percentages = {}
    for key in table:
        percentage = (table[key] / total) * 100
        table_percentages[key] = percentage

    return table_percentages

def display_table(dataset, index):
    table = freq_table(dataset, index)
    table_display = []
    for key in table:
        # Store (value, key) so that sorted() orders by the percentage.
        key_val_as_tuple = (table[key], key)
        table_display.append(key_val_as_tuple)

    table_sorted = sorted(table_display, reverse = True)
    for entry in table_sorted:
        # Print back in the original key : value order.
        print(entry[1], ':', entry[0])

No joke, this code makes me want to quit programming. After going through all the lessons, these functions don't come to mind at all.


Hi chase,

Maybe you spread out your lessons and have forgotten how to implement these ideas. I checked the lesson plan again and saw that “frequency tables” are taught under “Dictionaries and Frequency Tables”, leading up to this guided project.

What I think could be more difficult here is display_table, which flips each key and value into a (value, key) tuple, sorts the tuples, and then prints them back in the original key, value order. (The name key_val_as_tuple is confusing anyway, because what is stored is value, key.)
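
To make the flipping step concrete, here is a small sketch. The table below is made up for illustration (the percentages are not from the project's data sets). sorted() compares tuples element by element from the left, so putting the percentage first is what makes it the sort key.

```python
# Hypothetical frequency table; the percentages are invented for illustration.
table = {'Games': 58.0, 'Education': 3.5, 'Entertainment': 8.0}

# Flip each key/value pair so the percentage comes first.
# sorted() compares tuples left to right, so this sorts by percentage.
table_display = [(table[key], key) for key in table]
table_sorted = sorted(table_display, reverse=True)

# Print back in the original "key : value" order.
for percentage, genre in table_sorted:
    print(genre, ':', percentage)
```

If the tuples were stored as (key, value) instead, the same sorted() call would order the genres alphabetically rather than by percentage.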

A Python standard library tool that can do this too is Counter().most_common(), from the collections module.
Another way is to turn the column into a list, find the unique categories in that column, and call list.count('category') for each one (whether you use a normal for loop or a comprehension is a separate implementation concern) to find out how many there are of each category.
If you have learned pandas, this whole task is simply pd.DataFrame(dataset).loc[:, index].value_counts().
For the sorting, there is no need to create the intermediate data structure. Just google how to sort on the 2nd element of a tuple, which should lead you to operator.itemgetter.
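
Here is a rough sketch of the standard-library options above. It assumes dataset is a list of rows with the header already removed. The sample rows and the column index are made up for illustration and are not the project's real data.

```python
from collections import Counter
from operator import itemgetter

# Hypothetical rows: [app name, genre]. Not real project data.
dataset = [
    ['App A', 'Games'],
    ['App B', 'Games'],
    ['App C', 'Education'],
    ['App D', 'Games'],
    ['App E', 'Entertainment'],
    ['App F', 'Education'],
]
index = 1  # column holding the genre in this toy example

# Turn the column into a plain list.
column = [row[index] for row in dataset]

# Option 1: Counter tallies every value; most_common() returns
# (value, count) pairs sorted by count, highest first.
counts = Counter(column).most_common()

# Option 2: call list.count() once per unique category.
# This rescans the list for every category, so it is slower on large columns.
counts_by_hand = {category: column.count(category) for category in set(column)}

# Sort (key, value) pairs on the 2nd element; no flipped tuples needed.
sorted_pairs = sorted(counts_by_hand.items(), key=itemgetter(1), reverse=True)

for genre, count in sorted_pairs:
    print(genre, ':', count)
```

Both options give raw counts. To get percentages like freq_table does, divide each count by len(column) and multiply by 100.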

The reason to go through these Python exercises is so that you know what goes on under the hood and can write custom functions like these when simple libraries do not fulfill your requirements.

When learning some syntax or technique, try to keep in mind when and why you would use it (this can be googled too), beyond what is taught in the lesson. Once you know the why, reading some simple sample code from Stack Overflow makes the concept stick better. Abstracting code into a user requirement in plain English is critical to developing the skill to go in the reverse direction too, from requirement to code, which is what this lesson (and actual coding jobs) demand. As you get better, you will be able to think at various levels of abstraction and build up enough tools to combine them in any way you want to solve a real problem, considering speed, memory, maintainability, etc.