Hello world,

My question is about the Guided Project: Profitable App Profiles for the App Store and Google Play Markets, section 9: Most Common Apps by Genre…

I understand Dataquest shouldn’t hold your hand all the way through the course, but I was at a loss on where to even begin based on these instructions:

1. Inspect both data sets and identify the columns you could use to generate frequency tables to find out what are the most common genres in each market.

Frankly, I’m stuck. For reference, here is the solution code:

```python
def freq_table(dataset, index):
    table = {}
    total = 0

    # Count occurrences of each value in the given column.
    for row in dataset:
        total += 1
        value = row[index]
        if value in table:
            table[value] += 1
        else:
            table[value] = 1

    # Convert raw counts to percentages of the total.
    table_percentages = {}
    for key in table:
        percentage = (table[key] / total) * 100
        table_percentages[key] = percentage

    return table_percentages


def display_table(dataset, index):
    table = freq_table(dataset, index)
    table_display = []
    for key in table:
        # Stored as (value, key) so sorting orders by percentage.
        key_val_as_tuple = (table[key], key)
        table_display.append(key_val_as_tuple)

    table_sorted = sorted(table_display, reverse=True)
    for entry in table_sorted:
        print(entry[1], ':', entry[0])
```

No joke, this code makes me want to quit programming. After going through all the lessons, none of these functions come to mind at all.


Hi chase,

Maybe you spread your lessons out over time and have forgotten how to implement these ideas. I checked the lesson plan again and saw that frequency tables are taught in “Dictionaries and Frequency Tables”, leading up to this guided project.

I see that the trickier part here is `display_table`, where the key and value are swapped so the tuples can be sorted by count, then printed back in the original key-first order (incidentally, `key_val_as_tuple` is confusing naming, because what is actually stored is `(value, key)`).

The Python standard library also has a built-in tool for this: `collections.Counter` with its `most_common()` method.
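A minimal sketch of that approach (the rows here are made-up examples; in the project they would come from the parsed CSV, and the genre column index depends on the data set):

```python
from collections import Counter

# Hypothetical sample rows: (app name, genre).
dataset = [
    ['Facebook', 'SOCIAL'],
    ['Candy Crush', 'GAME'],
    ['Clash of Clans', 'GAME'],
]

# Count how many times each genre (column 1) appears.
genre_counts = Counter(row[1] for row in dataset)

# most_common() returns (value, count) pairs sorted by count, descending.
print(genre_counts.most_common())  # [('GAME', 2), ('SOCIAL', 1)]
```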
Another way is to turn the column into a list, find the unique categories in that column, and call `list.count('category')` for each one (whether you use a normal for-loop or a comprehension is a separate implementation concern) to find out how many there are of each category.
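Sketched out, again with made-up rows, the `list.count` idea looks like this:

```python
# Hypothetical sample rows: (app name, genre).
dataset = [
    ['Facebook', 'SOCIAL'],
    ['Candy Crush', 'GAME'],
    ['Clash of Clans', 'GAME'],
]

# Extract the genre column into a flat list.
genres = [row[1] for row in dataset]

# set() gives the unique categories; count() tallies each one.
counts = {genre: genres.count(genre) for genre in set(genres)}
print(counts)  # {'GAME': 2, 'SOCIAL': 1} (dict order may vary)
```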
If you have learned pandas, this whole task would simply be `pd.DataFrame(dataset).loc[:, index].value_counts()`.
For the sorting, there is no need to create the intermediate data structure. Just google `how to sort on the 2nd element of a tuple`, which should lead you to `operator.itemgetter`.
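For example, `display_table` could sort the `(key, value)` pairs directly on the value (the percentages below are just illustrative numbers):

```python
from operator import itemgetter

# Hypothetical output of freq_table: genre -> percentage.
table = {'SOCIAL': 33.3, 'GAME': 66.7}

# Sort the (key, value) pairs by the 2nd element (the percentage),
# descending - no reversed (value, key) intermediate list needed.
for genre, pct in sorted(table.items(), key=itemgetter(1), reverse=True):
    print(genre, ':', pct)
```

This prints `GAME : 66.7` before `SOCIAL : 33.3` without ever swapping the tuple elements.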

The reason to go through these Python exercises is so you know what goes on under the hood and can write custom functions like these when the standard libraries do not fulfill your requirements.

When learning a piece of syntax or a technique, try to keep in mind when and why you would use it (this can be googled too), beyond what is taught in the lesson. Once you know the why, reading some simple sample code from Stack Overflow makes the concept stick better. Abstracting code into a user requirement in plain English is critical to developing the skill to go in the reverse direction too, from requirement to code, which is what this lesson (and actual coding jobs) demands. As you get better, you will be able to think at various levels of abstraction and build up enough tools to combine them in any way you want to solve a real problem, considering speed, memory, maintainability, etc.