Wrong numbers appear in freq_table output (App Profiles guided project)

In the guided project Profitable App Profiles for the App Store and Google Play Markets, I created a frequency table just as the solution told me to, but the numbers in it are all 0. I tried copying and pasting the answer from the solution, but it still didn’t work. Can anyone help me figure out what the problem is?

def freq_table(dataset, index):
   table = {}
   total = 0
   
   for row in dataset:
       total += 1
       value = row[index]
       if value in table:
           table[value] += 1
       else:
           table[value] = 1
   
   table_percentages = {}
   for key in table:
       percentage = (table[key] / total) * 100
       table_percentages[key] = percentage 
   
   return table_percentages


def display_table(dataset, index):
   table = freq_table(dataset, index)
   table_display = []
   for key in table:
       key_val_as_tuple = (table[key], key)
       table_display.append(key_val_as_tuple)
       
   table_sorted = sorted(table_display, reverse = True)
   for entry in table_sorted:
       print(entry[1], ':', entry[0])

display_table(ios_final, -5)

The output should be: 
Games : 58.16263190564867
Entertainment : 7.883302296710118
Photo & Video : 4.9658597144630665
Education : 3.662321539416512
Social Networking : 3.2898820608317814
Shopping : 2.60707635009311
Utilities : 2.5139664804469275
Sports : 2.1415270018621975
Music : 2.0484171322160147
Health & Fitness : 2.0173805090006205
Productivity : 1.7380509000620732
Lifestyle : 1.5828677839851024
News : 1.3345747982619491
Travel : 1.2414649286157666
Finance : 1.1173184357541899
Weather : 0.8690254500310366
Food & Drink : 0.8069522036002483
Reference : 0.5586592178770949
Business : 0.5276225946617008
Book : 0.4345127250155183
Navigation : 0.186219739292365
Medical : 0.186219739292365

But the actual output is:
('Weather', ':', 0)
('Utilities', ':', 0)
('Travel', ':', 0)
('Sports', ':', 0)
('Social Networking', ':', 0)
('Shopping', ':', 0)
('Reference', ':', 0)
('Productivity', ':', 0)
('Photo & Video', ':', 0)
('News', ':', 0)
('Navigation', ':', 0)
('Music', ':', 0)
('Medical', ':', 0)
('Lifestyle', ':', 0)
('Health & Fitness', ':', 0)
('Games', ':', 0)
('Food & Drink', ':', 0)
('Finance', ':', 0)
('Entertainment', ':', 0)
('Education', ':', 0)
('Catalogs', ':', 0)
('Business', ':', 0)
('Book', ':', 0)

Hi @stellayou1126, welcome to the forums!

I copied and pasted the code into a copy of my project and it gives the desired output. My first guess is that something went wrong with ios_final in a previous step. It would help to see a copy of your .ipynb file so we can get it sorted out.

Hi, thank you for your response! Since I couldn’t upload files here, I will copy and paste the code that builds my ios_final here.

def is_english(string):
    non_ascii = 0
    
    for character in string:
        if ord(character) > 127:
            non_ascii += 1
    
    if non_ascii > 3:
        return False
    else:
        return True

android_english = []
ios_english = []

for app in android_clean:
    name = app[0]
    if is_english(name):
        android_english.append(app)
        
for app in ios:
    name = app[1]
    if is_english(name):
        ios_english.append(app)
        
explore_data(android_english, 0, 3, True)
print('\n')
explore_data(ios_english, 0, 3, True)

android_final = []
ios_final = []

for app in android_english:
    price = app[7]
    if price == '0':
        android_final.append(app)

for app in ios_english:
    price = app[4]
    if price == '0.0':
        ios_final.append(app)

print(len(android_final))
print(len(ios_final))

Starting from the ios_english step, I get fewer records than the solution does, but I still can’t find the problem.

I am stumped! I’m still not getting the same numbers that you’re getting; everything seems to work perfectly on my end. I end up with 3222 rows in the ios_final dataset, and display_table(ios_final, -5) gives me the desired output.

@Sahil or @Bruno, would either of you have any insight into what could be causing this error?

I will copy all of the code that comes before the part I posted last, thanks!

from csv import reader

#The Google Play data set#
opened_file = open('googleplaystore.csv')
read_file = reader(opened_file)
android = list(read_file)
android_header = android[0]
android = android[1:]

#The App Store data set#
opened_file = open('AppleStore.csv')
read_file = reader(opened_file)
ios = list(read_file)
ios_header = ios[0]
ios = ios[1:]

def explore_data(dataset, start, end, rows_and_columns = False):
    dataset_slice = dataset[start:end]
    for row in dataset_slice:
        print(row)
        print('\n') #adds a new (empty) line after each row
        
    if rows_and_columns:
        print('Number of rows:', len(dataset))
        print('Number of columns:', len(dataset[0]))

print(android_header)
print('\n')
explore_data(android, 0, 3, True)

print(ios_header)
print('\n')
explore_data(ios, 0, 3, True)

print(android[10472])
print('\n')
print(android_header)
print('\n')
print(android[0])

del android[10472]

duplicate_apps = []
unique_apps = []

for app in android:
    name = app[0]
    if name in unique_apps:
        duplicate_apps.append(name)
    else:
        unique_apps.append(name)

print('Number of duplicate apps: ', len(duplicate_apps))
print('\n')
print('Examples of duplicate apps: ', duplicate_apps[:15])

reviews_max ={}

for app in android:
    name = app[0]
    n_reviews = float(app[3])
    
    if name in reviews_max and reviews_max[name] < n_reviews:
        reviews_max[name] = n_reviews
        
    elif name not in reviews_max:
        reviews_max[name] = n_reviews 

print('Expected length', len(android) - 1181)
print('Actual length', len(reviews_max))

android_clean = []
already_added = []

for app in android:
    name = app[0]
    n_reviews = float(app[3])
    
    if (reviews_max[name] == n_reviews) and (name not in already_added):
        android_clean.append(app)
        already_added.append(name)

explore_data(android_clean, 0, 3, True)

def is_english(string):
    
    for character in string:
        if ord(character) > 127:
            return False
    
    return True

@stellayou1126 I don’t understand what your code is supposed to look like; it’s scattered across the whole post.

Can you please include your guided project file in the original post?

Thanks.

Hi! I’m sorry for the confusion. I’ve uploaded the file on GitHub, here is the link: https://github.com/xiaoshanyou/app/blob/master/appanalysis.ipynb

Thank you!


Hi @stellayou1126
If you are using Python 2.7, table[key] / total will always give zero, because you are dividing two integers and Python 2.7 applies integer (floor) division in that case. You can try initializing total to 0.0 in the freq_table function:

def freq_table(dataset, index):
    table = {}
    total = 0.0  # float, so that / is true division in Python 2.7
    
    for row in dataset:
        total += 1
        value = row[index]
        if value in table:
            table[value] += 1
        else:
            table[value] = 1
    
    table_percentages = {}
    for key in table:
        percentage = (table[key] / total) * 100
        table_percentages[key] = percentage 
    
    return table_percentages
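
To see the difference without switching interpreters, here is a minimal sketch (my own illustration, not part of the project solution). In Python 3, `//` behaves the way `/` did for two integers in Python 2.7, so it reproduces the zeros; converting one operand to float gives the expected percentage on either version. The counts are made up.

```python
# Hypothetical counts, only for illustration.
count = 3
total = 8

# Floor division: what Python 2.7 does for int / int.
floored = (count // total) * 100

# Forcing a float operand gives true division on Python 2 and 3 alike.
percentage = (count / float(total)) * 100

print(floored)     # 0
print(percentage)  # 37.5
```

Another option on Python 2.7 is to put `from __future__ import division` at the very top of the notebook, which makes `/` behave as it does in Python 3.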

@bahmed21 Hi! Thank you so much for your help! That solved the problem. Do you have any idea why my result looks like this:
('Games', ':', 58.53581571473651)
instead of:
Games : 58.16263190564867
when I print it out?
You are welcome, @stellayou1126.
You can modify the last line of the display_table function like this:

def display_table(dataset, index):
    table = freq_table(dataset, index)
    table_display = []
    for key in table:
        key_val_as_tuple = (table[key], key)
        table_display.append(key_val_as_tuple)
        
    table_sorted = sorted(table_display, reverse = True)
    for entry in table_sorted:
        print entry[1]+' :', entry[0]

That’s perfect. Does that mean I don’t need to type parentheses when I use print in Python 2.7?

Yes, but it is better to use the parentheses, because eventually you will have to move to Python 3, where they are required.
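
One way to get the same output on both versions is to build a single string and print that. A small sketch, where format_entry is just a name I made up for the example:

```python
def format_entry(name, percentage):
    # Build one string so that print receives a single argument.
    # With a single argument, the Python 2 print statement and the
    # Python 3 print() function produce the same text.
    return '{} : {}'.format(name, percentage)

print(format_entry('Games', 58.5))  # Games : 58.5
```

Alternatively, `from __future__ import print_function` at the top of a Python 2.7 file turns print into a function there as well.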

I see. Thank you again for your help!


When I run this part of the project, I get ‘Social Networking : 100.0’, not a frequency table. Can someone please tell me what I’m doing wrong here?

Hello @vroomvroom,

The error is in your freq_table function: the table_percentages dictionary and the second for loop are not meant to be defined inside the body of the first for loop.
The correct block of code for the freq_table function should be:

def freq_table(dataset, index):
    table = {}
    total = 0
    
    for row in dataset:
        total += 1
        value = row[index]
        
        if value in table:
            table[value] += 1
        else:
            table[value] = 1
            
    table_percentages = {}
    
    for value in table:
        table_percentages[value] = (table[value] / total) * 100
            
    return table_percentages
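
To illustrate why the indentation matters, here is a sketch with a tiny made-up dataset. The buggy version is my assumed reconstruction of the mistake (percentages loop and return indented into the first loop), not your exact code: it returns after the very first row, so that row's value gets 100%.

```python
def freq_table_buggy(dataset, index):
    # Assumed reconstruction of the mistake: everything after the
    # counting step is indented one level too deep.
    table = {}
    total = 0
    for row in dataset:
        total += 1
        value = row[index]
        if value in table:
            table[value] += 1
        else:
            table[value] = 1
        table_percentages = {}
        for key in table:
            table_percentages[key] = (table[key] / float(total)) * 100
        return table_percentages  # runs during the first iteration


def freq_table_fixed(dataset, index):
    table = {}
    total = 0
    for row in dataset:
        total += 1
        value = row[index]
        if value in table:
            table[value] += 1
        else:
            table[value] = 1
    # Percentages are computed only after all rows have been counted.
    table_percentages = {}
    for key in table:
        table_percentages[key] = (table[key] / float(total)) * 100
    return table_percentages


# Made-up rows with a single column.
sample = [['Social Networking'], ['Games'], ['Games'], ['Education']]
print(freq_table_buggy(sample, 0))
print(freq_table_fixed(sample, 0))
```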

I hope this helps.


It worked, thank you!


I do not understand what we are trying to do here:

def display_table(dataset, index):
    table = freq_table(dataset, index)
    table_display = []
    for key in table:
        key_val_as_tuple = (table[key], key)
        table_display.append(key_val_as_tuple)
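
For context, judging from the full display_table posted earlier in the thread, the (percentage, key) tuples seem to exist so the entries can be sorted: sorted() compares tuples by their first element, so putting the percentage first orders the list by percentage. A small sketch with invented numbers standing in for what freq_table returns:

```python
# Invented percentages, standing in for the freq_table result.
table = {'Games': 58.0, 'Education': 4.0, 'Entertainment': 8.0}

table_display = []
for key in table:
    # Percentage first, so sorting compares percentages before names.
    table_display.append((table[key], key))

# reverse=True puts the largest percentage at the top.
table_sorted = sorted(table_display, reverse=True)

for entry in table_sorted:
    print(entry[1], ':', entry[0])
```

Sorting the dictionary itself with sorted(table) would order the genre names alphabetically instead, which is why the intermediate list of tuples is built first.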