Guided lesson: data analysis with Python

My code:

def eng_test(string):
    not_eng = 0
    for char in string:
        if (ord(char)>127):
            not_eng +=1
    if not_eng > 3:
        return False
    else:
        return True

g_data_eng = []
for app in g_data_clean:
    name = app[0]
    if eng_test(name):
        g_data_eng.append(app)
        
explore_data(g_data_eng, 0, 3, False)
explore_data(g_data_eng, 4000, 4003, True)
print('\n')

a_data_eng =[]
for app in apple_body:
    name = app[1]
    if eng_test(name):
        a_data_eng.append(app)
explore_data(a_data_eng, 0, 3, False)
explore_data(a_data_eng, 4000, 4003, True)

Solution code:

def is_english(string):
    non_ascii = 0
    
    for character in string:
        if ord(character) > 127:
            non_ascii += 1
    
    if non_ascii > 3:
        return False
    else:
        return True

android_english = []
ios_english = []

for app in android_clean:
    name = app[0]
    if is_english(name):
        android_english.append(app)
        
for app in ios:
    name = app[1]
    if is_english(name):
        ios_english.append(app)
        
explore_data(android_english, 0, 3, True)
print('\n')
explore_data(ios_english, 0, 3, True)

The problem is that, for some reason, my “app[1]” returns a number instead of a name,
meaning that it’s not pulling out the name somehow. I think it’s treating the entries
in the Apple data set as digits instead of strings.

The solution code is supposed to return:
Number of rows: 6183
Number of columns: 16

But my code returns:
Number of rows: 7197
Number of columns: 93

Hi @westlundderek!

It’s best practice to wrap your code in three backticks (``` on the line before and the line after your code) so it renders properly on these forums, since Markdown formatting is supported here. It makes the code a lot more readable!

Post updated for formatting.

Thanks for editing the code for us. I’m not able to replicate your results, particularly with the number of columns. Since you have 93 columns, something might have happened in a previous section that caused app[1] to be something unexpected. Could you possibly upload your .ipynb file for further troubleshooting?
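In the meantime, a quick check along these lines might show what each row actually is. This is only a sketch: `describe_row` is a hypothetical helper, and the two sample rows are made-up stand-ins for `apple_body[0]`.

```python
def describe_row(row):
    """Summarize a row's type and length (hypothetical helper for debugging)."""
    return f"{type(row).__name__} of length {len(row)}"

# Made-up stand-ins; in the notebook you would pass apple_body[0] instead.
parsed_row = ['1001', 'Example App', '2048']   # what a parsed CSV row looks like
raw_row = "1001,Example App,2048\n"            # what an unparsed line looks like

print(describe_row(parsed_row))
print(describe_row(raw_row))
```

If the row turns out to be a `str` rather than a `list`, then indexing it with `[1]` gives a single character, not a column.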


Dropbox link

Thanks a lot. I found the problem. It’s all the way back in the 2nd cell, where you’re reading in the files:

opened_google = open('googleplaystore.csv')
opened_apple= open('AppleStore.csv')
from csv import reader
read_google = reader(opened_google)

google_data = list(read_google)
google_header = google_data[0]
google_body = google_data[1:]

apple_data = list(opened_apple)
apple_header= apple_data[0]
apple_body= apple_data[1:]

You forgot to use reader() on the apple file. If you add read_apple = reader(opened_apple) and then change the apple_data variable to apple_data = list(read_apple), you should be all set! I reran all the code and tested it out and I end up with:
Number of rows: 6183
Number of columns: 16
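To see why the missing reader() produces a digit for app[1], here is a small sketch that compares the two ways of building the list. It uses `io.StringIO` with a made-up two-line sample in place of the real AppleStore.csv, so the values are illustrative only.

```python
from csv import reader
from io import StringIO

# Made-up sample standing in for AppleStore.csv (header + one app).
SAMPLE = "id,track_name,size_bytes\n1001,Example App,2048\n"

# Without reader(): iterating a file object yields raw text lines.
raw_rows = list(StringIO(SAMPLE))
raw_app = raw_rows[1]          # a single string, newline included
raw_second = raw_app[1]        # second CHARACTER of that string

# With reader(): each line is split into a list of fields.
parsed_rows = list(reader(StringIO(SAMPLE)))
parsed_app = parsed_rows[1]    # a list of strings
parsed_name = parsed_app[1]    # second FIELD, i.e. the app name

print(repr(raw_app), '->', repr(raw_second))
print(parsed_app, '->', repr(parsed_name))
```

Without reader(), each row is one long string, so app[1] is just its second character (a digit from the id). A single character can never contain more than 3 non-ASCII characters, so the English check keeps every row, which is why the count stays at 7197. Assuming explore_data reports the column count as the length of the first row, the 93 would be the number of characters in that first line rather than a number of columns.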


I keep getting caught on simple, hard-to-notice things. It’s a fresh-pair-of-eyes thing.

Thank you.
