Apps Project - What's the logic here?

Hello everyone. I’m doing the iOS and Google store analysis, which is the very first mission, and I’ve hit a bit of a roadblock. I’m trying to write a function that detects non-English app names using a for loop, but I’m not sure my code is doing exactly what it’s supposed to do.


    for character in string:
        if ord(character) > 127:
            return False
        else:
            return True

The code in the solution appears to differ somewhat.


    for character in string:
        if ord(character) > 127:
            return False

    return True

For the names Instagram and 爱奇艺PPS -《欢乐颂2》电视剧热播, my function returns accurate results, but for Instachat 😜 and Docs To Go™ Free Office Suite it doesn’t seem to recognize the special characters and returns True anyway.

The solution function seems to work okay, though. Could somebody explain what exactly the difference between the two functions is, and what I’m doing wrong?

This part lost me :stuck_out_tongue_winking_eye: and I wasn’t sure if that was meant in code :thinking:
I ran this both ways and it works for me. Can you add screenshots of the output of both functions?
Thanks :slight_smile:

The function only checks one character, since a return statement is executed at the very first comparison.

If you want to invert the logic, you can simplify the per-character check to one line:

return not ord(c) <= 127

It evaluates to True if the character’s code is not within the ASCII range (0 to 127). Otherwise, False.
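To put that one-line check in context, here is a minimal sketch of how the inverted comparison could be applied to every character rather than only the first. The helper name has_non_ascii is made up for illustration and is not part of the course solution.

```python
def has_non_ascii(string):
    # Hypothetical helper, not from the course solution.
    # any() stops at the first character whose code is above 127 and
    # returns False only after every character has been checked.
    return any(ord(c) > 127 for c in string)


print(has_non_ascii('Instagram'))     # False
print(has_non_ascii('Instachat 😜'))  # True
```

The point is that the comparison has to run for each character before a final answer is returned.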

It is as Alvin explains in his reply, but since I’ve seen this question asked on Slack a few times, I thought it was worth it to give some more detail. If you just want to read the technical answer, jump to the second part of this post.

Context

To give some context, this pertains to the guided project Profitable App Profiles for the App Store and Google Play Markets, whose solution can be found here. More specifically, it concerns screen 6.

This guided project is part of the course Python for Data Science: Fundamentals — the first course in our “Data Scientist in Python” path.

We’ll be working with the following strings:

  • Instagram
  • Docs To Go™ Free Office Suite
  • 爱奇艺PPS -《欢乐颂2》电视剧热播
  • Instachat 😜

These represent app names. The goal is to identify which of these strings are “non-English” names. As a proxy for this, we are using the rationale:

If any of the characters is a non-English character, then the app’s name is not in English.

As a way to solve this problem, the course author suggested the following function:

def is_english(string):
    for character in string:
        if ord(character) > 127:
            return False
    return True

And Saif tried to solve this problem with the following slightly different function:

def saif_is_english(string):
    for character in string:
        if ord(character) > 127:
            return False
        else:
            return True

Technical Preamble

The return statement has the following property: as soon as Python reaches it while running a function, it exits the function right then and there, and nothing after it in that call is executed.

We can read this from the official documentation:

return leaves the current function call with the expression list (or None) as return value.

Let’s test this. Below I’m defining a function that starts off immediately with a return statement, then it does some stuff and returns something else.

def a_func(n):
    return pow(n,2)    # returns the square of n
    n=n+1              # reassigns n+1 to n
    return n           # returns n (after reassignment)

Let’s try using it:

>>> print(a_func(1))
1
>>> print(a_func(2))
4
>>> print(a_func(3))
9

So we see it’s always returning to us the square of the input. It ignores everything after the first return statement. Let’s now analyze the problem at hand.

Analyzing the Problem

Now that we know how return statements work, let’s see them in action in the context of this question.

We’ll be analyzing the usage of print(saif_is_english('Instagram')):

  1. First it enters the function with Instagram as its input.
  2. Then it will initiate a “for loop” over Instagram.
    1. character is assigned the value of the first character in the input string, which means that character is assigned I.
    2. The if condition is evaluated:
      • Since ord(character) is 73, which is smaller than 127, the statement ord(character) > 127 is false and so we are sent to the else part of the code.
    3. Once in else, we hit the statement return True and the function is exited right here.
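The steps above can also be checked in code. The sketch below adds a list that records every character each function inspects before it returns; the _traced names and the checked list are additions for illustration, not part of either original function.

```python
def saif_is_english_traced(string):
    checked = []  # characters inspected before returning (added for tracing)
    for character in string:
        checked.append(character)
        if ord(character) > 127:
            return False, checked
        else:
            return True, checked


def is_english_traced(string):
    checked = []  # characters inspected before returning (added for tracing)
    for character in string:
        checked.append(character)
        if ord(character) > 127:
            return False, checked
    return True, checked


result, checked = saif_is_english_traced('Instachat 😜')
print(result, len(checked))  # True 1

result, checked = is_english_traced('Instachat 😜')
print(result, len(checked))  # False 11
```

The first version stops after a single character, while the second only stops once it reaches the emoji at the end of the name.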

So, as you can see, we ended up only ever looking at the first character of Instagram. We can even use Python Tutor’s Visualizer to see this:

[Animation: Python Tutor visualization of saif_is_english]

That the function returns a correct result is merely a consequence of the fact that the first character is “an English character”.

Let’s now see what happens with the usage of print(is_english('Instagram')):

[Animation: Python Tutor visualization of is_english]

Notice (on the right side of the animation) how we only get to a return statement after iterating over all the characters. Here’s the link for this visualization.

I leave it to you to do this exercise for the other strings. I hope this clears it up.

Very very very helpful and detailed explanation! Great stuff, @Bruno!

I had this exact same question, thank you so much! It was super well explained! :smile:

Voila… Got the solution right and well explained. @Bruno thanks a lot.

Great tool to know your code mechanism. Thanks @Bruno. Got to know something new.

I tried writing the solution as shown below. Basically, the comparison and the return values are the reverse of the offered solution, but this returns True for names where it should return False. Can someone explain why it isn’t evaluating as I would expect?

def English_check(string):
    for character in string:
        if ord(character) <= 127:
            return True
    return False

image
What’s the problem here if I write the function in this way, please reply.

Hi @raisa.jerin.sristy79

With your code you are just checking if there is 1 or more English characters in the app name.
The logic of your loop:

  • Take the first character in app_name and check if ord(ch) <= 127
  • If this is the case return True and end the loop
  • If not continue the loop with the next character.

In other words, your loop ends as soon as the first English character is found (without checking the remaining characters). This is why you only get True as function output for all inputs (they all have at least 1 English character).
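To illustrate, here is a rough sketch. The first function is only an assumed reconstruction of the loop described above (the original code was posted as an image), and the names any_english and all_english are made up for this example. The second function uses the built-in all() to require that every character passes the check.

```python
def any_english(app_name):
    # Assumed reconstruction of the loop described above:
    # returns True as soon as ONE character is within the ASCII range.
    for ch in app_name:
        if ord(ch) <= 127:
            return True
    return False


def all_english(app_name):
    # Returns True only if EVERY character is within the ASCII range.
    return all(ord(ch) <= 127 for ch in app_name)


print(any_english('Docs To Go™ Free Office Suite'))  # True
print(all_english('Docs To Go™ Free Office Suite'))  # False
print(any_english('爱奇艺'))                           # False
```

With the first version, only a name with no English characters at all comes back as False.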

Best
htw

@htw
Thanks a lot for your explanation. :heart: