'if not' Logic confusion in data cleaning

Screen Link: https://app.dataquest.io/m/351/cleaning-and-preparing-data-in-python/5/string-capitalization

Your Code:

for row in moma:
    gender = row[5]

    # convert the gender to title case
    gender = gender.title()

    # if there is no gender, set
    # a descriptive value
    if not gender:
        gender = "Gender Unknown/Other"
    row[5] = gender

What I expected to happen:

I expected ‘if not gender’ to look at everything in the data that is ‘not’ in the gender column.

What actually happened:

It replaces empty gender values with the string "Gender Unknown/Other".

Other details:

I was expecting the code logic to come out to something like if gender == ''. When I read ‘if not gender’, I expected it to replace everything that is not a gender.

Could someone help me understand the logic?

Thank you!

The short answer is that in this case, where gender is always a string, if not gender and if gender == "" do the same thing. I’ll explain why in more detail:

An if statement acts on either True or False. These are both bool objects, as you can see:

>>> type(True)
<class 'bool'>

If an if statement gets something that’s not a bool object, Python converts that object to a bool, just as calling bool() on it would. Let’s look at what happens when we pass some strings to bool:

>>> bool('Hello')
True
>>> bool('12345')
True
>>> bool('False')
True

So far, every string we’ve given bool has evaluated to True. Let’s try an empty string:

>>> bool('')
False

Aha! It turns out the boolean value of an empty string is False and that of a non-empty string is True. That lets us write if gender to mean "if the string gender isn’t empty".

By adding not, we get if not gender, which means "if the string gender is empty" (I simplified the double negative here).
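To see this on the loop from the question, here is a minimal sketch. The sample rows are made up for illustration (the real moma data has different contents); only index 5, the gender value, matters here.

```python
# Hypothetical sample rows standing in for the real moma data.
# Only index 5 (gender) is used.
moma = [
    ["Title A", "Artist A", "", "", "", "male"],
    ["Title B", "Artist B", "", "", "", ""],
    ["Title C", "Artist C", "", "", "", "FEMALE"],
]

for row in moma:
    gender = row[5].title()

    # An empty string is falsy, so for strings `not gender`
    # is True exactly when gender == "".
    assert (not gender) == (gender == "")

    if not gender:
        gender = "Gender Unknown/Other"
    row[5] = gender

genders = [row[5] for row in moma]
print(genders)  # ['Male', 'Gender Unknown/Other', 'Female']
```

Only the row with the empty gender gets the descriptive value. The others are just converted to title case.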

I hope this makes sense. Either way represents the same logic; if not gender is just a common shortcut you see in Python code.

That makes perfect sense! Thank you for the help. Although I got the code to work, I still wanted to understand why that specific way works as well.

Thanks for the explanation. However, I would like the Dataquest team to put this kind of explanation on the Learn page. People who are new to programming, or specifically to Python, will surely feel confused.

I used len(word) == 0 to handle the scenario; a few others used word == '' to pass the exercise.

I strongly suggest that the Dataquest team update the instructions on the Learn page.
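For comparison, here is a small sketch that runs the three checks mentioned in this thread side by side. The function names and sample words are made up for illustration, and it assumes word is always a string.

```python
def is_empty_by_len(word):
    return len(word) == 0

def is_empty_by_eq(word):
    return word == ''

def is_empty_by_not(word):
    return not word

# Hypothetical sample values: empty, ordinary, whitespace-only,
# and the text 'False' (which is still a non-empty string).
samples = ['', 'Male', ' ', 'False']

results = {
    w: (is_empty_by_len(w), is_empty_by_eq(w), is_empty_by_not(w))
    for w in samples
}

for w, checks in results.items():
    print(repr(w), checks)
```

All three checks agree for every string: only '' counts as empty. A string containing just a space is not empty under any of them.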

The if not logic threw me off too. It would be nice if Dataquest added a comment after the if not line in the code. Thank you for the explanation.

Hi @swarupmalli,

I have passed this feedback to the content team. They will review this screen when they work on optimizing the course.

Best,
Sahil
