Understanding the logic of the last function

I am on the guided project “Popular Data Science Questions”. I am having trouble with the function used to identify the tags. Here is what I wrote:

def deep_learn(col):
    for tag in col:
        if tag in ["lstm", "cnn", "scikit-learn", "tensorflow", 
                   "keras", "neural-network", "deep-learning"]:
            return 1
        else:
            return 0

However, the right answer is:

def class_deep_learning(tags):
    for tag in tags:
        if tag in ["lstm", "cnn", "scikit-learn", "tensorflow",
                   "keras", "neural-network", "deep-learning"]:
            return 1
    return 0

I am unsure why the “return 0” is outside of the if statement (and outside of the for loop). My theory is:

  • Since the tags are stored as a list in each dataframe cell, if I left the else clause in the if statement, then it would only assign a 1 if every tag in that particular row were one of the tags listed in the function?
  • For example, one cell has “lstm, cnn, r”. Ultimately, this would be assigned a “0” in the function, but should be assigned “1” based on the behaviour we want.

Is this accurate?

Thanks.

Hey.

No. The behaviour in your function can be put into words with the sentence "If the first tag is a deep learning tag, then return 1, otherwise return 0".
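To make that concrete, here is a quick check of your function on the example from your question and on the same tags in a different order. The constant name DEEP_LEARNING_TAGS is only introduced here to keep the sketch short; the logic is otherwise your function as posted.

```python
# Tag list copied from the question; the constant name is illustrative only.
DEEP_LEARNING_TAGS = ["lstm", "cnn", "scikit-learn", "tensorflow",
                      "keras", "neural-network", "deep-learning"]

def deep_learn(col):
    for tag in col:
        if tag in DEEP_LEARNING_TAGS:
            return 1
        else:
            return 0  # reached whenever the FIRST tag is not a match

# The example from the question: the first tag matches, so the result is 1.
print(deep_learn(["lstm", "cnn", "r"]))  # 1

# Same tags, different order: the first tag does not match, so the result is 0.
print(deep_learn(["r", "lstm", "cnn"]))  # 0
```

So the result depends only on the first tag, not on whether all tags match.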

Here’s a sample of questions that your function says aren’t deep learning questions, but that the solution’s function says are.

       Id     CreationDate         Tags                                                                   Quarter
17323  32309  2018-05-29 10:03:08  [python, deep-learning, tensorflow, numpy]                             18Q2
16331  31904  2018-05-21 07:40:11  [python, keras, predictive-modeling, prediction]                       18Q2
8629   49198  2019-04-12 14:51:14  [python, scikit-learn, distance, scipy]                                19Q2
7786   27949  2018-02-18 09:21:55  [machine-learning, python, scikit-learn, decision-trees, performance]  18Q1
12363  61683  2019-10-13 22:25:54  [python, scikit-learn]                                                 19Q4

Note that in each row the first tag isn’t a deep learning tag, yet a later tag is, so the questions are deep learning questions.

So why does your function have this behavior? Because as soon as a function reaches a return statement, it exits immediately. In your version both branches of the if/else return, so the loop never gets past the first tag. I have explained this in a different context here (specifically in the section Technical Preamble).
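One way to see the early exit is to record which tags each loop actually inspects. The sketch below is only an illustration: the _traced function names, the inspected list, and the DEEP_LEARNING_TAGS constant are made up for this purpose and are not part of the guided project.

```python
# Tag list copied from the question; the constant name is illustrative only.
DEEP_LEARNING_TAGS = ["lstm", "cnn", "scikit-learn", "tensorflow",
                      "keras", "neural-network", "deep-learning"]

def deep_learn_traced(col):
    """Same logic as deep_learn, but also returns the tags the loop looked at."""
    inspected = []
    for tag in col:
        inspected.append(tag)
        if tag in DEEP_LEARNING_TAGS:
            return 1, inspected
        else:
            return 0, inspected  # exits during the first iteration

def class_deep_learning_traced(tags):
    """Same logic as the solution, but also returns the tags the loop looked at."""
    inspected = []
    for tag in tags:
        inspected.append(tag)
        if tag in DEEP_LEARNING_TAGS:
            return 1, inspected  # exits at the first match
    return 0, inspected          # only reached after every tag was checked

tags = ["python", "deep-learning", "tensorflow", "numpy"]
print(deep_learn_traced(tags))           # (0, ['python'])
print(class_deep_learning_traced(tags))  # (1, ['python', 'deep-learning'])
```

The first version gives up after "python", while the solution keeps going until it finds "deep-learning".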

I hope this helps.

Oh wow!!! The link to your technical answer to a related question is phenomenal!

I see. We want to draw a conclusion about the whole list of tags (i.e., detect at least one deep-learning tag), whereas my version evaluates a single tag and then stops. As you pointed out in the dataframe above, it evaluated python as 0 and then stopped, instead of examining the whole list of [python, deep-learning, tensorflow, numpy].
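For what it's worth, the "at least one" idea maps directly onto Python's built-in any(). This is just an alternative sketch, not the guided project's solution; the names class_deep_learning_any and DEEP_LEARNING_TAGS are made up here, and a set is used instead of a list only as a convenience for membership tests.

```python
# Same tags as in the solution, stored in a set (an assumption for this sketch).
DEEP_LEARNING_TAGS = {"lstm", "cnn", "scikit-learn", "tensorflow",
                      "keras", "neural-network", "deep-learning"}

def class_deep_learning_any(tags):
    # any() stops at the first matching tag and is False for an empty list.
    return int(any(tag in DEEP_LEARNING_TAGS for tag in tags))

print(class_deep_learning_any(["python", "deep-learning", "tensorflow", "numpy"]))  # 1
print(class_deep_learning_any(["python", "r"]))                                     # 0
```

It should give the same 0/1 results as the solution's loop with the return 0 placed after it.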

Thanks Bruno!
