N_word_given_spam Only Searches 'Label' Column?

I had a question about the solution given for n_word_given_spam for step 3. In the answer key it states:

n_word_given_spam = spam_messages[word].sum()

I understand that spam_messages is the dataframe containing only spam messages. With the solution given above, wouldn’t this be looking for the specified word in any column of the dataframe (in a given row)? What if we had other columns with the same word as well, besides the SMS column?

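No. spam_messages[word] does not search the contents of any column. It selects the single column whose name matches the value of the variable word, and .sum() then adds up all the values in that column, so you know how many times this word appears in messages labeled as spam. The text in other columns, such as SMS, is never inspected.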

This is possible because each word column holds numbers rather than text: 0 if the word is not in the message and 1 if it is (or a higher count, if the table stores the number of occurrences per message). Therefore, the sum of such a column is the total for that word across all the spam messages.
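
To make this concrete, here is a minimal sketch using a small made-up dataframe. The data, and the word columns win and money, are assumptions for illustration only and are not the course dataset. It shows that indexing with word picks out one column by name, even when the same word also appears in the text of the SMS column.

```python
import pandas as pd

# Hypothetical toy data: one row per message, one numeric column per word.
messages = pd.DataFrame({
    "Label": ["spam", "ham", "spam", "spam"],
    "SMS": ["win money now", "you win some", "win a prize", "free money"],
    "win": [1, 1, 1, 0],
    "money": [1, 0, 0, 1],
})

# Keep only the rows labeled as spam.
spam_messages = messages[messages["Label"] == "spam"]

word = "win"

# Selects the column named "win" (a Series); the SMS text is not searched.
word_column = spam_messages[word]

# Adds up the values in that one column for the spam rows only.
n_word_given_spam = word_column.sum()

print(word_column.name)
print(n_word_given_spam)
```

If no column had that name, spam_messages[word] would raise a KeyError instead of falling back to a search through the other columns.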
