432-10 Additive Smoothing: Why is p_money_given_non_spam not (0 + 1) / (9 + 8)?

The formula for additive smoothing is (N_{"money"|Spam} + α) / (N_{Spam} + α * N_{Vocabulary}), where the vocabulary is the set of unique words.

Non-spam consists of two messages: “secret party at my places” (5 words) and “you know the secret” (4 words).
So N_{Non-spam} should be 9, but N_{Vocabulary}, which counts unique words, should be 8, given that "secret" appears twice, making the final formula:

(0 + 1) / (9 + 8 * 1).

So why is the solution p_money_given_non_spam = (0 + 1) / (9 + 9)?

This is what the content states for that particular Mission Step:

N_{Vocabulary} represents the number of unique words in all the messages — both spam and non-spam

So you count the unique words across both types of messages for the vocabulary, not just the type whose probability you are trying to calculate. That is why the denominator is 9 + 9 rather than 9 + 8.
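If it helps to see the counting spelled out, here is a rough sketch in Python. The two non-spam messages are the ones quoted in the question. The spam message is a made-up stand-in (the real ones are in the mission), chosen only so that it adds "money" to the vocabulary. The function name `smoothed_probability` and its parameters are also just illustrative.

```python
from fractions import Fraction

# Non-spam messages as quoted in the question.
non_spam = ["secret party at my places", "you know the secret"]
# Hypothetical spam message, assumed for illustration only.
spam = ["secret money secret secret"]


def smoothed_probability(word, class_messages, all_messages, alpha=1):
    """P(word | class) with additive smoothing."""
    # Every word occurrence in the class, duplicates included (N_class).
    class_words = [w for message in class_messages for w in message.split()]
    # Unique words across ALL messages, spam and non-spam (N_Vocabulary).
    vocabulary = {w for message in all_messages for w in message.split()}
    numerator = class_words.count(word) + alpha
    denominator = len(class_words) + alpha * len(vocabulary)
    return Fraction(numerator, denominator)


# Vocabulary built from both classes: (0 + 1) / (9 + 9)
print(smoothed_probability("money", non_spam, non_spam + spam))  # 1/18

# Vocabulary built from non-spam only, as in the question: (0 + 1) / (9 + 8)
print(smoothed_probability("money", non_spam, non_spam))  # 1/17
```

The only difference between the two calls is which messages feed the vocabulary set, and that is exactly the 8 versus 9 in the denominator.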

Also, for future reference, please make sure to include the Mission/Mission Step link in your post as well. Otherwise, it becomes difficult for others to help you out without the relevant context.
