
Does getting rid of the denominator really reduce the number of calculations?

Hi everyone! In the explanation of the naive Bayes algorithm, it is suggested that we drop the denominators to reduce the number of calculations:

The initial formulas

P(Spam | Message) = P(Spam) * P(Message | Spam) / P(Message)

P(Non_Spam | Message) = P(Non_Spam) * P(Message | Non_Spam) / P(Message)

turn into:

P(Spam | Message) => P(Spam) * P(Message | Spam)

P(Non_Spam | Message) => P(Non_Spam) * P(Message | Non_Spam)

where the symbol “=>” stands for “is proportional to”.

We can still compare the two values, but since they are no longer true probabilities, we lose the identity

P(Non_Spam | Message) = 1 - P(Spam | Message)

Thus, for every message we need to calculate both the P(Spam | Message) score and the P(Non_Spam | Message) score and then compare them.
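
To make the comparison step concrete, here is a minimal Python sketch of it. The probabilities are toy numbers invented purely for illustration, and the function name `classify_by_comparison` is hypothetical; it is not code from the lesson.

```python
# Toy values, invented for illustration only (not taken from any dataset).
p_spam = 0.2                      # P(Spam)
p_non_spam = 0.8                  # P(Non_Spam)
p_message_given_spam = 0.03       # P(Message | Spam)
p_message_given_non_spam = 0.001  # P(Message | Non_Spam)


def classify_by_comparison(p_spam, p_non_spam, p_msg_spam, p_msg_non_spam):
    """Compare the two unnormalized scores; P(Message) is never computed."""
    score_spam = p_spam * p_msg_spam              # proportional to P(Spam | Message)
    score_non_spam = p_non_spam * p_msg_non_spam  # proportional to P(Non_Spam | Message)
    return "Spam" if score_spam > score_non_spam else "Non_Spam"


label = classify_by_comparison(
    p_spam, p_non_spam, p_message_given_spam, p_message_given_non_spam
)
print(label)
```

With these toy numbers the spam score is the larger of the two, so the message is labeled Spam. The cost per message is two products and one comparison.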

Isn’t it more efficient to keep the denominator P(Message) and, for every message, calculate only the probability that the message is Spam? If the result is greater than 0.5, we classify the message as Spam; otherwise, as Non_Spam.
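
Here is a rough Python sketch of the threshold idea, to show what it would involve. It assumes P(Message) is obtained through the law of total probability over the two classes, which the lesson does not spell out; the function name `posterior_spam` and the numbers are hypothetical.

```python
def posterior_spam(p_spam, p_non_spam, p_msg_spam, p_msg_non_spam):
    """Return P(Spam | Message) with the denominator kept.

    Assumption: P(Message) is computed by the law of total probability,
    P(Message) = P(Spam) * P(Message | Spam)
               + P(Non_Spam) * P(Message | Non_Spam).
    """
    numerator_spam = p_spam * p_msg_spam
    numerator_non_spam = p_non_spam * p_msg_non_spam
    p_message = numerator_spam + numerator_non_spam
    return numerator_spam / p_message


# Toy values, invented for illustration only.
p = posterior_spam(0.2, 0.8, 0.03, 0.001)
label = "Spam" if p > 0.5 else "Non_Spam"
print(label)
```

Under that assumption, the denominator already needs both products, so this version appears to add a sum and a division on top of them instead of saving work. It would only be cheaper if P(Message) could be obtained some other way.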