Screen Link: https://app.dataquest.io/m/432/the-naive-bayes-algorithm/3/using-bayes-theorem
Ok so, I do understand that P(New Message|Spam) != P(Spam|New Message). It seems so simple but I’m just having a hard time describing the difference between those two.
What I understand is that P(Spam|New Message) is the probability that the new email is spam given that it’s a new email. But what does P(New Message|Spam) mean then? Is it the probability that it is a new message given that it’s spam?
Would really appreciate a little bit of clarification here.
Thank you very much.
Yeah, this is a bit confusing. But that’s not exactly what P(Spam | New Message) means.
New Message does not refer to a new email. It refers to the content of that message.
So, a new message could contain "Hi, would you subscribe to our newsletter". In that case, P(Spam | New Message) would mean the probability that the message is spam given that the content of the message is "Hi, would you subscribe to our newsletter".
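So, when we have P(New Message | Spam), that means the probability that the content of the message is "Hi, would you subscribe to our newsletter" given that the message is spam. In other words, the first asks "of all messages with this content, what fraction are spam?", while the second asks "of all spam messages, what fraction have this content?".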
It’s not the clearest of distinctions, and the content doesn’t clarify it that well either, but the 5th step in the Mission will walk through an example where it should start to become clearer.
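If it helps, here is a minimal sketch that counts both probabilities on a tiny made-up inbox. The data, the labels, and the names `inbox` and `message` are all invented for illustration and are not taken from the mission. It treats the whole message text as a single event, which is simpler than what Naive Bayes actually does with individual words.

```python
# Hypothetical toy inbox: (message content, label). Invented data, for illustration only.
inbox = [
    ("Hi, would you subscribe to our newsletter", "spam"),
    ("Hi, would you subscribe to our newsletter", "spam"),
    ("Hi, would you subscribe to our newsletter", "ham"),
    ("You won a prize", "spam"),
    ("You won a prize", "spam"),
    ("Lunch tomorrow?", "ham"),
    ("Lunch tomorrow?", "ham"),
    ("Meeting notes attached", "ham"),
    ("Meeting notes attached", "ham"),
    ("Meeting notes attached", "ham"),
]
message = "Hi, would you subscribe to our newsletter"

n_total = len(inbox)
n_spam = sum(1 for _, label in inbox if label == "spam")
n_message = sum(1 for text, _ in inbox if text == message)
n_both = sum(1 for text, label in inbox if text == message and label == "spam")

# P(New Message | Spam): among spam messages only, how many have this content?
p_message_given_spam = n_both / n_spam

# P(Spam | New Message): among messages with this content only, how many are spam?
p_spam_given_message = n_both / n_message

# Bayes' theorem links the two: P(Spam | M) = P(M | Spam) * P(Spam) / P(M)
p_spam = n_spam / n_total
p_message = n_message / n_total
p_spam_given_message_bayes = p_message_given_spam * p_spam / p_message

print(f"P(New Message | Spam) = {p_message_given_spam:.2f}")
print(f"P(Spam | New Message) = {p_spam_given_message:.2f}")
print(f"Same value via Bayes  = {p_spam_given_message_bayes:.2f}")
```

Both probabilities share the same numerator (messages that are spam and have this content) but divide by different totals. That is why they generally differ: here 2 out of 4 spam messages have this content, while 2 out of 3 messages with this content are spam.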
Phenomenal! Seriously, thank you. Really helped there. It’s all clear now.