Hello, I have a question about this exercise.
We are using P(Spam|“secret”)∝P(Spam)⋅P(“secret”|Spam) as the formula. In this formula, Spam is event A, and “secret” is event B. When we calculate P(A), we define event A as a message being spam, with all messages as the sample space. When we calculate P(B|A), A seems to be defined in terms of the number of words in all spam messages. So the definition of A, AKA Spam, changes.
How does this change in the definition of A satisfy the requirement that events be defined consistently in probability?

The definition of A is not changing.

P(Spam) is the probability that a message is spam, given only that a message was picked from all messages.

P(“secret”|Spam) is the probability of the event that the word “secret” appears in a message given that the message is Spam. For P(B|A), the event is different. We are given that event A has occurred.

A is unaffected here. The focus is on B given that A has occurred.
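
To see this in numbers, here is a minimal Python sketch on a made-up four-message dataset. The messages and variable names are hypothetical, not taken from the exercise, and it assumes P(“secret”|Spam) is estimated from word counts with no smoothing, as described in the question. In both steps "Spam" means the same thing (the message is spam); only what gets counted inside that condition differs.

```python
from collections import Counter

# Hypothetical toy data: (label, text) pairs, not from the exercise.
messages = [
    ("spam", "secret money secret"),
    ("spam", "win money now"),
    ("ham", "see you at lunch"),
    ("ham", "lunch money today"),
]

# P(Spam): count messages. A = "the message is spam".
n_messages = len(messages)
n_spam_messages = sum(1 for label, _ in messages if label == "spam")
p_spam = n_spam_messages / n_messages

# P("secret" | Spam): A is still "the message is spam".
# We restrict to spam messages, then count words inside them (event B).
spam_words = [
    word
    for label, text in messages
    if label == "spam"
    for word in text.split()
]
spam_word_counts = Counter(spam_words)
p_secret_given_spam = spam_word_counts["secret"] / len(spam_words)

# Unnormalized score, proportional to P(Spam | "secret").
score_spam = p_spam * p_secret_given_spam

print(p_spam, p_secret_given_spam, score_spam)
```

The first probability divides by the number of messages and the second divides by the number of words in spam messages, but the condition used to select the spam messages is identical in both.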

This is really confusing to me as well. I would think that the information given would generate a chart that looks like this:

but it seems like DQ is saying the chart looks like this:

The difference is that one chart tallies the number of messages, and the other chart tallies the number of words. What am I missing here?