Screen Link: https://app.dataquest.io/m/429/conditional-probability%3A-fundamentals/7/probability-formula
Hi Dataquest team,
I wanted to bring to your attention what I think is an error in the theory part of the first course on conditional probability.
In introducing the concept of P(A|B), this dataset is used.
After the intro on cardinals, the equivalence with probabilities is introduced in an awkward way.
P(HIV | T+) is calculated as P(HIV ⋂ T+) / P(T+)
The values of these two quantities are given as
P(T+) = 0.12 (?)
P(HIV ⋂ T+) = 0.000015 (??)
Thus leading to P(HIV | T+) = 0.000125
While I think the two values should be
P(T+) = 45/53 (success / possible)
P(HIV ⋂ T+) = 21/53
Thus leading to P(HIV | T+) = 21/45 ≈ 0.467
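For what it's worth, the arithmetic from the table can be checked with a minimal Python sketch. The counts (53 patients, 45 positive tests, 21 of those with HIV) are the ones quoted above; the variable names are my own, not from the lesson.

```python
from fractions import Fraction

# Counts as quoted in this post (read off the lesson's table).
total_patients = 53
tested_positive = 45      # patients with T+
hiv_and_positive = 21     # patients with both HIV and T+

# Turn the counts into probabilities over the whole table.
p_t_pos = Fraction(tested_positive, total_patients)
p_hiv_and_t_pos = Fraction(hiv_and_positive, total_patients)

# P(HIV | T+) = P(HIV ∩ T+) / P(T+); the common denominator 53 cancels.
p_hiv_given_t_pos = p_hiv_and_t_pos / p_t_pos

print(p_hiv_given_t_pos)                   # 7/15
print(round(float(p_hiv_given_t_pos), 3))  # 0.467
```

Because the denominator cancels, this is the same as dividing the counts directly (21/45).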
Maybe the dataset was changed or maybe I have not understood a thing about conditional probability?
Looks like the 0.12 and 0.000015 values are plucked out of thin air?
I agree that using the provided data as you demonstrated would make a more relatable explanation.
You have understood conditional probability correctly. Still, there’s nothing theoretically wrong with 0.000015 / 0.12 = 0.000125.
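As a rough sketch of the probability-only case, the same formula can be applied directly to the two values from the lesson. The helper name conditional_probability is made up for illustration, and Fraction is used here only to avoid floating-point noise.

```python
from fractions import Fraction


def conditional_probability(p_joint, p_condition):
    """Return P(A | B) = P(A ∩ B) / P(B). Hypothetical helper for illustration."""
    if p_condition == 0:
        raise ValueError("P(B) must be non-zero")
    if p_joint > p_condition:
        raise ValueError("P(A ∩ B) cannot exceed P(B)")
    return p_joint / p_condition


# Values stated in the lesson for the second test; no counts are known here.
p_t_pos = Fraction("0.12")
p_hiv_and_t_pos = Fraction("0.000015")

p_hiv_given_t_pos = conditional_probability(p_hiv_and_t_pos, p_t_pos)

print(p_hiv_given_t_pos)         # 1/8000
print(float(p_hiv_given_t_pos))  # 0.000125
```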
Introducing the great Allen Downey collection to satisfy all your Bayes needs: https://colab.research.google.com/github/AllenDowney/BiteSizeBayes
Thanks @hanqi! Maybe worth flagging this as something to edit in the DQ backlog.
Hi @nlong, the table you showed above describes the results obtained using a certain HIV test. Before introducing the new probabilities (the ones which confused you), we mentioned using a different HIV test (you might have missed that paragraph, hence the confusion):
This formula is useful when we only know probabilities. For instance, let’s say a different test is used to diagnose a patient. The patient tests positive for HIV, and we want to find P(HIV | T+) — the probability that the patient actually has HIV, given that the test was positive.
This time, however, all we know is P(T+)=0.12 and P(HIV∩T+)=0.000015. We can no longer find cardinals, but using the formula above, we have:
P(HIV | T+) = P(HIV∩T+) / P(T+) = 0.000015 / 0.12 = 0.000125
So the probabilities come from a different test, not from that initial table, and this aspect is already mentioned. Let me know if this is still confusing.
Hi @alex, thanks for clarifying. You are right.
However, from a UX perspective, I found it a bit confusing that the reasoning flow of the existing example is interrupted by a new example introduced so abruptly. The text is very dense, so some extra spacing would help separate the two lines of reasoning and set off the new data.
Then again, it may just be me.
I agree with the observation you’re making from a UX perspective, and I added a change to make this more clear — hopefully, the change will go live next week. Thanks!