Disease Screening

Related Concepts

Bayes' Theorem
Conditional Probability
Prior Probability (Base Rate / Prevalence)
Likelihood
Posterior Probability
Sensitivity (True Positive Rate)
Specificity (True Negative Rate)
False Positive Rate (1 - Specificity)
Law of Total Probability
Diagnostic Testing

Hint

Define your events: D = Person has the disease, ND = Person does not have the disease, Pos = Test is positive, Neg = Test is negative.
Identify the priors: P(D) and P(ND).
Identify the likelihoods:
- P(Pos | D) is the sensitivity.
- P(Neg | ND) is the specificity. You'll need P(Pos | ND), which is 1 - specificity (the false positive rate).
Calculate the total probability of testing positive, P(Pos), using the law of total probability:
P(Pos) = P(Pos | D)P(D) + P(Pos | ND)P(ND)
Apply Bayes' Theorem to find P(D | Pos):
P(D | Pos) = [P(Pos | D)P(D)] / P(Pos)

Pay close attention to how the low prevalence of the disease impacts the final probability, even with a "highly accurate" test.
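
The steps in this hint can be collected into one small function. This is a minimal Python sketch, not a definitive implementation; the function name `posterior_given_positive` and its parameter names are illustrative choices that do not come from the original problem.

```python
def posterior_given_positive(prevalence, sensitivity, specificity):
    """Return P(D | Pos) from prevalence, sensitivity, and specificity."""
    # Every input is a probability, so reject values outside [0, 1].
    for name, value in (("prevalence", prevalence),
                        ("sensitivity", sensitivity),
                        ("specificity", specificity)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")

    # Priors: P(D) and P(ND).
    p_d = prevalence
    p_nd = 1.0 - prevalence

    # Likelihoods: P(Pos | D) is the sensitivity,
    # P(Pos | ND) is the false positive rate, 1 - specificity.
    p_pos_given_d = sensitivity
    p_pos_given_nd = 1.0 - specificity

    # Law of total probability: P(Pos).
    p_pos = p_pos_given_d * p_d + p_pos_given_nd * p_nd
    if p_pos == 0.0:
        raise ValueError("P(Pos) is zero, so P(D | Pos) is undefined")

    # Bayes' Theorem: P(D | Pos).
    return p_pos_given_d * p_d / p_pos
```

Calling `posterior_given_positive(0.01, 0.95, 0.95)` applies the function to the numbers used in the explanation that follows.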

Explanation: Disease Screening

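Imagine a rare disease that only 1 out of 100 people have. There's a test for it that's 95% accurate, meaning it correctly identifies 95% of people who have the disease (sensitivity) and 95% of people who don't (specificity). If you test positive, what's the chance you actually have the disease?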

You might think it's 95%, but it's much lower! Here's why:

Because the disease is rare, most people (99 out of 100) don't have it.
Even a 95% accurate test will still incorrectly flag some healthy people as positive (these are "false positives").
- The test correctly clears 95% of healthy people, so it wrongly flags the other 5% (a 5% false positive rate).
When many healthy people are tested, even a small false positive rate (5%) can lead to a significant number of false positive results.
When the disease is rare, the healthy people who falsely test positive can outnumber the sick people who correctly test positive. Here, out of every 100 people tested, about 5 healthy people falsely test positive while only about 1 sick person correctly tests positive.

So a positive result in this scenario makes it much more likely that you have the disease than before the test, but it is still more likely than not to be a false alarm, because so few people have the disease in the first place.
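
One way to check this intuition before doing any algebra is to simulate it. The following is a rough Python sketch under stated assumptions: the function name `simulate_screening`, the population size of 200,000, and the fixed random seed are all arbitrary illustrative choices, and each person's disease status and test outcome are drawn independently.

```python
import random


def simulate_screening(n_people, prevalence, sensitivity, specificity, seed=0):
    """Simulate screening n_people; return (true_positives, false_positives)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    true_positives = 0
    false_positives = 0
    for _ in range(n_people):
        has_disease = rng.random() < prevalence
        if has_disease:
            # A sick person tests positive with probability = sensitivity.
            tests_positive = rng.random() < sensitivity
        else:
            # A healthy person tests positive with probability = 1 - specificity.
            tests_positive = rng.random() < (1.0 - specificity)
        if tests_positive:
            if has_disease:
                true_positives += 1
            else:
                false_positives += 1
    return true_positives, false_positives


tp, fp = simulate_screening(200_000, prevalence=0.01,
                            sensitivity=0.95, specificity=0.95)
share_sick = tp / (tp + fp)
print(f"true positives: {tp}, false positives: {fp}")
print(f"share of positive tests from sick people: {share_sick:.3f}")
```

Because the simulation is random, the counts vary with the seed and population size, but false positives should clearly outnumber true positives for a disease this rare.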

We will use Bayes' Theorem. Let's define the events:

D: The person has the disease.
ND: The person does not have the disease (complement of D).
Pos: The person tests positive.
Neg: The person tests negative.

We want to find P(D | Pos).

1. Identify Given Probabilities

Prior probability of having the disease (prevalence):
P(D) = 0.01 (1%)
Prior probability of not having the disease:
P(ND) = 1 - P(D) = 1 - 0.01 = 0.99 (99%)
Probability of testing positive if the person has the disease (Sensitivity):
P(Pos | D) = 0.95 (95%)
Probability of testing negative if the person does not have the disease (Specificity):
P(Neg | ND) = 0.95 (95%)

2. Determine the Probability of Testing Positive if the Person Does Not Have the Disease (False Positive Rate)

We need P(Pos | ND). This is the complement of specificity.

P(Pos | ND) = 1 - P(Neg | ND) = 1 - 0.95 = 0.05 (5%)

This is the false positive rate: the chance a healthy person tests positive.

3. Calculate the Total Probability of Testing Positive (P(Pos))

Using the Law of Total Probability:
P(Pos) = P(Pos | D) × P(D) + P(Pos | ND) × P(ND)

This accounts for testing positive whether you have the disease or not.

P(Pos) = (0.95 × 0.01) + (0.05 × 0.99)
P(Pos) = 0.0095 + 0.0495
P(Pos) = 0.0590

So, about 5.9% of the total population will test positive.
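
The two terms of this sum are worth looking at separately, since they show where the positive tests come from. A short Python sketch of the same arithmetic follows; the variable names are illustrative.

```python
# Given values from steps 1 and 2.
p_d = 0.01             # P(D), prevalence
p_nd = 1 - p_d         # P(ND)
p_pos_given_d = 0.95   # P(Pos | D), sensitivity
p_pos_given_nd = 0.05  # P(Pos | ND), false positive rate

# Each term of the law of total probability is a joint probability.
p_pos_and_d = p_pos_given_d * p_d     # positive and sick
p_pos_and_nd = p_pos_given_nd * p_nd  # positive and healthy
p_pos = p_pos_and_d + p_pos_and_nd

print(f"P(Pos and D)  = {p_pos_and_d:.4f}")
print(f"P(Pos and ND) = {p_pos_and_nd:.4f}")
print(f"P(Pos)        = {p_pos:.4f}")
```

The healthy-and-positive term is roughly five times the sick-and-positive term, which is what drives the result in the next step.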

4. Apply Bayes' Theorem to Find P(D | Pos)

Bayes' Theorem states:
P(D | Pos) = [P(Pos | D) × P(D)] / P(Pos)

P(D | Pos) = (0.95 × 0.01) / 0.0590
P(D | Pos) = 0.0095 / 0.0590
P(D | Pos) ≈ 0.1610169...
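
To confirm the division without floating-point rounding, the same calculation can be done with exact fractions. This is a small sketch using Python's standard `fractions` module; the variable names are illustrative.

```python
from fractions import Fraction

p_d = Fraction(1, 100)            # P(D)
p_pos_given_d = Fraction(95, 100)  # P(Pos | D)
p_pos_given_nd = Fraction(5, 100)  # P(Pos | ND)

# Law of total probability, then Bayes' Theorem, in exact arithmetic.
p_pos = p_pos_given_d * p_d + p_pos_given_nd * (1 - p_d)
posterior = p_pos_given_d * p_d / p_pos

print(posterior)         # exact value of P(D | Pos) as a reduced fraction
print(float(posterior))  # decimal approximation
```

The exact value reduces to 19/118, which matches 0.0095 / 0.0590 above.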

Final Result

The probability that a person actually has the disease, given that they tested positive, is:

P(Disease | Positive Test) ≈ 0.161

Or about 16.1%.

Key Insight: The Base Rate Fallacy
This result is often surprising! Even though the test is "95% accurate," a positive result only means there's a ~16.1% chance the person actually has the disease. This is due to the low base rate (prevalence) of the disease (1%).
Counting people instead of probabilities makes this concrete. Out of 10,000 people:

100 people have the disease (1%).
- 95 of them test positive (true positives).
9,900 people do not have the disease (99%).
- 495 of them test positive (false positives: 0.05 × 9900).

Total positive tests = 95 (true positives) + 495 (false positives) = 590.
Probability of having the disease if you tested positive = True Positives / Total Positives = 95 / 590 ≈ 0.161.
This illustrates why understanding base rates is crucial when interpreting diagnostic test results and why follow-up testing is often necessary, especially for rare conditions.
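
The same head count can be repeated for other base rates to see how strongly prevalence drives the answer. The sketch below is illustrative only: the helper name `positive_counts` and the prevalence values other than 1% are assumptions chosen for comparison, and the counts are expected values rather than whole people.

```python
def positive_counts(population, prevalence, sensitivity, specificity):
    """Return expected (true_positives, false_positives) in a population."""
    sick = population * prevalence
    healthy = population - sick
    true_positives = sick * sensitivity
    false_positives = healthy * (1 - specificity)
    return true_positives, false_positives


results = {}
for prevalence in (0.001, 0.01, 0.1, 0.5):
    tp, fp = positive_counts(10_000, prevalence, 0.95, 0.95)
    # P(D | Pos) = True Positives / Total Positives
    results[prevalence] = tp / (tp + fp)
    print(f"prevalence {prevalence:>5.1%}: TP = {tp:7.1f}, FP = {fp:6.1f}, "
          f"P(D | Pos) = {results[prevalence]:.3f}")
```

With the test held fixed, P(D | Pos) rises as the disease becomes more common, and it equals the test's 95% figure only when half the population has the disease.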
