I made a GeoGebra applet to illustrate the dangers of false positives. So I thought I’d share that here. Here’s the statement of the problem.

Suppose Lenny Oiler visits his doctor for a routine checkup. The doctor says that he must test all patients (regardless of whether they have symptoms) for a rare disease called analysisitis. (This horrible illness can lead to severe pain in a patient’s epsylawns and del-tahs. It should not be confused with analysis situs.) The doctor says that the test is 99% effective when given to people who are ill (that is, the probability that the test comes back positive is 99%) and 95% effective when given to people who are healthy (the probability that it comes back negative is 95%).

Two days later the doctor informs Lenny that the test came back positive.

Should Lenny be worried?

Surprisingly, we do not have enough information to answer the question, and Lenny (being pretty good at math) realizes this. After a little investigating he finds out that approximately 1 in every 2000 people has analysisitis (0.05% of the population).

Now should Lenny be worried?

Obviously he should take notice because he tested positive. But he should not be too worried. It turns out that there is less than a 1% chance that he has analysisitis.

Notice that there are four possible outcomes for a person in Lenny’s position. A person is either ill or healthy and the test may come back positive or negative. The four outcomes are shown in the chart below.

| Test result | Ill | Healthy |
|---|---|---|
| Positive | true positive | false positive |
| Negative | false negative | true negative |

Obviously, the two incorrect results (the false positive and the false negative) are the ones to worry about. But in this case, because the test came back positive, we’re interested in the top row.

For simplicity, suppose the city that is being screened has a population of 1 million. Then approximately (1000000)(0.0005)=500 people have the illness. Of these, (500)(0.99)=495 will test positive and (500)(0.01)=5 will test negative. Of the 999,500 healthy people, (999500)(0.05)=49975 will test positive and (999500)(0.95)=949525 will test negative. This is summarized in the following chart.

| Test result | Ill | Healthy |
|---|---|---|
| Positive | 495 | 49975 |
| Negative | 5 | 949525 |
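The bookkeeping behind these four numbers can be sketched in a few lines of Python. This is only an illustration: the function name `screening_counts` and its parameter names are made up here, and the inputs are the figures quoted in the problem (a city of 1 million, 1 in 2000 ill, 99% and 95% accuracy).

```python
from fractions import Fraction


def screening_counts(population, prevalence, sensitivity, specificity):
    """Split a screened population into the four test outcomes."""
    ill = population * prevalence
    healthy = population - ill
    true_positive = ill * sensitivity        # ill and correctly flagged
    true_negative = healthy * specificity    # healthy and correctly cleared
    return {
        "true positive": true_positive,
        "false positive": healthy - true_negative,
        "false negative": ill - true_positive,
        "true negative": true_negative,
    }


# Figures from the problem, kept as exact fractions to avoid rounding noise.
counts = screening_counts(
    population=1_000_000,
    prevalence=Fraction(1, 2000),
    sensitivity=Fraction(99, 100),
    specificity=Fraction(95, 100),
)

print(f"{'Test result':<12}{'Ill':>8}{'Healthy':>10}")
print(f"{'Positive':<12}{int(counts['true positive']):>8}{int(counts['false positive']):>10}")
print(f"{'Negative':<12}{int(counts['false negative']):>8}{int(counts['true negative']):>10}")
```

Using `Fraction` rather than floats is just a convenience so that the counts come out as exact whole numbers.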

Thus, 495+49975=50470 people test positive, and of these only 495 are ill. So the chance that a recipient of a positive test result is sick is 495/50470≈0.0098, or about 0.98%. That should seem shockingly low! I wonder how many physicians are aware of this phenomenon.
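To see how much this answer depends on the rarity of the disease (which is why Lenny could not answer the question without it), here is a small Python sketch. The function name `chance_ill_given_positive` is hypothetical, and every prevalence other than 1 in 2000 is an invented value used only for comparison, not data about any real illness.

```python
def chance_ill_given_positive(prevalence, sensitivity=0.99, false_positive_rate=0.05):
    """Fraction of positive results that belong to people who are actually ill."""
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * false_positive_rate
    return true_positives / (true_positives + false_positives)


# 1 in 2000 is the figure from the problem; the other rates are made up
# to show how quickly the answer changes as the disease becomes more common.
for one_in in (2000, 200, 20, 2):
    chance = chance_ill_given_positive(1 / one_in)
    print(f"1 in {one_in:>4} ill: {chance:.2%} of positives are truly ill")
```

The test itself is the same in every row; only the base rate changes, and with it the meaning of a positive result.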

I show this in my class as an application of Bayes’ rule – although now I am wondering if the formal algebraic manipulations actually do more to obscure than to illuminate the intuitive reasons for this phenomenon.
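For comparison, the Bayes' rule version of the same computation might be written as follows. The notation is chosen here for illustration: $I$ is the event that the patient is ill, $H$ that the patient is healthy, and $+$ that the test is positive.

```latex
\[
P(I \mid +)
  = \frac{P(+ \mid I)\,P(I)}{P(+ \mid I)\,P(I) + P(+ \mid H)\,P(H)}
  = \frac{(0.99)(0.0005)}{(0.99)(0.0005) + (0.05)(0.9995)}
  = \frac{0.000495}{0.05047}
  \approx 0.0098
\]
```

Multiplying the numerator and denominator of the last fraction by 1 million recovers 495/50470, so the formula and the counting argument are the same calculation.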

Hi, I don’t understand why, in your conclusions, when you count the people who test positive, you use 500 + 49975 instead of 495 + 49975. Obviously the percentage for “the chance that a recipient of positive test result is sick” rises only a little, but I would like to know if I am wrong!
Thank u, your posts are always a shaking reading.