Saturday, June 28, 2014

On the radio the other day, I heard a mathematician warn against interpreting
probability results too hastily. Here's the example he gave.

Imagine a population of 100,000 people, 100 of whom have a rare disease
(without knowing it), so that the remaining 99,900 are healthy. Imagine
moreover that there is a test that detects this disease with 99% accuracy:
this means that, on average, the test will give a positive result for
100 × 0.99 = 99 of the 100 sick persons and a negative result for
100 × (1 - 0.99) = 1 of them. It will also give a negative result for
99,900 × 0.99 = 98,901 healthy persons and a (false) positive result for
99,900 × (1 - 0.99) = 999 healthy persons.
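These counts are easy to check in a few lines of Python (this is not from the original broadcast, just a sanity check of the arithmetic, using the population and accuracy figures above):

```python
# Population and test parameters from the example above.
population = 100_000
sick = 100
healthy = population - sick   # 99,900
accuracy = 0.99

true_positives = sick * accuracy             # sick persons correctly flagged (~99)
false_negatives = sick * (1 - accuracy)      # sick persons the test misses (~1)
true_negatives = healthy * accuracy          # healthy persons correctly cleared (~98,901)
false_positives = healthy * (1 - accuracy)   # healthy persons wrongly flagged (~999)

print(true_positives, false_negatives, true_negatives, false_positives)
```

Note that the false positives (999) already outnumber the true positives (99) by a factor of ten, which is the heart of the problem.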

This means that of the 99 + 999 = 1,098 persons (out of the whole
population) who got a positive result, only 99 actually have the disease. In
other words, for a random person drawn from the whole population, a positive
test indicates only a 99 / 1,098 ≈ 9% probability of being sick: even with a
positive result, there is still a 91% chance of not being sick!
This result should be put into perspective by comparing the probability of
being sick before taking the test (0.1%) with the probability after getting a
positive result (9%, i.e., a 90 times higher chance of being sick). But it
also means that, because of the imbalance between the sick and healthy
populations, the test's 1% failure rate yields far more false positives among
the healthy population than true positives among the sick population.
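The same ~9% figure can be obtained directly from Bayes' theorem, P(sick | positive) = P(positive | sick) × P(sick) / P(positive), using the numbers from the example (again, a small illustrative sketch rather than anything from the original broadcast):

```python
# Prior probability of being sick: 100 out of 100,000.
p_sick = 100 / 100_000                 # 0.1%
p_pos_given_sick = 0.99                # test sensitivity
p_pos_given_healthy = 1 - 0.99         # false positive rate

# Total probability of a positive result (law of total probability).
p_pos = p_pos_given_sick * p_sick + p_pos_given_healthy * (1 - p_sick)

# Bayes' theorem: probability of being sick given a positive test.
posterior = p_pos_given_sick * p_sick / p_pos
print(posterior)   # ≈ 0.09, i.e., about 9%
```

The posterior equals the counting argument's 99 / 1,098: Bayes' theorem is just that count-based reasoning expressed in probabilities.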