2009-07-21

Absence of Evidence is Evidence of Absence

if the evidence had a chance to show up, i.e., if someone looked for it.

if evidence is not understood as proof but as an indication. Of course, absence of proof is not proof of absence.

I recently found a really nice proof of this statement using conditional probabilities. I found its symbols and layout less than ideal, so I created my own version of the proof:
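In case it helps, here is a compact sketch of how such a proof can be written down. It is not necessarily identical to the version referred to above. It assumes 0 < P(E) < 1, so that both conditional probabilities are defined, and writes \neg for "not".

```latex
% Definition: E is evidence for X if
\[ P(X \mid E) > P(X \mid \neg E). \]

% Claim: then the absence of E is evidence for the absence of X.
\begin{align*}
P(\neg X \mid \neg E) &= 1 - P(X \mid \neg E) \\
                      &> 1 - P(X \mid E) \\
                      &= P(\neg X \mid E).
\end{align*}

% The prior is a weighted average of the two conditional probabilities,
\[ P(X) = P(X \mid E)\,P(E) + P(X \mid \neg E)\,P(\neg E), \]
% so it lies strictly between them:
\[ P(X \mid \neg E) < P(X) < P(X \mid E). \]
```

The last line says that not observing E pushes the probability of X below its prior value, which is exactly what "evidence of absence" means here.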

What I like especially about this proof is that it first explicitly defines what evidence is: some observation E is evidence for a hypothesis X if X is more likely to be true given that E is observed than given that E is not observed, i.e., P(X|E) > P(X|not E). If this holds, observing E increases the probability that X is true. Using Bayes' Theorem, you can calculate precisely by how much. Assume you assigned some a priori probability P(X) to X. The a posteriori probability, i.e., the probability that X is true given that you observed E, is

P(X|E) = P(E|X)*P(X) / ( P(E|X)*P(X) + P(E|not X)*P(not X) ).

It is easy to see that the larger the ratio P(E|X) / P(E|not X) is, the larger the "shift" from a priori to a posteriori probability will be. If it is likely to observe E given that X is true, say P(E|X) = 80%, and at the same time unlikely to observe E given that X is false, say P(E|not X) = 0.1%, then observing E is strong evidence. On the other hand, if P(E|X) is only slightly greater than P(E|not X), say 80% vs. 75%, E is weak evidence.
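To make the difference between strong and weak evidence concrete, here is a minimal Python sketch of the formula above. The function name `posterior` and the prior of 10% are assumptions chosen for illustration; only the 80%, 0.1% and 75% figures come from the text.

```python
def posterior(prior, p_e_given_x, p_e_given_not_x):
    """Return P(X|E) from P(X), P(E|X) and P(E|not X) via Bayes' Theorem."""
    numerator = p_e_given_x * prior
    denominator = numerator + p_e_given_not_x * (1.0 - prior)
    return numerator / denominator


prior = 0.10  # assumed a priori probability P(X), purely illustrative

# Strong evidence: P(E|X) = 80%, P(E|not X) = 0.1%
strong = posterior(prior, 0.80, 0.001)

# Weak evidence: P(E|X) = 80%, P(E|not X) = 75%
weak = posterior(prior, 0.80, 0.75)

print(f"prior:                 {prior:.3f}")
print(f"after strong evidence: {strong:.3f}")  # roughly 0.989
print(f"after weak evidence:   {weak:.3f}")    # roughly 0.106
```

With the same prior, the strong evidence moves the probability from 10% to almost 99%, while the weak evidence barely moves it at all.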

Knowing what strong and weak evidence are, and knowing that absence of evidence is evidence of absence, helps you evaluate lots of things. I find these facts quite intuitive, but it helps to be sure of them on a formal basis.

For example, suppose your friend John tells you about a new alternative medical treatment that helped him get rid of his cold for just 50€. In that case, the hypothesis X is that the treatment is efficacious and the evidence E is that John recovered from his cold. Well, you know that John would probably have recovered anyway, so P(E|not X) is nearly 1. Therefore, however high P(E|X) is, E can only be very weak evidence. This, of course, matches the intuition that you cannot rely on an individual case to judge such a treatment.

Some time later you read about a double-blind, placebo-controlled clinical trial in which the treatment John was talking about was tested. It showed no statistically significant effect, i.e., the result did not contradict the null hypothesis that the treatment has no effect beyond that of a placebo. (The result would have contradicted the null hypothesis if, assuming the null hypothesis is true, a result at least this extreme had been less likely than the significance level used, usually 5%.) This absence of evidence of the treatment's efficacy is of course no proof that it is inefficacious. But it is evidence for it. Again, this matches the intuition: they looked for efficacy and found none, so the absence of efficacy is more likely now than before they looked.
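The trial example can be put into numbers with a short Python sketch. Everything except the 5% significance level is an assumption made for illustration: a prior of 50% that the treatment works, and a trial that would detect a real effect 80% of the time (its statistical power). The observation here is the null result, so the probabilities passed in are those of seeing no significant effect.

```python
def posterior(prior, p_obs_given_x, p_obs_given_not_x):
    """Return P(X|obs) from P(X), P(obs|X) and P(obs|not X)."""
    numerator = p_obs_given_x * prior
    return numerator / (numerator + p_obs_given_not_x * (1.0 - prior))


prior = 0.50  # assumed a priori probability that the treatment is efficacious
power = 0.80  # assumed P(significant result | treatment is efficacious)
alpha = 0.05  # P(significant result | treatment is not efficacious)

# The observation is the ABSENCE of a significant result:
#   P(no significant result | X)     = 1 - power
#   P(no significant result | not X) = 1 - alpha
p_x_after_null_result = posterior(prior, 1.0 - power, 1.0 - alpha)

print(f"P(X) before the trial:     {prior:.3f}")
print(f"P(X) after a null result:  {p_x_after_null_result:.3f}")  # roughly 0.174
```

Under these assumptions the null result lowers the probability of efficacy from 50% to about 17%: not a proof of inefficacy, but clearly evidence for it. A trial with lower power would shift the probability less.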

What if John tells you about a new lifestyle drug that turns your skin green (hypothesis X)? He took it and his skin turned green (evidence E). Your intuition tells you that this should be strong evidence, and it is. Assuming that John is telling the truth, you can be pretty sure about the efficacy of that drug because the probability P(E|not X) that he would turn green without the drug being efficacious is practically zero.
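The two anecdotes can be compared through the ratio P(E|X) / P(E|not X), using the odds form of Bayes' Theorem (posterior odds = prior odds times this ratio). The sketch below is only an illustration: the helper names and all concrete numbers (a skeptical prior of 1%, P(E|not X) = 95% for the cold, one in a million for spontaneously green skin) are assumptions, not figures from the stories.

```python
def likelihood_ratio(p_e_given_x, p_e_given_not_x):
    """Return P(E|X) / P(E|not X), a measure of the strength of evidence E."""
    return p_e_given_x / p_e_given_not_x


def update(prior, ratio):
    """Turn a prior probability into a posterior one via the odds form."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * ratio
    return posterior_odds / (1.0 + posterior_odds)


prior = 0.01  # assumed skeptical prior for both claims

# Cold: even if the treatment always worked (P(E|X) = 1), John would
# very likely have recovered anyway (assumed P(E|not X) = 0.95).
cold_ratio = likelihood_ratio(1.0, 0.95)

# Green skin: assumed P(E|X) = 0.8 and P(E|not X) = one in a million.
green_ratio = likelihood_ratio(0.8, 1e-6)

print(f"cold:  ratio {cold_ratio:.3f}, posterior {update(prior, cold_ratio):.4f}")
print(f"green: ratio {green_ratio:.0f}, posterior {update(prior, green_ratio):.4f}")
```

Because P(E|X) can never exceed 1, the ratio for the cold anecdote is capped at 1 / 0.95, or about 1.05, no matter how good the treatment is. The green-skin ratio is in the hundreds of thousands, which is why a single case is convincing there.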

About me

I am a PhD student at the Department of Computer Science at Humboldt University of Berlin, Germany. My thesis topic is "Test-Driven Language Modeling". You can read more about it on my professional homepage.