How Correlation Does and Doesn’t Imply Causation

In my last article I discussed the popular rule “absence of evidence isn’t evidence of absence,” often associated with the argument from ignorance. However, this mantra, like almost every other, has its exceptions and is often used incorrectly.

It’s often too easy to find some correlation that helps your argument. Does that mean such data is useless?

Another pet peeve of mine is when people ignorantly recite “correlation doesn’t imply causation.” This maxim is of course correct when using ‘imply’ as defined in logic. But most people have a looser definition, and way too often it’s used as a pseudo-logical wave of the hand in order to ignore any evidence we don’t like.

Let me give you a favorite example of mine.

“Does socialized/government healthcare work better than privatized?” One of the best, or maybe worst, misunderstandings of correlation I’ve had the (mis)fortune of witnessing came when I recited the fact that countries with public healthcare have healthier populations, and pay less for this health, than countries where it is privatized, like the US. In other words, the health of a population positively correlates with public health care.

“But correlation doesn’t mean causation, Paul!”

“Of course! But do you really think that health care and health aren’t causally related?”

Correlation is a very powerful bit of evidence when discussing the causal relationship between variables. So please, for the love of all that is rational, make sure you know when to and when not to bring up this popular mantra.

Previously Established Causation

In my healthcare example we both already have strong reasons to believe that health care and health are causally related. There are of course other things that affect a population’s health, though. Europe, for instance, has a lot more smokers than the US, although the US has a much bigger problem with obesity. These are social and cultural issues that greatly affect health regardless of whether the system is public or private.

Most people would still agree that health care is one of the biggest factors in determining how healthy people are on average – there is a strong causal relationship. As long as we continue to agree about this, data showing that a public system strongly correlates with a healthy population is a strong bit of evidence in our debate.

Pieces of Evidence Only Ever Hint

No one single bit of evidence of any kind will ever ‘prove’ anything – if by prove you mean remove all shadow of a doubt. Even a mountain of evidence doesn’t really prove anything with 100% confidence or provide a logical proof. In the section above, even when everyone has agreed that two things are causally related, data on how the variables correlate is still only just one bit of evidence among many.

But what if there isn’t any pre-established causal link? You take X and then you take some Y and voila! They just so happen to correlate. And of course, we can always come up with some story post hoc of how they’re related. Pastafarianism is beautifully built on satire of such correlations.

But even in such post hoc situations, correlations are still evidence! The correlation in the graph above is some, however small, evidence that the decline in pirates is causally related to global warming. What needs to be understood is that this is weak evidence. Evidence comes in different strengths. It hints that there might be some kind of causal relationship working in the background. More evidence is needed.

The role of correlation in these situations should be to spur further questions and studies.

Further, it is important to remember that you can have evidence, even strong evidence, for something that is false. Take for instance a bit of lipstick, a partial kiss, on your husband’s shirt collar. This is somewhat strong evidence that he is cheating. However, despite the evidence, he isn’t. It turns out that some woman tripped in his office and he caught her as her face awkwardly brushed against his shoulder.

Again, while correlation is always evidence hinting at a causal relationship, it doesn’t prove anything. But that doesn’t take away from the fact that it is still evidence.
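For readers who like to see the idea in numbers, here is a rough sketch of what “evidence comes in different strengths” could look like as a simple Bayesian update. Everything in it is my own illustration: the function name update, the priors, and the likelihood ratios are invented, not measured from anything.

```python
def update(prior, likelihood_ratio):
    """Return the probability of a hypothesis after one piece of evidence.

    likelihood_ratio is assumed to be
    P(evidence | hypothesis true) / P(evidence | hypothesis false).
    """
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


# All numbers below are made up purely for illustration.
# Weak evidence: a post hoc correlation (pirates and global warming).
weak = update(prior=0.001, likelihood_ratio=1.5)

# Stronger evidence: lipstick on the shirt collar.
strong = update(prior=0.05, likelihood_ratio=10.0)

print(f"weak evidence:   0.001 -> {weak:.4f}")
print(f"strong evidence: 0.05  -> {strong:.4f}")
```

With these invented numbers, both pieces of evidence raise the probability of the hypothesis, the stronger one by much more, and neither gets anywhere near certainty. That is all “evidence, not proof” is meant to say.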

Kinds of Causal Links

I’d lastly like to point out that correlation cannot, in principle, comment on the kind of causation that exists between two variables. Correlation is always evidence of causation, but cannot be evidence for the kind of causation. This distinction is probably the most confused part of any sloppy use of the ‘correlation doesn’t equal causation’ mantra.

First, and most obviously, correlation can’t tell us whether A causes B or B causes A. It doesn’t comment on the direction of causation. A popular example is the relationship between atheism and education. If you’re an atheist then it’s more likely you’re educated, and vice versa. The two are clearly correlated, but atheists and theists interpret this fact differently. While atheists argue that the more you learn the harder it becomes to deny the truth that god doesn’t exist, most theists switch the causal direction and argue that theistic conviction limits your educational opportunities since educational institutions are so strongly biased.

Second, it can’t tell us if it is a necessary cause or a sufficient cause. A necessary cause is simply when one thing is needed for another to happen or exist. An example is the relationship between breathing oxygen and life, which positively correlate. You need a ton of things in order to live – water, warmth, food, and I’d even say friendship – but if you do not also have oxygen then say goodbye. Oxygen is necessary for life, while life isn’t necessary for oxygen.

Sufficient, in comparison, just means that you don’t need anything else. A classic example is fire and smoke, which also positively correlate. You don’t need fire to cause smoke since you could make it with dry ice. If you did have fire, though, then you wouldn’t need anything else. Fire is sufficient to cause smoke.

Third, correlation can’t tell us if one thing causes the other or they have a common cause. Maybe A causes B, or vice versa. But alternatively, the two could be correlated because both are caused by some third thing, C. I have a cold, for instance. I’m tired and my nose is really runny. Over the past month, these two things have correlated very closely. At first, I was neither. Then I gradually became tired while I simultaneously had a more runny nose. Finally, both started to wane at the exact same time. Wow! I wonder if the runny nose causes the tiredness or vice versa? Maybe I was so tired because I was constantly blowing my nose? Of course the answer is that they’re instead two effects of a common cause.
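To make the point concrete, here is a small simulation sketch. It builds three made-up worlds (A causes B, B causes A, and a common cause C behind both) and measures the correlation in each. The variable names, noise levels, and sample size are my own assumptions, chosen only so the three worlds come out looking alike. The last step assumes we actually get to measure C, which in real life we often don’t.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def residuals(ys, xs):
    """What is left of ys after removing a straight-line fit on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]


rng = random.Random(0)  # fixed seed so the run is repeatable
N = 5000

# World 1: A causes B.
a1 = [rng.gauss(0, 1) for _ in range(N)]
b1 = [a + rng.gauss(0, 1) for a in a1]

# World 2: B causes A.
b2 = [rng.gauss(0, 1) for _ in range(N)]
a2 = [b + rng.gauss(0, 1) for b in b2]

# World 3: a common cause C (the cold) drives both
# A (tiredness) and B (runny nose); A and B never touch each other.
c3 = [rng.gauss(0, 1) for _ in range(N)]
a3 = [c + rng.gauss(0, 0.64) for c in c3]
b3 = [c + rng.gauss(0, 0.64) for c in c3]

r_a_causes_b = pearson(a1, b1)
r_b_causes_a = pearson(a2, b2)
r_common = pearson(a3, b3)

# If C happens to be measured, we can subtract its effect from A and B
# and correlate what is left over.
r_after_removing_c = pearson(residuals(a3, c3), residuals(b3, c3))

print("A causes B:        ", round(r_a_causes_b, 2))
print("B causes A:        ", round(r_b_causes_a, 2))
print("common cause C:    ", round(r_common, 2))
print("after removing C:  ", round(r_after_removing_c, 2))
```

The three worlds are built to give roughly the same correlation, so the number alone can’t say which story is true. Only the extra step, looking at what remains once the cold itself is accounted for, separates the common-cause world from the other two. That is the “look for more evidence” part.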

I End with a Plea

Pretty, pretty please think before you dismiss data about correlation. It certainly doesn’t ‘prove’ anything. Nothing does. However, that doesn’t negate the important role it plays as evidence of causality. As Hume argued, all we really even have is correlation! We can’t observe ‘causation’ itself. In some sense, correlation is the ONLY thing that implies causation.

If you hear two things correlate closely, take it seriously. Then look for more evidence!

Paul Chiariello graduated from Rutgers in 2009 after studying Philosophy and Anthropology. Currently he is on the Board of Directors of the Rutgers Humanist Community, Co-founder of the Yale Humanist Community, and Director of the Humanism & Philo Curriculum for Camp Quest. Paul has an MSc in Sociology of Edu from Oxford, completing his field research in Bosnia on religious identity conflict. He also spent a year studying philosophy of ethics and religion at Yale on a PhD fellowship. He has worked with research organizations and schools in DC, the UN, Uganda, Kenya, India, Indonesia and Germany.