"Correlation does not imply causation" is the reminder that events or statistics that happen to coincide with each other are not necessarily causally related. In reality, cause and effect can be indirect, mediated by a third factor known as a confounding variable, or the coincidence can be entirely random. Assuming causation is unjustified when the only available evidence is a simple correlation; to establish causation, a controlled experiment must be performed.

Two events can consistently correlate with each other yet have no causal relationship. An example is the relationship between reading ability and shoe size across the whole population of the United States. Anyone performing such a survey would find that larger shoe sizes correlate with better reading ability, but this does not mean large feet cause good reading skills. Instead, the correlation arises because young children have small feet and have not yet (or only recently) been taught to read. In this case, both variables are more accurately correlated with a third: age.[citation NOT needed]

The part age plays in this example is known as a "confounding variable" or "confounding factor": something that is not being controlled for in the experiment. In this case, age influences both reading ability and shoe size quite directly. A confounding variable can be the actual cause of an observed correlation, so any study must take such variables into account and find ways of dealing with them, usually by identifying them and controlling for them directly.

The most common way of dealing with confounding variables is the controlled study. In these studies, the differences between the observation group and the control group are minimised as far as possible, so that one can be more confident that a correlation is a valid indicator of causation. This is extremely important in compensating for the placebo effect in medical trials, but it matters in other branches of science too. In the age/shoe-size/reading-ability example, a controlled experiment would look for a correlation between reading ability and shoe size in a sample of people who are all the same age - or, alternatively, the hypothesis could be further tested by correlating age and reading ability in a sample of similar shoe size.
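The shoe-size example can be sketched as a small simulation (all numbers here are made up purely for illustration): a common cause produces a strong raw correlation that nearly vanishes once the confounder is held roughly fixed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical model: age drives BOTH shoe size and reading score.
n = 10_000
age = rng.uniform(5, 12, n)                     # children aged 5-12
shoe_size = 0.8 * age + rng.normal(0, 0.5, n)   # feet grow with age
reading = 10 * age + rng.normal(0, 5, n)        # reading improves with age

# The raw correlation looks impressive...
raw = np.corrcoef(shoe_size, reading)[0, 1]
print(raw)  # strongly positive

# ...but "controlling for age" by restricting to a narrow age band
# (a crude stand-in for a same-age sample) makes it collapse.
band = (age >= 8) & (age < 8.5)
controlled = np.corrcoef(shoe_size[band], reading[band])[0, 1]
print(controlled)  # much weaker - close to zero
```

The same trick run the other way (fixing shoe size and correlating age with reading) would leave a strong correlation, which is how the controlled design distinguishes the real driver from the confounded one.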

The term "risk factor" is used in medicine to mean "something that is positively correlated with a disease or condition." For instance, obesity is a risk factor for Type 2 diabetes. The term is often incorrectly understood to mean "cause" (e.g. "I'm at risk for diabetes? But I'm not fat!"). Alternatively, a clear risk factor can be disputed on the grounds that it is not a definitive cause - a classic use of the uncertainty tactic (e.g. "I smoke three packs a day and I don't have cancer!").

Even when there is a correlation, the risk factor does not necessarily run both ways: obesity is a risk factor for Type 2 diabetes, but untreated diabetes can, in certain situations, cause weight loss, as excess glucose is excreted in urine.[2] In other words, correlation combined with a specific temporal relationship does, in fact, imply causation.[3] Taking insulin can cause weight gain, as the previously wasted sugars can now be metabolized, but that risk factor correlates with the treatment, not the disease.[4]

In science, correlation studies are often used to test for the existence of interesting patterns, but they are never used on their own to claim a cause. To make a causal claim, you must run an experiment or series of experiments and further studies using the scientific method — i.e., test whether it really is a cause by altering parameters, performing more experiments, and making and testing predictions. This validates that one event really is directly influencing the other and is the reason behind the detected correlation.

Many woo and pseudoscience pushers conflate correlation with causation in order to claim validity, but neglect the later scientific steps of compensating for confounding variables and thoroughly testing the causal relationship. For example, someone who catches a cold and takes vitamin C will find that the cold goes away, and then claim that the vitamin C made it go away. But the cold would have gone away anyway, whether or not the vitamin C was taken, so the claim of validity is false. The placebo effect is another correlation with "treatment" that quacks use to create false validity.

Correlations seem to tap into a deep part of human psychology. As pattern-recognition machines, we are hyper-responsive to any potential signal in our environment. People will often take two completely unrelated events and decide that they must cause each other because they seem to correlate. Someone may decide that whenever she wears a given shirt she has good luck; combined with a powerful confirmation bias, this creates magical thinking.

A graph of autism prevalence compared with organic food sales over time, "proving" that autism is caused by organic food.

In the "Church of the Flying Spaghetti Monster", a key "belief" is that global warming is caused by a lack of pirates (not the modern kind, like in Somalia, but the old-timey swashbuckling kind) sailing the oceans. This is shown by a graph correlating increasing surface temperatures of the earth with a decline in the number of pirates.[5] While it is certainly true that piracy has decreased and temperatures have gone up, there is nothing directly connecting the two trends. Or is there?

While correlation is only weakly correlated with causation, causation typically causes correlation.
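The forward arrow is easy to demonstrate with a toy simulation (the numbers here are invented for illustration): when one variable genuinely drives another, a strong correlation falls out for free, even with plenty of unrelated noise mixed in.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical causal model: x directly causes y (plus random noise).
n = 5_000
x = rng.normal(0, 1, n)                  # the cause
y = 2.0 * x + rng.normal(0, 1, n)        # the effect, plus unrelated noise

# Causation produces correlation: theoretically 2/sqrt(5) ~ 0.89 here.
r = np.corrcoef(x, y)[0, 1]
print(r)  # strongly positive
```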

Care should be taken not to assume the opposite - that correlation never implies causation. Causal hypotheses suggested by correlations are successfully postulated and tested every day, and most scientific theories would not exist without this part of the scientific method.

Woo spinners, when cornered and on the defensive, may try to claim that correlation never implies causation in order to avoid statistical analysis (where a correlation's implication of causation can be demonstrated). They dismiss the scientific analysis with an effortless handwave of "correlation doesn't imply causation." In arguing this, they are claiming to know the truth without any evidential support - and your number magic be damned!

For example, when presented with evidence that the use of a vaccine is followed by an almost complete reduction in infection rate, anti-vaccine proponents often discount this change as mere correlation - which ignores the sound methodology, the control of variables, the extremely low probability that the result arose by chance (low p-values), and the generally clear-cut evidence that the correlation is linked to a causal chain. Ironically, after casting doubt on this method, the antivaxers often use the very same method themselves (only without the hard work necessary to support their case). They may cite their own statistics showing an increase in cases of autism (or some other terror du jour) when a vaccine is introduced. If they search long enough, they may find some place at some time where this happened, and then present that example as representative of everywhere, always.
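To illustrate what "low p-values" buys you here, the following is a minimal sketch with entirely made-up trial numbers: under the null hypothesis that the vaccine does nothing, infections should split roughly evenly between two equal-sized groups, and a quick simulation shows just how improbable a lopsided split is by chance.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical (invented) trial: 10 infections among 10,000 vaccinated
# vs 100 infections among 10,000 unvaccinated.
infected_vax, infected_ctrl = 10, 100
total = infected_vax + infected_ctrl

# Null hypothesis: the vaccine does nothing, so each of the 110
# infections is equally likely to land in either equal-sized group.
sims = rng.binomial(total, 0.5, 200_000)   # infections landing in vax group
p_value = np.mean(sims <= infected_vax)    # chance of a split this lopsided

print(p_value)  # vanishingly small: chance alone is a terrible explanation
```

A p-value this small is exactly the kind of result that cannot honestly be waved away as "mere correlation" - dismissing it requires ignoring the arithmetic, not just the rhetoric.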

This tactic is also often used by climate change deniers ("the highly statistically significant association of greenhouse gases and global temperature does not imply causation") as well as every type of woo spinner you can imagine (shockingly including even some economists and scientists).