In findings of statistical correlation, beware 'proof' that isn't

We are routinely bombarded by claims that have been “proven” with statistics. Here are tips on judging these claims.

Surprising results get headlines. “Did you hear that hurricanes with female names are more deadly? Who knew?!!” An Internet search on this report, published last week, turns up thousands of citations.

“That’s why autism is on the rise!! It’s the vaccines!” The 1998 study making this claim about the measles, mumps and rubella vaccine got a lot more ink than The Lancet’s retraction, issued after the journal learned the results were fraudulent.

Pure fabrication may be rare, but many studies are published with claims that should be served with many grains of salt. The first question to ask: “Is there enough data?”

“Is there evidence that property values fall when school budgets are rejected by the voters?” Asked that question last week, I had to disappoint my caller: Because relatively few school budget votes fail each year, the sample size is too small to test the connection between budget votes and home values.

Let’s consider the “deadly female hurricane” finding. When I was growing up, all hurricanes had female names. So by necessity the study starts with the period after 1978, when male names went into the lineup; that’s 53 hurricanes. Since the study excluded Katrina, the researchers were left with 52 instances in which a hurricane could be assigned a male or a female name. If this were a simple event, with the death toll the only outcome and the gender of the assigned name the only variable, then it is possible that you could draw a statistically valid conclusion from 52 observations.

But a hurricane is clearly not a simple event. The death toll is dependent on many factors—the intensity of the storm, the population density of the affected communities, the time of day when the hurricane hit the community at greatest risk, whether the hurricane’s course was predicted in time, the geographic extent of the storm, and so on. The gender of the storm’s name may be one factor affecting the death toll, but it is only one among many. For the same explanatory power, more variables require more observations. The researchers’ hypothesis could be perfectly correct. But with so little data, the association is simply unsupportable. The University of Illinois researchers insist that the study was not intended to be a joke; it should have been.

We care about the size of the sample because statistics is all about relative probability: How likely is it that what we observe could have occurred by chance?

Suppose you speculate that cars painted white are more visible to other drivers and thus are involved in fewer accidents. You compile car color for the 110 accidents that occurred in New Hampshire in 2009 and discover that 10 percent of vehicles in accidents were painted white. If 15 percent of the cars registered in New Hampshire are white, the evidence supports your hypothesis.

So is the matter proven? No. In 110 accidents, it would not be that surprising for the white-car accident rate to be small just by chance. Encouraged by the New Hampshire result, you study all 33,808 motor vehicle accidents that occurred nationwide in 2009 and get the same result. That certainly strengthens your case.

So now is the matter proven? Nothing is ever “proven” to a statistician. But with a sample of 33,808 accidents, a substantially lower accident rate for white cars is very unlikely to have occurred just by chance.
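The arithmetic behind that contrast can be sketched with a binomial calculation: if 15 percent of registered cars are white, how likely is it to see only about 10 percent white cars among the accidents by luck alone? The counts 110 and 33,808 come from the column; the code below is an illustrative sketch of the probability calculation, not part of any actual study.

```python
from math import lgamma, log, exp

def log_binom_pmf(n: int, k: int, p: float) -> float:
    """Log of the binomial probability of exactly k successes in n trials."""
    return (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
            + k * log(p) + (n - k) * log(1 - p))

def binom_tail(n: int, k: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p): the chance of k or fewer by luck alone."""
    return sum(exp(log_binom_pmf(n, i, p)) for i in range(k + 1))

P_WHITE = 0.15  # share of registered cars that are white, from the example

# New Hampshire: roughly 10 percent of 110 accidents involved white cars.
p_nh = binom_tail(110, 11, P_WHITE)
# Nationwide: the same 10 percent share, but out of 33,808 accidents.
p_us = binom_tail(33808, 3381, P_WHITE)

print(f"chance of the small-sample result:  {p_nh:.3f}")   # not rare at all
print(f"chance of the large-sample result:  {p_us:.3g}")   # vanishingly small
```

The same 10-percent share that could easily arise by chance in 110 accidents is essentially impossible by chance in 33,808, which is exactly the column's point about sample size.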

Statisticians turn “guilt by association” into science. The probability of a particular association occurring by chance is reported according to levels of “significance.” A finding reported at the 10 percent significance level, for example, is one that would occur by chance 10 times out of 100 even if no real relationship existed. The 5 percent significance level means that the observed result would occur by chance just five times out of 100. Those 10 or five random associations, which have nothing to do with causation, are called “false positives”: it appears there is a connection, but it doesn’t really exist.

In this era of “big data,” expect that studies of disease will mine new data sets for associations between specific conditions or diseases and other characteristics of the population such as diet or lifestyle. We’ve learned a lot from this approach in the past—and we have high hopes for the future.

But be aware of the risk. Suppose we want to know what factors are associated with rheumatoid arthritis. If we are testing at the 10 percent significance level, we can expect that 10 of 100 of our “explanatory” variables will appear to be associated with RA just by chance. At the 5 percent level, five of every 100 “promising” associations will turn out to be dead ends. Thus, we may explore a data set in which RA is statistically associated with a high-protein diet (e.g., Atkins or paleo) or growing up in an older home. Data “mining” can be helpful because some of the observed associations are “true” and may lead to a new understanding of the cause of disease. But we know in advance that some of the correlations are spurious.
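That expectation of roughly five phantom hits per 100 candidates is easy to check with a simulation. The sketch below is mine, not the column's: it screens candidate “explanatory” variables that are pure noise against a pure-noise outcome at the 5 percent level, so every variable it flags is by construction a false positive.

```python
import random

def false_positive_count(variables: int, n: int,
                         threshold: float = 1.96, rng=random) -> int:
    """Count pure-noise variables that clear a z-test threshold by luck alone.

    Each "variable" is compared between n cases and n controls, all drawn
    from the same standard normal distribution, so no real association exists.
    """
    hits = 0
    for _ in range(variables):
        cases = [rng.gauss(0, 1) for _ in range(n)]
        controls = [rng.gauss(0, 1) for _ in range(n)]
        # z statistic for the difference in means (variance 1 by construction)
        z = (sum(cases) / n - sum(controls) / n) / (2 / n) ** 0.5
        if abs(z) > threshold:  # 1.96 is the two-sided 5 percent cutoff
            hits += 1
    return hits

random.seed(2014)  # any seed works; fixed here for repeatability
hits = false_positive_count(variables=100, n=200)
print(f"{hits} of 100 noise variables look 'significant' at the 5 percent level")
```

On average about five of the 100 noise variables clear the bar on any given run, which is the data-mining hazard the column describes: some “promising” associations are guaranteed to be dead ends.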

Good science is based on replication. If, say, rheumatoid arthritis and an older home are associated in one data set, this should prompt further study, not a press release. The vaccine scare illustrates the damage that can be done by the premature release of unconfirmed results: This year has seen the highest incidence of measles in the United States in 18 years, and the World Health Organization reports that the incidence of measles in Europe was up 348 percent between 2007 and 2013.

Tyler Vigen, a Harvard law student, has posted a collection of statistical associations for our amusement at www.tylervigen.com: Want to reduce divorce in Maine? It’s all about margarine. Troubled about the mysterious die-off of bee colonies? Look no further than juvenile arrests for marijuana possession.

As “big data” brings us access to ever more numbers, let’s remember that statistical correlation is just math. Correlation is a beginning, not a conclusion.
Kent Gardner is chief economist and chief research officer of the Center for Governmental Research Inc.

What You're Saying

Tony Bateson at 2:32:27 PM on 7/19/2014

Of course Ken Gardner can have fun posting stuff about 'associations' but it is rather more than that when it comes to vaccines and autism. It is a hugely complex subject and I rather doubt from Kent Gardner's basic assertions that he can be qualified to even debate this question.

Let's start with Andrew Wakefield. Please read the 1998 Lancet paper which gave rise to much fuss about this matter. That it is fuss or even froth is clear from the absolute fact that Andrew Wakefield did not say or even suggest that vaccines cause autism in that paper. He simply concluded that more research was needed. Nothing more.

Personally I believe vaccines may have been the only cause of autism. But dark forces, never mind conspiracy, are simply protecting their $300 billion a year estate by laying every conceivable false scent trail in any direction but the only one that might put them in the dock. However the argument now is squarely at the door of ethyl mercury in vaccines. This second most poisonous toxin is perfectly safe according to Pharma whilst methyl mercury the kind you might get from sea food is, of course, highly poisonous. So it wasn't the vaccines at all it was that all these autistic kids must have overdosed on sea food. Funny the US EPA doesn't seem to accept this. Any more than I do.