
November 29, 2013

When it comes to arguing a position, convincing someone to buy a product, or proving your innocence in court, few things are more convincing than good ol' statistical data.

Of course, as high school math teachers and savvy internet commenters agree, statistics are malleable and easy to abuse. In his 1954 book How to Lie With Statistics, Darrell Huff offers great tips on how to spot misleading data. They're still relevant today.

Cherry picking

What it is: Selecting data that supports your position while ignoring data that contradicts it.

How people lie with it: This happens all the time in medical journals, Ben Goldacre explained in a recent TED Talk. When scientists develop a drug, they must send it through rigorous trials, first on lab mice, then on patients. The problem, says Goldacre, is that only about half of drug studies ever make it to publication, and positive findings are twice as likely to be published as negative findings about the same drug. The result is that doctors and medical professionals who read the publications wind up thinking drugs are a lot safer and more effective than they actually are.

As an example, Goldacre looked at all the trials researchers submitted to the FDA involving a group of antidepressants, to see which ones made it to publication in peer-reviewed academic journals. First, he looked at the results of the submitted trials, and found 38 came back positive, while 36 came back negative, almost a 50/50 split. He then looked at which made it to print, and found that 37 of the positive trials made the cut, while only 3 of the negative ones did. You can see how a doctor brushing up on the latest antidepressant might find himself misled.
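To make the size of that gap concrete, here is a minimal Python sketch of the arithmetic, using only the four trial counts quoted above. The variable names are ours, chosen for illustration.

```python
# Trial counts as quoted in the text above.
positive_total, negative_total = 38, 36          # all trials submitted to the FDA
positive_published, negative_published = 37, 3   # trials that made it to print

# Share of positive results among all trials vs. among published trials.
share_all = positive_total / (positive_total + negative_total)
share_published = positive_published / (positive_published + negative_published)

# Publication rate for each kind of result.
rate_positive = positive_published / positive_total
rate_negative = negative_published / negative_total

print(f"Positive share, all trials:       {share_all:.1%}")
print(f"Positive share, published trials: {share_published:.1%}")
print(f"Published: {rate_positive:.1%} of positive vs {rate_negative:.1%} of negative trials")
```

On these counts, a reader who sees only the journals finds positive results in roughly nine out of ten published trials, even though the full set was close to an even split.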

Great for: Marketing

Selecting the "right" average

What it is: Selecting the type of average that best supports your position: the mean, median, or mode. If you're rusty, the mean is the arithmetic average. To find it, you add all the figures up, then divide by the number of figures. The median is the number smack dab in the middle of a sample, where half are higher, and half lower. And the mode is simply the number that appears the most in any given sample.

How people lie with it: Simply pick the average that best suits your position. Huff explains this gives you a lot of wiggle room when talking about the "average" salary in a given area. Say you live in a kind of crummy town with many poor people, and a few uber-rich people. You want to coax your friends into moving to your town, and you happen to have a good sample of income data. If you calculate the mean salary, the rich people will pull up the number, and you'll wind up with a pretty solid average salary that makes your town seem inviting. The median, in contrast, would probably provide a more "honest" average.

As Huff explains, when using the mean salary, "Nearly everyone is below average."
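Here is a small Python sketch of that effect. The ten salaries are invented for illustration (nine modest incomes and one very large one); they are not taken from Huff's book or from any real town.

```python
import statistics

# Hypothetical annual salaries: nine modest incomes and one very rich resident.
salaries = [22_000, 25_000, 25_000, 25_000, 28_000,
            30_000, 32_000, 35_000, 40_000, 1_500_000]

mean_salary = statistics.mean(salaries)      # pulled up by the one huge income
median_salary = statistics.median(salaries)  # middle of the sorted sample
mode_salary = statistics.mode(salaries)      # most frequent value

# Count how many residents earn less than the "average" (mean) salary.
below_mean = sum(1 for s in salaries if s < mean_salary)

print(f"Mean:   {mean_salary:,.0f}")
print(f"Median: {median_salary:,.0f}")
print(f"Mode:   {mode_salary:,.0f}")
print(f"{below_mean} of {len(salaries)} residents earn less than the mean")
```

With this made-up sample, the mean comes out around six times the median, and nine of the ten residents fall below it, which is exactly the situation the quote describes.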

Great for: Real estate, business accounting

Ignoring causality

What it is: This is the one that gets a lot of hand-slaps. People often point out that "correlation doesn't imply causation." But what does it imply?

Take drugs, for example. Let's say you discover that the number of people smoking marijuana in a neighborhood over a five-year period is correlated with the number of people who wind up in the hospital each year over the same period. With only that data, there are six possibilities:

Marijuana use leads to hospital visits in your neighborhood

Hospital visits lead to marijuana use in your neighborhood

Marijuana use and hospital visits both partly cause each other

Marijuana use and hospital visits are both caused by a third factor. In this case, maybe the population in your neighborhood has also risen dramatically in that five-year span.

An uptick in hospital visits is caused by the increased population, which is correlated with marijuana use as well

The observed correlation was due purely to chance

How people lie with it: Someone who wants to make a case that marijuana makes the neighborhood dangerous could just pointedly raise his eyebrows and say, "More people are smoking marijuana in this neighborhood than ever before. More people are also winding up in the hospital. Coincidence?"
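The "third factor" possibility is easy to demonstrate with a short Python sketch. All of the numbers below are invented: a neighborhood whose population grows over five years while roughly 5 percent of residents smoke marijuana and roughly 8 percent visit the hospital each year. The `pearson` helper is our own, written out so the example does not depend on a particular Python version.

```python
import math

# Hypothetical five-year data for one neighborhood (all figures invented).
population = [10_000, 12_000, 15_000, 19_000, 24_000]
smokers = [500, 610, 740, 960, 1_190]       # roughly 5% of residents each year
visits = [800, 950, 1_210, 1_510, 1_930]    # roughly 8% of residents each year

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Raw counts rise together because both track the growing population.
r_counts = pearson(smokers, visits)

# Per-resident rates, which remove the effect of population growth.
smoker_rates = [s / p for s, p in zip(smokers, population)]
visit_rates = [v / p for v, p in zip(visits, population)]

print(f"Correlation of raw counts: {r_counts:.3f}")
print("Smoking rate by year:", [f"{r:.1%}" for r in smoker_rates])
print("Hospital rate by year:", [f"{r:.1%}" for r in visit_rates])
```

In this made-up neighborhood the raw counts are almost perfectly correlated, yet the share of residents who smoke and the share who land in the hospital barely move from year to year. The eyebrow-raiser quoting raw counts is really just reporting that more people live there.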