“Many stories use relative risk reduction or benefit estimates without providing the absolute data. So, in other words, a drug is said to reduce the risk of hip fracture by 50% (relative risk reduction), without ever explaining that it’s a reduction from 2 fractures in 100 untreated women down to 1 fracture in 100 treated women. Yes, that’s 50%, but in order to understand the true scope of the potential benefit, people need to know that it’s only a 1% absolute risk reduction (and that all the other 99 who didn’t benefit still had to pay and still ran the risk of side effects).

2. Association does not equal causation

A second key observation is that journalists often fail to explain the inherent limitations in observational studies – especially that they cannot establish cause and effect. They can point to a strong statistical association but they can’t prove that A causes B, or that if you do A you’ll be protected from B. But over and over we see news stories suggesting causal links. They use active verbs in inaccurately suggesting established benefits.

3. How we discuss screening tests

The third recurring problem I see in health news stories involves screening tests. … “Screening,” I believe, should only be used to refer to looking for problems in people who don’t have signs or symptoms or a family history. So it’s like going into Yankee Stadium filled with 50,000 people about whom you know very little and looking for disease in all of them. … I have heard women with breast cancer argue, for example, that mammograms saved their lives because they were found to have cancer just as their mothers did. I think that using “screening” in this context distorts the discussion because such a woman was obviously at higher risk because of her family history. She’s not just one of the 50,000 in the general population in the stadium. There were special reasons to look more closely in her. There may not be reasons to look more closely in the 49,999 others.”

Let’s discuss each of these in a little bit more depth. The first mistake is probably the most common one, where statistically significant findings are not put into clinical perspective. Let me explain. Suppose we are looking at a drug that prevents a rare illness. The base rate of this illness, which we will call Catachexia is 4 in 10,000 people. The drug reduces this illness to one in 10,000 people, a 75% decrease. Sounds good, right?

Not so fast. Let me add a few facts to this hypothetical case. Let’s imagine that the drug costs $10,000 a year, and also has some bad side effects. So in order to reduce the incidence from four people to one person in ten thousand, 9996 people who would never develop this rare but serious illness must be treated. The cost of doing so would be $99,960,000! Plus those 9996 people would be unnecessarily exposed to side effects.

So which headline sounds better to you?

New Drug Prevents 75% of Catachexia Cases!

Or

New Drug Lowers the Prevalence of Catachexia Cases by Three People per 10,000, at a Cost of Almost $100 Million Dollars

The first headline reflects a reporting of the relative risk reduction, without cost data, and the second headline reflects the absolute risk reduction, and the costs. The second headline is the only one that should be reported but unfortunately the first headline is much more typical in science and medical reporting.

The second error where association or correlation does not equal causation is terribly common as well. The best example of this is all of the studies looking at the health effects of coffee. Almost every week we get a different study that claims either a health benefit of coffee or a negative health impact of coffee. These findings are typically reported in the active tense such as, “drinking coffee makes you smarter.”

Of course the first headline is the one that will get reported, even though the second headline is equally inaccurate. Only the third headline accurately reports the findings.

The theoretical problem with any correlational study of two different variables is that we never know, nor can we ever know, what intervening variables might be correlated with each. Let me give you a classic example. There is a high correlation between the consumption of ice cream in Iowa and the death rate in Mumbai, India. This sounds pretty disturbing. Maybe those people in Iowa should stop eating ice cream. But of course the intervening variable here is summer heat. When the temperature goes up in Iowa people eat more ice cream. And when the temperature goes up in India, people are more likely to die.

The only way that one could actually verify a correlational finding would be to do a follow-up study where you randomly assign people to either consume or not consume the substance or food that you wish to test. The problem with this is that you would have to get coffee drinkers to agree not to drink coffee and non-coffee drinkers to agree to drink coffee, for example, which might be very difficult. But if you can do this with coffee, chocolate, broccoli, exercise, etc. then at least you could demonstrate a real causal effect. (I’ve oversimplified some of the complexity of controlled random assignment studies, but my point stands.)

The final distortion which involves confusion about screening tests is also very common, and unfortunately, incredibly complex. The main point that Schwitzer is trying to make here though is simple; screening tests are only those tests which are applied to a general population which is not at high risk for any illness. Evaluating the usefulness of screening tests must be done in the context of a low risk population, because that is how most screening tests are used. Most people don’t get colon cancer, breast cancer, or prostate cancer, even over 50. If you use a screening test only with high-risk individuals then it’s not really a screening test.

There is the whole other issue with reporting on screening tests that I’m only going to briefly mention because it’s so complicated and so controversial. It’s that many screening tests may do as much harm as good. Recently there has been a lot of discussion of screening for cancer, especially prostate and breast cancer. The dilemma with screening tests is that once you find cancer you almost always are obligated to treat it because of medical malpractice issues and psychological issues (“Get that cancer out of me!”) The problem with this automatic treatment is that current screening doesn’t distinguish between fast-growing dangerous tumors and very slow growing indolent tumors. Thus we may apply treatments which have considerable side effects or even mortality to tumors that would never harm the person.

Another problem is that screening often misses the onset of fast-growing dangerous tumors because they begin to grow between the screening tests. The bottom line is that screening for breast cancer and prostate cancer may have relatively little impact on the only statistic that counts – the cancer death rate. If we had screening tests that could distinguish between relatively harmless tumors and dangerous tumors then screening might be more helpful, but that is not where we are yet.

One more headline test. Which headline do you prefer?

Screening for Prostate Cancer Leads to Detection and Cure of Prostate Cancer

Or

Screening for Prostate Cancer Leads to Impotence and Incontinence in Many Men Who Would Never Die from Prostate Cancer

The first headline is the one that will get reported even though the second headline is scientifically more accurate.

I suggest that every time you see a health or medicine headline that you rewrite it in a more accurate way after you read the article. Remember to use absolute differences rather than relative differences, to report association instead of causation, and add in the side effects and costs of any suggested treatment or screening test. This will give you practice in reading health and medical research accurately.

Also remember the most important rule, one small study does not mean anything. It’s actually quite humorous how the media will seize upon a study, even though the study was based on 20 people and hasn’t been replicated or repeated by anybody. They also typically fail to put into context the results of one study versus other studies of the same thing. A great example is eggs and type II diabetes. The same researcher, Luc Djousse, published a study in 2008 (link) that showed a strong relationship between the consumption of eggs and the occurrence of type II diabetes, but then in 2010 published another study finding absolutely no correlation whatsoever. Always be very skeptical, and most often you will be right.

Dr. Andrew Gottlieb is a clinical psychologist in Palo Alto, California. His practice serves the greater Silicon Valley area, including the towns of San Jose, Cupertino, Santa Clara, Sunnyvale, Mountain View, Los Altos, Menlo Park, San Carlos, Redwood City, Belmont, and San Mateo. Dr. Gottlieb specializes in treating anxiety, depression, relationship problems, OCD, and other difficulties using evidence-based Cognitive Behavioral Therapy (CBT). CBT is a modern no-drug therapy approach that is targeted, skill-based, and proven effective by many research studies. Visit his website at CambridgeTherapy.com or watch Dr. Gottlieb on YouTube. He can be reached by phone at (650) 324-2666 and email at: Dr. Gottlieb Email.