How to know which pregnancy studies are legit

Most weeks, CNN Health or The New York Times Science section reports on another study about health. Within the last couple of weeks there was one about how four cups of coffee a day kills you and one on which brands of beer are most likely to result in a trip to the ER (answer: malt liquor).

A question that comes up again and again in reading these, and came up all the time when I was writing my book on pregnancy, is how to know which studies warrant our attention.

The gold standard in medicine, and in other fields, is the randomized controlled trial. In a study like this, participants are divided into two (or more) groups randomly and each is told to do something different. In a drug study, one group takes a drug and the other does not. Because the groups are randomly selected, on average they are similar before the study. So if the researchers see differences after the study, they can be confident the differences are due to the treatment.
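To see why random assignment makes the groups comparable, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the population of 10,000 people, the 25 percent smoking rate and the fixed random seed are invented, not taken from any study.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch gives the same result each run

# Hypothetical population: each person smokes with probability 0.25 (invented figure).
population = [{"smokes": rng.random() < 0.25} for _ in range(10_000)]

# Random assignment: shuffle everyone, then split the list in half.
rng.shuffle(population)
treated, control = population[:5_000], population[5_000:]

def smoker_share(group):
    """Fraction of people in the group who smoke."""
    return sum(person["smokes"] for person in group) / len(group)

# The difference in smoking rates between the two randomly formed groups.
gap = abs(smoker_share(treated) - smoker_share(control))
print(f"treated: {smoker_share(treated):.3f}  control: {smoker_share(control):.3f}")
```

Nobody chose which group to join, so the share of smokers ends up close in both groups. The same logic applies to traits the researchers never measured.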

Even in a randomized study, there are limitations. No research study is run on the entire population of the world. What we learn from these studies is the impact of the treatment on average, not on every individual.

So, if you are approaching a health decision based on data from a randomized study, that's great. But most of the time, especially in public health, where I looked for data on pregnancy, the studies aren't randomized. In 2012, the American Journal of Public Health published 128 papers, and only 14 of them (about 11 percent) were randomized.

What we get instead are observational studies, which also compare individuals who engage in different behaviors. The difference, though, is that these individuals choose to engage in these behaviors on their own. If differences between people influence their choice of behavior, it may be those differences, not the behavior itself, that change their outcomes.

Let's consider a particular example, drawn from pregnancy. When I came to look at the issue of caffeine, I found that, at least in evaluating the relationship with miscarriage, there were no randomized studies. So I turned to observational studies.

One that I summarize in my book is from the American Journal of Obstetrics and Gynecology, published in 2008. The paper compares women who drink no caffeine to those who drink two cups a day or more, and analyzes the risk of miscarriage (the researchers also included an intermediate group, but for now we'll think about the simple comparison). The paper summarizes the characteristics of women in these different caffeine groups.

There are big differences across these groups other than how much caffeine they drink. The coffee drinkers are older, more likely to be white, poorer on average and more likely to smoke. Smoking and age, in particular, are linked to miscarriage. So is it the coffee? Or is it the smoking?

The crucial element of evaluating these studies is to figure out how big these differences really are, and how much they matter. It's common in studies like this to show the impact of the treatment (in this case, caffeine) after adjusting for differences like these, which researchers call controls. This adjustment is important, but also incomplete. In the caffeine study, for instance, the authors controlled for whether the household income was more or less than $50,000, but that's only a crude measure. It is hard to know whether better controls, such as more detail about income, would make more of a difference.
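A small worked example shows how a difference like smoking can create an apparent caffeine effect, and what adjusting for it does. The counts below are hypothetical, invented purely for illustration, and are not from the 2008 paper. In this made-up data, smokers have a 20 percent miscarriage risk and non-smokers a 10 percent risk whether or not they drink coffee, but coffee drinkers are more likely to smoke.

```python
# Hypothetical counts, invented for illustration (NOT from the 2008 study).
# Key: (caffeine group, smokes) -> (number of women, number of miscarriages)
counts = {
    ("coffee", True): (400, 80),
    ("coffee", False): (600, 60),
    ("no_coffee", True): (100, 20),
    ("no_coffee", False): (900, 90),
}

def risk(group, smokes=None):
    """Miscarriage risk in a caffeine group, optionally within one smoking stratum."""
    rows = [
        value
        for (g, s), value in counts.items()
        if g == group and (smokes is None or s == smokes)
    ]
    women = sum(w for w, _ in rows)
    miscarriages = sum(m for _, m in rows)
    return miscarriages / women

# Unadjusted comparison: all coffee drinkers versus all non-drinkers.
crude_ratio = risk("coffee") / risk("no_coffee")

# Adjusted comparison: coffee versus no coffee among women with the same smoking status.
smoker_ratio = risk("coffee", True) / risk("no_coffee", True)
nonsmoker_ratio = risk("coffee", False) / risk("no_coffee", False)

print(f"crude risk ratio:         {crude_ratio:.2f}")
print(f"among smokers:            {smoker_ratio:.2f}")
print(f"among non-smokers:        {nonsmoker_ratio:.2f}")
```

In these invented numbers, the unadjusted comparison makes coffee look like it raises risk by about 27 percent, while the comparison within each smoking group shows no difference at all. Real data is rarely this clean. If the control is measured only crudely, as with the income cutoff, some of the gap can survive the adjustment.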

What does this mean, in general, when we're faced with these studies? How can we evaluate them? One thing to start with is: Always be a little skeptical. But a bit more concretely, here are two tips.

Look at how different the groups are on things like age, education and so on. All else equal, the more similar the groups look, the better.
Look at how complete the data is. A lot of media reports say a study is "adjusted for socio-demographics." How effective this is depends on how comprehensive the demographic variables are. A study using data that details exactly how much education someone has and exactly what their income is will be able to adjust more completely for this issue than one where all the researchers observe is whether someone completed high school.

If you use these criteria, you'll quickly realize that some studies are much, much better than others. And while no single study is going to be enough to close the book on any issue, thinking critically about these possible problems may sometimes lead you to decide a study doesn't have much information at all. Not to mention allow you to drink coffee.

---

Emily Oster is an associate professor of economics at the University of Chicago Booth School. Her book is "Expecting Better: Why the Conventional Pregnancy Wisdom Is Wrong."
