Are Underpowered Studies Ever Justified?

Is a small scientific study better than none at all? A provocative piece in Frontiers in Psychology raises the question of whether we should ever do underpowered studies. The authors are Dutch researchers Rik Crutzen and Gjalt-Jorn Y. Peters.

Crutzen and Peters begin by questioning the idea that even a little evidence is always valuable. Taking the example of a study that only manages to recruit a handful of patients because it’s studying a rare disease, the authors say that:

Underpowered studies are often unable to contribute to actually answering research questions… sometimes, the more virtuous decision is to decide that current means do not allow studying the research question at hand.

What about preliminary or pilot studies, which are often small? Crutzen and Peters say that small pilot studies are useful for checking that a research method works and is feasible, but that we shouldn’t rely on the data they produce, not even as a guide to how many participants we need in a future follow-up study:

An early-phase trial is not appropriate to get an accurate estimate of the effect size. This lack of accuracy affects future sample size calculations. For example, if researchers find an effect size of (Cohen’s) d = 0.50 in an early-phase trial with N = 100, then the 95% confidence interval ranges from 0.12 to 0.91.
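To get a feel for how wide that interval is, here is a minimal sketch in Python. It is not the authors' calculation: it assumes two equal groups of 50 and uses the common large-sample normal approximation for the standard error of Cohen's d, and the function name `cohens_d_ci` is made up for illustration. Because the method differs, the result lands near the quoted interval rather than matching it exactly.

```python
from math import sqrt
from statistics import NormalDist


def cohens_d_ci(d, n_total, level=0.95):
    """Approximate confidence interval for Cohen's d (normal approximation)."""
    # Assumption: the N participants are split into two equal groups.
    n1 = n2 = n_total / 2
    # Large-sample standard error of d for two independent groups.
    se = sqrt((n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2)))
    # Two-sided critical value, about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return d - z * se, d + z * se


low, high = cohens_d_ci(0.50, 100)
print(f"95% CI for d: {low:.2f} to {high:.2f}")  # 95% CI for d: 0.10 to 0.90
```

Either way the message is the same: an observed d of 0.50 from 100 people is compatible with anything from a small effect to a large one, which is why it makes a shaky input for a later sample size calculation.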

So what are researchers to do, then? Should we only ever consider data from well-powered studies? This seems to be what Crutzen and Peters are implying. They argue that it is rarely impossible to conduct a well-powered study, although it might take more resources:

when researchers claim to study a rare population, they actually mean that the resources that they have at their disposal at that moment only allows collection of a limited sample (within a certain time frame or region). More resources often allow, for example, international coordination to collect data or collecting data over a longer time period.

They conclude that:

In almost all conceivable scenarios, and certainly those where researchers aim to answer a research question, sufficient power is required (or, more accurately, sufficient data points).

Crutzen and Peters go on to say that psychology textbooks and courses promote underpowered research, thus creating a culture in which small studies are accepted. The same could be said of neuroscience. But in this post I’m going to focus on the idea that research should be ‘never underpowered.’

First off, I don’t think Crutzen and Peters really answer the question of whether a small study is better than nothing at all. They say that small studies may be unable to answer research questions, but should we expect them to? Surely, a small amount of data is still data, and might provide a partial answer. Problems certainly arise if we overinterpret small amounts of data, but we can’t blame the data for this. Rather, we should temper our interpretations, although this may be easier said than done when it comes to issues of public interest.

It’s true that in most cases (though not all), it would be possible to make studies bigger. However (unless we increased total science funding), this would mean doing fewer studies. All else being equal, we’d do (say) one study with n=1000 instead of ten different studies with n=100.
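To put rough numbers on that tradeoff, here is a back-of-the-envelope sketch in Python. Everything in it is an assumption for illustration: a two-group comparison with equal group sizes, a two-sided test at alpha = 0.05, a true effect of d = 0.2, and a normal approximation to the t-test. The function name `approx_power` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(d, n_total, alpha=0.05):
    """Approximate power of a two-sided, two-group test (normal approximation)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Expected z statistic when the true effect is d and groups are n_total / 2 each.
    shift = d * sqrt(n_total / 4)
    # Probability of landing in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


assumed_d = 0.2  # assumption: a small true effect, not a figure from the paper
power_small = approx_power(assumed_d, 100)
power_large = approx_power(assumed_d, 1000)
print(f"n=100:  power is about {power_small:.2f}")
print(f"n=1000: power is about {power_large:.2f}")
```

Under these assumptions, a single n=100 study detects the effect about 17% of the time, against about 89% for the n=1000 study. A larger assumed effect would narrow the gap considerably.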

From a statistical perspective, one n=1000 study might indeed be better, but I worry about the implications for scientific innovation and progress. How would we know what to study, in our n=1000 project, if we didn’t have smaller studies to guide us?

Larger studies would also lead to the centralization of power in science. Instead of most researchers having their own grants and designing their own experiments, funding and decision-making would be concentrated in a smaller number of hands. To some extent, this is already happening, e.g. with the rise of ‘mega-projects’ in neuroscience. Many people are unhappy with this, although part of the problem may be the traditional top-down structure of leadership in science. If it were possible to organize large-scale projects more ‘democratically’, they might be more palatable, especially to junior researchers.