You’ll often see loony zealots refer you to a study showing how effective their preferred treatment is — there usually is some small study supporting the use of almost any treatment.

You’ll also often hear people reply that the study was only small, so it shouldn’t be trusted. But why shouldn’t you trust small studies? Sure, they won’t provide quite as much statistical power as larger ones, but surely they can still be useful.

And that’s true. They can be useful, and they do provide important information. But a meta-epidemiological study in the British Medical Journal recently showed a really interesting fact about small studies.

The researchers highlight what is known as the “small study effect”: a very particular bias that small studies introduce into systematic reviews.

It turns out that small studies are systematically biassed towards the effectiveness of the intervention they are testing.

Systematic reviews pool the results of all the relevant studies on a particular issue and usually provide the very best evidence. Before major decisions are made about some particular treatment, we usually wait for a big systematic review of the literature to be published.

But if lots of small studies have been done, then when researchers conduct a systematic review, its conclusions might end up slanted.

These particular researchers looked at studies that tested various treatments for osteoarthritis and they plotted all the studies for each treatment on a graph with the larger trials near the top and the smallest ones near the bottom.

If the study showed the treatment was very effective, they plotted it further to the left, and if it showed it was ineffective (or had a negative effect) they plotted it further to the right.

These graphs are called funnel plots because, if small trials are not biassed, they will resemble funnels: the large studies will group together at the top and the small studies will scatter evenly on either side.

The results are visually striking. Far from resembling funnels, they resemble toppling towers — with small studies heavily drawing the plots to the left, towards the treatment being more effective.

The funnel plots for the 13 meta-analyses studied.

You can see just by looking at these plots that if systematic reviews are conducted, pooling all these data, the final analysis will generally be skewed towards supporting the treatment.
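The mechanism is easy to reproduce in a toy simulation. The sketch below uses no real data — the trial sizes, effect scale, and publication rule are all invented for illustration. It simulates trials of a treatment with no true effect, but only "publishes" a small trial when its result looks beneficial. The surviving small studies end up far to the "effective" side, while the large ones cluster around zero — exactly the toppling-tower shape.

```python
import random
import statistics

random.seed(42)

TRUE_EFFECT = 0.0  # assume the treatment genuinely does nothing


def simulate_trial(n_per_arm):
    """Simulate one two-arm trial; return (effect_estimate, standard_error)."""
    # Noise in the estimate shrinks as the trial gets bigger
    # (SE of a difference in means, with unit variance in each arm).
    se = (2 / n_per_arm) ** 0.5
    estimate = random.gauss(TRUE_EFFECT, se)
    return estimate, se


# Negative estimates mean "favours treatment" (the left side of the plots).
# Large trials are always published; a small trial is only published if it
# looks clearly beneficial — a crude stand-in for publication bias.
published = []
for _ in range(2000):
    n = random.choice([25, 400])  # small or large trial
    est, se = simulate_trial(n)
    if n >= 100 or est < -se:
        published.append((n, est))

small = [e for n, e in published if n < 100]
large = [e for n, e in published if n >= 100]

print(f"mean effect, small trials: {statistics.mean(small):+.3f}")
print(f"mean effect, large trials: {statistics.mean(large):+.3f}")
```

Even though the true effect is zero, the published small trials average well into "effective" territory, and any naive pooling of the lot will lean the same way.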

So why are small studies biassed in this way? The study authors suggest that there might be several factors at play. For one thing, there might be a selection bias: Small studies that show less effect might be less likely to be published. They also suggest a number of other explanations including a problem with participants being excluded from the analysis after being randomised into one of the arms.

They urge authors to include funnel plots like these in all systematic reviews, and, if a “small study effect” is observed, to add a separate analysis that excludes all the small studies.
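That sensitivity analysis is straightforward in outline: pool the studies with a standard inverse-variance (fixed-effect) weighting, then repeat with the small studies dropped and compare. A minimal sketch, using made-up effect sizes and standard errors (negative values favour the treatment; a large standard error stands in for a small sample, and the cut-off is arbitrary):

```python
def pooled_effect(effects, ses):
    """Fixed-effect (inverse-variance weighted) pooled estimate."""
    weights = [1.0 / se ** 2 for se in ses]
    return sum(w * e for w, e in zip(weights, effects)) / sum(weights)


# Hypothetical trial results: negative = favours treatment.
effects = [-0.9, -0.6, -0.5, -0.1, -0.05]
ses     = [ 0.5,  0.4,  0.3,  0.1,  0.08]

SMALL = 0.2  # arbitrary cut-off: SE above this counts as a "small" trial

pooled_all = pooled_effect(effects, ses)

large_only = [(e, s) for e, s in zip(effects, ses) if s < SMALL]
pooled_large = pooled_effect([e for e, _ in large_only],
                             [s for _, s in large_only])

print(f"all studies: {pooled_all:+.3f}")
print(f"large only:  {pooled_large:+.3f}")
```

On these invented numbers the apparent benefit shrinks once the small, imprecise trials are excluded — the kind of shift the reviewers are being asked to report.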

Update: The researchers defined “small” studies in this paper as ones with an average of fewer than 100 participants in each arm. Thanks, Simon, for pointing this out in the comments.

So next time someone points you to a small study showing how effective acupuncture is, how reflexology relieves depression or how fish oil cures everything, you can rest easy knowing that you’re under no obligation to accept the study’s conclusions. It’s much better to wait for a larger study, a meta-analysis, or even better, a meta-analysis that controls for the small study effect.

Comments

When talking about ‘small studies’ it’s probably useful to describe the kind of sample size you’re talking about. From the paper, we can see that they classified a study as ‘large’ if the average number of patients in a treatment arm was at least 100; for example, one with 110 in one arm and 95 in another would be ‘large’ because the average number of patients is 102.5. This is crucial information for someone trying to understand your blog post and its relevance to their particular interest.

Actually, just a further thought: Their definition of what a small study amounts to is not so crucial for understanding the post since the analysis I focused on was the funnel plots. In them, the y-axis is (something similar to) sample size and they do not merely compare “small” studies with non-small ones.

Yes, I realised upon reflection that I could have been clearer that I meant it was crucial for someone considering their own work in the context of what you were writing, though the general argument itself would not be undermined without the classifying information.

Interesting piece. However, I’m not sure this tells us anything particularly new. Publication bias is a very well studied phenomenon, and we already know that small negative studies are less likely to be published than small positive studies. I suspect most of the “small study bias” just comes down to publication bias.

Yeah — it’s something that people already think happens, but this is definitely a new piece of evidence. It’s the first study to show the details of this bias with anything other than binary outcomes. And the funnel plots are really quite a striking way to illustrate the phenomenon too.

Here’s what the authors write about their study and the state of the literature:

To our knowledge, this is the first meta-epidemiological study to systematically assess small study effects in a series of meta-analyses with continuous clinical outcomes. In an analysis of trials with binary outcomes, Kjaergard et al [7, 8] found more beneficial treatment effects in small trials with inadequate methodology compared with large trials. In an analysis of homoeopathy trials, Shang et al found that smaller trials and those of lower quality show more beneficial treatment effects than larger and higher quality trials [11]. Moreno et al recently assessed the performance of contour enhanced funnel plots and a regression based adjustment method to detect and adjust for small study effects in placebo controlled antidepressant trials previously submitted to the US Food and Drug Administration (FDA) and matching journal publications [14]. Application of the regression based adjustment method to the journal data produced a similar pooled effect to that observed by a meta-analysis of the complete unbiased FDA data. In contrast to our study, Moreno et al regressed treatment effects against their variance, which performed well in a simulation study but has been shown to give similar results to using the standard error as an explanatory variable [12]. In funnel plots, however, treatment effects will typically be plotted against their standard error, and significance tests will be generally based on z or t values, which again are directly derived from standard errors. Therefore, we deem it preferable to regress treatment effects against the standard error rather than the variance. A second discrepancy is that Moreno et al predicted effects for infinitely large trials of a variance of zero [12].
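The regression the authors describe — treatment effect against standard error — can be sketched in a few lines. This is a hand-rolled ordinary-least-squares fit on invented numbers, not the authors’ actual method or data; the point is just that the intercept estimates what an infinitely large trial (standard error approaching zero) would show, while a slope far from zero signals funnel-plot asymmetry.

```python
def effect_vs_se_regression(effects, ses):
    """Ordinary least-squares fit of effect = intercept + slope * SE.

    The intercept estimates the effect a hypothetical trial with SE -> 0
    (i.e. an infinitely large trial) would show; a strongly non-zero
    slope indicates funnel-plot asymmetry.
    """
    n = len(effects)
    mean_x = sum(ses) / n
    mean_y = sum(effects) / n
    sxx = sum((x - mean_x) ** 2 for x in ses)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ses, effects))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Toy data: the small trials (large SE) report bigger benefits
# (more negative effects), mimicking a small study effect.
effects = [-0.8, -0.7, -0.5, -0.3, -0.1, -0.05]
ses     = [ 0.40,  0.35,  0.25,  0.15,  0.08,  0.05]

intercept, slope = effect_vs_se_regression(effects, ses)
print(f"adjusted effect (SE -> 0): {intercept:+.2f}, slope: {slope:+.2f}")
```

On these toy numbers the fitted effect at SE → 0 comes out close to zero: the apparent benefit is carried entirely by the small, imprecise trials.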

It seems to me that your closing statement ‘So next time someone points you to a small study showing how effective acupuncture is, how reflexology relieves depression or how fish oil cures everything, you can rest easy knowing that you’re under no obligation to accept the study’s conclusions’ is pretty much irrelevant to the paper being blogged.

Correct me if I’m wrong, but the paper adds nothing new to the issue of how reliable or valid any given small study is. Rather, it makes the new point that their use in reviews and meta-analyses is misleading because of their collective tendency to overstate effect sizes, and because of publication bias against small studies with small or negative results.

Yeah, fair enough. That is the main point of this study. But I think the study also does a nice job of showing (or reiterating) why a small study showing a positive effect shouldn’t really be relied on.

About

This blog is about science and science journalism: good, bad, and bogus. While most of the posts are about bad and bogus science and science writing, I try to find the time to reflect on good examples too.

I am a freelance science writer and I teach philosophy at the University of Sydney.