Theorising about evidence synthesis – is it about cost, language, or something else?

As far as I can tell we undertake evidence synthesis to better understand the effectiveness of an intervention. The rationale is that the greater the accumulation of evidence, the greater the understanding of how good an intervention is. This is typically characterised by a narrowing of the confidence intervals in meta-analyses. Put another way, we attempt to be as certain as possible about how effective an intervention is. The more evidence, the less the uncertainty (well, that’s the theory).
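The ‘more evidence, less uncertainty’ idea can be sketched numerically. This is a minimal illustration, assuming fixed-effect pooling of equally sized trials; the per-trial standard error of 0.5 is made up purely for illustration:

```python
import math

# Hypothetical illustration: a fixed-effect meta-analysis of k equal trials,
# each estimating the same effect with per-trial standard error se_trial.
# The pooled standard error shrinks as 1/sqrt(k), so the 95% confidence
# interval narrows with diminishing returns as trials accumulate.
def pooled_ci_width(k_trials, se_trial=0.5, z=1.96):
    se_pooled = se_trial / math.sqrt(k_trials)
    return 2 * z * se_pooled  # total width of the 95% CI

for k in (1, 4, 16, 64):
    print(f"{k:3d} trials -> CI width {pooled_ci_width(k):.3f}")
```

Under these assumptions, halving the width of the confidence interval requires quadrupling the number of trials – the diminishing-returns curve described below.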

I like this graph:

I was always unsure of the y-axis label. Given what I said above I now think it should be ‘certainty’: the certainty of the average effectiveness of the intervention (not to be confused with certainty that the intervention will work in an individual). The graph reflects the law of diminishing returns in relation to evidence synthesis. And I now feel slightly more confident updating it:

In the above depiction:

A can be seen as the ‘rapid’ review area

B can be seen as the ‘systematic’ review area, e.g. Cochrane

C can be seen as the full ‘systematic’ review, e.g. Tamiflu-style

The troubling bit for me is still the break in the graph from B to C. This reflects the enormous difference in resources between B (using journal articles) and C (using clinical study reports). An important feature of the graph, though, is that it highlights the lack of a ‘cliff edge’ between ‘rapid’ and ‘systematic’ – it’s a continuum.

This is all relevant because we live in a world of limited resources, so we need to maximise the gain from what we invest. For drug interventions there is a maturity of discussion around cost-effectiveness. We have a language with which we can debate whether the extra gain from a new drug is worth the high cost (for instance see Many new cancer drugs show ‘no clear benefit’).

Alas, we do not appear to have the maturity of thought and language to apply similar approaches to evidence synthesis. Within this blog there are many articles on the fact that systematic reviews typically rely on published journal articles. This means they miss around 50% of trials, as well as a great deal of extra information contained in clinical study reports that never appears in the journal articles. Convention suggests ‘we’ are happy with this, and systematic reviews (based on published journal articles) are often portrayed as the ‘gold standard’. This appears arbitrary, and due to lack of vision (or self-interest) we don’t discuss it.

Why don’t we have the same discussions around the cost-benefit of various synthesis methods that we do with ‘classic’ interventions such as a pharmaceutical drug? Surely an evidence synthesis is an intervention! To illustrate, consider four methods of evidence synthesis (with purely fictional costs to make the case):

Full systematic review (e.g. Tamiflu) – cost 1,000 units

Systematic review (e.g. Cochrane) – cost 150 units

30-day rapid review – cost 30 units

5-day rapid review – cost 5 units

Which is better: one full SR, 6.7 SRs, 33.3 thirty-day rapid reviews, or 200 five-day rapid reviews?
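Using the purely fictional costs above, the trade-off behind that question is just division over a fixed budget – a sketch:

```python
# Purely fictional costs from the example above, in arbitrary units.
costs = {
    "full systematic review (Tamiflu-style)": 1000,
    "systematic review (Cochrane-style)": 150,
    "30-day rapid review": 30,
    "5-day rapid review": 5,
}

budget = 1000  # the cost of one full systematic review

# For the price of one full SR, how many of each alternative could we run?
for method, cost in costs.items():
    print(f"{budget} units buys {budget / cost:.1f} x {method}")
```

The arithmetic is trivial; the point is that without any measure of the ‘gain’ each method delivers, the numbers on the right are uninterpretable.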

When groups like NICE explore the worth of a drug they know the cost of the intervention and have a reasonable idea of the gain. From this they arrive at measures such as cost per QALY: for a given intervention, how much ‘gain’ do we get in return for the money? In the evidence synthesis world we typically have little idea of the costs (although some commercial and academic providers probably have a reasonable understanding) and no idea of the gain from methodological tweaks. How much gain do we derive from searching MEDLINE plus one more database compared with MEDLINE alone? If we were to add a third database, how much gain? We don’t know. They certainly add cost. How can we possibly begin to unpick whether it’s worthwhile? We’re running blind, and in the absence of evidence, authority and eminence appear to rule.
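A hypothetical sketch of what such a discussion could look like, borrowing the incremental cost-effectiveness ratio (ICER) used in drug appraisal. Every number here is invented – as argued above, the actual gains from adding a database have not been measured:

```python
# Sketch of an incremental cost-effectiveness ratio (ICER), as used for
# drugs (cost per QALY), applied to search strategies. The 'gain' figures
# are entirely hypothetical -- nobody has measured them.
def icer(cost_new, cost_old, gain_new, gain_old):
    """Extra cost per extra unit of 'certainty' gained."""
    return (cost_new - cost_old) / (gain_new - gain_old)

# Hypothetical: MEDLINE alone vs MEDLINE plus one more database.
medline_only = {"cost": 10, "gain": 0.70}  # made-up cost and certainty score
medline_plus = {"cost": 16, "gain": 0.74}

extra_cost_per_gain = icer(medline_plus["cost"], medline_only["cost"],
                           medline_plus["gain"], medline_only["gain"])
print(f"{extra_cost_per_gain:.0f} units per unit of certainty gained")
```

With real cost and gain data in place of the made-up values, a commissioner could debate whether the second database is worth it – exactly the debate NICE already has for drugs.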

We seem comfortable saying that Tamiflu-style reviews are inappropriate (by virtue of us not doing them). Is that based on cost? Based on pain (apparently they’re incredibly painful to do, exponentially more so than a standard systematic review)? As an aside, the debate around Tamiflu focused on the pharmaceutical industry making access to clinical study reports so difficult. There was little debate – certainly not one that has affected the evidence synthesis world – exploring the methodological problems of relying on published journal articles. That, to me, was the bigger story.

At the other end of the resource scale, the rise of the rapid review is indicative of unrest with the status quo of ‘one-size-fits-all’ systematic reviews (not responsive enough, too costly, etc.). But there is no single ‘rapid review’ method; there are many. As far as I can tell the methodological differences are not ‘evidence based’; they appear to rest on consensus or (as I have done in the past) on what seems reasonable, based on experience. They have not been informed by an exploration of the evidence that might suggest, say, that spending 50% more resource gets you above a threshold of ‘quality’ or ‘certainty’. To repeat: we’re running blind.

We’re moving away from the one-size-fits-all approach to evidence synthesis. I feel we’re moving from evidence synthesis 1.0 to 2.0. But that shift requires a more nuanced language for the debate. Phrases like ‘gold standard’ and ‘quick and dirty’ seem anchored in the world of evidence synthesis 1.0. As well as a better narrative for evidence synthesis, we need that narrative to be informed by the evidence. I’m a big fan of ‘rapid reviews’ – hence this site and this post – but I’ve never claimed I’m 100% right. In fact, as I explore rapid methods, I delight more in the failures than the successes, as it’s through failure that I learn the most.

Evidence is coming, and that should move the debate on. But as ever, the inertia of the status quo means things are likely to move slowly. I’ve been on numerous grant applications to obtain funding to unpick these issues; all have failed. Fortunately, others have been more successful, so I can’t wait to see how things develop. And at Evidence Live next year I’m hoping to run a rapid review hackathon – we’re just discussing what that might look like!