Assessment of a Method To Detect Signals for Updating Systematic Reviews

Structured Abstract

Background

Systematic reviews are a cornerstone of evidence-based medicine, and the Agency for Healthcare Research and Quality (AHRQ) has a program to produce them. Systematic reviews become out of date as new evidence is published, and determining when a review is sufficiently out of date to warrant an update is challenging. AHRQ has a surveillance system that uses limited literature searches and expert opinion to detect signals that a systematic review is out of date. While the surveillance system has face validity, its predictive validity has not been assessed.

Methods

The AHRQ Comparative Effectiveness Review (CER) program had produced 13 CERs by 2009. Eleven of these were assessed in 2009 using the surveillance system, which produced a determination of the degree to which each individual conclusion was out of date, along with a priority for updating each report. Four CERs were judged to be high priority for updating, 4 medium priority, and 3 low priority. AHRQ then commissioned full updated reviews for 9 of these 11 CERs: 4 high-, 3 medium-, and 2 low-priority reports. After all the updated reports were completed, we matched the original prediction for each conclusion (still valid, possibly out of date, probably out of date, or out of date) with the corresponding conclusion in the updated report, and classified each pair as having good, fair, or poor concordance. We also made a summary determination of the priority for updating each CER based on the actual changes in conclusions in the updated report, and compared these determinations with the earlier assessments of priority.

Results

The 9 CERs included nearly 150 individual conclusions. In 8 of the 9 reports, the great majority of assessments of individual conclusions had good concordance between the predictions and the update. Across reports, 83 percent of matched conclusions had good concordance, and 99 percent had good or fair concordance. For 16 percent of conclusions there was either no match between the original and updated report, or the concordance assessment was otherwise not applicable. There was one instance of poor concordance, in which new evidence published after the surveillance signal searches had been done contributed to the changed conclusion in the updated report. This occurred in a CER already judged to be a high priority for updating. One CER originally judged to be high priority for updating was, based on the actual updated results, judged to have been a medium priority. Another CER originally judged to be medium priority was, based on the actual updated results, judged to have been a high priority. For the remaining 7 CERs, the two assessments of priority agreed. Both CERs originally judged to be low priority for updating had no substantive changes to their conclusions in the actual updated report. Agreement on overall priority for updating between the prediction and the actual changes to conclusions was a kappa (κ) of 0.74.
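
The priority counts reported above are enough to rebuild the 3-by-3 agreement table for the 9 updated CERs, so the agreement statistic can be checked by hand. The abstract does not say which variant of kappa was used. The sketch below is an illustration, not the authors' code: it computes both an unweighted kappa and a linearly weighted kappa (an assumption on our part, treating high, medium, and low as ordered categories). The names `TABLE` and `kappa` are hypothetical.

```python
from fractions import Fraction

# Rows: priority predicted by the surveillance system (high, medium, low).
# Columns: priority judged from the actual updated report (high, medium, low).
# Counts are reconstructed from the Results text: 4 high, 3 medium, and 2 low
# predictions, with one high judged medium and one medium judged high.
TABLE = [
    [3, 1, 0],
    [1, 2, 0],
    [0, 0, 2],
]


def kappa(table, weighted):
    """Cohen's kappa for a square agreement table.

    With weighted=True, uses linear weights 1 - |i - j| / (k - 1), so a
    one-step disagreement (e.g. high vs. medium) earns partial credit.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]

    observed = Fraction(0)
    expected = Fraction(0)
    for i in range(k):
        for j in range(k):
            if weighted:
                w = 1 - Fraction(abs(i - j), k - 1)
            else:
                w = Fraction(int(i == j))
            observed += w * Fraction(table[i][j], n)
            expected += w * Fraction(row_tot[i] * col_tot[j], n * n)
    return (observed - expected) / (1 - expected)


unweighted = kappa(TABLE, weighted=False)
linear = kappa(TABLE, weighted=True)
print(f"unweighted kappa        = {float(unweighted):.2f}")
print(f"linearly weighted kappa = {float(linear):.2f}")
```

Under these assumptions the unweighted value is about 0.65 and the linearly weighted value is about 0.74, so the reported figure is consistent with a linearly weighted kappa, although the abstract itself does not confirm this.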

Conclusions

These results provide some support for the surveillance system's validity for detecting signals of when a systematic review is sufficiently out of date that it needs updating.