Published Clinical Decision Aids May Lack Validation

The physician's life has never been “easier.” We live in a fortunate future, replete with information technology at our fingertips, along with decision support to suit our every clinical need. What tremendous satisfaction we all must take in the various epiphanies and pearls presented to us by our electronic health records. Independent thought, along with clinical judgment, is being rendered obsolete.

A steady diet of academic research swells our ever-growing stock of clinical calculators, shortcuts, and acronyms. NEXUS! PERC! HEART! HAS-BLED! The siren’s call of simplicity and cognitive unburdening is insidiously appealing. With progress, unfortunately, also comes folly. Are these tools actually smarter than the average bear? Does this ever-expanding cornucopia of “decision support” actually outperform a trained clinician?

Perhaps a better question is, is anyone even asking that question?

It’s troubling that the answer appears to be no. A recent historical review in Annals of Emergency Medicine looked back at 171 research articles evaluating the performance of decision aids.1 For a decision aid intended for incorporation into routine practice, it seems reasonable not only to validate its predictions statistically but also to ensure it outperforms current clinical practice.

Of the 171 decision aids included in the survey, the authors were able to identify only 21 publications, in Annals or elsewhere, in which the aid was compared directly with clinician judgment. For the remainder, no comparison was made or could be identified in the external literature. For the handful with an identified comparison, the results are, unfortunately, discouraging. In these 21 comparisons, the decision aid was clearly superior to clinician judgment in only two: a prognostic neural network for outcomes in patients presenting with chest pain—effective but too unwieldy for widespread use—and the useful and well-studied Canadian C-Spine Rule. Conversely, six decision aids clearly underperformed compared with clinician judgment, and the remainder were a wash. Popular decision instruments found either inferior to or no different from clinician judgment included the Alvarado score for appendicitis, a general evaluation of pediatric head injury rules, risk-stratification rules for pulmonary embolism, and the San Francisco Syncope Rule.

These 21 publications represent barely more than a tenth of the survey's substrate, and it would be erroneous to assume the aids left untested are equally unreliable. It is also reasonable to suggest that the decision aids for which comparisons showed no difference may have suffered from flawed comparator study design rather than any failing of the aid itself. Regardless, these findings should not instill disproportionate confidence in clinical decision aids as a replacement for thoughtful clinical judgment and experience.

A salient contemporary example of a decision aid of questionable value relative to clinical judgment is the Ottawa Subarachnoid Hemorrhage Rule.2 This rule, derived and described originally in JAMA and recently validated prospectively in the Canadian Medical Association Journal, targets an important clinical question: Which patients with acute headache should be evaluated for subarachnoid hemorrhage (SAH)?2,3 Patients in whom an initial SAH or sentinel bleed is missed tend to have poor, and potentially avoidable, outcomes. The flip side, however, is excessive resource use, whether through CT scanning or invasive procedures such as lumbar puncture. A decision aid superior to clinician judgment could add a great deal of value in this clinical scenario.
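To illustrate the structure of such a rule, here is a minimal sketch of how a one-directional, sensitivity-maximizing instrument like the Ottawa SAH Rule operates: any single positive high-risk criterion mandates investigation. The criteria below follow the published derivation study; the function and variable names are purely illustrative, not from any library or the rule's authors.

```python
# Hypothetical sketch of the Ottawa SAH Rule's logic. Criteria per the
# published derivation study; all names here are illustrative only.
# The rule applies to alert patients with acute non-traumatic headache
# peaking within one hour. It maximizes sensitivity, so ANY positive
# criterion triggers a workup for subarachnoid hemorrhage.

def ottawa_sah_investigate(age, neck_pain_or_stiffness, witnessed_loc,
                           onset_during_exertion, thunderclap_onset,
                           limited_neck_flexion):
    """Return True if any high-risk criterion is present."""
    return any([
        age >= 40,
        neck_pain_or_stiffness,
        witnessed_loc,            # witnessed loss of consciousness
        onset_during_exertion,
        thunderclap_onset,        # pain peaking instantly
        limited_neck_flexion,     # limited flexion on examination
    ])

# A 30-year-old with gradual-onset headache and no other criteria
# would not trigger investigation under the rule.
print(ottawa_sah_investigate(30, False, False, False, False, False))  # False
```

Note the design consequence: a rule built this way can only rule out testing in the all-negative case; it encodes no gradation of risk and no room for the contextual nuance a clinician applies, which is one plausible reason such instruments so often fail to outperform judgment.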

About the Author

Ryan Patrick Radecki, MD, MS, is clinical assistant professor of emergency medicine at The University of Texas Medical School at Houston, and practices at Kaiser Permanente NW. He blogs at Emergency Medicine Literature of Note and can be found on Twitter @emlitofnote.