Performance and Usability of Machine Learning for Screening in Systematic Reviews: A Comparative Evaluation of Three Tools


Topic Initiated: Apr. 29, 2019


Abstract

April 30, 2019

Overview

More than 100 software tools exist that aim to create efficiencies in systematic review (SR) processes (systematicreviewtools.com), with machine learning driving the proposed efficiencies of many of them. The University of Alberta Evidence-based Practice Center (UA EPC) proposes to test the ability of several machine learning tools to semi-automate screening in SRs. This aligns with a goal of the EPC Program and the FY2019 EPC TO1 pilot projects: to improve the efficiency of report development through use of technology, specifically machine learning.

Objectives

To test how machine learning tools can be used to expedite screening by eliminating irrelevant records

To test how machine learning tools perform as a second reviewer for the screening task of an SR

To examine differences in screening outcomes and user experiences across machine learning tools, and types of review questions, interventions, and study designs
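As a worked illustration of the second-reviewer objective, the following sketch simulates a tool standing in for one of two independent human screeners. The decision rule shown (advance a record if either the human or the tool votes to include it, akin to a liberal-accelerated approach) and all data are hypothetical assumptions for illustration, not the interface or logic of any of the tools under study:

```python
def dual_screen(human_votes, tool_votes):
    """Simulate tool-as-second-reviewer screening.

    A record advances to full-text review when either the human
    reviewer or the tool votes to include it (1 = include, 0 = exclude),
    mirroring dual independent screening with liberal conflict handling.
    """
    return [h or t for h, t in zip(human_votes, tool_votes)]

# toy example: 5 records; human includes records 0 and 3, tool includes 2 and 3
human = [1, 0, 0, 1, 0]
tool = [0, 0, 1, 1, 0]
print(dual_screen(human, tool))  # → [1, 0, 1, 1, 0]
```

Comparing this combined decision against the reference decisions of two human reviewers would show how many relevant records the tool-assisted process misses.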

General Proposed Approach

We recently published an evaluation of Abstrackr to test whether predictions made by the machine learning tool could be used to expedite screening by eliminating irrelevant records (see Appendix for published abstract).1 We plan to build on this work by testing and comparing multiple tools (e.g., Abstrackr, Distiller, RobotAnalyst) on the screening tasks of several EPC (or similar) reviews. We would assess performance both for level 1 (title/abstract) screening and for using a tool as a second reviewer. Outcomes would include the proportion of relevant studies missed, the absolute percentage of studies missed, workload savings, and time savings. End-user experience measures would include usability (e.g., ease of use, technical requirements, support required and available, comparative functionality/practicality of the tools). We would explore performance across different types of interventions (pharmacologic, non-pharmacologic, complex) and study designs (randomized controlled trials, observational studies). We have not identified partners because of the methodological nature of this investigation; however, we would be pleased to partner with another EPC if there is interest.
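The screening outcomes named above can be computed directly from a tool's exclusion predictions and the reference human decisions. The helper below is a minimal sketch under assumed binary include/exclude labels; the function name, data, and percentage definitions are illustrative assumptions, not the study's actual analysis code:

```python
def screening_metrics(human_labels, tool_excluded):
    """Compare a tool's exclusions against reference human decisions.

    human_labels: list of 0/1 reference decisions (1 = relevant/included)
    tool_excluded: set of record indices the tool predicted irrelevant
    """
    total = len(human_labels)
    relevant = [i for i, y in enumerate(human_labels) if y == 1]
    missed = [i for i in relevant if i in tool_excluded]
    return {
        # proportion of relevant studies wrongly screened out by the tool
        "relevant_missed_pct": 100 * len(missed) / len(relevant) if relevant else 0.0,
        # missed relevant records as a share of all records screened
        "absolute_missed_pct": 100 * len(missed) / total,
        # records the human reviewers would not need to screen
        "workload_savings_pct": 100 * len(tool_excluded) / total,
    }

# toy example: 10 records, 3 relevant; the tool excludes 5, one of them relevant
labels = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
excluded = {1, 2, 4, 5, 7}
print(screening_metrics(labels, excluded))
```

Time savings, by contrast, would be measured empirically (e.g., screening hours with and without the tool) rather than derived from the labels.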

Impact

Incorporating machine learning tools into the development of EPC reports could yield substantial time and cost savings. Study selection by two independent reviewers is one SR component that can be time intensive. An initial evaluation showed that Abstrackr, a machine learning tool for study selection, can provide a relatively reliable means of expediting SR processes, although further study is needed to answer specific questions and increase generalizability.1 This work could have direct applicability to traditional EPC products (technical briefs, reports, and updates), as well as to newer products such as rapid and living reviews. Moreover, expediting review processes has direct relevance to optimizing the utility and relevance of the EPC program and its products for learning health systems.