How to Assess Inter-Observer Reliability of Ratings Made on Ordinal Scales: Evaluating and Comparing the Emergency Severity Index (Version 3) and Canadian Triage Acuity Scale

An exact, optimal (“maximum-accuracy”) psychometric methodology for assessing the inter-observer reliability of ordinal ratings is used to evaluate and compare two emergency medicine triage algorithms, both of which classify patients into one of five ordinal categories. Ten raters, five using each algorithm, independently evaluated the same set of 200 patients. Analysis revealed moderate levels of inter-observer reliability, indicating that prior estimates of almost perfect inter-observer reliability, obtained for the present data using suboptimal statistical methods, are untenable.