
In 2006, PAIN published a systematic review of the measurement properties of self-report pain intensity measures in children and adolescents (Stinson JN, Kavanagh T, Yamada J, Gill N, Stevens B. Systematic review of the psychometric properties, interpretability and feasibility of self-report pain intensity measures for use in clinical trials in children and adolescents. PAIN 2006;125:143–57). Key developments in pediatric pain necessitate an update of this work, most notably growing use of the 11-point numeric rating scale (NRS-11). Our aim was to review the measurement properties of single-item self-report pain intensity measures in children 3 to 18 years old. A secondary aim was to develop evidence-based recommendations for measurement of child and adolescent self-report of acute, postoperative, and chronic pain. Methodological quality and sufficiency of measurement properties for reliability, validity, responsiveness, and interpretability were assessed by at least 2 investigators using COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN). Searches identified 60 unique self-report measures, of which 8 (reported in 80 papers) met inclusion criteria. Well-established measures included the NRS-11, Color Analogue Scale (CAS), Faces Pain Scale–Revised (FPS-R; and original FPS), Pieces of Hurt, Oucher Photographic and Numeric scales, Visual Analogue Scale, and Wong-Baker FACES Pain Rating Scale (FACES). Studies ranged in quality from poor to excellent and generally reported sufficient criterion and construct validity, and responsiveness, with variable reliability. Content and cross-cultural validity were minimally assessed. Based on available evidence, the NRS-11, FPS-R, and CAS were strongly recommended for self-report of acute pain. Only weak recommendations could be made for self-report measures for postoperative and chronic pain. 
No measures were recommended for children younger than 6 years, identifying a need for further measurement refinement in this age range. Clinical practice and future research implications are discussed.

Sponsorships or competing interests that may be relevant to content are disclosed at the end of this article.

Supplemental digital content is available for this article. Direct URL citations appear in the printed text and are provided in the HTML and PDF versions of this article on the journal's Web site (www.painjournalonline.com).

1. Introduction

Given that pain is defined primarily as a subjective experience,56 self-report is the preferred method for its measurement. As such, the development of valid and reliable self-report measures of pain by children has been a longstanding endeavor. These efforts remain relevant given contemporary emphasis on patient-reported outcomes in medical care.39 Although pain can be assessed by observer report or behavioral observation,108 these are limited and often discrepant from child self-report. Observers, such as parents, show a tendency to overestimate or underestimate children's pain,26,28,117 and are heavily influenced by their own characteristics and contextual factors in rating children's pain.34 Furthermore, behaviors associated with child pain are also frequently observed during other affective states, such as crying as an indicator of distress or sadness.108 However, valid and reliable self-report of pain by children poses a unique challenge given the vast normative cognitive developmental changes occurring in childhood and adolescence.29

Over one decade ago, PAIN published a systematic review of the measurement properties, interpretability, and generalizability of self-report pain intensity measures in children and adolescents aged 3 to 18 years.92 At that time, 33 unique measures of self-report of pain intensity in children were identified. Only 5 met criteria for “well-established assessment”31 and were reviewed in detail,92 including the Pieces of Hurt/Poker Chip Tool, the Faces Pain Scale (FPS)/Faces Pain Scale–Revised (FPS-R), the Wong-Baker FACES Pain Rating Scale (FACES), the Oucher, and the Visual Analogue Scale (VAS). That review informed the pediatric version of the Initiative on Methods, Measurement, and Pain Assessment in Clinical Trials (PedIMMPACT) that outlined evidence-based recommendations for selection of self-report of pain intensity measures for clinical trials of pediatric acute and chronic pain.66 Based on available evidence, recommendations at that time included use of the Poker Chip Tool with 3- to 4-year-olds, the FPS-R with 4- to 12-year-olds, and a VAS with children 8 years of age and older.

In addition to the availability of new research, 2 key developments in the field of pediatric pain suggest the timeliness of an updated review of measurement properties of self-report pain intensity measures. First is the growth of evidence for the 11-point numeric rating scale (NRS-11) in pediatric populations.21 The NRS-11 was identified by the original review for future psychometric research given its widespread use in pediatric clinical care and establishment as a well-validated self-report measure of pain in adults.41,92 A second development is the electronic presentation of self-report pain intensity scales as part of the adoption of technology platforms in pain research and management.37,58

Our primary objective was to update the review originated by Stinson et al. of the measurement properties of self-report measures of pain intensity in children and adolescents. This objective included additional assessment of the methodological quality of all published studies on measurement properties (from Stinson et al.'s review and this update) using the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN).97 A secondary objective was to update evidence-based recommendations for the measurement of self-report of pain intensity in studies of pediatric acute, postoperative, and chronic pain.

2. Methods

2.1. Eligibility criteria

Review methods, inclusion/exclusion criteria, and analyses were specified in advance and are available in the registered systematic review protocol (PROSPERO CRD42017055759). Eligibility criteria closely followed those of the original Stinson et al.92 review and included all peer-reviewed English language research studies with the primary objective of examining measurement properties of self-report single-item measures of pain intensity in children and adolescents 3 to 18 years of age. Studies including individuals older than 18 years were included if data were presented separately for children 18 years or younger. Pain measures were included in the review if they met Cohen's criteria of “well-established” assessment. See Table 2 for a detailed description of Cohen's31 “well-established” criteria. Unpublished manuscripts, reviews, guidelines, commentaries, and other descriptive articles were excluded. Identified self-report pain intensity measures not meeting Cohen's “well-established” criteria were noted for supplementary material but were not reviewed in greater detail.

2.2. Information sources

Potential studies were identified through 3 key sources: (1) studies published before 2005 were identified from the original Stinson et al.92 review that searched Medline, CINAHL, Embase, and Pubmed from inception to June 2005; (2) new studies were identified through updated electronic database searches from 2005 to November 13, 2017, conducted in Medline, Embase, PsycINFO, Cochrane Central Register of Controlled Trials (CENTRAL), Web of Science, and CINAHL; and (3) reference lists from all included studies and relevant review articles were also examined by hand to identify potential studies for inclusion.

2.3. Search strategy

The updated search strategy was developed in consultation with a medical librarian and largely followed that from the original review92 with modifications to reflect changes in the databases over time. Search terms included keywords relevant to pain measurement, children, adolescents, names of pain measures, and types of measures. See supplementary Table 1 for complete search strategies for all updated database searches (available at http://links.lww.com/PAIN/A651).

2.4. Study selection

Study titles and abstracts identified from updated database searches (2005-2017) were screened for eligibility by a research assistant. A random 15% of all titles and abstracts were reviewed by a second review author (A.H. or C.L.). Two study authors (K.A.B., A.H., or C.L.) assessed potentially relevant full text studies for inclusion as identified from the original review,92 the updated database searches, and hand searching of relevant reference lists. Disagreements were resolved by discussion with a third review author (K.A.B., A.H., or C.L.). All relevant publications from the original review were also included,92 including those studies from measures that were previously excluded from the original review but now meet Cohen's criteria of “well-established” assessment due to the availability of new evidence (such as for the Color Analogue Scale [CAS]). Thus, results presented herein include all relevant and eligible studies of measurement properties from the inception of database searches (as early as 1950s) through to November 2017 inclusive.

2.6. Measurement property sufficiency and quality

The COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) checklist with a 4-point scale was used to ascertain the type of measurement property examined and study methodological quality,71,73,74,97 as well as criteria for judging the sufficiency of each measurement property80 for all included studies (from the original Stinson et al.92 review and updated searches). COnsensus-based Standards for the selection of health Measurement INstruments assesses sufficiency and quality separately across 9 measurement properties, including internal consistency, measurement error, reliability, content validity, hypothesis testing, structural validity, cross-cultural validity, criterion validity, and responsiveness. Only reliability (test–retest and measurement error were combined), content validity, hypothesis testing, cross-cultural validity, criterion validity, and responsiveness were assessed in this review. All other measurement properties were not applicable to single-item self-report measures. The interpretability section of COSMIN was also included to extract information on ease of interpretability related to the study and measure being evaluated. This information includes percentage of missing items (herein defined as missing data), description of how missing items (data) were handled, distribution of scores, percentage of respondents who had the lowest possible score, percentage of respondents who had the highest possible score, scores and change scores (ie, mean and SD) for relevant groups, and minimally important change or minimally important difference (herein also defined as clinically significant change of pain intensity). See Table 1 for a definition and judgment of sufficiency for each measurement property adapted from COSMIN as applied in this review.

Using COSMIN, each measurement property was assessed using 5 to 18 items evaluating study design and methods. Each item is scored on a 4-point scale, with criteria for excellent (4), good (3), fair (2), and poor (1) quality for each COSMIN item described. An overall quality score per measurement property is obtained by taking the lowest rating over all items. Sufficiency of each measurement property is judged against established criteria as sufficient (+), insufficient (−), or indeterminate (?). Detailed scoring methods for COSMIN quality and sufficiency of measurement properties are described elsewhere.73,80,97 As per COSMIN directions, studies were only included if their stated purpose was to examine measurement properties. If a study was identified for inclusion, measurement property evidence presented for eligible self-report pain intensity measures was considered. In the case where the study objective was primarily to validate one measure, only measurement property evidence for that measure was considered. In the case where studies assessed agreement between several self-report measures, measurement property evidence was extracted for all eligible measures. Recognizing variations in terminology used to describe various measurement properties, the type of measurement property assessed was determined as per COSMIN definitions (Table 1) rather than terminology used by study authors.
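The "lowest rating over all items" rule described above can be illustrated with a minimal sketch. The names `Quality` and `overall_quality` and the example ratings are hypothetical and introduced here for illustration only; they are not part of the COSMIN materials or of this review's data.

```python
from enum import IntEnum


class Quality(IntEnum):
    """4-point item rating scale as described in the text."""
    POOR = 1
    FAIR = 2
    GOOD = 3
    EXCELLENT = 4


def overall_quality(item_ratings):
    """Return the overall quality for one measurement property.

    The overall score is the lowest rating across all items,
    so a single poorly rated item caps the overall score.
    """
    if not item_ratings:
        raise ValueError("at least one item rating is required")
    return Quality(min(item_ratings))


# Hypothetical item ratings for one measurement property in one study.
example_ratings = [
    Quality.EXCELLENT,
    Quality.GOOD,
    Quality.FAIR,
    Quality.EXCELLENT,
    Quality.GOOD,
]
print(overall_quality(example_ratings).name)  # FAIR
```

Under this rule a study rated excellent on most items is still rated fair overall if any one item is fair, which helps explain why overall quality ratings can appear low relative to the typical item rating.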

Pairs of study authors (K.A.B., A.H., and C.L.) applied COSMIN to all included studies. First, eligible self-report pain measures and measurement properties evaluated in the article were determined. Second, relevant COSMIN boxes were completed, including sufficiency of measurement properties, study methodological quality, and interpretability. Following independent scoring, reviewers met to reach consensus, with any discrepancies resolved by a third reviewer.72 If multiple pain measures were included in the same article, they were evaluated separately. In the case where multiple studies with separate samples were reported in the same article, they were evaluated separately.

COnsensus-based Standards for the selection of health Measurement INstruments recognizes that no “gold standard” exists for patient report outcome instruments; however, evaluation of criterion validity required identification of a comparison or criterion instrument. For the purposes of assessing study methodological quality using COSMIN, pain measures recommended by PedIMMPACT for use in clinical trials were identified as the criterion measures within the recommended age ranges. These included the Pieces of Hurt/Poker Chip Tool for 3- to 4-year-olds, the FPS-R for 4- to 12-year-olds, and the VAS for children 8 years of age and older.66 The original FPS (in lieu of the FPS-R) was considered to be the criterion instrument for studies published before the PedIMMPACT guidelines were available.15 Original paper, mechanical, or verbal presentations of measures were considered as the criterion instrument for measures presented using novel formats (eg, electronically).
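As a rough illustration, the age-based choice of criterion instrument described above can be written as a lookup. This is a sketch under stated assumptions: the function name `criterion_measures` is hypothetical, age bounds are treated as inclusive whole years, and the VAS range is capped at 18 years to match the age range of this review (the text itself says only "8 years of age and older"). The special cases for the original FPS and for novel presentation formats are not modeled.

```python
# (measure name, minimum age, maximum age) in whole years.
# The upper bound of 18 for the VAS is an assumption based on the
# review's stated age range, not a limit given by PedIMMPACT.
CRITERION_RANGES = [
    ("Pieces of Hurt/Poker Chip Tool", 3, 4),
    ("FPS-R", 4, 12),
    ("VAS", 8, 18),
]


def criterion_measures(age_years):
    """Return the criterion measures whose recommended range covers the age.

    Ranges overlap (eg, at age 4 and at ages 8 to 12), so more than
    one measure may be returned; an empty list means no criterion
    measure applies at that age.
    """
    return [
        name
        for name, low, high in CRITERION_RANGES
        if low <= age_years <= high
    ]


print(criterion_measures(4))   # ['Pieces of Hurt/Poker Chip Tool', 'FPS-R']
print(criterion_measures(10))  # ['FPS-R', 'VAS']
```

Returning a list rather than a single measure keeps the overlap between recommended age ranges visible instead of silently picking one instrument.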

2.7. Criteria for evidence-based recommendations

Evidence-based recommendations were developed separately for each included self-report of pain intensity measure by type of pain (acute, postoperative, and chronic). Recommendations were specified across the age range from 3 to 18 years, with recommendations by chronological year determined by age-based evidence of measurement properties available from included studies. To translate COSMIN quality and measurement property sufficiency ratings into recommendations for measure use in research and clinical practice, our group specified 5 recommendation categories. These included weak and strong recommendations for measure use, weak and strong recommendations against measure use, inconclusive recommendation, and no recommendation. Values underlying the development of recommendation criteria included emphasis on the importance and potential for self-report of pain experience for all children balanced with the potential of harm with use of an invalid or unreliable measure. Emphasis was also placed on the presence of any negative data indicating poor measurement properties and quality of available evidence. Given the vast cognitive development occurring in early childhood with relevance to self-report of pain,29 an a priori decision was made to apply more stringent criteria for recommendations for children younger than 8 years. This included a lower tolerance for the presence of any data indicating insufficient measurement properties among younger children and consideration of whether age-related information was based on aggregate age-based analyses only vs analyses conducted separately by year of child age (eg, results for 3- to 6-year-olds grouped together vs for 3-, 4-, 5-, and 6-year-olds separately). 
Measurement property evidence was considered separately by type of pain (acute, postoperative, and chronic) given the possibility that young children's ability to self-report pain can improve with practice or with pain experience.110 Criteria for level of evidence-based recommendations are outlined in Table 2.

3. Results

3.1. Study and measure selection

Database searches retrieved a total of 21,448 unique abstracts that were screened for potential inclusion. Of these, 80 publications reported on 8 measures of self-report of pain intensity that met Cohen's criteria of “well-established assessment” and were included in this review. These included the 11-point numeric rating scale (NRS-11), the CAS, the FPS-R (and original FPS), the Oucher photographic and numeric scales, the Pieces of Hurt/Poker Chip Tool, the VAS, and the Wong-Baker FACES Pain Rating Scale (FACES). See Figure 1 for the PRISMA flowchart. An additional 52 measures were excluded from this review for not meeting Cohen's criteria of “well-established assessment” (32 of which were identified in the previous review92). A list of excluded measures is available in supplementary Table 2 (available at http://links.lww.com/PAIN/A651).

3.2. Measurement properties and recommendations

Findings related to the quality and sufficiency of measurement properties of each included “well-established” measure, as well as measure recommendations, are reported separately below in alphabetical order by measure name. COSMIN quality and sufficiency ratings by measurement property for “well-established” self-report measures are summarized in Tables 3 and 4, respectively. COSMIN interpretability findings including minimally important change and recommended anchors for self-report measures are summarized in Table 5. COSMIN quality and sufficiency ratings for individual studies included in this review are also available in supplementary Table 3 (available at http://links.lww.com/PAIN/A652). Evidence-based recommendations for self-report measures of acute/procedural, postoperative, and chronic/recurrent pain by child age are summarized in Figures 2–4.

3.3. Eleven-point Numeric Rating Scale

The NRS-11 asks children to rate their pain intensity on a scale from 0 to 10. A total of 24 studies (reported in 21 publications) examined measurement properties of the NRS-11 in children 3 to 20 years old and were included in this review. Of these, 8 studies focused on use of the NRS-11 for self-report of acute pain,5,6,22,57,75,101,109,113 7 studies of postoperative pain,17,32,68,77,104,109 and 10 studies of chronic pain.22,23,54,55,67–69,82–84 Studies were conducted in a variety of hospital (inpatient, outpatient clinic, and emergency department) and community settings (school). One study focused on children 6 to 8 years old22 and another 8 to 12 years old.68 All other studies included samples crossing childhood and adolescence. Studies often excluded children who did not meet some minimal predetermined counting ability or numerical competency. Most studies presented the NRS-11 verbally only, with fewer studies reporting paper (n = 2)57,109,113 or electronic (n = 2)23,83 presentation of the NRS-11. When reported, “no pain/hurt” was consistently used as the lower anchor for the NRS-11. Upper anchors for the NRS-11 included “most or worst pain,” “worst pain possible,” “worst pain imaginable,” “very much pain,” and “pain as bad as it could be.” The NRS-11 was primarily presented in English, but also in Catalan,22,23,68 French,5,6 German,54,55 and Spanish.101 Studies were poor to excellent quality and generally reported sufficient measurement properties, including for 6- and 7-year-olds with acute pain101 and 4- to 7-year-olds with postoperative pain.17,68,104 Two studies from the same investigatory group also reported sufficient measurement properties in 6- to 7-year-olds with chronic pain.22,84 Recommendations were additionally informed by reports of insufficient measurement properties in 4- and 5-year-olds with acute pain.101

3.4. Color Analogue Scale

The CAS was designed to use gradations of color, area, and length to enable children to concretely identify variations in pain intensity.65 The measure is pocket sized (10 cm in length) with gradations from small and white to wider and deep red reflecting increases in pain intensity. The original CAS is a type of mechanical VAS: children are asked to move a slider along the length of the CAS to indicate how much pain they have. The back of the measure has a corresponding 0 to 10 numerical scale to determine a number representing the child's pain. A total of 19 studies (in 18 publications) examined measurement properties of the CAS in children 4 to 18 years old and were included in this review. Of these, 12 studies focused on use of the CAS for self-report of acute pain,5,18–20,63–65,94,96,98–100 5 studies for postoperative pain,38,49,78,95,96 and 2 studies for chronic pain.83,84 Studies were conducted in a variety of hospital (inpatient and emergency department) and community settings (school and clinic). Five studies focused on children aged 12 years and younger.63,78,84,94,96 All other studies included samples crossing childhood and adolescence. Two studies assessed presentation of the CAS on an electronic handheld device.83,95 The original development of the CAS, measure instructions, and testing of the scale were conducted with children 5 to 16 years old who were generally healthy or had recurrent headaches.65 Original reported scale instructions to children included anchors of “no pain” to “most pain.”65 The largest portion of subsequent studies used the original scale anchors, with some reported deviations of the upper anchor to “most pain you can imagine,” “most severe pain,” “worst pain,” “worst pain possible,” and “very much pain.” The CAS was primarily presented in English, but also in Thai,96 Spanish,98–100 Catalan,83,84 and French.5,38 Studies were mostly poor to fair quality and reported sufficient measurement properties. 
Recommendations were additionally informed by insufficient measurement properties in 4- to 7-year-olds with acute99,100 and postoperative pain.38

3.5. Faces Pain Scale–Revised (and Faces Pain Scale)

The FPS-R is composed of 6 faces from “no pain” to “very much pain.” Children are asked to point to the face that shows how much hurt they have.53 For more information, see: http://www.iasp-pain.org/FPSR. A total of 21 studies (reported in 20 publications) examined measurement properties of the FPS-R in children 3 to 19 years old and were included in this review. Of these, 9 studies focused on use of the FPS-R for self-report of acute pain,30,53,70,76,87,98–100,106 8 studies of postoperative pain,3,32,36,38,53,95,110,114 and 4 studies of chronic pain.35,50,83,84 Studies were conducted in a variety of hospital (inpatient, outpatient clinic, and emergency department) and community settings (school, daycare, and clinic). Eight studies focused on children aged 12 years and younger,30,36,53,84,87,106,110,114 with 2 studies focused exclusively on young children 3 to 6 years old.87,106 All other studies included samples crossing childhood and adolescence. Four studies assessed electronic presentations of the FPS-R on handheld devices.50,83,95,114 Original development, measure instructions, and testing of the FPS-R were conducted with hospitalized children 4 to 12 years old and 5 to 12 years old undergoing ear piercings.53 Almost all subsequent studies reported following the original stated instructions for the FPS-R,53 with some deviations from the upper anchor to “maximum pain,” “severe pain,” or “most pain possible.” The FPS-R was primarily presented in English, but also in Catalan,70,83,84 Portuguese,30 Spanish,98–100 Thai,76 German,3 and French.38 Studies were mostly fair to excellent quality and reported sufficient measurement properties, including for 7-year-olds with acute,30,53,70,76,99,100 postoperative,110 and chronic pain.35,50,84 Recommendations were additionally informed by insufficient measurement properties in 3- to 6-year-olds with acute87,106 and postoperative pain,38,110 as well as 4- to 6-year-olds with chronic pain.50

A total of 7 studies examined measurement properties of the original 7-face FPS in children,19,20,26,46,48,49,78 which was scored 0 to 6. COSMIN quality ratings for these studies are available in supplementary Table 3 (available at http://links.lww.com/PAIN/A652). Given the availability of a revised version (the FPS-R), these studies were not reviewed in further detail, and no recommendations about its use were made. A more detailed review of these studies is available in the original systematic review.92

3.6. Oucher—photographic and numeric

The Oucher is composed of 2 scales presented together, a numeric scale (0-10; originally 0-100) and a 6-picture color photographic scale (0-5). The faces are arranged vertically to the right of the numeric scale and show increasing levels of discomfort from “no hurt” at the bottom to the “biggest hurt you could ever have” at the top. Most studies require that children undergo a brief numerical competency screening (ie, counting to 100 by ones or tens, and identifying which of any 2 numbers is larger) before using the Oucher to determine which scale they would use (photographic or numeric). Selection of the photographic over the numeric scale may also be determined by child preference. For more information, see: http://www.oucher.org. The original Oucher photographic was designed with pictures of a Caucasian preschool-aged boy.10 African-American,13,60,103 Hispanic,13,103 and Asian (female and male)115 versions of the Oucher photographic have since been developed. A Canadian aboriginal (female and male) version has also been developed, although no peer-reviewed publications report on its measurement properties (http://www.oucher.org).

A total of 15 studies (reported in 10 publications) examined measurement properties of the Oucher in children 3 to 18 years old and were included in this review. Of these, 8 reported on the Oucher photographic,2,8,10–14,103 5 reported on the Oucher numeric,2,11–13,81 and 2 reported findings of the Oucher photographic and numeric combined.60,115 Six studies focused on use of the Oucher for self-report of acute pain,8,10,11,60,103,115 6 for postoperative pain,2,12–14,81,115 and none for chronic pain. Studies were conducted in a variety of hospital (inpatient, intensive care, and outpatient clinic) and community settings (daycare and school). Most studies (n = 5) focused on young children 3 to 7 years old,8,10,14,103,115 with 3 studies including children up to 12 years old.2,11,13 Only 2 studies included samples crossing childhood and adolescence.60,81 No studies assessed alternate forms of presenting the Oucher (eg, electronic). Only one new study assessing the measurement properties of the Oucher has been published since the original 2006 systematic review.115 The original Oucher was developed and tested with Caucasian children 3 to 7 years old in daycare, attending a hospital preadmission clinic, or hospital inpatients.10 The tool was specifically designed for preschool-aged children. When reported, all studies identified using the original anchors for the Oucher.10 The Oucher was primarily presented in English; however, the Asian version was developed with Taiwanese children,115 and the Hispanic version has also been used in Spanish.13

Studies were mostly poor to fair quality and reported variable sufficiency of measurement properties. Studies reporting the photographic and numeric scales together did not inform recommendations.60,115 Recommendations for the Oucher photographic were additionally informed by studies reporting sufficient measurement properties in 7-year-olds,10,11,103 but insufficient measurement properties in 3- to 6-year-olds with acute pain.8,10 Only one study of acute pain included children older than 7 years, and it was unclear how many participants aged 8 to 18 years used the photographic vs the numeric scale60 leading to an inconclusive recommendation. Although no concerns with measurement properties were reported for studies including children 3 to 8 years old with postoperative pain, all studies in this age group were conducted by the same investigatory team.2,12,13 Recommendations for the Oucher numeric were informed by sufficient measurement properties in a single study for 3- to 12-year-olds with acute pain.11 All studies reporting the Oucher numeric scale separately for postoperative pain were conducted by the same investigatory team.2,12–14

3.7. Pieces of Hurt (Poker Chip Tool)

The Pieces of Hurt scale, frequently called the Poker Chip Tool, is composed of 4 red poker chips that are equated to pieces of hurt from one chip (“a little bit of hurt”) to 4 chips (“the most hurt you could ever have”).51,52 The number of chips reflects the amount of pain reported by the child. A total of 10 studies (reported in 9 publications) examined measurement properties of the Pieces of Hurt tool in children 3 to 16 years old and were included in this review. Of these, 6 studies focused on use of the Pieces of Hurt tool for self-report of acute pain,47,48,51,52,93,96 4 studies of postoperative pain,2,49,96,111 and no studies of chronic pain. Studies were conducted in a variety of hospital (inpatient, intensive care, and outpatient clinic) and community settings (school and clinic). Six studies (5 publications) focused on children aged 12 years and younger,2,48,51,93,96 with 3 of these studies focused exclusively on young children 4 to 6 years old.48,51,93 All other studies included samples crossing childhood and adolescence. No studies assessed alternate forms of presenting the original Pieces of Hurt tool (eg, paper and electronic). Original development, measure instructions, and testing of the Pieces of Hurt tool were conducted with generally healthy 4- to 6-year-olds receiving immunizations.51 The tool was specifically designed to suit the cognitive abilities of children as young as 4 years old. Some deviations were noted from the original Pieces of Hurt tool upper anchor of “the most hurt you could ever have” to “the most hurt you can have,” or “most severe pain.” No new studies assessing the measurement properties of the Pieces of Hurt tool have been published since the original systematic review.92 The Pieces of Hurt was primarily presented in English, but also in Arabic47 and Thai.96 Most studies were of poor to fair quality and reported variable sufficiency of measurement properties. 
Recommendations were additionally informed by concerns with greater use of extreme measure scores in 4- and 5-year-olds with acute pain,93 and 4-year-olds with postoperative pain.49 In postoperative pain, multiple studies indicated insufficient measurement properties that worsened with older child age,49,111 thus justifying extension of a strong recommendation against use up to the oldest studied age of 16 years.

3.8. Visual Analogue Scale

The VAS asks children to mark their pain intensity on a line typically 100 mm in length with a corresponding scale score from 0 to 100 or 0 to 10. A total of 15 studies (reported in 15 publications) examined measurement properties of the VAS in children 2 to 19 years old and were included in this review. Of these, 6 studies focused on use of the VAS for self-report of acute pain,5,7,60,75,76,79 4 studies of postoperative pain,1,2,32,102 and 5 studies of chronic pain.9,83–85,91 Studies were conducted in a variety of hospital (inpatient, outpatient clinic, and emergency department) and community settings (school and clinic). Three studies focused on children aged 12 years and younger,2,84,102 with no studies focused exclusively on young children. All other studies included samples crossing childhood and adolescence. Two studies used electronic presentations of the VAS on handheld electronic devices.83,91 When reported, “no pain” was consistently used as the lower anchor for the VAS. The upper anchor for the VAS was more variable, and included “worst pain possible,” “worst possible pain,” “very much pain,” “biggest hurt you could ever have,” “worst pain ever,” “pain as bad as it could be,” and “extreme/unbearable pain.” The VAS was primarily presented in English, but also in Catalan,83,84 French,5,7 and Thai.76 Most studies were of poor quality and reported sufficient measurement properties, including studies in 6- and 7-year-olds with postoperative pain2,102 and chronic pain.84,85 Only one study reported sufficient measurement properties in 3- to 5-year-olds with postoperative pain. Recommendations were additionally informed by studies reporting insufficient measurement properties in 3- to 7-year-olds with acute pain60,76 and 3- to 5-year-olds with chronic pain.85

3.9. Wong-Baker FACES Pain Rating Scale

The FACES scale is composed of 6 black and white cartoon faces ranging from a smiling face to a sad, tearful face showing as much hurt as you can imagine (scored 0-10; originally scored 0-5). Children choose the face that best describes how they are feeling.113 For more information, see: http://wongbakerfaces.org. A total of 16 studies examined measurement properties of the FACES scale in children 2 to 18 years old and were included in this review. Of these, 14 studies focused on use of the FACES scale for self-report of acute pain,4,5,26,42,44,45,47,57,60,62,76,88,94,113 with only one study each focused on postoperative111 or chronic pain.67 Studies were conducted in a variety of hospital (inpatient, intensive care, outpatient clinic, and emergency department) and community settings (school and clinic). Five studies focused on children aged 12 years and younger,4,26,44,88,94 with one study focused exclusively on young children 4 to 5 years old.88 All other studies included samples crossing childhood and adolescence. The FACES scale is typically presented horizontally on paper; however, studies have presented the faces positioned in a circle113 with one study presenting the scale's faces on 3-dimensional dolls.4 The first peer-reviewed presentation of the FACES scale measure instructions and testing of the scale was conducted with hospitalized children 3 to 18 years old.113 Original reported scale instructions to children include anchors of “no hurt at all” to “hurt as much as you can imagine”113 with some reported deviations to the upper anchor to “lot of pain” or “hurts worst.” The FACES was primarily presented in English, but also in Arabic,4,47 French,5 and Thai.76 A culturally adapted version of the FACES scale developed for Inuit youth in northern Canada (the Northern Pain Scale; NorthPS) includes translated measure instructions in Inuktitut.42 Most studies were of poor quality and reported variable sufficiency of measurement properties. Recommendations were additionally informed by studies reporting sufficient measurement properties for acute pain in 6- and 7-year-olds,4,44,57,60,94,113 and concerns with greater use of extreme scores and insufficient criterion validity in 4- and/or 5-year-olds.76,88

4. Discussion

This systematic review presents a comprehensive examination of measurement properties of single-item self-report pain intensity measures for children and adolescents. The application of COSMIN offered rigorous quantification of sufficiency, methodological quality, and interpretability of measurement property studies80,97 to inform evidence-based recommendations for self-report of pain in 3- to 18-year-olds. Sixty self-report measures were found; however, only 8 met “well-established assessment” criteria31 as reported in 80 publications, including the NRS-11, CAS, FPS-R/FPS, Pieces of Hurt (Poker Chip Tool), Oucher photographic and numeric scales, VAS, and FACES scale. Studies ranged from poor to excellent quality, with poor and fair ratings most frequent. Overall, criterion validity was most commonly assessed across measures, followed by hypothesis testing (construct validity), responsiveness, and reliability. Content and cross-cultural validity were minimally assessed overall. The NRS-11 now has the greatest number of studies assessing its measurement properties in children younger than 18 years followed closely by the FPS-R. The Pieces of Hurt and Oucher scales have little or no new evidence over the past decade.

4.1. Clinical and research implications

This review recommends several measures for pain self-report by children older than 6 years dependent on the type of pain. By presenting evidence for all measures across ages and by type of pain, clinicians and researchers can preferentially select measures based on additional considerations of feasibility, familiarity, ease of use, child preference, and/or comparison with other measures (eg, numeric or faces scales of distress or unpleasantness).105 The largest proportion and highest quality studies were conducted in self-report of acute/procedural pain with the least available evidence for chronic pain. Although strong recommendations for acute pain could be made for the 3 measures (the NRS-11, FPS-R, and CAS), weak recommendations at best could be made across all measures for postoperative and chronic pain. Strong recommendations could be made if future studies showed evidence of sufficient measure reliability in postoperative and chronic pain, and responsiveness in chronic pain. Only the NRS-11 and the VAS were recommended across all types of pain for children 6 or 8 years and older. Recommended use of the NRS-11 in children aged 8 years or older, and even as young as 6 years old with demonstrated numerical competency, is consistent with a recent review of the measure.21

Related reviews have similarly noted concerns with measurement property study quality.59,107 Studies were largely downgraded in quality for insufficient detail in study procedures and/or poor study design. This was more common for older studies that were also more likely to have indeterminate or insufficient measurement properties as a result. Reliability (test–retest and measurement error) showed greater variability than other measurement properties, in part due to undetermined minimally important or clinically significant change for 4 of 8 measures. Meaningful decreases assessed for the NRS-11, CAS, FPS-R, and VAS ranged from 1 to 5/10, and were predominantly determined from acute pain in the emergency department.6,7,19,54,63,79,99,104 This supports individualized approaches to determining the clinical significance of pain.16 Furthermore, most studies neglected to report percentage of respondents rating lowest (no pain) or highest measure scores, of particular relevance given young children's tendency to select measure extreme scores.88,93 Inconsistent use of measurement property terminology was observed. Researchers should use COSMIN to guide high-quality design, terminology, and reporting of future measurement property studies.74,97
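The percentage of respondents choosing the lowest or highest score, noted above as rarely reported, is simple to compute from raw ratings. The sketch below is illustrative only: the function name `extreme_score_proportions`, the 0 to 10 scoring range, and the sample ratings are assumptions and do not come from any reviewed study.

```python
def extreme_score_proportions(scores, minimum=0, maximum=10):
    """Return the proportions of ratings at the lowest and highest score.

    Assumption: `scores` holds one rating per respondent on a scale
    bounded by `minimum` (no pain) and `maximum` (upper anchor).
    """
    if not scores:
        raise ValueError("at least one rating is required")
    n = len(scores)
    lowest = sum(1 for s in scores if s == minimum) / n
    highest = sum(1 for s in scores if s == maximum) / n
    return lowest, highest


# Hypothetical ratings from 8 children, for illustration only
ratings = [0, 0, 10, 5, 3, 10, 10, 2]
lowest, highest = extreme_score_proportions(ratings)
print(f"lowest score: {lowest:.1%}, highest score: {highest:.1%}")
```

Running the same calculation separately for each year of age would show whether extreme scores cluster among younger respondents.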

Despite several measures intentionally developed to suit younger children's cognitive abilities,10,51,53,113 measures were not recommended for children younger than 6 years, and sometimes up to 8 years old. This differs from previous PedIMMPACT measure recommendations for pain self-report in young children,66 but more closely aligns with the general attainment of cognitive abilities required for most self-report pain intensity at 5 or 6 years of age.29 Evidence of insufficient measurement properties across measures in 3- to 4-year-olds is consistent with a recent review in this area.107 These findings should not be misconstrued as a global recommendation against self-report in children of this age. Rather, it is a reiterated call for further testing and refinement of self-report pain intensity measures appropriate for younger children, including consideration of screening tasks to identify suitability of measure use, training or gained experience with measure use, or measure simplification.107 Several simplified or alternative measures exist but have not been sufficiently studied to meet “well-established assessment” criteria.43,61,106 Unfortunately, studies did not consistently include age-based analyses and often aggregated data from children across large age ranges. We recommend studies include subanalyses by chronological year of age in children younger than 8 years with adequate sample size.101 There is no standard method of assessing children's ability to accurately use self-report scales, with age offering the best proxy.107,110 Currently, it may be best to judge adequacy of each child's cognitive abilities for potential appropriate use of a self-report measure included in this review. When in doubt, consider a preceding yes–no question about the presence of pain or hurt,106,107 and supplement with observer or behavioral assessment of pain behaviors.108

Variability was seen in upper anchors denoting highest level of hurt or pain. This was particularly notable for measures without standardized instructions (NRS-11 and VAS), but also occurred with standardized measures (FPS-R and FACES). Changes to lexical, numerical, or facial anchors significantly impact self-reported pain.25,27,112,116 Recommended measure anchors were specified prioritizing original standardized instructions (FPS-R, FACES, Oucher, and Pieces of Hurt), followed by recommendations from empirical research of measure anchors (NRS-11),116 and then anchors most commonly used in reviewed research (VAS). A minority of studies used generic names (eg, “faces scale”) and did not specify anchors. Ambiguity in reporting pain assessment has been identified in adult studies leading to reporting recommendations, such as specification of anchors, frequency of pain assessment, time period rated (eg, past week), type of pain intensity (eg, current), and location or pain condition rated.86 The same ambiguity often impedes interpretation of pediatric studies. Given the large number of reviewed studies conducted in languages other than English, careful translation of measure instructions or anchors for English language publication is needed,24 with good practice to report in both the original language and English translation.6,54 Additional translations of reviewed measures exist beyond those reported here (see measure web sites), but no peer-reviewed publications were found assessing their measurement properties.

This review identified electronic presentations of the FPS-R,50,83,95,114 VAS,83,91 CAS,83,95 and NRS-11 (ref 23) on handheld devices with poor to excellent quality evidence. Equivalent or superior (eg, higher reliability) data are recommended to support equivalence of electronically completed patient-reported outcomes to original paper-based outcomes.33 This has not yet been clearly achieved for pediatric self-report pain intensity measures, but this is a promising avenue for future research and practice. Given the rise of apps for pain, and their generally poor evidence base,37,58 these guidelines can inform quality development and testing of electronic self-report pain intensity measures.33

4.2. Limitations

Although COSMIN offered the most established and rigorous means of assessing measurement property evidence, it was not designed for single-item measures and required some adaptation. Modifications for our review included the omission of inapplicable measurement property categories (such as internal consistency and structural validity), and de-emphasis of content and cross-cultural validity in the translation of measurement property evidence to measure recommendations.71,80,97 Content validity includes assessment of measure comprehensiveness, comprehensibility, and relevance to the construct of interest. While single-item self-report pain intensity measures are inherently limited in addressing the comprehensiveness of pain experience, included studies have tried.1 Furthermore, included studies have assessed comprehensibility and relevance of both faces and numeric scales.50,96 This provides valuable information regarding children and adolescents' understanding and perceived relevance of single-item measures to their pain experience. Cross-cultural validity considers measure performance differences between language and cultural groups. Because they are single items, it is likely that researchers do not frequently undertake a formal measure translation and evaluation process, although cross-cultural/language multiple group comparisons have been performed with very short (3-item) self-report measurement in pediatric pain.40 An additional potential limitation of this review is its exclusive focus on studies assessing measurement properties. This is recommended by COSMIN; however, additional information regarding responsiveness could be inferred from intervention studies.

5. Conclusions

The availability of 8 “well-established” self-report pain intensity measures with at least some supportive measurement property evidence for children 6 years and older directs selection of one of these over other less rigorously tested measures. Evidence of variable quality was shown across types of pain and from early childhood to late adolescence. Overall, we strongly emphasize the value of self-report of pain given the subjectivity of pain experience,56 and lack of regular pain assessment and documentation in clinical care.89,90 There is a need for continued efforts focused on self-report of pain in young children.

Conflict of interest statement

Acknowledgements

The authors thank Arnav Agarwal and Sarah Lloyd for their research assistance, as well as medical librarian Tamsin Adams-Webber at The Hospital for Sick Children for her assistance with development and execution of the literature search strategy. The authors would also like to extend their appreciation to Dr. Carl von Baeyer for his feedback on an early version of this manuscript.

K.A. Birnie is supported by a Fellowship from the Canadian Institutes of Health Research.
