Abstract

The safe and effective administration of fluids and medications during the management of medical emergencies in children depends on an appropriately determined dose, based on body weight. Weight can often not be measured in these circumstances and a convenient, quick and accurate method of weight estimation is required. Most methods in current use are not accurate enough, but the newer length-based, habitus-modified (two-dimensional) systems have shown significantly higher accuracy. This meta-analysis evaluated the accuracy of weight estimation systems in children. Articles were screened for inclusion into two study arms: to determine an appropriate accuracy target for weight estimation systems; and to evaluate the accuracy of existing systems using standard meta-analysis techniques. There was no evidence found to support any specific goal of accuracy. Based on the findings of this study, a proposed minimum accuracy of 70% of estimations within 10% of actual weight (PW10 > 70%), and 95% within 20% of actual weight (PW20 > 95%) should be demonstrated by a weight estimation system before being considered to be accurate. In the meta-analysis, the two-dimensional systems performed best. The Mercy method (PW10 70.9%, PW20 95.3%), the PAWPER tape (PW10 78.0%, PW20 96.6%) and parental estimates (PW10 69.8%, PW20 87.1%) were the most accurate systems investigated, with the Broselow tape (PW10 55.6%, PW20 81.2%) achieving a lesser accuracy. Age-based estimates achieved a very low accuracy. Age- and length-based systems had a substantial difference in over- and underestimation of weight in high-income and low- and middle-income populations. A benchmark for minimum accuracy is recommended for weight estimation studies and a PW10 > 70% with PW20 > 95% is suggested. The Mercy method, the PAWPER tape and parental estimates were the most accurate weight estimation systems followed by length-based and age-based systems. The use of age-based formulas should be abandoned because of their poor accuracy.

Keywords

Weight estimationBroselow tapePAWPER tapeMercy method

Introduction

It cannot be considered to be good medical practice to use a weight estimation system that is known to be inaccurate [1]. When children’s weight cannot be measured during emergency care, an accurate, rapid estimation of weight is needed, as the safety and effectiveness of emergent interventions may ultimately depend on the accuracy of the weight estimation [2, 3]. Since most drug doses in children are based on weight, an accurate estimation of weight is important to ensure that a correct amount of medication is administered to achieve the desired effect, as well as to prevent the potential complications and side-effects of overdosing [4, 5]. This is relevant because most paediatric medication errors occur in the Emergency Department and most cases of resultant patient harm are related to incorrect dosing [6–8].

The problem is that most contemporary methods used to estimate children’s weight have been shown to lack sufficient accuracy and consistency of performance in different populations [9]. Most existing weight estimation systems are “one-dimensional”, because a single variable, usually age or length, is used in the weight estimation methodology. These systems fail because a single variable cannot adequately account for the biological variability of weight-for-age and weight-for-length [10, 11]. There is a wide variability of body habitus that is not accounted for in these weight-estimation systems, aggravated by the increasing levels of obesity affecting children [12, 13]. Newer, more promising, methods are the “two-dimensional” or dual length- and habitus-based systems, which include two variables in the estimation methodology: length (or a surrogate such as humerus or ulna length) and habitus (or a surrogate such as mid-arm circumference or waist circumference) [5, 14–17]. These have been shown to be much more accurate than the older, one-dimensional systems, in many studies [5, 15, 18–22].

Healthcare providers may also need more than one approach to emergency weight estimation: while parental estimates of weight can be very accurate, parents may not be present at the time that emergency care is required (especially in the prehospital environment) [9]. In these situations, an evidence-based alternative system may be required.

There has been a large amount of material published on weight estimation in children. It would be useful to combine the data from these studies to establish the accuracy of different methodologies both within and between different populations. Since many of the same weight estimation systems are used in populations with very different prevalences of underweight and obese children, it needs to be ascertained whether this impacts on the accuracy outcomes of these systems.

In order to create an evidence-based approach to emergency paediatric weight estimation, it is crucial to discover which methods predict weight most accurately and which are most appropriate for emergency use. This will enable clinicians to decide which systems they should incorporate into their clinical practice and will provide some guidance to those who administer, teach and train paediatric advanced life support on which systems are important.

The overall aim of this study was to determine which paediatric weight estimation systems most accurately estimate total body weight in children. The first objective was to determine whether there was evidence in the literature for an acceptable benchmark level of accuracy for a weight estimation system. The second objective was to extract and pool data on the performance of paediatric weight estimation systems to integrate the findings, provide a more comprehensive analysis on their functioning and identify those systems that operated best in diverse populations. The third objective was to directly compare the accuracy of paediatric weight estimation systems, for which paired data was available, using pooled data and meta-analysis techniques.

Only one meta-analysis has addressed this topic, but was limited to studies in low- and middle-income countries [23].

Methods

This systematic review and meta-analysis followed the PRISMA guidelines.

Search strategy

Online databases (MEDLINE, SCOPUS, Science Direct and Google) were interrogated for eligible studies, published between January 1983 and May 2017, using the following search terms: “paediatric weight estimation”, “weight estimation children” and “Broselow tape”. Citation lists of reviewed papers were examined for additional relevant articles. Studies in any language were included if English translations were obtainable. To minimise publication bias, all studies with adequate reporting were included, whether full-text articles, dissertations, abstracts, conference presentations or other unpublished data that had undergone some form of peer-review.

Study selection and eligibility criteria

All studies that evaluated weight-estimation methodologies were assessed for inclusion into the study by two separate investigators (MW and LG). Articles that contained discussions on desired targets of accuracy of weight estimation systems, or analysis of the performance of weight-estimation systems were included in the qualitative arm of the review. Studies that presented original data with either accuracy data (percentage of estimations within 10% of actual weight (PW10)) or bias and precision data (mean percentage error plus an appropriate indicator of variance), or both, were included in the meta-analysis. Studies that did not include original data, those that did not include usable data and those at high risk of bias (see below) were excluded from the meta-analysis (see Fig. 1).

Fig. 1

The PRISMA flow-chart of the study design

Data abstraction and analysis

Data was extracted from the included studies independently by two researchers (MW, LG), cross-checked and confirmed. Standard statistics for meta-analysis of method-comparison studies were used [24], with an emphasis on evaluating accuracy (percentage of estimations within 10% of actual weight), bias (mean percentage error) as well as precision (limits of agreement of percentage error). Two methods of representing the pooled parametric and non-parametric data were employed: a fixed effects model weighted by inverse variance and a random effects model. In general, the random effects model was preferred because of the large variance within and between samples as well as the effects of several very large database studies that may have introduced bias.

Many of the evaluated studies presented incomplete data. Where it was possible, without risking bias, missing data was imputed using standard methodologies [25].

Direct comparisons between weight estimation systems, using pooled paired data, were performed with non-parametric techniques based on PW10 accuracy data, where such data was available.

Subgroup analysis

There was considerable heterogeneity in the use and composition of subgroups within the included studies. Wherever possible, subgroup analyses that had been performed in each study were included in the overall meta-analysis. The included subgroups focused on different age groups as previous studies have shown a difference in weight estimation accuracy between infants (<1 year), toddlers and pre-school children (1 to 6 years) and older children (>6 years of age) [26].

Risk of bias within and across studies

Reporting bias was minimised by including all available methodologically sound studies (published or not). Methodological causes of potential bias were common (e.g. the Broselow tape was not actually used in many studies, but weight-estimates were generated from length data), but these were individually assessed and rated according to the level of risk of systematic bias. Studies with a high risk of bias were excluded from the meta-analysis (e.g. studies which excluded children above or below certain weight-for-length centiles).

Sensitivity analysis

There were three large database studies among those evaluated, with more than 100,000 children, one of which had more than 400,000 data points [27–29]. The effects of these “virtual” weight estimation studies, from very large databases, were carefully considered to establish any significant contribution to bias or distorted outcomes.

Results

Excluded studies

The most common reason for exclusion of potentially relevant studies was incomplete data presentation (see Fig. 1). The large database studies did not have a significant impact on overall outcomes based on the sensitivity analysis and were therefore not excluded from the analysis.

Characteristics of included studies

Two-thirds of included studies evaluated multiple weight-estimation systems and contained paired data or made direct comparisons, while one-third evaluated only a single system. Prospective studies accounted for the majority of articles (70/98 (71.4%)) but a minority of total patients (58,618/1,054,673 (5.6%)).

Table 1 provides a descriptive summary of the studies included in both the qualitative review as well as the meta-analysis, including the major findings and limitations of each study and the risk of bias assessment for each included study.

Table 1

Studies included in the qualitative review and quantitative meta-analysis

Findings: Height was a good predictor of weight; IBW is only useful for a handful of drugs; TBW must be used in low weight-for-height children. Comments: Derivation study for Traub-Kichen formula. IBW predicted (actually 50th centile weight-for-length) by formula. Limitations: Incomplete presentation of data. Limited validation of formula.

Findings: DWEM performed best of methods tested. Body habitus accurately assessed by evaluators. Comments: First ever report of evaluation of weight estimation systems in the literature. None of the systems tested were very accurate. Limitations: Only children up to 170 cm were included. Incomplete presentation of data.

Findings: Broselow tape better than healthcare provider guesses and similar accuracy to DWEM. Accuracy of Broselow tape falls off sharply in children > 25 kg. Comments: Original study of Broselow tape. Authors recommended that an assessment of body habitus in children > 25 kg should be considered. Limitations: No formal, prospective comparison with other methodologies or indication of desired accuracy.

Findings: Guesses of weight are very inaccurate; children should be weighed whenever possible. Comments: Authors suggest that accurate weight estimation is required for most drugs administered in emergency situations. Age-based formulas were wrongly considered acceptable. Limitations: Incomplete presentation of data, very small sample.

Findings: Parental estimates performed much better than formula. Distraught parents may be unreliable. Comments: Small sample size. Only children < 6 years included. Limitations: Over- or underestimation not recorded. Incomplete presentation of data.

Findings: Broselow tape performed best, far better than parental estimates and age-based formulas. Comments: The target of 10% accuracy chosen for children was deliberately less than the 20% that the authors considered would be appropriate for adults. Limitations: Incomplete presentation of data and small sample size.

Findings: Parents, especially mothers, can accurately estimate their children’s weights. Comments: Those parents that had weighed their children an average of 5 weeks previously had the best results. The authors defined highly accurate weight estimations as < 5% error, accurate as < 10% error and semi-accurate as < 20% error. Limitations: Incomplete presentation of data. Misinterpretation of bias as indicative of accuracy.

Findings: Weight estimates by parents, nurses and doctors were significantly unreliable. The error is “so great and so frequent that clinically significant untoward effects can be anticipated”. Comments: Broselow tape recommended by authors. Limitations: Incomplete presentation of data.

Findings: Healthcare provider guesses were very inaccurate; Blantyre tape better than guesses. A 20% error considered an acceptable target. Comments: Very young study population, mostly under 5 years. Limitations: Incomplete presentation of data.

Findings: Broselow tape most accurate in children from 10 to 25 kg, but acceptable for all children. Adjustment for habitus would be advantageous. Comments: Accuracy of Broselow tape outside of the 10–25 kg range was actually poor. The accuracy in this range was reasonable, but not as good as the authors suggest. Limitations: Poor interpretation of statistics. Broselow tape version not reported.

Findings: Broselow tape and DWEM were more accurate than formulas. These methods should be used if weighing not possible. Comments: EPLS worst performer but poor accuracy of all systems. Good reproducibility of assessment of body habitus. Limitations: Incomplete presentation of data. Broselow tape version not reported.

Findings: Broselow tape was accurate but underestimated weight in older children. Comments: Nearly 25% of sample excluded because they were too tall for the tape. Limitations: Broselow tape not actually used and version not reported. Incomplete presentation of data.

Findings: Doctors’ guesses of children’s weight were not accurate—drug doses should therefore be titrated in small paediatric patients. Comments: Wide variation in different doctors’ accuracy, not related to seniority. All estimators were very inaccurate; worst estimations occurred in children < 20 kg. Limitations: Incomplete presentation of data; conclusion that underestimation of weight may “not be a serious problem” was not supported by the evidence.

Findings: Both methods performed poorly and worsened with increasing age. Comments: Difficult to draw any conclusions from this study, but Broselow tape marginally better than formula. Authors suggest that methods of weight estimation not keeping up with increasing obesity. Limitations: Broselow tape version not reported. Incomplete presentation of data.

Findings: The Leffler formula was more accurate than the EPLS formula. Comments: A new formula was promoted but the evidence was limited: only bias was evaluated, not accuracy. Both formulas underestimated weight significantly The Luscombe formula, as known today, was the result of a later study. Limitations: Incomplete presentation of data.

Findings: Formulas and Broselow tape underestimated the weight of Pacific Island and Maori children. Comments: Broselow tape was the best performer of the systems tested, despite the fact that the authors reported the contrary. Limitations: Broselow tape version not reported. Incomplete presentation of data.

Findings: Foot length can be used for emergency drug calculation Comments: Only 10% of population > 1 year old. very poor statistics. Method of data analysis made findings unreliable and uninterpretable - bias confused with accuracy.

Findings: One third of children had inaccurate weight estimations. The 1998 tape performed better than the 2002 tape. A measurement of obesity (e.g. MAC) should be added to Broselow tape to increase accuracy. Comments: Broselow tape version 1998 and 2002A used. The study population BMI was 16.8. Good statistics. Limitations: Broselow tape not actually used. Nearly 7% of sample excluded because too tall for the tape.

Findings: Formulas overestimated weight in this developing-world study. The Broselow tape was the most accurate. Comments: More than half the study population was under 6 months of age and only 8% > 5 years. Limitations: Incomplete presentation of data.

Findings: Broselow tape accurate in children of normal weight-for-length Comments: Only children falling within 3rd to 97th weight-for-height centiles included. Very young study population. Broselow tape only recommended by authors for “normal-growth” children < 20 kg and < 120 cm. Limitations: Incomplete presentation of data. Conclusions not supported by findings.

Findings: Broselow tape reasonably accurate in this population, but less so in children > 25 kg. Comments: Overall underestimation of weight. Performance not very good and on par with most other studies. Limitations: Broselow tape not actually used and version not reported. Incomplete presentation of data.

Findings: BG performed moderately well, but overestimated weight in low BMI children. Comments: Multiple papers on same data. BMI was 17 in study population. Significant number of children had large errors of weight estimation. Limitations: Incomplete presentation of data.

Findings: Parental estimates performed best, followed by Broselow tape. Only 11% of parents could not provide an estimate. Formulas performed much worse than other methods. Comments: Multiple papers on same data. Study population BMI was 17.1. Limitations: Broselow tape not actually used. Broselow tape version not reported. Incomplete presentation of data.

Findings: The authors commented that since few children with high-acuity conditions are actually weighed in clinical practice, weight estimation essential. The EPLS formula significantly underestimated weight, which may lead to under-resuscitation. The Luscombe formula was more accurate. Comments: Both formulas actually performed poorly. Limitations: Incomplete presentation of data. Mean bias used incorrectly.

Findings: The Luscombe formula was less accurate than EPLS with greater overestimation of weight. The authors suggested length-based systems should rather be used. Comments: Scientific letter. Both formulas performed very poorly with significant overestimation of weight. Limitations: Incomplete presentation of data.

Findings: The BG formula performed better than EPLS and ARC formulas. Authors advised cautious use in infants. Comments: Data obtained from high acuity patients. The BG did not actually perform that well in this study. Limitations: Incomplete presentation of data. Only measures of bias used for comparison.

Findings: New formulas developed with no target accuracy and no validation sample. Comments: Limited indication of accuracy of formulas as only measures of bias reported. Limitations: Incomplete presentation of data.

Comments: This was a commentary on Broselow tape—that IBW might be better than TBW as a goal; “the Broselow tape is a tool that was not designed to be used without clinical judgement”. No evidence offered to support opinion; no goal target suggested (for TBW or IBW).

Findings: Broselow tape overestimated weight by > 10% in Indian children over 10 kg. A correction factor was developed, but not validated. Comments: One of the few studies to show overestimation of weight by Broselow tape. Limitations: Broselow tape version not reported. Incomplete presentation of data.

Findings: Broselow tape and DWEM were the least accurate methods. Comments: No conclusions can be drawn from this study because of the methodology. Limitations: The data appears to favour healthcare provider and parent guesses, but the statistical methodology is flawed.

Findings: The Broselow tape was often inaccurate and tended to underestimate weight. Comments: Abstract. The Broselow tape actually performed better in this study than in many other studies. Limitations: Broselow tape version not reported. Broselow tape not actually used.

Findings: The Broselow tape performed well, especially in children < 10 kg but underestimated all others, especially in children > 18 kg. Comments: Abstract. Very large prospective study. The authors recommended a new version of Broselow tape for Indian children because of underestimation of weight. Limitations: Broselow tape version not reported. Incomplete presentation of data.

Findings: Estimates of weight can be based on MAC. A special colour-coded MAC tape could be produced to aid drug dosing. The authors recommended habitus modified use of Broselow tape. Comments: Abstract. No data presented. Broselow tape performed better in younger children, MAC better in older children. Limitations: No data presentation.

Findings: Parents were better than nurses at estimating weight; nurses were very inaccurate. Comments: Guessed weights most often underestimations. The longer the time from last weighing, the greater the error. All nurses, regardless of training and experience, were poor estimators. Limitations: Incomplete presentation of data.

Findings: Length-based and age-based systems are suitable in emergencies, but length-based were better; new formulas more accurate than EPLS; “one size fits all” approach not likely to be successful. Comments: Unique method of analysing data—does not allow comparisons with other studies in this format. Age-based methods less accurate than suggested; biological variability less in length-based than age-based systems. Limitations: Incomplete presentation of data.

Findings: Broselow tape most accurate weight estimation method with best accuracy in normal BMI children. Comments: Study population BMI was 17.8. New formula developed, but not tested. Limitations: Incomplete presentation of data. Bias mistaken for accuracy. Broselow tape not actually used and Broselow tape version not reported.

Findings: Best performance of Broselow tape between 10 and 25 kg but very inaccurate in children >25 kg. More accurate than age-based formulas, however. Comments: Masters dissertation. Good statistics. Limitations: Broselow tape version not reported.

Findings: Argall formula performed best. Comments: Article in Turkish; many formulas studied, often applied outside of the age-range for which they were intended. Limitations: Incomplete presentation of data. Measures of bias mistaken for measures of accuracy.

Findings: BG was accurate in children 1–4 years of age. The EPLS formula was the least accurate of all methods. The authors recommend that the accuracy and ease-of-use of the Broselow tape mandates its use in the ED and pre-hospital. Comments: The authors’ suggestions that the BG was more “accurate” than the Broselow tape are misleading: the Broselow tape actually significantly outperformed the BG formula. BMI was 17.9 in study population. Limitations: Incomplete presentation of data. Broselow tape not actually used and Broselow tape version not reported.

Findings: MAC formula outperformed Broselow tape in children > 5 years. Comments: MAC actually only more accurate in children from 9 to 11 years who were too tall for the Broselow tape. Poor accuracy of all systems demonstrated. Limitations: Broselow tape not actually used, and 1998 edition values used. No evidence provided for desired accuracy targets.

Findings: Broselow tape was more accurate than age-based formulas but tended to slightly underestimate weight. Comments: Abstract; none of the systems estimated weight with an acceptable degree of accuracy. Limitations: Broselow tape version not reported. Incomplete presentation of data.

Findings: Both age and weight were estimated poorly by paramedics, with no association with experience or training. Comments: Estimation of weight and age from images of two children by 234 estimators. Limitations: Incomplete presentation of data.

Findings: Broselow tape was not accurate in First Nations’ children and may need to increase weight estimates by 12%. Although ideal method of dosing is still unknown, it is best practice to eliminate any possible contributing inaccuracies. Comments: First Nations’ children selected with high incidence of obesity in this population. Broselow tape did not perform equally well in all ethnic/population groups. Limitations: Incorrect use of some statistics. Broselow tape not actually used and Broselow tape version not reported.

Findings: CAWR-1 should be used in younger Chinese children, but other methods of weight estimation should be used in older children as age formulas are inaccurate. Comments: Comprehensive statistics. Findings suggested that no age formula performed well at any age. Limitations: No evidence to support suggested target of weight estimation.

Findings: Parents can estimate weight accurately based on home measurements, but very few actually weigh children at home. Comments: Authors used doses of ibuprofen and paracetamol as endpoints: the applicability in emergency situations is uncertain. Limitations: Incomplete presentation of data. Very small sample size.

Findings: Broselow tape and EPLS formula most accurate in this population. Careful titration of drugs and use of clinical judgement most important in using medications safely. Comments: Data from a poor community in South Africa. The accuracy of EPLS formula was the best ever reported while the accuracy of Broselow tape was on par with other reports. Only 4% of children excluded as too tall for the tape. The authors question whether the differences in accuracy of any weight estimation system are likely to affect outcomes. Limitations: Broselow tape not actually used and Broselow tape version not reported.

Findings: Parental estimates were most accurate when based on measurements made at home, rather than onguesses. Comments: Application to emergency weight estimations is uncertain as parental stress may negate this effect. Limitations: Targets were estimations of overweight or underweight. Incomplete presentation of data.

Findings: The Luscombe formula performed best of all the formulas. Comments: Study population BMI was 17. Same population as Nguyen 2007 and several other publications; all formulas performed poorly. Limitations: Incomplete presentation of data.

Findings: Broselow tape performed poorly, potentially leading to under-resuscitation in all weight categories, especially in younger children. Drug doses correct in only about 50% of cases. Consensus opinion required whether to use IBW or TBW during resuscitation. Comments: Broselow tape 2007B. High incidence of obesity in study population. Limitations: Broselow tape not actually used. Incomplete presentation of data. No direct assessment of accuracy of weight estimation.

Findings: The Luscombe formula outperformed the EPLS formula. Weight estimation is of paramount importance in resuscitation, therefore remembering one formula better than several. Comments: While the bias of the Luscombe formula was smaller, both formulas performed poorly. Limitations: Incomplete presentation of data. An inappropriate age range was used for formulas (up to 16 years).

Findings: The EPLS formula was least accurate of commonly used formulas. The BG and Luscombe formulas were very similar and the best performers. No formulas showed acceptable accuracy, however. Comments: Abstract, with additional data supplied by author. This was a very large retrospective database study with good descriptive statistics. Limitations: Some incomplete data.

Findings: The Broselow tape was better than guesses by doctors, but not in obese children. Poor assessment of habitus by doctors. IBW suggested as the best target for estimation in obese kids. Comments: Broselow tape 2007B. 35% of study population obese or overweight. Mean BMI was 17.4. It is a reflection of how poorly the Broselow tape performed in obese children that doctor estimates were better; overall findings of Broselow tape accuracy similar to other studies. Limitations: Incomplete data presentation. Incorrect use of some statistics.

Findings: More than 50% of study population required habitus-modified weight estimation. Length-based systems should be used for weight estimation in emergencies. Comments: Editorial comment. Insufficient evidence exists to choose between IBW and TBW for drug dose calculations. Limitations: No reference standard for weight estimation included.

Findings: The Mercy method performed significantly better than the age-based, length-based and MAC-based systems. Comments: This was the original development and validation study of the Mercy method. Good mix of overweight children, but few underweight kids in sample. Broselow tape unable to be used in more than one third of the sample (too tall for the tape). Limitations: Broselow tape not actually used and Broselow tape version not reported.

Findings: All formulas performed similarly and all poorly, even the new formula derived from the study population. Comments: Only children aged 1 to 5 are included in study. Limitations: No validation sample for derived formula. Incomplete presentation of data.

Findings: CAWR-2 performed as well as other formulas. Age formulas performed less well in older children. MAC performed well in older children but poorly in younger children. Comments: Conference presentation. Study of CAWR in a Western population. No age formula actually performed well in any age group. Limitations: Some errors in presented data and calculations.

Findings: Using the age on clothing label was more accurate than using actual age. Comments: This clever study used clothing size as a surrogate marker for habitus. The Luscombe formula performed slightly better than the EPLS formula. Limitations: Incomplete presentation of data.

Findings: The Garwood formula was better than the EPLS formula. Comments: Abstract. Authors used assessment of mean bias only. Limitations: Limited statistical analysis made findings difficult to interpret. No validation of new formula.

Findings: The Broselow tape was an accurate tool to estimate weight. Comments: Broselow tape agreement between pre-hospital personnel and ED personnel only 70.1%. Limitations: Most of study population < 4 years. Incomplete presentation of ldata. Broselow tape version not reported.

Findings: Best systems to be used, in order of accuracy: parental estimates, Broselow tape, age-based formulas. Comments: Book chapter. Limitations: No mention of acceptable targets for weight estimation or weaknesses of age formulas. Subjective assessment of studies only.

Findings: The Broselow tape is the most consistent and reliable tool for weight estimation, but habitus-based weight adjustment for Broselow tape is logical. Parental estimates may rival the Broselow tape, but parents may be absent or uncertain, especially under stress. Underdosing might be prudent in emergencies. Comments: The author acknowledges that the degree of acceptable error is difficult to define. Limitations: No evidence provided of efficacy of Broselow tape in reducing medication errors. No suggestion of goal target accuracy required. Some controversial opinions about drug dosing.

Findings: Broselow tape effective, although significantly underestimated weight. Similar findings in urban and rural children. Comments: Broselow tape edition 2002A. The authors suggested that an error range of 30% is reasonable and that, ideally, technology should be developed to estimate weight. Limitations: Broselow tape not actually used—data taken from anthropometric measurements.

Findings: All methods underestimated weight, possibly because of a secular trend towards increasing BMI. The Park formula performed best and was accurate in children < 1 year of age. Comments: All methods assessed showed poor accuracy. Limitations: Incomplete presentation of data. Poor interpretation of bias vs. accuracy. Broselow tape was not actually used and Broselow tape version not reported.

Findings: Mixed racial study population, no major racial differences found in performance of formulas. There was increased underestimation of weight with increasing age. The Luscombe formula performed better than EPLS and APLS in children 1 to 10 years of age; BG and APLS better outside these ages. Limitations: Limited statistical analysis and incomplete data—only measures of bias reported. Some data values were not credible.

Findings: It was possible to weigh children during trauma resuscitation. Comments: Broselow tape weight was compared to stretcher weight. Nearly 40% of children in the study could not be weighed—more often the sicker kids (not fully explained). There was no validation of the accuracy of weight measured during resuscitation. Poor performance by Broselow tape with 18% of children too tall for the tape. Limitations: Incomplete presentation of data. Broselow tape version not reported.

Findings: Family member estimation was most accurate, and the Broselow tape the most accurate of other weight estimation methods. Comments: Equal underestimation and overestimation by Broselow tape while family estimates tended to overestimate. Limitations: Incomplete presentation of data. Broselow tape version not reported.

Findings: As average weight of children has increased so the accuracy of formulas has decreased; age-based methods will be unable to adjust to deal with future rises in average weights of children; no age-based weight formula accurate—might as well continue with EPLS formula. Comments: Semi-systematic review. Limitations: No suggestion of goal target accuracy required.

Findings: Prediction models incorporating MAC and either tibia or ulna length performed extremely well. Age-based formulas were very inaccurate. Comments: Masters dissertation. Weight was markedly overestimated by formulas in this population with a high prevalence of HIV. Limitations: Incomplete statistics.

Findings: Both two-dimensional and 3D Mercy methods outperformed the Broselow tape. Comments: Study population BMI was 17.3. Children much older than most weight-estimation studies included. Multicentre study—3 in the USA. One third of children were too tall for the tape. The 3D tape was significantly less accurate than the two-dimensional method. Limitations: Broselow tape not actually used and Broselow tape version not reported.

Findings: The Mercy method was accurate across a wider age range than other methods. Comments: Study population BMI was 17.6 with very few underweight children. Inter-rater assessment was generally reasonable, but two of the raters were inferior to the others. Limitations: Broselow tape not actually used and Broselow tape version not reported.

Findings: Children “too tall for the tape” do not have full adult weight—this assumption would lead to an average 30% overestimation of weight in this study. About 40% of children were too tall for the tape. Comments: Selective evaluation of children who were too tall for Broselow tape. Interesting data which highlights a flaw in the Broselow tape methodology. Broselow tape was not practically useful over the age of 10. An overestimation of 30% would be unacceptable; no mention of desirable target. Limitations: Relatively small sample size.

Findings: New APLS formula was best for infants, BG best for older children. Broselow tape wrong zone in up to 60% of children. Comments: Broselow tape 2007B. Despite criticisms of Broselow tape, it outperformed the formulas in every analysis. This study had some of the poorest performances of aged-based formulas in any study. A post hoc study population BMI was calculated to be 18 to 22. Limitations: Incomplete presentation of data. Broselow tape not actually used.

Findings: Garwood formula performed best, especially in older children. Comments: Sample population of cancer patients. Very poor performance of all formulas tested—none were close to acceptable accuracy. Limitations: Formulas used for children older than intended. Poor interpretation of findings.

Findings: The authors suggested that EMS personnel were generally accurate in estimating weights of children. Comments: The more severe the condition, the worse the weight estimation. Weight estimation appeared better than it actually was. Limitations: Incomplete presentation of data.

Findings: PT performed better than Broselow tape in every category analysed and better than any previously published system. Comments: Broselow tape 2007B. Population with mixed under- and overweight. Multi-centre study of habitus-modified length-based weight estimation. Limitations: Assessment of body habitus based on visual estimation.

Findings: Finger counting system as accurate as Broselow tape and more accurate than other formulas. Conceptually a simple system. To increase the accuracy weight-estimation systems may cause increased complexity and stress during resuscitations. Comments: The finger counting method is equivalent to formula Wt = 2.5 × age(years) + 7.5. Median BMI of study population was 17.2. Limitations: Incomplete presentation of data. Broselow tape not actually used and version not reported.

Findings: Some experts have suggested that weight estimation cannot be accurate. This is likely to be to the disadvantage of children as dual length-and habitus-based can achieve acceptable accuracy. Comments: Brief review of weight estimation systems. Limitations: No target of weight estimation accuracy suggested; Mercy method recommended for environments where no scale available—no mention of emergency use.

Findings: Unreliable. Comments: Unclear statistical analysis. The reported results for age-based formulas were the best ever reported, but are completely incorrect. The graphically presented data contradict the other findings. Limitations: Incomplete presentation of data.

Findings: Broselow tape was most accurate in this study. Comments: Broselow tape 2007 edition B. Aboriginal and Torres Strait Island children included (low weight-for-length). Only published study of Sandell tape. Limitations: Broselow tape not actually used. Incomplete presentation of data. Narrow age range evaluated. Very fat and very thin children excluded from study.

Findings: Mercy method performed well in Indian children, similar to that shown in Western populations. Comments: Good statistics. Overestimation of weight by all methods except Mercy. Limitations: Broselow tape not actually used and Broselow tape version not reported.

Findings: Mercy method performed extremely well in this population in Mali, similar to its performance elsewhere in the world. Other methods overestimated weight. Comments: BMI of study population was 15.6, with 22% underweight and 1.7% overweight or obese. Good inter-rater reliability. Limitations: Broselow tape not actually used and version not reported.

Findings: The APLS formula underestimated weight in these Nigerian children. Comments: The findings are unreliable in view of the major methodological flaws. Limitations: Limited and incomplete data presentation.

Findings: Potentially enhanced accuracy of age-based equations by using a habitus-specific formula, if length-based methods not available. Comments: Three formulas for “thick”, “normal” and “thin” children. Theoretical accuracy in derivation study. Limitations: No original data or validation of formulas. Incomplete presentation of data.

Findings: New APLS formula very inaccurate and should not be used. Weight should be adjusted according to a data table with the 5th and 95th weight-for-height values. Comments: Study performed in ICU patients. Limitations: Incomplete presentation of data. New regression formula untested, even with an internal validation sub-sample. Adjustment of weight estimation based on a “guess”.

Findings: Neither formula was accurate in Nigerian children with a substantial overestimation of weight. Comments: This was one of the largest prospective studies of age-based formulas in the developing world. Limitations: Some limitations from incomplete statistics.

Main results: “Real world” performance of weight estimation less accurate than when performed by experts. Calculation errors were common with age-based formulas. Weight and habitus were often underestimated by visual inspection. Usage errors with Broselow tape and Mercy tapes were common. Limitations: Not all modern weight estimation systems evaluated.

Findings: Broselow tape significantly overestimated weight in Indian children. An 8% modification of the tape improved its accuracy. Comments: Broselow tape 2007 edition B. The improved performance was not substantially better than the original and was still below acceptable performance. Recalibrating bias alone is not enough when precision is low. Limitations: Incomplete presentation of data.

Findings: The authors reported that healthcare provider guesses and EPLS formula were more accurate than other methods. Comments: The findings are completely unreliable and the conclusions consequently unreasonable. Limitations: Incomplete presentation of data. Analysis of bias confused with accuracy.

Findings: Broselow tape more accurate than age-based formulas in children < 143 cm. Current acceptance of formulas needs to change. Large differences in accuracy of weight estimation in different ethnic groups. Comments: Broselow tape 2011 edition A. Large proportion of Pacific Island children in sample. Broselow tape could not be used in one fifth of children, but best ever performance of the Broselow tape in a study. Habitus was assessed but data not presented. Limitations: Incomplete presentation of data.

Findings: Age-based formulas and the MAC formulas performed badly. The Broselow tape performed better and the PAWPER was most accurate overall, although not as accurate as in previous studies. Comments: Underestimation of obesity (and habitus score) caused underestimation of weight. High level of obesity in study population. Limitations: Broselow tape version not reported. Incomplete presentation of data. PW5 mistaken for PW10 by authors—confusing for readers.

Findings: The PAWPER did not perform as well as the original study. Comments: Most assignment of habitus score by nurses. Very high proportion of obesity. Limitations: Many incorrect assignments of habitus score. Children not fully undressed to assess habitus.

Findings: Broselow tape not accurate in Mexican population. Comments: Reasonable statistics. Both under- and overestimation found to be a problem. Nearly half of children had at least one colour zone error. Limitations: Broselow tape not actually used and Broselow tape version not reported.

Main results: The Luscombe formula underestimated weight less than the EPLS formula. Comments: Increase in children’s weight in “modern” populations related to increase in both lean body weight and adipose. Limitations: Only data on bias presented. Limited and incomplete data presentation.

Main results: Mercy method performed better than Broselow tape and age-based formulas in children with Down syndrome. Comments: No system performed well in this population. Limitations: Limited and incomplete data presentation.

Findings: Broselow tape 2011A performed better than Broselow tape 2007B in this population. The authors suggested that the tapes were accurate. Comments: The method of statistical analysis does not support the conclusions drawn in this paper. Limitations: Unclear if the Broselow tape was used or if derived from length measurements. Limited and incomplete data presentation.

Findings: Broselow tape and APLS formulas performed well in this population. Comments: Contrary to the authors’ conclusions, although the Broselow tape and APLS formula performed similarly, they were both inaccurate. Limitations: Not clear if Broselow tape used or if derived from length measurements. Broselow tape edition not reported. Limited and incomplete data presentation.

Findings: The Broselow tape performed very poorly. Comments: Abstract. Study in South Sudan, the “hungriest place on earth” where 61% of study population was malnourished. There was up to a two colour-zone overestimation in severely malnourished children with only 26% agreement in normally nourished children. Very poor performance of the Broselow tape. Dangerous overestimation of weight in undernourished children. Limitations: Broselow tape not actually used and Broselow tape version not reported.

Main results: EPLS formula may be better suited to identifying ideal body weight rather than total body weight. Comments: Systematic review. Ideal body weight should only be used in obese children. Limitations: No weight estimation data.

Main results: Family members could frequently provide a weight estimate to the 911 operator. Comments: More than 20% of cases had a weight estimation error > 20%. Limitations: Limited and incomplete data presentation.

Findings: PAWPER tape performed best, but good performances from Wozniak and Mercy methods. Broselow tape was worst performer. Comments: Unpublished data. Broselow tape 2011 edition A. Poor population with high proportion of underweight children. First comparative study of these methodologies. Relatively weaker performance by all methods in infants, but Wozniak especially was very weak. Limitations: Assessment of body habitus done by single researcher.

Main results: The novel device performed better than the Broselow tape in all outcome measures and was quicker to use. Comments: Both devices underestimated weights. Overweight and underweight children were often misclassified into wrong habitus category. Broselow tape 2011 edition A. Limitations: Device is not commercially available. Incomplete statistical analysis

Main results: The Broselow tape performed better than the Handtevy tape in all outcome measures. Both tapes underestimated weights. Comments: Although the authors recognised the increased accuracy when obese children were excluded, they still erroneously advocated “recalibration” of the tapes. Broselow tape 2011 edition A. Limitations: Tapes not actually used. Limited and incomplete data presentation.

Main results: Parental estimates were most accurate, then Broselow tape then APLS formula. Mothers’ weights, as well as Broselow tape and EPLS estimations were extremely accurate. Comments: The BMI for this population suggests that very few obese children were included and the sample was skewed towards very young children. Broselow tape 2007 edition B. Limitations: Limited and incomplete data presentation.

Findings: Broselow tape underestimated weight in small children and overestimated weight in older children. It was not accurate. Comments: As with studies elsewhere there was a large variation in accuracy. Limitations: Broselow tape version not reported. Limited and incomplete data presentation.

Main results: A new CAWR formula was more accurate than other formulas in this study. Comments: No formula was very accurate and the differences between formulas were generally small. Limitations: Data very skewed towards younger children. No inclusion of length-based methods.

Main results: The finger-counting formula was more accurate than other formulas. Comments: Same data was used for So 2016, but with a narrower age-restriction and the new finger-counting formula. Limitations: Duplicate data. Only the data on finger counting was added.

Main results: The Broselow tape was very inaccurate in overweight and obese children, who made up more than half the sample. Habitus assessment was poor by parents and nurses, but better by the principal investigator. Comments: Using a corrective formula based on waist circumference to adjust the Broselow tape weight improved accuracy in obese children. Limitations: Broselow tape version not reported. Limited and incomplete statistical analysis. Only included children who fell within the length limitations of the tape.

Systematic review. Main results: Parental estimates were the most accurate technique followed by length- and habitus-based methods. Broselow tape more accurate than age formulas. Limitations: No quantitative data analysis.

Findings: Broselow tape performed better than every formula in this population, BG and Michigan formulas performed worst. Comments: None of the methods were accurate, and all methods overestimated weight. Limitations: Broselow tape not actually used. Broselow tape version not reported.

Findings: Age-based formulas performed badly. The Broselow tape and Mercy method performed significantly better and the PAWPER was most accurate overall. Comments: All systems performed worst in infants. Reasonably good performance of Broselow tape, possibly because of newer edition used. Limitations: Broselow tape version not reported.

Main results: Parental estimates were most accurate, followed by the Mercy method. Comments: Parents could only provide an estimate in 80% of cases—the Mercy method was the only method that could be used in all children. Limitations: It was not clear what training the raters had received.

Main results: Broselow tape underestimated weight in older children. It was not accurate in nearly half the population. Broselow tape 2011 edition A. Comments: Many underweight and obese children included which cause the poor performance. Limitations: Broselow tape not actually used. Limited and incomplete data presentation. Only included children who fell within the length limitations of the tape.

Findings: The Broselow tape performed poorly in this study, the Wozniak method and the Mercy method showed intermediate accuracy and the PAWPER was most accurate overall. Comments: This was a population with many older children and children with deviations from “average” weight-for-length. The PAWPER XL tape worked well in this population. Limitations: The Mercy method was used in a simulated resuscitation setting (supine children), which may have affected its accuracy.

Findings: The PAWPER XL-MAC method was very accurate in data from both the USA and South Africa. Comments: This sytem was completely objective with no visual assessment of habitus. Limitations: Retrospective study

A brief summary of findings as well as a short commentary on significant aspects is included. In the description of target accuracy, some studies used an implied target (indicated by an asterisk (*)) and some expressed a clear, strong opinion (indicated by a dagger (†)). Whether studies were entered into arm 1 (qualitative arm) only, or both arm 1 and arm 2 (meta-analysis) is indicated. The final assessment of the risk of bias is also indicated. Abbreviations:APLS Advanced Paediatric Life Support formula, ARC Australian Resuscitation Council formula, BG Best Guess formula, BT Broselow tape, CAWR Chinese age-weight rule formula, DWEM devised weight-estimating method, EPLS European Paediatric Life Support formula,TJ Traub-Johnson formula, TK Traub-Kichen formula

Benchmark accuracy for a weight estimation system

After studying the 150 identified articles, only three articles were found to propose a statistically meaningful target for a weight estimation system: one article recommended that 95% of weight estimates must fall within 20% of actual weight and two articles suggested that 70% of estimates must be within 10% of actual weight and 95% of weight estimates must fall within 20% of actual weight [11, 30, 31]. There was, however, no evidence found upon which to base any specific measurement analysis metric for a weight estimation system. There was also no credible evidence found of a tolerable weight estimation error, in terms of safety for drug dose calculation, for an individual child.

In 90/150 articles (60.0%), there was no mention at all of an appropriate target for weight estimation accuracy. In 41/150 articles (27.3%) an error of < 10% was suggested as appropriate; in 11/150 articles (7.3%) an error of < 20% was advocated; in 2/150 articles (1.3%) an error of < 30%; and in 6/150 articles (4.0%) another value or a statistically inappropriate measure was proposed. None of the studies included any evidence to support these target figures. The values were selected based on clinical significance, pragmatic limits based on generalised therapeutic ratios, or based on guidelines on determining drug bioequivalence [32, 33].

Table 2 contains a description of each of the weight estimation systems reviewed, as well as any restrictions on their use. The raw data and outcomes for each of the weight-estimation methodologies included in the meta-analysis are shown in Additional file 1: Table S1. From the individual study data, it could be seen that there was very poor within-study precision for most weight estimation systems (shown by the wide limits of agreement), with the exception of the two-dimensional methods, which generally had precision limits of agreement of less than ± 20%.

Table 2

Summary and description of weight estimation methodologies described in the literature

Name

Formula

Restrictions/limitations/acceptable accuracy benefits

Age-based and length-based formulas

Ali formula

Wt = (2.5 × Z) + 8

Derived in a Trinidadian population of children ≤ 5 years of age in 2012. No validation studies to date. Age restriction 1 to 5 years of age.

Argall formula

Wt = (3 × Z) + 6 or [Wt = 3 × (Z + 2)]

Developed from a small UK study in 2003 (300 children). Generally found to underestimate weight, more so in older and heavier children. Age restriction 1 to 10 years of age.

Advanced Paediatric Life Support formula (APLS) (new)

\( Wt=\frac{z}{2}+4 \)

For infants ≤ 12 months of age

Derived in a UK population and adopted in 2011 by the Advanced Life Support Group from a combination of the original APLS and the Luscombe formulas. It was untested and unvalidated at the time of adoption. Generally overestimates weight. Age restriction birth to 12 years of age.

Wt = (2 × Z) + 8 or [Wt = 2 × (Z + 4)]

For children aged 1 to 5 years

Wt = (3 × Z) + 7

For children aged 6 to 12 years

Australian Resuscitation Council formula (ARC)

Wt = 3.5

At birth

Adopted by the ARC in Australia in 1996. Same as New Zealand Resuscitation Council formula. Generally underestimates weight, more so in older and heavier children. Differing accuracy in different ethnic, socio-economic and international populations. No specific age restriction noted.

Wt = (2 × Z) + 8

For children aged 1 to 9 years

Wt = 3.3 × Z

For children 10 years and over

Best Guess formulas (BG)

\( Wt=\frac{z+9}{2} \)

For infants ≤ 12 months of age

Also known as the Tinning formulas. Derived in Australian population in 2007 from a retrospective database study of more than 70,000 children. Generally overestimates weight, especially in poorer populations. Has been evaluated in several validation studies with mixed results.

Wt = (2 × Z) + 10 or [Wt = 2 × (Z + 5)]

For children aged 1 to 5 years

Wt = 4 × Z

For children aged 6 to 14 years

Bicer formula

Wt = (3 × Z) + 6 or [Wt = 3 × (Z + 2)]

For children aged 3 to 6 years

Although these formulas are mentioned and evaluated in the Bicer study, the analysis and reporting is fatally flawed and cannot be evaluated; the origin of the first formula of the set is the same as the Argall formula. Age restriction proposed by Bicer to be 3 to 18 years.

Wt = (4 × Z) − 3

For children aged 7 to 18 years

Chinese age-weight rule 1 (CAWR-1)

Wt = (3 × Z) + 5

For children aged 1 to 10 years

Developed in Hong Kong for ethnic Chinese children in 2011 from a sample of 1248 children. Age restriction 1 to 10 years (although developers advise use with caution over 7 years).

Chinese age-weight rule 2 (CAWR-2)

\( Wt=\frac{\left(Z\times 7\right)+25}{3} \)

For children aged 1 to 6 years

Wt = (4 × Z) − 4

For children aged 7 to 10 years

European Paediatric Life Support formula (old APLS formula) (EPLS)

Wt = 2 × (Z + 4) or [Wt = (2 × Z) + 8]

Original population and date of derivation unclear. Generally underestimates weight, more so in older and heavier children. Differing accuracy in different ethnic, socio-economic and international populations. Age restriction 1 to 10 years of age

Garwood formula

\( Wt=\frac{z}{4}+6 \)

Developed in a UK population from a sample of 1252 children in 2012. The initial validation study was fatally flawed, but this formula has been subjected to a validation study subsequently (showing poor performance). For children aged 1 to 16 years.

Leffler formulas

\( Wt=\frac{z+8}{2} \)

For children <1 year of age

Also known as the Tintinalli formula, the original origin is unclear, but became popular after the Leffler study in 1997. Overestimates weight in younger children (≤ 6 years) and underestimates weight in older children (> 6 years).

Wt = (2 × Z) + 10

For children aged 1 to 10 years

Luscombe formula

Wt = (3 × Z) + 7

Developed in the UK in 2007 from a large database of nearly 14,000 children. Underestimates weight in most populations studied, but significantly overestimates weight in populations from developing countries. Age restriction 1 to 10 years.

Michigan formula

Wt = (3 × Z) + 10

Derived by Ackwerh in 2010, but has not been evaluated fully.

Nelson formulas (originally Weech’s formulas)

\( Wt=\frac{z+9}{2} \)

For infants 3 to 12 months

As described in Nelson’s Textbook of Paediatrics. The origin is probably from Weech’s formulas, first reported in 1954 in the USA. The Weech formula is still in use today as one of the standard measurement denominators for determining underweight status. Weight most often overestimated in infants and older children (> 6 years) and underestimated in younger children (≤ 6 years).

Wt = 2 × (Z + 4)

For children aged 1 to 6 years

\( Wt=\frac{\left(Z\times 7\right)-5}{2} \)

For children aged 7 to 12 years

Park formulas

\( Wt=\frac{z+9}{2} \)

For infants ≤12 months of age

Derived in Korean population from a large database study (nearly 125,000 children). Poor accuracy in older children (> 6 years).

Wt = (2 × Z) + 9

For children aged 1 to 4 years

Wt = (4 × Z) − 1

For children aged 5 to 14 years

Shann formulas

Wt = (2 × Z) + 9

For children aged 1 to 9 years

Used in Australasia primarily. Origin is unclear. Underestimates weight increasingly with increasing age.

Wt = (3 × Z)

For children aged >9 years

Theron formula

Wt = e(0.175571 × Z) + 2.197099

Derived in 2005 in New Zealand from a small study of 900 children that included a large number of Pacific Island children (high weight-for-age). The developers intended it for use in children high in the weight-for-age centiles. Age restriction 1 to 10 years. Overestimates weight in most populations.

Unknown

Wt = (3 × Z) + 8

This formula is mentioned in some weight-estimation studies with no reference to its origin. It is not known what restrictions apply. Mentioned in Dearlove, Bicer.

Traub-Johnson formula (TJ)

Wt = 2.05 × e0.02X

Derived in 1980 from USA national growth data from 1959. This formula was used to estimate ideal body weight and adjusted body weight, which were used interchangeably. The formula was intended to estimate the 50th centile of weight-for-height. Underestimates total body weight. For children aged 1 to 18 years.

Traub-Kichen formula (TK)

Wt = 2.396 × 1.0188X

Derived in 1983 in the USA from data from more than 20,000 children in the National Centre for Health Statistics database. The formula was intended to estimate the 50th centile of weight-for-height which the developers regarded as an approximation of ideal body weight. Underestimates total body weight. For children over 74 cm and aged 1 to 17 years.

Other length-based systems

Broselow tape (BT)

Weight estimated directly by placing tape next to child and measuring from head to heel. The estimated weight and colour zone is read off the tape.

Developed in 1985 in the USA from US growth data and first validated in a sample of just over 900 children in 1988. Several changes have been made over the years: the latest version is the 2011A edition. Underestimates weight except in populations with a high prevalence of poor nutrition. Inaccuracy increases with increasing length/weight. Increased underestimation of weight in obese and overweight individuals. Substantial number of children “too tall for the tape” but who are not at adult weight. Length restriction 46 to 143 cm. Maximum weight estimation 36 kg.

Blantyre tape

Weight estimated directly by placing tape next to child and measuring from head to heel. The estimated weight is read off the tape.

Developed in Malawi using values 85% of the 50th centile of the American National Centre for Health Statistics weight-for-length growth charts. Validated on a sample of 729 children. The developers reported a reasonable accuracy between 4 and 16 kg, but the reporting of data was fatally flawed and is unverifiable. Length restriction of 45 to 130 cm.

Oakley table

Age or length is used to estimate weight from a graph.

Developed in the USA in 1988 from averaged boy-girl medians of unspecified growth charts. Overestimates weight in infants and older children (> 6 years). Age restriction 0 to 14 years and length restriction 50 to 160 cm.

Habitus-modified systems

Erker formulas

Wt = (2 × Z) + 6

For “thin” children

Developed in 2014 in the USA using regression formulas to estimate the 5th, 50th and 95th centiles of the Centre for Disease Control weight-for-age growth charts. Has not yet been shown to be accurate.

Wt = (3 × Z) + 6

For “normal” (average) children

Wt = (4 × Z) + 6

For “thick” (fat) children

Yamamoto formulas

See reference 11 different logarithmic formulas.

Developed in 2009 in the USA from a sample of 542 children. A different length-based formula is selected for one of five (under 3 years of age) or six (over 3 years of age) icons, which represent an assessment of the body habitus. The reporting of the validation against the Broselow tape is flawed and does not permit verification. This technique has not been subsequently validated.

Wozniak formulas

Wt = (1.443 × U) + (1.596 × M) − 32.963

Developed in Botswana in 2012 from a sample of 777 children with a high prevalence of HIV infection and growth retardation. Measurements of mid-arm circumference and ulna length or tibia length are used to estimate weight using the formula. The accuracy of the method decreases in children <10 kg and children > 40 kg.

Wt = (0.86 × T) + (1.715 × M) − 30.426

Devised weight estimating method (DWEM)

Length measured and then habitus assessed as “Slim”, “Average” or “Heavy” and weight read off a chart. A pre-marked tape was developed but is not widely available.

Developed in 1986 in the USA based on standard growth data available in 1983 and validated in a small sample of 258 children. Underestimates weight, especially in taller children. Length restriction 50 to 175 cm. Maximum weight estimation 70 kg.

PAWPER tape

Weight estimated directly by placing tape next to child and measuring from head to heel. A habitus score (1 to 5) is assigned to the child based on body habitus (1 = very thin, 3 = average, 5 = very fat). The estimated weight for that length and habitus score is read off the tape.

Developed in 2004 in South Africa based on WHO weight-for-length growth charts and validated on a sample of 453 children in 2013. Estimates weight uniformly across length range of tape. Performs well in children who are under- or overweight. Length restriction 43 to 153 cm. Maximum weight estimation 47 kg. The extended PAWPER tape accommodates children up to 180 cm in length, a maximum weight estimation of 116 kg and with a 7-point habitus score assessment (habitus scores 6 and 7 were added to accommodate children above the 95% centile of weight-for-length, i.e. for obese and severely obese children).

Mercy method (MM)

Humerus length and mid-arm circumference are measured and then used to determine “segmental weights” from a table. Specifically desgined tapes “2D” and “3D” tapes may be used which eliminates the need for a data table.

Developed in the USA from a database of 19,625 children and validated across several centres in 2012, 2013 and 2014, including in developing countries. Consistently good weight estimation across age and habitus ranges. Decreased accuracy in younger children (< 2 years).

Other

Cattermole MAC formula

Wt = (M − 10) × 3

A mid-arm circumference-based formula developed in Hong Kong ethnic Chinese children in 2010 from a sample of 1370 children. Decreased accuracy in children < 6 years and underestimation of weight in older children. Recommended by developers to restrict use to children aged 6 to 11 years.

Haftel formula

Wt = (5.176 × LW) + 3.487

The “hanging-leg weight” formula developed in the USA in 1990 from a small sample of 100 anaesthetised children aged 2 months to 15 years. The accuracy of weight estimation was worst in infants and increased with age. No subsequent studies have been reported.

Bavdekar formula

Wt = − 5.15 + (FL × 1.35)

Developed in India in 2006 from a sample of 500 infants < 2 years of age. Fatal flaws in the methodology do not permit the interpretation of the accuracy of this formula. No subsequent studies have been reported. For infants ≤ 2 years.

Methods not shown include the Carroll method, the Sandell and Handtevy tapes (insufficient data) and neonatal weight estimation applications (out of scope). Only weight estimation systems that had more than one article assessing their functioning were considered for inclusion into the meta-analysis. Abbreviations: Z age in years, z age in months, X height or length in centimetre, M mid-arm circumference in centimetre, LW hanging leg-weight in kilogramme, FL foot length in centimetre, U ulna length in centimetre, T tibial length in centimetre

Figure 2 shows the pooled data of the bias and precision for the weight-estimation systems evaluated. The fixed effects outcomes and data for the weight estimation methods not presented in Fig. 2 can be found in Table 3. The important findings can be summarised as follows:

There was a wide variation in the weight estimation bias between low- and middle-income countries (overestimation) and high-income countries (underestimation). This was most noticeable with the age-based systems, less so with the length-based systems and least with the two-dimensional systems, which had virtually zero bias.

There were very wide limits of agreement for all methods other than the PAWPER tape and the Mercy method.

Fig. 2

Forest plot showing the bias and precision data of the major weight estimation systems evaluated

Data for the whole pooled sample as well as pooled data for high-income country (HIC) and low- and middle-income country (LMIC) populations are shown separately. There were very few substantial differences between the fixed effects and random effects analyses, which were not substantial enough to affect the overall outcomes. The number of studies in each pooled sample, as well as the number of data points is shown. A positive mean percentage error indicates an overestimation of weight, while a negative value indicates an underestimation of weight. Abbreviations: MPE mean percentage error, LLOA lower limit of agreement, ULOA upper limit of agreement, PW10 percentage of weight estimates within 10% of actual weight, PW20 percentage of weight estimates within 20% of actual weight

Figure 3 show the overall accuracy data for each weight estimation system (PW10 data). Age-based systems were least accurate, length-based systems were slightly more accurate and parental estimates and the two-dimensional systems were the most accurate. Despite the difference in bias between high-income countries and low- and middle-income countries for the one-dimensional systems, the overall accuracy was similarly poor. If a PW10 of 70% were used as a benchmark of acceptable accuracy, only the PAWPER tape and the Mercy method would have achieved acceptable accuracy, with parental estimates close behind. When examining the PW20 data in Table 3, only the PAWPER tape (96.6%) and the Mercy method (95.3%) met the acceptability criteria suggested by Stewart of a PW20 > 95% [30]. The PW20s for the Broselow tape, parental estimates and a value calculated for pooled age-based formulas were 81.2, 87.1 and 65.0%, respectively.

Fig. 3

A bar chart showing the accuracy data of the major weight estimation systems evaluated

The outcome data for the pooled data (not separated into high-income and low- and middle-income populations) is shown with both random effects (RE) and fixed effects (FE) results

Figure 4 shows the results of direct statistical comparisons between weight estimation systems from studies where paired data could be pooled, using non-parametric measures of accuracy (PW10 data). The full analyses are available in Additional file 2: Figure S1. There was little difference between the accuracy of the different age formulas. Length-based methods were always more accurate than age-based methods, and two-dimensional methods were more accurate than one-dimensional methods. On direct comparison, but with data from only two studies, the PAWPER tape was significantly more accurate than the Mercy method. Parental estimates were significantly more accurate than the Broselow tape, but there was no data for direct comparison with any two-dimensional system.

Fig. 4

Direct meta-analysis comparisons between weight estimation systems

Discussion

Summary

The quality of the evidence from the contributing studies was generally good, and the number of studies that could be included allowed for a comprehensive analysis of the data. The underlying risks of bias, while present, were considered not sufficient to alter the overall findings. Additional information on parental estimations of weight in different populations and circumstances is also required, as well as a comparison with the two-dimensional weight estimation systems.

The implications of the results for clinical practice and future research are profound: age-based formulas, along with healthcare provider guesses, were the least accurate of all weight estimation systems. They should not be used or taught. Similarly, one-dimensional length-based systems, while widely used and advocated by advanced life support organisations, were simply not accurate enough. The future challenges will be to develop two-dimensional systems, which produced the most accurate weight estimations, to be safe, quick and easy-to-use during emergency care.

Many articles on weight estimation have been—and continue to be—published without any clear indication if the results achieved, and the weight estimation systems tested, were actually good or bad. This meta-analysis has provided some useful findings which could guide researchers and decision-makers on which systems to use in clinical practice and which to explore in further research. It has also provided some perspective on the performance of weight estimation systems in high-income and low-and middle-income populations, which is important as most weight estimation systems have been developed in high-income countries and have the potential to be dangerous if used inappropriately.

A benchmark for weight estimation systems

What degree of under- or overestimation of weight is dangerous to a child when calculating drug doses is not known [34, 35]. Many of the drugs used in paediatric emergencies have not been adequately studied to determine optimal dosing ranges. Moreover, the consequences of overestimating or underestimating weight (and therefore dose) will differ between different drugs, different patients and different clinical scenarios [36–39]. The final dose will be strongly influenced by the clinical situation and the discretion of the treating doctor, but an accurate and reliable weight estimation would still be required to provide the starting point to allow for dose modifications. Some authors regard the need for a highly accurate weight estimation as debatable. Other argue that any factors potentially impacting on patient safety must be addressed and minimised, especially in the light of compounded errors in drug dose calculations [40].

In the qualitative arm of the systematic review, we found no objective evidence to support any particular target or system by which to assess the adequacy of weight estimation methodologies. The failure to define outcome measures on how accurately a weight estimation method must perform is methodologically unsound, however. This is important as the use of a system known to be inaccurate, or inferior to another system is not good medical practice [1]. There are clearly factors other than accuracy to consider when selecting the most appropriate weight system to adopt including the complexity and cognitive load generated by the system, the vulnerability to human factor errors and its ability to interface with a drug dosing guide [41]. This needs further research.

Despite the lack of objective evidence, some reference standard is still required. A large number of articles implied or stated explicitly that an individual estimation of weight within 10% of actual weight is desirable, but only three articles provided a benchmark by which to judge a weight estimation system. The suggested criteria were that, to be considered accurate, 70% of weight estimates must be within 10% of actual weight and 95% of weight estimates must fall within 20% of actual weight [11, 30, 31]. Since the newest two-dimensional systems have shown the capability to repeatedly achieve this standard, it could, therefore, be considered a reasonable benchmark to propose to assess the adequacy of weight estimation systems in the future.

Meta-analysis data: the accuracy of weight estimation systems

Age-based weight estimation

The age-based formulas were the least accurate and worst-performers of all the weight estimation methods. There are multiple reasons for the inaccuracy of age-based formulas: a large biological variability in weight-for-age; a non-linear relationship between weight and age; and differences between populations with different ethnic groups and different levels of nutrition [10]. We found that age-based formulas have never been shown to perform better than length-based systems. Despite this, many authors still regard the EPLS formula as the “gold standard” for weight estimation and age-formulas are still taught on advanced life support courses [42, 43]. Some authors also still support the use of age-based formulas because of their ostensible simplicity, because they require no equipment to function and they allow advanced preparation if emergency services personnel communicate a child’s age during transport to hospital [35]. However, their use presupposes that a child’s correct age is known, that the formula is remembered correctly and that the arithmetic is performed accurately. Memory is capricious in emergencies, however, and increased stress causes errors even in calculating simple formulas [44]. The benefits of the formulas are unlikely to mitigate for their very poor accuracy [11].

Many studies have shown age-based formulas to underestimate weight in first-world populations [45–47], but studies in low- and middle-income countries have shown a significant, potentially dangerous overestimation of weight by the same formulas [48–50]. In this meta-analysis, this was confirmed, with no age-based formula performing well in any population, but the overestimation of weight in low- and middle-income populations was significant and potentially unsafe. Even the use of habitus-modified age-formulas has failed to produce an improvement in accuracy to the degree of accuracy seen with length-based habitus-modified systems, as this modification still does not account for variations in length-for-age [11, 51].

This futility of age-based weight estimation can be perfectly summed up: “Accurate paediatric weight estimation by age: mission impossible” [27]. The unavoidable conclusion is that age-based formulas should no longer be used and clinicians that manage children should ensure that a better weight-estimation system is available for use during emergency care [11, 47, 52].

Length-based weight estimation

Every length-based system performed better than every age-based system in this study. This supports the argument that length-based weight estimation is more biologically valid than age-based estimation [10]. No length-based system achieved the acceptable outcome benchmark, however.

The two length-based formulas were originally designed to predict ideal body weight in children, but they have been used, albeit incorrectly, to estimate total body weight. The addition of a habitus-modification to these formulas has been shown to increase their performance significantly, to the same level of accuracy as the other two-dimensional systems [11]. The use of these formulas in this way shows potential, especially if used with a mobile phone app, and requires further investigation.

Although there are at least seven length-only weight-estimation tapes, only the Broselow tape has been extensively studied, while the Blantyre tape, the Sandell tape and the Handtevy tape have been evaluated only in single, small studies [53–55]. The Broselow tape, like other one-dimensional length-based systems, is vulnerable to error based on individual variations of weight-for-length (differences in body habitus) [56–59]. Some authors have questioned whether the tape is still valid given the increase in prevalence of overweight and obese children and may result in the “under-resuscitation of children” [33]. Although the manufacturer recommends modifying weight estimation up a colour zone in overweight children, to reduce this underestimation of weight, this has never been formally studied and still needs to be verified [60, 61]. However, while studies in high-income countries have demonstrated an overall underestimation of weight, studies in low- and middle-income countries have mostly shown an overestimation of weight, potentially to a dangerous degree in some populations (if drug doses were to be computed from those weights) [56, 57, 62]. Since length-based weight estimation is advocated by major, international advanced life support organisations and, since these systems are insufficiently accurate, this recommendation needs to be reconsidered and researched further [43, 63].

Two-dimensional (dual length- and habitus-based) weight estimation

The two-dimensional systems were far superior in accuracy to the one-dimensional age- and length-based systems. The accuracies of the Mercy method and the PAWPER tape in the meta-analysis were excellent, each with a PW10 of above 70% in both over- and undernourished populations. This finding was confirmed in individual studies, with no study reporting a one-dimensional system to be more accurate than a two-dimensional method. The direct meta-analysis comparisons showed that the PAWPER and Mercy methods were significantly more accurate than the other systems, with the PAWPER tape outperforming the Mercy method in the two studies in which they were both evaluated.

All weight estimation systems have limitations, however. The Mercy method, like all other weight estimation systems was vulnerable to human factor errors in undertrained users [64]. It also has shown considerable variation in accuracy between individual assessors [19]. The functioning of the Mercy system in emergencies still needs to be evaluated– this is of concern as one of the poorest performances of the Mercy method was in a study which measured children in the supine position, as it might be used in an emergency [4, 20]. The PAWPER system was shown to be very accurate in two South African studies, one Australian study and one study based on NHANES data from the USA [5, 20–22, 31]. It was somewhat less accurate in two American studies with very obese populations, mostly because of difficulties in assessing body habitus, however [13, 65]. Although the tape’s length-based measurements are objective and simple to perform, assessment of body habitus is more subjective and dependent on training and experience [66]. This will need to be researched further to explore more standardised and objective ways of assessing habitus.

The Devised Weight Estimating Method (DWEM), the Yamamoto obesity icon system, the Wozniak system and habitus-modified Traub-Johnson and Traub-Kichen formulas have all been shown to be significantly more accurate than length-based methods, but have not yet been sufficiently studied [11, 12, 14, 67].

Estimates of weight by parents

The utility of parental estimates of their child’s weight is dependent on the parent being willing to offer a weight estimate and being accessible to healthcare personnel at the time of the child’s need for emergency care [26]. The accuracy of prediction is determined by whether the accompanying parent is the regular caregiver of the child and whether or not the child has had a recent measurement of weight by the parent or in the parent’s presence [9]. A previous systematic review has suggested that parental estimates are the most accurate method for obtaining a weight, when it cannot be measured [9]. In this meta-analysis, parental estimates were statistically superior to the Broselow tape on direct comparison, but there were no paired data from which direct comparisons could be made with the two-dimensional systems. Only one previous study has compared the Mercy method with parental estimates, in which parental estimates were found to be more accurate [68]. This will require further research to clarify, especially the accuracy of parental estimates in populations of different socio-economic status and the frequency of availability of parental estimates. Since parents might not always be available, especially in the prehospital environment, it would be prudent to always have an alternative method of estimation available.

Differences in weight estimation accuracy between different populations

This study showed a clear disparity in how the one-dimensional weight estimation systems performed in different populations. These differences were primarily as a result of differences in bias, however, while the underlying lack of precision within each population was similar. Thus, the variability between populations was similar to the within-population variability shown in even the most homogeneous populations. The significance of this is that, although recalibration of a system for a specific population might reduce the bias, the underlying variability and imprecision would not allow an acceptable degree of overall accuracy to be achieved. This was well shown in the study by Asskaryar et al. which failed to recalibrate the Broselow tape in an Indian population by manipulating the bias only [57]. The two-dimensional systems, with their enhanced methodology which accounts for habitus, have proven to be the closer to a universally applicable system by achieving a more uniform accuracy, both within and between populations.

Limitations

The limitations of this study are similar to what is expected from any meta-analysis of this nature [24]. The lack of data comparing parental estimates and the newer two-dimensional systems limited the comparisons between these systems. The under-reporting on subgroups of weight status also limited the ability to analyse the performance of weight estimation systems in children with habitus that deviated from the average—this would provide insight into how the systems might function in populations with a high prevalence of underweight or obese children (or both).

Conclusions

No evidence exists of an acceptable benchmark for weight estimation systems. An accuracy of at least PW10 > 70% and PW20 > 95% could be considered as a reference standard, since the length-based, habitus-modified systems have proven that this target is achievable across a wide range of populations.

The only weight-estimation systems that were found to be of acceptable accuracy were the two-dimensional length- and habitus-based systems. The PAWPER tape and the Mercy Method achieved an accuracy that surpassed all other methods. Wide discrepancies in the accuracy of the Broselow tape in different age groups and different populations raise questions about its use. It may dangerously overestimate weight in children from low- and middle-income countries or poor communities. Without exception, the age-based formulas evaluated proved to be highly inaccurate, with a possibility for patient harm, especially in low- and middle-income countries. There is sufficient evidence to conclude that the use of age-based formulas should be discouraged.

Recommendations

Dual length- and habitus-based (two-dimensional) systems should be used for weight estimation in children because of superior accuracy to other systems (high quality evidence).

The Broselow tape or parental estimates of weight should be used for weight estimation in preference to age-based formulas and healthcare provider guesses (medium quality evidence).

Age-based formulas and healthcare provider guesses should not be used for weight estimation in children because of potential patient harm (high quality evidence).

Parental estimates should be used to estimate weight in preference to length-based and age-based systems (high quality evidence). There was insufficient evidence to provide a recommendation between the two-dimensional systems and parental estimates of weight.

Abbreviations

APLS:

Advanced Paediatric Life Support formula

ARC:

Australian Resuscitation Council formula

BG:

Best Guess formula

BMI:

Body mass index (kg/m2)

BT:

Broselow tape

CAWR:

Chinese age-weight rule formula

DWEM:

Devised weight-estimating method

EPLS:

European Paediatric Life support formula

FE:

Fixed effects

LOA:

95% limits of agreement

MAC:

Mid-arm circumference

MPE:

Mean percentage error

NCHS:

National Centre for Health Statistics

OR:

Odds ratio

PW10:

Percentage of estimates within 10% of actual weight

PW20:

Percentage of estimates within 20% of actual weight

RE:

Random effects

RMSPE:

Root mean squared percentage error (%)

TJ:

Traub-Johnson formula

TK:

Traub-Kichen formula

Declarations

Funding

No external funding for this manuscript.

Financial disclosure

MW is the developer of the PAWPER tape, but derives no financial or commercial benefit from it. LG and AB have no financial relationships relevant to this article to disclose.

Authors’ contributions

MW conceptualised and designed the study, carried out the initial data collection, analysed the data, drafted the initial manuscript and approved the final manuscript as submitted. LG carried out the initial data collection, drafted revisions of the manuscript and approved the final manuscript as submitted. AB assisted with preparation of the manuscript and approved the final manuscript as submitted. All authors agree to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

Competing interests

MW is the developer of the PAWPER tape. LG and AB have no potential conflicts of interest to disclose.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Wells M, Coovadia A, Kramer E, Goldstein L. The PAWPER tape: a new concept tape-based device that increases the accuracy of weight estimation in children through the inclusion of a modifier based on body habitus. Resuscitation. 2013;84(2):227–32.PubMedView ArticleGoogle Scholar

Wells M, Goldstein L, Bentley A. A validation study of the PAWPER XL tape: accurate estimation of both total and ideal body weight in children up to 16 years of age. Trauma and Emergency Care. 2017;2(4):1–8.Google Scholar

Georgoulas V, Wells M. The PAWPER tape and the Mercy Method outperform other methods of weight estimation in children in South Africa. S Afr Med J. 2016;106(9):933–9.PubMedView ArticleGoogle Scholar

Bavdekar SB, Sathe S, Jani P. Prediction of weight of Indian children aged up to two years based on foot-length: implications for emergency areas. Indian Pediatr. 2006;43(2):125–30.PubMedGoogle Scholar

Geduld H, Hodkinson PW, Wallis LA. Validation of weight estimation by age and length based methods in the Western Cape. South Africa population Emergency Medicine Journal. 2011;28(10):856–60.PubMedGoogle Scholar

Wozniak R. The evaluation of potential weight-estimation methods in a primarily HIV positive cohort in Botswana for use in resource limited settings: The University of British Columbia; 2012.Google Scholar

Graves L, Chayen G, Peat J, O'Leary F. A comparison of actual to estimated weights in Australian children attending a tertiary children's hospital, using the original and updated APLS, Luscombe and Owens, Best Guess formulae and the Broselow tape. Resuscitation. 2014;85(3):392–6.PubMedView ArticleGoogle Scholar

Hegazy M, Taher E. Validating a new formula for weight estimation in pediatric cancer patients. International Research Journal of Medicine and Medical Sciences. 2013;1(1):34–9.Google Scholar