Accuracy of clinical pallor in the diagnosis of anemia in children: a meta-analysis

Chalco J P, Huicho L, Alamo C, Carreazo N Y, Bada C A

CRD summary

The authors concluded that none of the clinical signs were highly accurate in diagnosing anaemia. These conclusions appear to be supported by the results, but incomplete reporting of review methods and inadequate reporting of study validity make it difficult to confirm the reliability of the authors' conclusion.

Authors' objectives

To review the accuracy of clinical signs for the diagnosis of anaemia in children.

Searching

MEDLINE (1966 to January 2002), EMBASE (1986 to January 2002), LILACS (1986 to February 2002) and the African Health Anthology database (1924 to July 2002) were searched via the Internet. The reference lists of primary and qualitative review articles were screened for additional studies. The search terms were reported in an appendix.

Study selection

Study designs of evaluations included in the review

Both prospective and retrospective studies were eligible for inclusion.

Specific interventions included in the review

Studies of conjunctival or palmar pallor, alone or in combination, for the clinical diagnosis of anaemia were eligible for inclusion. The areas of pallor investigated were the conjunctiva, nailbed, palm, tongue and general. In the included studies, assessors of pallor included physicians, health workers, paediatricians or residents, nurses and parents.

Reference standard test against which the new test was compared

Studies that used haemoglobin as the 'gold' standard were eligible for inclusion. Haemoglobin was measured by Coulter counter, HemoCue or spectrophotometer. The following thresholds were used to define anaemia in the included studies: <11 g/dL, <8 g/dL, <7 g/dL and <5 g/dL. Some studies reported results for more than one threshold.

Participants included in the review

Studies of in- or out-patient children aged 0 to 18 years were eligible for inclusion. All of the studies were performed in developing countries; most were set in Africa and others were set in Pakistan, Bangladesh and Brazil. All of the included children were aged less than 6 years. The studies were conducted in both urban and rural locations. The majority of the studies were conducted in out-patient settings, although one was conducted in the emergency department and another in an in-patient setting.

Outcomes assessed in the review

The studies had to report sufficient data to allow calculation of the sensitivity, specificity, likelihood ratios (LRs) and predictive values.

How were decisions on the relevance of primary studies made?

Two reviewers independently selected the studies. Any discrepancies were resolved by consensus.

Assessment of study quality

The studies were assessed for methodological quality using published criteria (see Other Publications of Related Interest). Details of the criteria used are available on the BioMed Central website (accessed 17/06/2007). See Web Address at end of abstract. Each study was assigned a quality score of 0 to 16 by ascribing a score of 2 points to major criteria related to the systematic and blind application of clinical signs and reference standard to all patients and 1 point to each of the remaining criteria. The final quality score was reached by consensus.

Data extraction

The authors did not state how the data were extracted for the review, or how many reviewers performed the data extraction.

The data were extracted to form 2x2 tables. The sensitivity, specificity, positive and negative predictive values, positive and negative LRs, and diagnostic odds ratios (DORs), together with their 95% confidence intervals (CIs), were calculated for each study. Calculations were performed separately for each clinical sign and for each different haemoglobin threshold. Where 2x2 cells contained 0 events, 0.5 was added to each cell to enable calculation of the LRs.
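The calculations described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `diagnostic_indices`, the cell labels and the example counts are hypothetical, and the sketch assumes that the 0.5 correction is applied to every index whenever any cell is zero (the review states only that it was used to enable calculation of the LRs). The confidence interval shown is the usual large-sample interval for the DOR on the log scale.

```python
import math


def diagnostic_indices(tp, fp, fn, tn):
    """Accuracy measures from a 2x2 table (pallor result versus haemoglobin).

    tp: pallor present, anaemic      fp: pallor present, not anaemic
    fn: pallor absent, anaemic       tn: pallor absent, not anaemic
    """
    # Assumption: add 0.5 to every cell when any cell is zero, so that
    # the likelihood ratios and the DOR remain defined.
    if 0 in (tp, fp, fn, tn):
        tp, fp, fn, tn = (x + 0.5 for x in (tp, fp, fn, tn))

    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    dor = (tp * tn) / (fp * fn)

    # Large-sample standard error of ln(DOR) and its 95% CI.
    se_ln_dor = math.sqrt(1 / tp + 1 / fp + 1 / fn + 1 / tn)
    ln_dor = math.log(dor)

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "lr_pos": sensitivity / (1 - specificity),
        "lr_neg": (1 - sensitivity) / specificity,
        "dor": dor,
        "dor_ci95": (math.exp(ln_dor - 1.96 * se_ln_dor),
                     math.exp(ln_dor + 1.96 * se_ln_dor)),
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    for name, value in diagnostic_indices(40, 20, 60, 180).items():
        print(name, value)
```

Note that the predictive values depend on the prevalence of anaemia in each study, whereas the LRs and DOR do not, which is one reason LRs are preferred for pooling.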

Methods of synthesis

How were the studies combined?

The studies were pooled using the DerSimonian and Laird random-effects model since heterogeneity was detected. The pooled sensitivity, specificity (both weighted by sample size), DORs and LRs were calculated with their respective 95% CIs. Analyses were carried out separately for each test and for each haemoglobin threshold. Different pre-test probabilities were used to estimate post-test probabilities using the pooled positive and negative LRs.
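For readers unfamiliar with the approach, the sketch below shows the standard DerSimonian and Laird calculation and the conversion of a pre-test probability into a post-test probability using an LR. It is an illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, and the sketch assumes that ratio measures (LRs, DORs) are pooled on the natural log scale with within-study variances supplied by the user, which the review does not state explicitly.

```python
import math


def dersimonian_laird(effects, variances):
    """Random-effects pooling (DerSimonian and Laird).

    effects:   per-study estimates, e.g. ln(LR+) or ln(DOR)
    variances: per-study within-study variances of those estimates
    Returns (pooled estimate, standard error, tau^2, Cochran Q).
    """
    w = [1.0 / v for v in variances]            # fixed-effect weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1

    # Method-of-moments estimate of between-study variance, floored at 0.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    w_star = [1.0 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2, q


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


if __name__ == "__main__":
    # Hypothetical ln(LR+) values and variances, for illustration only.
    pooled, se, tau2, q = dersimonian_laird([0.9, 1.4, 0.6], [0.04, 0.09, 0.05])
    pooled_lr = math.exp(pooled)
    print("pooled LR+:", pooled_lr, "tau^2:", tau2, "Q:", q)
    print("post-test probability at 30% pre-test:",
          post_test_probability(0.30, pooled_lr))
```

When the estimated between-study variance is zero, the random-effects weights reduce to the fixed-effect weights and the two models give the same pooled estimate.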

How were differences between studies investigated?

The studies were assessed for heterogeneity in clinical signs and haemoglobin threshold using the chi-squared test for proportions (sensitivity and specificity) and Cochran Q for LRs and DORs. Meta-regression, weighted on sample size and using the natural log of the DOR (ln DOR) as the dependent variable, was used to investigate potential sources of heterogeneity. The following pre-specified variables were investigated: clinical setting (out-patient versus in-patient), continent of study (Africa, Asia, Latin America), age (up to 5 years versus older than 5 years), technique of haemoglobin measurement, whether or not study setting was endemic for malaria or intestinal worms, type of observer and methodological quality score. Analyses were carried out separately for each test and for each haemoglobin threshold.
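A meta-regression of this kind can be illustrated with a weighted least-squares fit of ln DOR on a single study-level covariate, with each study weighted by its sample size. The sketch below is a simplified, hypothetical example: the function name, the single-covariate form and the data are assumptions made for illustration, and the review does not report whether covariates were entered singly or together.

```python
def weighted_regression(covariate, ln_dor, sample_sizes):
    """Weighted least-squares fit of ln(DOR) on one study-level covariate.

    Each study is weighted by its sample size.
    Returns (intercept, slope); exp(slope) is the relative change in DOR
    per one-unit increase in the covariate.
    """
    total_w = sum(sample_sizes)
    x_bar = sum(w * x for w, x in zip(sample_sizes, covariate)) / total_w
    y_bar = sum(w * y for w, y in zip(sample_sizes, ln_dor)) / total_w

    s_xx = sum(w * (x - x_bar) ** 2 for w, x in zip(sample_sizes, covariate))
    if s_xx == 0:
        raise ValueError("covariate does not vary between studies")
    s_xy = sum(w * (x - x_bar) * (y - y_bar)
               for w, x, y in zip(sample_sizes, covariate, ln_dor))

    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


if __name__ == "__main__":
    # Hypothetical data: quality score, ln(DOR) and sample size per study.
    quality = [11, 12, 13, 14]
    ln_dor = [1.2, 1.5, 1.4, 1.9]
    n = [500, 1200, 800, 2000]
    print(weighted_regression(quality, ln_dor, n))
```

A binary covariate such as clinical setting can be coded 0 for out-patient and 1 for in-patient, in which case the slope is the difference in ln DOR between the two settings.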

Results of the review

Eleven studies (n=17,324) were included in the review.

The summary quality scores ranged from 11 to 14.

There was evidence of heterogeneity in LRs and DORs for all tests investigated at all haemoglobin thresholds. Pooled estimates of all diagnostic indices (sensitivity, specificity, LRs) varied according to the haemoglobin threshold investigated, but did not show a consistent pattern of increasing or decreasing as the threshold increased or decreased. There was also no consistency in the clinical sign found to be most accurate: this also varied according to haemoglobin threshold. In general, estimates of pooled sensitivity were fairly poor while estimates of specificity were higher. Pooled estimates of sensitivity ranged from 29.2 to 80.9% and estimates of pooled specificity varied from 67.7 to 90.8%.

Authors' conclusions

None of the clinical signs were highly accurate for the diagnosis of anaemia.

CRD commentary

This review addressed a clearly defined question that was supported by explicit inclusion criteria. The literature search appeared adequate although no attempts were made to identify unpublished studies, thus the results might be subject to publication bias. It was unclear whether any language restrictions had been applied. Very few details on the review methods were reported, thus it was not possible to determine whether appropriate steps were taken to minimise bias. A detailed quality assessment was carried out but only the composite score was presented; this makes it difficult to independently comment on the reliability of the evidence presented.

The methods used to pool the studies were appropriate, statistical heterogeneity was tested, and various potential sources of heterogeneity between studies were explored; the influence of study quality on the results was not examined. Several subgroup analyses were conducted, but these were based on a small number of studies and this limited the evidence. The authors' conclusions appear to be supported by the results presented, but incomplete reporting of review methods and inadequate reporting of study validity make it difficult to confirm the reliability of these conclusions.

Implications of the review for practice and research

Practice: The authors stated that given the poor performance of clinical signs, universal iron supplementation of children might be an adequate control strategy at public health level, particularly in high prevalence areas.

Research: The authors stated that further well-designed studies are needed for settings other than Africa. Such studies should assess inter-observer variation, the performance of combined clinical signs, phenotypic difference and different degrees of anaemia.

This is a critical abstract of a systematic review that meets the criteria for inclusion on DARE. Each critical abstract contains a brief summary of the review methods, results and conclusions followed by a detailed critical assessment on the reliability of the review and the conclusions drawn.