Much of the following material is for the benefit of those
planning anthropometric surveys. Emphasis is on large-scale,
cross-sectional surveys, and issues regarding the collection of
serial data on individual children are not discussed.

Personnel Selection and Training

The personnel selected to be trained in anthropometry should
read and write with ease and have a good grasp of arithmetic. If
at all possible, one should aim for persons with some secondary
school education. While some less-educated individuals may
perform adequately, measurement errors will be less of a problem
with, for example, high school graduates who can also be involved
with quality control procedures. On the other hand, more highly
educated personnel (e.g., physicians) may find the task of
maintaining constant measurement procedures boring and may for
this reason not be as reliable.

Women are preferred, for it may not be proper in some cultures
for men to measure women, and children may be more relaxed if
examined by women.

To train anthropometrists, one obviously needs a person with
experience in measuring to demonstrate the proper techniques. In
addition, a manual or handbook describing all procedures, ideally
including simple illustrations, should be available during
training.

The period of training should last as long as it takes to
achieve acceptable levels of reliability and accuracy.
"Reliability" can be understood as the degree to which
the measurer is able to reproduce measurements, while
"accuracy" refers to the degree to which the
measurement approximates "true" values. Habicht (13)
has devised simple procedures for training personnel. (These
procedures are recommended by WHO in a manual for use in
developing countries (9).) Typically, subjects are measured twice
by each student and by the supervisor. From these data, the error
variance of each student is estimated and compared to that of the
supervisor to monitor reliability; and systematic differences
among students and between each student and the supervisor are
used to monitor accuracy. At specified levels of performance as
suggested by Habicht, students are judged to be ready for data
collection. The widespread availability of inexpensive electronic
calculators should make these simple computations even easier.
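The computations described above can be sketched as follows. This is a minimal illustration only, not Habicht's published procedure: the function names and the data are hypothetical, and no pass/fail thresholds are implied. Each trainee's error variance (reliability) is computed from duplicate readings, and the mean difference from the supervisor (accuracy) from the averages of those readings.

```python
from statistics import mean

def error_variance(first, second):
    """Error variance from duplicate readings: sum(d^2) / (2n)."""
    diffs = [a - b for a, b in zip(first, second)]
    return sum(d * d for d in diffs) / (2 * len(diffs))

def mean_bias(trainee_first, trainee_second, sup_first, sup_second):
    """Average of (trainee mean - supervisor mean) across subjects."""
    return mean(
        (t1 + t2) / 2 - (s1 + s2) / 2
        for t1, t2, s1, s2 in zip(
            trainee_first, trainee_second, sup_first, sup_second
        )
    )

# Hypothetical lengths (cm) for five children, each measured twice.
supervisor = ([70.1, 82.4, 65.0, 90.3, 77.8], [70.2, 82.3, 65.1, 90.3, 77.7])
trainee = ([70.6, 82.9, 65.2, 90.9, 78.1], [70.2, 82.5, 65.8, 90.4, 78.5])

sup_var = error_variance(*supervisor)
trainee_var = error_variance(*trainee)
bias = mean_bias(*trainee, *supervisor)

print(f"supervisor error variance: {sup_var:.3f}")
print(f"trainee error variance:    {trainee_var:.3f}")
print(f"trainee bias vs supervisor: {bias:+.2f} cm")
```

A trainee whose error variance is much larger than the supervisor's needs more practice in reproducing measurements; a bias far from zero points to a systematic difference in technique.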

Concern with measurement error should continue throughout the
duration of the study and should involve three types of
exercises. At periodic intervals, the supervisor and the
anthropometrists should meet to discuss problems encountered in
the field, to review measurement procedures, to calibrate
instruments and to repeat the initial training exercises in order
to monitor reliability and accuracy. Secondly, replicate measures
by each anthropometrist should also be carried out under field
conditions. The actual measurement process, particularly if the
examination only includes five measures as recommended in this
chapter, takes very little time; it is getting to the subject or
getting the subject to come to the measurement centre which is
generally time-consuming. Thus it is highly recommended that each
subject be measured twice by the anthropometrists. As suggested
in figure 3.2 (see FIG. 3.2. Anthropometry Form),
the first set of measurements should go on the front of the form
and the second on the reverse side in order to maximize the
independence of each set of observations. If possible, other data
(e.g. clinical signs) could be collected in between each set of
values to further improve measurement independence. Double
measurements provide ongoing reliability estimates under field
conditions and, as explained later in the quality control
section, result in better data. Finally, the supervisor will need
to visit the various anthropometrists periodically at their
working sites. On such occasions, the supervisor should also
measure a small number of subjects (e.g., five subjects). At the
end of the survey, a sufficient number of cases (e.g., 50
subjects) would be available to determine accuracy under field
conditions for each anthropometrist.

Table 3.3 (see TABLE 3.3. Measurement Error
Standard Deviation in an INCAP Longitudinal Study (Preschool
Children)) shows measurement error data from an INCAP
longitudinal study for the five variables recommended in this
chapter. Error estimates derived from standardization exercises
are, as expected, less than those encountered under field
conditions. Comparisons of the field measurement error variance
to the population variance for each variable show that the five
measures chosen have high reliability. By contrast, the error
variance for other measurements, such as chest circumference,
represents over 50 per cent of the population variance.

The measurement error standard deviation was estimated by
$\sqrt{\sum (a - b)^2 / (2n)}$, where a and b are the first
and second observations, respectively, and n is the number of
subjects measured.

Source: Martorell et al. (14)
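As an illustration of the note to Table 3.3, the sketch below computes the measurement error standard deviation from duplicate readings and expresses the error variance as a share of the population variance. The data, the population standard deviation of 1.2 cm, and the function names are assumptions made for the example; they are not values from the INCAP study.

```python
import math

def measurement_error_sd(first, second):
    """Return sqrt(sum((a - b)^2) / (2n)) for paired duplicate readings."""
    if len(first) != len(second) or not first:
        raise ValueError("need two equally long, non-empty lists")
    squared = sum((a - b) ** 2 for a, b in zip(first, second))
    return math.sqrt(squared / (2 * len(first)))

def error_share_of_variance(error_sd, population_sd):
    """Error variance as a percentage of the population variance."""
    return 100 * error_sd ** 2 / population_sd ** 2

# Hypothetical duplicate arm circumferences (cm) for four children.
first = [12.0, 13.5, 11.0, 14.0]
second = [12.2, 13.3, 11.0, 14.4]

error_sd = measurement_error_sd(first, second)
# The population SD of 1.2 cm is an assumed value for illustration.
share = error_share_of_variance(error_sd, population_sd=1.2)

print(f"measurement error SD: {error_sd:.3f} cm")
print(f"error variance / population variance: {share:.1f} per cent")
```

The smaller this percentage, the more reliable the measure in the sense used in the text.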

Measurement Techniques

The most widely-accepted techniques of measurement are those
proposed by the International Biological Programme (10, pp.
8-12), as follows:

1. Stature (measured with a
stadiometer or anthropometer; used for children two years old or
older). The subject should stand on a platform with his heels
together, stretching upward to the fullest extent, aided by
gentle traction by the measurer on the mastoid processes. The
subject's back should be as straight as possible, which may be
achieved by rounding or relaxing the shoulders and manipulating
the posture. The marked Frankfort plane must be horizontal.
Either the horizontal arm of an anthropometer, or a
counter-weighted board, is brought down on the subject's head. If
an anthropometer is used, one measurer should hold the instrument
vertical with the horizontal arm in contact with the subject's
head, while another applies the gentle traction. The subject's
heels must be watched to make sure they do not leave the ground.

2. Supine length (measured with an
infant measuring table; used for children younger than two years
of age). With the infant lying supine, one measurer holds the
infant's head in the Frankfort plane and applies gentle traction
to bring the top of his head into contact with the fixed
headboard. A second measurer holds the infant's feet, toes
pointing directly upward, and, also applying gentle traction,
brings the movable footboard to rest firmly against the infant's
heels.

3. Upper arm circumference (measured
with a tape). The subject's arm hangs relaxed, just away from his
side, and the circumference is taken horizontally at the marked
level. (The horizontal mark on the left arm is measured half way
between the inferior border of the acromion process and the tip
of the olecranon process.)

4. Skinfold thicknesses (measured
with skinfold caliper). The skinfold is picked up between thumb
and forefinger and the caliper jaws applied at exactly the level
marked. The measurement is read two seconds after the full
pressure of the caliper jaws is applied to the skinfold; if a
longer interval is allowed the jaws may "creep" and the
reading will be inaccurate. Two measurements are used:

- Over triceps. The skinfold is picked up at the
back of the arm about 1 cm above the level marked on the skin
for the arm circumference and directly in line with the point
of the elbow, or olecranon process.
- Subscapular. The skinfold is picked up under the
angle of the left scapula. The fold should be vertical, or
pointing slightly downwards and outwards.

5. Weight (measured with a scale).
Weighing should preferably be done with the subject in the nude
or clothed only in lightweight shorts (which may be provided by
the investigator). In the latter circumstance the measurement can
be corrected accordingly by adjusting the machine to read zero
when a sample garment is placed on it. In all other
circumstances, including when trousers are worn, the weight of a
representative garment should be entered on the form, for
subtraction later. The presence of visible oedema should be
recorded.

Knowing the correct age is essential for interpreting
anthropometric data. In its absence, for example, indicators of
stunting cannot be generated. Age data are easy to obtain when
records are available, or when mothers recall birth dates
reliably. In such cases, the date of birth and the date of
examination should be recorded; hasty calculations of age in the
field should be avoided.
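Recording both dates allows age to be computed later, away from the field. A minimal sketch of such a calculation is given below; the convention of dividing elapsed days by the average month length (365.25 / 12) and the dates themselves are assumptions for illustration, and a survey may adopt a different convention.

```python
from datetime import date

# Average month length in days; an assumed convention.
DAYS_PER_MONTH = 365.25 / 12

def age_in_months(birth_date, exam_date):
    """Age at examination in months, from the two recorded dates."""
    if exam_date < birth_date:
        raise ValueError("examination date precedes date of birth")
    return (exam_date - birth_date).days / DAYS_PER_MONTH

# Hypothetical child born 15 January 1980, examined 15 July 1981.
age = age_in_months(date(1980, 1, 15), date(1981, 7, 15))
print(f"age at examination: {age:.1f} months")
```

The check on the order of the two dates catches one common recording error (transposed dates) before it produces a negative age.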

It would be a serious mistake to exclude systematically all
children of unconfirmed ages, as they most likely will be in
poorer nutritional status. However, the problem of estimating age
in such cases is a difficult one. Some options are available if
the child is very young, as Jelliffe notes:

Sometimes, the mother may not know the child's age, but
may be able to recite the month of birth, and occasionally
the day as well. If this is so, the mother will often recall
details of the youngest child only, not those of older
siblings. (16)

The examiner in the above situations can estimate the year of
birth confidently if the child is an infant or toddler.

The use of dental eruption data as an aid in estimating age in
the first two years of life has also been proposed. However,
there is considerable variability in eruption times and normative
data are not available for many groups.

Estimating the age of children older than two years is the
difficult aspect of the problem. Jelliffe offers some advice:

Often the only practicable method may be to construct a
locally relevant calendar... based on events in the preceding
years, including agricultural, climatic and political
occurrences, as well as natural or man-made disasters... However,
such a calendar takes weeks to prepare and pre-test in the field,
while its use in survey circumstances is laborious,
time-consuming, and least satisfactory with the unsophisticated
communities for which it is intended. Calendars will plainly have
to be specific for different communities. (16)

All other data should be collected even when age proves
impossible to estimate. Assessments of wasting, as noted earlier,
do not require that age be known.

Equipment

The Ninth Handbook of the International Biological Programme
(15) as well as Zerfas (17) discuss the selection of equipment
and list the addresses of some instrument distributors.

The lightest and least expensive piece of equipment will be
the tape needed for measuring arm circumference. Cloth and paper
tapes are not recommended because they wear out and stretch
easily. Fiberglass or steel tapes are preferable.

A number of calipers are available for measuring fat folds.
The two most widely used are the Lange (US-made) and the
Harpenden (British-made) models, either of which is suitable.
These two calipers possess rectangular jaws and exert a constant
pressure of 10 g/mm². Prices range between US $150 and US $200
for most calipers.

The measurement of weight, height, and supine length presents
special problems. If subjects are to be measured in their homes,
the equipment chosen should be light and sturdy. Where subjects
are asked to come to a central station or designated location,
the bulk and weight of the equipment is of less concern. The most
widely used scales suitable for fixed locations are the Detecto
baby (up to 16 kg) and adult (up to 140 kg) scales (beam
balance). Suitable for use in house-to-house surveys is the
British-made Salter scale (spring scale; up to 25 or 50 kg),
which can be hung from a tree branch or a house beam. This scale
has been used successfully in surveys such as those conducted by
the US Center for Disease Control (CDC) in El Salvador.
Currently, the portable weighing scale (model 235) sells for
£21. Field-worthy but heavier scales (e.g., 19 pounds) are
available for measuring adults (15). Common
bathroom scales are not reliable and are not recommended for
weighing infants and young children, though they are suitable for
cross-sectional surveys in adults.

Precise, sturdy instruments for
measuring stature in fixed locales include the Harpenden
Stadiometer (US $694 per unit). This instrument is impractical
(100 pounds in weight) for the type of survey research that is
generally carried out in developing countries. Some weight
scales, such as the Detecto scales, have built-in stadiometers;
but again, they are not suitable for house-to-house surveys. A
portable stadiometer, suitable for house-to-house surveys, is
available in the Harpenden line of instruments (Harpenden Pocket
Stadiometer, model 98-605, $46). Where reasonably straight walls
are found in homes, a scale (e.g., fiberglass tape) can be taped
to the wall, and, with the use of a flat board to press against
the top of the head after positioning the individual correctly,
height can be measured with surprising accuracy.

Length in children can be easily
measured with portable boards, usually made out of wood.
Inexpensive models have been developed by UCLA and the CDC (17),
WHO (9), and INCAP; they can easily be replicated by local
carpenters. Infants' boards have a fixed vertical end against
which the child's head is positioned and a movable end which is
positioned against the soles of the feet. A fiberglass tape is
attached along the flat board. Sturdy infant measuring tables are
also available commercially; these are, however, more expensive,
very bulky and more appropriate for fixed locales. For example,
the Harpenden infant measuring table weighs 20 pounds and costs
US $658. Perspective Enterprises (7466 Thrasher Lane, Kalamazoo,
Michigan) has, however, developed a lighter (8 pounds) measuring
board for around US $70.

Quality Control

All data collection activities
should be monitored by a supervisor. He or she should meet
periodically with the anthropometrists to reread the manuals and
to review the measuring techniques; periodic visits to the field
are also recommended.

Data flow charts need to be
prepared. Forms must be inspected shortly after collection to
make sure all information is recorded properly. A record of names
and dates of measurements should be kept at the field site, and
the number of forms sent periodically to the central offices for
data-processing should be counted and recorded. After the data
are punched, the original forms should be filed for safekeeping.
The punched data should be analysed as soon as possible to
monitor reliability (from duplicate measurements) and to scan for
outliers. A recent book edited by Jelliffe and Jelliffe (18)
contains articles by Garn, Zerfas, and Nichaman. These three
articles outline the most common types of errors that lead to
outliers.

In scanning for outliers, values
can be checked against the appropriate age-sex group
distribution. For example, a value may be labelled as an outlier
if it is more than two or three standard deviations above or
below the mean. In addition, if two measurements are obtained, as
recommended in this chapter, the differences between paired
values can be compared to the measurement error standard
deviation. Paired values that differ by three or more times the
measurement error standard deviation would be suspect. Ideally, one should
remeasure the subjects whose values are questionable. Where this
is not feasible, suspect values should be divided into those that
are clearly impossible and those that are unlikely but plausible.
The former should be deleted from the data files. Decisions as to
what to do with unusual but plausible information are always
difficult and arbitrary. If the quality of data is high, only a
small percentage of values (e.g., less than 1 per cent) may be
involved, and little information will be lost if these values are
deleted. Another strategy is to code suspect but plausible
information so as to investigate its effect on key data analyses.
Whatever is done, criteria and procedures should be specified and
made available to those using the information.
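The two screening rules described above can be sketched as follows. The multiplier of three, the group mean and standard deviation, the measurement error standard deviation, and the data are all assumed values chosen for illustration; each survey should specify its own criteria.

```python
def flag_distribution_outliers(values, group_mean, group_sd, k=3.0):
    """Flag values more than k standard deviations from the group mean."""
    return [abs(v - group_mean) > k * group_sd for v in values]

def flag_discrepant_pairs(first, second, error_sd, k=3.0):
    """Flag pairs whose difference is k or more times the error SD."""
    return [abs(a - b) >= k * error_sd for a, b in zip(first, second)]

# Hypothetical heights (cm) for one age-sex group (mean 85.0, SD 3.5).
heights = [84.0, 86.5, 112.0, 85.2]
outliers = flag_distribution_outliers(heights, group_mean=85.0, group_sd=3.5)

# Hypothetical duplicate readings; measurement error SD assumed to be 0.5 cm.
first = [84.0, 86.5, 85.2]
second = [84.3, 88.4, 85.1]
discrepant = flag_discrepant_pairs(first, second, error_sd=0.5)

print("distribution outliers:", outliers)
print("discrepant pairs:     ", discrepant)
```

Flagged records are candidates for remeasurement or review, not automatic deletion; the decision between "impossible" and "unlikely but plausible" still has to be made by the investigator.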

A final note on quality control:
To some, the supervision and quality control procedures suggested
here may appear excessive and time-consuming. They are certainly
not. One impact of quality control procedures is that they foster
a careful attitude on the part of field workers. To see "all
this fuss" devoted to numbers (which may be largely
meaningless to them) develops a sense of pride and
responsibility. Otherwise, workers come to believe one is
interested primarily in numbers, quality not being an issue.

The design presented in table 3.1
does not necessarily require the use of a norm or reference
against which to compare the data collected. However, there is
always the need to quantify the severity of nutritional problems
and/or ascertain the pace of progress in nutritional status in a
relative sense; for these purposes, norms become necessary.

The question as to whether growth
data on children of European origin from developed nations are
appropriate as norms for developing countries has been heatedly
debated (19; 20; 21). A seemingly obvious solution would be for
data on the well-nourished of each country, or even better for
each ethnic group within a country, to serve as the norm for
their ethnically appropriate group. This suggestion is not always
practical because high quality data on sufficient numbers of the
well-nourished do not yet exist for all ethnic groups. Moreover,
some ethnic groups in some countries are so overwhelmingly poor
and malnourished that a "well-nourished" group may not
even exist. What is more interesting is that when data from
well-nourished, preschool children of diverse ethnic groups are
compared to the European norms, differences are usually rather
small, particularly when viewed against the very large
differences between the poor and the well-nourished in each
ethnic group (19; 21). It would appear therefore, that growth
data derived from European populations are justified as
"norms" for developing countries - that is, as rough
approximations and not as targets. (Appropriate to this point are
the remarks of Waterlow et al. (22, p. 491), who in proposing
NCHS data as norms for developing countries state: "If it is
felt that in a particular population even well-nourished children
are shorter in stature than children of the North American
reference population, then it might be reasonable to set the
target for height as 95 per cent of the reference height rather
than 100 per cent. Decisions of this kind have to be taken
locally, and it is not possible to make international
recommendations about them.") Finally, the use of a similar
norm by workers in developing countries facilitates the
comparison of published results.

The National Center for Health
Statistics (NCHS) has prepared new percentile charts for
assessing the growth of children in the United States (23). These
data have been recommended as the norms to use worldwide by a
team of leading scientists from Europe, the United States, and
the World Health Organization (22). Their proposal is likely to
gain wide acceptance among researchers.

The adoption of the NCHS norms
will not lead to very different estimates of the magnitude of
malnutrition. The cut-off point of 75 per cent weight for age has
been used to identify children with second-and third-degree
malnutrition (i.e., < 75 per cent weight for age). Table 3.4
(see TABLE 3.4. Differences
(kg) between the NCHS and the Iowa-Harvard Norms with Respect to
the Cut-off Point of 75 % Weight for Age ) shows the criterion values using the new
NCHS norms and the values corresponding to the most widely used
norm of the past, the Iowa-Harvard data (24). Mean differences
are small, particularly for boys. The use of the NCHS norms,
instead of the Iowa-Harvard data, would result in a slightly
increased prevalence of malnutrition in boys and a small decrease
in the prevalence in girls, because the NCHS cut-off weights are
generally slightly higher for boys and lower for girls.
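The effect of switching norms on individual classifications can be illustrated with the cut-off values for age 1.00 year from Table 3.4. The children's weights and the function name below are hypothetical; only the four cut-off values come from the table.

```python
# 75 per cent weight-for-age cut-offs (kg) at age 1.00 year, from Table 3.4.
CUTOFF_KG = {
    "boys": {"NCHS": 7.61, "Iowa-Harvard": 7.55},
    "girls": {"NCHS": 7.15, "Iowa-Harvard": 7.31},
}

def below_cutoff(weight_kg, sex, norm):
    """True if the weight falls below the 75 per cent cut-off."""
    return weight_kg < CUTOFF_KG[sex][norm]

norms = ("NCHS", "Iowa-Harvard")

# Hypothetical one-year-olds whose weights lie between the two cut-offs.
boy = {norm: below_cutoff(7.58, "boys", norm) for norm in norms}
girl = {norm: below_cutoff(7.20, "girls", norm) for norm in norms}

print("boy, 7.58 kg: ", boy)
print("girl, 7.20 kg:", girl)
```

Only children whose weights fall in the narrow band between the two cut-offs change classification, which is why the overall estimates of prevalence differ little between the norms.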

Norms for arm and skinfold values
are not yet available from NCHS. When they do become available,
the fact that children in the US are heavier and fatter
than those in other European-origin populations (e.g., British
children) should be taken into account. Interestingly, it has
recently been pointed out that children from
"privileged" groups in some developing countries are
even fatter than those from the US (21).

TABLE 3.4. Differences (kg) between the NCHS and the
Iowa-Harvard Norms with Respect to the Cut-off Point of 75 %
Weight for Age

Age      Boys                                 Girls
(years)  NCHS*   Iowa-Harvard**  Difference   NCHS    Iowa-Harvard  Difference

0.00      2.45    2.55           -0.10         2.42    2.52         -0.10
0.25      4.49    4.29            0.20         4.05    4.22         -0.17
0.50      5.89    5.69            0.20         5.41    5.45         -0.04
0.75      6.89    6.80            0.09         6.42    6.53         -0.11
1.00      7.61    7.55            0.06         7.15    7.31         -0.16
1.50      8.60    8.57            0.03         8.12    8.33         -0.21
2.00      9.44    9.42            0.02         8.93    9.22         -0.29
2.50     10.25   10.21            0.04         9.70   10.07         -0.37
3.00     11.02   10.96            0.06        10.45   10.82         -0.37
3.50     11.76   11.67            0.09        11.30   11.54         -0.24
4.00     12.52   12.38            0.14        11.97   12.32         -0.35
4.50     13.27   13.07            0.20        12.61   13.10         -0.49
5.00     14.00   13.78            0.22        13.25   13.78         -0.53

* The NCHS norms use data from the Fels Research Institute for
children up to 3 years of age and from a national sample
thereafter
** 75 per cent of the age-sex-specific 50th percentile values

Analysis

Length for age and weight for
length are the key variables for evaluating the impact of
nutrition interventions. Data on arm circumference, triceps
skinfold, arm and fat areas and subscapular skinfold should also
be used, as we have stressed. When norms of comparison from NCHS
become available, the same reporting procedures discussed below
for weight for length and length for age should be followed.

Waterlow et al. (22) propose that
data on children be analysed by three-month groups during the
first year of life (i.e., 0-2.99, 3.0-5.99, 6.0-8.99, 9.0-11.99
months) and by one-year groups for older children (1.0-1.99, etc.).
The recommended sample sizes are 100 children in each age group;
failing this, Waterlow et al. recommend collapsing some of the
age groups. WHO (9) proposes seven larger age groupings: infants
(0-5.9 and 6-11.9 months), preschool children (12-23.9, 24-47.9,
and 48-71.9 months), and school children (72-95.9 and 96-119.9
months). (The issue of sample size is an important one which is
best approached by power analyses (25). Needed for sample size
calculations are specifications of expected effects, level of
statistical significance, and power.)
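Assigning each child to an age group is a mechanical step that is easily automated. The sketch below uses the seven WHO groupings listed above; the function name and the treatment of the boundaries (lower bound included, upper bound excluded) are assumptions made for the example.

```python
# (lower bound, upper bound, label); bounds in months, upper bound excluded.
WHO_AGE_GROUPS = [
    (0, 6, "infants, 0-5.9 months"),
    (6, 12, "infants, 6-11.9 months"),
    (12, 24, "preschool, 12-23.9 months"),
    (24, 48, "preschool, 24-47.9 months"),
    (48, 72, "preschool, 48-71.9 months"),
    (72, 96, "school, 72-95.9 months"),
    (96, 120, "school, 96-119.9 months"),
]

def who_age_group(age_months):
    """Return the WHO age-group label, or None if outside 0-119.9 months."""
    for lower, upper, label in WHO_AGE_GROUPS:
        if lower <= age_months < upper:
            return label
    return None

for age in (3.5, 23.9, 24.0, 130.0):
    print(age, "->", who_age_group(age))
```

Returning None for ages outside the range, rather than forcing them into the nearest group, keeps such records visible for checking.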

A common procedure in analysing
data is to test whether the number of observations falling
between specific percentile values of the norm has changed
(e.g., between the 50th and 60th percentile values, the 60th and
70th percentile values, etc.). This method is inappropriate for data on
stunting indicators from developing countries because many
children, often as many as 90 per cent, will have values that
fall below the 5th percentile. Many researchers in the past have
therefore expressed the observations as a percentage of the mean
or median value for age. For example, per cent weight for age is
the actual recorded weight divided by the 50th age-sex specific
percentile weight value in the norms, the result multiplied by
100. There are at least two problems, one minor and one major,
with the use of per cent of mean or median values. First, per
cent of median values are not, as the seemingly equal units of
measurement would suggest, equivalent across age. For example,
with respect to the NCHS norms, 95 per cent of median height
corresponds to the 8th percentile at 12 months of age but to the
12th percentile at 48 months of age.

Differences of this magnitude are
not generally important. Of greater concern is the fact that per
cent of median values are not equivalent across measures, the
differences in this case being large. For example, for girls and
once again in reference to the NCHS norms, 90 per cent of median
corresponds to the 13th percentile for weight for length at 95 cm
but to less than the 1st percentile for height for age at 36 months,
which is approximately the age when girls achieve a median length
of 96 cm (22). Failure to take this into account has led to
fallacious statements such as: "Measure A appears to be more
retarded than measure B because the per cent of median value is
less for A than for B."
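The reason per cent of median is not comparable across measures is that measures differ in their relative variability. A rough sketch follows, assuming that the reference distribution is normal and that its mean equals its median. The coefficients of variation used (4 per cent for height for age, 9 per cent for weight for length) are illustrative assumptions, not NCHS values, chosen so that the output roughly mirrors the contrast described above.

```python
from statistics import NormalDist

def percentile_of_percent_median(percent_of_median, cv):
    """Percentile matching a per cent of median value, given the
    coefficient of variation (SD / median) of a normal reference."""
    z = (percent_of_median / 100 - 1) / cv
    return 100 * NormalDist().cdf(z)

# Assumed coefficients of variation, for illustration only.
height_pct = percentile_of_percent_median(90, cv=0.04)
weight_pct = percentile_of_percent_median(90, cv=0.09)

print(f"90% of median, height for age:    percentile {height_pct:.1f}")
print(f"90% of median, weight for length: percentile {weight_pct:.1f}")
```

Under these assumptions the same "90 per cent of median" is an extreme value for the measure with low variability and an unremarkable one for the measure with high variability.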

The use of standard deviation
scores or Z-scores is an elegant and simple solution to the
problems noted in the case of per cent of median value. Standard
deviation scores are equivalent across age and across measures.
Waterlow et al. (22) have proposed methods of classification
based upon Z-scores in the hope that they "will be widely
acceptable and thus make international comparisons
possible." They include Z-score grids and suggest that data
be presented in summary tables by Z-score categories (i.e., by
half or one standard deviation groupings). Waterlow et al. also
propose that cross-tabulations of stunting (length for age) and
wasting (weight for length) be presented and that tables of means
(medians) and standard deviation in the original units (cm or kg)
and in Z-score units should be presented for all sex-age groups.
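A Z-score is the observed value minus the reference median, divided by the reference standard deviation. The sketch below computes one and builds a cross-tabulation of stunting by wasting. The cut-off of -2 standard deviations, the reference values, and the children's scores are assumptions for illustration; they are not taken from Waterlow et al.

```python
from collections import Counter

def z_score(value, reference_median, reference_sd):
    """Standard deviation score relative to the reference population."""
    return (value - reference_median) / reference_sd

def cross_tabulate(children, cutoff=-2.0):
    """Count children by (stunted, wasted), given pairs of
    (length-for-age Z, weight-for-length Z). The cut-off is assumed."""
    table = Counter()
    for z_length_for_age, z_weight_for_length in children:
        stunted = z_length_for_age < cutoff
        wasted = z_weight_for_length < cutoff
        table[(stunted, wasted)] += 1
    return table

# Hypothetical child: length 80.0 cm, reference median 86.0 cm, SD 3.0 cm.
example_z = z_score(80.0, reference_median=86.0, reference_sd=3.0)

# Hypothetical (length-for-age Z, weight-for-length Z) pairs.
children = [(-2.6, -0.4), (-1.1, -2.3), (-2.9, -2.5), (-0.3, 0.2), (-2.2, -1.0)]
table = cross_tabulate(children)

print(f"example Z-score: {example_z:.1f}")
for (stunted, wasted), count in sorted(table.items()):
    print(f"stunted={stunted!s:5} wasted={wasted!s:5} n={count}")
```

The four cells of the table correspond to children who are neither stunted nor wasted, wasted only, stunted only, and both.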

There are many approaches to the
testing of the experimental design shown in table 3.1. This
simple design, as will be recalled, has a test and a control
sample and a baseline and an intervention phase. For example,
differences between groups can be tested by an analysis of
variance (ANOVA). Independent variables would be
"sample" (test/control), "phase"
(baseline/intervention), "sex" (male/female), and
"age" (0-1, 1-2, etc.). Dependent variables would be
each of the anthropometric variables either in the original units
(i.e., cm or kg) or Z-score values.
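The quantity of main interest in this design is the change in the test sample over and above the change in the control sample, which corresponds to the sample-by-phase interaction in the ANOVA. The sketch below computes only that contrast from cell means; it is not a significance test, it ignores sex and age, and the Z-scores are hypothetical.

```python
from statistics import mean

# Hypothetical length-for-age Z-scores by (sample, phase).
cells = {
    ("test", "baseline"): [-2.1, -1.8, -2.4],
    ("test", "intervention"): [-1.6, -1.5, -1.7],
    ("control", "baseline"): [-2.0, -2.2, -1.8],
    ("control", "intervention"): [-1.9, -2.1, -1.7],
}

def net_change(cells):
    """Change in the test sample minus change in the control sample."""
    m = {key: mean(values) for key, values in cells.items()}
    test_change = m[("test", "intervention")] - m[("test", "baseline")]
    control_change = m[("control", "intervention")] - m[("control", "baseline")]
    return test_change - control_change

effect = net_change(cells)
print(f"net change attributable to the intervention: {effect:+.2f} Z")
```

Subtracting the control sample's change removes improvements that would have occurred without the intervention, such as those due to a good harvest.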

WHO (9) has published a data
analysis manual for height and weight data. Numerous illustrative
tables and figures from a simulated analysis of a World Food
Programme in the country of "Barico" and the
"Kimani" project are included.

J.C. Waterlow, R. Buzina,
W. Keller, J.M. Lane, M.Z. Nichaman and J.M. Tanner,
"The Presentation and Use of Height and Weight Data
for Comparing the Nutritional Status of Groups of
Children Under the Age of 10 Years," Bulletin of the
World Health Organization, 55: 489 (1977).