Controlling variation is vital. The more uniform the animals (or other experimental units) are within a treatment group or sub-group, the fewer of them will be needed, or the greater the power of the experiment. Variability due to things like sex, strain, age, previous history etc. should be taken into account in the design of the experiment.

Fig. 1. DNA fingerprint of eight animals of two isogenic strains. Note uniformity within each strain. Individuals are like identical twins.

Using genetically uniform animals

Isogenic strains, when available, tend to be more uniform than outbred stocks, leading to more powerful experiments (see 5. Species & strain for a slightly more detailed discussion, or the companion web site Isogenic Strains).

As an example, Fig. 2 shows variability in kidney weight, corrected to a mean of 100 units, among 58 groups of rats, some of which were isogenic F1 hybrids (Std. Dev. 13.5 units), some non-isogenic F2 hybrids (Std. Dev. 18.4 units) and some outbred (Std. Dev. 20.1 units). In planning a future experiment to detect a 10% change in kidney weight compared with controls, assuming 80% power and a 5% significance level, it is estimated that the experiment would need 30 F1 hybrids, 55 F2 hybrids or 65 outbred animals in each group (for more details of the calculations see 13. Sample size).
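The arithmetic behind such estimates can be sketched as follows. This is a minimal illustration that assumes a two-group comparison and uses the common normal approximation; it is not necessarily the method used to obtain the figures quoted above, and the function name `animals_per_group` is hypothetical. Because it ignores the small correction that a t-based calculation adds, it gives group sizes slightly below the quoted ones.

```python
from math import ceil
from statistics import NormalDist


def animals_per_group(sd, difference, power=0.80, alpha=0.05):
    """Approximate animals per group for a two-sided, two-group comparison.

    Normal approximation (an assumption of this sketch):
        n = 2 * ((z_(1 - alpha/2) + z_power) * sd / difference) ** 2
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_power = z.inv_cdf(power)          # desired power
    return ceil(2 * ((z_alpha + z_power) * sd / difference) ** 2)


# Standard deviations from the text; kidney weight is scaled to a mean of
# 100 units, so a 10% change is a difference of 10 units.
for label, sd in [("F1 hybrids", 13.5), ("F2 hybrids", 18.4), ("outbred", 20.1)]:
    print(label, animals_per_group(sd, difference=10))
```

The ranking of the three types of animal is the point of the example: the required group size grows with the square of the standard deviation.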

Ensuring freedom from clinical and sub-clinical disease

Pathogenic micro-organisms lead to increased within-group variability because they are never distributed evenly within a colony. The animals may appear to be clinically healthy, but they will still be more variable. This can drastically reduce the power of the experiment so that it may be necessary to use several times more diseased than disease-free animals to do the same experiment. Wherever possible SPF (specific pathogen free) animals should be used.

As an example, four of the colonies shown in Fig. 2 were infected with Mycoplasma pulmonis, which causes chronic respiratory disease. The disease-free rats had a standard deviation of 18.6 units compared with 43.3 units in the infected rats. A power calculation assuming 80% power and a 5% significance level shows that detecting a 10% change in kidney weight as a result of some applied treatment would require 53 rats per group if disease-free rats were used, or 296 rats per group if infected rats were used. This extra variation means that more than five times as many rats would be needed.
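The size of this penalty follows from the fact that, for a fixed effect size, power and significance level, the required group size is roughly proportional to the square of the standard deviation. A minimal sketch of that relationship is given below; the function name `relative_group_size` is hypothetical, and the result is an approximation rather than a full power calculation.

```python
def relative_group_size(sd_reference, sd_other):
    """Approximate factor by which group size must grow when the standard
    deviation rises from sd_reference to sd_other, everything else
    (effect size, power, significance level) being held constant."""
    return (sd_other / sd_reference) ** 2


# Standard deviations from the text: disease-free 18.6, infected 43.3.
ratio = relative_group_size(18.6, 43.3)
print(round(ratio, 1))  # 5.4
```

A little more than doubling the standard deviation therefore multiplies the number of animals needed by more than five.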

Fig. 2. Variation in kidney weight in 58 groups of rats, including F1 and F2 hybrids and outbred rats. Some were infected with Mycoplasma pulmonis. See above for explanation of the consequences of genetic and disease-dependent variation.

Choosing animals of uniform weight, age and of known sex

Unsexed animals should be used only if sex is known not to be important in response to the proposed treatments and if the use of unsexed animals does not increase within-group variation. Both sexes, separately identified, can be included in a single experiment without loss of statistical power by using a factorial experimental design (see 12. Experimental Designs).

As far as possible, animals should be matched for age, weight, genotype and previous history. Following randomisation to treatment groups, a "statistically significant" difference in body weight etc. between the groups may be observed. In fact, by definition, this will occur in about 5% of experiments (assuming p=0.05 is the critical value). In such cases it would be sensible to re-randomise the animals. Some scientists have developed programs which "improve" on the randomisation by moving animals between groups until body weight is the same in each group. This is not recommended, as the effect is to increase the within-group variation, which will decrease the power of the experiment.

Stratifying the animals by some pre-determined characteristic

For example, if the only available animals vary considerably in weight or age, they can be grouped according to body weight or age using a randomised block experimental design (see 12. Experimental Designs). This will improve statistical power without requiring more animals.
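One way such an allocation might be produced is sketched below. It assumes that blocks are formed by ranking the animals on body weight and that each block holds exactly one animal per treatment; the function name `randomised_block_assignment` and the example weights are hypothetical.

```python
import random


def randomised_block_assignment(weights, treatments, seed=None):
    """Allocate animals to treatments within blocks of similar body weight.

    weights    -- body weight of each animal; list position is the animal's index
    treatments -- treatment labels; each block receives every treatment once
    Returns a dict mapping animal index -> (block number, treatment).
    """
    rng = random.Random(seed)
    k = len(treatments)
    if len(weights) % k != 0:
        raise ValueError("number of animals must be a multiple of the number of treatments")

    # Rank the animals from lightest to heaviest.
    ranked = sorted(range(len(weights)), key=lambda i: weights[i])

    assignment = {}
    for block_no, start in enumerate(range(0, len(ranked), k)):
        block = ranked[start:start + k]   # k animals of similar weight
        order = list(treatments)
        rng.shuffle(order)                # randomise treatments within the block
        for animal, treatment in zip(block, order):
            assignment[animal] = (block_no, treatment)
    return assignment


# Hypothetical body weights (g) for six animals and two treatments.
weights = [310, 250, 275, 330, 260, 300]
plan = randomised_block_assignment(weights, ["control", "treated"], seed=1)
for animal in sorted(plan):
    print(animal, weights[animal], plan[animal])
```

The block should then be included as a factor in the statistical analysis, so that differences between blocks are removed from the within-group variation.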

Minimising measurement error

Use appropriate instrumentation, meticulous technique, and multiple determinations for more complex characters.

Animals should be weighed individually, using appropriate apparatus, and doses should be administered according to the body weight (or some function of it) of each animal.

Distribute the remaining variability randomly

The remaining variability should be distributed among groups using appropriate randomisation (see Randomisation). This randomisation should continue throughout the experiment. It would be inappropriate to assign the animals to treatments at random, but then keep the groups separate, say on different shelves or in different rooms, or do the measurements on them at different times.
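As an illustration of randomisation that continues beyond the initial allocation, the sketch below draws three independent random orders: the allocation to treatments, the cage positions and the order of measurement. It assumes equal group sizes, and the function name `randomise_experiment` and the animal identifiers are hypothetical.

```python
import random


def randomise_experiment(animal_ids, treatments, seed=None):
    """Randomise treatment allocation, cage position and measurement order.

    Returns (allocation, cage_positions, measurement_order), where
    allocation maps animal id -> treatment and the other two are
    independently shuffled lists of animal ids.
    """
    rng = random.Random(seed)
    if len(animal_ids) % len(treatments) != 0:
        raise ValueError("number of animals must be a multiple of the number of treatments")

    # Equal numbers of each treatment label, shuffled across the animals.
    per_group = len(animal_ids) // len(treatments)
    labels = [t for t in treatments for _ in range(per_group)]
    rng.shuffle(labels)
    allocation = dict(zip(animal_ids, labels))

    # Cage positions (e.g. shelf slots) are shuffled independently, so that
    # treatment groups are not housed together.
    cage_positions = list(animal_ids)
    rng.shuffle(cage_positions)

    # The order of measurement is shuffled as well, so that groups are not
    # measured at systematically different times.
    measurement_order = list(animal_ids)
    rng.shuffle(measurement_order)

    return allocation, cage_positions, measurement_order


# Hypothetical animal identifiers.
animals = ["A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8"]
allocation, cages, order = randomise_experiment(animals, ["control", "treated"], seed=42)
print(allocation)
print(cages)
print(order)
```

Recording the seed makes the randomisation reproducible and auditable.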