Towards a course on causal inference: Randomization: The early years

… a pair of plots similarly treated may be expected to yield considerably different results, even when the soil appears to be uniform and the conditions under which the experiment is conducted are carefully designed to reduce errors in weighing and measurement… the probable error attaching to a single plot is in the neighborhood of plus or minus 10 per cent… (Mercer and Hall 1911, “The Experimental Error of Field Trials”)

The astronomer’s measurements come up short of absolute accuracy because of a great number of atmospheric conditions… He has to obviate this unavoidable lack of accuracy by making many independent observations and taking their average. (Wood and Stratton 1910, “The Interpretation of Experimental Results”)

If there be a least perceptible difference, then when two excitations differing by less than this are presented to us, and we are asked to judge which is the greater, we ought to answer wrong as often as right in the long run. Whereas, if [not]… we… ought to answer right oftener than wrong… (Peirce and Jastrow 1885, “On Small Differences in Sensation”)

The Peirce-Jastrow experiment is the first of which I am aware where the experimentation was performed according to a precise, mathematically sound randomization scheme! (Stigler 1978, “Mathematical Statistics in the Early States”)

Fisher’s arguments for randomization: unbiasedness and valid tests

The first requirement which governs all well-planned experiments is that the experiment should yield not only a comparison of different manures, treatments, varieties, etc., but also a means of testing the significance of such differences as are observed.

… a purely random arrangement of plots ensures that the experimental error calculated shall be an unbiased estimate of the errors actually present. (Fisher 1925, Statistical Methods for Research Workers)

… because the uncontrolled causes which may influence the result are always strictly innumerable.

… relieves the experimenter from the anxiety of considering and estimating the magnitude of the innumerable causes by which his data may be disturbed. (Fisher 1935, The Design of Experiments)

(The valid tests point will need explication in future years.)
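As a placeholder for that explication, here is a minimal sketch of the idea behind "valid tests": if treatments were assigned at random, then under the null hypothesis of no effect every possible assignment was equally likely, so the randomization itself supplies the reference distribution for a test statistic. The yields below are made up for illustration, and the function name randomization_p_value is hypothetical. This is a modern exact randomization test, not a reconstruction of Fisher's own calculations.

```python
from itertools import combinations


def randomization_p_value(outcomes, treated_idx):
    """Exact two-sided randomization p-value for a difference in means.

    Enumerates every way of choosing len(treated_idx) treated units out of
    len(outcomes) and counts the assignments whose absolute difference in
    means is at least as large as the one actually observed.
    """
    n = len(outcomes)
    k = len(treated_idx)
    total = sum(outcomes)

    def diff_in_means(idx):
        treated_sum = sum(outcomes[i] for i in idx)
        return treated_sum / k - (total - treated_sum) / (n - k)

    observed = abs(diff_in_means(treated_idx))
    assignments = list(combinations(range(n), k))
    # Small tolerance so that floating-point noise does not break ties.
    extreme = sum(
        1 for idx in assignments if abs(diff_in_means(idx)) >= observed - 1e-12
    )
    return extreme / len(assignments)


# Hypothetical yields from eight plots; plots 0-3 received the treatment.
yields = [29.0, 31.5, 30.2, 33.1, 25.4, 27.9, 26.3, 28.8]
p = randomization_p_value(yields, (0, 1, 2, 3))
print(p)
```

In this made-up example the treated plots happen to have the four highest yields, so only 2 of the 70 possible assignments are as extreme as the observed one, giving p = 2/70, about 0.029. No assumption about the distribution of plot errors is needed; the validity comes from the random assignment alone.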

Fisher’s legacy

The analysis of variance… showed experimenters the importance of placing the comparisons allotted to treatments on the same basis as the comparisons used for the estimate of error. The process of random assignment or arrangement was seen to be an essential and easy method of giving the same opportunity to treatment and error methods… The analysis of variance for the first time provided a sound statistical technique to accompany experimental arrangements such as the replicated block and the Latin square… (Youden 1951, “The Fisherian Revolution in Methods of Experimentation”)

(There’s also a neat Fisher anecdote in the paper: writing about Student’s t-distribution, Fisher says “the form establishes itself instantly”, then follows this with 12 pages of integrals.)