… a pair of plots similarly treated may be expected to yield considerably different results, even when the soil appears to be uniform and the conditions under which the experiment is conducted are carefully designed to reduce errors in weighing and measurement… the probable error attaching to a single plot is in the neighborhood of plus or minus 10 per cent… (Mercer and Hall 1911, “The Experimental Error of Field Trials”)

The astronomer’s measurements come up short of absolute accuracy because of a great number of atmospheric conditions… He has to obviate this unavoidable lack of accuracy by making many independent observations and taking their average. (Wood and Stratton 1910, “The Interpretation of Experimental Results”)

If there be a least perceptible difference, then when two excitations differing by less than this are presented to us, and we are asked to judge which is the greater, we ought to answer wrong as often as right in the long run. Whereas, if [not]… we… ought to answer right oftener than wrong… (Peirce and Jastrow 1885, “On Small Differences in Sensation”)

The Peirce-Jastrow experiment is the first of which I am aware where the experimentation was performed according to a precise, mathematically sound randomization scheme! (Stigler 1978, “Mathematical Statistics in the Early States”)
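Peirce and Jastrow's criterion is, in modern terms, a binomial test: under pure guessing, correct answers arrive with probability 1/2, so the question is whether the observed number of correct answers is surprising under that null. Here's a minimal sketch using only the standard library; the trial counts (1000 trials, 540 correct) are made-up illustrative numbers, not figures from the paper:

```python
from math import comb

def binom_tail(n, k, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): the probability of k or more
    correct answers if the subject is merely guessing."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Hypothetical data: 1000 forced-choice trials, 540 answered correctly.
# Under guessing we expect about 500 correct; how surprising is 540?
p_value = binom_tail(1000, 540)
```

If `p_value` is small, the subject is answering "right oftener than wrong" in Peirce's sense; a value near 1/2 is consistent with excitations below the least perceptible difference.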

Fisher’s arguments for randomization: unbiasedness and valid tests

The first requirement which governs all well-planned experiments is that the experiment should yield not only a comparison of different manures, treatments, varieties, etc., but also a means of testing the significance of such differences as are observed.

… a purely random arrangement of plots ensures that the experimental error calculated shall be an unbiased estimate of the errors actually present. (Fisher 1925, Statistical Methods for Research Workers)

… because the uncontrolled causes which may influence the result are always strictly innumerable.

… relieves the experimenter from the anxiety of considering and estimating the magnitude of the innumerable causes by which his data may be disturbed. (Fisher 1935, The Design of Experiments)

(The valid tests point will need explication in future years.)
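The "valid tests" that randomization licenses are exactly Fisher's randomization (permutation) tests: under the null of no treatment effect, every re-assignment of labels was equally likely by design, so the observed difference can be referred to the distribution of differences over random re-labelings, with no assumption about the "strictly innumerable" uncontrolled causes. A minimal sketch, with made-up plot yields:

```python
import random

def permutation_test(treat, control, n_perm=10000, seed=0):
    """Fisher-style randomization test for a difference in means.
    Returns the fraction of random re-labelings whose absolute mean
    difference is at least as large as the observed one."""
    rng = random.Random(seed)
    pooled = treat + control
    n_t = len(treat)
    observed = sum(treat) / n_t - sum(control) / len(control)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = (sum(pooled[:n_t]) / n_t
                - sum(pooled[n_t:]) / (len(pooled) - n_t))
        if abs(diff) >= abs(observed):
            count += 1
    return count / n_perm

# Hypothetical yields for four treated and four control plots.
p = permutation_test([5.1, 5.3, 4.9, 5.4], [4.2, 4.4, 4.1, 4.3])
```

With four plots per group there are only C(8,4) = 70 possible assignments, and only the observed split and its mirror produce a gap this large, so the p-value is about 2/70 ≈ 0.03 regardless of how the soil errors are distributed.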

Fisher’s legacy:

The analysis of variance… showed experimenters the importance of placing the comparisons allotted to treatments on the same basis as the comparisons used for the estimate of error. The process of random assignment or arrangement was seen to be an essential and easy method of giving the same opportunity to treatment and error methods… The analysis of variance for the first time provided a sound statistical technique to accompany experimental arrangements such as the replicated block and the Latin square… (Youden 1951, “The Fisherian Revolution in Methods of Experimentation”)

(There’s also a neat Fisher anecdote in the paper: writing about Student’s t-distribution, Fisher says “the form establishes itself instantly”, then follows this with 12 pages of integrals.)

It’s interesting that causality doesn’t come up until the comments, when Stephen Conn discusses Galileo. Now, you can get a long way in science without thinking about causality, or only thinking about causality in a very simple way — that’s the basis for the success of statistics over the last century. And indeed, if you only care about prediction without intervention, it’s not necessary to tease out cause. But if you care about what happens after an intervention, you need to know something about the causal structure, and at this point in history, finding causal structure is not something machines are good at doing. (To be fair, humans aren’t always good at this either.) The philosophical difficulty is that almost everything is an intervention — changing Google’s search algorithm is an intervention — though often, the intervention changes the system by a negligible amount. Intervention is a continuum, and given the current state of knowledge in statistics and machine learning, this is something that should keep statisticians awake at night.

More specifically, the accusation focuses on a statement made at a press conference on 31 March 2009 by Bernardo De Bernardinis, who was then deputy technical head of Italy’s Civil Protection Agency and is now president of the Institute for Environmental Protection and Research in Rome. “The scientific community tells me there is no danger,” he said, “because there is an ongoing discharge of energy. The situation looks favourable”.

That statement does not appear in the minutes of the meeting that preceded the press conference, and it was later criticized as scientifically unfounded by seismologists – including Enzo Boschi, president of the National Institute of Geophysics and Vulcanology, who is also one of the accused. Much of the trial is likely to revolve around the origin and impact of De Bernardinis’s statement.

The defendants’ lawyers have tried to differentiate between their clients’ respective positions, in some cases implicitly blaming each other’s clients. Boschi and the other scientists stated that informing the public was the sole responsibility of civil-protection officials, and that they cannot be charged over what was said in their absence. De Bernardinis’s advocate, on the other hand, said that his client merely summarized what the scientists had told him, and that the evaluation of the situation as “favourable” had come from them. According to the prosecutor, the fact that none of the other committee members felt the need to immediately correct De Bernardinis’s statement makes them all equally culpable.

The “no danger” statement was careless, though not criminally negligent. Unless there was some mitigating context not presented here, I would have no problem with De Bernardinis losing his job over it, though manslaughter is ridiculous. Unless the seismologists said similar things — and it’s hard to believe they would — there’s no reasonable basis for any action against them.

The basic results of a Poisson regression are in this table. I would have preferred the model to be specified in terms of categories rather than interactions. That would give the impacts on violence (multiplying the coefficients by 100, so they can be approximately interpreted as percentage effects; positive means more violence) as follows:

(“Expected close” means a point spread of less than four points.)

Clearly the upset loss stands far apart from the other numbers. They give standard errors for the model specified with interactions; I'd guess the SEs for the graphed specification are about ±4.
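The "multiply by 100" reading works because a Poisson regression uses a log link, so a coefficient β on a 0/1 indicator multiplies the expected count by exp(β), and exp(β) − 1 ≈ β when β is small. A quick check, using a hypothetical coefficient of 0.08 (illustrative, not a number from the paper):

```python
from math import exp

def pct_effect(beta):
    """Exact percentage change in the expected count implied by a
    coefficient beta on a 0/1 indicator in a log-link Poisson model."""
    return 100 * (exp(beta) - 1)

approx = 100 * 0.08      # quick reading: 8.0 percent more violence
exact = pct_effect(0.08) # exact multiplicative effect: about 8.3 percent
```

For coefficients the size of those in the table the approximation is close; for a large coefficient (like the upset-loss one, if it's well away from zero) the exact exp(β) − 1 version is the safer interpretation.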

A critical implication of Blanchard’s haiku metaphor is that the DSGE approach had failed to generate a truly progressive [in the Lakatos sense] scientific research program. A new project in the DSGE framework will typically, as Blanchard indicates, begin with the standard general equilibrium model, disregarding the modifications made to that model in previous work examining other ways in which the real economy deviated from the modeled ideal.

By contrast, a scientifically progressive program would require a cumulative approach, in which empirically valid adjustments to the optimal general equilibrium framework were incorporated into the standard model taken as the starting point for research. Such an approach would imply the development of a model that moved steadily further and further away from the standard general equilibrium framework, and therefore became less and less amenable to the standard techniques of analysis associated with that model.

John Quiggin, Zombie Economics, pp. 105–106

Edit: Bonus quote:

The prevailing emphasis on mathematical and logical rigor has given economics an internal consistency that is missing in other social sciences. But there is little value in being consistently wrong. (p. 211)