A research group of practically-minded military engineers are trying to work out how to effectively destroy enemy fortifications with a cannon. They are going to be operating in the field in varied circumstances so they want an approach that has as much general validity as possible. They understand the basic premise of pointing and firing the cannon in the direction of the fortifications. But they find that the cannon ball often fails to hit their targets. They have some idea that varying the vertical angle of the cannon seems to make a difference. So they decide to test fire the cannon in many different cases.

As rigorous empiricists, the research group runs many trial shots with the cannon raised, and also many control shots with the cannon in its ‘treatment as usual’ lower position. They find that raising the cannon often matters. In several of these trials, they find that raising the cannon produces a statistically significant increase in the number of balls that destroy the fortifications. Occasionally, they find the opposite: the control balls perform better than the treatment balls. Sometimes they find that both groups work, or don’t work, about the same. The results are inconsistent, but on average they find that raised cannons hit fortifications a little more often.

A physicist approaches the research group and explains that rather than just varying the angle at which the cannon is pointed in various contexts, she can estimate much more precisely where the cannon should be aimed using the principle of compound motion, with some adjustment for wind and air resistance. All the research group need to do is specify the distance to the target and she can produce a trajectory that will hit it. The problem with the physicist’s explanation is that it includes reference to abstract concepts like parabolas, and trigonometric functions like sine and cosine. The research group want to know what works. Her theory does not say whether you should raise or lower the angle of the cannon as a matter of policy; the actual decision depends on the context. They want an answer about what to do, and they would prefer not to get caught up testing physics theories about ultimately unobservable entities while discovering the answer.
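To make the contrast concrete, here is a minimal sketch of the kind of calculation the physicist is offering: the standard vacuum-ballistics formula, which sets aside the wind and air-resistance adjustments she mentions. The function name and the example numbers (muzzle velocity, distance) are purely illustrative, not taken from the story.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def launch_angle(distance, muzzle_velocity):
    """Return the low launch angle (in radians) that lands a projectile
    at `distance` metres, assuming level ground and no air resistance.

    Derived from the range equation R = v^2 * sin(2*theta) / g,
    solved for theta.
    """
    ratio = G * distance / muzzle_velocity**2
    if ratio > 1:
        raise ValueError("target is out of range at this muzzle velocity")
    return 0.5 * math.asin(ratio)


# A cannonball fired at 300 m/s towards a fortification 2 km away:
angle = launch_angle(2000, 300)
print(f"aim the barrel at {math.degrees(angle):.1f} degrees")
```

The point of the sketch is exactly the one the research group resists: given the distance, the theory yields a specific angle for that context, rather than a blanket policy of "raise the cannon" or "lower the cannon".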

Eventually the research group write up their findings, concluding that firing the cannon at a higher angle can be an effective ‘intervention’, but that whether it is depends a great deal on particular contexts. So they suggest that artillery officers will have to bear that in mind when trying to knock down fortifications in the field, but that they should definitely consider raising the cannon if they aren’t hitting the target. In the appendix, they mention the controversial theory of compound motion as a possible explanation for the wide variation in the treatment effect, one that should, perhaps, be explored in future studies.

This is an uncharitable caricature of contemporary evidence-based policy (for a more aggressive one see ‘Parachute use to prevent death and major trauma related to gravitational challenge: systematic review of randomised controlled trials’). Ballistics has well-understood, repeatedly confirmed theories that command consensus among scientists and engineers. The military have no problem learning and applying this theory. Social policy, by contrast, has no theories that come close to that level of consistency. Given the lack of theoretical consensus, it might seem more reasonable to test out practical interventions instead and try to generalize from empirical discoveries. The point of this example is that without theory, empirical researchers struggle to make serious progress even with comparatively simple problems. The fact that theorizing is difficult or controversial in a particular domain does not make it any less essential a part of the research enterprise.

***

Also relevant: Dylan Wiliam’s quip from this video (around 9:25): ‘a statistician knows that someone with one foot in a bucket of freezing water and the other foot in a bucket of boiling water is not, on average, comfortable.’

Last week Paul Romer crashed out of his position as Chief Economist at the World Bank. He had already been isolated from the rest of the World Bank’s researchers for criticizing the reliability of their data. It seems there were several bones of contention, including his suggestion that changes to the methodology behind some of the Bank’s development indicators may have unfairly penalized Chile under its current social democratic government. Romer’s allergic reaction to the World Bank’s internal research processes has wider implications for how we think about policy research in international NGOs.