When Numbers Lie – Cautioning Quantitative Enthusiasm

There is an often-repeated quote that is thrown around when speaking of “the past” and knowledge thereof, in the kind of hushed, reverential tones usually reserved for gods and kings: that those who do not learn from it are doomed to repeat it and, principally, to repeat its mistakes. As such, there are many who claim to have learned the lessons of the past and to be able to suggest future courses of action based on them, or to prophesy the end times thanks to lessons learned from the Black Death or a basic understanding of the Second World War. However, as almost any historian will attest, very few historical moments are directly applicable to one another – humans are just too variable a factor to control for. More problematic, though, is that the very material on which the historical record is based is often far from infallible. The inherent biases of sermon preachers, victor historians, and letter writers are well known, and the need to read between the lines of sources is a well-rehearsed subject. What is often less well understood is that numerical “facts” often lie in a similar fashion, or at the very least skew the truth through oversimplification.

Historians have a tendency to recount numbers and statistics as if they were beyond doubt. Nor is this a recent phenomenon born of the twentieth-century rise of cliometrics and the rush to add statistical relevance to even the most qualitative of accounts. Probably the most famous example of historical number mistruths comes from the so-called “Father of History” himself, Herodotus – sometimes also dubbed the “Father of Lies”. Herodotus, in his fifth-century BC work on Xerxes’s attack on the Greek cities, confidently calculated the size of the Persian invasion force at 2,641,610 which, when combined with the number of support staff, brought the Persian force to a grand total of 5,283,220 (Herodotus, Histories, CLXXXIV-CLXXXVI). Not only are his calculations entirely spurious, they exceed the estimated population of Greece in Herodotus’s own day. Even 2,500 years later, the modern Greek population is only about double that figure. However, Herodotus works out his totals with such logic and sincerity that it is almost believable that he a) had access to accurate sources or b) stood counting them on an abacus as they passed over the Hellespont. Such sincerity and confidence are highly problematic. Herodotus remains one of the best sources on the war, and in prior generations such numbers were treated as reliable, or at least as a reasonable estimate.

While today we confidently dismiss Herodotus’s overreaching numbers, modern historians are not above overselling their statistics. Broad assertions often conceal significant data gaps, and the wide time ranges historians claim to have data for often belie the testable material beneath. Roger Lee Brown, in his book on the eighteenth-century Fleet prison, boldly claims to provide data for annual commitments across the 180-year period 1653-1832. He does provide an annual commitment figure for 1653 and for 1832, but for only a further thirty-four years in between, sixteen of which fall after 1800. While this does theoretically allow him to show a trend over time, it is not quite what his label describes. It is an inescapable fact that any historian working on the early modern period from a quantitative perspective will encounter the problem of missing years. However, the way we describe our results can be tantamount to deception if we are not careful.
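To make the size of these gaps concrete, the figures quoted above can be tallied in a few lines of Python. This is only a back-of-the-envelope sketch built from the numbers as I have reported them. It assumes that “post-1800” means 1801 onwards and that the sixteen later years do not include 1832 itself; both are my reading rather than anything stated in the source, and the variable names are purely illustrative.

```python
# Figures quoted in the text above
first_year, last_year = 1653, 1832
years_in_between_with_data = 34   # years with data, besides 1653 and 1832
post_1800_in_between = 16         # of those 34, the ones after 1800

span = last_year - first_year + 1                  # length of the claimed period
years_with_data = 2 + years_in_between_with_data   # endpoints plus the rest
coverage = years_with_data / span

# ASSUMPTION: "post-1800" means 1801 onwards, and 1832 is counted separately
late_span = last_year - 1801 + 1
late_with_data = post_1800_in_between + 1
early_span = span - late_span
early_with_data = years_with_data - late_with_data

print(f"Overall: {years_with_data} of {span} years ({coverage:.0%})")
print(f"1653-1800: {early_with_data} of {early_span} years "
      f"({early_with_data / early_span:.0%})")
print(f"1801-1832: {late_with_data} of {late_span} years "
      f"({late_with_data / late_span:.0%})")

# Herodotus's grand total is exactly twice his fighting force
herodotus_fighting = 2_641_610
herodotus_total = 5_283_220
print(herodotus_total == 2 * herodotus_fighting)
```

On this reading, only about a fifth of the claimed period has any data behind it, and the years before 1801 are covered far more thinly than those after.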

With quantitative history, it is often a question of buyer beware. This is particularly the case when data is illustrated. Take, for example, a graph from my own work on debt prisoner commitments in the mid- to late eighteenth century (Graph 1). It employs a simple five-year moving average to indicate the general trend over time of commitments per annum, suggesting a relative consistency over the period, though with a slight but significant increase in the 1770s. It is categorically not a depiction of the actual totals committed, and while that would be made clear wherever I employed the graph, the averaging gives a sense of a much calmer and more consistent pattern of commitment than the real results show (see Graph 2). The trend remains the same: while the numbers are more erratic, they fall into the general pattern reflected in Graph 1. But it is still true that, to some extent, the numbers have lied about the reality of year-to-year fluctuation. In the same way, while it is true that the average commitment per annum was 145, it is immediately clear from Graph 2 that averages belie complexity and hide fluctuation.
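The flattening effect of a moving average is easy to reproduce. The sketch below uses ten invented annual totals, not my actual commitment data, chosen only so that their mean matches the 145 quoted above. It also assumes a trailing window of five consecutive years; the function name and the window alignment are illustrative choices, not a description of how the graphs were produced.

```python
from statistics import mean

def moving_average(values, window=5):
    """Return the simple moving average over each run of `window` consecutive values."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [mean(values[i:i + window]) for i in range(len(values) - window + 1)]

# HYPOTHETICAL annual commitments, invented for illustration only
commitments = [120, 180, 95, 160, 170, 110, 200, 130, 150, 135]

smoothed = moving_average(commitments)

# Spread between the highest and lowest value in each series
raw_spread = max(commitments) - min(commitments)
smoothed_spread = max(smoothed) - min(smoothed)

print("Mean per annum:", mean(commitments))
print("Smoothed series:", smoothed)
print("Raw spread:", raw_spread, "| smoothed spread:", smoothed_spread)
```

With these made-up figures the yearly totals range across 105 commitments, while the smoothed series varies by only 11. The trend survives, but almost all of the year-to-year turbulence disappears from view.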

None of this means that numbers in the historical record are inherently unreliable. Statistical returns can often provide insights that no qualitative source can. While references to tea sets in an early eighteenth-century diary might tell you how they were used and viewed, statistics on the spread of tea sets in registers of possessions can give you a real sense of how representative such a source is. Historians should always be encouraged to use far more statistical evidence in their work. However, as a discipline, there is perhaps too much respect for the validity of these numbers. They do not always mean what we think they say, they have a deeper complexity, and in some cases they are dangerously oversold to cover up a lack of actual source material. As is often said, and it is certainly true of historical writing, there are only three types of lies: lies, damned lies, and statistics.