A War of Words? Text Mining Political Speeches in Britain in the 19th and 20th Centuries

Historians are increasingly surrounded by an ever-growing forest of machine-readable textual sources. The old challenge of scarcity has been replaced by that of abundance. Despite this, text mining has had remarkably little impact on History. Historians remain extremely interested in language, but they overwhelmingly prize sharply focussed analyses based on close readings. Macroscopic computational approaches based on large (or even small) corpora remain at the fringes, even though the march of technology has considerably lowered the traditional barriers of cost and manpower.

This paper explores the transformative potential of text mining in two areas of political language where large corpora have become available. The first example is based on election platform speeches in the late Victorian and Edwardian era. In this age of emerging democracy, even local constituency candidates would routinely hold over a hundred public meetings in an election campaign, speaking at length to large audiences, and their speeches were often reported very thoroughly by a diligent and wordy press. I argue that even very simple text mining techniques applied to a relatively small corpus (4 million words) can challenge the historical consensus on the contents of general election campaigns, on the significance of issues such as imperialism and Irish Home Rule, and on the respective visibility of party leaders such as Gladstone and Disraeli.
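As an illustration of what a "very simple" technique might look like here, the following is a minimal sketch that counts how often chosen keywords occur per million words in each election year. It is not the method used in this paper: the function name `per_million`, the tokenisation rule, and the toy sentences are all assumptions made for the example, and the sketch handles single-word terms only (a phrase such as "Home Rule" would need phrase matching).

```python
import re
from collections import Counter


def per_million(texts_by_year, terms):
    """Return {year: {term: occurrences per million words}}.

    texts_by_year maps an election year to a list of speech texts.
    Hypothetical helper for illustration; single-word terms only.
    """
    rates = {}
    for year, texts in texts_by_year.items():
        # Crude tokenisation: lower-case runs of letters and apostrophes.
        tokens = [t for text in texts for t in re.findall(r"[a-z']+", text.lower())]
        counts = Counter(tokens)
        total = len(tokens)
        rates[year] = {
            term: (counts[term] / total * 1_000_000 if total else 0.0)
            for term in terms
        }
    return rates


# Invented toy sentences, not drawn from any real corpus.
toy = {
    1880: ["The empire and the empire alone", "Ireland must wait"],
    1886: ["Home rule for Ireland", "Ireland and Ireland again"],
}
rates = per_million(toy, ["empire", "ireland"])
for year in sorted(rates):
    print(year, rates[year])
```

Normalising by corpus size matters because the volume of reported speech differs from one campaign to the next, so raw counts alone would not be comparable across years.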

The second example is an analysis of the language of women MPs in Parliament since 1945. Drawing upon the outputs of the Digging into Linked Parliamentary Data ('Dilipad') project, which has added gender and party coding to the digital edition of Hansard, I present a wide-ranging empirical analysis of the role of gender in the 677 million words of Commons debates from 1945 to 2015. I investigate whether there is strong evidence to support the central feminist claim that women's contributions to Commons debates are substantively different to those of men, and ask whether the 'gender effect' has been strengthening or weakening as the number of women in Parliament has increased since the 1997 election. I also examine the effect of party, including the oft-made claim that Labour (with its greater proportion of female MPs and ideological sympathy for feminism) was more focussed on representing women than the Conservatives.
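One simple way to compare the vocabulary of two groups of speakers is to take the ratio of a term's relative frequency in each group. The sketch below is illustrative only and is not the analysis reported here: the record fields (`gender`, `party`, `text`), the helper names, and the smoothing constant are assumptions, and they do not reflect the actual Dilipad data format.

```python
import re
from collections import Counter


def tokenize(text):
    """Crude tokenisation: lower-case runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def group_counts(speeches, key):
    """Return {group value: Counter of tokens} for a metadata field such as 'gender'."""
    out = {}
    for speech in speeches:
        out.setdefault(speech[key], Counter()).update(tokenize(speech["text"]))
    return out


def rate_ratio(counts_a, counts_b, term, smoothing=0.5):
    """Ratio of the term's relative frequency in group A to that in group B.

    The smoothing constant (an assumption) avoids division by zero
    when a term is absent from one group.
    """
    rate_a = (counts_a[term] + smoothing) / (sum(counts_a.values()) + smoothing)
    rate_b = (counts_b[term] + smoothing) / (sum(counts_b.values()) + smoothing)
    return rate_a / rate_b


# Invented toy records with hypothetical field names.
speeches = [
    {"gender": "F", "party": "Lab", "text": "Women and children need support"},
    {"gender": "F", "party": "Con", "text": "The budget helps women"},
    {"gender": "M", "party": "Lab", "text": "The budget cuts taxes"},
    {"gender": "M", "party": "Con", "text": "Defence and the budget"},
]
by_gender = group_counts(speeches, "gender")
women_ratio = rate_ratio(by_gender["F"], by_gender["M"], "women")
budget_ratio = rate_ratio(by_gender["F"], by_gender["M"], "budget")
print(round(women_ratio, 2), round(budget_ratio, 2))
```

A ratio above 1 means the term is relatively more frequent in the first group. Passing `"party"` instead of `"gender"` to `group_counts` would give the corresponding comparison by party, and restricting the input to particular years would allow the ratio to be tracked over time.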

Overall, I argue that these techniques, whether used conservatively in a supplementary capacity alongside traditional approaches or more boldly to lead analysis, have considerable potential to reshape historians' work in the digital age. They allow us to analyse corpora too large to read, help overcome the flaws in our intuitive estimates of frequency, and permit greater verifiability, more precise communication of quantity, and a more empirical way of working.