Comparing OED2 and OED3
On 13 March 2008, OED3 published its first batch of cross-alphabet revisions (as described on the OED Online website here). Up to that point, the revision had worked its way steadily through the alphabet, beginning at M and (by the previous quarter, i.e. 13 December 2007) reaching part-way through R. This meant that one could relatively easily compare OED2's and OED3's treatment of selected date ranges, or large sets of individual quotation sources, since one could isolate the revised stretch of entries in each case and look at them side by side.

Since cross-alphabet revision began, however, it has been virtually impossible to do this, given the difficulty of picking out the revised portions of OED3, now scattered across the alphabet. The problem here is that - most unfortunately, in EOED's view - OED3 is electronically merged with OED2. Consequently, one cannot search OED3 independently of OED2, i.e. as a separate entity, in order to analyse and compare the two sets of data systematically on any large scale.

Fortunately, however, EOED searched for the respective quotation totals of some eighteenth-century male and female writers in the first few days of March 2008, confining its comparison to the alphabet range revised by OED3 at that stage, i.e. M to quit shilling. The results are given below.[1]

In all cases, the number of quotations from these female writers had risen. The increases ranged from significant ones (in terms of absolute numbers, if not percentages) for Burney, Edgeworth, Radcliffe, and Wollstonecraft, down to a few tens of quotations or fewer for the other writers.

At first sight, this looks to be a cheering development in the OED's treatment of female-authored sources, although the differential rate of quotation is perplexing. One assumes that Burney, Edgeworth, Radcliffe, and Wollstonecraft were identified as sources of special importance - but why go to the trouble of reading Catharine Macaulay, Penelope Aubin, and Anne Bannerman and not quote from them more intensively, given that female-authored quotations are so few anyway? This seems an inefficient use of lexical research. And is the difference in treatment due to the linguistic characteristics of these texts or the perceived cultural (or literary) importance of the authors?

However, comparing OED3's treatment of a handful of male-authored sources of the eighteenth century puts the data in a different perspective, especially given that all these male writers are already heavily quoted in OED:

The discrepancy between the numbers of quotations from male and from female authors is very nearly as striking in OED3 as in OED2. This is mainly because OED3 is carrying over the first edition's vast quantities of male-authored quotations into the new edition, so that - given that none of the female sources are being as intensively mined for the third edition as male sources were for the first edition - the existing male-to-female proportions are being preserved. Additionally, however, it looks as if the OED lexicographers were, at that stage (i.e. March 2008), continuing to give some male-authored sources quite significantly preferential treatment over female-authored ones: for example, both Fielding and Defoe, already handsomely cited in the first edition of the Dictionary, had been given far more attention by the revisers than any female authors of the period.

Comparing the two tables on this page, and contemplating the differences in quotation rate between different writers, it is hard to feel that linguistic considerations alone are at work here. Cultural and social values seem to be asserting themselves as well. The chief editor of OED3, John Simpson, has explained in his Preface to OED Online (quoted on EOED here) that the revisers' intention is to quote from a wider range of sources during the course of their revision. The implication is that OED3 will correct the first edition's biases in favour of male-authored over female-authored sources and of literary over non-literary sources, and its bias against the eighteenth century.

Where both gender and literary bias are concerned, however, it is difficult to see how any such correction can be achieved unless the lexicographers prune, quite significantly, OED's enormous banks of quotations from canonical male authors - and try to find new quotations from female rather than from male authors, especially the male authors already much quoted in the Dictionary.

But throwing away good lexical evidence goes against the grain for any historical linguist. And it seems particularly perverse to do so now, given that online publication would appear to remove many of the practical and financial constraints which forced the first lexicographers to restrict their account of the history of the language in the first place (Murray complained to the Philological Society in 1890 that the ruthless culling of quotations was 'a sorrowful necessity', required so as to keep the Dictionary's size in check; nevertheless, 'as the quotations are the essence of the work, it is like shearing Samson's locks'; K. M. E. Murray 1977: 274). Just as importantly, many of OED's users are literary scholars who would be appalled if the OED reneged on its predecessor's function of 'literary instrument', i.e. acting as a tool to explain and contextualize the vocabulary of major and minor literary writers (see EOED page on Writers and dictionaries).

Is the solution for the lexicographers to keep these quotations from Pope, Cowper and the like, as they appear to be doing, but greatly increase their quotations from other types of source - from female-authored literary works and from non-literary works, whether by males or females, of a wide range of genres?

In some very small way, it appears that OED tried at an earlier stage in its history to correct the under-quotation of female sources. During the course of compiling his four-volume Supplement of twentieth-century updatings to OED, published over 1972-86, R. W. Burchfield slipped in a few hundred quotations from the novels, letters or journals of Dorothy Wordsworth, Jane Austen, and Maria Edgeworth, despite the fact that their eighteenth- and nineteenth-century origins would appear to have made them ineligible for inclusion at this stage (see here on Austen). Was he trying to redress an imbalance in the parent dictionary? If OED3 were to extend Burchfield's policy (if this was what it was) and examine such female-authored sources - which are abundant - more widely and more exhaustively, it could at the same time move towards compensating entirely, rather than only partially, for the shortfall in eighteenth-century sources quoted in the first edition of OED and still perceptible in the third.

This policy could most productively be extended to other periods in the Dictionary, for example the nineteenth century, where OED1/2 citations from Dickens (c. 8,200), Tennyson (c. 6,700), Carlyle (c. 6,250), Macaulay (c. 5,450) and others dwarf those from female writers, for example George Eliot (by far the most quoted female author, with c. 3,100 citations), Harriet Martineau (c. 1,650), Mary Braddon (c. 1,500), or even Jane Austen (c. 1,050); see EOED pages on Top sources and Top female sources.
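
To give a rough sense of the scale involved, the approximate figures quoted in the preceding paragraph can be totalled and compared. The following is only an illustrative sketch: the counts are the rounded 'c.' figures given above, the two lists are partial (the paragraph says 'and others'), and the names used in the code (male_counts, female_counts, summarise) are invented here for the purpose of the example.

```python
# Approximate OED1/2 citation counts as quoted in the paragraph above.
# These are rounded 'c.' figures and cover only the writers named there.
male_counts = {
    "Dickens": 8200,
    "Tennyson": 6700,
    "Carlyle": 6250,
    "Macaulay": 5450,
}
female_counts = {
    "George Eliot": 3100,
    "Harriet Martineau": 1650,
    "Mary Braddon": 1500,
    "Jane Austen": 1050,
}


def summarise(male, female):
    """Return (male total, female total, male-to-female ratio)."""
    male_total = sum(male.values())
    female_total = sum(female.values())
    return male_total, female_total, male_total / female_total


male_total, female_total, ratio = summarise(male_counts, female_counts)
print(f"Male-authored citations (named writers):   {male_total}")
print(f"Female-authored citations (named writers): {female_total}")
print(f"Male-to-female ratio: {ratio:.2f} to 1")
```

On these figures the four named male writers account for about 26,600 citations against about 7,300 for the four named female writers, a ratio of roughly 3.6 to 1. The comparison covers only these eight writers and says nothing about the Dictionary's overall proportions.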

The question of what the correct balance of quotation might be between male and female sources (as between different centuries) is a formidably knotted one and requires protracted research and analysis. Should it reflect the proportion of male to female speakers, or writers, or published writers, or some other ratio? It seems unlikely, however, that OED's present balance is just. As the front page of its website tells us, this great dictionary is the 'definitive record of the language'. Since it is necessarily based on written sources for much of the historical period it covers, it would seem appropriate to bring its proportions of male to female quotations up to those of the available source literature as an absolute minimum (and it could also be argued that OED ought to represent female-authored texts as much as possible over the earlier periods, given that the proportion of texts written by women is so out of step with the gender proportions of the literate population as a whole). But whatever decision the OED3 lexicographers arrive at, it is vital - in view of the fact that their dictionary furnishes the first port of call for virtually all historical research on English - that they set out and explain the basis on which they select their quotation sources where gender, or indeed any other category of language, is concerned.

[1] The remainder of this page derives from information and comment published in Brewer 2009b.