Using Early English Books Online

Early English Books Online (EEBO) is a ProQuest/Chadwyck-Healey subscription database of over 125,000 works, mostly English and mostly printed between 1473 and 1700. The works are represented in digital images and through bibliographical descriptions drawn from the English Short-Title Catalogue, the Wing Catalogue, the Thomason Tracts, and the Early English Books Tract Supplement.

This article initially grew out of Ian Gadd's presentation to the Folger Institute’s Early Modern Digital Agendas (2013) institute. The primary authors, Erica Zimmer and Meaghan Brown, argue that understanding the development, current use, and limitations of EEBO allows students and scholars to fully consider the source of their information and the limits of this digital tool. This essay covers the scope of EEBO, its uses and limitations, and pedagogical considerations. For background on how Early English Books Online was developed, see History of Early English Books Online.

The pedagogy section at the end of this article welcomes the addition of further readings, resources, and sample assignments.

Scope

For many students and scholars, Early English Books Online facilitates more extensive engagement with early modern texts in their printed forms than would otherwise be possible. EEBO provides access to black and white digitized images of over 125,000 printed works from libraries across Europe and North America. Additionally, a partnership between ProQuest and the nonprofit Text Creation Partnership (TCP) has led to the creation of standardized XML-encoded electronic editions of many of the early printed books in the EEBO corpus. While these full-text transcriptions are accessible through the ProQuest EEBO portal, accessing them through the EEBO-TCP portal (hosted by the University of Michigan) allows subscribers to conduct more sophisticated Boolean and proximity searching. The first 25,000 books transcribed by the TCP became freely available to the public on 1 January 2015.
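For readers unfamiliar with the terms, the following minimal Python sketch illustrates what Boolean and proximity searching mean when applied to a transcription. It is not the EEBO-TCP portal's query syntax or implementation; the function names and the sample sentence are invented for illustration, and the matching is exact, so it takes no account of early modern spelling variation.

```python
import re

def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def boolean_and(text: str, *words: str) -> bool:
    """Boolean AND: every search word occurs somewhere in the text."""
    tokens = set(tokenize(text))
    return all(word in tokens for word in words)

def near(text: str, first: str, second: str, window: int) -> bool:
    """Proximity: the two words occur within `window` tokens of each other."""
    tokens = tokenize(text)
    first_positions = [i for i, t in enumerate(tokens) if t == first]
    second_positions = [i for i, t in enumerate(tokens) if t == second]
    return any(abs(i - j) <= window
               for i in first_positions for j in second_positions)

# Invented sample sentence, not drawn from any EEBO text.
sample = "The king and all his court rode out to meet the queen"

print(boolean_and(sample, "king", "queen"))  # both words are present
print(near(sample, "king", "court", 4))      # four tokens apart
print(near(sample, "king", "queen", 4))      # ten tokens apart
```

A Boolean search asks only whether the words co-occur in a text; a proximity search also constrains how far apart they may be, which is why it returns fewer but more tightly related hits.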

EEBO’s holdings are based on A. W. Pollard and G. R. Redgrave’s A short-title catalogue of books printed in England, Scotland, & Ireland and of English books printed abroad, 1475–1640 (STC), Donald Wing’s Short-title catalogue of books printed in England, Scotland, Ireland, Wales, and British America, and of English books printed in other countries, 1641–1700 (Wing), the Thomason Tracts Collection (1640–1661) and the Early English Books Tract Supplement. These sources, used to identify works for imaging, have clear national, linguistic, and date limitations. They pointed the creators of EEBO to works published in England, Ireland, Scotland, and Wales (and to a far more limited extent, British America), and printed in English and other British languages elsewhere. The vast majority of works in EEBO were printed between 1473 and 1700. The access provided by both EEBO and the TCP transcriptions should not be mistaken for a complete view of the print culture of early modern Britain.

Missing works, mistaken inclusions

It is helpful to remain alert to what EEBO is not. EEBO does not contain images of every book printed in England during the stated dates of collection. Books may be missing for a number of reasons. The simplest and most common reason is that copies of a publication may have perished long before Eugene Power began his microfilming project. Books were read to death, pamphlets easily lost, broadsides used to wrap the next day’s fish. Fire, water, and the acts of man destroyed libraries. Nor does EEBO contain all extant works from the period. Items may have been found after the microfilming project was completed. Some of the works not in EEBO were missed in the original microfilming because they closely resemble books that did make it in. Mistakes are easily made. Piracies, which actively attempted to imitate the form of legitimate publications, are notably sometimes missing. For example, EEBO contains two editions of the Holsome and catholyke doctryne concerninge the seuen sacramentes printed by Robert Caly in 1558 (STC 25112.5 and 25114), but not the piracy by Thomas Marshe and John Kingston (STC 25113).

The English part of “Early English Books Online” should be taken with a grain of salt, both as a linguistic and geographic designator. While British languages like Welsh and Scots are suggested by liberal mentions of EEBO’s British focus in the “About” section, there are, in fact, works in languages from Algonquin to Turkish as well. It is possible to search EEBO by both language and country of origin in the Advanced Search (link for subscribers only). The number of non-Anglo British works bears emphasizing, given the history of subsuming British cultures under the English umbrella. While the earliest Welsh-language books are dated after political unification with England (e.g. STC 20310 in 1546), the earliest Scots item listed is a fragment of STC 22407, The kalendayr of the shyppars, printed in Paris by Antoine Vérard in 1503. In addition, there are a number of foreign-language and foreign-origin works that slipped into the original STC because of false or missing imprints, including forty-six French-language books now believed to have been printed in France (as such, they shouldn’t be present, but are imaged and included). The date ranges that delineate the EEBO corpus are similarly stretchable: over 3,000 texts in EEBO were printed after 1700, falling outside the advertised collection dates.

Focused on production, EEBO certainly does not contain images of every book read in England during the dates of collection. It does not include, for example, the huge number of Latin works imported into England over the period, or the significant number of Continental vernacular books read by a wide range of society. While its national and linguistic borders are not as rigid as implied, in many ways the framing of EEBO obscures the multilingual and transnational nature of the early modern book trade.

Implications for researchers

The sheer quantity of texts available through EEBO can give a false impression of comprehensiveness. Any statement or tool employing EEBO data for quantitative purposes must be aware of the material and cultural conditions under which the texts were produced and preserved. This is not to say that EEBO data can never contribute to our understanding of trends over time or “big data” studies. The Early Modern Print: Text Mining Early Modern Studies project, for example, looks at new ways of exploring the corpus using an n-gram browser.

An EEBO entry

A basic Early English Books Online entry has two main components: the bibliographical record and the page images.

The bibliographical record

The bibliographical record contains the following fields:

Title

Author: Contains the attributed name, initials, or pseudonym from the item itself.

Other Authors: (an occasional use field) may contain names of attributed authors, authors of allographic paratextual materials, or of collaborators unnamed in the original publication.

Imprint: place of publication, printer, and date.

Date

Bib name/ number: Indicates the bibliography that served as the source for the metadata (such as the STC or Wing catalogues), with the entry number.

Physical description: typically consisting of the number of pages

Notes: includes information on such things as flaws in the imaging, publication details, the collation formula, or information about the author; often duplicates information in other fields.

Copy from: the name of the library or repository holding the original book, for example, Folger Shakespeare Library (There are 4,610 Folger Shakespeare Library items.)

UMI Collection / reel number: the location of the image on the physical microfilm reels

Subject: includes subject keywords

Fields such as imprint and physical description report information present in the book. Square brackets indicate information supplied by editors and scholars, including expanded abbreviations. For example, the imaged copy for STC 22356, Shakespeare’s Venus and Adonis, is missing the entire first gathering. The title and imprint information, [London : R. Field? for J. Harrison I, 1595?], is supplied in square brackets. When square brackets appear in the physical description field, they indicate that the page number is not present on the page. Many early modern texts are not paginated, or the page numbers are simply wrong. A physical description reading [26], 69, [3] p., like the one for John Dryden’s version of Troilus and Cressida (Wing D2390), means that there are 26 unnumbered pages followed by 69 numbered pages, and then a further three unnumbered pages.
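The bracket convention in the physical description field can be read mechanically. The following Python sketch, with hypothetical function names, handles only the simple form shown in the Dryden example (comma-separated page counts ending in "p."); real records may include leaves, plates, or illustration statements that it does not attempt to interpret.

```python
import re

def parse_extent(description: str) -> list[tuple[int, bool]]:
    """Split a statement such as '[26], 69, [3] p.' into (count, numbered) pairs.

    `numbered` is False for bracketed counts, meaning that the page numbers
    are not printed on the pages and were supplied by a cataloguer.
    """
    # Drop the trailing unit ("p." for pages) before splitting on commas.
    body = re.sub(r"\s*p\.?\s*$", "", description.strip())
    sequences = []
    for part in body.split(","):
        part = part.strip()
        match = re.fullmatch(r"\[(\d+)\]|(\d+)", part)
        if match is None:
            raise ValueError(f"unrecognized sequence: {part!r}")
        if match.group(1) is not None:
            sequences.append((int(match.group(1)), False))  # unnumbered pages
        else:
            sequences.append((int(match.group(2)), True))   # numbered pages
    return sequences

def total_pages(description: str) -> int:
    """Add up every sequence, numbered or not."""
    return sum(count for count, _ in parse_extent(description))

dryden = "[26], 69, [3] p."  # the Wing D2390 example from the text
print(parse_extent(dryden))  # [(26, False), (69, True), (3, False)]
print(total_pages(dryden))   # 98
```

The total here counts pages as catalogued, which may differ from the number of images in EEBO, since books were usually photographed by opening and doubtful shots were often retaken.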

Some of these fields may appear redundant: the date of publication appears both as part of the imprint and in a separate date field; the notes field will often (but not always) mention the location of the original copy, also recorded in copy from. It is important to recognize that these duplicated fields are functional. The date field allows users to sort entries by “earliest publication first” or “latest publication first” and search by date-ranges (a basic search function). When citing the book, the date in the imprint field should be used, as interpolation and doubt markers such as square brackets and question marks are not found in the date field. The copy from field duplicates information often found in the notes to allow users to limit the search to specific institutions under Advanced Search.
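The difference between the two date fields can be made concrete with a short Python sketch. It is an illustration only, not a description of how EEBO stores or derives its data: the `ImprintDate` class and `read_imprint_date` function are hypothetical, and the second imprint below is invented. The sketch extracts a plain year that could be used for sorting while recording the markers that only the imprint preserves.

```python
import re
from dataclasses import dataclass

@dataclass
class ImprintDate:
    year: int       # plain year, usable for sorting and date-range searches
    supplied: bool  # the year sits inside square brackets (editorially supplied)
    doubtful: bool  # the year is followed by a question mark

def read_imprint_date(imprint: str) -> ImprintDate:
    """Find the first four-digit year in an imprint and note its markers."""
    match = re.search(r"(?<!\d)(\d{4})(?!\d)(\?)?", imprint)
    if match is None:
        raise ValueError(f"no four-digit year found in {imprint!r}")
    before = imprint[: match.start()]
    # An unclosed "[" before the year means the year is inside brackets.
    supplied = before.count("[") > before.count("]")
    return ImprintDate(int(match.group(1)), supplied, match.group(2) is not None)

venus = "[London : R. Field? for J. Harrison I, 1595?]"  # from the text above
invented = "London : printed for A. B., 1650"             # hypothetical imprint

for imprint in sorted([invented, venus], key=lambda s: read_imprint_date(s).year):
    print(read_imprint_date(imprint), "| cite as:", imprint)
```

Sorting uses only the bare year, so the Venus and Adonis record sorts as 1595 even though the imprint marks that date as both supplied and uncertain. A citation should therefore reproduce the imprint string rather than the sortable year.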

Page images: digital facsimiles of early modern books

The digital images that make up the bulk of EEBO can be viewed online at different magnifications or downloaded as TIFF or PDF files. It is also possible to download a series of images as a single PDF file. Although EEBO uses the term “page images” to describe the visual files that make up the majority of its content, these images rarely show a single page. Books are instead typically photographed by opening, that is, showing two pages at once. Typically one copy was photographed to represent a given edition, although there are instances of two or even three sets of page images of different copies of the same edition. Likewise, while each page was typically photographed only once, duplicated pages are quite common as the microfilm photographers regularly retook doubtful shots.

EEBO’s digital images are the result of a series of remediations (changes from one medium to another) that come between the reader and the original object. Books were photographed onto microfilm and the microfilms subsequently scanned as black and white or grayscale images. At each stage, this process has increased access and subtly changed the look of the work. Manuscript annotation that is clear on the original page is often illegible, or even invisible, in the final digital image. Some physical characteristics, easy to grasp in real life, are difficult to convey digitally. Size is particularly difficult to determine: the smallest miniature book and the largest folio are both fitted to your computer screen. As Ian Gadd observes, present parameters for digitizing the microfilms do not allow for discernment by size and color, though grayscale images do provide shades of distinction.[1] Bonnie Mak calls attention to the absence of smell or texture and urges the importance of registering non-visual properties within one’s sense of the “real.”[2] Like the game “Telephone,” this process can distort the information being transmitted.

Implications of EEBO’s structure

Organizing EEBO’s facsimiles for ready access and navigation required the combination of the image file with bibliographical information, or metadata, about each edition into a searchable database. The creation of any database and its interface involves editorial intervention, and EEBO users should be aware that choices are being made about what items to include, how they are arranged, and even what level of access a user has to each item. As Bonnie Mak has pointed out, much of this intervention can be easily overlooked, creating “the illusion that the digitizations have not only been protected from editorial intervention, but may even function outside traditional infrastructures of production . . . making it increasingly difficult to raise questions about whether certain entries should be in the list; whether others should have been left out; or to what extent and in what respect a particular image or transcription is an accurate representation of its exemplar.”[3] Major questions include whether the images accurately represent the original physical object and whether that one physical object is a good example of that edition. As Gadd explains, there is a categorical disjunction between EEBO’s bibliographic records and the copy-focused image sets to which they are linked.[4]

While the microfilmed page images continue to be converted to digital form through increasingly sophisticated scans, the metadata stems from the electronic records of the English Short Title Catalogue, or ESTC. As Gadd relates:

A series of agreements made between ESTC and University Microfilms/ProQuest between 1989 and 1997 allowed EEBO to draw directly on ESTC’s existing bibliographical data. Consequently, every search run on EEBO (with some exceptions) relies, in a fundamental sense, on bibliographical information originally supplied by ESTC – but not in the form that one might expect. First, EEBO heavily edited ESTC’s data for its own purposes: certain categories of data were removed (e.g. collations, Stationers’ Register entrances), some information was amended (e.g. subject headings), and some was added (e.g. microfilm specific details).[5]

This ESTC-EEBO relationship does not persist, however. Metadata duplicated during the sharing is now updated separately, resulting in increasing, if gradual, divergence between the resources, particularly as “no formal mechanism for synchronizing the data” between them currently exists.[6] Most directly, this lack of communication means changes made in the one will not necessarily be reflected in the other. As errors will inevitably have been made when humans compile data sets of this size, the work required to maintain accurate bibliographic information is thereby doubled.
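The kind of divergence described here can be pictured with a small Python sketch that compares two independently maintained sets of records. Everything in it is an assumption made for illustration: the record identifiers, field names, and values are invented, and neither ESTC nor EEBO is assumed to provide its data in this form.

```python
Record = dict[str, str]

def diverging_fields(estc: dict[str, Record],
                     eebo: dict[str, Record]) -> dict[str, list[str]]:
    """Report, for records present in both sets, which shared fields differ."""
    report: dict[str, list[str]] = {}
    for key in sorted(estc.keys() & eebo.keys()):
        shared = estc[key].keys() & eebo[key].keys()
        changed = sorted(f for f in shared if estc[key][f] != eebo[key][f])
        if changed:
            report[key] = changed
    return report

# Invented records: both sets start from the same data, then one is corrected.
estc_records = {
    "rec-001": {"title": "A sermon", "date": "1642", "author": "Anon."},
    "rec-002": {"title": "A proclamation", "date": "1660"},
}
eebo_records = {
    "rec-001": {"title": "A sermon", "date": "1643", "author": "Anon."},
    "rec-002": {"title": "A proclamation", "date": "1660"},
}

print(diverging_fields(estc_records, eebo_records))  # {'rec-001': ['date']}
```

Without a formal synchronization mechanism, a comparison like this would have to be rerun, and each discrepancy resolved by hand, every time either resource is corrected.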

The “Edition of One” problem

Researchers should also be concerned about EEBO’s ontological mismatch between its ESTC-based metadata and scanned microfilm images. The ESTC’s records developed from the monumental short-title catalogue reference works of the nineteenth and twentieth centuries. Its entries reference the bibliographically reconstructed ideal copy, based on the analysis of many witnesses of an edition. The image sets of EEBO, however, rely upon what is frequently described as the “Edition of One” philosophy, as envisioned by Eugene Power in the 1930s and gradually realized over decades of practice. Due to the nature of the hand-press printing process and practices including in-press correction and emendation, no two copies of an early modern text will be exactly alike, even within the same edition. Although some EEBO titles are represented by multiple copies, the more common practice is to include “only a single witness” of the edition represented in an ESTC-based bibliographic entry. As Gadd argues, when the ESTC-influenced, edition-based records are repeatedly paired with a single witness of that edition, the database “impl[ies]—albeit not deliberately—that the record and the copy are one and the same thing.”[7] Such elisions could lead to inaccurate claims, as well as to a diminished sense of the rich history of English print culture.

Awareness of the ways in which the digital surrogates do and do not match the material realities of printed works can shape our understanding and use of EEBO. The “ideal text” described by the bibliographical information may have significant differences from the photographed copy. Considering how these images were selected, photographed, collated, indexed, and organized can help EEBO users develop a more accurate understanding of the physical objects from which the available representations have descended. Remaining alert to the digital archive’s own traces of material history helps clarify the forms of information such digitizations do and do not provide, and the critical user of EEBO will work to maintain this sense of perspective throughout his or her study and research.

Teaching the critical analysis of EEBO can be pedagogically helpful in a variety of undergraduate and graduate classrooms. As Stefania Crowther, Ethan Jordan, Jacqueline Wernimont, and Hillary Nunn have noted, approaches to using EEBO in the classroom vary widely, given the range of informing philosophies and practical perspectives one may bring to the resource.[8] EEBO is used in classes for a wide variety of fields, including media history, literature, history, rhetoric, philosophy, book history, history of science, and more. It often serves as a primary sourcebook for the early modern period, particularly for under-edited and non-canonical texts, and as a tool for teaching critical analysis of the material forms of cultural objects. By teaching students to question EEBO’s affordances and limitations, instructors highlight the influence of media history on many other fields. Discussing how an early modern book becomes a digital image, and what gets lost along the way, can lead to discussions of critical perspective, the adaptation and remediation of cultural objects, the implications of materiality for reception, or the editorial and publication history of a specific work. While studying works in EEBO can serve as preparation for visiting special collections and rare book repositories, for many students the digital facsimiles in EEBO are their only contact with early modern textuality. Teaching them how EEBO works helps them better place these texts in their cultural context and understand their modern forms.

For more on digital humanities terms used in this article and throughout Folgerpedia, see our Glossary.

Lindquist, Thea and Heather Wicht. "Pleas’d By a Newe Inuention?: Assessing the Impact of Early English Books Online on Teaching and Research at the University of Colorado at Boulder." Journal of Academic Librarianship 33, no. 3 (2007): 347–60.