The Making of Mark Twain Project Online

“The edition of A.D. 2006 will make a stir when it comes out. I shall be hovering around taking notice, along with other dead pals. You are invited.” —Mark Twain, 1906

Going Digital

The resident editors at the Mark Twain Project have, over a period of more than forty years, created print critical editions of progressively greater sophistication and complexity. That progression was made possible in part by their fortunate position within the archive of the Mark Twain Papers, and in part by their longevity as full-time, professional editors, unburdened by teaching duties or committee work, but also without tenure, sabbaticals, or salaries any more reliable than the success of the latest biennial grant application. See Mark Twain Papers & Project: A Brief History.

In the years since 1967 the editors gradually mastered and assumed control over every aspect of making printed scholarly editions, including functions traditionally provided by publishers: not just proofreading, copyediting, and creating electronic files for typesetting, but also book design, and even a new system for transcribing manuscript. They also began to use some of the simpler tools of electronic editing, for example converting letter transcriptions from WordPerfect to PDF files that could be searched and selectively printed, and that the University of California Press could offer for purchase.

But by 2001 it had become clear that editorial work of any kind on Mark Twain would no longer be supported unless an electronic (digital) edition became the primary focus of the editors' efforts. Only one of the editors (the youngest, naturally) had more than the most rudimentary grasp of the technologies relevant to such an undertaking. Partly because of that inexperience, the earliest efforts in this direction conceived of stand-alone systems for each of the several kinds of texts: letters, notebooks, literary works, etc.

This piecemeal approach was not, however, entirely naïve. The “Electronic Edition of Mark Twain's Complete Letters, 1853–1910” begun in 2002 applied the standards of the Text Encoding Initiative (TEI), built the Project's first Document Type Definition (DTD), and actually managed to transcribe and encode some 700 letters in SGML (Standard Generalized Markup Language), even as the more powerful and flexible XML (Extensible Markup Language) was emerging as a superior alternative.

This initial experience with a TEI-conformant DTD and SGML soon persuaded the editors to undertake a more comprehensive approach to electronic editing, one that envisioned a single website publishing everything Mark Twain wrote, rather than distinct, stand-alone modules. (Alternative forms of electronic publication, like CD-ROM and PDF, were almost immediately rejected in favor of the greater reach and flexibility of web publication.) But this more ambitious plan, initially called “Mark Twain's Writings Online,” obviously demanded expertise of a kind that the editors, however experienced in print, could not hope to acquire for themselves.

Producing and sustaining such an edition required at least two kinds of help: institutional support for housing and preserving any digitized editions that might be created, and technical expertise in how to create and shape a website capable of serving up those editions on the web.

The longstanding partnership between the Mark Twain Project and the University of California Press needed a third party. The editors had already garnered advice about their initial DTD from Kirk Hastings at the California Digital Library (CDL). In 2004 CDL agreed to join with the Project and the Press in creating a website to provide worldwide digital access to the Project's editions, supplying the Project with irreplaceable access to expertise in database construction, information architecture, and web design.

CDL had already adopted TEI P4 as its standard for structured text encoding, and one editor from the Mark Twain Project had participated in the development of additional local encoding guidelines for CDL contributors, and in further developing search, display, and navigation features for online access to encoded texts. CDL's very extensive work on Mark Twain Project Online drove, for a time, its development of standards for the structuring and preservation of digital texts. See Producing a Critical Edition in a Digital Environment.

Conversion and/or Born Digital

For the online edition to be truly comprehensive, the thirty print volumes already published by the Project had to be converted to digital form. The decision was made to contract for their conversion to XML first, even as the editors were beginning to produce born-digital texts of the unpublished letters. By the end of 2004, all but a handful of the print volumes had been converted by the data management firm codeMantra, whose staff keyed the text and apparatus twice while supplying basic XML markup. Significant progress had also been made in applying XML markup to letters that had not appeared in the six printed volumes devoted to letters.

More Access More Quickly

The advantages of web publication are easily grasped. First and foremost is the ease and range of access it affords, compared to printed volumes: potentially, millions of readers instead of thousands, and accurate information in seconds or minutes rather than days or months.

As an archive as well as a scholarly edition, the Mark Twain Project is in the rare position of being able to provide immediate, comprehensive access to both primary and secondary sources. Online, it can seamlessly integrate the kind of information found in its printed volumes with information traditionally acquired only through physical access to the public archive. Here are just a few examples:

The archive at the Mark Twain Papers continues to expand. Based on major discoveries, the editors of the Mark Twain Project (MTP) have revisited and republished the critical editions of Roughing It and Adventures of Huckleberry Finn. But for other discoveries, it may not be possible to undertake an expensive, full-scale revision of the printed edition. Dozens of previously unknown letters by Clemens come to light every year, so that print volumes that are meant to be comprehensive may be out of date almost as soon as they hit the bookstores. Web publication, on the other hand, can make new documents accessible to the public as soon as they emerge.

A digital critical edition allows for interactivity and an integrative reading experience that are unimaginable in a print edition. Side-by-side digital presentation makes the process of transcribing a particular passage from the original manuscript or typescript more visible. Critical editions often contain long lists of emendations and historical collations in the back of the book, keyed to the text by page-and-line cues. Digital publication allows each emendation or collation entry to be hyperlinked to its specific location in the text or texts, so that all the variant readings from one witness can be displayed instantly in place, providing superior access to the information.

MTP editors can publish established texts without waiting to complete full annotation; then they can revise those texts, if necessary, as annotation is completed. Scholars no longer have to wait for print publication or a research trip in order to benefit from the archive's offerings. In addition, their access to the archive fosters scholarly discourse and may enrich the editorial work of the MTP.
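
The idea of linking each emendation or collation entry to its place in the text, and of showing one witness's readings in place, can be illustrated with a minimal sketch. It is not the Project's or CDL's actual implementation: the anchor format ("p1.l1" for page 1, line 1), the witness sigla, the function names, and the sample sentences are all hypothetical, invented only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    anchor: str   # hypothetical location id in the text, e.g. "p1.l1"
    lemma: str    # reading in the established text
    reading: str  # variant reading found in the witness
    witness: str  # hypothetical witness siglum, e.g. "MS"


def link_for(variant: Variant) -> str:
    """Return an HTML fragment link from an apparatus entry to its location."""
    return f'<a href="#{variant.anchor}">{variant.lemma}</a>'


def show_witness(text: dict, variants: list, witness: str) -> dict:
    """Return a copy of the text with one witness's readings shown in place."""
    shown = dict(text)  # leave the established text untouched
    for v in variants:
        if v.witness == witness and v.anchor in shown:
            # Substitute only the first occurrence of the lemma on that line.
            shown[v.anchor] = shown[v.anchor].replace(v.lemma, v.reading, 1)
    return shown


# Made-up sample data (not from any Mark Twain text).
text = {
    "p1.l1": "The quick brown fox",
    "p1.l2": "jumps over the lazy dog",
}
variants = [
    Variant("p1.l1", "quick", "swift", "MS"),
    Variant("p1.l2", "lazy", "sleepy", "A1"),
]

if __name__ == "__main__":
    for v in variants:
        print(v.witness, link_for(v))
    print(show_witness(text, variants, "MS"))
```

In this sketch the page-and-line cue of the printed apparatus becomes an anchor that both the hyperlink and the in-place display rely on, which is why a single encoded record can serve both purposes.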