Repetition in Text

The purpose of this session, presented at the 2007 Society for Digital Humanities Conference, was to discuss a set of research projects that deal with visualizing features of individual documents. These projects have been proceeding in association with the Metadata Opens New Knowledge (MONK) project. MONK is an attempt to build on the earlier NORA and Wordhoard projects in their efforts to make data-mining and visualization systems available in forms that are congenial to humanities scholars, and that work across a wide range of digital collections. The visualization component includes both scientific visualizations, where numeric data about the collections is presented in visual forms, and humanities visualization, where text data is presented in visual ways that may or may not include typography. A recurring need in this kind of humanities visualization is to provide information in context, so that the researcher working with an individual document can have some kind of overview of the entire text, combined with tools for selecting, highlighting, and annotating significant features, without losing the context of the whole. One way that data mining can facilitate this process is in allowing the user to identify a wider range of features than might otherwise be possible.

One of our collaborators on MONK, Tanya Clement, is working with the text of a long and complex novel, Gertrude Stein’s The Making of Americans. In approximately a thousand pages, or three and a half thousand paragraphs, Stein develops a treatment of her subject that involves recurring references to the same characters, through a process of repetition with variation. She uses the same words, or variations of the same words, only in different order and in different constructions, and these repetitions provoke a sense in the reader of an iterative unfolding of thought. Providing interfaces appropriate for Tanya has been our goal in these projects, since we have the conviction that although Stein is a special instance of this approach to writing, she is not alone in her use of repetition with variation.

Each of the three papers in the session takes a unique approach to this visualization challenge. The first presentation, “Dial R for Repetition,” assumes a contemporary PC environment, with a regulation monitor and interface options. The browsing tools consist of a series of radar screens, where the user can watch the system scanning through a document while highlighting the results in a set of transparent sheets that sit angled to the viewer to provide the document overview. The second paper, “A Text is a String of Words,” began with the strategy of removing the codex page conventions and starting over again with the rudimentary building blocks of text—words in sequence. Taking a metaphor from film editing, this interface shows an object consisting of loops of text that return to a common core. Each loop contains a different kind of information, chosen by the user, while the spine represents the repeated text of interest. For example, one interface might be configured with one loop for reported speech and another for prose, while another interface might have one loop for reported speech, one for prose in the first person, and another for prose in the third person. The third presentation is called “The Novel as Slot Machine.” It is predicated on the availability of wall-sized text displays and consists of a series of microtext columns, each one of which contains the entire text of the novel. The columns are multiplied according to the number of instances of the repeated text of interest, and align along a reading slot in the middle.
