The Orlando Project: An Integrated History of Women's Writing in the British Isles

[September 07, 2001] The goal of the Orlando Project is to "provide an overarching account of women's writing across the centuries. This will appear in the form of four individually authored volumes of history together with an extensive, collaboratively authored, electronic textbase. In its literary history the project addresses issues raised by recent feminist thinkers and scholars of women's writing, and it draws on a wealth of new research on women's lives, their texts, and the conditions under which they wrote. In its computing work it has created a structure for writing, encoding, and working with its basic research material. It has developed a new application for Standard Generalized Markup Language (SGML), and has built its structure around four categories: biography, writing, events, and topics. A fifth structure (the volumes DTD) will be developed [...] These structures provide the material for detailed, interactive chronologies, a sample of which is mounted on the project website."

Our computing work is based on the Standard Generalized Markup Language (SGML), which provides a way of encoding features of interest within a document. SGML encoding makes it possible to use the same electronic document for many different purposes and ensures that the documents will outlast the computer system on which they were created. We have built on the encoding principles devised by the ground-breaking Text Encoding Initiative, but we are also using SGML in a novel and ambitious way. Our project is not encoding existing literary texts, but new material gathered by our own research. Hence, our encoding does not identify features of older texts, but instead takes a central role in our composition process: we encode as we write, and vice versa. We have created SGML specifications (called Document Type Definitions, or DTDs) for encoding interpretive information about women authors' lives and writing, a project bibliography, and other important historical events and items of information.
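As a rough illustration of how one encoded document can serve more than one purpose, the sketch below parses a small tagged fragment and derives two views from it: a chronology and a name index. It is not the project's software. The fragment is well-formed XML rather than SGML (an assumption made so that Python's standard library can parse it), and the element names, the `value` attribute, and the sample sentences are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: element names, attributes and content are invented
# for illustration and are not taken from the Orlando DTDs.
FRAGMENT = """
<biography>
  <birth><name>Jane Example</name> was born on <date value="1800-01-15">15 January 1800</date> in <place>Exampleton</place>.</birth>
  <education><name>Jane Example</name> attended a day school from <date value="1808-09-01">September 1808</date>.</education>
</biography>
"""

def chronology(root):
    """Return (ISO date, enclosing section tag, sentence text) sorted by date."""
    rows = []
    for section in root:
        for date in section.iter("date"):
            # Flatten the section's mixed content and normalise whitespace.
            text = " ".join("".join(section.itertext()).split())
            rows.append((date.get("value"), section.tag, text))
    return sorted(rows)

def name_index(root):
    """Return the distinct tagged names in alphabetical order."""
    return sorted({n.text for n in root.iter("name")})

root = ET.fromstring(FRAGMENT)
timeline = chronology(root)
names = name_index(root)

for row in timeline:
    print(row)
print(names)
```

Both views are computed from the same source text, which is the point of the encoding: nothing has to be retyped or restructured to produce a second presentation.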

From the 1999 ACH-ALLC report: "The Orlando Project is in the 4th year of its 6 year tenure as a Major Collaborative Research Initiative funded by the Social Sciences and Humanities Research Council of Canada and the Universities of Alberta and Guelph. Our aim is to research, write, and tag in SGML an integrated history of women's writing in the British Isles. We are not tagging pre-existing texts; rather we are creating our own literary history through conducting primary research that we then filter through SGML tagging. To do this we have created three unique yet interdependent SGML document types (DTDs): one that permits the description of biography, one that takes into account all the factors that contribute to a writing career, and one that provides an architecture for describing chronological events. Our DTDs are modeled structurally on the TEI but each contains many interpretive tags that allow us to foreground our research practices and label the intellectual content of our material. For example, the biography DTD has tags for birth, family, education, and political affiliations; writing documents use tags for such text-specific information as genre, intertextuality, literary awards, and relations with publishers; events documents contain chronological events that have such information as organization names and places tagged. Currently there are 252 unique tags in our DTDs and 114 unique attributes. These tags are used extensively across over 2,200 documents in our system. As of April 1999, the total number of elements in use in all project documents was over 640,000. For example, in biography documents alone the <quote> tag is used 2,135 times and the <name> tag 8,103 times. There are 51,125 uses of <date> in events documents. With element numbers of this scale, it became clear to us that we simply couldn't clean up our tags on an element by element basis. 
This paper will address the issues that we have faced when trying to achieve tagging consistency on the project. It will also report on a pilot study on tag consistency work that we have undertaken in the Fall and Winter of 1998-1999."
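Usage figures like those quoted above come from tallying elements across the whole document set, which is also the natural starting point for cleanup done in bulk rather than element by element. The following is a minimal sketch of such a tally, under stated assumptions: the documents are tiny invented XML strings (not Orlando data, and not SGML), and the wrapper element `doc` and the function name `tally` are hypothetical.

```python
from collections import Counter
import xml.etree.ElementTree as ET

# Hypothetical documents keyed by document type; all content is invented.
DOCUMENTS = {
    "biography": "<doc><name>A. Writer</name> said <quote>a phrase</quote> "
                 "to <name>B. Reader</name>.</doc>",
    "events": "<doc><date>1850</date> <name>A. Writer</name> joined a society.</doc>",
}

def tally(documents):
    """Count element occurrences separately for each document type."""
    counts = {}
    for doc_type, text in documents.items():
        root = ET.fromstring(text)
        # Skip the wrapper element itself; count everything nested inside it.
        counts[doc_type] = Counter(el.tag for el in root.iter() if el is not root)
    return counts

counts = tally(DOCUMENTS)
total = sum(sum(c.values()) for c in counts.values())

for doc_type, counter in counts.items():
    print(doc_type, dict(counter))
print("total elements:", total)
```

A per-type breakdown of this kind makes it easier to see where a tag is heavily used and therefore where inconsistent usage would be most costly to correct by hand.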

Project description as of 1997-10-18: "The primary objective of the Orlando Project is to produce, in both printed and electronic form, the first scholarly history of women's writing in the British Isles. The integration of the project's key disciplines -- literary history and humanities computing -- will produce a highly sophisticated research tool for the study of women's writing in the multiple traditions of England, Ireland, Scotland, and Wales... Using a combination of project-specific SGML with the TEI, we plan to extend the current capabilities of textual markup to include both subject tagging and the mapping of critical and argumentative movements within our textbase. Doing so will maximize the search, retrieval and display capabilities of our computing project."

"Can a Team Tag Consistently? Experiences on the Orlando Project." By Terry Butler, Sue Fisher, Susan Hockey, Greg Coulombe, Patricia Clements, Susan Brown, Isobel Grundy, Kathryn Carter, Kathryn Harvey, and Jeanne Wood. Paper presented at ACH-ALLC 1999. "Given that a team of thirty-five humanities researchers cannot possibly use structural and interpretive SGML tags consistently over a period of three years, what issues face a major SGML research project wanting to impose adequate amounts of consistency to their tagged data? This is the key question currently facing the Orlando Project as we take stock and embark on a process of tag cleanup work..." [cache]

"Conceptual Levels of SGML Tags: A Proposed Taxonomy Based on The Tagging in the Orlando Project." By Stan Ruecker (Orlando Project, University of Alberta, 3-5 Humanities Centre, Edmonton, Alberta CANADA T6G 2E5). Paper presented at the Web Information Systems Engineering 2000 Conference, Hong Kong. June, 2000. "Several projects in various disciplines are now using Standardized General Markup Language (SGML) tags at an interpretive level. That is, these projects contain tags which have the potential to provide the reader with additional information that is not already explicit in the text itself. One such interpretive project is the Orlando Project, which is an integrated history of women's writing in the British Isles, currently under development in Canada. Orlando is unlike other projects in that the content is being written and tagged simultaneously. It also contains a wide and rich variety of both descriptive and interpretive tags, which provide the user with a wealth of information on women's writing in the British Isles. But the project does not currently provide an explicit indication of the level of description or interpretation to be expected in any given tag. Without such a taxonomy, projects like Orlando risk introducing potential ambiguities for the scholarly user. This paper therefore proposes a potential conceptual tag taxonomy for literary interpretive SGML projects such as Orlando. There are already a number of SGML and XML projects operating in the humanities, medicine, and law, although only a very limited subset currently make extended use of the potential power of these markup grammars to enhance content with a full range of searchable interpretation. In such projects, however, there is a case to be made for supplementing the existing DTDs with some formalized understanding of conceptual tag level. 
The tag taxonomy proposed in this paper would add value to interpretive tag projects by explicitly including the intention of the DTD designers in the definition of individual tags. This taxonomic data could subsequently be made available to both the taggers and users as a mechanism for reducing ambiguities in examining specific tag instances."
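One way to picture taxonomic data of this kind being made available to taggers and users is as a lookup from tag name to conceptual level. The sketch below is only an illustration of that idea and does not reproduce the taxonomy proposed in the paper. The three level names are an assumption drawn from the terms "structural", "descriptive", and "interpretive" used in the abstracts above, and the assignment of each tag to a level is hypothetical.

```python
from enum import Enum

class Level(Enum):
    # Assumed level names; the paper's actual taxonomy may differ.
    STRUCTURAL = "structural"
    DESCRIPTIVE = "descriptive"
    INTERPRETIVE = "interpretive"

# Hypothetical assignments, for illustration only.
TAG_LEVELS = {
    "p": Level.STRUCTURAL,
    "date": Level.DESCRIPTIVE,
    "name": Level.DESCRIPTIVE,
    "intertextuality": Level.INTERPRETIVE,
}

def describe(tag):
    """Return a short note on the conceptual level recorded for a tag."""
    level = TAG_LEVELS.get(tag)
    if level is None:
        return f"<{tag}>: no level recorded"
    return f"<{tag}>: {level.value}"

def unclassified(tags):
    """List tags in use that have no recorded level, in alphabetical order."""
    return sorted(set(tags) - set(TAG_LEVELS))

print(describe("date"))
print(describe("genre"))
print(unclassified(["date", "genre", "genre", "quote"]))
```

Keeping the level alongside the tag definition lets a reader check how much interpretation to expect from a given tag instance, and the `unclassified` helper shows how gaps in the taxonomy could be found mechanically.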