Humanities Data: Tools for Annotation and Access

The Northeastern North American Indigenous Languages Archive (NNAILA) is a new digital language archive housed at the University at Buffalo within the University Library system. NNAILA is currently in its pilot phase, with a focus on digitizing materials from Onondaga, an Iroquoian language spoken in parts of central New York and in the area near Brantford, Ontario. The archive's primary goals are to preserve recordings of indigenous languages of northeastern North America and to make the data in those recordings accessible both to the academic community and to the indigenous communities whose languages are represented in the archive's collections. To facilitate access to the archive's resources for Onondaga community members, especially language teachers and their students, we are developing a web-based toolkit that will allow users to construct and annotate personal digital collections of materials from the archive.

The Resource Description Framework (RDF) is a key component of the Semantic Web, an extension of the World Wide Web intended to make the content of web documents machine readable. Although not yet widely deployed in humanities projects, RDF uses a simple yet powerful model for encoding relationships among documents and annotations on the data within those documents. This talk will discuss the deployment of RDF in a large linguistic database, where it was used, among other things, to separate unstable, contested aspects of the data from stable, uncontested ones. Keeping the two apart prevented the contested data from becoming too closely intermingled with the uncontested data, which would have presented problems for long-term database management. Although the particular examples discussed will be drawn from the domain of language classification, many aspects of the discussion will be relevant to any project that aims to build a database for making new discoveries rather than simply encoding already known facts.
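
The separation of contested from uncontested statements can be illustrated with a minimal sketch. It is not the database described in the talk: it assumes a toy in-memory store in which each RDF-style triple (subject, predicate, object) is filed under a named graph. The class `QuadStore`, the graph names, and every identifier beginning with `ex:` are hypothetical and were invented for this illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF-style statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


class QuadStore:
    """A toy in-memory store that files each triple under a named graph."""

    def __init__(self):
        self._graphs = defaultdict(set)

    def add(self, graph, subject, predicate, obj):
        """Record a statement in the named graph."""
        self._graphs[graph].add(Triple(subject, predicate, obj))

    def triples(self, *graphs):
        """Return the union of the requested graphs (all graphs if none given)."""
        names = graphs or tuple(self._graphs)
        result = set()
        for name in names:
            result |= self._graphs.get(name, set())
        return result


# Hypothetical graph names and identifiers, for illustration only.
STABLE = "ex:graph/stable"
CONTESTED = "ex:graph/classification-proposal-1"

store = QuadStore()

# Uncontested, descriptive statements go in the stable graph.
store.add(STABLE, "ex:lang/1", "ex:name", "Language One")
store.add(STABLE, "ex:lang/1", "ex:documentedIn", "ex:doc/42")

# A classification claim lives in its own graph, so it can be revised or
# dropped without touching the stable statements.
store.add(CONTESTED, "ex:lang/1", "ex:memberOf", "ex:family/A")

stable_only = store.triples(STABLE)
everything = store.triples()
```

Querying `store.triples(STABLE)` returns only the descriptive statements, while `store.triples()` also includes the classification proposal. A competing proposal would simply be a further graph alongside the first, which is one way contested material can be kept from mingling with the rest.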

TEI Rails is a web-based content management system for documents encoded in TEI. The program, released as Free Software under the GPL, includes advanced features for XML-based content collaboration and annotation. This presentation will provide an overview of the TEI Rails system and demonstrate some of these features, including support for document versioning, cloning, and semi-automated annotation of content.