Case Study: Henry III Fine Rolls

Introduction and Definition

Project definition: semantic markup is data marked up,
however lightly or heavily, in ways which reflect the semantic
content of a text, rather than its structure.

The Fine Rolls are documents produced in medieval England
detailing payments to the king; these payments were entered onto
parchment rolls and kept as a central record. Thus the Fine Rolls
provide an essential insight into the relationship between the
monarch and his subjects, especially royal patronage, as well as
fine-grained financial insight into medieval society. The long
reign of Henry III (1216-1272) offers a good opportunity to make a
long sequence of the rolls available online for the first time.
This is particularly valuable because there has previously been no
full printed edition of the Fine Rolls for this monarch, the only
edition being an inadequate collection of excerpts. Until this
project the only recourse for researchers was to study the rolls
in The National Archives in Kew, which raised both problems of
access for researchers far from London and questions of
preservation for the rolls.

Project description

The project has been funded by the Arts and Humanities Research
Council (AHRC) in two phases. The project team consists of staff
from King’s College, London (History department and Centre for
Computing in the Humanities), Canterbury Christ Church University
and The National Archives.

The project has four main outputs:

High-quality, easily navigable images of every membrane of
the rolls. The high quality of the images will allow users to
zoom in to high magnifications to examine aspects of the scribes’
hands and other details. The images will also be linked, on a
membrane-by-membrane basis, to:

Full translations of the Latin text of the rolls. These
translations are being supplemented by further information
found only in the Originalia Rolls (copies sent to the
Exchequer) and not in the Fine Rolls themselves.

Index and search facilities for the Fine Rolls, between 1216
and 1248 in the first instance. These will list all people and
places in the rolls, as well as giving a subject index. This will
enable complex searches to be carried out. It is hoped that
additional funding will allow the completion of the index for the
rest of Henry’s reign. In the meantime the translations will be
searchable in the usual but basic way with a web browser’s ‘find
in page’ facility.

Print publication of the translations and indexes for
1216-1242. Again, it is hoped that additional funding for
complete indexing will allow the print publication of
translations of the rolls for the entire reign.

It is notable that the project outputs do not include a
transcription of the Latin text, which would make the Latin
lexically searchable. Other, similar projects have been able to
include transcriptions as well as translations, for example the
Early English Laws project (http://www.earlyenglishlaws.ac.uk/).
However, this step clearly adds another round of work, and it is
significant that Early English Laws has a different project model,
involving a large element of crowd-sourcing and incremental
publication as new editions of individual manuscripts are
completed.

Additionally, the project has committed to publishing a ‘Fine of
the Month’, in which an expert comments on a particular fine,
illuminating its features and historical importance. This is part
of the project’s attempt to make a wider impact for material
which is, in itself, somewhat arcane to the layman.

Use of tool

The encoding of the textual data is in XML (Extensible Markup
Language). This is a standard for text-based data; it is endorsed
by the W3C and is an open format. It is therefore independent of
any platform or licensing. Because XML is so well established
within the humanities, and within computing generally, it has two
additional advantages:

a suite of other technologies has grown up around XML, for
instance the ability to interrogate and transform XML using the
XSL family (for example XSLT, which can transform XML into
other formats or restructured XML documents), as well as
numerous XML editing tools – for example, the Fine Rolls project
used the Oxygen XML Editor, a widely available and very full-featured
editor popular in academic projects.

the widespread use of XML within digital humanities projects
makes it very suitable for linking to other data, as well
as to re-use; given the extent to which XML is used worldwide,
its sustainability as a format looks assured for some decades.

The text of the translation has been marked up using the Text
Encoding Initiative (TEI) guidelines (see http://www.tei-c.org/index.xml).
The TEI has extensive markup specifically designed for complex
manuscript work such as that involved in this project.

To take the simplest example, each fine is marked in the
manuscripts with a paragraph mark where it commences: ¶.
A standard way to mark up this kind of unit within TEI is
to use the div (division) element, which may optionally be
numbered by depth (div1, div2 and so on) and which carries type
and n (number) attributes. So one way to mark up the first
fine within TEI could be:

<div type="fine" n="1">

Or

<div1 type="fine" n="1">

Naturally, other attributes can be added to meet the project’s
requirements.
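
As a rough illustration of why this kind of markup is useful, the following Python sketch reads a small TEI-style fragment and lists the fines it contains. The fragment, its text and the function name list_fines are invented for illustration and are not taken from the project's files; the TEI namespace is also omitted for brevity, which a real TEI document would require.

```python
import xml.etree.ElementTree as ET

# Hypothetical TEI-style fragment: the content and n values are invented
# for illustration, and the TEI namespace is left out for brevity.
SAMPLE = """
<body>
  <div type="fine" n="1"><p>Text of the first fine.</p></div>
  <div type="fine" n="2"><p>Text of the second fine.</p></div>
  <div type="note" n="1"><p>An editorial note, not a fine.</p></div>
</body>
"""


def list_fines(xml_text):
    """Return (n, text) pairs for every div whose type attribute is 'fine'."""
    root = ET.fromstring(xml_text)
    fines = []
    for div in root.iter("div"):
        if div.get("type") == "fine":
            # Collapse runs of whitespace so the text reads as one line.
            text = " ".join("".join(div.itertext()).split())
            fines.append((div.get("n"), text))
    return fines


if __name__ == "__main__":
    for number, text in list_fines(SAMPLE):
        print(number, text)
```

Because the type attribute distinguishes fines from other divisions, the editorial note in the sample is skipped without any special handling.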

The marking up of names can be illustrated as a slightly more
complex example of TEI, this time given as a markup example on
the project website:

The key attribute here is being used to link the descriptive text
about Phillip Marc to his entry in the authority file of
individuals, as is also being done with the placename authority
file for Nottinghamshire. In this way TEI markup in XML is being
used to generate the indexes for the project as well as allowing
more comprehensive searching by including variants in the markup.
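
The way key attributes can drive an index may be sketched as follows. This is a minimal illustration under stated assumptions, not the project's own code or markup: the fragment, the key values (person-001, place-001), the variant spelling and the function name build_index are all hypothetical, and the TEI namespace is omitted for brevity.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical fragment: key values and spellings are invented for
# illustration and do not come from the project's authority files.
SAMPLE = """
<body>
  <div type="fine" n="1">
    <p><persName key="person-001">Philip Mark</persName>, sheriff of
    <placeName key="place-001">Nottinghamshire</placeName>.</p>
  </div>
  <div type="fine" n="2">
    <p><persName key="person-001">Phillip Marc</persName> gives a fine.</p>
  </div>
</body>
"""


def build_index(xml_text, tag):
    """Map each key on <tag> elements to the fines and spellings it occurs with."""
    root = ET.fromstring(xml_text)
    index = defaultdict(lambda: {"fines": [], "variants": set()})
    for div in root.iter("div"):
        if div.get("type") != "fine":
            continue
        for name in div.iter(tag):
            entry = index[name.get("key")]
            entry["fines"].append(div.get("n"))
            entry["variants"].add("".join(name.itertext()).strip())
    return dict(index)


if __name__ == "__main__":
    for key, entry in build_index(SAMPLE, "persName").items():
        print(key, entry["fines"], sorted(entry["variants"]))
```

Both spellings in the sample resolve to the same index entry because they share a key, which is the mechanism that lets a search on one variant find the other.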

A final layer of information added to the markup is RDF (Resource
Description Framework), a technology used for linked data and the
semantic web (for more information on linked data, see our linked
data case study on Liparm). The
ontology language chosen for this project was OWL (Web Ontology
Language) and editors used the popular open-source OWL editor
Protégé OWL (http://protege.stanford.edu/).
This RDF layer has two advantages:

It allows this dataset to be linked to others, using the same
ontology and, potentially, the same URIs. For example, if Phillip
Marc, who appears above, also appeared in another project to do
with medieval Nottingham, this second project could use the same
URI (Uniform Resource Identifier) as that used by the Fine Rolls
project, which would allow Phillip Marc to be returned in
searches across both datasets.

It allows machine reasoning over the data, where
relationships explicitly declared in the OWL ontology can be the
basis for machines to infer further relationships. To take a very
simple example, if there is a declaration that Phillip Marc has a
wife called, let’s say, Britney Marc, then it need not be
declared that Britney Marc has a husband called Phillip Marc,
because a reasoning engine can deduce this.
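
Both advantages can be illustrated with a toy sketch in plain Python. It is a stand-in for the idea only and does not use RDF tooling or a real OWL reasoner: the URIs, the property names (hasWife, hasHusband, mentionedIn) and the second dataset are all hypothetical, and the INVERSES table plays the role that an inverse-property declaration would play in an ontology.

```python
# Hypothetical URIs and property names, invented for illustration.
PHILLIP = "http://example.org/person/phillip-marc"
BRITNEY = "http://example.org/person/britney-marc"

# Each dataset is a set of (subject, property, object) triples.
fine_rolls = {
    (PHILLIP, "hasWife", BRITNEY),
    (PHILLIP, "mentionedIn", "http://example.org/fine-rolls/fine/1"),
}
other_project = {
    (PHILLIP, "mentionedIn", "http://example.org/nottingham/charter/7"),
}

# Stand-in for an ontology statement that two properties are inverses.
INVERSES = {"hasWife": "hasHusband"}


def infer_inverses(triples, inverses):
    """Return the triples plus the inverse of every triple with a declared inverse."""
    both_ways = dict(inverses)
    both_ways.update({v: k for k, v in inverses.items()})
    inferred = set(triples)
    for subject, prop, obj in triples:
        if prop in both_ways:
            inferred.add((obj, both_ways[prop], subject))
    return inferred


def mentions(triples, person):
    """List every document a person is mentioned in, across all merged data."""
    return sorted(o for s, p, o in triples if s == person and p == "mentionedIn")


# Sharing the URI means a plain set union is enough to combine the datasets.
merged = fine_rolls | other_project
reasoned = infer_inverses(merged, INVERSES)

if __name__ == "__main__":
    print(mentions(merged, PHILLIP))
    print((BRITNEY, "hasHusband", PHILLIP) in reasoned)
```

The search across both datasets works only because they use the identical URI for the same person, and the husband relationship appears in the result although it was never declared.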

Further possibilities

As noted above, despite being a very large undertaking, this
project did not have the resources for a transcription of the
Latin text. Nevertheless the Henry III Fine Rolls project is
something of a gold standard in projects of this type, offering
high-quality manuscript images, linked, TEI-encoded translations,
and RDF-OWL encoding. There are numerous possible extensions of
this methodology in ways which would build on the work that has
already been done: other rolls series and documents for the same
monarch, or the fine rolls for contiguous monarchs, for example;
if these were encoded in the same way, with attention to
cross-searching, then this valuable resource would become even
more useful.

Conclusion

The Fine Rolls project is the outcome of a large grant from the
AHRC and a multi-institution collaboration. It shows the
possibilities of a high-end markup project. Nevertheless, leaving
aside the unavoidable expense of professional manuscript
photography, the rest of the project methodology can be employed
on a much smaller scale and with little cost other than staff
time. A committed researcher could produce a TEI- and
RDF-OWL-encoded transcription of a smaller document, using the
same editors (Oxygen and Protégé OWL), with almost no outlay on
equipment.