About Me

I am Professor of Digital Humanities at the University of Glasgow and Theme Leader Fellow for the 'Digital Transformations' strategic theme of the Arts and Humanities Research Council. I tweet as @ajprescott.

This blog is a riff on digital humanities. A riff is a repeated phrase in music, used by analogy to describe an improvisation or commentary. In the 16th century, the word 'riff' meant a rift; Speed describes riffs in the earth shooting out flames. The poet Jeffrey Robinson points out that riff perhaps derives from riffle, to make rough.

Maybe we need to explore these other meanings of riff in thinking about digital humanities, and seek out rough and broken ground in the digital terrain.

13 January 2013

At Friday's conference organised by the estimable Orietta da Rold at the University of Leicester to mark the launch of the interesting Manuscripts Online project, I was telling a story of a nineteenth-century facsimile of a small late fourteenth-century manuscript in the British Library. The facsimile is a beautiful piece of craftsmanship, even down to the meticulous reproduction of the manuscript's binding. It is so beautifully done that it at first seems that somebody has stolen the original volume from the British Library. However, when you examine the facsimile more closely, you realise that, despite all the care lavished on ensuring that it reproduces the original manuscript as closely as possible, the facsimile has a major flaw. There are two very scrappy flyleaves in the manuscript, dirty and stained with old glue from an earlier binding. They initially appear completely insignificant, and the makers of the facsimile decided they could be left out. However, these scrappy flyleaves include some signatures which provide important information about early ownership of the manuscript.

This carefully constructed facsimile is deceptive. Editorial decisions were made in its construction which mean that it represents a subjective view of the manuscript. In the late nineteenth century, the great manuscript scholar Edward Maunde Thompson, in introducing a photographic facsimile of a manuscript, declared that photographs offered a depiction of the manuscript unaffected by human intervention. Sadly, this is never the case. Every reproduction of a manuscript involves a mass of human decisions about (for example) what lighting method is used, how flyleaves, blank leaves, etc., are included, how the binding is represented, and so on, which means that each reproduction of the manuscript represents an interpretation, frequently drawing on a mixture of curatorial, photographic and academic expertise. This applies just as much (perhaps even more) with digital imaging as with conventional photographic imaging.

During the Manuscripts Online event on Friday, we heard a great deal about data, to the point where data seemed to assume a life of its own, an energetic effervescent life force that needed to be freed so that medieval and other forms of scholarship could be transformed into new 'cool' forms by its remarkable qualities. It seemed that, for the participants in the conference, we are no longer curators or scholars but makers and consumers of data. In this perspective, data is presented as in some way offering a more objective, less problematic view of historic cultures and societies than the archives or manuscripts from which it is drawn.

There are many well-known projects and demonstrations which illustrate the way in which data is being manipulated to transform our views of historic and cultural trends and developments. Mapping has become ubiquitous, as if cultural geographers had never taught us about the way in which maps are difficult and challenging cultural constructs. It is intended that Medieval Manuscripts Online will include a mapping component, and mapping has become almost a standard requirement of such products nowadays. A good example of a characteristic approach is the visualisation by Ben Schmidt of seasonal movements of shipping during the late eighteenth century. This uses information from the log books of ships assembled for the climatological database. It all looks very entrancing and convincing, and we are reassured that the visualisation is based on 'hundreds of log books' so we assume it gives a good sense of major trade connections from 1750-1850. James Cheshire has also used the same data to produce a map of British trade routes from 1750 to 1800.

However, for a scholar interested in Britain, one thing immediately catches the eye. There are virtually no traces from the east coast of Britain, and very little trade between England and Scandinavia - how could great ports like Hull not be on trade routes? Moreover, no trade is shown as emanating from Liverpool or Glasgow - two of Britain's greatest slaving ports in the eighteenth century. How can this not be shown? The answer is that the climatological database was based on log books from Royal Naval ships and ships of the East India Company. The Royal Navy mainly operated from the south of England and in any case didn't engage directly in trade, while the East India Company of course did not trade much with northern Europe. This visualisation shows some trade routes in the eighteenth century, but by no means all. Indeed, some of these may not be trade routes at all, since the Royal Navy did not necessarily follow just trade routes. In removing the data sets from their original context, this visualisation runs the risk of creating a seriously distorted impression. For the original purpose of analysing eighteenth-century temperatures, the database was well constructed - it didn't particularly matter whether the vessels were naval or not, the weather was still the same. Using naval data to represent trade is another matter, however.

One of the problems confronting data enthusiasts in the humanities is that we feel a need to convince our more old-fashioned colleagues about what can be done. But our role as advocates of data shouldn't mean that we lose our critical sense as scholars. In their presentations to the Manuscripts Online conference, Michael Pidd and Kathy Rogers from the Humanities Research Institute at Sheffield stressed the need for detailed and careful examination of datasets in making them available, but there is a risk that we look more carefully at the technical components of the datasets than the historical context of the information that they represent. This is a danger of which Ben Schmidt was aware in making the shipping visualisation - he observes that one of the many uses of the visualisation is that it shows the status and coverage of the climatological database. It may be that this proves to be one of the main uses of data visualisation, namely giving us an innovative way of analysing the structure of historical and other texts, as Tim Hitchcock and Bill Turkel have shown in the way in which they use visualisations to explore the structure of the Old Bailey Proceedings as a historical source.

It comes back to the notorious comment made in a New York Times article on the Digital Humanities a couple of years ago, that data provides a way by which humanities scholars can escape from the '-isms' of cultural theory. There appears to be a sense that data can somehow be cut free from its historical moorings to enjoy an autonomous existence. I think that's very dangerous. Data doesn't mean we become less critical; it demands that our critical faculties are sharper than ever, as the distortions and deceptions of data can be so deeply embedded that they are difficult to ferret out. As data becomes more promiscuous and greater cross-connections are made, our critical faculties need to be sharper than ever.

11 January 2013

This is the text of a keynote lecture to the conference at Leicester University on 11 January 2013 marking the launch of Manuscripts Online: www.manuscriptsonline.org.

THE FUNCTION, STRUCTURE AND FUTURE OF CATALOGUES

The story of the British Library is full of
remarkable personalities. One of the
most striking of these was Donald Urquhart, who established in 1961 the
National Lending Library for Science and Technology at Boston Spa in Yorkshire,
which afterwards became the northern outpost of the British Library. Urquhart was
described by his successor Maurice Line as ‘one of the greatest innovators,
practitioners, thinkers and personalities the library profession has ever had’.
Urquhart was a scientist whose wartime experience made him aware of the
inability of staid literary libraries such as the British Museum to satisfy the
increasing need of scientific researchers for prompt, easy and cheap access to
the burgeoning range of publications reporting the latest technical and
scientific research. At Boston Spa,
Urquhart designed and built a remarkable mail order facility for information
which would ensure that scientists could receive the articles they needed in
their laboratory within twenty-four hours. In creating this facility, Urquhart
questioned, and frequently rejected, many of the accepted principles of
librarianship. His best known innovation was to jettison the idea of a
catalogue. When he asked a librarian what the purpose of a catalogue was, he
was unimpressed by the reply he received: ‘for completeness’. Urquhart argued
that, if books were arranged on the shelf by author and title order, a
catalogue was unnecessary. If the book was there, the lending request could be
met straight away off the shelf; if the book was not there, then it would be
necessary in any case to contact other institutions to see if they have a copy.

Urquhart’s questioning of the principle of
a library catalogue may seem to be gaining a new relevance as we see Google and
other search engines becoming the primary means by which researchers seek out
information. Recent studies, for example, show that students, in seeking
electronic resources, do not turn to the catalogues of e-resources laboriously
compiled by libraries, but simply Google the resource. Library catalogues have
been criticized as dowdy and lacking in interaction by comparison with (for
example) Amazon. The highly structured
and meticulously prepared information in a catalogue looks redundant by
comparison with the speed and simplicity of Google. The catalogue is starting to look in many ways
exactly like what Urquhart suggested – a comfort blanket for librarians and
curators. It seems that some librarians themselves are also coming to such a
view. Deanna Marcum of the Library of
Congress commented in 2006 that: ‘the detailed attention that we have been
paying to descriptive cataloging may no longer be justified ... retooled
catalogers could give more time to authority control, subject analysis, [and]
resource identification and evaluation’. Likewise, Karen Calhoun, in a report commissioned by the Library of
Congress, expressed a concern that ‘The existing local catalog's market position
has eroded to the point where there is real concern for its ability to weather
the competition for information seekers' attention’.

Yet the humble catalogue also underpins
many aspects of the new digital services by which it seems threatened. Two of
the major library digitization projects of recent years, Early English Books Online and Eighteenth
Century Collections Online, stem directly from the largest modern
cataloguing project of recent times, the English Short Title Catalogue, and the
primacy of EEBO and ECCO as digitisation projects reflects the visionary
insistence of those who established the English Short Title Catalogue in the
1970s that it should be in machine readable form. While Amazon may have given a
lead in promoting a more interactive approach to identifying and using books,
the comprehensiveness of the Amazon database is due to the fact that it
incorporates the historic catalogues of major libraries such as the Library of
Congress and the British Library. Anyone
who feels that Google can do the job performed by library catalogues should
attempt to locate specific volumes of periodicals in Google Books. It is an
extraordinarily time consuming task, and sometimes downright impossible, which
explains why digital libraries such as Hathi and Open Library offer
conventional online catalogue access to digital libraries.

Library, archive and museum catalogues
offer some of the largest and most highly structured datasets which humanities
researchers are likely to encounter. These bibliographical datasets are
increasingly being made available as open data. The British Museum’s collection
database is now available in this form and the British Library has also made
the British National Bibliography available as linked open data. The highly structured
data in library catalogues has great potential to support innovative
visualisations showing aspects of bibliographic and intellectual history, as
can be seen from this project at St Andrews, the Bohemian Bookshelf. While these possibilities have led to an
increased interest in the potential of using catalogue data in new ways, this
renaissance of interest in the catalogue comes at a time when the catalogue
itself is fundamentally changing because the services it has traditionally
supported are also being transformed. As Lorcan Dempsey has commented, ‘the
catalog is being reconfigured in ways which may result in its disappearance as
an individually identifiable component of library service. It is being subsumed
within larger library discovery environments and catalog data is flowing into
other systems and services’.

The catalogue is one of the oldest and most
important means by which humans have sought to control information. The library
of clay tablets collected by King Ashurbanipal of Assyria in the 7th
century BC had an author and title catalogue and probably a class catalogue as
well. We will all be familiar with the corpus of British Medieval Library
catalogues which has been in the process of publication by the British Academy
under the general editorship of Richard Sharpe and which lists thousands of
texts in circulation in medieval Britain. The production of library catalogues
was one of the first fields in which automation was used to expedite the
management of information. One of the earliest applications of automated
duplicating devices was in the production of the British Museum’s library
catalogue. The card index may nowadays seem like a very humdrum instrument of
information technology, but it was revolutionary in the way in which the use of
standardized cards allowed the sharing of information. The Library of Congress
in the early part of the twentieth century operated a bibliographic service
which offered pre-printed catalogue cards for books to local libraries. The
automation of these card indexes was one of the first computing technologies to
impact on humanities research.

The scholarly literature on cataloguing is
considerable, and the changes in the position of the catalogue mean that
discussion as to its purpose, value and future remains vigorous. This extensive scholarly and professional
debate has helped encourage the establishment and continued development of new
cataloguing standards. Not surprisingly, the discussion of cataloguing is most
sophisticated for such conventional library materials as the printed book and
the periodical publication. As early as the seventeenth century, Thomas Bodley
debated with his librarian Thomas James how the books he purchased should be
described. The Keeper of Printed Books at the British Museum, Anthony Panizzi,
established the first modern set of rules for cataloguing books in 1841. The
ninety-one rules promulgated by the British Museum reflected the collective
wisdom of Panizzi and his assistants, their debates about points of cataloguing
practice often extending far into the night. The British Museum’s example
encouraged American librarians to produce their own rules, culminating in
Charles Ammi Cutter’s Rules for a
Dictionary Catalog of 1876. The
formation of professional Library Associations in Britain and America
encouraged further collaboration, resulting in the compilation of an
Anglo-American Code in 1908 and finally the issue of the
Anglo-American Cataloguing Rules (AACR) in 1967, which were revised as a second edition (AACR2) in
1978. The experience of the Library of Congress in producing catalogue cards
for use by other libraries encouraged early experiments with distributing
library catalogue records in machine readable forms. The Library of Congress
developed a service to produce and distribute on tape Machine Readable
Catalogue entries as early as 1966. This international co-operation of course
extends beyond the English speaking world. The International Federation of
Library Associations has been very important here, for example, in enunciating
the International Principles for Bibliographic Description in 1961. This
framework has provided a strong basis for addressing new challenges. The new
version of the Principles for Bibliographic Description, which you can see
here, attempts to reconceptualise the role of bibliographic description in new
information environments, and reflects the sort of thinking which has
underpinned the development of the new Resource Description and Access (RDA) standard which
has been implemented by the Library of Congress and is in the process of being
adopted by the British Library. Since one of the advantages of RDA is that it
is meant to provide a more flexible framework than AACR2 for dealing with
archives, manuscripts and other non-book materials, RDA is likely to loom
larger for manuscript scholars than AACR2 has done.

While the use of ICT in printed book
cataloguing has a long history, for archives the development has been much more
recent, but very dramatic. Archive processing differs fundamentally from
printed book processing because of its concern to preserve and represent the
hierarchies and administrative inter-relationships of individual documents. An
archival callmark such as this example (National Archives, KB 145/3/5/1) tells
me everything I need to know about the document. At the fonds or collection level, it forms part of the records of the law
court known as the King’s Bench. At the series level, the number 145 tells me it
is part of the series of King’s Bench Recorda files. The sub-series number, 3,
indicates that this is from the reign of Richard II. The item number, 5, indicates
that this file is from the 5th regnal year of Richard II and the
file number 1 shows that it is the first of two parts surviving for that year.
The concern of archival descriptions is chiefly to preserve and document these
hierarchies, as the record entry for this file in the National Archives
catalogue illustrates. The kind of codicological and palaeographical
information such as the number of membranes or the number of scribes which
might be discussed in a literary or liturgical manuscript of the same period is
not analysed or recorded here. As you can see, the physical information
provided for description of a twelfth-century archival document such as this
pipe roll is minimal. The international standard which governs archival processing
and description is ISAD(G): the General International Standard Archival
Description. By contrast with MARC and printed books, the fonds structures of ISAD(G) cannot easily be represented in a
relational database. The hyperlinks of the World Wide Web closely map archival
structures, so that very quickly after the web appeared, an XML schema known as
EAD (Encoded Archival Description) was produced which enabled archive
descriptions to be readily made available for web access. The vast catalogues
of the National Archives in London, which had remained until the 1990s in
typewritten form and were only made available remotely through the energetic
photocopying programme of the List and Index Society, were rapidly made available online.
This was rapidly followed by the Access to Archives programme which converted
and put on the web catalogue records from many local and specialist
repositories.
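
The way a callmark such as KB 145/3/5/1 encodes its place in the archival hierarchy can be sketched in a few lines of Python. This is only an illustration, assuming the five levels described above (fonds, series, sub-series, item, file) and a simple "letters, space, slash-separated numbers" layout; the function name parse_callmark and the LEVELS list are hypothetical and are not part of ISAD(G), EAD or the National Archives catalogue.

```python
# Hypothetical level names, taken from the description of KB 145/3/5/1 above.
LEVELS = ["fonds", "series", "sub-series", "item", "file"]


def parse_callmark(callmark: str) -> list[tuple[str, str]]:
    """Return (level, reference) pairs from the fonds down to the file.

    Assumes a callmark of the form "KB 145/3/5/1": a letter code,
    a space, then four slash-separated numbers.
    """
    department, _, rest = callmark.strip().partition(" ")
    parts = rest.split("/")
    if (
        not department
        or len(parts) != len(LEVELS) - 1
        or not all(part.isdigit() for part in parts)
    ):
        raise ValueError(f"unexpected callmark layout: {callmark!r}")

    # Each level's reference is its parent's reference plus one more number,
    # so the hierarchy can be rebuilt from the callmark alone.
    path = [(LEVELS[0], department)]
    reference = department
    for level, part in zip(LEVELS[1:], parts):
        separator = " " if level == "series" else "/"
        reference = f"{reference}{separator}{part}"
        path.append((level, reference))
    return path


if __name__ == "__main__":
    for level, reference in parse_callmark("KB 145/3/5/1"):
        print(f"{level:<10} {reference}")
```

Each entry in the result is the parent of the one after it, which is the relationship an archival description is chiefly concerned to preserve. Nothing in the callmark, or in this sketch, carries codicological detail such as the number of membranes or scribes.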

There isn’t time here today to go into the
interesting development of online cataloguing and inventory methods in museums,
but the need for museum documentation to embrace such a wide range of materials
led to the emergence of a more semantically-based standard, the CIDOC
Conceptual Reference Model, which is, I suspect, likely to have a very major
impact on the way in which we document and analyse cultural heritage materials
over the next few years. But what is striking here is the way in which the sort
of material in which we are interested – the type of medieval literary, liturgical,
legal and other library manuscripts which are the glory of collections such as
the British Library, the Bodleian Library and the libraries of the Oxford and
Cambridge colleges – has been ignored by developments in cataloguing. The needs
of these manuscripts – or indeed of early modern and modern manuscripts which
do not fall easily into the fonds
structures of ISAD(G) - have barely figured in discussions of the nature and
future of the catalogue in new information environments. This is surprising,
since the cataloguing of manuscript libraries was one of the earliest forms of
library cataloguing. Among the earliest published library catalogues in England
were Thomas Smith’s 1696 catalogue of the Cotton Library and David Casley’s
1734 catalogue of the old Royal collection of manuscripts, while Humfrey Wanley
set a formidable standard for specialized catalogues with his catalogue of
Anglo-Saxon manuscripts published by Hickes in 1705. Edward Bernard’s 1697
Catalogue of manuscript books in England and Ireland was one of the first
attempts at a union catalogue. These seventeenth- and eighteenth-century
pioneers established a tradition which has without doubt been one of the
glories of English medieval scholarship. The catalogues of manuscript
collections compiled by scholars such as M. R. James, Neil Ker, Malcolm Parkes,
Tilly de la Mare, and Andrew Watson are remarkable achievements and this
tradition continues today, and some of its most distinguished practitioners are
with us today. Moreover, the various in-house catalogues of manuscript
collections compiled by institutions such as the British Library, the Bodleian
Library and the John Rylands Library in Manchester incorporate some of the
finest work of such manuscript scholars as Edward Maunde Thompson, Sir George
Warner, Francis Wormald, Julian Brown, Falconer Madan, Richard Hunt and (in
Manchester) Frank Taylor.

The catalogues of British manuscript
collections represent a formidable scholarly achievement, but, unlike printed
books or archives, this remarkable body of work has failed to generate any reflective
or theoretical literature. Manuscript cataloguers have been too deeply steeped
in the uncial to consider how the catalogues they produce fit into the wider
range of library and archive catalogue provision or to consider how their
catalogues can be better suited to their function and purpose. A contrast has been drawn with France,
where Leopold Delisle’s influence was responsible for the early development of
a very integrated and consistent approach to manuscript cataloguing. It has
been suggested that the failure of
English manuscript libraries and scholars to develop a similar approach was due
to a more pragmatic tradition in England – that English scholars were more
concerned with studying the manuscripts
than with the way in which the catalogues were structured. I fear this is a
rather self-serving piece of justification. I suspect that the failure to
develop any theory of manuscript cataloguing in Britain has more to do with the
way in which the study of manuscripts has been so intimately connected with
connoisseurship and collecting. Falconer Madan’s discussion of the cataloguing
of manuscripts in his 1899 volume Books
in Manuscript – amazingly, still one of the best introductions to the
subject when I started work in the Department of Manuscripts at the British
Library in 1979, but now of course supplanted by more up-to-date treatments by
scholars such as Michelle Brown and Christopher de Hamel – makes this concern
with the creation of informed connoisseurs clear when he explains that his
discussion of cataloguing is aimed at the ‘private collector [who] has
purchased a manuscript at a sale, that it has just reached him, and that he is
inexperienced in the treatment of such volumes’.

This tradition rooted in collecting and
connoisseurship goes back deep into the history of manuscript scholarship in
Britain – one thinks of Wanley’s work on the Harley collection. I suggest that
it had a profound effect on the intellectual programme of scholars such as
James or Ker. Richard Pfaff has suggested that the aim of M. R. James in
compiling his catalogues was to create in his mind a kind of imaginary library
which would assist him in dating and placing texts, and a similar sense is also
evident in the approach of Neil Ker. This means that for these scholars, the
catalogue was a method which gave them a structure for the systematic
exploration of manuscript libraries and also became a means of recording and
delivering a scholarly judgment on the dating and localization of a particular
manuscript. But frequently the relationship of these scholarly catalogues to
the libraries they described was not necessarily clear – as is apparent from
the problems created by James using his own systems for the numbering of
manuscripts. While the documentary scholars at the Public Record Office
codified their professional practice to create a new archive profession, with
training offered at new schools in centres like University College London and
Liverpool, there was no comparable move to create a similar professional basis
for manuscript librarianship. Indeed, in creating the archives profession in
Britain, Sir Hilary Jenkinson explicitly excluded Departments of Manuscripts
like that at the British Museum, arguing that they used museum procedures which
caused damage to the fonds. Rather
than seeking to create a parallel professional structure to that being
established by the archivists, manuscript scholars such as Edward Maunde
Thompson, Francis Wormald and Julian Brown concentrated instead on formalizing
and developing the academic study of palaeography and codicology. While scholars
from the Department of Manuscripts such as Thompson and Frederick Kenyon served
as Directors of the British Museum and played a major part in museum
administration, they had little impact on the development of the new archives
profession – something which perhaps confirmed Jenkinson’s argument that the
approach of manuscript libraries was too often based on the selective
connoisseurship of the museum.

The result of this is that, while the
emergence of cataloguing standards for books and archives was underpinned in
Britain by a substantial scholarly literature discussing the function and
structure of archives, there is no comparable literature on the theory and
practice of manuscript cataloguing. Our essential handbooks, such as the works
of Michelle Brown and Christopher de Hamel that I have already mentioned,
discuss palaeography, codicology and terminology. They do not discuss the
cataloguing requirements of manuscripts. The British literature on this subject
is embarrassingly meagre. The best
historical overview is A. J. Piper’s article on ‘Cataloguing British
Collections of Medieval Western Manuscripts’ in Lynda Dennison’s collection on
the legacy of M. R. James. An important but largely forgotten contribution is
an article by the remarkable palaeographer Dorothy Coveney, who produced a
groundbreaking catalogue of the manuscripts at University College London in
1935. Coveney’s article on ‘The Cataloguing of Literary Manuscripts’ – literary
manuscripts here being adopted as a technical term to distinguish library
manuscripts from archives – published in The
Journal of Documentation in 1950 argued for much fuller and more systematic
palaeographical treatment of manuscripts, making trenchant criticisms of the
mannered descriptions of hands in James’s catalogues. Of course, there are
descriptions of the methods adopted in the prefaces of catalogues by scholars
such as James and Ker and in some library catalogues, such as that of the Bodleian
Library which sought to introduce some of Delisle’s principles, but otherwise
that is all we have. While the Public
Record Office in London was at the heart of generating a new literature on the
processing and documentation of archives, the Department of Manuscripts at the
British Library produced nothing beyond two short handbooks itemizing the
various manuscript catalogues, a Guide to Manuscript Indexing by J. P. Hudson,
which is an impenetrable description of the typographical house rules used in the
indexes of the Catalogue of Additions to the Manuscripts, and a short guide to
the methods used initially to automate the catalogues of manuscripts.

As we have seen, the emergence of such
standards as AACR2, MARC and now RDA for printed books or ISAD(G) and EAD for
archives was closely related to both
theoretical discussions and the development of international associations such
as IFLA and the International Council on Archives. There has been no such
process with manuscripts, so that the picture internationally remains
fragmented. In America, there was an earlier recognition of the distinct needs
of manuscripts and an enthusiasm for a closer connection with mainstream
library developments and the promotion of a more integrated approach to
manuscripts, such as the proposal of the controversial librarian of Princeton,
Ernest Richardson, for the creation of a Union
World Catalog of Manuscript Books. This willingness to accept that
manuscripts were part of libraries perhaps accounts for the way in which
American practice has been more willing to accept that manuscript books can be
catalogued in much the same way as printed books. Gregory Pass’s Descriptive
Cataloging of Ancient, Medieval, Renaissance, and Early Modern Manuscripts is a supplement to AACR2 which provides guidelines for cataloguing
manuscripts according to AACR2 principles. This approach is widely favoured in
the United States, but its drawback is that it cannot cope with the collection
hierarchies which are required as soon as one encounters archival materials,
and this is one reason why manuscript librarians have been reluctant to go down
the simple route of cataloguing their manuscripts in AACR. However, while EAD
and ISAD(G) preserve information about the collection hierarchies, they are
very poor at representing bibliographical and codicological
information. The Liber Horn, for example, is held by the London Metropolitan
Archives which naturally uses ISAD(G) and EAD. This is the description for the
Liber Horn in the London Metropolitan Archives, and you can see the problems:
whether it is helpful to describe the Liber Horn as a file I am not sure, and the kind of
structural information we would normally expect in a description of a medieval
manuscript is simply not there. ISAD(G)
is geared to large quantities of corporate records, produced by institutions; a
volume of uncertain official status produced by a chamberlain of the city is
not easily accommodated by a standard designed to cope with the city’s
financial records.

There is, then, simply no accepted standard
for manuscript cataloguing. This would not matter very much if it wasn’t for
automation. The creation of large aggregated catalogues such as OCLC’s WorldCat
or the type of federated searching which is possible through services such as
CatCymru, which searches the catalogue of every public library in Wales, are
only made possible by the standardization grounded in the use of guidelines
such as AACR2. Without such standardization, it is impossible to develop such
services for manuscripts in the same way. A brave attempt to initiate such a standard was the MASTER project,
which sought to develop a TEI document type definition for use in manuscript
cataloguing. An immense amount of work has gone into developing MASTER and it
has been used in modified forms in cataloguing collections in Oxford, London,
Copenhagen and elsewhere. TEI P5 now includes provision for manuscript
description, but use of TEI P5 has tended to be restricted to academic
researchers rather than curators, and it has suffered from lack of take up by
major libraries. However, the Bodleian Library, which used EAD to prepare a
summary catalogue of its manuscript holdings, will be using TEI P5 to provide
more detailed descriptions of its medieval manuscripts. Nevertheless, the risks
and problems of fragmentation remain, which can be seen by looking at the
rather sorry tale of the British Library’s manuscript catalogue.

The British Library’s historic printed
manuscript catalogues, such as the long run of Catalogues of Additions to the
Manuscripts, were converted to machine readable form in the 1990s and made
available online via an Access database, which reproduced the split between
description and index in the printed catalogues and offered separate searches
for description and index, as well as easy access to information by manuscript
number. The catalogues of some of the
oldest collections in the Library were by this time very out of date and a
separate project was initiated to identify by means of a shelf survey all the
illuminated and pre-1200 manuscripts and then recatalogue them. This resulted
in a separate digital catalogue of illuminated manuscripts, where the
manuscript descriptions were also made available via an Access database. The
manuscript catalogues were separate from the Library’s main catalogue systems,
and it was clearly desirable that they should be incorporated in some way. In
1982, the India Office Records were transferred to the British Library. The
India Office Records are very much archives and in many ways it would have been
preferable to transfer them to the National Archives. For any library manager, it would clearly make
sense to try and provide integrated access to the manuscripts collection and a
major archive like the India Office Records. This is where the problem with
cataloguing standards kicks in. For the India Office material, ISAD(G) and EAD
are the available and recommended standards. For medieval manuscripts, there is
no recommended standard, so in creating an integrated British Library archive
and manuscript catalogue an ill-advised attempt has been made to shoehorn the
western manuscript catalogue records into ISAD(G) and EAD in a form that I fear
many manuscript scholars will simply find cumbersome at best and baffling at worst. But it is
difficult to suggest an alternative approach if there isn’t a clear-cut
manuscript standard available.

It’s perhaps worth lingering a moment to
take a closer look at why the new British Library ‘Search our Catalogue
Archives and Manuscripts’ is so problematic. Here’s what happens if you search on Thomas Hoccleve. The first
indication that there is a problem is actually in the left-hand side, where
entities from the manuscript descriptions, such as the language of the
manuscript or names of previous owners are displayed. You will notice that
there is some uncertainty as to whether these records are at fonds, item or
file level; my suggestion is that they should all be at item level, but the
difficulty of thinking about ‘file’ in the case of these manuscripts shows the
inappropriateness of the approach. However, more to the point is the display of
information about the manuscript. Here is the description of Harley MS 116 in
the Catalogue of Illuminated Manuscripts, and in my view it is exemplary in the
clarity of its distinction between the different aspects of the manuscript. EAD
doesn’t allow for any of this, so this is what we get if we go to ‘Details’ for
this manuscript in the new catalogue. The first point to notice is that this is
a very different description from the one in the Catalogue of Illuminated
Manuscripts. Unfortunately, no information is given as to why this new more
detailed description was compiled and by whom. It is a very fine description but
I think you can see how awkwardly it fits into the ISAD(G) template. Moreover,
some elements of the information will be difficult to search – there is no
reason why we couldn’t easily generate listings of manuscripts pricked in
different ways, given the level of detail here, but the inappropriate use of
the EAD schema makes that much more difficult.

An even bigger problem is apparent if we
look at the description of Sloane MS 1825. In this case, a description of the
manuscript compiled in the 1840s has simply been scanned in without further
amendment. Physical information is given briefly in Latin, and the date is in
the description but hasn’t been registered as the date of creation. Again,
there is no indication of the status or origin of the description. All this
provides is simply a keyword searchable version of a very old description –
useful, since this wasn’t previously accessible, but otherwise not of much value.
It is very difficult within the new British Library catalogue to access descriptions by
manuscript number. The manuscript number here is a reference code. This is what
you get if you search for the reference code Nero D IV, the reference for the
Lindisfarne Gospels. In this record, more care has been taken to try and make
the discussion of the physical structure of the manuscript and the bibliography
fit into an archival framework, but the way in which the component texts of the
manuscript are treated (as if they were papers in a box set) is very
disconcerting. Moreover, the listing promises details that we don’t get – it’s
surprising for example that the colophon is listed as a separate textual
component, but no details are given anywhere of what the colophon says.

There are many other problems with the new
British Library manuscripts catalogue. The facility to add your own subject
tags is potentially useful, and a similar facility has been included in the new
Discover the National Archives catalogue, but the relevance of reviews for the
Lindisfarne Gospels seems doubtful (would we put something like ‘a manuscript
that offers a great deal but when you see it close fails to deliver’?). But the
important point is that the problem is not the way in which the British Library
catalogue has been implemented here, but rather the difficulty caused by the
lack of any agreed standard for manuscript cataloguing, which is in itself a
symptom of a deeper lack of intellectual consensus as to the most appropriate
methods for processing and documenting manuscript collections which are not
formal archives. The temptation of course is to leap in and propose what such a
standard might look like. The need to develop a more standardized approach is
apparent from the outcome of a conference of manuscript librarians from Oxford,
New York, the British Library, Harvard, Yale and elsewhere held at the Bodleian
Library in 2007, where it was suggested that a good first step might be to look
at better handling of name authority. But I’m doubtful whether such tinkering
around the edges is adequate. Archival standards are not simply cataloguing
conventions but a statement of a whole philosophy as to how archival documents
should be processed, stored and made available. Cataloguing standards such as
RDA likewise reflect a holistic view of how categories of information are
managed. In the same way, we need to think
about what manuscript libraries are and how they should be managed. In thinking
about the future of manuscript catalogues, we need to rethink the nature and
function of the manuscript catalogue, from first principles.

I think that the linking of data, and thus
the tentative first proof of concept that we have been given in Manuscripts
Online, has a role here, but we need to start at the beginning and think about
what the manuscript collections in the British Library or the Bodleian Library
are. The first, and most important point, is one that Otto Mazal stresses in
his little handbook, The Keeper of Manuscripts, which is perhaps the nearest
thing to a philosophy of manuscript
librarianship that we have. While medievalists may naturally assume that the
most important things in manuscript collections are the volumes in which they
are interested, manuscript holdings are extremely diverse. The Additional
Manuscripts in the British Library embrace not only the Luttrell Psalter or
Sherborne Missal but also the Codex Sinaiticus, Samuel Taylor Coleridge’s
Notebooks, Charles Babbage’s correspondence, the archives of many British Prime
Ministers, the notebooks of scientists and engineers like Fleming and Whittle
and even a choreographic diagram by Nijinsky. Any processing and cataloguing
method to deal with collections like these needs to embrace all these varied
types of material – this is one reason why the use of the TEI guidelines for
manuscript description fails to address the problems of manuscript cataloguing. It
isn’t satisfactory to contemplate classifying the manuscripts, since individual
collections will themselves often be very diverse: Sir Robert Cotton’s library
included not only illuminated manuscripts but also a large portion of the
personal papers of Thomas Cromwell. Likewise, the manuscripts of the more modern
collector Eric Millar included both medieval material and the diaries of the
Edwardian writer F. Anstey. If we tried to split this material up into subject
types, we would potentially destroy a lot of evidence about the activities of
these collectors.

The way in which most manuscript libraries
address this problem of diversity is to use acquisition and accessioning as the
means of organizing the collections. This is one of the reasons why the manuscript
number is the key that draws together all our thinking about manuscripts. The
manuscript number provides our equivalent of title, author and much other
bibliographic information for modern printed books, and needs to be at the
heart of our thinking about manuscript cataloguing. It is this physicality of
manuscripts and other rare materials that creates a distinction with the kind
of discovery resources represented by, say, Explore the British Library. It
could be argued that coping with such physicality is more critical to the
future of the library catalogue than the discovery of wider ranges of resource.
Karen Calhoun, in her report to the Library of Congress, argued that since
libraries are unlikely to be able to compete with commercial search services,
they should perhaps focus on giving greater attention to providing information
about rare and unique materials in their collections. However, if libraries are
to give greater priority to the catalogues as a means of accessing manuscripts
and other special collections, they will need to accept that this requires a
different philosophy to that which is evident in the Explore type approach.

Lorcan Dempsey declared that: ‘The catalog
emerged at a time when information resources were scarce and attention was
abundant. Scarce because there were relatively few sources for particular
documents or research materials: they were distributed in print, collected in
libraries and were locally available. If you wanted to consult books or journals
or research reports or maps or government documents you went to the library’. Dempsey
points out that nowadays the situation is reversed: ‘information resources are
abundant and attention is scarce. The network user has many information
resources available to him or her on the network. Research and learning
materials may be available through many services, and there is no need for
physical proximity’. However, of course, the dynamics described by Dempsey do
not apply to manuscripts. In the case of manuscripts, our problem is not so
much that we have become less focused and are looking at the manuscript in a
more distant fashion, but instead, we are looking at manuscripts under closer
and closer microscopes, as we seek to extract every nugget of information that
we can from them. The interest of manuscript scholars in the potential of new
information technologies is completely the reverse of what Dempsey describes –
we want to view the manuscript under finer and finer views and to garner as
much information about it as we can.

Again, this means that the focus is on the
physical volume, on the individual manuscript, rather than a multiplicity of
resources. Linked data is definitely one of the topics of the day in humanities
scholarship and elsewhere, but I think there is a tendency to think that if we
link a random group of resources together, somehow the magic of linked data
will give us instantly new perspectives and new understandings for a particular
place or period. I fear that this rather naïve hope is evident in the Manuscripts
Online resource in its first version, particularly in the selection of
resources that have been linked. Sadly, scholarship is much harder than this.
Linking of data can be a very useful scholarly technique, but we need to be
clear about why we are linking data, what sort of data we are linking, and our
aim in doing so. In the case of manuscript catalogues, linking of data has the
potential to deal with many of the processing issues which govern the structure
of manuscript catalogues, if we approach the linking in the right way.

Dorothy Coveney, one of the few
commentators to discuss the philosophy of manuscript cataloguing, said that the
primary purpose of a manuscript catalogue is to ensure that the manuscript is
securely stored and can be easily located. This security aspect of a catalogue
is easily forgotten but in the case of medieval volumes worth millions of
pounds remains of fundamental importance. The potential problem of a catalogue
which ignores this requirement is illustrated by Samuel Ayscough’s catalogue of
the Sloane Manuscripts. Ayscough’s catalogue was organized by author and it
meant that the numeration of the manuscripts became rather confused, because Ayscough’s
catalogue was not an accurate guide to what should be on the shelf. As a
result, the Sloane manuscript containing William Harvey’s lectures on the
circulation of the blood was accidentally discarded. When the Harvey volume was
found, it was put in the place of a fifteenth-century astrological manuscript,
which has now in turn disappeared. The confusion created by Ayscough was only
sorted out when a shelflist recording all the numbers of the manuscripts on the
shelf was compiled.

In most manuscript libraries, these shelf
lists, containing the definitive listing of the manuscript numbers, provide
the fundamental statement of what the library holds, and are the spinal column
which links everything together. This is an example of the handlist for some
of the Cotton manuscripts in the British Library. This is really the
fundamental catalogue for these manuscripts, since it is the only definitive
statement of the holdings of this section of the library. Obviously, it would
not be much use simply to provide readers with a list of numbers, so initially
a listing is prepared which provides an initial view of the manuscript. But the
important point is that this is only an initial view – what Edward Maunde
Thompson says about a manuscript in the Catalogue of Additions is simply the starting
point to a scholarly discussion which will then last centuries. To my mind,
ideally a catalogue provides us with access to a complete view of that
scholarly discussion in a structured way. Our vision of a catalogue has historically
been of a single volume that will provide us with an authoritative statement on
a particular manuscript. We expect a Ker or a Kathleen Scott or an Andrew Watson
to provide us with an ex cathedra view of what we need to know about a manuscript.
This is a view very much driven by the assumption that a catalogue will be a
single printed volume. Yet information about manuscripts is scattered through
dozens upon dozens of different sources, some in digital form, very many not.
Ideally what we want is synoptic access to all those different sources of
information. I heard a gripping account recently by Arnold Hunt of the British
Library of how linked access to catalogue information can be used to show that
a dinosaur tooth in the Natural History Museum came from Sir Hans Sloane’s
collection. Not all the information we need to follow this linked chain of
evidence is in digital form.

My vision of the future manuscript
catalogue then is very much one which is of linked information which enables us
to accrue more and more detail about a manuscript. This doesn’t mean of course
that we are limited to one single direction in exploring the links, but I see
the physical manuscript as remaining our inevitable and necessary starting
point. There is an enormous task in assembling the information which would
enable us to create such a catalogue, particularly since many of the key
sources are not yet available in digital form. Take this example, Additional MS
18196, folio 1, a leaf of a Hymnal, containing part of
the hymns to Agnes and Anthony, acquired by the British Museum when Sir
Frederic Madden was Keeper of Manuscripts. Among the basic contemporary
resources you would need to link from this manuscript number in order to get a
good overview of the manuscript are the British Library’s shelflist database,
the Catalogue of Additions, Madden’s acquisition reports, Madden’s three series
of diaries which contain a great deal of information on manuscripts acquired by
him, Madden’s binding records, the huge archives of various annotated sale
catalogues held by the British Library, and the indexes of Sir Thomas
Phillipps’ manuscripts and catalogues – and that’s all just for starters.
Subsequent scholarship on the manuscript is recorded in a huge range of
different resources, starting with the Manuscripts Classed Catalogue in the
British Library and going right up to works by Paul Binski and Jonathan
Alexander. One of the biggest problems faced by manuscript librarians is
keeping track of the scholarly bibliography of their subjects. One of the most
comprehensive schemes historically was that of the British Library, which systematically
collected and indexed offprints of articles relating to manuscripts in the
library’s collections, but this pamphlet collection stopped being systematically
maintained in the 1960s. We now of course have an excellent opportunity to
revive it on a larger scale in the context of something like Manuscripts
Online. A search of JSTOR quickly reveals nearly 100 references to the
manuscript. The British Library’s own blog reports that this manuscript is
indeed currently on loan to the Getty Museum, where one of the curators
describes it as the most spectacular Florentine manuscript commission of the
first half of the fourteenth century. Just for this single leaf, there is an
enormous amount of information to link together.
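
The shape of such a catalogue can be sketched in a few lines of Python. This is only an illustration, not a description of how Manuscripts Online or any British Library system works: the class and field names are hypothetical, and the flags saying which sources are available digitally are assumptions made for the sake of the example. The one point it is meant to show is that the manuscript number is the single key to which every other resource is attached.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkedResource:
    """One source of information about a manuscript (hypothetical model)."""
    title: str
    kind: str      # e.g. "shelflist", "catalogue", "diary", "scholarship"
    digital: bool  # illustrative assumption, not a statement of fact


class LinkedCatalogue:
    """Toy catalogue in which the manuscript number is the only key."""

    def __init__(self):
        self._links = defaultdict(list)

    def link(self, manuscript_number, resource):
        """Attach another resource to a manuscript number."""
        self._links[manuscript_number].append(resource)

    def overview(self, manuscript_number):
        """Group the titles of everything linked to one manuscript by kind."""
        grouped = defaultdict(list)
        for resource in self._links.get(manuscript_number, []):
            grouped[resource.kind].append(resource.title)
        return dict(grouped)

    def not_yet_digital(self, manuscript_number):
        """List linked sources that would still have to be digitised."""
        return [r.title for r in self._links.get(manuscript_number, [])
                if not r.digital]


def example_catalogue():
    """Build a small example from sources named in the text above."""
    catalogue = LinkedCatalogue()
    number = "Additional MS 18196"
    catalogue.link(number, LinkedResource("Shelflist database", "shelflist", True))
    catalogue.link(number, LinkedResource("Catalogue of Additions", "catalogue", True))
    catalogue.link(number, LinkedResource("Madden's diaries", "diary", False))
    catalogue.link(number, LinkedResource("Madden's binding records", "binding", False))
    catalogue.link(number, LinkedResource("JSTOR articles", "scholarship", True))
    return catalogue


if __name__ == "__main__":
    cat = example_catalogue()
    print(cat.overview("Additional MS 18196"))
    print(cat.not_yet_digital("Additional MS 18196"))
```

Nothing in the sketch decides in advance what kinds of resource may be linked, so new scholarship can be added as it appears without the record for the physical manuscript changing.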

My vision then of a future
manuscript catalogue would be of something that links together a wide range of
resources in this way, anchored by the record of the physical manuscript
itself. This is why I particularly welcome the vision of Manuscripts Online,
which represents a small and tentative step – almost a Fisher Price version –
of what I hope the manuscript catalogue might ultimately become.