August marks the 150th birthday of naturalist and Antarctic explorer, William Speirs Bruce, who was born on 1 August, 1867.

Part of the Bruce archive is held in the library collections of National Museums Scotland, with other Bruce archive collections being held by the University of Edinburgh, and the University of Cambridge.

Cartoon of Bruce originally published in a Buenos Aires newspaper.

As a teenager, Bruce attended a vacation course in biology at a marine station in Granton, studying under Patrick Geddes, which proved to be an influential experience. He went on to assist John Murray at the Challenger Office, and would help with dredging on the Forth or Clyde whenever there was an opportunity.

Bruce’s first Antarctic voyage was on the Balaena, where he worked as a surgeon on the Dundee Antarctic Whaling Expedition. He went on to work as a biologist on the Jackson-Harmsworth Expedition, and then on the Coats Arctic Expedition. Bruce was then invited to make hydrological and biological surveys on trips to Spitsbergen.

Bruce’s best known expedition was on the Scotia where he was the leader of the Scottish National Antarctic Expedition during 1902 to 1904. This expedition set out to conduct hydrographic work in the Weddell Sea, and survey the South Orkney Islands and study their wildlife.

Bruce continued to make expeditions, and travelled to Spitsbergen several more times between 1906 and 1919.

The archive at National Museums Scotland holds a range of records that show the breadth of Bruce’s work over the years.

List of equipment and stores made by Bruce for an expedition to Spitsbergen.

The planning that was required to undertake a scientific voyage is evident from the many records held for ordering goods to take on board, and packing lists for specific parts of a voyage. Lists include everything from basic requirements such as food, to survival equipment, to specialised scientific apparatus.

The archive includes scientific data gathered on Bruce’s voyages. There are examples of scientific log books, oceanographic measurements of temperature and water density, and lists of specimens found in trawls.

Cuthbertson drawings of an Atlantic lizardfish and the head of a Shag.

Scientific data is accompanied by scientific drawings and sketches of the flora and fauna collected and described as part of the expeditions. The artist of the Scotia was William Cuthbertson, and his artwork shows the array of wildlife that was observed by the scientific team.

Cuthbertson painting.

Cuthbertson also painted landscapes and seascapes as the crew travelled, and the archive has a collection of these, often showing the beauty of the environment that was encountered on the Scottish National Antarctic Expedition.

Sketch by William Martin of Emperor Penguins.

The archive includes many illustrations and descriptions of penguins, including this sketch by William Martin. Their behaviour was noted by Bruce and his colleagues during the Scotia expedition, and specimens were collected for scientific study. Some of these specimens are part of the collections at National Museums Scotland, and still available for study. However, penguins and their eggs were also valued as food for the voyage, with black throated penguins being found the most palatable. Penguin was regularly served with fried onions, in soup, or as curry to those on board the Scotia.

William Speirs Bruce’s attempt to draw a pig in ‘Livre de Cochons’.

Despite the amount of scientific work undertaken during expeditions, Bruce and his colleagues did have leisure time to fill. Time would be spent singing songs, with each person doing a turn to entertain, Bruce being known for his rendition of ‘Two Blue Bottles’. The archive collection contains a notebook filled with attempts to draw a pig while blindfolded, which serves as a keepsake from the voyage, as well as evidence of the kind of games that would keep boredom at bay. The page shown is William Speirs Bruce’s attempt.

Sketch by William Martin of a cove at Gough Island.

The landscapes and living conditions experienced by those on the Scottish National Antarctic Expedition were captured by William Martin in a sketchbook that is also held in the National Museums Scotland archive. The sketch shown is of a cove at Gough Island, where the Scotia stopped to collect specimens. More images from the sketchbook can be found online at http://www.nms.ac.uk/explore/collection-search-results/?item_id=737692

The Bruce papers also contain the diary of A Forbes Mackay who was a colleague of Bruce. Mackay reached the South Magnetic Pole on January 16th 1909, along with T.W. Edgeworth David, and Douglas Mawson. The diary tells of the difficult conditions as the men made the journey on foot over challenging terrain. Mackay also describes the pressure put on their relationships as a team, as the leadership passed from David to Mawson because David was no longer considered capable of leading.

In July 2017 the University of Portsmouth celebrates 25 years since gaining university status. However, the University of Portsmouth archive reveals that the roots of the institution go back much further than this, to the late 19th century.

PGSSA/1/1 Page from minute book of the Portsmouth and Gosport School of Science and Art [1874].

One of the earliest items in the university archive collection is a minute book from the Portsmouth and Gosport School of Science and Art. The school opened on 1st June 1870 and offered a mix of day and evening classes, the latter aimed at local artisans. Both men and women attended the school whose main premises were in the former Crown Sale Rooms in Pembroke Street. Students could receive instruction in a range of skills including practical geometry, artistic anatomy, and architectural and mechanical drawing.

By 1908 responsibility for technical education had been taken over by the local authority. A grand new building opened behind the Guildhall to house the Portsmouth Municipal College. The building is still in use by the university today and has Grade II listed status. The college offered a mix of higher and lower courses, the higher being of university standard. The college also had another role in the wider community as the reference library on the ground floor was open to both local residents and students.

PMTI/3/2.3 Plan of the second floor of the Portsmouth Municipal College building, later Park Building [1905]

The first edition of student magazine The Galleon was published in autumn 1911, shortly after the establishment of separate female and male student unions. It reported on the formation of a women’s basketball team and bemoaned the state of the common room. Student media is a fascinating source of information on the daily life of students and many newspapers and magazines survive in the archive.

There has been a succession of name changes among the university’s institutional predecessors. Portsmouth Municipal College became Portsmouth College of Technology in 1953, before developing into Portsmouth Polytechnic in 1969. The collections chart this process of expansion – of both student numbers and buildings – through prospectuses, newsletters, annual reports and more. The university is located right in the heart of the city of Portsmouth and the range of buildings it has utilised over the decades is notable. In addition to creating new buildings of its own, sites include a former sailors home, hotel, building society headquarters, drill hall and barracks, illustrating just how closely the history of the university is related to that of the city as a whole.

ART/3/1.1.20 College of Art later known as Eldon Building under construction [1960]

The university is also, however, the result of the amalgamation of several institutions. One of the richest collections of material in the archive is from the Day Training College, latterly the College of Education. This teacher training college didn’t merge with Portsmouth Polytechnic until 1976 and was mostly based at its own separate site in the city. The collection includes college admissions registers, correspondence and photographs of staff, students and buildings. The training college was all-female for several decades after its establishment in 1907, and its collection is a valuable resource for women’s history. Similarly, there is also a separate archival collection from the College of Art, which maintained its independence until the 1990s.

EDUC/15/3.10.21 Student hockey team at City of Portsmouth Training College [early 1940s]

In 1992 Portsmouth became a ‘new’ University, but one with a considerable heritage and a long-established connection to the local area. The university archive has an important role in helping to tell this sometimes overlooked aspect of the institution’s history.

Archives Hub feature for June 2017

Erskine estate

June 2017 marks the 100th anniversary of the official opening of Erskine Hospital. Located on the west coast of Scotland, Erskine was founded in 1916 as the Princess Louise Scottish Hospital for Limbless Sailors and Soldiers, a military convalescence facility for servicemen who had lost limbs in the First World War. The creation of the hospital was a direct response to the need for specialised medical facilities to deal with the unprecedented number of injured and maimed service personnel returning from the battlefields, and for the last 100 years Erskine has continued to care for ex-Service men and women.

In 2015 the University of Glasgow received an award from the Wellcome Trust to catalogue and preserve the records of Erskine. The partnership came about as part of the University’s Great War project, and as part of Erskine’s centenary celebrations. It will ensure that material is preserved and accessible for researchers and outreach projects in perpetuity.

Fitting Provisional Limbs At Erskine Hospital

The Erskine Collection is vast in its scope – ranging from items intrinsically tied to the running of the hospital, such as minute books and admissions records, to items such as silk embroidered souvenir postcards sent during the First World War, or correspondence and loose photographs, the owner or subject of which may have been a resident at some point in time. While the administrative records are essential for documenting the running of the facility and tracing individual patients of Erskine, patient experiences, perspectives, and voices are also captured in an array of documents.

Admission Books show that by December 1917 the number of patients admitted to the hospital was 1,613, and of those 1,126 had been ‘discharged with limbs’. More than 2,145 ex-service pensioners from previous wars also attended Erskine to be fitted with new limbs or limb repairs. Between the opening in October 1916 and December 1919 over 400 major operations were performed.

The Princess Louise Scottish Hospital Rules for Patients

The Princess Louise Scottish Hospital Rules for Patients give a taste of the patient experience during the 1920s. While Erskine provided long term care and rehabilitation for many, patients were expected to follow the strict practices enforced by the hospital staff. Activities such as gambling and smoking were restricted or even forbidden, and bed and meal times were strictly adhered to. However, Erskine was always intended to be more than just a hospital. In return for their co-operation with the rules of the Hospital, patients were given the opportunity to retrain and gain new skills through onsite workshops; classes were set up in basketry, shoemaking, tailoring, woodwork, hairdressing and commercial training, ensuring the men would have the opportunity to re-enter the workforce despite their disability upon being discharged from Erskine.

After the war the number of patients entering the hospital due to amputation naturally decreased. The Executive Committee shifted focus toward providing a permanent home for ex-servicemen requiring long term care.

Patients in sunshine

As well as being a busy functioning hospital Erskine became a permanent home for paraplegic residents unable to live independently. Additionally in 1934, a convalescent holiday scheme was introduced which allowed ex-servicemen who had been ill and could not afford to pay for a holiday to come to Erskine for a break. In September 1946 the first of 50 cottages was built in the grounds of Erskine, allowing disabled men and their families to live near their place of work and close to the hospital facilities on which they depended.

During the 1960s and 1970s the patients of Erskine produced a magazine, The Erskine Bugle. The Bugle ensured patients and staff could learn about events taking place throughout the hospital, and the poems, stories and letters submitted give a voice to those who stayed at Erskine during this period. The magazines offer a unique perspective on the community of Erskine, and serve as a worthy legacy of the patients and staff who created them.

Erskine Bugle, 1971

The hospital continually expanded in the second half of the 20th century, with new wings being built in 1950, 1962, 1975, and the 1990s. However, it was clear that the 19th-century manor house was no longer equipped to deal with the demands of the busy convalescence home. In 2000 the new state-of-the-art facility was completed to provide long-term residential care for veterans.

The partnership between Erskine and the University of Glasgow is ongoing, and regular accessions are expected, ensuring an impressively full record of the activities of the hospital, its staff, and its patients is reflected in the collection, from both World Wars and the National Service era, right through to the present day.

Jimmy Quinn

For more information on the Erskine archive, and the collections held at the University of Glasgow Archives and Special Collections, please visit:

Archives Hub feature for May 2017

Habana

In May 1937 approximately 4,000 children, with labels pinned to their clothes, came to Southampton on board the Habana from Santurtzi/Santurce, the port of Bilbo/Bilbao, fleeing the Spanish Civil War and its consequences.

Label of the Departamento de Asistencia Social, one for each Basque child refugee on board the ship.

The Spanish Second Republic had been established in 1931, with an ambitious agenda to eliminate deeply-rooted social and cultural inequalities. The republican programme encompassed land and education reform, improved rights for women, restructuring the army, and granting autonomy to Catalonia and the Basque Country. Threatened by far-reaching change, diverse political groupings aligned themselves in the so-called ‘two Spains’. The ensuing civil war lasted three years, with Nazi Germany and Fascist Italy helping one faction, Communist Russia the other, with Chamberlain’s Britain leading a policy of appeasement among Western democratic nations. In this bitter conflict, there was a third Spain, which did not want to take up arms, but to live in peace. War, hunger, revolution, counter-revolution, denunciations, persecution, summary trials and executions, and mass repression often resulted in the disintegration of family and community life, desolating a country and forcing thousands of its people into exile.

On 26 April 1937, General Franco attacked Guernica and Durango, one of the first bombings of a civilian population in Europe. In the wake of this, the Basque government and the National Joint Committee for Spanish Relief, co-ordinating relief in the UK, organised the evacuation of children from the north front of the war zone. The British government had a policy of non-intervention in Spain and, whilst it permitted the children to enter the UK, no public funds were made available for the expedition, nor for the care of the children once they arrived. Their maintenance was provided for entirely by private funds and those raised by voluntary groups and organisations, under the overall co-ordination of the Basque Children’s Committee.

On arrival at Southampton, the children were sent to a hastily constructed camp at North Stoneham, near Eastleigh, which now forms part of Southampton Airport.

Camp at North Stoneham.

This was the children’s temporary home until they were dispersed to be cared for by the Catholic Church, the Salvation Army, which accommodated children in a hostel in London, or in the so-called “colonies” set up by local committees across the country. Eventually over ninety “colonies” were established, each housing between 20 and 50 children. Ranging from stately homes to converted workhouses, the “colonies” were run on donations. When the initial funding for them began to dry up, the niños were drawn into helping raise funds by performing concerts and shows and by taking part in football matches with local teams.

Football team, Hull colony

The children who came on board the Habana brought very little in the way of personal possessions with them, but they brought memories of the conflict and a sense of their identity. Aside from the shows and concerts where the children dressed in national costume, sang songs or performed dances from home, publications such as Amistad, one of the newsletters produced by the children themselves, were a means for them to remember. Conceived as an informative monthly publication, the newsletter contains pieces describing life in the Basque region, the bombing of Guernica, reflections on war and the journey on the Habana.

Amistad newsletter

The Special Collections at the Hartley Library, University of Southampton, holds archives for the Basque Children of ‘37 Association UK (MS 404), which was founded in 2002 to ensure that the legacy of the Basque children was not forgotten, together with small collections relating to Basque child refugees (MS 370) that have come from individuals. Further details on the collection can be found on the website at:

Archives Hub Themed Collection: Open Lives. The OpenLives project documented the experiences of Spanish migrants returning to Spain after settling in the UK. Researchers from the University of Southampton collected oral testimony, images and other ephemera.

All images copyright the Hartley Library, University of Southampton and reproduced with the kind permission of the copyright holder.

Archives Hub feature for April 2017

First edition of the Manchester Guardian, 1821.

The Guardian is one of Britain’s leading newspapers, with a long-standing reputation as a platform for Liberal opinion, and an international online community of 30.4 million readers. Founded in Manchester in 1821, it was created by John Edward Taylor, a cotton manufacturer. In the wake of the Peterloo massacre, the paper was intended as a means of expressing Liberal opinion and advocating political reform. Over the next 100 years, the paper originally known as the Manchester Guardian would be transformed from a small provincial journal into a paper of international relevance and renown.

The Guardian archive, donated to the University of Manchester John Rylands Library in 1971, consists of two main elements: the records of the newspaper as a business, and a very extensive collection of editorial correspondence and despatches from reporters. From April 2016 to March 2017, a project entitled ‘What The Papers Say’ was undertaken to catalogue the editorial correspondence of Charles Prestwich Scott, which contains nearly 13,000 items from over 1,300 correspondents.

Charles Prestwich Scott, 1931.

Charles Prestwich Scott (1846-1932) presided over the Manchester Guardian for 57 years, cementing the Liberal editorial philosophy of the paper, and ensuring a consistently high standard of journalism and journalistic integrity. He championed causes including women’s suffrage, home rule for Ireland, and the establishment of a Jewish homeland, and stood out against Britain’s policy in South Africa during the Boer war, and conscription during the First World War, supporting the formation of the League of Nations and negotiations for peace in Europe.

C.P. Scott’s editorial correspondence series contains letters exchanged with figures of historical importance and eminence in almost every imaginable field, from politics and economics, to history, science and the arts. These individuals often contributed articles to the paper, and met with the editor to discuss current events and affairs. Correspondents include politicians such as Herbert Asquith, David Lloyd George, Ramsay MacDonald and Winston Churchill, as well as Marion Phillips, first woman organiser of the Labour party, and Mary Agnes Hamilton, politician and broadcaster.

Excerpt of a letter to C.P. Scott from Winston Churchill, 9th May 1909, on interruptions to his speeches by Suffragists.

Campaigners for women’s suffrage are represented in the correspondence by Christabel and Emmeline Pankhurst, and Charlotte Despard, amongst many others.

Excerpt of a letter to C.P. Scott from Emmeline Pankhurst, 27th December 1910, on the death of her sister, Mary Jane Clarke.

The Liberal perspective of Scott and the Manchester Guardian can be seen in the interactions between Scott and Roger Casement, Irish nationalist, Rabindranath Tagore, poet and educationist, Emily Hobhouse, social activist and charity worker, Chaim Weizmann, Zionist, and social reformers Eleanor Rathbone and James Joseph Mallon. Scott creates a dialogue with these individuals about their fields of expertise, using the paper to provide a platform for the promotion of their views and causes.

The editors and proprietors of other newspapers are also featured in the correspondence, including William Maxwell Aitken, Lord Beaverbrook of the Daily Express, and James Louis Garvin of The Observer. Their correspondence includes discussion of current events and politics, and also expressions of admiration for Scott and the Manchester Guardian.

Literary figures also feature in the correspondence, such as George Bernard Shaw, John Galsworthy, William Butler Yeats, Harley Granville-Barker and Arthur Ransome. Prior to writing Swallows and Amazons, Ransome acted as a correspondent for the Manchester Guardian in Russia and Estonia, also writing a long running column for the paper on fishing.

In addition to occasional and expert contributors, there is a vast array of correspondence with members of staff of the paper, relating to editorial, technical, business and staffing concerns. These letters provide insight into the operation of a newspaper, alongside an impression of the colossal impact of events such as the First and Second World Wars.

Threaded through Scott’s correspondence, and the Guardian archive, there is also a real sense of the influence of the paper’s location in Manchester, and the significance of the Manchester Guardian in the history of the city. It can be seen in the approach to trade and industry, to the arts, and to education.

The centrality of trade and industry in Manchester meant that these subjects became a focal point of the Manchester Guardian. Such was the Manchester Guardian’s influence, that by 1920, Scott was able to employ the renowned economist John Maynard Keynes to produce a series of supplements for the Manchester Guardian Commercial on proposals for the reconstruction of Europe following the First World War.

Scott believed in the importance of producing a high quality of articles and reviews on the arts, and ensured coverage in the Manchester Guardian for literature, art, theatre and music. This would lead to a close relationship between the paper and Manchester’s resident symphony orchestra, the Hallé Orchestra. Scott would also become a supporter of the Whitworth Art Gallery, the Manchester Art Gallery, and of the production of Ford Madox Brown’s Manchester murals for the city’s town hall.

Manchester Guardian, 24th Oct 1921, p. 12.

Scott used the Manchester Guardian to champion the importance of access to education, evident in his work as a trustee of Owens College, which would become the University of Manchester. Scott was also one of the founders of Withington Girls School, established in 1890. This belief in the importance of education for women may be seen as an element of his more general perspective on women’s rights, which would lead to his influential support of the women’s suffrage movement.

For more information on the Guardian archive, and the collections held at the John Rylands Library, please visit:

Guardian News and Media Archive
The GNM Archive mainly holds records that relate to the Guardian since its move from Manchester to London in the 1960s (and some earlier records though the majority are held at the John Rylands Library, The University of Manchester).
Explore the Guardian News and Media Archive collections on the Archives Hub.

All images copyright The John Rylands Library, The University of Manchester and reproduced with the kind permission of the copyright holder.

Archives Hub feature for March 2017

The nuclear disarmament symbol, often known as the ‘peace sign’, is a modern icon, used by protestors and activists across the world and provoking powerful emotions. It is ubiquitous in fashion and youth culture, to be seen on clothing, jewellery, tattoos, even toiletries. Special Collections at the University of Bradford is home to the original sketches of this extraordinary design.

The symbol was designed in 1958 by Gerald Holtom, an artist based in Twickenham. It was intended for use on a march from London to the nuclear weapons research establishment at Aldermaston that Easter. The march was being organised by a small group of activists influenced by Gandhi’s ideas about nonviolent resistance; they had formed the Direct Action Committee against Nuclear War (DAC) the previous year in response to the testing of Britain’s first hydrogen bomb.

Photograph of the first Aldermaston March 1952. Image copyright: March photograph Cwl HBP (rights unknown).

In creating the visuals for the march, Holtom wanted to develop a symbol for the concept of nuclear disarmament. In a 1973 letter to Hugh Brock (editor of Peace News in 1958, active in the Direct Action Committee), Holtom remembered:

“I was in despair. Deep despair. I drew myself: the representative of an individual in despair, with hands palm outstretched outwards and downwards in the manner of Goya’s peasant before the firing squad. I formalised the drawing into a line and put a circle round it. It was ridiculous at first and such a puny thing …”.

The symbol also represented the semaphore signals for the letters N and D: Nuclear Disarmament.

Holtom sketched his design to meet the need of the moment; he did not expect the sketches to be of interest or preserved years into the future, and nor did many of his contemporaries. Among our other loans to the IWM, we see a letter from a fellow activist dated 10 March 1958; she rejected the use of the symbol, calling it ‘quite obscure’ and suggestive of ‘some Secret Society’.

Nuclear disarmament march sketch by Gerald Holtom. Image copyright: Cwl ND symbol drawing, courtesy of the Trustees of the Commonweal Collection, University of Bradford.

However, the march organisers were pleased with the design and it was used extensively on DAC literature thereafter. Reflecting huge public anxiety about nuclear testing and the arms race, the 1958 Easter march attracted much larger numbers and attention than previous protests directed at Aldermaston. Marchers, passers-by, readers of newspapers; all saw the symbol in action, on leaflets, flyers, song-sheets and banners. Its popularity was assured when later that year the Campaign for Nuclear Disarmament asked to adopt the symbol, and it has been synonymous with nuclear disarmament campaigns ever since. Easy to draw and to adapt, and hinting at other shapes and symbols (a missile, a tree …), the symbol was widely adopted by 1960s counter-cultural groups and came to symbolise peace and dissent more generally.

The original sketches remained with the papers of Hugh Brock. Following his death in 1985, these materials were given to the Commonweal Library, an independent public library, which stocks resources to help activists working for nonviolent social change. Commonweal is housed in the J.B. Priestley Library at the University of Bradford so, when the University set up its Special Collections service during the 2000s, it was natural for Commonweal to put their archival collections into the care of these specialist staff.

The sketches are among the most important objects held by Special Collections. There are four sketches, on three pieces of paper: two drawings of the shape and two illustrations of it in use on protest marches. Reproduction does not do these objects justice. In the flesh we see the weakness of the acidic paper, the cracking of the paint, and the wear and tear of storage and display.

2017 offers a rare chance to see these fragile originals on show. ‘People Power: fighting for peace’ will be on show at the IWM London from 23 March-28 August 2017. The sketches will take their place among hundreds of objects illustrating the stories of anti-war campaigners in Britain from 1917 to the present. Many of these stories can also be found through the Archives Hub.

Alison Cullingford, Special Collections Librarian, University of Bradford

Explore

Peace campaign archives in Special Collections at the University of Bradford, including:


Images copyright: Cwl ND symbol drawings courtesy of the Trustees of the Commonweal Collection. March songs Cwl DAC, march photograph Cwl HBP. Rights unknown. Article copyright: University of Bradford, shared under Creative Commons licence (CC BY-NC-SA). [Note that portions of this text have been adapted from existing blog posts and exhibition captions created by Special Collections.]

The back end of a new system usually involves a huge amount of work, and this was very much the case for the Archives Hub, where we changed our whole workflow and approach to data processing (see The Building Blocks of the new Archives Hub). But it is the front end that people see and react to. The website is a reflection of the back end, as well as involving its own user experience challenges, and for most of our users it is where the change becomes real.

We worked closely with Knowledge Integration in the development of the system, and with Gooii in the design and implementation of the front end, and Sero ran some focus groups for us, testing out a series of wireframe designs on users. Our intention was to take full advantage of the new data model and processing workflow in what we provided for our users. This post explains some of the priorities and design decisions that we made. Additional posts will cover some of the areas that we haven’t included here, such as the types of description (collections, themed collections, repositories) and our plan to introduce a proximity search and a browse.

Speed is of the Essence

Faster response times were absolutely essential and, to that end, an enterprise search platform (in this case Elasticsearch) was the starting point. However, in addition to the underlying search technology, the design of the data model and indexing structure had a significant impact on system performance and response times, and this was key to the architecture that Knowledge Integration implemented. With the previous system there was only the concept of the ‘archive’ (EAD document) as a whole, which meant that the whole document structure was always delivered to the user whatever part of it they were actually interested in, creating a large overhead for both processing and bandwidth. In the new system, each EAD record is broken down into many separate sections which are each indexed separately, so that the specific section in which there is a search match can be delivered immediately to the user.
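As a rough illustration of this idea (not the code Knowledge Integration wrote), the Python sketch below flattens a nested description into separate component records, each of which could be indexed as its own document. It assumes the EAD has already been parsed into nested dictionaries; the field names (`parent_id`, `ancestor_ids`, `level`), the sample data and the simple `search` function standing in for the search engine are all hypothetical.

```python
from typing import Any, Dict, Iterator, List, Optional

Node = Dict[str, Any]


def flatten_components(
    node: Node,
    archive_id: str,
    parent_id: Optional[str] = None,
    ancestor_ids: Optional[List[str]] = None,
) -> Iterator[Node]:
    """Yield one flat record per node of a nested description.

    Each record carries links to its parent and ancestors, so it can be
    stored and retrieved on its own without the rest of the archive.
    """
    ancestor_ids = ancestor_ids or []
    yield {
        "id": node["id"],
        "archive_id": archive_id,
        "parent_id": parent_id,
        "ancestor_ids": list(ancestor_ids),
        "level": len(ancestor_ids),  # 0 for the top-level description
        "title": node["title"],
        "text": node.get("text", ""),
    }
    for child in node.get("children", []):
        yield from flatten_components(
            child, archive_id, node["id"], ancestor_ids + [node["id"]]
        )


def search(components: List[Node], phrase: str) -> List[Node]:
    """Stand-in for the search engine: return only the matching components."""
    needle = phrase.lower()
    return [
        c for c in components
        if needle in (c["title"] + " " + c["text"]).lower()
    ]


# Hypothetical sample archive, already parsed from EAD into nested dicts.
SAMPLE_ARCHIVE: Node = {
    "id": "a1",
    "title": "Mill company records",
    "text": "Business papers of a textile firm.",
    "children": [
        {
            "id": "a1/1",
            "title": "Correspondence",
            "children": [
                {
                    "id": "a1/1/1",
                    "title": "Letter on the industrial revolution",
                    "text": "Discusses local factories.",
                },
            ],
        },
        {"id": "a1/2", "title": "Ledgers"},
    ],
}

components = list(flatten_components(SAMPLE_ARCHIVE, archive_id="a1"))
hits = search(components, "industrial revolution")
```

Because every record is flat, a match deep in the hierarchy comes back as one small document, however large the archive it belongs to.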

To illustrate this with an example:

A researcher searches for content relating to ‘industrial revolution’ and this scores a hit on a single item 5 levels down in the archive hierarchy. With the previous system the whole archive in which the match occurs would be delivered to the user and then this specific section would be rendered from within the whole document, meaning that the result could not be shown until the whole archive had been loaded. If the results list included a number of very large archives the response time increased accordingly.

In the new system, the matching single item ‘component’ is delivered to the user immediately, when viewed in either the result list or on the detail page, as the ability to deliver the result is decoupled from archive size. In addition, for the detail page, a summary of the structure of the archive is then built around the item to provide context and allow easy navigation.
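The idea of indexing each section separately can be sketched in a few lines. This is only an illustration in plain Python, not the Elasticsearch-based implementation that Knowledge Integration built; the `Component` class, the `flatten` and `search` functions and the sample archive are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """One unit of description (collection, series, item and so on)."""
    ref: str
    title: str
    children: list = field(default_factory=list)


def flatten(node, ancestors=()):
    """Yield one small, independently indexable record per component."""
    yield {
        "ref": node.ref,
        "title": node.title,
        "ancestors": list(ancestors),  # breadcrumb back to the top level
    }
    for child in node.children:
        yield from flatten(child, ancestors + (node.ref,))


def search(index, term):
    """Return only the matching component records, never the whole archive."""
    term = term.lower()
    return [doc for doc in index if term in doc["title"].lower()]


# Hypothetical archive: the match sits two levels below the top.
archive = Component("A", "Papers of a mill owner", [
    Component("A/1", "Correspondence", [
        Component("A/1/1", "Letter on the industrial revolution"),
    ]),
    Component("A/2", "Photographs"),
])

index = list(flatten(archive))
hits = search(index, "industrial revolution")
```

Because each record carries its own list of ancestors, a single hit can be shown straight away, and the surrounding tree can be assembled afterwards from the same index.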

Even with the improvements to response times, the tree representation (which does have to present a summary of the whole structure) takes a while to render for some very large multi-level descriptions, but the description itself always loads instantly. This means that the researcher can always see they have a result immediately and view it, and then the archival structure is delivered (after a short pause for very large archives), which gives the result context within the archive as a whole.

The system has been designed to allow for growth in both the number of contributors we can support and the number of end-users, and will also improve our ability both to syndicate the content to Archives Portal Europe and to deliver contributors’ own ‘micro sites’.

Look and Feel

Some of the feedback that we received suggested that the old website design was welcoming, but didn’t feel professional or academic enough – maybe trying to be a bit too cuddly. We still wanted to make the site friendly and engaging, and I think we achieved this, but we also wanted to make it more professional looking, showing the Hub as an academic research tool. It was also important to show that the Archives Hub is a Jisc service, so the design Gooii created was based upon the Jisc pattern library that we were required to use in order to fit in with other Jisc sites.

We have tried to maintain a friendly and informal tone along with use of cleaner lines and blocks, and a more visually up-to-date feel. We have a set of consistent icons, on/off buttons and use of show/hide, particularly with the filter. This helps to keep an uncluttered appearance whilst giving the user many options for navigation and filtering.

In response to feedback, we want to provide more help with navigating through the service, for those that would like some guidance. The homepage includes some ‘start exploring’ suggestions for topics, to help get inexperienced researchers started, and we are currently looking at the whole ‘researching’ section and how we can improve that to work for all types of users.

Navigating

We wanted the Hub to work well with a fairly broad search that casts the net quite widely. This type of search is often carried out by a user who is less experienced in using archives, or is new to the Hub, and it can produce a rather overwhelming number of results. We have tried to facilitate the onward journey of the user through judicious use of filtering options. In many ways we felt that filtering was more important than advanced search in the website design, as our research has shown that people tend to drill down from a more general starting point rather than carry out a very specific search right from the off. The filter panel is up-front, although it can be hidden/shown as desired, and it allows for drilling down by repository, subject, creator, date, level and digital content.
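As a rough illustration of drilling down from a broad search, the sketch below counts the values available for a filter and then narrows a hit list by the chosen values. It is a plain Python stand-in rather than the Hub's own code; the records, the field names and the `facet_counts` and `drill_down` helpers are hypothetical.

```python
from collections import Counter

# Hypothetical hit list; the field names echo some of the filters named above.
hits = [
    {"title": "Design drawings", "repository": "Repository A",
     "level": "collection", "digital": True},
    {"title": "Design notebooks", "repository": "Repository A",
     "level": "item", "digital": False},
    {"title": "Textile designs", "repository": "Repository B",
     "level": "collection", "digital": False},
]


def facet_counts(records, facet):
    """Count how many records fall under each value of one filter."""
    return Counter(record[facet] for record in records)


def drill_down(records, **chosen):
    """Keep only the records that match every chosen filter value."""
    return [r for r in records
            if all(r.get(name) == value for name, value in chosen.items())]


by_repository = facet_counts(hits, "repository")
narrowed = drill_down(hits, repository="Repository A", level="item")
```

Showing the counts next to each filter value lets a user see how much a choice will narrow the results before they make it.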

Another way that we have tried to help the end user is by using typeahead to suggest search results. When Gooii suggested this, we gave it some thought, as we were concerned that the user might think the suggestions were the ‘best’ matches, but typeahead suggestions are quite a common device on the web, and we felt that they might give some people a way in, from where they could easily navigate through further descriptions.

A search for ‘design’ with suggested results

The suggestions may help users to understand the sort of collections that are described on the Hub. We know that some users are not really aware of what ‘archives’ means in the context of a service like the Archives Hub, so this may help orientate them.

Suggested results also help to explain what the categories of results are – themes and locations are suggested as well as collection descriptions.

We thought about the usability of the hit list. In the feedback we received there was no clear preference for what users want in a hit list, and so we decided to implement a brief view, which just provides title and date, for the maximum number of results, and also an expanded view, with location, name of creator, extent and language, so that the user can get a better idea of the materials being described just from scanning through the hit list.

Expanded mode gives the user more information

With the above example, the title and date alone do not give much information, which is particularly common with descriptions of series or items, so the name of creator adds real value to the result.

Seeing the Wood Through the Trees

The hierarchical nature of archives is always a challenge; a challenge for cataloguing, processing and presentation. In terms of presentation, we were quite excited by the prospect of trying something a bit different with the new Hub design. This is where the ‘mini map’ came about. It was a very early suggestion by K-Int to have something that could help to orientate the user when they suddenly found themselves within a large hierarchical description. Gooii took the idea and created a number of wireframes to illustrate it for our focus groups.

For instance, if a user searches on Google for ‘conrad slater jodrell bank’ then they get a link to the Hub entry:

Result of a search on Google

The user may never have used archives, or the Archives Hub before. But if they click on this link, taking them directly to material that sits within a hierarchical description, we wanted them to get an immediate context.

Jodrell Bank Observatory Archives: Conrad Slater Files

The page shows the description itself, the breadcrumb to the top level, the place in the tree where these particular files are described and a mini map that gives an instant indication of where this entry is in the whole. It is intended (1) to give a basic message for those who are not familiar with archive collections – ‘there is lots more stuff in this collection’ and (2) to provide the user with a clearly understandable expanding tree for navigation through this collection.

One of the decisions we made, illustrated here, was to show where the material is held at every level, for every unit of description. The information is only actually included at the top level in the description itself, but we can easily cascade it down. This is a good illustration of where the approach to displaying archive descriptions needs to be appropriate for the Web – if a user comes straight into a series or item, you need to give context at that level and not just at the top level.
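Cascading the ‘held at’ information down the hierarchy might look something like the sketch below. It assumes a simple nested dictionary for the description; the `with_repository` function, the field names and the sample data are hypothetical and are not taken from the Hub's system.

```python
def with_repository(node, inherited=None):
    """Yield every unit of description with the repository filled in.

    The repository is recorded only at the top level, so each lower
    level inherits it from the nearest ancestor that states it.
    """
    held_at = node.get("repository") or inherited
    yield {"title": node["title"], "repository": held_at}
    for child in node.get("children", []):
        yield from with_repository(child, held_at)


# Hypothetical description: only the top level names the repository.
description = {
    "title": "Observatory archives",
    "repository": "Example University Library",
    "children": [
        {"title": "Staff files",
         "children": [{"title": "Files of one engineer"}]},
    ],
}

units = list(with_repository(description))
```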

The design also works well for searches within large hierarchical descriptions.

Search for ‘bicycles’ within the Co-operative Union Photographic Collection

The user can immediately get a sense of whether the search has thrown up substantial results or not. In the example above you can see that there are some references to ‘bicycles’ but only early on in the description. In the example below, the search for ‘frost on sunday’ shows that there are many references within the Ronnie Barker Collection.

Search within the Ronnie Barker Collection for ‘frost on sunday’

One of the challenges for any archive interface is to ensure that it works for experienced users and first-time users. We hope that the way we have implemented navigation and searching means that we have fulfilled this aim reasonably well.

Small is Beautiful

The Archives Hub on an iPhone

The old site did not work well on mobile devices. It was created before mobile became massive, and it is quite hard to retrospectively fit a design to be responsive to different devices. Gooii started out with the intention of creating a responsive design, so that it renders well on different sized screens. It requires quite a bit of compromise, because rendering complex multi-level hierarchies and very detailed catalogues on a very small screen is not at all easy. It may be best to change or remove some aspects of functionality in order to ensure the site makes sense. For example, the mobile display does not open the filter by default, as this would push the results down the page. But the user can open the filter and use the faceted search if they choose to do so.

We are particularly pleased that this has been achieved, as something like 30% of Hub use is on mobiles and tablets now, and the basic search and navigation need to be effective.

Devices used to view the Hub site over a three month period

In the above graph, the orange line is desktop, the green is mobile and the purple is tablet. (The dip around the end of December is due to problems setting up the Analytics reporting.)

Cutting Our Cloth

One of the lessons we have learnt over 15 years of working on the Archives Hub is that you can dream up all of the interface ideas that you like, but in the end what you can implement successfully comes down to the data. We had many suggestions from contributors and researchers about what we could implement, but oftentimes these ideas will not work in practice because of the variations in the descriptions.

We thought about implementing a search for larger, medium sized or smaller collections, but you would need consistent ‘extent’ data, and we don’t have that because archivists don’t use any kind of controlled vocabulary for extent, so it is not something we can do.

When we were running focus groups, we talked about searching by level – collection, series, sub-series, file, item, etc. For some contributors a search by a specific level would be useful, but we could only implement three levels – collection (or ‘top level’), item (which includes ‘piece’) and then everything between these, because the ‘in-between’ levels don’t lend themselves to clear categorisation. The way levels work in archival description, and the way they are interpreted by repositories, means we had to take a practical view of what was achievable.
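The three-way split described above could be expressed as a small function. This is a sketch only: the bucket names and the `search_level` function are hypothetical, and it assumes that ‘top level’ is decided by a unit's position in the hierarchy rather than by its label.

```python
def search_level(label, is_top_level):
    """Map an archival level onto one of three searchable buckets."""
    if is_top_level:
        return "collection"      # the top of any description
    if label.strip().lower() in {"item", "piece"}:
        return "item"            # 'piece' is treated as an item
    return "in-between"          # series, sub-series, file and so on


examples = [
    search_level("fonds", True),
    search_level("Piece", False),
    search_level("sub-series", False),
]
```

Anything that is neither the top level nor an item falls into the middle bucket, which avoids having to interpret how each repository uses labels such as ‘series’ or ‘file’.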

We still aren’t completely sold on how we indicate digital content, but there are particular challenges with this. Digital content can be images that are embedded within the description, links to images, or links to any other digital content imaginable. So, you can’t just use an image icon, because that does not represent text or audio. We ended up simply using a tick to indicate that there is digital content of some sort. However, one large collection may have links to only one or two digital items, so in that case the tick may raise false expectations. But you can hardly say ‘includes digital content, but not very much, so don’t get too excited’. There is room for more thought about our whole approach to digital content on the Hub, as we get more links to digital surrogates and descriptions of born-digital collections.

Statistics

The outward indication of a more successful site is that use goes up. The use of statistics to give an indication of value is fraught with problems. Does the number of clicks represent value? Might more clicks indicate a poorer user interface design? Or might they indicate that users find the site more engaging? Does a user looking at only one description really gain less value than a user looking at ten descriptions? Clearly statistics can only ever be seen as one measure of value, and they need to be used with caution. However, the reality is that an upward graph is always welcomed! Therefore we are pleased to see that overall use of the website is up around 32% compared to this period during the previous year.

Feedback

“The new site is wonderful. I am so impressed with its speed and functionality, as well as its clean, modern look.” (University Archivist)

“…there are so many other features that I could pick out, such as the ability to download XML and the direct link generator for components as well as collections, and the ‘start exploring’ feature.” (University Archivist)

Archives Hub feature for February 2017

Mobile Recording Van outside Hereford Cathedral, 1927.

THE FOUNDING OF THE ARCHIVE

The first gramophone records went on sale in England 120 years ago and five years later, in 1902, the first ever gramophone record by an English robed choir of gentlemen and boys was issued. Since then many thousands of recordings of our choirs have been produced and they represent a unique and priceless recorded legacy of these choirs, which are woven into the very fabric of our cultural and musical heritage.

For a country which takes such care of all aspects of its heritage, this is one area which has been woefully neglected and even the National Sound Archives contains only a small selection.

Having spent a lifetime associated with church music and choirs, I decided to start researching and collecting recordings. As this had never been undertaken there were no discographies to consult and in many instances the choirs themselves had only scant information on what they had recorded over the years.

After fifteen years of collecting and research the Archive of Recorded Church Music is acknowledged to be the definitive collection of recordings worldwide and acquisitions are constantly being added as more and more treasures are discovered.

THE RAISON D’ETRE OF THE ARCHIVE

The Archive seeks to preserve this cultural heritage for future generations from the very first gramophone record in 1902 to the latest new releases. The recordings in the Archive are ‘from choirs of gentlemen and boys singing in the English Cathedral tradition’ both Anglican and Roman Catholic, from Cathedrals, Abbeys and Minsters, Parish churches, Royal Peculiars (such as the Chapel Royal) Oxbridge chapel choirs, School chapel choirs and independent choirs.

Recording a CD in King’s College Chapel.

This uniquely English tradition became the blueprint for Anglican & RC choirs abroad, mainly in Canada, the USA, New Zealand and Australia and the Archive contains a representative selection of recordings from these ‘English’ foreign choirs.

THE RECORDINGS IN THE ARCHIVE

Every category of recording is represented in the Archive, whether it be a commercial issue from a major record company or a smaller independent company; or an in-house recording issued by the choir themselves for limited sale in their surrounding area; or a private recording of which only that one copy exists. Each category contains recordings on 78rpm records, reel-to-reel tapes and cassettes, mini-discs, vinyl records and CDs.

Commercial issues: From 1902 to the present day, every commercial issue is listed in the Archive’s Discography with over 90% being in the collection; the remaining 10% are still to be tracked down. Many small independent labels over the years have specialized in choir recordings and these form a substantial part of the collection.

Of the numerous smaller independent companies specializing in choir recordings, Abbey/Alpha was one of the most famous, owned by Harry Mudd, OBE. Listen to one of his vinyl records from the choir of All Saints, Margaret Street in London, a choir of legendary status in the history of church music: https://youtu.be/UBgki4dGicc?list=PLEv7ZfArXoUm9-1GkoVpHpMbVlzNbt5Om.

In-house recordings: These were commissioned by the choir themselves and usually on sale only in the local area, so therefore more difficult to discover. The Archive contains thousands of these recordings on every format and many of these choirs are now long gone, their legacy being their recording.

Choir of All Saints, Margaret Street in London, 1968.

As these recordings were commissioned by the choirs themselves they give an excellent representation of the different types of choirs and of choirs which would not have otherwise recorded.

Private recordings: Some of the rarest gems in the Archive are one-off copies of private recordings which were usually made by the choirmaster himself or an enthusiastic amateur. Some choirs are represented with a large archive of these recordings but for many it’s the only recording of that choir in existence and many of the private recordings are of choirs which no longer exist.

Choir of Magdalen College Oxford, 1973.

One of the choirs for which we have a large collection of private recordings is Magdalen College Oxford, under the legendary Bernard Rose. This particular recording is of Stanford’s Magnificat in C and Rose recalls Sir Walter Alcock, a friend of the composer, telling him of Stanford’s puzzlement at the speed at which most choirmasters took the Magnificat. In Rose’s and Alcock’s view, this is the speed Stanford wishes it to be sung: https://youtu.be/MHgjuhp74w8.

RADIO & TV BROADCASTS

A major part of the Archive consists of Radio and TV broadcasts which represent an important part of this choral heritage. The broadcasts consist of services, concerts, recitals and documentaries on choirs and church music and are in particular danger of being lost for ever, as tapes were regularly wiped by the broadcasting company to save space.

TV broadcast from York Minster, 1965.

This is especially true of BBC Choral Evensong broadcasts as the BBC has no broadcasts from before 1990. Over the years the Archive has gathered up almost 2000 Evensong broadcasts which provide a fascinating snapshot of the choir under the Director of Music at that moment in history. We regularly upload archive radio broadcasts and BBC Choral Evensong broadcasts to our YouTube channel at: https://www.youtube.com/c/archiveofrecordedchurchmusic.

LIBRARY AND PHOTOGRAPHIC ARCHIVE

This complementary collection has developed over the years with many thousands of photographs, newspaper and magazine articles, books; in fact, anything relating to choirs, choir schools and choristers and often provides invaluable background information to the recordings.

Visitors are always welcome to come and browse the archive and should you have any recordings of interest, please do get in touch and help preserve this unique and priceless recorded heritage: www.recordedchurchmusic.org.

Colin Brownlee, Archive of Recorded Church Music

All images copyright the Archive of Recorded Church Music and reproduced with the kind permission of the copyright holder.

Archives Hub feature for January 2017

Horrockses Miller and Co advertisement

Our large collection of business records relating to the Horrockses cotton firm was first deposited at Lancashire Archives in 1969, and has proved popular with researchers throughout the last half century. A recent funding award offered the opportunity to spend some time working on the earliest records in the collection, primarily those which date before 1887 when an amalgamation led to the formation of Horrockses Crewdson and Co.

John Horrocks was born in Edgworth, near Bolton, in 1768. His family operated a quarry in the area which was where Horrocks would first begin spinning cotton, selling the finished yarn in Preston. One of the earliest items within the Horrockses archive is a map showing the land owned by the family at Bradshaw, which clearly identifies a stone mill owned by John Horrocks Senior alongside a cotton mill owned by John Horrocks Junior. John Horrocks eventually moved his business to Preston, opening his first factory in 1791. As the business flourished additional factories would be built on the site, which collectively became known as the Yard Works.

Map showing the land and mills owned by the Horrocks family at Bradshaw

The company grew throughout the 19th century, and probably the most interesting material from this period relates to international trade. Horrockses Miller and Co had a number of agents throughout the world, in countries as diverse as Portugal, Mexico, India and China, and made arrangements not only to sell their cotton in these markets, but also to ship other goods for sale. This trade included the purchase of opium in India to be sold in China, where they would then purchase tea and silk to be brought back to the UK. Much of the correspondence also dates from a time of international conflict, and there are references to the Opium Wars, rebellions in India and Portugal and the Mexican-American war.

‘Mills re-opened’ bill poster, 1850s

The company was also involved in conflict much closer to home. The longest industrial dispute in Preston’s history took place between October 1853 and May 1854, and became known as the Preston Lock Out. During the 1840s cotton workers throughout Lancashire had suffered a 10-20% cut in their wages and they began to strike in efforts to have it reinstated. In retaliation the cotton masters locked the workers out of the mills denying them a living. As well as direct action, public opinion seems to have been central to the dispute, and the archive includes a collection of bill posters written from the viewpoint of both the striking workers and their employers.

Yet despite events such as these there was also much to be celebrated during this period, including the Preston Guild, an event dating back to the medieval period but which still takes place every twenty years. Horrockses Miller and Co would take the opportunity to publicise their goods, providing floats which would appear in the trade procession and building decorative Guild arches from cotton bales.

Decorative arch, made from cotton bales, as part of the Preston Guild

Heritage always seems to have been important to the company, which perhaps explains why we are fortunate to have such an extensive collection of surviving records. Advertising would celebrate the longevity of the firm both in terms of the date that they were established and the quality of the goods being produced. As the business moved into the 20th century they sought new sources of income, most notably with the launch of Horrockses Fashions in the late 1940s. It is this part of the business which is perhaps the most widely known, as the company began using their own cottons to produce off the peg dresses which would prove to be extremely fashionable. Designs would be sought from artists and designers including Pat Albeck, Graham Sutherland and Alastair Morton, and the Queen would famously wear Horrockses dresses on her first Commonwealth Tour.

Painting Ladies (DDHS 77)

We are currently fundraising to finish cataloguing the later records within the collection, which should help us to learn more about this important and famous period in the history of the company. To find out more or make a donation, please visit http://www.flarchives.co.uk/catalogue-horrockses.html.

This is the first post outlining what the Archives Hub team have been up to over the past 18 months in creating a new system. We have worked with Knowledge Integration (K-Int) to create a new back end, using their CIIM software and Elasticsearch, and we’ve worked with Gooii and Sero to create a new interface. We are also building a new EAD Editor for cataloguing. Underlying all this we have a new data workflow and we will be implementing this through a new administrative interface. This post summarises some of the building blocks – our overall approach, objectives and processes.

What did we want to achieve?

The Archives Hub started off as a pilot project and has been running continuously as a service aggregating UK archival descriptions since 1999 (officially launched in 2001). That’s a long time to build up experience, to try things out, to have successes and failures, and to learn from mistakes.

The new Hub aimed to learn lessons from the past and to build positively upon our experiences.

Our key goals were:

sustainability

extensibility

reusability

Within these there is an awful lot I could unpack. But to keep it brief…

It was essential to come up with a system that could be maintained with the resources we had. In fact, we aimed to create a system that could be maintained to a basic level (essentially the data processing) with less effort than before. This included enabling contributors to administer their own data through access to a new interface, rather than having to go through the Hub team. Our more automated approach to basic processing would give us more resource to concentrate on added value, and this is essential in order to keep the service going, because a service has to develop to remain relevant and meet changing needs.

The system had to be ‘future proof’ to the extent that we could make it so. One way to achieve this is to have a system that can be altered and extended over time; to make sure it is reasonably modular so that elements can be changed and replaced.

Key for us was that we wanted to end up with a store of data that could potentially be used in other interfaces and services. This is a substantial leap from thinking in terms of just servicing your own interface. But it is essential in the global digital age, and when thinking about value and impact, to think beyond your own environment and think in terms of opportunities for increasing the profile and use of archives and of connecting data. There can be a tension between this kind of objective of openness and the need to clearly demonstrate the impact of the service, as you are pushing data beyond the bounds of your own scope and control, but it is essential for archives to be ‘out there’ in the digital environment, and we cannot shy away from the challenges that this raises.

In pursuing these goals, we needed to bring our contributors along with us. Our aims were going to have implications for them, so it was important to explain what we were doing and why.

Data Model for Sustainability

It is essential to create the right foundation. At the heart of what we do is the data (essentially meaning the archive descriptions, although future posts will introduce other types of data, namely repository descriptions and ‘name authorities’). Data comes in, is processed, is stored and accessed, and it flows out to other systems. It is the data that provides the value, and we know from experience that the data itself provides the biggest challenges.

The Archives Hub system that we originally created, working with the University of Liverpool and Cheshire software, allowed us to develop a successful aggregator, and we are proud of the many things we achieved. Aggregation was new, and, indeed, data standards were relatively new, and the aim was essentially to bring in data and provide access to it via our Archives Hub website. The system was not designed with a focus on a consistent workflow and sustainability was something of an unknown quantity, although the use of Encoded Archival Description (EAD) for our archive collection descriptions gave us a good basis in structured data. But in recent years the Hub started to become out of step with the digital environment.

For the new Hub we wanted to think about a more flexible model. We wanted the potential to add new ‘entities’. These may be described as any real world thing, so they might include archive descriptions, people, organisations, places, subjects, languages, repositories and events. If you create a model that allows for representing different entities, you can start to think about different perspectives, different ways to access the data and to connect the data up. It gives the potential for many different contexts and narratives.

We didn’t have the time and resource to bring in all the entities that we might have wanted to include; but a model that is based upon entities and relationships leaves the door open to further development. We needed a system that was compatible with this way of thinking. In fact, we went live without the ‘People and Organisations’ entity that we have been working on, but we can implement it when we are ready because the system allows for this.
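A model built on entities and relationships can be sketched very simply. The code below is a minimal illustration and not the Hub's or K-Int's actual data model; the `Entity` and `EntityStore` classes, the relationship names and the sample records are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    """Any real world thing: a description, person, place, repository..."""
    id: str
    kind: str
    label: str


@dataclass
class EntityStore:
    entities: dict = field(default_factory=dict)
    relations: list = field(default_factory=list)

    def add(self, entity):
        self.entities[entity.id] = entity
        return entity

    def relate(self, subject_id, predicate, object_id):
        """Record a named relationship between two existing entities."""
        if subject_id not in self.entities or object_id not in self.entities:
            raise KeyError("both ends of a relationship must already exist")
        self.relations.append((subject_id, predicate, object_id))

    def connected_to(self, entity_id):
        """List (predicate, entity) pairs linked to one entity, either way."""
        found = []
        for subject_id, predicate, object_id in self.relations:
            if subject_id == entity_id:
                found.append((predicate, self.entities[object_id]))
            elif object_id == entity_id:
                found.append((predicate, self.entities[subject_id]))
        return found


store = EntityStore()
store.add(Entity("d1", "archive description", "Papers of an example firm"))
store.add(Entity("r1", "repository", "Example Record Office"))
store.add(Entity("p1", "person", "Example, Jane"))
store.relate("d1", "held by", "r1")
store.relate("d1", "created by", "p1")
```

A new kind of entity is just another value of `kind`, so it can be added later without changing the structure, which is the sense in which such a model leaves the door open.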

Entities within the Archives Hub system

The company that we employed to build the system had to be able to meet the needs of this type of model. That made it likely that we would need a supplier who already had this type of system. We found that with Knowledge Integration, who understood our modelling and what we were trying to achieve, and who had undertaken similar work aggregating descriptions of museum content.

Data Standards

The Hub works with Encoded Archival Description, so descriptions have to be valid EAD, and they have to conform to ISAD(G) (which EAD does). Originally the Hub employed a data editor, so that all descriptions were manually checked. This had the advantage of supporting contributors in a very 1-2-1 way, and working on the content of descriptions as well as the standardisation (e.g. thinking about what it means to have a useful title as well as thinking about the markup and format) and it was probably essential when we set out. But this approach had two significant shortcomings – content was changed without liaising with the contributor, which created version control issues, and manual checking inevitably led to a lack of consistency and non-repeatable processes. It was resource intensive and not rigorous enough.

In order to move away from this and towards machine based processing we embarked upon a long process, over several months, of discussing ‘Hub data requirements’. It sometimes led to brain-frying discussions, and required us to make difficult decisions about what we would make mandatory. We talked in depth about pretty much every element of a description; we talked about levels of importance – mandatory, recommended, desirable; we asked contributors their opinions; we looked at our data from so many different angles. It was one of the more difficult elements of the work. Two brief examples of this (I could list many more!):

Name of Creator

Name of creator is an ISAD(G) mandatory field. It is important for an understanding of the context of an archive. We started off by thinking it should be mandatory and most contributors agreed. But when we looked at our current data, hundreds of descriptions did not include a name of creator. We thought about whether we could make it mandatory for a ‘fonds’ (as opposed to an artificial collection), but there can be instances where the evidence points to a collection with a shared provenance, but the creator is not known. We looked at all the instances of ‘unknown’, ‘several’, ‘various’, etc. within the name of creator field. They did not fulfil the requirement either – the name of a creator is not ‘unknown’. We couldn’t go back to contributors and ask them to provide a creator name for so many descriptions. We knew that it was a bad idea to make it mandatory, but then not enforce it (we had already got into problems with an inconsistent approach to our data guidelines). We had to have a clear position. For me personally it was hard to let go of creator as mandatory! It didn’t feel right. It meant that we couldn’t enforce it with new data coming in. But it was the practical decision because if you say ‘this is mandatory except for the descriptions that don’t have it’ then the whole idea of a consistent and rigorous approach starts to be problematic.

Access Conditions

This is not an ISAD(G) mandatory field – a good example of where the standard lags behind the reality. For an online service, providing information about access is essential. We know that researchers value this information. If they are considering travelling to a repository, they need to be aware that the materials they want are available. So, we made this mandatory, but that meant we had to deal with something like 500 collections that did not include this information. However, one of the advantages of this type of information is that it is feasible to provide standard ‘boiler plate’ text, and this is what we offered to our contributors. It may mean some slightly unsatisfactory ‘catch all’ conditions of access, but overall we improved and updated the access information in many descriptions, and we will ask for it as mandatory with future data ingest.
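The two decisions above could be checked by machine along the following lines. This is a sketch under stated assumptions, not the Hub's actual validation: the field names, the boilerplate wording and the helper functions are hypothetical, and placing name of creator under ‘recommended’ is an assumption, since the only firm decision described is that it is not mandatory.

```python
REQUIREMENTS = {
    "access_conditions": "mandatory",  # decision described above
    "creator": "recommended",          # assumption: we only know it is not mandatory
}

# Values that do not count as the name of a creator.
NOT_A_NAME = {"unknown", "several", "various"}

# Hypothetical 'catch all' wording offered where nothing was supplied.
BOILERPLATE_ACCESS = "Please contact the repository for access conditions."


def check(description):
    """Return (importance, field) pairs for every requirement not met."""
    problems = []
    for field_name, importance in REQUIREMENTS.items():
        value = (description.get(field_name) or "").strip()
        if field_name == "creator" and value.lower() in NOT_A_NAME:
            value = ""
        if not value:
            problems.append((importance, field_name))
    return problems


def is_acceptable(description):
    """A description is rejected only when a mandatory field is missing."""
    return not any(importance == "mandatory"
                   for importance, _ in check(description))


def with_boilerplate(description):
    """Return a copy with standard access text added if none was given."""
    filled = dict(description)
    if not (filled.get("access_conditions") or "").strip():
        filled["access_conditions"] = BOILERPLATE_ACCESS
    return filled


no_creator = {"access_conditions": "Open by appointment.", "creator": "unknown"}
no_access = {"creator": "Example, Jane"}
```

Here `no_creator` is accepted with a note about the creator, while `no_access` is rejected until the boilerplate text is applied.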

Normalizing the Data

Our rather ambitious goal was to improve the consistency of the data, by which I mean reducing variation, where appropriate, with things like date formats, name of repository, names of rules or source used for index terms, and also ensuring good practice with globally unique references.

To simplify somewhat, our old approach led us to deal with the variations in the data that we received in an ad hoc way, creating solutions to fix specific problems; solutions that were often implemented at the interface rather than within the back-end system. Over time this led to a messy level of complexity and a lack of coherence.

When you aggregate data from many sources, one of the most fundamental activities is to enable it to be brought together coherently for search and display, so you are often carrying out some kind of processing to standardise it. This can be characterised as simple processing and complex processing:

1) If X then Y

2) If X then Y or Z depending on whether A is present, and whether B and C match or do not match and whether the contributor is E or F.

The first example is straightforward; the second can get very complicated.
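To make the contrast concrete, here is a minimal sketch of the two kinds of rule. The names are hypothetical, and the way the conditions are combined in the second function is invented purely for illustration; the point is how quickly the branches multiply.

```python
def simple_rule(value):
    """Rule 1: if X then Y. One condition, one outcome."""
    return "Y" if value == "X" else value


def complex_rule(value, a_present, b, c, contributor):
    """Rule 2: if X then Y or Z, depending on A, on whether B and C match,
    and on whether the contributor is E or F.

    The particular combination of conditions below is made up for the example.
    """
    if value != "X":
        return value
    if a_present and b == c:
        return "Y"
    if contributor == "E":
        return "Y"
    return "Z"
```

Each extra condition adds paths that have to be tested, which is why rules of the second kind become hard to change safely.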

If you make these decisions as you go along, then after so many years you can end up with a level of complexity that becomes rather like a mass of lengths of string that have been tangled up in the middle – you just about manage to ensure that the threads in and out are still showing (the data in at one end; the data presented through the interface the researcher uses at the other) but the middle is impossible to untangle and becomes increasingly difficult to manage.

This is eventually going to create problems for three main reasons. Firstly, it becomes harder to introduce more clauses to fix various data issues without unforeseen impacts. Secondly, it is almost impossible to carry out repeatable processes. Thirdly (and really as a result of the other two), it becomes very difficult to provide the data as one reasonably coherent, interoperable set of data for the wider world.

We needed to go beyond the idea of the Archives Hub interface being the objective; we needed to open up the data, to ensure that contributors could get the maximum impact from providing the data to the Archives Hub. We needed to think of the Hub not as the end destination but as a means to enable many more (as yet maybe unknown) destinations. By doing this, we would also set things up for if and when we wanted to make significant changes to our own interface.

This is a game changer. It sounds like the right thing to do, but the problem is that it meant tackling the descriptions we already had on the Hub to introduce more consistency. Thousands of descriptions with hundreds of thousands of units created over time, in different systems, with different mindsets, different ‘standards’, different migration paths. This is a massive challenge, and it wasn’t possible for us to be too idealistic; we had to think about a practical approach to transforming and creating descriptions that makes them more re-usable and interoperable. Not perfect, but better.

Migrating the Data

Once we had our Hub requirements in place, we could start to think about the data we currently have, and how to make sure it met our requirements. We knew that we were going to implement ‘pipelines’ for incoming data (see below) within the new system, but that was not exactly the same process as migrating data from old world to new, as migration is a one-off process. We worked slowly and carefully through a spreadsheet, over the best part of a year, with a line for each contributor. We used XSLT transforms (essentially scripts to transform data). For each contributor we assessed the data and had to work out what sort of processing was needed. This was immensely time-consuming and sometimes involved complex logic and careful checking, as it is very easy with global edits to change one thing and find knock-on effects elsewhere that you don’t want.

The migration process was largely done through use of these scripts, but we had a substantial amount of manual editing to do, where automation simply couldn’t deal with the issues. For example:

dates such as 1800/190, 1900-20-04, 8173/1878

non-unique references, often the result of human error

corporate names with surnames included

personal names that were really family names

missing titles, dates or languages
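Automation could not repair cases like these, but a script can at least flag them for manual editing. The sketch below is a hedged example rather than the checks we actually ran: it assumes normalised dates are written as YYYY, YYYY-MM or YYYY-MM-DD, optionally as a start/end range separated by a slash, and the function names and the latest_year cut-off are hypothetical.

```python
import re
from collections import Counter
from datetime import date

# One date: YYYY, YYYY-MM or YYYY-MM-DD (an assumed normalised form).
_PART = re.compile(r"^(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?$")


def _parse_part(text):
    """Return (year, month, day) for a well-formed calendar date, else None."""
    match = _PART.match(text)
    if not match:
        return None
    year = int(match.group(1))
    month = int(match.group(2) or 1)
    day = int(match.group(3) or 1)
    try:
        date(year, month, day)  # rejects month 20, day 31 in April, etc.
    except ValueError:
        return None
    return (year, month, day)


def date_needs_review(value, latest_year=2100):
    """Flag a date or start/end range that a person should look at."""
    parts = [_parse_part(p) for p in value.split("/")]
    if len(parts) > 2 or any(p is None for p in parts):
        return True
    if any(p[0] > latest_year for p in parts):
        return True
    # A range must not end before it starts.
    return len(parts) == 2 and parts[0] > parts[1]


def duplicate_references(references):
    """Return the references that occur more than once, sorted."""
    counts = Counter(references)
    return sorted(ref for ref, n in counts.items() if n > 1)
```

All three example dates above are flagged: the first has a three-digit year, the second has a month of 20, and the third starts in the year 8173.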

When working through manual edits, our aim was to liaise with the contributor, but in the end there was so much to do that we made decisions that we thought were sensible and reasonable. Being an archivist and having significant experience of cataloguing made me feel qualified to do this. With some contributors, we also knew that they were planning a re-submission of all their descriptions, so we just needed to get the current descriptions migrated temporarily, and a non-ideal edit might therefore be fine just for a short period of time. Even with this approach we ended up with a very small number of descriptions that we could not migrate for the go-live date because we needed more time to figure out how to get them up to the required standard.

Creating Pipelines

Our approach to data normalization for incoming descriptions was to create ‘pipelines’. More about this in another blog post, but essentially, we knew that we had to implement repeatable transformation processes. We had data from many different contributors, with many variations. We needed a set of pipelines so that we could work with data from each individual contributor appropriately. The pipelines include things like:

fix problems with web links (where the link has not been included, or the link text has not been included)

Of course, for many contributors these processes will be the same – there will be a default approach, but we will sometimes need to vary the pipelines as appropriate for individual contributors. For example:

add access information where it is not present

use the ‘alternative reference’ (created in Calm) as the main reference
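The idea of a default pipeline with per-contributor variations can be sketched as a list of small, repeatable steps. This is an illustration only, not the administration interface K-Int have built: descriptions are represented as plain dictionaries, and every field name, contributor key and the placeholder access text are assumptions made for the example.

```python
# Placeholder only; the real boilerplate would be agreed with the contributor.
DEFAULT_ACCESS_TEXT = "[standard access statement]"


def fix_web_links(record):
    """Use the URL as the link text when the link text is missing.
    (A link with no URL at all would still need manual attention.)"""
    links = [
        {**link, "text": link.get("text") or link.get("url", "")}
        for link in record.get("links", [])
    ]
    return {**record, "links": links}


def add_access_information(record):
    """Add boilerplate access text only where none is present."""
    if record.get("access"):
        return record
    return {**record, "access": DEFAULT_ACCESS_TEXT}


def use_alternative_reference(record):
    """Promote the alternative reference to the main reference, if there is one."""
    alt = record.get("alternative_reference")
    return {**record, "reference": alt} if alt else record


# Every contributor gets the default steps; some get extra ones.
DEFAULT_PIPELINE = [fix_web_links]
CONTRIBUTOR_PIPELINES = {
    "contributor-a": DEFAULT_PIPELINE + [add_access_information],
    "contributor-b": DEFAULT_PIPELINE + [use_alternative_reference],
}


def run_pipeline(record, contributor):
    """Apply the contributor's steps in order, returning a new record."""
    steps = CONTRIBUTOR_PIPELINES.get(contributor, DEFAULT_PIPELINE)
    for step in steps:
        record = step(record)
    return record
```

Because each step returns a new record and leaves its input untouched, the same pipeline can be re-run on a fresh submission and give the same result, which is the repeatability that ad hoc fixes lacked.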

We will be implementing these pipelines in our new world, through the administration interface that K-Int have built. We’re just starting on that particular journey!

Conclusion

We were ambitious, and whilst I think we’ve managed to fulfill many of the goals that we had, we did have to modify our data standards to ‘lower the bar’ as we went along. It is far better to set data standards at the outset as changing them part way through usually has ramifications, but it is difficult to do this when you have not yet worked through all the data. In hindsight, maybe we should have interrogated the data we have much more to begin with, to really see the full extent of the variations and missing data…but maybe that would have put us off ever starting the project!

The data is key. If you are aggregating from many different sources, and you are dealing with multi-level descriptions that may be revised every month, every year, or over many years, then the data is the biggest challenge, not the technical set-up. It was essential to think about the data and the workflow first and foremost.

It was important to think about what the contributors can do – what is realistic for them. The Archives Hub contributors clearly see the benefits of contributing and are prepared to put what resources they can into it, but their resources are limited. You can’t set the bar too high, but you can nudge it up in certain ways if you give good reasons for doing so.

It is really useful to have a model that conveys the fundamentals of your data organisation. We didn’t apply the model to the environment; we created the environment from the model. A model that can be extended over time helps to make sure the service remains relevant and meets new requirements.