We all think we know where libraries are going. In the print age, libraries created collections for Australians in their local areas to learn from, and enjoy. They aspired to be well-resourced even when not funded to be. They also appeared to conform to a pattern of familiarity in the web age. Their services seem to be the same from one building to the next or one web site to the next. But there are a few differences worth noting, resulting in discussions now blossoming world-wide about shared library services on the web.

A network of libraries:

public libraries

school libraries

state and territory libraries

university libraries

national library

With a broad brush, let’s start with public libraries. They are often the first point of contact for a new member of the population. Local councils provide reading programs to develop reading ability. Sometimes young children slip through that net – their parents or carers are too busy to take them to the library. But all is not lost, as school libraries step into the breach. Schools focus on literacy, and their libraries provide supporting resources. They explain what research is and what a citation is. Some state governments deliberately connect school and public library spaces to ensure that local populations in regional and remote areas have access to a wider set of resources. However, some students can successfully avoid a school library.

Public libraries can still assist, and some are closely connected to their state and territory library services. (For example, the State Library of Tasmania manages all public libraries.) Public libraries work hard to overcome the digital divide, by providing technology such as personal computers for web access, or games, training, or just spaces for interaction or quiet reflection. Often you have to be a local ratepayer to use these services, so state and territory libraries provide these services too, which takes some of the pressure off public libraries situated in our metropolitan cities. But the services are usually only available to be booked for short timeframes. People still slip through the net, and can reach university without much understanding of what libraries do or how to use them.

One of my colleagues, who is a university librarian, addresses this issue by delivering what he refers to as ‘cod-liver oil for undergraduates’ – information literacy programs which address retrieval and access, ethical use of information and plagiarism. [Andrew Wells, the 12th Biennial VALA conference, www.vala.org.au/vala2004/2004pdfs/62WeCaTa.PDF]. He argued back in 2004 that there was a much greater demand for this type of introduction to library services because many more library services and learning applications were online, and undergraduates weren’t then of the digital native type.

State and territory libraries pick up the threads in this area, as does the National Library. The National Library also offers training programmes to small groups of people. We are all part of the information network, no matter the size of the community we serve.

Scaling Collections

popular fiction

(reference works)

local studies collections

rare, unique collections

the intellectual output of the nation

The same sort of scale is true for content collection.

Public libraries provide popular fiction, much of it not Australian. As Kevin [Hennah] said, they are throwing out their non-fiction, mostly reference books, because people are finding what they need in Google and its various flavours, or in Wikipedia. However, some public libraries in Australia have wonderful local studies collections, with private resources donated by the general public, or an emphasis on acquiring works compiled by locals. They may or may not work in tandem with historical societies or genealogy societies.

All state and territory libraries have a mandate as a collecting library to acquire the publishing of their jurisdiction. They also have wonderful rare and unique collections, which are a source of identity and history, and which support personal research. They make up the literacy gap, even without large offerings of popular fiction. Cate [Richmond] referred to this depth in her talk this morning.

The National, State and Territory libraries are also weeding their browseable reference collections. They do, however, fill in gaps to make resources available where possible. The National Library collects the intellectual output of the nation, in as many formats as possible. It collects all works published in Australia. It also has a major aim to raise awareness of this, and we do this through our online services.

Fortunately, the advent of the web has allowed libraries to become far more collaborative than they were in the print-only age. Sometimes you see this, sometimes you don’t. We are all part of the national information infrastructure.

You might think in the web age that the amount of work we have to do on print collections is dropping. On the contrary, it has not decreased. This snapshot shows what happened in 2009.

The figures do not include new issues of journals already collected; the serials figure refers to the number of new journal titles. The figures do include ebooks which have been lodged with us in print form. The figures do not include web sites we harvest.

We harvest both selected web sites, and for the last four years, the whole of the Australian domain. By the domain, I mean any site whose domain URL ends with .au, although we do pick up others if we know about them. The domain harvests are not yet available to the general public as we are waiting for the copyright legislation to catch up. (TB stands for terabytes.)
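The scope rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the harvesting software the Library actually runs: the function name `in_harvest_scope` and the allow-list `KNOWN_EXTRA_HOSTS` (standing in for the "others if we know about them") are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical allow-list: sites outside .au that are known to be Australian.
KNOWN_EXTRA_HOSTS = {"example-australian-site.org"}


def in_harvest_scope(url: str) -> bool:
    """Return True if the URL's host ends with .au or is on the allow-list."""
    # hostname is None when the URL has no scheme/host part; treat that as out of scope.
    host = (urlsplit(url).hostname or "").lower()
    return host.endswith(".au") or host in KNOWN_EXTRA_HOSTS
```

Checking the host name rather than the whole URL means a path such as `/au` on a foreign site is not mistaken for an Australian domain.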

PANDORA is our collaborative web archive where we keep selected web sites. The collaboration refers to the fact that most State libraries and other cultural heritage agencies participate in the harvesting of web sites in their jurisdictions. They use our infrastructure. The State Library of Tasmania operates independently, and the National Gallery of Australia has just joined PANDORA.

Why did we create PANDORA? Because web sites disappear – our culture has been disappearing. It’s the only place where you can find a record of the 2000 Olympics. These were the first Olympic Games websites to be captured. It’s the only place where you can see the rise and fall of political parties and politicians. In PANDORA we also collect about 130 blogs, and a few ebooks. Our culture is not just physical, it’s also digital.

Apart from PANDORA, what is happening in our virtual spaces? Even though state governments and local councils try to place physical services where they are needed, libraries have also taken their online responsibilities very seriously. Why would they do this? Because we are a large country with a small population. Because we believe in equity of access, not time-limited access. The expectations of the general public have changed – they want access to information immediately. They search differently – by skimming lots of resources rather than using fewer resources in depth.

The National Library hosts several discovery services to meet this need. We were aggregating information before the advent of the web, and have continued to do so since the web became dominant.

Libraries Australia is a subscription cost-recovery service which has been running for 30 years. It is built on the national bibliographic database, which is created by the national, state and territory libraries; all of our university libraries, a lot of public libraries, a few school libraries, and what we call special libraries within specialist communities – TAFEs, government departments, the health sector, law firms, and businesses. Libraries Australia contains records for 21 million items, mostly books and journal titles, with 46 million locations in Australia. We have shared these records with a global catalogue called WorldCat and in return Australian librarians can use WorldCat services too. Libraries Australia opened up its search interface for all Australians in 2006.

Picture Australia is ten years old. It contains information about, and links to, 1.8 million digitised images relating to Australia. Picture Australia was our first foray into significant collaboration with cultural agencies other than libraries. Its 100 contributors include archives, museums and local history societies. My goal was to create virtual galleries, so that if you couldn’t travel around Australia, you could still obtain an appreciation of the corpus of an Australian artist. However, the growth area has been photographs. More about that later.

Australian Research Online grew out of the ARROW project funded by the federal government for the higher education sector. It contains 400,000 links to academic research. Almost all of Australia’s universities now contribute, as well as some government departments and 30 separate e-journal titles. Music Australia focuses on information which can lead to digitised sheet music, performers and sound recordings. Since 2006 we have tried a few business models which would allow us to link to third party providers for the purchase of new music. The volatility of the music industry has meant mostly failure to date, but we haven’t given up and will keep exploring this.

The digitised historic newspapers service is the result of collaboration between the national, state and territory libraries. It began in 2005 when we identified ten key titles, essentially the major metropolitan daily from each capital city. We are digitising each of them from the time they started until 1954, which was the copyright cut-off date. Interestingly the list did not include the Sydney Morning Herald because the first paper in NSW was the Sydney Gazette, which started in 1803. The Fairfax Foundation has since funded us separately to digitise the SMH. Its first one hundred years are now included in the service. By the end of 2009, we had digitised 10 million articles in one million pages. We only have three million pages to go. The newspapers service is a bit different from the others because it gives you instant access to all the full-text. More about that later.

We have some other smaller services such as the Register of Australian Archives and Manuscripts, and Australia Dancing, which also contain unique resources. You can see that they are very format based, and while that may suit those who work in libraries, the general public may not care. We decided it was time to integrate these services. Welcome to Trove. We chose the name because it means “treasure trove” – “a collection of valuable or delightful things”. It is from the French verb “trouver” – to find, to discover. A prototype was released in May 2009, and many comments were received from the general public. It became a stable service in early November 2009.

Trove combines the content of all of the services I just mentioned. So you can start a search by searching across everything. Or you can select a collection view. So behind the pictures and photos view, you will find all of the Picture Australia images, plus records for images not yet digitised. Behind the music, sound and video view, you will find Music Australia content, plus audio books, and some of our sound files. The National Library is in the process of converting 40,000 hours of oral history and folklore recordings from analogue to digital. More than 50% of the collection has been digitised, but it will take another 11 years to finish. About 600 interviews are online. We welcome the introduction of broadband across the country, because this will help the content be more accessible.

In addition to Australian content, we have included sources which have made full text or links to full text available for free. For example, the University of Adelaide has digitised about 22,000 out-of-copyright literary works. We also harvest reviews from Amazon, full text works from the Open Library (which are digitised in the United States), and 20 million links to scholarly works from a service called OAIster. The service is now owned by the same organisation which hosts WorldCat, the global catalogue. We wanted to create a global experience with a local flavour.

We have done usability testing to make sure that we haven’t used library jargon in the service, and we intend to do more, as we haven’t finished building Trove. Behind the ‘diaries, letters, archives’ view, you will find manuscripts. While we who work in libraries know what manuscripts are, no-one in our test group knew, so we dropped the term. The public isn’t often interested in format types, but we wanted to make sure that if they were only used to searching in a catalogue for books, then the value of other types of material was also obvious.

Underneath the search box, you can choose one or more filters – Australian material only, online content only, or in libraries close to you. The latter only works if you have given yourself a signon and selected the libraries. More about that later.

The power of Trove is in searching for a topic. The sinking of the Centaur has been tagged, so you can go to the tags and click on it without having to repeat the search yourself.

We have opened Trove up to Google, so that the 89% of people who start their searching in Google will be redirected to Trove, especially for the Australian experience. Three years ago, the statistic was 82%.

Here is an example for Matthew Flinders. Trove takes you to the source, the heart of what you’re after. The results are presented according to the separate collection views, and are ranked within each view. You can choose to close down a view, but the number of results in each view will still be displayed for information.

And of course there are links to material about his cat, Trim.

You will see on the left-hand side of the screen that we are also doing real time searching of large global services such as Google and Amazon. We include Amazon reviews in displays of individual items.

For more contemporary creators such as Kate Grenville, we provide links to Wikipedia where appropriate, and will implement links in both directions between Wikipedia and Trove, as a way of promoting global reach for local authors.

But is Wikipedia authoritative enough? Despite the close control of editing, some don’t think so. The National Library has tried to address this.

One of the new areas of information for which we didn’t develop a separate interface was about people and organisations. We almost launched a separate service called People Australia, but instead decided to integrate the content into Trove. Here is the page for Anita Heiss.

In the centre of the screen is a string we call a persistent identifier. It has the word ‘party’ in it because party means a person or an organisation. This is a unique identifier for Anita Heiss. By citing this identifier, you will retrieve all of the resources which have been matched to that person. You can also use it to link to this record from your own service. For example, a site listing Queensland authors could link to this page in Trove for Anita Heiss. The matching is done firstly by a software program, and secondly, where there is no match to existing content, by one of my team.

Our People Australia initiative is still quite new, and we are still gathering in as many sources of information about Australians as we can. We have identified about 50 so far. We have already gathered in the Australian Dictionary of Biography Online, the Australian Women’s Register, and the Australian Parliamentary Library. They in return can receive unique identifiers for the people they send us the data about. If they already have their own unique identifiers, we keep those too.

Somerset College has been good at ensuring its publications are catalogued for all to find.

The screen on the right is part of a web site captured into our PANDORA archive for the 2002 Celebration of Literature. So if Andrew [Stark] decides to write a history of the Celebration of Literature, he could use PANDORA as a source.

Our People Australia initiative is being expanded in a new project this year, to provide unique persistent identifiers for every researcher in the higher education sector. Professor Roland Sussex at the University of Queensland already has a unique identifier, as shown here, due to his impressive publishing output.

By mid June next year, we expect to make available the tools and procedures for creating an identifier for every academic and post-graduate student in the country. We are being funded to do this work by the Department of Innovation, Industry, Science & Research, but the results will be shared with all Australians through Trove.

There were two main reasons for the National Library to integrate its collaborative services – firstly, we wanted maintenance of our services to be more efficient. Secondly, the more important reason was to expand on the ways the public can engage with library services. We’ve had some early successes with this. Picture Australia undertook two activities:

Expansion of trails to slideshows and

A collaboration with the photo-sharing service Flickr.

Trails were an early innovation to provide pathways into sets of images contained in Picture Australia, to help people understand the breadth of its content. We display the images in real time from contributing sites. Last year we extended this feature particularly for schools to turn them into slideshows – large images which can be shown in a classroom setting according to the themes identified in the national curriculum. We worked with The Learning Federation who produced matching Educational Value Statements.

In 2006, we established an arrangement with Flickr, the photo-sharing site hosted by Yahoo, to include contemporary images in Picture Australia. Each week we harvest records deposited in three Flickr pools we have set up, including OurTown, and People, Places and Events. The agencies participating in Picture Australia can then look at those to determine whether the work of a particular photographer should be included in their permanent collections.

We don’t need to collect 21 million photographs of the Sydney Harbour Bridge, but the images we do collect are copyright cleared. We also gather in the tags assigned to each image because keywords assigned by librarians may not be the same. The photograph of the pub illustrates another useful feature of the service, which is to provide the record of changes through time, or an historic event.

When we migrate the Picture Australia interface to Trove, we will be retaining all of these features.

But it is also our intention to allow the general public to create their own trails of interest – which we will call lists, so people will be able to include not just images, but information about books, people, music etc, and keep them. We are also considering how they will be shared with others, that is, you will be able to choose whether to share your lists or not.

The other service in which we have launched functions which people label ‘web 2.0’ or social networking is Australian digitised newspapers. The state and territory libraries have contributed to the project by providing microfilm copies of their newspapers.

Our goal with the project was to provide access to full text for searching. This means digitising pages, which we do from microfilm, then running some software called Optical Character Recognition to produce lists of words which are then indexed. In addition, the process generates articles, which is important especially if you are trying to browse through broadsheets. The OCR process has improved over the last 20 years, but not sufficiently to display every word well. Sometimes the quality of the microfilm, or the original newspaper, has been poor.

So what we have done is supplied the OCR’d text on the left hand side, and the original article on the right hand side. You can see that the word picnic appears with ls. You may also be able to see that someone has corrected the text of the article, and not bothered to correct the second occurrence of the word picnic. A good decision, because the article will still be found if searching on the word picnic.
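Why one corrected occurrence is enough can be seen from a minimal sketch of a word index. The sketch below is an illustration only and assumes a simple index that maps each word to the set of articles containing it; the function name `build_index`, the article id `a1`, the sample sentences and the garbled spelling `plcnlc` are all made up for the example, and the real service's indexing software is not described here.

```python
import re
from collections import defaultdict


def build_index(articles: dict[str, str]) -> dict[str, set[str]]:
    """Map each lower-cased word to the ids of the articles that contain it."""
    index: dict[str, set[str]] = defaultdict(set)
    for article_id, text in articles.items():
        for word in re.findall(r"[a-z]+", text.lower()):
            index[word].add(article_id)
    return dict(index)


# Hypothetical article text: raw OCR output, then the same text after a
# volunteer has corrected only the first occurrence of the word.
ocr_text = {"a1": "The annual plcnlc was held. The plcnlc was well attended."}
corrected = {"a1": "The annual picnic was held. The plcnlc was well attended."}

before = build_index(ocr_text).get("picnic", set())   # article not found
after = build_index(corrected).get("picnic", set())   # article found
```

Because the index records only whether a word occurs in an article, the uncorrected second occurrence does not stop a search on picnic from finding it.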

The text correctors have now corrected more than nine million lines. They are volunteers. We use some contractors to quality assure the metadata for each article, but we did not have the staff resources to correct text, so instead we built the functionality and just left it on the site.

We have a Text Correctors Hall of Fame on the Trove website, and here I’ve shown the top 5, including Queenslanders the Mulcahys. Correcting text is a serious game [Jeff Brand] – people volunteer because they like to contribute to a national service. You don’t have to obtain a sign-on, you can correct text anonymously. But if you do self-register, we keep a list for you of all the text corrections you make, as well as any tags you might add to the articles, or comments you make.

We will also be migrating all of this functionality into Trove. It will be attached to the sign-on which you may have already given to yourself when registering in the separate newspapers service.

I can see all comments I have added, all text corrected, all tags added, and also, set the libraries I am interested in. You can set as many libraries as you like, and when there is a match with a work you are searching for, it will rank those before others in the results.
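One way to picture this ranking is as a stable sort that moves works held by your chosen libraries to the front while leaving the existing relevance order otherwise untouched. The sketch below is an assumption about how such a preference could work, not a description of Trove's ranking code; `rank_results`, the `holdings` field and the library codes are hypothetical.

```python
def rank_results(results: list[dict], my_libraries: list[str]) -> list[dict]:
    """Put works held by any of the user's chosen libraries first.

    sorted() is stable, so works keep their original relative order
    within the 'held' group and within the 'not held' group.
    """
    chosen = set(my_libraries)
    # The key is False (sorts first) when at least one chosen library holds the work.
    return sorted(results, key=lambda work: not (chosen & set(work["holdings"])))


# Hypothetical search results, already in relevance order.
results = [
    {"title": "A", "holdings": ["NLA"]},
    {"title": "B", "holdings": ["SLQ"]},
    {"title": "C", "holdings": ["SLQ", "NLA"]},
]
ranked_titles = [work["title"] for work in rank_results(results, ["SLQ"])]
# ranked_titles is ["B", "C", "A"]
```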

Communities are forming around the creation of tags. For example, there is a railway historical society whose members are tagging all articles they find about railway lines which no longer exist.

User Engagement Success

User engagement with the Australian Newspapers service is very successful, and with Trove it is proving likewise, as you can see. These were the counts at 5 February this year. There were 3,629 searches in an hour, more than 43,000 lines of newspaper articles corrected, and so on.

You can see lists of each of the searches, comments and so on behind these by clicking on these links. The one I haven’t yet talked about is the splitting and merging. This refers to a bibliographic function. If you decide that two records are describing the same item, then you can merge them.

The rules for doing this pop up in situ. The merge is not binding in the database where the records are stored, it only affects the display, effectively removing duplicates for others. And the function is possible in reverse – if you feel that someone has made a mistake, then you can split them.
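The idea of a merge that changes only the display can be sketched as a separate grouping layer kept alongside the stored records. This is a minimal illustration under that assumption and not Trove's actual design; the class `DisplayGroups`, its method names and the record ids are hypothetical.

```python
from itertools import count


class DisplayGroups:
    """Keeps merge and split decisions apart from the stored records."""

    def __init__(self, records: dict[str, str]):
        self.records = dict(records)  # stored records; never modified below
        self._next_id = count()
        # Every record starts in a display group of its own.
        self.group_of = {rid: next(self._next_id) for rid in self.records}

    def merge(self, a: str, b: str) -> None:
        """Show everything grouped with b together with a."""
        keep, drop = self.group_of[a], self.group_of[b]
        for rid in list(self.group_of):
            if self.group_of[rid] == drop:
                self.group_of[rid] = keep

    def split(self, rid: str) -> None:
        """Undo a mistaken merge by giving rid a fresh group of its own."""
        self.group_of[rid] = next(self._next_id)

    def display(self) -> list[list[str]]:
        """Return the record ids as they would be grouped on screen."""
        groups: dict[int, list[str]] = {}
        for rid, group in self.group_of.items():
            groups.setdefault(group, []).append(rid)
        return sorted(groups.values())


# Hypothetical records for three catalogue entries.
catalogue = DisplayGroups({"r1": "first record", "r2": "second record", "r3": "third record"})
catalogue.merge("r2", "r3")
merged_view = catalogue.display()
catalogue.split("r3")
split_view = catalogue.display()
```

Because only the grouping changes, the contributing library's own record is left exactly as it was supplied, and a split simply restores the separate display.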

If I go back to my Matthew Flinders result, and click on a work such as Matthew Flinders, navigator and chartmaker, you will see three versions of the work. The first one is held at 32 locations in Australia, the second at 17, the third in only two.

By looking at the title, it’s possible that the second and third are actually the same book. You can click on each title, and find out more. But you may also not care, so at the highest level, we have grouped all variations under the name of the work.

Each work has been given a unique Trove identifier, so you can cite it for further use. I’ll leave it to you to go and decide whether two of the records need to be merged.

You will also notice on the left hand side of the screen that we have provided some facets for browsing. As you click in each, they expand, so a decade will become each separate year where applicable. The facets are an alternative to advanced searching.

It is possible to add comments to the work, or to a version or edition, which are in the style of literary reviews. This is useful for publishers to consider whether they should reprint a work, whether they are traditional publishers or a literary centre helping authors to self-publish online.

In the short time that the service has been in full production mode, we’ve seen some interesting behaviours already. For example, rather than relying on librarians to add purchase information, authors and editors are doing it for themselves. They include information about book launches. Similarly, if they keep the links to their online works up-to-date, then the usefulness of the information in Trove will remain strong.

These actions will become more important as more authors self-publish. A small article entitled Self-publishing upstages a card in The Australian recently suggested that people were providing copies of their own books instead of a business card as a way of indicating their expertise [Author Solutions Inc.]. I think there is an interesting role here for public libraries as well – branching out into support for publishing by individuals, and associating that with print-on-demand services including for out-of-copyright works.

Of course, we still welcome corrections to metadata. Next to the image is the information provided by the State Library of Queensland.

Underneath it is what a member of the public has provided. One of my challenges is to ensure that such commentary is able to be reused by libraries which have contributed their content to Trove. This is an example of where libraries can share web services.

We’ve not had any serious issues with these additions so far. The online newspaper experience has repaid our trust in what the public is willing to do. We have been listening to others in the libraries, galleries, archives, museums sector who also have experience in crowd-sourcing. Their view is that Wikipedia is almost full, and volunteers are willing to work on other significant services.

Here is some commentary sent directly to us, which illustrates Trove's role in sharing Australia's identity globally.

So the National Library has facilitated a new service by building on previous collaborations, built by Australians, for Australians. We want to make as much content as possible available in real time, preferably Australian. We are also asking readers to be authors [as opined by James Moloney] using Trove as a tool.

Pathways to local services are particularly important – for example not all information about Queensland or Queenslanders is held in or looked after by Queensland agencies.

In the first few months of the service, without the marketing campaign being launched, Trove has had more than 450,000 unique visitors. We will continue to enhance the service to encourage its growth, and expect to provide a lot more journal content.

What you can do

lobby for more digitisation of Australian content

show a colleague how to use Trove

send us your full, frank feedback

if you always use another favourite portal, ask the host agency to connect up with Trove

Seek out those wonderful local history collections, and encourage grant-writing for their digitisation. Lobby your local politicians to support public libraries which look after regional newspapers. Please let us know what you think about the service, and let your favourite portal know to contact us, so we can work out the best way to hook it up to Trove.

If you would like to find out more about how Trove changes with each new release, please go to the Site News tab in the service.

And just a reminder that Trove is not all about text; access is also provided to some wonderful music. Just click on:

It's hot in Brisbane but it's Coolangatta words and music by Claude Carnell http://trove.nla.gov.au/work/8775138, then listen to http://malcolm.screensound.gov.au/olcmedia/audio/00015646.mp3 (also available via http://trove.nla.gov.au/work/11181757)