Montana State Library Adopts the Internet Archive As Its Institutional Repository

The Montana State Library (MSL) has just completed moving 3,070 born-digital state publications from OCLC’s CONTENTdm to the Internet Archive (IA). This is a key piece of MSL’s institutional repository for state publications, which is now hosted by IA. The born-digital state publications are integrated with two other pieces of MSL’s institutional repository at IA:

IA is hosting, as part of its free library, a growing number of Montana state publications newly digitized by IA under contract to MSL. This digitization project will take several years to complete given staffing limits. Nine thousand state publications and nearly one million pages have now been digitized. Ultimately, MSL expects to digitize 55,000 print items dating from the 1870s.

MSL has contracted with IA’s Archive-It team to crawl and archive state agency web sites. This partnership has significantly improved MSL’s capture of state publications compared with the previous manual methods. For example, the archived publications are accessible by full-text search, with publications up to 10 MB indexed. MSL has branded the web archive piece of its institutional repository as Archive Montana. Because web pages and linked state publications are crawled regularly, Archive Montana also provides a history of state publications in their web context.

As MSL began evaluating whether to commit the born-digital piece to IA, MSL staff already knew that we could link from state publication MARC records to IA display pages. During the evaluation, we learned that the metadata exposed on the display pages comes from a meta.xml file that is uploaded, integrated, and stored at IA with each digital object. The meta.xml files are crosswalked from MSL’s MARC records in two steps. In the first step, MSL builds the meta.xml with the customized fields it wants to appear on the display pages. In the second step, IA puts the finishing touches on the meta.xml to ensure standard fields are included. The content of the meta.xml is then indexed for search. So, search at IA is metadata driven, with full-text search available in PDFs and the Read Online format.
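For readers curious what the first step of the crosswalk might look like, here is a minimal sketch in Python. It is an illustration only, not MSL’s actual script: the MARC tags chosen, the element names they map to, the `metadata` root element, the sample record, and the identifier are all assumptions made for the example. IA’s second step, adding standard fields, is not modeled.

```python
import xml.etree.ElementTree as ET

# Hypothetical crosswalk: MARC tag -> meta.xml element name.
# These pairings are assumptions for illustration, not MSL's real mapping.
CROSSWALK = {
    "245": "title",
    "100": "creator",
    "260": "publisher",
    "650": "subject",
}


def build_meta_xml(identifier, marc_fields):
    """Build a meta.xml string for one item.

    marc_fields maps a MARC tag to a list of string values, so
    repeatable fields (such as subjects) become repeated elements.
    """
    root = ET.Element("metadata")  # root element name is assumed
    ET.SubElement(root, "identifier").text = identifier
    for tag, element_name in CROSSWALK.items():
        for value in marc_fields.get(tag, []):
            ET.SubElement(root, element_name).text = value
    return ET.tostring(root, encoding="unicode")


# Made-up sample record and identifier.
record = {
    "245": ["Example annual report"],
    "650": ["Water quality", "Montana"],
}
meta_xml = build_meta_xml("example-item-0001", record)
print(meta_xml)
```

Tags missing from a record are simply skipped, so the resulting file carries only the fields the library chose to populate.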

MSL also came to appreciate the variety of formats that IA produces and makes accessible from its display pages: PDF, EPUB, Kindle, DAISY, DjVu, and Read Online. Read Online has a new accessibility feature that reads the text aloud, providing a valuable option for patrons. IA can also handle many other formats, such as audio and video. Patrons can quickly download materials or read them online. MSL can brand its landing and display pages, and IA enables repositories to articulate usage rights to patrons.

IA provides effective metadata management tools to support the display pages. Plus, IA has a capable backup/preservation infrastructure. In summary, we found IA to be a versatile digital library. And, not insignificantly, IA is a library, officially recognized by the State of California. Given these findings, the Montana State Library realized IA could serve as its institutional repository and so began the process of uploading its born-digital content.

MSL uploaded the born-digital state publications in batches, writing basic scripts to work with IA’s batch utilities. IA assigned a software engineer, Hank Bromley, to advise the project. Hank was both very helpful and articulate. Readers who would like more information on how MSL completed the project can find it in the IA text forum or can contact MSL’s Library Information Services division.

Use of state publications has increased significantly now that so many publications are available digitally. In a recent week, 5,042 items were downloaded from our current collection of 11,999 state publications at IA. These 5,042 state publications were downloaded 6,827 times. Having an institutional repository at the Internet Archive has enabled the Montana State Library to extend the use of Montana state publications.
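To put that week’s figures in proportion, the short calculation below works out the share of the collection that was downloaded and the average number of downloads per downloaded item. It uses only the three numbers reported above; the variable names are our own.

```python
# Figures reported for one recent week.
items_in_collection = 11_999
items_downloaded = 5_042
total_downloads = 6_827

# Share of the collection downloaded at least once that week.
share_downloaded = items_downloaded / items_in_collection

# Average downloads per item that was downloaded.
downloads_per_item = total_downloads / items_downloaded

print(f"{share_downloaded:.1%} of the collection was downloaded")
print(f"{downloads_per_item:.2f} downloads per downloaded item")
```

In other words, roughly 42 percent of the collection was downloaded at least once in that week, at an average of about 1.35 downloads per downloaded item.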