Census, Navy supercharge Web libraries

Librarians at the Census Bureau and the Naval Research Laboratory are building Web portals that give researchers faster access to a greater volume of higher-quality information.

In 1999, Census began a library automation project that involves preserving irreplaceable legacy documents and processing them to make them more useful.

James P. Madigan III is program director for the Census Bureau project at Stanley Associates Inc. of Alexandria, Va. In a recent presentation to the Federal Publishers Committee, he described the bureau's projects for digital conversion of documents, cataloging, storage, retrieval and systems integration.

The bureau's Integrated Online Document System will contain the entire U.S. census since 1790, as well as hundreds of scholarly papers written by bureau employees on demographic matters, congressional testimony, bureau working papers and other publications. It will also hold vast amounts of foreign census data. The project costs the bureau about $1.2 million annually, Madigan said.

Much of the collection to be entered in the database was not 'born digital' and exists only on paper, Madigan said.

'We scan all these on flatbed scanners,' he said. 'The Census Bureau has two scanners that optically correct for the crevice in a book.' Using Minolta PS 7000 face-up scanners avoids destroying books, he said.

200,000 catalog items

The bureau is using iLink portal software from Sirsi Corp. of Huntsville, Ala., to provide access to more than 200,000 catalog items. The system could eventually be opened to the public via the Internet, but for now it is available only to bureau employees, Madigan said.

In the future, Madigan said, the bureau's library automation project will improve access to documents at lower cost, preserve documents and potentially bring the system into compliance with Section 508 requirements for accessibility by users with disabilities.

R. James King, specialist in library information technology for the Ruth H. Hooker Research Library at the Naval Research Laboratory in Washington, oversees an operation supporting 3,000 federal employees working in the fields of physics, chemistry, electronics and space sciences.

NRL's main unclassified system, Torpedo Ultra V.2, stores journals, proceedings, books, reports and other technical documents. It holds more than 900,000 journal articles, totaling more than 6 million pages of content, King said. NRL also runs a classified system called Cuadra, which distributes classified reports via the Defense Department's Secret IP Router Network.

'Users don't have to worry how the information is cataloged,' King said. NRL uses a thesaurus that associates words and concepts for the purposes of searches. The library lets users choose among three types of searches: Boolean, pattern matching and concept matching. The library plans to add technical thesauri.
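The three search types King describes can be illustrated with a minimal Python sketch. It is a toy model only, not a description of NRL's system: the sample documents, the thesaurus entries and the function names are all hypothetical, and pattern matching is approximated here with shell-style wildcards.

```python
import fnmatch

# Hypothetical sample data; not drawn from the NRL collection.
DOCS = {
    1: "laser optics in plasma physics",
    2: "radar electronics for space systems",
    3: "polymer chemistry and coatings",
}

# Hypothetical thesaurus: maps a concept to associated words.
THESAURUS = {"light": {"laser", "optics"}}


def tokens(text):
    """Split a document into lowercase words."""
    return text.lower().split()


def boolean_search(docs, all_of=(), none_of=()):
    """Boolean search: every word in all_of AND no word in none_of."""
    hits = []
    for doc_id, text in docs.items():
        words = set(tokens(text))
        if set(all_of) <= words and not (set(none_of) & words):
            hits.append(doc_id)
    return sorted(hits)


def pattern_search(docs, pattern):
    """Pattern matching: any word matches a wildcard pattern such as 'electr*'."""
    return sorted(
        doc_id
        for doc_id, text in docs.items()
        if any(fnmatch.fnmatchcase(word, pattern) for word in tokens(text))
    )


def concept_search(docs, term, thesaurus):
    """Concept matching: expand the term through the thesaurus, then match any related word."""
    related = set(thesaurus.get(term, set())) | {term}
    return sorted(
        doc_id for doc_id, text in docs.items() if related & set(tokens(text))
    )


print(boolean_search(DOCS, all_of=["physics"], none_of=["radar"]))
print(pattern_search(DOCS, "electr*"))
print(concept_search(DOCS, "light", THESAURUS))
```

In this sketch the concept search finds the first document for the query 'light' even though that word never appears in it, which is the sense in which users need not know how the information is cataloged.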

King observed that while the Web has made information retrieval easier in some ways, it has also concealed some critical information and spread unreliable data. 'The Web is a box full of parts; it needs to be integrated,' King said. 'The end goal is to get to a digital library from a box full of junk.'

NRL uses the WebExpress search engine from Convera Corp. of Vienna, Va. 'Everything we are doing is Web-based,' King said. 'Now we have control of the thin client at the Web. As soon as the Web came out, we started drooling; we wanted to upgrade as soon as possible to the Web.'