Crowdsourcing Historical Collections

UVA asks citizen scientists to help transcribe 7,000 botany records

The University of Virginia Library and Mountain Lake Biological Station are enlisting citizen scientists from around the world to help transcribe nearly 7,000 handwritten historical plant records that have been tucked away for decades inside a file cabinet.

The project is part of an effort to digitize the specimens and place them online, which Mountain Lake director Edmund “Butch” Brodie says will be useful for researchers examining how plant distributions are changing with large-scale, global biological change.

One of the nearly 7,000 botany records that have been transcribed and digitized.

The Mountain Lake herbarium collection, currently housed in a tall, dark green, metal cabinet at the station, includes thousands of samples of various plants and trees dating as far back as the 1920s. They are all pasted or taped onto paper with handwritten or typed labels, but those labels can’t be deciphered by computer.

To get the records’ images online, the Library’s Digital Curation Services staff brought the herbarium collection to Grounds to photograph each record, capturing the entire plant and its label. They began photographing the records in 2006 in small batches as time and staff resources allowed. In 2009, they pushed to complete the final 4,000 records in one month.

The herbarium records went online in early May, and by mid-June users who were tapped into the online citizen science world—or who’d seen the tweets, announcements and articles about the collection—had transcribed about 20 percent of the collection.

Each record is transcribed 10 times to ensure accuracy, says Andrew Sallans (Col ’04, Com ’09), the library’s head of strategic data initiatives. An algorithm developed by Zooniverse, the Citizen Science Alliance web portal that is hosting the digital collection online, then compares the 10 versions to judge which reading is most likely to be correct.
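
The article does not say how the Zooniverse algorithm works. As a rough illustration only, one simple way to reconcile repeated transcriptions is a majority vote: tidy up each volunteer's reading, then keep the version most volunteers agree on. The sketch below assumes that approach. The function names and the sample readings are invented for the example and are not taken from Zooniverse or the UVA project.

```python
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not split the vote."""
    return " ".join(text.lower().split())


def consensus(transcriptions: list[str]) -> tuple[str, float]:
    """Return the most common normalized reading and the share of volunteers who gave it.

    This is a plain majority vote, a hypothetical stand-in for the real algorithm.
    """
    if not transcriptions:
        raise ValueError("need at least one transcription")
    counts = Counter(normalize(t) for t in transcriptions)
    reading, votes = counts.most_common(1)[0]
    return reading, votes / len(transcriptions)


# Ten hypothetical volunteer readings of one label field (invented for illustration).
readings = [
    "Quercus alba", "Quercus alba", "quercus alba", "Quercus  alba",
    "Quercus alba", "Quercus alha", "Quercus alba", "Quercus alba",
    "Quercus abla", "Quercus alba",
]
best, agreement = consensus(readings)
print(best, agreement)
```

A low agreement share could be used to flag a record for review by staff, though whether the project does anything of the kind is not stated.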

There had been no public access to the collection before the effort to digitize it, says Sallans, who is helping coordinate the project.

“One of the downsides of these types of collections is that they’re hidden in cabinets and no one asks for them because no one knows they exist,” Sallans says.

The library staff was inspired by a London-based project, developed by Zooniverse, that asked people to go online and transcribe old log records from Royal Navy ships, Sallans says.

“We saw that as a good model,” he says. Crowdsourcing—seeking volunteers to help with an online project, usually for no pay—is an effective way to get people to become engaged in science, Sallans says.

“You can tell stories around it that help make people interested and excited about this while helping them contribute to science and research,” Sallans says.

The library also sees potential for transcribing other collections—such as Jefferson’s letters, papers authored by retired faculty, or letters by historic Virginia families. “We have millions of manuscripts that we would love for the public to have broader access to for research and teaching,” Sallans says.

The UVA collection of plants needing transcribers can be found at Notes from Nature online at www.notesfromnature.org.