Session Description:
In 2012, the Missouri Botanical Garden received a grant from the National Endowment for the Humanities entitled “The Art of Life: Data Mining and Crowdsourcing the Identification and Description of Natural History Illustrations from the Biodiversity Heritage Library (BHL)”. This project aims to develop software tools and a metadata schema for visual resources contained within the scanned literature made available through BHL digitization activities. The tools and schema will support automated identification and crowdsourced description of this corpus.

Initially, software tools will help discover visual resources (illustrations, maps, and other works of art) in BHL’s corpus, and basic metadata will be recorded. These resources will then be shared on multiple image delivery systems, including Flickr and Wikimedia Commons, where citizen scientists will be able to add further annotations. Because citizen scientists can add such a wide variety of information to any image, a comprehensive yet manageable schema is needed to standardize inputs and enable synchronization and seamless import back into the BHL databases.

This schema therefore needs to support three objectives:
(1) to enable the discovery, description, and use of the identified images by artists, biologists, humanities scholars, and educators;
(2) to make BHL’s metadata and images available to other platforms; and
(3) to import crowdsourced metadata generated in other platforms back into BHL.

We will discuss how we identified existing schemas that meet the needs of the project, rather than developing yet another schema from scratch, and how we assembled a solution that combines standards and best practices from biodiversity informatics and image curation. We will present our preliminary schema to the DLF community, explain how we addressed metadata challenges specific to biodiversity data, and seek feedback on how to improve the schema further.