Crowdsourcing Metadata Enhancements to Improve the Discoverability and Reusability of Scientific Data

This project will conduct experiments with different user communities (expert, secondary data users and novice student users) to determine what motivates them to contribute metadata enhancements to data that have been archived but are not sufficiently FAIR (Findable, Accessible, Interoperable and Reusable). Current practice relies on data producers and professional data curators to create and provide metadata, including variable-level data descriptors, study keywords and bibliographic citations to data-related publications. These efforts are expensive and, as a result, are often undersupplied, leaving data that have been archived and shared with the scientific community of limited value for reuse.

It is challenging and expensive to create the metadata that makes data sharing useful by enabling data discovery and reuse. One potential method for reducing these costs and increasing the value of scientific data is to engage third parties (secondary data users, domain experts, students, data librarians, data "geeks") in enhancing archived data with metadata improvements. These might include citations to publications or other work that uses the data, subject tagging, sharing of code, or the provision of missing metadata such as variable-level data labels and value labels. There are successful examples of citizen scientists engaging in similar activities, such as tagging objects in the sky or birds in the wild. The project will inform the development of strategies and tools for crowdsourcing enhancements to metadata and, more generally, will advance understanding of the incentives that individuals have to produce public goods.