Supplement Projects

Principal Investigator:

Xiaoqian Jiang, PhD, University of California San Diego

Description:

Biomedical data is distributed widely within secure silos. However, many research questions can be answered only by integrating data across multiple silos. One key barrier to such integration is the lack of standard data models and interfaces to the data. We propose a collaborative effort to systematically uncover and extend the limits to which biomedical data can be shared across multiple centers and with the outside world.

The project is an ongoing collaboration among the center for Integrating Data for Analysis, Anonymization and SHaring (iDASH), one of the National Institutes of Health (NIH) National Centers for Biomedical Computing (NCBC); the Biomedical and Healthcare Data Discovery Index Ecosystem (bioCADDIE); and three of the NIH Big Data-to-Knowledge (BD2K) centers: the Patient-Centered Information Commons (PCIC), the Center for Big Data in Translational Genomics (CBDTG), and the Mobile Sensor Data-to-Knowledge (MD2K) Center.

Chairs:

Description

FORCE11 has been awarded supplemental funding as part of the NIH BD2K bioCADDIE project to extend the work of the FORCE11 Data Citation Implementation Group by organizing a Data Citation Implementation Pilot Project (DCIPP).

Members of the FORCE11 community have participated in NIH meetings and bioCADDIE workshops and have contributed substantial material to the bioCADDIE white paper outlining the vision for a Data Discovery Index. Building on the FORCE2015 bioCADDIE data citation workshop and the joint ELIXIR-BD2K workshop held in January 2015, community members formulated concrete plans to conduct a pilot project on data citation with international partners. At both workshops, participants expressed significant support for testing the proposed implementation of the Joint Declaration of Data Citation Principles (JDDCP), developed by the FORCE11 Data Citation Implementation Group. The joint ELIXIR-BD2K workshop recommended a data citation pilot project as one of two outcomes of the meeting.

Goal

Our primary goal is to provide basic coordination among publishers, repositories, and identifier/metadata services for early adopters of data citation according to the JDDCP.

To meet this goal, we will provide authoritative guidance and group consultation on data citation implementation, helping to establish one or more benchmark implementations of data citation based on (a) the JDDCP and (b) its cross-domain implementation guidance, Starr et al. (2015).

The period of performance will be one year, during which we hope to enable several exemplar implementations by important early adopters across key use cases and to share group consensus on lessons learned with the community.

As part of this activity, we expect to publish several peer-reviewed articles and a final report.

Our first article, on the JDDCP, is in preparation for Nature Scientific Data.

We are also planning a peer-reviewed article submission based on Debbie Lapeyre’s 2015 white paper on citing data in JATS.

We will cooperate with CODATA’s outreach through its broad international workshops on data citation, which complement the focused early-adopter approach we will support.

Co-Investigators

Susanna-Assunta Sansone, PhD, University of Oxford, Nature Publishing Group

Description

This pilot brings together and builds on some of the work by the HeartBD2K and bioCADDIE NIH BD2K centers. The Omics Discovery Index (OmicsDI) will focus on piloting metadata- and molecular-based search for datasets across a heterogeneous, distributed group of genomics, proteomics, and metabolomics data resources. A common metadata model and interactive visualisation will enable a user-friendly, lightweight OmicsDI portal spanning eight repositories on two continents and in six organizations, including both open and controlled-access data resources.