Data Publication Guidelines

Guidance and best practices for publishing data

Data Sharing and Publishing

DesignSafe provides an end-to-end data management, analysis, and publication platform for experimental, simulation, field research and other types of research products. Within the DesignSafe Data Depot, researchers have access to a private “My Data” space, a collaborative “My Projects” space, and a “Published” space for published datasets.

Any files from a research project (data, processing scripts, analysis products, models, etc.) can be stored in DesignSafe from the start of the project and shared among project team members. They are kept private within a project space until the research team publishes them. Files can be curated for eventual publication from the moment they are uploaded to a project, which eases the burden of this work at the end of the project.

Research teams curate their own data in DesignSafe using the tools provided in the Curation Directory in the “My Projects” space. These tools facilitate organizing, categorizing, and describing data. The My Projects space is collaborative, and any team member on a project has both read and write access to the entire project folder. When researchers curate their data and request to publish it in DesignSafe, the data is automatically vetted to ensure that it meets best-practice descriptive requirements (see details below). Published datasets receive permanent digital object identifiers (DOIs) for persistent identification and ease of data sharing and reuse on the web.

Citing datasets in papers

Researchers using published data from the DesignSafe Data Depot must cite it using the DOI, which relies on the DataCite schema for accurate citation. For convenience, users can retrieve a formatted citation from the landing page of the published data. We recommend including the citation in the reference section of your paper.
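As an illustration of what a DOI-based citation contains, the sketch below assembles a citation string from a handful of DataCite-style fields. The class name, field names, citation layout, and all sample values (including the DOI) are placeholders chosen for this example; the formatted citation on the published data landing page remains the one to use.

```python
from dataclasses import dataclass


@dataclass
class DatasetRecord:
    """Hypothetical subset of DataCite-style citation fields."""
    creators: list[str]
    year: int
    title: str
    publisher: str
    doi: str


def format_citation(record: DatasetRecord) -> str:
    """Build 'Creators (Year). Title. Publisher. https://doi.org/DOI'."""
    authors = "; ".join(record.creators)
    return (
        f"{authors} ({record.year}). {record.title}. "
        f"{record.publisher}. https://doi.org/{record.doi}"
    )


# Placeholder values only; this is not a real dataset or DOI.
example = DatasetRecord(
    creators=["Doe, J.", "Roe, R."],
    year=2020,
    title="Example Shake Table Tests",
    publisher="DesignSafe-CI",
    doi="10.1234/example-doi",
)
print(format_citation(example))
```

Writing the DOI as a full https://doi.org/ link lets readers go directly from the reference list to the dataset.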

Reusing data from other sources

You will frequently use data from other sources in your research, and sometimes you may even want to re-publish it. It is good practice to give credit to the data creators and to make sure you are allowed to re-publish the data if you need to. Please be aware of the following:

If you cite the data, include a DOI, or at least a permanent URL, in the citation so that users can get directly to the cited data. Use the Related Work box in Edit Project to include the citation(s) and corresponding links.

If you use external data in your analyses, you can point to it from the Referenced Data Title box as you create your analyses category.

Be aware of the original license and usage conditions of the reused data. The license may specify whether and how you can modify, distribute, and cite the reused data.

Data Embargo

In DesignSafe-CI, an embargo is the period during which a project is not made public, pending the review and publication of a corresponding paper. Please submit a help ticket to Data Curation & Publication and we will work with you to accomplish the following requests:

to provide access to reviewers before publishing your data;

to reserve a DOI for your data before making it public;

to publish a dataset at the same time that you publish the corresponding paper in a journal.

Metadata Requirements

Overview of metadata best practices implementation in DesignSafe

Metadata is information that describes data. Metadata schemas provide a structured way for users to share metadata within and across domains. Because there is no standard schema to describe natural hazards engineering research and data, DesignSafe offers metadata sets to describe key components of datasets. These were developed in close consultation with researchers in the natural hazards community. The terms are evolving and will continue to be expanded, updated, and corrected as we gather feedback and observe how researchers use them in their publications.

DesignSafe’s metadata approach maps community terms to elements of widely used, standardized schemas so that metadata can be exchanged with other platforms. The terms have been mapped to the following schemas: Dublin Core for description of the research project and the data publication, PROV to display provenance relationships between data and the processes from which it derives, and DataCite for DOI assignment and citation.
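To make the idea of mapping community terms to a standard schema concrete, here is a minimal sketch that translates a project description into Dublin Core elements. The community term names on the left of the mapping are hypothetical and are not DesignSafe's actual field names; the element names on the right (title, creator, subject, description, relation, rights) belong to the Dublin Core element set.

```python
# Hypothetical community terms (keys) mapped to Dublin Core elements (values).
TERM_TO_DUBLIN_CORE = {
    "project_title": "title",
    "principal_investigator": "creator",
    "keywords": "subject",
    "project_description": "description",
    "related_work": "relation",
    "license": "rights",
}


def to_dublin_core(project: dict) -> dict:
    """Translate mapped terms to 'dc:' elements; unmapped terms are skipped."""
    record = {}
    for term, value in project.items():
        element = TERM_TO_DUBLIN_CORE.get(term)
        if element is not None:
            record[f"dc:{element}"] = value
    return record


# Placeholder project description for illustration only.
project = {
    "project_title": "Example Field Reconnaissance",
    "keywords": ["earthquake", "reconnaissance"],
    "internal_note": "not exported",
}
print(to_dublin_core(project))
```

Keeping the mapping in a single table means that terms can be added or corrected without changing the translation logic, which suits a vocabulary that is still evolving.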

Due to variations in research domains and their methods, users may not need to use all of the elements available to describe their research. However, we identified a set of metadata terms that represent the structure of the data, are useful for discovery, and allow proper citation of data. To ensure the quality of published data in DesignSafe, when users request to publish data the system checks that these core terms are complete and/or that data are associated with them. The element set is shown below.
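A completeness check of this kind can be sketched as follows. This is an assumption-laden illustration and not DesignSafe's actual validation code: the core term names are hypothetical, and a term counts as complete here simply when its value is non-empty.

```python
# Hypothetical core terms; the actual element set is defined by DesignSafe.
CORE_TERMS = ("title", "authors", "description", "keywords")


def is_filled(value) -> bool:
    """Treat None, blank strings, and empty collections as missing."""
    if value is None:
        return False
    if isinstance(value, str):
        return bool(value.strip())
    if isinstance(value, (list, tuple, set, dict)):
        return len(value) > 0
    return True


def missing_core_terms(metadata: dict) -> list[str]:
    """Return the core terms that are absent or empty, in declaration order."""
    return [term for term in CORE_TERMS if not is_filled(metadata.get(term))]


# Placeholder metadata: the description is blank and keywords are absent.
draft = {"title": "Example Project", "authors": ["Doe, J."], "description": "  "}
print(missing_core_terms(draft))
```

A publication request would proceed only when the returned list is empty; otherwise the listed terms tell the research team what still needs to be described.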

KEY (to help understand usage of the terms below)

(bold) Denotes the structure of the data. For example, an experimental project may have more than one experiment and more than one corresponding analysis.

(*) The metadata is repeatable, with multiple entries allowed.

($) Recommended if it exists. For example, not every project will include an analysis.