Advanced metadata

To talk with us about preparing data for submission to a repository, see our Consulting page for details or contact us.

Introduction to metadata standards

To submit your research to a data repository, you may be required to format your metadata according to a metadata standard. Consult the repository you will be using to determine its metadata requirements.

Metadata structures are often referred to as "schemas." A schema has a defined set of characteristics for describing the data. The completed metadata are often reported in a machine-readable language such as XML.

For example, the Dublin Core schema defines the following elements, among others.

Coverage

The spatial or temporal topic of the resource, the spatial applicability of the resource, or the jurisdiction under which the resource is relevant.

Creator

An entity primarily responsible for making the resource.

Date

A point or period of time associated with an event in the life cycle of the resource.

Description

An account of the resource.

Format

The file format, physical medium, or dimensions of the resource.

Identifier

An unambiguous reference to the resource within a given context.

Language

A language of the resource.

Publisher

An entity responsible for making the resource available.

Relation

A related resource.

Rights

Information about rights held in and over the resource.

Source

A related resource from which the described resource is derived.

Subject

The topic of the resource.

Title

A name given to the resource.

Type

The nature or genre of the resource.
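
As a rough illustration of how completed metadata might be reported in XML, the sketch below writes a few of the elements above into a small record using Python's standard library. The wrapper element name `metadata`, the helper `build_record`, and all field values are invented for this example; a real repository will specify its own wrapper and required elements. The namespace URI is the one published for the Dublin Core element set.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_record(fields):
    """Return an XML string with one dc:* child per (element, value) pair.

    The <metadata> wrapper is a hypothetical container chosen for this sketch.
    """
    root = ET.Element("metadata")
    for name, value in fields:
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Placeholder values, not taken from any real dataset.
record_xml = build_record([
    ("title", "Example survey dataset"),
    ("creator", "Example Researcher"),
    ("date", "2020-01-15"),
    ("format", "text/csv"),
])
print(record_xml)
```

Each pair in the list becomes one element in the record, so adding Description, Rights, or any other element is a matter of appending another pair.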

If you are not using a standard metadata schema whose details are widely known and easily accessible to other researchers, be sure that you preserve the schema itself and its documentation, along with the data and metadata. By doing so, you will help ensure that you and others are able to fully understand and reuse your data in the future.

Examples of metadata standards

The following are several well-known and frequently used metadata standards.

Data Documentation Initiative (DDI) standard: an international XML-based standard for the content, presentation, transport, and preservation of documentation (i.e., metadata) for data sets in the social and behavioral sciences.

We can assist you in selecting a metadata standard that is appropriate for your field of research. See our consultations page for more information.

About ontologies

Ontologies are shared vocabularies that are used to describe the components of a particular discipline and the relationships among these components. By using ontologies, you make it easier for others (or even your future self) to understand your data. Controlled vocabularies, on the other hand, are merely lists of predefined, authorized terms.
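
The difference can be sketched in a few lines of Python. The terms and the single `is_a` relationship type below are invented for illustration and do not come from any published ontology; the point is only that a controlled vocabulary answers "is this term allowed?", while an ontology can also answer "how does this term relate to others?".

```python
# Controlled vocabulary: a flat list of authorized terms (hypothetical).
ALLOWED_TERMS = {"animal", "mammal", "dog"}

# Ontology: the same terms plus relationships among them (hypothetical).
# Each key "is a" kind of its value.
IS_A = {"dog": "mammal", "mammal": "animal"}


def ancestors(term):
    """Follow is_a links upward and return every broader term, nearest first."""
    found = []
    while term in IS_A:
        term = IS_A[term]
        found.append(term)
    return found


# The vocabulary can only confirm membership.
print("dog" in ALLOWED_TERMS)
# The ontology can also report the broader categories a term belongs to.
print(ancestors("dog"))
```

Real ontologies typically define many relationship types and are distributed in dedicated formats, so this dictionary is only a stand-in for the idea.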

In addition to using a metadata standard, you may wish (or be required) to use ontologies or controlled vocabularies to create your metadata. For example, if you use the Dublin Core as your metadata schema, it is recommended that you use the list of Internet Media Types, a controlled vocabulary, to enter information in the "Format" element. It is also recommended that you use a controlled vocabulary to enter the subject terms, but it is up to you to choose which vocabulary to use.
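
One way to apply a controlled vocabulary in practice is to check each value before it goes into the metadata record. The sketch below does this for "Format" values. The set `ALLOWED_FORMATS` is a small hand-picked subset of media types chosen for this example, and `check_format` is a hypothetical helper, not part of any Dublin Core tooling.

```python
# A small illustrative subset of media types; a real project would choose
# its own list based on the file formats it actually deposits.
ALLOWED_FORMATS = {
    "text/csv",
    "application/pdf",
    "application/xml",
    "image/png",
}


def check_format(value):
    """Return the normalized value, or raise ValueError if it is not authorized."""
    normalized = value.strip().lower()
    if normalized not in ALLOWED_FORMATS:
        raise ValueError(f"{value!r} is not in the controlled vocabulary")
    return normalized


print(check_format(" Text/CSV "))

try:
    check_format("spreadsheet")
except ValueError as error:
    print(error)
```

Rejecting free-text values such as "spreadsheet" at entry time keeps the "Format" field consistent across records, which is the main benefit a controlled vocabulary provides.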

Here are some examples of ontologies and controlled vocabularies currently in use in a variety of disciplines:

BioPortal: the portal for the U.S. National Center for Biomedical Ontology, hosted at Stanford.

Gene Ontology: a bioinformatics initiative that aims to standardize the representation of gene and gene product attributes across species and databases.