Metadata & Data Documentation

What is Metadata?

Metadata is a term that has primarily been used by library and archives communities for the standardized descriptions that aid the discovery of objects. Metadata standards are composed of metadata elements, sometimes called metadata fields. Standards make it easier to search for similar items because similar terms and constructs are used to describe them. A metadata record consists of all the metadata elements describing an object. Metadata records are often expressed in XML or other machine-readable formats for easy integration within systems.

There are three basic categories of metadata elements: descriptive, technical/structural, and administrative. All objects also have a unique identifier metadata element.

Descriptive metadata elements consist of information about the content and context of an object. For example, descriptive metadata for an image may include: title, creator, subject (tags), and description (abstract).

Technical/structural metadata elements describe the format, process, and inter-relatedness of objects. For example, technical/structural metadata for an image may include: camera, aperture, exposure, file format, and set (if in a series).

Administrative metadata elements describe the information needed to manage an object over time. For example, administrative metadata for an image may include: rights holder, license, and access restrictions.

If a standard has not been defined for your discipline, a good starting place for a metadata plan is Dublin Core or DataCite's recommendations. The UO Libraries Digital Library Initiatives group is happy to help with understanding and/or applying these standards. You may also want to look at the metadata fields used in Dryad or other data repositories to see how other researchers are describing their data.
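
As a rough illustration of how the image example above might look as a machine-readable record, the sketch below builds a small XML record using Dublin Core element names. Only the element names (identifier, title, creator, subject, description, format) and the namespace URI come from the Dublin Core element set; the wrapper element `record`, the helper `build_record`, and every field value are made up for illustration, and a real repository may expect a different wrapper or schema.

```python
import xml.etree.ElementTree as ET

# Namespace URI of the Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_record(fields):
    """Build an XML record from (element name, value) pairs.

    The <record> wrapper is a hypothetical container, not part of Dublin Core.
    """
    record = ET.Element("record")
    for name, value in fields:
        child = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        child.text = value
    return record


# Illustrative values only; none of these describe a real object.
image_fields = [
    ("identifier", "example-image-0001"),    # unique identifier
    ("title", "Tide pools at low tide"),     # descriptive
    ("creator", "Doe, Jane"),                # descriptive
    ("subject", "tide pools"),               # descriptive (tag)
    ("subject", "intertidal zone"),          # elements may repeat
    ("description", "Photograph of tide pools taken during a field survey."),
    ("format", "image/tiff"),                # technical
]

record = build_record(image_fields)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Keeping the fields as a list of pairs rather than a dictionary allows an element such as subject to appear more than once, which is common when an object has several tags.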

If your discipline or repository does not require a specific metadata standard, the UO Libraries Digital Library Initiatives group can advise you. The number of hours required to create a metadata plan varies with the complexity of the description. Please make sure to meet with Metadata Services and Digital Projects (MSDP) to budget for developing a metadata plan before submitting your grant.

modifications made to data over time since their original creation and identification of different versions of datasets

information on data confidentiality, access and use conditions, where applicable

At the data level, datasets should also be documented with:

names, labels and descriptions for variables, records and their values

explanation of codes and classification schemes used

codes of, and reasons for, missing values

derived data created after collection, with code, algorithm or command file used to create them

weighting and grossing variables created

data listing with descriptions for cases, individuals or items studied

Variable-level descriptions may be embedded within a dataset itself as metadata. Other documentation may be contained in user guides, reports, publications, working papers and laboratory books. (from UK Data Archive)
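
The data-level items listed above can be kept in a small machine-readable codebook that travels with the dataset. The following is a minimal sketch rather than a prescribed format: the variable names, labels, codes, missing-value codes, and the `describe_value` helper are all hypothetical.

```python
import json

# Hypothetical codebook: one entry per variable in the dataset.
codebook = {
    "age_group": {
        "label": "Age group of respondent",
        # explanation of the codes used
        "codes": {"1": "18-34", "2": "35-54", "3": "55 and over"},
        # codes of, and reasons for, missing values
        "missing": {"-9": "Refused", "-8": "Not asked"},
    },
    "bmi": {
        "label": "Body mass index (kg/m^2)",
        # derived data, with the expression used to create it
        "derived_from": ["weight_kg", "height_m"],
        "derivation": "weight_kg / height_m ** 2",
    },
}


def describe_value(variable, raw_value):
    """Translate a raw stored value into its documented meaning."""
    entry = codebook[variable]
    key = str(raw_value)
    if key in entry.get("missing", {}):
        return f"missing ({entry['missing'][key]})"
    # Fall back to the raw value when the variable has no code list.
    return entry.get("codes", {}).get(key, key)


print(describe_value("age_group", 2))
print(describe_value("age_group", -9))

# Serialize the codebook so it can be stored alongside the data file.
codebook_json = json.dumps(codebook, indent=2)
```

Storing the codebook as JSON (or any other plain-text format) next to the data file keeps the variable-level documentation readable without the software that produced the dataset.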

Additional Information

If possible, include unique identifiers for authors/contributors, using the Open Researcher & Contributor ID (ORCID).
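
When ORCID iDs are typed into a metadata record by hand, a quick format check can catch transcription errors. An ORCID iD is 16 characters long, and its final character is a check character computed with the ISO 7064 MOD 11-2 algorithm. The sketch below assumes the hyphenated form (four groups of four characters); the function names are hypothetical, and the check only confirms that an iD is well formed, not that it is registered to a particular person.

```python
import re

# Hyphenated form: three groups of four digits, then three digits and a
# check character (a digit or "X").
ORCID_PATTERN = re.compile(r"[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]")


def orcid_check_character(base_digits):
    """Compute the ISO 7064 MOD 11-2 check character for 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_well_formed_orcid(orcid):
    """Return True if the iD has the expected shape and check character."""
    if not ORCID_PATTERN.fullmatch(orcid):
        return False
    compact = orcid.replace("-", "")
    return orcid_check_character(compact[:15]) == compact[15]


# 0000-0002-1825-0097 is the sample iD used in ORCID's own documentation.
print(is_well_formed_orcid("0000-0002-1825-0097"))
print(is_well_formed_orcid("0000-0002-1825-0098"))
```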

Register public datasets with DataCite. Some repositories do this automatically, so confirm with yours.