Time Period of Content: The date(s) the data was collected, or the range from the oldest to the most recent date of any sources contributing to the dataset.

Currentness Reference: "Ground condition" or "publication date". Data produced from satellite or photographic images, or collected with a GPS, would have a currentness reference of "ground condition" based on the date the photo, image, or GPS position was taken. However, data digitized from previously published maps would have a currentness reference of "publication date", the date the map was published.

Spatial Domain: the bounding box or "footprint" encompassing the dataset or study area, expressed in terms of minimum and maximum latitude and longitude in decimal degrees. Alternatively, another description of the area can be given such as quadrangle names, county names, etc.
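
Deriving the bounding box is just the minimum and maximum of each axis over the dataset's coordinates. A minimal Python sketch (the sample points are hypothetical):

```python
# Minimal sketch: compute a spatial-domain bounding box from a list of
# (longitude, latitude) pairs in decimal degrees. Sample points are hypothetical.
def bounding_box(points):
    """Return (west, east, south, north) bounding coordinates."""
    lons = [lon for lon, _ in points]
    lats = [lat for _, lat in points]
    return min(lons), max(lons), min(lats), max(lats)

pts = [(-110.5, 44.2), (-104.9, 41.1), (-108.3, 43.0)]
print(bounding_box(pts))  # (-110.5, -104.9, 41.1, 44.2)
```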

Place Keywords: Examples - Wyoming, county name, quad name, park name, or watershed name - list in a hierarchical fashion from Wyoming down to smallest unit necessary.

Access Constraints: usually none, unless the data is sensitive and available only to certain people, or cannot be distributed for free.

Use Constraints: Restrictions or legal prerequisites for using the data. These range from a basic statement to more detailed ones: a project following national standards might list appropriate and inappropriate uses, or include statements about altering the data and how it should be cited.

Dataset Credit: (optional) Credit the people/organizations that produced the source information or were involved in different stages of production (such as collection, digitization, analysis, review, etc).

Data Quality

Attribute Accuracy (or thematic accuracy): How sure are you that your attributes are correctly labeled? Cite quantitative accuracy if available, and any measures taken to verify attributes. Common ways of describing accuracy include:

compare attributes to a separate data source of larger scale/greater accuracy (this can be difficult in some cases to evaluate because of generalization that occurs in smaller scale datasets)

independent sampling (field checks)

Examples:

good example: rare case where accuracy was determined by field-checking

good example: an example of how updates to the attribute accuracy were incorporated (with dates) as the data was improved.

special example: in modeled datasets, attribute accuracy sometimes does not apply since the process is subjective or derivative.

special example: reference is made to individual source data sets within the Lineage section. This is acceptable, but not ideal.

Logical Consistency: Did you check for bad values and conditions? Does your data meet all the topological requirements needed (e.g., any overlapping polygons? Gaps between polygons? Unconnected lines?) Can also include consistency of domains (e.g., are the values of X between 0 and 100?)
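
A domain-consistency check like the one described above is easy to automate. A minimal sketch, with hypothetical field names and records:

```python
# Minimal sketch of a logical-consistency check: flag records whose
# attribute value falls outside its documented domain (here, 0-100).
# Field names and records are hypothetical.
records = [
    {"id": 1, "percent_cover": 35},
    {"id": 2, "percent_cover": 120},  # outside the 0-100 domain
    {"id": 3, "percent_cover": -5},   # outside the 0-100 domain
]

def out_of_domain(recs, field, lo, hi):
    """Return the ids of records whose field value is outside [lo, hi]."""
    return [r["id"] for r in recs if not lo <= r[field] <= hi]

print(out_of_domain(records, "percent_cover", 0, 100))  # [2, 3]
```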

Completeness Report: What is the completeness of the data? If it is in development, how much is still left to do? Was there some data that could not be collected (ran out of time/money or just not available)?

How complete is the attributing for data that was collected? Are there any records missing information because it wasn't available? For instance, for the land cover data set there are attributes for "crown closure", but not all of Wyoming is forested so only about 20% of the polygons are coded for crown closure. This section is to assure the data users that any polygons that are labeled "0" (or left blank if it is a character field) were so labeled intentionally and not merely overlooked in the attributing process. Good example; adequate example
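
A simple tally (hypothetical values here) can back up a completeness statement, for instance confirming that only a minority of polygons were intentionally coded for crown closure:

```python
# Sketch: tally how many records actually carry a value for an attribute
# that intentionally applies to only some features (e.g., "crown closure"
# coded only for forested polygons). Values are hypothetical; 0 means
# the polygon was intentionally coded as not forested.
crown_closure = [0, 35, 0, 0, 60, 0, 0, 0, 10, 0]
coded = sum(1 for v in crown_closure if v > 0)
print(f"{coded}/{len(crown_closure)} polygons coded "
      f"({100 * coded / len(crown_closure):.0f}%)")  # 3/10 polygons coded (30%)
```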

Horizontal Positional Accuracy: Degree of compliance to the spatial registration standard. This standard may vary depending on how the data is registered - for instance, accuracy of GPS data is based on different criteria than accuracy of registered maps or images.

GPS data: the type of GPS equipment (mapping grade, survey grade), settings, number of satellites used, logging intervals of position, and post-processing techniques (such as differential correction) all contribute to the final accuracy of the data.

Maps or images: note the RMS (root mean square) error and the number of registration points used. For images registered to existing coverages, note the final image-to-coverage RMS error, the number of links used in registration, and the maximum positional offset accepted for the links.

In addition to registration error, error should be quantified, or at the very least estimated, for each of these steps in the production process:

Error from snapping tolerance (note tolerance used). The default tolerance will vary based on the extent of the area registered. This tolerance should be set based on consideration of the scale of the source data (larger scale maps should have smaller snapping tolerances).

Error from coordinate shift resulting from using CLEAN or other commands that use fuzzy tolerance such as intersections, unions, clips, etc. Make sure to note whether the coverage coordinates are stored as single or double precision (PRECISION command) and note the fuzzy tolerance used. The fuzzy tolerance will vary based on the extent of the registered area; for areas of large extent (greater than one 1:24,000 quad), avoid using the default Arc/Info tolerances.

Error from line shift. This is the distance that digitized lines are off from source lines, resulting from human error. Check for these errors by creating check plots and overlaying them on the source maps. Generally any differences greater than a line width should be corrected.

Error resulting from poor quality source media (e.g., folded or warped paper, or photocopies used instead of the original source map).

Example:

The map was digitized from USGS 1:24,000-scale base maps, with an inherited error of +/- 40 feet according to USGS national mapping standards. In addition to this inherited error, the map was registered with 16 tics and an RMS error of 0.006 map inches, corresponding to +/- 12 ft at this scale. Line shift error is not greater than 0.01 map inches (+/- 20 ft). Unquantifiable errors may be associated with coordinate shift (fuzzy tolerance was set to 0.99 meters, single precision), snapping (set to 2 meters), and source media (unfolded paper maps were used). Check plots were created to check digitized lines against source maps and any lines off by more than a line width were corrected, but the number of errors detected and corrected was not quantified.
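
The scale arithmetic used in this example can be checked directly: an error in map inches multiplied by the scale denominator gives ground inches, and dividing by 12 gives feet. A short sketch reproducing the two figures above:

```python
# Check the scale arithmetic: map-inch error -> ground feet at a given scale.
# At 1:24,000, one map inch represents 24,000 ground inches (2,000 ft).
def map_inches_to_ground_feet(map_inches, scale_denominator):
    return map_inches * scale_denominator / 12.0  # 12 inches per foot

print(round(map_inches_to_ground_feet(0.006, 24000), 3))  # 12.0 (the RMS error)
print(round(map_inches_to_ground_feet(0.01, 24000), 3))   # 20.0 (the line shift)
```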

Other methods of determining positional accuracy include comparing the data to independent sources of higher accuracy when they are available, or using internal programs for detecting positional shifts, such as used by the USGS.

In some rare cases, positional accuracy may not apply. For instance, the 1:100,000-scale land cover map for Wyoming was interpreted from satellite imagery with a specific minimum mapping unit (100 ha). Because 100 ha units do not apply to actual vegetation boundaries (it is too generalized), positional accuracy was not a concern for this dataset.

Examples:

incomplete example: data recorded by a GPS should include not just final accuracy but how measurements were derived

good example: accuracy of data recorded by a GPS along with how measurements were derived

good example: slightly more detailed accuracy of data digitized from hardcopy maps

good example: estimated accuracy of data created from a legal description conversion process

special example: a derived data set, combined in a spatial analysis procedure from multiple data sources, each with varying horizontal accuracy.

special example: in rare cases, horizontal positional accuracy does not apply

special example: reference is made to individual source datasets within the Lineage section. This is acceptable, but not ideal.

Lineage (Sources): List all the sources used in compiling your data. Usually these are source maps, source photography, source images, people involved in GPS point collection, and may also include publications or people that were sources for attribute information. Example

Each source should include:

name, originator

scale (if applicable)

media (paper map, mylar map, digital map; report, publication, etc)

time period of content

reference for time period of content (e.g., ground condition or publication date)

what the source was used for (which set of features, which attributes?)

In some cases, a large number of source maps from a series, such as the USGS 1:24,000 or 1:100,000 quadrangle series, may have been used to create one digital data set. In such a case it is not practical to list each source map individually in this section; instead, they can be grouped under one source (e.g. USGS 1:24,000 quadrangle maps), and the time period of content for this source would be a range of dates - the oldest to most recently published map. The actual names of the source maps can be listed (along with publication dates) in one of the process steps, or attached in a separate file or table (INFO or dBase) to go with the coverage. If the names/dates of the source maps are attached in a separate file, the name of this file and its contents should be described in the entity/attribute section of the metadata.

Process Steps: this section is used to describe how the dataset was created. Includes relevant steps in the production of the data, or major changes/updates to the data. Digitization of a map could be one step, attributing another step; any unions/intersections, appended or subtracted features or areas of extent should be recorded as individual steps (with dates) as best as possible. Special modeling processes or decision rules used should also be put in this section. In some cases it may be worthwhile to mention a contact for a process if the process was not done in-house (for instance, if the data set was sent to a specialist to be reviewed).

Changes/updates: if an error is found and corrected in a dataset, a detailed description of the correction should be added as a process step with its own date so that data users can determine the status of the data they currently possess.

Spatial Organization Information

This section contains information about how spatial information is represented in the dataset. There are two primary ways:

Indirect spatial reference: a system for describing spatial location without using coordinates, such as geographic features (counties, states, townships). Street addresses and legal descriptions are two very common indirect methods.

Direct spatial reference: includes three options: point, vector, and raster. These are data structures referenced by coordinates.

A dataset composed only of points, or a raster dataset composed only of cells (pixels), is the simplest to describe.

point: may include label point, area point, node

raster: described as a point, pixel, grid cell, or voxel, and associated with a row and column count. The cell size is not described in this section, but in the following section (Spatial Reference Information)

Datasets with lines, nodes, polygons and polygon topology are more difficult to describe and involve more complicated terminology, based on the Spatial Data Transfer Standard (SDTS) federal information processing standard.

Spatial Reference Information

The coordinate system, projection and datum that the dataset is in. You can also record spatial reference information for the data sources - for instance, GPS data can be collected in latitude/longitude coordinates in the WGS84 datum, then converted to UTM zone 12 coordinates in the NAD83 datum. Make sure to record projection parameters when necessary such as latitude of origin, central meridian, false eastings/northings, zones, units (meters, feet), etc.
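
One way to keep these parameters from being lost is to record them explicitly alongside the data. A hedged sketch (all values illustrative, following the GPS-to-UTM example in the text):

```python
# Sketch: recording spatial-reference parameters for the dataset and for
# one of its sources, so projection details are preserved. Values are
# illustrative, following the GPS-to-UTM example in the text.
dataset_srs = {
    "projection": "Universal Transverse Mercator",
    "zone": 12,
    "datum": "NAD83",
    "units": "meters",
    "false_easting": 500000.0,  # standard UTM false easting
    "false_northing": 0.0,      # northern hemisphere
}
source_srs = {  # e.g., GPS data as originally collected
    "coordinate_system": "geographic (latitude/longitude)",
    "datum": "WGS84",
    "units": "decimal degrees",
}
print(dataset_srs["zone"], dataset_srs["datum"])  # 12 NAD83
```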

Abscissa resolution and ordinate resolution: the smallest distance that can exist between two points. This value is almost always the same for the x axis (abscissa) and the y axis (ordinate) but may differ for non-square pixels. For vector data, this is usually the cluster tolerance, or the minimum distance at which two points will be automatically converged together. For raster data, this is usually the cell size (the width/height of a pixel).
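
For a raster, the resolution falls out of the extent and the column/row counts. A minimal sketch with hypothetical extent values:

```python
# Minimal sketch: for a raster, abscissa/ordinate resolution is the cell
# size, derived from the extent and the column/row counts. Extent values
# are hypothetical (a 30 km x 30 km area in UTM meters).
west, east = 500000.0, 530000.0
south, north = 4600000.0, 4630000.0
ncols, nrows = 1000, 1000

abscissa_resolution = (east - west) / ncols    # cell width in meters
ordinate_resolution = (north - south) / nrows  # cell height in meters
print(abscissa_resolution, ordinate_resolution)  # 30.0 30.0
```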

Entity and Attribute Information

Description of the content of all entities and attributes (see definitions below) for each feature associated with the dataset.
Contains two main sub-sections, which can be used individually or together to describe the dataset:

The Overview description is unstructured and information is usually written in paragraph format. The Detailed Description is a much more structured method of describing the data dictionary, hierarchically organized. For simple datasets with only one or two entity types and a limited number of attributes and attribute domains, the Overview section may be sufficient. For more complex databases, it is important to enter information in both the Overview and Detailed Descriptions. The Overview Description should be used as a summary, to explain the structure and relation of different entities; the Detailed Description gives specifics about each of the entity's attributes and attribute domains, including definitions and sources.

In the case of some complex databases, a data dictionary may already have been compiled in a database format. Therefore it is redundant to transfer the information over into the Detailed Description format of the FGDC's content standard. The Overview Description can be used to reference the separate data dictionary, its format, and how to access it.

Definitions:

An entity is a spatial phenomenon of a defined type that has at least one key attribute value different from the corresponding attribute values of surrounding phenomena.

Simple GIS datasets are composed of only one entity, for instance "roads", "elevation", "soils".

A single GIS dataset can also be composed of more than one entity. For instance, one dataset may contain two entities representing land ownership and land management. Some pieces of land may have a distinct ownership, but portions of the land may be managed differently. Since there are many shared boundaries between ownership and management, it makes sense to keep the boundaries within one dataset to avoid data redundancy, yet it is important to clarify the distinction between the two separate entities.

An attribute is a defined characteristic of an entity type. For instance, the entity type "roads" may have several different distinct attributes, such as "name", "type" (highway, interstate, local road), and "condition" (paved, unpaved). An attribute will have a specific type of domain of categories associated with it. Four types of domains are recognized within the FGDC content standard: enumerated (a defined list of valid values), range (minimum and maximum values), codeset (values defined in an external, published code set), and unrepresentable (values that cannot be listed and are instead described in free text).
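
As a concrete illustration (all names and values hypothetical), an attribute with an enumerated domain could be recorded in a data dictionary and checked like this:

```python
# Hypothetical data-dictionary entry: a "type" attribute of a "roads"
# entity with an enumerated domain, plus a helper to check values
# against that domain. Names and values are illustrative only.
attribute = {
    "label": "type",
    "definition": "functional class of the road",
    "domain": {"highway", "interstate", "local road"},
}

def in_domain(value, attr):
    """Return True if the value is a member of the attribute's domain."""
    return value in attr["domain"]

print(in_domain("interstate", attribute))  # True
print(in_domain("driveway", attribute))    # False
```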

Distribution Information

Standard Order Process: this section is repeatable. For instance, the same dataset may be available in two different ways: it can be ordered on CD or some other media, or it may be available for download over the internet. This section includes information on where to obtain the data, what format it is in, and any applicable fees. Example

Metadata Reference Information

Contains the name and contact information for the person responsible for creating and/or maintaining the metadata document. Also includes the date the metadata was last modified, and the version of the FGDC standard the metadata conforms to (version 1: 1994; version 2: 1998).