About Data.gov

Who developed Data.gov?

With what technology is Data.gov built?

Data.gov is powered by two open source applications, CKAN and WordPress, and it is developed publicly on GitHub. Learn how you can contribute to Data.gov and these larger open source projects here.

What standards were used to develop the metadata displayed on Data.gov?

Data.gov follows the Project Open Data schema, a set of required fields (Title, Description, Tags, Last Update, Publisher, Contact Name, etc.) for every dataset displayed on Data.gov.
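
As a rough illustration of what "required fields" means in practice, the sketch below checks one dataset record for missing fields. The JSON key names (`title`, `description`, `keyword`, `modified`, `publisher`, `contactPoint`) are assumed mappings of the fields listed above, and the record itself is hypothetical; the Project Open Data schema remains the authoritative list.

```python
# Assumed key names for the fields named above (Title, Description, Tags,
# Last Update, Publisher, Contact Name). The authoritative list lives in
# the Project Open Data schema itself.
REQUIRED_FIELDS = (
    "title",
    "description",
    "keyword",
    "modified",
    "publisher",
    "contactPoint",
)


def missing_fields(dataset: dict) -> list[str]:
    """Return the required keys that are absent or empty in a dataset record."""
    return [key for key in REQUIRED_FIELDS if not dataset.get(key)]


# Hypothetical record, for illustration only.
example = {
    "title": "Example Dataset",
    "description": "An illustrative record.",
    "keyword": ["example"],
    "modified": "2017-06-01",
    "publisher": {"name": "Example Agency"},
}

print(missing_fields(example))  # the record above lacks a contact point
```

A check like this could run before publishing a metadata file, so that gaps are caught before the catalog tries to read it.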

What metrics are available about data on Data.gov?

On the Metrics page, users can get detailed information about the composition of the total number of datasets on Data.gov.

The total number of datasets, shown on the Metrics page and on the main Data.gov and Catalog pages, changes frequently. Because the Data.gov catalog is updated nightly, the total changes as agencies add or delete datasets or as Data.gov adds new agencies. Agencies may also group similar datasets into a “collection.” A “collection” counts as “1” dataset in the total, so the number of datasets may drop when agencies organize similar datasets into a collection, even though there has not been a reduction in what’s available on Data.gov. Occasionally, a problem harvesting an agency’s metadata can also cause a temporary drop in the total number of datasets. As of June 2017, the approximately 200,000 datasets reported as the total on Data.gov represent about 10 million data resources.
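
The effect of collections on the reported total can be shown with a little arithmetic. The sketch below uses made-up numbers and a hypothetical helper, `catalog_total`; it only illustrates the counting rule described above, not how Data.gov computes its metrics.

```python
def catalog_total(standalone: int, collections: list[int]) -> tuple[int, int]:
    """Return (reported_total, underlying_datasets).

    Each collection counts as one dataset in the reported total,
    no matter how many datasets it groups together.
    """
    reported = standalone + len(collections)
    underlying = standalone + sum(collections)
    return reported, underlying


# Hypothetical agency: 100 standalone datasets, then 30 of them are
# grouped into a single collection.
before = catalog_total(100, [])
after = catalog_total(70, [30])

print(before)  # (100, 100)
print(after)   # (71, 100)
```

The reported total falls from 100 to 71 while the amount of available data stays the same.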

The Metrics page also provides a detailed breakdown of the dataset total by type of source and by agency. While most Data.gov datasets are from federal agencies, the Metrics page also shows how many datasets are provided by state, county, and city data catalogs that make their datasets available through Data.gov.

Under each type of organization, users can see how many datasets are provided by each federal agency. For larger agencies, such as the Department of Agriculture (USDA), clicking on the + symbol will further break down the agency’s dataset total by subagency.

The “Latest Entry” column on the right shows the most recent date when the Data.gov catalog detected an update to the agency’s metadata. The date listed is the most recent date when the agency added, deleted, or changed a dataset.

Get Data on Data.gov

How are the datasets on Data.gov collected?

Under the terms of the 2013 Federal Open Data Policy, newly generated government data must be made available in open, machine-readable formats, while continuing to ensure privacy and security.

Government data publishers looking to get their data on Data.gov should read the detailed guide: How to get your open data on Data.gov. The Data.gov team typically works with a designated open data point of contact as a liaison for each agency. Data publishers should consult with their agency point of contact to include any additional datasets on Data.gov. If you need help determining who your open data point of contact is, please contact us.

Create a Single Agency Data Inventory. Agencies are required to catalog their data assets, just like they would inventory computers or desk chairs, to better manage and use these resources.

Publish a Public Data Listing. Agencies are required to publish a list of their data assets that are public, or could be made public. This list is made available as a data.json file hosted at the primary domain of the agency (e.g., gsa.gov/data.json).

Develop New Public Feedback Mechanisms. Agencies are required to set up feedback mechanisms to engage the public about where agencies should focus open data efforts, such as facilitating and prioritizing the release of datasets. Agencies are also required to identify public points of contact for agency datasets.

Agency Public Data Listings are made available on agency websites as JSON files following the Project Open Data metadata schema (at agency.gov/data.json) and are then harvested into the central catalog for Data.gov. Each agency is responsible for its own data.
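
To make the idea of a harvestable JSON listing concrete, here is a minimal sketch that reads a data.json document and lists the datasets it describes. It assumes the datasets sit in a top-level `dataset` array and that each entry has a `title` key; the sample document is invented, and no real agency file is fetched.

```python
import json


def list_dataset_titles(catalog_text: str) -> list[str]:
    """Parse a data.json document and return the title of each listed dataset."""
    catalog = json.loads(catalog_text)
    # Assumption: datasets are listed in a top-level "dataset" array.
    return [entry.get("title", "(untitled)") for entry in catalog.get("dataset", [])]


# Invented sample standing in for the contents of agency.gov/data.json.
sample = """
{
  "dataset": [
    {"title": "Budget 2016"},
    {"title": "Facility Locations"}
  ]
}
"""

print(list_dataset_titles(sample))
```

A real harvester would need to do much more (fetching, validation, error reporting); this only shows the shape of the input.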

How can I add my government data to Data.gov?

Data.gov is primarily a federal open government data site. However, state, local, and tribal governments can also syndicate metadata describing their open data resources on Data.gov for greater discoverability. Data.gov does not host data directly, but rather aggregates metadata about open data resources in one centralized location. Once an open data source meets the necessary format and metadata requirements, the Data.gov team can pull directly from it as a Harvest Source, synchronizing that source’s metadata on Data.gov as often as every 24 hours.

Step 1: Organize your open data for the Data.gov Pipeline

Getting your data source ready for harvesting by the Data.gov catalog differs depending on the type of source:

Federal Geospatial Data: A number of federal agencies hold geospatial data that has separate requirements under different legal authorities.

Non-Federal Data: Non-federal sources are not covered by the Federal Open Data Policy, but can be included in the Data.gov catalog voluntarily. Depending on your platform, creating this harvester might just be the push of a button or it could take a little more work, but the team will walk you through it either way. (See: https://www.data.gov/local/add)

Step 2: Coordinate with Data.gov

Contact the Data.gov team. Email the Data.gov team (datagov@gsa.gov) to let them know you’d like to get started. Please include a link to your metadata in the data.json format, along with any relevant links, or let us know if you have questions about how to create a data.json file from your current database.

Connecting the pipes. The Data.gov team will create a new Harvest Source that will automatically collect information about your datasets and update Data.gov whenever changes are made on your data catalog.

Testing. The Data.gov team will test to ensure the harvester works properly. If anything seems wrong, the team will help you configure your data catalog so that Data.gov can collect your datasets without any errors.

Live within 24 hours! Once the harvester has been tested successfully, Data.gov will start automatically consuming information about your datasets, and all the basic details of your datasets will be available on Data.gov with links to the source and your open data policy.
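
One way to picture what a harvester does on each run is as a comparison between the metadata it saw last time and the metadata it sees now. The sketch below is an assumption-laden illustration of that idea, not the actual Data.gov harvester: `diff_snapshots` is a hypothetical helper, and the snapshots are invented dictionaries keyed by a dataset identifier.

```python
def diff_snapshots(
    previous: dict[str, dict], current: dict[str, dict]
) -> dict[str, list[str]]:
    """Compare two metadata snapshots keyed by dataset identifier."""
    added = sorted(current.keys() - previous.keys())
    removed = sorted(previous.keys() - current.keys())
    changed = sorted(
        key
        for key in current.keys() & previous.keys()
        if current[key] != previous[key]
    )
    return {"added": added, "removed": removed, "changed": changed}


# Invented snapshots from two consecutive harvest runs.
yesterday = {
    "parks": {"title": "City Parks", "modified": "2017-05-01"},
    "permits": {"title": "Building Permits", "modified": "2017-05-02"},
}
today = {
    "parks": {"title": "City Parks", "modified": "2017-06-01"},
    "trees": {"title": "Street Trees", "modified": "2017-06-01"},
}

print(diff_snapshots(yesterday, today))
```

Here "parks" is reported as changed, "trees" as added, and "permits" as removed, which is the kind of update a nightly synchronization would carry over to the catalog.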

How do I find data on Data.gov?

Browse on the left side through types, tags, formats, groups, organization types, organizations, and categories. Clicking on multiple items narrows your search. You can click on the “x” to the side of any single item to remove it from the search, or “clear all” to remove all selected items in a category.
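
The narrowing behavior can be thought of as a conjunctive filter: a dataset stays in the results only if it matches every selected item. The sketch below illustrates that logic on an invented in-memory list; the facet names (`formats`, `tags`), the records, and the `narrow` helper are hypothetical, and this is not how the Data.gov catalog is implemented.

```python
def narrow(datasets: list[dict], selected: list[tuple[str, str]]) -> list[dict]:
    """Keep only the datasets that match every selected (facet, value) pair."""
    return [
        dataset
        for dataset in datasets
        if all(value in dataset.get(facet, []) for facet, value in selected)
    ]


# Invented records, for illustration only.
catalog = [
    {"title": "Air Quality", "formats": ["CSV", "JSON"], "tags": ["environment"]},
    {"title": "Road Network", "formats": ["CSV"], "tags": ["transportation"]},
    {"title": "Water Quality", "formats": ["PDF"], "tags": ["environment"]},
]

one_filter = narrow(catalog, [("formats", "CSV")])
two_filters = narrow(catalog, [("formats", "CSV"), ("tags", "environment")])

print([d["title"] for d in one_filter])   # ['Air Quality', 'Road Network']
print([d["title"] for d in two_filters])  # ['Air Quality']
```

Removing an item from `selected` widens the results again, which mirrors clicking the “x” next to a selected item.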

Search by geospatial area by drawing a bounding box on the map at the left side and clicking “Apply” to find all datasets that are tagged for that geographic area.
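
A common way to match datasets against a box drawn on a map is to test whether the drawn box overlaps each dataset's own geographic extent. The sketch below shows that test under stated assumptions: boxes are given as west, south, east, north in degrees, the dataset names and extents are invented, and boxes that cross the antimeridian are not handled. The source does not describe how the catalog performs this match, so treat this as one plausible approach.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    west: float
    south: float
    east: float
    north: float


def intersects(a: BBox, b: BBox) -> bool:
    """True when two boxes overlap or touch (antimeridian crossing not handled)."""
    return (
        a.west <= b.east
        and b.west <= a.east
        and a.south <= b.north
        and b.south <= a.north
    )


# Box drawn by the user, and invented dataset extents.
drawn = BBox(-78.0, 38.0, -76.0, 40.0)
extents = {
    "dc-trees": BBox(-77.12, 38.79, -76.91, 39.0),
    "seattle-parks": BBox(-122.44, 47.49, -122.24, 47.73),
}

matches = [name for name, extent in extents.items() if intersects(drawn, extent)]
print(matches)  # ['dc-trees']
```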

Once you find a dataset or tool of interest, click on the title and you will be taken to a page with more details on that specific dataset or tool. Some datasets are downloadable, while others are links to web sites or apps that help you access or use the data.
Please note that by accessing datasets or tools offered on Data.gov, you agree to the Data Policy, which you should read before accessing any data. If there are additional datasets that you would like to see included on this site, please suggest more datasets here.

What if I am having difficulty downloading a dataset from the catalog?

Media inquiries

If you are preparing an article or creating an event, and are interested in finding information, a speaker, or other help in communicating about open data, contact the Data.gov team at datagov@gsa.gov.