Data Policy #1: U.S. Global Change Research Program

I have sometimes been asked why I don’t start a FOI action with respect to source code and source data from the various multiproxy authors. I don’t preclude the possibility totally. However, my inclination has been to try first to obtain the data and code from the authors through a direct and polite request and, if that is unsuccessful, through a request to the funding agency and/or journals. I think that the stated U.S. policies make it very clear that source data and code pertaining to climate change policy should be publicly archived. To the extent that this is not being done, the acquiescence by the agencies, as well as the failures of the authors, are pertinent to policy-makers. The alphabet soup of U.S. agencies and U.S. policies is pretty confusing. I’ll try to set out some notes here as a road map to the confusing world of U.S. policy.

What appears to me to be the most senior statement on data archiving by top policy-makers on an occasion when they turned their minds to the issue (and a statement that is prominently displayed on a government website as still being in force) is by the U.S. Global Change Research Program (USGCRP), which (as I understand it) is part of the Office of Science and Technology Policy, Executive Office of the President. Their website provides a clear history of policies and guidelines.

The most senior policy statement on data archiving appears to be the July 1991 Policy Statement from USGCRP, which appears to be still in effect, together with guidelines discussed below. The Policy Statement (now here) should be read in its entirety.

The overall purpose of these policy statements is to facilitate full and open access to quality data for global change research. They were prepared in consonance with the goal of the U.S. Global Change Research Program and represent the U.S. Government’s position on the access to global change research data.
The U.S. Global Change Research Program requires an early and continuing commitment to the establishment, maintenance, validation, description, accessibility, and distribution of high-quality, long-term data sets.
1. Full and open sharing of the full suite of global data sets for all global change researchers is a fundamental objective. As data are made available, global change researchers should have full and open access to them without restrictions on research use.
2. Preservation of all data needed for long-term global change research is required. For each and every global change data parameter, there should be at least one explicitly designated archive. Procedures and criteria for setting priorities for data acquisition, retention, and purging should be developed by participating agencies, both nationally and internationally. A clearinghouse process should be established to prevent the purging and loss of important data sets.
3. Data archives must include easily accessible information about the data holdings, including quality assessments, supporting ancillary information, and guidance and aids for locating and obtaining the data.
4. National and international standards should be used to the greatest extent possible for media and for processing and communication of global data sets.
5. Data should be provided at the lowest possible cost to global change researchers in the interest of full and open access to data. This cost should, as a first principle, be no more than the marginal cost of filling a specific user request. Agencies should act to streamline administrative arrangements for exchanging data among researchers.
6. For those programs in which selected principal investigators have initial periods of exclusive data use, data should be made openly available as soon as they become widely useful. In each case the funding agency should explicitly define the duration of any exclusive use period.

Commentary within the 1991 Policy Statement included the statements that:

Global change researchers include those in academic, industry, government, and non-government sectors conducting both basic and applied research.

data must be submitted to archives and information about data sets must be created and made available as well. The access policies for these archives should encourage the widest possible use of global change research data in meeting the objectives of the U.S. Global Change Research Program.

Deciding when data become widely useful is the responsibility of the funding agency, which should explicitly define the periods of restricted access, if any. In the past, some Principal Investigators have retained data for indefinite periods and this has inhibited their widespread use. This practice should be eliminated.

In 1997, USGCRP endorsed the following grant language for use by its participating agencies. [June 2010: this link is now deleted. The language is preserved in Appendix 4 of the 2007 GAO report (here), which provides the same quote as CA.]

SUGGESTED DATA PRODUCT REQUIREMENT FOR GRANTS, COOPERATIVE AGREEMENTS, AND CONTRACTS
Describe the plan to make available the data products produced, whether from observations or analyses, which contribute significantly to the grant results. The data products will be made available to the grant without restriction and be accompanied by comprehensive metadata documentation adequate for specialists and non-specialists alike to be able to not only understand both how and where the data products were obtained but adequate for them to be used with confidence for generations. The data products and their metadata will be provided in a standard exchange format no later than the grant final report or the publication of the data product’s associated results, whichever comes first.
Minimum Application – All such applicable data identified as important to the USGCRP
Desired Application – All such applicable data

In 2002, they provided the following clarification of what "data" includes:
[Original link dead. The language is preserved in Appendix 4 of the 2007 GAO report.]

1. Federally funded data significantly related to the USGCRP that includes:
A. Data resulting from observations, the application of algorithms to data to produce new data, and from the data output of models.

If you look at the URLs cited above, you’ll see very strong language from senior policy-makers expressing frustration at the holdback of publicly funded data by scientists. So at this level, the failure of Thompson to archive considerable portions of his ice core data, the failure of Jacoby and Hughes to archive tree ring data, the failure of Jones to archive station data, and the refusal of Mann to archive source code or make it available are all clearly against these policies.

14 Comments

I guess grant money is paid (in full) up front, not after delivery of work products, as would be stipulated in a private transaction or (even) in a government contract (as opposed to grant). Well, as anyone alive in the real world knows, if you’ve given over the money, you’ve lost your leverage.

The USGCRP policy statement is a wish, not a regulation. As such it is not enforceable. The GCRP is merely a coordinating committee of agencies. So far as I know, the only federal agency with a data archiving requirement is NIH, and that only since October. Nor does FOIA apply to funded research results. There is a law that requires disclosure of research data, but it is limited to data used in regulatory actions. The Mann case may well lead to a broadening of that mandate to policy-related research, but it will take a new law to do that. The recent Energy Committee request for information from Mann et al., as well as from NSF, could be the first step in the required legislative process. In addition, there is a broad movement toward data sharing for scientific purposes, but it has made little progress on the Hill. Stay tuned.

To David Wojick. The NIH doesn’t require archiving of data. It only encourages authors to upload their completed articles on acceptance for publication and journals to permit their display at the PubMed Central website reasonably early. I’m not aware of any publicly accessible repository of raw data in the biomedical fields.

Well, Steve, I’ve condensed your words a bit so that a bloke scanning them might be attracted to write responses like 9-14.

If you look at the URLs cited above, youll see very strong language expressing frustration at holdback. So at this level, the failure of Thompson ice, the failure of Jacoby and Hughes to archive ring data, the refusal of Mann to make source ..

Being serious, I’m trying to get data from SH by writing & talking to the authors and mutual friends. I do hope that I do not cut across your bow. Writing with regard to a blog like yours without knowing too much of what’s going on behind the scenes can be something of a loose cannon.

[…] look first at the overall policy. In this post Data Policy #1: U.S. Global Change Research Program, I discussed a clear policy statement by the U.S. Global Change Research Program in 1991 requiring […]

[…] seems to be developing for these issues. First and most importantly, here is some information on U.S. federal government policy on archiving of data. There are definite and long-standing policies on archiving data which are being flouted by climate […]