More on NSF Data Archiving Policies

Some time ago, I posted some information on the data archiving policy of the U.S. Global Change Research Program (USGCRP) and its guidelines to various agencies. I’ve identified four other policy statements from other institutions, including the National Science Foundation, which pertain to present matters.

National Science Foundation 1989
Even before the establishment of the USGCRP policy in 1991, the National Science Foundation adopted a data-sharing policy of its own. In April 1989, it sent a Notice to NSF client institutions announcing the implementation of the major recommendations of the National Science Board Report "Openness of Scientific Communication" (NSB 88-215), approved in December 1988.

It expects investigators to share with other researchers, at no more than incremental cost and within a reasonable time, the primary data, samples, physical collections, and other supporting materials created or gathered in the course of the research. It also encourages awardees to share software and inventions or otherwise act to make such items or products derived from them widely useful and usable.

NSF [Program management] will implement these policies in ways appropriate to the field of science and circumstances of research through the proposal review process; through award negotiations and conditions; and through appropriate support and incentives for data cleanup, documentation, dissemination, storage, and the like. Adjustments and, where essential, exceptions may be allowed to accommodate the legitimate interests of investigators and to safeguard the rights of individuals and subjects, the validity of results, and the integrity of collections.

Note: The added words "Program management" appear in the version of the above policy as posted here, while the version linked here does not contain these words.

This policy goes on to say:

Appropriate commercialization of the results of research will continue to receive encouragement by permitting grantee institutions [my bold] to keep principal rights to intellectual property conceived under NSF sponsorship. The Foundation emphasizes, however, that retention of such rights does not reduce the responsibility of researchers and institutions to make research results and supporting materials openly accessible.

On the question of computer source codes, investigators [my bold] retain principal legal rights to intellectual property developed under NSF award. This policy provides for the development and dissemination of inventions, software and publications that can enhance their usefulness, accessibility and upkeep. Dissemination of such products is at the discretion of the investigator.

The title under NSF policy (as well as under university policy, as discussed here) would appear to reside with the institutions rather than the individuals. One also cannot help but question the diligence of program management in "encouraging awardees to share software". I am not privy to the inter-party communications, but, from my point of view, it seems that they did exactly the opposite: they went out of their way to attempt to legitimize Mann’s withholding.

Earth System History 1995
The Earth System History program is an NSF program which has funded many paleoclimate studies; David Verardo is the contact. I know that many studies of interest were funded under this program, but I have not parsed particular grants to verify whether they might have been issued under some other program, which may have different policies.

The following data archiving policy is said here to have been adopted by the Earth System History Program:

Successful global change research requires a strong commitment to the establishment, description and accessibility of high-quality data sets. Thus all data generated or used in ESH research will be shared in a full and open manner in accordance with USGCRP policy. ESH data should be submitted to the World Data Center-A (WDC-A) for Paleoclimatology (Boulder, CO) within three years of generation or at the time of publication, whichever comes first. [my bold] The WDC-A provides advice on how to submit data to their permanent archives, and will make all paleoenvironmental data easily available via electronic and magnetic media.

Division of Earth Sciences 2002
The next policy statement that I wish to mention is the April 2002 statement of the Division of Earth Sciences, a division within the Geosciences Directorate of the NSF, which is jointly responsible for the Earth System History program. This is posted up here and is attached as an appendix to the publication "Geoscience Data and Collections: NATIONAL RESOURCES IN PERIL" by the National Research Council of the National Academies.

The Division of Earth Sciences conforms to the following statement on sharing of research results and data (NSB-88-215; PAM Manual #10, VII, G.2b): …The Division of Earth Sciences is committed to the establishment, maintenance, validation, description, and distribution of high-quality, long-term datasets. Therefore:

1. Preservation of all data, samples, physical collections and other supporting materials needed for long-term earth science research and education is required of all EAR-supported researchers.
2. Data archives must include easily accessible information about the data holdings, including quality assessments, supporting ancillary information, and guidance and aids for locating and obtaining data.
3. It is the responsibility of researchers and organizations to make results, data, derived data products, and collections available to the research community in a timely manner and at a reasonable cost. In the interest of full and open access, data should be provided at the lowest possible cost to researchers and educators. This cost should, as a first principle, be no more than the marginal cost of filling a specific user request.
4. Data may be made available for secondary use through submission to a national data center, publication in a widely available scientific journal, book or website, through the institutional archives that are standard for a particular discipline (e.g. IRIS for seismological data, UNAVCO for GPS data), or through other EAR-specified repositories.
5. For those programs in which selected principal investigators have initial periods of exclusive data use, data should be made openly available as soon as possible, but no later than two (2) years after the data were collected. [my bold] This period may be extended under exceptional circumstances, but only by agreement between the Principal Investigator and the National Science Foundation. For continuing observations or for long-term (multi-year) projects, data are to be made public annually….

I’m not sure if Verardo’s view is that "publication in a widely available scientific journal" is an alternative to archiving of the data, although this viewpoint would seem to be specifically precluded by the Earth System History policy of 1995. If the "widely available journal" is (say) Science, which has no policies on data archiving, the result in many cases is that data sets go unarchived: Thompson’s widely cited data sets from the Himalayas (Dunde, Dasuopu and Guliya) have remained completely absent from any digital archive, for up to 17 years and counting in one case. One can only be grateful that the House Committee has turned its attention to this scandalous situation, and hope that their focus is as broad as possible and that they resolve any possible ambiguity in this policy. (Personally, I do not think that this sentence provides any justification for archiving failures.) In this respect, it is very helpful that assistance has been offered by journals like Science, whose policies on data archiving are obviously inadequate (as evidenced by their serial publication of Thompson’s unarchived material), and by a group of learned members of the National Academy, including Thompson himself, who will undoubtedly be able to provide very helpful information on why he has not archived key datasets and how NSF has acquiesced in this.

PARCS (Paleoenvironmental Arctic Sciences)
The last policy statement to be mentioned today is that of PARCS, which describes itself on the NOAA website here as follows:

PARCS represents a community of researchers who study past climates and environments of the Arctic and sub-Arctic. Through the PARCS structure, this community develops science goals for arctic paleoenvironmental research. Funds are earmarked at NSF for PARCS research, and these goals are used as science guidelines within relevant NSF programs (more on PARCS funding). Investigators funded via PARCS are expected to follow PARCS data collection and archiving protocols and to participate in PARCS meetings.

4 Comments

Does anyone know if the NSF or IPCC have responded to Barton’s inquiry?

Also, does anyone know if and when Barton will be holding hearings? Hopefully C-SPAN will televise them if they occur.

Those who defend Mann should realize that being wrong in science is common and not a scandal. Covering up and obfuscating bad science is a scandal. Remember, President Clinton wasn’t impeached for having sex with an intern but for lying about it. When this scandal hits the mainstream media it will be a devastating public relations blow to the AGW political and scientific community. Honesty is always the best policy, even if it means fewer billions of dollars in grant money.

Given the particular issues in this field (its policy importance, the huge amounts of $$ being shoveled to a bottlenecked set of participants in a traditionally small field, and its very mathematical nature with lots of confounding variables), it would be advisable for NSF to put more emphasis on data (and methods, codes) sharing than they might in other fields (like materials synthesis).

Now, with all of the evidence that has been put forward, you’d think these guys were opening themselves to charges of research misconduct. Sadly, this is not the case.

The National Science Foundation (NSF), along with most governmental agencies, has adopted a version of the Federal Research Misconduct Policy (FRMP). The NSF version contains the boilerplate FRMP definitions, adopted by most agencies without change, of research misconduct. One states:

A finding of research misconduct requires that:

(1) There be a significant departure from accepted practices of the relevant research community; and

(2) The research misconduct be committed intentionally, or knowingly, or recklessly; and

(3) The allegation be proven by a preponderance of evidence.

Why couldn’t Mann et al. be found guilty of a misconduct charge for falsification? Consider that according to the FRMP (emphasis mine):

Falsification is manipulating research materials, equipment, or processes, or changing or omitting data or results such that the research is not accurately represented in the research record.

We know from the CENSORED file, from the extension of a data series to allow its inclusion, and from the lack of an R2 statistic that falsification has occurred. It is clear that it was not accidental but deliberate, and it only needs a preponderance of evidence (including his refusal to reveal data when asked) to find in the affirmative. So what’s the problem?

The problem is, we have to show that his actions are “a significant departure from accepted practices of the relevant research community” … when the sad truth is, they seem to be depressingly common practices in his neck of the woods …

[…] than of the grant recipients who abused their compliance responsibilities (for example, see here , here.) I will try to review their conduct in a future post as well, as some of NSF’s most questionable […]
