More Data Refusal – Nothing Changes

Phil Jones and his coauthors in the recent multiproxy study (Neukom et al. 2011, Climate Dynamics, "Multiproxy summer and winter surface air temperature field reconstructions for southern South America covering the past centuries") did not archive the proxy data in the Supplementary Information. Many proxy series used in the study are not otherwise publicly archived.

I wrote to lead author Raphael Neukom as follows:

Dear Dr Neukom,
I notice that your recent multiproxy article uses a number of proxies that aren’t publicly archived. Do you plan to provide an archive of the data as used in your study? If not, could you please send me a copy of the data as used. Thanks for your attention.
Regards, Steve McIntyre

I received the following answer refusing the data:

Dear Steve,

Thanks for your interest in our work. Most of the non-publicly available records were provided to us for use within the PAGES LOTRED-SA initiative only and I am not authorized to further distribute them. You would need to directly contact the authors. I am sorry for that.

If you are interested in a particular record, let me know and I can provide the contact details.

Cheers,
Raphael

Every inquiry into paleoclimate controversies, no matter how much whitewash was applied, concluded that climate scientists should archive data. If Neukom, Jones and their coauthors publish a multiproxy article, that means the multiproxy data, not just the output. If the contributing authors are not willing to archive their data, then it shouldn’t be used in a study in a climate journal. End of story.

Nor is it sufficient for the author to provide the addresses of the various contributors and force an interested reader to obtain data from each of them individually. There’s no guarantee that they will cooperate. The obligation rests with the publishing authors.

Making matters even worse in the present case is that many of the unarchived series were published by named Neukom coauthors. If they aren’t prepared to have their data see the light of day, don’t sign on as a coauthor and don’t allow Neukom to use your data.

While Phil Jones was not lead author of the study, he was a coauthor. As someone with recent adverse experience in data archiving issues, Jones should have insisted that the Neukom coauthors provide an exemplary data archive and, if they were unwilling to do so, Jones should have withdrawn as a coauthor. Similarly, the University of East Anglia should have adopted policies that require its authors to ensure that proper data archiving practices are mandatory in publications in which UEA employees are coauthors. Either UEA has failed to adopt such a policy or, if they have, Jones has ignored it.

PAGES, the organization that has sponsored or acquiesced in this latest secrecy, has the following mission:

PAGES is a core project of IGBP [International Geosphere-Biosphere Program] and is funded by the U.S. and Swiss National Science Foundations and NOAA

Climate scientists, rather than learning anything from Climategate, have, if anything, become more stubborn than ever. That international programs sponsored by funds from the Swiss NSF, US NSF and NOAA should sponsor and/or acquiesce in non-archiving was bad enough before Climategate, but totally unacceptable after Climategate.

The sending of Swiss and/or US federal funds to climate institutions and programs which do not adhere to data archiving policies seems a practical and useful topic for an oversight committee and I hope that one of them takes up the issue. If nothing else will change the archiving practices of climate scientists, maybe the funding agencies can.

And by the way, as I’ve said on many occasions, I don’t believe that new data policies are needed. If policies enunciated in the 1990s were applied to paleoclimate by NSF, I believe that that would deal with 95% of the problem in paleoclimate. However, in my opinion, NSF (paleoclimate) has become a cheerleader for the small paleoclimate industry and abdicated its obligations to ensure compliance with US federal data archiving policies.

I replied to Neukom as follows:

Thank you for your reply, which, unfortunately I do not accept. If you publish a multiproxy article using non-archived proxy data, you should obtain the consent of the contributors for archiving the data when the study is published or otherwise not use this data. It is your responsibility to obtain these consents, not the responsibility of the interested reader to try to obtain the data from potentially uncooperative contributors after the fact.

Future progress in understanding climate history will depend increasingly on the provision of well-documented data. Therefore, in addition to providing a set of useful data links, PAGES has initiated the PAGES Databoard. This service is intended to ensure the compatibility and accessibility of available paleo-databases.

Lucia asked whether the requested series were there. Many of the requested series appear to be on that list or to be composites constructed in some fashion. Some of the inputs to Neukom et al are identified as “Clusters” of tree ring data. The clusters appear to be composites of individual tree ring chronologies listed in the PAGES series.
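On the "clusters": one plausible (and purely hypothetical) way such a composite could be formed is by standardizing each tree-ring chronology and averaging year by year. The function names and the chronology values below are invented for illustration only; nothing here claims to reproduce the method actually used by Neukom et al.

```python
# Hypothetical sketch of a "cluster" composite: standardize each
# chronology to zero mean and unit variance, then average the
# standardized series year by year. All numbers are invented.
from statistics import mean, pstdev

def standardize(series):
    """Convert a chronology to z-scores (zero mean, unit variance)."""
    m, s = mean(series), pstdev(series)
    return [(x - m) / s for x in series]

def composite(chronologies):
    """Average several standardized chronologies year by year."""
    z = [standardize(c) for c in chronologies]
    return [mean(vals) for vals in zip(*z)]

# Three invented chronologies over the same five years
chronos = [
    [1.0, 1.2, 0.8, 1.1, 0.9],
    [0.9, 1.3, 0.7, 1.0, 1.1],
    [1.1, 1.0, 0.9, 1.2, 0.8],
]
cluster = composite(chronos)
print(len(cluster))  # one composite value per year
```

Even under a simple scheme like this, an auditor cannot tell whether the full chronologies or only subsets went into each cluster without the underlying data.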

PAGES in effect is using public funds to archive publicly funded data for private use. It reminds me in a way of the password-protected archive at SO&P (SOAP), which rebuffed all my efforts to get access to their data in 2005. The gatekeeper for the program in that instance was none other than Climategate’s own Keith Briffa.

Well, something has changed since Climategate: far more people now know, through the climate scientists’ own e-mails, that climate scientists don’t share their data in order to avoid critique. With this episode they are just further poisoning their field and, in turn, narrowing their funding stream.

I don’t disagree with your sentiment. However, I bet Neukom, who doesn’t own the data and can’t impose requirements on the group providing it (he can only choose to use it or not under their restrictions), will simply direct folks who are mad at him to ‘locate their own data to use’, as he did.

I suppose it would be increasingly sinister if this PAGES organization refuses to provide this same information to other researchers who are interested in using it.

Although, as you say…once CRU makes a policy about this…they have to stick to it, and other concerns by other groups (even the data providers) are secondary.

It is not the responsibility of interested readers to spend their time locating data to support and replicate a peer-reviewed and published article. That is the responsibility of the lead author. The lead author must ensure that all data used in the article is publicly available.

Most funding organizations have existing bylaws which require data archiving. The funding organizations must merely apply their existing bylaws to ensure that all data used in the studies they fund is publicly archived.

The journals which publish scientific articles (mostly) have existing policies which require the data used in articles to be publicly archived. The journals should not publish articles unless the lead author has publicly archived all data prior to the date of publication.

The data can be used to verify the paper and validate its methods. If the data is not available, for any reason, then the effect and influence of the study is diminished, likely fatally. If this study is to be used in an IPCC report then its utility for that purpose is, in that way, greatly diminished.

IPCC proponents wonder why Copenhagen, Cancun, Kyoto and the rest have failed and why their influence is fading. They should realize that the fact that their work cannot be effectively validated is one major reason. The fact that the term “climate scientist” is now something of a joke should be yet more evidence.

Watts will be subject to the same effect. However, this does not diminish its effect on “climate scientists”.

If the data is not included or easily available, then the work cannot be replicated. It is then, by definition, not science and should not be published in any science journal. In this situation peer review is also impossible, so “reviewers” who pass manuscripts that cannot be reproduced are also not doing their jobs.

Where did we get the idea that the data is the thing we are supposed to be proprietary about, rather than the experimental design and the interpretation, which are the actual intellectual property?

Just wondering if “clustered” means groupings of all of the data in certain series (5 series with 1000 records each totaling a “cluster” of 5000 data points) or some unknown grouping of just some of the data in the series (5 series with 1000 records each totaling a “cluster” of only 2000 data points)?

I would guess that you will have to go through a few hoops to get what series make up a cluster only to find that they did not use the full series in the cluster.

I went to the PAGES-LOTRED-SA site listed above and found a list of 305 individual data records. About two-thirds state that the data is not available to the public and refer the reader to an individual POC. About one-third say that their data is archived in the NOAA Paleoclimate database. I went to the NOAA site but was unable to get into the LOTRED-SA section (could be a minor net hiccup); I was, however, able to view NOAA’s policy regarding openness of data, which, by the way, was crafted last year right after Climategate! Read for yourselves.

The bigger question is, why would it be withheld? We’re not talking corporate proprietary information, or national security data. This is data paid for by governments (which means the people of those countries).

Here is an instance where archiving of data such that it was “freely available to the general public” led to substantial improvements in scientific insight.

With respect to the high-profile, high-impact Mann et al (2008) reconstruction, [access to properly-archived data led to the demonstration] that the three (not four) Tiljander data sets (not proxies) were wrongly used, and that the key conclusion of this paper depended on such wrong use.

Nature’s policy states data should be available to “readers”. Unless you think libraries require their users to pass an entrance exam, one would assume Nature’s intent is that data should be made available to the general public, but “that’s just my opinion.”

Why do people still want to argue about this? It is clear that all public funding agencies are under intense pressure to foster openness. EPA rule making requires openness. My guess is that other regulatory bodies will have to move in the direction of more openness rather than less. If climate scientists want to influence policy, they have to be open with their data.

Does someone’s funding actually have to be cut or does someone have to be publically humiliated before a legislative body before common sense takes hold?

Around the poker table, we have a saying for players who are obviously playing a foolish hand but still contemplating staying in the round…”Get out, you lost”.

IGBP will continue many of its successful approaches to implementation from its first phase including: building research networks to tackle focused scientific questions; promoting standard methods; undertaking long timeseries observations; guiding and facilitating construction of global databases; establishing common data policies to promote data sharing; undertaking model inter-comparisons and comparisons with data; and coordinating complex, multi-national field campaigns and experiments. In addition, IGBP will facilitate comprehensive interactions between modellers and experimentalists and will forge an international institutional network for Earth System science.

IGBP and Global Change National Committees will continue to be essential to research planning and implementation in IGBP, enabling a dialogue between national and international levels of global change research.

Objectives are all well and good. Implementation is what reveals true attitude and intent. Nothing in the statement of intent however indicates any support for proper scientific openness and fostering of discussion and debate. In fact the last phrase is circular and self-referential.

In its policies and guidelines for sponsored scientists, the Swiss National Science Foundation (SNF) stipulates (free and partial translation from German):

Article 44:
The sponsored persons are obliged to make information about the sponsored project and its results publicly available … in particular:
– they have to adhere to the principles of Open Access
– data collected using SNF funds must be made available for secondary research by other researchers and has to be accessible through a scientifically recognized data repository

Switzerland is also a signatory of the 1998 Aarhus Convention on Access to Information, Public Participation in Decision-making and Access to Justice in Environmental Matters, though it does not seem to have ratified it.

I find it refreshing that the “Team” and the folks from the sidelines they send in have moved on from their previous responses to information requests, which could be summarized as, “Go fish. Nyah, nyah, nyah!”

They now seem to have moved on to the policy position of, “It may be around somewhere. If you can’t find what you want, we double-dog dare you to do something about it!”

Sadly, the attempts to then “do something about it” will again be characterized as being anti-science.

If the contributing authors are not willing to archive their data, then it shouldn’t be used in a study in a climate journal. End of story.

Nor is it sufficient for the author to provide the addresses of the various contributors and force an interested reader to obtain data from each of them individually. There’s no guarantee that they will cooperate. The obligation rests with the publishing authors.

“Thanks for your interest in our work. Most of the non-publicly available records were provided to us for use within the PAGES LOTRED-SA initiative only and I am not authorized to further distribute them. You would need to directly contact the authors.”

I’d be tempted to write back and explain that you wish to use them within that initiative to corroborate the work.

“The IGBP places high priority on establishment, maintenance, validation, description, accessibility, and distribution of high-quality, long-term global data sets, including the synthesis or generation of new global data sets,” and, “Full and open sharing of the full suite of global data sets, and other data sets needed for global change studies, is the primary objective of the IGBP-DIS” (IGBP Report No. 12).

Look how long it takes to distort the policy. The stated scientific goal of LOTRED is to make a secret database.

(i) to collate, maintain and share a common state-of-the-art protected data base (for contributors only) with the available multi-proxy data sets, and

One small question – a number of the Neukom coauthors, including Luterbacher and Grosjean, were not themselves “contributors” to the data base. I wonder why they were permitted access while I wasn’t. I guess it helps if you’re a member of the Team.

I do not see why that would/should surprise you — but then maybe you missed the “About” page:

This LOTRED- SA initiative is conceived as a collaborative long-term effort that seeks

(i) to collate the large number of disperse already existing and new multi-proxy paleoclimate data sets (documentary data, early instrumental data and natural proxies) for the last ca. 1000 – 2000 years available for South America, and

(ii) to use the Mann et al. (1998), Luterbacher et al. (2004) and Moberg et al. (2005) methodologies to work towards a regional reconstruction at different temporal and spatial resolution for southern South America.

This initiative seeks to involve research groups from different countries working within a common frame for a common goal.

Well, then, formally apply to be a “contributor” and see what they say.

You’re a member of the club now, Steve. You’ve been published in peer review, much to their disgust.

The problem will be, even if they accept you, they likely won’t let you publish your results the way you are used to doing, as that would require you to put their data in the open, along with your programs that analyze it.

Excellent point, reminding us that once the openness genie is out of the bottle it’s darn hard to push it back. One would say history is on our side, despite the last year. In which case this is a very foolish, public blip.

Nevermind – looks like output…
——————————————————
DESCRIPTION:
This dataset contains the austral summer (DJF) and winter (JJA)
surface temperature reconstructions for southern South America (SSA)
as presented in Neukom et al. 2010. The data are based on the
PCR reconstruction (details see paper) and cover the following periods:
Summer: 900-1995
Winter: 1706-1995
——————————————————

It is not just because Steve is notorious or perceived as a sceptic. Before I published my non-treering recon, I got the identical runaround. Some data becomes unavailable because you can’t track the author down (moved, died, retired), often no reply from the author. It doesn’t even have to be a coverup, just lack of coherence.

Climate scientists, rather than learning anything from Climategate, have, if anything, become more stubborn than ever.

Their increased intransigence is what I would expect—and have experienced in other contexts. They got away without so much as an official reprimand; indeed, many people proclaim them to be heroes, for withstanding those evil climate deniers. For the scientists, the lesson from Climategate is that they can get away with just about anything.

Prof. Keenan, intransigence is putting it mildly. When challenged, they just get even more arrogant.

For example, in the latest RealClimate post, they try to excoriate Forbes journalist Larry Bell for saying that global Accumulated Cyclone Energy for 2009 and 2010 is at record low levels, by presenting hurricane data for ONLY the mid-Atlantic and the Northern Hemisphere… Hmmm, I seem to recall RC reminding us over and over again that Northern Hemisphere temp records ARE NOT THE SAME as global records… But I guess when you spend so much time defending Mann 1998, maybe they got confused and teleconnected the ACE from the NH to the rest of the world. Still, Larry Bell was absolutely right about the ACE records and a couple of other things he wrote.

Funny thing is, I doubt that many people would have paid attention to the Forbes article in the first place. It’s nothing special. But, because the geniuses at RC not only took Bell to task, but made such obvious errors doing so, it will only draw more eyes to the article they so despise.

They really do need to go back to school and take a few courses on public relations.

Re: Sonicfrog (Jan 7 12:21), they are not interested in PR. They are interested in providing talking points for their choir.

Now, the Forbes article surely did not get everything right; neither did it get everything wrong (particularly ACE). RC could have chosen to do a balanced review, but that’s not their role as they see it.

And ya, you’re right, they do draw more attention to the Forbes article than it would have gotten otherwise.

But Mosh… what happened to that PR, reaching out, initiative thingy they were going to do a few years ago? It sounded like such a good idea… Oh, there’s the problem. It was a good idea. No wonder they never followed through.

I know what it’s about bender. I would have thought attempting to get hold of the data just to see if you could would be wasting everybody’s time. It’s more likely that someone is attempting to get hold of the data to try to replicate or otherwise analyse the study. The end result is the same in this case, however.

My eyebrow raised when I saw the phrase I outlined in bold, for obvious reasons.

Sorry, I read you wrong. I thought you were implying SM was either a data predator (a competitor fighting unfairly) or an obstructionist (making spurious data requests to bog down research) – a specious claim sometimes heard at well-funded propagandist websites.

It sure is bender. This is much bigger than any individual – as Craig Loehle has said, it doesn’t depend on the demonisation of McIntyre (disgusting though that has been). It’s a matter of outright contempt for every Climategate Inquiry, which has had the temerity to put their finger on this issue of open data and code (despite their manifold shortcomings). And it represents utter contempt for the general public. I know Steve has said that apologies and changes in behaviour, not resignations, were what he was after in climate science. I for one now think this approach should be reviewed.

One of the purposes of the “Peer Review Process” is to allow others to replicate your findings. This cannot happen without the original data. It seems to me that the publications themselves are also at fault for allowing papers to be published that cannot be independently verified.

Of course, the instantaneous fix for this would be a requirement that any institution or individual receiving a government grant have no outstanding government-funded studies with non-archived data.

Then they can choose to publish the data, buy the data, or find another source of funding.

Quote: “As a part of the IGBP, PAGES supports the free and open exchange of data as described by policies of IGBP and its parent organization ICSU (CODATA 2002). In particular, the PAGES Data Board has established policies that support the development and use of rules of good scientific practice (ESF, 2000), including

• making data and methods available for reproducibility of results,
• making data behind any published graphic or figure publicly available, and
• ethical use of data, including proper citation.” – end quote

I will once again climb on to my Popper-channeling soapbox: if the data is not publicly available, this is not SCIENCE. It is merely hearsay. It is hearsay until it has been thoroughly vetted/audited by any and all interested others. Despite Gavin’s “Doesn’t anybody trust us?” plaint, science is not about trusting the researchers. “Disgusting”–bender always hits the nail on the head–is the word for it. It’s as though paleoclimate is still back in the pre-science era, the era of mumbo jumbo.

I don’t suppose that Climate Dynamics has an archiving policy. The website’s instructions for authors mentions how to arrange data for Supplemental Information, but doesn’t say anything about requiring all source data to be archived there (or at least linked to from there). Does anyone here know any more details?

One of the coldest winters I’ve ever seen in California this year and a lot more rain than the Climategate scientists were predicting. I have no doubt that our planet is dying, but I’m not convinced that we can do anything (even if our whole species went Green) to stop it. Climategate was made so that “scientists” could keep publishing and corporations could keep pursuing their agendas.

The IPCC in its statement on “PRINCIPLES GOVERNING IPCC WORK” says nothing on data but does say:
“The role of the IPCC is to assess on a comprehensive, objective, open and transparent basis the scientific, technical and socio-economic information relevant to understanding the scientific basis of risk of human-induced climate change, its potential impacts and options for adaptation and mitigation.”
Rather than spending too much time on single papers, we should prepare to identify papers which, like this one, are far from being “open and transparent” and use the power of the blogosphere to challenge the IPCC about their inclusion in the next AR.

Both blogs and political pressure, I would think. AFAIK, a congressional panel could bring in anyone they wanted and ask for the raw data, meta data, etc. as a quality check on the research underlying reports used to make laws and rules.

I believe this secrecy and concealment does not serve its participants quite as well as they think.

By hiding their work from scrutiny, they conceal from themselves any small flaws or errors. This strikes me as rather the reverse of the benefit of the Royal Society, wherein members benefitted from sharing quality work.

I’ve got a partially formed notion relating to Constructal Theory (h/t Willis) and how reconfiguration works. In this case, while it seems like it’s working for them, their position is very vulnerable to catastrophic collapse.

At least these organizations should have a data ombudsman, so that individual scientists are removed from the process of responding to or denying data requests. I mean, seriously, if they are going to deny data, why not just hire some useful tool to do the dirty work of denying, delaying, redirecting, or stonewalling requests, etc.?

I think: post it to the web, period. I was responsible for the data in a large commercial database, and anticipated certain people claiming the data didn’t exist or that they didn’t have access. I set up a restricted guest account so (almost) anybody in the company could see the data, and broadcast an email detailing its existence and how to use it to access the data. When two project managers claimed the data didn’t exist, I sat down with their director and the email (they were on the distribution list) and watched as the director followed the directions step by step, logged into the guest account and accessed the data.

You keep attributing motives of good faith, integrity and professionalism to people who do not deserve the benefit of the doubt.
It is no surprise at all that those involved with promoting or defending the consensus should ignore the lessons of climategate, since they did not pay any cost or suffer any consequence.
They are depending, still, on the social capital they garnered during the golden years of the climate catastrophe.

(2) Can someone who knows the NOAA management personally approach them and politely suggest that they demand that the authors of the paper return their research grants on the basis of non-archiving of raw data and full code?

(3) As another blogger suggested above, it would be very useful if draft wording is forwarded to responsible US House of Reps members to attach to all budget bills, withdrawing funding to all US departments and instrumentalities, universities and such, international or domestic, that do not strictly police the full archiving of code and raw data for projects given US government finance.

I’ve not been following this topic for a while and it is really depressing to return and find things have changed so little. Surely an appropriate thing to do would be to complain to the publishers (Springer). They probably care a lot about reputation because that’s really the only way they will sell journals in the age of REF (this journal has an impact factor of 3.9, which would be very high in my field, but is probably not so hot in applied science). I think it would also be appropriate to complain to the editors.

Just wondering: if you are so eager to obtain that data, did you even try to contact the original authors listed in Neukom et al.? If they refuse to deliver the data, then you have the right to criticize them. But you do not have the right to criticize somebody who was allowed to use not-yet-published data in a major effort to improve our knowledge of past climate variability in a region so poorly covered by long series. In the end, aren’t we all interested in enhancing our knowledge of past and current climate? Isn’t this, after all, the goal of climate science? Constructive critique is very valuable and the essence of scientific progress. But it needs to be constructive and justified, just as in every dialogue between human beings.

There are 305 proxies available for the reconstruction; of those, 144 were looked at, and for the summer recon 22 proxies were “selected” due to being “sensitive”. The authors of the report have access to this proxy data. All we are asking is that this data be archived and readily available to the public alongside the report, together with any methods used in doing the recon. Why does the individual auditor have to send a request to each of the proxy owners when the data is already compiled and available? By the way, most of the proxies are, yes, tree ring data.

It’s not just in science that you can’t give somebody else’s property to another person without the owner’s okay. Most likely, the original authors have their own publication in preparation where they extensively discuss their series. If published, then this data should be publicly available. But you cannot force somebody to share their own data if they haven’t even published it themselves! It might be unfortunate in this case that several of the series were obtained very recently and are thus not yet published. But the decision to share this data is not up to Neukom et al., since they are not the ones who initially produced it. As concerns the method etc., just look at the supplementary material and you will see that this particular study is an example of an extensive discussion of method and uncertainties.

I think you are missing the point that the lack of access to this information is the author’s (and the journal’s) problem, not yours or mine. We know not to use these reconstructions because we can’t verify them.

It’s the author’s and the journal’s problem because they have wasted their time producing unusable stuff. If they wanted the stuff to be usable, they would have organised it so it was verifiable.

Hans, I’ve just published a paper that completely overturns Einstein’s theory of relativity. Unfortunately it relies on carefully selected subsets of secret, unpublished underlying data to reach its conclusions. I did this work because I wanted to “make a major effort to advance our knowledge of physics”.

Without the ability for people to see the underlying data for my work, how worthwhile is my paper? Rate it from 0 to 10, with 10 being “a major advance” and 0 being totally useless. Then calibrate your impressions against the reception and impacts of similarly unverifiable papers in Climate Science.

If published, then this data should be publicly available. But you cannot force somebody to share their own data if they haven’t even published it themselves! It might be unfortunate in this case that several of the series were obtained very recently and are thus not yet published.

Their data has been published via Neukom et al. It has yet to be revealed, however.

If the study were funded with public funds, then it is difficult to make the claim that data produced by the study is private property. If it were not for the funding, the data would not exist.

If you or I donate money to a private organization which wants to use the money it receives to fund studies, then it can rightly claim that data from those studies is private. If we disagree with that claim, then we can stop contributing to the organization.

The situation is very different when public funds are used to produce data.

“If the data is not available, for any reason, then the effect and influence of the study is diminished, likely fatally.”

Not so; like lawyers bringing up inadmissible evidence or points at trial where the judge tells the jury to ignore it, the damage is already done. In AGW the damage is even worse, because it isn’t just localized to one court: it gets spread around the world and amplified, and no one ever tells us to ignore it.

These ‘scientists’ are giving the scientific community a very bad reputation. Failing to archive proxy data? Really? These guys just don’t seem to learn anything!

Can you imagine what would happen, to say a registered company auditor, if it were discovered that some of the auditor’s vital audit working papers, upon which his audit opinion had been based, had not been archived?

Until an element of professionalism is adopted by the scientific community, one really must dismiss any study that cannot be supported by readily available archived data. This is the information age… the age of computers! So… there’s absolutely no valid excuse, whatsoever, for data not to be archived.

Hmm, personally I think there needs to be a central repository of such data: those who produce it would ‘privately sign’ it and upload it to said repository. The repository then catalogs the data and makes it generally available. The repository would be set up to provide business continuity and quality of service (i.e. make it ‘hard’ and cloud-based).

This solves the ‘not available’ and ‘moved on’ problems, while the signing prevents any tweaking of the raw data, as any change would no longer match the signature. This would also standardize programmatic access and make it easier for yet more people to work with the data…

Basically if those charged with making the data available seem unable to do it of themselves – provide a central service that is easier than doing it themselves… Also journals could then mandate that supporting data sets must be provided on such a central service – problem solved!
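The signing scheme proposed above could be as simple as a keyed digest over the raw bytes at upload time, so that any later tweak to the data fails verification. A minimal sketch (HMAC stands in for a real public-key signature; the function names, key, and sample data are all hypothetical):

```python
import hashlib
import hmac

def sign_dataset(data: bytes, key: bytes) -> str:
    # Keyed SHA-256 digest over the raw bytes; changing any byte
    # of the data produces a completely different signature.
    return hmac.new(key, data, hashlib.sha256).hexdigest()

def verify_dataset(data: bytes, key: bytes, signature: str) -> bool:
    # Constant-time comparison against the recorded signature.
    return hmac.compare_digest(sign_dataset(data, key), signature)

# Illustrative proxy series in CSV form.
raw = b"year,ring_width\n1970,1.23\n1971,1.31\n"
key = b"contributor-signing-key"  # placeholder for a real private key

sig = sign_dataset(raw, key)
assert verify_dataset(raw, key, sig)            # untouched data verifies
tampered = raw + b"1972,9.99\n"
assert not verify_dataset(tampered, key, sig)   # any edit is detected
```

In a real repository one would use asymmetric signatures so that anyone can verify without holding the contributor’s key, but the principle is the same: the catalog stores the signature alongside the data, and downstream users check it before analysis.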

Given the starry-eyed idealism of a lot of the above posts, I thought I’d wandered onto the wrong blog…

I’ve published papers based on datasets from both traditional and ‘corporatised’ government departments, and there is always a direct or implied condition that the data MUST NOT be given to any third party, and that I MUST NOT use it for any other purpose than the one I told them about when I requested the data. Even data I collect myself isn’t mine, it belongs to my employers and they’d be mightily upset if I went giving it away. I don’t like it, but I either accept it or never publish anything. I agree that it would be wonderful if all the data was made public in some nice, organised, searchable, verifiable form, but I expect that this will happen sometime in the decades following the end of poverty and the establishment of global peace and universal goodwill. It’s a bit harsh giving Neukom a hard time over it.

A couple of posters also seem to confuse ‘auditing’ with ‘the scientific method’. In the first case, sure, get a copy of the original data, copy the methods to see if you get the same results, look for flaws in the methods, apply new, better methods and compare. It’s useful and necessary work, but it’s not ‘reproducing the experiment’ in a scientific sense. For real science, you need to collect separate data and apply the methods. Not easy in the case of climate records, but if it was easy we’d all be doing it.

As I understand the OP, Steve asked Neukom for a copy of the data, Neukom answered “Sorry, it’s not mine, you’ll have to ask the people who have the authority to release it”. I got a smile from Bender’s comment and I see his (and your) point and agree with it, but I don’t accept that Neukom is to blame. If all public research institutions had to make their data publicly available then the world may well become a better place, but it’s unreasonable to expect Neukom (or anyone else) to put their whole publishing careers on hold until it happens.

If I were to make a somewhat cynical reply to Bender’s rhetorical question, I’d answer “It doesn’t matter what’s wrong with it. It’s how the world works, it’s how the world has always worked and it’s how the world will always work”. Hence my crack about ‘starry-eyed idealism’. I like it no more than you do, but being inside the academic system, I have to live with our part in it and I’ve got some sympathy for the other poor saps who have to live with it also.

Well then the “poor saps” ought to stop lobbying for a role in “informing policy”.

I know that the world of academe and “scholarly” publishing has its own ideas of what reality is and is made up of thousands of small ponds with several thousand “scholars” who think they are big fish. As long as they’re only trying to add a line to their CV’s that’s fine, but when you enter into the world of “informing policy”, you’re stepping into the big leagues and you damn well better raise your game.

Can’t agree more John. Me, I’m quite satisfied with the ‘adding a line to my CV’ part. 🙂

More seriously, I just want to understand my topic. I think Hansen did the world a terrible disservice, when he crossed the line between research and policy and managed to poison ‘climate science’ for decades to come. But I’ll stand by my main point, that it’s not within my authority (or Neukom’s) to release data.

Even data I collect myself isn’t mine, it belongs to my employers and they’d be mightily upset if I went giving it away.

If your employer is the general public, they’re not going to get upset if you make freely available to them that which they have paid for. As you say, it’s not your data, it’s not Neukom’s data, it’s not whoever the hell gathered it’s data- it belongs to the public. Sorry if that bothers you but feeding from the public trough does have its drawbacks.

I work for a (public) university, and if I was to release data without express permission from my Head of Institute then I’d expect to be sacked on the spot. If ‘the public’ wants to storm the gates of the Institute and demand that the data be freed then I’ll cheer them on, but I’m not going to throw away my career on this particular point of principle. It certainly wouldn’t ‘bother’ me if the data was public!

Just to play devil’s advocate for a moment… I’ve sacrificed 8 or 9 years of my life to building up enough scientific experience and credibility that ‘I’ can get access to data. I’ve proven that I can be trusted with data, that I won’t give it away, waste it, lose it or otherwise embarrass myself in front of my peers. I’ve demonstrated that I will understand it, and that I know how to analyse it properly. Is it reasonable that ‘you’, an innocent layman who hasn’t put in the hard yards, should have the same access? ‘You’ may use faulty methods, come to unfounded conclusions, and misinform public opinion with potentially disastrous consequences for government policy…

If you wanted to “publish”, the university would be begging you to release it! (Publish or Perish). No one is demanding that data be distributed world wide just “because”. But in order to verify published findings, the data is critical, and withholding it would just result in another “Cold Fusion” hoodwink.

The University is happy when I publish something, no question about that! But if a condition of my publishing something was that I publicly released ‘their’ 40-year record of forest growth measurements, then… I don’t know for sure, but I doubt they’d approve unless there was a Nobel Prize in it.

If I publish something based on this data and a researcher or reader has an issue with it, then they can say that I used invalid methods, or that my results are not necessarily applicable outside my study area, or that I draw unfounded conclusions. These criticisms may be (often are) quite valid, but reviewers don’t need the data to make those comments.

If I was to write a paper with the conclusion (for example) that “there is no detectable influence of climate change on the growth rates of forests in Central Europe” then I’d expect it to be reviewed mercilessly; I would have to exhaustively describe the dataset, detail and support every statistical method, and severely limit the broader applicability of the study. But I honestly doubt that anyone in my field would expect me to supply the raw data, or that the University would approve its release. ‘Verification’ (or otherwise) would come from some other group doing a similar study, with their own dataset.

Simple: “The science is NOT settled. Sorry policy-makers, you’ll just have to live with it.”

If ‘my’ data and research suggest one thing, and ‘hers’ suggests another, then there would be a good case for devising a study (and getting some funding) to figure out *why* this is so. If one of us has made a cock-up then it’s usually found internally (or on CA…), but WITHOUT the glare of publicity one of us could quietly withdraw or ‘update’ our research. If there is a real, physical reason for our inconsistency, then we learn something from it! Science is actually rather boring when experiments work, it’s the failures that make it interesting. 🙂

Even putting aside the policy makers, both pieces of research are useless in a scientific sense. If the problem is in the data or the interaction between the data and the method it won’t be settled until someone comes along and shows the data.

It’s a problem with ‘climate science’ that there’s effectively only one dataset… (one ‘climate’). It’s an issue of experiment design: come up with the right design, and we could make some supportable conclusions.

‘Verification’ (or otherwise) would come from some other group doing a similar study, with their own dataset.

So why are they setting up all these publicly-funded (yet not freely-available) shared databases like PAGES? The “trust us” meme doesn’t work in science- especially public policy science. See bender above.

Why are they doing it? I don’t know, and I gather that Steve doesn’t like speculation about motives. But I would imagine that this might give ‘them’ at least a chance to find each other’s cock-ups before they reach the light of day and the glare of public release. 🙂

I think you overgeneralise a bit about ‘trust’ not working though… Every time you drive over a bridge, you’re trusting the engineers who designed it, who trust the scientists that made the equations. That said, I can’t agree with you more about ‘public policy science’. Any so-called scientist who thinks that he has a mission to drive policy should be run out of town.

I think I mentioned somewhere above that auditing is important and necessary. But in ‘respectable’ research institutions, this is done in-house long before a paper is submitted for peer review. If we’re going to cast doubt on that, then we need independent data.

I’m not sure I follow you… If I read a paper and hypothesise that “their result isn’t correct because they screwed up their data”, then this would be a blatant accusation of incompetence or fraud. If an *independent* study came to different conclusions, then the matter needs further investigation.

If I hypothesise that “they use the wrong method” then I shouldn’t need the data. I need to know ABOUT the data – its provenance, variability, distribution, autocorrelation etc. – and on the strength of that I should be able to demonstrate from first principles whether their method was appropriate or not. If I can’t, then either a) they haven’t provided an adequate data description and should be called on it (letter to the Editor: “X et al. do not provide sufficient information to judge the validity of their methods”), or b) I don’t know enough about the topic or the stats theory to criticise their method.
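As one concrete illustration of judging a method from the data description alone: a reported sample size and lag-1 autocorrelation are enough to check whether a significance claim is even plausible, using the standard AR(1) effective-sample-size correction, N_eff ≈ N(1 − ρ)/(1 + ρ). A minimal sketch (function name and numbers are illustrative, not from any particular paper):

```python
def effective_n(n: int, rho: float) -> float:
    # Effective sample size for a series with lag-1 autocorrelation rho
    # under an AR(1) assumption: autocorrelated observations carry less
    # independent information than n nominally suggests.
    return n * (1 - rho) / (1 + rho)

# Independent data: no reduction.
print(effective_n(100, 0.0))   # 100.0
# Heavily autocorrelated proxies: ~17.6 effective observations,
# so a correlation "significant at N=100" may not survive.
print(effective_n(100, 0.7))
```

A reader armed with nothing but the paper’s stated N and autocorrelation can therefore challenge significance tests that treated the observations as independent, without ever touching the raw data.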

So I have to respectfully disagree with you HAS, we *should* be able to distinguish between data and method. Don’t get me wrong, I am a believer in ‘free the data, free the code’, but if we just hack over other people’s work then we’re data-miners, not researchers.

Chris – I would stop short of accusations of deliberate deception, but one *must* consider the context of Climategate, in which one might fairly surmise that the failure to release data pretty much guarantees an effort to mislead. Go read THSI for starters, then come back and play your apologist theme song, if you can still do so with integrity.

Sure enough Mark, I haven’t forgotten. 🙂 And I certainly don’t offer any apologies on behalf of those in the Climategate scandal! You need to watch your logic though: Just because ‘effort to mislead’ relied in part on ‘failure to release data’ doesn’t imply that the opposite is true.

I do however (as a ‘scientist’ in general) strenuously object to being tarred with the same brush, especially as there’s bugger-all I can do about it if I want to keep working in science. I wouldn’t know Neukom from Adam, but I expect he might feel the same way.

The first step in hypothesis forming in the scientific method is likely to be to review what you know. If a prior result doesn’t look right one is quite justified in hypothesizing that they stuffed it up, and this includes potentially incorrect assumptions in measurement, failure to apply measurement techniques correctly, and/or to use data manipulation techniques in subsequent experiments using the data that are inappropriate for the data (pause and think about assumptions of data being “well behaved” that abound in statistical techniques).

Discovering those kinds of errors in someone else’s work is part of the grand tradition in science, and the data and the method go together like a horse and carriage.

Fair enough, and a journal article should provide enough information for us to find those errors. My point is that an adequate description of the data should be enough. If the description is deficient, then the article is incomplete and its conclusions unfounded. For example, I’m putting together a conference presentation at the moment demonstrating that “Under certain circumstances, the methods applied by X et al. can produce spurious results.” I don’t need their data to do this, but if the presentation gets published then the onus is on X et al. to either show that those ‘certain circumstances’ are not present in their data, or that I screwed up and their method is in fact robust to the circumstances. I’m not into ‘policy’, so I honestly don’t care if the ‘certain circumstances’ are there or not. My contribution to ‘science’ would be that in future, appropriately robust methods would be applied. Maybe even my methods. 🙂

I can see that for ‘policy’ people the important part of an article is the conclusions and ramifications, and I fully agree that independent data verification is a good and necessary thing if there are serious ramifications for public policy. You haven’t (yet) convinced me though, that failure to release data negates the scientific validity of a work.

You haven’t (yet) convinced me though, that failure to release data negates the scientific validity of a work.

Lack of access to the data partially blocks investigation into the work, and aspects of the validity that could otherwise be confirmed now have to be taken on trust.
ie. it’s not a negation of the work, but it is a partial obstacle against future negation of the work.

“Just to play devil’s advocate for a moment… I’ve sacrificed 8 or 9 years of my life to building up enough scientific experience and credibility that ‘I’ can get access to data. I’ve proven that I can be trusted with data, that I won’t give it away, waste it, lose it or otherwise embarrass myself in front of my peers. I’ve demonstrated that I will understand it, and that I know how to analyse it properly. Is it reasonable that ‘you’, an innocent layman who hasn’t put in the hard yards, should have the same access? ‘You’ may use faulty methods, come to unfounded conclusions, and misinform public opinion with potentially disastrous consequences for government policy…”

If the data is not to be released by your University, then how does one comply with the publication policy of, for instance, the Royal Society:

As a condition of acceptance authors agree to honour any reasonable request by other researchers for materials, methods, or data necessary to verify the conclusion of the article…Supplementary data up to 10 Mb is placed on the Society’s website free of charge and is publicly accessible. Large datasets must be deposited in a recognised public domain database by the author prior to submission. The accession number should be provided for inclusion in the published article.

In that case the publication should be rejected, as nobody would be able to falsify the research. Which, in fact, has happened: Jones admitted that the raw data is hardly ever requested by the people who perform the peer review process.

This is getting a little legalistic now, but… Hypothetically, I am sure that my Institute (it’s their decision, not mine) would be only too happy to “honour any reasonable request”. They may not consider it ‘reasonable’ to cough up a unique 40-year dataset just to satisfy a random someone’s curiosity, however lofty that someone’s aims may be. If the Royal Society doesn’t like it, then we’d publish elsewhere. If another “researcher” wants some help to understand the article, then my office door will be open and he can get his Institution to send him over on a 3-month research exchange visit. Assuming that my boss approves, of course…

Once again, you don’t need the data to falsify the research. If the data is wrong, you have to tell me what I did wrong in the data collection process (like Anthony Watts did, with the Surface Stations project). If the data is right, your efforts at falsification have to concentrate on the theoretical background of my method (like Steve and Ross’s work does). I have to give you enough information that a qualified reviewer can fully understand my data collection and method. If I fail to do that, then the reviewer should reject the paper. If you REALLY need the raw data in order to make a criticism, then you’re probably not qualified to comment. Sorry, but it’s true.
Or maybe now I’m being a little too idealistic about the process myself. 🙂

The exception to this of course is if you were (hypothetically) accusing me of falsifying my data. In that case, not being a particularly PC kinda guy, my first response would be “[censored]”.

What if the data were mishandled – what if the calculations carried out on the raw data are not those described in the paper? What if the calculations were bungled? How would one spot that without having access to the data?
