Gavin’s Mystery Man

On Sunday, Feb 1 at 4:41 pm Eastern (3:41 blog time), I published a post describing West Antarctic stations. In that post, I observed that there was very limited station data (6 stations) in the West Antarctic area re-interpreted by Steig et al, and that one of the stations (Harry AWS) had an extreme trend (0.81 deg C). I’d noticed some peculiar features about this series, but it was getting late in the day, it was Sunday and I had family things to do later, so I ended my post as follows:

Stay tuned for some interesting news about Harry.

The two main things that I’d noticed by then had been 1) a huge difference between the GISS version of Feb 2008 and the current version; and 2) that Harry had been installed in 1994, making the provenance of Harry data prior to 1994 a mystery – which I asked readers to think about around 5 pm Eastern (4 pm blog time). At the time, I said (more than once) that I didn’t know (“dunno”) whether the problems with Harry “mattered” – and that figuring that out would be hard without examining exact code.

The post attracted a fair amount of reader interest. At 5:19 Eastern (4:19 blog), reader Dishman speculated that the problem with Harry might come from the Argos number of 8900 moving around. I kept track of things for about an hour and around 6 pm Eastern (5 pm blog), I reported on scraping some source data from Wisconsin. By then, it was supper time. I started getting ready for Sunday dinner and watching some football. (Super Bowl is not a religious holiday in Canada, as it is in the States, but I do watch a lot of sports – less than usual because the Toronto Raptors basketball team are playing so poorly this year.)

The comment section of the thread was quite lively through the Super Bowl. I did a little work late in the evening, checking in just after midnight, noticing by that time that the provenance of the pre-1994 Harry data was “Gill” and being pretty sure by this time that BAS “Harry” was a splice of Harry and Gill. A CA reader had made a similar observation by the next morning (Feb2).

During the morning of Feb 2, I re-did my calculations, re-scraped the data, went over things again and wrote up a post on Harry, which I released at 1:49 pm Eastern (12:49 blog time). As I’ve explained from time to time, I often use the blog as a sort of diary – so writing a blog post on Harry wasn’t distracting me from other analysis: it provided detailed documentation of the analysis showing that there was a problem with Harry, which I could refer to later. (I know that some readers aren’t interested in this sort of thing and it creates an editorial unevenness here, but obviously enough readers like it to create an audience.) In that sense, writing a blog post didn’t materially delay the reporting of the problem; it ensured that I’d documented the issues thoroughly.

As I was finalizing my post, I re-checked the BAS page and found that the Harry data had been changed – a point which CA readers had noticed a bit earlier, when I re-checked the thread.

Over to RC.

Their Antarctic thread had been closed, but they re-opened it. A couple of hours after my post (3:35 pm), bernie commented here as follows:

SM at CA has identified what appears to be a major error in the Steig et al paper that suggests that the perceived trend is an artifact of this particular error. Perhaps this is an opportunity to mend some fences and work towards a common goal of better data and clearer methods.

Gavin:

[Response: No-one should be against better data. It would have been nice had SM actually notified the holders of the data that there was a problem (he didn’t, preferring to play games instead). If he hadn’t left it for others to work out, he might even have got some credit ;)

Isn’t it rather petty (as well as possibly unethical) to refuse credit because you don’t like the source or their methods of communication?

Gavin:

[Response: People will generally credit the person who tells them something. BAS were notified by people Sunday night who independently found the Gill/Harry mismatch. SM could have notified them but he didn’t. My ethical position is that it is far better to fix errors that are found than play around thinking about cute names for follow-on blog posts. That might just be me though. – gavin]

[Response: This data error has nothing to with Steig or any of his co-authors. SM should have contacted BAS. I’m sure BAS are grateful that someone told them about the data mess up – it just didn’t happen to be SM.

The Harry error had been sitting in the BAS data for maybe a year. None of the Steig et al authors nor any of their reviewers had noticed the error. And then, remarkably, both Gavin’s mystery man and I “independently” found the Gill/Harry mismatch within a couple of hours of each other, with Gavin’s mystery man, by sheer coincidence, doing so just after I had published notice of the problem at Climate Audit and after Climate Audit readers had turned their attention to the problem.

While Gavin complained about me not immediately contacting data authorities, I actually have a pretty good record of notifying data authorities of things that I’ve noticed. I sent Hansen an email on the Y2K problem when I identified it, as well as on the October 2008 Siberia problem when I was aware that this had been noticed by a CA reader (and, as it turned out, by a Watts Up reader as well). I’ve notified WDCP of problems in dendro data. As of Sunday supper time, while I knew that there was something about Harry, I hadn’t fully diagnosed it and it didn’t seem like something that I needed to notify BAS about urgently – particularly when the incorrect data had already been used by Steig and there wasn’t anything that I could do about it. As I’ve explained to readers, I often write blog posts as sort of a work diary. My blog post on Harry was to some extent a diary of what I’d found out about the series, and I certainly wanted to double check things (and determine more about the provenance) before contacting BAS.

On the other hand, Gavin’s mystery man seemed almost desperate to pre-empt me on this. Sometime during Super Bowl Sunday evening, it seems, Gavin’s mystery man contacted BAS about problems with Harry. BAS responded quickly to Gavin’s mystery man and on the (EST) morning of Feb 2, the Harry data was changed.

By the afternoon of Feb 2, not only had the data been changed, but Gavin Schmidt somehow knew something about the circumstances of the change, including the surprising news that a mystery man had “independently” discovered the Harry/Gill union, only a few hours after I’d published notice of a problem with Harry at Climate Audit and many Climate Audit readers had volunteered information on the problem.

Anything’s possible.

So who was Gavin’s mystery man, and how was it that Gavin knew so confidently that the mystery man had identified the Harry/Gill problem “independently” of Climate Audit? One more thing: when did Gavin himself learn that the mystery man had identified the problem with Harry? At the time, realclimate had two threads devoted to Steig et al. If it was so urgent that BAS be notified of the problem that a delay until Monday was too late (as Gavin implies in his criticism of me), then once Gavin knew about the problem with Harry, did realclimate have an obligation to notify their readers? Just asking.

Harry is a tempest in a teapot. The bigger issue is what effect Harry has on Steig’s conclusions. Gavin claims there is no effect, citing a subnetwork without Harry that reached the same conclusions. Is he correct?

Steve: I observed at the outset of this analysis that I don’t know whether Harry “matters” or not. Nothing that Gavin’s said so far shows me that it doesn’t matter. For a couple of reasons. First, I’ve got considerable experience with Mannian methods – discussed at length in many posts – and it is strongly my view that these methods can be strongly influenced by a small number of records: bristlecones are the obvious example. There’s been much huffing and puffing about this, but even Wahl and Ammann conceded that Mann needed the bristlecones to “get” his result in the early portion. If they had used well-understood methods to get their results, it would be easier to see what they’ve done. But they’ve used RegEM (perhaps Mann-modified, perhaps not: the information on this is inconsistent) and it’s hard to say what the effect of anything is. Second, and from the same debate, I’m used to arguments saying that an “independent” method “gets” the same result, but then the “independent” study itself is addicted to bristlecones and isn’t “independent” as generally understood. Third, I didn’t examine all 60 series for problems; I looked at data in the region being re-interpreted and then at the only record with a big trend in that region. From a data analyst’s point of view, this sort of thing matters a lot. If I were in Steig’s shoes and fully confident that the issue didn’t “matter”, I’d release all the code, data and intermediates right now and let the chips fall where they may. (Actually, I’d have done that with the submission; I included turnkey code with our Santer submission SI, as I did (near-turnkey) with our MM2005 GRL submission.)

The intrigue of the whodunnit is interesting; however, what I would like to know is whether this is a potentially serious error (i.e. the hybrid record was used in the analysis) or a trivial (but sloppy) error (i.e. the data was not used but was included in a table). Do we know one way or the other?

“Never wrestle with a pig. You both get dirty, and the pig enjoys it.” I think this blog is most effective when it (as it usually does) focuses on the data and methods, and skips the petty behavior by Team members.

Well, this is all very interesting, but I am much more curious concerning the true impacts this issue has on the conclusions of the Steig paper.

Perhaps Dr. Steig could use this situation as an opportunity to improve his QA/QC processes — in the spirit of implementing a TQM philosophy in writing his papers — and rework his analysis from scratch using an upgraded, end-to-end knowledge-managed approach.

For those not familiar with this decades-old concept, TQM is the acronym for Total Quality Management.

“But I’m not going to [check Steig’s code]. I actually trust those guys. As for the folks flinging criticism and accusations — I do not trust those guys.”

Ironic to say the least that Tamino, apparently a genuine climatologist, chooses not to identify him/herself and chooses to “trust” one group and not another. That is hardly a scientific attitude, and I further note RC’s aversion to skeptics who use words such as “belief” or “faith” in their critiques.

Ironic to say the least that Tamino, apparently a genuine climatologist, chooses not to identify him/herself and chooses to “trust” one group and not another.

Actually, no, at least, not apparently. Last I heard, Tamino claims to have an unrelated day job doing time series analysis, whatever that means. Not that it matters, either, though it does highlight the hypocrisy of repeated attacks on Steve and Ross for not being climatologists.

This issue alone is NOT going to have any significant impact on the Steig paper.

Steig et al already performed an analysis of the temperature data excluding Harry (along with most of the other AWS stations). While the results are not identical, they are sufficiently similar to be supportive of their conclusions.

But Steig et al still have a serious problem. If the very first data series that Steve checked had such a fatal error, what does that say about the general quality of both the data and the analysis that was built on top of it?

If the quality of the data is good, then what is the likelihood that the very first data series checked will not only contain a serious problem, but that problem will also be clearly identifiable from other available records?

And WHY didn’t Steig et al notice this (while Steve found it almost immediately)? It’s not like their paper is reporting brand-new measurements. All they have done is reanalyze old measurements. If they are going to publish a paper based entirely on their ability to derive new insights from old data, shouldn’t they have become sufficiently familiar with that old data to notice a problem like this? It’s not like there are vast quantities of data to go through.

Steig et al already performed an analysis of the temperature data excluding Harry (along with most of the other AWS stations). While the results are not identical, they are sufficiently similar to be supportive of their conclusions.

According to Eric, Harry isn’t part of the full reconstruction in the first place. Why Harry and the other West Antarctic AWS are in Table S2 remains a mystery. As far as I can tell there is only one West Antarctic site in the reconstruction – Byrd.

Another mystery is why the subset analysis is “sufficiently similar” in the West Antarctic when it excludes sites that weren’t in the full reconstruction in the first place.

The formerly problematic Harry is indicative of some sloppiness by all parties involved. On careful reading, the excerpt below from the Table S2 caption would be interpreted to mean that the 4 AWS stations were not used in the reconstruction as predictor variables.

List of the 42 occupied weather stations used as predictor variables, and four automatic weather stations (AWS) in West Antarctica. Weather stations used in the restricted 15-predictor reconstruction are marked with an asterisk (*).

Table S1 shows the 26 AWS stations selected for validation and verification of the AWS reconstruction (the IR measurements being the other source for a second reconstruction). The 4 AWS stations from Table S2, including Harry, are not among the 26.

I would be more interested in knowing the exact selection process for the 26 stations and secondarily why the 4 AWS stations are included in the table with the predictor stations.

List of the 42 occupied weather stations used as predictor variables, and four automatic weather stations (AWS) in West Antarctica. Weather stations used in the restricted 15-predictor reconstruction are marked with an asterisk (*).

Re: Kenneth Fritsch (#28), I’m also curious about the site selection process for the AWS reconstruction. I looked at the trends in the full record (1957-2006). The linear fit to the temperature anomaly data has a negative slope for 12 of 63 sites. But of the 26 sites used for the reconstruction, every one has a positive trend (Table S1 of the SI). Less than one site in five has a negative trend, but if you pick 26 at random, it’s almost impossible not to have at least one with a negative trend.

The SI states that the 12 negative-trend AWS sites (along with some positive-trend sites) were thrown out because of insufficient calibration data or low verification skill. Does anyone understand this culling process, using “calibration data” and “verification skill”, which happens to remove all of the negative-trend data but only half of the positive-trend data?
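The “almost impossible” claim above is easy to check. Treating the selection as a random draw (purely for illustration; the paper’s culling was of course not random), the chance that 26 stations drawn from 63 contain none of the 12 negative-trend sites is a hypergeometric probability. A minimal sketch, using only the counts quoted in the comment:

```python
from math import comb

n_sites = 63       # AWS sites with full-record trends (per the comment above)
n_negative = 12    # sites whose 1957-2006 linear fit slopes downward
n_chosen = 26      # sites retained for the reconstruction (Table S1)

# Hypergeometric probability of drawing 26 sites and getting zero of the
# 12 negative-trend ones: C(51, 26) / C(63, 26).
p_no_negative = comb(n_sites - n_negative, n_chosen) / comb(n_sites, n_chosen)

print(f"P(no negative-trend site among 26 random picks) = {p_no_negative:.2e}")
```

The probability comes out well under one percent, which is the commenter’s point: an all-positive subset is very unlikely to arise by chance alone.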

Re: Richard M (#122) if, hypothetically, someone was looking for a way to force agreement between a model that predicts warming and a set of data with mixed warming and cooling, then throwing out all the data that shows cooling would be a good way to do it.

It looks like AWSs Mount Siple and Elizabeth, both in West Antarctica and having negative trends, were tossed out before the “reconstruction”. If those had been included, I wonder how well the AWS reconstruction would have correlated with the model.

I read these postings about your interactions with the “scientific” climate science community and am constantly stunned by their attitude, behavior, and actions. It’s come to the point that I literally tune them out as background noise. They may have valid data and arguments but it’s impossible to even give them the benefit of the doubt.

These scientists have done more damage to scientific credibility than they apparently realize. They seriously need some adult supervision, as they keep insisting on damaging their own cause and agenda.

I guess that the mystery man was actually a woman, Ms Evets Erytnicm. She is an ideal scientist who shares almost nothing with Steve. In fact, she’s the very opposite. She thinks that Mann et al. and similar papers are great. A careful reader may figure out what Ms Erytnicm and Steve have in common! ;-)

Re: KevinUK (#14), I agree with Kevin. Probably Gavin. After picking up the clue here.

Went to RC for a rare visit: confrontational and uninteresting, except for the insults. Sent in the following: “Reading just briefly, so far, I see from your comments that McIntyre is an incompetent, self-appointed,unethical, unregulated, lying “McFraudit”. Given all this, and I am sure he’s been called worse on this site, he’s clearly worthless. So why do you people appear to be so defensive about him?”

I believe that they know that most people don’t like to be insulted; so they do it to anyone who doubts them, reasoning that they will stay away. It seems to work.

“While the results are not identical, they are sufficiently similar to be supportive of their conclusions.”

I think the temperature delta is quite different when they used only 15 stations instead of all. This implies that using another 15 stations from the remaining stations “likely” ends up with a negative temperature delta, and perhaps even the opposite overall result.

#15 I agree that 15 stations being consistent with all the stations does not verify that both are right, especially if the “all the stations” analysis includes Harry. Which raises the question – was Harry used in the analysis of all the stations, or was it not used but merely left in the table?

Tamino nails it, however unwittingly. The central issue isn’t about the extent to which Steig’s results will have to be changed. As Tamino says, it’s trust (really — credibility).

Interesting, Tamino sees no reason to check the work of other scientists because he likes them and trusts them. That’s not a scientific attitude.

In the end, it’s all about scientific credibility. Credibility attaches to those who conduct science in accordance with real standards and with a commitment to seeking the truth. It doesn’t attach to those who demonize, fail to provide data, obstruct efforts to replicate, hide files, invent bizarre new “statistical methodologies”, and generally act like spoiled brats.

#20. James, there’s an important piece of evidence over and above what they say about the procedures – the recon_aws reconstruction itself. The 26th series matches pseudo-Harry 100%. Not a correlation of 0.57 or anything so mundane. It’s a dead match.

AFAIK, RegEM methods give you back the data you started with and “infill” the missing data. And in RegEM, there is no distinction between predictor and predictand – to this extent, some of the descriptions from Gavin and Steig aren’t always as clear as they might be. Harry had to be in the recon_aws network to get the archived results. There are other unarchived results – perhaps they are different, but we haven’t seen them yet.
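The 100% match is exactly what one would expect if Harry’s observed values were simply passed through. As a structural illustration only (RegEM proper does ridge-regularized regression among the series; the toy below just uses column means), an EM-style infill never alters observed entries, so any input series reappears verbatim in the output wherever it was observed:

```python
# A toy stand-in for RegEM-style infilling: missing cells get estimated,
# observed cells are passed through untouched.

def infill(rows):
    """Replace None entries with the column mean of observed values."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        obs = [r[j] for r in rows if r[j] is not None]
        means.append(sum(obs) / len(obs))
    return [[means[j] if r[j] is None else r[j] for j in range(n_cols)]
            for r in rows]

# Hypothetical mini-network: three "stations", some months missing.
data = [
    [1.0, 2.0, None],
    [2.0, None, 3.0],
    [3.0, 4.0, 5.0],
    [None, 5.0, 6.0],
]
filled = infill(data)
```

Every observed value survives the infilling unchanged – the analogue of the dead match between pseudo-Harry and the archived recon_aws series.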

Well the mysteries pile up. Eric has posted at RC that:

the main figures in the paper use no AWS data in the first place. None. This is totally clear in the text, if you read it. The AWS data are only a double-check on the results from the satellite data.

Well it wasn’t clear to me; maybe I’m a bit slow. It also wasn’t clear to Florian on the other thread. It wasn’t even clear to Gavin, who “misspoke”. And it’s not even clear after I read the paper again.

And what do we make of this:

Independent data provide additional evidence that warming has been significant in West Antarctica. At Siple Station (76° S, 84° W) and Byrd Station (80° S, 120° W), short intervals of data from AWSs were spliced with 37-GHz (microwave) satellite observations, which are not affected by clouds, to obtain continuous records from 1979 to 1997 (ref. 13). The results show mean trends of 1.1 +/- 0.8 °C per decade and 0.45 +/- 1.3 °C per decade at Siple and Byrd, respectively. Our reconstruction yields 0.29 +/- 0.26 °C per decade and 0.36 +/- 0.37 °C per decade over the same interval. In our full 50-year reconstruction, the trends are significant, although smaller, at both Byrd (0.23 +/- 0.09 °C per decade) and Siple (0.18 +/- 0.06 °C per decade).

How do these results support significant warming in West Antarctica? Apart from the 50 year reconstruction results, the uncertainty is either greater or nearly equivalent to the claimed trend.
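Taking the quoted +/- values at face value as symmetric confidence half-widths (an assumption; the excerpt does not restate the paper’s convention), the commenter’s point can be tabulated directly: a trend is distinguishable from zero only where it exceeds its half-width.

```python
# Trends and quoted +/- values (deg C per decade) from the excerpt above.
records = [
    ("Siple spliced 1979-97", 1.1, 0.8),
    ("Byrd spliced 1979-97", 0.45, 1.3),
    ("Siple recon 1979-97", 0.29, 0.26),
    ("Byrd recon 1979-97", 0.36, 0.37),
    ("Byrd recon 50-year", 0.23, 0.09),
    ("Siple recon 50-year", 0.18, 0.06),
]

# A trend is distinguishable from zero only if |trend| exceeds its +/- value.
significant = {name: abs(trend) > half_width for name, trend, half_width in records}

for name, sig in significant.items():
    print(f"{name}: {'significant' if sig else 'interval includes zero'}")
```

On this reading, the Byrd spliced record and the Byrd 1979-97 reconstruction both have intervals straddling zero, supporting the comment above.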

Well, really, if the check figures are wildly wrong, and yet they confirm the substantive report, doesn’t that imply that the main point must be off?

My father told me that as an engineering student at Georgia Tech around 1939, he had a course with a one-problem final exam. He got the answer right but failed the course because he did not show that he had checked his work.

So, Steve, what you are saying is that, based on the archived results, Harry was indeed used in the “all the stations” analysis, and their claims to the contrary are either 1) wrong, or 2) unclear until further information is provided (realizing this is unlikely to happen).

Steve: Dunno. I’d be surprised if either Steig or Gavin know either – in the sense of knowing what actually happened in the Mannian RegEM. So I’d take everything under advisement for now.

As I said during the discussion, an early email would have been incomplete. It would be more helpful to tell BAS what oddities had been noted than to simply say “There is a problem”; at least then participants would have been aware that Steve had merely begun a work-in-progress rather than playing games.

Meanwhile, someone called Bernie at RC was telling people to look at a map to see that Harry didn’t matter. Is looking somehow like “doing the math”? He’s since crossed that out with an unexplained reassurance that Harry didn’t affect a certain result. (I suppose the reassurance will be explained with “See above” — somewhere above.) I think I’ll wait for Steig’s next study with the new math.

Am I correct that BAS, by changing the station data, has turned Steig’s raw dataset into a grey, unpublished version?

From the RealClimate quote below, it would seem that Gavin Schmidt has a similar concern. It also seems that he would like a form of data archiving with version control. He regrets that this doesn’t exist for climate science. As we all know, such systems are standard in engineering practice, and any organization running a large project without one would be very strongly criticized. I suppose that the IPCC is not large enough nor has enough money to enable this in climate science.

[Response: You raise a good question. Steig’s archiving is at http://faculty.washington.edu/steig/nature09data/ and you can see that the data sources are referenced to the originating organisations (who can and do update data, fix errors etc.). Ideally, for ‘movable’ datasets, one would want a system where snapshots in time were recoverable (and citable), along with pointers to the up-to-date versions, forward citation to publications that had used various versions and the ability to update analyses as time went on. What you don’t want is mostly duplicate data sets that aren’t maintained floating around in the grey zone – that will just lead to confusion. Google were actually working on such a system, but have unfortunately lost interest. Other organisations such as BADC are thinking along those lines, but it is a sad fact that such a system does not yet exist. – gavin]

Subversion/CVS/Git are free and any legitimate admin should know how to implement them. I’ve actually got a Winders version of Subversion running on my PC (though I never use it). It exists, it just takes a bit of effort to implement at which point it becomes automatic. Heck, I’ve even got a Seagate FreeAgent Pro external HDD that auto-backs up all my data. Nothing to do but plug the sucker in and pick the appropriate files to regularly back up.
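Short of a full Subversion/Git setup, even the bare snapshot-and-cite idea from Gavin’s response can be had with a content hash: any silent revision to a station file changes its citable ID. A minimal sketch (the file name and contents here are hypothetical):

```python
import hashlib
import tempfile
from pathlib import Path

def snapshot_id(path):
    """Content-addressed version tag for a data file (as Git does internally)."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()[:12]

# Hypothetical station file: any silent revision changes the citable ID.
with tempfile.TemporaryDirectory() as d:
    harry = Path(d) / "harry.txt"
    harry.write_text("1994 -18.2\n1995 -17.9\n")
    v1 = snapshot_id(harry)

    harry.write_text("1994 -18.2\n1995 -16.5\n")   # a BAS-style correction
    v2 = snapshot_id(harry)

    changed = (v1 != v2)
```

A paper could then cite the data as “harry.txt, version v1”, and anyone re-running the analysis later would immediately see that the file had been revised.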

So many people pointing out the idea (obvious to all of us who have spent any time in software-related development) that it just might be a good idea to automate the management of data related to research…. perhaps something’s been done about it already?

It has. And it’s been mentioned at CA in the past, particularly by Pat Frank, who works at the university where most of this kind of work is promoted. I get more than 10,000 hits googling “reproducible research” — here are a few interesting links:

- A good intro to reproducibility at the Madagascar project.
- A reproducible research paper… from 1992 (not exactly a new idea, is it!)
- ReproducibleResearch.org, a site with many good links.
- A blog post about, and demo of, Sweave, a method for embedding R code into LaTeX documents, producing reproducible, self-generating scientific analysis reports (when properly used). Sweave is apparently built into R these days.
- A paper recounting a history of reproducibility — joys and sorrows…

One of the most important points made in these papers concerning the reproducibility of computational analyses is that having the tools and techniques in hand isn’t nearly enough by itself to get the job done.
.
By themselves, these tools can’t enable reproducibility. For the tools to be truly useful, they must be embedded within an operational and procedural context wherein computational analysis processes are thoroughly documented as a matter of course and wherein a culture of end-to-end quality consciousness and ethical behavior is the foundation of all the work that is done.
.
Twenty years ago, in the early days of the civilian nuclear waste repository program, while we were struggling with a variety of issues as to how to assure the reproducibility of the analytical work, the kind of cavalier attitude toward software QA/QC now demonstrated by many climate scientists was not all that uncommon among the many scientists employed by the repository project.
.
Needless to say, those among them who did not choose to get on board with taking a thoroughly disciplined approach to performing their responsibilities were shown the door. And regardless of their individual talents or their potential future contributions, the project was ultimately better off without them.
.
Let us all recognize that for better or worse, automated computational analysis is the bedrock foundation of today’s climate science. If the computational analysis is not right — for whatever set of reasons — then the climate science is not right.
.
Given this as background, I have to say that in the three years I’ve been reading this blog, this quote from Gavin over at RC, repeated here once again, takes the prize as the absolute pinnacle of arrogance:

[Response: It’s not nonchalance. Errors in data and code should be looked for and fixed when found. It’s simply that this is not the most important issue when deciding whether a result is interesting or not. Someone could have perfect data and absolutely correct code, but because their analysis is based on a fundamentally incorrect assumption, the result will be worthless. But errors in data are ubiquitous, as are bugs in code (Windows, anyone?) – interesting results therefore need to be robust to these problems. Which is why we spend time trying to see whether different methods or different data sources give the same result. That kind of independent replication is the key to science, not checking arithmetic. – gavin]

OK, make my day. Tell me that when increases in temperature measured in tenths of a degree centigrade taken over decadal timeframes are being used as proof of man-made global warming, the validity and correctness of the climate science software — including its associated low-level arithmetic — isn’t just as important as the fundamental physical assumptions and physical processes the software is attempting to model.

Volume 3 also contains an article on “Grading Systems of Scientific Workers” by D. Rougge. Several systems are evaluated:
1. Idea grading: based on IPM (ideas per minute). It works quite well for scientists employing many graduate or postgrad students.
2. Execution grading: based on projects, no matter whether original or not. The important factor is the number of publications or patents.
3. Disagreeable grading: certain activities (administration of a department, planning of laboratories, organization of conferences, inspection of laboratories) seem to be quite disagreeable to scientists. Promotions can be keyed to the performance of these disagreeable jobs.
4. Public relations: grading depends on a scientist’s ability to convince potential investors to part with their money.

“This data error has nothing to with Steig or any of his co-authors. SM should have contacted BAS. I’m sure BAS are grateful that someone told them about the data mess up – it just didn’t happen to be SM.”

That’s entirely true. This data error has nothing to do with Steig. It has everything to do with his approach to analysis. SM’s approach to data analysis is garden-variety standard stuff (I say that in a nice way). Look for the thing that sticks out and double-check it. Harry stuck out. Double-check it. Heck, Hansen does the same thing. If you read his papers on GMST you’ll find little cases here and there where he spots an odd cooling trend and double-checks the data (H2001, where northern California sites are clipped and trimmed). And it really doesn’t matter if Harry matters or not. What matters is that the Steig approach to data analysis missed this. Ascribe no motives to this mistake. I just have to laugh.

Forgive my naivete, but how can Harry NOT have an impact on the Steig conclusions? The main conclusion is the warming of West Antarctic (if you pick your start date correctly). There are only 6 stations there, and the ONLY one to show an appreciable warm trend is a bogus artifact. When all of the starting data is in complete agreement (no warming trend), it shouldn’t matter what result comes out of a Mannomatic manipulation of the data.

If you put station data with no warming and satellite data with no warming into a statistical algorithm and come up with warming, I think you made a mistake along the way.

Re: Clark (#36),
This is not my field. However, if I were prepping a paper like this, I would run the model with a large number of variations on the source data just to satisfy myself that my conclusions were not dependent on a few stations. I would probably write a program that removed sources until it no longer made sense to continue (I don’t know how many are “removable”). I would run this many times, removing different stations, and look for those stations that do impact the results significantly. So I don’t know if it was Steig or a grad student, but somebody supporting this paper must have known about every station with a strong slope and (I sincerely hope) been satisfied that the overall impact was nominal.

It does concern me that a strongly sloped station had bad data: I thought that graduate students were supposed to obsess about these things.

I do not bear the scars that Steve bears, but I would like to give the Steig team the benefit of the doubt on this. I echo the desire for the release of the complete source for their runs. It appears that some of the best fact checkers the climate community has are hobbyists. Significant highly competent and very interested unpaid labor is here willing to help validate results. With a little care and nurture, this resource could be very helpful to the “Team”, would help work through a lot of frankly childish comments here and there and help the entire community improve its level of dialog.
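The station-removal exercise described in the comment above can be sketched in a few lines. This is a toy: it jackknifes a simple network-mean trend rather than rerunning a RegEM reconstruction, and every number except Harry’s 0.81 deg C trend is invented for illustration:

```python
# Toy leave-one-out sensitivity check: recompute the network-mean trend
# with each station dropped, and flag the station whose removal moves the
# answer the most.  Station names and trends (deg C/decade) are
# hypothetical, apart from Harry's 0.81 from the post above.

trends = {
    "Byrd": 0.05,
    "Siple": 0.02,
    "Elizabeth": -0.03,
    "Mount Siple": -0.04,
    "Harry": 0.81,    # the outlier under discussion
    "Theresa": 0.01,
}

full_mean = sum(trends.values()) / len(trends)

influence = {}
for station in trends:
    rest = [v for k, v in trends.items() if k != station]
    influence[station] = abs(full_mean - sum(rest) / len(rest))

most_influential = max(influence, key=influence.get)
print(f"Most influential station: {most_influential}")
```

Dropping Harry moves the toy network mean from about 0.14 to essentially zero, which is exactly the kind of red flag such a sweep is meant to raise.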

As far as I can judge, Gavin S. at RealClimate is saying that Harry (née Gill) had but a cameo role as a character witness in the re-trial of ‘Antarctica Actually Cools’ versus ‘GCMs Predict Rapid Antarctic Warming/Cooling/Change’.
Consequently, Gavin claims that Harry’s removal from Eric’s much-publicised ‘findings’ has zero impact on the ‘Steig Supposition’.

I feel for Gavin – he’s clearly a very clever bloke and an amazing team player. Dunno if he’s on the wrong side or not, but his loyalty is awesome!

It also seems that he would like a form of data archiving with version control. He regrets that this doesn’t exist for climate science. As we all know, such systems are standard in engineering practice, and any organization running a large project without one would be very strongly criticized.

Well, if you were in a large company and didn’t “archive” or back up the data, you would be strongly terminated.

What if science were software? and every paper that used data were linked to that data and when the data was updated or changed the whole of science related to it were regression tested…. ok jeez, I’ll get back to work.

Climate Audit lives up to its name and its increasing level of credibility (mostly outside of the insider climate community). Yet if this data has *no significant effect* on the Antarctic Warming conclusion then this error is not all that relevant as RC has suggested (unconvincingly so far in my view).

It would sure be nice to develop some testable hypotheses about how the Climate community processes data. Despite what appear to me to be quality papers most of the time it certainly appears from even a few hours of reading at RealClimate that all of the scientists there enter the analyses with an expectation and even a desire to find the strongest possible AGW “signals”. I don’t know to what extent, if ever, that desire interferes with analyses or conclusions.

Re: Joseph Hunkins (#47), I think this 2004 NASA report of a paper of Gavin Schmidt and Drew Shindell says it all: the self-fulfilling prophecy of the Climate Model, and the one piece of Antarctica temperature mapping that looks believable.

Coincidentally very recently I came across a reference, I think it was on Wegman’s website but I’ve mislaid it (doh!) that suggested this method of Mann, Rutherford, Wahl and Ammann was liable to warm bias. I’ll look again.

I think this 2004 NASA report of a paper of Gavin Schmidt and Drew Shindell says it all: the self-fulfilling prophecy of the Climate Model, and the one piece of Antarctica temperature mapping that looks believable.

We should be careful on the conclusions that we draw from the conclusions drawn in these studies. The starting date is very critical to the measured trend. Schmidt and Shindell are making claims about 30 years ago and Steig et al. go back 50 years in making their claim. I believe that going back 30 years with Steig’s data will show a slight cooling to no trend in temperatures and more cooling (not significantly different than 0, however) going back 25 years.

I think it is important also to realize/speculate why Steig et al. did the reconstruction to extend measurements back 50 years. That starting date can show a positive trend (one that could, however, well not be statistically different than a 0 trend with all things considered).

I have seen estimates that going back further, a guesstimate would show higher temperatures in Antarctica in the 1935-1945 time period, which would make the 50-year period the optimum for someone wanting to show Antarctic warming.

This example reminds me of the starting time chosen by those wanting to show an increase in the hurricane/tropical storm activity in the NATL. In both instances neither group seems much interested in discussing shorter or longer time periods and invariably can provide rationalizations for their selections.

I think it is important to check further on those 1935-1945 Antarctica estimates.

I’ve been interested in the climate change issue for quite a while, but have just recently started reading a few related blogs (mainly CA, RC, and Prometheus) regularly. One thing I was not aware of before is the seemingly widespread protectiveness and lack of openness surrounding the code/methods of many temperature reconstructions, and perhaps forecasting models.

I’m just an econ undergrad student, but it’s my impression from my own instruction, and from friends’ and family’s work, that reproducibility is a crucial element for valid science in ANY field. If that’s the case, what is the rationale, for example, for Steig et al not to make both the data they used and their code/methods all “open source” CONCURRENTLY with the publication of their study? Are they going through some process I’m not aware of that leads them, at some point in the near future, to release everything? If they don’t plan on releasing everything at some reasonable point in time, do they provide other avenues for reproduction and confirmation? Or have I been misreading things, and have they done so already?

“A good auditor doesn’t use the same Excel spreadsheet that the company being audited does. They make their own calculations with the raw data. After all, how would they know otherwise if the Excel spreadsheet was rigged? Mike Mann articulated this distinction very thoroughly in the discussions with the National Academy during the “hockey stick” debate. You should read this material; it’s enlightening (I will post the link when I find it). In any case, you’re pushing this analogy too far. Science is not the same as business. The self-appointed auditors of climate science don’t seem to understand that science has a built-in-auditing system — the fact that by proving someone else wrong, especially about an important issue, is a great way to get fame and success. There is no comparable mechanism in business. The analogy between auditing business and auditing science is therefore a poorly conceived one. But as long as the analogy is out there, consider how auditors are chosen and regulated in the business world. You don’t get to be an auditor merely by launching a blog, and you certainly don’t publicly speculate about your findings before (or even after) you’ve done the analysis. Above all, you have to demonstrate competence and integrity, and the company you work with has to trust you, or they won’t hire you.–eric]”

Yet if this data has *no significant effect* on the Antarctic Warming conclusion then this error is not all that relevant as RC has suggested (unconvincingly so far in my view)

Personally, based on everything said so far (here and at RC), I don’t think it has that big of an effect on the conclusion. Rather, it is a symptom of a disease of excessive tolerance for carelessness…carelessness in the quality control of the data used, carelessness in the documentation of the study such that it is entirely unclear whether the data was used, and (assuming it does not have a large effect) carelessness in the fact that irrelevant data was included. So the relevance isn’t necessarily in the effect on the conclusion. The relevance is in the reaction the “inside team” has with regard to the mistake . . . a reaction that is decidedly counterproductive. Instead of looking for ways to check themselves and improve, they spend an awful lot of time rationalizing why the mistake is okay and dismiss suggestions to make the process more transparent.
This is an important point because the cure for carelessness is transparency. The easier you make it for someone to audit your work, the more likely you are to catch your own mistakes.
Scientists will make mistakes. They’re human. If someone points out a mistake, however, rather than circling the wagons and not giving credit where credit is due, they should admit the mistake and promptly determine whether the mistake has a substantial effect on their results. The ad hom attacks on Steve at RC are simply not constructive or scientific.

Hans Erren: The datasets are so small that the team probably has never heard of an RDBMS with version control… Perhaps an idea for GISS?

When employing an Analysis Case File type of approach for purposes of maintaining end-to-end traceability of an analytical process flow — it is imperative that the combination of raw data sets, software programs, intermediate data sets, and final output data sets be considered one thing.
For purposes of supporting full traceability of a specific analysis conclusion, the subcomponents of the analytical processing system have no meaning independently of each other.
To establish full traceability and process auditability, a snapshot must be taken of all the analytical subcomponents within the system that supported any specific conclusion, at the point in time when the conclusion was generated — i.e., a snapshot of that particular combination of raw data sets, software programs, intermediate data sets, and final output data sets that went into the analysis at the time the data processing was performed.
I don’t see why the source code and executables for every set of related analysis programs couldn’t be archived in an RDBMS as binary objects, along with the raw, intermediate, and output data sets; and including metadata describing the background and operating parameters for the particular analysis that was performed.
I’m sure NASA and/or the University of Washington have at least one database programmer on their respective staffs who is proficient in Microsoft SQL Server and who could design an RDBMS-based Analysis Case File System tailored to the needs of both NASA’s and UW’s own climate scientists.
Hey folks, this isn’t rocket science.
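The snapshot idea described above can be sketched in a few lines. This is a hypothetical illustration (the function name and file layout are invented), not a description of any actual NASA or UW system; the point is only that freezing the raw/intermediate/output combination behind a conclusion is cheap:

```python
import hashlib
import json
import time

def snapshot_manifest(paths):
    """Hash every artifact of an analysis run so the exact combination of
    raw data, code, and outputs behind a conclusion can be frozen in time."""
    manifest = {
        "taken_at": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "artifacts": {},
    }
    for p in paths:
        h = hashlib.sha256()
        with open(p, "rb") as f:
            for chunk in iter(lambda: f.read(65536), b""):
                h.update(chunk)
        manifest["artifacts"][p] = h.hexdigest()
    return json.dumps(manifest, indent=2, sort_keys=True)
```

The resulting JSON blob (or the binary artifacts themselves) could then be stored in the RDBMS as binary objects, exactly as described above.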

You do not comment on this, but simply cite it. However, you are clearly implying that you agree with it because you do not comment. Are you prepared to either remove this from the web site immediately, or to provide evidence that I have committed fraud? This is a very extreme accusation. Indeed, it seems rather like libel to me. I would like to request a formal apology from you, in writing.

I am cc:ing several journalists on this, so hopefully inaction of your part will be noticed….

Goodness. Pretty extreme reaction to a secondhand blog ref. Steig doesn’t appear to be much of a nice guy, after all.

Steve,
I’ve read here for about 3 1/2 years. I’ve – same timeframe – looked in on RC.
I’ve experienced what happened to people posting reasonable questions on RC.
I even ran a “low-frequency kind o’ webgrabber” when discussions got “interesting”.

This is great fun, and Gavin is almost certainly talking out of his hat on “independently”, but it is hard to see how BAS can really be faulted here. Someone told them their data probably had a problem, they looked, found the problem, and fixed it. It is unreasonable to expect them to engage in a full academic inquiry as to appropriate credit. For one thing, it would discourage people from contacting them with problems that need fixing if those who do so are subjected to the third degree to “prove” the originality of their research.

Steve: c’mon… no one is faulting BAS. We’re just wondering who Gavin’s mystery “independent” man is.

So, if I read that right, Steig’s basically saying (explicitly) that if you feel the need to check our work, you should build your own models and see what you get, and (implicitly) that unless you do that, questioning of our results is invalid? I feel like there are a number of different reasons to take issue with that stance, but two that immediately occur to me are:

1. If another party did in fact take the same raw data and produce a temperature reconstruction, and it differed significantly from Steig’s, how could there be meaningful scientific discussion/debate about which reconstruction was likely more accurate if only one or neither party released their methods for producing the reconstruction? If Steig et al would release more of their work for this circumstance, does it make sense for them not to do so now?

2. Since their new method for interpolating data seems to be so important to their new work, it seems odd that they argue the most appropriate way to validate their work is for others to go without it, or to attempt to reconstruct it without guidance.

Could there be some commercial or selfish academic reason they don’t want other people to be able to use their methods? If that’s not the case, I don’t think it an unreasonable speculation that they might lack confidence in their methods standing up to scrutiny…

People are encouraged to correct me if I’m wrong, but I’m under the impression this incomplete release of methods within climate science isn’t restricted to only papers Mann has been involved with. Such a proprietary attitude in science just seems incredibly bizarre to me, especially in a field that has such immense implications for public policy.

We are not supposed to speculate on motives here, but I believe a famous quotation from a climate ‘scientist’ a few years ago was: “Why should I let you have my data? You will just try to find something wrong with it.”

What, exactly, is the importance of Antarctica in the Great Debate? So far, by reading both CA and RC over a considerable period, much emphasis is placed on the atmosphere record contained in the ice cores, and on current (modern) instrument-recorded data. At the same time that Antarctic weather is prominent in the “debate”, both sides of the debate also argue that Antarctic weather does not represent the “temperature record” of the planet for myriad reasons, and the most vocal back and forth occurs around the issue of the temp-CO2 relationship.

I’m not a scientist of any sort; I’m also not a rube from BFE. What is perfectly clear is that there is no accepted method for determining the temperature record of the planet. Maybe it’s just me, but I think that to advance the science of climatology one must first establish a method to determine the record, and gain the “experts’” acceptance of it.

Until that happens, all that can be done is to find a reason why the latest report is inaccurate.

There is a place for auditing. Without the work of Steve M. et al, there would be no debate; no skeptics, no counter to what is apparently an infant science, by default flawed.

James: here’s a script showing that the old READER Harry and the reconstruction are almost identical: correlation in overlap of 0.99978 and virtual match. It simply HAS to be in the RegEM in order to reconstruct to that degree. Recall that RegEM is just “infilling” – Harry is in the AWS RegEM. In Mann 2008, there were similar 100% recons in the calibration period. I presume that they make a RegEM network of surface stations and AWS stations. Maybe there are some other AWS reconstructions around, besides recon_aws – it takes a while to pick through the information. But I’m confident that Harry is in the recon_aws network – the only AWS recon that’s seen the light of day so far.

Harry and the reconstruction are almost identical: correlation in overlap of 0.99978 and virtual match. It simply HAS to be in the RegEM in order to reconstruct to that degree. Recall that RegEM is just “infilling” – Harry is in the AWS RegEM.

I would think the error becomes more egregious, since the Steig paper and SI would, to me, strongly indicate that Harry was not included in the reconstruction. It would then appear that it is there without the authors’ explicit knowledge. I am also becoming confused by Steig’s comments that I see here by way of RC. I initially had the impression that he was saying that Harry was not included in the AWS reconstruction and now from his rather imprecise language it seems he is saying that the IR reconstruction was the important reconstruction (for the papers conclusions) and therefore errors in the AWS are not important because it was only used as a check on the IR reconstruction.

Steve M, thanks for the AWS file scripts. They will help my efforts to look further at the AWS temperature trends over the last 30 or so years and closer at the 1957-2006 trend.

This link is just the general description of Matlab functions. He handed a friggin’ manual to us after saying in his paper that he used a modified version of RegEM, claiming that it has been there all along. The RC minions eat it up – look how open he was.

It’s a for-loop description and is not the code used. He states – ‘EXACTLY the code used in our paper’ – his words. Of course this is a piece of the code, but he could probably say the same thing about the manual entry for the print command.

I’ll add here that I was also hoping that someone adept at statistics and able to use R at a more proficient level than this rookie could document my following unofficial and tentative observations on the AWS reconstruction results from the 63 stations. I would very much like to document these observations in R as a learning experience, but I fear by the time I get an efficient script down we will have moved on from the Steig et al. (2009) paper.

The annual 1957-2006 temperature anomaly trend averaged over the 63 AWS stations is positive, but is not statistically different than zero for a p equal to or less than 0.05 when the trend regression data is adjusted for lag-1 autocorrelation.

The 1970-2006 temperature anomaly trend averaged over the 63 AWS stations is very slightly negative, but not statistically different than zero.

The 1980-2006 temperature anomaly trend averaged over the 63 AWS stations is negative, but not statistically different than zero.

The station trends vary dramatically from station to station.

The stations tend to cluster spatially, but I do not see that biasing a cooling trend although that needs a more sophisticated analysis.
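The lag-1 adjustment mentioned in the first of these observations can be sketched as follows. This is a minimal version of the standard effective-sample-size correction for autocorrelated residuals; whether it matches the exact calculation used for the figures above is an assumption:

```python
import numpy as np

def ar1_adjusted_trend(y):
    """OLS trend whose standard error is inflated for lag-1 autocorrelation
    via the effective sample size n_eff = n * (1 - r1) / (1 + r1)."""
    t = np.arange(len(y), dtype=float)
    b, a = np.polyfit(t, y, 1)               # slope, intercept
    resid = y - (a + b * t)
    r1 = np.corrcoef(resid[:-1], resid[1:])[0, 1]
    n_eff = len(y) * (1 - r1) / (1 + r1)
    # Fewer effective degrees of freedom -> larger standard error
    se = np.sqrt((resid @ resid) / (n_eff - 2) / ((t - t.mean()) @ (t - t.mean())))
    return b, se
```

A trend is “not statistically different than zero” in the sense used above roughly when |b| is less than about 2 × se.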

Dr. Steig is claiming on RC that his code has been archived, stating that it has been all along, yet his link seems to point to the manual for the RegEM functions he claims to have used. It’s like me saying “yeah, here’s my code that calculates the magnetohysteresis of the sun” and handing you a C++ manual on for loops.

[Response: ALL of the data that were used in the paper, and EXACTLY the code used in our paper have been available for a long time, indeed, long before we published our paper. This is totally transparent, and attempts to make it appear otherwise are disingenuous. This has always been clear to anyone that asked… .–eric]

[Response: Table S2 says it has “List of the 42 occupied weather stations … and four automatic weather stations (AWS)” (i.e. 46 entries). Only the 42 occupied stations are used to provide the data back to 1957. The AVHRR data, or the AWS data are used only for calculating the co-variance matrices used in the different reconstructions. Thus the reconstruction can either use the covariance with AVHRR to fill the whole interior back to 1957 (the standard reconstruction), or the covariance with the AWS data to fill the AWS locations back to 1957. – gavin]

The claim seems to be that the reported correlation for Harry is based on a separate analysis which used only the AWS data.

A quick scan of the code shows that if they are using Dr. Mann’s newregem.m (PNAS 2008) there are only mildly interesting changes. [ Links to other observations on Mann RegEM vs Schneider RegEM would be appreciated ]

#73. I don’t think that that nearly begins to solve anything. If you “fill” the AWS locations with RegEM, you get back the original data where it exists. You only “fill” missing data – that’s why the correlation is 100%. I don’t see any archived data series yielding a correlation of 0.57 to Harry. Maybe someone else can figure it out, but I can’t so far.
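The point that infilling only touches missing cells, so the overlap with the input matches exactly, can be demonstrated with a toy infill (simple column means standing in for RegEM’s iterative regression; the dimensions and data are invented):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 5))               # 120 months x 5 toy "stations"
obs = X.copy()
obs[rng.random(X.shape) < 0.3] = np.nan     # knock out ~30% as "missing"

# The infill step writes ONLY into missing cells; observed cells pass through.
filled = obs.copy()
rows, cols = np.nonzero(np.isnan(filled))
filled[rows, cols] = np.nanmean(obs, axis=0)[cols]

keep = ~np.isnan(obs[:, 0])
r = np.corrcoef(obs[keep, 0], filled[keep, 0])[0, 1]
print(r)  # 1.0 -- the observed overlap is returned unchanged
```

Any infilling scheme with this pass-through property will show a near-perfect overlap correlation with its own input, which is why a 0.99978 correlation points to Harry being in the input network.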

I’m taking a long shot here (and I don’t have time to really research this right now and I haven’t been paying close attention), but is it possible that the algorithm being used has inadvertently created an autoassociative (or possibly heteroassociative) memory, where you can retrieve certain individual inputs of the covariance matrix by multiplying by another input of the covariance matrix? If the inputs of the covariance matrix are orthogonal you can get perfect “recall”. If Harry is one of the inputs to the covariance matrix, you can retrieve Harry perfectly from the matrix. It’s an old “neural network” trick.
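The “old neural network trick” mentioned above can be shown in a few lines with a linear outer-product associator built from orthonormal patterns. This illustrates the perfect-recall property only; whether RegEM actually behaves this way is the commenter’s speculation:

```python
import numpy as np

# Three mutually orthonormal "input" patterns of length 8
Q, _ = np.linalg.qr(np.random.default_rng(1).normal(size=(8, 3)))
patterns = Q.T                               # rows are orthonormal

# Outer-product (autoassociative) memory built from the patterns
M = sum(np.outer(p, p) for p in patterns)

# Because the patterns are orthonormal, each one is recalled perfectly:
print(np.allclose(M @ patterns[0], patterns[0]))  # True
```

With non-orthogonal patterns the recall is only approximate, which is why perfect reconstruction of one input is such a strong hint that it was stored in the first place.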

Steig et al already performed an analysis of the temperature data excluding Harry (along with most of the other AWS stations). While the results are not identical, they are sufficiently similar to be supportive of their conclusions.

So where is the chart that shows the divergence of these two “sufficiently similar” runs?

Which shows the greatest trend, with or without Harry?

And it says that a test was run “excluding Harry (along with most of the other AWS stations)”. This says they still used some of the AWS stations. Which ones?

The idea that the reconstruction would exactly reproduce Harry – and not just Harry, but Harry incorrectly spliced with Gill – when Harry-spliced-with-Gill was not an input to the analysis is utterly ridiculous. I get that. There must be a logical explanation.

Why not contact BAS and provide them with a link to your post on Sunday introducing the problem with Harry and ask them to credit the person who found the data error? If they are sensible, they will credit you. If not, perhaps they will credit the person who notified them of the error.

I am willing to bet whoever notified them did not give them all of the info at once. I bet two or three emails were sent as more and more information came out on this website. You might want to ask specifically about the number of emails sent. If they refuse to answer, an FOIA might be in order.

Climate science seems to be an awfully sloppy science. I read the reports that Harry/Gill were spliced, but assumed that as Steve Mc. had only compared the January/February data more work needed to be done to verify the proposition. Three questions spring to mind:

1. Who spliced the Harry/Gill station data and why?
2. Why didn’t Steig and his collaborators identify the problem?
3. Who were the “peers” who reviewed the paper who failed to notice the splice?

I once worked in an engineering research lab. We had to keep notebooks of all our musings, lab books of all our experiments, and all our data was time stamped and under change control. It was a very long time ago, but I remember the culture was that everything you did was recorded, date stamped, and the records filed and kept for review if necessary. I am surprised that the scientific community hasn’t introduced rigorous recording and change control techniques, particularly in the data age, where huge amounts of data and code can be used on even a minor project.

I note Gavin refers to non-climate scientists who challenge the AGW alarmists as “citizen scientists”; perhaps he should consider the climate scientists who are doing data analysis as “citizen data analysts”, because frankly, from what I’ve read on these pages, they seem to be a pretty amateurish bunch.

“What if science were software? and every paper that used data were linked to that data and when the data was updated or changed the whole of science related to it were regression tested….”

Exactly! One would have thought that it would be an absolute necessity for the code and data upon which a paper is based to be frozen and archived at the moment said paper is accepted for publication. Changing data or code post-publication quite simply invalidates the paper. It’s not exactly rocket surgery. For me, as an old-fashioned “working” scientist, it is mandatory – but it seems to be depressingly untypical behaviour in “climate science”.

Slightly OT: Tapio Schneider, of RegEM fame, might be an honest broker. This from a critique of a proxy study he published in Nature in 2007:

Even if the inferential errors are corrected, similar procedures based on temperature reconstructions from proxy data generally underestimate uncertainties in reconstructed temperatures and hence in climate sensitivity. Climate proxies are often selected on the basis of their correlations with instrumental temperature data, as in the reconstruction underlying the analysis of Hegerl et al. Using such proxies in regression models to reconstruct past temperatures leads to selection bias, resulting in an overestimation of the correlation between proxies and temperatures and an underestimation of uncertainties.
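The selection bias described in the quoted critique is easy to reproduce with pure noise (all parameters here are invented for illustration): screen 1,000 noise “proxies” against a “temperature” series on a calibration half, then check the survivors on the verification half.

```python
import numpy as np

rng = np.random.default_rng(2)
target = rng.normal(size=100)              # toy "instrumental temperature"
proxies = rng.normal(size=(1000, 100))     # 1,000 "proxies" of pure noise

cal, ver = slice(0, 50), slice(50, 100)
r_cal = np.array([np.corrcoef(p[cal], target[cal])[0, 1] for p in proxies])
picked = proxies[r_cal > 0.3]              # these "pass screening"

r_ver = np.mean([np.corrcoef(p[ver], target[ver])[0, 1] for p in picked])
print(len(picked), r_cal[r_cal > 0.3].mean(), r_ver)
# A respectable-looking calibration correlation, built entirely from noise,
# collapses toward zero out of sample.
```

This is the mechanism behind the overestimated proxy-temperature correlations and understated uncertainties the critique describes.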

[Response: It’s not nonchalance. Errors in data and code should be looked for and fixed when found. It’s simply that this is not the most important issue when deciding whether a result is interesting or not. Someone could have perfect data and absolutely correct code, but because their analysis is based on a fundamentally incorrect assumption, the result will be worthless. But errors in data are ubiquitos, as are bugs in code (Windows, anyone?) – interesting results therefore need to be robust to these problems. Which is why we spend time trying to see whether different methods or different data sources give the same result. That kind of independent replication is the key to science, not checking arithmetic. – gavin]

Bolding mine.

Simply re-do the analyses, using different, but still possibly buggy data and code, and check your previous results. All that is required is that your assumptions be fundamentally sound.

hmmm … does this method work for bridges and elevators, … and rocket launches?

But errors in data are ubiquitos, as are bugs in code (Windows, anyone?)… – gavin

Astonishing. Bugs in data analysis algorithms are not ubiquitous. These are generally small (a few hundred lines of code), tightly coded algorithms that are relatively easy to probe for consistency. Windows bugs persist because Windows is tens of millions of lines of code, and bugs that don’t matter are ignored in favor of those that do.

“Response: It’s not nonchalance. Errors in data and code should be looked for and fixed when found. It’s simply that this is not the most important issue when deciding whether a result is interesting or not. Someone could have perfect data and absolutely correct code, but because their analysis is based on a fundamentally incorrect assumption, the result will be worthless. But errors in data are ubiquitos, as are bugs in code (Windows, anyone?) – interesting results therefore need to be robust to these problems. Which is why we spend time trying to see whether different methods or different data sources give the same result. That kind of independent replication is the key to science, not checking arithmetic. – gavin]”

Gavin
VS.
Steig

“People were calculating with their heads instead of actually doing the math,” Steig said. “What we did is interpolate carefully instead of just using the back of an envelope. While other interpolations had been done previously, no one had really taken advantage of the satellite data, which provide crucial information about spatial patterns of temperature change.”

So, Gavin, are you saying “Someone could have perfect data and absolutely correct code, but because their analysis is based on a fundamentally incorrect assumption”, while back-of-the-envelope thinking is the correct scientific method to use just as long as it comes to the right conclusion? Who is correct, you or Steig? /Jens

I am not sure that Steig is off in declaring the aws data were not used for the main figures in the body of the article. From his data repository, the description indicates that aws_recon_aws.txt was used for Figure S3 (in the supplemental materials, not the main article). The file with the data used for the primary conclusions in the body of the article appears to be ant_recon.txt. I would look there to compare the reconstructed grid box temp data to Harry and see what it looks like.

#93. Schneider’s point expresses it differently, but it’s really another variation of the point that correlation picking leads to a hockey-stick bias (if the target is modern temperature). Unfortunately you can only use 5 refs in a PNAS letter and the point is not made overtly by Schneider.

#98. Bob, the AVHRR data was used for the main figures; some issues have been raised about the AVHRR data (which is not as well known or well discussed a data set as the UAH microwave data), and the reply has been that they “get” the same result using AWS.

Relevant monthly AVHRR is unavailable although there is copious raw data online. There are different screening methods available; Steig processed the AVHRR data differently than the predecessor report and got more warming. I asked Steig to archive their monthly AVHRR output, but he refused. Using pretty much the same arguments as Santer, his position is that I should run the gauntlet of figuring out how they processed AVHRR data. I obviously disagree.

The other part of the data – the READER AWS and surface data – was readily available and so I looked at the part of their analysis where the data as used was accessible.

If Steig wishes to get people to focus on the AVHRR data, then the simplest way of doing this is to archive the monthly AVHRR intermediates as used in their reconstruction – something that I requested in the politest possible way before Steig started slagging me.

Perhaps the AVHRR handling is above reproach. The best way for Steig to remove speculation about things is to provide their AVHRR monthlies in the form requested. BTW his coauthor Comiso said that he was working on it- but it would have saved them trouble if they had had it ready.

Everyone also needs to keep in mind the paleoclimate history in these discussions. One of the MM03 tests was using updated data versions. Mann said that this was “WRONG”. An “auditor” had to do things EXACTLY the same way. There were unexplained steps in the methods which they refused to explain, so we made reasonable assumptions. Mann said that these were “WRONG” because we had made some slight mis-step in re-tracing Mann’s steps – options that were by no means statistical necessities.

My point in attention to precise detail is not because I think that there is any magic in Team methods (I don’t), but because I’ve had my fill of the Team fogging up a debate by screeching that some step in a plausible interpretation was “WRONG”. So I’d rather make sure that I know exactly what they did; benchmark my methods against theirs and THEN test variations.

Steig does provide a link to Schneider’s RegEM Matlab code on his UW site, which at least is a start. However, there is no code showing what Steig et al put into Steig’s routines, so his claim to have posted all the Team’s code is totally inflated.

BTW, the Wikipedia pages on Total Least Squares and Ridge Regression are now beginning to make sense to me. Last year they were totally confusing. (Ridge Regression gets redirected to Tikhonov Regularization, but apparently that is just an older name for the same thing.)

Gavin’s mystery man is none other than Gavin Schmidt himself, says http://www.antarctica.ac.uk/met/READER/data.html, accessed 4 Feb 2009, 14:09:47 UTC:
[…] “Note! The surface aws data are currently being re-proccessed after an error was reported by Gavin Schmidt in the values for Harry AWS(2/2/09)
The incorrect data file for Harry temperatures can be accessed here”[…]

Steve- I was simply trying to clarify the impression that many posters seem to have that “Harry” must have been used in the primary reconstruction, therefore, the whole thing should be thrown out. I think we are a long way from that conclusion (throwing the whole thing out) and need to step back and slowly and methodically go through what is available without jumping to potentially erroneous conclusions.

And yes, it would really help matters if the relevant screened or masked AVHRR data and the specific code used were made available. However, since Dr. Steig indicated that the code used is available on Tapio’s site, I think the thing to do is take that statement at face value and assume that they used it “off-the-shelf”. If you cannot recreate their output (e.g., for the aws recon) with the “off-the-shelf” code, then a good argument could be made that they didn’t make the actual code available as they stated.

RegEM is based on Ridge Regression, which is just l2-norm regularized ordinary least-squares regression. If data errors are “ubiquitous” as Gavin claims, then methods based on the l1-norm are probably more appropriate (they are much more robust to outliers than l2 methods).
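The robustness point is easy to see in the simplest possible case: fitting a constant to data containing one gross bad value. The l2 (least-squares) fit is the mean and gets dragged toward the outlier; the l1 (least-absolute-deviations) fit is the median and barely moves. A toy sketch (the 10.0 is a made-up “Harry-like” bad value, not real station data):

```python
# Toy illustration of l1 vs l2 robustness, not the RegEM code.
# Fitting a constant: the l2 minimizer is the mean, the l1 minimizer
# is the median.  One gross outlier wrecks the former, not the latter.
import numpy as np

data = np.array([0.1, 0.2, 0.15, 0.18, 0.12, 10.0])  # 10.0 is a fake bad value

l2_fit = data.mean()      # minimizes the sum of squared residuals
l1_fit = np.median(data)  # minimizes the sum of absolute residuals

print(f"l2 fit (mean):   {l2_fit:.3f}")   # dragged to ~1.792 by the outlier
print(f"l1 fit (median): {l1_fit:.3f}")   # stays at 0.165
```

The same contrast carries over to regression losses (least squares vs. least absolute deviations), which is the substance of the comment above.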

R SCP, #1 “And why the “mystery man” was not credited when the change was made?”
I don’t quite follow – where is the mystery? It states on the Met READER Data page “The surface aws data are currently being re-proccessed after an error was reported by Gavin Schmidt in the values for Harry AWS(2/2/09)”
Sorry if I have missed something. Fairly minor point in any case.

Steve – congratulations on yet more excellent work.

Steve: That was not how the screen appeared yesterday. It appeared without attribution. (There’s a little back story on the change.)

Huh. Gavin actually says “people”, as in more than one (which makes “independently” even more of a statistical improbability), but it is interesting that he didn’t see fit to mention he was one of them!

And, btw, was there some MacGyver-like urgency to his notification? A nuke about to explode? Timmy with water up to his chin at the bottom of a well? Something that required him to preempt on an urgent basis a reasonable allowance of time for the discoverer to make the report? Academic publication is a process that takes months (sometimes years) to wend through, but this issue would brook no delay for academic civility that would be accorded nearly anyone else?

Also there’s an ongoing loose end with Table S1 and S2 that I don’t understand. Harry and 3 other AWS stations are in Table S2 with the surface stations and not in Table S1 with the AWS stations. I can’t make head nor tail of the various “explanations” at RC on this.

I don’t know why the four AWSs were mixed into the list of manned stations in Table S2. When they selected the 26 AWS sites for “reconstruction” in Table S1, they didn’t use Mount Siple with 138 data points or Byrd with 192. However, they kept Enigma Lake with 126 points and Nico with 120.

The “>40% complete” threshold for AWS selection seems to be measured from when each station’s own READER data record starts. It would have made more sense to measure completeness against one common start date for the whole network.

Are there calibration issues? If so, they should check whether the temperature sensor’s calibration was affected by the climate or by physical damage. The issue should be resolved clearly enough that everyone can believe the result.

[…] So far investigation has been made of only one station in the network used. Steve McIntyre (of Climate Audit) discovered that it was seriously defective, and posted about the error at 4:41 pm EST, on February 1 (Super Bowl Sunday). The British Antarctic Survey (BAS) corrected the error on Monday morning — fast work, but without crediting McIntyre for catching the error. (source) […]

[…] of Steig, let me review their previous defence of Gavin the Mystery Man, originally discussed here. I had noticed curious properties in the Harry station in Antarctica, which were subsequently […]