Gavin on McKitrick and Michaels

Ross writes:

Has Gavin posted on his IJOC paper? [SM – yes. At RC today.] I will head over tomorrow to have a look. Not today–it’s very sunny here and the ice rink beckons. I am not sure how replication plays into the issue, since I posted my data and code from the get-go.

I got a copy of Gavin’s paper (from Jos de Laat) 2 weeks ago, and Gavin sent me his data promptly in response to my request. I have been having a lot of fun with it. In his paper, Gavin shows that the coefficients weaken a bit when RSS is substituted for UAH. But he doesn’t report the joint F tests which form the basis of the conclusions about contamination. They are still strongly significant. He also points out that the coeff’s are significant when GISS data are swapped in, and claims this shows the effects are spurious. I am sure I am not the only person who has actually read the paper and noticed that, to the extent they are significant, the coefficients on GISS data take the opposite signs to those on the observational data. Far from showing the effects are spurious, it shows that the observations negate a significant pattern in the modeled data and make it significant in the other direction. That’s called an observable effect, not a spurious result. And since it’s a comparison of ‘clean’ GCM data versus observations it has as much causal interpretation as the 3-part ‘signal detection’ methodology in the IPCC reports. I.e., in this case it’s a “signal detection” result for non-climatic contamination of the surface data. Gavin doesn’t report a chi-squared test of parameter equivalence between coeff’s estimated on modeled and observed data (akin to the outlier test and Hausman test shown in MM07), but that’s OK because he posted his data, so I have done it and parameter equivalence is strongly rejected.
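For readers unfamiliar with the mechanics, the kind of parameter-equivalence test Ross describes can be sketched as a Wald (chi-squared) comparison of two sets of OLS coefficients. Everything below is synthetic and illustrative; it is not the actual MM07/Schmidt computation, and the variable names are invented:

```python
# Sketch of a Wald (chi-squared) test of parameter equivalence between
# coefficients estimated on two datasets sharing the same regressors.
# Synthetic data throughout; illustrates only the mechanics.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, k = 200, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])

def ols(X, y):
    """OLS coefficients and their (homoskedastic) covariance matrix."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    s2 = resid @ resid / (len(y) - X.shape[1])
    V = s2 * np.linalg.inv(X.T @ X)
    return b, V

# "Observed" vs. "modelled" data with sign-flipped slope coefficients
y_obs = X @ np.array([1.0, 0.5, -0.3]) + rng.normal(size=n)
y_mod = X @ np.array([1.0, -0.5, 0.3]) + rng.normal(size=n)

b1, V1 = ols(X, y_obs)
b2, V2 = ols(X, y_mod)

d = b1 - b2
W = d @ np.linalg.inv(V1 + V2) @ d   # Wald statistic
p = stats.chi2.sf(W, df=k)           # ~ chi2(k) under parameter equality
print(W, p)
```

Under equality of the two parameter vectors the statistic is asymptotically chi-squared with k degrees of freedom, so a tiny p-value rejects equivalence.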

The issue of spatial autocorrelation is a huge red herring. Over a year ago I responded to the RC posting on this by writing a paper showing that spatial AC is not significant and even if we control for it anyway the results all hold up. The JGR would not publish my response unless Rasmus submitted his comment. I challenged him to do so in December of 2007 and he said it might take a while since he was getting busy with work. That’s the last I heard from him on it. I also deal with the topic in a more recent paper under review elsewhere, which I have reason to believe Rasmus has read. But when I write up my response to Gavin’s paper I’ll be sure to give the spatial AC issue a thorough discussion.

Editorial note (SM): I have not worked through any of these papers and accordingly have no personal view or opinion on the details. Schmidt acknowledges that Ross’ due diligence package was complete. Schmidt sneers at the problems that others encounter in trying to reach the starting point (e.g., with Steig) that a proper due diligence package, such as the one Ross provided, would have given them, but, inconsistently, appears to have relied on Ross’ well-prepared due diligence package in his own paper.

This is a little off topic, but I am intrigued by the battle for the “meme of warming” via the publishing and criticizing of papers. And sometimes it gets less than professional. Gavin’s use of the term “spurious” results brought to mind the Lyman 2006 paper on recent cooling of the oceans http://oceans.pmel.noaa.gov/Pdf/heat_2006.pdf. The boys at NASA made sure you read an “unpublished” paper, “Correction to ‘Recent Cooling of the Upper Ocean’”, a correction that has been revised a few times, by “tacking” the correction in front of the Lyman paper. It appears they felt compelled to ensure this publicly available document did not “misguide people” into thinking there was recent cooling, i.e. that the cooling was just a spurious result. Is it common practice at NASA to negate a published paper with an unpublished one? Did their “corrections” paper ever get revised enough to be published?

I found it very telling of the politics involved when their correction paper used the Gouretski and Koltermann 2007 paper, which had shown that ocean heat content had been overstated, and used that fact to suggest the overstatement was one of the “causes” of the spurious cooling recorded by Lyman et al. Even more tellingly, they never mention any other papers that may have concluded excessive ocean heat content and spurious warming, nor made sure a correction was tacked onto those papers. snip

What is worse about the first ‘lecture’ is that it simply casts doubt without being specific about when, where and by how much; and how they know which instrument was ‘wrong’. This is as bad a case of scientific white-anting as I have seen in 40 years.

Re: Douglas J. Keenan (#4), Douglas, there’s a long list of specification checks in MM07, see this site where you can get the paper. I don’t know what you mean by “Those tests tend to be non-robust”. Are you saying they lack power? F tests of linear restrictions in a regression model are exact in small samples for Gaussian errors, and asymptotically valid for non-normal errors. In any case, when you’re getting prob values that are zero to 10 or more decimal places, power isn’t an issue.
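As a concrete illustration of a joint F test of linear restrictions of the kind mentioned (synthetic data, not the MM07 specification), one minimal sketch:

```python
# Minimal sketch of a joint F test of linear restrictions (here: that two
# slope coefficients are jointly zero), computed from restricted vs.
# unrestricted residual sums of squares. Synthetic data; illustration only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 150
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 2.0 + 0.8 * x1 + 0.4 * x2 + rng.normal(size=n)

def ssr(X, y):
    """Residual sum of squares from an OLS fit."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ b
    return r @ r

Xu = np.column_stack([np.ones(n), x1, x2])  # unrestricted model
Xr = np.ones((n, 1))                         # restricted: both slopes = 0
q, k = 2, Xu.shape[1]                        # q restrictions, k regressors

F = ((ssr(Xr, y) - ssr(Xu, y)) / q) / (ssr(Xu, y) / (n - k))
p = stats.f.sf(F, q, n - k)
print(F, p)
```

Under Gaussian errors this statistic is exactly F(q, n − k) in finite samples, which is the “exact in small samples” claim above.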

Your comment says “F tests of linear restrictions in a regression model are exact in small samples for Gaussian errors, and asymptotically valid for non-normal errors”. But having Gaussian errors presupposes the opposite of the premise of my comment, and asymptotic results tend to matter little in practice.

Wikipedia says that “even if the data displays only modest departures from the normal distribution, the [F-]test is unreliable and should not be used”. Wikipedia gives no reference, but my recollection is that Kendall’s makes a similar remark, though looking just now I could find only a weaker statement: “sensitive to departures from normality” (sect. 16.40). A brief google search turned up Krämer [EconLett, 1989], who found that “the F-test is extremely non-robust to autocorrelation”.

Your comment also says “when you’re getting prob values that are zero to 10 or more decimal places, power isn’t an issue”. Of course I agree; I have not looked at any of the numbers.

Re: Douglas J. Keenan (#35), Douglas, I have a shelf full of econometrics textbooks that derive, explain and recommend F-tests for linear restrictions in a regression model. You’ll have to do better than Wikipedia. And nobody is talking about autocorrelation here: this is a cross-sectional regression.
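For what it’s worth, the robustness claim can be checked directly by simulation in the i.i.d. cross-sectional setting at issue. A minimal, illustrative Monte Carlo estimates the empirical size of a joint F test at the nominal 5% level when errors are heavy-tailed t(3) rather than Gaussian (all data synthetic):

```python
# Monte Carlo check of F-test size under non-normal (t with 3 df) errors.
# True slopes are zero, so every rejection at the 5% level is a false
# positive; the empirical rejection rate estimates the test's actual size.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, reps = 100, 2000
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
q, k = 2, X.shape[1]
crit = stats.f.isf(0.05, q, n - k)   # nominal 5% critical value

def f_stat(y):
    def ssr(M):
        b, *_ = np.linalg.lstsq(M, y, rcond=None)
        r = y - M @ b
        return r @ r
    return ((ssr(X[:, :1]) - ssr(X)) / q) / (ssr(X) / (n - k))

rej = np.mean([f_stat(rng.standard_t(3, size=n)) > crit for _ in range(reps)])
print(rej)   # empirical size; typically near the nominal 0.05 here
```

With independent errors the empirical rejection rate typically stays close to the nominal level even for heavy-tailed errors; Krämer’s non-robustness result concerns autocorrelated errors, which is a different setting from a cross-sectional regression.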

Nice post, Gavin. However, while I agree that exact reproducibility is not as crucial as the fundamental science, in this era of ever increasing amounts of data, I think it is important to keep track of data and methods. You mentioned emailing authors for clarifications. What happens if you want to try to reproduce results/methods in light of new information 10 years from now and the authors are no longer working in the field and/or no longer have records of their data or processing steps?

That is why it is crucial to provide and preserve solid documentation on the data and methods used. I have seen numerous papers that use “NSIDC sea ice” in their methods with no reference to the exact dataset or version used. While, as in the case shown above, this may not matter much, in another case it could be crucial. I urge all scientists to be vigilant in making sure that refereed journal articles provide not only solid scientific results, but also solid information on their data and methods.

Walt Meier
National Snow and Ice Data Center

[Response: Hi Walt, well in ten years time, most of this kind of analysis being done now will be obsolete (not the methods, just the results) since much more data will hopefully be available. Your larger point is well taken, and that goes to the citeability of datasets – especially when they are continually evolving. I asked on a previous thread whether anyone knew of a specific database suite that gave all of that functionality (versioning of binary data, URL citeability of specific versions, forward citation to see who used what etc.), but other than vague references, there wasn’t anything concrete. Perhaps NSIDC are working on such a thing? – gavin]

Gavin is ignoring the bus. You document your work in case you can’t remember some detail in six months, and in case you get hit by a bus. Gavin seems confident in the performance of his memory, but his employer also needs his replacement to be able to continue his work. Gavin is also providing reasons for neurobiologists and psychologists to monitor climate science, to ensure that the memory of climate researchers continues to hold the information which it is being entrusted with. Perhaps certified memory problems will justify raising the priority of a researcher’s documentation tasks.

Ross (and others familiar with regression models):
You have obviously found the link at RC. I would be interested in understanding what “spurious” means when determining which variables or clusters of variables get entered into a model. I have always assumed that as long as independent variables were more or less independent then any variable with a significant contribution to explaining the dependent variable (i.e., some agreed upon metric for temperature ) cannot be “spurious”. Is this too simplistic?

Bernie – if I check all stock histories over the last 10 years and find a few that correlate well with the temperature in Nome, then would I not have an independent yet spurious relationship? Sorry if I’m not understanding your question correctly.

Steve:
Good point, but I was assuming that there was an a priori argument linking the defined variables causally to the dependent variable. Clearly the stock price of North Face may be correlated with the temperature, but I for one would not propose a causal relationship. Economic activity, however, is in part akin to UHI, and the latter certainly has been shown to be causally related to local temperature measures. Moreover, I do not believe that this is the way Gavin means “spurious” – but I am open to being enlightened.

When, oh when, is the Team going to take Wegman’s advice and start working with real statisticians? If never, then why?

I’m with you. Reading Ross’s comment

…to the extent they are significant the coefficients on GISS data take the opposite signs to those on the observational data…

makes one wonder if there’s any hope.

Maybe the world really does face a global warming crisis. However, given the incredible incompetence of the self-(un)regulated community responsible for modeling as well as collecting, managing, and analyzing the data, well, I’m not convinced we would ever know.

I have long been puzzled by comments along the lines of “why are you worried about this ten year old paper (MBH)” as if scientific publications have some sort of “use-by date”.

Policy is informed by the current literature.

I find the discussion over at RC about transparency incredible. According to Gavin, it’s better not to publish code, or if you do it’s not necessary (re MM07), and now, in ten years time it won’t matter anyway. It’s a ridiculous position. The point of archiving the code is that people can understand what the authors did.

The Steig paper’s methods section and the SI are incomplete. I’ve read both several times, and I can’t figure it out. Even Gavin “misspoke”, suggesting he doesn’t understand it either. Witness the heroic efforts of RM, SM and others to try to reverse-engineer it. Really.

What he means to say is that he is so convinced of the GHG/AGW hypothesis that in 10 years time the upward trend will be beyond the realm of statistics, the current cooling/flatline will be put in proper context as a brief transient, and current analyses such as lucia’s will then be irrelevant. Similarly, the question of CWP vs MWP will then be laid to rest as warming far exceeds the uppermost limit of any confidence interval on any paleoclimatic reconstruction.

I am willing to wager that that is what he meant.

[But it is disturbing to me that the team continually dodges the question of the nature of internal climate variability. Which is, of course, lucia’s point. If their noise model is wrong, then their statistical models are wrong.]

It is appalling that Gavin doesn’t know, or pretends that he doesn’t know, how to cite a database. My daughter learned this sort of thing at school. Gavin’s pretended confusion of simple data citation with different issues like forward uses is his usual attempt to avoid doing what is called for (and well understood in economics or sociology).

They would have to pull their heads in. Strident claims would be moderated. Press releases would become less alarming. Funding would decline. Empire building would no longer be viable. I could go on and on. So, in relation to using genuine statisticians, what is there in it for them other than credibility with a few pesky sceptics? Real members of the “team” never seem to question each other’s work. So as it is, they have little to fear from “climate scientists”. Why bother with a statistician who will just try to audit (potentially find something wrong with) the data and the methods?

The more often that Steve M and a few others find issues with work published by climate scientists the less likely these guys will be to communicate with professional statisticians. They are more likely to carry on with business as usual and simply try to bluster their way past any valid criticisms.

I believe that the more often mistakes like the recent one are exposed, the wider the coverage. And the wider the coverage, the more eyebrows get raised whenever the Team makes another grand statement, or releases a paper. The more they have to defend, the more likely they will pay more attention up front to what they’re releasing, because the “blind faith” is diminishing.

Think of how much easier it would be to audit their process if they archived code and data promptly. Even someone like say, a peer reviewer, might actually look through the data and results. Then the peer reviewer’s and the journal’s credibility is on the line because they would not have the excuse that they never saw the actual data.

I know it’s maybe a bit “low tech”, but would it be possible to publish items such as code, datasets, etc., as appendices to the paper?
That way it wouldn’t matter quite so much what happened to the people involved in producing it, as we have librarians skilled at archiving and retrieving such information.

This reminds me of a situation back in Diffy Q’s class. I had a rather silly sign error in a complex equation that was a part of a 5 question test. I remember arguing with the professor for partial credit because “other than the sign error” I had shown I understood the work. I remember him laughing at me. In this case, because it’s statistics, I couldn’t even argue that the rest of my work showed a comprehension of the material because to do so would be admitting, point of fact, that I didn’t.

The climate system has a great deal of unforced weather ‘noise’ that has significant decadal variations and complex spatial structure which is uncorrelated with any external climate driver. This leads to the well known phenomenon that the variability of trends over specific regions is a strong function of their spatial extent. The smaller the area selected, the greater the spread in the observed trends. For instance, the individual grid box trends in the HadCRUT3v dataset over the period 1979–2001, range from −0.6 to 1.2°C/decade compared to the trend of 0.17 ± 0.01 °C/decade in the global mean (Brohan et al., 2006). There is a similar variance in trends at any one grid box over the same period seen in individual simulations in climate model ensembles. For instance, in an ensemble of 20th Century simulations with the Goddard Institute for Space Studies (GISS) ModelE-R at the grid point centered on 37.5 °E, 50 °N (in Eurasia, picked at random), trends go from −0.17 to 0.5 °C/decade in five ensemble members with identical forcing over the same 1979–2001 period (Hansen et al., 2007).

(Bold added)

Adding one year can hardly explain this difference. Hopefully, this is not a distraction.

If you turn in your homework, it is kind of difficult to claim later that you did it correctly, but the dog ate it. And “the dog ate my homework” is the final trump card. It’s impossible for someone else to establish mistakes in the homework, if the dog ate it.

Please correct me if I am wrong. Are not these studies funded by government grants? If so, then the “intellectual property” belongs to the people; i.e., the taxpayers. There must be oversight by the funding agency, and the funding agency should have access to the complete record — and therefore, the complete record should be available to the public through FOIA. I was a contract monitor for a period of time in the military involving funded research, and the complete record was always provided as a stipulation in the contract. So there must be a way to legally obtain the complete record through the funding agency. If the funding agency does not have the complete record to review, and it appears that peer review does not have the complete record available to review, what oversight would there be to ensure accountability and adherence to scientific principles?

I have been asking the same questions for some time. One response I have gotten indicates the money trail is often obscure or randomly obfuscated. For example, while the Goddard operation at Columbia U. is an official NASA office, Hansen, for instance, receives some funding directly from individuals and non-government organizations. Untangling that kind of financial operation may be beyond what even SM and RM could do, if they so desired. So, how do you get around the objection, it was privately funded?

The following is purely my terminology – maybe someone knows of a more community-accepted terminology. A base dataset is the observations as they were measured. One may make “corrections” by doing things like removing obviously bad data points, reassigning IDs because one knows the data belongs to a different station, etc., but otherwise the data points themselves don’t change, i.e. – they don’t get “adjusted”. The data points in a derived dataset like GISTEMP, climate model outputs, etc. will change over time as new base data are added to the analysis that produces the derived dataset – which results in “adjusted” data dependent on the analysis scheme. Gavin’s reply to Walt indicates to me that he thinks it is okay to start one’s analysis from a derived dataset. I don’t necessarily disagree, but one has to have good reason to trust a derived dataset to use that as a starting point. A good research-quality derived dataset will incorporate newly-discovered data as a major effort with careful versioning, not quiet changes month-by-month. He doesn’t seem to understand or care about the difference between the two.

versioning of binary data, URL citeability of specific versions, forward citation to see who used what etc.

Something like a data version of arxiv (“darxiv” ?) would do the trick (except for forward citation, but you don’t have that for normal citations so I don’t see why it would be necessary for data citations).

Arxiv allows uploading of multiple versions of papers which then get a chronological tag history, so you can go back and look at previous versions (arxiv does not replace the old versions).

Darxiv should allow upload of code and data. Data must be stored in csv format (for portability; that’s not much of a restriction since most datasets are easily converted to csv). You probably should allow any code format, but I’d encourage matlab and R.

Hi Gavin,
I had thought this post would be about the actual findings in your IJOC paper, and on that point I disagree with your interpretation of your results. But that can wait for another day. The immediate point of your post seems, to me, to be that there is a difference between reproducing results versus replicating an effect; and a difference between necessary and sufficient disclosure for replication. Full disclosure of data and code sufficient for reproducing the results does not ensure an effect can be replicated on a new data set: Agreed. But that is not an argument against full disclosure of data and code. Such disclosure substantially reduces the time cost for people to investigate the effect, it makes it easy to discover and correct coding and calculation errors (as happened to me when Tim Lambert found the cosine error in my 2004 code) and it takes off the table a lot of pointless intermediate issues about what calculations were done. Assuming you are not trying to argue that authors actually should withhold data and/or code–i.e. assuming you are merely pointing out that there is more to replication than simply reproducing the original results–one can hardly argue with what you are saying herein.

I do, however, dispute your suggestion that I am to blame for the fact that dispensing with the spatial autocorrelation issue has not appeared in a journal yet. Rasmus posted on this issue at RC in December 2007. I promptly wrote a paper about it and sent it to the JGR. The editor sent me a note saying: “Your manuscript has the flavour of the ‘Response’ but there are no scientists that have prepared a ‘Comment’ to challenge your original paper. Therefore, I don’t see how I can publish the current manuscript.” So I forwarded this to Rasmus and encouraged him to write up his RC post and submit it to the journal so our exchange could be refereed. Rasmus replied on Dec 28 2007 “I will give your proposition a thought, but I should also tell you that I’m getting more and more strapped for time, both at work and home. Deadlines and new projects are coming up…” Then I waited and waited, but by late 2008 it was clear he wasn’t going to submit his material to a journal. I have since bundled the topic in with another paper, but that material is only at the in-review stage. And of course I will go over it all when I send in a reply to the IJOC.

Briefly, spatial autocorrelation of the temperature field only matters if it (i) affects the trend field, and (ii) carries over to the regression residuals. (i) is likely true, though not in all cases. (ii) is generally not true. Remember that the OLS/GLS variance matrix is a function of the regression residuals, not the dependent variable. But even if I treat for SAC, the results are not affected.
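Point (ii) is checkable on any fitted model. A minimal sketch (synthetic coordinates and data, a simple inverse-distance weight matrix, and a permutation test of Moran’s I on the residuals; all names invented):

```python
# Sketch of testing whether spatial autocorrelation carries over to
# regression residuals: Moran's I on the OLS residuals, with a
# permutation test for significance. Synthetic data; illustration only.
import numpy as np

rng = np.random.default_rng(3)
n = 80
coords = rng.uniform(0, 10, size=(n, 2))

# Inverse-distance spatial weights, zero diagonal
D = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
W = np.where(D > 0, 1.0 / D, 0.0)

def morans_i(e):
    z = e - e.mean()
    return (len(z) / W.sum()) * (z @ W @ z) / (z @ z)

# Spatially smooth regressor, i.i.d. errors: y inherits spatial
# structure from x, but the residuals should not.
x = coords @ np.array([0.5, -0.3]) + rng.normal(scale=0.1, size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ b

I_obs = morans_i(e)
perm = np.array([morans_i(rng.permutation(e)) for _ in range(999)])
p = (1 + np.sum(perm >= I_obs)) / 1000.0   # one-sided permutation p-value
print(I_obs, p)
```

Here the dependent variable inherits spatial structure from the regressor, but with i.i.d. errors the residuals would typically show no significant Moran’s I, which is exactly the distinction between (i) and (ii) above.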

Re: bender (#51), Actually I did ask a colleague in statistics if he’d be willing to write up and submit the argument to JGR, and while he was keen to help he never got around to it. He was going to send it in under his own name, and maybe on reflection he didn’t want the humiliation of having the argument thereafter associated with himself. A pseudonym is an intriguing possibility though.

There is one obvious solution to the problem of datasets changing at the source. Download your data when you do the study. Stick it in a subversion or cvs repository. Now you can proceed with your study, downloading new data if the source dataset changes, committing the new downloads into your data repository as you go, until finally your data is frozen for your paper. Then when you publish your paper, give read-only access to your repository….you might even have this wild and utterly crazy idea that you treat your code in a similar manner. I know the climate scientists would find this an outlandish and utterly bizarre solution, but there it is.
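A minimal version of the freeze-the-data step, sketched in Python rather than subversion/cvs (all file and directory names below are hypothetical):

```python
# Minimal sketch of freezing a downloaded dataset for a paper: store each
# download under a content hash, and log it in an append-only manifest.
# File and directory names here are hypothetical.
import hashlib, json, shutil, time
from pathlib import Path

def snapshot(src: Path, archive: Path) -> dict:
    """Copy src into archive/ under its SHA-256 hash; append to manifest."""
    digest = hashlib.sha256(src.read_bytes()).hexdigest()
    archive.mkdir(parents=True, exist_ok=True)
    dest = archive / f"{digest[:12]}_{src.name}"
    if not dest.exists():               # identical re-downloads are no-ops
        shutil.copy2(src, dest)
    entry = {"file": src.name, "sha256": digest, "stored_as": dest.name,
             "utc": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime())}
    with (archive / "manifest.jsonl").open("a") as f:
        f.write(json.dumps(entry) + "\n")
    return entry

# Example: snapshot a freshly "downloaded" file (contents invented)
data = Path("msu_monthly.csv")
data.write_text("year,anom\n1979,-0.1\n")
entry = snapshot(data, Path("data_archive"))
print(entry["sha256"][:12])
```

A content hash makes it trivial for anyone to verify later that the archived file is byte-for-byte what the paper used; the manifest plays the role of the commit log.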

Depends on the distribution statement in the contract. Some data rights can be so called “government purpose rights” giving the funding agency full control of the data/code dissemination. Its not IP, per se.

Another thing: why don’t the agencies like BAS use data repositories at the outset, instead of just overwriting the data whenever a change is made? Is this kind of professional behavior merely a pipe dream?

#39. This handwringing by climate scientists is insane. For our paper on Santer, I downloaded the MSU data version that I used for our calculations and saved it as I used it. Took 5 seconds. I documented by download script so that someone else can download updated data in the same way. It’s hard to believe that adult scientists purport to be unable to figure this out. You don’t need Google to invent some versioning tool.

Response: ALL of the data that were used in the paper, and EXACTLY the code used in our paper have been available for a long time, indeed, long before we published our paper. This is totally transparent, and attempts to make it appear otherwise are disingenuous.

If some of the data is NASA “proprietary” data that they plan to make available in the “near future”, how could Steig have provided “ALL of the data that were used in the paper”? Just wondering.

NASA Data is NOT proprietary. If you look on papers done by NASA employees or even contractors it usually carries the legend “This work carried out under contract xxxx, or with federal funding and is therefore not subject to copyright”.

There have been some instances (more so in the past but not so much recently), that gives principal investigators certain limited time rights to the data to allow them to work on the data and publish their papers so that others do not scoop them on their own data. This is usually no longer than one year these days. This is an OPTION by the PI as it is more normal these days to release all data to the public as soon as possible as it is good for PR. For examples of this look at NASA’s images and data from Mars or other planetary missions.

We are doing a data archiving product now and our deliverable to the NSSDC will be our raw data with a lot of metadata attached, so that in 100 years people can have the data in as pristine form as possible. As a note on this, there is a lot of climate data where this was not done and all we have is the second or third generation (or worse) data, especially from the 70’s and before. Today, trying to reprocess that data is an extremely difficult process.

Dr. Schmidt’s replication study is quite similar to what I would require in a senior undergraduate course in econometrics (Ross saw some of these efforts last year). As I understand Dr. Schmidt’s point of view, anyone can learn from such an exercise even when no discrepancy between the description and the code is found. Having the code makes it easy to go on and probe for sensitivity (what he calls fragility). Clearly Dr. Schmidt has benefited greatly from complete archiving of data and results, but he fails to appreciate the real point of the exercise. From his perspective a climate science student in a class that requires a replication exercise would not submit his code along with the replication. It is better for the teacher to read the method section and attempt to replicate the replication on her own. If that does not work then problems with the method section have been uncovered (and of course the teacher might be an amateur who should not be encouraged to question the student’s results). So I would give him a B- on his study because he failed to learn the real lesson.

I’m surprised that a climate journal publishes a replication that does not report any F stats, nor estimated standard errors, only indicators for significance at a fixed alpha. I’m also amazed that this replication fails to mention any of the sensitivity tests in the original study. I agree with Ross that Dr. Schmidt does not seem to understand the difference between correlation between X’s and correlation between residuals. Nor does he seem to understand the difference between “fail to reject the null” and “accept the null.”

The little digs against Stata are amusing. I Googled “stata license columbia university” and found that Dr. Schmidt could get a personal copy for $155. I recommend it strongly to him and to anyone who wants to work with large data sets. It’s true that Stata is neither open source nor free. However, it has been the workhorse of micro econometrics for over 20 years, well before R became available. So you have to cut people some slack for not abandoning precision tools they know well.

Stata is very robust across platforms (unlike some packages). Ross’s code will produce exactly the same results on any system. By using a specific FORTRAN compiler and a separate package of routines Dr. Schmidt makes his analysis much harder to follow up even if one had his code.

Chris:
With all due respect, the software package issue strikes me as a dead horse, i.e., post the code or post the script – but post that which allows your work to be replicated. How about some comments on Gavin’s actual critique of Ross’s paper?

As I said: failure to report F stats, failure to address sensitivity analysis in the original paper, misunderstanding correlation with observables and residuals. The paper ignores standard (undergraduate) econometric techniques and is itself a dead horse. Ross can address those deficiencies just fine (and he is fluent in the local pidgin that combines ‘spatial field’ and ‘Hausman test’).

I encourage Dr. Schmidt and/or colleagues to submit some of their papers that use economic data to journals such as J. of Applied Econometrics or the J. of Environmental Economics and Management. When they get through applied econometrics referees, as Ross & his collaborators have gotten through climate science referees, then we will start to see where both sides may benefit from a dialogue. Unfortunately it appears that if these papers were published in economics journals they would not be considered scientific. The climate science ‘team’ only plays at home.

Apart from ITAR, which is a very sticky wicket, you have distribution statements A-F per DOD (which I am familiar with). As to what NASA contracting does…don’t know about that.

A: Approved for public release; distribution is unlimited.

B: Distribution authorized to U.S. Government Agencies only (fill in reason) (date of determination). Other requests for this document shall be referred to (insert controlling DoD office).

C: Distribution authorized to U.S. Government Agencies and their contractors (fill in reason) (date of determination). Other requests for this document shall be referred to (insert controlling DoD office).

Re: Dennis Wingo (#53), There’s another point here as well. The contractor who may be producing the data is usually NOT allowed to release the data themselves. There is usually a government releaser who has authority and is supposed to be the intermediary. See for example the recent LLNL FOIA thread.

Gavin’s point to Walt at NSIDC re a software suite to do this – though I suspect, perhaps unkindly of me, that it was a sideways jibe – is actually well taken.

Really, what someone ought to be doing at this point is firing a rocket up some high-level national/international organization and congress to fund and design such a tool that would maximize interoperability, and additionally offer to be repository of all this cross-discipline versioned data as well.

I know, I’m probably dreaming, but darn it, it *is* doable in the 3-5 year range anyway if someone will get assigned the job and funding.

Chris:
I recognize the F stats issue and the fact that Ross will undoubtedly respond in detail, but given the data and code Gavin has provided, can you add some specific analysis a la SteveM, RomanM, Hu, Lucia, etc.? Otherwise I am afraid it will be too easy for Gavin et al to simply wave off Ross’s criticisms.

Chris, the battles trying to get econometric techniques into climatology journals can be exasperating. I had a paper turned down at Climatic Change many years ago on the basis of a 2-page referee report explaining that unit roots can invalidate regression results. Stephen Schneider was quite impressed with this argument, and didn’t seem to think it relevant that I was doing a cross-sectional regression. More recently I am up against referees leveling charges of “overfitting” with no definition, then ignoring diagnostics that show multicollinearity is not a problem. Notice that Gavin ignored the diagnostics in my MM07 paper and my follow-up note on Spatial AC. Climatologists keep referring to the concept of “effective degrees of freedom” based on some largely unpublished in-house theorizing for AR1 processes, but then ignore GLS estimators as if they were no different from OLS. You can see that in Gavin’s reply to my RC posting. He ignores the actual test scores I computed, and the fact that my re-estimation using SAC weights leaves the conclusions intact. Instead he dismisses the results in their entirety based on vague, qualitative assertions about the number of “true effective degrees of freedom,” an ephemeral thing not susceptible to actual computation. There’s no formula for it, it just seems to be defined as “whatever number lets me ignore your results.” I am currently contending with a climatology referee pulling one against me.

Yes, effective degrees of freedom?! It strikes me like the spoof about identifying micronumerosity and correcting for it (was that Kmenta?). Or people in other social sciences who worry about multicollinearity, not realizing that OLS standard errors automatically adjust for it.

Bernie seems to think I have some point that would enlighten climate scientists who do applied econometrics yet refuse to take it seriously. Why would they listen to me? The only reason I weighed in at all is that Dr. Schmidt used economic data and replicated a paper (MM07) I understood more or less. He did this in an amateur way (i.e. from the perspective of a referee for econ journals). He then missed the irony that he had done this using full documentation of the procedures, while his collaborators have argued vociferously that they need not provide all code and data publicly nor respond to amateurs.

I think if any editors of applied econ journals are reading this, they should invite papers on the relationship between temperature and economic activity (including causality in both directions, a la Nordhaus). Invite climate scientists to show that they can meet the standards. Promise them that the econ referees chosen would have no history on the climate side; they would simply uphold econometric standards. Of course, many of these journals will require that all the code and data be archived publicly as a condition of publication.

Bernie seems to think I have some point that would enlighten climate scientists who do applied econometrics yet refuse to take it seriously.

Actually that is exactly what I was thinking, because without a proof point of some kind I do not see the argument progressing. It will simply remain a ****ing contest. Granted what you and Ross say about how they handle this econometric type of model, it seems to me that if you can unambiguously demonstrate a flaw in their approach then it may give others pause. For example, I was struck by the following in Gavin’s paper:

Adjacent grid boxes in both economic and climate data are not independent (Jones et al., 1997), and assuming that they are leads to over-estimating the significance of any correlation and the potential for over-fitting any statistical model. Some indication of this is given by the fact that the largest ‘contamination’ deduced from their methodology are in very remote polar regions such as Svalbard or the South Orkneys, hardly sites of significant industry (Rasmus Benestad, pers. communication).

(Bold added)
Now this either points to an issue in MM07 or it suggests a fundamental misunderstanding of the approach. Which is it? If it is the latter, what is the misunderstanding? I think this goes to the interpretation of the model, not to whether the right statistical technique was used. The power of Ross and Steve’s deconstruction of the Hockey Stick was that they could point to a visible culprit – BCPs – as an example of the technical statistical issues. It worked. I bet that Mann is still looking to resurrect his stick without BCPs.
My grasp of models is too weak to help in any meaningful way – but I do believe that pinpointing an error that can be readily articulated is far more potent than arguing that Gavin et al cannot do undergraduate-level econometric work. The latter type of argument, actually an ad hominem argument, is not particularly helpful. Gavin is a pretty smart dude and has mastered his own tool set. I do not think he handled the MM07 argument persuasively – but besides the DF and F argument I am not sure I have heard a compelling counterargument. Ross may have one ready to go – you might see one as well.

Re: bernie (#71), Every regression model has some anomalous observations, the issue is how influential they are. If you look at Section 4.2 of my paper you will see that I re-ran the analysis using a formal test to exclude influential outliers (see Fig. 1). It doesn’t affect the results–a point I showed using a chi-squared test of parameter equivalence. Gavin simply ignores this, pulls out one outlying observation and with no supporting statistical analysis argues that it undermines the conclusions, citing a “pers. comm.” from a fellow realclimate blogger who likewise has done no analysis on the issue. I get rather frustrated by the fact that they ignore the detailed statistical analysis in the paper and then take random, unsupported potshots at it.

Ross beat me to this … but I had to actually read the paper to answer this question.

the largest ‘contamination’ deduced from their methodology are in very remote polar regions such as Svalbard or the South Orkneys, hardly sites of significant industry (Rasmus Benestad, pers. communication).

Okay, you got me to re-read MM a bit more carefully, especially section 6, which seems to be where they do the counterfactual adjustment that Schmidt refers to. I don’t see why he needs a personal communication to deduce the point. Figure 4 of MM shows Svalbard and S. Orkneys among the largest adjustments. The counterfactual is not completely obvious to me: make every part of the world as rich as the U.S. (per person per sq km) and as well educated in 1979, then act as if no effects from growth and other trends are in effect. Alright … I could use a little help from M&M on exactly why this counterfactual. If Dr. Schmidt understood the adjustment he would have seen it is not about industrialization per se. But since Ross has provided the data it will take no time to explain ‘why’ those locations have relatively big adjustments (once I find their latitude & longitude). Come on people, all they did was recompute the fitted values with two columns set to U.S. values and the several others set to 0.

I was also going to say what Ross said: MM had a section looking at outliers and influential observations. Schmidt said nothing about their sensitivity analysis in his replication. The South Orkneys are indicated as one of the potential outlier observations (but not Svalbard). Quoting MM: “using a Hausman-type chi-square statistic. The joint variance-covariance matrix was estimated and the model coefficients were compared, yielding a χ²(14) score of 18.82, which is insignificant (P = 0.17), indicating that we do not reject the hypothesis that there are no systematic differences in the coefficients between the models with and without outliers.”

Of course the issue of influence is not the same as spatial correlation. And the size of the counterfactual adjustment is unaffected by spatial correlation even if MM based it on the GLS estimates.
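For anyone curious what that Hausman-type comparison amounts to mechanically, here is a minimal sketch (not the MM07 code; I have simplified by treating the two estimates as roughly independent, whereas MM07 estimate the joint variance-covariance matrix, and the coefficient values below are hypothetical):

```python
import numpy as np
from scipy import stats

def hausman_type_test(b1, V1, b2, V2):
    """Chi-square test that two coefficient vectors are equal.
    Simplification: uses V1 + V2 as the variance of the difference,
    i.e. treats the two estimates as independent."""
    d = np.asarray(b1, float) - np.asarray(b2, float)
    V = np.asarray(V1, float) + np.asarray(V2, float)
    chi2 = float(d @ np.linalg.solve(V, d))
    p = float(stats.chi2.sf(chi2, df=len(d)))
    return chi2, p

# Hypothetical coefficients with and without the flagged outliers:
b_full = np.array([0.50, -0.20, 0.10])
b_trim = np.array([0.48, -0.22, 0.12])
V = np.diag([0.01, 0.01, 0.01])

chi2, p = hausman_type_test(b_full, V, b_trim, V)
print(chi2, p)  # small statistic, large p: no systematic difference
```

A large p-value, as in MM07's χ²(14) = 18.82 (P = 0.17), means the with- and without-outlier coefficient vectors are statistically indistinguishable.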

Re: Chris Ferrall (#95), Hi Chris. The counterfactual is based on the argument that some aspects of economic activity are good for measurement and some are not good. It is good for a nation to have sufficient GDP per square km, and lots of educated workers. A large, poor country will not have enough money to run a good network of weather stations, and a country with relatively few people with post-secondary education will find it relatively costly to staff such a network. We assume, in essence, that the US has the human and economic capacity to do good surface temperature measurements, so if other countries had their resources and the surface process variables were set to zero, we could back out the non-climatic effects.
This is as much as we can do within the framework of our analysis. A better method would be a panel regression with controls for unobservable heterogeneity. That would permit better identification of changes in the surface temperature network that are clearly associated with economic changes rather than climatic ones. I hope someone else will do this one day, but if nobody gets around to it by the time I run out of other things to do, then I might.

Re: Ross McKitrick (#102),
Ross, that’s all there in the article, but I missed the earlier discussion in section 3. It is pretty tenuous to associate education with a “measure of the difficulty of recruiting and retaining trained technical staff.” And your story suggests we might see a quadratic effect in GDP. Yes … panel data seems obvious. And measures of economic activity with finer resolution. For example, you use growth in population (p) at the national level. Couldn’t that be measured closer to the size of grid cells, to pick up things like urbanization in the southern U.S.? And couldn’t you use % of GDP in agriculture to pick up any spur this might give to improved measurement?

If I wanted to undermine your explanation that the surface temperature record is contaminated, I might have looked at other counterfactuals consistent with your story that don’t make the two lower distributions in Figure 3 look so nice. Or I might have bootstrapped a confidence interval around the distribution for the experiment you ran – maybe it could be anything. This wouldn’t eliminate the (partial) correlations, but it would make the story less obviously about economic activity. The last thing I would have done is substitute the average of five model simulations and report coefficients of the opposite sign (without t-ratios and F stats!)

But you showed something – evidence that surface temperature has not been cleansed of local human activity. Rather than spurring climate experts to improve the ad hoc adjustments to temperature data it seems to have them working to explain away Hausman tests by graphing observables and expounding on effective degrees of freedom. Of course, Figure 3 shows that adjustments based on multiple regressions will probably flatten the global trend.

Re: Chris Ferrall (#110), Ideally we’d have the relative price of skilled labour by country, but educational attainment was as close as we came. A price series might be available though. Population at the grid cell level is out there I think, and certainly there’s a lot of data by US state and Cdn province. So there’s no question this kind of study could be re-done at finer resolution, at least for some continents. There are (presumably) other counterfactuals that could be done, but you’d have to make a case for each one. The filtering step is the least determinate part of the analysis. In our 2004 paper we talk more about the rationale for each of the variables, showing some pretty striking evidence of the breakdown of the Russian weather monitoring system after 1990.

Rather than spurring climate experts to improve the ad hoc adjustments to temperature data it seems to have them working to explain away Hausman tests by graphing observables and expounding on effective degrees of freedom. Of course, Figure 3 shows that adjustments based on multiple regressions will probably flatten the global trend.

Not just that. If the data are contaminated it confounds all the signal detection studies. They all rely on the assumption that the data contain nothing but clean climatic forcing signals. If this is untrue then they can’t conclude anything from the “experimental” design they have been working with. No matter how much evidence piles up that the surface temperature data are contaminated (and this applies to the oceanic data as well, with the bucket-vs-intake problem) the modeling community will be the last to accept it because they are too heavily invested in the assumption that their test tubes were sterile.
And this makes it a huge problem, in my mind, that the people who publish the data and pronounce on its fitness for use in measuring climate, namely CRU and GISS, are also running climate models and doing signal detection work. These functions should be strictly separated.

In this case it is Gavin who is making the criticism of MM, not the other way around. It is my understanding that the results of MM were tested by holding out a number of the data points over a number of different runs, and that the results remained significant. This would seem to invalidate the objection that a couple of data points caused the results. If Dr. Schmidt was saying that, subjectively, according to Dr. Benestad there is in fact not much economic activity at those locations, then I guess he was making a point about the quality of the econometric data. In my opinion Dr. Benestad is probably not in a position to say whether the econometric data was in fact correct for those locations. Anyway, this would certainly fall in the category of nit-picking.

Nicolas:
You may be right – but couldn’t it also be that a relatively small change in development in relatively undeveloped (and very cold) locations could produce exactly the dramatic shift Benestad noted? For example, the increased use of hot springs to centrally heat a previously locally heated environment, brought about by an expansion of tourism or other forms of development, could lead to some pretty dramatic effects. Over time these may be proportionally less pronounced, given that only so many people want to live in Svalbard, but for a particular time period they could be very pronounced. Some time ago Steve and others explored these effects in Siberia.
So if Gavin and Dr Benestad were interpreting Ross as saying that the contamination is due to massive economic development, then they have clearly missed the point. This would point to a mis-interpretation of the model. At some level, IMHO, MM07 systematically looks at a more generalized UHI effect. But perhaps I have this all wrong?

Actually, Gavin nearly made a point. He missed it a touch but came close. Archiving the data used to replicate is important. The data can change. Providing the mathematical steps, as in a math proof, is important to replication regardless of code. Code isn’t required if the process is adequately explained and the data source is readily available. Steve should be more eloquent in narrowing the issues about documentation.

Bernie, Dr. McKitrick makes the point for me. But I would also observe that the hypothesis behind the paper is that the thermometer based temperature record is influenced by surface changes that are correlated with measurements of economic activity. While classic UHI is a likely candidate, the possibility is left open for other types of land use changes.

Dr. McKitrick I have another question. Dr. Schmidt compares the results of various climate models to show that there is an expected correlation just based on the way climate works. Do you have any comment to that portion of his analysis? Is the GISS data you are referring to the land based temperature measurements or the climate model output?

He also points out that the coeff’s are significant when GISS data are swapped in, and claims this shows the effects are spurious. I am sure I am not the only person who has actually read the paper and noticed that to the extent they are significant the coefficients on GISS data take the opposite signs to those on the observational data. Far from showing the effects are spurious, it shows that the observations negate a significant pattern in the modeled data and make it significant in the other direction. That’s called an observable effect, not a spurious result.

The expected correlation goes in the direction opposite to the observations.

Regarding spatial autocorrelation in econometric & climate data. Generate a spatial correlogram of the variables of interest using Moran’s I statistic. If the maximum lag distance for which autocorrelations are non-zero is orders of magnitude smaller than the spatial extent of the study area, then it is a microscale effect of negligible importance in the grand scheme of things (and Gavin is a twit). Dismissing a global-scale analysis simply because SA is non-zero at localized spatial scales (tens or hundreds of km) is effectively throwing the baby out with the bathwater.
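A bare-bones sketch of that procedure, using simple binary distance-band weights (the coordinates and field below are synthetic, purely to illustrate the diagnostic):

```python
import numpy as np

def morans_I(x, W):
    """Moran's I for values x under a spatial weight matrix W
    (w_ij > 0 when i and j are neighbours at the lag of interest)."""
    z = np.asarray(x, float) - np.mean(x)
    n = len(z)
    return (n / W.sum()) * float(z @ W @ z) / float(z @ z)

def correlogram(coords, x, band_edges):
    """Moran's I for successive distance bands: the spatial correlogram."""
    coords = np.asarray(coords, float)
    D = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    I = []
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        W = ((D > lo) & (D <= hi)).astype(float)  # binary distance-band weights
        I.append(morans_I(x, W))
    return I

# Synthetic field on a 100 x 100 domain: short-range structure plus noise.
rng = np.random.default_rng(1)
coords = rng.uniform(0, 100, size=(300, 2))
x = np.sin(coords[:, 0] / 5.0) + 0.5 * rng.standard_normal(300)

print(correlogram(coords, x, [0, 5, 20, 80]))
# I is strong in the nearest band and falls away at separations
# comparable to the domain size: a microscale effect.
```

If I is only appreciable in the nearest band, the SA is localized relative to the extent of the study area, which is exactly the "baby vs. bathwater" point.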

Dealing with departures from the assumptions underlying OLS being BLUE is a major part of econometrics work. My first two lectures on econometrics were devoted to showing that OLS is BLUE. The rest of my undergraduate and graduate econometric life has dealt with departures from the underlying assumptions required for that proof. Ross’s paper would seem to reflect that general background. Thus, a comment that there might be some form of correlation pattern in the residuals that invalidates OLS is the way econometricians clear their throats. For Gavin to base a major criticism around it – when it would appear to have been dealt with – seems odd and clearly comes from a different statistical culture.

(And surprisingly enough, despite all the autocorrelation and heteroskedasticity in the world, OLS generally turns out to be very robust, and certainly good enough for government work.)

To put it more clearly perhaps. An effect that arises due to a shared pattern of spatial autocorrelation may indeed be spurious. But if you observe the same effect in hundreds of samples, it surely is not spurious.

As there are a huge number of population growth centres distributed haphazardly around the world, the spatial relationship between surface temperature and economic activity is anything but spurious. And so the effect in the temporal domain is also non-spurious.

Re: Nicolas Nierenberg (#92), Nicolas, yes. GISS data here refers to model-generated data. On the other question, the parameters remain remarkably stable across the various configurations (UAH vs RSS, CRU3v vs CRU2v, etc.). However, the std errors do increase using the RSS data compared to the UAH data. I don’t know why this is the case: maybe it is noisier data. The joint significance tests still strongly reject independence between the socioeconomic vars and the temperature trends (P=4.5e-6). The many specification tests still hold, and the striking difference between growing and declining economies still emerges, which I am sure GCMs don’t predict.

As there are a huge number of population growth centres distributed haphazardly around the world, the spatial relationship between surface temperature and economic activity is anything but spurious. And so the effect in the temporal domain is also non-spurious.

Thank you for that, Bender. The mind of a history and English major just had a learning moment.

Some of the defenders of non-disclosure plead the expense, time and inconvenience of exposing their software and data to the public. I work in a software group comprised of a dozen engineers. To expose our software to the world would require:

cd “FTP directory”
cvs export -DNOW “module name”

When the dust settles, the “scientist’s” work is done. I think I could train a bright 10 year-old in less than five minutes.

The “auditor’s” end is far more complicated:

Use FTP to get the module from the “scientists”
./configure
make
make install
make check
make exec

Perhaps some configuration problems need to be overcome. Perhaps some of the checks fail for configuration-dependent reasons. When the dust settles, the “auditor” has duplicated the “scientist’s” data reduction.

CVS marks every file (in a well-run shop) with a line that looks like:

$Id document.doc, v 1.22 2004/02/14 23:38:04 authorName $

Automatically version-stamped and time-stamped.

The point of all this is that exposing the data and data manipulation files is a matter of a few seconds of desk work. No hill for a tree-corer.

You’re a good reader. However, there are a couple of things that I believe in: “full, true and plain disclosure” if scientists are dealing with the public (as they are); and “due diligence”. I don’t believe that you need “perfect” knowledge to make practical decisions; people make decisions without perfect knowledge all the time. I could write paragraphs on this issue, but will leave it here for tonight.

I am skeptical of the results in Steig et al. (which should make pete m. happy), but the AWS data is not that significant to the result. Instead it appears to mainly rely on manned weather station data as well as AVHRR data. It also depends on the RegEM algorithm, which is a black box for me. I will say that if the AVHRR data is accurate, then the concept of using it as a way to go back in time to interpolate between the manned weather stations is a good one. It is the fact that the ground data is sparse and probably not that reliable over the years that makes me skeptical.

Actually given the issues with weather stations in general that have been discussed it would be interesting to see if this could be done on a global basis using reliable weather stations with very long records and the RSS or UAH data to create an interpolation between them. This could be another way to look at the issues raised in MM07.

I’m sorry that I was too brief in my statement. I meant I don’t know what you believe relative to AGW in general. I can tell that you believe in disclosure and accuracy, and that you like seeing how things work.


Re: bender (#129), Hi bender: scrooooooooooll down the page to the heading “Spatial Autocorrelation” where you will find a link to a paper that mentions spatial autocorrelation 16 times, not counting the formulae and acronyms. The other paper you refer to discussed it in passing to show that it was not an issue, but the focus of that paper was the fabricated statistics in the IPCC report, namely the claim that controlling for atmospheric circulation patterns makes the effects become statistically insignificant.

Re: Ross McKitrick (#143),
That too-brief paper does not illustrate (or even describe) the nature of the SA. The argument that there is no SA in the residuals, implying the effect is not spurious, is satisfactory. However it’s the actual estimates of the SA that I want to see.

In his IJOC paper Schmidt, like Steig, is guilty of confirmation bias. He sees the spuriously significant relationship between economic predictors and model output, misses the fact that the coefficients were negative, and goes on to tell his story, completely independently of the results he just produced. THAT is classic confirmation bias. Never mind the data, irrational belief has taken control of the rational mind.

That the correlations between econometric data and climate model output are (spuriously) negative only INCREASES the probability that McKitrick’s positive correlations are not spurious. McKitrick & Michaels conclude that 1/2 of the surface trend is attributable to localized (i.e. not “well-mixed”) socioeconomic factors. Schmidt’s IJOC result suggests it may be just a little higher than that.

I feel sorry for IJOC having accepted this paper. But in terms of quality of reviews, editors get what they pay for.

(1) On the Schmidt side, the reasoning is: as a significant correlation (the sign does not matter at all) is found where no significant correlation should be found by experimental design, it indicates that there is an issue with the methodology used by McKitrick, because it “creates” artificial significance. The relation described by McKitrick is therefore spurious, and there is no relation between socio-economic indicators and surface temperature trends.

(2) On the McKitrick side, the reasoning is: THERE IS a relation between socio-economic indicators and surface temperature trends, and as the relation exists, the correlation is positive given some “physical” arguments. Therefore we cannot accept the results of an experiment that shows there is no relation, because it also gives a negative correlation while we know that the correlation should be positive because there is a relation, etc.

Sorry, but in the first case, the logic is perfectly fine, but in the second case the reasoning is circular: basically it is: If I’m right, Gavin is wrong, and as Gavin is wrong, therefore I’m right.

Re: Ignatus (#105), No, you’ve missed the point (and I think there are some missing words in your posting because what you wrote doesn’t make sense as written). The assumed hypothesis in climate circles is that the weather data (known to be contaminated by local environmental changes and measurement inhomogeneities) has been “corrected” when being re-processed into gridded climate data so these no longer influence the trends. This means the trend pattern should not be correlated with socioeconomic factors. But MM04 and MM07 showed that the trend pattern is strongly correlated with socioeconomic factors.

Gavin tried to refute this by saying: No, uncontaminated data from GISS-E still yields a fluke correlation with socioeconomic patterns. So the fact that MM find a correlation proves nothing. The big flaw in his argument is that to the extent the non-contamination hypothesis, as played out in the GISS model, predicts significant coefficients, they take the opposite sign to those observed on observational data. My original argument was, in effect: non-contamination implies B=0, but B>0, ergo contamination. Gavin has responded that non-contamination actually implies B<0, which makes the observed B>0 imply contamination all the more strongly. Think about it.

Re: Ross McKitrick (#113),
Yes, that context is necessary. That’s why I was trying to link to the inadequacies of Parker’s “correction”. His estimate must be biased low. And that’s why the M&M estimate will seem high to some. It seems high if you assume Parker was not biased low.

I was going to say: No, Gavin does not say “No, uncontaminated data from GISS-E still yields a fluke correlation with socioeconomic patterns. So the fact that MM find a correlation proves nothing.” But now I have really read the paper, and yes, he says that. (OK, I should have really read the paper before…) I understand your point.

I thought Schmidt’s argument was simply: “the significance test used by McKitrick is not robust because it leads to a lot of Type I errors.” That is not his actual argument, I know. But I think this point is valid.

Re: Ignatus (#121), Sorry my friend, but I didn’t leave you this wiggle room. If the main results in MM07 could be dismissed as a set of Type I errors we would not survive the bootstrap resampling test (para. 31), much less the out-of-sample prediction test (Sct 4.5). Fluke Type I errors or spurious results give you zero out-of-sample performance. Here’s a simple example. Pick any 2 daily closing stock price series (you can get the data from http://finance.yahoo.com/). Take the first 75% of the data sample and regress one stock on a 1-day lag of the other. If this regression worked it would be the holy grail of financial analysis: a tool for predicting tomorrow’s stock prices. You’ll be encouraged when you get a high t statistic and a “significant” fit. But for reasons that are well known to econometricians this is a spurious result. You can prove it by using the “significant” fit from stock 1 to “predict” the next-day prices from the withheld portion of stock 2. Then check the predictions against the observations and the r2 will be zero. A scatter plot of a good prediction will fit a 45 degree line. A scatter plot of a spurious prediction will fit a horizontal line. The stock data will be horizontal.

In MM07 I did 500 repetitions of the out-of-sample test and repeatedly nailed the withheld data to a 45 degree line. Look at my Figure 2. And for comparison, suppose you took the GISS-E predictions from Gavin’s data set and did a scatter with the observed gridded trends on the vertical axis and the GISS-E generated trends on the horizontal axis. If you think my Figure 2 still leaves open the possibility that the results are spurious, wait until you see what the multimillion-dollar, Rossby Prize-winning GISS-E looks like.
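Anyone can verify the 45-degree-line point without downloading stock data by simulating independent random walks. The sketch below is my own illustration, not the MM07 code: it fits on the first 75% of the sample, regresses the withheld observations on the out-of-sample predictions, and repeats many times.

```python
import numpy as np

rng = np.random.default_rng(0)

def oos_slope(x, y, frac=0.75):
    """Fit y[t] ~ x[t-1] on the first `frac` of the sample, then regress the
    withheld observations on the out-of-sample predictions. A genuine
    relationship puts the scatter on the 45-degree line (slope near 1);
    a spurious fit leaves it horizontal (slope centred on 0)."""
    n = len(x)
    split = int(frac * n)
    X = np.column_stack([np.ones(split - 1), x[:split - 1]])
    beta, *_ = np.linalg.lstsq(X, y[1:split], rcond=None)
    pred = beta[0] + beta[1] * x[split - 1:-1]
    obs = y[split:]
    pc = pred - pred.mean()
    return float(pc @ (obs - obs.mean())) / float(pc @ pc)

spurious, genuine = [], []
for _ in range(300):
    x = np.cumsum(rng.standard_normal(400))  # independent random walks,
    y = np.cumsum(rng.standard_normal(400))  # standing in for two stock prices
    spurious.append(oos_slope(x, y))

    y2 = np.zeros(400)                          # a real lagged relationship:
    y2[1:] = x[:-1] + rng.standard_normal(399)  # y2 tracks lagged x plus noise
    genuine.append(oos_slope(x, y2))

print(np.median(spurious))  # near 0: the predictions are worthless
print(np.median(genuine))   # near 1: the scatter tracks the 45-degree line
```

The in-sample fits on the random-walk pairs can look "significant," but the out-of-sample scatter goes horizontal, which is the signature of a spurious result.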

Re: Ross McKitrick (#129),
That’s why I want to see the reviews of MM07 vs. S09. I am willing to bet MM07 got a much harder ride. Outsiders are always given the gears. Ignatus should understand by now that the probability of major errors is inversely proportional to the rigor of review. AGW confirmation bias is rampant in climate science.

For those of you who are passing me links because you think I clearly haven’t read enough about this issue: my history with this subject probably goes back further than most of yours, although not as an active participant.

My father was Dr. William Nierenberg. He was a co-author of MacDonald et al (also known as the JASON Report). He was the chairman of the committee that produced “Changing Climate,” a NAS publication that was the first to take a comprehensive look at all the issues surrounding climate change.

He and I discussed this topic many times over the years. Despite the fact that he well understood the physics behind the theory he felt that the climate system was unpredictable and that there was a great deal of uncertainty in what the sensitivity would turn out to be. As a result at the time of his death in 2000 he still felt Kyoto type policy action was not in the best interests of the US.

I was brought back to this topic by a paper written by Dr. Naomi Oreskes, which was very critical of my father’s early role in climate change. My web site chronicles the issues with that paper. It also has a link to my blog on the subject.

I am well aware of the various arguments that have been brought up to contradict the mainstream theory. While many of them poke holes in aspects of work that have been done, nothing has disproved it. At the same time I don’t believe that the theory has been proved in any scientific sense. Even the IPCC assigns probabilities to the outcome, and the models themselves cover a great range of sensitivity. My father felt that the actual measured temperature changes given the increase in CO2 would lead one to believe that the likely sensitivity is on the lower end of the range, but a lot of smart people don’t agree with that.

Anyway this was just so you guys know where I am coming from, not to open the whole AGW debate. Sorry if it has that effect Mr. McIntyre.

Re: Nicolas Nierenberg (#101), Nicolas:
I look forward to the thread that addresses Oreskes et al’s jaundiced and inaccurate rendition of your father’s and others’ work. Your rebuttal is very powerful, well structured and clearly written. Thank you for bringing it to my attention.

Re: Nicolas Nierenberg (#101),
You present a fascinating story and I thank you for bringing it to my attention. I hope that you get the full and complete retraction (and apology) that you and your father deserve. I will be following it at your blog but please keep people here posted as well.

snip
Steve: I repeatedly ask people not to try to resolve large issues in a couple of paragraphs. I know that people are interested in large issues, but editorially these sorts of one-paragraph bites tend to look the same.

Ignatus, I will not try to explain Schmidt’s logic because it is flawed. The flaw is this. He observes a weak but statistically significant negative correlation and knows it’s spurious because he knows the model contains no such effect. OK. So we take this as the null model. McKitrick shows a very strong and significant positive correlation, as compared to the weakly negative null. This STRENGTHENS McKitrick’s result. It doesn’t weaken it. Moreover, he has shown that once this putatively “spurious” correlation is accounted for, there is no spatial autocorrelation in the residuals. If McKitrick’s correlation were spurious (as alleged by Schmidt) there would be some spatial autocorrelation left in the residuals. That there isn’t suggests he’s got a reasonably causative (i.e. proxy) relationship, with his socioeconomic indicators doing a better job of detecting anthropogenic effects than other approaches, such as night lights, which is bad at proxying lightless heat sources.

Try re-reading both papers with this in mind and get back to me. But don’t expect any follow-up tomorrow.

Re: Bernie (#110),
McKitrick’s paper would not be so important except that there is an apparent vacuum of interest in quantitative human (urban/empire) climatology. Hard to believe Oke is still the master.

No, you don't "take this as a null model".
In statistics, it is common to use artificially generated data, for which you know the answer, to test the strength of a statistical test. If your test shows significance when you know, given your data, that it should not, there is a problem: the test is not robust, and you cannot trust its results. It is simple.

Ignatus. When you compare analysis of real data (M&M07) to analysis of computer-generated data (S09), the computer-generated data can be considered a better “null model” (i.e. expectation) than the zero-correlation null model. How did I know you would get confused at this?
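To make the empirical-null idea concrete, here is a minimal sketch (all numbers hypothetical; this is neither paper's actual calculation, and the grid-cell count is only a rough nod to MM07's sample size). Model-generated trend fields contain no socioeconomic contamination by construction, so the correlations they produce against a socioeconomic covariate define an empirical null distribution against which an observed correlation can be compared:

```python
import numpy as np

rng = np.random.default_rng(0)

def corr(a, b):
    """Pearson correlation between two gridded fields, flattened to vectors."""
    a = a - a.mean(); b = b - b.mean()
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))

n_cells = 440                            # roughly the number of grid cells in MM07
socio = rng.gamma(2.0, 1.0, n_cells)     # hypothetical socioeconomic covariate

# "Clean" trend fields standing in for model runs: no socioeconomic
# contamination by construction, so their correlations define the null.
null_corrs = np.array([
    corr(rng.normal(0.1, 0.05, n_cells), socio) for _ in range(1000)
])

observed = 0.35   # hypothetical correlation measured in the observational data
p = np.mean(np.abs(null_corrs) >= abs(observed))   # empirical two-sided p-value
print(f"null mean = {null_corrs.mean():+.4f}, empirical p = {p:.3f}")
```

With 1000 synthetic runs rather than 5, the spread of the null is pinned down well enough that an observed correlation far outside it (or of the opposite sign to the null's mean) is informative rather than ambiguous.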

We know he lurks here. Assuming he’s looking now for ways to extricate himself from the mess he’s making …

Hey, Gavin, when you reduce the number of degrees of freedom in Ross's calculation, how much do you think his significance is going to drop? Effects with that degree of strength and significance don't get overturned by dof adjustments for SA. Another good reason for y'all to start collaborating with real statisticians. Like Wegman suggested.

I haven’t read these papers too carefully, and I should know by now that it is a bit foolish to comment before doing so. So at the moment I have more questions than answers. The impression I get is the following:

1. Ross developed a model relating instrumental temperature to economic activity, and found a highly significant relationship

2. Gavin redid the analysis with GCM output, found an opposite effect, and claimed it was insignificant (invoking climate-brand statistical techniques)

Naturally, one could come up with a plethora of possible hypotheses as to why this would be the case, but it would make sense to focus on the most likely problems:

A. Ross’ model has a problem
B. Gavin’s stats have a problem
C. GCM output is nothing like instrumental data

(Note all, or none of these could be true).

The thing is, haven’t we had a recent study about point C.? Carried out by Koutsoyiannis et al., reported on CA here and RC here. Weren’t the findings that GCM output was nothing like the instrumental data on a local scale, and significantly underestimated the variability in it? Didn’t Gavin’s own comment on this paper include the statement:

it’s a shame Koutsoyiannis et al addressed a question whose answer was obvious and well known ahead of time instead.

Perhaps I’m missing something here. Did Gavin just write another paper that demonstrates how disconnected GCMs are from reality on a local scale, an issue he previously claimed to be “obvious” – and then completely ignore that very possibility as an explanation of his results?

#122. I too have been surprised at the number of problems/mistakes in “big” papers. What’s even more surprising is the stubbornness. Mann’s PC1 got used more in third party papers after the Wegman Report than before. Bristlecone addiction is still rampant even after the NAS Report.

I have been interested in both your paper and Dr. Schmidt's paper, which is partly in response. Dr. Schmidt accepted that the statement in AR4

However, the locations of greatest socioeconomic development are also those that have been most warmed by atmospheric circulation changes.

was not illustrated by the climate model data that he used.

the AR4 statement probably refers to the trend in the NAO over the period (peaking around 1995). It's a reasonable hypothesis but not what is happening in the model runs I looked at.

I had thought that he used the model data because this would show that it predicts a spatial warming trend which is coincidentally the same as areas of high economic growth, but this apparently wasn’t the case. He did suggest that other climate models could be tried, although given the fact that they aren’t considered accurate at these geographic scales I’m not sure what the point would be.

His contention is that the fundamental issue is spatial autocorrelation, which would be present both in the measured data and in the model data. When I pointed out that you had held out data points as a test, he proposed that they should be held out spatially. For example, hold out Western Europe and see whether the rest of the world would accurately predict that data set.

Based on a question that I asked, he seems to feel that random data that was spatially correlated would have made the same case, although I shouldn't put words in his mouth. Thus he takes the fact that there are correlations between the model data and the economic data as showing the effects of spatial correlation, and holds that the sign is not relevant as a result.

Re: Nicolas Nierenberg (#130), Nicolas,
First, ‘Ross’ is fine.
In the 2004 predecessor to MM07 we withheld North and South America and successfully predicted them. Then we got criticized for that, on the grounds that we should withhold randomly-selected data. It’s hard to please some folks.

I actually did something like the test on Western Europe while at a conference in 2006. Chris Folland from the CRU was responding in a rather animated fashion to my results, and insisted that they were all due to the fact that Europe follows its own weather pattern due to some circulation regime or other. So I said to him, suppose I drop Europe out of the sample and re-do the analysis and get the same results: will you believe them then? He said no, probably not, and that was that. When I sat down I started up my laptop and re-did the analysis leaving out Europe. The results were the same. I didn't bother going over to show him though.

I didn’t test spatial AC using the withheld-data test, I tested it using a spatial AC test. Gavin may claim that this is the fundamental issue but the reality is he doesn’t understand it. Because of that he doesn’t understand the argument against his position. He is wrong on 3 counts. First he is talking about the dependent variable rather than the residuals. The variance-covariance matrix (see equation 5 in my paper) contains the omega matrix, which is formed using the regression residuals. The dependent variable does not appear in that equation. If the dependent variable is spatially autocorrelated and the residuals are not, as in my case, that means the model accounts for it and the variances are not biased by spatial autocorrelation. Second, even if I control for spatial AC anyway the results hold up, so the issue is truly moot. Third, as I stated in #113 above, the sign is not irrelevant. In this context it is a crucially important datum. Even in time series modeling where the problem of spurious regression between integrated processes arises (and this is the true meaning of the term, as opposed to Gavin’s irrelevant usage), the sign of a coefficient still matters. The conditions that give rise to spurious inferences bias the standard errors, not the signs, of coefficients.
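As a toy illustration of the residuals-versus-dependent-variable point (entirely synthetic data; a 1-D chain of cells stands in for a spatial grid, and this is not the MM07 omega-matrix calculation): the dependent variable can be strongly spatially autocorrelated while the regression residuals are not, in which case the model has absorbed the spatial structure and the variances are not biased.

```python
import numpy as np

def morans_i(x, W):
    """Moran's I spatial autocorrelation statistic for values x under weights W."""
    z = x - x.mean()
    return len(x) / W.sum() * (z @ W @ z) / (z @ z)

rng = np.random.default_rng(1)
n = 200

# Neighbour weights on a 1-D chain of cells (a toy stand-in for a spatial grid)
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0

# A smoothly varying (spatially autocorrelated) regressor, e.g. an economic field
x = np.cumsum(rng.normal(size=n))
x = (x - x.mean()) / x.std()
y = 2.0 * x + rng.normal(0, 0.5, n)   # dependent variable inherits x's spatial structure

# OLS fit and residuals
slope, intercept = np.polyfit(x, y, 1)
resid = y - (intercept + slope * x)

iy = morans_i(y, W)       # large and positive: y itself is spatially autocorrelated
ir = morans_i(resid, W)   # near zero: the regression absorbed the spatial structure
print("Moran's I of y:        ", round(iy, 3))
print("Moran's I of residuals:", round(ir, 3))
```

The point of the sketch: testing the dependent variable for spatial AC, as Gavin's objection implicitly does, answers the wrong question; it is the residuals that matter for inference.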

In the 2004 predecessor to MM07 we withheld North and South America and successfully predicted them. Then we got criticized for that, on the grounds that we should withhold randomly-selected data. It’s hard to please some folks.

Ross, can you confirm that you mean what you write, or did you mean that "we should NOT withhold randomly-selected data"? i.e., a word missing.

Dr. Schmidt accepted that the statement in AR4: “However, the locations of greatest socioeconomic development are also those that have been most warmed by atmospheric circulation changes.” was not illustrated by the climate model data that he used.

is very interesting given the record of IPCC review comments. This argument was used by IPCC to dismiss McK and Michaels 2004; the dismissal was vigorously contested by Ross in the review process. AR4 finally stated:

McKitrick and Michaels (2004) and De Laat and Maurellis (2006) attempted to demonstrate that geographical patterns of warming trends over land are strongly correlated with geographical patterns of industrial and socioeconomic development, implying that urbanisation and related land surface changes have caused much of the observed warming. However, the locations of greatest socioeconomic development are also those that have been most warmed by atmospheric circulation changes (Sections 3.2.2.7 and 3.6.4), which exhibit large-scale coherence. Hence, the correlation of warming with industrial and socioeconomic development ceases to be statistically significant. In addition, observed warming has been, and transient greenhouse-induced warming is expected to be, greater over land than over the oceans (Chapter 10), owing to the smaller thermal capacity of the land.

Here is one of many SOD Review Comments on this matter (here by Ross):

3-286 A 10:15 10:19 McKitrick and Michaels (2004) treats the IPCC claim of the absence of a global nonclimatic bias in the surface record as a hypothesis to be tested, exactly in accord with the methodological goals prescribed in Chapter 1 of the AR4. You do not have the option of simply ignoring results you don’t like. The hypothesis was tested and rejected, and the study has not been refuted. Nor can you ignore the deLaat and Maurellis findings, which again contradict the claim in this paragraph. This section raises the question of global data quality. The question has been treated in the literature and some key evidence has been published that does not go in favour of earlier positions taken by the IPCC. It is a disservice to readers to suppress this information. [Ross McKitrick (Reviewer’s comment ID #: 174-18)]

IPCC:

Rejected. See response to 3-284.

Here are other review responses on this issue:

These papers have already been taken into account. McKitrick & Michaels has itself been discredited. See e.g. Benestad (2004), Climate Research 27:171-173. The final comment merely suggests that the homogeneity adjustment works. Despite this we have added some text.
…
Rejected. McKitrick and Michaels (2004) is full of errors. There are many more papers in support of the statement than against it.
…
3.253. Rejected. Parker (2006) provides a detailed demonstration of the lack of urban influence. The locations of socioeconomic development happen to have coincided with maximum warming, not for the reason given by McKitrick and Michaels (2004) but because of the strengthening of the Arctic Oscillation and the greater sensitivity of land than ocean to greenhouse forcing owing to the smaller thermal capacity of land.
…
Rejected. 1). We now have BETTER homogenization techniques than in 1990 so the urban influence, if any, will have been mitigated. 2). The regions covered are a very substantial sample of the industrialized parts of the globe. 3). The difference between the Jones and Rural USSR series was statistically insignificant. The same is true for eastern Australia. The Jones series has succeeded in avoiding the urban warming evident in parts of eastern China. The only significant rural minus grid trend difference over USA (0.15C for 1901-84), when scaled by the fraction of grid-points (30/82), yields an urban trend of the order of 0.05C.
…
See response to 3-253. The results of homogenization techniques than in deLaat and Maurellis (2004) are biased in the same way as those of McKitrick and Michaels (2004), because the strengthening of the Arctic Oscillation, and the greater sensitivity of land than ocean to greenhouse forcing owing to the smaller thermal capacity of land, have yielded maximum warming in the locations of greatest socioeconomic development. Some text discussing this has been added to section 3.2.2.2.
…

Rejected because the locations of socioeconomic development happen to have coincided with maximum warming, not for the reason given by McKitrick and Michaels (2004) but because of the strengthening of the Arctic Oscillation and the greater sensitivity of land than ocean to greenhouse forcing owing to the smaller thermal capacity of land. That is why Benestad (2004) was correctly unable to replicate McKitrick and Michaels (2004) using his independent sample. See also 3-284 response. Some text was added to 3.2.2.2.
…
3-288. Rejected. See responses to 3-283 through 3-287. The studies the reviewer wants included are now discussd in text as to why they are flawed.

Some climatologists had better start getting serious about ALW (anthropogenic local warming). It looks like it might account for half of alleged AGW. If the climatologists won’t, then the econometricians will. Simple as that.

Rather than “correct” the data (why would you tinker with data?), shouldn’t the models be asked to include the ALW effects of urban/economic development? The only way the effect is going to be measured accurately is if some group is funded to study it seriously. Models are for tinkering, not data.

I agree with bender, this is turning into a powerful thread. We should thank Gavin for re-igniting this particular debate. If, as now seems likely, Gavin has made a total hash of his "replication" of at least MM07, then perhaps a wider audience will become familiar with what amounts to the suppression of MM04 (and MM07) in AR4. Is an IJOC rebuttal in the works? Surely they have formally contacted MM and LM?

Re: Ross McKitrick (#138), Ross: The lack of contact from IJOC strikes me as an extraordinary omission. Is it or is it not standard practice to contact an author for comment when a paper is built around a critique/replication of another recent paper? Do you know if they contacted de Laat or Maurellis?

Re: bernie (#142), It’s standard if you submit a comment on a paper to the journal where it was published. If you submit to a different journal and write it up as an extension of the literature then there is no guarantee that it will be refereed by the person you’re critiquing. In fact the editor may prefer not to let them referee because it can lead to gate-keeping. Our 2005 GRL paper was not refereed by Mann for instance. I expect that when I send in a comment on Schmidt’s paper he will be asked to reply, then both items will go to 3rd parties for review.

Re: Ross McKitrick (#147), Ross:
Thanks for the response to what must seem like a naive question. I guess the editorial policy makes sense. Do you know if de Laat or Maurellis will also be commenting and, if so, how they feel about Gavin's "replication"?

Re: Ross McKitrick (#147), Ross:
One other comment and then I will have to get some billable work done: I am not sure whether you are tracking the RC thread with Gavin’s paper. Nicolas has asked Gavin a few pointed questions that seem to be pulling additional comments from Gavin.
It is also interesting to see that few, if any, others are actually asking detailed statistical questions.

Re: bernie (#150), I check in once a day. Most of the discussion concerns archiving practices for software, which doesn’t concern me. The few questions on MM07 from Nicolas just elicit the same replies over and over from Gavin. He’s not going to move off his position.

For instance in his reply about the out-of-sample test he appealed yet again to the spatial correlation red herring, and added some oddness about how selecting the data at random makes the test easier to pass. But if adjacent cells are correlated, picking cells at random increases the difficulty of passing the test. More to the point, as anyone who understands the biasing effects of autocorrelation knows, if AC is your problem it will show up in the form of severely depleted out-of-sample predictive power. Try the experiment I outline in #129–the stock prices are limiting cases of autocorrelation (unit roots) and they yield zero out-of-sample predictive power.
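Here is a small simulation of the stock-price limiting case described above (hypothetical series, not the actual #129 data): two independent random walks routinely produce "significant" in-sample correlations, yet have essentially no out-of-sample predictive power. The 0.16 threshold is the approximate 5% critical value of |r| for n = 150 under an iid null.

```python
import numpy as np

rng = np.random.default_rng(2)

def one_trial(T=300):
    """Fit y on x in-sample for two INDEPENDENT random walks; score out-of-sample."""
    x = np.cumsum(rng.normal(size=T))   # unit-root process, e.g. a stock price
    y = np.cumsum(rng.normal(size=T))   # another, entirely unrelated to x
    half = T // 2
    b1, b0 = np.polyfit(x[:half], y[:half], 1)
    r_in = np.corrcoef(x[:half], y[:half])[0, 1]
    pred = b0 + b1 * x[half:]
    r2_out = 1 - np.sum((y[half:] - pred) ** 2) / np.sum((y[half:] - y[half:].mean()) ** 2)
    return abs(r_in), r2_out

results = np.array([one_trial() for _ in range(500)])
r_in, r2_out = results[:, 0], results[:, 1]

# Independent random walks blow through the nominal significance threshold
# most of the time, yet they predict nothing out of sample.
print("share with spuriously 'significant' in-sample |r|:", np.mean(r_in > 0.16))
print("median out-of-sample R^2:", round(float(np.median(r2_out)), 2))
```

This is exactly the diagnostic logic in the comment above: if autocorrelation were driving a result, it would announce itself as collapsed out-of-sample predictive power.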

When Gavin says “no one apart from the authors thinks the methodology is valid” he is referring to the tight circle of like-minded people he prefers to deal with. The econometricians I have shown this to get it right away. Not long ago I contacted a senior econometrician in the US who had expressed an interest in how economists were involved in the climate debate. I sent him my MM07 paper and he replied:

Thank you. That’s a good methodology you use, and just the sort of thing I thought economists could contribute.

Now maybe there are others who think the methodology and results are invalid. But look at the track record of my critics so far. They offer up a speculative claim about the flaws (Folland: it's all due to the European special case, IPCC: it's all atmospheric circulation, Rasmus: it's spatial AC, Gavin: it's a fluke), provide no proof of their case, and when I disprove their claims they either ignore the counter-evidence and keep repeating their charge, or simply abandon their argument yet maintain their position while awaiting inspiration for a new speculation. I don't know how these kinds of debates eventually end, but it doesn't look like formal argumentation plays much of a decisive role.

Re: Ross McKitrick (#155) wrote, "I don't know how these kinds of debates eventually end, but it doesn't look like formal argumentation plays much of a decisive role."

It was hard to stop laughing after reading that, Ross. 😀 You really nailed it, with a fine irony of implicit rational wistfulness. Human history records perfect examples of how such debates play out. Check the history of the Copernican revolution. The struggle continues. Or the controversy following “The Origin of Species.” It continues. We all now know how the argument ends. It doesn’t. The debate continues endlessly, and no amount of rational and totally conclusive evidence and calculation will ever stop the counter-argument. The partisanship is constitutional, immortal, irrational, and relentless.

Re: Pat Frank (#161), Pat:
I am not sure about your pessimistic view. There is always an appeal to statistical expertise. Proof point: The HS is broken and Wegman separated the pieces. They can pretend to wrap tape around it, but everybody knows that it is broken. Similarly, I think that Ross’s involvement of a heavy hitting econometrician without an ostensible dog in this fight is a really smart move. Another world class statistician may also tilt the balance in favor of ensuring that the methods employed actually pass muster.

PS, thanks for compiling the AR4 entries on this, Steve. The AR4 claim that our results become statistically insignificant was a pure fabrication. I hope the full weight of this eventually dawns on people: confronted with published, peer-reviewed statistical evidence that their core data set was contaminated, the IPCC first tried to conceal the matter by simply not talking about it, and then, when forced to confront it, fabricated counter-evidence in the form of a claim about statistical insignificance that had no supporting evidence, was provably false, and could not be replicated even by a sympathetic climate modeler.

I hadn't followed this particular dispute. It's interesting that their handling of these comments was just as shabby as their handling of Mc and Mc. What offends me about both cases is their failure to provide a fair summary – in breach of IPCC obligations.

As you may recall, David Holland made a concerted and unsuccessful effort to obtain the Review Editor comments from IPCC and the UK Met Office, with John Mitchell saying that he had destroyed his comments and that the comments were his "personal" property. Upon it being determined that the UK had paid for his travel, they then refused on the basis of international treaties – so much for "open and transparent".

This is amazing.
If we assume climate models have been evaluated by attempting to match the spatial temperature trends, then through a survival-of-the-fittest algorithm we will end up with a model which to some degree correctly "historically predicts" the trends in those areas.

To then use that to "disprove" a statistically significant a priori hypothesis that certain areas exhibiting socio-economic growth would correlate with temperature is quite simply mind-blowing.

The resulting output “paths” of the model are then claimed to be what caused the heating.

There can be no possible counter argument to that. The models will match the trend, therefore however the models matched the trend is why the area has warmed.

However, the locations of greatest socioeconomic development are also those that have been most warmed by atmospheric circulation changes (Sections 3.2.2.7 and 3.6.4), which exhibit large-scale coherence.

Thanks Ross, I now see the distinction. They are saying that you should test by withholding randomly selected data (i.e. from any area of the world) whereas you withheld North and South America, which is not a random selection. Have I got that right now?

[Response: It relates to what the true null hypothesis should look like. Clearly the differences between MSU and surface stations will not be random or spatially uncorrelated. I used 5 model runs – with the same model – in lieu of having an appropriate null. But I am not claiming that they define the proper null hypothesis. I think looking at more models would be useful if you wanted to do that (but you still wouldn’t be certain). The bigger problem here is that no-one apart from the authors thinks this methodology is valid, regardless of the results. I used it because I was interested in what would happen with model data – and the fact that there are very clear “significant” correlations occurring much more frequently than the nominal power of the test would imply, tells me that there is something wrong with the test. I’m happy to have other people pitch in and give their explanation of reasons for that, but continuing to insist that the test is really testing what they claimed seems to be unsupportable. – gavin]

“I conducted a Monte Carlo experiment with n=5 to determine the distribution of correlation statistics under the null. I found that the mean correlation for the parameter ‘g’ was -0.01119 with a range of [-0.01814, -0.00294]. The result of McKitrick and Michaels was 0.0480 which clearly falls within the distribution generated under the null hypothesis so I conclude that the result of McKitrick and Michaels is not significant.”

Can you spot the problem? (Hopefully it is not with my translation, but signs matter.)
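(For readers checking the arithmetic in the hypothetical write-up above, a two-line sanity check; the numbers are the ones quoted there, not from either paper's actual output. The quoted estimate does not fall within the quoted null range at all; it lies outside it, with the opposite sign.)

```python
# Values quoted in the hypothetical summary above
null_low, null_high = -0.01814, -0.00294
null_mean, mm07_g = -0.01119, 0.0480

print("inside null range:", null_low <= mm07_g <= null_high)   # False
print("opposite sign to null mean:", mm07_g * null_mean < 0)   # True
```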

I posted this on the wrong thread and I hadn’t yet read your above post. But even after reading it I would still like to pose the question below which puts it in terms I can understand.

As I am thinking about it Dr. Schmidt (he hasn’t asked me to call him Gavin yet) has a good point. Why would there be any correlation positive or negative between his model data and the temperature anomaly differences? If you treat his model output as essentially random at this scale then you wouldn’t expect correlation except in a small number of cases. This seems too frequent. If the output is non random, I can’t see why it would be affected one way or the other by socioeconomic factors.

I did read the stock example but I’m asking the inverse question. Not why your results are significant, but why he sees these correlations.

It is clear to me that the tests that you did should prove that your result isn’t spurious, which makes his result puzzling to me.

Re: Nicolas Nierenberg (#157), I believe the answer is in my comment 154. There are a myriad of reasons why test statistics might be distributed in ways that are not consistent with the theoretical distributions (the real world is always messy). In these circumstances, Monte Carlo experiments can help to determine the empirical distribution – you don’t need to hypothesise and speculate, you just calculate it to find out what the distribution is. It doesn’t matter much that you can’t think of a reason why it is so, provided you can observe that it is so and measure it quantitatively. Gavin has apparently done a limited form of this experiment and found that the test statistics Ross focused on do not have a mean zero under his null (although 5 runs is not enough to really establish this with any confidence, more like 1000 ought to be done). Unfortunately for Gavin’s argument, the distribution is such that this apparently strengthens Ross’s result because the mean is negative and Ross’s statistics are positive. The fact that some assumption underlying the unbiasedness of OLS estimates is violated in the data at hand is neither here nor there – it happens all the time. What is notable is the way in which it is biased – which, as stated above, seems to strengthen Ross’s findings.

Finally, some thoughts on why the model runs would be correlated with socioeconomic data:
1) The tuning of the models leads to implicit incorporation of these effects despite Gavin’s assumption
2) The climate affects economic activity (think of agriculture for the most obvious example)
3) Statistical problems in the data or models lead to OLS being biased (e.g. omitted variables or errors in variables)

Re: Nicolas Nierenberg (#157), Nicolas, I think the explanation of Gavin's small but occasionally significant correlations is as follows. Suppose the climate model tells us that the spatial pattern of trends (the trend field) in clean data under greenhouse warming is, for climatological reasons, uniform everywhere (at exactly 0.1C) except over Brazil, where it is 0.2C. Depending on what you regress that vector on, you might get a significant pattern if a right-hand-side variable has a component that stands out in Brazil.

So it is possible that a spatial pattern formed by a linear combination of economic variables might also be uniform everywhere except over Brazil. Then we would have a confounded experiment. I think this is what Gavin was hoping to find. Then you wouldn’t be able to say that the spatial pattern is explained by the economic data because it’s also explained by the greenhouse hypothesis.

Now even if he had obtained that result, it would be a 2-edged sword. It would only show that the experiment is confounded. Sure you can’t conclude that the economic variables explain the pattern, but neither can you say it’s explained by the greenhouse hypothesis unless you first assume the data are clean. If the data are not clean then the appearance of GHG significance may be the fluke result.

Things did not work out this way however. The model-generated “clean” trends have a shape that generally doesn’t match the socioeconomic data, though there are a few dips where the economic data bulges and vice versa. So you get some apparent correlations but they are weak. Meanwhile the temperature trend pattern fits the economic data like white on rice. So it’s a double whammy. A simple comparison of fit shows the observations match the economic model better than the climate model. But also, the climate model predicts the opposite pattern to what turns up in the data.

And while some observational data combinations fit better than others, nevertheless the match to the economic data comes through every time, and the mismatch to the GISS-E data also comes through every time.

Happens all the time in earth science data. That’s why Gavin expected that that was the explanation for Ross’s correlational pattern. He made up his mind beforehand what the explanation was. That’s why he didn’t see or take note of the implications of his negative-correlation null model. He missed it, thinking he had found what he was looking for. Confirmation bias.

This continuing pattern of weak dismissals and non-responses from Team members is… disappointing. Sounds like we need new subtypes of a common logical fallacy in order to describe this phenomenon: Appeal to My Own Authority and its cousin, Appeal to My Close Associate's Authority. I expect to see that sort of thing from politicians, but was hoping for a bit more from scientists. Eventually.

Maybe I know the answer. Dr. Schmidt should have run the same test you did to hold out data points at random and see if the correlation remained. He would have had to do this for each of the model runs where he saw these significant results with all the data. His model runs are the equivalent of the surface temperature set that you ran. Is that right?

Guys, why would you waste your time trying to explain negative correlations as weak and insignificant as Gavin’s when the independent and dependent variables (GCM output, socioeconomic data) both have spatial structure? Compare to Ross’s positive correlations which are ten times stronger and a zillion-fold more significant. THAT’s what needs explaining.

Such weak correlations as Gavin's are expected by random chance alone in earth sciences data. Heck, it is even possible that it is an artefact of the "urban adjustment" process that Gavin is so in love with. Remember from Steve M's posts that there are both positive and negative adjustments. Who says the negative ones don't tend to cluster in areas of economic development?

Gavin leapt before he looked. Now he's in the doo-doo. Watch the back-pedaling from what he knows are untenable arguments. Should be fun.

[Gavin, a mistake like this can happen to anyone. Just admit you goofed and spare yourself any further embarrassment. Sure, Ross has unearthed something that is highly unlikely. But that’s science. That’s why it’s worth investigating in some depth. You can’t tell me Parker is the god of microclimatology. You can’t tell me your urban adjustments are robust. Give it up.]

If Ross's non-GHG ALW (anthropogenic local warming) effect accounts for, say, 30-50% of the AGW signal, then would this not bump up black carbon (BC) forcing as a more important source of warming than CO2?

Since I know Gavin is lurking, he is welcome to reply here or anywhere. What’s the magic number, Gavin? 25%?

[Risk heading OT. Replies should go in unthreaded or one of the Hansen model threads.]

The strongest argument that Gavin can make – and it’s the position he’s already back-pedaling to – is this:
1-Ross’s correlation is spurious, just as the one I found in my model output is spurious.
2-Nobody believes Ross’ correlation is non-spurious.

The logical errors here are:
1-Trying to pretend that the two results are of the same kind by ignoring the degree of difference between them (strength of coefficient, significance level). [Note: sign is not an issue with this position.]
2-Appeal to authority/consensus.

So I couldn't care less what Gavin/NASA/RC says. Given that Ross is Canadian, I want to know what the Canadian climatologists are saying about this result. Say, wasn't Oke an immigrant to Canada?

In re-reading your paper I see that in addition to the basic regression and calculation of significance you did a number of other tests. In reading Dr. Schmidt's paper I don't believe he did any of these tests on the results from his model-driven data. My question is: if Dr. Schmidt did some or all of those tests on his model-generated data, what result do you believe would show that the correlations he sometimes sees in the model-driven data are of a different character than the correlations you saw in the CRU/UAH data?

Re: Nicolas Nierenberg (#169),
The spurious correlations that he is seeing in the model data ARE ALREADY of a provably different "character". They are (1) weak, (2) barely significant, and (3) without apparent cause. But, as someone suggested above, if Gavin were to repeatedly randomly resample a subset (say 10%*) of his data, say 1000 times, then it would probably turn out that the majority of his correlations are non-significant. That would NOT be the case for Ross. His correlations are strong enough that they could withstand any bootstrap resampling approach.

*You might need to play with the subsetting threshold to maximize the contrast between the two analyses. Maybe it’s 10%. Could be 5%. Might be 50%. One can’t say in advance (without some insight as to the structure of the data).

But Ross will answer, I’m sure, with far more patience than I can muster.

This would be worth a paper, by the way. In the Journal of Statistical Climatology – were there such a thing.

Regarding mechanisms and Gavin’s disbelief – I fail to see what is so improbable with the idea that growing, exploitation-based socioeconomies generate a lot of waste heat. Airports are not cool places.

Re: bender (#171), This has always mystified me. Anecdotal evidence for UHI (radio station morning forecast [insert city name here]: “15 degrees at the airport, but if you live out in the sticks, you’ll be seeing 12 degrees”) is plentiful. The concept that a large concentration of people pumping heat into the air could affect local temperatures is logical. In my opinion, that means the burden of proof is to show that UHI is not a factor . . . not the other way around. It seems that the Team has a different view, however.

Re: Ryan O (#172),
Isn’t the question about the effect of UHI on trends, and not about UHI itself? If a station has been placed in the center of a large city, then the UHI has always existed and the effect on the trend will be negligible. That is my understanding of the argument, anyway.

You make a point that should be kept in mind when we talk of the existence of UHI and its possible effects on temperature trends. If the UHI conditions have not changed over time then even though the UHI effect might be extreme it would not necessarily affect the temperature trend.

The issue then becomes: have the affecting conditions changed over time? McKitrick and Michaels’s analysis indicates that they well could have. MM’s statistics appear to be holding up against both the earlier, casual criticisms and Schmidt’s more detailed recent criticism. MM’s analysis would (in a rational environment) then call for more detailed work on temperature measurements over the years and on whether they capture the proper extent of the adjustments required, not only for UHI but more generally for any micro-site changes that could “artificially” change the temperature being measured. I do not judge, from my reading of the literature, that the adjustments used for UHI would even take into consideration micro-climates at the measuring stations. Hansen and Karl acknowledge micro-site changes as a potential effect on temperature trends, but evidently do not attempt to resolve it.

I personally would like to see more statistical analyses of the results of Anthony Watts’ team’s work on the quality of the USHCN stations (CRN ratings) and the potential effects of quality (or the lack thereof) on temperature trends in the US. From the cursory analysis here at CA (I will look up links for interested parties) I think it can be shown that micro-climate effects on USHCN trends are greater in urban environments, but that trends can be affected under suburban and rural conditions also.

The conditions that would tend to change under the MM econometric factors would, I think, come into play in what the Watts team saw as quality issues in their station surveys, i.e., paved parking lots, air conditioners, local vehicle traffic, paved roads and building locations, as opposed to more natural settings, etc. Those conditions would seem to change over time as “modernization” occurs in advanced economies and even in non-urban areas.

As an aside, I find that Ross McKitrick has done an exceptional job of articulating the important issues in this debate, as he has with others here at CA. I am forever looking for his spelling, grammar and syntax errors, without success.

Re: Ryan O (#172), “the burden of proof is to show that UHI is not a factor . . . not the other way around. It seems that the Team has a different view, however.”

The Team agrees that UHI is a non-significant factor, and tries to correct for it. The issue is whether they correct enough for it, and it seems pretty clear to me from Ross’ paper (and others) that they do not. Yet if they were to accept MM07’s conclusions, they’d have to admit that the true global warming rate is well below their current calculations, and down that path lies professional embarrassment and (probably) less funding for climate science. I’m not saying that they’re deliberately misleading people on this to bilk taxpayers out of billions of dollars, just that scientists are human beings too, and see what they want/expect to see whenever there’s any ambiguity. They can rationalize their positions as well as the rest of us, and probably better.

The Team agrees that UHI is a non-significant factor, and tries to correct for it. The issue is whether they correct enough for it, and it seems pretty clear to me from Ross’ paper (and others) that they do not.

Re: bender (#177), Eh, if they really believed it was a significant factor and were not merely paying lip-service to it then they would investigate how on earth you could end up with positive UHI corrections in the ROW. Their corrections may work somewhat well in the US, but they clearly do not outside of the US . . . yet they show little concern. Furthermore, whenever the issue is brought up in the larger context, they alternate between “UHI is corrected for” (it is certainly not in CRU, and the GISS correction is more than questionable) and “UHI is not significant”.

they alternate between “UHI is corrected for” (it is certainly not in CRU, and the GISS correction is more than questionable) and “UHI is not significant”.

No, they don’t “alternate”. They’re not that schizophrenic. They state both simultaneously: the effects are weak AND they are already corrected for. Careful not to overdo the criticism. They’re not crazy. They’re negligent.

As most have already noted, the importance of UHI is clearly the trend of the UHI (including land use changes), especially relative to where the surface temperatures are being measured. MM07 use some of the available economic development indicators – there may well be other more potent indicators that are harder to get for enough gridded cells, e.g., km of new roads, tons of asphalt/cement, finished lumber consumption, etc., which have more or less direct impacts on local temperatures but are not responses to local temperatures (e.g., electricity consumption). A while back, we were looking at surface temperature records in various arctic areas and there were clear relationships between growth in local communities and the temperature records. Steve looked at Siberia in the Where’s Waldo series. I recall the Cambridge Bay data in Canada. Roger Pielke Sr. is an indefatigable proponent of the significant impact of land use changes on local climate. His blog, for those not familiar, provides a constant stream of research papers that document the impact.

Re: bernie (#181),
Which reminds me … the only other argument Gavin could make that could sink M&M’s ship is that M&M searched exhaustively for socioeconomic predictors, scanning hundreds or thousands of candidate variables, and reported in their paper only the few they decided to include in their model. If that were true, then effects of this magnitude could arise from cherry-picking, and it would constitute a dishonest portrayal of the actual methods used to derive the results (it does happen in the scientific literature).

Trust but verify. I trust that M&M did NOT dredge through thousands of socioeconomic predictors (and predictands, for that matter) only to report on a cherry-picked few. But can Ross verify this? [Sorry, but auditors are obliged to ask.]

The other way to independently verify would be to examine a broader range of related predictors and see if the correlative pattern holds up. (Such that it wouldn’t matter what M&M chose; the conclusion was inescapable.)
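The data-dredging worry raised here is a multiple-comparisons problem: scan enough irrelevant candidate predictors and a few will clear a nominal 5% significance bar by chance alone. A toy sketch of that point follows; the grid-cell count, number of candidates, and critical t-values are all illustrative assumptions, not values from either paper.

```python
import numpy as np

rng = np.random.default_rng(1)

def corr_t(x, y):
    """t-statistic for the Pearson correlation of x and y."""
    a, b = x - x.mean(), y - y.mean()
    r = float(a @ b / np.sqrt((a @ a) * (b @ b)))
    return abs(r) * np.sqrt((len(x) - 2) / (1.0 - r * r))

n, m = 440, 200             # grid cells and candidate predictors (made up)
y = rng.standard_normal(n)  # a "trend field" with no real predictor at all

tcrit_naive = 1.97  # ~5% two-sided at large df
tcrit_bonf = 3.66   # ~(5% / 200) two-sided: Bonferroni-adjusted

# How many pure-noise predictors "pass" at each threshold?
naive_hits = 0
bonf_hits = 0
for _ in range(m):
    t = corr_t(rng.standard_normal(n), y)
    naive_hits += t > tcrit_naive
    bonf_hits += t > tcrit_bonf

print(naive_hits, bonf_hits)
```

Typically a handful of the 200 pure-noise predictors clear the nominal bar, while essentially none clear the Bonferroni-adjusted one. A priori variable selection, as Ross describes below, is the other way to close this loophole.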

No, they don’t “alternate”. They’re not that schizophrenic. They state both simultaneously: the effects are weak AND they are already corrected for. Careful not to overdo the criticism. They’re not crazy. They’re negligent.

.
At the risk of going OT, the point wasn’t to imply they were crazy; the point was to highlight that the answer changes depending on the context of the question. The default is that it is corrected for. When pressed to show this for CRU, for example, the follow-up is that it is not significant.
.

I trust that M&M did NOT dredge through thousands of socioeconomic predictors (and predictands, for that matter) only to report on a cherry-picked few . . . The other way to independently verify would be to examine a broader range of related predictors and see if the correlative pattern holds up.

.
If it does hold up, that’s a slam dunk. But simply because something is in the general category of “socioeconomic predictor” does not mean it will have the same predictive power for temperature change as a different predictor. So just because one class or type of predictors does not yield the same results as the ones M&M chose does not necessarily mean that the results are skewed by cherry picking. It may simply mean that the underlying causal relationship between that particular indicator and a positive effect on temperature is not strong.
.
For example (SWAG territory now), I would expect that a data set of electric power consumption density would show a stronger relationship with temperature than mere population density.

Re: bender (#183), My defence, yr honr, is laziness. I don’t have the programming skills to scrape data from places like the Penn World Tables, the CIA World Fact Book, the UNCTAD database, etc. So my RA and I went in, country by country, and extracted the numbers the slow way.
But if you were to scan for many socioeconomic variables, which would you choose? When I first began this work I wrote down a short list of the variables that seemed to me to matter: real average income, population, GDP, missing value counts, local air pollution and abundance of skilled workers. Since the air pollution wasn’t readily available I used local coal consumption. Electricity consumption doesn’t quite work because some regions like France use a lot of nuclear. Coal consumption isn’t ideal because the US uses a lot of scrubbers compared to other places, but at least it captures some of the local particulate and aerosol loads associated with the heavy coal regions of Eastern Europe and Asia. Educational attainment isn’t ideal either, but the proper measure (labour costs) isn’t easy to find.
Within these data sets there are lots of other variables you can pick: export and import volumes, inflation rates, unemployment, telephones per capita, govt debt as a % of GDP, etc. I didn’t take any of these because they are not relevant. There does have to be an a priori reason to use the variable.

Re: Ross McKitrick (#187),
Thanks for clarifying how the variables were selected. So, there was no post-hoc data dredging. The variables reported were all chosen a priori … with reasonable expectations of causality.

Re: Ross McKitrick (#187), I have a candidate, Ross, for another UHI indicator. When I started practicing law here in Charlotte in 1966 we had 150 lawyers. Now there are 3,500. That is one heck of an increase in hot air.

Re: bernie (#181), “…not responses to local temperatures (e.g. electricity consumption).” I would think that electricity consumption would be a direct response to, as opposed to a cause of, local temperatures. This is interesting. Years ago an electrical utilities expert told me that an important gauge of economic activity was the weekly published figures of electricity consumption in this country. He viewed it as a predictor, of sorts, of the state of the economy. One wonders if historical records of local electricity consumption would be a useful indicator of UHI.

Re: PhilH (#195), Phil: That was my initial thought, and certainly it is generally used as an indicator of economic development. You could presumably separate out residential from non-residential electricity consumption, partialling out most of the temperature-driven consumption. I am re-reading Ross’s article to see how they thought about the causal flows in the model. Hot weather and cold weather can drive electricity consumption, which complicates the equations, I would think. It could perhaps be set up as a leading indicator, but I will leave the rest to Ross, bender and other regression whizzes.

Re: jc-at-play (#202), I’m sure you are right; but could you explain why this is so. I realize now that I was really thinking of electrical consumption as more of an economic growth indicator than as a response to UHI by, say, air conditioning. Now that I put it that way, my thought is trivial. Shoot, just look at the city limits.

Re: jc-at-play (#202), jc, I am not sure the decade thing really holds. On a year to year basis residential energy use is too close to being driven by temperature to be included. But I am sure Ross has already thought through the pros and cons.

OK, I admit I was speaking off the top of my head (or through my hat?). My basic idea was that if temperature were a significant determinant of electricity consumption, then we should tend to see changes of the two acting in tandem. However, as far as I knew, over the last century or so temperature has fluctuated up and down, with a barely discernible net increase; while electricity consumption has continually grown, resulting in a truly dramatic cumulative increase, far greater than generally realized.

Prompted by your challenge, I have since made an effort to find what the data on electricity consumption actually looks like. Everything I have found seems to confirm what I had originally thought. For example, domestic electricity consumption in Singapore increased every single year from 1986 to 2007, with cumulative growth of over 300%. Clearly, climate change could have been only a minuscule factor in a change of that magnitude.

Re: Nicolas Nierenberg (#198),
I’d like to thank Nicolas for his contribution to this thread. You’ve been an interesting referee, going between blogs to find answers, not just to comment or cajole. You’ve kept the thread on a straighter track.

I’m glad you’ve had better luck than some. I was thinking of the situation with the Steig data/software when I made that comment. Many other situations with the team that are chronicled on this website come to mind as well.

Re: steven mosher (#206), The NDVI looks promising because it is a global time series. The Impervious Surface Data method looks like it has only been developed for a few locations, but in time it too looks like a good data source for this kind of analysis.

Bernie 205:
In the desert during the summer you can feel the difference in heat around an air conditioner as it cools the house inside. In the winter there is a lot of escaped heat.
Also, you would have more heated outside surface area to release heat as it cools: single-story versus two-story versus apartment buildings versus high-rises, times growth.

For example, from 1990 to 2005, electricity consumption increased 5% per year in Ireland, but only 1.5% per year in the United Kingdom.

I recall reading on several occasions during this period about the dynamism of the Irish economy. The English economy was less so. It would be interesting to compare surface temperatures during 1990-2005 in the two countries.

There would seem to be too many variables influencing electricity consumption to be able to say that climate has any meaningful effect. Even tracking heating oil or natural gas consumption may be complicated by the introduction of better insulation and improvements in combustion and heat exchanger technology.

Hello All.
As far as I am aware, the Arctic is warming up faster than any other part of the world. This temperature increase can’t be correlated with economic activity (well, only negatively). Remote mountain glaciers are all receding and permafrost borehole records show recent warming. How is this explained by Ross’ model?

Professor McKitrick’s analysis does not attempt to “explain” or address those topics. But size matters. If the observed temperature trends are overstated because UHI issues have not been properly addressed, this would be an important thing to know. This would remain true even if the world is somewhat warmer today than it was in 1975.

Thanks for your replies. I still maintain that there is evidence of warming from all parts of the globe and this does not correlate with economic factors. Why are the Arctic and Antarctica warming; why are Patagonian glaciers in recession? Can’t be due to local changes in GDP!

Hi bmcburney
I agree that the UHI might not have been properly modelled. But the remotest regions of Earth show warming. If AGW has been overstated (say it’s only 0.5 C) and yet this has had profound effects on the cryosphere for example…this means that these systems are more sensitive to T change than we thought. Which is not really good news.

You need to be certain that the claimed high heating in the Arctic is really there. A lot of it is due to certain stations in Siberia of dubious provenance. There are some threads on the subject on this site. In addition, the warming in places like Alaska is possibly related to the PDO (if I have the proper TLA), and since it’s recently switched over to its cold phase, it may be premature to draw conclusions. I’d get into the mountain glacier debate but Steve would just snip it as being OT, so why bother?

Yes, those things (if true) would be difficult to explain by local changes in GDP. As Professor McKitrick points out, however, they are also difficult to explain as an effect of increased CO2, etc., in the atmosphere. If observed temperature changes outside of the Arctic and Antarctic are (or tend to be, to a greater extent than previously thought) an artifact of poor UHI adjustments, they are even harder to explain.

However, the debate is not really whether things are warmer now than they were when things were colder. Things have been getting warmer and then colder and then warmer for a long time. It is all about the quantum.

Re: san quintin (#221), The sensitivity question that follows from accepting Ross’s paper is interesting. Total sensitivity as presented by the IPCC would include both the natural warming seen since the LIA and the natural forcing from the sun. On the IPCC’s own arguments, accepting the paper might well mean that the actual CO2 sensitivity is a much smaller proportion of what remains.

san quintin,
Ross’s argument is not that there has been no warming, or even that all the warming is natural. The question he was trying to address was “What is the cause of the measured warming?” The answer, according to his paper, is that a large portion of the warming, as measured by the land surface network, is from UHI and land-use effects. The exact amount of warming, the proper error-bars for that figure, exact attributions for the warming, and the likely effects on the globe and human populations are other topics that are still being hotly debated, but are outside the topic of Ross’s paper and this thread.

I read through the paper and the for-dummies guide, which I appreciated since, when it comes to regression analysis like this, I’m a dummy. (Hey, I’m a physicist/engineer; I don’t deal in complicated statistics to get trends, because if I had to it would mean I had serious measurement and repeatability problems with my tests.)
What struck me was something obvious: you could do an experiment with real people and real buildings and at least cover some of the influences that Ross was talking about. So has anyone built a little control community and done such tests? That would be a good use of taxpayers’ money, I think.
I understand that statistics needs to be done to discern trends and possible causes/correlations but it’s only the first step.

You could do an experiment with real people and real buildings and at least cover some of the influences that Ross was talking about. So has anyone built a little control community and done such tests? That would be a good use of taxpayers’ money, I think.

Well, if you want to see the impact of a major city I think you would need to build a control community which is a major city; probably you would need several.

But your comment made me think that there are some quite simple experiments you can do yourself. Switch on the heating in your house and measure the temperature. I think you’ll observe warming, so that’s W. You switched on the heating, so that’s A. People all over the globe are doing similar things, so that’s G. We have AGW.

Now repeat the experiment but measure the temperature outside. Depending on your circumstances (I assume you’re not Al Gore) you might need to move a little away from the house (in my case 1mm is plenty). I think you’ll find that switching on the heating inside the house makes no difference to the temperature outside.

You might suppose that it’s the power generating plant that’s the problem. I’m sure that you could measure a temperature difference near the power plant, but that’s counting everyone together not just you. Given an efficiency of 30% the power plant is generating on your behalf about twice the heat in your house. So if you can boost your heating by x3 and measure the outside temperature that might be worthwhile doing. I can only manage x2. No effect but there might be a tipping point around 2.5.

So far I’m able to detect AGW but it’s all indoors. But I think Ross has shown that if you have enough people close enough together the bits of AGW that seep out do add up a little.

san quintin will want to read all he can on ocean dynamics in the arctic and antarctic and GCM simulations thereof.
.
For Carl Wunsch, start here: http://www.climateaudit.org/?p=2318
.
For Jim Hansen, start here: http://www.climateaudit.org/?p=2602
.
These are starting points. Read the papers discussed in those threads. Then talk to me about proof that CO2 is warming polar oceans. You will quickly realize that pretending that skeptics are invoking UHI for polar-warming is a major deceit.
.
Polar warming can be discussed in threads where it is appropriate. This is about land-use effects in the land surface record.
.
Simple enough?

It occurs to me that Gavin Schmidt’s entire approach – extremely reductionist – is very different from that of Ross McKitrick’s analysis – holistic. I think what Ross has unearthed is not AN effect, but a SYNDROME of human effects. Gavin is upset at the power of Ross’s admittedly black-box result and the fact that it doesn’t map so easily to a single physical mechanism, which is the approach that Gavin is most comfortable with.
.
To the extent that Gavin is a methodologist, he has no choice but to dislike Ross’s result & approach.
.
Score one for the holists.

Hi Bender
I was not having a go at sceptics… just pointing out that the Arctic amplification and glacier recession don’t fit well with Ross’ UHI effect. I don’t doubt that UHI and land use change are locally (and even regionally) important. I also don’t doubt that well-mixed greenhouse gases are globally important too. No climate scientist thinks that CO2 is the only driver of climate change. Ross’ paper shows that there may be contamination of the surface T record, but the cryosphere shows that this can only be part of the story.

Ross’ paper shows that there may be contamination of the surface T record

Yes, that’s what it “shows”. Not “maybe”. And what it suggests is that up to half of the “global” effect is in fact a local effect that is nevertheless replicated around the globe, i.e., an effect not generated by well-mixed GHGs high in the atmosphere. Furthermore, what this implies is that if one were to correct for this purely local (i.e. non-global) effect, then the strength of the “global” signal is reduced by half.
.
Put this in your CO2 sensitivity pipe and smoke it: black carbon is the top issue, not CO2.
.
It seems you are in denial of these results and their implications. Please, correct me if I’ve misunderstood your view.

Before going “global” … you asked about the arctic. Did you read the papers I suggested? By Wunsch? By Hansen? How the arctic warming is simply not consistent with the fingerprint suggested by the precious GCMs? What is your explanation for this if CO2 is the main driver? The GCMs are supposed to capture this effect.
.
But let’s not go OT here. The point – in terms of the OP is this – there are strong local effects happening in the real world that are not in the GCMs. Some have to do with humans (Ross’s paper), others with complex ocean dynamics, others with sunniness/cloudiness and its effects on ocean surface temperatures and the parameters of moist convection.
.
But if you want to discuss those topics, do so in the appropriate threads. This here thread is about Gavin on McKitrick. I’m not dodging your question. Just saying there are places to discuss that and Steve’s blog isn’t it.

Re-read the paper. What’s the magnitude of the effect that he reports? By how much would the global signal be reduced if the local effects of humans were properly accounted for? Answer that and then I’ll tell you if we agree.

I’ve tried to be civil in this discussion. Seems like bender doesn’t want to reciprocate.

Anyway, the results are a little underwhelming, aren’t they? They suggest that UHI MIGHT be contaminating the surface T record. Well… we sort of knew that. The response of the cryosphere suggests that something much more significant is happening, though. And with low sensitivity I’m intrigued to know how you could produce glacial/interglacial signals.

That is a misunderstanding of the paper, and your thoughts about the cryosphere seem to compound it. The effect found by Ross is that on the order of half the global detected warming – not local – is induced by anthropogenic artifacts. Let’s use 0.8C. We are now down to 0.4C for all other global effects. The sun was considered to contribute about 0.1C to 0.15C; we will use the higher figure, as we will for all parameters except CO2, to underline the interesting effects of the paper. We are now down to about 0.35C. The IPCC estimated that most of the warming in the first part of the century was natural and unknown. Assume it is not directly solar but is equal to about half; we will use 0.35C of the 0.4C. Now our results indicate that the effect of increased CO2 is smaller than the variance in the known parameters used thus far. We have not added aerosols and the possibly positive effect of carbon black after the sulfates have been washed out of the atmosphere. Although the IPCC considers CB minor, it may be equal to CO2 under these assumptions. In which case, we have CO2 and CB at present resulting in a total of about 0.05C warming in a century. Thus one can hardly claim that the results are “underwhelming,” unless you are talking about CO2. You might get some takers on that ;).

The effect by Ross is on the order of half the global detected warming, not local

Just so there’s no confusion. The effects are generated “locally”, but it’s happening everywhere in the globe that there are people. It’s not a “global” effect in the sense that people are not as well-mixed a particle as CO2. That’s why I say Ross’s effect is a “local effect”. It’s a local effect replicated globally. Hopefully that clarifies my agreement with JFP.

bender: Why do you assume that all people who ask questions are trolls? Moving my comments on the cryosphere (not just mountain glaciers) to a glacier post is silly….we are talking about the global validity of Ross’ hypothesis and this is where these comments fit.

Additionally, if the CO2-related warming is correspondingly low, how can this be achieved given its known radiative effect? In other words, why has dumping huge amounts of GHG into the atmosphere not caused warming? From this… if sensitivity is low, how can we explain glacial-interglacial transitions?

I’m a regular reader of CA although not a regular poster. I’m also neither a sceptic nor a troll.
Steve: bender seems to be channeling TCO these days. I wish he wouldn’t. I haven’t been online all that much the last few days and I apologize for his remarks.

Re: san quintin (#249),
-You ask questions that are outside the scope of this thread. If you ask them in an appropriate thread, you may get an answer.
-You don’t answer any questions asked of you.

You’re a regular reader who is nevertheless behaving like a troll. If you keep it up, this will become a foodfight and it will be snipped. And why? Because you ask questions about the cryosphere that have nothing to do with Ross’s paper, and you refuse to ask them in appropriate threads. QUIT HIJACKING the discussion. This ain’t about the warming in the cryosphere. It’s about the warming outside the cryosphere.

The messages from bender and me are a bit out of synch. In my day-job, I am on record as arguing that the GCMs are inadequate and don’t show skill (esp. at the regional scale). I don’t deny that model uncertainty and inadequate parameterisations are a major (and perhaps insurmountable) problem. Still, this doesn’t mean that dumping enormous amounts of a GHG into the atmosphere is a good idea!

You’ll also notice that I wasn’t just talking about the Arctic either…there are other elements of the cryosphere.

Re Dave at 252. Yes, I do follow Ross’ argument. It’s just that if it concludes that the CO2 GW component is much lower than previously thought, then how do you explain the points I made in post 249?

Specifically, what does the cryosphere say about GW and its attribution? Why is climate sensitivity so low, and how does this fit with the palaeo record? Why is our understanding of the radiative effect of CO2 so wrong? I’m sorry to keep asking these questions… but I don’t seem to get any answers!

Steve: These are large and interesting questions. For several years, I’ve been asking critics to provide me with an engineering-quality exposition of how doubled CO2 leads to 3 deg with detailed discussion of clouds and water feedback – not some 10-page article hot off the press – and thus far I have been unsuccessful. Personally (and many readers disagree), I do not preclude the possibility of such an exposition – but, unfortunately, the reaction is always – climate scientists are too busy and important to write such a primer. Take a MET 101 course. Well, this is the issue of the day. Billions of dollars have been spent on climate science. If misunderstanding develops among the public, climate scientists should look in the mirror and eliminate whatever contribution they are making to such misunderstanding – including the lack of such engineering-quality exposition. If there is one and you can identify it (and don’t say IPCC because it isn’t), I’d be happy to draw the attention of readers to it. The purpose of this blog – and this is often hard for scientists to understand – is to examine mainstream science, not to provide alternative views of the world. This is not a “skeptic” blog in that sense.

Editorially, I’ve found that – in the absence of such a reference to serve as a basis of discussion, I’m not interested in hosting the exchange of one paragraph opinions on root causes as every thread soon looks like every other thread and accordingly ask readers to respond as narrowly as possible to the issue on the thread without invoking the “big question” at every turn and have moved subsequent such comments to Unthreaded.

Re: san quintin (#248),
Grateful for Steve’s response in bold. Looking forward to SQ commenting in threads on treeline shift, Quelcaya, Wunsch, Hansen, and Kiehl – places where his comments are relevant. I’m not going to reply to such questions unless there’s an OP present for context.

As a re-iteration of an editorial position here, I haven’t personally examined Ross’ analysis and take no personal view at this time on whether his statistical results have “shown” UHI contamination or not. I personally am always wary of the possibility of spurious correlation. In this light, I welcome the new interest of Gavin Schmidt, san quintin and others in this phenomenon and urge them to apply their new-found skills to the relationship between Graybill bristlecone pine chronologies and teleconnected world temperature.

I am trying to understand Gavin’s paper and his conclusions. Early in the paper, he states the following criteria for evaluating Ross’s correlations:

There is a relatively easy way to assess whether there is any true significance to these correlations. We can take fully consistent model simulations for the same period and calculate the distribution of the analogous correlations. Those simulations contain no unaccounted-for processes (by definition!) but plenty of internal variability, locally important forcings and spatial correlation. If the distribution encompasses the observed correlations, then the null hypothesis (that there is no contamination) cannot be rejected.

But the distribution of correlations that Gavin generated does not encompass the correlations found by Ross, and therefore we can reject the null hypothesis, right? Or am I missing something here?
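Mechanically, the test Gavin describes is an empirical significance check against a model-generated null distribution. A minimal sketch in Python (the numbers below are invented for illustration and are not the actual coefficients from either paper):

```python
import numpy as np

def empirical_p(model_corrs, observed_corr):
    """Two-sided empirical p-value: the fraction of model-run correlations
    at least as extreme as the observed one."""
    model_corrs = np.asarray(model_corrs, dtype=float)
    return float(np.mean(np.abs(model_corrs) >= abs(observed_corr)))

# Invented values: one correlation per model realization, plus an
# observed correlation that lies outside the model spread.
model_corrs = [0.05, -0.12, 0.08, -0.03, 0.10, -0.07]
observed = 0.31

p = empirical_p(model_corrs, observed)
reject_null = p < 0.05  # True here: no model run is as extreme as 0.31
```

If the observed correlation sits inside the model distribution, the p-value is large and the “no contamination” null stands; if it sits outside, as the comment above reads Gavin’s own criterion, the null is rejected.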

I have a few questions. According to Levitus et al 2005, of the excess heat from late 20th century warming about 84% is stored in the ocean, about 5% heats the continents, about 4% is absorbed by the atmosphere, and the remainder melts sea ice and glaciers.

According to Ross’s paper, if I understand correctly, about 50% of surface temp warming resulted from UHI type effects and not CO2. So does that mean 2.5% of the heat that warmed the continents was from UHI?

I bring up this question because I really think all of the focus on surface temps is ill-placed. If the question is whether or not atmospheric CO2 has put earth’s energy budget out of balance, we should be focusing on the largest reservoir of stored heat, the ocean. We need to be thinking in joules, not surface temp. How many joules are generated by UHI?

Re: Ron Cram (#255), Roger Pielke Sr has been arguing on his blog, for quite a while, that ocean heat content in joules is the appropriate metric of climate change. I don’t have a view on that. My study says nothing about the oceans. It is only looking at the global land area. The IPCC has said in both the AR4 and TAR that their land-based temperature data comes from contaminated sources but they have algorithms to clean it up and the result is a data set in which non-climatic biases have a trivially small and insignificant effect. They have repeatedly claimed this but offered no proof. The papers they cite, such as Jones et al. 1990 and Peterson 2003, don’t test the hypothesis and don’t give them any grounds for their claim. Pat and I tested the hypothesis on a global land sample and the hypothesis fails massively. And the contamination effect could be as large as half the trend they are measuring. A discussion that follows from a proper grasp of my paper (and Gavin’s) will not proceed directly into a discussion of remote glaciers, deep oceans or Arctic sea ice formation except if the purpose is to change the subject.

A discussion that follows from a proper grasp of my paper (and Gavin’s) will not proceed directly into a discussion of remote glaciers, deep oceans or Arctic sea ice formation except if the purpose is to change the subject.

Now, why would anyone want to change the subject?
Re: Ron Cram (#255),
It’s so important that they have an acronym for it: OHC.

First, thank you for writing the paper. It was an important addition to the literature and I do not want my comments to seem like I do not value it. One of the benefits of your paper may be unintended. Your paper, along with others, is further evidence that the surface temp record is the wrong metric to be spending our time and brain power on. Perhaps someday your paper will be seen as the final nail in the coffin of the surface temp record as the metric of choice. Many times researchers end their papers with a discussion of where they think additional research may be fruitful. My earlier comments come from that train of thought.

Yes, I am familiar with Pielke’s blog. I am not smart enough to come up with this on my own. It is not my intention to change the subject but to understand what your paper means. My first conclusion from your paper is the surface temp record is FUBAR and people should stop wasting their time on it. It is equally obvious to me that UHI warming is real. The question then arises, can the UHI warming be measured or calculated in a reliable metric like joules? If so, I think we might learn something from that.

While I realize your paper does not address joules, do you think it is possible you can correlate UHI warming into joules? It may be possible, if we can get someone smart enough to try. Bender, are you up to the challenge? Perhaps Steve, lucia, UC or Jean S. will take a crack at it?

The question then arises, can the UHI warming be measured or calculated in a reliable metric like joules? If so, I think we might learn something from that.

No. You need to define a volume to compute joules. Since surface temperature is defined at a surface, you can’t compute joules from it. If you tried joules per unit volume, you would gain no advantage over temperature.

UHI is described as something that effects a change in the local value of surface temperature. So.. still no joules.

The reason Pielke Sr. can discuss joules with ocean heat content is they measure over a range of depths. So, you have a volume.

You have provided a very articulate explanation for why UHI warming cannot be directly measured in joules, but a correlation is still possible in principle. For example, if we can reliably ascertain that global UHI warming caused a change in GMST of 0.3C or 0.4C, how many joules would have to be released from the ocean to have a similar effect? I cannot do the math but I am certain someone here can figure that out. My guess is the number of joules required would be very small compared to the increase in OHC seen in the 1990s. When people do the math, they will see very clearly why GMST should be rejected as the metric of choice. It is simply insignificant when compared to overall warming.

OK, I’ll have a stab at an order of magnitude (maximum) calculation of your joules.

To get started, imagine a city 100 km^2 in area with population 2 million. All buildings are brick. My guess (from places I’m familiar with) is that if you demolished the houses and spread the material evenly over the land, each house would roughly cover the block that it’s on to a depth of one brick (which I’ll take to be 10 cm). The total volume of bricks is then

100*1000(m)*100*1000(m)*0.01(m) = 10^8 m^3

As a reality check that’s 50 m^3 per person. Clearly, that’s too high (good, we’re looking for a maximum) but of the right order of magnitude.

A typical value for the density of a brick is 1500 kg/m^3 so the total mass of bricks is the volume*density, which is

10^8(m^3)*1500(kg/m^3) = 1.5*10^11 kg

This is the mass of bricks to house 2 million people, so the mass of bricks per person is 0.75*10^5 kg (75 tonnes; too high, but that’s OK).

So to be clear: UHI(J) is an estimate of the excess heat content of urban areas, over equivalent nonurban areas, for the whole globe, in joules. Note also that UHI(K) is a temperature difference, so K and C are the same.

In his discussion of ocean heat content Pielke’s units are 10^22 joules. So that’s 4 or 5 orders of magnitude higher than UHI(J). Unless I’ve made a major error (like /1000 instead of *1000), corrections to the guesses I’ve made are not likely to be material. One that might look like it could be out by a few orders of magnitude is the population density of the imaginary city I started with. But that’s just to get a handle on the mass of building material per person (if I were a builder I probably would have started there). I think that figure would decrease, not increase, with increasing population density.
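The heat-content step itself (Q = m·Cp·ΔT) can be made explicit. The specific heat, temperature excess and city count below are my own assumptions for the sketch, not davidc’s figures:

```python
import math

mass_per_city_kg = 1.5e11   # brick mass per city, from the estimate above
cp_brick = 840.0            # J/(kg K), typical fired brick (assumed value)
uhi_excess_K = 3.0          # assumed urban-minus-rural temperature excess
n_cities = 1500             # assumed number of comparable urban areas globally

# Q = m * Cp * dT for one city, then scaled up to a global UHI(J)
uhi_j_city = mass_per_city_kg * cp_brick * uhi_excess_K   # ~4e14 J
uhi_j_global = uhi_j_city * n_cities                      # ~6e17 J

# Pielke quotes ocean heat content changes in units of 1e22 J
orders_below_ohc = math.log10(1e22 / uhi_j_global)        # ~4.2
```

Plugging in these guesses puts UHI(J) four to five orders of magnitude below the ocean heat content numbers, consistent with the conclusion above.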

Re: davidc (#285), I don’t think it is just the heat capacity of building materials. A major UHI effect is that you have replaced vegetation, which transpires and therefore cools, with asphalt and roofs which do not and which also have lower albedo (absorbing more solar).

I am glad to see you thinking it through. But I agree with Craig’s comment about land use changes also. Plus if we are going to find a trend over a given time period, you also have to subtract out the UHI(J) generated from the baseline period. Let’s say you picked 1979 as your starting point, the date we begin to have satellite data. A lot of those buildings were in existence in 1979. NYC may have increased its UHI(J) a little in that time period, but not much.

I was hoping someone could find a way to correlate numbers generated by Ross. The way I understand the situation, UHI causes real warming but it also biases the temp record towards a greater warming trend. For the purposes of this exercise, don’t attempt to adjust out the bias. Just take it that half of the recent atmospheric warming is caused by UHI. How many joules would have to be released from the oceans to heat the atmosphere that much?

Think of it this way. The atmosphere has a depth so it is possible to know how many joules are stored in the atmosphere at any one time. See if you can determine how many joules were in the atmosphere at the beginning of Ross’s study period and how many joules were in the atmosphere at the end of the study period. The difference is the answer.
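Ron’s proposed calculation can be roughed out with standard textbook figures (the atmospheric mass of ~5.15e18 kg and specific heat of air of ~1004 J/(kg K) below are standard values, not numbers from the thread):

```python
M_ATM = 5.15e18   # kg, total mass of the atmosphere (standard value)
CP_AIR = 1004.0   # J/(kg K), specific heat of dry air at constant pressure

def atmos_heating_joules(delta_T_K):
    """Energy required to warm the entire atmosphere by delta_T_K kelvins."""
    return M_ATM * CP_AIR * delta_T_K

# If ~0.3-0.4 K of the reported warming were attributed to UHI effects,
# the corresponding whole-atmosphere heat content change would be:
low = atmos_heating_joules(0.3)    # ~1.6e21 J
high = atmos_heating_joules(0.4)   # ~2.1e21 J
```

Either way the answer is of order 10^21 J, small next to ocean heat content changes quoted in units of 10^22 J, which is the comparison Ron anticipates.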

I think people here are talking about urban heat island in two different ways. One, which I’ll call UHI(K) (K for kelvins), is the excess temperature in an urban area over an equivalent nonurban area (or, what the temperature would be at the urban location if all anthropogenic structures were absent). I think that’s what Ross is talking about. But it also makes sense to talk about the excess heat content of an urban area. I’ll call that UHI(J) (J for joules). I think that’s what Ron is talking about.

Now, I agree with what you are saying, but I think what you really mean is that UHI(J) is not easily calculated. In principle the surface temperature you refer to cannot be converted to joules, but in practice it isn’t really a surface: it’s a measurement of the air temperature near the surface. That’s not much use in itself, since the heat content of the air makes a negligible contribution, but the air temperature provides information about the more important heat-storing structures such as buildings, roads and airports.

If you are prepared to assume that, over the time scale for which UHI(K) values are available, some kind of average approximate equilibrium exists between the air and the other structures, then UHI(K) tells you something about the temperatures of the heat-storing structures. Then, if you know the mass mi of structure i and its heat capacity per unit mass, Cpi, you can calculate mi*Cpi*UHI(K), which approximates, in joules, the excess heat content of structure i over what would be there if the structure were absent. UHI(J) can then be calculated by adding up the values for individual structures. If the Cpi were not too different for different structures you would get
UHI(J) = m*Cp*UHI(K)
where Cp is some kind of average of the Cpi’s and m is the total mass.

Although most supplies must be imported, Barrow relies on local natural gas fields to meet all energy requirements for building heat and electrical power generation. This energy eventually dissipates into the atmosphere, and can be detected as a pronounced urban heat island (UHI) in winter.

According to Ross’s paper, if I understand correctly, about 50% of surface temp warming resulted from UHI type effects and not CO2. So does that mean 2.5% of the heat that warmed the continents was from UHI?

I think you are misinterpreting the key idea of Ross’s paper here (and the fundamental concept of UHI “contamination”). His finding is that up to 50% of the reported warming may result from contamination of measurement sites due to local UHI effects. That is, enough of the temperature readings have been pushed higher by local effects that when these readings are incorporated into the global “average”, they contribute a significant bias to that average.

It’s not that the magnitude of that local “urban” heating is a significant contribution to the overall heat budget of the earth (I recently saw a paper arguing that if it continues to grow exponentially for centuries, it eventually will be); it’s that it distorts enough measurements to skew the reported numbers. He finds it extremely likely that the true warming is significantly less than what has been reported.

San Quintin: the reason bender is getting a bit chuffed is that you sound like you are treating the global land-based temperature record as a minor regional wrinkle, while referring to the Arctic (which is a small patch on the globe) and “glaciers” (without saying which ones) as representative of the global climate. It begins to sound like special pleading, or moving the goal posts, or special pleading while moving the goal posts. Before anyone is going to show much interest in your small regional counterexamples you need to show you have taken on board the large global findings.

You might note that in my paper I present the trend adjustments with equal weighting of gridcells and areal weighting of gridcells. The latter yields even larger gaps between the observed and filtered trends. If the results were being driven by a few small areas this would go in the opposite direction: areal weighting would diminish the gap.
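For readers following along: areal weighting ordinarily means weighting each gridcell by the cosine of its latitude, since gridcells shrink toward the poles. A toy sketch (the numbers are invented, not from the paper):

```python
import numpy as np

def mean_trend(trends, lats_deg, areal=False):
    """Average gridcell trends with equal weights, or with cos(latitude)
    (areal) weights to account for gridcell area."""
    trends = np.asarray(trends, dtype=float)
    if areal:
        weights = np.cos(np.radians(np.asarray(lats_deg, dtype=float)))
    else:
        weights = np.ones_like(trends)
    return float(np.sum(weights * trends) / np.sum(weights))

# A high-latitude cell counts for less under areal weighting, so the
# weighted mean shifts toward the low-latitude cells.
equal = mean_trend([0.1, 0.3], [0.0, 60.0])              # 0.2
areal = mean_trend([0.1, 0.3], [0.0, 60.0], areal=True)  # ~0.167
```

If a few small areas were driving the result, down-weighting or up-weighting them this way would shrink the gap; the comment above reports the opposite.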

For years we have seen the global land surface temperature record used as a primary metric of warming. Now we have striking evidence that the data are contaminated. Rather than take the matter seriously there is a scattershot of incoherent, fabricated or plainly false rejoinders piling up. Apart from problems inherent in the data I think the episode is also pointing to problems in the research culture.

the reason bender is getting a bit chuffed is that you sound like you are treating the global land-based temperature record as a minor regional wrinkle

Exactly. I lurk for long periods of time. Then, when I see a result that merits wide attention, I speak up. Then I return to lurking. sq, by turning a quantitative argument into a qualitative one (and a scientific argument into a policy one), is sooooo missing the point. And, boy, is it annoying.

sq tries to downplay the potential importance of the paper (why should we let it affect GHG policy?). If the paper is so irrelevant, then why, why did Gavin feel the need to go after it with his broken stick? Not just once – in the literature – but also at his mass media mouthpiece? The answer is that Gavin fully understands its implications.

Sorry for being away. Ross (258)…you misunderstand me. If your results are robust then it’s a significant paper, for which I congratulate you. But do you see my points? I’m not trying to be difficult, but the questions I raised also have to be answered. I don’t doubt for a minute that landuse, UHI etc have a role in climate but it doesn’t explain the other system behaviours I was talking about.

I recognise Steve M’s frustration at not having a detailed exposition of climate sensitivity (post 248). However, reconstructed forcings and associated T reconstructions tend to show that sensitivity is around 3C. At points like this, climate science (esp. palaeoclimate) becomes an interpretive subject like geology or geomorphology. It might be unsatisfactory, but you have to explain the paleo record….and you can’t unless sensitivity is rather high.

Steve: Perhaps so. But if this is the best line of evidence, why isn’t it prominently featured in IPCC AR4? I don’t think that this line of argument is even mentioned in AR4. Why isn’t it front and center?

Given that IPCC didn’t (to my knowledge) provide references, what do you regard as the most definitive exposition of this line of argument?

Re: san quintin (#265), The problem with what you are pointing out is that you have an unstated prior: all systematic deviations from model predictions have the same underlying cause. Without this prior, your comments about the cryosphere are unconnected to Ross’s work. There is no known physical basis for making such a prior. This, perhaps, is why some of the responses were so pointed.

Phil:
Thanks for the link. de Laat (2008) is decidedly non-technical but makes the point that the potential impact of UHI on local surface temperature observations is sufficiently large to explore it in depth.

I would point out that NASA has told us that a LOT of the Arctic ice melt is based on ocean circulation and wind patterns that bring warmer water to the arctic and push ice into warmer regions to melt.

How this shows substantial warming I will leave for you to explain to us.

Re Steve M’s response at 265. I agree that this line of reasoning is not often brought out, and it should be. As a palaeo person I think it’s one of the stronger cases for estimating sensitivity. Annan et al 2005 and Annan and Hargreaves 2006 are good places to start.

FYI: When starting a response to a message, go to that message and click the “reply and paste link” under the message number. This automatically produces the reference link and takes you to the reply box. A big saver of time and also a big help to people who want to read what you’re replying to.

It’s also an object lesson about why supplying data and code are so valuable. An interested person can go directly to what has been done / said before and verify that the later conclusions are valid.

I hadn’t used the link feature before either. Hansen relied on this argument at AGU 2008 and Curt Covey mentioned it as well recently. Would you be willing to post up a thread summarizing the argument from your point of view?

As to your observation that this line of argument is insufficiently emphasized, did you or anyone else make this observation during IPCC review? I don’t think that the argument is dealt with in AR3 either, but haven’t checked.

Re: Steve McIntyre (#275),
Steve: I agree with your general point that the palaeo record hasn’t always been sufficiently used (including by IPCC) and I’d be happy to write a post. It won’t be for a while I’m afraid as I’m off on fieldwork on Wednesday for a couple of weeks.

I don’t think that Ross’s study would necessarily contradict the 3K warming estimate, nor does it contradict any particular rate of glacier retreat etc. The fact that the surface record might show some spurious warming relative to the satellite record is simply interesting. There are all kinds of unknown lags, and local effects in the climate system. Time to equilibrium is not known to any accurate degree. The conclusion would still be a warming planet, but perhaps a somewhat slower pace.

Re: Nicolas Nierenberg (#293), Nicolas:
I view Ross’ paper as Curt does (Curt (#291)), and to that extent MM07 argues that after factoring out land use, UHI and other heavily localized anthropogenic effects, there has been considerably less of an increase in global temperature than existing CO2-based models suggest. Consequently, the original estimates for CO2 sensitivity do not match the observational record, probably because they fail to take account of negative feedback loops from clouds. If this is what you were saying in Nicolas Nierenberg (#293), then we are in agreement. If you are saying that the 3K increase for doubling of CO2 remains a metric that should drive policy-making decisions, then I would like to see the type of study that Steve has been asking for. Your posting of the Charney report (Nicolas Nierenberg (#292)) is very helpful.

I have taken a look at the data provided by you and Dr. Schmidt. I note that the trends in both the UAH data provided in your archive and the RSS data provided in Dr. Schmidt’s archive are essentially identical. The UAH decadal trend is .232, and the RSS Trend is .237. What is very different is the standard deviation of the trends over the selected grid cells. The UAH sd is .183, while the RSS sd is .133. This was surprising to me.

I also note that the grid cells in the native data from RSS don’t match up to the selected grid locations. Should I assume that some type of interpolation has to be done? Neither Dr. Schmidt nor you provided the code that downloaded and converted the satellite temperature sets and computed the trend. Could you comment on how this was done, and whether you know why the RSS and UAH data is so different in character?

Re: Nicolas Nierenberg (#277), Nicolas, I don’t have an answer on the different std deviations. I would suggest computing them by latitude band to see if the differences are larger in the NH, tropics or SH. If the differences occur at every band then it’s likely an algorithm-level issue which would reduce the variance in RSS relative to UAH. Or maybe they deal with outliers differently, in which case it would likely be confined to one hemisphere. The gap isn’t all that huge by the sounds of it.

Can you elaborate on the other question? I didn’t do the RSS compilation, Gavin did. When you say the grid cells don’t match do you mean the data don’t match or the locations are off in each cell? He might be identifying a grid cell by the northwest corner rather than the center, for example. I can send you the SHAZAM code I used for extracting the UAH trends. I didn’t post it, but the one for my 2004 paper is posted at http://www.uoguelph.ca/~rmckitri/research/gdptemp.html. It’s written for SHAZAM, which is a popular econometrics package that looks something like Fortran.
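The latitude-band comparison suggested above is simple once each gridcell trend is paired with its latitude. A sketch (the data layout here is hypothetical; the archived files would first have to be mapped onto (lat, trend) pairs):

```python
import numpy as np

def sd_by_band(lats_deg, trends, band_width=30.0):
    """Sample standard deviation of gridcell trends within latitude bands.
    Returns {lower_band_edge: sd}; bands with fewer than two cells are skipped."""
    lats = np.asarray(lats_deg, dtype=float)
    trends = np.asarray(trends, dtype=float)
    out = {}
    for lo in np.arange(-90.0, 90.0, band_width):
        mask = (lats >= lo) & (lats < lo + band_width)
        if mask.sum() >= 2:
            out[float(lo)] = float(trends[mask].std(ddof=1))
    return out

# Running this on the UAH and RSS gridcell trends separately would show
# whether the variance gap (0.183 vs 0.133) is global or regional.
```

If the gap shows up in every band, an algorithm-level difference between the satellite products is the likelier explanation; if it is confined to one hemisphere, different outlier handling becomes more plausible, as Ross suggests.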

I’m sorry I wasn’t clear; I’m new to this stuff. If I look at your “global” data set there are a set of rows. Each row has a latitude and a longitude along with a temperature anomaly trend, and of course the other factors. The lats and longs don’t match the lats and longs in the UAH data grid. This makes sense to me, even though I hadn’t thought about it, because the surface temperature is on a 5×5 grid while the UAH grid is 2.5×2.5.

So my question is how did you produce the trends in the global table from the underlying data? I have asked Dr. Schmidt the same thing.

I think that the difference in the two standard deviations is quite significant relative to the mean values, but maybe I’m missing something? Based on the histogram I would say this is present throughout the data.

It occurs to me that this could be an artifact of how each of you produced the trended gridded data, but again I might not be up to speed. You must be doing some type of interpolation from the underlying data set.
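One plausible way to reconcile a 2.5×2.5 grid with a 5×5 grid is block-averaging: each 5-degree cell is the mean of the four 2.5-degree cells inside it. Whether that is what was actually done is exactly the open question here, so this is only a sketch:

```python
import numpy as np

def block_average(fine, factor=2):
    """Average a fine lat/lon grid onto a coarser one by taking the mean of
    each factor x factor block of cells (e.g. four 2.5-deg cells -> one 5-deg)."""
    ny, nx = fine.shape
    assert ny % factor == 0 and nx % factor == 0
    blocks = fine.reshape(ny // factor, factor, nx // factor, factor)
    return blocks.mean(axis=(1, 3))

# A global 2.5-deg grid is 72 x 144 cells; block-averaging yields the
# 36 x 72 cells of a 5-deg grid.
fine = np.arange(72 * 144, dtype=float).reshape(72, 144)
coarse = block_average(fine)
```

Note that if one product labels cells by the northwest corner and the other by the center (as Ross suggests above), the two grids can also appear offset by 1.25 degrees even when the underlying averaging is the same.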

Can you get my email address from Steve? I tried emailing you but I guess maybe your spam filter caught it.

I don’t know if this is the right place to post this, but in response to your question about an engineering level study I am posting a link to the 1979 Charney report. It starts out with a basic calculation of heat caused by the direct forcing of CO2 and adds in water vapor. Then it goes on to discuss the various GCMs of that era. The answer centering around 3K for doubling hasn’t changed much in the intervening time, nor has the logic.

In my opinion things like paleo studies are not the foundation of the estimate, but they are considered to provide evidence that supports these conclusions.

Is there anything about the Charney report that you don’t consider to be a first order engineering study? I think beyond a heat budget calculation the system is too complex for anything but some type of model. Whether that model is run on a computer, or calculated by hand it amounts to the same thing.

My apologies if this has been posted before and you have already discussed this report. I’ve been reading off and on for several months and didn’t see any reference to it.

Steve: The Charney Report was discussed in posts 1135, 1851 and 2528, and passim in 1335 and elsewhere. It’s 30 years old. I’ve mentioned on a number of occasions that, in my opinion, the principles of AGW are enunciated more clearly in some of the early articles (especially Ramanathan, who is heavily relied upon by Charney) than in subsequent presentations, which arm-wave through things. A 30-year-old report, whatever its merits, does not count as an “engineering quality” report, as it is quite possible that some findings have been superseded. With the billions that have been spent on climate research and the keen interest of the public and policy makers, it is entirely reasonable to expect a more up-to-date analysis. It’s an interesting report, but it’s far short of “engineering quality”.

Re: Nicolas Nierenberg (#292)
Just FYI, there’s a google search button on CA’s mainpage on the upper right. It’s handy for checking what may have been discussed. Entering “charney” would have yielded a bunch of hits.

bernie, there are a lot of factors that can cause a lag in the time frame for the 3K equilibrium, if that is indeed the right figure. The largest is the ocean. In the last several years there has been an assumption that the additional heat won’t necessarily mix into the deeper layers that quickly. If the heat is mixing deeper then temperature increases could be much slower. Of course if it is mixing deeper we should see that in steric sea level rise, which has been absent for at least the last few years.

I just breezed through that report and collected the following samples from the language (please forgive lack of quotation marks):

considerable uncertainties, greater uncertainties, major uncertainties, we cannot simulate accurately, the latter cannot be adequately projected, it appears that the warming will eventually occur, it is not known, the distribution . . . is not entirely clear, although our knowledge is inadequate, considering the uncertainties, some uncertainties in albedo feedback, an extremely difficult question to answer, cloud observations in sufficient detail . . . are not available, circulation models are very crude, above uncertainties, can yield only crude approximations, horizontal resolution . . .is rather coarse, lack of sufficient resolution, . . .

and the clincher:

Of course we can never be sure that some badly or totally overlooked effect may not vitiate our conclusions.

Re: theduke (#297), Duke:
It may well be that scientists do not know enough to do an “engineering grade” study – which is fine. The difficulty is that many of those involved in GCMs act as if they do know enough and, therefore, it is reasonable to demand an engineering grade study of them. Other scientists may be perfectly correct in simply saying “we do not know enough”. My recollection is that there is a more recent NAS or NRC report that lays out all the uncertainties and the areas needing significantly more work. Interestingly, some seemed to have treated that report as a way of justifying additional funding, while others viewed it as a statement on the state of climate science. Here is the report on Radiative Forcings. Roger Pielke, Snr, I recall, was particularly concerned by the way other members of the committee positioned the results.

Referring to water vapor concentration increasing with increased temperature, they say “A plausible assumption, borne out qualitatively by model studies, is that the relative humidity remains unchanged.” How can an assumption of the models be validated by the models?

In the cloud effects section, they report that a 1% increase in clouds would result in a 10% decrease in the climate sensitivity. They also acknowledge that there is a large uncertainty in modeling the clouds. Later, a numerical value is given for the uncertainty due to clouds, but this is complete arm-waving. From reading just this report, I would guess that the sensitivity to CO2 is not known to any better than about +/-200%.

As engineering manager for my company, I can say that anyone who presented this kind of report to a peer review on a major project would get one of two responses:

1) If the group is very polite, they will say to go away and come back when you actually know something.
2) If the group isn’t polite, they will just shred you at every point for having no quantitative basis for anything other than the radiative transfer with CO2. The meeting moderator would stop the presentation part way through and move on to something more productive.

As Steve and theduke have said, this is far from an engineering quality document.

Trent, a cultural problem in this field is that climate scientists don’t seem to have the faintest idea what an “engineering quality” report looks like and the computer guys tend to hijack the discussion towards better annotation and QCed code, which is not really the issue that I have in mind. I’m not an engineer and am not quite sure how to define it, but I’ve seen engineering reports for mining projects. Some salient aspects seem to me to be that they are long and detailed (not a 4 page article in Nature), the parameters are worried about, the basis of knowing things is worried about; the engineers have to stand behind their calcs, and not blame errors on prior errors by the British Antarctic Survey or GHCN or Graybill or whatever.

In the present issue, I would at a minimum expect an exposition of infrared radiation issues as they relate to CO2 and H2O, something that remarkably has never been presented in any of the four IPCC reports. Not that there is necessarily any big issue, but I would like to see a stamped, up-to-date report certifying that (say) uncertainties in the near-infrared water vapor spectrum are not relevant, if they aren’t, or flagging them for specialized research if they are.

It is not my position that all decisions be deferred until you have bridge building certainty, but I am convinced that an engineering quality report would summarize the relevant uncertainties and lead to more directed research than we see at present.

In other disciplines, even after the science is “settled”, the engineering has often just begun, and it seems to me that this is, in a sense, what’s happening with climate models right now. Except that everything is being done in a very undisciplined way: things that should be done as engineers do them are instead being done as though they were little science projects.

Steve, I have had some experience of engineering quality reports, in the context of raising debt and equity funding for the development of substantial resource industry projects – of the order of US$1 billion in capital cost.

Typically the documentation for such projects will be, say, 20 ring bound folders, each focussed on a particular aspect of the project. But each of the ring bound folders provides references to a whole series of other supporting reports, data, models etc. The documentation relating to such a project can easily fill a large bookcase on the wall of an office.

In order to obtain funding, typically the documentation (commonly called a Bankable Feasibility Study or BFS) will be reviewed, in detail, by a qualified specialist independent reviewer appointed by the banks. In Australia, Behre Dolbear is a firm that would typically be engaged to do this sort of work. The review exercises usually cost several hundred thousand dollars, and involve a detailed line-by-line checking of all significant assumptions, data inputs and calculations. Financial models are audited line by line by specialist firms engaged to undertake this work.

Typically the independent reviewer will engage specialist consultants to check detail in each particular area. These firms usually have a loose coalition of world experts that they can call upon for this purpose.

I can attest to the fact that these reviews are very demanding. The consequence of not receiving a positive report from the independent reviewer is that funding is simply not available to the project, so the stakes are high. Usually this means long hours and lost weekends for the team working on the project.

My comments apply to an independent company seeking to fund a development project from the debt and equity markets. Major companies follow similarly disciplined approaches. However, they may typically use internal review teams, supplementing their resources by accessing retired engineers whose judgement they respect. In any case, the practice is broadly similar with both independent and major companies, and also the practice is pretty standardised across the world.

You could look upon the review exercise as a detailed due diligence – an examination of all the key assumptions used to establish whether they are demonstrated to be true to a degree sufficient to enable debt funding to be used within the risk profile that the bank is prepared to accept.

The culture of the resource industry is understanding of, and accepting of, the need for such disciplined due diligence, and I have the impression that standards are even more stringent in, say, aerospace or bridge construction or tall building construction where lives as well as funds are exposed to the risk of failure.

Unfortunately, as you point out, climate scientists seem to have very little understanding of accepted practice in many industries, and expect to be able to put out poor quality work that affects public policy without being exposed to disciplined scrutiny.

Re: Steve McIntyre (#300),
An engineering quality study and report would cost about US$10-20 million. Certainly, out of all the money spent on climate science, this would be one of the most high-value activities. Anyone want to put together a proposal!?

However, what I think is more needed is the kind of process that engineers go through when they bring new technology into projects, and this new technology involves spending significant amounts of money or taking significant risk. There are well-defined processes for this (for instance, DNV’s qualification of new technology).

For me, the most important part of this process is multiple outside peer review sessions (not the superficial peer review performed for papers). These sessions review every aspect of the new technology and raise questions/concerns. Each question/concern is tracked and must be satisfactorily addressed. Sometimes this requires clearer or better-documented explanations. Sometimes it involves limiting calculations. Sometimes it involves complex modeling (e.g. dynamic process simulation, FE, or CFD with established codes). Sometimes it involves statistical analyses of risk. Sometimes it involves physical testing. Often it requires multiple iterations before the peer reviewer is satisfied.

In that sense, Steve and others at CA are performing an invaluable service. What’s largely missing is the other half where the climate scientists hear the issues and then address them rather than ignore them. It is particularly annoying that they object to outsiders critiquing their work, as this is the most important element.

Steve, mondo and Trent:
The basic point couldn’t be clearer: there is a need for increased rigor in the documentation of findings that have significant financial ramifications. We should be striving for engineering quality reports. However, I think the point of the earlier NRC report and of Pielke Snr’s lamentations is that nobody knows enough to actually pull these reports together. Consequently, it seems that a key priority is to determine what needs to be done to put such a report together and to identify a date when it could be available – though some sensible policy decisions will have to be made pending the results of more detailed and rigorous analysis.

You may not like it, but that has been happening at regular intervals. Starting with the report that my father chaired in 1983 called “Changing Climate.” The most recent document is AR4. Some uncertainties have been removed, some new ones have emerged. At least from a consensus perspective climate sensitivity to doubling CO2 has stayed about the same.

Re: Nicolas Nierenberg (#305), With all due respect, the IPCC reports are NOT engineering quality, more like a literature review. They do not check calculations, provide an auditable trail of how the CO2 sensitivity is derived, verify the GCM codes, test for sensitivity, compute confidence intervals, provide databases, evaluate sensitivity of projections, or anything like that. They explicitly maintain that it is not their purview to do any computing. Furthermore, the authors are selected by their countries often (not always) with consideration of their green credentials, not just their scientific ones.

Re: Craig Loehle (#312), Craig:
I read Nicolas’s statement somewhat differently. I thought he was saying that previous reports, including those of his father, pointed at various variables and sources of uncertainty that needed to be considered – and that none of these elaborations materially changed the sensitivity estimate. But the studies, like his father’s and AR4, did point to directions for further research. I take your point to be that the sensitivity calculations and other key parameters in GCMs are still just estimates and lack the systematic analysis needed to be a solid basis for major policy initiatives. I also think you are saying that there is no specific call for “better” studies in these major climate reports such as AR4 – just for more research.
To your list of issues I would add that many critical variables, notably UHI and other local man-made impacts, have been derived by a limited number of researchers and have apparently not been seriously reassessed even in a looser way. In other words, why are there not more pieces like MM04 and MM07?

Re: Craig Loehle (#312), If engineering reports were put together with the kind of due diligence the IPCC uses, there would be daily reports of wings falling off aircraft, elevator cables snapping, bridge decks falling through and large ships peeling apart mid-ocean.

#318. Thanks for the comment, Trent. I think that I’ve mentioned a number of about $10 million for such a study on a couple of occasions, but it was just a wild guess on my part, so I’m glad to see someone else mention a number in that magnitude. It seems like a trivial amount to me.

If I had a big policy job, it would be my first priority. I’d have it done by independent engineers, but require one of the modeling groups to give it their total attention. Some sort of coercion would probably be required, since climate scientists seem to oppose this sort of thing for some reason, and there’s an enormous amount of prima donna behavior in the community (mostly counterproductive, as Pielke Jr and others observe).

However, anyone who’s concerned about policy should welcome a professional report of this sort. Unfortunately the involved community doesn’t seem to have the faintest idea what this sort of professional report even looks like – so we keep getting reports in which intelligent people have invested a lot of time, but which really miss the mark in terms of engineering quality.

Re: Steve McIntyre (#310),
The problem is that academic researchers have been trained from the start of their careers that advancement is via publishing papers in journals. This means that to them these papers are the highest form of scientific achievement possible. So when engineers say that they are insufficient of course the academics object. “That’s absurd!” they say, “It’s all there, in black and white, clear as crystal. You just don’t understand it.” And to some extent this may be true, however the academics have no idea what engineering entails, and so it’s perhaps unrealistic to expect them to know what an “engineering quality” study is.

Re: Paul Penrose (#312), Paul, that’s not at all how it is in academics. For one thing, universities have Departments of Engineering (electronics, civil, chemical, you-name-it) with highly competent engineers who both are academics and who entirely understand engineering quality reports.

Further, though, academic research is about the cutting edge of knowledge, not about engineering. It’s about the border between the known and the unknown, in a landscape where how to proceed is not, and cannot be, mapped out. The only way forward is to give the explorers free rein to find their way. Up until now, this strategy has paid enormous dividends.

Academic scientists are not hostile to correction – just the opposite. They argue their understandings as strongly as possible, but generally concede if they’re shown to be wrong. The best thing of all is to discover one’s own errors, because that heralds an improved understanding. And they work very hard to investigate their competitors’ claims, to disprove the reigning theory, or to find a new effect. Error correction is the warp and woof of academic science, and they all know it and expect it.

The problem has arisen in academic climate science because so many of these scientists have used climate models as though they were engineering models. And they have insisted upon that use, and been vociferously public in their pronouncements. And hyperbolic. Not only that, but Michael Mann and others have as well advanced their proxy reconstructions as though they were engineering results, rather than properly as an academic research program into whether climate temperatures can be reconstructed at all.

It’s not just climate models that are over-sold, therefore. It’s also proxy paleothermometry, and even global average surface temperature. All three legs of the AGW claim have been presented as though they were engineering quality results, rather than areas of study on the edge of the unknown.

The problem, in short, is that climate science has been infected with an environmental partisanship that has blinded the activist scientists and actually derailed and wrecked the climate-scientific enterprise. The result is as we see it: partisan polemics that don’t blink at character assassination, data imprisonment, methodological obscurantism, stonewalling, scornful dismissals of very clear disproofs, and all manner of weaseling.

I agree with you that it’s disgusting. But it’s not about academic attitudes, ivory-towers, effete liberals, or academic scientists in general. It’s about the entry of partisan politics into the academy. This infection has been going on wholesale now since about 1970, and it has now spread into and nearly ruined a branch of physics.

However, anyone who’s concerned about policy should welcome a professional report of this sort. Unfortunately the involved community doesn’t seem to have the faintest idea what this sort of professional report even looks like – so we keep getting reports in which intelligent people have invested a lot of time, but which really miss the mark in terms of engineering quality.

This discussion is on the edge of the forbidden land of policy so I will tread carefully here in reply.

First of all, I agree that there are many interested and curious laypeople and scientists who want to discover the revealed truths of science in general and of climate science in particular, specifically the degree and potential effects of AGW. That desire to learn the truth, or to discover it for ourselves, does not mean that we think it will translate well into policy or into actions emanating from that policy. For my own purposes and satisfaction I want to know the truth, or a close approximation to it, but I do not have a good expectation that policy considerations derived from it, or the resulting actions, will be the better for it.

For the foreseeable future, the uncertainty associated with climate science and with the potential levels and effects of AGW will allow policy to be driven more by the players’ predetermined notions of their roles in these matters than by any hints of revealed truth. We see this all the time with economic theory and analyses – and the economists are supposed to know the statistics involved.

I appreciate your call for an engineering quality report. Actually, it is a great idea, and one I wish I had thought of. However, the uncertainties are just too great for such a report, and I just cannot see how a $10 or $20 million study could possibly improve the science enough to preclude an appeal to the precautionary principle. Nor can I see the precautionary principle being used as a call to action in any engineering quality report. Can you see the principle being applied to the building of a bridge?

“We should build a bridge here because a river may form underneath it even though no water has flowed through here before.”

The call for an engineering quality report is a good one because it highlights the fact that such a report is not possible under the current state of the science. Perhaps a good lab journal exercise would be to list all of the uncertainties that need to be better understood before an engineering quality report would be possible, or a list of items that cannot be in the report because the uncertainties are too high. Once you have this list, you will see how much money has already been spent on these items trying to narrow the uncertainties, with little success.

Re: Ron Cram (#314), Re: Steve McIntyre (#310), I believe what is needed is an Engineering Assessment Report. These were developed by industry and have been used extensively in the environmental field, where so much is unknown that a reasonable program and its attendant cost cannot be estimated well, even though pollution/contamination is known to be present at actionable levels. Supposedly this is what TAR and 4AR are supposed to be, and they are not. An example is the economics of mitigation versus adaptation. In an engineering assessment report, the basis and assumption(s) of each subpart are separately and distinctly stated. You will not find this in 4AR. I went looking a number of months ago in order to plan possible expenditures for the facility where I work. As far as I can tell, the writers cannot or will not state basis or assumptions in an acceptable manner. Perhaps this is how economics papers are done; however, the work is unacceptable for what an engineer in my position is expected to do. I think this is the point at which the academic part of AGW crossed over into the political. It is presented as though it were an assessment that someone such as myself could use; for the reasons stated above, it is not. Every basis and explicit assumption, as well as certain normally implicit assumptions, must be stated.

The State of Florida came up with quite a good way to do one – and for regulators to measure whether it was done correctly – a number of years ago for leaking underground storage tanks. I am sure there are others, especially from DOD or DOE. They came up with these rules so that the work could be audited and held to a standard. LOL. It seems there was too much variance in approaches, such that even experts had trouble figuring out whether the site or the design was reasonable, and accounts payable professionals were clueless.

Governments have long accepted that the monies they spend need proper accounting, and they force professionals such as myself to tailor our work to meet their needs. I don’t understand why this is such a hard concept for some. It is simply another specification to meet.

This seems like a reasonable approach, but I still do not know who would be qualified to write such a report. I also agree with you that specs should be written out and provided to the engineer to meet.

Re: Ron Cram (#323), It is not the qualifications per se; the IPCC authors have most, if not all, of the qualifications to do the work. It is the methodology. They may well need some PEs as the coordinating authors, or some other professionals (certain military or commercial nuclear experts come to mind), but the strict adherence to ranking, stating, and documenting is what is missing. Much of the missing information had to be available; otherwise what was written in the different sections literally could not have been done coherently. An example is a list of assumptions, the basis for ranking what could be expected to improve with increasing temperature, and the basis for computing the breakeven point where temperature would start becoming negative. It is stated in AR4 to be about 2 to 3C; however, the basis and assumptions are not explicitly documented in a usable manner. Too often, as far as I can tell, the IPCC waves its hands and appeals to “expert” opinion. If I remember correctly, this is Rule 10. If you wanted to estimate what to expect in terms of regulations, you would look at the risk/cost matrix and propose a certain range of costs based on this matrix, using certain assumptions as to cost, etc., whose basis would need to be explicitly stated to avoid errors.

In Steve’s comment he referred to “an engineer” who would do the work. It seems to me no one person has all of the domain knowledge necessary for such a task, although we have some engineers who are regulars here and are learning the climate science. I would trust them over any of the IPCC lead authors.

The result you arrive at will have a great deal to do with which papers you select as trustworthy. If you trust MBH98 and clone papers to accurately portray past climate variability, you will get one answer. If you reject MBH98 and trust the temp reconstructions by Loehle and McCullough and/or Ababneh, you will get a different view of natural climate variability. The same is true for every contested area of climate science. And they are all contested.

When it comes to climate sensitivity, you can trust the IPCC range of 3-7C. Or you can trust Schwartz at 1-3C. Or you can trust Spencer at 0-2C (or thereabouts). All of these vary depending on their view of feedbacks.

I did not understand your comment about “where temperature would start becoming negative.” But I do agree about IPCC handwaving. The appeal to climate models is nothing more than an appeal to experts.

Re: Ron Cram (#326), Sorry, it was shorthand for the point where “temperature increases cause negative effects from formerly positive effects.” On the range of climate sensitivities, that is what an assessment should do: it should state, and show why it states, that further work is needed, and WHAT work needs to be done in order to be successful. “Contested” only means that, by providing the basis of your selection, your work is not wasted.

If done correctly with modern computing systems, the original document may cost $10 million. If it is later discovered that the sensitivity is wrong, then re-doing it will cost about 10%, or $1 million. As is typical for one of these endeavours, the assessment is about 5% of cost, so a 0.5% increase to get it right is insignificant. One of the great things about a structured assessment is that you do not waste monies: a complete change caused by 0.5C versus 5.0C can be done while conserving the majority of the work and avoiding costs. The IPCC does not do this. It is obvious that the major fault of the IPCC assessment is its lack of structure, because it is not the kind of assessment engineers would be doing (basis and assumptions are not in a structured format).

Of course, one of the problems is that with such a large spread of possible sensitivities, an engineering assessment would be basically one meaningful paragraph: costs are indeterminate, needs more investigation. This often happens in environmental clean-ups, and a structure for this occurrence is already outlined by the State of Florida in their leaking underground storage program requirements. Now, to do a structured assessment of global warming would at least triple the cost. However, even $100 million for a project expected to cost on the order of $10 trillion is getting it done on the cheap.

Steve, regarding the IR spectra visible from space, do you have a link to an image? It would be interesting to compare a 1960s-era image with a 1998 image and a 2008 image. If the differences are really that great, I would think it would get more press.

Perhaps a first step for this lab journal exercise would be to set the ground rules.

Rule #1 – An engineering quality report must use nothing but data that is open and freely accessible to anyone who wants to examine it. That rule leaves us with temperature data from satellites (UAH and RSS), Argo and GISTEMP.

Rule #2 – Any potential estimates of climate sensitivity have to correlate with the trend or no trend of Ocean Heat Content under different historical climatic scenarios such as Pinatubo and rising CO2. Glaring inconsistencies like rising CO2 and level OHC in the absence of volcanic eruptions cannot be glossed over.

Steve: Ron, while I’m a firm advocate of open data, that’s a separate issue. As is your second point.

Steve, I do not think it is a separate issue. An engineering quality report will have to show correlation to actual observations. You cannot build a bridge without knowing the elevation of the ground and where the foundations will be placed. In the same way, an engineering quality report will not be all theoretical physics. Also, an engineering quality study cannot rely on datasets which have known problems and cannot be examined – such as HadCRUT3.

Placing this point aside for a moment, doesn’t it make sense to you to start by listing the uncertainties and the standards an engineering quality report should have?

Steve: If I were commissioning such a study, I’d get an engineer to try to figure out what he could and couldn’t do. Even writing the specs would cost a fair bit.

Okay, I can agree with that. So, let’s start there. Where could we find such an engineer or group of engineers? Just for fun I googled “atmospheric engineering” and found a surprising number of hits. In 2006, I found one federal contract for “atmospheric engineering – development,” with total dollars spent on the project of $2,125; the University of Georgia System was the contractor. Nearly half a million was spent on the same category in 2005, though. It would be interesting to see what these contracts produced.

Pat, over the past year or so there have been those moments when I have read here various thoughtful and cogently argued insights into some of the “big picture” issues regarding this whole AGW inspired lemming race to the nearest cliff, and have heard myself muttering something along the lines of, “Gee I wish I could say something like that”.
(Actually, family and friends do hear me say these things, but not always with proper attribution.) A fair portion of your above post has already been processed and stored in memory, waiting for the right moment. So, to you and the others here spending your time and talents on this ongoing seminar, which reminds me of the best moments of my own university days, I would like to express my thanks.

…. The problem has arisen in academic climate science because so many of these scientists have used climate models as though they were engineering models. And they have insisted upon that use, and been vociferously public in their pronouncements. And hyperbolic. Not only that, but Michael Mann and others have as well advanced their proxy reconstructions as though they were engineering results, rather than properly as an academic research program into whether climate temperatures can be reconstructed at all …..

If one is considering the GCMs and the proxy-based temperature reconstructions as engineering models, then one also has to make a fundamental assumption about the earth’s climate as an integrated system: that it is highly deterministic and highly mechanistic, in ways reasonably subject both to future prediction and to causal evaluation once predicted results are compared with actual system performance.

Moreover, is it not so that, under such an assumption, the potential effects of factors such as Long Term Persistence in natural systems have to be discounted as a matter of practical necessity – otherwise the engineering-model paradigm becomes, as a practical matter, very difficult to enforce?

I are an engineer with extensive experience in large cap-ex feasibility studies and the like. My brief foray into climate modeling (as an engineer suddenly afflicted with curiosity) involved a small trip down Stefan-Boltzmann lane: radiative flux at the edge of the atmosphere, diameter of the earth, and so forth. Now, I realize that this approach has been seriously mocked at Real Climate, and I acknowledge that I am only one of the sons of Martha, not of Mary.

Still, with a radius of 6378 km, flux at 1.3 kW/m2, and an average surface temperature of 288.2 K, I get an emissivity of 0.83. Neglecting the axial tilt, internal heat, arc exposure of the sun, and so forth. With a two degree temperature rise to 290.2 K, I get an emissivity of 0.81. This whole process took about sixty seconds from a flat footed start. (I had to look up the radius of the earth.)
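The sixty-second calculation above is easy to reproduce; here is a short Python sketch using the comment’s stated figures (the 1.3 kW/m² flux and the two surface temperatures are the commenter’s assumptions, and albedo, tilt, internal heat, etc. are neglected, as in the original):

```python
# Back-of-envelope effective emissivity of the earth.
# Energy balance: incoming pi*R^2*S = outgoing 4*pi*R^2*eps*sigma*T^4,
# so the radius cancels and eps = (S/4) / (sigma * T^4).
SIGMA = 5.670e-8   # Stefan-Boltzmann constant, W m^-2 K^-4
S = 1300.0         # assumed top-of-atmosphere solar flux, W/m^2

def emissivity(T):
    """Effective emissivity implied by balance at surface temperature T (K)."""
    return (S / 4.0) / (SIGMA * T**4)

e1 = emissivity(288.2)   # ~0.83, as in the comment
e2 = emissivity(290.2)   # ~0.81, as in the comment
print(round(e1, 2), round(e2, 2))
```

Unrounded, the relative change is about 2.7%; the 2.4% figure quoted below comes from taking the ratio of the rounded values 0.83 and 0.81.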

Can a few hundred parts per million of CO2 actually change the average emissivity of the entire earth by 2.4%?

I acknowledge that temperature is very sensitive to the sun’s input (as well as to the emissivity), since temperature is raised to the fourth power while emissivity and flux are first-power effects. Still, changing the long-term average emissivity of the earth by 2.4% seems to me a massive undertaking.

All of these climate models are fancy methods of estimating the average emissivity over the long term. What I have affectionately called “techno-wanking” amongst my own staff from time to time. How does CO2 affect the emissivity that much? Given that the laws of thermodynamics (you can’t win, you can’t break even, you can’t quit) are immutable to the best efforts of climate science?

Steve: If you look at IR spectra of the earth (from space), CO2 is very noticeable – so yes, it does affect the properties. IMO an engineering report would discuss these sorts of spectra to help scientists from another field understand things, but IPCC seems to believe that it’s beneath their dignity.

I know I am way too slow for the blog world, as everyone has moved on. Anyway, I have written an analysis of spatial autocorrelation as it relates to S09 and MM07. My conclusion is that the primary result in MM07 was not affected by spatial autocorrelation, which is in line with Dr. McKitrick’s follow-up paper on the subject. In addition, I am able to explain the spurious results found in S09 using Model E data by showing that they are caused by spatial autocorrelation, which was Dr. Schmidt’s theory in that paper. Through this process I show that the results of S09, while interesting, don’t contradict the findings of MM07.
The post can be found here.
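For readers unfamiliar with the mechanics, a spatial autocorrelation check of this kind typically computes a statistic such as Moran’s I on the regression residuals. The sketch below is a generic illustration only, not the code from the linked post or from MM07; the station coordinates, residuals, and distance cutoff are made-up values:

```python
# Moran's I with a simple binary distance-band weight matrix:
# values near +1 indicate clustering of like residuals (spatial AC),
# values near -1/(n-1) indicate spatial independence.
import numpy as np

def morans_i(values, coords, cutoff):
    n = len(values)
    z = values - values.mean()
    # pairwise distances; weight 1 if within cutoff, 0 otherwise (no self-links)
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    w = ((d > 0) & (d <= cutoff)).astype(float)
    s0 = w.sum()
    return (n / s0) * (z @ w @ z) / (z @ z)

rng = np.random.default_rng(0)
coords = rng.uniform(0, 10, size=(50, 2))   # hypothetical station locations
resid = rng.normal(size=50)                 # spatially independent residuals
I = morans_i(resid, coords, cutoff=3.0)
print(I)  # should be near the null expectation -1/(n-1), i.e. about -0.02
```

A spatially structured variable (e.g. passing the x-coordinate itself as the values) yields a strongly positive I, which is the pattern a test like this is designed to flag in residuals.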

Re: Nicolas Nierenberg (#330),
Those are the correct conclusions. Thank you for posting your results and code so that they can be checked by an independent investigator. There is nothing wrong with being “slow” when you are meeting this level of due diligence. Good, reproducible work takes time. To wave your hands about (as in S09) takes no time at all.

I was following this over at RC, too, Nicolas. I’m interested to see what Tamino and Gavin have to say about it, if they say anything at all.

Thank you for running this . . . very much cool. I will have a look at it in more detail later.

I went to your site and reviewed your work. It is largely beyond my pay grade, but I do appreciate the dispassionate and professional approach that you have adopted. I tried to leave a message of appreciation at your site, but your sign-in requirements defeated me.