Cohn and Lins [GRL 2005]

Cohn and Lins [GRL 2005], engagingly titled “Nature’s Style: Naturally Trendy”, questions whether recent trends in temperature can be classified as statistically significant when considered from a more general perspective that includes stochastic processes other than white noise. Some of the issues will be familiar to readers of this blog, although the treatment in Cohn and Lins is obviously different. Cohn and Lins has prompted a reaction from Rasmus Benestad, a prominent proponent of the bizarre assumption of independent and identically distributed (i.i.d.) data in climate statistical testing, who accuses Cohn and Lins at realclimate here of attempting to "pitch statistics against physics". Amusingly, Benestad argues that "fairly stable" climate is evidence against Cohn and Lins, with the URL supposedly showing "fairly stable" climate being – wait for this – MBH.

Perhaps it’s helpful to refresh some background discussion on the statistical significance of trends. It’s not an issue that I’ve fully come to grips with; I think that I’ve noted up some of the recent econometric considerations, which build on problems arising out of spurious regression. However, there are some points that obviously raise one’s eyebrows.

the likelihood of a record-breaking event taking place in a stable system is remarkably simple (Benestad, 2003, 2004). In fact, the simplicity and the nature of the theory for the null-hypothesis (for an stable behaviour/stationary statistics for a set of unrelated observations, referred to as independent and identically distributed data, or ‘iid’, in statistics) makes it possible to test whether the occurrence of record-events is consistent with the null-hypothesis (iid)
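For reference, the i.i.d. record theory in this quote rests on a simple fact: in n independent draws from the same continuous distribution, the k-th value is a new record with probability 1/k, so the expected number of records is 1 + 1/2 + ... + 1/n. The Python sketch below checks this by simulation. It is only an illustration of the null hypothesis as stated, not Benestad's code, and the function names are made up for this example.

```python
import random

def expected_records(n):
    """Expected number of record highs in n i.i.d. continuous draws: 1 + 1/2 + ... + 1/n."""
    return sum(1.0 / k for k in range(1, n + 1))

def count_records(series):
    """Count running record highs; the first value always counts as a record."""
    records = 0
    best = float("-inf")
    for x in series:
        if x > best:
            records += 1
            best = x
    return records

def simulate_mean_records(n, trials=2000, seed=0):
    """Average record count over many simulated i.i.d. normal series of length n."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        total += count_records([rng.gauss(0.0, 1.0) for _ in range(n)])
    return total / trials

if __name__ == "__main__":
    n = 100
    print("theory   :", expected_records(n))
    print("simulated:", simulate_mean_records(n))
```

The whole test stands or falls with the i.i.d. premise: if the series is autocorrelated, the 1/k rule no longer applies and the record count under the null is different.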

At the beginning of August, Benestad again discussed here the significance of trends on the basis that the residuals were independent and identically distributed (i.i.d.) – actually normal as well – “white noise”. Oddly, the illustration in this article, reproduced below, is taken from William Connolley, also of realclimate, who has elsewhere proclaimed that he is innocent of any actual statistical expertise.
Figure 1. From realclimate, taken from William Connolley.

The look of such residuals is visually completely different from the look of the observed residuals from fitting a trend to actual temperature records. I discussed this in the context of the satellite records here, which look very different from the normal i.i.d. noise of Connolley and Benestad. An ARMA(1,1) model fitted the residuals very well. I also showed a figure here from an economics text with a variety of residuals from different noise structures.
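To see why the noise structure matters for trend significance, here is a minimal Monte Carlo sketch in Python. It simulates trendless ARMA(1,1) noise, fits an OLS trend, and applies the usual t-test as if the residuals were white noise. The coefficients phi = 0.9 and theta = -0.3 are illustrative assumptions, not the values fitted to the satellite records, and the function names are hypothetical.

```python
import math
import random

def arma11(n, phi, theta, rng, burn=200):
    """Simulate x_t = phi*x_{t-1} + e_t + theta*e_{t-1} with standard normal e_t."""
    x_prev, e_prev = 0.0, 0.0
    out = []
    for t in range(n + burn):
        e = rng.gauss(0.0, 1.0)
        x = phi * x_prev + e + theta * e_prev
        if t >= burn:  # discard the start-up transient
            out.append(x)
        x_prev, e_prev = x, e
    return out

def trend_t_stat(y):
    """OLS slope t-statistic against time, computed as if residuals were i.i.d."""
    n = len(y)
    t_mean = (n - 1) / 2.0
    y_mean = sum(y) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    rss = sum((v - intercept - slope * t) ** 2 for t, v in enumerate(y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope / se

def false_positive_rate(phi, theta, n=100, trials=500, crit=1.98, seed=1):
    """Share of trendless series whose trend is declared 'significant'.

    crit=1.98 approximates the two-sided 5% t critical value for n=100.
    """
    rng = random.Random(seed)
    hits = sum(abs(trend_t_stat(arma11(n, phi, theta, rng))) > crit
               for _ in range(trials))
    return hits / trials

if __name__ == "__main__":
    print("white noise:", false_positive_rate(0.0, 0.0))
    print("ARMA(1,1)  :", false_positive_rate(0.9, -0.3))
```

With phi = theta = 0 the series is white noise and the rejection rate should sit near the nominal 5%; with persistent noise the same test rejects far more often, even though no series contains a trend.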

Cohn and Lins consider the issue of trend significance not just from the point of view of AR or ARMA(1,1) models, but from a more general type of noise model – fractionally-differenced noise (FARIMA). We used this type of model in MM05a and are familiar with it. Fractionally-differenced time series have become somewhat familiar through their association with fractals, popularized by Benoit Mandelbrot.
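For readers who have not met fractionally-differenced noise, the simplest case, FARIMA(0,d,0), can be written as an infinite moving average of white noise with weights psi_0 = 1 and psi_k = psi_{k-1}(k - 1 + d)/k. The Python sketch below generates such a series by truncating the sum at m lags. The truncation, the choice d = 0.3 and the function names are assumptions made for illustration; this is not the procedure used by Cohn and Lins or in MM05a.

```python
import random

def fracdiff_weights(d, m):
    """Moving-average weights of (1-B)^(-d), truncated to m+1 terms."""
    psi = [1.0]
    for k in range(1, m + 1):
        psi.append(psi[-1] * (k - 1 + d) / k)
    return psi

def fracdiff_noise(n, d, m=500, seed=0):
    """Approximate FARIMA(0,d,0) series: x_t = sum_{k=0..m} psi_k * e_{t-k}."""
    rng = random.Random(seed)
    psi = fracdiff_weights(d, m)
    e = [rng.gauss(0.0, 1.0) for _ in range(n + m)]
    return [sum(psi[k] * e[t + m - k] for k in range(m + 1)) for t in range(n)]

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x)
    c1 = sum((x[i] - mean) * (x[i + 1] - mean) for i in range(n - 1))
    return c1 / c0

if __name__ == "__main__":
    d = 0.3
    x = fracdiff_noise(2000, d)
    print("sample lag-1 autocorrelation:", lag1_autocorr(x))
    print("theoretical d/(1-d)         :", d / (1 - d))
```

The weights decay like a power law rather than geometrically, which is what distinguishes long-term persistence from AR or ARMA noise. With d = 0 all weights beyond the first vanish and the series reduces to white noise.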

Cohn and Lins [2005] say that:

The statistical significance, or p-value, associated with an observed trend, however, is more difficult to assess because it depends on subjective assumptions about the underlying stochastic process

and later:

The question remains whether natural [hydroclimatological] processes in fact possess [long-term persistence]. The idea was introduced more than 50 years ago by Hurst [1951], and has been debated ever since [Mandelbrot and Wallis, 1968; Klemeš, 1974; Potter and Walker, 1981; Hosking, 1984; Loucks et al., 1981; Koutsoyiannis, 2000, 2003]. Hurst’s fundamental finding has neither been discredited nor universally embraced, but persuasive arguments have been presented (for discussion and additional references, see Koutsoyiannis [2003]). Given the LTP-like patterns we see in longer [hydroclimatological] records, however, such as the periods of multidecadal drought that occurred during the past millennium and our planet’s geologic history of ice ages and sea level changes, it might be prudent to assume that [hydroclimatological] processes could possess LTP.

Happily, I have already posted up on a couple of these references (all of which are interesting). I discussed Mandelbrot and Wallis briefly here, which interested readers might consult again. The phenomenon of long-term persistence in geophysical series was originally raised by Hurst in connection with Nile River levels. Remarkably and interestingly, the data set for Nile River levels (which is very long) persists in use in mathematical literature to illustrate fractional processes and is probably more familiar there than in climatological literature. Mandelbrot and Wallis followed Hurst in looking for very long records among the fossil weather data exemplified by varve thickness and tree ring indices. They calculated Hurst indices and 3rd and 4th moments for 12 varve series, 27 tree ring series from western U.S. (no bristlecones), 9 precipitation series, 1 earthquake frequency series, 11 river series and 3 Paleozoic sediment series. Some of the tree ring series are precursors of series used in the North American tree ring data set (a number of precursors to Stahle, who interpreted the series as being ENSO affected).

Klemeš, also cited in Cohn and Lins, was discussed here. Klemeš contested the long-memory interpretation of fractional processes and argued that indistinguishable time series properties could be produced by semi-infinite storage properties of water:

An exceptionally fruitful concept for the mathematical modeling of hydrological processes is the so-called semi-infinite storage reservoir, especially the type with a fixed bottom and no fixed maximum [Klemeš, 1970, 1971, 1973]. It adequately describes the basic mechanism common to such different water reservoirs as, for instance, a lake, a single dew droplet, a glacier, a groundwater basin and a man-made reservoir operated for flood control or hydroelectric generation. Their common property is on the one hand, the possibility of running dry and the other, the fact that they have no fixed limit of storage capacity (water level in a dam can rise to any elevation above the dam crest, as is demonstrated in the history of dam failures, and a glacier can cover whole continents as is documented in geological history.)

Even a very simple model of this type can reveal very disturbing properties to be expected in hydrologic processes. For instance, a single non-linear reservoir fed with white noise will produce output that is nonstationary, a first-order Markov chain with time variant serial correlation and random component [Klemeš, 1973]…

I find Klemeš consistently interesting and think that his concepts deserve very careful consideration. Fluctuating storage is a very pretty and very difficult mathematical concept. In a way, I can even see how you can apply this type of model to El Nino phenomena, where you have accumulations of warm water (both in energy and even elevation) in the Pacific Warm Pool driven by trade winds, with intermittent "avalanches" in which the accumulation dissipates. (However, nothing here turns on whether this image has any validity.) I previously posted up some of Klemeš' criticism of “boastful claims of assorted ‘modellers’ about all kinds of climate-change effects, motivated more by politics than by science and reflecting prejudices rather than fact”.

So Klemeš, at any rate, has a physical image of how hydrological accumulations (lakes, glaciers, clouds, etc.) can lead to complicated stochastic processes.
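A toy version of the storage idea can be sketched in a few lines of Python. The model below is an assumption made for illustration, not Klemeš' actual formulation: storage has a fixed bottom at zero and no upper limit, net inflow is white noise, and a constant draft equal to the mean inflow is withdrawn each step. The parameter values and function names are hypothetical.

```python
import random

def simulate_storage(n, mean_inflow=1.0, draft=1.0, seed=0):
    """Storage with a floor at zero and no ceiling, fed by white-noise net inflow."""
    rng = random.Random(seed)
    s = 0.0
    inflow, storage = [], []
    for _ in range(n):
        x = rng.gauss(mean_inflow, 1.0)   # white-noise net inflow
        s = max(0.0, s + x - draft)       # can run dry, can grow without limit
        inflow.append(x)
        storage.append(s)
    return inflow, storage

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x)
    c1 = sum((x[i] - mean) * (x[i + 1] - mean) for i in range(n - 1))
    return c1 / c0

if __name__ == "__main__":
    inflow, storage = simulate_storage(5000)
    print("lag-1 autocorrelation of inflow :", lag1_autocorr(inflow))
    print("lag-1 autocorrelation of storage:", lag1_autocorr(storage))
```

The input has no memory at all, yet the storage series is strongly persistent, simply because the reservoir accumulates. That is the flavour of the point: persistence can come from physical storage and does not require any exotic statistical assumption.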

Back to Benestad and his argument that Cohn and Lins "pitch statistics against physics". Benestad states of the ARMA, ARIMA, FARIMA models etc.:

these models are not necessarily representative of nature – just convenient models which to some degree mimic the empirical data. In fact, I would argue that all these models are far inferior compared to the general circulation models (GCMs) for the study of our climate, and that the most appropriate null-distributions are derived from long control simulations performed with such GCMs

Benestad goes on:

statistics is a powerful tool, but blind statistics is likely to lead one astray. Statistics does not usually incorporate physically-based information, but derives an answer from a set of given assumptions and mathematical logic. It is important to combine physics with statistics in order to obtain true answers.

Wait a minute – isn’t this the same guy who wrote about record-breaking events on the basis of i.i.d.? Why didn’t we hear about "pitching statistics against physics" at that time? Benestad then goes on to the following amusing argument:

One difficulty with the notion that the global mean temperature behaves like a random walk is that it then would imply a more unstable system with similar hikes as we now observe throughout our history. However, the indications are that the historical climate has been fairly stable.

First, Cohn and Lins never use the term "random walk", which has a very specific technical meaning. A FARIMA process is not the same as a "random walk" – why twist words when you don’t need to?

Second and this is fun – here is the image at the URL illustrating the "fairly stable" historical climate:

Needless to say this is MBH – how can this be used as proof of a fairly stable climate when it is in question? Maybe this isn’t "tuning" GCMs, but it’s sure tuning discourse. The other millennial citations are Jones et al [1998], Mann and Jones [2003] etc.

As to the existence of a "fairly stable" climate: I presume that Benestad includes the development of continental-scale glaciers over Toronto within "fairly stable". I recently read an article by Nicola Scafetta in which stochastic processes (very long-tailed) were identified in solar behavior; he discerned similar patterns in earth temperatures. Without opining on the substance of any of the conclusions, the identification of stochastic behavior in the sun (or on the earth) is hardly inconsistent with the laws of physics. But not according to Benestad:

An even more serious problem with Cohn and Lins’ paper as well as the random walk notion is that a hike in the global surface temperature would have physical implications – be it energetic (Stefan-Boltzmann, heat budget) or dynamic (vertical stability, circulation). In fact, one may wonder if an underlying assumption of stochastic behaviour is representative, since after all, the laws of physics seem to rule our universe…

And, to re-iterate on the issues I began with: It’s natural for molecules under Brownian motion to go on a hike through their random walks (this is known as diffusion), however, it’s quite a different matter if such behaviour was found for the global planetary temperature, as this would have profound physical implications. The nature is not trendy in our case, by the way – because of the laws of physics.

As to Benestad’s claim that:

statistics is a powerful tool, but blind statistics is likely to lead one astray.

it is hard to find a better example than the Hockey Team itself. The "confidence intervals" in Hockey Team reconstructions are a confidence game. As I’ve discussed elsewhere (most recently AGU PPT), their confidence intervals are calculated from residuals from an over-fitted and mis-specified model in the calibration period, rather than from the verification period. Since the verification period R2’s are ~0, the standard errors are the same as natural variability, whatever that is, and no confidence whatever can be attached to the reconstruction.

33 Comments

Steve,
In his record breaking statistics, Rasmus referenced the “record number of typhoons over Japan in 2004.” This is a lovely example of a local condition. Most of the typhoons in 2004 missed the rest of Asia and hit Japan. I was in Taiwan during this period and we had very few typhoons (much below normal) in 2004. The storm track in 2004 was atypical, but the number of Pacific tropical cyclones was not.

A very famous physics predecessor once dismissed a theory because he disliked the idea that statistics was an essential part of the physics; the immortal phrase is that “God does not play dice”. Needless to say, Einstein was wrong.

Here, a particular type of physics is dismissed because it follows the wrong type of statistics…
’nuff said
per

A comment at realclimate: “think how quickly temperature dissipates from day to night – there is no physical mechanism by which the atmosphere can carry additional warmth from one year to the next.” But surely oceans can. Also glaciers represent a carryforward. Don’t glaciers have something to do with Ice Ages?

I agree that oceans and ice fields would seem to be major means of storing energy over the long term. However, the premise of the borehole temperature measurements is that even the ground can store heat energy over very long periods (and transmit it to lower depths allowing the gradient to be measured as a profile of surface temperature changes many years later). Furthermore, as you noted in the original comment, very long-tailed stochastic processes have been identified in solar behavior, while other quasi-periodic astronomical phenomena (eg changes in earth orbital parameters, fluctuations in cosmic ray influx etc) that have been identified as possible climate drivers would also lead to long period autocorrelations in climate variables.

Re: #3 Very strange comment from Kaufmann, seeming to be directly contradicted by their finding (as I understand it) of a Granger causation connection between temperatures in the northern and southern hemisphere to the effect that AGW is passed between the hemispheres on an annual timescale.

re: #6 They may be equivalent, though I could be wrong. A random walk series is simulated with a cumulative sum – it is a random walk because it is reacting primarily to its current state, which is the sum of past states, and where past state is stored, you get the characteristic ARMA behaviour. There is nothing about random walk that contradicts physical laws. And a walk can be bounded between quasi-stable states, at a range of scales, as is climate.

RC may have (at least temporarily) allowed your post, but neither “Cubasch” nor Cubash (as you posted) show up in RC’s own blog word search (“Sorry, no posts or comments matched your criteria”). More likely (and perhaps more cynically), the RC team is now in draft on how Bürger and Cubasch ’05 totally vindicates the hockey stick, and that another stake has been driven into the hearts of AGW skeptics everywhere. Who will likely lead such an RC comment post? My money bet is on Professor Connolley, with a cover bet on Professor Pierrehumbert, for reasons I will for now withhold in such enlightened and polite company.

Steve, I see your silly, childish naming of Rasmus Benestad as ‘Mr I.I.D.’ and I just think ‘jeez, this man’s really a grandfather???’.

For heaven’s sake grow up and show the kind of respect to others that I show to grandparents and that you also merit (I still have one venerable GP btw)! And don’t give me the retort along the lines of ‘but, they [naughty boys over there] do it [sir]’. JUST GROW UP! Who’s going to respect such silliness???

Fwiw, I think R.B.’s piece jolly interesting. And I suspect yours would be if I felt like giving it the time.

I would like to support Peter Hearnden’s comment. Try to provoke by interesting arguments instead of pre-pre college humor. Also the remarks on “who is married and in bed with whom” in the Hegerl et al. comment are simply distasteful. You have your arguments and people are listening; there is no reason for attacks against people you don’t even know personally.

I came very close to posting something up at RealClimate when I first saw this topic. But when I framed my reply, I hit on a problem. The text is so bad it is difficult to know where to begin in criticising it. There are so many options.

Interestingly, there is an underlying true statement that rasmus keeps referring to – it is essential to relate the statistics to the underlying physical process. This much is true and unarguable, which is why he keeps referring to it in his replies; but the text itself weaves away from this point on to all kinds of other issues which do not logically follow from this axiom.

His argument that GCMs represent the best basis for establishing the statistics of the null hypothesis requires a whole stack of other assumptions. The first, and most important, is that it requires belief that GCMs adequately capture the underlying physics. And we are not just talking getting the mean of the distribution right (which he concentrates on) – to accept or reject the null hypothesis, we need to be confident that the GCMs adequately represent the tails of the distribution. Yet he does not even introduce this point, let alone provide credible support for it.

Compared to my field (remote sensing), this would be regarded very much as using the tail to wag the dog. We do build up extremely complex models, but the output of such complex models is rarely trusted without extensive validation using statistics of real world measurements. And this is for very good reason! We have an expression for those who build complex models, and use their outputs as a gold standard with insufficient justification – these are scientists who spend too long drinking their own bath water. They tend to start believing their own anomalous results and can’t understand why nobody else will give any credibility to their “breakthroughs”.

Of course, the fact that this same poster made such a crude assessment of temperature records, and then criticises attempts to investigate more advanced statistical methods, does leave you wondering why the rules seem to change so quickly to suit the current argument. Speaking of which, apparently comparisons to weather forecasting are in vogue this week as well.

Footnote: For a crowd so determined to claim that MBH is irrelevant to the scientific debate today, they seem to lean a little too heavily on it in presenting their arguments.

OK, I’ll edit out the Mr IID. I wanted to somehow convey the ridiculousness of Benestad’s statistics. To call it neolithic would be to give neanderthals a bad name. It is really awful even by Hockey Team standards. I agree 100% with Spence above.

#10 – epica: I hear a lot about “independent” confirmation of hockeystick results. In securities definitions of “independence”, one’s wife (or children) is not “independent”. It would be entirely appropriate to point out that, to take a prominent Canadian example, Belinda Stronach, as a director of Magna, was not “independent” of Frank Stronach. I mean no more and no less – it’s not intended as gossip. I don’t see that there’s anything distasteful about it and certainly nothing distasteful was intended. If, on reflection, you think that the point remains a distraction, I’ll take it out, but only on the basis that I’m not attempting to insert distractions.

Re: #11. I like your nuance on this. There are so many things wrong with the post at RC, but wrong in interesting ways. There is now a huge body of literature to show that the underlying stats of GCMs do not capture reality in many important ways; it’s a matter of what you think is important, I suppose. Thanks for the ‘drinking the bathwater’ statement. I am actually fine with Mr IID – it seems like an occupational hazard for scientists to be full of themselves – and harmless nicknames are amusing.

“think how quickly temperature dissipates from day to night – there is no physical mechanism by which the atmosphere can carry additional warmth from one year to the next.”

Not that much gets lost at night. Which is why there are no violent winds just after sunset.

Energy from the sun is stored in a variety of ways, such as directly in ocean temperatures and in snow and ice, but also by ocean currents both on the surface and at depth and also oceanic oscillations like ENSO and the NAO which have both short and long period (multi-year and multidecadal) components.

The Naturally Trendy thread at realclimate has really turned into a train wreck for them. It’s rather fun to watch. I’ll bet that postings on this thread mysteriously dry up, as Gavin and William try to dig out from the wreck.

Re: #19
not only have i said what I think, I have set out my reasons for so doing, both here and on RC. You can try to reason with my bluntly expressed logic; or you can complain that I have clear views.

What do you think Peter ? perhaps you believe that we are all right ? Or that RC is right ?

Or perhaps, you are not forming a view on the facts of the matter at all ?
yours
per

Dear Peter
if you go to post # 28 at the realclimate link, you will see more detail.

As steve points out eloquently in another post, it is a matter of fact whether the temperature series show autocorrelation, or other statistical parameters. You cannot change the facts by appeal to a hypothesis (the GCMs) which is not validated for this issue. It is also a little bit over the top to take a model, with known and multitudinous defects, and claim that this is a perfect embodiment of physics.

“I tend to the view RC has it broadly right most times. That’s my opinion.”

Whether RC has it right most times- or not- doesn’t matter to me. What I find appalling is that- apparently- you fail to come to grips with any aspect of the arguments raised. You haven’t said what is wrong with steve’s arguments, or anyone else’s. You simply say that RC is broadly right; presumably because the web site has got such a nice colour scheme, or for some other reason that you can’t bring yourself to mention.

You don’t even acknowledge that Cohn and Lins passed peer-review in the scientific literature, whereas RC is just a blog. Perhaps you should consider why the comments of Cohn and Lins, which have such important implications for many climate reconstructions, got past peer-review. Perhaps- you might even consider the arguments.

Could someone please explain Benestad’s response to me. Don’t even need to rip it apart…it’s just so terse that I don’t know what he means, much less what he would be making in terms of implications or what would be wrong with it.

Per, re 22, either stick to your idea of logic and science or to insults; doing both is difficult to pull off convincingly without looking a tad inconsistent (at best). Why would I have to acknowledge something obviously peer reviewed is peer reviewed and that a blog is a blog?

TCO, indeed, and, like you, I cheer those I think on the right tracks.

“Why would I have to acknowledge something obviously peer reviewed is peer reviewed and that a blog is a blog?”

rasmus starts off in his blog piece by criticising a piece of peer-reviewed science. In his first paragraph, he tells us that Cohn and Lins “in essence pitch statistics against physics”. That’s not what they are doing, and the piece wouldn’t have passed peer-review if it was; it is a ludicrous comment anyway. There is a reason Cohn and Lins passed peer-review, and there are multiple reasons that rasmus’ diatribe is in a blog.

However, since your theme is that you don’t even consider the arguments, feel free to cheer for RC anyway.
yours
per

TCO – I see that Rasmus has told you that GCMs do a “reasonable” job of replicating the autocorrelation structure of temperature and temperature proxies. Ask for a reference – I’m unaware of specific evidence that they do (but I’m not conversant with much GCM literature). Pelletier cited Manabe and Stouffer as showing that continental gridpoints had “flat spectra up to time scales of about 100 years in contrast to observation”.

Finally got around to reading “Naturally Trendy”. It’s amazing. They show that with the type of “d” values found in temperature datasets (~0.3 – 0.4), the p not statistically significant (p = ~ 7%) Zowie, I didn’t expect that.

Last post got munched, I used the “less than” symbol … wrong. Here’s the full post:

–

Finally got around to reading “Naturally Trendy”. It’s amazing. They show that with the type of “d” values found in temperature datasets (~0.3 – 0.4), the (p less than 5%) trend levels are exceeded, not 5% of the time as we’d expect, but some 30% – 40% of the time … I had not realized it was that bad.

The paper also shows that the annual NH temperature trend of the last 150 years is not statistically significant (p = ~ 7%) Zowie, I didn’t expect that.

6 Trackbacks

[…] My view is that Kyoto has always been a political and economic treaty having little – maybe even nothing – to do with the environment. This report confirms what was obvious on the surface although bureaucrats and climatologists did not see this one coming (economics being outside their field). Regardless, there is a really interesting online discussion of statistics, physics, economics models, hydraulics models, GCMs, etc, here and good questions on the former here. […]

[…] and here), and I wonder whether this notion seems to be a bit difficult to grasp for some scholars (here and here). It is a new way of looking at the data, and I notice that even the SRES report (p. 125) […]