More Upside-Down Mann

Previously, we discussed the upside-down Tiljander proxies in Mann et al 2008. Ross and I pointed this out in our PNAS comment; Mann denied in his reply that they were upside down. That denial is untrue (as Jean S and UC also confirmed).

Andy Baker’s SU967 proxy is used in Mann 2008 and is one of a rather small number of long proxies. With Andy’s assistance, we’ve got a better handle on this proxy; Andy reported that narrow widths are associated with warm, wet climate.

I checked the usage of this proxy in Mann 2008. Mann reported positive correlations in early and late calibration (early: 0.3058; late: 0.3533). Thus, the Mannomatic (in both EIV and CPS) used this series in the opposite orientation to that of the original studies (Proctor et al 2000, 2002), joining the 4 Tiljander series in upside-down world.
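To see how a sign-agnostic calibration step can end up using a series in the opposite orientation to its physical interpretation, here is a minimal sketch with synthetic data (the series and coefficients are invented for illustration; this is not Mann's actual code):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic instrumental "calibration" temperature: warming trend plus noise.
years = np.arange(1850, 1996)
temp = 0.005 * (years - years[0]) + rng.normal(0.0, 0.1, years.size)

# Synthetic proxy recorded in its physical orientation: larger values mean
# COOLER conditions, so the raw series is anti-correlated with temperature.
proxy = -2.0 * temp + rng.normal(0.0, 0.3, years.size)

# Sign-agnostic calibration: correlate, then flip the series so that it
# correlates positively with temperature -- regardless of the physical
# interpretation attached to the original measurement.
r = np.corrcoef(proxy, temp)[0, 1]
oriented = np.sign(r) * proxy   # r < 0 here, so the series gets inverted
```

The algorithm reports a "positive correlation" for the flipped series even though the measured quantity is physically anti-correlated with temperature.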

I remember the first months when I studied your work with Ross (and therefore MBH, too) – and the idea that the linear methods are used strictly, so that the weights of each proxy could have both signs, looked completely crazy to me. Thermometers that one trusts even though he has no idea whether the mercury goes up or down… 😉

I feel that for each proxy, there exists a natural expectation about the sign – independently measured otherwise – and any reconstruction that generates too many unexpected signs for the coefficients of proxies should be abandoned.

Another worry is that the nonlinear dependence of the proxies can be such that they flip the sign in the critical region – so that the proxy behaves like ±(ΔT)², going up (or down, but in the same direction) whether the surrounding temperature is becoming very warm or very cool.

These must clearly be avoided, too. So if the proxies are to be usable in a linearized calculation, one should guarantee that the linearization holds, i.e. that the proxies react linearly: the linear term must exceed the quadratic and other nonlinear terms within the range of temperatures. When it does, it usually means that it must be possible to find out ex ante what the sign of this “large” linear term actually is, right?
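The dominance condition can be checked numerically. A sketch with an invented quadratic response (all coefficients are made up for illustration):

```python
import numpy as np

# Illustrative quadratic proxy response P(dT) = b*dT + c*dT**2
# (coefficients invented for the sketch).
b, c = 2.0, 0.5
dT = np.linspace(-3.0, 3.0, 601)   # temperature anomaly range of interest

linear = b * dT
quadratic = c * dT**2

# Linearization is defensible only where the linear term dominates:
# |b*dT| > |c*dT**2|, i.e. |dT| < |b/c|.  Here |b/c| = 4, so the whole
# range qualifies (except dT = 0, where both terms vanish), and the sign
# of the "large" linear term (positive here) is knowable ex ante.
dominated = np.abs(linear) > np.abs(quadratic)
```

When |ΔT| approaches |b/c|, the ±(ΔT)² behaviour described above takes over and the sign of the calibrated response becomes ambiguous.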

Do you think that there can exist – at all – any “adaptive”, “fixed MBH” algorithm that picks the proxies dynamically (i.e. without ex ante selection criteria), according to some recent test period, which still avoids the “mining for the hockey sticks” problem? Or is the mining a completely general problem of all adaptive algorithms?

“fixed MBH” algorithm that picks the proxies dynamically (i.e. without ex ante selection criteria), according to some recent test period, which still avoids the “mining for the hockey sticks” problem?

SteveM’s much more familiar with the algorithms used, but from my own experience with this data and a whole bunch of image processing, I’d say there is no way an algorithm can tell. The noise level is too high compared to the signal, so if you go looking for any particular pattern as your criterion, you’ll find it.
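The "mining" problem is easy to demonstrate with synthetic data: screen pure red noise against a recent trend, orient each surviving series positively, and the composite acquires a blade. A minimal sketch (all parameters illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n_series, n_years, cal = 1000, 600, 100   # cal = length of recent "test period"

# Pure red noise "proxies": AR(1) series containing no climate signal at all.
noise = np.zeros((n_series, n_years))
noise[:, 0] = rng.normal(size=n_series)
for t in range(1, n_years):
    noise[:, t] = 0.9 * noise[:, t - 1] + rng.normal(size=n_series)

# Screening rule: orient every series so it correlates positively with an
# upward trend over the calibration period, then keep the good correlators.
trend = np.arange(cal, dtype=float)
r = np.array([np.corrcoef(s[-cal:], trend)[0, 1] for s in noise])
oriented = np.sign(r)[:, None] * noise
selected = oriented[np.abs(r) > 0.3]

recon = selected.mean(axis=0)
# The composite of selected series rises through the calibration period (a
# manufactured "blade"), while before it the average relaxes toward zero --
# the shape comes from the selection rule, not from any signal in the data.
```

Hundreds of the thousand noise series pass the screen, and their average reproduces the hockey-stick shape without any signal being present.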

Question: what does the Y axis indicate in the graphs? While it is obvious that the graphs are inverted, the corresponding data seems the same (unless there is a change in sign that is not evident in the graphs reproduced). Year 1450 in both graphs still seems to correlate with 95 in the Y axis – regardless of which way it points. Does it matter? This question comes from an artist who support$ this site! I’m just trying to understand! Thanks!

The initial mistake(s) of inversion are perhaps understandable given the number of proxies involved (though this reflects not so well on QA procedures at the Mann end and very poorly on the peer-review end) – but OK, this sort of thing happens.

However a normal scientist would take on the advice of these errors and correct the original work and publish an acknowledgement somewhere – they might not be too happy about it but they would do it for the general betterment of the science.

But not Mann and his crew. They don’t even acknowledge the error despite the confirmation of third party experts – they really are living on another planet and richly deserve the eventual ignominy that will come their way.

Thanks for your response, Tim, but I don’t feel like my question was answered. If both graphs indicate the exact same values in the Y axis (no sign change as far as I can see), but just change the direction of the Y axis, what is the difference? Again, in both graphs, 1450 correlates with ~95 in the Y axis – not -95 in one and 95 in the other, unless I am grossly mistaken, which could be true :). I could swap the Y axis for the X axis and still achieve the same correlation; how would that perpendicular orientation change our perception of the relation between the two axes? It shouldn’t at all.

Re: Matthew (#19), Matthew, I think you’ve missed the point. Whilst the numbers are the same – heck, it’s just plotted upside down – once it’s been through the mannomatic it’ll be a temperature proxy, but an upside-down one. Forget the numbers, look at the shape; the mannomatic removes all relevance of the numbers on the left.

Steve Mc is right – it matters not since the Gore film. A significant majority of Western populaces firmly believe the hockey stick is the truth. Policy trains have left the station now, only large NO votes will derail them. This is unlikely to happen any time soon.

I’m saddened by this perversion of Science, but at my age (65+) I take solace in these points:

1) I feel privileged to have observed the growth of a major religion in my time. I’ve very often wondered how this happens (large numbers of people coming to believe insane things) and now I’ve actually seen it

2) culture evolves too in opportunistic and unpredictable ways – this now includes trashing scientific method. Evolution-deniers are unable to see this, but the comparable patterns between physical and cultural evolution are very clear. Cultural evolution is just very much faster.

Re: Matthew (#22), Well, what level do you want it at? The suboptimal Fortran Steve has got hold of, or just an overview? If the latter: the mannomatic will turn any input into standard-deviation units, whether the scale is 0..1 or 0..1000, and is programmed to ignore the sign of the source, programmatically choosing a +1/-1 multiplier based on… well, nothing physical.
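The scale-invariance point – that standardization reduces any series to "standard deviation units" regardless of whether it runs 0..1 or 0..1000 – can be shown in a couple of lines (a sketch, not the actual code):

```python
import numpy as np

rng = np.random.default_rng(2)
shape = rng.normal(size=200)   # some arbitrary series "shape"

def standardize(v):
    # Reduce any series to zero mean and unit variance
    # ("standard deviation units").
    return (v - v.mean()) / v.std()

small = 0.5 * shape + 0.5    # plotted on a 0..1-ish scale
big = 500.0 * shape + 500.0  # the same shape on a 0..1000 scale

# After standardization the two are indistinguishable: the algorithm sees
# only the shape, never the original units.
```

This is why the numbers on the left-hand axis carry no weight once a series has been through the machinery: only the shape survives.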

The numbers are in fact the same; the “interpretations” by Mann and Baker are exact opposites. As Steve M has plotted them, both plots are oriented with warm pointing up. So according to Baker, values of 10 indicate warmer conditions than values of 100. This is based on the physical relationship of the measured proxy with temperature.

According to Professor Mann (with the blunt instrument), values of 100 indicate warmer conditions than values of 10. This is based on no physical relationship whatsoever, simply on a fortuitous correlation with his temperature record of choice. It implies the exact opposite physical relationship to that determined by Andy Baker.

Obviously they are radically different interpretations of the same data.

Not exactly, if I understand the Mann-type approach correctly. It does not initially attribute a particular physical meaning to a proxy at all, nor does it suggest a physical interpretation as a result. Thus the sign does not count, nor are local aspects considered. It just fishes or mines for proxies sharing a signal with the instrumental record, explaining everything through teleconnections.

Doing it better would, in my opinion, need a lot of work of an expert team to impose all necessary constraints for physical plausibility and local connectedness – not just milling the accumulated dataset through a standard routine (# 29).

Re: Ulises (#49), You are right of course about what their data mining algorithm does. Of course, it assumes a) that all input data are plausible proxies, b) that a valid proxy must correlate with the local grid climate data or local weather station data and c) that a proxy can be arbitrary in its response compared to other proxies. But if a particular proxy MUST respond in a certain way to warming (due to physics, or tree growth, or whatever), but it does not show warming in a particular location, then either it is not a valid proxy or that location cooled while the world warmed (which is not impossible). In the latter case it is invalid to throw it out. BUT you can’t have the same proxy type responding positively at some times and negatively at others, or positively in some locations and negatively in others. Then you have gone down the rabbit hole.

Re: Craig Loehle (#50), I believe the mining algorithm is even worse than you state. Many times the correlation is found not to local climate data, but to hemispheric or global data. This seems to be the case with the strip-bark bristlecone pines used in MBH98/99, whose growth showed no meaningful correlation to local temperature or precipitation (which is why Idso and Graybill studied them), but happened to correlate nicely to global (or at least NH) temperatures.

(I recently re-read the Graybill/Idso paper, and the only statistically significant correlation they found was a moderate negative correlation to the previous winter’s temperature.)

This leads to yet another problem in attributing a physical causation mechanism. The team’s use of “teleconnections” in this regard has been widely mocked here.

Simple fix – you have a large database of paleo data assembled and the tools to assemble these data for climate reconstruction.

Publish an alternative reconstruction that is open to peer review. If this differs from the IPCC, then you have a chance to influence things. If you can only dispute results published by others with minor quibbles, then it is only noise! If it really matters, it will speak for itself!

Re: L R. Kerr (#29), Oh puhleeze! This “publish an alternate reconstruction!” and “you’re just quibbling with other results without any of your own!” nonsense. You seem to be under the false impression that Steve is trying to make positive claims about climate variability based on the work he has done. Except that he isn’t. He is quite rightly pointing out that the basis for others making positive claims is dubious. Doesn’t proper methodology mean anything to you?

Simple fix – you have a large database of paleo data assembled and the tools to assemble these data for climate reconstruction.
Publish an alternative reconstruction that is open to peer review.

This, to me, appears to be a complete misunderstanding of Steve McIntyre’s position. He has stated many times that much of the “paleo data” cannot be considered a proxy for past conditions. It is just noise. The fact that major paleoclimate researchers publish work in PNAS and Science with “upside down” proxies, as shown in this blog posting, is evidence of this.

Re: TAG (#46), I believe that Steve would do his cause a favor by submitting an alternate paleoreconstruction to a peer-reviewed journal, even if the conclusion were that paleodata is mostly “just noise.” Publication in peer-reviewed journals can be and indeed has been done by hockey-stick skeptics. See Loehle and McCulloch, Energy and Environment 19: 93 (2008). This work finds that global mean temperature during the Medieval Warm Period was a bit higher than around 1935. Including the instrumental data for the rest of the 20th century would push the modern end of the curve above the M.W.P. Bottom line: not a hockey-stick shape, but still an unusual rate of global warming compared to at least the previous 1000 years.

Re: Curt Covey (#51), I too encourage Steve to publish his stuff in journals. I’m surprised he hasn’t gotten caller ID by now to screen my phone calls on the subject. Even without going into proxy problems there is a lot that needs to be said simply through an exposition of the stats literature on confidence interval construction for inverse multivariate calibration problems. We hinted at it in our PNAS letter.

Re: TAG (#46), I believe that Steve would do his cause a favor by submitting an alternate paleoreconstruction to a peer-reviewed journal, even if the conclusion were that paleodata is mostly “just noise.”

How does one get a paper published showing that a reconstruction is noise? Currently available reconstructions could probably be analyzed and determined to be mostly noise and/or to have uncertainty limits from floor to ceiling.

I do not think that the current state of climate science is ready to accept much in the way of publications that show the weakness of the reconstructions already published. Unfortunately, few people, including climate scientists, appear to have a comprehensive understanding of the intricacies and nuances of the methodologies used in reconstructions. The climate science community as a whole has been rather silent in response to the critiques of these methodologies.

I think the pressure has to be kept on pointing to the methodology weaknesses – and perhaps, as a sensitivity test, showing how alternative reconstructions can produce contradictory results.

Steve M has no “cause” that I can see other than getting the methods right.

Reconstruction of the earth’s surface temperature from proxy data is an important task because of the need to compare recent changes with past variability. However, the statistical properties and robustness of climate reconstruction methods are not well known, which has led to a heated discussion about the quality of published reconstructions. In this paper a systematic study of the properties of reconstruction methods is presented. The methods include both direct hemispheric-mean reconstructions and field reconstructions, including reconstructions based on canonical regression and regularized expectation maximization algorithms.
The study will be based on temperature fields where the target of the reconstructions is known. In particular, the focus will be on how well the reconstructions reproduce low-frequency variability, biases, and trends.
A climate simulation from an ocean–atmosphere general circulation model of the period A.D. 1500–1999, including both natural and anthropogenic forcings, is used. However, reconstructions include a large element of stochasticity, and to draw robust statistical inferences, reconstructions of a large ensemble of realistic temperature fields are needed. To this end a novel technique has been developed to generate surrogate fields with the same temporal and spatial characteristics as the original surface temperature field from the climate model. Pseudoproxies are generated by degrading a number of gridbox time series. The number of pseudoproxies and the relation between the pseudoproxies and the underlying temperature field are determined realistically from Mann et al.
It is found that all reconstruction methods contain a large element of stochasticity, and it is not possible to compare the methods and draw conclusions from a single or a few realizations. This means that very different results can be obtained using the same reconstruction method on different surrogate fields. This might explain some of the recently published divergent results.
Also found is that the amplitude of the low-frequency variability in general is underestimated. All methods systematically give large biases and underestimate both trends and the amplitude of the low-frequency variability. The underestimation is typically 20%–50%. The shape of the low-frequency variability, however, is well reconstructed in general.
Some potential in validating the methods on independent data is found. However, to gain information about the reconstructions’ ability to capture the preindustrial level it is necessary to consider the average level in the validation period and not the year-to-year correlations. The influence on the reconstructions of the number of proxies, the type of noise used to generate the proxies, the strength of the variability, as well as the effect of detrending the data prior to the calibration is also reported.

The study focuses on low-frequency amplitude variation and concludes

The underestimation of the amplitude of the low-frequency variability demonstrated for all of the seven methods discourages the use of reconstructions to estimate the rareness of the recent warming. That this underestimation is found for all the reconstruction methods is rather depressing and strongly suggests that this point should be investigated further before any real improvements in the reconstruction methods can be made. Until then, smaller improvements may be possible by obtaining more and better proxies.
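The amplitude underestimation reported here is, at root, classical regression attenuation: noise in the proxy shrinks the calibration slope toward zero. A toy pseudoproxy experiment shows the effect (all parameters invented; this is not the paper's actual setup):

```python
import numpy as np

rng = np.random.default_rng(3)

# True "temperature": strong low-frequency variability (a slow oscillation).
t = np.arange(1000)
truth = np.sin(2 * np.pi * t / 500.0)

# Pseudoproxy: the truth degraded with white noise (noise variance > signal).
proxy = truth + rng.normal(0.0, 1.0, t.size)

# Direct regression: calibrate temperature on the proxy over the last 150
# "years", then reconstruct the full period.
cal = slice(850, 1000)
b = np.cov(truth[cal], proxy[cal])[0, 1] / np.var(proxy[cal], ddof=1)
a = truth[cal].mean() - b * proxy[cal].mean()
recon = a + b * proxy

# Classical attenuation: proxy noise shrinks the slope toward zero, so the
# reconstruction underestimates the low-frequency amplitude.
amp_ratio = recon.std() / truth.std()
```

The reconstructed amplitude comes out well below the true amplitude, and adding more equally noisy proxies reduces the variance of the estimate but does not remove the bias.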

Re: Geoff (#69), Great reference, thanks. Also note that my paper in Mathematical Geology (2005) showed that the effect of dating error within proxies is ALSO to damp down the amplitude of peaks. Sometimes, the answer is that we can’t obtain the answer we want, and just adding more data doesn’t cut it. It is like taking a larger survey about people’s sex habits when we know up front that people lie about this on surveys: a larger sample just gives you more of the biased data.
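The damping effect of dating error is easy to see in a toy stacking experiment: give several copies of the same sharp event independent age offsets and average them (the numbers are illustrative, not taken from the Mathematical Geology paper):

```python
import numpy as np

rng = np.random.default_rng(4)

t = np.arange(500)
event = np.exp(-0.5 * ((t - 250) / 15.0) ** 2)   # a sharp "event" at year 250

# 30 proxies record the same event, each with an independent dating error of
# up to +/- 40 years; stack (average) them as a reconstruction would.
shifts = rng.integers(-40, 41, size=30)
stack = np.mean([np.roll(event, s) for s in shifts], axis=0)

# The stacked record preserves the event's timing only roughly, and its peak
# is much lower and broader than in any single record: dating error alone
# damps the amplitude.
```

Every individual record reaches the full peak height; the stack never does, no matter how many jittered records are added.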

I’ve observed for a long while that the general populace obdurately resists that part of scientific methodology which actually tests hypotheses … it seems a significant majority of people prefer to believe alarming news rather than cross-examine it. To me, a strange part of human nature, but there it is. A consequence of this is that appeals to “science” are used to justify untested hypotheses, but when reality finally mugs them then “science” is blamed. No reputation can overcome that Catch-22.

I am going to write a somewhat naive contribution, so please be tolerant.

There’s not yet been a mention of “Calibration” in this thread, as far as I can see (open to correction of course). Also, I’ve not noticed anything on “splicing”. It seems to me that these operations may be at the heart of the difficulties that are so evident in many climate related analytical operations. They are surely intimately connected, too.

Thus I ask why is it necessary to “calibrate”? Calibration implies the existence of a gold standard, which must presumably be closely related to the modern instrumental record. “Record of what?” I expect you will ask. There’s endless choice available! If we confine our search to measured temperatures these can be significantly pruned. So-called global means, regional means, single site values (after all, proxies come from single sites, do they not?) satellite data, or whatever takes your fancy just so long as it is deemed to be as accurate and precise as the underlying technology permits. Whatever your choice, the existence of some sort of recognised and fully defined gold standard is essential to any calibration procedure. I presume that calibration is required because we wish eventually to be able to quote some statistic that has “Temperature” as its dimension, and which can be converted into “Anomaly” to conform to climatological etiquette.

Now let’s assume that such a standard cannot be agreed upon – which is not at all improbable. What is to be done? Here’s my suggestion in a rudimentary form. First, select a time period for which as many potential contributors to one’s climate library as possible are available. (For Mann’s HS data set this is from 1820 to about 1970, for which all 112 of his choice of measurements are available.) Now normalise (standardise to mean zero, variance one) all data columns. This operation reduces every type of observation to the same weight, which one could adjust arbitrarily if so desired – as Mann did in some cases, I believe.

Now, simply average across the years, or other appropriate time interval, to produce a value that would have a fully defined and simple provenance. One could also average across any sensible (mutually exclusive, of course) subsets of the data columns to produce values which might then be compared. As an example, for the HS data this might be actual temperatures (or their scaled values – which Mann provided, with undisclosed scaling), precipitation data, ice accumulation data, and so forth. I also used assemblies of data labelled as …PC when I did all this.

Underlying these operations is the presumption that the “proxy” data are reasonably sensible ones from the viewpoint of knowledgeable and experienced scientists. There need at this stage be no presumption that the proxies “calibrate” satisfactorily with REAL temperature data. However, this will become apparent in subsequent operations. The next step is to plot the averages against time, and the result will be a simple overview of the whole data assembly.

In practice, the result is likely to be a rather messy diagram, possibly lacking clear indications of the behaviour of the climate over the period under consideration. To help interpretation, it can be very instructive to treat the data as if they were the outcome of a production line, being sampled at regular time intervals (one year, typically) for quality. The classical QC engineer of the 1950s to 1970s would have tried the cumulative sum method when perusing the information for stability, gradual changes and steps, with the intention of identifying patterns of behaviour. With climate data it turns out that such patterns can be very revealing. For the HS data as a whole set, it seems that effectively no change took place from about 1930 until the data finished in 1971. Clearly no sustained and large increase! Of course, because there was no calibration, the vertical scale was z-scores – or something very close to that. Calibration could be attempted, but would be rather tentative, I think.
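For readers unfamiliar with the cumulative sum method mentioned above, a minimal sketch (the data, change point, and step size are invented):

```python
import numpy as np

def cusum(x):
    """Classical QC cumulative-sum chart: running sum of deviations from the
    overall mean.  A level shift in x appears as a change of slope; the kink
    marks the change point."""
    x = np.asarray(x, dtype=float)
    return np.cumsum(x - x.mean())

rng = np.random.default_rng(5)
# Noisy series with a one-standard-deviation step up at year 60 of 100.
series = np.concatenate([rng.normal(0.0, 1.0, 60), rng.normal(1.0, 1.0, 40)])
c = cusum(series)
# c drifts downward before the step (values below the overall mean) and
# upward after it; the trough sits near the change point, and by
# construction c always returns to ~0 at the end.
```

The chart turns a small, hard-to-see level shift into an easily spotted change of slope, which is why it was a workhorse of production-line QC.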

Would similar methods be useful with the much larger assemblies that the team now seem to be using? No doubt it could be done in R – but I couldn’t :-((

Energy and Environment is peer-reviewed but it is edited by a skeptic and it seems she does not ask the alarmists (Michael Mann, James Hansen, Gavin Schmidt et al) to review any of the articles. I think that is how the rumor started that it is not peer-reviewed. But we have seen the faults of peer review in many journals, including Nature and Science among others. No peer-review system is perfect. The real question is: Does the study hold up to critical examination? Many of the papers published by E&E do hold up much better than papers published in Nature.

I’ve looked at proxies from both sides now
From up and down
And still somehow
It’s Mann’s illusion I recall
I really don’t know speleothems, paleoclimatology, dendroclimatology, Finnish lake sediments, clouds, at all

What is Principia Mannomatica? Is that a term you coined? To what, exactly, does it refer? Does it refer to the errors Steve has identified? AFAIK, all of the errors have never been cataloged in one place.

Ron, “Principia Mannomatica” is indeed a term I coined, fashioned after Sir Isaac Newton’s Principia Mathematica.

Mann (and others) consider his hockey stick so important that it seemed only fitting that the statistical techniques he has pioneered (such as data inversion) should be published in their own great work.

Steve uses the term “Mannomatic” to describe the techniques, and given the worldwide importance ascribed to their results – by the IPCC, for example – logic dictates that a volume of such techniques would be titled “Principia Mannomatica.”

I had considered an alternate title: “The Nutty Professor”. But, Disney already has dibs on that.

I know about Principia Mathematica. That is why I asked the question. It was stated as if the joke were more than a one-liner. I thought perhaps someone had cataloged all of Mann’s methodological errors and put them in one easy-to-read compendium on the internet. It is not a bad idea and would make very entertaining reading. So even though Anthony answered my question, I still do not understand his reference to the Wikipedia article on the Hockey Stick Controversy. The article itself is such a mishmash that it is difficult to read, and intended to confuse the issue (but it does link to some very good resources).

The question remains – why not use all this carefully gathered data and better methods, and publish what should be done? Steve is strangely reluctant to put his alternatives before the community he criticizes. If it is good, it will be published. If not, then this exercise is just hot air, no?

The question remains – why not use all this carefully gathered data and better methods, and publish what should be done? Steve is strangely reluctant to put his alternatives before the community he criticizes.

He has done this (and published the result) and has shown that slight changes in the selection of proxies will produce radically different results. This is one reason that he emphasizes that the IPCC-referenced reconstructions are not independent in their proxies. The same ones crop up in all of them, and proxies which do not show the favored result are not used.

Knowing that something is a mess and knowing how to fix it are two different things. We obviously see that with the economy right now. But in order to fix a mess, you first need to know that it is a mess.

My adding an alternative squiggle would make no contribution whatever to understanding what the problem is. In our papers, Ross and I refrained from presenting an alternative squiggle, saying only that slight changes in proxy weights led to very different results. I would very much like to have closed this chapter long ago.

There are very few issues actually in dispute between Wahl and Ammann and us, though you’d never know that from their obfuscation or IPCC. There is no reason why an agreed statement of facts couldn’t have been prepared long ago. We’ve made this offer to two separate groups who’ve purported to oppose us (Wahl and Ammann; Nanne Weber of Juckes et al). Ammann turned it down because it would be “bad for his career”.

Re: Steve McIntyre (#77),
There is no fixing the fact that the confidence intervals on these reconstructions – no matter who makes them, L.R. Kerr – are so monstrously wide that it will be a long time before anyone can say with any confidence whether the MWP was warmer than the current warm period, or whether the MWP was caused by a persistent MCA with NAO-like characteristics, etc., etc. L.R. Kerr does not appear to understand or appreciate this very simple fact.

Until a method is devised to properly represent the uncertainty on those recons, all reconning should STOP. Without confidence it’s just a con. Over to you, Curt Covey.

I think that there’s an interesting analysis on confidence intervals applying Brown-style methods – following on some threads from last year. Ross and I alluded to this in our PNAS comment and Ross has encouraged me to write this up.

Curt, if you think that my formalizing the thoughts expressed here would be of value to the field, maybe you could suggest to NOAA that they send some of their PR Challenge funds my way. I promise that in my first year, I’d do more than simply collate the reconstructions that they already had conveniently on file.

Ammann is listed as a contributor on RealClimate. So he really can’t go any further without throwing Mann, his co-contributor on RealClimate, under the bus. You really didn’t expect him to agree, did you? He wouldn’t be much of a team player then.

“Practical men, who believe themselves to be quite exempt from any intellectual influence, are usually the slaves of some defunct economist. Madmen in authority, who hear voices in the air, are distilling their frenzy from some academic scribbler a few years back.”

J.M. Keynes, about 70 years ago.

This is not a plug for Keynes, it is simply a reminder that the events of today are those of 2 generations ago on a different stage. Economists, climate scientists, the doubtful methods have a parallel.

Take heart that Keynes did eventually have influence, be it now in favour or disfavour. He was writing of the debate about the control of inflation. Unfortunately, many now think the debate then was won by the side that was wrong.

BTW, IMO there is a good deal of concordance of thought processes between several writers on this blog and the Chicago economist Prof Steven Levitt. It can be read, with entertainment and fresh approaches, in the book by Levitt and Dubner “Freakonomics”. On Google or in Penguin paperback (or Morrow in USA), ca 2005.

These data flips are outrageous. Why are they not picked up during peer review? Peer review seems to have the same relevance to scientific integrity as do corporate boards to sound corporate governance — i.e., none.

I don’t mean to burst the bubble of frivolity this thread engenders, and I certainly don’t mean to change the subject, but I feel I can deviate slightly with a news report. For the record, I am a firm believer that catastrophic warming is not only improbable, it is a strawman – but that’s for another thread or blog. Anyway, my message is – and many of you probably already know – that today the EPA has deemed CO2 to be a pollutant, subject to all the restrictions at their disposal. It’s a sad day in the USA… at least it is for this meteorologist.

[…] This wasn’t the only proxy used upside down in Mann et al 2008. In our discussion of Trouet et al 2009 in the spring, Andy Baker commented at CA and it turned out that Mann had used one of Baker’s series upside down – as discussed here. […]


[…] proxy (192) is an interesting one. It is Baker’s speleothem record from Scotland that was discussed at CA in early 2009 and here as an interesting example of Upside-Down Mann. In the orientation applied in […]