More from the Junior Birdmen

In keeping with the total and complete stubbornness of the paleoclimate community, they use the most famous series of Mann et al 2008: the contaminated Korttajarvi sediments, the problems with which are well known in skeptic blogs and which were reported in a comment at PNAS by Ross and me at the time. The original author, Mia Tiljander, warned against use of the modern portion of this data, as the sediments had been contaminated by modern bridgebuilding and farming. Although the defects of this series as a proxy are well known to readers of “skeptical” blogs, peer reviewers at Nature were obviously untroubled by the inclusion of this proxy in a temperature reconstruction.

They stated:

For the Korttajarvi Lake record, we use the organic layer thickness, as the original publication indicates that a thicker organic layer “probably indicates a warmer summer and a relatively long growing season” [57- Boreas].

However, they didn’t mention the following:

This recent increase in thickness is due to the clay-rich varves caused by intensive cultivation in the late 20th century.

and again:

In the 20th century the Lake Korttajärvi record was strongly affected by human activities. The average varve thickness is 1.2 mm from AD 1900 to 1929, 1.9 mm from AD 1930 to 1962 and 3.5 mm from AD 1963 to 1985. There are two exceptionally thick clay-silt layers caused by man. The thick layer of AD 1930 resulted from peat ditching and forest clearance (information from a local farmer in 1999) and the thick layer of AD 1967 originated due to the rebuilding of the bridge in the vicinity of the lake’s southern corner (information from the Finnish Road Administration). Varves since AD 1963 towards the present time thicken because of the higher water content in the top of the sediment column. However, the gradually increasing varve thickness during the whole 20th century probably originates from the accelerating agricultural use of the area around the lake.

All of this was discussed ad nauseam following Mann et al 2008, though Mann stubbornly refused to concede anything. Kaufman et al 2009 also used the data and, on the advice of Overpeck, conceded the point and issued a corrigendum. Raymond Bradley was a coauthor of both papers and more or less simultaneously took the position that a corrigendum was required and not required.

I’m sure that we’ll be told that their use of contaminated Korttajarvi data doesn’t “matter” – nothing ever seems to. But why use it?

Steve Update Apr 11: For R users, I’ve collated the Tingley proxies into a time series R-matrix called proxy.tab at http://www.climateaudit.info/data/multiproxy/tingley_2013 and their metadata as info_tingley.csv. A simple average of all the Tingley proxies is shown below. It has a divergence problem because the majority of proxies are MXD proxies.
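For readers who want to see what a "simple average" of a proxy matrix involves when the series cover different periods, here is a minimal Python sketch. It is an illustration only: the tiny matrix is made up (it is not the Tingley data), the function name is hypothetical, and it assumes each series is standardized over its own available years before averaging.

```python
from statistics import mean, pstdev

def simple_average(matrix):
    """Average standardized proxy columns year by year, skipping missing values (None).

    matrix: list of rows (one per year), each row holding one entry per proxy.
    """
    n_proxies = len(matrix[0])
    # Standardize each proxy over the years in which it has data.
    stats = []
    for j in range(n_proxies):
        values = [row[j] for row in matrix if row[j] is not None]
        stats.append((mean(values), pstdev(values)))
    averages = []
    for row in matrix:
        z = [(x - m) / s for x, (m, s) in zip(row, stats) if x is not None]
        averages.append(mean(z) if z else None)
    return averages

# Made-up values: three proxies, five years; None marks years without data.
toy = [
    [1.0, None, 10.0],
    [2.0, 5.0, 20.0],
    [3.0, 7.0, 30.0],
    [4.0, 9.0, None],
    [5.0, None, None],
]
print(simple_average(toy))
```

Note that the last year of the toy average rests on a single series, which is the kind of changing coverage to keep in mind when reading the end of any such average.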

Their Figure S34 top panel shows a reconstruction from MXD proxies alone. The reconstruction is very similar to an MXD average, as shown below.
Figure. Tingley and Huybers S34 top panel, showing one variation of their proxy-only reconstructions (MXD), with average of MXD proxies (green) for comparison.

Tingley has provided an exemplary archive. It requires a little collation. R users who wish to skip their own collation may use my collation as follows:

110 Comments

Actually the Korttajärvi series is suspect after about 1720 when the area around the lake was first farmed on a large scale after the end of the Great Northern War. At that time farming in Finland was largely slash-and-burn which can have a considerable effect on erosion and sedimentation.

Where is the failure in the system that is demonstrated in the last few CA threads?

If a proxy is used, as a minimum, there has to be a demonstration by the author that – a. it is temperature-sensitive. b. Confounding variables are either shown absent or able to be quantified and compensated accurately. c. the proxy material is in situ and undisturbed by anomalous events. d. there is a temperature record (in applicable cases) against which calibration can be done reliably and relevantly, with the temperature record proven to be ‘actual’ and not adjusted excessively. d. The error bounds, both for bias and precision, need correct derivation. e. The appropriate statistical analysis is conducted, and done correctly.

In an ideal world, if the author is unable or unwilling to satisfy these minimal criteria, then publication should not proceed, unless as a negative paper saying “Don’t go here because these complications cannot be resolved”. (This has positive value but is rare to read).

In the ideal world, the lack of criteria should be discerned in peer review, and/or by pre-publication discussion by the author with authors of papers cited, including their work in progress.

Finally, in the ideal world, the publisher should be working to comprehend the reliability of certain schools or authors and delay publication until the criteria are satisfied.

The world is not ideal. It should not be up to you, Steve, to find case after case of failed criteria. Have you thought of a system design that would function more effectively?

This paper is also hot off the press. By chance, Atte Korhola, one of the coauthors, emailed me a copy of this paper this morning, but I hadn’t had a chance to look at it yet. Small world. Korhola spoke out very strongly in 2009 against misuse of the Finnish sediment series by Mann et al and Kaufman et al.

It’s odd that the Korttajarvi series would be handled so inconsistently in two papers released on more or less the same day.

According to the footnotes at the end of the paper MT links to, Hanhijarvi assembled the data sets, carried out all analyses, and was primarily responsible for preparation of the manuscript, while Tingley only “contributed to the design of the quantitative comparisons and interpretation of results, and to the preparation of the manuscript”, i.e., Tingley wasn’t involved in the data set collation.

So, I guess Tingley wasn’t involved in the sensible decision to only use the Korttajarvi series up to 1720. This partially explains why he didn’t do the same in Tingley & Huybers.

But, as you say, it’s odd that the Korttajarvi series is handled so inconsistently.

This is why everyone should just stick to including Yamal/bristlecones/full length Korttajarvi (upside-down), or alternatively use Tingley & Huybers’ new tack of using thermometer records as “proxies”. If you don’t include them, it gets so hard to get a nice-looking hockey stick…
😉

As far as I remember, Kaufman et al., 2009 was essentially a simple averaging of 18-20 (?) millennial proxies (i.e., “composite-plus-scale” or CPS), but included both Yamal AND upside-down Tiljander. 😦 After the choice of upside-down Tiljander was criticised here on CA, they issued a corrigendum, using Tiljander the right way round and ending the proxy in 1800.
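For readers unfamiliar with the term, a bare-bones "composite-plus-scale" calculation can be sketched in Python as below. This is a generic illustration under stated assumptions, not the code of Kaufman et al or anyone else: the function names and the toy numbers are made up, all proxies are assumed to cover the same years, and the scaling step simply matches the composite's mean and standard deviation to the instrumental record over the calibration years.

```python
from statistics import mean, pstdev

def standardize(series):
    m, s = mean(series), pstdev(series)
    return [(x - m) / s for x in series]

def cps(proxies, instrumental, calib):
    """Composite-plus-scale sketch.

    proxies: list of equal-length series (one list per proxy).
    instrumental: temperatures for the calibration years only.
    calib: slice selecting the calibration years within each proxy.
    """
    # Composite: year-by-year mean of the standardized proxies.
    z = [standardize(p) for p in proxies]
    composite = [mean(vals) for vals in zip(*z)]
    # Scale: match the composite's calibration mean and spread to the instrumental record.
    c_mean, c_sd = mean(composite[calib]), pstdev(composite[calib])
    t_mean, t_sd = mean(instrumental), pstdev(instrumental)
    return [t_mean + (c - c_mean) * t_sd / c_sd for c in composite]

# Made-up numbers: two proxies over six years, instrumental data for the last three.
proxies = [[1.0, 2.0, 1.5, 2.5, 3.0, 4.0], [10.0, 9.0, 11.0, 12.0, 14.0, 13.0]]
instrumental = [0.1, 0.3, 0.5]
calib = slice(3, 6)
recon = cps(proxies, instrumental, calib)
print([round(v, 3) for v in recon])
```

In a composite of this kind the orientation of every proxy has to be fixed before averaging, which is why the sign assigned to a series such as Tiljander matters.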

So they did not invert the data, but only used the contaminated last hundred years? How would this help in showing a blade of the hockey stick? If anything, it would be most defeating (i.e., a sharp downturn in the last hundred years).

How do you know they did not “invert” it? How would using the contaminated data in calibration help in giving any reliable information about past temperatures, no matter which “orientation” it was used in?

“For the Korttajarvi Lake record, we use the organic layer thickness, as the original publication indicates that a thicker organic layer “probably indicates a warmer summer and a relatively long growing season” [57- Boreas].”

Based on this statement, it appears that they understood the meaning of the sediment data correctly. However, it is possible that they misinterpreted the original data and flipped it because the original data was in x-ray density (not organic thickness directly) and less dense (more organic and less mineral) meant warmer, not the other way around.

Regardless, there is no question that including the last 100 years of data is grossly incorrect.

One is varve thickness (fatter = hotter) and the other is X-ray density, or XRD (lower = hotter). The XRD was used by Mann to say higher density = hotter. Look up the post titled “It’s Saturday Night Live” on Climate Audit for a more detailed explanation.

Looking at the data sets in the SI, the anomaly-adjusted Korttajarvi series does trend up sharply in the 20th century so appears to have been flipped.

Also, in the model validation section (SI section 5) it does say:

“…the main analysis, which includes both the CRU observations and all proxy observations. For
convenience we refer to these experiments with the following labels: ALL, CRU-only, MXD-only,
ICE, VARVE, and ICE&VARVE.”

… which suggests that all the charts comparing trees, ice cores, and varves separately are charting against a set that is heavily influenced by instrumental data in the recent period. It wasn’t apparent to me that they had spliced instrumental into any particular series, just that there is a strong weight to instrumental which explains why the ALL reconstruction can end higher than trees, ice, or varves.

I was also amused by (SI section 1.6):

“…for the 1850–1959, 1850–1994, and 1960–1994 intervals.
We do not judge the differences in the MXD–CRU correlations or regression relations between
the three intervals to be sufficiently large to warrant discarding 35 years of data, and therefore
include the entire MXD data set in the analysis”

… given the degree of divergence on the top stripe of figure s.35, but I suppose if they are including instrumental in their reconstruction there they can overcome the divergence.

Apologies, I was not paying attention, and I see from amac below and the original post that a 20th century increase in organic material thickness would have corresponded with a temperature increase (absent the contamination that makes this useless during the validation period). The sign does not seem to be inverted.

If you mean Mann08, they calibrated using the contaminated data in the modern period. This had two effects:
1) The data for periods earlier than the calibration period was inverted relative to the interpretation given by the study authors. This had the effect of flattening the shaft of the hockey stick because the data countered other temperature series in earlier periods.
2) The strong correlation between the contaminated data and temperatures from instrument data resulted in the blade of the hockey stick being reinforced.
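The first effect can be illustrated with a toy regression in Python. The numbers are invented and this is not Mann08's procedure, only an ordinary least-squares fit: a proxy that the original authors read as "higher = colder" is calibrated over a period in which contamination makes it rise alongside temperature, so the fitted slope is positive and the earlier values are reconstructed with the opposite sign to the authors' interpretation.

```python
def ols_fit(x, y):
    """Least-squares intercept and slope for y = a + b * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
    return my - b * mx, b

# Made-up calibration period: the proxy rises (contamination) while temperature rises.
proxy_calib = [1.0, 2.0, 3.0, 4.0]
temp_calib = [0.0, 0.2, 0.4, 0.6]
a, b = ols_fit(proxy_calib, temp_calib)

# Made-up earlier values. Under the assumed physical reading ("higher = colder"),
# the second year is the colder one.
proxy_early = [0.5, 1.5]
recon_early = [a + b * p for p in proxy_early]
print(b, recon_early)
```

With these numbers the fitted slope is positive, so the year that the physical interpretation calls colder comes out as the warmer one in the reconstruction.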

I’ll resist the temptation to write an ironic or sarcastic comment in this instance.

It’s quite disheartening to see one of the data series from Tiljander03 used into the late 20th century as a temperature proxy in a 2013 paper. Presumably Tingley and Huybers acted out of ignorance, rather than to signal the might of the climatology establishment and its at-all-costs devotion to the hockey-stick consensus. But if reviewers and editors can’t spot a prominent and obvious mistake like this, it’s hard to see what quality-control functions are served by Nature and the rest of the prestige peer-review press.

If a proxy is used, as a minimum, there has to be a demonstration by the author that – a. it is temperature-sensitive. b. Confounding variables are either shown absent or able to be quantified and compensated accurately. c. the proxy material is in situ and undisturbed by anomalous events. d. there is a temperature record (in applicable cases) against which calibration can be done reliably and relevantly, with the temperature record proven to be ‘actual’ and not adjusted excessively. d. [e.] The error bounds, both for bias and precision, need correct derivation. e. [f.] The appropriate statistical analysis is conducted, and done correctly.

(a.) – Tiljander03 claimed that darksum (organic matter) could be directly correlated with temperature and precipitation, and that lightsum (mineral matter) could be inversely correlated, both prior to 1720, when land-use changes in the watershed began to overwhelm climate-related effects on the sediment record. I note that Tiljander and co-authors suggested these correlations, but did not demonstrate or quantitate them. I plotted darksum against CRUTEM3v temperature at this post. Lastly, during the period of the instrumental temperature record, darksum has a spurious direct correlation to CRUTEM3v temperature. Prior to 1720, Tiljander03’s authors claimed a direct relationship of darksum thickness to temperature. Thus, darksum was not one of the Tiljander proxies that was famously used upside-down by Professor Mann and coauthors in 2008 (PNAS) and 2009 (Science). Those were lightsum and thickness. Background here.

(b.) – Tiljander03’s authors were well aware of confounding variables in the post-1720 record of Lake Korttajarvi sediments, and discussed them, per the original post. There is a large town on the lakeshore, and I suspect that inflow of nutrients (phosphorus and nitrogen) caused eutrophication, thus contributing to the large increases in darksum in the late 19th and 20th centuries.

(d.) – A point unaddressed by defenders of Mann08 and subsequent papers is that while the Tiljander data series can be correlated to the rising CRUTEM3v temperature-anomaly of the 5 deg x 5 deg grid in which Lake Korttajarvi is located, the nearest weather station data shows no such rise in temperature during the instrumental period (1881-1993). See Tiljander03’s Fig. 2.

(e.) – The calculation of uncertainties is a major theme of Climate Audit. For darksum, it is worth repeating that meaningful direct calibration to the instrumental record cannot be achieved, since the effects of climate on the varve record are overwhelmed by changes to the watershed that began in the 18th century and grew in magnitude through the 19th and 20th centuries. Indirect assessments of possible climate effects on these lake sediments in the pre-1720 time frame are possible, though fraught with difficulties.

(f.) – The above discussion lays out some of the challenges to achieving appropriate statistical analysis of the Lake Korttajarvi sediment record. To my knowledge, none of the parties that have employed these data sets in their paleoclimate reconstructions have shown the sophistication required to identify the relevant issues, much less to tackle them in a plausible fashion.

All of this was discussed ad nauseam following Mann et al 2008, though Mann stubbornly refused to concede anything.

I find it interesting how Mann et al (2009) “solved” the matter. In the infamous “bizarre” response to the MM criticism of Mann et al (2008)’s use of the Tiljander series, the excuse given was that “multivariate regression methods are insensitive to the sign of predictors”. Even on the face of it that is a stupid answer, as they had explicitly decided the sign of the Tiljander series in both CPS (not a “multivariate regression method” in the Mannian sense) and in EIV (RegEM; explicit sign through screening). Now Mann et al (2009) was essentially the same as the EIV in Mann et al (2008), but they had made some small changes to the code. One of the changes was that they had removed (possibly; the code was still there but commented out, so we cannot be 100% sure what was actually used) explicit checking for the sign in the screening (one-sided vs. two-sided screening), so now the answer given to the Mann et al (2008) “upside-down” criticism made at least some sense for Mann et al (2009). 🙂 This is additionally funny, because in Mann et al (2008) they stated

Where the sign of the correlation could a priori be specified (positive for tree-ring data, ice-core oxygen isotopes, lake sediments, and historical documents, and negative for coral oxygen-isotope records), a one-sided significance criterion was used. Otherwise, a two-sided significance criterion was used.

Now suddenly, a year later, they did not apparently know the sign anymore! 🙂
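The difference between the two kinds of screening is easy to show with a toy example in Python. This is a sketch only: a plain correlation threshold stands in for the significance criterion, the threshold of 0.5 and the function names are hypothetical, and the series are made up.

```python
from statistics import mean

def pearson(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def passes_screen(proxy, temp, threshold, expected_sign=None):
    """expected_sign=+1 or -1 gives a one-sided test; None gives a two-sided test."""
    r = pearson(proxy, temp)
    if expected_sign is None:
        return abs(r) >= threshold
    return expected_sign * r >= threshold

# Made-up series: the proxy falls steadily while temperature rises.
temp = [0.0, 0.1, 0.2, 0.3, 0.4]
proxy = [5.0, 4.0, 3.0, 2.0, 1.0]

print(passes_screen(proxy, temp, 0.5, expected_sign=1))  # one-sided, positive sign expected
print(passes_screen(proxy, temp, 0.5))                   # two-sided, sign ignored
```

A series with a strong correlation of the "wrong" sign is rejected by the one-sided screen but accepted by the two-sided one, after which a regression is free to use it in whichever orientation fits the calibration data.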

Of course, it should not matter whether a series is in the wrong “orientation” because it was explicitly forced to be or because of “automatic” flipping by the (multivariate regression) method used. It seems to me that these people never actually plot the partial results or how the proxies actually contribute to the reconstruction, or think about the (physical) implications of all of this. Some related issues in connection with the Mann et al (2008) EIV reconstruction are found here (signs flipping from one time step to another etc.) and here (it’s always worth putting some time and effort into what UC is saying in his, admittedly, short comments).

A small note of clarification: darksum, the data series used by Tingley and Huybers 2013, was not used upside-down in Mann08 and Mann09 — it was lightsum and thickness that were. However, this should be small consolation to Prof. Mann and his co-authors. Climate signals in all three time series were progressively overwhelmed in the 19th and 20th centuries by local developments in the Lake Korttajarvi watershed. Thus, any correlation of any varve information to the instrumental temperature record is entirely spurious.

“Upside-down and wrong” is wrong. “Rightside-up and wrong” is just as wrong.

but run out of oil money and had to do other work instead. It took a lot of time to make Mann08 run on Matlab, and I wouldn’t try Tingley et al. unless someone says it’s turn-key. But as a referee I’d check carefully the part “Furthermore, each ensemble member will have variability similar to the actual temperature anomalies^14” and compare all this to Brown’s ( http://www.climateaudit.info/pdf/statistics/brown.1982.jrss.pdf ) Ch 3, or preferably Ch 2., as the point of the paper seems to be to show that X is unprecedented.

Just a few of my observations on Tingley & Huybers, 2013 from a 30 minute first analysis. I haven’t looked at the SI, etc. yet & have just given a quick first overview read before lunch. But, some of you might find this helpful.

Datasets used:
Instrumental dataset = CRUTEM3v gridded April-September mean anomalies for all grids north of 45N with some land, and at least 10 years of data

Tree-rings = They did not use tree rings directly, but rather used the CRU’s “maximum latewood-density data set”: http://www.cru.uea.ac.uk/~timo/datapages/mxdtrw.htm
This comprises 96 gridbox series which Briffa et al., 2002 (a & b) constructed from 387 trees.
They cover the period 1400-1994.

As far as I remember, this dataset doesn’t have Yamal or bristlecones. Instead it has a “divergence problem” 😉
This can be seen from Figure 12 of Briffa et al., 2002a (Paywall).

I don’t see how the claims of Tingley & Huybers, 2013 would hold up on the basis of these diverging tree-rings. And indeed, they claim:
“Estimates using only tree-ring-density records result in a distribution of extreme years that is essentially unchanged,
provided the comparison extends only to 1960…”
In this sense, Figure 2 of the paper is unfortunately another “watch the thimble”-type figure. Figure 2b is for “All data, 1400-2011”, and shows the highest frequencies of hottest years are post 1994, i.e., after the tree ring data set ends.
Figure 2c is for “All the data except the tree rings, 1400-1994”.
Since they did Figure 2c, I would have also liked to see “Just the tree rings, 1400-1994″… 😦

This suggests that the basis of their claims arises mostly from the other proxies, i.e., the varves and the ice cores
Ice cores = 14 of the 15 annually resolved O18 ice cores used by Kinnard et al., 2011 (Google Scholar)
Plus four other ice cores downloaded from NOAA NCDC Paleoclimate

Varves = All the publicly available annual varves on the NOAA Paleolimnology website with >200 years, reported in length units, and which Tingley believed could have a positive association with summer temperatures.
As Steve noted, this includes the Tiljander varves.

The problem with the other lakes is that PCA-type analysis amplifies the signal from the outliers. You could sample every Finnish lake that WASN’T affected by human activity, but if you include just one that WAS affected, you’ll still get “evidence of a hockey stick”.

As to why use it – the only rational explanation I can imagine is that it is deliberate. A sort of ‘take that’ to this blog. ‘We can do it, and there’s nothing you can do about it.’ And it’s true – they can, and there is.

Newcomers to the Mann08/Tiljander saga may wonder, “What’s the big deal?” After all, the Tiljander data series are only two of the hundreds of proxies employed in that paper.

The key finding of that paper was (my words): “Tree-ring proxies portray a hockey-stick pattern of the past few millennia of the earth’s climate. Critics have attacked tree-ring records as unreliable. Well, here we show that other-than-tree-ring records also paint a hockey stick pattern! Reconstructions can be derived in various ways from tree-ring proxies, other-than-tree-ring proxies, and combinations of the two. In all of these cases, the temperature reconstructions are hockey-stick-shaped, and valid, and statistically-significant, and consistent with one another. This is strong evidence of the robustness of our proxy-based approach to understanding the climate of the past.”

Through carelessness or incompetence (my opinion), Prof Mann and co-authors included the lakebed sediment records archived by Mia Tiljander in their proxy set.

Steve McIntyre (this blog), JeffId (the Air Vent), and others have shown that the concordance of tree-ring results and other-than-tree-ring results depends on the inclusion of the Tiljander data series in the other-than-tree-ring temperature proxy set. Depends. If one excludes the post-1720 portions of darksum, lightsum, thickness, and XRD, then Mann08’s key findings fall apart. The last few centuries of the reconstructions based on other-than-tree-ring data fail to show significance, even by the quirky and permissive statistical methods used by Prof Mann and his co-authors. Prof Mann’s co-blogger Gavin Schmidt has grudgingly admitted as much, buried deep in the comments to a post of their blog RealClimate. Dr Mann himself has avoided addressing this topic in a forthright manner, to my knowledge.

This background may help explain why “Tiljander” is a name that is charged with meaning in online discussions of paleoclimate issues.

I think using upside down proxies is a very natural result of those doing reconstructions not facing up to the matter of providing a priori criteria for selecting temperature proxies that have some reasonably well understood physical basis. If you allow yourself to select proxies based on how well the modern proxy response lines up with the instrumental period, of course, you will allow yourself to select an upside down proxy. Look at it this way: without a proper a priori selection process, how would you know that a proxy is not being used upside down?

Tiljander is considered upside down since the original work on the proxy indicated that the proxy went upside down and gave a reasonable explanation of why it occurred.

Just to keep the record straight on Mann (2008): Tiljander was just a small part of the wrong-headed data processing in that paper. There were 104 MXD proxies truncated before the modern warming period and replaced with other data due to proxy divergence. A number of proxies ended well before the reconstruction end date and the missing data was infilled. Also, proxies which were instrumental data in the modern warming period and before were used in the reconstruction. A selection process was proposed and exercised in the paper that used the simply wrong after-the-fact selection process of statistical significance of proxy correlation with the instrumental record. Given the wrong-headedness of that notion, the statistics were not properly calculated, as has been explained at this blog and others. Pick 2 comes to mind, as well as using the instrumental, 104 MXD modified, a number of infilled and upside down proxies in the selection process.

In total, the 125 proxy series in Tingley & Huybers, 2013 are:
1. 96 grid-box series derived from 387 MXD tree ring chronologies (which show a “divergence problem”)
2. 11 varve thickness series (“log-transformed”), including one of the Tiljander series
3. 18 O-18 ice cores, 14 of which were from Kinnard et al., 2011

Tingley & Huybers claim that the tree-ring proxies are less reliable post-1960 (due to the “divergence problem”) and they don’t include them in Figure 2c.
They also suggest that some of the varves have an anti-divergence problem:
“over the most recent 100-year interval of tree-ring-density observations, positive extremes are biased low on account of the divergence phenomenon, and there is a general tendency for the largest log-transformed varve observations to be biased high relative to the mean-shifted simulations, suggesting that the log transform is an insufficient scaling for these largest values.”
This worryingly suggests there are a number of inconsistencies between all three sets of proxies. Surely this would have been worth further investigation, wouldn’t it?

Does anyone know yet which of the sets (and which series, in particular) contribute to the “hockey stick”? I can’t see how it could be from the tree-rings. The tree-rings make up 76.8% (96/125) of the proxy series, but notoriously don’t have a “hockey stick” (e.g., see Figure 12 of Briffa et al., 2002a). The tree ring series also finish in 1994, so couldn’t be part of the post-1994 stick.

The only way I could see the tree-rings contributing to the hockey stick is if Tingley & Huybers’ averaging method gives a radically different trend than Briffa et al., 2002a. If so, that would be worrying, as it would suggest it’s a statistical phenomenon.

If the hockey stick doesn’t come from the tree-rings, then it must be from some of the 11 varves and 18 ice-cores. Does anyone know?

P.S. I know it’s ridiculous that they used one of the Tiljander series, but since that series ends in 1985, my hunch is that most of the “hockey stick” comes from some of the other varves and ice-cores. Any ideas?

I have a difficult time believing that none of the authors and none of the “anonymous reviewers” were aware of Steve’s past discussion of the problem. In other words the sad state of affairs continues. We just wait for the apologists’ excuses.

I did a lengthy survey of Arctic O18 data in connection with Kinnard et al 2011 here. Kinnard’s data went prior to AD1400 and covered the MWP-modern comparison that is of primary interest in this field. The AD1400 starting point has a mid-1990s retro feel to it.

The Tingley ice data all appears to be O18 data and to substantially overlap the Kinnard data with some occasionally quirky modifications. Tingley has 18 series, Kinnard had 22.

Excluded is Mt Logan, which was the ONLY Kinnard ice core in the half hemisphere from 90E to 80W. As is well known to CA readers, Mt Logan went down in the 19th and 20th century. Tingley and Huybers purport to justify its exclusion as follows:

We exclude the Mount Logan series that is included in [35] because the original reference [36] indicates it is a proxy for precipitation source region and is out of phase with paleotemperature series.

On this reasoning, one would think that agricultural contamination at Korttajarvi would have been a greater cause for concern. But I guess that this wasn’t noticed because it was “in phase”. Obviously, I’ve consistently objected to after-the-fact exclusion of proxies for this sort of reason. It’s equally possible that precipitation changes have caused upticks in other areas biasing the series.

The O18 Arctic ice core data – again there’s nothing new in Tingley and Huybers since my 2011 survey – shows a secular decline, with a reversal in the 20th century to precedented levels.

The varve data appears to be taken from Kaufman and Kinnard. I’ll look at this further.

The MXD data is, of course, the long-time Briffa MXD and has a divergence problem.

If Mt Logan got cooler and local temperatures got cooler (since the entire globe does not change temperature uniformly), then they are throwing out valid information and biasing their result up.
If local temperatures got warm but Mt Logan got cooler, then this throws doubt on this type of proxy as only having a single cause (purely reflecting temperature).

Phi,
That looks bizarre! So, the “hockey stick” isn’t in the tree rings, the ice cores OR the varves? 😮

P.S. Figure S.36 answers my earlier question about what the equivalent of Figure 2c would look like with just the tree rings. Colour me unimpressed. 😦

Layman Lurker,
I see from your second comment that you found the Figures.

Anyway, Figure 1 is the “hockey stick” reconstruction.
But, the main claim of the paper seems to be that they’re saying the summer heatwaves in some parts of the world in 2003 and 2010 were “unprecedented in the past 600 years”.
Figure 2 is a separate analysis claiming to show when the “hottest summers” in the last 600 years have been.
I have questions about both Figures…

The tree ring proxies don’t show a hockey stick at all, which was why I wondered how Figure 1 has a hockey stick

Ronan,
This is not so bizarre, proxies and instruments do not measure the same thing. Hockey sticks exist only through the injection of instrumental data. This can be done by selection within a set of bad proxies, by more or less explicit corrections or simply, as here, by incorporating instrumental temperatures in the database.

Phi,
I totally agree that proxies and instruments measure different things.

But, I don’t agree that “Hockey sticks exist only through the injection of instrumental data”. Some people have infamously done so to “hide the decline”, but usually they rely on proxies with hockey sticks, e.g., Yamal, bristlecone/foxtails, upside-down Tiljander. That doesn’t seem to be the case here, since there doesn’t seem to be an underlying hockey stick in any of the three groups of proxies.

It’s true that some studies, e.g., Mann et al., 1998, use instrumental data as temperature “proxies”. But, usually the “hockey stick” shape arises for other reasons, e.g., odd proxies. The only case I can think of where the hockey stick was explicitly due to instrumental data was Mann et al., 2008’s “EIV” reconstruction.

Although, reading the earlier post Steve linked to below, it suggests you’re right about it applying in this case.

From Steve’s 2009 post on the unpublished Tingley & Huybers, 2010:
“Quasi-Splicing?
Something else must be going on in the algorithm and it will take a while to sort through this new algorithm to see what makes it tick. Tingley has provided code for it, but hasn’t provided data. But before doing that, there’s one other aspect of the Tingley code that we need to consider. Tingley-Huybers also use 249 instrumental series. Tingley-Huybers (in their second methodological article) compare their method to RegEM. Maybe their method effectively splices an instrumental data blade with a nondescript proxy handle.
Otherwise, it’s hard to see how their method – Bayesian or otherwise – can get from the nondescript proxy network to a HS.”

According to my understanding of the case, Yamal, bristlecone, Tiljander, etc. fall into the category “selection on the basis of a set of bad proxies”. If you have enough bad proxies which behave almost randomly and select those with the best correlation with your reference data (instrumental data), this is equivalent to injecting instrumental characteristics into your values.
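The screening argument can be tried out with a small simulation in Python. Everything here is an assumption made for illustration: the "proxies" are random walks with no temperature signal, the "instrumental" target is a plain linear trend over the last 100 of 300 years, and the 20 best-correlated series out of 200 are kept. It is not a replication of any published method.

```python
import random
from statistics import mean

def pearson(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def random_walk(n):
    level, out = 0.0, []
    for _ in range(n):
        level += random.gauss(0.0, 1.0)
        out.append(level)
    return out

random.seed(0)
N_YEARS, N_CALIB, N_SERIES, KEEP = 300, 100, 200, 20
target = [float(i) for i in range(N_CALIB)]  # made-up rising "instrumental" record

# Signal-free series, ranked by correlation with the target over the calibration years.
series = [random_walk(N_YEARS) for _ in range(N_SERIES)]
ranked = sorted(series, key=lambda s: pearson(s[-N_CALIB:], target), reverse=True)
selected = ranked[:KEEP]

composite = [mean(vals) for vals in zip(*selected)]
composite_all = [mean(vals) for vals in zip(*series)]
r_screened = pearson(composite[-N_CALIB:], target)
r_all = pearson(composite_all[-N_CALIB:], target)
print(round(r_screened, 2), round(r_all, 2))
```

The screened composite tracks the target over the calibration years even though none of the inputs contains any temperature information, while before the calibration period the selected series are unconstrained and tend to average out.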

Regarding Tingley, they openly state having integrated the instrumental data (ALL category).

It’s similar, but not quite equivalent. If it was equivalent then there would be no “need” to superimpose the instrumental reconstruction on top of the proxy reconstruction, because the two reconstructions would be identical in the instrumental period.

This is in essence the reason why the “divergence problem” is supposedly considered a “problem”, i.e., the proxies don’t show the expected hockey stick.
To get the hockey stick, normally you throw a couple of proxies with spurious 20th century upticks into the mix. Who cares if they’re non-climatic or not? Then you can superimpose the instrumental reconstruction on top to make things look simple.

In Tingley’s case, it does seem they’re mixing instrumental series into the mix for the 1850-2011 period. I don’t think they’re “grafting” instrumental series on though, like Mann et al., 2008 did for their EIV reconstruction. If you look at Figure 1a, there are some slight differences between the purple “Instrumental mean” and black “Posterior median”. My hunch is that they’re treating instrumental series as being equivalent to proxy series, so they get their hockey stick because the diverging proxies “diverge” at different times, and so cancel each other out, and end up drowned out by instrumental series.

I believe the justification for this “apples and oranges” approach of treating instrumental series as being the same as proxy series was first provided in the seminal paper in the Journal of Irreproducible Results by Sandford, 1995 (joke!) 😉

Ronan,
I have not checked the database but figures 34 and 35 do not go beyond 1994 and I have found no indication of proxies beyond this date. This would mean that all the prominent part (1994 to 2010, the blade of Fig. S16) is composed only of instrumental data.

If the instrumental series grid boxes are being treated as “proxies”, then probably most of them have data for the 1994-2011 period. I haven’t checked how many grid boxes >45N with some land have instrumental data for then, but it’s probably near 100.

I don’t understand what is so special about this paper that it made it into Nature. They only go back 600 years (summer temperatures in the NH), there is nothing special about the proxies, and the algorithm is the one they already published twice back in 2010.

On the other hand, it seems that they have provided an extensive code and data archive. I haven’t tested it yet, but on the surface the code (and data) appears rather well documented and relatively easy to follow, and I presume the study could be easily replicated. That’s an example to follow!

I don’t understand what is so special about this paper that it made it into Nature.

Maybe this:

No researchers in this field have ever, to our knowledge, “grafted the thermometer record onto” any reconstruction. It is somewhat disappointing to find this specious claim (which we usually find originating from industry-funded climate disinformation websites) appearing in this forum. Most proxy reconstructions end somewhere around 1980, for the reasons discussed above. Often, as in the comparisons we show on this site, the instrumental record (which extends to present) is shown along with the reconstructions, and clearly distinguished from them.
Michael E. Mann
(https://climateaudit.org/2013/04/07/clearly-distinguished/)

I’m guessing it’s their claim that:
“The summers of 2005, 2007, 2010 and 2011 were warmer than those of all prior years back to 1400 (probability P>0.95), in terms of the spatial average. The summer of 2010 was the warmest in the previous 600 years in western Russia (P>0.99) and probably the warmest in western Greenland and the Canadian Arctic as well (P>0.90).”

Could that be it?

The fact that none of the proxies they used show unusually warm late 20th/early 21st century temperatures (according to Figure S.34 and S.35) doesn’t seem to have concerned them…

TH10 also used MXD, ice cores and varve thicknesses. The ice core network in 2013 has been increased from 7 to 18 (thus resembling Kinnard.) The varve network has contracted from 13 to 11. I suspect that one of the series classified in 2012 as a varve series was actually an ice core – the Svalbard proxy shown in their 2010 location map as a varve. Maybe one of their Canadian Archipelago series as well was reclassified.

My 2009 comments on the preprint already drew attention to a potential problem with Korttajarvi – then very much in the news with Kaufman et al 2009:

Both have sites in Finland – I wonder whether TH use upside-down Tiljander where narrower varves are interpreted as evidence of warmth? TH have 6 or 7 sites in the Arctic Islands versus 2 in Kaufman. We’ve discussed problems with some of these studies already: e.g. inhomogeneity at Iceberg Lake and upside-down Tiljander.

For R users, I’ve collated the Tingley proxies into a time series R-matrix called proxy.tab at http://www.climateaudit.info/data/multiproxy/tingley_2013 and their metadata as info_tingley.csv. A simple average of all the Tingley proxies is shown below. It has a divergence problem because the majority of proxies are MXD proxies.

Tingley has provided an exemplary archive. It requires a little collation. R users who wish to skip their own collation may use my collation as follows:

Does the graph of the Tingley proxies imply that the reconstruction without the instrumental record would look like the series in the graph? Has Tingley taken nondescript and meandering proxies, mixed in the instrumental record using some Bayesian magic, and obtained the HS?

In this reconstruction the authors did not truncate the MXD series, but if I interpret the following excerpt from the SI of that paper correctly, I will stop reading and analyzing right there. A map of locations and graphs of proxy counts would appear to show that the reconstruction does indeed include instrumental data. If true, that is blatant, and the use of an upside-down proxy pales by comparison.

“1.5 Time series properties of the data sets
In terms of number of observations, the data set is dominated by the CRU observations over the last 150 years and by the MXD observations before 1850.”

SI section 5.5
“We present a set of validation metrics for each of the proxy-only analyses, designed to test the reconstructions regarding their ability to predict the withheld CRU observations and correctly quantify uncertainty in these predictions. As these proxy-only reconstructions are based on resampling the ensemble of scalar parameters from the analysis including the CRU observations, the validation exercises are best thought of as in-sample assessments of the reconstructions.”

The only inconsistency between the proxy-only analyses and the ALL analysis pertains to the MXD-only reconstruction over recent decades, in accord with the well-reported but as yet poorly understood divergence phenomenon [11, 74, 25, 53]. As discussed above (Section 5), our results are not qualitatively affected by this divergence.

Hi Jeff,
Borehole temperature methods are fearful. I’ve not read a satisfactory paper that overcomes my fears and simply ask if you have been partly convinced by any you can refer to me? It’s likely my coverage of papers is incomplete.

Their Figure S34 top panel shows a reconstruction from MXD proxies alone. The reconstruction is very similar to an MXD average, as shown below.
Figure ^. Tingley and Huybers S34 top panel, showing one variation of their proxy-only reconstructions (MXD), with average of MXD proxies (green) for comparison.

Figures S34 and S35 (smoothed) show runs with no instrumental splicing for four cases: MXD only, Ice, Varve and Ice and Varve. As far as I can tell, they don’t show a proxy-only run in which all three are used. However, it will resemble the MXD reconstruction because there are more MXD proxies. The caption to S34 reads:

Figure S.34: Comparisons of the spatial average time series inferred from each of the four reduced model runs (black lines) to the ALL run (red). From top to bottom, the panels correspond to the MXD-only, ICE, VARVES, and ICE&VARVES. The correlations between the median of each reduced model run and the median of the ALL run are, respectively, 0.89, 0.39, 0.31, and 0.42; correlations for the interval 1400-1849 are 0.98, 0.28, 0.09, and 0.31.

Surprising that they didn’t show an all-proxy run (if my reading is correct).

IMO the evidence is overwhelming that the Tingley-Huybers reconstruction does not distinguish between temperature and proxy data in their reconstruction. This is not hidden in the paper; it’s just that it seems so odd.

Given that the reconstruction is merely showing instrumental data in the modern portion, it’s not particularly startling that it resembles instrumental data.

To continue the simplistic and obvious minimal criteria for proxies that I listed at the top, please compare and contrast with a world where accountability not only exists, but is enforced. I’m a fragment from Australia’s mineral industries, which now operate under a Code of Conduct named JORC.

Here are the first few paras from here http://www.jorc.org/
……………………
The Australasian Code for Reporting of Exploration Results, Mineral Resources and Ore Reserves (‘the JORC Code’) is a professional code of practice that sets minimum standards for Public Reporting of minerals Exploration Results, Mineral Resources and Ore Reserves.

The JORC Code provides a mandatory system for the classification of minerals Exploration Results, Mineral Resources and Ore Reserves according to the levels of confidence in geological knowledge and technical and economic considerations in Public Reports.

Public Reports prepared in accordance with the JORC Code are reports prepared for the purpose of informing investors or potential investors and their advisors. They include, but are not limited to, annual and quarterly company reports, press releases, information memoranda, technical papers, website postings and public presentations of Exploration Results, Mineral Resources and Ore Reserves estimates.
………………..
I don’t wish to side track this Korttajarvi theme, but might I please seek a couple of responses about whether an assessment stage would work in this troubled field of climate research? The stakes are fairly high in both, but the conduct of professionals is poles apart at times. Perhaps Steve might consider starting another thread. I don’t think bad conduct is sustainable and here is a way to start to moderate it.

“… simultaneously took the position that a corrigendum was required and not required.” He’s the electric monk of the new age sciency types. They don’t understand the humour weapon, especially when you’re being subtle.

Would this paper, published in the prestigious journal Nature, not represent a breakthrough in temperature reconstructions? Instead of attaching the instrumental record to the end of the reconstruction (with the occasional note that a climate scientist would never splice the instrumental record onto the end of a reconstruction), those doing the reconstruction merely include the instrumental record with the proxies and never bother to show what the reconstruction looks like without the instrumental series. The equating of the validity of the proxy responses with that of the instrumental record (evidently with reviewer acceptance) has finally been completed. Hail to the hockey stick and may it live forever.

By the way, the references to good correlations of the proxy response with temperature can be rather confusing in light of the proxy divergence problems. One can readily construct series with good correlations (the correlation value being controlled primarily by the higher-frequency responses) that have very different trends (representing the lower-frequency responses) over the period of measured correlation.
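To make the point concrete, here is a small Python sketch. The two series are entirely made up: both share the same 5-year wiggle (standing in for high-frequency variability), but one is given a rising trend and the other a falling trend of equal size. The names `temperature` and `proxy` are just labels for the illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def ols_slope(y):
    """Least-squares slope of y against its index (units per year)."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    num = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den


n_years = 100
# Shared high-frequency component: a 5-year cycle of unit amplitude.
wiggle = [math.sin(2 * math.pi * t / 5) for t in range(n_years)]
# Opposite low-frequency components: +1 and -1 unit per century.
temperature = [w + 0.01 * t for t, w in enumerate(wiggle)]
proxy = [w - 0.01 * t for t, w in enumerate(wiggle)]

r = pearson(temperature, proxy)
slope_temp = ols_slope(temperature)
slope_proxy = ols_slope(proxy)
print("correlation:", round(r, 2))
print("trend per century:", round(100 * slope_temp, 2), round(100 * slope_proxy, 2))
```

The correlation comes out above 0.6 even though the two series trend in opposite directions, which is the situation described above: the correlation is set by the shared wiggles, not by the trends.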

Kenneth,
Surely the deeper problem is that one cannot be sure that the types of divergence we see now were absent in the pre-instrumented era. It would be reasonable to demote the importance of proxies, if you are a strict scientist, for this reason alone. Have you seen credible study of the reasons for the present divergence? If studies have not been commenced urgently, do we have one more pointer to poor science?

There have been attempts in the climate science community to address the issue of divergence, and almost all of them point either to an anthropogenic source or to a change in how the dendro proxy chronologies are handled. An anthropogenic source would, of course, rescue the proxies for historical times, since only the divergent period would be affected by man. Most of the anthropogenic explanations are hand waving and conjecture in my judgment. Most of the attempts at modifying how the dendro proxies are handled appeared to me rather arbitrary and did not actually fix the divergence problem but rather made it less severe.

Mann (2008) notes that there is divergence in both dendro and non-dendro proxies. SteveM has shown many examples of divergence in non-dendro proxies here at CA.

The possibility that the divergence problem stems from the post facto proxy selection process used in most, if not all, reconstructions has never been addressed by the climate science community, but to me it could readily explain divergence. Bringing past reconstructions up to date would also help shed light on this problem, and again little or nothing in that regard has been carried out by the community.

There are major problems in addition to Tiljander. The other proxies were apparently cherry picked. The paper is like Gergis, a result of selection bias.
For example, from the NOAA archive there are 28 Europe proxies, of which 15 are at sufficiently high latitude, most being high resolution. None were included apart from Tiljander.
There are 8 for Greenland, of which one is high resolution. It was not included.
There are 9 for Baffin island. Four were chosen. Only two have associated temperatures. The abstract for Round lake says “varve thickness partly represents temperature”. Ogac lake abstract says “a temperature record is not discernible”. Neither therefore published an associated temperature. Yet temperature was inferred for both. Itilliq lake abstract notes a distinct LIA from AD 1700 to 1850, with continuous warming since. It was not used. CF8 shows a distinct LIA to 1800, with continuous warming since. It was not used.

Independent of statistical methods, the simple fact that the LIA disappeared from a region where it was intense suffices to show that the study is fundamentally flawed. The problem is not the blade, where high resolution proxies do show 20th century warming, as do thermometers. It is the shaft, which erased the natural variability of the past 600 years, that is, the LIA.

Well, maybe you think of the proxies you listed above. If this is the case and given their high resolution, what is their high frequency correlation with regional temperatures?

Realize that your statement is strange because, at least to my knowledge, there are no proxies which at once:
– Show a good high frequency correlation with regional temperature (i.e., credible),
– Show a significant warming throughout the twentieth century (thus confirming global instrumental temperatures).

I agree that the loss of variability in the pre-instrumental period is a real problem. Yet it is especially global warming of the twentieth century which is overestimated. This produces a curious side effect, skeptics generally overestimate the natural variability. This position is untenable because contradicted in particular by phenology (altitudinal stability, tree rings, etc.), by the behavior of glaciers during the cooling of the 60s and 70s, by satellite data, etc.

I did not bother with your suggested calculations. To some degree or other, given all the vagaries of homogenization, calibration, etc., I think it is not possible to argue that measured actual temperatures have not increased in the 20th century. The question is why, not whether.
All of the proxies used in this paper also show an increase. Fact. I don’t care whether the increase is the same as temp records, as you might.

It is irrelevant if the prior reconstruction lost the LIA. Not as inferred fractions of some uncertain degree of temperature, but COMPLETELY. Then no 20th century data from that reconstruction is reliable, period. Which was my one and only point.

Hi Rud,
I’d agree with you that there are bigger problems with Tingley & Huybers, 2013 than Tiljander.

The limited/lacklustre (or even absent!) justifications for selecting/discarding proxies is still a systemic problem in the paleoclimate community. I guess that’s one of Steve’s biggest complaints here on CA.

But, in this case, I think there’s a more serious problem.
It seems that Tingley & Huybers treated thermometer records (or rather CRUTEM grid boxes) as being equivalent to “temperature proxies”. Did you know they did this? I was quite surprised when I found this out (see my discussion with Phi above).

To me, this is a classic case of mixing apples and oranges. I find it disappointing that this is considered acceptable by Tingley & Huybers and the peer-reviewers who approved the article.

Just to clarify, this means that:
1. The pre-1850 values are determined from tree-ring/ice core/varve proxies. – all “apples”
2. The 1850-1994 values are determined from a 50/50 mix of proxies (125) and thermometer grid-boxes (about 100?). – a mixture of “apples and oranges”
3. The 1994-2005 values are determined almost entirely from thermometer grid-boxes. – mostly “oranges”
4. The 2006-2011 values are determined entirely from thermometer grid-boxes. – all “oranges”
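To put rough numbers on that breakdown, here is a small Python sketch. The spans are simplifications I am assuming for illustration (all 125 proxies ending in 1994, about 100 grid boxes covering 1850-2011); the real series start and end at different dates, and the actual algorithm is not a simple count or average.

```python
def instrumental_share(year, proxy_spans, instrumental_spans):
    """Fraction of the series available in a given year that are thermometer-based."""
    n_proxy = sum(1 for start, end in proxy_spans if start <= year <= end)
    n_instr = sum(1 for start, end in instrumental_spans if start <= year <= end)
    total = n_proxy + n_instr
    if total == 0:
        raise ValueError(f"no series cover {year}")
    return n_instr / total


# Hypothetical coverage, loosely following the counts mentioned above.
proxy_spans = [(1400, 1994)] * 125
instrumental_spans = [(1850, 2011)] * 100

for year in (1800, 1900, 2000):
    share = instrumental_share(year, proxy_spans, instrumental_spans)
    print(year, round(share, 2))
# prints: 1800 0.0, then 1900 0.44, then 2000 1.0
```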

Don’t get me wrong. There is something to be said about comparing and contrasting apples and oranges. Indeed, by rights we should be also considering tree rings and varves as “apples and oranges” too…
But, you have to make sure to add the “…and contrasting” bit. Do you know what I mean?

If you look at Fig. S.34 and S.35, there are key differences between each of the proxy sets. To me, this should have been the focus of the paper, e.g., compare the timing of the 20th century warm periods.

But, regardless, it is telling that none of the three proxy sets show the dramatic post-1970s up-tick implied by the thermometer estimates.

Allegedly, the big finding of Tingley & Huybers, 2013 is that the summers of 2005, 2007, 2010 and 2011 were unprecedented in the last 600 years.
But:
1. The CRUTEM thermometer estimates only go back to 1850, and the late-19th century thermometer records are fairly sparse (and mostly urban stations)
2. The proxies (which do go back to 1400) don’t show unprecedented late 20th/early 21st century temperatures.

In other words, none of the series they consider actually show summers in the 2000s to be “unprecedented in the last 600 years”.
So, how is their claim considered scientifically justifiable?

He includes a thermometer-based estimate on his proxy-based estimate for comparison. But, he explicitly discusses in the text how the two estimates disagree with each other:

“Since AD 1990, though, average temperatures in the extra-tropical Northern Hemisphere exceed those of any other warm decades the last two millennia, even the peak of the Medieval Warm Period, if we look at the instrumental temperature data spliced to the proxy reconstruction. However, this sharp rise in temperature compared to the magnitude of warmth in previous warm periods should be cautiously interpreted since it is not visible in the proxy reconstruction itself.” – Ljungqvist, 2010 (Google Scholar)

Honest comment, Apple + Orange salad graph. I’m pretty sure every published “hockey stick” chart is either a proxy + instrumental splice, or a bad-proxy (strip-bark bristle-cone, etc etc.) artifact. The latter being pretty much all of the M. Mann body of work. Now with Marcott et al! Remarkable.

I was actually just criticising apples & oranges comparisons an hour before your comment (here), so I think we’re pretty much in agreement, yes?

But, I think Ljungqvist, 2010 is actually one of the rare “honest” millennial temperature reconstructions you were asking about.

Like other millennial reconstructions, he makes several assumptions, which in my opinion have not been properly justified. But, he at least flags a lot of the issues, and cautions his readers about them. He also seems to follow CA, and has taken many of Steve’s valid criticisms of the previous reconstructions on board.

With all that in mind, I find it interesting that the Ljungqvist chart is actually a very un-hockey stick reconstruction (in my opinion). Have you had a proper look at it?

There’s some interesting discussion of it on Jeff Condon’s blog, e.g., here and here.

To be honest, my current opinion of paleoclimate somewhat mirrors the sentiments Jeff Condon was expressing recently on the Marcott threads, and on his latest blog post. I am genuinely interested in how global (and regional) temperatures have varied over the last millennium. But, trying to figure that out from proxies is a lot harder than is implied by much of the current literature.

If we are to have any hope of determining what temperatures were genuinely like in the past, we have to take a much more open, frank and critical look at what assumptions we can justifiably make, and how to test those assumptions. In this sense, I think Ljungqvist’s studies are a step forward, and should be commended.

Similarly, for all its faults, Tingley & Huybers, 2013 seem to have done a reasonable job of archiving their data, which should be encouraged. I haven’t looked at the S.I. in detail, but it seems substantial (and I see Jean S. was complimenting it above).

Ronan,
You are probably right about Ljungqvist. Yet, paradoxically, what paleoclimatic reconstructions especially highlight is less the imprecision of our knowledge of the oldest temperatures than the inability to properly integrate instrumental data. The magnitude of disagreement is greater between 1900 and 2000 than between 1000 and 1800.

Thanks much for the links, which I hadn’t seen. I’ll have to do some reading, but I was too hasty in lumping Ljungqvist with the “bad guys”. Sorry, my bad.

Jeff Condon’s comments about the eternal allure of reconstructing paleoclimates make me long for my long-ago student days, when people were trying to do just that, and before (almost) the whole effort got poisoned by politics and pre-ordained results (aka Confirmation Bias, Noble Cause Corruption). It would be great to get back to doing actual *science* with the paleoclimate reconstructions….

To be fair, there are some paleoclimatologists who remain honest and, well, actually do science. But even they feel the pressure of at least nodding to the PC “climate change menace” to get funded. A sad situation, with no easy fix obvious. Not likely to come to a good ending, when it becomes obvious how much effort, and money, have been wasted.

People have a tremendous talent for “explaining” things post hoc. Every day in the newspaper, journalists proclaim that the stock market went up or down because of this or that, but these “explanations” are just stories to suit the moment. If we think some proxy had some odd behavior for reason x or y, it is only a hypothesis until it can be verified. If we think that certain members of a class of proxy such as lakes are poorly behaved but don’t know why, how do we know that others are not “well-behaved” for equally spurious reasons, and how do we know that the ones we keep were not adversely affected by something unknown in the past?

@phi, Apr 14, 2013 at 10:45 AM
They’re both important questions which haven’t been resolved satisfactorily in my opinion. I suspect you may have similar views on the instrumental record to me. 🙂 But, we’d probably be going off-topic from this useful thread on Tingley & Huybers, 2013 if we got into a discussion of CRUTEM, etc…

@pdtillman, Apr 14, 2013 at 7:27 PM
There’s a great quote by the late zoologist (& Nobel Laureate) Konrad Lorenz: “It is a good morning exercise for a research scientist to discard a pet hypothesis every day before breakfast. It keeps him young.” 😉
It’s hard to approach a problem without pre-conceived expectations. But, for me, that’s what the scientific method is about! Indeed, I was taught that whenever I come up with an explanation for a particular experimental result, I should then actively try to disprove my explanation. It’s only if I’ve consistently failed to disprove it from many different angles, that I should start to take it seriously! Was I the only one who was taught this? 😦

@Craig Loehle,
Exactly. If paleoclimate is to progress, we need to look far more critically at the assumptions behind individual proxies. I am actually still optimistic that there is useful information in the proxy record. But, there needs to be a greater recognition and discussion of the problems in the current assumptions. I thought your 2009 paper on the tree ring divergence problem (Google Scholar) was a much-needed step in this direction. I think we need a lot more discussion like that…

As CRUTEM is the main proxy of Tingley et al. for the twentieth century, as it is this proxy which is at the origin of the hockey stick shape, it does not seem completely off topic to say a few words about it.

Criticism of proxies is also often based on preconceived ideas. For example, to my knowledge, no study has shown that tree-ring densities are a worse proxy for regional temperature than station data. Moreover, when looking around, we find a number of counterarguments. See for example this graph: http://img38.imageshack.us/img38/1905/atsas.png

“As CRUTEM is the main proxy of Tingley et al. for the twentieth century, as it is this proxy which is at the origin of the hockey stick shape, it does not seem completely off topic to say a few words about it.” Good point 😉

I’m inclined to agree with you that there are serious problems with CRUTEM (and other similar thermometer-based estimates). In particular, I am not convinced that the problems of urbanization bias, inadequate station micro-climate, etc. have been adequately resolved by the likes of Peterson, 2003; Parker, 2006; Wickham et al., 2013; Menne et al., 2010; etc. A discussion of my problems with those studies would take longer than a few sentences, though, and would probably be going off-topic.

In any case, these biases seem to have a net “warming” trend, and it seems entirely reasonable to me to suppose that the CRUTEM “global warming” trends since 1850 have, at the very least, been overestimated. If so, this could substantially reduce the “problem” part of the so-called “divergence problem”! 😉

Having said that, I am not convinced that the problems with the temperature proxies have been resolved either. Your figure is certainly interesting, but it’s not sufficient to make your claim that the Swiss tree ring MXD and the Huss et al., 2009 glacial melt anomalies are better temperature estimates than the thermometer-based estimates. By the way, have you posted this figure before? I seem to remember seeing someone (I think it was you) post a similar figure (also in French, and similar in appearance) in the comments of blog-posts before.

You don’t provide a lot of details on the basis for your figure, so I’m not sure how much you know/don’t know about the series you’ve plotted. In the following, I will outline my understanding of the series in the figure. I assume you are quite familiar with them since you referred to them, so you will probably find some of the following basic. But, I think the basics do need to be discussed. Also, maybe some of the others still reading this week-old thread might find it helpful? Steve, if you feel I’m going off-topic, feel free to snip, but I’ll try to stay on-topic.

I think it’s helpful to explicitly consider the temperature properties of the thermometer records and temperature proxies. In terms of our goal of estimating actual global (or regional) temperature trends, both datasets have separate advantages and disadvantages.

To my mind, the thermometer records have the advantage that they are actual temperature readings, but have the disadvantage that the temperature recorded at a station is the sum of:
(a) the immediate micro-climate + (b) local climate + (c) regional climate + (d) global climate
This is a serious problem for studying long-term trends, because both the micro-climate and the local climate of most (if not all) weather stations have changed dramatically since the 19th century due to urbanization, modernization, etc. The challenge of extracting (c) and (d) from the thermometer records without including (a) or (b) has been seriously underestimated, in my opinion.

In contrast, some temperature proxies can offer the advantage that they are often located in areas relatively isolated from human activity – although we just have to consider the Tiljander proxies discussed in this post for a counter-example. A key disadvantage (aside from the prevalence of the likes of Yamal, bristlecones, etc) is that the proxies are not direct temperature measurements.

For example, if we take tree rings (whether MXD or ring widths), tree ring growth is a function of many factors, e.g., sunlight, soil moisture (precipitation), soil nutrients, age of the trees, absence of infestations/fires/etc… and temperature. If one of these factors is the sole “limiting factor” for a period of time, then it is plausible to consider using tree ring growth as a proxy for that factor. However, a big problem is how can we know that that factor was the limiting factor over the entire period of the record?

Some dendroclimatologists argue that, if you select trees that are in cold conditions, i.e., “boreal” (high latitude) or “alpine” (high elevation), then you increase the frequency of time that temperature is a limiting factor. I’m guessing your Swiss MXD series falls into the “alpine” category? Indeed, the use of the term “alpine” originates from the European Alps. Where is your source for them? Is it one of the CRU grid-boxes (Briffa et al., 2002) that Tingley & Huybers use?

I find that theory plausible, and it gives me some optimism that it may be possible to extract a meaningful temperature signal from these proxies. Your figure suggests it is possible that there is some temperature signal there. My main difficulties are two-fold:
1. I find it unlikely that temperature is and always was the limiting factor for all of the trees. So, how do we establish the temperature signal to noise relationship?
2. Even if the signal-to-noise ratio is great, it is likely that it is a non-linear relationship. For example, if a cold tree is temperature limited, then whenever it gets warmer, it stops being temperature limited! 😉
My current opinion is that these are actually more challenging problems than those with the thermometer records… Have you considered those problems for your Swiss MXD series?
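To illustrate the second difficulty, here is a toy “limiting factor” model in Python. It assumes growth is set by whichever factor is scarcest (temperature or soil moisture), with simple linear ramps between made-up thresholds. The function names and numbers are hypothetical and are not taken from any published tree-ring model.

```python
def ramp(value, lower, upper):
    """Linear response clipped to [0, 1] between two thresholds."""
    return min(1.0, max(0.0, (value - lower) / (upper - lower)))


def ring_growth(temp_c, soil_moisture):
    """Growth index under a 'minimum of the limiting factors' assumption."""
    g_temp = ramp(temp_c, 2.0, 16.0)         # hypothetical temperature thresholds (deg C)
    g_moist = ramp(soil_moisture, 0.0, 1.0)  # hypothetical moisture index in [0, 1]
    return min(g_temp, g_moist)


# With moisture fixed at 0.6, growth tracks temperature only while it is cold.
for temp in (5.0, 10.0, 15.0, 20.0):
    print(temp, round(ring_growth(temp, 0.6), 2))
```

In this sketch the tree responds to warming between 5°C and 10°C, but between 15°C and 20°C the growth index stays flat because moisture has become the limiting factor. A linear calibration fitted in the cold regime would then underestimate the warmer temperatures.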

With regards to the Huss et al., 2009 Swiss glacial melt anomalies, there are similar problems. Glacial advance/retreat is a surprisingly complex phenomenon. I think glaciers are actually more strongly influenced by climate change than tree rings… but, notice I didn’t say “global warming”. I mean “climate” as in the 30-year average weather conditions, which comprises temperature, precipitation, wind patterns. Like tree ring growth, glacial advance/retreat depends on a number of different factors, e.g., seasonal precipitation (amount and type), seasonal temperature, the shape and landscape of the glacier and the mountain underneath (“topography”), cloud cover, the presence of soot (“black carbon”).

Huss et al., 2009 (Google Scholar) use relatively long records of “glacial mass balance” at four locations in the Swiss alps (two of the locations are for the same glacier). The annual mass balance is usually determined by:
1) Digging a snow pit at a spot on the glacier (in the “accumulation zone”) at the end of the melting season, and then measuring the amount of snowfall at the end of the year.
2) Burying a stake at a different spot on the glacier (near the “ablation zone”) at the end of the winter season, and then measuring how much of the snow at the stake has melted by the end of the year. Care needs to be taken to make sure the stake hasn’t moved down the glacier.

The annual mass balance is then calculated as the difference between these two values:
(a) snow accumulation in the winter – (b) amount of snow melting in the summer (“summer ablation”).
In my opinion, these measurements are more informative than the “glacial extent” records that are typically shown in media presentations. But, unfortunately, there are only a few glaciers that have been studied using this approach, and most of these records only have a few years of data. Almost all of the records begin after the 1950s. This is why Huss et al.’s data is important. They managed to find four glacial mass balance series with almost continuous records going back to the time of World War 1.
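As a minimal worked example of the (a) − (b) calculation, here it is in Python. The numbers are invented for illustration (think of them as metres of water equivalent) and are not taken from Huss et al.

```python
from itertools import accumulate


def annual_mass_balance(winter_accumulation, summer_ablation):
    """Annual balance = winter snow accumulation minus summer melt."""
    return winter_accumulation - summer_ablation


# Hypothetical three-year record.
winters = [1.5, 1.0, 2.0]
summers = [2.0, 1.5, 1.5]

annual = [annual_mass_balance(w, s) for w, s in zip(winters, summers)]
cumulative = list(accumulate(annual))

print(annual)      # [-0.5, -0.5, 0.5]
print(cumulative)  # [-0.5, -1.0, -0.5]
```

A negative annual value means the glacier lost mass that year. The cumulative series is what gets compared with long-term temperature trends.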

But, what do they tell us about Swiss temperature trends? I’m assuming your argument is that because the four-series anomaly average has similar trends to the MXD series and that both of these series are similar to the corresponding CRUTEM boxes, once we subtract a 1.5°C/century linear (?) trend from them, that:
1. The MXD series and glacial mass balance series are showing the “true” temperature trends for the area
2. The CRUTEM series has a warming bias of about 1.5°C/century
Is that correct? If so, I have a few questions…

Where did you get the value 1.5°C/century? Did you just subtract the MXD series from the CRUTEM, and assume the difference was “bias” in CRUTEM? Why? Isn’t that just circular logic? Maybe the “bias” is in MXD, maybe there is bias in both, or maybe MXD is also affected by other factors, e.g., precipitation trends…

Why do you think the glacial mass balance series is a temperature proxy series? Is it just because it’s quite similar to MXD? If so, then you still have to show that MXD is showing pure temperature trends. I don’t think Huss et al., 2009 actually claim their series is a temperature proxy.
Ironically, their basis for saying it’s not a good temperature proxy is the same as your apparent basis for concluding it is a good temperature proxy! 😉 I.e., the series doesn’t show the same trends as CRUTEM (or rather the Meteo Swiss data from which CRUTEM is constructed). Therefore, they conclude it’s not reliable.
I disagree with their argument for the same reason I disagree with yours – it’s circular logic. 😦 If two series disagree, that doesn’t in itself give you enough information to decide which is more accurate!

There are other reasons to be cautious of the melt anomaly series you plotted. If you look at Huss & Bauder, 2009, from which it was constructed, there are actually significant differences between the four sites. For instance, they found that the highest-altitude site had “opposite trends in summer balance compared to the other series”.

Also, in a follow-on study, Huss et al., 2010 combined the four series of Huss et al., 2009 with less complete records from a total of 30 Swiss glaciers. These extra series had less data, and they ended up interpolating between points using a mass-balance model. I am somewhat cautious of mass-balance models, because currently, they provide different results depending on which model you use, and also they often incorporate thermometer station records (!). But, if I look at Figure 2 in Huss et al., 2010, to me the 30-series average actually looks closer to CRUTEM than MXD!

So, I’m not yet convinced by your figure. Do you have any more information about it, to make your case more compelling? Again, I want to stress that I agree with you that CRUTEM is problematic. But, that’s not sufficient reason for saying MXD is a more accurate representation of regional temperatures…

P.S. Steve, apologies for the long length of this comment. If you feel you need to cut some of it, I’ll understand.

Thank you for the interest you have shown in my post. I do not pretend to do scientific work; I only try to draw the attention of researchers to a number of points that I think are important. For more than three years I have mostly done this through comments on various blogs, so it is very likely that you have seen this graph elsewhere. A detailed answer in English would be a mountain for me and would take the thread completely off topic. So I will be brief and schematic.

1. My questioning of instrumental temperatures did not originate from the behavior of proxies but from the cooling bias of discontinuities in raw data. The comparison with proxies is only a means of testing a hypothesis about it (hypothesis described in Hansen et al. 2001).

2. Two proxies against one instrumental series are indeed not enough to draw a conclusion. But in this particular case there are not two but four proxies that converge: add snow [1] and TLT (shorter period but comparable divergence).

3. Ring width (TRW) and ring density (MXD) are two totally different proxies. TRW is closely linked to growth and the various factors influencing it; this is hardly the case for MXD. The series used in my chart are all those identified by Schweingruber as Swiss and continuous at least from 1901 to 1950 [2]. By no means all of these series are located in areas at the limits of growth. I consider MXD a good proxy for temperature only on the basis of the high-frequency correlation [3]. Indeed, the relationship is probably not linear over the full range; see the behavior in the 40s.

4. The linear value of 1.5 °C since 1890 is only a crude modeling attempt. The calculated divergences vary among the 4 proxies (from memory, between 0.12 and 0.16 °C per decade).

5. I do not have complete data for Huss 2010, but you’re right, the evolution is closer to instrumental data. Nevertheless, I consider this later paper less reliable because Huss 2009 shows a very remarkable behavior towards Davos temperatures [4] (partial homogenization according to Hansen et al. 2001). You tell me some details about Huss 2010 that I did not know and that further reinforce this hypothesis.
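For readers wondering how a divergence rate like the one in point 4 might be estimated, one simple option is an ordinary least-squares slope fitted to the difference between an instrumental series and a proxy series. The sketch below is an assumption about how such a number could be obtained, not a description of the method actually used for the chart, and the two series are synthetic.

```python
def linear_trend(years, values):
    """Ordinary least-squares slope of values against years (units per year)."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    sxx = sum((x - mean_x) ** 2 for x in years)
    return sxy / sxx


# Synthetic series, for illustration only: the "instrumental" series
# warms by 0.15 degrees per decade, the "proxy" series stays flat.
years = list(range(1900, 2001, 10))
instrumental = [0.15 * k for k in range(len(years))]
proxy = [0.0 for _ in years]

divergence = [t - p for t, p in zip(instrumental, proxy)]
slope_per_decade = linear_trend(years, divergence) * 10

print(round(slope_per_decade, 3))  # 0.15
```

Note that the slope only measures how fast the two series drift apart. It says nothing about which of the two series is responsible for the drift.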

I add a little detail that makes me distrustful of Huss 2010. The use of models to overcome a lack of data is questionable and certainly a sensitive enterprise: it is particularly vulnerable to preconceived ideas, unfailing objectivity is required. But what do we see?

In Figure 3a, the green curve is identified as an 11-year running mean. This is not an appropriate designation, because a centered moving mean of 11 years is undefined for the last five years. It is therefore a smoothed curve. The section covering the last five years is singular: comparison with the annual values shows that it corresponds to no known smoothing algorithm. In fact, this section was hand-drawn in a perfectly tendentious manner. This small graphic manipulation reflects a state of mind incompatible with an objective treatment of incomplete data.
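The point about the end of the series can be checked with a minimal sketch of a centered running mean. The function below is a plain illustration with a made-up input series, and it assumes an odd window length; it is not the code used in Huss 2010.

```python
def centered_running_mean(values, window=11):
    """Centered moving average; None where the full window does not fit.

    Assumes an odd window, so that each mean is centered on one value.
    """
    half = window // 2
    out = []
    for i in range(len(values)):
        if i < half or i >= len(values) - half:
            out.append(None)  # fewer than `half` neighbours on one side
        else:
            segment = values[i - half : i + half + 1]
            out.append(sum(segment) / window)
    return out


series = list(range(20))  # made-up annual values 0..19
smoothed = centered_running_mean(series, window=11)

print(smoothed[:5])   # [None, None, None, None, None]
print(smoothed[5])    # 5.0
print(smoothed[-5:])  # [None, None, None, None, None]
```

With an 11-year window, the first five and the last five years have no value. Any curve drawn through those years must come from some other procedure, such as padding or extrapolation.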

Hi phi,
Sorry, another busy week! I’ve a lot of deadlines coming up. So, I’m not sure if I’ll be able to continue this interesting discussion much longer. 😦 But, in any case, I thought I’d make a few comments on your reply.

Thanks for elaborating on your figure. I think I understand where you’re coming from a bit better.

I am similarly sceptical of the reliability of Huss et al., 2010. But, you have to be very careful not to fall into confirmation bias when deciding that Huss et al., 2009 is “reliable”, but Huss et al., 2010 is “unreliable”.

There are features about Huss et al., 2009 that I find intriguing, e.g., as far as I remember, the peak in 1947 corresponds to a particularly hot year in the thermometer records of other parts of Europe.

But, the fact that the 2009 and 2010 papers by Huss et al. give different long-term trends suggests that the Swiss glacier data is quite ambiguous and subjective. This means that your warning on models also applies to picking one proxy as being more reliable than another, i.e., “it is particularly vulnerable to preconceived ideas, unfailing objectivity is required”. Do you know what I mean?

By the way, if you’re interested in using glaciers as temperature proxies, I would recommend reading some of the papers by Dr. Gerard H. Roe. He did his PhD for Dr. Lindzen in the 1990s, but has done a lot of research since then in trying to figure out what climate signals we can extract from glaciers. In my opinion, he is asking many of the “hard questions” about what we can and what we can’t say from glaciers. He has pdfs for a lot of his papers on his website here. He’s currently based in Seattle, U.S.A.

In particular, I think you’d find Roe & O’Neal, 2009 worth a look (“The response of glaciers to intrinsic climate variability…”). In it, they put forward a new glacier mass-balance model and explicitly discuss the assumptions and caveats of mass-balance models. You might find it helpful for assessing Huss et al., 2010.

P.S. When you say “TLT”, are you referring to the 1978-2013 satellite-based estimates of “Troposphere Lower Temperatures”, e.g., Spencer & Christy’s dataset or the RSS dataset?

P.P.S. In terms of the Swiss instrumental records, have you looked at the switch from manual measurements to automated measurements? As far as I remember, MétéoSuisse changed almost all of their stations from manual to automated around the same time (late 1970s/early 1980s).

In several cases, I think this may have introduced a warm bias, but because all of the changes occurred around the same time, it’s quite hard to check…

My wording about Huss 2010 was clumsy. Hansen et al. 2001, regarding the issue of discontinuities, has not been refuted. One can legitimately take its conclusions as the current state of scientific knowledge on the subject. In the particular case of the reference station of Huss 2009, this means that the temperature changes in Davos, as far as the current state of science allows us to approach them, correspond to the dark blue curve in the graph [4]. The quality of the match of the melting anomaly with this curve is quite remarkable and not very surprising. Among all imaginable proxies, the melting anomaly is probably the one whose relationship to temperature is physically the most straightforward and the best established. Then comes Huss 2010, with a questionable methodology, invalidating a very close link between melting anomaly and temperature. This leads, at least as a first step, to doubt about the reliability of Huss 2010. All the more so:
– Huss 2010 is partly built on the homogenized instrumental temperatures. Its use to confirm those same instrumental temperatures would be circular reasoning.
– Huss 2010 is inconsistent with instrumental temperatures properly treated, with MXD, with snow and with TLT. That’s a lot.

Thank you for the very interesting link to Roe. Glacier length is an easily accessible parameter, available over a much longer period; an understanding of its dynamics (with the fascinating and complex memory effect) is promising. However, due to the simplicity of its relationship to temperature, the melting anomaly remains a much more robust proxy where data are available.

Regarding TLT, I refer to UAH low troposphere (grid 2.5 x 2.5 deg).

Stations. As far as I know, the automation of the main network (12 stations) took place from 1977 to 1982 for nine stations, typically at the same time as a more or less significant relocation; the three others remained manual at least until 1997. The corresponding discontinuities (in degrees C, my evaluation): 2 without any discontinuity, 4 between 0.15 and 0.3, 3 between 0.5 and 0.8. Meteorologists estimate that discontinuities strictly related to the instrument change are weak, and this distribution seems to confirm that.

To go back to the topic of this thread and MXD, I initially pointed out that we should get rid of the preconceived idea that MXD were necessarily responsible for the divergence. I hope these samples will help.

I add in conclusion that the main weakness of this paper (Tingley) lies in the unexplained divergence MXD – T. The time origin of this divergence seems to me an essential item on the way to a solution. Oddly enough, it is easy to show that the origin lies in the early twentieth century and not in the 1960s as usually claimed (see graph [3] and this link: http://noconsensus.wordpress.com/2011/11/27/provenance-of-the-decline-a-forensic-analysis/).

Over the last few years, I have been looking at the various homogenization methods which have been proposed for thermometer records, including the Hansen et al., 2001 approach you mention and the Begert et al., 2005 approach used by MétéoSuisse. I have serious reservations about both approaches.
But, I don’t want to get involved in a discussion on the homogenization of thermometer records right now, because there are a lot of issues which need to be considered, and it could easily lead into a thread of its own! Also, I’m still inundated with deadlines, so unfortunately I don’t have the time (at the moment!) for a proper discussion… 😦

Having said that, I am of the opinion that:
1. The thermometer records are substantially affected by non-climatic biases, and homogenization of some sort is probably necessary before they can be reliably compared to temperature proxies.
2. Much (if not all) of the so-called “divergence problem” is likely due to biases in the thermometer records, and nothing to do with MXD.
