Briffa’s Tornetrask Reconstruction

Briffa’s temperature reconstruction from Tornetrask (northern Sweden) tree rings is a staple of multiproxy studies, used in Bradley and Jones [1993], Hughes and Diaz [1994], Jones et al. [1998], MBH98, MBH99, Briffa and Crowley and Lowery [2000], Briffa et al. [2000], Bradley, Hughes and Diaz [2003], Mann and Jones [2003] and Jones and Mann [2004]. Briffa makes an ad hoc "adjustment" to the MXD chronology which has a dramatic impact on the relation between 20th century and medieval levels of the chronology, and which therefore affects all downstream multiproxy studies.

The Tornetrask data were discussed in three publications: Nature (1990), Clim. Dyn. (1992) and Climatic Variations (1996) ("NATO"). The pertinent archived data at WDCP are under series swed019w (ring width) and swed019x (maximum density). Esper et al. [2002] use Tornetrask, but appear to have used an independently calculated chronology. Briffa and Osborn [1999] use the Tornetrask ring width series; Briffa et al. [2001] provide no information on the sites used, so it is unknown whether the matters discussed here apply to that study.

Briffa’s "Adjustment"
Low 20th century levels are inconsistent with Hockey Team policy, so Briffa et al. "adjusted" the MXD chronology. The "adjustment" is described in Clim. Dyn. 1992 (see Figure 7), but there is no illustration of the eventual impact on the MXD chronology. I’ve attempted for some time to replicate this "adjustment", which I describe here.

Briffa et al. justify an "adjustment" to MXD data because of a supposedly changing relationship between ring width (RW) and MXD in the past two centuries. The top panel of Figure 2 below shows Tornetrask chronologies – RW in red, MXD in black. Briffa regressed MXD (smoothed 10-year version) against RW (smoothed 10-year version) for 500-1750 and used the model to calculate residuals over the entire period (shown in the 2nd panel, smoothed with a 25-year gaussian filter). Briffa et al. then calculated the trend of the density-against-RW residuals since 1750 (shown as a red line in the 2nd panel). I find it hard to ascribe particular significance to this "trend" as compared to earlier "trends". Briffa then adjusted the post-1750 residuals from the MXD vs RW regression by this trend, as shown in the 2nd panel (which is scaled to sd units of the residuals).
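A minimal sketch of the steps just described, in Python, may make the mechanics easier to follow. It is not Briffa's code, and it assumes the two chronologies have already been smoothed. The function names `ols_fit` and `adjust_mxd` and the synthetic series are hypothetical. Whether the fitted line was anchored to zero at 1750 is not clear from the description, so the full fitted line (intercept included) is subtracted here.

```python
import random

def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def adjust_mxd(years, rw, mxd, calib_end=1750):
    """Emulate the described adjustment on already-smoothed series."""
    # 1. Regress MXD on RW over the calibration period (start..calib_end).
    cal = [i for i, yr in enumerate(years) if yr <= calib_end]
    a, b = ols_fit([rw[i] for i in cal], [mxd[i] for i in cal])
    # 2. Residuals (actual - estimated) over the entire period.
    resid = [m - (a + b * r) for r, m in zip(rw, mxd)]
    # 3. Straight line through the residuals from calib_end onward.
    post = [i for i, yr in enumerate(years) if yr >= calib_end]
    c, d = ols_fit([years[i] for i in post], [resid[i] for i in post])
    # 4. Add the line, sign reversed, to the MXD values from calib_end onward.
    adjusted = list(mxd)
    for i in post:
        adjusted[i] = mxd[i] - (c + d * years[i])
    return resid, adjusted

# Synthetic stand-in data (NOT the Tornetrask measurements).
random.seed(0)
years = list(range(500, 1981))
rw = [random.gauss(0, 1) for _ in years]
mxd = [0.6 * r + random.gauss(0, 0.5) - 0.004 * max(0, yr - 1750)
       for r, yr in zip(rw, years)]
resid, adjusted = adjust_mxd(years, rw, mxd)
```

By construction the post-1750 residuals have no linear trend left after step 4, whatever the cause of the original trend, which is why the choice of 1750 and of a straight line carries so much weight.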

The third and fourth panels show the impact of this "adjustment" on the "adjusted" MXD chronology. The third panel shows the unadjusted (black) and adjusted (red) MXD chronology, here centered on the 500-1980 mean of the unadjusted chronology. The fourth panel re-standardizes to a 1902-1980 zero (as used in MBH98-99), with a further 25-year smooth, evidencing the profound change in relationship between 20th century and earlier values.

Briffa purported to justify the "adjustment" by showing that the "adjustment" increased the R2 of the reconstruction from 0.503 to 0.555 (see Clim. Dyn. Table 2, columns 2 and 3). His regressions do not demonstrate any attempt to examine the significance of any coefficients – I’ve replicated numbers that look somewhat like Table 2, but, when I do so, the t-values for all coefficients other than MXD lack significance. I have not been able to replicate even the very slight gain in R2, obtaining a gain of less than 0.01. While the "adjustment" is offensive because of its ad hoc-ness, at a minimum it reduces the number of degrees of freedom being modeled, and Briffa should show some sort of F-test being met, which naturally is not done. I’m not exactly sure right now how you would properly model this sort of bizarre ad-hockery in an F-test, but intuitively the gain in R2 seems too small to pass any sort of test. (In passing, Briffa and Jones are both co-authors of Rutherford et al. [2005], with its diatribe against R2 statistics.)
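One rough way to gauge a gain in R2 is the standard F statistic for nested linear models. The sketch below is only that, a sketch under stated assumptions: it treats the adjustment as if it consumed `n_extra` additional parameters (for instance the slope and intercept of the post-1750 line), which is not strictly a nested comparison because the adjustment alters the predictor data itself. The function name and all argument values are hypothetical; the number of calibration years and coefficients would have to be taken from Table 2 of CD92.

```python
def nested_f_stat(r2_reduced, r2_full, n_obs, n_params_full, n_extra):
    """F statistic for adding n_extra parameters to a nested linear model.

    n_params_full counts the fitted coefficients of the full model,
    including the intercept. Compare the result with an F distribution
    on (n_extra, n_obs - n_params_full) degrees of freedom.
    """
    numerator = (r2_full - r2_reduced) / n_extra
    denominator = (1.0 - r2_full) / (n_obs - n_params_full)
    return numerator / denominator

# Hypothetical inputs: only the two R2 values come from the text above.
f_value = nested_f_stat(0.503, 0.555, n_obs=100, n_params_full=8, n_extra=2)
```

Because the calibration series are autocorrelated, the nominal degrees of freedom overstate the information available, so any such statistic would flatter the adjustment rather than penalize it.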

Briffa has not archived any chronologies, only his temperature reconstruction. Measurements (RW and MXD) are available for a 65-tree set at WDCP (swed019w and swed019x). These were used in the Nature 1990 article. The CD92 article used a larger 425-core set for the RW chronology (but stayed with the 65-core MXD chronology). No measurement or RW information is archived for the 425-core chronology. In the absence of information on the larger dataset, I calculated an RW chronology using the 65-core set and visually compared the plot to the illustration of the 425-core RW chronology in CD92; the differences do not appear to be material to the effect. The RW chronology is used only to calculate residuals, and the panel shown here replicates Figure 7 of CD92 in all material detail. (I can’t think of a good reason why a valid effect should not be present in the RW chronology of the 65 cores in question anyway.)

A small curiosity in Briffa’s temperature reconstructions is the presence of a 1980 value. His regression estimate for the temperature in year N includes the tree ring values of the following year (N+1). Since the tree ring chronologies end in 1980, how does he get a 1980 temperature reconstruction value? The upspike at the end of the temperature reconstruction (Nature 1990, Figure 2a) is not present in my emulation. What seems to have happened (and I noticed this because I did it in a first run) is that some software picks up the year 1 (i.e. AD500) value and inserts it for the missing value. It looks very much as if the NA value for 1980 was overwritten by a high value from AD500. Not much turns on this latter error, but the impact of the first "adjustment" looks like it may be quite important in the very non-robust MWP calculations of Jones et al. [1998] and perhaps others.
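The mechanism is easy to reproduce. The Python sketch below is purely illustrative (the software actually used is unknown, and the function names and toy values are hypothetical): a one-year lead implemented as a circular shift silently moves the first value of the series into the last slot, whereas a lead that respects the end of the record leaves a missing value there.

```python
def lead_circular(values, k=1):
    """Shift a series forward by k steps, wrapping around at the end (the bug)."""
    n = len(values)
    return [values[(i + k) % n] for i in range(n)]

def lead_with_missing(values, k=1):
    """Shift a series forward by k steps, leaving None where no data exist."""
    n = len(values)
    return [values[i + k] if i + k < n else None for i in range(n)]

# Toy chronology: the first entry plays the role of a high AD500 value.
chronology = [9.9, 0.1, 0.2, 0.3, 0.4]

print(lead_circular(chronology))      # [0.1, 0.2, 0.3, 0.4, 9.9]
print(lead_with_missing(chronology))  # [0.1, 0.2, 0.3, 0.4, None]
```

With the circular version, the regression for the final year is fed the first year's value as its "N+1" predictor, which is exactly the kind of spurious end-of-series upspike described above.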

UPDATE: Here is the text of the Clim Dyn. "adjustment" justification in full (p.116):

"Up until about 1800, the April-August RCS-based reconstruction (primarily dependent on density data, as is evident in Table 2) is consistently warmer (by ~0.25 deg. C) than the RCS ring-width reconstruction for July-August. The density chronology (Fig. 5b) shows a low-frequency decline over the last century which appears anomalous in comparison with both the RW data and the instrumental data over the nineteenth and twentieth centuries. These facts suggest that the density coefficients in the regression equation may be biased as would be the case if the density decline were not climate related (CO2 increases and/or the potential effects of increasing nitrogen input from remote sources, known to have occurred over the present century, may be implicated here.)

We examined the magnitude and timing of the recent MXD decline by regressing the 10-year smoothed RCS density curve against the equivalent RCS ring-width curve over the period 500-1750. The regression equation explains just under 35% of the MXD variance. Using this equation, we estimated MXD values for 501-1980. The residual MXD data (actual – estimated) are plotted in Fig 7. [this corresponds very closely to my 2nd panel] A systematic decline is apparent after 1750. By fitting a straight line through these residuals (1750-1980) and adding the straight line values (with the sign reversed) to the RCS density curve, the anomalous post-1750 decline was removed (cf. Fig. 7). This ‘corrected’ RCS curve was then used along with the RCS ring width curve in a final reconstruction of April-August temperature. The calibration of this reconstruction now explains 55 per cent of the instrumental variance (cf. Table 2) [where the ‘uncorrected’ reconstruction shows calibrated variance (R2) of 50.3%]. The improvement supports our contention that the anomalous recent density trend was not climate related."

Regarding “I’m not exactly sure right now how you would properly model this sort of bizarre ad-hockery in a F-test”: I thought you were suggesting a Chow test for structural change in 1750. This test of course assumes some a priori identification of 1750 as a potential break-point for some exogenous reason, and maybe that is why you are wondering whether it is a legitimate thing to do? Nevertheless, from the plots you give, it would seem that the Chow test would not yield a significant F statistic for 1750 being a break point.
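For readers unfamiliar with it, a Chow test for a single-regressor model (here, MXD regressed on RW with a candidate break at 1750) can be sketched in a few lines of Python. This is a generic textbook version with hypothetical function names, not a calculation on the Tornetrask data, and it takes the break index as given in advance.

```python
def rss_line(x, y):
    """Residual sum of squares of the least-squares line y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    return sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))

def chow_f(x, y, split):
    """Chow F statistic for a break at index `split` in y = a + b*x.

    k = 2 parameters (intercept and slope) are fitted in each segment.
    Compare the result with an F distribution on (k, n - 2k) degrees
    of freedom.
    """
    k = 2
    n = len(x)
    rss_pooled = rss_line(x, y)
    rss_1 = rss_line(x[:split], y[:split])
    rss_2 = rss_line(x[split:], y[split:])
    numerator = (rss_pooled - rss_1 - rss_2) / k
    denominator = (rss_1 + rss_2) / (n - 2 * k)
    return numerator / denominator
```

The nominal F distribution assumes independent errors; with 10-year smoothed series the residuals are strongly autocorrelated, so a nominally significant value would still need to be treated with caution.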

Looking at this in broad daylight, I have to say that I am astonished.
It is evident that you need the strongest possible justification for the data manipulation that was undertaken, and looking at the result of the data manipulation, it seems that it couldn’t be possible to justify such a manipulation under any circumstances. On the face of it, it seems that the meaning of the data has been turned upside down by this “manipulation” !
This must surely be a major source of embarrassment for all those authors who have used this dataset…
yours
per

BTW I’ve had no luck getting Briffa to identify the 387 sites in Briffa et al [2001]. Maybe someone else can obtain a listing of these sites; I’ve tried without success. If you’re on the Hockey Team, why should you have to identify the sites that yield your results? Noblesse oblige only extends so far.

Briffa is also the gatekeeper for S.O.A.P. data, which I still am unable to access.

There’s an overlap between the coauthors of the Briffa “Adjustment” and multiproxy studies: coauthors of the “Adjustment” are Briffa, Jones and Schweingruber: Briffa and Jones are in the multiproxy studies Jones et al [1998], Briffa et al [2001] and Rutherford et al [2005]; Jones alone is in Mann and Jones [2003] and Jones and Mann [2004]; Schweingruber is in Esper et al [2002]. So it’s not as though the multiproxy studies were blindsided.

Briffa et al. justify an “adjustment” to MXD data because of a supposedly changing relationship between ring width (RW) and MXD in the past two centuries. The top panel of Figure 1 below shows Tornetrask chronologies – RW in red, MXD in black. Briffa regressed MXD (smoothed 10-year version) against RW (smoothed 10-year version) for 500-1750 and used the model to calculate residuals the entire period (shown in the 2nd panel, smoothed with a 25-year gaussian filter). Briffa et al. calculated the trend of density against RW since 1750 (shown as a red line in the 2nd panel)…

Briffa purported to justify the “adjustment” by showing that the “adjustment” increased the R2 of the reconstruction from 0.503 to 0.555 (see Clim. Dyn. Table 2, columns 2 and 3).

So let me see if I have got this straight. Briffa “adjusted” the data on the grounds that adjusting the data improves the R2? That’s it?

Steve, so that there’s no misunderstanding, I’ve transcribed the pertinent two paragraphs of the Clim Dyn 92 article at the foot of the posting. As you see, Briffa hypothesized some deus ex machina nonclimatic trend here, but you’ve heard the screams from the Hockey Team (including Briffa and Jones in Rutherford et al [2005]) when the anomalous growth of the bristlecone pines is criticized. Rutherford et al. [2005] contains a diatribe against the R2 statistic, strangely absent here. I guess if you call it “calibrated variance” instead of R2, that makes all the difference.

Thanks Steve M. I’m reading through the paragraphs right now. Basically I’m stumped, as it looks like the justification boils down to:

This data is weird in that it doesn’t fit with what we expected. Hence we are going to do some statistical mumbo-jumbo and then based on that mumbo-jumbo change the data so we get something more in line with what we expected.

Seems to me the next question should have been: why is the divergence happening? Instead, they basically say the divergence is wrong (even though they have no justification for that) and then “fix” the data.

This strikes me as a blatant and highly suspect variant of maximizing one’s R2, a dubious practice in the best of cases.

Sorry to keep pestering you about this, but this just looks really bad to me. So, the justification for this dubious “fix” is that the author suspected the data were biased. Was there any attempt to try and find the source of this bias? I’m not a biology/plant/tree guy so I don’t know what could cause such a divergence in the data.

I guess my question is: could the above divergence be explained by something natural? If that is the case, then there is bias, but the bias is from an omitted variable, and the proper statistical solution is to include the variable, not to “fix” the data so that it matches what was expected.

This strikes me as a highly dubious element of the paper and frankly it makes me wonder about Briffa as a serious researcher. Not that he’d really care what I think, but geez. This kind of thing sounds a lot like maximizing R2, and in the worst possible way (typically, maximizing R-square is done by trying alternate model specifications). This is actually changing the data with virtually zero justification other than a hunch, and then noting that the change which was intended to improve the R2 does indeed improve the R2.

Steve, I posted up my emulation of his chronologies at http://www.climate2003.com/data/briffa/tornetrask.chronologies.txt. Briffa has never archived any of his RCS chronologies. I’ve done quite a bit of experimenting to emulate RCS chronology calculations and am satisfied that I can accurately emulate the NATO 1996 examples. The raw data is at WDCP, but the raw dataset is pretty big: 18,851 individual measurements.

This is not pestering. I obviously find this sort of stuff pretty bizarre and appreciate the feedback. When I tried to replicate the calculations using the 65-core RW dataset (which yielded Figures looking very similar), I got an R2 for the “unadjusted” version of 0.555 and for the “adjusted” version of 0.558.

If you’re worried about Briffa as a serious researcher, he’s one of the biggest IPCC guys. The guys who brought us the Tornetrask adjustment (Briffa and Jones) are also the same guys (Jones and Briffa) who brought us the CRU surface temperature dataset, where Jones has refused to disclose station data to Warwick Hughes – see the reasons in the 15 Reasons post. Who knows what sort of adjustments lurk there? Warwick is pretty suspicious.

In the Polar Urals dataset, I noticed that, in the 16th century, the RW series has a big positive excursion while the MXD series is in a negative excursion. Briffa didn’t change the MXD chronology.

I just noticed an even better example in the Polar Urals dataset. In the 11th century – the Viking period – the RW chronology is positive and the MXD chronology is negative. Briffa, Jones et al. [1995] absurdly reports that 1032 is the “coldest year of the millennium” based on the density of 3 trees in the Polar Urals (which are poorly dated in my opinion). I didn’t notice the Hockey Team adjusting the 11th Century Polar Urals chronology – which is the problem with “adjusting” the data: it’s hard to remember to do it consistently.

BTW have you looked closely at Mann’s “adjustment” of the Gaspe data: that really sticks in my craw, but no one seems to care.

I’ll post up R code for the calculations if you like.

Another thing that’s going on with this dataset is that the samples in the 20th century are uniquely old within the roster. There are no young samples.

Steve V., Here is how I calculate the filters in R. It should be transposable to another language. I usually pad the ends by the mean for the length of the filter. Left to my own devices, I would probably use a lowess smooth, but I’m pretty sure that gaussian filters are used by Briffa elsewhere (although I can’t give an exact proof offhand) and so I’ve used gaussian filters in the absence of other information. I can’t imagine that the results would be very sensitive to the form of smoothing. Hope this helps, Steve Mc.
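The R code referred to above is not reproduced here. As a stand-in, the following Python sketch shows one way to build such a filter. It rests on two assumptions that are not in the comment: the window is taken to span roughly plus or minus three standard deviations, and "padding the ends by the mean" is read as padding each end with the mean of the nearest window-length stretch of data. The function names are hypothetical.

```python
import math

def gaussian_weights(window):
    """Normalized Gaussian weights for an odd window length."""
    half = window // 2
    sigma = window / 6.0  # assumption: the window spans roughly +/- 3 sigma
    w = [math.exp(-0.5 * (j / sigma) ** 2) for j in range(-half, half + 1)]
    total = sum(w)
    return [wj / total for wj in w]

def gaussian_smooth(series, window=25):
    """Smooth a series, padding each end with the mean of its nearest values."""
    if window % 2 == 0:
        raise ValueError("window must be odd")
    w = gaussian_weights(window)
    half = window // 2
    first = series[:window]
    last = series[-window:]
    head = sum(first) / len(first)   # mean of the first `window` values
    tail = sum(last) / len(last)     # mean of the last `window` values
    padded = [head] * half + list(series) + [tail] * half
    # Output i is centered on series[i], which sits at padded[i + half].
    return [sum(wj * padded[i + j] for j, wj in enumerate(w))
            for i in range(len(series))]
```

A call such as `gaussian_smooth(chronology, 25)` returns a series of the same length as the input. The choice of end padding matters mostly for the last few decades of the record, which is where the adjustment under discussion has its effect.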

Re Comment on 12
Steve V., a beaver will kill a tree by chewing off the bark and outer rings in a full band. These outer rings transport the great majority of the water and nutrients required for photosynthesis. This is the sort of thing that makes curing plain sawn lumber without bowing hell: the outer rings shrink considerably more than the inner. I would say the effect seen in figure 2 top panel should have been expected. The more recent RWs are greater relative to their MXDs than the older RWs. There may be other effects beyond this. How about recoring a tree last cored a hundred years ago to look for them? It would also be a test of this theory. How about taking RW and MXD on a core before and after curing? How did they cure the cores, if at all?

Correct me if I’m being unfair, but would it be accurate to summarize Briffa’s position as:
1) This Tornetrask adjustment to MXD data is justified because it slightly increases the R2 of the reconstruction.
2) It does not matter if hockey stick reconstructions have a low R2 because the correct measure to use is RE

It seems to me that, to meet the IPCC precondition for coherence, datasets with a low signal to noise ratio in terms of temperature were enhanced in an attempt to extract coherent data. This is, at best, risky and, in this case, disastrous. Although this data set is only causally connected to others, the entire process is brought into question.

“The bottom line though is that these trees likely represent a mixed temperature and moisture-supply response that might
vary on longer timescales.”…

“Another serious issue to be considered relates to the fact that the PC1 time series in
the Mann et al. analysis was adjusted to reduce the positive slope in the last 150
years (on the assumption – following an earlier paper by Lamarche et al. – that this
incressing growth was evidence of carbon dioxide fertilization) , by differencing the
data from another record produced by other workers in northern Alaska and Canada
(which incidentally was standardised in a totally different way). This last adjustment
obviously will have a large influence on the quantification of the link between these
Western US trees and N.Hemisphere temperatures. At this point , it is fair to say that
this adjustment was arbitrary and the link between Bristlecone pine growth and CO2 is ,
at the very least, arguable. Note that at least one author (Lisa Gaumlich) has stated
that the recent growth of these trees could be temperature driven and not evidence of
CO2 fertilisation.
I stil believe the “Western US” series and its interpretation in terms of Hemispheric mean
temperature is perhaps a “Pandora’s box” that we might open at our peril!
What does Jan say about this – he is very acquainted with these issues?
cheers
Keith”
