Muir Russell and the Briffa Bodge

There has been some recent discussion of the Briffa bodge – an early technique to hide the decline. I had drafted a post on the topic and its handling by the Muir Russell “inquiry” in early July 2010, but did not publish the post at the time. In today’s post, I’ve slightly updated my July 2010 draft.

The term “bodge” was first used by me in a comment (not a post) on November 8, 2009 here (less than two weeks before Climategate). I had noticed the term “Briffa bodge” in a preprint of Briffa and Melvin 2011 (see here), where it was used to describe a “very artificial correction” to Briffa’s widely used Tornetrask chronology as follows:

Briffa et al. (1992) ‘corrected’ this apparent anomaly by fitting a line through the residuals of actual minus estimated ring widths, derived from a regression using the density data over the period 501–1750 as the predictor variable, and then removing the recent apparent decline in the density chronology by adding the fitted straight line values (with the sign reversed) to the chronology data for 1750–1980. This ‘correction’ has been termed the ‘Briffa bodge’ (Stahle, personal communication)!

Bodging of the Tornetrask chronology had been discussed in much earlier CA posts – e.g. in March 2005 here and again here.

The term “bodge” also occurs in Climategate correspondence, as pointed out by Jeff Id on December 1, 2009 here.

In July 1999, Vaganov et al (Nature 1999) had attempted to explain the divergence problem in terms of later snowfall (an explanation that would seem to require caution in respect to the interpretation of earlier periods.) On July 14, 1999, Ed Cook wrote Briffa as follows:

Hi Keith,
What is your take on the Vagonov et al. paper concerning the influence of snowfall and melt timing on tree growth in Siberia? Frankly, I can’t believe it was published as is. It is amazinglly thin on details. Isn’t Sob the same site as your Polar Urals site? If so, why is the Sob response window so radically shorter then the ones you identified in your Nature paper for both density and ring width? I notice that they used Berezovo instead of Salekhard, which is much closer according to the map. Is that
because daily data were only available for the Berezovo? Also, there is no evidence for a decline or loss of temperature response in your data in the post-1950s (I assume that you didn’t apply a bodge here). This fully contradicts their claims, although I do admit that such an effect might be happening in some places.

I raised the Briffa bodge as an issue in my submissions to the Parliamentary Committee and Muir Russell as an example of “data manipulation”.

Although Muir Russell professed no interest in opining on the proxy issues that dominated the Climategate dossier, they reluctantly expressed an opinion on Briffa’s adjustment of the Tornetrask chronology, agreeing that the bodge was indeed “ad hoc”, but finding (without giving any evidence) that there was nothing “unusual about this type of procedure”. While I presume that this reassurance was intended to comfort their audience, I wonder whether readers should in fact be comforted by this observation.

Muir Russell condemned critics for even questioning the Briffa bodge, stating that it was “unreasonable that this issue, pertaining to a publication in 1992, should continue to be misrepresented widely to imply some sort of wrongdoing or sloppy science”.

I’ll review the long backstory today.

Early CA Commentary on the Briffa Bodge

Briffa’s ad hoc adjustment of the important Tornetrask chronology was discussed in one of the earliest CA posts here (though the term “bodge” was not then used). I reported:

Briffa makes an ad hoc “adjustment” to the MXD chronology which has a dramatic impact on the relation of 20th century and medieval levels of the chronology, which then affects all downstream multiproxy studies…

The density chronology (Fig. 5b) shows a low-frequency decline over the last century which appears anomalous in comparison with both the RW data and the instrumental data over the nineteenth and twentieth centuries. These facts suggest that the density coefficients in the regression equation may be biased as would be the case if the density decline were not climate related (CO2 increases and/or the potential effects of increasing nitrogen input from remote sources, known to have occurred over the present century, may be implicated here.)

We examined the magnitude and timing of the recent MXD decline by regressing the 10-year smoothed RCS density curve against the equivalent RCS ring-width curve over the period 500-1750. The regression equation explains just under 35% of the MXD variance. Using this equation, we estimated MXD values for 501-1980. The residual MXD data (actual – estimated) are plotted in Fig 7. [this corresponds very closely to my 2nd panel] A systematic decline is apparent after 1750. By fitting a straight line through these residuals (1750-1980) and adding the straight line values (with the sign reversed) to the RCS density curve, the anomalous post-1750 decline was removed (cf. Fig. 7). This ‘corrected’ RCS curve was then used along with the RCS ring width curve in a final reconstruction of April-August temperature. The calibration of this reconstruction now explains 55 per cent of the instrumental variance (cf. Table 2 [where the ‘uncorrected’ reconstruction shows calibrated variance (R2) of 50.3%]). The improvement supports our contention that the anomalous recent density trend was not climate related.

Obviously the very slight improvement in r^2 (unadjusted) arising from the bodge would not prove the validity of the bodge to any reasonable statistician. (I note in passing that Briffa et al 2001 also used a supposed microscopic improvement in unadjusted r^2 to adopt a reconstruction variation using principal components.)
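For concreteness, the procedure quoted above can be sketched numerically. The following is a hypothetical illustration with simulated series (all data and variable names here are assumptions, not Briffa’s data or code); it simply follows the quoted recipe: regress the density (MXD) chronology on the ring-width (RW) chronology over the pre-decline period, fit a straight line through the post-1750 residuals, and add that line (sign reversed) back to the density chronology.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the smoothed RCS chronologies, AD 500-1980
# (simulated; the real Tornetrask series is not reproduced here).
years = np.arange(500, 1981)
rw = rng.normal(0.0, 0.10, years.size)               # ring-width chronology
mxd = 0.6 * rw + rng.normal(0.0, 0.08, years.size)   # density chronology
post = years >= 1750
mxd[post] -= 0.002 * (years[post] - 1750)            # an "anomalous" post-1750 decline

# 1. Regress density on ring width over the pre-decline period (500-1750).
cal = years <= 1750
slope, intercept = np.polyfit(rw[cal], mxd[cal], 1)

# 2. Residuals: actual minus estimated density, over the full period.
resid = mxd - (slope * rw + intercept)

# 3. Fit a straight line through the 1750-1980 residuals ...
a, b = np.polyfit(years[post], resid[post], 1)

# 4. ... and add the fitted values (sign reversed) to the post-1750 data.
mxd_bodged = mxd.copy()
mxd_bodged[post] -= a * years[post] + b

# By construction, the linear trend in the post-1750 residuals is removed.
resid_after = mxd_bodged - (slope * rw + intercept)
```

Note that the removal of the post-1750 trend is guaranteed by construction, which is why the modest improvement in calibration r² reported in the paper does little to validate the adjustment itself.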

CA reader Per (a scientist) commented in 2005 as follows:

Looking at this in broad daylight, I have to say that I am astonished. It is evident that you need the strongest possible justification for the data manipulation that was undertaken, and looking at the result of the data manipulation, it seems that it couldn’t be possible to justify such a manipulation under any circumstances. On the face of it, it seems that the meaning of the data has been turned upside down by this “manipulation” ! This must surely be a major source of embarrassment for all those authors who have used this dataset…

In fact, the climate community was totally unembarrassed by the bodge. The bodged Tornetrask chronology was used in the first generation of multiproxy reconstructions (still in the spaghetti graphs): Jones et al 1998, Mann et al 1998-99, Crowley and Lowery 2000. (Tornetrask continued to be used in various versions in pretty much every reconstruction, but this is outside the scope of this post.)

McIntyre Submission, Feb 2010
The terms of reference for the Muir Russell “inquiry” had required them to examine possible “data manipulation”. In my submission (Feb 2010), I included the Briffa bodge of the Tornetrask chronology as a potential example of “data manipulation”.

In my introduction, I observed that some forms of data manipulation were so embedded that reviewers and specialists in the field either no longer noticed or were unoffended by the practices:

some forms of data manipulation and withholding are so embedded that the practitioners and peer reviewers in the specialty seem either to no longer notice or are unoffended by the practices. Specialists have fiercely resisted efforts by outside statisticians questioning these practices – the resistance being evident in the Climategate letters.

The bodge in Briffa et al 1992 was really the first example of “hide the decline”. I took care to observe that the bodge had been properly disclosed in the original 1992 paper. However, I noted that the presence of the bodge was not disclosed in the downstream multiproxy reconstructions (e.g. Jones et al 1998) nor included in the calculation of confidence intervals. Here is the relevant section of my submission:

One of the underlying problems in trying to use tree ring width/density chronologies for temperature reconstructions is a decline in 20th century values at many sites – Briffa’s 1992 density (MXD) chronology for the influential Tornetrask site is shown at left below. The MXD chronology had a very high correlation to temperature, but went down in the 20th century relative to what it was “expected” to do and relative to the ring width (RW) chronology (which had a lower correlation to temperature.) So Briffa “adjusted” the MXD chronology, by a linear increase to the latter values (middle), thereby reducing the medieval-modern differential. This adjustment was described in private as the “Briffa bodge” (Melvin and Briffa 2008).

In my submission, I pointed out that comments in Climategate programs about “fudge factors” and “artificial corrections” might indicate other uses of bodges. In Tim Osborn’s submission to the Parliamentary Committee, he stated that such bodges were not used in his publications (while remaining silent on the use of bodges by other CRU scientists). In fact, CRU publications later than May 1999 – the date of publication of Briffa and Osborn (Science 1999) and Jones et al (Rev Geophys 1999) – do not appear to use bodges (despite code comments). Rather than bodging, they used Keith’s Science Trick (the simple deletion of data) to hide the decline.

CRU’s submission nonetheless claimed that their medieval reconstructions were unaffected:

[Because] divergence does not affect any of our tree-ring-based temperature reconstructions that extend back to medieval time, the divergence phenomenon does not undermine the validity of our current estimates of the degree of warmth during the Medieval Warm Period. (p 3)

Given that the Briffa bodge was designed to “correct” the decline of the Tornetrask MXD chronology (a mainstay medieval proxy), it seems hard to support CRU’s claim that divergence had not “affected” their Tornetrask chronology, particularly when CRU itself admitted that an “ad hoc” adjustment had been made to the Tornetrask chronology (as I had pointed out):

In one earlier publication (Briffa et al. 1992a) describing an analysis of ring-width and MXD in northern Sweden, we discuss a related observation of ‘divergence’ where a local, ring width chronology tracks the local rise in measured late 20th century temperatures, but a parallel chronology constructed from density measurements from some of the samples used to produce the ring-width chronology, exhibits a recent decline relative to the ring-width (and local temperature) data. In that paper it was assumed that the density decline was a recent phenomenon and a clear ad hoc correction, based on a comparison between the ring-width and density data, was applied before the ‘corrected’ chronology was used to reconstruct past temperature. No attempt was made to disguise this correction.

The issue wasn’t whether the adjustment had been disclosed in Briffa et al 1992; I had clearly stated that it had been disclosed in my submission. The issue was whether the bodge constituted “data manipulation” within the terms of reference of the Muir Russell “inquiry”, whether allowance for the bodge had been properly made in multiproxy reconstructions and whether it had been adequately disclosed in the multiproxy reconstructions.

CRU’s argument was that the bodge had been vindicated by later work, in effect saying that Briffa et al 1992 had used erroneous statistical methodology in the development of the original Tornetrask chronology and that the MXD decline at Tornetrask was an artifact of this erroneous methodology:

We have subsequently established that the reason for the relative density decline in this case was a bias in the RCS curve used to remove the influence of tree ageing from the original density measurements.

The bias came about because the standardisation curve (that quantifies the expectation of MXD as a function of ring age) was calculated without removing the parallel influence of climate on the growth of old age trees (see Section 1.1 for a discussion of standardisation and later discussion in this section for details of the bias issue). When a more appropriate application of standardisation is used, the agreement between these ring-width and density chronologies is markedly better, providing support for the efficacy of the original “correction” applied to the MXD data in our 1992 paper (for details of the later analysis see Briffa and Melvin, 2010).

Using the Alice-in-Wonderland logic that is all too prevalent in climate science, CRU then argued that their original reconstruction hadn’t been “affected by divergence” after all:

Because our original correction to the MXD chronology was subsequently shown to be justified and because the cause of the problem was shown to affect only the recent end of the chronology, we contend that our reconstruction of northern Swedish temperatures that used it (Briffa et al. 1992a) can justifiably be considered to be not significantly affected by divergence.

Alice-in-Wonderland logic – even if the bodged chronology were to be somewhat vindicated on other grounds (something that was not established in published academic literature at the time), it is untrue that the original chronology was “unaffected” by divergence. Indeed, elsewhere, they inconsistently admitted that divergence had, in fact, been “observed” in the Tornetrask chronology:

An observation of divergence in a maximum-density chronology for northern Sweden (Briffa et al. 1992a) was later shown to be caused by the method of chronology production. The ad hoc correction applied to the original work was, therefore, appropriate and the temperature reconstruction for northern Sweden that made use of the corrected chronology is not invalidated by divergence.

And elsewhere, they argued that improved statistical methods only “mitigated” the impact of divergence:

Though as yet unpublished, CRU has also indicated the possible role of standardisation as a cause of the apparent change in climate sensitivity of growth indices from near-tree-line conifers in North America (D’Arrigo et al. 2004) as described in Section 1.2. The signal-free approach to tree-ring standardisation has been put forward as a possible way of mitigating one manifestation of divergence that arises in the use of traditional and RCS-based standardisation techniques (Melvin 2004, Melvin and Briffa 2008, Briffa and Melvin 2010).

The “Briffa and Melvin, 2010” of the Muir Russell report is Briffa, K. R., and T. M. Melvin. 2011. A closer look at Regional Curve Standardisation of tree-ring records: justification of the need, a warning of some pitfalls, and suggested improvements in its application. In M. K. Hughes, H. F. Diaz, and T. W. Swetnam, editors. Dendroclimatology: Progress and Prospects. Springer Verlag. A preprint is online here.

I might add that I entirely support close examination of tree ring standardization methods of the type that Melvin is carrying out. The statistical issues are not easy ones.

Melvin proposes the concept of “signal-free” standardization. I think (though I’m not sure of this) that this relates to some prior discussion at CA in which I noted that dendro chronology calculations have analogues in linear mixed effects models, e.g. the Pinheiro-Bates nlme package in R. I’ve shown that a “conventional” chronology can be obtained using the nlsList function and an RCS chronology using the nls function. The linear mixed effects function is the nlme function. My surmise is that Melvin’s “signal-free” method is a home-made method that has points in common with an nlme calculation. (This was something that I had hoped to present when I had been invited to present a paper at the World Dendro Conference last June; unfortunately this invitation was withdrawn.)
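To make the standardization distinction concrete, here is a minimal Python sketch (a simplified, assumed setup; the discussion above used R’s nls/nlsList/nlme) contrasting an RCS-style chronology, which pools all trees to estimate one expected-growth (“age”) curve, with a “conventional” chronology fitted tree by tree. For simplicity the simulated trees are all contemporaneous, which exaggerates exactly the bias CRU describes: the pooled curve absorbs the common climate signal along with the age trend.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical tree-ring data: expected growth declines with cambial age
# (the "age curve"), with a common climate signal superimposed.
n_trees, n_ages = 30, 100
ages = np.arange(1, n_ages + 1)
true_age_curve = 2.0 * np.exp(-0.02 * ages)
climate = rng.normal(0.0, 0.1, n_ages)   # stand-in climate signal
# All trees here start in the same calendar year (a deliberate simplification).
rings = np.array([
    true_age_curve * (1 + climate) * np.exp(rng.normal(0, 0.05, n_ages))
    for _ in range(n_trees)
])

# RCS-style: ONE expected-growth curve from all trees pooled by ring age,
# then divide each ring by the expectation.
regional_curve = rings.mean(axis=0)
rcs_chronology = (rings / regional_curve).mean(axis=0)
# With contemporaneous trees the pooled curve absorbs the climate signal
# too, so the RCS chronology comes out flat - the bias being discussed.

# "Conventional": fit a curve to EACH tree separately (here, an
# exponential fitted by least squares on log growth).
conv_indices = np.empty_like(rings)
for i, series in enumerate(rings):
    b, log_a = np.polyfit(ages, np.log(series), 1)
    conv_indices[i] = series / np.exp(log_a + b * ages)
conv_chronology = conv_indices.mean(axis=0)
# The per-tree fits leave the non-smooth climate wiggles in the indices,
# so the conventional chronology retains the climate signal.
```

Real RCS pools trees from many calendar eras precisely so that the age curve and climate signal can be separated; the cartoon above shows only why a contaminated standardization curve flattens (or distorts) the resulting chronology.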

Given that the decline occurs at many sites (Briffa and Melvin focus on Tornetrask), I would be surprised if the decline is simply an artifact of tree ring standardization methodology. However, the potential for systematic sampling problems cannot be ruled out. If the decline is simply an artifact of erroneous statistical methodology used by dendrochronologists, this deserves to be known. Given that similar methods are widely used in dendrochronology, if Melvin is right, it will require reappraisal of virtually every tree ring chronology in present use – a point that CRU failed to make.

The April 2010 Interview
After the announcement of the panel in February 2010, the Muir Russell “inquiry” carried out only one interview with Briffa (on April 9, collectively with Jones and Melvin.) The interview was attended by only two panelists, with Geoffrey Boulton, who had been employed for 18 years by the University of East Anglia, taking the lead. There isn’t any transcript of the interview. The cursory minutes state on this issue:

The panel members had read the relevant paper (Briffa 1992) and the CRU submission in respect of the adjustment made to the Tornetrask series. They nevertheless wanted to cover the arguments again as an opening to the meeting. What was the scientific reasoning that justified the adjustments to the most recent period, and which has been described as a “bodge” in one submission?

Boulton’s notes simply recapitulate the CRU submission:

The TRW and MXD had proved to be a good proxy for high frequency temperature fluctuations, but the MXD series fell relative to the TRW for low frequency after 1750. The MXD had been adjusted to match the TRW based upon the assumption that the TRW was correct. It had later been found that the effect was due to a bias in the standardization procedure used and hence the ad-hoc adjustment had been good. It had not been used to argue for a temperature increase over the period. The manipulation had not been hidden, but had been clearly described in the paper.

CRU Further Comments, June 2010
CRU was afforded an opportunity to reply to Boulton’s meeting notes. (Critics were not offered a similar opportunity to rebut.)
And, on June 16, 2010, only a couple of weeks before the publication of the Muir Russell report, they did so here.

Their response states that the Review Team had requested “some additional context and support for the UEA response”, which they provided as follows:

The following comment is consistent with the “Summary of salient points…” but provides some additional context and support for the UEA response, as requested by the Review Team.

Professor Briffa was asked about the Tornetrask series created in Briffa et al. (1992). What was the scientific reasoning that justified the adjustments to the most recent period, and which has been described as a “bodge” in Briffa and Melvin (2010)? Briffa explained that the paper used two independent sets of measurements from the Tornetrask trees, maximum latewood density (MXD) and tree-ring width (TRW). The inter-annual variability of both is significantly correlated with variability in local temperature. TRW changes are associated with growing season temperatures mostly in midsummer (particularly July) while MXD appears to respond to the changes during a longer season (April to September). Indices of both (TRW and MXD) correspond well with the high-frequency changes in temperature. Prior to 1750 the low-frequency signals were similar in both series but after 1750 indices of the MXD series fell relative to the TRW series. Because of the much higher replication in TRW, Briffa presumed the TRW were correct and made a linear adjustment to the MXD series to force the MXD series to agree with the TRW series. He referred to this as the “bodge” in later work (Briffa and Melvin, 2010, a copy of which was supplied to the Review Team in the March 1, 2010 submission) which used improved processing methods that showed that the difference between TRW and MXD chronologies had resulted in part because the standardization technique used in the original paper (Briffa et al. 1992) to remove sample-age bias in the MXD data was itself biased. A brief presentation was made to the Review Team to illustrate this. This later work confirmed that it had been correct to adjust the MXD series in the earlier work and thus justified the original “bodge”. The manipulation had not been hidden, but had been clearly described in Briffa et al. (1992). Briffa directed the Review Team to our previous written submission of 1st March 2010 (page 7 in Section 1.2).

They added:

The following comment summarises material that was covered during the meeting but that was not included in the “Summary of salient points…”.
Melvin and Briffa also gave a detailed explanation of why the latest published Tornetrask density chronology (Grudd 2008) also advocated by McIntyre in his submission to this review is potentially in error because it applies a biased ‘standardisation’ approach to updated density measurements. This evidence was also shown to the Review Team in a brief presentation. This evidence is not published yet.

Although Muir Russell had stated that they would archive all submissions, they did not archive CRU’s “brief presentation” purporting to show that Melvin’s new statistical approach to standardization had vindicated the Briffa bodge.

Muir Russell Report, July 2010
The Muir Russell Report stated (7.3.4) they did not “focus upon disagreements over comparisons of results using individual tree series” even though issues regarding proxy reconstructions dominated the Climategate dossier. They reluctantly commented on Tornetrask only as the “subject of much misunderstanding” as follows:

Nevertheless we comment briefly upon Yamal as it has received so much attention and the Tornetrask series as it is subject of much misunderstanding.

As elsewhere, they did not provide any citations or references to actual examples of “misunderstanding”. Given that my submission was the only one on this topic, I presume that this accusation is made against me.

In the original press conference, Muir Russell had stated that they did not expect the public to accept “ex cathedra” pronouncements, but, in relation to the Briffa bodge (as elsewhere), that’s what they did, providing only the following declaration:

32. Finding on “Bodging” in respect of Tornetrask. The term “bodging” has been used, including by Briffa himself, to refer to a procedure he adopted in 1992 [13]. The ‘bodge’ refers to the upward adjustment of the low-frequency behaviour of the density signal after 1750, to make it agree with the width signal. This ad hoc process was based on the conjecture that the width signal was correct. There is nothing whatsoever underhand or unusual with this type of procedure, and it was fully described in the paper. The interpretation of the results is simply subject to this caveat. The conjecture was later validated [14 – Briffa KR and Melvin TM, 2010 in press] when it was shown to be an effect due to the standardisation technique adopted in 1992. Briffa referred to it as a ‘bodge’ in a private e-mail in the way that many researchers might have done when corresponding with colleagues. We find it unreasonable that this issue, pertaining to a publication in 1992, should continue to be misrepresented widely to imply some sort of wrongdoing or sloppy science.

As is all too common in the execrable Muir Russell “inquiry”, key assertions are made out of thin air.

They said that there was nothing “unusual” about Briffa’s ad hoc adjustment of the Tornetrask chronology. Puh-leeze. Where are other examples of this “statistical” methodology? Where is the discussion of the statistical assumptions and protocols? There aren’t any. Even in the statistically challenged world of paleoclimate, the Briffa “bodge” wasn’t a “usual” practice – as evidenced by the Cook email and the Stahle pers. comm. noted above.

Nor was Muir Russell justified in criticizing critics of the bodge as being “unreasonable”. Even if Briffa and Melvin 2011 saves the Tornetrask chronology without damaging other chronologies – something that is far from demonstrated – Briffa and Melvin 2011 was unpublished at the time of the Muir Russell report. And without a rational justification, critics cannot be blamed for regarding the bodge as “sloppy science” or worse.

And even if Briffa and Melvin 2011 ultimately vindicates the Briffa bodge on other grounds, the bodge itself remains, at a minimum, “sloppy science”. Sloppy science that helped conceal the divergence problem and thereby delayed its analysis and reconciliation.

While it’s nice to know that the high frequency signals of the ring widths and maximum density are similar, if you can simply force the low frequency signal to match when useful, what sense is there in measuring both? As you hint, it could be something else like CO2 or anthropic nitro-fertilization which caused the ring width measurements to follow the historical temperature measurements. So they need to present a scientifically defensible theory on why they choose one measurement over the other. Cherry-picking the methods as well as the series just makes the results totally unacceptable.

I also wonder about how they know that the new standardization is not affecting results before there were instrumental records? I suspect there are methods which might work, but I’m just wondering what they actually did to justify their statement.

So a 1992 ad-hoc adjustment (which added a completely artificial temperature increase of 0.5 deg C per century in the last two centuries of the proxies) is validated by a paper “IN PRESS” in 2010 and still unpublished in 2011. Crazy. Is this science?

Muir Russell was right. There is no data manipulation here. The data remained unchanged.

They are fitting a model to the data to reconstruct April-August temperature. They made an ad hoc variation to the model to better fit a special situation. And they tested (R2) the fit against the original (unmanipulated) data.

Now a model purely based on known science is certainly to be preferred to one with ad hoc modifications. But plenty of models are tested and fitted without the scientific justification being known. The question is then whether that is helpful. It’s a familiar situation.

Re: Nick Stokes (Mar 30 14:44), So, if I note that temperatures in cities do Not go up with population as one would expect under Oke’s theory, then I could model the effect of UHI which is apparently missing in the data, to add in the missing UHI effect. I’m not manipulating data, I’m correcting it.

I could then publish the average of all stations and claim no data manipulation. It was a “result” manipulation. Then if my paper got published, down the line, years later, people could use my result as data to test other things like GCM fidelity.

So, by the letter of the law there is no data manipulation; there is a data adjustment. It seems to me we don’t help our cause greatly by relying on such fine distinctions to avoid owning up to some less than acceptable practices.

If I note that temperatures in cities do Not go up with population as one would expect under Oke’s theory, then I could model the effect of UHI which is apparently missing in the data, to add in the missing UHI effect.

Perfect example Steven. Better get started on the paper. It would be “Mosh’s Trick”. No actually it should be “Mosh’s Bodge” because then you could start using the method unilaterally – without any foundation in the literature – then circularly refer back to it and dismiss any criticism.😉

It’s very simple. The bodge that Briffa employed was a data correction that was THEORY driven. The data were bodged to be brought into line with theory.

If I did that to bring station data in line with Oke’s theory about the relationship between population and UHI, you would scream bloody murder.
Don’t pretend you wouldn’t. Don’t pretend that you would let Willis or Anthony get away with this kind of bodge. Now, I don’t want to get into a debate about whether this is ethical or unethical. It’s not best practice and we deserve better science.

And if I have a timeseries of an indicator of homeless numbers, like number of people in shelters per night, and it goes down after 1960, when I “know” it should go up, it is ok for my model of homeless numbers to have a bodge forcing the estimate to go up in recent decades…because it should? Puhleeeeze.
Where is an example in real published science of this type of kludge? We can fit all sorts of models to data but no one gets away with such stuff. Do you even understand what you are saying?

Huh? A ‘model’ manually adjusted to reproduce recent data (with all subsequent reference to the ad hoc manual adjustment unreported). What ‘scientific’ value is that? (answer – none – you can just look at the instrumental record). However, this activity did have wonderful ‘communications’ and grant collection value.

Nick, is there nothing in the Team corpus that offends you? Even CRU admitted that they had made an “ad hoc” adjustment – i.e. they changed the chronology. “Data manipulation” is also not limited to altering data values. See Wikipedia here on “data manipulation”:

Informally called “fudging the data,” this practice includes selective reporting (see also publication bias) and even simply making up false data.

Examples of selective reporting abound. The easiest and most common examples involve choosing a group of results that follow a pattern consistent with the preferred hypothesis while ignoring other results or “data runs” that contradict the hypothesis.

Re: Tom Gray (Mar 30 16:35),
Tom, the model is just the mathematical process by which you take input data (MXD here) and produce an estimate of something else (April-September temperature) which you can test against observations for some period. Ideally it has a clear physical basis, but very often it doesn’t.

Wow Nick. I know you’re working on some kind of middle-ground thought, but come on.

“So OK, we’d like to find something better. That doesn’t mean that they shouldn’t publish what they have.”

If the ‘something better’ is the difference between a corn filled turd and a flower, then yes it does mean they shouldn’t have published what they did. Sorry for the language but you have really got to be kidding me. Ya can’t chop data you don’t like because it don’t fit the pre-conceived, and after ya did, ya can’t replace it with different data assumed to be the same thing.

NO other field would accept this practice without extensive evidence/reasoning.

Question for Steve Mc. What happens to a mining promoter who does this in a prospectus? What happens to a drug manufacturer? Is everyone aware of the tune the US Supremes just sang to Matrixx with a unanimous choir?

Nick, are we really supposed to change the world on the basis of “science” like this? Really?!

No data manipulation? Sorry Nick, that’s a parallel universe and you know it. The question is, is this sophistry and its ‘ad hoc variations’ anything remotely acceptable to science and a reasonable reflection of the diligent honesty required of its practitioners?

Re: SayNoToFearmongers (Mar 30 16:07),
As Steve said, [snip – please do not put words in my mouth. If you wish to quote me please do so]. They described exactly what they were doing. They are working in a field where (especially in 1992) they are trying to find patterns in a great cloud of unclear information. This is part of the process.

On a few occasions I have added a comment when I felt an underlying general principle was worth highlighting.
In this case it is Nick’s comment: “They are working in a field where (especially in 1992) they are trying to find patterns in a great cloud of unclear information.”

I think this explains a great deal about the different approaches to paleo data.
Steve acts impartially as an auditor of the data and modelling – and we have a lot to thank him for in his steady and thorough approach to this.
When he points out problems with statistical modelling or the use or manipulation of data the response is often – “but we need to get an answer!”
The “climate scientist team” and Nick seem to be unable to simply admit they did something wrong.

They KNOW there is a signal amongst the noise, and they KNOW what it should look like.
Therefore they are being perfectly “honest” when they bodge or trick or hide declines.
After all – the data they are manipulating is very noisy and they are pursuing the signal they know is there. The only data they got rid of or adjusted was the noisy/irrelevant part.

The irony is of course, that the analysis of anything paleo is done in the absence of witnesses.
It is always speculative – no matter what field you tend to be in – paleontology, cosmology or paleoclimatology.
The whole field is heavily dependent on that most marvellous of human gifts – the human imagination.
And it means that an awful lot of meaning is being derived from data that is often extremely noisy and unclear and almost always without any real possibility of hard verification.

I have to admit though – that there is hard qualitative verification of a warmer climate in the MWP and the Roman max – there were witnesses and they left monuments and villages now covered in ice.
This is a real witness vs the derived signal being strained out of the noise of tree ring widths, mud layers, etc
But it is also a qualitative measure, when the team is after numbers. In such an environment you can sense their frustration with any suggestion that proxies are not suitable for temperature.
I despair seeing the often repeated charge aimed at Steve and other critics [see the Steig vs O’Donnell papers] to produce a better number, when in fact the conclusion that it cannot be derived is by far the most logical scientific result.

I think that it is the determination that the data is capable of producing a valid pattern that has resulted in 2 camps which talk at each other rather than to each other. Any attempt at critique is ultimately seen as pure negativity by the team and its supporters and dismissed or belittled on those grounds. When in fact, as some non-team players point out – it is solid scientific evidence that you are looking for patterns where they cannot be found.
Will a pattern be found – absolutely – that is what humans are good at. Will it be valid? Almost certainly not – because you have already acknowledged that you are straining at gnats to get your pattern.
The best that can be said about the past concerns the big phenomena – written histories, glacier extent, forest extent, etc. They will not give you numbers, but they are far more reliable than building climate history out of highly localised phenomena with a high degree of assumptions brought to bear – such as tree rings, etc.

Nick
The problem is that the output of the model – ie the temperature reconstruction – becomes the new data, used in reconstructions further down the line. So they did manipulate the data – ie the output data.

If I observe that adding a linear, or some kind of polynomial, adjustment starting from a particular point in the model output improves the match with reality over the benchmark period, can I just add the spurious adjustment with no a priori rigorous analysis or reasoning and announce the improved result? Of course not.
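The commenter's point can be made concrete with a toy sketch (all numbers invented): fitting an ad hoc straight-line correction to the residuals over the benchmark period is *guaranteed* to shrink the in-sample error, so the "improved result" by itself says nothing about whether the adjustment is physically meaningful.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "model output" over a 100-step benchmark period: reality plus noise
# plus a drift the model fails to capture.
t = np.arange(100)
reality = np.sin(t / 10.0)
model = reality + 0.01 * t + rng.normal(0, 0.2, t.size)

# The ad hoc fix: fit a straight line to the residuals and subtract it.
resid = model - reality
line = np.polyval(np.polyfit(t, resid, 1), t)
adjusted = model - line

# Least squares guarantees the in-sample error shrinks, whatever the data.
rmse_before = np.sqrt(np.mean((model - reality) ** 2))
rmse_after = np.sqrt(np.mean((adjusted - reality) ** 2))
```

The "improvement" here is an artifact of the fitting procedure, not evidence for the correction; only out-of-sample verification could distinguish the two.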

Re: None (Mar 30 15:18), by Nick’s logic, if I take data and multiply it by a random number it’s not a data manipulation as long as I keep the original data around somewhere. We have just defined data manipulation out of existence. So, the question is to Nick: what constitutes data manipulation?

“Also, there is no evidence for a decline or loss of temperature response in your data in the post-1950s (I assume that you didn’t apply a bodge here). This fully contradicts their claims, although I do admit that such an effect might be happening in some places.”

Well, here we have the luxury of examining Briffa’s response to Ed ‘checking that the model was the usual one’, as you frame it.

Briffa starts out with a phrase I never understand:

“Ed
to be really honest,”

Does that imply that he isn’t really honest without this proviso? Is he like the traveller who can blow hot and cold with the same breath?

Anyway, even when he is being really honest, I can’t see anywhere that Briffa disillusions Cook about “the usual (bodged) model”.

Briffa continues anon, and here is the only thing remotely connected to bodging:

“My instinctive first reaction is that I doubt it is the answer but we do get results that support a recent loss of low-frequency spring temperature response in our data that may be consistent with their hypothesis of prolonged snow lie in recent decades.”

So, based on that quoted response, is Ed supposed to infer that any non-declining results have been ‘bodged’?

Why squander your credibility on this bizarre parsing exercise? Where in life is that a salable skill?
I don’t get it…

Re: Nick Stokes (Mar 30 18:52), If you drop data from an archive or fail to archive it, how are future researchers supposed to explain anomalous data? Suppose you have data, and half of it goes contrary to theory. You have no explanation for it. You think an explanation might exist, but you don’t have one.

So you make your charts. You explain that you haven’t plotted the anomalous data.
All well and good. But then, instead of archiving it so others can have a whack at explaining it, you fail to do so. Is that manipulation? Twenty-five years down the road somebody goes to the archive. The archive doesn’t show a deletion. The deleted data is lost. Manipulation? Poor practice? Sloppy?

Re: steven mosher (Mar 30 22:15),
Steven,
Do you actually have an example of actual measurement data that has been archived and then lost? Totally? No records anywhere in the world? How would it happen?

Even in pre-web days there were many printed copies. Now we have innumerable electronic copies as well.

“Steven,
Do you actually have an example of actual measurement data that has been archived and then lost? Totally? No records anywhere in the world? How would it happen?”

Why ask Steven? He doesn’t have much experience at that sort of thing. If you really want to know how, maybe you should be posing your question to Mann. However, most magicians don’t like to share their tricks.

Actually Nick is doing his usual “trick” of purposely misunderstanding what Steven said. Steven said that a purported scientist DIDN’T archive data which was mentioned as not being graphed in a paper. So later when someone went to look for it, it naturally wasn’t there. Nick seems to be asking Steven to justify thinking that data once archived might be lost. This is not what Steven said.

If I have a temperature series for an urban station and I apply a UHI correction, where I add temperature to the data to model what I think the UHI contribution should be, my output will be an adjusted bit of data.

If I do this for 7000 stations, adjusting 3 or 4 thousand of them, then I have 3-4 thousand
adjusted bits of data.

If I then combine these 7000 stations into one final result, then what do we have?

I don’t “see” the adjusted stations; they are just intermediate results.
The unadjusted data is still there, so no data manipulation. And you are happy
calling this accepted methodology.

I have a theory, Oke’s theory, which states that UHI goes with the log of the population. I have data, and we both know that data diverges from Oke’s theory.
So, I adjust the data, as an intermediate step. Then I combine all these into
a final number. And you are OK with this?

Now, my final result is blessed by my community. I never show results with and without
my adjustment. I just note that an adjustment for UHI has been applied. I show improvement in some statistical way; hey, my results are now consistent with theory.

Then, my results get used down the line by others. They take my results as “data”.
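The pipeline described in this comment can be sketched as a toy script (station counts, populations, and the Oke-style coefficient are all invented for illustration): raw series go in, an adjustment derived from the theory itself is applied to an intermediate layer nobody sees, and a single combined series comes out that downstream users treat as data.

```python
import numpy as np

rng = np.random.default_rng(1)

n_stations, n_years = 7000, 50
raw = rng.normal(14.0, 0.5, (n_stations, n_years))       # raw temperatures
population = rng.integers(1_000, 5_000_000, n_stations)  # station-city pops

# Theory-driven adjustment: UHI assumed to grow with log(population).
is_urban = population > 100_000
uhi = 0.1 * np.log10(population / 100_000.0)
adjusted = raw.copy()
adjusted[is_urban] -= uhi[is_urban][:, None]  # intermediate, rarely shown

# The one published number: downstream work sees only `final`, with no
# record of which stations were adjusted, or by how much.
final = adjusted.mean(axis=0)
```

Note the circularity being objected to: the adjustment encodes the theory, and the adjusted output is then cited as independent support for it.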

Very simply, the theory (tree rings track temperature) was employed to change the data.
That is clearly a self-serving data adjustment.

Steve, if you will note in the Climategate code (I think it is in HarryReadMe), they don’t add a unique UHI or other adjustment to the data. They have ONE adjustment applied to all the stations (after compiling, as I read it), one for each time period. It is the line of code that steps up (and only up) for each period. (It also is a bigger adjustment than the resultant “warming” trend, which to me says that the actual data trend was DOWN, implying another “hide the decline.”)

Re: None (Mar 30 15:18),
The model is a process of mathematical manipulation based on the data. The output is necessarily a product of manipulation. This Tornetrask adaptation is just part of it. The fact that you have to give special treatment to different sites is certainly an unsatisfactory feature of the model, and criticism of that aspect of the model is justified. But it isn’t any kind of impropriety.

In the statistical model between MXD and RW, the residuals contained serious autocorrelation. To any sane statistician, this is evidence of an incorrectly fitted model. It doesn’t entitle the author to insert a bodge to remove the serious autocorrelation.
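The diagnostic being invoked here is easy to illustrate (a toy sketch, not Briffa's data): lag-1 autocorrelation of residuals near zero is what a correctly specified model leaves behind, while residuals that retain structure the model missed show autocorrelation well above zero.

```python
import numpy as np

rng = np.random.default_rng(2)

def lag1_autocorr(resid):
    """Lag-1 autocorrelation of a residual series."""
    r = resid - resid.mean()
    return np.dot(r[:-1], r[1:]) / np.dot(r, r)

good = rng.normal(0, 1, 500)                  # white-noise residuals: OK
trend_left_in = good + 0.05 * np.arange(500)  # model missed a trend: not OK

r_good = lag1_autocorr(good)           # close to zero
r_bad = lag1_autocorr(trend_left_in)   # close to one
```

The point of the comment is that the second case calls for fixing the model specification, not for subtracting the leftover structure by hand.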

As I observed in my submission, the issue is not that Briffa failed to disclose the 1992 bodge, but that neither reviewers nor specialists saw anything wrong with it.

To Nick Stokes: Thanks for your comments, and good humor, and patience. But don’t you see that you are defending the indefensible here. If a man holds up a liquor store, is apprehended, and then confesses the crime, do you think he should be exonerated since he “fully disclosed” his actions?

Let me nuance this a bit. Yes, I see your point about empiricism, and mental models, and so forth. But do you honestly think Briffa was unaffected by his Team’s urgently-expressed desire to stay on-message? Have you noticed that, though he may express doubts in private, in public Briffa has been (sfaik) resolutely a Team Player, always on-message?

In normal science, if one were to propose such an empirical “bodge”, an honest scientist would, at least, show the curve before and after the bodge, don’t you think? And, as others have noted, workers downstream of Briffa Tornetrask have used the bodged reconstruction uncritically, and without further notice. Do you think that is good practice?

This business of “looking right” is a bit puzzling. Is that “looking right” compared to other reconstructions? If so, and the present effort has to be “encouraged” a bit to “look right”, why not just use the other reconstruction and spare us the bodgery?

Or is the “looking right” related to some other trend? Or maybe even something not in evidence?

I suspect “looking right” is judged by the same standards as “looking better” – it means it looks more like what “climate scientists” think it should look like, to confirm what they already know, and to assist in influencing policy makers in the direction dictated by “the Science”.

Showing the actual data would be just too “confusing” to non climate scientists, especially if the data is not sufficiently “on message”.

This case of fabricating data has a very unusual aspect: Usually when scientists make up data to fit their conclusions, they try to keep it secret, so that respectable journals will publish their articles believing that the data is real.

In this case, however, Briffa has fabricated data by bodging the real series, and at the same time announced to all the world that he was fabricating it. But still he got away with publishing articles based on it.

This can’t happen in real science, so that unless other “climate scientists” renounce this series and all studies based on it, one can only conclude that “climate science” is in fact a pseudo-science.

It seems that in this climatology field one can get away with a variety of tricks and bodges, data massaging and shaping, etc. so long as the preferred agenda is served. I was just reading the RealClimate onslaughts against Courtillot in 2007, and trying to imagine what they (Ray P. especially) would need to say to apply the same kinds of standards to the corpus of Mann, Briffa, Jones, et al through the years.

Also, for NS, I see frequent references to ‘series’, such as the Tornetrask series.
What about the word ‘series’ should flag to me that it is a model, and not data?
i.e.
“A brief presentation was made to the Review Team to illustrate this. This later work confirmed that it had been correct to adjust the MXD series in the earlier work and thus justified the original “bodge”.”

I would like to point out that Nick is correct in one sense and then damn him with praise.

Yes, I have used such a bodge in a complex problem. And I know others who have. It is as Nick says: it is to examine a model. However, Nick is not pointing out the next step, that you have to find a physical model and then test it with updated data from the same series. In other words, you would have the reason for the complexity and would have used the bodge to get the right model.

Then you redo your work, and then publish.

As we have seen, it is often what people leave out that explains the real problem.

The comment that they should have published is especially irksome. The lack of resolution, and the 2011 date for the supposed resolution, indicate that the publication had a major flaw at minimum.

For the benefit of ns, the model being developed by Briffa and all these guys is not one like say a piecewise linear fit to time. No. It is a model that says ring width (or density) for ALL TIMES in the past is a function of temperature (or conversely for some models). We must have the “ALL TIMES” in the past caveat so we can apply the model to the past. If it only works SOMETIMES then the whole game is over, and it is not a model for tree growth as a function of temperature. What Briffa and the others have done is say it only works SOMETIMES, and then pretend this is ok. How about an altimeter in a plane that only works sometimes? Ok? I don’t think so. Further, when it doesn’t work in the real world we don’t know why or what the value should really be, but they have simply fixed the values in those times to match other data (other proxies, ring width in the case of MXD here) or instrumental data in one bodge, or simply dropped them. By fixing the model against ring width and other proxies, and then saying: “look they match!” they have created circular reasoning — no longer independent, are they?

The great thing about the half dozen Briffa papers I’ve read is that they all have conclusion-killing caveats in the middle. Sometimes the conclusion still made it into the paper; other times only the ‘preferred portion’ was reproduced, with the caveats left out. Briffa is the master chessman in a crowd of amateurs. People can always point out some quote or phrase which places the conclusion in context, while others can quote the result as the voice of god.

Steve, no need to post this FYI. In your quote “Although Muir Russell expressed disinterest in opining on the proxy issues that dominated the Climategate dossier….’ you are using the word ‘disinterest’ inaccurately. It means impartiality. You mean ‘lack of interest’ which, in the context, is almost the exact opposite. Muir Russell was not impartial in his investigation but rather he was uninterested in the truth.

Determinable from the context, “bodge” was being used as an informal term to denote an inelegant and inferior, but timely and probably temporary procedure to fix a perceived problem. (This usage is derived from the word “botch”, meaning “to carry out a task badly or carelessly”.)

No this is bodge in the following sense: ‘Ford, this is your accountant, Phil. Looks like I bodged your tax return and you are going to have to spend some time making mail bags for her majesty. Sorry about that, anyway, thanks for trusting me, sorry I was cheating, cheers Phil’.

As a stand alone item, the bodge is not significant. But placed in the context of the full spectrum of manipulations, deletions, slicings, smoothings, deft choices of reconstructions and the discarding of others, upside down entries — every one of which supported one conclusion only — it stands as yet another entry in the dreary story of the rise and fall of the Hockey Stick.

This really amounts to torturing the data. No matter how open Briffa and others were to the “bodge” it cannot be justified without a systematic effort to justify the approach and to apply the same method to other proxies that are part of the same class or category. This is just embarrassingly weak research.

I am not a mathematician, but I would have thought that you can probably find a hidden signal in any series of data that you require by discarding the data you do not want and then truncating the data that fits your requirement. Is this science?

Thanks for your tireless sleuthing, Steve. I now understand that in Post-Normal science as practised by Briffa et al, a Bodge is just another acceptable technique for acquiring the data that the ‘scientist’ thinks looks about right.
Thank God these blokes are not designing aeroplanes or bridges or tunnels or ships or…
Is there such a thing as a post-normal standup comedian and is Nick Stokes a groundbreaker in a new comedic genre? I laughed, anyway, but I’m just a lay person without Mr Stokes’ obvious professional knowledge of how stats and models work.

This has happened so many times, and you have been so diligent at finding it, and the team supporters so uninterested in pursuing the obvious scientific ramifications, that one has to wonder what can be done in the public domain, other than the magnificent job that you do here, to illuminate this scientific fraud (whether unintentional, by unconscious bias, or from other more disturbing causes).

I would propose to you and to the CA community that the development one step beyond your work be taken: an online, peer-reviewed journal of science that reviews science, at least beginning with climate science, with cloud-based archival of data and full traceability of papers, results, software, and data provenance.

Good science is like good fruit, it grows. Bad science at some point, maybe not in our lifetime, will wither.

In the UK we often talk of making a “bodge job” which would be a repair that is only meant to last as long as it takes to get the thing fixed properly, it has the implication that the repair will not last long and will need returning to later. Such things as using sellotape to stop something falling apart.

In this context they bodged up the data to fix the results, but their use of this word implies they knew that the data would not stand up to (would fall apart under) detailed scrutiny. This is a clear implication of the phrase, and it follows that they must have been very confident their paper would not be properly scrutinised, i.e. they bodged the data because they knew the peer review process would also be bodged.

This word in this context implies an intention by the writer to dupe the scientific community, the politicians and the public. There is no other way for a British person to look at this use of this word in this context.

Whereas Briffa’s original bodge used a simplistic straight-line (zero-frequency) fudge factor taken from a trend line fit to a selected portion of another series, the advanced MBH99 bodge uses a complicated, somewhat wiggly low-frequency fudge factor fit to a selected portion of another series.

Same sort of thin air adjustment, but more advanced smoke and mirrors.

“This can’t happen in real science, so that unless other “climate scientists” renounce this series and all studies based on it, one can only conclude that “climate science” is in fact a pseudo-science.”

he makes the important point that Nick Stokes, by continuing to defend on the basis of this ad hoc consideration being nothing unusual, only re-emphasizes the image McCulloch presents above.

Some time ago I did an experiment using a digital camera at night. (The camera tries to adjust for the lack of light by making the sensor more sensitive; this allows random thermal noise to produce the typical digital noise on such photos, which can only be reduced by operating the sensor at ultra-low temperatures.)
It is a good technique for producing pictures in near-impossible conditions: take a power-of-two number of photos, combine them in pairs using the “add” function in Paint Shop Pro, take each summed photo and add it to another summed photo, and continue adding together only two photos at a time until the required result is obtained.

What was the purpose:
To show that a signal buried in random noise can be extracted by averaging over many data sources.
I.e. take enough trees, average the ring data, and any common factors in the data may become visible – fertilisation, lack of nutrients, too much water, too little water, etc. get reduced, but temperature/CO2 fertilisation are not locally different, and any of these or similar effects should become dominant in the averaged data. By junking obvious non-responders (invalid photos of kids, etc.) the common signal is obtained more quickly. We know what the temperature has done over the last few hundred years – is it therefore wrong to dump trees that do not conform? I knew that my photos contained no ships, so why should I average my ship photos into the photo of the back garden?
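The photo-stacking procedure described above is textbook signal averaging: if, and only if, the same fixed signal really is present in every frame, averaging N frames cuts the noise amplitude by roughly sqrt(N). A toy sketch (scene and noise levels invented):

```python
import numpy as np

rng = np.random.default_rng(3)

# A fixed "scene" seen through 256 noisy "photos". Pairwise adds of a
# power-of-two stack, renormalized, give exactly this mean.
scene = np.sin(np.linspace(0, 4 * np.pi, 200))
frames = scene + rng.normal(0.0, 1.0, (256, 200))

stacked = frames.mean(axis=0)

noise_one = np.std(frames[0] - scene)
noise_stacked = np.std(stacked - scene)
ratio = noise_one / noise_stacked   # should land near sqrt(256) = 16
```

The caveat the subsequent replies turn on is the "if": the demonstration assumes an identical signal is present in every frame, which is precisely the assumption in dispute for tree rings.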

Does anyone suggest that a proxy record is an exact representation of past temperatures? I have not seen such words used. All these proxies are simply work in progress (and done over a decade ago!). Reports generated a decade ago are not necessarily set in stone; more recent ideas/data can displace such ancient documents. Why are these constantly paraded before us?

TFP, your mistake is assuming you know the meaning of the buried signal. You say that if you take enough trees, the water and nutrient levels will “get reduced” but the temperature will not. Yet you provide no proof the remaining signal is temperature. And now we see that the supposedly “non temperature” data was simply removed.

Why is that a problem?

Let’s use your experiment as an example. You averaged a bunch of noisy photos and got a reasonably clear picture of one dimension (looks like Y in Y/Cr/Cb terms) of the potential signal. That assumes Y is the desired result, not Cr or Cb — presumably what we see of your final result is woefully lacking in chroma “signal.” It ALSO assumes you know what the “Y” picture should look like!

You were able to cheat: you knew to remove photos of spouse, kids, dog/cat, etc. You knew what the final picture “should” look like. So, by manipulating the source data, you were able to reproduce the result you desired.

What if you didn’t know the desired final result? You couldn’t remove those extra photos.

Even worse:
* what if the desired result was not in the Y dimension but rather in the chroma?
* what if what we are seeing is not Y but Cr or Cb — randomly? Your method doesn’t resolve this question.
* what if what we are seeing is something else altogether, perhaps an infrared image? Again, we can’t answer.

Bottom line: unless you know in advance the type of signal you seek, your method doesn’t work well.

[BTW, your method IS very powerful when the signal being sought is known. This is how cell phone GPS works — it takes about a thousand very-noisy samples, merges them, and finds the faint GPS signal pattern in the mix. Far more sensitive than older GPS receiver technologies.]

thefordprefect,
Your picture experiment was a form of signal averaging. This requires that …

o Signal and noise are uncorrelated.
o Signal strength is constant in the replicate measurements.
o Noise is random, with a mean of zero and constant variance in the replicate measurements.
(See http://en.wikipedia.org/wiki/Signal_averaging)

Do you know that the above is true for treemometers?

By the way, in the imaging world, challenging low-signal images are made by characterizing the noise of the imager (make a suitably long exposure with no signal, i.e. with the lens cap on) and subtracting that noise from a suitably long real exposure.

Briffa is supposed to be using tree ring widths (TRW) and/or density (MXD) to reconstruct past temperature. In the case of the “ad hoc adjustment”:

a) Did Briffa use his knowledge of the divergence between actual and reconstructed temperatures to develop an ad hoc correction factor for the MXD reconstruction? If so, Briffa is guilty of artificially manipulating the data to obtain a desired result. Everyone who has used his reconstruction has somewhat better validation statistics because of Briffa’s manipulations.

b) Did Briffa use the divergence between the TRW data and the MXD data to determine that a correction to the MXD data was necessary? In that case, Briffa has developed a legitimate improved method for reconstructing temperature using a combination of MXD and TRW data.

How can we tell which is the correct interpretation? That’s easy: what would you do if you thought you had invented an improved method? First, you’d certainly mention your improved methodology in the abstract and Briffa did not. Second, you’d try your new method on other data sets to see if it improved other reconstructions. By his own actions, Briffa tells us that the bodge was a one-time trick to reduce the divergence at one site, not a legitimate technique for improving reconstructions.

If Briffa had looked at both CE and RE, he might have found that TRW were inferior to MXD at reproducing annual temperature changes but superior for reproducing low frequency temperature change. If that were the case, he would have had a good reason for choosing the TRW data