Keith Briffa Responds

In spite of suffering a serious illness (which I understand to be a kidney problem), Keith Briffa has taken the time to comment on the Yamal situation. The comment should be read by interested readers. If Briffa or any of his associates wishes to post a thread here without any editorial control on my part, they are welcome to do so.

Briffa’s comment leads off with the accusation that I had implied that the recent data in this chronology had been “purposely selected” by Briffa “specifically because they exhibited recent growth increases”. I want to dispense with this up front. While I expressed surprise that there were so few cores, not only did I not imply that Briffa did any sub-selecting, but I specifically said the opposite. While the precise relationship of the CRU archive to the Hantemirov and Shiyatov subset is not entirely clear, I had speculated that H and S had created a subset that was relevant for their purposes (corridor standardization), but that it was not of an adequate size in the modern period for Briffa’s RCS standardization. I stated clearly that it was not my belief that Briffa had crudely selected cores.

Since Briffa provides no quotation from any of my threads or comments to support his allegation I will review what I actually said.

Here is Briffa’s accusation:

The substantive implication of McIntyre’s comment (made explicitly in subsequent postings by others) is that the recent data that make up this chronology (i.e. the ring-width measurements from living trees) were purposely selected by me from among a larger available data set, specifically because they exhibited recent growth increases.

This is not the case. The Yamal tree-ring chronology (see also Briffa and Osborn 2002, Briffa et al. 2008) was based on the application of a tree-ring processing method applied to the same set of composite sub-fossil and living-tree ring-width measurements provided to me by Rashit Hantemirov and Stepan Shiyatov which forms the basis of a chronology they published (Hantemirov and Shiyatov 2002). In their work they traditionally applied a data processing method (corridor standardisation) that does not preserve evidence of long timescale growth changes. My application of the Regional Curve Standardisation method to these same data was intended to better represent the multi-decadal to centennial growth variations necessary to infer the longer-term variability in average summer temperatures in the Yamal region: to provide a direct comparison with the chronology produced by Hantemirov and Shiyatov.

Dealing with the second paragraph first, in my first post on the topic, I clearly distinguished between H and S corridor standardization and Briffa’s RCS standardization, noting that corridor standardization was known not to preserve centennial-scale variability, as follows:

There is one other version of these series that readers may encounter: Hantemirov and Shiyatov archived a Yamal reconstruction at NCDC that has no hockey stick blade whatever. This version was promoted by a commenter (Lucy Skywalker) at Jeff Id’s as being a priori more valid than Briffa’s. Although the Hantemirov and Briffa chronologies have a very different visual appearance (especially the non_HSness of the Hantemirov version), there is an extremely high correlation between the very different looking Hantemirov-Shiyatov and Briffa Yamal chronologies. (If you regress the Briffa recon against the Hantemirov recon for the pre-1800 version, you get a huge r^2 of 0.81). The two series clearly have the same raw material.

However, in my opinion, the issue is considerably more nuanced than simply preferring the Hantemirov chronology. The Hantemirov and Shiyatov chronology adjusts for age (“standardization”) through a “corridor” method, whereas the Briffa chronology uses a “RCS” method to standardize for age. In other studies involving relatively short-lived trees (such as Yamal), the corridor method has been found to yield very similar results to “conventional” standardization; such methods are also known to remove any centennial-scale variability from the reconstruction. As a result, no conclusions should be drawn with respect to centennial-scale variability from the Hantemirov chronology. No adverse conclusions should be found against the Briffa chronology merely because it differs from the Hantemirov chronology. There are other reasons to be concerned about the Briffa chronology, but these have to be presented and supported.
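For readers unfamiliar with the distinction, the core idea of RCS can be sketched in a few lines. This is only a toy illustration of the method's logic (the unsmoothed regional curve and all names are my own simplifications), not Briffa's actual code:

```python
import numpy as np

def rcs_chronology(ring_widths, ages, years):
    """Toy Regional Curve Standardisation (RCS).

    ring_widths, ages, years: parallel 1-D arrays, one entry per
    measured ring (width, cambial age of the ring, calendar year).
    Real RCS implementations smooth the regional curve; this sketch
    uses the raw age-class means for clarity.
    """
    ring_widths = np.asarray(ring_widths, float)
    ages = np.asarray(ages)
    years = np.asarray(years)

    # 1. Regional curve: mean ring width at each cambial age,
    #    pooled across all trees in the region.
    max_age = int(ages.max())
    curve = np.array([ring_widths[ages == a].mean()
                      for a in range(1, max_age + 1)])

    # 2. Index each ring by dividing by the expected width for its age.
    #    Because one curve is shared by all trees, low-frequency growth
    #    changes survive -- unlike per-tree (corridor/spline) fits.
    indices = ring_widths / curve[ages - 1]

    # 3. Chronology: average the indices by calendar year.
    yrs = np.arange(years.min(), years.max() + 1)
    chron = np.array([indices[years == y].mean() for y in yrs])
    return yrs, chron
```

The point of the shared regional curve is exactly the one at issue: if the sample of trees feeding it is small or age-biased, that bias propagates into every index.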

Briffa’s third paragraph states:

These authors [H and S] state that their data (derived mainly from measurements of relic wood dating back over more than 2,000 years) included 17 ring-width series derived from living trees that were between 200-400 years old. These recent data included measurements from at least 3 different locations in the Yamal region.

It is highly possible and even probable that the CRU selection is derived from a prior selection of old trees described in Hantemirov and Shiyatov 2002 as follows:

“In one approach to constructing a mean chronology, 224 individual series of subfossil larches were selected. These were the longest and most sensitive series, where sensitivity is measured by the magnitude of interannual variability. These data were supplemented by the addition of 17 ring-width series, from 200–400 year old living larches.”

In a comment to the same post, I clearly stated my view that there was no crude cherrypicking of the type that Briffa accuses me of implying. I stated:

bender, I agree with your point. I’ve tried to steer a careful line here. If you think otherwise, can you give me particulars as I don’t wish to unintentionally feed views that I don’t hold. It is not my belief that Briffa crudely cherry picked. My guess is that the Russians selected a limited number of 200-400 year trees – that’s what they say – a number that might well have been appropriate for their purpose and that Briffa inherited their selection – a selection which proved to be far from random and which, as you and I agree, falls vastly short of standards in the field for RCS chronology (as opposed to corridor or spline chronologies).

The substantive issue is whether the selection (that Briffa now confirms to have been inherited from the Russians) was appropriate for the RCS standardization method that Briffa applied. I brought this up, stating:

The subfossil collection does not have the same bias towards older trees. Perhaps the biased selection of older trees [results in] an unintentional bias, when combined with the RCS method. This bias would not have similarly affected the “corridor method” used by Hantemirov and Shiyatov themselves, since this method did not preserve centennial-scale variability, and Hantemirov and Shiyatov would not have been concerned about the potential bias that their core selection might introduce into an RCS chronology method that they themselves were not using.

Briffa’s own caveats on RCS methodology warn against inhomogeneities, but, notwithstanding these warnings, his initial use of this subset in Briffa 2000 may well have been done without fully thinking through the very limited size and potential unrepresentativeness of the 12 cores. Briffa 2000 presented this chronology in passing and it was never properly published in any journal article. However, as CA readers know, the resulting Yamal chronology with its enormous HS blade was like crack cocaine for paleoclimatologists and got used in virtually every subsequent study, including, most recently, Kaufman et al 2009.

Briffa continues:

In his piece, McIntyre replaces a number (12) of these original measurement series with more data (34 series) from a single location (not one of the above) within the Yamal region, at which the trees apparently do not show the same overall growth increase registered in our data.

The basis for McIntyre’s selection of which of our (i.e. Hantemirov and Shiyatov’s) data to exclude and which to use in replacement is not clear but his version of the chronology shows lower relative growth in recent decades than is displayed in my original chronology. He offers no justification for excluding the original data; and in one version of the chronology where he retains them, he appears to give them inappropriate low weights.

The basis for replacing the CRU 12 with the Schweingruber 34 was, I think, quite clear: to see how the use of a larger and readily-available sample from the same area would affect the chronology. I think that I described my selection and exclusion procedures for this sensitivity study far more clearly than Briffa et al 2008 described its selection and procedures for, say, the Avam-Taimyr site. What was the basis for including the Avam site with Taimyr and not other sites in the area? What was the basis for including the Schweingruber Balschaya Kamenka with the Taimyr site and why wasn’t its inclusion mentioned in Briffa et al 2008? Why was Balschaya Kamenka included, but not Schweingruber’s Aykali River, Novaja Rieja, or Kotuyka River? Why was Balschaya Kamenka included with Taimyr, while Schweingruber’s Khadyta River, Yamal wasn’t included with Yamal? And what effect did all these changes have on the resulting chronologies?

While Briffa, in a peer-reviewed publication, omitted these relevant details, I provided a much clearer description of my methodology in the sensitivity study. The Avam-Taimyr example showed that Briffa was not opposed in principle to using Schweingruber data. There was reason to believe that the CRU data was not a complete population of living trees, but had been subsetted by the Russians for a purpose different from RCS standardization. To test for potential bias in this procedure, I tested the results without the 12 cores ending in 1988 and later, and with the Schweingruber data. This indicated a dramatic difference between the versions.
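The logic of such a sensitivity test is simple enough to sketch. The following toy example (hypothetical data structures and function names, not the actual CA script) drops one set of cores, adds another, and measures how far the recent portion of the chronology moves:

```python
import numpy as np

def simple_chronology(cores):
    """Average standardized cores by calendar year.

    Each core is represented as a dict {year: index}. A real
    chronology would standardize the raw widths first; here the
    cores are assumed already indexed, purely to illustrate the
    sensitivity-test logic.
    """
    years = sorted({y for c in cores for y in c})
    return {y: np.mean([c[y] for c in cores if y in c]) for y in years}

def sensitivity(original_cores, drop, add):
    """Recompute the chronology after excluding the cores in `drop`
    and including those in `add`, then report the largest change in
    the final 20 years of overlap."""
    base = simple_chronology(original_cores)
    variant = simple_chronology(
        [c for c in original_cores if c not in drop] + add)
    common = sorted(set(base) & set(variant))[-20:]
    return max(abs(base[y] - variant[y]) for y in common)
```

A large value from `sensitivity` in the recent segment is precisely the "dramatic difference between the versions" described above: the result depends heavily on which cores are admitted.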

Briffa argues:

Whether the McIntyre version is any more robust a representation of regional tree growth in Yamal than my original, remains to be established.

I note that McIntyre qualifies the presentation of his version(s) of the chronology by reference to a number of valid points that require further investigation. Subsequent postings appear to pay no heed to these caveats.

I did not propose the results of these sensitivity studies as an “alternative” and “more robust” chronology. I am not arguing that the Yamal versions using the Schweingruber data provide the “correct” climate history for the region. I am arguing that the version constructed by Briffa, and relied on so extensively in the literature since then, is not robust in its late-20th century portion to a small and reasonable inclusion of additional data. To accuse me of using “inappropriately low weights” for the cores selected into the CRU archive is beside the point. I could equally argue that Briffa used “inappropriately low weights” (i.e. zero) on the Schweingruber samples.

Briffa proceeds:

We have not yet had a chance to explore the details of McIntyre’s analysis or its implication for temperature reconstruction at Yamal but we have done considerably more analyses exploring chronology production and temperature calibration that have relevance to this issue but they are not yet published.

Like nomads, the Team has moved on. With respect to his new analyses, let’s hope that Briffa archives the measurement data and results concurrent with publication and that it doesn’t take 9 years from publication to see the data.

On a closing note, as I said from the outset, I did not say or imply that Briffa had “purposely selected” individual cores into the chronology; I clearly said otherwise. Unfortunately for him, Briffa’s tactic of withholding data and obstructing requests for data has backfired, as some people (not myself) have interpreted this as evidence of malfeasance, as opposed to my own interpretation that it shows only stubbornness on Briffa’s part and ineffective compliance administration by funding agencies and journals.

261 Comments

Inaccuracies aside (yeah, they’re important, but let’s see the forest here), at least you got a reasonably neutral-toned, relatively quick response. Perhaps this can be taken as an offer to engage on the issues, including archiving, rather than prolonging the jousting?

I have only three simple questions: What physical models are employed to relate tree ring growth to past temperatures; how robust are those physical models; and where and how are those physical models fully documented, independently of the various paleoclimate studies which those models support?

I for one am chagrined that my own lack of understanding of RCS vs corridors led me to make assumptions about motivation for Briffa’s selection of cores. For that, I do apologize.

At the same time, the real issue appears to be Briffa’s motivation for selection of analysis method. A more subtle and less visual issue. I think observers can hardly be faulted for seriously questioning Briffa’s methodology which so conveniently produced a result that Steve has accurately portrayed as “crack cocaine for paleoclimatologists.”

I’m only too happy to leave motivations entirely out of the discussion. Motivations are politics, not science.

The science has enough troubles of its own. And the ultimate policy implications of bad science are just as significant, no matter what the motivation of the practitioners.

Jennifer Marohasy makes this point too, and even more eloquently than I do. Keith Briffa explicitly says: “My colleagues and I are working to develop methods that are capable of expressing robust evidence of climate changes using tree-ring data.” If that is not implicit admission that present methods are not robust then I’m the Piltdown Man’s Uncle. Notice I use one ‘n’ to dodge the zamboni?
=======================================

Dr. Briffa can’t really be at the top of his game at the moment and a longer delay in responding would have been understandable… maybe even preferable, since his reply is rather disappointing in quite a few ways. You were, however, wrong about the RC response. Apparently Gavin DOES have an interest in defending Dr. Briffa. I’m sure I could learn something over there if I could just maintain enough discipline to ignore the sarcasm and ad hominem attacks…

Briffa says he used “the same set of composite sub-fossil and living-tree ring-width measurements provided to me by Rashit Hantemirov and Stepan Shiyatov which forms the basis of a chronology they published (Hantemirov and Shiyatov 2002).”

Is it the same? Or do we not know?

Steve said “while the CRU archive does not appear to be precisely the same as the unavailable Hantemirov and Shiyatov 2002 archive, it does appear to be related.”

It has also been said that H&S used 17 living trees, but that Yamal had 12, 10, or 5 depending on exactly what endpoint you look at. Can anyone clarify?

I view the response as positive. There seem to be no glaring oversights, just a small disagreement about motivation in which neither side is entirely without complaint from their own perspective. If this can result in a better justification of the data, or more open discussion, the science in this region may improve. Let us hope the progress (and recovery in health) is swift.

We have not yet had a chance to explore the details of McIntyre’s analysis or its implication for temperature reconstruction at Yamal but we have done considerably more analyses exploring chronology production and temperature calibration that have relevance to this issue but they are not yet published.

This reads like: ‘Pay no attention to what the auditor says, we’ve already researched more into this very issue he talks about and we just haven’t told you about it yet.’

Having worked in the aerospace industry in the capacity of keeping contractors honest in their claims, I’ve seen very similar statements. It’s very disappointing. It would have been nice if there were open discussion of selection methods. Briffa, please just discuss exactly how this data was arrived at, and we’ll all be happy. Ignore whatever attacks you feel you’re enduring and just discuss method. That’s all there is to it.

He didn’t respond to the most important question about how exactly “The Twelve” were selected. Without knowing that, this cannot rise to the level of a scientific discussion.

The reluctance to disclose methodology and data for decades invites speculation about the motivations of Briffa (et al.). The strategy seems to be to use the results of that (natural) speculation to refute criticism without changing the questionable behavior.

Again with climate science we get the “we used this particular algorithm whereas X used another” excuse, where there had been no previous explanation or scientific rigour as to exactly why their preferred selection was superior and why it should give the correct answer when every other usable algorithm gives a completely different one. (It really reminds me of the “de-centered” PC analysis, which was an entirely valid approach for which sceptics supposedly only needed to ask to get Jolliffe’s seal of approval…) In this case the righteous claim for an algorithm that better picks up “centennial” scale variations seems only to be picking up outrageous late-20th-century decadal changes with no known physical cause (other than the mysterious teleconnection to Northern Hemisphere temperatures).

Re: tarpon (#19), I don’t think you have grasped the essence of the problem. If he had archived from the outset, the paper would not have survived peer review: assuming, that is, that the journal had asked a qualified (and reasonably courageous) statistician to review the paper.

The sensitivity analysis that Steve did is all well and good if there are no other extra criteria (as in metadata or in other tree ring parameters) that are used to screen trees. So on the surface replacing one set of Siberian Larch with another set from a site quite close by (at least within the original area used) appears sensible. But are all the parameters associated with the trees e.g. siting and so forth the same for the Briffa set and for the Schweingruber set? Is there really no significant difference in the data sets such that one could be replaced by the other? If so then it does seem that looking for an ‘uptick’ to match with temperature is speculative; otherwise the answer for exclusion may be in the full set of data just for a different parameter or property, and this hasn’t been communicated well. It still seems a little bit hand-waving though.

I guess I have to ask the question (without reading the previous comments which may have it on this already) but did Briffa have access to all the H&S data? I thought he did. If he had access to all the H&S data, and decided to use the 17 (which shrink down as we move forward in time) then I would say he picked what he wanted.

If he was simply left with the 17 then I can see how he might be a victim here. Except that he notes himself he wanted to use RCS, for which one would assume he knew he needed a larger number of samples.

I think the problem with giving Briffa the benefit of the doubt in any of this is his history of hiding his data. This resistance to the scientific method removes a lot of willingness to give him any benefit of doubt.

Additionally, it would seem to me he would want to be using as much Urals and Yamal data as possible to ensure he was detecting global changes and not be chasing local effects.

Is it unreasonable to ask why he went to such a limited set with all the tree ring data available from that region? And why did he keep the magical larch YAD061?

The way Briffa writes paragraph 2, one has the impression that he uses the exact same raw data as H&S use in their 2002 paper. As Steve noted in the first post (and which I’ve done some more thinking on here), the cores used in H&S 2002 are clearly a different subset of the total cores gathered than that used in Briffa 2000. Whether Briffa used all the cores or not is unclear, but the evidence suggests otherwise; hence we should be told why he discarded the ones he did. H&S appear to have discarded all short-lived trees, giving them a very small number of cores at any one time.

McIntyre followers are missing the point: the HS is not broken. The “blade” on the HS should not be derived from proxy data but from observations. Proxy data are really only valuable for inferring temperatures prior to the instrumental record, circa 1880.

Although the instrumental temperatures are (presumably) more accurate than the proxy-reconstructed temperatures during the observation period, the behavior of the proxy during this period is crucial for the whole reconstruction, since it is calibrated by correlating the proxy with the instrumental record.

If the proxy has no “blade” to correlate with the recently rising instrumental temperatures, there may be no reason to think it is a useful temperature proxy in the reconstruction period either. But if an otherwise flat series has a pronounced “blade” at the end, it will probably correlate well with temperatures, and then appear to tell us that temperatures were flat at their late 19th century levels throughout the reconstruction period.

So yes, it does matter a lot whether or not a proposed proxy has a Yamal-like “blade” in the past century.
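The calibration point above can be illustrated with synthetic numbers. In this sketch (invented series, not real Yamal or instrumental data), a flat proxy and an otherwise-identical proxy with a late “blade” are each correlated against a rising instrumental record:

```python
import numpy as np

rng = np.random.default_rng(0)
years = np.arange(1880, 2001)

# Synthetic "instrumental" record: a slow warming trend plus noise.
instrumental = 0.006 * (years - 1880) + rng.normal(0, 0.05, years.size)

# Two candidate proxies with identical pre-1950 behaviour:
flat_proxy = rng.normal(0, 0.05, years.size)  # no blade, just noise
blade_proxy = flat_proxy + np.where(
    years >= 1950, 0.01 * (years - 1950), 0.0)  # late-century ramp

r_flat = np.corrcoef(flat_proxy, instrumental)[0, 1]
r_blade = np.corrcoef(blade_proxy, instrumental)[0, 1]
# The blade series correlates far better with the instrumental
# record, so it would pass a calibration screen that the flat
# series fails -- even though both carry the same (null) signal
# before 1950.
```

This is why a late-20th-century blade, whatever its cause, dominates which series get treated as temperature proxies at all.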

This thread is about Briffa’s response, not RC’s, but note that several of RC’s “Hockey Sticks” do not extend back past the LIA. We know the LIA was colder than the present, so these graphs are just red herrings. The real issue is whether there was an MWP that was comparable to the current warm period. This is what was denied by the stripbark-driven MBH HS, and continues to be denied by numerous Yamal-dependent subsequent reconstructions.

One key element here is that Dr. Briffa did not archive the data for 9 years, until he was cornered into it. Why? Once archived for all to see, the limitations of the data (picked by the Russians, small subset, etc.) would have become readily apparent, not only to their “nemesis” but, most importantly, to ANY OTHER paleoclimatology research group. Therefore the focus of the RC reply on McIntyre is indeed misplaced imo.

This statement is remarkable for what it does not do. It does not dispute the fact that the reconstruction has been proven mathematically to be non-robust to a small and reasonable change in included series. Indeed, the fact that the statement centers purely on the provenance of the problematic series seems to me to be a tacit admission of this fact. Non-robust reconstructions are invalid reconstructions. The data is what it is.

Until Dr. Briffa et al can find more data (that at least meets his own standardization requirements), I think it is safe to say that Yamal, and all the studies that use it, are now suspect.

Professor Briffa, if you read this, I’d like you to know that this whole business has been, and still is, teaching me and many others good science, in between everything else. I now believe it’s possible that Siberian larch might function as a proxy for temperatures – if, and only if, the many factors that influence growth ring size are transparently accounted for.

The key word is transparent. Transparency encourages trust and cooperation, and speeds up the processes of Science immeasurably.

We still want to know why your team chose the twelve trees, when one is so clearly an outlier that has definitively shaped half the hockey sticks, and without which, the other half of the hockey sticks are not independently validated. Please address what Steve says, in your response to him, and don’t impute to him what others say, nor what he does not say. Please keep to the essentials: the data and the methods, and the requests for transparency. Please do this for the sake of good science.

The best of scientists make mistakes. It’s impossible not to make mistakes, and near-impossible not to get emotional at times. But good scientists prove their excellence by admitting errors, and re-examining issues.

“We still want to know why your team chose the twelve trees, when one is so clearly an outlier that has definitively shaped half the hockey sticks, and without which, the other half of the hockey sticks are not independently validated.”

Don’t you think that this question should have been asked of Dr. Briffa BEFORE Steve M. posted on a public blog where he KNEW he was going to set off a firestorm?

…and don’t impute to him what others say, nor what he does not say.

So if I hand dynamite to a group of my followers while standing in front of a dam, and I watch them placing the charges all the while telling them that I would never do that, am I absolved of the crime they commit when they subsequently blow up the dam?

Re: Scott A. Mandia (#200), How could this question have been asked beforehand? It couldn’t. The point is, nobody knew these crucial details. That’s the dynamite, the refusal to release the data and methodology, and Briffa should not have been manufacturing it. I mean, do you believe in openness about such crucial issues of the science, or do you support this explosive secrecy?

What I believe in is giving the opportunity to turn over a new leaf, to say sorry, to start to cooperate. Even at this late hour. And remembering his illness, and what illness has done for me, giving me the chance to rethink deep issues.

Don’t forget, Scotty, I was once promoting “good” actions to counter alarmist warmism like you are now. But my regard for truth forced me to eat humble pie and rethink.

Re: Scott A. Mandia (#200),
That question should have been asked by the peer review team before publishing.
But wait:
that is the same team who defend it right now.
Either they were not qualified to ask the question then, or they do not want to ask it now.

The dam has collapsed, but not because of dynamite. It was constructed of inadequate building blocks after people breathed together upon them. Steve has simply pointed out that the Emperor’s Dam had no mortar.
===========================================

Re: Scott A. Mandia (#200),
Briffa knew the question was coming. How could he not? Why did he not pre-empt Steve ten years ago with a bolus of disclosure? So sad, the choices people make when it comes to their petards.

So, if Briffa’s selection of those 12 trees had, by accident or fate, managed to exclude only one specific tree (the one with the anomalous hockey stick) from his study, then the hockey stick would never have been discovered in tree-ring reconstructions? It comes down to that ONE tree? And those folks at realclimate still try to defend this as science? Unbelievable!

“I do not believe that McIntyre’s preliminary post provides sufficient evidence to doubt the reality of unusually high summer temperatures in the last decades of the 20th century.”

But who denies such temperatures? Steve wrote about something else entirely.

Shifting the topic to whether the globe has warmed could be read as a sly distraction. More probably it just shows Briffa didn’t edit very carefully.

Someone said Briffa can’t be at the top of his game right now. If he is indeed ill then his response to Steve is entirely understandable. And no response at all would have been understandable too but perhaps understood differently.

The response itself is not very strong. It omits important facts and implies Steve did things which he probably didn’t do (I’m not going to review all that text to decide what was implicit and what was explicit).

And Briffa unfortunately refers to unpublished work which will support him. But that is always the first attack made upon anything that Steve writes – it is unpublished.

Briffa also says that Mc is not clear about why he made some choices. I think Mc will win any debate about clarity and openness.

Notice Briffa does not dismiss anything. He simply denies that he selected data he liked and rejected data he didn’t like. And indicates he will review Mc’s postings.

I have been following the AGW movement (anti- and pro-) for a while now, and most recently, your excellent work presented in this forum. Being a physicist by training, and with experience in peer-reviewed publication and quality controls for publications, I find the Briffa et al. publications in violation of what I consider the 3 core canons of science: transparency, reproducibility and rigor. I would reject this paper if I were a referee. However, I digress… I am more interested in the science than in a specific author.

To me, the interesting fact about hockey sticks isn’t the blade, it’s the handle. We have an independent set of temperature measurements in the 20th century, and a good proxy should be able to reproduce that. The absolutely flat handle in the Mann/Briffa/etc. hockey stick reconstructions is what makes the blade look like something extraordinary.

What your analysis establishes for me is a) Briffa’s analysis is not robust due to a very sparse sample with significantly different behavior compared to other samples in the region and b) the blade has a different shape (curved upwards vs flat) depending on which chronology is used.

This implies that larch tree rings are just not a good proxy for temperature, right? Which means that the millennium-long flat handle is not reliable, at least as far as I can conclude.

Is that the point you are making here? Thanks, and keep up the good work.

Re: broken hockey stick. Stop the rhetoric. Calm down.
.
Steve has revealed sensitive dependence of 10 papers on CRU’s Yamal larch. He then revealed sensitive dependence of Yamal on one tree, YAD061. This is much the same sequence that led to the refuting of all those papers that depended sensitively on California bristlecone pines – a refutation subsequently upheld by NAS.
.
That is the story. But stay tuned for more.
.
Alarmists: please speak to these FACTS. Any reasonable, well-posed questions will be answered.
Is it possible that the hypothesis can be salvaged? Yes, of course. Briffa thinks this likely. My opinion is different. However opinions are not facts. Let us discuss facts before opinion.
.
You inexperts: your ignorance – like that of Tom P – is quickly revealed to those who know better. You are well advised to scale your assertions to the strength of your expertise. Or you will be made to look like a fool.

Heartrot is something that is discussed in an Esper paper and in Melvin’s thesis. I may have posted on this issue a couple of years ago. It seems to me a very real inhomogeneity between subfossil and living cores, and one that needs to be discussed technically in the presentation of chronologies used in multiproxy studies.

The graph may explain a lot.
If you “know” the temperature trend (say, a clear rise from the late 1960s to the late 1990s), what do you do if most of the tree-ring data does not fit this measured trend?
Either use the one core that fits, or question the other cores.
Here it seems that some people prefer to use the one, probably to avoid the implications of the other choice.

Either way, the scientific answer is that neither single trees nor a small number of trees can be used to extrapolate NH climate, or even local climate.

This thread contains many insightful comments. To reiterate what has already been alluded to: amazingly, nowhere in his comment linked above does Briffa respond to the repeated allegations/implications that he has hidden his data and repeatedly refused to disclose his data and metadata. Indeed he corroborates the allegations by concluding: “…but we have done considerably more analyses exploring chronology production and temperature calibration that have relevance to this issue but they are not yet published.”

Tell me Dr. Briffa will that data be timely disclosed? Will the metadata be timely disclosed? What is your position regarding disclosure by your colleagues with which you have joined or will join in publications? If you should become too ill, have you provided instructions to the custodians of the data and metadata used in your published works to disclose all of it?

If there is any confusion arising from the interpretation of, inclusions in, or omissions from the Yamal data, Briffa has only himself to blame.

I think Briffa has been pretty reasonable in his reply; his desire to find a way to measure climate change reliably can’t be argued against. However, his now seemingly de rigueur conflation of the author’s position with the subsequent comments on the blog seems to be a bit of a smokescreen.

There should be a name for that kind of conflation. I vote for “Yamalling”.

“It would be one thing if they had only sampled 10 trees and this is what they got. But they selected 10 trees out of a larger population. Because the selection yields such different results from a nearby population sample, there is a compelling prima facie argument that they’ve made biased picks. This is rebuttable. I would welcome hearing the argument on the other side.”

Maybe you didn’t mean it this way, but a plain reading of this text would have you accusing somebody of purposely picking certain trees because of a bias on their part.

Steve: if someone read my head posts and comments, I don’t see how there could be any misunderstanding as to what is meant here. First of all, this does not in any way imply that Briffa made the picks, which is what people are accusing me of alleging. In other comments, I had clearly said that it was my surmise that the Russians had made the picks and that Briffa had applied the subset for a different standardization method (one requiring more replication) than that used by the Russians. I noted that the Russians said that they had picked trees that were older in the 20th century than the corresponding subfossil trees. They also said that they selected cores for “sensitivity”, which might also have introduced a bias. Because there was such a difference to the Schweingruber population, it was my surmise that this selection procedure, however it was done, prima facie introduced a “bias” – which is a term of art in statistics and not derogatory. I observed that this surmise was rebuttable and was and remain interested in alternative explanations. I don’t see that this language in any way supports an allegation that I accused Briffa of cherrypicking cores to match the instrumental record.

Re: John N-G (#95),
What it says to me is that a biased sub-sample arose somehow. It does not guess at who might be responsible, or what criteria might have been used to bias the sample.
Note that Briffa’s letter does not deny that the CRU dozen identified by Steve represent a biased sample compared to so many other samples from the region.

I’m glad that Keith Briffa has seen fit to answer Steve’s questions and address him by name. What I’m not impressed with is the completeness of Briffa’s reply so far.

Without casting aspersions on Keith Briffa’s motives, the Yamal chronology is unjustifiably sparse on its most important time interval and fails obvious sensitivity tests with other cores which Briffa must have known about. Once again, just like the Holy Tree of Gaspé, we appear to have a chronology which depends critically upon a single tree for its key variance.

The second image below is, in my opinion, one of the most disquieting images ever presented at Climate Audit

I happen to agree. That a non-representative sample would be used to represent a whole region, and that this region should weigh so heavily in Arctic, NH and global temperature reconstructions, is “disquieting” – especially given the 10 years it’s taken to drag this fact into daylight. Does it mean AGW is false? No. Does it decrease confidence in the conjecture that the CWP is warmer than the MWP? Yes, for now.

It would be pretty easy for Briffa, once he’s healthy, to clear the air on any speculations happening outside CA. Just explain the methods for deciding which sample sites get included in which chronologies, and why some sample sites don’t get included. And it would serve the entire community well if on Team chronologies and climate reconstructions, they computed confidence intervals (something Gavin Schmidt insists on … when it suits his purpose). If that were to happen, those “spaghetti graphs” would all start pointing to the same conclusion: it is not possible to accurately reconstruct climate 1000+ years into the past. Which is what the American National Academy of Sciences concluded in 2005.

I find it shameful that Keith Briffa has, for many years, held back the method he used for tree selection.

If he really discovered a reliable method for extracting a temperature signal from a subset of a larger community of trees, why hasn’t he made it public? The dendroclimatology community has spent, and maybe wasted, thousands of hours trying to come up with a reliable method. Although his method may not apply to all tree-types in all areas, it should have been released years ago for the dendro community to build on.

Someone set me straight on the Yamal larch analysis…. I am very puzzled…

Let’s assume that Briffa et al. did everything right and had a legitimate reason to use ONLY the Briffa Yamal series and not the Schweingruber series. Let’s for a moment focus on the science instead of any ethical issues.

The Briffa Yamal series has trees whose rings increase in thickness over the 1900s and are roughly constant over the 1800s. Presumably, inspired by this, and by the well-known fact that warmer temperatures spur growth, one posits that Yamal larch tree-ring growth is a proxy for temperature. Am I right so far?

My problem is that there are other factors that can spur growth as well (as Craig Loehle has pointed out), notably CO2. Craig also mentioned sunlight, nutrients, etc., as factors that can affect growth. So the Briffa Yamal series is also a proxy for these. How can one then make any claims about temperature reconstruction when CO2 levels alone may be causing all the variation in the Yamal data? (If this has been discussed elsewhere, can someone point me to a link, please?)

For all I know, the long flat handle of the hockey stick just says that CO2 levels were constant, and nothing about temperature, if the CO2 sensitivity is far greater than the temperature sensitivity. The curved blade would then be a reflection of increasing CO2 after the industrial revolution started, and still not an indicator of global temperature.

To deconvolute the effects of multiple factors on a particular proxy, multiple proxies have to be used, together with some mathematical technique to isolate the piece that reflects just the temperature. A single proxy therefore cannot be used to determine past temperature, no matter how good the proxy data are, as long as more than one external factor can affect the proxy.
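For what it’s worth, the kind of deconvolution alluded to above can be sketched numerically. The example below is a toy, with invented sensitivities and factor histories (`temp`, `co2`, and the sensitivity matrix `S` are all assumptions, not values from any paper): if two proxies respond with different, known sensitivities to two factors, the factors can be recovered by solving a linear system; with a single proxy the system is underdetermined and no unique temperature history exists.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200  # years

# Hypothetical "true" forcing histories (arbitrary units)
temp = np.cumsum(rng.normal(0, 0.1, n))  # random-walk temperature
co2 = np.linspace(0.0, 2.0, n)           # steadily rising CO2

# Two proxies with different assumed sensitivities to each factor
proxy1 = 1.0 * temp + 0.8 * co2 + rng.normal(0, 0.05, n)
proxy2 = 0.3 * temp + 1.5 * co2 + rng.normal(0, 0.05, n)

# With known sensitivities, each year's (temp, co2) can be solved for
S = np.array([[1.0, 0.8],
              [0.3, 1.5]])  # sensitivity matrix (invented)
recovered = np.linalg.solve(S, np.vstack([proxy1, proxy2]))

# The recovered temperature tracks the true one closely
corr = np.corrcoef(recovered[0], temp)[0, 1]
print(round(corr, 2))
```

The point is only structural: recovery is possible here because there are as many independent proxies as factors; one proxy alone could not separate them.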

The Briffa Yamal series by itself cannot be interpreted, in my opinion, as evidence of warming (or any temperature change, for that matter) regardless of whether Briffa et al had a solid analysis in their paper. A lot of time has been spent looking at whether the Briffa analysis was right or wrong, but in either case, the conclusion is not warranted. What am I missing in this line of thinking?

(The fact that Steve’s analysis of the Schweingruber series is so radically different from the Briffa series suggests it isn’t even a good proxy and cannot be used for temperature reconstruction of the last millennium.)

Has anyone looked at this draft paper which I believe is what Prof Briffa is referring to in his comment about current research?
I find the conclusions interesting as he seems to be saying that the validity of RCS climate reconstructions in removing the MWP is questionable and that “this should lead to a circumspect interpretation of RCS-based climate reconstruction”.

By now, the Yamal issue comes down to basics – the validity of the scientific method – and, within that topic, the reliability of the data source.

If Dr. Briffa selected that small amount of data himself, he needs to clarify his reasons for doing so.
If Dr. Briffa simply used the small amount of data from H&S without validating it, he was wrong.
If Dr. Briffa used the small amount of data from H&S after validating it, then he used the same criteria as H&S, and that is an error.

I read some comments about the exclusion of tree-ring data from the initial and subsequent studies – data that did not fit the “normal” readings used in dendrochronology, data that was not consistent at a regional scale.

The reason for this inconsistency may be that the data were collected inadequately, or that they were inferred, or even invented. But if the data were collected in the right way, then (biologically, ecologically and climatologically speaking) the data are fine and there is nothing wrong with them; in other words, at a regional scale, the data are consistent in an ecological-historical way.

So the exclusion is merely statistical or methodological. What, then, is Dr. Briffa’s methodological reason for using only 12 data samples?

It’s like ants in a nest: almost all the ants look similar, and almost all are incapable of reproduction. If you sample the nest and catch the queens and males, you have a big chance of concluding that all ants have offspring (obviously an error). What happens when your sample contains only workers? You conclude that:
1. Ants are incapable of having offspring (another mistake), or
2. The sample is wrong and inconsistent with all the ants in the nest, and for that reason you exclude the workers from the data (another error), or
3. You work with the whole sample and conclude that ants have different castes.

For me it’s a similar story: Briffa’s 12 Yamal series are the queens and males of the nest. He excluded the normal, historical warm-climate-cycle data (the workers), so he concluded that we are in a climate change (all ants are reproductive); but when you include all the data (McIntyre’s 34 data series, or the queen-male-worker castes of the nest in my comparison), you conclude that we have historical warming-cooling cycles (or that ants have normal life cycles and castes).

Confusing? Well, I’m confused too, hehe… but from my biological point of view, Dr. Briffa focused on a pixel and missed the full picture, while Steve McIntyre gathered all the pixels in the right place and showed us an HDR full-size landscape.

All this disputing could have been avoided if the scientists involved had simply made their data, methodology and results fully transparent, accessible and open to others.
As we’ve seen with other prominent graphs and reports, this unfortunately has not always been the case. Also, recent childish name-calling on the part of some has done nothing for scientific integrity, and has only served to fuel more skepticism.

Why was the phrase “The second image below is, in my opinion, one of the most disquieting images ever presented at Climate Audit.” used?

It’s disquieting that the problems with these studies were not picked up during the review process or thereafter, especially given that they are used all over the media, is it not? Are you not surprised? SM hasn’t done anything ground-breaking in his analysis, apart from managing, after a long struggle, to obtain the data (that alone is disquieting). We are constantly told that this science is “peer reviewed” as a form of appeal to authority and correctness. It’s an appeal I am unfortunately fast becoming immune to, which is a shame. I think that is the sense in which the word “disquiet” is meant.

Why was the phrase “The second image below is, in my opinion, one of the most disquieting images ever presented at Climate Audit.” used?

Here is why. It is showing that the scientific basis for numerous subsequent studies is non-existent. But these studies have entered the public policy arena, been quoted in UN documents, and have served to partly justify the proposed investment of many trillions. But one of the key studies underlying them is baseless.

What the graph shows is that when you use different samples from the same area, you get a diametrically opposed result. The different samples are drawn from a peer reviewed corpus, there is no obvious reason to prefer one set or the other. We have a situation in which scientific method appears to have failed. The author should have seen this for himself during the study period. Peer review should have found it prior to publication. The raw data should have been available for replication and verification of the work in the subsequent 10 years. The UN should have audited the study (and also the MBH studies) prior to publishing the later derivative works, and using them as justification for urgent action.

I have no idea what went wrong, and don’t think it is of any great matter. What is clear is that the original study is not robust, and the public policy implications are very considerable. Something is wrong with the way in which scientific research results are finding their way into public policy, unscrutinized, and in the present case, when they are not sound.

Imagine that this was medical. There was a case in the UK which was sort of similar – the case of the childhood leukaemia clusters. There appeared to be a cluster near Sellafield, a nuclear plant site. Further research showed that there were clusters not near such plants, and that other plants had no such clusters. There was no causal relationship or even robust correlation. We have something similar here. One set of trees shows a certain curve. Others, which were available to the researchers, should show that same curve if it were a real phenomenon and not just a characteristic of the particular sample. But they do not. And it was not a big deal to discover this, it only took one man a couple of days, once the data was in the open.

The reasons for this situation are both unclear and unimportant, but what is both clear and important is that Briffa and his colleagues have failed, and failed reprehensibly. There are lots of ways it might have happened, but in the end, if you publish conclusions based on samples which are not a correct representation of all the evidence available at the time, you have done something reprehensible in scientific terms. If this came out at a Ph.D. defence, you’d fail. If it came out in a business-case presentation of a project to a Finance Committee, you’d be kicked out. In both cases people would say that what had happened was disquieting and reflected badly on you as a professional.

This is what is being said here. Briffa is a leading figure in the field. He has now got a really large black mark against him. None of the studies based on this can be relied on. The public policy implications are large, the programs which the studies were used to justify will have to be reexamined. It is indeed very disquieting, and I don’t recall that CA has ever published anything in the past with this sort of scale of issue. So the remark that it is ‘one of the most disquieting images’ published is spot on.

There is no implication of any particular cause for the failure. It’s the failure, and the situation we find ourselves in as a result, that is disquieting. I would compare it to being halfway into a product introduction and finding not only that the market is a tenth the size we thought it was, but that a reexamination of the original research shows it should have been obvious from the start that the size estimates could not be justified. This is the kind of thing where, when Planning audits the original case in the light of what is now clearly a pending financial bath, the Finance Committee looks at a chart like the one SM has shown, and the Controller says to the CEO, “George, this is one of the most disquieting charts I have ever seen presented to this committee.” Quite so.

Re: Bob Koss (#60),
Excellent summary by Dr. McKitrick. And note that it is fair in this instance to describe Briffa’s selectivity as “cherry-picking” – not trees in a chronology, but chronologies in a reconstruction (substitution of Yamal for Polar Urals). Readers will note that THIS form of cherry-picking is not addressed in Briffa’s response. He sidesteps that issue. Myself, in a courtroom I wouldn’t call it a “substitution” because I wasn’t in the room watching over their decisions as to what to include or exclude. The material fact is that Polar Urals has a warm MWP and Yamal does not. The source of the bias is, as always, a bit mysterious. (But it sure happens a lot to the same great gang!)
This is perhaps one reason why those outside CA are so quick to conclude that the method by which the CRU dozen were chosen was a similar post-hoc process of choice with prior knowledge. Maybe they see a pattern.

Dear Professor Briffa, my apologies for contacting you directly, particularly since I hear that you are unwell.
However the recent release of tree ring data by CRU has prompted much discussion and indeed disquiet about the methodology and conclusions of a number of key papers by you and co-workers.

As an environmental plant physiologist, I have followed the long debate starting with Mann et al (1998) and through to Kaufman et al (2009).
As time has progressed I have found myself more concerned with the whole scientific basis of dendroclimatology. In particular;
1) The appropriateness of the statistical analyses employed
2) The reliance on the same small datasets in these multiple studies
3) The concept of “teleconnection” by which certain trees respond to the “Global Temperature Field”, rather than local climate
4) The assumption that tree ring width and density are related to temperature in a linear manner.

Whilst I would not describe myself as an expert statistician, I do use inferential statistics routinely for both research and teaching, and I find difficulty in understanding the statistical rationale in these papers.

As a plant physiologist I can say without hesitation that points 3 and 4 do not agree with the accepted science.

There is a saying that “extraordinary claims require extraordinary proof”.

Given the scientific, political and economic importance of these papers, further detailed explanation is urgently required.

There’s a good collection of links to studies relating to the MWP here. I don’t care who made the collection; what’s important is it is a serious list of studies, many of which purport to be careful quantitative work.

Since there are so many OT comments here (red, blue and green noise?), I might as well venture into a layman’s summary to try and “extract the signal” of the discussion between Keith Briffa and Steve McIntyre, which after all shouldn’t be too hard to understand.

(Very) simplified:

“Keith Briffa: SM implies I have purposely selected the data from a larger available data set. Which I have not.

Steve McIntyre: I did not accuse KB of having ‘purposely selected’ the data. Look here, I can document that.

KB: The Russians made the selection for their corridor standardisation (CS). I then used the same data for RCS methodology to be able to find the longer-term variability and compare my results with the Russians’ result.

SM: That’s precisely the point: whether the selection was appropriate for RCS. While it’s OK to use RCS, there are too few cores for it, and there is a bias towards older trees in the living samples compared to the subfossil ones. This does not affect the CS analysis, but it does affect the RCS analysis, and may have introduced a bias.

KB: SM substituted the Schweingruber series for some of our data without any justification.

SM: It was done to test how other data from the same area affected the Yamal chronology, and the basis for this is clearly stated and justified, as everybody can see.

KB: It is not established that SM’s version of the Yamal chronology is more robust than the original.

SM: There is no ‘SM-version’ of the Yamal chronology. Only a test for how robust the Briffa Yamal chronology is to the inclusion of other data from the same area.

KB: We will come back with more. We have also done further work yet to be published.

SM: Let’s hope that this time the data for this new work will be archived at the time of publication.”

(Please note that the quotation marks around the whole “dialogue” indicate that this is my interpretation; I am not trying to put words into either KB’s or SM’s mouth.)

I hope I’m right about the CS vs. RCS point, as this seems to be a key technical point that I have to admit I don’t fully understand.
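For what it’s worth, the mechanical difference is simple enough to sketch. Corridor standardisation detrends each series individually (which removes long-timescale variation), whereas RCS builds one “regional curve” of expected ring width as a function of cambial (ring) age from all series, divides each measured ring by the curve’s value at that ring’s age, and averages the resulting indices by calendar year, preserving the low-frequency signal. Below is a minimal, illustrative RCS implementation on synthetic data – my own sketch, not CRU’s code; the function name, the tree tuples and all numbers are invented for illustration:

```python
import numpy as np

def rcs_chronology(series):
    """series: list of (first_year, ring_widths) tuples, widths ordered
    from the tree's first (innermost) ring outward."""
    max_age = max(len(w) for _, w in series)

    # 1. Regional curve: mean ring width at each cambial age
    sums = np.zeros(max_age)
    counts = np.zeros(max_age)
    for _, widths in series:
        a = len(widths)
        sums[:a] += widths
        counts[:a] += 1
    regional_curve = sums / counts

    # 2. Index each ring against expected width at its age,
    #    then average the indices by calendar year
    year_sums, year_counts = {}, {}
    for first_year, widths in series:
        for age, w in enumerate(widths):
            yr = first_year + age
            year_sums[yr] = year_sums.get(yr, 0.0) + w / regional_curve[age]
            year_counts[yr] = year_counts.get(yr, 0) + 1
    return {yr: year_sums[yr] / year_counts[yr] for yr in sorted(year_sums)}

# Tiny synthetic example: juvenile growth declining with age
tree1 = (1900, [2.0, 1.5, 1.2, 1.0, 0.9])
tree2 = (1902, [2.2, 1.6, 1.1, 0.9])
chron = rcs_chronology([tree1, tree2])
```

Because every tree is compared against the same age-dependent expectation, a century in which most rings sit above the curve shows up as a raised chronology – which is exactly why RCS can retain multi-decadal growth changes that corridor standardisation removes.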

Both parties’ summary, and now I quote:

KB:

“I do not believe that McIntyre’s preliminary post provides sufficient evidence to doubt the reality of unusually high summer temperatures in the last decades of the 20th century.”

SM

I am arguing that the version constructed by Briffa, and relied on so extensively in the literature since then, is not robust in its late-20th century portion to a small and reasonable inclusion of additional data.

(I don’t know whether it is significant or not that Briffa doesn’t mention where the unusually high summer temperatures are supposed to have been – in Yamal? The Arctic? The whole world?)

It will surely be interesting to see how this whole thing evolves further.

I’m about to make an attempt at moving the motivation discussion to unthreaded. All links should still work. I am not *removing* any comments, only moving them. If I find mixed-purpose comments, any that have some scientific content relating in some way to the Briffa affair will be left here.

Oh sure. He’s just ‘asking questions’ – and yet the innuendo and implication was perfectly clear to his friends and to the greek chorus and no correction of McKitrick’s or Watts’ comments were made. Strange that. At absolute minimum McIntyre is complicit in propagating slander – and if that makes you feel better about this, than good for you. It doesn’t do much for me. – gavin

Perhaps I’ll be snipped, but I think this is important. You do realise the difference between imagination on your part and real, empirically proven “slander” (didn’t you mean “libel”?)? What McIntyre says is what he says. You accuse him directly of something he did not do, and this is a fact, innuendos aside. If you want to accuse Anthony Watts, that strange newspaper and others of misrepresenting SM, be my guest – release the panthers, I’ll enjoy the show personally! But if this is the quality and rigor of your accusations, I do not hope for the best. You are doing a disservice to this site, Mr Schmidt, especially considering how much you have complained about how badly the press misrepresents what scientists are actually doing. In those cases you cautioned everyone against thinking that what goes into blogs and newspapers is a good, objective representation of a paper or finding. I hope cooler blood will convince you to do the right thing.

I believe some of the confusion over the accusations of “cherry picking” arises from ambiguity in the term itself. Steve has amply established that there was additional data that could have been included in the Yamal series, but that it was not, and that no principled reason for its exclusion was offered. That failure would not be justified even if it could be proved beyond doubt that Briffa innocently screwed up by not recognizing that the H&S data were not suitable for the RCS he had already decided, for valid reasons, to perform. I believe it is fair to characterize that failure as “cherry picking.”

snip – Given what I’ve learned about dendrochronology here–that meaningful results come from populations of trees, and not individual ones- snip.

There’s an important lesson here ….. In my undergraduate advanced physics lab, I did a reproduction of Michelson’s rotating-mirror speed-of-light experiment. It involved passing a laser through a beam splitter and hitting a rotating mirror with each beam, but with optical path lengths as different as we could get them, so we could measure the displacement of the resulting dots. The assignment required the measurement of the speed of light with an experimental uncertainty below .3%. My lab partner and I lived in the lab for weeks trying to pull it off. Somewhere along the way, we decided that we weren’t just going to get the experimental uncertainty below .3%, but that we were going to keep refining our method until we got something within .3% of what we knew to be the right answer. We felt like kings when we handed the prof our reports. He glanced at them for about 5 seconds, and then said, “You’re off by a factor of 2. The mirror was silvered on both sides.” I can’t speak for my partner, but I felt like … I’d cheated on a test, though I’d never believed that was what I was doing at the time.

In Briffa’s reply he stated that this is not how chronologies are developed in his lab. He suggested, if not stated (I don’t have the quote in hand), that the study area and site types are defined a priori and ALL the data go into the chronology.

Re: bender (#75), Yes, bender – that was exactly my point. As I believe my post said before the snipping, the “cherry picking” was of whole sets of data, not a picking and choosing of individual trees.

Actually, I’m a bit confused about the way my earlier post was edited – what got snipped was every instance in which I suggested that Briffa did NOT do what Steve called “gross cherry picking.” My intent was to highlight why it is that science only works when the scientist is more dedicated to the process than to any particular outcome – precisely because it is so easy for one’s hopes and beliefs to color the choices that are made, even within the range of what would otherwise be acceptable choices, so as to control the outcome.

OK, the Big Move is complete. Close to 200 messages were moved to the current Unthreaded thread. That’s the biggest food fight we’ve had here in quite a long time.

A couple of thoughts and then I need to get going on my day:

1) Lorax, you had an early posting that was mostly good stuff but also contained the root of the “motivations” fight. It could have remained here except for that…and obviously motivations are an incendiary topic that has nothing to do with science. It’s all still over on Unthreaded.

2) SlowNewsDay, you bring up what can be a healthy science-based question, “Why is it disquieting?” However, much of your further commentary has been about motivations. Out of bounds.

I am very cautious about how to handle questions of motivation. In a sense, the meta-discussion. That conversation needs to be on Unthreaded, if anywhere.

Question motivation all you want, but certainly not in any of the science threads.

This blog IS about science. We can all disagree on lots of things — and we do — but one thing I appreciate most about the CA community is that this really IS about science, and that’s something that, done right, we can hopefully all agree on.

I happen to agree. That a non-representative sample would be used to represent a whole region, and that this region should weigh so heavily in Arctic, NH and global temperature reconstructions, is “disquieting” – especially given the 10 years it’s taken to drag this fact into daylight.

You have assumed that it is a non-representative sample.

Have you looked at the number of proxies and sites that Loehle used to represent global temperature? 18.

The claim is not ‘disquieting’, it’s ‘one of the most disquieting’, in other words, one of the worst that CA can come up with. That appears to imply something extreme. For a matter that has not even been settled yet, it is a case of putting the cart before the horse.

No and yes. Steve has shown that there is a good chance that it may be non-representative. So, it’s not JUST an assumption. It’s a working hypothesis. But it will be noted in comments elsewhere that I have caveated this working hypothesis by noting the publication of a new paper by H&S, entirely in Russian, which may support the contention that the CRU dozen are representative. Let’s get a translation, and then revisit the question.

Not exactly, but I know what you mean. It is “disquieting” because it fits a pattern established through prior audit. Whether it is “damning” depends on the strength of the defense that Briffa can mount, and so far he has said very little in his reply. We will all just have to wait until he’s back in the saddle. I’m sure all at CA wish him a speedy recovery. I know I do.

I am shocked. One Siberian larch made the difference between a hockey stick and a baseball bat. From my education in experimental design, the sample size was much too small even if the larger sample had been used. Gathering large samples is hard work.

Re: Henry chance (#86), Actually, I’m glad to hear someone else say this out loud. Based on my own limited experience, I would think that any data sets in double digits would be too small to be considered compelling, regardless of the calculated statistical uncertainty. After all, two identical measurements will give a calculated uncertainty of zero, and we know that’s not the real answer. But I’ve been afraid to argue the point, since I’m hardly an authority on either dendrochronology or statistics.

two identical measurements will give a calculated uncertainty of zero, and we know that’s not the real answer

The probability of two identical measurements occurring in a randomly chosen sample with wide variance is exceedingly small, so there is not much value in contemplating this one improbable scenario. In the long run, uncertainty expressed as a confidence interval does a good job of taking care of the dangers of small samples. To argue otherwise is to argue against the whole of statistics, and the central limit theorem in particular. You were wise not to voice such a silly objection.

Re: bender (#93), Firstly, I respectfully request that you reconsider whether characterizing an honest question about sound scientific practices as “unwise” is consistent with the ethos of this forum.

Secondly, forgive me for saying so, but your argument does not comport with what my professors taught me: confidence intervals are not a panacea for small sample sets. All a CI does is quantify uncertainty about what the rest of a population looks like, under the assumption that the measured data points are really members of that population. It does nothing to quantify uncertainty about whether those data points are really members of the population.

Small sample sets are problematic because they increase the likelihood of a statistically significant distortion of the results from anomalous random experimental error. They also generally imply a lower probability of detecting systematic experimental error. This is not a statistical effect (which, by the way, is why your comment that I’m “arguing against the entire field of statistics” is mistaken). It is a very practical observation about what makes for reliable observation of the real world.

Re: QBeamus (#112),
Sorry about the wisdom comment. What you say about the representativeness of a small sample is true. In the case of biological populations, where variation abounds, the warning is especially relevant. Agreement aside, the question is what this all means in the present case of Briffa’s sample. Despite its small size, it can be shown to be unrepresentative of a population that includes a different, larger sample. The plotting of confidence intervals would help substantiate this assertion. Let us not argue at cross purposes. Small samples are bad. But in the specific case of a tree-ring chronology, you never have just two samples to compare. You may have two cores, but each core has repeated measures of ring width. This provides some protection against the small number of cores in the sample, which might be fatal for any one within-year comparison, but loses importance when doing multiple comparisons across years. That is the ONLY reason why I say your case of “two identical observations” is irrelevant. A single core contains hundreds of observations.
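The point about confidence intervals and sample size can be illustrated with a toy simulation (the between-core scatter `sigma = 0.5` is an invented number, not an estimate from the Yamal data): the 95% half-width of a yearly mean shrinks roughly as 1/√n, so a 12-core mean is about √(34/12) ≈ 1.7 times less certain than a 34-core mean.

```python
import numpy as np

rng = np.random.default_rng(1)

def ci_halfwidth(n_cores, sigma=0.5, n_sims=2000):
    """Approximate 95% CI half-width for a yearly mean ring-width index
    across n_cores cores, assuming i.i.d. between-core scatter sigma."""
    means = rng.normal(0.0, sigma, size=(n_sims, n_cores)).mean(axis=1)
    return 1.96 * means.std()

hw12 = ci_halfwidth(12)  # a Briffa-sized sample
hw34 = ci_halfwidth(34)  # a Schweingruber-sized sample
print(f"12 cores: +/-{hw12:.3f}, 34 cores: +/-{hw34:.3f}")
```

Plotting such intervals around both chronologies is exactly the check that would substantiate, or undercut, the claim that the two populations differ.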

It is a bit surprising that Prof. Fritz Hans Schweingruber has had nothing to say about the use of his “Schweingruber series”. Maybe he is interested only in collecting, not in evaluation, but some first-hand knowledge about Yamal would surely be interesting.

The basis for McIntyre’s selection of which of our (i.e. Hantemirov and Shiyatov’s) data to exclude and which to use in replacement is not clear but his version of the chronology shows lower relative growth in recent decades than is displayed in my original chronology. He offers no justification for excluding the original data; and in one version of the chronology where he retains them, he appears to give them inappropriate low weights.

Back to the weighting issues. Steve has explained how, in MBH98, Mann’s PCA analysis gave hundreds of times more influence to the Bristlecones than to the other proxies.
How are “appropriate” weights determined?
I don’t mean that in a flippant way. Is there a valid physical mechanism that suggests scaling of an individual proxy, or statistical rule of thumb that would be applied before comparing with instrumental records?
Shouldn’t scaling choices be explained?

The posts seemed very focussed on the details, and yes, the details in this kind of work are obviously critical. Could I please solicit some thoughts on what the broader implications of Briffa’s alleged errors are for all the hockey-stick-like graphs? Does this change the magnitude of the MWP? For example, the audit/critique of Steig et al. conducted here resulted in them making changes to their original paper. As it happens, it seems that those changes/corrections to the data analysis did not change their major conclusions.

Is this a similar case? Perhaps not. I am just asking those who are more informed. Wouldn’t a true test of the implications be for someone here to recreate the temperature proxy reconstruction in its entirety, and then to superimpose those data on the original reconstruction, as well as on reconstructions from independent data sets and authors? If the ultimate goal is to place 20th and 21st century warming in context, Briffa’s is only one reconstruction. I understand that others also use the controversial tree core data discussed here. What I am trying to ask is: do these findings regarding Briffa’s seemingly erroneous methodology in any way confirm or deny that the 20th century warming is significant in the context of the past 2000 years?
It seems to me that tree rings are probably not the best proxy for temperature, too many issues, especially at higher latitudes where most of the recent warming has occurred during the dormant season.

Re: Lorax (#94),
The alternate proxies already exist. Steve has shown sensitivity studies using Polar Urals for many years.

Right now there are vanishingly few non-tree-ring studies that show the CWP to be exceptionally warm. (Warm yes, but not exceptional). Loehle’s work is one of the few that brings in every temp-correlated proxy that doesn’t involve tree rings.

[Looks like things have settled down here. I’ve gotta get back to work…]

Re: Lorax (#94), This is certainly the next question to ask. The RC posting is mostly beside the point on this, looking at Wahl and Ammann, for instance, who rely on bristlecones, and so forth. Look at page 12 of Steve’s OSU lecture and you can see how Yamal is used so extensively. So if there is a problem with it, especially where its most pronounced feature turns out to depend on the weakest part of the sample, evaluating the knock-on effects is essential. I have been bugging Steve to work on the draft of a paper doing precisely this for about 3 years now (Steve will attest). But what I kept forgetting along the way is the number of such studies that cannot be replicated, let alone subject to a sensitivity analysis, because the input data has been withheld. Until you can reproduce the original results you can’t say what effect changing the Yamal data would have.
.
Nonetheless Steve presented one variation relevant to Yamal in his 2006 House testimony here. And I don’t think we will have to wait all that long for sensitivity studies of the other series involved, but the data are now out there for anyone to do the work who is impatient for the answer.

Could I please solicit some thoughts on what the broader implications of Briffa’s alleged errors are for all the hockey-stick like graphs

If you read the blog you will find some commentary has already been provided the day the analysis was done. Read all posts with “Yamal” in the title.
.
Note however that this is a far-reaching question that could only be answered if all the data and all the code contained in all those papers were available. That is not the case. You will need to wait before a definitive answer will be available.

Nobody is alleging any errors. Everybody is asking him to defend his choices (and those made for him by others) so that we can determine if those choices are justifiable. On the surface, they would appear difficult to defend. Please stop being so presumptive.

Could I please solicit some thoughts on what the broader implications of Briffa’s alleged errors are for all the hockey-stick like graphs. Does this change the magnitude of the MWP?

I think many here at CA take away rather individual views of the sensitivity testing, particularly when it comes to applying the results to the general list of reconstructions.

I personally take away the message that all reconstructions must be submitted to the types of sensitivity testing that Steve M did with Briffa, particularly with regard to sample and proxy selection. If the authors of a reconstruction have clearly spelled out their selection criteria (or will in the future), and those criteria make good sense from a physical standpoint, I would think that sensitivity testing becomes less critical. Currently, however, sensitivity testing is of critical importance, since almost all reconstructions (that I am aware of, including those of Craig L and Hu M) use proxies for which a good physical model is not well recognized or defined in the peer-reviewed literature.

This is not to say that, given proper sample/proxy selection, there cannot be errors in the methodologies applied – in particular, in determining the uncertainty of reconstruction results.

And I am sure that Steve M agrees that the sensitivity testing itself requires a critical analysis as to its appropriateness, and that the final result of that analysis for the testing under discussion is not complete at this point.

Re Steig: that’s still a work in progress. Ryan O and Nic L are working on it offline and I get emails on that almost every day.

My original comment on that paper was that it made sense to me that the Antarctic was warming – everywhere else in the world was; and that I, for one, was not invested in the idea of Antarctic cooling nor, if that were the case, that that necessarily meant that the models were “wrong”. Indeed, there’s enough hair on the station data that one might reasonably be concerned about inhomogeneity in the data.

A benchmark calculation for Steig is to simply take an area-weighted average of the temperature station data. That wouldn’t get you on the front cover of Nature though.
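For anyone curious, that benchmark really is only a few lines. A minimal sketch (the station latitudes and anomalies below are invented placeholders; a real run would use the actual Antarctic surface-station records):

```python
import math

# Invented placeholder stations: (latitude, temperature anomaly in deg C).
stations = [
    (-90.0, 0.1),   # pole-like station
    (-77.8, 0.3),   # coastal station
    (-66.7, 0.6),   # peninsula-like station
]

# Weight each station by cos(latitude) as a crude stand-in for the area it
# represents. Note cos(-90 deg) is ~0, so the pole station gets almost no
# weight -- one reason a real analysis would grid the data first.
weights = [math.cos(math.radians(lat)) for lat, _ in stations]
weighted_mean = sum(w * anom for w, (_, anom) in zip(weights, stations)) / sum(weights)
print(round(weighted_mean, 3))
```

In practice one would compute this per month over the full record, but the area-weighted average is the natural sanity check against any fancier reconstruction.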

The primary interest for me in the Steig analysis was that it offered access to the weird RegEM method used in Mann 2008. Jeff Id and I worked together on this – he got Matlab runs working and I gradually figured out to get an R-port of RegEM so that we could begin to see what they were doing. Ryan O’s done 90% of the work after that.

For me, there are interesting issues in things like PC retention and regularization parameters. Do these things matter? Results are not stable. If Steig et al adopted the PC retention methodology proposed in Wahl and Ammann 2007 to salvage the bristlecones, then their results fall apart; alternatively, if Wahl and Ammann adopted the PC retention strategy that Steig requires to preserve his results, then their results fall apart.

Do such inconsistencies matter in climate science? Seemingly not.

But please – let’s not discuss Steig or other drive-by topics on this thread or even right now. We’ve got lots to discuss with the topics that are on the table right now.

As a very rare poster but long-time lurker, I would urge Lorax to continue to contribute. The rules of the blog are well known to the regulars, and it was unfortunate that one of the moderators was unable to push stuff out to unthreaded much earlier.

The site runs as a scientific discussion, and non-science material usually gets immediately snipped (i.e. deleted). This has not been happening, I suspect due to traffic levels, but all the non-science material has gone to unthreaded.

Because this is by necessity a fairly crude mechanism, some of your science stuff also went that way. It doesn’t mean it wasn’t appropriate or appreciated, just that it was mostly combined with other stuff. Please persevere – science stuff on the main threads and other stuff on unthreaded. Remember that this is a largely uncensored site – what you say and argue about, if within the scope of the site, gets seen by thousands.

I think Briffa’s reference to “the unusually high summer temperatures in the last decades of the 20th century” is meant to mean that these trees have responded to increased “degree-days” by growth increases.

This is untenable for a number of reasons, some of them plant-physiological:

1) 8-sigma increases are simply unbelievable. Van’t Hoff’s rule states that a biological process approximately doubles in rate for each 10°C rise within the physiological range.
2) No evidence is presented of LOCAL summer temperature increases. Is there any evidence of a 30°C increase?
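Don Keiller’s Van’t Hoff arithmetic is easy to check. A small sketch (Q10 = 2 is the textbook rule-of-thumb value; the 8-fold growth figure below is used as a loose stand-in for his “8-sigma increases”):

```python
import math

# Q10 rule of thumb (Van't Hoff): a biological rate multiplies by Q10
# for every 10 C rise; Q10 = 2 is the textbook doubling value.
Q10 = 2.0

def rate_ratio(delta_t_c):
    """Growth-rate multiplier implied by a temperature rise of delta_t_c."""
    return Q10 ** (delta_t_c / 10.0)

def delta_t_for_ratio(ratio):
    """Temperature rise (C) needed to multiply the growth rate by `ratio`."""
    return 10.0 * math.log(ratio) / math.log(Q10)

# A plausible ~1 C regional warming buys only about a 7% growth increase...
print(round(rate_ratio(1.0), 3))          # 1.072
# ...while an 8-fold growth increase would demand ~30 C of warming.
print(round(delta_t_for_ratio(8.0), 1))   # 30.0
```

That is where the “any evidence of a 30C increase?” question comes from: temperature alone, under this rule of thumb, cannot plausibly produce growth increases of the magnitude in question.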

I guess my question is: with only a small sample, and it being used over and over again, why didn’t this study or any of the others use the readily available data that you used?

I like to know what my weaknesses are before I go out on a limb. It appears these people didn’t want to know (an old saying: “if you can’t stand the answer, don’t ask the question”), or they did know (they didn’t let the raw data out for 10 years) and decided to hide it.

Lorax, as an editorial policy, I don’t encourage people to debate the “big picture” in one paragraph bites. I understand that this is what people are interested in, but the trouble editorially is that every thread quickly becomes the same and most of these one-paragraph bites have been said on many occasions and re-iteration of the same points doesn’t interest me a whole lot and discourages discussion of technical things, which is what interests me and many regular readers.

The “Unthreaded” threads tend to get more hits than the technical threads, so don’t feel that your points will get lost there. But it keeps the technical threads cleaner.

Also, I like to discuss things in the context of a specific article or text rather than a generic “big picture” discussion. If there’s such an article that you think is worth looking at, please chip in on it.

Steve, I disagree. The “big” picture is exactly what is critical here. But this site is your baby, so I guess you get to call the shots. I would still appreciate a candid response from you regarding my views/concerns in the unthreaded section. Now would someone kindly direct this “novice” to the unthreaded section? Thanks.

To return to the issue of Briffa’s reply: the suspicion of readers about sorting of data is far from unjustified. This quote is from the Osborn and Briffa 2006 SI:

We removed any series that was not positively correlated with its “local” temperature observations [taken from the nearest grid box of the HadCRUT2 temperature data set (S9)]. The series used by (S3) were already screened for positive correlations against their local annual temperatures, at the decadal time scale (Table S1). We removed series from (S1) that did not correlate positively with their local annual or summer temperatures (Table S1), or which did not extend into the period with instrumental temperature to allow a correlation to be calculated. The series from south-west Canada (named Athabasca) used by (S1) did not correlate positively with local temperature observations, but has been replaced by a new, better-replicated series (S10) that does correlate very highly with summer temperature (Table S1) and has also been RCS-processed to retain all time scales of variability

It’s not as if he should be offended by people’s suspicions that he may have actually ‘sorted’ the data – it’s almost standard practice in paleoclimatology. I realize Steve suggested sorting may have been done previously by the original Yamal authors. IMO it’s still likely that the data were sorted before use. I don’t think I’ve put it any more strongly than that in any of my writings.
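To see why the screening described in that SI worries people, here is a minimal simulation (all parameters invented, purely illustrative): feed it 200 series of pure noise, screen on positive correlation with a modern “temperature” rise, and the composite of the survivors acquires a blade out of nothing.

```python
import random

random.seed(1)

YEARS = 200
# Instrumental "temperature": flat until year 150, then a steady rise.
temp = [0.02 * max(0, t - 150) for t in range(YEARS)]

def corr(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# 200 candidate "chronologies" that are pure noise -- no climate signal at all.
proxies = [[random.gauss(0, 1) for _ in range(YEARS)] for _ in range(200)]

# Screen as the SI describes: keep only series positively correlated with
# local temperature over the instrumental period.
kept = [p for p in proxies if corr(p[150:], temp[150:]) > 0.1]

# Composite of the survivors: flat noise before year 150, then a
# manufactured uptick in the screened period.
composite = [sum(p[t] for p in kept) / len(kept) for t in range(YEARS)]
blade_start = sum(composite[150:175]) / 25
blade_end = sum(composite[175:]) / 25
print(len(kept), round(blade_end - blade_start, 2))
```

Because the selection statistic is computed on the same window that becomes the calibration blade, the composite rises there by construction, while the pre-instrumental “handle” stays flat noise.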


I find the Briffa/Osborn comments (in a 2006 paper, no less) that Jeff Id has presented here to be evidence that smart/informed/morally motivated climate scientists can have little or no concept of sampling criteria that will pass statistical muster. Anyone care to defend what these scientists have said here, or put it in a different light?

Jeff, I’m holding my head. I’m falling out of my chair here. I simply can’t believe what Briffa is saying. That procedure in itself guarantees that you are going to get modern temperatures that are higher than medieval temperatures: all of the trees that don’t show 20th century warming are thrown out. And it is obvious that many trees don’t show the warming of certain periods. This means that you get 100% hits for trees that reflect 20th century warming and some much smaller percentage of hits for trees that reflect MWP warming. Trees have growth spurts for many different reasons: shade from other trees; roots reaching nutrients at different times; hillside water flow changing to help or hurt the tree; and so on. The chances that all the optimum conditions that supported growth in the twentieth century were also there in the MWP are practically nil for all the trees that were chosen. It seems to me that this builds in a huge bias toward showing more 20th century growth. Do you have a link where I can get the quote that you gave? I have to put that in my archive.

Bender, I don’t want my request of Jeff for the link to be interpreted as thinking that he made it up. I’m sure that he is telling the truth. It’s just that I find it so incredible and so unbelievable that I want a copy right here on my computer.

It seems to me that using this technique if you have 20th century warming that is even in the ballpark of MWP warming, then you will end up with a data set that shows more warming than the MWP.

The quote itself is quite standard in paleoclimatology – think about what that means, and RC defends it! I wasn’t able to put the link here; after several tries I asked Steve and he had me post it without the link.

Re: Tilo Reber (#123), Tilo, you’re exactly right. Their base assumption is that a tree that responded correctly to modern local temperature also responded correctly to prior local temperature over virtually the entire lifetime of the tree. I.e., their assumption is that the core ring widths and densities reflect identical, physically deterministic causality across centuries. This is an impossible assumption, and so the effect underlying your reasoning dominates, and the MWP is necessarily under-represented in so-called paleo-temperature reconstructions.

Steve: I do not regard this particular issue as one that is particularly involved with Yamal. I don’t want to debate tree rings in general terms, as there are separable Yamal issues – issues where a committed dendro might agree.

Not every paper has the same methods. In fact I’m not sure I’ve read any with exactly the same methods but the selection after the fact is one of the big problems.

Steve: Jeff, you’re used to the multi-proxy papers and are assimilating these techniques to the making of dendro chronologies. Those are more about crossdating, and there are reasonable systems for crossdating. The great attraction of dendro is precisely the ability to accurately date things.

“there’s a difference between making a site chronology and selecting the site chronologies into a multiproxy analysis.”

Okay, Steve, I believe I understand the difference. Individual picking of cores is not done. But the selection of site chronologies based upon their correlation with the surface temperature records would seem to yield exactly the same result. You still have site chronologies where 100% of the chronologies represent 20th century warming. But the odds of 100% of the chronologies reflecting MWP warming would seem to be extremely low. And this would result in a bias. With so many of these that they seem to be throwing away, it would yield a very heavy bias.

Steve: Yes, but this bias is not relevant to the Yamal situation. There are enough specific issues floating around on Yamal – let’s stick to that for now. Ex post picking will still be an issue. In general, individual picking of cores is not done. However, it was done in Yamal where it appears that the Russians selected a subset of long cores to represent the 20th century and that’s what Briffa used. Rosanne D’Arrigo or Andy Bunn or Rob Wilson would use the whole population.

This is the paper that provided the (unused) Yamal data.
Look at page 720. It shows how tree lines have moved SOUTH over the last 700 years. Tree lines reflect minimum growth temperatures.
It has been getting progressively COLDER.

Re: Don Keiller (#114),
thanks for the link Don
may be o/t but i think your point on tree lines moving needs more discussion (mentioned this on another thread but lost your link).

from that paper –

“The more northerly tree-line suggests that the most favourable conditions during the last two millennia apparently occurred at around ad 500 and during the period 1200–1300. It is interesting to note that the current position of the tree-line in Yamal is south of the position it has attained during most of the last three and a half millennia, and it may well be that it has not yet shifted fully in response to the warming of the last century.”

nobody seems to have a comment on this. why? what am i missing :-(

realise this thread is on tree-rings & Briffa papers, but to a layman like me if you use tree-ring data/dendro calcs in a study you also look at forest tree-line movements first & then look at the tree core data second to confirm/make robust your conclusions.

David 8, I’ve scrolled through that link at RC and Steve is being scorned and made to look the fool. Why? I haven’t seen anything said by Steve to justify those reactions. What am I missing here? Has Steve touched a sore nerve or something?

Could it be that some of these climate scientists are simply too innocently disposed to understand that there might be an error in their methods, and thus take criticism personally and with easily hurt feelings? Could that disposition have something to do with their seemingly evasive replies to criticism and questions about their work, i.e. do they not fully comprehend the criticism, given their good intentions? Would well-intended people who feel their work bears directly on avoiding some terrible, or at least potentially terrible, consequences perhaps react as they do? I pose these questions very seriously in an attempt to make sense of their comments.

[Response: Forest ecosystems respond to climate forcing on multi-generational timescales. Evidence from fossil pollen, tree lines, etc. can thus in general only be used to infer climate change on multi-century timescales. They cannot be used to infer decadal timescale changes such as the anomalous warming of the past few decades. – mike]

The above is from comment #287 @ RC
Does it really say what I think it does?

Esper et al 2003:
“However as we mentioned earlier on the subject of biological growth populations, this does not mean that one could not improve a chronology by reducing the number of series used if the purpose of removing samples is to enhance a desired signal. The ability to pick and choose which samples to use is an advantage unique to dendroclimatology.”

Mann 08

Reconstructions were performed based on both the “full” proxy data network and on a “screened” network (Table S1) consisting of only those proxies that pass a screening process for a local surface-temperature signal.

Re: Jeff Id (#127),
Sure, sure: Esper. Seen it a hundred times. It’s the stuff from the Briffa SI that’s puzzling. Who is this Osborn? Who actually wrote those words? Surely not Briffa?! He is a senior leader in the field. Esper’s a nobody.
Steve: Tim Osborn is Briffa’s right hand man. Don’t minimize Esper. He’s very prominent, younger than Briffa.

Maybe the dendroclimatologists really ARE out of control? Correlations of 0.3-0.5 I understand. But choosing which chronologies to use in calibration and reconstruction on the basis of post-hoc correlation analysis? That is going to juice those correlations substantially – possibly doubling them. That will definitely cause a hockey stick, because you’re getting double the correlation in the calibration blade versus the reconstruction stick handle. Surely to god they understand this? Rob Wilson, you had better get in here and talk to us. You CANNOT stay silent on this one. Because even I’m flabbergasted this time.
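bender’s “juicing” point is easy to demonstrate with a sketch (invented parameters, pure-noise series): pick chronologies by post-hoc correlation and you guarantee a healthy calibration correlation even when the true correlation is exactly zero.

```python
import random

random.seed(2)

N_YEARS, N_SERIES = 50, 500
# A rising instrumental "temperature" series for the calibration period.
temps = [0.03 * t + random.gauss(0, 0.5) for t in range(N_YEARS)]

def corr(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Candidate "chronologies" that are pure noise: true correlation is zero.
series = [[random.gauss(0, 1) for _ in range(N_YEARS)] for _ in range(N_SERIES)]
rs = [corr(s, temps) for s in series]

mean_all = sum(rs) / len(rs)             # ~0, as it should be
picked = sorted(rs, reverse=True)[:50]   # keep the "best" 10% post hoc
mean_picked = sum(picked) / len(picked)  # comfortably positive -- juiced

print(round(mean_all, 2), round(mean_picked, 2))
```

The inflated correlation exists only in the calibration window where the picking was done, which is exactly the blade-versus-handle asymmetry bender describes.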

bender… we’ve discussed correlation picking on many occasions. This was something that Ross and I mentioned in our PNAS comment on Mann (citing David Stockwell’s AIG on the point), but it’s a viewpoint that Jeff Id, Lubos and I have also arrived at, each somewhat independently.

The bias of ex post picking is completely lost on the dendros and the multis.

Sorry for spamming the thread but in Mann08 1357 series were hand sorted down to 1209 series which were then sorted to 484 by correlation to temperature. It seems that from reading the original Yamal paper there may have been sorting before the data ever got to Briffa.

Steve: different issues entirely. Hantemirov and Shiyatov say they took a subset of Yamal cores, but that has nothing to do with Mann 2008.

There are vague comments in the original Yamal paper about sensitivity sorting based on variance. They seem to be directed to the fossilized trees. I haven’t located an SI for the paper yet but that doesn’t mean they were not sorted.

I’m this dogged about it because of the shape of the curve. We know how the shape can be created. I believe something happened, I don’t know what but something did.

In one approach to constructing a mean chronology, 224 individual series of subfossil larches were selected. These were the longest and most sensitive series, where sensitivity is measured by the magnitude of interannual variability. These data were supplemented by the addition of 17 ring-width series, from 200–400 year old living larches.

The “magnitude of interannual variability” has a technical meaning in Fritts 1976 and would have an operational meaning to a dendro. It measures year-to-year variability, a bit like a standard deviation.
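For readers without Fritts 1976 to hand, one common formulation of mean sensitivity is the average absolute year-to-year difference scaled by the mean of each adjacent pair of rings (treat the exact scaling as an assumption here; the ring widths below are invented):

```python
def mean_sensitivity(widths):
    """One common formulation of Fritts-style mean sensitivity: the average
    absolute year-to-year difference scaled by the mean of each adjacent pair."""
    pairs = zip(widths[:-1], widths[1:])
    terms = [abs(2.0 * (b - a) / (a + b)) for a, b in pairs]
    return sum(terms) / len(terms)

# Invented ring widths: a complacent (flat) series vs a sensitive one.
complacent = [1.0, 1.02, 0.99, 1.01, 1.0, 0.98]
sensitive = [1.0, 0.5, 1.4, 0.6, 1.3, 0.7]

print(round(mean_sensitivity(complacent), 3))  # 0.02
print(round(mean_sensitivity(sensitive), 3))   # 0.75
```

Selecting the series with the highest values of this statistic is exactly the “most sensitive” criterion quoted above: it rewards strong year-to-year wiggles, not any particular long-term trend.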

In a small sample like Briffa’s turned out to be, a few 8-sigma outliers can really affect the average. This is what happened with strip-bark bristlecones at the chronology stage. I don’t think that it has anything to do with CO2 fertilization, as mentioned before. Throw an 8-sigma series into a pool of 10 and you can really affect the mean.

H and S say that they picked 200-400 year old trees – very old trees in this context. Some of these trees seem to have had very delayed growth spurts. In a large enough sample, such things would presumably balance out. But the small size of the Briffa sample causes the problem.
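The arithmetic of the pool-of-ten point is worth making explicit. A toy sketch (invented numbers, not the Yamal data): one series running 8 sigma high in a pool of ten moves the chronology mean by roughly 8/10 of a sigma.

```python
import random
import statistics

random.seed(3)

SIGMA = 0.1
# Nine ordinary ring-width index series around 1.0, plus one series that
# runs 8 sigma high over its final 30 years (all numbers invented).
cores = [[random.gauss(1.0, SIGMA) for _ in range(100)] for _ in range(9)]
hot = [random.gauss(1.0, SIGMA) for _ in range(70)] + \
      [random.gauss(1.0 + 8 * SIGMA, SIGMA) for _ in range(30)]
cores.append(hot)

# Chronology = simple mean across the ten cores in each year.
chronology = [statistics.mean(core[t] for core in cores) for t in range(100)]
early = statistics.mean(chronology[:70])
late = statistics.mean(chronology[70:])

# One 8-sigma series in a pool of ten shifts the late mean by ~0.8 sigma.
print(round((late - early) / SIGMA, 2))
```

With hundreds of cores the same outlier would be diluted into irrelevance; with ten it dominates the modern end of the chronology.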

I haven’t run your Yamal code without the most interesting tree in the world – damn that was funny – but I will.

The quote that really bugged me in Yamal was this

In one approach to constructing a mean chronology, 224 individual series of subfossil larches were selected. These were the longest and most sensitive series, where sensitivity is measured by the magnitude of interannual variability. These data were supplemented by the addition of 17 ring-width series, from 200–400 year old living larches.

I took “variability” to mean variance, but they collected over 2000 cores in the study and only 17 live ones?! It’s possible, but visually all 17 have upslopes. Also, why wouldn’t they sort the live trees for sensitivity if they sorted the older trees? If they selected the live trees for maximum interannual variability, that might create the same problems.

There is a possibility they just got lucky this time and chose a very small subset of trees which worked.

Jeff, the comment about selecting the “most sensitive” trees is highly unusual in a dendro study. It doesn’t say this was done for live trees, but I agree with you – it’s not an unreasonable assumption that it was. For that matter, if they did it differently for the live trees, that would be a problem as well.

But don’t extrapolate from this to dendro in general, where they wouldn’t do this sort of selection. This is why no dendros have stood up so far.

Re: Steve McIntyre (#149),
The reason you want the most variable (i.e. least “complacent”) series for the 224 dead/fossilized pieces is to minimize your chance of error in cross-dating. (Imagine trying to crossdate flat lines!) It doesn’t have anything to do with mining for climate signal. I think this may be standard practice in the archaeological branch of the field – leave the flatliners in the “dunno” box. If this is the case, then you would not need to do it for the living samples, whose dating is error-free, being anchored in the present.
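bender’s wiggle-matching point can be caricatured in a few lines (invented widths; real crossdating uses much longer series and formal checks, e.g. software such as COFECHA): slide the floating segment along a sensitive master and the correlation peak dates it. A complacent, flat-lining master would make every lag look alike, which is exactly why flatliners go in the “dunno” box.

```python
def corr(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5 if sxx and syy else 0.0

# A sensitive (strongly wiggling) master chronology, widths invented...
master = [1.0, 1.3, 0.7, 1.1, 0.6, 1.4, 0.9, 1.2, 0.5, 1.3, 0.8, 1.1]
# ...and a floating segment cut from positions 4..9, with measurement noise.
segment = [0.62, 1.38, 0.93, 1.18, 0.52, 1.27]

# Slide the segment along the master; the correlation peak dates it.
scores = [(corr(segment, master[lag:lag + len(segment)]), lag)
          for lag in range(len(master) - len(segment) + 1)]
best_r, best_lag = max(scores)
print(best_lag, round(best_r, 3))   # recovers the true offset of 4
```

Note the match is driven by the wiggles, not by any trend, which is the crux of bender’s argument that crossdating selection is not the same thing as mining for a 20th century trend.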

Re: bender (#155),
Has there been discussion anywhere of the translation of dendrochronology principles to dendroclimatology?

It just hit me that at least some of the Principles of Dendrochronology have been assumed as valid principles for dendroclimatology. Seems like a big assumption. Don’t these need to be disaggregated?

The Uniformitarian Principle
The Principle of Limiting Factors
The Principle of Aggregate Tree Growth
The Principle of Ecological Amplitude
The Principle of Site Selection
The Principle of Crossdating
The Principle of Replication

Re: MrPete (#156),
I wouldn’t state it that way, but anyone who is working in dendroclimatology knows (!) and makes use of all of the principles of dendrochronology. I don’t think there needs to be an explicit statement of assumptions. It’s implicitly understood by anyone in the field that if you’re doing dendroclimatology you are doing dendrochronology plus some climatic response function analysis, calibration & reconstruction.

Dendrochronology needs to use a Site Selection principle that maximizes environmental “signal” because complacent samples are impossible to cross-date. That’s perfectly acceptable when the goal is a chronology of tree life/etc.

Now we want to perform dendroclimatology. The same principle seems to say that “boring” climate without significant change will a priori be ignored due to the need for non-complacent samples.

Isn’t there at least the possibility of a built-in bias because of this principle inherent to dendrochronology?

Re: MrPete (#165),
I see your point more clearly now. It’s a good question. Accurate crossdating of wood pieces is (supposedly) ensured not by a match in trend, but by a match in “wiggles”. If you were to bias your dendroclimatological study by using only dendrochronologically wiggle-matched wood pieces I don’t see that this is going to produce the same bias as mining for a trend-shaped 20th c. climatic signal. I think a far bigger concern would be the accuracy in capturing centennial scale variability when no one sample is longer than a century.
.
However I do not think that many dendroclimatological studies rely on fossilized wood pieces. I think they are far more reliant on absolutely (i.e. correctly) dated living trees. Someone could check me on that.
.
If the dendroclimatological study were to rely on dendrochronologically crossdated wood samples that were dominated by low-frequency variability, I think you’d not only have a potential climate reconstruction bias problem, you’d have a far more fundamental problem of accuracy in the master chronology.

Re: MrPete (#165), I agree completely that selecting max variance (for cross-dating) may mask correlations to single climate variables such as temperature. One might rather prefer the most boring trees in the forest, unless the study sample size was very, very large in order to glean a generalized signal from the individual noise. It is like selecting, from photos of the sun, film negatives with the highest contrast, then using those to determine whether the sun is brighter or more dim over time.

Isn’t there at least the possibility of a built-in bias because of this principle inherent to dendrochronology?

The scenario you describe would introduce bias, yes. But it would be opposite to the one I *think* MrPete had in mind.
.
But note that for chronologies built from living trees this is a moot point, as there are no floating chronologies to be anchored by crossdating. You take the trees you get, regardless how variable or complacent they might be.

Re: MrPete (#165), MrPete I think you are exactly right: the selection criteria for being able to cross-date trees (wide variance in response) does not necessarily lead to an unbiased sample for detecting climate accurately.

At treeline on Almagre, there are both old strip-bark trees and younger whole-bark trees. As we know, Graybill only sampled the strip-bark trees. There are plenty of living whole and stripbark trees.

Even living trees have core cross-dating challenges. Partly that’s because you often get multiple sub-samples from a single core extraction due to defects in the wood. But it’s also because complacent trees don’t make it easy to align samples from one tree to the next.

My hypothesis:
a) Dendro’s have been in the habit of selecting non-complacent trees because of the cross-dating challenge. If nothing else, cross-dating software doesn’t work without good “signals”, which means the scientist (or their lowly grad students :)) must painstakingly do a manual cross-date.

Re: bender (#190),
I’m not sure I understand the disagreement. You seem to be quietly assuming that my question about biased samples is about bias-for-HSness?

You said

Such masking would reduce the calibration coefficient, not boost it. This would diminish the blade of the HS. Therefore you disagree with MrPete and you agree with me.

As you know, I’m not really a stats person at all. I do think I understand signals and noise a little (EE and imaging background.) Feel free to tell me to go away and get an education in X, Y or Z if that’s what I need :)

My innate sense is that if the “signal” boosted by selection is truly noise then any signal will be reduced, as I think you are suggesting above. Same thing (more or less) as how we have to be careful when using amplification-and-truncation to remove noise from AV data.

However, if there’s ANY non-random signal, from ANY source, that rises above the noise, won’t selection for “boosted” signals tend to amplify that signal, no matter what it is?

And if the actual desired signal is quiet, won’t this process tend to just hide it?
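MrPete's amplification worry can be illustrated the same way. In this sketch (purely synthetic data) none of the trees contains any climate signal at all, only a weak common non-climatic trend plus noise; selecting the trees whose recent growth "looks boosted" inflates the composite's trend well beyond its true value:

```python
import numpy as np

rng = np.random.default_rng(1)
n_trees, years = 500, 100
t = np.arange(years)
rings = rng.normal(0, 1, (n_trees, years)) + np.linspace(0, 0.5, years)
# ^ no climate signal: pure noise plus a weak common non-climatic trend

# "select for boosted signal": keep the trees whose last 30 years run
# highest above their own first 70 years
recent_gain = rings[:, -30:].mean(axis=1) - rings[:, :70].mean(axis=1)
picked = recent_gain > np.quantile(recent_gain, 0.9)   # top 10%

full_slope = np.polyfit(t, rings.mean(axis=0), 1)[0]
picked_slope = np.polyfit(t, rings[picked].mean(axis=0), 1)[0]
print(full_slope, picked_slope)   # selection inflates the trend
```

Whatever non-random component rises above the noise, selecting on it amplifies it, regardless of whether it is the desired signal.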

in Cook’s implementation of “conventional” standardization, he uses a “biweight” mean. Briffa has never published his RCS software or for that matter provided a sufficiently detailed technical description of his methods to know for sure. I spent a considerable amount of time looking at Briffa’s methods in 2004 (while Ross and I were being jerked around by Nature). This was pre-blog; today I’d memorialize such thoughts online in a technical post that most people wouldn’t read, but I find handy as a reference.
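Cook's exact implementation isn't reproduced here, but the standard Tukey biweight mean, which is what "biweight" conventionally refers to in the dendro literature, can be sketched as follows; the tuning constant c = 9 is the usual choice, and the data vector is invented for illustration:

```python
import numpy as np

def biweight_mean(x, c=9.0, tol=1e-6, max_iter=50):
    """Tukey's biweight (bisquare) robust mean. Outliers beyond c MADs
    from the current estimate get zero weight; c=9 is conventional."""
    x = np.asarray(x, dtype=float)
    m = np.median(x)
    for _ in range(max_iter):
        s = np.median(np.abs(x - m))        # median absolute deviation
        if s == 0:
            return m
        u = (x - m) / (c * s)
        w = np.where(np.abs(u) < 1, (1 - u**2) ** 2, 0.0)
        m_new = np.sum(w * x) / np.sum(w)
        if abs(m_new - m) < tol:
            return m_new
        m = m_new
    return m

vals = np.array([1.0, 1.1, 0.9, 1.05, 0.95, 8.0])  # one wild core
print(np.mean(vals), biweight_mean(vals))
```

The arithmetic mean is dragged toward the wild core; the biweight mean effectively ignores it, which is the point of using it to average cores into a chronology.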

Re: bender (#159),
Strange. From my perspective (admittedly knowing next to nothing about tree growth, not to talk about specific species in Siberia), this seems to be almost a text-book-case for median (robust statistics): there are few “outliers” (trees with unusual response/growing conditions, etc) that contaminate the sample.

Re: bender (#174),
no bender, it’s never “dead easy” to modify code written by someone else, especially if you want to be sure it is doing what you want it to do (besides, I’m rather an R novice). Anyhow, here’s my modification:

So I replaced the mean with the median in the index calculation, and additionally replaced the nonlinear regression with a robust equivalent. I tried to change the code as little as possible in order not to mess up anything. I think the exponential model might not be the best way to do the thing (I can think of better alternatives), but I left it like that since that’s the way RCS supposedly does it. Any dendros wanting to know the alternatives, please contact me: my fees start with … ;)
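For readers wondering what the mean-to-median substitution amounts to: below is a rough sketch of an RCS-style calculation on synthetic cores, with the regional curve fitted as a log-linear negative exponential (a simplification; the commenter used a robust nonlinear fit, and Briffa's actual code is unpublished) and the per-year index taken as a median rather than a mean:

```python
import numpy as np

# Synthetic cores: rows of (tree, calendar_year, cambial_age, ring_width)
rng = np.random.default_rng(2)
rows = []
for tree in range(30):
    span = rng.integers(60, 200)
    start = rng.integers(0, 300)
    widths = 2.0 * np.exp(-0.01 * np.arange(span)) * rng.lognormal(0, 0.3, span)
    for age, w in enumerate(widths):
        rows.append((tree, start + age, age, w))
rows = np.array(rows)

ages, widths = rows[:, 2], rows[:, 3]

# Regional curve: one negative-exponential fit to width vs cambial age,
# pooled across ALL trees (done here as log-linear least squares)
b, log_a = np.polyfit(ages, np.log(widths), 1)
curve = np.exp(log_a + b * ages)

indices = widths / curve     # each ring's departure from the regional curve

# Chronology: the MEDIAN (rather than mean) of the indices in each year
chron = {int(y): np.median(indices[rows[:, 1] == y])
         for y in np.unique(rows[:, 1])}
```

One curve for all trees is the "one-size-fits-all" aspect of RCS that preserves centennial variability, and the median makes the per-year average robust to a few wild cores.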

I’m sorry about going off topic Steve, but for me this solves the mystery. I don’t understand how any comparison between the MWP and the 20th century can be meaningful under such conditions.

I can, however, tie this to Briffa’s comments from your link above where he says:

“Chronologies are constructed independently and are subsequently compared with climate data to measure the association and quantify the reliability of using the tree-ring data as a proxy for temperature variations.”
Steve: Again this is conflating two different articles and two different issues. Let’s stick to issues that are unique to Yamal.

In corridor standardization, do they do some kind of centering of each core series? I haven’t done much searching on CA for it yet, so sorry if it’s a bad question - the blog is kinda big, but like always, don’t worry, I’ll work it out in time. There are some uneven effects created by applying the same standardization to different tree types and, probably more importantly, to trees growing in different conditions.

Jeff, to my recollection, I haven’t specifically discussed corridor standardization here before. It’s a Russian method which was discussed in some dendro workshops in the 1990s. Some comparisons were made to “conventional” (tree-by-tree) standardization (not one-size-fits-all RCS standardization) and they were found to be pretty similar. “conventional” standardization as applied to “short”-lived trees ( e.g. under 250 years) will not recover centennial variability; the graphs have a distinctive look that is shared by the H and S chronology at NCDC.
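The "distinctive look" described here is the segment-length curse: detrending each short-lived tree individually removes any variability on timescales longer than the segment. A synthetic illustration (invented series, not the H and S data) comparing a per-tree linear detrend against a simple average with no per-tree detrending:

```python
import numpy as np

rng = np.random.default_rng(3)
years = np.arange(1000)
signal = np.sin(2 * np.pi * years / 400)   # centennial-scale "climate"

# 80 trees, each living 150 years, recording signal + noise
series = []
for _ in range(80):
    start = rng.integers(0, 850)
    t = np.arange(start, start + 150)
    series.append((t, signal[t] + rng.normal(0, 0.3, 150)))

def chronology(detrend):
    sums, counts = np.zeros(1000), np.zeros(1000)
    for t, y in series:
        if detrend:   # "conventional": remove each tree's own linear trend
            y = y - np.polyval(np.polyfit(t, y, 1), t)
        sums[t] += y
        counts[t] += 1
    out = np.full(1000, np.nan)
    ok = counts > 0
    out[ok] = sums[ok] / counts[ok]
    return out

def corr(a):
    ok = ~np.isnan(a)
    return np.corrcoef(a[ok], signal[ok])[0, 1]

conv, raw = chronology(True), chronology(False)
print(corr(conv), corr(raw))   # per-tree detrending loses the 400-yr cycle
```

The undetrended composite tracks the 400-year cycle closely, while the tree-by-tree detrended chronology has had most of that low-frequency variance removed, which is why "conventional" standardization of ~150-250-year trees cannot recover centennial variability.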

There’s a useful article in the NATO 1996 conference (ed Jones, Bradley, Jouzel) that I’ll try to scan some time.

Steve: per, it’s nice to see another citation of Knowles et al (2007). For other readers, Knowles et al (2007) was previously cited in http://www.climateaudit.org/?p=3752. The citation of Aragon and Arwen (2009) is new. per left out a highly relevant reference (especially in an M&M context) on the same topic: Mathers (2007).

I have a favour to ask: could someone who has the wherewithal download the Yamal data, construct an average radius-for-age series, and compare the YAD five to it?

I strongly suspect that the four eldest of the YAD five had stunted development until the start of the 20th century, followed by growth spurts. If so, this may not be a typical group of five trees.

I have compared YAD061 to the average of the 33 most recent trees in the Yamal series and it appears to be only about 55% of the average for its age at the start of the 20th Century but is above average in 1996.

In comparison, the youngest of the five started off like a rocket (last 1/4 of the 19th century) and is now the tallest.
Alex
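Alex's requested comparison could be sketched roughly as below: build a mean width-for-age curve from a reference set of trees and take the ratio of one tree's widths to it. The data here are invented to mimic the described pattern (stunted early, surging late), not the actual YAD cores:

```python
import numpy as np

def age_curve(trees, max_age):
    """Mean ring width at each cambial age across a set of trees.
    `trees` is a list of 1-D arrays of ring widths, index = cambial age."""
    sums, counts = np.zeros(max_age), np.zeros(max_age)
    for w in trees:
        n = min(len(w), max_age)
        sums[:n] += w[:n]
        counts[:n] += 1
    return sums / np.maximum(counts, 1)

# Hypothetical illustration: 33 ordinary trees vs. one late-surging tree
rng = np.random.default_rng(4)
ordinary = [2.0 * np.exp(-0.01 * np.arange(200)) * rng.lognormal(0, 0.2, 200)
            for _ in range(33)]
surger = 2.0 * np.exp(-0.01 * np.arange(200)) * np.linspace(0.5, 2.0, 200)

curve = age_curve(ordinary, 200)
ratio = surger / curve          # <1 early (stunted), >1 late (spurt)
print(ratio[:10].mean(), ratio[-10:].mean())
```

A ratio well below 1 for a tree's early decades and well above 1 recently is exactly the "55% of average, then above average" pattern described for YAD061.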

I read a paper last year (sorry lost the ref for now) which showed that very long-lived trees did NOT have the same early growth as trees of the same species that died earlier. I seem to remember they had more conservative growth when younger. But in any case, the pattern was different. In my research on tree lifespans I found that between species there was an inverse relation between lifespan and growth rate in general (longer-lived species grew slower) and there is every reason to believe this applies to within species genetic differences as well. Thus a random tree is not equivalent to all other trees: they are not electrons.

Re: Craig Loehle (#163), Growth rate versus longevity is one of those themes that cuts across so many domains in biology. Annual weeds (rapid growth to early maturity + prolific seeding) vs. perennials (slow growth to late maturity + periodic masting) represent extremes in strategies across taxa. Even within species the ends of the spectrum can be found. This is precisely why I question the classical assumption of age-related reductions in radial growth. It’s not like the cambium gets old and tired. Inject some IAA and the stuff rejuvenates to youthful vigor. Such suppleness should be bothersome to the uptick miners.

The results of this study confirm that the climate signal is maximized in older trees, but also that a sampling procedure nonstratified by age (especially in multi-aged forests) could lead to biased mean chronologies due to the higher amount of noise present in younger trees.

The authors looked at Larix decidua (not gmelinii) in the Alps (not the Yamal Peninsula) and warned that their findings “should not be applied to other regions or species”. But …

If this paper is off topic here I can understand and will take the discussion elsewhere, but my main takeaway from Steve M’s sensitivity testing of Briffa is that it calls into question the selection criteria of those doing reconstructions in a most general and far-reaching sense. I suppose that we can conjecture that the authors’ comments apply only to them, and only to sites and not cores. On the other hand, we have a peer review process that should winnow out any methodology that is obviously not acceptable, at any level of application, to the reviewers. Beyond that, we have processes that allow replies to papers that contain obviously unacceptable methods. It is rather clear that none of these objections has been made in the peer review process, which would, in my judgment, indicate a passive acceptance of the selection process as Briffa and Osborn describe it in the 2006 SI to their paper.

In addition, the link above shows on page 6 a cross correlation of the sites used by Briffa and Osborn, something in the manner suggested by Bender for the Kaufman reconstruction. Observe the very poor correlations between most of the site pairs, reminiscent of the Kaufman reconstruction. The authors use the lack of correlation as evidence of site independence and then turn around and note that the average correlation of 0.15 (which translates to an explanatory R^2 of about 0.02) shows that a signal common to all sites exists. I cannot tell from the SI, but I assume the paper uses some kind of composite of all the sites, as was used in K09, and that that rendition assumes the sites are all responding to the same signal.

If we have very large localized differences, as indicated in this Briffa study and that of K09, would that not imply either that the climates are very local, which would in turn require very many sites to obtain a reasonable estimate of an average condition, or that we are merely looking at a mish-mash of mainly random responses (outside the instrumental period, that is)?

As a near term project I want to compare the Briffa correlations with the same correlations restricted to the instrumental period.
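The arithmetic behind the two readings of r̄ = 0.15 is worth making explicit: per site it explains only ~2% of variance, yet the classic EPS-style formula shows why the mean of many such sites can still be argued to carry a common signal (assuming, crucially, that each series really is common signal plus independent noise):

```python
def composite_signal_fraction(r, n):
    """Fraction of variance in the mean of n equally inter-correlated
    series that is common signal (the EPS formula from dendro practice),
    assuming each series = signal (variance r) + independent noise (1-r)."""
    return n * r / (1 + (n - 1) * r)

r_bar = 0.15
print(r_bar ** 2)                    # per-site explained variance, ~2%
for n in (2, 5, 10, 20):
    print(n, round(composite_signal_fraction(r_bar, n), 2))
```

So both statements can be formally true at once; the question the commenters raise, whether the "common signal" assumption itself holds outside the instrumental period, is not answered by this arithmetic.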

I posted this at RC, and they let it through. Thought I’d post it here, as I still find it amazing that an 18 year old (36 years ago) with all the distractions of the first year of college, would find the Briffa sample too small for a term paper:

Unlike many of the people posting here, I have actually cored trees and studied the rings. I did a simple study of growth rate vs. altitude for an undergraduate Alpine Ecology course. Even in that study, I used a larger sample size than Briffa did for the hockey stick portion of his Yamal series. I collected the samples myself, and yes, it took time, but in the end I felt certain my numbers could be replicated.

I have to say, McIntyre sounds reasonable when he questions the wisdom of attaching significance to such a truncated data set, and I wonder how the work of Briffa obtained such prominence.

If nothing else, the observations of climate audit should serve to point out that serious subjects need first class data collection before analysis and conclusions. If the data set is large, reproducible, and available, there should be fewer opportunities for this sort of second guessing.


Steve has said the response at RC regarding Yamal contains many false allegations. I haven’t read CA as much as many here, but it has been my impression that it’s generally more quick in response than RC. So when is Steve going to state which allegations are false and respond to RC? Or has he done that already and I was just unaware?

Re: TGGP (#169), As demonstrated by the current topic, RC is focused on a PR message (and I would not be surprised if a PR professional had been consulted for that post: imagery, attempts to discredit the messenger vs. the message, intentional repetition of surface issues with no depth, etc., all followed by censorship to keep the comment train “on message”).

CA has a much stronger reputation for transparent, scientific facts and discussion and does not focus on PR, or tolerate ad hominem attack. The blogs distinguish themselves in that way, with CA being more of a science blog.

In CA parlance I have been an interested “lurker” for many months. I searched CA to see if I could find any references to Briffa and Melvin’s “A Closer Look at Regional Curve Standardization of Tree-Ring Records: Justification of the Need, a Warning of Some Pitfalls, and Suggested Improvements in Its Application” (to appear as Chapter 5 of the soon to be published Dendroclimatology: Progress & Prospects Series: Developments in Paleoenvironmental Research). Since I didn’t find one, I note that a draft of the chapter is available at

The chapter seems to address the issues associated with “growth rate vs longevity” and the potential for chronologies to exhibit contemporaneous-growth-rate bias as a result of using trees of different ages; a related excerpt from the Briffa and Melvin chapter illustrates the point.

“In a typical ‘modern’ dendroclimatic sample collection, the earliest measurements will come from
the oldest trees cored, which tend to be slow growing. Faster-growing trees that may have been
contemporaneous with the old trees in the early years will likely not have survived long enough to be included in the modern sample. Similarly, the most recent section of the chronology produced from
these sampled trees would not contain data from young, slow-growing trees because these trees
would not be of sufficient diameter to be considered suitable for coring. Any relatively young trees
sampled would likely have to have been vigorous and growing quickly enough to allow them to
attain a reasonable size in a short time. This leads to a situation where a ‘modern sample’ may
exclude the fastest-growing trees of the earliest period and also exclude the slowest-growing trees of the most recent period. Such a sample of uneven-aged trees will be less susceptible to trend-in-signal bias, but still prone to contemporaneous-growth-rate bias, with smaller indices at the start of the chronology and larger ones at the end, imparting a positive bias on the overall chronology slope.”
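The contemporaneous-growth-rate bias Briffa and Melvin describe can be reproduced in a toy simulation with no climate signal whatsoever: growth rate varies between trees, lifespan is inversely related to growth rate, and a "modern sample" keeps only trees alive at a coring date and above a minimum diameter. All numbers below are invented; the point is only the sign of the resulting slope:

```python
import numpy as np

rng = np.random.default_rng(5)
coring_year, min_diam = 2000, 50.0   # hypothetical sampling rules

trees = []
for _ in range(3000):
    rate = rng.uniform(0.2, 2.0)                 # mm/yr, constant per tree
    lifespan = int(400 / (1 + rate ** 2) + rng.integers(0, 100))
    birth = int(rng.integers(1000, 2000))
    death = birth + lifespan
    years_grown = coring_year - birth
    # "modern sample": alive at coring AND big enough to core
    if birth <= coring_year < death and rate * 2 * years_grown >= min_diam:
        trees.append((birth, years_grown, rate))

# With a flat "regional curve" (no age trend built in), the index is just
# the growth rate, so with no climate signal the chronology should be flat
sums, counts = np.zeros(1001), np.zeros(1001)
for birth, span, rate in trees:
    idx = np.arange(birth - 1000, birth - 1000 + span)
    sums[idx] += rate
    counts[idx] += 1
ok = counts > 5
slope = np.polyfit(np.arange(1001)[ok], sums[ok] / counts[ok], 1)[0]
print(slope)   # positive: slow growers dominate early, fast growers late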

Sorry, I meant to say Is it possible to calculate a confidence level of the 20th century trend shown in the Briffa Yamal series and also a confidence interval? If so, could someone do it and report it on this blog?
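On the confidence-interval question: a crude OLS trend interval is easy to compute, though for tree-ring series the autocorrelation ignored below means the interval would be too narrow in practice. Synthetic data for illustration only:

```python
import numpy as np

def trend_ci(y, z=1.96):
    """OLS slope of y against time, with an approximate 95% confidence
    interval (normal approximation; ignores autocorrelation, which in
    tree-ring series would widen the interval further)."""
    x = np.arange(len(y), dtype=float)
    n = len(y)
    b, a = np.polyfit(x, y, 1)
    resid = y - (a + b * x)
    s2 = resid @ resid / (n - 2)
    se = np.sqrt(s2 / np.sum((x - x.mean()) ** 2))
    return b, (b - z * se, b + z * se)

rng = np.random.default_rng(6)
y = 0.01 * np.arange(100) + rng.normal(0, 0.5, 100)   # invented "chronology"
slope, (lo, hi) = trend_ci(y)
print(slope, lo, hi)
```

For the actual Yamal series one would also have to confront the tiny modern replication, which inflates the residual variance and hence the interval.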

Dendrochronology generally operates under the assumption that climate–growth relationships are age independent, once growth trends and/or disturbance pulses have been accounted for. However, several studies have demonstrated that tree physiology undergoes changes with age. This may cause growth-related climate signals to vary over time. Using chronology statistics and response functions, we tested the consistency of climate–growth responses in tree-ring series from Larix decidua and Pinus cembra trees of four age classes. Tree-ring statistics (mean sensitivity, standard deviation, correlation between trees, and first principal component) did not change significantly with age in P. cembra, whereas in L. decidua they appeared to be correlated with age classes. Response function analysis indicated that climate accounts for a high amount of variance in tree-ring widths in both species. The older the trees are, the higher the variance explained by climate, the significance of the models, and the percentage of trees with significant responses.

Age influence on climate sensitivity is likely to be non-monotonic. In L. decidua, the most important response function variables changed with age according to a twofold pattern: increasing for trees younger than 200 years and decreasing or constant for older trees. A similar pattern was observed in both species for the relationship between tree height and age. It is hypothesized that an endogenous parameter linked to hydraulic status becomes increasingly limiting as trees grow and age, inducing more stressful conditions and a higher climate sensitivity in older individuals.

The results of this study confirm that the climate signal is maximized in older trees, but also that a sampling procedure non-stratified by age (especially in multi-aged forests) could lead to biased mean chronologies due to the higher amount of noise present in younger trees. The issue requires more extensive research as there are important ecological implications both at small and large geographic scales. Predictive modeling of forest dynamics and paleo-climate reconstructions may be less robust if the age effect is not accounted for.

Now you got me thinking… dangerous… I’m not playing with a full deck until monday (52 yrs :) )

I’m led to go back to first principles in finding useful dendroclimatology principles.

A two-minute proposal is given below, probably worth the time I’ve put into it. What quickly became obvious as I went through this exercise was the value of collecting more metadata, and the value of oversampling. Rather than one sample per tree and a few “good” trees in a data series, I would rather see multiple samples from most trees, and samples both from within the selected stand of trees and outside it, where supposedly the sought-for signal does not exist. A further thought has occurred; I’ll share it in a second comment.

Uniformitarianism: physical and biological processes that link current environmental processes with current patterns of tree growth MAY (not must) have been in operation in the past. Sufficient metadata and oversampling is needed to prove that physical and biological processes significant for tree growth have been approximately constant and in-bounds over the life of the tree. For example, a tree stripped by lightning or snowstorms is likely to sustain a significant change in its growth pattern.

Limiting Factors: rates of plant processes are constrained by the primary environmental variable that is most limiting. Sufficient metadata and oversampling is needed to prove that the limiting constraint has been valid for the life of the tree. For example, a major storm can radically change the long-term availability of light, nutrients and water for a stand of trees.

Aggregate Tree Growth: any individual tree-growth series can be “decomposed” into an aggregate of environmental factors, both human and natural, that affected the patterns of tree growth over time. Both the factors, and their relationship, must be known over the life of the tree. Sufficient metadata and oversampling, even of a single tree, is needed to demonstrate the validity of this principle in each case. For example, disease, storm damage and animal migration paths can externally modify a tree’s growth over time.

Ecological Amplitude: a tree species “may grow and reproduce over a certain range of habitats, referred to as its ecological amplitude” (Fritts, 1976). Sufficient historical and present-day metadata and oversampling is needed to demonstrate the historical validity of this principle in each case. For example, competing species may be introduced or removed that modify the ecological amplitude of the species of interest.

Site Selection: sites useful to dendrochronology can be identified and selected based on criteria that will produce tree-ring series sensitive to the environmental variable being examined. Sufficient metadata and oversampling beyond the selected site is necessary to demonstrate the validity of this selection process.

Crossdating: matching patterns in ring widths or other ring characteristics (such as ring density patterns) among several tree-ring series allows the identification of the exact year in which each tree ring was formed. Samples should never be disregarded due to difficulty of cross-dating, as the patterns represent data in both time and amplitude dimensions.

Replication: the environmental signal being investigated can be maximized, and the amount of “noise” minimized, by sampling more than one stem radius per tree, and more than one tree per site. This principle needs to be followed more closely. The value of discovering poor correlations and missing signals is commonly under-appreciated.

Bender, just curious what your profession is. You are quite up-to-date in various areas. I read the blog a lot, but haven’t caught that particular detail. Or, it might be none of my beeswax :o) Not a problem if it is NOMBW.

A further thought: if the dendro principles are valid, should it not be possible to use a different method to create proxies for any given signal, as follows? This is probably just reinventing the wheel, but it was fun to write this out.

How this relates to the current topic: it seems to me that Briffa (and others) may be too easily assuming the validity of their data sets. Is there enough metadata available to perform a meta-analysis of the tree ring data to validate or falsify the assumptions being made of the data itself (let alone the statistical analyses being done?)

Starting point: Dendro principles are used to select sites and collect data in a manner that maximizes any given climate signal. If the principles are valid, then the same principles can be used to select sites that do NOT maximize the same signal.

Basic “ah ha”: this should permit “differential” in addition to “direct” proxy creation.

Direct method: collect data from a Tmax site, and analyze with respect to directly-measured Tmeas.

Differential method: collect data from a Tmax site, and also from Tnorm or Tmin sites, where all other variables are kept constant. Analyze both the differential between Tmax and Tnorm, and the ratio Tmax/Tnorm, against Tmeas.

Essentially, differential analysis should help identify data that actually relates to the desired signal, vs data that is emerging from other factors.

A simple example: presumably treeline BCP’s are the best temp sensors. BCP’s in a nearby exposed-but-warmer location are not as temp-limited. Measure both and compare. Are the Tmax BCP’s really showing us something, or is it just the strip-bark process, etc?

We use differential methods in communication to transmit signals through all kinds of noisy contexts. Why not do the same in climatology?
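MrPete's differential idea in miniature (all series synthetic; "contamination" here is a deterministic non-climatic trend assumed common to both nearby sites): differencing the Tmax-site and Tnorm-site chronologies cancels the shared contamination and recovers the temperature signal better than the direct proxy does:

```python
import numpy as np

rng = np.random.default_rng(7)
years = 150
temp = rng.normal(0, 1, years)
contamination = np.linspace(0, 6, years)   # shared non-climatic trend
# (CO2 fertilization, stand dynamics, ... assumed common to both sites)

tmax_site = 1.0 * temp + contamination + rng.normal(0, 0.5, years)
tnorm_site = 0.2 * temp + contamination + rng.normal(0, 0.5, years)

def r(a, b):
    return np.corrcoef(a, b)[0, 1]

print(r(tmax_site, temp))                 # direct proxy, contaminated
print(r(tmax_site - tnorm_site, temp))    # differential: trend cancels
```

This is exactly the common-mode rejection trick from differential signalling: anything shared by both channels drops out, and only the response that differs between the temperature-limited and non-limited sites survives.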

I apologise in advance if this is off topic. Please move it to Unthreaded if that is more appropriate.

John Mashey made the following “modest proposal” on Realclimate:

1) Other than those restricted for legal reasons, it is clear that all of the following should be archived, with a good browser interface so that any random person could find anything:
– all input data, including all versions, iterations, down to scribbles on lab notebooks, and especially any corrections that have happened on any iteration, so those can be checked.
– all code, including every iteration, to make sure that no funny business has gone on. A full set of RCS / CVS files might be good enough. Every makefile also.
– all versions of any compilers, libraries and operating systems used, because they might make a difference, and might be required to reproduce the exact results. Actually, the source code of these might be needed also, to make sure generated code is correct, although one has to be careful, as there is at least one famous case where a C preprocessor did something magic to itself and hid its tracks.
– actually, since not everyone understands F90 (for example), alternate, proved-equivalent code should be provided in Java, COBOL, and Excel.
– a really extensive set of regression tests, so that people can run numerous cases themselves and evaluate whether or not the code is robust.
– all outputs, including OCR’d versions of any printed output, and movies of any interactive 3D simulations.
– Adequate documentation and tutorials, especially if someone who isn’t a programmer needs to find errors in F90 code, or so anyone can learn the equivalent of a physics PhD as needed. This should also include detailed numerical analysis of all relevant code to make sure
– All emails discussing any of this.
– Ideally, records of discussions on whiteboards (via one of those electronic whiteboards).

And probably, it would be a good idea for NASA, GFDL, NCAR, etc to provide compute clusters of similar size to what they use for general use by auditors.

2) Now, to fund the vast increase in staffing, facilities, and computers, I propose that we first increase the taxes in Ontario (where McIntyre is located), to fund these efforts worldwide (especially in USA and UK, from whom he has demanded data), but anyone else can pitch in, too. *I’m* not interested in my tax money being spent this way, but others may want to spend their own money for such.
So, how much are people willing to pay? I’m sure Gavin & co could use lots more budget. Anonymous calls for someone else to spend their time doing more of this … are not worth the bits on disk to keep them.

3) More seriously:
a) Doing commercial-grade software products is *very* different than doing software for research, and the tradeoffs are very different.

b) Chris Mooney’s book mentions the Data Quality Act and what it was really for (i.e., hold up inconvenient research results, and ideally keep demanding further study and wasting time so that the inconvenient research slows to a halt.)

c) I strongly urge those who are honestly confused or unsure about this to read that book, and the brand-new one Climate Cover-Up about the general tactics.

Although this appears to be intended as a comment as to why archiving can’t possibly be done, some of the items do (at base) provide useful suggestions as to the level of archiving which should be aimed for.

In particular:
– input data, though possibly not quite to the level suggested. This should be extended to cover the selection criteria.
– source code, including the SCCS/RCS/CVS/SVN/git/whatever files
– compilers and libraries can make a difference. Noting versions used should be sufficient, though.
– the regression test suite. This should be extended to include sensitivity tests.

although he has neglected the more important point of the program specification and high-level algorithm which the code is intended to implement.

As for his point 2, funding. Google, through “Google Books”, is attempting to provide an electronic archive of all of the world’s literature. Imagine the benefits which could be obtained if Google were to provide a “Google Data” or “Google Experiment” facility, archiving the data and analysis for every published research paper.
I, for one, would love to see the originals of Kelvin, Cavendish, Maxwell, Einstein, Michelson-Morley, Bohr, Darwin, Mendel, Fleischmann-Pons…

b) Chris Mooney’s book mentions the Data Quality Act and what it was really for (i.e., hold up inconvenient research results, and ideally keep demanding further study and wasting time so that the inconvenient research slows to a halt.)

c) I strongly urge those who are honestly confused or unsure about this to read that book, and the brand-new one Climate Cover-Up about the general tactics.

Given RC’s advocacy position it is not very difficult to understand that they, and friend of the consensus Chris Mooney, can rationalize transparency as holding up the inconvenient research (that is, the research leading to immediate AGW mitigation) and refer to the “tactics” of their opposition as if it were some monolithic force out there. RC may know some science (and present it in a one-sided manner) but surely we do not take seriously what they have to say about these issues.

Re: Shane P (#195),
John Mashey ought to read up on Reproducible Research. It is a standard process used at the University in his neighborhood. Everything necessary to reproduce all calculations and results is packaged, using software tools with which he is entirely familiar.

An integral estimation of the spatial-temporal conjugation of tree-ring growth was carried out based on the tree-ring chronology network of the subarctic zone of Siberia, the Urals and Scandinavia for the last 2000 years. Phase and amplitude disagreements of the annual growth and its decadal fluctuations in different subarctic sectors of Eurasia give way to synchronous fluctuation when century-scale and longer growth cycles are considered. Long-term changes of radial growth indicate the common character of global climatic changes in the subarctic zone of Eurasia. Medieval warming from the 10th to the 12th centuries and 15th-century warming were followed by the Little Ice Age, with the cooling culmination taking place in the 17th century. The current warming, which started at the beginning of the 19th century, for the moment does not exceed the amplitude of the medieval warming. The tree-ring chronologies do not indicate an unusually abrupt temperature rise during the last century that could be reliably associated with greenhouse gas increases in the atmosphere of our planet. The modern period is characterized by heterogeneity of the warming effect in subarctic regions of Eurasia. The integral tree-ring chronology of Northern Eurasia shows good agreement with δ18O fluctuations in the ice core obtained for Greenland (GISP2).

I’ve posted the following at RC:
Re: John Mashey (#112),
John,
The underlying goal of your presumably tongue-in-cheek proposal is mostly realistic. You can’t be unaware of Reproducible Research, can you?
It is a well-developed practice at the University in your neighborhood with which we are both reasonably familiar.
As a leader in the computer field, would it not make sense for you to become a torchbearer for these methods, as they make use of software tools that are entirely familiar to you? (Yes, “make clean” is all it takes to rebuild an analysis and results in some SU science labs!)
I’m also sure you understand Knuth’s Literate Programming paradigm in more depth than most. And so, you will instantly recognize the meaning and value of Literate Statistical Practice.
Just a thought.

as well as posted above. Briffa’s CRU archive is extremely robust to an age-sensitive test. There is no reason from the statistics to think that there has been any conscious or inadvertent bias in constructing the Yamal chronology.

Steve McIntyre’s critique was rather uncooked (to be kind) but he appeared to be in quite a rush to find fault and has left behind a bit of a mess.”

Re: Antonio San (#203),
May I summarize Tom P’s investigation here, in four lines?
1. hypothesis: the stick is robust
2. method: randomly poke at steve’s code and compare results among pokes
3. results: very little change in graphic output
4. conclusion: the stick is robust
The alternative explanation is that the experimenter’s experimentation is deeply flawed. Which is rather easily proven. Tom P’s little petard is going to create a small stir amongst the faithful at RC.

In connection with a sudden change in the hydrological regime of the Tarmansky forest-bog-and-lake complex, the author undertook a dendrochronological reconstruction of the state of common pine along the coastal zone of Shaitansky lake as well as at the Karagandinsky riam. Changes in their productivity and stability parameters have been investigated over a period of 150-225 years. The author established their extreme values and cyclicity, tracing their dependence on certain hydrological and climatic indices. Natural and anthropogenic causes of changes in the state of trees and biotopes were also analyzed.

Objects and methods of research
In the course of work carried out in autumn 2001, dendrochronological material was collected from the sites Ahmanka, Ponizovka, Velizhany, Karaganda, Low Buhtalka, Lake Boglyanskoe, Lake Shaitansky, Lake Kopanets, Lake Large Tarmanskoe, and Lake Yantyk.
The main tree species of the Tarmansky complex are Scots pine, Siberian pine (cedar), Siberian fir, heart-shaped linden, birch, aspen, and five-stamen willow. Wood samples were usually taken from 10 model trees (two cores along opposite radii per model) of each species at each of the sites listed. Class I (dominant) trees of different ages were chosen as the models.
After special processing of the samples, the ring widths Ri(t) were measured under a microscope and the individual series cross-dated. In the generalized series for each tree, the mean absolute ring width R(t) and its standard deviation DR(t) were calculated.
The first of these parameters characterizes the course of tree growth due to the whole complex of factors, on which the initial ontogenetic (age) factor is superimposed, forming the characteristic curve of the “great growth”. The standard deviation DR(t) characterizes the heterogeneity of growth within the stand, from which one can judge the degree of organization of the trees into a single stable system; it reflects the intensification of decay processes and the subsequent restoration of stand structure.
The ratio of the standard deviation to the average growth rate (the coefficient of variation)

Ss(t) = DR(t) / R(t)

shows the proportion of the abnormal component of growth associated with the loss and restoration of the integrity of the stand, and characterizes the structural stability of the stand [Arefiev, 2001]. An increase in the coefficient of variation of growth indicates a decrease in the structural (mechanical, constructional and coenotic) stability of the trees.

To eliminate the ontogenetic and coenotic components of growth, the individual chronologies were standardized by calculating the ratio of the widths of adjacent rings (the sensitivity coefficient),

followed by cumulation and subtraction of the age trend. These growth indices were normalized to the long-term mean, taken as 1, to obtain generalized chronologies I(t) with values in the corridor from 0 to 2. The normalized radial growth indices I(t) are the main indicator of the external abiotic and biotic conditions of existence of trees of a given species. As an indicator of the physiological stability of trees Sf(t), caused by year-to-year fluctuations of growth, the sensitivity coefficient [Douglass, 1936] was used: Sf(t) = K(t), on a scale from -1 to +1. In the steady state it is close to 0; an increase in the amplitude of its oscillations corresponds to decreasing stability and an increasing probability that the oscillations will exceed a certain threshold level, corresponding to the loss of the tree. The most revealing fluctuations are those with a negative sign, indicating a sharp decline in growth; however, a sharp increase in growth also indicates reduced stability and is usually due to an abnormal component.
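
For readers trying to follow the arithmetic, here is a minimal Python sketch of the indices described above. This is only my reading of the translated text; the exact formulas in Arefiev (2001) and Douglass (1936) may differ, and in particular the form K(t) = (R(t) - R(t-1)) / (R(t) + R(t-1)) is an assumption, chosen only because it matches the stated -1 to +1 scale.

```python
# Sketch of the stand indices described in the translated text above.
# The exact published formulas may differ; this is an illustration only.
import numpy as np

def stand_indices(ring_widths):
    """ring_widths: 2-D array, rows = individual trees, columns = years.

    Returns the mean ring width R(t), its standard deviation DR(t),
    and the structural-stability index Ss(t) = DR(t) / R(t).
    """
    R = ring_widths.mean(axis=0)
    DR = ring_widths.std(axis=0, ddof=1)
    return R, DR, DR / R

def sensitivity(R):
    """Year-to-year sensitivity coefficient K(t) on a -1..+1 scale
    (assumed form: (R_t - R_{t-1}) / (R_t + R_{t-1}))."""
    R = np.asarray(R, dtype=float)
    return (R[1:] - R[:-1]) / (R[1:] + R[:-1])
```

On this reading, a rising Ss(t) would indicate a stand losing structural coherence, while |K(t)| approaching 1 flags years of abrupt growth change.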

Great stuff. You have enabled another old EE to understand better. Just curious here. All the charts from Briffa et al. seem to be drawn with a temperature axis, whereas all the discussion here is focused on the ‘climate signal’, and rightly so, because a tree responds to climate (water, CO2, sun, warmth, etc.) and not just temperature. So, how do we get from climate signal to temperature if a tree responds badly to drought and warmth as well as to cold and wet and cold and dry?

Re: stephen richards (#209),
Aye, there’s the rub!
The challenge you describe is the challenge of dendroclimatology. Start by reading the principles and then browse more of CA. Categories here include (but are NOT limited to) divergence, bristlecones, almagre, and much more.
It’s an intensely multidisciplinary topic, which makes it interesting to a lot of people.

Someone on WattsUpWithThat claims this paper will shed light on the Yamal matter. WattsUpWithThat Yamal Article. (Quoted below). I was wondering what the subject temperature reconstruction would look like if this method were used. Actually, I’m wondering if any method will make tree rings a viable thermometer.

”Briffa has also studied tree rings in Sweden at Torneåträsk (1992). A recent study found that the adjustments Briffa made wiped out the MWP. The new study, with more tree samples, makes it clear that the MWP was the warmest period of the last 1500 years there.

No doubt some readers didn’t notice or didn’t have time to look at that pdf. But doing so is well worthwhile if only because it shows these papers don’t have to be nearly incomprehensible to get published.”

It is about as clear an explanation of dendrochronology in action as an amateur will encounter. It also is useful for those interested in the Briffa Yamal matter.

It seems that Briffa as lead author used Grudd’s OLD data, “The Swedish Tornetrask data (Grudd et al. 2002)”, which Grudd in the new paper dismissed with these lines: “Previous climate reconstructions based on tree-ring data from Tornetrask were biased by a divergence phenomenon in TRW around AD 1800 and therefore show erroneously low temperature estimates in the earlier part of the records. Tornetrask MXD does not show this ‘divergence problem’ and hence produces robust estimates of summer temperature variation on annual to multi-century timescales.”

Regarding Torneträsk there’s quite a difference between Briffa (2000) and Grudd (2008). Grudd not only uses more samples but a different method. I believe Steve has posted on this issue before. Now, while reading about this Yamal mess, I can’t help thinking about Torneträsk. What are the similarities between Briffa’s Yamal and Briffa’s Torneträsk proxies? Are there any new implications that would warrant reassessing Briffa’s Torneträsk?

Steve: Yes, I’ve posted on Grudd and Briffa’s Tornetrask. See my Erice presentation. If you look at the Left-frame category “Briffa” and scroll to some early posts, there are some comments on the Briffa series used in MBH and Jones 1998.

It appears to me that Gavin is conducting a well orchestrated lynching of Steve at Real Climate. He will always allow posters to put up comments that seem to damage Steve, but he will not allow those comments to be answered. In addition he will allow no posts that mention the warming bias that is introduced by selecting chronologies based upon their agreement with the surface temperature record. I took a screen shot of my comment as it was awaiting moderation. Here it is:

Re: Tilo Reber (#220),
Mark my words. He is going to regret his statement about the acceptability of cherry-picking chronologies to match a temperature record. He has denied doing this with climate model runs. Why is it ok for dendroclimatology but not climate modeling? Hold his feet to the fire on this one, because it is going to burn, burn, burn.

In general there’s nothing wrong with conditional sampling. In a field where I have worked, a gas flow was seeded with small particles (dust) and the velocity of the flow determined by measuring the velocity of the dust. However, the ability of a particle to follow the flow depends on its size, so the best way to go is to select data from the particles below a certain size (ask Lucia). The issue with dendroclimatology, it seems to me, is that while it’s possible to determine whether trees contemporary with the temperature record are sensitive to summer temperature, it’s not possible to do so for the older trees. However, if it were determined for a particular species near the treeline that trees over a certain age exhibit that sensitivity, then it would be reasonable to limit the analysis to cores that fit that description.

Re: Phil. (#238),
Depends on how the selectivity is practiced. It is quite possible to take Jan Esper’s principle a bridge too far. That is the question: are the dendros taking the principle one step too far, or are these often-undocumented substitutions legitimate?

Re: bender (#239),
Remember, Phil, that we have mixed populations of “positive responders” and “negative responders”, leading to an inexplicable double divergence in the 20th century: both amongst samples, and between the trees and the temperature record. If I ex post facto exclude all those samples or chronologies that don’t match the 20th c. temperature record, and include all those that do, there is no question I’ve introduced a systematic bias. That would be kinda ok, except for the fact that the positive and negative responders correlate with each other very, very well prior to the divergence (witness Schweingruber vs. CRU Yamal). You can’t have your cake and eat it too. You can’t toss Schweingruber out and keep CRU Yamal in solely on the basis of 20th century match. That kind of selectivity will surely artificially bias the evidence in favor of your hypothesis. And this is a no-no in all science.
.
So please stop being facile in defending selectivity on principle. Stick to the relevant case. And it’s how it’s practised that matters.
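
The screening bias described above is easy to check numerically. The sketch below is purely synthetic and is not anyone’s actual reconstruction method; all names, sizes and the r > 0.3 threshold are arbitrary choices for illustration. It generates white-noise “chronologies”, keeps only those whose calibration-period correlation with a rising “instrumental” target exceeds the threshold, and averages the survivors: the screened composite acquires an uptick in the calibration window even though no individual series contains any signal.

```python
# Demonstration of ex post screening bias with pure noise.
# All names and thresholds here are arbitrary choices for illustration.
import numpy as np

rng = np.random.default_rng(0)
n_series, n_years, cal = 1000, 300, 50     # last 50 "years" = calibration window
target = np.linspace(0.0, 1.0, cal)        # rising "instrumental" record

series = rng.standard_normal((n_series, n_years))   # signal-free chronologies

# Screen: keep series correlating with the target at r > 0.3 in the window.
r = np.array([np.corrcoef(s[-cal:], target)[0, 1] for s in series])
selected = series[r > 0.3]

composite_all = series.mean(axis=0)        # unscreened mean: flat noise
composite_sel = selected.mean(axis=0)      # screened mean: rises in the window
```

With these settings only a few percent of the noise series survive screening, and their average shows a late “blade” while remaining flat (and noisy) before the calibration window, which is precisely the asymmetry that makes post-hoc selection dangerous.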

I am very surprised, to say the least, that the paper “Trends in recent temperature and radial tree growth spanning 2000 years across northwest Eurasia” by Briffa et al. and its conclusions have not been withdrawn, especially now with this Yamal flaw.

It seems that this paper published as late as 2007 used 5 tree-ring chronologies.

No 1 is Grudd’s OLD data, “The Swedish Tornetrask data (Grudd et al. 2002)”, which Grudd in the new paper dismissed with these lines: “Previous climate reconstructions based on tree-ring data from Tornetrask were biased by a divergence phenomenon in TRW around AD 1800 and therefore show erroneously low temperature estimates in the earlier part of the records. Tornetrask MXD does not show this ‘divergence problem’ and hence produces robust estimates of summer temperature variation on annual to multi-century timescales.”

No 2 is Finnish–Lapland data: About the paper here:
“The 20th century was indeed warm compared to the mean of the entire record (about 0.6°C warmer). However, there were three other hundred-year periods that were warmer still: 600-500 BC, 300-200 BC and AD 1500-1600. Likewise, the difference between the mean temperatures of the 20th and 19th centuries was large; but the difference between the mean temperatures of the16th and 15th centuries was of the same magnitude, while there were three other such century-to-century warmings that were even greater.”

In a later paper the original author writes:
“New tree ring-based analysis for climate variability at a regional scale is presented for high latitudes of Europe. Our absolutely dated temperature reconstruction seeks to characterise the summer temperatures since AD 750. The warmest and coolest reconstructed 250-year periods occurred at AD 931-1180 and AD 1601-1850, respectively. These periods share significant temporal overlap with the general hemispheric climate variability due to the Medieval Warm Period (MWP) and the Little Ice Age (LIA). Further, we detect a multi-decadal (ca. 50- to 60-year) rhythm, attributable to instability of the North Atlantic Deep Water, in the regional climate during the MWP but not during the LIA. Intensified formation of the North Atlantic Deep Water further appeared coincident to the initiation and continuation of MWP, the mid-LIA transient warmth occurring during the period AD 1391-1440, and to recent warming. Our results support the view that the internal climate variability (i.e. thermohaline circulation) could have played a role in the earlier start of the MWP in several proxy reconstructions compared to the externally forced model simulations.”

No 4 is Bol’shoi Avan by Sidorova et al. 2007. Link:
Quote: “An analysis of long-term data suggests that the early 19th century was the coldest time of the entire Little Ice Age Period (the 14th–19th centuries) in this sector of the Subarctic.”

No 5 is Taimyr (Naurzbaev et al. 2002).
I did not manage to get a free-of-charge link to the original paper, but here is a link about it. Link

I quote: “Naurzbaev et al. report that “the warmest periods over the last two millennia in this region were clearly in the third [Roman Warm Period], tenth to twelfth [Medieval Warm Period] and during the twentieth [Modern Warm Period] centuries.” With respect to the second of these three periods, they emphasize that “the warmth of the two centuries AD 1058-1157 and 950-1049 attests to the reality of relative mediaeval warmth in this region.” Their data also reveal three other important pieces of information: (1) the Roman and Medieval Warm Periods were both warmer than the Modern Warm Period has been to date, (2) the “beginning of the end” of the Little Ice Age was somewhere in the vicinity of 1830, and (3) the Modern Warm Period peaked somewhere in the vicinity of 1940.”

One thing that may need further examination is that Hantemirov & Shiyatov found there was a MWP and a warm period in Roman times while Briffa apparently didn’t, from apparently the same raw data set. Both sides may have used different filters or analytical methods, but it implies either that the filters are at fault or that dendro work involving these trees isn’t a suitable method of determining historic temperatures.

Is this an accurate characterisation of tree proxies? Found at Roger Pielke Jr’s site:
quasimodo said:

I’m rather confused about proxies. From what I can gather proxies (and in particular tree cores) are selected/discarded according to whether they “fit” a preconceived notion of the current rise in temperatures. In Briffa’s case they were selected by persons unknown because they showed the latter 20th century hockey stick. Some showing it with an alarming 8 sigma variation in growth.

Hence does it really surprise anyone that these selected samples each show the hockey stick? As EliRabett stated “robust to substraction”. That is exactly how they were selected – each and every one.

Selecting trees purely because they match your preconceived latter twentieth century hockey stick, doesn’t mean they then can be used to confirm the latter twentieth century hockey stick. This is very twisted thinking.

I thought the whole point is that you use these selected trees because you think they can be a reliable thermometer, because they show some match to data in which you have some confidence (the 20th century temperature recordings of fire stations, airports and a bunch of places that professional climate scientists have managed to lose the original data for).

At this point you hope (since Briffa presents no cogent argument in support) that these trees represent an accurate temperature proxy for their entire life. What’s more, you hope their response is linear in temperature and unaffected by other environmental stimuli.

Of course if most trees behaved this way, you’d have a little confidence that they might act as proxies. A sensible person would probably lose some confidence in their accuracy over the longer timescale, since you’ve selected for 20th century matches, the odds are they won’t all be perfect matches for their entire life.

To fix this unease, you’d tend to prefer more trees in the hope that the signal is stronger than the noise.

Unfortunately Briffa only used 10, and since so many more were discarded than used, you’d have to conclude that my assumption above “that most trees are good temperature proxies” doesn’t seem to be supportable.
So perhaps the trees you have selected are nothing more than a bunch of trees with a sudden growth spurt, perhaps due to other factors, and the odds of them all acting as proxies throughout their lives must be even lower.

Of course, as a competent scientist, assuming you proceed down this tenuous path, you’d document your assumptions, why you have made your choices, how many samples make up your analysis, the confidence bounds, and archive the data so that it can be reproduced or extended by any scientists who wish to rely on your results.

freespeech:
“Selecting trees purely because they match your preconceived latter twentieth century hockey stick, doesn’t mean they then can be used to confirm the latter twentieth century hockey stick. This is very twisted thinking.”

Exactly. At one point I believed that 20th century proxies and the surface temperature record were mutually supportive. This is obviously not the case. Chronologies are hand picked to ape the temperature record. So they really can’t tell you if the surface temperature record is overcooked or not. I’m sure that we have been getting quite a bit of warming. After all, we were coming out of a little ice age. But I still don’t have that much confidence in the HadCru or Gistemp numbers.

Now here is the really strange thing. Even though the chronologies are picked to match the surface temperature, none of them can reach the high levels of the surface temperature records. Take a look at the spaghetti graphs. They all fall short. And the dendros seem to be having trouble finding new trees to update with that will get them up there.

I don’t have that much of a problem with cherry-picking trees. Pick the ones that match the temperature record, and assume that these trees remain good thermometers for their life. Then pick fossil trees that match those trees, and keep going further back. If you can get a good match between trees and temperature, that seems like a reasonable method.

MikeN wrote: “If you can get a good match between trees and temperature, that seems like a reasonable method.”

But isn’t that exactly the point? You don’t know you have a good match between trees and temperature. You have a number of trees that have shown some sort of late term “growth”, and these are selected. You don’t know if they are thermometers, just that they have recent growth.

To attempt to ameliorate this, you use a whole bunch of trees, and still you are faced with the uncertainty that they behave as thermometers their whole life.

So if you are being scientific about this, you note all this down and discuss how your decisions were made. You emphasise that you chose trees that matched a particular temperature record (I’d expect something local would be best). You exclude trees with an 8-sigma variability. You note how many trees made up the sample and how many you discarded. You talk about your results and the confidence bounds on those results.

You don’t hide the data, you don’t hide the sample size, you don’t include 8 sigma outliers and you certainly don’t say that the sample trees confirm the late 20th century hockey stick because you used that to select them in the beginning.
The fact that any of this is still worth debating is a sad indictment of the state of climate science.

The fact that questions about these practices are censored at RC is a sad indictment of the people who manage that site.

I find it particularly ironic that Gavin is asking Steve to publish his findings including all code, methods and data. I just wish he’d ask Briffa and the rest of the “team” to do the same. If they did, it wouldn’t take 10 years for the substandard nature of their work to be revealed. And it would be much easier for their colleagues and journals to disregard team publications without tainting their own citations.

Re: MikeN (#236),
Looks about right. Correlations can be tough to judge by eye. The eye tends to focus on interannual variability and scores high for short intervals of match, while the correlation coefficient is, in contrast, very sensitive to differences in trend.
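
This is easy to illustrate numerically: give two series identical interannual wiggles but opposing trends, and the eye sees a match while the correlation coefficient collapses. A synthetic sketch (all numbers arbitrary):

```python
# Pearson correlation is dominated by trend differences even when the
# year-to-year wiggles match exactly. Synthetic illustration only.
import numpy as np

rng = np.random.default_rng(1)
t = np.arange(100, dtype=float)
wiggles = rng.standard_normal(100)    # shared interannual variability

a = 0.03 * t + wiggles                # rising series
b = -0.03 * t + wiggles               # identical wiggles, falling trend

r_raw = np.corrcoef(a, b)[0, 1]       # dragged down by the opposing trends
r_detrended = np.corrcoef(a - 0.03 * t, b + 0.03 * t)[0, 1]  # wiggle match
```

Detrending recovers the perfect wiggle match, which is one reason comparisons are sometimes run on both raw and high-pass-filtered versions of a series.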

Re: MikeN (#244),
- 0.45 would be considered strong enough to proceed with a reconstruction. The dendros lurking here can chime in under a pseudonym if they feel otherwise.
- Run Steve’s R code. If you can’t, let me know and I’ll run it and paste the numbers here.

#491 Eli Rabett says:
5 October 2009 at 4:13 PM
While the paper posted by Maikdev in 476 has something for everyone, it does have a nice explanation and a lot of references as to why using a large number of young trees in the calibration period to compare with the instrumental record is not considered to be a very good idea by dendrologists. Since even Eli was generally aware of these issues, there is no doubt that McIntyre is also. Thus, his injection of a mess of relatively young trees is, let us say cherry smashing.

Schweingruber is thru 1990.
Looking at the CRU record, it looks like they didn’t use Salehard, or the other station nearby.
Using the RCS Yamal number in CRU’s prn file. What is the SPL method that they calculated?

The Yamal tree-ring chronology (see also Briffa and Osborn 2002, Briffa et al. 2008) was based on the application of a tree-ring processing method applied to the same set of composite sub-fossil and living-tree ring-width measurements provided to me by Rashit Hantemirov and Stepan Shiyatov which forms the basis of a chronology they published (Hantemirov and Shiyatov 2002).

Briffa’s data, posted at CRU, doesn’t seem to match up with the information posted at NCDC by Hantemirov and Shiyatov.

Maybe this is a dumb question or an obvious known building block of dendroclimatology – apologies if so – but presumably it is possible to use temperature and other site information/records to predict the ring patterns of a tree before it is cut/cored?

Maybe one reason Briffa’s data at CRU doesn’t seem to match the information for Hantemirov & Shiyatov 2002 posted at NCDC is that Briffa’s Yamal analysis was based on a 1998 version of data provided by Hantemirov. I finally noticed that Briffa (QSR 2000) cites as his reference: “Hantemirov, R.M., 1998. Tree ring reconstruction of summer temperatures on the north of West Siberia during the last 3248 years. Siberian Ecological Journal 5, in press.”

The H & S 2002 reconstruction covers almost 4000 years. It may well be based on different data than was used in the work cited by Briffa that was done 4 years earlier.

So, when Briffa responds that his Yamal work is “based on” the data that “forms the basis of” the H & S 2002 analysis, maybe it’s really important to know what the meanings of based and basis are.

It’s tempting to presume that there is a lot of similarity between Briffa’s data set and the data used for the “one approach to constructing a mean chronology” using “224 individual series of subfossil larches” “supplemented by … 17 ring width series from 200-400 year old living larches.” (pg. 721, H&S 2002)

That “one approach” isn’t otherwise described in H&S 2002, neither in the text nor in the figures.

Briffa’s data at CRU is similar in numbers of samples/series, but not in the number of cores from living larches. Briffa seems to have used fewer than 17.

Maybe I’ll live long enough for the details to be revealed that might clear up the discrepancy between the basis of H & S 2002 and the data Briffa’s Yamal work was based on. I hope so–I want to improve my golf game, and it clearly will also take a lot of time for me to end up on the correct side of 100.

Steve will, I am sure, be commenting on this development in detail. I note in passing that Briffa’s conclusion is exactly what I predicted could happen: he could salvage the hockey stick by dragging in other series to resurrect the blade, effectively painting the Schweingruber sample as biased. (How could he have known this in 2000, with only twelve specimens?)
.
The cordial tone of the response aside, there is still a scientific pea-and-thimble game going on here, and McIntyre will get at the root of it. The clues to the game lie in the gentle admissions:

The last 8 years of our chronology ARE based on data from a decreasing number of sites and trees and this smaller available sample does emphasise the faster growing trees, so this section of the chronology should be used cautiously. The reworked chronology, based on all of the currently available data is similar to our previously published versions of the Yamal chronology demonstrating that our earlier work presents a defensible and reasonable indication of tree growth changes during the 20th century, and in the context of long-term changes reconstructed over the last two millennia in the vicinity of the larch treeline in southern Yamal.
.
This does not mean that these chronologies will not change as additional data become available and as the RCS processing technique evolves, but the results we show here do suggest that McIntyre’s sensitivity analysis has little implication for those other proxy studies that make use of the published Yamal chronology data.
.
When using the RCS technique, it is important to examine the robustness of RCS chronologies, involving the type of sensitivity testing that McIntyre has undertaken and that we have shown in this example.

IOW McIntyre, Jeff Id, et al. are on the right track in scrutinizing RCS end-point issues when dealing with relatively small, heterogeneous samples.
.
And Tom P? Out to lunch. Trying to sweep under the rug the problems that Briffa here admits are non-trivial.
.
The game continues.

Re: Lorax (#9), I think you and Dr. Briffa are confusing effect with intent. Dr. Briffa may not have set out (i.e. intended) to cherry-pick data, but it looks (no accusation of certainty here) as if that may be exactly the effect, from a strictly disinterested scientific point of view.

I agree that there may be a tendency among some commenters to conclude that there was intent on Dr. Briffa’s part, but that unfortunately has been fed by the refusal for almost a decade to disclose the raw data. One of the things that bothers me about climate science is a tendency to use data or methods intended for different purposes to measure climate trends, as appears to be the case here (and I don’t claim to be 100% right on this either). I attribute this tendency to the paucity of good measurements and the difficulty (and necessity) of trying to figure out what happened a long time ago. Unlike other fields of science, it is very difficult to “repeat the experiment” in climate science, so a lot of effort is spent on trying to figure out what the results of the “experiment” that has already taken place are (such as coming up with a measurement of “average global temperatures” over the past couple of millennia).

Re: Geo (#23), Where the rules got broken is a place where the dendros seem absolutely confident that they are correct: that you can pick certain trees because they are thermometers and reject others, with no need to justify this and no apparent understanding of the impact of some trees being good and others not on the assumption of stationarity of tree response (if some trees are not “good” then how do you know the trees you picked were “good” in the past when you have no temperature data?). Trees are not physical mechanisms like mercury in a thermometer or isotopes, and the assumption of stationarity needs some pretty strong justification. The divergence phenomenon shoots holes in stationarity.

We do not select tree-core samples based on comparison with climate data. Chronologies are constructed independently and are subsequently compared with climate data to measure the association and quantify the reliability of using the tree-ring data as a proxy for temperature variations.

More pointedly, how does Briffa treat individual chronologies that are “subsequently compared with climate data” and found to have sub-standard “reliability” (whatever that is)?

5 Trackbacks

[…] by: lucia It seems everyone has blogged on Steve McIntyre’s explosive discussions of the Yamal tree rings. Because my understanding of tree-nometers does not extend beyond what I learned of tree rings in […]

[…] Steve McIntyre of Climate Audit seems to have broken the hockey stick for a second time. In Yamal: A “Divergence” Problem, he poses serious questions about the Briffa tree ring chronologies used in recent reconstructions of global temperature history. Here is Briffa’s response to McIntyre, and McIntyre’s reply. […]

[…] Mr. McIntyre has gained fame or notoriety, depending on whom you consult, for seeking weaknesses in NASA temperature data and efforts to assemble a climate record from indirect evidence like variations in tree rings. Last week the scientists who run Realclimate.org, several of whom are authors of papers dissected by Mr. McIntyre, fired back. The Capital Weather Gang blog has just posted its analysis of the fight. One author of an underlying analysis of tree rings, Keith Briffa, responded on his Web site and on Climateaudit.org. […]