Fresh Data on Briffa’s Yamal #1

A few days ago, I became aware that the long-sought Yamal measurement data had materialized at Briffa’s website – after many years of effort on my part and nearly 10 years after its original use in Briffa (2000).

I am very grateful to the editors of Phil Trans B (Roy Soc) – at long last, a journal editor stood up to CRU, requiring Briffa to archive supporting data. They actually asked Briffa to archive the data last year. He asked for further time. When I looked earlier this year, it was still unarchived. However, when I looked again a few days ago, it had finally been archived (without anyone at CRU having the courtesy to inform me that they had rectified the situation.)

I’m assuming that CA readers are aware that, once the Yamal series got on the street in 2000, it got used like crack cocaine by paleoclimatologists, and that it plays a critical role in many spaghetti graph reconstructions, including, most recently, the Kaufman reconstruction.

I’ll also assume that readers are familiar with the difference between the Yamal chronology and the Polar Urals “Update” reconstruction and with the execrable story behind the non-reporting and abandonment of the Polar Urals “Update”. (See various CA posts and the quick review in my 2008 Erice presentation.) The graphic below shows both the similarities and differences: while the two series are similar on a decadal basis, the medieval-modern relationships are reversed, with a huge 20th century pulse at Yamal and an imposing MWP at Polar Urals.

Reconciliation of the disparate Yamal-Polar Urals Update (and other such regional reconciliations) seems to me like the first order of business if multiproxy reconstructions are to advance. As an analyst in Toronto, I can comment on the differences, but am not in a position to resolve them. Such reconciliations are properly the obligation and responsibility of the field scientists involved. Unfortunately, to date, people in the field have not honored this responsibility and, to an outside observer, seem to have done no more than pick the version (Yamal) that suits their bias.

Until now, Briffa’s refusal to archive Yamal measurement data and the acquiescence of journals (including Science) in this obstruction has made it impossible to get even a foothold on the factors governing the differences between the two series. The archive is still seriously inadequate for full statistical analysis, but now I can at least get a foothold and will begin commentary on the issues in a series of posts.

First, to clear a little underbrush. There is one other version of these series that readers may encounter: Hantemirov and Shiyatov archived a Yamal reconstruction at NCDC that has no hockey stick blade whatever. This version was promoted by a commenter (Lucy Skywalker) at Jeff Id’s as being a priori more valid than Briffa’s. Although the Hantemirov and Briffa chronologies have a very different visual appearance (especially the non-HSness of the Hantemirov version), the two are extremely highly correlated. (If you regress the Briffa recon against the Hantemirov recon for the pre-1800 period, you get a huge r^2 of 0.81.) The two series clearly have the same raw material.
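For readers who want to run this sort of check themselves, here is a minimal sketch of an r^2 calculation restricted to the years before 1800. It is not the script behind the 0.81 figure; the function name `r_squared` and the toy numbers are hypothetical stand-ins for the two archived chronologies.

```python
def r_squared(years, x, y, end_year=1800):
    """R^2 from regressing y on x, using only years before end_year.

    For a one-predictor regression with an intercept, R^2 equals the
    squared Pearson correlation, which is what is computed here.
    """
    pairs = [(a, b) for yr, a, b in zip(years, x, y) if yr < end_year]
    n = len(pairs)
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    return (sxy * sxy) / (sxx * syy)


# Toy values only (hypothetical): the two series track each other
# before 1800 and part company afterwards.
years = [1796, 1797, 1798, 1799, 1800, 1801]
hantemirov = [0.9, 1.1, 1.0, 1.2, 1.0, 1.0]
briffa = [0.8, 1.2, 1.0, 1.4, 3.0, 3.5]

print(r_squared(years, hantemirov, briffa))                 # pre-1800 only
print(r_squared(years, hantemirov, briffa, end_year=9999))  # all years
```

Restricting the window matters: two chronologies can agree closely over one period and still look very different once another period is included.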

However, in my opinion, the issue is considerably more nuanced than simply preferring the Hantemirov chronology. The Hantemirov and Shiyatov chronology adjusts for age (“standardization”) through a “corridor” method, whereas the Briffa chronology uses a “RCS” method to standardize for age. In other studies involving relatively short-lived trees (such as Yamal), the corridor method has been found to yield very similar results to “conventional” standardization; such methods are also known to remove any centennial-scale variability from the reconstruction. As a result, no conclusions should be drawn with respect to centennial-scale variability from the Hantemirov chronology. No adverse conclusions should be drawn against the Briffa chronology merely because it differs from the Hantemirov chronology. There are other reasons to be concerned about the Briffa chronology, but these have to be presented and supported.
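To make the RCS idea concrete, here is a minimal sketch under loudly stated assumptions: the first measured ring of each series is treated as the first year of growth (no pith-offset estimate), and the regional curve is a raw mean by ring age with no smoothing. Real implementations generally do both, so this illustrates the principle only and is not Briffa's procedure. The function name and toy data are hypothetical.

```python
from collections import defaultdict


def rcs_chronology(series):
    """Simplified RCS. series is a list of (first_year, ring_widths).

    Step 1: align every series by ring age and average, giving one
            regional growth curve shared by all trees.
    Step 2: divide each ring width by the curve value for its age.
    Step 3: average the resulting indices by calendar year.
    """
    widths_by_age = defaultdict(list)
    for _, widths in series:
        for age, width in enumerate(widths):
            widths_by_age[age].append(width)
    regional_curve = {age: sum(ws) / len(ws) for age, ws in widths_by_age.items()}

    index_by_year = defaultdict(list)
    for first_year, widths in series:
        for age, width in enumerate(widths):
            index_by_year[first_year + age].append(width / regional_curve[age])
    return {yr: sum(v) / len(v) for yr, v in sorted(index_by_year.items())}


# Two toy trees (hypothetical): same age trend, but the later tree
# grows twice as fast throughout.
toy_series = [(1900, [2.0, 1.0]), (1901, [4.0, 2.0])]
chronology = rcs_chronology(toy_series)
print(chronology)
```

Because one curve is shared by all trees, the faster-growing tree keeps an index above 1 for its whole life. Fitting a separate curve to each tree would pull both trees toward an average of 1, which is the sense in which tree-by-tree methods lose long-period variability.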

Yamal Counts

The main focus of the program at Yamal described in Hantemirov and Shiyatov 2002 was the collection of subfossils from river beds. The original article considers cores from throughout the Holocene (this aspect is reminiscent of the long Finnish chronology), observing that the treeline in the Holocene Optimum was well to the north of the present line.

Today, I’m not going to review these findings (I will on another occasion – interested readers can go straight to the original article.)

First, I’m going to review the inventory of fossil and subfossil trees reported in the original article in order to crosscheck this information against the CRU archive.

Remains of dead trees can be found lying on the surface and tend to be up to a maximum of 750 years old. Within the frame of this research, some 30 of these dead trees have been collected…

by far the most significant source of subfossil wood remains, often trunks in a near-complete state, with bark, roots and large branches, is the material found in alluvial deposits… 1945 samples have been collected from alluvial deposits at the time of writing…

The second important source of subfossil wood is peat deposits…To date, 196 samples have been recovered from such peat deposits…

a total of 2171 sawn wood samples has been collected.

Later they report the selection of a subsample of these subfossil samples, together with core results from 17 living trees:

In one approach to constructing a mean chronology, 224 individual series of subfossil larches were selected. These were the longest and most sensitive series, where sensitivity is measured by the magnitude of interannual variability. These data were supplemented by the addition of 17 ring-width series, from 200–400 year old living larches.

They go on to observe (and I’ll return to this point in another post):

The data were treated separately for each valley system and tree position transformed into anomalies from their own present-day limit, and, because no river valley could, as yet, supply sufficient samples to cover the whole period, the data were then combined to form a single regional indication of tree-line shifts. This should be considered as a preliminary result because there is some subjectivity in the way the different valley data were expressed.

I mention this point, because the Briffa chronology appears to take a different approach i.e. not considering each river valley separately. More on this on another occasion.

Rob Wilson could have collected 17 cores from living trees in a morning. I presume that these 17 ring-width series from living larches are a sample from a larger program on living larches. The following graphic (H and S Figure 6) shows the core count by year for the 241 series selected into the H and S chronology. It shows the use of over 30 cores from about 600 to 1500 and the use of about 17 cores in the 19th and 20th century (presumably the 17 cores from living trees).
Figure 2. Series Counts in H and S Chronology 2002
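As a quick arithmetic crosscheck of the inventory quoted above, the following sketch simply adds up the figures as reported by H and S. The variable names are mine; the numbers are those in the quoted passages.

```python
# Sample counts quoted from Hantemirov and Shiyatov (2002) above.
surface_trees = 30        # "some 30" dead trees lying on the surface
alluvial_samples = 1945   # from alluvial deposits
peat_samples = 196        # from peat deposits
reported_total = 2171     # "a total of 2171 sawn wood samples"

inventory_total = surface_trees + alluvial_samples + peat_samples
print(inventory_total, inventory_total == reported_total)

# Series selected for the mean chronology.
subfossil_selected = 224
living_selected = 17
chronology_series = subfossil_selected + living_selected
print(chronology_series)
```

The three sources sum to the reported total, and the 224 subfossil series plus 17 living-tree series give the 241 series in the H and S chronology.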

The CRU Archive
While the Yamal measurement archive is a quantum addition to (indeed, the first useful) information on Yamal, it lacks any sort of metadata as to where the individual samples were obtained.

I think that some information can be gleaned from the nomenclature of the ID numbers. There are 252 distinct series in the CRU archive. There are 12 IDs consisting of a 3-letter prefix, a 2-digit tree # and 1-digit core #. All 12 end in 1988 or later and presumably come from the living tree samples. The nomenclature of these core IDs (POR01…POR11; YAD04…YAD12; JAH14…JAH16 – excluding the last digit of the ID here as it is a core #) suggests to me that there were at least 11 POR cores, 12 YAD cores and 16 JAH cores. The JAH and POR cores (7 in total) were over 200 years old, while all of the YAD cores were under 200 years old, so there may be some difference between the Briffa selection and the Hantemirov selection. YAD presumably stands for Yadayakhodyyakha River (see map and Figure 3); POR for Porzayakha River (see Figure 3); JAH for one of the unlabelled tributaries in Figure 3.

There are 235 IDs consisting of a 1-character prefix followed by a 4-digit tree number and a 1-digit core #. The tree numbers appear to be unique within this group – the lowest is 0008 and the highest is 2258. There are three different prefixes: L, P and _, with L having by far the most. Perhaps these prefixes correspond to three different subfossil provenances: alluvial, peat and surface. (I don’t rely on this; they might relate to something geographic.)

There are 5 IDs that are 4-5 digits (X13, X02S, M021, M022, M331): these end in 1963, 1978 and 1982 and look like they are results from previous coring of living trees. Perhaps these 5 cores plus the 12 cores from living trees (with 6-digit IDs) are the same as the 17 cores from living trees selected in the H and S chronology. But maybe this is a coincidence. One never knows – it’s climate science.
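The grouping by ID pattern described above can be sketched with a couple of regular expressions. This is only an illustration of the nomenclature: the group labels "living" and "subfossil" reflect my presumption, and the full 6-character IDs in the sample list are hypothetical (tree numbers taken from the text, core digit assumed to be 1). The short IDs are the five listed above.

```python
import re
from collections import Counter

# 3-letter prefix + 2-digit tree number + 1-digit core number.
LIVING = re.compile(r"[A-Z]{3}\d{2}\d")
# 1-character prefix (a letter or underscore) + 4-digit tree number
# + 1-digit core number.
SUBFOSSIL = re.compile(r"[A-Z_]\d{4}\d")


def classify(core_id):
    """Assign a core ID to one of the three nomenclature groups."""
    if LIVING.fullmatch(core_id):
        return "living"
    if SUBFOSSIL.fullmatch(core_id):
        return "subfossil"
    return "other"


# Hypothetical 6-character IDs, followed by the five short IDs.
sample_ids = ["POR011", "YAD041", "JAH141", "L00081", "P22581", "_12341",
              "X13", "X02S", "M021", "M022", "M331"]
group_counts = Counter(classify(core_id) for core_id in sample_ids)
print(group_counts)
```

Applied to the actual archive, the three groups should come out at 12, 235 and 5 IDs, which sums to the 252 distinct series.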

In any event, the following graphic shows the number of cores by year in the CRU archive. The counts decline from 24 in 1956 to only 10 in 1990 and 5 in 1995-96. While the appearance has much in common with the corresponding H and S graphic, there are noticeable differences: the number of series in the MWP is dramatically reduced in the CRU sample – counts were consistently in the 30-40 range in the H and S graphic between 800 and 1400, while the CRU count is more like 15-20. On the other hand, the CRU count for the first part of the 20th century is visually higher than the corresponding H and S count (which holds at about 17.) What accounts for this difference? I have no idea.

Figure 2. Count of archived cores for Yamal at CRU by year.
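A count-by-year series of this kind needs nothing more than the first and last year of each core. Here is a minimal sketch; the function name and the toy spans are hypothetical and are not the archive values.

```python
from collections import Counter


def counts_by_year(spans):
    """Count how many cores cover each calendar year.

    spans is an iterable of (first_year, last_year) pairs, one per core,
    with both end years included.
    """
    counts = Counter()
    for first_year, last_year in spans:
        for year in range(first_year, last_year + 1):
            counts[year] += 1
    return counts


# Toy spans (hypothetical): three cores ending in different years.
toy_spans = [(1800, 1990), (1850, 1996), (1900, 1963)]
core_counts = counts_by_year(toy_spans)
for year in (1950, 1990, 1996):
    print(year, core_counts[year])
```

The same tally run on two archives on a common year axis gives the sort of side-by-side comparison discussed here.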

Further confusing matters is the information on the number of cores in each year attached to the Hantemirov Yamal chronology at NCDC, which has an entirely different appearance in some respects, but other features seem to match. The count is exactly 17 from 1875 to 1963, which seems to match the 17 living trees. The count in 1990 is 10 (the same as the CRU count, dropping to 5 in 1995-1996, as CRU.) Unlike the CRU count, the count is much more homogeneous – so the data sets are definitely not the same.

I provide this data not to provoke reactions like: How can anyone conclude anything about Yamal climate from only 10 cores in 1990? Or 5 cores in 1995? (While I understand the sentiment, please don’t post such comments.) The concern is sharpest in the last few years, where the number of available cores falls below minimums advocated elsewhere in the Briffa corpus.

To show the tininess of the subset representing Yamal in the 1990s, the next graphic compares Yamal counts with counts from the Polar Urals version, set aside by Briffa in favor of the Yamal series (and even I’m a little shocked by the discrepancy). In 1990, there were 57 cores in the Polar Urals and only 10(!) in the Yamal subset.

So while CRU may have archived the data that they “used”, we run once again into a problem that we run into over and over again in paleoclimate studies. What about the data that was in their sample, but which they didn’t use? 10 trees going to 1990 and 5 trees to 1995-1996 is impossibly small for a dendro expedition. Rob Wilson could do that in an hour. What about the rest of the Yamal data? Where is it?

To what extent is the Yamal HS a product of the selection process and to what extent is it climatic? Without the complete data set, it is impossible to set aside the troubling thoughts that one is faced with in these circumstances.

I think that I’ve figured out a way to crosscheck the modern portion (yielding some disquieting results) and will discuss that in a forthcoming post.

34 Comments

Why does it take so long to archive data? If data is used in an analysis that gets published, then it must already be in some format for a computer program to read and perform calculations on. Why not just upload the raw data + source code for reading it and be done with it? Is there something I’m missing?

Chad, congratulations on your excellent work on Santer which I plan to cover. Sorry for not covering it sooner as I’ve been working on Kaufman and now Yamal.

There are obviously no technical obstacles to archiving data. NCDC has excellent facilities and people can archive data at their own websites in seconds.

The problem is entirely attitude. People can speculate on motives for obstructing the provision of data. We’ve talked about this at length.

But I think that it’s more useful to focus on policies of funding agencies and journals.

NSF has turned into a cheerleader for the Team and has totally abandoned its responsibilities to ensure that existing federal policies on archiving are enforced.

Journals in climate have not established mandatory policies that data be archived as a condition of review (as in econometrics.) One’s chances vary considerably from journal to journal. IJC has no archiving policy at all and refuses to ask authors to archive data because it has no obligation. Science and Nature both had adequate policies on paper, but their enforcement was poor. The Hwang cloning debacle was a huge help in improving Science’s attitude on getting authors to provide data. Before that, they were very unhelpful in enforcing their own policies. They are more proactive now. I got an acknowledgement from Science of my request for Kaufman data within a few hours. While I don’t have the data yet, I doubt that they will have any patience with prevarication by Kaufman.

now I can at least get a foothold and will begin commentary on the issues in a series of posts.

Speaking of promised series of posts, what ever happened to the series of posts you were going to present concerning the conference in Italy? I for one am still interested.

Hopefully this isn’t akin to the old joke about someone who kept sending postcards to a friend. First he’d pose a question and a few days later send another with the answer. Finally he sent one which asked, “How do you keep a turkey in suspense?” The friend kept waiting for the second postcard with the answer, but eventually he realized the absence of the second card WAS the answer.

One thing that stuck out for me reading this: we have been told numerous times that tree location is important for selecting temperature proxy trees. That is all well and good for the modern living trees, even BCPs as far as they go. But how does this possibly apply to fossil and sub-fossil trees? The forces that create such samples do not seem like they would conveniently provide just the temp limited trees, even if there actually is such a thing.

Even worse, trees that are in a position to fall into a river through natural processes seem like they would more than likely be in swamps or bogs of some sort, not the high elevations where temperature is claimed to be dominant.

Please do not try to debate tree rings from first principles. There have been lots of discussion of tree rings. It’s an important issue but editorially I’m not interested in re-hashing the first principles over and over again. So for present purposes, I’m just working through the statistical and selection issues.

Re Soronel Haetir (#7): I myself think that your point is excellent and am a little surprised that S. McIntyre didn’t note it himself, since he is often–I believe rightly–going on about whether “whatever” truly is a temperature proxy. How can one in a sane state of mind even consider fossil and sub-fossil trees as valid proxies for any climate characteristic (except, of course, that the climate at one time was suitable for growing trees). Now the rest of you, more knowledgeable than I, go ahead and point out to me how I am wrong.

I wouldn’t say that adequate metadata is anywhere near “standard” in dendro. It isn’t. “Site information” metadata is seldom more than the lat-longs of the site and that level of information is available in H and S. What’s unusual about Yamal is that the “site” is spread out over miles, imposing an additional variable on the data, especially when RCS is used.

Re: bender (#13),
Did I ever write up my visit to the LTRR archives? I have photos of their earlier sources and metadata… valuable to remember that earlier samples were originally collected for chronology, not climatology purposes. Date and general location would have been *good* metadata.

Furthermore, I have data for more than 500 (including hundreds of non-GHCN) Siberian weather stations. Berezovo (in fact, it is “Beryozovo”), a station a few hundred kilometres to the south, has data going back to the early 1830s. The records tell us that summer temperatures have decreased slightly in the last 175 years. A truly inconvenient truth for the Team, because as far as I know these proxies can refer to the mean temperature of the vegetation period, which is only about 3 months on the Yamal Tundra.

Re: Adam Soereg (#52), It’s been nagging in the back of my mind for a while but you just put it into words. I got the impression proxies for global reconstructions were chosen on the basis of correlation to global temperature ie hockey stick mining. Is this the case? If so, is it a good assumption to make that local temperatures mirror global temperatures?

As a person who is interested in the geography and climate of Siberia, one more thing is still not clear to me: the northernmost living trees on Earth are located in the Taymir region, at about 72°40’N, north of the Khatanga river.

In the region mentioned above, there are remains of old trees on the treeless tundra, located far north from the current tree line. A small amount of warming cannot be unprecedented. An article about the background:

Over most of Russia, forest advanced to or near the current arctic coastline between 9000 and 7000 yr B.P. and retreated to its present position by between 4000 and 3000 yr B.P. Forest establishment and retreat was roughly synchronous across most of northern Russia. Treeline advance on the Kola Peninsula, however, appears to have occurred later than in other regions. During the period of maximum forest extension, the mean July temperatures along the northern coastline of Russia may have been 2.5° to 7.0°C warmer than modern.

So, per the “divergence problem” post, the archived Yamal chronologies were cherry-picked. The only real question, in my mind, is whether they were hand picked to support a preordained conclusion, or whether their selection was a more-or-less automatic result of accepted assumptions like “a chronology that is inconsistent with a large, monotonic increase in temperature in the 20th century is a bad chronology, while one that shows such an increase is a good chronology.”

If this is the work of one incompetent or dishonest person, that’s bad. If it’s a result of standard practice in the dendrochronology field, it’s much, much worse. Not from a moral perspective, but from the perspective of what it means for all the work that has been done by people following those practices.

A question – do the individual chronologies correlate less well (with one another) in the 20th century than in previous centuries? If not, is there any evidence that the recent “diverging” data are more problematic than earlier data, and therefore require some kind of “corrective” procedure applied on the basis of recent data characteristics?

If it’s a result of standard practice in the dendrochronology field, it’s much, much worse.

I would not have thought it standard practice. But you read Esper’s quote? Maybe it was taken out of context, but I can’t imagine that matters. What Esper said is frightening. What on earth was he thinking when he said that?

do the individual chronologies correlate less well (with one another) in the 20th century than in previous centuries?

That’s precisely the “divergence problem”. Read the blog. There has been a loss of correlation in many species, many sites, in the 20th century.

My understanding of the problem is incomplete, then. I was under the impression that it referred to the failure of the tree rings to reflect changes in the 20th century instrumental record in a consistent way. I’ll do more reading.

So what turns on this remarkable revelation?

This is a note to Bishop Hill and any others following this story. If AGW is your concern, it is important to maintain perspective. A modern warm bias in the Yamal chronology means that one is on shaky ground asserting that modern temperatures are “unprecedented”. This, however, does not imply modern temperatures will not soon be unprecedented. That takes you out of the realm of paleoclimatology and into the realm of the GCMs – the ultimate basis for all faith in AGW. In which case I would refer you to lucia’s “The Blackboard” (link at left).

Steve, the L, P, and _ I think is supposed to be L, P, and B, which matches up with the scientific classifications in the H&S paper for subfossil larch (Larix sibirica Ldb.); and from spruce (Picea obovata Ldb.) and birch (Betula tortuosa Ldb.).

Just what does “temperature-limited” mean? I suppose that the Yamal trees are considered “temperature-limited” because they survive in a far northern climate. However, the summer temperatures, when growth occurs, are above freezing. It is the winter cold that prevents trees from propagating, isn’t it? More particularly, it would seem to me to be the average temperature, which is integrated over the year by the ground, that affects tree survival by affecting root activity.

So if tree survival is affected by average temperature but growth is limited by summer temperature and perhaps precipitation and other factors, how are the confounding variables that affect growth, but not directly survival, un-entangled?

I hope that this is a sensible question.

This actually is a good question. What shapes treeline is a matter of investigation. Nobody knows for sure. Are growth and reproduction limited by the same factor? Probably not. In the treeline condition (alpine or tundra) growing season length (e.g. number of frost-free days) is the primary factor limiting growth. Not just air temperature, but soil temperature.

Note: It is quite possible Yamal larch are temperature limited. That doesn’t mean the uptick is evidence of climatic release.

[…] Two posts ago, I observed that the number of cores used in the most recent portion of the Yamal archive at CRU was implausibly low. There were only 10 cores in 1990 versus 65 cores in 1990 in the Polar Urals archive and 110 cores in the Avam-Taymir archive. These cores were picked from a larger population – measurements from the larger population remain unavailable. One post ago, I observed that Briffa had supplemented the Taymir data set (which had a pronounced 20th century divergence problem) not just with the Sidorova et al 2007 data from Avam referenced in Briffa et al 2008, but with a Schweingruber data set from Balschaya Kamenka (russ124w), also located over 400 km from Taymir. […]

[…] Two posts ago, I observed that the number of cores used in the most recent portion of the Yamal archive at CRU was implausibly low. There were only 10 cores in 1990 versus 65 cores in 1990 in the Polar Urals archive and 110 cores in the Avam-Taimyr archive. These cores were picked from a larger population – measurements from the larger population remain unavailable. One post ago, I observed that Briffa had supplemented the Taimyr data set (which had a pronounced 20th century divergence problem) not just with the Sidorova et al 2007 data from Avam referenced in Briffa et al 2008, but with a Schweingruber data set from Balschaya Kamenka (russ124w), also located over 400 km from Taimyr. […]

[…] I’ve made MANY references to Hantemirov and Shiyatov 2002 in my posts on Yamal. In my first post on Yamal after getting access to the data, I discussed the Hantemirov and Shiyatov 2002 reconstruction as archived at NCDC see http://www.climateaudit.org/?p=7142 […]

[…] at a dataset that had been analysed and reported by the U.K. climate scientist Keith Briffa. In his first whack, McIntyre accused Briffa of withholding data, and suggested that Briffa was trying to hide the fact […]

[…] tree (black: original data) and red: adjusted in the attempt to eliminate bias due to age of tree). Steve McIntyre said “I think that some information can be gleaned from the nomenclature of the ID […]

[…] McIntyre found was astonishing: Briffa's "hockey stick" was created by using data from only 10 trees in 1990 and 5 trees in 1995-1996. Given that tree ring growth can be affected by non-climate factors, such as if a nearby tree is […]

[…] Briffa disclose his work. Ironically, it was a journal editor who enforced the rules. As McIntyre wrote, I am very grateful to the editors of Phil Trans B (Roy Soc) – at long last, a journal editor […]