5 Niwot Ridge Chronologies

As I mentioned earlier today, Niwot Ridge is about a 45-minute drive from UCAR headquarters in Boulder (which hosts the IPCC WG1 TSU). I’ve identified 5 archived chronologies from Niwot Ridge, including two chronologies discussed in Kienast and Luxmoore 1988 as contradicting Graybill’s claims of enhanced post-1950 growth. Graumlich 1991 distinguished the two studies on the basis of altitude, but I pointed out that, at Niwot Ridge, the Graybill PIFL site was actually lower in altitude than the Kienast and Luxmoore sites (both PCEN). I also pointed out that Woodhouse did a PIFL study at the same altitude as Graybill, a few miles away. Anyway, here are the 5 tree ring width chronologies (there’s also an MXD chronology, not shown so far).

The differences between the site chronologies are obviously quite remarkable. The Kienast and Schweingruber site chronologies do not appear to have been taken more than a few hundred yards from the Graybill chronology, although the species (PCEN, Engelmann spruce) differs from Graybill’s PIFL (limber pine). However, the Woodhouse site is exactly the same species at the same altitude, from a few miles away.

Here’s something amusing re MBH. The Woodhouse and Graybill site chronologies (both PIFL, both at the same altitude) are both used in MBH (for some reason, the Schweingruber and Kienast chronologies are not used, although many Schweingruber chronologies are). What are the weights of the two sites in the MBH98 PC1?

Site               EOF1 Weight
Graybill co545     0.2299
Woodhouse co511    0.0025

So the HS-shaped series has a weight nearly 100 times larger than the non-HS-shaped series, even though these are series of the same species within a couple of miles of one another. TCO has been grinding away at why I regard this weighting as erroneous, what the correct weights should be, and whether the “real” problem is the principal components methodology or the proxies.

I find it very difficult to be a judge at that particular beauty contest. How can one look at such disparate chronologies and come away with any view as to what the “correct” weighting is? How can the Graybill chronology on the one hand and the Woodhouse/Kienast/Schweingruber chronologies on the other hand both be meaningful “proxies”? Yes, the HS-shaped series is given 100 times the weight of the other series, but this is not just an error in geographical weighting. There’s something wrong with the method itself, as we’ve said from the beginning: it’s not merely that the geographic weights are wrong, but that its picking is severely biased.
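The mechanism is easy to demonstrate. Here is a toy sketch (my own illustration, not Mann’s code): two synthetic series constructed to have identical full-period variance, one HS-shaped and one cyclical. Short-centering on a late “calibration” window hands the HS series several times the apparent variance, and a PC1 loads on whatever has the most variance about the short mean.

```python
import numpy as np

years = np.arange(1400, 1981)
cal = years >= 1902                        # MBH-style calibration window

# Two toy chronologies, normalized so their full-period variance is identical
hs = np.where(cal, 1.0, 0.0)               # hockey stick: flat, then a late step up
hs = (hs - hs.mean()) / hs.std()
cyc = np.sin(2 * np.pi * (years - 1400) / 60.0)   # trendless 60-year cycle
cyc = (cyc - cyc.mean()) / cyc.std()

def ss_about(x, mask=None):
    # sum of squares about the full-period mean, or about a sub-window mean
    m = x.mean() if mask is None else x[mask].mean()
    return ((x - m) ** 2).sum()

# Conventional centering sees the two series as equally "energetic"...
print(ss_about(hs), ss_about(cyc))
# ...but short-centering hands the HS series several times the apparent
# variance, which is what a short-centered PC1 latches onto.
print(ss_about(hs, cal) / ss_about(cyc, cal))
```

On these toy numbers the short-centered sum of squares for the HS series is several times that of the cycle, even though the two are indistinguishable under ordinary centering.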

There’s also something fundamentally wrong with the confidence intervals claimed for tree ring chronologies. The dendro trade uses statistics called the Subsample Signal Strength and the Expressed Population Signal; under the circumstances of any of these sites, the dendro people would claim that their signal is accurate to within very small percentages – but such claims are clearly irreconcilable with the big divergences between nearby series – a completely different Divergence Problem than the one the NAS Panel has in mind, but a big one nevertheless.
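For readers unfamiliar with the statistic: the EPS formula of Wigley et al. 1984 is simple enough to compute by hand. A quick sketch (my numbers are illustrative, not from any of these sites) shows how easy the claimed confidence is to come by – and why it says nothing about whether two nearby sites will agree, since it only measures coherence within a site:

```python
def eps(r_bar, n_trees):
    # Expressed Population Signal (Wigley et al. 1984): how closely the
    # mean of n_trees cores is expected to track the (hypothetical)
    # population chronology, given mean inter-core correlation r_bar.
    return n_trees * r_bar / (1 + (n_trees - 1) * r_bar)

# Even modest inter-core correlation clears the conventional 0.85 cutoff:
print(round(eps(0.3, 20), 3))   # 0.896
```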

Another issue that I’m beginning to wonder about: how solid is the Graybill data anyway? The discrepancy between the HS-shaped Graybill series and the very different appearing Woodhouse, Kienast and Schweingruber series is worrying, although the Hansen-Bristow series has points in common with Graybill. There’s a lot of weight being placed on Graybill’s data – don’t you think that someone within the past 20 years should have replicated his results? (Oh, wait a minute, Hughes did resample Sheep Mountain in 2002 – it’s just that he hasn’t reported the results.) Tomorrow or the next day, I’m going to post up some worrying aspects to the Jacoby-D’Arrigo series at Churchill, Manitoba.

Update (Saturday):
Here is a figure showing the Graybill and Woodhouse PIFL chronologies on the same scale. Their correlation is 0.46. Graybill said that he was particularly looking for strip bark trees. Maybe that accounts for the difference. But the difference in 20th century HS-ness between samples taken by two different researchers on the same trees at the same elevation within a couple of miles of one another is pretty amazing and needs to be reconciled.

62 Comments

“So the HS-shaped series has a weight nearly 100 times larger than the non-HS shaped series, even though these are series of the same species within a couple of miles of one another.”

A possibly semi-exonerating thought occurred to me on reading that. Is it possible that Mann incorporated hockey-stick-seeking into his algorithm so as to automate a kind of instrumental-record temperature reconciler?

That is, hockey-stick tree-ring proxies have a terminal up-tick similar to what the instrumental surface record shows for the 20th century. That makes them proper responders to known temperature trends. And so, given the ‘linear-responder-now-ergo-linear-responder-forever-prior’ assumption, that in turn makes them appropriate fodder for a reconstruction. In that case, an algorithm that mines for hockey sticks is merely seeking proxy series that are ‘positive responders.’ Jacoby-D’Arrigo cherry-picking becomes automated.

If Mann did that, then all he did was automate what dendroclimatologists have been doing by eye all along anyway. Physicists specialize in that sort of thing. That would make Mann innocent of any conscious wrong-doing. It would just mean that he went blindly ahead accepting the assumptions that have governed the field up until now. The statistics indicating spurious significance could have been put aside because they disagreed with standard practice. That’s a slippery slope, but not unprecedented. No one expected the Canadian Inquisition. Maybe they didn’t even know they were gravely magicking.

If you hadn’t happened along, it might have been years, and billion$ more, before anyone came along and caught the error. Which leads me to ask – why do you think it is that no independent statisticians have ever apparently interested themselves in the climate proxy field?

And realizing that you’re a critical chorus of one makes me nervous. I hesitate to ask here in public, but have you gotten, umm, let’s see, any of the sort of email attentions, say, abortion-providers might get in North Dakota?

Steve, you’ve lost me here. What are the “weights” you are talking about? Are the figures in the table the component loadings on PC1? If that’s the case it’s an indictment of the PCA methodology that should satisfy TCO (no guarantee offered).

But then you start talking about “geographical weights” which suggests some sort of weighting post-PCA, but that can’t be right because we wouldn’t weight individual series at that stage.

Nope, hopelessly confused. Could you clarify what you mean by “weights” and specifically “geographical weights”?

James, I think you are right in thinking those weights are loadings on the PC1. I believe the point he is trying to make regarding geographical weights is that the weights do not seem to map well to the geographical significance of the proxies. The weights are therefore evidence of poor geospatial mapping, and thus the reconstruction is not a proper representation of global “climate” signals.

In other words, if the weightings were geographically correct, those two proxies should get roughly the same weight, since they are roughly equally geographically representative of the area they are in.

Pat, I think most of us believe Mann at least partially *accidentally* came up with the hockey-stick-mining method. Like others he was just trying to “cherry pick temperature sensitive proxies”. Unfortunately, what that ended up doing is cherry picking hockey sticks. I believe it is a fundamental flaw that he is using *global* temperature data to look for correlation with *locally* representative proxies. That’s never going to work, as Steve and Ross have shown.

What he should be doing is looking for correlation between local temperature and local proxies, and then geographically weighting the results to get a global reconstruction. Of course, that’s making a bunch of (poor) assumptions such as that tree ring width is linearly correlated with temperature at all. But I believe that method would be less wrong than what he did. Of course if he did that the “bristlecones” would go right out the window.
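A minimal sketch of that suggestion (my own toy code, with made-up sites and a hypothetical `min_r` screening threshold – not any published method): calibrate each proxy against its local temperature, then combine the survivors with cosine-of-latitude area weights.

```python
import numpy as np

def cosine_area_weights(lats_deg):
    # Grid-cell area on a sphere shrinks as cos(latitude), so an
    # area-fair global average weights each local series accordingly.
    w = np.cos(np.radians(np.asarray(lats_deg, dtype=float)))
    return w / w.sum()

def local_then_global(proxies, local_temps, lats_deg, min_r=0.3):
    # Screen each proxy against its *local* temperature record, then
    # combine the survivors (standardized) with area weights.
    weights = cosine_area_weights(lats_deg)
    keep = []
    for p, t, w in zip(proxies, local_temps, weights):
        r = np.corrcoef(p, t)[0, 1]
        if abs(r) >= min_r:
            keep.append((w, (p - p.mean()) / p.std()))
    total = sum(w for w, _ in keep)
    return sum(w * z for w, z in keep) / total

# Hypothetical example: one site tracks its local temperature, one doesn't
rng = np.random.default_rng(1)
t1 = rng.normal(size=100); p1 = t1 + 0.3 * rng.normal(size=100)
t2 = rng.normal(size=100); p2 = rng.normal(size=100)
recon = local_then_global([p1, p2], [t1, t2], lats_deg=[40.0, 40.05])
```

Even this sketch inherits the caveats above: screening on correlation is itself a selection step, and the linear-response assumption is doing all the work.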

To clarify, the quoted weights are the weights from the first eigenvector in the MBH98 PC1 (thus the weights in the PC1).

The reference to geographic weights was only a reference to TCO’s concern that the “real” problem was the failure to achieve geographic weights – assuming that I’m characterizing a lengthy line of questioning fairly. Because, in this case, we have been able to eliminate all kinds of co-varying factors and have two chronologies of identical species from no more than a couple of miles from one another (and ironically close to UCAR world headquarters), with one series weighted 100 times more than the other through the Mannian algorithm, the first thing that you can say is that the method makes no sense.

Left to my own devices, it would never have occurred to me to say that the problem was that the method was a poor estimate of geographic weights, simply because the bias in the method seems like a more important issue. It’s not that the method is overweighting some geographic locations, e.g. 200 yards to the east of the NTER ecological station as opposed to 1 mile to the south of the NTER ecological station. It’s that it’s overweighting HS-shaped series.

The other interesting question arising from this is surely: what’s going on with the Graybill series? Can they be replicated on the ground? There is a pretty astonishing difference between the Graybill and Woodhouse chronologies – maybe someone from UCAR should put down their lattes, locate the Graybill and Woodhouse trees, re-core them and reconcile the differences.

Doug, here are the correlations. Woodhouse is co511; Graybill is co545. The correlation is surprisingly high (0.46) given the visuals. I’ve re-plotted the two series on the same graphic in an update above. Some of the downspikes are common to the two series, which would probably yield much of the correlation. The discrepancy is obviously in the HS-ness of the Graybill series, which might be due to strip-bark forms in Graybill. I’m going to post up some other interesting discrepancies over the weekend.
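The “correlation from common downspikes” point is easy to illustrate with synthetic data (a toy sketch, not the actual chronologies): two noise series that share only a handful of narrow downspike years, with a late ramp added to one of them, yield a correlation in this general range that largely evaporates once the spike years are removed.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 400
spikes = rng.choice(n, size=25, replace=False)
common = np.zeros(n)
common[spikes] = -4.0          # narrow shared downspikes (e.g. severe drought years)

a = common + rng.normal(size=n)          # "Woodhouse-like": no 20th-century ramp
b = common + rng.normal(size=n)
b[-80:] += np.linspace(0.0, 2.5, 80)     # "Graybill-like": late HS ramp

r_all = np.corrcoef(a, b)[0, 1]
mask = np.ones(n, bool)
mask[spikes] = False
r_no_spikes = np.corrcoef(a[mask], b[mask])[0, 1]  # spike years removed
print(round(r_all, 2), round(r_no_spikes, 2))
```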

#6. The weights are not regression-determined; they are determined from the PC algorithm. I think that we’ve already proved that Mann’s algorithm data mines – this is just a pretty example where it’s very difficult for them to argue some irrelevant reason as to why the weights should be as they are.

#3 Thanks for the comment Nicholas, you’re right. Choosing out local proxies that follow global average temperatures is asking for trouble. I should have thought of that.

Steve, visually it looks like Woodhouse and Graybill correlate very well prior to 1900. Is that the case? Maybe that’s why their total correlation is so high (typo alert in #10: 0.46, not 0.56). It looks like something unique happened to Graybill’s trees, but only after about 1900. I wonder: When was the NTER ecological station built there, is it close to one of the sampling areas, and where does their effluent go?

#10. For what it is worth, the average correlation in your table is 0.259 and the average r^2 is 0.076, so not a whole lot of variance in common. Throwing the local temperature into the mix would likely lower the variance even more.

My Big Picture thought: you’re providing incredible evidence that Science needs to return to first principles. Bacon (inventor of scientific method and a few other helpful bits) demonstrated a wonderful understanding that _humility_ is needed in our approach to scientific understanding.

Somehow, we’ve come to the point where many folks imagine that with just a little more digging, we can create the Ultimate Formula That Describes Everything. Your work reminds us that it’s a ton of effort to analyze much of anything in a valid, unbiased way.

Thanks.

I hope you can eventually create a category (and postings) with Big Picture reflections of your own. Since the beginning, careful real science has popped a lot of proud balloons.

–MrPete

PS This whole controversy has piqued my interest in the Medieval Warm Period. I’ve seen recent publications denying its existence on the basis that it is inconsistent with published (i.e. MBH98) data. I’m curious about any independent literature relating to glacial shrinkage, sociocultural impacts or other visible effects, particularly as it might relate to informing current discussion of impacts and policy decisionmaking. (e.g., we see TV specials on loss of walrus-hunting habitat today. Is that truly a unique event in millenial history?)

1. Can’t follow the figure caption – which series is which?
2. Can you tell how well trees within a series agree? Get a feel for how many trees need to be sampled to really get the average behavior over several years?
3. With the two series that are same species and a few miles apart and same altitude, can you see if the trees within given series have better correlation than from series to series?
4. I’m not grinding you. This is nothing. The only reason that it seems like that is that I respond to the people who get mad at me asking tough questions (vice blowing them off) and that I don’t accept "answers to a different question".
5. Obviously you seem to think it is "incorrect weighting". So the amount needs to be calculated. Even leaving aside the idea of "correct weighting", we can compare weighting from one method to another (with numbers).
6. My objection to the “geo weighting” explanation was that I asked for a comparison of averaged samples to PCA and was given a response which had geo weighting inserted.
7. Similarly, is this effect (above) more due to using PCA in general or to your discovered oddity (the off-centering mining)? One shouldn’t conflate the two, because doing so would overstate the importance of your discovery if it is more a PCA effect in itself.
8. Why PC1!? Sheesh. What matters is the reconstruction. Heck, maybe the PC2 reverses the weighting imbalance (since flipping can occur). It’s like saying, wow, look at the anisotropy of the 2pz orbital, while ignoring that this is just one of several orbitals.
9. "There’s also something fundamentally wrong with the confidence intervals claimed for tree ring chronologies. You see a statistic called the Subsample Signal Strength or Expressed Population Signal, where under the circumstances of any of these sites, the dendro people would claim that their signal is accurate to within very small percentages – but such claims are clearly unreconciliable with the big divergences between nearby series – a completely different Divergence Problem than the NAS Panel has in mind, but a big one nevertheless." Agreed that this is an important issue to drill down on.
10. "Another issue that I’m beginning to wonder about: how solid is the Graybill data anyway? The discrepancy between the HS-shaped Graybill series and the very different appearing Woodhouse, Kienast and Schweingruber series is worrying, although the Hansen-Bristow series has points in common with Graybill. There’s a lot of weight being placed on Graybill’s data – don’t you think that someone within the past 20 years should have replicated his results?" Can you mathematically say that the other series are similar to each other and that Graybill is most different of the lot? Otherwise, I think you should argue for updating all the series. Don’t want to get in the habit of only complaining wrt hockeystick series.

Steve: Quickly:
1. Down, columnwise.
2. There are local methods in the dendro trade, which generally claim very high confidence. But because of the differences between different samples, I think that the confidence estimates are wildly understated, but it’s a big topic, just in itself.
3. One can do the calcs, each site will be different. My impression is that results are being driven by outliers and non-normality, but it would take pretty detailed study to confirm. In some data sets (Tornetrask) the data already combines trees that are further apart than the Woodhouse, Graybill trees here. But before combining, I’d like to understand the differences.
4. I don’t mind the questions; I like questions. But sometimes I’m doing other things and sometimes you could subtract a little editorializing. Out of all the people in this wretched field, I really don’t try to evade questions.
5. Weighting one series 100 times more than the other is intuitively incorrect. As to what’s “correct” weighting, I don’t know. Let me observe, though: you don’t see “weighting” as a front and centre issue in Mann or von Storch, who don’t know what their weights are (or aren’t telling) because they’ve not worked through the linear algebra. While I had referred to weights in the past and had done some checking, I hadn’t systematically worked through the differences. It’s worthwhile doing, but it’s a medium-sized project. I have to do some new programming to cover the relevant cases. However, in order to respond to the Mannian no-PC response, I think that this is a worthwhile approach.
6-7. We’ve given the results of a simple average in our NAS panel presentation – no HS. There are, as you realize, several layers to the onion: the Mannian PC method relative to ordinary PC methods; PC methods versus unweighted averages or geo-weighted averages; robust versus non-robust (means versus medians). Each is a non-trivial layering. These inter-relate to the question of outliers and proxy validity.

Ordinary PCs don’t mine like the Mannian method – that’s why we didn’t get a HS reconstruction using 2 PCs. But the bristlecones have a HS shape and, if they or a lower-order PC representing them get into a Mannian multivariate stage, then you get a HS reconstruction.

8. Sheesh yourself. Our E&E article talks about reconstructions and the effect of these permutations.
10. I’m talking about reconciling, not updating. I’d like to see someone locate the Graybill trees and resample those trees. In the mineral exploration business, geologists provide accurate descriptions of sample locations so that subsequent people can locate where the samples were taken. This isn’t just an artifact of the GPS age: in lots of cases, people can re-locate samples from 50 years ago or more. People do go back and re-check geological samples. If you can’t verify the results, then you stop using them.

Pat, I agree, and I asked about this a while ago too. There are of course other issues that come up if the Mannites go with that explanation, including the series that are left out: since they are not contributing, one is overstating the included series in the recon. Also, the divergence effect seems to indicate that the HS ones are just luck in terms of recent correlation.

I agree that the numbering is confusing. People used to the conventions of English writing would typically number things across then down as opposed to down then across as in this case.

OTOH, I remember when I was a child being totally confused when trying to sing hymns in Sunday School until I suddenly realized that you skipped from the first line in the first block to the first line in the second block, etc. But I suppose that was an example of the same sort of thing as here.

Sorry about the numbering. I’m writing these things up pretty quickly. The plots come out in R from top to bottom in columns – so that’s how I numbered things.

Nothing in these plots has anything to do with PC weighting. The only reference to PCs is to the fact that the HS shaped series are heavily weighted in the MBH PC1.

I didn’t say that just the HS shaped series should be checked – I said that the discrepant results between the Graybill and Schweingruber-Kienast series should be reconciled. Not that that would mean that the reconciled ring width chronology is a thermometer – just that it would be a more accurate index of ring widths.

My query based on this would be: if you have an index of ring widths over time, what is the confidence interval on that index? Given the huge discrepancies between samples taken by different researchers, you’d have to see that the confidence intervals are from the floor to the ceiling, in Hegerl’s phrase at the NAS panel.
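One way to see floor-to-ceiling intervals emerge is a core-level bootstrap (my own sketch with made-up cores, not a method from the dendro literature): resample cores with replacement and rebuild the mean index each time, so the band reflects between-core disagreement rather than an assumed common signal.

```python
import numpy as np

def bootstrap_index_ci(cores, n_boot=2000, alpha=0.05, seed=0):
    # Resample cores (rows) with replacement and rebuild the mean index
    # each time; the spread of the rebuilt indices is a crude confidence
    # band driven by between-core disagreement.
    rng = np.random.default_rng(seed)
    cores = np.asarray(cores, dtype=float)
    n = cores.shape[0]
    idx = rng.integers(0, n, size=(n_boot, n))
    boot_means = cores[idx].mean(axis=1)          # shape (n_boot, n_years)
    lo, hi = np.quantile(boot_means, [alpha / 2, 1 - alpha / 2], axis=0)
    return cores.mean(axis=0), lo, hi

# Hypothetical cores: 8 that agree, 2 with a divergent late trend
rng = np.random.default_rng(3)
n_years = 200
agree = rng.normal(size=(8, n_years))
diverge = rng.normal(size=(2, n_years)); diverge[:, -50:] += 3.0
index, lo, hi = bootstrap_index_ci(np.vstack([agree, diverge]))
# The band is noticeably wider in the divergent late period
print((hi - lo)[-50:].mean() / (hi - lo)[:-50].mean())
```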

OK, time to weigh in lightly, as a generalist with broad/deep experience in computer/data/communication applications, from micro to macro, embedded to shrinkwrap.

Once upon a time I did some work on Y2K. [Was asked the “impossible” question: MrPete, what’s really going to happen? Worldwide $$$/etc are riding on your answer… Let’s just say I never have had reason to retract my extensive and detailed semi-private report, nor to be embarrassed about it.]

Lots of lessons learned. One lesson: truth is neither democratic, nor a respecter of consistency in answers (particularly if one takes the Texas Sharpshooter Fallacy into consideration.)

Steve, just as your energy-promo hype-o-meter was raised as you examined the AGW reports, so too my Y2k scare-o-meter is beginning to go off.

* A certain amount of truth in the topic

* Serious real cost/benefit challenges

* Early, relatively small concerns mushroom via socio-political paths

* Scientific/engineering inquiry ends up radically biased by socio-political agendas: too-quick conclusions lead to blind confirmation of presumed reality. [Cautionary note: while inquiry needs to be neutral, policy/products/etc do need to be influenced by our values/virtues!]

I find it of immense significance that there’s such huge randomness in the data, and that the reporting to date produces results with supposed significance even when based on 100% random data.

Important challenges [I submit AGW is “important”] require robust data processing systems. Robust systems respond appropriately to fallacious input, and are built in recognition of basic realities, such as:

a) The data may be in error

b) The system designer may have made a mistake

c) The process itself may have failed [not same as (b). Stored data degrades, etc.]

These issues do not presume malicious/biased intent… yet addressing these can also address intentional breakdowns.

Steve, I sense from many of your statements that you are intimately aware of these things based on professional experience.

I’m beginning to suspect we’re looking at an entire field of scientific research that needs an audit of its basic operating principles at every stage, from data collection to publishing and interpretation.

Questions I’d love to ask:

1) How is accuracy of collected data verified and maintained (both raw and in early processing stages)?

2) What means are used to QA the methodologies at every stage of analysis? Just as software engineers tend to only test the anticipated user actions, so too info managers tend to only test processing paths for anticipated data inputs.

3) Why am I not seeing data-“controls” in AGW analyses and reports? More and more of what I read is discovery of potential correlation, with little analysis of whether random, alternate or external sources could be producing the observed effects.

4) What is the basis of published confidence/error levels? Are the processes regularly tested with a variety of random/ controlled-noise data?

I see the various “cherry picking”, “time extension” and other selective data manipulations as the equivalent of software bug workarounds: these indicate the underlying system (of data collection, analysis and more) is broken and/or misunderstood.

My guess: pride prevents honest evaluation of confidence levels. I’d be very curious to see a single careful work-through of the precision and accuracy of the data at every step, for _any_ of these AGW reports, let alone all of them.

We learned about such things in high school. Too many scientists in the “soft” sciences bypass such rigors simply because it seems too difficult. We can’t afford that.

Let’s start with identifying the confidence of tree ring data collection. How many cores, from how wide a “local area”, over what time period/time-of-day/???, by how many different individual collectors, yada yada yada… are required to give a good combined data set?

There MUST be several good studies of this in the literature, right? They’re not doing purely academic/curiosity studies after all — real money’s being spent on the outcomes. [Is it too extreme to compare this to the difference between observing the cool colors of oil-in-rainwater-on-the road vs producing reliable oilfield maps that will be used for major investment decisions?]

A conversation with a pharmaceuticals QA expert today shed a bit of light on one of their operating principles that may be applicable:

* Elements of process that _can_ be quantitatively measured, _are_ measured, and regularly recalibrated and proven to be valid. Particularly the inputs and outputs to the overall production system.

* Other process elements can’t be easily calibrated or analyzed for precision / accuracy (e.g. the middle of an ongoing chemical conversion process). At worst, those are simply observed with reasonable care at reasonable intervals.

When it comes to production, they know the value of careful/provable monitoring… and the danger of letting it slide.

2: It’s not just a wiggle-matching thing but an issue of how we sample for time series versus iid. A time series has more info than just the single y value. It has a value for each time point. I would think that this concept must come up in econ or something: How many stocks do you have to pick to track the market for a given degree of accuracy, etc.
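There is a standard back-of-envelope for that point: under an AR(1) model, autocorrelation shrinks the information content of a time series, and the usual large-sample approximation for the effective number of independent observations is n(1 − ρ)/(1 + ρ). A quick sketch (illustrative numbers only):

```python
def n_eff_ar1(n, rho):
    # Effective number of independent observations in an AR(1) series
    # (standard large-n approximation): autocorrelation means n annual
    # values carry less information than n independent draws.
    return n * (1 - rho) / (1 + rho)

# 581 annual values with lag-1 autocorrelation 0.7 carry roughly the
# information of ~100 independent observations:
print(round(n_eff_ar1(581, 0.7)))
```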

7. Yes, I agree there is a difference between Mannian PC and normal PC – well, I think. Just trying to keep track of when an effect results from that and when from just being PC of either type. Sorry to “ask the question twice”, but sometimes one can shake a too-confident assertion that is not correct or not addressing the asked question by re-querying. Anyhow, you’ve said that the differing weightings are a result of off-centering vice just a problem with PCA in general. I’m cool…

on 8: Yes, I know that you reported the effect with two PCs in EE. My point is that looking at only PC1 will not give you the right answer for the difference in weightings of the two series, since you have more PCs (at least one more) to consider in the final reconstruction, and by their nature the PCs tend to segregate. So just looking at one is the wrong way to approach this problem. It has nothing to do with the EE papers or what you’ve done or haven’t done or how smart you are; it’s just a point about this analysis. Mind you, I’m still glad that you did the comparison. Just pointing out the problem with looking at PC1 only. I mean, what if the PC1 had been 1:1 on the two series, but PC2 had a 2000:1 difference? Ross has already pointed out that, mathematically, in the Mannomatic the different PCs get the same weighting. So being PC1 versus PC2 is irrelevant; only relevant is how many you include (the Preisendorfer kerfuffle). But regardless, looking at just PC1 is going to give the wrong idea of the amount of skew going on.
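The arithmetic behind that point fits in a few lines (hypothetical loadings and coefficients, not the actual MBH values): if the retained PCs enter the later multivariate step with comparable coefficients, a site’s net weight is the loading-weighted sum across PCs, and the PC1 ratio alone can badly misstate it.

```python
import numpy as np

# Hypothetical loadings for two sites on two retained PCs:
# PC1 heavily favours site A, PC2 favours site B.
loadings = np.array([[0.23,   0.01],    # site A on (PC1, PC2)
                     [0.0025, 0.40]])   # site B on (PC1, PC2)
coef = np.array([1.0, 1.0])             # per-PC coefficients from the later
                                        # multivariate step (equal, for argument)

pc1_ratio = loadings[0, 0] / loadings[1, 0]   # ~92:1 looking at PC1 alone
total = loadings @ coef                        # net weight of each site
print(pc1_ratio, total[0] / total[1])          # the net ratio drops below 1:1
```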

There MUST be several good studies of this in the literature, right? They’re not doing purely academic/curiosity studies after all — real money’s being spent on the outcomes. [Is it too extreme to compare this to the difference between observing the cool colors of oil-in-rainwater-on-the road vs producing reliable oilfield maps that will be used for major investment decisions?]

That’s what I thought; but amazingly and unfortunately, I don’t think these studies exist, or someone would have mentioned them here by now.

Jae: there are some foundational studies. It shows ignorance of the field to say there are none. And you haven’t looked, have you?

Whether they are adequate is certainly in debate, and I don’t think they are sufficient; it is a shame to build such a house of cards on a poor foundation. Note that the dendro professionals (cf. Rob Wilson) don’t care about these things sufficiently. I cringed at his simple-minded description of the Swiss putting CO2 in trees. Come on Rob! Try to assess things with some depth and some thoughtfulness.

To be pedantic, a house of cards does not have a foundation to which it is attached.

The whole multiproxy paradigm could be fairly described as a house of cards. Not only are the multiproxies not independent, they rely on assumptions about the underlying data that cannot be justified, and use statistical techniques in a way that could only be described as mistaken. This isn’t the first time that linear techniques have been used in a non-linear setting, but it is by far the most visible example.

It’s only eight years ago that the original Hockey Stick was wheeled out as the brave new frontier of statistical analysis. How much money and time has been wasted because of it?

Successful applications of dendroclimatology and dendroecology depend upon careful stratification. Ring-width samples are selected from trees on limiting sites, where widths of growth layers vary greatly from one year to the next and autocorrelation of the widths is not high. Rings also must be cross-dated and sufficiently replicated to provide precise dating. This selection and dating assures that the climatic information common to all trees is large and properly placed in time. The random error or nonclimatic variations in growth among trees is reduced when ring-width indices are averaged for many trees. Some basic facts about the growth are presented along with a discussion of important physiological processes operating throughout the roots, stems, and leaves. Certain gradients associated with tree height, cambial age, and physiological activity control the size of the growth layers as they vary throughout the tree. These biological gradients interact with environmental variables and complicate the task of modeling the relationships linking growth with environment. Biological models are described for the relationships between variations in ring widths from conifers on arid sites, and variations in temperature and precipitation. These climatic factors may influence the tree at any time. Conditions preceding the growing season sometimes have a greater influence on ring width than conditions during the growing season, and the relative effects of these factors on growth vary with latitude, altitude, and differences in factors of the site. The effects of some climatic factors on growth are negligible during certain times of the year, but important at other times. Climatic factors are sometimes directly related to growth and at other times are inversely related to growth. Statistical methods are described for ascertaining these differences in the climatic response of trees from different sites.
A practical example is given of a tree-ring study, and the mechanics are described for stratification and selection of tree-ring materials, for laboratory preparation, for cross-dating, and for computer processing. Several methods for calibration of the ring-width data with climatic variation are described. Several examples of applications of tree-ring analysis to problems of environment and climate are described. Other methods of comparing present climate with past climate are described along with new developments in reconstructing past hydrologic conditions from tree rings. – Copyright 1972, Biological Abstracts, Inc.

Abstract Tree-ring series from living trees near the timberline or timbers buried in the surroundings are exceptionally valuable both for climate reconstruction and investigations of the consequences of climate change to ecosystems. This paper is a critical assessment of the past and potential contributions of dendroecology and dendroclimatology in mountain environments. Problems addressed are the spatial variability of both climate and tree sites, the temporal variability of ecological growth conditions and the reconstruction of signals other than high frequency ones. A synoptic approach appears to be the only way to take into account both the spatial and temporal variability of tree-growth, allowing for a better comparison of spatial climatological patterns with spatial growth patterns.

TCO, I will review these references, but I’ll bet they do not provide the “foundations” that I’m looking for, which are mainly biological foundations, like the relationship between temperature and growth when all the other variables are held constant. Like, how can you be sure that a change in growth rate is due to a change in temperature and not some other variable? The abstracts suggest to me that the publications deal with methodologies, not basics. They also suggest to me (maybe due to my bias) that there are so many variables that there is generally no way to discern and attribute a temperature signal from the soup. And averaging the growth rates of several trees, each of which may have been affected by a different variable, sure doesn’t clear things up. That’s why these guys have to cherry-pick to show any type of relationship, IMO. BTW, I have a background in forestry and wood science, and I do understand some of what I speak.

This was what the original poster talked about. Sampling methodologies. You are looking for something else. Here is the direct quote from Pete:

Let’s start with identifying the confidence of tree ring data collection. How many cores, from how wide a “local area”, over what time period/time-of-day/???, by how many different individual collectors, yada yada yada… are required to give a good combined data set?

There MUST be several good studies of this in the literature, right?

However, there are papers on the botany as well. They may not satisfy you in terms of rigor (but they exist) and you ought to read them before saying that they don’t even exist. You’re a little free and loose and wild sometimes.

Will post up some refs for you. (Ultimate tree ring page is a good start.)

Note that botany AND statistics are covered. If you want to criticize them fine…after reading them. But don’t be an idiot and say there is no attempt to even write foundational papers.

The basic botanical processes governing tree ring formation are covered in Chapters 1-5. Chapter 6 deals with some of the simple statistics and what they reveal about the tree response to environmental and physiological variables. This includes a very basic discussion of matrix algebra, eigenvectors and principal components as used in the early works of tree ring analysis.
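Since the chapter touches on eigenvectors and principal components, here is a rough toy sketch of why a single high-variance hockey-stick-shaped series can come to dominate PC1 loadings. Everything below is invented for illustration (synthetic series, conventional full-period centring, plain power iteration); it is not MBH98’s actual short-centred algorithm.

```python
import random

random.seed(0)

def pc1_weights(series):
    """Leading-eigenvector (PC1) loadings of the covariance matrix,
    computed by plain power iteration. Series are centred on their
    full-period means (conventional centring)."""
    n, m = len(series[0]), len(series)
    centred = [[x - sum(s) / n for x in s] for s in series]
    # m x m covariance matrix of the centred series
    cov = [[sum(a[k] * b[k] for k in range(n)) / (n - 1) for b in centred]
           for a in centred]
    v = [1.0] * m
    for _ in range(200):
        v = [sum(cov[i][j] * v[j] for j in range(m)) for i in range(m)]
        norm = sum(x * x for x in v) ** 0.5
        v = [x / norm for x in v]
    return v

n = 100
# One synthetic "hockey stick" chronology with a late ramp, plus two
# pure-noise chronologies at the same noise level.
hs    = [random.gauss(0, 0.2) + (0.1 * (t - 80) if t > 80 else 0.0) for t in range(n)]
flat1 = [random.gauss(0, 0.2) for _ in range(n)]
flat2 = [random.gauss(0, 0.2) for _ in range(n)]

weights = pc1_weights([hs, flat1, flat2])
print([round(x, 3) for x in weights])
```

Because PC1 of a covariance matrix chases variance, the ramped series grabs most of the loading even under conventional centring; short-centring on the calibration period, as discussed elsewhere on this site, makes the effect worse.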

“Climatic factors are sometimes directly related to growth and at other times are inversely related to growth. Statistical methods are described for ascertaining these differences in the climatic response of trees from different sites.”

Sure ’nuff …

However, I’m not sure if either of the cited resources were what jae was looking for, which was an analysis of robustness of the underlying assumptions of dendroclimatology.

For example, before we can look at the “spatial variability of both climate and tree sites, the temporal variability of ecological growth conditions and the reconstruction of signals other than high frequency ones” as done in the second paper, we first need to determine if the dang system works … both of these papers start with the assumption that dendroclimatology is a valid science, and talk of how to improve it … but is it really a valid science?

That is the question that (I believe) jae was asking, and that these papers don’t answer. Me, I find the idea to be very tenuous. The problem is the “upside-down quadratic” response to the variables.

Imagine, for example, that we have a hundred sensors that each have a different upside-down quadratic response to two variables, temperature and moisture, plus being affected by other variables (CO2 levels, soil nutrients, nitrogen loading, etc.). What are our odds of extracting an accurate temperature signal from these sensors?

Me, I’d say the odds were very low … and like jae, I have not seen any papers addressing this question. All of the papers I’ve seen assume that dendroclimatology is possible, and start from there. I am by no means convinced that it is possible at all.
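The hundred-sensor thought experiment above is easy to run as a toy simulation. The response curves and all the parameters below are entirely invented, chosen only to illustrate the shape of the problem: a sensor whose growth peaks at its own optimum correlates positively with temperature if its optimum sits above the usual temperature range and negatively if it sits below, so a population of such sensors yields a spread of correlations of both signs.

```python
import random

random.seed(1)

def growth(temp, moisture, t_opt, m_opt):
    """Toy 'upside-down quadratic' response: output peaks at the sensor's
    own temperature and moisture optima and falls off on either side."""
    return max(0.0, 1 - (temp - t_opt) ** 2) * max(0.0, 1 - (moisture - m_opt) ** 2)

def corr(x, y):
    """Pearson correlation, stdlib only."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (sx * sy)

years = 200
temp     = [random.gauss(0, 0.5) for _ in range(years)]
moisture = [random.gauss(0, 0.5) for _ in range(years)]

# 100 sensors, each with its own (unknown to the analyst) optima
correlations = []
for _ in range(100):
    t_opt = random.uniform(-1, 1)
    m_opt = random.uniform(-1, 1)
    rings = [growth(t, m, t_opt, m_opt) for t, m in zip(temp, moisture)]
    correlations.append(corr(rings, temp))

print("min r = %.2f, max r = %.2f" % (min(correlations), max(correlations)))
```

Under these assumptions the same temperature history produces sensors that correlate positively, negatively, or not at all with temperature, which is exactly the screening temptation.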

Thanks for the starting refs, TCO. I was initially excited to discover a standard data QA toolkit, at this site.

My initial take:

a) Much work has been done to develop rigorous methods for what is in essence a very “loose” real-world data challenge.

b) I found it easy to see areas of weakness even in the academic definitions. (Examples: no suggestion of standardizing core sample height above ground; no suggestion of multiple measures per core. What if the measurement methodology and/or instrumentation is unreliable or introduces its own bias?) [NOTE: I have some expertise in real-world data collection/mgmt, indirect expertise in botany (via my spouse), and no expertise in dendro other than staring at the huge Colorado trees outside my home ;)]

c) Somewhere between zero and almost-nil researchers actually follow the recommended procedures. A random sample from the global database showed a number of sites where a single tree was cored a single time. Few sites contain the recommended multiple-samples from multiple-trees.

d1) They seem happy with data set correlations below 0.5 in many cases. OK, that’s as good as the data may get… yet rather than admit “we have bogus data that contains minimally useful real world information”, they simply smooth out the “noise”, ASSUME there’s a real signal in there (“proven” statistically… sorry, stats are not the real world and this profession appears to have forgotten that lesson…)… and plow forward toward a conclusion of some kind.

d2) What if a core presents itself as an outlier? Recommended action: delete the data samples (within a single core) that don’t “fit” and keep going. In my arena, we would never accept such a methodology. Either we remeasure, or we accept that our data is not going to be publishable, or we eliminate the complete data set from consideration.

Perhaps this is where dendro folks obtain the chutzpa to tell Steve M that they have the “right” to cherry pick data. They basically seem to presume their assumptions are correct, and that any data that doesn’t fit (i.e. is not 95 pctile) can simply be removed from consideration.

“The dendrochronologist must select sites that will maximize the environmental signal being investigated.” — soooo easily misconstrued! A slippery slope from maximum signal _utility_ to maximum signal _measurements_ (i.e. is it “valid temperature” or “increasing temperature”?!!!)

“The Principle of Replication states that the environmental signal being investigated can be maximized, and the amount of “noise” minimized, by sampling more than one stem radius per tree, and more than one tree per site.” — note “can be”, as if noise, bias and error detection were optional!
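As a quick arithmetic aside on point (d1) above: a correlation of r between a chronology and temperature implies only r² of shared variance, so the sub-0.5 correlations being accepted leave more than three quarters of the variance unexplained.

```python
def variance_explained(r):
    """Fraction of variance shared between two series with Pearson correlation r."""
    return r * r

for r in (0.3, 0.5, 0.7, 0.9):
    print("r = %.1f -> %2.0f%% of variance explained" % (r, 100 * variance_explained(r)))
```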

Recommended action: delete the data samples (within a single core) that don’t “fit” and keep going. In my arena, we would never accept such a methodology. Either we remeasure, or we accept that our data is not going to be publishable, or we eliminate the complete data set from consideration.

A better option is to publish and acknowledge the outlier. Why do you AND your posited opponents think that one can’t publish and include information that argues against what you’ve found. Reread Feynman’s cargo cult essay. Then read Wilson’s seminal work about research methods/philosophy from the 50s.

On replication: I think the author is right that increased sampling will reduce the effects of confounding variables PROVIDED that they are random in effect. Some variables will not be and thus need to be considered in the regression.

I agree that failure to follow methods is an issue. (multi-coring of trees for example). We could actually calculate the likely effect of that. What is the average in-tree variance? Out of tree variance? The part about measuring the rings. Well, sure we can calculate that too. Do you really think that it changes the story for anything that we look at on this site? Why not rail against the chemists for not having instructions on operating the Mettler balance in their papers?

****

Anyway, at least I’ve disabused you from your impression that these guys had no papers on fundamental sampling techniques, no recognition that they should have them. I guess I would ask you to read this stuff a little carefully and look at the practices carefully and consider the problem thoughtfully. I’m sure there are flaws, but I get the impression you are ready, fire, aim a bit. I mean, you were ALREADY WRONG about there being NO papers. That took 5 minutes of Google for me to figure out.

Here’s a thought for you: if I have time to take 30 cores, should I double core 15 trees or take 30 separate samples? Why? Why not 10 trees triple-cored? One tree 30 cored? Hint: your answer should include both variance population considerations as well as mention of cross-dating efficacy (wiggle-matching).

Here’s a thought for you: if I have time to take 30 cores, should I double core 15 trees or take 30 separate samples? Why? Why not 10 trees triple-cored? One tree 30 cored? Hint: your answer should include both variance population considerations as well as mention of cross-dating efficacy (wiggle-matching).

I’m pretty sure that drilling 30 holes in one tree is called “criminal damage”
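For what it’s worth, the variance half of TCO’s allocation puzzle has a standard textbook sketch if you treat coring as a two-stage sampling design. The variance components below are invented numbers, not values from any real site, and the formula deliberately ignores the cross-dating benefit of double-coring.

```python
def var_of_site_mean(n_trees, cores_per_tree, var_between, var_within):
    """Textbook two-stage sampling variance of a site mean:
    Var = sigma_tree^2 / n_trees + sigma_core^2 / (n_trees * cores_per_tree)."""
    return var_between / n_trees + var_within / (n_trees * cores_per_tree)

# Invented variance components; real values would come from measured
# within-tree vs. between-tree variability at an actual site.
VAR_BETWEEN = 1.0
VAR_WITHIN = 0.3

# Compare allocations of a fixed budget of 30 cores.
for n_trees, cores in [(30, 1), (15, 2), (10, 3), (1, 30)]:
    v = var_of_site_mean(n_trees, cores, VAR_BETWEEN, VAR_WITHIN)
    print("%2d trees x %2d cores: Var(mean) = %.4f" % (n_trees, cores, v))
```

Whenever between-tree variance dominates, as assumed here, single-coring more trees wins on variance alone; the practical case for double-coring is cross-dating, which this formula does not capture.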

TCO, thanks for the bibliography on methods. Maybe what I’m looking for is buried in there somewhere, but I doubt it. Excuse me, but I don’t have time to wade through thousands of pages of that stuff, and I don’t think I have to. I am merely asking for an expert to answer a simple question: When so many variables limit tree growth, how is it possible to isolate and identify any one of them? I want to see a paper that explains HOW you can EXPECT to identify a temperature signal in a tree or series of trees and be sure it is a temperature signal and not another signal. Look at the data Steve M has put up, showing opposite trends from trees of the same species in nearly the same locations. Some are presumably being affected by one variable (maybe temperature, who knows), while some are being affected by a different variable.

In order to make an assertion that tree ring growth reflects temperature, even for a single tree (as opposed to an average of several trees), you have to assume that the growth of the tree in question was LIMITED for most years of its life by temperature and only temperature. This is an enormous (and I think unproven) assumption, since so many variables can limit tree growth. Some of the more important limiting variables are moisture, nutrients/mycorrhizae, competition, genetics, age, and general tree health (i.e., lack of disease). In the case of high-altitude/latitude trees, the time of year that all the snow disappears and the tree’s proximity to the tree line are probably also critical. Then, on top of the variability problem, you also have to assume that there is a positive linear (or almost linear) relationship between temperature and growth rate (the only relationship I’ve seen is an upside-down quadratic). One tree may be limited by temperature; its neighbor may be limited by moisture, because it is growing in extremely shallow soil. The next one may be limited by nutrients, etc., etc., etc. Or all of them may start showing an increase in growth rate because their neighbors are dying out. And the dendroclimatologist happens to core these trees and concludes “Aha, a stand that shows a temperature signal.” It just stretches my poor old imagination to visualize how one can expect to know when he is seeing a temperature signal in tree rings. It is extremely easy to see why cherry-picking is essential for this part of dendroclimatology.

Bottom line: I don’t see how it is THEORETICALLY POSSIBLE to reconstruct temperatures from tree rings.
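The limiting-factor argument above is essentially Liebig’s law of the minimum. A toy model, with every number and response curve invented purely for illustration, shows how a moisture-limited neighbour of a temperature-limited tree carries essentially no temperature signal even though both live through the same climate:

```python
import random

random.seed(2)

def corr(x, y):
    """Pearson correlation, stdlib only."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (sx * sy)

years = 300
temp = [random.gauss(1.0, 0.3) for _ in range(years)]

# Tree A: moisture plentiful, so temperature is almost always the limiting
# factor under this Liebig-style min() model (a "good" thermometer here).
moist_a = [random.gauss(3.0, 0.3) for _ in range(years)]
# Tree B: shallow soil, so moisture is almost always limiting instead.
moist_b = [random.gauss(0.2, 0.3) for _ in range(years)]

rings_a = [min(t, m) for t, m in zip(temp, moist_a)]
rings_b = [min(t, m) for t, m in zip(temp, moist_b)]

print("temp-limited tree:     r = %.2f" % corr(rings_a, temp))
print("moisture-limited tree: r = %.2f" % corr(rings_b, temp))
```

Average the two trees together and the correlation with temperature is diluted accordingly, which is the point: the average of differently-limited trees is not a thermometer.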

Just take a few minutes to look over the more relevant and easily accessed material. Skim it and get some more appreciation for what is there and how good/bad it is. No need to get a Ph.D. in the subject, but you kinda charge into things pretty wild and it’s embarrassing.

TCO, I don’t know who is “embarrassed” here, certainly not me. Although I’m no expert on dendrochronology, I have read some of the articles in the field, and I know a fair amount about the subject. The studies I have looked at use the same general methodologies as the studies being discussed by Steve M here, and they likely have many of the same flaws (particularly cherry-picking and weird statistics). And I’ll bet none of them have been audited. I am simply asking a straightforward question, and I am trying to locate someone with the expertise to answer it. You are evidently not the one that can do it. You remind me of Dano, by linking to a series of articles to confuse the issue and ignore the question. I have been asking this same question for months. If there is no answer, there is no basis for the science of reading tea leaves (I mean tree rings) to reconstruct temperature.

By comparison, the ring width-related variables appear to be poorly correlated with temperature. Certainly LRW displays apparently random, insignificant associations. TRW (and ERW) display more consistently positive associations for the months of June through September, and more particularly in the Gleichlaufigkeit values, with the June values (barring the WINNIP series) displaying consistently high values.

Now, notice, please that Steve is analyzing some of the very studies that you linked and asked me to “skim.” Notice, also, that the various authors do not agree with each other on about everything. Looks to me, so far, like you just cannot use tree rings to measure much, except tree age.

TCO, I presume you are referring to the simplistic College of Wooster Study (probably a special project by some senior). It has nothing to do with climate; it deals with a release of the trees due to clearing. OK, if you KNOW the history of a stand, you can discern certain things from tree rings. In this case they KNEW that the trees were thinned out when the facility was built, and it is well known that thinning produces faster growth. But if the dendro guys were looking at the early years of a 1000-year-old stand and saw this graph, they might attribute the growth spurt to temperature. That’s the problem; too many things can cause changes in growth patterns, and it is difficult, if not impossible to look back and say what caused the change.

Jae: You said that you couldn’t tell anything other than chrono by tree rings. This is a nice little example of a different effect. Sure they knew about the history of the area, but so what? Nothing wrong with bringing more data into the equation. Fire studies often use both tree ring evidence and other direct evidence. Since neither is foolproof, the combination is better than either on its own (or than throwing one’s hands up). When you get an MRI, it is not definitive. Neither is a direct palpation-type examination. Does this make them worthless? Surely, you also see the value of combining the techniques to move to more accurate diagnosis.

My dear TCO, this is not foundational work for tree ring temperature studies. My question still stands. How do you distinguish between a temperature effect and the effect of say, a gradual natural thinning, caused by disease? How do you distinguish between a temperature effect and a solar effect? As Steve is demonstrating extremely well, these guys have to cherry pick their data to show anything, and that type of “research” is simply not scientific.

#24. Pete, I agree entirely. I was initially amazed and continue to be amazed at the lack of engineering-quality studies.

## others,
On ring width-temperature bibliography, I would start with reading articles by Jacoby and d’Arrigo, Schweingruber and Briffa. I’m in the process of re-reading Jacoby-D’Arrigo. I’ve posted up comments about many Briffa articles already. I’ll collate a reading list and post it up. From the point of view of multiproxy articles, Jacoby and d’Arrigo are particularly significant because they promote the use of ring width chronologies as a temperature index. I have seen very few references by them back to underlying botanical literature, other than very arm-waving and general comments.

I’m also going to post up some notes on relative recent CO2 fertilization work.

Update: Graybill and Idso 1993 made the following comment about differences between their Niwot Ridge sample and the Graumlich and Kienast and Luxmoore samples:

The tree forms that we sampled, however, were different. Graumlich’s samples were almost entirely from mature full-bark individuals with substantial foliage [L.J. Graumlich, pers. comm. 1991]. Our samples of foxtail pine were predominantly from strip-bark forms. The nature of the subalpine tree forms sampled in Colorado for the Kienast and Luxmoore [1988] study are not described, and their analyses focused primarily on growth and climate trends since 1950. Therefore it is difficult to comment or elaborate on the growth differences found in our studies.

In one of our articles, we specifically mentioned strip bark issues and that the Graybill chronologies were specifically chosen to target strip bark trees in order to best obtain a CO2 fertilization signal. Sherwood Idso expressed his consternation to us that the Mann hockey stick was connected to these chronologies.

Have you reviewed the Idso/Graybill CO2 work as work in and of itself? How does it compare with work to develop temp proxies? (The key of course is the principle of limiting agent, for either.) Are you getting behind the Graybill/Idso CO2 work or just want Mann to deal with the possible confounding?

Anticipating that you’ll refrain: can you give a qualitative assessment of the Idso/Graybill work to derive CO2 proxies versus foundational work to develop temp proxies?

re: 49. TCO, I should have mentioned that I have taken cores from a lot of trees, in connection with studies of growth rates, wood density, and fire history. So I understand the “basics” very well. You say: “Actually your initial comment was that no one had written on the subject.” I don’t know what you mean by “the subject.” What I mean by “the subject” is the connection between growth and temperature, and I think you know this.

Incidentally, I think there may be hope for using tree rings as a temperature proxy, IF attention is given to various isotopes that are related to temperature. Tree rings would certainly record changes in carbon and oxygen isotopes. But of course, the isotopes would probably be related to solar influences, and the studies would probably show that the SUN has caused the majority of the climate variation. And the studies would doubtless show a prominent MWP and LIA, and the dendro guys sure wouldn’t want that!

#53. One of the confounding factors in the Jacoby collection is that he only archived and reported south-facing sites, as north-facing sites lacked the signal that he was looking for. If the effect is temperature-related, there should obviously be no difference between north- and south-facing sites, suggesting that some other common factor is affecting the Jacoby chronologies – perhaps cloudiness, who knows. But how the hell can he limit his sites to south-facing sites and then presume that there is no confounding?
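The screening problem can be illustrated with a toy simulation, with all parameters invented: generate pure random walks, “archive” only the ones trending upward in the recent period, and the archived average acquires a trend that the full population lacks.

```python
import random

random.seed(3)

def random_walk(n):
    """A cumulative sum of unit-variance Gaussian steps: pure noise, no climate."""
    x, out = 0.0, []
    for _ in range(n):
        x += random.gauss(0, 1)
        out.append(x)
    return out

n_years, n_sites = 200, 100
sites = [random_walk(n_years) for _ in range(n_sites)]

# "Archive" only the sites whose last 50 years trend upward: a crude
# stand-in for screening sites on a modern temperature signal.
kept = [s for s in sites if s[-1] > s[-50]]

def mean_at(series_list, t):
    return sum(s[t] for s in series_list) / len(series_list)

all_trend  = mean_at(sites, n_years - 1) - mean_at(sites, n_years - 50)
kept_trend = mean_at(kept, n_years - 1) - mean_at(kept, n_years - 50)
print("recent trend, all sites: %.2f   screened sites: %.2f" % (all_trend, kept_trend))
```

The screened average shows a strong recent rise even though every series is noise by construction, which is why archiving only the sites with “the signal” is not a harmless convenience.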

I’ve posted some incidental comments on isotopes. I don’t think that they are magic bullets. However, there are changes in bristlecone C13 ratios in the 20th century that are consistent with an increase in water use efficiency predicted by botanical theory on CO2 fertilization (ahead of time, not ex post).

I wonder how well the Graybill/Idso CO2 proxies work if you get fossil or old trees and go back prior to recent CO2 increases. If CO2 did not fluctuate pre-industrial, one would expect to see a straight line. One could also compare to the ice cores.

I’ve posted some incidental comments on isotopes. I don’t think that they are magic bullets. However, there are changes in bristlecone C13 ratios in the 20th century that are consistent with an increase in water use efficiency predicted by botanical theory on CO2 fertilization (ahead of time, not ex post).

When I read that paper, I did not see how they established that the signal was proof of more CO2 versus just normal growth increase. What is the argument? Does it have something to do with the heavier mass of C-13 versus C-12?

No sweat. You also (I think it was you) talked about the actual issues in physically measuring the rings, whether the methods were listed and concern over errors in this step. The ITRB listserv FAQ 7.5 covers the equipment used in ring measurement. While we don’t have a system of error listed, I think there is a pretty detailed procedure listed for doing measurements right. Other parts of the FAQ show a lot of care/effort towards sample prep, transport, etc.

“…Jacoby…only archived and reported south-facing sites as north-facing sites lacked the signal that he was looking for…”

Married to a biologist, the answer to this has been drilled into me for years. At least in the Northern hemisphere: north-facing sites experience mostly ambient temperature. South-facing sites receive DIRECT SUNLIGHT. [That’s why you’ll see snow on north-facing slopes… that’s why our home is south-facing so our garden will grow, etc etc]

“…If the effect is temperature-related, there should obviously be no difference between north and south facing sites…”

I respectfully disagree. AFAIK, north-facing should be LESS confounded by other factors and MORE reflective of ambient temperature.

“…how the hell can he limit his sites to south facing sites and then presume that there is no confounding?”

Actually, my ignorant guess is he accidentally selected for sun-confounding and against temperature!

On a related note. Discussed this with my wife. She related that in situations where biologists do not have ability to control all factors, they:
a) analyze a given site intensely over longer time
b) create transects, and use random number tables to determine which sectors of the transect to measure
c) ALL data must fit the hypothesis to 95%. Too many outliers means that either your hypothesis or your methodology is incorrect.

For the present situation, they should be:
a) Studying a given site much more intensely
b) Choosing trees at random, and core location on the tree at random (which compass direction? At what height? Etc)
c) Archiving far more data and subjecting the data to far more rigorous testing.
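Point (b) of this protocol, letting a random draw rather than the collector’s eye decide where to sample, is trivial to sketch (the sector and sample counts below are invented):

```python
import random

random.seed(4)

# Number the transect sectors, then sample sectors at random so site
# selection cannot quietly favour "good-looking" trees.
n_sectors = 40
sample_size = 8
chosen = sorted(random.sample(range(1, n_sectors + 1), sample_size))
print("sample sectors:", chosen)
```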

In another thread, I believe you mentioned something about your appreciation for a researcher who is heading in such a direction. If so, that would be confirmed from this corner.

I appreciate the cost and difficulty of doing such in-depth analysis. I can’t have too much sympathy however. My wife did such studies 50 feet down, in 1-foot visibility ice cold water, off the Monterey coast of California. I still find it hard to believe that scientists will endure such conditions and come up smiling! 🙂