Severe analytical problems in dendroclimatology, part two

In part one, I summarized briefly three principal analytical problems in using tree ring widths to infer long term paleoclimatic trends. I noted that any one of the three would be a major problem, but that taken together, they are fatal to confidence in the validity of long term climate reconstructions. In this second part I’m going to go into some more detail on the second problem, given that the first issue (as described in the cited Loehle and Kingsolver papers) is straightforward and well-described therein. However, first a couple more notes on that first issue of nonlinear response functions.

Restating that issue: if ring response is a more-or-less unimodal function of climate, then one cannot be sure which “side” of the optimum the ring widths of pre-calibration years fell on, and accordingly cannot confidently use them to infer a single climatic value for any given time point. The best one can do is to estimate a bimodal response (two candidate climatic values, one on each side of the optimum), and even then only if the climatic and ring width values during the calibration period cover a broad enough range of climatic states to allow this; that is, only if they sample rings on both sides of the optimum. Such an estimate would certainly be a valid and useful piece of information, but it would not be the definitive estimate of a single value arising from an assumed linear relationship between climate and ring response. Furthermore, I have yet to see a single paper that takes this approach. Rather, studies universally assume a linear relationship between climate and ring response and then invert this relationship in order to predict past climate states. The point is that this assumption/practice is not justified, either empirically or theoretically; it constitutes a serious conceptual mistake in tree ring analysis. As I stated in part one, it is therefore disconcerting that it took so long for someone (i.e. Loehle, 2009) to point it out. Better late than never of course, but a number of the horses are already out of the gate.
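To make the ambiguity concrete, here is a minimal sketch that assumes a Gaussian-shaped response curve. The curve shape and every parameter value (optimum, spread, maximum ring width) are hypothetical, chosen only for illustration and not taken from the Loehle or Kingsolver papers. Inverting the curve for an observed ring width returns two temperatures, not one.

```python
import math

# All values below are hypothetical, chosen only for illustration.
OPTIMUM = 15.0    # temperature (deg C) at which ring width peaks
SPREAD = 5.0      # breadth of the response curve (deg C)
MAX_RING = 2.0    # ring width (mm) produced at the optimum


def ring_width(temp):
    """Assumed unimodal (Gaussian-shaped) ring width response to temperature."""
    return MAX_RING * math.exp(-((temp - OPTIMUM) ** 2) / (2.0 * SPREAD ** 2))


def candidate_temps(width):
    """Return both temperatures that would produce the observed ring width."""
    if not 0.0 < width <= MAX_RING:
        raise ValueError("ring width lies outside the range of the response curve")
    offset = SPREAD * math.sqrt(2.0 * math.log(MAX_RING / width))
    return (OPTIMUM - offset, OPTIMUM + offset)


# One observed ring width, two equally consistent temperatures.
cool, warm = candidate_temps(1.2)
print(f"ring width 1.2 mm is consistent with {cool:.2f} C or {warm:.2f} C")
```

A linear calibration fit only to years on the cool side of the optimum would report the cooler value every time, with nothing in the ring width itself to flag the alternative.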

One more point on this issue. The argument has been made, explicitly or implicitly and more than once, that a linear relationship between climate and ring response observed during the calibration period justifies the assumption of a linear relationship in the past. No, it does not! This is in fact part of the point: the fact that all the observed ring responses during the calibration period fell on the “left” side of the optimum, and therefore approximate linearity, does not in any way guarantee that they did so in the past. There is no analytical method currently available for determining which side of the optimum past rings fell on, and it is not clear that it is even possible. [I have some ideas on how that problem might be approached but won’t get into that issue here. It is by no means clear that the problem is fully tractable, however.]

Now on to the second issue: problems with standardizing (or “detrending”) ring responses in order to remove the long term effects of changing tree age/size, thus leaving (presumably) only the effect of the climatic variable of interest. The problem here is that this is done by ad-hoc curve fitting procedures that cannot in any way guarantee an accurate result. An empirical “curve” (almost always a straight line or a negative exponential curve) is fit through each ring width series (obtained from a tree core), that curve being assumed to represent the effects of changing tree age/size on ring response. The residuals from this curve are thus taken to represent the effects of climate. There is no question that as a tree changes in age/size with time, its ring characteristics also change. Ring widths typically, though not always, decrease with increasing tree size, for example. There are two main approaches to this “detrending” and it’s important to understand their basics.

The first, and historically oldest, method fits a separate curve to each ring series individually. [I call this “Individual Curve Standardization” (ICS) for convenience, since there is no commonly used acronym in the literature for it.] It is immediately clear that the longest term effects of age/size, and climate, are (potentially) irretrievably confounded in this approach, and thus not fully resolvable. This issue was most clearly described in 1995 by Cook et al. The second method tries to address this problem by instead assuming that there is one fundamental age/size effect for a given species in a given, defined area, which can be approximated by a single curve fit to the ring series from the set of trees sampled therein. This method is known as “Regional Curve Standardization” (RCS). It is awkwardly described in the literature (e.g. Briffa et al., 1992a) as resulting from “stacking” cores by their “cambial” ages, taking means, smoothing those means, etc. More simply, RCS is the fitting of a single curve (or spline) to the ring widths of all cores in a defined area, arranged by cambial age (years from tree center), followed by a detrending of each ring series with that single curve. [Sometimes two curves are generated if there are distinct groups of trees defined by clearly different long term growth patterns; see e.g. Esper et al. (2002).]
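The two approaches can be sketched in a few lines. This is an illustrative toy, not the procedure of any cited paper, and it rests on several simplifying assumptions: each series starts at the pith (so list position equals cambial age), the individual curve is a negative exponential fit by least squares on log widths, the regional curve is the unsmoothed mean width at each cambial age, and indices are ratios of observed to expected width. The function names are hypothetical.

```python
import math


def fit_neg_exp(widths):
    """Least-squares fit of log(width) = log(a) - b * age; returns (a, b)."""
    n = len(widths)
    ages = range(1, n + 1)
    logs = [math.log(w) for w in widths]
    mean_age = sum(ages) / n
    mean_log = sum(logs) / n
    sxx = sum((t - mean_age) ** 2 for t in ages)
    sxy = sum((t - mean_age) * (y - mean_log) for t, y in zip(ages, logs))
    slope = sxy / sxx
    return math.exp(mean_log - slope * mean_age), -slope


def ics_indices(series):
    """ICS: divide each series by its own fitted curve."""
    out = []
    for widths in series:
        a, b = fit_neg_exp(widths)
        out.append([w / (a * math.exp(-b * age))
                    for age, w in enumerate(widths, start=1)])
    return out


def regional_curve(series):
    """Mean ring width at each cambial age, pooled over all series."""
    longest = max(len(widths) for widths in series)
    curve = []
    for i in range(longest):
        values = [widths[i] for widths in series if len(widths) > i]
        curve.append(sum(values) / len(values))
    return curve


def rcs_indices(series):
    """RCS: divide every series by the single regional curve."""
    curve = regional_curve(series)
    return [[w / curve[i] for i, w in enumerate(widths)] for widths in series]


# Two hypothetical trees with the same age trend; tree_b grew 50% faster
# throughout its life (for example, because it lived in a more favorable period).
tree_a = [2.0 * math.exp(-0.05 * age) for age in range(1, 41)]
tree_b = [1.5 * w for w in tree_a]

print("ICS mean index:", [sum(s) / len(s) for s in ics_indices([tree_a, tree_b])])
print("RCS mean index:", [sum(s) / len(s) for s in rcs_indices([tree_a, tree_b])])
```

In this toy case the individually fitted curves absorb the between-tree difference entirely (both ICS series sit at 1.0), while the single regional curve preserves it (0.8 versus 1.2). That is the confounding of long term signal with the fitted curve that motivates RCS.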

The RCS approach dates to Erlandsson (1936), but was used rarely until revived by Briffa et al. (1992a). It has been generally assumed that the method largely solves the problems of the ICS method as long as certain requirements are met, the principal ones being that (1) the fullest possible range of tree ages is sampled in the field, and (2) the sampled trees constitute a single biological population with respect to their response to climate. However, the first requirement in particular is clearly often not met, as inspection of the tree ages in many sites archived at the ITRDB shows. Clearly, field workers have almost always tried to sample the oldest trees at the expense of younger ones. This is very understandable, given that a common goal is to carry a chronology as far back in time as possible, but it creates real problems for the RCS method. Also, RCS approaches were rarely used before 1992, and only sporadically since, so there was very likely no general awareness of the importance of the issue. More recently, however, Briffa and Cook have emphasized its importance, so hopefully field sampling strategies will change. The second requirement (single biological population) also becomes questionable as the regional area encompassed by the tree sampling is increased, due to potential differences in site quality (e.g. soils and topography) and genetic differences between widely scattered sampling sites. Therefore, even the recognized requirements of the RCS method range from clearly unmet to questionable and undemonstrated. And this does not include the unrecognized issues with the method (of which there are some; more on that later).

These problems point to a more general issue in the field, however. Namely, it is based almost entirely on empirical data analyses of various types, with little or no theoretical foundation. The field has made almost no use of simulation analysis, which is particularly well suited to answering critically important questions about the limitations of the various possible analytical approaches. If analyses of complex systems are performed piecemeal, based on empirical data from some arbitrary set of realizations of that system, one will often obtain a set of results that is not underlain (and hence not explainable) by any unifying conceptual framework. Dendroclimatology fits this description pretty well; it is almost entirely an observational science, with little basis in rigorous and systematic experimental analysis, be that on actual trees or in simulated model systems (each of which has its own strengths and weaknesses).

It should go without saying that such an approach will have limited power to solve certain types of problems, and that it will also likely lead to potentially serious confusions and disagreements arising as different researchers apply different analytical methods to different data sets. This manifests itself, for example, in “spaghetti graphs” of the past temperature of continental and larger regions, each single strand differing, often substantially, in its time course from many of the others. It is (apparently) assumed by most or all that this represents a type of ensemble and that the “truth” therefore lies somewhere in the middle of the individual strands. But this assumption is defensible, as with any ensemble approach, only if each strand is one realization of an unbiased, stochastic process, with the signal gradually emerging from the noise as more ensemble members (strands) are added. But this requirement has not been even close to demonstrated (or even attempted as far as I can tell), and in fact is very unlikely to hold true, for the reasons discussed here and elsewhere. There is simply no way to know where the truth lies in (or outside of?) the spaghetti pile, given commonly used analysis methods. It’s frankly just a pile of spaghetti, without much real meaning. How are we supposed to have strong confidence in what the pile represents?
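The ensemble argument can be illustrated with a small simulation. This is a sketch under stated assumptions, not an analysis of any real reconstruction: the “true” signal is flat at zero, every strand carries independent Gaussian noise, and in the second case every strand also carries the same systematic offset. The function name and all numeric values are hypothetical.

```python
import math
import random


def ensemble_rmse(n_strands, shared_bias, noise_sd=0.5, n_years=200, seed=1):
    """RMS error of the ensemble mean against a flat 'true' signal of 0.0.

    Each strand = truth + shared_bias + independent Gaussian noise.
    """
    rng = random.Random(seed)
    squared_errors = []
    for _ in range(n_years):
        strands = [shared_bias + rng.gauss(0.0, noise_sd) for _ in range(n_strands)]
        ensemble_mean = sum(strands) / n_strands
        squared_errors.append(ensemble_mean ** 2)  # truth is 0.0 in every year
    return math.sqrt(sum(squared_errors) / n_years)


for n in (4, 40, 400):
    print(n, "strands:",
          "unbiased RMSE =", round(ensemble_rmse(n, 0.0), 3),
          "| biased RMSE =", round(ensemble_rmse(n, 0.3), 3))
```

With purely random errors, the error of the ensemble mean shrinks as strands are added. With a shared systematic error, it levels off near the size of that error no matter how many strands are piled on.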

Inexplicably, although the RCS method has been generally believed (though again, not conclusively demonstrated) to be superior to ICS in retrieving the long term climatic signal (or at least no worse than it), at least since Cook et al. (1995) or Briffa et al. (1992a), this has not stopped a number of researchers from using the ICS method anyway. This includes studies by those two authors themselves (e.g. Briffa et al 1992b; Cook et al., 2004), and at least five or six other large (continental to global) scale reconstructions, right up to the present (e.g. Pederson et al., 2011), not counting a number of others at regional and smaller scales. One can only guess as to the rationale behind this, because in few if any of those papers is it made clear why this choice was made, and there is no set of agreed-upon rules or guidelines anywhere that clearly defend and delineate under what conditions the ICS method should and should not be used to detrend tree ring series.

Additionally, I argue that the RCS method also has one very serious unrecognized flaw, and is therefore by no means a complete solution to the problems with ICS detrending. Although this latter point is recognized in several places in the literature (e.g. Briffa and Melvin, 2010), this unrecognized issue means that the extent/magnitude of the method’s problems is worse than is currently recognized. That topic will be covered in part three.

[note: Craig Loehle points out in the comments that tree size is the better predictor of ring size than is age. I agree with him; I use the phrase “age/size” in the above discussion mainly because, historically, detrending has been done by fitting curves to tree ages, and most people are used to thinking in those terms. The two measures are highly correlated, but it is size that is the better predictor]

References:
(see Part 1 for links to the Loehle and Kingsolver papers; you’ll have to use Google Scholar for the others, sorry: many (but not all) of them are freely accessible)

15 thoughts on “Severe analytical problems in dendroclimatology, part two”

As a layman often put off by the unnecessary use of jargon in these discussions, I find the discussion in this and the previous part helpful, and perhaps you can help me further by telling me precisely what the (or your) definition of a “chronology” is. My naive view of a temperature reconstruction in this area is that it boils down, after all the principal-components or whatever other procedures are taken into account, to a vector generated as the product of (1) a matrix whose columns are different “chronologies” and (2) a vector of which each element is a weight for a respective, different matrix column, i.e., for a different “chronology.” My guess is that a “chronology” could be the time sequence of measurements, e.g., ring widths, from a given tree (after the curve-standardization step of which you’ve described two flavors above and possibly other normalizations) or it could be the result of so averaging time-overlapped groups of such sequences that each column in the matrix is completely populated. Can you offer any guidance?

Having thus asked for your help, I hope you will not take it amiss if I mention that I found the discussion contained in the paragraph containing “spaghetti” and the one before it too abstract for me really to comprehend what you were driving at. If you could give some example of how one of a spaghetti graph’s curves represented something other than a sample from the same ensemble as of another, you might throw a line to those of us unable to think that abstractly.

Finally, a minor question directed to the following passage: “The second requirement (single biological population) also becomes questionable as the regional area encompassed by the tree sampling is increased, due to potential differences in site quality and genetic differences between widely scattered sampling sites.” Can’t the second-requirement failure mentioned here also result to an extent from the first? Specifically, can’t climate change in a single site, no matter how localized, so affect the genetic selection among same-species trees that the mode in the curve for that site’s older trees will be displaced from that in the curve for the same site’s younger trees?

Thanks for the questions/comments Joe; such comments are quite helpful in improving these discussions, as it’s quite easy to use terminology and phrasing that are not clear to everyone (even amongst other scientists). It also helps if graphs are included, which I didn’t do for lack of time. Future posts in this series will have some though.

The term “chronology” usually refers to the single time series of ring measurements that results from taking the mean (often a robust mean, where outliers are first removed) of the measurements from all the trees sampled in some defined area. Usually that area comprises a single sampling site, encompassing say a hectare or so. There is no principal component analysis (or any other complexities like the differential weighting of each series that you mention) involved at this stage. Principal components analysis (if it is used at all) occurs later, when one is aggregating different sites at regional or larger scales, i.e. “scaling up”. [Those are secondary level problems in climatic reconstructions, whereas I’m dealing with more fundamental issues in these two articles.] Strictly speaking, one could use the term in the way you describe (i.e. each tree core represents a single chronology), but typically “ring series” is what is used for that.
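As a rough sketch of that averaging step: the code below builds a site chronology from several detrended series by taking, for each calendar year, a trimmed mean of whatever series cover that year. The trimmed mean is only a simple stand-in for the robust mean mentioned above, and the function names, data layout (one `{year: index}` dictionary per core), and values are all hypothetical.

```python
def trimmed_mean(values, trim_fraction=0.2):
    """Mean after dropping the lowest and highest trim_fraction of values."""
    if not 0.0 <= trim_fraction < 0.5:
        raise ValueError("trim_fraction must be in [0, 0.5)")
    ordered = sorted(values)
    k = int(len(ordered) * trim_fraction)   # number dropped from each end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


def site_chronology(index_series, trim_fraction=0.2):
    """Combine per-core {year: index} series into one {year: mean index} chronology."""
    by_year = {}
    for series in index_series:
        for year, value in series.items():
            by_year.setdefault(year, []).append(value)
    return {year: trimmed_mean(vals, trim_fraction)
            for year, vals in sorted(by_year.items())}


# Five hypothetical cores; the last has an outlying value in 1900.
cores = [
    {1900: 0.9, 1901: 1.2},
    {1900: 1.0, 1901: 0.8},
    {1900: 1.1},
    {1900: 1.0},
    {1900: 5.0},
]
print(site_chronology(cores))
```

Because each year is averaged over whichever cores cover it, the resulting chronology has a value for every year in which at least one core has a ring, though the early years typically rest on very few trees.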

Per your second paragraph, my main point wrt spaghetti graphs is that one has to have confidence that each “strand” is a “valid” sample, in the sense that the only thing that keeps each such from being a fully accurate time series of the past climate variable of interest (say temp), is purely random variation i.e. “sampling error”. But if each strand (or some substantial fraction thereof) were also affected by systematic errors (in addition to the sampling error) then simply adding more strands does not get you where you want to be, no matter how many you add. My point in these posts is that there is very strong reason to suspect such systematic errors exist.

Per your third paragraph: yes I think that’s possible (although I don’t see how it “results to an extent from the first” as you phrase it). However, I think it’s unlikely to be important in most cases because even with younger trees in the sample, we are still talking about trees that typically have survived at least a few decades. The strongest selective force–by far–is when trees are very young (seedlings and saplings), and all sampled trees will have survived that selection. Once they get to a certain size–roughly several inches in diameter–they have survived most of the selection pressure they are going to face.

Thanks a lot. Having that first answer eliminates a lot of my difficulty, part of which was how great a difficulty gaps in the chronology-matrix columns present. That is, if a single (chronology-representing) column in the matrix results from many trees’ data, it’s more likely to have an entry for every matrix row (year).

And I understand, by the way, that the principal-components stuff occurs downstream from the curve-standardization processes you address here; I mentioned it only to reflect my suspicion that, even when the principal-components hocus-pocus is used, the operation is mathematically equivalent to simply using individual-chronology matrix columns rather than, as appears in the program, columns of principal components. Unfortunately, I don’t have the attention span to verify that this suspicion is correct as applied to the analyses that high priests of the relevant sect employ. If I had the (mental) bandwidth, I could probably tease out of McIntyre’s site how he obtains individual-chronology weights from the rites employed by Mann et al., for example, but I have so far found the prospect of making the attempt too daunting.

Your second answer was also helpful.

We didn’t quite join issue on the third point, but it’s a very minor one so by all means ignore my belaboring it as follows:

As I understand it, the problem presented by basing the curve-standardization process on data taken from sites located far from each other is that the sites’ different, say, temperature regimes may result in, e.g., infant mortality’s imposing different selection criteria and thereby yielding different temperature-vs.-ring-width curves. What I tried, obviously ineffectively, to say is that similar differences could arise even if you restrict yourself to a single small site and use an even age distribution, because trees that sprouted, grew up, and died at a given location in the Twenty-First Century may have been subjected to selection criteria that differ from those faced by same-site trees that spent their whole lives in the Seventeenth Century as much as they differ from those faced by trees located a long distance away.

Per your last paragraph there, my point wrt that issue was that, if you use RCS on trees from distant sites, you run the risk of having different site conditions as well as different biological populations (defined wrt their response to the climate variable of interest). If you use RCS only on trees from single sites, you eliminate (or rather, greatly reduce) those two sources of variation, but you have to sample from a wide age/size distribution for RCS to be effective. You definitely do not want to sample a single age/size class of trees. The problem is that, in many analyses, the trees have already been sampled, and this issue may or may not have been taken into consideration by the samplers of the various sites at that time.

Your second paragraph I had a hard time following. Not really sure what your concern is there. The issues I’m raising do not originate specifically from statements by either Mann or McIntyre; they are more general and fundamental than that.

I am a forestry specialist. In forestry it is well-known that at least when trees are in the commercial age range (30 to 100 yrs), the trees if not crowded too much exhibit constant basal area increment. Since the same basal area is added each year over a larger diameter, the ring width is a simple function of tree diameter (NOT age). Not a single dendro paper makes use of this insight for their analysis.
There is also a problem with the age of the trees sampled. A paper last year showed that the oldest trees in a stand grew slower when they were young than the average young tree in the same stand. This makes sense if there is a tradeoff between fast growth and long life, with slower growing trees being able to live longer. The effect will be to make the stand look like it is growing faster in recent decades, thus coincidentally matching the global (though not necessarily local) uptick in 20th century temperature. I have documented this trade-off in the following papers:
Loehle, C. 1996. Optimal Defensive Investments in Plants. Oikos 75:299-302.
Loehle, C. 2000. Strategy Space and the Disturbance Spectrum: A Life History Model for Tree Species Coexistence. American Naturalist 156:14-33

“I am a forestry specialist. In forestry it is well-known that at least when trees are in the commercial age range (30 to 100 yrs), the trees if not crowded too much exhibit constant basal area increment. Since the same basal area is added each year over a larger diameter, the ring width is a simple function of tree diameter (NOT age).”

No question about it, I fully agree. A generalization of this issue is a very major point of my recent submission to PNAS and will be dealt with in upcoming posts here. It is highly important to the RCS method of detrending in particular [I therefore consider it as a subset of the second of the three issues I am raising.]
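The geometry behind the constant basal area increment point quoted above is simple. If a stem of radius r adds a fixed cross-sectional area c each year, the new radius is sqrt(r^2 + c/pi), so the ring width is sqrt(r^2 + c/pi) - r, which is approximately c / (2 * pi * r) once the ring is thin relative to the stem. Here is a minimal sketch; the starting radius and increment are hypothetical values, and the stem is idealized as a circle.

```python
import math


def ring_widths_constant_bai(start_radius_cm, bai_cm2, n_years):
    """Yearly (radius, ring width) pairs for a circular stem that adds the same
    basal area every year. Inputs are illustrative, not field data."""
    radius = start_radius_cm
    rows = []
    for _ in range(n_years):
        new_radius = math.sqrt(radius ** 2 + bai_cm2 / math.pi)
        rows.append((radius, new_radius - radius))
        radius = new_radius
    return rows


rows = ring_widths_constant_bai(start_radius_cm=10.0, bai_cm2=10.0, n_years=50)
for radius, width in rows[::10]:
    approx = 10.0 / (2.0 * math.pi * radius)
    print(f"radius {radius:6.2f} cm  ring {width:.4f} cm  c/(2*pi*r) {approx:.4f} cm")
```

Ring width falls steadily even though growth in area terms is constant, and it is determined by the current radius, not by the number of years elapsed.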

“Not a single dendro paper makes use of this insight for their analysis.”

There is one actually: Biondi and Qeadan (2008)* developed their new “C” method of detrending based upon this very idea. And I believe there are some other (older) papers in which BAI (= ring area) is used instead of ring widths, but I have no definite references at my fingertips.

I also agree with your point about the longevity vs growth rate tradeoff (for which I think there is good evidence) and its potential effects wrt this issue. This also relates to Joe’s question above, which I failed to mention in my response to him because I just plain forgot about it. Of course the most important questions are (1) how do we demonstrate its actual existence, and (2) how large is the effect?

Could you please provide links to publicly available copies of that unspecified paper, and also to your two there, for all interested? That would be quite helpful. Thanks.
Jim

Jim Bouldin: “Your second paragraph I had a hard time following. Not really sure what your concern is there. The issues I’m raising do not originate specifically from statements by either Mann or McIntyre; they are more general and fundamental than that.”

Sorry I went off on a tangent; I hadn’t really intended to impose upon you further. I wasn’t so much raising a question as, I guess, putting the RCS issue into its place in the puzzle that exists in my mind regarding the overall flow chart between raw width-measurement sequences and a temperature sequence.

[That’s excellent Joe–exactly what we should all be doing, that is trying to get a comprehensive picture of the whole process and where errors are potentially introduced. I realize that it’s not simple or easy to do]

I recognize that what you’re dealing with here is age-adjusting an individual tree’s measurements (preferably in accordance with measurements of its fellows’) to get an individual tree’s (age-adjusted) measurement sequence. After that, as you taught me above, a set of those individual-tree measurement sequences is combined into an individual chronology. That step clearly occurs downstream from the step on which you focused.

[To elaborate just a bit further, once each individual ring series is detrended, the resulting residuals are typically referred to as index values. It is those index values that are averaged to create the single site chronology]

Even further downstream is the step of combining multiple contemporaneous such chronologies into a temperature reconstruction, and that step is the one on which the Mann-McIntyre controversy focuses. I was stating–irrelevantly to your post, I agree–my suspicion that, boiled down, this last step, i.e., producing a temperature reconstruction from a set of simultaneous chronologies, is nothing more than taking a weighted average of those chronologies (and, I presume a constant).

[Correct; principal components are essentially weighted averages of the original set of time series, describing in descending order the main axes of variation in the data. Or at least, that’s one simple way to think of them]
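To illustrate the weighted-average reading of a principal component, here is a small sketch that finds the leading component of three chronologies by power iteration and then forms its scores as a weighted sum of the centered chronologies. The three series are synthetic and the function name is hypothetical; this is not the procedure of any particular reconstruction, which would normally rely on a linear algebra library.

```python
import math


def first_principal_component(columns, iterations=200):
    """Leading principal component of equal-length series, by power iteration.

    Returns (weights, scores): scores[t] is the weighted sum of the centered
    series at time t, using one weight per series.
    """
    n = len(columns[0])
    p = len(columns)
    means = [sum(col) / n for col in columns]
    centered = [[x - m for x in col] for col, m in zip(columns, means)]
    cov = [[sum(a * b for a, b in zip(centered[i], centered[j])) / (n - 1)
            for j in range(p)] for i in range(p)]
    weights = [1.0] * p
    for _ in range(iterations):
        weights = [sum(cov[i][j] * weights[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(w * w for w in weights))
        weights = [w / norm for w in weights]
    scores = [sum(weights[i] * centered[i][t] for i in range(p)) for t in range(n)]
    return weights, scores


# Three synthetic chronologies sharing a common signal plus small distinct wiggles.
signal = [math.sin(t / 3.0) for t in range(30)]
chron_1 = [1.0 * s + 0.1 * math.cos(t) for t, s in enumerate(signal)]
chron_2 = [0.8 * s + 0.1 * math.sin(2 * t) for t, s in enumerate(signal)]
chron_3 = [1.2 * s + 0.1 * math.cos(3 * t) for t, s in enumerate(signal)]

weights, scores = first_principal_component([chron_1, chron_2, chron_3])
print("weights:", [round(w, 3) for w in weights])
```

The weights are fixed numbers, one per chronology, so the component is a weighted combination of the (centered) chronologies in exactly the sense described above.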

Although I would not take it amiss if you took the time to disabuse me of any misapprehensions my last sentence betrayed, I do recognize that I’ve gone off topic, and you may rest assured that, contrary to what the preceding logorrhea may seem to portend, you have not acquired a pesky pen pal.
[Joe, not only is this discussion not a problem, it is exactly what I want to have happen here. If everyone had your attitude we would be in great shape on these things. Keep asking questions. Only busy-ness will keep me from answering quickly–Jim]

I’m hoping this isn’t considered OT. I’ve been following the dendro discussions/arguments for some years now and was just wondering if the use of isotope ratios within tree rings has been discarded or improved since the 70’s with respect to temperatures etc.? I realize this may well be an opinion-based question but I would still like to read those opinions from readers here . . .

1. Would you consider posting up somewhere your PNAS submitted preprint that you mentioned in part 1?

2. A defence sometimes made of dendroclimatology is that other proxies (lake sediments, ice cores, corals etc) give similar results. To what extent do your criticisms also apply to these other proxies?

Also one suggestion – a picture is worth a thousand words, particularly for your first problem, which is simple and obvious if you draw a graph.

1. I have thought about posting the manuscript and just haven’t come to a decision on it. It’s an enormous amount of material that would have to be posted because there is a Supplemental Information document and a huge number of graphs. The main thing I have to consider is how it affects a possible submission elsewhere, which is very important.
2. I wish I knew more about other climate proxies but in fact I know very little about them. I just happen to know about tree rings because I know about trees (my dissertation was nearly a tree ring project but the idea fell through and so I changed to something else entirely). So I can make no statement about how these issues apply to those other proxy types. I will say that one of the critical issues with tree rings involves the whole concept of a unimodal temperature response (which is why the Loehle and Kingsolver papers are so important). Non-biological proxies are less likely to have this same characteristic.

You’re right on the picture issue. I’m a highly verbal person so I tend to write to try to explain things. But there will be some graphs in the future.