There is sufficient evidence from tree rings, boreholes, retreating glaciers, and other “proxies” of past surface temperatures to say with a high level of confidence that the last few decades of the 20th century were warmer than any comparable period in the last 400 years, according to a new report from the National Research Council. Less confidence can be placed in proxy-based reconstructions of surface temperatures for A.D. 900 to 1600, said the committee that wrote the report, although the available proxy evidence does indicate that many locations were warmer during the past 25 years than during any other 25-year period since 900. Very little confidence can be placed in statements about average global surface temperatures prior to A.D. 900 because the proxy data for that time frame are sparse, the committee added. …

The Research Council committee found the Mann team’s conclusion that warming in the last few decades of the 20th century was unprecedented over the last thousand years to be plausible, but it had less confidence that the warming was unprecedented prior to 1600; fewer proxies — in fewer locations — provide temperatures for periods before then. Because of larger uncertainties in temperature reconstructions for decades and individual years, and because not all proxies record temperatures for such short timescales, even less confidence can be placed in the Mann team’s conclusions about the 1990s, and 1998 in particular.

The committee noted that scientists’ reconstructions of Northern Hemisphere surface temperatures for the past thousand years are generally consistent. The reconstructions show relatively warm conditions centered around the year 1000, and a relatively cold period, or “Little Ice Age,” from roughly 1500 to 1850. The exact timing of warm episodes in the medieval period may have varied by region, and the magnitude and geographical extent of the warmth is uncertain, the committee said. None of the reconstructions indicates that temperatures were warmer during medieval times than during the past few decades, the committee added.

However, it is the big-picture conclusions that have the most relevance for the lay public and policymakers, and it is reassuring (and unsurprising) to see that the panel has found reason to support the key mainstream findings of past research, including points that we have highlighted previously:

I’d characterize it more as schizophrenic. It’s got two completely distinct personalities. On the one hand, they pretty much concede that every criticism of MBH is correct. They disown MBH claims to statistical skill for individual decades and especially individual years.

However, they nevertheless conclude that it is “plausible” – whatever that means – that the “Northern Hemisphere was warmer during the last few decades of the 20th century than during any comparable period over the preceding millennium”. Here, the devil is in the details, as the other studies relied on for this conclusion themselves suffer from the methodological and data problems conceded by the panel. The panel recommendations on methodology are very important; when applied to MBH and the other studies (as they will be in short order), it is my view that they will have major impact and little will be left standing from the cited multiproxy studies.

This should finally put an official end to the silliness that’s gone on for the last few years. I don’t doubt that McIntyre will continue to bloviate, but journalists especially now have no reason to give him any traction.

Well, the report is out and it seems to be a fairly strong vindication of Mann et al. There is some more fuzzy language that will surely be seized upon by some, but there is certainly nothing to support the allegations of errors, omissions and frauds that had been thrown around. The main conclusion is that many other studies support these same findings and that this is not a central issue in the present and future of climate change.

Comments

Nope. Your understanding is faulty. Adding or leaving out RHS variables that are uncorrelated to the dependent variable affects goodness-of-fit but not bias. It doesn’t matter how many of them there are. In fact, leaving out (or adding) RHS variables that are collinear doesn’t affect bias either. (It does affect the precision of individual coefficients).
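
That claim about irrelevant right-hand-side variables is easy to check numerically. A quick simulation sketch in Python (toy data and made-up coefficients, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 200, 500
slopes_plain, slopes_padded = [], []
for _ in range(reps):
    x = rng.normal(size=n)
    y = 2.0 * x + rng.normal(size=n)        # true slope on x is 2
    z = rng.normal(size=(n, 5))             # five RHS variables unrelated to y
    X1 = np.column_stack([np.ones(n), x])
    X2 = np.column_stack([np.ones(n), x, z])
    slopes_plain.append(np.linalg.lstsq(X1, y, rcond=None)[0][1])
    slopes_padded.append(np.linalg.lstsq(X2, y, rcond=None)[0][1])
# both averages sit at 2: the irrelevant regressors add no bias,
# though any single fit with them has noisier goodness-of-fit
print(np.mean(slopes_plain), np.mean(slopes_padded))
```
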

per, this is very elementary stuff. Frankly, it makes me wonder about any other attestations you may have made of a statistical nature. As it happens, tomorrow I leave for vacation and I have much to do before that so I may not be able to respond again. I don’t usually announce my comings and goings but in this case I wanted to be sure that you do not misinterpret my silence for agreement.

Sez per: You appear to be favouring the view that we can simply discard the samples that you don’t like (extracting the climate signal), until you are left with samples that just tell you what you want.

Really, per? That’s the way it appears to you is it? Let me tell you how it appears to me.

It appears to me that you asked me to justify jp’s exclusion of “complacent” sites from consideration, even though their growth might “have a perfect correlation with temperature”. (The emphasis is mine, the bizarre words in quotes are yours).

It appears to me that I answered your question as follows:

I can’t speak for jp but I can invent a story about trees if you want one, based on my rambles around the local park. The trees in my story don’t much care about temperature. Herbivores improve their prospects by eating competitor plants and dumping lots of dung which fertilises them. So their rings exhibit a gradual rising trend, with little variance. On jp’s definition they are complacent. It’s true that they might have a strong correlation with temperature although it certainly could not be perfect unless temperature moves at a slower pace than a drugged snail. But such a correlation would be spurious.

It appears to me that you are doing that thing you call “paraphrasing” again. It is more usually referred to as misrepresentation. Sometimes it is called setting up a straw man.

>you asked me to justify jp’s exclusion of “complacent” sites from consideration, even though their growth might “have a perfect correlation with temperature”.

good; we have something in common. What you did was invent a story about trees, which has god-like knowledge of why tree growth happens in a particular way as a postulate. This does not explain why complacent sites are discarded.

>Adding or leaving out RHS variables that are uncorrelated to the dependent variable affects goodness-of-fit but not bias. It doesn’t matter how many of them there are.

hmm. I did check, but no-one has mentioned any “RHS variables”, so presumably there will be a caustic retort coming along the lines of “you must be so stupid if you don’t know what one of those is…”

I made clear I am neither a statistician nor a botanist. I did set out my logic clearly, so a bit of gobbledegook that I cannot interpret doesn’t really help much.

>Nope. Your understanding is faulty. per, this is very elementary stuff.

Yes, the obligatory sneer. Strangely, jp disagrees with you.

>At a good site for dendroclimatology there will be individual trees with a high sensitivity, a strong correlation between the trees’ growth patterns within a stand, and a chronology that has a significant correlation with a climate variable(s).

so jp requires that there is a strong correlation between tree growth patterns within a stand. jp discards data for stands where most of the trees are complacent.

I actually set out the biological basis for that reasoning, but you have not addressed that logic. Enjoy your holiday.

For tree X1, let us have a correlation coefficient A1. For tree X2, let us have a coefficient A2. This is completely irrelevant to how you actually do the analysis in dendrochronology !

For working out the responsiveness of the stand of trees, what you do is average the responsiveness of all the trees in the stand. So you are averaging the growth response of the X1 tree, with the growth response of the X2 tree, in a given year. Once you have an average of the response for all the trees, that is when you do your correlation with temperature. So it appears that, contrary to your comments, it really could create a bias if you exclude low variance samples.
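
per’s description of the averaging step can be sketched with toy numbers (the sensitivities and noise level here are invented, not taken from any real stand):

```python
import numpy as np

rng = np.random.default_rng(1)
years = 100
temp = rng.normal(size=years)                      # made-up yearly temperatures
sens = np.array([1.0, 0.8, 0.1, 0.05])             # per-tree responsiveness (toy values)
growth = sens[:, None] * temp + 0.5 * rng.normal(size=(4, years))
chronology = growth.mean(axis=0)                   # first average the stand ...
r = np.corrcoef(chronology, temp)[0, 1]            # ... then correlate with temperature
```
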

“A further aspect of this critique is that the single-bladed hockey stick shape in proxy PC summaries for North America is carried disproportionately by a relatively small subset (15) of proxy records derived from bristlecone/foxtail pines in the western United States, which the authors mention as being subject to question in the literature as local/regional temperature proxies after approximately 1850 (cf. MM05a/b; Hughes and Funkhauser, 2003; MBH99; Graybill and Idso, 1993). (p.9)”

From NAS Report Chapter 4

“The possibility that increasing tree ring widths in modern times might be driven by increasing atmospheric carbon dioxide (CO2) concentrations, rather than increasing temperatures, was first proposed by LaMarche et al. (1984) for bristlecone pines (Pinus longaeva) in the White Mountains of California. In old age, these trees can assume a ‘strip-bark’ form, characterized by a band of trunk that remains alive and continues to grow after the rest of the stem has died. Such trees are sensitive to higher atmospheric CO2 concentrations (Graybill and Idso 1993), possibly because of greater water-use efficiency (Knapp et al. 2001, Bunn et al. 2003) or different carbon partitioning among tree parts (Tang et al. 1999)…. ‘Strip-bark’ samples should be avoided for temperature reconstructions; attention should also be paid to the confounding effects of anthropogenic nitrogen deposition (Vitousek et al. 1997), since the nutrient conditions of the soil determine wood growth response to increased atmospheric CO2 (Kostiainen et al. 2004).”

If all the proxies are well correlated to temperature, then they should have some correlation to each other.

The Bristlecone/California Basin proxies do not correlate with the other proxies after about 1850, for reasons set out above.

That is why inclusion of the Bristlecone/California proxies is erroneous.

Don’t just take my word for it: graph Mann’s data on a spreadsheet, and you will see that the Bristlecones are way out of line with almost all the other proxies, and the other proxies do not, in general, show a rising temperature trend in the 20th century.
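
The premise this argument rests on – two proxies that each track temperature should correlate with one another – can be checked with made-up series (the noise level is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(2)
temp = rng.normal(size=500)
# two hypothetical proxies that each track temperature with independent noise
p1 = temp + 0.5 * rng.normal(size=500)
p2 = temp + 0.5 * rng.normal(size=500)
r1t = np.corrcoef(p1, temp)[0, 1]
r2t = np.corrcoef(p2, temp)[0, 1]
r12 = np.corrcoef(p1, p2)[0, 1]   # the proxies correlate with each other too
```
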

I make my living either by answering clients’ questions or teaching students to figure out how to answer their own. Last I looked, I hadn’t seen a check in the mail from you so unless you’re one of my students (a thought that should frighten both of us) you’re eating into my vacation time and the hints and instruction I give you are solely for entertainment purposes. I can tell from your post that you didn’t do what I suggested, which was to work out a numerical example. In class I tell my students that the homework is an integral part of their learning process and that I do not assign make-work problem sets. That web page you found was an excellent place to start but don’t just dig it up and think that you understand cuz you looked at the pretty pictures. Work out the example. Then add 5 or 6 complacent oaks with constant ring size, e.g., 2.5mm for each year. You started off claiming that “clearly” leaving out small variance complacent trees “substantially” biases the results: well, trees with constant ring size have variance = 0, so that should qualify. Work out the example a second time. Now here’s an exam question: “How did adding (or leaving out) those oaks with zero variance tree rings affect the predictive value? Show your work.”
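
For anyone who wants the answer key, here is one way to work that homework in Python (all numbers invented; the 2.5mm oaks are exactly the example set above):

```python
import numpy as np

rng = np.random.default_rng(3)
temp = rng.normal(size=100)
pines = temp + 0.3 * rng.normal(size=(10, 100))   # ten responsive trees
oaks = np.full((6, 100), 2.5)                     # six complacent oaks: 2.5mm rings, variance 0
r_without = np.corrcoef(pines.mean(axis=0), temp)[0, 1]
r_with = np.corrcoef(np.vstack([pines, oaks]).mean(axis=0), temp)[0, 1]
# folding the oaks in rescales and shifts the averaged chronology, and
# correlation is invariant under such changes, so the two r values agree
```
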

My classes generally have prerequisites. If a student comes into lecture and asks a question I’ll answer it. If that student says he didn’t meet the stat prerequisite, didn’t do the homework assignment, and complains that I’d answered at a level beyond his understanding, I’d say that was rude. If he then makes an incorrect claim based on things he should have known, I’d say he’s a buffoon. Don’t be a buffoon.

Now I’m off to see what’s left of the Tour, and I gotta hurry cuz if they eject any more riders they may have to cancel the whole damn thing.

From the outset my objective has been to explain why your criticism of jp’s approach is without merit. So when you claim, without any evidence, that his methods exclude potentially useful data, I give you an example to show why data that looks useful to you may look worthless to him. Complacent sites show little variance, even in periods when temperature varies quite a lot. This does not mean that a series derived from a complacent site must be trendless. A variable can have a pronounced trend over a century and yet show little variance over periods of, say, five years. Is this the case with many complacent sites, or few, or none at all? I have no idea. That’s why I cannot tell you “why complacent sites are discarded”, in the real world. All I have said is that there is a scenario in which complacent sites ought not to be included in datasets used for dendroclimatology. Whether the data actually support exclusion of complacent sites for that particular reason I don’t know.

But limited as my contribution is, it is sufficient to show that your criticism of jp is unconvincing, to say the least. You said initially that complacent sites cannot show a rising temperature trend. I’ve explained why that’s not true: a series can have comparatively low variance (hence be complacent by jp’s definition) and still show a rising or falling trend.

To spell it out further, since I fear you still haven’t got it: suppose a variable X rises at a constant annual rate, while another variable, Y, follows a cyclical pattern but with an underlying upward trend. Over very long periods of time, X may track Y pretty closely. Does that make X a good proxy for Y? Not if you want to know what is happening in the cycles. Temperature in the 20th century shows an upward trend, but it doesn’t show a steady upward trend. A good proxy variable picks up at least some of the short-term variation as well as the trend. A complacent series is by definition incapable of doing that.
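
The X-versus-Y point can be illustrated with a toy pair of series (the trend and cycle here are arbitrary choices, not climate data):

```python
import numpy as np

rng = np.random.default_rng(4)
t = np.arange(200.0)
x = 0.01 * t + 0.01 * rng.normal(size=200)                # X: a near-constant rise
y = 0.01 * t + 0.5 * np.sin(2 * np.pi * t / 20)           # Y: same trend plus cycles
r_long_run = np.corrcoef(x, y)[0, 1]                      # high: the trends line up
r_cycles = np.corrcoef(x - 0.01 * t, y - 0.01 * t)[0, 1]  # near 0: X misses the cycles
```
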

>From a statistical point of view I see nothing wrong with the approach jp describes.

>That’s why I cannot tell you “why complacent sites are discarded”, in the real world. All I have said is that there is a scenario in which complacent sites ought not to be included in datasets used for dendroclimatology.

let me get this right; you have given a positive and unambiguous declaration that discarding complacent site data is okay, yet you cannot justify why complacent sites are discarded and you accept the possibility that this could introduce bias.

>You said initially that complacent sites cannot show a rising temperature trend.

I think this unlikely, but accept that this is true. They could show a gradual increment.

>A good proxy variable picks up at least some of the short-term variation as well as the trend. A complacent series is by definition incapable of doing that.

You are having your cake and eating it. Complacent sites can show a trend, but they cannot show enough of a trend to correlate with temperature. Clearly, if the temperature didn’t change much year to year, they can. This statement of yours is simply a baseless assertion.

You have also failed to establish why a statistical correlation derived from a complacent tree, should be any less meaningful than the same number derived from the analysis of a sensitive tree. Even in your tenuous example featuring “cycles”, it would be quite acceptable for trees to give the long-term temperature trend. In short, I am saying that your requirement that a proxy picks up short-term temperature information is demonstrably wrong; tree rings cannot resolve on timescales shorter than a year for a kick-off.

>From the outset my objective has been to explain why your criticism of jp’s approach is without merit. So when you claim, without any evidence, that his methods exclude potentially useful data,

it is not just jp’s approach. Jacoby is a famous dendrochronologist, and he wrote this:

>The criteria are good common low and high-frequency variation, absence of evidence of disturbance (either observed at the site or in the data), and correspondence or correlation with local or regional temperature. If a chronology does not satisfy these criteria, we do not use it. The quality can be evaluated at various steps in the development process. As we are mission oriented, we do not waste time on further analyses if it is apparent that the resulting chronology would be of inferior quality. If we get a good climatic story from a chronology, we write a paper using it. That is our funded mission. It does not make sense to expend efforts on marginal or poor data and it is a waste of funding agency and taxpayer dollars. The rejected data are set aside and not archived. As we progress through the years from one computer medium to another, the unused data may be neglected. Some [researchers] feel that if you gather enough data and n approaches infinity, all noise will cancel out and a true signal will come through. That is not true. I maintain that one should not add data without signal. It only increases error bars and obscures signal. As an ex-Marine I refer to the concept of a few good men. A lesser amount of good data is better without a copious amount of poor data stirred in.

More from per: let me get this right; you have given a positive and unambiguous declaration that discarding complacent site data is okay, yet you cannot justify why complacent sites are discarded and you accept the possibility that this could introduce bias.

That’s your best effort at getting it right?

Let’s start with this guff about a “positive and unambiguous declaration”.

I wrote that “from a statistical point of view I see nothing wrong with the approach jp describes.” Of course I cannot comment on whether jp’s approach is sound from, say, a botanist’s perspective, although I have no reason to doubt it. There may also be something wrong from a statistical perspective. If so Robert is more likely to spot it than I am. But for now my endorsement stands, for what it’s worth.

In the mind of per this is transformed into “a positive and unambiguous declaration that discarding complacent site data is okay”, which tells us something about per’s capacity for misreading. There is a distinction between saying that per’s efforts to demonstrate an error have failed miserably, and saying unequivocally that no error of any kind exists. Most readers can grasp that distinction.

For all I know maybe jp is doing something wrong. But several days after grandly announcing “I will even take you on at dendrochronology”, the challenger hasn’t even laid a glove on him.

Since when is it up to me to “justify why complacent sites are discarded”, per? You are the one claiming that there is something wrong with jp’s procedure. It is up to you to show why they should not be discarded. This you have failed to do.

Next per says I accept the possibility that this could introduce bias. Do I, per? Your exchanges with Robert suggest that you don’t even know what the term bias means in the context of statistics. As for me, Ctrl-F confirms that the only time I used the word “bias” in this thread was when I took per to task for using it in an unintelligible way:

Here you are saying that a variable is both unknown and probably biased. That makes no sense at all. Perhaps you mean that the true correlation is unknown – as true correlations invariably are – while the estimates are biased. For all I know that may be true, but you shouldn’t complain about jp losing patience with you if you won’t take the trouble to present a coherent case.

per’s response to that suggested that he was dropping the claim of bias:

You are of course correct when saying that I cannot say that something is both unknown and biased. I will stick with unknown, but it pains me to see so many people flagrantly burying/ not analysing perfectly good data, and that is a bias.

That, folks, shows just how much nonsense per can cram into a few lines of text. It comes after an indent in which he puts together two remarks of mine which don’t even come from a single comment. What the idea of that was is anyone’s guess. I will leave the rest for now. The weather is fine and anyway the World Cup is becoming interesting.

We have now come full circle. Early in this thread I pointed out that research is undertaken for a purpose. I also pointed out that tree growth is influenced by many variables. At some sites it may not be possible to develop a chronology that is useful for climatic reconstruction. That may be evident in the field, it may be evident in the lab. What Jacoby points out is that there is no useful purpose in pursuing a site or chronology when it has little value for climatic reconstruction, if the purpose of the work is climatic reconstruction. Only a troll would interpret this to mean “If he doesn’t like the data (gets the wrong correlation), he bins it.”

Jacoby states “If we get a good climatic story from a chronology, we write a paper using it”. Now listen closely because this is the tricky part for conspiracy theorists: it does not matter if that story supports or refutes the ‘consensus’ view. What matters is that the chronology is of high quality and the story gets published.

Here is an opportunity for the clowns of the audit, go do your own work and publish it.

>I wrote that “from a statistical point of view I see nothing wrong with the approach jp describes.” … There may also be something wrong from a statistical perspective… But for now my endorsement stands, for what it’s worth.

If your statement is that you see nothing wrong with X, but you now qualify this by saying that there may be something wrong with X, I am quite happy to withdraw any suggestion that a statement that you made has any positive or unambiguous meaning whatsoever!

>Next per says I accept the possibility that this could introduce bias.

> suppose a variable X rises at a constant annual rate, while another variable, Y, follows a cyclical pattern but with an underlying upward trend. Over very long periods of time, X may track Y pretty closely.

That is your quote, and you are premising that for complacent trees, the growth (X) tracks temperature (Y) over a very long period “pretty closely”. Systematically discarding sites with such a pretty close relationship of temperature and growth will bias your estimates of how well trees measure temperature, unless sensitive trees show exactly the same response.

>I think I now have a fair idea of what per’s concern is. At all events I will give it one last shot, since it relates to statistics not botany (of which I know nowt). Suppose we have a large table of numbers, consisting of hundreds or maybe thousands of columns. The first column represents temperature and contains 100 numbers corresponding to the years 1901-2000. Each of the other columns represents a tree and contains 200 numbers which may or may not track temperature closely for the period 1801-2000. The optimists assert that some unknown proportion of the columns are a good proxy for temperature, but the only way to identify them is by statistical methods. The table is all we have.

Err, this is the crux, and what you still haven’t addressed. In fact, your table contains the “cherry-picked” data of sites with “good” correlations, and all the sites with “bad” correlations are simply buried, and never archived. You have simply selected those sites which have a good correlation, and are now using the fact that these sites have a correlation with temperature to prove that they are sensitive to temperature.

In fact, this is indistinguishable from the cardinal error of throwing away data you don’t like till you get the “right” answer. This analysis says nothing about whether these sites are sensitive to temperature, because you have pre-selected on correlation to temperature.
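
per’s screening objection is easy to demonstrate with pure noise (a toy sketch; the series count and the 0.2 threshold are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(5)
temp = rng.normal(size=100)                       # the 1901-2000 column
noise = rng.normal(size=(1000, 100))              # 1000 "trees" of pure noise, no signal
r = np.array([np.corrcoef(s, temp)[0, 1] for s in noise])
kept = noise[r > 0.2]                             # screen on correlation with temperature
r_screened = np.corrcoef(kept.mean(axis=0), temp)[0, 1]
# a stack of pure noise, once screened, "tracks" temperature by construction
```
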

>If for example a researcher is testing the hypothesis that a particular species of tree is sensitive to temperature, or that its sensitivity depends on a particular set of variables, then of course it would be inexcusable to make lots of observations and throw out the ones that don’t fit. That’s not the case being discussed here. I refer you to jp’s explanation:
>>If the objective of the research is to extract a climate signal from a tree ring series it would be of no value to sample trees that were not sensitive to climate.

So what jp describes is a procedure which is indistinguishable from “make lots of observations and throw out the ones that don’t fit [correlate]”, but if we call it “extract[ing] the climate signal”, then that’s okay.

>Muffin Top: I give “F”s when students don’t do their homework. Please let your next post be the results of your worked example.

what a surprise, abuse ! One of the minor things that people forget is that there is such a thing as noise, or biological variation. So if you do an average of (signal + zero), then the resulting arithmetic mean will retain the same correlation, no matter how much you dilute the signal. This is not true in biological systems, where the signal is frequently not much greater than the noise. You only have to dilute signal into a little bit of noise, and lo ! the signal becomes indistinguishable from noise.

looking forward to getting my homework marked ! Can I bill you for the tuition I am giving you ?
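
per’s dilution point – noise with variance greater than zero, unlike a constant, does degrade an averaged signal – can be illustrated with toy numbers (tree counts and noise levels invented):

```python
import numpy as np

rng = np.random.default_rng(6)
temp = rng.normal(size=200)
sensitive = temp + 0.3 * rng.normal(size=(5, 200))     # five trees carrying signal
complacent = rng.normal(size=(40, 200))                # forty trees: noise only, no signal
r_alone = np.corrcoef(sensitive.mean(axis=0), temp)[0, 1]
r_mixed = np.corrcoef(np.vstack([sensitive, complacent]).mean(axis=0), temp)[0, 1]
# unlike a zero-variance constant, noisy no-signal trees drag the correlation down
```
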

“‘Didn’t come from anything to do with NAS. An MBH-reconstruction using the proxies from 1450, and without using the Bristlecones, was done by Wahl and Ammann 2006.’

Are you referring to figure 4a or 5c,”

Glad you finally realized it wasn’t “lifted from an erroneous statement during the NAS press conference”. Scenario 6.

“where exclusion of the bristlecones causes the reconstruction to fail ?”

I’m talking about “reconstruction using the proxies from 1450, and without using the Bristlecones” where the reconstruction passes.

“Where the 1400s are up to 0.3 °C warmer ?”

It’s beside the point here but there’s not much point talking about attempted reconstructions that fail validation such as using the 1400 network with Bristlecones removed. The attempt fails. End of story.

>What Jacoby points out is that there is no useful purpose in pursuing a site or chronology when it has little value for climatic reconstruction, if the purpose of the work is climatic reconstruction.

Dear jp

I am quite happy to be educated, and will accept if you can explain why I am wrong.

But what I hear you (and Jacoby) saying is that you select sites on the basis of their correlation with temperature; data from other sites are binned. The case is then made that because there is a correlation with temperature, these trees are sensitive to temperature.

Kevin Donoghue says:
>it would be inexcusable to make lots of observations and throw out the ones that don’t fit

You have made an ex cathedra pronouncement that there is a climate signal there which can be extracted. But I cannot see the reason why what you are doing is different from Kevin’s description.

>Only a troll would interpret this to mean “If he doesn’t like the data (gets the wrong correlation), he bins it.”

Reading Jacoby’s statement, it is quite clear that he says he bins data that doesn’t have “correspondence or correlation with local or regional temperature”. I have the utmost concern that you pre-select data on the basis of temperature correlation, and then claim that tells you temperature sensitivity. You could do exactly the same thing for any series of random data, and it would tell you exactly the same thing.

>“If you take the Bristlecones out after 1450 you get the same old hockeystick.”
>An MBH-reconstruction using the proxies from 1450, and without using the Bristlecones, was done by Wahl and Ammann 2006.
>>It’s beside the point here but there’s not much point talking about attempted reconstructions that fail validation such as using the 1400 network with Bristlecones removed. The attempt fails. End of story.

Let me see if I understand you, Chris. You have said that if you take the Bristlecones out, you get the same old hockeystick, and you cited W&A. You now point out that this reconstruction fails in the absence of bristlecones !

Just joining the dots, don’t you realise that you have undercut your original statement ?

>”The Bristlecone/California Basin proxies do not correlate with the other proxies after about 1850″
So? They’re only needed for reconstructions before 1450. Last time I checked 1450 was waaaay before 1850.

Ouch ! the point is that in the region of time where we can check if they are temperature measures (1850+), we can demonstrate that they are not. If they are not thermometers when we can test them, why should they act as thermometers in the 1400s, when we cannot test their properties ?

“and you cited W&A. You now point out that this reconstruction fails in the absence of bristlecones !”

I wasn’t talking about the same reconstruction, which you would have realized if you had ever read and understood W&A. The reconstruction using the 1450 proxy network is different from the reconstruction using the 1400 proxy network because the 1400 proxy network has a lot fewer proxies.

You’d be a lot better off spending more time reading the work of people whose job is this stuff and less time writing your opinions on blogs.

“Results for the exclusion of the bristlecone/foxtail pine series developed according to scenario 3 are shown by the green curve in Figure 2. The exclusion of these proxy records generally results in slightly higher reconstructed temperatures than those derived from inclusion of all the proxy data series, with the greatest differences (averaging ~ +0.10°) over the period 1425-1510. The highest values before the 20th century in this scenario occur in the early 15th century, peaking at 0.17° in relation to the 1902-1980 mean, which are nevertheless far below the +0.40-0.80° values reported for scenario 1. The verification RE scores for this scenario (Table 2) are only slightly above the zero value that indicates the threshold of skill in the independent verification period, and the verification mean reconstructions are correspondingly poor. These results, which cannot be attributed to calibration overfitting because the number of proxy regressors is reduced rather than augmented, suggest that bristlecone/foxtail pine records do possess meaningful climate information at the level of the dominant eigenvector patterns of the global instrumental surface temperature grid. This phenomenon is an interesting result in itself, which is not fully addressed by examination of the local/regional relationship between the proxy ring widths and surface temperatures (noted in section 1.1) and which suggests that the ‘all proxy’ scenarios reported in Figure 2 yield a more meaningful comparison to the original MBH results than when the bristlecone/foxtail pine records are excluded.” (Wahl and Ammann, Climatic Change, in press, p. 29)

If you exclude the Bristlecones the “skill” of the result is minimal.

If you include the Bristlecones (which have no correlation to local temperature after 1850) the “skill” increases, but only because the proxy data “skill” is in this case being compared to “non-local” temperature by means of a “teleconnection”.

So there you have it. No “skill” in relation to local temperatures with either method.

The only way they can find “skill”, is by connecting them to temperatures that may have occurred elsewhere in the world, but not in the California Basin.

Let Y be a vector of temperatures. Let X be a vector which is a r.v. of the (average detrended) deviation of sensitive tree rings, just as was shown in the example you dug up. Let Z be the (average) deviation for a bunch of complacent tree rings with zero variance. Zero variance means they’re a constant. Adding a constant to X doesn’t change its correlation with Y. Adding a constant to X doesn’t change its variance, either. So adding a constant to X doesn’t change the coefficient for X in its regression with Y. So there is no bias. QED.
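
The QED can be verified numerically (toy data; any constant works for Z):

```python
import numpy as np

rng = np.random.default_rng(7)
y = rng.normal(size=100)                     # temperatures
x = 0.8 * y + rng.normal(size=100)           # averaged sensitive-tree deviations
x_shifted = x + 2.5                          # add Z, a zero-variance constant
r1 = np.corrcoef(x, y)[0, 1]
r2 = np.corrcoef(x_shifted, y)[0, 1]
b1 = np.polyfit(x, y, 1)[0]                  # regression slope of y on x
b2 = np.polyfit(x_shifted, y, 1)[0]
# correlation and regression slope are untouched by the constant
```
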

So, how’d you do? Hmmm. You claimed that leaving out (or adding) the complacent trees would bias the results. Ooooh, not a good answer.

I should point out that there probably were a reasonable number of thermometers in North America for a fair while before 1850. So you could probably calibrate Bristlecone proxies before 1850 without using intermediate proxies. But as I said, there’s more than one way to skin a cat.

okay, you have been careful to avoid specifying what an r.v. is, and I don’t know. Is that like an “r.h.s.”, or is it more like a WTF ?

I have been careful to set out what I am saying in simple and plain English. Switching to undefined acronyms, using obscure jargon; do you think that gives you an appearance of knowledge ?

>Let Z be the (average) deviation for a bunch of complacent tree rings with zero variance.

Okay, this one I get. When I said that biological samples have noise, you have replied with “let there be no noise in biological samples”. That is kind of like sticking your fingers in your ears, shutting your eyes, and shouting “I can’t hear you”.

It seems that you don’t know that biological samples have noise. It seems that you cannot even parse my simple sentences that state that biological samples have noise. Maybe you don’t know what noise is ?

>I should point out that there probably were a reasonable number of thermometers in North America for a fair while before 1850.

these were both your quotes; are you now arguing with yourself ?

Anyway, I am sure you can skin a cat, but the fact is the bristlecone pines don’t show a correlation with local temperature post 1850; how on earth can you make a case that they are temperature proxies ?

“but the fact is the bristlecone pines don’t show a correlation with local temperature post 1850; how on earth can you make a case that they are temperature proxies ?”

Have you started spending more time reading the work of people whose job this stuff is, and less time writing your opinions on blogs yet? Getting banned should be an excellent opportunity for you to do so. Anyway, regarding calibrating proxies without using contemporary thermometers, this can be done by calibration from other proxies that have been calibrated against thermometers. There are 250 years of high quality temperature reconstructions from 1600 to 1850 that enable this to be done.

>Anyway, regarding calibrating proxies without using contemporary thermometers, this can be done by calibration from other proxies that have been calibrated against thermometers.

Maybe you missed it when several of us wrote that the bristlecones do not show a correlation versus local temperature post-1850; in other words, they fail ! How can you suggest that calibration against a different proxy validates them, when you know that they fail against local thermometer readings ?

Graybill and Idso, when they characterised many of the californian great basin bristlecones, explicitly set out that they are not temperature proxies ! You are arguing against the people whose job it was to characterise these trees.

> Getting banned should be an excellent opportunity for you to do so.

you may imagine that getting “banned” is like a legal sanction, or some sort of slur on my reputation. I think it reflects a bit more on TimL than me.

I think everyone is free to check that Tim got you dead to rights, per, and dealt with (what you may now believe to have been) your little booboo very reasonably. Making some fair, modest yet winning points would be your best comeback, but I’m betting you’ll stick with playing the girlish smartarse. Please prove me wrong.

>I think everyone is free to check that Tim got you dead to rights per

>that’s enough trolling.

QED; what more needs to be said ?
If you are referring to the bit about vector, and r.v., I still don’t know what an r.v. is, and I don’t see how I can guess.

I confess I am not a statistician, and I asked politely for the meaning of the word “vector”, as it didn’t fit in with my knowledge of what a mathematical vector is. Although there is a meaning in computer science (http://en.wikipedia.org/wiki/Vector), I don’t see how I can reasonably know this unless I am a computer scientist. Is this Tim’s reason for banning me ?

how strange ! Dano has not engaged in the debate, nor made a substantive point. He has told us that there are listservs (somewhere), or stuff in a library (somewhere), and strangely enough, he talks a lot about hand-waving.

In this context a vector, X, is a set of N observations (x1, x2, x3, … ,xN) of a variable, and an r.v. is a random variable (not, as you may suppose, a recreational vehicle). RHS stands for right-hand side (of a regression equation).

I believe that when he banned you, Tim Lambert was acting on the following assumptions:

Your name is David Bell.

per is short for peroxisome, as in such memorable remarks as “glucocorticoid-induced PPAR alpha is linked to peroxisome proliferator mitogenesis”.

You know at least as much about statistics as a typical science undergraduate and probably a damn sight more.

You feign ignorance in order to disrupt comment threads.

If these assumptions are correct, your banning was thoroughly justified. Indeed a good kick up the arse wouldn’t be entirely undeserved.

If these assumptions are incorrect then TL owes you an apology.

Tim,

If you reckon I’m feeding a troll, please feel free to delete this comment.

Dear Kevin
thanks for clarifying rv, and rhs; I have not come across either abbreviation, or that use of vector, in context before.

I guess if you have done physical science at university, or even maths/ stats, these might be fairly elementary usage. However, I have never done maths/physical science/ computing at university, and such stats as I did do were at an elementary level.

>You feign ignorance…

I have said twice I hadn’t come across that use of vector, and I have also said I didn’t know what rv and rhs were in context. If you choose to assume that I am lying, based on no knowledge whatsoever, that is your call.

Maybe you missed it when several of us wrote that the bristlecones do not show a correlation versus local temperature post-1850; in other words, they fail ! How can you suggest that calibration against a different proxy validates them, when you know that they fail against local thermometer readings ?

Well, MarkR posted why above, though at the time he didn’t realize that his quoted source didn’t support his (or your) position …

>If you include the Bristlecones (which have no correlation to local temperature after 1850) the “skill” increases, but only because the proxy data “skill” is in this case being compared to “non-local” temperature by means of a “teleconnection”.

Steve McIntyre at climate audit did some lovely stuff on this recently. Someone used a proxy series – on condition it had to show good correlation with local temperature. Sure enough it did; but it wasn’t the local temperature. Once you start sampling all the local temperatures, using just the 9 closest temperature stations, you can get your correlation statistics up to pretty reasonable simply by choosing the best statistic out of 9…
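The best-of-nine effect is easy to simulate (this is an illustrative toy with pure noise, not McIntyre's actual data): correlate a noise "proxy" against nine independent noise "stations" and keep the best score.

```python
import numpy as np

rng = np.random.default_rng(1)
n_years, n_stations, n_trials = 100, 9, 2000

single, best_of_9 = [], []
for _ in range(n_trials):
    proxy = rng.normal(size=n_years)                 # pure-noise "proxy"
    stations = rng.normal(size=(n_stations, n_years))  # pure-noise "stations"
    r = np.abs([np.corrcoef(proxy, s)[0, 1] for s in stations])
    single.append(r[0])         # honest: one pre-specified station
    best_of_9.append(r.max())   # cherry-picked: best of nine

print(round(np.mean(single), 2), round(np.mean(best_of_9), 2))
```

Even with no real signal anywhere, the picked-after-the-fact correlation is markedly larger than the honest one, which is the selection bias being described.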

Re: “Steve McIntyre at climate audit did some lovely stuff on this recently.”

Per, the people at climate audit have never done any “lovely stuff”. They have never done anything constructive in climate research, and their blog is the favourite “climate science” blog of fools, industry PR hacks, and loony libertarians who have no real climate science background or serious record in climatology.

If you are to cite any climate science blog at all, please use RealClimate.org as your source and not one done up by economists, former mining executives, or “journalists”.

>I don’t think it is plausible that someone with a PhD in biology would really believe that “vector” is obscure jargon.

I am just wondering if you know what is involved in a UK biology undergraduate degree. Do you realise you can do such a degree while doing no formal maths, physical science or computing courses ? Do you realise that there can be no formal courses during a UK PhD, and that math/physical science/ computing would not normally be the subject matter of a biology PhD ?

I can only wonder what the factual basis for your belief system is; but I can point out to you that not everyone has studied computer science.

I count several publications in the peer-reviewed press, and they are also largely responsible for ensuring the NAS review happened. Strangely enough, the NAS review seems to accept many of the points made by M&M.

it is truly rare to come across such an incisive piece of prose, making such an elegant, but well-thought through contribution to the debate:

>I’ve pointed out you are an annoying bobblehead, full of sh-t and this listserv you are so afraid of will further point out your mendacious, obfuscatory and – frankly – full of cr-p FUD. That is: you don’t have the courage …

As it happens, I have six textbooks about statistics on my desk. According to the index, one book out of six has the word “vector” indexed, and this book is a “how-to” guide for a particular statistical program.

“”Anyway, regarding calibrating proxies without using contemporary thermometers, this can be done by calibration from other proxies that have been calibrated against thermometers.”

Maybe you missed it when several of us wrote that the bristlecones do not show a correlation versus local temperature post-1850;”

Did you say they showed no correlation with temperature before 1850?

“How can you suggest that calibration against a different proxy validates them, when you know that they fail against local thermometer readings ?”

As McIntyre likes to remind us, the Bristlecone proxies were interfered with by rising CO2 level since 1850. This interference didn’t stop them from being proxies before 1850.

“”Getting banned should be an excellent opportunity for you to do so.”

you may imagine that getting “banned” is like a legal sanction, or some sort of slur on my reputation. I think it reflects a bit more on TimL than me.”

I never expected you to say that.

In any case, the suggestion that Bristlecone proxies cause a huge error in temperature reconstructions solely during the period where the reconstructions are dependent on them (before 1450) relies on believing a set of incredible coincidences. Somehow we’re expected to believe that even though reconstructions using Bristlecone proxies agree extremely well with reconstructions not using them after 1450, the Bristlecone proxies suddenly go mad before 1450, when they haven’t caused any problem after 1450. So they only go crazy at just the moment we can’t check up on them, while every time we can check them (and they’re not interfered with by something) they behave themselves. You can believe that if you want to, but don’t expect me to agree with you.

And don’t forget we have a perfectly good hockeystick from 1450 to now without touching Bristlecones.

Per, one can buy any number of textbooks. Reading them and understanding them is another thing. Given the ubiquitous (BRING BACK THE SPELL CHECKER TIM!) reach of linear algebra into all branches of mathematics, including statistics, one is tempted to ask which books you have.

Forgive me, but I don’t recall you showing that. Under any circumstances, it is irrelevant. They do not show correlation with local temperature post-1850. There are several (speculative) hypotheses why that may be so, but no proof. It is sufficient to note that they do not correlate with temperature.

Given that is so in the area where we have made a test, how can we possibly suggest that they will be temperature proxies at any other time ?

Even if you were able to show that they showed a correlation with local temperature from say 1600-1850, how do we know that there will not be other occasions in history when they would fail to show a correlation with local temperature again ?

talking about incompetence: one author doesn’t want to share his original data, so his graphs are (wrongly !) digitised by author 2. Conclusions are drawn from the wrong data, and nobody cares to archive.

Here’s support for the positions taken (in part) by the NAS and by per and MarkR, in the Submission by David Holland to the UK’s Stern Review:
“21. The Stern Review should note that many of the proxy reconstructions, like Dr. Mann’s MBH98, which suggest the 20th century is exceptional in the last millennium, are based largely on the width of tree rings. Most of us know that trees grow better when it is warmer, but we also know that they need water and sunlight and of course CO2. Factors like age, disease and local competition also affect growth. Thus the width of a tree ring may vary in a linear way in relation to average temperature over a range of temperatures, provided all the other factors that affect it remain constant. In very few places, and only for limited time spans, can it be said that tree rings may respond linearly to average temperature…. Non-linear processes such as plant growth do not lend themselves to analysis using the linear algebra upon which statistical analysis relies.
22. Tree ring width data were collected for dendrochronology long before climate change was an issue, but few temperature reconstructions use those after the 1960s. The Stern Review should understand why. In 1998, Briffa et al presented a paper entitled “Trees tell of past climates: but are they speaking less clearly today?” In [their] graph it can be seen that after 1960 tree rings are poor proxies for temperature. This is euphemistically referred to as the “Divergence Problem.” However, it does not prevent the same data being used in other papers to corroborate the “hockey stick”. The inconvenient data after 1960 are simply omitted. In other fields, leaving out contrary indications as significant as this would be considered dishonest. These data are included in the “spaghetti” diagrams shown in [Stern’s] Technical Annex and in the IPCC TAR 2001 as corroboration of Dr Mann’s hockey stick.
23. Clearly substantial uncertainty exists over reconstructions of past global temperatures based on proxies, and particularly tree rings. Tree ring series, or parts thereof, are selected or cherry-picked for use in reconstructions. Series from the same area can exhibit differing trends, and without seeing the selection criteria, which are rarely available, it is reasonable to suggest that there is bias in their selection.”

“Here’s support for the positions taken (in part) by the NAS and by per and MarkR, in the Submission by David Holland to the UK’s Stern Review”

So why do you think anyone would be interested in an out-of-date re-hash of the usual denialist misinformation? If you’re going to feed us denialism Tim at least make it up to date. BTW, how are you going with your study on molecular weight?

“”Did you say they showed no correlation with temperature before 1850?”

Forgive me, but I don’t recall you showing that.”

I never said I did.

“Under any circumstances, it is irrelevant.”

Yeah if you say so Mr Dendroclimatologist.

“They do not show correlation with local temperature post-1850.”

Yeah if you say so etc. Show me any up-to-date scientific publication that says they do not show ANY correlation with local temperature post-1850. And words such as “being subject to question” and “should be avoided” do not mean the same thing as “do not show ANY correlation”.

>>”They do not show correlation with local temperature post-1850.”
>Yeah if you say so etc. Show me any up-to-date scientific publication that says they do not show ANY correlation with local temperature post-1850.

Sadly, there are not whole publications devoted to showing single correlations, and I guess I should be clear. When I say “they do not show correlation”, that is shorthand for “they do not show a statistically significant correlation”; of course there is a correlation of some sort.

Stephen Berg “Even the CA guys are showing an alarming warming trend over the last three decades.”

The link you provide shows a graph of the reproduction of the Mann Hockeystick (which is discredited because it overweights the California Basin Bristlecones), and it also shows a graph of the actual Pacific Basin Bristlecone proxy, and the statistical significance tests that Steve McIntyre did to check the Bristlecones’ validity as a local temperature proxy:

“….a first test, I calculated decadal averages in decades starting in year 6 and repeated the above calculation. Because there are only 9-10 degrees of freedom, one expects that some apparently high correlation might not be accompanied by statistical significance. In this case, I got a correlation of 0.38 (as compared to 0.56 cited in Jones and Mann 2004, quoted in Osborn and Briffa 2006). Looking back, they did their calculation on a 1901-1980 period, while I did it for the full period. I’ve not bothered trying to see what I get for 1901-1980 as I don’t think that it matters anyway. The adjusted r-squared is only 0.056; the t-statistic is a mere 1.29 (not significant) and DW is a ghastly 0.36.”

So you see, to put it in scientific terms, the Bristlecones are useless as a temperature proxy in the 20th century.
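The quoted t-statistic can be sanity-checked from the correlation and sample size alone. A minimal sketch (assuming roughly 12 decadal averages, i.e. 10 degrees of freedom, which is my reading of the “9-10 degrees of freedom” in the quote):

```python
import math

def t_from_r(r, n):
    """t-statistic for testing whether a correlation r from n pairs differs from zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

# r = 0.38 from ~12 decadal averages (an assumption; the quote says 9-10 df)
print(round(t_from_r(0.38, 12), 2))  # about 1.3, vs ~2.23 needed for p<0.05 at 10 df
```

With so few degrees of freedom, even an apparently respectable r of 0.38 falls well short of significance, which is the point the quote is making.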

I cannot believe you are arguing this. It was reviewed for Climatic Change, and rejected by the referees.

>>”I ask again, show me an peer reviewed paper that passes the verification statistics, that excludes Bristlecones, that graphs as a Hockey Stick.”
>Shown. How about reading it and stop wasting everyone’s time.

well, everybody agrees that the bristlecones show a good correlation with global/northern hemisphere temperatures over the last 150 years; I don’t believe this is contentious. What is concerning is that they don’t show a correlation with the recorded temperature in their locality; as McIntyre’s blog post shows, and as confirmed by Graybill and Idso, who actually did the original work on these samples.

And on-topic for the NAS report, you will note that the NAS report found that a variety of verification statistics were required for the reconstruction. It also noted that MBH’98 fails dramatically on several of these statistical measures. Simply cherry-picking the one statistical test that it does well at is not sufficient…

>MBH’98 fails dramatically on several of these statistical measures
which are not necessary or sufficient for adequate validation, as you would have been aware had you spent more time reading Wahl and Ammann and less time writing ill-informed opinions on blogs.

yes, chris, that’s why I said I lifted that conclusion from the NRC report.

So says the NRC report, according to per, even though this “dramatic failure” doesn’t seem to stop the NRC having a high level of confidence in reconstructions after 1600.

“that’s why I said I lifted that conclusion from the NRC report”

Knowing what an authority you are on the NRC report (which refers to Wahl and Ammann (in press)) and the literature in general it’s rather strange how you came to the view that Wahl and Ammann’s “Examination of Criticisms” was rejected. Who knows what other strange views you might have. Rather than continuing to attempt propagating such strange views it might be a better idea to try to become at least a little better informed and read what Wahl and Ammann actually have to say on the subject.

“• Large-scale surface temperature reconstructions demonstrate very limited statistical skill (e.g., using the CE statistic) for proxy sets before the 19th century (Rutherford et al. 2005, Wahl and Ammann in press). Published information, although limited, also suggests that these statistics are sensitive to the inclusion of small subsets of the data.”

“• There are very few degrees of freedom in validations of the reconstructed temperature averaged over periods of decades and longer. The validation metric (RE) used by Mann et al. (1998, 1999) is a minimum requirement, but the committee questions whether any single statistic can provide a definitive indication of the uncertainty inherent in the reconstruction. Demonstrating performance for the higher-frequency component (e.g., by calculating the CE statistic) would increase confidence but still would not fully address the issue of evaluating the reconstruction’s ability to capture temperature variations on decadal-to-centennial timescales.”

So the NRC committee takes the view that multiple measures of skill are NECESSARY; in direct contradistinction to your view.

“EXAMINATION AND CONTEXTUALIZATION OF INTERANNUAL RECONSTRUCTION PERFORMANCE IN MBH

Tables 1S and 2S give r2 and CE values for the WA emulation of MBH and the MM-motivated scenario sets, which parallel the RE and verification mean offset performance reported in Tables 1 and 2. These data suggest (but see below) that a number of the MBH calibration exercises exhibit poor/very little skill at the interannual temporal scale in the verification period, although they all show very good skill at the multi-decadal scale of reconstructing the shift between the verification and calibration period means. All of the MM-motivated scenarios exhibit essentially no skill at the interannual scale in the verification period, yet at the same time scenarios 5a-d and 6a-b exhibit very good skill for the 1450 calibrations at the multi-decadal scale of the verification/calibration mean offset.

These results highlight the need to be very careful about the logical framework for determining the kinds of errors for which minimization is being sought in validation. To use, for example, just the interannual information available from r2 would, under the criterion of minimizing the risk of committing a false positive error, lead to verification rejection of most of the MBH/WA emulations and all of the MM-motivated scenarios reported. However, this judgment would entail committing large proportions of false negative errors for these reconstructions at the low frequency scale of the overall verification period, whose multi-decadal perspective is the primary temporal focus of this paper. Our assessment is that such a rejection of validated performance at the low frequency scale would be a waste of objectively useful information, similar to that documented for micro-fossil based paleoclimate reconstructions that use highly conservative criteria focused on strong reduction of false positive errors (Lytle and Wahl, 2005). Significant attention to appropriate balancing of these errors is now being sought in paleoclimatology and paleoecology (cf. Lytle and Wahl, 2005; Wahl, 2004), and remains an important venue for further research targeted at recovering maximal information in paleoenvironmental reconstructions.

It also must be noted that indirect verifications of the MBH reconstruction actually suggest quite good interannual performance, potentially raising the question of the quality of the truncated-grid instrumental values used for validation testing in the verification period (219 grid points versus 1082 grid points in the calibration data set). A spatial subset of the MBH annual temperature reconstruction for European land areas (25°W-40°E, 35°N-70°N) compares very well with an independent CFR reconstruction for that region, using a regionally much richer, fully independent set of different instrumental series in combination with documentary proxy evidence (Luterbacher et al. 2004; Xoplaki et al. 2005). Over the 1760-1900 period of this comparison, the r2 between the regional annual temperature averages of these two reconstructions is 0.67, corresponding with excellent visual temporal tracking of these time series (not shown) at interannual, decadal, and centennial scales. [The interannual amplitude in the European-based annual reconstruction is slightly greater compared to MBH.] Additionally, von Storch et al. (2004) show that the higher-frequency (interannual to decadal) tracking of MBH-style reconstructed temperatures with “actual” temperatures in an AOGCM context is very good, although an implementation error in the von Storch et al. analysis incorrectly showed very large amplitude losses for the MBH method at lower-frequency (approximately centennial) scales (Wahl et al., accepted). Good interannual/decadal tracking is robust to correction of the implementation error (Wahl et al., accepted). These indirect tests of the MBH reconstruction add a significant caveat to the indications of poor interannual performance based on the verification instrumental data used by MBH (and by us here). It will be an important area of further testing of the MBH/WA reconstructions to attempt to resolve this inconsistency of verification validation results for the interannual frequency band, which is an issue that is potentially relevant to all high-resolution proxy-based climate reconstructions because of the limited spatial representation of pre-20th century instrumental data.”

So the NRC committee ignore most of what Wahl and Ammann have to say without giving justification.

One more thing, I cannot believe you used a blog post from McIntyre as a reference, i.e. a post from someone who has only ever had papers published in an unreviewed social science journal and a lightly reviewed letters journal.

>So the NRC committee ignore most of what Wahl and Ammann have to say without giving justification.

I am gobsmacked.

>One more thing, I cannot believe you used a blog post from McIntyre as a reference, i.e. a post from someone who has only ever had papers published in an unreviewed social science journal and a lightly reviewed letters journal.

That would be an ad hominem attack ?
A famous climate scientist tried the same trick, complaining that there had been an error of peer review when M&M published in GRL. Kind of sounds suspicious when he extols his own papers in the same journal as having passed the high standard of scientific peer-review.

Just out of interest, have you published anything in the “lightly reviewed” Geophysical Research Letters ?

The version you referred me to is not, as far as I know, the published peer-reviewed version; it refers to itself as “in press”, and as far as I can see it contains several graphs, but you do not refer to any particular one, so I don’t see how you can claim to have shown the relevant graph.

The only graph which refers to non-Bristlecone proxies is Figure 4. It is truncated, and ends in 1500. My guess is that they didn’t want to show the full timescale, as it would clearly not produce a Hockey Stick.

Their paper still endorses the Mann et al. methodology, but has been overtaken by the following, from the von Storch press release following the NAS report:

4) With respect to methods, the committee is showing reservations concerning the methodology of Mann et al. The committee notes explicitly on pages 91 and 111 that the method has no validation (CE) skill significantly different from zero. In the past, however, it has always been claimed that the method has a significant nonzero validation skill. Methods without a validation skill are usually considered useless. http://www.climateaudit.org/?p=716

Also, the conclusion to MM05a noted:

“An obvious guard against spurious RE significance is to examine other cross-validation statistics, such as the R2 and CE statistics, as recommended, for example, in Cook et al. [1994]. While there are limitations to the R2 statistic, the analysis of statistical “skill” of Murphy [1988] presupposes that the R2 statistic exceeds the skill statistic, and cases where the RE statistic exceeds the R2 statistic are of particular concern [Cook et al., 1994]. In the case of MBH98, unfortunately, neither the R2 and other cross-validation statistics nor the underlying construction step have ever been reported for the controversial 15th century period. Our calculations have indicated that they are statistically insignificant.” (My comment: so do Wahl et al., in Figure 3 of the paper you refer to, “Scenario 5d Network fail”.)

The MM criticism of the need to examine MBH98 cross-validation statistics was specifically endorsed by one of our GRL referees as follows:

“[they] also show that by not presenting other stringent verification statistics (e.g. R2, CE, product mean test and sign test) the validity of the 1400 step is likely much weaker than is apparent from the original MBH98 study.”

MM05b criticized not only the statistical insignificance of the cross-validation statistics, but also the withholding by MBH98 of adverse cross-validation statistics. Yet in section 1.1, “MM Criticisms”, WA omitted both topics and obviously failed to rebut them.

In their text, WA omit the very cross-validation statistics that were at issue in the MM criticisms. This is done without any notice to the reader of the omission. Although they have withheld key cross-validation statistics themselves, they repeatedly emphasize the need for cross-validation statistics, using language such as the following:

“More generally, our results highlight the necessity of reporting skill tests for each reconstruction model, as is customary in quantitative paleoclimate reconstruction.” (p. 30)

A reader would be misled by the omission of cross-validation statistics and by the many WA statements about verification, as he would have no way of knowing that WA had intentionally withheld standard cross-validation statistics. http://www.climateaudit.org/pdf/wa.review.pdf
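The RE-versus-r2/CE dispute running through this exchange is easier to follow with the definitions in hand. A toy sketch (synthetic numbers, not MBH data): a "reconstruction" that captures only the mean shift between the calibration and verification periods scores well on RE, because RE measures errors against the calibration-period mean, while failing CE and r2, which measure against the verification-period mean and the year-to-year tracking respectively.

```python
import numpy as np

rng = np.random.default_rng(2)
cal_mean = 0.0                             # calibration-period mean
obs = 0.5 + 0.2 * rng.normal(size=50)      # verification obs: mean shifted by +0.5
rec = 0.5 + 0.2 * rng.normal(size=50)      # tracks the shift, not the wiggles

sse = np.sum((obs - rec) ** 2)
re = 1 - sse / np.sum((obs - cal_mean) ** 2)    # RE: errors vs calibration mean
ce = 1 - sse / np.sum((obs - obs.mean()) ** 2)  # CE: errors vs verification mean
r2 = np.corrcoef(obs, rec)[0, 1] ** 2           # interannual tracking

print(round(re, 2), round(ce, 2), round(r2, 2))
```

Since the verification mean always minimizes the sum of squares in its own period, CE can never exceed RE; that asymmetry is exactly why the two camps weight the statistics differently.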

“Kind of sounds suspicious when he extols his own papers in the same journal”

Yeah sure that’s the only journal he bases his reputation on.

“I am gobsmacked.”

A bit like what happens when someone says they “cannot believe” that someone else is referring to a paper that was “rejected by the referees”, without even bothering to check whether that claim is true.

“Just out of interest, have you published anything in the “lightly reviewed” Geophysical Research Letters ?”

“The version you referred me to is not as far as I know, the published peer reviewed version, it refers to itself as “in press”

If you thought about what the words “in press” mean, you might get some idea. It’s an idiom for “in the printing press”, which means the publisher has already decided to publish it, and that means it has passed the peer review process.

“and as far as I can see it contains several graphs , but you do not refer to any particular one,”

When reading Von Storch bear in mind the mistakes his associates have made in producing paleo-climate models that they used to generate pseudo-proxies and also mistakes they made in making variations on the MBH98 method and variations on the RegEM method for producing reconstructions.

“The committee notes explicitly on pages 91 and 111 that the method has no validation (CE) skill significantly different from zero.”

Like a lot of people, Von Storch makes the mistake of believing that a CE score of zero means the validation has zero statistical significance. A true rigorous significance estimation procedure is based on the null hypothesis of AR(1) red noise predictions over the validation interval, using the variance and lag-one autocorrelation coefficient of the actual NH series over the calibration interval to provide surrogate AR(1) red noise reconstructions. In other words the validation CE has to be tested against the validation CE you would get if you used noise of the appropriate statistics instead of the reconstruction in the validation interval. The validation CE of noise is normally quite negative. The MBH98 and RegEM validation CEs are higher than at least 95% of noise simulations so we can say they exceed a 95% significance level.
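The red-noise benchmarking described here can be sketched in a few lines. This is my own illustrative implementation (the function names and the toy series are invented, not code from any published study):

```python
import numpy as np

def ce(obs, pred):
    # Coefficient of Efficiency: errors measured against the validation-period mean
    return 1 - np.sum((obs - pred) ** 2) / np.sum((obs - obs.mean()) ** 2)

def ce_threshold_ar1(obs_valid, var_cal, rho, n_sim=1000, q=0.95, seed=0):
    """95th-percentile CE achieved by AR(1) red-noise surrogate 'reconstructions'
    whose variance and lag-one autocorrelation match the calibration-period series."""
    rng = np.random.default_rng(seed)
    n = len(obs_valid)
    innov_sd = np.sqrt(var_cal * (1 - rho ** 2))  # keeps surrogate variance at var_cal
    scores = []
    for _ in range(n_sim):
        x = np.empty(n)
        x[0] = rng.normal(scale=np.sqrt(var_cal))
        for t in range(1, n):
            x[t] = rho * x[t - 1] + rng.normal(scale=innov_sd)
        scores.append(ce(obs_valid, x))
    return np.quantile(scores, q)

# A reconstruction's validation CE is "significant at 95%" if it beats this threshold,
# which for red noise typically sits at or below zero.
obs = np.sin(np.linspace(0, 6, 60)) + 0.3 * np.random.default_rng(3).normal(size=60)
print(ce_threshold_ar1(obs, var_cal=obs.var(), rho=0.5))
```

The key point of the paragraph survives in the sketch: the benchmark is not "CE greater than zero" but "CE greater than what matched red noise achieves".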

“In their text, WA omit the very cross-validation statistics that were at issue in the MM criticisms.”

Pity this quote doesn’t mention specifically what validation statistics they are referring to.

“A reader would be misled by the omission..”

Rather hypocritical thing for MM to say, considering that they themselves omitted to mention the extensive discussions by Wahl and Ammann regarding the interpretation of validation statistics.