Signal to Noise Ratio Estimates of Mann08 Temperature Proxy Data

Occasionally when working on one thing long enough, you discover something unexpected that allows you to take a step forward in understanding. At the ICCC conference, I met Steve McIntyre and took time to ask him how come Mann07 “Robustness of proxy-based climate field reconstruction methods” didn’t show any variance loss in the historic signal. The paper makes the claim that CPS is a functional method for signal extraction, which I’ve long and vociferously contested . Neither of us had a good answer, but I had to know. In Mann07 – Part II at the Air Vent, the mystery was solved. The M07 paper uses model data as a ‘known’ temperature signal and adds various levels of noise to it. While the work oddly uses white noise in most operational tests, it does present the example of ARMA (1,0,0) ρ = 0.32 models, and it showed very little variance loss. Replicating M07 using CPS wasn’t difficult and the results were confirmed – no historic variance loss so no artificially flat handles for the Mann hockeystick.

With white noise or low autocorrelation noise, there will be none variance loss (HS handle) reported in VonStorch and Zorita 04, Christiansen2010, McIntyre Mckitrick o5 or numerous other studies. This is because low AR noise doesn’t create signal obscuring trends on a long enough timescale to make a difference. However, if red noise having autocorrelation which matches observed values in proxies is used, we get a whole different result overturning the conclusions of Mann07. But, this isn’t the topic of this post.

In demonstrating the effect in the recent Mann07 post at the Air Vent, we used model temperature data and added AR1 noise which matched the parameters from the 1209 proxies in Mann08. From that work we discovered that the percent proxies retained using CPS and a correlation threshold of r=0.1 results in a very high acceptance rate of the proxies 87% passed screening – even though they had a signal to noise amplitude of 0.25. This is significant because even 40% was considered a conservatively noisy proxy in M07

From the M07 paper:

Experiments were performed for five different values of SNR: 0.25, 0.4 0.5, 1.0 and 1 (i.e., no added noise).

and:

We adopted as our ‘‘standard’’ case SNR = 0.4 (86% noise, r = 0.37) which represents a signal-to-noise ratio than is either roughly equal to or lower than that estimated for actual proxy networks (e.g., the MXD or MBH98 proxy networks; see Auxiliary Material, section 5), making it an appropriately conservative standard for evaluating realworld proxy reconstructions.

Remember that in Mann08, so thoroughly deconstructed at Climate Audit, they managed to retain only 40% of the proxy data despite testing against the two closest gridcells.

Below is the third time I’ve presented this plot of ar1 coefficients for the 1209 proxies, however, in the recent MW10 paper, they calculated the same thing demonstrating verification of this result. In my case R threw an error after 1043 proxies when fiting the ARMA (1,0,0) model. It was difficult to figure out how to catch the error, so being an engineer, I assumed that this was a good enough sample set of the 1209 and used 1043 autocorrelation rho’s for this work. As confirmation of the assumption, the my histogram was verified by McShane and Wyner’s recent paper so widely discussed in recent weeks.

Source – the Air Vent

Source – MW10

The lightbulb went off when I realized we can use this approach to make an estimate of the signal to noise level in actual M08 temperature proxies. It’s done by adjusting the signal to noise in the pseudoproxies (model plus AR1 noise) until they match the 40% retention rate of Mann08. The code was already written to create 10,000 pseudoproxies with an autocorrelation histogram matching the above plots so it was a simple matter to finish the job. Sometimes it pays to work hard.

By adjusting the SNR of the pseudo proxies and rerunning the code, I came to a result of a SNR of 0.085 or 8.5 percent amplitude signal to match the 40 percent proxy retention of M08 at a correlation of 0.1. Put another way, this is about 12 to 1 noise to signal as compared with Mann07′s allegedly ‘conservative’ estimate of 40 percent signal to noise or 2.5 to 1 noise to signal. Not really very close.

Many could make the case that AR1 is a poor model for proxies and that other models will be better. This may be true but autoregression is a persistence of signal, nothing more. By persistence, I mean each datapoint has some similarity to the last. Changing the mathematical nature of the persistence will cause subtle differences in result but it will not reverse a result this strong. The only thing that I believe could reverse a result like this, is a big error on my part in the code.

So I’m forced to clean it up and present it.

But before we have that fun, let’s look at what a 0.1 correlation threshold CPS reconstruction, with model data that has AR1 noise matched to Mann08 , looks like using CPS methods.

The black line is the reconstruction, note the variance loss compared to the actual target model signal – green. The standard deviation of the reconstruction (black) in the pre-calibration period is 32 percent of the target variance of the original (green) signal. At first, I looked at this and thought something doesn’t look right, there is too little variance in the historic signal. To examine that, I visually compared to M08 below. Orange is the comparable line.

There is more curvature in the Mann08 graph but on a two hundred year variance the most I see is -0.2 to -0.8 but from 1500 to 1900 it varies between 0.3 and 0.6. My curve demonstrates less century scale variance in some areas(maybe not), but perhaps the curvature and differences are due to the AR1 vs actual non-temperature signal in proxy data not captured in models. In other words, the tameness of the model temperature signal may not match a systematic non-temp signal contained in the proxies.

Either way, the signalis substantially repressed compared toactual and this result will be difficult to reverse by minor adjustments in assumption.

As a further confirmation of methods used here, a quote from the M08 SI presented at Climate Audit, because that’s where I happened to be reading when I ran across this point. More on voodoo correlations.

Although 484 (~40%) pass the temperature screening process over the full (1850–1995) calibration interval, one would expect that no more than ~150 (13%) of the proxy series would pass the screening procedure described above by chance alone.

This is also the first time we’ve been able to test the zero signal screening 13% expectation using proxy data as a verification. If my result were different, it’s a demonstration of a problem in the pseudoproxies used for this result or in the statement itself. However, using AR1 over the rho spectrum of M08, I found reasonably good pass screen agreement of 15 percent of pure red noise series having the same histogram as M08.

To grasp the sensitivity of the screening percentage, consider that 8.5 percent signal corresponds to 40% passing screening and 0% signal corresponds to 15 percent passing. I think fifteen is an excellent, and after two years of paleoclimate literature, a little surprising confirmation of Mann’s assertion that 13 percent will pass by random chance. The corroboration here is supportive of the quality of the pseudoproxies and the result that only 8.5 percent of the proxy signal has a temperature component.

Conclusions:

By using pseudoproxies, we have calculated three new estimates as related to Mann08

3 – Rejection of AR1 matched proxy with no temperature signal to model temperatures is about 15%.

Caveats and differences:

Mann08 used the two closest temperature gridcells for threshold testing, this result only used one. Comparison to two closest curves would bias the answer in #1 higher than actual, meaning 8.5 percent is a high estimate using this method and data. In #2, the variance loss would be even more significant if a pick two had been used. In #3 the passing of pure random noise would be higher if we took two chances to match. Therefore in all cases, the results presented here would worsen had a pick two method been used.

Mann 08 also used infilled proxies in it’s study. Since we were matching the reject rate of Mann08 with these pseudoproxies, this infilled data could have substantial effect on the signal estimate in #1. The true amount of signal for a single-pass screening rate of 20 percent, could drop from 8.5 to as low as 2 – 4 percent and at this point, there is a possibility that perhaps there isn’t any discernible temperature signal at all in the proxies.

Of course, one could always be wrong, but considering that of the 40% which pass screening, Luterbacher 6% 0f 1209 series, contained actual temperature data, Briffa’s MXD data 8.6% were infilled for 40 years, this result means that it’s no longer outside the realm of reason in my opinion to consider that we might have very little or no signal at all.

The following code is designed to be turnkey, however several libraries may be required. I’m not sure because I’ve already installed them. Fortunately, chances are if you are a regular CA reader, you have too.

16 Comments

Interesting, but I would need more detail and to think about it more? I like the pseudoproxy approach, indeed am supposed to be working with it this summer.

Which model output are you using for your pseudoproxies? And how did you select your pseuoproxy locations.

The autocorrelation in Mann’s proxies arises from two sources: climatic autocorrelation and autocorrelated noise processes. Your analysis seems to treat all the autocorrelation as noise and I suspect the autocorrelation of your pseudoproxies is (slightly) higher than that of Mann’s proxies which will bias your result.

A bigger problem is that you have assumed that the signal:noise ratio is the same for each proxy. This is highly unlikely to be true. The challenge would be estimate the distribution of S/N ratio from the distribution of correlations and AR1. This could probably be done.

Some R suggestions: use scale() for demeaning and/or standardising (can take a matrix as an argument and will process columnwise); use apply() for looping over matrices, it leads to much cleaner code than for loops.

eg result=apply(A_matrix,2,function(col)functionOnCol(col))
is much easier to read than
result=matrix(NA, …)
for(i in 1:nrow(A_matrix){
result[,i]=functionOnCol(A_matrix[,i])
}

Your analysis seems to treat all the autocorrelation as noise and I suspect the autocorrelation of your pseudoproxies is (slightly) higher than that of Mann’s proxies which will bias your result.

I think there may be a very slight bias but consider that the actual model is only 8 percent of the total proxy so it should be pretty small. I tend to think of this as a coarse estimate anyway. It was just a bit surprising to find that the SNR is so low.

A bigger problem is that you have assumed that the signal:noise ratio is the same for each proxy. This is highly unlikely to be true.

I agree with this entirely. About 80 percent of the proxies in M08 are tree ring, when you combine that with Luterbacher being instrumetntal noise, it’s a slam dunk that you’re right. But if some of the proxies are high correlation due to excellent ‘sensitivity’ that would indicate that my analysis would overestimate the typical signal in the others.

Jeff Id, may I ask you to supply us with a precise technical definition for the term “signal” as it is employed within your analysis. This definition would include its dimensional units; the physical phenomena it is intended to characterize; the underlying theoretical basis which allows its employment within this particular analysis; its historical provenance as used in previous scientific literature; and any caveats which may necessarily apply concerning its appropriate use within a particular analytical process or within a particular analytical context. Thanks in advance.

The primary signal calibrated to is degrees C, however the variance loss in the pre-calibration record is in some other linear thermal units. Of course this signal is contaminated in proxies by moisture, humidity, CO2 levels and all kinds of other things which may or may not be stationary.

Of all people, I’ve been very critical of reconstruction methods. IMO they don’t work and aren’t vetted enough to produce anything useful. Backcasting models and comparing to them is a task fraught with unseen dangers. But that is just a method critique, this is probably the first real surprise I’ve produced with respect to the quality of the data itself. Steve McIntyre has made quite a hobby of the same kind of thing. See posts here on borehole math and near singular matrices.

OK, that’s a start. You have the dimensional units and a few caveats listed.

But recognize as well that your definition of the term “signal” might be specific and unique — to some greater or lesser degree — to the context of this particular analysis.

Therefore we need additional detail concerning the physical phenomena the term “signal” is intended to characterize; the underlying theoretical basis which allows its employment within this particular analysis; and the term’s historical provenance as used in previous scientific literature.

If your work is to be fully auditable from a nuclear-grade QA perspective, then the information needs to be provided in a more detailed and precise form.

From your analysis, which I am not fully persuaded by, you argue that the S/N is 0.85. How sensitive is the analysis to this ratio? It would be good to see a graph of % proxies included against S/N ratio.

With your R ability it should be easy for you to run a plot. I’ll work on one for you though. From memory I had 38% retained at 0.08 SNR and 42% at 0.09.

It may take a while tonight because the calculation takes time. As far as being convinced, I was rather surprised by the result myself and am willing to be wrong. Maybe I shouldn’t have been surprised with such a low r threshlold but this result wasn’t predicted.

Perhaps you can shed some additional light on where the analysis is flawed?

using a cutoff of r=0.1 (essentially no correlation), I am not surprised that the signal to noise ratio is 0.085. I can understand the desire to attempt to extract a temperature signal for a bunch of “noisy” data, but I have always been amazed at just how low the correlation cutoffs have been. I remember reading someone the other day stating that an r value of .38 indicated strong correlation. In my work (hydrogeology and environmental chemistry, we would describe that as weakly or poorly correlated and not try to read too much into that correlation.

Jeff, in the real world you need a signal-to-noise ratio of about 12-1 for
good reception of a communications channel for instance. Any degradation from that means dropped words, bits, etc. The global positioning system signal is
very faint, but is pulled out of the noise because it is coded with a long
pseudo-random code that when matched on reception boosts the signal by 70db.
We don’t know a priori what the “climate signal” looks like. We can’t boost it out of the noise with fancy processing. Correlations fo 0.1 and negative S/N
ratios mean we have noise in noise, pseudocorrelations if you will.

I hope this properly replies to your comment. I’ve had trouble lately. I’d forgotten about your method, I’ve got better chops now for these posts then I did two years ago. It was really quite brilliant, maybe your best.

Think about this, considering that you came up with 1:4 (maybe 1:5) in ‘common’ signal of messy data, it’s interesting that a different method comes up with 1/12 in temperature signal. It makes me wonder what the actual 1/4 signal might be comprised of.

[…] that would pass if you accept Mann’s incredibly generous autocorrelation assumptions. Far, far, higher if you do not. One of the most difficult scams of the paper is the determination of the correct autocorrelations to […]