Mann 2008 – Replication II

Let’s continue Mann 2008 – Replication with EIV. To run regreclow.m and regrechigh.m you’ll need the files climate, eofnumb, and proxyN, where N runs from 1 to 19. I’ve run prepinputforrecon.m with the required folder structure (C:\holocene\s1\zuz10\work1\temann\ etc.) on my computer. After that, I ran regrechigh.m in that folder.
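For orientation, here is a hypothetical sketch of the per-step inputs (not the actual regrechigh.m, whose internals I don't reproduce here; the loop and file formats are assumptions):

```matlab
% Hypothetical per-step setup (NOT the actual regrechigh.m): each step N
% uses the common target and regpar files plus its own proxy matrix,
% assumed here to be MAT-files in the working folder.
load climate                           % instrumental target series
load eofnumb                           % estimated regpar values per step
for N = 1:19
    S = load(sprintf('proxy%d', N));   % proxy matrix for step N
    % ... run the RegEM-TTLS imputation for this step here ...
end
```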

Here maxit is 100, so this run takes about 10 minutes on my oldish laptop. I’ll try to find a way to upload the results here (two 3 MB zip files). The high-frequency recon19 shows that I wasn’t completely lost with this comment suggesting RegEM is an ICE-like calibration method:

Proxies are standardized earlier in the process (both high and low splits scaled to unit variance and zero mean over the calibration period). One needs to take the std and mean of the target series, iHAD_NH_reform, over the calibration period and rescale the reconstruction: recon = recon*sigma + mu.
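A minimal sketch of that rescaling (iHAD_NH_reform is the only name from the code; calib and the rest are mine, with calib assumed to index the calibration period):

```matlab
% Undo the zero-mean / unit-variance standardization of the target.
mu    = mean(iHAD_NH_reform(calib));   % calibration-period mean
sigma = std(iHAD_NH_reform(calib));    % calibration-period std
recon = recon * sigma + mu;            % back to temperature units
```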

There is a warning about over-fitting in the paper, so do this only for the steps N = 11..19.

Combine the results of each step into a single reconstruction, as sketched below.
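In sketch form the splice might look like this (all names are mine; each step N is assumed to supply the values from its start year up to the start of the next step):

```matlab
% Hypothetical splice of the stepwise results onto a common time axis:
% later steps (with more proxies) take over in later years. startidx is
% assumed to have 20 entries, with startidx(20)-1 closing the last step.
nh = nan(nyears, 1);                     % combined NH reconstruction
for N = 11:19                            % steps used (see over-fitting note)
    seg = startidx(N):startidx(N+1)-1;   % year indices owned by step N
    nh(seg) = recon_step{N}(seg);        % this step's rescaled values
end
```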

Note that the number of proxies in the highf and lowf inputs is not necessarily the same. Here’s the result together with the archived nhnhscrihad_smxx:

As Jean S implied here, after 10 minutes of RegEM running, all you get is a linear combination of the proxies.

20 Comments

Yes, and since the combination is the same for all time instances within each step, the weight vector can be recovered even without running RegEM. Now the interesting part is that the signs of the weights seem to vary from step to step, as shown here.
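Since within a step the imputed values are an exact linear combination of the standardized proxies, the weights fall out of a simple least-squares fit. A sketch (P and recon are my names for one step’s proxy matrix and imputed target):

```matlab
% Recover the implicit proxy weights for one step without rerunning RegEM.
w   = P \ recon;           % least squares; exact if recon = P*w
res = norm(recon - P*w);   % ~0 (machine precision) confirms the linearity
```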

It’s clear they either do not know what their algorithm is doing, or, well… you get the picture. You’d think after repeatedly being beaten over the head with their errors, they’d get better at making sure there were none.

I’m actually toying with the idea of getting involved with this stuff again now that I’m not overwhelmed to the point of insanity. I need to brush up on some of the RegEM stuff that we discussed in the past. At one point I actually had myself convinced I knew what it was doing, hehe.

“It’s clear they either do not know what their algorithm is doing, or, well…”

Here’s a prime example: eofnumb, which is estimated via a PCA-type analysis in the code. It controls the regpar parameter in RegEM-TTLS, i.e., the rank of the linear-combination “matrix” for the imputed values. However, here the target (GL/NH/SH temperature) is one-dimensional (and there is nothing to infill in the proxies), so it is completely meaningless! Maybe they didn’t realize that, or maybe, just maybe, they first tried a “climate field” type of approach (à la Steig et al), but for some reason abandoned it while leaving the estimation of the regpar parameter intact. ;)
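A toy illustration of the dimensional point (a standard truncated-TLS formula on synthetic data; none of this is Mann08 code): with a single target column, the TTLS coefficient “matrix” is p-by-1 for any truncation level k, so its rank never exceeds one.

```matlab
n = 200; p = 19;                     % calibration length, proxy count
X = randn(n, p);                     % synthetic standardized proxies
y = X*randn(p, 1) + 0.1*randn(n, 1); % synthetic one-dimensional target
[~, ~, V] = svd([X y], 0);           % SVD of the joint data matrix
for k = [3 5 10]                     % different truncation levels (regpar)
    V2 = V(:, k+1:end);              % discarded right singular vectors
    B  = -V2(1:p, :) * pinv(V2(p+1, :));  % truncated-TLS coefficients
    fprintf('k = %2d: B is %d x %d, rank %d\n', ...
            k, size(B,1), size(B,2), rank(B));
end
```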

“Maybe they didn’t realize that, or maybe, just maybe, they first tried a “climate field” type of approach (à la Steig et al), but for some reason abandoned it while leaving the estimation of the regpar parameter intact. ;)”

Mann08:

Previous tests with synthetic proxy climate data derived from long model simulations demonstrate that this EIV procedure yields a very similar hemispheric mean reconstruction to that obtained from the application of the associated spatially-explicit CFR procedure when an estimate of only a single time series, e.g., the hemispheric mean temperature, is sought (32).

Re: JohnT (#12), it seems as if their heads are in the sand. He’s only one of many climate scientists who do not get statisticians involved in publications that are heavily dependent on statistical methods.

“Yep, Mann is wrong. How could he have made such a huge mistake. Seems like he is losing his MoJo”

Given that this started with Mann’s very first (or nearly first) paper, I’m guessing no one can really say he’s “losing” anything. Of course, this also assumes said peers understand the distinctions as well as Jean S or UC do.

Are thoughts like this being circulated in the climate community?

Not openly, at least. When was the last time you heard a headline in the MSM stating “statistical conclusions are shown to be unsupported in XXX09” or similar?

Re: andy (#15),
if you mean the one labeled “fisher_1996_cgreenland”, it is supposedly this one: http://www.gfy.ku.dk/~www-glac/papers/abstracts/183.htm
It gets weights (AD600) of -0.2079 (for the low split; shown above) and -0.1775 (for the high split). It is given ID 8000, which means that in order to pass the screening it should have a positive (pick-two) correlation with the local temperature. Indeed it does; according to this file, the reported (pick-two) correlations are:
full screening: 0.3311
late: 0.3742
early: 0.4976
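For concreteness, a hedged sketch of the screening check as I read the “pick-two” convention in this thread (the proxy is compared against the two nearest instrumental grid cells and the better correlation is the one reported; T1, T2 and the other names are mine):

```matlab
% "Pick-two" screening sketch (my reading, not the archived code):
% correlate the proxy with the two nearest grid-cell temperature
% series and keep the better of the two.
r1 = corrcoef(proxy, T1);   r2 = corrcoef(proxy, T2);
r  = max(r1(1,2), r2(1,2));   % the reported pick-two correlation
passes = r > 0;               % ID 8000: sign convention demands r > 0
                              % (actual screening also applies a
                              %  significance threshold)
```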

This amazing change in the reconstruction between AD500 and AD600 (compare the lower right corners in UC’s update figures), due to adding a single series, is also clearly visible in the final reconstruction (see the second figure in this post). I guess this is what they mean by “robust”. ;)

Notice also that there are FOUR Tiljander series (one switching sign between AD500 and AD600) out of a total of 12/13 proxies. I guess it must be “bizarre” to wonder how “robust” the reconstruction is to removing those series… ;)

Re: andy (#15),
Oh, if you meant the one labeled “curtis_1996_d13cpyro” (which seems to be dominant in the AD600 step), that’s one of the Punta Laguna series (C13.P_coronatus), discussed already here in the context of infilling.