This is a new thread for updates on the analyses of the data and code freed from CRU.

Everybody, I'm sinking under the weight of things to do here. I need you to post one- or two-line analyses of what you are finding in which bits of code. I'll transfer these to the main post as they come in. It needs to be in layman's language and to have a link to your work.

CRU code

Francis at L'Ombre De L'Olivier says the coding language is inappropriate. He also notes inappropriate use of hard coding, incoherent file naming conventions, subroutines that fail without telling the user, etc., etc.

AJStrata discovered a file with two runs of CRU land temp data which show no global warming per the data laid out by country, and another CRU file showing their sampling error to be +/- 1°C or worse for most of the globe. Both CRU files show there has been no significant warming since 1960.

A commenter notes the following comment in some of the code: "***** APPLIES A VERY ARTIFICIAL CORRECTION FOR DECLINE*********"

A good layman's summary of some of the coding issues with a file called "Harry". This appears to be the record of some poor soul trying to make sense of how the code for producing the CRU temperature records works. (Rude words, though, if you're a sensitive type.)

Some of the annotations of the Harry code are priceless - "OH **** THIS. It's Sunday evening, I've worked all weekend, and just when I thought it was done I'm hitting yet another problem that's based on the hopeless state of our databases. There is no uniform data integrity, it's just a catalogue of issues that continues to grow as they're found."

CRU's data collation methods also seem, ahem, amusing: "It's the same story for many other Russian stations, unfortunately - meaning that (probably) there was a full Russian update that did no data integrity checking at all. I just hope it's restricted to Russia!!"

Borepatch discovers that CRU has lost its metadata. That's the bit that tells you where to put your temperature record on the map, and so on.

Mark in the comments notices a file called resid-fudge.dat, which he says contains, believe it or not, fudged residuals figures!

Mark in the comments notes a program comment, "Apply a VERY ARTIFICAL correction for decline!!", followed by the words "fudge factor". See briffa_sep98_d.pro.

From the programming file combined_wavelet.pro, another comment, presumably referring to the famous Briffa truncation: "Remove missing data from start & end (end in 1960 due to decline)".

From the file pl_decline.pro: "Now apply a completely artificial adjustment for the decline (only where coefficient is positive!)"

From the file data4alps.pro: "IMPORTANT NOTE: The data after 1960 should not be used. The tree-ring density' records tend to show a decline after 1960 relative to the summer temperature in many high-latitude locations. In this data set this "decline" has been artificially removed in an ad-hoc way, and this means that data after 1960 no longer represent tree-ring density variations, but have been modified to look more like the observed temperatures."

From the Harry readme: "What the hell is supposed to happen here? Oh yeah - there is no 'supposed', I can make it up. So I have :-)... So with a somewhat cynical shrug, I added the nuclear option - to match every WMO possible, and turn the rest into new stations (er, CLIMAT excepted). In other words, what CRU usually do. It will allow bad databases to pass unnoticed, and good databases to become bad, but I really don't think people care enough to fix 'em, and it's the main reason the project is nearly a year late." (See Harry readme, para 35.)

James in the comments says that in the file pl_decline.pro the code seems to be reducing temperatures in the 1930s and then adding a parabola to the 1990s. I don't think you need me to tell you what this means.


From the file pl_decline.pro: check what the code is doing! It's reducing the temperatures in the 1930s, and introducing a parabolic trend into the data to make the temperatures in the 1990s look more dramatic. (Recycled to a separate posting today by ClimateGate blogstar Bishop Hill from among the comments.)

Bear in mind that there is no working synthetic method for cloud, because Mark New lost the coefficients file and never found it again (despite searching on tape archives at UEA) and never recreated it. This hasn't mattered too much, because the synthetic cloud grids had not been discarded for 1901-95, and after 1995 sunshine data is used instead of cloud data anyway.

My Dear Bishop, I imagine you will leave the opening comment up. It is perfectly illustrative of the cast of mind of one strand of AGW opinion: shrill, adolescent anger. I suspect it is there, half buried, in some allegedly scientific types.

Very good. I discovered a file with two runs of CRU land temp data which show no global warming per the data laid out by country, and another CRU file showing their sampling error to be +/- 1°C or worse for most of the globe. Both CRU files show there has been no significant warming since 1960.

If this isn't cooking the books, then I don't know what is. Since when does scientific data have to "look good" and scientists need to "be happy with the version we release"? Also, what the hell is an "IDL thingummajig"? Some magic toaster used to make climate change guano?

It had to have been either an insider (with a couple of login IDs) swiping (or even collecting and zipping the contents of) the 62 MB zipfile, or someone accidentally finding it on an unprotected server. Subsequent posting onto RC and the Russian server was done through an anonymous server - a simple technique widely used by lots of (non-hacker) folks to obscure their identity.

But CRU and RC will continue to spin it as "hacking" because it's to their advantage to do so...

Gravamen, it's really hard to edit Wikipedia articles related to climate change, because the AGW side has an army of wiki editor gatekeepers who come up with some bullshit reason to remove what you've written. I've experienced this so many times. Just try writing something on the IPCC or Al Gore pages; you'll see what I mean.

A better strategy is to write a Google Knol on the subjects you have outlined, at http://knol.google.com/k

But CRU and RC will continue to spin it as "hacking" because it's to their advantage to do so...

I believe you are absolutely correct. I have been working in the computer science field for almost 30 years now, and this does not strike me as the work of "hackers" (I used to be one).

I am not buying into a word that Gavin Schmidt says about it. I do not believe RC was ever "hacked" or that there was even an attempt made. It simply doesn't make sense to do so and would have been a waste of time with unnecessary exposure and risk. Just doesn't make any sense.

RC is simply trying to play the victim card, in a broad CYA attempt. It's simply a diversionary tactic.

November 24, 2009 | mark From the programming file "combined_wavelet.pro":

restore,filename='combtemp'+regtit+'_calibrated.idlsave'
;
; Remove missing data from start & end (end in 1960 due to decline)
;
kl=where((yrmxd ge 1402) and (yrmxd le 1960),n)
sst=prednh(kl)
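For readers who don't use IDL, the `where` call above is just a date mask that drops everything after 1960. A minimal Python sketch of the same truncation (the arrays here are stand-ins, not CRU's data):

```python
import numpy as np

# Stand-ins for the IDL arrays yrmxd (years) and prednh (the series).
yrmxd = np.arange(1400, 1995)
prednh = np.zeros(yrmxd.size)

# Equivalent of: kl = where((yrmxd ge 1402) and (yrmxd le 1960), n)
kl = (yrmxd >= 1402) & (yrmxd <= 1960)
sst = prednh[kl]   # series truncated at 1960, "due to decline"
```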

November 24, 2009 | mark From the programming file "testeof.pro":

; Computes EOFs of infilled calibrated MXD gridded dataset.
; Can use corrected or uncorrected MXD data (i.e., corrected for the decline).
; Do not usually rotate, since this loses the common volcanic and global
; warming signal, and results in regional-mean series instead.
; Generally use the correlation matrix EOFs.
;

November 24, 2009 | mark From the programming file: "pl_decline.pro":

;
; Now apply a completely artificial adjustment for the decline
; (only where coefficient is positive!)
;
tfac=declinets-cval

November 24, 2009 | mark From the programming file "olat_stp_modes.pro":

;***TEMPORARY REPLACEMENT OF TIME SERIES BY RANDOM NOISE!
; nele=n_elements(onets)
; onets=randomn(seed,nele)
; for iele = 1 , nele-1 do onets(iele)=onets(iele)+0.35*onets(iele-1)
;***END
mknormal,onets,pctime,refperiod=[1922,1995]
if ivar eq 0 then begin
 if iretain eq 0 then modets=fltarr(mxdnyr,nretain)
 modets(*,iretain)=onets(*)
endif
;
; Leading mode is contaminated by decline, so pre-filter it (but not
; the gridded datasets!)
;
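The commented-out lines above generate AR(1) "red" noise: white noise with 35% of the previous value carried forward. A minimal Python sketch of that recurrence (my translation, not CRU's code):

```python
import numpy as np

# White noise, one value per year (e.g. 1922-1995, as in the refperiod).
rng = np.random.default_rng(0)   # any seed; stands in for IDL's randomn(seed, nele)
nele = 74
onets = rng.standard_normal(nele)

# IDL: for iele = 1, nele-1 do onets(iele) = onets(iele) + 0.35*onets(iele-1)
for iele in range(1, nele):
    onets[iele] = onets[iele] + 0.35 * onets[iele - 1]
# The 0.35 carry-over gives the series serial correlation, so it wiggles
# more like a climate series than plain white noise does.
```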

November 24, 2009 | mark From the programming file "data4alps.pro":

printf,1,'IMPORTANT NOTE:'
printf,1,'The data after 1960 should not be used. The tree-ring density'
printf,1,'records tend to show a decline after 1960 relative to the summer'
printf,1,'temperature in many high-latitude locations. In this data set'
printf,1,'this "decline" has been artificially removed in an ad-hoc way, and'
printf,1,'this means that data after 1960 no longer represent tree-ring'
printf,1,'density variations, but have been modified to look more like the'
printf,1,'observed temperatures.'
;

November 24, 2009 | mark From the programming file "mxd_pcr_localtemp.pro"

;
; Tries to reconstruct Apr-Sep temperatures, on a box-by-box basis, from the
; EOFs of the MXD data set. This is PCR, although PCs are used as predictors
; but not as predictands. This PCR-infilling must be done for a number of
; periods, with different EOFs for each period (due to different spatial
; coverage). *BUT* don't do special PCR for the modern period (post-1976),
; since they won't be used due to the decline/correction problem.
; Certain boxes that appear to reconstruct well are "manually" removed because
; they are isolated and away from any trees.
;

November 24, 2009 | mark From the programming file "calibrate_mxd.pro":

;
; Due to the decline, all time series are first high-pass filter with a
; 40-yr filter, although the calibration equation is then applied to raw
; data.
;

November 24, 2009 | mark From the programming file "calibrate_correctmxd.pro":

; We have previously (calibrate_mxd.pro) calibrated the high-pass filtered
; MXD over 1911-1990, applied the calibration to unfiltered MXD data (which
; gives a zero mean over 1881-1960) after extending the calibration to boxes
; without temperature data (pl_calibmxd1.pro). We have identified and
; artificially removed (i.e. corrected) the decline in this calibrated
; data set. We now recalibrate this corrected calibrated dataset against
; the unfiltered 1911-1990 temperature data, and apply the same calibration
; to the corrected and uncorrected calibrated MXD data.

November 24, 2009 | mark From the programming file "mxdgrid2ascii.pro":

printf,1,'NOTE: recent decline in tree-ring density has been ARTIFICIALLY'
printf,1,'REMOVED to facilitate calibration. THEREFORE, post-1960 values'
printf,1,'will be much closer to observed temperatures then they should be,'
printf,1,'which will incorrectly imply the reconstruction is more skilful'
printf,1,'than it actually is. See Osborn et al. (2004).'
printf,1
printf,1,'Osborn TJ, Briffa KR, Schweingruber FH and Jones PD (2004)'
printf,1,'Annually resolved patterns of summer temperature over the Northern'
printf,1,'Hemisphere since AD 1400 from a tree-ring-density network.'
printf,1,'Submitted to Global and Planetary Change.'
;

November 24, 2009 | mark From the programming file "calibrate_correctmxd.pro":

;
; Now verify on a grid-box basis
; No need to verify the correct and uncorrected versions, since these
; should be identical prior to 1920 or 1930 or whenever the decline
; was corrected onwards from.
;

November 24, 2009 | mark From the programming file "recon1.pro":

;
; Computes regressions on full, high and low pass MEAN timeseries of MXD
; anomalies against full NH temperatures.
;
; Specify period over which to compute the regressions (stop in 1940 to avoid
; the decline
;
perst=1881.
peren=1960.
;

November 24, 2009 | mark From the programming file "calibrate_nhrecon.pro":

;
; Calibrates, usually via regression, various NH and quasi-NH records
; against NH or quasi-NH seasonal or annual temperatures.
;
; Specify period over which to compute the regressions (stop in 1960 to avoid
; the decline that affects tree-ring density records)
;
perst=1881.
peren=1960.
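The pattern these calibration scripts share - fit a regression only over the perst-peren window, stopping before the decline, then apply the fitted equation to the whole record - reduces to a windowed least-squares fit. A minimal Python sketch with made-up series (not CRU's data or code):

```python
import numpy as np

perst, peren = 1881.0, 1960.0                 # calibration window, as in the IDL

years = np.arange(1850, 1995, dtype=float)
proxy = np.linspace(-1.0, 1.0, years.size)    # stand-in MXD series
temps = 0.5 * proxy + 0.1                     # stand-in temperatures

# Fit only inside the calibration window...
cal = (years >= perst) & (years <= peren)
slope, intercept = np.polyfit(proxy[cal], temps[cal], 1)

# ...then apply the fitted equation to the full series, post-1960 included.
reconstruction = slope * proxy + intercept
```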

November 24, 2009 | mark From the programming file "briffa_sep98_e.pro":

;
; PLOTS 'ALL' REGION MXD timeseries from age banded and from hugershoff
; standardised datasets.
; Reads Harry's regional timeseries and outputs the 1600-1992 portion
; with missing values set appropriately. Uses mxd, and just the
; "all band" timeseries
;
;****** APPLIES A VERY ARTIFICIAL CORRECTION FOR DECLINE*********
;

It's so difficult to make judgement calls on this issue, but I do welcome the debate. I am rather sick of the high-profile deniers, and to be honest I am occasionally revolted by green scientists not being able to discuss the remote possibility of some (minor or possibly major) issues with the data.

Surely having more information in the public domain cannot be a bad thing. We paid for the research; I'd like to see a little bit more than a graph at the end of it.

I haven't seen FORTRAN for at least 30 years! When I did that programming it was on punchcards. Obviously, "Climate Playstation Software Designers" use it. This makes no sense, unless you are trying to be difficult to understand. I am working on my advanced degree, and I could not even find a Fortran compiler on campus. After digging my old programming books out of the mothballs I started looking at this cryptic spaghetti code, and it makes my head hurt. But the comments in the code make me very angry. I guess those comments were just off-the-cuff remarks, right?

To the AGW true believers: please find a less measurable fixation to latch onto. I might suggest nicely shaped quartz crystals, alien abduction, phrenology, or maybe Scientology. Your belief system is no longer viable as an intact, cogent system. It is similar to the Enron fiasco: once the door is open and the light let in, it will never be the same.

@Richard: Meh. Sigh. Don't bash the use of FORTRAN, which is still alive and well, thank you. I even have a compiler running at the moment. Instead look at the enormous programming atrocities found in the code.

@Molon Labe: Yep. I found that too.

And if somebody wants to run IDL scripts: there is a GNU version called GDL; see http://gnudatalanguage.sourceforge.net/. F77 compilers are in all Linux distros, but I could not find a free F90 compiler that works. I have a DEC F90 compiler running, but that one isn't free and runs only under WIN2K.

I apologize if it appeared that I was bashing Fortran; I have fond memories of the language. However, I do not have a compiler for Fortran, and have not rebuilt my Linux box since the surge hit (splat). Unlike others I have only done a cursory look at the code, and may take a harder look over the holidays. I am a Hardware Engineer, and generally stay away from code (unless it is assembly - I know, I am a bit-level control freak). What is even scarier is that I was a meteorology student 30 years ago.

I would have thought that, with the processor-intensive simulations to be done, more advanced coding options would have been used. So to all who have reviewed the code, I say thanks.

I am rather sick of the high-profile deniers, and to be honest I am occasionally revolted by green scientists not being able to discuss the remote possibility of some (minor or possibly major) issues with the data.

Led, this might be a good time to expunge the term "denier", which was always used pejoratively, and now appears to have been used largely for people like McIntyre who seem to have been proven right.

Fox News is not really picking up the story. Sure, they had it on Beck, but that's only part of the day. Is the reason so many news outlets who WOULD pick up on this story are NOT running it that they simply don't understand it yet? Or have they decided to drop the ball too?

;
; On a site-by-site basis, computes MXD timeseries from 1902-1976, and
; computes Apr-Sep temperature for same period, using surrounding boxes
; if necessary. Normalises them over as common a period as possible, then
; takes 5-yr means of each (fairly generous allowance for
; missing data), then takes the difference.
; Results are then saved for briffa_sep98_decline2.pro to perform rotated PCA
; on, to obtain the 'decline' signal!
;

Fortran is still the normal language to work with in many scientific circles. You join a research team, and everyone else uses Fortran, so you do too. I did an applied math Ph.D, and there was a division between those who had taken some Comp Sci courses as undergrads (often preferred to use more modern languages) and those without, who picked up Fortran from their elders in the field along the way. I was in the former category, and my models tended to be written in C/C++ with lots of calls to Fortran libraries. After a while, even people like me end up writing more Fortran for the sake of consistency and/or ease of swapping things with other people. Fortran was designed by and for mathematicians, but a long time ago. As such, it incorporates various mathematical conventions that more modern languages tend not to. Plus you can write surprisingly low level code with it if you really want to, and are trying to get the maximum performance out of the hardware and know what you are doing.

So I have no real problem with these people using Fortran. This is to be expected. As others have said, it is not an ideal language for things like text processing, but that isn't really the point. Most of these people are unlikely to have seen anything else. Comp Sci and Applied Math are different worlds. The efficiency of these people's text processing code is not really the point. It's the mathematical code that matters. A lot of it seems very sloppy at best.

Compare the two files resid-fudge.dat & resid-best.dat, both located in mbh98-osborn.zip\mbh98-osborn\TREE\COMPARE\

All the numbers from 1700 onwards are identical, all calculated to SIX decimal places.

But in resid-best.dat, the years 1961 onward all have "0.24". 1960 is the year, I think, that they talked about hiding the decline in some of the program files. The same repeated number, plus the sudden change to 2 decimal places after 300 years, is a clear indication of data being artificially manipulated.
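A comparison like this is easy to script. A minimal Python sketch, assuming (my assumption; the real layout may differ) that each .dat file holds whitespace-separated year/value rows; it keeps the values as text so a precision change like 0.123456 → 0.24 stays visible:

```python
def load_series(path):
    """Read 'year value' rows; keep values as text so precision changes show."""
    series = {}
    with open(path) as fh:
        for line in fh:
            parts = line.split()
            if len(parts) >= 2:
                series[int(parts[0])] = parts[1]
    return series

def first_divergence(a, b):
    """First common year where the two series stop matching textually."""
    for year in sorted(set(a) & set(b)):
        if a[year] != b[year]:
            return year
    return None
```

Running `first_divergence(load_series('resid-fudge.dat'), load_series('resid-best.dat'))` would, if the commenter's reading of the files is right, report 1961.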

These are the 20 different "fudge factor(s)" - the programmer's words, not mine - to be applied to the 20 different subsets of data. So here are those fudge factors with the corresponding years for the 20 consecutive 5-year periods:

So, we leave the data alone from 1904-1928, adjust downward for 1929-1943, leave the same for 1944-1948, adjust down for 1949-1953, and then, whoa, start an exponential fudge upward (guess that would be the "VERY ARTIFICIAL CORRECTION FOR DECLINE" noted by the programmer). Might this make data which don't show the desired trend - or, god forbid, show a global temperature "DECLINE" - turn, after the "VERY ARTIFICIAL CORRECTION", into a hockey schtick - I mean, stick? And "HIDE THE DECLINE"? You bet it would!
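Mechanically, what the commenter describes - one adjustment value per 5-year block, interpolated onto a yearly axis and added to the series - looks like this in Python (the valadj numbers below are placeholders, not the leaked values):

```python
import numpy as np

# Placeholder per-period adjustments (NOT the leaked numbers): flat, a dip,
# then a steep rise at the end, mimicking the shape described above.
valadj = np.array([0.0, 0.0, -0.1, 0.0, 0.3, 0.8, 1.2])
yrloc = np.arange(1904, 1904 + 5 * valadj.size, 5)   # anchor years, 5 yr apart
timey = np.arange(1904, yrloc[-1] + 1)               # yearly time axis

# Equivalent of IDL's interpol(valadj, yrloc, timey): linear interpolation
# of the per-period factors onto every year.
yearlyadj = np.interp(timey, yrloc, valadj)
# adjusted = raw_series + yearlyadj   # the "very artificial correction"
```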

This "correction for decline" is described in the Osborn, Briffa, Schweingruber, Jones (2004) paper cited above (Annually resolved patterns of summer temperature over the Northern Hemisphere since AD 1400).

"To overcome these problems, the decline is artificially removed from the calibrated tree-ring density series, for the purpose of making a final calibration. The removal is only temporary, because the final calibration is then applied to the unadjusted data set (i.e., without the decline artificially removed). Though this is rather an ad hoc approach, it does allow us to test the sensitivity of the calibration to time scale, and it also yields a reconstruction whose mean level is much less sensitive to the choice of calibration period."

In my opinion the approach requires more justification than provided in the article, but it isn't the smoking gun some are making it out to be.

Morgan, thanks for the reference. Any ideas why they did not provide the "fudge factors" used in the paper; why said fudge factors are used to increase, decrease, or leave the same different sets of data (the paper in effect says only increase, because of a recent decline - and why that is permissible is questionable); or why there is a huge, almost exponential increase in the fudge factors after 1958? I find no explanation of this. The programming files also use the words VERY ARTIFICIAL..., but the papers make this sound so routine and don't use that adjective. Why use the word VERY unless you are implying "too much"? Correct me if I'm wrong, but the programs proceed to plot out the fudged data and do not plot the data "without the decline artificially removed." If you are only using the fudged data for calibration purposes, why plot it? Just asking.

"Any idea why they did not provide the "fudge factors" used in the paper?" Not really, except that there probably was no justification for the actual numbers (beyond "they work") and that the editor didn't require them to do so.

"why said fudge factors are used to increase or decrease or leave the same different sets of data?" Are you referring to the "fudge factors" having both positive and negative values (and zeroes)? Or are you saying that the factors themselves are sometimes added to the raw data, sometimes subtracted from them, and sometimes not used? Assuming you meant the former, it looks like they are trying to get a smoothly increasing time series over the calibration period.

The increase after 1958 is, I assume, because that is when the "divergence problem" kicks in (I'm very skeptical about the issue, but not surprised that Briffa et al. feel justified in correcting it). Why does the correction have the shape it does? Your guess is as good as mine.

Why plot the data without removing the fudge factors? That's a mystery, though I guess you'd want to visually inspect the series to make sure you hadn't cocked up the code. But if that's the case, why not plot the whole thing? Dunno.

GIGO? I suppose so, but how garbage-y? I don't know enough about the "calibration" to hazard a guess. All I'm pointing out is that the "very artificial adjustment" is applied as an intermediate step to allow this "calibration" to proceed, not to the final data (or at least that's what they claim). So it isn't the final reconstruction that is "made up", which is what some comments indicated.