Crossword Puzzle #3

Let’s move from the 2-column case to the 3-column case (e.g. Bagdarin, already considered), applying the results here. Some of the hypotheses from the earlier discussion have to be revisited. Bagdarin has 3 dset=0 versions (0, 1, 2). As we’ve seen in the 2-column case, the column that continues to the present (version 2) exactly matches the dset=1 version in the later part of the record, when there is only one version. Bagdarin version 0 is reduced by 0.3 deg C, version 2 by 0.2 deg C. Given these adjustments, dset=1 more or less follows.

Versions 0 and 1 are scribal variations for 1980 and after and from 1951 to 1960, but are discrepant between 1960 and 1980. For analysis, it might be a good idea to find a record that has 3 columns and is only scribal.

Hansen and Lebedeff 1987 describe an iterative procedure for combining the versions net of the deductions. My experiments indicate that this boils down to a simple unweighted average of the versions net of deductions, but this is experimental so far.
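A minimal sketch of that "simple unweighted average net of deductions" reading, for anyone who wants to experiment. The function name, the `None`-for-missing convention and the sample values are all illustrative assumptions, not Bagdarin data; the deduction for version 1 is assumed to be zero because none is stated above.

```python
def combine_unweighted(versions, deductions):
    """Subtract each version's deduction, then take the plain mean of
    whatever versions report a value in each period (None = missing)."""
    length = len(versions[0])
    combined = []
    for i in range(length):
        vals = [v[i] - d for v, d in zip(versions, deductions) if v[i] is not None]
        combined.append(sum(vals) / len(vals) if vals else None)
    return combined

# Made-up annual values, chosen only to exercise the function.
v0 = [1.3, 1.5, None, None]
v1 = [1.0, 1.2, 1.4, None]
v2 = [None, 1.4, 1.6, 1.8]

# Deductions: 0.3 for version 0 and 0.2 for version 2 as quoted in the post;
# 0.0 for version 1 is an assumption.
combined = combine_unweighted([v0, v1, v2], [0.3, 0.0, 0.2])
print(combined)
```

Where only one version exists, the result is just that version less its deduction, which is consistent with version 2 matching dset=1 in the late part of the record.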

Hansen’s description (for whatever that’s worth) indicates to me that he first calculated the delta between versions 0 and 1 (or alternatively between 1 and 2), then formed an interim composite and repeated the procedure. However, I couldn’t get everything to work. If I apply the proposed 2-column Hansen-delta calculation to versions 0 and 1, I get a delta of -0.2, followed by a delta between the interim series and version 2 of +0.1: so this doesn’t work.

If I try version 1 against version 2 first, I get a delta of 0 followed by a delta of -0.2 between the interim series and version 1.

At first peek, these deltas seem quite unstable to the order in which the versions are combined.
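The order dependence is easy to reproduce with a toy sequential combiner. This is only a sketch under stated assumptions: the delta is taken to be the mean difference over the overlap period, the interim composite is a record-count-weighted average, and the three series are invented (they are not the Bagdarin versions). None of this is confirmed to be Hansen's actual calculation.

```python
def mean_offset(a, b):
    """Mean of (a - b) over periods where both have data (assumed delta)."""
    diffs = [x - y for x, y in zip(a, b) if x is not None and y is not None]
    return sum(diffs) / len(diffs)

def merge(base, counts, new):
    """Shift `new` onto `base` by the overlap delta, then average,
    weighting `base` by the number of records already in it."""
    delta = mean_offset(base, new)
    out, out_counts = [], []
    for b, n, x in zip(base, counts, new):
        if x is None:
            out.append(b)
            out_counts.append(n)
        elif b is None:
            out.append(x + delta)
            out_counts.append(1)
        else:
            out.append((b * n + x + delta) / (n + 1))
            out_counts.append(n + 1)
    return out, out_counts, delta

def combine_in_order(series_list):
    """Fold the series together left to right; return composite and deltas."""
    base = list(series_list[0])
    counts = [0 if v is None else 1 for v in base]
    deltas = []
    for s in series_list[1:]:
        base, counts, d = merge(base, counts, s)
        deltas.append(d)
    return base, deltas

# Invented values with one discrepant period between v0 and v1.
v0 = [1.0, 1.2, 0.8, None, None]
v1 = [0.8, 1.0, 0.9, 1.1, None]
v2 = [None, None, 1.0, 1.3, 1.2]

result_a, deltas_a = combine_in_order([v0, v1, v2])
result_b, deltas_b = combine_in_order([v2, v1, v0])
print(deltas_a, deltas_b)
```

In this toy case the two orderings give different pairs of deltas. Part of the difference in the final series is simply that the composite is anchored to whichever record comes first, so a fair comparison should look at the deltas and the shape, not just the level.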

So today’s puzzle: find a system for the 3-column case, consistent with 2-column results.

32 Comments

Steve M, what is the reason behind Dr. Hansen’s refusal to provide the code for his publication if it was peer reviewed? I find it unusual, since if it was a commercial work, then you could understand the refusal to make the codes available to the general public, because it has commercial value. I don’t publish peer review work myself, however I scour different types of peer reviewed science & engineering journals looking for interesting algorithms that might be useful in what I do.

If I come across an algorithm that would be of interest, I first ask the authors whether their implementation is available to the general public. If the published algorithm was developed under commercial sponsorship, I am told that the code is not available to the public, but that they are very happy to answer any questions about the published algorithm if I want to implement my own version based on their paper. For example, the paper “Numerical pricing of discrete barrier and lookback options via Laplace transforms” was published in the Journal of Computational Finance and sponsored by CSFB (Credit Suisse First Boston). I contacted the author, Prof. Steven Kou of Columbia University, to see if his implementation was available. He told me that his work was commercial and that he was sorry he couldn’t give me his code, but that I was welcome to ask him any questions about the algorithm if I wanted to implement it myself. For non-commercial publications, I have always requested the authors’ code, and they do send it through if they already have an implementation.

The following link is a must read about Reproducible Research. Prof. David Donoho and colleagues at Stanford have developed a wavelet toolkit, named WaveLab (I have used it for a few years now) where they made their codes available to the public, and their main reason is so that anyone could reproduce their research. Read about it below.

Have you tried a FOI request yet? Also, is there anyone else who works with Hansen who might be more sympathetic to the need for auditing? Someone who might leak the code anonymously? How about that guy who was asking for the scraped GISS data, any chance of him returning the favour?

Big fan here, but I have a question, please excuse the question it might sound confrontational but I don’t mean it to.

I’ve noticed you are continually auditing Hansen’s, HadCRUT3’s and all the others’ work for mistakes and pointing out that they have made a lot of them, which is excellent. I was wondering when, if ever, you would just get fed up with waiting for them to “free the code” and make your own analysis of the world’s temperature data, and write your own paper on the subject? If you then released the code and the system used to make the chart, wouldn’t yours become as “official” as theirs? After all, the only reason theirs is official is because they say it is, obviously not because anyone has actually tested their methodology and found it to be correct. They could test yours.

It’s just a thought, and might help put a lot of added pressure on many of the so called “experts” to show their methodology (free the code), or be left as yet another Piltdown Man in the footnotes of history?

Do you think their temperature estimates would differ substantially from your own?

Sorry if I’m off topic. After reading the last 2 or 3 posts, it’s obvious to me that the system needs a total overhaul, and I doubt very much it will happen from within.

#4:
I agree with you that generating an alternative analysis with open code has many advantages. Most importantly, it would lead to a better understanding of the truth. Second, it would pressure the authors of previous analyses to open their code. Third, it would help dispel any conspiracy theories.

I am looking into the feasibility of writing new analysis code myself (see my comment #115 here). With support and feedback from this community a very solid program could be written and publicized in a relatively short time.

#4, #5: One can either choose to play the game by their rules, or expose the flaws in the system. I would rather Steve keep doing what he is doing because he is helping to expose the flaws in the system.

#4. If I were doing it, I would want to develop proper information on every station used in the network. It’s a big job and would take time. They’ve been doing this for 25 years and obtained hundreds of thousands, if not millions, of dollars in funding. It doesn’t seem like they’ve done much in the way of data quality control and so what they seem to have produced is some fairly crappy code to calculate average temperatures from poorly QCed data.

Just because Hansen, like Mann, has some gross errors, doesn’t mean that you can obtain something meaningful just by fixing the gross errors.

That’s nothing compared with what we discovered in the Meteorological Station at Linares :) The barometer was a laundry bathtub and the thermometers were alcohol thermometers manufactured in 1959! And our discovery happened in 2001!

I know that this doesn’t match Hansen’s description, but does calculating the deltas from version 2 before doing any combining work? So right off the bat calculate the delta for version 1 & 2 and also for versions 0 & 2. Then after applying the deltas, combine versions 1 & 2 (I’ll call 1-2). Then combine 1-2 with version 0. I would test this myself but I don’t have the yearly averages or the delta calculation method in front of me.
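For anyone who does have the yearly averages to hand, the suggestion can be sketched as follows. The delta is assumed to be the mean difference over the overlap with version 2, "combine" is assumed to be a plain average where both series have data, and the sample numbers are invented; all function names are hypothetical.

```python
def overlap_delta(ref, other):
    """Mean of (ref - other) over periods where both have data."""
    diffs = [r - o for r, o in zip(ref, other) if r is not None and o is not None]
    if not diffs:
        raise ValueError("no overlap with the reference version")
    return sum(diffs) / len(diffs)

def shift(series, delta):
    """Add delta to every non-missing value."""
    return [None if x is None else x + delta for x in series]

def average_pair(a, b):
    """Average where both exist; otherwise carry whichever value exists."""
    out = []
    for x, y in zip(a, b):
        if x is None:
            out.append(y)
        elif y is None:
            out.append(x)
        else:
            out.append((x + y) / 2)
    return out

# Invented values, not station data.
v0 = [1.0, 1.2, 0.8, None, None]
v1 = [0.8, 1.0, 0.9, 1.1, None]
v2 = [None, None, 1.0, 1.3, 1.2]

# Both deltas are measured against version 2 before any combining.
d1 = overlap_delta(v2, v1)
d0 = overlap_delta(v2, v0)

combo_12 = average_pair(shift(v1, d1), v2)      # "1-2"
final = average_pair(combo_12, shift(v0, d0))   # "1-2" with version 0
print(d0, d1, final)
```

One caveat with the literal two-stage version: the second average gives version 0 half the weight and versions 1 and 2 a quarter each, so it is not equivalent to an equal-weight mean of all three.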

Is there somewhere all the temperatures are stored in a single database, easy to download? Or, If I wanted to play with the data, do I have to go to GISS and download it all station by station? I could get a scraper to do that, but it would still be some time.

Bottom line is, if I want to see the raw data on all the thermometers, is anyone sharing that? It can’t be that many megabytes.

First, January 89 is a special case. I suspect it may have been dealt with as an outlier, but the method is not clear. Whatever the method, it wasn’t dealt with very well. Need to read that part of the paper again.

If you ignore that month, this worked for me.

Combine Rec0 with Rec1 by applying a bias of -0.11 to -0.15 to Rec0
Round the resulting record (RecNew) to the nearest 0.1 (round in the positive direction)
Combine RecNew with Rec2 by applying a bias of -0.21 to -0.25 to RecNew
Round again for output

I think I ruled out any solution without rounding the combined record.

I think I also ruled out the other way to round (-1.05 rounds to -1.1, no good.) Round to the positive direction.

I think I also ruled out any other combination order for this set, including combining them all at the same time.

And finally, I think I ruled out rounding the bias itself, it just about has to be at least 2 decimal places, probably just floating point.
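Here is a sketch of that recipe, mainly to pin down the rounding rule (ties go toward the positive direction, so -1.05 becomes -1.0). The bias values -0.13 and -0.23 are just hypothetical mid-range picks from the intervals above, "combine" is assumed to mean a plain average where both records have data, and the sample records are invented.

```python
from decimal import Decimal, ROUND_FLOOR

def round_half_toward_positive(x, places=1):
    """Round to `places` decimals with ties going toward +infinity.
    Decimal arithmetic avoids binary floating-point surprises at the ties."""
    scale = Decimal(10) ** places
    scaled = Decimal(str(x)) * scale + Decimal("0.5")
    return float(scaled.to_integral_value(rounding=ROUND_FLOOR) / scale)

def combine_with_bias(rec_a, rec_b, bias):
    """Add `bias` to rec_a, average with rec_b where both exist,
    and round the combined record to the nearest 0.1."""
    out = []
    for a, b in zip(rec_a, rec_b):
        vals = []
        if a is not None:
            vals.append(Decimal(str(a)) + Decimal(str(bias)))
        if b is not None:
            vals.append(Decimal(str(b)))
        out.append(round_half_toward_positive(sum(vals) / len(vals)) if vals else None)
    return out

# Invented monthly values, not the actual records.
rec0 = [-1.0, 0.5]
rec1 = [-0.8, None]
rec2 = [None, 0.3]

rec_new = combine_with_bias(rec0, rec1, -0.13)    # first combination, rounded
final = combine_with_bias(rec_new, rec2, -0.23)   # second combination, rounded
print(round_half_toward_positive(-1.05), rec_new, final)
```

Note that Python's built-in `round` and Decimal's `ROUND_HALF_UP` both behave differently on negative ties (the latter rounds -1.05 to -1.1), which is why the floor-after-adding-0.5 form is used here.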

Re 7: snip #4. If I were doing it, I would want to develop proper information on every station used in the network. snip
Steve, could you elaborate on what you mean by “proper information”? I’m guessing that it would be mostly metadata and UHI effects. Agreed, on over 6000 stations that would be a big job, probably needing funding.
However, there may be other ways to address the issue. One would be for this group to agree on the best way to process the existing data, based on the kinds of data problems known now, and on the known defects in the Hansen method. Then write the program to do that. Then process the same data that Hansen has used, and compare the results. This would expose systematic averaging errors, if any, and give a better GW result. This job is well within the range of capability of this group, probably easier than puzzles #1 and #2, so far. (Could be solving puzzles is perceived as more challenging/fun). John V, in #5, seems willing to tackle the SW development for this one.
Step 2 would be for this group to agree a set of rules for treating apparent UHI for all stations where population growth from ca 1975 to ca 2005 is available. One of the rules would have to deal with apparent saturation, because it appears that urban agglomerations do reach a point where the temp. increase flattens. Then all sites where temp. and population delta are known could be recalculated, and then step one re-run. The comparison of GISS, step one and step two, would probably be very enlightening.
Step 3 would be agreeing how to treat all stations according to known metadata station changes. That is the big one from a work point of view, but might make the smallest contribution to getting as correct results as can be got. (For sure we can’t go back and regenerate high quality initial historic data).
I think that all the participants (and lurkers) here believe that the surface instrument average GW as generated by GISS is wrong due to flawed methodology, and the active contributors are trying to demonstrate that. Doing steps one and two above would probably be the best demonstration you could do. Given that a good methodology, implemented by good SW would be the result, actually knowing all the flaws in the GISS methodology would be superfluous.
Might not be as much fun as puzzle solving and ferreting out GISS flaws, but seems to me it would be vastly more productive. Murray

Re: #17
Steve, I can’t fund step 3 above, but I will put up $1000.00 for the person or team that implements step one. Ie write a set of rules that key contributors here agree to, write a program that all agree implements those rules, and then rerun the available data through that SW. You define how the judging will be done, and tell me how to put the stake in escrow.
If this proposal seems ok to you, we can go ahead as now, or I plan to be in Toronto the 14th through 17th Sept., and would be delighted to meet you. Murray

1a. Generate station monthly data from daily data:
Although the monthly data is already available in GHCN v2, it may be useful to generate the monthly data from scratch so that error bounds on the monthly averages can be determined and recorded. The result of this step would be compared to GISS dset=0.
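A rough sketch of what step 1a could look like. The input layout (tuples of year, month, temperature) is an assumption, and the error bound shown is the plain standard error of the mean, which treats days as independent. Daily temperatures are autocorrelated, so this would understate the true uncertainty; it is a starting point only.

```python
import math
import statistics
from collections import defaultdict

def monthly_means(daily):
    """daily: iterable of (year, month, temp_c) tuples.
    Returns {(year, month): (mean, standard_error, n_days)}.
    standard_error is None when a month has fewer than 2 readings."""
    groups = defaultdict(list)
    for year, month, temp in daily:
        groups[(year, month)].append(temp)
    result = {}
    for key, temps in groups.items():
        n = len(temps)
        mean = statistics.fmean(temps)
        stderr = statistics.stdev(temps) / math.sqrt(n) if n > 1 else None
        result[key] = (mean, stderr, n)
    return result

# Invented daily readings for illustration.
daily = [
    (1989, 1, -30.0),
    (1989, 1, -28.0),
    (1989, 1, -32.0),
    (1989, 2, -25.0),
]
monthly = monthly_means(daily)
print(monthly)
```

Keeping the day count alongside each mean also makes it easy to flag months with too few observations before comparing against dset=0.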

1b. Combine station data:
Combining multiple sets of station records can be done directly from the daily data (my preference) or from the monthly averages. In either case, the variance of the offset should be calculated and stored. The result of this step would be compared to GISS dset=1.
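As a sketch of the "variance of the offset" idea: take the offset between two records as the mean difference over their overlap, and store the variance of that mean with it. The function name and sample values are hypothetical, and dividing the sample variance by n again assumes independent differences, which is optimistic for real station data.

```python
import statistics

def offset_with_variance(rec_a, rec_b):
    """Return (offset, variance_of_offset, n_overlap), where offset is the
    mean of (rec_a - rec_b) over periods both records report (None = missing).
    variance_of_offset is None when fewer than 2 periods overlap."""
    diffs = [a - b for a, b in zip(rec_a, rec_b) if a is not None and b is not None]
    if not diffs:
        raise ValueError("records do not overlap")
    n = len(diffs)
    offset = statistics.fmean(diffs)
    variance = statistics.variance(diffs) / n if n > 1 else None
    return offset, variance, n

# Invented values for illustration.
rec_a = [1.0, 1.2, 0.8, None]
rec_b = [0.8, 1.0, 0.9, 1.1]
offset, variance, n_overlap = offset_with_variance(rec_a, rec_b)
print(offset, variance, n_overlap)
```

A large variance relative to the offset would be a signal that the two records disagree in shape and not just in level, which is the situation with Bagdarin versions 0 and 1 between 1960 and 1980.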

1c. Homogeneity adjustments:
I am not sure how this could be done as I have not looked for any reference documents. The result of this step would be compared to GISS dset=2.

1d. Generate regional and worldwide temperature trends:
The boxing method used in GISS could be applied, but I think there are better methods. Would have to research this.

As for step 2, a good starting point for determining UHI effects would be creating a new study similar to Peterson 2003. It should not be difficult to get population statistics and trends for North American stations. We could use these to differentiate purely rural stations, long-time urban stations, and newly urban stations.

I apologize for hijacking this thread. Steve M, would you consider opening a new thread for this discussion? Thanks.

Falafulu Fisi,
Thank you so much for the link to WaveLab. The pdf was very interesting reading. Those guys feel strongly about supplying the code because they could not reproduce their own results when the code was lost. I have included a link to the website and pdf paper on Wikipedia’s article on data sharing.

If my bias figures are confirmed, I am flabbergasted. I already knew this record was flawed, but to see the process in action is amazing. When record 2 is added to the mix, Hansen tries to detect a .14 bias that exists between these records. But he comes up with -.24, a difference of -.38 degrees.

His error is triple the bias he is looking for.

And here’s the kicker. That .14 bias was introduced by Hansen’s same error in the first combination! If he hadn’t screwed that one up, there would be no bias to detect.

No, I don’t have a plausible explanation. Perhaps it was the obsolescence of their equipment. The case went into the public domain, but I didn’t follow the results or whether they restructured their methods. I didn’t know that GISS/GHCN on Linares had stopped in 1983; could you give me the link? Perhaps I’ll find something about it.

Thank you, Steve! And you’re right, Linares is not working now! I’ve sent e-mails to some colleagues working there to find out the cause. As soon as I have the answer, I’ll let you know. I’m really ashamed of this inconvenience.