Quantifying the Hansen Y2K Error

I observed recently that Hansen’s GISS series contains an apparent error in which Hansen switched the source of GISS raw from USHCN adjusted to USHCN raw for all values January 2000 and later. For Detroit Lakes MN, this introduced an error of 0.8 deg C. I’ve collated GISS raw minus USHCN adjusted for all USHCN sites (using the data scraped from the GISS site, for which I was most criticized in Rabett-world). Figure 1 below shows a histogram of the January 2000 step for the 1221 stations (calculated here as the average GISS-minus-USHCN difference after January 2000 minus the average for the 1990-1999 period). The largest step occurred in Douglas AZ, where the Hansen error is 1.75 deg C! There is obviously a bimodal distribution.
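For concreteness, the per-station step can be sketched as follows. This is a Python illustration of the calculation just described, not the R code actually used; the data here are synthetic.

```python
import numpy as np

def jan2000_step(years, diff):
    """January 2000 step: mean of (GISS raw - USHCN adjusted)
    for 2000 onward, minus the mean for the 1990-1999 period."""
    years = np.asarray(years)
    diff = np.asarray(diff, dtype=float)
    pre = diff[(years >= 1990) & (years <= 1999)]
    post = diff[years >= 2000]
    return post.mean() - pre.mean()

# Synthetic station whose difference series jumps by 0.8 deg C in 2000
years = np.arange(1990, 2007)
diff = np.where(years < 2000, 0.0, 0.8)
print(round(jan2000_step(years, diff), 2))  # 0.8
```

Applying this to each of the 1221 stations yields the distribution of steps shown in the histogram.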

Next here is a graph showing the difference between GISS raw and USHCN adjusted by month (with a smooth) for unlit stations (which are said to define the trends). The step in January 2000 is clearly visible and results in an erroneous upward step of about 0.18-0.19 deg C in the average of all unlit stations. I presume that a corresponding error would be carried forward into the final GISS estimate of US lower 48 temperature and that this widely used estimate would be incorrect by a corresponding amount. The 2000s are warm in this record with or without this erroneous step, but this is a non-negligible error relative to (say) the amounts contested in the satellite record disputes.

Aug 7 UPDATE:
On the weekend, I notified Hansen and Ruedy of their Y2K error as follows:

Dear Sirs,
In your calculation of the GISS “raw” version of USHCN series, it appears to me that, for series after January 2000, you use the USHCN raw version whereas in the immediately prior period you used USHCN time-of-observation or adjusted version. In some cases, this introduces a seemingly unjustified step in January 2000.

I am unaware of any mention of this change in procedure in any published methodological descriptions and am puzzled as to its rationale. Can you clarify this for me?

In addition, could you provide me with any documentation (additional to already published material) providing information on the calculation of GISS raw and adjusted series from USHCN versions, including relevant source code.

Thank you for your attention,
Stephen McIntyre

Today I received the following response:

Dear Sir,

As to the question about documentation, the basic “GISS Surface Temperature Analysis” page starts with a “Background” section whose first paragraph contains the sentence: “Input data for the analysis ,…, is the unadjusted data of GHCN, except that the USHCN station records were replaced by a later corrected version”. A similar statement appears in the “Abstract” and the “Introduction” section of our 2001 paper (JGR Vol 106, pg 23,947-23,948). The Introduction explains the above statement in more detail.

In 2000, USHCN provided us with a file with corrections not contained in the GHCN data. Unlike the GHCN data, that product is not kept current on a regular basis. Hence we used (as you noticed) the GHCN data to extend those data in our further updates (2000-present).

I agree with you that this simple procedure creates an artificial step if some new corrections were applied to the newest data, rather than bringing the older data in sync with the latest measurements – as I naively assumed. Comparing the 1999 data in both data sets showed that in about half the cases where the 1999 data were changed, the GHCN data were higher than the USHCN data and in the other half it was the other way round with the plus-corrections slightly outweighing the minus-corrections.

Although trying to eliminate those steps should have little impact on the US temperature trend (much less the global trend), it seems a good idea to do so and I’d like to thank you for bringing this oversight to our attention.

When we did our monthly update this morning, an offset based on the last 10 years of overlap in the two data sets was applied and our on-line documentation was changed correspondingly with an acknowledgment of your contribution. This change and its effect will be noted in our next paper on temperature analysis and in our end-of-year temperature summary.

The effect on global means and all our tables was less than 0.01 C. In the display most sensitive to that change – the US-graph of annual means – the anomalies decreased by about 0.15 C in the years 2000-2006.

Respectfully,

Reto A Ruedy

Well, my estimate of the impact on the US temperature series was about 0.18-0.19 deg C, a little more than Ruedy’s 0.15 deg C. My estimate added a small negative offset going into 2000 to the positive offset of about 0.15-0.16 after 2000; I suspect that Ruedy is not counting both parts, thereby slightly minimizing the impact. However, I think that you’ll agree that my estimate of the impact was pretty good, given that I don’t have access to their particular black box.

Needless to say, they were totally unresponsive to my request for source code. They shouldn’t be surprised if they get an FOI request. I’ll post some more after I get a chance to cross-check their reply.

As to the impact on NH and global data, I’ve noted long before this exchange that the non-US data in GHCN looks more problematic to me than the US data, and it would be really nice if surfacestations.org started getting some international feedback. Ruedy’s reply was copied to Hansen and to Gavin Schmidt. I’m not sure what business it is of Gavin’s other than his “private capacity” involvement in a prominent blog.

The magnitudes of the differences in the second graph for the pre-2000 years and the post-1999 years have to do with which USHCN adjustments GISS has used for the pre-2000 years, and which it has not. The pre-2000 portion of that graph may indicate the USHCN FILNET adjustments, while the post-1999 portion may indicate FILNET plus SHAP plus TOB adjustments, or their negatives. The wording of this comment is loose because I haven’t had enough coffee yet to want to write a more detailed description.

Is the grey line actually the variance and if so the variance of what? It does not appear symmetrical about the mean value.

Steve can correct me if I’m wrong, but I think the grey line is the data, and the black line is some kind of filtered average. As you point out, it appears asymmetrical, but I think that is because the top part of the data is cut off in this chart (again I may be wrong).

Re: Willis Eschenbach (#12),
Willis and Bernie –
I enlarged the image and get three impressions:
1.) Looking at other years and the data traces vs the smoothed out curve, it appears that this asymmetry is common on this graph.
2.) It does not appear that the data line (grey) extends up past the top of the image.
3.) It does appear that the data line (grey) has more width above the line than below, so I speculate that the data line above the black line represents more data points above than below. The negative spikes are sharper, implying single points, whereas the positive spikes are more rounded.

My overall impression is that there are simply more positive values than negative ones, implying the positive spikes are multiple data points; i.e., there is some kind of a dwell on the positive side.

It is too indistinct to tell for sure, but that is my impression.

The data, of course, are the data. I am just adding to the discussion (way after the earlier comments) about the image.

The NASA GISS Surface Temperature Analysis (GISTEMP) provides a measure of the changing global surface temperature with monthly resolution for the period since 1880, when a reasonably global distribution of meteorological stations was established. Input data for the analysis, collected by many national meteorological services around the world, is the unadjusted data of the Global Historical Climatology Network (Peterson and Vose, 1997 and 1998) except that the USHCN station records up to 1999 were replaced by a version of USHCN data with further corrections after an adjustment computed by comparing the common 1990-1999 period of the two data sets. (We wish to thank Stephen McIntyre for bringing to our attention that such an adjustment is necessary to prevent creating an artificial jump in year 2000.) These data were augmented by SCAR data from Antarctic stations not present in GHCN. Documentation of our analysis is provided by Hansen et al. (1999), with several modifications described by Hansen et al. (2001). The GISS analysis is updated monthly.

I am eagerly anticipating more from Steve M later. I need a reality check, again.

I have not been keeping up with all the ins and outs, but I think maybe a more prominent note is needed if they are actually changing the data set. If a previous corrected version was incorrect and has now been replaced by a modified one, don’t users of the previous one need to be told that they used an incorrect one?

GISS are to be commended for their acknowledgment to Steve, but what an opaque piece of writing. I’ve read it several times, and I can’t figure out what they mean. I hope Steve will be able to translate it for us.

If this data was used by a lot of other people, the change in correction may need to be stated in a publication. In that case, it may, if central enough, warrant having you as a co-author, given that you “made a significant contribution”. If that causes too much acrimony, since you won’t sign onto other things in their paper, or just don’t play well in the sandbox, it might be right for you to publish on your own.

All of the above is speculative as I have no idea about the datasets or their usage. I’m not sure if it’s a minor thing like eliminating a miscellaneous bad JCPDF, or if it is a big deal (widely used data for other work).

#26, 27. TCO, I don’t know whether you remember the Mann corrigendum of 2004 – where Mann admitted a few errors, not on principal components or verification r2 or bristlecones – and said that it didn’t “matter”. Yeah, yeah, he still says that it didn’t “matter”, but I doubt that anyone really feels very confident in his reassurances. My guess is that 0.15 (probably more like 0.18) will affect some of the U.S. hot year rankings a little, though not enormously; the 2000s are still warm. Why would you or anyone simply think that this is the last stone to be turned over in this data set?

Let me add my congratulations. My guess is that the copy to Gavin is due to his having been pelted by RC regulars about your findings.

Regarding the GISSTEMP update:

Before:

“Input data for the analysis, collected by many national meteorological services around the world, is the unadjusted data of the Global Historical Climatology Network (Peterson and Vose, 1997 and 1998) except that the USHCN station records included were replaced by a later corrected version.”

After:

“Input data for the analysis, collected by many national meteorological services around the world, is the unadjusted data of the Global Historical Climatology Network (Peterson and Vose, 1997 and 1998) except that the USHCN station records up to 1999 were replaced by a version of USHCN data with further corrections after an adjustment computed by comparing the common 1990-1999 period of the two data sets. (We wish to thank Stephen McIntyre for bringing to our attention that such an adjustment is necessary to prevent creating an artificial jump in year 2000.)”

[snip- I scrubbed something not because it was an error but because it was off topic and bickering. I’ve been trying to improve the threads by doing this and most readers appreciate the intervention with bickering people.]

[Steve – because bickering and off-topic postings occasionally swamp the threads. I don’t always prune things but I’ve been trying to do so recently due to an increased activity of bickering. From my perspective, it’s helped as the parties involved are less likely to post bickering posts if they are regularly pruned.]

#42. Jerry, can you give me some particulars of changes and updates that you’ve noticed?

They have already changed their US (Figure D) numbers online without any preservation of the old numbers. By sheer chance, I happened to have the old information sitting in my active R session; I hadn’t saved them or planned to save them (I have now) and will post them.

#45. I checked Hopewell and I agree. Jeez, they’ve been crazy busy the last couple of days. I’m not sure what they’re doing but they’re really going at it fast. If Hopewell VA is typical, they’ll have changed all the GISS raw and GISS adjusted versions in the U.S. before 2000.

I think that they are trying to do things too fast without thinking it through. If this is what they’ve done (and I’m not sure yet), the pre-2000 GISS raw (which was fairly stable) has been changed into pre-adjusted versions that now don’t track to original sources, whatever those sources were.

My, my…

If it were me in their shoes, I’d have kept the pre-2000 data intact and adjusted the post-2000 data. There are far too many changes in what they’re doing. But it will take a couple of days to assess the situation.

Here’s something interesting. If you compare “old” Hopewell VA numbers (fortunately preserved due to my much criticized “scraping” of GISS data) to the “new” Hopewell VA numbers, the GISS “raw” data for say June 1934 or June 1935 has gone up by 0.7 deg C, while the GISS “adjusted” data has gone up by only 0.1 deg C. So in some cases, their “UHI” adjustment as applied offsets what was a programming error. Makes you wonder about the validity of the UHI adjustment.

BTW as Jerry previewed, their US data set is now a total mess. Everything’s been written over prior to 2000.

I think that it seems necessary to the Climateers to always keep the current measured temperatures in sync with the current derived temperatures. Therefore they have to periodically readjust to make present average temperatures match.

To be fair, if they did it the opposite way they’d be jeered at just as loudly if not more so. Still, if one is looking at the thermometer outside or at a weather report in a paper from 1930 it still will make one pause if the scientists say their temperature is different than what it is/was.

There’s an interesting read from earlier this year over at Open Mind on this subject.

Note the number of things that we were assured could not occur but have in fact now occurred.

BTW, does anyone know the actual facts regarding the TOB correction and the temperature data outside the US? I get the impression that the TOB “correction” has an important effect and that it is not applied to non-US data? Has the TOB model been updated and re-validated as new data have become available?

All this talk about “corrections” has made an incorrect application of the word into readily accepted SOP. We can now make “corrections” when the correct answer is not known.

Reviewing Reto Ruedy’s note to you, I would revise my interpretation in light of his statement:

“When we did our monthly update this morning, an offset based on the last 10 years of overlap in the two data sets was applied and our on-line documentation was changed correspondingly with an acknowledgment of your contribution.”

I would say that the average adjustments of 1990 through 1999 would be what gets backed out (“offset”) of 1999 and preceding years. For most stations, the adjustments will be the same for each of those years, but for some stations the adjustments may have changed.

I would say that such an adjustment would be a temporary measure. Whatever its rationale, it would not survive much of a critique.
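One reading of Ruedy’s “offset based on the last 10 years of overlap” is a simple mean shift applied to the pre-2000 source. The actual GISS code has not been released, so this Python sketch (with made-up numbers and a hypothetical function name) only illustrates that interpretation:

```python
import numpy as np

def apply_overlap_offset(years, pre2000_source, post2000_source):
    """Splice two versions of a station series without a step:
    shift the pre-2000 source by the mean 1990-1999 difference
    between the versions, then switch sources at 2000.
    This is a guess at the procedure, not GISS's code."""
    years = np.asarray(years)
    old = np.asarray(pre2000_source, dtype=float)
    new = np.asarray(post2000_source, dtype=float)
    overlap = (years >= 1990) & (years <= 1999)
    offset = (new[overlap] - old[overlap]).mean()
    merged = np.where(years <= 1999, old + offset, new)
    return merged, offset

# Synthetic station: the new source runs 0.5 deg C above the old one
years = np.arange(1995, 2005)
old = np.zeros(10)      # e.g. USHCN-adjusted version
new = old + 0.5         # e.g. GHCN version
merged, off = apply_overlap_offset(years, old, new)
print(round(off, 2))  # 0.5
```

Under this reading, the per-year adjustments are collapsed to a single decadal mean, which is why the commenter above expects it to be only a temporary measure.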

Note the number of things that we were assured could not occur but have in fact now occurred.

I think the writer is conjuring up some straw men from CA. He treats the adjustment methods as neutral on the issue of temperature trends, and given their underlying assumptions he is correct. Unfortunately, what he and many of the defenders of Hansen’s methods here at CA fail to address is the assumptions. It is as if they do not want to dig any further than the method itself allows, and that makes for a very dull analysis.

#55 Dan, it’s not just the best estimates thread; there’s also a brouhaha of sorts in the surface stations thread.

Liling calls 20% of the 125-year trend occurring in just 6 years “a glitch”, but rather than deal with that, dhogaza complains about Dr. McIntyre instead!!

That glitch pales compared to McIntyre’s outright dishonesty about other issues, and Hansen’s graceful acceptance of the correction to the data analysis contrasts greatly with McIntyre’s unwillingness to acknowledge his own (frequent) errors.

Dano, Guthrie, Bloom, Boris and others are up to the usual tricks also. Maybe I should stop by.

I plotted the annual GISS temperature versus the satellite-derived lower troposphere temperature (RSS) for the US. The resulting plot is here. The GISS temperatures include the recent adjustment.

The periods where GISS surges ahead of the satellite record appear to be associated with times of El Ninos, perhaps involving changes in precipitation. The record is muddied by major volcanoes in the early 1980s and early 1990s and any correlation is weak, but it’s my best guess.

The year 2006 looks like an outlier. It was about neutral on ENSO (a weak La Nina and a weak El Nino occurred), so why the apparent surge? It looks odd.

Prior to the recent GISS adjustments the 2000-2006 period stood out as an odd period. Now, with the adjustments, GISS in the 2000s looks much closer to the satellite record (except for 2006).

I revised the GISS versus satellite plot to include the old (incorrect) GISS numbers. The plot is here.

Note how the old GISS values split away from the satellite record in 2000 – in retrospect it seems like that should have been a caution flag that something was amiss.

It looks like something is still amiss with 2006.

The most intriguing thing about the comparison to me are the larger year-to-year swings for GISS than for the satellite record. If the satellite record rises or falls by X then GISS moves by, say, 1.3X. I’ll quantify that later. Remarkably that pattern of exaggerated swings seems to have broken down in 2000, even with the revised data.

A possible natural explanation is that it reflects the difference between years in which the US receives cold wintertime Arctic air and the years (typically El Nino years) when the US doesn’t get the bitter cold air. The really cold Arctic air is only in the lowest regions of the atmosphere (below say 5,000 feet), which GISS would fully see, while the satellite also sees air above 5,000 feet and averages that “warmer” upper air with the cold surface air.
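The swing ratio proposed above (“if the satellite record moves by X then GISS moves by, say, 1.3X”) could be quantified with an ordinary least-squares fit. A minimal Python sketch with synthetic anomalies (the function name and numbers are illustrative, not from the actual series):

```python
import numpy as np

def swing_ratio(giss, satellite):
    """Least-squares slope of GISS annual anomalies regressed on the
    satellite anomalies; a slope near 1.3 would mean GISS swings
    1.3X when the satellite record swings X."""
    g = np.asarray(giss, dtype=float)
    s = np.asarray(satellite, dtype=float)
    return np.polyfit(s, g, 1)[0]

# Synthetic anomalies constructed with an exact 1.3 ratio
sat = np.array([0.0, 0.2, -0.1, 0.3, 0.1])
print(round(swing_ratio(1.3 * sat, sat), 2))  # 1.3
```

Fitting the pre-2000 and post-2000 years separately would test the observation that the pattern of exaggerated swings broke down in 2000.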

#65. David, that’s a nice plot. I mentioned that I thought that there was still some air in the GISS 2006 numbers and this is another indication. There’s a major difference in data provenance in 2006: GISS only uses USHCN data up to March 2006 in their US numbers; they have a population of non-USHCN sites: primarily airports (which dominate the current data in all the indices). There is more up-to-date USHCN data available (up to late 2006), but, for some reason, GISS has not included it. My guess is that inclusion of the USHCN data will lower the final US number (based on this graphic).

Is there an elephant in this room? I find this discussion of statistical methodology fascinating, but somewhat reminiscent of rearranging the deck chairs on the Titanic. Pardon me, but I’m just an innocent interloper in this discussion. I undoubtedly don’t appreciate the finer points of statistical significance, yet with a few minutes of investigation, I’ve learned some things that are absolutely stunning, but have gone totally unmentioned in the blog dialogue, or the popular media.

(1) When the GISS talks about a “Surface Temperature Anomaly” they are not speaking about the surface of the earth, but rather about the tiny bottom layer of the air. But the air moves up and down in the atmosphere, so how can you estimate the heat content of the atmosphere by only looking at the bottom 6 feet?

(2) “Anomaly” means relative to the average temperature from 1951 to 1980 (a period of relative stability). That means that the 0.6 C temperature rise has occurred in the past 25 years — since 1980, not since 1880, which is how the popular press reports it. They talk about temperatures rising over the past century and a half. What this data shows is far more drastic than anything they report.

(3) As startling as that revelation was to me, I saw something even more stunning when I looked at the GISS 2005 Summation. The global temperature rise is not evenly spread over the planet. It is highly concentrated in the North polar region, where it is actually 5 times greater than the average global rise — 2.5 to 3 C in the last 25 years! That observation should be hugely distressing but it is never the focus of the popular press coverage. They are much more concerned with the conflict and controversy over whether 1934 was a few tenths of a degree warmer than 2005, not whether the polar region has warmed 5.4 degrees F in the past 25 years.

I say the warming of the Arctic should be hugely distressing because that is the most dangerous region on the planet to experience a temperature rise. It is the one region where a temperature rise will create extreme and irreversible positive feedback. Wherever the North polar ice cover is removed to expose the ocean and the tundra, the sun’s energy will be absorbed at 10 times the rate, and the thawing tundra will release methane with 20 times the GHG effect of CO2. Any temperature rise in the Arctic will be amplified, accelerate exponentially, and become irreversible. In the face of this looming planetary catastrophe, what does it matter if the U.S. lower 48 was a fraction of a degree hotter in 1934 or 2005?

(4) I suppose that temperature records are the only data we have that allow us to look back a century or more, but it seems to me that total atmospheric heat energy is what we’re really after, not temperature. That would require us to also know the density and moisture content of the air for every temperature measurement. It would also require us to estimate the total air mass assumed to be associated with each discrete point and time where we have a temperature reading. Can someone please explain to me how that could possibly be calculated in a moving dynamic atmosphere, with a set of data points that must have changed in both quantity and quality over the past 125 years?

Perhaps we should be less concerned with identifying trends over past centuries, where we are bound to have all kinds of data problems. We could focus more of our concern on the last 25 years, where we have identified a real and dangerous problem, and where our data is likely to be much more complete, consistent, and reliable. However, to get any media attention there has to be controversy and conflict, so perhaps Stephen M could highlight how correcting Hansen’s Y2K error has reduced the 25-year warming of the Arctic from 5.4 degrees (F) to 5.35 degrees (just guessing)– if, indeed, the correction of post-2000 mainland U.S. temperatures has any effect at all on the Arctic measurements. (I suggest using Fahrenheit because the general American audience of the popular media can understand a 25-year warming of 5 degrees F much better than a 125-year warming of 0.6 degrees C, which is what we’ve been told so far.)

#73
Near-surface anomalies were chosen as a proxy for global temperature. The methodology of this proxy is being challenged. The basic argument concerns what causes phenomena such as the melting Arctic areas, etc., rather than whether they are occurring. Since about 1990, most of the emphasis and claims have been that the temperature rise is due not just to man, but in particular to manmade emissions, especially CO2. Yes, it is atmospheric heat, as you call it; however, the temperature anomaly is used as a proxy for this. The real issue concerns whether man is the cause or not. Many want to do something about climate change or at least prevent an assumed climatic catastrophe. But to do something usually means you have to know the cause of the problem or what a cure consists of. This is the nature of our discussion on CA. Others see those who question as “denialists”; they claim we must do something now, or we kill Terra. Many who post here have problems with the assumptions that have to be accepted for such a position, such as that it is definitely manmade CO2. Of course many who post here are interested in showing us the error of our thoughts. Your comment:

Perhaps we should be less concerned with identifying trends over past centuries, where we are bound to have all kinds of data problems. We could focus more of our concern on the last 25 years, where we have identified a real and dangerous problem, and where our data is likely to be much more complete, consistent, and reliable.

As I stated above, hard to do as you suggest unless you know the cause or have a cure. Both of which are being challenged.

The article linked here and coauthored by Pielke, Sr. expresses in its conclusion what I have been repeatedly attempting to say about the importance of quality control problems exposed here at CA and SurfaceStations.

CONCLUSIONS. As Davey and Pielke (2005) documented and Peterson (2006) acknowledges, several USHCN stations are poorly sited or have siting conditions that change over time. These deficiencies in the observations should be rectified at the source, that is, by correcting the location and then ensuring high-quality data that are locally and, in aggregate, regionally representative. Station micrometeorology produces complex effects on surface temperatures, however, and, as we show in this paper, attempting to correct the errors with existing adjustment methods artificially forces toward regional representativeness and cannot be expected to recover all of the trend information that would have been obtained locally from a well-sited station.

The comparison of the reanalysis with the unadjusted and adjusted station data indicates that the reanalysis can be used to detect the inhomogeneity of individual station observations resulting from nonclimatic biases. In general, the adjustments indeed correct a large portion of nonclimatic biases in these poorly sited stations as far as the difference between the NARR/NNR and station data is concerned. The NNR yields a relatively uniform and statistically significant trend in this region, which is statistically similar to two of the four station trends. However, we found that there are some inconsistencies in the trends of the adjusted data. Among the four stations that have been subjected to adjustments, only the adjusted trend at Lamar is consistent with the NNR trend (being statistically similar). The other three adjustments either make the consistent trend (Cheyenne Wells) statistically inconsistent, produce a statistically significant larger trend than for the surrounding stations (Las Animas), or cause little change in the trend (Eads). This leads us to conclude that, whereas the adjustments do improve the consistency among the nearby station data and reduce the differences with respect to the reanalysis at the monthly and yearly scales, the trends of the adjusted data are often inconsistent among closely located stations.
Petersons approach and conclusions, therefore, provide a false sense of confidence with these data for temperature change studies by seeming to indicate that the errors can be corrected. For instance, the dependence of the corrections on other information (such as regional station moves, which in itself has been found on occasion to be inaccurate) can be considered an indication of the uncertainty and limitations of the corrective approach that is being sought. As a requirement, the statistical uncertainty associated with the effect of the adjustments on the regional temperature record needs to be quantified and documented.
Temperature adjustments such as those resulting from change in instrumentation are, of course, necessary. However, the results shown in this paper demonstrate that the lack of correctly and consistently sited stations results in an inherent uncertainty in the datasets that should be addressed at the root, by documenting the micrometeorological deficiencies in the sites and adhering to sites that conform to standards such as the Global Climate Observing System (GCOS) Climate Monitoring Principles (online at http://gosic.org/GCOS/GCOS_climate_monitoring_principles.htm). A continued mode of corrections using approaches where statistical uncertainties are not quantified is not a scientifically sound methodology and should be avoided, considering the importance of such surface station data to a broad variety of climate applications as well as climate variability and change studies.

From the USHCN itself, we see from the excerpt below the special problem that undocumented changes (unknown noncompliances) can present to any adjustment scheme. It is interesting, however, that USHCN and NOAA documents and scientists, like the poster here at CA, Lee, make numerous references to quality control as the process for (attempted) extraction of valid data from poorly obtained collections. They spell out what they see as an optimum site for collecting temperature measurements, but on further reading it is clear that a true proactive quality control process is not in place. Below also is the process used for detecting and adjusting for undocumented changes. I would dearly like to find a statistically versed person who could determine, even if only qualitatively, the assumptions that are made about the compliance of stations in general in using the adjustment processes.

The potential for undocumented discontinuities adds a layer of complexity to homogeneity testing. Tests for undocumented changepoints, for example, require different sets of test-statistic percentiles than those used in analogous tests for documented discontinuities (Lund and Reeves, 2002). For this reason, tests for undocumented changepoints are inherently less sensitive than their counterparts used when changes are documented. Tests for documented changes should, therefore, also be conducted where possible to maximize the power of detection for all artificial discontinuities. In addition, since undocumented changepoints can occur in all series, accurate attribution of any particular discontinuity between two climate series is more challenging (Menne and Williams, 2005).
The USHCN Version 2 homogenization algorithm addresses these and other issues according to the following steps. At present, only temperature series are evaluated for artificial changepoints.
1. First, a series of monthly temperature differences is formed between numerous pairs of station series in a region. The difference series are calculated between each target station series and a number (up to 40) of highly correlated series from nearby stations. In effect, a matrix of difference series is formed for a large fraction of all possible combinations of station series pairs in each localized region. The station pool for this pairwise comparison of series includes U.S. HCN stations as well as other U.S. Cooperative Observer Network stations.
2. Tests for undocumented changepoints are then applied to each paired difference series. A hierarchy of changepoint models is used to distinguish whether the changepoint appears to be a change in mean with no trend (Alexandersson and Moberg, 1997), a change in mean within a general trend (Wang, 2003), or a change in mean coincident with a change in trend (Lund and Reeves, 2002). Since all difference series are comprised of values from two series, a changepoint date in any one difference series is temporarily attributed to both station series used to calculate the differences. The result is a matrix of potential changepoint dates for each station series.
3. The full matrix of changepoint dates is then “unconfounded” by identifying the series common to multiple paired-difference series that have the same changepoint date. Since each series is paired with a unique set of neighboring series, it is possible to determine whether more than one nearby series share the same changepoint date.
4. The magnitude of each relative changepoint is calculated using the most appropriate two-phase regression model (e.g., a jump in mean with no trend in the series, a jump in mean within a general linear trend, etc.). This magnitude is used to estimate the “window of uncertainty” for each changepoint date, since the most probable date of an undocumented changepoint is subject to some sampling uncertainty, the magnitude of which is a function of the size of the changepoint. Any cluster of undocumented changepoint dates that falls within overlapping windows of uncertainty is conflated to a single changepoint date according to
(a) a known change date as documented in the target station’s history archive (meaning the discontinuity does not appear to be undocumented), or
(b) the most common undocumented changepoint date within the uncertainty window (meaning the discontinuity appears to be truly undocumented).
5. Finally, multiple pairwise estimates of relative step change magnitude are re-calculated at all documented and undocumented discontinuities attributed to the target series. The range of the pairwise estimates for each target step change is used to calculate confidence limits for the magnitude of the discontinuity. Adjustments are made to the target series using the estimates for each discontinuity.
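As a rough illustration of step 2 above, a single changepoint in a target-minus-neighbor difference series can be located by an exhaustive search for the split that maximizes the mean shift. This Python sketch is a crude stand-in for the formal tests cited (Alexandersson and Moberg; Wang; Lund and Reeves), using synthetic data:

```python
import numpy as np

def best_changepoint(diff_series, min_seg=2):
    """Search a paired difference series for the split index that
    maximizes the absolute shift in segment means. Returns the
    index of the first post-change value and the shift size."""
    d = np.asarray(diff_series, dtype=float)
    best_k, best_shift = None, 0.0
    for k in range(min_seg, len(d) - min_seg + 1):
        shift = d[k:].mean() - d[:k].mean()
        if abs(shift) > abs(best_shift):
            best_k, best_shift = k, shift
    return best_k, best_shift

# Target station jumps by ~1 deg C at index 4; neighbor is stable
target = np.array([0.0, 0.1, -0.1, 0.0, 1.0, 1.1, 0.9, 1.0])
neighbor = np.zeros(8)
k, shift = best_changepoint(target - neighbor)
print(k, round(shift, 2))  # 4 1.0
```

In the USHCN scheme, such a detection would be repeated against up to 40 correlated neighbors, and only dates shared across multiple pairings would be attributed to the target station.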

I have read this 6 times now. I cannot see that it would do anything other than detect large errors. A small trend would not necessarily be corrected; in fact, it would be quite unlikely to be. Looking at the way that they set up the matrix, it would appear that a rural-dominated matrix would adjust urban down and, to a lesser degree, urban up. In an urban-dominated matrix, it would have the effect of adjusting the rural up with a small decrease in the urban. The two statements are based on the assumption of urban heat islands. Thus I would assume the reason UHI was not detected using this scheme is that it has been included as part of the normalization and homogenization steps.

This stuff is practically Greek to me, but thank you Steve for finding the problem. I am currently experiencing a heated argument with another poster on another website concerning Global warming and the validity of the IPCC report and its followers and their agendas, etc. Whilst I am a mere average citizen, he is a reporter for a newspaper in Canada and his ego knows no bounds. I wish some of you smart guys could come and put him to sleep. It would help. Thanks for the interesting comments.
