A Second Look at USHCN Classification

Yesterday, I posted up a first look at differences between station histories classified as CRN=1 (good) versus CRN=5 (bad) – a simple comparison of averages, noting that other factors may well enter into the comparison.

A couple of other points that I’ve made consistently as we look at these results which I’d like people to keep in mind:

(1) the elephant in the room in these station studies is the difference in trends between the US history with 1930s at levels more or less similar to the 2000s and the ROW with a pronounced trend (where’s Waldo?);

(2) the US network has a large representation of rural stations which have records stretching back to the 1930s, a different situation than for the ROW;

(3) NOAA and NASA have quite different procedures, with NOAA showing a much more pronounced trend than NASA in the US 48;

(4) whatever the warts on the NASA methodology, they at least make a more concerted effort to adjust for urbanization in their U.S. network (relative to NOAA) and we need to keep both networks in mind. In particular, NASA already has a high 1934 relative to 1998, especially as compared to NOAA.

While NASA has been taking the brunt of recent criticism, it is actually NOAA rather than NASA that has made highly publicized announcements about 2006 being the “warmest year” and we need to keep this in mind as our understanding of these methods and data improves.

First, reviewing the bidding: here is a simple comparison of the averages of the CRN1(good) and CRN5(worst) USHCN stations – a first cut making no attempt to disaggregate regionally or to check on ASOS instrumentation or things like that. It shows a noticeable difference between CRN1 and CRN5 results.

John V has carried out some useful analyses of the data, noting that the regional distribution of CRN1 and CRN5 stations was not homogeneous: CRN1 stations turn out to be skewed to the east, especially the southeast, while CRN5 stations are skewed to the west. I haven’t verified this point, but it seems plausible.

The median longitude in the USHCN network is 95W. As a coarse cross-check, I split the stations into groups east and west of 95W and compared CRN1 to CRN5 stations and secondly CRN1,2 to CRN5 stations. Doubtless many other variations and crosscuts can and will be identified, but this seemed like a pretty simple first check on regional issues.
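For readers following along at home, the split-and-average described above can be sketched in a few lines. This is a minimal version, assuming hypothetical tables: a `stations` frame with `lon` (degrees east, so 95W is -95) and `crn` columns, and a companion `anoms` frame of annual anomalies indexed by station id — not the actual file layout used here.

```python
# Sketch of the east/west comparison: split stations at 95W, then
# average anomalies over the stations in each CRN class.
import pandas as pd

def group_mean(stations, anoms, crn_classes, east=True):
    """Average anomalies (per year) over stations of the given CRN
    classes, east or west of 95W. Column names are hypothetical."""
    side = stations['lon'] > -95 if east else stations['lon'] <= -95
    ids = stations.index[side & stations['crn'].isin(crn_classes)]
    return anoms.loc[ids].mean()  # simple mean across stations, per year
```

A comparison like the one above would then be, e.g., `group_mean(stations, anoms, [1]) - group_mean(stations, anoms, [5])`.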

East of 95W
The two graphics below compare (first) CRN1 to CRN5; (second) CRN1,2 to CRN5 for stations east of 95W. As you see, there is a strong increase of CRN5 relative to either CRN1 or CRN1,2 stations (over 0.4 deg C). (There are a number of ASOS stations in the CRN1 network.) Another thing to keep in mind is that the surfacestations.org quality classification does not coincide with the GISS lit-unlit classification. Of the 27 CRN1,2 stations in this group, only two were lights=0 and only nine were dim/dark; 18 were classified as bright.

Nonetheless, there appears to be a difference between CRN1,2 and CRN5 stations in this eastern group. In fact, as seen below (and somewhat surprisingly), the difference is greater in the eastern stations than the western stations, an issue that I’ll return to below.

West of 95W

Here’s a similar calculation for stations west of 95W. There is surprisingly little difference between CRN1,2 sites and CRN5 sites. Again the QC standards somewhat crosscut the urbanization standards, with some urban sites in the CRN1,2 classification (San Antonio WSFO, Berkeley). There’s not much trend over the full record, but there is a pronounced difference in the CRN1 records between levels in the 1930s and 2000s. There are not very many CRN1 stations in this grouping, which may affect things – but we’re also told by Gavin Schmidt and others that a relatively small network of good stations should suffice for a global network, and the number of CRN1 stations in the west would be sufficient by these standards, even without the CRN2 stations. (And one would need to ascertain whether the CRN1-CRN2 differential is regional-climatic or quality-related as well.)

The results, at a first pass, are opposite to a number of expectations. Eli Rabett, in one of his many sniggers against the mere idea of checking station quality, hypothesized that for every station in the west failing QC due to warming asphalt, there was an offsetting station in the east failing QC due to cooling tree growth (the Halpern Hypothesis of Offsetting QC Failures). Yet here we seem to have a greater difference between CRN1 and CRN5 sites in the east, where vegetation growth is an issue, relative to the west, where asphalt is more of an issue.

Secondly, as noted above, the QC classification crosscuts the traditional UHI issue (the nocturnal inversion caused by an urban setting, distinct from microsite issues). In this first pass analysis, there has been no attempt to cross-stratify these issues and that’s definitely something that should be done.

NOAA and NASA
Regardless of the above, the difference between the NASA temperature history for the U.S. (with its relatively warm 1930s) and the CRN1,2 average (here the east and west station averages are computed first and then averaged) does not show a marked trend, a point noted by John V. Actually, it’s somewhat downward. (Below: dotted – post-Y2K NASA version; solid – pre-Y2K):

However, the situation is quite different with NOAA, as shown in the next graphic, which shows the difference between the NOAA and NASA temperature histories for the U.S. (NOAA taken from http://www1.ncdc.noaa.gov/pub/data/cirs/drd964x.tmpst.txt). As you see, the NOAA trend relative to NASA is about 0.33 deg C per century since 1941. [Note: this is a revised version of an earlier graphic.]
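A relative trend of this kind is just the OLS slope of the difference between two annual series, scaled to a century. A sketch (the inputs here are stand-ins, not the actual NOAA or NASA data):

```python
# Fit a straight line to the difference of two annual series and
# express the slope in degrees per century.
import numpy as np

def relative_trend_per_century(years, series_a, series_b):
    """Trend of (series_a - series_b), in degrees per century."""
    diff = np.asarray(series_a) - np.asarray(series_b)
    slope_per_year = np.polyfit(years, diff, 1)[0]
    return 100.0 * slope_per_year
```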

Obviously, the NOAA trend relative to CRN1,2 stations is going to be over 0.6 deg C per century. As you’ll recall, it’s actually been NOAA that’s made a point of issuing press releases about 2006 being the “warmest year”, rather than NASA, although NASA’s been taking the brunt of recent criticism.

The profound differences between NOAA and NASA results obviously point to substantial differences in their adjustment methods. While we’re gradually pinning down what NASA did, the process of disentangling NOAA results hasn’t really begun.

While I’ve been critical of NASA (and plan to make further criticisms of the procedures involved in their September adjustments), I’ve noted at all times that the U.S. is unique in having a large population of rural sites reaching back to the 1930s and that NASA has at least attempted to adjust for urbanization. Based on regional disaggregation – an approach that I endorse – John V suggested that the relative similarity of the NASA and CRN1,2 histories was a vindication of NASA methodology, a point that a reader in a comment below asked me to note here. However, John V failed to observe that NASA uses a different methodology outside the U.S. than within it, and that the rural content of ROW networks is completely different from that of the U.S.; thus, using his approach, he could not argue that the NASA methodology for the ROW was vindicated, as he suggested.

I’ve noted the worry that the first cut of Anthony Watts’ QC ratings includes a lot of stations classified by NASA as being in “unlit” areas. This strongly suggests the need for a further cross-cut of the analysis, which will take a bit of time. The TOBS adjustment also needs to be looked at.

On the other hand, my guess is that such a cross-cut won’t change the similarity between CRN1,2 and NASA’s U.S. temperature history very much. I agree that this may well end up supporting (and perhaps even “vindicating”) the NASA analysis method for the U.S., which after all, resulted in the conclusion that 1934 was the warmest year. If it turns out that:

(1) ROW countries have a similar framework of rural “unlit” sites with records stretching back to the 1930s and continuing up to the present;
(2) NASA coerces the trends at urban stations in the ROW to these rural “unlit” stations

then one might also extrapolate that the methods applied to the U.S. might work reasonably on those countries. Turning John V’s point against him somewhat, I think that the analysis might even be held to demonstrate the necessity for an analysis of the type carried out in the U.S. by NASA.

The first casualty of such a process is obviously NOAA – whose results are inconsistent with NASA’s results. The greater similarity of the NASA temperature history with the CRN1,2 stations shows that the choice between NASA and NOAA histories is not completely arbitrary, but that, in this case, the NASA history for the U.S. looks more reasonable.

The evidence from our quick reconnaissance of the ROW to date suggests that NASA does not meet the above standards in how it handles the ROW, on a number of counts. First, instead of using one integrated record at each station (as with the USHCN stations), NASA has a perverse splicing of station records, introducing a potential bias at every ROW station in which MCDW data is spliced to historical data, the effects of which have not been evaluated. Second, we’ve seen little evidence (where’s Waldo?) of a framework of long rural records: indeed, the evidence from Antarctica, South America, Africa and India is that either no such records exist or that they don’t show any material trend. The “Bias Method” used outside the U.S. has very different statistical properties. To the extent that the NASA approach in the U.S. has been vindicated – an approach that, once again, showed that 1934 was the warmest year on record – this merely highlights that the approach is not applied in the ROW. One is left to speculate whether the difference in the ROW results from the failure to apply the “vindicated” method there – perhaps it’s impossible to do so – or whether the U.S. has simply had a different climate history than the ROW, one in which, for some peculiar reason, present U.S. temperatures are not much different from the 1930s, while ROW temperatures have increased noticeably. (If U.S. temperatures diverge from world history during this observed period, one may then plausibly wonder why U.S. bristlecone growth should be held to have magic qualities for detecting world temperature.)

RE3 Steve: I’ve been aware of the ASOS issue for some time; Doug Hoyt originally brought it to my attention. Then there is the HO83 hygrothermometer used at these ASOS stations, with all of its failures to deal with.

By CRN standards, which are distance-based, many ASOS stations are 1’s and 2’s, and with the 0.25°C airport bias study they still fall in that category, since CRN1 is defined as less than a 1°C bias.

Could someone please indulge a newbie who’s still trying to get up to speed on the acronyms and tech-talk. I can’t find either of these in the “common acronyms” page:

ASOS
ROW (This one’s gonna be hard to re-learn; I’ve been doing planning actions for too long, and it just screams Right Of Way).

Slightly off-topic, but something just reminded me so I will ask while I’m thinking of it: Why do Climate Scientists use 30-year periods as the “trend setters”? It seems somewhat arbitrary; my seat-of-the-pants sense is that you’d need at least a century to establish a clear trend.

ROW is Rest Of the World (which we may be yielding to). TOB is Time of OBservation, used in adjusting temperature data (why? no clue). ASOS is the Automated Surface Observing System (a type of weather station). There are several automated systems and all have issues.

For a while google will be your best bud, but I am sure someone will get around to revising the glossary.

The 30-year ‘normal’, as it is called, was introduced as a statistically significant (n=30 of population N) period of record by the World Meteorological Organization. The original reason was a need by researchers reconstructing historic temperatures for a comparison period in the modern record. The first period used was 1931 to 1960. The ‘normal’ period was repeatedly changed because it was claimed a new and more complete record was available each time. The current ‘normal’ is 1971 to 2000. The problem with this is that there are fewer weather stations now than in 1960, i.e. prior to this 30-year normal.
A second problem developed because people began assuming the 30-year normal represented the entire record. Media would report the temperature for a day as above or below normal without realizing this meant only above or below the average for the 30-year period, not for the entire record. The public certainly didn’t know. I have written about the problems with this ‘normal’ period over the years, but it continues.

The current site rating is not a constant. In addition, any change in location (microclimate) may induce a temperature shift. All sorts of location shifts have occurred at most stations. The usefulness of a given station’s data for climate studies may be negated by a relocation. Each station and its data should be vetted for microclimate changes (location changes, equipment changes, observer changes, vegetation changes, etc.). The station ratings do not seem to take the station history into account. Perhaps a more sophisticated analysis than the HCN rating is warranted for climate change auditing.

#4 PaddikJ:
The 30-year reference period is applied here simply as a baseline: all temperatures are calculated as deviations from the reference period. It has *no* effect on trends.
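This point is easy to verify numerically: subtracting a reference-period mean shifts every value in the series by the same constant, so the fitted slope is untouched. A synthetic check (the series below is made up purely for illustration):

```python
# Demonstrate that expressing temperatures as anomalies from a 30-year
# reference period has no effect on the fitted trend.
import numpy as np

years = np.arange(1900, 2001)
temps = 10.0 + 0.006 * (years - 1900)           # synthetic series, 0.6 C/century
baseline = temps[(years >= 1971) & (years <= 2000)].mean()
anoms = temps - baseline                         # deviations from the 1971-2000 'normal'

slope_raw = np.polyfit(years, temps, 1)[0]
slope_anom = np.polyfit(years, anoms, 1)[0]
# the two slopes agree to numerical precision
```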

While a number of people have pointed out how useful a blog is, a number of others are after a periodic summary. Wikis aren’t always the best option, but this may be one case where one would work: a locked-down version so people can submit and update summaries as needed, or even just a select number of people collecting and collating the various pieces of information.

This would also cover the point in post #7 about needing a glossary to cover acronyms commonly used on this site.

Re:#10
Yes, exactly. The surfacestations.org work has naturally focused on getting current site location info, but I see the next step as evaluating site changes during the duration of the temp record. For a number of sites, this should be fairly straightforward; for example, I’ve recently been surveying some dam-related sites where the only significant change during the past 50-75 years (other than during major construction work) has likely been the switch from manual max/min thermometers to the MMTS system. For others, it’s possible but more time-consuming, for example the U of Arizona Tucson site featured in some other threads. I’m hoping that using historical land-use data may help develop a procedure that can become automated to some degree.

The pattern break around 1950 can be from human error, from instrumental differences, from microsite changes or from actual weather changes. Plus a few more I might not have covered. Because it is so strong a break, it has to have affected many stations at about the same time, or have been subject to the same adjustments, if or where made.

Wondering aloud, how much post-recording normalisation has taken place with other centres of excellence? Before the late 1980s-early 1990s publications of Jones et al, for example, I wonder if there were some get-togethers to ensure that various centres were not reporting at odds with each other. Anyone have any records or supportable anecdotes?

I think these plots speak for themselves, but here are my conclusions:
– There is good agreement between GISS and CRN12 (the good stations)
– There is good agreement between GISS and CRN5 (the bad stations)
– On the 20yr trend, CRN12 shows a larger warming trend than CRN5 in recent years

To be honest, this is starting to look like a great validation of GISTEMP.
The next step is probably to look at the subset of rural CRN12 stations. Can anybody get me a list of these?

Steve: In my comment, I explicitly referred to John V’s comments as follows:

the difference between the NASA temperature history for the U.S. (with its relatively warm 1930s) and the CRN1,2 average (here the east and west station averages are computed first and then averaged) does not show a marked trend, a point noted by John V.

NASA uses different methodology for the U.S. and outside the U.S., and the temperature histories in the U.S. and ROW are very different. So I do not agree that this proves anything about the ROW (where’s Waldo?), but I’ll amend the closing paragraph to refer more explicitly to John V’s comment.

I am kind of OK with current NASA continental US temperatures (similar warmth in the 1930s and 1990s with corresponding 320 and 390 ppm CO2, and with comparable speed of warming in the first and last quarters of the 20th century), but I am kinda curious to compare the PDO index graph (thanks to Phil’s comment #76 in the Bear Market thread) with the most reliable and cross-verified temperature data for the continental US. Just for fun.

Reading John V’s excellent analysis, I just wanted to congratulate Hansen. But Steve M’s new post sheds new light on the issue and makes things look much less convincing.

If CRN1 and 2 stations in the East show more than 0.42 deg K LESS warming than CRN 5, and there is virtually no difference in the West, then something is probably wrong with the Western or Eastern data. It doesn’t look very plausible to expect such a large difference between the high quality stations in the West and in the East.

My first guess is that most of the CRN1 and 2 stations in the East are probably rural, while most or many of the Western CRN 1 and 2 are urban. General UHI is a much stronger factor than the microsite rating of the station, which could explain such a beautiful match of the CRN 5 and CRN 1,2 trends in the West. But this is just a guess. By and large, it seems strange that high quality, rural stations in two parts of the USA show such completely divergent trends.

It would be interesting to plot together rural stations all over the USA (whatever their CRN rating) and see the trend. I recall one paper of Christy and Spencer, http://www.marshall.org/pdf/materials/415.pdf, where they showed some 6 or 7 stations in the High Sierra in California that show almost no warming trend in the last 30 years. Those are rural stations, of course. If we plot them with some 10 additional ones in the West, together with CRN 1 and 2 in the East, what will be the resulting trend?

19 Sod,
it is not true at all that the GISS trend is consistent with satellite and weather balloon data. On the contrary, the GISS decadal trend for the 1979-2006 period is 0.26 degrees K, while satellite/weather balloon data show a trend precisely half as large, 0.12-0.13 degrees K. I suppose if we eliminate bad stations and UHI from the surface database, the trend will be similar to that of the satellites and weather balloons.

24
This blog summary you linked is somewhat stupid. There is no hockey stick at all in Steve and Anthony’s analysis (just as in Hansen’s data this week). On the contrary, values from the 1930s are higher than those at present.

I live in the “Maritime” Pacific Northwest and while I can’t use personal experience for the 1930s, I can from the 1950s on. The ‘around 1950’ shift referred to above brings two things to mind:

– I don’t know about the east half of the country, but a very large number of current stations in my part of the west were established in 1948. I use the huge database at the Western Regional Climate Center for “layman’s” climate records: http://www.wrcc.dri.edu/Climsum.html

– I also recall the weather here in January 1949 & 1950 had monthly average temps 10-12F below the 58-year means. They were 27.9 and 25.8F, and they are the only months with an average below freezing in the station records.

Could all the new stations or the ‘extreme’ cold weather be a reason for the adjustments?

Big City nails it. When are they going to convene the new Nuremberg trials for these deniers?

A lot of effort to show that James Hansen is spot on across the board. This post reads like a strained apology; no no its not NASA, but NOAA. Right. How about, “Sorry for trying to risk the future habitability of our planet”?

Keep your eye on the pea under the thimble dear readers, because these jokers are not even jesters. I expect we’ll hear less and less from these clowns now that they have conclusively vindicated Hansen and Gore. Thank you Mr. John V.

John V used USHCN v2 for CRN 1,2 and 5 after adjustments. Hansen also uses USHCN v2 except he adds stations. All John V did was to take a small sample of Hansen’s data and compare it to the total. Of course there will be no or little difference in the final analysis.

I’d continue to be cautious about any analysis until there are enough sites to provide good geographical distributions. In the western half, for example, there is a preponderance (70%) of class 5 sites in the Pacific states while the Pacific states have but 40% of the class 1/2 sites.

Beyond that, I continue to wonder about the impact of vegetation encroachment, which may be more of an issue in the east than in the west.

While the intuitive thought is that vegetation cools, it may be more accurate to say that vegetation moderates (lower highs and higher lows). I cannot say whether the net effect is cooling or warming.

Vegetation (trees and shrubs) encroachment may change the area’s albedo, sunlight, outgoing IR, air mixing and ground cover while also providing some cooling via transpiration.

While concrete and A/Cs are photogenic, it may be that a bigger problem is that slowly-growing tree.

Anthony I’ll do some reading and see if there’s some basis for developing a scale to evaluate sites according to nearby vegetation.

1. It might be instructive to rank the 5s as “shaded 5”, “heating surface 5” or “both”.

2. Not all 5s have been 5s forever – Lake Spaulding, for example, and probably Happy Camp – so it might be instructive to have a date for the last station move.

3. The best place to start on understanding vegetation and the climate near the ground is the classic Geiger, “Climate Near the Ground”. Google-Books it or buy it.

Looks to me like TOB might best be read as Time of Observation *Bias*, but maybe the OBS in TOBS is just OBServation. Or is the ‘S’ separate? Either way, TOB/TOBS refers to the potential for bias, or to correcting for it.

Steve M has a key point here: the real issue at stake is not the USA. The USA record already shows higher temperatures in the 1930s than now; the real problem is the ROW. GISS, just like HadCRU, shows a completely different global picture than for the USA, with much more warming in the last 30 years than in the first part of the century, while records from the Arctic, South America and Africa, and most probably Antarctica, show exactly the opposite – they are similar to the USA. As Steve previously analyzed here, much of the data that show fast warming are from the least reliable networks, the Chinese and Russian ones.

All that further analysis of UHI and other issues in the USA data can show, and probably will show, is that the difference between the 1930s and now is somewhat larger than GISS actually reports; it cannot change the overall pattern.

The real and biggest problem for both Hansen and Jones is to explain how they calculated “unprecedented warming” in the last 30 years while the data from both Americas, Africa and the Arctic at least, and probably from Antarctica, show no warming trend in the last 70 years.

SteveMc:
I am looking back at my posts from late Friday night. When I stated:
“To be honest, this is starting to look like a great validation of GISTEMP”

I should have said:
“To be honest, this is starting to look like a great validation of GISTEMP for the lower 48”

That will teach me to post late at night after a marathon coding and analysis session. I realize that much more analysis needs to be done before we can independently say that GISTEMP is valid for the whole world. I hope my program will be useful in that analysis.

One quick point regarding 1934 vs 1998 in the USA. It is often argued here that the error bounds are necessarily large for the temperature trend going back to the 1930s. A temperature difference of hundredths of a degree between 1934 and 1998 is therefore completely irrelevant. It would be much more accurate to say that 1934 and 1998 are a statistical tie.

Using the much more useful 5-year average, the late 1990s are clearly warmer than the mid 1930s (by ~0.2C).
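A centered 5-year average of the kind mentioned here can be sketched as a plain moving average (no claim that this is exactly the smoothing used in the charts):

```python
# Centered 5-year moving average of an annual anomaly series.
import numpy as np

def five_year_mean(values):
    """Return the 5-year centered moving average (valid region only:
    the output is 4 elements shorter than the input)."""
    kernel = np.ones(5) / 5.0
    return np.convolve(values, kernel, mode='valid')
```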

I might add – one more time – I personally think that modern temperatures are definitely higher than 19th century temperatures and it wouldn’t surprise me if they were higher than temperatures in the 1930s.

Observing that, for example, there are no records from Antarctica from the 1930s, or (perhaps) no long valid rural records from South America or Africa, doesn’t prove that the 1930s in these places were warmer than the 2000s. It simply means that the station histories are not necessarily evidence on this matter.

If people want to “move on” to other records, e.g. Arctic sea ice, then one has to consider the quality of satellite information from the 1930s, which, as I understand it, is less complete than satellite information from the 1990s. With ice, one would also have to consider LIFO-FIFO inventory issues – maybe the response is to a prolonged warm period and doesn’t permit an estimate of decade-scale temperatures. But maybe it does. It’s not something that can be decided as a throwaway comment.

And as always – the purpose of this work is auditing and verification. In business audits, auditors very seldom find major problems with corporate accounts and don’t expect to. They certainly wouldn’t expect the proportion of problems that one encounters with the Team.

I should have said:
To be honest, this is starting to look like a great validation of GISTEMP for the lower 48

Fair enough. I wasn’t meaning to post this as a “gotcha” against your analysis – as I think that most of your individual points are valid. However, some readers (and even another blogsite) were extrapolating your analysis far beyond what you wrote. It’s a bit of an occupational hazard. One blog even claimed that your analysis vindicated Michael Mann’s hockey stick! (Although if Mann had said that 1934 was the “warmest year of the millennium”, he might not have got as much attention.)

I agree with your final sentence in the sense that it looks likely that 1934 is more isolated as a warm year in the 1930s as compared to recent warm temperatures in the U.S. even with good data. I’d like to see a crosscut with rural+CRN1 before making any conclusions, but that’s certainly what I’d guess based on the information in hand. However, if something like that applied in the ROW, it would certainly cause a major re-thinking. So one needs to understand where Waldo comes from in the ROW.

SteveMc, don’t get me wrong. I absolutely support what you are doing here. I am trying to contribute because you have found problems in the GISTEMP methods. Our level of confidence in the IPCC consensus may be different, but we both have the same goal of improving historical temperature trend estimates.

My viewing of the graphs shown in these threads sees larger differences as we go back in time. Going further back in time would also make the classifications fuzzier. Not knowing at what point the snapshot classifications yielding these results become nonbinding is, of course, the main weakness in attributing differences, or the lack of them, going back in time.

If, as one might expect, when we go back in time, the classifications merely yield random samples of measurements and these classes show increasing differences as we progress backwards then perhaps we are looking at some unaccounted uncertainties in past data.

Looking for differences from the snapshot classifications should perhaps be confined to the most recent years’ worth of data and carried out with paired class 1 and 5 stations. The variations as well as the mean differences are important in analyzing the uncertainties in the data.
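One simple way to set up the paired comparison suggested above is a nearest-neighbour match of each class 1 station to a class 5 station. A crude sketch, using flat-earth distances (adequate for ranking nearby stations; inputs are hypothetical coordinate arrays):

```python
# Match each CRN1 station to its nearest CRN5 station by crude
# squared lat/lon distance, so their recent differences can be compared.
import numpy as np

def nearest_pairs(lat1, lon1, lat5, lon5):
    """For each CRN1 station, return the index of the nearest CRN5
    station. Uses flat-earth distance, fine for coarse matching."""
    d2 = (lat1[:, None] - lat5[None, :])**2 + (lon1[:, None] - lon5[None, :])**2
    return d2.argmin(axis=1)
```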

Re: #35

If people want to move on to other records, e.g. Arctic sea ice, then one has to consider the quality of satellite information from the 1930s, which, as I understand it, is less complete than satellite information from the 1990s.

When Anthony posted The first photos of Marysville and Orland, I got interested.
I’ve been to both places. So I started to look at the data – not monthly stuff, I started with the daily records. I got the daily data from USHCN: http://cdiac.ornl.gov/epubs/ndp/ushcn/state_CA.html

I looked at Marysville back to 1900 or before, plus Willows, Colusa, Williams, Orland, Chico, Oroville and Yuba City. They all had good daily records going back to the early 20th century.

My analysis indicated clear differences in trend between Marysville and the other sites.

So let me get this straight. You took USHCN v2 unadjusted data (for CRN 1, 2 and 5), compared it to GISTEMP, and found good correlation? Well then that tells us that Hansen is not doing a very good job of adjusting his data, and many of the problems that are adjusted out of USHCN v2 are still in the GISTEMP adjusted data. Is this correct?

If people want to move on to other records, e.g. Arctic sea ice, then one has to consider the quality of satellite information from the 1930s, which, as I understand it, is less complete than satellite information from the 1990s. With ice, one would also have to consider LIFO-FIFO inventory issues – maybe the response is to a prolonged warm period and doesnt permit an estimate of decade-scale temperatures.

I hope this isn’t a snipable tangent, but 90% of the sea ice is under water. Because of that and heat transport considerations, sea ice is almost exclusively determined by water temperatures, and not air temperatures. And water temperatures are determined mostly by ocean currents, not by air temperatures. So sea ice extent doesn’t tell us anything about polar air temperatures.

If you look at the ocean temperature anomaly charts, the Arctic ocean is something like 5C higher than normal; far, far above the anomalies in any of the other oceans.

John V’s last chart (the one after he made the geographical adjustment) shows very clearly (5-year averages) that there are very large differences in 1900 – 1913 (CRN 2 is 0.6C higher) and 1925 – 1955 (at times CRN 2 is 0.3C higher). The local CRN 2 maximum in 1933 is 0.25C higher. It is also noteworthy that the 1933 peak is about 0.18C higher than the 2000 peak in CRN 1, a result which disappears when CRN 1 and 2 are collated.

Seeing as the focus of discussion is around this very result (is the peak-to-peak trend up, down or flat in the 20th century?), this difference would on the surface seem to be significant. But is it?

The differences between CRN 1 and CRN 2 suggest to me that even in the CRN2 category there are very serious biases for half the series (1900 – 1955) relative to CRN 1. The problem is, we don’t know whether this is because of sampling differences or whether these are microsite or other biases. If they are geographical sampling biases, they may disappear with a larger sample. Note the significant change in John V’s graphs when he shifted from a simple mean to a geographically weighted mean. To draw any conclusion at this point, therefore, is totally unwarranted.
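For what it’s worth, one common form of the geographic weighting referred to here is to average stations within grid cells first and then average the cell means, so that a dense cluster of stations in one region can’t dominate the national mean. A sketch with hypothetical column names (no claim that this reproduces John V’s exact weighting):

```python
# Gridded average: mean within each 5x5 degree cell, then mean of
# the occupied cells. Compare with df['anom'].mean(), the simple mean.
import pandas as pd

def gridded_mean(df, cell=5.0):
    """df has hypothetical columns 'lat', 'lon', 'anom'
    (one year's anomalies, one row per station)."""
    cells = df.assign(
        glat=df['lat'] // cell,   # cell index by floor division
        glon=df['lon'] // cell,
    )
    cell_means = cells.groupby(['glat', 'glon'])['anom'].mean()
    return cell_means.mean()      # unweighted average over occupied cells
```

(A fuller version would weight each cell by cos(latitude) to account for cell area; the unweighted form is enough to show why the simple mean and the gridded mean can differ.)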

I also think it is premature, given the geographical inhomogeneity of the sample, to undertake any analysis, even interim. There are bound to be significant differences in the results once the inhomogeneity disappears with a larger sample. Lots of people are watching this website, most uneducated in the scientific method and statistics, and many just looking for tidbits to feed their political audiences without any thought to the notion of science in process and the provisionality of results. All they care about is the 24-hour sound bite and short-term political advantage or media ratings. There is a significant risk of some media person running stories based on these early results, with very damaging impacts on climateaudit’s credibility when they turn out to be wrong. This is no longer a quasi-private conversation among a small group of people. A large chunk of the Netsphere is looking over your shoulder. This change of circumstance requires much more circumspection. Everything which you have worked so hard to achieve could very quickly unwind.

At this point I think only three conclusions are warranted. 1. NOAA has a very serious data reliability problem. 2. As CRN 1 sites are more reliable than CRN 2 and because the signal we are trying to detect is even smaller than the expected error band for CRN 1, to consider any other class of site is inappropriate for the purpose of estimating temperature until and unless local inspection and examination of station histories determines the sources of micro site bias and reliable adjustments can be made. 3. The US can no longer be regarded as having high quality surface temperature data.

I strongly recommend you call a halt to analysis until Anthony has a large enough sample for statistical reliability and that you make a statement on the site explaining why and asking people to refrain from publicizing the results so far, apart from the three conclusions I have listed.

#43. You’ve got a point. This site has functioned mostly as a type of dialogue, sort of an online seminar as opposed to publications. I pooled 1 and 2 so that my graphics matched John V’s – to the extent that differences arise between CRN1 and CRN2 sites, it needs to be determined to what extent these are quality and to what extent they are climatic.

To do a proper analysis of surface temperatures around the world is a huge undertaking, and not one that I can take on without doing it exclusively, and I’m not sure that I want to do that. At a certain point, I’m probably going to have to say, along the lines of what you say here: there are a bunch of problems here, but I’m going to take a six-month pause on them. On a personal basis, it might also be practical to get some funding for such a study, rather than just doing it gratis.

In publication terms, there’s a lot of proxy work that I haven’t finished and which I’d like to finish.

(Although if Mann had said that 1934 was the warmest year of the millennium, he might not have got as much attention.)

Actually, the MBH98 reconstruction says that 1944 was the warmest. However, the reconstruction ends at 1980. Easy solution: add red noise (p=0.3) to the AD1820 proxy set, up to 1993, as TPCs are available up to that year. This way I got 1990 as the warmest year in the period 1900..1993:
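For readers wondering what “add red noise (p=0.3)” means in practice: one common reading is AR(1) noise with lag-1 autocorrelation 0.3. A minimal sketch under that interpretation (my assumption, not the commenter’s actual code):

```python
import random

def red_noise(n, rho=0.3, sigma=1.0, seed=0):
    """AR(1) 'red' noise: x[t] = rho * x[t-1] + e[t].
    rho is interpreted here as the lag-1 autocorrelation (the p=0.3
    above) -- an assumption about what the commenter meant."""
    rng = random.Random(seed)
    # Scale innovations so the stationary series has variance sigma^2.
    innov_sd = sigma * (1.0 - rho * rho) ** 0.5
    x, out = 0.0, []
    for _ in range(n):
        x = rho * x + rng.gauss(0.0, innov_sd)
        out.append(x)
    return out

padding = red_noise(13)  # e.g. extend a 1980 endpoint out to 1993
```

Appending such noise to a reconstruction’s endpoint is exactly the kind of step that can move the “warmest year” around, which is the commenter’s point.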

RE43, 44 My goal with surfacestations.org was to create a useful metadata set with which to gauge the value of the actual data gathered at that location. Given the hard work of the volunteers so far, and on reaching 33%, I think that has been accomplished. What we have thus far are two different conclusions from that metadata.

But I agree with Steve, the undertaking of data analysis worldwide is huge. And, given the data-set substitutions NASA has done in the past few days, I have to wonder if their version of raw is really just that, or if it contains other adjustments. The problem is that since NASA GISS has been silent on the matter, putting changes online with no formal (or informal) announcement, it is difficult to know what exactly we are dealing with.

So what we need is an unimpeachable data set: one with known adjustments, provenance, and methodology with which to do surface temperature analysis. The NOAA/NCDC data sets come closer to that, I believe, than NASA GISS on the issue of disclosure alone. The fact that we get significantly different results between John V’s analysis with GISS and Steve McIntyre’s analysis with NOAA/NCDC data tells us something.

The biggest problem with doing any time-series curve fitting right now is that the station rating is for “now”. We don’t know what the site quality (and therefore the data quality) was in 1933, or 1950, or 1974. In some cases we may be able to reverse-engineer it and create a quality timeline to match the data timeline. To do that, we’d need to look at NCDC’s B44 forms that have been recorded over time, which have sketches and distances to objects of influence. Some historical USGS photography, such as has been pointed out by Leon Palmer, may also be helpful.

Of course disentangling all this will be time-consuming and difficult. It is unfortunate that the new CRN didn’t exist 30 years ago. I believe the new CRN network is the right solution to the error-band (noise) problem.

And comparison shows that in the NH there is rough agreement: HadCRU finds 0.22 deg C per decade for 1979-2006, while MSU finds 0.2 deg C (though if AGW dominantly influenced the climate we would expect a slightly higher, not lower, rate of temperature increase in the troposphere than at the surface). But look at the SH: the surface record shows 0.14 deg per decade while the satellites show merely 0.06, almost three times less! That indicates the surface record in the SH is almost certainly unreliable. To make it consistent with the satellite record it must be “cooled down” by almost a factor of three. If we suppose surface temperatures in the SH are similar to those measured by satellites, the global mean surface trend for the period 1979-2006 would no longer be 0.2 deg C, but merely 0.13 or 0.14 deg C.

If we really assume the greenhouse effect has something serious to do with recent warming, we would expect less warming at the surface overall than up in the atmosphere, that is to say, less than 0.13 deg C. Obviously there is a lot of sense in auditing the surface temperature record.
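The arithmetic behind the 0.13-0.14 figure can be checked with a back-of-envelope calculation, assuming the global trend is a simple equal-weight average of the two hemispheres (note the quoted figures average to 0.18 rather than 0.2, so treat this strictly as illustrative):

```python
# Trend figures quoted above, deg C per decade, 1979-2006.
nh_surface = 0.22     # HadCRU, NH surface
sh_surface = 0.14     # SH surface record
sh_satellite = 0.06   # SH satellite (MSU)

# Equal-hemisphere weighting is an assumption of this sketch.
global_reported = (nh_surface + sh_surface) / 2          # ~0.18
global_if_sh_like_msu = (nh_surface + sh_satellite) / 2  # ~0.14
```

Swapping the SH surface trend for the satellite figure drops the implied global trend by about 0.04 deg C per decade, in line with the commenter’s 0.13-0.14 estimate.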

#43.
I agree with many of your points (but not necessarily with your three conclusions). One of my reasons for posting early was that my results were significantly different than SteveMc’s and I felt a counter-balance was needed. (It’s important to note that SteveMc did provide a disclaimer that his results were preliminary and not geographically weighted. Unfortunately, not everybody reads all of the disclaimers).

I have lots of ideas to improve the accuracy of the results and our confidence in them. Before doing so I think it is important to validate my code. And before posting any results publicly, it would be beneficial for a small select group to analyze the procedure and results.

I plan to setup a new website for my analysis code with the following goals:
1. Make the code publicly available;
2. Provide a forum for reviewing and validating the code;
3. Involve some key members of this community as advisors and reviewers;
4. Involve others from outside this community (preferably from the pro-IPCC side);
5. Generate new results in which all can have confidence (or provide dissenting opinions);
6. Publicize the results (with the help of established sites like this one).

These are lofty goals. Without the help of ClimateAudit I will be working in a vacuum. (Many other sites on this topic are too biased to get much respect outside their readership).

#46 Anthony Watts:
I disagree that SteveMc and I get very different results.

I was remiss in excluding a CRN1-only analysis (vs CRN1/2), but our results are very similar looking at CRN1/2.
I would like to do a CRN1 vs CRN5 comparison using only stations from each set that have corresponding neighbours in the other set. Doing so would remove geographic bias (as both sets would have the same bias). The results could not be validly compared to GISTEMP, but they would provide a baseline comparison for CRN1 vs CRN5 station quality.

I just finished promising that I would not post any new results without validated code. I will honour that promise, but would like to complete such an analysis to be performed after the code is reviewed. (Perhaps as a first small exercise for the new website I described above).
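The matched-neighbour subsets described above could be built along these lines (a sketch, not John V’s code; the 200 km pairing radius is an arbitrary illustrative choice):

```python
import math

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points."""
    R = 6371.0
    la1, lo1, la2, lo2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((la2 - la1) / 2) ** 2
         + math.cos(la1) * math.cos(la2) * math.sin((lo2 - lo1) / 2) ** 2)
    return 2 * R * math.asin(math.sqrt(h))

def paired_subset(crn1, crn5, max_km=200.0):
    """Keep only CRN1 stations with a CRN5 neighbour within max_km
    (and vice versa), so both sets share the same geographic footprint
    and any regional bias cancels in the comparison."""
    keep1 = [s for s in crn1 if any(haversine_km(s, t) <= max_km for t in crn5)]
    keep5 = [t for t in crn5 if any(haversine_km(s, t) <= max_km for s in crn1)]
    return keep1, keep5

# Toy example: only the eastern CRN1 station has a nearby CRN5 partner.
crn1 = [(35.0, -80.0), (45.0, -120.0)]
crn5 = [(35.5, -80.5)]
```

As the comment notes, the resulting averages would not be comparable to GISTEMP, but they would isolate the CRN1-vs-CRN5 quality difference from geography.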

Just a word on the coarse-graining Steve M. refers to above. There is an enormous bifurcation in American settlement patterns west of 95°W (roughly the longitude of Kansas City). On one hand, there is the now 75-year-long emptying-out of the Great Plains, and hundreds of counties in the West still have fewer than 6 people per square mile, the density standard Frederick Jackson Turner used to declare the American Frontier closed in 1893. Many have fewer than 2 people per square mile. And that frontier is expanding. The 1980 Census showed 388 frontier counties west of the Mississippi, the 1990 Census showed 397, and the 2000 Census showed 402. Kansas actually has more land in frontier status than it did in 1890. On the other hand, about one out of every eight Americans now lives in California, which today has a population 25 times larger than in 1900. To get a seat-of-the-pants feeling for just how big that is, today’s peak power demand there is forecast to be 31,223 megawatts. It’s hard to believe differences of this magnitude don’t appreciably affect climate.

#50. Anthony, my take is that the comparison of CRN1 to CRN5 by region yields a different (and worthwhile) perspective than the first-cut overall comparison that I posted. This doesn’t mean that the first analysis was “wrong”, only that there are other factors in play. And there may be more yet.

People will spin these things left and right, so it’s important to keep a careful description of each calculation in view.

Prior to these studies, if readers of this site were forced to choose among the following methodologies: NASA, NOAA or CRU – for U.S. temperatures, they’d all probably have chosen NASA’s – only because it had the lowest increase and not for any objective reason. As a result of establishing certain USHCN stations as more likely to be “good quality” on metadata grounds, there is an objective reason for preferring the NASA version. In my opinion, IPCC should have endeavoured to do this sort of thing, as opposed to what is in effect an “ensemble” approach.

So I think that this is some immediate progress resulting from the surfacestations classification.

Secondly, there is a big difference between NOAA and NASA results, and the surfacestations results specifically call the NOAA results into question. My surmise is that the ROW results – both for NOAA and for NASA – are done with methods that are more like the NOAA method than the NASA U.S. method (which is not the same as the NASA ROW method in terms of rural station availability or criteria).

As for John V’s comments, I think that it’s more a matter of nuance and he’s been trying today to place a little more nuance on them, as he becomes more aware that people spin things a little more than he probably expected.

I wanted to recheck my Marysville versus Orland analysis. In the past I went to USHCN and downloaded daily data (Tmax and Tmin) for both sites, and I worked from there. I also checked agricultural sites that Hansen doesn’t check (looking at trends).

#35 ‘I might add – one more time – I personally think that modern temperatures are definitely higher than 19th century temperatures and it wouldn’t surprise me if they were higher than temperatures in the 1930s.’

Sure, but if we’re detecting AGW, we need to know if the 19th c. was warmer than the 18th. How do we do that?

Anthony and Steven have been looking at microstation classifications and their potential impact on readings, which is great for establishing the validity of the database. But if the UHI is a big player here, shouldn’t we be looking at sub-categories within the CRN1&2 and CRN5 classifications to determine if there has been UHI “creep” in the data? I’m thinking more along the lines of CRN1&2 locations at airports, suburbanization, yadda, yadda, that mask (or inflate) the true temperature record.

(My apologies to all who have already posted on this issue – I don’t claim it as being terribly original.)

One thing that has bothered me is this: Goddard (NASA) has control over a GCM, a global climate model, and also has control over temperature anomaly analysis. Same with Hadley.

The models “reference” these records. They don’t “use” them, as Gavin has pointed out, but they clearly “reference” them in their literature and validations.

It is not a sound practice to have a modelling group connected in any way with the reference data it is validating against or hindcasting against. The temptation to fiddle is formidable.

Reference data, like the global temperature record, needs to be open and independent.

John V took a step down that path. Stop bashing him.

Picture this: within a couple of days John V had figured out how to ingest the data, process it, get US coordinates, ingest Anthony’s data, and compare to Hansen. If his approach works, we don’t need GISTEMP anymore. John V will publish an open-source version that anyone can use/improve/critique. Reto Ruedy… polish that resume, dude.

The task that Hansen has worked on since 1987 was matched by a lone individual in a couple of days. One guy matches a 20-year effort by Hansen and NASA in two or three days. Give John V his programming props and make him Gavin’s boss.

Seriously, when John V publishes his code, assuming it holds up, we have this:

No more need for GISTEMP. And if Hansen diverges from John V in the future, then Hansen has the problem. And if Jones diverges from John V, then Jones has the problem.

I’m a Layman on this, but I still have an issue with the site classifications –

I would think the CRN1’s are probably valid, but might be affected by the area-wide UHI effect. They can look good in the field, but local regional influences can still distort the results.

But the CRN4’s and CRN5’s are potentially even weaker, combining micro-climate problems with UHI impacts over time.

I’m an Environmental Engineer (PE). If I saw this level of potential data corruption of the database on a project I was managing, I would have to go to the client, hat in hand, and say we had to re-do the entire site investigation. My company would take a financial hit for the re-do, and if I wasn’t lucky, I’d lose my job.

At some point, you have to realize you can’t (yet) make your case. So, you need more time to acquire data to either prove your case, or to prove yourself wrong. (P.S., I’ve done the latter – it hurts, but is humbling and ultimately positive. It still sucks!!)

Can’t fit the data to the objective. It’s gotta be the other way around.

I looked at (eyeballed) the California class 5 stations versus the class 1 and 2 stations for the 1940s (1940-49) and the 1990s (1990-99). There are about 19 class 5 sites and 7 class 1,2 sites (Quincy excluded due to short record).

What I get is a 0.18C/decade rise at the class 1,2 stations and a 0.24C/decade rise at the class 5 stations. (I note that the Independence (class 2 site) includes an odd 1.5C jump in the early 1980s, which I included. Without that jump the class 1,2 average drops to 0.13C/decade.)

This look involves comparing limited data in a heterogeneous state but it’s better than comparing a class 5 California site with a class 2 Montana site.

The bottom line is that we’re at the point where one can slice and dice the limited data in various ways and draw different impressions.
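The decade-average comparison described above amounts to a simple calculation along these lines (a sketch with a synthetic linear series standing in for real station data):

```python
def decadal_rise(annual):
    """Per-decade rise implied by two decade means:
    (mean of 1990s minus mean of 1940s) divided by the 5 decades
    between decade midpoints -- the rough method described above."""
    m40 = sum(annual[y] for y in range(1940, 1950)) / 10.0
    m90 = sum(annual[y] for y in range(1990, 2000)) / 10.0
    return (m90 - m40) / 5.0

# Synthetic check: a series rising 0.02 C/yr should show 0.2 C/decade.
annual = {y: 0.02 * (y - 1940)
          for y in list(range(1940, 1950)) + list(range(1990, 2000))}
```

Comparing only two decade means is crude (it ignores everything between them), which is part of why the same data can yield different impressions when sliced differently.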

No, not at all. But if the 19th was cooler than the 18th (as a result of its much greater vulcanism compared with the 18th or 20th), that would squeeze whatever warming occurred in the 20th, so that — assuming some natural warming ever since the Little Ice Age — there would be less warming attributable to CO2 (or farming, or building reservoirs or whatever it is we’re doing to rape Gaia).

Another way of putting it is that I want to see a trend that’s longer than 100 years.

I agree 100% with the need for better understanding of site contamination. The CRN program is fine for ongoing temperatures, but it would be nice to see some empirical studies of exactly how much impact specific contaminations have. TCO, Eli Rabett etc. have been whining that Anthony and his volunteers haven’t done so already – but they’ve done more than their share already. Surely NOAA could spare a few CRN units to study contamination. They did a nice comparison between a decent-looking ASOS site in Asheville (surely a CRN1 by the surfacestations standards) and a CRN site and identified a bias even there. Why not try to empirically estimate the amount of contamination at Marysville and Tucson AZ? To study the effect on a trend you’d have to estimate the opening bias, which wouldn’t necessarily be nothing. I’d be happier with people facing up to the problem at each station and making bias estimates the best they can – with documentation of the bias estimation procedures – rather than either ignoring the problem or making wild-ass approximations.

If someone like TCO says – well, maybe asphalt within 20 meters doesn’t matter regardless of WMO specs, he might be right. But my opinion is that it’s NOAA’s job to show that it doesn’t – through on-site measurements, not statistical incantations like Parker 2006. At the end of the day, they might be totally lucky and the non-compliance might not matter very much. But they should do the practical studies to find out.

Weird things: files from USHCN that used to go back to the early 1900s look like they have been clipped.

Looks like there might be a Sept 1, 1949 problem.

I’ll do it a third time from scratch, but my sense is something is changing. I say that because on Sundays, when I play around with this USHCN site, occasionally I will get “file not found” errors, and then a few minutes later the file appears. I think they do updates on Sunday.

Basically, delta Tmax goes from 0-5 degrees up to Sept 1, 1949, then to -70 to +70 afterwards.

Just to be clear, I have only checked TMAX. I check that first because I usually find nothing there. Then I move on to TMIN, where you can see microsite issues, but after I saw this weirdness I figured I’d better point it out.
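A quick way to confirm a range change like the one described is to split the daily series at the candidate date and compare the min/max on each side (a sketch; the `records` format of (date-tuple, delta-Tmax) is a hypothetical stand-in for however the dailies are parsed):

```python
def range_by_period(records, split=(1949, 9, 1)):
    """Split daily (date, delta_tmax) records at a candidate breakpoint
    and report the (min, max) range on each side. Dates are (y, m, d)
    tuples, so plain tuple comparison orders them chronologically."""
    before = [v for d, v in records if d < split]
    after = [v for d, v in records if d >= split]
    return (min(before), max(before)), (min(after), max(after))

# Toy data mimicking the reported behavior: tight range before
# Sept 1, 1949, then wild +/- 70 swings afterwards.
recs = [((1949, 8, 30), 3.0), ((1949, 8, 31), 0.5),
        ((1949, 9, 1), -65.0), ((1950, 1, 1), 68.0)]
```

A jump from a 0-5 degree range to a -70 to +70 range at a single date would point to a units or file-format change rather than weather, which is worth checking before anything else.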

I tried several times as a layman to ask someone here the following question but none answered me: is it really so impossible to select a group of say 50, 60 or 80 stations in USA (Steve said USA has a lot of such good stations), relatively representative geographically and see what temperature trend that network shows?

After reviewing dailies, starting with a copy of the hand-written 12/1989 report from NCDC images and comparing to the GHCN dailies for Everglades (08-2850-5), there are plenty of issues. The actual month would have been off by 1/2 degree F had the data entry clerk not entered data for a maintenance period that was not recorded.

re 71: That’s not a good idea. The download data for co2science is not original; it appears to be calculated from the GHCN data. The author seems to have more problems dealing with missing data than GISS. I compared co2science to GISS to GHCN to the few months of dailies that seemed accurate, and GISS and GHCN were fairly close. All were off in their estimates for missing data. GISS actually showed more 9999s than GHCN for annuals where the data was missing.

I know this was meant for Steve M., but I think this may be something he is finding in the dailies.

Air temperature is not particularly meaningful when it comes to climate.

Better is SST (sea surface temperature) and actual ground temperature. If you want to find out if the Earth is heating up, measure the Earth. Air has very little thermal mass (relatively), which makes the signal very noisy.

The difficulty is that sea surface temperatures until recently depended on trade routes (which limit geographic distribution) and ground temperatures have hardly been measured at all.

The only thing measuring air temps has going for it is the length of the record.

I didn’t see any response from JohnV about comparing unadjusted GHCN data against GISS data and then coming up with a close correlation. That definitely would show that the GISS adjustments are not accurate.

I know this is a dumb question, but does anyone know where I can find a historical wet bulb temperature and relative humidity trend? Since I have seen nothing sexy about the tmax trends so far, the latent heat change might be more indicative of true trends.

While NASA has been taking the brunt of recent criticism, it is actually NOAA rather than NASA that has made highly publicized announcements about 2006 being the warmest year and we need to keep this in mind as our understanding of these methods and data improves.

I trust NOAA’s data as far as I can punt it. I check the numbers in their monthly assessments online and generally find they overestimate temperature by 1-2C. I live in NC and we had the coolest July in at least 30 years, and they had us 0-1C below normal, which is complete BS. Plus, it almost seems like if there’s a large gap between stations (such as in Asia or Russia), they fill in with whatever those stations recorded, which may or may not be correct.

#76
I’ve asked and looked for the same information without any luck. Knowing that most of the Colorado River water now travels overland to the East rather than into the Pacific has to have some forcing effect on humidity (as well as all the irrigation circles that might be noted by anyone who has looked down over flyover country – and not just in the USA).

The effects of humidity are not at all well modeled. You might look at papers such as :

I know of no research that has established any understanding of the IR absorption/reflection/refraction by these different classes of ice crystals. Spencer has speculated that these crystals in cirrus clouds may have a balancing climatic effect, based on some observed IR signatures seen in cirrus clouds by satellites.

I don’t see how man’s irrigation can be written off as a confounding variable for the CO2-caused AGW theory. The standard RC argument is that water vapor has a short half-life and is a result, not a cause, but they overstate the half-life of CO2 and don’t consider that rains in arid areas are often returned to the atmosphere in very short order. I think that RH averages are possibly 5% higher than in the early days of the West – which would dwarf a CO2 effect. I think some data must exist.

#80 and #76
It’s getting late so I am about done, but thanks; I will dig in the next time I get a chance. I was just thinking that UHI and microsite influences would mainly be sensible, not latent, heat. The difference may not be easily discernible, but it may be worth looking into.

It may be nothing, but I am sick of looking at the screwed up daily records.

(I apologize for cross-posting in two threads. My promise to release code was made in one thread, but the conversation has since moved on to another).

I don’t have much time so will have to wait until tomorrow to respond to any comments (if required). It’s been a very busy weekend but I’m finally back at my main computer. Although I had hoped to nicely package my analysis program on a website with documentation, I will not be able to do so tonight.

The zip file includes:
– the executable (OpenTemp.exe)
– source code (in the src directory)
– input data (v2_mean.txt, downloaded from ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/v2 on September 12, 2007)
– batch files to run the program
– my result files from running the program

I think it’s important that we don’t clutter up this thread and SteveMc’s site with discussion about my code. I will setup a proper forum for discussions tomorrow, and will post a link when it’s ready.

I’m not trying to “validate” any claim. CRN5 sites are non-compliant and to that extent are “bad” by definition. Whether that makes any relevant difference is a different issue. TCO, for example, raised the point that maybe 30 feet away from asphalt is enough and maybe WMO standards are too onerous. Maybe he’s right, maybe he’s wrong. You can’t say on an a priori basis. However, without showing that it doesn’t matter, you have to assume that WMO climate scientists weren’t complete fools when they instituted this policy and that asphalt makes a difference. The only way of determining that is empirically. TCO blamed Anthony Watts’ volunteers for not completing such studies, even though they were just at a reconnaissance stage. My point is that NOAA’s the one that identified the sites. Surely they can spare some CRN units to actually study the impact of bad siting, in addition to the valid monitoring of good sites.

That is an interesting question. The answer depends entirely upon what a person accepts as a definition for “good stations.” If James Hansen says all of the stations in the United States are an insufficient number of stations to determine a trend for the ROW, then how can any subset of those stations be an acceptable number to Hansen and AGW proponents sharing his views and attitudes about AGW, unless they concluded the results support their AGW claims?

My point is that NOAAs the one that identified the sites. Surely they can spare some CRN units to actually study the iumpact of bad siting, in addition to the valid monitoring of good sites.

This is definitely good information to know (and should already have been done IMO). However, the only way this helps us understand historical surface temperatures is if it leads to some adjustments, which would be nearly impossible to apply. Unless the conclusion is that microsite effects are insignificant.

3. Identify the definition for “insignificant” which James Hansen and NASA accept.

4. Compare the definitions.

5. Determine (a) if there are or are not any differences in the definitions; and (b) if differences between the definitions exist, can they be reconciled to produce one definition accepted by all parties as a reasonable conclusion supported by evidence produced by the application of a scientific method.

Ive been playing around with your very cool program for the past hour or so. It looks like months, perhaps, many months worth of work. If that program is something you whipped up over the weekend, well, that explains why some people are Professors and others are not.

Anyway, some surprises. Here are the top warmest years in the lower US-48 since 1880, utilizing the output from your program, considering Anthony Watts’ 33% microsite audit and using the quality-sites reference (CRN=1,2). The results have moved around since our last leader-board update. The years in the brackets represent the leader board posted by Steve. Because there are only three months in 2006 posted, the year 2006 was not included in the program or the graph below.

The graph below represents the average yearly temperature in the lower 48 US states using your output. The black line is a simple five-year average. The yearly temperature change from approximately 1930 to present is hyperbolic and not a hockey stick shape.

Something weird is going on between 1956 and 1970. The yearly variance shrunk. It really jumps out. I have no idea why. I am not a climatologist, so it is highly likely I did something wrong or have not understood the meaning of your program output. If that is the case, let me know.

I do not have anything to say regarding your code thus far. I need more time to go through it line by line. Maybe I will have a question or two tomorrow.
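The 1956-1970 variance shrink described above could be quantified with a sliding-window sample variance over the annual values (a sketch; the 15-year window is an arbitrary choice, and the synthetic series below just stands in for the real output):

```python
def rolling_variance(series, window=15):
    """Sample variance in a sliding window of annual values.
    A sustained dip in the output flags a period of suppressed
    year-to-year variability, like the one described above."""
    out = []
    for i in range(len(series) - window + 1):
        w = series[i:i + window]
        m = sum(w) / window
        out.append(sum((x - m) ** 2 for x in w) / (window - 1))
    return out

# Synthetic series: noisy early years, then a flat stretch.
series = [0.0, 2.0] * 10 + [1.0] * 15
```

If the real data shows the same pattern, the next question is whether it is climatic or an artifact of station composition or data processing in those years.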

I would like to return to the matter of accuracy and error margins. Anthony’s survey proves beyond any doubt what is obvious to any practical observer: there are inaccuracies in temperature measuring, and error margins that reach the range of +/- 5 deg C.

Now, no amount of statistical analysis or mathematical computation can make these error margins go away. The error margin is part of the data, a physical, integral part. You cannot ignore it.

When you compute a temperature trend for, let’s say, the last 100 years, and state the trend is 0.5 deg C, this statement is incomplete, and therefore misleading. You have to state it this way: the trend is 0.5 deg C +/- 5 deg C. This means: we can be confident the trend is not less than -4.5 deg, and not more than +5.5 deg.

Ignoring the error margin takes you from the real physical world into the imaginary world of numbers.
If someone wants to claim that by using many stations in the analysis, the error margin is reduced, he has to prove why it is so.
I have the feeling that mathematicians, programmers and climate scientists don’t appreciate enough the significance and importance of error margins in the practical world.
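The standard statistical answer, and its limit, can be shown with a small Monte Carlo sketch: averaging many stations shrinks independent random error roughly as 1/sqrt(n), but a bias shared across the whole network (the micro-site worry raised in this thread) does not average out. All magnitudes here are illustrative assumptions:

```python
import random

def station_average_error(n_stations, random_sd=1.0, shared_bias=0.5,
                          trials=2000, seed=0):
    """Monte Carlo sketch of the error in a network-mean reading.
    Each station's error = a bias common to all stations plus an
    independent random component. Returns (mean error, spread of
    the error) across many simulated networks."""
    rng = random.Random(seed)
    errs = []
    for _ in range(trials):
        readings = [shared_bias + rng.gauss(0.0, random_sd)
                    for _ in range(n_stations)]
        errs.append(sum(readings) / n_stations)
    mean_err = sum(errs) / trials
    var = sum((e - mean_err) ** 2 for e in errs) / (trials - 1)
    return mean_err, var ** 0.5  # bias survives; spread ~ random_sd/sqrt(n)
```

So "many stations" helps only against the independent part of the error; a systematic siting bias common to most stations passes straight through to the network average, which is exactly why the independence assumption needs to be proved rather than asserted.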

On the CRN1, 2..5 classification: many have already pointed out that this classification holds for today. The classification has changed over time, and we have no data about it. Some stations started (maybe) as CRN1 and then, over time, buildings and parking lots were added and they moved to CRN5. But this plausible trajectory isn’t the only possible one. It’s just as possible that some stations that are today CRN1 were in the past CRN5, and as more buildings and parking lots were added, the station was, at some point, moved to a better location and turned into CRN1.

We have to assume that the error margin for all stations is the CRN5 error margin (+- 5 deg) unless the historical record of the station has been checked, and shows it hasn’t been moved, and hasn’t undergone other changes (like a change in instruments).


Let me repeat my question: what is the meaning of “error >= (+-) 5°C”?

I cannot lose the feeling that this classification was endorsed by surfacestations because it was an easy way to discredit the NASA data.

The way it is used indicates it to be some sort of average error, but that doesn’t make any sense. An average error that is either bigger than +5°C or below -5°C is highly unlikely.

I think people should start looking at the real definitions, given by the French author, before we make further deductions from the type of station.

While the intuitive thought is that vegetation cools, it may be more accurate to say that vegetation moderates (lower highs and higher lows). I cannot say whether the net effect is cooling or warming.
Vegetation (trees and shrubs) encroachment may change the areas albedo, sunlight, outgoing IR, air mixing and ground cover while also providing some cooling via transpiration.

David, that’s my limited observations. I kept detailed temp recordings in a heavily forested spot where I lived for 14 yrs, and compared those to the nearby (~4.5 miles away) “official” NOAA airport-located station. The annual-averaged temps were surprisingly close (usually within 1 F), but my forested site had both cooler daytime highs & warmer nighttime lows compared to the wide-open airport station. The differences were most pronounced in the warm season.

On surveying the Circleville OH station this weekend, I noticed that the top of the MMTS had considerable black soil buildup, and that the vanes had numerous black spots on them as well. See my closeup photos at Surfacestations.org.

I guess MMTS’s are supposed to require no painting, but shouldn’t they at least be cleaned on a regular basis? Could AGW really just amount to Accumulated Grime Worsening?

If NOAA does get around to cracking down on Ring Around the Collar, I would hope that they would ask even numbered stations to clean up their acts immediately, but odd numbered stations to hold off and continue as usual for 12 months, in order to have a control group for comparison.

Perhaps Anthony Watts could include closeup photos of the top and vanes of any MMTS sensors in the instructions for his Surfacestations.org reporters. Unless most of them are actually clean, grime buildup could be an even bigger factor than Latex paint vs whitewash on the old Stephenson Screens, given the prevalence of the MMTS nowadays.

Circleville is a fairly clean environment, with no smokestack industry upwind. The station is even completely west of the town itself. I suspect that other stations are a lot worse. I can’t say whether the soil is soot, mildew, or just accumulated pollen and tree sap, but it’s curious that I’m getting similar black grunge on my tomatoes here in Columbus, 40 miles N.

My digital camera has a “closeup mode” indicated by a little flower icon, but it’s important to turn it off before taking other regular shots!

RE 98, sod: “I cannot lose the feeling that this classification was endorsed by surfacestations because it was an easy way to discredit the NASA data.”

Well, that is flat wrong, and I don’t appreciate the implication. You may as well say that the new NOAA USCRN network was created to discredit GISS.

The classification was used because it was the only classification scheme available to us (that I could find) that could be applied for the kind of work that was being done. The classification was created by Michel Leroy of Meteo-France and is used for their network, and of course is used for the US Climate Reference Network. With that sort of endorsement for the scheme prior to surfacestations coming online, it seemed plausible to use. Had I created my own classification scheme, rather than used this one, I would have received criticism for that, most likely far more vocal.

In fact I don’t know of another classification scheme for site quality in existence, because site quality for climate monitoring appears to have been an afterthought to the existence of the networks being used for the near surface temperature record. If you know of one, put it forward, and I’ll apply it if possible.

Let’s see any published site quality classification schemes that anyone knows of. Please post them here.

RE: 98 sod, why do you not look it up yourself? The class 5 station is one that has the temperature sensor located next to/above an artificial heating source, such as a building, rooftop, parking lot, or concrete surface. The error is due to warming, so it is not +/- 5 degrees C; it is error >= 5 degrees C.

Others have asked how to post graphs. I too would like to know the answer. My own experience with submitting graphs and drawings to peer reviewed journals, starting in the late 60’s, basically required an envelope and postage to implement the snail mail procedure. Procedures for inserting images on this site may be slightly more technical. The Quicktags link has information about WordPress, perhaps the answer is hidden there?

RE 93 I have to agree. The data entry errors alone should require a minimum +/- 0.5 F error margin. Missing daily data should add a small error. Hansen’s nearby-site comparison does help reduce the error somewhat. Then that seems to be negated by the December gift. Plus the accuracy of early-generation digital instrumentation in some cases is listed at +/- 2%.

It seems that the error bars shown by Hansen should be more consistent instead of reducing as he has indicated.
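The distinction behind this worry can be sketched numerically: independent random reading errors do shrink when many stations are averaged, but a systematic bias shared by the stations does not. A minimal illustration (all values hypothetical, not from Hansen’s data):

```python
import numpy as np

rng = np.random.default_rng(42)
n_stations = 100
true_mean = 15.0  # hypothetical true regional temperature, deg C

# Independent per-station reading errors (std dev 0.5 C) average down ~1/sqrt(n)
random_errors = rng.normal(0.0, 0.5, n_stations)
# A bias shared by every station (e.g. a common instrument change) does not
systematic_bias = 0.3

network_mean = np.mean(true_mean + random_errors + systematic_bias)
print(network_mean)  # near 15.3: the random part shrank, the bias survived
```

So error bars can legitimately shrink with more stations only for the random component; any shared bias puts a floor under them.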

The meaning is: the “true temperature” could be anywhere between a limit of -5 deg of our reading to +5 deg.
Now what is the “true temp”? It is two things (both of them):

1. The temperature that a very, very accurate thermometer would read in the same place.
2. The temperature that would exist in the same place due to climate alone, and not due to irrelevant noise influences – like a/c exhausts or asphalt radiation.

It seems obvious to me that the temperature errors associated with the CRN/Leroy categories are just guesstimates of potential errors. In fact, “Error >=5 deg C” could be interpreted to mean either the algebraic errors could be as much as +5 degrees (but not negative) or that the absolute errors could be as much as +5 degrees. Most of the problems with the CRN5 sites are heating problems, but shade could be a problem in the opposite direction as well. Let’s get on with studying the results from surfacestations (and any other sources) to see what the errors actually are, rather than arguing about someone’s guesstimate, however educated it may have been. As someone mentioned, it would be useful to have categories like 5H, 5C, and 5H/C to indicate the directions of the violations.

RE 107 Typically it is the total range of the instrument. Newer thermistors/thermocouples have better accuracy if you pay the bigger bucks (precision thermistor, precision shunt resistor, and precision A/D). An Italian company provided many in Europe starting around ’88. I will try to dig up some more information. Some of the early instruments had short life spans (1 year) with incredible drift.

RE: #21 – The Western US tends to be highly urbanized – way up in the 90s %. We’ve never had the sort of relatively dense array of rural communities that one finds “back East” or in Europe. This is attributable to a number of factors – geography (mountains and deserts break up the landscape), a less dense road network, development agglomerated onto water networks of relatively limited span, and relatively recent growth (as of 1800, there was almost nothing here – the Spanish were strictly pastoralists with exceedingly large acreages, and mainly lived near the current US-Mexican border and on the Pacific coast south of 38 deg N – furthermore, most such areas did not see development of any sort until the late 1700s).

109, the technology involved made a huge difference. The cheap ones these days are much better than the expensive ones in the ’80s. Some of it has to do with digital signal processing, but the analog front ends are a lot better than they used to be, too.

Frankly, up until about the mid-90s, I’d have more faith in a mercury thermometer than an electronic instrument. These days, you could make a very cheap, very accurate logger with flash memory that is better (in addition to much, much smaller) than the best that you could get in the ’80s.

The six warmest years in this case occur in the first half of the 20th century. You have a bi-modal distribution, not a linear trend.

As someone pointed out, it looks like a cycle around a zero trend.

But all this is beside the point.

As Jacob said in #95, the standings of the sites in the classification are as of now. We do not know what the standings were in the past. There is no point making comparisons between the time series until we have that information and can adjust the series in light of it.

Moreover, as I pointed out in #43, the time series are biased geographically. Until we have a larger sample so that is no longer a problem, we have no idea if the graphs are artifacts of geographical bias, UHI, micro site bias or real climate differences.

It is time everyone cooled their jets re: interpreting the data at this point.

What has been going on is not good science. It is far below the standards climateaudit is demanding of climatologists. If we don’t wish to lose the high ground in the debate over AGW we have to stop this nonsense now.

You are right about the scale of the effort to correct the ROW, especially the SSTs (google my post of Sept 21 2005).

Perhaps it is time to establish a cooperative along the lines of the open source software movement. Why not establish a thread on this and see where it goes? There are active professional scientists on this blog with access to funds. Other people with respect for auditing standards such as Fred Singer and the Pielkes might become involved.

I would be sorry if the list were to lose its seminar quality. That is one of its great strengths. Moreover, it is a great classroom for people to learn how science ought to be done. This is of no small value. Most people do not know how science is done, including a very large percentage of working scientists.

I think that all that it would take to preserve the seminar quality is to set standards respecting posts e.g. no speculative analysis until normal statistical standards are met for the data. Once the standards are set, policing will be self-organizing, as it has been in the past.

1. You always have to know the margin of error. Why isn’t one published with the other information?

2. If you are comparing class 1 and 2, why not compare it to class 4 and 5?

3. Even if the express goal of the survey of stations is to discredit them, why does the survey bother some people so much?

4. Instead of asking why the survey isn’t specifically identifying the effects of the station not meeting whatever standards it doesn’t meet, why aren’t the vocal opponents of the effort instead asking why the people who run the network haven’t specifically identified those effects themselves and either fixed them or published the information? Or disproved that there is an effect themselves, rather than whining about it.

5. The new NOAA USCRN network was created to discredit GISS, why can’t everyone just accept it and move on? :D lol

REF 107 The Italian company is SIAP Bologna. Their current MMTS temperature sensor is rated at +/- 0.3 C accuracy. It is a constant-current sensor (1 mA). The accuracy of the current source and A/D is better than the accuracy of the sensor, in the +/- 0.05 range, resulting in approximately +/- 0.4 degree accuracy. The irradiation shield increases error in a non-linear manner dependent on air temperature, irradiation intensity, and azimuth of the sun. (The AMS had some pretty good research on the MMTS, determining up to a 0.7 degree diurnal positive bias.)

The cumulative error is then approx. +/- 1.1 degrees C. That is for the newest SIAP instrument. (Remember I am approximating the A/D and current-generator errors.)
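As a sanity check on that arithmetic, here is a minimal sketch (component values taken from the figures above, treated as worst-case bounds) comparing the straight worst-case sum with the root-sum-square you would use if the components were independent:

```python
import math

# Error components in deg C, as quoted above (worst-case bounds)
components = {"sensor": 0.3, "current_source": 0.05, "a_d": 0.05, "shield": 0.7}

worst_case = sum(components.values())  # straight worst-case sum
independent = math.sqrt(sum(v * v for v in components.values()))  # root-sum-square

print(round(worst_case, 2))   # 1.1
print(round(independent, 2))  # 0.76
```

The ~1.1 degree figure above is the worst-case sum; if the error sources were independent the combined uncertainty would be smaller, so which combination rule applies matters for the error bars.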

the time series are biased geographically. Until we have a larger sample so that is no longer a problem, we have no idea if the graphs are artifacts of geographical bias, UHI, micro site bias or real climate differences.

It is time everyone cooled their jets re: interpreting the data at this point.

What has been going on is not good science. It is far below the standards climateaudit is demanding of climatologists. If we dont wish to lose the high ground in the debate over AGW we have to stop this nonsense now.

I just thought that Mike H said this so well that it should be said again. This shouldn’t be about scoring points (TCO). The bottom line is that it really doesn’t matter which year was hottest. In fact, it’s clear that 1998 was the hottest year on record, since the 1938 heat wave was local, not global. Therefore, Hansen is correct in his intermediate conclusion that 1998 was the hottest year on record. He is simply dead wrong in his claim about what the reason was.

I say that neither side is actually engaged in a “debate of AGW”. First, Mann and McIntyre had an argument over the statistical treatment of data involved in a scientifically dubious claim of climate reconstruction. Now, Hansen and McIntyre are having an argument over the statistical treatment of surface temperature data. Neither case is actually a scientific debate and neither is on point.

123, It’s a scientific debate, just not a theoretical debate. Science isn’t just about theories, it’s also about testing theories (you’d fit in perfectly on the other side). The Hansen-McIntyre and Mann-McIntyre clashes are over the empirical side of the debate.

Push comes to shove, if the empirical says one thing and the theoretical says another, empirical wins. That’s why this is more important than sophomoric talk about [t-word].

It is time everyone cooled their jets re: interpreting the data at this point.

That’s why I object to some commentators who use CRN 1-5 (rawish), as it is now, as some sort of validation of the final version of NASA GISS worldwide, after adjustments. I am surprised at the naivete of those who say they didn’t realise how their conclusion would be received by the Blogocracy or other media. Perhaps these are the same naifs who don’t understand how their Press releases can be so mis-understood. The difference between the quality of raw data, and the quality of adjustments is being ignored.

The question is, what’s the margin of error? I’d think there’s quite a few years that are basically a tie, within the margin of error. I don’t really differentiate between ’34 and ’98 really.

However, in a complex inter-discipline scientific subject like climate, I would say that any discussion of the methods involved is a part of the science. And that includes statistics, software, models, measurements and the like. The issue here is replication. If a disagreement over data used in a paper on climate reconstruction is involved, that seems pretty much like science, and you even use the phrase “scientifically dubious claim” there, Gunnar. But I do agree, it’s not really a “debate”. The science is settled! :D

No seriously, you can debate how much and what to do about it, but it’s pretty clear some warming is happening, and that people are responsible for some of it. I chalk it up to land use changes mostly, and say it’s not much.

I don’t know, can you debate what temperature water freezes, or if C or F is “better”? Or is climate change so complex and unknown it can be “debated”?

#124, you set up a false dichotomy between theory and empiricism. I use the word scientific to distinguish between two approaches to trying to determine the nature of reality:

1) using a statistical approach devoid of any knowledge of science. (“wet sidewalks cause rain”)

2) using the scientific method to propose a theory and then actually testing that hypothesis.

Option 1 is not some sort of empiricism, but a foolish attempt to shortcut the scientific method.

>> youd fit in perfectly on the other side

Actually, no, the AGW side is dominated by folks who use option 1. Although you folks appear to disagree with AGW proponents, you actually reinforce their position, since you implicitly agree with their science, but merely insist that they didn’t do their math correctly. It is you who fit in perfectly on their side.

>> The Hansen-McIntyre and Mann-McIntyre clashes are over the empirical side of the debate.

No, they were over statistics. That’s why it took a master statistician to settle the issue.

>> if the empirical says one thing and the theoretical says another

At best, this is a freshmanlike misunderstanding of the scientific method. At worst, it is a complete rejection of the SM, and hence science itself.

Gunnar, Steve is agnostic on cause and effect. You don’t have to take a position on cause and effect to audit the numbers. The quality control on the numbers can and should be done independently of climate theories.

If and when we ever get to the point of auditing models, the theory and the causal relations will come into play. And God save us from the pretend physicists when that all comes out on the table. Until then, it’s irrelevant, and this skirmishing over numbers is still science. It’s just the dull, mundane part of science we have to get right.

Push comes to shove, if the empirical says one thing and the theoretical says another, empirical wins. That’s why this is more important than sophomoric talk about [t-word].

Err, doesn’t that depend a little bit on how good the empirical evidence is? You really need a theoretical basis for your empirical results, or it could be spurious. We have lotsa examples of this on this blog.

You missed my point: If 1998 was hotter than 1938, it does not mean that AGW is true. If 1938 was hotter than 1998, it does not mean AGW is false.

>> scientifically dubious claim

Because tree rings are not a good proxy for temperature. So, in that case, if Mann had been correct, it would not have meant that AGW was true. If M&M + Wegman are correct, it does not mean AGW is false.

>> Steve is agnostic on cause and effect. You don’t have to take a position on cause and effect to audit the numbers.

I’m not criticizing Steve. He’s a math wizard, and he performs the audit function admirably – well enough to pass the Wegman test. However, he doesn’t care if there is valuable ore in the mine or not. All he cares about is whether or not the folks fudged the numbers to get the investment.

It’s all of us, citizens (not just blog readers) that I’m criticizing. We care about what Steve is doing because we think that it implies that one conclusion is correct and another isn’t. The media cares. Even Hansen cares. Because they are judging whether AGW is correct or not, based on whether some irrelevant math was done correctly, or based on whether 1998 > 1934.

I’m concerned that the proponents will choose not to fudge the numbers, and with the science allegedly settled, and everyone’s hopes and fears resting on Steve M, there will be no argument that the science isn’t settled.

I agree with you Gunnar, year 1 or year 2 hotter/colder, or rings being whatever doesn’t prove or disprove things. Thanks for clarifying your point. I was just saying it doesn’t matter really, I guess.

The classification was used because it was the only classification scheme available to us (that I could find) that could be applied for the kind of work that was being done. The classification was created by Michel Leroy of Meteo-France and is used for their network, and of course is used for the US Climate Reference Network.

i think it is a good classification for weather stations. i think it provides good criteria for building a network of good stations in the FUTURE.
i have some doubts that it is the best tool to analyse existing stations for the purpose of climate TRENDS.

my problem with the Leroy scheme as presented is: people come away from surfacestations with the impression that the majority of stations are type 5, and that those have a +5°C effect on the climate trend.

that’s completely false of course, but i see how people come to that impression. negative effects on temperature are not given prominent display. and i still doubt that anyone has done any serious reading on what Leroy actually said.

again: for the climate TREND, only a change in the station type is important.
but still, a type 5 station, if unchanged, can provide accurate climate trend data, while a type 1 station might be very tainted, for example by being downwind of a growing UHI effect.

The error is due to warming so it is not +/- 5 degree C, it is error>=5 degree C.

you are right on this. i picked the +- up in one of the comments above mine and didn’t check it.
but i still have serious doubts about this. vents could actually cool, as could some surface paint or rooftop surfaces.

I am guessing that site B will be classified as class 5. Why do you not want to accept what NOAA and the French agree to?

if the surface is covered with significant snow, for the majority of the year, there will be NO +5°C. simple.

The meaning is: the true remperature could be anywhere between a limit of -5 deg of our reading to +5 deg.

that is false. this would be the case if he had written error = (+-)5°C.

It seems obvious to me that the temperature errors associated with the CRN/Leroy categories are just guesstimates of potential errors. In fact, “Error >=5 deg C” could be interpreted to mean either the algebraic errors could be as much as +5 degrees (but not negative) or that the absolute errors could be as much as +5 degrees.

i sort of agree, though he implies an error BIGGER than 5°C.
i think he can only be speaking about some sort of MAXIMUM error. that is why i don’t think it has such a big effect on climate TREND.

>> If and when we ever get to the point of auditing models, the theory and the causal relations will come into play. … Until then, it’s irrelevant

It’s good that you started with “If”, since on present course this will never happen. Folks have common sense, and since no one is arguing the science, people will assume that anti-AGW folks started with their best argument, and that therefore they will have concluded that the science is settled. It will not be possible at that point to go all the way back to the beginning.

RE98, I clicked on SOD’s link to see what sort of blog he wrote. It’s called “Seed Of Doubt – Iraq”. That got me to thinking.

Here is a question that has been bugging me about this sort of commentary, and “SOD”, I’d like to hear your answer.

People who doubt the government/administration on Iraq, 9/11, and various other issues of our time seem totally willing to accept temperature data, also gathered and presented by our government, without so much as simple curiosity as to its validity. In fact I get criticism just for looking into it, and so does Steve Mc.

If our government reporting cannot be trusted in these other areas, what makes it trustable in this one? Why is auditing the methods and data so “problematic” in this case of temperature data but demanded in other issues like Iraq and 9/11? Seems like a double standard to me.

RE98, I clicked on SODs link to see what sort of blog he wrote. Its called Seed Of Doubt – Iraq. That got me to thinking.

thanks for visiting! i have serious doubts that i have many readers. it is more of an online journal, to keep some thoughts, quotes and links.
you could still become one of the first commenters. i’d be honored!

If our government reporting cannot be trusted in these other areas, what makes it trustable in this one?

i will happily answer this one: conflict of interests.
it’s rather obvious that the US government has a strong interest in claiming success in Iraq. and, SURPRISE, the Petraeus report finds success.

that makes me very sceptical.

on the other hand, i can’t see a similar interest of state actors in human-made climate change.
i see how it could be constructed, but i’m unconvinced by it. it is not as simple.

ref 136 The abstract I linked in 135 indicates a bias of a few tenths to over 1.0 degree C for MMTS stations. The average diurnal bias for MMTS tested at the University of Nebraska was 0.7 C. So the majority of the CRN biases are likely positive. Why a negative bias is created for the CRN 5s prior to the 1950s is probably due to correcting the more accurate to match the less accurate.

Any instrumentation error prior to about 1947 was probably random, with a slight positive bias due to column separation and more daylight-hour recordings. From 1947 to about 1985, there would have been an overall increasing positive UHI bias. From about 1987 on, there would have been a growing large positive bias due to instrumentation changes (digital), peaking around 1995 and reducing slightly to the present with digital improvements.

If the AMS is concerned with biases of over 1.0 degree C for some stations in 2004, it seems a nice fuzzy error range should be included in the global and US average temperature plots. This doesn’t mean warming is not happening, just that the how much and why are in question.

RE: #141 – You are highly naive about the nature of a bureaucracy in a mixed (socialist-free market) economy. Of course, it is in a bureaucracy’s interest, in such an economy, to pass draconian regs regarding CO2 and energy consumption. The experiment is already underway in California, and this is the proving ground for the national scale. You do not comprehend this?

on the other hand, i can’t see a similar interest of state actors in human-made climate change.
i see how it could be constructed, but i’m unconvinced by it. it is not as simple.

And it is comments like that, dear poster, that make me skeptical.

If our government reporting cannot be trusted in these other areas, what makes it trustable in this one? Why is auditing the methods and data so “problematic” in this case of temperature data but demanded in other issues like Iraq and 9/11? Seems like a double standard to me.

Let’s hear your answer, SOD.

Anthony, you got it right, but do not expect some others to appreciate your SODbusting.

>> on the other hand, i can’t see a similar interest of state actors in human-made climate change.
i see how it could be constructed, but i’m unconvinced by it. it is not as simple.

I disagree. Hansen’s interest in this goes back to the early 70s, when there was little science on the subject of AGW. He, and others like him, are heavily invested personally, professionally, and, yes, economically. They BUILT the theory and supplied the science. Regardless of whether there is such a verifiable phenomenon as AGW, its very existence as a subject of discussion as an imminent danger to mankind guarantees that billions will be spent researching it and eventually on efforts to rectify it.

Bush/Iraq is much more similar to Hansen/AGW than you think.

Both will be judged by history. Iraq will either be understood as a mistake or not, and Hansen’s views on AGW will be understood as either prophetic or as incomplete, faith-based science that was needlessly alarmist.

Gunnar, I agree with MarkR. Steve and Anthony have their hands full with this rather small corner of the issue. I thought that we could have a reasonable discussion on the theoretical side, but it’s become abundantly clear to me that we’re missing a referee, and the door is open to every crackpot who comes along. The theoretical discussion has to be done somewhere else, and it needs to be done with a referee who can keep the crackpottery out. I have some faith in my own abilities, but not as much as you appear to have in yours. And I wouldn’t want to be the referee.

You don’t see a conflict of interest between climate change and govt interest?

What has been proposed as the solution.

1) Huge increases in taxes.
2) Huge increases in govt control over just about everything.

You don’t believe that there are those in govt who would like either of those to happen?

As to the climate scientists.
What’s the best way to get govt to spend billions on your area of research. Convince the govt that there is a massive problem that must be researched.
You believe that there isn’t a conflict of interest here?

You are willing to believe that a man of incredible personal ethics (according to everyone who knows him) is willing to lie just because Bush wants him to, but your hyper-skepticism just can’t make itself find even a single conflict of interest in the climate realm?

Larry, let’s put our little back-and-forth into perspective: I have not been discussing AGW theory. I was just agreeing with Mike H, and trying to put the Steve/Anthony effort into perspective, which I appear to have succeeded in, since you say: “this rather small corner of the issue”.

Gunnar, I don’t think anyone would argue that auditing the numbers is a small corner of the issue. But it’s something that can be done without getting into esoteric physics (although it does involve esoteric statistics), and it illustrates a sloppiness that may give people pause. Don’t underestimate how important that is. Every time another problem is found, no matter how minor, it brings into question what these people are up to.

Dan Rather found out the hard way that people will only accept the “butterfingers” excuse for so long.

RE: #153 – It’s a bit too early to be talking in generalities about trends of Type 5 stations, or any Type of station. Let the analysis get completed, then we can discuss whatever trends may or may not have emerged.

re: 153: “do you agree that a type 5 station can produce better CLIMATE TREND results than a type 1 station does?”

I could agree to one thing: if you computed trends per station (that is: a trend for *each station*) and if the station bias was constant over time – then the bias would not affect the trend.

Neither condition is fulfilled: bias changes over time (as the immediate environment of the station changes, the location changes, or instruments change). We don’t know the magnitude and direction of bias changes.

Second: it seems trends aren’t computed for individual stations. First – all stations are combined into some averages – between stations with different biases – and only then are trends computed on these aggregate means. Temperature data from some stations is used to statistically “improve” or correct the data of other stations, without considering station bias, whose magnitude is not known.
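The argument in the last two paragraphs can be illustrated with a toy series (all numbers hypothetical): a constant station bias drops out of a per-station trend, but a bias that drifts over time contaminates it:

```python
import numpy as np

years = np.arange(1950, 2001)
true_temp = 0.01 * (years - 1950)  # assume a real trend of 0.01 C/yr

def trend_per_year(series):
    # slope of an ordinary least-squares linear fit, in deg C per year
    return np.polyfit(years, series, 1)[0]

constant_bias = true_temp + 5.0  # class-5-style fixed warm offset
drifting_bias = true_temp + np.linspace(0.0, 2.0, len(years))  # growing encroachment

print(round(trend_per_year(constant_bias), 4))  # 0.01 -- constant bias cancels
print(round(trend_per_year(drifting_bias), 4))  # 0.05 -- drift adds 0.04 C/yr
```

This is why the station history matters so much more than a snapshot classification: only a *change* in bias over the record corrupts the trend.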

Sod, error >= 5 degree C without a +/- in front means just what is says.

Quote:

“The air temperature error varied greatly depending on wind speed and shield type, ranging from +5° to nearly +20°C at the lowest wind speeds. In addition, when the temperatures of the sensor and the interior walls are not equal there is a net exchange of infrared radiation, and the sensor temperature will no longer represent the air temperature.”

Sorry for my very bad English, I am a French observer for Meteo France (the national weather service in France).

sod says:
September 17th, 2007 at 7:57 am

let me repeat my question:
what is the meaning of error>= (+-)5°C???

i cannot lose the feeling, that this classification was endorsed by surfacestations, because it was an easy way to discredit the NASA data.

the way it is used indicates it to be some sort of “average error”, but that doesn’t make any sense. an average error that is either bigger than +5°C or below -5°C is highly unlikely.

i think people should start looking at the real definitions, given by the french guy, before we make further deductions from the type of station.

so i’m not surprised by the findings by John V.

A 5°C error is a possible instantaneous error on a given day; it is not a certainty and it is not systematic.
The classification does not tell everything. A station can be class 5 officially because of an excessive shelter height (> 2 m on an open site; for example, 2.5 m is class 5) while being as good as a class 2 or 1.

Here I have one station in class 2 and, 35 m away, another station classed 4 for natural reasons: trees and shrubs of the forest, which cut off the natural wind circulation.
The maximum error in maximum temperature was +4°, once. In average maximum temperatures, the error is 1.5° to 2° for the summer months. In winter the class 4 station is colder in Tx because of the shade from the trees. I noted a small average error of 0.1° in minimum temperatures.

See the class 2 and class 4 here.
The class 2 has an official radiation shield of Meteo France (in the picture: “Socrima MF”).

I don’t see how a type 5 station can produce a “better” (I presume you mean “more accurate”) climate trend. On what do you base this? We have a single point for these classifications, ATM – we do not have a historical record of what class (“type”) these stations had in the past. My personal opinion is that it’s much more likely that we have seen a gradual change from type 1 to type 5 for those currently type 5, and that those now in type 1 have either 1) been that way since dot or 2) have had a “step change” from 5 back to 1 after someone noticed a previous gradual change from 1->(2-5) and did something about it (relocation, eg). Some of this information may be able to be reconstructed from site change logs etc, but much is lost forever.

If we cannot know the full station history, we must assume the worst – that all are CRN 5; that there may be biases greater than 5C; and that these biases may not be monotonic or linear over the history of the station. If Anthony’s efforts can be carried forward, then perhaps we will have a better idea in the future of what is happening now, but we cannot correct biases we do not, and cannot, know about!

This is, in my view, where Hansen et al are being misleading – despite the views of many here that they have a vested interest (and it may or may not be true, which may or may not affect their results), it is my view that they feel that what they are doing is accurate, reasonable and correct. And maybe it is. BUT, by not taking the time to ensure that everyone understands the error margins involved (and in some cases, actually downplaying them!), the case they present appears to be more certain than a closer look reveals – the error margins may be as much as 10 times higher than the effect they are describing, and their “adjustments” (valid or not) are also around the same amount as the effect they describe. Such things do not inspire confidence in this reader of the “robustness” of their data and conclusions. The contortions of Mann et al to “defend” the HS, and the use of the same defense in this case (“it doesn’t matter”) is also somewhat concerning. As I believe Steve Mc was trying to say, what *does* matter? And perhaps more importantly: why?

As Christian noted, conditions do affect error. In defining an error rate, the maximum probable error has to be considered. An MMTS CRN 5 located on the north side of a building, without being blocked from the wind, may be fairly accurate. On the south side of the building, blocked on two sides from airflow, you would see the stated error.

While my French is one heck of a lot worse than Christian’s English, the CRN 5s have entirely positive biases due to the housing.

I have a question for JohnV: did you repeat the steps shown in Hansen (2001), or the code released for GISTEMP? I think this is an important question because if the answer, based on my reading of your posts, is no, then why does the raw data match the adjusted data?

I suspect that what we really need soonest, is for the 256 lights = 0 sites that Hansen uses to adjust for UHI to be surveyed. Since they are used to adjust all the other stations, it would be nice to know, at least as of now, how much heat error can be expected from them.

Thanks for your critique. I agree with your scientific analysis about the suspect data. For your interest, I was not stating anything as factual, merely explaining the output results of John V’s program, which I had nothing to do with, by the way. I was not championing the results as some naive cheerleader, just outlining some interesting results as talking points.

I would have liked to discuss my graph, but realized later that I needed to post it first at a URL, which I do not have. The graph would likely be considered lame by most of you anyway.

There are many biases with the data and it is true we only have a handle on a few of them. The idea of a blog of this type is to work our way through them, and allow outsiders to offer their two cents. More often than not, outsider propositions do not amount to a hill of beans, but sometimes the outsiders suggest something that does amount to a great aha moment. Perhaps my post #90 was not one of those moments.

The idea of cooling our jets is anathema to the entire enterprise of a blog, in my opinion. I also disagree with your assessment that this is not science. I think science is a lot more like Michael Polanyi’s tacit knowledge combined with Karl Popper’s falsification principle in place of Kuhn’s paradigm shifts. We are doing that here at ClimateAudit, maybe not in an exact way, but certainly in a general way. Discussing ideas in a commons like ClimateAudit helps ferret out information more efficiently and harnesses the truth behind the Wisdom of Crowds. Besides, discussing the interpretation of the data, despite all its hidden problems, is fun. We are not writing papers to be published in the journal Nature. Although I suspect some of you secretly wish that is precisely what Steve’s analysis will amount to in the end.

When a vast network of outsiders joins forces, there is the possibility we will run off the rails. Then someone like yourself pulls us back on topic and highlights the errors of our ways. To cool our jets after there is so much buzz would risk severely curtailing that buzz. If we cannot take a residue of pleasure in all this analysis, but must rather adhere to some rigid calculus, then in my opinion it would be a lot less therapeutic and would slowly become work – yuck.

Well then that tells us that Hansen is not doing a very good job of adjusting his data and many of the problems that are adjusted out of USHCN v2 are still in the GISTEMP adjusted data. Is this correct?

That’s not the conclusion I would make.
GISTEMP matches CRN12 very closely, and is not quite as close to CRN5. (CRN5 shows a warming trend relative to GISTEMP and CRN12). I would suggest that the GISTEMP procedure “fixes” the preponderance of CRN5 stations to yield a more accurate temperature history than would raw station data alone.

ROFL John! Look at what you did! You compared GISTEMP to UNADJUSTED GHCN data and found that they matched very closely using different smoothing. Face it, GISTEMP does NOT remove UHI, microsite effects, station moves or changes in (or problems with) equipment.

Could you please respond to JS (post #41) and Kristen (post #74) about the legitimacy of comparing GHCNv2 unadjusted data to GISS adjusted data? Is this not comparing apples to oranges?

Isn’t that the point? I was not attempting to duplicate GISTEMP — that would be a pure programming exercise. I was looking at the station data to check for differences between CRN12 and CRN5. There may have been an expectation here that GISTEMP would match CRN5 and that CRN5 would be very different than CRN12. However, GISTEMP is actually closer to CRN12 — what does that say about GISTEMP?

#97, #99 steven mosher:
Most analyses seem to use 1961-90 but GISTEMP uses 1951-80.
It’s only a shift up or down — it has no effect on the shape of the trends.

So, as you try to match GISS output, you might want to consider the sites and time periods they excise.

It should be said again — I’m not trying to replicate GISS. I’m trying to make an open program that can be used by anyone to test different hypotheses about station data. If it works well, and it is well used, it may improve the temperature record.

Are you presuming that CRN12 is pristine (with no warming bias) just because those stations are scored as having no microsite issues at the present time? And that if they then trend like GISS adjusted, that proves GISS adjusted is accurate? Just because CRN12 sites are currently scored acceptable doesn’t mean that they don’t suffer from numerous other warm biases (UHI, station moves, TOBS, etc.). Or am I missing something? I think it is an apples-to-oranges comparison with limited usefulness if you don’t restrict the variable to just microsite scoring.

Many of the CRN1-2 stations are “bright”. So to be reassured of relatively uncontaminated sites, one needs to restrict the CRN1-2 sites to “dark” sites. Unfortunately this reduces the candidate population a lot. So far there are 13 stations that are CRN1-2 and GISS-bright =0. At 6% of the earth’s land surface, this is equivalent density to global coverage of about 215 stations – it seems to me that I’ve seen suggestions that this would be enough to measure earth’s temperature. There are 9 CRN1-2 stations that are GISS “dark”. GISS “dark” is not necessarily consistent with GISS numerical lights =0. The survey is about 1/3 complete, so maybe there will be about 40 stations by the end.
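The density equivalence above is simple arithmetic and can be sanity-checked; the only inputs are the 13-station count and the 6% land-surface figure given in the comment:

```python
# 13 CRN1-2, GISS-bright=0 stations sit in a region covering ~6% of
# Earth's land surface. The same station density extended to all land:
stations = 13
land_fraction = 0.06
global_equivalent = stations / land_fraction
print(round(global_equivalent))  # 217, i.e. roughly the "about 215" quoted
```
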

#113 Mike H.
#167 Ian McLeod:
I agree with Ian on this. With no offence intended to SteveMc, this site has always been about posting incomplete analyses and seeing what happens. It’s real-time, down-and-dirty peer review. That’s what gives the site its vitality.

Intentionally or not, most of the articles on this site are very critical of the IPCC consensus view. My results seem to validate GISTEMP for the USA lower 48. Some people will try to extrapolate from that, but many extrapolate from SteveMc’s articles as well.

#114 Mike H:

Perhaps it is time to establish a cooperative along the lines of the open source software movement. Why not establish a thread on this and see where it goes?

Now that I can agree with.
Discussions about the code itself should probably not be on this site, but a discussion about the goals of a software project and the algorithms to be implemented should.

#174. What this says is that using an unlit=0 criterion in a system with a lot of rural stations stretching back to the 1930s is probably a reasonable concept as these things go. When you look at NASA station versions relative to NOAA (and presumably CRU) versions, the NASA versions look more plausible.

The conundrum that’s down the road is why the NASA history warms more than CRU in recent years, even though it has less warming in cross-compared U.S. stations.

#177 Aaron Wells:
I agree that a rural vs urban comparison is needed. Remember the context of my posts on Friday night though: there was much excitement and anticipation in the comments. Many posters were very confident that the CRN5 results would show much warming, and the CRN12 results would show little warming. That’s what I was looking at.

I was wondering about Hansen’s arbitrary 1000 km radius for regional averaging. This is a parameter, in my opinion, crying out to be manipulated. I live approximately 25 km outside of Toronto. The climate in Mississauga is, as a rule, consistently quite different from that of metropolitan Toronto. Why 1000 km? Why not 100 km, or 500 km?
Anyway, it looks easy enough to change in your program, but I am unclear whether the data extraction will still work. Any thoughts?

On that note, why use rectangles to calculate area for a cell while using all stations within a certain circle? Why not just use each station once (average all stations that fall within cell boundaries).

How fine of a grid will the data allow us? 1500km seems huge (think East LA county where you are 2hr drive from snow, beach, and desert).

“I guess well have to wait for a more thorough analysis. Its good to see that youre a couple of steps ahead and have already reached your conclusions.”

John, I have been to some of these sites. I have felt the heat coming off an air conditioner that was 2 feet upwind from a temperature sensor. I have felt the heat coming from the blacktop next to more than one of these stations. Three of the CRN 5 stations surveyed by myself or my volunteers were part of your graph. There is no way that you can tell me that comparing unadjusted data from these stations to adjusted GISTEMP data, and then finding such a good match, can possibly lead to any conclusion other than that there is a problem with the GISTEMP adjustments.

Can I make a plea for all posters to actually label up their charts with what version of whose data they are using at any given time (SteveM excepted, of course).

It is very hard to see whose data is being used, and what version, containing adjustments for what.

Secondly, would it be possible to do a running-total chart for NASA GISS US data in all its different states – that is, raw, TOBS, and then each further adjustment they apply, up to and including the final total – and also some kind of chart showing the effects of changes in the different types of adjustment? Otherwise it seems to me that the casual observer will be completely unable to see what is happening to the data.

PS Since I wrote this, it seems that NASA GISS has completely changed its basic data, however perhaps it would be possible to use the output of the duplicated NASA code (if that project is still happening) to generate a timeline of snapshots of twentieth century temps as NASA GISS has morphed them over the last few years, and then flow that into the output using the new current database. Maybe one of those nice animated GIF type graphics.

PPS I suppose NASA GISS would need to provide an archive of all the different versions of the code used by them over the last several years so that those interested can see what has been done.

PPPS I suppose NASA GISS will now say they “moved on” etc., but that will only make the fall harder when it happens. People will say they had one opportunity to correct their mistakes, said they had corrected, and then were found to be still wrong. Don’t the Americans have a saying about “Fool me once….”

Sorry for my very bad English; I am a French observer for Météo-France (the national weather service of France).
…5 °C is a possible instantaneous error on a given day; it is not a certainty and it is not systematic. The classification does not tell you everything. A station may be class 5 officially because of an excessive shelter height (> 2 m in an open site; for example, 2.5 m is class 5) while being as good as a class 2 or class 1 station.

thanks Christian. your english is fine and your comment is of enormous value!

RE 187: sod, I am sure you have a point but could you get to it? Yes, we understood that the discussion is very disturbing, people learning that deeply held beliefs are false often feel this way.

Christian says that a site where the altitude of the shelter is off by a meter is a class 5 site for Météo-France. I myself do not see how that relates to being adjacent to, or above, an artificial heat source. But then I looked up what a class 5 station is to Météo-France. Based on Leroy and Météo-France, I do not see how just being 1 meter off in height would make a station class 5. Maybe Christian could explain the difference.

Now I did find the following. In 1997, Météo-France defined a site classification for some basic surface variables.

* Class 1 is for a site following WMO recommendations
* Class 5 is for a site which should be absolutely avoided for large scale or meso-scale applications.
* Class 2, 3 and 4 are intermediate

So, sod, how about you stating what you want everyone to look at and say ‘Ah! you were right!’

#183 Ian McLeod:
I am curious about the 1000km radius as well. I plan to do a sensitivity analysis.
*If* my quick linear algebra is correct, the radius will have an effect on regional temperatures but should not affect the average over the total region.
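For readers following the radius discussion: GISTEMP weights each station linearly down to zero at the edge of the radius (Hansen and Lebedeff 1987). Below is a minimal sketch of how the radius parameter enters a regional average; the function names are ours and any station coordinates and anomalies fed to it would be illustrative only:

```python
import math

def distance_km(lat1, lon1, lat2, lon2):
    # great-circle distance via the haversine formula, Earth radius ~6371 km
    p = math.pi / 180  # degrees -> radians
    a = (math.sin((lat2 - lat1) * p / 2) ** 2
         + math.cos(lat1 * p) * math.cos(lat2 * p)
         * math.sin((lon2 - lon1) * p / 2) ** 2)
    return 2 * 6371 * math.asin(math.sqrt(a))

def weighted_anomaly(point, stations, radius_km=1000.0):
    """Linear distance weighting: w = 1 - d/R, falling to zero at the
    radius edge. stations is a list of (lat, lon, anomaly) tuples."""
    num = den = 0.0
    for lat, lon, anom in stations:
        d = distance_km(point[0], point[1], lat, lon)
        w = max(0.0, 1.0 - d / radius_km)
        num += w * anom
        den += w
    return num / den if den else None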

#184 Clayton B:
“On that note, why use rectangles to calculate area for a cell while using all stations within a certain circle?”
That’s part of the reason for using *small* rectangles. They can more closely match the shape of the circle.
I would like to move to a better shape than rectangles eventually. Rectangles break down at high latitudes. One option I’m looking at is sub-dividing the faces of an icosahedron into roughly equilateral triangles.

“Why not just use each station once (average all stations that fall within cell boundaries).”
I don’t like this approach because each station is put in an arbitrary box. If a station is very close to a boundary line, then it should have an effect on the neighbouring grid cell as well. More importantly, moving the boundary line could affect the average. (For example, by moving a station with a strong warming trend from a box with lots of stations to a box with few stations).

#185 Kristen:
I understand your passion about this.
Our disagreement is where you claim that there can be only one conclusion from my results.
Your conclusion is definitely possible, but that does not make it the only conclusion.

A few key data points:
– CRN12 shows less warming than CRN5, as you would probably expect;
– GISTEMP shows less warming than CRN5 but more than CRN12;
– GISTEMP matches CRN5 very well;
– GISTEMP matches CRN12 even better.

You claim that GISTEMP matching CRN5 (the worst 15% of stations) could only mean GISTEMP does not correct for the CRN5 problems.

One could claim that GISTEMP matching CRN12 (the best 13% of stations) could only mean GISTEMP does a great job of correcting for CRN5 problems.
One could claim that every station with artificial warming is offset by a station with artificial cooling.

I am not making either of these claims, but the data could be used to support them.

A difference of one meter is a relatively gigantic difference for altitude and temperature comparisons, because of the temperature characteristics of the air masses found closest to the surface. There is a tremendous difference in what is taking place within the first three or four meters of air space above the land surface. Asphalt and concrete are often hot enough to fry an egg, so you can easily see with your eyes what happens within the first few centimeters above the surface of a frying pan frying an egg. Less easily observed with the eye, and more easily detected with instruments, is the temperature gradient found within the first three or four meters above a hot land surface. Throw in the heating and cooling of diurnal cycles, changes in humidity and marine intrusions, and mix in variable winds, variable vorticity patterns, and shading from nearby objects. Such complexities can, and most often do, result in great differences in temperature gradients, completely apart from the much less influential differences resulting from micro-lapse rates. Changing the altitude of the instruments in such a complex environment makes the already problematic, perhaps dubious, task of producing measurements useful for comparisons between observation stations a virtually hopeless one. The existence and scope of the problem can be confirmed even by some simple experiments with measurements outside a home.

#177 & #178
Steve & Aaron:
I absolutely agree that the best benchmarks are the sites with historically minimal micro-site contamination and no UHI effects (i.e., unlit, or with a population density no greater than that of the world’s land average, about 43 per sq km). The “historical” aspect is important because I am assuming that the satellite records, when handled appropriately, will take care of the record going forward.

Historically minimal micro-site issues, no UHI, consistent location, minimal TOB. Again, why not look at the 15,000 lighthouses around the world with long term temperature records. They could be extracted from GISS or HadCRU, I think, with reference to name or latitude and longitude. As I mentioned in a previous post, a list is here: http://wlol.arlhs.com/index.php?mode=alpha Nothing wrong with cherry picking good fruit. Any takers?

I’ve read your piece on TOBS. When the USHCN data is TOBS-adjusted, do they use fixed values? Do they care about where the site is located? Do they make different adjustments for different seasons? Or does one number fit all?

With the evidence at hand what can we say about the GISS temperature records in the US and instrumental temperature measures in general?

We certainly should not expect to see no warming trend over the past decades in the US. Satellite records – which, as far as I have been able to research to date, are totally independent of surface measurements – also show a warming trend, though it varies depending on which algorithms are applied. The adjustments made to the satellite measurements are every bit as complex and numerous as those applied to the surface records, and all of this must add uncertainty to the results. After we acknowledge all the uncertainties in the measurements, we can then perhaps argue about how certain we are of a relatively small trend in the US compared to the ROW.

We have seen from the audits that the USHCN station data show a relatively smaller percentage of higher-quality stations than we are led to believe by the GISS maintainers. While the GISS-calculated average global, or even US, trend may or may not be affected by these quality problems, they have to add uncertainty in terms of the plus-or-minus we attach to the measurements. As a matter of fact, the GISS calculations used to provide an average US trend from less-than-reliable local measurements put added uncertainty into the regional and local records. I have a difficult time discerning how much of the difference between regional and local trends has been reduced by the GISS calculations, and how much local measurement error has added to the differences. When looking for the meaning of a global trend in terms of local trends, this becomes an important factor and should be known to some degree of accuracy.

One aspect of these analyses with which I strongly disagree is taking quality-assessment snapshots of the stations and attempting to use that information to say anything about station quality years ago, or assuming implicitly that it has been constant over a long time period. For discovering any potential biases in warming, one almost has to confine one’s analyses, at this time anyway, to very recent times. If poor station quality can affect the temperature in both directions, then that should be evident in comparing the variability of poor and good stations. That variability would add to the uncertainty of the trend, and particularly to variability from region to region and locality to locality.

There comes out of this discussion another source of uncertainty, albeit an intangible one: the reactions from NASA to the audits. Their defensiveness makes one judge that adjustments to their records, while wholly legitimate, may be more likely to be somewhat one-way in nature. The various satellite record keepers, in my view, show more competition for legitimacy and more cross-critiquing than I see among the surface records. How can we have such large discrepancies between the various surface temperature records without some questioning of the assumed uncertainty in these individual records? Are we seeing this effect when looking at long-term differences between the quality categories?

“this site has always been about posting incomplete analyses and seeing what happens. It’s real-time, down-and-dirty peer review.”

You obviously have not been around this site very long. Its purpose is auditing – in order, among other things, to root out incomplete (and incorrect or imprecise) analyses.

It is certainly real time review, but it is certainly not “down and dirty”. It contains some of the most refined and sophisticated mathematical, statistical and scientific analysis around.

Publishing analysis based on incomplete and statistically biased data, as you have done, is not in keeping with the standards and intent of the site. In fairness that has to include Steve’s post on this thread.

It will undermine what the site and Steve stand for: statistically impeccable analysis.

This is not to gainsay your considerable contribution in terms of providing open code.

USHCN TOB adjustments vary by month of the year and by latitude. I forget if they also vary by longitude, but there is some other factor (or factors) they use besides month of the year and latitude; I just don’t recall what offhand.
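The recollection above (month plus latitude, plus some other factor) matches the structure of the Karl et al. (1986) TOB model that USHCN uses. Purely to illustrate the lookup idea being discussed – every number below is invented, and the real adjustments come from a regression model, not a fixed table:

```python
# Hypothetical TOB adjustment lookup (deg C) keyed by (month, latitude band).
# The values are invented for illustration only; actual USHCN adjustments
# are computed from the Karl et al. (1986) model, not a fixed table.
TOB_ADJ = {
    (1, "30-40N"): 0.10, (1, "40-50N"): 0.15,
    (7, "30-40N"): 0.25, (7, "40-50N"): 0.30,
}

def lat_band(lat):
    # crude two-band split, again purely for illustration
    return "30-40N" if lat < 40 else "40-50N"

def adjust(temp_c, month, lat):
    return temp_c + TOB_ADJ.get((month, lat_band(lat)), 0.0)

print(adjust(20.0, 7, 35.0))  # 20.25: July reading in the 30-40N band
```

The point of the structure is simply that the correction is not "one number fits all": it depends on when in the year and where the station sits.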

Mike:
You are right that I have not been around this site for long. I happened to stumble in at a time when the focus turned to code. I am contributing what I can.

My assessment of “down-and-dirty” peer review was regarding the comments, not the articles.

I have some new results that I will be sharing soon. I will be sure to provide lots of disclaimers that my code has not been reviewed, there are geographical biases, etc. Hopefully it will kick-start the code review process. (Yes, I know that my promised website is not available yet — I am still waiting on my web host).

Re: #190: nice explanatory post, JohnV. Precocious scientists need to learn not to prejudge their conclusions and how to eliminate alternative hypotheses.

Re: #196: Lighthouses might be influenced by local conditions, particularly coastal sea surface temperatures. Enhanced coastal upwelling would bring more cool water to the surface and thus indicate a localized cooling somewhat unrelated to local prevailing atmospheric conditions.

Re: #198:

We certainly should not expect to see no warming trend over the past decades in the US. Satellite records – which, as far as I have been able to research to date, are totally independent of surface measurements – also show a warming trend, though it varies depending on which algorithms are applied.

Warming trends indicated by the data are further substantiated by a wide variety of ecological indicators. Here are just a few:

It’s worth noting that while the ROW may have a “Where’s Waldo?” data problem for finding warming trends, Waldo is showing up in ecological indicators over a very wide area. For me, it validates the centuries-long tradition of science practiced by the naturalist.

Many have suggested that we need to look at only the rural CRN12 (good) stations. I think we can all agree that these are the stations least likely to have problems. I have done the analysis and would like to share my results.

First, some disclaimers:
– THE CODE HAS NOT YET BEEN REVIEWED
– THERE ARE GEOGRAPHICAL BIASES IN THE STATION DISTRIBUTION
– CONFIDENCE INTERVALS HAVE NOT BEEN CALCULATED

And now for the results:

I selected all of the rural stations with CRN site quality ratings of 1 or 2. (I excluded stations in small towns and at airports). A complete list with their coordinates is included below (you can type the coordinates into Google Maps or Google Earth to see the location).

=====
I will leave the conclusions to others.
Let me finish by repeating my disclaimers:
– THE CODE HAS NOT YET BEEN REVIEWED
– THERE ARE GEOGRAPHICAL BIASES IN THE STATION DISTRIBUTION
– CONFIDENCE INTERVALS HAVE NOT BEEN CALCULATED

If we look back at the Orland-Marysville comparison that spawned this, here are three different adjustment comparisons:

First here is the difference between adjusted USHCN versions, showing a very strong relative increase of Marysville to Orland net of the adjustment process.

The comparison with GHCN adjusted versions is similar – also a strong increase of Marysville relative to Orland.

Now here’s the comparison with GISS adjusted versions, where the GISS adjustment has a much different impact. Before we started on the examination of GISS, I observed that, if GISS was right, then Parker was wrong: urbanization does impact trends and needs to be dealt with – and, in fairness to GISS, at least, in the U.S. they are trying.

As I’ve mentioned previously, this yields a brand-new puzzle: in the US, at least, GISS seems to adjust for urbanization more than GHCN. Since CRU and NOAA use GHCN versions without the GISS adjustment, why does the GISS global total rise faster than the CRU series? Long way to go yet, folks.

#209 steven mosher:
I worked with the robotics division of Spar Aerospace (now bought and sold a few times). We were validating the simulation code that they used to simulate space station construction tasks.

no Sam +6 +7 +8 greater than in a positive direction.

You’re mistaking what I meant, I think. Off is off. Maybe I shouldn’t have used signs, but I didn’t mean them in a numerical sense. If it’s an error of ≥ 5 degrees C, think of it this way: I’m talking about (wow, this sounds weird) +6, +7, +8 in a negative direction. lol :) Let me not use signs.

If the actual temp is 25 and I read 30, it’s off 5 degrees. If the actual temp is 25 and I read 20, it’s off 5 degrees. We’re not really dealing with negative numbers here but with physical values of higher/lower. Off 5 is off 5.

By the same token, if the temp is negative 5 and I read negative 12, we’re off 7, and it’s the same as reading positive 2 degrees — off 7.

Just like if my negative-ground battery is supposed to be at 7 volts but it’s at 12, I’m 5 volts higher on my negative voltage. (−7 V to −12 V is 5 V “higher”, and yes, that looks odd too!)
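The “off is off” point is just error magnitude – an absolute value. A one-liner makes all four of the cases above explicit:

```python
def error_magnitude(reading, actual):
    # "Off is off": the size of the error, regardless of direction
    return abs(reading - actual)

print(error_magnitude(30, 25))    # 5  (reading high)
print(error_magnitude(20, 25))    # 5  (reading low, same magnitude)
print(error_magnitude(-12, -5))   # 7  (reading -12 when actual is -5)
print(error_magnitude(2, -5))     # 7  (reading +2 when actual is -5)
```
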

John, make no mistake. If you are a committed FOSS guy, you have experience in the embedded world (robotics counts), and you can whip out pristine code (yes you can) in short order (well, duh), then maybe we should talk.

I have some clients in your neck of the woods. you have mad skills.
I’ll hook you up if you like.

I would also add the disclaimer that the station locations have not yet been verified, given that the very first station on your list is an airport ASOS. I cross-checked your first location against the MMS for Muscle Shoals:

34.744100, -87.599700

I stopped checking at that point. It looks like you used the truncated locations from the master list, instead of the more accurate ones found in the MMS and especially the site surveys. I would recommend that as you assert each as having rural siting away from towns & airports, you verify each station’s actual siting. Since you are using Anthony’s list for 1 & 2, you should also use the locations that were verified with each site survey instead of following the master list locs. There exist significant discrepancies between master list locs and actual site locs in many cases.

Re: 205
What I see (with unexpert eyes) is that GISS is systematically somewhat cooler in the 1930s and somewhat hotter in the 1990s, this “somewhat” being some 0.1 deg or so.
It’s a (slight) difference – but in exactly the direction we would suspect it to be, knowing Hansen. We don’t suspect Hansen of intentionally “cooking the books” or making gross errors. It’s just the so-subtle tenth of a degree here, another one there. So the graphs confirm my suspicion :-).

Of course – this is totally ridiculous. Given the quality of the data and the magnitude of error margins, any conclusion based on differences of fractions of degrees is equally nonsense. We must recognize that the data doesn’t support any trend claim.

If 1&2, why not against 4&5? Or 1 vs 5? Verified condition and verified rural only?

To speak to what sod was bringing up: of course it’s possible for a sited class 5 station to be accurate and a sited class 1 station to be biased, but up until now that’s mainly based upon siting standards alone. So if a class 5 site turns out to be class 1 functionally, it’s then no longer a “class 5”. I don’t think we’re really at the step of doing anything other than basing the ideas upon siting. Certainly, using siting for now is as valid as using satellite-derived light density to classify a station as rural or not, I’d think.

I think about this sort of thing as follows: a) more stations need to be surveyed; b) once there are enough stations (probably now), the geographic distribution is filled in by selectively surveying stations; c) the stations picked are physically verified as to present condition, not just siting standards; d) only class 1 stations in good condition now, and likely in the past, are used; and e) the stations are evenly distributed geographically. I don’t know if d) is possible to do accurately in the future, or whether e) is possible, number-wise. I would think one goal, perhaps unattainable, is to not have to make adjustments.

Indeed, this station is right in the heart of downtown Tocca – I’ve seen it. If such an error exists, it is very likely that others, possibly as significant, exist within the system, as well. Perhaps this is fodder for another audit direction.

it is a possible instantaneous error on a given day; it is not a certainty and it is not systematic.

so a type 5 station MIGHT have an error bigger than 5°C at a certain time on certain days.

as the daily temperature used in climate research is calculated from the daily min/max average, that 5°C might quite often have ZERO effect.
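sod’s point can be made concrete (the hourly values below are invented): a transient error shifts the (Tmin + Tmax)/2 daily mean only if it lands on, or creates, the day’s extreme.

```python
def daily_mean(temps):
    # daily mean as commonly computed for these records: (Tmin + Tmax) / 2
    return (min(temps) + max(temps)) / 2

hourly = [10, 12, 16, 20, 22, 18, 14, 11]   # simplified day, deg C

spiked = list(hourly)
spiked[2] += 5            # +5 C instantaneous error at a mid-range hour
spiked_at_max = list(hourly)
spiked_at_max[4] += 5     # the same error at the hour of the daily maximum

print(daily_mean(hourly))         # 16.0
print(daily_mean(spiked))         # 16.0 -- the spike stayed below Tmax
print(daily_mean(spiked_at_max))  # 18.5 -- at the max hour, it shifts the mean
```

So whether such errors matter for the trend depends on when they occur and whether their frequency changes over time, which is exactly the station-history question.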

and again: the only way that station type might influence the climate TREND is when stations CHANGE over time.

If we cannot know the full station history, we must assume the worst – that all are CRN 5; that there may be biases greater than 5C;

how would you reply if I called for all station data to be treated as 1°C too low, to avoid missing potentially catastrophic climate change?

RE 187: sod, I am sure you have a point but could you get to it? Yes, we understood that the discussion is very disturbing, people learning that deeply held beliefs are false often feel this way.

my “deeply held beliefs” have been CONFIRMED so far, by John V., who did excellent analysis, and by Christian, working for Météo-France.

So, sod, how about you stating what you want everyone to look at and say Ah! you were right!

none of you could tell me what Leroy’s error was. None of you did any reading of his work. None of you could clearly tell me that type 5 stations can produce BETTER results than type 1.

all of these points have now been confirmed, in the way I suspected they would be. So I’m not surprised by the findings of John V. Basically, the majority of expectations about those errors were too high, because you did NOT know what the error actually was…

Siting standards class 1: Could turn out to be operationally class 5
Siting standards class 5: Could turn out to be operationally class 1

What are the odds? I’d say probably low that they’d be backwards. We don’t know for sure. I’d say any conclusion is one that is being jumped to. More likely, a class 1 might turn out to be a 2, or a 5 turn out to be a 4. There’s no way to tell without more study of the key sites that would be used to attempt to cover the US with the readings of the most accurate stations, or however it’s done.

John V.’s analysis does seem to point to things being pretty close, but as some have been saying, we can’t tell with the numbers we have, not for sure. I’ll continue to hold off on my conclusions for the time being, until I see all the data, and we know everything that’s going on.

I was trying to log in at your new webpage to add some comments about the code, but the login menu did not like any of the usernames I tried. BTW, I tried several variants. It might not be configured the way you think. Is it case sensitive, alphanumeric only, or what exactly? Please help.

John V,
One thing that you may not have been aware of when you used NASA GISS data to determine your rural stations (rural should be less than 10,000 population): NASA GISS does a lousy job of keeping up with population because they prefer to use Hansen's night lights urbanization scheme. Below is the population added to stations that should not be labeled “rural” from your list of 24 stations in 17 states. It should be 17 stations in 12 states. All of the population figures are from the 2000 census.
I also ask Anthony Watts to reconsider Electra PH and Tombstone; I think he was too generous in rating them CRN 2.

Re #205 John V., thank you for continuing to share your good work and for clearly stating the limitations of the analyses to-date.

I’d like to share a map which helps emphasize John V’s point about possible bias due to lack of geographical balance. The issue centers around the fact that the US is large enough to have regional climates which, to some extent, have their own trends.

The map is here. Ignore the colors. The number in each of the climatic regions is the temperature trend (decadal, 1895-2006, degrees F). The red dots are the (approximate) locations of the rural stations used by John V. Data source is NOAA.

As indicated, the southeastern quarter of the US has seen essentially no net warming over the last 110 years. It has but three dots whereas six would be closer to representative.

The warmed western quarter, on the other hand, has 10 or 11 of the 24 stations.

Does this make a big difference? No, but it does make a noticeable difference.

Again, as noted, John V plainly acknowledges the geographic limitation of the work to date.

Your map is very interesting. What it shows is that the hot and humid part of the US has shown no measurable warming despite tremendous population growth. Just to stir the pot, one could use this to make the case that H2O is a negative feedback.

Thanks for the map! This is what I was thinking of in my comment in #184, using each data point once within some meaningful grid (in this case, geographical breakdown of US).

I’m not sure if you manipulated John V’s code to obtain your numbers or did it “manually”, but is there an easy way to calculate an annual average based on the area of each region? How does this compare to John V’s method? I’ll try to check it out…
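For what it's worth, the area-weighted average asked about here is a one-liner once you have a mean per region; a minimal sketch, with made-up region names, areas, and anomalies (not the actual NOAA regions or John V's numbers):

```python
# Sketch of an area-weighted national average from regional means.
# Region names, areas, and anomaly values are invented placeholders.
regions = {
    "Southeast": {"area_km2": 1.5e6, "anom_c": 0.02},
    "West":      {"area_km2": 2.0e6, "anom_c": 0.15},
    "Northeast": {"area_km2": 0.5e6, "anom_c": -0.05},
}

def area_weighted_mean(regions):
    # Weight each regional mean by the region's area.
    total = sum(r["area_km2"] for r in regions.values())
    return sum(r["area_km2"] * r["anom_c"] for r in regions.values()) / total

print(round(area_weighted_mean(regions), 5))  # 0.07625
```

The same weighting works with gridded output: replace regions with grid cells and areas with cell areas (or cos(latitude) factors).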

Since the accuracy of the temperature sensor used is on the order of ±0.1°C in laboratory conditions, your analysis shows that the measured temperature rise using only good stations is well within the margin of error of the measurements.

I think sod should re-read “Christian September 17th, 2007 at 4:45 pm”. Especially the last part of the message is very clear in pointing out the differences in maximum temperature between class 2 and 4. “Very likely” the difference is bigger between the class 1 and 5.

So I’m getting too tired to make much sense, but I tried looking at geographical differences using the data in JohnV’s Stations.csv file. I couldn’t decode the C# to determine if the data is after any sort of averaging or adjusting (I don’t think JohnV was averaging on each station – just on each grid point??).

Here’s what I got using Excel’s SLOPE formula. Red stations have trend greater than 0.1 degC/decade. Yellow between 0.0 and 0.1. Green between -0.1 and 0.0. Blue less than -0.1.
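The trend/colour scheme described here is easy to reproduce outside Excel; a sketch with an invented station series (the `slope` function is the same least-squares fit Excel's SLOPE performs):

```python
# Sketch of the colour classification above: least-squares trend (same math
# as Excel's SLOPE) converted to degC/decade. The station series is invented.
def slope(ys, xs):
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

def colour(trend_per_decade):
    if trend_per_decade > 0.1:
        return "red"
    if trend_per_decade > 0.0:
        return "yellow"
    if trend_per_decade > -0.1:
        return "green"
    return "blue"

years = list(range(1900, 2007))
temps = [10.0 + 0.002 * (y - 1900) for y in years]  # 0.02 degC/decade warming
trend = slope(temps, years) * 10.0                  # degC/year -> degC/decade
print(colour(trend))  # yellow
```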

If broken into David’s regions, I get warming trends that are different from David’s. I am using the monthly data from JohnV’s stations.csv.

Stations 42572489002 and 42572489003 are 35 miles apart and show completely different warming trends. 9002 really sticks out as it has the largest slope (positive or negative) of all of the stations. I think something is fishy:

RE237 While it is a fact that the mercury max-min thermometers and the MMTS both read out in resolutions of 0.1°C, a point that is missed is that the observer rounds the reading to the nearest 1°C for reporting on the form that goes to NCDC, thus negating that resolution in the reported data.

The accuracy claim may not hold over time, as neither the mercury nor the MMTS thermometers are field calibrated on a regular basis.

RE232 Kristen I visited Electra PH, and while the station had been discontinued I was able to get the B44 form from the state climatologist which showed the placement of the CRS to be well away from the main powerhouse building and the Electric transformer substation, and upwind of both, and well beyond 30 meters. Given that info I’m comfortable with the CRN2 rating.

RE221 John V. That's a great resource, thank you. I appreciate your willingness to share for scrutiny and replication, which is not always easy. Hansen et al could learn a few things from you. What is most important is finding out what is right, not who. I welcome additional study on the surfacestations.org siting data, and I will continue the project to be as complete as is practically possible (given we may never be able to add some stations) so that we have a larger data pool to work from.

What is really most important is that a timeline for each station is developed, because the biggest weakness of the rating system is that it has no ratings over time. Scrutiny of the B44 forms and station moves along with USGS historical photos could help solve this issue.

RE218 “I assumed the locations on the spreadsheet from Anthony Watts were accurate.”

Please don’t imply that I created errors in lat/lon, the lat/lon location data is from NOAA/NCDC/MMS and transcribed on the spreadsheet.

the lat/lon data in the spreadsheet has two functions

1) Columns C, D in red are for updating the CONUS plot map embedded in the spreadsheet, those columns are manually filled with data from columns K, L which are provided for reference.

2) Columns V, W are lat/lon taken from NCDC’s MMS database; if they are inaccurate (and some are), the fault lies there. For updated and more accurate lat/lon, consult the site survey forms where GPS was used.

Stations 42572489002 and 42572489003 are 35 miles apart and show completely different warming trends.

These are Susanville CA, and Quincy CA. I did a quick non-scientific check of location and determined that Susanville is east of the northern Sierra Nevada range, fronting the Nevada Great Basin, and Quincy is IN the mountains (might be on the western slopes). So they are climatically distinct.

RE: #246 and 247 – from my own on the ground experience. Quincy sits in an intermontane basin which drains via the Feather River into the Sacramento. To get there from the coast, after heading diagonally across the Sacramento Valley, you take the Feather River Canyon route (SR 70) through the first “range” (which is actually a continuation of the so called “foothills” of the Sierra – the same hills that Hwy 49 follows further south). This “range” is dryish, with old metamorphic, extrusive volcanic and sedimentary rock – hence my comment about it being a northern extension of the “foothills.” After you go through via the Feather Route, you come out on an elevated plateau (not unlike some of the other significant piedmont forelands further south, such as Grass Valley, Big Oak Flat, etc). That’s where Quincy is. The vegetation is the typical upper “Gold Country” pine forest, land is reasonably level. They get decent rain during the rainy season (I think it’s about 35 or 40 inches per year). Just to the NE are the Sierra Nevada proper. At this northernmost point of that range, they are not very impressive, a mere 5 to 6K feet at the crest. You really can barely perceive that you have crossed them; eventually you are into Graeagle and the string of other towns that lie north of Tahoe, along SR 89. Those towns (and Tahoe) are in a downwarped and downfaulted basin, which is technically part of the Basin and Range. But the rainfall/snowfall, while not as impressive as the Western Slope of the Sierra (e.g. 40 and on up per year), is certainly enough to allow Ponderosa Pine growth. The forest is definitely more open than on the Western Slope (including Quincy). Drainage is, with the exception of the Feather’s (a very old feature predating the formation of both the Sierra Nevada and Cascades further north) basin, out into Nevada. To get to Susanville, you have to cross yet another range, which is technically one of the Basin and Range ranges.
It’s an extension of the range which divides Tahoe from the Washoe basin (e.g. where places like Reno, Carson City and Genoa sit). East of it is the true high desert. The signature N-S route there is US-395. Susanville is on 395 up to the north. It’s about half way between Reno NV and the Cal-Ore border. Definitely in a strong rain shadow. It looks just like so many places in the Interior West – you might just as well be in parts of Nevada, southern Idaho, or Utah. A nearly treeless high plain, with some irrigated pasture land, some wheat farming; alfalfa is a bit marginal due to how cold it gets in the winter. Definitely a wider annual and diurnal swing than Quincy, and far, far lower average RH. Many long stretches of temps sitting between 10 and 25 Deg F in the winter (whereas, in Quincy, it tends to hang between the low 30s and upper 40s). I think you get the picture ….

Steve Sadlow: I agree with that excellent description, except to note that Bald Eagle Mountain (el 7183′)in the Bucks Lake Wilderness lies about fifteen miles due west of Quincy. I’m not sure what effect that would have on climate, but it might differentiate Quincy some from the typical gold country foothills. (I spent some time in Jackson/Volcano/Pioneer.)

RE: #250 – I consider Bald Eagle Mountain to actually be part of an extension of the Cascades. It is all by its lonesome in terms of elevation, significantly higher than the rest of the range. You also note correctly that, unlike the other piedmont areas I mentioned, which are separated from the “lower foothills” and Central Valley by mere 2 to 4K foot (peak) ridges, the nature of the one separating Quincy from the lowlands is definitely more substantial. But not substantial enough to make it arid. Another factor is that Quincy is sufficiently far north in latitude that the sheer number of Pacific storms that rain on it far exceeds even the I-80 corridor. I bet they got a real good gully washer on that day in late July when Shasta got 2 in of rain.

Lest people mistake me for some kind of expert on either California geography or climate, I should disclose that I got most of the information above from an atlas titled “California Road and Recreation Atlas” by Benchmark Maps. Very nice combination of topographical and road maps. I recommend it.

Thanks for all the comments concerning my map-derived suppositions. Even setting aside the “quality” of the weather station siting, such differences highlight that regional climate trends can’t be derived from single stations, particularly where topography and terrain are highly variable. And that’s why the data has to be analyzed carefully.

#232 Kristen Byrnes:
“One thing that you may not have been aware of when you used NASA GISS data to determine your Rural stations (rural should be less than 10,000). NASA GISS does a lousy job of keeping up with population because they prefer to use Hansens night lights urbanization scheme. ”

Rather than using NASA’s rural designations, I actually attempted to use Google Maps to look at each site. Realizing that two digits for lat and long only give position to about 1km, I also tried looking at the surrounding area. I only chose locations that *seemed* to be clear of any population centres by at least 1km.

It seems I did not use the best set of coordinates. I’m surprised and embarrassed by the size of some of the cities for the “rural” stations I picked.

So Im getting too tired to make much sense, but I tried looking at geographical differences using the data in JohnVs Stations.csv file. I couldnt decode the C# to determine if the data is after any sort of averaging or adjusting (I dont think JohnV was averaging on each station – just on each grid point??).

The stations.csv file contains the same data as the input GHCN file, just grouped into columns so that it is easier to process. Actually, I think this part of the OpenTemp program is very useful by itself. You can provide it with a list of stations and a data file, and it will pull them out into a format that can be easily processed.

SteveMc:
When the program has a few more features, I plan to post a tutorial on how to use it. I hope you will consider linking to it and helping to make the community aware that it is available.

Heres what I got using Excels SLOPE formula. Red stations have trend greater than 0.1 degC/decade. Yellow between 0.0 and 0.1. Green between -0.1 and 0.0. Blue less than -0.1.

That’s an interesting map. Since the US temperature history has three clear periods, I wonder if it might be useful to break the temperature trend into three periods:

Early: 1900 to 1935
Mid: 1935 to 1975
Late: 1975 to present

It would be interesting to see how those trends compare at different stations.

I welcome additional study on the surfacestations.org siting data, and I will continue the project to be complete as is practically possible (given we may never be able to add some stations) so that we have a larger data pool to work from.

I believe the site quality data can provide us with lots of useful information to study. In particular, a small network of stations that are known to have accurate records will provide a great reference for studying the variance of other stations. This information may be critical for understanding the error bars for the rest of the world.

RE218 I assumed the locations on the spreadsheet from Anthony Watts were accurate.

Please dont imply that I created errors in lat/lon, the lat/lon location data is from NOAA/NCDC/MMS and transcibed on the spreadsheet.

Sorry for the implication. That was not my intent. My statement originally just said “the spreadsheet”. I realized I needed a better description, so I added your name.

Is there any chance of compiling the site survey forms into an accurate list of positions? I would like to include all stations in OpenTemp eventually, and set up the program to use Google Maps internally.

On Electra PH I was more concerned with calm days and nights. There are what look like 3 large cooling fans blowing in the direction of the temperature station, which is upslope from the fans. Also, if you look at the natural vegetation in the background, it is nothing like the gravel where the temperature station was.

As for Tombstone, there is a bunch of large metal objects in the photo titled “immediately north.” I think those are artificial heat sources. Also in the photo Tombstone_East it says the building is about 30 feet away.

I’ve taken the small list of rural CRN12 stations that JohnV posted and picked out the two that showed the largest trends in warming and cooling:

To be fair, I looked at plots of the other stations with warming/cooling trends greater than 0.1C/decade and did not notice any apparent anomalies. I wonder how removing these two stations will affect JohnV’s US trend? Perhaps the code should automatically trim from the analysis a certain number of stations from the top and bottom (based on warming/cooling).
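The suggested trimming could be sketched like this (the station IDs and trend numbers below are placeholders, not the actual CRN12 results):

```python
# Sketch of the suggested automatic trim: drop the n most extreme warming
# and the n most extreme cooling stations before averaging the rest.
def trim_extremes(trends, n_trim):
    # trends: {station_id: trend in degC/decade}
    ranked = sorted(trends.items(), key=lambda kv: kv[1])
    return dict(ranked[n_trim:len(ranked) - n_trim])

# Placeholder trends, loosely echoing one outlier station like "9002" above.
trends = {"9001": -0.4, "9002": 0.9, "9003": 0.05, "9004": 0.1, "9005": -0.02}
print(sorted(trim_extremes(trends, 1)))  # ['9003', '9004', '9005']
```

Whether a fixed-count trim is the right robustness measure (versus, say, a median or an outlier test) is a separate question; this just shows the mechanics.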

If Port Gibson, MS is determined to be unacceptable it leaves a huge void around Texas. Maybe I should get to work, Anthony…

I’m glad I could help. I think you have learned the first lesson in NCDC and NASA data from their websites; It’s all junk until you verify it with your own two eyes. Hopefully you will be able to keep on posting your graphs and data and computer stuff here where everyone can see and understand it. A while ago a very prominent sceptic scientist sent me one of his studies and told me, “your job is to find something wrong with this.” I think that is the correct spirit.

Yesterday I posted an analysis of “rural” stations with site quality ratings of 1 or 2 versus GISTEMP (see post #205 above). Chris D. and Kristen Byrnes pointed out some mistakes in my choices of rural stations (posts #217, #232 above). Based on that feedback I have revised the analysis.

Some have questioned how I chose to use CRN 1 and 2 stations. My initial hope was to use only CRN1 stations, but there are not enough. The cutoff at CRN2 was made based on my understanding that a rating of 1 or 2 meets the CRN site criteria for accurately measuring long-term climate change.

The standard disclaimers apply:
– CODE REVIEW IS IN THE VERY EARLY STAGES
– THERE ARE GEOGRAPHICAL BIASES IN THE STATION DISTRIBUTION
– CONFIDENCE INTERVALS HAVE NOT BEEN CALCULATED

John V., thanks again for your graphs. A question: in your graphs, the GISTEMP data consistently shows about 0.2 degrees lower temperatures in the 1930s.
Do you or anyone else have any comments on how significant this discrepancy is? Thanks . . .

The deviation from 1951-1980 average is arbitrary. Wegman rebuked Mann et al for this very thing in the Wegman Report. You cannot average the last hundred years and use it as a deviation for the previous nine hundred years if you want your results to be meaningful. Similarly, to average 1951-1980 and use it as a deviation for the five decades before and two decades after is capricious.

Several comments from others above and in the First Look thread explained the historical reasons why climatologists/meteorologists used the 30-year average. They used it for convenience, nothing more.

Should we be averaging over the full range of data instead of 1951-1980? This is done to compare to GISTEMP, I know. We have the raw data, so at minimum, add a line overlaid with the average stretching through the entire data range. This makes more sense in terms of a truer temperature profile than a subjective statistical artifact. Then again, perhaps it makes no difference, or conceivably there is a better explanation than to simply compare to GISTEMP.

If this makes sense and is agreed to, I will attempt this with John V's new program. If it does not, I will not bother.

To begin, your thoughts?

Ian

PS: John V, I printed off your OpenTemp.cs code last night and started reviewing (24 bloody pages!). It looks very clean and well documented. If I did not work for a living (like some here), I might be finished reviewing it by now. Alas, I do work for a living, and tragically, my work will take me away from reviewing for a few days. I like your webpage and will post comments there unless Steve M adds a thread here. I will be in touch soon. Cheers!

#272 Ian:
Just to be clear, shifting to 1951-80 does *not* change the slope or the shape of the lines. It merely defines where zero is on the chart. It has absolutely zero effect on the difference chart or any of the trend charts.

Also, my program does not reference 1951-80 in any way. I perform the shift to the reference period in Excel.
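John V's point is easy to verify numerically: subtracting any constant (such as a 1951-80 mean) from a series leaves its least-squares slope unchanged. A toy check with an invented anomaly series:

```python
# Toy check: shifting a series by a constant reference-period mean moves
# the zero line but not the least-squares slope. Series values are invented.
def slope(ys):
    xs = range(len(ys))
    n = len(ys)
    mx = sum(xs) / n
    my = sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

series = [0.01 * i + ((-1) ** i) * 0.2 for i in range(100)]  # toy anomalies
ref = sum(series[51:81]) / 30.0                              # toy "1951-80" mean
shifted = [v - ref for v in series]

print(abs(slope(series) - slope(shifted)) < 1e-12)  # True
```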

By the way, the calibration in MBH98 and MBH99 did affect the trending. They used the century temperature profile from 1902-1995 to calibrate the entire millennium, which produced problems with their principal component analysis. However, that is something very different from what you are doing. Forgive my ignorance.

When I plotted the Yearly.csv output, I noticed the weird behaviour between (estimating) 1956 and 1977 where the variance shrank to approximately zero, as we discussed above. I assumed the program did the shifting automatically and that you had incorporated it into your code. The Excel file does not show the average and then the shift, just YEAR and AVG. At least in your original post anyway.

Just a general note to everyone regarding data quality. I would refrain from using pre 1900 data in any analysis, as up until that time, many stations still had exposure issues.

The Stevenson Screen was not universally applied before the US Weather Bureau made it an observing standard around 1892. If you look at raw data graphs for that period, you’ll often see large magnitude swings which were the result of thermometer exposure issues.

Regarding #271, it has been stated for several years that while the rest of the world and the United States were warming, the northeast U.S. was actually exhibiting a cooling trend. Those 17 stations only have one mid-Atlantic site (not even in the Northeast proper). So I’m curious if there is actually a cooling trend in the Northeast? The references to it date back to 2001, so the trend in the Northeast may have switched sign since then.

I used John Vs Excel cvs data to make paired comparisons of CRN12 and CRN5 stations. I was somewhat disappointed that there are much missing monthly data even for current measurements. I used the latest 15 months available from these data (01/2005-03/2006). Since I take the quality categorizing as a snapshot of recent conditions I did not judge it to be appropriate, without more detailed information, to go very far back in time. Below are the stations compared with the ID, latitude, longitude, number of months compared (out of the 15 maximum) and average monthly difference for CRN12-CRN5. The CRN12 stations are on the left and the CRN5 on the right.

The average difference in latitude and longitude (CRN12-CRN5) weighted for the number of monthly matches for the stations was latitude = -0.26 and longitude = 0.05 which shows that the CRN5 monthly differences came for a latitude slightly north of CRN12 and almost dead on for longitude.

I looked at these differences in 2 ways, of which neither was perfectly satisfactory; first I did a Wilcoxon signed-rank test using average monthly station differences and then another Wilcoxon signed-rank test using all the monthly paired differences. The calculated z values were 1.48 and 3.92, respectively. Using the station comparison we could not reject the null hypothesis of the stations being different at the 0.05 significance level (it is more like 0.15). Similarly using monthly pairs we can reject the null hypothesis at a very high level of significance (

I am reposting without the less than symbol to complete what I copied to post.

I used John Vs Excel cvs data to make paired comparisons of CRN12 and CRN5 stations. I was somewhat disappointed that there are much missing monthly data even for current measurements. I used the latest 15 months available from these data (01/2005-03/2006). Since I take the quality categorizing as a snapshot of recent conditions I did not judge it to be appropriate, without more detailed information, to go very far back in time. Below are the stations compared with the ID, latitude, longitude, number of months compared (out of the 15 maximum) and average monthly difference for CRN12-CRN5. The CRN12 stations are on the left and the CRN5 on the right.

The average difference in latitude and longitude (CRN12-CRN5), weighted for number of monthly matches for the stations, was latitude = -0.26 and longitude = 0.05, which shows that the CRN5 monthly differences came from a latitude slightly north of CRN12 and almost dead on for longitude.

I looked at these differences in 2 ways, of which neither was perfectly satisfactory; first I did a Wilcoxon signed-rank test using average monthly station differences and then another Wilcoxon signed-rank test using all the monthly paired differences. The calculated z values were 1.48 and 3.92, respectively. Using the station comparison we could not reject, at the 0.05 significance level, the null hypothesis that there is no difference between the station classes (the attained level is more like 0.15). Using the monthly pairs, however, we can reject the null hypothesis at a very high level of significance (less than 0.01).

The result here calls for further analysis and does not by itself indicate that the recent differences would change the recent trends in temperature anomalies for CRN12 versus CRN5 stations. In my view it only puts some uncertainty into the measurements.

I also took a look at the variance within the category stations of CRN12 and CRN5 stations by calculating average monthly differences from 2004 to 2005 for individual stations and then comparing the standard deviations for the CRN12 and CRN5 stations. The standard deviations were 0.56 for CRN12 stations and 0.58 for CRN5 stations. The results of this comparison would indicate to me that the suggestion that poorer quality stations would likely have errors in both directions (and thus have larger variances than better quality stations) without a bias in direction is not borne out.
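For anyone wanting to replicate the Wilcoxon signed-rank z values quoted above, here is a bare-bones sketch (normal approximation, no handling of tied ranks; the paired differences are invented, not the actual CRN12/CRN5 data):

```python
import math

# Wilcoxon signed-rank z statistic (normal approximation, no tie handling).
# diffs: paired differences, e.g. CRN5 minus CRN12 monthly anomalies.
def wilcoxon_z(diffs):
    ranked = sorted((abs(d), d > 0) for d in diffs if d != 0)
    w_plus = sum(rank for rank, (_, pos) in enumerate(ranked, start=1) if pos)
    n = len(ranked)
    mean = n * (n + 1) / 4.0
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24.0)
    return (w_plus - mean) / sd

# Invented paired monthly differences, mostly positive (warm bias).
diffs = [1.2, 0.8, 1.5, -0.3, 0.9, 1.1, 0.4, -0.1, 1.3, 0.7]
print(round(wilcoxon_z(diffs), 2))
```

A library implementation (e.g. scipy.stats.wilcoxon) also handles zeros and ties and returns an exact p-value for small samples, so it is preferable for real work.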

I should have added to post #280 that the calculated paired monthly differences between CRN12 and CRN5 stations showed the CRN5 stations to be 1.1 degrees C per month warmer than the CRN12 stations on average.

The answer to your problem is in the data set you are using. John V. uses USHCN V2. The first step in creating USHCN V2 is, “First, daily maximum and minimum temperatures and total precipitation were extracted from a number of different NCDC data sources and subjected to a series of quality evaluation checks. The three sources of daily observations included DSI-3200, DSI-3206 and DSI-3210. Daily maximum and minimum temperature values that passed the evaluation checks were used to compute monthly average values. However, no monthly temperature average or total precipitation value was calculated for station-months in which more than 9 were missing or flagged as erroneous.” http://www.ncdc.noaa.gov/oa/climate/research/ushcn/
Then they do the TOB adjustment. That is also why there is such a close match between John V.’s CRN 1,2 and 5 graphs. The data is adjusted in the creation of the data set.

I am not completely clear on what adjustments you are referring to, or that I understand clearly what John V used. I wanted to use his data to show that perhaps other conclusions or directions for further analyses could be posed from it. I am aware of the adjustment stages for USHCN measurements and had assumed that John V was using a totally adjusted version. Now that I can ID the CRN12 and CRN5 stations, I planned on going back to the original USHCN data and using the various adjustment stages to make the same comparisons.

Either I do not pay sufficient attention to some of these details of these threads at CA or we do a poor job of communicating what exactly the data we use represents.

John V is great with computer code from what I gather but he is not as experienced with the data. He used what was called USHCN v2 TOBS which he understood as raw data with the TOBS adjustment. But it is not raw data, it has the major problems that are caused by CRN 5 stations already removed. Anyone doing comparisons with CRN 1 and or 2 against CRN 5 needs to keep that in mind.

The TOBS adjustment is actually the biggest adjustment overall in the USHCN database and, consequently, I like to see the raw data and the TOBS version. (I think the other adjustments are done so poorly and are so full of bias that I believe we should just ignore them.)

A rigorous study was conducted on the TOBS adjustment and the results described the adjustment as “robust”, so I suppose we should just accept it, but I am always leery of these convenient adjustments.

Here is a nice chart showing how all the adjustments affect the overall temperature trend. TOBS accounts for 0.35C of the total 0.55C in adjustments to the trend carried out in the USHCN temp database (80% of the total increase in temps is actually “adjustments”???)

John V. Your new website looks great. The old one had me thinking about skate boarding which is not a good thing currently for me to consider. I have to agree that your code must be very clean because I could actually understand what you were doing.

I just realized that you’re cross-posting lies about my analysis to multiple threads. That’s disgusting. I now must cross-post myself to set the record straight.

What gave you the idea that I’m using USHCNv2 data? Every one of my plots is clearly labelled “Rural CRN12 Stations (GHCNv2 Raw) and GISTEMP (Sept 12, 2007)” in bold letters across the top. I do not mention USHCN or TOBS in any of my posts. USHCNv2 is not even available online yet (their website states “Version 2 station data will be made available in July 2007” but there is no link that I could find).

Here’s what it looks like to me. My results don’t fit with your world view, so you are convincing yourself there must be a problem with my analysis. Two days ago you were “ROLF”. Now you’re spreading false information about my analysis, and questioning my integrity by implication.

RE: #291 – John V, there are some trust issues. I’ve certainly mellowed in my own stance and will be open minded about what you are doing. To be honest, the reason I did not initially trust you was because Joshua Halpern aka Eli Rabett referred to you in so many words as “one of our McMice.” From that, I assumed some sort of relationship between you and either Joshua / Eli and or the core of his clique. I’ll continue to stand down, since you seem to be a decent team player.

John V is doing interesting work, but part of the problem in this discussion is the remaining assumptions that trends from bad stations will differ significantly from trends from good stations. I leave bad and good to your choice. All evidence shows that after homogeneity adjustments the trends and even the year to year variations are the same.

I leave you with the words of Roger Pielke Sr.:

Station micrometeorology produces complex effects on surface temperatures, however, and, as we show in this paper, attempting to correct the errors with existing adjustment methods artificially forces toward regional representativeness and cannot be expected to recover all of the trend information that would have been obtained locally from a well-sited station.

A rough, but good translation is that regional trends from a mixture of stations are accurate and thus so are continental and global ones but you do lose local information. Obvious and fair enough. More here

Oh yeah, Eli calls the commenters at his blog anonymice, because a lot of them are. Thus those who comment here are McIntyre’s mice, e.g. McMice.

Steve: Eli’s comment is as usual untrue, as evidenced in many places here. Eli says: “All evidence shows that after homogeneity adjustments the trends and even the year to year variations are the same.” For a simple case, after GHCN adjustments and USHCN adjustments, Marysville warms strongly relative to Orland. In this particular case, the GISS adjustment, given the strong underlying base of rural stations, appears to be workable. It’s impossible for both Parker and Hansen to be right about adjustments, but you won’t hear Eli say this. As Schmidt says, the U.S. is only 2% of the world. Since the circumstances outside the U.S. are different, there’s no guarantee that the methods work there. Maybe they do. But surely someone should examine them, as we’re in the process of doing. If the methods stand up, we’ll say so. Perhaps Eli can explain the adjustments in Wellington NZ.

John V,
You may or may not be aware that the GLOBAL HCN data comes from 30 different sources. Where does the source of data come from for the US 48 states? Let me help you; USHCN. (that’s the data set you are using)

Do you remember not too long ago when people on this blog were complaining about the data being switched on them from NASA? They went from a SHAP to FILNET. If you look at the description of the USHCN v2 link that I provided you might notice that the USHCN original version used SHAP, USHCN V2 uses FILNET. That is because NASA switched to the special USHCN V2 data set provided to them by the same folks at NOAA. It appears that the GHCN set you are using was changed in the same way, without notice.

Your apologies are accepted in advance. I am here to help you, not the other way around

A while back I asked somebody to draw the frickin data flow diagram for the climate data.

Everyone agreed it was a good idea. … taps foot…. crickets chirp….

So.. issues.

1. I thought Rural CRN1&2 was slightly lower than GISS in the last 10-25 years.
Like on the order of .1C. I know a lot of folks expected all of the 20th century
warming to vanish. I think I was expecting .1-.2C, round abouts.

2. GISS “used” this data in the past, probably to generate the line you use

3. I thought your code was super clean and well structured. I’m gonna try to play around
with it to do some individual site comparisons. See the Titusville thread.
One thing that perplexed me was this: whenever I looked at an individual bad site
and compared it to the average (simple average, no area weighting) of its
neighbors I would see divergence; at the global (US) level this detail gets smudged
away

4. Any thoughts on the AR(1) issue and the error bars one can put on such a time series?

A couple days ago you posted a graph that detailed the number of tropical
storms. It showed a number of 8 for the current season. However,
in your text you referred to the chart and said it detailed the number
of hurricanes. I wrote you a comment. I pointed out that this was the kind of mistake that a 15
year old would make and suggested that you fix it.

Did you fix it? or just try to erase it?

which was it? Did you fix your mistake ( right up there with the
Sartre mistake) or did you just trash bin it?

I would love it if you trash binned it, that would be quintessentially
pathetic. I josh you not.

#298 Kristen:
I absolutely apologize for the tone of my last post.
I should know better than to post at the end of a long and frustrating day.
Now, let’s figure out where to find the raw data so my results can be updated if needed.

Im sorry if Im being dense here. Is the GHCN raw data truly raw, or have adjustments been applied?

GHCN Raw should be the same as USHCN Raw, but USHCN is the originator of the data. In case of inconsistency, you should use USHCN.

Also keep in mind that the USHCN adjustments are not done in the ROW

Im staying away from ROW. Theres enough to worry about in US48.

There are bigger issues in the ROW and the USHCN esoterica don’t apply. If you’ve got some time and interest now, I’d welcome your thoughts on some of the conundrums on ROW adjustment that I’m working through.

GHCN Raw should be the same as USHCN Raw, but USHCN is the originator of the data. In case of inconsistency, you should use USHCN.

Now I understand.
I would like to check for changes in GHCN before I write a file parser for USHCN (that could take a while because of deadlines in my real job). Is your GHCNv2 v2.mean.z from June available online?

USHCN V2 data have not yet been published, so if somebody here is using them, they
must have good contacts at NCDC. :-)

John V,

GHCN v2.mean.Z has been updated since June. I would upload it, but uploading
is much slower for me than downloading. If Steve does not have it available,
would a file of differences between June 11th, and Sept 10th be of use to you?
If so, such a file may be found here.

“You may or may not be aware that the GLOBAL HCN data comes from 30
different sources. Where does the source of data come from for the US 48
states? Let me help you; USHCN. (that’s the data set you are using)”

GHCN data also include non-USHCN data for the contiguous 48 USA.

“Do you remember not too long ago when people on this blog were
complaining about the data being switched on them from NASA? They went
from a SHAP to FILNET. If you look at the description of the USHCN v2
link that I provided you might notice that the USHCN original version
used SHAP, USHCN V2 uses FILNET. That is because NASA switched to the
special USHCN V2 data set provided to them by the same folks at NOAA. It
appears that the GHCN set you are using was changed in the same way,
without notice.”

USHCN V2 data have not yet been published, but both versions use FILNET.
NASA switched to a USHCN V1 dataset at CDIAC, not USHCN V2. The
assertion that the differences between USHCN V1 and V2 are attributable
to the NASA switch is simply not credible. GHCN v2.mean “raw” data for USHCN
stations for the years prior to 2000 have not changed whatsoever between
June 11, 2001 and Sept 10, 2007.

No problem John, all the mice talk reminded me that it’s time to feed my snake.

Steve M # 306, I think you will find that the new USHCN (version 2) is not as raw as it used to be. They get rid of the “erroneous” data during construction of the data set. That is what I was talking about when I said that they adjusted out the big problems right up front. They don’t call it an adjustment but it is. The problem is that they do not define “erroneous.”

FWIW: on converting USHCN “raw” temps from F to C, and rounding to
tenths, and comparing the monthly numbers to GHCN v2.mean “raw” numbers,
omitting the partial year data for 2006, I found exact matches except for
one month at one station: January 2003, at Goldsboro. 3.6 C vs 4.3 C.
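JerryB’s cross-check above is easy to reproduce. Below is a minimal sketch of the F-to-C conversion with rounding to tenths; the Fahrenheit value is hypothetical, chosen only to illustrate how a Goldsboro-style 3.6 C vs 4.3 C mismatch would show up (the thread does not give the actual stored values).

```python
def f_to_c_tenths(temp_f):
    """Convert a Fahrenheit monthly mean to Celsius, rounded to tenths,
    as in the USHCN-vs-GHCN v2.mean comparison described above."""
    return round((temp_f - 32.0) * 5.0 / 9.0, 1)

# hypothetical USHCN Fahrenheit value; converts to 3.6 C, which would
# then be compared against whatever GHCN v2.mean carries (4.3 C in the
# Goldsboro case noted above)
ushcn_value_f = 38.5
converted = f_to_c_tenths(ushcn_value_f)   # -> 3.6
```

With exact matches everywhere else, a single one-month discrepancy like this points at a transcription or late-update difference rather than a systematic adjustment.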

The MICROSITE issue is this. There is anther bias that needs to be accounted for.
It is a Bias “like” UHI. Some sites have this bias, other sites don’t.
This BIAS is a warming BIAS. It happens over time, gradually.

AFTER all the corrections of Karl and Hansen, this bias should still be there.

The goal is recovering T. e1 & e2 can’t be adjusted out. e3 (recording errors, the guy
wrote down the wrong number) can be “detected” with gross QC checks.
e4 can be adjusted with simple physical models. e5: this is the MMTS adjustment. e6:
if you change observation time you need to adjust.

THAT LEAVES e7.

This entire argument can be distilled down to this. THEY DIDN’T ADJUST for microsite.

GHCN v2.mean raw data for USHCN
stations for the years prior to 2000 have not changed whatsoever between
June 11, 2001 and Sept 10, 2007.

Thanks for checking that. My version of v2.mean is from Sept 12th, and I can’t seem to track down an old version for comparison. It is *conceivable* that the data changed between Sept 10th and 12th.

Steve M # 306, I think you will find that the new USHCN (version 2) is not as raw as it used to be. They get rid of the “erroneous” data during construction of the data set. That is what I was talking about when I said that they adjusted out the big problems right up front. They don’t call it an adjustment but it is. The problem is that they do not define “erroneous.”

From ftp://ftp.ncdc.noaa.gov/pub/data/ushcn/README.TXT:
“The Areal Edited data is the original (or raw) data that have been screened to flag
monthly data that are suspect/outliers (over 3 standard deviations from period of
record mean of the element).”
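The “over 3 standard deviations from period of record mean” screen quoted above is simple to sketch. Everything below is illustrative only: the numbers are made up, and NCDC’s actual Areal Edit procedure surely differs in detail.

```python
import statistics

def flag_outliers(values, n_sigma=3.0):
    """Flag values more than n_sigma standard deviations from the
    period-of-record mean, mimicking the Areal Edit screen quoted
    above (a sketch, not NCDC's implementation)."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [abs(v - mean) > n_sigma * sd for v in values]

# twenty plausible January means plus one transcription error, in the
# spirit of the Fairbanks example elsewhere in the thread (a 0 C mean
# recorded as 32) -- all numbers hypothetical
jans = [0.1, -0.4, 0.6, -0.2, 0.3, -0.5, 0.0, 0.4, -0.1, 0.2] * 2 + [32.0]
flags = flag_outliers(jans)   # only the 32.0 gets flagged
```

Note that a single gross error inflates the standard deviation itself, so a 3-sigma screen only catches truly wild values; subtler microsite drift sails straight through it.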

#313 JerryB:
Can you confirm which versions of the data you were looking at. Was it current USHCNv1 raw against current GHCNv2 raw?

Kristen Byrnes:
Do you have a comparison between the data file I used (archived at the link following each of my major posts) and USHCN TOBS+FILNET? That would really clear up any confusion.

All evidence shows that after homogeneity adjustments the trends and even the year to year variations are the same.

Well yeah. After you homogenize everything I’d say you made an error if the trends weren’t the same.

BTW what is all this fiddling with temperatures and homogenizations if all you are looking for is the slope?

Why not just take consistent records (same instruments same TOB etc) and find the slope for a place and a time then combine the slopes (from the good stations). Why do you need a temperature intermediary?

Yes. I used the current USHCN hcn_doe_mean_data.Z file at: ftp://ftp.ncdc.noaa.gov/pub/data/ushcn/
and the Sept 10 GHCN v2.mean.Z file at : ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/v2/
the latter of which exactly matched the copy that I downloaded a few minutes ago. :-)
(GHCN update activity seems usually to slow down by the 10th of each
month, even though the files get replaced daily in recent months.)

The hcn_doe_mean_data.Z file at CDIAC is effectively a copy of that at NCDC
except without the partial year 2006 data.

4. Any thoughts on the AR(1) issue and the error bars one can put on such a time series?

I have a few thoughts on the error bars (all very rough):
– the concern is trend, so we should be looking at errors in the trend (vs absolute)
– using trend variances minimizes the effect of local geography and micro-climates
– the network of CRN12 rural sites can be used to calibrate the variances on other sites
– we need some way to quantify regional variance for areas of the world with few stations
– as much as possible, we should discard bad sites instead of adjusting them
– we need a way to identify bad sites in ROW (I have some ideas around using trends in [Tmax – Tmin] as signatures for UHI and micro-site issues. There is a pretty reasonable number of stations with Tmax and Tmin available, although I have not checked how many years are available).
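On the AR(1) question, one common first-order approach — my assumption, not anything John V has committed to — is to fit an OLS trend and inflate its standard error using an effective sample size derived from the lag-1 autocorrelation of the residuals. A self-contained sketch:

```python
import math

def trend_with_ar1_se(y):
    """OLS trend of an evenly spaced series, with the standard error
    inflated for lag-1 autocorrelation via the effective sample size
    n_eff = n * (1 - r1) / (1 + r1).  A common first-order correction;
    this is a sketch of one way to handle the AR(1) issue, not the
    only way.  Assumes the residuals are not identically zero."""
    n = len(y)
    x = list(range(n))
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # lag-1 autocorrelation of the residuals
    r1 = (sum(resid[i] * resid[i + 1] for i in range(n - 1))
          / sum(r ** 2 for r in resid))
    n_eff = n * (1 - r1) / (1 + r1)
    s2 = sum(r ** 2 for r in resid) / (n - 2)          # residual variance
    se = math.sqrt(s2 / sxx) * math.sqrt(n / max(n_eff, 2.0))
    return slope, se

# synthetic series: true trend 0.01/step plus an alternating disturbance
series = [0.01 * t + (0.1 if t % 2 else -0.1) for t in range(50)]
slope, se = trend_with_ar1_se(series)
```

Positive autocorrelation shrinks n_eff and widens the error bars, which is exactly why naive OLS confidence intervals on temperature series tend to be too tight.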

No, I tend to use a language called REXX within Kedit for this stuff. I had
forgotten that the 1994 USHCN edition at CDIAC has some fortran programs. If you
try to use them with recent USHCN files, you run into the problem that the 1994
edition did not include decimal points; the later editions do.

John V has clearly indicated that he has been using GHCN v2 files, not USHCN V2
files, which BTW have not yet been published. Your comment 322 uses terminology,
e.g. DSI-3200, DSI-3210, found in the very brief USHCN V2 announcement, but not
mentioned in GHCN v2, or even in a somewhat analogous USHCN V1 file.

John V has already indicated that he has time constraints, as do we all.

Suggesting to John V that he pursue USHCN V2 issues may not be conducive to
alleviating his time constraints.

I have much confidence in John V. I do not have much confidence in NOAA or NASA GISS. Your and John’s examination of the data today is compelling but either of the two I listed in 322 would be conclusive. There is no need to rush, relaxing while doing the examination would probably be preferable anyway.

USHCN documentation also mentions indicators of the number of daily
observations missing from any given station/month’s data.

You have suggested that John V pursue such topics, but you have not
provided any evidence that such a pursuit, which would be time consuming,
would lead to results that differed more than trivially from those that
John V has presented.

Perhaps, instead of suggesting that John V spend his limited time in such
a pursuit, you might consider “relaxing while doing the examination”
yourself since “it would probably be preferable anyway”.

DSI 3200 is a data set, not specific to USHCN V1 or v2. It is used in both. It is daily temps without any adjustment.

DSI 3210 is a data set that has the “step one” qc checks applied in USHCN V2 data set construction.

The description in USHCN V1 also discusses QC checks as a first step in data set construction and is much more specific than the description in the USHCN v2 introduction, but then says it was changed, but not what it was changed to.

There are other data sets as well with their own label numbers; each has its own individual variations of data.

I already went through John’s spreadsheets and could not find any labels. I went through USHCN v1, GHCN v1 and GHCN v2 and could not find any labels as to what data set was used. That’s why I asked you guys if the data he is using has such a label.

The other possibility was to find specific things about the data that could tell what data set was used, that’s why I asked about the number of days missing in a month, that would tell us if there were QC checks done with the data that GHCN labels as “raw.”

I am just trying to get to the bottom of this and it is just as frustrating for me but is necessary to know.

Here is what I think on microsite. The effect of microsite (weather station over asphalt)
is real. It’s a subspecies of UHI. However it takes a site to site comparison to tease
it out. It’s seasonal. It’s modulated by other factors. It develops over time and could be
washed out in S/N

e1 – canceled: Agreed
e2 – still with us: Agreed.
e3 – still with us: Agreed.
e4 – not a factor: Agreed
e5 – not a factor: hmmmmm
e6 – not a factor: Agreed.
e7 – is canceled: It depends.

I need to puzzle on e5. ok? Meantime, proffer an argument.

On E7, I think that after a time the method of slopes will overcome e7.
A simple example. A tree grows and shades a site. During this period, say 20 years,
the trend signal can be lost. BUT after “maximum shadiness” is achieved, one
would expect the underlying trend signal to be detectable again. The same
with heating biases.

The bias is not a step function that is easily detectable. If I had to
model the bias I would use a logistic curve.. Am I being clear?

So at some time t1 in the station history this bias starts, and it grows,
ramps, but then plateaus at time Tn.

In the background you have a climate trend operating. Question: will e7
bias the climate trend substantially?

I was kinda thinking if one could simulate such a combination of signals and
determine detectability criteria… Something like that.
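The logistic-bias thought experiment above can be simulated directly. Here is a noise-free sketch (all parameter values hypothetical) showing that the ramp period inflates a full-record trend while a post-plateau window recovers the underlying one:

```python
import math

def ols_slope(y):
    """Least-squares slope per step of an evenly spaced series."""
    n = len(y)
    mx = (n - 1) / 2.0
    my = sum(y) / n
    sxx = sum((i - mx) ** 2 for i in range(n))
    sxy = sum((i - mx) * (yi - my) for i, yi in enumerate(y))
    return sxy / sxx

def logistic_bias(t, t_start=240.0, ramp=24.0, amplitude=0.5):
    """Microsite bias that switches on near month t_start, ramps, then
    plateaus -- the logistic-curve model suggested above.  Parameters
    (0.5 C amplitude, ~2-year ramp at month 240) are made up."""
    return amplitude / (1.0 + math.exp(-(t - t_start) / ramp))

true_trend = 0.0005                        # assumed climate trend, C/month
series = [true_trend * t + logistic_bias(t) for t in range(600)]

slope_full = ols_slope(series)             # ramp period inflates the trend
slope_late = ols_slope(series[360:])       # post-plateau: bias ~constant
```

In this deterministic version the full-record slope is roughly triple the true trend while the post-plateau slope recovers it, consistent with the “maximum shadiness” intuition. Adding noise and asking at what amplitude the two slopes become statistically distinguishable is exactly the detectability simulation proposed.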

I obtained the USHCN data from here in the version listed as the file named urban_mean_farh.Z which I assume from the stepwise progression of adjustments carried out by USHCN is the completely adjusted temperature series. Such assumptions are, however, often not correct when dealing with these temperature data sets.

I then downloaded the urban mean to Notepad and searched for the stations IDs required in the paired CRN12 and CRN5 comparisons as I carried out in Post # 280 above and copied the data to Excel.

I compared the results of paired CRN12 and CRN5 stations using the John V data and the USHCN Urban data. The compared stations (1 through 11) are listed in the first table below and the results of the comparisons using the John V data and the USHCN data are listed in the second table. I did a Wilcoxon Signed-Rank Test with the Urban data as I did before on the John V data. I confined my test to station differences this time as the Urban data was without missing data. The Z score was about the same for both sets of data (approx. 1.5) and did not allow rejecting the null hypothesis that CRN12 and CRN5 have different temperatures at a level of significance of 0.05 (the level was 0.15). The CRN5 stations were, on average, approx. 1 degree centigrade warmer than the CRN12 stations using both the John V and USHCN Urban data sets. With all the noise I see in these temperature measurements and adjustments it might take a substantial difference between CRN12 and CRN5 stations to detect with the small sample size used in my analysis. I think these differences bear further investigation and analysis.
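For readers who want to reproduce the style of test described above, here is a minimal stdlib version of the Wilcoxon signed-rank Z score (normal approximation). The paired differences below are made up for illustration; they are not the actual station values behind the reported Z ≈ 1.5.

```python
import math

def wilcoxon_z(diffs):
    """Wilcoxon signed-rank Z (normal approximation) for paired
    differences.  A minimal sketch: zero differences are dropped and
    ties in |d| are not given averaged ranks, so this is suitable for
    illustration only, not as a statistical library."""
    d = [x for x in diffs if x != 0.0]
    n = len(d)
    order = sorted(range(n), key=lambda i: abs(d[i]))
    ranks = [0.0] * n
    for r, i in enumerate(order, start=1):
        ranks[i] = float(r)
    w_plus = sum(ranks[i] for i in range(n) if d[i] > 0)
    mean_w = n * (n + 1) / 4.0
    sd_w = math.sqrt(n * (n + 1) * (2 * n + 1) / 24.0)
    return (w_plus - mean_w) / sd_w

# eleven hypothetical CRN5-minus-CRN12 paired differences (deg C)
d = [1.2, 0.8, -0.3, 1.5, 0.9, -0.2, 1.1, 0.7, 1.3, 0.4, 1.0]
z = wilcoxon_z(d)
```

With eleven pairs, a |Z| of about 1.96 is needed for significance at 0.05, which is why even a consistent ~1 C warm offset can fail to reject at this sample size.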

There were substantial differences between the John V data and the Urban data in many months for the station comparisons as can be seen in the table below. Also the John V data had many missing months of data and the time period was only through March of 2006 while the Urban data was complete through October 2006 with no missing months data.

I wrote in the post avove: The Z score was about the same for both sets of data (approx. 1.5) and did not allow rejecting the null hypothesis that CRN12 and CRN5 have different temperatures at a level of significance of 0.05 (the level was 0.15).

And should have said: The Z score was about the same for both sets of data (approx. 1.5) and did not allow rejecting the null hypothesis that CRN12 and CRN5 have the same temperatures at a level of significance of 0.05 (the level was 0.15).

I’ve been sitting back and watching this unfold as various analyses occur over the site data I’ve posted.

I do have the feeling though that comparing USHCN/GHCN data to GISS will yield similar curves no matter what, since the data has already been adjusted at the USHCN level, and that adjustment persists in the data through to GISS data.

As we get more stations surveyed, and we have a better idea on a station history for CRN ratings, lets revisit a way to add that to the analysis.

In the meantime I’m open to ideas on how to come up with historical CRN ratings. And I think it’s important that we identify “gold standard” stations that are without much nearby population and have had little or no moves, and keep a running tab of these. I think they will yield the best comparative results.

this data set (Revision 3) has at most four lines of data per year as compared with three for the older versions of the data. This is the first time that the actual raw unadjusted data (the ” ” record) have been distributed.

…Flag Codes for the Blank Row, the Original Data (JAN-DEC): FLAG1 is the code for the number of daily values not available in computing the monthly value (flag values are Blank = 0, A = 1, B = 2, C = 3, … I = 9; “.” = data is estimated). If the source code flag (FLAG2 – see below) equals 0 or 1, then FLAG1 = A…I indicates that 1 to 9 days of data are missing.

In the file HCN94MEA.ASC (and therefore by extension V2.MEAN as shown below) there are 18,104 instances of the “.” flag. If a data point in the “blank” row is marked with the “.” flag, then the corresponding data points in the “+” and “A” rows should also be marked with the “.” flag. It would seem that the number of instances would be evenly divisible by 3, but that is not quite the case. In any event, it would appear that there are more than 6,000 data points in the “unadjusted” row that are marked as estimates. Since these estimated data points appear in the “blank” row, they would not have been estimated by the FILNET program.

I had downloaded a copy of V2.MEAN on Sep 7 and I downloaded another copy today. I extracted the information for Walhalla from both copies of V2.MEAN and compared them to the Walhalla data I extracted from HCN94MEA.ASC. ALL data points for Walhalla in both V2.MEAN files were identical to the corresponding data from the “blank” rows for Walhalla in HCN94MEA.ASC up to 1994. Thus, it would seem that V2.MEAN contains “original” data without TOBS, SHAP, MMTS, FILNET or UHI adjustments, but with over 6,000 individual estimated data points. Since V2.MEAN does not contain any flag data, it is impossible to tell from V2.MEAN which data points are estimates. That information is available through 1994 from HCN94MEA.ASC.

#305: John V. September 21st, 2007 at 8:26 am continues:

Do you know of a location where I can get temperatures with TOBS adjustments only?

The TOBS adjustments are available from the “+” rows in HCN94MEA.ASC through 1994.

From NDP019.PDF (various pages):

The “unadjusted” data in the old data set was not quite “unadjusted.” In fact, what was labelled as “unadjusted” was “adjusted” for the time of observation (TOB). Therefore, this data set (Revision 3) has at most four lines of data per year as compared with three for the older versions of the data. This is the first time that the actual raw unadjusted data (the ” ” record) have been distributed. The old “unadjusted” data should match up better with the new TOB adjusted data (the “+” record).

….Finally, when a station moves, 1, 2, or more of the station’s 40 nearest neighbors may be different

…The adjusted data, and corresponding confidence factors, are the product of four major computer programs. The station history data and a “network” of the best correlated nearby stations are used in all these routines. First, the original data are input into the TOB debiasing routine so that the data will be consistent with a midnight-to-midnight observation schedule.

It would appear that the 40 nearest neighboring stations are used to make the TOBS, SHAP, MMTS, FILNET & UHI adjustments. Thus, these adjustments apparently make all stations dependent to a certain extent, because data from the 40 nearest neighbors of a given station is used to make 5 separate and independent adjustments (TOBS, SHAP, MMTS, FILNET & UHI) to the data for that station (the UHI adjustments are available as a separate file: http://cdiac.esd.ornl.gov/ftp/ndp019r3/ur94mean.asc.Z ).

#314: steven mosher says:
September 21st, 2007 at 12:40 pm

RE 312.

Ill have a look but typically at this stage of data cleansing you are deleting outliers. For example. For years Fairbanks Alaska was a Jan. mean of 0C. Then in 1932, you see a mean of 32C.

Outlier.

In HCN94MEA.ASC (and by extension, at least in part, in V2.MEAN) there are 24,040 “S” flags (which indicate outliers that are between 3.5 & 5 sigmas from the mean) & 158,481 “X” flags (which indicate more than 5 sigmas). However, “X” flags are also used for confidence factors to indicate that the algorithm was unable to adjust data. It looks like the vast majority of the “X” flags were for row C data points. S flag data points were NOT changed, except in the case of row “A”, where a change would be indicated by an “M” flag. There are 128,324 “M” flags in HCN94MEA.ASC. This may be a measure of data quality. NDP019.PDF has in appendix B quality assessments of station data, although it appears to be a numerical assessment only and the work that Anthony and others at Surface Stations are doing may not have been done before.

In conclusion, V2.MEAN appears to consist of data that is almost, but not quite, “unadjusted.” Specifically, the information contained in the flags in HCN94MEA.ASC has been omitted. In addition, the various adjustments to each station appear to be based on a “network” of the closest 40 neighbors to each station, even for the TOBS and MMTS adjustments.

John V clearly indicated that his “raw” data were from the GHCN v2.mean
file. GHCN V2 documentation clearly indicates that its source of that
data is USHCN V1. USHCN V1 documentation clearly indicates its immediate
sources of that data which I posted in comment #327.

I do have the feeling though that comparing USHCN/GHCN data to GISS will yield similar curves no matter what, since the data has already been adjusted at the USHCN level, and that adjustment persists in the data through to GISS data.

One of the problems we have here is that many of us are not sufficiently aware of the various versions of USHCN and how the versions progress to GISS. I think I know generally, but after reading all these explanations I am not at all sure of that. I believe that USHCN has unadjusted meta data that was used at one time by GISS to make the GISS adjustments for that data set. Anyway if someone here can provide a link that data set that would be a next step comparison for me, otherwise I will look for it on my own.

I believe it was Steven Mosher who earlier suggested that someone with sufficient knowledge make a flow diagram tying all the versions and data sets together. The problem I see with online work like this project is that usually no one is summarizing all the information in one spot for convenient review.

I think at this time that finding statistically significant differences between CRN5 and other station classifications will be difficult due to all the measurement and adjustment noise in the signal and the small sample size and less so due strictly to the data sets used for the comparison. If one proceeds back in history, and even with historical quality indicators, I think that digging the difference signal out will be even more daunting.

That you and your team have found and documented quality differences is a major achievement by itself. That information may be difficult to use to show a significant change in trends with better data going back in time or even comparing close by stations with different quality classifications, but in my view it does put some additional, if not quantifiable uncertainty, into the measurement data.

The other day I had an idea about where to find the signal. If it’s like
UHI, then it’s likely to have seasonality. So I took John V’s monthly data
and arranged it by months and years: all Jans from 1880-2005, all Febs 1880-2005,
etc., and then ran differences between class5 Jans and class12 Jans, etc.

so you get a nasty chart with all 12 months from 1880-2005.

Then I combined months into seasons.

I think I found something in summer months JJA.

A lot of hand work in Excel. Somebody else should have a look and see if I screwed up
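The month-by-month rearrangement described above is straightforward to script rather than do by hand in Excel. A sketch, assuming the monthly means are keyed by (year, month); the sample values below are hypothetical stand-ins for John V’s output files:

```python
def seasonal_diff(crn5, crn12, months=(6, 7, 8)):
    """Average CRN5-minus-CRN12 difference per year over the given
    months (JJA by default), for series stored as
    {(year, month): monthly mean} dicts."""
    out = {}
    for y in sorted({yr for (yr, m) in crn5}):
        diffs = [crn5[(y, m)] - crn12[(y, m)]
                 for m in months
                 if (y, m) in crn5 and (y, m) in crn12]
        if diffs:
            out[y] = sum(diffs) / len(diffs)
    return out

# hypothetical monthly means for one year, keyed by (year, month)
crn5  = {(2000, 6): 1.0, (2000, 7): 1.2, (2000, 8): 0.8}
crn12 = {(2000, 6): 0.5, (2000, 7): 0.5, (2000, 8): 0.5}
jja = seasonal_diff(crn5, crn12)   # {2000: 0.5}
```

Running the same function with months=(12, 1, 2) etc. gives the other seasons, so the summer-versus-winter comparison becomes a one-line change instead of another round of spreadsheet surgery.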

#342 and #344 JerryB: Thanks for the updated files. Would you agree, though, that the content of NDP019.PDF is still mostly applicable to the updated files as well, with the exception that the temperatures are now stored in hundredths of a degree instead of as a five digit integer?

#347 Jerry, were there changes in the SHAP/FILNET adjustment programs as described in NDP019.PDF between 1994 and the current files? Are the adjustments still based on a “network” of the closest 40 or so stations? Do you know what the difference is between the 1200km rule and the adjustments described in NDP019.PDF?

For the method of differences I would treat a site move as the start of a new record. The point is to keep the individual records as internally consistent as possible in order to minimize the number of corrections/adjustments.

“#347 Jerry, were there changes in the SHAP/FILNET adjustment programs as
described in NDP019.PDF between 1994 and the current files?

I do not know if those programs have been changed since then, but
adjusted data of several stations have changed since then. I took a
brief look at NDP019.PDF and noticed only vague descriptions of those
programs. Can you supply page numbers of the descriptions you have in
mind?

“Are the adjustments still based on a “network” of the closest 40 or so
stations?”

I do not know if they are, or ever were.

“Do you know what the difference is between the 1200km rule and the
adjustments described in NDP019.PDF?”

BTW, the USHCN files at ftp://ftp.ncdc.noaa.gov/pub/data/ushcn/ do get replaced
from time to time. If anyone likes to monitor changes of such files, they
might want to download the current set today or tomorrow. One can never be
sure when the next set will arrive.

Procedure: get John’s code. Unzip per his instructions, update your system as required.

From there it is pretty damn simple. You got two BAT files that update two different
folders. So here is what I did. I’ll send the Excel sheet to Anthony and he can reverify
and post.

1. Run the BAT files for CRN5 and CRN12.
2. Then get the MONTHLY files in each folder. Copy and paste both time series
into a separate Excel sheet.
3. Now you got a time series from 1880 to March 2006 in months for CRN12 and CRN5.

4. Difference the columns. CRN5-CRN12.

5. CRN5s average about .35C warmer than CRN12 (11.53 vs 11.18)

6. Now check the trend. CRN5s warm more than CRN12. .0002C per month..
over the last 1500 months.
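Steps 1-6 above amount to differencing two monthly series and fitting a trend to the difference. A sketch with synthetic series built to resemble the offsets reported (the real OpenTemp output files are not reproduced here, so these numbers are stand-ins):

```python
def ols_slope(y):
    """Least-squares slope per step of an evenly spaced series."""
    n = len(y)
    mx = (n - 1) / 2.0
    my = sum(y) / n
    sxx = sum((i - mx) ** 2 for i in range(n))
    sxy = sum((i - mx) * (yi - my) for i, yi in enumerate(y))
    return sxy / sxx

# synthetic monthly means standing in for the two OpenTemp output
# files: CRN5 warm-biased and drifting, CRN12 flat (values assumed)
crn5  = [11.5 + 0.0002 * t for t in range(1500)]
crn12 = [11.2 for t in range(1500)]

diff = [a - b for a, b in zip(crn5, crn12)]
mean_offset = sum(diff) / len(diff)   # cf. step 5: CRN5 runs warmer
drift = ols_slope(diff)               # cf. step 6: C per month of relative warming
```

The point of differencing first is that anything common to both networks (real climate, shared instrument changes) cancels, leaving only the relative CRN5-vs-CRN12 behavior in both the offset and the trend.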

THEN I thought: Why not use John V’s method on ALL the sites. After all, his
method is better than Hansen’s. To do this you just have to create a file
of all the stations (I used Anthony’s spreadsheet) and then feed it
to the batch file.

The results were very telling. The CRN12 track the totality of stations pretty
well, but the class5s are clearly in a class of their own.

Anthony, holler if you want the spreadsheet. I make ugly graphs, so you
can double check and post if you like

Conceptually I would argue for a changing grid through time. John V has noted,
for example, that his lat/lon gridding results in certain grids having many
measurements and other grids being sparse. Ideally, I would think one would
want to adapt the tiling structure based on the geographical distribution
of the sites.

“I took a brief look at NDP019.PDF and noticed only vague descriptions of those programs. Can you supply page numbers of the descriptions you have in mind?”

The program descriptions that I quoted excerpts from in my post #339 are from pages 5 and 9 of the scanned document (pg. 15 & 19 of NDP019.PDF). A further description is on page 36 of the scanned document (pg. 46 of NDP019.PDF). Hope this helps.

#344 Jerry, with regard to your question on the 6,056 estimated data points, they range from 1799 through 1989 and are from all sources except 0, 1, 6 and 7, with most of them from sources 2 and B.

I did a similar (but different…???) monthly analysis of CRN12 versus 5.

The CRN5s are warmer (that’s interesting) and you indicated that
you looked at lat/lon in your comparison. I think elevation differences
might be as important. Just to remove this I was going to normalize all stations
to sea level via a lapse rate adjustment.
But then we only have the elevation of the last station move.
The lapse rate adjustment, being constant, doesn’t change trends.

I will play with a crude lapse rate adjustment for all the class5 and see if they
are less warm.
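The crude lapse-rate normalization contemplated above might look like the sketch below. The 6.5 C/km figure is the standard-atmosphere lapse rate; the station temperatures and elevation are hypothetical. As noted, a constant per-station offset shifts means but cannot change trends:

```python
def to_sea_level(temp_c, elevation_m, lapse_rate_c_per_m=0.0065):
    """Normalize a station temperature to sea level with a constant
    lapse rate (6.5 C per km, the standard-atmosphere value) -- the
    crude adjustment contemplated above, using only the elevation of
    the last station move since that's all that is recorded."""
    return temp_c + lapse_rate_c_per_m * elevation_m

# hypothetical annual means for a station at 1524 m (5000 ft)
station_5000ft = [8.0, 8.1, 8.3]
adjusted = [to_sea_level(t, 1524.0) for t in station_5000ft]
# year-to-year differences (hence trends) are unchanged by the offset
```

This makes the absolute CRN ranking comparison fairer without touching the trend comparison at all, which is the point of the remark that a constant lapse rate adjustment “doesn’t change trends.”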

Now here is an interesting question. Haven’t looked at this.

If you took the mean temp of all the CRN1, and then the CRN2, and 3, 4, 5
And you ranked those means, if the ranking of absolute mean temp came out
1,2,3,4,5 What would you conclude?

I do have the feeling though that comparing USHCN/GHCN data to GISS will yield similar curves no matter what, since the data has already been adjusted at the USHCN level, and that adjustment persists in the data through to GISS data.

This confuses me. The raw GHCN/USHCN data really is raw other than removing outliers (more than 3 standard deviations from the mean for that series). Are you suggesting that the removal of outliers is a significant adjustment that will coerce the trend to match GISS?

There are differences in the results from the CRN12 and CRN5 stations. I will post a summary in a few minutes. We need to determine if the differences are statistically significant and if they roughly match expectations. I will post on CRN5 expectations soon as well.

“ahhhhh. it would NOT be a simple line chart of temperature anomaly..”

One could always integrate a surface and eliminate any altitude corrections.

My take on trying to determine climate change would be to instrument the ground and the air not only on grassy sites, but also various croplands, forests, deserts, suburban and urban areas in proportion to the area actually represented in the nation. That’s what is really happening and what people are going to have to live with. One could forget about the 21st century controversy of UHI correction magnitude equivalent to “how many angels can dance on the head of a pin”. Many of the ag stations seem to have had ground temperature measurement instrumentation and likely have historical records somewhere.

Right now it seems the historical weather data are still being used to try to measure “Global” warming and validate the AGW/CO2 hypothesis (even though the public/political discussion has morphed to “Climate Change” (with no great interest on the Science side of integration temperature and moisture, much less ground temperatures or mean Earth albedo, etc.)).

This is where I’m lost right now. I see that DSI-3200 eliminates outliers. It also seems like the GHCN “raw” data contains another adjustment (although not significant).
Can anyone tell me what the magnitude is of an outlier (I don’t know what the mean is so I can’t calculate something 3 times that)? Also, there seems to be different data for different years considering the TOB adjustment. Do the newer releases of USHCN (or GHCN 1 or 2) include the TOBS in the unadjusted data before 1994?

“Rather than using NASAs rural designations, I actually attempted to use Google Maps to look at each site. Realizing that two digits for lat and long only give position to about 1km, I also tried looking at the surrounding area. I only chose locations that *seemed* to be clear of any population centres by at least 1km.”

I’d suggest that Livingston 12S, MT be considered rural. Other than a likely increase in nearby irrigation and somewhat increased valley population, the area is really quite removed from influence of Livingston (one has to navigate a short, relatively narrow canyon to get out of Paradise Valley into Livingston).

The CRN5s are warmer (that’s interesting), and you indicated that you looked at lat/lon in your comparison. I think elevation differences might be as important.

Steven, you make a good point about elevation, and I will go back and look at my eleven paired comparisons with consideration of differing elevations.

Not to this point, but something I have to get off my chest (again): why are so many of you looking at data that goes back to the 1880s when Anthony Watts’ snapshots documenting quality categories are looking at AC, blacktop, parking lots, etc., when these items came into common usage perhaps less than 50 years ago? If we had data back to 1750 I think we might still be making comparisons using all of the time series. Without more historical data I think we have to confine our analysis at this time to some reasonable recent history.

By the way, I have found Anthony Watts’ Excel data sheet very helpful and informative, unlike delving through the USHCN and GISS series.

Summary of Differences Between CRN12 (Rural) and CRN5:
This is a summary of some of my previous results generated with OpenTemp.
The usual disclaimers apply:
– the program has not been properly reviewed
– there are geographical biases in the data
– etc.

CRN5 stations are the bad ones.
CRN12 stations are the good ones.
CRN12R stations are the rural good ones.

The first plot is of CRN5, CRN12R, and GISTEMP temperatures since 1880:

The second plot shows GISTEMP minus CRN12R and GISTEMP minus CRN5 for the same period:

GISTEMP shows approximately 0.2degC more warming than CRN12R. Much of this can be attributed to the Time of Observation Bias corrections that are included in GISTEMP but excluded from CRN12R. The plot below shows that the net TOBS correction applied to GISTEMP for all USHCN stations is ~0.35degF ≈ 0.19degC:

CRN5 shows ~0.35degC more warming over the century than CRN12R. I will get back to this in my next post.

The USA temperature trend since 1915 can be broken into three key periods. The plot below shows the temperature trend for each of those periods:

CRN5 shows substantially less cooling than CRN12R and GISTEMP from 1935 to 1975 (approximately 0.06degC per decade). This *could* be the time period when the majority of the CRN5 micro-site problems were introduced.

Since some of the microsite factors are only applicable to recent history, we can use a long-term trend to help identify the effects. If there is a discernible difference, one would expect it to grow with time (in general).

CRN5 Bias Expectation:
In my previous post I noted that CRN5 shows approximately 0.35degC more warming since 1915 than CRN12R. In this post I will do a quick check of whether 0.35degC meets expectations.

This is a very rough first attempt at determining the expected bias between CRN12 and CRN5 stations. It assumes that site issues at CRN5 stations were introduced after the station was in operation so that the site issues introduce temperature trends.

CRN5 stations are defined as having site issues that could introduce up to 5 degC bias on rare occasions. For the sake of a first analysis, I will define “rare occasions” as between 1% and 5% of all days. I will also assume that the bias affects either Tmin or Tmax (not both). The effect on Tavg is therefore half of the total bias.

Using an exponential distribution, the average temperature bias for CRN5 stations is found to be between 0.75C and 1.16C:

Similarly, the average temperature bias for CRN12R stations (using an average of 1.5C for the “rare occasion” bias) is between 0.22C and 0.35C:

Based on these results, the minimum mean temperature bias between CRN12 and CRN5 stations is:
0.5*(0.75 – 0.22) = 0.26degC
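The arithmetic here can be sketched in Python. The comment does not spell out how the exponential distribution was parameterized, so the tail condition below (probability of exceeding the class’s maximum bias equals the “rare occasion” frequency) is my assumption and does not exactly reproduce the 0.75–1.16degC figures; only the final minimum-gap line is the post’s own arithmetic.

```python
import math

def exp_mean_from_tail(max_bias, rare_fraction):
    """Mean of an exponential bias distribution whose tail beyond
    max_bias has probability rare_fraction. This parameterization is
    an assumption; the comment does not give the original one."""
    # P(X > max_bias) = exp(-lam * max_bias) = rare_fraction
    lam = -math.log(rare_fraction) / max_bias
    return 1.0 / lam

# CRN5: up to 5 degC of bias on 1% to 5% of days.
crn5_lo = exp_mean_from_tail(5.0, 0.01)
crn5_hi = exp_mean_from_tail(5.0, 0.05)

# The post's bottom line: the bias hits only Tmax or Tmin, so Tavg sees
# half, and the minimum gap uses the post's own 0.75 and 0.22 figures:
min_gap = 0.5 * (0.75 - 0.22)  # the ~0.26degC quoted above
```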

“I think we also need to get this. DATE OF LAST STATION MOVE. If the station
moved in 2000 and is rated a 5 now, I Dont think we can rate it prior to that without
evidence.”

In almost every case I’ve seen, the introduction of the MMTS sensors was an effective change in microclimate.

I’m also wondering how things go in the Winter at MMTS equipped stations when there is a large amount of snow. It would seem that one would have to clear a path to a CRS and in opening it ensure that at least part of it was exposed to the atmosphere, but in the case of an MMTS, the snow could overwhelm and insulate it.

“The program descriptions that I quoted excerpts from in my post #339 are
from pages 5 and 9 of the scanned document (pg. 15 & 19 of NDP019.PDF).
A further description is on page 36 of the scanned document (pg. 46 of
NDP019.PDF). Hope this helps.”

Thanks. To me, those descriptions seem so vague as to be barely useful.

“#344 Jerry, with regard to your question on the 6,056 estimated data
points, they range from 1799 through 1989 and are from all sources except
0, 1, 6 and 7, with most of them from sources 2 and B.”

My question marks there were because the estimation method was not
described at all, much less in sufficient detail to be of use.

“Do you know if there are updated versions of the QA files, and, if so,
where?”

I have no recollection of seeing similar files in subsequent editions
of USHCN. If they ever were posted in the USHCN directory of the
NCDC ftp server, I missed them. (I have checked there from time to time
since summer 2001 at which time the 1999 edition was “current”.)

“A full discussion of the QA files is in Appendix B of NDP019.PDF
beginning at page 127.”

re 372 Kristen. You don’t need to know the mean; that is the beauty of standard deviations. The measurement is thrown out if it is 3.5 sigma from the mean. You can go check the tables, but that is like a one-in-a-thousand event. It is not an issue that relates to MICROSITE contamination. These are typically transcription errors (guy writes down 5C rather than -5C) or broken instruments.
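The univariate screen being described can be sketched like this (a toy illustration with made-up data, not NOAA’s actual QA code):

```python
import statistics

def flag_outliers(series, n_sigma=3.5):
    """Flag values more than n_sigma standard deviations from the
    series mean -- a toy version of the screen discussed above."""
    mean = statistics.fmean(series)
    sd = statistics.pstdev(series)
    return [abs(x - mean) > n_sigma * sd for x in series]

# A sign-transcription error (5 written instead of -5) stands out
# against a month of otherwise ordinary winter readings:
january = [-5.0 + 0.1 * i for i in range(-14, 15)] + [5.0]
flags = flag_outliers(january)
```

Note that a single wild value inflates the standard deviation it is tested against, so the screen only works when the series is long enough for one bad day not to dominate.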

TOBS. You have to start by reading Karl’s paper on TOBS. Actually, start with JerryB’s paper. It is very good on the basic THEORY. Just google Time of Observation.

When you finish looking at Jerry’s work, then move on to Karl. Some basic statistics will help. Karl actually offered up the code in his paper, so I imagine SteveMc or yourself could request it.

I have an even cooler idea. If you requested Karl’s code after all these years (1985, I think), then I think updating it and validating it against the new CRN would make a great project. Just a thought.

Steven # 388
TOB is not an issue for me; I think it’s probably pretty close. My question about TOB was whether it was a part of the data for certain years, something JerryB answered in 391. I’m not convinced on outliers; do you have a link for the tables? I would like to get a better idea of it. It will probably affect CRN5 depending on the amount of the mean.
I doubt it would affect CRN1,2 sites, and I thought John’s last post on CRN1,2 vs GISS was very compelling (with one question below), but I’m not as convinced by his CRN5 analysis. I do not agree that the temperature spikes will occur just 1 to 5% of the time; it’s probably closer to the percentage of sunny days for pavement, and of summer days for AC, etc. Of course that will also depend on geography, because here in the Northeast the pavement is usually covered by snow in the winter.

JohnV # 376
The CRN12 line is usually cooler than the others before 1950, then usually warmer after 1970. I understand your timing argument about when the stations were placed there. Is it possible that there is a statistical error (I’m just eliminating possibilities here, not being critical of you)?

JerryB # 391
“Different people may consider different things to be outliers, or may consider the same things to be outliers in different circumstances.”

Fortunately NOAA defined outliers as being 3.5 times the mean for USHCN v1, but they changed that and I have not seen what their new scheme is for eliminating outliers. That’s why I asked if anyone knows what the mean would be. If the mean is 0.2 deg. then it is easy to see how it affects the CRN5 stations, but if the mean is a full 5 deg. then it would be difficult to say it even affects CRN5.

Per Steven Moshers suggestion I went back and looked at the effect of elevation differences for my 11 paired CRN12 and CRN5 stations. I had paired them by latitude and longitude and assumed that elevation would not be a factor with that close proximity. I was wrong and can only excuse my negligence by my existence in the flat lands of the Midwest. In fact I found the CRN12 stations biased at the higher elevations and when I adjusted the temperature differences for altitude using 1.99 degrees C per 1000 feet difference there was no difference in the paired comparisons on average.

For the number comparisons I found these differences in altitude (in feet, using Anthony Watts’ Excel spreadsheet) when subtracting the CRN5 station altitude from the CRN12 station altitude:

My conclusion when using the altitude correction is that in the 11-sample paired temperature comparison there was no difference between CRN12 and CRN5 stations for the approximate time period of 2005-2006.
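The altitude correction used above can be sketched as follows; the 1.99 degC per 1000 ft figure is the one quoted in the comment, and the example stations and elevations are hypothetical:

```python
LAPSE_RATE_C_PER_1000FT = 1.99  # lapse rate used in the comment above

def adjust_pair(t_crn12, elev_crn12_ft, t_crn5, elev_crn5_ft):
    """Reduce both stations to a common sea-level reference before
    comparing, so elevation differences don't masquerade as siting bias."""
    t12 = t_crn12 + LAPSE_RATE_C_PER_1000FT * elev_crn12_ft / 1000.0
    t5 = t_crn5 + LAPSE_RATE_C_PER_1000FT * elev_crn5_ft / 1000.0
    return t12 - t5  # residual difference after the altitude correction

# Hypothetical pair: the CRN12 site is 1000 ft higher and 1.99 degC
# cooler, so the raw gap disappears once both are reduced to sea level:
residual = adjust_pair(10.0, 3000, 11.99, 2000)
```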

402, Kristen, even the LIG max/min thermometers automatically recorded the max and min. The whole TOB creation idea sounds great! It just seems odd that the largest corrections are applied to the more modern equipment.

It assumes that site issues at CRN5 stations were introduced after the station was in operation so that the site issues introduce temperature trends

Dangerous assumption. Why not assume that NASA GISS shows lower temperatures for CRN5 (urban heat island) than CRN12 (see your own graph), and that therefore the NASA GISS files are FUBAR? Sorry, that would mean saying bad things about NASA GISS, and you definitely aren’t going to do that, are you? By the way, glad you got around to naming the databases.

I understand the point about writing down the wrong temperature, but I think they also check it against surrounding stations during data set construction. My understanding is that they take the data from one station and compare it to surrounding stations. If one of the station days is bad compared to the others, they flag it as suspicious (3.5 to 5 standard deviations) or erroneous (more than 5 standard deviations). Once again, this is the old scheme, and I cannot find their new scheme.
In the 3200 data set, the values for days that are flagged in other data sets are then estimated against the other stations and entered in as original values in 3200. I’m sure that this set is available to the public, because they were selling it months ago at NCDC when I researched my NOAA Junk Sale piece.
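A toy version of that neighbor check (the 3.5 and 5 sigma thresholds come from the old scheme as described above; the station values are made up):

```python
import statistics

def classify_against_neighbors(value, neighbor_values):
    """Classify one station-day against its neighbors per the old scheme
    described above: more than 3.5 sigma from the neighbor mean is
    'suspicious', more than 5 sigma is 'erroneous'. Illustrative only."""
    mean = statistics.fmean(neighbor_values)
    sd = statistics.stdev(neighbor_values)
    z = abs(value - mean) / sd
    if z > 5:
        return "erroneous"
    if z > 3.5:
        return "suspicious"
    return "ok"

# Ten neighbors all near 95; a station held to 85 by a purely local
# thunderstorm looks no different from a broken sensor under this test:
neighbors = [94.5, 95.2, 95.0, 94.8, 95.5, 95.1, 94.9, 95.3, 94.7, 95.0]
```

Note the failure mode: the scheme cannot distinguish a genuine local event from an instrument fault, since both simply disagree with the neighbors.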

Captdallas # 404

I agree. It seemed odd to me that they would take an automated system and apply a time of observation adjustment. Probably would be a good idea to check if the MMTS is programmed to end its observation day at some time other than midnight.

Re 380 Steven: Looking back to 1880 is problematic. As I’ve previously noted, there are observational problems with the data, since the Stevenson screen didn’t get introduced to the USWB until 1892 and wasn’t really well deployed until around 1900. Prior to that, all sorts of observational techniques were used with thermometers: north sides of buildings, hanging under trees, odd shelters, etc.

So I tend to want to discard that period, because no standard of observation was well implemented.

I understand and agree. But if moving a date from 1880 to 1900 makes a difference, and a guy like me does it, well, you can hear the screams.

In the past I always ran things from 1900 (prior to JohnV), because the pre-1900 stuff was so sketchy and sparse. So I look at both; I don’t talk about my 1900-to-present stuff. It usually looks better and has better fits.

At some point this needs to be addressed in a systematic fashion: historical evidence and statistical tests for weirdness. The pre-1900 stuff IS all over the map when you look at monthlies. So I think we need two criteria to dump it: historical and mathematical.

But, yes, long time ago looking at this stuff on GISS I started chopping my charts at 1900.

If the MMTS is an automated system that automatically records temperature, then why would the TOBS trend increase as the number of MMTS stations increased?

#404 captdallas2 says on September 22nd, 2007 at 7:02 pm:

402, Kristen, even the LIG max/min thermometers automatically recorded the max and min. The whole TOB creation idea sounds great! It just seems odd that the largest corrections are applied to the more modern equipment.

#408 Kristen Byrnes says on September 22nd, 2007 at 8:05 pm:

(snip)
Captdallas # 404

I agree. It seemed odd to me that they would take an automated system and apply a time of observation adjustment. Probably would be a good idea to check if the MMTS is programmed to end its observation day at some time other than midnight.

Observations were taken at 1700 hrs daily from 1948 until 2005, when the time was changed to 0700, per MMS.

(my emphasis)

Can anyone confirm if the automated stations are programmed to take observations at times other than midnight? If so, then maybe there is some adjustment error in the TOBS adjustments, which may introduce its own bias.

As can be seen, the data points flagged change slightly between the 1994 file (hcn94mea-asc) and the current one. However, the data points themselves appear to be identical. As I posted before, in the data for Walhalla extracted from V2.MEAN, all data points corresponded exactly to the “(blank)” row from hcn94mean-asc.

Thus, it would appear that no outliers have been removed from the data in V2.MEAN.

The program descriptions that I quoted excerpts from in my post #339 are from pages 5 and 9 of the scanned document (pg. 15 & 19 of NDP019.PDF). A further description is on page 36 of the scanned document (pg. 46 of NDP019.PDF). Hope this helps.

Thanks. To me, those descriptions seem so vague as to be barely useful.

Hmmmm. Free the code? Seems like I read that somewhere before.

Jerry continues:

#344 Jerry, with regard to your question on the 6,056 estimated data points, they range from 1799 through 1989 and are from all sources except 0, 1, 6 and 7, with most of them from sources 2 and B.

My question marks there were because the estimation method was not described at all, much less in sufficient detail to be of use.

Max/min temperature sensors in common use at weather observation stations indicate one maximum, and one minimum, temperature until they are reset by a human, i.e. the observer.

ASOS, and AWOS, units are very different, are “automated” in various
regards, but should not be confused with the other, much simpler
devices.

The electronic “MMTS” devices introduced during the 1980s at many US NWS, and perhaps other, weather observation stations are, like the LIG thermometers, not, I repeat not, “automated” in such a way that they would show max/min temperatures for time periods other than those determined by the human operator resetting them at observation times.

When the observer records the temperature readings, and then resets the
device, whether LIG, or electronic, the device will treat the current
temperature as both the max, and the min, of the new time period, until
the temperature rises, and/or falls. The temperature at that time can
affect the readings for two time periods, usually two days.

If the time of observation (and reset) is during usually relatively cool
morning hours, there will be a cool bias. If the time of observation
(and reset) is during usually relatively warm evening hours, there will
be a warm bias. If the time of observation (and reset) changes from one
to the other, the trend will be biased.

Let me repeat that: if the time of observation (and reset) changes from
one to the other, the trend will be biased.
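The mechanism can be demonstrated with a small simulation. Everything about the toy weather model below (the diurnal shape, the noise levels) is made up for illustration; only the reset logic, in which the reset-time temperature counts toward two observational days, follows the description above.

```python
import math
import random

def hourly_temps(days, seed=42):
    """Toy hourly series: a mean-reverting day-to-day 'weather' level,
    a diurnal cycle (min near 5 am, max near 5 pm), and small noise."""
    rng = random.Random(seed)
    level, temps = 0.0, []
    for _ in range(days):
        level = 0.9 * level + rng.gauss(0, 2.0)
        for h in range(24):
            diurnal = -8.0 * math.cos(2 * math.pi * (h - 5) / 24)
            temps.append(level + diurnal + rng.gauss(0, 0.5))
    return temps

def mean_tavg(temps, reset_hour):
    """Mean of (max+min)/2 when the device is reset at reset_hour.
    The 25-hour window includes both reset instants, mimicking the
    carryover described above: the reset-time temperature can affect
    two observational days."""
    days = len(temps) // 24
    vals = []
    for d in range(1, days - 1):
        start = 24 * d + reset_hour
        window = temps[start:start + 25]
        vals.append((max(window) + min(window)) / 2)
    return sum(vals) / len(vals)

temps = hourly_temps(2000)
evening = mean_tavg(temps, 17)   # reset near the daily max
morning = mean_tavg(temps, 7)    # reset near the daily min
midnight = mean_tavg(temps, 0)
```

With these assumed parameters, the evening-reset series should average warmer than the midnight-reset series and the morning-reset series cooler, which is exactly the sign of bias described; a change of observation time therefore shows up as a spurious step in the record.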

“If the time of observation (and reset) is during usually relatively cool
morning hours, there will be a cool bias. If the time of observation
(and reset) is during usually relatively warm evening hours, there will
be a warm bias. If the time of observation (and reset) changes from one
to the other, the trend will be biased.”

I agree, but I would replace “will” with “can”. If the readings are recorded at a likely time for a max or min temperature, there can be an error that can be biased. (Thomas Blackburn proposed a simple method for TOB correction that I will have to find.) If there is a fairly dramatic warming or cooling trend (a front passing through) at the site, a max or min temperature can be in error (for more than one day, as you mentioned) due to TOB, but not necessarily biased: the min can be cold-biased [warm front] or the max can be warm-biased [cold front], making Tave just in error, not necessarily biased.

The greater the difference in Tmax minus Tmin, the less likely there will be any error due to TOB. Finding LIG or MMTS TOB errors should be simple: the exact same temperature (max or min) would be recorded for two or more consecutive days.
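That quick screen (an identical max or min recorded on consecutive days) is easy to script; the sample readings below are invented:

```python
def repeated_extremes(maxes, mins):
    """Return day indices where the recorded max or min exactly repeats
    the previous day's value, the quick TOB/reset screen suggested above."""
    return [i for i in range(1, len(maxes))
            if maxes[i] == maxes[i - 1] or mins[i] == mins[i - 1]]

# A missed reset would leave the next day repeating the prior max:
maxes = [88, 91, 91, 87]
mins = [65, 68, 70, 66]
hits = repeated_extremes(maxes, mins)
```

A hit is only suggestive, of course: two genuinely identical whole-degree extremes on consecutive days will also trigger it.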

Kristen:
My estimate of 1% to 5% of days reaching the maximum expected temperature bias can definitely be questioned. Same with using an exponential distribution. This was a quick first attempt at quantifying the expected CRN5 bias relative to CRN12R.

See my previous post re why CRN5 appears cooler than CRN12R prior to 1970. (I’ll try to find some time to post absolute temperatures instead of biasing to 1951-1980).

LIG (liquid in glass) thermometers (the thermometers in the cotton region shelter, often referred to here as Stevenson screens) are reset each day by the observer by spinning the thermometer after the daily temperature is taken. If the observation is made at any time other than midnight, then there should be a TOB adjustment. The max LIG thermometer is mercury and the min thermometer is alcohol. One problem with this method is the day after a day was missed. Let’s suppose the station is at a business and the observation gets missed on Saturday and Sunday; then the observation on Monday will need to be thrown out, because the thermometers will have recorded the max and min temps for the previous 2 or 3 days rather than the previous 24 hours. I don’t think this is being done, and it may have something to do with the closing of the range when a station changed from CRS to MMTS. http://ams.allenpress.com/archive/1520-0477/72/11/pdf/i1520-0477-72-11-1718.pdf

The MMTS (according to two of the observers we talked to) records for 30 days at a time and is programmed to reset automatically every 24 hours. The question is whether the MMTS is programmed to reset every day at midnight or at some other time. If it is programmed to reset at midnight then there should be no TOBS for any station using MMTS. The same for other automated systems like ASOS.

Besides the common MMTS devices, there are Nimbus PL-2 devices which record 35 days of max/min data, midnight to midnight. They also record max/min data since the previous reset. The question is which do the observers use?

LIG thermometers are subjected to environmental conditions in the CRS and in operator handling which are decidedly not conducive to the maintenance of accuracy in such instruments. NIST and other standards recognize a need for frequent recalibration of such instruments. Due to the nature of the COOP and other weather observation networks using LIG thermometers, the maintenance of the instruments and their recalibration is not likely to meet certain standards for accuracy, and the scope of the problem is not fully investigated.

Likewise, NOAA and other organizations have noted problems with accuracy drift in the thermistors used by ASOS, MMTS, and other electronic thermal sensors, particularly the early models.

An interesting document to read concerning LIG thermometers can be found at:

OCS/NERON was getting quite a handle on the quality control problem, with automated systems performing quality control followed by selective hand-checking. See the PDF, read about the temperature errors being detected and corrected, and note how NOAA is no longer providing access to the NERON data while it is being “assessed.”

RE 438: Jerry, I found the rounding instructions in the document I linked for Kristen:

“If you have the MMTS, obtain the maximum and minimum temperatures by pressing the buttons marked “MAX” and “MIN”, respectively. Record these readings, as well as the current temperature, to the nearest whole degree. The current temperature is the reading shown when no buttons are depressed. If the reading to the right of the decimal is 5 or greater, round off to the higher figure; i.e., 39.5 should be recorded as 40. After recording these values on your form, press the “RESET” and “MAX” buttons simultaneously, then do the same with the “RESET” and “MIN” buttons. You can check to be certain the readings reset properly by pressing the “MAX,” then the “MIN” buttons. They should read the same as the current temperature.”

A first read of the PDF links provided suggests the Oklahoma Climatological Survey (OCS) program embarrassed the snot out of NOAA. The data in the OCS station sample (page 2 in the PDF) are surprisingly similar to Anthony Watts’ surfacestations.org audit effort.

Why hasn’t the OCS/NERON effort at improving climate research data quality been supported by the bureaucracy at NOAA and NASA to date?

Just to be clear: The effort of the OCS, without huge funding, embarrassed the heck out of the bigwigs at NOAA by showing how poorly the existing network is sited, operated, and maintained. I suspect that is why the NERON effort was “buried” by the “climate establishment” at NOAA.

4.1. Identifying Erroneous Observations: The automated QA identifies potential data problems across the NERON network each day. For example, on 13 March 2005, the automated QA began to flag observations from the NERON site near Jonesboro, ME, because of a suspected cool anomaly on the eastern coast of Maine. Because of Jonesboro’s location, the QA meteorologist was unsure whether the cool anomaly was a real mesoscale feature or if it was caused by a sensor problem. Jonesboro continued to report 2-5°C cooler every day than its nearest neighbor (about 50 km up the coast at Eastport, ME). The QA meteorologist contacted the site host and verified that the cool readings were erroneous. The sensor bias was traced to 1 February 2005, and the data were flagged manually back to that date (Fig. 3). In addition, a trouble ticket was issued so that the sensor could be replaced.
Christopher A. Fiebrich, Renee A. McPherson, Clayton C. Fain, Jenifer R. Henslee, and Phillip D. Hurlbut, Oklahoma Climatological Survey, Norman, OK. AN END-TO-END QUALITY ASSURANCE SYSTEM FOR THE MODERNIZED COOP NETWORK: http://ams.confex.com/ams/pdfpapers/92198.pdf

I go skating and come back to all of the manuals we’ve been trying to find for 2 months. Great job, guys, thanks!
John V, below is the list of the CRN 1,2 stations with histories added for the most recent location and equipment changes. All of the data are from MMS. I think if the MMS shows data going back to 1948 it should still be safe to use other data that you have going back farther. My friend and I started researching stations with historical photos. If we find something with photos of a location, maybe from 1930 to 1945, will you be able to use that if those are the only dates that have been documented?
Rural CRN12 Stations:

42572383001,34.7,-118.43 Fairmont CA (Not Surveyed, aerial photographs only but looks good) / No changes in location or equipment / records back to 1931
42572694004,44.63,-123.2 Corvallis State University OR / no change in location or equipment / records back to 1948

42572743002,46.33,-86.92 Chatham experimental farm MI / Location: same throughout / Equipment: LIG throughout / 1987: STN ESTABLISHED TO COMPARE WEATHER READING WITH CHATHAM EXP FARM 20-1484-2, 1.1 MI TO THE NNW / This station is a problem, the correct name is Chatham Exp Farm 2. The actual station is Chatham Exp Farm, which was active from 1948 to 1988 with no recorded station moves and LIG equipment throughout station history.

42574341005,45.13,-95.93 Milan 1 nw MN / Location same throughout / Equipment Nimbus 2006 to current / LIG before that / records back to 1948

42574341007,45.58,-95.88 Morris wc Experiment Station MN / Location: same throughout / Equip: LIG throughout / records back to 1948

I had to retype everything because of a word press error and I left some things out. Angola IN is probably not a location change in 1977. There are 2 stations with possible ID errors: Chatham exp farm MI and Port Gibson MS that someone might want to look into.

From the operator’s guide: “The minimum must be at least as low as the lowest of yesterday’s and today’s AT OBSN temperatures, and the MAX must be at least as high as the highest of today’s and yesterday’s AT OBSN temperatures. For example, if yesterday’s AT OBSN temperature was 95, today’s maximum must be at least as high as 95, even if the maximum this calendar day was only 86. You may record the 86 maximum in the REMARKS (far right) column as ‘PM MAX 86,’ as shown on the sample page [inside front cover] on the first day of the month. This is optional. See the REMARKS column on the sample page for the 23rd of the month for recording last night’s minimum (23), when it was warmer than yesterday’s AT OB temperature (11).”

What’s that all about? I would expect that they would take the max from each calendar day.

#417 Hans: Looking at your chart, it seems that from 1925-1930 the “worst” CRN5 temps have dragged the average down, but from 1945 onwards they have dragged it up. The converse is true for CRN12R. If TOBS adjustments were applied evenly in the GISS series, over all the categories of stations, then the differences shown appear to be due to other artificial adjustments of the data, favouring recent warmth over 1930s warmth. The same features are apparent in John V’s first chart, post #376.

I think it is good evidence of interference with the temperature record for no good reason.

I don’t understand for the life of me why they are rounding to the nearest degree. They are looking for a signal of 0.1 C per decade, and the instruments are calibrated to 0.1 C, so why not record to the nearest tenth?

I guess it depends on the time they report. In the manual Steven linked, they are being told to report temperatures late in the day, after 5 pm. So let’s use a common weather pattern. In the summer the observer takes a report on day 1 at 6 pm. The day’s high was 95 and the temperature at the time of observation was 85. The next day thunderstorms come in at 1 pm and hold the temperature down to 80 until 6 pm. The observer has to report 85 because it was the temperature at the time of observation the previous day.

That reminds me of a problem someone mentioned with using the temperatures at surrounding stations to do quality control. Suppose the temperatures at ten stations in Maine are 95 deg., but a thunderstorm develops in the area of one of the stations and holds the temperature at that station to a high of 85 that day. The temperature there would be considered either erroneous or at least suspicious and replaced with the mean of the surrounding stations.

All above-ground measurements are sampled every 3 s, with the exception of the barometer (which is sampled every 12 s) and the rain gauge (which is event driven). The above-ground measurements are averaged over 5 min. Soil temperature measurements are sampled every 30 s and averaged into 15-min observations. Soil moisture is sampled once every 30 min. Every 5 minutes, all available observations are sent from the site to the Central Operations Facility in Norman.

If the folks in Oklahoma can do this on a shoestring budget, and given the economic impact of future policy decisions based on climate observations, why can’t the combined resources of NOAA and NASA do the same for the entire USA?

Kristen #446:
Currently the program will take all dates from any station listed. For situations like this it would be nice for it to accept date ranges for every station. I will add that to the roadmap.

While I’m at it, what other information should be included in the stations file?
Some ideas:

No matter what time of day the observations are made, if they are consistent for a given station, the observer is always recording the high and low temp for the previous 24 hours.

The key phrase is “if they are consistent for a given station”. The problem is that stations have already changed their recording times. We have old data taken with one recording time and new data taken with a different recording time. Therefore, an adjustment is needed.

Hmmm…..

I’m tired and this is an unformed thought, but in a new analysis could we treat the two sets of station data (before and after the change in recording time) as different series? Doing so would remove any need for a TOBS adjustment.

Thanks, Clayton. The issue of altitude will come up if anyone does a comparison like Kenneth’s.

When looking at class 4s I found they were colder than the 5s and the 1&2s. That will lead to confusion in some people’s minds. The 4s are colder because they are higher on average than other sites. The REAL issue is the TREND. So that is the importance of having the altitude data. Put another way: you cannot simply plot the CRN12 against anything else without assuming an equal distribution of altitude.

So, for example, comparing CRN12 to GISS to see “which is warmer” isn’t sound unless the subset has a representative altitude. (GISS doesn’t correct for altitude.)

The most telling charts are the trend charts. So, GISS shows a trend over period X. This trend would be compared to the trend of CRN12 or CRN5. Or you can compare the trend of the 5s to the trend of CRN12. Or take the difference and take the trend of the difference.

The problem with the trend analysis is “what’s the best test?” So visit the AR(1) thread to get some idea of the complications involved. For now I just run a linear trend; it gives a sense, but I’m very uncomfortable with the approach and think the statistically inclined folks should weigh in with a better method.

methinks the USA should adopt the metric system, like the rest of the world (ROW).

Most engineering fields use metric, it’s just not the public standard. One exception that I can think of is the pitch of pins on an IC. Most of the time they are still referred to using mils, rather than their metric equivalents. This is changing, however.

As I follow this discussion thread, I have had some rather strong objections to the approach taken in some analysis of the temperature data.

My number one issue is using a snapshot quality evaluation and categorization as a basis for going back 100 years in comparing categories. Without more details that just does not make sense.

Another problem in going back, and particularly for small sample sizes of the CRN1 through CRN5 categories, is the missing data in the early years of the data sets.

A third problem is all the discussion of using the various USHCN data sets and the effects that will have on the analysis. Would it not make the most sense to use the finished and most adjusted data set, i.e. USHCN Urban, since it is the final set from which official trends are derived and therefore the one that should be in question?

Finally, I have a problem with the samples and sample sizes used for the CRN1 through CRN5 quality category analysis. To this end I generated 5 randomly selected categories using the sum total of USHCN stations making up the categories CRN1 through CRN5. I did this with Excel, which is rather a brute-force approach when one considers how much more easily a user of R could do it. Once the routines are outlined, replicating the calculations and data manipulations becomes much less time consuming, but if more random group generations (which for statistical purposes should be performed) are necessary, I would appreciate an R user doing it.

For my 5-random-group analysis I selected the period 1960-2005 since it contains very few missing temperatures. My results for the calculated trend lines are as follows:

Random Group A: Temp = 0.044*Year -34.5; R^2 = 0.53.

Random Group B: Temp = 0.038*Year -24.5; R^2 = 0.45.

Random Group C: Temp = 0.040*Year -26.0; R^2 = 0.47.

Random Group D: Temp = 0.040*Year -28.8; R^2 = 0.47.

Random Group E: Temp = 0.048*Year -44.6; R^2 = 0.50.

For a single rendition of 5 random samples, I think they show a maximum difference comparable to what we are looking at for the Category CRN12 to CRN5 comparison.
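The random-grouping step described above (which the commenter hoped an R user would automate) is easy to script. Here is a hypothetical Python sketch, not anyone's actual code; the station IDs are placeholders, and the pool size of 395 matches the CRN1 through CRN5 counts given later in the thread:

```python
import random

def random_groups(station_ids, n_groups=5, seed=None):
    """Shuffle the pooled station list and deal it round-robin into
    n_groups groups of (as nearly as possible) equal size."""
    rng = random.Random(seed)
    ids = list(station_ids)
    rng.shuffle(ids)
    return [ids[i::n_groups] for i in range(n_groups)]

# Placeholder IDs standing in for the 395 pooled CRN1-CRN5 stations:
stations = ["station_%03d" % i for i in range(395)]
groups = random_groups(stations, seed=1)
sizes = [len(g) for g in groups]  # five groups of 79 stations each
```

Each group would then be averaged and fitted with a trend line, exactly as in the Excel version; rerunning with a new seed gives a fresh random grouping.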

I should have added that my Random Group analysis was taken from the USHCN Urban data set, i.e. the most adjusted version in the progression of adjustments and I assume also the “officially” used data set from USHCN.

I’ve not seen anything new about the USHCN V2 timeline since the initial
announcement. I have seen an indication that an update of USHCN might be
posted this week, but I am assuming that if it does happen, it will be V1.

474 Kenneth,
I think the original intent was to determine what the temperature trend "should be" based on the highest-quality rural sites and compare that to the GISS estimate. These sites should not need adjustments for UHI.

464 Mosher,
We should come up with a number(s) to characterize the trend for each station and store it in a new column on Anthony’s spreadsheet. Along with a column for rural?, waterplant?, etc. It will be very easy to tease data out of such a table. The first step though is to determine which dataset to use.

Which dataset? I see no SUBSTANTIATED issue with the USHCN TOBS, MMTS, or SHAP adjustments.
That is NOT to say the data and methods are correct. I'm not sure about FILNET or the
Urban Homogenization. Version 2 is redoing the whole homogenization approach (better, from
my reading). So:

1. I would use as input what GISS uses, FOR THE PRESENT, OR
2. I would use what JerryB sees as the most authoritative. He just has a tropical
storm named after him.

This does not preclude revisiting the issues folks have with TOBS, MMTS, etc.

Columns: adding a column for rural is good; also, one for Hansen's nightlights.

I think you need to put in:
The station name
The state it is in
The country it is in
Long/Lat
GHCN ID #
NWS Coop ID # (this can be changed to the country's ID code if you do the rest of the world)
A column for the actual raw data that we expect to get from the observer forms that has no adjustments at all
A column to show how many days are missing in the month
A column for each data adjustment and data source
A column for the CRN rating so that as CRN rating changes it can be put into the correct context (maybe CRN = 0 or U if it is unknown for a time period)
GISS Lights
Population
Type: coastal, ocean, land etc
Elevation
Type of equipment (e.g., A = ASOS, B = MMTS, etc., so as the equipment changes it will be identified for each month)

I think you should do a monthly graph too, from 1979 on, to compare to satellite temps.
I think you should link the daily data to the stations that are being used; it would be a big file, but the stations are still sparse and will come on slowly.

Anyway, just some thoughts. I've asked some friends for more advice and I'll let you know what they suggest.

I came across this interesting tidbit re: UHI, not sure anyone has seen it before, but is somewhat relevant to #4 of the thread subject. Please move to another thread if not considered relevant for this particular thread.

A comparison of daytime and night-time thermal satellite images of Hong Kong for urban climate studies

For completeness of my analysis I need to add the following information for my Random Group comparison: first, the trendlines using the Urban data set for the CRN12 and CRN5 stations; second, that I used degrees F and not degrees C in the reported analysis results, so the per-year coefficients should be multiplied by 5/9 to convert; and finally, the corrected A, B, C, D and E designations for the respective 5 Random Groups:

Random Group A: Temp = 0.044*Year -34.5; R^2 = 0.53.

Random Group B: Temp = 0.038*Year -24.5; R^2 = 0.45.

Random Group C: Temp = 0.040*Year -26.0; R^2 = 0.47.

Random Group D: Temp = 0.040*Year -28.8; R^2 = 0.47.

Random Group E: Temp = 0.048*Year -44.6; R^2 = 0.50.

CRN12: Temp = 0.039*Year -25.6; R^2 = 0.46.

CRN5: Temp = 0.048*Year -41.8; R^2 = 0.65.

Please note that the CRN12 and CRN5 difference is nearly equal to that for the maximum difference within the 5 Random Groups.

Re:#489
Ken,
Since your sub-groups are randomly selected, can you use counting statistics to estimate the increase in variance caused by the smaller number of stations used to construct the trend? Or at least put confidence limits on the slopes of the trend lines to see if they are significantly different. I suggested doing something like that for the audited stations vs. the whole population and Sinan jumped all over me because the audited stations could be unrepresentative of the whole population. That's not the case for your experiment.
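For what that might look like concretely, here is a minimal sketch (my own illustration, not anyone's actual code; the data are synthetic) of putting a standard error and an approximate confidence interval on a least-squares slope:

```python
import math

def slope_with_ci(years, temps, t_crit=2.0):
    """Ordinary least-squares slope with an approximate 95% CI.
    t_crit ~ 2.0 is a rough large-sample critical value."""
    n = len(years)
    xbar = sum(years) / n
    ybar = sum(temps) / n
    sxx = sum((x - xbar) ** 2 for x in years)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(years, temps))
    b = sxy / sxx
    a = ybar - b * xbar
    # Residual standard error with n-2 degrees of freedom.
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(years, temps))
    se_b = math.sqrt(sse / (n - 2) / sxx)
    return b, (b - t_crit * se_b, b + t_crit * se_b)

# Synthetic, noiseless toy data over the 1960-2005 window:
years = list(range(1960, 2006))
temps = [50 + 0.04 * (y - 1960) for y in years]
b, (lo, hi) = slope_with_ci(years, temps)
```

Two groups whose slope intervals overlap substantially cannot be called significantly different, which is the comparison being asked for here.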

Clayton is doing a database of the station stuff: station ID, dates,
instruments, lat/lon, CRN #, etc. This database is where all
the complication should occur. The station list should be as close to
a list of indices as possible. Its function is to pick out series from the data file.

So the user makes a query on that DB and gets as a RETURN the list of station IDs
that match the QUERY. Then you have a list for doing your read.

So, I would NOT complicate the station list data structure too much

Talk to Clayton; he's putting together a DB. I think it would make a good
add-on to your program.

So architecturally it would go like so:

User makes a query to Clayton's DB: "SELECT all stations where CRN=1 AND INSTRUMENT=MMTS from 1910 to 1932".

His DB would return the station list. And then you take over.

I would NOT overcomplicate the station list, since it primarily functions
as an index into the data file.

The other thought is just to load it all and then operate.

That is: load all station data into an internal structure and then operate. That would be my major
suggestion for your approach. As long as all the records for all the sites can fit, load them all.
Load them once, and then you can run multiple methods on that object. Your style is serial:
make the list, load what's in the list, process the list, output the data.

I can't find free SQL DB hosting on the interweb (freesql.org looked promising but doesn't look like it's going to work). I'll try to see if I can finagle my home PC to somehow do this. It would be MySQL.

I currently have two tables in Access – one is Anthony’s spreadsheet “stations” (with lat,lon, elev, crn,etc.). The other is the GHCN_v2 data “Temperatures” (same stuff that JohnV is using).

I haven't been able to do too much with the temperature data; I don't know where to begin. Unless you want to know something like the trend for the average temperature of all TX stations, or the total number of stations with complete measurements in a given year, etc.

#495 Clayton B:
If you’re interested I could probably host your stations database at opentemp.org. It would be great for your database and my program to work together. A little front-end PHP and it could spit out station lists in a format my program can use.

I prefer to leave the temps as text files from GHCN and USHCN (I’m hoping to make a USHCN parser soon).

DeWitt, my intent was to look at a minimal number of random groupings of stations to determine whether we would see a difference as large as the one between CRN12 and CRN5 for the same period. From the results I conclude that the noise level in station trends is such that finding a statistically significant difference between CRN12 and CRN5 stations will be difficult. That tends to corroborate my finding of no significant differences between CRN12 and CRN5 absolute monthly temperatures when paired on the basis of latitude and longitude and adjusted for lapse rate by altitude differences.

Doing my analysis correctly would require doing a large number of random grouping simulations (such as could be handled easily in R if I had the confidence at this point to try) and establishing the expected variations to compare to CRN12 and CRN5 stations.

I personally think all the talk about using different data sets and looking back over a hundred years (with records with much missing data) is getting us caught up in our own underwear and away from a correct analysis of the snapshot quality evaluations and categorizations.

By the way, I think Watts and his team's work indicates that there is some uncertainty in the temperature measurements that has not previously been acknowledged. Further analysis of these data sets leads me to think that local temperature differences are in considerable doubt, and that those differences are not really addressed because most of the climate scientists using the data are more concerned with national and global trends. I am more in agreement with Pielke that local climates are what are ultimately important to people.

Re: # 493

One thing I love about this place is we always know how to get back on topic.

Okay, so I see Hans Erren has replicated John V's analysis, which shows there is reasonably good agreement with GISS temps (there is some variation between CRN 1,2, CRN 5, and GISTEMP, but that is to be expected).

I still haven't seen the CRN 3 and 4 stations, which represent the bulk of the records, but I imagine these should be similar.

So, the US lower-48 is settled (if we accept the TOBS adjustments)?

The lower 48 doesn’t exactly prove the global warming case, so let’s have a look at the rest of the world now?

499. Nicely put, Kenneth. I was puzzling over the same thing. The noise issue is one that made me
want to focus on seasons. On the assumption that micro-UHI would behave like macro-UHI (most of the
studies showed some seasonality in UHI), you might be able to find the signal that way
if the data set is limited.

The other thing I was struggling with is a way to communicate how tough it is to pull out this
signal if:

1. It's modulated by wind and clouds.
2. It's seasonally variable.
3. It develops over time in certain cases.

I've thought about doing some Monte Carlos…

So Microsite

Is it real? Well, everything I've read in accepted climate science literature
(Geiger is fantastically German in his precision; an utterly astonishing piece of work)
tells me that there will be a BIAS. The first CRN study showed a BIAS at a class 1 site.
The trick will be teasing it out. I thought about combining the 4s and 5s; that's why I did
the altitude study.

Now, Anthony has a really neat site: Eureka, CA. Since before 1900 it was located on the roof
of the highest building in town. Then it was moved to a really bad site on an island in 1994.

That’s one of those sites that begs for a detailed investigation back to the raw.

Tough love? I figured you and DeWitt would just bore people to tears… HA, just kidding.

John's just posted his code. I think there are probably 4 or 5 of us who have actually used it.
(I actually ran all 1221 sites.) It hasn't crashed, but we'd be foolish (John would agree)
to say that the thing is tested.

I have some ideas for test data sets that I'll share with John on his site, just to wring the
code out and test sensitivities. There are also little nits like coastal cities which are dropped,
and non-equal-area gridding. Nits: John's already documented them, and my sense is he'd like to be methodical.

Plus, I don't know if anyone has actually walked through the code? One problem I think Hansen had was
exactly this: developing code on a subset of data, doing a little testing, then letting it fly on the whole
mess of data.

There is still a mystery of sorts in the Lower 48.

The TREND of GISS versus the trend of CRN12… I think it said GISS warmed at a higher rate.

1. I'm uneasy about a linear trend fit over 100 years.
2. I'm uneasy about cherry-picking years (like from 1998 to present!).
3. The number of Class 12 stations is still low.
4. The significance of the test is unstated.

The TREND (at least the stuff I did with monthly data) showed that Class 5 warmed more than Class 12,
by a small amount, like in the 0.2 C range. Hans shows something similar. The issue is
that we are just poking around, and Kenneth is the only one who has come close to doing a
formal test.

ROW: if you think the USHCN data provenance is suspect, WAIT until you look at the ROW.

Funny question no one has asked: how did the 1221 sites get picked out of the 5000 or so sites
in the US? Everybody focuses on the adjustments. The book says length of record? Is the longest the best?

Note this: Hansen and Peterson use a methodology called the reference station. They pick a station
with a long record and bias everything to that in steps 0 and 1. That decision leverages a potentially biased vector.

JohnV didn't do this. And I don't think it's a liability; I think it's an advantage that could give
you better spatio-temporal coverage. OK, now I'm rambling.

1. The ISSUE at hand is MICROSITE BIAS: namely, sites that show a siting bias that has NOT been accounted for.

2. TOBS. No one has demonstrated any issue in actual data with TOBS adjustments that has anything
to do with the rating of the site. None.

Now, JerryB has linked his stuff, and I've linked his stuff, and you should all read Karl's paper
before asking another TOBS question or proffering another TOBS opinion.
Further, you can go download some CRN data, in 5-minute increments,
and study the stuff.

Now, do I think it would be a good idea to revisit TOBS? Yes. But this is not the place.
One thing I love about this place is we always know how to get back on topic. Over at RC
every discussion degenerates into "the ice is melting" and fuel from dog poo and wood chips.

It’s true that TOB is pretty far from the topic of this thread, so perhaps our host could start a new one just on TOB, ideally copying into it the pertinent posts from this thread? I may have missed a few, but a good start would be #305, 376, 400, 402, 403, 413, 418, 419, 420, 424, 455, 458, 460, 462, 464, 468, 484, 488, and 493. Or perhaps Stephen Mosher could start it, if he is authorized to do so. (I don’t know what the rules are, but he seems to be a regular.)

#470 Kristen Byrnes, #475 steven mosher:
I should have been more clear when I asked about what should be included in the “stations file”. I think Kristen’s list should be included in Clayton B’s station database, but I agree with steven mosher that the station list for temperature processing should be simpler. (It’s intended as a list of stations to include in an analysis, with required metadata to define the station).

Kenneth Fritsch:
That’s an interesting and useful analysis of the uncertainty in the temperature trends.
It’s the kind of work we’ll need to do before we have any hope of extending the analysis to ROW.

That is: load all station data into an internal structure and then operate. That would be my major suggestion for your approach. As long as all the records for all the sites can fit, load them all. Load them once, and then you can run multiple methods on that object. Your style is serial: make the list, load what's in the list, process the list, output the data.

My original plan was to pre-process the data once and store the intermediate results. Subsequent analyses could use the intermediate results and save the pre-processing step. I decided against this for a few reasons:

1. Each intermediate file between the original data and the final result is an opportunity to use the wrong file in the next step;

2. Versioning complications as the program evolves (it’s too early to lock into an intermediate file format, and I don’t want to support multiple file formats down the road)

3. Simplicity over speed (CPU time is cheap, auditing time is expensive)

I agree.
OpenTemp needs a lot of work before we start using it on ROW. (It could be used now with a few changes, but the error bars would be unknown and presumably enormous).

There is a short list of minor bugs that needs to be fixed, another short list of changes to the analysis method, a longer list of features to be added, many tests to be run, and analyses to be performed on the US data.

That could take a while. In the interest of the morale of anyone working on this, analyses with error bars can start as soon as the code is ready using conservative (wide) probability distributions. With luck we will be able to use the US48 data to refine the error bars.

Clayton B:
I can host MySQL databases. I’m a SQL Server guy but the query language is almost the same so it should not take long to set it up.

There should be a secondary table for the station metadata. Rather than get into the technical details here, I will start a new thread at OpenTemp.org. You and I should put together a rough plan and then invite some key players to give their stamp of approval.

I will try to get the DB and conversation thread set up tomorrow (real life permitting).

For every station you have a series of dates when certain changes were made:
DATE
1. Location changes (moves in lat/lon)
2. Change in elevation
3. Change in observer
4. Change in TOBS
5. Change in equipment
etc.

Have you been to the NOAA site to take a look at what a history file looks like?

In addition we have variables that are added:
1. A population figure is added (need clarification here)
2. Hansen has his nightlights (measured in 1995, I think)
3. Rural/Small/Urban designation
4. Land description
5. Distance from ocean
6. And then we have Anthony's data: CRN. I'd like to see more detail here as well

TMIN and TMAX on a daily basis is best if you want to look at detailed
microsite issues. Huge files. This would be fun but a big undertaking.
Monthly TMAX and TMIN might be interesting. Not sure.

4. Probability distributions on all temperature readings
Meaning?
4a. Normal distribution for accuracy
4b. Exponential (?) distribution for station bias
Not sure what you are thinking here. Splain.

5. Convert to using trends instead of absolute temperatures
Both. And anomaly as well.
6. Probability distributions on station temperature trends: yes
7. Probability distributions on grid-point temperature trends (includes errors from interpolating over long distances)
ok
8. Analyses with error bars: now that will cause some fun fireworks!
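As one reading of items 4a and 4b above (the distribution parameters here are invented purely for illustration), each simulated reading could be the true temperature plus a symmetric instrument error plus a one-sided station bias:

```python
import random

def simulated_reading(true_temp, rng,
                      instrument_sd=0.5,  # assumed instrument sigma, deg F
                      bias_scale=0.3):    # assumed mean one-sided bias, deg F
    """One simulated observation: a normally distributed instrument error
    plus an exponentially distributed (always-warming) station bias."""
    instrument_err = rng.gauss(0.0, instrument_sd)
    station_bias = rng.expovariate(1.0 / bias_scale)  # mean = bias_scale
    return true_temp + instrument_err + station_bias

rng = random.Random(42)
readings = [simulated_reading(60.0, rng) for _ in range(10000)]
mean_reading = sum(readings) / len(readings)
# The one-sided bias pushes the average above the true 60.0 by roughly 0.3.
```

The point of the asymmetric term is that station issues (asphalt, buildings, wind shelter) mostly warm rather than cool, so their errors do not average out the way instrument noise does.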

I see a lot of discussion about the potential database here, but not much about how much normalization to do to the database itself. I'd think, for instance, that to implement Steven's #499, it will be necessary to have a metadata table. It would be fairly simple to implement: there would be a station ID column, a metadata type column, a few date columns, some data columns (as suggested in earlier messages), a text column or two, and (of course) a metadata ID column.

Writing queries will be fairly simple. In many cases you'd just want to get all the metadata rows for a given station; in other cases you'd want to get those for a given class of stations, store them in a temp table, and use them as needed by your program.

BTW, if properly implemented, you could use this to see historical changes in the metadata. That is, if you found a mistake in the date a station changed locations, you could just enter a new row with the updated value, and the metadata-implementation-date column would let you figure out whether or not to use that change when you were trying to reconstruct an old calculation.

Of course, you might also need a meta-metadata table for Hansen-style universal updates.
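A sketch of what such a metadata table might look like (all table names, column names, and dates are invented): the entered-on column is what lets you reconstruct an old calculation by ignoring corrections entered after a given date.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""
CREATE TABLE station_metadata (
    metadata_id  INTEGER PRIMARY KEY,
    station_id   TEXT,
    meta_type    TEXT,   -- e.g. 'move', 'tobs_change', 'equipment'
    effective_on TEXT,   -- when the change happened at the station
    entered_on   TEXT,   -- when this row was added to the database
    value        TEXT
)""")
# An original (mistaken) move date, then a later correction as a NEW row:
db.execute("INSERT INTO station_metadata VALUES "
           "(1, 'USC001', 'move', '1994-01-01', '2007-09-01', 'roof to island')")
db.execute("INSERT INTO station_metadata VALUES "
           "(2, 'USC001', 'move', '1994-06-01', '2007-10-15', 'corrected move date')")

# Reconstruct what an analysis run on 2007-09-30 would have seen:
rows = db.execute(
    "SELECT effective_on FROM station_metadata "
    "WHERE station_id = 'USC001' AND entered_on <= '2007-09-30'"
).fetchall()
```

Because corrections are appended rather than overwritten, the pre-correction view (only the '1994-01-01' row here) remains queryable forever.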

My point, which was probably not emphasized sufficiently, was in part that we are working with small samples here, and that in itself produces added uncertainty. We need more data to look for statistically significant differences, if they exist. If I divide the samples further, the uncertainty will probably increase even when restricted geographically. When I was looking at matched pairs by latitude and longitude, adjusted for altitude, for the CRN12 and CRN5 categories, the average differences were close to zero but the individual cases had significant variations. I have not listed my temperature differences corrected for altitude, and thus I will do so here to reinforce my point that local variations, and adjustments for that matter, are great and add much noise to detecting the sought signals:

*The differences are in degrees C, and I list those using the USHCN urban data set (Urban) and the John V derived data set (JV), even though the JV set did not provide anywhere near complete coverage for the period designated. For the JV comparison the time period was 01/2005-03/2006, and for the Urban set it was 01/2005-09/2006. The differences are CRN12-CRN5, and the per-station-pair difference shows the CRN12 stations about 0.3 degrees per month warmer than the CRN5, but that is not anywhere near a statistically significant difference.

2. For linear trends, consider using the natural shape of the temperature history (ie 1915-1935, 1935-1975, 1975-2005).

You are saying here, I assume, that a linear fit for a trendline would give a higher R^2 and thus be more appropriate. I will look at the 5 Random Groups and the CRN12 and CRN5 groups for the 1975-2005 time period. I chose the 1960-2005 period because it was the period with very little missing data.

3. Do a lot more random samples (I realize this is the hard part  a post-processing program for combining multiple results is on my wishlist for OpenTemp)

It would only be the hard part for me: reload R, do a little learning, and use it to do this. I was stalling in hopes of an R user doing it for me. That works with my grandkids, but I suspect no one here is going to do any spoiling.

1. USHCN Parser
Purpose is to allow #2 and #3 (only USHCN files have the data).

2. US48 Analysis with TOBS
USHCN daily may be useful later, but I was thinking about USHCN monthly. I would like to use TOBS or MMTS monthly means for one more comparison against GISTEMP.

I am also curious about comparing Tmin and Tmax trends for CRN12R and CRN5.

SHAP might be useful, but if OpenTemp does all calculations based on trends (vs absolute temperatures), I can treat every station history adjustment as the beginning of a new trend series instead of adjusting. Actually, I can do the same with TOBS and MMTS.

3. US48 analysis with Tmin, Tmax, Tavg
I’m not actually thinking about TOBS here. I want to compare CRN12R vs CRN5 on Tmin and Tmax. In theory they should match on Tmax and be very different on Tmin.

4. Probability Distributions
The recorded temperature at every station is affected by instrument errors and station issues. The instrument errors can probably be modelled as a normal distribution. The station issues are probably not symmetric so will need a different distribution. I used an exponential distribution when estimating the expected magnitude of station issues in a previous post.

5. Convert to trends
There was a lot of discussion a few days ago about using temperature trends in all calculations and not worrying about the absolute temperature. I think that makes a *lot* of sense:
– Constant station bias can be discarded
– TOBS, MMTS, and station moves can indicate the beginning of a new series instead of requiring adjustments
– The trend error bounds for gridpoints between stations are probably easier to quantify than the absolute temperature error bounds
– It’s a new approach instead of duplicating GISTEMP or HadCRU
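A toy sketch of point 5 (the data and the breakpoint are invented): split the record at each station-history change and fit a separate trend per segment, so that no adjustment is ever applied to the temperatures themselves.

```python
def segment_trends(years, temps, breakpoints):
    """Split a station record at the given breakpoint years (station
    moves, TOBS changes, MMTS conversions) and return the OLS trend
    of each unadjusted segment."""
    cuts = sorted(set(breakpoints))
    bounds = [min(years)] + cuts + [max(years) + 1]
    trends = []
    for lo, hi in zip(bounds, bounds[1:]):
        seg = [(x, y) for x, y in zip(years, temps) if lo <= x < hi]
        if len(seg) < 2:
            continue
        n = len(seg)
        xbar = sum(x for x, _ in seg) / n
        ybar = sum(y for _, y in seg) / n
        sxx = sum((x - xbar) ** 2 for x, _ in seg)
        sxy = sum((x - xbar) * (y - ybar) for x, y in seg)
        trends.append(sxy / sxx)
    return trends

# Toy record: steady 0.02/yr warming with a +1.0 step at a 1994 move.
years = list(range(1980, 2006))
temps = [50 + 0.02 * (y - 1980) + (1.0 if y >= 1994 else 0.0) for y in years]
trends = segment_trends(years, temps, breakpoints=[1994])
```

Each segment recovers the underlying 0.02/yr trend, whereas a single fit over the whole record would be inflated by the unadjusted step change; that is the sense in which the constant station bias "can be discarded".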

I’m just one of the many lurkers here, but I would like to give you my personal thanks for the work you are doing here.
When you say:

5. Convert to trends
There was a lot of discussion a few days ago about using temperature trends in all calculations and not worrying about the absolute temperature. I think that makes a *lot* of sense:
– Constant station bias can be discarded
– TOBS, MMTS, and station moves can indicate the beginning a new series instead of requiring adjustments
– The trend error bounds for gridpoints between stations are probably easier to quantify than the absolute temperature error bounds
– Its a new approach instead of duplicating GISTEMP or HadCRU

IMHO, this is the only way to gain perspective regarding the validity of the GISTEMP or HadCRU methodologies. If there is a flaw in taking the trend approach, I for one fail to see it. I think this is an important idea and hope that Steve M agrees enough to promote it to its own thread, where it can get the attention it deserves.

TMAX and TMIN is an interesting line. I've only done this kind of comparison on a daily basis.

For example: take Site X daily max and daily min. Take site Y 40KM away.
Site X is a good site; Site Y is a bad site.

Compare the deltas in TMAX and TMIN on a daily basis.

Let's say that Site X is over grass and site Y is over asphalt.

They are both heated in the same fashion (same sun, similar lat/lon, same wind, same clouds).

Site X is over soil and grass. That capacitor will release stored energy as the sun goes down.
The profile of that release will influence the MIN seen that night.

(Have you read Oke's paper?)

Site Y is over asphalt. It has different storage properties. It stores heat, transferring some to
the ground underneath it and some back to space, depending on temperature differentials. Perhaps
on the first hot day it acts rather "soil-like" and we see no impact on MIN. Then we get another
hot day, and another, and another, and the temperature differentials between the soil and the air
change such that the asphalt discharges more to the air, and we see the MIN start to creep up.

So, consider that a grossly oversimplified model. If things operated that way, then you might
be able to pick up that signal in daily data. But if the signal is intermittent, then looking
at monthly MIN could swamp it in noise.
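A toy illustration of that dilution (all numbers invented): an intermittent +1.5 degree TMIN excursion on a third of the nights is obvious in the daily deltas but shrinks to 0.5 degrees in the monthly mean.

```python
# Daily TMIN at a good site (X) and a bad site (Y); the extra warmth at Y
# appears only on some nights (an invented, purely illustrative pattern).
tmin_x = [55.0] * 30
tmin_y = [55.0 + (1.5 if day % 3 == 0 else 0.0) for day in range(30)]

daily_deltas = [y - x for x, y in zip(tmin_x, tmin_y)]
monthly_delta = sum(daily_deltas) / len(daily_deltas)
# The peak nightly delta is 1.5, but the monthly mean delta is only 0.5:
# an intermittent microsite signal is diluted by the quiet nights.
```

This is why daily TMIN comparisons between nearby paired sites may show a microsite bias that monthly averages wash out.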

The other thing with MAX and MIN is classifying the siting issues.

Let me go through the issues.

Warming Bias:

1. Ground cover: asphalt, rock, rooftop. These materials store and release
energy with different time profiles than grass or vegetation. This distorts TMIN.

2. Artificial heating sources: buildings, etc. Essentially they keep the air near the
ground artificially warmer than it would otherwise be, increase the boundary layer,
and inhibit (kinda like global warming) the transfer of energy to free space.

3. Wind shelter: if a site is sheltered from the wind by trees or structures, then you'd
see signals in MIN.

Cooling bias: a site that is shaded or on sloped land is potentially cooled. Basically
the sun can't heat the earth there. So you'd expect to see TMAX issues with these kinds of sites,
modulated of course by several factors: the time of the shading, cloud cover, the type
of the leaves, wind, latitude, season… Ask a gardener.

So shading of a site can lower MAX. Can it infect MIN? Good question. Maybe via restricted
sky view.

Wind can affect TMAX as well by increasing the turbulence. During the day the sun will first warm the air closest to the ground. (Actually the sun warms the ground which warms the air – just to keep DP off my case.) Increased turbulence will mix this surface layer air with air that hasn’t warmed yet.

Remember that frost is moisture forming as surfaces cool relative to the dew point. If there is shading, the moisture may not reach the ground under and to the side of the shade. You would need to confirm your hypotheses with a thermometer.

You should ask JohnV for a login to the database that he is hosting at openSQL. There should probably be a guest login in order to keep it open to all. Some of these things are already there from the USHCN station metadata information.

To look further at the CRN12 and CRN5 trend differences when compared to random groups from the same population of CRN categories 1, 2, 3, 4 and 5, I took 27 sets with 5 groups randomly selected in each set. I found the maximum trend of the 5 groups in each of the 27 sets and then subtracted the other 4 group trends in the set from the maximum in the set. This gave 108 differences with which to look at the portion falling into the ranges shown below and to compare to the CRN12-to-CRN5 trend differences for the same period. The period selected was 1960-2005 and used the USHCN Urban data set with temperatures in degrees F.

When these differences are compared to the CRN12-to-CRN5 trend difference of 0.0481 (for CRN5) minus 0.0394 (for CRN12), or 0.0087, one can readily see that this difference falls into the middle of the expected differences from randomly chosen groups. With these sample sizes and noise levels, we cannot make conclusive statements about the trend differences between the CRN12 and CRN5 stations.
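The procedure described above is essentially a randomization test. A hypothetical Python sketch of the general idea (the station trends are placeholder inputs; for simplicity the group trend here is the mean of per-station trends, rather than a trend fitted to the group-average series as in the posts above):

```python
import random

def randomization_pvalue(trend_by_station, crn12_ids, crn5_ids,
                         n_sets=1000, seed=0):
    """Fraction of random splits whose trend difference is at least as
    large as the observed CRN12-vs-CRN5 difference."""
    rng = random.Random(seed)
    pool = list(trend_by_station)  # all station IDs, whole population

    def group_trend(ids):
        # Simplification: mean of the per-station trends.
        return sum(trend_by_station[i] for i in ids) / len(ids)

    observed = abs(group_trend(crn5_ids) - group_trend(crn12_ids))
    hits = 0
    for _ in range(n_sets):
        rng.shuffle(pool)
        g1 = pool[:len(crn12_ids)]
        g2 = pool[len(crn12_ids):len(crn12_ids) + len(crn5_ids)]
        if abs(group_trend(g2) - group_trend(g1)) >= observed:
            hits += 1
    return hits / n_sets
```

A returned fraction well above 0.05 would say, as Kenneth concludes, that the CRN12-to-CRN5 difference is not distinguishable from the noise of random grouping at the usual significance level.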

To Kenneth Fritsch: re 519 and other similar posts.
What is the number of stations you used for your random groups? How many in each group is what I am wondering.
How many of the CRN5 and how many of the CRN12 stations did you use in computing the trend difference between CRN12’s and CRN5’s?
Thanks, just trying to follow your analysis.

When my car is parked under a canopy, there is no dew on the window, and when I drive off, dew does not form on the window.
When my car isn't covered, there is dew on the window, and when I wipe it off, more dew starts to form immediately. The new condensation shows that the window itself is colder than the air.

In answering a couple questions posed in recent posts in this thread I will detail my data and results for further clarification.

The category station counts are listed below:

CRN1 = 17; CRN2 = 36; CRN12 = 53; CRN3 = 69; CRN4 = 215; CRN5 = 58.

The total stations in all of these categories were then put into random order 27 times to yield 27 sets. Each of these 27 sets was then divided into 5 random groups of equal, or as nearly equal as possible, size with no stations in common. The groups were labeled A, B, C, D and E for each set. For each group a trend slope was determined. Within each set the maximum trend was determined and the other four group trends were subtracted from it. These 4 differences were recorded for each set to provide 4*27, or 108, differences (in trends). I arranged the portion of the 108 differences that fell into the ranges of differences shown in my previous post and compared that finding to the trend difference I found between CRN12 and CRN5 stations for the same period.

Note that the numbers of stations in the random groups were larger than the CRN12 and CRN5 station numbers. This should give a more conservative estimate of the expected difference between the trends of the CRN12 and CRN5 stations, assuming no statistically significant differences.

I used the maximum as a reference for ease of calculation and because it assumes that finding a maximum trend is the intent of the CRN12-to-CRN5 comparison. The more general case, without this assumption, requires using all 10 unique pairwise comparisons among the 5 groups within each set. That will be covered in my next post.

Ever notice that CA links to a bunch of places from "both sides", while there's nothing at RC linking to either here or WCR, etc.? Why the lack of transparency, openness and replicability?

Anyway, when this all started, reading the things here and at RP Jr's and Sr's sites, Deltoid, RC, Rabett, Motl, BCL, Wikipedia, etc., I thought that the people running this entire AGW machine (and more specifically Hansen et al. and the temp data), judging from the way they were acting, were one or more of: a) very territorial; b) holier than thou; c) hyping things for political purposes; d) trying to run "the deniers" around in circles to keep them busy; e) hiding that they didn't really know what the network was specifically like (and thought it unimportant); f) hiding what they were doing for other reasons, which are many and varied (and not worth thinking about in any detail); g) trying to do their job and too busy to act like normal people (or incapable of acting so in the first place). And so on. It appears that all of the above are to some degree true. But I've always thought the idea that some sort of global average temperature can be tracked and is meaningful a bit far-fetched. Be that as it may.

When the station surveys started, seeing the reaction from “the alarmists” and how vehemently they were fighting it, it seemed so odd that those who are always talking about how they are for the science would be so vehement about resisting verification. Then I read about how it was ‘a waste of time’, ‘wouldn’t tell anyone anything’, was ‘stupid’, and all those other paranoid attributions of hidden motives to both Anthony and Steve, and thought about how fanatical and belittling the alarmists so often are. Then we see the results, and get the snotty ‘we told you so’ and ‘ha ha we were right’ etc. Same attitude. Anyone else get the impression some of their own traits are being transferred here, first in the paranoia and then in the gloating about something getting verified, as was the goal?

I always thought we’d see pretty much the same results, but there have been some issues (the US being off by a few tenths for a number of years), and some things taken for granted (such as the Halpern Hypothesis of Offsetting QC Failures) turned out to be incorrect assumptions. 0.1C is a disaster to the planet, but being off by a few tenths for years is a ‘nothing’ mistake. *sigh*

The question remains though. Regardless of the results, how can you know you corrected something unless you know what’s wrong? Unless it’s independently verified both by your own methods and a variation of it or a different one?

Per my intent from post #524, I did the 10 unique pairings of the 5 random groups in each of 27 sets, for a total of 270 differences, and tabulated the portions for each of the ranges listed below. This calculation, in my judgment, gives the general case with no assumption that any one group has a larger trend than the others. The CRN12-to-CRN5 difference of 0.0087 puts this difference somewhere near the top 25% (largest differences) of the randomly selected groups, but does not meet any commonly used criterion for statistical significance (5% or less).
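A minimal sketch of the resampling procedure described in these posts, as I read it (my own reconstruction, not the actual code used; the grouping and trend-fitting details are assumptions): shuffle all stations into 5 disjoint groups, fit a least-squares trend per group, and collect all 10 pairwise trend differences per shuffle to form a reference distribution.

```python
import itertools
import random
import numpy as np

def ols_slope(years, temps):
    """Least-squares trend (degrees per year) of a yearly series."""
    return np.polyfit(years, temps, 1)[0]

def null_differences(station_series, years, n_sets=27, n_groups=5, seed=0):
    """station_series: one yearly temperature array per station.
    Returns n_sets * C(n_groups, 2) absolute trend differences."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_sets):
        order = list(station_series)
        rng.shuffle(order)
        # split into 5 disjoint, near-equal groups
        groups = [order[i::n_groups] for i in range(n_groups)]
        slopes = [ols_slope(years, np.mean(g, axis=0)) for g in groups]
        # all 10 unique pairwise comparisons (the general case, no "max" reference)
        diffs.extend(abs(a - b) for a, b in itertools.combinations(slopes, 2))
    return diffs  # 27 sets x 10 pairs = 270 differences
```

The observed CRN12-minus-CRN5 trend difference is then located within this distribution; the fraction of shuffled differences exceeding it gives an empirical p-value.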

Perhaps larger sample sizes, obtained from future additional categorizing, will allow some statistical significance to be attached to these differences. Better determination of statistical significance in future samples will probably also require establishing the appropriate time period for comparing the snapshot categories, and perhaps concentrating on statistically different variances between groups.

#525 Sam Urbinto:
Regarding the “few tenths” of a degree difference between CRN12R and GISTEMP, I suspect that the difference will become smaller when TOBS is incorporated into CRN12R. (I am not just guessing: my post #376 shows that the TOBS bias is approximately 0.2C from 1950 to 2000. It is reasonable to assume that the TOBS effect on the CRN12R stations will be similar to the TOBS effect on all USHCN stations.) I need to find the time to write a USHCN file parser and re-do the analysis with TOBS.

#526. John V, puh-leeze. You’re being very precious in this comment. You’ve been given full access to the audience here.

Prior to even starting the Hansen analysis, when we were looking at Parker, I observed that, if Hansens adjustments were right, then Parker’s analysis was wrong. If you can reconcile both Hansen and CRU, I’ll be amazed, but quite happy to provide you access as you have in the past.

To date, you’ve not looked outside the U.S. record or accounted for differences between NASA and CRU. I realize that you can’t do everything at once, but keep these issues firmly on your radar screen, before pronouncing that results from what Gavin Schmidt likes to describe as 2% of the world’s surface, with different data provenance than many other countries, constitutes vindication of the methodology of the other 98%, where there isn’t a comparable continuous rural network.

John V.
#527 I’ve said something like this here before. I take Steve at face value; that he does what he says and the reason he does it is the reason he gives for doing it. I’ve not been disappointed yet and I don’t expect I ever will be.

#528 As far as my “.1 C is a disaster to the planet, but being off by a few tenths for years is a nothing mistake”: I’m speaking more about the brouhaha made over the claims of horrible damage to the planet from a few tenths of a degree, and the hypocrisy of anyone who said that before and is now brushing away having to adjust the US record for a few years as “nothing”.

And remember, this is just the US, as Steve said. We’re on track, but it’s not finished, yes, I know. As I also said, the question remains and regardless of the results, we don’t know until we verify. (And it’s not like there have been no results at all so far.) But I don’t know what the answer is, and I don’t care. I just want to know what it is.

#529 Steve McIntyre:
I have never stated anything about applying my CRN12R-versus-GISTEMP comparison to the ROW. I have been accused of doing so, but it has never happened. The closest I have come is omitting “in the USA lower 48” from the end of every sentence, when it should be obvious since:
a. All of my analyses have been on the USA lower 48
b. The title of this thread is “A Second Look at USHCN Classification”

My opinion is that if GISTEMP closely matched CRN5, with CRN12 showing substantially less warming (in the USA lower 48), then those results would be promoted here. This would be similar to how my actual results were promoted on other sites.

I have no intention to “reconcile both Hansen and CRU” or “account for differences between NASA and CRU”. It’s an interesting challenge to calculate temperature trends and I am pursuing it for the challenge. To date my results have been close to GISTEMP (for the USA lower 48). They may differ substantially when analyzing different regions. I will be sure to make my analyses available regardless of their results.

I cannot speak intelligently about Parker or HadCRU and have not attempted to do so.

I recognize that this is your site and your community. You can promote or refute whatever you want. I am grateful for the opportunity to show my results (for the USA lower48). I am hopeful that I will be able to extend those results outside the USA lower 48 with the help of others here.


I think his point is that you’ve assumed some bias towards a particular dogma in the statement

I may be overly cynical, but I do not expect that SteveMc will promote my analysis or results if they continue to show close agreement with GISTEMP and/or HadCRU.

which is completely unfair given a) the latitude he has shown towards outside contributors, including you, and b) his well-known goal of uncovering the TRUTH, not just some pre-conceived notions of what the truth should be.

I think you miss the point of the audit meme. The other thing is that SteveMc is pretty focused on the Juckes issue (and related matters), which goes back YEARS. So, fear not, the project will get headlines again as the various piles of dust settle.

We’ve both produced guesstimates of the bias at a class 5 site. You kinda put it in the .3C regime.

I’ll revisit my guesstimate.

1. Class 5 is a 5C bias.
2. Some of these biases will be warming, others cooling.
3. We suspect that the warming bias outweighs the cooling bias.
4. A brief look at the sites I’ve done suggests the ratio of warming/cooling is large (need to quantify).
5. Let’s suppose that this 5C bias will net out at 2.5C warming.
6. The warming bias impacts Tmin. This reduces the impact to 1.25C.
7. The warming bias is most predominant in the warmest seasons. So let’s say .3C.
8. Other factors may reduce it more: clouds, winds, etc.
9. At some point the warming bias becomes saturated and trend signals are detectable in spite of it.

Now, the frequency of class 5 sites is roughly 15% in the 33% sample we have. Assume this holds for the 100% sample.

The .3C gets leveraged down by the frequency of sites encountering this bias. It could very well fall into the noise hole.
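The arithmetic of the steps above, written out explicitly (every figure here is the guesstimate stated in the list, not measured data):

```python
# Steps 1-9 above as a chain of guesstimates (all assumed numbers).
raw_bias = 5.0                  # step 1: nominal class-5 siting bias, in C
net_warming = raw_bias / 2      # step 5: warming and cooling biases partly offset -> 2.5C
tmean_impact = net_warming / 2  # step 6: bias mostly hits Tmin, so halve for Tmean -> 1.25C
seasonal = 0.3                  # step 7: concentrated in the warm seasons -> ~0.3C
crn5_frequency = 0.15           # ~15% of stations are class 5 in the 33% sample
network_effect = seasonal * crn5_frequency
print(f"network-wide class-5 contribution: ~{network_effect:.3f}C")
```

At roughly 0.05C spread across the whole network, the effect could indeed “fall into the noise hole”, before even applying step 8’s further reductions.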

The case with class 4 sites is slightly different. Class 4s have a slightly lower bias, but a substantially higher frequency in the database. Much higher than I expected. I expected classes 1-5 to be roughly normally distributed, with class 3s predominating.

The class 3s, while not meeting standards, would not appear to be a problem. Small bias and low frequency of occurrence suggest that the final effect would be undetectable, so I would leave them out presently.

The one observation we have of a class 1 (the CRN study) indicated a bias of .25C (on Tmean), but it was modulated by winds and clouds.

So, unraveling that: Leroy put the estimated bias at less than 1C for classes 1 and 2. The CRN study has a single “side by side” measure of .25C. Need to check whether this was the mean.

I’m still curious about throwing the 4s into the mix.

Part of the problem with the 1-5 ranking is that it does not capture cooling or warming bias.

THE HYPOTHESIS that Rabett and Gavin and others had was this:

The warming bias at some class 5s is equal to the cooling bias at other class 5s. That is the whole POINT behind Rabett showing shaded sites. I have no issue with this. I want to count. If you find 50% of class 5s are asphalt-infected and 50% are shade-infected, don’t expect big warming differences.

This in itself, regardless of the global outcome, would be an interesting finding. If we looked at all the class 4 and class 5 sites, rated them as “expected warm bias” or “expected cool bias”, and then analyzed the numbers, that would be a cool thing. APART from GISS.


The purpose of the JohnV project, as I understand it, is to come up with an open-source computation of the US48 average temperature trend (and error bounds!), separate from other computations (or in parallel with them). This can hopefully be applied to the ROW. Unlike other analyses, this one will take the surfacestations surveys into account. Therefore, why not just neglect results from low-quality stations? Why wouldn’t that make everyone (including those from other sites) happy?

“Therefore, why not just neglect results from low-quality stations? Why wouldn’t that make everyone (including from other sites) happy?”

Clayton, one would think so. But so far (I forget where I put my essay about it) the temper tantrums, paranoia and gloating don’t lead me to believe those who have been doing such things are going to be happy with anything. This is just my opinion, but the level of spin going on leads me to believe this is all political for some, or that they’re vain and think that bias can be adjusted out of bad sites by various numerical methods. I suppose if you’re happy with a US .3 (or .1 or .5 or whatever, I don’t care) then you can adjust out the bad sites. I don’t know why anyone would want to keep a site and adjust rather than just clean things up, but who knows what some people think.

Oh, yes. The “I’m right and you’re wrong. I understand this and you don’t. I’m smart and you’re dumb.” egotistical jerk syndrome.

RE: #544 – Here is what I am interested in: differences between GISS and CRN1-2(revised), and the root causes of those differences. I am also interested in the bizarre pattern prior to the 1920s; I am very suspicious of pre-1920s data. Here is a thought. If you lop off the pre-1920s data, then plot overall fitted constant-slope lines to the remaining GISS and CRN1-2(revised) curves, what do you find? Hmmmmmmm……

Lou D:
Thanks for the kind words.
I did not actually replicate the GISS algorithm at all. I wrote a quick new algorithm that could use a subset of the US lower 48 stations. I then compared GISTEMP to the best stations (CRN12) and the worst stations (CRN5).

You can see my work history if you click my name above. No time at GISS. Sorry to disappoint.

It is my understanding that John V is putting the GISS methods of temperature measurement and adjustment into code that can be documented and validated. That is a very worthwhile endeavor and should be much appreciated. The remainder of the analyses I see being performed here seem to be charging ahead with little forethought of what we are really attempting to determine.

Watts and team have provided some categories for the quality of stations which give us, without significantly more detail, a mere snapshot of quality, and which in no manner or form would be translatable at this point to station histories going back 50 to 100 years. The question ought to be: given the discovered lack of station quality by the Watts team (and that is another very worthwhile endeavor), what can we say about past quality, and what does it mean in terms of adding uncertainty to the measurements that may not have been acknowledged at this point and, as I have emphasized in the past, in terms of local area differences in temperature trends?

I judge that we can determine statistically significant trend differences between groups drawn from the population of USHCN stations, but I would estimate at this time that if one goes back far enough, the current Watts snapshots have less and less significance, and at some point we are comparing, in effect, randomly selected groups.

I think the time has come to clearly outline a purposeful and pre-tested scheme of analysis. I would personally like to look at the much more general issue of the source of the trend variation that comes from randomly selected groups of stations and what that means in terms of a nation-wide average temperature anomaly trend.

I looked at your spreadsheets for monthly data and only found the anomalies. I know you have a real life with only so much time but when you get a chance can you please add the data you are using for CRN 1,2 and 5 to one of them or if you did already tell me which one its in. I have access to the original observer reports and would like to compare them. Thanks.

Using the 5 CRN categories derived by Watts and team for the time period 1960-2005 and the USHCN Urban temperature data set in degrees F, I calculated the trends over that time period for the 5 categories separately, and included the number of stations used in the calculations:

CRN1 = 0.042; Stations = 17.

CRN2 = 0.035; Stations = 36.

CRN3 = 0.042; Stations = 69.

CRN4 = 0.043; Stations = 215.

CRN5 = 0.048; Stations = 58.

Note (1) the different trends for stations CRN1 and CRN2; (2) the small number of stations included in CRN1; and (3) the differences between the CRN2 and CRN5 stations. More simulations and calculations are required to determine the statistical significance of any of this. It appears we are looking at the uncertainty of using small sample sizes.
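For reference, a sketch of how a per-category trend like those above could be computed — an ordinary least-squares slope on the category-average yearly series. The series here is synthetic (a made-up example), not USHCN data:

```python
import numpy as np

def category_trend(yearly_means_f, start=1960, end=2005):
    """OLS slope (degrees F per year) of a category's yearly average series."""
    years = np.arange(start, end + 1)
    assert len(yearly_means_f) == len(years)
    return np.polyfit(years, yearly_means_f, 1)[0]

# sanity check: a synthetic series warming at 0.042 F/yr recovers that slope
years = np.arange(1960, 2006)
synthetic = 50.0 + 0.042 * (years - 1960)
print(round(category_trend(synthetic), 3))  # 0.042
```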

It may be important to recognize that the 1221 or so stations in the USHCN record were not chosen randomly from the population of around 5,000 or so weather stations in the US. They were chosen because these stations had the best quality data. I don’t mean to demean the efforts of the folks at NOAA, but it is akin to looking for your keys under the lamppost with the best light. Unfortunately, we are stuck with data that was collected in a particular way with period instruments.

Second, I am not convinced that we can divorce trend data from absolute temperature data, as if there was no meaningful connection between the two. The trend data is derived from the absolute temperature data, so any uncertainties in the absolute temperature data are necessarily passed on to the trend data. Divorcing the two may mean underestimating the total uncertainty.

What is the difference (if any) between the USHCN Urban data set and GHCNv2 (John V’s data)?

The USHCN Urban data set is the most adjusted data set available from USHCN. I believe John V’s data set is derived from the raw USHCN data, which I believe has been areally corrected only. I am almost certain that his calculations will suffer from the same uncertainties due to small sample sizes as the USHCN Urban set. I do my work with the data set that is “officially” used most frequently by others. I think it more appropriate to determine how the station categories of Watts and team affect the final USHCN version, and not the raw data.

It is my understanding that John V is putting the GISS methods of temperature measurement and adjustment into code that can be documented and validated.

This is a common misunderstanding. I am *not* attempting to replicate GISTEMP:
– GISTEMP estimates temperatures for missing months, I don’t
– GISTEMP attempts to correct the trend of urban stations, I don’t

#551 Kenneth Fritsch:
Thanks for the continued analysis of the differences between groups. Just to confirm, are you doing an arithmetic average of the stations or using OpenTemp to calculate a geographically weighted average?

#553 Clayton B:
The USHCN Urban data includes all of the USHCN corrections (outliers, TOBS, MMTS, Filnet, SHAP, and Urban adjustments). The unadjusted GHCNv2 data is raw except for the removal of outliers.

The station history data and a “network” of the best correlated nearby stations are used in all these routines.

The routines that adjust the data are TOB, MMTS, SHAP, FILNET and, separately, the URBAN Adjustment (the latter may be done separately in the GISS data). This unequivocal statement means that almost every single data point in a CRN1 or CRN2 station is “adjusted” by data from (implicitly) 40 neighboring stations (USHCN) or rural stations within 1000km (GISS). Consequently, it may be difficult, when using the USHCN Urban set, to distinguish between CRN12 stations and CRN345 stations as they have been used to adjust each other, as I would suggest your analysis has shown.
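A toy illustration of the mixing effect described here (this is not NOAA's actual SHAP/FILNET code; the weighting is invented for illustration): when a correction is estimated from neighboring stations, a CRN12 station's adjusted series inherits part of whatever bias its CRN345 neighbors carry.

```python
import numpy as np

def neighbor_adjust(raw, neighbor_raws, weight=0.5):
    """Toy neighbor-based step: nudge a station toward its neighbors' mean.
    Real USHCN routines are far more elaborate; this only shows the coupling."""
    neighbor_mean = np.mean(neighbor_raws, axis=0)
    return raw + weight * (neighbor_mean - raw)

crn1 = np.array([10.0, 10.1, 10.2])              # a "good" station's yearly values
warm_neighbors = [np.array([11.0, 11.3, 11.6])]  # warm-trending neighbors
adjusted = neighbor_adjust(crn1, warm_neighbors)
print(adjusted)  # pulled partway toward the warm-biased neighbors
```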

The CRN1 have a warming trend because 11 of the 17 are ASOS. That is, located at airports.

That’s a pretty definitive statement for an untested hypothesis. My CRN12R analysis which excludes all airports showed a warming of 0.29C/decade from 1975 to 2005, or 0.87C total for the 30 years (see post #377).

Non-ASOS CRN1 and CRN2 stations may have a small warming bias. However, even if we assume that they all started as perfect stations and degraded to CRN1 and CRN2, the expected bias would be on the order of 0.12C.

I base this on my estimate of CRN5 bias (relative to perfect) of 0.75C to 1.16C (see posts #380 and #540).

I don’t have the exact trend. Estimating from my plot in #267, I would say the trend is about 0.04C/decade (0.40C to 0.70C over 75 years). It would be tough to explain the 1935-1975 cooling by station problems though.

#560 steven mosher:
My first cut at rural-only included some airports. Kristen B and Chris D (?) informed me that airports should be removed, so subsequent analyses were done with rural, non-airport CRN1 and CRN2 stations.

I have not done an explicit comparison between rural and urban. That would be interesting though, and I would like to get around to it after the program is refined and updated.

75 years. A convenient figure that is significantly more than 67% of a century. Sort of arbitrary, albeit vaguely informed by statistical considerations. If I trusted data prior to 1920, I’d really want 100 years or more, but I am suspicious of the data prior to 1920. Horse sense, from 25 years of having to make snap decisions using highly flawed data. You could also try using 85 years, starting at 1920.

The routines that adjust the data are TOB, MMTS, SHAP, FILNET and, separately, the URBAN Adjustment (the latter may be done separately in the GISS data). This unequivocal statement means that almost every single data point in a CRN1 or CRN2 station is adjusted by data from (implicitly) 40 neighboring stations (USHCN) or rural stations within 1000km (GISS). Consequently, it may be difficult, when using the USHCN Urban set, to distinguish between CRN12 stations and CRN345 stations as they have been used to adjust each other, as I would suggest your analysis has shown.

Phil, I have thought about these homogenizing processes, and I have not been able to see, at this point in time, how a temperature bias and/or temperature trend bias would not show through them, since they directly correct for other deviations. If you can explain it in simple terms I would appreciate hearing about it.

The adjustments do make some differences from station to station, and may actually add to the noise level, but I cannot see them covering up a temperature bias. To test this point I will directly compare CRN trends from John V’s data set with the USHCN Urban data set, and do the same for random samples.

I posted on this cherry-picking crap. A while back I mentioned 3 regimes:

1910-1940.
1950-1975
1975-Current.

I thought it quite clever (nobody caught that I sliced 10 years out, 1940-1950).

When I started to fiddle with the data, I started to recognize that a year here or a year there could leverage everything. Did some research on regime changes in time series. About the same time, Tamino posted his warming-by-the-decade piece.

His words were “it’s natural to see 3 regimes”. That wasn’t math, that was seduction.

Very simply: if you look at the time series and select a region that looks linear, then you don’t exactly have a random sample. So looking at specific regimes requires some finesse.

Actually, let me amend my statement about not trusting older data. I would not trust pre-1925. Look at the very unnatural change from ~1917 to ~1921. While I know I am second-guessing, it just looks very odd. I don’t think it’s real. A splice? Or other discontinuity? Interestingly, the US entered WW1 in 1917. We had the sudden establishment of formalized airports for the first time in many locations. Prior to that, most planes took off and landed wherever conditions permitted, with very few formal airfields.

Starting at 1920 starts in a cold trough and trends to multiple hot peaks. Starting in 1935 starts at a hot peak and trends to another hot peak. Starting in 1935 and ending in 1965 starts at a hot peak and trends to a cold trough. To select a relevant time series is going to require a determination of precisely what the time series is supposed to accomplish, represent, and whether or not the dataset/s are capable of supporting the objective.

I believe that both the late-1910s and the mid-1970s shifts, which are seen in both GISS and CRN12R (in fact, they are identical in both; you can’t see the two different lines during each such shift), are bogus. Again, horse sense, from years of making snap decisions using a wide variety of data, which is often flawed. You need to use what you’ve got sometimes. After lots of hard knocks, one develops an ability to detect aspects of the data that may point to splicing, equipment problems, changes in methods, or outright one-time errors which shifted the reference level.

In the GISS – CRN12R plot, the areas to be most suspicious of are zero crossings. When you see a zero crossing, it’s an indication that something may have changed systemically at and around the zero crossing. A shift, an undocumented “correction” or “adjustment.”

Here’s an analogy: checkbook balancing. Sometimes we incur a situation where we cannot get it to balance, typically some small service charge or interest increment missed, what have you. You go along for a while, and struggle to find the record of the event. After a while, you give up and just tweak the number to match a reasonable, audit-derived figure. Hopefully, if you are a reasonably diligent recorder, you will not have to do that very often, hopefully no more than once a decade, if even that. When you do it, you make a very prominent notation of restatement. If you are not quite as diligent, you may do the adjustment but not adequately document it. If you are really sloppy, you might simply do it on a calculator and work it into the next line entry without an explicit line item for an adjustment.

RE: From #376 – if “Much of this can be attributed to the Time of Observation Bias corrections that are included in GISTEMP but excluded from CRN12R.”

Then how come GISS and CRN12R track over each other during the big shifts of the late 1910s and early 1950s? In other words, why are there zero crossings in GISS-CRN12R precisely for the time frames where relatively large “corrections” were apparently applied per the NOAA chart?

Also, from the NOAA chart – the ramp in adjustment magnitude from the early 1970s to early 1990s is quite odd.

#568 Kenneth, these processes may inject their own trends and/or biases into the data set, thus masking the temperature bias and/or temperature trend bias that you are trying to detect because the corrections for the other deviations may not be accurate. I also am not sure that what we have is a bias between CRN12 and CRN345 (and therefore potentially correctable), as much as an increase in uncertainty, with the CRN12 and CRN345 having significantly different incremental uncertainties.

Nevertheless, I think the comparisons that you propose could shed some very interesting light on these issues.

A thought experiment to contemplate over the weekend. So, let’s imagine that I own the main data repository for US temperature records. Let’s say I also own the “process” for reaching down into the muck of the more primitive time which existed here, back when we were really sort of a developing country. That time would be the time prior to WW2, which incidentally makes up 40% of the 20th century. Let’s also say that I happen to be a true believer in the notion of “killer AGW.” Now, let’s imagine something a bit dark.

I look at my slightly demuckified data, and see that we had a warm 1890s, a warm 1930s and a sort of warm 1990s. If I fit a curve to my reasonably demuckified data, for the period 1901 – 2000, I find that there is no detectable rise. What I need is an overall rise for the century. I can’t very well eliminate the warm 1930s. There are still a few alive today who can remember those times reasonably well, some of them are still working and were good observers in their youth. I can maybe slightly tweak those warm 1930s, but not by much, or it will raise quite a bit of suspicion. Nothing I can do to raise the 1990s very far – although automation in certain cases made quality go down, and I could perhaps do something with that, there are still enough CRN1 and 2 stations, and enough of a subset of them unaffected by serious UHI, that someone would bust me if I tweaked too much.

Ah, but what about those warm Victorian days? Almost no one remembers them. What I could do is to exploit the time in between the warm late 1800s and the warm 1930s as follows. By arguing for a major TOB shift during and shortly after WW1, I could effectively lower those warm late 1800s. By doing so, the overall derivative of the entire 20th century could be changed from a near-zero slope to a positive value.

This is not to say that such a thing has happened. This is only what if.

Imagine if the TOBs “adjustment” prior to the early 1920s were all negative values. In other words, if you applied the formula TOBS-RAW(new) equals TOBS-RAW(current) minus 0.1 for everything prior to the early 1920s.

Then how come GISS and CRN12R track over each other during the big shifts of the late 1910s and early 1950s? In other words, why are there zero crossings in GISS-CRN12R precisely for the time frames where relatively large corrections were apparently applied per the NOAA chart?

Ummm, I think you’re reading the NOAA chart very wrong (post #376 for those following at home).

The plots are shifted to the 1951-1980 average for each data set. Between 1915 and 1955 the TOBS correction is nearly flat at about -0.1F. After 1955 the TOBS correction increases steadily to about +0.25F around 1990. After that it levels off again.

TOBS has not been verified so I can’t say anything for sure. It may be just a coincidence that the TOBS correction is ~0.35F (~0.2C) over the century and the difference between CRN12R and GISTEMP is also ~0.2C over the century.
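A minimal sketch of the baselining described above — shifting each series so its 1951-1980 mean is zero before the curves are compared (GISTEMP's actual anomaly machinery differs in detail; this only shows the idea):

```python
import numpy as np

def to_anomaly(years, temps, base=(1951, 1980)):
    """Return the series shifted so its mean over the base period is zero."""
    years = np.asarray(years)
    temps = np.asarray(temps, dtype=float)
    in_base = (years >= base[0]) & (years <= base[1])
    return temps - temps[in_base].mean()

# two series with different absolute levels line up once baselined
years = np.arange(1900, 2001)
a = to_anomaly(years, 10.0 + 0.005 * (years - 1900))
b = to_anomaly(years, 25.0 + 0.005 * (years - 1900))
print(np.allclose(a, b))  # True: the offset is removed, the shape remains
```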

I just did a trend comparison for the time period 1960-2005 using John V’s dataset comparing the trends for CRN12 and CRN5 in degrees F.

CRN12 = 0.034.

CRN5 = 0.033.

I believe the conjecture here was that the rawer data (as in John V’s data set) would show a larger trend difference than using the fully adjusted USHCN Urban data set. Note from earlier posts that the CRN12 trend using the Urban set is approximately 0.009 degrees F cooler than CRN5.

Note: John V, I used the files yearlyCRN5.cvs and yearlyCRN12.cvs to make my trend calculations. I assumed that these are the files I should have used.

To answer other questions, I use an arithmetic average and not one weighted by geography factors.
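A toy example (my own made-up numbers, not John V's or Kenneth's data) of why the arithmetic-versus-geographic choice can matter: if warm-trending stations cluster in one region, a plain mean over-weights that region relative to a grid-cell-weighted mean.

```python
import numpy as np

# three clustered stations trending 0.05/decade share one grid cell;
# a lone station trending 0.01/decade has a cell of its own
trends = np.array([0.05, 0.05, 0.05, 0.01])
arithmetic = trends.mean()                   # cluster dominates: 0.04
cell_means = [trends[:3].mean(), trends[3]]  # average within each cell first
weighted = float(np.mean(cell_means))        # cells weighted equally: 0.03
print(arithmetic, weighted)
```

With real data the cells would be area-weighted as well, but the direction of the effect is the same: clustering plus arithmetic averaging skews the national mean toward the clustered region.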

The USHCN Urban data set should be very much like the completely adjusted GISS data set, since the completely adjusted sets use similar adjustment techniques. If GISS uses the USHCN data sets adjusted through Filnet and then does its own UHI adjustment, theoretically we should expect small differences, since the UHI adjustment in either data set should be small.

In order to look at random sampling from the CRN1 through CRN5 categories I would need that data from John V’s data set. Currently I only see data for CRN5 and CRN12 in John V’s cvs files. Will you be adding the CRN3 and CRN4 data soon?

Also, please remember that I am not saying that there are no differences between the CRN quality categories. It is just that I am saying we are, in general, going about detecting them using the wrong methods, particularly when we use long-term trends for a snapshot evaluation.

SteveSadlov:
We can both conjecture all day about what TOBS will do, but we’re both just guessing at this point. I will do what I can this weekend to implement a USHCN parser so that the CRN12R stations can be run with TOBS adjustment.

Kenneth Fritsch:
Those were the right data files to use. Based on my results in #376, I am surprised that you found less warming in CRN5 than CRN12 since I found more warming in CRN5. The difference is probably due to geographic vs arithmetic weighting.

Indeed, if two different programs give very similar results, it gives you confidence in the answer.

It is interesting that the GISS result tracks very closely with the CRN12 (“best station”) one. The temperature trend over the 20th century appears to be almost identical for the two.

Looks to me like Hansen did something right with his adjustments, since otherwise (if he had based it on the CRN5 “worst station” data), the temperature change over the 20th century would have appeared greater.


——-

So I have to ask, what does “It is interesting that the GISS result tracks very closely with the CRN12” mean? In what ways is it close, and in what ways is it not? It is certainly a very close tracking in those transition areas such as at the end of / just after WW1, the early 1950s, and the mid 1970s. It is also sort of close over the past 20 or so years. But then there are some interesting areas of divergence as well.

It is an interesting thing. When looking at GHCNv2 raw, there is an incredible swing between ~1915 (unbelievably low) and ~1922 (almost unbelievably high). It is the greatest-magnitude swing in the entire record, over such a short period. Of course, to ground everyone a bit, we are only talking about tenths of a degree. But relatively speaking, it is a large swing in the context of the record.

Watts and team have provided some categories for the quality of stations which give us, without significantly more detail, a mere snapshot of quality, and which in no manner or form would be translatable at this point to station histories going back 50 to 100 years.

Speaking as 1/415 of the “Watts team”, I think that this “snapshot” is indeed very useful for trend analysis, since the “bad” stations have in general become progressively bad, while the “good” stations have probably been relatively “good” all along. Ideally, the Watts survey should have been performed every 10 years back to 1880, but it wasn’t. This does not mean, however, that it shouldn’t be run to completion as a volunteer effort under Anthony’s supervision in 2007, and then repeated under official NOAA auspices every 10 years thereafter.

At #551, Kenneth validly complains of the small sample size of CRN12. As I understand CRN1, these are ideal sites. CRN2 is not ideal, but still very, very good. CRN3 sites have issues that should be addressed, but are probably OK. CRN4 is unacceptable, and CRN5 is beyond the pale. The majority are CRN4, so we should throw them out along with CRN5. But including CRN3 along with CRN 1 and 2 would approximately double the sample size, even with only 1/3 coverage as at present. Why isn’t this being done?

Speaking of small sample size, Kenneth, just how many stations have you personally surveyed?

Speaking as 1/415 of the Watts team, I think that this snapshot is indeed very useful for trend analysis, since the bad stations have in general become progressively bad, while the good stations have probably been relatively good all along.

I think the snapshots serve a very useful and informative purpose, but I remain unconvinced that you or anyone else has historical records that could translate the snapshot very far back in time, and certainly not 50 or 100 years. The station data, whether good or bad, contains a noisy signal that makes comparisons difficult to establish with statistical significance without substantial sample sizes. I will continue to look at the issue of a statistical analysis from a layperson’s perspective and perhaps come up with some better ways of analyzing the station categories.

I have not personally surveyed a single station in my entire life, but I believe I have followed the depiction and discussion of the Watts Team survey presented at this blog. I am here to learn. I can tell you that a better analysis of the stations could come from better knowledge of the recent and past history of the stations, perhaps even with relatively small sample sizes.

#594 Kenneth, go take a look at Anthony’s database. If you look at the pictures and at the pdf of the reports, you will see that many of the volunteers have made an effort to study the history of the station and you will often see references, if not actual pictures, of previous sites with many still standing (usually the CRS after conversion to MMTS). In many cases, it is possible to tell if the previous site was better or worse than the current one. There is a lot more historical information there than just the list of station categories. It would just take some time to convert that info into usable database form for use in analyses like yours.

#592 Kristen:
The file is actually called “station.csv”, not “stations.csv” as I said before. I have just confirmed that the file exists in every zip archive linked below (these are the original links from my previous posts). There is a sub-folder for every analysis. The sub-folder names should be self-explanatory.

BTW, .csv files can be opened by Excel.

Alternatively you can download and run OpenTemp yourself. If you run it with no parameters, it will give complete instructions. It’s not the most user-friendly program but it won’t take long to figure out.

John V, the time has come for me to attempt to incorporate your geographical weighting into my analysis of randomly generated samples and CRN category trends. Please provide a reference or procedure for the adjustments as your time permits. In the meantime I will search for it here.

If you do them by state or groups of states this would be an easy normalizing step for me to perform. If you do it by latitude and longitude it becomes more difficult but readily doable. I am gearing up to look at random groups with a final corrected method and should look at the geography adjustment before proceeding. I do not want to data-snoop adjustments, but the geography adjustment seems an a priori legitimate one to me.

Coming from IL and doing an analysis of that state made me not hold out much hope for a state (or geographical) adjustment, as here we have a general warming trend that weakens significantly as one proceeds from the northern part of the state southward, along with significantly different local trends at near-equal latitudes.

I think there might be a problem. So duplicate what I did and let’s compare notes.

1. I ran CRN12345.
2. I ran CRN123
3. I ran CRN45.

Then I looked at the yearlies and anomalies (1950-81) and the numbers are not adding up. The CRN123 are lower than the CRN12345. That makes sense. And one would expect the CRN45 to be higher, but it’s not always higher, and the averages don’t exactly work out.

Crudely, let’s say for 19xx you have a yearly average for CRN12345 of 12C, CRN123 is like 11.75C, and CRN45 is like 11.85C.

I described my procedure for geographic weighting in post #87 of the First Look thread (http://www.climateaudit.org/?p=2061). I quote myself below to save you from searching (emphasis added):

I calculated the 1-year and 5-year average temperatures for the continental USA. The calculations were done by overlaying a 0.5 x 0.5deg grid over the entire area and calculating average temperatures at each grid point for every month from 1880. The grid temperatures were calculated from surrounding stations with readings available for that month (if no reading for the month then the station was excluded for the month).

As per GISTEMP and due to the low number of stations, I used a 1000km radius for averaging the grid temperatures (with the weight of each station declining to 0 at 1000km).
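The gridding scheme described above (average each grid point from surrounding stations, with each station’s weight declining to zero at the 1000 km radius) can be sketched roughly as follows. This is my reading of the description, not John V’s actual OpenTemp code; the haversine distance and the linear taper are assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance between two points, in km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def grid_point_temp(grid_lat, grid_lon, stations, radius_km=1000.0):
    """Distance-weighted average of station temps for one month.

    stations: list of (lat, lon, temp); temp is None if the station has no
    reading that month (such stations are excluded, as described above).
    Weight declines linearly from 1 at the grid point to 0 at radius_km.
    Returns None if no station contributes.
    """
    num = den = 0.0
    for lat, lon, temp in stations:
        if temp is None:
            continue  # no reading this month: station excluded
        d = great_circle_km(grid_lat, grid_lon, lat, lon)
        if d < radius_km:
            w = 1.0 - d / radius_km
            num += w * temp
            den += w
    return num / den if den > 0 else None
```

The regional average for a month would then be the mean over all grid points of a 0.5 x 0.5 degree grid covering the lower 48.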

As I described above, I applied a station offset to each station. The offset was calculated by averaging the difference between the station temps and the overall temps (entire lower 48) for all months with station temps. This gave a scalar value which is added to all station readings. I’m probably not explaining this very well, and I’m sure somebody will accuse me of introducing a trend.

As a side note, I think this station (series) offset is a simple and easy way to offset any station series (as opposed to the GISTEMP method). No estimates are required. Each series is offset to correct for regional climate averages (not trends). It requires a lot of computation, but computation is cheap.
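A minimal sketch of the series offset as described. This is an interpretation, not John V’s actual code; the sign convention is ambiguous in the description above, so I assume the offset shifts each station toward the overall lower-48 mean, which leaves its trend untouched.

```python
def station_offset(station_temps, overall_temps):
    """Scalar offset for one station: the average of (overall - station)
    over the months where the station reported.

    station_temps: {month_key: temp or None}
    overall_temps: {month_key: lower-48 average temp}
    """
    diffs = [overall_temps[m] - t
             for m, t in station_temps.items()
             if t is not None and m in overall_temps]
    return sum(diffs) / len(diffs) if diffs else 0.0

def apply_offset(station_temps, offset):
    """Return the station series shifted by the scalar offset; missing
    months stay missing. Levels change, trends do not."""
    return {m: (t + offset if t is not None else None)
            for m, t in station_temps.items()}
```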

A few days ago I realized that this weighting method does not completely correct for geographic bias. Within a 1000km radius, a large number of stations in one direction will skew the results that way. (The interpolation is parabolic instead of linear). The effect should be small, particularly with a small number of stations.

The next version of the program will have an improved gridding and weighting scheme (using ISEA DGG triangles and weighting purely between vertices instead of stations). The current version is however much better than no geographic weight.

#601 steven mosher:
I would not expect the yearly averages to match up for a couple of reasons:

1. The bias applied to each station (see USHCN First Look around post #86) is calculated based on the average of stations included in the analysis. Different stations in the analysis means different bias.

2. The geographic distribution of the stations leads to each station having a different net effect on the regional average. (In CRN12R for example, the lone station in the SE affects a large region).

That’s not to say that there aren’t problems. I will be overhauling the OpenTemp website in the near future into the following categories:
– Development
– Validation
– Studies

The Validation part will be very important, but much of it should wait until the program is in a stable state (see my comment above about changing the gridding and weighting scheme, and my plan to implement pure trend analysis).

#605 steven mosher:
You’re right that Hansen uses 1200km. I’d like to get away from the constant radius averaging in the next version.

The coastal stations are included in the analysis even though they lie outside the gridded region.

I think the “best tiling” should have areas with equal size and shape. The ISEA DGG triangles are pretty close to this optimum. There are also diamonds, pentagons, and hexagons, but those are much more complicated to work with.

I ran ALL 1221 sites through John V’s program, looked at the yearlies, and did the 1950-81 anomaly dance.

Then I ran all the surveyed sites. ( Around 400 CRN12345)

SO.. all 1221 VERSUS the 400 surveyed.

WHY do this, Mosh pit? Are you a friggin numbskull idiot boneheaded dunce? (don’t answer that!)
The difference between ALL sites (1221) and the sampled sites (400) should be around zero. Right?

Especially if I use JohnVs program to test both samples! so I did that.

I did not find Zero mean.
All of the sites SURVEYED TO DATE are COOLER on average than the sites TO BE surveyed.

What’s that mean mosh pit?

With 33% of the sites visited, the rest are WORSE.

SO, here is the data. ALL1221-SURVEYED400

The following vector is a 5-year moving average (trailing) of (CRN12345-ALL), starting in 1884. Units are anomaly in C from 1950-81 as computed by OpenTemp running all 1221 USHCN files.

If ALL-SURVEYED is .GT. zero then UNSURVEYED is .GT. ALL. Simply: if all of the temps have an average of 10 and the surveyed sites average 9, then the unsurveyed sites average higher. 2/3s of the sites are unsurveyed. The analysis of all versus the surveyed indicates the unsurveyed will be warmer. Warmer EVERY YEAR since 1880. What the heck? I must be a bonehead.
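The trailing 5-year moving average of a difference series, as described above, can be sketched like this (assuming annual anomaly lists aligned by year; the function names are mine):

```python
def trailing_moving_average(values, window=5):
    """Trailing (backward-looking) moving average.

    The first window-1 entries are None because a full trailing window
    is not yet available.
    """
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            chunk = values[i + 1 - window: i + 1]
            out.append(sum(chunk) / window)
    return out

def anomaly_difference(series_a, series_b, window=5):
    """Trailing MA of (series_a - series_b), e.g. ALL minus SURVEYED.

    A run of positive values means series_a averaged warmer than
    series_b over the trailing window.
    """
    diffs = [a - b for a, b in zip(series_a, series_b)]
    return trailing_moving_average(diffs, window)
```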

CAVEATS: This needs to be duplicated by others. John V’s program is a piece of cake to work. John V, maybe I should send you the station files for this study so people can VERIFY and duplicate. I did a lot of this work at the coffee shop this a.m., so somebody needs to check it. Or duplicate it from scratch.

John V, I am not sure I could apply your adjustments to the stations I want to analyze. I do not understand what the ultimate effect is on the bias to trends due to geographical differences. I assume that the yearlyCRN files do not have the adjustment applied. I have noted your files named offset. Are these the files that contain the geographical adjustments? And not to trends but averages?

Until I understand your geographic adjustments better, I will look at any improvements (less variation in randomly selected groups) that I might discover by using a state-by-state geographic adjustment to the measurements. I will adjust the trend by normalizing state station data with weighting factors calculated from the entire 1200+ station database. This works for now for a programming-challenged participant. (I use Excel macros when the heavy lifting gets difficult, some VB when it gets unbearable, and then only with adult supervision. I once did Fortran, many, many, many years ago, entering it into a 361(?) IBM computer on stacks of punched cards.)

I was puzzling last night… about equal area (which Hansen uses) and equal stations per grid point. The equal-area approach can give you unequal stations per cell. So you have issues about variance and distribution of CRN per cell.

The Equal station per grid point is ..ahhh… a mindless suggestion at this point ( brainstorming)

Now, the 1200km thing came from a correlation study, where Hansen (H87) looked at the correlation between station time series as a function of km. At 1200km you’ve got r2 = .5.

So I’m thinking maybe minimizing the variance is MORE IMPORTANT than equalizing the area or equalizing the stations per grid cell.

Anyway.. I’m puzzlin on that.

Hey, if you like I can send you mail with all the *.bat files I created and station lists so folks can play around with all the CRNs and all 1221 files…

And you can do an update..

I have.

CRN3
CRN4
CRN5
CRN12
CRN45
CRN123
CRN12345
ALL

So Missing, CRN1, CRN2. Piece of pie for me to do.

Note I haven’t subsetted any of the rural/urban stuff in my work, as I figured other folks were headed down such paths…

Now, I don’t intend to do ANY COMPARISONS to GISS. Why? Well, if we match them, then they say the following:

1. You guys have merely confirmed us, go away.

If we don’t match them, they will say we got it wrong.

THAT’s why I prefer to compare things in terms of JOHNV ALL.

Simply. JohnV has a method for computing average land temp in the lower 48.

That method should have no bias with regard to microsite issues. So, If his method shows
that Class5 is warmer than all 1221 USHCN sites, that is evidence that NOAA should remove these sites.

Let me put it another way to sharpen the point.

Surfacestations makes the claim that Class5 sites WARM the record artificially.

THAT IS THE FUNDAMENTAL CLAIM.

In the same way that MMTS changes raised temps by .05F, Microsite Bias has raised temps.

How to prove this? TWO WAYS.

1. Use John V’s program. Calculate the time series of all 1221 sites. Calculate the time series of all the sites EXCEPT the CRN5s. Note the difference. If the difference is significant, delete the CRN5 sites. If you can make a MMTS adjustment of .05F…..

How the heck much coffee did you have? Will you be needing any help coming down from an apparent coffee high? Seriously though, that was an interesting not-zero-mean analysis you did, as are your thoughts on potential analyses.

Ok, here is what puzzles the heck out of me and why this problem fascinates the hell out of me.

Let’s take a step back to Peterson, Parker and Hansen on the issue of UHI.

First, some background. The climate scientists were seeing two things.

A. Urban Heat Island
B. Global warming.

Let’s start with UHI. UHI was first observed in the 1800s; I think the date of the first study was somewhere in the 1820-1840 era. Subsequent to that there have been numerous studies on the phenomenon. The UHI issue is important to “environmentalists” because they search for solutions to decrease UHI. So documenting the existence of UHI is essential for environmentalists. That is why you can find studies on parking lots, urban boundary layer, green rooftops, adding trees to Atlanta, blah blah blah.

At the same time, you have evidence of global warming. Now comes the concern. We have green science that shows the effects of UHI. We like those results because people need to change the way they live in cities. We also have evidence of GW. Question: could the evidence of GW be polluted by UHI? Good question.

Let’s put this in logical form.

1. UHI is real.
2. Therefore, weather sites that measure temperature in URBAN sites will be warmer than RURAL sites.

When this hypothesis was tested Peterson, Parker and Hansen found: NOT 2.

~2: Urban was roughly equal to RURAL.

So we now have the following epicycle from Peterson.

IF UHI IS REAL ( and it must be real because all the eco studies show this)
AND IF
WE FIND NO DIFFERENCE BETWEEN URBAN AND RURAL

Then, the conclusion Peterson draws is that:

“URBAN MUST BE RURAL” specifically, he argues that urban stations must be located
in “urban cool parks”

WHY? Because he must PRESERVE TWO TENETS. UHI is real; Global warming is real.

This is a logical scheme that is adopted full force by Parker and Hansen.

UHI IS REAL. WE SEE NO DIFFERENCE between urban sites and rural sites; therefore,
the URBAN SITES MUST BE RURAL LIKE.

Now, I want you to think about that for a moment.

UHI IS REAL. BUT WE SEE NO DIFFERENCE between sites at rural locations and sites in urban locations: therefore the URBAN SITES MUST BE RURAL LIKE.

Is this the only explanation? Nope. Try this:

UHI is real.
Urban sites measure the same as rural sites. Therefore,
RURAL SITES ARE IN HOT POCKETS!

So: you have two explanations for the same phenomenon:
A: urban is rural
B: rural is urban

Now, if you visited urban sites and found them in cool parks (note: cool parks DO EXIST) you would verify the Peterson hypothesis. If you visited rural sites and found hot pockets (asphalt, etc.) then we would have a different outcome.

THAT is why this thing has fascinated me. It’s a perspective on the logic of the argument that everyone (save St. Mac) has missed. There is a mystery here.

We are left with this:

1. UHI is real.
2. We expect a difference between URBAN Sites and RURAL SITES.
3. There is no measured difference.
4. We expect, then, urban sites to be located in cool parks.
5. We have found none in cool parks.
6. Alternatively we expect Rural sites to be in hot pockets.
7. We tested #6 and found no hot pockets

#611. Good questions, but you have to watch the pea under the thimble when you change data sets and change from the U.S.

Parker’s network is completely different. He purports to show that there’s no difference between windy and non-windy based on NCEP quadrant models. If this criterion has only a vague relationship to windy/non-windy at the actual sites, then he won’t find a difference. There are other ways that he wouldn’t find a difference – I posited the possibility that urbanization around airport sites would screw up his criterion. He did an indirect test that is very inconclusive.

Peterson’s network is also different than USHCN. Peterson conflates urban and rural sites and then says that there’s no difference – Snoqualmie Falls WA was grouped with Dallas. If you segregate sites that have major-league franchises, you get a profound difference.

None of the studies purporting to show no difference are done by disinterested parties. Parker and Peterson are highly partisan.

Things like Phoenix Airport are not “cool parks” regardless of what Parker says.

This is not to say that you can’t show warming from proper sites. I suspect that you can. So far, examining CRN1-2 stations shows that urban adjustments are required in the US – and indicates that Hansen’s adjustment IN THE US is probably a decent first cut at it. As I mentioned at the start of these topics, if Hansen’s adjustments in the US are right, Parker and Peterson are wrong. Outside the US, without a rural framework to anchor things, I suspect that the Chinese data in all the studies is compromised by urbanization and that none of the studies have faced the problem squarely.

You are right on the particulars. I’m just talking about why this problem fascinates the hell out of me. That is, the need Peterson had for “closure”. Urban must be rural!

Seriously, wouldn’t you want to check a couple just to make sure? Kinda stunning. That’s my main point, I guess. They could have also, logically, concluded that UHI was “unreal”. But they can’t. So it was the epicycle episode that fascinated me.

Anyway. Here is what I have for today.

I ran all 1221 sites through John V’s program. Then I ran ALL THE SITES except the 5s. That is, all unrated sites and the 1, 2, 3, 4 sites.

So ALL 1221 sites versus the sites that have not been rated a 5 or have not been rated.
make sense?

Essentially I wanted the two biggest samples I could get, so I could have some decent power in the test.

( I did a 5 year MA, trailing)

Results: The difference in anomaly between ALLSITES and ALLSITESnofives is constantly positive throughout the entire history of the time series. The difference is on the order of .05C.

Argument: if a MMTS adjustment of .05F is justified, then excising class5 records for a .05C
difference is justified.

For grins I compared ALL to only the CRN123. The anomaly was always negative. That is, CRN123 was cooler.

At some point I’ll figure out how to post charts or give them to John V.

Create a file of all stations, all 1221 (ask if you don’t know how).
Run them through John V’s program.
Create a file of all stations EXCEPT those that Anthony says are Class 5 (about 1160 stations).
Create a file of all stations EXCEPT those that Anthony says are Class 1&2 (about 1160 stations).

Now, I know that about 700 of the stations in the sample are unsurveyed. Let’s call those CRNu.

What do you expect for ALL-CRNu1234?
What do you expect for ALL-CRNu345?

That is, in one sample I take out those sites I know to be bad. In another I take out those I know
to be good.

If I take out the bad sites, I lower the anomaly consistently by .05C. More specifically, Anthony has surveyed 33% of the sites and counted 50-60 CRN5.

If you calculate the US anomaly graph for all 1221 stations (1950-1981), and then calculate the same graph deleting those 50-60 stations, you will see a consistent difference in anomaly from 1880 to present. That difference is on the order of .05-.07C. Namely: remove the 50-60 stations that are Class 5 and you will cut the century anomaly by roughly 5-10%.

When you take out the good sites (CRNu345) you get a difference from the whole that wanders around the zero line, negative here, positive there.

Ive compared several months from the original hand written station reports against the values in your station.csv file and none are matching. All of them are off from between 1 and 5 tenths. I have also found a couple of stations that have reports with less than 9 missing days but they are blank in your spreadsheet. All of the station reports that have 9 or more missing days are blank in your spreadsheet. I think the raw GHCN data is much less raw than they claim.

Then click on the state, then find the month and year and it gives you a link; click on the link and there you have it, the original hand-written report.
I started with your and Steve’s favorite and the first one on John V’s CRN 5 list: Univ. of Tucson #1. John V’s data goes up to March 2006, so that’s where I started (it was off by 3 tenths). Then I went back a year to March 2005; John’s spreadsheet was blank for that month but the report is there, no days missing (same with Wickenburg, AZ in Jan 1999). Then I went back another year to March 2004; it was off by 5 tenths.
I also did Gardiner, Maine for March in 2006 and 2005. It was off by 1 tenth for each.
I also checked blanks at Chico Univ. Farm, CA on John V’s spreadsheet for March 2006 and Nov. 1999; the NCDC reports were there but missing 9 and 10 days.
These were all March, so try July or August and I get the feeling that the differences will be larger for the CRN 5 stations.
Get ready for some boring math, because some of them didn’t do their sums so I had to.
A Fahrenheit to Celsius converter can be found here:
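In lieu of the converter link, the conversion itself (and the monthly means that some of the forms left unsummed) is only a couple of lines; the helper names are mine, and rounding to tenths matches the precision of the discrepancies quoted above:

```python
def f_to_c(temp_f):
    """Convert a Fahrenheit temperature to Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0

def monthly_mean_c(daily_f):
    """Mean of a month's daily Fahrenheit readings, converted to Celsius
    and rounded to tenths (the precision used on the reports)."""
    return round(f_to_c(sum(daily_f) / len(daily_f)), 1)
```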

I knew there was something wrong with all of this; I’ve had my hand up to some of those AC units. And this is not John V’s fault; like I said, if it comes from NASA GISS or NCDC, it is junk until you see it with your own two eyes.

John V,
Instead of gridding an area with regular sized rectangles or triangles, why not do the following.

1. Imagine the stations are the vertices of an irregular polyhedron with triangular sides, i.e. draw lines between the stations until the area is split up into triangles, each triangle having one station at each of its three corners.
2. Now compute the temperature for each triangle by taking the average of the three temperatures for the stations at the three corners.

3. Weight each triangle’s temperature by its area (remembering that the triangle is on the surface of a sphere when computing the area).
4. Sum the weighted temperatures and divide by the total area to get an average temperature.

The polyhedron can be created computationally by starting at an arbitrary station, finding the closest two stations to form the first triangle, then expanding out recursively from each of the three edges of this triangle, adding a point to make a new triangle.
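The area-weighted averaging step of this proposal can be sketched as follows, assuming the triangulation itself is already built (e.g. by a standard Delaunay triangulation rather than the recursive construction described above) and using planar areas as a simplification; for a large region the spherical areas should be used instead:

```python
def triangle_area(a, b, c):
    """Planar area of a triangle given (x, y) vertices. A simplification:
    on a large region the spherical-surface area should be used."""
    return 0.5 * abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1]))

def triangulated_mean(points, temps, triangles):
    """Area-weighted mean temperature over a set of triangles.

    points:    list of (x, y) station coordinates
    temps:     temperature at each station
    triangles: list of (i, j, k) index triples into points (e.g. from a
               Delaunay triangulation; building it is omitted here)
    Each triangle's temperature is the average of its three corners; the
    regional mean weights each triangle by its area.
    """
    total_area = weighted = 0.0
    for i, j, k in triangles:
        area = triangle_area(points[i], points[j], points[k])
        t = (temps[i] + temps[j] + temps[k]) / 3.0
        weighted += t * area
        total_area += area
    return weighted / total_area
```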

A real-world example of exactly why the concept of the laboratory notebook was developed: one of the cornerstones of the requirement for independent verification of data for quality-control purposes. Early in our educational process these are typically called lab reports.

As additional information is uncovered, it seems more and more that NASA GISS is not familiar with any of the concepts of data quality. And its feeble attempts to belittle and dismiss those who know the benefits of these concepts, and apply them to important analyses, are beginning to paint a very ugly picture of NASA GISS and its ‘science’.