USHCN Trends: Red States and Blue States

I’ve calculated the trends for all 1221 USHCN stations after 1900 and plotted contour maps for the raw, time-of-observation adjusted and fully adjusted results. Pretty interesting results. Here are the results for the “raw” data; the adjusted versions are shown below.

I’ll also give directions for producing pretty contour maps from irregular data using two different R packages.

First, I collated the nasty USHCN formats into three organized matrices (raw, tobs, filnet) with 1221 columns (one for each station). The USHCN storage formats are obsolete in two distinct ways: instead of tab-separated ASCII files, they use fixed columns packed together; and they are compressed with an obsolete zipping method into .Z files, a form not recognized by the unzipping routines in R (as far as I can tell so far). The collation step took by far the longest.
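For readers who want to try this themselves, fixed-column files of this kind can be read in R with read.fwf. This is only a sketch on toy data: the field widths, column names and scaling below are made up for illustration, and the real layout has to be taken from the USHCN README.

```r
# Sketch: reading fixed-column data with read.fwf. The widths and names
# below are illustrative only -- the real USHCN layout is in its README.
tmp <- tempfile()
writeLines(c("011084 1900  612  634  701",
             "011084 1901  598  641  695"), tmp)
x <- read.fwf(tmp, widths = c(6, 5, 5, 5, 5),
              col.names = c("id", "year", "jan", "feb", "mar"))
x[, 3:5] <- x[, 3:5] / 100   # assuming values are stored as hundredths of a degree
```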

After doing that I calculated trends for each version for all 1221 stations (using the monthly data converted to anomaly form), storing each trend coefficient in a data.frame (details) that also carries lat, long and other station information.
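A minimal sketch of that trend step, on toy data (the object names here are hypothetical, not my actual script): convert each station’s monthly series to anomalies by removing each calendar month’s long-term mean, then regress the anomaly on time.

```r
# Toy stand-in for the real station matrix: 50 years x 12 months, 3 stations.
set.seed(1)
monthly <- matrix(rnorm(50 * 12 * 3), ncol = 3)
time    <- seq_len(nrow(monthly)) / 12          # time in decimal years

# Anomalies: subtract each calendar month's long-term mean, per station.
anomaly <- monthly
for (m in 1:12) {
  rows <- seq(m, nrow(monthly), by = 12)
  anomaly[rows, ] <- sweep(monthly[rows, , drop = FALSE], 2,
                           colMeans(monthly[rows, , drop = FALSE], na.rm = TRUE))
}

# OLS trend (degrees per year) for each station.
trend <- apply(anomaly, 2, function(y) coef(lm(y ~ time))[2])
```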

Contour plot programs require gridded data. To go from the irregular USHCN locations to a grid, I used the interp function in the akima package. I looked at a variety of other packages and stumbled across this one; this is my first use of it. I’m not entirely sure how it works, but the results make sense visually, and some of the other packages over-smoothed for what I wanted.

To make the contour plot itself, I used the image.plot function in the fields package, with a couple of twists that are worth paying attention to. I specified the color range tim.colors(64) – tim.colors gives reds and blues in the right places for the comparisons that are done for climate anomalies. I also specified the breaks so that there were the same number of negative breaks as positive breaks; you need to do this to ensure that blues and reds center in the right place, and you have to specify one fewer color than the number of breaks. (I haven’t figured out how to tidy the color-coding legend yet, but that’s a small point.) You can add geography in fields with the command world(add=TRUE) or US(add=TRUE); I specified lwd=2. Then I added back the USHCN locations as individual points. Pretty simple – a nice fancy plot in a few lines.
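Put together, the recipe looks roughly like this on made-up data (assuming the akima and fields packages are installed; details here is toy data standing in for the real station data.frame):

```r
# Toy stand-in for the station data.frame: long, lat, trend per station.
set.seed(2)
details <- data.frame(long  = runif(100, -120, -70),
                      lat   = runif(100, 30, 48),
                      trend = rnorm(100, 0, 0.01))

# Symmetric breaks: same number of negative and positive bins, and one
# more break than colours, so zero falls on the red/blue boundary.
m      <- max(abs(details$trend), na.rm = TRUE)
breaks <- seq(-m, m, length.out = 65)         # 65 breaks for 64 colours

if (requireNamespace("akima", quietly = TRUE) &&
    requireNamespace("fields", quietly = TRUE)) {
  surf <- akima::interp(details$long, details$lat, details$trend)
  fields::image.plot(surf, col = fields::tim.colors(64), breaks = breaks)
  fields::US(add = TRUE, lwd = 2)             # add the U.S. outline
  points(details$long, details$lat, pch = 20, cex = 0.5)
}
```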

The first result – USHCN Raw – is shown above. Given that this sort of “raw” data, without the benefit of the USHCN time-of-observation and filnet adjustments, is what underpins temperature calculations throughout most of the world, it surprised me just how much of the U.S. lacked a 20th century trend in the raw data. I say this because I’m quite convinced that there has been 20th century warming. I’m not among those trying to say that it doesn’t exist. The pattern of the map is interesting as well: the apparent cooling is notable in the south-east. I understand that there is some proxy evidence of this – John Christy mentioned to me that oranges had a greater range in the early part of the century than at present. BTW I don’t think that I’ve ever seen a plot of the “raw” data before: I guess that it’s for mature audiences only. The hot spots were in California, Arizona, Utah and also Montana and the eastern cities – New York, Washington. There’s an interesting little hot spot in northern Minnesota that I have some information on and will post about on another occasion. So surfacestations.org with its initial focus on California is investigating a particularly important area for understanding these trends. There’s an interesting cooling anomaly in the middle of Colorado. The seven biggest “raw” trends were: LITTLE FALLS MILL ST NY, TUCSON U OF AZ, CHULA VISTA CA, PASADENA CA, SAINT GEORGE UT, ROCK SPRINGS AP WY and PARKER 6NE AZ.

USHCN Time-of-Observation Adjusted

Next, here is the same information, this time for the time-of-observation adjusted data. The regional pattern is similar: California and the eastern seaboard are red states, while the southeast and Texas are blue states. However, if you look back and forth, you’ll see that the warming trends are intensified relative to the raw data and the cooling trends are attenuated. The rankings at the top of the league are similar to the raw table but change enough to affect bowl invitations. This time the University of Arizona (Tucson) ranks top of the league tables, with Yosemite Park Headquarters entering the top seven: TUCSON U OF AZ, ROCK SPRINGS AP WY, LITTLE FALLS MILL ST NY, PASADENA CA, CHULA VISTA CA, YOSEMITE PARK HEADQUARTERS CA and PARKER 6NE AZ.

USHCN Fully Adjusted

Finally, here is the USHCN fully adjusted version – and this is the version that is “raw” for GISS which proceeds to adjust things even more. Again the regional pattern is similar to the previous maps, but the warming is intensified even more and the cooling attenuated further. California, the eastern seaboard and the northern states near Canada are all red states, while the southeast and Texas remain blue states. Here’s something fun: 4 of the top seven are from California. Marysville CA, where Anthony Watts’ investigations began, cracks the top 7: HANKSVILLE UT, INDEPENDENCE CA, MESA AZ, PLYMOUTH-KINGSTON MA, BLYTHE CA, LIVERMORE CA and MARYSVILLE CA.

It’s hard to know where to begin commenting. Merely plotting the trends in an organized form identifies many questions that any mineral exploration geologist or geophysicist would investigate in detail on the ground. What accounts for the cool anomaly in the middle of Colorado? What accounts for the hot anomalies along the eastern seaboard – which, curiously, seem to match urbanization patterns? Hmmmm. And BTW, doncha think that Hansen should spend a little time discussing the impact of his adjustments in a specific way? Geophysicists would, but not, I guess, CSIs.

But the oddest pattern is surely the degree to which red and blue states on these maps match their political counterparts. There are a few exceptions – Arizona, Montana, Utah, but it looks to me like voting patterns would be a better proxy for the existence of a 20th century temperature trend (by state) than tree rings.

94 Comments

Isn’t it strange that the raw data shows less warming and more cooling than the adjusted data? I would have thought that any random changes in the recording equipment would average out across the country, leaving only the underlying trend effects. But the fact that the adjusted data is warmer means that overall the adjustments are correcting for perceived artificial cooling effects in the raw data. What would such cooling effects be? The only identified artificial trend that I know of is UHI, which is a warming effect.

For example, movement of recording stations from one location to another probably causes no intrinsic bias, since an increase in temperature from such a move at one station should be offset by a decrease at some other station. However, if a substantial portion of the stations have been subject to UHI, then that should show up in the raw data, and one would expect the adjusted data to be cooler, not warmer.

One general comment: it looks as if the software may have modeled across the Gulf of Mexico and Great Lakes. Not sure if your software has a means to exclude those areas. It doesn’t sound like you did anything other than pull the data and have it plotted, but if your software is doing calcs, it is likely interpolating in those areas pretty nonsensically (e.g., between SW FL and the TexMex border, between Maine and northern Minnesota, etc.).

I do find the final result somewhat perplexing. While your point about the cities on the eastern seaboard is a good one, I look at the triangle of Dallas/Ft Worth, Houston, and San Antonio (with Austin in there, too), and see cooling. While there are plenty of rural areas in there, those are some major metropolitan areas that are concrete jungles. I would have expected to see UHI effects here, but the opposite appears to be true, at least from the fully-adjusted USHCN plot.

Also much of the heaviest warming seems to be in rural areas, which might also contradict the idea of UHI effects.

I would expect that the effect of UHI on raw temperature may be reduced in areas with higher humidity, since humid air takes more energy to change one degree C. Hence it isn’t surprising to see the highest raw T change in the driest parts of the country.

It’s not that either the fixed format data or the compression method is obsolete. It is that they are commonly used in areas to which your experience may not extend.

The compression method is, I believe, the standard unix compression method. Any unix/linux user who visits here may be able to confirm, or rebut, that impression.

Fixed format records, rather than tab- or otherwise-delimited variable-length formats, are very commonly used because there is no need to introduce the overhead of processing the delimiters and adjusting field lengths.

For many software packages, delimited data can be more convenient, but fixed format was still the most commonly used format in most data processing applications the last time I looked.

Look, the whole tenor of the recent discussion examining rural stations has been to try to disaggregate simplistic ideas of “urban” warming.

The inspection of places like Marysville, Lake Spaulding, Fallon show that there are problems with the poorly supervised rural network. Do one barbecue + one rural site have the same average trend as one urban airport?

The populated parts of the Atlantic seaboard stick out more than seems likely if the change there was due to large-scale climatic issues. In order to form an opinion about Dallas, one would have to look there.

Jeez, I’m not trying to make sweeping generalizations off a couple of charts – I’m trying to demonstrate the texture of the problem and avoid arm-waving.

Merely plotting the trends in an organized form identifies many questions that any mineral exploration geologist or geophysicist would investigate in detail on the ground.

Indeed. After I create my anomaly maps, before I would send a crew into the field, I’d go anomaly hunting, starting with the biggest outliers, to verify the origin of the anomalies. In most cases I would find some sort of processing artefact, like a forgotten decimal separator.

Re#7, I understand almost all of the recent posts have been about location bias and not so much a general rural vs urban…I was just surprised, right-or-wrong, about a few things I saw at first glance and thought them worth a mention.

I asked a statistics professor on the R list about the .Z files and he said:

Looks like the file is the format generated by the ancient UNIX “compress” program.

Nicholas said something similar. Yeah, it works, but it’s “ancient” and it’s not supported in R, for example. Your comment:

Fixed format records, rather than tab, or other delimited, variable length format, are very commonly used because there is no need to introduce the overhead of processing the delimiters, and adjusting field lengths.

They’re used because Fortran programmers use them. They were used originally because of punch cards trying to fit things onto 80 columns. Memo to Fortran programmers: you don’t need to squish things into 80 columns any more. Jerry, my beef is that this system imposes substantial overhead costs for people not using Fortran. If you have tab-separated ASCII files, you save a LOT of time. I waste an inordinate amount of time decoding these crappy Fortran layouts and then string-splitting. Don’t waste your time arguing with me about it. I’ve spent enough time already with these squished and stupid Fortran formats.

To me this looks like a map of “How the West was Won”. Or maybe “how the west was irrigated” in the last 100 years. Christy may be right that irrigation plays a major role in higher night temps, causing an upward trend.

It would be interesting to get nationwide irrigation use data, in acre-feet per year, and run the same R plots.

Austin, Dallas/Ft.Worth aren’t in the USHCN dataset, neither is Atlanta. But San Antonio is (which shows a negative trend) and Corpus Christi is (shows positive trend). So with exclusions of these major population centers from the USHCN data set, I’d expect them not to show as anomalies on the map.

But in California, which was a primarily agricultural state around 1900, the big growth centers are not excluded from the dataset…Pasadena for example used to be a lot of orchards, and the sensor is at the water company downtown. It’s been there ever since, but is now surrounded by high-rises. I have a volunteer on it.

While your point about the cities on the eastern seaboard is a good one, I look at the triangle of Dallas/Ft Worth, Houston, and San Antonio (with Austin in there, too), and see cooling. While there are plenty of rural areas in there, those are some major metropolitan areas that are concrete jungles. I would have expected to see UHI effects here, but the opposite appears to be true, at least from the fully-adjusted USHCN plot.

The USHCN network, to some extent but not always, tries to stay away from big urban centers. There is no Dallas or Houston station per se in USHCN. So you could have UHI in Dallas and Houston, but not in the USHCN network. Perhaps there’s a greater separation between urban and rural in Texas than along the New York-Washington corridor.

A quick request to the user community on techie issues like fixed formats would work wonders.

I date from the Fortran punched-card era… my MS thesis was about an 18-inch stack of those wonders of technology. The same is true for my early involvement in Unix/machine-language programming and numerous other ancient lore.

Just a quick look, and I must agree that it is difficult to spot any large-scale, hockey stick-like trends in any of the charts. I would have expected to see much more red and orange in places like the Hill Country, the Southeast, and the Ohio Valley. The adjusted data was only surprising in that those areas that did show cooling persisted despite the adjustments.

This is a recognized problem in using surface data. It doesn’t always reflect the lowest air parcels in the troposphere. It would be much better to use geopotential heights (900mb-850mb, or 800mb-700mb thicknesses) – that way you get parcel densities near or above the boundary layer. Of course, rawinsonde stations are few and far between – we must get on with what we have, which unfortunately is not all that good. Between the proxy divergence problems and inaccurate surface data, it is difficult to say to what degree we are warming.

But in California, which was a primarily agricultural state around 1900, the big growth centers are not excluded from the dataset…Pasadena for example used to be a lot of orchards, and the sensor is at the water company downtown. It’s been there ever since, but is now surrounded by high-rises. I have a volunteer on it.

That is true for more than Southern California too. As late as the early 1980’s there were still many prune orchards around San Jose and Cupertino. Those are gone now. You could drive up US 101 from San Jose and there was actually open space between Santa Clara and Sunnyvale. Today this is all one solid blanket of suburban sprawl from San Jose nearly all the way to San Francisco with only a few breaks. San Jose, Santa Clara, Sunnyvale, Mountain View, Los Altos, Palo Alto, Menlo Park, etc. now simply blend from one to another.

It should be possible to write a simple parsing program to convert from such a fixed format file to a tab- or comma-separated format. If I can get my hands on a sample of the fixed format file, I’ll give it a go. I can only compile for Windows and Linux, but I would be happy to provide a copy of the source code so that you could compile for other OSes.

Let me start by saying I don’t know anything about R. However the languages and scripting packages that I am familiar with support some kind of system call. That is, you can execute an OS command from within the language. Cygwin has a number of free packages, including decompressors for older formats that run on Windows boxes. If you are on some flavor of Unix, the package should already exist, you just need the right install tape.
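In R, the system-call pattern this comment describes would look something like the following (the file name is hypothetical; it assumes a unix-style uncompress, or gzip, which also reads .Z files, is on the path):

```r
# Sketch: shelling out from R to decompress a .Z file before reading it.
f    <- "hcn_calc_mean_data.Z"      # hypothetical file name
dest <- sub("\\.Z$", "", f)         # name of the decompressed file
if (file.exists(f)) {
  system(paste("uncompress", f))    # or: system(paste("gzip -d", f))
  raw.lines <- readLines(dest)
}
```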

The claim that adjustments have been made for UHI seems inconsistent with the warm and cool both getting warmer. Something wrong here. Steve, have you inverted something, or are the climate guys intentionally or accidentally making a mistake?
Murray

#18. Hans, there is a “significant” correlation of trend to altitude (and also to latitude). I’d imagine that there are regional changes that could explain warming in the West and cooling in the East, but it’s not an obvious “fingerprint” of higher CO2 levels.

All of these interesting contour maps deliver to me a message about the nature of temperature change and/or measurement (errors). These changes are local and that implies that their resolution must be just that much more difficult to resolve by computer modeling. [snip – we’re not going to discuss the “meaning” of global temperature]

#28. These graphs do NOT prove that change is local. All they show is that there is a lot of error in the measurement and that the errors are not white noise. I noticed something as I wrote this – a climate scientist would say that there is a lot of “noise” in the measurement rather than a lot of “error”. There’s a subtle difference in the metaphor – the climate scientist inevitably goes on to assume or use methods that rely on the error being white noise or low-order red noise, as opposed to allowing for the possibility of really bad data being included in the stew.

The .Z format is really not that obscure. I know about it and I’m not that ancient. Any decent computer (i.e. any Linux computer) can deal with it – you just type uncompress hcn_calc_mean_data.Z. And the fixed column format is just the most efficient way of packing in the data – the README file tells you that the temperatures are in cols 15-20, 25-30 etc., so they can be extracted from each line using a substring command, whether you are using R, matlab, fortran or whatever.
Fascinating plots anyway – are the units in the colorbar degrees C (or F) per year?
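For what it’s worth, the substring approach in R looks like this on a made-up record (the cols 15-20 position follows the comment above; the line itself and its layout are fabricated for illustration):

```r
# Build a toy fixed-column record whose January value sits in cols 15-20.
line <- sprintf("%-14s%6.2f", "011084 1900", 6.12)
jan  <- as.numeric(substr(line, 15, 20))   # extract cols 15-20
```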

RE: #19 – There is no break at all in the conurbation, from about north central Richmond, all the way down the east shore of the Bay, clear down to South San Jose, then up along the West side of the Bay to Ft. Point. While there are mountain / hill breaks to the East, the conurbation picks up where there is flat land in the “big C” – e.g. the Livermore-Pleasanton-Dublin-I680 corridor-242 corridor-Pittsburg sprawl. Same deal with going North – Marin and Solano Counties have their flat land essentially filled in. Same deal going over the hill to Santa Cruz. Not much open flat land left there either. HMB, they still have some, due to growth restrictions. But for how long? All of this growth has been since the 1950s, most of it as you have noted concentrated during the period 1970 – present.

RE: The maps – compare these maps with the long term forecasted surface temperature anomalies from the NWS, for any future 3 month period covered by such forecasts. Most of them show an anomaly pattern similar to these maps. That means that when they do their forecasts, they essentially assume that the future anomaly will be a “signature” akin to these reputed “expressions of AGW in the temperature change history.”

RE: Related side note. Anyone catch the lead clown, Hansen, on History Channel this weekend? There were a number of assistant clowns interviewed as well. It was as if someone had put together an “anti Swindle” – taking even the legitimate points on Swindle and doing a 180 for each of them.

The comment about irrigation brings up the general question of humidity. Areas that are irrigated aren’t the only areas where water usage has changed. Lawns are also more prevalent in areas such as Phoenix and Houston where they didn’t exist before suburbia. Dams have also created large bodies of water. How does the change in humidity affect the recorded temperature? Is humidity available along with the temperature data to see if there is any correlation? Are any corrections applied to the raw data?

#30. Look, I can uncompress things, but for .Z files, I have to do it manually because the method, according to specialists that I’ve talked to, is obsolete or ancient and not supported in R. You can do all kinds of things in R, so if .Z isn’t supported, I take that as evidence that it’s ancient. The manual step can be annoying because there are files that I would like to handle within a script, and I can’t without downloading the file manually, unzipping and storing it. It’s like 8-track tapes.

Pasadena was a city before Los Angeles was; Old Town Pasadena, built in the 1880s and 1890s, is sometimes referred to as the first L.A. Downtown. Pasadena is also in the central part of the most heavily urbanized area of Southern California. In other words, using “Pasadena” in the USHCN is tantamount to using Los Angeles, or Southern California, which has gained more population in the last 100 years than any other place in the United States.

The USHCN temperature graph of Pasadena (I use CO2 Science for convenience) shows steady warming over the last 100 years, not a spike since 1980.

Here is a graphic (link only to save bandwidth) showing the same information for 1920-1980 USHCN adjusted. The California locations are real hot spots. Is there a climatic explanation for these localized hot spots in one of Gavin’s GCMs? It would be a real test of tweaking powers.

It would be interesting to see a humidity level map superimposed on these maps. It appears that most of the heating has occurred in low humidity areas (with the exception of UHI effects). If it is indeed warmer now, which I think is true (mainly solar effects, IMHO), then this again demonstrates that there is no “water vapor feedback” to increased forcing.

I’ll try to figure out how to do the legend on image.plot. I converted the USHCN data to deg C, and the plots are of deg C/year. The top end is 0.027 deg C/year, or 2.7 deg C/century, and the bottom end is about -1.3 deg C/century.

RE: #36 – Careful there. An anecdote. When I water during a day with no sea breeze or even an outright offshore flow, I can feel the local humidity build up. It’s almost as if I am creating a dome or plume (depending on the air velocity). For both “urban” and rural ag areas, you’re going to get this effect pretty much anywhere in the southwest. Another twist is going to be during times when cT air is flowing in from Mexico (e.g. a southeasterly flow) and it pumps even the innate RH up above 50% (which is considered “very humid” here – LOL). In any case… the local anthropogenic RH build-ups are adding to the heat capacity way beyond what could ever happen naturally and probably causing a local positive anomaly during the late afternoon and night.

Steve, this is a really important observation. Going from knowing that there are errors in your data, to assuming that they are equally balanced in sign and magnitude, and then to assuming that they conform to some tractable and knowable population distribution, is such a common failing that it ought to have a fancy Latin name. Such a large body of tools has been developed since Gauss addressing the problems of normal error that the temptation to apply them to pretty much any problem is, based on my personal (non-climate science) experience, irresistible.

Kudos to Mr. McIntyre! A classic example of “The Visual Display of Quantitative Information.” I hope Dr. Tufte is following Steve’s entire body of work, for the now discredited “Hockey Stick” was a notoriously effective misuse of graphic information.

In re Posts 13 & 37: Will be informative to see comparisons with the satellite data for the same period.

Formats: SteveM, you have ingested all this data? Time to cough it up. Ideally someone would ingest all 1221 sites in daily, monthly, and annual form, then repost the record in multiple sensible formats. Then others could ingest and repost in their preferred format. Think data translation center.

Anomaly methods: The anomaly method makes the hair on my neck twitch. Now, I understand the approach, and I understand the rhetorical force as well. I don’t object to the method. I have a question about presentation. When one presents anomalies versus a historical mean, I THINK we need to cite or reference the mean in the presentation. It took me some effort to find the HadCRUT3 mean for 1961-1990. Looking at an anomaly from an unknown mean, with an unknown variance, is, well, well… (bites his tongue). So here is a stupid idea. I have a historical mean (1961-1990), and that mean should have an SD, right? Why not do a deviation anomaly map? So, for example, “these grids” have moved 1 SD from the 1961-90 mean, these grids have moved 2 SDs, these grids 3 SDs. So instead of a map of anomaly versus the mean, do an SD map or Z-score map. Or is that idea just silly?

Variance map: I do not like the grid approach. (I enjoy being difficult.) It’s easy to be mesmerized by the rectangular system. Could one tile the world differently (like using hexagons) and perhaps minimize variance within each tile and/or decrease the number of tiles, or do some kind of adaptive variance-minimizing tiling? Just a thought.

Red versus blue: Interesting. I think it might be interesting to correlate the map with land use, irrigation, etc.

Coastal madness: I’ll bite my tongue for now. With sea level rising, and projected to rise according to AGW, some folks continue to move to areas at or below sea level and then demand that I change my lifestyle to accommodate their luxury.

Fortran and NASA: Anthony notes that ModelE is written in Fortran. As far as software goes, ModelE is a piece of poop. You would NOT fly a plane with code like this. You wouldn’t control a heart rate monitor with code like this. Would not run an ATM with code like this. Would not control your iPod with code like this. Climate SOFTWARE needs an AUDIT. With the exception of a CVS, I saw nothing that led me to believe the code was developed or tested according to any standard. For a CONTRAST, have a look at the MIT GCM.

Well, maybe not that bad, but I’ll agree that it is not anything like modern software, and because of its long lineage, with some code going back 20+ years and having been touched by a variety of programmers over the years, it has become what professional programmers call “spaghetti code” to some degree. The mainframes that this code was started on had less power than some PCs of 2-3 years ago.

It certainly is not robust, I’ll say that. Data that is out of bounds in some way can cause it to crash.

It really needs an overhaul with application of modern coding techniques and modern visualization tools.

Yeah, the water vapor decreases the diurnal variation by storing some heat. But it also carries heat away. It’s almost a wash. I still can’t figure out why average July temperatures in places like Daggett, CA are higher than those in the Southeast. If water vapor exerted a large positive feedback, wouldn’t the reverse be true?

In GNU/Linux land, .Z (compress) is rarely used. Other than special formats like .deb or .rpm, most “tarballs” (archives without compression) are compressed with either .gz (gzip) or .bz2 (bzip2). I process a lot of text files from the command line, although I always have to look up switches due to my bad memory, with tools like sed (stream editor) or grep. sed uses regular expressions that allow you to convert runs of spaces/tabs into a single space and then replace it with a comma. grep allows you to pick out data between spaces/tabs. I don’t know what OS you use, but Cygwin under Windows supports the GNU tools, and obviously any flavor of Linux has everything. I enjoy your work; I wish I could use R that well.
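A one-line sketch of the sed idea on a made-up record (the file names are hypothetical): squeeze each run of spaces/tabs into a single comma to get a CSV-ish file.

```shell
# Convert runs of whitespace into single commas with sed.
printf '011084 1900  612  634  701\n' > stations.fixed
sed -E 's/[[:space:]]+/,/g' stations.fixed > stations.csv
cat stations.csv    # -> 011084,1900,612,634,701
```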

Re 25 – Dumb statement. What was I thinking? Should have said: with adjustments getting warmer, when compensation for UHI should cause them to get cooler. Steve, could your variances have the wrong sign?
Murray

#28. These graphs do NOT prove that change is local. All they show is that there is a lot of error in the measurement and that the errors are not white noise.

Not to belabor my point, but what I observe in the contours of the maps in the thread heading is a sharp gradient as one goes from one region to another (locale may be a poor choice of wording, but my guess is that if one made the grid areas smaller the differences would be significantly greater), and as the more complete “corrections” are made, these gradients seem to be less sharp but remain nonetheless. You see many areas with trends so near zero that confidence in the significance of any trend must be low, then you have areas with very large trends upwards, and finally a smaller, but still observable, number that evidently show a statistically significant cooling trend (assuming the measurements are reasonably correct). My point being that the maps show that global warming varies – well, all over the map – and my state of IL shows little warming, which may explain part of my complacency about the situation.

Local temperature (and climate) changes are, I would assume, what most people immediately react to. When these local changes, within relatively short distances, vary significantly more over a centennial time period than the average of these temperatures or a global average does, I think people have a difficult time truly relating to the average or global-average change without some help from policy advocates. Those advocates need to invoke larger-area weather phenomena such as droughts, floods and hurricanes, and that in itself is more difficult to do than merely predicting and modeling an increasing global temperature.

My point being that the maps show that global warming varies – well, all over the map

Again, not to belabor the point, I don’t agree with this. The maps do NOT show that warming varies all over the map. They show that the USHCN measurements vary all over the map – possibly due to the presence of barbecues and incinerators at one site and not another. These do not preclude relatively uniform warming due to AGW, but they do make the use of the USHCN data precarious. The biggest beef that I have is Hansen and Karl calling this stuff “high quality”.

It looks like I need to get out and take some photos for the surfacestations website. I live in south-central Utah and have the Richfield HCN station just four blocks from my house, while the new Richfield Airport station is a couple of blocks from my office.

It is interesting that the Saint George station is so prominent in the raw data. The southwest corner of Utah has changed more in the last couple of decades than any other part of the State. Saint George has been transformed from a sleepy regional center in the midst of a largely agricultural valley to a booming urban center of McMansions and retirement communities ringed by golf courses. (County pop. in 1970: 14,000; today: 130,000.) Unfortunately, I never get down there – but anyone on the road from Las Vegas to Salt Lake or Denver could stop and check it out.

Hanksville Utah, which was also mentioned is a little armpit of a town in the middle of nowhere. The traditional industries were farming (irrigated alfalfa) and cattle rustling, but for many decades its lifeblood has been selling gas to people coming and going to Lake Powell. I fought a fire down in that area last summer, and believe I saw a weather station at the BLM office, near the edge of the (very small) residential part of town, bordering on the desert.

#50, Ian, I disagree with you on compress. It is the original unix compression program and is available on virtually all unix systems. It would be preferred over newer, more efficient compression if the goal is to provide universal access.

We need independent methods of determining whether we have (a) temperature measurements/corrections that vary all over the map against a background of relatively uniform actual temperatures, (b) reasonably accurate and precise measurements against a background of spatially more variable actual temperatures, or (c) poorly measured temperatures against a background of spatially varying actual temperatures.

I am inclined to select (c), but if we can find sufficient weather stations with an audited good history we can perhaps test the spatial variability of climate change more accurately. The number and proportion of stations with “bad” audits would, of course, give credence to what I think you are proposing.

My statements in the preceding post would stand with the alteration that it is people’s perception of temperature that is local, i.e. when Hansen, Karl, Jones and the remaining consensus say it is hot, I damn well better start sweating.

Ian: I agree, .Z is mostly obsolete. Here is the reasoning I was given by a friend (I knew this, but had forgotten since it was so long ago).

.Z files are from compress, which was basically the first good compression program. Then the author found out that the algorithm he had based it on was patented, and that patent was eventually acquired by Unisys. So people set about making a replacement. This was some time in the 1985-1990 time frame. That replacement was “deflate”, which was used in both PKZip and GZip (.zip and .gz). Deflate also got better compression than compress. PKZip was released in 1989, and by the time the Lempel-Ziv patent expired (2003 I think) pretty much everyone with a modern OS had switched to using either zip, gz or the newer bz2 format. That’s why you don’t see many .Z files these days. Probably the guys still producing them are stuck on old versions of Unix where compress is still the default program. Possibly the same guys who are still outputting 80 column punchcard-compatible records from Fortran programs.
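For what it’s worth, you usually don’t need the old compress program just to read these files: GNU gzip can decompress the .Z format, though it cannot create it. A minimal sketch, assuming gzip is on the path:

```shell
# Decompress every legacy .Z file in the current directory.
# GNU gzip recognizes the old compress (LZW) format on decompression,
# so a separate "uncompress" utility is not required.
for f in *.Z; do
  [ -e "$f" ] || continue   # skip the literal pattern if no .Z files exist
  gzip -d "$f"              # writes the file without the .Z suffix
done
```

From R, the same trick avoids unpacking to disk: `read.table(pipe("gzip -dc ushcn.Z"))` streams the decompressed text straight into a data frame (filename hypothetical).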

On the subject of grep/sed, you point out:

SED uses something called regular expressions that allow you to convert runs of spaces/tabs into a single character and then replace it with a comma. GREP allows you to pick out data between spaces/tabs. I don’t know what OS you use, but Cygwin under Windows supports the GNU tools, and obviously any flavor of Linux has everything. I enjoy your work; I wish I could use R that well.
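The whitespace-to-comma conversion described here really is a one-liner; a minimal sketch (sample line made up):

```shell
# Collapse every run of spaces/tabs into a single comma.
printf 'TMAX  72\t85  91\n' | sed -E 's/[[:space:]]+/,/g'
# -> TMAX,72,85,91
```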

You are right, grep and sed are wonderful tools, as is awk, which is basically a combination of the two. However, the grand-daddy of them all (actually, more like the grandson, but whatever) is PERL. Now, PERL is a horrid language, but once you learn it, it quickly becomes indispensable. It’s like all three rolled into one, and more. For example, if there were a format with eight fixed columns of five digits each, I could convert it into comma-delimited format with something like this:

The hash table support is very useful, although the syntax is horrible. This is a valid PERL program: “@$#()*UD()IIF#”. Still, once you learn it, it’s hard to go back. Of course, R includes regular expression and string splitting support, as well as many helper functions for common data input formats, so I think Mr. McIntyre has this well in hand, but if you want a quick way to munge a text file, knowing PERL can be very handy.

I believe this is (or was) located at the Western Salt Works facility. This facility produces salt from sea water using solar evaporation. If so, this formerly “high quality” site is about to undergo significant alteration as it transitions to the San Diego National Wildlife Refuge where a habitat restoration is planned.

Having spent the first 30 years of my life in Southern California and the next 15 in Georgia, participating in suburban sprawl in both locations, it is interesting to note that there are distinct differences in sprawl methods. The southwest clears, levels, and paves. The southeast retains much of the existing terrain and vegetation, and has less dense housing and commercial development.

Drive up a freeway in the southeast and you feel like you are driving through a canyon of trees, even if the freeway is surrounded by subdivisions. Drive up a freeway in California in a suburban area and you see roof tops.

Of course not all rural locations have been replaced by suburbia, but certainly most of today’s suburbia was rural 50 years ago. Each station obviously has a unique environment, but I would be willing to bet that your average long term southeastern station has a much more stable terrain and vegetation environment than a southwestern one. It seems pretty obvious to me, perhaps I am misunderstanding something.

Average temperature is calculated as (max temp + min temp)/2 over some time period. If I am right that increased daytime heating is raising minimum (nighttime) temperatures without a corresponding rise in the maximums, then places with large diurnal ranges will show the most (apparent) warming.
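As a quick sanity check on the arithmetic, a rise confined to the minimum moves the mean by only half as much (numbers made up):

```shell
awk 'BEGIN {
  tmax = 30; tmin = 10
  printf "mean before: %.1f\n", (tmax + tmin) / 2
  tmin += 1    # one degree of warming confined to the nighttime minimum
  printf "mean after:  %.1f\n", (tmax + tmin) / 2
}'
# -> mean before: 20.0
# -> mean after:  20.5
```

So a station whose minimums alone drift upward still shows a mean-temperature trend, just an attenuated one.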

I’d be very interested to see summer and winter versions of the graphic. I predict warming in the N/NE will be mostly in the winter and mostly in the summer in California. The SE should warm uniformly throughout the year (relative to Cal and the N/NE).

RE: #61 – It’s because in much of the southwest, you are limited by various things, sometimes multiple things at once. Buildable flat land is an obvious limitation. So where it exists, it’s typically already being farmed. As development edges in, prices shoot through the roof, and you end up with the clear-and-pack-’em-in norm. The other factor is water. Getting a water hook-up can be difficult; the less extensive the water delivery infrastructure, the harder that is. The other obvious thing is that, in all but a few areas, the developable flat land has nothing more than an open savanna-type forest at most, usually much less than that – chaparral, sagebrush, grasslands or outright desert are the norm. Since clearing such land can be done in a couple of days with a Cat, the end result is no surprise.

re Steve S 63: “It’s because in much of the southwest, you are limited by various things, sometimes multiple things at once. Buildable flat land is an obvious limitation.”

You focused on the reason behind the difference and diluted the point. Much of the vegetation is left alone in the southeast, little or none of the vegetation is left alone in the southwest.

Look behind a mall or commercial building in the southeast and you will probably see native growth trees. Look behind a mall or commercial building in the southwest and you will likely see more mall or commercial buildings. With respect to this thread I don’t think it matters why that is, I am just making the argument that the different philosophy matters when you put up a weather station 50 or 100 years ago and then suburbia pops up around it.

Of course another theory could be that there was a lot of moonshine in the rural southeast 50 or 100 years ago and who knows how accurate those readings were!

Steve,
I have followed CA closely for a few months now. Thank you and all the posters here for the education, though often I only get the gist of it while the details are a bit beyond me. Though I have a BS in Math/Comp Sci (numerical analysis), that was years ago and now I am a professional Land Surveyor.

It is with that background that I find these maps so interesting, as they look like the topographic data I deal with every day. I do agree with you that, in terms of the contours produced, it looks like there are problems with the data. It’s the kind of thing I see when I have bad field data or an improperly TIN’d data set.
I don’t know what contour software you are using, but in general none of the programs I use requires gridded data. If you are not using breaklines to control the TIN, then gridded data will help keep the program from linking data points that should not be linked, but the TIN will interpolate without the help of any “interp” program. Or at least it should.
I am interested in knowing on what you based the placement of your breaklines though, as that has perhaps the greatest impact on the shape of the resulting contours.

#67. I provided the code that I used in the post. I used the R package akima to interpolate and the fields package (from NCAR) to contour. I have no knowledge of the breaklines assumptions. However, I don’t think that it matters much for this purpose. I don’t assume that all the data is correct; I’m assuming that some of the anomalies are artifacts. After identifying an anomaly, you check it out (a la mineral exploration.) If the data holds up, then maybe you experiment with breaklines but that seems premature at this analysis stage.

Much of the vegetation is left alone in the southeast, little or none of the vegetation is left alone in the southwest.

That’s primarily because what vegetation exists is basically useless scrub. Very difficult to grow stuff in the desert without irrigation and decent soil, as distinct from what occurs in the Southeast’s monsoonal conditions. SW average humidity less than 20%; SE usually greater than 75%. The numbers are very rough, but based on living in Virginia, Florida, Mississippi, Texas, Arizona, and California (Bay area, Monterey, San Joaquin Valley, and San Diego).

The higher humidity and rainfall of the SE also allow trees to grow taller, making it easier for them to hide houses. You also have to consider the age of the development: older developments will have bigger, more mature trees.

Re #67 Gary, I think you had some of the same thoughts as I did. I thought about bringing the temp values in as points, constructing “surfaces,” etc., in Land Development Desktop to play around with them and see if I came up with anything substantially different.

I think the only issues with interpolation from what Steve did are probably at the boundaries, such as I pointed out in #4 with the Gulf, Great Lakes, etc. And my guess is that their effects are inconsequential within the US border.

Additional food for thought is here, for consideration when trying to understand temperature changes in the southeastern US.

Temperatures and rain in the southeastern region are noticeably affected by ENSO (El Nino), especially in winter. The link shows how temperature patterns at particular southeastern stations shift depending on whether an El Nino or La Nina is in progress.

Since 1976 (the great climate/PDO shift) the pattern has been El Nino dominated, which tends to cool the southeastern US and warm the upper midwest (fewer Arctic air penetrations).

Precipitation also affects soil moisture, which affects temperature. I think there are online articles which explore that.

I’m not suggesting that ENSO and precipitation explain all, because they do not, but rather that they are factors to be considered.

Back on Steve’s original point, which has to do with the messiness of the data, I notice an oddity in the Brenham Texas plot, which is consistent with it being a cool spot in the USHCN map. There is a discontinuity in a negative direction circa 1940, which I would guess is a station move from the center of town to a cooler rural airport that may have gone unnoted.

What seems to be going on in the above post is that Pielke is choosing a baseline that reduces the trend (i.e. back to 1901) – in other words, as a political strategist, he is following his own advice and choosing the baseline that supports his political goal – which is apparently to get attention as a ‘noted climate skeptic’.

Eli, I don’t see anything at all similar in the posts. This post shows three graphics prepared from original USHCN data which, to my knowledge, have not appeared anywhere else. I have also shown links to three new graphics in the same format for 1980 to 2002, done at Doug Hoyt’s suggestion and mentioned above, that you, with your rabbit-like ADD, appear not to have noticed. Here are the TOBS 1980-2002 trends previously linked

My interest is primarily in the adjustment procedures. I expressed no opinion on whether there is or isn’t a trend in the southeast, other than that the colors of the map were different.

Speaking for myself, I looked at the century-long trend not because I was trying to make a political point, but just because that was what I happened to be looking at first. I certainly believe in looking at data from multiple angles and have no objection to considering the 1980-2002 period as well, and have posted this up too. My impression of the TOBS-adjusted USHCN data in the southeast is that there is a great deal of local heterogeneity in the data. USHCN and GHCN apply adjustments to this data – whether these adjustments constitute a sound statistical method is something on which I do not presently have an opinion. So far I’m surprised at the lack of care and lack of due diligence in constructing a supposedly “high-quality” network. The claim that this network is “high quality” is sheer promotion and arm-waving. I notice that you’ve repeated the claim that the USHCN is a “high quality” network.

I notice, Lambert, that you used a GISS adjusted map to conclude that there has been recent southeastern warming. I’m just working my way towards the GISS adjustments, which are over and above the USHCN adjustments (which GISS incorporates). My impression is that it is a smoothing in which the southeast GISS data includes information from the warming southwest. I’m going to post up a kriged version of the USHCN adjusted data showing how this spreads southwestern warming into the southeast and smooths everything out. So whatever Hansen is actually doing – and it looks like a weird recipe – it may still be possible to approximate his results with a known statistical technique. But the GISS adjusted data is so kriged that you cannot use it to draw conclusions about the presence or absence of regional warming in the past 30 years. Personally, I expect there to be warming in the 30-year period. No one disputes that. Is it warmer in the southeast than in the 1930s? I think that’s a reasonable question.

From looking at the station data in more detail, I’m much more open to the possibility that it might have been warmer in the US as a whole in the 1930s than at present. It hadn’t been an issue that concerned me very much. But when you look closely at the U.S. data, many stations show a warmer 1930s than present, and many of the data sets that show strong recent warming do not meet any sort of quality standard (Marysville).

BTW I urge you to actually read what I wrote, rather than what you assume that I wrote.

#76. Sorry, it was Eli who used the GISS adjusted map. Not that GHCN adjustments are preferable. I’m not sure what their impact is; I’ll try to plot out the data some time. Rob Wilson has reported that adjusted GHCN data is unusable for his tree ring studies.

The main similarity between the SE and the rest of the US in the TOB-adjusted map shown above is that both are very heterogeneous locally, with stations with declining measurements beside stations with increasing measurements in a crazy patchwork. Obviously many of the stations have substantial nonclimatic influences. Obviously no one thinks that you can recover information from this by merely averaging. USHCN, GHCN and GISS all adjust. Are their adjustments mere Mannianisms or do they have statistical meaning? The publications do not qualify as professional publications from a statistical vantage point.

“One of the objectives in establishing the U.S. HCN was to detect secular changes of regional rather than local climate. Therefore, only those stations that were not believed to be influenced to any substantial degree by artificial changes of local environments were included in the network. Some of the stations in the U.S. HCN are first order weather stations, but the majority were selected from U.S. Cooperative Weather Stations. To be included in the U.S. HCN, a station had to be active and have at least 80 years of mean monthly temperature and total monthly precipitation data, and have experienced few station changes.”

I wonder how they determined that these stations had not been adversely affected? How does a silver painted shelter become a USHCN station?

After looking at the charts, I was wondering: are there any indications that only the winters are warmer, versus both summer and winter warming? I ask this after reading the recent UC Irvine study on the effect of dirty snow on the Arctic ( http://dust.ess.uci.edu/ppr/ppr_FZR07_jgr.pdf ). Per that study, dirty snow accounted for 20 percent of total measured global warming in the last century and up to 94 percent of Arctic warming. This made me wonder whether North American warming could also be from dirty snow and, if so, whether the raw data would show that.

RE: #83 – Or a slight twist…. dirty snow / soot may explain why Europe seems to have continental warming versus the rest of the world. To a lesser but notable extent, so too does the northeastern US. But since Europe is second only to Asia in terms of the soot problem, and since it is entirely at upper mid latitudes, that is where I would expect to see the soot problem express itself most prominently. Soot would, I surmise, even spike the level of summer warmth as well.

I then looked at the Tmin, Tmax and Tave values in relation to the Tflags so that I could coordinate similar data together. I look for a “1O” flag to be associated with the monthly temperature before I include it in my analysis. The other issue is that some of the data banks go back as far as 1865; however, to keep everything on the same baseline I have found that using 1936 as the start date gives a better line of consistency across all the sites.

The problem is, I believe I need a minimum of 90 years to have statistically significant data. In this case I went back and restarted my collection of sites based on a minimum of 90 years of data in which the temperatures had the “1O” or “1E” flags associated with the monthly values. This has drastically cut down on the available sites; however, except for certain areas like S. California or in the shadow of the industrial belts of the US, the data seems to have a fairly good overall quality.
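The flag filter itself is easy to script. A sketch assuming a simplified, hypothetical two-column “value flag” layout rather than the real USHCN record format:

```shell
# One line per month: value then flag (layout is illustrative only).
printf '12.3 1O\n13.1 2X\n11.8 1E\n9.9 3A\n' > monthly.txt

# Keep only the values whose flag is 1O or 1E, as described above.
awk '$2 == "1O" || $2 == "1E" { print $1 }' monthly.txt
# -> 12.3
# -> 11.8
```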

I am envious of your tools, as they appear to make spectacular graphics. However, with my old copy of Office and my Fathom 2 statistical analysis tool I have found that I can get some pretty convincing data from the ushcn (formerly ghcn) site. I would be curious what your graphics would look like if we applied a little filtering so that the data quality was a little more homogeneous, and tracked not only the Tave but also the Tmin and Tmax…

Looks pretty warm in the Southern Rockies. How does this jibe with the (blithe, unsupported, non-paper-written, lacking mathematics over the parameter space) assertion that Steve periodically makes (blithely, or did I say that), that instrumental temps do not correlate with bristlecone pines?

The southeast stations showing little or no warming intrigued me. Referencing the CRN rating guide that Anthony uncovered, I suspect that very few southeast stations would achieve a rating of class 1 or 2, as there are tall trees everywhere here, and very few locations would meet the 5-degrees-to-the-horizon criterion required for class 2, let alone 3 degrees for class 1. Having excessive shade might explain a lack of warming, or even cooling. 50 or 100 years ago, some stations might have been in areas that were clear-cut. If you clear-cut in the Southeast and wait 30 years, you get a wooded area with skinny but very tall trees. If you aren’t actively clearing it, the trees are going to grow. You can see a lot of shade on the Georgia sites on Anthony’s surfacestations.org, including pictures that appear to be midday.

I have looked at regional spatial data all my career and I agree with your statement that the variability shown in your maps reflects other than high quality. The experienced eye is offended.

Your readers are being kind and apologetic in volunteering causes such as humidity and climate cycles (though these might have some effect in the final analysis). The main problem seems to be quality. For example, the change to red shades on adjusted 1980-2002 (your Map 3 on #37) arouses immediate suspicion that the adjustments changed the colours, not humidity or human settlement.

Couple of questions. Why use anomaly maps (see #44 Steve Mosher)? Is there not a better method of normalisation so that the starting point shows no anomaly? Best to start with a white map Number One?

Next, is the gridding/contouring package you use of a standard that your ore grade estimation geological friends would accept? There are geology packs that are pretty sophisticated. (Not being critical, I just don’t know the answer).

Observation. I have seen many maps of the USA 48 over the decades, for factors like the frequency of fatal road accidents, the age of children on leaving school, income levels…… Almost always there is that same vertical divide, as if the USA East was fundamentally different to the West. I can offer no more than the observation. (It’s like an ore body straddling two rather different rock types. The frequency of sampling can require more density in one rock type than the other).

Repeat. One of the major problems of interpretation remains the definition of stations unaffected versus affected by UHI and their subsequent use in adjustment. You can’t subtract a UHI station from a UHI station and expect the difference to show meaning. I have done some initial work locally and suspect here in Melbourne UHI started about 1920. Comparison stations within 20 km were heating 30 years later. Today they might have to be all considered affected. Geoff.

Has suburban sprawl been accounted for? In the years that stations have existed, many cities have grown suburbs surrounding them. In response to #4, Michael Jankowski says, “…I look at the triangle of Dallas/Ft Worth, Houston, and San Antonio (with Austin in there, too), and see cooling.” In areas which are now suburbs of those cities, the geography prior to suburbanization was that of southwestern desert. In recent years, large tracts of former desert have been developed as detached residences — most with yards that are vegetated (trees, lawns, plantings, etc.) Vegetation is known to moderate climate by increasing water vapor in the atmosphere (resulting in rain) and absorbing short-wave radiation (resulting in cooler surface temperatures).

What the heck happened to Philadelphia’s large 1980-2002 cooling anomaly in the raw and tobs data? In the final adjusted version, it shows warming! Maybe the station was in the middle of W.C. Fields. :-)

I’ve only just come across this thread, and have read it with great interest. Steve’s original message about contours of trends, and his impressive plots, together with the R code, and the comments they have generated are intriguing to say the least, but I must put in a plea for some further thoughts on the subject of trends in general.

As I understand it, “trend” is short for “linear trend”. In other words, a computed trend is simply a linear least squares fit to a time series over the period of interest. One hypothesises a linear model and ascertains whether the hypothesis is sustainable. I have some experience with climate data of various types, especially station data, and it is my constant and overwhelming experience that they do not change with time in an orderly fashion – which is reasonable enough given that they are generated by an assembly of chaotic systems. My work has convinced me that although station data may be remarkably linear wrt time over widely varying scales, they are liable to exhibit step-function changes at seemingly random intervals. This is remarkably simple to demonstrate graphically – GIFs can be provided if anyone is interested. Last night I looked at the Brenham data (stored in my download of GHCN data from about 2003). I have also looked at Lampasas. Both these sites have very clear regime changes, for example Lampasas in March 1957, magnitude approximately -1.1 C, and again just before the data end, in April 1998, magnitude about 1.5 C.
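For concreteness, the “trend” in question is just the ordinary least-squares slope against a time index, which takes only a few lines to compute from first principles (series made up):

```shell
# OLS slope of a series y_1..y_n against the time index x = 1..n:
# slope = (n*sum(xy) - sum(x)*sum(y)) / (n*sum(x^2) - sum(x)^2)
printf '10.1\n10.3\n10.2\n10.6\n10.5\n' |
awk '{ n++; sx += n; sy += $1; sxy += n * $1; sxx += n * n }
     END { printf "%.3f\n", (n * sxy - sx * sy) / (n * sxx - sx * sx) }'
# -> 0.110
```

A single step change anywhere in the series feeds straight into this slope, which is exactly why fitting one line across an uninspected record can mislead.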

I’m happy to post the evidence as a few GIFs, if anyone asks.
OK, the data may not be wholly reliable, for observational reasons that continue (rightly) to exercise many contributors. I believe that this type of analysis may make it simpler to identify step changes that might be associated with instrumental or location changes.

Meanwhile, I counsel against basing too many inferences on data whose “trends” computed over an arbitrarily chosen period might simply be artefacts of observational misdeeds. I think it might be safer to fit linear trends only over periods where the data do not contain step changes. The GIFs I’ve generated will show fairly conclusively where these periods are.

One Trackback

[…] venues. In this case, I’ll take partial credit for initiating this particular topic as, in a post on June 11, 2007, I observed that the Tucson – University of Arizona station ranked #1 out of all 1221 USHCN […]