The Accidental Tourist

Occasionally I will take a trip after much careful planning and preparation, only to find myself going off into uncharted territory soon after embarking on my adventure. That is what happened to me recently when I started to take a fresh look at worldwide station coverage. Where I ended up and what I found when I got there was incredibly surprising.

It all began last week when GISS released their global mean summary for April, 2008. Following this release I went to view their global maps to get an idea as to where the “hot” and “cold” spots were last month. I viewed the data using both a 1200km and a 250km smoothing radius. Doing so helped me gauge the station coverage and the extent to which the 1200km smoothing algorithm estimates temperatures over the vast unsampled swaths of the planet.
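To make the idea of a smoothing radius concrete, here is a minimal sketch of a coverage check: a map point counts as "covered" if at least one station lies within the chosen radius. This is only an illustration, not GISS's gridding code; the function names and the coordinates are made up for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_covered(point, stations, radius_km):
    """True if any station lies within radius_km of the point."""
    lat, lon = point
    return any(
        great_circle_km(lat, lon, s_lat, s_lon) <= radius_km
        for s_lat, s_lon in stations
    )


# Hypothetical single station and a map point several hundred km away.
stations = [(55.5, 65.4)]
point = (60.0, 75.0)

covered_250 = is_covered(point, stations, 250)    # a "hole" at 250 km
covered_1200 = is_covered(point, stations, 1200)  # filled in at 1200 km
```

With these made-up coordinates the point is a hole on the 250km map but is filled in on the 1200km map, which is the kind of difference the two plots expose.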

It occurred to me that it would be interesting to compare April 2008 with April 1978 using a 250km smoothing radius. I was looking for “holes” in 2008 station coverage not present in 1978. I selected 1978 for two reasons. One was that the worldwide station coverage was near its peak that year. The second reason was that 1978 fell in the 1951-1980 30-year base period for calculating anomalies.
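For readers unfamiliar with anomalies, the following sketch shows the basic arithmetic: a month's anomaly is its temperature minus the average for the same calendar month over the base period. The function name and the temperatures are invented for illustration; the only detail taken from the text is the 1951-1980 base period.

```python
from statistics import mean


def monthly_anomaly(monthly, year, month, base=(1951, 1980)):
    """Anomaly of (year, month) relative to the base-period mean for that month.

    monthly maps (year, month) -> mean temperature in deg C.
    """
    base_values = [
        temp
        for (y, m), temp in monthly.items()
        if m == month and base[0] <= y <= base[1]
    ]
    if not base_values:
        raise ValueError("no base-period data for this month")
    return monthly[(year, month)] - mean(base_values)


# Made-up April temperatures: 4.0 C in every base-period year, 6.5 C in 2008.
monthly = {(y, 4): 4.0 for y in range(1951, 1981)}
monthly[(2008, 4)] = 6.5

anomaly_2008 = monthly_anomaly(monthly, 2008, 4)
anomaly_1978 = monthly_anomaly(monthly, 1978, 4)
```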

My thought was to identify multiple stations within a hole that were still reporting data today but were not being captured by GHCN. I wanted to see if the data from those stations supported the anomaly estimated by the 1200km smoothing. The 250km smoothed plots would be ideal for visually identifying holes. Here are the plots for April 1978 and April 2008:

There were lots of holes to choose from: Russia, China, Australia, Canada, Africa, and South America. I decided to start with Russia as I already knew where to look for recent temperature data from “discontinued” GHCN sites: meteo.ru. But first, I had to locate some stations to examine.

Looking at the April 2008 plot, the hole to the northeast of the Caspian Sea seemed like a good place to start. I went this time to the station data page at GISS and simply clicked my mouse on the map to the northeast of the Caspian Sea. GISS gave me a list of stations – sorted by increasing distance from where I clicked. At the top of the list was Kurgan, so I decided to go there first.

Wikipedia says Kurgan “is the administrative center of Kurgan Oblast, Russia; one of the oldest cities in Siberia.” The view from Google Earth indicates it is pretty remote as well, but apparently has a population of 310,000 (according to the GISS data page).

GHCN records for Kurgan extend from November 1893 to April 1990. These are actually composed of three scribal records: (0) November 1893 to December 1989, (1) May 1929 to December 1989, (2) January 1931 to April 1990. Because I grabbed the data from the GISS website I will refer to the records as GISS.0, GISS.1, and GISS.2 respectively. Remember, however, that GISS takes the data from GHCN.

I was hoping that the Meteo record for Kurgan would match one of the three GISS records. What I had forgotten was that the Meteo records were of daily readings rather than monthly averages. This meant I was going to have to calculate monthly averages for Meteo before I compared it with the GISS records. It is at this point my journey took an unexpected turn.

The Meteo records have three daily temperature records: Min, Mid, and Max. The Mid value is described simply as “Daily air temperature”. I have not been able to find out when that value is recorded each day or how it is otherwise calculated. However, one thing is certain: Mid does not represent the average of Min and Max. In fact, many of the early records only include Min and Mid. In the Meteo record for Kurgan, Mid records are available from November 1893 to December 2005. Following is a plot of that record:

I calculated the monthly averages using the Mid values in the Meteo record. I then compared this monthly record with GISS.0 and found they very closely match in the months that overlap. The values for just nine months differ by 0.1, likely due to rounding differences. Another eleven monthly records not present in the GISS record were present in the Meteo record. I went back to the Meteo record and found that in ten of those months, one or two days were flagged as having a quality issue. The quality issue turned out to be a Mid value that was lower than the Min value, so in the case of GISS.0, the entire month’s worth of data was discarded when just one or two data points were suspect. Interestingly, the GISS algorithm later creates an estimate for the missing month when calculating the annual average!
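The comparison described above can be sketched roughly as follows: average the daily values by month, round to 0.1, and classify each month as an exact match, a 0.1 difference, a larger difference, or missing from the published record. The data layout, function names, and numbers are assumptions for illustration and not the actual Meteo or GISS file formats.

```python
from collections import defaultdict


def monthly_means(daily):
    """daily: iterable of ((year, month, day), temp_c).

    Returns {(year, month): mean rounded to 0.1 deg C}.
    """
    buckets = defaultdict(list)
    for (year, month, _day), temp in daily:
        buckets[(year, month)].append(temp)
    return {key: round(sum(v) / len(v), 1) for key, v in buckets.items()}


def compare(derived, published, tol=0.1):
    """Classify each derived month against the published monthly record."""
    report = {"match": [], "close": [], "differ": [], "missing_from_published": []}
    for key, value in sorted(derived.items()):
        if key not in published:
            report["missing_from_published"].append(key)
            continue
        diff = abs(value - published[key])
        if diff < 1e-9:
            report["match"].append(key)
        elif diff <= tol + 1e-9:  # small slack for floating-point noise
            report["close"].append(key)
        else:
            report["differ"].append(key)
    return report


# Made-up daily values for three months and a made-up published record.
daily = [
    ((1950, 1, 1), -20.0), ((1950, 1, 2), -18.0), ((1950, 1, 3), -19.0),
    ((1950, 2, 1), -15.0), ((1950, 2, 2), -14.0),
    ((1950, 3, 1), -5.0), ((1950, 3, 2), -6.0),
]
published = {(1950, 1): -19.0, (1950, 2): -14.4}

report = compare(monthly_means(daily), published)
```

In this toy example January matches exactly, February differs by 0.1 (the sort of gap rounding could produce), and March is present in the derived record but absent from the published one.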

With the exception of June 1967 (which is missing from the GISS record) and the fact that the GISS record ends in December 1989, I was able to use the Meteo Mid data to reproduce GISS.0 for Kurgan.

Max data values begin appearing in the Meteo record May 1, 1929. I happened to notice that GISS.1 also begins with May 1929. On a whim, I decided to calculate the monthly averages using the daily averages in the Meteo record when both the Min and Max values were present. To my surprise, this variant of the Meteo record matched the GISS.1 record!
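Here is a small sketch of the two ways of building a monthly value from the same daily record: one from the Mid values, the other from the average of Min and Max on days where both are present. The tuple layout and the temperatures are invented for the example.

```python
def mid_monthly_mean(days):
    """Monthly mean of the Mid values; days is a list of (tmin, tmid, tmax)."""
    mids = [tmid for _tmin, tmid, _tmax in days if tmid is not None]
    return round(sum(mids) / len(mids), 1) if mids else None


def minmax_monthly_mean(days):
    """Monthly mean of (Min + Max) / 2, using only days that have both values."""
    daily = [
        (tmin + tmax) / 2
        for tmin, _tmid, tmax in days
        if tmin is not None and tmax is not None
    ]
    return round(sum(daily) / len(daily), 1) if daily else None


# Made-up days; the third has no Max, so it only contributes to the Mid mean.
days = [
    (-10.0, -6.0, 0.0),
    (-12.0, -8.0, -2.0),
    (-8.0, -5.0, None),
]

mid_value = mid_monthly_mean(days)        # mean of -6, -8, -5
minmax_value = minmax_monthly_mean(days)  # mean of -5 and -7
```

The two results need not agree, both because Mid is not the Min/Max average and because the two calculations may use different sets of days.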

At this point I have not been able to determine whether or not GISS.2 is also derived from the same record, but it is likely that it is not. Clearly, however, GISS.0 and GISS.1 are derived from the same record. If you recall, the GISS algorithms will combine the two derived records using the “bias method”, which assumes that one record is biased warmer or cooler than the other record. Here is a plot of the difference between the Meteo record calculated using Mid values and the Meteo record calculated using the average of Min and Max. Can you determine the relative bias?

Update 5/21/08:

There are several points to be made here:

GISS (from GHCN) ultimately uses the Meteo record twice. In GISS.0 they use the “Mid” values from the record. In GISS.1 they use the average of Min/Max where possible. Those two variations of the same record are then combined with a third record GISS.2 whose origin is unknown to me at this time.

The bias method is used to combine GISS.0 with GISS.1 (and GISS.2). The bias method assumes that one record is running warmer or cooler than the other, and adjusts one of them accordingly. In the case of the Meteo record, Mid is cooler than the average of Min/Max most of the time, but not always, and not by a constant amount. The bias method is therefore an inappropriate way to combine these records.
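To illustrate why a constant-offset assumption is a poor fit here, the sketch below combines two records by shifting one by the mean difference over their overlap and then averaging. This is a deliberately simplified stand-in for the idea behind the bias method, not the GISS implementation; the function name and the numbers are hypothetical.

```python
from statistics import mean


def combine_with_constant_offset(rec_a, rec_b):
    """Shift rec_b by the mean (rec_a - rec_b) over the overlap, then average.

    Returns (offset, combined, residuals), where residuals are the differences
    that remain in the overlap after the constant shift.
    """
    overlap = sorted(set(rec_a) & set(rec_b))
    if not overlap:
        raise ValueError("records do not overlap")
    offset = mean(rec_a[k] - rec_b[k] for k in overlap)
    shifted_b = {k: v + offset for k, v in rec_b.items()}
    combined = {}
    for k in sorted(set(rec_a) | set(shifted_b)):
        vals = [rec[k] for rec in (rec_a, shifted_b) if k in rec]
        combined[k] = mean(vals)
    residuals = {k: rec_a[k] - shifted_b[k] for k in overlap}
    return offset, combined, residuals


# Made-up monthly values keyed by month index: a "Mid" style record and a
# "(Min+Max)/2" style record whose difference is not constant.
mid_record = {1: -19.0, 2: -15.0, 3: -6.0, 4: 4.0}
minmax_record = {2: -14.0, 3: -6.0, 4: 6.0}

offset, combined, residuals = combine_with_constant_offset(mid_record, minmax_record)
```

Here the estimated offset is -1.0, but the residuals after shifting are 0.0, 1.0 and -1.0: a single constant cannot remove a difference that varies from month to month.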

GHCN throws out an entire month’s worth of data when just one or two days are suspect. This is done rather than estimating the suspect days. In doing so, GHCN has left it to GISS to come back later and estimate the temperature for the entire missing month.
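The difference between the two policies can be sketched as follows. Skipping the suspect days is just one simple alternative to dropping the month (the text mentions estimating them, which would be another); the policy names and the data are made up for illustration.

```python
def month_mean(days, policy):
    """days: list of (temp_c, suspect). policy selects how suspect days are handled.

    'drop_month'        -> return None if any day in the month is suspect
    'skip_suspect_days' -> average only the days that are not suspect
    """
    if policy == "drop_month":
        if any(suspect for _temp, suspect in days):
            return None
        good = [temp for temp, _suspect in days]
    elif policy == "skip_suspect_days":
        good = [temp for temp, suspect in days if not suspect]
    else:
        raise ValueError(f"unknown policy: {policy}")
    return round(sum(good) / len(good), 1) if good else None


# A made-up 30-day month: 28 ordinary days and two flagged readings.
days = [(2.0, False)] * 28 + [(-9.0, True)] * 2

dropped = month_mean(days, "drop_month")
kept = month_mean(days, "skip_suspect_days")
```

Under the first policy the whole month disappears; under the second, the 28 good days still yield a monthly value.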

The point is that, for Kurgan, GISS appears to use (min+max)/2 for a mid-value, whereas Meteo does not use this method to compute a mid-value. Which raises the question: what index does Meteo use, and does it better represent central tendency? (I know some folks use a sine curve interpolation, where the min and max are expected to occur at 6am and 2pm respectively, with midpoints at 10am and 10pm.) I expect that this is the first entry in a diary of entries. In which case, stay tuned.

The data values (the difference between the Meteo record calculated using Mid values and the Meteo record calculated using the average of Min and Max) are largely negative, meaning the latter is larger than the former. In other words, GISS chooses a method that is warm-biased compared to Meteo.

The lack of coverage in Canada post-1980 reflects the gradual closure of the DEW Line, the Mid-Canada Line and the Pine Tree Line of radar stations manned by the Royal Canadian Air Force/Canadian Forces. Any sites remaining now are unmanned and remotely operated. The last to go were Holberg, BC, Sydney, NS and Barrington, NS in 1990.

I don’t get it. I can see that GISS shouldn’t use “GISS.0” (the “Mid” record) for Kurgan, but is there some larger point? Even if the error of using the Mid series turned out to be common in GISS’s treatment of the Russian data, it wouldn’t affect the trend, would it?

BTW, we don’t know if the Mid value is an observation (temp at a specific time) or is otherwise calculated.

What’s the story with the collapse of coverage between 1978 and 2008? I can understand that political events might have influenced reporting (collapse of the USSR, civil wars in Africa, closure of Cold War radar stations, etc.). But what about a country like Australia, with meteo records probably as good as any in the world?

#7 – I was there when we closed one of ’em down. I had found a map (on ‘Watts Up With That’ I think), which I can’t locate right now but I’ll keep looking, where the sites were colour coded as inactive. Maybe someone else is better at searching than I am? It’s certainly a more logical explanation of the gaps than conspiracy theories.

Lack of Canadian data is definitely not a DEW line decommissioning issue. Some sites went down in the late 80s or early 90s (Pelly Bay, Jenny Lind, Bernard Harbour, etc.) but many of the DEW line stations actually only report climate data for a short period from 1959 to 1963 (Cape Peel, Ross Point, etc.). There are still plenty of long-term stations in Nunavut, including some (Fox Five, Cambridge Bay) that were DEW line sites and are still active Environment Canada climate stations.

#10 That is an anecdote, of isolated local importance only. Look at the size and extent of the data gap. Audit means understanding exactly how GISS is getting their temperature data from those vacant jurisdictions. #11 and #12 prove the data exist to prevent a gap of that size occurring in Canada.

There are lots of GISS stations in Canada that were dropped in 1989 but are available on the Environment Canada website from 89 to the last hourly reading. Don’t worry, Gavin will figure out how to get information off the web in another decade or two.

#14 So why are there a few blips of data from northwestern and northeastern Canada? If it were a strict 1989 cutoff (a fact that has been discussed before at CA), then there should be NO data for Canada. Now, what about the countries of Africa? That’s quite the gap.

The other point is that with two semi-continental-scale data gaps there is too high a probability that the two gaps happen to be blue (as they were in April 1978). If so, correcting that would bring the “global” anomaly down. Whether red or blue, the gaps need filling.

The plan is to reproduce the GISS – to show that the GISS was calculated using such and such a method, whereupon it is possible to identify defects in the method, to determine if, for example, it is more inclined to leave out weather stations that show cooling than weather stations that show warming.

It is difficult, probably impossible, to reliably determine global temperatures from weather stations, even to tell if the earth was getting warmer or cooler in the latter part of the twentieth century, for reasons that have been discussed many times. It is, however, possible to determine how robust temperature determinations are. To determine, for example, whether different equally reasonable approaches give very different results, and whether the approach used by the GISS – such as leaving out large parts of the earth – is reasonable.

About annually, the Bureau of Meteorology has produced a CD on which there are some 1200 stations, some going back to about 1860, many now closed. There is max min mean temp as well as some metadata, a reliability ranking, precipitation etc. The only problem is that it costs about $100 to buy the CD and if a researcher had to do that all around the world every year he or she would need to be full time serious.

In the last couple of years, there has emerged a high quality network of about 100 plus stations, many at small airports and many at lighthouses. I suspect that these are among the ones on the 2008 map you show, but there are still many other high quality stations missing from your map (eyeballing here). I cannot imagine a reason for these gaps other than a lag in updating data. I can count only about 17 grid cell clusters and I know that some clusters are not on your map.

Apart from the High Quality network, there are many hundreds more sites available on the latest CD. I do not have one so I do not know the latest date. It is subject to copyright so I cannot make one for you. Maybe if you asked the BoM for a complimentary copy for community use by C.A. regulars they’d donate one. Or maybe they would not.

I do not know the agreement under which the BoM provides data to NASA, nor the content of what they provide, nor how “adjusted” it is. My instinct is to gift you the latest CD, but then principles cut in. I’ve paid a lot of tax to help generate these CDs and in retirement I don’t feel like double dipping.

Speculating on Kurgan in Russia (and not coming from a city where it snows) I see the “Mid” temp of about 2 deg C. From the little I know, when snow falls on a city the inhabitants do their best to minimise its area. Does this create another form of UHI through an albedo effect, increasing in time as the population has increased while heating and salt and snow plows have converted snow cover to asphalt surface?

#17 James “It is difficult, probably impossible, to reliably determine global temperatures from weather stations, even to tell if the earth was getting warmer or cooler in the latter part of the twentieth century, for reasons that have been discussed many times”

Regardless of the Meteo data for Kurgan, the real issue here is the lack of coverage in the GISS database now.

What is the point of putting out temperature maps and global average temperatures in such a public way when you are covering less than half the planet? Why put in all that effort and use up all those tax-payer resources when you are just going to do such a sloppy job?

To me, it is not laziness or some problem with the data which forces GISS into these shortcuts. It is because this method produces a greater global warming signal, simple as that – bias.

I’ve discussed the station closure issue in the past. While there may be some station closures, this is COMPLETELY irrelevant to the decline in station numbers in GHCN and thus GISS. GHCN simply fails to collect data that is available online. See previous posts on Dawson in Canada, Cobija in Bolivia, Wellington in NZ etc etc.

If “high” is recorded in the afternoon and “low” is recorded in the evening (or something like that)…then “mid” as mid-day would make a lot of sense, including being mostly but not always between the “high” and “low” values.

Does this create another form of UHI through an albedo effect, increasing in time as the population has increased while heating and salt and snow plows have converted snow cover to asphalt surface?

As many know, I’ve been thinking about all the concrete and asphalt over large areas of the NH and the rise in population since 1815 from 1 to 7 billion. (And for those who want to point out that cities cover “little area”, I invite you here or here, and then to think of all the farms and roads and freeways between the two.)

So besides altering the ground’s behavior regarding water or sunlight in summer, the more people clear the snow off of it in the winter, too, changing the behavior year round. Makes perfect sense.

Further to Bender’s point … even with DEW line closures, you’d think that with the pace of development in the far north, new replacement stations are available (albeit, likely suffering from what ails many stations noted by Watts et al). So, I’ll go with GISS sloth.

I just received some new photos from the Environment Canada climate station at Rankin Inlet, Nunavut. I’d visited it last summer, and noted that it’s located on a ~2 m thick gravel pad (typical for northern Environment Canada stations), with ground cover characteristics quite different from the adjacent tundra. In late April this year, with plenty of snow cover in areas beside the station, there were bare patches on the pad and it was obvious that they clear snow from the pad area (15-20 metres square).

As soon as the snow disappears in the springtime, that gravel pad would soak up the sun’s heat in a big way, and release it after dark. I’m presently trying to track down some historical information on some of those northern sites to better understand what the changes have been over the years.

The Meteo records have three daily temperature records: Min, Mid, and Max. The Mid value is described simply as “Daily air temperature”. I have not been able to find out when that value is recorded each day or how it is otherwise calculated. However, one thing is certain: Mid does not represent the average of Min and Max. In fact, many of the early records only include Min and Mid. In the Meteo record for Kurgan, Mid records are available from November 1893 to December 2005.

To make the Kurgan data sources more complicated, there’s another dataset downloadable from the Kurgan page of Russia’s Weather Server. The data there are from 26.11.1998 to the present day and there are four daily values:

You can have either daily max and min temperatures or measurements every 3 hours. Incidentally most of the stations are rural, lighthouses or small military airfields, so UHI effects should be minimal.

#29 – I surrender! There were a lot of stations closed, but omission of sites and bad data adjusted and manipulated behind closed doors trumps a few closures for effect. Back to lurking and licking my wounds.

Neil #36 Excellent point, one sees it whenever it snows, the roads take longer to stick and they also take longer to melt than the grass or dirt (or whatever) according to their heat content (length of time in the sun or covered with ice, and what the material is). Next time it snows if it snows where you are, take a look at roads, sidewalks, gravel patches, grass, flowerbeds, mulched areas, plain dirt.

Re Canadian data.
Here is the disclaimer at the start of printed data for Canada titled “Canadian Climate Normals 1951-1980”, published by Environment Canada.

“No hourly data exists in the digital archive before 1953, the averages appearing in this volume have been derived from all available ‘hourly’ observations, at the selected hours, for the period 1953 to 1980, inclusive. The reader should note that many stations have fewer than the 28 years of record in the complete averaging.”

I note that another one of the gaps is the North Island of New Zealand. Maybe after you found it for them, they have lost Wellington again. It also appears that the South Island is still getting hotter despite Hokitika not showing that trend. It would be good to know just what adjustments they are doing.

Jae #48 Yes, as I’ve been saying, land use is not just UHI. It’s UHI’s effect upon heat absorption, weather patterns and the like, coupled with farms and roads (given my “look at Los Angeles, look at Chicago, look at the land between them”) and what happens to all the waste heat from the use of energy: cars, air conditioners, heaters. And all that goes someplace; it is the spheres after all. You know, teleconnections.

(42) JohnG
The WMO No. of Kurgan on Russia’s Weather is the same as that given in other databases.

According to the server,

Weather Archive is being developed and maintained by
Satellite Monitoring Technologies Department of Space Research Institute, with support from Russian Foundation for Basic Research (project 01-07-90172)

This is something like Goddard Inst. at NASA, isn’t it?
Whereas meteo.ru is run by RIHMI-WMO (Russian Hydrometeorological Institute)
So there seem to be two nonidentical datasets for the same station from two quite official sources…strange…

Given the above and the many, many previous re-adjustments (including another one yesterday on March 08) and problems, maybe the only solution to dealing with GISTEMP is this verbatim quote taken from Anthony Watts’ site (I hope this is OK; if not, please remove this posting).

The site says he does not “bother plotting GISS global temperature anomaly data anymore” because

“I simply don’t trust the representivity of the GISS dataset. Since GISS uses polar “interpolated” data which does not exist in any of the other three datasets, I see GISS as an outlier and not truly representative of global temperature. For example, GISS interpolates from the closest high latitude surface stations and adds that data for the very high latitude arctic where no weather stations exist. None of the other 3 data sets use interpolation to create data for the arctic where no measurements exist. Hence, I have a greater degree of trust for RSS, UAH, and HadCRUT global temperature anomaly data.
……
“And there’s more, see a post called “The Accidental Tourist” about GISS data problems, from John Goetz (originally posted on Climate Audit) which I’ve reposted as the next item down.”

Since GISTEMP seems to be consistently way off from all other measurements, as a scientist I would reject it as an “outlier” and would not include it in any statistical analysis.

Coverage in Australia was mentioned in an earlier comment. Strangely, the April 2008 map seems to have no square overlapping Melbourne or Brisbane, two major cities. I can’t imagine why that would be the case, other than they haven’t acquired the data yet for some reason.