TOBS

Its true that TOB is pretty far from the topic of this thread, so perhaps our host could start a new one just on TOB, ideally copying into it the pertinent posts from this thread? I may have missed a few, but a good start would be #305, 376, 400, 402, 403, 413, 418, 419, 420, 424, 455, 458, 460, 462, 464, 468, 484, 488, and 493. )


Short version: stations report two measurements per day, TMAX and TMIN.
That is, the maximum and the minimum seen in the last 24 hours.

The TIME of this observation is supposed to be midnight. But some people
did their observing at noon, and others at 1800, and others at 0700.

Guess what? It introduces a bias.

No. It only seems like it introduces a bias, because it does introduce a bias on a day-by-day basis. It isn’t the case that TOBS biases the trend. Consequently, it should be quite alarming to most people that the TOBS correction does bias the trend.

TOBS only matters if you want to make statements like: this is the warmest July 17th in 100 years. If you are interested in statements such as x degrees/century, the method used to correct TOBS is wrong wrong wrong.

“TOB is not an issue for me, I think its probably pretty close.
My question about TOB was if it was a part of the data for certain years,
something JerryB answered in 391.”

Time of Observation adjustments are made when the guy reading the thermometer changes
his time of observation. So, it depends upon when the changes were made, and what
change was made. I’ll link JerryB’s description. It is very lucid and added 6.23 points
to my IQ: http://www.john-daly.com/tob/TOBSUM.HTM

“Im not convinced on outliers, do you have a link for the tables?
I would like to get a better idea of it.
It will probably affect CRN5 depending on the amount of the mean.”

Ok, one class that I wish they had offered in high school when I was young
was probability and statistics, because those branches allow you to
DO STUFF. Cool stuff. Win at poker and blackjack.

If you flip a fair coin 100 times, do you believe that you will
get 100 heads and no tails? What are the chances of this happening?
If you DO get 100 heads in a row, do you think the coin is fair?
or do you think the coin is biased?

Essentially, they throw out data points (like 1 month or 2 months) when
those data points are RARE events, like flipping heads 10 times in a row.
It is not a class 5 issue. Microsite bias is not a spike. If it were a spike
it would be detected. It is a sneaky bias. It creeps up.

The chances that a measurement (a temperature recording) is more than 3.49 sigma
above the mean are about 2 in 10,000. The chances that it is more than 3.49 sigma
below the mean are likewise about 2 in 10,000.

If I deal you 5 cards, the chances that you get four of a kind are about 2-3 in 10,000.
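Both odds are easy to check with the standard library; a quick sketch:

```python
import math
from math import comb

# One-sided tail probability of a normal variate beyond 3.49 sigma
p_tail = 0.5 * math.erfc(3.49 / math.sqrt(2))

# Four of a kind in a 5-card deal: 13 ranks times 48 possible fifth cards
p_quads = 13 * 48 / comb(52, 5)

print(round(p_tail * 10000, 1))   # 2.4 -- about 2 in 10,000
print(round(p_quads * 10000, 1))  # 2.4 -- about 2 in 10,000
```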

So, when they look through the records they are throwing out extremely rare EVENTS.

Make sense? They are not throwing out big chunks of data. They toss the few cases that
seem odd or rare. It doesn’t affect the mean when you have 1500 monthly data points.

Here is an example: you have 1000 months of data. The temperature is 0C for 999 months.
One month in the middle of this has a recording of 100C. Is the average
100/1000 = 0.1C, or is the 100C an outlier, a mistake?
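A two-line check of that example (note the contaminated mean works out to 0.1C):

```python
# 999 months at 0 C plus one spurious month at 100 C
data = [0.0] * 999 + [100.0]

mean_with_outlier = sum(data) / len(data)    # 0.1 C
mean_without = sum(sorted(data)[:-1]) / 999  # 0.0 C after tossing the outlier
print(mean_with_outlier, mean_without)       # 0.1 0.0
```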

“I doubt it would affect CRN1,2 sites ”

Outliers typically result from things unrelated to class 1/2 or 5. The guy writes down 5C
rather than -5C. Or the thermometer breaks. CRN 1/2 describes the physical character of the site.
An observer can make a reading error there just as easily as at a CRN5 site. See? For example,
Orland is a great site. To get the temperature, a guy has to walk out and write down the MIN
and MAX. Suppose he sees 62F and writes down 26F? THAT has nothing to do with asphalt. When
he writes down 26 and sends that data in, the “outlier detector” says: oops, I think
our observer made a mistake. Now usually (Hansen does this) they CHECK other stations in the area.
HEY, sometimes you get 4 of a kind.

“and I thought Johns last post on CRN1,2 vs GISS was very compelling (with one question below)
but Im not as convinced with his CRN 5 analysis.
I do not agree that the temperature spikes will be just 1 to 5 % of the time,
its probably closer to the percentage of sunny days for pavement and summer days for AC and etc,
of course that will also depend on geography because here in the Northeast the pavement is
usually covered by snow in the winter.”

Well, John needs to build an expectation (a model of what he will see) in order to do a PROPER
test. I’ll do it using my expected-value approach, because it’s probably easier to understand,
though less elegant than his approach.

A class 5 site can see a micro site bias of 5C.

1. This bias will apply to TMIN. The bias in TMEAN will be 2.5C
2. This bias will have the same seasonality as UHI. Summer is the worst. 25% of the year.
3. Now we are at .6C, simply because the microsite bias works at night in the summer.
4. Cloudy days cut the effect
5. Rainy days cut the effect
6. Windy days cut the effect.
7. Guess at cloud/wind/rain modulating half of the effect. .3C

Now, this .3C doesn’t happen instantaneously in all cases; it may develop over time, slowly.
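The chain of discounts above, spelled out. The 5C starting value and the 25%/50% factors are the guesses from the list, not measured values:

```python
bias_tmin = 5.0                     # assumed worst-case microsite bias on TMIN (C)
bias_tmean = bias_tmin / 2          # only TMIN is affected, so TMEAN gets half
bias_seasonal = bias_tmean * 0.25   # effect concentrated in summer, ~25% of the year
bias_weather = bias_seasonal * 0.5  # clouds/wind/rain assumed to modulate half of it
print(bias_weather)                 # 0.3125 -- roughly the .3C in the text
```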

Anthony, I think I have a proposal for the pre-1900 records. Can you tell when
a station had the CRS installed? Then I think we would have a historical fact
to base the exclusion on. This might be different for different stations. The early
record issue is exposure. CRS solves that, so perhaps excising pre-CRS records is a
finer-grained approach; you might have 1902 for some sites and 1904 for others.
That gives us a solid, defensible case.

Why introduce another bias, if it is only 1 day or so a month? Even if it happened every time, you would still have 15 good data points. Averaging to monthly and then to yearly would make the bias small. Also, since the bias can run either warm or cool, isn’t it a good assumption to assume i.i.d.?

Dangerous assumption. Why not assume that NASA GISS shows lower temperatures for CRN5 (urban heat island) than CRN12 (see your own graph), and that therefore the NASA GISS files are FUBAR? Sorry, that would mean saying bad things about NASA GISS, and you definitely aren’t going to do that, are you? By the way, glad you got round to naming the databases.

MarkR, I was comparing CRN12R to CRN5 using my own analysis. It was not related to GISTEMP.

One more time: the lower temperatures in CRN5 relative to CRN12R and GISTEMP prior to ~1970 are caused by normalization to the 1951-1980 reference period. CRN5 shows ~0.35C more warming than CRN12R and ~0.2C more warming than GISTEMP (likely to increase to ~0.4C after TOBS correction). The trends for the key periods (bar graph) clearly show more warming (less cooling) in CRN5 than either CRN12R or GISTEMP, particularly from 1935 to 1975.
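The baseline effect is mechanical: subtracting each series’ own 1951-1980 mean pins the series together in that window, so the series that warmed most necessarily sits lowest before it. A sketch with made-up numbers (not the actual CRN or GISTEMP values):

```python
def anomalies(temps, years, base=(1951, 1980)):
    """Convert absolute temps to anomalies relative to a reference period."""
    ref = [t for t, y in zip(temps, years) if base[0] <= y <= base[1]]
    ref_mean = sum(ref) / len(ref)
    return [round(t - ref_mean, 2) for t in temps]

years = [1930, 1950, 1960, 1970, 1990]
slow = [14.0, 14.1, 14.2, 14.3, 14.5]  # hypothetical slow-warming series
fast = [13.6, 13.9, 14.2, 14.5, 15.1]  # hypothetical fast-warming series

# The fast-warming series starts lower once both are normalized:
print(anomalies(slow, years))
print(anomalies(fast, years))
```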

I could say lots of bad things about GISTEMP, but many others are taking care of that. The fact is that GISTEMP agrees well with the temperature trends from the best stations (CRN12R).

I honestly continue to fail to see why TOBS adjustments are so important. Perhaps I’ve missed it, but follow with me for a moment.

No matter what time of day the observations are made, if they are consistent for a given station, the observer is always recording the high and low temp for the previous 24 hours. It might be noon-to-noon, midnight-to-midnight, or 6pm to 6pm. But it is a 24 hour high/low. For that station then, the trend recorded is a valid daily trend. Period. Might be shifted 6 hours from another station’s measures, but it is still a valid daily record!

Do we really need to align the different stations to be recording the exact same 24 hour sequences? I see no value in this, and much opportunity to make a mess of the data by attempting to “adjust” for TOB.

If the time of observation (and reset) is during usually relatively cool
morning hours, there will be a cool bias. If the time of observation
(and reset) is during usually relatively warm evening hours, there will
be a warm bias. If the time of observation (and reset) changes from one
to the other, the trend will be biased.

Let me repeat that: if the time of observation (and reset) changes from
one to the other, the trend will be biased.

I disagree. Suppose that every day in July the low is 50 at 2AM, the 7AM temp is 60, the high is 80 at 2PM, and the 5PM temp is 70. A 5PM recorder on July 15 will see a high of 80 and a low of 50, both from July 15. A 7AM recorder will see a high of 80 (from July 14), plus the July 15 low of 50. The fact that it is cooler at 7AM than 5PM is irrelevant, and creates no bias.

There will be small biases in monthly averages when temperatures are rising and falling: If the corresponding June temperatures are all 45/55/75/65, a July 1 7AM recorder will see the June 30 75 high instead of the July 1 80 high, and so the July average will be a little too low. However, there is no bias in the annual figures, since it works the other way when temperatures are falling.
The huge USHCN TOBS adjustment shown by John V (#376) is therefore highly suspect. I just read Karl, Williams, Young, and Wendland (J. Climate & Applied Met Feb. 86, online at http://ams.allenpress.com/perlserv/?request=get-document&doi=10.1175%2F1520-0450(1986)025%3C0145%3AAMTETT%3E2.0.CO%3B2 ), and can’t for the life of me figure out why TOBS should cause any perceptible shift in the annual averages. The graph at #376 shows a cumulative updrift of +.35 deg F from the 20’s to the 90’s, just from the TOBS adjustment. This can’t be right.
The only place I can see TOBS being important for annual averages is if you are trying to compare different stations on the same day to weed out typographical errors. If everyone else is reporting highs of 82 on July 15, but Joe is reporting 28, Joe is likely wrong. But in order to make this comparison, you have to know whether Joe was referring to July 14 or July 15.

I mentioned this once before, but the logical way to check TOBS is through a new analysis using CRN data which is hourly. Empirically calculate the effect of 5pm versus 12 am measurements for stations in the same area.
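That calculation is straightforward once you have hourly data. Below is a sketch with a synthetic series standing in for CRN hourly files; the diurnal shape and day-to-day scatter are invented, and with real data you would simply read the hourly values in instead:

```python
import math
import random

def hourly_temps(days, base=15.0, amp=8.0, anomaly_sd=2.0, seed=0):
    """Synthetic hourly series: diurnal cycle peaking at 2 PM plus an
    independent daily weather anomaly (stand-in for real CRN data)."""
    rng = random.Random(seed)
    temps = []
    for _ in range(days):
        offset = rng.gauss(0.0, anomaly_sd)
        for h in range(24):
            temps.append(base + offset + amp * math.cos(2 * math.pi * (h - 14) / 24))
    return temps

def mean_of_minmax(temps, obs_hour):
    """Mean of (max+min)/2 over successive 24 h windows reset at obs_hour."""
    means = []
    i = obs_hour
    while i + 24 <= len(temps):
        window = temps[i:i + 24]
        means.append((max(window) + min(window)) / 2)
        i += 24
    return sum(means) / len(means)

temps = hourly_temps(365)
# Compare a midnight observer with a 5 PM observer on the same hourly record
print(mean_of_minmax(temps, 0), mean_of_minmax(temps, 17))
```

Swapping the synthetic generator for actual CRN 5-minute or hourly files would give an empirical TOB estimate, station by station.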

“The observer has to report 85 because it was the temperature at the time of
observation the previous day.”

Yes, because that was the high for the 24 hour period ending at 6 PM on day 2.
It may bother the observer, and some NWS documentation suggests that it really
bothers some observers. However, when times of observation are other than
midnight, observations like that occur often, even without passing storms, and
they lead to TOB.

It does not depend on passing warm, or cold, fronts, or storms. It does
not depend on part of the last day of one month being treated as part of
the first day of the next month.

It depends on common, ordinary, differences of high temperatures from one
day to the next, and common, ordinary, differences of low temperatures
from one day to the next. When those ordinary differences relatively
frequently occur near the time of observation, they cause TOB.

Average estimated TOB of 190 locations at three times of observation
commonly used by COOP observers:

Suppose that every day in July the low is 50 at 2AM, the 7AM temp is 60, the
high is 80 at 2PM, and the 5PM temp is 70.

If max/min temps of successive days do not vary, TOB does not occur.

Actually, it occurs to me now that even with constant daily temperature patterns, TOB will occur, on the day of the conversion, and that this may be what Karl et al (see #458) are trying to quantify.

Suppose as above that the daily highs are always 80 at 2PM and the lows are always 50 at 2AM. If there is no change in procedure, PM observers will always report 80/50, as will AM observers. But if on July 15, say, a 5 PM observer switches to 7AM, since the thermometers (or MMTS) were reset the previous 5PM (when the temperature was 70), the reported high/low for July 15 will be 70/50 instead of 80/50, a big difference! The Karl et al diagrams start to make sense in terms of this sort of bias. Furthermore, for the month of the change, the immediate effect of the change (their TOB) is much bigger than the timing effect of attributing the last day of the month to the wrong month (the second term in their Drift Corrected TOB or DCTOB), as they note.
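Working through those numbers: from the 5 PM July 14 reset to the 7 AM July 15 reading, the warmest point the thermometer ever sees is the 70-degree reset temperature itself. The hourly interpolation below is invented; only the 2 AM, 7 AM, and 5 PM anchor values come from the example:

```python
# Temperatures from the 5 PM July 14 reset through the 7 AM July 15 reading,
# roughly interpolated between the example's anchors (70 at 5 PM, 50 at 2 AM,
# 60 at 7 AM); the in-between values are made up.
window = [70, 68, 66, 63, 61, 59, 57, 54, 52, 50, 52, 54, 56, 58, 60]

reported_max, reported_min = max(window), min(window)
print(reported_max, reported_min)  # 70 50 -- not the true 80/50 for July 15
```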

In this example, if August has the same pattern as July, the immediate effect would be that one of the Maxes is 10 deg F too low, making the average max for July about .3 deg too low, or the average min/max mean about .15 deg too low.

However, the effects of the conversion do not stop there. August will lose its last day to September, and so forth, so that September, Oct., Nov., and Dec. will all be a little too high. The net effect on the year will be that the Dec. 31 high (call it 10 deg. F) will be replaced by a July 5 PM temperature (70 in the example), so that one daily reading for the year will be 60 degrees too high. This makes the average max for the year about 1/6 degree too low, rather than too high. Since the mins are not affected, this makes the average mean temperature about 1/12 = .08 degree too low, using these hypothetical numbers.

In subsequent years, however, the change is a wash, since the year’s own Dec 31 high is just being replaced by the previous year’s Dec 31 high. The TOB effect therefore cannot permanently affect the mean temperature. The fact that TOB does have a permanent effect in the USHCN adjustments shown by John V at #376 strongly suggests that NOAA or whoever has mistakenly treated it like a permanent bias, rather than correctly as a transitory bias. And the fact that they are adjusting the numbers up rather than down suggests that they forgot all about the end-of-year consideration.

I wonder what Karl et al. think about how their numbers have been applied to the data.

I would suggest that real daily temperature fluctuations, and the real TOB
to which they can contribute, are more germane than artificially constant daily
temperature patterns, and whatever speculations may be based thereon.

1. the ISSUE at hand is MICROSITE BIAS. NAMELY, Sites that show a SITING bias that has NOT been accounted for.

2. TOBS. No one has demonstrated any issue in actual data with TOBS adjustments that has anything
to do with the rating of the Site. None.

Now, JerryB has linked his stuff and I’ve linked his stuff and you should all read Karl’s paper
before asking another TOBS question or proffering another TOBS opinion.
Further, you can go download some CRN data, in 5-minute increments,
and study the stuff.

Now, do I think that it would be a good idea to revisit TOBS? Yes. But this is not the place.
One thing I love about this place is we always know how to get back on topic. Over at RC
every discussion degenerates into ” The ice is melting” and “fuel from dog poo and wood chips”

My question is technical – how exactly is the Max/Min temperature recorded at older (ie, non electronic type) stations. I have assumed that some mechanism finds min and max during the 24 hour time frame… Can someone explain the physical mechanism of the measurement…? Thanks!

TOB can have a significant effect on a station’s monthly (and annual) mean temps. This is pretty much along the lines of #14’s examples. The reason this happens is that temperatures at the time of observation are double counted. If the TOB happens to be near the normal time of minimum (or maximum) daily temperatures, then these extremes will be counted twice while the opposite max (or min) is counted only once. Using a reductio ad absurdum, say the min temp occurs at ob time and is a spectacular 25°F below the daily normal. This anomaly will appear on the record for the 24-hour period ending at the TOB and the 24-hour period beginning at the TOB. The two different maxes for each day will be counted only once for each 24-hour period. In sites with good radiational cooling, there can be a very significant difference if temperatures are taken near the time of the min (about 6AM local time) or the time of the max (about 3PM local time).
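The double counting is easy to see in miniature. In the sketch below (temperatures invented), a cold extreme landing exactly at the reset hour shows up as the minimum of both adjacent 24-hour windows:

```python
# 48 hours of flat 50 F readings with one 25 F extreme at hour 24, the reset time
temps = [50.0] * 24 + [25.0] + [50.0] * 23

win_before = temps[1:25]   # 24 h window ending at the reset reading
win_after = temps[24:48]   # 24 h window starting from the reset reading
print(min(win_before), min(win_after))  # 25.0 25.0 -- counted in both days
```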

Notwithstanding the above, as re #1, if the station continues to use the same time of observation throughout its record, the trends at the station will not be affected provided that the trends are the same for both the max and min temps. The real problem arises when TOB is changed. This creates a discontinuity in the station record which should be accounted for. Many long-term sites used the once-a-day max and min temperatures to construct their daily mean temperature record, and then, when replaced by automated recording temperature devices, switched to hourly observations with the means based on the 24-hour averages. In these cases TOB can cause a significant discontinuity in the mean temperature record.

All that being said, I have no problem with TOB adjustments provided the algorithms used to normalize time of observation bias are carefully constructed to work with the particular time of day such obs are taken.

There are quite a lot of abbreviations in different threads and although I
read this site nearly every day I can’t follow automatically all discussions.
So what exactly is TOBS ? Temperature of browsers ?

For many years, most weather stations used NBS (now NIST) certified glass thermometers to measure temperatures. Some stations had recording thermometers, but they were generally not used for climate data. The thermometers were located in a Stevenson Screen (Cotton Region Shelter) which was designed to minimize solar radiation effects.

There were two types of thermometers. The maximum temperature thermometer was a mercury-in-glass device with a restriction above the bulb which allowed mercury to rise above the restriction as temperatures rose, but inhibited the mercury column from falling so that the mercury in the column was always stuck at the highest temperature of the recording period. It was reset at the time of observation by rotating the thermometer rapidly so that the centrifugal force overcame the resistance of the constriction. Most shelters had the thermometer mounted on a device with a crank that allowed the observer to reset the thermometer without touching the instrument. This was exactly the same principle used in mercury fever thermometers. One would reset these by shaking vigorously to allow momentum and centrifugal force to push the mercury past the constriction.

The minimum temperature thermometer was constructed differently. Like the mercury thermometer it was a liquid-in-glass instrument, but without a constriction above the bulb. It didn’t use mercury because in cold locations the mercury could actually freeze in the glass. (It wouldn’t shatter the glass because unlike water, mercury contracts when it freezes). A liquid with a low freezing point, like alcohol, was used. A small moveable metal index was embedded in the liquid portion of the column. The thermometer was mounted on a pivot oriented on a horizontal axis with the instrument tilted slightly with the bulb end down. As the temperatures fell and the liquid contracted into the bulb the index would be carried down by the surface tension of the fluid at the interface between the fluid and the vacant portion of the tube. If the temperatures rose, the index would remain where it was as the fluid flowed past it. Thus, the position of the index represented the lowest temperature during the period from the previous reset. The index was reset by tilting the thermometer into an inverted position so that the index could slide down to the meniscus (which represented the current temperature).

The minimum thermometer was also the one used to record the current temperature since the top of the fluid represented the ambient temperature irrespective of the location of the index.

Modern observations use a MMTS (Maximum Minimum Temperature System). It is an aspirated electronic device, originally using a thermistor as the temperature element. This instrument has so many problems that it would take a chapter to describe its inadequacies.

Your confusion is simple. The reconstructors are trying to develop an “accurate” temperature record vs just finding the rate of change. If you just worked out the rate of change for consistent records (same inst, same site, same TOB etc.) then a lot of the errors in the temperature reconstruction method either fall out or cancel (not all of them).

In fact the best way to reconstruct a temperature series (IMO) would be to do the slopes and then pick a calibration period and integrate the slopes to find actual temps.

It would seem the correct way to handle TOB is to flag the record when it happens, and then when processing the data for averaging treat it as any other missing measurement (interpolate, substitute average, etc).
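A minimal sketch of that bookkeeping; the function name and the linear fill are my own choices for illustration, not any agency’s method:

```python
def fill_flagged_days(daily_means, flagged):
    """Treat days flagged for a time-of-observation change as missing and
    fill each by averaging its immediate neighbors (linear interpolation)."""
    filled = list(daily_means)
    for d in flagged:
        left = filled[d - 1] if d > 0 else filled[d + 1]
        right = filled[d + 1] if d + 1 < len(filled) else filled[d - 1]
        filled[d] = (left + right) / 2
    return filled

print(fill_flagged_days([10.0, 20.0, 99.0, 30.0], [2]))  # [10.0, 20.0, 25.0, 30.0]
```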

Hansen 2001 refers to both Karl et al [1986] and Easterling et al [1996] regarding TOB adjustments. I haven’t been able to locate Easterling [1996] yet, but I have read Karl [1986]. What TOB model is used by NASA/Hansen or by GHCN? I am assuming that the Easterling [1996] provides an updated model, because the one in Karl [1986] has the following issues:

there can be considerable variability of the TOB from year to year. This year-to-year variability [in TOB] is … due to differences in the timing of frontal passages, cloudiness, precipitation, etc. …the TOB for any month and given year can be substantially different from the mean TOB. This suggests that estimates of TOB from a general model will be most appropriate when applied to a mean derived from a series of years… The technique which we develop in this article is most appropriate when applied to means comprised of a series of years to estimate nonclimatic trends

In other words, it is inappropriate to use the Karl [1986] TOB model to adjust individual months/years.

Thanks for the interesting perspectives proffered. My take on what I’ve learned so far:

1) The published technique for TOB adjustments is not appropriate for climatic trend analysis

Karl et al themselves declare “The technique which we develop in this article is most appropriate when applied to means comprised of a series of years to estimate nonclimatic trends which are detrimental to spatial and temporal analyses of mean monthly maximum, minimum and mean temperature.”

I.e. (if I am correctly interpreting that statement) they have developed an estimation technique that is “most” appropriate for spatial and temporal analysis of mean temperatures… in other words, it is based on a generalized model of temperature change across latitudes (northing) and longitudes (time of day shifts due to sunrise/sunset/timezone shifts).

I understand the reasoning behind their adjustments. It’s not too different from the adjustment built into my chicken coop automated seasonal light timer: we can turn the light on for a certain period before/after sunrise and sunset. The timer needs me to provide a general sense of our latitude, and to preset the time of day. However, I must adjust the sunrise/sunset offsets, because solar time of day varies by longitude. (BTW, the movie Longitude is a fascinating look into the history of accurate time measurement!)

For reasons discussed below, I believe attempts to apply such calculations to specific stations and/or for climatic trend analysis, are misguided.

2) On the day of a time-of-observation shift, the data is invalidated.

If the observer reads “late” by two hours, then a high/low occurring in the final two hours of the previous day, larger than today’s high/low, will overwhelm today’s data. If the observer reads “early” by two hours, then a high/low occurring in the final two hours of the previous day will overwhelm today’s data.

Has anyone seen analysts actually invalidating data measurements due to TOB shift? I mostly see attempts at “correction” for duplicate/missing data!

3) It’s assumed that measurement at certain times of day will record duplicate high or low values over time

Yes, if the daily high/low observation is recorded and reset at the exact moment of the daily high/low, that high/low will affect both the prior and next day (if it happens to be the actual high/low of both 24 hour periods.)

Daily lows… I’ve been examining Weather Underground Personal Weather Station data in my area, and I see few if any patterns. The low could be near midnight, near five AM, near sunset… it all depends. It seems reasonable that a random percentage of daily measurements will be influenced by TO.

But there’s the rub: whether consistent or random, the trend is not impacted by TO, only by anomalies when TO changes.

The Karl paper is a wonderful presentation of certain analyses. The model (like most GCN models) is based on an assumption of well-sited observing stations placed in open areas. No shadows, no sun-facing thermal masses, nothing impacting either the measurements nor the general regional temperature. The assumed picture seems to be one of gently sloping land covered by smooth vegetation or desert. Such landscapes do exist of course, but they’re hardly the norm!

What the Karl method fails to do (and does not attempt to do) is show statistical skill in detecting and correcting realistic TOB based on real-world data. I look at the five-minute observation curves for sites near me and see a lot of very unpredictable shifts. Clouds move in and prevent the night from being “frizzling” cold as my daughter used to say, or they prevent the day from becoming a scorcher. A hail storm turns the temperature curve upside down. Daily rain drains off afternoon heat potential and sets up the evening as the high of the day. Etc etc.

4) Specific site issues trump TOB

A sensor located close to the east side of a tree, hill or mountain will have early “sunset.” West of an obstacle? You get a late sunrise. Here along the Front Range of Colorado, and in many other places, such effects can’t be avoided. But is that a bias or part of the reality of climate? Geography impacts climate! Should we “adjust” for this? I don’t think so, not if the “siting” issue is natural and impacts the region of interest.

As Stephen Mosher has indicated, other issues can easily overwhelm any potential for TOB. If the sensor is placed on a roof (or, to be extreme, near a wall of water barrels, in the exhaust plume of an A/C unit, etc.), the nearby thermal mass, thermal source or cooling source will have a massive effect on the daily temperature curve… potentially far more impact than the time of observation.

My bottom line: in theory, TOB seems a reasonable calculation. In practice, what I’ve learned is that more data should be invalidated (due to observation-time change)… and there ought to be a whole lot FEWER attempts at micro-adjusting for TOB because it really is not all that predictable.

A topic that seems pertinent to me: the recent discussion elsewhere on CA about proper handling of missing data. I’m not sure that’s been fully resolved.

The link from JerryB in #14 is useful, but the logic therein is hard to follow. From this and other sources, I gain the impression that at some stations, an occasional variation in procedure (like a change to the time of reading a thermometer) has been related to a long-term error, and adjustments made over periods of years. This is a real worry.

I see TOB being a further potential worry when daylight saving changes, coincidental with the thermometer reading being done by custom within an hour of max or min for a day. I see potential problems with leap years. I see problems when normalising the data from urban to rural stations or vice versa up to 1000 km away. At high latitudes, 1000 km can mean midnight at one station and midday at another 1000 km east or west. Or even worse, really close to a Pole, one can cross the same station several times while travelling east or west for 1000 km. That old Fortran correction code had to foresee these spherical geometry problems. Did it?

In China, there used to be only one time zone, so when it was noon on the East coast and near to the hottest part of the day, it was 6 am in the west and about the coldest time of the day. So what is the result of a convention to read all thermometers at midnight? (Maybe all is changed to World time now). In some Arab countries, noon was defined by law as when the sun was overhead. Hard to measure, but pretty tricky to quantify and then eliminate TOB.

Many time zones are an hour wide, so a time at the borders can change by an hour as you cross the line from one station to another nearby. In a similar vein, did that old Fortran software incorporate the effect of the International Date Line, where stepping from one station to another on the other side would change times by a whole day?

There were designs of thermometers other than those in the posts above. One form had a maximum recorder for mercury in glass, an iron peg in the tube that was pushed up by the rising mercury and prevented from falling when the mercury cooled and fell, by a spring clip on the side. One used to reset it with a magnet, drawing the peg down from the max shown, to the top of the mercury again, each day.

Currently, some Australian Bureau of Meteorology tables have some column space for recording the number of days that passed since the previous observation, for events like remote lighthouse keepers going AWOL for a week. Such thermometers as above, if used, would record the same max for several days in a row (unless mathematically corrected by guessing). If one of those days was really hot, it is easy to see an upward T bias for the month. (Harder to see a downward bias). The effect is not so important for a rain gauge, where the cumulative rain over a week means something that can help in reconstructions, but it’s a no-no for the temperature example I’ve given.

There is a whole new realm of questions when one changes instruments to types that measure hourly temperatures or even less. Does an hourly instrument average point temperatures over the last hour and record the mean? Or does it record the highest T reached, or the temp at the end of each hour? What I have NEVER seen is a correlation between averages derived day after day from an hourly machine, versus the simple average of Tmax and Tmin each day, as we commonly use now. One measures extreme temperatures, the other has the capacity to approximate heat flow. A plot of temperatures taken every minute can produce a skinny curve or a fat one depending on how long the heat hung round before a cool breeze set in, but still give the same Tmax and Tmin as before.

Then there is the topic of response times. If a puff of hot air floats through the instrument chamber, is this recorded as the daily max, or are the instruments designed to smooth transients? How fast did the old mercury devices change for transients? How fast are thermocouple devices designed to respond or smooth? If either records a max a certain time after the event, then a TOB adjustment is indicated as needed.

TOB is a general acronym that can be used for a multitude of bloody-minded purposes.

In a climate audit, we are supposed to discern the possible errors then uncover methodology “adjustments” by audit to see if they are valid or not. (Karl et al frighten me). I can’t do that with TOB in its present state. My maths and data bases are not good enough. But I hope that some of the above examples can be checked and audited by others more clever than I am.

1. The bias in TOB is only relative to the convention of determining average temperature as the mean of min and max temperatures over the 24 hours prior to midnight. That is, it’s not a real bias, in the sense of an error.
2. In temperate latitudes there is a symmetrical seasonal effect in TOB. In practical terms this means while monthly TOBs can be significant (as much as 1C) around the equinoxes, over the year monthly TOBs tend to cancel out and are much smaller over a year.
3. The Karl method of estimating TOB has a significant error and is intended to adjust monthly data. Karl said the error was around 25% of the estimated adjustment. As far as I can determine, annual TOB is just average monthly TOB. Given point 2, annual TOB is probably mostly estimating error.
4. The TOB adjustment in the US data (sorry, I don’t have access to the computer where the links are saved) since 1950 has increased from a small negative adjustment to around 0.2C positive adjustment. This represents at least a third of total twentieth century warming. Note that a positive adjustment could only occur in the real world as a result of a wholesale shift in time of observation from evening to morning. In reality, over this period, the shift would have been to automated measurement, which presumably occurs at midnight and hence cannot have a TOB. What we should be seeing is a steady reduction in TOB to zero, rather than the steady increase we do see. I.e. the TOB adjustment is definitely wrong.
5. The problem in sorting through this (mess) is that current automated measurements cannot have TOB, and you need to reconstruct TOB in old manually collected data in order to determine what TOB adjustment is needed to handle the change from manual to automated measurement. My suggestion for removing the TOB problem is to just use data collected automatically to determine the temperature anomaly, see what trend there is, and then statistically graft that on to the manually collected record (as per the Hockeystick). BTW, I’m not a statistician.

On Karl: Karl based his model for TOBS adjustment on 7 years of hourly data
from First order sites. 107 sites were used: 79 to create the model, 28 held
out for validation.

Second, if you have trouble with JerryB’s description, download one of his data files.
JerryB has 190 sites with hourly data. His data file will show you how shifting the
TOB impacts the means. I suppose one could create a little Excel file to show
what happens at one site.

Aurbo (#22) – Thanks! My background is heat transfer, and I taught graduate lab courses, including issues dealing with psychrometric ratios (i.e., mass transfer). But I used all modern equipment, and so understanding that historical reference was illuminating. I assumed that such a device was possible (human ingenuity, and an understanding that observers in the 19th and 20th centuries would want to do a good job of recording max and min). I get to pass this on to a friend of mine to show that he was incorrect…

While I have mentioned that TOB does not depend on passing warm or cold
fronts, I had never quantified how much it does not depend on them.

Let me use the word spike(s) for brief changes of temperature up, then
down, or down, then up, by at least some particular number of degrees.

When I did my TOB study, I deleted 5 such spikes of at least 20 degrees
F, not because of their possible effects on TOB, but because of a concern
that they might be erroneous data entries. I have no doubt that the one
of +95 F/-100 F was an erroneous data entry.

Also, I counted spikes of at least 3 F (128,981), and 5 F (17,240),
simply from curiosity, but did not attempt to quantify their effects on
TOB. BTW, the frequency of such spikes varies greatly from one region to
another.

Comment #27 in this thread seems to include an assumption that TOB does
depend on passing warm or cold fronts.

“Yes, if the daily high/low observation is recorded and reset at the
exact moment of the daily high/low, that high/low will affect both the
prior and next day (if it happens to be the actual high/low of both 24
hour periods.)”

So, from the stations used in the study I selected one from Colorado:
Denver/Stapleton airport, which ranked #10 in the 5 F or more spike
count, but only #16 in the 3 F or more, not because it didn’t have its
fair share of those spikes, but because such small spikes are common in
more places.

Then, I went and flattened the spikes, first those over 9 F, then those
over 5 F, then 3 F, and then 2 F, and after each successive flattening,
reran the TOB calculations. Following are the summary results, starting
with the original that has already been published. The effects of the
spikes do seem very minor, and in some respects, contrary to what some
might have expected.

Someone needs to reset my perspective. We’re talking about a simple instrument that captures a high & low over a measurement period, that is then recorded at some time, be that 7am, 5pm or midnight. The TOB is supposed to cater for a change from one measurement time to another on a particular day. So worst case you need to discard one or two days samples when you make such a shift. The instrument itself continues to record a peak high & low regardless. Please try and explain without using graphs that go beyond +/-1day how this can affect the min/max temperature other than on the day of measurement change.

Using about 6.5 years of hourly observations from the Columbus, GA airport (a data file I just happened to have handy), I took the min/max temperatures at each hour of the day. Then averaged for each day. Then took the 6.5 year average of the daily averages. This is what I get:

I pulled down the last year of hourly temperatures (01:00 1 September 2006 through 20:00 31 August 2007) for Pittsburgh PA from the National Weather Service web site (http://www.erh.noaa.gov/pbz/hourlyclimate.htm). Of the 8,756 hourly readings, twelve were missing and replaced using a linear approximation based on the surrounding temperatures. The last four readings on 31 August 2007 were missing.

I computed the minimum and maximum for each 24-hour block starting at each hour of the day. So for the full year, I ended up with 364 max/min pairs for each of the 24 possible data collection times. I found the midpoints for each pair. I then found the average of these 364 midpoint temperatures for each of the 24 collection times. The first column is the time of the daily data collection (recording the min/max pair from the previous 24 hours) and the second column is the average temperature for that collection time based on 364 readings.

There is a 1.92 degree F difference based on when the readings were taken. Readings that are taken in the late afternoon have the highest yearly average while those taken in the early mornings have the lowest.

Here are the 24 hr min/max averages from Tyson Field (Knoxville TN) by month for several years for 0600, 1800, and 2400. The bias is clear, although this isn’t the best way to show it. I’ll try to do more later.

The bias occurs without changing the time of observation. Changing the time
of observation will change the bias.

The bias results from an accumulation of differences in temperatures of
consecutive days. Just two days of data may, or may not, indicate the
kinds of differences that cause the bias, but even if they do, just two
days of data will not indicate the magnitude of the bias.
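A minimal sketch of that accumulation mechanism (hypothetical readings, not JerryB’s data): with a 5 pm reset, one hot afternoon can become the recorded maximum of two consecutive "days", whereas a midnight reset counts it once.

```python
# Three days of made-up hourly readings: mild, hot, mild.
hot_day = [20] * 12 + [35] * 6 + [22] * 6   # hot afternoon, hours 12-17
series = [18] * 24 + hot_day + [18] * 24

def daily_maxima(series, reset_hour):
    """Max of each successive 24-reading block starting at reset_hour."""
    maxima, start = [], reset_hour
    while start + 24 <= len(series):
        maxima.append(max(series[start:start + 24]))
        start += 24
    return maxima

print(daily_maxima(series, 0))    # [18, 35, 18] with a midnight reset
print(daily_maxima(series, 17))   # [35, 35]     with a 5 pm reset
```

The hot afternoon straddles the 5 pm boundary, so it sets the register for both the block it ends and the block it begins; repeated over many days, these double counts accumulate into the bias.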

At either page linked by this page you can see
two days of data that do indicate the kinds of differences that cause
the bias. Unfortunately, I may have crammed too much stuff into that two
days of data, but be that as it may, the necessary stuff is there.

49 Steven Mosher, said: THAT IS NOT THE POINT OF TOBS ADJUSTMENTS FOLKS. Let me explain why.

If you check the TREND of a midnight OBS for this site it was 0.0004 F/day.
If you check the TREND of a 7AM OBS for this site it was 0.0004 F/day.

NOW, what happens if halfway through the history of this site we SWITCH from Observing
at Midnight to Observing at 7AM?

The MEAN goes to 69.0. The trend goes to 0.0001 F/day.

So long term trends of raw data should match long term trends of TOB adjusted data if the TOB adjustment is properly calculated? Hmmmm, what a novel concept! What if, each time the time of observation changes, or the site is relocated, or the instrumentation is changed, the site were treated as a new site instead of trying to micromanage the data into one uniform continuous record?

So if you reverse the two (go from 7am to Midnight) you would get a larger trend? Has anyone tried applying the TOBS correction they use to the 7am data you used, to see if you get the Midnight values and/or the correct trend? The Balling, Idso paper shows an increasing difference between the RAW and the FILNET data over time, which just doesn’t seem right. That would imply that the times of observation are moving towards a non optimum time for reporting (more and more sites are requiring larger corrections to adjust to midnight).

I referenced a paper by Thomas Blackburn a while back on TOB. The most interesting part of an interesting paper was how he worded something about the midnight ideal reading of daily observations. Just for grins assume that early AM is the optimum time for TO, what happens?

Ummm… I finally woke up and found JerryB’s data links ;)… THANK YOU Jerry for going to all that trouble!

Now, I have a very basic question about the calculations made. I think I’m seeing a basic methodological error.

As stated, each high/low calculation incorporates 25 observations, not 24:

“The past 24 hours” will include the observations at both the beginning and the end of those 24 hours, and so will include 25 observations unless some data are missing. The “average (smoothed) hourly temperature of the past 24 hours” uses half of the first, and half of the last, of those observations (plus all of the other 23 observations). The number of consecutive observations will usually be 25, and if it is not, some data are missing, and 24 hour periods that are missing data will not be used in the summaries. If the “hour of most recent occurrence” is 25, it indicates that that occurrence was the observation at the beginning of that 24 hour period (i.e. it was 24 hours old).

I believe this is a basic error in analysis.

For each high/low record, only the most recent 24, not 25, observations should be used. Otherwise, high/low records 24 hours apart will incorporate, in essence, a one-hour overlap of measurements (since the measurements are hourly). This error will introduce a bias in the result.

To consider this in a bit more detail, consider the case where the precision of our timing is to the second, and our measurements are made hourly. Then a 24 hour high/low calculation should be made incorporating data from T-0 to T-23h59m59s, not T-0 to T-24h — the measurement exactly 24 hours ago should not be part of the high/low calculation.
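The two windowing conventions can be put side by side (hypothetical readings): with 25-sample inclusive windows, a peak that falls exactly on the boundary reading becomes the max of both days; with 24-sample half-open windows it is counted once.

```python
# 49 made-up hourly readings: rise for 24 hours to a peak of 24 at the
# day boundary (reading 24), then fall back down.
temps = list(range(25)) + list(range(23, -1, -1))

def window_max(series, t, n_samples):
    """Max of the n_samples readings ending at (and including) reading t."""
    return max(series[t - n_samples + 1:t + 1])

# Two consecutive daily readings, at readings 24 and 48:
print(window_max(temps, 24, 25), window_max(temps, 48, 25))  # 24 24
print(window_max(temps, 24, 24), window_max(temps, 48, 24))  # 24 23
```

With 25 samples per window, the boundary peak of 24 is recorded as the maximum of both days; with 24 samples it belongs to the first day only.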

This becomes more obvious in a variety of computer science applications, where number space overlaps or gaps can cause serious errors in implementation of algorithms from random number generation to sort/search/hash tables etc. But all in all it’s the same problem. Getting the edge effects right is crucial.

Sadly, I’m not up to speed enough on R to code this (looks like a VERY interesting language, by the way!) But I do recommend looking into the result of correcting this error.

If the hourly measurements were hourly high/low measurements, then 24
pairs would be sufficient for a 24 hour period. However, the hourly
measurements in these instances are point measurements, i.e. what is the
temperature at the time of observation. In order to use such
measurements to approach what 24 hour max/min measurements would record,
one must include the 24 hour old measurements.

The time of observation issue arises only because we are trying to measure maximum and minimum temperatures. However, it seems to me that Jonathan Lowe has shown that we might learn much more about climate theories by looking at trends of temperatures taken at the same time every day. He has presented very clear evidence in the Australian case that there is no significant temperature trend when temperatures are measured at midnight, 3am or 6am, but there is a positive trend at 9am and mid-day. From his previous work on individual stations, I think he is also going to show us that temperatures recorded at 3pm and 6pm every day also show a significant positive trend since WWII. This information would appear to be much more informative about which types of models better explain recent temperature changes than does any trend in maximum or minimum temperatures. Does anyone know whether Jonathan’s findings extend to other locations, e.g. the US?

the hourly measurements in these instances are point measurements, i.e. what is the
temperature at the time of observation. In order to use such
measurements to approach what 24 hour max/min measurements would record,
one must include the 24 hour old measurements.

Yes, they are point measurements. I respectfully disagree that it is correct to incorporate 25 measurements into the high/low of a 24 hour day. It is possible that my addled brain has forgotten everything I ever knew about this, but I doubt it. Once upon a time, I specialized in this…

This problem is similar to the challenge of converting data between discrete integer and ‘real’ values, which has been partially discussed elsewhere in CA. It’s an important topic in certain areas of computer science.

The bottom line, simplified: we can either extrapolate the measured values halfway before-and-after the measurement points, or simply extend the value from the measurement time up to (but not including) the next measurement time. Doesn’t matter. Either way, we should end up with four measured values integrated across 1/4 of a “day”. Four values, six hours each. Four values, four possible high/low extremes.

I’ll illustrate through a radically simplified example. In this example, there are only four measurements a day. And the sensor records in 25 degree increments:

D M T Hi Lo
1 0 0 0 0
1 1 50 50 0
1 2 75 75 0
1 3 100 100 0

2 0 25 25 25
2 1 50 50 25
2 2 75 75 25
2 3 25 75 25

3 0 0 0 0
3 1 100 100 0
3 2 25 100 0
3 3 0 100 0

Keeping this very simple (and extreme, yes, to make the point obvious)… here we have four point measurements each day, taken at the beginning of each of four Measurement periods. We have no idea what happened between measurements. There is no justification to interpolate in any way. All we can do on the first measurement of the new day is reset our high/low record and begin again.

What is the Day 2 low? It is 25, not zero. We do not know enough to guess any more accurately than that. Likewise, the 2nd day high is 75, not 100. Clearly, the measured high/low during the first day is 100/0, 75/25 for the second day, and 100/0 for the third day.
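The per-day highs and lows claimed above follow directly if each day’s high/low is computed only from that day’s own four readings:

```python
# The three example days above, four point readings each.
days = [
    [0, 50, 75, 100],   # day 1
    [25, 50, 75, 25],   # day 2
    [0, 100, 25, 0],    # day 3
]
highs_lows = [(max(d), min(d)) for d in days]
print(highs_lows)   # [(100, 0), (75, 25), (100, 0)]
```

No reading from an adjacent day enters any day’s high or low; day 2 stays at 75/25 regardless of what day 1 or day 3 recorded.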

Between 1-3 and 2-0, the temperature could have been anything in a wide range, certainly enough to impact the high/low for either day. But we don’t know that and cannot guess.

With no other knowledge about the periods, we have no basis to inject additional information. No basis, for example, to suggest any slope or curve shape to the period between measurements. And, we have no basis to make use of measurements from the day before or after any given day, to modify that day’s high or low temperature.

Likewise, the same holds true for the data sets being considered here.

Essentially, we can correctly “pin” the high/low values at the (four) measured times in the period of interest, and must be VERY cautious about taking things any further than that. All hell breaks loose when one assumes the previous/next day data can be “pinned” to the obvious points.

It is dangerous to presume high/low values on the basis of presumed data, unless one knows something about the inter-measurement data curve… in which case there’s really more data available than meets the eye.

There’s a nice proof of this tucked away in a paper I made use of about 20 years ago, to implement provably correct geometric intersections for GIS analysis. I suppose I will need to go dig it up ;)

(Yes, an integrating min/max sensor will take many more measurements. (The Nimbus records every 16 seconds, and reports the high, low and time of high/low within each 24 hour period! Love it!) But that’s not what we have in this data set.)

The fact that hourly, much less six hourly, point temperature
measurements do not tell us what temperatures occurred between those
measurements is not in dispute. It is also completely extraneous to the
question of whether, or not, to include 24 hour old point measurements of
either such intervals when attempting to approach, as closely as either
such data permits, 24 hour max/min observations.

Barring operational errors, with 24 hour max/min observations, the
temperature ranges of consecutive days always at least meet, if not
overlap. When using point measurement data, if one does not include 24
hour old point measurements, one can get gaps between consecutive day
temperature ranges, which imply either operational error, or in the case
of what you propose, methodological error.
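The gap argument can be checked numerically (hypothetical readings rising steadily through the day boundary): with the 24-hour-old reading included, consecutive day ranges meet at the boundary value; without it, a gap opens between them.

```python
# Made-up hourly readings rising one degree per hour through the boundary
# at reading 24, so the boundary reading is day 1's max and day 2's min.
temps = list(range(49))   # 0, 1, ..., 48

def day_range(series, t, n_samples):
    """(min, max) of the n_samples readings ending at reading t."""
    block = series[t - n_samples + 1:t + 1]
    return min(block), max(block)

print(day_range(temps, 24, 25), day_range(temps, 48, 25))  # (0, 24) (24, 48)
print(day_range(temps, 24, 24), day_range(temps, 48, 24))  # (1, 24) (25, 48)
```

With 25 samples the ranges share the value 24; with 24 samples day 1 tops out at 24 while day 2 starts at 25, a gap a true max/min register could not produce.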

So here’s the deal. You have a thermometer that’s way off but it stays in the same condition over 100 years. Same environment. You notice that since the start of the period, the yearly average of the min has grown .5. You notice the yearly max has shrunk .5.

OK but I am pointing to something else entirely. I am saying (Jonathan Lowe is saying really) that we should move away from looking at trends in maximum and minimum temperatures and instead look at trends in temperatures recorded at 3am each day, trends in temperatures recorded at 6am each day, trends in temperatures recorded at 9am each day etc. The fascinating thing that Jonathan has found when looking at high quality stations in Australia is that there are no trends with midnight, 3am and 6am temperature recordings of actual temperatures (not max or mins over the last hour etc), but there is a statistically significantly positive trend in 9am, noon, 3pm and 6pm temperature readings. I think this result (if it is replicated elsewhere in the world) says quite a bit about what could be causing the temperature changes. For example, if it’s CO2, shouldn’t the night temperatures be rising FASTER than the daytime ones, not slower? Looking at temperatures this way also has the added benefit that the TOBS issue is irrelevant. The drawback, of course, is that you can only use stations with a long history of hourly (or three hourly) readings.

69 Steve Mosher, if the raw data is indeed raw, trending the data per site should be the most accurate method of building the overall trend, via anomalies as John V. recommended. Instead of challenging, why not just build a better mouse trap? Kind of like Kung Fu, bend with the data, don’t force corrections.

70 Peter Hartley, Jonathan’s research is interesting but I don’t think it would be applicable to the data sets available for the GHCN/USHCN. The ASOS sites have hourly data, I think the Nimbus sites do, and the new CRN has hourly. Everything else is max/min only. The older sites were read four times a day in most cases (1800’s until max/min LIG became widespread).

1. IF you are going to follow the ‘peterson, easterling, hansen” method of
creating a REFERENCE SITE that has the longest consecutive record as possible…
THEN.

a. You must adjust for TOBS changes at that site. They do. ( I want to audit that)
b. You must adjust for Instrument changes at that site. They do ( I want to audit that)
c. You must adjust for changes in site location( elevation). (They do… etc etc etc)

2. If you are going to follow a JohnV approach, you might not have to make these adjustments.

Agreed. That’s an attraction of JohnV’s method. STILL. Unlike Mann and Hansen and others I would
like to have Willis and JeanS and UC and SteveMc and Roman and professional statisticians
weigh in on the decisions. JohnV and I are engineers. Well, he is; I was. For us stats is a tool.
Sometimes we hammer with a screwdriver.

My goal here is focusing the questions on those areas that have not been questioned and verified.
JerryB has 9 years of hourly data from 190 sites. The original TOBS study was 7 years of data from
107 sites. There are some issues here, but it is not a rich vein. Does it fascinate me? Sure.

RE 70. Fair enough. Have the author post his data, source of his data, and code.

Free the code is my watchword no matter what the outcome.

That said. You wrote:

” (Jonathan Lowe is saying really) that we should move away from looking at trends
in maximum and minimum temperatures and instead look at trends in temperatures recorded at 3am each day,
trend in temperatures recorded at 6am each day, trends in temperatures recorded at 9am each day etc.”

Unfortunately in the US you ONLY get these hourly measurements at AIRPORTS.
So, non random sample. Thank you for playing. Here’s a stuffed wallaby, throw a shrimp
on the barbie, mate.

73 Steven Mosher, I appreciate that approach it is very normal. Outside of the box take a fictitious site called FT. Hansen, TX. 1890 to 1920 the soldiers on guard duty checked the temperatures every 6 hours starting at 0600. In 1920 they installed a brand spanking new max/min LIG thermometer and read it at 0700 hours. In 1992 due to Clinton base closures the LIG was replaced with an MMTS station.

1890 to 1920 station trend 0, 1920 to 1985 station trend 1, finally 1992 to present station trend 3. With the exception of the MMTS that requires correction because it sucks, you just have trends with variations above and below each trend’s average. Not the arbitrary 1951 to 1980 or whatever, the anomaly of each trend. If the time of observation changes, thermometer breaks etc., new trend.

This reduces the influence of instrumentation error except for MMTS. You should find that pre-1950 divergence decreases. It is the KISS part of engineering school that I didn’t sleep through that makes me think this way.

Then pick any thirty year period to determine baseline for the combined trends. Auditing when Hansen and Gavin both say that have nothing to do with the less than optimum data is a waste of time. Do it the right way.

This is what needs explaining in TOB. The USHCN data set has a steadily increasing TOB adjustment from the 1950s. The total trend in the adjustment since then is about 0.3C, a large proportion of the warming that is supposed to have occurred over that period.

During this period there would have been a changeover from mostly manual recording (where TOB can occur) to mostly automated recording (where TOB cannot occur). Whatever TOB adjustment that was required with manual recording should have steadily declined toward zero as the proportion of automated stations increased.

The graph shows the opposite, a steadily increasing TOB adjustment since 1970. This is definitely wrong. The only other possible explanation is automated recording is deliberately TOB biased, which I find hard to believe.

Jonathan Lowe’s analysis shows that minimum temperatures are increasing without corresponding increases in night-time temperatures. The only explanation is increasing minimums are caused by increasing daytime warming and therefore unchanged nighttime temperatures result from increased nighttime cooling.

This is directly contrary to the OCO driven warming hypothesis and in a rational world would be held up as a classic case of evidence disproving theory.

It also raises some interesting questions about exactly how temperature data can be used to determine if warming is occuring or not. Jonathan’s analysis is persuasive that over the last 50 years daytime warming has increased, but so has nighttime cooling by an approximately equal amount.

This may seem off topic for a TOB discussion, but increased daytime warming and increased nighttime cooling would tend to increase TOB due to increased diurnal range. Note that min and max temperatures give a misleading picture of diurnal range at least in the Australian data.

(I’ve not yet had a chance to dig out my discrete/continuous number transform resources…will have minimal authoritative material to add until then.)

JerryB writes:

Barring operational errors, with 24 hour max/min observations, the
temperature ranges of consecutive days always at least meet, if not
overlap.

This is more or less correct. It depends on the device used, since some recorders automate the process completely. And a mechanically reset hi/lo thermometer will have (minor) reading gaps. But that’s actually immaterial to the question.

Proper management of high/low readings must be designed to avoid incorrect outcomes. And, by definition, one incorrect outcome would be to allow a moment to exist in two different time periods. One must choose: does midnight apply to day one or day two. No matter how finely you chop the times, each moment must apply to only one time period.

When using point measurement data, if one does not include 24
hour old point measurements, one can get gaps between consecutive day
temperature ranges, which imply either operational error, or in the case
of what you propose, methodological error.

If the 24 hour old measurement is included in the current “day”, then the current measurement should be reserved as the beginning of the next day.

Otherwise, as you have so ably demonstrated, the “inter-day” reading is counted twice, biasing the results. No reading should be counted twice. It simply introduces the potential for error in analysis.

This is all a matter of properly envisioning the two dimensional space, in this case of temperature vs time, with respect to the discrete point measurements.

The common mistake is to assume that discrete point measurements define the outline of a curve that can be used to calculate an area for averaging, etc. That’s a pretty good approximation in many cases, and it makes for pretty line graphs, but it is not actually accurate.

Discrete measurements are points on a grid. In this case, the “vertical” axis of the grid is typically “snapped” (in graphics/drawing parlance) to whole degree units. And the “horizontal” axis is typically “snapped” to hours. One cannot simply remove the grid and presume to obtain correct results in infinite precision real number space.

[Yeesh. You’ve woken up some very old analysis I used to know by heart ;)… I do need to find that paper! I think I should be thanking you for that, Jerry… this was one of the more enjoyable portions of a past life :-D]

We need to adjust our thinking to recognize the underlying grid. Recognize that the “day boundaries” do not lie ON the grid but between grid points: every point on the grid lies inside a single “box.”

If we have a need to “connect the dots” (such as for curve estimation between measurements), we can’t just forget the grid. Yes, we can draw an imaginary continuous line between our measurements, but when we want to fix a given temp/time estimate along that curve, we recognize it will resolve to a spot on the grid, never in between grid points. And wherever that grid point is, lies inside only one “box.”

The fact of the underlying discrete measurement grid has significant implications on correct management of calculations. Many people never have worked through this question. As I said earlier, it’s typically a computer science problem only brought up when creating things like microcode for division algorithms, correctly implemented 2-D path intersections, and so forth.

Once this has been correctly implemented, any given time of observation will show no trend compared to any other time of observation…assuming changes in time of observation are treated as beginning a new, unrelated, series.

And now I see clearly: that’s the correct answer for handling changes in time of observation with respect to trend analysis. A new time of observation is simply the start of a new trend line. In the horizontal (time) axis, the two lines are discontinuous: either there’s a time gap or a time overlap. But in either case, one cannot simply connect the end of the previous trend line to the beginning of the new one. One must treat them as separate trend lines.

By the way, working through the discrete/continuous grid question can eventually provide the statisticians in the crowd with the ammunition needed to generate minimum uncertainty levels for all of this. And the answers are a bit different than most would assume.

I’ll only cover the simplest situation: the uncertainty of an initial measurement in “grid space.”

Suppose we see 10 degrees at 6am, presumably +/- 0.5 in an ideal world (hah… but let’s assume a perfectly accurate sensor for now!)

Because of “the grid”, and the various ways that grid calculations can be handled in real world computer code, proper handling of the underlying uncertainty of that value should actually be as high as +/- 0.99999…, not just 0.5.

Without more information we do not know how the initial “10 degrees” grid point is arrived at. Is “10” used for an underlying (continuous) range of 10.0 to 10.999…, or 9.000…001 to 10.0, or 9.5 to 10.4999…? We don’t know.

Making it practical, it’s perfectly reasonable to see this entire range of uncertainties in the temperature record:

– Some sensors are read using a rounding algorithm. 0.5 below to 0.5 above.
– Some sensors are read using truncation (observers always report whole numbers!), which means below-zero temperatures get truncated UP by up to 0.999 degrees, and above-zero temps get truncated DOWN by up to 0.999 degrees.
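The two read-off conventions behave differently, especially around zero, as a quick sketch shows (`rounded` and `truncated` are illustrative names; note also that Python’s built-in `round` resolves .5 ties to the even digit rather than always up).

```python
# Hypothetical illustration of the two whole-degree read-off conventions.
def rounded(t):
    return round(t)   # nearest whole degree: error within about +/-0.5

def truncated(t):
    return int(t)     # toward zero: 3.9 -> 3, but -3.9 -> -3 (truncated UP)

for true_t in (3.9, -3.9, 10.49, 9.6):
    print(true_t, rounded(true_t), truncated(true_t))
```

Truncation toward zero pulls below-zero readings up and above-zero readings down, exactly the asymmetry described in the second bullet.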

Re #82, there’s no such thing as “the temperature at the moment between two consecutive time periods”.

The temperature at that moment is either in one period or in the other. It cannot be in both. No individual moment is in two adjacent days, hours, minutes or seconds. It has a discrete identity.

Let’s simplify even further. One reading per year. I hope it’s obvious that we would never look at the following…

2004: 110
2005: 100
2006: 80

…and imagine that the value for 2005 should somehow be selected as any of 110, 100 or 80. The 2005 value is 100, period. In this case, 100 is both the high and the low for 2005. The value for 2004 and the value for 2006 are immaterial.

Extending from one reading per year to more has no impact on this truth. Each reading exists in only one time period, never two.

Conclusion
[9] We certainly realize that the conterminous United States
represents only 1.54 percent of the Earth’s surface area, and analyses of that areal unit may have limited interpretations for any global temperature record. Nonetheless, we show clearly that adjustments made to the USHCN produce highly significant warming trends at various temporal scales. We find that the trends in the unadjusted temperature records are not different from the trends of the independent satellite-based lower-tropospheric temperature record or from the trend of the balloon-based near-surface measurements.
Given that no substantial time of observation bias would be contained in either the satellite-based or balloon-based measurements, and given that the time of observation bias is the dominant adjustment in the USHCN database, our results strongly suggest that the present set of adjustments spuriously increase the long-term trend.

Simply rearranging the order should not have impacted the average 3-period temperature at all. It should be 20. Yet as I’ve shown, such rearrangements give an average that varies from 16.7 to 23.3! Hardly correct.

Yes, my simple example is pushing Jerry’s method to the limit. But the same issue will arise if one uses more samples and deletes the edge values.
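The order-dependence can be reproduced with different illustrative numbers (these give 17.5 to 22.5 rather than the 16.7 to 23.3 quoted above): when overlapping two-sample “days” share their boundary reading, the average of the per-day midranges depends on the order of the same three readings.

```python
from itertools import permutations

# Three point readings whose plain mean is 20. Consecutive pairs act as
# two-sample "days" sharing their boundary reading (the N+1 convention
# under discussion); we average the per-day midranges.
readings = (10, 20, 30)

def mean_of_midranges(vals):
    mids = [(max(a, b) + min(a, b)) / 2 for a, b in zip(vals, vals[1:])]
    return sum(mids) / len(mids)

results = sorted({mean_of_midranges(p) for p in permutations(readings)})
print(results)   # [17.5, 20.0, 22.5] -- same data, different averages
```

A plain mean of the three readings is 20 in every ordering; the midrange-of-overlapping-windows average is not, which is the order sensitivity the example is pushing at.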

Has anybody seen evidence that that statement applies to USHCN stations?
I have not.

It seems most USHCN sites use the Maximum Minimum Temperature System, which automatically records max and min temperatures but is manually read and reset. The recording period is from reset to reset.

So you are right and I was wrong. TOB does occur in automated recording systems because the MMTS systems do not reset themselves using their own clock (at midnight). Hard to believe that they have to use a dubious TOB estimating procedure in arguably the world’s most important measurements because a piece of electronic equipment doesn’t have a 50 cent clock chip in it.

Re #87, “Since every moment in time is between two consecutive times periods,
it would appear that you have attempted to banish temperature. :-)”:-D

I’m not suggesting the temperature disappears. Nor am I suggesting that time stops. I’m saying that measurements (of temperature or really almost anything) are at discrete, measurable moments, not continuous. And that time is recorded at discrete moments.

The temperature at that moment is either in one period or in the other.
It cannot be in both.

“Again, I’m not suggesting the temperature fails to persist. I’m saying
the moment recorded belongs to only one period.”

And I am saying that the temperature, not the moment, belongs to two.

The temperature at the moment between two consecutive time periods is the
temperature at the end of the first time period, and it is also the
temperature at the beginning of the second time period.

“As I read this, you are saying the moment at the end of the first time
period is identical to the moment at the beginning of the second time
period.”

I don’t know how you could read it that way; the temperature, not the
moment, at the end of the first time period is identical to the
temperature, not the moment, at the beginning of the second time period.

BTW, it would seem appropriate that you stop posting false assertions
about what my method, or algorithm, or calculation, would do with your
simple examples.

“If they are false, then prove it. The documentation and data sets say
that your method calculates hi and low from N+1 samples for periods with
N samples taken. How did I misuse that?”

My method estimates monthly, and yearly, TOB based on hourly temperature
observations. The first step of my method is to find sets of several
years of hourly temperature data. Your examples would not get past the
first step of my method.

Sam, don’t worry. Yes, the details are a bit minute, but that’s because sometimes the minute details are crucial to correct calculations ;). Be patient and I believe you’ll see some rather interesting and helpful results emerge from this discussion.

We really are making progress, step by step.

JerryB has clarified:

the temperature, not the
moment, at the end of the first time period is identical to the
temperature, not the moment, at the beginning of the second time period.

OK, now let’s specify that a bit more completely. For convenience, I’ll invent a brief notation:

[date,temp] is a data pair, specifying a “two dimensional” value in time/temp space.

[M2006e,T2006e] is the data pair associated with the Moment at the end of 2006, and Temp at end of 2006

[M2007b,T2007b] is the data pair associated with the Moment at the start of 2007, and Temp at start of 2007

For my example, you’ve now articulated:

The moment at the end of the first time period, M2006e
is NOT identical to
The moment at the beginning of the second time period, M2007b

AND

The temperature at the end of the first time period, T2006e
is identical to
The temperature at the beginning of the second time period, T2007b

M2006e !== M2007b
T2006e === T2007b

Correct?

I have declared that the beginning of the second time period, M2007b, is 12:00:00am, January 1, 2007.

Here is my first question for clarification:

1) Please articulate the definition of the moment at the end of the first time period, M2006e !== M2007b

(My answer, for reference: for the purposes of this work, M2006e is one measurement interval before M2007b, e.g. one hour earlier for hourly measurements. And T2006e is a separate measurement from T2007b, not identical to T2007b.)

My method estimates monthly, and yearly, TOB based on hourly temperature
observations. The first step of my method is to find sets of several
years of hourly temperature data. Your examples would not get past the
first step of my method.

The method of estimation should work correctly anywhere, yes?

I will specify that my example is for a planet with one hour per day, one day per month, and one month per year. Thus, there is one sample per year. Perhaps the code does not handle such a case flexibly, but that’s not my problem. The underlying methodology should work no matter how few hours there are in a day, etc. QED.

(No, I’m not being silly. We’re dealing with a subtle boundary condition that’s best understood under simplified conditions. Throwing a lot of data at the problem is not helpful for understanding.)

It’s already quite complete enough. The temperature persists through
the boundary between consecutive time periods, whether you prefer that
boundary to be a microsecond, or a nanosecond, or something smaller.

If I measure 10 C at 10:00:00 and then at 22:00:00 I have two 10 C readings 12 hours apart. If I get 8 the first time and 12 the second time, I have an 8 C and a 12 C measurement 12 hours apart. (And of course, the same average temp “for the day”)

If I measure the temp on 31 Dec 2006 23:59:59 and then 1 Jan 2007 00:00:01 I have two readings, probably the same number, 2 seconds apart. If I do that again in a year, unless at that time there’s some weather event different than the year before, I’ll probably get pretty much the same numbers.

If I measure temp every minute for a day, add all the values, and divide the total by 1440 I have that day’s average temp. Which information tells me basically nothing because there’s a whole bunch of combinations of the 1440 values that would give me the same average. But if I did that in December in San Francisco, Greenland and Panama, I could probably tell which was which.

If I measure my interior oven temp every second after setting it to 400 F and turning it on, I’ll probably get a fairly straight rising line over time as it heats. And probably a fairly straight falling line over time if I turn it off and measure it as it cools. And if it’s been off for a long time, something around room temperature, minus however the insulation affects it. If I add the high and low and divide, I get a meaningless number. Even if I knew how long the oven had been on and knew the slopes of the heating and cooling.
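The point about daily averages can be made concrete. Here's a sketch with an assumed asymmetric daily curve (my invention, not real data) sampled every minute: by construction the true mean is 20, yet the (max+min)/2 "average" disagrees.

```python
import math

# Assumed asymmetric daily cycle, one sample per minute for 24 hours.
# The mean of the sine and cosine terms over a full day is zero, so the
# true daily mean is exactly 20.
temps = [20 + 5 * math.sin(2 * math.pi * t / 1440)
            + 2 * math.cos(4 * math.pi * t / 1440)
         for t in range(1440)]

true_mean = sum(temps) / len(temps)
minmax_mean = (max(temps) + min(temps)) / 2
print(round(true_mean, 2), round(minmax_mean, 2))   # → 20.0 18.28
```

With a symmetric curve the two agree; the second-harmonic term is what pulls the min/max "average" off the true mean.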

#92 Not a problem. I really have absolutely no idea what either of you are even talking about. Is it a discussion of sampling interval? Statistics? What temperature beer freezes? Who would win the Kentucky Derby if horses had rockets strapped to their heads?

JerryB took time series and emulated the procedure required for observers at COOP sites.

From the manual.

7. TEMPERATURE (-F). The maximum (MAX.) and minimum (MIN.) temperatures are the highest and lowest temperatures to have occurred during the past 24 hours. The AT OBSN temperature is the temperature at the time you take your observation. Enter to the nearest whole degree.

The minimum must be at least as low as the lowest of yesterday’s and today’s AT OBSN temperatures, and the MAX must be at least as high as the highest of today’s and yesterday’s AT OBSN temperatures. For example, if yesterday’s AT OBSN temperature was 95, today’s maximum must be at least as high as 95, even if the maximum this calendar day was only 86. You may record the 86° maximum in the REMARKS (far right) column as “PM MAX 86,” as shown on the sample page [inside front cover] on the first day of the month. This is optional. See the REMARKS column on the sample page for the 23d of the month for recording last night’s minimum (23), when it was warmer than yesterday’s AT OB temperature (11).

Sam is more or less understanding this intuitively. Take two readings in a day and they determine what you know about temperature for that day. (It’s easy to understand when the readings are not at the edge of the day. The challenge is how to properly handle the edges of the day.)

I really have absolutely no idea what either of you are even talking about. Is it a discussion of sampling interval? Statistics? What temperature beer freezes? Who would win the Kentucky Derby if horses had rockets strapped to their heads?

:-D The challenge we’re working through is how to properly perform the analysis that underlies the topic of this thread: Time of Observation Bias… and in particular how to handle the specific situation where you have a series of hourly temperature values. This is what Jerry’s data collection contains.

Its already quite complete enough. The temperature persists through
the boundary between consecutive time periods, whether you prefer that
boundary to be a microsecond, or a nanosecond, or something smaller.

Yes, the temperature persists. The problem is with the boundary definition and how it is handled.

The boundary is not a time period. It is a dividing line that separates time periods; it defines which measurements go in one period or the other.

And the hard part to understand is that we cannot simply take a discrete data set containing measurements at specific times and convert it to an assumed infinite-precision timeline linking those data values. To do so invites analysis errors. My simple example demonstrates the kind of error that arises.

Ill stick with 24 hour days.

If you insist! It isn’t difficult to expand the simple example.

Here’s an example using 24 hour days, and simple repeated data on each day:

My method gives an average temperature of 160.0 degrees for the month, no matter which day has which temperatures.

Jerry’s method gives an average January temperature of anywhere from 155 to 165 degrees, depending on the order and sequence of the daily temperatures.

We can randomize the days, do whatever you like. My method will always record the correct average value for each day and for the period. Jerry’s method will record a wrong answer any time the analysis (high or low for the day) comes from the final value in the previous period.

The first thing that stares me in the face about the manual is their rounding instruction:

Record these readings, as well as the current temperature, to the nearest whole degree. If the reading to the right of the decimal is 5 or greater, round off to the higher figure; i.e., 39.5 should be recorded as 40.

They’re using the old round-half-up method. It’s biased. Rounds up 5/9 of the time (5,6,7,8,9) and down 4/9 of the time (1,2,3,4). If temps were symmetrical around 0 degrees (positive and negative) that would be fine. But they’re not. I wonder if GISS attempts to “correct” for this? :-D
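If I've read the instruction right, the size of that bias is easy to check. Assuming each tenths digit 0-9 is equally likely on a positive reading, round-half-up adds about +0.05 degrees per reading:

```python
import math

def round_half_up(x):
    # The manual's rule: .5 and above rounds away from zero (for positive
    # readings); note this differs from Python's built-in banker's rounding.
    return math.floor(x + 0.5)

# Errors for a positive reading with each possible tenths digit.
errors = [round_half_up(39 + d / 10) - (39 + d / 10) for d in range(10)]
print(round(sum(errors) / len(errors), 3))   # → 0.05
```

So each rounded positive reading carries, on average, a +0.05 degree warm bias under this uniform-tenths assumption.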

Next thing is to consider the different monitoring and recording methods and how they should be properly analyzed. Hmmm…. time to get some real work done and come back later…

Your method does not report what 24 hour max/min measurements would report for
the temperatures that you specified. Mine does. It seems that you expect
24 hour max/min measurements to behave the way you prefer them to work,
rather than the way they do work.

Hmmm: R users (and people who know how to use Excel) should avert their eyes now, I think.
Here is a recipe for creating the TOB for a simple sine-wave day, in Excel:
Note: maybe start with 100 rows and then drag it all down to 1000 or more (even 10k) to get a cleaner result.

– Fill down a column with an incremental series 0, 15, 30, …
– In the next column, convert this to radians.
– In the next column (NC from now on!), take the sine of this.
– NC, multiply by -5 to give a reasonable diurnal variation [the factor should be negative to flip the sine wave so it is cool in the morning] and add, say, 20 to give a reasonable average temp.
– NC, fill with normally distributed random numbers, mean zero and SD of, say, 2.
– NC, add this noise to the sine-wave temperature column.
– NC, divide the 0, 15, 30 column by 7.5 to get hours.
– NC, ‘MOD’ this hours column by 24 to get time of day.
– NC, the MAX of the previous 24 hours’ noised temperatures.
– NC, the MIN of the previous 24 hours’ temps.
– NC, the average of these MAX and MIN columns.
– Lastly, level with the first row of your 24-hour max and min columns, copy the time-of-day column across.

These last 4 columns want a name put above the data, and then you can drag over them all (including the names) and create a pivot table & chart.

Very long winded, apologies. It is in fact a lot easier and less messy to install one of the freebie monte carlo add-ins which refreshes the random numbers and collects the information from chosen cells as many times as you want.

Out of interest, the result that I get from this sine-wave day is very similar to PK’s plot in #48.
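For anyone avoiding Excel, here is a rough Python translation of that recipe (parameters assumed to mirror it: mean 20, diurnal amplitude 5, Gaussian noise with SD 2, hourly samples; the seed is arbitrary):

```python
import math
import random

random.seed(0)
DAYS = 2000
# Noisy sine-wave days: negative amplitude so mornings are cool.
temps = [20 - 5 * math.sin(2 * math.pi * h / 24) + random.gauss(0, 2)
         for h in range(24 * DAYS)]

# For each hypothetical reading hour, collect the min/max mean of the
# trailing 24-hour window (25 hourly readings) minus the true mean of 20.
bias_by_hour = {h: [] for h in range(24)}
for h in range(24, len(temps)):
    window = temps[h - 24:h + 1]
    bias_by_hour[h % 24].append((max(window) + min(window)) / 2 - 20)

for hour in range(24):
    vals = bias_by_hour[hour]
    print(hour, round(sum(vals) / len(vals), 2))
```

Grouping the bias by reading hour reproduces the shape of the pivot-chart result; with a pure sine and no noise the curve would be flat, since a full 24-hour window always captures the peak and trough.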

The example you provide does not shed any light on the problem at hand, which is:

Given hourly measurements of temperature, what method best emulates the results that would be produced by daily observations recorded on a min/max thermometer so that time-of-observation bias in a min/max regime can be measured?

Suppose the min/max is read and reset at midnight each day. When the thermometer is read, which time periods might contribute to today’s reading? Anything from midnight yesterday (last reset) to midnight today (reading). That is 24 one-hour periods bounded by 25 discrete readings.

The bias between averages of discrete daily measurements and min/max average measurements is an interesting related issue. Unfortunately, the historic record consists mostly of (as I understand it) min/max average measurements.

I think that’s why I called it an RC type discussion, they seem to go around in circles a lot there with nobody understanding they’re not really having a discussion on the same topic! Actually, come to think of it, it’s not just RC. I suppose it’s everywhere, just seems more so on ‘the alarmist’ sites than on ‘the denier’ sites.

I’ve ignored the min/max thermometer situation up until now, as several have noted. That’s true. My first goal was to understand the simpler problem of correctly analyzing the point measurements, as presented in Jerry’s collected data sets.

I believe I’ve set a baseline for what an unbiased (no TOB issues) analysis of point measurements would look like.

The difference between the MrPete and JerryB algorithms for determining high/low/average helps explain how TOB becomes a problem.

So my next thought-experiment question becomes… How do we perform the reverse conversion? How can a set of min/max measurements be restored to a bias-free form?

This will require some thought. I hope my brain doesn’t burn out along the way! :) Thoughtful contributions most welcome!

You bring up another important area: determining what time the thermometers were actually read and reset. Even if you develop a fantabulous bias correction, it is useless if read times are inaccurate or unreliable.

JerryB posted the FTP site for the USHCN meta-data. The USHCN rates the reliability of time of observation with a variable that has three levels:

G for good.
F for flaky.
missing.

Is a station with a missing observation time reliability indicator flakier than a flaky station?

Perhaps an initial outcome of this first round can be an evaluation of the underlying uncertainty introduced into the raw data record, from errors inherent in min/max reading methodologies, to improper rounding rules, and more.

All of that takes place even with perfect sites, perfect observers, perfect observation timing.

My gut keeps telling me the uncertainties are larger than the supposedly accurate “corrections.” But I don’t suppose the media care.

Thanks, Steve, for starting this thread on Time of Observation Bias (TOB). For the benefit of newcomers, the first twenty-some comments were copied over from “A Second Look at USHCN Classification”, archived under Sept, at 9/15, and the early references to high-numbered comments go back to that thread.

Aurbo (#19, 22) has at last clarified what Karl et al at http://ams.allenpress.com/archive/1520-0450/25/2/pdf/i1520-0450-25-2-145.pdf are trying to quantify: If you measure the max temperature near the afternoon average temperature peak, on a day that is warmer than both its neighbors, there is a good chance that the temperature near the observation time will also be recorded as the peak for one of the adjacent days. Thus, high maxes tend to get double counted. The same is true for low minima if the recording time is near the early morning average temperature min. Thus, the summary charts for the maximum in Figure 8 of Karl et al tend to have a low plateau from around 9pm to 10am Local Standard Time, with a big spike between 10am and 9pm, with seasonal variations. The charts for the minimum tend to have a high plateau from around 10am to 9pm, with a big dip from 9pm to 10am.

If the object is to proxy the 24-hour average temperature with the average of a 24-hour min and max, the ideal time to take the reading would be at the intersection of the two plateaus. About 9pm or 9am would minimize the amount of double counting. According to Karl et al, before WWII most stations used to measure in the evening, with 5pm being the most popular time, but an increasing number are moving to the morning, with 7am being the most popular time. Both times are too early, with opposite biases: 5pm gets the min right, but overcounts the high maxes, while 7am gets the maxes right, but overcounts the cold mins. Therefore shifting from 5pm to 7am will shift the min/max mean down somewhat. Karl et al don’t directly report what the difference is, but eyeballing the mean graphs in Figure 8, it looks like it’s about -.9 deg. C! Since about 28% of stations switched from pm to am between 1941 and 1985, this is about -0.25 deg. C, which would justify an upward adjustment of +0.25 deg. C. This is at least close to the approximately +.35 deg. C. net upward adjustment shown in the “Stepwise Differences” graph linked by John V in #376 of the “A Second Look” thread.

In short, the Karl et al model, on which the NOAA etc. TOB adjustment is based, is not unreasonable after all. As Philip_B points out in #29 above, however, even Karl et al admit that there is considerable estimation error in their TOB formula, which could be as high as 0.2 deg. C. This should be factored into the measurement error in any bottom line estimate of climatic temperature trend.

Karl et al unfortunately obscure matters, by insisting that midnight is the ideal time to measure min and max for the purposes of estimating midnight-to-midnight average temperature. Midnight gets the maxes right, but often double counts the mins, and therefore itself generates a downward bias plus unnecessary measurement variance. The 11% of observers who dutifully trudge out to the stations at midnight would be doing everyone a favor if they just took their readings at 9pm and got some sleep instead.

I would suggest that the Time of Observation itself be abbreviated
Tobs and not TOBS, to avoid confusion with TOB, the standard abbreviation for Time of Observation Bias.

If you measure the max temperature near the afternoon average temperature peak, on a day that is warmer than both its neighbors, there is a good chance that the temperature near the observation time will also be recorded as the peak for one of the adjacent days. Thus, high maxes tend to get double counted. The same is true for low minima if the recording time is near the early morning average temperature min.

This description is wrong, but the effect is real. Given a certain phase relationship between the min-max measurement period and the signal, there is some probability that min-max will capture the peak and trough. As the period moves such that one of the end points approaches one of the peaks, you no longer capture the peak of every period; i.e., two troughs (or peaks) end up in the same interval. Thus rather than double counting you actually undercount extrema and pick arbitrary interior points.

This effect occurs because the peaks are not uniformly spaced at 24 hr intervals. Rather, they jitter with a long-run period of 24 hr. Physically this is something like amplitude modulation: f(t)*sin(2*pi*t/24), where the sin function represents a predictable daily solar variation and f(t) denotes other climatic variation. f(t) is responsible for the TOB.

“If you measure the max temperature near the afternoon average temperature peak, on a day that is warmer than both its neighbors, there is a good chance that the temperature near the observation time will also be recorded as the peak for one of the adjacent days. Thus, high maxes tend to get double counted. The same is true for low minima if the recording time is near the early morning average temperature min.”
This description is wrong, but the effect is real.

It’s true that I didn’t phrase that quite right: If you measure max temperature today, near today’s actual daily max time, and today is so warm relative to its neighbors that today’s current temperature at the Tobs is higher than the local max for either yesterday or tomorrow, then today’s warm temperatures will be double counted. If today’s max was a little earlier than Tobs, today’s max will show up as the official max for today, while today’s Tobs-temp will show up as tomorrow’s official max. Or if today’s max was a little later than Tobs, today’s Tobs-temp will show up as today’s official max and today’s actual max will show up as tomorrow’s official max. Either way, relatively warm days get double counted, and their cooler neighbors undercounted, creating an upward bias. The probability of a double count declines as Tobs moves away from Tmax, but the effect is perceptible from about 10am to 8pm in the max graphs in Karl et al’s Figure 8. However, it is the warm days that get double counted, not their max temperatures, as I erroneously put it.

Likewise, if Tobs is too close to Tmin, a night that is substantially cooler than its neighbors is likely to have its cool temperatures double counted, once as its actual min temp, and once as the current temp at the Tobs. Karl et al’s Min graphs in Fig. 8 show that there is perceptible downward bias from about 9pm to 8am in July. In other seasons, the weather is more turbulent, making the bias period somewhat longer, so that there is no perfect observation time from the point of view of both max bias and min bias. However, either 9am or 9pm would be a big improvement over either 7am or 5pm, or midnight for that matter!

Since about 28% of stations switched from pm to am between 1941 and 1985, this is about -0.25 deg. C, which would justify an upward adjustment of +0.25 deg. C. This is at least close to the approximately +.35 deg. C. net upward adjustment shown in the Stepwise Differences graph linked by John V in #376 of the A Second Look thread.

In fact, the “Stepwise differences” graph linked by John V shows approximately +.35 deg. F, or about +.19 deg. C, net upward US average TOB adjustment circa 1941-85. This is even closer to my back-of-the-envelope calculation than I had thought!

(I have a feeling we’ve covered this on other threads, but I can’t find it.)

Is anyone aware of major changes in TOBS policy in the US during the following time frames?:
– Early 1910s
– Late 1910s into early 1920s
– Late 1940s into early 1950s
– Mid 1970s
– Others

Is anyone aware of TOBS “corrections” applied by the keepers of GISS to these time frames or other specific time frames? If so, what were they? Also, if so, anyone aware of the rationale / justification?

You bring up another important area: determining what time the thermometers were actually read and reset. Even if you develop a fantabulous bias correction, it is useless if read times are inaccurate or unreliable.

It would matter were the recorded Tobs used to determine TOB. However, it appears it’s not and the Tobs is estimated, bizarrely IMO, to save a small amount of money.

The fraction of observers recording at various hours of the day was calculated and interpolated for intervening years (extrapolated for subsequent years). For these seven states, the ending time of observation was grouped into three categories: AM, PM, and MD. The AM category included observers who ended their climatological day between 3 AM and 11 AM; the PM category between noon and 9 PM; and the MD category between 10 PM and 2 AM; all local standard time. The fraction of observers in these categories was calculated, and it was assumed the 7 AM observation time best represented the AM category; the 5 PM observation time, the PM category; and midnight for the MD category. The reason for the simplification was to test if a faster method, requiring significantly less bookkeeping and keypunching, could not provide nearly as good results as calculating the fraction of observers at each of the 24 hours of the day.
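The grouping described in that passage can be sketched directly (hour = ending hour of the climatological day, local standard time, 0-23):

```python
def tobs_category(hour):
    """Karl et al.'s three-way grouping of ending observation hours."""
    if 3 <= hour <= 11:
        return "AM"      # the whole category is treated as a 7 AM observation
    if 12 <= hour <= 21:
        return "PM"      # treated as a 5 PM observation
    return "MD"          # 22, 23, 0, 1, 2: treated as midnight

print([tobs_category(h) for h in (7, 17, 0)])   # → ['AM', 'PM', 'MD']
```

Which makes the coarseness plain: an observer ending at 11 AM and one ending at 3 AM both get assigned the 7 AM bias estimate.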

I would not rule out the possibility that in addition to semi defensible TOB “estimates,” ostensibly aimed at “saving money,” other additional fudge factors were included in TOB “corrections” for specific time frames in the historical sequence, by the GISS-meisters. Suspicious guy that I tend to be …. ;)

The time of observation was determined at each station within a climate division during January of the years 1931, 1941, 1951, 1965, 1975, and 1984 for the states of California, Colorado, Illinois, Indiana, New York, North Carolina, and Washington. The fraction of observers recording at various hours of the day was calculated and interpolated for intervening years (extrapolated for subsequent years).

and

As a result, by assuming 7 AM observation for all AM stations and 5 PM for all PM stations, a good estimate of the median bias is obtained for all AM or PM observations. Furthermore, nearly all the MD stations observed at midnight.

There’s lots of assumin’ goin’ on out there.

Look at the states listed there, and imagine trying to estimate family incomes or gasoline prices from that sample. I interact regularly with government agencies (Census Bureau, Department of Energy, Bureau of Economic Analysis) that wouldn’t dream of something so sloppy.

Thus, an additional incremental uncertainty of about 0.3 Deg. F (~0.2 Deg. C) should be added to the estimation uncertainties (“errors of prediction”) in Karl’s paper (between 0.1 and 0.3 Deg. C) for the TOB process alone.

Differences of the biases were small ([less than] 0.3 Deg. F.) for those calculated by categorizing the ending time of observation into three categories compared to those obtained from calculating the fraction of stations with observation times at each of the 24 hours of the day. This is attributed to the preponderance of AM observation times falling between 6 AM and 9 AM, and PM observation times falling between 4 PM and 7 PM. As a result, by assuming 7 AM observation for all AM stations and 5 PM for all PM stations, a good estimate of the median bias is obtained for all AM or PM observations. Furthermore, nearly all the MD stations observed at midnight.

The first thing that stares me in the face about the manual is their rounding instruction:

Record these readings, as well as the current temperature, to the nearest whole degree. If the reading to the right of the decimal is 5 or greater, round off to the higher figure; i.e., 39.5 should be recorded as 40.

They’re using the old round-half-up method. It’s biased. Rounds up 5/9 of the time (5,6,7,8,9) and down 4/9 of the time (1,2,3,4). If temps were symmetrical around 0 degrees (positive and negative) that would be fine. But they’re not. I wonder if GISS attempts to “correct” for this?

Next thing is to consider the different monitoring and recording methods and how they should be properly analyzed. Hmmm… time to get some real work done and come back later…

Another thing that worries me about the NWS manual MrPete linked (go back to #99 for the actual link), is that although they tell observers to use as nearly as possible the same time every day, and to obtain permission to change the scheduled time, they don’t say whether to use local standard time or local statutory time when clocks go on or off Daylight time. Thus there may be a 1-hour change in TOB twice a year that is not being accounted for. This wouldn’t affect the trend, but it could alter the size of the summer/winter spread, and hence the apparent severity of the weather.

One notion that I have is that it can be very difficult accurately to
describe in a general way the temperature sequences that lead to TOB. If
one considers the afternoon of day 1 of the Peoria sample,
the 1600, 1700, 1800, and 1900 observations are less than that day’s
high, and the first three of them are lower yet than the previous
afternoon’s corresponding observations. Also, the second two of them are
lower than the first two of them. So they may not seem too interesting.
And yet, they become the max observations of four of the next afternoon’s
hypothetical 24 hour max/min observations.

While considering some of Hu’s and Jon’s recent comments, it seemed to me
that some things that I had not counted in the hourly data might be of
interest. Here are some preliminary numbers from Cedar City, Utah, data
which may relate to discussions of “double counting” and “under
counting”.

Number of instances in which the current 24 hour min is at exactly the same
temperature as the 24 hour old 24 hour min: 3554, just under once per
10 hypothetical periods of observation.

In case the wording of “Numbers of hourly observations of current 24 hour
min(or max) temperature in 24 hour periods ending at 36167 hypothetical
24 hour periods of observation:” is obscure, the following may help.
After the program identified the 24 hour min (or max) for a given 24 hour
period, it went back and counted how often that temperature occurred in
the hourly observations for that period, and recorded that count.
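If I read that correctly, the tallying step might look like this on hypothetical data (one 24-hour period is 25 hourly readings, per the reset-to-reading window):

```python
def count_min_ties(window):
    """Count how many hourly readings in the window tie the period's min."""
    return window.count(min(window))

# Hypothetical 25 hourly readings: the 24-hour min (47) occurs three times.
sample = [50, 48, 47, 47, 49, 47] + [55] * 19
print(count_min_ties(sample))   # → 3
```

A count greater than 1 is exactly the situation where a min/max thermometer cannot tell you which hour produced the minimum.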

Our goal is to record, as accurately as possible, as many valid highs, lows and “averages” as possible. Correct?

I’ve been noodling over the challenge of how to convert the record to unbiased/TOB-free data.
I quickly tossed one idea because it seemed to leave too much missing data to be acceptable.

But now Kristin has dug out the original data forms, and as Andrew notes, there’s already a LOT of missing data.

With that in mind, here’s what I’m thinking:

1) The best high/low data is collected at any time OTHER than the high/low of a 24 hour period.

2) If the temp is uniquely at its 24hr hi/lo point at TOB, then the actual TOB has become a significant factor in the measurement. (i.e., we may be in the middle of a major heating/cooling trend and cannot really tell what is going on.)

Therefore, where I’m heading is:

3) If the 24 hr high is uniquely found at TOB, yesterday or today, we throw out today’s high observation.

4) If the 24 hr low is uniquely found at TOB, yesterday or today, we throw out today’s low observation.

That may not be quite right, but I think I’m heading somewhere useful. It may be that we can be a bit more sophisticated, and do something like:

– Calculate daily temp range Tr
– Calculate minimum acceptable AT TOB temp reading difference: AT TOB reading must be at least min(1,Tr) away from the high and low of the day
(is there anywhere that temps are incredibly stable day and night? That’s what this is for)
– Toss all readings that are invalid
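A sketch of that screen, with hypothetical helper names (screen_day is my invention, not part of any existing code), under the reading of "at least min(1, Tr) away" as a minimum acceptable margin:

```python
def screen_day(high, low, at_obs):
    """Keep a day's high (or low) only if the at-observation reading is at
    least min(1, Tr) degrees away from it, where Tr is the daily range."""
    margin = min(1.0, high - low)        # Tr = daily temperature range
    keep_high = (high - at_obs) >= margin
    keep_low = (at_obs - low) >= margin
    return keep_high, keep_low

print(screen_day(86, 60, 85.5))   # → (False, True): the high is suspect
print(screen_day(86, 60, 72))     # → (True, True): both readings kept
```

The min(1, Tr) floor is what handles the "incredibly stable day and night" case: when the whole range is under a degree, the margin shrinks with it rather than tossing everything.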

Also look at Marysville, October 1980. It looks to me like when temp at time of observation is lower than the min temp, they select the temp that looks more in line with surrounding days. Sometimes they let the min temp stand and cross out the temp at observation; and sometimes they change the min temp to what it was at time of observation.

As Steven M says, they are not looking for temperatures, they are looking for the trend. Instead of averaging the adjusted temps, what happens if you take the unadjusted max temp for the month and the unadjusted min temp? Average those and look at the trends of all three: max, min, and average. Over months and years the trends for these should be equivalent to the actual average trend the way it’s being calculated now, only without the introduction of an adjustment. This would remove the TOB because there wouldn’t be a question of double counting except maybe at the beginning and end of the month, and that may be able to be adjusted for.

A lot of effort has gone into analysis (and display) of station data based on the assumption that the data can be treated as a valid time series, with data points that (for the most part) describe a temperature “line” through time.

The more I think about it, I’m beginning to visualize the data (so far, mentally) as a scattergram of independent measurements, including at individual stations:

* Many data points are missing

* Many data points should be treated as missing or at least “fuzzy”, as the time of observation is inappropriate for a proper 24 hour hi/lo reading

* Many data points (big picture) have variant time of observation

This raises a few questions for me. Can anyone provide insight?

* Does R have good tools for dealing with sparse datasets?

* What are some good statistical methods for addressing sparse data?

* Any bids on good/proper/best ways to “bin” sparse data collections, so as to properly recognize trends?

This last one seems a tough problem.

One thought experiment: imagine a sine wave, sampled at regular intervals. Remove data points. Obviously a bad idea to simply connect across the “gap”. With a sine wave, we know the shape of the missing data and can easily fill in.

With temp observations, we don’t know the missing data. My goal is not to fill in the gaps, but properly analyze the trend without allowing the gaps to bias the analysis.
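The sine-wave thought experiment is easy to run. A minimal sketch in plain Python (the gap placement and sample count are arbitrary choices for illustration):

```python
import math

# Sample a sine, knock out the points around a peak, and naively
# "connect across the gap" with straight-line interpolation. The filled
# series averages low, because interpolation cuts the corner off the peak.

n = 100
xs = [2 * math.pi * k / n for k in range(n)]
ys = [math.sin(x) for x in xs]

gap = set(range(20, 31))   # indices straddling the peak at x = pi/2

def fill_linear(xs, ys, gap):
    """Fill gap indices by linear interpolation between the nearest
    kept neighbours (a deliberately naive 'connect the dots')."""
    filled = list(ys)
    for i in sorted(gap):
        lo = max(j for j in range(i) if j not in gap)
        hi = min(j for j in range(i + 1, len(ys)) if j not in gap)
        w = (xs[i] - xs[lo]) / (xs[hi] - xs[lo])
        filled[i] = ys[lo] + w * (ys[hi] - ys[lo])
    return filled

true_mean = sum(ys) / n
filled_mean = sum(fill_linear(xs, ys, gap)) / n
print(round(true_mean, 4), round(filled_mean, 4))
# the filled-in mean comes out below the true mean of ~0: the gap biased
# the average even though every remaining data point is exactly correct
```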

Now I’m itching to play with a scatterplot of measurements in 3-D space. Temp vs Day-Of-Year vs Years. Spinning that picture just might bring insight and inspiration on how to handle this. Then, take a set of such plots, one per station, and overlay them. Feels like a typical 3-D surface “fitting” problem…

(I realize that many have “moved on” from such elementary thinking. Sometimes, going back to the beginning can be helpful. This whole TOBS thing continues to feel like an exercise in learning to properly handle missing and invalid measurements.)

True. In the former Delaware OH USHCN station, for example, an intriguing pattern of every 7th day missing develops in 1999. In 2000, there are binges of entire months missing, and the last daily reading is on 1/30/01. (This doesn’t stop CDIAC from publishing fictitious annual averages for 2001-2005, however!)

If there are just a few days missing, it doesn’t hurt to interpolate them from adjacent days and data from adjacent stations. However, what does the first reading after a gap mean? Is it the min/max for the 24 hours before the reading? Or is it the min/max for the entire period since the last reading? This makes a big difference for how the gap should be interpolated. The safe thing would be to regard the first reading after the gap as ambiguous, and not to resume the assumption of 24-hour periods until the second reading after the gap.

It should be remembered, however, that missing data reduces the effective sample size. Thus if you have 10 stations and 4 are missing and have been interpolated from the other 6, when you average the “areally adjusted” readings for all 10 stations, you should remember, in computing your squared standard error, to divide the estimated station variance by 6 and not by 10. Likewise, if you have 10 stations and 4 are 20% missing for the month, you should remember that you have only 9.2 actual stations that month. The USHCN stations were selected for the continuity of their records back in 1999 or so. Since then many like Delaware have probably regressed toward the mean, so that the sample may have been falling off noticeably in the past few years. This would tend to increase the true standard error of the US average toward the end of the data.
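The bookkeeping in that paragraph, using the same example numbers (the station variance below is made up purely for illustration):

```python
import math

# Interpolated stations add no independent information, so the squared
# standard error divides the station variance by the number of *actual*
# stations, not the nominal count.

n_eff_1 = 10 - 4          # 4 stations wholly interpolated -> 6 independent
n_eff_2 = 10 - 4 * 0.20   # 4 stations 20% missing -> 9.2 effective stations

station_var = 1.5                             # hypothetical station variance
se_naive  = math.sqrt(station_var / 10)       # wrongly divides by nominal count
se_honest = math.sqrt(station_var / n_eff_1)  # divides by actual count

print(n_eff_1, n_eff_2)       # 6 9.2
print(se_honest > se_naive)   # True: honest error bars are wider
```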

The NWS manual MrPete links at #99 above doesn’t tell observers what to do if they miss a day — when they return, should they just reset their thermometers without taking a reading, or should they go ahead and report them as their reading for that day? A statistical analysis of the min/max spread after missing days might clarify the actual practice.

#128. Pete, in some of the forms, there's an additional column which records the temp at measurement time. This is a substantial increase in relevant information for those stations. You can readily spot the affected days, as the temp at measurement time for the prior day = the next day's max. (Update Note: Same for min in applicable situations, as Jerry B observes below. Only one or the other will be relevant in a typical case.) For those stations, one could do a spot check on the NOAA algorithm by calculating the “normal” difference between the max and the (say) 5pm value, perhaps allowing for some nonlinearity in temperature. Use this to estimate the missing max value when it's been overridden by the previous day's 5 pm value.

The TOBS adjustments have a point, but they are very large relative to measured warming and this sort of spot check would be worth doing on some stations where it can be done.

As to missing data, that's a large topic and I don't think that there are any magic bullets. For example, the issue of the Monday reading after a lost weekend: I presume that the max-min read on Monday afternoon would be the max-min for the 3 days since Friday. You'd have to do something like a TOBS adjustment for this, but intuitively it seems to me that it would only increase the range a little.
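That intuition can be checked with a quick simulation on synthetic data (the Gaussian day-to-day spread below is an arbitrary assumption for illustration, not station statistics):

```python
import random

# A Monday reading after a lost weekend spans three days, so its max-min
# range can only widen relative to Monday's own single-day range: the
# 3-day max >= Monday's max and the 3-day min <= Monday's min.

random.seed(0)
widenings = []
for _ in range(10_000):
    days = [(20 + random.gauss(0, 2), 10 + random.gauss(0, 2))
            for _ in range(3)]                        # (max, min) for Sat-Mon
    range_3day = max(d[0] for d in days) - min(d[1] for d in days)
    range_mon  = days[-1][0] - days[-1][1]            # Monday alone
    widenings.append(range_3day - range_mon)

print(round(sum(widenings) / len(widenings), 2))      # mean extra range, deg
```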

“…records the time at measurement time”: I think you meant the temp at measurement time.

“…the temp at measurement time for the prior day=the next day’s max.”

It’s different than that. If the either temp at measurement time is outside of
the in between day’s max/min range, then if that temp at measurement time is
higher than the day’s max, it replaces that max; if that temp at
measurement time is lower than the day’s min, it replaces that min.
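JerryB's replacement rule reads directly as code (the function and field names are mine, purely illustrative):

```python
# The rule above: an at-observation temp from either bracketing day
# overrides the in-between day's extreme only when it falls outside
# that day's recorded max/min range.

def apply_at_obs(day_max, day_min, at_obs_readings):
    """at_obs_readings: at-observation temps from the bracketing days."""
    for t in at_obs_readings:
        if t > day_max:
            day_max = t       # at-obs temp above the recorded max replaces it
        elif t < day_min:
            day_min = t       # at-obs temp below the recorded min replaces it
    return day_max, day_min

# An at-obs reading of 75 the day before exceeds the day's recorded max of 72:
print(apply_at_obs(72.0, 55.0, [75.0, 58.0]))   # (75.0, 55.0)
```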

I took a look at that, and at the “USHCN Daily” data, to see how it got
interpreted. Also, while the operator made the change on the 18th, the
17th did not receive similar treatment by the operator.

In the USHCN Daily file, the min for both days was picked up from the AT OBSN
column. Also, presumably because the time of observation was 8 AM, the daily
max is recorded as that of the day before the day of observation.

To me, all of these recording delays, etc “ought” to result in non-recorded data. Otherwise, we’re dealing with “analysis” via interpolation or extrapolation, right on the data entry forms.

It’s always been amazing to me how people assume that it’s ok to just wing it when a data record has not been correctly recorded. And I don’t just mean in climate science. My guess: it is embarrassing to leave blank spots in the record, and we assume that we just “know” what the missing value ought to be.

The problem with pulling in a value from nearby geolocations or “nearby” times is the same as the problem with attempting to repair bad pixels or old photos in PhotoShop:

– Using a simple one-color brush, the “repair” is obvious, and rarely even looks correct
– Using a modern healing brush, one can hide the repair more nicely… but even then it is hard to get the photo to really appear unretouched
– AND in any case, the actual missing data is of course not filled in. The best one can do is create a fuzzy approximation.

Steve M is correct: missing data is a huge subject. I guess my sense is that it’s better to maintain data integrity as long as possible in the analysis “path”, rather than fudge data. At the very least, all the missing data should be accounted for as a widening of CI’s, rather than assumed as equally correct sensor readings.

Geoff Sherrington’s contributions here are quite interesting to our discussion. That discussion majors on interpolation due to geolocation, and minors on time-based interpolation. But the issues are the same.

Sensor blackening is another station micro-issue that could have a big effect on measured temperature trends. On surveying the Circleville, OH station for Surfacestations.org last month, I noticed that the MMTS was coated with black soil buildup that must surely be raising its temperature from when it was installed. Closeup photos of the top and vane are posted in the Surfacestations gallery. A blowup of the crud on the top may be viewed by clicking here, and the vanes by clicking here.

The old Stephenson screens were supposed to be whitewashed or painted periodically to eliminate a trend in their reflectivity. Anthony Watts has validly called attention to the fact that modern TiO2 latex paint has different heat properties than the old CaCO3-based whitewash. The plastic housings on the new MMTS sensors don’t require painting, so paint type is not an issue. However, shouldn’t they at least be cleaned occasionally??

The SHAP adjustment is supposed to adjust station readings for recorded events like moves or instrument changes. A later adjustment (FILNET?) is supposed to “homogenize” the records by statistically detecting probable unrecorded events, and removing their effects. It seems to me that this might create a spurious warming trend if the gradual blackening of shelters or sensors is occasionally and abruptly offset by unrecorded repainting and/or cleaning. In fact, this shift merely corrects for the opposite trend that has built up over time, but the program might erroneously remove the cleaning or painting downshifts while leaving in the intervening blackening updrifts. Periodic cleaning might therefore create even more apparent GW than once-and-for-all blackening without cleaning!

I would hope that any future station surveys would include similar close-up photos of the MMTS sensor in order to see how pervasive this blackening problem is. Even if the exteriors of the sensors were cleaned, would inaccessible soil buildup in the interior not also cause warming? Even if this soil is not in direct sunlight, it would still be receiving indirect reflected radiation off the vanes.

Sensor blackening is admittedly a little off topic for this thread, but since this is an off-topic thread to start with, perhaps this digression might be excused. I tried to raise this issue last month on the “Second Look” thread (#2069), at #101, but no one took notice, as other events were popping at that time.

Have a look at the ATMOZ site (just google ATMOZ). He's starting to look at TOBS stuff, has some questions, and may have found a couple of interesting things. I suggested he talk to you, but didn't have contact info for you.

In short, the Karl et al model, on which the NOAA etc. TOB adjustment is based, is not unreasonable after all. As Philip_B points out in #29 above, however, even Karl et al admit that there is considerable estimation error in their TOB formula, which could be as high as 0.2 deg C. This should be factored into the measurement error in any bottom-line estimate of climatic temperature trend.
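One simple way to factor that in, assuming the TOB estimation error is independent of the sampling error and expressed on the same scale as the trend's standard error, is to add the two in quadrature. Only the 0.2 deg C figure comes from the discussion above; the trend error below is purely illustrative:

```python
import math

# Combine two independent error sources in quadrature.
se_trend = 0.10   # hypothetical standard error of a trend estimate, deg C
se_tob   = 0.20   # quoted TOB formula estimation error, deg C

se_total = math.sqrt(se_trend**2 + se_tob**2)
print(round(se_total, 3))   # the TOB term dominates the combined error
```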

FWIW, this post was prompted by your post at ATMOZ.

Hu, while I agree that the TOB phenomenon is real, and has the potential to bias surface temperature measurements, I am unconvinced that NOAA's execution of the adjustment for TOBS is either reasonable or accurate. Without accurate, reliable metadata, the TOBS adjustment (which on an aggregate basis substantially cools the past) as applied by NOAA is little more than a SWAG.

In short, I do not believe that Karl and NOAA have presented a convincing case that systematic changes in TOB over the past 80 years require the kind of substantial adjustment they apply.

What if the weather stations were manned by experienced people who liked their craft and were not stupid? Say a reader sees a very low min at 9 am, the usual reading time. She/he says, “I won't reset the thermometer until this cold snap passes, otherwise it might show up tomorrow too. So I'll come back at 11 am and read and reset the min thermometer.” For that day, starting at 9 am, the Tmax comes at 2 pm (say) and the Tmin comes next morning, on the same record day, at 8 am (say). If that morning reading is colder than the 11 am pseudo-reading, then it's correct; if it's warmer than the 11 am pseudo-reading, then the minimum for the day has already been taken at 11 am. Thus the daily record is correct and it requires no adjustment. Tmax is right and Tmin is right for the 24 hours starting at 9 am. Midnight readings are irrelevant. TOBS is irrelevant.

There are other scenarios where no TOBS adjustment is needed for a day; many of them are simply normal days. There is NO justification for spreading TOBS adjustments over every day for months or years. All that does is corrupt data known to be correct for the intended design and purpose.

Passing cold snaps, or warm ones, are minor factors of TOB. TOB is mainly a result of ordinary, day to day, variations of temperatures. Varying reset times will merely corrupt the readings; it will not make TOB go away.

I have always doubted TOBS corrections. However, being a power station engineer, I have access to a lot of high-quality monitoring data, taken to much higher accuracy than usual temperature measurements. Processing that data shows TOBS is real, as the summary below shows:

Sorry about the formatting but I don’t know how to fit spreadsheets in.

The 4pm data shows the TOBS effect is about 0.5 deg C higher than at midnight, but the interesting thing is that the correction can be so different month to month. I also suspect that each site can have significant changes, so that a simple “correction” shouldn't be applied.