A comparison between global surface temperature and satellite anomaly datasets

UPDATE: This post was properly criticized for being an incomplete analysis, see this update and note to readers below about why it was incomplete.

Guest Essay By David Dohbro.

Comparing five monthly datasets since 1979. In almost all cases, the three land-based data sets report monthly GSTA values that are higher than the satellite-based GSTAs.

Several global surface temperature anomaly (GSTA) datasets are publicly and freely available. These can be divided into two categories: surface-based and remote (satellite). The first category includes NASA’s GISS (1), NOAA/NCDC’s GSTA (here called NCDC) (2), the UK Met Office Hadley Centre’s HadCRUT4 (3), and several others. Satellite-based GSTAs are produced by NSSTC (here called UAH) (4) and RSS (5). Each of these produces GSTAs on a monthly basis. One could write a long essay about all the important differences in how each calculates its monthly GSTA, and that certainly matters, but here I simply want to compare the data sets and see how well (or badly) they match each other. For example: is one data set consistently reporting higher or lower than the others? Are these differences increasing or decreasing over time? I am not assigning any subjective value to these possible differences; I just want to see whether there are any differences and whether there is a trend in them.

I used UAH’s data set as the reference. I could have used any other data set, since the differences are relative and it doesn’t matter which data set is compared to which. The satellite-based datasets start in 1979, whereas the land-based datasets in some cases go all the way back to 1850. Hence, only the data from 1979 onward can be compared. That is still over 35 years’ worth of data (n > 420), a large enough sample size to say something meaningful about possible differences between the GSTAs reported by each dataset. I then simply subtracted the UAH monthly GSTA from the corresponding monthly GSTA of each of the other data sets (GISS, HadCRUT4, RSS and NCDC; data from January 1979 through March 2014), plotted these differences for each month, and performed linear regression through each set of differences (Figure 1). A value of 0 means the UAH data and the other dataset agree; a value > 0 means the other dataset reports a higher monthly GSTA than UAH, and vice versa.
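The subtract-and-regress procedure described above is straightforward to reproduce. A minimal sketch in Python, assuming each series has already been loaded as a NumPy array of monthly anomalies starting January 1979 (the short arrays below are invented illustration values, not real data):

```python
import numpy as np

def anomaly_differences(other, uah):
    """Monthly difference series: other dataset minus the UAH reference."""
    n = min(len(other), len(uah))      # align on the shared period
    return other[:n] - uah[:n]

def linear_trend(diff):
    """Least-squares slope and intercept of the difference series,
    with the slope in degrees C per month."""
    t = np.arange(len(diff))
    slope, intercept = np.polyfit(t, diff, 1)
    return slope, intercept

# Hypothetical example: two short synthetic anomaly series
uah  = np.array([0.10, 0.12, 0.08, 0.15, 0.11, 0.13])
giss = np.array([0.45, 0.50, 0.44, 0.52, 0.49, 0.51])
diff = anomaly_differences(giss, uah)
slope, intercept = linear_trend(diff)
```

A positive slope here would mean the two datasets are drifting apart over time; a slope near zero means the offset between them is stable.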

Figure 1.

As you can see, the three land-based data sets consistently report monthly GSTAs higher than UAH’s, in the order NCDC > GISS > HadCRUT4 > RSS. NCDC’s data set reports on average a monthly GSTA 0.41°C higher than UAH’s. This value is almost the same as the average monthly GSTA reported by NCDC since 1979 (Table 1).

Table 1: summary statistics of several GSTA data sets and the difference between each of these data-sets with UAH’s

The other satellite-based data set, RSS, reports values rather similar to UAH’s (average difference of 0.058°C). In addition to the summary statistics, linear trend analysis (assuming normally distributed data) shows that the difference between RSS and UAH is decreasing over time (negative slope), while that for the three land-based data sets is increasing. The increase in difference between NCDC and UAH is the smallest (slope almost 0), followed by HadCRUT4 and GISS (Figure 1). In fact, the difference between UAH and GISS has increased from 0.34°C in 1979 to 0.40°C by 2014, an 18% increase in total, or about +0.5% per year. If this trend continues, GISS will diverge further not only from UAH’s reported monthly GSTAs but also from the other data sets. GISS would then also surpass NCDC’s difference with UAH, which is currently the largest, since the difference between GISS and NCDC is now less than 0.05°C.

In summary, all five GSTA datasets analyzed here show an average GSTA over the past 35 years of between 0.01 and 0.42°C above their respective baseline periods, which vary between data sets. The land-based data sets report, in almost all cases, monthly GSTAs higher than the satellite-based GSTAs. In addition, there is a general trend toward larger differences between the former and the latter over time (since 1979). The GISS data set has the strongest trend in difference over time and, if this trend continues, will soon show the largest difference with UAH, as well as diverging further from the other land-based data sets. This continuing divergence, to the point where the difference is larger than the long-term averages between satellite-based and land-based reported GSTAs, warrants more in-depth analysis and attention.

This post was properly criticized for being an incomplete analysis. On a related note, I’m often criticized for being in the employ of “big oil” and flush with cash.

If that were so, I’d be able to hire assistant editors, and simple mistakes like this one wouldn’t have seen the light of day. Such is the issue of being a lone editor on a very demanding blog.

I actually hadn’t intended to publish this post, and had planned to contact Mr. Dohbro today about reworking it. I had saved it to drafts and, as Mosher noted, he didn’t do a baseline alignment for the anomalies. The post was originally a draft, joining dozens of other posts that I put into the system but have not published for various reasons; it wasn’t set to publish. So that I can do my work during the day, since I can’t rely on those “big oil” checks, I often schedule posts to auto-publish in advance. Posts loaded into drafts have a collaboration tool that I planned to give Mr. Dohbro access to so that they could be updated and corrected as needed.

Last night I rescheduled and shuffled a bunch of new posts around after the changes I made yesterday to the WUWT format change and the news about UoQ and Shollenberger’s letter challenge required changing the schedule for Monday morning. Somehow, I set this post to auto-publish. I may have simply hit the wrong button, as Steve McIntyre recently did on Mann Misrepresents the EPA – Part 1, or I may have loaded the wrong story and got distracted and simply scheduled the wrong story. I just don’t know. I do know that whatever happened is entirely my fault.

I noted the post was published early today, and considered taking it down then, but thought better of that idea and decided I’d attend to it tonight and set things straight. Paul Homewood beat me to it and I thank him for doing so, and I’m updating this note from work.

My apologies to David Dohbro for publishing his essay without notice (he had no idea it had been published), and an apology to readers for publishing an essay that obviously needed some additional work to be fully accurate. – Anthony

Because of the different base periods for anomalies, maybe the comparisons are not as useful as they might be. Can you get hold of the base period data for all series, then reconstruct the absolute temperature series for all except UAH (say) and re-base them on UAH’s base period? The results could be a bit different.

If you can’t get hold of the base period data, then you can rebase them all to a common period (eg. 1979-1989), provided you then report annual averages not monthly data. Given that your main findings are expressed in deg p.a., the results will I think be equally valid. The graphs might also be easier to interpret.
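Rebasing to a common period amounts to subtracting each series’ own mean over the chosen window. A minimal sketch of this, with invented synthetic series purely for illustration:

```python
import numpy as np

def rebase(series, years, base_start, base_end):
    """Re-express anomalies relative to a common base period by
    subtracting the series' own mean over that period."""
    mask = (years >= base_start) & (years <= base_end)
    return series - series[mask].mean()

# Illustration: two synthetic annual series whose original baselines differ,
# so one sits a constant 0.3 degrees above the other.
years = np.arange(1979, 1991)
a = np.linspace(0.0, 0.5, len(years))
b = np.linspace(0.3, 0.8, len(years))

a_common = rebase(a, years, 1979, 1989)
b_common = rebase(b, years, 1979, 1989)
# After rebasing, each series averages to ~0 over 1979-1989, and the
# constant baseline offset between them disappears; any remaining
# difference reflects genuine divergence, not the choice of base period.
```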

The anomalies are relative to the base period. If the base period is different for each data set then you can’t compare their respective values. For example, in a period of warming temperatures, any data set that has an earlier base period will have bigger positive anomalies – all other things being equal – because the temperature has had longer to increase. That doesn’t tell you anything other than that the base periods are different.

I don’t see how you can compare the anomalies unless you resolve the base period in each case to be the same. The trends maybe, but not the actual values.

Warming departure in UAH lower troposphere satellite temperatures compared to RSS over the period 2005-2006: http://www.warwickhughes.com/blog/?p=2496
I would not jump to a conclusion that UAH is more correct than RSS.

“As you can see, the three land-based data sets consistently report monthly GSTA higher than that of UAH. With NCDC > GISS > Hadcrut4 > RSS. NCDC’s data set reports on average a monthly GSTA 0.41°C higher than that of UAH”

As Mike Jonas says, this is meaningless unless you put them on the same anomaly base. The trend differences are meaningful, but the only one that stands out is the difference between UAH and RSS. UAH and the surface measures are relatively close. It is Lord M’s favourite, RSS, that is the outlier.

There is an interactive graph here of those five indices, plotted monthly on a common base – you can rescale, etc. Scroll up for details.

jimmi_the_dalek says:
May 19, 2014 at 1:13 am
As already pointed out, you need to get the baselines equal before you do anything else.
Here, woodfortrees has already done it for you: http://www.woodfortrees.org/notes#baselines
=======================================================
As your post and linked graph only confirm the point, gistemp is the outlier. Notice how gistemp (red) is regularly mixed in with the other metrics through about 1998. Since then it has sat ever more at the top of all the other metrics, and in the last several years this is increasing. Indeed, your link shows gistemp is increasingly an outlier.

Note that upstart products from Berkeley BEST and Skeptical Science web site partners Cowtan & Way are now on board promoting real thermometer record hockey sticks even more extreme than Hansen’s GISS. These are likewise utterly falsified by the majority of the oldest existing thermometer records themselves, including Central England, which is a damn good proxy for the older pre-alarm-era global average plots – a lot better than tree rings are:

These hockey stick upgrades to boring old thermometer data are also bluntly falsified by the best liquid-expansion thermometer of all, the ocean, which utterly refuses to play along, so NASA’s web site cuts recent tide gauge data off, to obscure this unprofitable fact:

I was under the impression that RSS, being a satellite based database reports lower tropospheric temperatures and not the ground surface temperatures and hence doesn’t see the effect of nighttime temperature inversions due to radiational cooling. I think I’ve heard of some experimental products looking at surface temps, but wasn’t aware that any have become operational.

In fact, the difference between UAH and GISS has increased from 0.34°C/month in 1979 to 0.40°C/month by 2014, which is an 18% increase in difference in total or +0.5%/year.

I really hate percentage comparisons between anomalies like this. From Figure 1, it appears to me the difference between RSS and UAH has dropped some 101%. If that trend continues for a few years, I think the discrepancy will change by about another 100% between today and 2017.

Obviously both of these are much greater than the change in the discrepancy with UAH. I suggest you simply drop the percentage comparisons in any future analyses.

“The trend differences are meaningful, but the only one that stands out is the difference between UAH and RSS. UAH and the surface measures are relatively close.”

Well, we should not forget that TLT should amplify the warming since 1979, whereas we observe the opposite (including in UAH). Graphs on a monthly basis are interesting, but they include seasonal variations and significant differences in amplitude. An image with annual values has a slightly different look: http://img215.imageshack.us/img215/5149/plusuah.png

The TLT–surface divergence is relatively large, and it does not go in the direction that the physics of the atmosphere predicts. It’s a bit annoying.

1) As multiple people have already pointed out, you absolutely MUST adjust for the different baselines used by the different datasets. Without such an adjustment, the different datasets are not directly comparable.

2) You cite trends in the differences between the datasets, and even show a figure to support them. However, the highest r² for any of the land-based measurements is 0.02 (GISS). That means the data have no trend. It also means the change of 0.06° (or, as you put it, 18%) is in fact no change.

Look, it is unacceptable to just throw stuff out there and expect readers to debunk it. Why? Because some people will just read the headline, or read the article and never make it to the comments. Disinformation spreads and the corrections never do.

How many times do we see faulty representations of climate science pushed in the media without correction.

Problems.

There are more than two satellite series covering the period in question.
There are more than 3 global “land” series
They have different baselines, this must be accounted for.
They measure different things.
They measure in different ways.

All that said there are interesting things one can learn by comparing them, but it requires more work and more diligence than this post offers.

I think if you look at the RSS and UAH data you will find the RSS data started higher and the difference increased until the early 2000s. Since that time the two have been converging. This would seem to falsify the claim that orbital problems are the reason for the difference. If that was the case the two should be diverging even more at present.

I suspect the difference is in the areas where UAH provides data and RSS doesn’t. Since this is primarily around the south polar areas, that would seem to indicate that it cooled late in the 20th century but has now returned to a near-average value.

However, one can only estimate the offset due to the different base periods (see also woodfortrees). I would like to refrain from adding an(other) estimate, as it only adds more uncertainty rather than less.

Just because they have different base periods does not mean one cannot compare the monthly Global Surface Temperature Anomalies (GSTA) each produces, which are in the same units. Otherwise we should also refrain from posting diagrams containing all of these different datasets in the same plot.

The idea here is simply to determine whether the output of each data set is higher or lower than the others, by how much, and whether that difference is changing over time. RSS is the “odd duck”, likely because it is also satellite-based. Interestingly, RSS has a negative slope, while the land-based slopes are all positive.

Just because I didn’t compare ALL available datasets doesn’t mean this analysis is invalid. I chose those most widely used and known, and then compared their outputs unadjusted. Critique should be pointed at what was done, not at what was not done.

FYI: r-squared is not a measure of trend; the p-value of the slope is. I simply mentioned the slopes here. One can have a significant trend with a very low r-squared, and no trend with an r-squared of 1 (e.g. x = 1 has an r-squared of 1, but no trend…). What we do and don’t like is irrelevant and doesn’t invalidate the comparison. Given that RSS is diverging less from UAH (difference closest to 0), it warrants less attention than the increasing divergence from the land-based datasets.
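The point that a slope can be statistically significant while r² stays small is easy to demonstrate numerically. A sketch with synthetic data (the numbers are invented; n = 423 merely mimics the length of the Jan 1979 – Mar 2014 sample):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 423                                    # ~ Jan 1979 through Mar 2014
t = np.arange(n, dtype=float)
y = 0.0002 * t + rng.normal(0.0, 0.1, n)   # tiny trend buried in noise

# Ordinary least-squares fit and its r-squared
slope, intercept = np.polyfit(t, y, 1)
resid = y - (slope * t + intercept)
r2 = 1.0 - resid.var() / y.var()

# Standard error of the slope and its t-statistic
se = np.sqrt(resid.var(ddof=2) / ((t - t.mean()) ** 2).sum())
t_stat = slope / se
# t_stat comfortably exceeds the ~1.97 two-sided 5% critical value
# for this sample size, even though r2 stays small: the trend is
# significant, yet it explains little of the month-to-month variance.
```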

“This post was properly criticized for being an incomplete analysis”
It is still incomplete without a thorough error analysis with tests of significance of any results. The error bands do look very wide.

Anthony – apology accepted. No worries otherwise; we all make mistakes (me included) and things do slip through the cracks once in a while. This time it was my work. However, I was indeed surprised to see it posted without having had an email response from you first. As you recall, my email was already rather questioning of my own work – since I did it somewhat hastily, without further in-depth thinking, one could say this was a good first step – but I thought I’d bring it to your attention nonetheless.

I appreciate the feedback received, as I see it as peer review aimed at improving the analysis, and will work on a re-analysis of the data adjusting for baseline years (ugh, I really don’t like “adjusting”, since these days everything in climate science is “adjusted”; know what I mean ;-) ) and using a running average to smooth things out. I will also include statistical analyses.