Wednesday, 13 November 2013

Temperature trend over last 15 years is twice as large as previously thought

Yesterday a study appeared in the Quarterly Journal of the Royal Meteorological Society that suggests that the temperature trend over the last 15 years is about twice as large as previously thought. This study [UPDATE: Now Open Access] is by Kevin Cowtan and Robert G. Way and is called: "Coverage bias in the HadCRUT4 temperature series and its impact on recent temperature trends".

The reason for the bias is that in the HadCRUT dataset, there is a gap in the Arctic and the study shows that it is likely that there was strong warming in this missing data region (h/t Stefan Rahmstorf at Klimalounge in German; the comments and answers by Rahmstorf there are also interesting and refreshingly civilized; might be worth reading the "translation"). In the HadCRUT4 dataset the temperature trend over the period 1997-2012 is only 0.05°C per decade. After filling the gap in the Arctic, the trend is 0.12 °C per decade.
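Such decadal trends are just least-squares slopes of the anomaly series. A minimal sketch with made-up numbers (not the actual HadCRUT4 data):

```python
import numpy as np

def trend_per_decade(years, anomalies):
    """Ordinary least-squares slope, converted from degrees per year to per decade."""
    slope, _intercept = np.polyfit(years, anomalies, 1)
    return 10.0 * slope

# Synthetic anomalies rising by exactly 0.05 degrees C per decade
years = np.arange(1997, 2013)            # 1997-2012, the period of the study
anomalies = 0.005 * (years - 1997) + 0.3
print(round(trend_per_decade(years, anomalies), 2))  # 0.05
```

The difference between 0.05 and 0.12 °C per decade is thus simply the slope computed before and after the gap is filled.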

The study starts with the observation that over the period 1997 to 2012 "GISTEMP, UAH and NCEP/NCAR [which have (nearly) complete global coverage and no large gap at the Arctic, VV] all show faster warming in the Arctic than over the planet as a whole, and GISTEMP and NCEP/NCAR also show faster warming in the Antarctic. Both of these regions are largely missing in the HadCRUT4 data. If the other datasets are right, this should lead to a cool bias due to coverage in the HadCRUT4 temperature series."

Datasets

All datasets have their own strengths and weaknesses. The nice thing about this paper is how it combines the datasets to exploit their strengths and mitigate their weaknesses.

Surface data. Direct (in-situ) measurements of temperature (used in HadCRUT and GISTEMP) are very important. Because they lend themselves well to homogenization, station data is temporally consistent and its trends are thus the most reliable. Problems are that most observations were not performed with climate change in mind and, importantly for this study, that there are spatial gaps.

Satellite data. Satellites perform indirect measurements of the temperature (UAH and RSS). Their main strengths are global coverage and spatial detail. A problem for satellite datasets is that the computation of physical parameters (retrievals) needs simplifying assumptions and that other (partially unknown) factors can influence the result. The temperature retrieval needs information on the surface, which is especially important in the Arctic. The other satellite temperature dataset, by RSS, therefore omits the Arctic. UAH is also expected to have biases in the Arctic, but does provide data.
A further problem of satellite data is that it is not homogeneous, due to drifts of the satellites, changes in calibration constants as the instruments deteriorate, and changes in the number and type of satellites. Computing reliable trends from satellite data is thus difficult.

Finally, satellite results cannot be directly compared to surface measurements, as satellites measure the temperature of the lower troposphere, well above the 2-metre height at which station data is measured. As the study writes:

The satellite record is of particular interest because it provides a uniform sampling with near-global coverage. However the satellite microwave sounding units measure lower troposphere rather than surface temperatures and so are not directly comparable to the in situ temperature record. Furthermore there are temporal uncertainties in the satellite record arising from satellite failure and replacement and the numerous corrections required to construct a homogeneous record (Karl et al. 2006). Contamination of the microwave signal from different surface types is also an issue, particularly over ice and at high altitude (Mears et al. 2003).

Reanalysis data. Already in 2008, RealClimate suggested that the Arctic data gap in the HadCRUT3v dataset may produce too small a trend, because reanalysis data and the GISS dataset show strong warming in the missing region.

An analysis dataset combines all observations into a field that can be used to initialise a weather prediction. Typically, this also uses the information of a previous short-term weather prediction. In a reanalysis dataset this process is repeated to produce a long climate dataset. A reanalysis is better than analysis data because more data is available than in real-time weather prediction and because a reanalysis is computed with one fixed atmospheric model, whereas an operational weather prediction model is regularly improved. However, reanalysis data is still not very reliable when it comes to trends, because the amount and type of observations change considerably in time. For this reason the current Cowtan and Way study is much more reliable.

Putting it all together

The study uses two interpolation methods to fill the Arctic gap. The most interesting one is the hybrid method, which uses the strength of the UAH dataset (spatial coverage) and that of the HadCRUT dataset (temporal consistency). It does so by interpolating the difference between the two datasets. In this way the biases in the UAH dataset are essentially estimated at the Arctic edge of the HadCRUT dataset and thus reduced. The bias-corrected satellite data is then used to fill the gap, especially in the middle; nearer to the edges the HadCRUT data itself also still contributes directly to the interpolation.
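The hybrid idea can be sketched in one dimension with numpy. The numbers and the simple linear interpolation are purely illustrative, not the authors' actual implementation:

```python
import numpy as np

lat = np.arange(60, 91).astype(float)   # hypothetical latitude band
satellite = 0.04 * (lat - 60) + 0.5     # complete coverage, but biased
surface = 0.05 * (lat - 60)             # the "truth" where observed
surface_obs = surface.copy()
surface_obs[lat > 80] = np.nan          # Arctic gap, as in HadCRUT4

gap = np.isnan(surface_obs)
diff = satellite - surface_obs          # satellite bias where both datasets exist
diff_filled = np.interp(lat, lat[~gap], diff[~gap])  # carry the bias poleward
hybrid = surface_obs.copy()
hybrid[gap] = satellite[gap] - diff_filled[gap]      # bias-corrected satellite fill
```

In this toy example the bias-corrected fill ends up several times closer to the withheld truth than the raw satellite values would be.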

The method is carefully validated by enlarging the gap at the Arctic and testing how well the reconstruction fits the data in the artificial part of this gap. Very elegant, even if this may not be the final word, because the behaviour of the UAH biases may be different in the true gap than in the artificial one, which is further from the pole.
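Schematically, this validation withholds observed cells next to the real gap and scores the reconstruction against the withheld truth (illustrative numbers and a simple interpolator, not the paper's setup):

```python
import numpy as np

rng = np.random.default_rng(0)
lat = np.arange(50, 81).astype(float)    # observed band; the real gap lies poleward
field = 0.03 * (lat - 50) + 0.1 * rng.standard_normal(lat.size)

holdout = lat > 75                       # artificially widen the gap
reconstruction = np.interp(lat, lat[~holdout], field[~holdout])

# Skill in the artificial gap serves as a proxy for skill in the real one
rmse = np.sqrt(np.mean((reconstruction[holdout] - field[holdout]) ** 2))
```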

Figure 5b from Cowtan and Way (2013), showing the temperature for the satellite period smoothed with a 60-month moving average. The null reconstruction fills the missing data with the global average values, the kriging reconstruction performs an optimal interpolation to fill all gaps, the hybrid reconstruction is briefly explained above, and HadCRUT4 is the original dataset with gaps.

Amateur science

Another interesting aspect of this study is that the authors are scientists, but not climate scientists, and did the work in their free time. The authors are part of the Skeptical Science crew, and the acknowledgements and the posts by Dana Nuccitelli and Stefan Rahmstorf on this study suggest that they did get some help from professionals.

This shows that amateur scientists can make valuable contributions, as we have also seen with Zeke Hausfather and his study on the influence of urbanization. And it also suggests that collaboration with experienced people is important. Personally, having changed my research topic a few times, I always make sure to collaborate with experts to avoid making rookie mistakes.

[UPDATE: On twitter someone complained about my use of the word amateur.

Yes, he is a scientist and that surely helped him a lot. But he is not a climate scientist and did the work in his free time. I would also have called myself an amateur at the moment I wrote my first paper on homogenization, which was also mainly written in my free time, for the love of a beautiful topic. The word is also not intended as an insult; in fact, a bit more than a century ago, scientists being professionals was regarded as a problem by many of the amateur scientists who were then the main group.]

Also refreshing, after previous game-changing studies by amateur ostriches, is the modesty of the authors, as Dr. Cowtan states in The Guardian:

"No difficult scientific problem is ever solved in a single paper. I don't expect our paper to be the last word on this, but I hope we have advanced the discussion."

Minor deviation

In fact, we are looking at a minor deviation. I am impressed that climate science seems able to come to conclusions about it. The warming of the atmosphere is only 2 percent of the warming of the full climate system, and the deviation over the last 15 years is also just a few percent of the warming since 1900. Thus we are looking at a deviation of less than one in a thousand, and if the current study holds, even much less.
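The back-of-the-envelope arithmetic, with rough illustrative shares rather than precise energy-budget numbers:

```python
atmosphere_share = 0.02   # atmosphere holds roughly 2% of the climate system's warming
deviation_share = 0.04    # trend deviation: a few percent of the warming since 1900

# The product is the deviation as a fraction of total climate-system warming
print(atmosphere_share * deviation_share < 0.001)  # True: less than one in a thousand
```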

We now have several explanations for the reduced trend in the temperature record over the last 15 years: the Arctic gap, El Nino and more warming of the oceans, and maybe smaller contributions from volcanic eruptions and reduced solar activity.

My personal hunch would be improvements in the climatic observations. I could imagine that people pay more attention to making accurate observations now that climate change is an issue (for meteorology less accuracy is sufficient), and maybe also that the introduction of ventilated automatic weather stations reduced radiation errors. But that is just a hunch and still needs to be investigated. Let's see in a few years, when the dust has settled, what the main factors are.

Judith Curry: They state that most of the difference in their reconstructed global average comes from the Arctic, so I focus on the Arctic (which is where I have special expertise in any event). First, Kriging. Kriging across land/ocean/sea ice boundaries makes no physical sense. While the paper cites Rigor et al. (2000) that shows ‘some’ correlation in winter between land and sea ice temps at up to 1000 km, I would expect no correlation in other seasons.

I see no reason not to use kriging. It is an optimal interpolation technique and also delivers the uncertainties of the interpolated values. Given that we expect the temperature trend to be larger near the poles than around the equator, one would expect interpolation to underestimate the trend. As scientists are conservative, that is the preferred direction of erring. The GISS dataset also uses interpolation to fill the gaps. Taking land, ocean and sea ice into account as covariables will likely make the interpolated estimates more accurate.

Lotte Bruch makes an interesting comment at the Klimalounge (in German): she mainly expects this study to change the temperatures in winter, because in summer the melting prevents strong temperature deviations. In the light of Curry's remark above, it would be interesting to know whether this is the case. Then it would be no problem: if there is not much variation in the other seasons, that would also explain why the correlations are low in those seasons, and then the low correlations would not be a problem for the interpolation either. Definitely worth a study.

The first author of this study, Kevin Cowtan, has a detailed response to Curry's comment below her post: the difference between the hybrid and kriging reconstructions of Antarctica is only really significant around 1998, so it doesn’t greatly affect our conclusions.

And also the second author, Robert Way, has written two clear, but not very friendly comments: The cross-validation steps taken in this paper are very important and the paper shows rather clearly that the Hybrid method in particular appears to be fairly robust even at long distances from adjacent cells.
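For readers unfamiliar with kriging: in its simplest form it is a covariance-weighted interpolation that also returns a predictive variance. A toy one-dimensional simple-kriging sketch with a Gaussian covariance and an assumed unit prior variance (nothing here reflects the covariance model of the actual datasets):

```python
import numpy as np

def simple_kriging(x_obs, y_obs, x_new, length_scale=5.0, jitter=1e-6):
    """Predict y at x_new and return the kriging variance as well."""
    cov = lambda a, b: np.exp(-((a[:, None] - b[None, :]) / length_scale) ** 2)
    K = cov(x_obs, x_obs) + jitter * np.eye(x_obs.size)  # obs-obs covariance
    k = cov(x_obs, x_new)                                # obs-prediction covariance
    weights = np.linalg.solve(K, k)                      # kriging weights
    y_pred = weights.T @ y_obs
    var = 1.0 - np.sum(weights * k, axis=0)              # predictive variance
    return y_pred, var

x_obs = np.array([0.0, 2.0, 4.0, 6.0])
y_obs = np.sin(x_obs)
y_pred, var = simple_kriging(x_obs, y_obs, np.array([3.0, 100.0]))
# var[0] is small (inside the data); var[1] approaches 1 far from all observations
```

The growing variance far from the observations is exactly the uncertainty information mentioned above.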

Judith Curry: Second, UAH satellite analyses. Not useful at high latitudes in the presence of temperature inversions and not useful over sea ice (which has a very complex spatially varying microwave emission signature). Hopefully John Christy will chime in on this.

Because satellite data is not very reliable, the authors use a combination of satellite and surface data and correct for (some of) the errors in the satellite dataset. How well this works will have to be studied in subsequent papers. I see no a priori reason why it should not work, and especially not why it should be biased. If it is a problem, it would just add to the uncertainties.

I really wonder why John Christy of the UAH dataset should chime in. He is the one who delivers a dataset with values in the Arctic. Maybe it would be better to ask the mainstream scientists behind the RSS dataset. They are the ones who did not trust the data sufficiently and rather left a gap in the Arctic.

Judith Curry: Third, re reanalyses in the Arctic. See Fig 1 from this paper, which gives you a sense of the magnitude of grid point errors for one point over an annual cycle. Some potential utility here, but reanalyses are not useful for trends owing to temporal inhomogeneities in the datasets that are assimilated.

The paper is not based on the reanalysis datasets. As I wrote above, it only presents the information that GISTEMP, UAH and NCEP/NCAR show stronger than average warming in the Arctic. The analysis in the paper itself is based on the HadCRUT and UAH datasets.

Judith Curry: So I don’t think Cowtan and Wray’s analysis adds anything to our understanding of the global surface temperature field and the ‘pause.’

That is not very generous. At least it adds the information that the gap in the Arctic is potentially important and that it is thus worthwhile to study it in more detail, including improvements of the interesting new analysis method used by the authors. This is valuable information for people interested in this minor deviation in the temperature trend. At least for people who would like to understand it.

Another valuable point is that the study illustrates how minor the deviation in the temperature is. If you have to worry about temperature inversions in the Arctic and hope that the very complex spatially varying microwave emission signature will save your point of view, then maybe it is time to broaden your view and have another look at the 0.8 °C temperature increase over the last century.

Judith Curry: The bottom line remains Ed Hawkins’ figure that compares climate model simulations for regions where the surface observations exist. This is the appropriate way to compare climate models to surface observations, and the outstanding issue is that the climate models and observations disagree.

Interesting that Dr Hawkins made the comparison that way; it sounds like the best way to compare the models with the data. For the claim of the climate ostriches that the warming of the atmosphere has stopped, this is, however, not very relevant. The regional distribution of the warming is interesting for climate change impact studies, but not for the political "debate".

The comment below by Peter Thorne deserves more attention. He is an experienced climatologist, was a lead author of the IPCC report, worked a lot on the quality of the climate record at the Hadley Centre and NOAA, and is now a professor in Norway.

A few brief observations:

1. This issue of sampling isn't entirely new or uninvestigated. See here for example.

2. GISS and NCDC MLOST do interpolate over some distance from real observations. Even HadCRUT gridding to 5 degree is arguably a form of limited interpolation. But interpolation is a vexed issue. We certainly need to do better on producing globally complete estimates and their uncertainties. And it certainly impacts on trends, particularly shorter-term trends.

3. HadCRUT does account for spatial incompleteness. It does this through its error model rather than attempting to interpolate. So, if you use and propagate the uncertainty estimates appropriately you will find that HadCRUT's estimates are consistent with a higher warming rate than its median estimator. Probably higher than even this estimate. See some pretty pictures here or read the whole paper here.

Other opinions on this study

The authors have produced an informative homepage with lots of background information on their article. It includes a nice discussion of previous work on the influence of the Arctic gap on the global mean temperature. They also wrote an article for Skeptical Science.

Curry points to some other papers on uncertainties in the temperature record and then discusses the study of Cowtan and Way, as mentioned in the update above. In the comments Steve Mosher writes: "I know robert [Way] does first rate work because we’ve been comparing notes and methods and code for well over a year. At one point we spent about 3 months looking at labrador data from enviroment canada and BEST. ... Of course, folks should double and triple check, but he’s pretty damn solid."

Lucia has not read the paper yet, but describes it and asks some questions based on the information on the internet. Lucia: "Right now, I’m mostly liking the paper. The issues I note above are questions, but they do do quite a bit of checking".

Motl discredits himself with ad hominems and empty babble. He makes WUWT look like a quality blog. This link really needs a NoFollow tag, to make sure that Google does not interpret it as a recommendation.

John Nielsen-Gammon created a beautiful plot showing the relationship between global mean surface temperature and El Nino. By giving El Nino and La Nina years different colors and symbols you can see the influence of the SOI, without having to "fudge" the data.

An editorial article in Nature on the Kosaka and Xie paper, which studied the influence of El Nino on the temperature and could largely explain the atmospheric warming hiatus. By taking into account historical forcings (greenhouse gases, sun, etc.) and El Nino, they are able to reproduce the global surface temperature since 1970 with a correlation of 0.97 (!). Impressive!

15 comments:

It's a nice reminder that every time you start looking for an explanation, you first need to be sure that there's an effect in need of one.

I'm curious to see how this research fits in with other studies about the "pause" (AKA "the paws"). It seems that Kosaka & Xie 2013 might be at odds with this finding, since their POGA-H model had an excellent fit with the original temperature record.

That brings me to a question. When a model is compared with a temperature record, do they use the same averaging method? It seems to me that if you are using an average observed temperature that doesn't include the Arctic, then you should construct an average model temperature with exactly the same coverage. I don't know if that's customary, but if that's what K&X did then there would be no problem.

Hello R Daneel, that was also the first question that I was thinking of. That is why I would like to wait some time until the dust settles.

I am not a modeller, but I think I have seen people use a global mean temperature based on the model values at the locations of the climate stations. I have no idea, however, how often that is used.

Also for the validation of weather predictions such an approach is often used for more difficult parameters (for example radar precipitation measurements). And people have started simulating some paleo proxies in climate models.

Because the correlation of the global mean temperature in Kosaka & Xie was so high, I can imagine that they used such an approach.

I am not sure whether there would be a problem. The changes are very small. Kevin Cowtan and Robert G. Way, for example, explicitly write on their background homepage that their better interpolated data is still within the error margins of the HadCRUT data.

This filling in the gaps of the temperature record seems like a logical thing to do. I can't help but wonder why no-one has done it before.

What is a little frustrating/amusing is that we seem to have gone from explaining the hiatus in terms of aerosols/natural variation/ocean heat uptake to now there is no hiatus. Probably in reality it's a bit of both but I can't help but feel a little bit silly given that I've been arguing something that's not entirely true. But as you say, it's early days yet and things may change. It's science after all.

The main reason it has not been done before is that it is quite complicated and it will require much further research to show that the method works reliably. This paper mainly shows that such research would be worthwhile.

Another reason is that you can only do this for the satellite period. Which is rather short and thus not very interesting climatologically.

Finally, a reason is likely that the "hiatus" is minor and that scientists thus likely did not find it very interesting and maybe also did not expect that it would be possible to study something that small.

You can only get funding for scientifically interesting problems. The nonsense hypes of WUWT and Co. are not taken into account by the science funding agencies. The authors of this paper did the study in their free time.

The gap is filled with normal interpolation in the GISS dataset. This dataset also has a somewhat higher trend for the last 15 years. And according to Rahmstorf at RealClimate the GISS dataset still has some problems with ocean temperatures. If this problem is corrected, the trend in the GISS dataset over the last 15 years would be 0.1 °C per decade. Quite similar.


This is Ed Hawkins blog post on *that* comparison of models and global temp.

"A recent comparison of global temperature observations and model simulations on this blog prompted a rush of media and wider interest, notably in the Daily Mail, The Economist & in evidence to the US House of Representatives. Given the widespread misinterpretation of this comparison, often without the correct attribution or links to the original source, a more complete description & update is needed.

"Also note that I have ‘masked’ the simulations to only use data at the same locations where gridded observations in the HadCRUT4 dataset exist."

Just to clarify, the paper I linked to is the HadCRUT4 paper, not the new analysis. Like everyone else, it's either ask for a pre-print or find the fee down the back of the sofa. I took option #1. Sorry for any confusion. It's a really neat analysis.

It's also worth noting that the opposite study - calculating for common geospatial coverage - was done by Vose et al. in 2005 and found remarkable similarity. Since then the UK and US SST and land analyses have developed respectively. It would be interesting to redo that now, looking at what portion of the differences arises from data analysis in areas of common coverage and what portion arises from unique data / interpolation.

Peter, thanks for the warning. I have removed the link to the free manuscript from the post. I should have looked more carefully at what I was linking to.

The Arctic gap will be much less important for the long term trend. That is likely something special for the recent past in which we also saw a huge decline in Arctic sea ice. And maybe it would change some other decades by a similar minor amount.

Some comments on other blogs are really weird and make a crisis in climatology out of it. Some act as if they do not understand that two times almost nothing is still almost nothing and that this study makes a minute change to the centennial temperature signal. Maybe it is the cognitive dissonance of acting as if this almost nothing is an argument in their political battle.

Victor, thanks for putting all this together and for sharing your wisdom. Very much appreciated! Re Kosaka & Xie, I think it is interesting to note that their "tweaked" simulation (i.e. driven with observed Pacific SSTs = their POGA-H experiment) not only has trouble getting the observed Eurasian sfc temperature trend right, but also the Arctic Amplification. It would therefore be great to see how it compares with the extended HadCRUT4 dataset rather than the current incomplete version, given that POGA-H still runs a bit warmer than the original HadCRUT4 data. It should result either in a perfect match or a slight underestimation of the trend as far as the model is concerned.

While the Arctic Amplification is reproduced in the GCMs in general, they fail to get the observed magnitude right by quite a wide margin. Force them with observed sea ice extent and things will change dramatically. What I'm saying is that masking the GCMs in the way Ed did it won't make it an apples-to-apples comparison as long as the GCMs keep producing very warm winter sfc temperatures over Eurasia. My guess regarding Ed's plot would be that the model trend will increase only slightly, while the HadCRUT4 trend will increase significantly. It might well fall in the 25-75% range of the CMIP5 ensemble after the correction. I'm sure we're gonna see an update soon ... keeping always in mind that Cowtan and Way is certainly not gonna be the last word.

Do I understand you right that the dramatic change, when forcing with observed sea ice, is that the modelled temperatures then match the observed ones very well?

A kind of Kosaka & Xie study, but not for the pacific, but for the Arctic. That is interesting. Do you happen to have a link to that? That could be interesting for some readers.

That would imply that there is no problem with the theory of global warming, which was anyway a strange idea, or with the climate models in general, which could have been more to the point, but that we do have a problem with the modelling of Arctic sea ice in global models. Right?

Victor, re the dramatic change I should better have said that this is what I would expect to see in the GCMs. They would certainly get the WACCY (Warm Arctic Cold Continent) pattern right if only they were informed about the observed sea ice conditions a priori. As you rightly said, Kosaka & Xie for the Arctic. To my knowledge, there isn't too much in the literature yet. Hope that people are working on that. Some examples which point in the right direction:

I agree, if the Arctic amplification turns out to have an even stronger impact on the global temperature (as suggested by Cowtan and Way), we wouldn't need to be too worried about the apparent (temporary) negative winter feedback in Eurasia due to negative AO. Masking Eurasia in winter leads to this (using the current HadCRUT4 version): HadCRUT4 masked (credit to Eduardo)

No hiatus whatsoever! However, take Cowtan and Way's complete HadCRUT4 version and leave Eurasia as is, and the global average will be lower than when masking both Eurasia and the Arctic. My point is, I still think we see a slight negative-AO-related global temperature impact. Interestingly, if the AO is strongly positive, as is the case right now, Eurasia sends the global temperature anomaly to record levels. I wouldn't be surprised if November were the warmest ever (we will know in a week from now).

On top of it all, the (negative) ENSO trend remains another important issue. Nothing has changed in that regard! What Cowtan and Way show is that there is potential for an even stronger surface temperature trend than previously thought.

One of the interesting things about blogging is everything that happens in the background. A colleague has asked Stephen Outten to send me some papers on the relation between Arctic sea ice and temperature.

You say: "Maybe it would be better to ask the mainstream scientists behind the RSS dataset. They are the ones that did not trust the data sufficiently and rather leave a gap in the Arctic."

OK, I'll bite.

First, the gap in the Arctic is relatively small, only from 82.5N to the pole. If you look at a polar projection, you can see that this is quite a small area. The reason we leave out this area is that the part of the satellite data we use *does not view* this area. The UAH team interpolates the data to fill in the hole at the pole.

The south pole is more complicated. We view the TLT product as an atmospheric temperature product. For areas with low surface elevations, this is true, with ~90% of the signal coming from atmospheric emission. For regions with high surface elevations, the portion that comes from the atmosphere is reduced, because the atmosphere is thinner and thus more transparent. Much of the Antarctic continent is above 2000 m, so the portion of the signal that comes from the atmosphere is sharply reduced, to as low as 60%. So TLT is no longer an atmospheric product.

There is a second problem as well. To calculate the TLT product, we use a method developed by the UAH group, where we calculate the weighted difference between satellite measurements at different viewing angles AND at different locations. When some of the views are on the Antarctic continent, and some are in the surrounding ocean, there is a large spatial gradient in the measurements that pollutes the TLT retrieval (which is based on vertical gradients). Away from the poles, this problem averages out when we make monthly mean maps, but doesn't do so near the poles for geometric reasons. You can look in Mears and Wentz, 2009 http://images.remss.com/papers/rsspubs/Mears_JTECH_2009_TLT_construction.pdf for more details.

These problems lead us not to provide TLT data south of 70S. But I don't think it matters much for the paper under discussion because: 1. Most of the effect is from the Arctic. 2. The data is used in Cowtan and Way to provide a spatial pattern to fill in the surface temperatures. I don't think it matters that the UAH TLT product is not really an atmospheric temperature over Antarctica -- in fact, for this application, some surface emission in the data product is probably a good thing, since it is the surface temperature that we are interested in.

Carl Mears, thank you very much for your assessment. I had expected the quality of the satellite datasets to be the main problem. In fact, without the careful cross-validation I would have been very skeptical of this C&W study because of the satellite data.