Leopold in the Sky with Diamonds

Radiosonde trends are back in the news. A few days ago, on May 24, 2008, Realclimate reviewed three recent papers: Lanzante and Free (J Clim 2008), Haimberger et al (J Clim 2008) and Sherwood et al 2008, adding a note on the even more recent Allen and Sherwood (2008). Peter Thorne of the Hadley Centre stated of the Allen and Sherwood study:

We discussed radiosonde data in connection with Douglass et al 2007, and in particular Gavin Schmidt’s purported excoriation of that paper, in which Schmidt took particular umbrage at Douglass et al’s use of Raobcore v1.2, a study published in April 2007, one month before the submission of Douglass et al 2007 in May 2007. We are used to climate scientists “moving on”, but the speed of decampment in this instance seems particularly rapid. Schmidt’s implication is that Haimberger had repudiated Raobcore v1.2 before it was even published (it is explicitly repudiated in Haimberger et al 2008); however, rather than criticizing Haimberger for failing to withdraw his then-unpublished, now-repudiated results, he criticized Douglass et al for using the most recently published results (results that were a hoary one month of age at the time of submission) rather than attempting to anticipate the results of future climate nomad migrations.

Overlooked in this particular exchange was exactly what prompted Haimberger’s rapid abandonment of the Raobcore v1.2 camp site. I thought that it would be interesting to examine the reasons for this abandonment and will do so today, as this is another interesting case of data adjustments by climate scientists – a topic not unfamiliar to CA readers.

Haimberger, like the other studies in this recent flurry, turns to a considerable degree on a statistical issue previously discussed at CA in connection with surface stations – the use of home-made breakpoint algorithms, developed by climate scientists and unstudied in the general statistical literature, to adjust (or attempt to adjust) for inhomogeneities and discontinuities in a data set with poor quality control and untrustworthy metadata. Indeed, some radiosonde literature cites adjustment (“homogenization”) techniques developed by USHCN, if you can imagine that.

If you browse the Surface Record category, you will see some discussion of these changepoint algorithms, although the issue has been noted, rather than run to ground. We spent a little extra time on the case of Lampasas TX, a case which also attracted the interest of Atmoz, where an obvious and easily observed discontinuity was missed by the USHCN changepoint algorithm. On other occasions, I’ve noted my sense that these changepoint algorithms in practice merely seem to end up blending good and bad data and, in particular, seem very vulnerable to contamination. Anthony Watts has a recent post referring to planned USHCN changepoint analysis.

Radiosonde adjusters take adjustment to extremes not contemplated in the surface record – ultimately even changing the sign of the trend. Sort of like Hansen on steroids.

The underlying difficulty for present-day scientists trying to extract information from the historical radiosonde data is that the problems with quality control and metadata in the radiosonde network appear far more severe than in the surface station record. This is disappointing, given that the radiosonde data was collected not by USHCN volunteers but by trained professionals, and that much of the data was collected during the IPCC era. Here’s a statement by Sherwood at realclimate in 2005 summarizing the compromised state of the radiosonde record. Many other issues are identified in the specialist literature, with problems already being identified in the early 1990s.

Few if any sites have used exactly the same technology for the entire length of their record, and large artifacts have been identified in association with changes from one manufacturer to another or design upgrades by the same manufacturer. Artifacts have even been caused by changing software and bug fixes, balloon technology, and tether lengths. Alas, many changes over time have not been recorded, and consistent corrections have proven elusive even for recorded changes. While all commonly used radiosondes have nominal temperature accuracy of 0.1 or 0.2 K, these accuracies are verified only in highly idealized laboratory conditions. Much larger errors are known to be possible in the real world. The most egregious example is when the temperature sensor becomes coated with ice in a rain cloud, in which case upper tropospheric temperatures can be as much as 20 C too warm. This particular scenario is fairly easy to spot and such soundings can be removed, but one can see the potential problems if many, less obvious errors are present or if the sensor had only a little bit of ice on it! Another potential problem is pressure readings; if these are off, the reported temperature will have been measured at the wrong level.

A considerable climatic and statistical literature exists on the problem of detecting undocumented “change points,” or discontinuities in the statistics of a time series (see Menne and Williams 2005, and references therein). Climate relevant changes are usually modeled as simple step discontinuities in observing bias, due e.g., to changed sensor design, relocation of the sensor, etc., and are thereby distinguishable—at least in principle—from the relatively smooth variation of the underlying observable. Detection of the change point is followed by estimation (and ultimately, removal) of its associated level shift.

These studies have left key issues unresolved. First, detection methods typically assume that the observations possess little or no serial correlation, but real climate records contain variability on all time scales. This makes false detections more likely since the natural variability begins to resemble the artifacts. Second, the goal is usually not detection per se but accurate climate signals, yet previous studies have not carefully investigated to what extent that actually occurs. A tendency has been noted for radiosonde temperature trends to disappear upon homogenization (Free et al. 2002). Finally, while the value of using data from neighboring sites is well-recognized for level-shift estimation (e.g. Karl and Williams 1987), detection studies have dwelled on the case of an isolated time series; the use of neighbor information remains ad hoc in practice, and its efficacy untested.

A detailed exploration by Sherwood (2007, hereafter S07) using statistical simulations revealed that standard methods were often unable to estimate trends reliably. Three problems were identified. Even with liberal detection criteria not all change points are found: the “missed artifact” problem. On the other hand, even with very strict criteria, false change point detections are unavoidable when time series have realistic serial correlation. Subsequent adjustment of the time series tended to eliminate trends (or, in the case where a satellite reference is used, trends in the sonde-satellite difference): the “greedy artifact” problem. Finally, when reference information from nearby stations was used, artifacts at neighbor stations tend to cause adjustment errors: the “bad neighbor” problem. In this case, after adjustment, climate signals became more similar at nearby stations even when the average bias over the whole network was not reduced.

Sherwood’s last sentence here is very reminiscent of a phenomenon that I’d noted in connection with USHCN adjustments.
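The first two of Sherwood’s problems – false detections under serial correlation, and adjustments that eat a genuine trend – are easy to illustrate numerically. Here is a minimal sketch of my own (the AR(1) parameters, the naive t-test changepoint scan, and the trend magnitude are all illustrative assumptions, not anyone’s published algorithm):

```python
import numpy as np

rng = np.random.default_rng(0)

def ar1(n, phi=0.7, sigma=0.3):
    """AR(1) 'red' noise -- a crude stand-in for natural climate variability."""
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.normal(0, sigma)
    return x

def detect_break(x, crit=4.0, minseg=24):
    """Naive changepoint scan: largest before/after mean-shift t-statistic,
    computed as if the observations were serially independent."""
    n = len(x)
    best_t, best_k = 0.0, None
    for k in range(minseg, n - minseg):
        a, b = x[:k], x[k:]
        se = np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
        t = abs(a.mean() - b.mean()) / se
        if t > best_t:
            best_t, best_k = t, k
    return best_k if best_t > crit else None

# (a) False detections: 100 series of 40 years of monthly data, NO breaks at all.
false_hits = sum(detect_break(ar1(480)) is not None for _ in range(100))
print(f"false detection rate on pure AR(1) noise: {false_hits}%")

# (b) "Greedy artifact": add a real 0.2 C/decade trend, then "adjust" by
# aligning the segment means on either side of the detected break.
def fit_trend(y):
    return np.polyfit(np.arange(len(y)), y, 1)[0] * 120  # C/decade

x = 0.2 / 120 * np.arange(480) + ar1(480)
k = detect_break(x)
adj = x.copy()
if k is not None:
    adj[k:] -= adj[k:].mean() - adj[:k].mean()
print(f"trend before adjustment: {fit_trend(x):+.2f} C/decade")
print(f"trend after adjustment:  {fit_trend(adj):+.2f} C/decade")
```

Because the t-statistic pretends the red noise is white, its denominator is far too small and spurious breaks are flagged routinely; and once a break is “corrected” by matching segment means, a large share of any real trend goes with it.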

Both Sherwood and Haimberger have quick surveys of prior adjustment efforts, which are worth reading. My take is that there seem to be two approaches to the “homogenization” problem. One approach is what seems to be Angell’s: search for the best-quality stations and use them, even if the resulting network is only a subset of the original. Angell ended up with 62 stations, but had trends inconsistent with model expectations.

[Update: As Peter Thorne observes in a comment below, radiosonde scientists have made substantial efforts to make their data publicly available. I commend them for this. I was able to quickly locate and download relevant information on the Angell, RATPAC, HadAT2 and Raobcore v1.4 data sets. Some of the data sets are very large and represent considerable effort. I corresponded once with Leopold Haimberger and promptly received directions to a URL that I had been unable to locate. In the Raobcore data set that is the primary topic here, I was able to locate data representing 4 satellite levels but, even after seeking Haimberger’s assistance, could not locate data at the pressure levels portrayed in their web visualization.]

Lanzante and associates (Free, Seidel) also attempted to identify a QC-ed network (87 stations), but concluded that even this network (RATPAC-B) had large inhomogeneities, and accordingly they developed adjustments based on metadata. After these adjustments, their trends were still inconsistent with model expectations. Sherwood deprecated this procedure as “subjective” and Haimberger deprecated it as “laborious”.

Raobcore takes the diametrically opposite approach. It makes no attempt whatever at prior quality control – that would be “subjective”. It essentially dumps all data into the network, regardless of quality or inhomogeneity, and relies on automated adjustments through changepoint analysis to sort out the mess. My entire instinct is against this sort of approach – which reminds me all too much of Mannian analysis of the North American tree ring network. Instead of trying to work with a controlled network of properly QCed stations, Haimberger constructed a network of 2881 records, of which 1536 were taken from the IGRA list of radiosonde stations and the other 1355 (including many very short records) from data that was not included in IGRA but was in the ERA-40 reanalysis project. Homogeneity adjustments were applied to stations with records longer than 180 days (of which there were only 1184). One might well question the inclusion of 1697 stations with records of less than 180 days in a study purporting to understand 30-year trends.

The algorithm in Haimberger et al 2007 added a novel tweak to changepoint methods – a tweak that should not be accepted as a proven methodology merely because it’s been published in a journal with weak statistical refereeing (Journal of Climate):

This paper introduces a new technique that uses time series of temperature differences between the original radiosonde observations (obs) and background forecasts (bg) of an atmospheric climate data assimilation system for homogenization.

One wonders at the real statistical properties of this “new” technique. To what extent does this technique merely imprint trends in ERA-40 onto the radiosonde data? What if some other history had been used as a target? Would that have led to a different result? If that’s the case, what, if anything, is proven by the Raobcore exercise?
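The extreme version of that concern can be put in a few lines. This is a deliberately stylized toy of my own construction, not Haimberger’s actual algorithm: if adjustment against an obs-minus-background difference series is aggressive enough, the “adjusted” observations are effectively pulled onto the background plus a constant, so any spurious drift in the background becomes the adjusted trend:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 480                                 # 40 years of monthly values
t = np.arange(n)

true = 0.1 / 120 * t                    # true signal: +0.10 C/decade
obs = true + rng.normal(0, 0.2, n)      # sonde obs (here with no breaks at all)
bg = true + 0.1 / 120 * t               # background drifts by a further +0.10 C/decade

diff = obs - bg                         # homogenization operates on obs minus bg
# Limiting case of aggressive adjustment: every departure of diff from its
# overall level is treated as a sonde artifact and removed, leaving
# adjusted = bg + constant.
adjusted = bg + diff.mean()

slope = lambda y: np.polyfit(t, y, 1)[0] * 120   # C/decade
print(f"obs trend:      {slope(obs):+.2f} C/decade")
print(f"adjusted trend: {slope(adjusted):+.2f} C/decade")  # inherits the bg drift
```

Real implementations only adjust at detected breaks, so the imprinting is partial rather than total; but the toy shows which direction the pull goes – toward the target, whatever the target happens to be.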

In this case, it seems to me that important light is shed on this question because of a very curious change from Raobcore v1.2 to Raobcore v1.4 – which indicates that the target model has a substantial impact on the analysis results.

Adjusting the Target Model
Raobcore (both v1.2 and v1.4) did not merely adjust the radiosonde data. They adjusted the target ERA-40 model as well, with the adjustments to the target ERA-40 model being more extensive in v1.2 than in v1.4.

Haimberger 2007 described the v1.2 adjustments to the target ERA-40 model as follows:

Although ERA-40 used a frozen data assimilation system, the time series of the background forecasts contains some breaks as well, mainly due to changes in the satellite observing system. It has been necessary to adjust the global mean background forecast temperatures before the radiosonde homogenization. After this step, homogeneity adjustments, which can be added to existing raw radiosonde observations, have been calculated for 1184 radiosonde records….

It is essential to be aware of any inhomogeneities of the ERA-40 bg since these reduce the applicability of the ERA-40 bg as a reference. Inhomogeneities in the bg time series may be introduced by changes in the ERA-40 observation coverage, in the observation biases correction and in the overall observation quality. Apart from radiosondes mainly the satellite data are affected by changing biases.

The most prominent breaks evident in Figure 8, which occurred in January 1975, September 1976 and April 1986, are related to problems with the NOAA-4 and NOAA-9 satellites. Jumps in 1995/1997 coincide with end of NOAA-11, start/end of NOAA-14 (see also Christy and Norris 2006). At high altitudes the effects of insufficient bias correction of radiances from the stratospheric sounding unit (SSU), particularly in the early 1980s, are noticeable (see Haimberger 2005; Uppala et al. 2006). Trenberth and Smith (2006) have recently diagnosed a spurious break in ERA-40 temperature analyses related to the assimilation of MSU-3 radiances at the end of the NOAA-9 period.

The principal changes in Raobcore v1.4 were changed adjustments to the target model, described on the Raobcore website as follows:

Version 1.4 of RAOBCORE contains 2 major improvements compared to the versions 1.2, 1.3 described in Haimberger (2007) (J. Climate, in press). These improvements are: …

2) The ERA-40 background modification described in Haimberger (2007) is only applied between Jan 1972 and Dec 1986. It has turned out that the ERA-40/ECMWF bg forecast time series are quite consistent with recent versions of the RSS and UAH satellite datasets, so that a modification of the ERA-40 bg is not necessary. Between 1972 and 1986, modifications of the bg are unavoidable. The bg is modified more strongly in the tropics in v1.4 compared to the modification applied in version 1.2. The differences between 1.2, 1.3 and 1.4 can be examined using the web visualization tool.

So Raobcore v1.2 argued in a peer reviewed journal that there were post-1986 inhomogeneities in the ERA-40 model that required adjustment, giving a list of such inhomogeneities. Raobcore v1.4 decided that adjustments to ERA-40 after 1986 were not required after all. One would have thought that Journal of Climate would have required a detailed explanation of why Haimberger et al had changed their views so quickly and a detailed analysis of each post-1986 adjustment that was no longer deemed pertinent. A year earlier, Haimberger expressed concern about “jumps” at the end of NOAA-14. A year later, he was no longer concerned. Why? Is there any such analysis in Haimberger et al 2008? Nope.

The impact of removing the post-1986 ERA-40 model adjustments is not small. Douglass et al 2007 argued that the Raobcore v1.2 trends were inconsistent with models, while Schmidt argued that Raobcore v1.4 was consistent with models.

The graphic below compares Raobcore v1.2 (blue) and v1.4 (green) tropical trends by altitude. One version of v1.2 is as reported in Douglass et al 2007 and one is manually estimated from the Raobcore “web visualization tool”; one version of v1.4 is manually estimated from a figure in Haimberger et al 2008 and one version is manually estimated from the Raobcore web visualization tool.

There’s another issue here. I wasn’t able to replicate the Raobcore v1.4 diagram from archived data. Raobcore data is not archived apples to apples with the “web visualization” tool. They’ve archived data blended to TLT, TMT, TTS and TLS levels, which I’ve plotted in red below. This should reconcile to the v1.4 data at the different altitudes, but doesn’t appear to. I don’t exclude the possibility that I’ve been wrongfooted somewhere along the way. I’m careful about these things, but I’m not familiar with this subfield. So there may be still another issue here.

For the purposes of trying to say whether or not the radiosonde “data” is consistent or inconsistent with the models, the underlying problem, as noted above, is that the radiosonde network is so thoroughly contaminated by inhomogeneities, worsened by defective quality control by observers. Different qualified observers have extracted radically different trends from the radiosonde data.

In the surface stations situation, one way of trying to find solid ground in a sea of adjustments was locating “crn 1” stations with long histories of consistent observation and metadata, as Anthony Watts has been trying to do. In the surface stations example, it’s hard to see any value in blending such data with compromised data from sites like the University of Arizona parking lot. It seems highly probable that temperatures in the 2000s are warmer than the 1930s, but I’d like to see this established from “crn 1” stations. This seems like an elementary form of quality control.

Raobcore v1.4 is hardly the last word in radiosonde adjustments. Leopold Haimberger (and I intend no slight by the title of the post, I just liked the sound of it) has already moved on to yet another adjustment system (“RICH”); Allen and Sherwood 2008 have used Iterative Universal Kriging, adding wind information into their adjustment brew. What each of these studies has in common is that none of them are new “experimental verification”; they are merely adjustments of ever increasing magnitude.

I noted above that Sherwood had cogently criticized changepoint analysis as carried out in every predecessor adjustment. I think that, at this point, it’s quite reasonable to stipulate Sherwood’s criticisms. An immediate result is that neither of the Raobcore versions constitutes “data” that could possibly confirm or reject a model, nor for that matter would any of the other adjustment systems.

The only alternative in the radiosonde field to these after-the-fact adjustments was Angell’s effort to create a small network of “good” sites without adjustment. If specialists have concluded – and this appears to be common ground – that Angell’s network was also compromised, then I think it’s possible that the field may simply have reached a stalemate in terms of trying to determine whether the models are consistent with the “data” – either to claim vindication as Thorne has recently done or claim inconsistency (Douglass et al 2007). This does not imply anything one way or the other in respect to the ability to draw conclusions from the satellite record, which, among other things, has the advantage of involving only a limited number of instruments, whose properties have been studied in detail.

Allen and Sherwood 2008 try a different tack – they try to create a homogenized wind data series on the basis that the radiosonde wind data is much less screwed up. They then argue that the trends in wind are consistent with tropical troposphere warming. They use this as evidence for the side of the argument that the UAH satellite temperature trends in the tropics are incorrect. I guess that we’ll see more about tropospheric wind data in the next while.

“No, you are wrong, RSS is consistent with models only if we look at global trends, but RSS trend for tropical “hot-spot” is out of 2 standard deviations limit of the model mean, just like UAH and all “uncorrected” radiosonde data sets.”

Sounds like a lot of people desperate to do “science” even if they have no worthwhile data. So they take crap and massage it until it looks like the “science” that they want. Then they announce to the world that the science is well-settled.

What ever happened to proving that your data analysis method actually works? In statistics you can’t publish a new method without evaluating it against real or simulated data. For the changepoint algorithms, for example, one could take data from weather stations with known location moves and good corrections for instrument changes and see how it does (pretty bad is my bet). With an elaborate untested method applied to complicated data, you get… who knows? Doesn’t that make reviewers uneasy?

It looks like mining for hockey sticks to me. They have a set of more or less (some of each) random data. They have a target. They search for a subset of the randomised series until they get the one they want. I am not into betting quatloos, too rich for my blood, but if I were, I would bet most of what I had that you could replicate these findings with enough pink noise time series. I would be far more likely to accept a series-by-series QC approach, no matter what the answer. But that is because I really want to know what is happening, and have not made a personal commitment to Gaia to save the world, no matter what the cost.

This is the long-awaited vindication of the models? All the data is crap, so it is still possible that we are right, and BTW, shut up Lord Monckton? Yeesh!

My irony sensor is not working. I’m not sure if you truly want Lord Monckton to shut up?

Otherwise, it strikes me that the RC mob are determined to find accuracy to 0.1 C using tools that are no better than a man wearing boxing gloves trying to measure a rough-cut piece of wood with a nail file, and then adjusting the results with a statistical food mixer.

Thanks for addressing this Steve. ICECAP had a link to an AFP story regarding this study that had the following sentences:

“Over the last two decades, temperature readings from the upper troposphere — 12 to 16 kilometres (7.5 and 10 miles) above Earth’s surface — based on data gathered by satellites and high-flying weather balloons showed little or no increase. Oft cited by climate change sceptics, these findings were known to be flawed but still challenged the validity of computer models predicting warming trends at these altitudes, especially over the tropics.”

…these findings were known to be flawed… Apparently data that doesn’t agree with the models is now “known to be flawed”. Perhaps they were referring to the shortcomings of the balloon data previously discussed here, but how is the satellite data known to be flawed?

I fail to see how anybody could perform a proper peer review of the Sherwood et al. paper ( http://earth.geology.yale.edu/~sherwood/sondeanal.pdf ) which I happened to be looking at after I stumbled over it in Unthreaded this AM. The number of ad hoc data removals, imputations and adjustments (the word “adjust” in some form is used 39 times in the body of the paper) combined with the black-box complexity of their “statistical” methodology make it virtually impossible to even begin to duplicate their results. Unsubstantiated subjective choices of results, e.g.:

Adding the Group B stations slightly reduced tropospheric warming trends in the Tropics and increased them in ESH. Given the importance of dT, and the failure to detect suspected artifacts at many Australian B stations, we judge that Group A results are more reliable. We therefore adopt the average of the L96 and two-phase Round 3 results as our best estimate of the trend.

produce output with little or no scientific value. How do they estimate the size of possible error?

The structural uncertainty in our trends, quantified here by taking half the full range of results at different stages of a multi-stage analysis and with two different changepoint detection schemes, is > 0.05 C decade^−1 in the tropical troposphere and > 0.1 C decade^−1 for the stratosphere and the southern hemisphere extratropics.

…whatever that might mean. This is “robust”???

I presume that in the final printed copy, the reviewers will at least have caught the small error in

We analyzed twice daily data, for several reasons. First, individual observations are expected to have heteroscedastic error behavior (similar variance for all observations) as assumed by all methods, while monthly means will not owing to large variations in sampling rate at some stations.

(bold mine). In fact, heteroscedastic means the exact opposite of the definition given. The correct word is homoscedastic. The mis-definition aside, I don’t understand why the heteroscedasticity of the monthly means would pose a major difficulty since the numbers of observations in a given mean would be known and therefore could be adjusted for (oops, did I really use that word?!) in the analysis.
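For what it’s worth, the adjustment the commenter has in mind is just weighted least squares: if monthly means are formed from known, varying numbers of soundings, their variances differ in a known way, and weighting each month by its observation count handles the heteroscedasticity. A toy sketch with synthetic numbers throughout (the counts, noise level and trend are all made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
months = 240
counts = rng.integers(5, 60, months)      # soundings per month (varies widely)
t = np.arange(months)
true_slope = 0.015 / 12                   # C per month (= 0.15 C/decade)

# Monthly-mean noise variance scales as sigma^2 / n_i -> heteroscedastic
sigma = 1.5
means = true_slope * t + rng.normal(0, sigma / np.sqrt(counts))

X = np.column_stack([np.ones(months), t])
# OLS ignores the varying precision of the monthly means
ols = np.linalg.lstsq(X, means, rcond=None)[0][1]
# WLS: multiply rows by sqrt(n_i), equivalent to weighting by n_i
W = np.sqrt(counts)
wls = np.linalg.lstsq(X * W[:, None], means * W, rcond=None)[0][1]

print(f"OLS slope: {ols*120:+.3f} C/decade, WLS slope: {wls*120:+.3f} C/decade")
```

Both estimators are unbiased here; the weighted one is simply more efficient, which is why varying sampling rates in monthly means are a nuisance rather than a fundamental obstacle – provided the counts are recorded.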

As I first read this, my mind felt as might be implied by the title. But on reading the entire article, it is quite clear and helpful. Thank you for your hard work and clarity of presentation of these puzzles. As you imply in your last sentence, it does seem we are left to the point where only the satellite data may provide reliable information over long time periods, but that only back to the 70’s. I essentially agree with Paul M.

If you were in charge of an undergraduate laboratory in an experimental science, and your students tortured data like this to find the result that they thought was expected of them, you’d give them a metaphorical cuff on the ear.

The procedure appears to have been successful in eliminating systematic temperature biases in most regions, although the deep tropics appear to retain cooling biases over time that we still cannot identify; these may be due to changes that are too numerous to detect, or not step-like.

C’mon. Are you going to write something like “the wind data isn’t worth looking at” if you are trying to get a paper published? 😉

The last paragraph of the abstract provides the alarmist bent for the journalists:

While this effort appears not to have detected all artifacts, trends appear to be systematically improved. Stronger warming is shown in the northern hemisphere where sampling is best. Several suggestions are made for future attempts. These results support the hypothesis that trends in wind data are relatively uncorrupted by artifacts compared to temperature, and should be exploited in future homogenization efforts.

I’m still trying to figure out what “trends appear to be systematically improved” means. A statistician might say “estimates of trends are improved”. Here, I suspect that the actual implication is that they managed to adjust the analysis to bring the “record” closer to what the GW models predict.

Allen and Sherwood 2008 try a different tack – they try to create a homogenized wind data series on the basis that the radiosonde wind data is much less screwed up. They then argue that the trends in wind are consistent with tropical troposphere warming. They use this as evidence for the side of the argument that the UAH satellite temperature trends in the tropics are incorrect. I guess that we’ll see more about tropospheric wind data in the next while.

This looks like a somewhat different approach to torturing the temperature data one more time. Whether it rises above torturing the wind data, we’ll see. If anyone sees anything on inhomogeneity in radiosonde wind data, please post it up.

#19. Re-reading that paragraph, I think that what is meant is that the wind data was not hugely inhomogeneous and thus the changepoint algorithm had relatively little effect. Read in that sense, they are endorsing the wind data as being relatively homogeneous.

Then they state that there is a trend in the wind data, and that there is a link between wind and temperature and thus an increase in temperature.

It looks like quite a different approach to Raobcore and should be judged on its own terms.

I want to paraphrase a comment that I made on another thread. In the Douglass et al. Int. Jour. Climate paper of last Nov 2007 we presented temperature trends from the following 10 observational data sets: 3 surface, 4 radiosonde and 3 satellite. We realized that we had not explained our use of ver 1.2 from the several Haimberger RAOBCORE radiosonde analyses, so we sent an addendum to the Journal on Jan 3, 2008 explaining our choice.

—————————– submitted to the journal on Jan 3, 2008————————

The RAOBCORE data: choice of ver1.2.

Haimberger (2007) published a paper in which he discusses ver1.3 and the previous ver1.2 of the radiosonde data. He does not suggest a choice although he refers to ver1.2 as “best estimate.” He later introduces on his web page ver1.4. We used ver1.2 and neither ver1.3 nor ver1.4 in our paper for the satellite era (1979-2004). The reason is that ver1.3 and ver1.4 are much more strongly influenced by the first-guess of the ERA-40 reanalyses than ver1.2.

(Haimberger’s methodology uses ‘radiosonde minus ERA-40 first-guess’ differences to detect and correct for sonde inhomogeneities.) However, ERA-40 experienced a spurious upper tropospheric warming shift in 1991 likely due to inconsistencies in assimilating data from HIRS 11 and 12 satellite instruments — which would affect the analysis for the 1979-2004 period, especially as this shift is near the center of the time period under consideration. This caused a warming shift mainly in the 300-100 hPa layer in the tropics and was associated with (1) a sudden upward shift in 700 hPa specific humidity, (2) a sudden increase in precipitation, (3) a sudden increase in upper-level divergence and thus (4) a sudden temperature shift. All of these are completely consistent with a spurious enhancement of the hydrologic cycle. Thus ver1.3 and ver1.4 have a strange and unphysical vertical trend structure with much warming above 300 hPa but much less below 300 hPa (actually producing negative trends for 1979-2004 in some levels of the zonal mean tropics).

Even more unusual is the fact that the near-surface air trend in the tropics over this period in ERA-40 is a minuscule +0.03 °C/decade (Karl et al. 2006) and so is at odds with actual surface observations, indicating problems with the assimilation process. This inconsistent vertical structure as a whole is mirrored in the direct ERA-40 pressure level trends and has been known to be a problem, as parts of this issue have been pointed out by Uppala et al. (2005), Trenberth and Smith (2006) and Onogi et al. (2007). Thus we have chosen ver1.2 as it is less influenced by the ERA-40 assimilation of the satellite radiances.
—————————————————————————————–

Can wind be legitimately used as a proxy for temperature? Isn’t wind created by a gradient in the temperature field and not by the absolute temperature of any given point in the field? Jupiter has sustained winds of over 400 mph in the Great Red Spot, yet temperatures in its upper atmosphere are -150 F.

I’m not suggesting that Jupiter is a good analogue for the Earth’s atmosphere, just wondering if wind speed is necessarily correlated with temperature.

#25. Allen and Sherwood 2008 appears to take a different approach than Sherwood et al 2007 and criticisms of all the adjustments in S07 don’t necessarily carry over to A-S 08. See Allen Sherwood here. Hey, they’ve moved on from Sherwood et al 2007; that’s so last year.

Well, let’s see:
– 30 years of collected temperature data are useless and wrong, but a “reanalysis” of winds 30 years later can give us precise temperature readings;
– the models don’t have to fit the real world; the real world has to fit the models.

At my university, such things would have gotten me thrown out of an exam, to pass it maybe a year later (if the professor had changed or forgotten me).

#26: In an incompressible fluid (water is often treated as such), Bernoulli’s equation says that a pressure term, a potential energy term (which could perhaps be assumed to be zero) and a velocity term sum to a constant. I don’t recall the assumptions that go with it, and I don’t know if you can use the incompressible version of the Bernoulli equation for the air being measured by the radiosonde. Also, in this form there is no explicit temperature “proxy” term, so they may be using the compressible form, because air is compressible.

The compressible form of Bernoulli brings in a specific heat (proportional to temperature) term, so they may be using this. But it still needs the velocity and pressure terms.

Wind speed is a function of pressure differences, not directly of temperature. In a situation in which two non-rotating homogeneous masses of gas can be assumed to be in thermal equilibrium and at constant pressure, the pressure difference is proportional to the difference in temperature via

and if you go so far as to assume the gases incompressible and the flow laminar, then Bernoulli’s equation can help.

But in a real atmosphere rotating as fast as Jupiter’s, where the vorticity is a long way from zero, those simple assumptions are heroic indeed. Even changes in the acceleration due to gravity have a large effect.

No, not heroic. Just wrong.

I wouldn’t put it past most climate modellers to do such things though. Nothing seems to faze them.

A thought for climate modelers. Why not fit a surface map to the globe for the data at each specific date/time, derive the global or regional average for that date/time, and then follow that average over time. Good routines exist to create surface maps from unevenly distributed data. As long as quality controls are maintained, data taken at one point and time is just as valid as data taken at any other point and time.

This method would have the advantage of being able to test the effect of turning data points on and off.
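The gridding-then-averaging idea above can be sketched in a few lines. The inverse-distance weighting, flat-earth distances and station coordinates below are all illustrative assumptions; real work would use tested geostatistical routines and great-circle distances, but the structure (interpolate to a grid, then area-weight the grid average) is the same:

```python
import numpy as np

def idw_grid_average(lons, lats, values, grid_lons, grid_lats, power=2.0):
    """Interpolate scattered station values to a regular lat/lon grid by
    inverse-distance weighting, then return the cosine-of-latitude-weighted
    grid average (a crude spherical area weighting)."""
    glon, glat = np.meshgrid(grid_lons, grid_lats)
    grid = np.zeros_like(glon, dtype=float)
    for i in range(glat.shape[0]):
        for j in range(glat.shape[1]):
            d2 = (lons - glon[i, j])**2 + (lats - glat[i, j])**2
            if np.any(d2 == 0):          # grid node coincides with a station
                grid[i, j] = values[np.argmin(d2)]
            else:
                w = 1.0 / d2**(power / 2.0)
                grid[i, j] = np.sum(w * values) / np.sum(w)
    area_w = np.cos(np.deg2rad(glat))
    return np.sum(grid * area_w) / np.sum(area_w)

# Sanity check: stations all reporting the same anomaly must average to it.
lons = np.array([10.0, 120.0, 250.0])
lats = np.array([-30.0, 10.0, 45.0])
vals = np.array([0.5, 0.5, 0.5])
avg = idw_grid_average(lons, lats, vals,
                       np.arange(0, 360, 30.0), np.arange(-60, 61, 30.0))
print(round(avg, 6))  # 0.5
```

Turning stations on and off, as suggested, is then just a matter of masking rows out of `lons`/`lats`/`values` and recomputing.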

Otherwise, while there is always a rush to be first in science, good science requires patience, a great attention to detail, and very careful demonstration of accuracy in measurement. What you have described only has the rush. This field desperately needs to convince some really good experimenters, such as do measurements for NIST, to switch fields, and get involved in designing new highly improved measurement networks.

We should allow the data massagers to still work with the old data, but they must be held to higher standards in their analysis.

I thought it might be instructive to show the RAOBCORE v1.4 observed data on the original graph of observed versus climate model results in Douglass et al. (2008). I took the RAOBCORE v1.4 data from a graph at RC. At RC they show RAOBCORE v1.4 against their 2-SD-wide model spread, while putting it on the Douglass graph shows it against the SE limits for the models and allows one to compare it to the other observed data.

Douglass gives reasons in this thread for v1.4 being suspect as a reanalysis of the radiosonde data, on which RC does not bother to comment. Even the v1.4 data set (which disagrees with the other radiosonde and MSU data sets) is not convincing in showing that the models reproduce the observed ratio of surface to troposphere temperature trends.

I also included the graphs from AR4 that vividly display the GHG signature of tropospheric warming in the tropics. RC seems to be confusing the issue by saying that the troposphere warms relative to the surface even without GHG forcings. That is true, but that is not what those who use the term GHG signature or fingerprint mean. They are referring to that prominent orange-to-red blob in the pictures that include GHG forcings.

Okay, I’ll take the bait once but given my commitments you will only get one posting and this is largely against my better judgement. I will try to be clear in my meaning and uncontentious.

Firstly, and most importantly, neither the radiosonde nor the satellite programs have answered primarily (or at all, in some cases) to the needs of climate. Rather, these measurements have been made with operational forecasts in mind. This has meant that there have been numerous changes in instrumentation and observing practice over time which make the purported issues with the surface record look like a walk in the park in comparison. Unlike surface observations, radiosondes are single-use instruments (fire and forget), and satellites have, in a climate sense, very short lifetimes and are subject to all that space can throw at them. Neither is appealing as a raw database from which to construct an unimpeachable dataset, either in isolation or in combination.

The upshot is that the raw data is a mess and the choices that a dataset creator makes imparts non-negligible and unintentional bias into the resulting database. So, how do you address this? You get many people to look at the data independently. Yes, some may make dumb choices, but only through getting this multi-effort approach can you begin to understand what you can and cannot say about the data.

The fact of the matter is that we cannot definitively say whether the troposphere is warming less quickly, as quickly, or more quickly than the surface, either from sondes or satellites, although no-one doubts it is warming. Ambiguity is large (and nobody has a dataset without issues) and kills us every time, especially when you are asking a much harder second-order question about relative rates of change, at which point you need that uncertainty range to be much smaller. I’m amazed that our structural uncertainty in tropospheric trends is as small as it seems to be (order 0.3K/decade from both sondes and satellites), given the state of the raw data.

The new result in Allen and Sherwood is the use of wind data to infer temperature trends. These winds come from ground tracking, which has changed far less frequently and doesn’t change the instrument each time you make an obs, and more latterly from GPS modules on the instruments. So there are fewer obvious breaks in the wind series, and the breaks that do exist are much better behaved and easier to adjust for. There is a paper by Leo’s team about to appear in Met. Zeit. on the issue. The thermal wind equation can be used to infer relative temperature trends, with a boundary condition from the well-observed NH mid-latitudes used to infer absolute trends. This is obviously not without issues itself and is new research, and therefore should be treated with the appropriate levels of caution, but this is the work on which the highly selective quote given at the top is based (I assume this was culled from the AFP cull and is a highly selective version of my views as a result). Undoubtedly further work is needed to verify or refute the result and understand the issues, but when temperature analyses alone patently aren’t working, use of winds may help remove the roadblock and allow better understanding.
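For readers unfamiliar with the thermal wind relation mentioned above, here is a minimal sketch of how a change in vertical wind shear maps onto a layer-mean meridional temperature gradient. This is the idealised f-plane, geostrophic, zonal-wind-only version; the constants are standard but the wind and pressure numbers are made up for illustration, and this is not Allen and Sherwood’s actual procedure:

```python
import math

R_DRY = 287.0  # gas constant for dry air, J/(kg K)

def layer_mean_dT_dy(u_low, u_high, p_low, p_high, f):
    """Thermal wind, zonal component, integrated between two pressure levels:
       u_g(p_high) - u_g(p_low) = -(R/f) * ln(p_low/p_high) * dT/dy
    Solve for the layer-mean meridional temperature gradient dT/dy (K/m).
    p_low is the lower (larger-pressure) level, p_high the upper level (hPa)."""
    shear = u_high - u_low
    return -f * shear / (R_DRY * math.log(p_low / p_high))

# NH mid-latitudes: f ~ 1e-4 s^-1; westerlies increase by 20 m/s
# between 850 hPa and 300 hPa.
dTdy = layer_mean_dT_dy(u_low=5.0, u_high=25.0, p_low=850.0, p_high=300.0, f=1e-4)
print(f"{dTdy * 1e6:.2f} K per 1000 km")  # negative: colder toward the pole
```

A trend in the shear therefore implies a trend in the temperature gradient; pinning down absolute trends still requires a boundary condition, which is where the well-observed NH mid-latitudes come in.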

Those with access to Nature Geoscience can read my full comment that expands on all of this far more lucidly than a rapidly written note here.

On other comments raised:

We and others have been applying our methodologies to test-bed cases precisely to try to ascertain what can and cannot be said and those test cases are becoming increasingly complex and realistic. Papers in the pipeline will address this further but McCarthy et al. gives a start and both Steve Sherwood’s IUK and Leo’s RAOBCORE include a degree of verification in test cases so this criticism seems born more of not reading the manual than anything else.

I agree that we should be making observations for climate. Sadly, to date, we haven’t. GRUAN offers that opportunity but needs broad support to happen. Ditto CLARREO. Never heard of them? That potentially gives you some idea of where climate comes in the pecking order. It is unfair to assert that we do not care about observations. Many people spend a lot of time making sure really dumb decisions are not made and trying to protect the observing system, but the bottom line is, sadly, that climate has no observing budget. Please feel free to write to your politicians to demand the billions necessary. Not so keen all of a sudden?!?

Finally, rather than ragging on the radiosonde community it would be nice if those who constantly carp on here about availability of metadata and audit trails were to recognise that as a community the radiosonde experts do actually provide that trail for nearly all the datasets that are publicly available. If all that is forthcoming constantly is criticism then this forum rapidly approaches the status of irrelevance to the climate community. Some balance and encouragement / highlighting of positive aspects is never remiss if you want to be taken seriously.

I have recently learnt that RSS data comes from the same source as UAH data, but uses a different correction than the UAH procedures when adjusting for diurnal temperature effects in the lower troposphere. It is said that the adjustment matters mostly in the tropics.

I have verified that myself, and I am a bit astonished at the results I got. Not only do the adjustments matter mostly in the tropics; in truth, the adjustment only affects the tropical trend. And it changes that trend by a full 0.1ºC/decade. This means that it DOUBLES the tropical tropospheric temperature trend shown by UAH data.

Does anybody have a clue as to what happened to the diurnal temperature in the tropical troposphere that didn’t happen to the diurnal temperatures of the rest of the troposphere?

I would have liked to download and thoroughly read the Mears and Wentz (2005) article that explains the corrections performed, but it is not available without paying…

The correction trend is obtained by cooling the past and warming the present. Does it ring a bell here in Climate Audit?

In RC it is quoted that Thorne concludes: “The new analysis adds to the growing body of evidence suggesting that these discrepancies are most likely the result of inaccuracies in the observed temperature record rather than fundamental model errors.”

This quote is not at all “culled” as he says of the “long-awaited experimental verification of model predictions” quote but nevertheless both statements are inconsistent with the uncertainties and caveats that he has stated above. Perhaps he should post his real views on the realclimate.org blog in order to give the proper balance that he so desires the rest of us to display. And while he’s at it he should tell them he disapproves of such quote mining being used for public disinformation. I won’t hold my breath waiting.

We are well used to scientists saying one thing in private and quite another in public but it remains unacceptable behaviour. It is precisely this kind of mis-presentation of controversial and preliminary adjustments to the raw data as “evidence” that we are often “ragging about”. At what point does it become just plain dishonesty I’d like to know?

If all that is forthcoming constantly is criticism then this forum rapidly approaches the status of irrelevance to the climate community. Some balance and encouragement / highlighting of positive aspects is never remiss if you want to be taken seriously.

Does that criterion for being “taken seriously” also apply to the pro-AGW people with respect to the arguments put forth by those who are skeptical?

Billions have been spent on Climate Research because of AGW, yet nobody gets any money to improve our ability to measure climate change. Does it all go to supercomputers?

The data sets we have are flatly not capable of proving anything. Superhuman efforts are needed (and bragged about) to try and make sense out of the mess. Yet time and again, when somebody pulls back the curtain on the problems the superhumans have glossed over or even created, they respond with attacks and then change the very same data to ‘prove’ the critic wrong.

Of course no new money goes into this data taking. The superhumans are claiming they get the ‘right’ answer with what they have. Rare admissions like that of #40 above, which are needed to justify the improvements, undermine the claims that the answers are right. The politicians handing out the money want power now. And the activists pushing to drastically change the world in damaging ways claim that we can’t afford to wait.

It is long past the time where the scientists of this field regrouped and put together proposals on how to improve the measuring systems so that the new data will not have all these well known problems that make the old data worthless to the task at hand. Quit griping about the lack of money and insist on the share you need to advance your field and do the science.

#40. Thank you for the comment. Before I make any other comment on Peter Thorne’s post, I would like to note that my post did not survey the state of archiving in radiosonde data. However, I would like to observe that I was able to promptly locate and download several key radiosonde data sets (Angell, RATPAC-B, HadAT2). While I’ve not spent enough time on the data to provide an opinion on the completeness of the archives, as Dr Thorne observes, the authors in the field have made substantial efforts to make their data publicly available and should be commended for it. Dr Thorne is also justified in reproaching me for not giving credit where credit is due. Point taken, and my apologies. I will add this information to the head post.

#Peter Thorne
You make some valid points. However, there seem to be a variety of issues that the climate science community does not seem to understand or does not seem to be willing to take steps to deal with. In particular, my specific concern is with statistical methodology and its application to climate issues. For whatever reason, there is a distinct lack of trained statisticians working with the researchers. Given the increasing complexity of statistical methodology developed in the post-BC (Before Computers) era, this lack means that data analysts without proper training or understanding are less likely to grasp the implications of using that methodology or of making ad hoc (theoretically unjustified) changes to it. This may lead to spurious results and/or serious underestimation of the inherent uncertainty of estimates of important parameters. When I read quotes like “The new study provides … long-awaited experimental verification of model predictions” based on such results, my professional hackles are raised and my response is to somewhat vehemently point out the shortcomings in the analysis. Some examples from your post:

The upshot is that the raw data is a mess and the choices that a dataset creator makes imparts non-negligible and unintentional bias into the resulting database. So, how do you address this? You get many people to look at the data independently. Yes, some may make dumb choices, but only through getting this multi-effort approach can you begin to understand what you can and cannot say about the data.

This is one example where I feel that a good statistician could be helpful. Why do you have the idea that it is appropriate to over-manipulate the data with whole series of (often subjective ad hoc) “adjustments”. Looking at your document at http://www.cru.uea.ac.uk/cru/posters/2003-07-PT-HadAT1.pdf

We homogenise the individual station series by near neighbor checks to maintain spatio-temporal consistency. Neighbours are drawn from the contiguous region with correlation r>1/e for each target station, as defined by NCEP reanalyses fields (10). Weightings used to develop neighbor averages for each station are the NCEP correlation. We apply a moving Kolmogorov-Smirnov test through the difference series (target station – neighbour average) on a level-by-level and seasonal basis to identify potential jump-points, using metadata, where available, to confirm these jump-points. Time series are corrected based upon the change in the mean of the difference series across the break-point. We only proceed if this is >0.1K to avoid artificially reddening the time-series. The process is then iterated to a subjectively assessed degree of convergence on a station-by-station basis.

Our QC procedure has been through a total of six iterations…
The resulting zonal trends are spatio-temporally smoother than either the uncorrected analysis or HadRT2.1s (Figure 3). Importantly, the observed tropical tropospheric cooling over the satellite period remains.

Just how much of what is now in the data set is an artifact of the “QC” process? Is there some theoretical basis for applying the K-S test in the moving-window manner that you have? What are its properties as a jump-point detector? What inherent uncertainty has been introduced? What’s wrong with doing genuine Quality Control (which is typically necessary on large data sets) using metadata and other available real information? The remaining factors should be dealt with through an appropriate statistical model in the subsequent analysis, where the uncertainty can be assessed in a more realistic fashion.
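To make the question concrete, here is a minimal sketch of a moving two-sample Kolmogorov–Smirnov test on a difference series. The window length, the synthetic data, and the use of the raw KS statistic as the break score are all made-up assumptions — precisely the tuning choices whose statistical properties are being asked about — and this is not the actual HadAT procedure:

```python
import numpy as np
from scipy.stats import ks_2samp

def moving_ks(diff_series, half_window=30):
    """Slide along the series; at each point compare the distributions of the
    half_window values before and after with a two-sample KS test.
    Return the index with the largest KS statistic, and that statistic."""
    n = len(diff_series)
    stats = np.full(n, np.nan)
    for t in range(half_window, n - half_window):
        before = diff_series[t - half_window:t]
        after = diff_series[t:t + half_window]
        stats[t] = ks_2samp(before, after).statistic
    best = int(np.nanargmax(stats))
    return best, stats[best]

# Synthetic "target minus neighbour average" series with a known
# 2-sigma mean shift at index 100.
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, 200)
x[100:] += 2.0
idx, stat = moving_ks(x)
print(idx, round(stat, 3))
```

Even in this toy setting, the detector’s hit rate, location error, and false-alarm rate all depend on the window length and the size of the shift — which is exactly why the properties of such a procedure need to be established rather than assumed.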

We and others have been applying our methodologies to test-bed cases precisely to try to ascertain what can and cannot be said and those test cases are becoming increasingly complex and realistic. Papers in the pipeline will address this further but McCarthy et al. gives a start and both Steve Sherwood’s IUK and Leo’s RAOBCORE include a degree of verification in test cases so this criticism seems born more of not reading the manual than anything else.

This is another situation where a competent statistician would be helpful. “Test-bedding” is not appropriately performed by running one or two cases and then performing an “eye-ball” comparison. Statisticians will run several thousand cases using multiple underlying scenarios and report statistics on bias, variability, robustness, etc. to evaluate the behavior of procedures where the derivation of a theoretical basis is not possible. The seat-of-the-pants approach “we tried a couple of situations and got the same results” doesn’t cut it in statistics.
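The Monte Carlo style of test-bedding described above can be sketched in a few lines. The CUSUM-type detector and the scenario parameters here are hypothetical stand-ins, not any published homogenization method; the point is the harness — thousands of trials with a known truth, reporting summary statistics rather than an eye-ball comparison:

```python
import numpy as np

def detect_break(x):
    """Toy detector: index maximizing |cumulative deviation from the mean|
    (a CUSUM-style statistic)."""
    dev = np.cumsum(x - x.mean())
    return int(np.argmax(np.abs(dev)))

def test_bed(n_trials=2000, n=120, break_at=60, shift=1.0, seed=1):
    """Run the detector on many synthetic series with a known break point;
    report the mean location bias and the RMS location error."""
    rng = np.random.default_rng(seed)
    errors = []
    for _ in range(n_trials):
        x = rng.normal(0.0, 1.0, n)
        x[break_at:] += shift
        errors.append(detect_break(x) - break_at)
    errors = np.array(errors, dtype=float)
    return errors.mean(), np.sqrt((errors**2).mean())

bias, rmse = test_bed()
print(f"bias={bias:.2f}, rmse={rmse:.2f}")
```

Varying `shift`, `n`, the noise model (e.g. autocorrelated instead of white), and the number of breaks per series gives the “multiple underlying scenarios” a statistician would demand before trusting the detector on real sonde data.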

Finally, rather than ragging on the radiosonde community it would be nice if those who constantly carp on here about availability of metadata and audit trails were to recognise that as a community the radiosonde experts do actually provide that trail for nearly all the datasets that are publicly available. If all that is forthcoming constantly is criticism then this forum rapidly approaches the status of irrelevance to the climate community. Some balance and encouragement / highlighting of positive aspects is never remiss if you want to be taken seriously.

Your comments regarding the openness and archiving of data seem to be justified given the relative ease with which I was able to locate data. My attitude in an earlier post on the Sherwood paper regarding the relative impossibility of reproducing the results was based on the fact that a lot of the steps done were (probably unintentionally) incompletely described. Criticism of a paper (or any research) is the way to evaluate the results – if it can stand up to the criticism, then it is good work and the results are reliable. I like to think that most of the posters here are honest enough to recognize and applaud good work when they see it.

Large errors in rawinsonde data are fairly easy to pick out visually once the data is plotted on a Skew-T chart. Most operational forecasters with a few years under their belt can pick out screwy data quite quickly. Both the NWS and the Air Force have data-validation sub-routines that “kick” the observations back to the operators if the lapse rates reach or exceed a certain threshold. If, for some reason, the “error” is missed and inserted into the model runs, one can be fairly certain that the first model run will have some really screwy forecasts. Rawinsonde stations are few and far between; any significant uncaught errors will quickly show up in the forecast models.

Turning to Dr Thorne’s comment, I’m having trouble understanding what, if anything, he substantively disagrees with in the above post, other than the fact that he wants to be patted on the head.

I come at the radiosonde data as a third party as do readers here. Our first interest is in knowing what reliance can be placed on this data set in terms of understanding climate change. I said that the raw data was a mess and the inhomogeneities were far worse than the surface record which is more familiar to readers here. Dr Thorne said:

numerous changes in instrumentation and observing practice over time which make the purported issues with the surface record look like a walk in the park in comparison….

The upshot is that the raw data is a mess

Seems to me that Dr Thorne agrees 100% with one of the key comments in my post. This is unsurprising since I drew this observation from specialist literature which is quite candid on the topic, though this is less clear as the reports get “culled” in Dr Thorne’s turn of phrase for the public.

I expressed concern that the changepoint methodologies used in dealing with extremely inhomogeneous temperature data were potentially very problematic, that these techniques were not well established by the general statistical community and could bias the results. Dr Thorne stated:

the choices that a dataset creator makes imparts non-negligible and unintentional bias into the resulting database.

Again, it seems to me that Dr Thorne has agreed with the second key point of the above post. Again this is unsurprising since specialist literature in the field (e.g. the two quotes from Sherwood) says exactly the same thing.

I concluded that it seemed that the radiosonde “data” was of insufficient quality to support any conclusion one way or the other as to whether radiosonde observations were or were not inconsistent with models, though I did not comment on the satellite record as it has its own set of issues. Dr Thorne said:

The fact of the matter is that we cannot definitively say whether the troposphere is warming less quickly, as quickly, or more quickly than the surface, either from sondes or satellites, although no-one doubts it is warming.

Again, I see no point of disagreement with this observation and my conclusion in respect to the radiosonde data.

I said that Allen and Sherwood argued that the radiosonde wind data was less screwed up than the radiosonde temperature data. I’m not in a position to comment one way or the other on whether this is the case; now that the issue is raised, other scientists may well disagree. But I stated that this was an attempt to use a different portion of the radiosonde data:

Allen and Sherwood 2008 try a different tack – they try to create a homogenized wind data series on the basis that the radiosonde wind data is much less screwed up. They then argue that the trends in wind are consistent with tropical troposphere warming. They use this as evidence for the side of the argument that the UAH satellite temperature trends in the tropics are incorrect. I guess that we’ll see more about tropospheric wind data in the next while.

Again, I see no material point of difference between this observation and Thorne’s corresponding observation.

So why the petulant tone in Dr Thorne’s post?

He felt that I had not properly praised the radiosonde community for archiving their data. While I’ve not fully surveyed their archival practices, as noted above, I was able to quickly locate and readily download data from relevant data sets. It would have done no harm to have noted this, and I’ve amended the above post to express this. On a scale of 1 to Lonnie Thompson, they’re pretty good. But by now, archiving data should be regarded merely as a type of hygiene. Hopefully, we’ll reach a point where praising a climate scientist for archiving data would seem as ridiculous as praising George Bush or Hillary Clinton for brushing their teeth. But since Dr Thorne wishes some praise of this sort from these quarters, as noted above, I’m happy to recognize their archiving efforts and have done so.

Dr Thorne makes the Rabbettesque-Halperinesque accusation that criticisms of their adjustment methodologies comes from failure to “read the manuals”. Thorne:

We and others have been applying our methodologies to test-bed cases precisely to try to ascertain what can and cannot be said and those test cases are becoming increasingly complex and realistic. Papers in the pipeline will address this further but McCarthy et al. gives a start and both Steve Sherwood’s IUK and Leo’s RAOBCORE include a degree of verification in test cases so this criticism seems born more of not reading the manual than anything else.

In my post, I quoted criticisms from Sherwood et al 2008, a paper that is current, of all prior adjustment efforts and stated that I was prepared to stipulate to these criticisms. Yes, there are a slew of new changepoint analyses. To carry out a complete deconstruction of all these changepoint analyses was beyond the scope of this post or my interest.

Perhaps one of these new adjustment methods will cut the Gordian knot where previous methods have failed. The track record of these prior efforts, according to the most recent survey by Sherwood, is not encouraging. My point was that these particular changepoint analyses were homemade methods developed by the climate science community and their properties were not well understood by the general statistical community. I can’t go to a statistics textbook and look up calculations on confidence intervals for any of these new techniques. As Dr Thorne observes, “the choices that a dataset creator makes imparts non-negligible and unintentional bias into the resulting database.” Quite so.

I see no justification for Dr Thorne’s petulant tone, which is all too reminiscent of the Gavinesque sigh that we’ve all become used to.

Re: Peter Thorne’s comments. To add to what RomanM said, the entire procedure of multiple adjustments, models, assumptions, and analysis procedures, while perhaps all very reasonable individually, leaves too much wiggle room for arbitrary choices and lacks an overall theoretical or experimental frame. Such a frame exists, for example, in a randomized block experimental design, where there is an established statistical framework for how to handle errors and do the analysis. When multiple ad hoc and/or complex analyses are done, we end up afloat in theory land, with no way to evaluate the results rigorously. This is even with perfectly honest scientists. My own field, ecology, suffers from this quite a bit, which makes competing theories hard to test (for many decades) and study results sometimes unconvincing to others. In such a setting, it is really too bad when people make claims about the certainty of their results that can’t be supported by any rigor.

In RC it is quoted that Thorne concludes: “The new analysis adds to the growing body of evidence suggesting that these discrepancies are most likely the result of inaccuracies in the observed temperature record rather than fundamental model errors.”

This quote is not at all “culled” as he says of the “long-awaited experimental verification of model predictions” quote but nevertheless both statements are inconsistent with the uncertainties and caveats that he has stated above.

I agree and find Thorne’s post a long way around saying there are uncertainties in temperature measurements. I suspect one could just as readily apply what he says about the troposphere to the surface. The problem I have with some of these scientific opinions is that they seem to come with an outcome in mind. I also have a problem with climate scientists who make corrections by concentrating on and doing it primarily in one direction.

Notice that Thorne does not confine his argument to the reliability of radiosondes but of MSU also.

Thorne (and RC) continue to present good evidence for the case that those of us interested in climate science need to do our own investigations and analyses.

This could be a little bit off topic, but not entirely. In his RC critique of the Douglass et al 2007 paper, Gavin Schmidt displays two graphs, representing greenhouse and solar forcing, both showing a tropical hot-spot, and states that GISS model simulations show that hot-spot quite IRRESPECTIVE of the type of forcing involved. So, if Schmidt is right, vertical amplification of warming in the tropical troposphere IS NOT a unique signature of greenhouse warming as commonly understood, but a regular consequence of any kind of warming. That is quite contrary to what the IPCC says, and to what Kenneth posted in #36, that only greenhouse warming shows that characteristic tropospheric fingerprint. An interesting question to clarify, by someone better qualified in physics than me: is NASA (GISS, Schmidt) or the IPCC wrong? If the IPCC is wrong, then this whole fight to correct and adjust radiosonde and satellite data is misplaced: it would achieve nothing in terms of attribution of recent warming to any particular cause. If Schmidt is wrong, is it possible that no one spotted his mistake thus far, waiting for me, a philosopher/economist, to do that? *)

Kenneth, #36, I have now read that you also referred to Schmidt’s strange redefinition of tropical warming amplification, but you didn’t emphasize that Schmidt displayed two basically identical graphs for greenhouse and solar forcing, apart from the stratosphere. That is quite different from the IPCC graphs you have posted.

What you cannot say is that this new analysis confirms the models. I guess the best you can say is that the radiosonde data does not falsify the models. The point was made, and it is apparently true, that the way “good” data is determined is by reference to agreement with the model output, when, as Steve points out, “good” data should be identified by careful QC analysis of the collection methods, irrespective of the results, as long as the results are not “unphysical” in some way – and no, you cannot define “unphysical” through GCM output.

Perhaps someone could look through IPCC AR4 and see whether they reported Dr Thorne’s above observations that the data was a “mess”, that attempts to create data sets were fraught with problems, and that the field experienced the under-funding reported above by Dr Thorne.

Within the community that constructs and actively analyses satellite- and radiosonde-based temperature records there is agreement that the uncertainties about long-term change are substantial. Changes in instrumentation and protocols pervade both sonde and satellite records, obfuscating the modest long-term trends. Historically there is no reference network to anchor the record and establish the uncertainties arising from these changes – many of which are both barely documented and poorly understood. Therefore, investigators have to make seemingly reasonable choices of how to handle these sometimes known but often unknown influences. It is difficult to make quantitatively defensible judgments as to which, if any, of the multiple, independently derived estimates is closer to the true climate evolution. This reflects almost entirely upon the inadequacies of the historical observing network and points to the need for future network design that provides the reference sonde-based ground truth. Karl et al. (2006) provide a comprehensive review of this issue.

Although the language is polite, the conclusion is that the radiosonde data is a mess, as Thorne observes above, and that no individual adjustment method can be selected as “right”, as Thorne also observes.

I have doubted the ability to determine or measure AGW from the beginning. Before I get into the details: the first problem that jumped out at me was the potential quality of measurements of very small changes over long periods of time. The second problem, and the most dangerous, is that people stand to gain financially from getting folks to believe in AGW.

Though I will be the first to admit I am ignorant of the nomenclature used on this site, one thing is obvious: all the data comes from some form of measuring instrument, and the data is interpreted and then manipulated based on acceptable standards.

This site touches on all three aspects and it is obvious many AGW believers have played a bit fast and loose with the “the data is interpreted and then manipulated based on acceptable standards”. I will leave that debate to the experts here.

What I would like to comment on (and it is touched on here) is the quality of the data from the measuring instruments and their operators.

A little personal history is needed. I spent 13 years installing, repairing and instructing on the proper use of analytical instruments. They included gas chromatographs, liquid chromatographs, mass spectrometers, capillary electrophoresis systems, atomic emission detectors, diode array detectors, and many more.

In those 13 years I learned the following:

1. 80% of all problems were “operator error”. Our call centers reported, year after year, that 80% of trouble calls were fixed over the phone by correcting user practices or misconceptions.

2. Operating instruments in even the most controlled environments still did not eliminate environmental effects on the instrument: talking in front of a refractive index detector too long, or sunlight reflected off a pane of glass heating instrument parts, can all affect results.

3. Calibrating an instrument only guaranteed it was in specifications at the moment of the calibration.

4. If it has electronic components, the number of potential errors is astronomical.

5. Expecting a result GREATLY affected the operator’s objectivity in evaluating data.

6. Being humans, operators get sloppy and circumvent proper protocols. I once had a NASA PhD taking CO2 samples of the atmosphere by walking out in the hall and drawing a sample. I asked him if he felt that was appropriate and his response floored me … he said, “I wait until no one has walked by for a few minutes”!

7. And lastly, and I know I may insult a large number of this site’s readers, so I will qualify it to say “I do not know if it applies to this field of science”. So here goes: without a doubt the most incompetent, unethical and bottom-of-the-barrel operators were PhDs in academia. On average I spent one day a week in a college or university lab; the other four days were in private-sector pharmaceutical labs, chemical manufacturing labs, food production labs, plastic manufacturing …

The difference between academic and private-sector PhDs was not just a little bit, but dramatic. It was so bad we (my coworkers) would make bets on sports events and the loser had to do the academia service calls for a given period.

Again I am sorry if this does not apply to this field of study.

Bottom line: after 13 years of being totally immersed in extremely accurate measurement instruments, I seriously doubt the ability of many instruments in diverse and harsh environments to accurately measure the very small variances claimed by the AGW crowd with any reliability.

I think in one sense, Steve, you’re being too hard on Dr. Thorne. It doesn’t look to me as though he was complaining about your basic post. A lot of what he says is basically a repeat of what you said, but he doesn’t do it in a context of saying you were wrong. Likewise, it’s useful to have a specialist in the field back up what you said.

What’s a bit more contentious were his last three paragraphs which were explicitly addressed to “other comments”. I haven’t parsed the entire thread to see who he’s addressing, but I expect they are his general comments on the threaded blog comments rather than on the head post. Unfortunately, a lot of scientists who come to a blog like this seem to think that the comments of other people are to be taken as representative of the blog-owner’s thought, though except on the most highly censored sites this isn’t the case. Since active scientists don’t have time to learn much about how blogs work, this is understandable, but it does mean they’re often tilting at windmills. That’s what I believe is the case here.

So what Thorne and the IPCC are saying is that none of the post war temperature records are much cop. We know they place great emphasis on the surface record but we know from this blog and Surface Stations Org what the quality of that record is.

One comment of Thorne’s at the beginning that flummoxed me was:

“Firstly, and most importantly, neither the radiosonde or the satellite programs have answered primarily (or at all in some cases) to the needs of climate. Rather, these measurements have been made with operational forecasts in mind.”

I don’t want to start the weather versus climate debate again but I don’t see how this can hold up. If the radiosondes are not good enough for climate how are they good enough for operational forecasts? Can somebody tell me what the difference in accuracy required for climate and “operational” forecasts is? If the radiosondes are so poor at their job why are they used? Perhaps this explains why forecasts can only be made for 4 days ahead and are invariably wrong but ” projections” can be made for 100 years from models that are now validated by wrong data.

And how, after these admissions, can the IPCC claim the certainty it does for AGW?

I think what Thorne is really saying is that he doesn’t like the radiosonde community taking all the heat for the “messy” data when it is used in climate analysis.

Firstly, and most importantly, neither the radiosonde or the satellite programs have answered primarily (or at all in some cases) to the needs of climate. Rather, these measurements have been made with operational forecasts in mind. This has meant that there have been numerous changes in instrumentation and observing practice over time which make the purported issues with the surface record look like a walk in the park in comparison. Unlike surface observations, radiosondes are single use instruments (fire and forget), and satellites have in a climate sense very short lifetimes and are subject to all that space can throw at them. Neither is appealing as a raw database from which to construct an unimpeachable dataset either in isolation or in combination.

My reading of this is that the system was designed for operational forecasts and the data has been carefully collected and logged with meta-data and audit trails as would be expected. But this is all within the parameters of operational forecasts. The problems that arise from using this data in climate analysis are varied and complex and acknowledged within the radiosonde community, and some attempts to identify useful information from the existing datasets are happening. I think I would get upset too if I were fielding complaints about how someone else was using my data for something it was never designed to be used for in the first place. To me that explains the petulant tone, and the majority agreement between his post and the opinions here.

That’s my take on it; I’m fairly new to following posts here, so maybe I’m missing something. It seems clear that if radiosonde and satellite data are to play into climate analysis, then the system needs some funding proposals to improve the data for that purpose (lots of money, it sounds like). In the meantime it sounds like people are trying to make do with what they have and doing a poor job of it from a statistical point of view.

It’s not that important what Thorne says here – it’s what he says elsewhere that counts. And elsewhere he says that with proper adjustment the sondes verify the models. Which is purely circular reasoning. It’s so easy to jump from a bare opinion that the data should show warming to “evidence” of warming via untested adjustments. And just like other tainted “evidence” this then will become a “fact” in communications with policymakers. And why are they of this opinion in the first place? Because of the surface records, which are in fact equally tainted by adjustments significantly larger than the signal. The only records I trust are the Arctic and now the US48 (thanks to the 3rd party verification done here) and they tell us that so far it’s no different from the 1930’s heating and so we may be on a cusp just like then. Another 10 years of data is needed – accurate this time.

Ivan, remember that Gavin Schmidt is comparing tropospheric to surface temperature trends for the case where temperatures have increased by the equivalent of what the model predicts for 2xCO2 and then the case where solar forcing increases the temperatures by the same amount without CO2 forcing. The AR4 reference to which I referred gives the current-day GHG fingerprint assuming the model has the portions of the temperature increases/decreases correct for aerosol, solar, volcanic and GHG forcings. That is not an apples-to-apples comparison and in my mind could confuse. There may be more to the discrepancy than that (I need to investigate) but for right now Douglass et al. (2008) were comparing observed to model output for the current climate.

I was expecting a reply like yours. We know that we can’t get accuracy to 0.1C from most if not all of our current measuring methods, even the satellite MSU has been challenged. Yet the IPCC supports work that claims it is able to extract measurements to this accuracy by using statistical torture vide MBH and all their fellow travellers.

Your comment about extracting the signal is interesting. The inaccuracy that we have discussed here and elsewhere means that all the frequently claimed warming over the past 100 years could simply be non-existent because of measurement errors. But we know that it has warmed a bit, at least in the UK, since the end of the LIA as we no longer have ice fairs on the Thames and gardeners will tell you that the weather over the past 10 years has been markedly unseasonable. What accuracy do we need to say that the average temp in, say, the south of England has increased by, say, 1C since 1821 (the year of the last ice fair, I think)?

So is it that we know roughly there has been some warming but then there was during the MWP. We know the models cannot be true because to be so, the modellers would have to have a complete grasp of all of the driving factors of the weather and the climate. This they self-evidently do not. There is no convincing evidence of the link between CO2 and the warming. So the whole farrago of the IPCC is built on sand and despite the billions of dollars spent we seem to know so little.

I think Dr Thorne might have taken exception to my post,#19, where I did indeed quote AFP on verification of model predictions (although I did not mention his name), rather than Steve’s original post much of which, as you say, he seems to agree with.

However, I am sure the climate community would be quite happy to let the AFP quote stand in areas where it wasn’t being questioned.

#66. Mosh, while there’s a lot of piling on, Thorne also has an easy alternative available to him – respond to my comments, noting that he disagrees with but is unable to reply to all the other posters.

That’s what I’d try to do with, say, Gavin Schmidt. I wouldn’t try to engage every commenter at realclimate.

So I don’t believe that the piling on is what is preventing Thorne from replying. He said in advance that he intended to engage in a drive-by shooting and I, for one, see no reason not to take him at his word. In addition, it’s hard to find anything that he actually disagrees with in my post, other than insufficient adulation of climate scientists.

You asked how data that is not of sufficient quality for climate research could be of sufficient quality for weather forecasts. I answered that question. If you wanted me to answer a question of how low quality weather data can be used for climate research, you should have asked that question.

Thorne was a contributing author to all chapters of the CCSP report and lead author of Chapter 5. I’d appreciate it if someone went through this and identified whether this report, which contains extensive discussion of radiosonde data, contains a straightforward statement that the radiosonde data is a mess and that the various adjustment schemes each have the problem of potential bias mentioned in his prior comment.

As an old weatherman, I cannot any longer suppress a comment that it seems quite unlikely that many, if any, of the people trying to ‘correct’ raob data have any experience with the equipment and techniques used, especially before the era of computerized reduction of data.

As has been mentioned, the instrumentation itself was expendable and therefore cheap. If memory still serves, and sometimes it doesn’t these days, a rawinsonde instrument in the early 50’s cost about US $3. By the late 60’s this price was about $6. These prices also apply roughly to dropsonde instruments of the same vintage. In short, these instruments would be laughed out of most laboratories.

I believe they were reasonably accurate for the purpose used and will not try to expand beyond that point.

As for winds being measured more accurately than temperatures, I doubt that.

Winds in the 50’s and 60’s were calculated as follows.

– The height of the balloon was ‘calculated’ based on the time-of-flight and the ‘canned’ ascension rate of the balloon. In short, the altitude was NOT measured, but more or less ‘assumed’.

– The angle off vertical could be measured by the passive tracking antenna.

– Then, using a slide rule and simple trig, the altitude and angle would allow calculating the Horizontal Distance Out (HDO) from the launch point.

– Using the bearing of the signal and the HDO, location of the instrument was then plotted every minute. The average wind direction and speed for that minute was then simply ‘measured’ using the last and previous plot.
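The minute-by-minute plotting procedure described above can be sketched roughly as follows (the ascent rate, units and angle conventions here are illustrative assumptions, not the actual operational values):

```python
import math

ASCENT_RATE = 300.0  # assumed 'canned' ascension rate, metres per minute (illustrative)

def plot_position(minute, elevation_deg, bearing_deg):
    """Locate the instrument from time-of-flight and antenna angles.

    Altitude is not measured: it is assumed from the canned ascent rate.
    """
    altitude = minute * ASCENT_RATE                # assumed, not measured
    # angle off vertical = 90 - elevation; HDO = altitude * tan(off-vertical)
    off_vertical = math.radians(90.0 - elevation_deg)
    hdo = altitude * math.tan(off_vertical)        # Horizontal Distance Out
    # convert bearing + HDO to east/north coordinates from the launch point
    b = math.radians(bearing_deg)
    return hdo * math.sin(b), hdo * math.cos(b)

def wind_for_minute(p_prev, p_curr, dt_min=1.0):
    """Average wind between two successive one-minute plots.

    Returns speed in m/s and the meteorological direction (degrees the
    wind blows FROM).
    """
    dx, dy = p_curr[0] - p_prev[0], p_curr[1] - p_prev[1]
    speed = math.hypot(dx, dy) / (dt_min * 60.0)
    direction = (math.degrees(math.atan2(dx, dy)) + 180.0) % 360.0
    return speed, direction
```

For example, a balloon tracked at 45° elevation due east on consecutive minutes drifts 300 m eastward per minute under these assumed numbers, giving a 5 m/s wind from the west (270°). Every error in the assumed ascent rate propagates directly into the altitude, the HDO and hence the wind vector.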

Once again, these values were accurate enough for the purpose intended, but to think that someone in an ivory tower, decades later, could correct pressure readings, temperature values, humidity levels, or wind vectors, seems to me to be absurd.

So the wind data is “slide rule accurate” based on the average ascension rate of the balloon probably under “Standard Temperature and Pressure”. A rate that would be affected by humidity, temperature, barometric pressure, and wind, to name some that come to mind immediately. Not to mention getting weighted down by raindrops, etc.

re 71. ya. I was hoping that his “I’ll make this one post against my better judgement and leave” was more or less theatrical posing and I hoped against hope that he actually had more gumption and gonads.

I appreciate Dr. Thorne dropping by, but a telegraphed one-time post that doesn’t substantively disagree with the main points of the article but rather its presentation, or the reaction of the readers here, or whatever … I agree with Steve, it’s easy to post one post following up to Steve rather than starting to debate everyone here with a variety of viewpoints that are often far from the center Steve seems to me to occupy. We’ve seen that before I think!

Aft hae I rov’d by Bonie Doon,
To see the rose and woodbine twine:
And ilka bird sang o’ its Luve,
And fondly sae did I o’ mine;
Wi’ lightsome heart I pu’d a rose,
Fu’ sweet upon its thorny tree!
And my fause Luver staw my rose,
But ah! he left the thorn wi’ me.

As with the singer of Burns’ song, we can lament the drive-by, but, in our case, we surprisingly have a wandering Thorn.

I think everybody is forgetting that the satellite and radiosonde data is currently very much in agreement with the real world (cold winters NH and SH, no ice melt in NH and SH) and even with actual trends of sloppy ground temps reported by GISS, CRUT, anecdotal reports from most blogs, and so on. RSS and UAH MSU, to the distress of the modelers, are proving to be very accurate. Are we to conclude that this data is nonsense? http://discover.itsc.uah.edu/amsutemps/
In fact, when you think of it seriously, isn’t it totally ludicrous for anyone (and this includes all skeptics and AGW believers) to make any current assumptions about climate trends without at least 100-200 years of data?

Yorick asked: Thermals and downdrafts? Were they launched from parking lots or “cool parks”?

Well, we weren’t concerned in those days about that except that the launch area was usually quite close to the inflation shack at the first space long and open enough to accommodate not dragging the instrument if the balloon went faster with the wind than it went up. Sometimes, it was a pretty good sprint ending in pitching the instrument up in the air.

At least with dropsondes, launch was simply dropping it.

A bit off topic, but I remember launching dropsonde runs years ago just off the coast of Charleston, SC from probably 30-35,000 ft, and at the same time radiosonde runs began. We made several drops over several days since we were involved with trying to make our new equipment operational as rapidly as possible.

It has been a long time ago, but the only really significant difference in the runs that I remember were the super-adiabatic lapse rates we encountered near the surface. We visited the Charleston weather bureau folks and they said that they too had seen the ‘supers’ but had ignored them because they didn’t think they really existed.

See, in those days we didn’t have anyone to massage the data after the fact and tell us what we really should have measured. (I could NOT resist that one!)

Steve M., regarding your question above, it seems that in Chapter 5 of the CCSP Report, they like radiosonde data just fine when it agrees with their ideas:

Even without performing formal statistical tests, it is visually obvious from Figure 5.1 that radiosonde-based estimates of observed stratospheric and tropospheric temperature changes are in better agreement with the PCM 20CEN experiment than with the PCM “GHG only” run.

To obtain a clearer picture of volcanic effects on atmospheric temperatures, Free and Angell (2002) removed the effects of variability in ENSO and the Quasi-Biennial Oscillation (QBO) from Hadley Centre radiosonde data. Their work clearly shows that the cooling effect of massive volcanic eruptions has been larger in the upper troposphere than in the lower troposphere.

Brown et al. (2000) used surface, radiosonde, and satellite data to identify slow, tropic-wide changes in the lower tropospheric lapse rate. In their analysis, the surface warmed relative to the troposphere between the early 1960s and mid-1970s and after the early 1990s. Between these two periods, the tropical troposphere warmed relative to the surface. The spatial coherence of these variations (and independent evidence of concurrent variations in the tropical general circulation) led Brown et al. (2000) to conclude that tropical lapse rate changes were unlikely to be an artifact of residual errors in the observations.

Thorne et al. (2003) applied a “space-time” fingerprint method to six individual climate variables. These variables contained information on patterns of temperature change at the surface, in broad atmospheric layers (the upper and lower troposphere), and in the lapse rates between these layers. Thorne et al. explicitly considered uncertainties in the searched-for fingerprints, the observed radiosonde data, and in various data processing/fingerprinting options. They also assessed the detectability of fingerprints arising from multiple forcings. The “bottom-line” conclusion of Thorne et al. is that two human-caused fingerprints – one arising from changes in well-mixed GHGs alone, and the other due to combined GHG and sulfate aerosol effects – were robustly identifiable in the observed surface, lower tropospheric, and upper tropospheric temperatures.

… and they think the radiosonde data is flawed when it disagrees with the models:

It should be emphasized that all of the studies reported on to date in Section 4 relied on satellite data from one group only (UAH), on early versions of the radiosonde data, and on experiments performed with earlier model “vintages.” It is likely, therefore, that this work may have underestimated the structural uncertainties in observed and simulated estimates of lapse rate changes.

The range of model T4 trends encompasses the trends derived from satellites, but not the larger trends estimated from radiosondes. The most likely explanation for this discrepancy is a residual cooling trend in the radiosonde data (Chapter 4).

In contrast, the T2 trends in both radiosonde data sets are either slightly negative or close to zero, and are smaller than all of the model results. This difference is most likely due to contamination from residual stratospheric and upper-tropospheric cooling biases in the radiosonde data (Chapter 4; Sherwood et al., 2005; Randel and Wu, 2006).

However, they like the data once they’ve adjusted it …

These radiosonde data sets were either unadjusted for inhomogeneities, or had not been subjected to the rigorous adjustment procedures used in more recent work (Lanzante et al., 2003; Thorne et al., 2005).

Finally, I have another question. The high altitude wind study gives a value for the temperature trend of 0.65°C/decade for the area 7.5 to 10 miles up. My atmospheric calculator puts this at 200 to 100 millibars. This temperature trend is 10% larger than the highest of the model results, 240% larger than the model average, and is wildly larger than any of the observations.

How does this result support the models? They did worse at reproducing the new “observation” than they did with the old ones, and the new “observation” is out of the ball park compared to both models and other observations.
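The altitude-to-pressure conversion quoted above can be checked against the US Standard Atmosphere. A minimal sketch (the constants are the published standard-atmosphere values; this is a rough check, not the commenter’s actual “atmospheric calculator”):

```python
import math

def std_pressure_hpa(alt_m):
    """US Standard Atmosphere pressure (hPa) at a geopotential altitude
    in metres: linear 6.5 K/km lapse rate up to 11 km, then an
    isothermal layer at 216.65 K from 11 to 20 km."""
    P0, T0, L = 1013.25, 288.15, 0.0065   # sea-level pressure (hPa), temp (K), lapse (K/m)
    g, R = 9.80665, 287.053               # gravity (m/s^2), gas constant for dry air
    if alt_m <= 11000.0:
        return P0 * (1.0 - L * alt_m / T0) ** (g / (R * L))
    # pressure at the 11 km tropopause, then exponential decay above it
    p11 = P0 * (1.0 - L * 11000.0 / T0) ** (g / (R * L))
    return p11 * math.exp(-g * (alt_m - 11000.0) / (R * 216.65))

# 7.5 and 10 statute miles in metres
p_low  = std_pressure_hpa(7.5 * 1609.344)   # ~190 hPa
p_high = std_pressure_hpa(10.0 * 1609.344)  # ~100 hPa
```

Under standard conditions 7.5 miles comes out near 190 hPa and 10 miles near 100 hPa, so the “200 to 100 millibars” figure is roughly consistent.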

0. It was not posturing. I really do not have much time as I am packing for me and my family to go on secondment to the states for six months next week. This is now eating into that time and so this really will be it as real life is more important than blogs. Really, it is.

1. Name calling is neither big nor clever and those who are doing so should be ashamed of themselves. It does nothing to further the cause and reflects badly on the blog owner eventually. So if you actually respect Steve’s work you really should back off the infantile stuff.

2. I was setting the record straight on my views and not trying to critique the original post. Steve, if you had emailed me I could have clarified for you to avoid this mis-understanding, but what has happened has happened. The original post mis-represented my views with a highly selective quote and I was quite rightly pissed as a result (American meaning). If it was read as a critique of the post then I apologise to Steve. There are things I disagree with but after the infantile posts of many commentators I will take up any discussions of these off-line with Steve if I have time and inclination (clue: continued baiting every time a climate scientist pops in reduces the probability that any will in future).

3. As pointed out by others my “other comments” were on comments made. I didn’t have time to id them by number as I had a meeting I was already late for. I am sorry that I did not make this clear.

4. I find #91 to be a particularly bad case of mis-quoting, as most of the quotes come from a summary of previously published work, which is what the synthesis part of a synthesis and assessment report should be. Chapter 5 of CCSP and the whole report has writ large that the main new “finding” is that we were previously hopelessly optimistic in how certain we were in our estimates of observed tropospheric change.

5. As I said in my previous post, winds may well prove not to be the answer in the longer term. Please note, however, that Allen and Sherwood do not use winds before the 1970s because they are so poor. I have actually seen the sondes we used to use (although I wasn’t around to launch them) and it is immediately clear to me that there were undoubtedly going to be issues in all measures.

6. This was my first Nature N&V piece and it has been quoted in different ways by different people. I stand by the comments, but if they are selectively quoted anyone could state that my position was either that all was okay or that there would be real problems. When we were doing CCSP I used to characterise it that we all lay somewhere on a Bayesian continuum between no problem and substantial lapse rate discrepancy. This is still the case as the uncertainty tent is still large enough to comfortably accommodate both camps. I have tended to move towards “no problem” on the basis of all the evidence I have seen, largely the evidence to do with changes in moisture content from sondes and satellites, which for satellites, because of the measurement technology and the trend size, are much less ambiguous. It would be very strange for the troposphere to be moistening substantially without a concurrent rise in temperature. I fully admit that this doesn’t make me right any more than it does any of you.

7. Multi-variate understanding is undoubtedly the way forwards. Which is why we should stop thinking temperature temperature temperature and start considering in far greater detail changes across a broad range of parameters (moisture, clouds, radiation, wind, pptn …) – after all, it’s not global mean or tropospheric temperatures that affect people, is it?

That really is it, but please, please, please note that some more civility in future, rather than trying to tear contributors apart limb by limb, will only aid your cause.

Peter Thorne
Can you now please set the record straight on the realclimate.org post where you were quoted, since that is read by a lot of people who don’t come here. It is actually very important that you are not seen to be saying one thing in one place and another – almost mutually exclusive – thing somewhere else, because you lay yourself wide open to this sort of criticism. We are locked out of the Nature comment, but if the real views of climate scientists are being misrepresented by realclimate.org then it’s important for scientists to correct them, not here where largely only skeptics read, but where your quotes are actually being disseminated. Knutson did it when Mann and Benestad mis-presented his latest work. When you do it we’ll all be 100% behind you. For most of us, the problem is not in the science itself (most of which is quite good) but in the presentation of the science.

The only thing close to name-calling I see here is my own statement – “It is precisely this kind of mis-presentation of controversial and preliminary adjustments to the raw data as “evidence” that we are often “ragging about”. At what point does it become just plain dishonesty I’d like to know” which is actually a serious question, not specifically directed at you, which I’d like an answer to. There really is no other conclusion to draw other than that someone, somewhere is perilously close to dishonesty. If it isn’t you then I’m sorry, but please then identify who it is.

First, Peter, thank you for your participation here; you have constraints on your time.

Next, I was actually not that fussed with what the CCSP said. However, the idea that the radiosonde data is good enough sometimes and not good enough other times seems odd to me. You say that the quotes came from “a summary of previously published work”, which is true. But you made absolutely no objection to the radiosonde data used in any of that work, in fact it was quoted approvingly. If your point was that the radiosonde data was flawed, as you say, that is an odd way to show it. If your contention is correct, if the data needs adjusting, isn’t it necessary to redo the earlier studies with the adjusted data to see the effect? Aren’t the older studies in abeyance until they are redone with the adjusted data?

My real curiosity lay in the question of how the estimated wind-based high-altitude temperature trend of 0.65°C/decade (which is way high, out of bounds compared with both models and data), can be said to validate the models.

While avoiding venting again, I’d like scientists to realize they have to draw a clear line between reality and fantasy when it’s as important as this issue obviously is. If this were an obscure dietetics paper we’d not worry about misrepresentations at all. But things in Nature and Science can acquire an importance that can sweep everything in their path, because you only get published there if you are presenting something apparently new. Hence the “verification of models” statement. The reality simply wouldn’t have been published. [ snip – no policy]

I think some of us took your comment that you were going to make one post seriously, so against our better judgement we engaged in a little fun with names. I’ll apologize for myself and jeez. We like to entertain each other, hopefully to the amusement of others, without too much collateral damage.

#88. Dr Thorne, have a nice holiday and we appreciate your comments. I reviewed comments following your initial post and observed no disrespect in connection with your actual comments.

The only teasing came when it seemed like you were doing a “drive-by snooting” in Steve Mosher’s words. You’re encountering some prior history here. You’ve now clarified that this was not your intent; have a nice holiday.

I remain unclear as to what objections, if any, you had to my original post. It would have done no harm if you had more clearly noted points of agreement, however painful it may have seemed to actually agree with me on something.

Steve
It seems he only objects to the AFP quote, not your post, as he was just “setting the record straight on my views”. Why he thinks this is the best place for that is another question. Maybe this is becoming the new site of choice to discuss climate science.

There are things I disagree with but after the infantile posts of many commentators I will take up any discussions of these off-line with Steve if I have time and inclination (clue: continued baiting every time a climate scientist pops in reduces the probability that any will in future).

I have read Peter’s posts and have not been able to extract anything particularly substantive with regards to uncertainty of radiosonde measurements/adjustments or (dis)agreement with MSU. I suspect some of the disconnect comes from sources that I note below and perhaps his sensitivity.

One issue I see is the attempt, I think, to argue that the unadjusted radiosonde measurements are a mess, that they were not much improved by previous attempts at adjustment (hence the seeming contradiction with climate models), and that the latest attempt brings them closer to the models and should therefore be given consideration in judging this adjustment – even though the methods of that adjustment are not readily defended on scientific (or statistical) grounds.

The other issue is confusing the scientist’s statements as those from the scientist with those of the scientist as a policy advocate. I think many of these scientists have shown that they find it necessary to make an obligatory statement connecting their work to AGW. I suspect that Thorne’s policy position directs him to view the agreement of the RAOBCORE v1.4 (and in context of other observational sources) with the climate model outputs differently than some of us who post here would.

Understanding the nuances of what Thorne is attempting to communicate may be my problem and thus I see some merit in sensitive posters, such as Thorne, communicating with Steve M and then giving Steve permission to pass on the information to the readers at CA in a form that could add to our knowledge base.

Which I found repeated at several places on the web including RC (and which I believe Steve Mc took from RC), but I was unable to locate the content abbreviated by the ellipsis. If this quote is a misquote or an out-of-context edit, I’m sure we would all appreciate clarification both here and at RC.

Abstract: A recent report of the U.S. Climate Change Science Program (CCSP) identified a “potentially serious inconsistency” between modelled and observed trends in tropical lapse rates (Karl et al., 2006). Early versions of satellite and radiosonde datasets suggested that the tropical surface had warmed by more than the troposphere, while climate models consistently showed tropospheric amplification of surface warming in response to human-caused increases in well-mixed greenhouse gases. We revisit such comparisons here using new observational estimates of surface and tropospheric temperature changes. We find that there is no longer a serious and ubiquitous discrepancy between modelled and observed trends in tropical lapse rates.

This emerging reconciliation of models and observations has two primary explanations. First, because of changes in the treatment of buoy and satellite information, new surface temperature datasets yield slightly reduced tropical warming relative to earlier versions. Second, recently-developed satellite and radiosonde datasets now show larger warming of the tropical lower troposphere. In the case of a new satellite dataset from Remote Sensing Systems (RSS), enhanced warming is due to an improved procedure of adjusting for intersatellite biases. When the RSS-derived tropospheric temperature trend is compared with four different observed estimates of surface temperature change, the surface warming is invariably amplified in the tropical troposphere, consistent with model results. Even if we use data from a second satellite dataset with smaller tropospheric warming than in RSS, observed tropical lapse rates are not significantly different from those in all model simulations.
Our results contradict a recent claim that all simulated temperature trends in the tropical troposphere and in tropical lapse rates are inconsistent with observations. This claim was based on use of older radiosonde and satellite datasets, and on two methodological errors: the application of an inappropriate statistical “consistency test”, and the neglect of observational and model trend uncertainties introduced by interannual climate variability.

This emerging reconciliation of models and observations has two primary explanations.

The phrases "(t)owards elimination of warm bias" and "emerging reconciliation of models and observations" indicate to me that these authors have not found any smoking guns to this point.

Anyone can make the SE comparison of the models' average with the "new and improved" RAOBCORE v1.4 in the context of the other observed measurements and draw their own conclusions. And some of these climate scientists have the audacity to complain about, and wonder why, their analyses and comments are taken out of context.
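For readers who want to see what that "SE comparison" amounts to in practice, here is a minimal sketch. The numbers are entirely synthetic stand-ins (not the actual model or RAOBCORE trends), and the point is only the statistical distinction at issue in the Douglass et al / Santer et al dispute: testing an observed trend against the standard error of the ensemble *mean* gives a much stricter test than testing it against the ensemble *spread*.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical numbers for illustration only: synthetic per-model trend
# estimates (deg C/decade) standing in for a model ensemble.
model_trends = rng.normal(loc=0.22, scale=0.08, size=19)
obs_trend = 0.06  # a hypothetical observed tropospheric trend

n = len(model_trends)
ensemble_mean = model_trends.mean()
sd = model_trends.std(ddof=1)

# Douglass-style test: compare the observation to the standard error
# of the ensemble MEAN (sd / sqrt(n)) -- a narrow target.
se_mean = sd / np.sqrt(n)
z_vs_se = (obs_trend - ensemble_mean) / se_mean

# Alternative: compare the observation to the ensemble SPREAD (sd),
# i.e. ask whether it is consistent with any single model realization.
z_vs_spread = (obs_trend - ensemble_mean) / sd

print(f"ensemble mean = {ensemble_mean:.3f}, sd = {sd:.3f}, se = {se_mean:.3f}")
print(f"z vs SE of mean = {z_vs_se:.2f}")
print(f"z vs spread     = {z_vs_spread:.2f}")
```

Because the SE of the mean shrinks with the number of models while the spread does not, the same observed trend can look wildly inconsistent under the first test and unremarkable under the second. Which test is appropriate is precisely what the two camps dispute.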

The ellipsis looks like a poorly constructed quote, because "results" is not in the original, as far as I can tell. Here's the full context of the original:

So are we any closer to resolving the riddle of tropospheric temperature change? It seems we’re getting there. Allen and Sherwood [4] give evidence for a strong warming in the tropical upper troposphere, providing long-awaited experimental verification of model predictions. Furthermore, the warming they observe reaches its maximum just below the tropical tropopause. Such amplification of surface warming is expected on theoretical grounds, and is indeed found on monthly to inter-annual timescales by both models and observational estimates [8]. However, it has been absent in almost all observational estimates on decadal timescales — upon which non-climatic artefacts project most strongly. The new analysis [4] adds to the growing body of evidence suggesting that these discrepancies are most likely the result of inaccuracies in the observed temperature record [3, 5, 8] rather than fundamental model errors (Fig. 1).

Once again, I’m reminded that if you torture the data long enough, it will confess, even to crimes it didn’t commit. What, the temperature data doesn’t confirm the models? Must be something wrong with the temperature data, ’cause it sure cannot be anything wrong with the models.