Thursday, September 5, 2013

New paper in Nature Climate Change says IPCC uses statistical techniques 'out of date by well over a decade'

A new paper published in Nature Climate Change finds, "Use of state-of-the-art statistical methods could substantially improve the quantification of uncertainty in [IPCC] assessments of climate change" and that "The forthcoming Intergovernmental Panel on Climate Change (IPCC) Fifth Assessment Report (AR5) and the US National Climate Assessment Report will not adequately address this issue. Worse still, prevailing techniques for quantifying the uncertainties that are inherent in observed climate trends and projections of climate change are out of date by well over a decade. Modern statistical methods and models could improve this situation dramatically." The authors recommend, "Including at least one author with expertise in uncertainty analysis on all chapters of IPCC and US national assessments" and that the IPCC "Replace qualitative assessments of uncertainty with quantitative ones." A prime example of this would be the ludicrous IPCC claim that it is "very likely" man is the cause of climate change, a claim that very likely would not hold up if subject to modern statistical analysis of uncertainty.

Use of state-of-the-art statistical methods could substantially improve the quantification of uncertainty in assessments of climate change.

Because the climate system is so complex, involving nonlinear coupling of the atmosphere and ocean, there will always be uncertainties in assessments and projections of climate change. This makes it hard to predict how the intensity of tropical cyclones will change as the climate warms, the rate of sea-level rise over the next century or the prevalence and severity of future droughts and floods, to give just a few well-known examples. Indeed, much of the disagreement about the policy implications of climate change revolves around a lack of certainty. The forthcoming Intergovernmental Panel on Climate Change (IPCC) Fifth Assessment Report (AR5) and the US National Climate Assessment Report will not adequately address this issue. Worse still, prevailing techniques for quantifying the uncertainties that are inherent in observed climate trends and projections of climate change are out of date by well over a decade. Modern statistical methods and models could improve this situation dramatically.

Uncertainty quantification is a critical component in the description and attribution of climate change. In some circumstances, uncertainty can increase when previously neglected sources of uncertainty are recognized and accounted for (Fig. 1 shows how uncertainty can increase for projections of sea-level rise). In other circumstances, more rigorous quantification may result in a decrease in the apparent level of uncertainty, in part because of more efficient use of the available information. For example, despite much effort over recent decades, the uncertainty in the estimated climate sensitivity (that is, the long-term response of global mean temperature to a doubling of the CO2 concentration in the atmosphere) has not noticeably decreased [1]. Nevertheless, policymakers need more accurate uncertainty estimates to make better decisions [2].

Figure 1. Results are from 18 models included in the CMIP5 experiment [24]. The median climate model projection with no uncertainty is indicated by the vertical grey line, and uncertainty due to different climate model projections is shown by the histogram (white bars). The coloured curves represent cumulative uncertainty taking into account errors from the prediction of global mean sea-level rise from global mean temperature (red line), from the relation between Seattle sea-level rise and global mean sea-level rise (blue line) and from that between Seattle and Olympia sea-level rise (green line). Figure courtesy of Peter Guttorp, University of Washington.

Detailed guidance provided to authors of the IPCC AR5 and the US National Climate Assessment Report emphasizes the use of consistent terminology for describing uncertainty for risk communication. This includes a formal definition of terms such as 'likely' or 'unlikely' but, oddly, little advice is given about what statistical techniques should be adopted for uncertainty analysis [3, 4]. At the least, more effort could be made to encourage authors to make use of modern techniques.

Historically, several compelling examples exist in which the development and application of innovative statistical methods resulted in breakthroughs in the understanding of the climate system (for example, Sir Gilbert Walker's research in the early twentieth century related to the El Niño–Southern Oscillation phenomenon [5]). We anticipate that similar success stories can be achieved for quantification of uncertainty in climate change.

Although climate observations and climate model output have different sources of error, both exhibit substantial spatial and temporal dependence. Hierarchical statistical models can capture these features in a more realistic manner [6]. These models adopt a 'divide and conquer' approach, breaking the problem into several layers of conceptually and computationally simpler conditional statistical models. The combination of these components produces an unconditional statistical model, whose structure can be quite complex and realistic. By using these models, uncertainty in observed climate trends and in projections of climate change can be substantially decreased. This decrease is obtained through 'borrowing strength', which exploits the fact that trends or projections ought to be similar at adjacent locations or grid points. Methods that are currently applied usually involve analysing the observations for each location (or the model output for each grid point) separately. Hierarchical statistical models can also be applied to combine different sources of climate information (for example, ground and satellite measurements), explicitly taking into account that they are recorded on different spatial and temporal scales [7, 8].
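The 'borrowing strength' idea can be illustrated with a minimal sketch. This is not the hierarchical model of the cited work: it assumes a simple two-level normal model with a known, common sampling variance, and every number in it (50 sites, a mean trend of 0.15 degrees C per decade, the noise levels) is made up for illustration. The function names `partial_pool` and `mean_sq_error` are hypothetical.

```python
import random
import statistics

def partial_pool(estimates, sampling_var):
    """Shrink noisy per-site estimates toward their common mean (empirical Bayes)."""
    grand_mean = statistics.fmean(estimates)
    # Between-site variance: total spread minus the part explained by sampling noise.
    between_var = max(0.0, statistics.variance(estimates) - sampling_var)
    # Weight on each site's own estimate: 0 = full pooling, close to 1 = no pooling.
    weight = between_var / (between_var + sampling_var)
    return [grand_mean + weight * (y - grand_mean) for y in estimates]

def mean_sq_error(estimates, truth):
    """Average squared distance between estimates and the (simulated) truth."""
    return statistics.fmean((e - t) ** 2 for e, t in zip(estimates, truth))

random.seed(0)
# Hypothetical setup: 50 neighbouring sites whose true trends are similar.
true_trends = [random.gauss(0.15, 0.03) for _ in range(50)]
noise_sd = 0.10  # assumed sampling error of each site's separately fitted trend
raw = [t + random.gauss(0.0, noise_sd) for t in true_trends]
pooled = partial_pool(raw, noise_sd ** 2)

print("MSE, sites analysed separately:", mean_sq_error(raw, true_trends))
print("MSE, partial pooling:          ", mean_sq_error(pooled, true_trends))
```

Because the simulated sites really do have similar trends, the pooled estimates land closer to the truth than the site-by-site ones. If neighbouring sites differed strongly, the estimated between-site variance would grow and the shrinkage would weaken accordingly.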

Even without any increase in computational power, statistical principles of experimental design can reduce uncertainty in the climate change projections produced by climate models through more efficient use of these limited resources [9]. Rather than allocating them uniformly across all combinations of the alternatives being examined, the allocation can be made in a manner that maximizes the amount of information obtained. For example, in assessing the impact of different global and regional climate models on climate projections, we do not need to examine all combinations of these models.
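One simple way to avoid running every combination is a balanced subset in which each global and each regional model is still used equally often. The sketch below is only an illustration of that idea, not the design recommended in the cited reference; the model names are placeholders, the helper `balanced_subset` is hypothetical, and it assumes equally many global and regional models.

```python
def balanced_subset(gcms, rcms, runs_per_gcm):
    """Pick GCM/RCM pairs cyclically so that every model is used equally often."""
    if len(gcms) != len(rcms):
        raise ValueError("this sketch assumes equally many GCMs and RCMs")
    if not 1 <= runs_per_gcm <= len(rcms):
        raise ValueError("runs_per_gcm must be between 1 and the number of RCMs")
    n = len(rcms)
    # GCM number i is paired with RCMs i, i+1, ... (wrapping around the list).
    return [(g, rcms[(i + k) % n])
            for i, g in enumerate(gcms)
            for k in range(runs_per_gcm)]

gcms = ["GCM-A", "GCM-B", "GCM-C", "GCM-D"]  # hypothetical model names
rcms = ["RCM-1", "RCM-2", "RCM-3", "RCM-4"]
design = balanced_subset(gcms, rcms, runs_per_gcm=2)

print(len(design), "runs instead of", len(gcms) * len(rcms))
for pair in design:
    print(pair)
```

Each model appears in exactly two runs, so main effects of the global and regional models can still be compared on an equal footing while the computing budget is halved.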

The recent IPCC Special Report Managing the Risks of Extreme Events and Disasters to Advance Climate Change Adaptation [10], on which the IPCC AR5 relies, does not take full advantage of the well-developed statistical theory of extreme values [11]. Instead, many results are presented in terms of spatially and temporally aggregated indices and/or consider only the frequency, not the intensity of extremes. Such summaries are not particularly helpful to decision-makers. Unlike the more familiar statistical theory for averages, extreme value theory does not revolve around the bell-shaped curve of the normal distribution. Rather, approximate distributions for extremes can depart far from the normal, including tails that decay as a power law. For variables such as precipitation and stream flow that possess power-law tails, conventional statistical methods would underestimate the return levels (that is, high quantiles) used in engineering design and, even more so, their uncertainties. Recent extensions of statistical methods for extremes make provision for non-stationarities such as climate change. In particular, they now provide non-stationary probabilistic models for extreme weather events, such as floods, as called for by Milly and colleagues [12].
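To make the notion of a return level concrete, here is a small sketch using the generalized extreme value (GEV) distribution, with location `mu`, scale `sigma` and shape `xi` (positive `xi` gives a power-law tail). The parameter values are invented for illustration and are not taken from any dataset or from the cited reports; the function name is hypothetical.

```python
import math

def gev_return_level(mu, sigma, xi, return_period):
    """Level exceeded on average once every `return_period` blocks (e.g. years)."""
    y = -math.log(1.0 - 1.0 / return_period)
    if abs(xi) < 1e-12:
        # Gumbel limit: a light, exponential-type upper tail.
        return mu - sigma * math.log(y)
    # General case: xi > 0 corresponds to a heavy, power-law upper tail.
    return mu + sigma / xi * (y ** (-xi) - 1.0)

# Hypothetical annual-maximum daily precipitation, in mm.
mu, sigma = 50.0, 10.0
for period in (10, 100, 1000):
    light = gev_return_level(mu, sigma, 0.0, period)
    heavy = gev_return_level(mu, sigma, 0.2, period)
    print(f"{period:>5}-year level: light tail {light:6.1f} mm, heavy tail {heavy:6.1f} mm")
```

With the same location and scale, the heavy-tailed case gives higher return levels, and the gap widens as the return period grows. That is the sense in which a method that assumes a light tail would underestimate the design values.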

We mention just one concrete example for which reliance on extreme value methods could help to resolve an issue with important policy implications. It has recently been claimed that, along with an increase in the mean, the probability distribution of temperature is becoming more skewed towards higher values [13, 14]. But techniques based on extreme value theory do not detect this apparent increase in skewness [15]. Any increase is evidently an artefact of an inappropriate method of calculation of skewness when the mean is changing [16].
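A toy simulation shows how such an artefact can arise. It is not the calculation used in any of the cited papers: the series is synthetic, the mean is assumed flat for 70 years and then rising, and each year's values are drawn from a symmetric Gaussian, so any skewness found in the pooled data comes from the changing mean alone.

```python
import random
import statistics

def skewness(values):
    """Sample skewness: mean cubed deviation divided by the cubed standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean(((v - mean) / sd) ** 3 for v in values)

def yearly_mean(year):
    """Hypothetical mean: flat for 70 years, then rising by 0.05 per year."""
    return 0.0 if year < 70 else 0.05 * (year - 70)

random.seed(1)
pooled, detrended = [], []
for year in range(100):
    # Each year's values are symmetric around that year's mean.
    sample = [random.gauss(yearly_mean(year), 0.5) for _ in range(200)]
    centre = statistics.fmean(sample)
    pooled.extend(sample)                         # changing mean left in
    detrended.extend(v - centre for v in sample)  # changing mean removed

print("skewness with the changing mean left in: ", round(skewness(pooled), 3))
print("skewness after removing each year's mean:", round(skewness(detrended), 3))
```

The pooled data look clearly skewed towards high values, while the same data with each year's mean removed show essentially none.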

Besides uncertainty quantification, the communication of uncertainty to policymakers is an important concern. Several fields conduct research on this topic, in addition to statistics (including decision analysis and risk analysis). Statisticians have made innovative contributions to the graphic display of probabilities to make communication more effective [17], methods that have not yet found much, if any, use in climate change assessments (see Spiegelhalter et al. [17] for an example of how uncertainties about future climate change are now presented).

It should be acknowledged that statisticians alone cannot solve these challenging problems in uncertainty quantification. Rather, increased collaboration between statistical and climate scientists is needed. Examples of current activities whose primary purpose is to stimulate such collaborations include: CliMathNet [18], the Geophysical Statistics Project at the National Center for Atmospheric Research [19], International Meetings on Statistical Climatology [20], the Nordic Network on Statistical Approaches to Regional Climate Models for Adaptation [21] and the Research Network for Statistical Methods for Atmospheric and Oceanic Sciences [22].

To bring uncertainty quantification into the twenty-first century, we offer a number of suggestions (Box 1). Elaborated on here, these range from how climate change research is conducted to the process by which climate change assessments are produced.

Whenever feasible, qualitative uncertainty assessments should be replaced with quantitative ones. At a minimum, a standard error should be attached to any estimate, along with a description of how it was calculated. Ideally, an entire probability distribution should be provided — innovative graphical techniques can be used to communicate these uncertainties in a more effective manner. In addition, modern statistical methods for spatio-temporal data, such as hierarchical models, should be used to reduce uncertainties in trend estimates for climate observations and projections of climate change. By taking into account spatial and temporal dependence, these techniques provide a powerful tool for detection of observed and projected changes in climate. To increase the accuracy with which the climate is monitored, various sources of information need to be combined using hierarchical statistical models. Such techniques can take into account differences in the uncertainties of, for instance, in situ and remotely sensed measurements.
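As a minimal illustration of attaching a standard error to an estimate and stating how it was calculated, the sketch below fits an ordinary least-squares trend. It assumes independent errors, which is a simplification: climate series are typically autocorrelated, and ignoring that makes the standard error too small. The data values and the function name are invented for the example.

```python
import math
import statistics

def trend_with_standard_error(times, values):
    """Ordinary least-squares slope and its standard error (independent errors assumed)."""
    n = len(times)
    t_mean = statistics.fmean(times)
    v_mean = statistics.fmean(values)
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    slope = sxy / sxx
    intercept = v_mean - slope * t_mean
    # Residual sum of squares around the fitted line.
    rss = sum((v - intercept - slope * t) ** 2 for t, v in zip(times, values))
    # Residual variance uses n - 2 because two parameters were fitted.
    std_err = math.sqrt(rss / (n - 2) / sxx)
    return slope, std_err

# Hypothetical yearly anomalies (degrees C) for ten consecutive years.
years = list(range(10))
anomalies = [0.02, 0.05, 0.01, 0.09, 0.07, 0.12, 0.10, 0.16, 0.13, 0.18]
slope, std_err = trend_with_standard_error(years, anomalies)
print(f"trend = {slope:.4f} +/- {std_err:.4f} degrees C per year (1 standard error, OLS)")
```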

Statistical principles of experimental design should be applied to climate change experiments using numerical models of the climate system. Through making more efficient use of computational resources, uncertainties in climate change projections can be reduced. Methods based on the statistical theory of extreme values should be used to quantify changes in the likelihood of extreme weather events, whether based on climate observations or on projections from climate models. In this way, information more useful to decision-makers about the risk of extreme events (for example, in terms of changing return levels) can be provided. Finally, to improve the quality of the treatment of uncertainty, at least one author with expertise in uncertainty analysis should be included on all chapters of IPCC and US national assessments. These authors could come from the field of statistics, as well as from other related fields including decision analysis and risk analysis.

If these recommendations are adopted, the improvements in uncertainty quantification would thereby help policymakers to better understand the risks of climate change and adopt policies that prepare the world for the future.

A suggestion first: it is highly instructive to plot the actual temperature by adding the “anomaly” to the baseline temperature from which it is measured. That takes care of the “45 degree slope” problem once and for all. At typical high-resolution plot scales, the entire 150-year warming is around one pixel high, so the temperature graph is basically a straight line with barely visible, weakly trended noise. This eliminates a fraud straight out of the book “How to Lie with Statistics”: plotting the anomaly, or difference, instead of the actual quantity in question.

Although it is difficult to see, it is worth illustrating the uncertainty that NASA itself acknowledges in the baseline temperature to which the supposedly more accurate anomaly is added. (We could spend a few happy hours discussing whether one generally knows the mean of a distribution, here the baseline, or its variance, here the anomaly, more accurately; I would have said that, almost without exception, the former is known more accurately than the latter, not less.) This uncertainty is roughly 1 K today: the baseline is anywhere from 13 to 15 C, depending on the model used to estimate the actual mean. This makes the horizontal error bar on the baseline itself (on the absolute scale) roughly twice the size of the entire range of variation of the anomaly.

So do we know the variation of male height from the mean height of all males more accurately than we know the actual mean height of all males (from any given sample of measured heights of males randomly selected out of a crowd)? I don’t think so.

The question — that I have not had time to look at in the “leaked” report — is what they do with their statistical claims, the ones that were so horribly botched in AR4. In particular, do they persist in using language such as “likely”, “extremely likely”, and “possible” to describe e.g. probable warming in the future (or worse, give actual probability percentages for future expected warming given various warming scenarios)? If so, are those claims based on e.g. averages over sets of predictive model results from different general circulation models (GCMs)? Do the statistical ranges come from the assumption that these models somehow form independent and identically distributed samples drawn from some sort of distribution of GCMs that might enable them to use the Central Limit Theorem to make sound assertions of how probable it is that some sort of mean behavior is within some given range of the true, observed, or even expected behavior for any quantity those models might produce including global average surface temperature?

Because if they do, even implicitly, this goes beyond mere lying with statistics in the sense of presenting accurately computed results in a misleading way; it is a horrible error and an abuse of statistics itself to promote any such conclusions.

The correct use of statistics in the general arena in which the GCMs live is to apply it to the models, one at a time, to determine whether we can reject the null hypothesis “This model is correct, given the observed climate since the model prediction was made” and, if so, at what confidence level the rejection can be made. GCMs that fail to make a reasonable cut (for example, GCMs for which it is less than 5% likely that temperature trajectories as extreme as the actual, observed temperature series occur, given the usual Monte Carlo spread in small perturbations of the model’s initial conditions and parameter set) should be summarily rejected and their results omitted from the report altogether. And by “as extreme”, I don’t mean trajectories that rarely dip down to overlap with the observed trajectory; I mean trajectories that vary over much the same range and with the same average trend.
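A stripped-down sketch of that screening step might look like the following. It reduces each perturbed run to a single trend value, which is a simplification of the "same range and same average trend" criterion, and all of the numbers and model names are invented. The helper names `monte_carlo_p_value` and `screen_models` are hypothetical.

```python
import random
import statistics

def monte_carlo_p_value(ensemble_trends, observed_trend):
    """Share of perturbed runs at least as far from the ensemble mean as the observation."""
    centre = statistics.fmean(ensemble_trends)
    gap = abs(observed_trend - centre)
    as_extreme = sum(1 for t in ensemble_trends if abs(t - centre) >= gap)
    # The +1 terms keep the estimate away from an impossible p-value of exactly zero.
    return (as_extreme + 1) / (len(ensemble_trends) + 1)

def screen_models(ensembles, observed_trend, alpha=0.05):
    """Map each model name to True (retain) or False (reject at level alpha)."""
    return {name: monte_carlo_p_value(trends, observed_trend) >= alpha
            for name, trends in ensembles.items()}

random.seed(2)
observed = 0.05  # hypothetical observed trend, degrees C per decade
ensembles = {
    # 200 perturbed-initial-condition runs per (hypothetical) model.
    "model_hot": [random.gauss(0.30, 0.05) for _ in range(200)],
    "model_close": [random.gauss(0.08, 0.05) for _ in range(200)],
}
kept = screen_models(ensembles, observed)
print(kept)
```

Each model is judged only against its own perturbed-run spread, one at a time, rather than against the spread across different models.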

This criterion would, I am confident, reject well over 90% of the GCMs that are currently being used as the basis for false statistical claims of probable future warming. Even after such a step, the mean of the remaining models could not be taken as being predictive of future climate behavior, but there would at least be a chance that one or more of the remaining models could be approximately correct, perhaps not in any sense we could ascribe a particular probability to at this time, but to the point where at least common sense tells us not to reject its predictions out of hand.

IMO, there is no possible way that one can make an assertion such as “over half of the warming observed post 1950 can be ascribed to human activity, e.g. increased CO_2” based on the data itself. Dick Lindzen’s graph above already suffices to show, even over the tiny time interval plotted, that late 20th century warming is not “unprecedented” or impossible to explain with natural variation; it almost exactly matches the non-anthropogenic warming of the first half of the 20th century. Only by comparing models that predict much less warming without CO_2 and much more warming with CO_2 can one make any such claim at all, let alone claim that it is somehow “likely”.

But when the claim is made on the basis of an ensemble of similar results, all of them obtained from models that fail the hypothesis test above, how then can we give it even a shred of predictive confidence? Am I certain that most of the planetary warming has been manmade because failed numerical models tell me that it is, at the same time they are telling me that the current climate anomaly should be 0.5C or thereabouts instead of the measured value of 0.15C or thereabouts?

The mean of a bunch of incorrect models is not a correct model. The standard deviation of the predictions of a bunch of incorrect models is not a predictor of how close the real climate is expected to get to the mean — quite the opposite! As it shrinks (widening the gap between model and reality in terms of the statistical variation) all it does is make us more certain that the model is wrong!
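This can be written out under a deliberately simple assumption, which is mine and not anything stated in the reports: suppose each of N models equals the true value T plus a bias b shared by all of them, plus independent zero-mean scatter with variance sigma squared.

```latex
m_i = T + b + \varepsilon_i, \qquad
\bar{m} = \frac{1}{N}\sum_{i=1}^{N} m_i, \qquad
\mathbb{E}[\bar{m}] = T + b, \qquad
\operatorname{Var}(\bar{m}) = \frac{\sigma^{2}}{N}

z = \frac{\bar{m} - T}{\sigma/\sqrt{N}} \approx \frac{b\,\sqrt{N}}{\sigma}
```

Averaging removes the scatter but not the shared bias, so the ensemble mean settles on T + b rather than T. As sigma shrinks or N grows, the standardized gap z between the ensemble mean and reality gets larger, not smaller.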

I very much fear that AR5 will attempt, one final time, to perpetuate this sort of completely incorrect use of statistics, and in doing so will set actual climate science back another five to ten years. There is nothing that refines the scientific mind like having one’s pet model rejected from a major scientific review for cause, because it just doesn’t work! Suddenly one is motivated to go back to the drawing board or else — the typical or else being loss of funding, failure to publish in the future, failure to get tenure, students who aren’t interested in working for you — or else you commit professional suicide, in other words.

The way things stand now, there seem to be no restrictions on what gets included in the construction of the final “average” predicted trend for the climate. Don’t want to hurt anybody’s feelings, after all. If your model predicted a whole degree C of warming over the last 15 years, who cares? It’s still in the hat, just as likely or unlikely as anybody else’s to be right, still weighted the same as somebody who might have actually written a model that WORKS and got most of “the pause” correct.