Summary: Public policy about climate change has become politicized and gridlocked after 26 years of large-scale advocacy. We cannot even prepare for a repeat of past extreme weather. We can whine and bicker about who to blame. Or we can find ways to restart the debate. Here is the next of a series about the latter path, for anyone interested in walking it. Climate scientists can take an easy and potentially powerful step to build public confidence: re-run the climate models from the first 3 IPCC reports with actual data (from their future). How well did they predict global temperatures?

“Confirmations should count only if they are the result of risky predictions; that is to say, if, unenlightened by the theory in question, we should have expected an event which was incompatible with the theory — an event which would have refuted the theory.”
— Karl Popper in Conjectures and Refutations: The Growth of Scientific Knowledge (1963).

The most important graph from the IPCC’s AR5

Figure 1.4 from AR5: Estimated changes in the observed globally and annually averaged surface temperature anomaly relative to 1961–1990 (in °C) since 1950 compared with the range of projections from the previous IPCC assessments.

Let’s discuss what scientists can do to restart the debate. Let’s start with the big step: show that climate models have successfully predicted future global temperatures with reasonable accuracy.

This spaghetti graph, probably the most-cited data from the IPCC’s reports, illustrates one reason for lack of sufficient public support in America. It shows the forecasts of models run in previous IPCC reports vs. actual subsequent temperatures, with the forecasts run under various scenarios of emissions and their baselines updated. First, Edward Tufte (The Visual Display of Quantitative Information) probably would laugh at this: too much packed into one graph, the equivalent of a PowerPoint slide with 15 bullet points.

But there’s a more important weakness. We want to know how well the models work. That is, how well each forecast performs if run with a correct scenario (i.e., actual future emissions, since we’re uninterested here in predicting emissions, just temperatures). Let’s prune away all those extra lines on the spaghetti graph, leaving forecasts from 1990 to now that match the actual course of emissions.

The big step: prove climate models have made successful predictions

A massive body of research describes how to validate climate models (see below), most stating that they must use “hindcasts” (predicting the past) because we do not know the temperature of future decades. Few sensible people trust hindcasts, with their ability to be (even inadvertently) tuned to work (that’s why scientists use double-blind testing for drugs where possible).

But now we know the future — the future of models run in past IPCC reports — and can test their predictive ability.

Karl Popper believed that predictions were the gold standard for testing scientific theories. The public also believes this. Countless films and TV shows focus on the moment in which scientists test their theory to see if the result matches their prediction. Climate scientists can run such tests today for global surface temperatures. This could be evidence on a scale greater than anything else they’ve done.

A hurricane in the Weather Research & Forecasting (WRF) Model. From NCAR/UCAR.

Testing the climate models used by the IPCC

“Probably {scientists’} most deeply held values concern predictions: they should be accurate; quantitative predictions are preferable to qualitative ones; whatever the margin of permissible error, it should be consistently satisfied in a given field; and so on.”
— Thomas Kuhn in The Structure of Scientific Revolutions (1962).

The IPCC’s scientists run projections. AR5 describes these as “the simulated response of the climate system to a scenario of future emission or concentration of greenhouse gases and aerosols … distinguished from climate predictions by their dependence on the emission/concentration/radiative forcing scenario used…”. The models don’t predict CO2 emissions, which are an input to the models.

So they should run the models as they were originally run for the IPCC in the First Assessment Report (FAR, 1990), in the Second (SAR, 1995), and the Third (TAR, 2001) — for details see chapter 9 of AR5: Evolution of Climate Models. Run them using actual emissions as inputs and with no changes of the algorithms, baselines, etc. This is a hindcast using data from the “future” (after the model was created), a form of out-of-sample data. It would cost a pittance compared to the annual cost of climate science — and the stakes for the world. How accurately will the models’ output match the actual global average surface temperatures?
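As a minimal sketch of how such a comparison might be scored, the Python below computes two simple measures for a re-run model against observations: the root-mean-square error and the gap between the two linear trends. The function names and the short number lists are hypothetical placeholders for illustration, not real model output or real temperature data, and a real evaluation would use more sophisticated statistics.

```python
import math


def rmse(predicted, observed):
    """Root-mean-square error between two equal-length series."""
    if len(predicted) != len(observed):
        raise ValueError("series must have the same length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def linear_trend(values):
    """Ordinary least-squares slope per time step (e.g., deg C per year)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


# Placeholder annual anomalies in deg C. These are NOT real data.
model_run = [0.00, 0.03, 0.06, 0.09, 0.12, 0.15]
observed = [0.00, 0.02, 0.04, 0.06, 0.08, 0.10]

error = rmse(model_run, observed)
trend_gap = linear_trend(model_run) - linear_trend(observed)
print(f"RMSE: {error:.3f} deg C, trend gap: {trend_gap:.3f} deg C per year")
```

Reporting both numbers separates two questions that a single pass/fail verdict would blur: how far apart the curves are on average, and whether the model warms faster or slower than the observations.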

Of course, the results would not be a simple pass/fail. Such a test would provide the basis for more sophisticated tests. Judith Curry (Prof Atmospheric Science, GA Inst Tech) explains here:

“Comparing the model temperature anomalies with observed temperature anomalies, particularly over relatively short periods, is complicated by the acknowledgement that climate models do not simulate the timing of ENSO and other modes of natural internal variability; further the underlying trends might be different. Hence, it is difficult to make an objective choice for matching up the observations and model simulations. Different strategies have been tried… matching the models and observations in different ways can give different spins on the comparison.”

On the other hand, we now have respectably long histories since publication of the early IPCC reports: 25, 20, and 15 years. These are not short periods, even for climate change. Models that cannot successfully predict over such periods require more trust than many people have when it comes to spending trillions of dollars — or even making drastic revisions to our economic system (as urged by Naomi Klein and Pope Francis).

Conclusion

Re-run the models. Post the results. More recent models presumably will do better, but firm knowledge about performance of the older models will give us useful information for the public policy debate. No matter what the results.

As the Romans might have said when faced with a problem like climate change: “Fiat scientia, ruat caelum.” (Let science be done though the heavens may fall.)

(b) See the papers about model validation listed in (f) below. But this is especially clear about the situation: “Reconciling warming trends” by Gavin A. Schmidt et al, Nature Geoscience, March 2014 — Ungated copy here.

“CMIP5 model simulations were based on historical estimates of external influences on the climate only to 2000 or 2005, and used scenarios (Representative Concentration Pathways, or RCPs) thereafter. Any recent improvements in these estimates or updates to the present day were not taken into account in these simulations.

“{We} collated up-to-date information on volcanic aerosol concentrations, solar activity and well-mixed greenhouse gases in the 1990s and 2000s. These updates include both newly observed data and also reanalyses of earlier 1990s data on volcanic aerosols based on improved satellite retrievals {and compared} the updated information with the data used in the CMIP5 climate model simulations …”

Most simulations of the historical period do not reproduce the observed reduction in global mean surface warming trend over the last 10 to 15 years. There is medium confidence that the trend difference between models and observations during 1998–2012 is to a substantial degree caused by internal variability, with possible contributions from forcing error and some models overestimating the response to increasing greenhouse gas (GHG) forcing. Most, though not all, models overestimate the observed warming trend in the tropical troposphere over the last 30 years, and tend to underestimate the long-term lower stratospheric cooling trend.

(3) Also important is this evaluation of the forecast in the IPCC’s First Assessment Report: “Assessment of the first consensus prediction on climate change”, David J. Frame and Dáithí A. Stone, Nature Climate Change, April 2013. They evaluated the original projections (i.e., runs using simulations), which did not include the eruption of Mt. Pinatubo, the collapse of the Eastern Bloc economies, or the rapid growth of East Asia’s economies. Nor did they show the difference between the scenarios used and actual observations.

(4) “Recent Climate Observations Compared to Projections” by an all-star group of scientists (Stefan Rahmstorf, Anny Cazenave, John A. Church, James E. Hansen, Ralph F. Keeling, David E. Parker, Richard C. J. Somerville) in Science, 4 May 2007. Ungated copy here. This is often cited as proof of models’ forecasting skill. It makes no such claim. The paper is only one page long. It has one paragraph describing global surface temperature changes and one about sea levels. There is little description or analysis, and no statistical testing. Also note the claim quoted below, which evidence in the past few years reveals to be exaggerated at best. Models are tuned to match past data (details here) and make extensive use of parametrization.

“Although published in 2001, these model projections are essentially independent from the observed climate data since 1990: Climate models are physics-based models developed over many years that are not ‘tuned’ to reproduce the most recent temperatures …”

(5) “Test of a decadal climate forecast”, Myles R. Allen et al, Nature Geoscience, April 2013. Gated. A follow-up to “Quantifying the uncertainty in forecasts of anthropogenic climate change” (Allen et al, Nature, October 2000), evaluating one model’s forecasts using data through 1996 over the subsequent 16 years. They re-ran the model, but do not state if they used the original scenario or actual observations after 1996 to generate the prediction. The forecast was significantly below consensus, and so quite accurate. Odd that this examination of it provided so little information.

Other articles about validation of models. Most are just the usual hindcasts.

“Real-time multi-model decadal climate predictions” by Doug M. Smith et al., Climate Dynamics, December 2012 — Gated. Open copy here. Hindcasts and forecasts. “Verification of these forecasts will provide an important opportunity to test the performance of models and our understanding and knowledge of the drivers of climate change.” Yes.

“Recent observed and simulated warming”, John C. Fyfe and Nathan P. Gillett, Nature Climate Change, March 2014. Gated. “Fyfe et al. showed that global warming over the past 20 years is significantly less than that calculated from 117 simulations of the climate by 37 models participating in Phase 5 of the Coupled Model Intercomparison Project (CMIP5). This might be due to some combination of errors… It is in this light that we revisit the findings of Fyfe and colleagues.”

“Assessing temperature pattern projections made in 1989” by Ronald J. Stouffer and Syukuro Manabe in Nature Climate Change, March 2017. They compare the geographical pattern of warming in their 1989 model forecast vs observations. Limitations in their model cause “problems in comparing models to observations and makes the comparisons shown here qualitative in nature. It is one of the reasons why we focus our attention on the geographical distribution of surface temperature change rather than the magnitude of change in this study.”

How have the alarmists failed? How by any measure have skeptics succeeded? Religion, medicine, media, political leaders across the world have bought into the idea of a dangerous climate crisis that can only be dealt with by regulating CO2. Our education industry indoctrinates to the climate consensus. The climate fear industry receives unlimited government and NGO funds.

Skeptics are happy that they will only be called “doubters of science” instead of “deniers”. Prominent groups are calling for the criminal prosecution of skeptics and doing so with impunity.

I think the 26 year campaign to sell climate alarmism has been incredibly successful since it has been based on alarmist self serving claims about the science from the start. Not one prediction of doom has held up at all, yet the money and power continues to flow into alarmist causes unabated.
Please check your assumptions.

A campaign’s success is measured by the degree to which it achieves its objectives. After 26 years the US (and world) have made only trivial public policy changes for mitigation of and adaptation to climate change.

Climate change ranks at or near the bottom of surveys of Americans’ major public policy concerns. There are indications that this is true on a global scale as well. So there is no basis to expect large-scale measures to be taken soon.

Thank you for the response. It is very thoughtful. I agree that it is the measurement of success as to reaching goals. But what was the goal? I submit it was to gain power and money. The science has never seriously been disputed: CO2 ghg, some warming, etc. The skeptics have pointed out that the science has not supported the crazy apocalyptic claptrap, nor the pseudo-religious zeal. The alarmists don’t care: They are enlightened and good and they get the money and power.
The public largely does not care, which is great from the pov of the alarmists: if anyone seriously examines their claims or ideas fairly, the alarmist claims fall apart. Of course the climate problem ranks near the bottom of concerns. Nothing is happening except that political leaders have a new universal solvent for dissolving responsibility for the sorry state of infrastructure: CO2-caused “climate change”. Building codes unenforced and storm damage? “Climate change”. Seawalls left to deteriorate? Climate change. No upgrade to the water supply for 50 years and now a typical tough drought threatens people? Climate change. Etc.

You are making this far more complex than necessary, and so overlooking the obvious.

“I submit it was to gain power and money.”

When referring to goals, people generally mean “what do they say they want to accomplish”. A psychologist might prefer to dig down and reduce all goals to sex and power. A sociobiologist will reduce everything we do to improving our odds of reproductive success. But such guessing at people’s internal lives is IMO just imagination.

“The public largely does not care, which is great from the pov of the alarmists-”

It will not be the first time I have drifted into complexity.
As we saw in Canada and Australia, Spain and the UK, Germany and Japan, when the actual “climate” policies are implemented with great fanfare, they are walked back when reality rears its ugly head.

“So they should run the models as they were when originally run for the IPCC in the First Assessment Report (FAR, 1990), in the Second (SAR, 1995), and the Third (TAR, 2001). Run them using actual emissions as inputs and with no changes of the algorithms, baselines, etc. How accurately will the models’ output match the actual global average surface temperatures?”

But that has been done and the models have failed. In the 90’s it became clear that the FAR generation models were predicting larger temperature increases than were being observed. The modellers decided that the problem was with direct and indirect aerosol effects, tweaked the models to account for that, got decent agreement with observations, and declared victory. Since then the model forecasts and observations have once again diverged. There has been much speculation among modellers as to why. Divergence of emissions from predictions is not the cause; until recently we were being told that CO2 emissions were near the top of the projected range.

“25, 20, and 15 years. These are not short periods, even for climate change.”

Those are short periods. One problem is noise in the data; even 25 years is barely long enough to find a trend above the short term noise. A bigger problem is, as you suggest, the internal variations driven by the oceans (ENSO, PDO, AMO). For those to even out, one needs time scales of at least 60 years or so. There may be longer internal oscillations, up to a millennium or two, but they may be too slow to matter.

An alternative could be to analyze the observational data to estimate climate sensitivity, then tune the models to match the resulting sensitivity. The first part has been done by a number of groups by various methods and gives a sensitivity near the bottom of the IPCC range (actually lower than any of the models). If the modellers were to accept such a low sensitivity, it would be game over for the idea of an imminent crisis.

Please give a citation. I did an extensive literature search, plus confirmed this with two eminent climate scientists. The citation list at the bottom is pretty comprehensive.

“Those are short periods.”

Please provide a supporting citation.

In the climate literature periods shorter than a decade are certainly considered “short” since they don’t pick up “decadal cycles”.

Durations of 15-25 years are commonly used — for example, when examining the pause/hiatus in global atmosphere warming. Santer et al. (2011) concluded that at least 17 years are required to identify anthropogenic effects.

The World Meteorological Organization (WMO) recommends comparing the current temperatures vs. a 30 year base period when computing anomalies (an essential component of climate analysis) — so a 25 year period is almost there (WMO website).
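To make the anomaly calculation concrete, here is a minimal Python sketch: each year's temperature is expressed as a difference from the mean over a 30-year base period. The function name and the synthetic series are hypothetical (a steady 0.01 °C per year rise, not real observations). The 1961–1990 base period matches the one named in the caption of Figure 1.4.

```python
def anomalies(temps_by_year, base_start, base_end):
    """Return {year: temperature minus the base-period mean}.

    The base period runs from base_start to base_end, inclusive.
    """
    base = [t for y, t in temps_by_year.items() if base_start <= y <= base_end]
    if not base:
        raise ValueError("no data falls inside the base period")
    baseline = sum(base) / len(base)
    return {y: t - baseline for y, t in temps_by_year.items()}


# Synthetic placeholder series, NOT real data:
# 14.0 deg C in 1961, rising 0.01 deg C per year through 2015.
temps = {year: 14.0 + 0.01 * (year - 1961) for year in range(1961, 2016)}

anoms = anomalies(temps, 1961, 1990)
print(f"1961: {anoms[1961]:+.3f}  2015: {anoms[2015]:+.3f}")
```

Changing the base period shifts every anomaly by the same constant, which is why a comparison of models and observations has to state which baseline both series use.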

The American Meteorological Society defines Climate Change as “Any systematic change in the long-term statistics of climate elements (such as temperature, pressure or winds) sustained over several decades or longer” (American Met Society Statistics).

For the situation in the 90’s, I am relying on my memory of technical seminars I attended at that time. Perhaps there is some discussion in SAR or TAR. The more recent divergence can be found all over the internet, or just look at your “The most important graph from the IPCC’s AR5”. This may not be exactly what you suggest doing, but of the multiple theories proposed for the “pause”, I don’t think any involve errors in emissions.

Who cares what the retrospective analysis was of models in SAR or TAR, 15 or 20 years ago? The relevant question for us is how well the models in SAR and TAR look today.

It’s nice that you think the “models have failed”. Or that skeptics think the models have failed. Who cares? More important is the peer-reviewed literature and IPCC, which provide weak support for your belief.

The relevant point for this post: the overall public policy debate has polarized and frozen. We need additional measures that can change opinions. Skeptics yelling “models failed” and activists yelling “models work” are just chaff in the debate, incapable of moving us forward.

No apology needed, as that’s a very useful question. This website is intended for a general American audience, and material is written for clarity. Definitions are those of a standard dictionary.

Where specialized vocabulary is necessary, I try to specify the source. Military terms are from the DoD dictionary (JP 1-02). Climate science terms are from the glossary of the most recent Assessment Report of the IPCC — in this case, the glossary of AR5. So climate is:

Climate in a narrow sense is usually defined as the average weather, or more rigorously, as the statistical description in terms of the mean and variability of relevant quantities over a period of time ranging from months to thousands or millions of years. The classical period for averaging these variables is 30 years, as defined by the World Meteorological Organization. The relevant quantities are most often surface variables such as temperature, precipitation and wind. Climate in a wider sense is the state, including a statistical description, of the climate system.

In comments people often play Humpty Dumpty, wanting to use their own definitions of words.

Thank you. I have asked this question at multiple sites and am generally ignored.
I also use the dictionary definition. By that definition nearly nothing (if anything at all) has changed in climate over the past ~150 years that is outside the range of weather experienced in that period. Most weather trends worldwide are flat. Temperature increases have been trivial if not flat. “Climate change” seems a much slipperier definition; a marketing term to replace the failed “global warming”. And “global warming” was only retired because of the embarrassing lack of “warming” around the globe.

(1) Lots of attempts to do this. The relevant public policy question is: how do we know if these adjusted results are more or less reliable than the original forecasts?

(2) Today the “report card” will come from website posts by climate scientists, (perhaps) followed by articles in the peer-reviewed literature, (perhaps) followed by analysis in the next IPCC report. So there will be many “report cards”. IMO it’s more operationally useful to think of the results of this test as another bit of information. At most it could provide a tipping point — the grain of sand that makes a difference. More likely it could be another brick in the public policy debate.

Most likely of all IMO — there will be no such test. It’s an obvious idea. If it would validate the models it would have been done and publicized widely. Like the “hockey stick.” My guess is that it’s been done, models failed, and so we’ve heard nothing of it — and will not hear anything about this. Unless people speak up.

First, temperature is not the only variable to consider in assessing the utility of a climate model. Climate is manifested in a huge number of variables: temperatures at different locations, at different altitudes in the atmosphere, temperatures at different depths of the ocean; precipitation at different locations, energy of weather, cloud cover at different altitudes; the location and strength of the jet stream; the global circulation; and on and on. It is possible to tweak a model to optimize temperature output while trashing some other output. This would be dishonest.

Assessing the utility of a model is an extremely complex business requiring years of education and experience with climate models. I hold an MS degree in physics and have followed these issues closely, but I do not consider myself competent to provide a reliable judgement of the utility of the CMIP5 results. How then can anybody other than a specialist make such a judgement?

Here we come to what I consider to be a central issue: the science behind AGW is beyond the comprehension of anybody other than the specialists. Sure, many of us can grasp the fundamentals, but the great majority of the discussions I have seen on the web are scientifically ignorant.

I’ll use this website as an example. It is obvious that the articles on this site are thoroughly researched and carefully thought through, yet it is obvious to me that their author has little scientific training. Here we have an admirable effort to grapple with an immensely complex subject by an obviously intelligent and open-minded citizen seeking the public good. Yet these articles all fall short of the goal of useful analysis of the public policy problem.

Here we come to the problem so brilliantly explicated in the last episode of James Burke’s wonderful television series “Connections”: How can citizens cope with public policy issues that involve scientific matters far beyond their education to understand? Burke offers no satisfying answer to the question.

My own conclusion is that we must trust the experts, so long as we have established that the system under which they operate guarantees honesty on their part. The scientific community has a pretty good system for ensuring the reliability of their results when those results have garnered the support of the great majority of the scientific community.

Sadly, the public is unwilling to trust the experts. There have been too many abuses of their trust by too many institutions, and now our polity is poisoned by a deficit of trust in all things. Until that trust is restored, it is impossible for us to differentiate truth from falsehood, and thereby to cope with the political problems we face.

I see nothing in our society that points towards a restoration of that trust. Our society is disintegrating.

“First, temperature is not the only variable to consider in assessing the utility of a climate model.”

Another reason, in addition to the two I mentioned — the findings of the IPCC with regards to temperature are given with more confidence than those for other climatological factors (precipitation is second, with the others far behind).

“Our society is disintegrating.”

Total bs. American society was disintegrating in 1862, clearly. During the 1930s in terms of many social, economic, and political factors. In the late 1960s, with armed troops occupying our major inner cities each summer — and massive anti-war riots on campuses, and widespread violation of social norms (e.g., sex, drugs). In the 1980s, with severe economic stress in 1980-83 followed by the crime waves (esp crack starting in ~1984), plus many signs of social decay (e.g., rising rates of divorce and teen births). To compare today with those periods is quite daft.

You have a masters in Physics but aren’t qualified to see that the CMIP5 ensemble is outside of observations? It is clear evidence the models don’t have much utility for predicting climate trends. Nobody really understands climate but it is not that hard to understand empirical evidence and who has it. Why would the public trust so called experts that ignore empirical facts?

Huh? It doesn’t take scientific training to figure out what is happening. The opinions of most scientific fields are irrelevant and only atmospheric science really applies. Engineers and statisticians who don’t have a career interest in the outcome should be able to crunch the available data and generate better numbers than the scientists are willing to.

The recent study that indicated 22 PPM produces 0.2 W/m2 of downward IR should be easy to check against the models. Either a 22 PPM change produces 0.2 W/m2 more forcing or the model is wrong.

Chris, I have to disagree with your claim the subject is too complex to be understood by the general population, who lack the education to critique climate models or the results of climate research. It isn’t necessary to understand thermodynamics to use a thermometer. Your argument seems frankly elitist. I have, for example, a background in applied statistics as it relates to the design of industrial experiments. I’m not a climatologist nor do I have formal training in physics, but I can tell when a model is working and when it isn’t; there are simple statistical tests for these things. No doubt I have more training in the subject than many, but I also have enough training to know the subject isn’t especially difficult and can be explained to most folks with a decent high school education in a few paragraphs.

1) The difference would be this time a consortium of researchers would agree on a common set of inputs and an agreement to publish the results. Maybe the worst outliers would be eliminated.
2) Weren’t all of the models high recently because of the Pause? We need a convincing explanation for it and possibly when it will expire. If this happens IMO things will fall into place.

I doubt very much that will happen anytime soon. The beauty of my proposal is that it could be done by an existing group (e.g., Goddard Inst at NASA, UK Met Ofc, or large university climate dept).

(2) “weren’t all of the models high recently”

That’s disputed by some. Others dispute the significance of the difference. In any case, and far more important, we don’t have a clear comparison. The spaghetti graphs for 1995-2015 include “projections” based on a wide range of emission assumptions. We’re only interested in those projections using the actual emissions. That will eliminate most of the lines, leaving a clearer picture.
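The pruning step can be sketched in a few lines of Python: rank each scenario by how far its assumed emissions were from the actual ones, and keep only the closest. Everything here is a hypothetical illustration. The scenario names, the dictionary layout, and the numbers are invented placeholders, not IPCC scenarios or real emissions data. Note also that selecting the nearest existing scenario is only an approximation of the fuller test proposed in the post, which is to re-run the models with actual emissions.

```python
def closest_scenarios(projections, actual_emissions, keep=1):
    """Return the names of the `keep` scenarios whose assumed emissions
    were nearest to actual emissions (by mean absolute difference)."""

    def gap(name):
        assumed = projections[name]["emissions"]
        if len(assumed) != len(actual_emissions):
            raise ValueError(f"{name}: emissions series length mismatch")
        diffs = [abs(a - b) for a, b in zip(assumed, actual_emissions)]
        return sum(diffs) / len(diffs)

    return sorted(projections, key=gap)[:keep]


# Placeholder scenarios, NOT real data: assumed emissions and the
# temperature anomalies the model projected under each assumption.
projections = {
    "scenario_high": {"emissions": [6.0, 7.5, 9.0], "temps": [0.0, 0.3, 0.6]},
    "scenario_mid": {"emissions": [6.0, 6.8, 7.6], "temps": [0.0, 0.2, 0.4]},
    "scenario_low": {"emissions": [6.0, 6.1, 6.2], "temps": [0.0, 0.1, 0.2]},
}
actual = [6.0, 6.9, 7.8]  # placeholder "actual" emissions

kept = closest_scenarios(projections, actual)
print(kept)
```

Only the temperature lines of the kept scenarios would then be plotted against observations.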

While it’s true that temperature is the first index that people look at, it is not the most politically significant variable — that would be sea level rise, which is most likely to be the source of the greatest costs. That in turn depends on ocean temperatures, which change in ways rather different from how air temperatures change.
But I think that a more significant factor here is the role of the natural time constant for climate change, which is about 30 years, possibly longer. That means that the only fair test of any climate model is a graph that is smoothed with a 30-year smoothing window. All the wiggling around for periods shorter than 30 years is meaningless with regard to climate change as a policy problem. The infamous “hiatus” in warming — actually a reduction in the rate of warming — just barely shows up in a 30-year smoothed graph of average global surface temperatures.
You dismiss my pessimistic assessment of the future of the American polity by referring to the successful handling of previous crises. But consider the fact that, in each of these cases, the American polity was able to decide upon a policy that fixed the problem. The secession of the South was met with a military effort that re-united the country. The social fractures created by the Depression were met with dramatic new social policies that held American society together. The riots over race and Vietnam were met with changes in race policy and the withdrawal from Vietnam. So, what are we doing to address climate change? Nothing. Do you see any prospect of us making any significant policy changes in the next ten years? Twenty years?
My point is that we are no longer capable of agreeing on policy actions in response to the challenge of climate change.
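For readers unfamiliar with the smoothing described above, here is a minimal Python sketch of a 30-year moving average. The function name and the synthetic series are hypothetical: a steady rise of 0.01 °C per year with an alternating ±0.1 °C wiggle added, not real temperature data.

```python
def moving_average(values, window=30):
    """Mean of every full run of `window` consecutive values.

    Returns len(values) - window + 1 averages, or an empty list
    when the series is shorter than the window.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if len(values) < window:
        return []
    total = sum(values[:window])
    averages = [total / window]
    for i in range(window, len(values)):
        # Slide the window: add the newest value, drop the oldest.
        total += values[i] - values[i - window]
        averages.append(total / window)
    return averages


# Synthetic placeholder series, NOT real data: 60 "years" of a
# 0.01 deg C per year trend plus a year-to-year wiggle of 0.1 deg C.
series = [0.01 * i + (0.1 if i % 2 == 0 else -0.1) for i in range(60)]

smoothed = moving_average(series, window=30)
print(len(smoothed), round(smoothed[0], 3), round(smoothed[-1], 3))
```

The wiggle cancels inside every 30-value window, so the smoothed series shows only the underlying trend. The cost is that 60 years of data yield just 31 smoothed points, which is the trade-off behind the dispute in this thread over whether 15 to 25 years is long enough.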

An afterthought: in each of the three crises you mentioned, the question at issue was easily understood by every citizen. This is not the case with climate change. The number of people who truly understand the science of climate change is a vanishingly small portion of the population.

You can’t be serious. None of those crises were understood by most people at the time. In the case of the latter two (1960s and 1980s), many people doubted that there was a crisis, let alone agreed about its nature and magnitude.

As for climate change, there are few surveys — perhaps none – describing climate scientists’ perception of the magnitude and timing of future climate change. The findings of the IPCC are similarly uncertain. Probably because there is not strong consensus. The headline statement of the IPCC’s AR4 and AR5 reports about past climate change from greenhouse gases (more than half since 1950 attributed to human emissions) is only at the 90% confidence level — not at the usually required 95% level.

Well, yes, I am completely serious. The citizenry WAS in fact fully cognizant of the basic issues in each case. In the Civil War, the motivating issue was slavery and the integrity of the Union. Even the simplest farm boy could understand those issues.
In the case of the Depression, the key issue was not the proper economic policy — that was barely understood at the time. The key political issue was the confidence of the citizenry in the capitalist system. Communism was growing in appeal. The fundamental issue, then, was simple: capitalism versus socialism. Again, this was easily understood by the citizenry.
In the case of Vietnam, the key issue was also simple: pacifism versus militarism.
In none of these cases were the underlying facts in question. There really were slaves in the South in 1861, and nobody contested that. There really was a major failure of capitalism in 1929, and again, nobody contested that. There really was a war going on in Vietnam, and people really were dying, and again, nobody contested that fact.
Thus, the underlying facts were not in doubt, and the underlying principles easily understood.
Such is not the case with AGW. The underlying science is simply not within the educational grasp of any other than specialists. We see scientifically absurd statements made by deniers, yet the general public is unable to recognize just how dumb those comments are. Hence the public confusion.
I do not understand your statements regarding uncertainty among scientists. There is almost complete consensus that human emissions of CO2 are enhancing the greenhouse effect, which is definitely causing a general increase in global temperatures. There is intense disagreement about many of the details, but in terms of policymaking, there is no question among scientists that governments should take strong measures to reduce CO2 emissions.

I disagree. The usual spaghetti graphs give us insufficient basis at this time to declare the models “wrong”. It’s possible that forecasts run with actual emissions would yield lines in the bottom part of the graph, relatively close to actual temps. Also, I don’t believe models are usefully evaluated in such binary fashion.

“not the basis for policy making”

With available evidence, I agree 100%. Massive changes in tax and regulatory policies, let alone to our economic system, require stronger evidence. After adequate testing the answer might be different.

“clearly wrong”
“I disagree. The usual spaghetti graphs give us insufficient basis at this time to declare the models ‘wrong’.”

Let me qualify my comment. Spaghetti graphs are useful because they show the range of model results. The models in the upper range are clearly wrong. There is no way observations can catch up to them without extended periods of unprecedented warming rates.

The model mean is being used as a justification for mitigation when about half the models will never represent reality. They should be modified by empirical validation but climate scientists are arguing against that. As a result models that have no empirical basis are being used for policy decisions.

There was a model in the paper I reference that was found to have an error of 77 W/m². I don’t know how much more clearly wrong you can get.

That’s not a correct conclusion. Each line represents an assumption for greenhouse gas emissions (etc.) as an input, and a model’s forecast based on that input. By looking at the graph we cannot distinguish the lines whose inputs differed from actual emissions from the lines that had accurate emissions inputs AND an inaccurate forecast by the model.

Somewhere in the spaghetti there might be a line representing a model run with a correct input of emissions — that line is a forecast that can be used to evaluate that model.

Chris,
What climate crisis? Where is the crisis?
And if climate science was real science, then other well informed people would be able to test its models, review its results and statistics and come to similar conclusions.
That is the opposite of the case with climate science.
Climate science culture is secretive, insular, circular, arrogant, demanding and evasive.

More interesting still is the sensitivity of model output to the thousands of modeling choices and assumptions. A very big job to check and publish, but quite important. What we see is in some sense the “best” way to run the code. If the output is quite sensitive to choices that are not well constrained by data, then the model output is very uncertain.

I agree. Hence the value of running the models as they were at some specific point in time (for the first 3 IPCC assessment reports) using model data. This avoids choosing those modeling choices to determine the answer.

If I may, I’d like to express confusion as to why you think it appropriate to test obsolete climate models. Why bother? We don’t use them nowadays. I’m sure that historians might have some interest in that, but for policymaking purposes, we need to rely on the best data and the best climate models.

(1) Please provide a citation showing that they are obsolete. Note: I’ve asked for support for several of your bold statements. You’ve provided none. I suspect that you are just making stuff up.

(2) Current models are refinements of past models. If the past ones work, we can assume that the present ones work better. That’s valuable information. Was it really necessary for me to point this out?

Do you really need a citation to believe that models that are no longer in use are obsolete? Nobody uses CMIP3 any more. The standard is CMIP5, for which there are many, many variations. I do not understand why you consider to be significant models that nobody uses any more.
I’ll be happy to provide citations for statements I make that are not patently true. Do you want a citation to show that the war in Vietnam was actually fought? Do you want a citation for the fact that the US went through a Depression in the 1930s? Or that the US fought a civil war over slavery? If you will identify a statement I have made that you consider to be false, I will happily provide appropriate support for it.
I must say, you seem rather antagonistic; I was hoping for a dispassionate discussion. If I have offended you, please identify how I have offended you and I will apologize.

Chris, the US didn’t fight the civil war over slavery, the war was fought over port tariffs. You think the US civil war was fought over slavery because that’s what you were told in school. Independent research (something that isn’t performed in State funded K-12 institutions) would reveal the underlying economic reasons the South could not be allowed to secede. See Clyde N. Wilson (in his section of “Slavery, Secession, and Southern History”).

So there you go. One “fact” blown completely out the window. I could also provide references to refute your analysis that the Great Depression was a “failure of Capitalism”. It was not, in fact it was a failure of socialism directly caused by the Federal Reserve Bank, which is the antithesis of a Capitalist organization, as are all “central bank economies”. The advent of the Federal Reserve in 1913 was the end of free market capitalism in the US and shortly thereafter the world.

That’s one more for you. Would you care to discuss the Politics of Heroin in South East Asia? We could make it three strikes in a row?

Now, smoothing curves is tricky business. It introduces a number of errors into the result, the most serious being “anticipation” of future events. For example, when a time series includes a step in the data, the smoothed graph anticipates the step before it happens. So you don’t smooth data for which a step is plausible. The ideal application of smoothing is in a time series with lots of noise of amplitude much greater than any plausible change. That situation definitely applies to the temperature data.
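The “anticipation” effect is easy to see in a toy case. The following is a minimal sketch, assuming a centered moving average (the comment does not say which smoothing method is meant) and a purely synthetic series with a step in it; the function name is hypothetical.

```python
def centered_moving_average(values, window):
    """Centered moving average; None where the window does not fit."""
    if window % 2 == 0:
        raise ValueError("window must be odd so it can be centered")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        if i < half or i >= len(values) - half:
            smoothed.append(None)  # not enough neighbours on one side
        else:
            chunk = values[i - half:i + half + 1]
            smoothed.append(sum(chunk) / window)
    return smoothed


# Synthetic series (not real data): flat at 0.0, then a step up to 1.0 at index 10.
series = [0.0] * 10 + [1.0] * 10
smoothed = centered_moving_average(series, window=5)

# First index at which the smoothed curve has started to rise.
first_rise = next(i for i, v in enumerate(smoothed) if v is not None and v > 0)
print(first_rise)  # 8, two points before the step actually happens
```

With a window of 5 the smoothed curve starts rising at index 8 even though the raw series does not move until index 10. A wider window would anticipate the step by even more.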

I’ll also mention the role of the time constant here. The oceans comprise a huge thermal reservoir whose heat capacity vastly exceeds that of the atmosphere. The heat capacity of the oceans relative to the kind of changes we see from radiative forcing yields a time constant for change of at least 30 years. To learn about heat capacity, you should start with a high school physics textbook. Time constants are handled at the college level; the most common usage is in electrical engineering in the analysis of RC circuits, which are physically analogous to heat reservoirs. If you’d like to learn more, consult any standard college textbook on the subject.
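For readers who have not met time constants before, here is a rough sketch of the RC-circuit analogy. It assumes a single well-mixed reservoir responding to a sudden, sustained change in forcing, which is a big simplification of the real ocean. The 30-year figure is the one quoted above; the function name is hypothetical.

```python
import math


def lagged_response(equilibrium_change, time_constant_years, years):
    """Fraction of a step change reached after `years` in a one-box (RC-like) system."""
    return equilibrium_change * (1.0 - math.exp(-years / time_constant_years))


TAU_YEARS = 30.0  # assumption taken from the comment: "at least 30 years"

for years in (10, 30, 60, 90):
    fraction = lagged_response(1.0, TAU_YEARS, years)
    print(f"after {years:>2} years: {fraction:.0%} of the eventual change")
```

After one time constant such a system has covered about 63% of the distance to its new equilibrium, so in this simplified picture a good part of the response to any change in forcing shows up decades later.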

Next, you question my claim that ‘our society is disintegrating’. That was a personal opinion, and I hoped that you would recognize it as such. However, if you wish to insist that I provide a citation to support it, I think it only fair that you reciprocate with a citation to support your claim that my comment was “total bs”. Can you present any scholarly papers on bs?

Lastly, you accuse me of making stuff up. I remind you that your own comments contain a great many statements that are not backed up with citations from the literature. I do not begrudge you such statements when they are either 1) obviously personal opinions or 2) within the pale of reason. But you seem to be applying much stricter standards to my comments than you do to your own. For example, you wrote “The findings of the IPCC are similarly uncertain. Probably because there is not strong consensus.” The IPCC is very careful to specify the degree of confidence it has in each of its conclusions. They provide a great many conclusions, with many different degrees of certainty, and lumping all that into a single statement that they are “uncertain” seems to me to be a gross oversimplification.

Moreover, you err in your later statement regarding the certainty of IPCC statements: “is only at the 90% confidence level — not at the usually required 95% level.”

You have misunderstood the use of the 95% confidence level in statistics. That is the test that is applied to scientific hypotheses, not policy decisions. Yes, if you want to verify a scientific hypothesis, you need AT LEAST 2 standard deviations in your results, but even that does not comprise compelling evidence. 2 standard deviations puts your hypothesis into the realm where it can be taken seriously — that’s about all.
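The link between “2 standard deviations” and “95%” is just Gaussian arithmetic. The sketch below assumes normally distributed errors and a two-sided interval. Whether the IPCC’s stated likelihood levels map onto this kind of calculation is a separate question that the code does not address. The function name is hypothetical.

```python
import math


def two_sided_coverage(num_sigma):
    """Probability that a normal variable falls within +/- num_sigma of its mean."""
    return math.erf(num_sigma / math.sqrt(2.0))


print(round(two_sided_coverage(2.0), 4))    # 0.9545
print(round(two_sided_coverage(1.96), 4))   # 0.95
print(round(two_sided_coverage(1.645), 2))  # 0.9
```

So the conventional 95% threshold sits at about 1.96 standard deviations, and a 90% level at about 1.645.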
But policy decisions are made with far less certainty. How many cases can you cite of a significant policy decision that was based on factual evidence exceeding the 95% confidence level? I can think of a few: the US decision to declare war on Japan in the wake of Pearl Harbor — there was definitely better than 95% confidence that Japan had committed a serious act of war. But we have no such confidence in the facts underlying decisions on tax policy, military spending, defense stance, or diplomatic relationships. Is Russia a greater threat to US interests than Islamic terrorism? How about compared to the rise of China? These questions are impossible to answer with any degree of confidence. Yet we still make decisions about them that assume answers to those questions.

The difference between science and politics is that scientists can afford to defer judgement of a hypothesis until they have accumulated overwhelming evidence in its favor. In the world of politics, we have to make decisions based on the limited information we have. Inaction is just as much a policy as action. In the case of climate change, the available information is much more certain than the information we had to justify the invasion of Iraq — a decision costing a couple of trillion dollars, 5,000 American lives, and several hundred thousand Iraqi lives.

> But policy decisions are made with far less certainty. How many cases can you cite of a significant policy decision that was based on factual evidence exceeding the 95% confidence level? I can think of a few: the US decision to declare war on Japan in the wake of Pearl Harbor — there was definitely better than 95% confidence that Japan had committed a serious act of war. But we have no such confidence in the facts underlying decisions on tax policy, military spending, defense stance, or diplomatic relationships. Is Russia a greater threat to US interests than Islamic terrorism? How about compared to the rise of China? These questions are impossible to answer with any degree of confidence. Yet we still make decisions about them that assume answers to those questions.

You’re missing the point. You need to factor in how much these policy decisions will cost to figure out what level of confidence is necessary. While incredibly stupid, the price tag of our middle east ‘adventures’ is an order or two of magnitude smaller than reducing (in absolute terms) global greenhouse gas emissions.

> If I may, I’d like to express confusion as to why you think it appropriate to test obsolete climate models. Why bother?

I agree. High on my long-standing recommendations for resolving the climate change debate: a review of the major climate models by a multi-disciplinary team of outside experts — including statisticians and software engineers. This is an essential part of the drug approval process, wisdom from hard experience.

Once again I am required by the BBS software to reply to a reply; in this case it is the reply of Mr. Johnny Wowcakes, who points out that the degree of confidence required for any policy decision is concomitant with the costs it imposes. There are two flaws in this argument:

First, he states that ‘the price tag of our middle east ‘adventures’ is an order or two of magnitude smaller than reducing (in absolute terms) global greenhouse gas emissions.’ This is incorrect. As I understand it, the most common proposal from economists for dealing with CO2 emissions is a revenue-neutral carbon tax. In terms of direct effects, this would cost nothing, because the tax is revenue-neutral. Of course, there would be substantial indirect effects arising from the economic effects of increased prices of fossil fuels as society adjusts its consumption patterns. These costs can be reduced — but not eliminated — by slowly ramping up the carbon tax over the course of time. The most common figure bandied about for an effective carbon tax is in the range of $30 to $50 per ton. We could start the tax at just $5 per ton; at that level, the economic costs would be small.

Second, a proper policy analysis assesses not just costs but also benefits. The figure of merit here is the “cost-benefit ratio”: the relative magnitudes of the costs and the benefits. In the case of AGW, we know that the costs will be staggering. A paper I cited earlier (http://www.pnas.org/content/111/9/3292.full.pdf) puts the unadapted costs at 0.3–9.3% of global GDP, a truly stupendous sum. However, it also finds that these costs could be addressed with annual expenditures on dike-building of only $12–71 billion — but at the risk of catastrophic costs should a dike failure occur, as happened at New Orleans.

You write: “Sigh. Climate scientists need to think more like software engineers.”
So tell me, how much effort do you think that Microsoft has expended lately working on fixing bugs in MSDOS 3? Halo 2? Do you think that Apple has a team hard at work sprucing up OS 10.4? ;-)

“the most common proposal from economists”
Economists are a tiny subset of the actors in the public policy debate about climate, and I doubt they are the most important actors.

“Second, a proper policy analysis assesses not just costs but also benefits.”
First, duh. But Johnny’s point remains relevant: estimating costs is paramount when assessing policy, for many reasons. Benefits are often overestimated, sometimes even illusory. Costs are real, and usually underestimated (e.g., the long history of military equipment, domestic infrastructure, and social welfare programs).

“IPCC AR5 WG2 provides a non-quantitative assessment {of economic issues}”
That would be funny if you weren’t serious. So it’s just sad. Non-quantitative assessment of economics is almost a start.

> So tell me, how much effort do you think that Microsoft has expended lately working on fixing bugs in MSDOS 3? Halo 2? Do you think that Apple has a team hard at work sprucing up OS 10.4? ;-)

Sorry but your analogy is simply incorrect. Verifying past predictions is not ‘supporting old software,’ it’s a form of regression test. It would certainly tell you where your code was wrong.
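To make the regression-test idea concrete, here is a minimal sketch. The numbers are placeholders made up for illustration, not real model output or measurements, and the function names and the tolerance are hypothetical. A real test would need the archived model run and an agreed observational dataset.

```python
import math


def rmse(predicted, observed):
    """Root-mean-square error between two equal-length series."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("series must be non-empty and the same length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def hindcast_passes(predicted, observed, tolerance):
    """Regression-test style check: is the archived forecast within tolerance?"""
    return rmse(predicted, observed) <= tolerance


# PLACEHOLDER values only (temperature anomalies in degrees C), invented for the example.
archived_forecast = [0.10, 0.20, 0.30, 0.40]
later_observations = [0.10, 0.15, 0.35, 0.40]

print(round(rmse(archived_forecast, later_observations), 4))
print(hindcast_passes(archived_forecast, later_observations, tolerance=0.05))
```

The point is the workflow: freeze the old forecast, compare it against data that did not exist when it was made, and record a pass or fail against a tolerance chosen in advance.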

Speaking of which, here’s where some of the climate model code lives: http://simplex.giss.nasa.gov/snapshots/ Is this representative of other agencies’ climate models? I hope not, because there are a lot of problems here.

We don’t have (read only) access to source control. How was the model refined over time? Who changed it? Why did they change it? This is all important information. Although I am happy to see that they are using git for source control. Put it on github.

As for the code itself, it’s pretty bad. It’s unreadable. There are hardly any tests, which is especially unwise when you are writing procedural code like this. What the tests do seem to cover are some basic utility classes; however none of the actual model appears to be tested. It’s written in Fortran (60 years old, there are much better alternatives today e.g. Haskell). It’s like you guys are stuck in a time warp.

Your remarks match those of others who have looked at the code of the major climate models. Hence the need for a review of them by a multi-disciplinary team of experts, including statisticians and software engineers.

Look, I don’t make stuff up. Material that I consider to be common knowledge I don’t bother to support with citations. If you are unfamiliar with this information, I could take the time to educate you, but it would be a rather tedious process. If I have made a claim that you think is false, please identify it and I shall justify it. Vague, generalized statements such as “making stuff up” are useless.

“Economists are a tiny subset of the actors in the public policy debate about climate, and I doubt they are the most important actors.”
That’s true. But if you want to know about economics, the best people to consult are economists, and so what they have to say about economic policies is very important.

You argue that cost estimates are more important than benefit estimates, because cost estimates are usually underestimated and benefit estimates are usually overestimated. Here I will hit you with your own club, because I think you are incorrect. I challenge you to provide citations to support your statements about the estimation of costs and benefits. Please confine it to the environmental sphere.

You dismiss WG2 because it doesn’t provide quantitative analysis. First, have you bothered to read WG2? It characterizes the kinds of costs that we can anticipate arising due to AGW. Before you can carry out a proper economic analysis, you need to identify the sources of these costs. Are you already familiar with all those sources? Are you certain that you would not be surprised by some of the kinds of costs described in WG2?

If you want quantitative analysis, you should look at some of the sources provided in the Wikipedia article to which I linked. There are lots of these.

Lastly, I again note the antagonistic tone pervading your comments. I’ll ask you point-blank: do you want to discuss these issues or do you want to pursue some sort of personal confrontation? I am uninterested in the latter.

Seems odd to be tempted to the role of moderator on someone else’s web site, but the two of you have both wandered into the weeds in my opinion. I believe the article is about validating climate models. It has nothing to do with past wars and insurrections for example.

Maybe you should both try to get back on point. Chris, the author has gone to a great deal of trouble to provide citations. If you wish to argue the point, you should do the same.

The models are works in progress. The question is, with models, how do you validate them?

This is especially difficult for models predicting the future. Nobody expects a stock picking program to work tomorrow, things change. They may work, but in general, you can’t predict the future of complex systems from past behavior. Chaos, you know?

The fact of chaos means either the model includes all of the physical phenomena, meaning it includes chaos, and therefore every run of the model will produce different results if even a single digit is changed in the input data, or it doesn’t and your model averages over the chaos.

In neither case do you have any idea what the actual future will be.

Models of complex reality can not be a guide to the future for reasons built into the nature of nature, and that is what we are all disputing.

Lew, you’re right that climate is complex, but you seem to think that this complexity is equivalent to randomness. That is only true when you get down to short time frames. On the climatological scale — 30 years or longer — the noisy factors that are so complicated cancel each other out and you can look to the basic physics. There’s absolutely zero question that the basic physics predicts an increase in temperature concomitant with the increase in concentration of CO2. We *know* that our emissions of CO2 will cause temperatures to rise. The debate is over how much and how quickly.
But there’s a more important point regarding policymaking: we don’t abandon consideration of policy problems merely because they are really complicated. What should we do about the rise of China? That’s one of the most important geopolitical issues of the 21st century, and the complexities surrounding it greatly exceed those of AGW. There are no computer models predicting Chinese behavior this century — the thought of attempting one is absurd. Despite this, we still have to make decisions about China. So we gather as much information as we can, carefully weigh all the facts, make decisions, and enact policies. If we can do that with China, why can’t we do it with the simpler problem of AGW?
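The distinction being argued over here, between predicting an individual trajectory and predicting long-run statistics, can be illustrated with a toy chaotic system. The sketch below uses the logistic map, which is an assumption of this example and has nothing to do with any climate model. It cannot settle whether the real climate behaves the same way; it only shows what the two claims mean.

```python
def logistic_trajectory(x0, r=3.9, steps=50_000):
    """Iterate the logistic map x -> r * x * (1 - x), a standard toy chaotic system."""
    xs = []
    x = x0
    for _ in range(steps):
        x = r * x * (1.0 - x)
        xs.append(x)
    return xs


# Two runs whose starting points differ by one part in ten billion.
trajectory_a = logistic_trajectory(0.2)
trajectory_b = logistic_trajectory(0.2 + 1e-10)

# Individual trajectories: the tiny initial difference grows until the runs disagree.
largest_early_gap = max(
    abs(a - b) for a, b in zip(trajectory_a[:200], trajectory_b[:200])
)

# Long-run statistics: compare the averages over the full runs.
mean_a = sum(trajectory_a) / len(trajectory_a)
mean_b = sum(trajectory_b) / len(trajectory_b)

print(f"largest gap in first 200 steps: {largest_early_gap:.3f}")
print(f"long-run means: {mean_a:.3f} vs {mean_b:.3f}")
```

In this toy the two runs part ways within a couple of hundred steps, yet their long-run averages come out close. Whether interacting chaotic systems such as oceans and atmosphere behave like this on 30-year scales is exactly what the two commenters dispute.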

Chaos damping out in any time whatsoever is certainly not what the mathematics of interacting chaotic systems say to me. Ocean currents, feedback of all kinds. OK, big momentums in everything, but that is where chaos does its work, tiny redirections interacting with things like moving continents.

It is only simple models that allow prediction of the similar futures, ones that do not incorporate all of the chemistry and physics of the total system. Maps of Maps, far from the terrain.

Waving your hands or showing me the equations you think cover the situation is asking me to believe the model, which I can’t because I know it definitely does not match the reality. Just saying it averages out is not a persuasive argument, it just cannot be true from the nature of the physics I think I know.

Some smart grad student is going to start counting the emergent phenomena produced by each stage of growth of communications bit-rate and technology.

Such confidence you have that all is known and the math covers it. What precedents do you cite for that?

Why focus on models when it’s obvious that climate models will never be able to deal with all the variables, let alone the chaotic components? I’ve not heard anyone address the many issues below:

Alarmists confiscated the term “climate change”, which originally referred to natural events, such as ice ages. Now it means “catastrophic anthropogenic global warming” (CAGW), and skeptics are referred to as “climate change” deniers.

President Obama recently visited Alaska, and explained that two receding glaciers he visited are due to “climate change”. Neither he nor the major news media bothered to mention that some other Alaskan glaciers, including Hubbard and Taku, have been advancing. Obviously both phenomena cannot be attributed to global warming. Also, consider the serious implication if no glaciers were receding; that is likely the beginning of our next ice age! The average duration of recent ice ages is 90,000 years whereas the pleasant interims between ice ages (interglacial periods, one of which we are now enjoying) average only 10,000 years. …

——-

{Editor’s note: At 2700 words, this comment is twice the length of the post, and 10 times the few hundred words allowed by the comment policy. Click here to read the rest of it.}

I started reading through the long version of your post elsewhere and I found it to be the usual pile of denier falsehoods. It is a rant, not an organized, logical argument. If I may appropriate somebody else’s dictum, “Show me your citations!” (ahem…)

Thank you for the great idea. Lord Monckton has posted some comparisons of the older models with subsequent temperature data but your idea is even better: plug in the post-publication CO2 emissions and rerun the old models. Do this for FAR, SAR and TAR to see what the strengths and weaknesses are. I suspect that the modelers have already done this but without publishing the results. Too often the alarmists have been caught ignoring (or hiding) pertinent data. Perhaps you have contacts in the climate science community who have the courage to do this. Judith Curry might be willing to do it. Anyway, a great post and a great idea. We shall see.

I share your suspicions. I have gotten strong pushback from climate scientists about doing this test. This obvious test. Which suggests that they have a good idea what the results would be.

As for finding climate scientists to endorse this: I have tried. Nobody is willing to stick their neck out. Which is smart of them. The risk is proportionate to the stink this might cause. Could be a career-ender.

Methodological safeguards exist because people have biases. Out-of-sample tests, double-blind tests, regulations requiring reporting of all results (not just confirming ones): these and hundreds of other such measures are needed because people become invested in their theories. It is true in science just as in the justice system.

These are not conspiracies or hiding things. That is over-dramatization. It is just human nature. As Phil Jones of the Hadley Center allegedly said (he hasn’t denied this, that I’ve seen):

“We have 25 or so years invested in the work. Why should I make the data available to you, when your aim is to try and find something wrong with it.”

In science these problems are just grit in the machinery, worked out over time. Vital public issues require sterner standards of time and validation — like those the FDA sets for drug testing (which have been tightened slowly for decades without people screaming “conspiracy theory”).

Note that standards for science research are being tightened even where there is no pressing public policy issue, driven by recognition of broad problems with replication. Such as recent reforms in psychology. Again, without folks yelling “conspiracy”.

The scientific community has a strong methodological safeguard in the competition among scientists. One of the best ways for a young scientist to get his/her career going is to find a flaw in another scientist’s work. Some of the scientific disputes we have seen border on the obsessive. A good example is the disagreement between Karl (2015) and the recent paper by a large group of scientists regarding the reality of the slowdown in the rate of increase in global surface atmospheric temperatures. It’s a tempest in a teapot, because atmospheric surface temperatures measure less than 5% of the net increase in planetary surface heat content. These people argue over everything!
Moreover, scientific achievement is measured by rocking the boat, not by concurring with everybody. Nobody ever got a Nobel Prize for agreeing with everybody else. If you’re an ambitious scientist, you want to contradict others.
That’s a major natural protection against group bias. Yes, there are always individual biases, but that’s why we look not to any single scientist, but rather to the consensus of large groups of scientists.
Regarding the Phil Jones quote, some context will remove the sinister impression the quote leaves. He had been harassed for years by Steve McIntyre for original documents used for the Had-CRU database. Mr. Jones had initially cooperated with Mr. McIntyre, but the effort required to retrieve the huge mass of original documents that Mr. McIntyre requested (basically, Mr. McIntyre wanted everything) was beginning to interfere with ongoing research. Moreover, Mr. McIntyre had taken an antagonistic approach in his interactions with Mr. Jones, filing scores of freedom of information requests directly or deceptively through associates. There was definitely a lot of bad blood between the two. Hence Mr. Jones’ statement. Here’s an old blog entry on the dispute as it stood seven years ago: http://blogs.nature.com/climatefeedback/2009/08/mcintyre_versus_jones_climate_1.html

The whole story is certainly a nasty one and does no credit to any of the participants, but taking one quote out of context in that dispute can certainly confuse readers.

You observe that “Vital public issues require sterner standards of time and validation.”
Please advise as to the stern standards and validation used in deciding policy towards Islamic terrorism, the Israeli-Palestinian conflict, Chinese assertiveness in East Asia, Russian policy towards Ukraine, or immigration issues. Do the great majority of analysts agree on what needs to be done on each of these issues, to the same extent that scientists agree that we must take strong action to reduce CO2 emissions? Do all the experts agree about what should be done with respect to the Israeli-Palestinian conflict? Is there a strong consensus about US policy towards China? Are most experts agreed about a policy for dealing with Russia? Have any of these expert communities amassed terabytes of data to support their conclusions?

“the people on skeptics’ websites sound just like you, confident and intransigent.”
Intransigent? I’ll assume that you shot that off without thinking.

I’m not sure what you mean by the policy campaign that ‘has been run as I suggest’. But it is obvious why so little progress has been made in the USA: a large group of conservatives antagonistic towards anything remotely associated with environmentalism has fought a vicious campaign of lies to mislead the American public. Inasmuch as it is obvious that those conservatives will never, ever relent in their ferocious opposition to scientific findings, the resolution will come from the discrediting that they will surely experience as the laws of physics manifest themselves. It’s a shame that conservatism will be so badly discredited in the process, but American conservatism has gone off the deep end and desperately needs a good kick in the pants to get it back on track.

One has but to read one sentence in this article to write a coruscating defenestration of all climate ‘models’: ‘THE MODELS DO NOT SIMULATE THE TIMING OF ENSO EVENTS.’

Can you credit it? The single most predictable climate event on earth, the occurrence of El Niño/La Niña events, has no role to play in these models.

Sure, you can’t predict them absolutely and maybe you can’t predict them quantitatively, since each event has unique characteristics.

But to say you have a credible climate model that has no inputs from ENSO events says that you wish to predict human behaviour by removing an X or Y chromosome from the modelling.

Don’t tell me you can’t model ENSO. You can’t model it perfectly maybe, but you can most certainly model scenarios for ENSO events and you can do that credibly, since the data exists to see what sorts of events have occurred in the past, what sort of frequency they occur at etc.

There is no climate model worth a damn which does not scenario plan the following well documented climate drivers:
1. Solar Hale Cycles.
2. ENSO events allied to PDO/AMO states.
3. Major volcanic eruptions/earthquakes.

If £100bn of spending has not got us that far, then whosoever designed the funding streams should be out of office, excoriated in the global press or worse.

Can you give a cite for that? I believe that’s quite false. First, ENSO events are irregular in occurrence, duration, and magnitude. Neither scientists nor models can reliably predict them even a few months ahead until after the “spring barrier” (i.e., by early summer).