This is an excerpt from the draft manuscript of my forthcoming book, Computing the Climate.

While models are used throughout the sciences, the word ‘model’ can mean something very different to scientists from different fields. This can cause great confusion. I often encounter scientists from outside of climate science who think climate models are statistical models of observed data, and that future projections from these models must be just extrapolations of past trends. And just to confuse things further, some of the models used in climate policy analysis are like this. But the physical climate models that underpin our knowledge of why climate change occurs are fundamentally different from statistical models.

A useful distinction made by philosophers of science is between models of phenomena and models of data. The former include models developed by physicists and engineers to capture cause-and-effect relationships. Such models are derived from theory and experimentation, and have explanatory power: the model captures the reasons why things happen. Models of data, on the other hand, describe patterns in observed data, such as correlations and trends over time, without reference to why they occur. Statistical models, for example, describe common patterns (distributions) in data, without saying anything about what caused them. This simplifies the job of describing and analyzing patterns: if you can find a statistical model that matches your data, you can reduce the data to a few parameters (sometimes just two: a mean and a standard deviation). For example, the heights of any large group of people tend to follow a normal distribution (the bell-shaped curve), but this model doesn’t explain why heights vary in that way, nor whether they always will in the future. New techniques from machine learning have extended the power of these kinds of models in recent years: “training” an algorithm allows more complex kinds of pattern to be discovered.
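The idea of reducing data to a couple of parameters can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the heights are synthetic, drawn from an assumed normal distribution (mean 170 cm, standard deviation 8 cm), and the names `heights`, `fitted_mean`, `fitted_sd` and `within_one_sd` are hypothetical.

```python
import random
import statistics

# Synthetic data: 10,000 "heights" in cm, drawn from an ASSUMED normal
# distribution (mean 170, standard deviation 8). Real data would be measured.
rng = random.Random(42)
heights = [rng.gauss(170.0, 8.0) for _ in range(10_000)]

# The statistical model: summarize all 10,000 values with two parameters.
fitted_mean = statistics.fmean(heights)
fitted_sd = statistics.stdev(heights)
print(f"mean = {fitted_mean:.1f} cm, standard deviation = {fitted_sd:.1f} cm")

# Check how well the two-parameter summary describes the pattern: for a
# normal distribution, roughly 68% of values lie within one standard
# deviation of the mean.
within_one_sd = sum(abs(h - fitted_mean) <= fitted_sd for h in heights) / len(heights)
print(f"fraction within one standard deviation: {within_one_sd:.2f}")
```

The two fitted numbers describe the pattern compactly, but nothing in them says why heights vary this way.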

Statistical techniques and machine learning algorithms are good at discovering patterns in data (e.g. “A and B always seem to change together”), but hopeless at explaining why those patterns occur. To get around this, many branches of science use statistical methods together with controlled experiments, so that if we find a pattern in the data after we’ve carefully manipulated the conditions, we can argue that the changes we introduced in the experiment caused that pattern. The ability to identify a causal relationship in a controlled experiment has nothing to do with the statistical model used; it comes from the logic of the experimental design. Only if the experiment is designed properly will statistical analysis of the results provide any insights into cause and effect.

Unfortunately, for some scientific questions, experimentation is hard, or even impossible. Climate change is a good example. Even though it’s possible to manipulate the climate (as indeed we are currently doing, by adding more greenhouse gases), we can’t set up a carefully controlled experiment, because we only have one planet to work with. Instead, we use numerical models, which simulate the causal factors—a kind of virtual experiment. An experiment conducted in a causal model won’t necessarily tell us what will happen in the real world, but it often gives a very useful clue. If we run the virtual experiment many times in our causal model, under slightly varied conditions, we can then turn back to a statistical model to help analyze the results. But without the causal model to set up the experiment, a statistical analysis won’t tell us much.
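As a rough sketch of what a repeated virtual experiment looks like, the Python below uses a toy zero-dimensional energy balance (absorbed sunlight equals emitted infrared) in place of a real climate model. Everything specific here is an assumption made for illustration: the effective emissivity values 0.61 and 0.60, the small random spread in albedo, and the function names `equilibrium_temp` and `run_ensemble`.

```python
import random
import statistics

SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant, W m^-2 K^-4
SOLAR = 1361.0          # approximate solar constant, W m^-2

def equilibrium_temp(albedo, emissivity):
    """Toy causal model: temperature (K) at which absorbed sunlight
    balances emitted infrared radiation."""
    absorbed = SOLAR * (1.0 - albedo) / 4.0
    return (absorbed / (emissivity * SIGMA)) ** 0.25

def run_ensemble(emissivity, n_runs=1000, seed=0):
    """Run the toy model many times with slightly varied albedo
    (the 0.30 +/- 0.005 spread is an ASSUMED, illustrative value)."""
    rng = random.Random(seed)
    return [equilibrium_temp(rng.gauss(0.30, 0.005), emissivity)
            for _ in range(n_runs)]

# Control vs. treatment: a lower effective emissivity stands in for a
# stronger greenhouse effect. Both values are illustrative assumptions.
control = run_ensemble(emissivity=0.61)
treated = run_ensemble(emissivity=0.60)

# Using the same seed pairs each treated run with a control run that has the
# same albedo, so the difference isolates the change we introduced.
diffs = [t - c for t, c in zip(treated, control)]

# Only now does the statistical model come in, to summarize the results.
mean_diff = statistics.fmean(diffs)
sd_diff = statistics.stdev(diffs)
print(f"mean warming = {mean_diff:.2f} K, spread = {sd_diff:.4f} K")
```

The causal model sets up the experiment (what is changed, what is held fixed); the mean and standard deviation at the end merely summarize what the runs produced.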

Both traditional statistical models and modern machine learning techniques are brittle, in the sense that they struggle when confronted with new situations not captured in the data from which the models were derived. An observed statistical trend projected into the future is only useful as a predictor if the future is like the past; it will be a very poor predictor if the conditions that cause the trend change. Climate change in particular is likely to make a mess of all of our statistical models, because the future will be very unlike the past. In contrast, a causal model based on the laws of physics will continue to give good predictions, as long as the laws of physics still hold.
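The brittleness of an extrapolated trend can be illustrated with a small Python sketch. It rests on loud assumptions: the driver `forcing`, the response factor `SENSITIVITY`, and the change in conditions at year 50 are all hypothetical, and the "observations" are generated from the same toy rule, so the causal model is right by construction. The sketch shows only how the fitted trend line fails once the underlying cause changes.

```python
import statistics

SENSITIVITY = 0.8  # ASSUMED response per unit of forcing (illustrative only)

def forcing(year):
    """Hypothetical driver: grows slowly until year 50, then four times faster."""
    return 0.02 * year if year <= 50 else 1.0 + 0.08 * (year - 50)

def causal_model(year):
    """Toy cause-and-effect rule: the response is proportional to the forcing."""
    return SENSITIVITY * forcing(year)

def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

# "Observations" for years 0..50, before the conditions change.
past_years = list(range(51))
past_obs = [causal_model(y) for y in past_years]

# Statistical model: a trend fitted to the past, then projected forward.
slope, intercept = fit_line(past_years, past_obs)
trend_at_100 = slope * 100 + intercept

# Causal model: follows the driver, including its change after year 50.
causal_at_100 = causal_model(100)

print(f"trend extrapolation at year 100: {trend_at_100:.2f}")
print(f"causal model at year 100:        {causal_at_100:.2f}")
```

The trend line fits the first 50 years perfectly and still misses badly at year 100, because nothing in it represents the cause that changed.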

Modern climate models contain elements of both types of model. The core elements of a climate model capture cause-and-effect relationships from basic physics, such as the thermodynamics and radiative properties of the atmosphere. But these elements are supplemented by statistical models of phenomena such as clouds, which are less well understood. To a large degree, our confidence in future predictions from climate models comes from the parts that are causal models based on physical laws, and the uncertainties in these predictions derive from the parts that are statistical summaries of less well-understood phenomena. Over the years, many of the improvements in climate models have come from removing a component that was based on a statistical model, and replacing it with a causal model. And our confidence in the causal components in these models comes from our knowledge of the laws of physics, and from running a very large number of virtual experiments in the model to check whether we’ve captured these laws correctly in the model, and whether they really do explain climate patterns that have been observed in the past.

I would really like to read a book that discusses climate models from the most basic questions of how they fit into the scientific method.

I was particularly struck by your use of the term “virtual experiment.” The widespread practice in the climate science world of describing model runs as “experiments” makes me uneasy. The fact that you don’t do that makes me think you understand why I feel that way.

It seems to me that what makes something an experiment is that it generates observed data. As you say, the increment in information does not come from any statistical method of analyzing data. It is the logically structured observation that produces an increment in empirical knowledge. Crucially, models of phenomena contain this type of information, gained from prior empirically validated observations, i.e. experiments.

So my question is, do climate model runs produce data? I think you will agree with me that the answer is clearly no, we don’t want to use the word data in this way. It doesn’t make sense to say that a virtual experiment produces virtual data. But if you allow yourself to use the word experiment for a model run, you might also blur the distinction between real data and your model run results. I was very intrigued but also concerned when I read Schmidt and Sherwood 2014, A Practical Philosophy of Complex Climate Modelling, Euro Jnl Phil Sci, wherein they propose to “explore specifically to what extent complex simulation in climate science is a new ‘pillar’ of inquiry as opposed to an expansion or cross fertilization of existing notions of theory and experiment (Winsberg 2003; Lloyd 2010) from the point of view of model developers and users of the simulations.” And Schmidt is one who freely describes model runs as experiments. I hope you address this in your book.

There is a way in which theories based on mathematical descriptions of observed data create knowledge at a level of abstraction above the information in the raw data. It is sort of like the way a graph of a parabola creates an increment of understanding in the human mind beyond the symbols y = x². So, I see that climate models do the same thing by letting us observe climate phenomena that emerge from causal models, which we otherwise don’t “see” by simply knowing the basic physics. But I sense a risk of overstating the amount of new knowledge climate models produce, because they formally cannot discover any new information that is not already inherent in the causal models that are based on observations of the physical world. You obviously know this, but it’s not so clear to me that climate scientists scrupulously make this distinction.

Which gets me to my last question. Among other things, climate models are being used for predicting future climate. People use the words skill and projection to acknowledge the uncertainties, but in practice, in general discussions and policy discussions, the models are taken as predicting ongoing global warming. You don’t have to go back very many years to the point where everyone would agree that the null hypothesis was that future climate is unpredictable. In normal empirical research, if someone claimed to have a method to predict future climate to an actionable level of accuracy and precision, then prospective, out-of-sample data would have to reject the null hypothesis that climate is unpredictable. About 2013 the meme of a pause in warming started to appear, because for several years the observed temperature data sets were tracking below the model aggregates. Climate scientists started to explain it. Mann and Steinman published a paper on heat being stored in the Pacific Ocean, for just one example. Among the myriad discussions, I never saw any experts explicitly acknowledge the core scientific-method problem of post hoc adjustments of models (or data sets, for that matter) to explain failures of the prospective data to track the models. It is as if the null hypothesis is now universally taken to have shifted to anthropogenic global warming, and all post hoc adjustments are now just refinements of what we already know to be true. I think an expert such as yourself needs to delineate precisely where the state of the art stands in answering the basic question: where is the data, where is the increment in formal information, that justifies rejecting the null hypothesis that climate is unpredictable? Are we there yet?

Clearly the science already supports a large reasonable middle ground where policy actions on greenhouse gas emissions make sense. But consider this quote from Michael Mann: “What is disconcerting to me and so many of my colleagues is that these tools that we’ve spent years developing increasingly are unnecessary because we can see climate change, the impacts of climate change, now, playing out in real time, on our television screens, in the 24-hour news cycle.” That seems so counterproductive to me. Permanent drought in California, torrential rain in California, Hurricane Sandy, tipping points, obvious climate catastrophe… Somebody, probably you, needs to keep this stuff anchored to the science, because it seems to me that by overstating the state of the science, we are at risk of losing our credibility.