The Uncertainty Principle for climate (and chemical) models

A recent issue of Nature had an interesting article on what seems to be a wholly paradoxical feature of models used in climate science; as the models are becoming increasingly realistic, they are also becoming less accurate and predictive because of growing uncertainties. I can only imagine this to be an excruciatingly painful fact for climate modelers who seem to be facing the equivalent of the Heisenberg uncertainty principle for their field. It's an especially worrisome time to deal with such issues since the modelers need to include their predictions in the next IPCC report on climate change which is due to be published next year.

A closer look at the models reveals that this behavior is not as paradoxical as it sounds, although it's still not clear how you would get around it. The article especially struck a chord with me because I see similar problems bedeviling models used in chemical and biological research. In the case of climate change, the fact is that earlier models were crude and did not account for many fine-grained factors that are now being included (such as the rate at which ice falls through clouds). In principle and even in practice there's a bewildering number of such factors (partly exemplified by the picture on top). Fortuitously, the crudeness of the models also prevented the uncertainties associated with these factors from being included in the modeling. The uncertainty remained hidden. Now that more real-world factors are being included, the uncertainties endemic in these factors reveal themselves and get tacked on to the models. You thus face an ironic tradeoff: as your models strive to mirror the real world better, they also become more uncertain. It's like swimming in quicksand; the harder you try to get out of it, the deeper you get sucked in.

This dilemma is not unheard of in the world of computational chemistry and biology. A lot of the models we currently use for predicting protein-drug interactions for instance are remarkably simple and yet accurate enough to be useful. Several reasons account for this unexpected accuracy; among them cancellation of errors (the Fermi principle), similarities of training sets to test sets and sometimes just plain luck. Error analysis is unfortunately not a priority in most of these studies, since the whole point is to publish correct results. Unless this culture changes our road to accurate prediction will be painfully slow.

But here's an example of how "more can be worse". For the last few weeks I have been using a very simple model to try to predict the diffusion of druglike molecules through cell membranes. This is an important problem in drug development since even your most stellar test-tube candidate will be worthless until it makes its way into cells. Cell membranes are hydrophobic while the water surrounding them is hydrophilic. The ease with which a potential drug transfers from the surrounding water into the membrane depends, among other factors, on its solvation energy, that is, on how readily the drug can shed water molecules; the smaller the solvation energy, the easier it is for drugs to get across. This simple model, which calculates the solvation energy, seems to do unusually well in predicting the diffusion of drugs across real cell membranes, a process that's much more complex than just solvation-desolvation.

One of the fundamental assumptions in the model is that the molecule exists in just one conformation in both water and the membrane. This assumption is fundamentally false since in reality, molecules are highly flexible creatures that interconvert between several conformations both in water and inside the membrane. To overcome this assumption, a recent paper explicitly calculated the conformations of the molecule in water and included this factor in the diffusion predictions. This was certainly more realistic. To their surprise, the authors found that making the calculation more realistic made the predictions worse. While the exact mix of factors responsible for this failure can be complicated to tease apart, what's likely happening is that the more realistic factors also bring more noise and uncertainty with them. This uncertainty piles up, errors which were likely canceling before no longer cancel, and the whole prediction becomes fuzzier and less useful.

I believe that this is what is partly happening in climate models. Including more real-life factors in the models does not mean that all those factors are well-understood. You are inevitably introducing some known unknowns. Ill-understood factors will introduce more uncertainty. Well-understood factors will introduce less uncertainty. Ultimately the accuracy of the models will depend on the interplay between these two kinds of factors, and currently it seems that the rate of inclusion of new factors is higher than the rate at which those factors can be accurately calculated.
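To make the tradeoff concrete, here is a toy sketch in Python. It is not taken from the Nature article or from the membrane paper; the two-factor system, the coefficient values and the function name `rmse_of_models` are all made up for illustration. A "crude" model ignores a small second factor entirely, while a "detailed" model includes it but only knows its coefficient with some uncertainty.

```python
import math
import random


def rmse_of_models(true_coeff, coeff_uncertainty, n_trials=20000, seed=0):
    """Return (rmse_crude, rmse_detailed) for a hypothetical two-factor system.

    The "truth" is 2*x1 + true_coeff*x2. All numbers here are illustrative
    assumptions, not values from any real climate or chemistry model.
    """
    rng = random.Random(seed)
    sq_err_crude = 0.0
    sq_err_detailed = 0.0
    for _ in range(n_trials):
        x1 = rng.gauss(0.0, 1.0)  # well-understood factor
        x2 = rng.gauss(0.0, 1.0)  # newly included, poorly understood factor
        truth = 2.0 * x1 + true_coeff * x2

        # Crude model: leaves the second factor out altogether.
        crude = 2.0 * x1

        # Detailed model: includes the second factor, but its coefficient is
        # only known up to an error (redrawn each trial, as if every trial
        # were an independent attempt at estimating it).
        estimated_coeff = true_coeff + rng.gauss(0.0, coeff_uncertainty)
        detailed = 2.0 * x1 + estimated_coeff * x2

        sq_err_crude += (truth - crude) ** 2
        sq_err_detailed += (truth - detailed) ** 2

    return (math.sqrt(sq_err_crude / n_trials),
            math.sqrt(sq_err_detailed / n_trials))


if __name__ == "__main__":
    for sigma in (0.1, 1.0):
        crude, detailed = rmse_of_models(true_coeff=0.5, coeff_uncertainty=sigma)
        print(f"uncertainty={sigma}: crude RMSE={crude:.3f}, detailed RMSE={detailed:.3f}")
```

In this toy setup the detailed model beats the crude one only when the uncertainty in the new coefficient is smaller than the coefficient itself. Once the uncertainty exceeds the size of the effect being added, leaving the factor out gives the better prediction, which is the situation described above.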

The article goes on to note that in spite of this growing uncertainty the basic predictions of climate models are broadly consistent. However it also acknowledges the difficulty in explaining the growing uncertainty to a public which has become more skeptical of climate change since 2007 (when the last IPCC report was published). As a chemical modeler I can sympathize with the climate modelers.

But the lesson to take away from this dilemma is that crude models sometimes work better than more realistic ones. Perhaps the climate modelers should remember George Box's quote that "all models are wrong, but some are useful". It is a worthy endeavor to try to make models more realistic, but it is even more important to make them useful.

4 comments:

It's interesting you bring this up (again), especially - from my POV - after a mention at Nanoscale Views of a recent paper that demonstrated the validity of classical elasticity to a thin film only a few atoms thick. Clearly, for the "right" question, one may be able to appeal to very simple notions for an explanation of certain phenomena.

I think the point that you make about the effect of well-understood versus ill-understood factors is a good one. I suppose there's something to be said about the nature of these ill-understood factors - if they're extremely sensitive to initial conditions or are only valid over a small interval, you might get wild oscillations that can't be compensated for with cancellation of errors.

That's a good point; you often don't know the sensitivity of these factors to initial conditions and small perturbations. The other point I thought about is whether the new factors are correlated or uncorrelated with the old ones. While the degree of correlation will certainly impact the estimates, it's also likely that there are certain windows of applicability outside which the factors change from being correlated to uncorrelated or vice versa. That should definitely affect the results but I don't know if this kind of stuff can even be quantified. Will take a look at the paper you mentioned.

Thanks - a good account of how adding detail to a model can make it worse. The meteorologists at the UK Met Office often face this problem when they want to make model improvements. The models contain many parameterizations of processes that are too complex, or too small in scale (sub-grid), to resolve fully. But as computational power increases, and grid sizes are reduced, it becomes possible to replace a parameterization scheme with a fully resolved process. However, doing this frequently reduces the forecast accuracy of the model, so the meteorologists wait until they can bundle the changes with others that improve forecast accuracy.

However, I must take issue with the patronizing comment in the final paragraph. Not only do climate modelers frequently repeat that Box quote among themselves, one has even used it as the name of her blog: http://allmodelsarewrong.com/ In my experience, any sentence that starts "climate modelers should remember..." almost certainly means the writer hasn't met any climate modelers. The key point is that none of this is news to climate modelers - they've been dealing with these complexities for decades.

About Me

“Ashutosh (Ash) Jogalekar is a scientist and science writer based in the San Francisco Bay Area. He has been blogging at the “Curious Wavefunction” blog for more than ten years, and in this capacity has written for several organizations including Nature, Scientific American and the Lindau Meeting of Nobel Laureates. His professional areas of interest include medicinal and computational chemistry. His literary interests specifically lie in the history and philosophy of science.”