I’ve been hearing about Wolfram Alpha a lot lately, and today I finally got a chance to watch a demo screencast. I have to say, it looks really cool. It’s a combination of a search engine, a set of computational widgets, and a large, curated knowledge base: exactly the kind of thing we need for playing with climate datasets, and for giving a larger audience a glimpse into how climate science is done. The only thing I can see missing (and maybe it’s there, and I just didn’t look hard enough) is the idea of a narrative thread – I want to be able to create a narrated trail through a set of computational widgets to tell a story about how we build up a particular scientific conclusion from a multitude of sources of evidence…

Last week, I ran a workshop for high school kids from across Toronto on “What can computer models tell us about climate change?”. I already posted some of the material I used: the history of our knowledge about climate change. Jorge, Jon and Val ran another workshop after mine, entitled “Climate change and the call to action: How you can make a difference”. They have already blogged their reflections: see Jon’s summary of the workshop plan, his reflections on how to do it better next time, and his thoughts on the need for better metaphors. I think both workshops could have done with being longer, to allow more discussion and reflection (we were scheduled only 75 minutes for each). But I enjoyed both workshops a lot, as I find it very useful for my own thinking to consider how to talk about climate change with kids, in this case mainly from grade 10 (≈15 years old).

The main idea I wanted to get across in my workshop was the role of computer models: what they are, and how we can use them to test out hypotheses about how the climate works. I really wanted to do some live experiments, but of course, this is a little hard when a typical climate simulation run takes weeks of processing time on a supercomputer. There are some tools that high school kids could play with in the classroom, but none of them are particularly easy to use, and of course, they all sacrifice resolution for ability to run on a desktop machine. Here are some that I’ve played with:

EdGCM – This is the most powerful of the bunch. It’s a full General Circulation Model (GCM), based on one of NASA’s models, and supports many different types of experiment. The license isn’t cheap (personally, I think it ought to be free and open source, but I guess they need a rich sponsor for that), but I’ve been playing with the free 30-day license. A full century of simulation tied up my laptop for 24 hours, but I kinda liked that, as it’s a bit like having to wait for results from a full-scale model (it even got hot, and I had to think about how to cool it, again just like a real supercomputer…). I do like the way the documentation guides you through the process of creating an experiment, and the idea of then ‘publishing’ the results of your experiment to a community website.

JCM – This is (as far as I can tell) a box model, that allows you to experiment with outcomes of various emissions scenarios, based on the IPCC projections, which means it’s simple enough to give interactive outputs. It’s free and open source, but a little cumbersome to use – the interface doesn’t offer enough guidance for novice users. It might work well in a workshop, with lots of structured guidance for how to use it, but I’m not convinced such a simplistic model offers much value over just showing some of the IPCC graphs and talking about them.

Climate Interactive (and the C-Roads model). C-ROADS is also a box model, but with the emissions of different countries/regions separated out, to allow exploration of the choices in international climate negotiations. I’ve played a little with C-ROADS, and found it frustrating because it ignores all the physics, and after all, my main goal in playing with climate models with kids is to explore how climate processes work, rather than the much narrower task of analyzing policy choices. It also seems to be hard to tell the difference between various different policy choices – even when I try to run it with extreme choices (cease all emissions next year vs. business as usual), the outputs are all of a similar shape (“it gets steadily warmer”). This may well be the correct output, but the overall message is a little unfortunate: whatever policy path we choose, the results look pretty similar. Showing the results of different policies as a set of graphs showing the warming response doesn’t seem very insightful; it would be better to explore different regional impacts, but for that we’re back to needing a full GCM.

CCCSN – the Canadian Climate Change Scenarios Network. This isn’t a model at all, but rather a front end to the IPCC climate simulation dataset. The web tool allows you to get the results from a number of experiments that were run for the IPCC assessments, selecting which model you want, which scenario you want, which output variables you want (temperature, precipitation, etc), and allows you to extract just a particular region, or the full global data. I think this is more useful than C-ROADS, because once you download the data, you can graph it in various ways, and explore how different regions are affected.
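Once you have a CCCSN download in hand, even a few lines of scripting get you a long way toward the kind of regional exploration I have in mind. Here’s a minimal sketch in Python; the inline sample stands in for a real CCCSN export (the column names and values are invented for illustration), and it just computes decadal means ready for graphing:

```python
import csv
import io

# Hypothetical CCCSN export: year, annual mean temperature anomaly (°C).
# In practice you'd open the downloaded CSV file; a small inline sample
# keeps this sketch self-contained.
sample = """year,anomaly
2000,0.42
2001,0.48
2010,0.63
2011,0.59
"""

rows = list(csv.DictReader(io.StringIO(sample)))

# Group the annual values by decade.
by_decade = {}
for row in rows:
    decade = int(row["year"]) // 10 * 10
    by_decade.setdefault(decade, []).append(float(row["anomaly"]))

# Decadal means smooth out year-to-year variability before plotting.
decadal_means = {d: sum(vals) / len(vals) for d, vals in by_decade.items()}
print(decadal_means)
```

From there it’s a short step to plotting the decadal means for different regions side by side, which is exactly the kind of comparison the raw IPCC graphs don’t let you do.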

Some online models collected by David Archer, which I haven’t played with much, but which include some box models, some 1-dimensional models, and the outputs of NCAR’s GCM (which I think is one of the datasets included in CCCSN). Not much explanation is provided here though – you have to know what you’re doing…

John Sterman’s Bathtub simulation. Again, a box model (actually, a stocks-and-flows dynamics model), but this one is intended more to educate people about basic systems dynamics principles than to explore policy choices. So I already like it better than C-ROADS, except that I think the user interface could do with a serious make-over, and there’s way too much explanatory text – there must be a way to do this with more hands-on work and less exposition. It also suffers from a problem similar to C-ROADS: it allows you to control emissions pathways, and explore the result on atmospheric concentrations and hence temperature. But the problem is, we can’t control emissions directly – we have to put in place a set of policies and deploy alternative energy technologies to indirectly affect emissions. So either we’d want to run the model backwards (to ask what emissions pathway we’d have to follow to keep below a specific temperature threshold), or we’d want as inputs the things we can affect: technology deployments, government investment, cap and trade policies, energy efficiency strategies, etc.
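The stocks-and-flows insight behind Sterman’s bathtub is easy to demonstrate in a few lines of code. This is a toy sketch, not Sterman’s actual model – the starting stock, inflow, and removal fraction are all invented for illustration – but it shows the key point: even with emissions held constant, the stock keeps rising until removals catch up with the inflow.

```python
def bathtub(stock, inflow_per_year, years, removal_fraction=0.005):
    """Toy stocks-and-flows model of an atmospheric carbon stock (GtC):
    a constant emissions inflow, and a removal outflow proportional to
    the current stock. All numbers are illustrative, not calibrated."""
    history = [stock]
    for _ in range(years):
        stock += inflow_per_year - removal_fraction * stock
        history.append(stock)
    return history

# Hold emissions constant: the stock still rises, heading toward the
# equilibrium where removals balance the inflow (here, 10 / 0.005 = 2000).
run = bathtub(stock=850.0, inflow_per_year=10.0, years=50)
print(f"start: {run[0]:.0f} GtC, after 50 years: {run[-1]:.0f} GtC")
```

Running this model “backwards” – searching for the inflow pathway that keeps the stock below some threshold – is exactly the inverse question I’d rather these tools posed.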

None of these support the full range of experiments I’d like to explore in a kids’ workshop, but I think EdGCM is an excellent start, and access to the IPCC datasets via the CCCSN site might be handy. But I don’t like the models that focus just on how different emissions pathways affect global temperature change, because I don’t think these offer any useful learning opportunities about the science and about how scientists work.

I’m giving a talk today to a group of high school students. Most of the talk focusses on climate models, and the kinds of experiments you can do with them. But I thought I’d start with a little bit of history, to demonstrate some key points in the development of our understanding of climate change. Here are some of the slides I put together (drawing heavily on Spencer Weart’s The Discovery of Global Warming for inspiration). Comments on these slides are welcome.

and, introducing the spaceship earth metaphor: Who’s driving this spaceship, and are the life support systems working properly?…

For millions of years, the planet had a natural control system that kept the climate relatively stable. We appear to have broken it. Now we’ve got to figure out how to control it ourselves, before we do irreversible damage. We’re not about to crash this spaceship, but we could damage its life support systems if we don’t figure out how to control it properly.

In the last year, there were three major attempts to assess the current state of the science of climate change, as an update to the 2007 IPCC reports (which are already looking a little dated). They have very similar names, so I thought it might be useful to disambiguate them:

The Copenhagen Synthesis Report was put together at the University of Copenhagen to summarize a conference on “Climate change: Global Risks, Challenges and Decisions” that was held in Copenhagen in March 2009. The report has some great summaries of the research presented at the conference, and puts it all together to identify six key messages:

Observations show that many key climate indicators are changing near the upper boundary of the IPCC range of projections;

We have a lot more evidence now on how vulnerable societies and ecosystems are to temperature rises;

Rapid mitigation strategies are needed because we now know that weaker targets for 2020 will make it much more likely we will cross tipping points and make it much harder to meet long term targets;

There are serious equity issues because the impacts of climate change will be felt by those least able to afford to protect themselves;

Action on climate change will have many useful benefits, including improvements in health, revitalization of ecosystems, and job growth in the sustainable energy sector;

Many societal barriers need to be overcome, including existing social and economic policies that subsidize fossil fuel production and consumption, weak institutions and lack of political leadership.

The Copenhagen Prognosis was released in December 2009, put together as a joint publication of the Stockholm Environment Institute and the Potsdam Institute for Climate Impact Research. It focuses on the evidence behind the key issues for an international climate treaty, especially the target of limiting warming to 2°C, and the political actions necessary to do this. The key messages of the report are:

The 2°C limit is a scientifically meaningful one, because of the evidence about the damage caused by rises above this level;

Even rises below 2°C will have devastating impacts on vulnerable communities and ecosystems (and for this reason, 80 nations have endorsed the idea of setting a global target to be “as far below 1.5°C as possible”);

Analysis of potential tipping points shows that currently discussed political targets will be unable to protect the world from devastating climate impacts and self-amplifying warming;

Global greenhouse gas emissions must decline very rapidly after 2015, and reach net zero emissions by mid-century, if we want a good (75%) chance of staying below 2°C of warming;

The challenge is great, but not impossible – such a reduction in greenhouse gases appears to be technically feasible, economically affordable, and possibly even profitable (but only if we start quickly);

The challenge will be especially hard for developing countries, who will need serious assistance from developed countries to make the necessary transitions;

Securing a safe climate for generations to come is now in the hands of just one generation, which means we need a new ethical paradigm for addressing this;

The challenge isn’t only about reducing emissions – it will require a shift to sustainable management of land, water and biodiversity throughout the world’s ecosystems;

To achieve the transformation, we’ll need all of the following: new policy instruments, new institutions for policy development and enforcement, a global climate fund, feed-in tariff systems, market incentives, and technological innovations.

The Copenhagen Diagnosis was also released in December 2009. It was put together by 26 leading climate scientists, coordinated by the University of New South Wales, and intended as an update to the IPCC Working Group I report on the physical science basis. The report concentrates on how knowledge of the physical science has changed since the last IPCC assessment report, pointing out:

Greenhouse gas emissions have surged, with emissions in 2008 40% higher than in 1990;

Temperatures have increased at a rate of 0.19°C per decade over the past 25 years, in line with model forecasts;

Satellite and ice measurements show the Greenland and Antarctic ice sheets are losing mass at an increasing rate, and mountain glacier melting is accelerating;

Arctic sea ice has declined much more rapidly than the models predicted: in 2007-2009 the area of Arctic sea ice was 40% lower than the IPCC projections.

Satellite measurements show sea level rise to be 3.4mm/year over the last 15 years, which is about 80% above IPCC projections. This rise matches the observed loss of ice.

Revised projections now suggest sea level rise will be double what the IPCC 2007 assessment reported by 2100, putting it at least 1 meter for unmitigated emissions, with an upper estimate of 2 meters; furthermore, sea levels will continue to rise for centuries, even after global temperatures have stabilized.

Irreversible damage is likely to occur to continental ice sheets, the Amazon rainforest, the West African Monsoon, etc., due to reaching tipping points; many of these tipping points will be crossed before we realize it.

If global warming is to be limited to 2°C above pre-industrial levels, global emissions need to peak between 2015 and 2020, and then decline rapidly, eventually reaching a decarbonized society with net zero emissions.

Here’s a letter I’ve sent to the Guardian newspaper. I wonder if they’ll print it? [Update – I’ve marked a few corrections since sending it. Darn]

Professor Darrel Ince, writing in the Guardian on February 5th, reflects on lessons from the emails and documents stolen from the Climatic Research Unit at the University of East Anglia. Prof Ince uses an example from the stolen emails to argue that there are serious concerns about software quality and openness in climate science, and goes on to suggest that this alleged lack of openness is unscientific. Unfortunately, Prof Ince makes a serious error of science himself – he bases his entire argument on a single data point, without asking whether the example is in any way representative.

The emails and files from the CRU that were released to the public are quite clearly a carefully chosen selection, where the selection criterion appears to have been maximum embarrassment to the climate scientists. I’m quite sure that I could find equally embarrassing examples of poor software on the computers of Prof Ince and his colleagues. The Guardian has been conducting a careful study of the claims that have been made about these emails, and has shown that the allegations of defects in the climate science are unfounded. However, these investigations haven’t covered the issues that Prof Ince raises, so it is worth examining them in more detail.

The Harry README file does appear to be a long struggle by a junior scientist to get some poor quality software to work. Does this indicate that there is a systemic problem of software quality in climate science? To answer that question, we would need more data. Let me offer one more data point, representing the other end of the spectrum. Two years ago I carried out a careful study of the software development methods used for the main climate simulation models developed at the UK Met Office. I was expecting to see many of the problems Prof Ince describes, because such problems are common across the entire software industry. However, I was extremely impressed with the care and rigor with which the climate models are constructed, and the extensive testing they are subjected to. In many ways, this process achieves higher quality code than the vast majority of commercial software that I have studied, which includes the spacecraft flight control code developed by NASA’s contractors. [My results were published here: http://dx.doi.org/10.1109/MCSE.2009.193].

The climate models are developed over many years, by a large team of scientists, through a process of scientific experimentation. The scientists understand that their models are approximations of complex physical processes in the Earth’s atmosphere and oceans. They build their models through a process of iterative refinement. They run the models, and compare them with observational data, to look for the places where the models perform poorly. They then create hypotheses for how to improve the model, and then run experiments: using the previous version of the model as a control, and the new version as the experimental case, they compare both runs with the observational data to determine whether the hypothesis was correct. By a continual process of making small changes, and experimenting with the results, they end up testing their models far more effectively than most commercial software developers. And through careful use of tools to keep track of this process, they can reproduce past experiments on old versions of the model whenever necessary. The main climate models are also subjected to extensive model intercomparison tests, as part of the IPCC assessment process. Models from different labs are run on the same scenarios, and the results compared in detail, to explore the strengths and weaknesses of each model.
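The control-vs-experiment loop described above can be sketched in a few lines. Everything here is schematic – the “runs” and “observations” are made-up numbers, not real model output – but the logic is the one the modellers use: treat the previous model version as the control, and accept a proposed change only if the experimental run fits the observations better.

```python
def rmse(model_output, observations):
    """Root-mean-square error between a model run and observations."""
    n = len(observations)
    return (sum((m - o) ** 2 for m, o in zip(model_output, observations)) / n) ** 0.5

def accept_change(control_run, experimental_run, observations):
    """The iterative-refinement criterion: keep the proposed model change
    only if the experimental run matches observations better than the
    control (previous) version did."""
    return rmse(experimental_run, observations) < rmse(control_run, observations)

# Made-up numbers, purely for illustration:
observations = [0.1, 0.3, 0.5, 0.7]   # e.g. decadal temperature anomalies
control = [0.0, 0.2, 0.3, 0.5]        # previous model version
experiment = [0.1, 0.3, 0.4, 0.7]     # candidate improvement
print(accept_change(control, experiment, observations))
```

In the real process, of course, “better fit” is judged across many variables and regions at once, and the version control system is what makes it possible to rerun any past control.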

As in much of the software industry, different types of climate software are verified to different extents, representing choices about where to apply limited resources. The main climate models are tested extensively, as I described above. But often scientists need to develop other programs for occasional data analysis tasks. Sometimes, they do this rather haphazardly (which appears to be the case with the Harry file). Many of these tasks are tentative in nature, and correspond to the way software engineers regularly throw a piece of code together to try out an idea. What matters is that, if the idea matures, and leads to results that are published or shared with other scientists, the results are checked carefully by other scientists. Getting hold of the code and re-running it is usually a poor way of doing this (I’ve found over the years that replicating someone else’s experiment is fraught with difficulties, and not exclusively because of problems with code quality). A much better approach is for other scientists to write their own code, and check independently whether the results are confirmed. This avoids the problem of everyone relying on one particular piece of software, as we can never be sure any software is entirely error-free.

The claim that many climate scientists have refused to publish their computer programs is also specious. I compiled a list last summer of how to access the code for the 23 main models used in the IPCC report. Although only a handful are fully open source, most are available free under fairly light licensing arrangements. For our own research we have asked for and obtained the full code, version histories, and bug databases from several centres, with no difficulties (other than the need for a little patience as the appropriate licensing agreements were sorted out). Climate and weather forecasting code has a number of potential commercial applications, so the modeling centres use a license agreement that permits academic research, but prohibits commercial use. This is no different from what would be expected when we obtain code from any commercial organization.

Professor Ince mentions Hatton’s work, which is indeed an impressive study, and one of the few that have been carried out on scientific code. And it is quite correct that there is a lot of shoddy scientific software out there. We’ve applied some of Hatton’s research methods to climate model software, and have found that, by standard software quality metrics, the climate models are consistently good quality code. Unfortunately, it is not clear that standard software engineering quality metrics apply well to this code. Climate models aren’t built to satisfy a specification, but to address a scientific problem where the answer is not known in advance, and where only approximate solutions are possible. Many standard software testing techniques don’t work in this domain, and it is a shame that the software engineering research community has almost completely ignored this problem – we desperately need more research into it.

Prof Ince also echoes a belief that seems to be common across the academic software community that releasing the code will solve the quality problems seen in the specific case of the Harry file. This is a rather dubious claim. There is no evidence that, in general, open source software is any less buggy than closed source software. Dr Xu at the University of Notre Dame studied thousands of open source software projects, and found that the majority had nobody other than the original developer using them, while a very small number of projects had attracted a big community of developers. This pattern would be true of scientific software: the problem isn’t lack of openness, it’s lack of time – most of the code thrown together to test out an idea by a particular scientist is only of interest to that one scientist. If a result is published and other scientists think it’s interesting and novel, they attempt to replicate the result themselves. Sometimes they ask for the original code (and in my experience, are nearly always given it). But in general, they write their own versions, because what matters isn’t independent verification of the code, but independent verification of the scientific results.

I am encouraged that my colleagues in the software engineering research community are starting to take an interest in studying the methods by which climate science software is developed. I fully agree that this is an important topic, and have been urging my colleagues to address it for a number of years. I do hope that they take the time to study the problem more carefully though, before drawing conclusions about overall software quality of climate code.

Prof Steve Easterbrook, University of Toronto

Update: The Guardian never published my letter, but I did find a few other rebuttals to Ince’s article in various blogs. Davec’s is my favourite!

I guess headlines like “An error found in one paragraph of a 3000 page IPCC report; climate science unaffected” wouldn’t sell many newspapers. And so instead, the papers spin out the story that a few mistakes undermine the whole IPCC process. As if newspapers never ever make mistakes. Well, of course, scientists are supposed to be much more careful than sloppy journalists, so “shock horror, those clever scientists made a mistake. Now we can’t trust them” plays well to certain audiences.

And yet there are bound to be errors; the key question is whether any of them impact any important results in the field. The error with the Himalayan glaciers in the Working Group II report is interesting because Working Group I got it right. And the erroneous paragraph in WGII quite clearly contradicts itself. Stupid mistake, that should be pretty obvious to anyone reading that paragraph carefully. There’s obviously room for improvement in the editing and review process. But does this tell us anything useful about the overall quality of the review process?

There are errors in just about every book, newspaper, and blog post I’ve ever read. People make mistakes. Editorial processes catch many of them. Some get through. But few of these things have the kind of systematic review that the IPCC reports went through. Indeed, as large, detailed, technical artifacts, with extensive expert review, the IPCC reports are much less like normal books, and much more like large software systems. So, how many errors get through a typical review process for software? Is the IPCC doing better than this?

Even the best software testing and review practices in the world let errors through. Some examples (expressed in number of faults experienced in operation, per thousand lines of code):

Worst military systems: 55 faults/KLoC

Best military systems: 5 faults/KLoC

Agile software development (XP): 1.4 faults/KLoC

The Apache web server (open source): 0.5 faults/KLoC

NASA Space shuttle: 0.1 faults/KLoC

Because of the extensive review processes, the shuttle flight software is purported to be the most expensive in the world, in terms of dollars per line of code. Yet still about 1 error per ten thousand lines of code gets through the review and testing process. Thankfully none of those errors has ever caused a serious accident. When I worked for NASA on Shuttle software verification in the 1990s, they were still getting reports of software anomalies with every shuttle flight, and releasing a software update every 18 months (this, for an operational vehicle that had been flying for two decades, with only 500,000 lines of flight code!).

The IPCC reports consist of around 3000 pages, with approaching 100 lines of text per page. Let’s assume I can equate a line of text with a line of code (which seems reasonable, when you look at the information density of each line in the IPCC reports) – that would make them as complex as a 300,000 line software system. If the IPCC review process is as thorough as NASA’s, then we should still expect around 30 significant errors to have made it through the review process. We’ve heard of two recently – does this mean we have to endure another 28 stories, spread out over the next few months, as the drone army of denialists toils through trying to find more mistakes? Actually, it’s probably worse than that…
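The back-of-envelope arithmetic is easy to check, and to extend to the less heroic fault rates in the list above:

```python
# Treat each line of IPCC text like a line of code, and apply the
# published fault-density figures quoted in the list above.
pages = 3000
lines_per_page = 100                    # roughly, for the IPCC reports
total_lines = pages * lines_per_page    # ~300,000 "lines of code"

shuttle_rate = 0.1   # faults/KLoC: NASA Space Shuttle (best known process)
xp_rate = 1.4        # faults/KLoC: a good agile (XP) project

print(total_lines * shuttle_rate / 1000)   # errors at shuttle-quality review
print(total_lines * xp_rate / 1000)        # errors at a more typical good rate
```

At shuttle quality, that’s about 30 errors; at a merely good rate, more like 420.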

The IPCC writing, editing and review processes are carried out entirely by unpaid volunteers. They don’t have automated testing and static analysis tools to help – human reviewers are the only kind of review available. So they’re bound to do much worse than NASA’s flight software. I would expect there to be 100s of errors in the reports, even with the best possible review processes in the world. Somebody point me to a technical review process anywhere that can do better than this, and I’ll eat my hat. Now, what was the point of all those newspaper stories again? Oh, yes, sensationalism sells.

Having posted last night about how frustrating it is to see the same old lies get recycled in every news report, this morning I’m greeted with the news that there’s now an app for that. I’ve posted before about the Skeptical Science site. Well, now it’s available as a free iPhone app. I’ve downloaded it and played with it, and it looks fabulous. Here are the screenshots:

I’ve been distracted over the last few months with all these attacks on climate science. It’s like watching a car crash in slow motion. I know enough about climate science to be skeptical of absolutely everything written on the topic in the mainstream media. And yet I still feel compelled to read about each new revelation trumpeted in the press, and I feel compelled to do the necessary digging to find out what’s really going on. Well, I’m done with it. I’ve seen enough. I’m finally looking away. And I’m taking away some lessons about human behaviour, and most of it isn’t pretty. Many of the people attacking the scientists are truly nasty people.

Take climategate, for example (please!). It really was a non-event – a series of trumped up claims with no substance. We already knew the contrarians talk nonsense. At worst, some requests for access to data were mishandled. By scientists who were being hounded by an army of attack drones. What did those FOI requests look like? Well, mostly they looked the same, because when Steve McIntyre was told that some of the meteorological data was not available to non-academics because of commercial licensing agreements, he threw a hissy fit and told the lunatics that follow his blog to fire off FOI requests at the CRU. Sixty FOI requests in one weekend! Which makes them all vexatious, and probably counts as harassment. Which is bad enough, but some of McIntyre’s followers did worse, and started firing off death threats. Death threats?!? Often I think I’m on the wrong planet.

Or take the hockey stick controversy. Michael Mann was smeared again as a result of the CRU emails, but on investigation his name was cleared. The previous attempt to smear him, through the Wegman investigation, turned out to be nothing but a political attack, put together by staffers in Senator Inhofe’s office. While any errors in Mann’s initial attempts at dendrochronology reconstructions have long since been corrected, and the results confirmed by other studies (that’s how science works, remember?), a group of obsessive denialists just won’t let the issue drop.

David Brin calls it a war on expertise. A bunch of untrained armchair climatologists think they know more about the field than geoscientists who have been studying it as a full-time career for decades. Or, more precisely, they think they can do a little poking and find errors, and that those errors will invalidate the science. Because they really, really want the science to be wrong. Actually, I really, really want the science to be wrong too, but I’m not so stupid as to think I can poke holes in it without first becoming an expert. If the science is wrong, you’ll read about it first in the peer-reviewed literature.

The world, and North America in particular, is entering a period of unprecedented change. There is mounting evidence of the potential for (and pressure for action to avoid) catastrophic runaway climate change, unprecedented species extinctions and environmental degradation, the persistence (if not growth) of alarming inequities in health, and accelerated resource depletion. By many estimates we currently possess most of the technological know-how to solve the world’s fiscal, economic, environmental, social justice and climatological crises. In other words, the problem is not technical but social. Consensus is emerging that building resilience at three nested levels (psychological/personal, community, and systems) is or must be at the centre of convergent social justice and environmental social change movements. Resilience is widely understood to refer to the ability of communities, persons, or systems to withstand shocks or stress without collapse, and perhaps the ability to accept and embrace (as opposed to resist) change. We are an interdisciplinary team principally from Canada and Brazil, and we are working on the development of an arts-enabled transformative learning curriculum on the transition to a low-carbon society for application in educational and community settings, one that draws on paradigms and sources of knowledge from the Global South and the Global North. We will describe work in progress.

Blake Poland is an Associate Professor in the Dalla Lana School of Public Health at the University of Toronto, Co-Director of the Environmental Health Justice in the City Research Interest Group (Centre for Urban Health Initiatives), and co-principal investigator in the CUHI-funded Building Community Resilience pilot project. His work draws on complexity science, critical social theory, arts-enabled approaches, environmental justice, community development, and health promotion.

What I was trying to lay out on this slide was a wide range of possible activities for which we could build software tools, combining good visualizations, collaborative support, and compelling user interface design. If we are to improve the quality of the public discourse on climate change, and support the kind of collective decision making that leads to effective action, we need better tools for all four of these areas:

Improve the public understanding of the basic science. Much of this is laid out in the IPCC reports, but to most people these are “dead tree science” – lots of thick books that very few people will read. So, how about some dynamic, elegant and cool tools to convey:

The difference between emissions and concentrations.

The various sources of emissions and how we know about them from detection/attribution studies.

The impacts of global warming on your part of the world – health, food and water, extreme weather events, etc.

The various mitigation strategies we have available, and what we know about the cost and effectiveness of each.

Achieve a better understanding of how the science works, to allow people to evaluate the nature of the evidence about climate change:

How science works, as a process of discovery, including how scientists develop theories, and how they correct mistakes.

What climate models are and how they are used to improve our understanding of climate processes.

How the peer-review process works, and why it is important, both as a filter for poor research and as a way of assessing the credentials of scientists.

What it means to be expert in a particular field, why expertise matters, and why expertise in one area of science doesn’t necessarily mean expertise in another.

Tools to support critical thinking, to allow people to analyze the situation for themselves:

The importance of linking claims to sources of evidence, and the use of multiple sources of evidence to test a claim.

How to assess the credibility of a particular claim, and the credibility of its source (desperately needed for appropriate filtering of ‘found’ information on the internet).

Systems Thinking – because reductionist approaches won’t help. People need to be able to recognize and understand whole systems and the dynamics of systems-of-systems.

Understanding risk – because the inability to assess risk factors is a major barrier to effective action.

Identifying the operation of vested interests. Because much of the public discourse isn’t about science or politics. It’s about people with vested interests attempting to protect those interests, often at the expense of the rest of society.

And finally, none of the above makes any difference if we don’t also provide tools to support effective action:

How to prioritize between short-term and long term goals.

How to identify which kinds of personal action are important and effective.

How to improve the quality of policy-making, so that policy choices are linked to the scientific evidence.

How to support consensus building and democratic action for collective decision making, at the level of communities, cities, nations, and globally.

Tools to monitor effectiveness of policies and practices once they are implemented.

A reader writes to me from New Zealand, arguing that climate science isn’t a science at all because there is no way to conduct experiments. This misconception appears to be common, even among some distinguished scientists, who presumably have never taken the time to read many published papers in climatology. The misconception arises because people assume that climate science is all about predicting future climate change; since such predictions reach decades or centuries into the future, and we only have one planet to work with, we can’t check whether they are correct until it’s too late to be useful.

In fact, predictions of future climate are really only a by-product of climate science. The science itself concentrates on improving our understanding of the processes that shape climate, by analyzing observations of past and present climate, and testing how well we understand them. For example, detection/attribution studies focus on the detection of changes in climate that are outside the bounds of natural variability (using statistical techniques), and determining how much of the change can be attributed to each of a number of possible forcings (e.g. changes in: greenhouse gases, land use, aerosols, solar variation, etc). Like any science, the attribution is done by creating hypotheses about possible effects of each forcing, and then testing those hypotheses. Such hypotheses can be tested by looking for contradictory evidence (e.g. other episodes in the past where the forcing was present or absent, to test how well the hypothesis explains these too). They can also be tested by encoding each hypothesis in a climate model, and checking how well it simulates the observed data.
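The statistical core of a detection study can be sketched in a few lines. This is a deliberately minimal illustration with made-up data, not the method used in any actual attribution paper: a hypothetical observed record with a warming trend is compared against an ensemble of unforced control runs, to ask whether the observed trend lies outside what natural variability alone can produce.

```python
import numpy as np

rng = np.random.default_rng(42)

def linear_trend(series):
    """Least-squares slope of a time series (units per time step)."""
    t = np.arange(len(series))
    return np.polyfit(t, series, 1)[0]

# Hypothetical data: an 'observed' record with a forced trend, and an
# ensemble of unforced control runs representing natural variability.
n_years = 50
observed = 0.02 * np.arange(n_years) + rng.normal(0, 0.1, n_years)
control_runs = rng.normal(0, 0.1, size=(1000, n_years))

obs_trend = linear_trend(observed)
natural_trends = np.array([linear_trend(run) for run in control_runs])

# Detection: what fraction of unforced runs show a trend at least as
# large as the observed one? A small fraction means the observed change
# is unlikely to be natural variability alone.
p_value = np.mean(np.abs(natural_trends) >= abs(obs_trend))
print(f"observed trend: {obs_trend:.4f}/yr, p = {p_value:.3f}")
```

Real studies use far more sophisticated techniques (optimal fingerprinting, spatial patterns, multiple forcings), but the logic is the same: quantify natural variability, then test whether the observed signal stands out from it.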

Well, a climate model is a detailed theory of some subset of the earth’s physical processes. Like all theories, it is a simplification that focusses on those processes that are salient to a particular set of scientific questions, and approximates or ignores those processes that are less salient. Climate modelers use their models as experimental instruments. They compare the model run with the observational record for some relevant historical period. They then come up with a hypothesis to explain any divergences between the run and the observational record, and make a small improvement to the model that the hypothesis predicts will reduce the divergence. They then run an experiment in which the old version of the model acts as a control, and the new version is the experimental case. By comparing the two runs with the observational record, they determine whether the predicted improvement was achieved (and whether the change messed anything else up in the process). After a series of such experiments, the modelers will eventually either accept the change to the model as an improvement to be permanently incorporated into the model code, or discard it because the experiments failed (i.e. they failed to give the expected improvement). By doing this day after day, year after year, the models get steadily more sophisticated, and steadily better at simulating real climatic processes.
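The accept-or-discard decision above can be sketched as a toy comparison. Everything here is a stand-in: synthetic series play the roles of the observational record, the control run, and the experimental run, and a simple RMSE plays the role of the (much richer) skill metrics modelers actually use.

```python
import numpy as np

def rmse(model_run, observations):
    """Root-mean-square error of a model run against the observed record."""
    return np.sqrt(np.mean((model_run - observations) ** 2))

rng = np.random.default_rng(0)

# Hypothetical series: an observed record, a control run with a known
# bias, and an experimental run in which a candidate model change has
# removed part of that bias.
observations = np.sin(np.linspace(0, 10, 200))
control_run = observations + 0.5 + rng.normal(0, 0.05, 200)      # biased
experiment_run = observations + 0.1 + rng.normal(0, 0.05, 200)   # less biased

err_control = rmse(control_run, observations)
err_experiment = rmse(experiment_run, observations)

# Accept the change only if it reduces the divergence from observations
# (in practice, modelers also check it hasn't degraded anything else).
if err_experiment < err_control:
    print(f"improvement: RMSE {err_control:.3f} -> {err_experiment:.3f}")
else:
    print("change rejected: no improvement over the control run")
```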

This experimental approach has another interesting effect: the software appears to be tested much more thoroughly than most commercial software. Whether this actually delivers higher quality code is an interesting question; however, it is clear that the approach is much more thorough than most industry practices for software regression testing.
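A concrete form this testing takes is comparing a freshly built model against saved output from an accepted version. The sketch below is an assumption about the general shape of such a check, not any modeling centre’s actual test harness: bit-for-bit comparison is often too strict once compilers or hardware change, so a small numerical tolerance is allowed instead.

```python
import numpy as np

def regression_check(new_output, reference_output, rtol=1e-5, atol=1e-8):
    """Compare a new model run against a stored reference run.

    Returns True if every value agrees within the given tolerances,
    i.e. the rebuilt model reproduces the accepted behaviour.
    """
    return np.allclose(new_output, reference_output, rtol=rtol, atol=atol)

# Hypothetical data: reference output saved from an accepted model
# version, and output from a freshly built executable.
reference = np.array([288.15, 287.92, 288.40])   # e.g. surface temps in K
rebuilt = reference + 1e-9                       # tiny drift from a recompile

assert regression_check(rebuilt, reference)              # within tolerance
assert not regression_check(reference + 0.5, reference)  # a real change
```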

I’m delighted to announce that my student, Jonathan Lung has started a blog. Jonathan’s PhD is on how we reduce energy consumption in computing. Unlike much work on green IT, he’s decided to focus on the human behavioural aspects of this, rather than hardware optimization. His first two posts are fascinating:

How to calculate if you should print something out or read it on the screen. Since he first did these calculations, we’ve been discussing how you turn this kind of analysis into an open, shared, visual representation, that others can poke and prod, to test the assumptions, customize them to their own context, and discuss. We’ll share more of our design ideas for such a tool in due course.
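The skeleton of such a calculation is simple, and that simplicity is exactly why an open, pokeable representation matters: the answer hinges entirely on the assumed numbers. The figures below are illustrative placeholders I have chosen for the sketch, not Jonathan’s actual estimates; energy per printed page and display power vary widely by printer, paper supply chain, and screen.

```python
# Assumed inputs (placeholders, not measured values):
ENERGY_PER_PAGE_J = 60_000.0   # embodied + printing energy per page, joules
SCREEN_POWER_W = 30.0          # display + computer draw while reading, watts

def break_even_minutes(pages):
    """Reading time beyond which printing uses less energy than the screen.

    Printing costs a fixed energy per page; screen reading costs energy
    proportional to time. The break-even point is where the two are equal.
    """
    return pages * ENERGY_PER_PAGE_J / SCREEN_POWER_W / 60.0

minutes = break_even_minutes(pages=10)
print(f"Under these assumptions, screen reading of a 10-page document "
      f"beats printing only if you read for under {minutes:.0f} minutes.")
```

Changing any one assumption (duplex printing, a laptop vs. a desktop, recycled paper) moves the break-even point substantially, which is why the tool we have in mind would expose every input for others to test and customize.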

An analysis of whether the iPad is as green as Apple’s marketing claims. Which is, in effect, a special case of the more general calculation of print vs. screen. Oh, and his analysis also makes me feel okay about my desire to own an iPad…

As Jorge points out, this almost completes my set of grad student bloggers. We’ve been experimenting with blogging as a way of structuring research – a kind of open notebook science. Personally, I find it extremely helpful as a way of forcing me to write down ideas (rather than just thinking them), and for furthering discussion of ideas through the comments. And, just as importantly, it’s a way of letting other researchers know about what you’re working on – grad students’ future careers depend on them making a name for themselves in their chosen research community.

Of course, there’s a downside: grad students tend to worry about being “scooped”, by having someone else take their ideas, do the studies, and publish them first. My stock response is something along the lines of “research is 99% perspiration and 1% inspiration” – the ideas themselves, while important, are only a tiny part of doing research. It’s the investigation of the background literature and the implementation (design an empirical study, build a tool, develop a new theory, …etc) that matters. Give the same idea to a bunch of different grad students, and they will all do very different things with it, all of which (if the students are any good) ought to be publishable.

On balance, I think the benefits of blogging your way through grad school vastly outweigh the risks. Now if only my students updated their blogs more regularly… (hint, hint).

Part of the problem is that in their rush to do science, scientists fail to spot the software for what it is: the analogue of the experimental instrument. Consequently, it needs to be treated with the same respect that a physical experiment would receive.

Any reputable physical experiment would ensure the instruments are appropriate to the job and have been tested. They would be checked for known error behaviour in the parameter regions of study, and chosen for their ability to give a satisfactory result within a useful timeframe and budget. Those same principles should apply to a software model.