Abstract

A set of performance metrics is applied to stratosphere-resolving chemistry-climate models (CCMs) to quantify their ability to reproduce key processes relevant for stratospheric ozone. The same metrics are used to assign a quantitative measure of performance (“grade”) to each model-observations comparison shown in Eyring et al. (2006). A wide range of grades is obtained, both for different diagnostics applied to a single model and for the same diagnostic applied to different models, highlighting the wide range in the ability of the CCMs to simulate key processes in the stratosphere. No model scores high or low on all tests, but differences in model performance can be seen, especially for processes that are mainly determined by transport, where several models receive low grades on multiple tests. The grades are used to assign relative weights to the CCM projections of 21st century total ozone. For the diagnostics used here, there are generally only small differences between the weighted and unweighted multi-model means and variances of the total ozone projections. This study raises several issues with the grading and weighting of CCMs that need further examination. However, it does provide a framework and benchmarks that will enable quantification of model improvements and assignment of relative weights to the model projections.