Global Climate Models are computer programs that simulate the atmosphere, ocean, sea ice, and land surface, together with the interactions among these components. Earth System Models (ESMs) are global climate models that also include biogeochemical cycling in terrestrial and marine ecosystems and explicitly model the movement of carbon through the Earth system. ESM output consists of estimates of variables such as precipitation and temperature at different spatial and temporal resolutions, typically available as three-dimensional gridded values.

The Coupled Model Intercomparison Project (CMIP), now in its sixth phase (CMIP6), provides the climate and impact assessment community with a wide range of simulation output from Earth System Models (Eyring et al., 2016b). These simulations help us better understand past climate and provide estimates of future climate under different scenarios. Before this output is used extensively in analyses and impact assessment studies, it is important to evaluate it in order to understand systematic biases in the models and the magnitude of uncertainty in future projections. There is also a need for a comprehensive framework that performs the baseline aspects of model evaluation consistently and efficiently, so that scientists do not spend valuable time and effort individually re-implementing well-established evaluation diagnostics instead of sharing them. What is needed, therefore, is a tool that implements basic metrics and diagnostics for evaluating model output while remaining flexible enough to accommodate newly researched, science-based evaluation measures. We present the Earth System Model Evaluation Tool (ESMValTool), which is compatible with current and future phases of CMIP and offers many desirable features for comprehensive model evaluation (Eyring et al., 2016a).

The ESMValTool is an open-source community diagnostics and performance metrics tool for the evaluation of ESMs. It allows multiple models to be compared with one another, with their previous versions, and with observations. The tool provides a versatile preprocessor that standardizes data through regridding, interpolation and masking, as well as temporal and spatial extraction. Standard recipes for scientific topics reproduce specific diagnostics and performance metrics from the peer-reviewed literature that are considered important for ESM evaluation (Lauer et al., 2017).
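To make the preprocessing steps more concrete, the following is a minimal sketch in plain NumPy of temporal extraction, spatial extraction and masking on a gridded field. It is not the ESMValTool preprocessor or its API: the function names, the tiny coordinate arrays and the mask are all hypothetical, and regridding and interpolation are left out.

```python
import numpy as np

def extract_time(data, years, start_year, end_year):
    """Keep only the time steps whose year lies in [start_year, end_year]."""
    sel = (years >= start_year) & (years <= end_year)
    return data[sel], years[sel]

def extract_region(data, lats, lons, lat_bounds, lon_bounds):
    """Keep only the grid cells inside the given latitude/longitude bounds."""
    lat_sel = (lats >= lat_bounds[0]) & (lats <= lat_bounds[1])
    lon_sel = (lons >= lon_bounds[0]) & (lons <= lon_bounds[1])
    return data[:, lat_sel][:, :, lon_sel], lats[lat_sel], lons[lon_sel]

def apply_mask(data, mask_2d):
    """Mask the same grid cells at every time step (True means masked out)."""
    full_mask = np.broadcast_to(mask_2d, data.shape).copy()
    return np.ma.masked_array(data, mask=full_mask)

# Hypothetical annual field with dimensions (time, lat, lon).
years = np.arange(1980, 1990)
lats = np.array([-60.0, -30.0, 0.0, 30.0, 60.0])
lons = np.array([0.0, 90.0, 180.0, 270.0])
data = np.arange(10 * 5 * 4, dtype=float).reshape(10, 5, 4)

subset, sub_years = extract_time(data, years, 1982, 1985)
subset, sub_lats, sub_lons = extract_region(
    subset, lats, lons, lat_bounds=(-30.0, 30.0), lon_bounds=(0.0, 180.0)
)

# Hypothetical mask that excludes a single grid cell of the extracted region.
mask_2d = np.zeros((sub_lats.size, sub_lons.size), dtype=bool)
mask_2d[0, 0] = True
masked = apply_mask(subset, mask_2d)
```

Applying the same sequence of steps to every dataset puts models and observations on a common time period and region before any diagnostic is computed, which is the purpose of standardizing the data.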

The tool is currently at version 2.0, and the most recent release allows new analyses and recipes to be written in any of several open-source languages, such as Python, R and NCL. Figure 1 shows a high-level overview of the ESMValTool.

Some key features of the tool are:
(a) Simultaneous multimodel comparison with observed data. Figure 2 shows an example generated by the ESMValTool: seasonal standard deviations of surface temperature anomalies for four different CMIP5 models and for observed data. Three of the models show greater variability than the observations.
(b) Extensive documentation with log files for better traceability and reproducibility.
(c) Wide scope with standard recipes covering different aspects of ESMs such as dynamics, radiation, aerosols and sea ice.
(d) Central installation on the JASMIN infrastructure of the Centre for Environmental Data Analysis (CEDA), where CMIP model data are retrieved automatically.
(e) Diagnostic outputs produced in convenient formats such as plots (PDF, PNG, etc.) and netCDF files.
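The comparison described in feature (a) can be illustrated with a minimal NumPy sketch that computes the temporal standard deviation of anomalies at each grid cell for several datasets and keeps missing observations masked. This is not ESMValTool code: the dataset names, the helper functions and the synthetic sinusoidal fields are assumptions made purely for illustration.

```python
import numpy as np

def anomaly_std(field):
    """Standard deviation over time of anomalies at each grid cell.

    field: masked array with dimensions (time, lat, lon).
    """
    anomalies = field - field.mean(axis=0)  # remove the time mean per cell
    return anomalies.std(axis=0)            # fully missing cells stay masked

def synthetic_field(amplitude, n_time=24, n_lat=3, n_lon=4):
    """Hypothetical monthly field: a constant offset plus an annual cycle."""
    months = np.arange(n_time)
    cycle = amplitude * np.sin(2.0 * np.pi * months / 12.0)
    return 280.0 + cycle[:, None, None] * np.ones((n_time, n_lat, n_lon))

# Synthetic observations with one grid cell missing at all times.
obs_raw = synthetic_field(1.0)
obs_raw[:, 0, 0] = np.nan

datasets = {
    "model_a": np.ma.masked_invalid(synthetic_field(1.5)),
    "model_b": np.ma.masked_invalid(synthetic_field(0.8)),
    "obs": np.ma.masked_invalid(obs_raw),
}

# One standard deviation map per dataset, as in a multi-panel figure.
std_maps = {name: anomaly_std(field) for name, field in datasets.items()}

# Models whose mean variability exceeds that of the observations.
obs_level = std_maps["obs"].mean()
more_variable = [
    name for name, std_map in std_maps.items()
    if name != "obs" and std_map.mean() > obs_level
]
```

Using masked arrays means that cells without observations drop out of the summary statistics instead of contaminating them, which mirrors how missing data appear as blank areas in a map of observed variability.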

The ESMValTool v2.0 continues to evolve: diagnostics are being ported from the previous release, further relevant diagnostics are being ported from other assessment tools, and new ones are being added. The tool can be accessed from the Earth System Model Evaluation Tool GitHub page, and a sample of CMIP results produced with the ESMValTool is available for the German CMIP6 project. We encourage users to access the tool and try out the various diagnostics and metrics available.

Figure 2: Standard deviation of global surface temperature anomalies for the period 1982-2005. Panels (a) through (d) show temperature output from different CMIP5 climate models, and panel (e) shows the standard deviation for observations (in this case, from HadCRUT4). Missing data in the observations are shown in white.