Abstract. During the fifth phase of the Coupled Model Intercomparison Project (CMIP5) substantial efforts were made to systematically assess the skill of Earth system models. One goal was to check how realistically representative marine biogeochemical tracer distributions could be reproduced by models. In routine assessments model historical hindcasts were compared with available modern biogeochemical observations. However, these assessments considered neither how close modeled biogeochemical reservoirs were to equilibrium nor the sensitivity of model performance to initial conditions or to the spin-up protocols. Here, we explore how the large diversity in spin-up protocols used for marine biogeochemistry in CMIP5 Earth system models (ESMs) contributes to model-to-model differences in the simulated fields. We take advantage of a 500-year spin-up simulation of IPSL-CM5A-LR to quantify the influence of the spin-up protocol on model ability to reproduce relevant data fields. Amplification of biases in selected biogeochemical fields (O2, NO3, Alk-DIC) is assessed as a function of spin-up duration. We demonstrate that a relationship between spin-up duration and assessment metrics emerges from our model results and holds when confronted with a larger ensemble of CMIP5 models. This shows that drift has implications for performance assessment in addition to possibly aliasing estimates of climate change impact. Our study suggests that differences in spin-up protocols could explain a substantial part of model disparities, constituting a source of model-to-model uncertainty. This requires more attention in future model intercomparison exercises in order to provide quantitatively more correct ESM results on marine biogeochemistry and carbon cycle feedbacks.

This paper explores how the large diversity in spin-up protocols used for ocean biogeochemistry in CMIP5 models contributed to inter-model differences in modeled fields. We show that a link between spin-up duration and skill-score metrics emerges from both individual IPSL-CM5A-LR's results and an ensemble of CMIP5 models. Our study suggests that differences in spin-up protocols constitute a source of inter-model uncertainty which would require more attention in future intercomparison exercises.

This paper explores how the large diversity in spin-up protocols used for ocean biogeochemistry...