We compare the ocean temperature evolution of the Holocene as simulated by climate models and as reconstructed from marine temperature proxies. We use transient simulations from a coupled atmosphere–ocean general circulation model, as well as an ensemble of time slice simulations from the Paleoclimate Modelling Intercomparison Project. The general pattern of sea surface temperature (SST) in the models shows a high-latitude cooling and a low-latitude warming. The proxy dataset comprises a global compilation of marine alkenone- and Mg/Ca-derived SST estimates. Independently of the choice of climate model, we observe significant mismatches between modelled and estimated SST amplitudes in the trends for the last 6000 yr. Alkenone-based SST records show a pattern similar to that of the simulated annual mean SSTs, but the simulated SST trends underestimate the alkenone-based SST trends by a factor of two to five. For Mg/Ca, no significant relationship between model simulations and proxy reconstructions can be detected. We test whether such discrepancies can be caused by overly simplistic interpretations of the proxy data. Specifically, we explore whether consideration of different growing seasons and depth habitats of the planktonic organisms used for temperature reconstruction could lead to a better agreement of model results with proxy data on a regional scale, and we determine the extent to which temporal shifts in growing season or vertical shifts in depth habitat can reduce model–data misfits. We find that invoking shifts in the living season and habitat depth can remove some of the model–data discrepancies in SST trends. Regardless of whether such adjustments in the environmental parameters during the Holocene are realistic, they indicate that even when modelled temperature trends are set up to allow drastic shifts in the ecological behaviour of planktonic organisms, they do not capture the full range of reconstructed SST trends.
Our results indicate that modelled and reconstructed temperature trends are to a large degree only qualitatively comparable, which poses a challenge both for the interpretation of proxy data and for the model sensitivity to orbital forcing.