Abstract. Skilful hydrological forecasts at sub-seasonal to seasonal lead times would be extremely beneficial for decision-making in water resources management, hydropower operations, and agriculture, especially during drought conditions. Ensemble streamflow prediction (ESP) is a well-established method for generating an ensemble of streamflow forecasts in the absence of skilful future meteorological predictions, instead using initial hydrologic conditions (IHCs), such as soil moisture, groundwater, and snow, as the source of skill. We benchmark when and where the ESP method is skilful across a diverse sample of 314 catchments in the UK and explore the relationship between catchment storage and ESP skill. The GR4J hydrological model was forced with historic climate sequences to produce a 51-member ensemble of streamflow hindcasts. We evaluated forecast skill seamlessly from lead times of 1 day to 12 months initialized at the first of each month over a 50-year hindcast period from 1965 to 2015. Results showed ESP was skilful against a climatology benchmark forecast in the majority of catchments across all lead times up to a year ahead, but the degree of skill was strongly conditional on lead time, forecast initialization month, and individual catchment location and storage properties. UK-wide mean ESP skill decayed exponentially as a function of lead time with continuous ranked probability skill scores across the year of 0.75, 0.20, and 0.11 for 1-day, 1-month, and 3-month lead times, respectively. However, skill was not uniform across all initialization months. For lead times up to 1 month, ESP skill was higher than average when initialized in summer and lower in winter months, whereas for longer seasonal and annual lead times skill was higher when initialized in autumn and winter months and lowest in spring. ESP was most skilful in the south and east of the UK, where slower responding catchments with higher soil moisture and groundwater storage are mainly located; correlation between catchment base flow index (BFI) and ESP skill was very strong (Spearman's rank correlation coefficient= 0.90 at 1-month lead time). This was in contrast to the more highly responsive catchments in the north and west which were generally not skilful at seasonal lead times. Overall, this work provides scientific justification for when and where use of such a relatively simple forecasting approach is appropriate in the UK. This study, furthermore, creates a low cost benchmark against which potential skill improvements from more sophisticated hydro-meteorological ensemble prediction systems can be judged.

We benchmarked when and where ensemble streamflow prediction (ESP) is skilful in the UK across a diverse set of 314 catchments. We found ESP was skilful in the majority of catchments across all lead times up to a year ahead, but the degree of skill was strongly conditional on lead time, forecast initialization month, and individual catchment location and storage properties. Results have practical implications for current operational use of the ESP method in the UK.

We benchmarked when and where ensemble streamflow prediction (ESP) is skilful in the UK across a...