Abstract. The majority of data sets in the geosciences are obtained from observations
and measurements of natural systems, rather than in the laboratory. These
data sets are often full of gaps, due to the conditions under which the
measurements are made. Missing data give rise to various problems, for
example in spectral estimation or in specifying boundary conditions for
numerical models. Here we use Singular Spectrum Analysis (SSA) to fill the
gaps in several types of data sets. For a univariate record, our procedure
uses only temporal correlations in the data to fill in the missing points.
For a multivariate record, multi-channel SSA (M-SSA) takes advantage of both
spatial and temporal correlations. We iteratively produce estimates of
missing data points, which are then used to compute a self-consistent
lag-covariance matrix; cross-validation allows us to optimize the window
width and the number of dominant SSA or M-SSA modes used to fill the gaps. The optimal
parameters of our procedure depend on the distribution in time (and space) of
the missing data, as well as on the variance distribution between oscillatory
modes and noise. The algorithm is demonstrated on synthetic examples, as well
as on data sets from oceanography, hydrology, atmospheric sciences, and space
physics: global sea-surface temperature, flood-water records of the Nile
River, the Southern Oscillation Index (SOI), and satellite observations of
relativistic electrons.
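
The iterative idea described above can be illustrated with a minimal univariate sketch. It is a simplified illustration under stated assumptions, not the procedure of the paper itself: the window width and the number of modes are passed in by hand rather than chosen by cross-validation, all retained modes are updated together, and the names ssa_reconstruct, ssa_fill, window, n_modes, max_iter, and tol are hypothetical.

```python
import numpy as np


def ssa_reconstruct(x, window, n_modes):
    """Reconstruct a gap-free series from its leading SSA modes."""
    n = len(x)
    k = n - window + 1
    # Trajectory matrix: row i holds the lagged window x[i:i + window].
    traj = np.array([x[i:i + window] for i in range(k)])
    # Lag-covariance matrix estimated from the trajectory matrix.
    cov = traj.T @ traj / k
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Keep the eigenvectors (EOFs) with the largest eigenvalues.
    order = np.argsort(eigvals)[::-1][:n_modes]
    eofs = eigvecs[:, order]
    # Project onto the retained EOFs and map back to window space.
    approx = (traj @ eofs) @ eofs.T
    # Diagonal averaging turns the matrix back into a series.
    recon = np.zeros(n)
    counts = np.zeros(n)
    for i in range(k):
        recon[i:i + window] += approx[i]
        counts[i:i + window] += 1
    return recon / counts


def ssa_fill(x, window, n_modes, max_iter=200, tol=1e-8):
    """Fill NaN gaps in x by iterating SSA reconstructions (a sketch)."""
    x = np.asarray(x, dtype=float)
    missing = np.isnan(x)
    if not missing.any():
        return x.copy()
    # Center on the mean of the available data; start gaps at zero.
    mean = np.nanmean(x)
    filled = np.where(missing, 0.0, x - mean)
    for _ in range(max_iter):
        recon = ssa_reconstruct(filled, window, n_modes)
        change = np.max(np.abs(recon[missing] - filled[missing]))
        # Only the missing points are updated; observations stay fixed.
        filled[missing] = recon[missing]
        if change < tol:
            break
    return filled + mean
```

A call such as ssa_fill(series, window=20, n_modes=2), with gaps in series marked as NaN, returns a copy in which only the missing points have been replaced. In practice both parameters would be tuned, for example by withholding some known points and comparing them with their reconstructed values.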