A method for restoring missing data is to ensure that the restored data,
after specified filtering, has minimum energy.
Specifying the filter chooses the interpolation philosophy.
Generally the filter is a "roughening" filter.
When a roughening filter goes off the end of smooth data,
it typically produces a big end transient.
Minimizing energy implies a choice for unknown data values
at the end, to minimize the transient.
We will examine five cases and then make some generalizations.

Let m denote a missing value.
The dataset on which the examples are based is
shown at the top of Figure 2. Using subroutine miss1(),
values were found to replace the missing m values
so that the power in the filtered data is minimized.
Figure 2
shows interpolation of the dataset with 1-Z as a roughening filter.
The interpolated data matches the given data where they overlap.
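
The idea can be sketched in a few lines of Python. This is not the miss1() subroutine itself but a minimal illustration, assuming a transient (full) convolution so that the filter runs off both ends of the data, and assuming NumPy's least-squares solver in place of whatever solver miss1() uses. The names convolution_matrix and fill_missing and the short dataset are hypothetical.

```python
import numpy as np

def convolution_matrix(filt, n):
    """Matrix F such that F @ x equals np.convolve(filt, x) for len(x) == n."""
    filt = np.asarray(filt, dtype=float)
    F = np.zeros((n + len(filt) - 1, n))
    for j in range(n):
        F[j:j + len(filt), j] = filt  # column j holds the filter, shifted down j rows
    return F

def fill_missing(data, known, filt):
    """Choose the unknown samples so the filtered signal has minimum energy."""
    data = np.asarray(data, dtype=float)
    known = np.asarray(known, dtype=bool)
    F = convolution_matrix(filt, len(data))
    F_known, F_unknown = F[:, known], F[:, ~known]
    # Minimize |F_unknown @ u + F_known @ d_known|^2 over the unknown samples u.
    u, *_ = np.linalg.lstsq(F_unknown, -F_known @ data[known], rcond=None)
    filled = data.copy()
    filled[~known] = u
    return filled

# Hypothetical data, not the dataset of the figures; 0.0 marks a missing value m.
data = np.array([1.0, 0.0, 0.0, 4.0, 0.0])
known = np.array([True, False, False, True, False])
filled = fill_missing(data, known, filt=[1.0, -1.0])  # roughening filter 1-Z
print(filled)  # approximately [1, 2, 3, 4, 2]
```

In this sketch the interior gap is filled by a straight line between the given values 1 and 4. The trailing value comes out as 2, halfway between the last given value and the implied zero beyond the end, which is the choice that minimizes the end transient under the full-convolution assumption.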

mlines
Figure 2
Top is given data.
Middle is given data with interpolated values.
Missing values seem to be interpolated by straight lines.
Bottom shows the filter (1,-1),
whose output has minimum power.

mparab
Figure 3
Top is the same input data as in Figure 2.
Middle is interpolated.
Bottom shows the filter (-1,2,-1).
The missing data seems to be interpolated by parabolas.

mseis
Figure 4
Top is the same input.
Middle is interpolated.
Bottom shows the filter (1,-3,3,-1).
The missing data is very smooth.
It shoots upward high off the right end of the observations,
apparently to match the data slope there.

msmo
Figure 5
The filter (-1,-1,4,-1,-1) gives
interpolations with stiff lines. They resemble
the straight lines of Figure 2,
but they project through a cluster of given values
instead of projecting to the nearest given value.
Thus, this interpolation tolerates noise in the given data
better than the interpolation shown in
Figure 4.

moscil
Figure 6
Bottom shows the filter (1,1).
The interpolation is rough.
Like the given data itself, the interpolation
has much energy at the Nyquist frequency.
But unlike the given data, it has little zero-frequency energy.
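
A small numerical sketch shows why the filter (1,1) produces a rough interpolation. It assumes a transient (full) convolution and uses a hypothetical five-point dataset and a hypothetical helper fill_missing built on NumPy's least-squares solver; it is not the program that made the figure.

```python
import numpy as np

def fill_missing(data, known, filt):
    """Fill unknown samples so that np.convolve(filt, signal) has minimum energy."""
    data = np.asarray(data, dtype=float)
    known = np.asarray(known, dtype=bool)
    filt = np.asarray(filt, dtype=float)
    n = len(data)
    F = np.zeros((n + len(filt) - 1, n))
    for j in range(n):
        F[j:j + len(filt), j] = filt  # full-convolution matrix
    u, *_ = np.linalg.lstsq(F[:, ~known], -F[:, known] @ data[known], rcond=None)
    filled = data.copy()
    filled[~known] = u
    return filled

# Hypothetical data: two given values with three missing values between them.
data = np.array([1.0, 0.0, 0.0, 0.0, 1.0])
known = np.array([True, False, False, False, True])
filled = fill_missing(data, known, filt=[1.0, 1.0])  # the filter 1+Z
print(filled)  # approximately [1, -1, 1, -1, 1]
```

The filter 1+Z annihilates a signal that alternates in sign, so the cheapest way to fill the gap is an oscillation at the Nyquist frequency. The filled values sum to nearly zero across the gap, which is the small zero-frequency content noted above.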

Figures 2-6
illustrate that the rougher the filter,
the smoother the interpolated data,
and vice versa.
Let us switch our attention from the residual spectrum
to the residual itself.
The residual for Figure 2
is the slope of the signal
(because the filter 1-Z is a first derivative),
and the slope is constant (uniformly distributed) along the straight lines
where the least-squares procedure is choosing signal values.
So these examples confirm the idea
that the least-squares method abhors large values
(because they are squared).
Thus, least squares tends to distribute residuals uniformly
in both time and frequency to the extent the constraints allow.
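
The uniform residual can be checked numerically. The sketch below assumes a transient (full) convolution, a hypothetical dataset with one gap, and a hypothetical helper fill_missing built on NumPy's least-squares solver. It compares the least-squares fill against an arbitrary "lumpy" fill that honors the same given values.

```python
import numpy as np

def fill_missing(data, known, filt):
    """Fill unknown samples so that np.convolve(filt, signal) has minimum energy."""
    data = np.asarray(data, dtype=float)
    known = np.asarray(known, dtype=bool)
    filt = np.asarray(filt, dtype=float)
    n = len(data)
    F = np.zeros((n + len(filt) - 1, n))
    for j in range(n):
        F[j:j + len(filt), j] = filt  # full-convolution matrix
    u, *_ = np.linalg.lstsq(F[:, ~known], -F[:, known] @ data[known], rcond=None)
    filled = data.copy()
    filled[~known] = u
    return filled

roughener = np.array([1.0, -1.0])  # the filter 1-Z
data = np.array([0.0, 0.0, 0.0, 0.0, 8.0])  # hypothetical; middle three are missing
known = np.array([True, False, False, False, True])

filled = fill_missing(data, known, roughener)
residual = np.convolve(roughener, filled)  # slope of the filled signal
lumpy = np.array([0.0, 0.0, 0.0, 0.0, 8.0])  # same given values, one big jump
lumpy_residual = np.convolve(roughener, lumpy)

print(residual[1:5])  # slope across the gap: approximately [2, 2, 2, 2]
print(np.sum(residual**2), np.sum(lumpy_residual**2))  # about 80 versus 128
```

Both fills climb from 0 to 8, but the least-squares fill spreads the climb over four equal steps, whereas the lumpy fill concentrates it in one step and pays for it when the step is squared.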

This idea helps us answer the question,
what is the best filter to use?
It suggests choosing
the filter to have an amplitude spectrum
that is inverse to the spectrum we want for the interpolated data.
A systematic approach is given in the next section,
but I will offer a simple subjective analysis here.
Looking at the data, I see that all points are positive.
It seems, therefore, that
the data is rich in low frequencies;
thus the filter should contain something like (1-Z),
which vanishes at zero frequency.
Likewise, the data seems to contain energy at the Nyquist frequency,
so the filter should contain (1+Z), which vanishes at the Nyquist frequency.
The result of using the filter (1-Z)(1+Z)=1-Z^2
is shown in Figure 7.
This is my best subjective interpolation
based on the idea that the missing data should look like the given data.
The interpolation and extrapolations are so good that
you can hardly guess which data values are given
and which are interpolated.
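
The filter arithmetic in this argument is easy to verify. The sketch below multiplies the two factors by convolving their coefficients and evaluates the amplitude spectrum at zero frequency, at half the Nyquist frequency, and at the Nyquist frequency. The helper amplitude_at is hypothetical, and Z is taken as a unit delay.

```python
import numpy as np

low_cut = np.array([1.0, -1.0])  # 1-Z
nyquist_cut = np.array([1.0, 1.0])  # 1+Z
combined = np.convolve(low_cut, nyquist_cut)  # coefficients of (1-Z)(1+Z)
print(combined)  # coefficients 1, 0, -1

def amplitude_at(filt, omega):
    """Amplitude of the filter's Fourier transform at angular frequency omega."""
    k = np.arange(len(filt))
    return abs(np.sum(filt * np.exp(-1j * omega * k)))

for name, filt in [("1-Z", low_cut), ("1+Z", nyquist_cut), ("1-Z^2", combined)]:
    amps = [amplitude_at(filt, w) for w in (0.0, np.pi / 2, np.pi)]
    print(name, np.round(amps, 3))
```

The combined filter has zeros at both zero frequency and the Nyquist frequency and its largest amplitude midway between them. Minimizing the energy of its output therefore leaves the interpolated data free to carry energy at those two frequencies.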

mbest
Figure 7
Top is the same as in
Figures 2 to 6.
Middle is interpolated.
Bottom shows the filter (1,0,-1), which comes from
the coefficients of (1-Z)(1+Z).
Both the given data and the interpolated data
have significant energy at
both zero and Nyquist frequencies.