Atmospheric data assimilation is the analytical process of estimating
the entire state of the atmosphere from a set of observations. This is
considered to be a crucial element of weather forecasting. No forecast
model can be wholly correct, and the assimilation procedure aims to equip
the numerical model with accurate initial conditions, thus encouraging
the model to advance in a realistic direction. It is not just
forecasting which benefits: the process is also applied to the
creation of accurate and continuous research data sets, and to the
diagnosis of model errors.

The technology has benefited from 70 years of
research and is still a developing area. A major lesson is that while
data assimilation is very important to forecasting and to atmospheric
science, it has to be done well to be of value.

1. The Method of Least Squares

The method of least squares is the central principle behind most data
assimilation schemes used in leading weather centres. It is a basic
mathematical idea which tries to minimize the departure between the
state of the model and the incoming observations. Invented by Gauss
in the late eighteenth century, it was first used in astronomy to
determine the orbital parameters of planets and other bodies about the
Sun. This application parallels the meteorological problem in the
sense that both are examples of an inverse problem.
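
As a minimal sketch of the idea, consider a single scalar quantity for
which a model value and one observation are available, each with an
assumed error variance. The least-squares estimate minimizes the sum
of the squared departures from both, each weighted by the inverse of
its error variance. The function name and the numbers used here are
illustrative assumptions and are not taken from any operational scheme.

```python
def least_squares_estimate(model_value, model_var, obs_value, obs_var):
    """Minimize J(x) = (x - model_value)**2 / model_var
                     + (x - obs_value)**2 / obs_var.

    Setting dJ/dx = 0 gives the inverse-variance weighted mean.
    """
    w_model = 1.0 / model_var
    w_obs = 1.0 / obs_var
    estimate = (w_model * model_value + w_obs * obs_value) / (w_model + w_obs)
    # The variance of the combined estimate is smaller than either input.
    estimate_var = 1.0 / (w_model + w_obs)
    return estimate, estimate_var


# Hypothetical temperatures in kelvin: the model says 280 (variance 4),
# the observation says 282 (variance 1).
estimate, estimate_var = least_squares_estimate(280.0, 4.0, 282.0, 1.0)
print(estimate, estimate_var)
```

The estimate lies closer to whichever source has the smaller error
variance, which is the behaviour that the later schemes generalize to
many variables.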

2. Numerical Weather Prediction

Numerical weather prediction was attempted first by Lewis F. Richardson
in 1922. In his well known experiment, the equations of atmospheric
dynamics - which were meant to represent the air flow over central
Europe - were solved numerically. Before digital computers this was a
lengthy procedure and involved teams of assistants. Although the
general philosophy of weather forecasting today is the same as that of
Richardson, his trial failed. The observations had not been
assimilated properly, leading to an unbalanced (badly initialized)
initial state. The initial errors were amplified by the model. This
led to the belief that numerical weather prediction was not feasible,
and further research stalled.

3. Fresh Attempts

Digital computers developed during the Second World War opened new
opportunities for researchers to tackle the weather forecasting
problem. A pioneering group at the Institute for Advanced Study in
Princeton in the United States, consisting of J. Charney, R. Fjortoft
and J. von Neumann, used the ENIAC (Electronic Numerical Integrator
and Computer) - at the time a state-of-the-art computer - to solve a
set of flow equations. They did not use the same equation set as
Richardson, but instead applied a simplified equation (the barotropic
vorticity equation), which could not support the unbalanced modes
which destroyed Richardson's analysis. This choice behaved as a kind
of filter which has analogues in the more modern assimilation
schemes. The success of their experiment put numerical weather
prediction back on the agenda.

4. Objective Analysis

Meteorological observations are generally non-uniformly distributed and
have errors associated with them. Systematically putting them into a
form which can be used by a numerical model was first done by H.
Panofsky in 1949. He used a primitive form of initialization which
involved simple interpolation of the measurements to the model's grid
positions by curve fitting and with a weighting which depended upon
the accuracy of each measurement. This analysis used the method of
least squares.
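
The following is a much simplified sketch of this kind of analysis,
not a reconstruction of Panofsky's scheme. It assumes one spatial
dimension and a straight line as the fitted curve. Irregularly placed
observations are fitted by weighted least squares, with each weight
taken as the inverse of an assumed error variance, and the fitted line
is then evaluated at the grid positions. All names and values are
hypothetical.

```python
def weighted_line_fit(positions, values, error_vars):
    """Fit value = intercept + slope * position by weighted least squares.

    Each observation is weighted by 1 / (its error variance), so the
    more accurate measurements pull the fitted line harder.
    """
    weights = [1.0 / v for v in error_vars]
    s_w = sum(weights)
    s_x = sum(w * x for w, x in zip(weights, positions))
    s_y = sum(w * y for w, y in zip(weights, values))
    s_xx = sum(w * x * x for w, x in zip(weights, positions))
    s_xy = sum(w * x * y for w, x, y in zip(weights, positions, values))
    # Solve the 2x2 normal equations for intercept and slope.
    det = s_w * s_xx - s_x * s_x
    slope = (s_w * s_xy - s_x * s_y) / det
    intercept = (s_xx * s_y - s_x * s_xy) / det
    return intercept, slope


def analyse_to_grid(grid, positions, values, error_vars):
    """Evaluate the fitted line at each grid position."""
    intercept, slope = weighted_line_fit(positions, values, error_vars)
    return [intercept + slope * x for x in grid]


# Hypothetical observations at irregular positions; the third is the
# least accurate (largest error variance).
obs_positions = [0.3, 1.1, 2.7, 3.2]
obs_values = [1012.0, 1010.5, 1004.0, 1006.1]
obs_error_vars = [1.0, 1.0, 9.0, 1.0]
grid_positions = [0.0, 1.0, 2.0, 3.0, 4.0]
print(analyse_to_grid(grid_positions, obs_positions, obs_values, obs_error_vars))
```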

5. The Data Assimilation Cycle

Simple interpolation of existing observations to data voids is not
necessarily consistent with the overall fluid flow. B. Gilchrist and
G. Cressman recognized that although some regions lacked observations,
information could be provided by the previous forecast for the
purposes of initialization. This furthered the objective analysis
procedure and gave rise to the notion of the data assimilation cycle
incorporating a background field. These ideas were developed further
by P. Bergthorsson and B. Doos (1955) and P. Thompson (1961).

By the early 1960s, the major national weather centres of the
world were using these numerical weather prediction ideas - with the
data assimilation cycle - on an operational basis to guide
forecasters.
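
The cycle can be sketched for a single scalar variable as follows.
This is an illustration under stated assumptions rather than any
centre's procedure: the forecast model is a hypothetical toy (simple
multiplication by a growth factor), the error variances are held
fixed, and a missing observation is marked by None. Where there is no
observation, the previous forecast (the background) is carried forward
unchanged.

```python
def forecast(state, growth=1.02):
    """Hypothetical toy model: advance the state by one cycle."""
    return growth * state


def analyse(background, background_var, obs, obs_var):
    """Least-squares blend of the background and one observation."""
    gain = background_var / (background_var + obs_var)
    return background + gain * (obs - background)


def run_cycle(first_guess, observations, background_var=1.0, obs_var=1.0):
    """Alternate analysis and forecast steps over a list of observations."""
    analyses = []
    background = first_guess
    for obs in observations:
        if obs is None:
            # Data void: the background is the only information available.
            analysis = background
        else:
            analysis = analyse(background, background_var, obs, obs_var)
        analyses.append(analysis)
        # The forecast from this analysis is the next cycle's background.
        background = forecast(analysis)
    return analyses


# Hypothetical sequence with one missing observation.
print(run_cycle(10.0, [12.0, None, 11.0]))
```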

6. Optimal Interpolation

The `interpolation' procedures used to form initial conditions were
formalized in the 1970s to account properly for both model and
observation errors. The use of observations with models, combined in
a statistical sense via the method of least squares, was intended to
give a `best-fit'. This was named "optimal interpolation". The set of
approximations needed to make the method practical meant that the
results of optimal interpolation were far from optimal. One major
limitation was the computer power available which meant that a truly
global analysis was not possible, and the assimilation of data into
separate domains was used instead.
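
The statistical combination is commonly written as the analysis
x_a = x_b + K (y - H x_b), with the gain K = B H^T (H B H^T + R)^-1,
where x_b is the background, y the observations, H the operator that
maps the model state to the observed quantities, and B and R the
background and observation error covariances. This notation is the
conventional one and does not appear in the text above. The sketch
below assumes the simplest case of a single observation located at
one grid point, so that the matrix inverse reduces to a division; all
names and numbers are hypothetical.

```python
def oi_update(background, b_cov, obs, obs_var, obs_index):
    """Analysis x_a = x_b + K (y - H x_b) for one observation.

    H selects the grid point obs_index, so H B H^T + R is the scalar
    b_cov[obs_index][obs_index] + obs_var, and B H^T is the
    corresponding column of the background error covariance b_cov.
    """
    innovation = obs - background[obs_index]
    denom = b_cov[obs_index][obs_index] + obs_var
    gain = [row[obs_index] / denom for row in b_cov]
    return [xb + k * innovation for xb, k in zip(background, gain)]


# Two grid points with correlated background errors; only the first
# point is observed.
background = [280.0, 281.0]
b_cov = [[1.0, 0.5],
         [0.5, 1.0]]
analysis = oi_update(background, b_cov, obs=282.0, obs_var=1.0, obs_index=0)
print(analysis)
```

In this example the unobserved second point is also adjusted, because
the off-diagonal covariance spreads the observed information to its
neighbour.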

7. Variational Data Assimilation

A new approach to producing the best fit, which came on line
operationally in the mid 1990s, was variational data assimilation
("Var."). The method itself - which finds the best fit by again
minimizing the squared deviations of the analysis from the
background and from the observations in an iterative fashion (using a
descent algorithm) - was actually conceived in the 1970s. The Var. procedure
allows the assimilation of observed quantities which are not
themselves model variables. This advantage means that radiances, as
measured by the many `Earth observation' satellites, can be exploited
to the full. The advances in computational power available by the 1990s
meant that a truly global analysis could be made. The `Holy Grail'
of data assimilation is the so-called 4d-Var. in which observations
are digested in the model at their proper time. A `cut-down'
version, 3d-Var., is often used instead, which requires somewhat less
computer power, but does not resolve the true observation times. All
measurements made inside a time window (normally 6 hours) are taken
as simultaneous.
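
A minimal scalar sketch of the variational idea follows. It ignores
the time dimension altogether, so it is closer in spirit to 3d-Var.
than to 4d-Var. The cost function penalizes the squared departure of
the state from the background and of the simulated observation from
the measured one, and is minimized by plain gradient descent. The
observation operator x**4 is a hypothetical stand-in for a nonlinear
relation between a model variable and a measured radiance; it is not
a real radiative transfer model, and the step size and all values are
assumptions chosen so that the iteration converges.

```python
def obs_operator(x):
    """Hypothetical nonlinear map from model variable to observed quantity."""
    return x ** 4


def obs_operator_slope(x):
    """Derivative of obs_operator with respect to x."""
    return 4.0 * x ** 3


def cost(x, xb, b_var, y, r_var):
    """J(x) = (x - xb)^2 / (2 b_var) + (y - h(x))^2 / (2 r_var)."""
    return (0.5 * (x - xb) ** 2 / b_var
            + 0.5 * (y - obs_operator(x)) ** 2 / r_var)


def cost_gradient(x, xb, b_var, y, r_var):
    """dJ/dx, needed by the descent algorithm."""
    return ((x - xb) / b_var
            - obs_operator_slope(x) * (y - obs_operator(x)) / r_var)


def minimize_cost(xb, b_var, y, r_var, step=5e-4, iterations=200):
    """Fixed-step gradient descent, starting from the background."""
    x = xb
    for _ in range(iterations):
        x -= step * cost_gradient(x, xb, b_var, y, r_var)
    return x


# Background 2.0 (variance 0.04); the observation is the value the
# operator would give for a state of 2.1 (variance 1.0).
xb, b_var, y, r_var = 2.0, 0.04, 2.1 ** 4, 1.0
xa = minimize_cost(xb, b_var, y, r_var)
print(xa, cost(xb, xb, b_var, y, r_var), cost(xa, xb, b_var, y, r_var))
```

The analysis is found without ever inverting the observation operator;
only the operator and its derivative are evaluated.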

Data assimilation remains a rich and constantly evolving area for
development with many centres around the world trying to use and
refine the method. With the availability of increasing amounts of
good quality measurements, particularly from satellites, data
assimilation has proved to be the best way of using these data for
better weather forecasts and for research data sets to help scientists
understand our environment.

There is always scope for
improvements to the methodologies and algorithms. Although researchers
are now aware of the `correct' approach to data assimilation, its
computational cost is still usually prohibitive. Even as computational
power increases, there remains a need for more accurate,
efficient and imaginative solutions.