ADAPTIVE OPTICS
Adaptive Optics is a general
technology for correcting images of objects seen through an inhomogeneous
medium. Originally developed by astronomers and the military to study
objects outside the earth’s atmosphere, the technology now has
applications in many different fields, from biological imaging to
thermonuclear fusion.

Introduction

Temperature fluctuations in the
earth’s atmosphere cause local changes in the refractive index of air. These changes cause the twinkling of stars and the mirage seen when
driving on a hot road. Light from a
star comes through space as a perfectly flat and smooth wavefront. As the light passes through the atmosphere, the wavefront, although still smooth on small scales,
becomes "crinkled", degrading the resolution of a ground-based telescope.

One way to overcome this problem is to
send the telescope into space, as was done with the highly successful
Hubble Space Telescope. Adaptive Optics tries to obtain similar imaging
performance from the ground by measuring the wavefront in real time and then
correcting the distortion using a deformable mirror. The minimum elements of any Adaptive
Optics (AO) system are therefore a device for actively correcting the wavefront
(the “deformable mirror”), a device for providing correction signals
(the “wavefront sensor”) and a computer to control the whole system.

How do you Build an AO System?

When my group started building AO
systems in the early 1990s, there were very few components available to
build an AO system and we had to build the deformable mirror, wavefront
sensor and even the high-speed computer ourselves. On the plus side, this gave us a lot of experience and also
meant that we were able to make many different types of deformable
mirrors with different geometries, including some new designs. The
original system (called “ChAOS”) now resides in the Adler Planetarium
in Chicago as a historical exhibit. It still looks good.

Making the Deformable Mirror

Our deformable mirrors were built using tubes of piezoelectric (PZT) material
about 1 inch long. These tubes change their length by a few microns when a
voltage is applied across the inner and outer surfaces of the tube. A
glass ball is used to interface each PZT tube with the faceplate. The
actuators were assembled in a baseplate and glued together so that the tops of the glass
spheres were all in a plane. A thin flat faceplate, 1 mm thick,
was held in a vacuum chuck and glued onto the actuators, producing a
deformable mirror that was flat to about ½ micron and could be moved
over a range of 4 microns. Making these deformable mirrors is well
within the capabilities of a skilled amateur and full details of how
these mirrors are made are given in Mike Smutko's paper.
A modest deformable mirror can be built for a few hundred dollars. Even
lower-cost mirrors can be made from bimorph plates. Newer
devices, using MEMS technology, can be bought ready-made for a few
thousand dollars from ThorLabs. A photograph of a collection of our DMs is
shown below. These have 91 or 201 actuators. The best DMs that we make use a quasi-hex geometry, which reduces waffle pattern noise.

Driving
all these deformable mirrors
is probably harder than making the mirror. Each actuator, whether in a
piezo, bimorph or MEMS device, requires a few hundred volts to operate,
and amplifiers operating at these voltages are both expensive and power hungry.
However, the piezo
actuators are built from a ceramic material and are basically
capacitors. This enabled us to use a multiplexed drive system in which
a single high voltage amplifier drove up to 16 actuators. It took 5
microseconds to charge each piezo so that the shape of the mirror could
be updated in 100 microseconds.
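
The timing budget of the multiplexed drive can be checked with a few lines of arithmetic. This is only a sketch: the figures of 16 actuators per amplifier and 5 microseconds per charge come from the text, while the function name drive_budget and the assumption that all amplifiers run in parallel are illustrative.

```python
import math

ACTUATORS_PER_AMPLIFIER = 16   # from the text: one amplifier drives up to 16 actuators
CHARGE_TIME_US = 5             # from the text: time to charge one piezo, in microseconds


def drive_budget(n_actuators):
    """Return (amplifiers needed, microseconds for one full mirror update)."""
    amplifiers = math.ceil(n_actuators / ACTUATORS_PER_AMPLIFIER)
    # Assumption: every amplifier steps through its own actuators one at a
    # time while the other amplifiers do the same in parallel, so the
    # fullest channel sets the update time.
    busiest_channel = min(n_actuators, ACTUATORS_PER_AMPLIFIER)
    return amplifiers, busiest_channel * CHARGE_TIME_US


for n in (91, 201):
    amps, update_us = drive_budget(n)
    print(f"{n} actuators: {amps} amplifiers, {update_us} microseconds of charging per update")
```

Under these assumptions a full channel needs 16 x 5 = 80 microseconds of charging, which fits inside the 100 microsecond update time quoted above and leaves some margin for switching.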

Building the Wavefront Sensor

Although other types of
wavefront sensor - shearing interferometers, Mach-Zehnder
interferometers, pyramid prisms, curvature sensors - may have
significant advantages for your application,
the classical wavefront sensor
is the Shack-Hartmann sensor. This device measures the gradient of
the wavefront across the pupil rather than the wavefront itself and
consists of a lattice of small lenses (lenslets) arranged in a
regular grid and a detector. The pupil of the telescope is usually
imaged onto the
lenslet array so that, when the telescope is pointed at a bright star,
a series of images of the star is formed in the focal plane of the
lenslets. A detector, usually a CCD, digitizes these images and the
position of each image is determined by a computer. If the wavefront at
the telescope entrance pupil is flat, and the optics are perfect, the
stellar images will form a regular array on the detector. In practice,
each image will be displaced by an amount that is approximately
proportional to the average gradient of the wavefront across its
subaperture. The sensor measures this average gradient, sampled at
the subaperture positions. It is usual to place the wavefront sensor
after the deformable mirror so that we actually measure the difference
between the mean slope of the atmosphere and the mean slope of the DM,
averaged over each subaperture. We can also simplify the mathematics and
reduce some systematic errors if we image the deformable mirror onto
the lenslet array so that the deformable mirror actuators
are accurately
mapped onto the corners of the subapertures.
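
As an illustration of how a computer can turn one subaperture image into a spot position, here is a minimal sketch of an intensity-weighted centroid. It is one common choice and not necessarily the algorithm used in ChAOS; the function names and the tiny 4x4 test images are hypothetical. Converting the displacement in pixels into a wavefront slope additionally needs the pixel size and the lenslet focal length, which are not modelled here.

```python
def spot_centroid(subimage):
    """Intensity-weighted centre (x, y), in pixels, of one subaperture image.

    subimage is a list of rows of non-negative pixel values.
    """
    total = sum(sum(row) for row in subimage)
    if total <= 0:
        raise ValueError("no light in this subaperture")
    x = sum(col * value for row in subimage for col, value in enumerate(row)) / total
    y = sum(r * value for r, row in enumerate(subimage) for value in row) / total
    return x, y


def spot_displacement(subimage, reference):
    """Spot shift (dx, dy) relative to the position recorded for a flat wavefront."""
    cx, cy = spot_centroid(subimage)
    return cx - reference[0], cy - reference[1]


# Hypothetical data: a spot centred in the subaperture, then shifted one pixel in x.
flat = [[0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0]]
tilted = [[0, 0, 0, 0],
          [0, 0, 1, 1],
          [0, 0, 1, 1],
          [0, 0, 0, 0]]

reference = spot_centroid(flat)
print(spot_displacement(tilted, reference))
```

Recording the reference positions with a calibration source, rather than assuming a perfect grid, removes fixed offsets in the lenslets and optics from the measured slopes.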

Choosing the Detector

If
you are not photon limited, a commercial
CCD or similar camera with sufficient pixels to adequately sample the
Shack-Hartmann spots (say 4 to 8 CCD pixels across the FWHM of the
image from a given subaperture) will meet the needs of a low bandwidth
AO system. Unless you are using a very poor
detector, or a silly algorithm, to work out the position, oversampling
does not significantly increase the accuracy and usually results in a
lower read-out speed, increased read-out noise and possibly higher
cost. The detector
usually has a fixed read noise for every pixel read out (hence the
name). If photon noise or read-out speed IS a significant problem, then
you must trade
the number of pixels per subimage against the linearity of the measurement.
The
more pixels you have, the more linear the slope measurements but,
usually, the higher the measurement noise and the longer the read-out time.
Adequate bandwidth is just as important as low noise.

In
the limit, you use the detector as
a quad cell and arrange that the lenslet images of the source are at
the corners of
adjacent pixels. While the quad cell minimizes the number of pixels used to
measure
the position, the downside is that
(1) the measurement is non-linear: the
quad cell is most sensitive when the spot is centered at the 4 corners
of adjacent pixels.
(2) the gain (signal out/movement of the spot), even when the spot is centered on
the corners of the pixels, depends on the size and shape of the spot.
(3) alignment issues become important.
Because the normal AO system
is a feedback device, this non-linearity is not very important, since
the servo is always trying to move the actuators to force the
spots to their null position. Similarly, a change in spot size only
affects the gain of the system, which can, in principle, be compensated
for by the control system. However, you do have to match very carefully
the registration of the lenslets and detector pixels, so that
there is an integral number of pixels between adjacent subapertures. For
some
reason, the lenslet manufacturers do not often do this and you may have
to re-image and re-scale the lenslets onto the detector pixels.
The simplest low noise AO systems (such as ChAOS) carefully image the
DM
actuators onto the corners of the subapertures and then arrange
re-imaging and re-scaling optics between the lenslet array and detector.
The Shack-Hartmann sensor is thus not without its problems. Its most significant advantage is perhaps that its operation is easy to understand in principle.
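
The quad-cell behaviour described above (non-linear response, and a gain that depends on spot size) can be illustrated with an idealized model. This sketch assumes a circular Gaussian spot of width sigma whose light falls entirely on the 2x2 pixel block; real spots and pixels will differ, and the function names are hypothetical.

```python
import math


def quad_cell_x(dx, sigma):
    """Normalized x signal (right - left) / (right + left) for a Gaussian spot.

    dx is the spot offset from the shared pixel corner and sigma is the
    Gaussian width of the spot, both in pixels.
    """
    # The fraction of a Gaussian to the right of the boundary is
    # 0.5 * (1 + erf(dx / (sqrt(2) * sigma))), so right minus left is the erf term.
    return math.erf(dx / (math.sqrt(2.0) * sigma))


def quad_cell_gain(sigma):
    """Slope of the signal at the null position: signal out per pixel of movement."""
    return math.sqrt(2.0 / math.pi) / sigma


for sigma in (0.5, 1.0):
    signals = [round(quad_cell_x(dx, sigma), 3) for dx in (0.0, 0.25, 0.5, 1.0, 2.0)]
    print(f"sigma = {sigma}: gain at null = {quad_cell_gain(sigma):.3f}, signals = {signals}")
```

In this model the signal saturates once the spot has moved by more than about its own width, and halving the spot size doubles the gain at the null, which is why the control system has to allow for changes in spot size.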

In this section I will
assume that you have built a Shack-Hartmann wavefront sensor that allows you
to image the DM actuator pattern accurately onto the lenslet array and
that you have a two dimensional detector to image the spots.
After some processing of the CCD data you will end up with a series of
measurements of the average slope of the wavefront across each
subaperture. This
is called the slope vector, although, in almost all calculations, what
we actually need is the phase difference across the subaperture, which is the slope
times the subaperture size, because most ways of
reconstructing the wavefront
from this data rely on a model relating these phase difference
measurements to the
phase of the wavefront at the edges of the subaperture.

The first step is to define the
phase differences in terms of the phase points. The figure left shows
one such arrangement, called the
Fried geometry, often used with the Shack-Hartmann sensor. In this
model, the first x phase difference is given by 0.5(2+5) -
0.5(1+4), where the integers denote the phases at the correspondingly numbered actuator positions in the diagram. Different AO systems may have different geometries, but we can always define all the slopes
measured by the wavefront sensor in terms of the actuator positions as a matrix equation:

s = A phi

where s is a vector containing a list
of all the phase differences, phi is the vector of phases and A is a very sparse matrix
called the geometry matrix. There exists a second, generalized inverse matrix, usually called
A+, which is a non-sparse matrix relating the phases to the phase differences:

phi = A+ s

This matrix gives us a recipe for
calculating the phases from the phase differences: we take all the phase differences that we
measure and weight each one with a number that depends on the phase
position j and the slope number i:

phi_j = sum over i of (A+)_ji s_i

There has been considerable
discussion, often heated, over what the “best” coefficients are, but
these differences are often only important when the source is faint and
we
are trying to optimize the signal/noise of the system. We should note
that most gradient and curvature wavefront sensors do not sense all
possible
modes. For instance, the Fried geometry, discussed above, cannot
measure
the so-called "Waffle
mode". The Waffle
mode occurs when the actuators are moved in a checkerboard pattern (all
white squares change position while the black squares remain
stationary). This pattern can build up over time; it is reduced by
using different actuator geometries or by filtering the control signals so
as to suppress this mode. We should also note that the geometry
model is only an approximation to the wavefront. Even if
the actuator positions reflect the true positions of the wavefront at
these points, there are high frequency components in the wavefront that
cause system errors. This error (usually called "aliasing") can be
reduced by proper design but introduces an additional term into the
error budget.

Working out the reconstruction matrix

We need to work out an inverse to the geometry matrix.
Starting with the original matrix equation relating slopes to phase
points, we note that A is not square (there are about twice as many slopes as phase points)
so that we cannot invert A directly. If we multiply both
sides by A transpose we get a square matrix A transpose A and, following
the steps shown left, we can at least write down an equation for the phase points in
terms of
the slopes. This reconstruction matrix is usually called the A+ matrix.
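
Written out, the manipulation runs as follows. This is a sketch of the standard least squares (normal equation) steps in LaTeX notation, with s, A and \phi as defined earlier and A^{+} denoting the reconstruction matrix; the last line assumes that A^{T} A can actually be inverted.

```latex
\begin{align}
s &= A\,\phi \\
A^{T} s &= A^{T} A\,\phi \\
\phi &= \left(A^{T} A\right)^{-1} A^{T} s \;=\; A^{+} s,
\qquad A^{+} \equiv \left(A^{T} A\right)^{-1} A^{T}
\end{align}
```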

We have, of course, still not shown that we can invert A transpose A and, indeed, if
you try this the inversion will fail. To understand why, and what to do
about it, we should look at the simple 3x3 array drawn in the last section.
The first x slope is given by a Fried geometry as 0.5(2+5) -
0.5(1+4). If we add a constant to either (2 and 4) or (1 and 5) the
resulting slope is unchanged, even though the shape of the wavefront is very different. This is just an artifact of the original
equation but is the reason for the "Waffle mode": you get the same
slopes if you piston all the odd or all the even phase points. You can
overcome this mathematically for the inversion problem either by setting one odd and one even point to
zero, or (better) by setting the sum of all the odd points, and
the sum of all the even points, to zero, the so-called minimum norm
solution. The mode is still undetected and can build up with time, but we will discuss this issue in the section after next.
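
The blind spot is easy to confirm numerically for a single subaperture. In this sketch the phase values are arbitrary illustrative numbers, and the sign convention of the y difference is an assumption; only the x formula is taken from the text.

```python
def fried_x_difference(p1, p2, p4, p5):
    """First x phase difference of the 3x3 example: 0.5(2+5) - 0.5(1+4)."""
    return 0.5 * (p2 + p5) - 0.5 * (p1 + p4)


def fried_y_difference(p1, p2, p4, p5):
    """Matching y phase difference; the sign convention is an assumption."""
    return 0.5 * (p4 + p5) - 0.5 * (p1 + p2)


# Arbitrary phases at the four corners of the first subaperture.
phases = dict(p1=0.25, p2=-0.5, p4=0.75, p5=0.125)

# Piston one diagonal pair (points 1 and 5) by a constant: a waffle-like change.
waffled = dict(phases, p1=phases["p1"] + 1.0, p5=phases["p5"] + 1.0)

print(fried_x_difference(**phases), fried_x_difference(**waffled))
print(fried_y_difference(**phases), fried_y_difference(**waffled))
```

Both differences come out the same before and after the diagonal piston, even though the two wavefronts differ by a full unit of phase at two of the four corners.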

The
first thing to do is to define how your actuator positions produce the
slopes you hope to measure.
If you use Mathematica you can define this matrix, called the geometry
matrix, for an m x n array of points as follows. This definition adds in the extra
two rows needed to provide the minimum norm solution.
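
For readers without Mathematica, the same construction can be sketched in Python. This is not the Mathematica definition referred to above: the function name fried_geometry, the row ordering (all x differences, then all y differences, then the two extra rows) and the sign convention of the y differences are assumptions. The two extra rows sum the two checkerboard colours, which coincide with the odd and even numbered points whenever the number of columns is odd.

```python
from fractions import Fraction


def fried_geometry(m, n):
    """Geometry matrix for an m x n grid of phase points, numbered row by row.

    Returns a list of rows: (m-1)(n-1) x differences, then the same number of
    y differences, then two rows summing each checkerboard colour.
    """
    half = Fraction(1, 2)
    x_rows, y_rows = [], []
    for r in range(m - 1):
        for c in range(n - 1):
            # Zero-based indices of the four corners of this subaperture.
            tl, tr = r * n + c, r * n + c + 1
            bl, br = (r + 1) * n + c, (r + 1) * n + c + 1
            x = [Fraction(0)] * (m * n)
            y = [Fraction(0)] * (m * n)
            x[tr] = x[br] = half       # right-hand corners
            x[tl] = x[bl] = -half      # minus left-hand corners
            y[bl] = y[br] = half       # lower corners
            y[tl] = y[tr] = -half      # minus upper corners
            x_rows.append(x)
            y_rows.append(y)
    colour_a = [Fraction((r + c + 1) % 2) for r in range(m) for c in range(n)]
    colour_b = [Fraction((r + c) % 2) for r in range(m) for c in range(n)]
    return x_rows + y_rows + [colour_a, colour_b]


for row in fried_geometry(3, 3):
    print(" ".join(f"{str(value):>4}" for value in row))
```

Exact fractions are used so that the entries print as 1/2 and -1/2 and so that any later inversion is free of rounding error.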

We will try this for a 3x3 actuator matrix, setting m=3 and n=3.
The geometry matrix produces an ordered list of slopes from the phases at the actuator positions.
In this example, the geometry matrix, A, looks as follows:

The first row is the first phase difference. The actuator positions run from 1
to 9, left to right along each row starting at the top, so this matrix says that the first x phase difference
is given as 0.5(2+5)-0.5(1+4), where the integer numbers are the actuator
positions. There are four x slopes and then four y slopes. The last two
rows at the bottom add up the phases of all the odd points and all the
even points and represent the unobserved Waffle mode terms. The matrix
A transpose
A is given in Mathematica as

and the final A+ matrix is given by

In
this A+ matrix each
row represents a phase point, which is a sum of the 8 slope values
multiplied by weighting coefficients. The last two columns correspond to the waffle
mode pistons, which are set to zero and not used. Although we have not
shown it here, this is the least squares estimator of the phases from
the phase differences. You must be careful about the numbering of your
actuator positions and phase differences, so that the geometry matrix
you use to
calculate the least squares solution corresponds to the actuator
positions. Phase difference
measurements around the edges and corners should
be treated slightly differently, but this example illustrates the
general principles of reconstruction.

If you have an adequate number of pixels/subaperture, you can (in
principle) simply poke up each actuator in
turn and measure the resulting phase differences, thus measuring the geometry A
matrix rather than calculating it. The measured A matrix can then be used to
calculate the reconstructor M matrix. This turns out to be trickier
than it sounds and, if you are starting off in AO, it may be faster to
put in the required opto-mechanical adjustments and simplify the maths.

Noise Gain

Once
you have the A matrix, you can work out the noise gain of the system.
We can show that the noise gain for the least squares A+ reconstructor
is
given by:

Noise gain = Tr[Inverse[Transpose[ae].ae]]/(m x n)

where ae is the geometry matrix for the m x n array of phase points. The mean square error in calculating the wavefront
is given by this number times the mean square error of measuring the
phase difference across the subaperture. Note that the error is defined in terms of a variance and that the key parameter is
not the mean square slope error but rather the mean square slope error
times the square of the subaperture size.
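
The expression above can be evaluated directly in Python. This is a sketch under assumptions: the geometry matrix is a Fried geometry with two extra rows that sum the two checkerboard colours (row ordering and sign conventions are illustrative), and the helper names are hypothetical. Exact rational arithmetic is used so that the small matrix inversion has no rounding error.

```python
from fractions import Fraction


def fried_geometry(m, n):
    """Fried geometry matrix plus two rows summing each checkerboard colour."""
    half, rows_x, rows_y = Fraction(1, 2), [], []
    for r in range(m - 1):
        for c in range(n - 1):
            tl, tr, bl, br = r * n + c, r * n + c + 1, (r + 1) * n + c, (r + 1) * n + c + 1
            x, y = [Fraction(0)] * (m * n), [Fraction(0)] * (m * n)
            x[tr] = x[br] = y[bl] = y[br] = half
            x[tl] = x[bl] = y[tl] = y[tr] = -half
            rows_x.append(x)
            rows_y.append(y)
    colour_a = [Fraction((r + c + 1) % 2) for r in range(m) for c in range(n)]
    colour_b = [Fraction((r + c) % 2) for r in range(m) for c in range(n)]
    return rows_x + rows_y + [colour_a, colour_b]


def normal_matrix(a):
    """Return Transpose[a].a as a list of rows."""
    cols = len(a[0])
    return [[sum(row[i] * row[j] for row in a) for j in range(cols)] for i in range(cols)]


def inverse(matrix):
    """Gauss-Jordan inverse of a square, non-singular matrix of Fractions."""
    k = len(matrix)
    aug = [list(row) + [Fraction(int(i == j)) for j in range(k)] for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = next(r for r in range(col, k) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [value / scale for value in aug[col]]
        for r in range(k):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def noise_gain(m, n):
    """Tr[Inverse[Transpose[ae].ae]] / (m x n) for the geometry above."""
    inv = inverse(normal_matrix(fried_geometry(m, n)))
    return sum(inv[i][i] for i in range(m * n)) / (m * n)


for size in (2, 3, 4, 5):
    print(size, "x", size, "points: noise gain =", float(noise_gain(size, size)))
```

Running the loop for a few grid sizes gives a direct feel for how the noise gain changes as actuators are added.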

The noise gain does not increase rapidly with the number of actuators.
This is crucial for the whole approach and is certainly not true for a
one dimensional array of actuators. If
we had a single line of phase differences and set the phase at one end
to zero, the phase at the other end is simply the sum of the phase
differences along the line. In this case, if the noise between phase
difference
measurements is uncorrelated, the variance of the phase estimate at the
other end will then be the sum of the variances of the individual phase
difference measurements and the noise gain will increase directly with
the number of phase differences.
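
In symbols, the one-dimensional argument reads as follows. The notation is introduced here for illustration: \Delta_k is the k-th measured phase difference along the line, each with independent noise of variance \sigma^{2}, and the phase at the starting end is set to zero.

```latex
\begin{align}
\phi_N &= \sum_{k=1}^{N} \Delta_k \\
\operatorname{Var}(\phi_N) &= \sum_{k=1}^{N} \operatorname{Var}(\Delta_k) = N\,\sigma^{2}
\end{align}
```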

However, because we are measuring
phase differences in two directions, as a vector, there are a
number of different paths joining two points in the array and many of
the paths are independent. In fact if we have an N x N array of phase
points the number of different loops is nearly equal to N, and we can show that the
noise gain only increases logarithmically with the number of phases.
This
is a feature of using a weighted sum of all the phase differences in
two dimensions. Different reconstruction algorithms have different
noise propagators. For instance, curvature sensors, which measure a scalar field,
have much worse noise gains for large numbers of actuators.

Adaptive Optics as a Feedback Loop

It
is important to realize that
we are actually building a sampled feedback loop, outlined
below. The wavefront sensor attempts to measure the slopes of the
difference
between the atmospheric wavefront and deformable mirror, integrated
over the subapertures and over a sample time period t.
This slope vector is then multiplied by a matrix M to obtain an
updated
vector (or map) of the difference between the wavefront and the
positions of the DM actuators. M can be one of
a large number of different matrices, from sparse iterative matrices to
dense matrices based on the statistics of the wavefront.

The
reconstructor matrix M provides error signals to the actuator
controller and can also control
the spatial smoothing of the deformable mirror. The most common
reconstructor used in adaptive optics is probably the least squares
reconstructor. This minimizes the fitting error between the atmosphere
and DM but assumes that there is no correlation between the phases of
adjacent actuators. The lack of spatial smoothing has been a serious
criticism of this reconstructor. Atmospheric turbulence has the
form of a fractal
and has much more power at large spatial scales, so that the idea that
the atmospheric wavefront can have big swings between adjacent phase
points is physically unrealistic. An enormous amount of time and effort
has gone into working out the optimum M matrix assuming given
atmospheric wavefront statistics. We should note, however, that what we
actually measure is the difference between the atmosphere and the
DM. This naturally takes out the high power at large spatial
scales (in fact that is how AO works) and the correct statistics are
not
those of the atmosphere but those of this difference, which are, very approximately, uncorrelated between actuator positions. The least
squares solution is therefore a better solution than would first appear. Whatever M matrix is used, the new phase differences between atmospheric wavefront and deformable mirror are fed to the control
loop of each actuator. This controller provides the required actuator position. We
often use a simple first-order
difference equation for this step:

a(n+1) = a(n) + g d(n)

where a(n) is the position of an actuator at sample n, d(n) is the newly measured difference at that actuator and g is the loop gain. This says that we just add a weighted value
of the new difference measurement to each of the original actuator
positions to obtain a new position for the next sample.
The bandwidth of
the system is controlled by the sample frequency and the loop
gain of the system. For the least squares reconstructor, the servo
bandwidth is given by and the noise gain of the system
by , where is proportional to the measurement noise of the
wavefront sensor. We should note that the system noise gain
increases rapidly as g approaches 2 and the servo becomes unstable. For this reason the loop gain is rarely set above one.
The approach used in ChAOS was to
have a set of precomputed M matrices, each one optimized for a range of different
observing conditions (source brightness and atmospheric seeing) and to
change this matrix and the servo gain factor independently, by trial and error, so as to obtain the best system
performance.
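
The behaviour of the first-order update, and the statement that the noise gain grows rapidly as g approaches 2, can be checked with a small idealized model. The assumptions here are mine, not the original system's: a pure integrator with no loop delay, a static wavefront, and measurement noise that is uncorrelated from sample to sample. In that model one noise sample reaches the mirror as g, g(1-g), g(1-g)^2 and so on, so the variance ratio is the sum of the squares of those terms, which has the closed form g/(2-g). The function names and test values are hypothetical.

```python
def integrator_step(actuators, differences, gain):
    """One loop update: new position = old position + gain * measured difference."""
    return [a + gain * d for a, d in zip(actuators, differences)]


def loop_noise_gain(gain, terms=5000):
    """Mirror noise variance per unit measurement noise variance.

    Sums the squared response g * (1 - g)**k of the idealized loop to a single
    noise sample; the sum only converges for 0 < gain < 2.
    """
    return sum((gain * (1.0 - gain) ** k) ** 2 for k in range(terms))


# Noise-free convergence onto a static wavefront (illustrative phase values).
wavefront = [0.8, -0.3, 0.1]
actuators = [0.0, 0.0, 0.0]
for _ in range(40):
    differences = [w - a for w, a in zip(wavefront, actuators)]
    actuators = integrator_step(actuators, differences, gain=0.5)
print(actuators)

# Noise gain for several loop gains, next to the closed form g / (2 - g).
for g in (0.25, 0.5, 1.0, 1.5, 1.9):
    print(g, loop_noise_gain(g), g / (2.0 - g))
```

In this model a gain of one passes the measurement noise straight through to the mirror, and anything higher amplifies it, which is consistent with the practice of rarely setting the gain above one.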

There exist general trades between how much computing
we need to do (M can be sparse) and the sample frequency.
Under conditions of high turbulence, significant scintillation exists,
including singularities in the wavefront, and different approaches must
be used.

Calibration

The performance of an AO system
depends on how well you can calibrate out systematic effects. It is
essential to have some point source (often a
single mode fiber) at a convenient position in the AO system so that
you
can measure and remove systematic errors in the system.

Non-Common Path Errors

Looking at the feedback loop diagram above, we see that we actually
control the wavefront reflected from the beamsplitter. Any aberration
introduced into the beam as it passes through the beamsplitter (such
as astigmatism because the plate is in a converging beam) is not
detected and reduces the final image quality. It is usual to introduce
additional optics, such as a thinly wedged plate, into the beam before
the wavefront sensor to correct for this effect. Non-common path errors
are generally difficult to eliminate and are often responsible for poor
correction of the system.