Recapture the moment

By Luke Collins

Published Monday, March 2, 2009

For those amateur photographers who'd prefer their memories to be sharp as well as happy, image processing is coming to the rescue.

There's more, and less, to modern photography than meets the eye. The upsurge in the use of digital cameras has meant that it's easier than ever to capture the 'Kodak moment' - even if it turns out to be a blurry, red-eyed moment.

So far, technology has taught the snap-happy public to care about megapixels and ISO sensitivities. Enthusiasts now argue endlessly in online forums about digital noise, low-light performance, and how many megapixels are really enough to replace physical film.

The rapid evolution of digital cameras has been driven by improved manufacturing processes - making more pixels available - and increased processing power to handle the information harvested.

In some cameras, the processing can be as complex as removing the effects of chromatic aberration and the geometric distortions caused by lenses. In mobile phones, increased processing power is being used to apply the kind of skin-smoothing and blemish-removal tricks that used to be the domain of artificial 'post production', but which are now sold to the Asian phone market as a 'ten years younger' button.

Computational photography

Even as computing power is altering what it means to take a photograph, the way the original image is captured has remained much the same. A lens captures a scene and focuses it on to a flat plane, where a sensor records the light that has arrived.

There is, however, more information in the light that enters the camera than just the colour and intensity recorded on a plane. And there are other ways of recording a scene - ways that can extract some of that extra information - than simply projecting it on to a plane. A new discipline, 'computational photography', is coupling these new ways of capturing images with new ways of processing the resultant data to take photos with surprising new features - such as the ability to change the focus after the shutter has been pressed.

"Conventional cameras don't capture the majority of the information about the geometry of lighting," says Ren Ng, a Stanford University PhD who has started a company called Refocus Imaging to commercialise his work on computational photography. "You lose the information about the individual rays of light."

Computational photography's response is to treat light as a field, much like an electric or magnetic field, and to record, at each position on a sensing surface, both the intensity of the light and the direction from which it came, creating a four-dimensional description of the imaged scene. This gives a much fuller description of the light, as well as opening up the option of adding further dimensions to the description by recording parameters such as wavelength and polarisation state.
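A minimal sketch of the idea, assuming nothing about any particular camera's internals: discretise the light field as a four-dimensional array indexed by direction and position. Summing over the two directional axes then recovers an ordinary photograph, which is exactly the information a conventional sensor records.

```python
import numpy as np

# Illustrative sketch: a discretised 4D light field L[u, v, s, t],
# where (u, v) index the direction a ray arrived from and (s, t)
# index its position on the sensing surface.
U, V, S, T = 5, 5, 64, 64          # 5x5 angular samples, 64x64 spatial
rng = np.random.default_rng(0)
light_field = rng.random((U, V, S, T))

# A conventional photograph discards the directional information:
# each sensor pixel simply sums every ray arriving at that position.
conventional_photo = light_field.sum(axis=(0, 1))

print(conventional_photo.shape)    # (64, 64)
```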

Because a recorded light field carries information about lighting geometry, it is possible to process that information to adjust the image's depth of focus, alter its depth of field, and even change, to a limited extent, the apparent position of the camera. According to Ng, it will also make it possible to 'compute out' aberrations in the lens systems, such as the colour fringes you get in some images, or the geometric distortions caused by extreme optics.
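The refocusing described here can be sketched as a 'shift-and-add' computation: treat each directional sample as a view of the scene from a slightly different point on the aperture, shift each view in proportion to its offset from the aperture's centre, and sum. This is a simplified, integer-shift illustration, not Refocus Imaging's actual algorithm.

```python
import numpy as np

def refocus(light_field, alpha):
    """Synthetically refocus a 4D light field L[u, v, s, t].

    Simplified sketch: each (u, v) slice is treated as a sub-aperture
    view and shifted by an integer amount proportional to its offset
    from the aperture centre before summing. alpha = 0 reproduces the
    ordinary (unshifted) photograph.
    """
    U, V, S, T = light_field.shape
    cu, cv = U // 2, V // 2
    photo = np.zeros((S, T))
    for u in range(U):
        for v in range(V):
            shift_s = int(round(alpha * (u - cu)))
            shift_t = int(round(alpha * (v - cv)))
            photo += np.roll(light_field[u, v], (shift_s, shift_t), axis=(0, 1))
    return photo

rng = np.random.default_rng(1)
L = rng.random((5, 5, 32, 32))
# With alpha = 0 the result is just the conventional photograph.
print(np.allclose(refocus(L, 0.0), L.sum(axis=(0, 1))))
```

Varying alpha sweeps the synthetic focal plane through the scene, which is how a single exposure can yield many differently focused photographs.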

"An aberration is an undesired non-convergence of the rays of light," he said. "But if you have the whole light field you can just move the light to where it is supposed to be. It means we will get lenses in the future that are lighter and cheaper but with more powerful imaging than anything currently available."

Microlenses

So what does it take to capture a light field? Ng's technique is to put a flat array of microlenses close to the digital sensor's surface, between it and the rear element of the lens. Each lens in the microlens array captures light from inside the camera and projects it on to a subset of the sensor pixels behind it: in Ng's PhD work a 4000 × 4000 array of 9µm pixels on a Kodak sensor was overlaid with a 296 × 296 grid of 125µm square lenslets.

"Each microlens in the array is forming a miniature camera that gives a picture of the inside of the [real] camera from that position," said Ng. Using an array of such microlenses captures the entire light field from multiple positions, providing the multiple views of the image that make the various post-processing techniques, such as depth of focus control and even refocusing, possible.
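Some illustrative arithmetic, based on the figures quoted above, shows how the raw sensor data splits into position and direction: the pitch values imply roughly 13 to 14 pixels behind each lenslet per axis. The `pixel_to_lightfield` helper is a hypothetical name for the index bookkeeping, not code from Ng's work.

```python
# Illustrative arithmetic based on the figures quoted in the article.
sensor_pixels = 4000          # pixels per side
pixel_pitch_um = 9.0
lenslets = 296                # lenslets per side
lenslet_pitch_um = 125.0

pixels_per_lenslet = lenslet_pitch_um / pixel_pitch_um
print(round(pixels_per_lenslet, 1))   # ~13.9 pixels per side under each lenslet

# Reading the raw sensor as a light field: a pixel at (x, y) belongs to
# lenslet (x // k, y // k), and its offset (x % k, y % k) within that
# lenslet's miniature image records the ray's direction. k is the
# (integer) number of pixels per lenslet side.
def pixel_to_lightfield(x, y, k=13):
    lenslet = (x // k, y // k)
    direction = (x % k, y % k)
    return lenslet, direction

print(pixel_to_lightfield(100, 40))   # ((7, 3), (9, 1))
```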

Several academic teams in the US are investigating the idea of capturing more information about a scene by partially blocking the aperture of the lens. Think of a camera obscura with not one but two pinholes, side by side, letting in light from the scene outside. If you block one of the holes, you get one view of the scene; if you block the other, you get a slightly different view. If you know how far apart the holes are, then by comparing the two images you can extract some information about the depth of the 3D scene from your pair of 2D pictures of it.
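A toy one-dimensional version of this (not any of the teams' actual code) makes the principle concrete: find the shift, or disparity, that best aligns the two views, then convert it to depth with the standard pinhole triangulation formula, depth = focal length × baseline / disparity. The values below are illustrative.

```python
import numpy as np

def estimate_disparity(view_a, view_b, max_shift=20):
    # Brute-force search: find the circular shift of view_b that
    # best matches view_a.
    best, best_score = 0, -np.inf
    for d in range(max_shift + 1):
        score = np.dot(view_a, np.roll(view_b, d))
        if score > best_score:
            best, best_score = d, score
    return best

def depth_from_disparity(focal_mm, baseline_mm, disparity_mm):
    # Standard pinhole triangulation: depth = f * B / d.
    return focal_mm * baseline_mm / disparity_mm

rng = np.random.default_rng(2)
scene = rng.standard_normal(256)      # a 1D stand-in for a scene
left = scene
right = np.roll(scene, -7)            # the second pinhole's shifted view
print(estimate_disparity(left, right))            # recovers the shift of 7
print(depth_from_disparity(50.0, 10.0, 0.5))      # 1000.0 mm
```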

A team at MIT has used this basic approach to build a light-field camera using a small disc of material with a variety of different-sized and different-shaped holes in it, which goes in the aperture of the camera's main lens. As with the microlens approach, the idea is to capture multiple views of the direction and intensity of the light entering the camera on to a standard sensor array. Unlike the microlens approach, though, using different-sized and different-shaped holes will capture views of the light field which vary not only in position (as with the microlens array) but in other properties as well: imagine two different-sized holes side by side, each encoding a different depth of field.

By choosing different sets of hole sizes and shapes, it should be possible to measure the light field in a variety of ways that provide differing options for post-processing. For example, a grid of regularly-sized holes may be better at extracting the information necessary to alter the camera's apparent point of view than a set of fewer holes with varying apertures, which might be better at extracting depth information. The team calls the approach 'coded apertures', because each pattern of holes codes a different kind of information into the captured image.

Coded apertures

A team at Mitsubishi Electric Research Laboratories (MERL) has taken coded apertures further, showing that it is possible to place the coded mask close to the sensor plane, rather than in the aperture of the lens. As you might expect, there is a trade-off in using these techniques to measure depth: the mask blocks some of the incoming light, and the extra directional information comes at the cost of some spatial resolution.

If you can code apertures to tailor the way the depth information in a lightfield is captured, can you tailor the exposure to similarly useful effect? It turns out that you can, and that it is useful for deblurring images of moving objects.

The problem with blurred objects is that, to remove the blur, you need to find sharp edges in the image that indicate how far the object has moved while the shutter has been open, so that you can use that information to reconstruct an unblurred image. A team at MERL used a ferroelectric shutter in front of an ordinary camera's lens to create multiple shutter openings and closings during a standard exposure. They found that, by using a variety of sub-exposures totalling a correct overall exposure, the final image encoded enough information about the movement of objects to make the deblurring process much more successful.
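The advantage of coding the exposure can be seen in one dimension. Motion blur amounts to convolving the scene with the shutter's on/off pattern over time, and deblurring amounts to dividing by that pattern's frequency response. A conventional shutter - one solid block of 'on' - has exact nulls in its spectrum, so those frequencies are destroyed; a coded pattern (the sequence below is illustrative, not MERL's actual one) keeps every frequency above zero, which makes the deconvolution stable.

```python
import numpy as np

N = 16
flat = np.zeros(N)
flat[:8] = 1                  # conventional shutter: open once, then closed
coded = np.array([1, 1, 0, 1, 0, 0, 1, 0,
                  1, 1, 1, 0, 0, 1, 0, 1], dtype=float)  # illustrative code

flat_response = np.abs(np.fft.fft(flat))
coded_response = np.abs(np.fft.fft(coded))

print(flat_response.min())    # ~0: these frequencies are destroyed by the blur
print(coded_response.min())   # > 0: every frequency survives deblurring
```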

The key to computational photography is that it takes a lot of what is normally done by optics and physical systems and recasts it as mathematics to be done by increasingly powerful embedded processors. Ng admits, for example, that his microlens technique gives up some absolute resolution in the images that it captures, since it is taking multiple measures of the same rays of light. But he argues that many cameras these days have excessive pixel counts anyway, and that his technique can use those pixels to capture enough information to produce a sharper image than the optics of the system alone could resolve using all the available pixels.

The wider point is that the rate of improvement in camera systems, which has been slowing as new sensors outstrip the ability of lenses to use them effectively, will soon move on to a price/performance improvement curve driven by Moore's Law. Beyond that, Ng argues that the nature of photography will change.

"Henri Cartier-Bresson said the point of photography was 'to capture the decisive moment'," says Ng. He argues that the ability to adjust the focus of a photo after it has been recorded will make capturing these 'decisive images' much easier.

"Anyone who sees one of these images knows that something has changed," he says. "We have the innate sensation that you have to focus on something in order to see it but with a single exposure with a light-field camera, that is no longer true."