Computational Photography

Four-Dimensional Images

A digital camera sensor registers the intensity of light falling on each photosite but tells us nothing about where the light came from. To record the full light field we would need a sensor that measured both the intensity and the direction of every incident light ray. Thus the information recorded at each photosite would be not just a single number (the total intensity) but a complex data structure (giving the intensity in each of many directions). As yet, no sensor chip can accomplish this feat on its own, but the effect can be approximated with extra hardware. The underlying principles were explored in the early 1990s by Edward H. Adelson and John Y. A. Wang of the Massachusetts Institute of Technology.

One approach to recording the light field is to construct a gridlike array of many cameras, each with its own lens and photosensor. The cameras produce multiple images of the same scene, but the images are not quite identical because each camera views the scene from a slightly different perspective. Rays of light coming from the same point in the scene register at a different point on each camera's sensor. By combining information from all the cameras, it's possible to reconstruct the light field. (I'll return below to the question of how this is done.)
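The parallax between neighboring cameras can be illustrated with a toy model. The sketch below assumes idealized pinhole cameras that all lie in the plane z = 0 and look straight down the z axis; the function names (project, camera_grid), the focal length and the camera spacing are my own choices for illustration, not details of any actual array.

```python
def camera_grid(rows, cols, spacing):
    """Return (x, y) positions of a rows-by-cols grid of cameras.

    Hypothetical layout: cameras sit in the plane z = 0, one every
    `spacing` units, starting at the origin.
    """
    return [(c * spacing, r * spacing)
            for r in range(rows)
            for c in range(cols)]


def project(point, camera_xy, focal_length=1.0):
    """Project a scene point onto the sensor of one pinhole camera.

    `point` is (X, Y, Z) with Z > 0 measured from the camera plane.
    The result is the (u, v) position on that camera's sensor,
    relative to the camera's own optical axis.
    """
    X, Y, Z = point
    cx, cy = camera_xy
    if Z <= 0:
        raise ValueError("point must lie in front of the camera plane")
    u = focal_length * (X - cx) / Z
    v = focal_length * (Y - cy) / Z
    return (u, v)


if __name__ == "__main__":
    # The same scene point lands at a different sensor position
    # in each camera of a small 2-by-3 array.
    scene_point = (0.0, 0.0, 2.0)
    for cam in camera_grid(2, 3, 0.1):
        print(cam, project(scene_point, cam))
```

In this model the shift between adjacent cameras is focal_length × spacing / Z, so a nearby point moves more from one image to the next than a distant one does. That depth-dependent shift is the extra information a single camera cannot supply.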

Experiments with camera arrays began in the 1990s. In one recent project Bennett Wilburn and several colleagues at Stanford University built a bookcase-size array of 96 video cameras, connected to four computers that digest the high-speed stream of data. The array allows "synthetic aperture photography," analogous to a technique used with radio telescopes and radar antennas.

A rack of 96 cameras is not something you'd want to lug along on a family vacation. Ren Ng and another Stanford group (Marc Levoy, Mathieu Brédif, Gene Duval, Mark Horowitz and Pat Hanrahan) implemented a conceptually similar scheme in a smaller package. Instead of ganging together many separate cameras, they inserted an array of "microlenses" just in front of the sensor chip inside a single camera. The camera is still equipped with its standard main lens, shutter and aperture control. Each microlens focuses an image of the main lens aperture onto a region of the sensor chip. Thus instead of one large image, the sensor sees many small images, viewing the scene from slightly different angles.

Whereas a normal photograph is two-dimensional, a light field has at least four dimensions. For each element of the field, two coordinates specify position in the picture plane and another two coordinates represent direction (perhaps as angles in elevation and azimuth). Even though the sensor in the microlens camera is merely a planar array, the partitioning of its surface into subimages allows the two extra dimensions of directional information to be recovered. One demonstration of this fact appears in the light-field photograph of a sheaf of crayons reproduced at right. The image was made at close range, and so there are substantial angular differences across the area of the camera's sensor. Selecting one subimage or another changes the point of view. Note that these shifts in perspective are not merely geometric transformations such as the scalings or warpings that can be applied to an ordinary photograph. The views present different information; for example, some objects are occluded in one view but not in another.
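To make the bookkeeping concrete, here is a minimal sketch of how a flat sensor readout might be reindexed as a four-dimensional array. It assumes an idealized layout in which every microlens covers an exact n-by-n block of pixels aligned with the pixel grid; a real camera would need calibration to find the microlens centers. The names to_light_field and extract_view are hypothetical, chosen for this example.

```python
def to_light_field(raw, n):
    """Reindex a 2D sensor readout as a 4D light field.

    `raw` is a list of rows of pixel values. Each microlens is
    assumed to cover an n-by-n block of pixels. The result is
    indexed as field[s][t][u][v], where (s, t) picks the microlens
    (position in the picture plane) and (u, v) picks the pixel
    under that microlens (direction of the incoming ray).
    """
    if len(raw) % n or len(raw[0]) % n:
        raise ValueError("sensor size must be a multiple of n")
    rows, cols = len(raw) // n, len(raw[0]) // n
    return [[[[raw[s * n + u][t * n + v] for v in range(n)]
              for u in range(n)]
             for t in range(cols)]
            for s in range(rows)]


def extract_view(raw, n, u, v):
    """Assemble the image seen from one direction (u, v).

    Taking the same pixel from under every microlens yields one
    view of the scene, with one output pixel per microlens.
    """
    if not (0 <= u < n and 0 <= v < n):
        raise ValueError("u and v must lie in range(n)")
    rows, cols = len(raw) // n, len(raw[0]) // n
    return [[raw[s * n + u][t * n + v] for t in range(cols)]
            for s in range(rows)]


if __name__ == "__main__":
    # A toy 4-by-4 sensor behind a 2-by-2 grid of microlenses;
    # each pixel value encodes its own row and column.
    raw = [[10 * i + j for j in range(4)] for i in range(4)]
    print(extract_view(raw, 2, 0, 0))  # one point of view
    print(extract_view(raw, 2, 1, 1))  # a slightly different one
```

Choosing a different (u, v) pair corresponds to selecting a different point of view, as with the crayons. The price is resolution: each extracted view has only as many pixels as there are microlenses.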