Cameras are becoming ubiquitous—tireless eyes watching over power plants and subway platforms—but they will only provide protection if they quickly and accurately identify threats in the images they record.

“Recognizing faces and fingerprints or understanding objects and activities in video data is fundamentally a vision problem,” said Fei-Fei Li ’99, an assistant professor of computer science who directs the Vision Lab at Princeton. “The goal is to create computer systems that completely understand the meaning of images.”

Researchers from the Vision Lab, which includes collaborators at the University of Illinois at Urbana-Champaign, took first place in the 2007 Semantic Robot Vision Challenge of the Association for the Advancement of Artificial Intelligence. The software they designed enabled a robot to find objects, including a Shrek DVD and a bottle of Tide laundry detergent, in a real-world setting. The security implications of this level of visual understanding are profound.

Li and her research group use a variety of statistical models to enable computers to recognize objects and activities. For instance, a system used to identify cars might compare a new object to what it has been programmed to recognize as a car, such as an object with four wheels and a windshield. Such systems must be robust enough to accommodate variations in size and appearance, but stringent enough to provide useful information.
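The trade-off between robustness and stringency can be illustrated with a toy sketch. This is not the Vision Lab's method: the group uses statistical models, while the example below is a hand-written rule, and every name in it (`Detection`, `CAR_TEMPLATE`, `car_score`, `is_car`, the length range and the thresholds) is a hypothetical chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """A candidate object described by a few hand-picked features (hypothetical)."""
    wheel_count: int
    has_windshield: bool
    length_m: float


# Hypothetical description of what the system has been "programmed to
# recognize as a car". The length range is an assumed tolerance that lets
# the rule accept cars of different sizes.
CAR_TEMPLATE = {
    "wheel_count": 4,
    "has_windshield": True,
    "length_range_m": (3.0, 6.0),
}


def car_score(obj: Detection) -> float:
    """Return the fraction of template features the object matches (0.0 to 1.0)."""
    matches = 0
    if obj.wheel_count == CAR_TEMPLATE["wheel_count"]:
        matches += 1
    if obj.has_windshield == CAR_TEMPLATE["has_windshield"]:
        matches += 1
    low, high = CAR_TEMPLATE["length_range_m"]
    if low <= obj.length_m <= high:
        matches += 1
    return matches / 3


def is_car(obj: Detection, threshold: float = 0.6) -> bool:
    """Accept the object as a car if enough features match.

    A lower threshold tolerates more variation (more robust); a higher
    threshold rejects more look-alikes (more stringent).
    """
    return car_score(obj) >= threshold


sedan = Detection(wheel_count=4, has_windshield=True, length_m=4.5)
bicycle = Detection(wheel_count=2, has_windshield=False, length_m=1.8)
six_wheeler = Detection(wheel_count=6, has_windshield=True, length_m=5.5)

print(is_car(sedan))                       # matches all three features
print(is_car(bicycle))                     # matches none
print(is_car(six_wheeler))                 # accepted at the lenient default
print(is_car(six_wheeler, threshold=1.0))  # rejected at the strict setting
```

In this sketch the six-wheeled vehicle is accepted or rejected depending only on the threshold, which is the tension the paragraph above describes: too lenient and the label stops being informative, too strict and ordinary variation is missed.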

Using similar techniques, Li is providing the core visual recognition technology for one of the Defense Department’s Urban Reasoning and Geospatial Exploitation Technology projects. In conjunction with researchers at Lockheed Martin, the University of California at Berkeley and the Massachusetts Institute of Technology, she is developing systems to identify 150 objects commonly found in urban settings, including buildings, trees and street lamps.

In another effort, Li and graduate student Juan Carlos Niebles Duque are developing systems that recognize human motion and may in the future alert security camera monitors to suspicious activity. The ultimate goal would be for a system to sound an alarm if, for example, an individual in a subway terminal knelt down to deposit a bag and then took off at a run. This work is complemented by the image-processing research of electrical engineering professor Stuart Schwartz and his graduate students, who are designing computer systems that track the motion of humans and vehicles in video data.

“The data being collected by cameras are absolutely overwhelming, so we need to use computers to pare down the information,” Schwartz said. “Oftentimes, the final decision as to whether something is a threat will be made by a human, but the computer systems will help them know where they should be looking.”