System improves automated monitoring of security cameras

Police and security teams guarding airports, docks, and
border crossings from terrorist attack or illegal entry need to know
immediately when someone enters a prohibited area, and who they are. A network
of surveillance cameras is typically used to monitor these at-risk locations 24
hours a day, but the cameras can generate too many images for human eyes to analyze.

Now, a system being developed by Christopher Amato, a
postdoctoral researcher at the Computer Science and Artificial Intelligence
Laboratory (CSAIL) at the Massachusetts Institute of Technology (MIT), can
perform this analysis more accurately and in a fraction
of the time it would take a human camera operator. "You can't have a
person staring at every single screen, and even if you did the person might not
know exactly what to look for," Amato says. "For example, a person is
not going to be very good at searching through pages and pages of faces to try
to match [an intruder] with a known criminal or terrorist."

Existing computer-vision systems designed to carry out this
task automatically tend to be fairly slow, Amato says. "Sometimes it's
important to come up with an alarm immediately, even if you are not yet
positive exactly what is happening," he says. "If something bad is
going on, you want to know about it as soon as possible."

So Amato and his University of Minnesota colleagues Komal
Kapoor, Nisheeth Srivastava, and Paul Schrater are developing a system that
uses mathematics to reach a compromise between accuracy (so the system does not
trigger an alarm every time a cat walks in front of the camera, for example) and
the speed needed to allow security staff to act on an intrusion as quickly as
possible.

For camera-based surveillance systems, operators typically
have a range of computer-vision algorithms they could use to analyze the video
feed. These include skin detection algorithms that can identify a person in an
image, and background detection systems that detect unusual objects or
movement through the scene.

To decide which of these algorithms to use in a given
situation, Amato's system first carries out a learning phase, in which it
assesses how each piece of software works in the type of setting in which it is
being applied, such as an airport. To do this, it runs each of the algorithms
on the scene to determine how long each takes to perform an analysis and how
certain each is of its answer. It then adds this information to
its mathematical framework, known as a partially observable Markov decision
process (POMDP).
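
As a rough illustration of what such a learning phase might record, the Python sketch below times a detector on sample frames and estimates how often it fires when an intruder is or is not present. It is not the researchers' code: the names DetectorProfile and profile_detector, the use of labeled sample frames, and the choice of true-positive and false-positive rates as the measure of certainty are all assumptions made for this example.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable, Sequence


@dataclass
class DetectorProfile:
    """Hypothetical summary of one algorithm's behavior in a given setting."""
    name: str
    mean_seconds: float         # average wall-clock time per frame
    true_positive_rate: float   # estimate of P(fires | intruder present)
    false_positive_rate: float  # estimate of P(fires | no intruder)


def profile_detector(
    name: str,
    detector: Callable[[Any], bool],
    frames: Sequence[Any],
    labels: Sequence[bool],
) -> DetectorProfile:
    """Run a detector over labeled sample frames and summarize speed and reliability."""
    if not frames or len(frames) != len(labels):
        raise ValueError("frames and labels must be non-empty and the same length")

    hits = 0          # detector fired and an intruder was present
    false_alarms = 0  # detector fired but no intruder was present
    start = time.perf_counter()
    for frame, intruder_present in zip(frames, labels):
        fired = detector(frame)
        if fired and intruder_present:
            hits += 1
        elif fired:
            false_alarms += 1
    elapsed = time.perf_counter() - start

    positives = sum(1 for label in labels if label)
    negatives = len(labels) - positives
    return DetectorProfile(
        name=name,
        mean_seconds=elapsed / len(frames),
        true_positive_rate=hits / positives if positives else 0.0,
        false_positive_rate=false_alarms / negatives if negatives else 0.0,
    )
```

In this sketch, one profile would be built per algorithm and per setting, and the resulting numbers would be the inputs a decision framework needs to weigh speed against certainty.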

Then, for any given task (determining whether an intruder has
entered the scene, for example), the system can decide which of the
available algorithms to run on the image, and in which sequence, to give it the
most information in the least amount of time. "We plug all of the things
we have learned into the POMDP framework, and it comes up with a policy that
might tell you to start out with a skin analysis, for example, and then
depending what you find out you might run an analysis to try to figure out who
the person is, or use a tracking system to figure out where they are [in each
frame]," Amato says. "And you continue doing this until the framework
tells you to stop, essentially, when it is confident enough in its analysis to
say there is a known terrorist here, for example, or that nothing is going on
at all."
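
The loop Amato describes (run an analysis, update what you believe, and stop once you are confident) can be sketched in Python as follows. This is a deliberately simplified, greedy stand-in and not a POMDP solver or the researchers' method: it tracks a single probability that an intruder is present, picks whichever remaining detector promises the most expected information per second, and applies Bayes' rule to the result. The function names, the thresholds, and the per-detector numbers are assumptions made for this example.

```python
import math


def bayes_update(belief, fired, tpr, fpr):
    """Posterior P(intruder) after seeing one detector result."""
    like_present = tpr if fired else 1.0 - tpr
    like_absent = fpr if fired else 1.0 - fpr
    numerator = like_present * belief
    denominator = numerator + like_absent * (1.0 - belief)
    return numerator / denominator if denominator > 0 else belief


def entropy(p):
    """Uncertainty, in bits, of a yes/no belief."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def expected_gain_per_second(belief, tpr, fpr, seconds):
    """Expected drop in uncertainty from one detector, divided by its run time."""
    p_fire = tpr * belief + fpr * (1.0 - belief)
    expected_after = (
        p_fire * entropy(bayes_update(belief, True, tpr, fpr))
        + (1.0 - p_fire) * entropy(bayes_update(belief, False, tpr, fpr))
    )
    return (entropy(belief) - expected_after) / seconds


def monitor(detectors, frame, prior=0.05, alarm_at=0.95, clear_at=0.01):
    """Run detectors one at a time until the belief is confident either way.

    `detectors` maps a name to (run, tpr, fpr, seconds); all values here are
    hypothetical. Each detector runs at most once per frame.
    """
    belief = prior
    remaining = dict(detectors)
    while remaining and clear_at < belief < alarm_at:
        name = max(
            remaining,
            key=lambda n: expected_gain_per_second(belief, *remaining[n][1:]),
        )
        run, tpr, fpr, _ = remaining.pop(name)
        belief = bayes_update(belief, run(frame), tpr, fpr)

    if belief >= alarm_at:
        return "alarm", belief
    if belief <= clear_at:
        return "clear", belief
    return "uncertain", belief
```

A real POMDP policy plans over sequences of actions rather than choosing one step at a time, so it can prefer a slow analysis now if that saves time overall. The greedy rule above only shows the basic trade between information and run time.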

Like a human detective, the system can also take context
into account when analyzing a set of images, Amato says. So for instance, if
the system is being used at an airport, it could be programmed to identify and
track particular people of interest, and to recognize objects that are strange
or in unusual locations, he says. It could also be programmed to sound an alarm
whenever there are any objects or people in the scene, when there are too many
objects, or if the objects are moving in ways that give cause for concern.

In addition to port and airport security, the system could
monitor video information obtained by a fleet of unmanned aircraft, Amato says.
It could also be used to analyze data from weather-monitoring sensors to
determine where tornados are likely to appear, or information from water
samples taken by autonomous underwater vehicles, he says. The system would
determine how to obtain the information it needs in the least amount of time
and with the fewest possible sensors.