12:00, 14 Feb 1996, WeH 7220
Novel Machine Learning Tools for Computer Vision
Shumeet Baluja
Many real-world problems can be solved using machine learning tools. I
will present three learning approaches to problems in the real-world,
vision-based tasks of video indexing, computer-aided surgery,
autonomous navigation, and large-scale fault detection.
The first project is a neural-network-based system for view-based
detection of frontal, upright faces in cluttered scenes. A retinally
connected neural network examines small windows of an image, and decides
whether each window contains a face. The system arbitrates between
multiple networks to improve performance over a single network. We use a
bootstrap algorithm for training the networks, which adds false
detections into the training set as training progresses. This eliminates
the difficult task of manually selecting non-face training examples,
which must be chosen to span the entire space of non-face images.
Comparisons with other state-of-the-art face detection systems are
presented.
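The bootstrap step described above can be sketched roughly as follows.
This is a minimal illustration, not the system's actual code: the names
`bootstrap_negatives`, `train`, `scenery_windows`, `rounds`, and `max_new`
are hypothetical, and `train` is assumed to return a callable that reports
True when it judges a window to be a face.

```python
def bootstrap_negatives(train, faces, initial_nonfaces, scenery_windows,
                        rounds=3, max_new=100):
    """Grow the non-face training set from the model's own false detections.

    train(faces, nonfaces) is a hypothetical trainer returning a callable
    model(window) -> bool. scenery_windows are windows cut from images
    known to contain no faces, so any detection there is a false one.
    """
    negatives = list(initial_nonfaces)
    used = set()  # indices of scenery windows already added as negatives
    model = train(faces, negatives)
    for _ in range(rounds):
        # Collect windows the current model wrongly labels as faces.
        false_pos = [i for i, w in enumerate(scenery_windows)
                     if i not in used and model(w)]
        if not false_pos:
            break  # no false detections left on this scenery set
        chosen = false_pos[:max_new]
        used.update(chosen)
        negatives.extend(scenery_windows[i] for i in chosen)
        # Retrain with the enlarged non-face set.
        model = train(faces, negatives)
    return model, negatives
```

The point of the loop is that the non-face examples are chosen by the
current network's mistakes rather than by hand.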
The second project explores expectation-based, task-specific selective
attention. The central thesis of this research is that for temporally
related inputs, a computed expectation of the next time step's inputs can
provide a basis upon which to focus attention in scene analysis and
anomaly detection. In tasks which contain distracting features or noise,
this method can be used to accentuate the predictable features in the
inputs and de-emphasize the unexpected ones. In tasks such as anomaly
detection, the role of expectation is reversed; the features which could
not be predicted are emphasized. These ideas have been implemented in
systems for autonomous navigation, visual hand-tracking, and the
detection of errors in the plasma-etch step of semiconductor wafer
fabrication.
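One way to picture the two roles of expectation is as a per-feature
weighting. The sketch below is an assumption made for illustration, not
the formulation used in the talk: the function name `attention_weights`,
the `mode` and `scale` parameters, and the particular weighting formula
are all hypothetical.

```python
def attention_weights(expected, actual, mode="filter", scale=1.0):
    """Weight each input feature by how well it matched its expectation.

    expected, actual: equal-length sequences of numbers for one time step.
    mode="filter":  predictable features get weights near 1 (noise removal).
    mode="anomaly": unpredicted features get weights near 1 (role reversed).
    The formula 1 / (1 + error / scale) is an illustrative assumption.
    """
    weights = []
    for e, a in zip(expected, actual):
        error = abs(a - e)
        agreement = 1.0 / (1.0 + error / scale)  # 1.0 when prediction is exact
        weights.append(agreement if mode == "filter" else 1.0 - agreement)
    return weights
```

Multiplying the inputs by the "filter" weights accentuates what was
predicted; the "anomaly" weights do the opposite and highlight what was not.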
Time permitting, I will also present an abstraction of the standard
genetic algorithm, termed Population-Based Incremental Learning (PBIL).
PBIL explicitly maintains the statistics contained in a GA's population,
but removes the crossover operator and redefines the role of the
population. As a result, PBIL is simpler than a standard GA, both
theoretically and computationally. Empirical results show that PBIL is
faster and more effective than standard GAs on a large set of commonly
used benchmark problems. PBIL has been used for discrete point data
selection for object localization (for use in computer-aided surgery),
the design of low-level vision controllers for autonomous navigation, and
the design of high-level reactive controllers for robot vehicles.
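The core of PBIL, as described above, can be sketched in a few lines. This
is a simplified illustration on bit strings, assuming the population is
summarized by a single probability vector; it omits refinements such as
mutating that vector, and the names `pbil`, `n_samples`, `lr`, and `seed`
are illustrative choices rather than anything prescribed by the talk.

```python
import random

def pbil(fitness, n_bits, n_samples=20, lr=0.1, generations=100, seed=0):
    """Simplified PBIL: evolve a probability vector instead of a population.

    fitness: callable taking a list of 0/1 bits and returning a number
             (higher is better).
    Returns the best bit string seen, its fitness, and the final vector.
    """
    rng = random.Random(seed)
    # p[i] is the probability that bit i is 1; start with no bias.
    p = [0.5] * n_bits
    best, best_fit = None, float("-inf")
    for _ in range(generations):
        # Sample a temporary population from the probability vector.
        samples = [[1 if rng.random() < pi else 0 for pi in p]
                   for _ in range(n_samples)]
        winner = max(samples, key=fitness)
        winner_fit = fitness(winner)
        if winner_fit > best_fit:
            best, best_fit = winner, winner_fit
        # Move the vector toward this generation's best sample.
        # No crossover is performed.
        p = [(1.0 - lr) * pi + lr * bit for pi, bit in zip(p, winner)]
    return best, best_fit, p
```

For example, `pbil(sum, 16)` searches for the 16-bit string with the most
ones, a common toy benchmark.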