Visual Feature Learning

Abstract:

Humans
learn robust and efficient strategies for visual tasks through
interaction with their environment. In contrast, most current
computer vision systems have no such learning capabilities.
Motivated by insights from psychology and neurobiology, I combine
machine learning and computer vision techniques to develop
algorithms for visual learning in open-ended tasks. Learning is
incremental and makes only weak assumptions about the task
environment.

I begin by introducing an infinite feature space that contains
combinations of local edge and texture signatures not unlike those
represented in the human visual cortex. Such features can express
distinctions over a wide range of specificity or generality. The
learning objective is to select a small number of
highly useful features from this space in a task-driven
manner. Features are learned by general-to-specific random sampling.
I illustrate this on two different tasks, using closely related
learning algorithms built on the same principles and the same
feature space.
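The general-to-specific sampling idea can be sketched as follows. This is a toy illustration, not the dissertation's implementation: the primitive pool, the conjunction-based feature representation, and the utility function are all hypothetical stand-ins for the actual edge/texture signatures and task-driven selection criterion.

```python
import random

# Hypothetical pool of local primitives, standing in for the local
# edge and texture signatures of the actual feature space.
PRIMITIVES = ["edge_0", "edge_45", "edge_90", "edge_135", "tex_a", "tex_b"]

def specialize(feature):
    """Make a feature more specific by conjoining one more primitive."""
    remaining = [p for p in PRIMITIVES if p not in feature]
    return feature + [random.choice(remaining)] if remaining else feature

def sample_feature(utility, max_size=3, trials=10):
    """General-to-specific random sampling: start from a single primitive
    (most general) and randomly specialize, keeping a candidate only when
    the task-driven utility improves."""
    best = [random.choice(PRIMITIVES)]
    best_u = utility(best)
    for _ in range(trials):
        cand = specialize(best)
        if len(cand) <= max_size and utility(cand) > best_u:
            best, best_u = cand, utility(cand)
    return best, best_u

if __name__ == "__main__":
    random.seed(0)
    # Toy utility: a feature combining an edge and a texture cue is useful.
    def utility(f):
        has_edge = any(p.startswith("edge") for p in f)
        has_tex = any(p.startswith("tex") for p in f)
        return float(has_edge and has_tex) + 0.1 * has_edge
    print(sample_feature(utility))
```

Because sampling starts general and only specializes when utility improves, a small number of highly useful features can be selected without enumerating the infinite space.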

The first system incrementally learns to discriminate visual scenes.
Whenever it fails to recognize a scene, new features are sought that
improve discrimination. Highly distinctive features are incorporated
into dynamically updated Bayesian network classifiers. Even after
all recognition errors have been eliminated, the system can continue
to learn better features, resembling mechanisms underlying human
visual expertise. This tends to improve classification accuracy on
independent test images, while reducing the number of features used
for recognition.
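The error-driven incremental loop described above can be sketched with a deliberately simplified classifier: a naive-Bayes-style model over a growing feature set, standing in for the dissertation's dynamically updated Bayesian networks. All feature names here are hypothetical.

```python
import math
from collections import defaultdict

class IncrementalNB:
    """Simplified stand-in for the incremental classifiers described
    above: per-class Bernoulli counts (with Laplace smoothing) over a
    feature set that grows when recognition errors occur."""

    def __init__(self):
        self.features = []  # features currently used for recognition
        # counts[class][feature] = [count_present, count_absent], Laplace-initialized
        self.counts = defaultdict(lambda: defaultdict(lambda: [1, 1]))
        self.class_counts = defaultdict(int)

    def predict(self, x):
        best, best_score = None, float("-inf")
        for c, n in self.class_counts.items():
            score = math.log(n)
            for f in self.features:
                on, off = self.counts[c][f]
                p = on / (on + off)
                score += math.log(p if x.get(f) else 1 - p)
            if score > best_score:
                best, best_score = c, score
        return best

    def update(self, x, y, new_feature=None):
        """After an error, the caller may supply a newly mined distinctive
        feature; it joins the model and is counted from now on."""
        if new_feature and new_feature not in self.features:
            self.features.append(new_feature)
        self.class_counts[y] += 1
        for f in self.features:
            self.counts[y][f][0 if x.get(f) else 1] += 1
```

A typical cycle calls `predict`, and on a mistake mines a distinctive feature and passes it to `update`; because learning never stops, features can keep being refined even after all training errors are eliminated.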

In the second task, the visual system learns to anticipate useful
hand configurations for a haptically guided, dexterous robotic
grasping system, much as humans pre-shape their hand during a
reach. Visual features are learned that correlate reliably with the
orientation of the hand. A finger configuration is then recommended
based on the expected grasp quality of each candidate.
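The recommendation step amounts to maximizing expected grasp quality over a set of candidate configurations. The sketch below is hypothetical: the configuration names and the quality model are illustrative placeholders, not the grasping system's actual components.

```python
def recommend(orientation_deg, quality_model, configurations):
    """Return the finger configuration with the highest expected
    grasp quality for the predicted hand orientation."""
    return max(configurations, key=lambda c: quality_model(orientation_deg, c))

if __name__ == "__main__":
    # Toy quality model: each configuration is assumed to work best
    # near one preferred hand orientation (hypothetical values).
    preferred = {"precision": 0.0, "power": 90.0, "tripod": 45.0}
    def quality(theta, config):
        return 1.0 - abs(theta - preferred[config]) / 180.0
    print(recommend(50.0, quality, list(preferred)))
```

In the actual system the expectation would come from learned correlations between visual features and grasp outcomes rather than a fixed table.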

The results demonstrate how a largely uncommitted visual system can
adapt and specialize to solve particular visual tasks. Such visual
learning systems have great potential in application scenarios that
are hard to model in advance, e.g., autonomous robots operating in
natural environments. Moreover, this dissertation contributes to our
understanding of human visual learning by providing a computational
model of task-driven development of feature detectors.