Research

How does vision determine the size, shape and boundaries of objects in
our environment?
Research in my laboratory centers on various
aspects of visual perception and the visual control of action.
Recent research on the visual
system has grown into an exciting collaboration among psychologists,
physiologists, computer scientists, and mathematicians. My research
continues to blur the lines between these fields in two ways. First,
traditional psychophysical methods are enhanced using advanced computer
graphics and image processing techniques for stimulus generation and
analysis. Second, both mathematical methods and computer simulations
are used to model the psychophysical results. As much as possible, the
simulation models attempt to reflect a feasible physiological
implementation, as I have a strong interest in neural network models of
vision. Next, I describe each broad area of my research in turn.

Texture-defined
edge based on a difference in texture element
orientation.

Texture and pattern coding. Sometimes when one texture pattern
is placed on a background of another, the two segregate quickly and seemingly
effortlessly into foreground and background. Other times not. Why is this?
What sequence of linear and nonlinear image transformations leads to this
variation in texture segregation performance? Our research in this area
consists of both psychophysical experiments and computational modeling to
determine the details of the visual machinery used to code, interpret and
segregate texture patterns. More recently, we have also looked at the
identification of shapes defined by texture (e.g., letters) and the estimation
of texture properties (e.g., surface roughness) in 3-d rendered scenes. We have examined the cortical coding of 2nd-order patterns by
looking for orientation-selective adaptation of cortical responses to 1st- and
2nd-order patterns using functional magnetic resonance imaging (fMRI) in
collaboration with David Heeger
(NYU). We are also exploring a new model we have developed to account for cortical pattern adaptation, one with implications for computational theory, physiology and perception.
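The sequence of linear and nonlinear transformations mentioned above is often summarized as a filter-rectify-filter (FRF) cascade for detecting 2nd-order (texture-defined) boundaries. Below is a minimal sketch of that idea; the particular filter sizes, wavelengths, and the squaring nonlinearity are illustrative assumptions, not the specifics of our model.

```python
import numpy as np

def gabor(size, wavelength, theta):
    """Oriented Gabor filter (cosine phase), `size` x `size`, odd size."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x**2 + y**2) / (2.0 * (size / 6.0) ** 2))
    return envelope * np.cos(2.0 * np.pi * xr / wavelength)

def conv_same(image, kernel):
    """'Same'-size 2-D convolution via FFT with zero padding."""
    s0 = image.shape[0] + kernel.shape[0] - 1
    s1 = image.shape[1] + kernel.shape[1] - 1
    out = np.fft.irfft2(np.fft.rfft2(image, (s0, s1)) *
                        np.fft.rfft2(kernel, (s0, s1)), (s0, s1))
    r0 = (kernel.shape[0] - 1) // 2
    r1 = (kernel.shape[1] - 1) // 2
    return out[r0:r0 + image.shape[0], r1:r1 + image.shape[1]]

def filter_rectify_filter(image, theta1, theta2):
    """Filter-rectify-filter cascade:
    1st stage: fine-scale oriented linear filter;
    rectification: pointwise nonlinearity (squaring, i.e. local energy);
    2nd stage: coarse-scale oriented filter applied to the rectified output,
    which responds to boundaries defined by a change in texture orientation.
    """
    f1 = gabor(15, wavelength=6, theta=theta1)    # fine-scale stage
    f2 = gabor(61, wavelength=40, theta=theta2)   # coarse-scale stage
    stage1 = conv_same(image, f1)
    rectified = stage1 ** 2                       # energy nonlinearity
    return conv_same(rectified, f2)
```

Varying which 1st-stage orientation feeds which 2nd-stage filter is one way such a model can capture differences in segregation performance across texture pairs.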

This is an animation of our "overt-criterion" task. Observers discriminate between two noisy categories of ellipses that differ in their mean orientation. On each trial, observers rotate a line to indicate their decision criterion and are then scored on a subsequently presented ellipse: given its category membership, they earn points if the ellipse falls on the correct side (clockwise or counterclockwise) of the line's orientation.

Perceptual decision-making. I study how observers make perceptual decisions under uncertainty. Sensory signals are noisy, and an ideal observer will combine such signals with knowledge of their uncertainty, prior expectations and knowledge of potential outcome- and decision-contingent rewards to guide decisions. We ask whether humans act as ideal decision-makers and, if not, where compromises are made or heuristics used. We have shown that orientation estimation appears to be consistent with the ideal-observer model and that humans use a prior distribution of orientations that matches environmental statistics. We have studied how the decision criterion for a perceptual discrimination is placed as a function of rewards, prior probabilities and changing conditions. We have developed a new model of how sensory evidence is accumulated over time, which has implications for modeling reaction-time and cued-response tasks.
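In the textbook case of two equal-variance Gaussian categories along a single decision axis, the criterion that maximizes expected gain has a closed form, shifting away from the category midpoint as priors or rewards become unequal. A sketch of that computation (the parameterization is illustrative, not the model used in our experiments):

```python
import numpy as np

def optimal_criterion(mu1, mu2, sigma, prior1=0.5, gain1=1.0, gain2=1.0):
    """Expected-gain-maximizing criterion for two equal-variance Gaussian
    categories N(mu1, sigma^2) and N(mu2, sigma^2) on one decision axis.

    Respond "category 2" when the evidence exceeds the returned criterion.
    With equal priors and gains this reduces to the midpoint of the means;
    unequal priors or payoffs shift it toward the less favored response.
    """
    prior2 = 1.0 - prior1
    beta = (prior1 * gain1) / (prior2 * gain2)   # likelihood-ratio threshold
    return 0.5 * (mu1 + mu2) + sigma**2 / (mu2 - mu1) * np.log(beta)
```

For example, with equal priors the criterion sits at the midpoint of the two means; making category 1 three times as likely shifts it toward category 2's mean, requiring stronger evidence before responding "category 2".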

This is a
frame from a stimulus movie of a pair of rotating cylinders. The images
include the depth cues of binocular disparity (stereo), motion, texture
perspective and occluding contour.

Sensory cue integration. I have worked extensively on the issue of how the
visual system combines information from multiple sources or cues. This
research has continued for a number of years in
collaboration with
Larry Maloney (NYU),
Marty Banks
(Berkeley), Wendy Adams, and several graduate students and postdoctoral
research associates. This work begins by considering what an ideal decision maker would
do in such a situation. Human performance in cue-combination
tasks is compared to models based on statistical decision theory. In many
cases, Bayesian models are used in which it is understood that information
sources (visual cues or prior knowledge) are uncertain, and should be
combined with reference to the form and amount of uncertainty in each. Our
studies have looked at cue combination in the perception of depth from
multiple depth cues (binocular disparity, motion, texture, shading, contour,
etc.), depth cue disambiguation (in stimuli with contour and shading cues),
depth cue scaling (with multiple cues to the viewing geometry, including the
possible interaction of the motion and stereo cues to compute the viewing
distance) and edge localization (with cues of texture scale, orientation and
contrast). More recent work considers multisensory cue integration including vision, audition, proprioception and touch.
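Under the standard assumptions (independent Gaussian noise on each cue, no prior), the ideal combination rule weights each cue's estimate by its reliability (inverse variance), and the combined estimate is more reliable than any single cue. A minimal sketch:

```python
def combine_cues(estimates, variances):
    """Reliability-weighted (maximum-likelihood) cue combination.

    Assumes independent Gaussian noise on each cue. Each cue's estimate
    is weighted in proportion to its inverse variance; the variance of
    the combined estimate is the reciprocal of the summed reliabilities,
    so it is never larger than the best single cue's variance.
    Returns (combined_estimate, combined_variance).
    """
    weights = [1.0 / v for v in variances]          # reliabilities
    total = sum(weights)
    estimate = sum(w * e for w, e in zip(weights, estimates)) / total
    return estimate, 1.0 / total
```

For instance, combining a disparity-based depth estimate with a texture-based one of four times the variance puts four-fifths of the weight on disparity, mirroring the weight shifts observed psychophysically when one cue is made less reliable.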

This is one of our
subjects
(Julia Trommershäuser) set up for the collection of movement data
while performing a pointing task. Infra-red emitting diodes are
strapped to her finger and arm. Their motion path is tracked in 3D by
the three infrared-sensitive cameras of an
Optotrak system, visible in the background.

Visual control of action. We have also applied
statistical decision theory to modeling visuo-motor control. In this research,
subjects perform pointing or other tasks under tight time constraints.
Subjects earn points (and eventually, money) for fast, accurate
performance of the task (pointing at a target region), but lose points
if they respond late or point towards penalty regions. By measuring outcome
uncertainty (the variance in motor outcome), we can compute the optimal aim
point for any configuration of payoff and penalty regions and values. In a
variety of situations, subjects are optimal or near-optimal in this task. That
is, they earn as many points as would have been earned by an ideal
movement planner having the same movement variability as the subject. Subjects
appear to have available an estimate of their movement variability and take it
into account in movement
planning, even in situations in which that variability has been increased
(artificially) by the experimenter. More recently, we have studied movement planning for reaches and saccadic eye movements, using learning and adaptation experiments to determine the coordinate systems in which movements are planned.
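The optimal-aim-point computation described above can be illustrated in one dimension: given Gaussian motor variability, the expected gain of each candidate aim point is the reward-weighted probability of landing in the target region plus the (negative) penalty-weighted probability of landing in the penalty region. The region geometry and point values below are hypothetical, not those of our actual displays.

```python
import math

def region_prob(aim, sigma, lo, hi):
    """Probability that a Gaussian endpoint centered on `aim` with
    standard deviation `sigma` lands in the interval [lo, hi]."""
    def cdf(x):
        return 0.5 * (1.0 + math.erf((x - aim) / (sigma * math.sqrt(2.0))))
    return cdf(hi) - cdf(lo)

def expected_gain(aim, sigma, target, penalty, gain=100.0, loss=-500.0):
    """Expected score for one aim point: reward for landing in the target
    interval plus penalty for landing in the penalty interval."""
    return (gain * region_prob(aim, sigma, *target) +
            loss * region_prob(aim, sigma, *penalty))

def optimal_aim(sigma, target=(0.0, 10.0), penalty=(-10.0, 0.0)):
    """Grid search for the expected-gain-maximizing aim point
    (hypothetical geometry: target [0, 10], abutting penalty [-10, 0])."""
    grid = [x / 10.0 for x in range(-50, 151)]
    return max(grid, key=lambda a: expected_gain(a, sigma, target, penalty))
```

The qualitative prediction matches the behavior described above: as motor variability grows, the optimal aim point shifts away from the penalty region, trading some probability of hitting the target for a lower probability of incurring the penalty.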

This is an animation of a 3-d shape rocking
back and forth, thus cued by relative motion. The dots that carry
the motion flicker occasionally so as to eliminate the possible
cue of changing local dot density. Despite the elimination of that
cue and the flicker, it is relatively easy to perceive (and to
judge) the 3-d shape.

Depth perception. I am interested in the details of how the visual
system determines depth and object shape using a variety of visual
cues. I have done computational and psychophysical studies concerning several
such cues including the kinetic depth effect, binocular stereopsis, and shape
from texture, contour and shading. Binocular stereopsis is particularly
interesting as the raw information (the disparities in the positions of
features in the images from the two eyes) must be scaled based on estimates of
the gaze distance (vergence) and direction (version), and these can, in turn,
be estimated using both retinal cues (the pattern of vertical disparities) and
extra-retinal cues (knowledge of the eyes' positions).
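The scaling problem described above can be made concrete with the standard small-angle approximation: the depth increment signaled by a fixed horizontal disparity grows with the square of the viewing distance, which is why a distance estimate is needed at all. A sketch (the default interocular distance of 6.5 cm is an illustrative assumption):

```python
def depth_from_disparity(disparity_rad, distance_m, iod_m=0.065):
    """Small-angle approximation for binocular stereopsis: the depth
    increment (in meters) corresponding to a horizontal disparity
    (in radians) at a given viewing distance, scaled by the square of
    that distance divided by the interocular distance."""
    return disparity_rad * distance_m**2 / iod_m
```

The quadratic dependence means the same retinal disparity signals four times the depth when the viewing distance doubles, so errors in the vergence-based distance estimate produce systematic distortions of perceived depth.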

Biography

I have
an enduring interest in the use of computational techniques to study
human vision. My doctoral dissertation concerned the computer
simulation of a neural network model of visual learning. For this work,
I received the Ph.D. from the Department of Computer and Communication
Sciences of the University of Michigan in 1981, having worked primarily
with John Holland. I then moved to New York University and worked as a
postdoctoral research associate with George Sperling, examining aspects
of low bandwidth visual image sequences, in particular as applied to
low bandwidth communication systems for the deaf (involving perceptual
studies of American Sign Language). During that time I also co-wrote
the
HIPS image processing
software.
In 1984 I joined the faculty at NYU, and have continued to work on
problems in visual perception, concentrating on perception of depth and
texture. In 1992-3, I spent a sabbatical year as a National Research
Council Senior Research Associate at NASA Ames Research Center. In the
summer of 1998, I visited the Institut d'Ingénierie de la
Vision, Université Jean Monnet de Saint-Étienne,
collaborating on work on texture appearance. In 1999-2002, I spent a
sabbatical year and much of the subsequent two years at the School of
Optometry, University of California at
Berkeley, working with Prof. Martin S. Banks on various projects in
depth perception and stereopsis, visiting again in 2015-2016.