Eye movements in natural tasks:
A paradigm for understanding the process of visual
perception

My primary research focus is visual perception in everyday life. Most of what we
know about visual perception is based on carefully designed experiments performed
in the laboratory, where conditions can be tightly controlled. While this has
given us a very thorough understanding of the metrics and mechanics of
vision, it tells us little about how we use vision every day.

In order to perceive the world around us, we must move our eyes almost constantly.
We typically make ~2 to 4 eye movements every second; over 100,000 every day. These eye
movements are necessary because of the design of the human eye. Unlike man-made
image sensors such as CCDs or photographic film, the image sensor at the back of
the eye (the retina) is highly anisotropic; the resolution varies by orders of
magnitude across the field. High acuity is only available in a small area at the
center of the retina, so the eyes are moved to 'point to' objects or regions in
the scene that require high acuity.
Eye movements are also made toward task-relevant targets even when high spatial
resolution is not required. These eye movements, made without conscious
intervention, can reveal attentional mechanisms and provide a window into
cognition; they are the focus of our research.
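
The saccade-rate figures above are easy to sanity-check. A minimal sketch, assuming
roughly 16 waking hours per day (an illustrative figure, not a measurement from our
studies):

```python
# Back-of-the-envelope estimate of daily saccade count.
# Assumptions (illustrative, not measured here): 2-4 saccades per second,
# ~16 waking hours per day.
def daily_saccades(rate_per_second, waking_hours=16):
    """Estimate the number of saccades made in a day at a constant rate."""
    return rate_per_second * waking_hours * 3600

print(daily_saccades(2))  # 115200 at the low end of the rate
print(daily_saccades(4))  # 230400 at the high end
```

Either way the total comfortably exceeds 100,000 saccades per day.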

By examining the eye movements of subjects as they perform complex tasks, we
are able to take advantage of this window into cognition, helping us understand
how we gather information from the environment, how we store and recover the
information, and how we use that information in planning and guiding actions.

Recent work in the Visual Perception Laboratory has focused on using the RIT
Wearable Eyetracker to monitor complex, real-life tasks in natural
environments. Recent papers describe these experiments (follow the
recent publications link below).

Roxanne Canosa
has completed her MS thesis, "Eye Movements and Natural Tasks in an
Extended Environment," and is now beginning her doctoral research. Her focus is
to better understand how eye movements aid the process of visual perception, and
to find ways to use that understanding in the design of artificial vision systems.

Work on integrating an eyetracker into a virtual reality HMD is
described in "Development of a Virtual
Laboratory for the Study of Complex Human Behavior."

The design of the human eye reflects the competing evolutionary
demands for high visual acuity and a large field of view. There is simply not
enough neural real estate available in the brain to support a visual system that
has high resolution over the required field of view. Even if we left no room in
the cortex for any other senses (not to mention housekeeping functions like
breathing or keeping the heart beating), the human cortex could not support the
optimal size/resolution sensor. Some animals stay within the design limits by
restricting their field of view (e.g., a hawk); others give up high resolution in
favor of a larger field of view (e.g., a rabbit). Rather than picking one or the
other solution, humans evolved the anisotropic retina with very high spatial
resolution in the center of the visual field (the fovea), surrounded by a much
lower resolution region (the peripheral retina). In the human retina, the
high-resolution fovea encompasses less than 0.1% of the visual field visible at
any instant, and the effective resolution falls by an order of magnitude within a
few degrees from the fovea. This variable-resolution retina reduces bandwidth
sufficiently, but is not an acceptable solution alone. Unless the point of
interest at any moment happened to fall in the exact center of the visual field,
the stimulus would be relegated to the low-resolution periphery. The 'foveal
compromise' was made feasible by the evolution of a complementary mechanism to
move the eyes. In order to ensure useful vision, the eyes must be moved rapidly
about the scene.
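
The foveal-coverage figure above can be sanity-checked with a little angular
arithmetic. The sketch below assumes textbook approximations (a fovea roughly
2 degrees across, a binocular visual field of roughly 200 by 130 degrees);
these values are illustrative, not measurements from our laboratory:

```python
import math

# Rough check that the fovea covers well under 0.1% of the visual field.
# Assumed figures: ~2 deg foveal diameter, ~200 x 130 deg binocular field
# (textbook approximations, treated as flat angular areas).
fovea_area = math.pi * (2 / 2) ** 2   # disc of 1 deg radius, in deg^2
field_area = 200 * 130                # visual field, in deg^2
fraction = fovea_area / field_area
print(f"{fraction:.2%}")              # → 0.01%, comfortably under 0.1%
```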

The first job of an eye movement system is to move the eye quickly from the
current point of gaze to a new location. Vision is blurred while the eye is
moving, so the time during which no clear image is captured on the fovea must
be minimized; eye movements that carry the fovea from one object or point to
another are therefore very rapid.
These saccadic eye movements are among the fastest movements the body can
make; the eyes can rotate at over 500 deg/sec, and subjects make well over one
hundred thousand of these saccades daily. These rapid eye movements are
accomplished by a set of six muscles attached to the outside of each eye. They
are arranged in three agonist-antagonist pairs; one pair rotates the eye
horizontally (left - right), the second rotates the eye vertically (up - down),
and the third allows 'cyclotorsion,' or rotation about the line of sight.

The second class of eye movements maintains clear vision by
stabilizing the retinal image. This stabilization assures that the image of an
object or region in the center of the field-of-view is kept over the fovea.
Sophisticated mechanisms exist to accomplish this goal in the face of eye, head,
body, and object motion. These eye movements are often grouped into four
categories:

The vestibular-ocular reflex (VOR) rotates the eyes
to compensate for head rotation and translation. Rotational and linear
acceleration are detected by the semicircular canals and otolith organs in the
inner ear. The resultant signals are used to command compensating eye movements.

Optokinesis stabilizes the retinal image in the presence of
large-field motion. Retinal slip induced by field motion is used to initiate eye
movements at the appropriate rate to cancel the image motion.

Smooth-pursuit eye movements are similar to
optokinesis, but stabilize a target of arbitrary size rather than
large-field motion. A moving target is required for smooth eye movements; the
eyes cannot move smoothly across a stationary object.

Vergence eye movements counter-rotate the eyes so that
the images of an object at a given depth fall on corresponding locations on the
two retinae.
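
As a rough illustration of the geometry involved in vergence (a sketch under
assumed values, not part of the work described here): with an interpupillary
distance ipd, fixating a target at distance d requires a convergence angle of
about 2 * atan(ipd / (2 * d)):

```python
import math

# Required vergence angle for a target straight ahead.
# ipd_m = 0.063 m is an assumed, typical interpupillary distance.
def vergence_deg(distance_m, ipd_m=0.063):
    """Convergence angle (degrees) to fixate a target at distance_m."""
    return math.degrees(2 * math.atan(ipd_m / (2 * distance_m)))

print(round(vergence_deg(0.4), 1))  # near work at 40 cm: ~9.0 deg
print(round(vergence_deg(6.0), 1))  # distant target at 6 m: ~0.6 deg
```

The angle shrinks rapidly with distance, which is why vergence matters most for
near tasks.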

Much of the research on eye movements to date has been focused
on understanding the mechanics and dynamics of the oculomotor system. The
question of how successive fixations are aligned spatially has also received much
attention. Most of this research has been aimed at discovering how the visual
system 'knows' where the eyes are situated for each fixation so that the
individual images captured with each fixation can be correctly aligned to build
the rich internal representation we experience. Evidence is emerging, however,
that we may have been asking the wrong question. We are able to use regularities
in the environment to maintain a stable representation without resorting to
complex alignment mechanisms, and large changes in the environment may go
undetected. Understanding visual perception requires us to ask a similar, but
orthogonal question about the temporal stitching of successive views. This
issue has not arisen with experimental tasks in the past because task complexity
was purposely restricted.

We are studying eye movements in complex tasks and natural environments so that
we can better understand the process, rather than the mechanics, of visual
perception.

Teaching:

I am an Associate Professor in the Chester
F. Carlson Center
for Imaging Science at R.I.T.
I teach Introduction to Imaging Science I & II, Survey of Imaging Science, and
Vision and Psychophysics
and co-teach The Visual System, and
Spatial Vision and Pattern Perception with Eriko Miyahara.
I have
also taught courses in optics and computer programming in the Microelectronic
Engineering and Imaging and Photographic Technology programs.

New instrumentation allows us to monitor the movement of a subject's
eyes, head, and hand while they are performing complex visuo-motor tasks.
The photograph below shows the experimental equipment and one of the tasks
that we are using in these studies. A headband mounted eyetracker (made
by Applied Science Laboratories) monitors the subject's eye position. (The crosshairs
in the monitor in the background indicate the gaze position on the board.)
An EM field transmitter/receiver pair (from Ascension Technology) reports
the position and orientation of the head and hand. Eye and head data are
combined to provide an 'eye-in-space' signal.
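
The eye-in-space combination can be sketched as follows. This is a minimal
illustration with hypothetical coordinate conventions (z-up world frame,
yaw-pitch-roll head orientation, x as the head's forward axis); the actual
computation depends on the calibration of the particular instruments:

```python
import numpy as np

# Sketch: combine eye-in-head gaze angles with head orientation to obtain
# an 'eye-in-space' gaze direction. Coordinate conventions are assumed for
# illustration, not taken from the equipment described above.
def head_rotation(yaw, pitch, roll):
    """World-from-head rotation matrix from Euler angles (radians)."""
    cy, sy = np.cos(yaw), np.sin(yaw)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cr, sr = np.cos(roll), np.sin(roll)
    Rz = np.array([[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]])   # yaw about z
    Ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])   # pitch about y
    Rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])   # roll about x
    return Rz @ Ry @ Rx

def gaze_in_space(eye_azimuth, eye_elevation, head_R):
    """Unit gaze vector in world coordinates from eye-in-head angles."""
    # Eye-in-head direction: x forward, positive azimuth to the left.
    g = np.array([np.cos(eye_elevation) * np.cos(eye_azimuth),
                  np.cos(eye_elevation) * np.sin(eye_azimuth),
                  np.sin(eye_elevation)])
    return head_R @ g

# Example: eyes rotated 10 deg right in the head, head turned 20 deg left;
# the net gaze direction is ~10 deg left of straight ahead.
R = head_rotation(np.radians(20), 0.0, 0.0)
g = gaze_in_space(np.radians(-10), 0.0, R)
print(np.degrees(np.arctan2(g[1], g[0])))
```

In practice the head position from the EM tracker is also needed to anchor the
gaze ray in space, but the rotational composition above is the core of the
'eye-in-space' signal.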