Facial Behaviors

The human face is the most complex and expressive of any species. It
is a rich and versatile instrument serving many different
functions. It serves as a window to display one's own motivational
state, making one's behavior more predictable and understandable to
others and improving communication. The face can be
used to supplement verbal communication. A quick facial display can
reveal the speaker's attitude about the information being
conveyed. Alternatively, the face can be used to complement verbal
communication, such as lifting of the eyebrows to lend additional
emphasis to a stressed word. Facial gestures can communicate
information on their own, such as a facial shrug to express "I don't
know" to another's query. The face can serve a regulatory function to
modulate the pace of verbal exchange by providing turn-taking
cues. The face serves biological functions as well -- closing one's
eyes to protect them from a threatening stimulus, and on a longer time
scale to sleep.

Kismet doesn't engage in adult-level discourse, but its face serves
many of these functions at a simpler, pre-linguistic
level. Consequently, the robot's facial behavior is fairly
complex. The above schematic shows the facial motor control system.
Kismet's face currently supports four different functions, and must do
so in a timely, coherent, and appropriate manner:

- It reflects the state of the robot's emotion system. We call these
  emotive expressions.
- It conveys social cues during social interactions with people. We
  call these expressive facial displays.
- It participates in behavioral responses, such as closing its eyes to
  protect them from a dangerous stimulus.
- It synchronizes with the robot's speech.

The face system must be quite versatile as the manner in which these
four functions are manifest changes dynamically with motivational
state and environmental factors.
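One way to picture how a face system might arbitrate among these four
competing demands is a simple priority scheme. The ordering, function
names, and fallback rule below are illustrative assumptions, not
Kismet's actual control scheme:

```python
# Hypothetical arbitration among the four face functions. The priority
# ordering (protective reflexes first, emotive expression as the
# default) is an illustrative assumption, not Kismet's actual design.
FACE_FUNCTIONS = [
    ("behavioral_response", 3),   # e.g., shut eyes against a threat
    ("speech_sync",         2),   # lip/jaw motion during vocalization
    ("social_display",      1),   # turn-taking and other social cues
    ("emotive_expression",  0),   # default: mirror the emotion system
]

def select_face_function(active):
    """Pick the highest-priority face function whose trigger is active.

    `active` maps function names to booleans; emotive expression acts
    as the fallback when nothing more urgent is requested.
    """
    for name, _priority in sorted(FACE_FUNCTIONS, key=lambda t: -t[1]):
        if active.get(name, False):
            return name
    return "emotive_expression"
```

Under such a scheme a protective reflex would always preempt ongoing
speech or an emotive display, which matches the intuition that
biological functions of the face take precedence.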

However, people seem to be the most captivated by Kismet's emotive
facial expressions. Consequently, we will focus on Kismet's expressive
abilities here.

Emotive Facial Expressions

When designing the expressive abilities of a robot face, it is
important to consider both the believability and readability of the
facial behavior. Believability refers to how life-like the behavior
appears. Readability refers to how well the observer can correctly
interpret the intended expression. Kismet's face is always in motion,
which greatly enhances its life-like quality. Great attention has been
paid not only to how the face is configured to express a particular
"emotional" state, but also to the transitions between these states.


Kismet's facial expressions are generated using an interpolation-based
technique over a three dimensional space. The three
dimensions correspond to arousal, valence, and stance.
These same three attributes are used to
affectively assess the myriad of environmental and internal factors
that contribute to Kismet's "emotional" state. We call the space defined
by the [A, V, S] trio the affect space. The current affective
state occupies a single point in this space at a time. As the robot's
affective state changes, this point moves about within this
space. Note that this space not only maps to emotional states (e.g.,
anger, fear, sadness) but also to levels of arousal
(e.g., excitement and fatigue). A range of expressions generated with
this technique is shown above. The procedure
runs in real-time, which is critical for social interaction.
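As a concrete (and purely illustrative) sketch, the affective state
can be represented as a single bounded point in [A, V, S] space. The
axis ranges, field names, and clamping rule below are assumptions, not
Kismet's actual implementation:

```python
from dataclasses import dataclass

@dataclass
class AffectPoint:
    """Current point in the [A, V, S] affect space.

    Axes are normalized to [-1, 1] here for illustration; the ranges
    Kismet actually uses are an implementation detail not given above.
    """
    arousal: float = 0.0   # low (weariness) ... high (excitement)
    valence: float = 0.0   # negative (distress) ... positive (content)
    stance: float = 0.0    # closed (avoidance) ... open (approach)

    def clamp(self) -> "AffectPoint":
        """Return a copy confined to the legal cube after an update."""
        bound = lambda x: max(-1.0, min(1.0, x))
        return AffectPoint(bound(self.arousal),
                           bound(self.valence),
                           bound(self.stance))
```

Because the state is a single point, every affective assessment
reduces to moving this one point, and the expression system only ever
has to render one location in the space at a time.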

There are nine basis (or prototype) postures that
collectively span this space of emotive expressions. Although some of
these postures adjust specific facial features more strongly than the
others, each prototype influences most if not all of the facial
features to some degree. For instance, the valence prototypes have the
strongest influence on lip curvature, but can also adjust the
positions of the ears, eyelids, eyebrows, and jaw. The basis set of
facial postures has been designed so that a specific location in
affect space specifies the relative contributions of the prototype
postures in order to produce a net facial expression that faithfully
corresponds to the active "emotion". With this scheme, Kismet displays
expressions that intuitively map to the emotions of anger, disgust,
fear, happiness, sorrow, and surprise. Different levels of arousal can
be expressed as well, ranging from interest to calm to weariness.
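A minimal sketch of this basis-posture blend, assuming
inverse-distance weighting (the exact blend rule is not specified
above). The anchor locations and feature values are hypothetical, and
only four of the nine prototypes are shown:

```python
import math

# Hypothetical prototype postures: each pairs an anchor point in
# [arousal, valence, stance] space with a vector of facial feature
# positions (ears, brows, eyelids, lips, jaw), normalized 0..1.
# Kismet uses nine prototypes; four are sketched here, and all anchor
# locations and feature values are illustrative, not the robot's own.
PROTOTYPES = {
    "high_arousal": ((1.0, 0.0, 0.0), (0.9, 0.8, 1.0, 0.5, 0.6)),
    "low_arousal":  ((-1.0, 0.0, 0.0), (0.1, 0.2, 0.1, 0.4, 0.2)),
    "pos_valence":  ((0.0, 1.0, 0.0), (0.7, 0.5, 0.6, 0.9, 0.3)),
    "neg_valence":  ((0.0, -1.0, 0.0), (0.3, 0.6, 0.4, 0.1, 0.5)),
}

def blend_expression(point, prototypes=PROTOTYPES, eps=1e-6):
    """Inverse-distance-weighted blend of prototype postures.

    Prototypes near the current affect point dominate the mix; a point
    sitting exactly on an anchor reproduces that posture unchanged.
    """
    weights, postures = [], []
    for anchor, posture in prototypes.values():
        d = math.dist(point, anchor)
        if d < eps:                      # exactly on a prototype
            return list(posture)
        weights.append(1.0 / d)
        postures.append(posture)
    total = sum(weights)
    n = len(postures[0])
    return [sum(w * p[i] for w, p in zip(weights, postures)) / total
            for i in range(n)]
```

Because every prototype contributes to every feature, intermediate
points in the space yield intermediate expressions rather than abrupt
switches between canonical faces.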

There are several advantages to generating the robot's facial
expression from this affect space. First, this technique allows the
robot's facial expression to reflect the nuance of the underlying
assessment. Hence, even though there is a discrete number of "emotions",
the expressive behavior spans a continuous space. Second,
it lends clarity to the facial expression since the robot can only be
in a single affective state at a time (by our choice), and hence can
only express a single state at a time. Third, the robot's internal
dynamics are designed to promote smooth trajectories through affect
space. This gives the observer continuous insight into how the
robot's affective state is changing, which makes the robot's facial
behavior more interesting. Furthermore, by having the face mirror this
trajectory, the observer has immediate feedback as to how their
behavior is influencing the robot's internal state.
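The smooth trajectories described above can be sketched as a
first-order filter that pulls the current affect point a little
toward its target on every frame; the rate constant and target
values below are illustrative assumptions:

```python
def smooth_step(current, target, rate=0.2):
    """One update of a first-order filter on the affect point.

    Repeated calls trace a smooth trajectory through affect space
    that the face can mirror frame by frame; the rate constant is
    an illustrative choice, not a measured parameter.
    """
    return tuple(c + rate * (t - c) for c, t in zip(current, target))

# Trace a few frames from a neutral state toward a hypothetical
# high-arousal target, as a stand-in for a sudden "surprise" event.
point, target = (0.0, 0.0, 0.0), (0.9, 0.3, 0.6)
trajectory = []
for _ in range(20):
    point = smooth_step(point, target)
    trajectory.append(point)
```

Each frame moves the point a fixed fraction of the remaining distance,
so the expression changes quickly at first and then settles, rather
than snapping instantly to the new state.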