Mind-reading machines: automated inference of complex mental states

This technical report is based on a dissertation submitted March 2005 by
the author for the degree of Doctor of Philosophy to the University of
Cambridge, Newnham College.

Abstract

People express their mental states all the time, even when interacting
with machines. These mental states shape the decisions that we make,
govern how we communicate with others, and affect our performance. The
ability to attribute mental states to others from their behaviour, and
to use that knowledge to guide one’s own actions and predict those of
others, is known as theory of mind or mind-reading.

The principal contribution of this dissertation is the real-time
inference of a wide range of mental states from head and facial displays
in a video stream. In particular, the focus is on the inference of
complex mental states: the affective and cognitive states of mind that
are not part of the set of basic emotions. The automated mental state
inference system is inspired by and draws on the fundamental role of
mind-reading in communication and decision-making.

The dissertation describes the design, implementation and validation of
a computational model of mind-reading. The design is based on the
results of a number of experiments that I have undertaken to analyse the
facial signals and dynamics of complex mental states. The resulting
model is a multi-level probabilistic graphical model that represents the
facial events in a raw video stream at different levels of spatial and
temporal abstraction. Dynamic Bayesian Networks model observable head
and facial displays, and corresponding hidden mental states over time.
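
The idea of inferring a hidden mental state from observed displays over
time can be illustrated with forward filtering in the simplest kind of
Dynamic Bayesian Network, a two-state hidden Markov model. This is a
minimal sketch under stated assumptions, not the multi-level model of
the dissertation: the function name, the choice of two states and two
displays, and every probability below are hypothetical values chosen
for illustration only.

```python
# Hypothetical labels: two of the mental state classes and two head displays.
STATES = ["agreeing", "disagreeing"]
DISPLAYS = ["head_nod", "head_shake"]


def forward_filter(prior, transition, emission, observations):
    """Return P(state at t | displays observed up to t) for every t.

    prior[i]         -- P(state_0 = i)
    transition[i][j] -- P(state_t = j | state_{t-1} = i)
    emission[j][k]   -- P(display = k | state = j)
    observations     -- sequence of display indices
    """
    n = len(prior)
    belief = list(prior)
    history = []
    for t, obs in enumerate(observations):
        if t > 0:
            # Predict: propagate the previous belief through the transition model.
            belief = [
                sum(belief[i] * transition[i][j] for i in range(n))
                for j in range(n)
            ]
        # Update: weight each state by how well it explains the observed display.
        belief = [belief[j] * emission[j][obs] for j in range(n)]
        total = sum(belief)
        belief = [b / total for b in belief]
        history.append(belief)
    return history


# Illustrative (assumed) parameters, not values from the dissertation.
prior = [0.5, 0.5]
transition = [[0.9, 0.1], [0.1, 0.9]]
emission = [[0.8, 0.2], [0.2, 0.8]]

# Two head nods followed by one head shake.
beliefs = forward_filter(prior, transition, emission, [0, 0, 1])
for t, b in enumerate(beliefs):
    print(t, {s: round(p, 3) for s, p in zip(STATES, b)})
```

With these assumed numbers, the belief in "agreeing" rises over the two
nods and falls again after the shake, which shows how the estimate is
revised as each new display arrives rather than only at the end of the
sequence.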

The automated mind-reading system implements the model by combining
top-down predictions of mental state models with bottom-up vision-based
processing of the face. To support intelligent human-computer
interaction, the system meets three important criteria. These are: full
automation so that no manual preprocessing or segmentation is required,
real-time execution, and the categorization of mental states early
enough after their onset to ensure that the resulting knowledge is
current and useful.

The system is evaluated in terms of recognition accuracy, generalization
and real-time performance for six broad classes of complex mental states
(agreeing, concentrating, disagreeing, interested, thinking and unsure)
on two different corpora. The system successfully classifies and
generalizes to new examples of these classes with an accuracy and speed
that are comparable to those of human recognition.

The research I present here significantly advances the nascent ability
of machines to infer cognitive-affective mental states in real time from
nonverbal expressions of people. By developing a real-time system for
the inference of a wide range of mental states beyond the basic
emotions, I have widened the scope of human-computer interaction
scenarios in which this technology can be integrated. This is an
important step towards building socially and emotionally intelligent
machines.