Measuring Empathy

Psychologists distinguish between measurements of situational
empathy—that is, empathic reactions in a specific
situation—and measurements of dispositional empathy, where
empathy is understood as a person's stable character trait.
Situational empathy is measured either by asking subjects about their
experiences immediately after they were exposed to a particular
situation, by studying the “facial, gestural, and vocal indices
of empathy-related responding” (Zhou, Valiente, and Eisenberg
2003, 275), or by various physiological measures such as the
measurement of heart rate or skin conductance. None of these
measurements are perfect tools. As is widely known, self-reports can be
influenced by a variety of interfering factors. They might not indicate
how one has actually felt but rather reflect one's knowledge
of how other people expect one to feel. They also might vary according
to an individual's ability to verbalize his or her thoughts.
Physiological measurements do not fall prey to such concerns, yet it is
unclear whether they allow one to distinguish sufficiently between
empathy, sympathy, and personal distress (Zhou, Valiente, and Eisenberg
2003).

A large part of empathy research has focused on investigating the
variables associated with empathy as a stable disposition.
Dispositional empathy has been measured either by relying on the
reports of others (particularly in the case of children) or, most often (in
researching empathy in adults), by relying on the administration of
various questionnaires associated with specific empathy scales. Some of
the most widely used questionnaires have been Hogan's empathy
(EM) scale (Hogan 1969), Mehrabian and Epstein's
questionnaire measure of emotional empathy (QMEE; Mehrabian and Epstein
1972), and, since the 1980s, Davis's Interpersonal
Reactivity Index (IRI; Davis 1980, 1983, and 1994). These
questionnaires reflect the multiplicity of empathy conceptions in
psychology, since each understands itself as operationalizing a
different definition of empathy. Hogan conceives of empathy in an
exclusively cognitive manner, Mehrabian and Epstein think of it as an
exclusively affective phenomenon, defining it broadly as “a
vicarious response to the perceived emotional experiences of
others” (525), and Davis treats empathy as including both
cognitive and affective components; as a “set of constructs,
related in that they all concern responsivity to others but are also
clearly discriminable from each other” (Davis 1983, 113).

Hogan's cognitive empathy scale consists of 64 questions that
were selected from a variety of psychological personality tests such as
the Minnesota Multiphasic Personality Inventory (MMPI) and the
California Personality Inventory (CPI) according to a rather
complicated procedure. From almost a thousand questions, Hogan chose
those questions in response to which he found two groups of
people—who were independently identified as either
low-empathy or high-empathy individuals—as showing
significant differences in their answers. Mehrabian and Epstein's
questionnaire consists of 33 items divided into seven subcategories
testing for “susceptibility to emotional contagion,”
“appreciation of the feelings of unfamiliar and distant
others,” “extreme emotional responsiveness,”
“tendency to be moved by others' positive emotional
experiences,” “tendency to be moved by others'
negative emotional experience,” “sympathetic
tendency,” and “willingness to be in contact with others
who have problems” (Mehrabian and Epstein 197). Yet, even though
the QMEE distinguishes between these aspects of empathy on a conceptual
level, it only assigns a total empathy score to individuals completing
the questionnaire. For this reason, researchers nowadays tend to prefer
Davis's Interpersonal Reactivity Index. The
IRI is a questionnaire consisting of 28 questions divided equally among
four distinct subscales: “perspective
taking” or “the tendency to spontaneously adopt the
psychological view of others in everyday life;”
“empathic concern” or “the tendency to
experience feelings of sympathy or compassion for unfortunate
others;” “personal distress” or the “tendency
to experience distress or discomfort in response to extreme distress in
others;” and “fantasy” or “the tendency to
imaginatively transpose oneself into fictional situations” (Davis
1994, 55-57). In contrast to Mehrabian and Epstein's questionnaire, Davis's scale
does not calculate an overall value for empathy but calculates a
separate score for each of the subscales.
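
The difference between a single total score and separate subscale scores can be made concrete with a minimal sketch. It assumes 28 items answered on a 0 to 4 scale and assigns items to subscales in consecutive blocks of seven. That assignment, the response range, and both respondents are hypothetical illustrations and do not reproduce the actual IRI or QMEE scoring keys (which also involve reverse-scored items).

```python
from typing import Dict, List

# Subscale names follow the text; the item-to-subscale mapping below is a
# hypothetical illustration, NOT the published IRI scoring key.
SUBSCALES = ["perspective_taking", "empathic_concern",
             "personal_distress", "fantasy"]

# Assumption: 28 items, 7 per subscale, grouped in consecutive blocks.
ASSIGNMENT: Dict[str, List[int]] = {
    name: list(range(k * 7, (k + 1) * 7)) for k, name in enumerate(SUBSCALES)
}


def total_score(responses: List[int]) -> int:
    """Single overall score: the sum of all item responses."""
    return sum(responses)


def subscale_scores(responses: List[int]) -> Dict[str, int]:
    """One score per subscale: the sum of the items assigned to it."""
    return {name: sum(responses[i] for i in items)
            for name, items in ASSIGNMENT.items()}


# Two made-up respondents (assumed 0-4 response scale).
respondent_a = [4] * 14 + [0] * 14   # high on two subscales, zero on the others
respondent_b = [2] * 28              # moderate on every item

print(total_score(respondent_a), total_score(respondent_b))
print(subscale_scores(respondent_a))
print(subscale_scores(respondent_b))
```

In this sketch both respondents receive the same total, although their subscale profiles differ sharply. A scale that reports only a total cannot register that difference, whereas a scale that reports each subscale separately can.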

Remarkably, no significant correlation has been found between the
scores on various empathy scales and the measurement of empathic
accuracy. At most, a negligibly small correlation has been found between
empathic accuracy and affective empathy as measured by QMEE and the
empathic concern and personal distress subscales in the IRI (Davis and
Kraus 1997). This is particularly surprising in regard to Hogan's
empathy scale, which attempts to measure empathy understood in a
cognitive sense. Hogan certainly acknowledges the fact that
conceiving of empathy as a disposition to “imaginatively
apprehend another state of mind” does not conceptually imply
anything about the objective success of such apprehension. Yet, if
empathy plays a central role in establishing social relations among
agents, one would expect to find a more positive correlation between
the measurement of cognitive empathy as a stable disposition and
empathic accuracy. Davis and Kraus do not take such lack of correlation
as an indication of a fundamental failure in the conception of the
empathy questionnaires. Rather, they take it to indicate a principled limitation of
any empathy scale relying on self-reports. They speculate, following
Ickes (1993, 2003, chap. 7), that the lack of correlation indicates
that people in general have little meta-knowledge regarding their
empathic ability. Within this interpretive framework, there is nothing
in principle wrong with the questions asked to determine our empathic
abilities. If we had the required meta-knowledge, answering the
questions truthfully would be a good guide for determining empathy as
defined within the context of each scale.

Yet a closer look at the questions used in the questionnaires raises
some more fundamental concerns about the adequacy of the various
scales. Particularly in Hogan's or Mehrabian and Epstein's
questionnaires, one has to be worried about the insufficient semantic
correspondence between the content of the items probed and the
conception of empathy presumed by the authors of the questionnaire or
even the conceptions of empathy as articulated in this entry
(Holz-Ebeling and Steinmetz 1995). In what sense, for example, can one
understand items like “I prefer a shower to a tub bath” (#7),
“I think I would like to belong to a singing
club,” “I would like the job of a foreign
correspondent” (#29), or “I like to talk about
sex” (#56) in Hogan's scale as having anything to do with
empathy in a cognitive sense? At most the scale could be used in
identifying empathic people, if there is, as a matter of fact, a
correlation between empathy and specific answers to such questions. Yet
the questionnaire does not seem to probe directly for empathy, since it
does not establish that subjects tested answer because of an empathic
disposition as it is defined by the author. Moreover, it is not even
clear that the questionnaire would be less appropriate if one were to
define empathy in a purely affective manner. Investigating empathy with
the help of Hogan's questionnaire seems like testing for
intelligence by asking people whether they wear expensive suits and
dresses. Even if, counterfactually, all and only intelligent people
wore such apparel, a test designed in this manner would not ascertain
the existence of intelligence by testing for capacities directly
associated with our understanding of intelligence. (For a discussion of
Hogan see also Johnson, Cheek, and Smither 1983 and Bierhoff 2002.)
Similarly, in Mehrabian and Epstein's scale, reverse items like
“People make too much of the feelings and sensitivity of
animals,” “I often find public display of affection
annoying,” “I am annoyed by unhappy people who are just
sorry for themselves,” or “Little children sometimes cry
for no apparent reasons” (reverse items #2, 3, 4, and 33) do not
seem to test directly for affective empathy. Answers to the last item
might just reflect lack of experience with children (or too much
thereof) and the other items seem at most to test for particular social
attitudes towards others; attitudes that seem prima facie compatible
with both high and low empathy defined as a vicarious emotional
reaction to others. At best, the questions can be interpreted as
measuring one's emotional arousability rather than empathy. (For
this suggestion see Mehrabian, Young, and Sato 1988.) Indeed, in one
study (Holz-Ebeling and Steinmetz 1995), subjects regarded
hardly any of the items in the above two scales as semantically
related to the respective authors' conception of empathy. In the same study,
Davis's IRI scale fared much better, even if it did not
score perfectly, in that 12 out of the 28 items were regarded as
appropriate when compared with the author's general conception
of empathy and his definition of the specific subscales. However, it
has been suspected that in including the fantasy subscale and in
inserting questions like “I am usually pretty effective in
dealing with emergencies” or “I sometimes feel
helpless when I am in the middle of a very emotional situation”
under the personal distress subscale, Davis's scale measures
broader psychological processes such as imagination or the capacity for
emotional control, processes that are probably related to empathy but
certainly not identical with it (Baron-Cohen and Wheelwright
2004).

In the context of studying the question of whether autism should be
regarded as an “empathy disorder,” Baron-Cohen and
Wheelwright therefore felt the need to develop a new questionnaire for
measuring empathy. Their empathy questionnaire, called the empathy
quotient (EQ), defines empathy as including a cognitive
component—a “drive to attribute mental states to
another person/animal”—and an affective component,
entailing “an appropriate affective response in the observer to
the other person's mental state” (168). It consists
of 60 questions: 40 of them are directly related to empathy and 20 are
regarded as filler items in order to distract the subjects “from
a relentless focus on empathy.” First results using the test seem
to support the hypothesis that autism is associated with an impairment in
empathy. Baron-Cohen and Wheelwright stress that one has to interpret
these results cautiously, as the validity of the EQ needs to be further
tested and compared to other scales, particularly the IRI. Looking over
the various items, it does not seem as if EQ encounters the same
problems regarding its content validity as the other scales. Most items
can indeed be understood as testing for empathy as defined by the
authors. Yet it has to be pointed out—at least in regard to the
affective component of empathy—that neither the authors'
definition nor their included items sufficiently distinguish between
affective empathy, sympathy, and personal distress. Like many other
authors, they tend to think of affective empathy as an amalgam of
empathy proper, sympathy, and personal distress.

Whether or not existing means—particularly
questionnaires—are appropriate tools for further distinguishing
between these very different emotional reactions remains an open
question. Help for devising new empathy scales or for further
validating existing questionnaires could also come from the
neurosciences, which have recently begun to contribute to the study
of empathy and have begun to investigate the underlying neurobiological
mechanisms of perspective taking (for a survey see Decety and Jackson
2004). Indeed some recent studies seem to have found preliminary
evidence for a correlation between some empathy questionnaires and
specific neural activity. Specifically, Gazzola, Aziz-Zadeh, and
Keysers (2006) found that the auditory mirror circuits—that is,
those circuits that are activated both in executing an action and while
listening to sounds of similar actions—show stronger activation
levels in individuals who have higher perspective taking scores in
Davis's IRI questionnaire. Lamm, Batson, and Decety (2007)
found a correlation between the Empathy Quotient and activation in the right
putamen, the left posterior/middle insula, the anterior medial
cingulate cortex, and the left cerebellum. Such correlations suggest
that those questionnaires do measure aspects that have been
traditionally regarded as central to empathy, like the ability of
perspective taking. Whether the existence of such correlations can
be further corroborated must await the results of additional research.
Studying empathy from the perspective of the neurosciences might also help
us to understand and further distinguish the various components that
have been at times insufficiently held apart in the social
psychological study of empathy.
