Abstract

The expressions we see in the faces of others engage a number of different cognitive processes. Emotional expressions elicit rapid responses, which often imitate the emotion in the observed face. These effects can even occur for faces presented in such a way that the observer is not aware of them. We are also very good at explicitly recognizing and describing the emotion being expressed. A recent study, contrasting human and humanoid robot facial expressions, suggests that people can recognize the expressions made by the robot explicitly, but may not show the automatic, implicit response. The emotional expressions presented by faces are not simply reflexive, but also have a communicative component. For example, empathic expressions of pain are not simply a reflexive response to the sight of pain in another, since they are exaggerated when the empathizer knows he or she is being observed. It seems that we want people to know that we are empathic. Of especial importance among facial expressions are ostensive gestures such as the eyebrow flash, which indicate the intention to communicate. These gestures indicate, first, that the sender is to be trusted and, second, that any following signals are of importance to the receiver.

1. Multiple systems of emotion recognition

It has become increasingly apparent, initially from studies of neurological patients with circumscribed brain damage, that much, if not most, information processing occurs in the brain without any accompanying conscious experience. The final products of this information processing often become available to consciousness, but in many cases these unconscious processes can control behaviour without any need for awareness. The most dramatic example of this is blindsight (Weiskrantz & Warrington 1975). In such cases, patients have damage to the primary visual cortex, which renders them blind in the associated part of the visual field. Nevertheless, if a stimulus is, for example, moved rapidly across this blind field, the patient is able to ‘guess’ the direction of this movement considerably better than chance. Another example is patient D.F. who, as a result of damage to the inferior temporal cortex, is also effectively blind, since she is unable to recognize objects from their shapes (Goodale et al. 1991). In spite of this problem, when she reaches for an object, she shapes her hand to match the shape of the object. In her case, information about the shape of objects is available to achieve appropriate grasping behaviour, but does not enter consciousness.

Subsequently, the same phenomena have been demonstrated in healthy volunteers. Behavioural studies have shown that we are aware of far less of the visual world than we realize (e.g. change blindness, Rensink et al. 1997; inattentional blindness, Mack & Rock 1998). Furthermore, we make accurate reaching and grasping movements before we become aware of the stimuli that are eliciting these movements (Castiello et al. 1991; Pisella et al. 2000) and can be influenced by the meanings of words that we are not aware of having seen (Marcel 1983). Brain imaging studies have also confirmed that stimuli of which volunteers are unaware nevertheless elicit activity in the brain (e.g. Beck et al. 2001), and that this activity can be observed up to and including those areas associated with the processing of meaning (e.g. Dehaene et al. 1998).

These results reveal that there are at least two processes through which sensory signals can be converted into behaviour: one is associated with consciousness while the other is not. These processes seem to occur in parallel, rather than the unconscious processing being the early stage of a route to consciousness. Different kinds of information seem to be extracted by the two routes. For example, when reaching and grasping I need information about the shape of the object I am planning to grasp, but this information requires an egocentric perspective. The precise nature of the reach and grasp depends on the precise spatial relationship between me and the object. By contrast, when I recognize an object from its shape I need information about the shape of the object that is independent of my current viewpoint, so that the object can be identified even from an unusual view (Schenk 2006). Thus, these two perceptual routes seem to have different functions.

It is, of course, an oversimplification to suggest that there are just two independent routes to action. There are likely to be many routes and, furthermore, there will be interactions between these routes (e.g. Braddick et al. 2000). However, there remains the important distinction between those routes that lead to conscious awareness and those that do not (Frith et al. 1999). The question as to why only some routes are associated with consciousness is a subject of great interest, but will not be addressed in any detail in this paper. Our intuition is that consciousness has a very important role in our behaviour, but it is surprisingly difficult to define precisely what this role might be. It is certainly not the case that processing which leads to consciousness is, in some sense, better than processing that does not lead to consciousness. Some tasks are better learned without awareness (e.g. Fletcher et al. 2005). Introspection about the sources of our behaviour can be very misleading (e.g. Johansson et al. 2005). In the social domain, as we shall see later in this essay, the same signal can have very different effects, depending on whether or not it reaches consciousness. The role of consciousness in our response to emotions is equally complex and poorly understood.

By contrast, the role of unconscious processes is perhaps slightly better understood. Unconscious routes to action seem to apply when responses need to be made with great rapidity. For example, the two routes are engaged by stimuli that induce fear (LeDoux 2000). I will jump out of the way of the snake-like object in the path before I become aware that it is only a bent stick. In this case, there is a clear survival advantage for responding rapidly to a signal of danger, even when the signal has been incorrectly interpreted. In the rest of this paper, I shall be concerned with the use of various kinds of facial expressions as signals, which can also be processed with and without consciousness.

2. Emotional contagion: the covert imitation of facial expressions

Social interactions depend on complex signals in many modalities. For human interactions, however, spoken language plays such an important role that we often forget about the importance of non-verbal signals. Among these non-verbal signals, facial expressions play a major part in social interactions (Adolphs 1999; but see de Gelder 2009). I shall first consider the facial signals that are processed without awareness. The sight of a human face expressing fear elicits fearful responses in the observer, as indexed by increases in autonomic markers of arousal (Öhman & Soares 1998) and increased activity in the amygdala (Morris et al. 1996). This effect occurs even when the observer is unaware of the expression on the face because it has been masked (see Pessoa (2005) for review). The process by which the sight of a fearful face elicits fear in the observer is an example of a more general effect known as mirroring or contagion, whereby an observer tends to covertly and unconsciously mimic the behaviour of the person being observed. This effect occurs for a whole range of behaviours: facial expressions (Dimberg et al. 2000), limb movements (Kilner et al. 2003), gestures (Chartrand & Bargh 1999) and many aspects of speech (Pickering & Garrod 2004). However, as will be discussed below, the magnitude of this automatic mimicry is modulated by a number of factors, including the social context and the relationship between the observed and the observer (Bourgeois & Hess 2008).

There are many advantages of this mirroring behaviour. In the case of speech, the mirroring creates a greater alignment of the communicants, in terms of vocabulary and grammar, which facilitates communication. Experiments on the imitation of gestures during conversations show that the person imitated feels more friendly towards the other speaker and subsequently behaves in a more prosocial manner (e.g. is more likely to give money to a charity; Van Baaren et al. 2004).

However, these effects probably only occur when the person is unaware of the imitation (Lakin & Chartrand 2003). This is an interesting case where the effect of signals seems to be very different, depending on whether or not they are consciously perceived.

Ideas about the advantages accruing from the unconscious imitation of facial expressions are more speculative. The sight of a fearful face is likely to be a cue that there is something to be afraid of and that the observer should therefore be vigilant. Furthermore, the facial expression of fear enhances vigilance: the widening of the eyes increases the size of the visual field, while the widening of the nostrils increases inspirational capacity and enhances the sense of smell (Susskind et al. 2008). Thus, by imitating the expression of fear we increase our sensitivity to sensory signals and become more vigilant. By contrast, the expression of disgust is a cue that there is some noxious substance to be avoided. The facial expression of disgust has the opposite effect to that of fear. The eyes are narrowed, reducing the size of the visual field, and the nose is wrinkled, narrowing the nasal passages and reducing the exposure to smells. By imitating the expression of disgust, we reduce the impact of potential noxious stimuli.

A problem in need of further research concerns the mechanism by which people can covertly recognize facial expressions of emotion in masked faces that they are not aware of seeing (Dimberg et al. 2000). There is some evidence that such recognition depends on subcortical pathways to the amygdala (via superior colliculus and pulvinar) that bypass the primary visual cortex (Morris et al. 1999). This idea is supported by the demonstration that a patient (G.Y.) with severe damage to the left occipital cortex can, nevertheless, recognize (i.e. make better than chance guesses about) emotional facial expressions in his blind visual field (de Gelder et al. 1999). Further evidence for this speculation comes from studies of the role of different spatial frequencies in face recognition. Vuilleumier et al. (2003) showed that activity in the fusiform cortex was linked with facial identity with high spatial frequency presentation, while amygdala activity was linked with facial emotion with low spatial frequency presentation. Recognizing the expression of masked faces via a subcortical route may therefore depend on signals about the face carried by low spatial frequencies. One interesting possibility is that the critical sign of a fearful face is the area of white round the eye, which is greater when the eyes are opened wide. Whalen et al. (2004) showed that amygdala activity was elicited by masked fearful faces even when all information except the eye whites was removed. It remains to be shown how this sign can be so rapidly extracted from the visual signal in the absence of cortical processing.

It is not only facial expressions of emotion that are processed rapidly and without awareness. Gaze direction is another important facial cue when observing the behaviour of others. We are very sensitive to eye gaze direction and can discover the target of the gaze with great accuracy (Anstis et al. 1969). This ability enables us to discover who or what people are looking at and may reveal their interests and intentions. Eye gaze direction has been used as a cue on covert attention tasks. In these tasks, volunteers must detect a target that appears briefly to the left or right of fixation. Prior to the presentation of the target, a central cue is presented, for example, an arrow pointing left or right. The reaction time to detect the target is modulated by this cue, being faster when the cue is congruent with the target location and slower when it is incongruent. Typically congruent cues are presented on 80 per cent of trials and incongruent cues on 20 per cent of trials. If a face with eyes gazing left or right is used as the cue, the same congruency effect is found (Driver et al. 1999). Strikingly, however, this congruency effect occurs even if the eyes gaze consistently in the wrong direction. Bayliss et al. (2006) used a series of faces as cues in a covert attention task. The eye gaze direction in some faces was always congruent with the location of the following target, while other faces always looked in the incongruent direction. The effect of gaze on time to detect targets was unaffected by the identity of the faces. In other words, volunteers attended in the direction indicated by the eye gaze, even for the faces of individuals who consistently looked in the wrong direction. This result suggests that our tendency to follow the gaze direction of others is automatic and difficult to suppress. On the other hand, volunteers did learn something about the individuals. The faces that consistently gave invalid cues were rated as less trustworthy.
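The structure of the cueing task described above, and the way the congruency effect is computed from it, can be made concrete with a minimal sketch. The function names, the trial count and the reaction times used here are illustrative assumptions and are not taken from any of the studies cited; only the 80/20 split of congruent and incongruent cues follows the text.

```python
import random
from statistics import mean


def make_trials(n_trials, p_congruent=0.8, seed=0):
    """Build a shuffled list of cueing trials (hypothetical helper).

    Each trial records the side the central cue points to, the side on
    which the target appears, and whether the two agree.
    """
    rng = random.Random(seed)
    n_congruent = round(n_trials * p_congruent)
    flags = [True] * n_congruent + [False] * (n_trials - n_congruent)
    rng.shuffle(flags)
    trials = []
    for congruent in flags:
        cue = rng.choice(["left", "right"])
        if congruent:
            target = cue
        else:
            target = "right" if cue == "left" else "left"
        trials.append({"cue": cue, "target": target, "congruent": congruent})
    return trials


def congruency_effect(trials, reaction_times_ms):
    """Mean RT on incongruent trials minus mean RT on congruent trials."""
    congruent = [rt for t, rt in zip(trials, reaction_times_ms) if t["congruent"]]
    incongruent = [rt for t, rt in zip(trials, reaction_times_ms) if not t["congruent"]]
    return mean(incongruent) - mean(congruent)


trials = make_trials(100)
# Invented reaction times, for illustration only:
# 300 ms on congruent trials, 330 ms on incongruent trials.
rts = [300 if t["congruent"] else 330 for t in trials]
effect = congruency_effect(trials, rts)
```

A positive value of `effect` means that targets were detected more slowly after incongruent cues. In a design like that of Bayliss et al. (2006), the same difference could be computed separately for each face identity to check whether the effect depends on who is giving the cue.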

3. Explicit recognition of expressions

In parallel with the unconscious face processing route there is a conscious route, which is engaged, for example, when volunteers are explicitly asked to identify facial identities or facial expressions. Evidence that these two routes are indeed parallel comes from the study of patients who have acquired the inability to recognize faces (prosopagnosia) as the result of brain injury, usually to the right temporal lobe (Calder & Young 2005). Such patients also usually have difficulty in identifying facial expressions. However, there is evidence that these patients still show covert recognition of faces. For example, Tranel & Damasio (1988) describe four patients with prosopagnosia resulting from occipito-temporal lesions, who all showed covert recognition in terms of larger skin conductance responses (SCRs) for familiar faces. In other words, although they were unable to identify familiar faces, presentation of these faces elicited an emotional response. Further evidence for the two routes comes from patients with Capgras syndrome. This syndrome can occur both in psychiatric conditions and as a result of structural brain damage. Capgras patients are able to identify a familiar person's face. However, they believe that highly familiar people have been replaced by impostors, doubles or aliens, and they often hold this belief with extreme conviction (Ellis & Lewis 2001). Ellis & Young (1990) suggested that Capgras syndrome is the result of damage to an affective route to face recognition and is thus the mirror image of prosopagnosia. On this account, Capgras patients recognize the face of a familiar person but do not experience the emotional response that normally accompanies such recognition. As a result, the face does not ‘feel’ right. In agreement with this idea, it has been shown that Capgras patients do not show enhanced SCRs to familiar faces (Ellis et al. 1997).

My colleagues, led by Thierry Chaminade, have recently completed an imaging study in which volunteers were shown a variety of dynamic facial expressions displayed by a person or by a robot (humanoid robot WE-4RII; Takanishi Laboratory). This robot incorporates a simplified version of the facial action units identified by Ekman & Friesen (1978) and can display anger, joy and disgust. Our volunteers had no difficulty in recognizing the expressions made by the robot. The neural activity elicited by the facial expressions of this robot remains to be determined. An interesting possibility is that activity in areas such as the amygdala, which are associated with emotional responses, will not be elicited by robots. Such a pattern of responses would have obvious parallels with the case of Capgras syndrome discussed above. Here the patient sees a human face and recognizes identities and expressions, but feels no emotional resonance. In many cases, the patient believes that the person in question has been replaced by a robot (Ellis & Lewis 2001). It may turn out that the phenomenology described by the patient is correct in the sense that the experience is indeed like that elicited by observing a robot.

This speculation suggests that the observation of robots may not automatically elicit the contagious mimicry that occurs when we observe other people. There is some evidence in favour of this speculation. As mentioned above, Kilner et al. (2003) showed that observation of another person's incompatible movements interfered with the subject's own movements. However, this interference did not occur when the incompatible movements were made by a robot arm. On the other hand, there is evidence that automatic alignment occurs when we interact with artificial agents, just as it does when we interact with people. Oviatt et al. (2004) showed that children's speech reliably and flexibly converged in terms of a number of acoustic and prosodic features with the speech of the artificial agent they were conversing with. In this case the convergence might have been promoted by the need to interact with the agent rather than simply to observe it. As we shall see in the next section, the contagious effects of observation are considerably enhanced when we are in social contact with the person we are observing (e.g. Kilner et al. 2006).

4. Emotions as communicative signals

So far I have presented facial expressions as an example of public information (Danchin et al. 2004). Public information is a cue inadvertently produced by the behaviour of one animal that is useful to another animal. Many animals take advantage of such cues. Starlings, for example, locate likely sources of food by observing the foraging success of other members of the flock (Templeton & Giraldeau 1995). Facial expressions are another example of public information. Such expressions are also cues that are produced inadvertently and can be useful to observers. The appearance of an expression of fear is a signal to be vigilant since there may be something dangerous nearby. However, once emotional expressions become signals of value to observers, then it also becomes possible for these signals to be used deliberately as acts of communication (Parkinson 2005). The key characteristic of these communicative signals is that the communicator is aware that she is sending a signal and is also aware that her signal is being observed. As a result, the presence of an audience can alter the intensity of emotional expressions. An experiment by Bavelas et al. (1986) provides one of a number of examples of this effect. A volunteer sits in a waiting room, unaware that she is being filmed. Two actors enter the room carrying a heavy object that is then dropped, apparently injuring one of the actors. The observing volunteer mimics the expression of pain shown by the injured actor. This is a typical ‘mirror’ response. However, the pained expression in the observer is greatly intensified if the actor is in eye contact with the observer. The pained expression shown by the observer is now a communicative act, signalling ‘I recognize your pain and I sympathize’ and/or ‘I am a caring person’.

So far I have talked about situations in which one person responds to the facial expression of another, i.e. the observer responds to the expression of the actor. This is an example of open-loop behaviour. However, communications are more typically two-way interactions. In a two-way interaction, after the observer has responded to the expression of the actor, the actor then responds to the changed expression of the observer, thereby closing the loop. An example of an interactive sequence of emotions in which the communication loop is closed comes from an analysis of the expression of embarrassment by Keltner & Buswell (1997). Embarrassment is a complex emotion, which, like guilt and shame, is concerned with reputation in the eyes of others. In particular, Keltner and Buswell suggest that embarrassment can be seen as an act of appeasement, whereby a loss of reputation can be mitigated. The interactive sequence could be described as follows. The actor commits a social faux pas. This elicits surprise and anger in the observers, communicating their disapproval. The actor expresses embarrassment, communicating that he is ashamed, apologetic and unhappy. This appeases the observers, who express compassion, communicating forgiveness. This elicits happiness in the actor, indicating that he recognizes that he has been forgiven.

These deliberate emotional signals can also be deliberately deceptive. The person who signals embarrassment in order to appease observers may not actually feel the emotion.

5. Ostensive signals

If the same cue can be produced inadvertently or as a deliberate signal, then it is important to be able to distinguish between such causes of cues. Inadvertent signals are less likely to be deceptive, while deliberate signals may require an answer. In many situations, a special cue (an ostensive signal) is produced to indicate that the signalling is deliberate. The production of an ostensive signal indicates two things: first, that the person wishes to initiate communication, and second that the signal which follows will be relevant to the interests of the receiver (Sperber & Wilson 1995).

The most direct way of initiating a communication is to call someone's name. However, facial gestures are also used, such as making eye contact, often accompanied by a brief raising of the eyebrows (the eyebrow flash; Eibl-Eibesfeldt 1972). Kampe et al. (2003) scanned volunteers while they received ostensive signals either in the auditory modality (their name) or in the visual modality (prolonged eye contact). Activity common to both types of signal was observed in the medial prefrontal cortex and temporal poles. These are the regions that are frequently activated when subjects have to think about mental states (Mitchell et al. 2005; Frith & Frith 2006). Making inferences about the mental states of the other is a key requirement for communicative interactions, because these are essentially about transferring knowledge and beliefs from one mind to another. The results of Kampe et al. suggest that the ostensive signal causes the receiver to prepare for mentalizing.

An ostensive signal indicates that the signals which follow are deliberately communicative and will provide useful and relevant information to the observer. Belief in this promise is nicely demonstrated in the behaviour of infants in a study by Senju & Csibra (2008). In this experiment, the infants, aged six months, observed an adult who directed her gaze towards an object on the left or an object on the right. The experimenters tested whether or not the infant would follow the gaze of the adult. The results clearly showed that the infants followed the gaze only when it had been preceded by an ostensive signal: eye contact with eyebrow flash or infant-directed speech (motherese). The implication is that they understand that these ostensive signals indicate the eye gaze that follows is intended to point at something of interest.

What is the origin of these ostensive signals? Calling someone's name is a straightforward way of attracting someone's attention, but why should the eyebrow flash have acquired this role? Watt et al. (2007) have shown that raising the eyebrows, as happens with the eyebrow flash, increases the distance at which it is possible for an observer to detect gaze direction. Watt et al. suggest that the eyebrow flash therefore makes it easier for the observer to see that the actor is making eye contact. I speculate that the eyebrow flash may also be a signal of relevance.

There is good agreement among observers that some people's faces look more trustworthy than others (e.g. Winston et al. 2002; Todorov et al. 2008), although there is no evidence that these judgements have any validity. Todorov et al. have identified four facial features that drive the perception of trustworthiness. One of these is the height of the inner eyebrow. Faces with high inner eyebrows look trustworthy. Those with low inner eyebrows look untrustworthy. These differences can be interpreted on the basis of the results of Watt et al. Trustworthy people have raised eyebrows so that observers can see where they are looking. Untrustworthy people have lowered eyebrows so that observers have more difficulty in seeing where they are looking. Given these observations, we can interpret the eyebrow flash as a signal that the sender is to be trusted. Raised eyebrows enable eye gaze direction to be seen more easily, thereby helping to reveal the intentions of the sender. As an ostensive gesture, the eyebrow flash signals that the message which follows will be relevant and true.

6. Conclusions

The study of facial expressions illustrates very nicely how a behaviour can evolve into a sophisticated communication system. At first, the facial expression of fear has direct behavioural advantages for the actor, since widening the eyes, for example, increases the visual field, thereby increasing the likelihood of detecting signals of danger. This expression then becomes public information that observers can use as a signal to be vigilant. In the next step, the actor becomes able to control the sending of a signal that was previously emitted inadvertently. Through such control he can express sorrow and embarrassment as a means of appeasing aggression in others. Finally, both the actor and the receiver become aware that they are exchanging signals and that these can be used for deliberate communication. At this stage, the signals need no longer be tied to their original behavioural function. They can be arbitrarily related to meaning, making the development of language possible.

Acknowledgements

This work is supported by the Danish National Research Foundation and the AHRC CNCC scheme AH/E511112/1.