Abstract

Affect detection from open-ended virtual improvisational contexts is a challenging task. To achieve this research goal, the authors developed an intelligent agent capable of engaging in virtual improvisation and performing sentence-level affect detection from user inputs. This affect detection development was effective for improvisational inputs with strong emotional indicators. However, it could also be fooled by the diversity of emotional expressions, such as expressions with weak or no affect indicators, or metaphorical affective inputs. Moreover, since the improvisation often involves multi-party conversations with several threads of discussion happening simultaneously, the previous development was unable to identify the different discussion contexts and the intended audiences to inform affect detection. Therefore, in this paper, the authors employ latent semantic analysis to uncover the underlying semantic structures of emotional expressions and to identify topic themes and target audiences, especially for inputs without strong affect indicators, in order to improve affect detection performance. They also discuss how such semantic interpretation of dialogue contexts is used to identify metaphorical phenomena. An initial exploration of affect detection from gestures is also discussed, both to interpret users' experience of using the system and to provide an extra channel for detecting affect embedded in the virtual improvisation. This work contributes to the journal themes of affect sensing from text, semantic-based dialogue processing, and emotional gesture recognition.

Introduction

Human behaviour in social interaction has been intensively studied. Intelligent agents serve as an effective channel for validating such studies. For example, mimicry agents have been built that employ mimicking social behaviour to improve human-agent communication (Sun et al., 2011). Intelligent conversational agents have also been equipped to conduct personalised tutoring and generate small-talk behaviours to enhance users' experience. However, the Turing test, introduced in 1950 (Turing, 1950), still poses significant challenges to intelligent agent development. In particular, its central question, "can machines think?", exposes the shallowness of many of our developments.

We believe that equipping intelligent agents to interpret human emotions during interaction will make them behave in a more human-like way and narrow the communicative gap between machines and human beings. Thus, in our research, we equip our AI agent with emotional and social intelligence as a potential step towards answering the above Turing question. According to Kappas (2010), human emotions are psychological constructs with notoriously noisy, murky, and fuzzy boundaries that are compounded by contextual influences in experience and expression and by individual differences. These natural features of emotion also make it difficult to recognise via any single modality, such as acoustic-prosodic features of speech or facial expressions. Since human reasoning takes the relevant context into consideration, in our research we intend to make our agent consider multiple channels of subtle emotional expression embedded in social interaction contexts in order to draw reliable affect interpretations. The research presented here focuses on producing intelligent agents with the ability to interpret dialogue contexts semantically to support affect detection, as a first step towards building a 'thinking' machine. Emotional gesture recognition is also briefly discussed as a potential extra channel for interpreting affect implied in the improvisation and for detecting users' experience unobtrusively.

Our research is conducted within a previously developed online multi-user role-play virtual drama framework, which allows school children aged 14–16 to talk about emotionally difficult issues and undertake drama performance training. On this platform, young people interact online on a 3D virtual drama stage with others under the guidance of a human director. In one session, up to five virtual characters are controlled on the virtual stage by human users ("actors"), with each character's (textual) "speech" typed by the actor operating it. The actors are given a loose scenario around which to improvise, but are at liberty to be creative. An intelligent agent is also involved in the improvisation. It includes an affect detection component, which detects affect from each human character's individual turn-taking input (an input contributed by an individual character at one time).

This previous affect detection component was able to detect 25 emotions, including basic and complex emotions and value judgments, but the detection did not take any context into consideration. The basic emotions were Ekman's six cross-culturally accepted universal emotions (Ekman, 1999): anger, disgust, fear, happiness, sadness and surprise. The complex emotions were taken from the global structure of emotion types, also known as the OCC model (Ortony, Clore, & Collins, 1988). In this model, emotions are regarded as the results of appraising events, agents, and objects. The previous development borrowed emotion categories such as approval/disapproval and gloating to annotate emotions embedded in users' inputs. Other emotions used included meta-emotions (emotions about emotions), such as desiring to overcome anxiety; moods, such as hostility; and value judgments (judgments of goodness, importance, etc.). The intelligent agent also attempted to produce appropriate responses to help stimulate the improvisation based on the detected affect. The detected emotions were additionally used to drive the animations of the avatars so that they reacted bodily in ways consistent with the affect they were expressing (Zhang et al., 2009; Zhang, 2010). Although merely detecting affect is limited compared to extracting the full meaning of characters' utterances, we have found that in many cases it was sufficient for the purposes of stimulating the improvisation.
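To make the limitation of context-free, indicator-driven detection concrete, the following is a minimal sketch of sentence-level affect detection based on a keyword lexicon. The lexicon, category names, and `detect_affect` function are illustrative assumptions for exposition only, not the authors' actual component, which handled 25 emotion categories and far richer linguistic analysis. Note how an input with no strong indicator falls through to "neutral" — precisely the weak-indicator case that the semantic, context-based extensions described in this paper are designed to address.

```python
# Hypothetical sketch: lexicon-based, sentence-level affect detection.
# The keywords and categories below are invented examples, not the
# authors' rule set.

# Illustrative lexicon mapping affect keywords to Ekman-style categories.
AFFECT_LEXICON = {
    "furious": "anger", "hate": "anger",
    "scared": "fear", "terrified": "fear",
    "glad": "happiness", "great": "happiness",
    "miserable": "sadness", "cry": "sadness",
    "gross": "disgust",
    "wow": "surprise",
}

def detect_affect(utterance: str) -> str:
    """Return the first affect category whose keyword appears in the
    input, or 'neutral' when no strong indicator is found."""
    for token in utterance.lower().split():
        word = token.strip(".,!?'\"")
        if word in AFFECT_LEXICON:
            return AFFECT_LEXICON[word]
    return "neutral"

print(detect_affect("I hate you!"))       # -> anger
print(detect_affect("Leave me alone."))   # -> neutral (no indicator)
```

Because each turn is classified in isolation, sarcasm, metaphor, and replies whose affect depends on an earlier speaker's turn are invisible to such a detector, which motivates the context-aware processing introduced in the remainder of the paper.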