Laukka, Petri

Abstract [en]

The auditory gating paradigm was adopted to study how much acoustic information is needed to recognize emotions from speech prosody and music performances. In Study 1, brief utterances conveying 10 emotions were segmented into temporally fine-grained gates and presented to listeners, whereas Study 2 instead used musically expressed emotions. Emotion recognition accuracy increased with gate duration and generally stabilized after a certain duration, with different trajectories for different emotions. Above-chance accuracy was observed for stimuli of ≤ 100 ms for anger, happiness, neutral, and sadness, and for stimuli of ≤ 250 ms for most other emotions, in both speech and music. This suggests that emotion recognition is a fast process that allows discrimination of several emotions based on low-level physical characteristics. The emotion identification points (EIPs), which reflect the amount of information required for stable recognition, were shortest for anger and happiness in both speech and music, but recognition took longer to stabilize for music than for speech. This, in turn, suggests that acoustic cues that develop over time also play a role in emotion inferences (especially for music). Finally, acoustic cue patterns were positively correlated between speech and music, suggesting a shared acoustic code for expressing emotions.

Nordström, Henrik

Abstract [en]

Emotional communication is an important part of social interaction because it gives individuals valuable information about the state of others, allowing them to adjust their behaviors and responses appropriately. When people use the voice to communicate, listeners interpret not only the words that are said (the verbal content) but also the information contained in how the words are said (the nonverbal content). A large portion of the nonverbal content of the voice is thought to convey information about the emotional state of the speaker. The aim of this thesis was to study how humans communicate and interpret emotions via nonverbal aspects of the voice, and to describe these aspects in terms of acoustic parameters that allow listeners to interpret the emotional message.

The thesis presents data from four studies investigating nonverbal communication of emotions from slightly different perspectives. In an as yet unpublished study, the acoustic parameters suggested to communicate discrete emotions, based on theoretical predictions of how the voice may be influenced by emotional episodes, were compared with empirical data derived from listeners' judgments of actors portraying a wide variety of emotions. Results largely corroborated the theoretical predictions, suggesting that previous research has come far in explaining the mechanisms that allow listeners to infer emotions from the nonverbal aspects of speech. However, potentially important deviations were also observed. These deviations may be crucial to our understanding of how emotions are communicated in speech, and they highlight the need to refine theoretical predictions to better describe the acoustic features that listeners use to understand emotional voices.

In the first of the three published studies, Study 1, the common-sense notion that we are quick to hear the emotional state of a speaker was investigated and compared with the recognition of emotional expressivity in music. Results showed that listeners needed very little acoustic information to recognize emotions in both modes of communication. These findings suggest that low-level acoustic features available to listeners within the first tenths of a second carry much of the emotional message, and that these features may be used in both speech and music.

By investigating listeners' recognition of vocal bursts, the kind of sounds people make when they are not speaking, Study 2 showed that listeners can recognize several emotional expressions across cultures, including emotions that are often difficult to recognize from speech. The study thus suggests that the voice is an even more versatile means of emotional communication than previously thought.

Study 3 also investigated emotional communication in a cross-cultural setting. However, instead of studying emotion recognition in terms of discrete categories, this study investigated whether nonverbal aspects of the voice can carry information about how the speaker evaluated the situation that elicited the emotion. Results showed that listeners were able to infer several aspects of the situation, which suggests that nonverbal expressions may carry a symbolic meaning comprising several dimensions beyond valence and arousal that can be understood across cultures.

Taken together, the results of this thesis suggest that humans use nonverbal manipulations of the voice to communicate emotions and that these manipulations can be understood quickly and accurately by listeners both within and across cultures. Although decades of research have investigated how this communication occurs, the acoustic parameters that allow listeners to interpret emotions remain elusive. The data from the four studies in this thesis, the methods used, and the acoustic analyses performed shed new light on this process. Future research in the field may benefit from a more standardized approach across studies, both in acoustic analysis and in experimental design. This would facilitate comparisons of findings between studies and allow for a more cumulative science within the field of emotional communication in the human voice.