Recording and Analysis of Emotional Speech

This course is open to students of psychology and students of
phonetics. It is a one-year course (two semesters). In psychology, the
first semester is an elective course (GWM), and the second semester is
part of the empirical laboratory class (Empiriepraktikum 2). In
phonetics, the winter course is identical to the course "Projekt"
(Modul I), followed by Modul II in summer, and is intended for
students in their 5th semester.

In this course we will first cover the fundamentals of emotion
psychology and of speech production and recording. We will then
devise an experimental setting that allows us to record emotional
speech. The range of emotions should cover the major basic emotions
(Tomkins, Ekman). We will introduce the course participants to the
techniques of speech recording and then proceed to record emotional
speech. The participants will be taught how to analyze speech by means
of digital speech processing. In the second semester, we will devise
an experimental setting for evaluating the emotional speech, based on
listeners' ratings and on physiological measures recorded from the
listeners.

First ideas

Much of the experimental setup will be discussed with the participants
of the course. The following ideas are therefore not meant to fix the
details of the experiments, but to give a first impression of how the
experiments could be done. We are open to other setups suggested by
the participants.

Recording spontaneous versus acted emotions: It would be very tempting
to record true spontaneous emotions. It will, however, be very
difficult to find the entire range of basic emotions displayed
spontaneously in comparable settings. For the sake of comparability it
will be better to have actors play the emotions. We could offer drama
students the opportunity to have their ability to convey emotions
evaluated (see, e.g., Schule für Schauspiel Kiel). The evaluation
would cover not only the clarity of the displayed emotions but also
their authenticity. Obtaining recordings that are both clear and
authentic might allay the concerns one might have about acted
emotions.

When recording the speech of actors, we are free to define the text
that is read. We should choose a text that does not by itself
facilitate the display of some emotions and hinder that of others. The
candidate texts might themselves undergo an evaluation phase. We would
choose the text that is the least determined with regard to its
emotional content while still offering potential for emotional
interpretations of all sorts. The text should be short enough to be
learnt by heart.
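
One possible way to make "least determined with regard to its
emotional content" measurable in the evaluation phase is sketched
below. It assumes, purely for illustration, that each rater in a pilot
study assigns one emotion label to each candidate text, and it picks
the text whose labels are spread most evenly, measured by Shannon
entropy. The function names, the label set and the pilot data are
hypothetical; the actual criterion would be decided in the course.

```python
import math
from collections import Counter


def emotion_entropy(labels):
    """Shannon entropy (in bits) of the emotion labels given to one text.

    0.0 means all raters agreed on one emotion; higher values mean the
    labels are spread more evenly across emotions.
    """
    counts = Counter(labels)
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def most_ambiguous_text(ratings):
    """Return the id of the text whose labels have the highest entropy."""
    return max(ratings, key=lambda text_id: emotion_entropy(ratings[text_id]))


# Hypothetical pilot data: one emotion label per rater and candidate text.
pilot_ratings = {
    "text_a": ["joy", "joy", "joy", "joy", "joy", "joy"],
    "text_b": ["joy", "anger", "fear", "sadness", "disgust", "surprise"],
    "text_c": ["sadness", "sadness", "sadness", "fear", "fear", "joy"],
}

best = most_ambiguous_text(pilot_ratings)
```

With these made-up ratings, "text_a" is read as joyful by every rater
and would be rejected, whereas "text_b" is interpreted differently by
each rater and would be selected. Entropy is only one conceivable
criterion; mean intensity ratings per emotion would be another.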

In addition to the soundtrack, we could record a video of the
performance. We could then compare the ratings and the physiological
effects of the soundtrack alone, of the video without sound, and of
the video with soundtrack.

Schröder, Marc (2003). Speech and Emotion Research: An Overview of
Research Frameworks and a Dimensional Approach to Emotional Speech
Synthesis. Dissertation, PHONUS 7, Research Report of the Institute of
Phonetics, Saarland University.