Abstract

Recent studies have demonstrated perceptual adaptation to nonlinguistic properties of speech involving voice gender and emotional expression. The present study extends this work by examining the contribution of fundamental frequency (F0) to these effects. Voice recordings of vowel-consonant-vowel (VCV) syllables from six talkers were processed using the STRAIGHT vocoder and an auditory morphing technique to synthesize gender (experiment 1) and expressive (experiment 2) speech-sound continua ranging from one category endpoint to the other (female to male; angry to happy). Continuum endpoints served as adaptors for F0-present and F0-removed conditions. F0-removed stimuli were created by replacing the periodic excitation source with broadband noise. Confirming previous findings, aftereffects were found in the F0-present condition, resulting in a decreased likelihood of identifying test stimuli as belonging to the adaptor category. No aftereffects appeared when F0 was removed, highlighting the importance of F0 in adaptation. However, in an identification test, listeners were still able to categorize F0-removed stimuli at better-than-chance levels, indicating that residual cues for gender and emotion were available even when F0 was not present.

Received 15 October 2011; Revised 26 December 2012; Accepted 28 January 2013; Published online 03 April 2013

Acknowledgments:

The authors would like to thank Hideki Kawahara for providing the STRAIGHT synthesis software, Alice O'Toole for helpful feedback on the research analysis methodology, James Hillenbrand, Drew Rendall, and two anonymous reviewers for helpful comments made on a previous version of the manuscript, and the UT Dallas School of Arts and Humanities for cooperation in posting flyers to recruit talkers. Portions of this work were reported at the 162nd Meeting of the Acoustical Society of America.