Abstract

This paper presents the results of a closed-set recognition task for 64 consonant-vowel sounds (spoken by 18 talkers) in speech-weighted noise and in quiet. The confusion matrices were generated from the responses of a homogeneous set of ten listeners, and the confusions were analyzed using a graphical method. In speech-weighted noise the consonants separate into three sets: a low-scoring set C1 (/f/, /θ/, /v/, /ð/, /b/, /m/), a high-scoring set C2 (/t/, /s/, /z/, /ʃ/, /ʒ/), and a set C3 (/n/, /p/, /g/, /k/, /d/) with intermediate scores. The perceptual consonant groups are C1: { /f/-/θ/, /b/-/v/-/ð/, /θ/-/ð/ }, C2: { /s/-/z/, /ʃ/-/ʒ/ }, and C3: /m/-/n/, while the perceptual vowel groups are /ɑ/-/æ/ and /ε/-/ɪ/. The exponential articulation index (AI) model for consonant scores works for 12 of the 16 consonants, using a refined expression of the AI. Finally, a comparison with past work shows that white noise masks the consonants more uniformly than speech-weighted noise, and that the AI, because it can account for differences in noise spectra, is a better measure than the wideband signal-to-noise ratio for modeling and comparing scores obtained with different noise maskers.

We thank all members of the HSR group at the Beckman Institute, UIUC for their help. We thank Andrew Lovitt for writing user-friendly code that made data collection easier and faster. Bryce Lobdell's input was crucial in revising the manuscript. We are grateful to Kenneth Grant for sharing his confusion data, stimuli, and his expertise.