Abstract

The frequency above which the excitation function of a voiced vowel becomes essentially stochastic is determined. This transition frequency varies with speaker and stress. Knowledge of it can be used to develop better spectral subtraction denoising algorithms, in which different methods of spectral estimation are used above and below the transition. In addition, whenever speech is (1) either unvoiced or voiced with constant pitch and (2) the convolution of an excitation function with another function, a normalized variance of the spectral estimate of speech can be defined that equals the normalized variance of the spectral estimate of the excitation function. This normalized variance measures, as a function of frequency, the relative strengths of the stochastic and deterministic components present in the excitation function. As expected, for voiced vowels the normalized variance is small at low frequencies, confirming that the excitation function there is largely deterministic. At high frequencies the normalized variance is large, and the excitation function is primarily stochastic. In this sense, voiced vowels are whispered at high frequencies. [Work supported by DOE applied mathematics program and DARPA information technology office.]
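
The normalized-variance property can be illustrated with a small synthetic sketch. This is not the authors' procedure. It assumes a particular definition (variance of the periodogram across frames divided by the squared mean), pitch-synchronous frames, circular convolution within each frame, and a made-up excitation consisting of a smooth constant-pitch pulse train plus white noise. All names and parameter values, including the 0.5 threshold used to pick a transition bin, are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: frame length, pitch period (samples), number of frames.
N, P, M = 256, 32, 2000
n = np.arange(N)

# Deterministic part: the same smooth pulse train in every frame (constant pitch).
# Its spectrum decays with frequency, so it dominates only at low harmonics.
pulse = np.exp(-0.5 * ((np.arange(P) - P / 2) / 2.0) ** 2)
deterministic = np.tile(pulse, N // P)

# Stochastic part: independent white noise in each frame.
noise = 0.1 * rng.standard_normal((M, N))
excitation = deterministic + noise  # shape (M, N)

# Assumed fixed filter (a damped resonance); it is the same for every frame.
h = np.exp(-n / 20.0) * np.cos(2 * np.pi * 0.1 * n)

# Circular convolution per frame is a product of DFTs.
E = np.fft.rfft(excitation, axis=1)
S = E * np.fft.rfft(h)


def normalized_variance(spectra):
    """Variance of the periodogram across frames over its squared mean, per bin."""
    power = np.abs(spectra) ** 2
    return power.var(axis=0) / power.mean(axis=0) ** 2


nv_excitation = normalized_variance(E)
nv_speech = normalized_variance(S)

# Pitch harmonics fall on multiples of N // P; DC and Nyquist are skipped.
harmonics = np.arange(N // P, N // 2, N // P)
nv_h = nv_speech[harmonics]

# Crude transition estimate: first harmonic whose normalized variance exceeds 0.5.
transition_bin = int(harmonics[np.argmax(nv_h > 0.5)])

print("filter cancels out:", np.allclose(nv_speech, nv_excitation))
print("lowest / highest harmonic:", nv_h[0], nv_h[-1])
print("transition bin:", transition_bin)
```

Because the filter's squared magnitude multiplies every frame's periodogram by the same factor at a given frequency, it cancels in the ratio, so the speech and excitation values agree bin by bin. At the harmonics, the ratio is small where the pulse train dominates and approaches 1 where the noise dominates.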