Exploiting Variances in Robust Feature Extraction Based on a Parametric Model of Speech Distortion

Li Deng, Jasha Droppo, and Alex Acero September 2002

Abstract

This paper presents a technique that exploits the denoised speech’s variance, estimated during the speech feature enhancement process, to improve noise-robust speech recognition. This technique provides an alternative to the Bayesian predictive classification decision rule by carrying out an integration over the feature space instead of over the model-parameter space, offering a much simpler system implementation and lower computational cost. We extend our earlier work by using a new approach, based on a parametric model of speech distortion and thus free from the use of any stereo training data, to statistical feature enhancement, for which a novel algorithm for estimating the variance of the enhanced speech features is developed. Experimental evaluation using the full Aurora2 test data sets demonstrates an 11.4% digit error rate reduction averaged over all noisy and SNR conditions, compared with the best technique we have developed [2] prior to this work that did not exploit the variance information and that required no stereo training data.