In this paper, we propose a robust spectral representation using the group delay (GD) function computed from stabilized weighted linear prediction (SWLP) coefficients. Temporal weighting of the cost function in linear prediction (LP) analysis with the short-term energy of the speech signal improves the robustness of the resulting spectrum. The additive property of the group delay function gives a better representation of weaker resonances in the spectrum, thereby improving the robustness of the representation. SWLP provides robustness in the temporal domain, whereas the GD function provides robustness in the frequency domain. The proposed SWLP-GD representation is shown to be more robust against different types of additive noise degradation than the widely used discrete Fourier transform (DFT) or LP based representations. In a small-scale closed-set speaker recognition experiment, cepstral features derived from the proposed SWLP-GD spectrum perform better than traditional mel-cepstral features computed from the DFT spectrum under conditions of mismatched degradation.
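
The temporal weighting described above can be illustrated with a minimal sketch of plain weighted linear prediction. This is not the authors' implementation: the stabilization step that distinguishes SWLP is not reproduced here, and the function names, the `window` parameter, and the exact form of the short-time energy weight (sum of squares of the preceding `window` samples) are assumptions made for illustration.

```python
def short_time_energy_weights(signal, window):
    """Assumed weight: w[n] = energy of the `window` samples preceding n."""
    weights = []
    for n in range(len(signal)):
        past = signal[max(0, n - window):n]
        # Small floor keeps the weights strictly positive (illustrative choice).
        weights.append(sum(x * x for x in past) + 1e-12)
    return weights


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def weighted_lp(signal, order, window):
    """Minimise sum_n w[n] * (s[n] - sum_k a[k] * s[n-k])**2 over a[1..order]."""
    w = short_time_energy_weights(signal, window)
    big_r = [[0.0] * order for _ in range(order)]
    small_r = [0.0] * order
    for n in range(order, len(signal)):
        past = [signal[n - k] for k in range(1, order + 1)]
        for i in range(order):
            small_r[i] += w[n] * signal[n] * past[i]
            for j in range(order):
                big_r[i][j] += w[n] * past[i] * past[j]
    # Weighted normal equations: big_r a = small_r.
    return solve_linear(big_r, small_r)
```

Because high-energy samples dominate the weighted cost, segments with a high local signal-to-noise ratio have the most influence on the predictor coefficients.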

@inproceedings{DhanuSWLPGDIS2013,
title = {Robust formant detection using group delay function and stabilized weighted linear prediction},
author = {Dhananjaya Gowda and Jouni Pohjalainen and Mikko Kurimo and Paavo Alku},
year = {2013},
date = {2013-08-25},
booktitle = {Proc. Interspeech 2013},
abstract = {In this paper, we propose a robust spectral representation for detecting formants in heavily degraded conditions. The method combines the temporal robustness of stabilized weighted linear prediction (SWLP) with the robustness of the group delay (GD) function in the frequency domain. Weighting the cost function in linear prediction analysis with the short-time energy of the speech signal improves the robustness of the resulting spectrum. It also improves the accuracy of the estimated resonances, because the weighting function gives more weight to the closed phase of the glottal cycle, which is also the high-SNR region of the signal. The group delay spectrum, computed as the sum of the individual resonances given by the roots of the SWLP polynomial, improves the robustness of the weaker higher-order resonances. The proposed SWLP-GD spectrum performs better than the conventional LP spectrum and the STRAIGHT spectrum in terms of spectral distortion and formant detection accuracy.},
keywords = {formant detection, group delay, robust spectrum estimation, stabilized weighted linear prediction, SWLP}
}
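
The idea of computing the group delay spectrum as a sum of individual resonances can be sketched as follows. This is a minimal illustration rather than the authors' code: it assumes an all-pole model H(z) = 1 / prod_k (1 - p_k z^-1) whose poles p_k (all inside the unit circle) have already been obtained from the predictor polynomial, and the function names and the `n_bins` parameter are hypothetical.

```python
import cmath
import math


def pole_group_delay(pole, omega):
    """Group delay (in samples) contributed by one pole of an all-pole filter."""
    r = abs(pole)
    alpha = omega - cmath.phase(pole)
    # Closed form of -d(arg H)/d(omega) for H(z) = 1 / (1 - pole * z**-1).
    return (r * math.cos(alpha) - r * r) / (1.0 - 2.0 * r * math.cos(alpha) + r * r)


def group_delay_spectrum(poles, n_bins=257):
    """Sum the per-pole contributions on a grid of n_bins frequencies in [0, pi]."""
    omegas = [math.pi * k / (n_bins - 1) for k in range(n_bins)]
    spectrum = [sum(pole_group_delay(p, w) for p in poles) for w in omegas]
    return omegas, spectrum
```

Each pole contributes its own peak at its angular frequency, and the contributions add rather than multiply, so a weak resonance is less likely to be masked by a strong neighbour than in the magnitude spectrum.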
