Abstract

We address the problem of extending the bandwidth of speech signals, which is important for enhancing the quality and intelligibility of telephone speech. The low-pass filtering effect of telephone communication channels eliminates the high-frequency components of the speech signal, and these components must be recovered to maintain speech quality. We adopt a joint-dictionary training approach to recover the missing spectral information. By exploiting the sparsity of the spectrogram frames, the dictionaries for the wide-band (WB) and the corresponding narrow-band (NB) spectrogram frames are trained in a coupled manner in order to learn the mapping from NB to WB frames. We refer to this approach as joint dictionary training for bandwidth extension (JDTBE). To ensure that the reconstructed bandwidth-extended speech is consistent with the measurement, we propose applying a suitable affine transformation that depends on the properties of the telephone channel. We study the effect of the choice of sparsity on the quality of the reconstructed speech, for both male and female speakers. A comparison of the proposed JDTBE algorithm with a bandwidth extension technique based on stochastic modeling shows that the JDTBE approach achieves higher subjective listening test scores.
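The coupled-dictionary idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the NB and WB dictionaries are already trained and share one sparse code per frame, it uses orthogonal matching pursuit as a stand-in sparse coder (the abstract does not name one), and it omits the channel-dependent affine transformation. The names `omp`, `extend_frame`, `D_nb`, and `D_wb` are hypothetical.

```python
import numpy as np


def omp(D, y, sparsity):
    """Greedy sparse coding (orthogonal matching pursuit).

    Assumes the columns (atoms) of D have unit norm.
    """
    y = np.asarray(y, dtype=float)
    residual = y.copy()
    support = []
    coeffs = np.zeros(0)
    for _ in range(sparsity):
        correlations = np.abs(D.T @ residual)
        if support:
            correlations[support] = 0.0  # never pick the same atom twice
        support.append(int(np.argmax(correlations)))
        # Least-squares refit on all atoms selected so far.
        coeffs, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coeffs
    alpha = np.zeros(D.shape[1])
    alpha[support] = coeffs
    return alpha


def extend_frame(nb_frame, D_nb, D_wb, sparsity):
    """Estimate a WB frame from an NB frame via a shared sparse code.

    Assumption: atom k of D_nb and atom k of D_wb were trained jointly,
    so the code found on the NB side can be reused on the WB side.
    """
    alpha = omp(D_nb, nb_frame, sparsity)
    return D_wb @ alpha
```

In this sketch, `sparsity` plays the role of the sparsity level whose effect on reconstructed speech quality the abstract says is studied; the dimensions of the frames and the number of atoms are left to the caller.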

Item Type:

Conference Proceedings

Additional Information:

Copyright for this article belongs to the IEEE, 345 E 47th St, New York, NY 10017, USA.