Please use this identifier to cite or link to this item:
http://hdl.handle.net/10061/8031

Title:

A New Phonetic Tied-Mixture Model For Efficient Decoding

Authors:

Akinobu Lee, Tatsuya Kawahara, Kazuya Takeda, Kiyohiro Shikano

Issue Date:

Jun-2000

Publisher:

IEEE

Start page:

1269

End page:

1272

Abstract:

A phonetic tied-mixture (PTM) model for efficient large-vocabulary continuous speech recognition is presented. It is synthesized from context-independent phone models with 64 mixture components per state by assigning different mixture weights according to the shared states of triphones; the mixtures are then re-estimated for optimization. The model achieves a word error rate of 7.0% on a 20,000-word newspaper dictation task, which is comparable to the best figure obtained with triphone models of much higher resolution. Compared with conventional PTMs, in which Gaussians are shared by all states, the proposed model is easily trained and reliably estimated. Furthermore, the model enables the decoder to perform efficient Gaussian pruning. It is found that computing only two of the 64 components per state causes no loss of accuracy. Several pruning methods are proposed and compared; the best one reduces the computation to about 20%.
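The Gaussian pruning described in the abstract (evaluating only the best few of the 64 shared components per state, flooring the rest) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: all function and variable names are hypothetical, diagonal covariances are assumed, and a simple best-k selection stands in for the several pruning methods the paper compares.

```python
import numpy as np

def gaussian_log_likelihoods(x, means, inv_vars, log_dets):
    """Diagonal-covariance Gaussian log-likelihoods for one frame.
    x: (D,) feature vector; means, inv_vars: (M, D); log_dets: (M,).
    (Hypothetical helper; constant terms are folded into log_dets.)"""
    diff = x - means                              # (M, D)
    return -0.5 * (log_dets + np.sum(diff * diff * inv_vars, axis=1))

def pruned_state_scores(x, means, inv_vars, log_dets, weights, k=2,
                        floor=-1e10):
    """Score all tied states while computing only the top-k of M shared
    Gaussians per phone; the remaining components are floored.
    weights: (S, M) per-state mixture weights, as in a PTM model."""
    ll = gaussian_log_likelihoods(x, means, inv_vars, log_dets)  # (M,)
    top = np.argpartition(ll, -k)[-k:]            # indices of k best components
    pruned = np.full_like(ll, floor)
    pruned[top] = ll[top]
    # State score = log-sum-exp over weighted components; floored
    # components contribute (effectively) nothing.
    comp = np.log(weights + 1e-30) + pruned[None, :]   # (S, M)
    m = comp.max(axis=1, keepdims=True)
    return (m + np.log(np.exp(comp - m).sum(axis=1, keepdims=True))).ravel()
```

Because the Gaussians are shared across all triphone states of a phone, the top-k selection is done once per phone per frame and then reused for every tied state, which is what makes this form of pruning cheap in a PTM decoder.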