Hierarchical Processing of the Modulation Spectrum for GALE Mandarin LVCSR system

Presented at: Proceedings of the 10th Annual Conference of the International Speech Communication Association (Interspeech), Brighton

Publication date: 2009

This paper investigates the use of TANDEM features based on hierarchical processing of the modulation spectrum. The study is carried out within the GALE project for recognition of Mandarin broadcast data. We describe the improvements obtained from the hierarchical processing and from the addition of features such as pitch and short-term critical band energy. The results are consistent with previous findings on a different LVCSR task, suggesting that the proposed technique is effective and robust across several conditions. Furthermore, we describe the integration into the RWTH GALE LVCSR system trained on 1600 hours of Mandarin data and present the progress across the GALE 2007 and GALE 2008 RWTH systems, resulting in an approximately 20% CER reduction on several data sets.