4
1. Introduction
Internationalization gives us more and more occasions to use a language other than our mother tongue.
Moreover, demand is increasing for automatic systems that use speech recognition.
  – Ex: Computer-assisted language learning (CALL) …
However, the performance of an automatic speech recognition (ASR) system degrades significantly when it is tested on non-native speech rather than native speech.
The main reason: the following two languages have different pronunciation spaces for vowel and consonant sounds.
  – The target language, on which the speech recognition system has already been trained
  – The mother tongue of the non-native speaker

5
1. Introduction (cont.)
An ASR system for non-native speech requires some kind of adaptation to compensate for this mismatch.
  – Pronunciation modeling
    Making the non-native speech recognition system include, for each word, the pronunciation variants produced by non-native speakers
  – Acoustic modeling
    Adapting the acoustic models with an adaptation method such as MLLR or MAP adaptation
  – Language modeling
    Adapting the language model
These approaches can be combined for further improvement.
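The slide names MAP adaptation without giving details. As a rough illustration only, the following sketch shows the standard MAP update of a single Gaussian mean, where the native-speech mean acts as the prior and non-native frames pull it toward the adaptation data. The function name `map_adapt_mean`, the prior weight `tau`, and the toy values are assumptions made for this example and are not taken from the paper.

```python
def map_adapt_mean(prior_mean, frames, posteriors, tau=10.0):
    """MAP update of one Gaussian mean (illustrative sketch).

    prior_mean : mean trained on native speech (list of floats)
    frames     : adaptation feature vectors from non-native speech
    posteriors : occupation probability of this Gaussian for each frame
    tau        : prior weight; the default is a hypothetical choice
    """
    dim = len(prior_mean)
    occupancy = sum(posteriors)
    weighted_sum = [0.0] * dim
    for frame, gamma in zip(frames, posteriors):
        for d in range(dim):
            weighted_sum[d] += gamma * frame[d]
    # Interpolate between the prior mean and the data, weighted by tau
    # and by how much adaptation data this Gaussian has seen.
    return [
        (tau * prior_mean[d] + weighted_sum[d]) / (tau + occupancy)
        for d in range(dim)
    ]


# Toy values (hypothetical): two frames fully assigned to this Gaussian.
adapted = map_adapt_mean([0.0, 0.0], [[2.0, 4.0], [2.0, 4.0]], [1.0, 1.0], tau=2.0)
# adapted == [1.0, 2.0]
```

With little adaptation data the result stays close to the native mean; with more data it approaches the mean of the non-native frames.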

6
1. Introduction (cont.)
In this paper, pronunciation variability is investigated first, and acoustic model adaptation is then performed for the phonetic units.
Pronunciation variability
  – Modeled by a phoneme confusion matrix that maps native pronunciations to non-native pronunciations.
  – Clustering the states of the acoustic models of the target language.
Acoustic model adaptation
  – Tying the states of the variant units.
  – Increasing the number of mixture components of each acoustic model.
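To make the phoneme confusion matrix concrete, here is a minimal sketch that estimates one from aligned phone pairs and lists the most confusable pairs as candidates for state tying. It assumes the pairs come from aligning the canonical (native) phone sequence with the output of a phone recognizer run on non-native speech. The function names, the threshold of 0.2, and the toy data are hypothetical; the slides do not state how the paper selects variant units.

```python
from collections import Counter, defaultdict


def build_confusion_matrix(aligned_pairs):
    """Row-normalized confusion matrix: P(recognized phone | canonical phone)."""
    counts = defaultdict(Counter)
    for canonical, recognized in aligned_pairs:
        counts[canonical][recognized] += 1
    matrix = {}
    for canonical, row in counts.items():
        total = sum(row.values())
        matrix[canonical] = {phone: n / total for phone, n in row.items()}
    return matrix


def variant_pairs(matrix, threshold=0.2):
    """Off-diagonal entries at or above the threshold, most confusable first."""
    pairs = []
    for canonical, row in matrix.items():
        for recognized, prob in row.items():
            if recognized != canonical and prob >= threshold:
                pairs.append((canonical, recognized, prob))
    return sorted(pairs, key=lambda p: p[2], reverse=True)


# Toy alignment (hypothetical counts, for illustration only).
pairs = (
    [("r", "r")] * 6 + [("r", "l")] * 4
    + [("f", "f")] * 7 + [("f", "p")] * 3
    + [("s", "s")] * 10
)
confusion = build_confusion_matrix(pairs)
candidates = variant_pairs(confusion)
# candidates lists ("r", "l") before ("f", "p"); "s" is never confused
```

In a real system, the selected pairs would then drive the tying of HMM states between the canonical unit and its variant.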

14
5. Conclusion
We proposed an acoustic model adaptation method for non-native speech recognition.
The proposed method, which is a data-driven approach, first ranked the phonetic units that gave the most informative pronunciation variability. This variability was obtained by recognizing non-native speech with acoustic models trained on native speech.

15
Another way…
Korean speakers speaking English
  – English model → English model for Koreans
    State tying guided by the phoneme confusion matrix
American speakers speaking Arabic
  – Arabic model → Arabic model for Americans
    Hidden Markov model (HMM) phone sets are trained for English and Arabic, and the English phones are then merged into the Arabic phones to make a new Arabic system.
    The phone-level transcriptions of the training data had to be relabeled with English phones.
    Model merging (MM) adaptation guided by the phoneme confusion matrix
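The slide does not say how the English phones are merged into the Arabic phones. One common form of model merging pools the Gaussian mixture components of the two models and rescales their weights, and the sketch below assumes that form. The function name `merge_mixtures`, the interpolation weight `source_weight`, and the toy mixtures are all hypothetical.

```python
def merge_mixtures(target_gmm, source_gmm, source_weight=0.3):
    """Pool the components of two Gaussian mixtures (illustrative sketch).

    Each mixture is a list of (weight, mean, variance) tuples whose weights
    sum to 1. Target components keep a share of (1 - source_weight) and
    source components share source_weight, so the merged weights still sum to 1.
    """
    if not 0.0 <= source_weight <= 1.0:
        raise ValueError("source_weight must lie in [0, 1]")
    merged = [((1.0 - source_weight) * w, m, v) for w, m, v in target_gmm]
    merged += [(source_weight * w, m, v) for w, m, v in source_gmm]
    return merged


# Toy one-dimensional state models (hypothetical values).
arabic_state = [(0.5, [0.0], [1.0]), (0.5, [1.0], [1.0])]
english_state = [(1.0, [3.0], [2.0])]
merged_state = merge_mixtures(arabic_state, english_state, source_weight=0.3)
# merged_state has three components whose weights sum to 1
```

Which English phone is merged into which Arabic phone would be read off the phoneme confusion matrix; that pairing step is outside this sketch.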