Makuhari, Chiba, Japan
September 26-30, 2010

Dialect Recognition Using a Phone-GMM-Supervector-Based SVM Kernel

Fadi Biadsy (1), Julia Hirschberg (1), Michael Collins (2)

(1) Columbia University, USA
(2) MIT, USA

In this paper, we introduce a new approach to dialect recognition which relies on the
hypothesis that certain phones are realized differently across dialects. Given a speaker's
utterance, we first obtain the most likely phone sequence using a phone recognizer. We
then extract GMM supervectors for each phone instance. Using these vectors, we design a
kernel function that computes the similarities of phones between pairs of utterances. We
employ this kernel to train SVM classifiers that estimate posterior probabilities, used
during recognition. Testing our approach on four Arabic dialects using 30-second cuts, we
compare our performance to five approaches: PRLM; GMM-UBM; our own improved version of
GMM-UBM, which employs fMLLR adaptation; our recent discriminative phonotactic approach; and
a state-of-the-art system, a discriminatively trained SDC-based GMM-UBM. Our kernel-based
technique outperforms all of these approaches; the overall EER of our system is 4.9%.
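
To make the idea of a phone-level kernel between utterances concrete, the following is a
minimal sketch and not the kernel defined in this paper. It assumes each utterance has
already been decoded and reduced to a mapping from phone label to a list of per-instance
supervectors. The functions phone_means, phone_kernel, and gram_matrix, the averaging of
instances per phone type, and the toy two-dimensional vectors are all illustrative
assumptions.

```python
import numpy as np


def phone_means(utterance):
    """Average the supervectors of each phone type in one utterance.

    `utterance` maps a phone label to a list of equal-length vectors,
    one per instance of that phone (hypothetical representation).
    """
    return {phone: np.mean(vectors, axis=0) for phone, vectors in utterance.items()}


def phone_kernel(utt_a, utt_b):
    """Sum of dot products between per-phone mean vectors, over shared phone types.

    Assumed form for illustration only: phones absent from either utterance
    contribute nothing, so this equals a linear kernel on zero-padded,
    concatenated per-phone means.
    """
    means_a, means_b = phone_means(utt_a), phone_means(utt_b)
    shared = means_a.keys() & means_b.keys()
    return float(sum(np.dot(means_a[p], means_b[p]) for p in shared))


def gram_matrix(utterances):
    """Symmetric matrix of pairwise kernel values for a list of utterances."""
    n = len(utterances)
    gram = np.zeros((n, n))
    for i in range(n):
        for j in range(i, n):
            gram[i, j] = gram[j, i] = phone_kernel(utterances[i], utterances[j])
    return gram


# Toy data: two utterances with made-up 2-dimensional "supervectors".
utt_a = {"a": [np.array([1.0, 0.0]), np.array([3.0, 0.0])], "t": [np.array([0.0, 2.0])]}
utt_b = {"a": [np.array([1.0, 1.0])], "s": [np.array([5.0, 5.0])]}
gram = gram_matrix([utt_a, utt_b])
```

A Gram matrix of this kind could then be handed to any SVM implementation that accepts
precomputed kernels, for example scikit-learn's SVC with kernel="precomputed".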