Makuhari, Chiba, Japan
September 26-30, 2010

Advances in Fast Multistream Diarization Based on the Information Bottleneck Framework

Deepu Vijayasenan, Fabio Valente, Hervé Bourlard

Idiap Research Institute, Switzerland

Multistream diarization is an effective way to improve diarization performance, with MFCC
and Time Delay of Arrival (TDOA) being the most commonly used features. This paper
extends our previous work on information bottleneck (IB) diarization to include a large
number of features besides MFCC and TDOA while keeping computational costs low. First,
HMM/GMM and IB systems are compared for two and four feature streams, and an analysis
of errors is performed. Results on a dataset of 17 meetings show that, despite
comparable oracle performance, the IB system is more robust to feature weight variations.
Then a sequential optimization is introduced that further improves the speaker error by
5-8% relative. Finally, computational issues are discussed. The proposed approach
is significantly faster, and its complexity grows only marginally with the number of feature
streams: it runs in 0.75% real time even with four streams, achieving a speaker error of 6%.