In both cases, the task becomes much harder when the speaker is several meters from the microphones, in noisy and reverberant conditions. The task also extends to the processing of continuous audio streams, where it is often combined with speech activity detection to perform speaker diarization, that is, determining who spoke when.