Model compensation is a technique for making speech recognition robust to noise.
It takes a model of the speech and a model of the noise, and produces
a model of the corrupted speech.
In this toolkit, the models of the speech and noise must both be Gaussian.
Only static speech recogniser coefficients are compensated.

The cross-entropy toolkit implements the following well-known model compensation
techniques.

Compute the corrupted speech distribution with data-driven parallel model
combination (DPMC; Gales 1995).
This draws samples from the distribution of the corrupted speech that
follows from the distributions for the speech, the noise, and the
phase factor.
It then fits a Gaussian distribution or a mixture of Gaussians
to these samples.

Returns:

The approximate corrupted speech distribution.
If componentNum is 1, this is a Gaussian.
Otherwise, it is a mixture of Gaussians.

Parameters:

clean – the clean speech distribution.

noise – the noise distribution.

phaseFactor – the phase factor distribution.

sampleNum – the number of samples drawn to fit the resulting
distribution.
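
The sampling procedure can be illustrated with a minimal NumPy sketch of the single-Gaussian case. It is not the toolkit's implementation: the function name dpmc_gaussian, its arguments, and the seed parameter are hypothetical. It also assumes log-spectral static features, the mismatch function exp(y) = exp(x) + exp(n) + 2 alpha exp((x + n) / 2), and a zero-mean Gaussian phase factor whose samples are clipped to [-1, 1] so that the logarithm stays defined. The mixture case (componentNum greater than 1) is not covered.

```python
import numpy as np


def dpmc_gaussian(clean_mean, clean_cov, noise_mean, noise_cov,
                  phase_var, sample_num, seed=0):
    """Fit one Gaussian to Monte Carlo samples of the corrupted speech.

    Hypothetical helper for illustration; assumes log-spectral features.
    """
    rng = np.random.default_rng(seed)

    # Draw clean speech and noise samples, one row per sample.
    x = rng.multivariate_normal(clean_mean, clean_cov, size=sample_num)
    n = rng.multivariate_normal(noise_mean, noise_cov, size=sample_num)

    # Assumption: zero-mean Gaussian phase factor, independent per dimension,
    # clipped to [-1, 1] so the argument of the log below stays non-negative.
    alpha = rng.normal(0.0, np.sqrt(phase_var), size=x.shape)
    alpha = np.clip(alpha, -1.0, 1.0)

    # Assumed mismatch function in the log-spectral domain.
    y = np.log(np.exp(x) + np.exp(n) + 2.0 * alpha * np.exp(0.5 * (x + n)))

    # Maximum-likelihood style fit: sample mean and sample covariance.
    mean = y.mean(axis=0)
    cov = np.atleast_2d(np.cov(y, rowvar=False))
    return mean, cov
```

When the noise is far below the speech level, the fitted mean should stay close to the clean speech mean, which gives a quick sanity check for a sketch like this.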

Apply VTS compensation (Moreno 1996).
This uses a first-order vector Taylor series approximation to the mismatch
function.
The distributions for the speech, noise, and phase factor must be Gaussian.
Because the mismatch function is linearised, the resulting approximate
corrupted speech distribution is also Gaussian.
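
The linearisation can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the toolkit's code: the function name vts_compensate and its arguments are hypothetical, the features are assumed to be log-spectral, the mismatch function is assumed to be exp(y) = exp(x) + exp(n) + 2 alpha exp((x + n) / 2), and the phase factor is assumed to have zero mean. The expansion point is the mean of each input distribution, where the Jacobians with respect to the speech, the noise, and the phase factor are all diagonal.

```python
import numpy as np


def vts_compensate(clean_mean, clean_cov, noise_mean, noise_cov, phase_cov):
    """First-order VTS approximation of the corrupted speech Gaussian.

    Hypothetical helper for illustration; assumes log-spectral features
    and a zero-mean phase factor.
    """
    mu_x = np.asarray(clean_mean, dtype=float)
    mu_n = np.asarray(noise_mean, dtype=float)

    # Mismatch function evaluated at the expansion point (alpha = 0):
    # mu_y = log(exp(mu_x) + exp(mu_n)), computed stably.
    mu_y = np.logaddexp(mu_x, mu_n)

    # Diagonal Jacobians of the mismatch function at the expansion point.
    j_x = np.diag(np.exp(mu_x - mu_y))                      # dy/dx
    j_n = np.diag(np.exp(mu_n - mu_y))                      # dy/dn
    j_a = np.diag(2.0 * np.exp(0.5 * (mu_x + mu_n) - mu_y))  # dy/dalpha

    # A linear function of independent Gaussians is Gaussian, so the
    # covariances propagate through the Jacobians and add.
    cov_y = (j_x @ clean_cov @ j_x.T
             + j_n @ noise_cov @ j_n.T
             + j_a @ phase_cov @ j_a.T)
    return mu_y, cov_y
```

With equal speech and noise means, the speech and noise Jacobians in this sketch are both 0.5 times the identity, so each input covariance contributes a quarter of its value to the result.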