When the independence assumption holds, blind ICA separation of a mixed signal gives very good results. ICA is also applied, for analysis purposes, to signals that are not assumed to have been generated by mixing. A simple application of ICA is the “cocktail party problem”, in which the underlying speech signals are separated from sample data consisting of people talking simultaneously in a room. The problem is usually simplified by assuming that there are no time delays or echoes. Note that if N sources are present, at least N observations (e.g. microphones) are needed to recover the original signals. This constitutes the square case (J = D, where D is the input dimension of the data and J is the dimension of the model). The underdetermined (J > D) and overdetermined (J < D) cases have also been investigated.

ICA finds the independent components (also called factors, latent variables or sources) by maximizing the statistical independence of the estimated components. Non-Gaussianity, motivated by the central limit theorem, is one criterion for measuring the independence of the components. It can be measured, for instance, by kurtosis or by approximations of negentropy. Mutual information is another popular criterion for measuring the statistical independence of signals.
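The role of kurtosis as a non-Gaussianity measure can be illustrated with a minimal sketch. It assumes NumPy is available; the helper `excess_kurtosis`, the choice of distributions and the sample size are illustrative assumptions, not part of any particular ICA implementation. The sketch estimates the excess kurtosis (zero for a Gaussian) of several signals and shows that a sum of two independent uniform sources is closer to Gaussian than a single source, which is the central-limit-theorem intuition mentioned above.

```python
import numpy as np

def excess_kurtosis(y):
    """Sample excess kurtosis: mean of z**4 minus 3 for standardized z (0 for a Gaussian)."""
    z = (y - y.mean()) / y.std()
    return np.mean(z ** 4) - 3.0

rng = np.random.default_rng(0)
n = 200_000  # illustrative sample size

gaussian = rng.standard_normal(n)
uniform = rng.uniform(-1.0, 1.0, n)   # sub-Gaussian: negative excess kurtosis
laplace = rng.laplace(0.0, 1.0, n)    # super-Gaussian: positive excess kurtosis

# Sum of two independent uniform sources: closer to Gaussian than one source alone.
mixture = rng.uniform(-1.0, 1.0, n) + rng.uniform(-1.0, 1.0, n)

k_gauss = excess_kurtosis(gaussian)
k_unif = excess_kurtosis(uniform)
k_lap = excess_kurtosis(laplace)
k_mix = excess_kurtosis(mixture)

print(f"gaussian {k_gauss:+.2f}  uniform {k_unif:+.2f}  "
      f"laplace {k_lap:+.2f}  mixture {k_mix:+.2f}")
```

Because mixing pushes the kurtosis toward zero, a separation procedure can work in the opposite direction and look for projections whose kurtosis is as far from zero as possible.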

In general, ICA cannot identify the actual number of source signals, a uniquely correct ordering of the source signals, nor the proper scaling (including sign) of the source signals.
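The ordering and scaling ambiguities can be checked numerically. The following sketch assumes NumPy and uses a hypothetical 2×2 mixing matrix `A`, a permutation `P` and a diagonal scaling `D`, all chosen only for illustration. It shows that reordered and rescaled sources, combined with a compensating mixing matrix, produce exactly the same observations, so the observations alone cannot distinguish the two descriptions.

```python
import numpy as np

rng = np.random.default_rng(1)

A = np.array([[1.0, 0.5],
              [0.3, 2.0]])                    # hypothetical mixing matrix
S = rng.uniform(-1.0, 1.0, size=(2, 1000))    # two sources, 1000 samples each
X = A @ S                                     # observed mixtures

P = np.array([[0.0, 1.0],
              [1.0, 0.0]])                    # swaps the two sources
D = np.diag([-2.0, 0.5])                      # rescales them and flips one sign

S_alt = D @ P @ S                             # reordered, rescaled sources
A_alt = A @ np.linalg.inv(D @ P)              # compensating mixing matrix
X_alt = A_alt @ S_alt

same_observations = np.allclose(X, X_alt)
print(same_observations)
```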

ICA is important to blind signal separation and has many practical applications. It is closely related to (or even a special case of) the search for a factorial code of the data, i.e., a new vector-valued representation of each data vector such that it gets uniquely encoded by the resulting code vector (loss-free coding), but the code components are statistically independent.

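The components $x_i$ of the observed random vector $\mathbf{x} = (x_1, \ldots, x_m)^T$ are generated as a sum of the independent components $s_k$, $k = 1, \ldots, n$:

$$x_i = a_{i,1} s_1 + \cdots + a_{i,k} s_k + \cdots + a_{i,n} s_n$$

weighted by the mixing weights $a_{i,k}$.

The same generative model can be written in vectorial form as
$$\mathbf{x} = \sum_{k=1}^{n} s_k \mathbf{a}_k,$$
where the observed random vector $\mathbf{x}$ is represented by the basis vectors $\mathbf{a}_k = (a_{1,k}, \ldots, a_{m,k})^T$. The basis vectors $\mathbf{a}_k$ form the columns of the mixing matrix $\mathbf{A} = (\mathbf{a}_1, \ldots, \mathbf{a}_n)$, and the generative formula can be written as $\mathbf{x} = \mathbf{A}\mathbf{s}$, where $\mathbf{s} = (s_1, \ldots, s_n)^T$.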
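The equivalence of the component-wise, vector and matrix forms of the generative model can be verified with a short sketch. It assumes NumPy and writes the observations as `X_matrix`, the mixing matrix as `A` and the sources as `s`; the dimensions and the random values are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, N = 3, 3, 5                  # sources, observed components, samples

A = rng.normal(size=(m, n))        # mixing matrix; column k is a basis vector
s = rng.laplace(size=(n, N))       # each column is one realization of the sources

# Matrix form: every observation is the mixing matrix times a source vector.
X_matrix = A @ s

# Vector form: weighted sum of the basis vectors (columns of A).
X_sum = sum(np.outer(A[:, k], s[k, :]) for k in range(n))

# Component form for the first sample: each observed component is a
# weighted sum of the sources, using one row of mixing weights.
x0_components = np.array(
    [sum(A[i, k] * s[k, 0] for k in range(n)) for i in range(m)]
)

print(np.allclose(X_matrix, X_sum), np.allclose(X_matrix[:, 0], x0_components))
```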

Given the model and realizations (samples) $\mathbf{x}_1, \ldots, \mathbf{x}_N$ of the random vector $\mathbf{x}$, the task is to estimate both the mixing matrix $\mathbf{A}$ and the sources $\mathbf{s}$. This is done by adaptively calculating the vectors $\mathbf{w}$ and setting up a cost function which either maximizes the non-Gaussianity of the calculated $s_k = \mathbf{w}^T \mathbf{x}$ or minimizes the mutual information. In some cases, a priori knowledge of the probability distributions of the sources can be used in the cost function.
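As a rough illustration of the estimation task, the sketch below separates two mixed signals by maximizing a kurtosis-based cost. It is a brute-force toy for the two-source case, not FastICA or any other published algorithm: the observations are whitened, after which the remaining unmixing is a rotation (up to sign and order), and the rotation angle is found by a grid search. NumPy is assumed, and the mixing matrix, the uniform sources and the grid resolution are hypothetical choices.

```python
import numpy as np

def excess_kurtosis(y):
    """Sample excess kurtosis (0 for a Gaussian)."""
    z = (y - y.mean()) / y.std()
    return np.mean(z ** 4) - 3.0

def rotation(theta):
    c, sn = np.cos(theta), np.sin(theta)
    return np.array([[c, -sn], [sn, c]])

rng = np.random.default_rng(3)
N = 20_000
S = rng.uniform(-1.0, 1.0, size=(2, N))   # independent, non-Gaussian sources
A = np.array([[1.0, 0.6],
              [0.4, 1.0]])                # hypothetical mixing matrix
X = A @ S                                 # only X is used by the estimator

# Center and whiten the observations so that their covariance is the identity.
Xc = X - X.mean(axis=1, keepdims=True)
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc))
V = eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T
Z = V @ Xc

def cost(theta):
    """Total non-Gaussianity of the rotated components."""
    Y = rotation(theta) @ Z
    return sum(abs(excess_kurtosis(y)) for y in Y)

# A rotation by 90 degrees only swaps and sign-flips the components,
# so scanning a quarter turn covers every distinct solution.
thetas = np.linspace(0.0, np.pi / 2, 181)
best = max(thetas, key=cost)

W = rotation(best) @ V     # rows play the role of the estimated w vectors
S_est = W @ Xc             # estimated sources, up to order, sign and scale
```

The estimated components match the true sources only up to permutation, sign and scale, so a comparison should use absolute correlations rather than element-wise differences.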

The original sources $\mathbf{s}$ can be recovered by multiplying the observed signals $\mathbf{x}$ by the inverse of the mixing matrix, $\mathbf{W} = \mathbf{A}^{-1}$, also known as the unmixing matrix. Here it is assumed that the mixing matrix is square ($n = m$). If the number of basis vectors is greater than the dimensionality of the observed vectors, $n > m$, the task is overcomplete but is still solvable.
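For the square case, recovery with the unmixing matrix amounts to a single matrix inversion when the mixing matrix is known. The sketch below assumes NumPy and a hypothetical invertible 3×3 mixing matrix; in practice the mixing matrix is unknown and the unmixing matrix has to be estimated from the observations.

```python
import numpy as np

rng = np.random.default_rng(4)

A = np.array([[1.0, 0.5, 0.2],
              [0.3, 1.0, 0.4],
              [0.1, 0.6, 1.0]])     # hypothetical square mixing matrix
S = rng.laplace(size=(3, 1000))     # three sources, 1000 samples each
X = A @ S                           # observed signals

W = np.linalg.inv(A)                # unmixing matrix: inverse of the mixing matrix
S_rec = W @ X                       # recovered sources

print(np.allclose(S_rec, S))
```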