Independence analysis

from Wikipedia, the free encyclopedia

Independence analysis, or independent component analysis (ICA), is a method of multivariate statistics. First published in 1991, it is used to recover statistically independent components from a mixture of statistically independent random variables. It is closely related to the blind source separation (BSS) problem.

Problem

It is assumed that the vector s consists of statistically independent random variables. For ICA to be applicable, at most one of these random variables may be Gaussian distributed. The random variables are multiplied by a mixing matrix A; for simplicity, this mixing matrix is assumed to be square. The result is the vector x of mixed random variables, which has the same dimension as s:

x = A s

The aim of ICA is to reconstruct the independent random variables in the vector s as faithfully as possible. Only the result of the mixture, x, is available, together with the knowledge that the random variables were originally stochastically independent. A suitable demixing matrix B is sought such that

ŝ = B x ≈ s.
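The mixing model can be sketched numerically. In the following minimal example, the two source distributions (uniform and Laplace) and the mixing matrix are illustrative assumptions, not part of any standard setup; the point is only that the mixtures become correlated even though the sources are independent:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two independent, non-Gaussian sources s (uniform and Laplace; illustrative choice).
n = 10000
s = np.vstack([rng.uniform(-1, 1, n), rng.laplace(0, 1, n)])

# A hypothetical square mixing matrix A.
A = np.array([[1.0, 0.5],
              [0.3, 1.0]])

# Observed mixtures x = A s; only x is available to the ICA.
x = A @ s

# The mixtures are correlated even though the sources are not.
print(abs(np.corrcoef(s)[0, 1]), abs(np.corrcoef(x)[0, 1]))
```

The task of ICA is to recover s from x alone, without knowing A.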

Since neither the mixing matrix nor the independent random variables are known, they can only be reconstructed up to certain ambiguities. The variance, and thus the energy, of the independent random variables cannot be determined, since each independent random variable s_i and the corresponding column vector a_i of the mixing matrix can be weighted with an arbitrary constant c_i ≠ 0 so that the scalings cancel each other out:

x = Σ_i a_i s_i = Σ_i (a_i / c_i) (c_i s_i)

In addition, the order of the column vectors of the mixing matrix cannot be reconstructed.

Solving the problem

As a rule, it is assumed that the mixed random variables are zero-mean. If this is not the case, it can be achieved by subtracting the mean.

Pre-whitening

Pre-whitening is a linear transformation used for preprocessing. A principal component analysis (PCA) is carried out for this purpose, yielding the eigenvalues and eigenvectors of the covariance matrix of the mixed random variables. The eigenvectors form the rows of the rotation matrix R, which is multiplied by the vector x. The eigenvalues λ_i correspond to the variances of the respective principal components. The reciprocals of their square roots form the diagonal matrix D, so that

z = D R x, with D = diag(1/√λ_1, …, 1/√λ_n).

Multiplying by the diagonal matrix normalizes the variance of each principal component to 1.
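The centering and pre-whitening steps can be sketched as follows; the particular mixture is an illustrative assumption, but the PCA-based whitening itself follows the procedure above:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative mixture of two non-Gaussian sources (assumed setup).
s = np.vstack([rng.uniform(-1, 1, 20000), rng.laplace(0, 1, 20000)])
A = np.array([[1.0, 0.5], [0.3, 1.0]])
x = A @ s
x -= x.mean(axis=1, keepdims=True)   # centering: subtract the mean

# PCA: eigen-decomposition of the covariance matrix of x.
cov = np.cov(x)
eigvals, eigvecs = np.linalg.eigh(cov)

# Rows of the rotation matrix R are the eigenvectors;
# D holds the reciprocal square roots of the eigenvalues.
R = eigvecs.T
D = np.diag(1.0 / np.sqrt(eigvals))

# Whitened data z = D R x has sample covariance close to the identity.
z = D @ R @ x
print(np.cov(z))
```

After this step the components are uncorrelated with unit variance, but not yet independent.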

Determination of the independent components

After pre-whitening, the random variables are not yet stochastically independent, but the problem has been reduced to the search for an orthogonal rotation matrix W:

ŝ = W z.

The search for W relies on the central limit theorem, which states that a mixture of standardized, centered random variables comes to resemble a normal distribution as their number increases. Since the random variables in z meet this requirement, there must be a rotation matrix W that generates random variables that are as non-Gaussian as possible. There are several possible approaches to implementing this search in practice.

Kurtosis

Kurtosis is a measure of the deviation from a normal distribution. It is defined by

kurt(y) = E[y⁴] − 3 (E[y²])².

Since the variance of the random variables is normalized, E[y²] equals one, so kurt(y) = E[y⁴] − 3. The kurtosis is zero when the distribution is Gaussian. If the kurtosis is negative, the distribution increasingly resembles a uniform distribution; if it is positive, the distribution is more like a Laplace distribution. To deviate from a normal distribution, the kurtosis must therefore be either maximized or minimized. Gradient methods are used for this purpose, for example based on Oja's learning rule.
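As a sketch of the kurtosis criterion, the following example searches over rotation angles of a two-dimensional whitened mixture and picks the rotation whose outputs deviate most from Gaussianity. The uniform/Laplace sources and the mixing angle are illustrative assumptions, and the grid search stands in for the gradient methods mentioned above:

```python
import numpy as np

rng = np.random.default_rng(2)

def kurtosis(y):
    # Excess kurtosis E[y^4] - 3 for zero-mean, unit-variance y.
    return np.mean(y**4) - 3.0

# Assumed setup: unit-variance uniform and Laplace sources, mixed by a rotation
# (an orthogonal mix of already-white sources stays white).
s = np.vstack([rng.uniform(-np.sqrt(3), np.sqrt(3), 50000),
               rng.laplace(0, 1 / np.sqrt(2), 50000)])
theta_mix = 0.7
Q = np.array([[np.cos(theta_mix), -np.sin(theta_mix)],
              [np.sin(theta_mix),  np.cos(theta_mix)]])
z = Q @ s

def objective(theta):
    # Summed |kurtosis| of the rotated outputs: large when non-Gaussian.
    c, d = np.cos(theta), np.sin(theta)
    y = np.array([[c, -d], [d, c]]) @ z
    return abs(kurtosis(y[0])) + abs(kurtosis(y[1]))

# Grid search over rotation angles in [0, pi/2].
thetas = np.linspace(0, np.pi / 2, 181)
best = thetas[np.argmax([objective(t) for t in thetas])]
print(best)
```

The best de-rotation approximately undoes the mixing rotation (up to the permutation and sign ambiguities described above, i.e. modulo π/2).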

Negentropy

Another approach is to maximize the negentropy

J(y) = H(y_Gauss) − H(y),

where H denotes the (differential) entropy and y_Gauss is the normally distributed random variable whose expectation and variance match those of y.

However, since the negentropy is difficult to determine exactly, approximation formulas are usually used instead.

An example is the approximation based on the (often empirically determined) skewness and kurtosis of the distribution:

J(y) ≈ (1/12) E[y³]² + (1/48) kurt(y)².
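This moment-based approximation is easy to compute from samples. In the following sketch the test distributions are illustrative; the approximation is near zero for Gaussian data and positive for non-Gaussian data:

```python
import numpy as np

rng = np.random.default_rng(3)

def negentropy_approx(y):
    # Approximation via skewness E[y^3] and excess kurtosis E[y^4] - 3
    # of a zero-mean, unit-variance sample:
    #   J(y) ~ (1/12) E[y^3]^2 + (1/48) kurt(y)^2
    skew = np.mean(y**3)
    kurt = np.mean(y**4) - 3.0
    return skew**2 / 12.0 + kurt**2 / 48.0

gauss = rng.standard_normal(100000)
laplace = rng.laplace(0, 1 / np.sqrt(2), 100000)   # unit variance

print(negentropy_approx(gauss), negentropy_approx(laplace))
```

Maximizing this quantity over rotations therefore drives the outputs away from Gaussianity, as required by the central limit theorem argument above.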

Fast ICA

Fast ICA is a fixed-point algorithm that solves the problem using a Newton method.
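A minimal sketch of the one-unit FastICA fixed-point iteration with the tanh nonlinearity follows; the two-channel whitened mixture is an illustrative assumption, and a full implementation would extract several components with a decorrelation step between them:

```python
import numpy as np

rng = np.random.default_rng(4)

# Whitened two-channel mixture (assumed setup: unit-variance sources
# mixed by a rotation, so z is already white).
s = np.vstack([rng.uniform(-np.sqrt(3), np.sqrt(3), 50000),
               rng.laplace(0, 1 / np.sqrt(2), 50000)])
th = 0.4
Q = np.array([[np.cos(th), -np.sin(th)], [np.sin(th), np.cos(th)]])
z = Q @ s

# One-unit FastICA fixed-point update with g = tanh:
#   w <- E[z g(w^T z)] - E[g'(w^T z)] w,  then normalize w.
w = rng.standard_normal(2)
w /= np.linalg.norm(w)
for _ in range(100):
    wz = w @ z
    g, g_prime = np.tanh(wz), 1.0 - np.tanh(wz) ** 2
    w_new = (z * g).mean(axis=1) - g_prime.mean() * w
    w_new /= np.linalg.norm(w_new)
    converged = abs(abs(w_new @ w) - 1.0) < 1e-8
    w = w_new
    if converged:
        break

# The converged weight vector extracts one source (up to sign).
y = w @ z
print(np.abs(np.corrcoef(np.vstack([y, s]))[0, 1:]).max())
```

The normalization keeps w on the unit sphere, and convergence is detected when the direction of w stops changing (the sign may flip between iterations, reflecting the sign ambiguity of ICA).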


References

  1. Christian Jutten, Jeanny Herault: Blind Separation of Sources, Part I: An Adaptive Algorithm Based on Neuromimetic Architecture. In: Signal Processing. Vol. 24, No. 1, August 1, 1991, pp. 1–10, doi:10.1016/0165-1684(91)90079-X.
  2. A. Hyvärinen, E. Oja: Independent component analysis: algorithms and applications. In: Neural Networks. Vol. 13, No. 4–5, June 2000, pp. 411–430, doi:10.1016/S0893-6080(00)00026-5.