Pseudoinverse

The pseudo inverse of a matrix is a term from the mathematical branch of linear algebra , which also plays an important role in numerical mathematics . It is a generalization of the inverse matrix to singular and non-square matrices, which is why it is often referred to as the generalized inverse . The most common use case for pseudo inverse is the solution of linear systems of equations and linear equilibrium problems .

A first form was described by EH Moore (1920) and Roger Penrose (1955). The Moore-Penrose inverse named after them is not the only way to define a pseudo inverse, but pseudo inverse is often used synonymously with Moore-Penrose inverse (e.g. in). The Moore-Penrose inverse is defined and unique for all matrices with entries from the real or complex numbers . With it you can calculate the optimal solution with the smallest Euclidean norm for linear compensation problems .

A numerically robust method for determining the Moore-Penrose inverse is based on the singular value decomposition .

General pseudo-inverse

The generalization of the formation of the inverse of a matrix to singular matrices is not handled uniformly in the literature and is often based on the problem to be solved (some examples of such generalizations are listed below).

According to Adi Ben-Israel , a definition of generalized inverses should at least meet the following three requirements:

For regular matrices the usual inverse should clearly result.
In a generalized sense, singular matrices should also be invertible (at least some, not necessarily all).
For singular matrices, the generalized inverses should have properties similar to ordinary inverses of regular matrices.

As a starting point for the construction of various pseudo inverses, Adi Ben-Israel then weakens the four defining statements for the Moore-Penrose inverse described in the next section in different directions and supplements them with other conditions. The minimum requirement for a pseudo inverse is the following: A matrix is a pseudo inverse of if and only if: ${\ displaystyle B}$ ${\ displaystyle A}$

{\ displaystyle ABA = A}

On the other hand, Max Koecher calls a matrix a pseudo inverse of if and only if the following two statements are for it ${\ displaystyle B}$ ${\ displaystyle A}$

{\ displaystyle ABA = A}

and

{\ displaystyle BAB = B}

hold true.

The first condition ensures that the columns from are mapped to solutions of the system of equations . Due to the second statement, no columns of different from the zero vector can lie in the kernel of . ${\ displaystyle y}$ ${\ displaystyle A}$ ${\ displaystyle B}$ ${\ displaystyle x}$ ${\ displaystyle y = Ax}$ ${\ displaystyle B}$ ${\ displaystyle A}$

The Moore-Penrose Inverse

The Moore-Penrose inverse (also simply pseudo-inverse) of a matrix is the uniquely determined matrix that fulfills the following four properties ("Moore-Penrose conditions"): ${\ displaystyle A \ in \ mathbb {C} ^ {m \ times n}}$ ${\ displaystyle A ^ {+} \ in \ mathbb {C} ^ {n \ times m}}$

${\ displaystyle AA ^ {+} A \; \, \, = A}$	( is a generalized inverse .) ${\ displaystyle A ^ {+}}$
${\ displaystyle A ^ {+} AA ^ {+} = A ^ {+}}$	( acts like a weak inverse .) ${\ displaystyle A ^ {+}}$
${\ displaystyle (AA ^ {+}) ^ {*} \, = AA ^ {+}}$	(The matrix is Hermitian .) ${\ displaystyle AA ^ {+}}$
${\ displaystyle (A ^ {+} A) ^ {*} \, = A ^ {+} A}$	(The matrix is also Hermitian.) ${\ displaystyle A ^ {+} A}$

The adjoint matrix denotes a matrix . In the case of matrices with entries from the real numbers, this is identical to the matrix to be transposed . ${\ displaystyle M ^ {*}}$ ${\ displaystyle M}$ ${\ displaystyle M}$ ${\ displaystyle M ^ {T}}$

The Moore-Penrose inverse can also be defined by a limit value :

{\ displaystyle {\ begin {aligned} A ^ {+} & = \ lim _ {\ delta \ searrow 0} \, \; (A ^ {*} A + \ delta E_ {n}) ^ {- 1} A ^ {*} \\ & = \ lim _ {\ delta \ searrow 0} \; A ^ {*} (AA ^ {*} + \ delta E_ {m}) ^ {- 1} \ end {aligned}} }

with as the identity matrix in . This limit value also exists when and do not exist. ${\ displaystyle E_ {k}}$ ${\ displaystyle \ mathbb {C} ^ {k \ times k}}$ ${\ displaystyle (AA ^ {*}) ^ {- 1}}$ ${\ displaystyle (A ^ {*} A) ^ {- 1}}$

Calculation rules

{\ displaystyle (A ^ {+}) ^ {+} = A}

{\ displaystyle {\ overline {A}} ^ {+} = {\ overline {A ^ {+}}}}

{\ displaystyle (A ^ {*}) ^ {+} = (A ^ {+}) ^ {*}}

{\ displaystyle (\ lambda A) ^ {+} = \ lambda ^ {- 1} A ^ {+}}

For

{\ displaystyle \ lambda \ neq 0}

Special cases

If the columns of the matrix are linearly independent, then is invertible. In this case, the following equation applies ${\ displaystyle A}$ ${\ displaystyle A ^ {*} A}$

{\ displaystyle A ^ {+} = (A ^ {*} A) ^ {- 1} A ^ {*}.}

If one takes the first limit value definition for the Moore-Penrose inverse, the summand disappears . It follows that a left inverse is to. ${\ displaystyle \ delta E_ {n}}$ ${\ displaystyle A ^ {+}}$ ${\ displaystyle A}$

{\ displaystyle A ^ {+} A = E_ {n}}

If the rows of the matrix are linearly independent, then is invertible. In this case, the following equation applies ${\ displaystyle A}$ ${\ displaystyle AA ^ {*}}$

{\ displaystyle A ^ {+} = A ^ {*} (AA ^ {*}) ^ {- 1}.}

If one takes the second limit value definition for the Moore-Penrose inverse, the summand disappears . It follows that a right inverse is to. ${\ displaystyle \ delta E_ {m}}$ ${\ displaystyle A ^ {+}}$ ${\ displaystyle A}$

{\ displaystyle AA ^ {+} = E_ {m}}

If both the columns and the rows of a matrix are independent, then the matrix is invertible and the pseudo inverse coincides with the inverse.

If the product of two matrices is defined and one of the two is a unitary matrix , then we have ${\ displaystyle AB}$

{\ displaystyle (AB) ^ {+} = B ^ {+} A ^ {+}.}

The pseudo inverse can also be defined for scalars and vectors by considering them as matrices. For scalars, the pseudo inverse of zero is zero again, and for all other values it is . The following applies to vectors ${\ displaystyle x}$ ${\ displaystyle x ^ {- 1}}$

{\ displaystyle x ^ {+} \! = {\ begin {cases} 0 ^ {T}, & {\ text {if}} x = 0, \\ {\ frac {x ^ {*}} {x ^ {*} x}}, & {\ text {otherwise}}. \ end {cases}}}

These claims can be verified by checking the criteria for the Moore-Penrose Inverse.

If the matrix is Hermitian (or symmetric in the real case), then it is also Hermitian (symmetric). In this case, the decomposition follows from the spectral theorem ${\ displaystyle A}$ ${\ displaystyle A ^ {+}}$

{\ displaystyle UAU ^ {*} = D}

and thus

{\ displaystyle A ^ {+} = U ^ {*} D ^ {+} U}

,

where the pseudo inverse of the diagonal matrix is through ${\ displaystyle D ^ {+}}$ ${\ displaystyle D}$

{\ displaystyle d_ {ii} ^ {+} = {\ begin {cases} 0, & {\ text {if}} d_ {ii} = 0, \\ {\ frac {1} {d_ {ii}}} , & {\ text {otherwise}} \ end {cases}}}

is given for all diagonal entries.

calculation

If the rank is the matrix , then it can be decomposed into the product of a matrix and a matrix . It applies ${\ displaystyle k}$ ${\ displaystyle m \ times n}$ ${\ displaystyle A}$ ${\ displaystyle A}$ ${\ displaystyle A = BC}$ ${\ displaystyle m \ times k}$ ${\ displaystyle B}$ ${\ displaystyle k \ times n}$ ${\ displaystyle C}$

{\ displaystyle A ^ {+} = C ^ {*} (CC ^ {*}) ^ {- 1} (B ^ {*} B) ^ {- 1} B ^ {*}.}

Has full row rank, that is, it applies , then you can choose for the identity matrix and the above formula is reduced to

{\ displaystyle A}

{\ displaystyle k = m}

{\ displaystyle B}

{\ displaystyle A ^ {+} = A ^ {*} (AA ^ {*}) ^ {- 1}.}

Similarly, for a matrix of full column rank, that is , the equation holds

{\ displaystyle A}

{\ displaystyle k = n}

{\ displaystyle A ^ {+} = (A ^ {*} A) ^ {- 1} A ^ {*}.}

Another method for calculating the pseudo inverse exists with the singular value decomposition . If the singular value decomposition of , then applies ${\ displaystyle A = U \ Sigma V ^ {*}}$ ${\ displaystyle A}$

{\ displaystyle A ^ {+} = V \ Sigma ^ {+} U ^ {*}.}

With a diagonal matrix such as , the pseudo inverse is created by transposing the matrix and inverting the non-zero elements, i.e. forming with

{\ displaystyle \ Sigma \ in \ mathbb {C} ^ {m \ times n}}

{\ displaystyle \ Sigma ^ {+} \ in \ mathbb {C} ^ {n \ times m}}

{\ displaystyle (\ Sigma) _ {ij} ^ {+} = {\ begin {cases} {\ frac {1} {\ sigma _ {i}}}, & {\ text {if}} i = j \ wedge \ sigma _ {i} \ neq 0, \\ 0, & {\ text {otherwise}}. \ end {cases}}}

With the help of the modification of matrices, the pseudo inverse can be represented implicitly or calculated.
Greville's algorithm is a finite iterative method for the column-wise computation of the Moore-Penrose inverses.

The method in which the matrix is required is often used for the numerical calculation of the solution of overdetermined systems of equations for the sake of convenience, but is numerically unstable because the condition of the matrix is squared. The use of QR decomposition is considered a stable and efficient numerical method . The method based on singular value decomposition is the most complex, but also the most benign numerically. The method based on the change offers a compromise between effort and numerical stability. ${\ displaystyle A ^ {*} A}$

There is also an overview of the numerical effort and stability of the procedures.

Applications

If the system of equations can not be solved, the solution can be combined with the pseudo-inverse to the method of least squares , so that with the smallest Euclidean norm as calculated. ${\ displaystyle Ax = b}$ ${\ displaystyle \ | Ax-b \ | _ {2}}$ ${\ displaystyle {\ bar {x}} = A ^ {+} b}$

If there are infinitely many solutions for the system of equations , these can be found using ${\ displaystyle Ax = b}$

{\ displaystyle x = A ^ {+} b + (E_ {n} -A ^ {+} A) y, \ quad y \ in \ mathbb {C} ^ {n}}

determine. The solution of the system of equations is that which has the smallest distance from the Euclidean norm. ${\ displaystyle x}$ ${\ displaystyle y}$

Selected other versions of generalized inverses

Drazin Inverse

Let be a matrix with index (the index of is the minimum integer for which and have the same kernel ). Then the Drazin inverse is the uniquely defined matrix that satisfies the conditions ${\ displaystyle A \ in \ mathbb {R} ^ {n \ times n}}$ ${\ displaystyle i \ in \ mathbb {N}}$ ${\ displaystyle A}$ ${\ displaystyle i \ geq 1}$ ${\ displaystyle A ^ {i}}$ ${\ displaystyle A ^ {(i + 1)}}$ ${\ displaystyle n \ times n}$ ${\ displaystyle A ^ {D}}$

${\ displaystyle A ^ {i} A ^ {D} \, A = A ^ {i}}$
${\ displaystyle A ^ {D} \, AA ^ {D} = A ^ {D}}$
${\ displaystyle A \, A ^ {D} = A ^ {D} \, A}$

enough. It was introduced by Michael Drazin .

calculation

For the calculation one can use the decomposition

{\ displaystyle T {\ begin {pmatrix} \ Lambda & 0 \\ 0 & N \ end {pmatrix}} T ^ {- 1} = A}

of the matrix in Jordan normal form, where the regular part is Jordan form and nilpotent. The Drazin inverse then results in ${\ displaystyle A \ in \ mathbb {R} ^ {n \ times n}}$ ${\ displaystyle \ Lambda}$ ${\ displaystyle N}$

{\ displaystyle A ^ {D}: = T {\ begin {pmatrix} \ Lambda ^ {- 1} & 0 \\ 0 & 0 \ end {pmatrix}} T ^ {- 1}}

.

The Drazin inverse of a matrix with an index (which therefore equals the zero matrix ) is also known as the group inverse . The group inverse is a pseudo inverse as defined by Koecher. ${\ displaystyle i = 1}$ ${\ displaystyle N}$

Applications

1. An important application for the Drazin Inverse is the analytical representation of the solution of time-invariant linear descriptor systems. Take the difference equation as an example

{\ displaystyle Ax_ {k + 1} = x_ {k}}

a time-discrete descriptor system with the real matrix . The solution of the difference equation satisfies the equations with . So initial values are only consistent if they are in all images of the matrices (otherwise the solution breaks off after a finite number of steps). The solution to the difference equation is then . ${\ displaystyle n \ times n}$ ${\ displaystyle A}$ ${\ displaystyle \ {x_ {k} \} _ {k = 0,1,2, \ ldots} \ subset \ mathbb {R} ^ {n}}$ ${\ displaystyle A ^ {k} x_ {k} = x_ {0}}$ ${\ displaystyle k = 0,1,2, \ ldots}$ ${\ displaystyle x_ {0}}$ ${\ displaystyle A ^ {k}}$ ${\ displaystyle x_ {k} = (A ^ {D}) ^ {k} x_ {0}}$

2. The equation applies to real or complex matrices with index ${\ displaystyle n \ times n}$ ${\ displaystyle A}$ ${\ displaystyle i}$

{\ displaystyle \ int _ {0} ^ {t} \ exp (A {\ bar {t}}) \, d {\ bar {t}} = \ sum _ {k = 0} ^ {i-1} {\ frac {t ^ {k + 1}} {(k + 1)!}} A ^ {k} (E_ {n} -AA ^ {D}) + (\ exp (At) -E_ {n} ) A ^ {D}.}

This allows the step response of a linear time-invariant dynamic system

{\ displaystyle {\ begin {aligned} {\ dot {x}} & = Ax + bu \\ y & = c ^ {T} x \ end {aligned}}}

with input signal

{\ displaystyle u (t) = {\ begin {cases} 0 & {\ text {at}} t <0 \\ 1 & {\ text {at}} t \ geq 0, \ end {cases}}}

State vector ( zero vector), system matrix and input and output vectors in the form ${\ displaystyle x: \ mathbb {R} \ rightarrow \ mathbb {R} ^ {n \ times 1}}$ ${\ displaystyle x (0)}$ ${\ displaystyle A \ in \ mathbb {R} ^ {n \ times n}}$ ${\ displaystyle b, c \ in \ mathbb {R} ^ {n \ times 1}}$

{\ displaystyle y (t) = {\ begin {cases} 0 & {\ text {at}} t <0 \\ c ^ {T} \ left (\ sum _ {k = 0} ^ {i-1} { \ frac {t ^ {k + 1}} {(k + 1)!}} A ^ {k} (E_ {n} -AA ^ {D}) + (\ exp (At) -E_ {n}) A ^ {D} \ right) b & {\ text {bei}} t \ geq 0 \ end {cases}}}

represent.

Restricted Generalized Inverse - the Bott-Duffin Inverse

In some practical tasks, the solution is a system of linear equations ${\ displaystyle x}$

{\ displaystyle Ax = b \ qquad ({\ text {with given}} A \ in \ mathbb {R} ^ {m \ times n} {\ text {and}} b \ in \ mathbb {R} ^ {m })}

only permissible if it lies within a certain linear subspace of . It is also said that the problem by a restringiertes linear equation system will be described (English constrained linear equation ). ${\ displaystyle L}$ ${\ displaystyle \ mathbb {R} ^ {m}}$

The following is the will orthogonal projector on with designated. The restricted system of linear equations ${\ displaystyle L}$ ${\ displaystyle P_ {L}}$

{\ displaystyle Ax = b \ qquad x \ in L}

is solvable if and only if this is for the unconstrained system of equations

{\ displaystyle (AP_ {L}) x = b \ qquad x \ in \ mathbb {R} ^ {m}}

applies. If the subspace is a real subspace of , then the system matrix of the unrestricted problem is also singular if the system matrix of the restricted problem can be inverted (in this case applies ). This explains that pseudo inverses are also used to solve restricted problems. A pseudo-inverse of is then also called - restricted pseudo-inverse of . This definition initially seems to contradict Requirement 1 from Section General Pseudo-inverse . However, this contradiction is put into perspective again when one considers that the restricted pseudo inverse for bijectives is injective in the space of interest and that the image space has the same dimension as . ${\ displaystyle L}$ ${\ displaystyle \ mathbb {R} ^ {m}}$ ${\ displaystyle (AP_ {L})}$ ${\ displaystyle A}$ ${\ displaystyle m = n}$ ${\ displaystyle (AP_ {L})}$ ${\ displaystyle L}$ ${\ displaystyle A}$ ${\ displaystyle L}$ ${\ displaystyle A}$ ${\ displaystyle L}$ ${\ displaystyle L}$

An example of a pseudo inverse with which the solution of a restricted problem can be determined is the Bott-Duffin inverse of regarding , which is given by the equation ${\ displaystyle A}$ ${\ displaystyle L}$

{\ displaystyle A_ {L} ^ {(- 1)}: = P_ {L} (AP_ {L} + P_ {L ^ {\ perp}}) ^ {- 1}}

is defined if the ordinary inverse occurring on the right side exists.

Applications

The Bott-Duffin inverse can be used to solve the equations of an affine-linear electrical network if the relation between branch voltage assignments and branch current assignments is in the form ${\ displaystyle v \ in \ mathbb {R} ^ {n}}$ ${\ displaystyle i \ in \ mathbb {R} ^ {n}}$

{\ displaystyle Ai + u = u ^ {0} \ qquad i \ in L, \; u \ in L ^ {\ perp}}

can be represented, where the space is all the current assignments that satisfy the kirchhoff equations and the column matrix should be the independent source voltages fed into the branches. At this point, Tellegen's theorem of graph theory flows in , which states that the spaces of the branch voltage assignments and branch current assignments that satisfy the Kirchhoff's mesh or knot equations are orthogonally complementary to one another. ${\ displaystyle L}$ ${\ displaystyle i}$ ${\ displaystyle u ^ {0}}$

One property of the Bott-Duffin inverse is that with their help the branch currents associated with a given source voltage assignment ${\ displaystyle u ^ {0}}$

{\ displaystyle i = A_ {L} ^ {(- 1)} u ^ {0}}

and branch voltages

{\ displaystyle u = (E_ {m} -AA_ {L} ^ {(- 1)}) u ^ {0}}

can be calculated ( stands for the identity matrix im ). ${\ displaystyle E_ {m}}$ ${\ displaystyle \ mathbb {R} ^ {m \ times m}}$

literature

W. Mackens, H. Voss: Mathematics I for engineering students
A. Kielbasinski, H. Schwetlick: Numerical linear algebra , Deutscher Verlag der Wissenschaften, 1988

Individual evidence

^ EH Moore : On the reciprocal of the general algebraic matrix. In: Bulletin of the American Mathematical Society 26, pp. 394-395, 1920
^ Roger Penrose : A generalized inverse for matrices. In: Proceedings of the Cambridge Philosophical Society 51, pp. 406-413, 1955, doi : 10.1017 / S0305004100030401
^ J. Stoer: Numerical Mathematics 1. Springer Verlag, 2002, ISBN 3-540-66154-9
^ ^A ^b ^c Adi Ben-Israel , Thomas NE Greville : Generalized Inverses. Springer-Verlag, 2003, ISBN 0-387-00293-6
^ ^A ^b Max Koecher: Lineare Algebra and Analytical Geometry, Springer-Verlag Berlin, 1997
^ ^A ^b Nobuo Shinozaki, Masaaki Sibuya, and Kunio Tanabe: Numerical algorithms for the Moore-Penrose inverse of a matrix: Direct methods. Annals of the Institute of Statistical Mathematics, Springer Netherlands, Vol. 24, No. 1, Dec. 1972, pp. 193-203, doi : 10.1007 / BF02479751 .
^ TNE Greville: Some applications of the pseudo inverse of a matrix. SIAM Rev., No. 2, 1960, pp. 15-22, doi : 10.1137 / 1002004 , JSTOR 2028054 .

[MOORE1920-1] EH Moore : On the reciprocal of the general algebraic matrix. In: Bulletin of the American Mathematical Society 26, pp. 394-395, 1920

[PENROSE1955-2] Roger Penrose : A generalized inverse for matrices. In: Proceedings of the Cambridge Philosophical Society 51, pp. 406-413, 1955, doi : 10.1017 / S0305004100030401

[StoerNumMath1-3] J. Stoer: Numerical Mathematics 1. Springer Verlag, 2002, ISBN 3-540-66154-9

[IG2003-4] A ^b ^c Adi Ben-Israel , Thomas NE Greville : Generalized Inverses. Springer-Verlag, 2003, ISBN 0-387-00293-6

[Koecher1997-5] A ^b Max Koecher: Lineare Algebra and Analytical Geometry, Springer-Verlag Berlin, 1997

[Shinozaki1972-6] A ^b Nobuo Shinozaki, Masaaki Sibuya, and Kunio Tanabe: Numerical algorithms for the Moore-Penrose inverse of a matrix: Direct methods. Annals of the Institute of Statistical Mathematics, Springer Netherlands, Vol. 24, No. 1, Dec. 1972, pp. 193-203, doi : 10.1007 / BF02479751 .

[Greville1960-7] TNE Greville: Some applications of the pseudo inverse of a matrix. SIAM Rev., No. 2, 1960, pp. 15-22, doi : 10.1137 / 1002004 , JSTOR 2028054 .