Courant-Fischer theorem

The Courant-Fischer theorem (also known as the min-max principle) is a theorem from linear algebra that gives a variational characterization of the eigenvalues of a symmetric or Hermitian matrix. Each eigenvalue is represented as a minimum or maximum of the Rayleigh quotient over the vectors of subspaces of a given dimension. The theorem is named after the mathematicians Richard Courant and Ernst Fischer. Among other things, it is used to estimate eigenvalues and to analyze numerical eigenvalue methods.

Statement

If $A \in \mathbb{K}^{n \times n}$ is a symmetric matrix (if $\mathbb{K} = \mathbb{R}$) or a Hermitian matrix (if $\mathbb{K} = \mathbb{C}$) with eigenvalues sorted in ascending order, $\lambda_1 \leq \ldots \leq \lambda_n$, and if $X_i$ denotes the set of $i$-dimensional subspaces of $\mathbb{K}^n$ for $i = 1, \ldots, n$, then the $i$-th eigenvalue of $A$ has the representation

$$\lambda_i = \min_{X \in X_i} \max_{x \in X \atop x \neq 0} \frac{\langle x, Ax \rangle}{\langle x, x \rangle} = \max_{X \in X_{n-i+1}} \min_{x \in X \atop x \neq 0} \frac{\langle x, Ax \rangle}{\langle x, x \rangle},$$

where $\langle \cdot, \cdot \rangle$ is the standard real or complex scalar product. If the Courant-Fischer theorem is stated with eigenvalues sorted in descending order, then the minimum and maximum are interchanged.
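The min-max representation can be checked numerically. The following is a minimal sketch, assuming NumPy and a randomly generated real symmetric test matrix; the helper `max_rayleigh_on_subspace` and the values of `n` and `i` are hypothetical choices for illustration. It uses the fact that the largest Rayleigh quotient on a subspace with orthonormal basis $Q$ equals the largest eigenvalue of the compressed matrix $Q^H A Q$.

```python
import numpy as np

def max_rayleigh_on_subspace(A, basis):
    """Largest Rayleigh quotient of A over the column span of `basis`."""
    Q, _ = np.linalg.qr(basis)           # orthonormal basis of the subspace
    # For x = Q y the quotient equals the Rayleigh quotient of Q^H A Q at y,
    # so its maximum is the largest eigenvalue of the compressed matrix.
    return np.linalg.eigvalsh(Q.conj().T @ A @ Q)[-1]

rng = np.random.default_rng(0)
n, i = 5, 2                               # hypothetical size and 1-based index
M = rng.standard_normal((n, n))
A = (M + M.T) / 2                         # random real symmetric matrix
lam, X = np.linalg.eigh(A)                # ascending eigenvalues, orthonormal eigenvectors

# The minimum is attained on the span of the first i eigenvectors.
attained = max_rayleigh_on_subspace(A, X[:, :i])

# Any other i-dimensional subspace gives a maximum of at least lambda_i.
trials = [max_rayleigh_on_subspace(A, rng.standard_normal((n, i)))
          for _ in range(200)]

print(lam[i - 1], attained, min(trials))
```

Random sampling only probes a few subspaces, so this illustrates the inequality rather than proving it.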

Illustrative example

Figure: The Courant-Fischer theorem characterizes the eigenvalues of a symmetric positive definite (3 × 3) matrix via extreme points on an ellipsoid.

For a symmetric positive definite $(3 \times 3)$ matrix $A \in \mathbb{R}^{3 \times 3}$, the Courant-Fischer theorem can be illustrated as follows. Since the eigenvalues of $A^T A$ are the squares of the always positive eigenvalues of $A$, and since $\langle x, A^T A x \rangle = \langle Ax, Ax \rangle$, the $i$-th eigenvalue of $A$ has the representation

$$\lambda_i = \min_{X \in X_i} \max_{x \in X \atop x \neq 0} \frac{\| Ax \|}{\| x \|} = \min_{X \in X_i} \max_{x \in X \atop \| x \| = 1} \| Ax \|,$$

where $\| \cdot \|$ is the Euclidean norm. The set $\left\{ Ax \in \mathbb{R}^3 \mid \| x \| = 1 \right\}$ has the shape of an ellipsoid in three-dimensional space with the semi-axes $\lambda_1$, $\lambda_2$ and $\lambda_3$. The Courant-Fischer theorem now characterizes the eigenvalues of $A$ via certain extreme points on this ellipsoid:

For the smallest eigenvalue $\lambda_1$, all one-dimensional subspaces, i.e. all lines through the origin, are considered. Each of these lines intersects the ellipsoid at two diametrically opposite points. Of all these points, one with the smallest distance to the origin is selected.

For the second smallest eigenvalue $\lambda_2$, all two-dimensional subspaces, i.e. all planes through the origin, are considered. Each of these planes intersects the ellipsoid in an ellipse. On each of these ellipses, a point with the greatest distance from the origin is determined, and from all these points one with the smallest distance from the origin is selected.

For the largest eigenvalue $\lambda_3$, the whole space is considered and one of the points on the ellipsoid with the greatest distance from the origin is selected.

The position vector of a point selected in this way is then an eigenvector of the matrix and the length of this vector is the associated eigenvalue.
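The ellipsoid picture can be reproduced numerically. The sketch below assumes NumPy and a hypothetical symmetric positive definite example matrix that is not taken from the source. It samples points $Ax$ with $\| x \| = 1$ and compares their distances from the origin with the eigenvalues of $A$.

```python
import numpy as np

# Hypothetical symmetric positive definite example matrix.
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
lam, V = np.linalg.eigh(A)                # ascending eigenvalues, orthonormal eigenvectors

# Sample points A x on the ellipsoid {A x : ||x|| = 1}.
rng = np.random.default_rng(1)
x = rng.standard_normal((3, 10000))
x /= np.linalg.norm(x, axis=0)            # normalize every column to unit length
dist = np.linalg.norm(A @ x, axis=0)      # distances of the ellipsoid points from the origin

# The images of the eigenvectors lie on the semi-axes of the ellipsoid.
axis_points = A @ V                       # column j equals lambda_j * v_j
axis_lengths = np.linalg.norm(axis_points, axis=0)

print(lam, axis_lengths, dist.min(), dist.max())
```

All sampled distances lie between the smallest and the largest eigenvalue, and the semi-axis lengths coincide with the eigenvalues.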

Proof

The Courant-Fischer theorem represents the eigenvalues of a symmetric or Hermitian matrix as minima and maxima of the Rayleigh quotient

$$R_A(x) = \frac{\langle x, Ax \rangle}{\langle x, x \rangle}.$$

In the following, the first equation of the assertion is proved by showing that the min-max expression bounds $\lambda_i$ both from above and from below. The second equation follows analogously by considering $-A$ and the corresponding complementary spaces.

Upper bound

Since $A$ is symmetric or Hermitian, there is an orthonormal basis $\{ x_1, \ldots, x_n \}$ of eigenvectors for the eigenvalues $\lambda_1, \ldots, \lambda_n$. Let

$$V_i = \operatorname{span}(x_i, \ldots, x_n)$$

denote the linear span of those eigenvectors whose indices are at least $i$. The intersection of $V_i$ with an $i$-dimensional subspace $X \in X_i$ is not $\{ 0 \}$, because $\dim(X + V_i) \leq n$ and the dimension formula give

$$\dim(X \cap V_i) = \dim X + \dim V_i - \dim(X + V_i) = i + (n-i+1) - \dim(X + V_i) \geq 1.$$

Hence there is a vector $v \in X \cap V_i$ with $v \neq 0$ that has the basis representation

$$v = c_i x_i + \ldots + c_n x_n$$

with coefficients $c_i, \ldots, c_n \in \mathbb{K}$. Because the eigenvectors are orthonormal, such a vector $v$ satisfies

$$\langle v, Av \rangle = \lambda_i |c_i|^2 + \ldots + \lambda_n |c_n|^2 \geq \lambda_i \left( |c_i|^2 + \ldots + |c_n|^2 \right) = \lambda_i \langle v, v \rangle.$$

The maximum Rayleigh quotient over the nonzero vectors $x \in X$ of any $i$-dimensional subspace $X \in X_i$ is therefore at least $\lambda_i$, and hence

$$\min_{X \in X_i} \max_{x \in X \atop x \neq 0} R_A(x) \geq \lambda_i.$$
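The key step of this argument, the existence of a nonzero vector in $X \cap V_i$, can be made concrete. The following sketch assumes NumPy and a random symmetric test matrix; the sizes `n` and `i` and the variable names are hypothetical. It computes such a vector from the null space of the stacked matrix $[X, -V]$ and evaluates its Rayleigh quotient.

```python
import numpy as np

rng = np.random.default_rng(2)
n, i = 6, 3                               # hypothetical size and 1-based index
M = rng.standard_normal((n, n))
A = (M + M.T) / 2                         # random real symmetric matrix
lam, E = np.linalg.eigh(A)                # columns of E are x_1, ..., x_n

X = rng.standard_normal((n, i))           # basis of a random i-dimensional subspace
V = E[:, i - 1:]                          # basis of V_i = span(x_i, ..., x_n)

# A vector in the intersection satisfies X a = V b, i.e. [X, -V] (a, b)^T = 0.
# The stacked matrix has n + 1 columns but only n rows, so a null vector exists.
B = np.hstack([X, -V])
_, _, Vh = np.linalg.svd(B)
null = Vh[-1]                             # right singular vector in the null space
v = X @ null[:i]                          # nonzero vector in the intersection

rayleigh = (v @ A @ v) / (v @ v)
print(rayleigh, lam[i - 1])
```

The printed Rayleigh quotient is at least $\lambda_i$, as the estimate above predicts for every vector in $V_i$.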

Lower bound

Now let

$$W_i = \operatorname{span}(x_1, \ldots, x_i)$$

denote the linear span of those eigenvectors whose indices are at most $i$. A vector $w \in W_i$ with $w \neq 0$ and the representation

$$w = c_1 x_1 + \ldots + c_i x_i$$

satisfies

$$\langle w, Aw \rangle = \lambda_1 |c_1|^2 + \ldots + \lambda_i |c_i|^2 \leq \lambda_i \left( |c_1|^2 + \ldots + |c_i|^2 \right) = \lambda_i \langle w, w \rangle.$$

The maximum Rayleigh quotient over all nonzero vectors $x \in W_i$ is therefore $\lambda_i$, attained at $x = x_i$. Since $W_i$ is itself an $i$-dimensional subspace, it follows that

$$\min_{X \in X_i} \max_{x \in X \atop x \neq 0} R_A(x) \leq \max_{x \in W_i \atop x \neq 0} R_A(x) = \lambda_i.$$

Combining the two bounds yields the first part of the assertion.

Applications

A direct consequence of the Courant-Fischer theorem is the estimate

$$\lambda_{\min} \leq R_A(x) \leq \lambda_{\max}$$

for all $x \neq 0$, where $\lambda_{\min}$ and $\lambda_{\max}$ are the smallest and the largest eigenvalue of a symmetric or Hermitian matrix $A$. Equality holds precisely when $x$ is an eigenvector for the respective eigenvalue. The smallest and the largest eigenvalue can accordingly be determined by minimizing or maximizing the Rayleigh quotient.
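As an illustration of this estimate, the following sketch (assuming NumPy; the helper `rayleigh` and the random Hermitian test matrix are hypothetical) evaluates the Rayleigh quotient at random complex vectors and at the eigenvectors for the extreme eigenvalues.

```python
import numpy as np

def rayleigh(A, x):
    """Rayleigh quotient <x, Ax> / <x, x> for a nonzero vector x."""
    # np.vdot conjugates its first argument, matching the complex scalar product.
    return np.vdot(x, A @ x).real / np.vdot(x, x).real

rng = np.random.default_rng(3)
n = 4
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (M + M.conj().T) / 2                  # random Hermitian matrix
lam, E = np.linalg.eigh(A)                # ascending eigenvalues, orthonormal eigenvectors

samples = [rayleigh(A, rng.standard_normal(n) + 1j * rng.standard_normal(n))
           for _ in range(1000)]

print(lam[0], min(samples), max(samples), lam[-1])
print(rayleigh(A, E[:, 0]), rayleigh(A, E[:, -1]))   # equality at the eigenvectors
```

Every sampled quotient lies in the interval $[\lambda_{\min}, \lambda_{\max}]$, and the endpoints are reached at the corresponding eigenvectors.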

Another application is in numerical stability statements for eigenvalue methods. If $A, B \in \mathbb{K}^{n \times n}$ are two symmetric or Hermitian matrices with eigenvalues sorted in ascending order, $\lambda_1(A) \leq \ldots \leq \lambda_n(A)$ and $\lambda_1(B) \leq \ldots \leq \lambda_n(B)$, then

$$| \lambda_i(A) - \lambda_i(B) | \leq \| A - B \|$$

for all $i = 1, \ldots, n$, where $\| \cdot \|$ is any natural (induced) matrix norm. If a matrix $A$ is approximated by a matrix $B$ (whose eigenvalues are easier to calculate), the resulting error in each eigenvalue is bounded by the norm of the difference between the two matrices.
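The perturbation bound can be observed on a small example. The sketch below assumes NumPy; the matrix size and the perturbation scale `1e-3` are hypothetical choices. It compares the largest eigenvalue deviation with three induced matrix norms of $A - B$.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5
M = rng.standard_normal((n, n))
A = (M + M.T) / 2                         # random real symmetric matrix
P = rng.standard_normal((n, n))
B = A + 1e-3 * (P + P.T) / 2              # symmetric perturbation of A

# eigvalsh returns the eigenvalues in ascending order, so they pair up by index.
gap = np.max(np.abs(np.linalg.eigvalsh(A) - np.linalg.eigvalsh(B)))

norms = {
    "spectral": np.linalg.norm(A - B, 2),
    "max column sum": np.linalg.norm(A - B, 1),
    "max row sum": np.linalg.norm(A - B, np.inf),
}
print(gap, norms)
```

The spectral norm gives the tightest of the three bounds for symmetric matrices, since it equals the spectral radius of $A - B$.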

Variants

A variant of the Courant-Fischer theorem exists for the singular values of a matrix. If $A \in \mathbb{K}^{m \times n}$ is a (not necessarily square) matrix with singular values sorted in ascending order, $\sigma_1 \leq \ldots \leq \sigma_{\min\{m,n\}}$, and $\| \cdot \|$ denotes the Euclidean norm, then the $i$-th singular value of $A$ has the representation

$$\sigma_i = \min_{X \in X_i} \max_{x \in X \atop \| x \| = 1} \| Ax \| = \max_{X \in X_{n-i+1}} \min_{x \in X \atop \| x \| = 1} \| Ax \|,$$

where $X_i$ is again the set of $i$-dimensional subspaces of $\mathbb{K}^n$. This result follows from the Courant-Fischer theorem via the representation of the singular values of $A$ as the square roots of the eigenvalues of $A^H A$ and $A A^H$.
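The link between singular values and the eigenvalues of $A^H A$ can be checked as follows. This is a minimal sketch assuming NumPy and a random real test matrix with $m \geq n$, so that $A^H A$ has exactly $n$ eigenvalues matching the $n$ singular values; the sizes and the index `i` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(5)
m, n = 6, 4                               # hypothetical sizes with m >= n
A = rng.standard_normal((m, n))

# numpy returns singular values in descending order, so sort them ascending.
sigma = np.sort(np.linalg.svd(A, compute_uv=False))
eig, V = np.linalg.eigh(A.T @ A)          # ascending eigenvalues of A^H A
print(sigma, np.sqrt(eig))

# On the span of the first i eigenvectors of A^H A, the maximum of ||A x||
# over unit vectors x is the spectral norm of A Q and equals sigma_i.
i = 2
Q = V[:, :i]
largest_on_subspace = np.linalg.norm(A @ Q, 2)
print(largest_on_subspace, sigma[i - 1])
```

The subspace spanned by the first $i$ eigenvectors of $A^H A$ is the one on which the minimum in the min-max representation is attained.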

Generalizations of this statement also exist for the spectrum of self-adjoint operators on Hilbert spaces; these are used, for example, in the Rayleigh-Ritz principle.

Literature

Harry Dym: Linear Algebra in Action. 2nd edition. American Mathematical Society, 2013, ISBN 978-1-4704-0908-1.
Roger A. Horn: Topics in Matrix Analysis. Cambridge University Press, 1994, ISBN 0-521-46713-6.
Robert Schaback, Holger Wendland: Numerical Mathematics. Springer, 2006, ISBN 978-3-540-26705-8.

Original work

Ernst Fischer: Über quadratische Formen mit reellen Koeffizienten. In: Monatshefte für Mathematik und Physik. Vol. 16, 1905, pp. 234-249.
Richard Courant: Über die Eigenwerte bei den Differentialgleichungen der mathematischen Physik. In: Mathematische Zeitschrift. Vol. 7, no. 1-4, 1920, pp. 1-57.

References

[1] Harry Dym: Linear Algebra in Action. 2nd edition. American Mathematical Society, 2013, pp. 224-225.
[2] Robert Schaback, Holger Wendland: Numerical Mathematics. Springer, 2006, p. 270.
[3] Roger A. Horn: Topics in Matrix Analysis. Cambridge University Press, 1994, p. 148.
