QR algorithm

From Wikipedia, the free encyclopedia

The QR algorithm is a numerical method for computing all eigenvalues, and possibly also the eigenvectors, of a square matrix. The QR algorithm, also known as QR iteration, is based on the QR decomposition and was introduced in 1961–1962 independently by John G. F. Francis and Vera Nikolaevna Kublanovskaya. A forerunner was the LR algorithm of Heinz Rutishauser (1958), which is based on the LR decomposition but is less stable. Often the iterates of the QR algorithm converge to the Schur form of the matrix. The original method is quite expensive and therefore, even on today's computers, impractical for matrices with hundreds of thousands of rows and columns.

Derived variants such as the multishift method of Z. Bai and James Demmel (1989) and the numerically more stable variant of K. Braman, R. Byers and R. Mathias (2002) have practical running times that are cubic in the size of the matrix. The latter method is implemented in the numerical software library LAPACK, which in turn is used by many computer algebra systems (CAS) for numerical matrix algorithms.

Description of the QR algorithm

Goal of the calculation process

Starting from a real or complex matrix A = A_0, a sequence of orthogonal or unitary matrices Q_k is determined such that the recursively defined matrix sequence A_{k+1} = Q_k^* A_k Q_k (where Q_k^* denotes the conjugate transpose, i.e. simply the transpose in the real orthogonal case) largely converges to an upper triangular matrix. Since all transformations in the recursion are similarity transformations, all matrices of the sequence have the same eigenvalues with the same multiplicities.

If an upper triangular matrix is reached in the limit, a Schur form of A, the eigenvalues can be read off its diagonal elements. In the case of a real matrix with orthogonal transformations, however, complex eigenvalues may also occur. Then the result of the method is an upper block triangular matrix whose diagonal blocks have size 1×1 for the real eigenvalues and size 2×2 for complex eigenvalues. Each 2×2 diagonal block, a scaled rotation, corresponds to a complex eigenvalue together with its complex conjugate.

General scheme of the procedure

Starting from a matrix A_0 = A ∈ ℝ^(n×n) (or ℂ^(n×n)), a sequence of matrices A_k is determined according to the following rule:

  1. for k = 0, 1, 2, … do
  2. Choose a polynomial p_k and evaluate it at the matrix: p_k(A_k).
  3. Compute the QR decomposition p_k(A_k) = Q_k R_k.
  4. Compute the new iterate: A_{k+1} = Q_k^* A_k Q_k.
  5. end for
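As an illustration, this general scheme can be sketched in a few lines of NumPy (a minimal sketch, not a practical implementation; the polynomial is passed as a Python callable, here simply the identity, i.e. no shift):

```python
import numpy as np

def qr_step(A, p):
    """One step of the general scheme: factor p(A) = Q R and
    return the similar matrix Q* A Q."""
    Q, R = np.linalg.qr(p(A))
    return Q.conj().T @ A @ Q

# Unshifted example, p(x) = x, on a small symmetric matrix:
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
for _ in range(50):
    A = qr_step(A, lambda M: M)
# The subdiagonal entry decays toward zero; the diagonal
# approaches the eigenvalues (7 +- sqrt(5)) / 2.
```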

The variants with (implicit) shifts and with multiple shifts can also be represented and analyzed in this general form.

The matrix function p_k is often a polynomial (here a linear combination of matrix powers) with real (or complex) coefficients. In the simplest case a linear polynomial is chosen, which then has the form p_k(x) = x − μ_k. In this case the algorithm simplifies to the classic version (with single shift):

  1. for k = 0, 1, 2, … do
  2. Choose a suitable shift μ_k.
  3. Compute the QR decomposition A_k − μ_k I = Q_k R_k.
  4. Compute the new iterate: A_{k+1} = R_k Q_k + μ_k I.
  5. end for
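A direct transcription of this single-shift loop (a sketch assuming NumPy, without the deflation and convergence tests a practical implementation needs; the function name is ours):

```python
import numpy as np

def qr_single_shift(A, iters=40):
    """Classic QR iteration with single shift mu_k = (A_k)_nn.
    Illustrative sketch without deflation."""
    A = A.astype(float).copy()
    n = A.shape[0]
    I = np.eye(n)
    for _ in range(iters):
        mu = A[-1, -1]                  # shift: last diagonal element
        Q, R = np.linalg.qr(A - mu * I)
        A = R @ Q + mu * I              # similar to the previous iterate
    return A

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])   # eigenvalues 3 and 3 +- sqrt(3)
T = qr_single_shift(A)
# The bottom subdiagonal entry is driven to zero very quickly,
# and T[-1, -1] approximates one of the eigenvalues.
```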

If p_k(x) = x is set in each iteration, the method coincides with the power method extended to subspaces (here the full vector space).

The polynomial p_k controlling the QR decomposition can be fixed from the start or dynamically adapted to the current matrix A_k in each iteration step. There have also been attempts to use rational matrix functions for p_k, i.e. functions that are the quotient of two polynomials.

Deflation

If it turns out in an iteration step that all components of a lower left block of A_k (the block formed by the last rows and the first columns) fall below a predetermined threshold in absolute value, the procedure is continued separately on the two square diagonal sub-blocks. If the matrices resulting from this splitting are small enough, the determination of the eigenvalues can be completed, for example, by computing the characteristic polynomial and its zeros.
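A possible splitting test can be sketched as follows (assuming NumPy; the function name and the relative tolerance criterion are illustrative assumptions, not the exact LAPACK criterion):

```python
import numpy as np

def deflation_split(H, tol=1e-12):
    """Scan the subdiagonal of a Hessenberg matrix H for a negligible
    entry. Returns the first index k at which the problem splits into
    the diagonal blocks H[:k+1, :k+1] and H[k+1:, k+1:], or None."""
    n = H.shape[0]
    for k in range(n - 1):
        if abs(H[k + 1, k]) <= tol * (abs(H[k, k]) + abs(H[k + 1, k + 1])):
            return k
    return None

H = np.array([[2.0,   1.0, 0.0],
              [1e-16, 3.0, 1.0],
              [0.0,   1.0, 4.0]])
k = deflation_split(H)   # the tiny (2,1)-entry splits off a 1x1 block
```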

Transformation to Hessenberg form

The goal of the QR algorithm is to produce an upper triangular form, or a block triangular form in which 2×2 blocks correspond to complex eigenvalues. One can already come quite close to this goal in a simple way by a similarity transformation to Hessenberg form, i.e. to a matrix with only one (not identically vanishing) lower secondary diagonal.

  • Set A^(1) := A.
  • for k = 1 to n-1 do
  1. Form the column segment x = (a^(k)_{k+1,k}, …, a^(k)_{n,k})^T.
  2. Find the Householder reflection S_k that maps x to a multiple of the first canonical basis vector.
  3. Use the block matrix P_k = diag(I_k, S_k) to replace A^(k) with the similar matrix A^(k+1) = P_k A^(k) P_k.
  • Note the overall transformation matrix Q = P_1 ⋯ P_{n-1}.
  • A^(n) is now in Hessenberg form.
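The steps above can be sketched as follows (a minimal NumPy sketch; production code would rather use a library routine such as scipy.linalg.hessenberg):

```python
import numpy as np

def to_hessenberg(A):
    """Reduce A to upper Hessenberg form by Householder similarity
    transformations. Returns (H, Q) with Q.T @ A @ Q == H."""
    H = A.astype(float).copy()
    n = H.shape[0]
    Q = np.eye(n)
    for k in range(n - 2):
        x = H[k + 1:, k]
        nx = np.linalg.norm(x)
        if nx == 0.0:
            continue
        v = x.copy()
        v[0] += np.copysign(nx, x[0])   # sign chosen to avoid cancellation
        v /= np.linalg.norm(v)
        # Apply the reflection P = I - 2 v v^T from the left and right.
        H[k + 1:, :] -= 2.0 * np.outer(v, v @ H[k + 1:, :])
        H[:, k + 1:] -= 2.0 * np.outer(H[:, k + 1:] @ v, v)
        Q[:, k + 1:] -= 2.0 * np.outer(Q[:, k + 1:] @ v, v)
    return H, Q

A = np.array([[4.0, 1.0, 2.0, 3.0],
              [1.0, 3.0, 0.0, 1.0],
              [2.0, 0.0, 2.0, 1.0],
              [3.0, 1.0, 1.0, 1.0]])
H, Q = to_hessenberg(A)
# H has zeros below its first subdiagonal, and Q.T @ A @ Q == H.
```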

The Hessenberg form makes it easier to determine the characteristic polynomials of submatrices. The Hessenberg form of a symmetric matrix is tridiagonal, which considerably speeds up the further computation.

Furthermore, the Hessenberg form is preserved in every step of the QR algorithm. The basis for this is the arithmetic of generalized triangular matrices, in which all entries below some fixed lower secondary diagonal vanish. A Hessenberg matrix is thus a generalized triangular matrix with one secondary diagonal. Under multiplication the numbers of non-vanishing secondary diagonals add up; under addition the larger of the two numbers is retained in general.

Suppose, therefore, that Q_k, like the matrix p_k(A_k), has s lower secondary diagonals; this holds because Q_k = p_k(A_k) R_k^{-1} and the upper triangular factor R_k^{-1} contributes no further secondary diagonals. Now, because of A_{k+1} = Q_k^* A_k Q_k, we also have

Q_k A_{k+1} = A_k Q_k,

and the latter product has s + 1 secondary diagonals. This is in general only possible if A_{k+1} has exactly one secondary diagonal, i.e. is again in Hessenberg form. From the decomposition of p_k into linear factors it follows (see below) that this is always the case.

Building on this, one can show (Francis 1962, in the presentation of Bai and Demmel) that the first column of Q_k, which is proportional to the first column of p_k(A_k), completely determines the following matrix A_{k+1}. More precisely, an orthogonal matrix whose first column is proportional to the first column of p_k(A_k) can be obtained by bringing the transformed matrix back into Hessenberg form. Since only the first few components of this column differ from zero, the orthogonal matrix can be chosen as a modification of the identity matrix in its upper left block.

Variants of the QR algorithm

Simple QR iteration

One chooses p_k(x) = x. The method can then be stated briefly as a QR decomposition A_k = Q_k R_k followed by multiplication of the factors in reverse order, A_{k+1} = R_k Q_k. This procedure is the direct generalization of the simultaneous power method for determining the first eigenvalues of a matrix; the connection is made precise via subspace iteration. A simultaneous inverse power iteration is also carried out implicitly.

QR algorithm with simple shifts

One sets p_k(x) = x − μ_k. The method can then alternatively be written as a QR decomposition A_k − μ_k I = Q_k R_k followed by multiplication of the factors in reverse order, corrected for the shift: A_{k+1} = R_k Q_k + μ_k I. Usually one tries to approximate the smallest eigenvalue with the shift μ_k; the last diagonal element (A_k)_{nn} can be chosen for this. The simple QR iteration results from setting all shifts to zero.

If A_k has Hessenberg form, then Q_k = (A_k − μ_k I) R_k^{-1}, as the product of a Hessenberg matrix and an upper triangular matrix (a matrix without secondary diagonals), must also have Hessenberg form. The same therefore applies to A_{k+1} = R_k Q_k + μ_k I. So if A is brought to Hessenberg form in preparation for the QR algorithm, this form is retained throughout the algorithm.

Simple shifts for symmetric matrices

A symmetric real matrix A has only real eigenvalues, and the symmetry is preserved in all steps of the QR algorithm. For symmetric matrices, Wilkinson (1965) suggested choosing as shift that eigenvalue of the trailing 2×2 submatrix

( a_{n-1,n-1}  a_{n-1,n} )
( a_{n,n-1}    a_{n,n}   )

which is closer to a_{n,n}. Wilkinson showed that the matrix sequence determined in this way converges to a diagonal matrix whose diagonal elements are the eigenvalues of A. The speed of convergence is quadratic.
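This shift can be evaluated with a standard cancellation-avoiding formula (a sketch; the function name is ours, and the formula is the textbook one, e.g. from Golub and Van Loan):

```python
import numpy as np

def wilkinson_shift(a, b, c):
    """The eigenvalue of the trailing 2x2 block [[a, b], [b, c]] of a
    symmetric tridiagonal matrix that is closer to c."""
    d = (a - c) / 2.0
    if d == 0.0 and b == 0.0:
        return c
    s = np.copysign(1.0, d) if d != 0.0 else 1.0
    return c - b * b / (d + s * np.hypot(d, b))

mu = wilkinson_shift(2.0, 1.0, 3.0)
# Eigenvalues of [[2, 1], [1, 3]] are (5 +- sqrt(5)) / 2; the one
# closer to c = 3 is (5 + sqrt(5)) / 2.
```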

QR algorithm with double shifts

A pair of simple shifts can be combined in one iteration step. As a consequence, complex shifts can be avoided for real matrices. In the above notation,

(A_k − μ_k I)(A_k − ν_k I) = Q_k R_k

is a QR decomposition for the quadratic polynomial p_k(x) = (x − μ_k)(x − ν_k), evaluated at A_k. The coefficients of this polynomial are real even for a conjugate pair of complex shifts. Thus complex eigenvalues of real matrices can also be approximated without complex numbers appearing in the calculation.

A common choice for this double shift consists of the two eigenvalues of the 2×2 submatrix at the bottom right of A_k, i.e. the quadratic polynomial is the characteristic polynomial of this block,

p_k(x) = x² − (a_{n-1,n-1} + a_{n,n}) x + (a_{n-1,n-1} a_{n,n} − a_{n-1,n} a_{n,n-1}).
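For a Hessenberg matrix only the first three entries of the first column of this quadratic p_k(A_k) are nonzero, and they can be computed directly without forming any matrix product; a sketch assuming NumPy (the function name is ours):

```python
import numpy as np

def double_shift_first_column(H):
    """First column of p(H) = H^2 - s H + t I, where s and t are the
    trace and determinant of the trailing 2x2 block of the Hessenberg
    matrix H. Only the first three entries can be nonzero."""
    n = H.shape[0]
    s = H[-2, -2] + H[-1, -1]
    t = H[-2, -2] * H[-1, -1] - H[-2, -1] * H[-1, -2]
    v = np.zeros(n)
    v[0] = H[0, 0] * H[0, 0] + H[0, 1] * H[1, 0] - s * H[0, 0] + t
    v[1] = H[1, 0] * (H[0, 0] + H[1, 1] - s)
    v[2] = H[1, 0] * H[2, 1]
    return v

H = np.array([[1.0, 2.0, 3.0, 4.0],
              [5.0, 6.0, 7.0, 8.0],
              [0.0, 9.0, 1.0, 2.0],
              [0.0, 0.0, 3.0, 4.0]])
v = double_shift_first_column(H)
```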

QR algorithm with multiple shifts

A number r is specified that is larger than one but significantly smaller than the size n of the matrix. The polynomial p_k can be chosen as the characteristic polynomial of the bottom right r×r submatrix of the current matrix A_k. Another strategy is to determine the eigenvalues of a somewhat larger trailing submatrix and to select the r smallest in absolute value among them as shifts μ_1, …, μ_r. With these, a QR decomposition of

p_k(A_k) = (A_k − μ_1 I) ⋯ (A_k − μ_r I) = Q_k R_k

is determined, and the new iterate A_{k+1} = Q_k^* A_k Q_k is formed.

With a multishift method it is often achieved that, in the sequence of iterated matrices, the components of a lower left block become small particularly quickly, thus splitting the eigenvalue problem.

Implicit multishift iteration

Carrying out multiple shifts in the general form is very time-consuming. As mentioned above, the effort can be reduced by producing the Hessenberg form in a preparatory step. Since each multishift step can be composed of individual (possibly complex) shifts, the Hessenberg form is retained throughout the entire algorithm.

This allows the QR algorithm to be converted into a "bulge-chasing" algorithm that creates a bulge in the Hessenberg form at the upper end of the diagonal and then "chases" it down the diagonal and out of the matrix at the lower end.

  1. for k = 0, 1, 2, … do
  2. Determine the polynomial p_k according to one of the variants given above.
  3. Compute the vector x = p_k(A_k) e_1, the first column of p_k(A_k).
  4. Find a reflection Q that maps x to a multiple of the first unit vector. Since only the first deg(p_k) + 1 components of x are nonzero, this reflection has a simple block shape.
  5. Form the matrix Q^* A_k Q and transform it back to Hessenberg form; the result is the new iterate A_{k+1}.
  6. end for

If Householder reflections are composed for this purpose, they have a block diagonal shape.

Notes on functionality

Similarity transformations

The matrices A_k calculated in the QR algorithm are unitarily similar to one another, since

A_{k+1} = R_k Q_k = Q_k^* (Q_k R_k) Q_k = Q_k^* A_k Q_k

holds. Thus all matrices A_k have the same eigenvalues (counted with algebraic and geometric multiplicity).
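This can be checked numerically for one unshifted step (assuming NumPy):

```python
import numpy as np

# One unshifted QR step: A1 = R Q = Q^T (Q R) Q = Q^T A Q is a
# similarity transformation, so the eigenvalues are unchanged.
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
Q, R = np.linalg.qr(A)
A1 = R @ Q
e0 = np.sort(np.linalg.eigvals(A))
e1 = np.sort(np.linalg.eigvals(A1))
# e0 and e1 agree up to rounding.
```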

Choice of shifts, convergence

A simple but not very effective choice is to take all shifts equal to zero. The iterates of the resulting algorithm, the QR algorithm in its basic form, converge partially to an upper triangular matrix with the eigenvalues on the diagonal, provided all eigenvalues differ in absolute value. Partial convergence here means that the elements of the lower triangle of A_k tend to zero and its diagonal elements tend to the eigenvalues; nothing is asserted about the elements in the upper triangle.

If the shift is chosen as the matrix element at the bottom right of the current iterate, μ_k = (A_k)_{nn}, the algorithm converges quadratically, or under suitable circumstances even cubically. This shift is known as the Rayleigh quotient shift, since it is related to a Rayleigh quotient via the inverse iteration.

When calculating over the real numbers (A ∈ ℝ^(n×n)) one would like to compute the real Schur form and still be able to handle complex conjugate eigenvalues. There are various shift strategies for this.

An extension of simple shifts is the Wilkinson shift, named after James Hardy Wilkinson. For the Wilkinson shift, that eigenvalue of the trailing 2×2 principal submatrix

( a_{n-1,n-1}  a_{n-1,n} )
( a_{n,n-1}    a_{n,n}   )

which is closer to the last matrix element a_{n,n} is used.

The QR algorithm as an extension of the power method

For the analysis of the algorithm, the products of the transformation matrices are accumulated: P_k = Q_0 Q_1 ⋯ Q_k and U_k = R_k ⋯ R_1 R_0. Products of orthogonal or unitary matrices are again orthogonal or unitary, and products of right upper triangular matrices are again right upper triangular. The matrices of the QR iteration result from similarity transformations, because

A_{k+1} = Q_k^* A_k Q_k = P_k^* A P_k.

From this one deduces a QR decomposition of the powers of A:

A^(k+1) = P_k U_k.

In the course of the algorithm, QR decompositions of the powers of the matrix A are thus determined implicitly. Due to the form of these decompositions, for each m the first m columns of the matrix P_k span the same subspace as the first m columns of the matrix A^(k+1); in addition, the columns of P_k are orthonormal. But this is exactly the situation after the (k+1)-th step of a simultaneous power iteration. The diagonal elements of A_k are the resulting approximations of the eigenvalues of A. Therefore the convergence properties of the power iteration carry over:
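The implicit factorization of the powers can be verified numerically (assuming NumPy; P and U accumulate the factors of five unshifted steps):

```python
import numpy as np

# The QR algorithm implicitly factors powers of A: with
# P_k = Q_0 ... Q_k and U_k = R_k ... R_0 one has A^(k+1) = P_k U_k.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
Ak = A.copy()
P = np.eye(2)
U = np.eye(2)
for _ in range(5):
    Q, R = np.linalg.qr(Ak)
    Ak = R @ Q
    P = P @ Q
    U = R @ U
# P is orthogonal, U is upper triangular, and P @ U equals A^5.
```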

The simple QR algorithm converges only if all eigenvalues differ from one another in absolute value. If the eigenvalues are sorted according to

|λ_1| > |λ_2| > ⋯ > |λ_n|,

the speed of convergence is linear, with a contraction factor given by the largest of the quotients |λ_{j+1}| / |λ_j|.

In particular, for real matrices this algorithm can only converge if all eigenvalues are real (since otherwise complex conjugate pairs with equal absolute value would exist). This situation is given for all symmetric real matrices.

The QR algorithm as a simultaneous inverse power iteration

If A is invertible, then for the transpose (for complex matrices, the Hermitian adjoint) of the inverse of A and all of its powers,

(A^*)^{-(k+1)} = P_k (U_k^{-1})^*.

The inverse of an upper right triangular matrix is again an upper right triangular matrix, whose adjoint is a lower left triangular matrix. The QR algorithm thus also determines a QL decomposition of the powers of (A^*)^{-1}. From the form of this decomposition it is evident that for each m the last m columns of P_k span the same subspace as the last m columns of (A^*)^{-(k+1)}. On the last column of P_k, an inverse power iteration for A^* is thus carried out, so this column converges towards the eigenvector of A^* belonging to the eigenvalue of A smallest in absolute value. In the product P_k^* A P_k, the bottom right component is then the corresponding Rayleigh quotient of the inverse power iteration. The convergence properties are analogous to those given above.
