GMRES method


The GMRES method (for Generalized Minimal RESidual method) is an iterative numerical method for solving large, sparse linear systems of equations. The method belongs to the class of Krylov subspace methods and is in particular also suitable for nonsymmetric matrices. In exact arithmetic, i.e. if the computation is carried out without rounding errors, the method delivers the exact solution after a finite number of steps. It is more interesting, however, as an approximate method, since with suitable preconditioning it can solve systems of equations with millions of unknowns to satisfactory accuracy within a few iterations. It thus represents a kind of black-box solver for sparse linear systems. It was developed in 1986 by Yousef Saad and Martin H. Schultz.

The method

Given is a linear system of equations $Ax = b$ with a real matrix $A \in \mathbb{R}^{n \times n}$. The system is assumed to be uniquely solvable, so $A$ has full rank. Furthermore, a starting approximation $x_0$ is given, for example simply the right-hand side $b$. With the initial residual $r_0 = b - Ax_0$, the GMRES method is then defined in the $m$-th step by the minimization of the Euclidean norm of the residual over the affine Krylov subspace $x_0 + K_m$, where $K_m = \operatorname{span}\{r_0, Ar_0, \dots, A^{m-1}r_0\}$:

$$x_m = \operatorname*{arg\,min}_{x \in x_0 + K_m} \|b - Ax\|_2.$$

For this purpose, an orthonormal basis $v_1, \dots, v_m$ of the space $K_m$ is calculated iteratively using the Arnoldi method. With the matrices $V_m = (v_1, \dots, v_m)$ and $V_{m+1} = (v_1, \dots, v_{m+1})$ formed from the basis vectors, and with the matrix $\bar{H}_m \in \mathbb{R}^{(m+1) \times m}$, an upper Hessenberg matrix to which a row has been appended in which only the last entry is not zero, this yields the representation

$$A V_m = V_{m+1} \bar{H}_m.$$

The ansatz $x = x_0 + V_m y$ results in an efficiently computable form of the norm of the residual, since the Euclidean norm is invariant under the orthonormal columns of $V_{m+1}$:

$$\|b - Ax\|_2 = \|r_0 - A V_m y\|_2 = \|V_{m+1}(\gamma_1 e_1 - \bar{H}_m y)\|_2 = \|\gamma_1 e_1 - \bar{H}_m y\|_2, \qquad \gamma_1 = \|r_0\|_2.$$

Here $e_1$ denotes the first unit vector. The Hessenberg matrix $\bar{H}_m$ is updated in every step and then brought to an upper triangular matrix $\bar{R}_m$ with a zero last row by a composite orthogonal transformation $Q_m$, mostly by Givens rotations as in the pseudocode given below. Here only $m$ rotations are necessary, since each one can set one element on the lower subdiagonal to zero. In some cases the calculated basis vectors lose their orthogonality due to rounding errors; this can usually be remedied by using the more elaborate Householder reflections instead of the rotations. In both cases, applying $Q_m$ yields

$$\|\gamma_1 e_1 - \bar{H}_m y\|_2 = \|Q_m \gamma_1 e_1 - \bar{R}_m y\|_2 = \left\| \begin{pmatrix} g_m \\ \gamma_{m+1} \end{pmatrix} - \begin{pmatrix} R_m \\ 0 \end{pmatrix} y \right\|_2,$$

where $R_m$ and $g_m$ are obtained from $\bar{R}_m$ and $Q_m \gamma_1 e_1$ by omitting the last row. Here it can now be seen at which point the residual becomes minimal, namely for the uniquely determined vector $y$ that satisfies $R_m y = g_m$. The norm of the residual in the $m$-th step is therefore exactly $|\gamma_{m+1}|$.

A special feature of the method is that the current approximation $x_m$ is initially not calculated in the course of the iteration, only the auxiliary vector $y$ is; instead, the method delivers the norm of the residual $|\gamma_{m+1}|$ in each step. Only when this is less than the desired accuracy is the method usually terminated, and the current approximation is then calculated as a linear combination of the basis vectors, $x_m = x_0 + V_m y$; the components of $y$ are simply the coefficients of this basis representation.

Alternatively, the solution of the above minimization problem is characterized as the vector $x_m$ of the affine Krylov subspace $x_0 + K_m$ whose residual $b - Ax_m$ is perpendicular to the space $A K_m$. This makes GMRES an oblique projection method.

Pseudocode

Given $x_0$ and a termination tolerance $\mathrm{TOL} > 0$ for the norm of the residual, calculate $r_0 = b - Ax_0$, $\gamma_1 = \|r_0\|_2$.

If $\gamma_1 \leq \mathrm{TOL}$, then END.

$v_1 = r_0/\gamma_1$.

For $j = 1, 2, \dots, m$

    $w_j = Av_j$.
    For $i = 1, \dots, j$ do $h_{ij} = \langle w_j, v_i \rangle$, $w_j = w_j - h_{ij} v_i$.
    $h_{j+1,j} = \|w_j\|_2$.
    For $i = 1, \dots, j-1$ do $\begin{pmatrix} h_{ij} \\ h_{i+1,j} \end{pmatrix} = \begin{pmatrix} c_{i+1} & s_{i+1} \\ -s_{i+1} & c_{i+1} \end{pmatrix} \begin{pmatrix} h_{ij} \\ h_{i+1,j} \end{pmatrix}$.
    $\beta = \sqrt{h_{jj}^2 + h_{j+1,j}^2}$, $s_{j+1} = h_{j+1,j}/\beta$, $c_{j+1} = h_{jj}/\beta$, $h_{jj} = \beta$.
    $\gamma_{j+1} = -s_{j+1} \gamma_j$, $\gamma_j = c_{j+1} \gamma_j$.
    If $|\gamma_{j+1}| > \mathrm{TOL}$, $v_{j+1} = w_j/h_{j+1,j}$,
    else
        for $i = j, j-1, \dots, 1$ do $y_i = \frac{1}{h_{ii}} \Bigl( \gamma_i - \sum_{k=i+1}^{j} h_{ik} y_k \Bigr)$.
        $x_j = x_0 + \sum_{i=1}^{j} y_i v_i$.
        END.
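The pseudocode translates almost line by line into NumPy. The following is a minimal sketch under the assumptions stated above (a regular matrix $A$, an absolute tolerance); the dense storage of the basis, the function name and the small test system at the end are choices made here for illustration:

```python
import numpy as np

def gmres(A, b, x0=None, tol=1e-8, max_iter=None):
    """Minimal GMRES following the pseudocode; returns (x, residual norm)."""
    n = b.shape[0]
    x0 = np.zeros(n) if x0 is None else x0
    m = n if max_iter is None else max_iter

    r0 = b - A @ x0
    gamma = np.zeros(m + 1)            # gamma[j] corresponds to gamma_{j+1}
    gamma[0] = np.linalg.norm(r0)
    if gamma[0] <= tol:
        return x0, gamma[0]

    V = np.zeros((n, m + 1))           # orthonormal Arnoldi basis v_1, ..., v_{m+1}
    H = np.zeros((m + 1, m))           # Hessenberg matrix; becomes R after the rotations
    c = np.zeros(m + 1)                # cosines of the Givens rotations
    s = np.zeros(m + 1)                # sines of the Givens rotations
    V[:, 0] = r0 / gamma[0]

    for j in range(m):
        # Arnoldi step with modified Gram-Schmidt
        w = A @ V[:, j]
        for i in range(j + 1):
            H[i, j] = w @ V[:, i]
            w -= H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] > 0.0:
            V[:, j + 1] = w / H[j + 1, j]      # next basis vector

        # apply the previous Givens rotations to the new column of H
        for i in range(j):
            t = H[i, j]
            H[i, j]     =  c[i + 1] * t + s[i + 1] * H[i + 1, j]
            H[i + 1, j] = -s[i + 1] * t + c[i + 1] * H[i + 1, j]

        # compute a new rotation that eliminates the subdiagonal entry
        beta = np.hypot(H[j, j], H[j + 1, j])
        s[j + 1], c[j + 1] = H[j + 1, j] / beta, H[j, j] / beta
        H[j, j], H[j + 1, j] = beta, 0.0
        gamma[j + 1] = -s[j + 1] * gamma[j]
        gamma[j] *= c[j + 1]

        # |gamma[j+1]| is the norm of the current residual, available for free
        if abs(gamma[j + 1]) <= tol or j == m - 1:
            y = np.zeros(j + 1)
            for i in range(j, -1, -1):         # back substitution for R y = g
                y[i] = (gamma[i] - H[i, i + 1:j + 1] @ y[i + 1:j + 1]) / H[i, i]
            return x0 + V[:, :j + 1] @ y, abs(gamma[j + 1])

# quick check on a small, well-conditioned random system
rng = np.random.default_rng(1)
A = rng.standard_normal((50, 50)) + 10.0 * np.eye(50)
b = rng.standard_normal(50)
x, res = gmres(A, b, tol=1e-10)
print(res, np.linalg.norm(b - A @ x))  # the two residual norms should agree
```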

Convergence results

Due to the definition of the method via the minimization problem, the Euclidean norm of the residuals decreases monotonically. In exact arithmetic, GMRES is even a direct method that delivers the exact solution after at most $n$ steps. If the dimension of the Krylov subspace increases by one in each step, this statement is clear, since in the $n$-th step the residual is minimized over all of $\mathbb{R}^n$. If this is not the case, the algorithm terminates earlier, but then already with the exact solution.

For general matrices this is also the strongest possible statement: according to a theorem of Greenbaum, Pták and Strakoš, for every monotonically decreasing sequence there is a matrix for which the sequence of residual norms generated by GMRES is exactly the given sequence. In particular, it is possible that the residual norms remain constant and drop to zero only in the very last step.

For special matrices there are sharper convergence results. If the matrix $A$ is diagonalizable, there exist a regular matrix $V$ and a diagonal matrix $\Lambda$ with $A = V \Lambda V^{-1}$. Then for every polynomial $p$ of degree $k$ with $p(0) = 1$:

$$\|r_k\|_2 \leq \kappa_2(V) \, \max_{\lambda \in \sigma(A)} |p(\lambda)| \, \|r_0\|_2.$$

Here $\kappa_2(V) = \|V\|_2 \|V^{-1}\|_2$ denotes the condition number of the matrix $V$ in the Euclidean norm and $\sigma(A)$ the spectrum of $A$, i.e. the set of its eigenvalues. For a normal matrix, $\kappa_2(V) = 1$. It follows from the inequality in particular that preconditioning should merge the eigenvalues into clusters, since then a polynomial of low degree can be small on all of them.

If the matrix $A$ is positive definite (not necessarily symmetric), then:

$$\|r_k\|_2 \leq \left( 1 - \frac{\lambda_{\min}\bigl(\tfrac{1}{2}(A + A^T)\bigr)^2}{\lambda_{\max}(A^T A)} \right)^{k/2} \|r_0\|_2,$$

where $\lambda_{\max}$ and $\lambda_{\min}$ denote the largest and smallest eigenvalue of a matrix, respectively.

If the matrix $A$ is not only positive definite but also symmetric, then it even holds that:

$$\|r_k\|_2 \leq \left( \frac{\kappa_2(A)^2 - 1}{\kappa_2(A)^2} \right)^{k/2} \|r_0\|_2.$$

All of these statements apply only to the residuals and therefore provide no information about the actual error, i.e. the distance between the current approximation and the exact solution, for which no comparable statements are known.
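The monotonic decrease of the residual norms can be observed numerically, for example with SciPy's gmres routine and a callback that records the residual history. The test matrix below (a random sparse matrix shifted towards the diagonal) and all parameters are choices made for this experiment; the argument rtol is called tol in older SciPy versions:

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 500
A = (sp.random(n, n, density=0.01, random_state=0) + 10.0 * sp.eye(n)).tocsr()
b = np.ones(n)

res_norms = []
x, info = spla.gmres(A, b, rtol=1e-10, restart=n,        # restart=n: no actual restart
                     callback=lambda r: res_norms.append(r),
                     callback_type='pr_norm')            # record the residual norm

# the recorded sequence never increases
assert all(r2 <= r1 * (1 + 1e-12) for r1, r2 in zip(res_norms, res_norms[1:]))
```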

Effort and Restarted GMRES

Each iteration of GMRES requires one matrix-vector multiplication and a number of scalar products that grows by one with each iteration, as does the number of (dense!) basis vectors that must be stored. This is because the method is not given by a short recurrence; instead, all previous basis vectors are accessed in every step.

Since the effort and the storage requirement grow linearly with the number of iterations, it is common to discard the calculated basis after $k$ steps and to restart the iteration with the current approximate solution as the new starting point (restart). This method is called GMRES(k); usual restart lengths are 20 to 40. For the restarted method, however, convergence can only be proven in special cases, and matrices can be constructed for which a restart does not lead to convergence.
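A minimal sketch of this restart loop, built on the gmres function from the sketch above; the bound max_restarts is a safety net chosen here for illustration, precisely because restarted GMRES need not converge:

```python
import numpy as np

def gmres_restarted(A, b, k=30, tol=1e-8, max_restarts=100):
    """GMRES(k): discard the basis every k steps, restart from the current iterate."""
    x = np.zeros(b.shape[0])
    for _ in range(max_restarts):
        # gmres is the function from the sketch above; it returns the
        # current approximation and the norm of its residual
        x, res = gmres(A, b, x0=x, tol=tol, max_iter=k)
        if res <= tol:
            break
    return x
```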

As with all Krylov subspace methods, the total effort of GMRES for sparse matrices is $O(n)$ with a large constant, provided that significantly fewer iterations are carried out than there are unknowns.

Comparison with other solvers

For symmetric matrices, the Arnoldi method for calculating the orthonormal basis coincides with the Lanczos method. The corresponding Krylov subspace method is the MINRES method (for Minimal RESidual method) of Paige and Saunders, which, in contrast to the generalized variant, manages with a three-term recurrence. It can be shown that for general matrices there is no Krylov subspace method that works with short recurrences and at the same time, like the GMRES method, satisfies an optimality condition with respect to the norm of the residual.

Another class of methods is based on the nonsymmetric Lanczos method, in particular the BiCG method. Such methods are characterized by three-term recurrences, but due to the lack of optimality they no longer have a monotonic convergence history. In addition, although they deliver the exact solution in the case of convergence, convergence itself is no longer guaranteed.

A third group comprises methods such as CGS and BiCGSTAB. These likewise work with three-term recurrences without optimality and can also break off prematurely without having converged. The idea behind these methods is to choose the generating polynomials of the iteration sequence in a clever way.

None of the three groups is superior for all matrices; for each there are examples where one class beats the others. In practice, several solvers are therefore tried out in order to gather empirical evidence for the given problem.
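Such an empirical comparison can be sketched with SciPy, where gmres (restarted with restart=20 by default) and bicgstab share the same calling convention; matrix, right-hand side and tolerance are arbitrary choices for illustration, and the argument rtol is called tol in older SciPy versions:

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 500
A = (sp.random(n, n, density=0.01, random_state=0) + 10.0 * sp.eye(n)).tocsr()
b = np.ones(n)

for solver in (spla.gmres, spla.bicgstab):
    x, info = solver(A, b, rtol=1e-8)
    print(solver.__name__, 'converged' if info == 0 else f'stopped, info={info}')
```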

Preconditioning

Less critical than the selection of the actual solver is the selection of the preconditioner, through which decisive speed improvements can be achieved. For sequential codes, an ILU decomposition is recommended, but depending on the problem other preconditioners can also prove very efficient. Since ILU cannot be parallelized, other methods are used in that case, for example Schwarz domain decomposition methods.
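A minimal sketch of ILU preconditioning with SciPy: spilu computes an incomplete LU factorization, which is wrapped as a LinearOperator and passed to gmres as the preconditioner M. Matrix and drop tolerance are arbitrary choices for illustration:

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 500
A = (sp.random(n, n, density=0.01, random_state=0) + 10.0 * sp.eye(n)).tocsc()
b = np.ones(n)

ilu = spla.spilu(A, drop_tol=1e-4)                     # incomplete LU factors of A
M = spla.LinearOperator((n, n), matvec=ilu.solve)      # apply the ILU solve as M^{-1}

x, info = spla.gmres(A, b, M=M, rtol=1e-8)
print('converged' if info == 0 else f'info={info}')
```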

Literature

  • C. T. Kelley: Iterative Methods for Linear and Nonlinear Equations. Society for Industrial and Applied Mathematics (SIAM), Philadelphia PA 1995, ISBN 0-89871-352-8 (Frontiers in Applied Mathematics 16).
  • Andreas Meister: Numerik linearer Gleichungssysteme. Eine Einführung in moderne Verfahren. 2nd revised edition. Vieweg, Wiesbaden 2005, ISBN 3-528-13135-7.
  • Yousef Saad: Iterative Methods for Sparse Linear Systems. 2nd edition. Society for Industrial and Applied Mathematics (SIAM), Philadelphia PA 2003, ISBN 0-89871-534-2.
  • Yousef Saad, Martin H. Schultz: GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems. In: SIAM Journal on Scientific and Statistical Computing, Vol. 7, 1986, ISSN 0196-5204, pp. 856-869.