MINRES method

[Figure: Comparison of the norm of the error and of the residual in the CG method (blue) and the MINRES method (green); the matrix used comes from a 2D boundary value problem.]

The MINRES method (from English minimum residual) is a Krylov subspace method for the iterative solution of symmetric systems of linear equations. It was proposed in 1975 by the mathematicians Christopher Conway Paige and Michael Alan Saunders.

In contrast to the CG method, the MINRES method does not require the matrix to be positive definite; only the symmetry of the matrix is mandatory.
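
A minimal sketch in GNU Octave, with a small test system assumed only for illustration, of a matrix that is admissible for MINRES but violates the positive-definiteness assumption of the CG method:

A = [2 1 0; 1 -3 1; 0 1 4];   % symmetric: A == A'
b = [1; 2; 3];
eig(A)                        % eigenvalues of mixed sign: A is indefinite, but symmetric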

Properties of the MINRES method

The MINRES method iteratively computes an approximate solution \(x\) of a linear system of equations of the form

\[ Ax = b. \]

Here \(A \in \mathbb{R}^{n \times n}\) is a symmetric matrix and \(b \in \mathbb{R}^n\) is the right-hand side.

For this purpose, the norm of the residual \(r_k = b - Ax_k\) is minimized over the \(k\)-dimensional Krylov space

\[ V_k = x_0 + \operatorname{span}\{ r_0, Ar_0, \dots, A^{k-1} r_0 \}. \]

Here \(x_0\) is a starting value and \(r_0 = b - Ax_0\).

Specifically, the approximate solutions \(x_k\) are defined by

\[ x_k := \operatorname{argmin}_{x \in V_k} \lVert b - Ax \rVert, \]

where \(\lVert \cdot \rVert\) is the Euclidean norm on \(\mathbb{R}^n\).

Because of the symmetry of \(A\), it is possible, in contrast to the GMRES method, to carry out this minimization recursively: in the \(k\)-th step, only the iterates from the last two steps need to be accessed (short recursion).
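
A minimal sketch in GNU Octave of what this minimization means, with a small diagonal test matrix assumed only for illustration: the Krylov basis is built explicitly and the residual norm is minimized over it by least squares. This is a brute-force illustration of the definition above, not the recursive MINRES algorithm itself.

n = 50;
A = diag(linspace(-2, 3, n));   % symmetric test matrix with eigenvalues of both signs
b = ones(n, 1);
x0 = zeros(n, 1);
r0 = b - A*x0;
k = 10;
K = zeros(n, k);                % columns r0, A*r0, ..., A^(k-1)*r0
K(:, 1) = r0;
for j = 2:k
  K(:, j) = A*K(:, j-1);
end
y  = (A*K) \ r0;                % least squares: minimize ||b - A*(x0 + K*y)||
xk = x0 + K*y;                  % k-th minimal-residual iterate
norm(b - A*xk)                  % residual norm after k steps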

Construction idea of MINRES

Krylov subspace methods are restricted to using only vectors obtained from matrix-vector products with the system matrix. This has advantages because the matrix does not have to be available explicitly, but only as a function that evaluates the matrix-vector product. At the beginning, the only known vectors are the current approximate solution \(x_0\) (usually the zero vector at the start), the right-hand side \(b\) and the residual \(r_0 = b - Ax_0\).
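
As a minimal sketch (operator and sizes assumed only for illustration), the matrix can be supplied purely as a function handle that evaluates the matrix-vector product, here for a tridiagonal operator that is never stored as a full matrix:

n = 1000;
% matrix-vector product with a tridiagonal 1D Laplacian, never formed explicitly
Afun = @(v) [2*v(1) - v(2); -v(1:n-2) + 2*v(2:n-1) - v(3:n); -v(n-1) + 2*v(n)];
b  = ones(n, 1);
r0 = b - Afun(zeros(n, 1));   % all a Krylov method ever needs from the matrix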

The residual is copied into a vector \(p_0 = r_0\) and, for the above reason, used as the correction direction for the approximate solution. To do this, one computes its image \(s_0 = Ap_0\). One wants to add a multiple of this image to the residual in such a way that its length becomes as small as possible (hence the name of the method). One therefore computes \(r_1 = r_0 - \alpha_0 s_0\) with \(\alpha_0 = \frac{r_0^T s_0}{s_0^T s_0}\), which makes \(r_1 \perp s_0\) (Gram-Schmidt). The corresponding approximate solution for this residual is also known: \(x_1 = x_0 + \alpha_0 p_0\).
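
A minimal sketch in GNU Octave of this first step, with a small test system assumed only for illustration:

A  = [2 1 0; 1 -3 1; 0 1 4];    % symmetric test matrix
b  = [1; 2; 3];
x0 = zeros(3, 1);
r0 = b - A*x0;
p0 = r0;                        % first correction direction
s0 = A*p0;                      % its image under A
alpha0 = (r0'*s0) / (s0'*s0);   % step length that minimizes the new residual norm
x1 = x0 + alpha0*p0;
r1 = r0 - alpha0*s0;
r1'*s0                          % approximately zero: r1 is orthogonal to s0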

For the new residual, a copy \(p_1 = r_1\) is made and its image \(s_1 = Ap_1\) is computed again. In order to keep reducing the residual by repeating this principle, the next step should produce a residual \(r_2\) that is perpendicular to both \(s_0\) and \(s_1\). Since \(s_1\) might contain a component in the direction of \(s_0\), \(s_1\) must be orthogonalized against \(s_0\), and \(p_1\) must be adjusted accordingly so that \(s_1 = Ap_1\) still holds afterwards. For \(r_2\) one then sets \(r_2 = r_1 - \alpha_1 s_1\) with \(\alpha_1 = \frac{r_1^T s_1}{s_1^T s_1}\). This is continued over many iterations.

In this way, the direction \(s_k\) would have to be orthogonalized against all \(k\) of its predecessors in the \(k\)-th iteration. Lanczos was able to show, however, that \(s_k\) is already perpendicular to all of these directions if it is orthogonalized against only its two predecessors. This is due to the symmetry of \(A\) (which is why the method only works in the symmetric case).
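
A minimal sketch in GNU Octave (test matrix assumed only for illustration) that runs this two-term orthogonalization for a few steps and checks numerically that each image is nevertheless orthogonal to all earlier ones:

n = 20;
A = diag(linspace(-2, 3, n));   % symmetric test matrix
b = ones(n, 1);
P = b;  S = A*b;                % x0 = 0, so the first direction is r0 = b; columns hold p_j and s_j
for k = 2:6
  p = S(:, k-1);  s = A*p;      % new direction: the previous image
  for j = max(1, k-2):k-1       % orthogonalize against the last two images only
    beta = (s'*S(:, j)) / (S(:, j)'*S(:, j));
    p = p - beta*P(:, j);
    s = s - beta*S(:, j);
  end
  P(:, k) = p;  S(:, k) = s;
end
S'*S                            % (numerically) diagonal: the images are mutually orthogonal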

MINRES algorithm

Note: The MINRES method is comparatively more complicated than the algebraically equivalent conjugate residual (CR) method. In the following, the CR method is therefore written down as a replacement. It differs from MINRES in that in CR it is not the columns of a basis of the Krylov space (denoted below by \(p_k\)), as in MINRES, but their images (denoted below by \(s_k\)) that are orthogonalized via the Lanczos recursion. There are more efficient and preconditioned variants with fewer AXPY operations; compare the English-language article.

First, one chooses \(x_0\) arbitrarily and computes

\[ r_0 = b - Ax_0, \quad p_0 = r_0, \quad s_0 = Ap_0. \]

Then, for \(k = 1, 2, 3, \dots\), the following steps are iterated:

  • Compute \(x_k\) and \(r_k\) via
    \[ \alpha_{k-1} = \frac{r_{k-1}^T s_{k-1}}{s_{k-1}^T s_{k-1}}, \qquad x_k = x_{k-1} + \alpha_{k-1} p_{k-1}, \qquad r_k = r_{k-1} - \alpha_{k-1} s_{k-1}. \]
    If \(\lVert r_k \rVert\) is smaller than a prescribed tolerance, the algorithm is stopped at this point with the approximate solution \(x_k\); otherwise a new descent direction is computed via
    \[ p_k = s_{k-1}, \qquad s_k = A s_{k-1}, \]
    \[ \beta_1 = \frac{s_k^T s_{k-1}}{s_{k-1}^T s_{k-1}}, \qquad p_k \leftarrow p_k - \beta_1 p_{k-1}, \qquad s_k \leftarrow s_k - \beta_1 s_{k-1}. \]
  • For \(k > 1\) (this step is only carried out from the second iteration onward), additionally compute:
    \[ \beta_2 = \frac{s_k^T s_{k-2}}{s_{k-2}^T s_{k-2}}, \qquad p_k \leftarrow p_k - \beta_2 p_{k-2}, \qquad s_k \leftarrow s_k - \beta_2 s_{k-2}. \]

Convergence rate of the MINRES method

In the case of positive definite matrices, the convergence rate of the MINRES method can be estimated in a similar way to that of the CG method. In contrast to the CG method, however, the estimate applies not to the error of the iterates but to the residual. The following holds:

\[ \lVert r_k \rVert \le 2 \left( \frac{\sqrt{\kappa(A)} - 1}{\sqrt{\kappa(A)} + 1} \right)^{k} \lVert r_0 \rVert, \]

where \(\kappa(A)\) is the condition number of the matrix \(A\).
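
A minimal sketch in GNU Octave (the numbers are assumed only for illustration) that evaluates this bound: for a given condition number it estimates how many iterations suffice to reduce the residual norm by a prescribed factor.

kappa = 1e4;                           % assumed condition number of A
rho   = (sqrt(kappa) - 1) / (sqrt(kappa) + 1);
tol   = 1e-8;                          % desired reduction factor for ||r_k|| / ||r_0||
k     = ceil(log(tol/2) / log(rho))    % smallest k with 2*rho^k <= tol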

Example implementation in GNU Octave / Matlab

function [x, r] = minres(A, b, x0, maxit, tol)
  % Conjugate residual iteration for symmetric A (see the note above);
  % returns the approximate solution x and the final residual r = b - A*x.
  x = x0;
  r = b - A*x0;
  p0 = r;               % current direction p_k
  s0 = A*p0;            % its image s_k = A*p_k
  p1 = p0; s1 = s0;     % placeholders for the previous direction and image
  for iter = 1:maxit
    % shift the history: (p2, s2) become p_{k-2}, s_{k-2}; (p1, s1) become p_{k-1}, s_{k-1}
    p2 = p1; p1 = p0;
    s2 = s1; s1 = s0;
    % minimize the residual norm along the current direction
    alpha = (r'*s1) / (s1'*s1);
    x = x + alpha*p1;
    r = r - alpha*s1;
    if (r'*r < tol^2)   % converged: ||r|| < tol
      break
    end
    % new direction: the previous image, orthogonalized against the last image
    p0 = s1;
    s0 = A*s1;
    beta1 = (s0'*s1) / (s1'*s1);
    p0 = p0 - beta1*p1;
    s0 = s0 - beta1*s1;
    if iter > 1
      % ... and, from the second step on, against the image before that
      beta2 = (s0'*s2) / (s2'*s2);
      p0 = p0 - beta2*p2;
      s0 = s0 - beta2*s2;
    end
  end
end
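
A short usage example, with a test system assumed only for illustration:

n = 100;
e = ones(n, 1);
A = spdiags([-e, 2*e, -e], -1:1, n, n);  % symmetric test matrix (1D Laplacian)
b = e;
[x, r] = minres(A, b, zeros(n, 1), 500, 1e-8);
norm(b - A*x)                            % small once the iteration has converged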

References

  1. Christopher C. Paige, Michael A. Saunders: Solution of sparse indefinite systems of linear equations. In: SIAM Journal on Numerical Analysis, Vol. 12, No. 4, 1975.
  2. Sven Gross, Arnold Reusken: Numerical Methods for Two-phase Incompressible Flows. Springer, ISBN 978-3-642-19685-0, Chapter 5.2.