Quasi-Newton method

Quasi-Newton methods are a class of numerical methods for solving nonlinear minimization problems. They are based on Newton's method, but instead of computing the Hessian matrix or its inverse exactly, they only build an approximation to it in order to reduce the computational effort per iteration.

The first algorithm was developed by William Davidon, a physicist at Argonne National Laboratory, in the mid-1950s. The best-known algorithms are Broyden-Fletcher-Goldfarb-Shanno (BFGS), named after Charles George Broyden, Roger Fletcher, Donald Goldfarb and David F. Shanno, and Davidon-Fletcher-Powell (DFP), named after Davidon, Fletcher and Michael J. D. Powell.

Basic algorithm

A twice-differentiable function f: R^n → R is approximated to second order by a Taylor expansion around the current iterate x_k:

    f(x_k + p) ≈ f(x_k) + ∇f(x_k)^T p + (1/2) p^T ∇²f(x_k) p,

where ∇f(x_k) is the gradient and ∇²f(x_k) the Hessian matrix of f at x_k.

At a minimum, the derivative of this quadratic model must be zero. Differentiating with respect to p and setting the result to zero gives

    ∇f(x_k) + ∇²f(x_k) p = 0,  so  p = -(∇²f(x_k))^{-1} ∇f(x_k).

If the Hessian matrix is positive definite, this zero of the derivative is indeed a minimum of the model, and the minimum of f can be approximated iteratively with Newton's method:

    x_{k+1} = x_k - (∇²f(x_k))^{-1} ∇f(x_k).

The problem here is that the inverse of the Hessian matrix has to be computed in every iteration and that the Hessian has to be positive definite. In quasi-Newton methods, (∇²f(x_k))^{-1} is therefore replaced by a scalar step size α_k > 0 and a matrix H_k that approximates the inverse Hessian:

    x_{k+1} = x_k - α_k H_k ∇f(x_k).

Written for the two iterates x_k and x_{k+1}, the linearized gradient equation above becomes

    ∇f(x_{k+1}) ≈ ∇f(x_k) + ∇²f(x_k) (x_{k+1} - x_k).

From this we can deduce:

    ∇f(x_{k+1}) - ∇f(x_k) ≈ ∇²f(x_k) (x_{k+1} - x_k).

One now assumes that the Hessians ∇²f(x_k) and ∇²f(x_{k+1}) are approximately equal and demands that the new approximation B_{k+1} ≈ ∇²f(x_{k+1}) (whose inverse can serve as H_{k+1}) satisfies this relation exactly. With the abbreviations s_k = x_{k+1} - x_k and y_k = ∇f(x_{k+1}) - ∇f(x_k), this yields the quasi-Newton condition (secant condition):

    B_{k+1} s_k = y_k.

For B_{k+1} one chooses a correction term of rank one:

    B_{k+1} = B_k + c_k v_k v_k^T.

Inserted into the quasi-Newton condition, the equation can be rearranged to

    c_k v_k (v_k^T s_k) = y_k - B_k s_k.

Choosing v_k = y_k - B_k s_k and c_k = 1/(v_k^T s_k), it follows that

    B_{k+1} = B_k + ( (y_k - B_k s_k)(y_k - B_k s_k)^T ) / ( (y_k - B_k s_k)^T s_k ).

In this way the matrix is uniquely determined (the symmetric rank-one or SR1 update), but with just one correction term it is not always positive definite.
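A compact way to see how these pieces fit together is a small numerical sketch. The following Python fragment is only a minimal illustration (the fixed step size alpha, the tolerance and the quadratic test function are choices made for this example, not part of the method's definition); here B approximates the Hessian itself, so the search direction is obtained by solving B p = -∇f(x) instead of multiplying by H_k:

    import numpy as np

    def sr1_quasi_newton(grad, x0, alpha=1.0, tol=1e-8, max_iter=100):
        """Quasi-Newton iteration with the symmetric rank-one update.

        B approximates the Hessian; the search direction solves B p = -grad(x).
        A fixed step size alpha is used instead of a line search.
        """
        x = np.asarray(x0, dtype=float)
        B = np.eye(len(x))                        # initial Hessian approximation
        g = grad(x)
        for _ in range(max_iter):
            if np.linalg.norm(g) < tol:           # stop when the gradient is small
                break
            p = np.linalg.solve(B, -g)            # quasi-Newton direction
            x_new = x + alpha * p
            g_new = grad(x_new)
            s, y = x_new - x, g_new - g           # s_k and y_k from the text
            r = y - B @ s                         # residual of the secant condition
            if abs(r @ s) > 1e-12:                # skip the update if ill-defined
                B = B + np.outer(r, r) / (r @ s)  # rank-one correction
            x, g = x_new, g_new
        return x

    # Example: minimize the convex quadratic f(x) = 1/2 x^T A x - b^T x
    A = np.array([[4.0, 1.0], [1.0, 3.0]])
    b = np.array([1.0, 2.0])
    grad = lambda x: A @ x - b
    print(sr1_quasi_newton(grad, x0=[0.0, 0.0]))  # approaches A^{-1} b = [1/11, 7/11]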

Davidon-Fletcher-Powell (DFP)

The inverse Hessian is approximated by a matrix H_k that is updated with two correction terms:

    H_{k+1} = H_k + (s_k s_k^T) / (y_k^T s_k) - (H_k y_k y_k^T H_k) / (y_k^T H_k y_k).

Unlike the single correction term above, this update keeps H_k symmetric and positive definite as long as y_k^T s_k > 0.
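Written as code, the update is a one-to-one transcription of this formula. The sketch below is illustrative only (the function name dfp_update and the variable names simply mirror the symbols s_k, y_k and H_k used above):

    import numpy as np

    def dfp_update(H, s, y):
        """One DFP update of the inverse-Hessian approximation H.

        s = x_{k+1} - x_k,  y = grad f(x_{k+1}) - grad f(x_k).
        """
        Hy = H @ y
        return (H
                + np.outer(s, s) / (y @ s)        # first correction term
                - np.outer(Hy, Hy) / (y @ Hy))    # second correction term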

Properties

If f is a quadratic function, the algorithm delivers the exact solution after a finite number of iterations when exact arithmetic is used. For all other (sufficiently smooth) functions, the method converges superlinearly towards a local minimum x*:

    lim_{k→∞} ||x_{k+1} - x*|| / ||x_k - x*|| = 0.

In the case of a quadratic function with n parameters, the solution is ideally reached after n steps. In practice somewhat more iterations are needed, for example when the line search for the step size is not carried out precisely enough or the gradients are not computed accurately enough. Usually the optimization is stopped when, for example, the norm of the gradient becomes very small or a prescribed number of iterations has been reached.
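This finite-termination behaviour can be checked numerically. The following self-contained Python sketch (the matrix A, the vector b and the starting point are arbitrary example data) runs the DFP update with an exact line search on a two-dimensional quadratic; the gradient vanishes, up to rounding, after two steps:

    import numpy as np

    A = np.array([[4.0, 1.0], [1.0, 3.0]])    # positive definite
    b = np.array([1.0, 2.0])
    grad = lambda x: A @ x - b                # gradient of 1/2 x^T A x - b^T x

    x = np.zeros(2)
    H = np.eye(2)                             # initial inverse-Hessian approximation
    for k in range(4):
        g = grad(x)
        print(k, np.linalg.norm(g))           # gradient norm per iteration
        if np.linalg.norm(g) < 1e-12:
            break
        p = -H @ g                            # quasi-Newton direction
        alpha = -(g @ p) / (p @ A @ p)        # exact line search for a quadratic
        s = alpha * p
        y = grad(x + s) - g
        Hy = H @ y
        H = H + np.outer(s, s) / (y @ s) - np.outer(Hy, Hy) / (y @ Hy)  # DFP update
        x = x + s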

Literature

  • William C. Davidon: Variable Metric Method for Minimization. SIAM Journal on Optimization, Volume 1, Issue 1, 1991, pp. 1-17 (first issued as an Argonne National Laboratory report, 1959).
  • Jorge Nocedal, Stephen J. Wright: Numerical Optimization. Springer-Verlag, 1999, ISBN 0-387-98793-2.
  • Edwin K. P. Chong, Stanislaw H. Żak: An Introduction to Optimization. 2nd edition, John Wiley & Sons, 2001.
  • P. E. Gill, W. Murray, M. H. Wright: Practical Optimization. Academic Press, 1981.