Gaussian elimination method
The Gaussian elimination method, or simply Gauss method (after Carl Friedrich Gauß), is an algorithm from the mathematical subfields of linear algebra and numerical analysis. It is an important method for solving systems of linear equations and is based on the fact that elementary row operations change the system of equations but preserve its solution set. This allows every uniquely solvable system of equations to be brought into a stepped (row echelon) form, in which the solution can easily be determined by successive elimination of the unknowns, or from which the solution set can be read off.
For an n×n matrix, the number of operations required is of order O(n³). In its basic form the algorithm is prone to rounding errors from a numerical point of view, but with small modifications (pivoting) it is the standard solution method for general systems of linear equations and is part of all major program libraries for numerical linear algebra such as NAG, IMSL and LAPACK.
Explanation
A linear system of equations with three equations, three unknowns x_1, x_2, x_3 and right-hand side b has the form:

a_{11} x_1 + a_{12} x_2 + a_{13} x_3 = b_1
a_{21} x_1 + a_{22} x_2 + a_{23} x_3 = b_2
a_{31} x_1 + a_{32} x_2 + a_{33} x_3 = b_3
The algorithm for calculating the variables x_1, x_2 and x_3 can be divided into two stages:
- Forward elimination,
- Backward substitution (back substitution).
In the first step, the system of equations is brought into stepped form. Stepped form means that each row contains at least one variable fewer than the row above it, i.e. at least one variable is eliminated. In the above system of equations, x_1 and x_2 would be eliminated; in the third row only the variable x_3 remains:

a_{11} x_1 + a_{12} x_2 + a_{13} x_3 = b_1
a'_{22} x_2 + a'_{23} x_3 = b'_2
a''_{33} x_3 = b''_3

(the primes denote entries that have been modified by the transformations).
Elementary row operations are used to reach the stepped form; they transform the system of equations into a new one that has the same solution set. Two types of elementary row operation are sufficient:
- Add a row or a multiple of a row to another row.
- Swap two rows.
The procedure then consists in starting in the first column with operations of the first type: by adding suitable multiples of the first row to the rows below it, all entries of the first column except the first are made zero. This is then continued in the second column of the system modified in this way, this time adding multiples of the second row to the rows below it, and so on. This step only works if the diagonal element of the current column is not zero. In such a case the second type of row operation is needed, since a non-zero entry can be created on the diagonal by swapping rows. With these two types of operations it is possible to bring any linear system of equations into stepped form.
Another type of elementary operation is the swapping of columns. This is not required to carry out the algorithm, but it is sometimes used in computer programs for reasons of stability; it changes the position of the variables in the system of equations. When calculating by hand it is sometimes useful to multiply a row by a number, for example to avoid awkward fractions. This causes additional computational effort, is therefore not done in computer programs, and also changes the determinant of the coefficient matrix, which has theoretical disadvantages.
In the second step of the procedure, the back substitution, one starts from the last row, in which only one variable appears; this variable is calculated and then substituted into the rows above, working upwards.
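The two stages can be sketched in a few lines of Python (a minimal illustration without pivoting; the function name gauss_solve and the example numbers are chosen freely here and are not taken from the article):

def gauss_solve(A, b):
    n = len(A)
    A = [row[:] for row in A]   # work on copies
    b = b[:]
    # forward elimination: bring the system into stepped form
    for i in range(n - 1):
        if A[i][i] == 0:
            raise ZeroDivisionError("zero pivot - a row swap (pivoting) would be needed")
        for k in range(i + 1, n):
            factor = A[k][i] / A[i][i]
            for j in range(i, n):
                A[k][j] -= factor * A[i][j]
            b[k] -= factor * b[i]
    # back substitution: solve from the last row upwards
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / A[i][i]
    return x

# example call with freely chosen, invertible coefficients
print(gauss_solve([[2.0, 1.0, -1.0], [-3.0, -1.0, 2.0], [-2.0, 1.0, 2.0]],
                  [8.0, -11.0, -3.0]))   # prints [2.0, 3.0, -1.0]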
An alternative to this is the Gauss-Jordan algorithm, in which not only the entries below the diagonal are eliminated but also those above it, so that a diagonal form is created and the second step mentioned above is no longer needed.
Example
- The system is abbreviated as A x = b; here A denotes the coefficient matrix, x the vector of unknowns and b the right-hand side.
For better clarity, the coefficients of the linear system of equations are collected in the coefficient matrix extended by the right-hand side b (the augmented matrix).
Now the matrix is transformed so that the entries below the first pivot element become zero, by adding appropriate multiples of the first equation to the second and third equations. The corresponding multiplier is obtained by dividing the element to be eliminated by the pivot element (here the first diagonal entry a_{11}); since these elements are to become zero, each multiplier is additionally multiplied by −1.
The resulting multiple of the first row is added to the second row, and likewise a multiple of the first row is added to the third row. To make the entry below the second pivot zero as well, a suitable multiple of the (new) second row is added to the third row.
If the number by which one divides to calculate the multiplier (the pivot element; here a_{11} for the first two eliminations and the modified a_{22} for the third) is zero, that row is swapped with one below it. The last row then represents an equation in the single unknown x_3:
- a''_{33} x_3 = b''_3 .
This equation is easy to solve and yields x_3. Substituting this value into the second row
- a'_{22} x_2 + a'_{23} x_3 = b'_2 ,
gives x_2, and substituting x_2 and x_3 into the first row finally gives x_1. With that, all three variables x_1, x_2 and x_3 are determined.
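Since the procedure is easiest to follow with concrete numbers, here is a small illustrative system; its coefficients are freely chosen for this sketch and are not the values of the original example:

2 x_1 + x_2 - x_3 = 8
-3 x_1 - x_2 + 2 x_3 = -11
-2 x_1 + x_2 + 2 x_3 = -3

The extended coefficient matrix is

2 | 1 | -1 | 8 |
-3 | -1 | 2 | -11 |
-2 | 1 | 2 | -3 |

Adding 3/2 times the first row to the second row and 1 times the first row to the third row gives

2 | 1 | -1 | 8 |
0 | 1/2 | 1/2 | 1 |
0 | 2 | 1 | 5 |

and adding -4 times the second row to the third row yields the stepped form

2 | 1 | -1 | 8 |
0 | 1/2 | 1/2 | 1 |
0 | 0 | -1 | 1 |

Back substitution gives x_3 = -1 from the last row, then (1/2) x_2 + (1/2)(-1) = 1, so x_2 = 3, and finally 2 x_1 + 3 - (-1) = 8, so x_1 = 2.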
Control by row sums
The transformations can be checked by calculating the row sums.
Here, the sum of all elements of the respective row is written into an additional last column. For the first row this is simply the sum of its entries. Since no transformations are carried out on the first row, its row sum does not change. During the first transformation of this system of equations, a multiple of the first row is added to the second row. If the same is done with the row sums, the result must equal the row sum of the transformed second row. To check the calculations, one can therefore carry out the transformations on the row sums as well: if all calculations are correct, the row sum of a transformed row must again equal the sum of its entries.
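For the illustrative system above (again with the freely chosen numbers of that sketch), the check looks like this: the row sums of the extended rows are 2 + 1 - 1 + 8 = 10, -3 - 1 + 2 - 11 = -13 and -2 + 1 + 2 - 3 = -2. Adding 3/2 times the first row to the second row must therefore turn the second row sum into -13 + (3/2) · 10 = 2, and indeed the transformed second row 0, 1/2, 1/2 | 1 sums to 2.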
Pivoting
In general, the Gaussian elimination method cannot be carried out without interchanging rows. If the first entry a_{11} of the above example is replaced by 0, the algorithm cannot start without a row interchange. To remedy this, one chooses an element of the first column of the coefficient matrix that is not equal to 0, the so-called pivot element.
0 | 2 | 3 | 4 |
1 | 1 | 1 | 2 |
3 | 3 | 1 | 0 |
Then the first row is swapped with the pivot row:
1 | 1 | 1 | 2 |
0 | 2 | 3 | 4 |
3 | 3 | 1 | 0 |
For calculation by hand it is helpful to choose a 1 or a −1 as the pivot element so that no fractions arise in the further course of the procedure. For calculation with the help of a computer it makes sense to choose the element with the largest absolute value in order to obtain as stable an algorithm as possible. If the pivot element is chosen in the current column, one speaks of column pivoting. Alternatively, the pivot can also be chosen in the current row.
0 | 2 | 3 | 4 |
1 | 1 | 1 | 2 |
3 | 3 | 1 | 0 |
In this case the columns are swapped accordingly.
2 | 0 | 3 | 4 |
1 | 1 | 1 | 2 |
3 | 3 | 1 | 0 |
During back substitution it must be taken into account that the variables have changed their position in the system of equations. If the element with the largest absolute value of the entire remaining matrix is chosen as the pivot, one speaks of complete or total pivoting. In general, both rows and columns must be swapped for this.
Pivoting can be carried out without significant additional effort if the entries of the matrix and the right-hand side are not swapped, but the swaps are stored in an index vector.
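The idea of recording the interchanges in an index vector can be sketched in Python as follows (a minimal illustration; the function name lr_decompose_pivot and the variable perm are chosen freely here):

def lr_decompose_pivot(A):
    # LR decomposition with column pivoting; rows are never physically swapped,
    # the index vector perm records which original row acts as the i-th pivot row
    n = len(A)
    A = [row[:] for row in A]
    perm = list(range(n))
    for i in range(n - 1):
        # pivot: row with the largest absolute value in column i (among rows i..n-1)
        p = max(range(i, n), key=lambda r: abs(A[perm[r]][i]))
        perm[i], perm[p] = perm[p], perm[i]   # swap indices only
        for k in range(i + 1, n):
            factor = A[perm[k]][i] / A[perm[i]][i]
            A[perm[k]][i] = factor             # store the multiplier (entry of L)
            for j in range(i + 1, n):
                A[perm[k]][j] -= factor * A[perm[i]][j]
    return A, perm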
LR decomposition
Introductory example
If one wants to implement the solution of a square, uniquely solvable system of equations as a computer program, it is advisable to interpret the Gaussian algorithm as an LR decomposition (also called LU decomposition or triangular decomposition). This is a decomposition of the regular matrix A into the product of a lower, normalized triangular matrix L (for "left", English "lower") and an upper triangular matrix R (for "right", English "upper", in which case it is labeled U). The following example shows this:
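With the freely chosen numbers from the example sketch above (not the article's original matrix), such a decomposition looks like this:

A =
2 | 1 | -1 |
-3 | -1 | 2 |
-2 | 1 | 2 |

L =
1 | 0 | 0 |
-3/2 | 1 | 0 |
-1 | 4 | 1 |

R =
2 | 1 | -1 |
0 | 1/2 | 1/2 |
0 | 0 | -1 |

Indeed A = L R: the entries of L below the diagonal are exactly the multipliers used during the elimination, and R is the stepped form obtained there. No row swaps are needed for these numbers, so the permutation matrix is the identity.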
The matrix L is used to store the required transformation steps, which correspond to multiplications with Frobenius matrices; R has the stepped form mentioned above. This shows the existence of the decomposition. For clarity, the diagonal elements of the matrix L are set to 1. Storing the transformation steps has the advantage that the system of equations can then be solved efficiently for different right-hand sides by forward and backward substitution.
The row interchanges that are generally required can be described by a permutation matrix P:

P A = L R .
Existence proposition
For every regular matrix A there exist a permutation matrix P, a lower, normalized triangular matrix L and an upper triangular matrix R such that:
- P A = L R .
A permutation matrix is a matrix that arises from the identity matrix by an arbitrary number of row interchanges and thus still consists only of zeros and ones.
Algorithm
The algorithm for computing the matrices P, L and R for a given A proceeds as follows.
The matrix A is transformed step by step (n − 1 matrix transformations). For each step i, two transformation matrices are introduced: a permutation matrix P_i, which performs the row swap required by the pivot choice, and a Frobenius matrix L_i, which carries out the elimination in the i-th column below the diagonal.
Note:
- e_i denotes the i-th standard basis vector,
- in P_i only one row of the identity matrix has been swapped with another,
- the pivot row must be chosen so that the pivot element has the maximum absolute value among all elements of the relevant column part below the diagonal.
From the matrices P_i and L_i further auxiliary matrices are formed by moving the permutations past the elimination matrices. The desired matrices can then be specified: R is the upper triangular matrix obtained at the end, P is the product of the permutation matrices, and L collects the (correspondingly permuted) elimination coefficients; a sketch in one common notation follows below.
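Written out in one common notation (an assumption made here for illustration; it need not coincide with the notation of the original presentation), the i-th elimination step without row swaps multiplies by the Frobenius matrix

L_i = I - l_i e_i^T ,   with   l_i = (0, ..., 0, a^{(i)}_{i+1,i} / a^{(i)}_{ii}, ..., a^{(i)}_{n,i} / a^{(i)}_{ii})^T ,

where a^{(i)} denotes the matrix before the i-th step. Then L_{n-1} · ... · L_1 · A = R and, because L_i^{-1} = I + l_i e_i^T,

L = L_1^{-1} L_2^{-1} · ... · L_{n-1}^{-1} = I + \sum_{i=1}^{n-1} l_i e_i^T .

With pivoting, the permutation matrices P_i are interleaved between these factors; collecting them yields P, and the stored multipliers in L have to be permuted accordingly.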
Algorithm in pseudocode
Without pivoting, out-of-place
The following algorithm performs an LR decomposition of the matrix A without pivoting by generating L and R simultaneously outside of A (out-of-place):
Input: matrix A

// initialization
R := A
L := E_n        // identity matrix

// n-1 iteration steps
for i := 1 to n-1
  // loop over the rows of the remaining matrix
  for k := i+1 to n
    // computation of L
    L(k,i) := R(k,i) / R(i,i)   // caution: check for zero values beforehand
    // loop over the columns of the remaining matrix
    for j := i to n
      // computation of R
      R(k,j) := R(k,j) - L(k,i) * R(i,j)

Output: matrix L, matrix R
Without pivoting, in-place
Alternatively, for example out of interest in memory efficiency, L and R can be developed simultaneously directly in A (in-place), as described by the following algorithm:
Input: matrix A

// n-1 iteration steps
for i := 1 to n-1
  // loop over the rows of the remaining matrix
  for k := i+1 to n
    // computation of L (stored in the strict lower triangle of A)
    A(k,i) := A(k,i) / A(i,i)   // caution: check for zero values beforehand
    // loop over the columns of the remaining matrix
    for j := i+1 to n
      // computation of R (stored in the upper triangle of A)
      A(k,j) := A(k,j) - A(k,i) * A(i,j)

Output: matrix A (in modified form: L below the diagonal, R on and above it)
With pivoting
The following algorithm performs an LR decomposition of the matrix A with pivoting. It differs from the algorithms without pivoting only in that rows may be swapped:
Input: matrix A

// n-1 iteration steps
for k := 1 to n-1
  // pivoting:
  // find the element with the largest absolute value in the k-th column
  p := k
  for i := k+1 to n
    if |A(i,k)| > |A(p,k)|
      p := i
  // swap rows
  <record the interchange, e.g. in the permutation matrix P or an index vector>
  <swap rows k and p>
  // remainder analogous to the algorithms above
  <rest as in the algorithm without pivoting>

Output: matrix L, matrix R, matrix P
Solving a linear system of equations
Using the LR decomposition, the original linear system of equations A x = b is rewritten as follows:

A x = b   is equivalent to   P A x = P b ,   i.e.   L R x = P b .

Now the following auxiliary variable is defined:

y := R x .

As a result, the system splits into two systems with a simpler structure:

L y = P b   and   R x = y .

These can easily be solved by forward and backward substitution, respectively.
Forward substitution
In forward substitution one computes a solution y of the linear system L y = b or, when pivoting is used, of L y = P b. It is related to the solution x of the original system of equations via the equation y = R x.
Written out, the system of equations has the following form:

y_1 = b_1
l_{21} y_1 + y_2 = b_2
...
l_{n1} y_1 + l_{n2} y_2 + ... + l_{n,n-1} y_{n-1} + y_n = b_n

For the components y_i the following formula then applies:

y_i = b_i - \sum_{k=1}^{i-1} l_{ik} y_k .

Starting with y_1 = b_1, one can compute y_2, ..., y_n one after the other using the components already known.
Backward substitution
In backward substitution one computes the solution x of the original system of equations by solving R x = y in a way similar to forward substitution. The difference is that one starts with x_n and then computes the values x_{n-1}, ..., x_1 one after the other. The corresponding formula is

x_i = ( y_i - \sum_{k=i+1}^{n} r_{ik} x_k ) / r_{ii} .
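Both substitution steps can be sketched in Python (a minimal illustration; the function names are chosen freely here, L and R are assumed to be given as nested lists, and pb is the already permuted right-hand side P b):

def forward_substitution(L, pb):
    # solve L y = P b; L has ones on the diagonal
    n = len(pb)
    y = [0.0] * n
    for i in range(n):
        y[i] = pb[i] - sum(L[i][k] * y[k] for k in range(i))
    return y

def backward_substitution(R, y):
    # solve R x = y, starting from the last component
    n = len(y)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - sum(R[i][k] * x[k] for k in range(i + 1, n))) / R[i][i]
    return x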
Incomplete decompositions
The LR decomposition has the disadvantage that it often fills up even for sparse matrices. If, instead of all entries, only those within a prescribed occupation pattern are computed, one speaks of an incomplete LU decomposition. It provides a favorable approximation to the matrix A and can therefore be used as a preconditioner in the iterative solution of linear systems of equations. For symmetric positive definite matrices one speaks correspondingly of an incomplete Cholesky decomposition.
Properties of the procedure
Computing effort and storage space requirements
The number of arithmetic operations for the LR decomposition of an n×n matrix is approximately 2n³/3. The effort for forward and backward substitution is quadratic (O(n²)) and therefore negligible overall. The Gaussian elimination method is a fast direct method for solving linear systems of equations; a QR decomposition requires at least twice as many arithmetic operations. Nevertheless, the algorithm should only be used for systems of equations of small to medium dimension. For matrices of higher dimension, iterative methods are often better. These approach the solution step by step and, for a fully populated matrix, require on the order of n² arithmetic operations in each step. The speed of convergence of such methods depends strongly on the properties of the matrix, and the specific computing time required is difficult to predict.
The calculation can be carried out in the memory of the matrix A itself, so that apart from storing A no additional memory is required. For a fully populated matrix of dimension 1000 one would have to store a million coefficients; in the IEEE 754 double format this corresponds to about 8 megabytes. In the case of iterative methods that work with matrix-vector multiplications, however, explicit storage of A may not be necessary at all, so that these methods may be preferable.
For special cases, effort and storage space can be significantly reduced by exploiting special properties of the matrix and its LR decomposition. The Cholesky decomposition for symmetric positive definite matrices requires only half the arithmetic operations and half the memory. Another example are band matrices with a fixed bandwidth w, since here the LR decomposition preserves the band structure and the effort is reduced to the order of n·w². For a few special sparse matrices it is possible to exploit the occupation structure so that the LR decomposition also remains sparse. Both go hand in hand with a reduced memory requirement.
Accuracy
Prerequisites for accuracy
In order for the calculation to be sufficiently accurate, on the one hand the condition of the matrix must not be too bad and the machine precision used must not be too low. On the other hand, you need a solution process that is sufficiently stable . A good algorithm is therefore characterized by high stability.
In general, the procedure is unstable without pivoting. Therefore, column pivoting is mostly used for the solution. As a result, the method can be carried out stably for most matrices, as became clear in particular through the work of James H. Wilkinson after the Second World War. However, matrices can be given for which the stability constant grows exponentially with the dimension of the matrix. With complete pivoting the stability can be improved, but the effort for the pivot search also increases considerably, so it is rarely used. QR decompositions generally have better stability, but they are also more expensive to compute.
For strictly diagonally dominant or positive definite matrices (see also Cholesky decomposition), the Gaussian method is stable and can be carried out without pivoting; no zeros occur on the diagonal during the elimination.
Post-iteration
A practical approach to compensating for these computational inaccuracies is a post-iteration (iterative refinement) by means of a splitting method: the computed LR decomposition is a good approximation of the matrix A, and systems with it are easy to solve. One starts with the computed solution x_0 and in each step calculates the residual

r_k = b - A x_k .

Then the solution Δx_k of the system of equations

L R Δx_k = P r_k

is computed using the LR decomposition, and one sets

x_{k+1} = x_k + Δx_k .
Since mostly only small corrections are involved, a few iteration steps are often sufficient. In general, however, a higher precision is required to compute the residual r_k. If the post-iteration is not sufficient to achieve the desired accuracy, the only remaining options are to choose a different method or to reformulate the problem so as to obtain a more favorable matrix, for example one with a smaller condition number.
The post-iteration is used, for example, in the LAPACK routine DSGESV. In this routine the LR decomposition is computed in single precision, and double precision of the solution is achieved by post-iteration with a residual computed in double precision.
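A sketch of the post-iteration in Python might look as follows (illustrative only; solve_with_lr stands for solving L R Δx = P r with the factors already computed, for example via the substitution routines sketched above, and is an assumed name rather than a library function):

def post_iteration(A, b, x, solve_with_lr, steps=3):
    # refine an approximate solution x of A x = b a few times
    n = len(b)
    for _ in range(steps):
        # residual r = b - A x (ideally computed in higher precision)
        r = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        dx = solve_with_lr(r)              # correction from the LR decomposition
        x = [x[i] + dx[i] for i in range(n)]
    return x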
The Gaussian method as a theoretical aid
In addition to its importance for the numerical treatment of uniquely solvable systems of linear equations, the Gauss method is also an important aid in theoretical linear algebra.
Statements on the solvability of the linear system of equations
A system of linear equations can have exactly one solution, infinitely many solutions, or no solution. When complete pivoting is used, the Gaussian method brings every matrix into a triangular stepped form in which all entries below a certain row are zero and no zero entries appear on the diagonal above that row. The rank of the matrix is then the number of non-zero rows of the transformed matrix. Solvability then results from the interplay with the right-hand side: if non-zero entries of the right-hand side belong to the zero rows, the system of equations is unsolvable; otherwise it is solvable, and the dimension of the solution set equals the number of unknowns minus the rank.
- Example
If the second equation of a system of two equations is a multiple of the first equation, the system of equations has an infinite number of solutions: when x is eliminated in the second equation, that equation disappears completely and only the first equation remains. Solving the first equation for x, the solution set can be given as a function of y; a concrete instance is sketched below.
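A concrete instance of this situation (with freely chosen coefficients, not those of the original example) is

x + y = 1
2 x + 2 y = 2 .

Eliminating x from the second equation turns it into 0 = 0, so only the first equation remains; solving it for x gives x = 1 - y, and the solution set is { (1 - y, y) : y arbitrary }.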
Determinant
The Gaussian method also provides a way of calculating the determinant of a matrix. Since the elementary row operations used have determinant 1, except for row swaps, whose determinant is −1 (this only changes the sign and can therefore easily be corrected), the resulting upper triangular matrix has, up to sign, the same determinant as the original matrix but can be evaluated much more simply: its determinant is the product of the diagonal elements.
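For the illustrative matrix used above (freely chosen numbers), the stepped form had the diagonal entries 2, 1/2 and -1 and no rows were swapped, so its determinant is 2 · (1/2) · (-1) = -1, which is also the determinant of the original matrix.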
Calculating the inverse
Another application of the Gaussian method is the calculation of the inverse of a matrix. For this purpose the algorithm is applied to a scheme extended to the right by an identity matrix and, after the first phase, is continued until an identity matrix is reached on the left; the inverse matrix then stands in the right half. Numerically this procedure is not recommended, and the explicit calculation of the inverse can usually be avoided.
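A sketch of this scheme in Python (illustrative; the function name invert is chosen freely, and column pivoting is added for robustness):

def invert(A):
    # Gauss-Jordan elimination on the extended scheme [A | I]
    n = len(A)
    M = [list(map(float, A[i])) + [1.0 if j == i else 0.0 for j in range(n)]
         for i in range(n)]
    for i in range(n):
        # bring the entry with the largest absolute value in column i to the diagonal
        p = max(range(i, n), key=lambda r: abs(M[r][i]))
        M[i], M[p] = M[p], M[i]
        pivot = M[i][i]
        M[i] = [v / pivot for v in M[i]]                 # normalize the pivot row
        for k in range(n):
            if k != i:
                factor = M[k][i]
                M[k] = [M[k][j] - factor * M[i][j] for j in range(2 * n)]
    # the left half is now the identity, the right half contains the inverse
    return [row[n:] for row in M]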
History
The Chinese mathematics book Jiu Zhang Suanshu (Nine Chapters on the Mathematical Art), written between 200 BC and 100 AD, already gives an exemplary and clear demonstration of the algorithm based on the solution of a system with three unknowns. In 263, Liu Hui published a comprehensive commentary on the book, which was then incorporated into the corpus. Jiu Zhang Suanshu remained an essential source of mathematical education in China and neighbouring countries until the 16th century.
In Europe it was not until 1759 that Joseph-Louis Lagrange published a procedure containing the basic elements. As part of his development and application of the method of least squares, Carl Friedrich Gauß dealt with linear systems of equations, namely the normal equations that occur there. His first publication on the subject dates from 1810 (Disquisitio de elementis ellipticis Palladis), but as early as 1798 he noted cryptically in his diaries that he had solved the problem of elimination. It is certain that he used the method to calculate the orbit of the asteroid Pallas between 1803 and 1809. In the 1820s he first described something like an LR decomposition. In the following years the elimination method was used above all in geodesy, and so the second namesake of the Gauß-Jordan method is not the mathematician Camille Jordan but the geodesist Wilhelm Jordan.
During and after the Second World War, the study of numerical methods gained in importance, and the Gaussian method was now increasingly applied to problems independent of the method of least squares. John von Neumann and Alan Turing defined the LR decomposition in the form commonly used today and examined the phenomenon of rounding errors. These questions were only answered satisfactorily in the 1960s by James Hardy Wilkinson, who showed that the method with pivoting is backward stable.