Gram-Schmidt orthogonalization method


The Gram-Schmidt orthogonalization method is an algorithm from the mathematical branch of linear algebra. For every system of linearly independent vectors from a pre-Hilbert space (a vector space with a scalar product), it produces an orthogonal system that spans the same subspace. The Gram-Schmidt orthonormalization method is an extension: instead of an orthogonal system, it computes an orthonormal system. If a system of basis vectors is used as input, the algorithms compute an orthogonal or orthonormal basis.

The two methods are named after Jørgen Pedersen Gram and Erhard Schmidt; however, they were already used earlier in the works of Pierre-Simon Laplace and Augustin-Louis Cauchy.

The Gram-Schmidt methods are poorly suited for numerical computation in floating-point arithmetic: due to accumulated rounding errors, the computed vectors are no longer orthogonal. However, there are modifications of the method, such as the modified Gram-Schmidt process, that do not have this disadvantage. Other orthogonalization methods are based on Householder transformations or Givens rotations.
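
The following sketch illustrates this numerical behavior (Python with NumPy; the function names and the test matrix are my own choices, not from the article): it orthonormalizes the columns of an ill-conditioned matrix once with the classical method and once with the modified variant, and measures how far $Q^\top Q$ deviates from the identity.

```python
import numpy as np

def classical_gram_schmidt(A):
    """Orthonormalize the columns of A with classical Gram-Schmidt."""
    Q = np.zeros_like(A, dtype=float)
    for j in range(A.shape[1]):
        v = A[:, j].copy()
        for i in range(j):
            # Project against the *original* column (classical variant).
            v -= (Q[:, i] @ A[:, j]) * Q[:, i]
        Q[:, j] = v / np.linalg.norm(v)
    return Q

def modified_gram_schmidt(A):
    """Orthonormalize the columns of A with modified Gram-Schmidt."""
    Q = A.astype(float).copy()
    for j in range(Q.shape[1]):
        Q[:, j] /= np.linalg.norm(Q[:, j])
        for k in range(j + 1, Q.shape[1]):
            # Project against the *running residual* (modified variant).
            Q[:, k] -= (Q[:, j] @ Q[:, k]) * Q[:, j]
    return Q

# The 10x10 Hilbert matrix is ill-conditioned; its columns are linearly
# independent, but classical Gram-Schmidt loses far more orthogonality
# to rounding than the modified variant.
n = 10
H = 1.0 / (np.arange(n)[:, None] + np.arange(n)[None, :] + 1)
for method in (classical_gram_schmidt, modified_gram_schmidt):
    Q = method(H)
    print(method.__name__, np.linalg.norm(Q.T @ Q - np.eye(n)))
```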

Preliminary remark

In the following, elements of the vector space under consideration (vectors) are denoted by simple Latin letters such as $v$ and $w$, if necessary with indices such as $v_i$ and $w_j$; both arrows and bold type are dispensed with. The scalar product is represented by angle brackets: $\langle v, w \rangle$. In the complex case, the convention is used that the scalar product is semilinear in the first argument and linear in the second argument, that is,

$$\langle \lambda v, w \rangle = \bar{\lambda} \langle v, w \rangle, \qquad \langle v, \lambda w \rangle = \lambda \langle v, w \rangle$$

for all vectors $v$, $w$ and all $\lambda \in \mathbb{C}$. In the complex case, the order of the factors in the scalar product matters in the formulas shown below; in the real case it does not.

In addition, $\|v\| = \sqrt{\langle v, v \rangle}$ denotes the norm of the vector $v$.

Algorithm of the orthogonalization process

[Figure: Illustration of the Gram-Schmidt method using an example with three vectors]

Given the linearly independent vectors $w_1, \dots, w_n$, the following algorithm computes an orthogonal system of $n$ pairwise orthogonal vectors $v_1, \dots, v_n$ that spans the same subspace.

The individual vectors of the orthogonal system are calculated as follows:

$$v_1 = w_1$$
$$v_2 = w_2 - \frac{\langle v_1, w_2 \rangle}{\langle v_1, v_1 \rangle} v_1$$
$$v_3 = w_3 - \frac{\langle v_1, w_3 \rangle}{\langle v_1, v_1 \rangle} v_1 - \frac{\langle v_2, w_3 \rangle}{\langle v_2, v_2 \rangle} v_2$$
$$\vdots$$
$$v_n = w_n - \sum_{i=1}^{n-1} \frac{\langle v_i, w_n \rangle}{\langle v_i, v_i \rangle} v_i$$

In other words, the $v_j$ for $j = 1, \dots, n$ are defined recursively by

$$v_j = w_j - \sum_{i=1}^{j-1} \frac{\langle v_i, w_j \rangle}{\langle v_i, v_i \rangle} v_i.$$
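
The recursion translates directly into code. A minimal sketch (Python with NumPy; the function name and the representation of the vectors are my own choices):

```python
import numpy as np

def gram_schmidt_orthogonalize(ws):
    """Return pairwise orthogonal vectors v_1, ..., v_n spanning the
    same subspace as the linearly independent inputs w_1, ..., w_n."""
    vs = []
    for w in ws:
        w = np.asarray(w, dtype=float)
        # v_j = w_j - sum_{i<j} <v_i, w_j> / <v_i, v_i> * v_i
        v = w - sum(((vi @ w) / (vi @ vi)) * vi for vi in vs)
        vs.append(v)
    return vs
```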

Example

In $\mathbb{R}^2$, equipped with the standard scalar product, let two linearly independent vectors be given that span a subspace:

$$w_1 = \begin{pmatrix} 3 \\ 1 \end{pmatrix}, \qquad w_2 = \begin{pmatrix} 2 \\ 2 \end{pmatrix}$$

Two orthogonal vectors $v_1$ and $v_2$ are now calculated, which span the same subspace:

$$v_1 = w_1 = \begin{pmatrix} 3 \\ 1 \end{pmatrix}$$
$$v_2 = w_2 - \frac{\langle v_1, w_2 \rangle}{\langle v_1, v_1 \rangle} v_1 = \begin{pmatrix} 2 \\ 2 \end{pmatrix} - \frac{8}{10} \begin{pmatrix} 3 \\ 1 \end{pmatrix} = \begin{pmatrix} -0.4 \\ 1.2 \end{pmatrix}$$
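
Checked against the sketch above (assuming the gram_schmidt_orthogonalize function from the previous snippet):

```python
v1, v2 = gram_schmidt_orthogonalize([[3, 1], [2, 2]])
print(v1)       # [3. 1.]
print(v2)       # approximately [-0.4  1.2]
print(v1 @ v2)  # approximately 0 (up to floating-point rounding)
```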

Algorithm of the orthonormalization process

Given the linearly independent vectors $w_1, \dots, w_n$, the following algorithm computes an orthonormal system of $n$ normalized, pairwise orthogonal vectors $q_1, \dots, q_n$ that spans the same subspace. It is identical to normalizing the orthogonal vectors determined by the above algorithm.

The individual vectors of the orthonormal system are obtained by first calculating an orthogonal vector and then normalizing it:

$$q_1 := \frac{w_1}{\|w_1\|} \quad \text{(normalize the first vector } w_1\text{)}$$
$$v_2 := w_2 - \langle q_1, w_2 \rangle q_1 \quad \text{(orthogonalize the second vector } w_2\text{)}$$
$$q_2 := \frac{v_2}{\|v_2\|} \quad \text{(normalize the vector } v_2\text{)}$$
$$v_3 := w_3 - \langle q_1, w_3 \rangle q_1 - \langle q_2, w_3 \rangle q_2 \quad \text{(orthogonalize the third vector } w_3\text{)}$$
$$q_3 := \frac{v_3}{\|v_3\|} \quad \text{(normalize the vector } v_3\text{)}$$
$$\vdots$$
$$v_n := w_n - \sum_{i=1}^{n-1} \langle q_i, w_n \rangle q_i \quad \text{(orthogonalize the } n\text{-th vector } w_n\text{)}$$
$$q_n := \frac{v_n}{\|v_n\|} \quad \text{(normalize the vector } v_n\text{)}$$

In other words, the $q_j$ and $v_j$ for $j = 1, \dots, n$ are defined recursively by

$$v_j := w_j - \sum_{i=1}^{j-1} \langle q_i, w_j \rangle q_i \qquad \text{and} \qquad q_j := \frac{v_j}{\|v_j\|}.$$
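
As before, the recursion can be transcribed directly. A minimal sketch (Python with NumPy; the function name is my own choice):

```python
import numpy as np

def gram_schmidt_orthonormalize(ws):
    """Return an orthonormal system q_1, ..., q_n spanning the same
    subspace as the linearly independent inputs w_1, ..., w_n."""
    qs = []
    for w in ws:
        w = np.asarray(w, dtype=float)
        # v_j = w_j - sum_{i<j} <q_i, w_j> * q_i
        v = w - sum((qi @ w) * qi for qi in qs)
        qs.append(v / np.linalg.norm(v))  # q_j = v_j / ||v_j||
    return qs
```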

In general, this method does not yield a particularly distinguished system. In $\mathbb{R}^3$, for example, a subsequent reordering step is still needed to obtain a right-handed or left-handed system.

Example

In $\mathbb{R}^2$, equipped with the standard scalar product, let two basis vectors be given:

$$w_1 = \begin{pmatrix} 3 \\ 1 \end{pmatrix}, \qquad w_2 = \begin{pmatrix} 2 \\ 2 \end{pmatrix}$$

Two vectors $q_1$ and $q_2$ are now calculated, which form an orthonormal basis of $\mathbb{R}^2$:

$$q_1 = \frac{w_1}{\|w_1\|} = \frac{1}{\sqrt{10}} \begin{pmatrix} 3 \\ 1 \end{pmatrix}$$
$$v_2 = w_2 - \langle q_1, w_2 \rangle q_1 = \begin{pmatrix} 2 \\ 2 \end{pmatrix} - \frac{8}{10} \begin{pmatrix} 3 \\ 1 \end{pmatrix} = \begin{pmatrix} -0.4 \\ 1.2 \end{pmatrix}$$
$$q_2 = \frac{v_2}{\|v_2\|} = \frac{1}{\sqrt{1.6}} \begin{pmatrix} -0.4 \\ 1.2 \end{pmatrix} = \frac{1}{\sqrt{10}} \begin{pmatrix} -1 \\ 3 \end{pmatrix}$$
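
Checked against the sketch above (assuming gram_schmidt_orthonormalize from the previous snippet):

```python
q1, q2 = gram_schmidt_orthonormalize([[3, 1], [2, 2]])
print(q1)       # approximately [0.94868330 0.31622777], i.e. (3, 1)/sqrt(10)
print(q2)       # approximately [-0.31622777 0.94868330], i.e. (-1, 3)/sqrt(10)
print(q1 @ q2)  # approximately 0 (up to floating-point rounding)
```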

Remarks

A special property of both methods is that after each intermediate step the vectors $v_1, \dots, v_i$ calculated so far span the same vector space as the vectors $w_1, \dots, w_i$. The vectors $v_1, \dots, v_i$ or $q_1, \dots, q_i$ thus form an orthogonal or orthonormal basis of the corresponding subspaces. In other words, the transformation matrix that expresses the coordinates of one system in the other is an upper triangular matrix. This matrix also has a positive determinant, so the resulting orthogonal or orthonormal basis has the same orientation as the starting basis. If the orthonormal vectors are collected as the columns of a matrix $Q$, and likewise the vectors of the initial system as a matrix $A$, then there is an upper triangular matrix $R$ with $A = QR$; a QR decomposition is thus determined. Accordingly, the result of the Gram-Schmidt orthonormalization can also be obtained with other methods for QR decomposition that work with Givens rotations or Householder reflections.
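
A small sketch of this connection (Python with NumPy, reusing the hypothetical gram_schmidt_orthonormalize from above): in exact arithmetic $R = Q^\top A$ is upper triangular, since $\langle q_i, w_j \rangle = 0$ for $i > j$.

```python
import numpy as np

A = np.array([[3.0, 2.0],
              [1.0, 2.0]])   # columns are w_1, w_2 from the example
Q = np.column_stack(gram_schmidt_orthonormalize(A.T))
R = Q.T @ A                  # upper triangular up to rounding
print(np.allclose(Q @ R, A))           # True: A = QR
print(np.allclose(np.tril(R, -1), 0))  # True: R is upper triangular
```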

If you calculate an orthonormal system by hand, it is often easier to first calculate an orthogonal system and then to normalize the individual vectors. This saves normalizing twice, and one can often work with simpler values. It may also be worthwhile to carry out Gaussian elimination before constructing the orthogonal or orthonormal system.

Principle of the procedure

If the orthogonal vectors $v_1, \dots, v_{j-1}$ are already determined, we try to subtract from $w_j$ a suitable linear combination of these vectors, so that the difference vector

$$v_j = w_j - \sum_{i=1}^{j-1} \lambda_i v_i$$

becomes orthogonal to all vectors $v_1, \dots, v_{j-1}$. This is equivalent to saying that the scalar product $\langle v_k, v_j \rangle$ takes the value 0 for all $k = 1, \dots, j-1$. Such a linear combination results if for each $i$ the expression

$$\lambda_i = \frac{\langle v_i, w_j \rangle}{\langle v_i, v_i \rangle}$$

is chosen. A control calculation shows that then all scalar products $\langle v_k, v_j \rangle$ with $k < j$ take the value 0:

$$\langle v_k, v_j \rangle = \left\langle v_k,\, w_j - \sum_{i=1}^{j-1} \lambda_i v_i \right\rangle = \langle v_k, w_j \rangle - \sum_{i=1}^{j-1} \lambda_i \langle v_k, v_i \rangle = \langle v_k, w_j \rangle - \lambda_k \langle v_k, v_k \rangle = 0.$$
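
A quick numerical confirmation of this control calculation (Python with NumPy, reusing the hypothetical gram_schmidt_orthogonalize from above; the input vectors are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
ws = rng.standard_normal((4, 6))   # four generic vectors in R^6
vs = gram_schmidt_orthogonalize(ws)
# Every scalar product <v_k, v_j> with k != j vanishes up to rounding.
print(all(abs(vk @ vj) < 1e-12
          for k, vk in enumerate(vs)
          for j, vj in enumerate(vs) if k != j))   # True
```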

Orthonormalization of infinite systems of vectors

In any Hilbert space $H$, the method can also be applied to infinite systems of independent vectors, where independence is to be understood in the sense that no element lies in the closure of the linear span of the other vectors. The case of a countable system (i.e., $H$ is a separable Hilbert space) can be reduced directly to the finite case presented above: given an independent sequence $(w_k)_{k \in \mathbb{N}}$, a corresponding orthonormal sequence $(q_k)_{k \in \mathbb{N}}$ is obtained by applying the above procedure for each $k$. More generally, by the well-ordering theorem, every independent system can be viewed as a sequence $(w_\alpha)_{\alpha < \kappa}$ for a cardinal number $\kappa$ and ordinal numbers $\alpha < \kappa$ (if the linear span of the independent system is dense, $\kappa$ is precisely the dimension of $H$). Now let $P_V$ denote the orthogonal projection onto a closed subspace $V$, which always exists due to the completeness of the space, and let $N$ denote the normalization $N(v) = \frac{v}{\|v\|}$. An orthonormal system then results from

$$q_\alpha = N\bigl((\mathrm{id} - P_{\overline{\operatorname{span}}(w_\beta \,:\, \beta < \alpha)})\, w_\alpha\bigr).$$

Using transfinite induction it can then be shown that $\overline{\operatorname{span}}(q_\beta : \beta \leq \alpha) = \overline{\operatorname{span}}(w_\beta : \beta \leq \alpha)$, even for $\alpha = \kappa$. Using transfinite recursion, the procedure can be written more explicitly as

$$q_\alpha = N\Bigl(w_\alpha - \sum_{\beta < \alpha} \langle q_\beta, w_\alpha \rangle q_\beta\Bigr).$$

Here, the sum is well-defined due to Bessel's inequality (in particular, there are only countably many summands that are not equal to zero).
