Smith normal form

In mathematics, the Smith normal form is a normal form that is defined for any matrix with entries from a principal ideal domain. The Smith normal form of a matrix is a diagonal matrix that is obtained from the original matrix by multiplying it on the left and on the right by an invertible square matrix each. The entries of this diagonal matrix are called the elementary divisors or invariant factors of the original matrix. The Smith normal form is named after the mathematician Henry John Stephen Smith.

Definition

Let $A$ be an $(m \times n)$ matrix over a principal ideal domain $R$ that is not equal to the zero matrix. Then there exist an invertible $(m \times m)$ matrix $S$ and an invertible $(n \times n)$ matrix $T$ such that

$$S \cdot A \cdot T = \begin{pmatrix} \alpha_1 & 0 & \cdots & \cdots & \cdots & 0 \\ 0 & \ddots & \ddots & & & \vdots \\ \vdots & \ddots & \alpha_r & \ddots & & \vdots \\ \vdots & & \ddots & 0 & \ddots & \vdots \\ \vdots & & & \ddots & \ddots & 0 \\ 0 & \cdots & \cdots & \cdots & 0 & 0 \end{pmatrix}$$

holds, where the diagonal elements satisfy $\alpha_i \mid \alpha_{i+1}$ for $i = 1, \ldots, r-1$. This representation is called the Smith normal form of the matrix $A$. The entries $\alpha_i$ are uniquely determined up to multiplication by a unit and are called the elementary divisors or invariant factors of the matrix. The elementary divisors are given (up to multiplication by a unit) by

$$\alpha_i = \frac{d_i(A)}{d_{i-1}(A)},$$

where $d_i(A)$ is the greatest common divisor of all $(i \times i)$ minors of the matrix $A$, and $d_0(A) = 1$.
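The formula in terms of minors can be tried out directly for small integer matrices. The following is a minimal sketch, assuming $R = \mathbb{Z}$; the helper names `det`, `determinantal_divisor` and `invariant_factors` are chosen here for illustration and do not come from any library. The approach is exponential in the matrix size and is only meant to illustrate the definition.

```python
from itertools import combinations
from math import gcd


def det(m):
    """Exact integer determinant by Laplace expansion along the first row."""
    if not m:
        return 1
    total = 0
    for j, entry in enumerate(m[0]):
        if entry == 0:
            continue
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def determinantal_divisor(a, i):
    """d_i(A): gcd of all (i x i) minors of A (0 if all of them vanish)."""
    d = 0
    for rows in combinations(range(len(a)), i):
        for cols in combinations(range(len(a[0])), i):
            sub = [[a[r][c] for c in cols] for r in rows]
            d = gcd(d, det(sub))
    return d


def invariant_factors(a):
    """alpha_i = d_i(A) / d_{i-1}(A) for i = 1, ..., rank(A), with d_0(A) = 1."""
    factors = []
    previous = 1
    for i in range(1, min(len(a), len(a[0])) + 1):
        current = determinantal_divisor(a, i)
        if current == 0:  # every (i x i) minor vanishes, so i exceeds the rank
            break
        factors.append(current // previous)
        previous = current
    return factors
```

Over the integers the units are $\pm 1$, so the sketch returns the non-negative representative of each invariant factor.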

Algorithm

The main difficulty in finding the Smith normal form is to find two matrices $S$ and $T$ such that the product $S \cdot A \cdot T$ is a diagonal matrix. For this purpose, the matrix $A$ is brought to diagonal form step by step, with an elementary row or column operation carried out in each step. At the same time, the matrices $S$ and $T$, starting from identity matrices of the appropriate size, are transformed along with it: for a row operation, the matrix $S$ is multiplied from the left, and for a column operation, the matrix $T$ is multiplied from the right, by the corresponding elementary matrix. The matrices $A', S', T'$ obtained after each step then satisfy the relation

$$A' = S' \cdot A \cdot T'.$$

Only invertible row and column operations are carried out, so that $S$ and $T$ remain invertible. The Smith normal form is then finally determined from the diagonal form of $A$. To bring a matrix into Smith normal form, the following steps are carried out for $t = 1, \ldots, m$.
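The bookkeeping for the transformation matrices can be illustrated with a short sketch over the integers. The helper names `matmul`, `identity`, `add_row_multiple` and `add_col_multiple` are hypothetical and chosen only for this illustration: every row operation is applied to both the working matrix and the left factor, every column operation to both the working matrix and the right factor.

```python
def matmul(x, y):
    """Product of two matrices given as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)] for row in x]


def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]


def add_row_multiple(m, target, source, factor):
    """Row operation: row[target] += factor * row[source] (in place)."""
    m[target] = [x + factor * y for x, y in zip(m[target], m[source])]


def add_col_multiple(m, target, source, factor):
    """Column operation: col[target] += factor * col[source] (in place)."""
    for row in m:
        row[target] += factor * row[source]


a0 = [[2, 4, 4], [-6, 6, 12], [10, -4, -16]]
a = [row[:] for row in a0]          # working copy A'
s, t = identity(3), identity(3)     # S' and T' start as identity matrices

# A row operation is applied to A' and to S' (multiplication from the left).
add_row_multiple(a, 1, 0, 3)
add_row_multiple(s, 1, 0, 3)

# A column operation is applied to A' and to T' (multiplication from the right).
add_col_multiple(a, 1, 0, -2)
add_col_multiple(t, 1, 0, -2)

# The invariant A' = S' * A * T' holds after every step.
assert a == matmul(matmul(s, a0), t)
```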

Step 1: choosing the pivot

Let $j_t$ be the smallest column index among those columns of $A$ that have at least one nonzero entry, where for $t > 1$ the search starts at column $j_{t-1} + 1$. It is now required that the diagonal element satisfies

$$a_{t, j_t} \neq 0.$$

If this is not the case, then by the choice of $j_t$ there is an element $a_{k, j_t} \neq 0$. The two rows $t$ and $k$ are then swapped by multiplication with a permutation matrix, so that a nonzero element appears on the diagonal of the current column. This element is called the pivot element.

Step 2: improving the pivot

If there is now an entry $a_{k, j_t}$ with $a_{t, j_t} \nmid a_{k, j_t}$, then let

$$\beta = \gcd\left(a_{t, j_t}, a_{k, j_t}\right).$$

By Bézout's lemma, the greatest common divisor of two elements of a principal ideal domain can be written as a linear combination of these elements. Hence there exist elements $\sigma, \tau \in R$ such that

$$\beta = a_{t, j_t} \cdot \sigma + a_{k, j_t} \cdot \tau$$

holds. Using a row operation, $\sigma$ times row $t$ is now added to $\tau$ times row $k$. If $\sigma$ and $\tau$ satisfy the above equation, then $\alpha = a_{t, j_t} / \beta$ and $\gamma = a_{k, j_t} / \beta$ (these divisions are possible by the definition of $\beta$) satisfy

$$\sigma \cdot \alpha + \tau \cdot \gamma = 1.$$

The matrix

$$L_0 = \begin{pmatrix} \sigma & \tau \\ -\gamma & \alpha \end{pmatrix}$$

is thus invertible, with the inverse

$$L_0^{-1} = \begin{pmatrix} \alpha & -\tau \\ \gamma & \sigma \end{pmatrix}.$$

By inserting the entries of the matrix $L_0$ into rows and columns $t$ and $k$ of an identity matrix, the elementary matrix $L$ is obtained. The product $L \cdot A$ then has the entry $\beta$ at position $(t, j_t)$ (and, due to the choice of $\alpha$ and $\gamma$, the entry zero at position $(k, j_t)$, which is convenient but not essential for the algorithm). This new entry $\beta$ divides the previous entry $a_{t, j_t}$. This step is repeated until no further improvement occurs. If $\delta(a)$ denotes the number of prime factors of an element $a$, then after each step

$$\delta(\beta) < \delta(a_{t, j_t}),$$

and therefore the process terminates after a finite number of steps. The result is a matrix whose entry at position $(t, j_t)$ divides all entries in column $j_t$.
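The pivot improvement can be sketched for integer matrices as follows. This is a minimal illustration under the assumption $R = \mathbb{Z}$; the function names `extended_gcd` and `improve_pivot` are hypothetical. The Bézout coefficients come from the extended Euclidean algorithm, and the two affected rows are replaced by the result of applying the $2 \times 2$ block $L_0$ to them.

```python
def extended_gcd(a, b):
    """Return (g, sigma, tau) with g = gcd(a, b) >= 0 and g == a*sigma + b*tau."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    if old_r < 0:  # normalize the gcd to be non-negative (multiply by the unit -1)
        old_r, old_s, old_t = -old_r, -old_s, -old_t
    return old_r, old_s, old_t


def improve_pivot(a, t, k, j):
    """Apply L_0 to rows t and k of a (in place), assuming a[t][j] != 0.

    Afterwards a[t][j] == gcd of the two old entries and a[k][j] == 0.
    """
    beta, sigma, tau = extended_gcd(a[t][j], a[k][j])
    alpha, gamma = a[t][j] // beta, a[k][j] // beta
    new_t = [sigma * x + tau * y for x, y in zip(a[t], a[k])]
    new_k = [-gamma * x + alpha * y for x, y in zip(a[t], a[k])]
    a[t], a[k] = new_t, new_k
    return beta


block = [[18, 24], [-24, -36]]
beta = improve_pivot(block, 0, 1, 0)
print(beta, block)  # 6 [[6, 12], [0, -12]]
```

Because $\sigma \alpha + \tau \gamma = 1$, the $2 \times 2$ block has determinant 1, so the operation is invertible over the integers.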

Step 3: elimination of entries

By adding suitable multiples of row $t$, all entries in column $j_t$ outside the diagonal are now set to zero. This can also be achieved by left multiplication with suitable elementary matrices. However, to bring the matrix to a full diagonal form, the nonzero entries in row $t$ must also be eliminated. This can be achieved by repeating step 2 for the columns of the matrix, in combination with right multiplications. However, this can cause zero entries that were produced in a previous application of step 3 to become nonzero again.

The ideals generated by the elements at position $(t, j_t)$, however, form an ascending chain, since the entries from a later step always divide the entries from an earlier step. Since $R$ is Noetherian, the ideals become stationary after a certain step and no longer change. This means that eventually the entry at position $(t, j_t)$ after applying step 2 divides all nonzero entries in the same column and row. These entries can then be eliminated while the zero entries already produced are retained. Now only the block of $A$ to the lower right of $(t, j_t)$ remains to be diagonalized. The algorithm is continued with this submatrix at step 1, with $t$ replaced by $t + 1$.
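Steps 1 to 3 can be sketched for integer matrices as follows. This is a minimal illustration assuming $R = \mathbb{Z}$, not a definitive implementation; the names `extended_gcd`, `combine_rows`, `transpose` and `diagonalize` are hypothetical, and the tracking of $S$ and $T$ is omitted for brevity. Column operations are carried out as row operations on the transposed matrix.

```python
def extended_gcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) >= 0 and g == a*s + b*t."""
    r0, r1, s0, s1, t0, t1 = a, b, 1, 0, 0, 1
    while r1 != 0:
        q = r0 // r1
        r0, r1 = r1, r0 - q * r1
        s0, s1 = s1, s0 - q * s1
        t0, t1 = t1, t0 - q * t1
    return (r0, s0, t0) if r0 >= 0 else (-r0, -s0, -t0)


def combine_rows(a, t, k, j):
    """Set a[k][j] to 0 using the pivot a[t][j] != 0 (steps 2 and 3 for one row)."""
    if a[k][j] % a[t][j] == 0:
        # Step 3: the pivot divides the entry, plain elimination suffices.
        q = a[k][j] // a[t][j]
        a[k] = [y - q * x for x, y in zip(a[t], a[k])]
        return
    # Step 2: replace the pivot by the gcd using the Bezout block L_0.
    beta, sigma, tau = extended_gcd(a[t][j], a[k][j])
    alpha, gamma = a[t][j] // beta, a[k][j] // beta
    a[t], a[k] = ([sigma * x + tau * y for x, y in zip(a[t], a[k])],
                  [-gamma * x + alpha * y for x, y in zip(a[t], a[k])])


def transpose(a):
    return [list(col) for col in zip(*a)]


def diagonalize(matrix):
    """Return a copy of matrix with at most one nonzero entry per row and column."""
    a = [row[:] for row in matrix]
    m, n = len(a), len(a[0])
    t, j = 0, 0
    while t < m and j < n:
        # Step 1: look for a nonzero entry in column j at or below row t.
        k = next((i for i in range(t, m) if a[i][j] != 0), None)
        if k is None:
            j += 1
            continue
        a[t], a[k] = a[k], a[t]
        changed = True
        while changed:  # repeat until row t and column j are zero off the pivot
            changed = False
            for i in range(m):
                if i != t and a[i][j] != 0:
                    combine_rows(a, t, i, j)
                    changed = True
            a = transpose(a)  # column steps are row steps on the transpose
            for c in range(n):
                if c != j and a[c][t] != 0:
                    combine_rows(a, j, c, t)
                    changed = True
            a = transpose(a)
        t, j = t + 1, j + 1
    return a
```

The inner loop terminates because every application of the Bézout block strictly decreases the absolute value of the pivot, which mirrors the ascending chain argument above.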

Step 4: normalization

Repeated application of steps 1 to 3 eventually leads to an $(m \times n)$ matrix in which only the entries at the positions $(l, j_l)$ with $j_1 < \ldots < j_r$ and $r \leq \min(m, n)$ are nonzero. The zero columns of this matrix are now shifted to the right so that the nonzero entries are exactly at the positions $(i, i)$ for $i = 1, \ldots, r$. These entries are now denoted by $\alpha_i$.

However, the divisibility requirement of the Smith normal form for the diagonal elements may not yet be met. If $\alpha_i \nmid \alpha_{i+1}$ holds for an index $i$, then this can be remedied by means of row and column operations as follows. First, column $i + 1$ is added to column $i$, so that an entry $\alpha_{i+1}$ is created in column $i$ without changing the diagonal entry $\alpha_i$ at position $(i, i)$. Now, as in step 2, a row operation sets the entry at position $(i, i)$ to

$$\beta = \gcd(\alpha_i, \alpha_{i+1}).$$

Finally, as in step 3, the matrix is diagonalized again. Since the new entry at position $(i + 1, i + 1)$ is a linear combination of the original entries $\alpha_i$ and $\alpha_{i+1}$, it must be divisible by $\beta$. This operation does not change the value of $\delta(\alpha_1) + \cdots + \delta(\alpha_r)$ (it equals the value of $\delta$ for the determinant of the upper $(r \times r)$ submatrix), but it decreases the value of

$$\sum_{j=1}^{r} (r - j) \, \delta(\alpha_j)$$

by shifting prime factors to the right. Therefore, after a finite number of applications, no further operations are possible, which means that the desired result $\alpha_1 \mid \alpha_2 \mid \cdots \mid \alpha_r$ has been achieved. Since all row and column operations in this process are invertible, there must exist invertible matrices $S, T$ such that $S \cdot A \cdot T$ is the Smith normal form. In particular, this means that the Smith normal form always exists, which was assumed without proof in the definition.
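On the level of the diagonal entries alone, the normalization replaces a neighbouring pair $(\alpha_i, \alpha_{i+1})$ by $(\beta, \alpha_i \alpha_{i+1} / \beta)$, that is, by the greatest common divisor and the least common multiple. The following minimal sketch, assuming $R = \mathbb{Z}$ and using the hypothetical name `normalize_diagonal`, works only on the list of diagonal entries and does not track the row and column operations themselves.

```python
from math import gcd


def normalize_diagonal(entries):
    """Enforce alpha_1 | alpha_2 | ... | alpha_r on the nonzero diagonal entries."""
    # Signs are dropped, which corresponds to multiplication by the unit -1.
    alphas = [abs(x) for x in entries if x != 0]
    changed = True
    while changed:
        changed = False
        for i in range(len(alphas) - 1):
            a, b = alphas[i], alphas[i + 1]
            if b % a != 0:
                beta = gcd(a, b)
                # New pair: (gcd, lcm); the product a * b is preserved.
                alphas[i], alphas[i + 1] = beta, a * b // beta
                changed = True
    return alphas


print(normalize_diagonal([4, 6, 10]))  # [2, 2, 60]
```

Each replacement keeps the product of the entries fixed and moves prime factors to the right, which is the quantity the termination argument above keeps track of.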

Example

As an example, the Smith normal form of the matrix

$$A = \begin{pmatrix} 2 & 4 & 4 \\ -6 & 6 & 12 \\ 10 & -4 & -16 \end{pmatrix}$$

is computed. The following matrices are the intermediate steps of the algorithm applied to this matrix:

$$\to \begin{pmatrix} 2 & 0 & 0 \\ -6 & 18 & 24 \\ 10 & -24 & -36 \end{pmatrix} \to \begin{pmatrix} 2 & 0 & 0 \\ 0 & 18 & 24 \\ 0 & -24 & -36 \end{pmatrix} \to \begin{pmatrix} 2 & 0 & 0 \\ 0 & 18 & 24 \\ 0 & -6 & -12 \end{pmatrix}$$

$$\to \begin{pmatrix} 2 & 0 & 0 \\ 0 & 6 & 12 \\ 0 & 18 & 24 \end{pmatrix} \to \begin{pmatrix} 2 & 0 & 0 \\ 0 & 6 & 12 \\ 0 & 0 & -12 \end{pmatrix} \to \begin{pmatrix} 2 & 0 & 0 \\ 0 & 6 & 0 \\ 0 & 0 & 12 \end{pmatrix}$$

The last matrix is the Smith normal form of $A$. The invariant factors of $A$ are thus $2$, $6$ and $12$.
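As a cross-check, the same invariant factors can be obtained from the formula $\alpha_i = d_i(A) / d_{i-1}(A)$ of the definition. The greatest common divisor of the entries of $A$ is $2$, the nine $(2 \times 2)$ minors of $A$ are $36, 48, 24, -48, -72, -48, -36, -24, -48$, and $\det A = -144$, so that up to units

$$\begin{aligned} d_1(A) &= 2, & d_2(A) &= 12, & d_3(A) &= 144, \\ \alpha_1 &= \frac{2}{1} = 2, & \alpha_2 &= \frac{12}{2} = 6, & \alpha_3 &= \frac{144}{12} = 12. \end{aligned}$$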

Applications

The Smith normal form is useful for computing the homology of a chain complex whose modules are finitely generated. In topology, for example, the Smith normal form can be used to compute the homology of a simplicial complex or a cell complex over the integers, since the boundary operators of such complexes are represented by integer matrices. It can also be used to prove the structure theorem for finitely generated modules over a principal ideal domain.
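A small worked example, chosen here for illustration: for the boundary of a triangle with vertices $v_0, v_1, v_2$ and oriented edges $e_{01}, e_{02}, e_{12}$ (an assumed labelling), the boundary operator $\partial_1$ with respect to these bases and its Smith normal form are

$$\partial_1 = \begin{pmatrix} -1 & -1 & 0 \\ 1 & 0 & -1 \\ 0 & 1 & 1 \end{pmatrix}, \qquad \mathrm{SNF}(\partial_1) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$

The matrix has rank $2$ and both invariant factors are units, so no torsion occurs. Since there are no $2$-simplices, this gives

$$H_0 \cong \mathbb{Z}^3 / \operatorname{im} \partial_1 \cong \mathbb{Z}, \qquad H_1 = \ker \partial_1 \cong \mathbb{Z}^{3-2} = \mathbb{Z},$$

which is the homology of a circle.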

The Smith normal form can also be used to determine whether two matrices over the same field are similar to each other. Two matrices $A$ and $B$ are similar if and only if their characteristic matrices $xI - A$ and $xI - B$ have the same Smith normal form. For example, the following holds for these matrices:

$$\begin{aligned} A &= \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}, & \mathrm{SNF}(xI - A) &= \begin{pmatrix} 1 & 0 \\ 0 & (x-1)^2 \end{pmatrix} \\ B &= \begin{pmatrix} 3 & -4 \\ 1 & -1 \end{pmatrix}, & \mathrm{SNF}(xI - B) &= \begin{pmatrix} 1 & 0 \\ 0 & (x-1)^2 \end{pmatrix} \\ C &= \begin{pmatrix} 1 & 0 \\ 1 & 2 \end{pmatrix}, & \mathrm{SNF}(xI - C) &= \begin{pmatrix} 1 & 0 \\ 0 & (x-1)(x-2) \end{pmatrix} \end{aligned}$$

Therefore, $A$ and $B$ are similar to each other because the Smith normal forms of their characteristic matrices are the same, but they are not similar to $C$ because the Smith normal form of its characteristic matrix is different.
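These Smith normal forms can be checked with the determinantal divisors $d_i$, computed in the polynomial ring over the field. Each of the three characteristic matrices contains a nonzero constant entry, which is a unit, so $d_1 = 1$ in every case, and $d_2$ is the determinant, that is, the characteristic polynomial:

$$\begin{aligned} xI - A &= \begin{pmatrix} x-1 & -2 \\ 0 & x-1 \end{pmatrix}, & d_2 &= (x-1)^2, \\ xI - B &= \begin{pmatrix} x-3 & 4 \\ -1 & x+1 \end{pmatrix}, & d_2 &= (x-3)(x+1) + 4 = (x-1)^2, \\ xI - C &= \begin{pmatrix} x-1 & 0 \\ -1 & x-2 \end{pmatrix}, & d_2 &= (x-1)(x-2). \end{aligned}$$

The invariant factors are then $\alpha_1 = d_1 = 1$ and $\alpha_2 = d_2 / d_1 = d_2$, in agreement with the matrices above.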

