Transition matrix

from Wikipedia, the free encyclopedia

In mathematics, especially in probability theory and statistics, a transition matrix (also called a process matrix or stochastic matrix) is used to express the transition probabilities of (discrete and continuous) Markov chains. This makes it possible to compute future developments in advance. In the theory of Markov chains, infinite-dimensional transition matrices are also defined; this article, however, only deals with matrices in the sense of linear algebra.

A transition matrix is a square matrix whose row or column sums equal one and whose entries lie between zero and one.

Process matrices are likewise used to compute future dynamic developments. In contrast to stochastic matrices, however, their row or column sums need not equal 1. Like stochastic matrices, they are square.

Further distinction

  • A transition matrix is called row-stochastic if all entries of the matrix are between 0 and 1 and the row sums equal 1.
  • A transition matrix is called column-stochastic if all entries of the matrix are between 0 and 1 and the column sums equal 1.
  • A transition matrix is called doubly stochastic if it is both row-stochastic and column-stochastic.

The following definition is equivalent: a matrix is called row- (column-) stochastic if its rows (columns) are probability vectors.

Matrices with entries between 0 and 1 whose row sums (or column sums) are less than 1 are sometimes called substochastic. In stochastics, row-stochastic matrices are used almost exclusively. The distinction is, in general, not very important, since transposition converts one type into the other.

Properties

Eigenvalues and eigenvectors

The eigenvalues and eigenvectors of a stochastic matrix play a special role in stochastics. If v is an eigenvector for the eigenvalue 1, it corresponds to a stationary distribution of the Markov chain (see below). In general, every stochastic matrix has the eigenvalue 1. If P is, for example, row-stochastic, the row sum norm gives ‖P‖_∞ = 1. Since the spectral radius of a matrix is always at most as large as its norm, all eigenvalues must have absolute value less than or equal to 1. If now e = (1, 1, …, 1)^T is the one-vector (i.e. a vector with only 1 as entries), then Pe = e, so 1 is an eigenvalue of P. The proof for column-stochastic matrices runs analogously, but with the column sum norm instead of the row sum norm. It follows directly that 1 is always the largest eigenvalue. Furthermore, 1 is always a semi-simple eigenvalue. The dimension of the eigenspace is somewhat harder to compute. With the Perron–Frobenius theorem it follows:
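This argument can be checked directly in code: each row of a row-stochastic matrix sums to 1, so applying the matrix to the one-vector reproduces it. A minimal sketch in plain Python, using a small hypothetical example matrix:

```python
# For a row-stochastic matrix P, (P e)_i = sum_j p_ij = 1 for every row i,
# so the one-vector e is a right eigenvector for the eigenvalue 1.
def matvec(P, v):
    return [sum(p * x for p, x in zip(row, v)) for row in P]

# Hypothetical row-stochastic matrix (dyadic entries, so row sums are exact).
P = [[0.5, 0.25, 0.25],
     [0.25, 0.5, 0.25],
     [0.125, 0.375, 0.5]]

e = [1.0, 1.0, 1.0]
print(matvec(P, e))  # [1.0, 1.0, 1.0]: e is reproduced exactly
```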

  • If all entries of a stochastic matrix are strictly greater than 0, the dimension of the eigenspace belonging to the eigenvalue 1 equals 1.
  • More generally, if the stochastic matrix is irreducible, the dimension of the eigenspace belonging to the eigenvalue 1 equals 1.

Convexity, norms and closure

The set of transition matrices is convex: if A and B are row- (or column-) stochastic matrices, then λA + (1 − λ)B is again a row- (column-) stochastic matrix for every λ ∈ [0, 1].

It follows directly from the definition that the row sum norm of a row-stochastic matrix is 1, as is the column sum norm of a column-stochastic matrix.

In addition, transition matrices are closed under matrix multiplication: if A and B are column- (or row-) stochastic matrices, then AB is again a column- (row-) stochastic matrix.
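The closure property is easy to verify numerically: the product of two row-stochastic matrices again has row sums equal to 1. A minimal sketch in plain Python, with two hypothetical example matrices:

```python
# Closure under multiplication: if A and B are row-stochastic, so is A B.
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# Two hypothetical row-stochastic matrices (dyadic entries, rows sum to 1).
A = [[0.5, 0.5], [0.25, 0.75]]
B = [[0.75, 0.25], [0.5, 0.5]]

C = matmul(A, B)
print([sum(row) for row in C])  # [1.0, 1.0]: each row of the product sums to 1
```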

Example of a transition matrix P

The characteristic polynomial of a 3×3 transition matrix P can be calculated very easily. With the trace tr(P) and the determinant det(P):

χ_P(λ) = λ³ − tr(P)·λ² + (tr(P) + det(P) − 1)·λ − det(P)
       = (λ − 1)·(λ² + (1 − tr(P))·λ + det(P))

The last line shows that 1 is always an eigenvalue of P, regardless of the choice of the entries of P. The other two eigenvalues can then be calculated with the quadratic formula (pq formula).
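That the characteristic polynomial built from the trace and determinant always vanishes at λ = 1 can be checked numerically. A minimal sketch in plain Python, with a hypothetical 3×3 stochastic matrix:

```python
# Check: chi_P(l) = l^3 - tr(P) l^2 + (tr(P) + det(P) - 1) l - det(P)
# evaluates to 0 at l = 1, i.e. 1 is always an eigenvalue.
def trace(P):
    return P[0][0] + P[1][1] + P[2][2]

def det3(P):
    (a, b, c), (d, e, f), (g, h, i) = P
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def chi(P, l):
    t, d = trace(P), det3(P)
    return l ** 3 - t * l ** 2 + (t + d - 1) * l - d

# Hypothetical 3x3 row-stochastic matrix.
P = [[0.5, 0.25, 0.25],
     [0.25, 0.5, 0.25],
     [0.125, 0.375, 0.5]]

print(chi(P, 1.0))  # 0.0: lambda = 1 is a root for any choice of entries
```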

Application to the characterization of discrete Markov chains

If P is a row-stochastic matrix, a time-homogeneous Markov chain with finite state space can be characterized in the following way:

The entries of the matrix P are exactly the transition probabilities from state i to state j: p_ij = P(X_{t+1} = j | X_t = i). If π_0 is a probability vector (which in stochastics is often defined as a row vector and denoted by π), then π_0 describes the state of the system at time 0; the i-th entry of π_0 is the probability of being in state i at time 0. The occupation probabilities at time 1 are obtained by multiplying π_0 from the left by P:

π_1 = π_0 P

The occupation probabilities at an arbitrary time t, depending on the initial state π_0, are then

π_t = π_0 P^t
For column-stochastic matrices one proceeds analogously, except that the multiplication is carried out from the right, x_{t+1} = P x_t, and the ordinary (right) eigenvector for the eigenvalue 1 is computed. Alternatively, one can transpose the matrix and apply the procedure outlined above.

The left eigenvectors of the matrix P for the eigenvalue 1 play a special role, since they represent the stationary distributions of the Markov chain.

An application-oriented example of this use of transition matrices is the calculation of the PageRank by means of the Google matrix. Each state corresponds to a website on the World Wide Web, and the transition probabilities indicate the probability with which a user clicks on a link. The limit distribution is then the relative frequency with which the user visits a website, and thus a measure of the importance of that page.

The right eigenvectors of a transition matrix for the eigenvalue 1 also play a role in the investigation of Markov chains: with suitable normalization, these are exactly the absorption probabilities into an absorbing state.

Furthermore, many properties of a Markov chain can also be found in the transition matrix:

  • If there is a number k such that (P^k)_{ij} > 0, then state j can be reached from state i.
  • If in addition (P^m)_{ji} > 0 for some m, then the states i and j communicate.
  • For a homogeneous Markov chain with finite state space, irreducibility of the Markov chain is equivalent to irreducibility of the transition matrix P.
  • If all entries of P^k are strictly positive for some k, then the Markov chain is irreducible and aperiodic and therefore converges to a limit distribution. This criterion is often easier to check than showing irreducibility and aperiodicity separately.
  • For a transition matrix, the Chapman–Kolmogorov equation can be interpreted as a matrix multiplication written out component-wise.
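The positive-power criterion in particular lends itself to a direct check in code: test whether some power of P is entrywise positive. A minimal sketch in plain Python (the example matrices are hypothetical):

```python
# A finite Markov chain is irreducible and aperiodic exactly when some power
# P^k of its transition matrix has all entries strictly positive ("primitive").
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def is_primitive(P):
    n = len(P)
    Q = P
    # For primitive matrices, a power k <= (n - 1)**2 + 1 suffices (Wielandt's bound).
    for _ in range((n - 1) ** 2 + 1):
        if all(q > 0 for row in Q for q in row):
            return True
        Q = matmul(Q, P)
    return False

print(is_primitive([[0.0, 1.0], [1.0, 0.0]]))    # False: periodic chain
print(is_primitive([[0.5, 0.5], [0.25, 0.75]]))  # True: irreducible and aperiodic
```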

Examples

The adjacency graph of the stochastic matrix from the example, which is thus also a transition graph.

The rat in the room

Peter owns a rat. If the rat is not locked in the cage, it is either under the desk (state 3), behind the closet (state 2), or in the cage eating (state 1). The rat changes place every 5 minutes. If it is in the cage, with probability 0.05 it stays there, with probability 0.4 it goes behind the closet, and with probability 0.55 it goes under the desk. If it is behind the closet, with probability 0.7 it stays there, with probability 0.2 it goes under the desk, and with probability 0.1 it goes into the cage. If it is under the desk, with probability 0.1 it stays there, with probability 0.1 it goes into the cage, and with probability 0.8 it escapes behind the closet. The behavior of the rat is described by the row-stochastic matrix

P = ( 0.05  0.4   0.55 )
    ( 0.1   0.7   0.2  )
    ( 0.1   0.8   0.1  )

Peter now releases his rat and wants to know the probability that the rat will be in the cage after 20 minutes. The initial state of the system is

π_0 = (1, 0, 0)

(the rat is in the cage with probability 1). The state after 20 minutes (after 4 time steps) is then (rounded)

π_4 = π_0 P⁴ ≈ (0.0952, 0.6933, 0.2115)

So there is a probability of 0.0952 that the rat is in the cage.
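This calculation can be reproduced in a few lines of plain Python; the helper vecmat is just the row-vector-times-matrix product used above:

```python
# pi_4 = pi_0 P^4 for the rat example: four steps of 5 minutes = 20 minutes.
def vecmat(v, P):
    return [sum(v[i] * P[i][j] for i in range(len(v))) for j in range(len(P[0]))]

P = [[0.05, 0.40, 0.55],   # from the cage
     [0.10, 0.70, 0.20],   # from behind the closet
     [0.10, 0.80, 0.10]]   # from under the desk

pi = [1.0, 0.0, 0.0]       # the rat starts in the cage
for _ in range(4):
    pi = vecmat(pi, P)

print(round(pi[0], 4))     # 0.0952: probability of finding the rat in the cage
```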

Peter is going on vacation over the weekend and then wants to catch his rat again. The question is where he should look first. Since much time has passed since the rat was released, it is reasonable to assume that the system is in equilibrium. We are therefore looking for a left eigenvector of P, or equivalently a right eigenvector of P^T, for the eigenvalue 1. Direct calculation yields the (normalized, rounded) eigenvector

π ≈ (0.0952, 0.6926, 0.2121)
So Peter should look behind the closet first.
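The equilibrium distribution can also be approximated numerically, here by simple power iteration (repeatedly applying P from the right to a row vector) rather than by solving the eigenvalue problem exactly:

```python
# Power iteration: pi <- pi P converges to the stationary distribution,
# the left eigenvector of P for the eigenvalue 1, since this P is primitive.
def vecmat(v, P):
    return [sum(v[i] * P[i][j] for i in range(len(v))) for j in range(len(P[0]))]

P = [[0.05, 0.40, 0.55],
     [0.10, 0.70, 0.20],
     [0.10, 0.80, 0.10]]

pi = [1 / 3, 1 / 3, 1 / 3]   # any starting distribution works
for _ in range(100):
    pi = vecmat(pi, P)

print([round(p, 4) for p in pi])  # [0.0952, 0.6926, 0.2121]: closet is most likely
```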

The cat and the mouse

There are five boxes next to each other, numbered from one to five, with a cat in the first box and a mouse in the last. After a fixed time, the animals move at random to a neighboring box. The macabre game comes to an end when the cat meets the mouse in a box. We denote the possible states by (i, j), meaning that the cat is in the i-th box and the mouse in the j-th box. One sees immediately that if i is even (odd), then j must also be even (odd). The Markov chain that describes this game has the following five states:

  • (1, 3)
  • (1, 5)
  • (2, 4)
  • (3, 5)
  • End of game: (2, 2), (3, 3) and (4, 4).

The vector x indicates which of these five states is present. For example, x = (1, 0, 0, 0, 0)^T stands for the first state of our list, i.e. (1, 3), and x = (0, 0, 0, 0, 1)^T for the last, i.e. the end of the game (regardless of in which box).

The (column-stochastic) transition matrix A for this is

A = (  0    0   1/4   0    0 )
    (  0    0   1/4   0    0 )
    ( 1/2   1    0   1/2   0 )
    (  0    0   1/4   0    0 )
    ( 1/2   0   1/4  1/2   1 )
If, for example, we are in the 2nd state (1, 5), as at the beginning, then we will with certainty switch to the 3rd state (2, 4), i.e. cat in the second box and mouse in the fourth box. Therefore the entry in the 2nd column and 3rd row of the transition matrix equals one.

Starting from this state, there is in turn a 25% probability of reaching each of the other four states, so all entries in the 3rd column are 1/4 (except the 3rd row: the state cannot remain the same).
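A short plain-Python sketch shows the absorbing behavior of this chain: iterating x ← A x from the starting state (1, 5), the probability mass ends up entirely in the end-of-game state:

```python
# Column-stochastic dynamics x_{n+1} = A x_n; state 5 ("end of game") is absorbing.
def matvec(A, x):
    return [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(A))]

A = [[0.0, 0.0, 0.25, 0.0, 0.0],   # (1, 3)
     [0.0, 0.0, 0.25, 0.0, 0.0],   # (1, 5)
     [0.5, 1.0, 0.0,  0.5, 0.0],   # (2, 4)
     [0.0, 0.0, 0.25, 0.0, 0.0],   # (3, 5)
     [0.5, 0.0, 0.25, 0.5, 1.0]]   # end of game

x = [0.0, 1.0, 0.0, 0.0, 0.0]      # start: cat in box 1, mouse in box 5
for _ in range(50):
    x = matvec(A, x)

print(round(x[4], 4))              # 1.0: the game ends with probability 1
```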

See also

Literature

References

  1. ^ Gerhard Hübner: Stochastik. 2009, p. 162.