Markov chain Monte Carlo methods
Markov chain Monte Carlo methods (MCMC methods for short) are a class of algorithms that draw random samples from probability distributions (Monte Carlo method). They do this by constructing a Markov chain that has the desired distribution as its stationary distribution. The state of the chain after a large number of steps is then used as a sample from the desired distribution; the quality of the sample improves with the number of steps.
It is usually easy to construct a Markov chain with the desired properties. The harder part is to determine how many steps are needed for convergence to the stationary distribution within an acceptable error, or to design the algorithm so that independent system states are generated as efficiently as possible. A good chain with a well-chosen initial distribution converges quickly, i.e., the stationary distribution is reached after few steps. In typical applications of MCMC methods, the target distribution is only reached approximately, since some residual effect of the initial distribution always remains. Samples generated with MCMC algorithms typically show high autocorrelation. As a result, the error of the mean estimator is underestimated if the naive standard error is used (see the central limit theorem for Markov chains).
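The underestimation of the standard error can be illustrated on a toy chain. The sketch below is an assumption for illustration (not part of the article): it uses an AR(1) process as a stand-in for autocorrelated MCMC output and compares the naive standard error of the mean with a batch-means estimate.

```python
import random
import statistics

random.seed(0)

# AR(1) chain x_{t+1} = phi*x_t + noise: a simple Markov chain whose
# samples are strongly autocorrelated, like typical MCMC output.
phi, n = 0.9, 100_000
x, xs = 0.0, []
for _ in range(n):
    x = phi * x + random.gauss(0.0, 1.0)
    xs.append(x)

sd = statistics.pstdev(xs)
naive_se = sd / n ** 0.5  # treats the samples as independent

# Batch-means estimate: means of long blocks are nearly independent,
# so their spread reflects the true uncertainty of the overall mean.
b = 1_000
batch_means = [statistics.fmean(xs[i:i + b]) for i in range(0, n, b)]
batch_se = statistics.stdev(batch_means) / len(batch_means) ** 0.5

# The naive standard error is several times too small here.
print(naive_se < batch_se)
```

For phi = 0.9 the true error of the mean exceeds the naive estimate by roughly a factor of sqrt((1 + phi)/(1 - phi)) ≈ 4.4, which the batch-means estimate recovers.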
Application areas
These algorithms are frequently applied to the numerical calculation of multidimensional integrals. Such integrals arise in Bayesian statistics, in computer simulations in statistical physics, and in the evaluation of path integrals in quantum field theories.
MCMC methods can, for example, be used to simulate systems in a canonical ensemble. Simulations near a phase transition typically converge more slowly and require longer equilibration. A sufficient, but not necessary, condition for an MCMC method to have the canonical distribution as its stationary distribution is the detailed balance property.
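The Metropolis acceptance rule satisfies detailed balance with respect to the Boltzmann distribution. As an illustrative sketch (the harmonic potential, the inverse temperature, and all step counts are chosen here for the example, not taken from the article), sampling a single degree of freedom in a canonical ensemble:

```python
import math
import random

random.seed(0)

def metropolis_step(x, beta, energy, step=0.5):
    """One Metropolis update: propose a symmetric random move and accept
    it with probability min(1, exp(-beta * dE)); this rule satisfies
    detailed balance w.r.t. the canonical (Boltzmann) distribution."""
    x_new = x + random.uniform(-step, step)
    dE = energy(x_new) - energy(x)
    if dE <= 0 or random.random() < math.exp(-beta * dE):
        return x_new
    return x

# Harmonic potential E(x) = x^2/2: the canonical distribution is then
# a Gaussian with variance 1/beta.
energy = lambda x: 0.5 * x * x
beta = 2.0

x, samples = 0.0, []
for i in range(200_000):
    x = metropolis_step(x, beta, energy)
    if i >= 10_000:  # discard equilibration ("burn-in") steps
        samples.append(x)

var = sum(s * s for s in samples) / len(samples)
print(round(var, 2))  # should be close to 1/beta = 0.5
```

The sampled variance approaching 1/beta is a basic consistency check that the chain has the canonical distribution as its stationary distribution.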
Example
Suppose we can simulate a discrete uniform distribution on {0, 1}. This corresponds to tossing a fair coin with P(1) = P(0) = 0.5, where "1 = heads" and "0 = tails". But now we want to simulate a distribution π on {1, 2, 3} with π(1) = 0.2, π(2) = 0.4 and π(3) = 0.4. To do this, we define a Markov chain in which all transition probabilities can be simulated with the coin (i.e. 0.5, 0 or 1) and which has π as its stationary distribution. The following procedure yields a Markov chain that fulfills these requirements: If you are in state 1, go to 2. If you are in state 2, flip a coin: if it shows tails, go to 3, otherwise stay in 2. If you are in state 3, flip a coin: if it shows tails, go to 1, otherwise stay in 3. This Markov chain is described by the following transition matrix:
$$ P = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0.5 & 0.5 \\ 0.5 & 0 & 0.5 \end{pmatrix}. $$
Since all entries of the matrix power P^3 are positive, the Markov chain is irreducible and aperiodic. Hence the chain converges to the stationary distribution π from any starting distribution.
For the simulation, we start in state 1. In the table below, the first column gives the simulation step n and the second column the exact probability of being in state 1 after n steps:
As expected, this probability converges to π(1) = 0.2. The last column shows the relative frequency of state 1, averaged over a thousand independent simulations; after several simulation steps it approaches the stationary distribution. If the simulation is run long enough, the sample obtained in this way is approximately π-distributed.
Step n | Exact probability of state 1 | Relative frequency of 1 in 1000 simulations
---|---|---
0 | 1 | 1
1 | 0 | 0
2 | 0 | 0
3 | 0.25 | 0.228
4 | 0.25 | 0.271
5 | 0.1875 | 0.173
6 | 0.1875 | 0.176
7 | 0.2031 | 0.236
8 | 0.2031 | 0.180
9 | 0.1992 | 0.202
10 | 0.1992 | 0.171
11 | 0.2002 | 0.205
12 | 0.2002 | 0.198
13 | 0.2000 | 0.190
14 | 0.2000 | 0.206
If the left eigenvector of the transition matrix for the eigenvalue 1 is normalized so that its entries sum to one, one obtains the probabilities of the states under the stationary distribution of the Markov chain: here 0.2, 0.4, 0.4.
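The coin-flip procedure of the example can be simulated directly. A minimal sketch (function name and step count are illustrative choices):

```python
import random

random.seed(42)

# Transition rule from the example: a fair coin drives the moves
# between the states 1, 2, 3; "tails" sends 2 -> 3 and 3 -> 1.
def step(state):
    if state == 1:
        return 2
    tails = random.random() < 0.5
    if state == 2:
        return 3 if tails else 2
    return 1 if tails else 3  # state == 3

state = 1
counts = {1: 0, 2: 0, 3: 0}
n_steps = 100_000
for _ in range(n_steps):
    state = step(state)
    counts[state] += 1

freqs = {s: counts[s] / n_steps for s in counts}
print(freqs)  # approximately {1: 0.2, 2: 0.4, 3: 0.4}
```

The empirical frequencies match the stationary distribution π = (0.2, 0.4, 0.4) computed from the left eigenvector.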
Algorithms
Examples of Markov Chain Monte Carlo methods are:
- Metropolis algorithm: This local method generates random moves that are accepted or rejected with a certain probability. It is easy to implement, but has the disadvantage that the generated system states are highly autocorrelated.
- Gibbs sampling: This local method is a special case of the Metropolis-Hastings algorithm in which the states are generated according to the local conditional distributions.
- Heat-bath algorithm
- Hybrid Monte Carlo algorithm: This method combines molecular dynamics with random moves. Molecular dynamics is used to efficiently propose new, nearly independent states, which are then accepted or rejected with a certain probability.
- Cluster algorithms: These are special, non-local methods that reduce autocorrelation times and thus the so-called critical slowing down near phase transitions. Such an algorithm was first used in 1987 by R. Swendsen and J.-S. Wang for a Potts model, see Swendsen-Wang algorithm. A popular variant of this cluster method is the Wolff algorithm introduced in 1989.
- Overrelaxation method
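As an illustration of Gibbs sampling from the list above: for a bivariate standard normal distribution with correlation rho, both full conditional distributions are known in closed form, so each coordinate can be redrawn in turn. The target distribution and all parameters are chosen here for illustration:

```python
import math
import random

random.seed(1)

# Gibbs sampling for a bivariate standard normal with correlation rho:
# each coordinate is redrawn from its exact conditional distribution,
# x | y ~ N(rho*y, 1 - rho^2), so there is no accept/reject step.
rho = 0.8
cond_sd = math.sqrt(1.0 - rho * rho)

x, y = 0.0, 0.0
xs = []
for i in range(100_000):
    x = random.gauss(rho * y, cond_sd)  # update x given y
    y = random.gauss(rho * x, cond_sd)  # update y given x
    if i >= 1_000:  # discard equilibration steps
        xs.append(x)

mean = sum(xs) / len(xs)
print(round(mean, 2))  # near 0, the marginal mean of x
```

Because every draw comes from an exact conditional distribution, Gibbs sampling needs no tuning of a proposal width, but its samples are still autocorrelated when the coordinates are strongly correlated.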
Literature
- Jeff Gill: Bayesian Methods: A Social and Behavioral Sciences Approach. 2nd ed. Chapman and Hall/CRC, London 2008, ISBN 978-1-58488-562-7.
- N. Metropolis, A. Rosenbluth, M. Rosenbluth, A. Teller, E. Teller: Equation of State Calculations by Fast Computing Machines. In: Journal of Chemical Physics. Vol. 21, 1953, pp. 1087-1092, doi:10.1063/1.1699114.
- W. K. Hastings: Monte Carlo Sampling Methods Using Markov Chains and Their Applications. In: Biometrika. Vol. 57, 1970, pp. 97-109.
- R. H. Swendsen, J.-S. Wang: Nonuniversal critical dynamics in Monte Carlo simulations. In: Phys. Rev. Lett. Vol. 58, 1987, p. 86.
- S. L. Adler: Over-relaxation method for the Monte Carlo evaluation of the partition function for multiquadratic actions. In: Phys. Rev. D. Vol. 23, 1981, p. 2901.
- Thomas Müller-Gronbach, Erich Novak, Klaus Ritter: Monte Carlo-Algorithmen. Springer-Verlag, Berlin 2012, ISBN 978-3-540-89140-6, Chapter 6, pp. 179-244.
- Charles Geyer: Introduction to Markov Chain Monte Carlo. In: Chapman & Hall/CRC Handbooks of Modern Statistical Methods. Chapman and Hall/CRC, 2011, ISBN 978-1-4200-7941-8, doi:10.1201/b10905-2 (mcmchandbook.net [PDF]).
References
- Sudipto Banerjee, Bradley P. Carlin, Alan E. Gelfand: Hierarchical Modeling and Analysis for Spatial Data. 2nd edition, CRC Press, September 12, 2014, ISBN 978-1-4398-1917-3, p. xix.
- Ankur Gupta, James B. Rawlings: Comparison of Parameter Estimation Methods in Stochastic Chemical Kinetic Models: Examples in Systems Biology. In: AIChE Journal. Vol. 60, No. 4, April 2014, pp. 1253-1268, doi:10.1002/aic.14409. PMID 27429455. PMC 4946376 (free full text).