Mersenne twister

from Wikipedia, the free encyclopedia

The Mersenne Twister is a pseudo random number generator , which in 1997 was developed by Makoto Matsumoto and Takuji Nishimura. It generates sequences of pseudorandom numbers and has been tailored to overcome the problems of older algorithms (such as linear congruence generators ).

There are two variants of this algorithm; the newer and more widely used is the Mersenne twister "MT 19937", which is described here.

properties

  1. Extremely long period of . This period length also explains the name of the algorithm: it is a Mersenne prime , and some properties of the algorithm result from it.
  2. All bits of the output sequence are evenly distributed. The returned integer values ​​are therefore also highly evenly distributed (up to dimension 623, see below). This results in an extremely low correlation between successive sequences of values ​​in the output sequence.
  3. The algorithm is fast. It always generates 624 new status words at once, which it then returns value for value with the next 624 calls. The recalculation of the state vector can be parallelized almost at will on SIMD computer architectures, which benefits the execution speed.

On the other hand, it has the disadvantage of working on a large amount of data of around 2.5  kbytes (624 words with 32 bits each  ). In computer architectures with a relatively small cache and slower working memory, this can result in a speed disadvantage.

The word “Twister” refers to a certain transformation within the algorithm, through which this high-grade uniform distribution is ensured (pure linear congruence generators can only guarantee five-dimensional uniform distribution with reasonable effort).

An n-dimensional uniform distribution means: if you divide the output sequence into tuples of n numbers each , then the sequence of n-tuples is uniformly distributed in n-dimensional space.

In contrast to other algorithms, the Mersenne Twister is not cryptographically secure in its pure form . However, it is already being used successfully for many other applications.

algorithm

The values to (with ) are given as start values. The other values with are calculated as follows:

The symbol denotes the bitwise XOR operation , and "hex" stands for hexadecimal . The symbol is the Gaussian bracket and stands for the rounded value, i.e. H. the largest integer that is not greater than the argument in the parentheses.

In order to ensure the 623-dimensional uniform distribution for all 32 bits of the , they are still modified:

It stands for the bit-wise AND operation .

The calculated numbers are used as random numbers.

initialization

Ideally, real random numbers are selected as starting values to B. can be generated by a physical random number generator . However, pseudo-random numbers from another generator can also be used.

Not all bits that make up the state of the Mersenne Twister may be initialized with zero, otherwise it will only generate zero as a "random number". These are the most significant bit in and all bits in the other variables to .

The less random the start values ​​are (i.e. the more unevenly the bits are distributed), the longer the “warm-up phase” that the Mersenne twister needs before it outputs good pseudo-random numbers. The worst possible initialization consists of only a single set bit in the initialization vector. In this case, the Mersenne Twister needs over 700,000 calls before it returns an evenly distributed bit sequence. If in doubt, you should generate around 800,000 random numbers before using the numbers. Alternatively, there are also modern generators that have much shorter recovery times, such as B. the WELL or Marsaglias Xorshift .

However, you can save yourself the initialization with another PRNG in this way (if you do not trust it, for example): You set (in the code ) a random seed value (e.g. the time) and all others to 0 (In the C code they are usually already because of the attribute). Then you simply call the generator 800,000 times. y[1]static

code

These calculations can be done e.g. Implement it efficiently in C code, for example . The following function always calculates N = 624 words at once, and then these are yread out from the vector in sequence:

TT800

Matsumoto and Nishimura had previously developed a "little brother" of the Mersenne Twister called the TT800 . It works according to the same functional principle, but on a smaller amount of data of only 25 words, and its algorithm is a little simpler because only two old 32-bit status words are used to calculate the next status word. Its period length is .

See also

literature

  • Makoto Matsumoto, Takuji Nishimura: Mersenne twister. A 623-dimensionally equidistributed uniform pseudorandom number generator . In: ACM Transactions on Modeling and Computer Simulation. 8, 1998, ISSN  1049-3301 , pp. 3-30.

Web links

Individual evidence

  1. iro.umontreal.ca (PDF; 301 kB)