Mersenne twister
The Mersenne Twister is a pseudo random number generator , which in 1997 was developed by Makoto Matsumoto and Takuji Nishimura. It generates sequences of pseudorandom numbers and has been tailored to overcome the problems of older algorithms (such as linear congruence generators ).
There are two variants of this algorithm; the newer and more widely used is the Mersenne twister "MT 19937", which is described here.
properties
- Extremely long period of . This period length also explains the name of the algorithm: it is a Mersenne prime , and some properties of the algorithm result from it.
- All bits of the output sequence are evenly distributed. The returned integer values are therefore also highly evenly distributed (up to dimension 623, see below). This results in an extremely low correlation between successive sequences of values in the output sequence.
- The algorithm is fast. It always generates 624 new status words at once, which it then returns value for value with the next 624 calls. The recalculation of the state vector can be parallelized almost at will on SIMD computer architectures, which benefits the execution speed.
On the other hand, it has the disadvantage of working on a large amount of data of around 2.5 kbytes (624 words with 32 bits each ). In computer architectures with a relatively small cache and slower working memory, this can result in a speed disadvantage.
The word “Twister” refers to a certain transformation within the algorithm, through which this high-grade uniform distribution is ensured (pure linear congruence generators can only guarantee five-dimensional uniform distribution with reasonable effort).
An n-dimensional uniform distribution means: if you divide the output sequence into tuples of n numbers each , then the sequence of n-tuples is uniformly distributed in n-dimensional space.
In contrast to other algorithms, the Mersenne Twister is not cryptographically secure in its pure form . However, it is already being used successfully for many other applications.
algorithm
The values to (with ) are given as start values. The other values with are calculated as follows:
The symbol denotes the bitwise XOR operation , and "hex" stands for hexadecimal . The symbol is the Gaussian bracket and stands for the rounded value, i.e. H. the largest integer that is not greater than the argument in the parentheses.
In order to ensure the 623-dimensional uniform distribution for all 32 bits of the , they are still modified:
It stands for the bit-wise AND operation .
The calculated numbers are used as random numbers.
initialization
Ideally, real random numbers are selected as starting values to B. can be generated by a physical random number generator . However, pseudo-random numbers from another generator can also be used.
Not all bits that make up the state of the Mersenne Twister may be initialized with zero, otherwise it will only generate zero as a "random number". These are the most significant bit in and all bits in the other variables to .
The less random the start values are (i.e. the more unevenly the bits are distributed), the longer the “warm-up phase” that the Mersenne twister needs before it outputs good pseudo-random numbers. The worst possible initialization consists of only a single set bit in the initialization vector. In this case, the Mersenne Twister needs over 700,000 calls before it returns an evenly distributed bit sequence. If in doubt, you should generate around 800,000 random numbers before using the numbers. Alternatively, there are also modern generators that have much shorter recovery times, such as B. the WELL or Marsaglias Xorshift .
However, you can save yourself the initialization with another PRNG in this way (if you do not trust it, for example): You set (in the code ) a random seed value (e.g. the time) and all others to 0 (In the C code they are usually already because of the attribute). Then you simply call the generator 800,000 times.
y[1]
static
code
These calculations can be done e.g. Implement it efficiently in C code, for example . The following function always calculates N = 624 words at once, and then these are y
read out from the vector in sequence:
#include <stdint.h>
#define N 624
#define M 397
/*
* Initialisiere den Vektor mit Pseudozufallszahlen.
* Siehe Donald E. Knuth: The Art of Computer Programming, Band 2, 3. Ausgabe, S. 106 für geeignete Werte für den Faktor mult
*
* 2002/01/09 modifiziert von Makoto Matsumoto:
* In der vorhergehenden Version die MSBs des Seeds haben auch nur die MSBs des Arrays beeinflusst.
*/
static void
mersenne_twister_vector_init (uint32_t* const p, const int len)
{
const uint32_t mult = 1812433253ul;
uint32_t seed = 5489ul;
int i;
for (i = 0; i < len; i++)
{
p[i] = seed;
seed = mult * (seed ^ (seed >> 30)) + (i+1);
}
}
/*
* Berechne neuen Zustandsvektor aus altem Zustandsvektor
* Dies erfolgt periodisch alle N=624 Aufrufe von mersenne_twister().
*
* Der folgende Code stellt eine optimierte Version von
* ...
* for (i = 0; i < N; i++)
* Proc3 (p[i], p[(i+1) % N], p[(i+M) % N]);
* ...
* mit
* void Proc3 (uint32_t &a0, uint32_t a1, uint32_t aM)
* {
* a0 = aM ^ (((a0 & 0x80000000) | (a1 & 0x7FFFFFFF)) >> 1) ^ A[a1 & 1];
* }
* dar, in der die (zeitintensiven) Modulo N-Operationen vermieden werden.
*/
static void
mersenne_twister_vector_update (uint32_t* const p)
{
static const uint32_t A[2] = { 0, 0x9908B0DF };
int i = 0;
for (; i < N-M; i++)
p[i] = p[i+(M )] ^ (((p[i ] & 0x80000000) | (p[i+1] & 0x7FFFFFFF)) >> 1) ^ A[p[i+1] & 1];
for (; i < N-1; i++)
p[i] = p[i+(M-N)] ^ (((p[i ] & 0x80000000) | (p[i+1] & 0x7FFFFFFF)) >> 1) ^ A[p[i+1] & 1];
p[N-1] = p[M-1] ^ (((p[N-1] & 0x80000000) | (p[0 ] & 0x7FFFFFFF)) >> 1) ^ A[p[0 ] & 1];
}
/*
* Die eigentliche Funktion:
* - beim 1. Aufruf wird der Vektor mittels mersenne_twister_vector_init() initialisiert.
* - beim 1., 625., 1249., ... Aufruf wird ein neuer Vektor mittels mersenne_twister_vector_update berechnet.
* - bei jedem Aufruf wird eine Zufallszahl aus dem Vektor gelesen und noch einem Tempering unterzogen.
*/
uint32_t
mersenne_twister ()
{
static uint32_t vektor [N]; /* Zustandsvektor */
static int idx = N+1; /* Auslese-Index; idx >= N: neuer Vektor muss berechnet werden */
/* idx = N+1: Vektor muss initialisiert werden */
uint32_t e;
if (idx >= N)
{
if (idx > N)
mersenne_twister_vector_init (vektor, N);
mersenne_twister_vector_update (vektor);
idx = 0;
}
e = vektor [idx++];
e ^= (e >> 11); /* Tempering */
e ^= (e << 7) & 0x9D2C5680;
e ^= (e << 15) & 0xEFC60000;
e ^= (e >> 18);
return e;
}
#undef N
#undef M
TT800
Matsumoto and Nishimura had previously developed a "little brother" of the Mersenne Twister called the TT800 . It works according to the same functional principle, but on a smaller amount of data of only 25 words, and its algorithm is a little simpler because only two old 32-bit status words are used to calculate the next status word. Its period length is .
#include <stdint.h>
#define N 25
#define M 7
/*
* Initialisiere den Vektor mit Pseudozufallszahlen.
*/
static void
TT800_vector_init (uint32_t* const p, const int len)
{
const uint32_t mult = 509845221ul;
const uint32_t add = 3ul;
uint32_t seed1 = 9;
uint32_t seed2 = 3402;
int i;
for (i = 0; i < len; i++)
{
seed1 = seed1 * mult + add;
seed2 *= seed2 + 1;
p[i] = seed2 + (seed1 >> 10);
}
}
/*
* Berechne neuen Zustandsvektor aus altem Zustandsvektor
* Dies erfolgt periodisch alle N=25 Aufrufe von TT800().
*
* Der folgende Code stellt eine optimierte Version von
* ...
* for (i = 0; i < N; i++)
* Proc2 (p[i], p[(i+M) % N]);
* ...
* mit
* void Proc2 (uint32_t &a0, uint32_t aM)
* {
* a0 = aM ^ (a0 >> 1) ^ A[a0 & 1];
* }
* dar, in der die (zeitintensiven) Modulo N-Operationen vermieden werden.
*/
static void
TT800_vector_update (uint32_t* const p)
{
static const uint32_t A[2] = { 0, 0x8ebfd028 };
int i = 0;
for (; i < N-M; i++)
p[i] = p[i+(M )] ^ (p[i] >> 1) ^ A[p[i] & 1];
for (; i < N ; i++)
p[i] = p[i+(M-N)] ^ (p[i] >> 1) ^ A[p[i] & 1];
}
/*
* Die eigentliche Funktion:
* - beim 1. Aufruf wird der Vektor mittels TT800_vector_init() initialisiert.
* - beim 1., 26., 51., ... Aufruf wird ein neuer Vektor mittels TT800_vector_update berechnet.
* - bei jedem Aufruf wird eine Zufallszahl aus dem Vektor ausgelesen und noch einem Tempering unterzogen.
*/
uint32_t
TT800 ()
{
static uint32_t vektor [N]; /* Zustandsvektor */
static int idx = N+1; /* Auslese-Index; idx >= N: neuer Vektor muss berechnet werden, */
/* idx=N+1: Vektor muss überhaupt erst mal initialisiert werden */
uint32_t e;
if (idx >= N)
{
if (idx > N)
TT800_vector_init (vektor, N);
TT800_vector_update (vektor);
idx = 0;
}
e = vektor [idx++];
e ^= (e << 7) & 0x2b5b2500; /* Tempering */
e ^= (e << 15) & 0xdb8b0000;
e ^= (e >> 16);
return e;
}
#undef N
#undef M
See also
literature
- Makoto Matsumoto, Takuji Nishimura: Mersenne twister. A 623-dimensionally equidistributed uniform pseudorandom number generator . In: ACM Transactions on Modeling and Computer Simulation. 8, 1998, ISSN 1049-3301 , pp. 3-30.
Web links
- Web presence of the Mersenne Twister (English)
Individual evidence
- ↑ iro.umontreal.ca (PDF; 301 kB)