Inversion method

from Wikipedia, the free encyclopedia
Inversion method.

The inversion method is a simulation method to generate other probability distributions from uniformly distributed random numbers .

intention

The generation of these random numbers is simulated by means of artificially created implementations of a statistical random variable . One can generate random numbers of various discrete and continuous probability distributions . These sequences of random numbers are used, for example, to investigate the behavior of complex systems that can only be analyzed with difficulty using mathematical functions . Examples would be queuing theory or the distribution of certain sample functions , such as the non-central beta distribution .

The basis for generating the random number of a certain distribution is usually a rectangular distribution or a uniform distribution in the interval . The generation of these random numbers is described in more detail in the article Random Number Generator . Assuming that the distribution function also has an image area between zero and one, one now randomly selects a number in this interval that is interpreted as the value of the selected probability distribution and thus the associated value of the random variable itself, its so-called quantile , as the one ultimately searched for Finds a random number (the quantile function supplying this number is nothing more than the inverse function of the corresponding distribution function ).

Simulation lemma

formulation

Definition of .

The inversion method is based on the simulation lemma , a lemma that states that one can generate a random variable with a different distribution function from a uniformly distributed random variable .

Let be a distribution function and a probability (i.e. a number from the interval ). The - quantile or the quantile function of the distribution function is defined as

.

If the set on the right side is empty and the infimum is not defined, you bet .

Now be a random variable uniformly distributed over the interval .

Then is a real random variable that satisfies the distribution function .

The quantile function is required to also cover the case of a non- injective distribution function, such as that of a discrete random variable. If the distribution function increases strictly monotonically , the usual inverse function of the distribution can be used.

proof

It is true that an inverse mapping is not closed in the strict sense of the word , but at least it applies

because of the right-hand continuity of . But that applies

.

The last step is correct because, according to the assumption, it is continuously evenly distributed and therefore applies to all .

Conclusion

Many distribution functions can be generated from uniformly distributed random numbers using the simulation lemma. However, for too many distribution functions it is not possible in practice to determine the quantile function with reasonable effort.

Use with discrete distribution

The examples are based on random numbers from the interval .

Example of a distribution with two values

Random numbers F (x) determine x and thus the frequencies of P (1) and P (2).

Coffee pots are made in a porcelain factory. The handles and beaks of the jugs are glued on by hand before burning. Experience has shown that 25% of the parts are not properly attached. After firing, the jugs are checked one after the other.

If the jug is OK, a one is given: This is the case in 75% of all jugs.
If it is objectionable, the examiner gives a two: This occurs in 25% of all jugs.

We now want to simulate a sequence of jugs:

Let us define the above specification as the distribution of a random variable :

  • For all other values ​​of is .

You could now proceed as follows: A random number ( ) is generated in the interval . If it is between 0 and 0.75, the random number gets the value 1, otherwise the value 2. In this way we generate 75% ones and 25% twos. For example, the table below results in a sequence of (1; 2) random numbers. The graphic illustrates the assignment process based on the first value. The equal distribution produced one . Here is forgiven.

A: Nr. der Kanne
B: Gleichverteilte Zufallszahl	
C: Zustand der Kanne
A      B         C
1      0,39      1
2      0,34      1
3      0,41      1
4      0,93      2
5      0,05      1
6      0,44      1
7      0,95      2
8      0,43      1
9      0,07      1
10     0,77      2
11     0,02      1
12     0,93      2
13     0,68      1
14     0,26      1
15     0,94      2
16     0,88      2
17     0,23      1
18     0,91      2
19     0,51      1
20     0,69      1

Example of Poisson distributed random numbers

The random numbers are interpreted as F ( x ) and provide Poisson-distributed x values.

In order to optimize customer service, the number of customers who come to a specific bank counter within a minute is recorded. We know from experience that an average of 1.2 customers come to the counter every minute. The number of arriving customers should be simulated. A good approximation for the distribution is the Poisson distribution with the parameter customers / minute. Your values ​​in this case are:

x:    Zahl der Kunden in einer Minute
f(x): Wahrscheinlichkeit für genau     x Kunden
F(x): Wahrscheinlichkeit für höchstens x Kunden 
---
x      f(x)      F(x)
0      0,3       0,3
1      0,36      0,66
2      0,22      0,88
3      0,09      0,97
4      0,03      0,99
5      ca.0      ca. 1

As above, we use the distribution for the simulation again.

A: Minuten-Index:
B: Gleichverteilte Zufallszahl: F(x)
C: Zahl der neu hinzu kommenden Kunden: 1,2 Kunden / Minute: x
D: Zahl der verbleibenden Kunden nach Bedienung von 1,5 Kunden / Minute (siehe unten)
E: Wie D, gerundet auf ganze Zahlen (siehe unten)
----
A	B	C	D	E
1	0,63	1	0	0
2	0,55	1	0	0
3	0,21	0	0	0
4	0,93	3	1,5	2
5	0,85	2	2	2
6	0,96	3	3,5	4
7	0,81	2	4	4
8	0,68	2	4,5	5
9	0,88	2	5	5
10	0,04	0	3,5	4
11	0,51	1	3	3
12	0,07	0	1,5	2
13	0,28	0	0	0
14	0,59	1	0	0
15	0,55	1	0	0
16	0,68	2	0,5	1
17	0,61	1	0	0
18	0,08	0	0	0
19	0,57	1	0	0
20	0,56	1	0	0
21	0,52	1	0	0
22	0,1 	0	0	0
23	0,27	0	0	0
24	0,17	0	0	0
25	0,72	2	0,5	1
26	0,06	0	0	0
27	0,55	1	0	0
28	0,92	3	1,5	2
29	0,72	2	2	2
30	0,03	0	0,5	1

In the first minute, the evenly distributed random number is between 0.3012 and 0.6626, as indicated in the graphic. Here is given the value 1.

In the second minute it is again between 0.3012 and 0.6626, receives the value 1 again, and so on.

The result is the sequence of incoming customers as in the table.

You could now use the simulation to investigate whether the line of customers can get very large, whether, for example, you should keep another counter open.

Excursus into the queuing theory

New customers coming downstairs, customers waiting upstairs.

The resulting queue of bank customers is examined in a very simplified system simulation: On average, as in the example above, 1.2 customers arrive per minute. An average of 1.5 customers per minute should be served. One could assume that there are no queues because, on average, more customers are served than arrive. A simulation with the Poisson distribution gives the following picture:

Lines of up to five people formed in just under an hour. The reason is that the processing capacity is not used in the periods when no customers are present (there are no negative customer numbers).

Use with constant distribution

For the distribution of a continuous random variable, a continuous, strictly monotonically increasing distribution curve results instead of a step function .

Example of an equal distribution

Uniformly distributed random numbers from the interval are used to simulate a random walk . It then applies to the continuous uniform distribution on the interval : .

Illustrative: The random numbers are reduced by 0.5, i.e. H. . The numbers are interpreted as step lengths that are set forwards or backwards depending on the sign .

Example of exponential distribution

Exponential distribution function

The distribution function as the exponential distribution is

As an inverse function we then get

The time between two requests to a certain Wikipedia server is exponentially distributed with the parameter . The times x between two requests to the server are to be simulated. The examples are based on random numbers from the interval .

The distribution function

is shown in the graphic on the right. The ordinate value corresponds to the random number , the corresponding abscissa value is the exponentially distributed random number sought.

Computationally simpler it is, the inverse function to work and take into account that the random numbers from the interval does not change if we by replacing:

Table-1 shows a sequence of exponentially distributed random numbers. The graph below shows these numbers as a time series.

Tabelle-1
lambda = 2 Zugriffe / Zeiteinheit
A: Index
B: exponentialverteilte Zufallszahl
C: B aufsummiert = Zeitpunkte des Eintreffens
---
A        B      C             
0	 0,00	0,00
1	 0,58	0,58	
2	 1,17	1,76	
3	 0,31	2,07	
4	 0,02	2,09		
5	 0,25	2,34		
6	 1,58	3,92	
7	 0,15	4,07	
8	 0,25	4,32			     
9	 0,06	4,38
10	 0,17	4,55
11	 0,21	4,76
12	 0,20	4,96
13	 0,70	5,66
14	 0,22	5,88
15	 0,17	6,05
16	 0,42	6,47
17	 0,00	6,47
18	 1,89	8,36
19	 0,40	8,76
20	 0,76	9,52
21	 0,37	9,88
22	 0,53	10,42
23	 0,54	10,96
24	 0,36	11,32
25	 0,41	11,73
26	 0,38	12,11
27	 1,31	13,42
28	 0,13	13,55
29	 0,05	13,61
30	 1,95	15,55
31	 1,60	17,15
32	 0,35	17,50
33	 0,81	18,31
34	 0,29	18,60
35	 0,32	18,92
36	 0,30	19,22
37	 1,43	20,65
38	 0,55	21,20
39	 0,01	21,21
40	 0,33	21,54
41	 0,48	22,03
42	 1,13	23,16
43	 0,97	24,13
44	 0,42	24,54
Mittelwert von B: 0,56 (berechnet: 1/lambda = 0,5)
Standardabweichung von B: 0,51 (berechnet: 0,5)
Temporal sequence of accesses to a server with an average of 2 accesses / time unit.

While the time sequence of the accesses is exponentially distributed, the number of accesses per time unit follows a Poisson distribution . For example, 4 intervals are observed within which no event occurs and 8 intervals with an event, see Table-2. For the two cases, the Poisson distribution makes the predictions of 3.1 and 6.2 - a good agreement in view of the small number of values.

Tabelle-2
A: Zahl der Ereignisse pro Intervall, basierend auf C in Tabelle-1
B: Gemessene Ereignisse 
C: Berechneter Wert für E nach Poisson-Verteilung
---
A	B	C
0	4	3,11
1	8	6,23
2	5	6,23
3	4	4,15
4	2	2,08
5	1	0,83
6	0	0,28

Further examples

Here are some examples of how a uniformly distributed random variable with given distributions or can be generated.

Density function Distribution function Quantile function Area

Problems

The quantile function used in the simulation lemma cannot always be determined. Then the inversion method cannot be used. In such cases, the solution is often the discard method .

Normal distribution

Since the inverse of the normal distribution cannot be determined directly, the simulation lemma remains a theoretical idea for it too.

Different approaches to the generation of normally distributed random numbers are summarized in the article Normal Distribution, section Generation of Normally Distributed Random Numbers .

When generating multivariate normally distributed random numbers, the generated stochastically independent random numbers still have to be correlated. This is achieved by multiplying the data matrix of the independent random numbers by S 1/2 . Here S 1/2 denotes the Cholesky decomposition of S and S the covariance matrix of the normal distribution to be simulated.

literature

  • Michael Kolonko: Stochastic Simulation: Fundamentals, Algorithms and Applications. Vieweg + Teubner, Wiesbaden 2008, ISBN 978-3-8351-0217-0 .