Galton-Watson process

from Wikipedia, the free encyclopedia

The Galton-Watson process, named after the British naturalist Francis Galton (1822–1911) and his compatriot, the mathematician Henry William Watson (1827–1903), is a special stochastic process used to model mathematically the numerical development of a single-sex population of self-replicating individuals. It is sometimes referred to as the Bienaymé-Galton-Watson process, in honor of the Frenchman Irénée-Jules Bienaymé (1796–1878), who had worked on the same problem considerably earlier.

History

Figure: 50 independent Galton-Watson processes, each with a starting value of 20 and Poisson-distributed offspring numbers with parameter 0.95. At t = 41, all but 6 populations are extinct.

In Victorian England, the aristocracy was increasingly concerned that aristocratic families repeatedly died out for lack of male descendants, and that more and more traditional names were disappearing from aristocratic society. Galton, who was not a mathematician himself, published the question about the probability of such extinction in the journal Educational Times in 1873 and promptly received an answer from Watson. The following year they published their joint work On the probability of extinction of families, in which they presented a stochastic concept that is now known as the Galton-Watson process. The result they came to was that if the population remained constant, all but one of the names would die out over time. Apparently this work was written without knowledge of Bienaymé's results.

At first, the problem of dying surnames remained the only one to which the Galton-Watson concept was applied. But biologists soon began to use it to model the spread of living organisms. Today the process is used in a wide variety of areas, from queuing theory to the spread of computer viruses and chain letters.

Mathematical modeling

Figure: The same experiment with Poisson parameter 1 (instead of 0.95). This time 24 of the 50 populations survive up to t = 50.

The Galton-Watson process is characterized by the following model assumptions:

  • Each individual lives exactly one time step.
  • The $k$-th individual in the $n$-th time step leaves behind a random number $X_{n,k}$ of offspring, independently of all other individuals.
  • All $X_{n,k}$ are independent and identically distributed with a distribution $p$ that takes values only in $\mathbb{N}_0$.
  • The population starts with one individual.

The last assumption is plausible because, due to the independence of reproduction, starting with $k$ individuals is equivalent to $k$ processes running in parallel, each with one individual as the starting population.
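The model assumptions above can be sketched as a short simulation. This is a minimal illustration, not a reference implementation: the function name `simulate_galton_watson` and the input format (a list whose entry `k` is the probability of exactly `k` offspring) are assumptions made here, and the offspring distribution is taken to have finite support.

```python
import random


def simulate_galton_watson(offspring_probs, generations, rng=None):
    """Return the population sizes of one simulated run, generation 0 first.

    offspring_probs[k] is the probability that one individual leaves
    exactly k offspring (a hypothetical input format chosen for this sketch).
    """
    rng = rng or random.Random()
    counts = range(len(offspring_probs))
    sizes = [1]  # the population starts with one individual
    for _ in range(generations):
        current = sizes[-1]
        if current == 0:
            # an extinct population stays extinct
            sizes.append(0)
            continue
        # every individual reproduces independently with the same distribution
        offspring = rng.choices(counts, weights=offspring_probs, k=current)
        sizes.append(sum(offspring))
    return sizes


if __name__ == "__main__":
    # one run with offspring numbers 0, 1 or 2 (mean 1), seeded for repeatability
    print(simulate_galton_watson([0.25, 0.5, 0.25], 10, random.Random(1)))
```

Passing a seeded `random.Random` instance makes a run repeatable. Starting with several individuals could be imitated by running several independent copies and adding the sizes, in line with the remark on the last assumption.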

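Now let $Z_n$ be the number of living individuals at time $n$ (in the original model, the number of male descendants). Then

$$Z_0 = 1$$

and

$$Z_{n+1} = \sum_{k=1}^{Z_n} X_{n,k}.$$

Because individuals reproduce independently, it follows that

$$P(Z_{n+1} = j \mid Z_n = i_n, \dots, Z_0 = i_0) = P(Z_{n+1} = j \mid Z_n = i_n).$$

If there were exactly $i$ individuals in the $n$-th generation, then the distribution of $Z_{n+1}$ is uniquely determined by

$$P(Z_{n+1} = j \mid Z_n = i) = p^{*i}(j).$$

Here $p^{*i}$ is the $i$-fold convolution of the distribution $p$. This follows directly from the summation of the independent random variables.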

Thus the Galton-Watson process is a time-homogeneous Markov chain in discrete time with a countable state space. The (countably infinite) transition matrix is given by

$$p_{ij} = p^{*i}(j).$$

That is, the probability of obtaining $j$ individuals if there were $i$ individuals before is given by the $i$-fold convolution of the offspring distribution, evaluated at $j$.
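A row of the transition matrix can be computed by repeated convolution. The following is a minimal sketch, assuming the offspring distribution has finite support and is given as a list of probabilities; the helper names `convolve` and `transition_row` are hypothetical.

```python
def convolve(a, b):
    """Convolution of two finite distributions given as probability lists."""
    result = [0.0] * (len(a) + len(b) - 1)
    for i, pa in enumerate(a):
        for j, pb in enumerate(b):
            result[i + j] += pa * pb
    return result


def transition_row(offspring_probs, i):
    """Distribution of the next generation size, given i individuals now.

    Entry j of the result is the i-fold convolution of the offspring
    distribution evaluated at j.
    """
    row = [1.0]  # 0-fold convolution: point mass at 0, so state 0 is absorbing
    for _ in range(i):
        row = convolve(row, offspring_probs)
    return row


if __name__ == "__main__":
    # two individuals, each with 0, 1 or 2 offspring
    print(transition_row([0.25, 0.5, 0.25], 2))
```

For two individuals with offspring probabilities 1/4, 1/2, 1/4, the row is the distribution of the sum of two independent copies, with probabilities 1/16, 4/16, 6/16, 4/16, 1/16 for 0 to 4 offspring in total.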

The probability of extinction

The question Galton and Watson were interested in was the probability that a population becomes extinct. The probability that no individual is alive in the $n$-th generation is

$$P(Z_n = 0).$$

Since 0 is an absorbing state (the transition probability satisfies $p_{00} = 1$), i.e. it can never be left once it has been entered, the following always holds: if $Z_n = 0$, then $Z_{n+1} = 0$. It follows directly that the probability of being in state 0 is monotonically increasing: $P(Z_n = 0) \le P(Z_{n+1} = 0)$. Hence the probability of extinction is

$$q = \lim_{n \to \infty} P(Z_n = 0).$$

The probability of extinction is calculated using the probability-generating function $m_X(t) = E(t^X)$ of $X$, where $X$ denotes a random variable with the offspring distribution. We have $m_{Z_1}(t) = m_X(t)$, and it then follows inductively, using the fact that the generating function of a sum with a random number of summands is a composition of generating functions, that

$$m_{Z_n}(t) = m_X^{\circ n}(t),$$

where $m_X^{\circ n}$ denotes the $n$-fold composition (consecutive application) of the function $m_X$ with itself. Since $P(Z_n = 0) = m_{Z_n}(0)$, it follows that the extinction probability $q$ is the smallest non-negative fixed point of the probability-generating function of $X$, i.e. the smallest non-negative solution of the equation

$$m_X(t) = t.$$
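Because the probability of being extinct by generation $n$ is the $n$-fold composition of the generating function evaluated at 0, the extinction probability can be approximated by iterating the generating function starting from 0. The following is a minimal numerical sketch, assuming an offspring distribution with finite support given as a list; the names `pgf` and `extinction_probability` and the tolerance parameters are illustrative choices.

```python
def pgf(offspring_probs, t):
    """Probability-generating function: sum over k of p_k * t**k."""
    return sum(p * t**k for k, p in enumerate(offspring_probs))


def extinction_probability(offspring_probs, tol=1e-12, max_iter=100_000):
    """Approximate the smallest non-negative fixed point of the pgf.

    Iterates t -> pgf(t) starting from t = 0; the iterates are the
    probabilities of being extinct by generation 1, 2, 3, ...
    """
    t = 0.0
    for _ in range(max_iter):
        nxt = pgf(offspring_probs, t)
        if abs(nxt - t) < tol:
            return nxt
        t = nxt
    return t  # not converged within max_iter (convergence is slow near mean 1)


if __name__ == "__main__":
    # offspring numbers 0, 1, 2 with probabilities 1/4, 1/4, 1/2 (mean 1.25)
    print(extinction_probability([0.25, 0.25, 0.5]))
```

For this example distribution the fixed-point equation reduces to $2t^2 - 3t + 1 = 0$ with solutions $1/2$ and $1$, so the iteration should approach $1/2$. When the expected number of offspring is exactly 1, the iteration converges only slowly and the returned value is a lower approximation of 1.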

The following then applies:

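  • If $E(X) \le 1$, then $q = 1$, so the population will almost surely die out.
  • If $E(X) > 1$, then the probability of extinction is strictly less than 1, i.e. $q < 1$ (and $q > 0$ whenever $P(X = 0) > 0$).

Excluded from these considerations is the case in which each individual produces exactly one offspring: $P(X = 1) = 1$. Then $Z_n = 1$ for all $n$, so the population never dies out although $E(X) = 1$; the state 1 is then a trivial absorbing state.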

Example

Assume that each individual, independently of all other individuals, has a number of offspring that is geometrically distributed with parameter $\tfrac{1}{2}$, i.e. has the probability function

$$P(X = k) = \left(\tfrac{1}{2}\right)^{k+1}$$

for all $k \in \mathbb{N}_0$. Then

$$m_X(t) = \sum_{k=0}^{\infty} \left(\tfrac{1}{2}\right)^{k+1} t^k = \frac{1}{2 - t}.$$

It can be shown by induction that

$$m_X^{\circ n}(t) = \frac{n - (n-1)t}{(n+1) - nt}$$

and therefore

$$q = \lim_{n \to \infty} m_X^{\circ n}(0) = \lim_{n \to \infty} \frac{n}{n+1} = 1$$

holds, so the population will almost surely die out. The procedure used here is the exception; in most cases no closed formula can be given for the repeated composition. The usual procedure is to calculate the expected value of $X$ and then, if necessary, determine the fixed point. Since the expected value here is already $E(X) = 1$, the calculation of the fixed point can be dispensed with.
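The closed form for the repeated composition can be checked with exact rational arithmetic. The sketch below assumes the geometric parameter is 1/2 (the value consistent with an expected value of 1), so that the generating function is $1/(2 - t)$ and the claimed $n$-fold composition is $(n - (n-1)t) / ((n+1) - nt)$. The function names are hypothetical, and a finite check of this kind illustrates the formula but does not replace the induction proof.

```python
from fractions import Fraction


def geometric_pgf(t):
    """Generating function 1 / (2 - t) of P(X = k) = (1/2)**(k + 1)."""
    return 1 / (2 - t)


def closed_form(n, t):
    """Claimed n-fold composition: (n - (n - 1) t) / ((n + 1) - n t)."""
    return (n - (n - 1) * t) / ((n + 1) - n * t)


def iterate_pgf(n, t):
    """Apply geometric_pgf n times to t."""
    for _ in range(n):
        t = geometric_pgf(t)
    return t


if __name__ == "__main__":
    start = Fraction(0)
    for n in range(1, 6):
        # at t = 0 both values are the probability of extinction by generation n
        print(n, iterate_pgf(n, start), closed_form(n, start))
```

At $t = 0$ both expressions reduce to $n/(n+1)$, the probability that the population is extinct by generation $n$, which tends to 1.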

Literature