Dirichlet distribution

from Wikipedia, the free encyclopedia
Examples of a Dirichlet distribution with K = 3 for different parameter vectors α. Clockwise from top left: α = (6, 2, 2), (3, 7, 5), (6, 2, 6), (2, 3, 4).

The Dirichlet distribution (named after Peter Gustav Lejeune Dirichlet) is a family of continuous multivariate probability distributions.

It is the multivariate generalization of the beta distribution and the conjugate prior of the multinomial distribution in Bayesian statistics. Its density function gives the probability density of a particular set of probabilities for K distinct, mutually exclusive events, given that event i has been observed $\alpha_i - 1$ times.
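Conjugacy means that a Dirichlet prior combined with multinomial counts again gives a Dirichlet distribution, with each observed count added to the corresponding parameter. The following is a minimal sketch of that update; the function names `posterior_alpha` and `dirichlet_mean` and the example numbers are assumptions made for illustration, not part of any library.

```python
def posterior_alpha(prior_alpha, counts):
    """Return the Dirichlet posterior parameters after multinomial counts."""
    if len(prior_alpha) != len(counts):
        raise ValueError("prior_alpha and counts must have the same length")
    # Conjugacy: each observed count is simply added to its parameter.
    return [a + n for a, n in zip(prior_alpha, counts)]


def dirichlet_mean(alpha):
    """Mean of a Dirichlet distribution: alpha_i / sum(alpha)."""
    total = sum(alpha)
    return [a / total for a in alpha]


# Hypothetical example: a uniform prior over three events, then 5 observations.
prior = [1, 1, 1]
counts = [3, 0, 2]
posterior = posterior_alpha(prior, counts)
print(posterior)                  # [4, 1, 3]
print(dirichlet_mean(posterior))  # [0.5, 0.125, 0.375]
```

Starting from the uniform prior α = (1, 1, 1), the posterior parameters are the counts plus one, which matches the reading of each parameter as one more than the number of observations of that event.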

Illustration

The multinomial distribution gives the probabilities of K different events, e.g. how likely it is to roll a one, two, three, four, five or six in a single roll of a die. The Dirichlet distribution, in contrast, indicates how likely such a probability distribution itself is to occur. In a dice factory, for example, the Dirichlet distribution could describe how probable the various outcome distributions are among the manufactured dice.

If the machines in the factory work correctly, the probability of anything other than the uniform distribution (all numbers equally likely) would be very low. This corresponds to a parameter vector α with equal and very large elements, for example α = (1000, 1000, 1000, 1000, 1000, 1000). A vector such as α = (2000, 1000, 1000, 1000, 1000, 1000), on the other hand, would mean that the machines produce dice on which the number one comes up twice as often as any other number, and do so almost without exception, since the values are again very large and the variance is therefore low.

If, however, the values in α are all small, e.g. all 0.1, the factory would produce dice with a strong tendency towards one number. Which number a given die prefers would be random, since all values in α are equal. The smaller the values, the more pronounced the unfairness of most dice, and the rarer dice without a preferred number.
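The dice factory can be simulated with a short sketch. It assumes the standard construction of a Dirichlet sample from independent gamma variates divided by their sum; the helper `sample_dirichlet`, the factory labels and the parameter values are illustrative choices, not taken from the text above.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one probability vector from Dirichlet(alpha).

    Uses the standard construction: draw independent Gamma(alpha_i, 1)
    variates and divide each by their sum.
    """
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical factories; the parameter values are illustrative only.
factories = {
    "fair, tight tolerance": [1000.0] * 6,
    "face one favoured": [2000.0] + [1000.0] * 5,
    "erratic": [0.1] * 6,
}

for name, alpha in factories.items():
    die = sample_dirichlet(alpha, rng)  # face probabilities of one die
    print(name, [round(p, 3) for p in die])
```

Each sampled vector plays the role of one manufactured die. With large equal parameters the six face probabilities should all lie close to 1/6, whereas with parameters well below 1 most of the probability mass typically falls on a single, randomly chosen face.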

Density function

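The Dirichlet distribution of order K ≥ 2 with parameters $\alpha_1, \dots, \alpha_K > 0$ has the following density function:

$$f(x_1, \dots, x_K; \alpha_1, \dots, \alpha_K) = \frac{1}{\mathrm{B}(\alpha)} \prod_{i=1}^{K} x_i^{\alpha_i - 1}$$

for all $x_1, \dots, x_K > 0$ with $x_1 + \dots + x_K = 1$, where $\mathrm{B}(\alpha)$ denotes the normalizing constant. The constraint ensures that the probabilities $x_1, \dots, x_K$ sum to 1.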

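The normalizing constant $\mathrm{B}(\alpha)$, with $\alpha = (\alpha_1, \dots, \alpha_K)$, is the multivariate beta function, which can be expressed in terms of gamma functions:

$$\mathrm{B}(\alpha) = \frac{\prod_{i=1}^{K} \Gamma(\alpha_i)}{\Gamma\left(\sum_{i=1}^{K} \alpha_i\right)}$$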
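As a rough numerical check, the density can be evaluated with the standard library alone. The sketch below assumes the usual form of the density, a product of the terms x_i^(α_i − 1) divided by the multivariate beta function, and works with logarithms of gamma functions to avoid overflow for large parameters; the function names are hypothetical.

```python
import math


def log_multivariate_beta(alpha):
    """log B(alpha) = sum(log Gamma(alpha_i)) - log Gamma(sum(alpha))."""
    return sum(math.lgamma(a) for a in alpha) - math.lgamma(sum(alpha))


def dirichlet_pdf(x, alpha):
    """Density of Dirichlet(alpha) at a point x on the probability simplex."""
    if len(x) != len(alpha):
        raise ValueError("x and alpha must have the same length")
    if any(xi <= 0 for xi in x) or abs(sum(x) - 1.0) > 1e-9:
        raise ValueError("x must be strictly positive and sum to 1")
    # log of prod(x_i ** (alpha_i - 1)), computed term by term
    log_kernel = sum((a - 1) * math.log(xi) for a, xi in zip(alpha, x))
    return math.exp(log_kernel - log_multivariate_beta(alpha))


# K = 2 reduces to the beta distribution: Beta(2, 2) at 0.5.
print(dirichlet_pdf([0.5, 0.5], [2, 2]))
# alpha = (1, 1, 1) is the uniform distribution on the simplex.
print(dirichlet_pdf([0.2, 0.3, 0.5], [1, 1, 1]))
```

For α = (2, 2) the normalizing constant is Γ(2)Γ(2)/Γ(4) = 1/6, so the first value should be close to 6 · 0.5 · 0.5 = 1.5. For α = (1, 1, 1) the density is constant and equal to Γ(3) = 2 everywhere on the simplex.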
