Mixed distribution
The term mixed distribution or composite distribution (in English more commonly: mixture distribution) comes from probability theory. It denotes the probability distribution of a mixture of random variables from several different populations.
Introductory example
Consider, for example, the characteristic of height in small children (first population) and adults (second population). Within each individual population this characteristic is usually approximately normally distributed, with the mean value for small children being significantly lower than for adults. The mixed distribution is then the distribution of height when the two populations are not considered individually but together, i.e. the distribution of the height of a person who is not known to be a small child or an adult.
Mathematically, in this example, the height $X_1$ of the toddlers is a random variable from one population and the height $X_2$ of the adults is a different random variable from the other population. The mixture of these two random variables is a further random variable $X$ that is drawn with probability $p_1$ from the first population and with probability $p_2$ from the second. Since only these two populations are available, $p_1 + p_2 = 1$ must hold. The probabilities $p_1$ and $p_2$ can also be interpreted as the relative proportions of the two populations in the combined population; in the example, as the proportions of small children and adults in the total sample. The distribution of $X$ follows from the law of total probability:

$$P(X \le x) = p_1 \, P(X_1 \le x) + p_2 \, P(X_2 \le x).$$

If $X_1$ and $X_2$ have distribution functions $F_1$ and $F_2$, the distribution function $F$ of $X$ is therefore

$$F(x) = p_1 F_1(x) + p_2 F_2(x).$$
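The two-stage description above (first choose a population, then draw a height from it) can be checked by simulation. The following is a minimal sketch, not part of the original example: the proportions, means and standard deviations are hypothetical values chosen only for illustration, and the helper names `mixture_cdf` and `sample_height` are made up here.

```python
import random
from statistics import NormalDist

# Hypothetical parameters, chosen only for illustration (heights in cm).
p1, p2 = 0.3, 0.7                          # proportions of toddlers and adults
toddlers = NormalDist(mu=90.0, sigma=5.0)  # assumed height distribution, population 1
adults = NormalDist(mu=170.0, sigma=10.0)  # assumed height distribution, population 2


def mixture_cdf(x):
    """Distribution function of the mixture: p1 * F1(x) + p2 * F2(x)."""
    return p1 * toddlers.cdf(x) + p2 * adults.cdf(x)


def sample_height(rng):
    """Choose a population with probability p1 or p2, then draw a height from it."""
    source = toddlers if rng.random() < p1 else adults
    return rng.gauss(source.mean, source.stdev)


rng = random.Random(42)
heights = [sample_height(rng) for _ in range(100_000)]

# Compare the empirical proportion of heights <= x with the mixture formula.
x = 150.0
empirical = sum(h <= x for h in heights) / len(heights)
print(f"empirical P(X <= {x}): {empirical:.4f}")
print(f"p1*F1(x) + p2*F2(x):   {mixture_cdf(x):.4f}")
```

With a sample of this size the two printed values should agree closely, which is what the law of total probability predicts.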
Definition
If the density function $f$ of a continuous random variable $X$ can be expressed as

$$f(x) = \sum_{i=1}^{n} p_i f_i(x),$$

we say that $X$ follows a mixed distribution. Here the $f_i$ are density functions of continuous random variables $X_i$, and the $p_i$ are probabilities with

$$\sum_{i=1}^{n} p_i = 1.$$

$f$ is therefore a convex combination of the densities $f_i$.
One can easily show that under these conditions $f$ is nonnegative and the normalization property

$$\int_{-\infty}^{\infty} f(x)\,dx = 1$$

is satisfied.
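The two claims can be spelled out in a short derivation. This is a sketch under the assumptions stated in the definition: $f = \sum_i p_i f_i$, each $f_i$ is a probability density, and the weights $p_i$ are nonnegative and sum to 1.

```latex
\begin{align*}
f(x) &= \sum_{i=1}^{n} \underbrace{p_i}_{\ge 0}\,\underbrace{f_i(x)}_{\ge 0} \;\ge\; 0, \\
\int_{-\infty}^{\infty} f(x)\,dx
  &= \sum_{i=1}^{n} p_i \int_{-\infty}^{\infty} f_i(x)\,dx
   = \sum_{i=1}^{n} p_i \cdot 1
   = 1 .
\end{align*}
```

Interchanging sum and integral is unproblematic here because the sum is finite.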
Accordingly, the probability function of a discrete mixed distribution results as

$$f(x) = \sum_{i=1}^{n} p_i f_i(x)$$

from the probability functions $f_i$ of discrete random variables $X_i$.
Properties
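For the moments of $X$, the following holds:

$$E(X^k) = \sum_{i=1}^{n} p_i \, E(X_i^k), \qquad k = 1, 2, \dots$$

This follows (in the continuous case) from

$$E(X^k) = \int_{-\infty}^{\infty} x^k f(x)\,dx = \int_{-\infty}^{\infty} x^k \sum_{i=1}^{n} p_i f_i(x)\,dx = \sum_{i=1}^{n} p_i \int_{-\infty}^{\infty} x^k f_i(x)\,dx = \sum_{i=1}^{n} p_i \, E(X_i^k).$$

A similar calculation gives the formula for the discrete case.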
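As a worked consequence, the moment formula $E(X^k) = \sum_i p_i E(X_i^k)$ with $k = 1$ and $k = 2$ gives the mean and variance of the mixture. The sketch below assumes that each component $X_i$ has a finite mean $\mu_i$ and variance $\sigma_i^2$; these symbols are introduced here for illustration.

```latex
\begin{align*}
\mu = E(X) &= \sum_{i=1}^{n} p_i \mu_i, \\
E(X^2) &= \sum_{i=1}^{n} p_i \left(\sigma_i^2 + \mu_i^2\right), \\
\operatorname{Var}(X) &= E(X^2) - \mu^2
  = \sum_{i=1}^{n} p_i \sigma_i^2 + \sum_{i=1}^{n} p_i \left(\mu_i - \mu\right)^2 .
\end{align*}
```

The variance of the mixture is thus the weighted average of the component variances plus a term that measures how far the component means are spread around the overall mean.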
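Common special case: Gaussian mixture models
A common special case of mixed distributions are the so-called Gaussian mixture models (GMM for short). Here the density functions $f_1, \dots, f_n$ are those of the normal distribution with potentially different mean values $\mu_1, \dots, \mu_n$ and standard deviations $\sigma_1, \dots, \sigma_n$ (or mean vectors and covariance matrices in the multidimensional case). Thus

$$X_i \sim \mathcal{N}(\mu_i, \sigma_i^2), \qquad i = 1, \dots, n,$$

and the density of the mixed distribution has the form

$$f(x) = \sum_{i=1}^{n} p_i \, \frac{1}{\sqrt{2\pi}\,\sigma_i} \exp\!\left(-\frac{(x - \mu_i)^2}{2\sigma_i^2}\right).$$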
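The density of a one-dimensional Gaussian mixture, i.e. the weighted sum of normal densities with weights $p_i$, means $\mu_i$ and standard deviations $\sigma_i$, is straightforward to evaluate. The following is a minimal sketch; the function name `gmm_pdf` and the three-component parameters are hypothetical and serve only as an illustration.

```python
import math


def gmm_pdf(x, weights, means, sds):
    """Density of a univariate Gaussian mixture: sum_i p_i * N(x; mu_i, sigma_i^2)."""
    if min(weights) < 0 or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be nonnegative and sum to 1")
    total = 0.0
    for p, mu, sigma in zip(weights, means, sds):
        z = (x - mu) / sigma
        total += p * math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))
    return total


# Hypothetical three-component mixture, chosen only for illustration.
weights = [0.2, 0.5, 0.3]
means = [-2.0, 0.0, 3.0]
sds = [0.5, 1.0, 1.5]

# Crude Riemann sum over [-15, 15] as a sanity check of the normalization.
step = 0.01
area = sum(gmm_pdf(-15.0 + i * step, weights, means, sds) for i in range(3001)) * step
print(f"approximate integral of the density: {area:.6f}")
```

The integral should come out very close to 1, since the weights sum to 1 and each component density integrates to 1.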
Parameter estimation
Estimators for the parameters of probability distributions are often derived using the maximum likelihood method. For mixed distributions, however, this usually leads to equations that have no closed-form solution and must therefore be solved numerically. A typical method for this is the expectation-maximization algorithm (EM algorithm), which starts from initial values for the parameters and generates a sequence of estimates whose likelihood does not decrease from one step to the next and which in many cases approaches the true parameters.
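To make the idea concrete, the following is a minimal sketch of the EM iteration for a one-dimensional Gaussian mixture. It is an illustration under simplifying assumptions, not the procedure of any particular software package: the function name `em_gmm_1d`, the synthetic data and the initial values are hypothetical, the number of iterations is fixed, and there are no safeguards against collapsing variances or numerical underflow.

```python
import math
import random
import statistics


def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def em_gmm_1d(data, weights, means, sds, n_iter=50):
    """Run n_iter EM steps for a univariate Gaussian mixture.

    weights, means and sds are initial guesses. Returns the updated
    parameters and the log-likelihood recorded at the start of each step.
    """
    weights, means, sds = list(weights), list(means), list(sds)
    k, n = len(weights), len(data)
    loglik_history = []
    for _ in range(n_iter):
        # E-step: posterior probability that observation x came from component i.
        resp = []
        loglik = 0.0
        for x in data:
            joint = [weights[i] * normal_pdf(x, means[i], sds[i]) for i in range(k)]
            total = sum(joint)
            loglik += math.log(total)
            resp.append([j / total for j in joint])
        loglik_history.append(loglik)
        # M-step: weighted maximum likelihood updates of the parameters.
        for i in range(k):
            n_i = sum(r[i] for r in resp)
            weights[i] = n_i / n
            means[i] = sum(r[i] * x for r, x in zip(resp, data)) / n_i
            var_i = sum(r[i] * (x - means[i]) ** 2 for r, x in zip(resp, data)) / n_i
            sds[i] = math.sqrt(var_i)
    return weights, means, sds, loglik_history


# Synthetic data from a known two-component mixture (illustrative values).
rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) if rng.random() < 0.3 else rng.gauss(10.0, 1.0)
        for _ in range(2000)]

# Simple initial values: extreme observations as means, overall spread as sds.
spread = statistics.pstdev(data)
est_weights, est_means, est_sds, history = em_gmm_1d(
    data, [0.5, 0.5], [min(data), max(data)], [spread, spread]
)
print("weights:", est_weights)
print("means:  ", est_means)
print("sds:    ", est_sds)
```

On well-separated data like this, the estimates should end up near the generating values (weights 0.3 and 0.7, means 0 and 10, standard deviations 1). With strongly overlapping components or poor initial values, the iteration may instead settle at a different local maximum of the likelihood.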
Example
A trout farmer sells trout in bulk. In autumn, when the ponds are emptied, an inventory is taken and the trout that have been fished out are weighed. The resulting distribution of the weights is shown in the graphic. Its two peaks indicate a mixed distribution, and it turns out that the trout came from two different ponds. The trout weights from the first pond are normally distributed with expected value 400 g and variance 4900 g², and those from the second pond with expected value 600 g and variance 8100 g². 40% of the trout come from the first pond and 60% from the second. The result is the density function (see figure)

$$f(x) = 0.4 \cdot \frac{1}{\sqrt{2\pi \cdot 4900}} \exp\!\left(-\frac{(x - 400)^2}{2 \cdot 4900}\right) + 0.6 \cdot \frac{1}{\sqrt{2\pi \cdot 8100}} \exp\!\left(-\frac{(x - 600)^2}{2 \cdot 8100}\right).$$
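The figures of the trout example can be checked numerically. The sketch below uses only the values stated above (weights 0.4 and 0.6, expected values 400 g and 600 g, variances 4900 g² and 8100 g²). The integer weight grid from 100 g to 1000 g and the helper name `density` are assumptions made for illustration.

```python
import math

weights = [0.4, 0.6]          # share of trout from pond 1 and pond 2
means = [400.0, 600.0]        # expected weights in g
variances = [4900.0, 8100.0]  # variances in g^2


def density(x):
    """Mixture density: weighted sum of the two normal densities."""
    return sum(
        p * math.exp(-(x - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)
        for p, mu, var in zip(weights, means, variances)
    )


# Mean and variance of the mixture from its first two moments:
# E(X) = sum p_i * mu_i,  E(X^2) = sum p_i * (var_i + mu_i^2).
mean = sum(p * mu for p, mu in zip(weights, means))
second_moment = sum(p * (var + mu ** 2) for p, mu, var in zip(weights, means, variances))
variance = second_moment - mean ** 2

# Locate the peaks of the density on an integer grid of weights.
grid = list(range(100, 1001))
values = [density(x) for x in grid]
peaks = [
    grid[i]
    for i in range(1, len(grid) - 1)
    if values[i - 1] < values[i] > values[i + 1]
]

print("mean weight (g):      ", mean)
print("variance (g^2):       ", variance)
print("peak locations (g):   ", peaks)
```

By hand, the mean is 0.4 · 400 + 0.6 · 600 = 520 g and the variance is 0.4 · (4900 + 400²) + 0.6 · (8100 + 600²) − 520² = 16420 g². The grid search should report two peaks, one somewhat above 400 g and one slightly below 600 g, in line with the two-peaked shape described in the example.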
References
- Fraley, Ch., Raftery, A.: 'MCLUST; Version 3 for R: Normal Mixture Modeling and Model-Based Clustering' (archived from the original on September 24, 2015, in the Internet Archive).