Pareto distribution

from Wikipedia, the free encyclopedia
The frequency distribution of the population figures of German cities (histogram in yellow) is well described by a Pareto distribution (blue).

The Pareto distribution, named after the Italian economist Vilfredo Pareto (1848–1923), is a continuous probability distribution on an interval $[x_{\min}, \infty)$ that is unbounded to the right. It is scale invariant and follows a power law. For small exponents it belongs to the heavy-tailed distributions.

The distribution was initially used to describe the distribution of income in Italy. Pareto distributions typically arise when random positive values extend over several orders of magnitude and result from the influence of many independent factors. Distributions with similar properties are the Zipf distribution and Benford's law.

History of the concept

In the second volume of his Cours d'économie politique (1897), Vilfredo Pareto explains that the number of people within a country whose income exceeds a threshold value $x$ is approximately proportional to $x^{-k}$, with the parameter $k$ being about 1.5 across countries. Up to scaling, this statement defines the probability distribution named after Pareto (via the cumulative distribution function). Numerous other empirical distributions are also well described by Pareto distributions, for example city sizes or claim sizes in actuarial mathematics.

Definition

Pareto probability density $f(x)$ for $x_{\min} = 1$.
Cumulative distribution function $F(x)$.

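A continuous random variable $X$ is called Pareto distributed, $X \sim \operatorname{Par}(k, x_{\min})$, with the parameters $k > 0$ and $x_{\min} > 0$ if it has the probability density

$$f(x) = \begin{cases} \dfrac{k\,x_{\min}^{k}}{x^{k+1}} & x \ge x_{\min} \\ 0 & x < x_{\min} \end{cases}$$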

Here $x_{\min}$ is a parameter that describes the minimum value of the distribution. It is also the mode of the distribution, i.e. the point at which the probability density is maximal. The greater the distance between $x$ and $x_{\min}$, the less likely $X$ is to take values near $x$. The distance between the two values is measured as the quotient $x / x_{\min}$, i.e. the ratio of the two quantities.

$k$ is a parameter that describes the size ratio of the random values depending on their frequency. The quotient $x_{\min} / x$ is raised to the power $k$. For larger $k$ the curve falls off much more steeply, that is, the random variable $X$ takes on large values with lower probability.

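The probability that the random variable $X$ takes a value less than or equal to $x$ is given by the distribution function $F(x)$ for all $x \ge x_{\min}$:

$$F(x) = P(X \le x) = 1 - \left(\frac{x_{\min}}{x}\right)^{k}$$

The probability that the random variable takes values greater than $x$ is accordingly:

$$P(X > x) = 1 - F(x) = \left(\frac{x_{\min}}{x}\right)^{k}$$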

The distribution therefore belongs to the heavy-tailed distributions.
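The density $f(x) = k\,x_{\min}^{k} / x^{k+1}$, the distribution function $1 - (x_{\min}/x)^{k}$ and the tail probability $(x_{\min}/x)^{k}$ can be sketched in a few lines of Python. This is only an illustration: the function names `pareto_pdf`, `pareto_cdf` and `pareto_sf` are hypothetical and not taken from any library, and the parameter values are chosen arbitrarily.

```python
def pareto_pdf(x, k, x_min):
    """Density f(x) = k * x_min**k / x**(k + 1) for x >= x_min, else 0."""
    if x < x_min:
        return 0.0
    return k * x_min**k / x ** (k + 1)


def pareto_cdf(x, k, x_min):
    """Distribution function F(x) = 1 - (x_min / x)**k for x >= x_min, else 0."""
    if x < x_min:
        return 0.0
    return 1.0 - (x_min / x) ** k


def pareto_sf(x, k, x_min):
    """Tail probability P(X > x) = (x_min / x)**k for x >= x_min, else 1."""
    if x < x_min:
        return 1.0
    return (x_min / x) ** k


# Illustrative parameters (assumed, not from the article's data).
k, x_min = 1.5, 1.0
for x in (1.0, 2.0, 10.0):
    print(x, pareto_pdf(x, k, x_min), pareto_cdf(x, k, x_min), pareto_sf(x, k, x_min))
```

The density is largest at `x = x_min`, where it equals `k / x_min`, which matches the statement that $x_{\min}$ is the mode.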

Properties

Expected value

The expected value is:

$$\operatorname{E}(X) = \begin{cases} x_{\min}\,\dfrac{k}{k-1} & k > 1 \\ \infty & k \le 1 \end{cases}$$

Quantiles

Median

The median is

$$x_{\text{med}} = x_{\min}\,\sqrt[k]{2}$$

Checking the Pareto principle

The fourth quintile $Q_4$, which is the quantity of interest for the Pareto principle, is obtained analogously:

$$Q_4 = x_{\min}\,\sqrt[k]{5}$$

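For $k > 1$, the expected value restricted to values greater than the 4th quintile, $\operatorname{E}(X|_{>Q_4}) = \int_{Q_4}^{\infty} x\,f(x)\,dx$, satisfies the equation

$$\operatorname{E}(X|_{>Q_4}) = 5^{-\frac{k-1}{k}}\,\operatorname{E}(X) = \frac{\operatorname{E}(X)}{\sqrt[k]{5^{\,k-1}}}$$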

For the value $k = 1.5$ considered typical by Pareto, this restricted expected value amounts to $1/\sqrt[3]{5}$, i.e. about 58%, of the total expected value. If the incomes of a population followed a Pareto distribution with the parameter 1.5, the 20% with the highest incomes would earn only 58% of the total income, not 80% as the Pareto principle suggests. The 80%-to-20% rule holds instead for $k = \log_4 5 \approx 1.16$.
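The arithmetic behind this check can be reproduced with a short sketch. It assumes the quantile function $x_{\min}(1-p)^{-1/k}$ obtained by inverting the distribution function, and the share $q^{(k-1)/k}$ of the total expected value that falls to the top fraction $q$ (for $q = 0.2$ this is $5^{-(k-1)/k}$). The names `pareto_quantile` and `top_share` are hypothetical helpers introduced only for illustration.

```python
import math


def pareto_quantile(p, k, x_min):
    """Value x with F(x) = p, i.e. x_min * (1 - p)**(-1/k), for 0 <= p < 1."""
    return x_min * (1.0 - p) ** (-1.0 / k)


def top_share(q, k):
    """Share of the total expected value held by the top fraction q (requires k > 1)."""
    if k <= 1:
        raise ValueError("the expected value is infinite for k <= 1")
    return q ** ((k - 1.0) / k)


x_min = 1.0
print(pareto_quantile(0.5, 1.5, x_min))        # median, x_min * 2**(1/k)
print(pareto_quantile(0.8, 1.5, x_min))        # fourth quintile, x_min * 5**(1/k)
print(top_share(0.2, 1.5))                     # about 0.58 for Pareto's k = 1.5
print(top_share(0.2, math.log(5) / math.log(4)))  # 80:20 rule for k = log_4(5)
```

Solving `top_share(0.2, k) == 0.8` for `k` gives $k = \ln 5 / \ln 4$, which is where the value of about 1.16 comes from.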

Variance

The variance is

$$\operatorname{Var}(X) = x_{\min}^{2}\,\frac{k}{(k-1)^{2}(k-2)} \quad \text{for } k > 2$$

Standard deviation

From the variance, the standard deviation is obtained as

$$\sigma = \frac{x_{\min}}{k-1}\,\sqrt{\frac{k}{k-2}} \quad \text{for } k > 2$$

Coefficient of variation

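The coefficient of variation follows immediately from the expected value and the standard deviation:

$$\operatorname{VarK}(X) = \frac{\sigma}{\operatorname{E}(X)} = \frac{1}{\sqrt{k(k-2)}} \quad \text{for } k > 2$$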

Skewness

For the skewness one obtains, for $k > 3$,

$$\operatorname{v}(X) = \frac{2(1+k)}{k-3}\,\sqrt{\frac{k-2}{k}}$$

For $k > 3$ the Pareto distribution is skewed to the right according to the definition based on the 3rd central moment. For $k \le 3$ the 3rd moment diverges, even though the distribution looks like a typical right-skewed distribution. For $k > 1$ the median is always smaller than the expected value, so the distribution is skewed to the right in the sense of Pearson's definition; for $k > 0$ the quantile coefficients are positive, i.e. the distribution is skewed to the right in the sense of the definition based on quantiles.

Moments

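In general, the $n$-th moment is

$$\operatorname{E}(X^{n}) = x_{\min}^{n}\,\frac{k}{k-n} \quad \text{for } n < k$$

and it is infinite for $n \ge k$.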
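As a consistency check, the variance and skewness can be rebuilt from the raw moments $x_{\min}^{n}\,k/(k-n)$ and compared with the closed forms $x_{\min}^{2}\,k / ((k-1)^{2}(k-2))$ and $\frac{2(1+k)}{k-3}\sqrt{(k-2)/k}$. The following is a minimal sketch; `raw_moment` is a hypothetical helper, and the parameter values are arbitrary (with $k > 3$ so that all three moments exist).

```python
import math


def raw_moment(n, k, x_min):
    """n-th raw moment x_min**n * k / (k - n); infinite when n >= k."""
    if n >= k:
        return math.inf
    return x_min**n * k / (k - n)


# Arbitrary illustrative parameters with k > 3.
k, x_min = 4.5, 2.0
m1, m2, m3 = (raw_moment(n, k, x_min) for n in (1, 2, 3))

# Central moments expressed through raw moments.
variance = m2 - m1**2
skewness = (m3 - 3 * m1 * m2 + 2 * m1**3) / variance**1.5

# Closed-form expressions for comparison.
variance_closed = x_min**2 * k / ((k - 1) ** 2 * (k - 2))
skewness_closed = 2 * (1 + k) / (k - 3) * math.sqrt((k - 2) / k)

print(variance, variance_closed)
print(skewness, skewness_closed)
```

Both pairs of printed values agree up to floating-point rounding.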

Characteristic function

The characteristic function is:

$$\varphi_X(t) = k\,(-i\,x_{\min}\,t)^{k}\,\Gamma(-k,\,-i\,x_{\min}\,t)$$

Here $\Gamma(a, x)$ is the (upper) incomplete gamma function.

Moment generating function

The moment-generating function of the Pareto distribution cannot be given in closed form.

Entropy

The entropy is given by:

$$H(X) = \ln\left(\frac{x_{\min}}{k}\right) + \frac{1}{k} + 1$$
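The closed form $\ln(x_{\min}/k) + 1/k + 1$ (in nats) can be checked against a Monte Carlo estimate of $-\operatorname{E}[\ln f(X)]$. The sketch below assumes the density $k\,x_{\min}^{k}/x^{k+1}$ and draws samples by inverting the distribution function; the function names, seed, sample size and parameters are arbitrary choices for illustration.

```python
import math
import random


def pareto_entropy(k, x_min):
    """Differential entropy in nats: ln(x_min / k) + 1/k + 1."""
    return math.log(x_min / k) + 1.0 / k + 1.0


def pareto_log_pdf(x, k, x_min):
    """Logarithm of the density k * x_min**k / x**(k + 1), valid for x >= x_min."""
    return math.log(k) + k * math.log(x_min) - (k + 1.0) * math.log(x)


rng = random.Random(1)  # fixed seed so the run is reproducible
k, x_min = 1.5, 2.0
n = 200_000

# Inverse transform sampling: x = x_min * (1 - u)**(-1/k) with u uniform on [0, 1).
samples = (x_min * (1.0 - rng.random()) ** (-1.0 / k) for _ in range(n))
mc_entropy = -sum(pareto_log_pdf(x, k, x_min) for x in samples) / n

print(mc_entropy, pareto_entropy(k, x_min))
```

The two printed numbers should agree to about two decimal places; the remaining difference is sampling noise.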

Zipf's law

Zipf's law is mathematically identical to the Pareto distribution (the $x$- and $y$-axes are swapped). While the Pareto distribution considers the probability of certain random values, Zipf's law focuses on the probability with which random values occupy a certain position in the ranking by frequency.

Relationship to other distributions

Relationship to the exponential distribution

If $X$ is a Pareto-distributed random variable with the parameters $k$ and $x_{\min}$, then $\ln(X / x_{\min})$ is exponentially distributed with the parameter $k$.
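Read in reverse, this relationship gives a simple way to generate Pareto samples: draw an exponential variate $E$ with rate $k$ and set $X = x_{\min}\,e^{E}$. The sketch below uses Python's standard `random.expovariate`; the function name `sample_pareto_via_exponential`, the seed and the sample size are assumptions made for illustration.

```python
import math
import random


def sample_pareto_via_exponential(n, k, x_min, seed=0):
    """Draw n Pareto(k, x_min) values as x_min * exp(E) with E ~ Exp(rate k)."""
    rng = random.Random(seed)
    return [x_min * math.exp(rng.expovariate(k)) for _ in range(n)]


k, x_min = 1.5, 1.0
xs = sample_pareto_via_exponential(100_000, k, x_min)

# ln(X / x_min) should have mean 1/k, the mean of an Exp(k) variable.
mean_log = sum(math.log(x / x_min) for x in xs) / len(xs)

# The empirical median should be close to x_min * 2**(1/k).
empirical_median = sorted(xs)[len(xs) // 2]

print(mean_log, 1.0 / k)
print(empirical_median, x_min * 2 ** (1.0 / k))
```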

Relation to the shifted Pareto distribution

If $X$ is a Pareto-distributed random variable, then $X - x_{\min}$ follows a shifted Pareto distribution.

Measures of inequality and the Pareto principle

Lorenz curve of the number of small cities and their population. The smallest 80% of cities together account for only 38% of the total population. The Theil index is 0.8329315.

Since the probability density of the Pareto distribution has a single maximum at the smallest value $x_{\min}$, Pareto-distributed quantities show the phenomenon of unequal distribution known from the Pareto principle (also known as the 80:20 rule): small values are quite common, whereas large values are very rare. How strong this effect is depends on the parameter $k$.

In the city example (see illustration in the introduction), a few large cities contribute disproportionately to the total population, while a very large number of small cities have only a few inhabitants.

Various inequality measures exist to quantify this phenomenon. For the calculation of inequality measures, distributions of the form "$a$ to $b$" describe two quantiles, where the width of the first quantile equals the height of the second and the height of the first equals the width of the second. An example of this way of representing distributions is the often-cited "80-20 principle": 80% of a group holds 20% of the group's resources, and the remaining 20% of the group holds 80% of the resources.

In the Lorenz curve, this appears as one "standing" and one "lying" quantile. $a$ and $b$ must each lie in the range from 0 to 1, and $a + b = 1$. The Gini coefficient $G$ and the Hoover index $H$ coincide in this case:

$$G = H = |2a - 1| = |2b - 1| = |a - b|$$

For an 80:20 distribution, this gives a Gini coefficient and a Hoover index of 0.6, i.e. 60%.

For these two-quantile distributions, the Theil index $T$ (an entropy measure) is also easy to calculate. With $H = |a - b|$:

$$T = 2H\,\operatorname{artanh}(H) = (a - b)\,\ln\frac{a}{b}$$

The Pareto principle can serve as a mnemonic for the range of values of the Theil index. The index is 0 for an equal distribution of 0.5:0.5 (50% to 50%) and reaches the value 1 at about 0.82:0.18 (82% to 18%), which is very close to the 80%-to-20% distribution. Beyond 82% to 18%, the Theil index is greater than 1.
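The values quoted for these two-quantile distributions are easy to reproduce. The sketch below assumes the symmetric "$a$ to $b$" setup with $a + b = 1$, the Hoover index $|a - b|$ (equal to the Gini coefficient in this case) and the Theil index $(a - b)\ln(a/b)$; the function names are hypothetical.

```python
import math


def hoover_two_quantile(a):
    """Hoover index (equal to the Gini coefficient here) for an a : (1 - a) split."""
    if not 0.0 < a < 1.0:
        raise ValueError("a must lie strictly between 0 and 1")
    b = 1.0 - a
    return abs(a - b)


def theil_two_quantile(a):
    """Theil index (a - b) * ln(a / b) for an a : (1 - a) split."""
    if not 0.0 < a < 1.0:
        raise ValueError("a must lie strictly between 0 and 1")
    b = 1.0 - a
    return (a - b) * math.log(a / b)


for a in (0.5, 0.8, 0.824):
    print(a, hoover_two_quantile(a), theil_two_quantile(a))
```

For `a = 0.8` the Hoover index is 0.6, for `a = 0.5` both measures are 0, and the Theil index crosses 1 between `a = 0.82` and `a = 0.83`.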

Detecting Pareto distributions

Distribution of the population of German cities and municipalities

Whether a distribution is a Pareto distribution can be assessed graphically using log-log (double-logarithmic) plots of the distribution.

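The probability density of the Pareto distribution can be written as a power law:

$$f(x) = k\,x_{\min}^{k}\,x^{-(k+1)}$$

The tail probability $1 - F(x)$ can also be brought into this form:

$$1 - F(x) = x_{\min}^{k}\,x^{-k}$$

The (simple) logarithmic graph of such a power law is

$$\ln\bigl(1 - F(x)\bigr) = k \ln x_{\min} - k \ln x$$

After also taking the logarithm of the $x$-axis with $\xi = \ln x$ (i.e. the value actually plotted is $\xi$, but the axis is often labeled directly with the $x$-values) one obtains

$$\ln\bigl(1 - F\bigr) = k \ln x_{\min} - k\,\xi$$

which is a straight line with slope $-k$.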

Double logarithmic representation of the distribution

The diagram on the right shows the city example in double-logarithmic form. The graph is indeed straight over large parts, with a slope of $-k$, from which the parameter $k$ can be read off.

Hence the exponent of the density function is $-(k+1)$, in good agreement with the literature.

The tail probability $1 - F(x)$ was used for the plot because it is a cumulative measure, obtained by summing (in theory: integrating) many individual values, so that the scatter of individual values matters less. With a histogram, by contrast, many values can only be summed by reducing the number of intervals, which would make the representation of the distribution unrealistically coarse.
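The graphical procedure can be imitated numerically: sort the data, form the empirical tail fraction, take logarithms on both axes and fit a straight line, whose negative slope is an estimate of $k$. The sketch below does this with an ordinary least-squares fit on synthetic data. It is an assumption-laden illustration, not the method used for the city data: the sample is generated artificially with known parameters, the function name `estimate_k_loglog` is hypothetical, and a plain least-squares fit on a log-log plot is only a rough estimator.

```python
import math
import random


def estimate_k_loglog(sample):
    """Estimate k as the negative slope of ln(1 - F) against ln(x)."""
    xs = sorted(sample)
    n = len(xs)
    # xi = ln(x); eta = ln of the fraction of observations >= xs[i] (no ties assumed).
    xi = [math.log(x) for x in xs]
    eta = [math.log((n - i) / n) for i in range(n)]
    mean_xi = sum(xi) / n
    mean_eta = sum(eta) / n
    sxy = sum((a - mean_xi) * (b - mean_eta) for a, b in zip(xi, eta))
    sxx = sum((a - mean_xi) ** 2 for a in xi)
    slope = sxy / sxx
    return -slope


# Synthetic data with known parameters (assumed values, not the city data).
rng = random.Random(42)
true_k, x_min = 1.3, 1000.0
sample = [x_min * (1.0 - rng.random()) ** (-1.0 / true_k) for _ in range(20_000)]

k_hat = estimate_k_loglog(sample)
print(k_hat)  # should lie close to true_k
```

Using the empirical tail fraction rather than a histogram mirrors the reasoning above: each plotted point aggregates many observations, so the fitted line is less sensitive to the scatter of individual values.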

Literature

  • Rainer Schlittgen: Introduction to Statistics. Analysis and modeling of data. 10th revised edition. Oldenbourg Wissenschaftsverlag, Munich et al. 2003, ISBN 3-486-27446-5, p. 231.
  • Karl Mosler, Friedrich Schmid: Probability calculus and inferential statistics. 2nd, improved edition. Springer, Berlin et al. 2006, ISBN 3-540-27787-0, p. 99.
  • Vilfredo Pareto: Cours d'Économie Politique. 2 volumes. Rouge, Lausanne 1896-1897.

References

  1. Frederik M. Dekking, Cornelis Kraaikamp, Hendrik P. Lopuhaä, Ludolf E. Meester: A modern introduction to probability and statistics. Understanding why and how. Springer, London 2005, ISBN 1-85233-896-2, p. 63.
  2. 17.6,82.4 On-Line-Calculator: Unequal distribution , accessed on July 29, 2018.