Empirical distribution (probability distribution)

from Wikipedia, the free encyclopedia

The empirical distribution is a special probability distribution in stochastics, a branch of mathematics. It is a discrete probability distribution and establishes a connection between descriptive statistics and probability theory: the expected value of the empirical distribution is the arithmetic mean of the underlying sample, and its distribution function is the empirical distribution function.

Definition

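Let $x = (x_1, \dots, x_n) \in \mathbb{R}^n$ be a vector. Let $\delta_{x_i}$ denote the Dirac measure at $x_i$, which is given by

$$\delta_{x_i}(A) = \begin{cases} 1 & \text{if } x_i \in A, \\ 0 & \text{otherwise.} \end{cases}$$

Then the probability distribution $P_x$ on the real numbers given by

$$P_x(A) := \frac{1}{n} \sum_{i=1}^{n} \delta_{x_i}(A)$$

is called the empirical distribution of $x$ on the real numbers. Thus

$$P_x(A) = \frac{|\{ i : x_i \in A \}|}{n}.$$

Here the cardinality $|M|$ of a set $M$ denotes the number of its elements, and $\{ i : x_i \in A \}$ contains the indices of the components of the vector $x$ that lie in $A$. Intuitively, one first counts how many components of $x$ lie in the set $A$. That number, divided by the total number $n$ of components, is then the probability of the set $A$.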
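The counting rule described above can be sketched in Python. This is only an illustration under stated assumptions: the function name `empirical_probability` and the idea of passing the set as a membership predicate `in_set` are hypothetical choices made for this example, not part of the cited sources.

```python
from fractions import Fraction


def empirical_probability(sample, in_set):
    """Probability of a set under the empirical distribution of `sample`.

    `in_set` is a predicate standing in for the set (hypothetical interface):
    in_set(v) is True exactly when the value v lies in the set.
    """
    n = len(sample)
    if n == 0:
        raise ValueError("sample must be non-empty")
    # Count the indices i whose component lies in the set.
    hits = sum(1 for component in sample if in_set(component))
    # Divide by the total number of components.
    return Fraction(hits, n)


x = [1.0, 2.0, 2.0, 5.0]
# The half-open interval [2, 5) contains both components equal to 2.0.
p = empirical_probability(x, lambda v: 2.0 <= v < 5.0)
print(p)  # 1/2
```

Exact fractions are used instead of floats so that the result is literally "count divided by the number of components".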

The empirical distribution can also be defined on more general base spaces $\Omega$; in that case $x \in \Omega^n$. The remainder of this article covers the case $\Omega = \mathbb{R}$.

Probability function

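If all components of $x$ are distinct, that is, $x_i \neq x_j$ for $i \neq j$, then the probability function of the empirical distribution is that of a discrete uniform distribution on $\{x_1, \dots, x_n\}$ and is given by

$$f(y) = \begin{cases} \frac{1}{n} & \text{if } y \in \{x_1, \dots, x_n\}, \\ 0 & \text{otherwise.} \end{cases}$$

If a component occurs $k$ times, the value of the probability function there is accordingly $\frac{k}{n}$.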
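As a small sketch of the probability function, the following Python snippet assigns to each distinct value its number of occurrences divided by the sample size. The helper name `empirical_pmf` is an assumption made for this illustration.

```python
from collections import Counter
from fractions import Fraction


def empirical_pmf(sample):
    """Map each distinct value to (number of occurrences) / (sample size)."""
    n = len(sample)
    counts = Counter(sample)
    return {value: Fraction(k, n) for value, k in counts.items()}


pmf = empirical_pmf([3, 1, 3, 7])
# The value 3 occurs twice among four components, so it gets weight 1/2;
# the values 1 and 7 occur once each and get weight 1/4.
for value in sorted(pmf):
    print(value, pmf[value])
```

Values that do not occur in the sample are simply absent from the dictionary, which corresponds to probability zero.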

Distribution function

The distribution function of the empirical distribution is the empirical distribution function and is thus given by

$$F_x(t) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}_{(-\infty, t]}(x_i).$$

Here $\mathbf{1}_A$ denotes the indicator function of the set $A$.
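The empirical distribution function at a point is the fraction of sample components that are less than or equal to that point. A minimal Python sketch, assuming a hypothetical helper `make_ecdf` that sorts the sample once and then answers queries by binary search:

```python
from bisect import bisect_right
from fractions import Fraction


def make_ecdf(sample):
    """Return a function t -> fraction of components that are <= t."""
    sorted_sample = sorted(sample)
    n = len(sorted_sample)
    if n == 0:
        raise ValueError("sample must be non-empty")

    def ecdf(t):
        # bisect_right returns the number of sorted entries that are <= t.
        return Fraction(bisect_right(sorted_sample, t), n)

    return ecdf


F = make_ecdf([3, 1, 3, 7])
values = [F(0), F(1), F(3), F(6.5), F(7)]
print(values)
```

The resulting function is a right-continuous step function that jumps at each sample value by the weight that value carries.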

Properties

Let $X$ be a random variable distributed according to the empirical distribution $P_x$ of $x$. Then the probabilistic characteristics of $X$, such as the expected value and the quantiles, are exactly the corresponding descriptive-statistics characteristics of the sample $x$, such as the arithmetic mean and the empirical quantiles.

Expected value

The expected value of the empirical distribution is the arithmetic mean (see weighted arithmetic mean as the expected value), i.e.

$$\operatorname{E}(X) = \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i.$$

Variance

The variance of the empirical distribution is the (uncorrected) empirical variance, i.e.

$$\operatorname{Var}(X) = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2.$$

Here $\bar{x}$ denotes the arithmetic mean, which equals the expected value.
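The statements about expected value and variance can be checked numerically. The sketch below computes both from the probability weights of the empirical distribution and compares them with the descriptive statistics from Python's standard `statistics` module; the sample values are arbitrary illustration data.

```python
from collections import Counter
from fractions import Fraction
import statistics

sample = [3, 1, 3, 7]  # arbitrary illustration data
n = len(sample)

# Weights of the empirical distribution: occurrences divided by sample size.
pmf = {v: Fraction(k, n) for v, k in Counter(sample).items()}

# Expected value and variance computed from the distribution itself.
expected_value = sum(v * p for v, p in pmf.items())
variance = sum((v - expected_value) ** 2 * p for v, p in pmf.items())

print(expected_value)  # 7/2
print(variance)        # 19/4

# Descriptive statistics of the sample. pvariance divides by n (uncorrected),
# whereas statistics.variance would divide by n - 1.
print(expected_value == statistics.mean(sample))
print(variance == statistics.pvariance(sample))
```

Note that the corrected sample variance, which divides by one less than the sample size, does not coincide with the variance of the empirical distribution.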

Median and quantile

The median (in the sense of probability theory) of the empirical distribution corresponds to the median of the sample, and the quantiles of the empirical distribution likewise correspond to the empirical quantiles.
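For the median this correspondence can be illustrated as follows. The sketch assumes the usual probability-theoretic condition that a median is any value for which the probability of being at most that value and the probability of being at least that value are both at least one half; the helper name `is_median` is hypothetical. Only the median is shown, since conventions for empirical quantiles differ between sources.

```python
from fractions import Fraction
import statistics


def is_median(sample, m):
    """Check the median condition under the empirical distribution:
    P(X <= m) >= 1/2 and P(X >= m) >= 1/2."""
    n = len(sample)
    at_most = Fraction(sum(1 for v in sample if v <= m), n)
    at_least = Fraction(sum(1 for v in sample if v >= m), n)
    half = Fraction(1, 2)
    return at_most >= half and at_least >= half


sample = [3, 1, 4, 7]  # arbitrary illustration data
m = statistics.median(sample)  # mean of the two middle values 3 and 4

print(m)                     # 3.5
print(is_median(sample, m))  # True
print(is_median(sample, 5))  # False
```

For an even number of components the median in the probabilistic sense is not unique: in this example every value between 3 and 4 satisfies the condition, and the sample median picks the midpoint.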

Mode

The mode (in the sense of probability theory) of the empirical distribution corresponds to the mode of the sample.
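A brief Python sketch of this correspondence: the modes of the empirical distribution are the values with the largest probability weight, which are exactly the most frequent sample values. The function name `distribution_modes` is an assumption for this example; `statistics.multimode` is part of the standard library from Python 3.8 on.

```python
from collections import Counter
import statistics


def distribution_modes(sample):
    """Values with maximal weight under the empirical distribution."""
    counts = Counter(sample)
    max_count = max(counts.values())
    # Maximal count is equivalent to maximal probability count / n.
    return sorted(v for v, k in counts.items() if k == max_count)


sample = [3, 1, 3, 7]  # arbitrary illustration data
modes_from_distribution = distribution_modes(sample)
modes_from_sample = sorted(statistics.multimode(sample))

print(modes_from_distribution)  # [3]
print(modes_from_distribution == modes_from_sample)  # True
```

If several values share the highest frequency, both approaches return all of them.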

Further measures of dispersion

The following also applies:

References

  1. Hans-Otto Georgii: Stochastics. Introduction to probability theory and statistics. 4th edition. Walter de Gruyter, Berlin 2009, ISBN 978-3-11-021526-7, p. 116, doi:10.1515/9783110215274.
  2. Achim Klenke: Probability Theory. 3rd edition. Springer-Verlag, Berlin Heidelberg 2013, ISBN 978-3-642-36017-6, p. 237, doi:10.1007/978-3-642-36018-3.