Normal distribution

Normal distribution
Density function	Density functions of the normal distribution $\mathcal{N}(\mu, \sigma^2)$: $\mathcal{N}(0;\, 0.2)$ (blue), $\mathcal{N}(0;\, 1)$ (red), $\mathcal{N}(0;\, 5)$ (yellow) and $\mathcal{N}(-2;\, 0.5)$ (green)
Distribution function	Distribution functions of the normal distributions $\mathcal{N}(0;\, 0.2)$ (blue), $\mathcal{N}(0;\, 1)$ (red), $\mathcal{N}(0;\, 5)$ (yellow) and $\mathcal{N}(-2;\, 0.5)$ (green)
Parameters	$\mu \in \mathbb{R}$: expected value (location parameter); $\sigma^2 > 0$: variance (scale parameter)
Support	$\mathcal{T}_X = \mathbb{R}$
Density function	$\frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$
Distribution function	$\frac{1}{2}\left(1 + \operatorname{erf}\left(\frac{x-\mu}{\sqrt{2\sigma^2}}\right)\right)$, with the error function $\operatorname{erf}(x)$
Expected value	$\mu$
Median	$\mu$
Mode	$\mu$
Variance	$\sigma^2$
Skewness	$0$
Kurtosis	$3$
Entropy	$\frac{1}{2}\ln(2\pi e\,\sigma^2)$
Moment generating function	$\exp\left(\mu t + \tfrac{1}{2}\sigma^2 t^2\right)$
Characteristic function	$\exp\left(i\mu t - \tfrac{1}{2}\sigma^2 t^2\right)$
Fisher information	$\begin{pmatrix} 1/\sigma^2 & 0 \\ 0 & 1/(2\sigma^4) \end{pmatrix}$

The normal or Gaussian distribution (named after Carl Friedrich Gauss) is an important type of continuous probability distribution in stochastics. Its probability density function is also called the Gaussian function, Gaussian normal distribution, Gaussian distribution curve, Gaussian curve, Gaussian bell curve, Gaussian bell function, Gaussian bell or simply bell curve.

The special importance of the normal distribution rests, among other things, on the central limit theorem, according to which distributions that result from the additive superposition of a large number of independent influences are approximately normally distributed under weak conditions. The family of normal distributions forms a location-scale family.

The deviations of the measured values of many natural, economic and engineering processes from the expected value can be described by the normal distribution (in biological processes often by the log-normal distribution), either exactly or at least to a very good approximation. This holds especially for processes in which several factors act independently of one another in different directions.

Random variables with normal distribution are used to describe random processes such as:

random scattering of measured values,
random deviations from the nominal size when manufacturing workpieces,
Brownian molecular motion.

In actuarial mathematics, the normal distribution is suitable for modeling claims data in the range of medium claim amounts.

In metrology, a normal distribution is often assumed to describe the spread of the measurement errors. What matters here is how many measured points lie within a given spread.

The standard deviation $\sigma$ describes the width of the normal distribution. The full width at half maximum of a normal distribution is approximately $2.4$ times (exactly $2\sqrt{2\ln 2}$ times) the standard deviation. The following holds approximately:

Within a deviation of $\pm\sigma$ from the expected value lie 68.27% of all measured values,
Within a deviation of $\pm 2\sigma$ from the expected value lie 95.45% of all measured values,
Within a deviation of $\pm 3\sigma$ from the expected value lie 99.73% of all measured values.

Conversely, the maximum deviation from the expected value can be stated for a given probability:

50% of all measured values deviate from the expected value by at most $0.675\sigma$,
90% of all measured values deviate from the expected value by at most $1.645\sigma$,
95% of all measured values deviate from the expected value by at most $1.960\sigma$,
99% of all measured values deviate from the expected value by at most $2.576\sigma$.

In addition to the expected value, which can be interpreted as the center of gravity of the distribution, the standard deviation can also be assigned a simple meaning with regard to the magnitude of the probabilities or frequencies that occur.
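The percentages and the width factor quoted above can be checked numerically. The following is a minimal Python sketch; the names `coverage` and `FWHM_FACTOR` are illustrative and not part of the article. It uses the identity $P(|X-\mu| \le k\sigma) = \operatorname{erf}(k/\sqrt{2})$, which follows from the error-function form of the distribution function.

```python
import math


def coverage(k: float) -> float:
    """Probability that a normal variable lies within k standard deviations of its mean."""
    # P(|X - mu| <= k * sigma) = 2 * Phi(k) - 1 = erf(k / sqrt(2))
    return math.erf(k / math.sqrt(2.0))


# Ratio of the full width at half maximum to the standard deviation: 2 * sqrt(2 * ln 2)
FWHM_FACTOR = 2.0 * math.sqrt(2.0 * math.log(2.0))

for k in (1, 2, 3):
    print(f"within +/- {k} sigma: {100 * coverage(k):.2f} %")
print(f"FWHM / sigma = {FWHM_FACTOR:.4f}")
```

The result does not depend on $\mu$ or $\sigma$, which is why a single table of the standard normal distribution is sufficient.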

History

Gaussian bell curve on a German ten-mark note from the 1990s

In 1733, Abraham de Moivre presented in his work The Doctrine of Chances, in connection with his work on the limit theorem for binomial distributions, an estimate of the binomial coefficient that can be interpreted as a precursor of the normal distribution. The calculation of the non-elementary integral

$$\int_{-\infty}^{\infty} e^{-\frac{1}{2} t^2}\, \mathrm{d}t = \sqrt{2\pi},$$

which is needed to normalize the normal density to a probability density, was achieved by Pierre-Simon Laplace in 1782 (according to other sources, by Poisson). In 1809, Gauss published his work Theoria motus corporum coelestium in sectionibus conicis solem ambientium (Theory of the motion of the heavenly bodies moving about the sun in conic sections), which defines the normal distribution in addition to the method of least squares and maximum likelihood estimation. It was also Laplace who in 1810 proved the central limit theorem, which is the basis of the theoretical importance of the normal distribution, and who completed de Moivre's work on the limit theorem for binomial distributions. Finally, in 1844, Adolphe Quetelet recognized an astonishing agreement with the normal distribution in investigations of the chest circumference of several thousand soldiers and brought the normal distribution into applied statistics. He probably coined the term "normal distribution".

Definition

A continuous random variable $X$ has a (Gaussian or) normal distribution with expectation $\mu$ and variance $\sigma^2$ ($-\infty < \mu < \infty$, $\sigma^2 > 0$), often written as $X \sim \mathcal{N}(\mu, \sigma^2)$, if $X$ has the following probability density:

$$f(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \quad -\infty < x < \infty.$$

The graph of this density function is bell-shaped and symmetric about the parameter $\mu$, which is at the same time the expected value, the median and the mode of the distribution. The variance of $X$ is the parameter $\sigma^2$. Furthermore, the probability density has inflection points at $x = \mu \pm \sigma$.

The probability density of a normally distributed random variable has no antiderivative in closed form, so probabilities have to be calculated numerically. They can be computed with the help of a standard normal distribution table, which uses a standard form. To see this, one uses the fact that a linear function of a normally distributed random variable is itself normally distributed. Concretely, if $X \sim \mathcal{N}(\mu, \sigma^2)$ and $Y = aX + b$, where $a$ and $b$ are constants with $a \neq 0$, then $Y \sim \mathcal{N}(a\mu + b, a^2\sigma^2)$. This yields the random variable

$$Z = \frac{1}{\sigma}(X - \mu) \sim \mathcal{N}(0, 1),$$

which is also called the standard normally distributed random variable $Z$. The standard normal distribution is thus the normal distribution with parameters $\mu = 0$ and $\sigma^2 = 1$. The density function of the standard normal distribution is given by

$$\varphi(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2} x^2}, \quad -\infty < x < \infty.$$

Its graph is shown in the adjacent figure.

Density function $\varphi(x) = \tfrac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2} x^2}$ of a standard normally distributed random variable

The multivariate generalization can be found in the article Multidimensional Normal Distribution.
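The linear transformation rule and the standardization can be illustrated with Python's standard library class `statistics.NormalDist` (available from Python 3.8), which supports addition of and multiplication by constants. This is only a sketch; the parameter values $\mu = 10$, $\sigma = 2$, $a = 3$, $b = -4$ are made up for illustration.

```python
from statistics import NormalDist

# Hypothetical example: X ~ N(10, 2^2)
X = NormalDist(mu=10.0, sigma=2.0)

# Linear transformation Y = a*X + b is distributed as N(a*mu + b, a^2 * sigma^2)
a, b = 3.0, -4.0
Y = X * a + b

# Standardization Z = (X - mu) / sigma is distributed as N(0, 1)
Z = (X - X.mean) / X.stdev

# A probability for X can be read off from the standard normal distribution
x = 12.0
p_direct = X.cdf(x)
p_standardized = Z.cdf((x - X.mean) / X.stdev)

print(Y, Z, p_direct, p_standardized)
```

Both probabilities agree, which is exactly what makes a single table for $\mathcal{N}(0, 1)$ usable for every choice of $\mu$ and $\sigma$.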

Properties

Distribution function

The distribution function of the normal distribution is given by

$$F(x) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{x} e^{-\frac{1}{2}\left(\frac{t-\mu}{\sigma}\right)^2}\, \mathrm{d}t.$$

If, by the substitution $t = \sigma z + \mu$, one introduces the new integration variable $z := \tfrac{t-\mu}{\sigma}$ in place of $t$, the result is

$$F(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{(x-\mu)/\sigma} e^{-\frac{1}{2} z^2}\, \mathrm{d}z = \Phi\left(\frac{x-\mu}{\sigma}\right).$$

Here $\Phi$ is the distribution function of the standard normal distribution,

$$\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-\frac{1}{2} t^2}\, \mathrm{d}t.$$

With the error function $\operatorname{erf}$, $\Phi$ can be represented as

$$\Phi(x) = \frac{1}{2}\left(1 + \operatorname{erf}\left(\frac{x}{\sqrt{2}}\right)\right).$$
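As a hedged illustration of the two representations of $\Phi$, the following Python sketch evaluates the error-function formula and compares it with a direct numerical integration of the density. The function names, the lower integration bound of $-10$ and the number of Simpson steps are assumptions chosen for this example.

```python
import math


def std_normal_pdf(z: float) -> float:
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def phi_erf(z: float) -> float:
    """Phi(z) via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def phi_simpson(z: float, lower: float = -10.0, steps: int = 2000) -> float:
    """Phi(z) via composite Simpson integration of the density on [lower, z].

    The mass below `lower` is neglected; `steps` must be even.
    """
    if z <= lower:
        return 0.0
    h = (z - lower) / steps
    total = std_normal_pdf(lower) + std_normal_pdf(z)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * std_normal_pdf(lower + i * h)
    return total * h / 3.0


def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """F(x) = Phi((x - mu) / sigma) for a general normal distribution."""
    return phi_erf((x - mu) / sigma)


print(phi_erf(1.0), phi_simpson(1.0), normal_cdf(12.0, 10.0, 2.0))
```

In practice the error-function form is preferable, since `math.erf` is accurate over the whole real line, while the integration sketch depends on the chosen cut-off and step count.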

Symmetry

The graph of the probability density $f\colon \mathbb{R} \to \mathbb{R}$ is a Gaussian bell curve whose height and width depend on $\sigma$. It is axially symmetric about the line $x = \mu$ and thus represents a probability distribution that is symmetric about its expected value. The graph of the distribution function $F$ is point-symmetric about the point $(\mu;\, 0.5)$. For $\mu = 0$ in particular, $\varphi(-x) = \varphi(x)$ and $\Phi(-x) = 1 - \Phi(x)$ hold for all $x \in \mathbb{R}$.

Maximum value and inflection points of the density function

With the help of the first and second derivatives, the maximum value and the inflection points can be determined. The first derivative is

$$f'(x) = -\frac{x-\mu}{\sigma^2}\, f(x).$$

The maximum of the density function of the normal distribution is therefore at $x_{\mathrm{max}} = \mu$, where it takes the value $f_{\mathrm{max}} = \tfrac{1}{\sigma\sqrt{2\pi}}$.

The second derivative is

$$f''(x) = \frac{1}{\sigma^2}\left(\frac{1}{\sigma^2}(x-\mu)^2 - 1\right) f(x).$$

Thus the inflection points of the density function are at $x = \mu \pm \sigma$. At the inflection points the density function has the value $\tfrac{1}{\sigma\sqrt{2\pi e}}$.
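The step from the first to the second derivative is not spelled out in the text. It can be reconstructed with the product rule, using only the density $f$ and the first derivative given in this section:

```latex
\begin{aligned}
f''(x) &= \frac{\mathrm{d}}{\mathrm{d}x}\left(-\frac{x-\mu}{\sigma^{2}}\, f(x)\right)
        = -\frac{1}{\sigma^{2}}\, f(x) - \frac{x-\mu}{\sigma^{2}}\, f'(x) \\
       &= -\frac{1}{\sigma^{2}}\, f(x) + \frac{(x-\mu)^{2}}{\sigma^{4}}\, f(x)
        = \frac{1}{\sigma^{2}}\left(\frac{(x-\mu)^{2}}{\sigma^{2}} - 1\right) f(x), \\
f''(x) = 0 \;&\Longleftrightarrow\; (x-\mu)^{2} = \sigma^{2} \;\Longleftrightarrow\; x = \mu \pm \sigma, \\
f(\mu \pm \sigma) &= \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}} = \frac{1}{\sigma\sqrt{2\pi e}}.
\end{aligned}
```

Since $f(x) > 0$ everywhere, the sign of $f''$ is determined by the bracket alone, so the curvature changes sign exactly at $\mu \pm \sigma$.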

Normalization

Density of a centered normal distribution $\delta_a(x) = \tfrac{1}{\sqrt{\pi}\, a} \cdot e^{-\frac{x^2}{a^2}}$. For $a \to 0$, the function becomes higher and narrower, but the area remains unchanged at 1.

It is important that the total area under the curve equals $1$, i.e. the probability of the certain event. It follows that if two Gaussian bell curves have the same $\mu$ but different $\sigma$, the curve with the larger $\sigma$ is wider and lower (since both areas have the value $1$ and only the standard deviation is larger). Two bell curves with the same $\sigma$ but different $\mu$ have congruent graphs that are shifted relative to one another parallel to the $x$-axis by the difference of the $\mu$ values.

Every normal distribution is indeed normalized, because with the linear substitution $z = \tfrac{x-\mu}{\sigma}$ we get

$$\int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}\, \mathrm{d}x = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{1}{2} z^2}\, \mathrm{d}z = 1.$$

For the normalization of the latter integral, see error integral.

Calculation

Since $\Phi(z)$ cannot be expressed by an elementary antiderivative, tables were usually used for the calculation in the past (see standard normal distribution table). Nowadays, functions are available in statistical programming languages such as R, which also handle the transformation to arbitrary $\mu$ and $\sigma$.

Expected value

The expected value of the standard normal distribution is $0$. For $X \sim \mathcal{N}(0, 1)$ we have

$$\operatorname{E}(X) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} x\, e^{-\frac{1}{2} x^2}\, \mathrm{d}x = 0,$$

since the integrand is integrable and point-symmetric (odd).

If now $Y \sim \mathcal{N}(\mu, \sigma^2)$, then $X = (Y - \mu)/\sigma$ has a standard normal distribution, and thus

$$\operatorname{E}(Y) = \operatorname{E}(\sigma X + \mu) = \sigma \underbrace{\operatorname{E}(X)}_{=0} + \mu = \mu.$$

Variance and other measures of dispersion

The variance of a $(\mu, \sigma^2)$-normally distributed random variable equals the parameter $\sigma^2$:

$$\operatorname{Var}(X) = \frac{1}{\sqrt{2\pi\sigma^2}} \int_{-\infty}^{\infty} (x-\mu)^2\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}\, \mathrm{d}x = \sigma^2.$$

An elementary proof is ascribed to Poisson.

The mean absolute deviation is $\sqrt{\frac{2}{\pi}}\, \sigma \approx 0.80\sigma$ and the interquartile range is $\approx 1.349\sigma$.

Standard deviation of the normal distribution

One-dimensional normal distributions are fully described by specifying the expected value $\mu$ and the variance $\sigma^2$. So if $X$ is a $\mu$-$\sigma^2$-distributed random variable, in symbols $X \sim \mathcal{N}(\mu, \sigma^2)$, its standard deviation is simply $\sigma_X = \sqrt{\sigma^2} = \sigma$.

Scatter intervals

Intervals around $\mu$ in the normal distribution

From the standard normal distribution table it can be seen that, for normally distributed random variables, approximately

68.3% of the realizations lie in the interval $\mu \pm \sigma$,
95.4% in the interval $\mu \pm 2\sigma$ and
99.7% in the interval $\mu \pm 3\sigma$.

Since in practice many random variables are approximately normally distributed, these values from the normal distribution are often used as a rule of thumb. For example, $\sigma$ is often taken to be half the width of the interval that contains the middle two thirds of the values in a sample (see quantile).

Normal distribution (a) and contaminated normal distribution (b)

However, this practice is not recommended, because it can lead to very large errors. For example, the distribution $P = 0.9 \cdot \mathcal{N}(\mu, \sigma^2) + 0.1 \cdot \mathcal{N}(\mu, (10\sigma)^2)$ can hardly be distinguished visually from the normal distribution (see picture), but 92.5% of its values lie in the interval $\mu \pm \overline{\sigma}$, where $\overline{\sigma}$ denotes the standard deviation of $P$. Such contaminated normal distributions are very common in practice; the example describes the situation in which ten precision machines produce something, but one of them is badly adjusted and produces with deviations ten times as large as those of the other nine.
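The 92.5% figure can be reproduced with a short calculation. The sketch below assumes that both mixture components share the mean $\mu$, so that the variance of the mixture is the weighted sum of the component variances; the function names are illustrative only.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

WEIGHTS = (0.9, 0.1)   # nine well-adjusted machines, one badly adjusted
SCALES = (1.0, 10.0)   # standard deviations in units of sigma


def mixture_std(sigma: float = 1.0) -> float:
    """Standard deviation of the mixture when both components have the same mean."""
    return math.sqrt(sum(w * (s * sigma) ** 2 for w, s in zip(WEIGHTS, SCALES)))


def mixture_coverage(half_width: float, sigma: float = 1.0) -> float:
    """Probability that the mixture lies within mu +/- half_width."""
    return sum(
        w * (2.0 * STD_NORMAL.cdf(half_width / (s * sigma)) - 1.0)
        for w, s in zip(WEIGHTS, SCALES)
    )


sigma_bar = mixture_std()
print(f"mixture standard deviation: {sigma_bar:.4f} sigma")
print(f"share within one mixture standard deviation: {100 * mixture_coverage(sigma_bar):.1f} %")
```

The mixture standard deviation is $\sqrt{10.9}\,\sigma \approx 3.3\sigma$, so the interval $\mu \pm \overline{\sigma}$ covers almost all of the narrow component, which explains why the share is far above the 68.3% of a true normal distribution.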

Values outside of two to three times the standard deviation are often treated as outliers . Outliers can be an indication of gross errors in data acquisition . However, the data can also be based on a highly skewed distribution. On the other hand, with a normal distribution, on average about every 20th measured value is outside of twice the standard deviation and about every 500th measured value is outside of three times the standard deviation.

Since the proportion of values outside of six times the standard deviation is vanishingly small at approx. 2 ppb, such an interval is a good measure for almost complete coverage of all values. This is used in quality management by the Six Sigma method, in which the process requirements stipulate tolerance limits of at least $6\sigma$. However, a long-term shift of the expected value by 1.5 standard deviations is assumed, so that the permissible proportion of defects increases to 3.4 ppm. This proportion corresponds to four and a half times the standard deviation ($4.5\,\sigma$). Another problem with the $6\sigma$ method is that the $6\sigma$ points cannot be determined in practice. If the distribution is unknown (i.e. if it is not absolutely certain to be a normal distribution), the extreme values of 1,400,000,000 measurements, for example, bound a 75% confidence interval for the $6\sigma$ points.

Dependence of the probability $p(z)$ (percent within) on the size of the scatter interval

Dependence of the scatter interval limit $z(p)$ on the included probability

Expected proportions of the values of a normally distributed random variable inside or outside the scatter intervals $\left(\mu - z\sigma,\ \mu + z\sigma\right)$
$z\sigma$	Percent within	Percent outside	ppb outside	Fraction outside
$0.674490\sigma$	50%	50%	500,000,000	1 / 2
$0.994458\sigma$	68%	32%	320,000,000	1 / 3.125
$1\sigma$	68.268 9492%	31.731 0508%	317,310,508	1 / 3.151 4872
$1.281552\sigma$	80%	20%	200,000,000	1 / 5
$1.644854\sigma$	90%	10%	100,000,000	1 / 10
$1.959964\sigma$	95%	5%	50,000,000	1 / 20
$2\sigma$	95.449 9736%	4.550 0264%	45,500,264	1 / 21.977 895
$2.354820\sigma$	98.146 8322%	1.853 1678%	18,531,678	1 / 54
$2.575829\sigma$	99%	1%	10,000,000	1 / 100
$3\sigma$	99.730 0204%	0.269 9796%	2,699,796	1 / 370.398
$3.290527\sigma$	99.9%	0.1%	1,000,000	1 / 1,000
$3.890592\sigma$	99.99%	0.01%	100,000	1 / 10,000
$4\sigma$	99.993 666%	0.006 334%	63,340	1 / 15,787
$4.417173\sigma$	99.999%	0.001%	10,000	1 / 100,000
$4.891638\sigma$	99.9999%	0.0001%	1,000	1 / 1,000,000
$5\sigma$	99.999 942 6697%	0.000 057 3303%	573.3303	1 / 1,744,278
$5.326724\sigma$	99.999 99%	0.000 01%	100	1 / 10,000,000
$5.730729\sigma$	99.999 999%	0.000 001%	10	1 / 100,000,000
$6\sigma$	99.999 999 8027%	0.000 000 1973%	1.973	1 / 506,797,346
$6.109410\sigma$	99.999 9999%	0.000 0001%	1	1 / 1,000,000,000
$6.466951\sigma$	99.999 999 99%	0.000 000 01%	0.1	1 / 10,000,000,000
$6.806502\sigma$	99.999 999 999%	0.000 000 001%	0.01	1 / 100,000,000,000
$7\sigma$	99.999 999 999 7440%	0.000 000 000 256%	0.002 56	1 / 390,682,215,445

The probability $p$ for a given scatter interval $[\mu - z\sigma;\ \mu + z\sigma]$ can be calculated as

$$p = 2\Phi(z) - 1,$$

where $\Phi(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-\frac{x^2}{2}}\, \mathrm{d}x$ is the distribution function of the standard normal distribution.

Conversely, for a given $p \in (0, 1)$,

$$z = \Phi^{-1}\left(\frac{p+1}{2}\right)$$

gives the limits of the associated scatter interval $[\mu - z\sigma;\ \mu + z\sigma]$ with probability $p$.
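Both directions, $p(z)$ and $z(p)$, can be sketched with the standard library class `statistics.NormalDist`, whose `inv_cdf` method plays the role of $\Phi^{-1}$. The function names `prob_within` and `z_for_prob` are chosen for this example only.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution N(0, 1)


def prob_within(z: float) -> float:
    """p = 2 * Phi(z) - 1: probability of the interval [mu - z*sigma, mu + z*sigma]."""
    return 2.0 * STD_NORMAL.cdf(z) - 1.0


def z_for_prob(p: float) -> float:
    """z = Phi^{-1}((p + 1) / 2): half-width in units of sigma for a given probability p."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return STD_NORMAL.inv_cdf((p + 1.0) / 2.0)


for p in (0.5, 0.9, 0.95, 0.99):
    print(f"p = {p:.2f}  ->  z = {z_for_prob(p):.6f}")
for z in (1, 2, 3):
    print(f"z = {z}  ->  p = {prob_within(z):.9f}")
```

The two functions are inverse to each other, so `prob_within(z_for_prob(p))` returns `p` up to rounding.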

An example (with fluctuation range)

Human body height is approximately normally distributed. In a sample of 1,284 girls and 1,063 boys between the ages of 14 and 18, the girls had an average height of 166.3 cm (standard deviation 6.39 cm) and the boys an average height of 176.8 cm (standard deviation 7.46 cm).

Accordingly, the scatter intervals above suggest that 68.3% of the girls have a height in the range 166.3 cm ± 6.39 cm and 95.4% in the range 166.3 cm ± 12.8 cm, so that

16% [≈ (100% - 68.3%) / 2] of the girls are shorter than 160 cm (and 16% correspondingly taller than 173 cm) and
2.5% [≈ (100% - 95.4%) / 2] of the girls are shorter than 154 cm (and 2.5% correspondingly taller than 179 cm).

For the boys it can be expected that 68% have a height in the range 176.8 cm ± 7.46 cm and 95% in the range 176.8 cm ± 14.92 cm, so that

16% of the boys are shorter than 169 cm (and 16% taller than 184 cm) and
2.5% of the boys are shorter than 162 cm (and 2.5% taller than 192 cm).
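The rounded shares in this example can be compared with values computed directly from the distribution function. The sketch below treats the sample means and standard deviations as the exact parameters of a normal distribution, which is an idealizing assumption; the helper names are illustrative.

```python
from statistics import NormalDist

# Sample values from the example, used here as if they were the true parameters
girls = NormalDist(mu=166.3, sigma=6.39)
boys = NormalDist(mu=176.8, sigma=7.46)


def share_below(dist: NormalDist, height_cm: float) -> float:
    """Modelled share of the group that is shorter than height_cm."""
    return dist.cdf(height_cm)


def share_above(dist: NormalDist, height_cm: float) -> float:
    """Modelled share of the group that is taller than height_cm."""
    return 1.0 - dist.cdf(height_cm)


for label, dist, low, high in (("girls", girls, 160.0, 173.0), ("boys", boys, 169.0, 184.0)):
    print(
        f"{label}: {100 * share_below(dist, low):.1f} % below {low:.0f} cm, "
        f"{100 * share_above(dist, high):.1f} % above {high:.0f} cm"
    )
```

Because the heights in the text are rounded to whole centimetres, the computed shares differ slightly from the rule-of-thumb values of 16% and 2.5%.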

Coefficient of variation

The coefficient of variation is obtained directly from the expected value $\mu$ and the standard deviation $\sigma$ of the distribution $\mathcal{N}(\mu, \sigma^2)$:

$$\operatorname{VarK} = \frac{\sigma}{\mu}.$$

Skewness

Regardless of the parameters $\mu$ and $\sigma$, the skewness always has the value $0$.

Kurtosis

The kurtosis is likewise independent of $\mu$ and $\sigma$ and equals $3$. To better assess the kurtosis of other distributions, it is often compared with the kurtosis of the normal distribution. For this purpose the kurtosis of the normal distribution is normalized to $0$ (by subtracting 3); the resulting quantity is called excess.

Cumulants

The cumulant-generating function is

$$g_X(t) = \mu t + \frac{\sigma^2 t^2}{2}.$$

Thus the first cumulant is $\kappa_1 = \mu$, the second is $\kappa_2 = \sigma^2$, and all further cumulants vanish.

Characteristic function

The characteristic function of a standard normally distributed random variable $Z \sim \mathcal{N}(0, 1)$ is

$$\varphi_Z(t) = e^{-\frac{1}{2} t^2}.$$

For a random variable $X \sim \mathcal{N}(\mu, \sigma^2)$ we obtain, with $X = \sigma Z + \mu$:

$$\varphi_X(t) = \operatorname{E}\left(e^{it(\sigma Z + \mu)}\right) = \operatorname{E}\left(e^{it\sigma Z} e^{it\mu}\right) = e^{it\mu} \operatorname{E}\left(e^{it\sigma Z}\right) = e^{it\mu} \varphi_Z(\sigma t) = \exp\left(it\mu - \tfrac{1}{2}\sigma^2 t^2\right).$$

Moment generating function

The moment-generating function of the normal distribution is

$$m_X(t) = \exp\left(\mu t + \frac{\sigma^2 t^2}{2}\right).$$

Moments

Let the random variable $X$ be $\mathcal{N}(\mu, \sigma^2)$-distributed. Then its first moments are as follows:

Order	Moment	Central moment
$k$	$\operatorname{E}(X^k)$	$\operatorname{E}((X-\mu)^k)$
0	$1$	$1$
1	$\mu$	$0$
2	$\mu^2 + \sigma^2$	$\sigma^2$
3	$\mu^3 + 3\mu\sigma^2$	$0$
4	$\mu^4 + 6\mu^2\sigma^2 + 3\sigma^4$	$3\sigma^4$
5	$\mu^5 + 10\mu^3\sigma^2 + 15\mu\sigma^4$	$0$
6	$\mu^6 + 15\mu^4\sigma^2 + 45\mu^2\sigma^4 + 15\sigma^6$	$15\sigma^6$
7	$\mu^7 + 21\mu^5\sigma^2 + 105\mu^3\sigma^4 + 105\mu\sigma^6$	$0$
8	$\mu^8 + 28\mu^6\sigma^2 + 210\mu^4\sigma^4 + 420\mu^2\sigma^6 + 105\sigma^8$	$105\sigma^8$

All central moments $\mu_n$ can be expressed in terms of the standard deviation $\sigma$:

$$\mu_n = \begin{cases} 0 & \text{if } n \text{ is odd} \\ (n-1)!! \cdot \sigma^n & \text{if } n \text{ is even} \end{cases}$$

Here the double factorial was used:

$$(n-1)!! = (n-1) \cdot (n-3) \cdot \ldots \cdot 3 \cdot 1 \quad \text{for } n \text{ even}.$$

A formula for the non-central moments can also be given for $X \sim \mathcal{N}(\mu, \sigma^2)$. To do this, one transforms $Z \sim \mathcal{N}(0, 1)$ and applies the binomial theorem:

$$\operatorname{E}(X^k) = \operatorname{E}((\sigma Z + \mu)^k) = \sum_{j=0}^{k} \binom{k}{j} \operatorname{E}(Z^j)\, \sigma^j \mu^{k-j} = \sum_{i=0}^{\lfloor k/2 \rfloor} \binom{k}{2i} \operatorname{E}(Z^{2i})\, \sigma^{2i} \mu^{k-2i} = \sum_{i=0}^{\lfloor k/2 \rfloor} \binom{k}{2i} (2i-1)!!\, \sigma^{2i} \mu^{k-2i}.$$
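The closed formulas for the central and non-central moments translate directly into code. The following Python sketch uses the convention $(-1)!! = 1$ for the $i = 0$ term; the function names are illustrative. It can be used to spot-check rows of the moment table.

```python
from math import comb


def double_factorial(n: int) -> int:
    """n!! for odd n >= -1, with the convention (-1)!! = 1."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result


def central_moment(n: int, sigma: float) -> float:
    """E((X - mu)^n): zero for odd n, (n - 1)!! * sigma^n for even n."""
    if n % 2 == 1:
        return 0
    return double_factorial(n - 1) * sigma ** n


def raw_moment(k: int, mu: float, sigma: float) -> float:
    """E(X^k) = sum over i of C(k, 2i) * (2i - 1)!! * sigma^(2i) * mu^(k - 2i)."""
    return sum(
        comb(k, 2 * i) * double_factorial(2 * i - 1) * sigma ** (2 * i) * mu ** (k - 2 * i)
        for i in range(k // 2 + 1)
    )


# Spot check against the table row for order 4: mu^4 + 6 mu^2 sigma^2 + 3 sigma^4
mu, sigma = 2, 3
print(raw_moment(4, mu, sigma), mu**4 + 6 * mu**2 * sigma**2 + 3 * sigma**4)
```

With integer arguments the functions work in exact integer arithmetic, which makes the comparison with the table free of rounding error.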

Invariance to convolution

The normal distribution is invariant under convolution, i.e. the sum of independent, normally distributed random variables is again normally distributed (see also stable distributions and infinitely divisible distributions). The normal distribution thus forms a convolution semigroup in both of its parameters. An illustrative formulation of this fact is: the convolution of a Gaussian curve of full width at half maximum $\Gamma_a$ with a Gaussian curve of full width at half maximum $\Gamma_b$ again yields a Gaussian curve, with the full width at half maximum

$$\Gamma_c = \sqrt{\Gamma_a^2 + \Gamma_b^2}.$$

So if $X, Y$ are two independent random variables with

$$X \sim \mathcal{N}(\mu_X, \sigma_X^2), \quad Y \sim \mathcal{N}(\mu_Y, \sigma_Y^2),$$

then their sum is also normally distributed:

$$X + Y \sim \mathcal{N}(\mu_X + \mu_Y,\ \sigma_X^2 + \sigma_Y^2).$$

This can be shown, for example, with the help of characteristic functions by using the fact that the characteristic function of the sum is the product of the characteristic functions of the summands (see the convolution theorem of the Fourier transform).

More generally, let $n$ independent, normally distributed random variables $X_i \sim \mathcal{N}(\mu_i, \sigma_i^2)$ be given. Then every linear combination is again normally distributed:

$$\sum_{i=1}^{n} c_i X_i \sim \mathcal{N}\left(\sum_{i=1}^{n} c_i \mu_i,\ \sum_{i=1}^{n} c_i^2 \sigma_i^2\right).$$

In particular, the sum of the random variables is again normally distributed,

$$\sum_{i=1}^{n} X_i \sim \mathcal{N}\left(\sum_{i=1}^{n} \mu_i,\ \sum_{i=1}^{n} \sigma_i^2\right),$$

and so is the arithmetic mean:

$$\frac{1}{n} \sum_{i=1}^{n} X_i \sim \mathcal{N}\left(\frac{1}{n} \sum_{i=1}^{n} \mu_i,\ \frac{1}{n^2} \sum_{i=1}^{n} \sigma_i^2\right).$$
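The rules for sums and linear combinations can be sketched in a few lines of Python. Adding two `statistics.NormalDist` objects treats them as independent and returns the distribution of the sum; the helper `linear_combination`, the helper `fwhm` and the example parameters are assumptions made for this illustration.

```python
import math
from statistics import NormalDist


def linear_combination(coeffs, dists) -> NormalDist:
    """Distribution of sum(c_i * X_i) for independent normal X_i."""
    mean = sum(c * d.mean for c, d in zip(coeffs, dists))
    variance = sum(c * c * d.variance for c, d in zip(coeffs, dists))
    return NormalDist(mean, math.sqrt(variance))


def fwhm(sigma: float) -> float:
    """Full width at half maximum of a Gaussian curve with standard deviation sigma."""
    return 2.0 * math.sqrt(2.0 * math.log(2.0)) * sigma


# Hypothetical independent variables X ~ N(1, 3^2) and Y ~ N(-2, 4^2)
X = NormalDist(1.0, 3.0)
Y = NormalDist(-2.0, 4.0)

S = X + Y                                        # sum: variances add
M = linear_combination([0.5, 0.5], [X, Y])       # arithmetic mean of the two
gamma_c = math.hypot(fwhm(X.stdev), fwhm(Y.stdev))

print(S, M, gamma_c, fwhm(S.stdev))
```

The last two printed numbers coincide: since the full width at half maximum is proportional to $\sigma$, adding variances is the same as adding the squared widths.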

According to Cramér's theorem, the converse also holds: if a normally distributed random variable is the sum of independent random variables, then the summands are also normally distributed.

The density function of the normal distribution is a fixed point of the Fourier transform, i.e. the Fourier transform of a Gaussian curve is again a Gaussian curve. The product of the standard deviations of these corresponding Gaussian curves is constant; Heisenberg's uncertainty principle applies.

Entropy

The normal distribution has the entropy $\log\left(\sigma\sqrt{2\pi e}\right)$.

Since it has the greatest entropy of all distributions with a given expected value and a given variance, it is often used as the a priori probability in the maximum entropy method.

Relationships with other distribution functions

Transformation to the standard normal distribution

As mentioned above, a normal distribution with arbitrary $\mu$ and $\sigma$ and distribution function $F$ is related to the $\mathcal{N}(0, 1)$ distribution as follows:

$$F(x) = \Phi\left(\tfrac{x-\mu}{\sigma}\right).$$

Here $\Phi$ is the distribution function of the standard normal distribution.

If $X \sim \mathcal{N}(\mu, \sigma^2)$, the standardization

$$Z = \frac{X-\mu}{\sigma}$$

leads to a standard normally distributed random variable $Z$, because

$$P(Z \leq z) = P\left(\tfrac{X-\mu}{\sigma} \leq z\right) = P\left(X \leq \sigma z + \mu\right) = F(\sigma z + \mu) = \Phi(z).$$

From a geometrical perspective, the substitution carried out corresponds to an area-preserving transformation of the bell curve of $\mathcal{N}(\mu, \sigma^2)$ into the bell curve of $\mathcal{N}(0, 1)$.

Approximation of the binomial distribution by the normal distribution

The normal distribution can be used to approximate the binomial distribution if the sample size is sufficiently large and the proportion of the property sought is neither too large nor too small in the population ( Moivre-Laplace theorem , central limit theorem , for experimental confirmation see also Galtonbrett ).

If a Bernoulli experiment with $n$ mutually independent stages (or random trials) is given with a success probability $p$, the probability of $k$ successes can in general be calculated by $P(X=k) = \tbinom{n}{k} \cdot p^k \cdot (1-p)^{n-k}, \quad k = 0, 1, \dotsc, n$ (binomial distribution).

This binomial distribution can be approximated by a normal distribution if $n$ is sufficiently large and $p$ is neither too large nor too small. As a rule of thumb, $np(1-p) \geq 9$ should hold. The expected value $\mu$ and the standard deviation $\sigma$ are then

$$\mu = n \cdot p$$

and

$$\sigma = \sqrt{n \cdot p \cdot (1-p)}.$$

The rule of thumb is thus equivalent to $\sigma \geq 3$ for the standard deviation.

If this condition is not met, the inaccuracy of the approximation is still acceptable if $np \geq 4$ and at the same time $n(1-p) \geq 4$.

The following approximation can then be used:

$$\begin{aligned} P(x_1 \leq X \leq x_2) &= \underbrace{\sum_{k=x_1}^{x_2} \binom{n}{k} \cdot p^k \cdot (1-p)^{n-k}}_{\mathrm{BV}} \\ &\approx \underbrace{\Phi\left(\frac{x_2 + 0.5 - \mu}{\sigma}\right) - \Phi\left(\frac{x_1 - 0.5 - \mu}{\sigma}\right)}_{\mathrm{NV}}. \end{aligned}$$

Here BV denotes the binomial distribution and NV the normal distribution. In the normal approximation, the lower limit is reduced by 0.5 and the upper limit is increased by 0.5 to ensure a better approximation. This is called the "continuity correction". It can be dispensed with only if $\sigma$ has a very high value.
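The quality of the approximation with continuity correction can be inspected by comparing it with the exact binomial sum. The following Python sketch does this under stated assumptions: the function names and the example values $n = 100$, $p = 0.5$, $x_1 = 45$, $x_2 = 55$ are chosen for illustration only and do not come from the article.

```python
import math


def phi(z: float) -> float:
    """Distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def binomial_exact(n: int, p: float, x1: int, x2: int) -> float:
    """Exact P(x1 <= X <= x2) for X ~ Bin(n, p), summed term by term."""
    return sum(math.comb(n, k) * p**k * (1 - p) ** (n - k)
               for k in range(x1, x2 + 1))


def binomial_normal_approx(n: int, p: float, x1: int, x2: int) -> float:
    """Normal approximation of P(x1 <= X <= x2) with continuity correction."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    return phi((x2 + 0.5 - mu) / sigma) - phi((x1 - 0.5 - mu) / sigma)


# Assumed example: n * p * (1 - p) = 25 >= 9, so the rule of thumb is met.
n, p, x1, x2 = 100, 0.5, 45, 55
print(binomial_exact(n, p, x1, x2), binomial_normal_approx(n, p, x1, x2))
```

For these values the two results differ only slightly; for small $n$ or extreme $p$ the gap becomes visibly larger.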

Since the binomial distribution is discrete, a few points must be observed:

The difference between $<$ and $\leq$ (and between "greater than" and "greater than or equal to") must be taken into account, which is not the case with the normal distribution. Therefore, for $P(X_{\text{BV}} < x)$ the next lower natural number has to be chosen, i.e.

$$P(X_{\text{BV}} < x) = P(X_{\text{BV}} \leq x-1)$$

or

$$P(X_{\text{BV}} > x) = P(X_{\text{BV}} \geq x+1),$$

so that the normal distribution can be used for further calculations.

For example:

$$P(X_{\text{BV}} < 70) = P(X_{\text{BV}} \leq 69)$$

Furthermore,

$$P(X_{\text{BV}} \leq x) = P(0 \leq X_{\text{BV}} \leq x)$$

$$P(X_{\text{BV}} \geq x) = P(x \leq X_{\text{BV}} \leq n)$$

$$P(X_{\text{BV}} = x) = P(x \leq X_{\text{BV}} \leq x)$$

(the last necessarily with continuity correction),

so these probabilities can also be calculated using the formula given above.

The great advantage of the approximation is that probabilities covering very many stages of a binomial distribution can be determined quickly and easily.

Relationship to the Cauchy distribution

The quotient of two stochastically independent $\mathcal{N}(0,1)$ standard normally distributed random variables is Cauchy distributed.

Relationship to the chi-square distribution

The square of a standard normally distributed random variable has a chi-square distribution with one degree of freedom. That is: if $Z \sim \mathcal{N}(0,1)$, then $Z^2 \sim \chi^2(1)$. Furthermore, if $\chi^2(r_1), \chi^2(r_2), \dotsc, \chi^2(r_n)$ are stochastically independent chi-square distributed random variables, then

$$Y = \chi^2(r_1) + \chi^2(r_2) + \dotsb + \chi^2(r_n) \sim \chi^2(r_1 + \dotsb + r_n).$$

From this it follows, for independent and standard normally distributed random variables $Z_1, Z_2, \dotsc, Z_n$:

$$Y = Z_1^2 + \dotsb + Z_n^2 \sim \chi^2(n)$$
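The sum-of-squares relation can be illustrated by simulation. The sketch below is only a plausibility check under assumptions: it draws standard normal numbers with `random.gauss`, uses an arbitrarily chosen seed and sample size, and compares the empirical mean and variance with the known values $n$ and $2n$ of a chi-square distribution with $n$ degrees of freedom.

```python
import random


def chi_square_sample(n: int, rng: random.Random) -> float:
    """One realization of Z_1^2 + ... + Z_n^2 with independent Z_i ~ N(0, 1)."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n))


def mean_and_variance(values):
    """Sample mean and corrected sample variance of a list of numbers."""
    m = sum(values) / len(values)
    v = sum((x - m) ** 2 for x in values) / (len(values) - 1)
    return m, v


rng = random.Random(2024)          # fixed seed (assumed) for reproducibility
n = 5                              # degrees of freedom (illustrative choice)
samples = [chi_square_sample(n, rng) for _ in range(20000)]
print(mean_and_variance(samples))  # should be close to (n, 2 * n)
```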

Other relationships are:

The sum $X_{n-1} = \frac{1}{\sigma^2} \sum_{i=1}^{n} (Z_i - \overline{Z})^2$ with $\overline{Z} := \frac{1}{n} \sum_{i=1}^{n} Z_i$ and $n$ independent normally distributed random variables $Z_i \sim \mathcal{N}(\mu,\sigma^2), \; i = 1, \dotsc, n$, follows a chi-square distribution $X_{n-1} \sim \chi_{n-1}^2$ with $(n-1)$ degrees of freedom.

As the number of degrees of freedom increases ( df ≫ 100), the chi-square distribution approaches the normal distribution.

The chi-square distribution is used to estimate confidence intervals for the variance of a normally distributed population.

Relationship to the Rayleigh distribution

The magnitude $Z = \sqrt{X^2 + Y^2}$ of two independent normally distributed random variables $X, Y$, each with mean $\mu_X = \mu_Y = 0$ and equal variances $\sigma_X^2 = \sigma_Y^2 = \sigma^2$, is Rayleigh distributed with parameter $\sigma > 0$.

Relation to the logarithmic normal distribution

If the random variable $X$ is normally distributed with $\mathcal{N}(\mu,\sigma^2)$, then the random variable $Y = e^X$ is log-normally distributed, i.e. $Y \sim \mathcal{LN}(\mu,\sigma^2)$.

A logarithmic normal distribution arises from the multiplicative interaction of many random variables, whereas a normal distribution arises from their additive interaction.

Relationship to the F-distribution

If the stochastically independent and identically normally distributed random variables $X_1^{(1)}, X_2^{(1)}, \dotsc, X_{n_1}^{(1)}$ and $X_1^{(2)}, X_2^{(2)}, \dotsc, X_{n_2}^{(2)}$ have the parameters

$$\operatorname{E}(X_i^{(1)}) = \mu_1, \quad \sqrt{\operatorname{Var}(X_i^{(1)})} = \sigma_1$$

$$\operatorname{E}(X_i^{(2)}) = \mu_2, \quad \sqrt{\operatorname{Var}(X_i^{(2)})} = \sigma_2,$$

then the random variable

$$Y_{n_1-1,n_2-1} := \frac{\sigma_2^2 \, (n_2-1) \sum\limits_{i=1}^{n_1} (X_i^{(1)} - \overline{X}^{(1)})^2}{\sigma_1^2 \, (n_1-1) \sum\limits_{j=1}^{n_2} (X_j^{(2)} - \overline{X}^{(2)})^2}$$

follows an F-distribution with $(n_1-1, n_2-1)$ degrees of freedom. Here

$$\overline{X}^{(1)} = \frac{1}{n_1} \sum_{i=1}^{n_1} X_i^{(1)}, \quad \overline{X}^{(2)} = \frac{1}{n_2} \sum_{i=1}^{n_2} X_i^{(2)}.$$

Relationship to Student's t-distribution

If the independent random variables $X_1, X_2, \dotsc, X_n$ are identically normally distributed with the parameters $\mu$ and $\sigma$, then the continuous random variable

$$Y_{n-1} = \frac{\overline{X} - \mu}{S / \sqrt{n}}$$

with the sample mean $\overline{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$ and the sample variance $S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \overline{X})^2$ follows a Student's t-distribution with $(n-1)$ degrees of freedom.

As the number of degrees of freedom increases, the Student's t-distribution approaches the normal distribution ever more closely. As a rule of thumb, from about $df > 30$ onwards the Student's t-distribution can, if necessary, be approximated by the normal distribution.

Student's t-distribution is used to estimate confidence intervals for the expected value of a normally distributed random variable with unknown variance.

Calculating with the standard normal distribution

In tasks in which the probability for $\mu$-$\sigma^2$-normally distributed random variables is to be determined using the standard normal distribution, it is not necessary to work through the full transformation given above every time. Instead, one simply uses the transformation

$$Z = \frac{X-\mu}{\sigma}$$

to generate an $\mathcal{N}(0,1)$-distributed random variable $Z$.

The probability of the event that, for example, $X$ lies in the interval $[x, y]$ equals a probability of the standard normal distribution by the following conversion:

$$\begin{aligned} P(x \leq X \leq y) &= P\left(\frac{x-\mu}{\sigma} \leq \frac{X-\mu}{\sigma} \leq \frac{y-\mu}{\sigma}\right) \\ &= P\left(\frac{x-\mu}{\sigma} \leq Z \leq \frac{y-\mu}{\sigma}\right) \\ &= \Phi\left(\frac{y-\mu}{\sigma}\right) - \Phi\left(\frac{x-\mu}{\sigma}\right). \end{aligned}$$

Fundamental questions

In general, the distribution function gives the area under the bell curve up to the value $x$; that is, the definite integral from $-\infty$ to $x$ is calculated.

In tasks, this corresponds to a sought probability that the random variable $X$ is smaller than, or not larger than, a certain number $x$. Because of the continuity of the normal distribution, it makes no difference whether $<$ or $\leq$ is required, because, for example,

$$P(X = 3) = \int_3^3 f(x) \, \mathrm{d}x = 0$$

and thus

$$P(X < 3) = P(X \leq 3).$$

The same applies to “larger” and “not smaller”.

Because $X$ can only be smaller or larger than a limit (or inside or outside two limits), two fundamental questions arise for probability calculations with normal distributions:

What is the probability that in a random experiment the standard normally distributed random variable $Z$ takes at most the value $z$?

$$P(Z \leq z) = \Phi(z)$$

In school mathematics, the term "left tip" is occasionally used for this statement, since the area under the Gaussian curve runs from the left up to the boundary. Negative values of $z$ are allowed. However, many tables of the standard normal distribution contain only positive entries; because of the symmetry of the curve and the negativity rule

$$\Phi(-z) = 1 - \Phi(z)$$

of the "left tip", this is not a restriction.

What is the probability that in a random experiment the standard normally distributed random variable $Z$ takes at least the value $z$?

$$P(Z \geq z) = 1 - \Phi(z)$$

The term "right tip" is occasionally used here; with

$$P(Z \geq -z) = 1 - \Phi(-z) = 1 - (1 - \Phi(z)) = \Phi(z)$$

there is a negativity rule here as well.

Since every random variable $X$ with a general normal distribution can be converted into the random variable $Z = \frac{X-\mu}{\sigma}$ with the standard normal distribution, the questions apply equally to both quantities.

Scatter area and anti-scatter area

Often the probability for a scatter range is of interest, i.e. the probability that the standard normally distributed random variable $Z$ takes values between $z_1$ and $z_2$:

$$P(z_1 \leq Z \leq z_2) = \Phi(z_2) - \Phi(z_1)$$

In the special case of the symmetrical scatter range ($z_1 = -z_2$, with $z_2 > 0$), the following applies:

$$\begin{aligned} P(-z \leq Z \leq z) &= P(|Z| \leq z) \\ &= \Phi(z) - \Phi(-z) \\ &= \Phi(z) - (1 - \Phi(z)) \\ &= 2\Phi(z) - 1. \end{aligned}$$

For the corresponding anti-scatter range, the probability that the standard normally distributed random variable $Z$ takes values outside the range between $z_1$ and $z_2$ is:

$$P(Z \leq z_1 \text{ or } Z \geq z_2) = \Phi(z_1) + (1 - \Phi(z_2)).$$

For a symmetrical anti-scatter range it thus follows that

$$\begin{aligned} P(Z \leq -z \text{ or } Z \geq z) &= P(|Z| \geq z) \\ &= \Phi(-z) + 1 - \Phi(z) \\ &= 1 - \Phi(z) + 1 - \Phi(z) \\ &= 2 - 2\Phi(z). \end{aligned}$$
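The scatter and anti-scatter probabilities above can be evaluated directly. The following Python sketch uses `statistics.NormalDist` from the standard library (Python 3.8 or later) for $\Phi$; the function names and the choice of $z = 1, 2, 3$ are illustrative assumptions.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def scatter_probability(z1: float, z2: float) -> float:
    """P(z1 <= Z <= z2) = Phi(z2) - Phi(z1) for Z ~ N(0, 1)."""
    return STD_NORMAL.cdf(z2) - STD_NORMAL.cdf(z1)


def anti_scatter_probability(z1: float, z2: float) -> float:
    """P(Z <= z1 or Z >= z2) = Phi(z1) + (1 - Phi(z2)) for Z ~ N(0, 1)."""
    return STD_NORMAL.cdf(z1) + (1.0 - STD_NORMAL.cdf(z2))


# Symmetrical ranges [-z, z]: inside and outside probabilities add up to 1.
for z in (1.0, 2.0, 3.0):
    inside = scatter_probability(-z, z)
    outside = anti_scatter_probability(-z, z)
    print(f"z = {z}: inside = {inside:.4f}, outside = {outside:.4f}")
```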

Scatter areas using the example of quality assurance

Both ranges are of particular importance, for example, in the quality assurance of technical or economic production processes. Here tolerance limits $x_1$ and $x_2$ must be observed, whereby there is usually a largest still acceptable distance $\epsilon$ from the expected value $\mu$ (= the optimal target value). The standard deviation $\sigma$, on the other hand, can be obtained empirically from the production process.

If $[x_1; x_2] = [\mu - \epsilon; \mu + \epsilon]$ is specified as the tolerance interval to be observed, then (depending on the question) a symmetrical scatter range or anti-scatter range is present.

In the case of the scatter range:

$$\begin{aligned} P(x_1 \leq X \leq x_2) &= P(|X - \mu| \leq \epsilon) \\ &= P(\mu - \epsilon \leq X \leq \mu + \epsilon) \\ &= P\left(\frac{-\epsilon}{\sigma} \leq Z \leq \frac{\epsilon}{\sigma}\right) \\ &= \Phi\left(\frac{\epsilon}{\sigma}\right) - \Phi\left(\frac{-\epsilon}{\sigma}\right) \\ &= 2\Phi\left(\frac{\epsilon}{\sigma}\right) - 1 \\ &= \gamma. \end{aligned}$$

The anti-scatter range then results from

$$P(|X - \mu| \geq \epsilon) = 1 - \gamma$$

or, if no scatter range was calculated, from

$$P(|X - \mu| \geq \epsilon) = 2 \cdot \left(1 - \Phi\left(\frac{\epsilon}{\sigma}\right)\right) = \alpha.$$

The result $\gamma$ is then the probability of sellable products, while $\alpha$ is the probability of rejects; both depend on the specifications of $\mu$, $\sigma$ and $\epsilon$.

If it is known that the maximum deviation $\epsilon$ is symmetrical about the expected value, then questions are also possible in which the probability is given and one of the other quantities has to be calculated.
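Both directions of the quality-assurance question can be sketched in a few lines of Python. This is an illustration under assumptions, not a prescribed procedure: the function names and the numerical values are invented for the example, and the inverse question is solved with `statistics.NormalDist.inv_cdf` by rearranging $\gamma = 2\Phi(\epsilon/\sigma) - 1$ to $\epsilon = \sigma \, \Phi^{-1}\left(\tfrac{1+\gamma}{2}\right)$.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution N(0, 1)


def yield_and_reject(epsilon: float, sigma: float) -> tuple[float, float]:
    """Return (gamma, alpha): probability inside and outside mu +/- epsilon."""
    gamma = 2.0 * STD_NORMAL.cdf(epsilon / sigma) - 1.0
    alpha = 1.0 - gamma
    return gamma, alpha


def tolerance_for_yield(gamma: float, sigma: float) -> float:
    """Solve gamma = 2 * Phi(epsilon / sigma) - 1 for epsilon."""
    return sigma * STD_NORMAL.inv_cdf((1.0 + gamma) / 2.0)


# Assumed example: tolerance 0.3 mm around the target, process sigma 0.1 mm.
gamma, alpha = yield_and_reject(epsilon=0.3, sigma=0.1)
print(f"sellable: {gamma:.4f}, rejects: {alpha:.4f}")

# Inverse question: which tolerance gives 95 % sellable products?
print(tolerance_for_yield(gamma=0.95, sigma=0.1))
```

Note that $\mu$ does not appear in the code: for a symmetrical tolerance interval only the ratio $\epsilon/\sigma$ matters.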

Testing for normal distribution

Figure: Quantiles of a normal distribution and a chi-square distribution.

Figure: A χ²-distributed random variable with 5 degrees of freedom is tested for normal distribution. For each sample size, 10,000 samples are simulated and then 5 goodness-of-fit tests are carried out at a level of 5%.

The following methods and tests can be used to check whether the available data are normally distributed:

Chi-square test
Kolmogorov-Smirnov test
Anderson-Darling test (modification of the Kolmogorov-Smirnov test)
Lilliefors test (modification of the Kolmogorov-Smirnov test)
Cramér-von Mises test
Shapiro-Wilk test
Jarque-Bera test
Q-Q plot (descriptive check)
Maximum likelihood method (descriptive check)

The tests have different characteristics in terms of the types of deviations from the normal distribution that they detect. The Kolmogorov-Smirnov test detects deviations in the middle of the distribution more readily than deviations in the tails, while the Jarque-Bera test reacts quite sensitively to strongly deviating individual values in the tails ("heavy tails").

In contrast to the Kolmogorov-Smirnov test, the Lilliefors test does not require standardization; that is, $\mu$ and $\sigma$ of the assumed normal distribution may be unknown.

Quantile-quantile plots or normal quantile plots allow a simple graphical check for normal distribution.
The maximum likelihood method can be used to estimate the parameters $\mu$ and $\sigma$ of the normal distribution and to compare the empirical data graphically with the fitted normal distribution.

Parameter estimation, confidence intervals and tests

Many of the statistical questions in which the normal distribution occurs have been well studied. The most important case is the so-called normal distribution model, which is based on carrying out $n$ independent and normally distributed experiments. There are three cases:

the expected value is unknown and the variance is known
the variance is unknown and the expected value is known
the expected value and the variance are both unknown.

Depending on which of these cases occurs, different estimation functions , confidence ranges or tests result. These are summarized in detail in the main article normal distribution model.

The following estimation functions are of particular importance:

The sample mean

$$\overline{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$$

is an unbiased estimator for the unknown expected value, for both known and unknown variance. In fact, it is the best unbiased estimator, i.e. the unbiased estimator with the smallest variance. Both the maximum likelihood method and the method of moments yield the sample mean as an estimator.

The uncorrected sample variance

$$V(X) = \frac{1}{n} \sum_{i=1}^{n} (X_i - \mu_0)^2$$

is an unbiased estimator for the unknown variance for a given expected value $\mu_0$. It, too, can be obtained both from the maximum likelihood method and from the method of moments.

The corrected sample variance

$$V^{*}(X) = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \overline{X})^2$$

is an unbiased estimator for the unknown variance when the expected value is unknown.
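The three estimators translate directly into code. The following is a minimal Python sketch with illustrative function names and invented sample data; it follows the formulas above literally and makes no claim about numerical robustness for very large samples.

```python
def sample_mean(xs):
    """Sample mean: unbiased estimator of the expected value."""
    return sum(xs) / len(xs)


def uncorrected_sample_variance(xs, mu0):
    """Variance estimator for a known expected value mu0 (divides by n)."""
    return sum((x - mu0) ** 2 for x in xs) / len(xs)


def corrected_sample_variance(xs):
    """Variance estimator for an unknown expected value (divides by n - 1)."""
    mean = sample_mean(xs)
    return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)


# Assumed example data, not taken from the article.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(sample_mean(data))
print(uncorrected_sample_variance(data, mu0=5.0))
print(corrected_sample_variance(data))
```

The standard library functions `statistics.mean` and `statistics.variance` compute the first and third quantity as well.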

Generation of normally distributed random numbers

All of the following methods generate random numbers with a standard normal distribution. Arbitrary normally distributed random numbers can be generated from these by a linear transformation: if the random variable $x$ is $\mathcal{N}(0,1)$-distributed, then $a \cdot x + b$ is $\mathcal{N}(b, a^2)$-distributed.

Box-Muller method

Using the Box-Muller method, two independent, standard normally distributed random variables $X$ and $Y$ can be simulated from two independent, uniformly distributed random variables $U_1, U_2 \sim U(0,1)$, so-called standard random numbers:

$$X = \cos(2\pi U_1) \sqrt{-2 \ln U_2}$$

and

$$Y = \sin(2\pi U_1) \sqrt{-2 \ln U_2}.$$
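The two formulas can be sketched in Python as follows. The function name `box_muller` and the seed are illustrative assumptions; the only deviation from the formulas is that $U_2$ is drawn from $(0, 1]$ instead of $[0, 1)$ so that the logarithm is always defined.

```python
import math
import random


def box_muller(rng: random.Random) -> tuple[float, float]:
    """Return two independent N(0, 1) numbers from two uniform numbers."""
    u1 = rng.random()            # uniform on [0, 1), used for the angle
    u2 = 1.0 - rng.random()      # uniform on (0, 1], avoids log(0)
    radius = math.sqrt(-2.0 * math.log(u2))
    angle = 2.0 * math.pi * u1
    return radius * math.cos(angle), radius * math.sin(angle)


rng = random.Random(42)          # fixed seed (assumed) for reproducibility
x, y = box_muller(rng)
print(x, y)

# Linear transformation to N(b, a^2), here with assumed a = 2 and b = 10.
print(2.0 * x + 10.0)
```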

Polar method

George Marsaglia's polar method is even faster on a computer because it does not require evaluations of trigonometric functions:

1. Generate two independent random numbers $u_1$ and $u_2$, uniformly distributed in the interval $[-1, 1]$.
2. Calculate $q = u_1^2 + u_2^2$. If $q = 0$ or $q \geq 1$, go back to step 1.
3. Calculate $p = \sqrt{\frac{-2 \cdot \ln q}{q}}$.
4. $x_i = u_i \cdot p$ for $i = 1, 2$ yields two independent, standard normally distributed random numbers $x_1$ and $x_2$.
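The steps of the polar method can be sketched in Python as follows. The function name `polar_method` and the seed are illustrative assumptions; the rejection loop corresponds to "go back to step 1".

```python
import math
import random


def polar_method(rng: random.Random) -> tuple[float, float]:
    """Return two independent N(0, 1) numbers without trigonometric calls."""
    while True:
        # Step 1: two independent uniform numbers on [-1, 1].
        u1 = rng.uniform(-1.0, 1.0)
        u2 = rng.uniform(-1.0, 1.0)
        # Step 2: accept only points strictly inside the unit circle, q != 0.
        q = u1 * u1 + u2 * u2
        if 0.0 < q < 1.0:
            break
    # Step 3: common scaling factor.
    p = math.sqrt(-2.0 * math.log(q) / q)
    # Step 4: scale both coordinates.
    return u1 * p, u2 * p


rng = random.Random(42)   # fixed seed (assumed) for reproducibility
print(polar_method(rng))
```

On average about $1 - \pi/4 \approx 21\,\%$ of the candidate pairs are rejected, since the unit circle covers the fraction $\pi/4$ of the square $[-1,1]^2$.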

Rule of twelve

The central limit theorem states that under certain conditions the distribution of the sum of independently and identically distributed random numbers approaches a normal distribution.

A special case is the rule of twelve, which is limited to the sum of twelve random numbers from a uniform distribution on the interval [0,1] and which already leads to acceptable distributions.

However, the required independence of the twelve random variables $X_i$ is not guaranteed for the linear congruential generators (LCG) that are still frequently used. On the contrary, the spectral test for LCGs usually only guarantees the independence of at most four to seven of the $X_i$. The rule of twelve is therefore very questionable for numerical simulations and should, if at all, only be used with more complex but better pseudo-random generators such as the Mersenne Twister (standard in Python and GNU R) or WELL. Other methods, which are even easier to program, are therefore as a rule preferable to the rule of twelve.
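For illustration only, the rule of twelve can be sketched in Python as follows. The sum of twelve independent $U(0,1)$ numbers has expected value 6 and variance $12 \cdot \tfrac{1}{12} = 1$, so subtracting 6 gives an approximately standard normally distributed number; this centring step and the function name `rule_of_twelve` are additions for the example. The values are always confined to $[-6, 6]$, which is one visible limitation of the method.

```python
import random


def rule_of_twelve(rng: random.Random) -> float:
    """Approximate N(0, 1) number: sum of twelve U(0, 1) numbers minus 6."""
    return sum(rng.random() for _ in range(12)) - 6.0


rng = random.Random(42)   # Python's random module uses the Mersenne Twister
samples = [rule_of_twelve(rng) for _ in range(5)]
print(samples)
```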

Rejection method

Normal distributions can be simulated with the rejection method.

Inversion method

Normally distributed random numbers can also be generated using the inversion method.

Since the error integral cannot be integrated explicitly with elementary functions, one can fall back on a series expansion of the inverse function for a starting value and a subsequent correction with Newton's method. For this, $\operatorname{erf}(x)$ and $\operatorname{erfc}(x)$ are needed, which in turn can be calculated with series expansions and continued fraction expansions; overall this is a relatively high effort. The necessary expansions can be found in the literature.

Expansion of the inverse error integral (because of the pole, it can only be used as a starting value for Newton's method):

$$\operatorname{erf}^{-1}\left(\frac{2}{\sqrt{\pi}} x\right) = x \Bigl(a_1 + x^2 \bigl(a_2 + x^2 (\dotsb)\bigr)\Bigr)$$

with the coefficients

$$a_i = 1, \tfrac{1}{3}, \tfrac{7}{30}, \tfrac{127}{630}, \tfrac{4369}{22680}, \tfrac{34807}{178200}, \dotsc$$
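The combination of series start value and Newton correction can be sketched in Python as follows. This is a simplified illustration under assumptions: the series is truncated after the six coefficients listed above, `math.erf` from the standard library replaces the series and continued fraction expansions of $\operatorname{erf}$, the function names are invented, and the relation $\Phi^{-1}(p) = \sqrt{2} \operatorname{erf}^{-1}(2p - 1)$ is used to obtain quantiles of the standard normal distribution.

```python
import math
import random

COEFFS = (1.0, 1 / 3, 7 / 30, 127 / 630, 4369 / 22680, 34807 / 178200)


def erfinv_start(y: float) -> float:
    """Truncated series for erf^-1(y), evaluated in Horner form."""
    x = 0.5 * math.sqrt(math.pi) * y      # y = 2 x / sqrt(pi)
    x2 = x * x
    acc = 0.0
    for a in reversed(COEFFS):
        acc = a + x2 * acc
    return x * acc


def erfinv(y: float, max_steps: int = 50) -> float:
    """Inverse error function for -1 < y < 1 via Newton's method."""
    if not -1.0 < y < 1.0:
        raise ValueError("y must lie strictly between -1 and 1")
    w = erfinv_start(y)
    for _ in range(max_steps):
        # Newton step for f(w) = erf(w) - y, f'(w) = 2/sqrt(pi) * exp(-w^2).
        step = (math.erf(w) - y) / (2.0 / math.sqrt(math.pi) * math.exp(-w * w))
        w -= step
        if abs(step) < 1e-14:
            break
    return w


def standard_normal_quantile(p: float) -> float:
    """Phi^-1(p) for 0 < p < 1."""
    return math.sqrt(2.0) * erfinv(2.0 * p - 1.0)


# Inversion method: transform a uniform number into an N(0, 1) number.
rng = random.Random(42)
u = rng.random() or 0.5   # guard against the (very unlikely) value 0.0
print(standard_normal_quantile(u))
```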

Applications outside of probability

The normal distribution can also be used to describe phenomena that are not directly stochastic, for example in physics for the amplitude profile of Gaussian beams and other distribution profiles.

It is also used in the Gabor transformation .

Literature

Stephen M. Stigler: The history of statistics: the measurement of uncertainty before 1900. Belknap Series. Harvard University Press, 1986. ISBN 9780674403413 .

