This is a collection of formulas for the mathematical subfield of stochastics, covering probability theory, combinatorics, random variables and distributions, as well as statistics.
Notation
In stochastics, in addition to the usual mathematical notation and symbols, the following conventions are frequently used:

- Random variables are written with uppercase letters, e.g. $X, Y, Z$.
- Realizations of a random variable are written with the corresponding lowercase letters, e.g. the observations in a sample: $x_1, x_2, \dots, x_n$.
- Lowercase letters are used to denote probability functions and probability densities, e.g. $f(x)$.
- Uppercase letters are used to denote distribution functions, e.g. $F(x)$.
- In particular, the density of the standard normal distribution is denoted by $\varphi$ and its distribution function by $\Phi$.
- Greek letters (e.g. $\mu, \sigma, \vartheta$) are used to denote unknown parameters (population parameters).
- An estimator is often denoted by a circumflex above the corresponding symbol, e.g. $\hat{\vartheta}$ (read: "theta hat").
- The arithmetic mean is denoted by $\bar{x}$ (read: "x bar").
In the following, a probability space $(\Omega, \Sigma, P)$ is always given: the result space $\Omega$ is an arbitrary nonempty set, $\Sigma$ is a σ-algebra of subsets of $\Omega$ that contains $\Omega$, and $P$ is a probability measure on $\Sigma$.
Basics
Axioms:

Every event $A \in \Sigma$ is assigned a probability $P(A)$ such that:

- $P(A) \geq 0$,
- $P(\Omega) = 1$,
- $P\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i)$ holds for pairwise disjoint events $A_1, A_2, \dots$
Calculation rules: The axioms imply:

- For $A \subseteq B$: $P(A) \leq P(B)$; in particular $P(A) \leq 1$ for every event.
- For the complementary event: $P(\bar{A}) = 1 - P(A)$.
Laplace experiments

If all elementary events of a finite result space $\Omega$ are equally likely (a Laplace experiment), then for every event $A$:

$$P(A) = \frac{|A|}{|\Omega|} = \frac{\text{number of favorable outcomes}}{\text{number of possible outcomes}}$$
Conditional probability: the probability of $A$ given $B$ (with $P(B) > 0$) is

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$
Bayes' theorem:

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$
Independence:

- Two events $A$ and $B$ are independent $\iff P(A \cap B) = P(A) \cdot P(B)$.
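These rules can be checked with plain Python. The sketch below applies the total-probability decomposition and Bayes' theorem to made-up illustrative values (the prevalence, sensitivity, and false-positive rate are assumptions, not taken from this text):

```python
# Hypothetical example: a diagnostic test for a rare condition.
p_a = 0.01              # P(A): prevalence of the condition
p_b_given_a = 0.95      # P(B | A): test positive given the condition
p_b_given_not_a = 0.05  # P(B | not A): false-positive rate

# Total probability: P(B) = P(B|A) P(A) + P(B|not A) P(not A)
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

# Bayes' theorem: P(A | B) = P(B | A) P(A) / P(B)
p_a_given_b = p_b_given_a * p_a / p_b
print(round(p_a_given_b, 3))  # 0.161
```

Despite the high sensitivity, the posterior probability is small because the condition is rare.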
Combinatorics

Factorial: the number of possibilities when drawing all $n$ balls out of an urn (without replacement):

$$n! = n \cdot (n-1) \cdots 2 \cdot 1, \quad \text{where } 0! = 1$$
|             | without repetition (of $n$ elements) | with repetition (of $r + s + \dots + t = n$ elements, of which $r, s, \dots, t$ each are indistinguishable) |
|-------------|--------------------------------------|-------------------------------------------------------------------------------------------------------------|
| Permutation | $n!$                                 | $\dfrac{n!}{r!\,s! \cdots t!}$                                                                                |
Binomial coefficient "$n$ over $k$":

$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$$
Number of possibilities when drawing $k$ balls from an urn containing $n$ balls:
|             | without repetition (without replacement) (see hypergeometric distribution) | with repetition (with replacement) (see binomial distribution) |
|-------------|----------------------------------------------------------------------------|----------------------------------------------------------------|
| Variation   | $\dfrac{n!}{(n-k)!}$                                                       | $n^k$                                                          |
| Combination | $\dbinom{n}{k}$                                                            | $\dbinom{n+k-1}{k}$                                            |
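The counting formulas above can be evaluated directly with Python's standard library; the values $n = 5$, $k = 3$ below are arbitrary:

```python
import math

n, k = 5, 3

# Factorial: number of orderings of all n elements
assert math.factorial(n) == 120

# Variation without repetition: n!/(n-k)! ordered draws without replacement
print(math.perm(n, k))          # 60

# Variation with repetition: n^k ordered draws with replacement
print(n ** k)                   # 125

# Combination without repetition: binomial coefficient "n over k"
print(math.comb(n, k))          # 10

# Combination with repetition: C(n+k-1, k)
print(math.comb(n + k - 1, k))  # 35
```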
Discrete random variables

A function $f$ is called the probability function of a discrete random variable if it has the following properties:

1. $f(x_i) \geq 0$ for all $i$,
2. $\sum_i f(x_i) = 1$.

The associated random variable $X$ then satisfies:

$$P(X = x_i) = f(x_i)$$

A random variable $X$ and its distribution are called discrete if a function $f$ with these properties exists; $f$ is called the probability function of $X$.
Continuous random variables

A function $f$ is called the density (function) of a continuous random variable if it has the following properties:

1. $f(x) \geq 0$ for all $x$,
2. $\int_{-\infty}^{\infty} f(x)\,dx = 1$.

A continuous random variable $X$ then satisfies:

$$P(a \leq X \leq b) = \int_a^b f(x)\,dx$$

A random variable $X$ and its distribution are called continuous if a density function $f$ with this property exists; $f$ is called the density (function) of $X$.

For the probability of single points:

- $P(X = x) = 0$ for all $x$.

Expected value and variance are given by

$$E(X) = \int_{-\infty}^{\infty} x\,f(x)\,dx, \qquad \operatorname{Var}(X) = \int_{-\infty}^{\infty} \left(x - E(X)\right)^2 f(x)\,dx$$
Expected value, variance, covariance, correlation
For the expected value $E(X)$, the variance $\operatorname{Var}(X)$, the covariance $\operatorname{Cov}(X, Y)$ and the correlation $\rho(X, Y)$, the following hold:

- $E(aX + b) = a\,E(X) + b$; in general $E(X + Y) = E(X) + E(Y)$
- $\operatorname{Var}(X) = E(X^2) - \left(E(X)\right)^2$ and $\operatorname{Var}(aX + b) = a^2 \operatorname{Var}(X)$
- For independent random variables $X, Y$: $E(X \cdot Y) = E(X) \cdot E(Y)$
- For independent random variables $X, Y$: $\operatorname{Var}(X + Y) = \operatorname{Var}(X) + \operatorname{Var}(Y)$
- $\operatorname{Cov}(X, Y) = E(XY) - E(X)\,E(Y)$ and $\rho(X, Y) = \dfrac{\operatorname{Cov}(X, Y)}{\sqrt{\operatorname{Var}(X)\operatorname{Var}(Y)}}$

Chebyshev inequality: for every $k > 0$,

$$P\left(|X - E(X)| \geq k\right) \leq \frac{\operatorname{Var}(X)}{k^2}$$
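The expectation and variance rules can be verified by exact enumeration for a small discrete distribution; the sketch below uses a fair die (a standard example, not from this text) and exact rational arithmetic:

```python
from fractions import Fraction

# Fair die: X uniform on {1, ..., 6}, each value with probability 1/6
vals = range(1, 7)
p = Fraction(1, 6)

ex = sum(p * x for x in vals)        # E(X)
ex2 = sum(p * x * x for x in vals)   # E(X^2)
var = ex2 - ex ** 2                  # Var(X) = E(X^2) - E(X)^2

print(ex)   # 7/2
print(var)  # 35/12

# Shift/scale rules: E(aX + b) = a E(X) + b, Var(aX + b) = a^2 Var(X)
a, b = 2, 3
ex_t = sum(p * (a * x + b) for x in vals)
var_t = sum(p * (a * x + b) ** 2 for x in vals) - ex_t ** 2
assert ex_t == a * ex + b
assert var_t == a * a * var
```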
Binomial distribution

Given is an $n$-step Bernoulli experiment (i.e. the same experiment repeated $n$ times independently, with only two possible outcomes and constant probabilities), with probability of success $p$ and probability of failure $q = 1 - p$. The probability distribution of the random variable $X$ = number of successes is called the binomial distribution $B(n, p)$.

The probability of exactly $k$ successes is:

$$P(X = k) = \binom{n}{k} p^k q^{n-k}, \quad k = 0, 1, \dots, n$$

- Expected value: $E(X) = np$
- Variance: $\operatorname{Var}(X) = npq$
- Standard deviation: $\sigma = \sqrt{npq}$
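A minimal sketch of the binomial probability function (the parameters $n = 10$, $p = 0.5$ are arbitrary):

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p): C(n, k) * p^k * (1-p)^(n-k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.5
print(binom_pmf(5, n, p))  # 0.24609375, the most likely value for p = 0.5

# The probabilities over k = 0, ..., n sum to 1
print(sum(binom_pmf(k, n, p) for k in range(n + 1)))

mu = n * p                # expected value np
sigma2 = n * p * (1 - p)  # variance np(1-p)
```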
σ-rules
(Probabilities of neighborhoods of the expected value in binomial distributions.) Between the radius of a neighborhood around the expected value $\mu$ and the associated probability of that neighborhood, the following assignments hold (if $\sigma > 3$):

| Radius of the neighborhood | Probability of the neighborhood |
|----------------------------|---------------------------------|
| $1\sigma$                  | 0.68                            |
| $2\sigma$                  | 0.955                           |
| $3\sigma$                  | 0.997                           |

| Probability of the neighborhood | Radius of the neighborhood |
|---------------------------------|----------------------------|
| 0.90                            | $1.64\sigma$               |
| 0.95                            | $1.96\sigma$               |
| 0.99                            | $2.58\sigma$               |
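The tabulated σ-rules can be reproduced from the standard normal distribution function, since $P(\mu - k\sigma \leq X \leq \mu + k\sigma) \approx 2\Phi(k) - 1$. A sketch using only `math.erf`:

```python
from math import erf, sqrt

def phi_cdf(x):
    """Distribution function Phi of the standard normal distribution."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

# P(mu - k*sigma <= X <= mu + k*sigma) = Phi(k) - Phi(-k) = 2*Phi(k) - 1
for k in (1.0, 2.0, 3.0, 1.64, 1.96, 2.58):
    print(k, round(2 * phi_cdf(k) - 1, 3))
```

The printed values match the table entries up to rounding (e.g. 0.683, 0.954, 0.997 for $k = 1, 2, 3$).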
Standardize a distribution
If the random variable $X$ has a distribution with expected value $\mu$ and standard deviation $\sigma$, then the standardized variable $Z$ is defined by

$$Z = \frac{X - \mu}{\sigma}$$

The standardized variable has expected value 0 and standard deviation 1.
Poisson approximation
Given is a binomial distribution with large sample size ($n \geq 100$) and small probability of success $p$. With $\lambda = np$, the probabilities can then be calculated approximately:

$$P(X = k) \approx \frac{\lambda^k}{k!}\,e^{-\lambda}$$

The relationship can be summarized as: $B(n, p) \approx \operatorname{Po}(np)$ for large $n$ and small $p$.
Poisson distribution
For a random variable $X$ with the Poisson distribution $\operatorname{Po}(\lambda)$:

$$P(X = k) = \frac{\lambda^k}{k!}\,e^{-\lambda}, \quad k = 0, 1, 2, \dots$$

with $E(X) = \operatorname{Var}(X) = \lambda$.
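The Poisson probability function and its use as an approximation of the binomial distribution can be compared numerically; the parameters $n = 200$, $p = 0.02$ below are arbitrary illustrative values:

```python
from math import comb, exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) = lam^k / k! * e^(-lam)."""
    return lam**k / factorial(k) * exp(-lam)

# Poisson approximation of B(n, p) with lam = n*p
n, p = 200, 0.02
lam = n * p  # 4.0

exact = comb(n, 3) * p**3 * (1 - p)**(n - 3)  # binomial P(X = 3)
approx = poisson_pmf(3, lam)
print(round(exact, 4), round(approx, 4))  # the two values agree closely
```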
Approximation formulas of de Moivre and Laplace

Let $X$ be a binomially distributed random variable with $\mu = np$ and $\sigma = \sqrt{npq}$ (usable approximation for $\sigma^2 = npq > 9$). The probabilities of exactly $k$ and of at most $k$ successes can be calculated approximately by:

$$P(X = k) \approx \frac{1}{\sigma}\,\varphi\!\left(\frac{k - \mu}{\sigma}\right), \qquad P(X \leq k) \approx \Phi\!\left(\frac{k + 0.5 - \mu}{\sigma}\right)$$
Standard normal distribution
The density (function) $\varphi$ (also known as the bell curve) of the standard normal distribution is defined by:

$$\varphi(x) = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}$$

and the distribution function $\Phi$ by:

$$\Phi(x) = \int_{-\infty}^{x} \varphi(t)\,dt$$

Approximation formula for a discrete distribution using the continuity correction:

$$P(a \leq X \leq b) \approx \Phi\!\left(\frac{b + 0.5 - \mu}{\sigma}\right) - \Phi\!\left(\frac{a - 0.5 - \mu}{\sigma}\right)$$
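A numerical sketch of the normal approximation with continuity correction, compared against the exact binomial sum (the parameters $n = 100$, $p = 0.5$, $k = 55$ are made-up):

```python
from math import comb, erf, exp, pi, sqrt

def phi_pdf(x):
    """Density of the standard normal distribution (bell curve)."""
    return exp(-x * x / 2) / sqrt(2 * pi)

def phi_cdf(x):
    """Distribution function Phi(x)."""
    return 0.5 * (1 + erf(x / sqrt(2)))

# Continuity-corrected approximation of P(X <= 55) for X ~ B(100, 0.5)
n, p, k = 100, 0.5, 55
mu, sigma = n * p, sqrt(n * p * (1 - p))

approx = phi_cdf((k + 0.5 - mu) / sigma)
exact = sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))
print(round(approx, 4), round(exact, 4))  # nearly identical
```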
Hypergeometric distribution

In a population of size $N$, two characteristic values occur, with $M$ and $N - M$ elements respectively. A sample of size $n$ is drawn without replacement. The distribution of the random variable $X$ = number of elements with the first characteristic value in the sample is called the hypergeometric distribution.

The probability of exactly $k$ elements with the first characteristic value in the sample is:

$$P(X = k) = \frac{\dbinom{M}{k}\dbinom{N-M}{n-k}}{\dbinom{N}{n}}$$

where $N$ = number of elements, $M$ = number of positive elements, $n$ = number of draws, $k$ = number of successes.

If $p = M/N$ denotes the proportion of the first characteristic value in the population, then:

$$E(X) = np, \qquad \operatorname{Var}(X) = np(1-p)\,\frac{N-n}{N-1}$$
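A sketch of the hypergeometric probability function; the lottery-style parameters ($N = 49$, $M = 6$, $n = 6$) are a standard illustration, not from this text:

```python
from math import comb

def hypergeom_pmf(k, N, M, n):
    """P(X = k): k successes in n draws without replacement
    from N elements of which M are 'positive'."""
    return comb(M, k) * comb(N - M, n - k) / comb(N, n)

# Example: drawing 6 of 49 numbers, probability of exactly 3 hits
N, M, n = 49, 6, 6
print(round(hypergeom_pmf(3, N, M, n), 5))

# The probabilities over all k sum to 1 (math.comb returns 0 for k > M)
total = sum(hypergeom_pmf(k, N, M, n) for k in range(n + 1))
print(total)
```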
Geometric distribution

Given is a Bernoulli experiment with probability of success $p$ and $q = 1 - p$. The distribution of the random variable $X$ = number of trials up to the first success is called the geometric distribution. The following hold:

- $P(X = k) = q^{k-1}\,p$ (success exactly on the $k$-th attempt)
- $P(X > k) = q^k$ ($k$ failures in a row, i.e. the first success comes only after the $k$-th attempt)
- $P(X \leq k) = 1 - q^k$ (success at the latest on the $k$-th attempt, i.e. at least one success occurs by the $k$-th attempt)

The expected value is $E(X) = \dfrac{1}{p}$.
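A sketch of these formulas; waiting for the first six when rolling a die ($p = 1/6$) is an assumed example:

```python
def geometric_pmf(k, p):
    """P(X = k) = (1-p)^(k-1) * p: first success on attempt k."""
    return (1 - p) ** (k - 1) * p

p = 1 / 6  # e.g. waiting for the first six when rolling a die
k = 3

# Tail probability: P(X > k) = (1-p)^k
tail = (1 - p) ** k

# Head probability: P(X <= k) as a sum of the first k terms
head = sum(geometric_pmf(i, p) for i in range(1, k + 1))

# Consistency check: P(X <= k) + P(X > k) = 1
print(round(head + tail, 10))  # 1.0

print(1 / p)  # expected value E(X) = 1/p
```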
Further distributions

The numerous other special distributions cannot all be listed here; see the list of univariate probability distributions.
Approximations of distributions
Under certain approximation conditions, distributions can also be approximated by one another in order to simplify calculations. The approximation conditions may differ slightly depending on the textbook.
Discrete distributions (binomial, hypergeometric, Poisson) and continuous distributions (chi-square, Student's t) can be approximated as follows:

| From \ To                                | Binomial distribution                    | Poisson distribution                            | Normal distribution                                     |
|------------------------------------------|------------------------------------------|-------------------------------------------------|---------------------------------------------------------|
| Binomial distribution $B(n, p)$          | —                                        | $n \geq 50$, $p \leq 0.05$: $\lambda = np$      | $np(1-p) \geq 9$: $\mu = np$, $\sigma^2 = np(1-p)$      |
| Hypergeometric distribution $H(N, M, n)$ | $\frac{n}{N} \leq 0.05$: $p = \frac{M}{N}$ | via the binomial conditions with $p = \frac{M}{N}$ | via the binomial conditions with $p = \frac{M}{N}$   |
| Poisson distribution $\operatorname{Po}(\lambda)$ | —                              | —                                               | $\lambda \geq 9$: $\mu = \lambda$, $\sigma^2 = \lambda$ |
| Chi-square distribution $\chi^2(n)$      | —                                        | —                                               | $n > 100$: $\mu = n$, $\sigma^2 = 2n$                   |
| Student's t-distribution $t(n)$          | —                                        | —                                               | $n > 30$: $\mu = 0$, $\sigma^2 \approx 1$               |
In the transition from a discrete distribution to a continuous distribution, a continuity correction also comes into consideration (e.g. when approximating the binomial or Poisson distribution by the normal distribution), in particular $P(X \leq x) \approx \Phi\!\left(\frac{x + 0.5 - \mu}{\sigma}\right)$.
Critical values
The $p$-quantile $x_p$ is the value of a probability distribution with distribution function $F$ for which $F(x_p) = p$. For some commonly used distributions there is a standard notation:

- $z_p$ or $\Phi^{-1}(p)$ for the standard normal distribution
- $t_{n;p}$ for the t-distribution with $n$ degrees of freedom
- $\chi^2_{n;p}$ for the chi-square distribution with $n$ degrees of freedom
- $F_{m,n;p}$ for the F-distribution with $m$ and $n$ degrees of freedom
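Quantiles of the standard normal distribution can be computed directly with the standard library; the probability levels below are the common ones from the σ-rules table:

```python
from statistics import NormalDist

# z_p quantiles of the standard normal distribution: Phi(z_p) = p
z = NormalDist()  # mu = 0, sigma = 1
for p in (0.90, 0.95, 0.975, 0.99):
    print(p, round(z.inv_cdf(p), 3))
```

The standard library covers only the normal distribution; quantiles of the t-, chi-square and F-distributions are provided by third-party packages such as SciPy (`scipy.stats.t.ppf`, `scipy.stats.chi2.ppf`, `scipy.stats.f.ppf`).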
Location measures

- Arithmetic mean: $\bar{x} = \dfrac{1}{n}\sum_{i=1}^{n} x_i$
- Median: the middle value of the ordered sample
- Mode: the most frequent value
Measures of dispersion
- Empirical variance: $s^2 = \dfrac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$
- Empirical standard deviation: $s = \sqrt{s^2}$
Measures of association

- Empirical covariance: $s_{xy} = \dfrac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})$
- Empirical correlation coefficient: $r_{xy} = \dfrac{s_{xy}}{s_x\,s_y}$

Equation of the regression line of a simple linear regression: $\hat{y} = a + bx$ with

- $b = \dfrac{s_{xy}}{s_x^2}$, $a = \bar{y} - b\,\bar{x}$,

where $\bar{x}$ and $\bar{y}$ denote the arithmetic means.
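The regression-line formulas can be sketched directly; the sample data below are made up for illustration:

```python
from statistics import mean

# Made-up sample data (x_i, y_i)
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.0, 9.9]

x_bar, y_bar = mean(xs), mean(ys)
n = len(xs)

# Empirical covariance and variance (denominator n - 1)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
s_xx = sum((x - x_bar) ** 2 for x in xs) / (n - 1)

b = s_xy / s_xx        # slope
a = y_bar - b * x_bar  # intercept; the line passes through (x_bar, y_bar)
print(round(a, 3), round(b, 3))  # 0.11 1.97
```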
Mean values
| Mean                  | Two numbers $a$, $b$             | General ($x_1, \dots, x_n$)                                                                  |
|-----------------------|----------------------------------|----------------------------------------------------------------------------------------------|
| Mode                  | —                                | value with the highest frequency                                                             |
| Median (central value)| $\dfrac{a+b}{2}$                 | if sorted: middle value $x_{\frac{n+1}{2}}$ (odd $n$), else $\frac{1}{2}\left(x_{\frac{n}{2}} + x_{\frac{n}{2}+1}\right)$ |
| Arithmetic mean       | $\dfrac{a+b}{2}$                 | $\bar{x} = \dfrac{1}{n}\sum_{i=1}^{n} x_i$                                                   |
| Geometric mean        | $\sqrt{ab}$                      | $\sqrt[n]{x_1 x_2 \cdots x_n}$                                                               |
| Harmonic mean         | $\dfrac{2ab}{a+b}$               | $\dfrac{n}{\sum_{i=1}^{n} \frac{1}{x_i}}$                                                    |
| Quadratic mean        | $\sqrt{\dfrac{a^2 + b^2}{2}}$    | $\sqrt{\dfrac{1}{n}\sum_{i=1}^{n} x_i^2}$                                                    |
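The two-number formulas can be evaluated and cross-checked against the standard library implementations of the general formulas; the values $a = 4$, $b = 9$ are arbitrary:

```python
from math import sqrt
from statistics import geometric_mean, harmonic_mean

a, b = 4.0, 9.0

arithmetic = (a + b) / 2             # (a+b)/2
geometric = sqrt(a * b)              # sqrt(a*b)
harmonic = 2 * a * b / (a + b)       # 2ab/(a+b)
quadratic = sqrt((a**2 + b**2) / 2)  # root mean square

print(arithmetic, geometric, round(harmonic, 4), round(quadratic, 4))

# Cross-check the two-number formulas against the general implementations
assert abs(geometric - geometric_mean([a, b])) < 1e-9
assert abs(harmonic - harmonic_mean([a, b])) < 1e-9
```

Note the classical ordering harmonic ≤ geometric ≤ arithmetic ≤ quadratic, visible in the output.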
Parameters

In general, unknown population or model parameters in statistics are denoted by Greek letters (e.g. $\mu$, $\sigma$, $\pi$).

- The arithmetic mean in the population: $\mu$.
- The variance in the population: $\sigma^2$.
- The proportion of a dichotomous variable in the population: $\pi$.
- The intercept $\alpha$ and the slope $\beta$ in the simple linear regression model.

An estimator for an unknown parameter is often denoted by the capital version of the corresponding symbol from descriptive statistics, e.g. $\bar{X}$ as an estimator for $\mu$; the estimator is a function of the sample variables $X_1, \dots, X_n$.