Collection of formulas: stochastics

from Wikipedia, the free encyclopedia

This is a collection of formulas for the mathematical field of stochastics, including probability theory, combinatorics, random variables and distributions, and statistics.

Notation

In stochastics, in addition to the usual mathematical notation and mathematical symbols, there are the following frequently used conventions:

  • Random variables are written in uppercase letters: $X$, $Y$, etc.
  • Realizations of a random variable are written with the corresponding lowercase letters, e.g. $x_1, x_2, \dots, x_n$ for the observations in a sample.
  • Lowercase letters denote probability functions and probability densities, e.g. $f(x)$.
  • Capital letters denote distribution functions, e.g. $F(x)$.
    • In particular, $\varphi$ denotes the probability density of the standard normal distribution and $\Phi$ its distribution function.
  • Greek letters (e.g. $\theta$, $\mu$, $\sigma$) denote unknown parameters (population parameters).
  • An estimator is often marked with a circumflex above the corresponding symbol, e.g. $\hat{\theta}$ (read: "theta hat").
  • The arithmetic mean is denoted by $\bar{x}$ (read: "x bar").

Probability calculus

In the following, a probability space $(\Omega, \Sigma, P)$ is always given. Here the sample space $\Omega$ is an arbitrary nonempty set, $\Sigma$ is a σ-algebra of subsets of $\Omega$ that contains $\Omega$, and $P$ is a probability measure on $\Sigma$.

Basics

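Axioms: Every event $A$ is assigned a probability $P(A)$ such that:

$P(A) \ge 0$,
$P(\Omega) = 1$,
$P\left(\bigcup_{i} A_i\right) = \sum_{i} P(A_i)$ holds for pairwise disjoint events $A_1, A_2, \dots$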

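Calculation rules: The axioms imply:

For $A \subseteq B$: $P(B \setminus A) = P(B) - P(A)$, in particular $P(A) \le P(B)$.
For the complementary event $\bar{A} = \Omega \setminus A$: $P(\bar{A}) = 1 - P(A)$.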

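Laplace experiments (all outcomes equally likely): $P(A) = \frac{|A|}{|\Omega|}$, i.e. the number of favorable outcomes divided by the number of possible outcomes.

Conditional probability: $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$ for $P(B) > 0$

Bayes' theorem: $P(A \mid B) = \frac{P(B \mid A) \, P(A)}{P(B)}$

Independence:

Two events $A$ and $B$ are independent if and only if $P(A \cap B) = P(A) \, P(B)$.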
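
As a small illustration of Bayes' theorem, the following sketch computes $P(A \mid B)$ from $P(A)$, $P(B \mid A)$ and $P(B \mid \bar{A})$, using the law of total probability for $P(B)$. The function name, its parameter names and the numbers (a 1 % prior, a 95 % hit rate, a 5 % false-positive rate) are assumptions chosen only for this example.

```python
def bayes_posterior(prior: float, likelihood: float, false_positive_rate: float) -> float:
    """Return P(A | B) given P(A), P(B | A) and P(B | not A)."""
    # Law of total probability: P(B) = P(B|A) P(A) + P(B|not A) P(not A)
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    # Bayes' theorem: P(A|B) = P(B|A) P(A) / P(B)
    return likelihood * prior / evidence


# Hypothetical numbers, for illustration only.
posterior = bayes_posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)
print(round(posterior, 3))  # roughly 0.161
```

Even with a high hit rate, the posterior stays small here because the prior $P(A)$ is small.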

Combinatorics

Factorial: number of ways to draw all $n$ balls from an urn (without replacement):

$n! = 1 \cdot 2 \cdot 3 \cdots n$, where $0! = 1$

Permutation:
  • without repetition (of $n$ elements): $n!$
  • with repetition (of $r + s + \dots + t = n$ elements, of which $r$, $s$, …, $t$ are respectively indistinguishable): $\frac{n!}{r! \, s! \cdots t!}$

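Binomial coefficient "$n$ choose $k$":

$\binom{n}{k} = \frac{n!}{k! \, (n-k)!}$

Number of ways to draw $k$ balls from an urn containing $n$ balls:

  • without repetition (without replacement; see hypergeometric distribution)
  • with repetition (with replacement; see binomial distribution)

Variation (order matters): $\frac{n!}{(n-k)!}$ without repetition, $n^k$ with repetition
Combination (order does not matter): $\binom{n}{k}$ without repetition, $\binom{n+k-1}{k}$ with repetition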
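
The counting formulas of this section can be checked numerically. The sketch below uses Python's standard `math` module; the values $n = 5$, $k = 3$ and the word "MISSISSIPPI" are arbitrary choices for this example.

```python
import math

n, k = 5, 3

permutations = math.factorial(n)            # n! orderings of n distinct elements
variations_no_rep = math.perm(n, k)         # n! / (n - k)!, ordered, without replacement
variations_rep = n ** k                     # ordered, with replacement
combinations_no_rep = math.comb(n, k)       # binomial coefficient "n choose k"
combinations_rep = math.comb(n + k - 1, k)  # unordered, with replacement

# Permutation with repetition: orderings of the letters of "MISSISSIPPI"
# (11 letters; S and I occur 4 times each, P twice, M once).
word_orderings = math.factorial(11) // (
    math.factorial(4) * math.factorial(4) * math.factorial(2) * math.factorial(1)
)

print(permutations, variations_no_rep, variations_rep,
      combinations_no_rep, combinations_rep, word_orderings)
# 120 60 125 10 35 34650
```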

Random variables

Discrete random variables

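A function $f$ is called a probability function of a discrete random variable if the following properties are met:

  1. For all $x \in \mathbb{R}$: $f(x) \ge 0$
  2. $\sum_{i} f(x_i) = 1$, where the sum runs over the at most countably many values $x_i$ with $f(x_i) > 0$

The following then holds for the associated random variable $X$:

$P(X = x_i) = f(x_i)$

A random variable $X$ and its distribution are called discrete if the function $f(x) = P(X = x)$ has property (2). $f$ is then called the probability function of $X$.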

Continuous random variables

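A function $f$ is called the density (function) of a continuous random variable if the following properties are met:

  1. For all $x \in \mathbb{R}$: $f(x) \ge 0$
  2. $\int_{-\infty}^{\infty} f(x) \, dx = 1$

The following then holds for a continuous random variable $X$:

$P(a \le X \le b) = \int_{a}^{b} f(x) \, dx$

A random variable $X$ and its distribution are called continuous if there is a suitable density function $f$ with this property. The function $f$ is called the density (function) of $X$.

The probability of any single value is zero:

$P(X = x) = 0$ for all $x \in \mathbb{R}$

Expected value and variance are given by

$E(X) = \int_{-\infty}^{\infty} x \, f(x) \, dx$, $\operatorname{Var}(X) = \int_{-\infty}^{\infty} (x - E(X))^2 f(x) \, dx$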

Expected value, variance, covariance, correlation

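For the expected value $E(X)$, the variance $\operatorname{Var}(X)$, the covariance $\operatorname{Cov}(X, Y)$ and the correlation $\rho(X, Y)$:

$\operatorname{Var}(X) = E\left((X - E(X))^2\right) = E(X^2) - E(X)^2$
$\operatorname{Cov}(X, Y) = E\left((X - E(X))(Y - E(Y))\right) = E(XY) - E(X) E(Y)$
$\rho(X, Y) = \frac{\operatorname{Cov}(X, Y)}{\sqrt{\operatorname{Var}(X) \operatorname{Var}(Y)}}$
$E(X + Y) = E(X) + E(Y)$, in general $E\left(\sum_{i} a_i X_i\right) = \sum_{i} a_i E(X_i)$
For independent random variables $X$, $Y$: $E(XY) = E(X) E(Y)$
For independent random variables $X_1, \dots, X_n$: $\operatorname{Var}\left(\sum_{i} X_i\right) = \sum_{i} \operatorname{Var}(X_i)$

Chebyshev inequality: $P(|X - E(X)| \ge k) \le \frac{\operatorname{Var}(X)}{k^2}$ for every $k > 0$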
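
The Chebyshev inequality $P(|X - E(X)| \ge k) \le \operatorname{Var}(X) / k^2$ can be illustrated with a fair six-sided die. The die and the choice $k = 2$ are assumptions made for this sketch; exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction

outcomes = range(1, 7)   # faces of a fair die
p = Fraction(1, 6)       # probability of each face

mean = sum(x * p for x in outcomes)                    # E(X) = 7/2
variance = sum((x - mean) ** 2 * p for x in outcomes)  # Var(X) = 35/12

k = 2
# Exact probability of deviating from the mean by at least k (faces 1 and 6).
exact = sum(p for x in outcomes if abs(x - mean) >= k)
# Chebyshev bound Var(X) / k^2.
bound = variance / k ** 2

print(mean, variance, exact, bound)  # 7/2 35/12 1/3 35/48
```

The bound (35/48) is far from the exact value (1/3), which is typical: the inequality holds for every distribution with finite variance and is therefore often loose.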

Special distributions

Binomial distribution

An $n$-stage Bernoulli experiment is given (i.e. the same experiment is repeated $n$ times independently, with only two possible outcomes and constant probabilities), with success probability $p$ and failure probability $q = 1 - p$. The probability distribution of the random variable $X$: number of successes is called the binomial distribution.

The probability of exactly $k$ successes is given by the formula:

$P(X = k) = \binom{n}{k} p^k q^{n-k}$

Expected value: $\mu = E(X) = n p$

Variance: $\sigma^2 = n p q$

Standard deviation: $\sigma = \sqrt{n p q}$
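
A minimal sketch of the binomial formulas, assuming $n = 10$ and $p = 0.5$ as example values. It evaluates $P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$ and compares the mean and variance computed from the probabilities with $np$ and $np(1-p)$.

```python
from math import comb, sqrt


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for a binomial distribution with parameters n and p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


n, p = 10, 0.5  # example values, chosen for illustration
probabilities = [binomial_pmf(k, n, p) for k in range(n + 1)]

# Mean and variance computed directly from the distribution.
mean = sum(k * prob for k, prob in enumerate(probabilities))
variance = sum((k - mean) ** 2 * prob for k, prob in enumerate(probabilities))

print(binomial_pmf(3, n, p))   # 120 / 1024 = 0.1171875
print(mean, n * p)             # both close to 5.0
print(variance, n * p * (1 - p), sqrt(variance))
```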

σ-rules

(Probabilities of neighborhoods of the expected value for binomial distributions) The following correspondences hold between the radius of a neighborhood around the expected value and the probability of that neighborhood (rule of thumb: if $\sigma > 3$):

Radius of the neighborhood    Probability of the neighborhood
1σ                            0.68
2σ                            0.955
3σ                            0.997

Probability of the neighborhood    Radius of the neighborhood
0.90                               1.64σ
0.95                               1.96σ
0.99                               2.58σ
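
The tabulated values agree, up to rounding, with the standard normal distribution, which is the usual approximation behind these rules. The sketch below recomputes them with `statistics.NormalDist` from the Python standard library; the helper names are chosen for this example.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def neighborhood_probability(radius_in_sigma: float) -> float:
    """P(|Z| <= r) for a standard normal variable Z."""
    return 2.0 * std_normal.cdf(radius_in_sigma) - 1.0


def neighborhood_radius(probability: float) -> float:
    """Radius r (in units of sigma) with P(|Z| <= r) = probability."""
    return std_normal.inv_cdf((1.0 + probability) / 2.0)


for r in (1, 2, 3):
    print(r, neighborhood_probability(r))
for prob in (0.90, 0.95, 0.99):
    print(prob, neighborhood_radius(prob))
```

For a concrete binomial distribution the true probabilities deviate slightly from these values, because the distribution is discrete.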

Standardizing a distribution

If the random variable $X$ has a distribution with expected value $\mu$ and standard deviation $\sigma$, then the standardized variable $Z$ is defined by

$Z = \frac{X - \mu}{\sigma}$

The standardized variable has the expected value 0 and the standard deviation 1.

Poisson approximation

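Given a binomial distribution with a large sample size $n \ge 100$ and a small success probability $p$. With $\mu = n p$, the probability of exactly $k$ successes can then be approximated by:

$P(X = k) \approx \frac{\mu^k}{k!} e^{-\mu}$

The relationships can be summarized recursively as follows:

$P(X = 0) \approx e^{-\mu}$, $P(X = k) \approx \frac{\mu}{k} \, P(X = k - 1)$ for $k \ge 1$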
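
To see how close the Poisson approximation $\frac{\mu^k}{k!} e^{-\mu}$ with $\mu = np$ is to the binomial probabilities, the following sketch compares both for $n = 100$ and $p = 0.02$. These values are assumptions chosen for illustration.

```python
from math import comb, exp, factorial


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Exact binomial probability P(X = k)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def poisson_pmf(k: int, mu: float) -> float:
    """Poisson probability mu^k / k! * exp(-mu)."""
    return mu ** k / factorial(k) * exp(-mu)


n, p = 100, 0.02  # large n, small p (example values)
mu = n * p

for k in range(6):
    exact = binomial_pmf(k, n, p)
    approx = poisson_pmf(k, mu)
    print(k, round(exact, 4), round(approx, 4))
```

The Poisson values also satisfy the recursion $P(X = k) = \frac{\mu}{k} P(X = k-1)$, which is convenient for computing a whole table by hand.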

Poisson distribution

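If the distribution of a random variable $X$ satisfies

$P(X = k) = \frac{\mu^k}{k!} e^{-\mu}$ for $k = 0, 1, 2, \dots$

then $X$ is called Poisson distributed with parameter $\mu$.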

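Approximation formulas of de Moivre and Laplace

Let $X$ be a binomially distributed random variable with $\sigma > 3$ (rule of thumb for a usable approximation; larger $\sigma$ gives a better one). With $\mu = n p$ and $\sigma = \sqrt{n p q}$, the probabilities of exactly $k$ and of at most $k$ successes can be approximated by:

$P(X = k) \approx \frac{1}{\sigma} \, \varphi\left(\frac{k - \mu}{\sigma}\right)$
$P(X \le k) \approx \Phi\left(\frac{k + 0.5 - \mu}{\sigma}\right)$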

Standard normal distribution

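The density (function) $\varphi$ of the standard normal distribution (also known as the bell curve) is defined by:

$\varphi(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$

and the distribution function $\Phi$ by:

$\Phi(x) = \int_{-\infty}^{x} \varphi(t) \, dt$

Approximation formula for a discrete distribution with expected value $\mu$ and standard deviation $\sigma$, using the continuity correction:

$P(a \le X \le b) \approx \Phi\left(\frac{b + 0.5 - \mu}{\sigma}\right) - \Phi\left(\frac{a - 0.5 - \mu}{\sigma}\right)$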
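
The normal approximation of a binomial distribution with continuity correction can be compared with the exact values. The sketch assumes $n = 100$ and $p = 0.5$ (so $\mu = 50$, $\sigma = 5$); the function names are chosen for this example.

```python
from math import comb, sqrt
from statistics import NormalDist

std_normal = NormalDist()


def binomial_cdf(k: int, n: int, p: float) -> float:
    """Exact P(X <= k) for a binomial distribution."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))


def normal_approx_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) approximated by Phi((k + 0.5 - mu) / sigma)."""
    mu, sigma = n * p, sqrt(n * p * (1 - p))
    return std_normal.cdf((k + 0.5 - mu) / sigma)


def normal_approx_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) approximated by phi((k - mu) / sigma) / sigma."""
    mu, sigma = n * p, sqrt(n * p * (1 - p))
    return std_normal.pdf((k - mu) / sigma) / sigma


n, p = 100, 0.5  # example values
print(binomial_cdf(55, n, p), normal_approx_cdf(55, n, p))
print(comb(n, 50) * p ** n, normal_approx_pmf(50, n, p))
```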

Hypergeometric distribution

In a population of size $N$, two characteristic values are represented, with $M$ and $N - M$ elements respectively. A sample of size $n$ is drawn. Then the distribution of the random variable $X$: number of elements with the first characteristic value in the sample is called the hypergeometric distribution.

The probability that the sample contains exactly $k$ elements with the first characteristic value is:

$P(X = k) = \frac{\binom{M}{k} \binom{N-M}{n-k}}{\binom{N}{n}}$

$N$ = number of elements, $M$ = number of positive elements, $n$ = number of draws, $k$ = number of successes.

Let $p = \frac{M}{N}$ be the proportion with which the first characteristic value occurs in the population. Then:

$E(X) = n p$, $\operatorname{Var}(X) = n p (1 - p) \frac{N - n}{N - 1}$
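
As a worked example of the hypergeometric formula $P(X = k) = \binom{M}{k}\binom{N-M}{n-k} / \binom{N}{n}$, the sketch below uses a "6 out of 49" lottery ($N = 49$, $M = 6$ winning numbers, $n = 6$ draws). The lottery setting is an assumption for illustration; exact fractions avoid rounding.

```python
from fractions import Fraction
from math import comb


def hypergeometric_pmf(k: int, N: int, M: int, n: int) -> Fraction:
    """P(X = k): exactly k of the M marked elements in a sample of size n."""
    return Fraction(comb(M, k) * comb(N - M, n - k), comb(N, n))


N, M, n = 49, 6, 6  # example: 6 winning numbers among 49, 6 numbers drawn
pmf = [hypergeometric_pmf(k, N, M, n) for k in range(n + 1)]

# Mean and variance computed from the distribution itself.
mean = sum(k * prob for k, prob in enumerate(pmf))
variance = sum((k - mean) ** 2 * prob for k, prob in enumerate(pmf))

p = Fraction(M, N)
print(float(pmf[3]))                                   # chance of exactly 3 matches
print(mean == n * p)                                   # True
print(variance == n * p * (1 - p) * Fraction(N - n, N - 1))  # True
```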

Geometric distribution

A Bernoulli experiment with success probability $p$ is given. The distribution of the random variable $X$: number of trials up to and including the first success is called the geometric distribution. The following holds:

$P(X = k) = (1 - p)^{k-1} p$ (success exactly on the $k$-th trial)
$P(X > k) = (1 - p)^k$ ($k$ failures in a row, i.e. the first success comes only after the $k$-th trial)
$P(X \le k) = 1 - (1 - p)^k$ (success on the $k$-th trial at the latest, i.e. at least one success within the first $k$ trials)

The expected value is $E(X) = \frac{1}{p}$.
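
A short sketch of the geometric distribution, assuming the example of waiting for the first six with a fair die ($p = 1/6$). It evaluates $P(X = k) = (1-p)^{k-1} p$, $P(X > k) = (1-p)^k$ and $P(X \le k) = 1 - (1-p)^k$, and approximates the expected value $1/p$ by a partial sum.

```python
from fractions import Fraction


def geometric_pmf(k: int, p: Fraction) -> Fraction:
    """P(X = k): first success exactly on the k-th trial."""
    return (1 - p) ** (k - 1) * p


def geometric_tail(k: int, p: Fraction) -> Fraction:
    """P(X > k): the first k trials are all failures."""
    return (1 - p) ** k


def geometric_cdf(k: int, p: Fraction) -> Fraction:
    """P(X <= k): at least one success within the first k trials."""
    return 1 - (1 - p) ** k


p = Fraction(1, 6)  # example: rolling a six with a fair die
print(geometric_pmf(3, p), geometric_tail(3, p), geometric_cdf(3, p))
# 25/216 125/216 91/216

# Partial sum of k * P(X = k); it approaches E(X) = 1/p = 6.
partial_mean = float(sum(k * geometric_pmf(k, p) for k in range(1, 501)))
print(partial_mean)
```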

Further distributions

The many other special distributions cannot all be listed here; see the list of univariate probability distributions.

Approximations of distributions

Under certain approximation conditions, distributions can be approximated by one another in order to simplify calculations. Depending on the textbook, the approximation conditions may differ slightly.

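Approximated distribution (from) and approximating distributions (to):

Discrete distributions
  • Binomial distribution: to the Poisson distribution, to the normal distribution
  • Hypergeometric distribution: to the binomial distribution, to the Poisson distribution, to the normal distribution
  • Poisson distribution: to the normal distribution

Continuous distributions
  • Chi-square distribution: to the normal distribution
  • Student's t-distribution: to the normal distribution
  • Normal distribution: -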

When a discrete distribution of $X$ is approximated by a continuous distribution of $Y$, a continuity correction should also be considered; in particular, $P(X = k) \approx P(k - 0.5 \le Y \le k + 0.5)$.

Critical values

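The $\alpha$-level (the $\alpha$-quantile) is the value $x_\alpha$ of a probability distribution for which $P(X \le x_\alpha) = \alpha$. There is a standard notation for some commonly used distributions:

  • $z_\alpha$ or $z(\alpha)$ for the standard normal distribution
  • $t_{\alpha; n}$ or $t(\alpha; n)$ for the t-distribution with $n$ degrees of freedom
  • $\chi^2_{\alpha; n}$ or $\chi^2(\alpha; n)$ for the chi-square distribution with $n$ degrees of freedom
  • $F_{\alpha; m; n}$ or $F(\alpha; m; n)$ for the F-distribution with $m$ and $n$ degrees of freedom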

Statistics

Descriptive statistics

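Measures of location

Arithmetic mean: $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$

Median: for sorted values $x_{(1)} \le \dots \le x_{(n)}$, $\tilde{x} = x_{((n+1)/2)}$ if $n$ is odd and $\tilde{x} = \frac{1}{2}\left(x_{(n/2)} + x_{(n/2+1)}\right)$ if $n$ is even

Mode: the value with the highest frequency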

Measures of dispersion

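Empirical variance: $s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$

Empirical standard deviation: $s = \sqrt{s^2}$

Measures of association

Empirical covariance: $s_{xy} = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$

Empirical correlation coefficient: $r_{xy} = \frac{s_{xy}}{s_x \, s_y}$, where $s_x$ and $s_y$ are the empirical standard deviations of the $x_i$ and the $y_i$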

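Equation of the regression line of a simple linear regression: $\hat{y} = a + b x$ with

$b = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$, $a = \bar{y} - b \bar{x}$

where $\bar{x}$ and $\bar{y}$ denote the arithmetic means.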
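
The regression line and the correlation coefficient can be computed directly from their defining sums. The following sketch uses four made-up data points; the function names and the data are assumptions for illustration.

```python
from math import sqrt
from statistics import fmean


def regression_line(xs, ys):
    """Return intercept a and slope b of the least-squares line y = a + b x."""
    x_bar, y_bar = fmean(xs), fmean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def correlation(xs, ys):
    """Empirical correlation coefficient of paired data."""
    x_bar, y_bar = fmean(xs), fmean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


xs = [1, 2, 3, 4]  # made-up example data
ys = [2, 4, 5, 8]
a, b = regression_line(xs, ys)
r = correlation(xs, ys)
print(a, b, r)  # intercept close to 0, slope close to 1.9, r close to 0.98
```

The common factor $\frac{1}{n-1}$ cancels in both the slope and the correlation coefficient, so the plain sums are sufficient.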

Mean values

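Each mean is given for two numbers $a$, $b$ and in general for $x_1, \dots, x_n$:

  • Mode: the value with the highest frequency
  • Median: for sorted values, the middle value (odd $n$) or the arithmetic mean of the two middle values (even $n$)
  • Arithmetic mean: $\frac{a + b}{2}$; in general $\frac{1}{n} \sum_{i=1}^{n} x_i$
  • Geometric mean: $\sqrt{a b}$; in general $\sqrt[n]{x_1 x_2 \cdots x_n}$
  • Harmonic mean: $\frac{2}{\frac{1}{a} + \frac{1}{b}}$; in general $\frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}}$
  • Quadratic mean: $\sqrt{\frac{a^2 + b^2}{2}}$; in general $\sqrt{\frac{1}{n} \sum_{i=1}^{n} x_i^2}$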
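
The different means can be compared numerically with Python's `statistics` module. The two numbers 2 and 8 are an arbitrary example; `quadratic_mean` is a helper written for this sketch, since the standard library has no function of that name.

```python
from math import sqrt
from statistics import fmean, geometric_mean, harmonic_mean, median


def quadratic_mean(values):
    """Root of the arithmetic mean of the squares."""
    return sqrt(fmean([v * v for v in values]))


data = [2, 8]  # example values

arithmetic = fmean(data)          # (2 + 8) / 2 = 5
geometric = geometric_mean(data)  # sqrt(2 * 8) = 4
harmonic = harmonic_mean(data)    # 2 / (1/2 + 1/8) = 3.2
quadratic = quadratic_mean(data)  # sqrt((4 + 64) / 2) = sqrt(34)

print(harmonic, geometric, arithmetic, quadratic, median(data))
```

For positive values the means are ordered as harmonic ≤ geometric ≤ arithmetic ≤ quadratic, which the example reproduces.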

Inferential statistics

Parameters

In general, unknown population or model parameters in statistics are denoted by Greek letters (e.g. $\theta$).

  • The arithmetic mean in the population: $\mu$.
  • The variance in the population: $\sigma^2$.
  • The proportion of a dichotomous variable in the population: $\pi$.
  • The intercept $\beta_0$ and the slope $\beta_1$ in the simple linear regression model $Y = \beta_0 + \beta_1 x + \varepsilon$.

Estimators

An estimator for an unknown parameter is often denoted by the capital letter of the corresponding symbol from descriptive statistics. The estimator is a function of the sample variables $X_1, \dots, X_n$.

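Parameter, condition, estimator and distribution:

Parameter $\mu$, estimator $\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$:
  1. If $X_i \sim N(\mu, \sigma^2)$, then $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$.
  2. If the central limit theorem holds, then approximately $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$.

Parameter $\sigma^2$ (for normally distributed $X_i$):
  • $\mu$ known: estimator $S^{*2} = \frac{1}{n} \sum_{i=1}^{n} (X_i - \mu)^2$ with $\frac{n S^{*2}}{\sigma^2} \sim \chi^2_n$
  • $\mu$ unknown: estimator $S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$ with $\frac{(n-1) S^2}{\sigma^2} \sim \chi^2_{n-1}$

Parameter $\pi$, estimator $\Pi = \frac{X}{n}$ with $X$ the number of successes in the sample:
  1. Drawing with replacement: $X \sim B(n, \pi)$
  2. Drawing without replacement: $X$ is hypergeometrically distributed with parameters $N$, $M$ and $n$, with $M = \pi N$ and $N$ the size of the population.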

, If so , then follows

Point estimates and confidence intervals

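Parameter, point estimator and confidence interval at confidence level $1 - \alpha$:

Parameter $\mu$, point estimator $\bar{x}$:
  1. If $\sigma$ is known: $\bar{x} - z_{1-\alpha/2} \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{1-\alpha/2} \frac{\sigma}{\sqrt{n}}$
  2. If $\sigma$ is unknown: $\bar{x} - t_{1-\alpha/2; n-1} \frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{1-\alpha/2; n-1} \frac{s}{\sqrt{n}}$

Parameter $\pi$, point estimator $p$ (proportion in the sample):
  1. Drawing with replacement: if $n$ is sufficiently large, then approximately $p - z_{1-\alpha/2} \sqrt{\frac{\pi (1 - \pi)}{n}} \le \pi \le p + z_{1-\alpha/2} \sqrt{\frac{\pi (1 - \pi)}{n}}$
  2. Drawing without replacement: if $n$ is sufficiently large, then approximately $p - z_{1-\alpha/2} \sqrt{\frac{\pi (1 - \pi)}{n} \cdot \frac{N - n}{N - 1}} \le \pi \le p + z_{1-\alpha/2} \sqrt{\frac{\pi (1 - \pi)}{n} \cdot \frac{N - n}{N - 1}}$

When an estimation interval is computed from a sample, $\pi$ in 1. and 2. is replaced by $p$.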
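
For the case of a known standard deviation, the confidence interval for the mean is $\bar{x} \pm z_{1-\alpha/2} \, \sigma / \sqrt{n}$. The sketch below computes it with `statistics.NormalDist`; the function name and the numbers ($\bar{x} = 50$, $\sigma = 10$, $n = 100$) are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist


def mean_confidence_interval(sample_mean: float, sigma: float, n: int,
                             confidence: float = 0.95) -> tuple:
    """Confidence interval for the mean when sigma is known."""
    alpha = 1.0 - confidence
    # Critical value z_{1 - alpha/2} of the standard normal distribution.
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z * sigma / sqrt(n)
    return sample_mean - half_width, sample_mean + half_width


low, high = mean_confidence_interval(sample_mean=50.0, sigma=10.0, n=100)
print(round(low, 2), round(high, 2))  # 48.04 51.96
```

If the standard deviation is unknown, the critical value of the t-distribution with $n - 1$ degrees of freedom is needed instead; the Python standard library does not provide it, so that case is left out here.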

References

  1. Yates, F. (1934). Contingency Tables Involving Small Numbers and the χ² Test. Supplement to the Journal of the Royal Statistical Society 1 (2): 217-235.
