Shift theorem (statistics)

from Wikipedia, the free encyclopedia

The shift theorem (also called Steiner's theorem or Steiner's shift theorem) is a calculation rule for determining the sum of squared deviations or the empirical variance.

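In short, it says that for $n$ numbers $x_1, \dots, x_n$ and their arithmetic mean $\bar{x}$:

$$\sum_{i=1}^{n} (x_i - \bar{x})^2 = \left( \sum_{i=1}^{n} x_i^2 \right) - n \bar{x}^2.$$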

However, using this formula with floating-point numbers can lead to numerical cancellation if $\bar{x}^2$ is significantly greater than the variance, i.e. if the data are not centered. The formula is therefore suited primarily to analytical considerations, not to computation with real data. One possible remedy is to determine an approximation $\tilde{x}$ of the mean in advance and to calculate with it:

$$\sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} (x_i - \tilde{x})^2 - \frac{1}{n} \left( \sum_{i=1}^{n} (x_i - \tilde{x}) \right)^2.$$

If the approximation $\tilde{x}$ is close enough to the true mean $\bar{x}$, this formula gives good accuracy. Other, numerically more stable calculation methods can be found in the literature.
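The cancellation effect described above can be illustrated with a minimal Python sketch. The function names and the four data values are made up for illustration and do not come from the article; the data are deliberately far from zero so that the mean is large compared with the spread.

```python
def sum_sq_dev_naive(xs):
    """Shift theorem as stated: sum(x_i^2) - n * mean^2."""
    n = len(xs)
    mean = sum(xs) / n
    return sum(x * x for x in xs) - n * mean * mean


def sum_sq_dev_shifted(xs, approx_mean):
    """Shifted variant: apply the same rule to deviations from an approximate mean."""
    n = len(xs)
    ys = [x - approx_mean for x in xs]
    return sum(y * y for y in ys) - sum(ys) ** 2 / n


def sum_sq_dev_two_pass(xs):
    """Reference: compute the mean first, then sum the squared deviations."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs)


# Hypothetical, uncentered data: the exact sum of squared deviations is 90.
data = [1e9 + 4.0, 1e9 + 7.0, 1e9 + 13.0, 1e9 + 16.0]

print(sum_sq_dev_two_pass(data))          # 90.0
print(sum_sq_dev_shifted(data, data[0]))  # 90.0, using the first value as the approximation
print(sum_sq_dev_naive(data))             # far from 90: the squares near 1e18 lose the low-order digits
```

Using the first observation as the approximation is only one simple choice; any value near the true mean should serve the same purpose.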

Explanation for the case of a finite sequence of numbers: The sample mean

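The shift theorem is first demonstrated in the simplest case: let the values $x_1, \dots, x_n$ be given, for example a sample. The sum of the squared deviations of these values is formed:

$$\sum_{i=1}^{n} (x_i - \bar{x})^2,$$

in which

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$

is the arithmetic mean of the numbers. The shift theorem results from

$$\sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} \left( x_i^2 - 2 x_i \bar{x} + \bar{x}^2 \right) = \sum_{i=1}^{n} x_i^2 - 2 \bar{x} \sum_{i=1}^{n} x_i + n \bar{x}^2 = \sum_{i=1}^{n} x_i^2 - 2 n \bar{x}^2 + n \bar{x}^2 = \sum_{i=1}^{n} x_i^2 - n \bar{x}^2.$$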

Example

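Coffee packets are weighed continuously as part of quality assurance. For the first four packets, the values $x_1, x_2, x_3, x_4$ (in g) were obtained.

The average weight is

$$\bar{x} = \frac{1}{4} \sum_{i=1}^{4} x_i.$$

The sum of squared deviations is

$$\sum_{i=1}^{4} (x_i - \bar{x})^2.$$

To apply the shift theorem, one calculates

$$\sum_{i=1}^{4} x_i$$

and

$$\sum_{i=1}^{4} x_i^2.$$

This can be used, for example, to determine the (corrected) empirical variance as an "average" squared deviation:

$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{1}{n-1} \left( \sum_{i=1}^{n} x_i^2 - \frac{1}{n} \left( \sum_{i=1}^{n} x_i \right)^2 \right),$$

in the example with $n = 4$.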

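If another packet is added to the sample, recalculating the sample variance with the aid of the shift theorem only requires updating the values of $\sum x_i$ and $\sum x_i^2$. The fifth packet weighs 510 g. Then:

$$\sum_{i=1}^{5} x_i = \sum_{i=1}^{4} x_i + 510$$

and

$$\sum_{i=1}^{5} x_i^2 = \sum_{i=1}^{4} x_i^2 + 510^2.$$

The sample variance of the new, larger sample is then

$$s^2 = \frac{1}{4} \left( \sum_{i=1}^{5} x_i^2 - \frac{1}{5} \left( \sum_{i=1}^{5} x_i \right)^2 \right).$$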
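The update step can be sketched in Python as a small accumulator that stores only the count, the sum and the sum of squares. The class name `RunningVariance` and the first four weights are hypothetical, chosen only for illustration; the fifth weight of 510 g is the one given in the text.

```python
class RunningVariance:
    """Keeps n, sum(x) and sum(x^2) so that the variance can be updated cheaply."""

    def __init__(self):
        self.n = 0
        self.sum_x = 0.0
        self.sum_x2 = 0.0

    def add(self, x):
        # Adding an observation only touches the three running totals.
        self.n += 1
        self.sum_x += x
        self.sum_x2 += x * x

    def sum_sq_dev(self):
        # Shift theorem: sum (x_i - mean)^2 = sum x_i^2 - (sum x_i)^2 / n
        return self.sum_x2 - self.sum_x ** 2 / self.n

    def variance(self):
        # Corrected sample variance with divisor n - 1 (needs n >= 2).
        return self.sum_sq_dev() / (self.n - 1)


acc = RunningVariance()
for weight in [505, 500, 495, 505]:  # hypothetical weights in g, not from the article
    acc.add(weight)
var_four = acc.variance()   # about 22.92

acc.add(510)                # fifth packet from the text
var_five = acc.variance()   # 32.5
```

This sketch inherits the cancellation risk of the unshifted formula, so it is best read as an illustration of the bookkeeping rather than as a recommendation for large, uncentered data.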

Applications

Sample covariance

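The sum of the deviation products of two characteristics $x$ and $y$ is given by

$$\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}).$$

Here the shift theorem yields

$$\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) = \sum_{i=1}^{n} x_i y_i - n \bar{x} \bar{y}.$$

The corrected sample covariance is then calculated as the "average" deviation product

$$s_{xy} = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) = \frac{1}{n-1} \left( \sum_{i=1}^{n} x_i y_i - n \bar{x} \bar{y} \right).$$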
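As a quick check of the covariance form, the following Python sketch computes the sum of deviation products both directly and via the shift theorem. The function names and the data are illustrative assumptions; exact rational arithmetic (`fractions.Fraction`) is used so that the two results can be compared without rounding effects.

```python
from fractions import Fraction


def sum_dev_products_direct(xs, ys):
    """Sum of (x_i - mean_x) * (y_i - mean_y), computed from the definition."""
    n = len(xs)
    mean_x = Fraction(sum(xs), n)
    mean_y = Fraction(sum(ys), n)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))


def sum_dev_products_shift(xs, ys):
    """Same quantity via the shift theorem: sum(x_i * y_i) - n * mean_x * mean_y."""
    n = len(xs)
    mean_x = Fraction(sum(xs), n)
    mean_y = Fraction(sum(ys), n)
    return sum(x * y for x, y in zip(xs, ys)) - n * mean_x * mean_y


def sample_covariance(xs, ys):
    """Corrected sample covariance with divisor n - 1."""
    return sum_dev_products_shift(xs, ys) / (len(xs) - 1)


# Hypothetical integer data, chosen only for illustration.
xs = [1, 2, 3, 4]
ys = [2, 4, 6, 9]

print(sum_dev_products_direct(xs, ys))  # 23/2
print(sum_dev_products_shift(xs, ys))   # 23/2
print(sample_covariance(xs, ys))        # 23/6
```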

Random variable

Variance

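The variance of a random variable $X$

$$\operatorname{Var}(X) = \operatorname{E}\left( (X - \operatorname{E}(X))^2 \right)$$

can also be expressed with the shift theorem as

$$\operatorname{Var}(X) = \operatorname{E}(X^2) - (\operatorname{E}(X))^2.$$

This result is also called the König-Huygens theorem. It follows from the linearity of the expected value:

$$\operatorname{E}\left( (X - \operatorname{E}(X))^2 \right) = \operatorname{E}\left( X^2 - 2 X \operatorname{E}(X) + (\operatorname{E}(X))^2 \right) = \operatorname{E}(X^2) - 2 (\operatorname{E}(X))^2 + (\operatorname{E}(X))^2 = \operatorname{E}(X^2) - (\operatorname{E}(X))^2.$$

A more general form of the shift theorem is, for any constant $a$:

$$\operatorname{E}\left( (X - a)^2 \right) = \operatorname{Var}(X) + (\operatorname{E}(X) - a)^2.$$

  • For a discrete random variable $X$ with values $x_j$ and associated probabilities $p_j = P(X = x_j)$, one then obtains for $\operatorname{E}\left( (X - a)^2 \right)$
$$\sum_j (x_j - a)^2 p_j = \operatorname{Var}(X) + \left( \sum_j x_j p_j - a \right)^2.$$
With the special choice $a = 0$ and $\operatorname{E}(X) = \sum_j x_j p_j$, the above formula results:
$$\operatorname{Var}(X) = \sum_j x_j^2 p_j - \left( \sum_j x_j p_j \right)^2.$$
  • For a continuous random variable $X$ with density function $f$, the variance is
$$\operatorname{Var}(X) = \int_{-\infty}^{\infty} (x - \operatorname{E}(X))^2 f(x) \, dx.$$
The shift theorem gives here
$$\operatorname{Var}(X) = \int_{-\infty}^{\infty} x^2 f(x) \, dx - (\operatorname{E}(X))^2.$$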
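For the discrete case, a fair six-sided die makes a convenient worked example. The sketch below is illustrative only: the helper `expectation` and the choice of the die are assumptions, not part of the article. Exact fractions are used so that both sides of each identity can be compared exactly.

```python
from fractions import Fraction


def expectation(values, probs, g=lambda x: x):
    """E(g(X)) for a discrete random variable with the given values and probabilities."""
    return sum(g(x) * p for x, p in zip(values, probs))


# Fair six-sided die (illustrative choice).
values = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6

mean = expectation(values, probs)                                # 7/2
var_definition = expectation(values, probs, lambda x: (x - mean) ** 2)
var_shift = expectation(values, probs, lambda x: x * x) - mean ** 2

print(var_definition)  # 35/12
print(var_shift)       # 35/12

# General form with an arbitrary constant a: E((X - a)^2) = Var(X) + (E(X) - a)^2
a = 2
lhs = expectation(values, probs, lambda x: (x - a) ** 2)
rhs = var_shift + (mean - a) ** 2
print(lhs, rhs)        # 31/6 31/6
```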

Covariance

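The covariance of two random variables $X$ and $Y$

$$\operatorname{Cov}(X, Y) = \operatorname{E}\left( (X - \operatorname{E}(X))(Y - \operatorname{E}(Y)) \right)$$

can be expressed with the shift theorem as

$$\operatorname{Cov}(X, Y) = \operatorname{E}(XY) - \operatorname{E}(X) \operatorname{E}(Y).$$

For discrete random variables, one obtains for

$$\operatorname{Cov}(X, Y) = \sum_j \sum_k (x_j - \operatorname{E}(X))(y_k - \operatorname{E}(Y)) \, f(x_j, y_k)$$

correspondingly

$$\operatorname{Cov}(X, Y) = \sum_j \sum_k x_j y_k \, f(x_j, y_k) - \operatorname{E}(X) \operatorname{E}(Y),$$

with $f(x_j, y_k)$ the joint probability that $X = x_j$ and $Y = y_k$.

In the case of continuous random variables, with $f(x, y)$ the joint density function of $X$ and $Y$ at the point $(x, y)$, the covariance

$$\operatorname{Cov}(X, Y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} (x - \operatorname{E}(X))(y - \operatorname{E}(Y)) \, f(x, y) \, dx \, dy$$

becomes correspondingly

$$\operatorname{Cov}(X, Y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x y \, f(x, y) \, dx \, dy - \operatorname{E}(X) \operatorname{E}(Y).$$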
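The discrete covariance formula can be checked on a small joint distribution. In the following Python sketch, the joint probability table `pmf` and the function names are hypothetical and serve only to illustrate that the definition and the shifted form agree.

```python
from fractions import Fraction

# Hypothetical joint probabilities f(x, y) for two 0/1-valued random variables.
pmf = {
    (0, 0): Fraction(2, 5),
    (0, 1): Fraction(1, 10),
    (1, 0): Fraction(1, 10),
    (1, 1): Fraction(2, 5),
}


def joint_expectation(pmf, g):
    """E(g(X, Y)) as a sum over all points of the joint distribution."""
    return sum(g(x, y) * p for (x, y), p in pmf.items())


mean_x = joint_expectation(pmf, lambda x, y: x)  # 1/2
mean_y = joint_expectation(pmf, lambda x, y: y)  # 1/2

# Definition: E((X - E(X)) * (Y - E(Y)))
cov_definition = joint_expectation(pmf, lambda x, y: (x - mean_x) * (y - mean_y))

# Shift theorem: E(XY) - E(X) * E(Y)
cov_shift = joint_expectation(pmf, lambda x, y: x * y) - mean_x * mean_y

print(cov_definition)  # 3/20
print(cov_shift)       # 3/20
```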

References

  1. Erich Schubert, Michael Gertz: Numerically stable parallel computation of (co-)variance. In: Proceedings of the 30th International Conference on Scientific and Statistical Database Management (SSDBM '18). ACM Press, Bozen-Bolzano, Italy 2018, ISBN 978-1-4503-6505-5, pp. 1-12, doi:10.1145/3221269.3223036 (acm.org, accessed December 7, 2019).
  2. Tony F. Chan, Gene H. Golub, Randall J. LeVeque: Algorithms for computing the sample variance: analysis and recommendations. In: The American Statistician, Vol. 37, No. 3 (Aug. 1983), pp. 242-247.
  3. Hans-Friedrich Eckey, Reinhold Kosfeld, Christian Dreger: Statistics: Principles - Methods - Examples, p. 86.
  4. Ansgar Steland: Basic knowledge statistics, p. 116.
