The shift theorem (also called Steiner's theorem or Steiner's shift theorem) is a calculation rule for determining the sum of squared deviations or the empirical variance. In short, it states that for numbers $x_1, \dotsc, x_n$ and their arithmetic mean $\overline{x}$:
$$SQ_x = \sum_{i=1}^{n} \left(x_i - \overline{x}\right)^2 = \left(\sum_{i=1}^{n} x_i^2\right) - n\overline{x}^2 = \left(\sum_{i=1}^{n} x_i^2\right) - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)^2\,.$$
However, evaluating this formula in floating-point arithmetic can lead to catastrophic cancellation if $\overline{x}^2$ is significantly larger than the variance, i.e. if the data are not centered around zero. The formula is therefore primarily used for analytical considerations, not for computation with real data. One possible remedy is to determine an approximation $\tilde{x} \approx \overline{x}$ of the mean in advance and compute:
$$SQ_x = \sum_{i=1}^{n} \left(x_i - \overline{x}\right)^2 = \sum_{i=1}^{n} (x_i - \tilde{x})^2 - \frac{1}{n}\left(\sum_{i=1}^{n} (x_i - \tilde{x})\right)^2\,.$$
If the approximation $\tilde{x}$ is close enough to the true mean $\overline{x}$, the accuracy of this formula is good. Numerically even more stable computation methods can be found in the literature.
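As an illustrative sketch (the function name and the choice of the first data value as the shift point $\tilde{x}$ are assumptions of this example, not prescribed by the theorem), the shifted formula can be written in Python:

```python
def sum_of_squares_shifted(data):
    """Sum of squared deviations SQ_x via the shifted formula.

    Uses the first value as the shift point x~, which is usually close
    enough to the mean to avoid catastrophic cancellation.
    """
    n = len(data)
    shift = data[0]                                 # x~ ≈ x̄ (any rough guess works)
    sum_sq = sum((x - shift) ** 2 for x in data)    # Σ (x_i − x~)²
    sum_dev = sum(x - shift for x in data)          # Σ (x_i − x~)
    return sum_sq - sum_dev ** 2 / n

data = [505, 500, 495, 505]
print(sum_of_squares_shifted(data))  # 68.75
```

Any rough guess for $\tilde{x}$ works; the closer it is to $\overline{x}$, the smaller the correction term and the better the numerical behaviour.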
Explanation for the case of a finite sequence of numbers: The sample mean
The shift theorem is first demonstrated in the simplest case: let the values $x_1, x_2, \ldots, x_n$ be given, for example from a sample. The sum of the squared deviations of these values is
$$SQ_x = \sum_{i=1}^{n} (x_i - \overline{x})^2\,,$$
where
$$\overline{x} := \frac{1}{n}(x_1 + x_2 + \ldots + x_n) = \frac{1}{n}\sum_{i=1}^{n} x_i$$
is the arithmetic mean of the numbers. The shift theorem follows from
$$\begin{aligned}
SQ_x &= \sum_{i=1}^{n} (x_i^2 - 2x_i\overline{x} + \overline{x}^2) = \left(\sum_{i=1}^{n} x_i^2\right) - 2\overline{x}\left(\sum_{i=1}^{n} x_i\right) + n\overline{x}^2 \\
&= \left(\sum_{i=1}^{n} x_i^2\right) - 2\overline{x} \cdot n\overline{x} + n\overline{x}^2 = \left(\sum_{i=1}^{n} x_i^2\right) - n\overline{x}^2\,.
\end{aligned}$$
Example
Coffee packets are continuously weighed as part of quality assurance. For the first four packets, the values $x_i$ (in g) were 505, 500, 495, 505.
The average weight is
$$\overline{x} = \frac{505 + 500 + 495 + 505}{4} = 501.25\,.$$
Thus
$$\begin{aligned}
SQ_x &= (505 - 501.25)^2 + (500 - 501.25)^2 + (495 - 501.25)^2 + (505 - 501.25)^2 \\
&= 14.0625 + 1.5625 + 39.0625 + 14.0625 = 68.75\,.
\end{aligned}$$
To apply the shift theorem, one computes
$$q_1 = \sum_{i=1}^{n} x_i = 505 + 500 + 495 + 505 = 2005$$
and
$$q_2 = \sum_{i=1}^{n} x_i^2 = 255025 + 250000 + 245025 + 255025 = 1005075\,,$$
so that
$$SQ_x = q_2 - \frac{1}{4} q_1^2 = 68.75\,.$$
From this one can, for example, determine the (corrected) empirical variance as an "average" squared deviation:
$$s^2 = \frac{1}{n-1} SQ_x\,,$$
for example
$$s^2 = \frac{1}{4-1} \cdot 68.75 \approx 22.9\,.$$
If another packet is added to the sample, it is sufficient, thanks to the shift theorem, to update the values $q_1$ and $q_2$ in order to recompute the sample variance. The fifth packet weighs 510 g. Then:
$$q_1^{\text{new}} = q_1 + 510 = 2005 + 510 = 2515\,,$$
$$q_2^{\text{new}} = q_2 + 510^2 = 1005075 + 260100 = 1265175\,,$$
and hence
$$SQ^{\text{new}} = q_2^{\text{new}} - \frac{1}{5}\left(q_1^{\text{new}}\right)^2 = 130\,.$$
The sample variance of the new, larger sample is then
$$s_{\text{new}}^2 = \frac{1}{5-1} SQ^{\text{new}} = 130/4 = 32.5\,.$$
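The running-sums update above can be sketched in Python (the function name is my own; note that this one-pass scheme inherits the cancellation problem discussed earlier, so for poorly centered data the shifted variant is preferable):

```python
def add_observation(q1, q2, n, x):
    """Update the running sums q1 = Σx_i and q2 = Σx_i² with a new value x,
    then recompute SQ_x and the corrected sample variance via the shift theorem."""
    q1 += x
    q2 += x ** 2
    n += 1
    sq = q2 - q1 ** 2 / n      # shift theorem: SQ_x = q2 − q1²/n
    s2 = sq / (n - 1)          # corrected sample variance
    return q1, q2, n, sq, s2

# Sums from the first four coffee packets:
q1, q2, n = 2005, 1005075, 4
q1, q2, n, sq, s2 = add_observation(q1, q2, n, 510)
print(sq, s2)  # 130.0 32.5
```

Only the two scalars $q_1$ and $q_2$ need to be stored between observations; the raw data can be discarded.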
Applications
Sample covariance
The sum of the deviation products of two characteristics $x$ and $y$ is given by
$$SP_{xy} := \sum_{i=1}^{n} (x_i - \overline{x})(y_i - \overline{y})\,.$$
Here the shift theorem yields
$$SP_{xy} = \sum_{i=1}^{n} x_i y_i - n\,\overline{x}\,\overline{y}\,.$$
The corrected sample covariance is then calculated as the “average” deviation product
$$s_{xy} = \frac{1}{n-1} SP_{xy}\,.$$
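A short Python check of this identity on made-up data (the function name and the sample values are my own):

```python
def sp_xy(xs, ys):
    """Sum of deviation products SP_xy via the shift theorem: Σ x_i y_i − n·x̄·ȳ."""
    n = len(xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    return sum_xy - n * (sum(xs) / n) * (sum(ys) / n)

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 5.0, 7.0]
sp = sp_xy(xs, ys)
s_xy = sp / (len(xs) - 1)   # corrected sample covariance

# Cross-check against the defining sum of deviation products:
mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
direct = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
print(sp, direct)  # 8.0 8.0
```

As with the variance, the shifted form needs only the running sums $\sum x_i$, $\sum y_i$, and $\sum x_i y_i$, but the same cancellation caveat applies to uncentered data.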
Random variable
Variance
The variance of a random variable $X$,
$$\operatorname{Var}(X) = \operatorname{E}\left((X - \operatorname{E}(X))^2\right),$$
can also be written, using the shift theorem, as
$$\operatorname{Var}(X) = \operatorname{E}(X^2) - (\operatorname{E}(X))^2\,.$$
This result is also known as the theorem of König–Huygens. It follows from the linearity of the expected value:
$$\begin{aligned}
\operatorname{E}\bigl((X - \operatorname{E}(X))^2\bigr) &= \operatorname{E}\bigl(X^2 - 2X\operatorname{E}(X) + \operatorname{E}(X)^2\bigr) \\
&= \operatorname{E}(X^2) - \operatorname{E}\bigl(2X\operatorname{E}(X)\bigr) + \operatorname{E}\bigl(\operatorname{E}(X)^2\bigr) \\
&= \operatorname{E}(X^2) - 2\operatorname{E}(X)\operatorname{E}(X) + \operatorname{E}(X)^2 \\
&= \operatorname{E}(X^2) - \operatorname{E}(X)^2\,.
\end{aligned}$$
A more general form of the shift theorem is
$$\operatorname{Var}(X) = \operatorname{E}\left((X - c)^2\right) - \left(\operatorname{E}(X) - c\right)^2, \quad c \in \mathbb{R}\,.$$
For a discrete random variable $X$ with values $x_i$, $i = 1, \dots, n$, and associated probabilities $\operatorname{P}(X = x_j) = p_j$, one then obtains
$$\operatorname{Var}(X) = \operatorname{E}\bigl((X - \operatorname{E}(X))^2\bigr) = \sum_j p_j \left(x_j - \sum_i p_i x_i\right)^2 = \sum_i p_i x_i^2 - \left(\sum_i p_i x_i\right)^2\,.$$
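As a numerical illustration of the discrete form (the fair-die distribution and the function name are my own choice of example):

```python
def variance_discrete(values, probs):
    """Var(X) = Σ p_i x_i² − (Σ p_i x_i)²  (shift theorem, discrete case)."""
    e_x2 = sum(p * x ** 2 for x, p in zip(values, probs))  # E(X²)
    e_x = sum(p * x for x, p in zip(values, probs))        # E(X)
    return e_x2 - e_x ** 2

# Fair six-sided die: E(X) = 3.5, Var(X) = 35/12 ≈ 2.9167
values = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6
print(variance_discrete(values, probs))
```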
With the special choice $p_i = \frac{1}{n}$, so that $\operatorname{E}(X) = \overline{x} = \frac{1}{n}\sum_i x_i$, the formula above yields
$$\frac{1}{n}\sum_i \left(x_i - \overline{x}\right)^2 = \frac{1}{n}\sum_i x_i^2 - \overline{x}^2\,.$$
For a continuous random variable $X$ with density function $f$, the variance is
$$\operatorname{Var}(X) = \operatorname{E}\bigl((X - \operatorname{E}(X))^2\bigr) = \int_{-\infty}^{\infty} (x - \operatorname{E}(X))^2\, f(x)\,\mathrm{d}x\,.$$
With the shift theorem one obtains
$$\operatorname{Var}(X) = \operatorname{E}\bigl((X - \operatorname{E}(X))^2\bigr) = \int_{-\infty}^{\infty} x^2 f(x)\,\mathrm{d}x - \operatorname{E}(X)^2\,.$$
Covariance
The covariance of two random variables $X$ and $Y$,
$$\operatorname{Cov}(X, Y) = \operatorname{E}\bigl((X - \operatorname{E}(X)) \cdot (Y - \operatorname{E}(Y))\bigr),$$
can be expressed with the shift theorem as
$$\operatorname{Cov}(X, Y) = \operatorname{E}(XY) - \operatorname{E}(X)\operatorname{E}(Y)\,.$$
For discrete random variables one obtains
$$\operatorname{Cov}(X, Y) = \sum_j \sum_k (x_j - \operatorname{E}(X))(y_k - \operatorname{E}(Y)) \cdot f(x_j, y_k)$$
and, corresponding to the above,
$$\operatorname{Cov}(X, Y) = \sum_j \sum_k x_j\, y_k\, f(x_j, y_k) - \operatorname{E}(X) \cdot \operatorname{E}(Y)\,,$$
where $f(x_j, y_k)$ is the joint probability that $X = x_j$ and $Y = y_k$.
For continuous random variables, with $f(x, y)$ as the joint density function of $X$ and $Y$ at the point $(x, y)$, the covariance is
$$\operatorname{Cov}(X, Y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} (x - \operatorname{E}(X))(y - \operatorname{E}(Y)) \cdot f(x, y)\,\mathrm{d}y\,\mathrm{d}x$$
and, corresponding to the above,
$$\operatorname{Cov}(X, Y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x\, y\, f(x, y)\,\mathrm{d}y\,\mathrm{d}x - \operatorname{E}(X) \cdot \operatorname{E}(Y)\,.$$
References
- Erich Schubert, Michael Gertz: Numerically stable parallel computation of (co-)variance. In: Proceedings of the 30th International Conference on Scientific and Statistical Database Management (SSDBM '18). ACM Press, Bozen-Bolzano, Italy, 2018, ISBN 978-1-4503-6505-5, pp. 1–12, doi:10.1145/3221269.3223036.
- Tony F. Chan, Gene H. Golub, Randall J. LeVeque: Algorithms for computing the sample variance: analysis and recommendations. In: The American Statistician, Vol. 37, No. 3 (Aug. 1983), pp. 242–247.
- Hans-Friedrich Eckey, Reinhold Kosfeld, Christian Dreger: Statistics: Principles, Methods, Examples, p. 86.
- Ansgar Steland: Basic Knowledge Statistics, p. 116.