This article covers the covariance of two random variables; for the covariance of a data series or sample, see sample covariance.
The covariance (Latin con- = "with" and variance (dispersion), from variare = "to change, to differ"; hence occasionally also Mitstreuung) is, in stochastics, an unstandardized measure of association for a monotonic relationship between two random variables with a joint probability distribution. The value of this parameter indicates whether high values of one random variable tend to go along with high or rather with low values of the other random variable. Covariance is thus a measure of the association between two random variables.
Definition
If $X$ and $Y$ are two real, integrable random variables whose product is also integrable, i.e. the expected values $\operatorname{E}(X)$, $\operatorname{E}(Y)$ and $\operatorname{E}(XY)$ exist, then
 $\operatorname{Cov}(X,Y) := \operatorname{E}\bigl[(X - \operatorname{E}(X)) \cdot (Y - \operatorname{E}(Y))\bigr]$
is called the covariance of $X$ and $Y$.
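The definition can be evaluated directly for a finite joint distribution. The following is a minimal sketch, not a reference implementation; the joint probability mass function `pmf` and the helper names `expectation` and `covariance` are assumptions made for illustration.

```python
from fractions import Fraction

# Hypothetical joint distribution of (X, Y): maps (x, y) -> probability.
pmf = {
    (0, 0): Fraction(1, 4),
    (0, 1): Fraction(1, 4),
    (1, 1): Fraction(1, 2),
}


def expectation(pmf, g):
    """E[g(X, Y)] for a finite joint probability mass function."""
    return sum(p * g(x, y) for (x, y), p in pmf.items())


def covariance(pmf):
    """Cov(X, Y) = E[(X - E(X)) * (Y - E(Y))], straight from the definition."""
    mean_x = expectation(pmf, lambda x, y: x)
    mean_y = expectation(pmf, lambda x, y: y)
    return expectation(pmf, lambda x, y: (x - mean_x) * (y - mean_y))


cov_xy = covariance(pmf)
print(cov_xy)  # 1/8
```

Exact fractions are used here so that the result is not blurred by floating-point rounding.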
If $X$ and $Y$ are square-integrable, i.e. if $\operatorname{E}(|X|^2) = \operatorname{E}(X^2) < \infty$ and $\operatorname{E}(|Y|^2) = \operatorname{E}(Y^2) < \infty$ hold, then the Cauchy-Schwarz inequality implies
 $\operatorname{E}(|X|) = \operatorname{E}(|X| \cdot 1) \leq \sqrt{\operatorname{E}(|X|^2)} < \infty$
and analogously $\operatorname{E}(|Y|) \leq \sqrt{\operatorname{E}(|Y|^2)} < \infty$, and in addition
 $\operatorname{E}(|X \cdot Y|) \leq \operatorname{E}(|X| \cdot |Y|) \leq \sqrt{\operatorname{E}(|X|^2) \cdot \operatorname{E}(|Y|^2)} < \infty$.
Thus the required existence of the expected values is guaranteed for square-integrable random variables.
Properties and rules of calculation
Interpretation of the covariance
 The covariance is positive if $X$ and $Y$ have a monotonic relationship in the same direction, i.e. high (low) values of $X$ go along with high (low) values of $Y$.
 The covariance is negative, on the other hand, if $X$ and $Y$ have a monotonic relationship in opposite directions, i.e. high values of one random variable go along with low values of the other random variable and vice versa.
 If the covariance is zero, there is no monotonic relationship between $X$ and $Y$ (non-monotonic relationships are still possible, however).
The covariance indicates the direction of a relationship between two random variables, but it makes no statement about the strength of the relationship. This is due to the linearity of the covariance: its value changes with the scale of the variables. To make relationships comparable, the covariance must be standardized. The most common standardization, by the standard deviations, leads to the correlation coefficient.
Shift theorem
The calculation of the covariance can often be simplified by using the shift theorem as an alternative representation of the covariance.
Theorem (shift theorem for the covariance):
 $\operatorname{Cov}(X,Y) = \operatorname{E}(XY) - \operatorname{E}(X)\operatorname{E}(Y).$
Proof:
 $\begin{aligned} \operatorname{Cov}(X,Y) &= \operatorname{E}\bigl[(X - \operatorname{E}(X)) \cdot (Y - \operatorname{E}(Y))\bigr] \\ &= \operatorname{E}\bigl[XY - X\operatorname{E}(Y) - Y\operatorname{E}(X) + \operatorname{E}(X)\operatorname{E}(Y)\bigr] \\ &= \operatorname{E}(XY) - \operatorname{E}(X)\operatorname{E}(Y) - \operatorname{E}(Y)\operatorname{E}(X) + \operatorname{E}(X)\operatorname{E}(Y) \\ &= \operatorname{E}(XY) - \operatorname{E}(X)\operatorname{E}(Y) \qquad \Box \end{aligned}$
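As a quick plausibility check of the shift theorem, both representations can be computed for a small example. This sketch assumes a fair die roll $X$ and the indicator $Y$ of the event $X \geq 5$; this setup and the helper `expect` are illustrative choices, not part of the theorem.

```python
from fractions import Fraction

# Hypothetical joint distribution: X is a fair die roll, Y = 1 if X >= 5 else 0.
pmf = {(x, int(x >= 5)): Fraction(1, 6) for x in range(1, 7)}


def expect(g):
    """E[g(X, Y)] under the joint distribution above."""
    return sum(p * g(x, y) for (x, y), p in pmf.items())


mean_x = expect(lambda x, y: x)
mean_y = expect(lambda x, y: y)

# Covariance from the definition: E[(X - E(X)) * (Y - E(Y))].
cov_definition = expect(lambda x, y: (x - mean_x) * (y - mean_y))

# Covariance from the shift theorem: E(XY) - E(X) E(Y).
cov_shift = expect(lambda x, y: x * y) - mean_x * mean_y

print(cov_definition, cov_shift)  # 2/3 2/3
```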
Relationship to variance
Theorem: The covariance is a generalization of the variance, since
 $\operatorname{Var}(X) = \operatorname{Cov}(X,X).$
Proof:
 $\begin{aligned} \operatorname{Cov}(X,X) &= \operatorname{E}\bigl[(X - \operatorname{E}(X))^2\bigr] \\ &= \operatorname{Var}(X) \qquad \Box \end{aligned}$
The variance is therefore the covariance of a random variable with itself.
The covariances can also be used to calculate the variance of a sum of square-integrable random variables. In general,
 $\begin{aligned} \operatorname{Var}\left(\sum_{i=1}^{n} X_i\right) &= \sum_{i,j=1}^{n} \operatorname{Cov}(X_i, X_j) \\ &= \sum_{i=1}^{n} \operatorname{Var}(X_i) + \sum_{i,j=1,\, i \neq j}^{n} \operatorname{Cov}(X_i, X_j) \\ &= \sum_{i=1}^{n} \operatorname{Var}(X_i) + 2 \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \operatorname{Cov}(X_i, X_j). \end{aligned}$
In particular, for the sum of two random variables,
 $\operatorname{Var}(X+Y) = \operatorname{Var}(X) + \operatorname{Var}(Y) + 2\operatorname{Cov}(X,Y).$
As can be seen directly from the definition, the covariance changes sign when one of the variables changes sign:
 $\operatorname{Cov}(-X,Y) = -\operatorname{Cov}(X,Y)$
This yields the formula for the difference of two random variables:
 $\operatorname{Var}(X-Y) = \operatorname{Var}(X+(-Y)) = \operatorname{Var}(X) + \operatorname{Var}(Y) - 2\operatorname{Cov}(X,Y).$
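The formulas for the variance of a sum and of a difference can be checked on a small example. The sketch below assumes a hypothetical three-point joint distribution; `expect`, `cov` and `var` are illustrative helpers that compute covariances via $\operatorname{E}(UV) - \operatorname{E}(U)\operatorname{E}(V)$.

```python
from fractions import Fraction

# Hypothetical joint distribution of (X, Y): maps (x, y) -> probability.
pmf = {(0, 0): Fraction(1, 2), (1, 0): Fraction(1, 4), (1, 2): Fraction(1, 4)}


def expect(g):
    """E[g(X, Y)] under the joint distribution above."""
    return sum(p * g(x, y) for (x, y), p in pmf.items())


def cov(g, h):
    """Covariance of g(X, Y) and h(X, Y): E(GH) - E(G) E(H)."""
    return expect(lambda x, y: g(x, y) * h(x, y)) - expect(g) * expect(h)


def var(g):
    """Variance as the covariance of a random variable with itself."""
    return cov(g, g)


def X(x, y):
    return x


def Y(x, y):
    return y


# Left-hand sides: variances of X + Y and X - Y computed directly.
var_sum = var(lambda x, y: x + y)
var_diff = var(lambda x, y: x - y)

# Right-hand sides: Var(X) + Var(Y) +/- 2 Cov(X, Y).
formula_sum = var(X) + var(Y) + 2 * cov(X, Y)
formula_diff = var(X) + var(Y) - 2 * cov(X, Y)

print(var_sum, var_diff)  # 3/2 1/2
```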
Linearity, symmetry and definiteness
Theorem: The covariance is a positive semidefinite symmetric bilinear form on the vector space of square-integrable random variables.
The following three theorems therefore hold:
Theorem (bilinearity): For $a, b, c, d, e, f, g, h \in \mathbb{R}$:
 $\operatorname{Cov}(aX + b, cY + d) = ac \operatorname{Cov}(X,Y) \qquad \text{and}$
 $\operatorname{Cov}[X, (eY + f) + (gZ + h)] = e \operatorname{Cov}(X,Y) + g \operatorname{Cov}(X,Z).$
Proof:
 $\begin{aligned} \operatorname{Cov}(aX + b, cY + d) &= \operatorname{E}\bigl[(aX + b - \operatorname{E}(aX + b)) \cdot (cY + d - \operatorname{E}(cY + d))\bigr] \\ &= \operatorname{E}\bigl[(aX - a\operatorname{E}(X)) \cdot (cY - c\operatorname{E}(Y))\bigr] \\ &= ac \operatorname{E}\bigl[(X - \operatorname{E}(X)) \cdot (Y - \operatorname{E}(Y))\bigr] \\ &= ac \operatorname{Cov}(X,Y) \end{aligned}$
 $\begin{aligned} \operatorname{Cov}[X, (eY + f) + (gZ + h)] &= \operatorname{E}\bigl[(X - \operatorname{E}(X)) \cdot (eY + f + gZ + h - \operatorname{E}(eY + f + gZ + h))\bigr] \\ &= \operatorname{E}\bigl[(X - \operatorname{E}(X)) \cdot (eY - e\operatorname{E}(Y) + gZ - g\operatorname{E}(Z))\bigr] \\ &= \operatorname{E}\bigl[(X - \operatorname{E}(X)) \cdot e(Y - \operatorname{E}(Y)) + (X - \operatorname{E}(X)) \cdot g(Z - \operatorname{E}(Z))\bigr] \\ &= e \operatorname{E}\bigl[(X - \operatorname{E}(X)) \cdot (Y - \operatorname{E}(Y))\bigr] + g \operatorname{E}\bigl[(X - \operatorname{E}(X)) \cdot (Z - \operatorname{E}(Z))\bigr] \\ &= e \operatorname{Cov}(X,Y) + g \operatorname{Cov}(X,Z) \qquad \Box \end{aligned}$
The covariance is thus invariant under the addition of constants to the random variables. By symmetry, the second equation also shows that the covariance is linear in the first argument.
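Both bilinearity identities can be verified exactly on a finite example. The following sketch assumes a uniform sample space with six outcomes and arbitrarily chosen random variables and constants; all of these are illustrative assumptions.

```python
from fractions import Fraction

omega = range(6)                 # hypothetical sample space {0, ..., 5}
prob = Fraction(1, len(omega))   # uniform probability of each outcome


def expect(rv):
    """E[rv] for a random variable given as a function on omega."""
    return sum(prob * rv(w) for w in omega)


def cov(u, v):
    """Cov(U, V) = E(UV) - E(U) E(V)."""
    return expect(lambda w: u(w) * v(w)) - expect(u) * expect(v)


def X(w):
    return w


def Y(w):
    return w * w


def Z(w):
    return w % 2


# Arbitrary constants, chosen only for this illustration.
a, b, c, d, e, f, g, h = 2, -1, 3, 5, -4, 7, 6, 1

# Cov(aX + b, cY + d) = ac Cov(X, Y)
lhs1 = cov(lambda w: a * X(w) + b, lambda w: c * Y(w) + d)
rhs1 = a * c * cov(X, Y)

# Cov(X, (eY + f) + (gZ + h)) = e Cov(X, Y) + g Cov(X, Z)
lhs2 = cov(X, lambda w: (e * Y(w) + f) + (g * Z(w) + h))
rhs2 = e * cov(X, Y) + g * cov(X, Z)

print(lhs1 == rhs1, lhs2 == rhs2)  # True True
```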
Theorem (symmetry):
 $\operatorname{Cov}(X,Y) = \operatorname{Cov}(Y,X)$
Proof:
 $\begin{aligned} \operatorname{Cov}(X,Y) &= \operatorname{E}\bigl[(Y - \operatorname{E}(Y)) \cdot (X - \operatorname{E}(X))\bigr] \\ &= \operatorname{Cov}(Y,X) \qquad \Box \end{aligned}$
Theorem (positive semidefiniteness):
 $\operatorname{Cov}(X,X) \geq 0.$
Proof:
 $\operatorname{Cov}(X,X) = \operatorname{Var}(X) \geq 0 \qquad \Box$
As for every positive semidefinite symmetric bilinear form, the Cauchy-Schwarz inequality follows:
 $|\operatorname{Cov}(X,Y)| \leq \sqrt{\operatorname{Var}(X)} \cdot \sqrt{\operatorname{Var}(Y)}$
The linearity of the covariance means that the covariance depends on the scale of the random variables. For example, the covariance becomes ten times as large if the random variable $10X$ is considered instead of $X$. In particular, the value of the covariance depends on the units of measurement used for the random variables. Since this property makes the absolute value of the covariance difficult to interpret, the scale-independent correlation coefficient is often used instead when examining a linear relationship between $X$ and $Y$. The correlation coefficient of two random variables $X$ and $Y$ is the covariance of the standardized (scaled by the standard deviation) random variables $\tilde{X} = X/\sigma_X$ and $\tilde{Y} = Y/\sigma_Y$:
 $\operatorname{Cov}(\tilde{X}, \tilde{Y}) = \operatorname{Cov}(X/\sigma_X, Y/\sigma_Y) = \frac{1}{\sigma_X \sigma_Y} \operatorname{Cov}(X,Y) =: \rho(X,Y)$.
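The scale dependence of the covariance and the scale independence of the correlation coefficient can be illustrated numerically. This is a sketch under the assumption of a uniform six-point sample space with $X(\omega) = \omega$ and $Y(\omega) = \omega^2$; the helper `rho` is a hypothetical name for $\operatorname{Cov}(U,V)/(\sigma_U \sigma_V)$.

```python
import math
from fractions import Fraction

omega = range(6)                 # hypothetical sample space {0, ..., 5}
prob = Fraction(1, len(omega))   # uniform probability of each outcome


def expect(rv):
    """E[rv] for a random variable given as a function on omega."""
    return sum(prob * rv(w) for w in omega)


def cov(u, v):
    """Cov(U, V) = E(UV) - E(U) E(V)."""
    return expect(lambda w: u(w) * v(w)) - expect(u) * expect(v)


def rho(u, v):
    """Correlation coefficient Cov(U, V) / (sigma_U * sigma_V) as a float."""
    return float(cov(u, v)) / math.sqrt(cov(u, u) * cov(v, v))


def X(w):
    return w


def Y(w):
    return w * w


def ten_X(w):
    return 10 * X(w)


# Rescaling X by 10 multiplies the covariance by 10 ...
cov_ratio = cov(ten_X, Y) / cov(X, Y)

# ... but leaves the correlation coefficient unchanged (up to rounding).
rho_original = rho(X, Y)
rho_rescaled = rho(ten_X, Y)

print(cov_ratio, math.isclose(rho_original, rho_rescaled))  # 10 True
```

The correlation is computed in floating point because of the square root, so the comparison uses `math.isclose` rather than exact equality.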
Uncorrelatedness and independence
Definition (uncorrelatedness): Two random variables $X$ and $Y$ are called uncorrelated if $\operatorname{Cov}(X,Y) = 0$.
Theorem: Two stochastically independent random variables are uncorrelated.
Proof: For stochastically independent random variables $X$ and $Y$ we have $\operatorname{E}(XY) = \operatorname{E}(X)\operatorname{E}(Y)$, i.e.
 $\begin{aligned} \operatorname{E}(XY) - \operatorname{E}(X)\operatorname{E}(Y) &= 0 \\ \Leftrightarrow \qquad \operatorname{Cov}(X,Y) &= 0. \end{aligned}$
The converse is generally not true. A counterexample is given by a random variable $X$ uniformly distributed on the interval $[-1,1]$ and $Y = X^2$. Obviously, $X$ and $Y$ are dependent. Nevertheless,
 $\operatorname{Cov}(X,Y) = \operatorname{Cov}(X,X^2) = \operatorname{E}(X^3) - \operatorname{E}(X)\operatorname{E}(X^2) = 0 - 0 \cdot \operatorname{E}(X^2) = 0$.
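The counterexample can be reproduced with exact moments. The sketch below uses the closed form $\operatorname{E}(X^n) = \tfrac{1}{n+1}$ for even $n$ and $0$ for odd $n$, which holds for $X$ uniform on $[-1,1]$; the function name `moment` is an illustrative choice. To make the dependence visible, it also computes $\operatorname{Cov}(X^2, Y)$, which would have to vanish if $X$ and $Y$ were independent.

```python
from fractions import Fraction


def moment(n):
    """E(X^n) for X uniform on [-1, 1], i.e. the integral of x^n / 2 over [-1, 1]."""
    return Fraction(1, n + 1) if n % 2 == 0 else Fraction(0)


# Cov(X, Y) with Y = X^2: E(X^3) - E(X) E(X^2).
cov_x_y = moment(3) - moment(1) * moment(2)

# Cov(X^2, Y) = E(X^4) - E(X^2)^2. For independent X and Y this would be zero,
# so a nonzero value shows that X and Y are dependent.
cov_x2_y = moment(4) - moment(2) ** 2

print(cov_x_y, cov_x2_y)  # 0 4/45
```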
Stochastically independent random variables whose covariance exists are therefore also uncorrelated. Conversely, uncorrelatedness does not necessarily mean that the random variables are stochastically independent, because there may be a non-monotonic dependence that the covariance does not capture.
Further examples of uncorrelated but stochastically dependent random variables:
 Let $X$ and $Y$ be random variables with $P(X=0, Y=1) = \tfrac{1}{2}$ and $P(X=2, Y=0) = P(X=2, Y=2) = \tfrac{1}{4}.$
 Then $P(X=0) = P(X=2) = \tfrac{1}{2}$ and $P(Y=0) = P(Y=2) = \tfrac{1}{4}$, $P(Y=1) = \tfrac{1}{2}.$
 It follows that $\operatorname{E}(X) = \operatorname{E}(Y) = 1$ and also $\operatorname{E}(XY) = 1$, so $\operatorname{Cov}(X,Y) = 0.$
 On the other hand, $X$ and $Y$ are not stochastically independent because $P(X=0, Y=1) = \tfrac{1}{2} \neq \tfrac{1}{2} \cdot \tfrac{1}{2} = P(X=0) P(Y=1)$.
 If the random variables $X$ and $Y$ are Bernoulli distributed with parameter $p \in (0,1)$ and independent, then $(X+Y)$ and $(X-Y)$ are uncorrelated but not independent.
 The uncorrelatedness is clear because $\operatorname{Cov}(X+Y, X-Y) = \operatorname{Cov}(X,X) - \operatorname{Cov}(X,Y) + \operatorname{Cov}(Y,X) - \operatorname{Cov}(Y,Y) = 0.$
 But $(X+Y)$ and $(X-Y)$ are not independent because $P(X+Y=0, X-Y=1) = 0 \neq p(1-p)^3 = P(X+Y=0) P(X-Y=1).$
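Both examples can be checked mechanically. The following sketch builds each joint distribution as a dictionary, computes the covariance and tests whether the joint probabilities factor into the marginals. The value $p = \tfrac{1}{3}$ and the helper names `cov` and `is_independent` are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product


def expect(pmf, g):
    """E[g(U, V)] for a finite joint probability mass function."""
    return sum(p * g(u, v) for (u, v), p in pmf.items())


def cov(pmf):
    """Cov(U, V) = E(UV) - E(U) E(V)."""
    mean_u = expect(pmf, lambda u, v: u)
    mean_v = expect(pmf, lambda u, v: v)
    return expect(pmf, lambda u, v: u * v) - mean_u * mean_v


def is_independent(pmf):
    """True if P(U=u, V=v) = P(U=u) P(V=v) for all marginal values u, v."""
    pu, pv = {}, {}
    for (u, v), p in pmf.items():
        pu[u] = pu.get(u, 0) + p
        pv[v] = pv.get(v, 0) + p
    return all(pmf.get((u, v), 0) == pu[u] * pv[v] for u, v in product(pu, pv))


# First example: the three-point distribution of (X, Y) from the text.
pmf1 = {(0, 1): Fraction(1, 2), (2, 0): Fraction(1, 4), (2, 2): Fraction(1, 4)}

# Second example: joint distribution of (X + Y, X - Y) for independent
# Bernoulli(p) variables X and Y; p = 1/3 is an arbitrary illustrative choice.
p = Fraction(1, 3)
pmf2 = {}
for x, y in product((0, 1), repeat=2):
    weight = (p if x else 1 - p) * (p if y else 1 - p)
    key = (x + y, x - y)
    pmf2[key] = pmf2.get(key, 0) + weight

print(cov(pmf1), is_independent(pmf1))  # 0 False
print(cov(pmf2), is_independent(pmf2))  # 0 False
```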
References
↑ H. Autrum, E. Bünning et al.: Ergebnisse der Biologie. p. 88.
↑ Ludwig Fahrmeir, Rita Künstler, Iris Pigeot, Gerhard Tutz: Statistik. Der Weg zur Datenanalyse. 8th, revised and expanded edition. Springer Spectrum, Berlin/Heidelberg 2016, ISBN 978-3-662-50371-3, p. 326.