# Weak law of large numbers

The weak law of large numbers is a statement of probability theory concerning the limiting behavior of sequences of random variables. It asserts the convergence in probability of the means of the random variables. The weak law of large numbers is closely related to the strong law of large numbers, which uses a different and stronger notion of convergence, namely almost sure convergence. Both belong to the laws of large numbers and thus to the limit theorems of stochastics.

Over time, the conditions under which the weak law of large numbers applies were weakened more and more, while the tools needed to prove it became more and more advanced. Some historically significant formulations of the weak law of large numbers carry proper names, such as Bernoulli's law of large numbers (after Jakob I Bernoulli), Chebyshev's weak law of large numbers (after Pafnuty Lvovich Chebyshev), or Khinchin's weak law of large numbers (after Alexander Yakovlevich Khinchin). One also finds designations such as the $\mathcal{L}^2$-version or $\mathcal{L}^1$-version of the weak law of large numbers for formulations that only require the existence of the variance or of the expected value, respectively.

## Formulation

Let $(X_n)_{n \in \mathbb{N}}$ be a sequence of random variables with $\operatorname{E}(|X_n|) < \infty$ for all $n \in \mathbb{N}$. The sequence is said to satisfy the weak law of large numbers if the sequence

$$\overline{X}_n := \frac{1}{n} \sum_{i=1}^{n} \left( X_i - \operatorname{E}(X_i) \right)$$

of centered means converges to 0 in probability, that is, if

$$\lim_{n \to \infty} P\left( \left| \overline{X}_n \right| \geq \epsilon \right) = 0$$

holds for every $\epsilon > 0$.
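This definition can be illustrated numerically. The following sketch (an illustration, not part of the formal statement) estimates $P(|\overline{X}_n| \geq \epsilon)$ by simulation for i.i.d. Uniform(0,1) variables, whose expected value is $1/2$; the sample sizes, $\epsilon$, and trial counts are arbitrary choices.

```python
import random

# Illustrative simulation sketch: estimate P(|X_bar_n| >= eps) for centered
# means of i.i.d. Uniform(0,1) variables (expected value 1/2).
# All parameter values here are arbitrary illustrative choices.
random.seed(0)

def empirical_prob(n, eps, trials=1000):
    """Fraction of trials in which the centered mean deviates by at least eps."""
    count = 0
    for _ in range(trials):
        mean = sum(random.random() for _ in range(n)) / n
        if abs(mean - 0.5) >= eps:  # center by E(X_i) = 1/2
            count += 1
    return count / trials

# The estimated probability shrinks as n grows, as the weak law predicts.
for n in (10, 100, 1000):
    print(n, empirical_prob(n, eps=0.05))
```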

## Interpretation and difference from the strong law of large numbers

The strong law of large numbers asserts the almost sure convergence of the centered means, and almost sure convergence always implies convergence in probability. Hence the weak law of large numbers always follows from the strong law of large numbers; the converse does not hold in general.

## Validity

Below, various conditions under which the weak law of large numbers holds are listed. The weakest and most specific statement comes first, the strongest and most general one last.

### Bernoulli's law of large numbers

If $(X_n)_{n \in \mathbb{N}}$ are independent, identically Bernoulli-distributed random variables with parameter $p \in (0,1)$, that is

$$X_n \sim \operatorname{Ber}(p),$$

then $(X_n)_{n \in \mathbb{N}}$ satisfies the weak law of large numbers, and the mean $\frac{1}{n} \sum_{i=1}^{n} X_i$ converges in probability to the parameter $p$.

This statement goes back to Jakob I Bernoulli, but it was only published posthumously in 1713 in the Ars conjectandi, edited by his nephew Nikolaus I Bernoulli.
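Bernoulli's law can be illustrated with a short simulation: the relative frequency of successes approaches $p$ as the number of trials grows. The value of $p$ and the sample sizes below are arbitrary illustrative choices.

```python
import random

# Sketch illustrating Bernoulli's law of large numbers: the relative frequency
# of successes among n independent Bernoulli(p) trials approaches p.
# The value of p and the sample sizes are arbitrary illustrative choices.
random.seed(42)
p = 0.3

def relative_frequency(n):
    """Mean of n independent Bernoulli(p) samples."""
    return sum(1 if random.random() < p else 0 for _ in range(n)) / n

for n in (100, 10000):
    print(n, relative_frequency(n))
```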

### Chebyshev's weak law of large numbers

If $(X_n)_{n \in \mathbb{N}}$ are independent, identically distributed random variables with finite expectation and finite variance, then $(X_n)_{n \in \mathbb{N}}$ satisfies the weak law of large numbers.

This statement goes back to Pafnuty Lvovich Chebyshev (also transcribed from the Russian as Tschebyschow or Tchebychev), who proved it in 1866.

### $L^2$ version of the weak law of large numbers

Let $(X_n)_{n \in \mathbb{N}}$ be a sequence of random variables for which the following holds:

- The $X_n$ are pairwise uncorrelated, that is, $\operatorname{Cov}(X_i, X_j) = 0$ for $i \neq j$.
- The sequence of variances satisfies
$$\lim_{n \to \infty} \frac{1}{n^2} \sum_{i=1}^{n} \operatorname{Var}(X_i) = 0.$$

Then $(X_n)_{n \in \mathbb{N}}$ satisfies the weak law of large numbers.

The condition on the variances is satisfied, for example, if the sequence of variances is bounded, i.e. if $\sup_{n \in \mathbb{N}} \operatorname{Var}(X_n) < \infty$.
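The bounded-variance case can be checked numerically: if $\operatorname{Var}(X_i) \leq C$ for all $i$, then $\frac{1}{n^2} \sum_{i=1}^{n} \operatorname{Var}(X_i) \leq \frac{C}{n} \to 0$. The variance sequence in the sketch below is an arbitrary bounded example.

```python
# Numerical sketch of the sufficient condition above: if Var(X_i) <= C for
# all i, then (1/n^2) * sum_{i=1}^n Var(X_i) <= C/n, which tends to 0.
# The variance sequence used here is an arbitrary bounded example.
C = 4.0
variances = [2.0 + 2.0 * (-1) ** i / (i + 1) for i in range(10000)]

def scaled_variance_sum(n):
    """(1/n^2) * sum of the first n variances."""
    return sum(variances[:n]) / n ** 2

for n in (10, 100, 1000):
    print(n, scaled_variance_sum(n), "<=", C / n)
```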

This statement is a real improvement on Chebyshev's weak law of large numbers for two reasons:

1. Pairwise uncorrelatedness is a weaker requirement than independence: independence always implies pairwise uncorrelatedness, but the converse is generally false.
2. The random variables no longer need to be identically distributed; the above condition on the variances suffices.

The designation $L^2$ version comes from the requirement that the variances be finite; in measure-theoretic terms this corresponds to requiring that the random variables (as measurable functions) lie in $\mathcal{L}^2$, the space of square-integrable functions.

### Khinchin's weak law of large numbers

If $(X_n)_{n \in \mathbb{N}}$ are independent, identically distributed random variables with finite expected value, then the sequence satisfies the weak law of large numbers.

This theorem was proved in 1929 by Alexander Yakovlevich Khinchin (also transcribed from the Russian as Khintchine or Chintschin) and is notable for providing the first formulation of a weak law of large numbers that does not require a finite variance.

### $L^1$ version of the weak law of large numbers

Let $(X_n)_{n \in \mathbb{N}}$ be a sequence of pairwise independent, identically distributed random variables with finite expectation. Then $(X_n)_{n \in \mathbb{N}}$ satisfies the weak law of large numbers.

This statement is a real improvement over Khinchin's weak law of large numbers, since pairwise independence of random variables does not imply the independence of the entire sequence.

## Proof sketches

We use the abbreviations

$$\overline{X}_n := \frac{1}{n} \sum_{i=1}^{n} \left( X_i - \operatorname{E}(X_i) \right), \quad S_n := \sum_{i=1}^{n} X_i, \quad M_n := \frac{1}{n} \sum_{i=1}^{n} X_i.$$

### Versions with finite variance

The proofs of those versions of the weak law of large numbers that require finiteness of the variance rest essentially on the Chebyshev inequality

$$\operatorname{P}\left( \left| Y - \operatorname{E}(Y) \right| \geq \epsilon \right) \leq \frac{\operatorname{Var}(Y)}{\epsilon^2},$$

stated here for a random variable $Y$.

Bernoulli's law of large numbers can then be proved elementarily as follows: if $X_n \sim \operatorname{Ber}(p)$, then $S_n$ is binomially distributed, i.e. $S_n \sim \operatorname{Bin}(n, p)$. Hence

$$\operatorname{E}(M_n) = \tfrac{1}{n} \operatorname{E}(S_n) = \frac{np}{n} = p \quad \text{and} \quad \operatorname{Var}(M_n) = \operatorname{Var}(\tfrac{1}{n} S_n) = \frac{1}{n^2} \operatorname{Var}(S_n) = \frac{np(1-p)}{n^2} = \frac{p(1-p)}{n}.$$

Applying the Chebyshev inequality to the random variable $M_n$ yields

$$P\left( \left| \frac{1}{n} \sum_{i=1}^{n} (X_i - \operatorname{E}(X_i)) \right| \geq \epsilon \right) = P\left( \left| M_n - \operatorname{E}(M_n) \right| \geq \epsilon \right) \leq \frac{p(1-p)}{\epsilon^2 n} \to 0$$

for $n \to \infty$ and every $\epsilon > 0$.
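The bound $\frac{p(1-p)}{\epsilon^2 n}$ can be compared with a simulated tail probability. This is only a numerical sketch; the concrete values of $p$, $\epsilon$, and $n$ are arbitrary illustrative choices.

```python
import random

# Numerical sketch of the bound used in the proof: for M_n the mean of n
# Bernoulli(p) variables, Chebyshev gives P(|M_n - p| >= eps) <= p(1-p)/(eps^2 n).
# The concrete p, eps, and n values are arbitrary illustrative choices.
random.seed(7)
p, eps = 0.5, 0.1

def chebyshev_bound(n):
    """Chebyshev bound p(1-p)/(eps^2 n) on the tail probability."""
    return p * (1 - p) / (eps ** 2 * n)

def empirical_tail(n, trials=1000):
    """Empirical estimate of P(|M_n - p| >= eps)."""
    hits = 0
    for _ in range(trials):
        m = sum(1 if random.random() < p else 0 for _ in range(n)) / n
        if abs(m - p) >= eps:
            hits += 1
    return hits / trials

for n in (100, 1000):
    print(n, empirical_tail(n), "<=", chebyshev_bound(n))
```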

The proof of Chebyshev's weak law of large numbers proceeds analogously. If $\operatorname{E}(X_n) = \mu$ and $\operatorname{Var}(X_n) = \sigma^2 < \infty$, then by the linearity of the expected value

$$\operatorname{E}(M_n) = \tfrac{1}{n} \operatorname{E}(S_n) = \frac{n\mu}{n} = \mu.$$

The identity

$$\operatorname{Var}(M_n) = \operatorname{Var}(\tfrac{1}{n} S_n) = \frac{1}{n^2} \operatorname{Var}(S_n) = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}$$

follows from Bienaymé's identity and the independence of the random variables. The rest of the proof again follows with the Chebyshev inequality, applied to the random variable $M_n$.

To prove the $L^2$ version one assumes, without loss of generality, that all random variables have expected value 0. Because of the pairwise uncorrelatedness, Bienaymé's identity still holds, so

$$\operatorname{Var}(M_n) = \frac{\sum_{i=1}^{n} \operatorname{Var}(X_i)}{n^2}.$$

Applying the Chebyshev inequality one obtains

$$\operatorname{P}\left( \left| M_n \right| \geq \epsilon \right) \leq \frac{\operatorname{Var}(M_n)}{\epsilon^2} = \frac{\sum_{i=1}^{n} \operatorname{Var}(X_i)}{n^2 \epsilon^2} \to 0$$

for $n \to \infty$ by the assumption on the variances.

### Khinchin's weak law of large numbers

If one drops the finiteness of the variance as an assumption, the Chebyshev inequality is no longer available for the proof.

Instead, the proof uses characteristic functions. If $\operatorname{E}(X_n) = \mu$, then with the calculation rules for characteristic functions and the Taylor expansion it follows that

$$\varphi_{M_n}(t) = \left( \varphi_{X_1}\left( \tfrac{t}{n} \right) \right)^n = \left( 1 + \frac{i\mu t + n \, o\!\left( \tfrac{t}{n} \right)}{n} \right)^n,$$

which, by the definition of the exponential function, converges to $\exp(i\mu t)$ for $n \to \infty$; this is the characteristic function of a random variable that is Dirac-distributed at $\mu$. Thus $M_n$ converges in distribution to a random variable Dirac-distributed at the point $\mu$. Since this random variable is almost surely constant, the convergence of $M_n$ in probability to $\mu$ follows, which was to be shown.
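The convergence of the characteristic functions can be observed numerically. As a concrete example (an illustrative assumption, not the general case), take i.i.d. $\operatorname{Exp}(\lambda)$ variables, whose characteristic function is $\lambda / (\lambda - it)$ and whose mean is $\mu = 1/\lambda$.

```python
import cmath

# Numerical sketch of the characteristic-function argument for Khinchin's law,
# using Exp(lam) variables as a concrete example (mu = 1/lam). The
# characteristic function of Exp(lam) is lam / (lam - i t).
lam = 2.0
mu = 1.0 / lam

def cf_mean(t, n):
    """Characteristic function of M_n = (X_1 + ... + X_n)/n for i.i.d. Exp(lam)."""
    return (lam / (lam - 1j * t / n)) ** n

t = 1.5
limit = cmath.exp(1j * mu * t)  # characteristic function of the point mass at mu
for n in (10, 1000, 100000):
    print(n, abs(cf_mean(t, n) - limit))  # distance to the limit shrinks with n
```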

## Alternative formulations

### More general formulation

Somewhat more generally, one says that the sequence of random variables satisfies the weak law of large numbers if there exist real sequences $(b_n)_{n \in \mathbb{N}}$ with $\lim_{n \to \infty} b_n = \infty$ and $(a_n)_{n \in \mathbb{N}}$ such that for the partial sum

$$S_n := \sum_{i=1}^{n} X_i$$

the convergence

$$\frac{S_n}{b_n} - a_n \to 0$$

holds in probability.

With this formulation, convergence statements can also be made without having to assume the existence of the expected values.

### More specific formulation

Some authors consider the convergence in probability of the averaged partial sums $\frac{1}{n} \sum_{i=1}^{n} X_i$ to $\operatorname{E}(X_0)$. However, this formulation assumes that all random variables have the same expected value.