Convergence in probability

Convergence in probability, also called stochastic convergence, is a term used in probability theory, a branch of mathematics. Convergence in probability is the probabilistic counterpart of convergence in measure from measure theory and, alongside convergence in the p-th mean, convergence in distribution, and almost sure convergence, one of the notions of convergence in stochastics. Some sources instead define convergence in probability analogously to local convergence in measure from measure theory. Convergence in probability is used, for example, in the formulation of the weak law of large numbers.

Definition

For real-valued random variables

A sequence $(X_n)_{n \in \mathbb{N}}$ of real-valued random variables converges in probability (or stochastically) to the random variable $X$ if for every $\epsilon > 0$

$\lim_{n \to \infty} P(|X_n - X| \geq \epsilon) = 0.$

One then writes $X_n \stackrel{p}{\rightarrow} X$, $X_n \stackrel{P}{\rightarrow} X$, or also $\operatorname{plim}(X_n) = X$.

General case

Let $(M, d)$ be a separable metric space and $\mathcal{B}(M)$ the associated Borel σ-algebra. A sequence $(X_n)_{n \in \mathbb{N}}$ of random variables on a probability space $(\Omega, \mathcal{A}, P)$ with values in $(M, \mathcal{B}(M))$ is called convergent in probability (or stochastically convergent) to $X$ if for all $\epsilon > 0$

$\lim_{n \to \infty} P(d(X_n, X) \geq \epsilon) = 0.$

The separability is required to ensure that the mapping $\Omega \rightarrow \mathbb{R},\, \omega \mapsto d(X_n(\omega), X(\omega))$ used in the definition is measurable.

Example

Let $Y_n$, $n \in \mathbb{N}$, be independent Rademacher-distributed random variables, that is, $P(Y_n = -1) = P(Y_n = 1) = \tfrac{1}{2}$. Then $\operatorname{E}(Y_n) = 0$ and $\operatorname{Var}(Y_n) = 1$. If one now defines the sequence of random variables $(X_n)_{n \in \mathbb{N}}$ by

$X_n := \frac{1}{n} \sum_{i=1}^{n} Y_i,$

then by independence

$\operatorname{E}(X_n) = \frac{1}{n} \cdot n \operatorname{E}(Y_n) = 0$

and

$\operatorname{Var}(X_n) = \frac{1}{n^2} \operatorname{Var}\left(\sum_{i=1}^{n} Y_i\right) = \frac{1}{n}.$

With the Chebyshev inequality

$P\left[|X_n - \operatorname{E}[X_n]| \geq \epsilon\right] \leq \frac{\operatorname{Var}[X_n]}{\epsilon^2}$

one then obtains the estimate

$P\left[|X_n| \geq \epsilon\right] \leq \frac{1}{n\epsilon^2} \stackrel{n \to \infty}{\longrightarrow} 0.$

So the $X_n$ converge in probability to 0. Besides the Chebyshev inequality, the more general Markov inequality is a useful tool for proving convergence in probability.
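The estimate above can be checked numerically. The following Monte Carlo sketch (the seed, sample sizes, and trial count are arbitrary choices, not part of the example) simulates $X_n$ as the average of $n$ Rademacher variables and compares the empirical frequency of $\{|X_n| \geq \epsilon\}$ with the Chebyshev bound $1/(n\epsilon^2)$:

```python
import random

# Monte Carlo sketch of the example above: X_n is the average of n
# independent Rademacher variables, and the Chebyshev inequality gives
# P(|X_n| >= eps) <= 1/(n*eps^2).
rng = random.Random(0)

def sample_x_n(n):
    """One realisation of X_n = (Y_1 + ... + Y_n) / n."""
    return sum(rng.choice((-1, 1)) for _ in range(n)) / n

eps = 0.1
trials = 2000
freq = {}
for n in (10, 100, 1000):
    freq[n] = sum(abs(sample_x_n(n)) >= eps for _ in range(trials)) / trials
    print(f"n={n:4d}  empirical P(|X_n| >= {eps}) = {freq[n]:.4f}  "
          f"Chebyshev bound = {min(1.0, 1 / (n * eps ** 2)):.4f}")
```

The empirical frequencies decrease toward 0 as $n$ grows, as the bound $1/(n\epsilon^2)$ predicts.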

Properties

• If $(X_n)_{n \in \mathbb{N}}$ converges stochastically to 0 and $(Y_n)_{n \in \mathbb{N}}$ converges stochastically to 0, then $(X_n + Y_n)_{n \in \mathbb{N}}$ also converges stochastically to 0.
• If the real number sequence $(a_n)_{n \in \mathbb{N}}$ is bounded and $(X_n)_{n \in \mathbb{N}}$ converges stochastically to 0, then $(a_n X_n)_{n \in \mathbb{N}}$ also converges stochastically to 0.
• One can show that a sequence $(X_n)$ converges stochastically to $X$ if and only if
$\lim_{n \to \infty} \operatorname{E}[\min(1, |X_n - X|)] = 0,$
that is, stochastic convergence coincides with convergence with respect to the metric $d(X, Y) := \operatorname{E}[\min(1, |X - Y|)]$. The space of all random variables, equipped with this metric, forms a topological vector space that is in general not locally convex.

Relationship to other types of convergence in stochastics

In general, the following implications hold between the notions of convergence in probability theory:

${\begin{matrix}\text{almost sure}\\\text{convergence}\end{matrix}} \implies {\begin{matrix}\text{convergence in}\\\text{probability}\end{matrix}} \implies {\begin{matrix}\text{convergence in}\\\text{distribution}\end{matrix}}$

and

${\begin{matrix}\text{convergence in}\\\text{p-th mean}\end{matrix}} \implies {\begin{matrix}\text{convergence in}\\\text{probability}\end{matrix}} \implies {\begin{matrix}\text{convergence in}\\\text{distribution}\end{matrix}}.$

Convergence in probability is therefore a notion of convergence of intermediate strength. The relationships to the other types of convergence are detailed in the sections below.

Convergence in the p-th mean

Convergence in probability follows immediately from convergence in the p-th mean for $p \geq 1$. To see this, the Markov inequality is applied to the function $h(y) = y^p$, which is monotonically increasing for $p > 0$, and to the random variable $Y = |X_n - X|$. This yields

$P(|X_n - X| \geq \epsilon) \leq \frac{1}{\epsilon^p} \operatorname{E}(|X_n - X|^p),$

which tends to zero as $n \to \infty$. The converse is generally not true. An example: let the random variables $X_n$ be defined by

$P(X_n = e^{n\alpha}) = e^{-n} = 1 - P(X_n = 0)$

with $\alpha > 0$. Then

$\operatorname{E}(|X_n|) = e^{n\alpha} e^{-n} = e^{n(\alpha - 1)} \xrightarrow{n \to \infty} 0,$

if $\alpha < 1$. So for $\alpha \in (0, 1)$ the sequence converges in the first mean to 0. But for every $\epsilon \in (0, 1)$,

$P(|X_n| \geq \epsilon) = P(X_n = e^{n\alpha}) = e^{-n} \xrightarrow{n \to \infty} 0.$

So the sequence converges in probability to 0 for all $\alpha$; in particular, for $\alpha \geq 1$ it converges in probability but not in the first mean.
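Both quantities in this counterexample can be computed exactly, since the distribution of $X_n$ is given in closed form. The short sketch below (the choice $n = 50$ and the two values of $\alpha$ are arbitrary illustrations) evaluates the tail probability $e^{-n}$ and the first moment $e^{n(\alpha-1)}$:

```python
import math

# Exact values for the counterexample P(X_n = e^{n*alpha}) = e^{-n},
# P(X_n = 0) = 1 - e^{-n}: the tail probability e^{-n} vanishes for
# every alpha, but E|X_n| = e^{n(alpha-1)} vanishes only for alpha < 1.
def mean_abs(n, alpha):
    # E|X_n| = e^{n*alpha} * e^{-n} = e^{n*(alpha-1)}
    return math.exp(n * (alpha - 1))

def tail_prob(n):
    # P(|X_n| >= eps) = P(X_n = e^{n*alpha}) = e^{-n} for any eps in (0, 1)
    return math.exp(-n)

for alpha in (0.5, 2.0):
    print(f"alpha={alpha}: E|X_50| = {mean_abs(50, alpha):.3g}, "
          f"P(|X_50| >= eps) = {tail_prob(50):.3g}")
```

For $\alpha = 0.5$ both quantities are tiny, while for $\alpha = 2$ the tail probability is still tiny but the mean is astronomically large.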

A criterion under which convergence in probability implies convergence in the p-th mean is the existence of a majorant $Y$ with $\operatorname{E}(|Y|^p) < \infty$ such that $P(|X_n| \leq Y) = 1$ for all $n$. If the $X_n$ then converge in probability to $X$, they also converge in the p-th mean to $X$. More generally, a connection between convergence in the p-th mean and convergence in probability can be drawn using Vitali's convergence theorem and uniform integrability in the p-th mean: a sequence converges in the p-th mean to $X$ if and only if it is uniformly integrable in the p-th mean and converges in probability to $X$.

Almost sure convergence

Convergence in probability follows from almost sure convergence. To see this, for fixed $\epsilon > 0$ one defines the sets

$B_N := \{\omega \in \Omega \colon \forall n \geq N\ |X_n - X| < \epsilon\} \quad \text{and} \quad B := \bigcup_{N=1}^{\infty} B_N.$

The $B_N$ form a monotonically increasing sequence of sets, and the set $B$ contains the set

$A := \{\omega \in \Omega \colon \lim_{n \to \infty} X_n = X\}$

of those elements on which the sequence converges pointwise. By assumption $P(A) = 1$, hence also $P(B) = 1$ and, by continuity from below, $\lim_{N \to \infty} P(B_N) = 1$. The statement then follows by passing to complements.

The converse, however, is generally not true. An example is a sequence of independent Bernoulli-distributed random variables with parameter $\tfrac{1}{n}$, i.e. $X_n \sim \operatorname{Ber}_{\frac{1}{n}}$. Then

$\lim_{n \to \infty} P(|X_n| \geq \epsilon) = 0$

for all $\epsilon > 0$, and thus the sequence converges in probability to 0. The sequence does not converge almost surely, however; this is shown with the sufficient criterion for almost sure convergence and the Borel–Cantelli lemma.
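A single simulated sample path illustrates the Borel–Cantelli argument (a sketch only; the seed and the horizon of $10^6$ steps are arbitrary): because $\sum_n \tfrac{1}{n}$ diverges, the event $\{X_n = 1\}$ occurs infinitely often almost surely, so ones keep appearing at ever larger indices even though $P(X_n = 1) = \tfrac{1}{n} \to 0$.

```python
import random

# One simulated sample path of independent X_n ~ Ber(1/n), n = 1..10^6.
# P(X_n = 1) = 1/n -> 0 gives convergence in probability to 0, but since
# sum 1/n diverges, the second Borel-Cantelli lemma says X_n = 1 happens
# infinitely often almost surely: the path does not converge to 0.
rng = random.Random(123)
ones = [n for n in range(1, 1_000_001) if rng.random() < 1 / n]

print("indices n with X_n = 1:", ones)
```

On a typical path only about $H_{10^6} \approx 14$ ones occur, but some of them sit far out in the sequence, which is exactly what prevents almost sure convergence.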

Conditions under which convergence in probability implies almost sure convergence:

• The speed of convergence in probability is sufficiently fast, i.e. for every $\epsilon > 0$ it holds that
$\sum_{i=1}^{\infty} P(|X_i - X| \geq \epsilon) < \infty.$
• The base space $\Omega$ can be represented as a countable union of μ-atoms. This is always possible for probability spaces with an at most countable base set.
• If the sequence of random variables is almost surely monotonically decreasing and converges in probability to 0, then the sequence also converges almost surely to 0.

More generally, every sequence that converges in probability has a subsequence that converges almost surely.

Convergence in distribution

Convergence in distribution follows from convergence in probability by Slutsky's theorem; the converse does not hold in general. For example, if the random variable $X$ is Bernoulli-distributed with parameter $p = \tfrac{1}{2}$, that is,

$P(X = 1) = P(X = 0) = \frac{1}{2},$

and if one sets $X_n = 1 - X$ for all $n \in \mathbb{N}$, then the $X_n$ converge in distribution to $X$, since they have the same distribution. But since always $|X_n - X| = 1$, the random variables $X_n$ cannot converge in probability to $X$. However, there are criteria under which convergence in distribution implies convergence in probability. If, for example, all random variables are defined on the same probability space and converge in distribution to a random variable $X$ that is almost surely constant, then they also converge in probability to $X$.
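This counterexample can be made concrete with a short simulation (a sketch; the seed and sample count are arbitrary): sampling $X$ and setting $X_n := 1 - X$ shows that both have the same empirical distribution, yet $|X_n - X| = 1$ on every single outcome.

```python
import random

# Sketch of the counterexample above: X ~ Ber(1/2) and X_n := 1 - X have
# the same distribution, so X_n -> X in distribution trivially; yet
# |X_n - X| = |1 - 2X| = 1 on every outcome, so P(|X_n - X| >= eps) = 1
# for every eps in (0, 1] and there is no convergence in probability.
rng = random.Random(7)
x_samples = [rng.randrange(2) for _ in range(10_000)]   # realisations of X
xn_samples = [1 - x for x in x_samples]                 # realisations of X_n

gaps = {abs(a - b) for a, b in zip(xn_samples, x_samples)}
print("mean of X:  ", sum(x_samples) / len(x_samples))
print("mean of X_n:", sum(xn_samples) / len(xn_samples))
print("values of |X_n - X| observed:", gaps)
```

The two empirical means agree (both near $\tfrac12$), confirming equality in distribution, while the set of observed gaps is exactly $\{1\}$.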