# Convergence in the p-th mean

Convergence in the p-th mean, with its two special cases convergence in quadratic mean and convergence in mean, is a convergence concept from measure theory and probability theory, two branches of mathematics. In measure theory it is fundamental to the convergence of sequences of functions in the function spaces of p-fold integrable functions, the spaces $\mathcal{L}^p$ and $L^p$; in probability theory it is one of the common notions of convergence, alongside almost sure convergence, convergence in distribution and convergence in probability.

To distinguish it from weak convergence in $L^p$ and $\mathcal{L}^p$, convergence in the p-th mean is sometimes also called strong convergence in $L^p$ or $\mathcal{L}^p$, or norm convergence in $L^p$ or $\mathcal{L}^p$.

## Definition

### Measure-theoretic formulation

Given a measure space $(X, \mathcal{A}, \mu)$, a real number $p \in (0, \infty)$ and the corresponding function space $\mathcal{L}^p(X, \mathcal{A}, \mu)$, briefly denoted by $\mathcal{L}^p$. Let furthermore a sequence of functions $(f_n)_{n \in \mathbb{N}}$ from $\mathcal{L}^p$ and another function $f \in \mathcal{L}^p$ be given. Defining

$\|f\|_{\mathcal{L}^p} = \left( \int |f(x)|^p \, \mathrm{d}\mu(x) \right)^{1/p}$,

the sequence of functions is called convergent in the p-th mean to $f$ if

$\lim_{n \to \infty} \|f_n - f\|_{\mathcal{L}^p} = 0$

holds. If $p = 2$, one speaks of convergence in quadratic mean; if $p = 1$, of convergence in mean.

The convergence of $(f_n)_{n \in \mathbb{N}} \in L^p(X, \mathcal{A}, \mu)$ to $f \in L^p(X, \mathcal{A}, \mu)$ is defined analogously.

### Probabilistic formulation

Given a sequence of random variables $(X_n)_{n \in \mathbb{N}}$ and another random variable $X$, assume that $\operatorname{E}(|X|^p) < \infty$ and $\operatorname{E}(|X_n|^p) < \infty$ for all $n \in \mathbb{N}$.

The sequence $(X_n)_{n \in \mathbb{N}}$ converges in the p-th mean to $X$ if

$\lim_{n \to \infty} \operatorname{E}(|X_n - X|^p) = 0$

holds. One then writes $X_n \xrightarrow{\mathcal{L}^p} X$.

As in the measure-theoretic case, for $p = 2$ one speaks of convergence in quadratic mean, and for $p = 1$ of convergence in mean.
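As a small numerical illustration of this definition (a hypothetical example, not from the original article), consider the two-point random variable $X_n$ with $P(X_n = 1) = 1/n$ and $P(X_n = 0) = 1 - 1/n$; its p-th moment can be computed exactly:

```python
# Sketch: exact p-th moment E(|X_n - 0|^p) for the hypothetical two-point
# variable X_n with P(X_n = 1) = 1/n and P(X_n = 0) = 1 - 1/n.
def pth_moment(n: int, p: float) -> float:
    """E(|X_n - 0|^p) = 1^p * (1/n) + 0^p * (1 - 1/n) = 1/n for p > 0."""
    return (1.0 ** p) * (1.0 / n) + (0.0 ** p) * (1.0 - 1.0 / n)

print(pth_moment(10, 2.0))     # 0.1
print(pth_moment(10**6, 2.0))  # tends to 0 as n grows: X_n -> 0 in the p-th mean
```

Since $\operatorname{E}(|X_n - 0|^p) = 1/n \to 0$, the sequence converges in the p-th mean to 0 for every $p > 0$.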

## Properties

• For functions $f \in \mathcal{L}^p$ the limit is determined only $\mu$-almost everywhere, since $\|f\|_p = 0$ only implies $f = 0$ $\mu$-almost everywhere. For $f \in L^p$ the limit is unique.
• For $p \in [1, \infty)$, by the statement above, $\|f\|_p$ is a semi-norm on $\mathcal{L}^p$; on $L^p$ it is a norm. This does not hold for $p \in (0,1)$, however, since the triangle inequality (in this special case the Minkowski inequality) fails. Nevertheless,

$d(f, g) := \|f - g\|_p^p$

defines a metric, for which

$\|f + g\|_p^p \leq \|f\|_p^p + \|g\|_p^p$

holds.

## Properties for different parameters p

Let $0 < p < p^* < \infty$. On finite measure spaces, convergence in the $p^*$-th mean implies convergence in the p-th mean: since

$\|f\|_p \leq \mu(X)^{1/p - 1/p^*} \|f\|_{p^*}$,

a sequence convergent in the $p^*$-th mean is dominated in the p-th mean as well. The inequality above follows from the Hölder inequality, applied to the functions $|f|^p$ and $1$ with exponents $r = \tfrac{p^*}{p}$ and $s = (1 - 1/r)^{-1}$.
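The inequality $\|f\|_p \leq \mu(X)^{1/p - 1/p^*} \|f\|_{p^*}$ (with exponent $1/p - 1/p^*$, as Hölder's inequality gives) can be checked numerically; the following sketch uses an arbitrary step function chosen purely for illustration, on $X = [0, 2]$ with the Lebesgue measure, so $\mu(X) = 2$:

```python
# Numerical check of ||f||_p <= mu(X)^(1/p - 1/p*) * ||f||_{p*} for a
# hypothetical step function f = 3 on [0,1], f = 0.5 on (1,2]; mu(X) = 2.
def step_norm(q: float) -> float:
    """||f||_q computed exactly: (3^q * 1 + 0.5^q * 1)^(1/q)."""
    return (3.0 ** q + 0.5 ** q) ** (1.0 / q)

p, p_star = 1.5, 4.0
lhs = step_norm(p)
rhs = 2.0 ** (1.0 / p - 1.0 / p_star) * step_norm(p_star)
print(lhs <= rhs)  # True: the p-norm is controlled by the p*-norm
```

Any other step function and any pair $0 < p < p^*$ would do equally well here; the point is only that the finite total measure enters through the factor $\mu(X)^{1/p - 1/p^*}$.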

Without the finiteness of the measure, the statement is false in general. For example, consider on $(\mathbb{R}, \mathcal{B}(\mathbb{R}), \lambda)$ the sequence of functions, for a fixed real $k < 0$,

$f_n(x) := n^k \chi_{[0, n]}$.

Then

$\|f_n\|_p = n^{k + \tfrac{1}{p}}$

and thus

$\lim_{n \to \infty} \|f_n\|_p = \begin{cases} \infty & \text{if } p < -\tfrac{1}{k} \\ 0 & \text{if } -\tfrac{1}{k} < p, \end{cases}$

so for $p < -\tfrac{1}{k} < p^*$ the sequence converges to 0 in the $p^*$-th mean but not in the p-th mean.
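The norm $\|f_n\|_p = n^{k + 1/p}$ can be evaluated directly; the sketch below (illustrative, with $k = -1$, so the threshold is $-1/k = 1$) shows the blow-up for $p < 1$ and the decay for $p > 1$:

```python
def lp_norm(n: int, k: float, p: float) -> float:
    """||n^k * chi_[0,n]||_p = (n^(k*p) * n)^(1/p) = n^(k + 1/p)."""
    return n ** (k + 1.0 / p)

# With k = -1 the threshold is -1/k = 1:
print(lp_norm(10**4, -1.0, 0.5))  # grows like n (-> infinity for p < 1)
print(lp_norm(10**4, -1.0, 2.0))  # decays like 1/sqrt(n) (-> 0 for p > 1)
```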

The reverse conclusion, from convergence in the p-th mean to convergence in the $p^*$-th mean, is false both for finite measures and in general. An example is the sequence of functions on $([0,1], \mathcal{B}([0,1]), \lambda)$ defined, for a fixed real $k > 0$, by

$f_n(x) := n^k \chi_{[0, \tfrac{1}{n}]}$.

As above, one then has

$\|f_n\|_p = n^{k - \tfrac{1}{p}} \quad \text{and} \quad \lim_{n \to \infty} \|f_n\|_p = \begin{cases} 0 & \text{if } p < \tfrac{1}{k} \\ \infty & \text{if } \tfrac{1}{k} < p, \end{cases}$

so for $p < \tfrac{1}{k} < p^*$ the sequence converges to 0 in the p-th mean but not in the $p^*$-th mean.

## Cauchy sequences

A sequence of functions $(f_n)_{n \in \mathbb{N}}$ in $\mathcal{L}^p$ (or $L^p$) is called a Cauchy sequence for convergence in the p-th mean if for every $\epsilon > 0$ there is an index $N$ such that

$\|f_n - f_m\|_p < \epsilon$

for all $n, m \geq N$. Every sequence that converges in the p-th mean is a Cauchy sequence, since for $p \in [1, \infty)$

$\|f_n - f_m\|_p = \|f_n - f + f - f_m\|_p \leq \|f_n - f\|_p + \|f_m - f\|_p$,

and for $p \in (0,1)$ the same inequality holds with $\|\cdot\|_p^p$. The Fischer-Riesz theorem yields the converse: every Cauchy sequence converges. Hence $\mathcal{L}^p$ and $L^p$ are complete spaces.

## Relationship to convergence concepts of probability theory

In general, the following implications hold between the convergence concepts of probability theory:

$\text{almost sure convergence} \implies \text{convergence in probability} \implies \text{convergence in distribution}$

and

$\text{convergence in the } p\text{-th mean} \implies \text{convergence in probability} \implies \text{convergence in distribution}.$

Convergence in the p-th mean is thus one of the strong convergence concepts of probability theory. The relationships with the other types of convergence are detailed in the sections below.

### Convergence in probability

Convergence in probability follows directly from convergence in the p-th mean. To see this, the Markov inequality is applied to the function $h(y) = y^p$, which is monotonically increasing for $p > 0$, and the random variable $Y = |X_n - X|$. This yields

$P(|X_n - X| \geq \epsilon) \leq \frac{1}{\epsilon^p} \operatorname{E}(|X_n - X|^p)$,

and the right-hand side tends to zero in the limit. The converse is generally false. As an example, consider the random variables defined by

$P(X_n = e^{n\alpha}) = e^{-n} = 1 - P(X_n = 0)$

with $\alpha > 0$. Then

$\operatorname{E}(|X_n|^1) = e^{n\alpha} e^{-n} = e^{n(\alpha - 1)} \xrightarrow{n \to \infty} 0$

if $\alpha < 1$, so for $\alpha \in (0,1)$ the sequence converges in mean to 0. On the other hand, for any $\epsilon \in (0,1)$,

$P(|X_n| \geq \epsilon) = P(X_n = e^{n\alpha}) = e^{-n} \xrightarrow{n \to \infty} 0$,

so the sequence converges in probability to 0 for every $\alpha$; for $\alpha \geq 1$, however, it does not converge in mean.
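These two computations can be reproduced numerically. The sketch below (illustrative only) evaluates $\operatorname{E}(|X_n|) = e^{n(\alpha - 1)}$ and $P(|X_n| \geq \epsilon) = e^{-n}$ for $\alpha = 2$, where the mean diverges while the tail probability vanishes:

```python
import math

# For the example P(X_n = e^{n*alpha}) = e^{-n}, P(X_n = 0) = 1 - e^{-n}:
def mean(n: int, alpha: float) -> float:
    """E(|X_n|) = e^{n*alpha} * e^{-n} = e^{n*(alpha - 1)}."""
    return math.exp(n * (alpha - 1.0))

def tail(n: int) -> float:
    """P(|X_n| >= eps) for any eps in (0,1): the probability of the large value."""
    return math.exp(-n)

# alpha = 2: convergence in probability, but not in mean
print(mean(20, 2.0) > 1e8)  # True: the first moment explodes
print(tail(20) < 1e-8)      # True: the tail probability vanishes
```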

A criterion under which convergence in the p-th mean follows from convergence in probability is the existence of a dominating random variable $Y$ with $\operatorname{E}(|Y|^p) < \infty$ such that $P(|X_n| \leq Y) = 1$ for all $n$. If the $X_n$ then converge in probability to $X$, they also converge to $X$ in the p-th mean. More generally, a connection between convergence in the p-th mean and convergence in probability can be drawn via Vitali's convergence theorem and uniform integrability in the p-th mean: a sequence converges in the p-th mean if and only if it is uniformly integrable in the p-th mean and converges in probability.

### Almost sure convergence

In general, almost sure convergence does not follow from convergence in the p-th mean. Consider, for example, a sequence of stochastically independent random variables with

$P(X_n = 1) = 1 - P(X_n = 0) = \tfrac{1}{n}$.

Then for every $p > 0$

$\operatorname{E}(|X_n|^p) = P(X_n = 1) = \tfrac{1}{n}$,

which converges to zero, so the random variables converge in the p-th mean to 0. They do not, however, converge almost surely, as can be shown using the second Borel-Cantelli lemma.
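The second Borel-Cantelli lemma applies here because the events $\{X_n = 1\}$ are independent and their probabilities $1/n$ sum to infinity (the harmonic series diverges), so $X_n = 1$ occurs infinitely often almost surely. A small sketch (illustrative only) confirms the divergence numerically:

```python
# Partial sums of sum 1/n grow like log(n) without bound; by the second
# Borel-Cantelli lemma, X_n = 1 therefore occurs infinitely often a.s.
def harmonic(n: int) -> float:
    return sum(1.0 / k for k in range(1, n + 1))

print(harmonic(10**3))  # ~7.49
print(harmonic(10**6))  # ~14.39 -- keeps growing, so no almost sure convergence
```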

However, if a sequence of random variables $(X_n)_{n \in \mathbb{N}}$ converges in the p-th mean to $X$ and

$\sum_{n = 1}^{\infty} \operatorname{E}(|X_n - X|^p) < \infty$

holds, then the sequence also converges almost surely to $X$. The convergence must therefore be "fast enough". (Alternatively, one can use the fact that under the hypotheses of Vitali's convergence theorem, convergence in probability and almost sure convergence coincide. If the requirements of that theorem are met, convergence in the p-th mean implies almost sure convergence, since convergence in probability follows automatically from convergence in the p-th mean.)

Conversely, almost sure convergence does not imply convergence in the p-th mean. For example, consider on the probability space $([0,1], \mathcal{B}([0,1]), \mathcal{U}_{[0,1]})$ the random variables

$X_n(\omega) = n^2 \cdot \mathbf{1}_{[0, \tfrac{1}{n}]}(\omega)$.

This sequence converges pointwise to 0 for every $\omega \in (0,1]$ and therefore almost surely to 0 (here $\mathcal{U}_{[0,1]}$ denotes the uniform distribution on $[0,1]$). However,

$\operatorname{E}(|X_n|^p) = n^{2p - 1}$,

so the sequence of p-th moments is unbounded for every $p \geq 1$ and cannot converge.

The theorem of dominated convergence provides a criterion under which the conclusion does hold: if $X_n$ converges almost surely to $X$ and there is a random variable $Y$ with $\operatorname{E}(|Y|^p) < \infty$ and $|X_n| \leq Y$ almost surely, then $X_n$ converges to $X$ in the p-th mean and, moreover, $\operatorname{E}(|X|^p) < \infty$.

## Relation to the convergence concepts of measure theory

### Local convergence in measure

According to Vitali's convergence theorem, a sequence converges in the p-th mean if and only if it converges locally in measure and is uniformly integrable in the p-th mean.

The uniform integrability cannot be dispensed with, as the following example illustrates. Set $p = 1$ and define the sequence of functions

$f_n = n^2 \chi_{[0, 1/n]}$

on the measure space $([0,1], \mathcal{B}([0,1]), \lambda|_{[0,1]})$. This sequence converges locally in measure to 0, because for $\varepsilon \in (0,1]$

$\lim_{n \to \infty} \lambda(\{n^2 \chi_{[0, 1/n]} \geq \varepsilon\}) = \lim_{n \to \infty} \frac{1}{n} = 0$.

But it is not uniformly integrable (in the first mean), since

$\inf_{a \in [0, \infty)} \sup_{f \in (f_n)_{n \in \mathbb{N}}} \int_{\{a < |f|\}} |f| \, \mathrm{d}\lambda = \infty$.

By Vitali's convergence theorem, it accordingly does not converge to 0 in the first mean either; indeed,

$\lim_{n \to \infty} \int_{[0,1]} |f_n| \, \mathrm{d}\lambda = \lim_{n \to \infty} n^2 \cdot \frac{1}{n} = \infty$.

Nor can the local convergence in measure be dispensed with: choose $p = 1$ and the measure space $([0,1], \mathcal{B}([0,1]), \lambda|_{[0,1]})$, and consider the sequence of functions defined by

$f_n := \begin{cases} \chi_{[0, 1/2]} & \text{for } n \text{ even} \\ \chi_{(1/2, 1]} & \text{for } n \text{ odd.} \end{cases}$

This sequence is uniformly integrable in the first mean, since it is dominated by the integrable function that is constant 1. Due to its oscillating behavior, however, it cannot converge locally in measure: for $\varepsilon < \tfrac{1}{2}$ there is no function $f$ on the base set $[0,1]$ for which $\lambda(\{|f_n - f| \geq \varepsilon\})$ becomes small. By an analogous argument it follows that the sequence does not converge in the first mean either.
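The failure to converge in the first mean can also be seen directly: consecutive terms of the oscillating sequence stay at $L^1$-distance 1, so the sequence is not even a Cauchy sequence. A short numerical sketch (midpoint-rule Riemann sum, illustrative only) confirms this:

```python
# ||f_even - f_odd||_1 on [0,1], where f_even = chi_[0,1/2], f_odd = chi_(1/2,1].
# The integrand |f_even - f_odd| equals 1 everywhere, so the distance is 1,
# and the sequence cannot be a Cauchy sequence in the first mean.
def l1_distance(num_points: int = 100_000) -> float:
    h = 1.0 / num_points
    total = 0.0
    for i in range(num_points):
        x = (i + 0.5) * h                      # midpoint rule
        f_even = 1.0 if x <= 0.5 else 0.0
        f_odd = 1.0 if x > 0.5 else 0.0
        total += abs(f_even - f_odd) * h
    return total

print(l1_distance())  # ~1.0
```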

### Convergence in measure

Convergence in measure follows from convergence in the p-th mean, since

$\mu(\{|f_n - f| \geq \varepsilon\}) \leq \frac{1}{\varepsilon^p} \int_X |f_n - f|^p \, \mathrm{d}\mu = \frac{1}{\varepsilon^p} \|f_n - f\|_p^p$.

According to Vitali's convergence theorem, convergence in the p-th mean is equivalent to convergence in measure together with uniform integrability in the p-th mean. Neither the convergence in measure nor the uniform integrability can be dispensed with; examples can be found in the previous section.

### Pointwise convergence μ-almost everywhere

Pointwise convergence μ-almost everywhere does not in general imply convergence in the p-th mean; likewise, convergence in the p-th mean does not in general imply pointwise convergence μ-almost everywhere.

An example of the first direction is the sequence of functions

$f_n(x) = n^2 \chi_{[0, \tfrac{1}{n}]}(x)$

on the measure space $([0,1], \mathcal{B}([0,1]), \lambda)$. It converges pointwise to 0 almost everywhere, but

$\|f_n\|_1 = n \quad \text{and thus} \quad \lim_{n \to \infty} \|f_n\|_1 = \infty$.

Conversely, consider the sequence of intervals

$(I_n)_{n \in \mathbb{N}} = [0,1], [0, \tfrac{1}{2}], [\tfrac{1}{2}, 1], [0, \tfrac{1}{3}], [\tfrac{1}{3}, \tfrac{2}{3}], [\tfrac{2}{3}, 1], [0, \tfrac{1}{4}], [\tfrac{1}{4}, \tfrac{2}{4}], \dots$

and define the sequence of functions as

$f_n(x) = \chi_{I_n}(x)$.

Then $\lim_{n \to \infty} \|f_n\|_1 = 0$, because the widths of the intervals converge to zero. However, the sequence does not converge pointwise to 0 almost everywhere, since at every point $x$ each of the values 0 and 1 is attained infinitely often.
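This "typewriter" interval sequence can be generated programmatically; the sketch below (illustrative) verifies that the interval widths, i.e. $\|f_n\|_1$, shrink to zero while $f_n(x)$ at a fixed point such as $x = 1/3$ keeps taking both values 0 and 1:

```python
from fractions import Fraction

def intervals(count: int):
    """First `count` intervals [j/m, (j+1)/m] of the sequence
    [0,1], [0,1/2], [1/2,1], [0,1/3], [1/3,2/3], [2/3,1], ..."""
    out, m = [], 1
    while len(out) < count:
        for j in range(m):
            if len(out) == count:
                break
            out.append((Fraction(j, m), Fraction(j + 1, m)))
        m += 1
    return out

ivs = intervals(100)
widths = [b - a for a, b in ivs]                               # ||f_n||_1
values = [1 if a <= Fraction(1, 3) <= b else 0 for a, b in ivs]  # f_n(1/3)

print(widths[-1])                               # 1/14: the L^1 norms tend to 0
print(1 in values[-20:] and 0 in values[-20:])  # True: f_n(1/3) keeps oscillating
```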

However, every sequence converging in the p-th mean has a subsequence that converges almost everywhere to the same limit. In the example above, one could for instance select indices $n_k$ such that

$I_{n_k} = [0, \tfrac{1}{m}]$

for $m \in \mathbb{N}$; the subsequence $f_{n_k}$ then converges pointwise to 0 almost everywhere.

A criterion under which convergence in the p-th mean follows from pointwise convergence μ-almost everywhere is provided by the theorem of dominated convergence. It states that if, in addition to the convergence, there is a dominating function in $\mathcal{L}^p$, then convergence in the p-th mean follows as well. More generally, instead of the existence of a dominating function it suffices to require uniform integrability in the p-th mean of the sequence of functions, since convergence almost everywhere implies local convergence in measure; convergence in the p-th mean then follows by Vitali's convergence theorem. From this perspective, the dominating function is merely a sufficient criterion for uniform integrability.

### Uniform convergence μ-almost everywhere

In the case of a finite measure space, uniform convergence almost everywhere implies convergence in the p-th mean for $p \in (0, \infty)$, because by means of the Hölder inequality one can show that

$\|f\|_p \leq \mu(X)^{1/p} \|f\|_{\infty}$

holds. For non-finite measure spaces, however, this conclusion is generally false. For example, the sequence of functions

$f_n(x) = \tfrac{1}{n} \chi_{[0, n]}(x)$

on $(\mathbb{R}, \mathcal{B}(\mathbb{R}), \lambda)$ satisfies

$\lim_{n \to \infty} \|f_n\|_{\infty} = \lim_{n \to \infty} \tfrac{1}{n} = 0 \quad \text{but} \quad \lim_{n \to \infty} \|f_n\|_1 = 1$.

The conclusion from convergence in the p-th mean to uniform convergence almost everywhere is false both in finite and in general measure spaces. For example, the sequence of functions $f_n(x) = x^n$ on the finite measure space $([0,1], \mathcal{B}([0,1]), \lambda)$ converges in the p-th mean to 0 for $p \in [1, \infty)$, but not uniformly to 0 almost everywhere.
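For $f_n(x) = x^n$ the p-norm has the closed form $\|f_n\|_p = (np + 1)^{-1/p}$, while the supremum of $x^n$ on $[0,1]$ (even after removing any null set) stays 1 for every $n$. A quick sketch (illustrative):

```python
def p_norm_xn(n: int, p: float) -> float:
    """||x^n||_p on [0,1]: (integral_0^1 x^{n*p} dx)^(1/p) = (n*p + 1)^(-1/p)."""
    return (n * p + 1.0) ** (-1.0 / p)

print(p_norm_xn(10**6, 1.0) < 1e-5)  # True: x^n -> 0 in the p-th mean
print(p_norm_xn(10, 2.0) > p_norm_xn(100, 2.0) > p_norm_xn(1000, 2.0))  # True
# yet sup |x^n| on [0,1] equals 1 for every n, so no uniform convergence a.e.
```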

### Weak convergence in $L^p$

Every sequence converging in the p-th mean with $p \in [1, \infty)$ also converges weakly, because by the Hölder inequality with $\frac{1}{p} + \frac{1}{q} = 1$ it follows for $g \in L^q$ that

$\left| \int_X f_n g \, \mathrm{d}\mu - \int_X f g \, \mathrm{d}\mu \right| \leq \int_X |f_n - f| |g| \, \mathrm{d}\mu \leq \|f_n - f\|_p \|g\|_q$,

so the right-hand side provides a convergent bound and the limits coincide. The Radon-Riesz theorem provides a converse under one additional condition: for $p \in (1, \infty)$, a sequence of functions converges in the p-th mean if and only if it converges weakly and the sequence of the norms of the functions converges to the norm of the limit function.