In probability theory, the Chernoff inequality, named after Herman Chernoff but going back to Herman Rubin, gives an upper bound on the probability that a sequence of independent Bernoulli experiments deviates from its expected number of successes.
The Chernoff inequality is a versatile and widely used tool in the analysis of randomized algorithms in computer science.
Statement
Let $X_1, X_2, \ldots, X_n$ be a sequence of $n$ independent Bernoulli experiments with $P[X_i = 1] = p$ and $P[X_i = 0] = 1 - p$. Accordingly, $pn$ is the expected number of successes ($X_i = 1$) of the experiment.
1. For every $\delta > 0$ the following holds:

$$P\left[\sum X_i \geq (1+\delta)\cdot pn\right] \leq \exp\left(-\frac{\min\{\delta,\delta^{2}\}}{3}\,pn\right)$$

2. For every $\delta \in [0,1]$ the following holds:

$$P\left[\sum X_i \leq (1-\delta)\cdot pn\right] \leq \exp\left(-\frac{\delta^{2}}{2}\,pn\right)$$
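Both bounds are straightforward to evaluate numerically. The following Python sketch (the parameter values for $p$, $n$ and $\delta$ are illustrative assumptions, not taken from the article) compares the two bounds with Monte Carlo estimates of the corresponding tail probabilities:

```python
import math
import random

def chernoff_upper(delta, p, n):
    """First Chernoff bound on P[sum X_i >= (1 + delta) * p * n]."""
    return math.exp(-min(delta, delta**2) / 3 * p * n)

def chernoff_lower(delta, p, n):
    """Second Chernoff bound on P[sum X_i <= (1 - delta) * p * n]."""
    return math.exp(-delta**2 / 2 * p * n)

def empirical_tails(delta, p, n, trials=100_000):
    """Monte Carlo estimates of both tail probabilities."""
    upper = lower = 0
    for _ in range(trials):
        s = sum(1 for _ in range(n) if random.random() < p)
        if s >= (1 + delta) * p * n:
            upper += 1
        if s <= (1 - delta) * p * n:
            lower += 1
    return upper / trials, lower / trials

p, n, delta = 0.5, 100, 0.2          # illustrative parameters, not from the article
emp_up, emp_lo = empirical_tails(delta, p, n)
print("upper tail:", emp_up, "bound:", chernoff_upper(delta, p, n))
print("lower tail:", emp_lo, "bound:", chernoff_lower(delta, p, n))
```

As expected, the empirical tail frequencies stay below the bounds; for moderate $n$ the bounds are usually far from tight.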
Proof
First Chernoff bound
Let $t > 0$ be an arbitrary constant. To simplify the notation, let $Y$ denote the new random variable $Y = \exp\left(t \sum X_i\right)$. Since the map $x \mapsto \exp(tx)$ is monotone, it follows that
$$P\left[\sum X_i \geq (1+\delta)\cdot pn\right] = P\left[Y \geq \exp\left(t(1+\delta)\cdot pn\right)\right] = P\left[Y \geq k\,\mathrm{E}[Y]\right] \leq \frac{1}{k},$$
where $k$ is defined as $k = \frac{\exp(t(1+\delta)pn)}{\mathrm{E}[Y]}$ and the last estimate follows from Markov's inequality. Now,
$$\mathrm{E}\left[\exp(tX_i)\right] = (1-p)e^{0} + pe^{t} = 1 + (e^{t}-1)p$$
and thus
$$\mathrm{E}[Y] = \mathrm{E}\left[\exp\left(t\sum X_i\right)\right] = \mathrm{E}\left[\prod \exp(tX_i)\right] = \prod \mathrm{E}\left[\exp(tX_i)\right] = \left(1+(e^{t}-1)p\right)^{n},$$

where the third equality uses the independence of the $X_i$.
Using $1 + x \leq e^{x}$, it follows that
$$\frac{1}{k} = e^{-t(1+\delta)pn}\left(1+(e^{t}-1)p\right)^{n} \leq e^{-t(1+\delta)pn}\cdot e^{(e^{t}-1)pn} = e^{-t(1+\delta)pn+(e^{t}-1)pn}.$$
Now choose $t = \log(1+\delta)$. Then
$$\frac{1}{k} \leq e^{-(\log(1+\delta))(1+\delta)pn+\delta pn} = e^{(\delta-(1+\delta)\log(1+\delta))pn}.$$
For the part $L(\delta) = (1+\delta)\log(1+\delta)$ of the exponent of the right-hand term, one can show by elementary curve analysis and Taylor series expansion that $L(\delta) \geq \delta + \tfrac{1}{3}\min\{\delta,\delta^{2}\}$ always holds (a short numerical check is sketched after the proof). Since the exponential function is monotone, it follows that
$$\frac{1}{k} \leq e^{(\delta-(\delta+\frac{1}{3}\min\{\delta,\delta^{2}\}))pn} = \exp\left(-\frac{\min\{\delta,\delta^{2}\}}{3}\,pn\right).$$
Together with the first estimate, this proves the assertion.
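The estimate $L(\delta) \geq \delta + \tfrac{1}{3}\min\{\delta,\delta^{2}\}$ is only asserted above; the following short Python check on a grid of test points is a plausibility sketch, not a substitute for the curve analysis:

```python
import math

def L(delta):
    """Exponent term L(delta) = (1 + delta) * log(1 + delta) from the proof."""
    return (1 + delta) * math.log(1 + delta)

# Check L(delta) >= delta + min(delta, delta**2) / 3 on a grid of test points.
for k in range(1, 2001):
    delta = k / 100.0                      # delta ranges over (0, 20]
    assert L(delta) >= delta + min(delta, delta**2) / 3, delta
print("inequality holds at all tested points")
```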
Second Chernoff bound
The proof of the second bound proceeds analogously to that of the first.
Variants
A more general variant of the Chernoff inequality can be formulated using the standard deviation. Let $X_1, X_2, \ldots, X_n$ be discrete, independent random variables with $\mathrm{E}[X_i] = 0$ and $|X_i| \leq 1$. Let $\sigma^{2}$ denote the variance of $X = \sum X_i$. Then for every $0 < \lambda \leq 2\sigma$:
$$P\left[\left|\sum X_i\right| \geq \lambda\sigma\right] \leq 2\exp\left(-\frac{\lambda^{2}}{4}\right)$$
The proof is technically analogous to the one shown above.
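As an illustration of this variant, the following sketch uses fair $\pm 1$ random variables (this concrete choice of $X_i$ and the parameter values are assumptions made for the example, not taken from the article) and compares an empirical tail estimate with the bound $2\exp(-\lambda^{2}/4)$:

```python
import math
import random

def empirical_tail(n, lam, trials=100_000):
    """Estimate P[|sum X_i| >= lam * sigma] for X_i uniform on {-1, +1}."""
    sigma = math.sqrt(n)                  # Var(X_i) = 1, hence Var(sum X_i) = n
    hits = 0
    for _ in range(trials):
        s = sum(random.choice((-1, 1)) for _ in range(n))
        if abs(s) >= lam * sigma:
            hits += 1
    return hits / trials

n, lam = 100, 3.0                         # lam must satisfy 0 < lam <= 2 * sigma
print(empirical_tail(n, lam), "<=", 2 * math.exp(-lam**2 / 4))
```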
Examples
Consider the following question: how likely is it to get at least seven tails when a fair coin is tossed ten times? The coin tosses yield Bernoulli trials $X_1, X_2, \ldots, X_{10}$ with $pn = \tfrac{1}{2}\cdot 10 = 5$. By the first Chernoff bound:
$$P\left[\sum X_i \geq 7\right] = P\left[\sum X_i \geq \left(1+\frac{4}{10}\right)\cdot 5\right] \leq \exp\left(-\frac{\min\{\frac{4}{10},\frac{16}{100}\}}{3}\cdot 5\right) = \exp\left(-\frac{4}{15}\right) \approx 0.766\ldots$$
Now rephrase the above example slightly and ask instead: how likely is it to get at least seventy tails in a hundred tosses of a fair coin? Here the first Chernoff bound immediately turns out to be much stronger:
$$P\left[\sum X_i \geq 70\right] = P\left[\sum X_i \geq \left(1+\frac{4}{10}\right)\cdot 50\right] \leq \exp\left(-\frac{\min\{\frac{4}{10},\frac{16}{100}\}}{3}\cdot 50\right) = \exp\left(-\frac{8}{3}\right) \approx 0.069\ldots$$
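For comparison, the exact tail probabilities of both coin-toss examples can be computed from the binomial distribution; the following sketch (Python standard library only) contrasts them with the Chernoff bounds $\exp(-4/15)$ and $\exp(-8/3)$ derived above:

```python
import math

def binomial_tail(n, p, k):
    """Exact P[sum X_i >= k] for n coin tosses with success probability p."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# (number of tosses, threshold, Chernoff bound computed in the examples above)
for n, k, bound in [(10, 7, math.exp(-4 / 15)), (100, 70, math.exp(-8 / 3))]:
    print(f"n={n:4d}: exact {binomial_tail(n, 0.5, k):.6f} <= bound {bound:.4f}")
```

This also shows that the bound is rather loose for ten tosses but clearly informative for a hundred.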
References
Herman Chernoff: A career in statistics. In: Xihong Lin, Christian Genest, David L. Banks, Geert Molenberghs, David W. Scott, Jane-Ling Wang (eds.): Past, Present, and Future of Statistics. CRC Press, 2014, ISBN 978-1-4822-0496-4, p. 35 (crcpress.com).
John Bather: A Conversation with Herman Chernoff. In: Statistical Science. 11, No. 4, November 1996, pp. 335-350. doi:10.1214/ss/1032280306.