Chebyshev
Gerold Alsmeyer
Institut für Mathematische Statistik
Einsteinstraße 62
D-48149 Münster, Germany
July 26, 2010
Plainly, (0.1) also provides us with the same bound $\sigma^2 t^{-2}$ for the one-sided tail probability $P(X - \mu > t)$, but in this case an improvement is obtained by the following consideration: for any $c \ge 0$, we infer from Markov's inequality with $g(x) = x^2$ that
$$P(X - \mu \ge t) \;=\; P(X - \mu + c \ge t + c) \;\le\; \frac{E(X - \mu + c)^2}{(t + c)^2} \;=\; \frac{\sigma^2 + c^2}{(t + c)^2}.$$
The right-hand side becomes minimal at $c = \sigma^2/t$, giving the one-sided tail bound
$$P(X - \mu > t) \;\le\; \frac{\sigma^2}{\sigma^2 + t^2} \quad \text{for all } t > 0, \tag{0.3}$$
sometimes called Cantelli's inequality.
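To verify the choice of $c$ (a routine calculation spelled out here for convenience, not part of the original text), write $h(c) = (\sigma^2 + c^2)/(t + c)^2$ and differentiate:
$$h'(c) \;=\; \frac{2c(t+c)^2 - 2(t+c)(\sigma^2 + c^2)}{(t+c)^4} \;=\; \frac{2(ct - \sigma^2)}{(t+c)^3},$$
which is negative for $c < \sigma^2/t$ and positive for $c > \sigma^2/t$; inserting $c = \sigma^2/t$ indeed gives $h(\sigma^2/t) = \sigma^2/(\sigma^2 + t^2)$.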
Although Chebyshev's inequality may produce only a rather crude bound, its advantage lies in the fact that it applies to any random variable with finite
variance. Moreover, within the class of all such random variables the bound is indeed tight because, if $X$ has a symmetric distribution on $\{-a, 0, a\}$ with $P(X = \pm a) = 1/(2a^2)$ and $P(X = 0) = 1 - 1/a^2$ for some $a > 1$, then $\mu = 0$, $\sigma^2 = 1$ and
$$P(|X| \ge a) \;=\; P(|X| = a) \;=\; \frac{1}{a^2},$$
which means that equality holds in (0.1) for t = a.
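As a quick numerical illustration (a minimal Python sketch, not part of the original article; the choice $a = 2$, the seed and the sample size are arbitrary), one can check that this three-point distribution attains the Chebyshev bound exactly:

import numpy as np

rng = np.random.default_rng(0)

a = 2.0                                    # any a > 1 works; a = 2 is arbitrary
support = np.array([-a, 0.0, a])           # symmetric law on {-a, 0, a}
probs = np.array([1/(2*a**2), 1 - 1/a**2, 1/(2*a**2)])

mu = float(support @ probs)                # mean, here 0
var = float(((support - mu)**2) @ probs)   # variance, here 1

X = rng.choice(support, size=1_000_000, p=probs)
empirical = np.mean(np.abs(X - mu) >= a)   # estimates P(|X - mu| >= a)
bound = var / a**2                         # Chebyshev bound sigma^2/t^2 at t = a

print(mu, var)                             # 0.0 1.0
print(empirical, bound)                    # both close to 1/a^2 = 0.25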
On the other hand, tighter bounds can be obtained when imposing additional conditions on the considered distributions. One such example is the following Vysočanskiı̆–Petunı̄n inequality for random variables $X$ with a unimodal distribution:
$$P(|X - \mu| \ge t) \;\le\; \frac{4\sigma^2}{9t^2} \quad \text{for all } t > \sqrt{8/3}\,\sigma. \tag{0.4}$$
This improves (0.1) by a factor $4/9$ for sufficiently large $t$.
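For instance (a worked comparison added here for illustration): at $t = 2\sigma$, which satisfies $2\sigma > \sqrt{8/3}\,\sigma \approx 1.633\,\sigma$, Chebyshev's inequality (0.1) gives the bound $\sigma^2/t^2 = 1/4$, whereas (0.4) gives $4\sigma^2/(9t^2) = 1/9$.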
One of the most common applications of Chebyshev’s inequality is the weak
law of large numbers (WLLN). Suppose we are given a sequence $(S_n)_{n \ge 1}$ of real-valued random variables with independent increments $X_1, X_2, \dots$ such that $\mu_n := E X_n$ and $\sigma_n^2 := \operatorname{Var} X_n$ are finite for all $n \ge 1$. Defining
$$m_n := E S_n = \sum_{k=1}^{n} \mu_k \quad \text{and} \quad s_n^2 := \operatorname{Var} S_n = \sum_{k=1}^{n} \sigma_k^2,$$
suppose further that
$$\lim_{n \to \infty} \frac{s_n^2}{n^2} = 0. \tag{0.5}$$
Then Chebyshev's inequality gives $P(|S_n - m_n| \ge \varepsilon n) \le s_n^2/(\varepsilon^2 n^2) \to 0$ for every $\varepsilon > 0$, and therefore
$$\frac{S_n - m_n}{n} \;\to\; 0 \quad \text{in probability.} \tag{WLLN}$$
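A small simulation sketch in Python (illustrative only; the normal increments and all parameters are my own choices) makes the shrinking Chebyshev bound visible:

import numpy as np

rng = np.random.default_rng(1)

mu, sigma = 0.5, 2.0        # any law with finite variance would do
eps = 0.1                   # deviation level in P(|S_n/n - mu| >= eps)
trials = 2_000

for n in (10**2, 10**3, 10**4):
    S = rng.normal(mu, sigma, size=(trials, n)).sum(axis=1)
    empirical = np.mean(np.abs(S / n - mu) >= eps)
    bound = sigma**2 / (eps**2 * n)   # s_n^2/(eps^2 n^2) = sigma^2/(eps^2 n)
    print(n, empirical, min(bound, 1.0))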
This result applies particularly to the case of i.i.d. $X_1, X_2, \dots$; then $m_n = n\mu$ and $s_n^2 = n\sigma^2$, where $\mu := E X_1$ and $\sigma^2 := \operatorname{Var} X_1$. In this case, Chebyshev's inequality further gives, for all $\varepsilon > 0$ and $\beta > 1/2$, that
$$\sum_{n \ge 2} P\!\left(\left|\frac{S_n - n\mu}{n}\right| \ge \varepsilon \log^{\beta} n\right) \;\le\; \sum_{n \ge 2} \frac{\sigma^2}{\varepsilon^2\, n \log^{2\beta} n} \;<\; \infty,$$
where the sums start at $n = 2$ because $\log 1 = 0$, and the right-hand series converges precisely when $2\beta > 1$.
This is not quite the strong law of large numbers (the case $\beta = 0$), but it gets close: by the Borel–Cantelli lemma, the finiteness of the series implies $(S_n - n\mu)/(n \log^{\beta} n) \to 0$ almost surely. In fact, to derive the strong law itself, a stronger variant of Chebyshev's inequality, called Kolmogorov's inequality, may be employed, which states that
$$P\!\left(\max_{1 \le k \le n} |S_k - m_k| \ge t\right) \;\le\; \frac{s_n^2}{t^2} \quad \text{for all } t > 0$$
under the same independence assumptions stated above for the WLLN. Notice the similarity to Chebyshev's inequality: only $|S_n - m_n|$ has been replaced with $\max_{1 \le k \le n} |S_k - m_k|$, while the bound is retained.
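The following Python sketch (again illustrative; the centered normal increments, horizon $n$, and threshold $t$ are arbitrary choices) compares the empirical probability of a large maximal partial-sum deviation with the bound $s_n^2/t^2$:

import numpy as np

rng = np.random.default_rng(2)

n, trials = 100, 100_000
sigma, t = 1.0, 15.0

# centered i.i.d. increments, so m_k = 0 and s_n^2 = n * sigma^2
X = rng.normal(0.0, sigma, size=(trials, n))
S = np.cumsum(X, axis=1)                 # partial sums S_1, ..., S_n
max_dev = np.abs(S).max(axis=1)          # max_{1<=k<=n} |S_k - m_k|

empirical = np.mean(max_dev >= t)        # estimates the left-hand side
bound = n * sigma**2 / t**2              # s_n^2 / t^2

print(empirical, bound)                  # empirical value stays below the bound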
Let us finally note that, if $X$ has mean $\mu$, median $m$ and finite variance $\sigma^2$, then the one-sided version (0.3) of Chebyshev's inequality, applied to $X$ and to $-X$ with $t = \sigma$, shows that
$$P(X - \mu \ge \sigma) \;\le\; \frac{1}{2} \quad \text{and} \quad P(X - \mu \le -\sigma) \;\le\; \frac{1}{2};$$
in other words, the median of $X$ is always within one standard deviation of its mean.
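For completeness, here is one way to spell out the final step (a routine argument not detailed above): if $m > \mu + \sigma$, then applying the bound $P(X - \mu \ge t) \le \sigma^2/(\sigma^2 + t^2)$ at $t = m - \mu > \sigma$ yields
$$\frac{1}{2} \;\le\; P(X \ge m) \;=\; P(X - \mu \ge m - \mu) \;\le\; \frac{\sigma^2}{\sigma^2 + (m - \mu)^2} \;<\; \frac{1}{2},$$
a contradiction; the same argument applied to $-X$ gives $m \ge \mu - \sigma$, hence $|m - \mu| \le \sigma$.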
Bibliographical notes: (0.1) dates back to Chebyshev's original work [1], but is nowadays found in any standard textbook on probability theory, such as [2]. The latter also contains a proof of the one-sided version (0.3) which differs from the one given here. Inequality (0.4) for unimodal distributions is taken from [6]; see also [5]. For multivariate extensions of Chebyshev's inequality see [4] and [3].
References
[1] P. Chebyshev. Des valeurs moyennes. Liouville’s J. Math. Pures Appl.,
12:177–184, 1867.
[6] D. F. Vysočanskiı̆ and J. Ī. Petunı̄n. Proof of the 3σ rule for unimodal
distributions. Teor. Veroyatnost. i Mat. Statist., 21:23–35, 163, 1979.