
Chebyshev’s inequality

Gerold Alsmeyer
Institut für Mathematische Statistik
Einsteinstraße 62
D-48149 Münster, Germany
July 26, 2010

Chebyshev’s inequality is one of the most common inequalities used in probability theory to bound the tail probabilities of a random variable X having finite variance σ² = Var X. It states that

P(|X − µ| ≥ t) ≤ σ²/t²   for all t > 0,   (0.1)

where µ = E X denotes the mean of X. Of course, the given bound is of use only if t is bigger than the standard deviation σ. Instead of proving (0.1) we will give a proof of the more general Markov’s inequality, which states that for any nondecreasing function g : [0, ∞) → [0, ∞) and any nonnegative random variable Y,

P(Y ≥ t) ≤ E g(Y)/g(t)   for all t > 0.   (0.2)
Indeed, choosing Y = |X − µ| and g(x) = x² gives (0.1). The proof of Markov’s inequality is very easy: for any t > 0 with g(t) > 0,

P(Y ≥ t) = ∫_{Y ≥ t} 1 dP ≤ ∫_{Y ≥ t} g(Y)/g(t) dP ≤ E g(Y)/g(t),

where the first inequality uses that g(Y) ≥ g(t) on the event {Y ≥ t} because g is nondecreasing.
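For a quick numerical illustration, the following Python sketch (assuming NumPy is available; the exponential distribution and the sample size are arbitrary choices) compares an empirical two-sided tail probability with the Chebyshev bound σ²/t²:

import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=1_000_000)   # illustrative sample: mean 1, variance 1
mu, sigma2 = x.mean(), x.var()
for t in (1.5, 2.0, 3.0):
    empirical = np.mean(np.abs(x - mu) >= t)     # estimate of P(|X - mu| >= t)
    bound = sigma2 / t**2                        # right-hand side of (0.1)
    print(f"t = {t}: empirical {empirical:.4f} <= bound {bound:.4f}")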

Plainly, (0.1) also provides us with the same bound σ²t⁻² for the one-sided tail probability P(X − µ > t), but in this case an improvement is obtained by the following consideration: for any c ≥ 0, we infer from Markov’s inequality with g(x) = x²

P(X − µ ≥ t) = P(X − µ + c ≥ t + c) ≤ E(X − µ + c)²/(t + c)² = (σ² + c²)/(t + c)².

The right-hand side becomes minimal at c = σ²/t, giving the one-sided tail bound

P(X − µ > t) ≤ σ²/(σ² + t²)   for all t > 0,   (0.3)

sometimes called Cantelli’s inequality.
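For completeness, the minimization over c is a routine calculus step and can be carried out explicitly:

\[
\frac{d}{dc}\,\frac{\sigma^2 + c^2}{(t+c)^2}
  = \frac{2c(t+c)^2 - 2(t+c)(\sigma^2 + c^2)}{(t+c)^4}
  = \frac{2(ct - \sigma^2)}{(t+c)^3},
\]

which vanishes at c = σ²/t; inserting this value gives

\[
\frac{\sigma^2 + \sigma^4/t^2}{(t + \sigma^2/t)^2}
  = \frac{\sigma^2(t^2 + \sigma^2)/t^2}{(t^2 + \sigma^2)^2/t^2}
  = \frac{\sigma^2}{\sigma^2 + t^2}.
\]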
Although Chebyshev’s inequality may produce only a rather crude bound, its advantage lies in the fact that it applies to any random variable with finite variance. Moreover, within the class of all such random variables the bound is indeed tight: if X has a symmetric distribution on {−a, 0, a} with P(X = ±a) = 1/(2a²) and P(X = 0) = 1 − 1/a² for some a > 1, then µ = 0, σ² = 1 and

P(|X| ≥ a) = P(|X| = a) = 1/a²,

which means that equality holds in (0.1) for t = a.
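For concreteness, the extremal example can be checked by exact arithmetic in Python; the value a = 2 below is an arbitrary choice:

from fractions import Fraction

a = 2
p_a = Fraction(1, 2 * a**2)            # P(X = a) = P(X = -a) = 1/(2a^2)
p_0 = 1 - Fraction(1, a**2)            # P(X = 0) = 1 - 1/a^2
mean = a * p_a + (-a) * p_a            # = 0
var  = a**2 * p_a + (-a)**2 * p_a      # E X^2, since the mean is 0
tail = 2 * p_a                         # P(|X| >= a)
print(mean, var, tail, var / a**2)     # 0, 1, 1/4, 1/4: equality in (0.1) for t = a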
On the other hand, tighter bounds can be obtained when imposing additional conditions on the considered distributions. One such example is the following Vysočanskiĭ–Petunīn inequality for random variables X with a unimodal distribution:

P(|X − µ| ≥ t) ≤ 4σ²/(9t²)   for all t > √(8/3) σ.   (0.4)

This improves (0.1) by a factor of 4/9 for sufficiently large t.
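For orientation, the following Python sketch (standard library only; the standard normal is chosen merely as a convenient unimodal example) compares both bounds with an exact tail probability:

import math

for lam in (2.0, 3.0):                           # t = lam * sigma, with sigma = 1
    chebyshev = 1.0 / lam**2                     # bound (0.1)
    vys_pet   = 4.0 / (9.0 * lam**2)             # bound (0.4); lam > sqrt(8/3) here
    exact     = math.erfc(lam / math.sqrt(2.0))  # P(|Z| >= lam) for Z ~ N(0, 1)
    print(f"lam = {lam}: exact {exact:.4f}, (0.4) {vys_pet:.4f}, (0.1) {chebyshev:.4f}")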
One of the most common applications of Chebyshev’s inequality is the weak law of large numbers (WLLN). Suppose we are given a sequence (S_n)_{n≥1} of real-valued random variables with independent increments X_1, X_2, ... such that µ_n := E X_n and σ_n² := Var X_n are finite for all n ≥ 1. Defining

m_n := E S_n = Σ_{k=1}^{n} µ_k   and   s_n² := Var S_n = Σ_{k=1}^{n} σ_k²

and assuming Markov’s condition

lim_{n→∞} s_n²/n² = 0,   (0.5)

we infer by making use of (0.1) that, for any ε > 0,

P(|S_n − m_n|/n ≥ ε) ≤ s_n²/(ε² n²) → 0   as n → ∞,

and therefore

(S_n − m_n)/n → 0 in probability.   (WLLN)
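A minimal simulation sketch in Python (assuming NumPy; the uniform increments and the sample sizes below are arbitrary choices) shows both the empirical tail probability and the Chebyshev bound shrinking as n grows:

import numpy as np

rng = np.random.default_rng(1)
eps, copies = 0.01, 1000
for n in (10**2, 10**3, 10**4):
    x = rng.uniform(size=(copies, n))        # i.i.d. U(0,1) increments: mu = 1/2, var = 1/12
    dev = np.abs(x.mean(axis=1) - 0.5)       # |S_n/n - mu| over independent copies
    empirical = np.mean(dev >= eps)
    bound = (1 / 12) / (eps**2 * n)          # s_n^2 / (eps^2 n^2) as in the display above
    print(f"n = {n}: empirical {empirical:.3f} <= bound {min(bound, 1):.3f}")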
This result applies particularly to the case of i.i.d. X_1, X_2, ... Then m_n = nµ and s_n² = nσ², where µ := E X_1 and σ² := Var X_1. In this case, Chebyshev’s inequality further gives, for all ε > 0 and β > 1/2, that

Σ_{n≥2} P(|S_n − nµ|/n ≥ ε log^β n) ≤ Σ_{n≥2} σ²/(ε² n log^{2β} n) < ∞
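The finiteness of the bounding series follows from the integral test, and this is precisely where β > 1/2 is needed:

\[
\sum_{n \ge 2} \frac{1}{n (\log n)^{2\beta}} < \infty
  \quad\Longleftrightarrow\quad
  \int_{2}^{\infty} \frac{dx}{x (\log x)^{2\beta}}
  = \int_{\log 2}^{\infty} u^{-2\beta}\,du < \infty
  \quad\Longleftrightarrow\quad
  2\beta > 1.
\]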


and thus, by invoking the Borel–Cantelli lemma,

(S_n − nµ)/(n log^β n) → 0 a.s. for all β > 1/2.   (0.6)

This is not quite the strong law of large numbers (β = 0) but gets close to it. In fact, in order to derive the latter, a stronger variant of Chebyshev’s inequality, called Kolmogorov’s inequality, may be employed; it states that

P( max_{1≤k≤n} |S_k − m_k| ≥ t ) ≤ s_n²/t²   for all t > 0

under the same independence assumptions stated above for the WLLN. Notice the similarity to Chebyshev’s inequality in that only |S_n − m_n| has been replaced with max_{1≤k≤n} |S_k − m_k| while retaining the bound.
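As a quick numerical check in Python (assuming NumPy; Rademacher increments are chosen only for simplicity), the maximal inequality can be compared with simulation:

import numpy as np

rng = np.random.default_rng(2)
n, copies, t = 200, 5000, 25.0
steps = rng.choice([-1.0, 1.0], size=(copies, n))             # i.i.d. increments: m_k = 0, s_n^2 = n
max_abs = np.max(np.abs(np.cumsum(steps, axis=1)), axis=1)    # max over 1 <= k <= n of |S_k|
print(np.mean(max_abs >= t), "<=", n / t**2)                  # empirical probability vs s_n^2 / t^2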
Let us finally note that, if X has mean µ, median m and finite variance σ², then the one-sided version (0.3) of Chebyshev’s inequality, applied with t = σ to X and to −X, shows that

P(X − µ ≥ σ) ≤ 1/2   and   P(X − µ ≤ −σ) ≤ 1/2;

in other words, the median of X is always within one standard deviation of its mean, i.e. |m − µ| ≤ σ.
Bibliographical notes: (0.1) dates back to Chebyshev’s original work [1] but is nowadays found in any standard textbook on probability theory, such as [2]. The latter also contains a proof of the one-sided version (0.3) which differs from the one given here. (0.4) for unimodal distributions is taken from [6]; see also [5]. For multivariate extensions of Chebyshev’s inequality see [4] and [3].

References
[1] P. Chebyshev. Des valeurs moyennes. Liouville’s J. Math. Pures Appl., 12:177–184, 1867.

[2] W. Feller. An Introduction to Probability Theory and Its Applications, Vol. II. Second edition. John Wiley & Sons Inc., New York, 1971.

[3] D. Monhor. A Chebyshev inequality for multivariate normal distribution. Probab. Engrg. Inform. Sci., 21(2):289–300, 2007.

[4] I. Olkin and J. W. Pratt. A multivariate Tchebycheff inequality. Ann. Math. Statist., 29:226–234, 1958.

[5] T. M. Sellke and S. H. Sellke. Chebyshev inequalities for unimodal distributions. Amer. Statist., 51(1):34–40, 1997.

[6] D. F. Vysočanskiı̆ and J. Ī. Petunı̄n. Proof of the 3σ rule for unimodal distributions. Teor. Veroyatnost. i Mat. Statist., 21:23–35, 163, 1979.
