
EE5110: Probability Foundations for Electrical Engineers July-November 2015

Lecture 27: Concentration Inequalities


Lecturer: Dr. Krishna Jagannathan Scribe: Arjun Nadh

A concentration inequality is a result that bounds the probability of a random variable taking an atypically large or atypically small value. While concentration of probability measures is a vast topic, we will only discuss some foundational concentration inequalities in this lecture.

27.1 Markov’s Inequality

If X is a non-negative random variable with E[X] < ∞, then for any α > 0,

P(X > α) ≤ E[X]/α.

Clearly, this inequality is meaningful only when α > E[X].

Proof:

E[X] = E[X I{X≤α}] + E[X I{X>α}]    (a)
     ≥ E[X I{X>α}]                  (b)
     ≥ α P(X > α),

where (a) follows from linearity of expectation, and (b) follows because X is non-negative, so E[X I{X≤α}] ≥ 0. The last inequality holds since X > α on the event {X > α}, which gives E[X I{X>α}] ≥ α E[I{X>α}] = α P(X > α).
Markov's inequality is probably the most fundamental concentration inequality, although it is usually quite loose. After all, the bound decays rather slowly, as 1/α. Tighter bounds can be derived under stronger assumptions on the random variable. For example, when the variance is finite, we have Chebyshev's inequality.
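As a quick numerical illustration (this sketch is an addition, not part of the original notes), the following Python snippet compares the empirical tail probability of an Exponential(1) random variable, which has E[X] = 1, against the Markov bound E[X]/α; the choice of distribution and sample size is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Exponential(1) is an arbitrary non-negative example with E[X] = 1.
samples = rng.exponential(scale=1.0, size=1_000_000)
mean = samples.mean()

for alpha in (2.0, 4.0, 8.0):
    empirical = (samples > alpha).mean()  # Monte Carlo estimate of P(X > alpha)
    markov = mean / alpha                 # Markov bound E[X]/alpha
    print(f"alpha = {alpha}: P(X > alpha) ~ {empirical:.4f}, Markov bound = {markov:.4f}")
```

Since the true tail here is e^{−α}, the output shows how loose the 1/α decay of the Markov bound typically is.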

27.2 Chebyshev's Inequality

If X is a random variable with expectation µ and variance σ² < ∞, then

P(|X − µ| > kσ) ≤ 1/k²,    k > 0.

Equivalently, setting c = kσ, this can also be written as

P(|X − µ| > c) ≤ σ²/c²,    c > 0.


Proof: The proof follows by applying Markov's inequality to the non-negative random variable |X − µ|²:

P(|X − µ|² > (kσ)²) ≤ E[|X − µ|²]/(kσ)² = σ²/(kσ)² = 1/k²,

which implies

P(|X − µ| > kσ) ≤ 1/k².
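As a numerical illustration (again an addition, not part of the original notes), the Python sketch below compares the empirical two-sided tail P(|X − µ| > kσ) of a standard normal sample with the Chebyshev bound 1/k²; the Gaussian is an arbitrary finite-variance choice.

```python
import numpy as np

rng = np.random.default_rng(1)

# Standard normal is an arbitrary finite-variance example (mu = 0, sigma = 1).
x = rng.standard_normal(1_000_000)
mu, sigma = x.mean(), x.std()

for k in (1.5, 2.0, 3.0):
    empirical = (np.abs(x - mu) > k * sigma).mean()  # estimate of P(|X - mu| > k*sigma)
    chebyshev = 1.0 / k**2                           # Chebyshev bound
    print(f"k = {k}: P(|X - mu| > k*sigma) ~ {empirical:.4f}, Chebyshev bound = {chebyshev:.4f}")
```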

Note that Chebyshev's bound decays as 1/k², an improvement over the basic Markov inequality. As one might imagine, exponentially decaying bounds can be derived by invoking the Markov inequality, as long as the moment generating function exists in a neighbourhood of the origin. This result is known as the Chernoff bound, which we present briefly.

27.3 Chernoff Bound

Let M_X(s) = E[e^{sX}], and assume that M_X(s) < ∞ for s ∈ [−ε, ε] for some ε > 0. Then

P(X > α) ≤ e^{−Λ*(α)},

where Λ*(α) = sup_{s>0} (sα − log M_X(s)).

Proof: For any s > 0, since x ↦ e^{sx} is strictly increasing,

P(X > α) = P(e^{sX} > e^{sα}).

By Markov's inequality,

P(X > α) ≤ E[e^{sX}]/e^{sα},

that is,

P(X > α) ≤ M_X(s) e^{−sα},    ∀ s > 0 such that s ∈ D_X,    (27.1)

where D_X = {s | M_X(s) < ∞}.

In (27.1), note that the bound decays exponentially in α for every s > 0 belonging to D_X. The tightest such exponential bound is obtained by infimising the right-hand side over s:

P(X > α) ≤ inf_{s>0} M_X(s) e^{−sα} = e^{−sup_{s>0} (sα − log M_X(s))}.

Thus,

P(X > α) ≤ e^{−Λ*(α)}.

This gives us an exponentially decaying bound for the 'positive tail' P(X > α). Similarly, we can prove a Chernoff bound for the negative tail P(X < α) by taking s < 0.
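As a short worked example (added here; it is not part of the original notes), consider a standard Gaussian X, for which M_X(s) = e^{s²/2} is finite for every s. Then

sα − log M_X(s) = sα − s²/2,

which, for α > 0, is maximised over s > 0 at s = α, giving Λ*(α) = α²/2. The Chernoff bound therefore reads

P(X > α) ≤ e^{−α²/2},

which decays much faster in α than the 1/α and 1/k² rates of the Markov and Chebyshev bounds.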

27.4 Exercises

1. Let X_1, X_2, ..., X_n be i.i.d. random variables with PDF f_X. Then the set of random variables X_1, X_2, ..., X_n is called a random sample of size n of X. The sample mean is defined as

   X̄_n = (X_1 + X_2 + ... + X_n)/n.

   Let X_1, X_2, ..., X_n be a random sample of X with mean µ and variance σ². How many samples of X are required for the probability that the sample mean does not deviate from the true mean µ by more than σ/10 to be at least 0.95?

2. A biased coin, which lands heads with probability 1/10 each time it is flipped, is flipped 200 times consecutively. Give an upper bound on the probability that it lands heads at least 120 times.

3. A post office handles 10,000 letters per day, with a variance of 2,000 letters. What can be said about the probability that this post office handles between 8,000 and 12,000 letters tomorrow? What about the probability that more than 15,000 letters come in?
