
Lecture Notes 2

Probability Inequalities

Inequalities are useful for bounding quantities that might otherwise be hard to compute.
They will also be used in the theory of convergence.

Theorem 1 (The Gaussian Tail Inequality) Let $X \sim N(0, 1)$. Then
$$P(|X| > \epsilon) \leq \frac{2 e^{-\epsilon^2/2}}{\epsilon}.$$
If $X_1, \ldots, X_n \sim N(0, 1)$ then
$$P(|\bar{X}_n| > \epsilon) \leq \frac{1}{\sqrt{n}\,\epsilon}\, e^{-n\epsilon^2/2}.$$

Proof. The density of $X$ is $\phi(x) = (2\pi)^{-1/2} e^{-x^2/2}$. Hence,
$$P(X > \epsilon) = \int_\epsilon^\infty \phi(s)\, ds \leq \frac{1}{\epsilon} \int_\epsilon^\infty s\, \phi(s)\, ds = \frac{\phi(\epsilon)}{\epsilon} \leq \frac{e^{-\epsilon^2/2}}{\epsilon}.$$
By symmetry,
$$P(|X| > \epsilon) \leq \frac{2 e^{-\epsilon^2/2}}{\epsilon}.$$
Now let $X_1, \ldots, X_n \sim N(0, 1)$. Then $\bar{X}_n = n^{-1} \sum_{i=1}^n X_i \sim N(0, 1/n)$. Thus $\bar{X}_n \stackrel{d}{=} n^{-1/2} Z$ where $Z \sim N(0, 1)$, and
$$P(|\bar{X}_n| > \epsilon) = P(n^{-1/2} |Z| > \epsilon) = P(|Z| > \sqrt{n}\, \epsilon) \leq \frac{2\, \phi(\sqrt{n}\, \epsilon)}{\sqrt{n}\, \epsilon} = \sqrt{\frac{2}{\pi}}\, \frac{e^{-n\epsilon^2/2}}{\sqrt{n}\, \epsilon} \leq \frac{1}{\sqrt{n}\, \epsilon}\, e^{-n\epsilon^2/2},$$
since $\sqrt{2/\pi} < 1$. $\square$
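As a quick numerical check (added here; the original notes contain no code), the sketch below compares the exact two-sided Gaussian tail with the bound from Theorem 1, assuming NumPy and SciPy are available:

```python
# Compare the exact Gaussian tail P(|X| > eps) with the Theorem 1 bound.
import numpy as np
from scipy.stats import norm

for eps in [1.0, 2.0, 3.0]:
    exact = 2 * norm.sf(eps)                 # P(|X| > eps) = 2 P(X > eps)
    bound = (2 / eps) * np.exp(-eps**2 / 2)  # (2/eps) e^{-eps^2/2}
    print(f"eps = {eps}: exact = {exact:.5f}, bound = {bound:.5f}")
```

The bound is loose for small $\epsilon$ but decays at the correct exponential rate.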

Theorem 2 (Markov's inequality) Let $X$ be a non-negative random variable and suppose that $E(X)$ exists. For any $t > 0$,
$$P(X > t) \leq \frac{E(X)}{t}. \qquad (1)$$

Proof. Since $X \geq 0$,
$$E(X) = \int_0^\infty x\, p(x)\, dx = \int_0^t x\, p(x)\, dx + \int_t^\infty x\, p(x)\, dx \geq \int_t^\infty x\, p(x)\, dx \geq t \int_t^\infty p(x)\, dx = t\, P(X > t). \qquad \square$$
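A small simulation makes the inequality concrete (an added illustration, not part of the original notes; the exponential distribution and the thresholds are arbitrary choices, NumPy assumed):

```python
# Markov's inequality: for non-negative X, P(X > t) <= E(X)/t.
import numpy as np

rng = np.random.default_rng(0)
X = rng.exponential(scale=2.0, size=100_000)  # non-negative, E(X) = 2

for t in [2.0, 5.0, 10.0]:
    empirical = np.mean(X > t)   # Monte Carlo estimate of P(X > t)
    bound = X.mean() / t         # Markov bound
    print(f"t = {t}: P(X > t) ~ {empirical:.4f} <= {bound:.4f}")
```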


Theorem 3 (Chebyshev's inequality) Let $\mu = E(X)$ and $\sigma^2 = \mathrm{Var}(X)$. Then,
$$P(|X - \mu| \geq t) \leq \frac{\sigma^2}{t^2} \quad \text{and} \quad P(|Z| \geq k) \leq \frac{1}{k^2} \qquad (2)$$
where $Z = (X - \mu)/\sigma$. In particular, $P(|Z| > 2) \leq 1/4$ and $P(|Z| > 3) \leq 1/9$.
Proof. We use Markov's inequality to conclude that
$$P(|X - \mu| \geq t) = P(|X - \mu|^2 \geq t^2) \leq \frac{E(X - \mu)^2}{t^2} = \frac{\sigma^2}{t^2}.$$
The second part follows by setting $t = k\sigma$. $\square$


P
If X1 , . . . , Xn Bernoulli(p) then and X n = n1 ni=1 Xi Then, Var(X n ) = Var(X1 )/n =
p(1 p)/n and
Var(X n )
p(1 p)
1
P(|X n p| > )
=

2
2

n
4n2
since p(1 p) 14 for all p.
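The sketch below simulates this bound for Bernoulli sample means (an added illustration; $n$, $p$ and $\epsilon$ are arbitrary choices, NumPy assumed). The empirical deviation probability sits far below the distribution-free bound $1/(4n\epsilon^2)$:

```python
# Chebyshev bound for Bernoulli sample means.
import numpy as np

rng = np.random.default_rng(1)
n, p, eps, reps = 100, 0.3, 0.1, 50_000
xbar = rng.binomial(n, p, size=reps) / n  # sample means of n Bernoulli(p) draws

empirical = np.mean(np.abs(xbar - p) > eps)
bound = 1 / (4 * n * eps**2)              # uses p(1-p) <= 1/4 for all p
print(f"P(|Xbar - p| > {eps}) ~ {empirical:.4f} <= {bound:.4f}")
```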

Hoeffding's Inequality

Hoeffding's inequality is similar in spirit to Markov's inequality, but it is sharper.


We begin with the following important result.
Lemma 4 Suppose that $E(X) = 0$ and that $a \leq X \leq b$. Then
$$E(e^{tX}) \leq e^{t^2 (b-a)^2 / 8}.$$

Recall that a function $g$ is convex if for each $x, y$ and each $\lambda \in [0, 1]$,
$$g(\lambda x + (1 - \lambda) y) \leq \lambda g(x) + (1 - \lambda) g(y).$$
Proof. Since $a \leq X \leq b$, we can write $X$ as a convex combination of $a$ and $b$, namely, $X = \lambda b + (1 - \lambda) a$ where $\lambda = (X - a)/(b - a)$. By the convexity of the function $y \mapsto e^{ty}$ we have
$$e^{tX} \leq \lambda e^{tb} + (1 - \lambda) e^{ta} = \frac{X - a}{b - a}\, e^{tb} + \frac{b - X}{b - a}\, e^{ta}.$$
Take expectations of both sides and use the fact that $E(X) = 0$ to get
$$E e^{tX} \leq \frac{-a}{b - a}\, e^{tb} + \frac{b}{b - a}\, e^{ta} = e^{g(u)} \qquad (3)$$
where $u = t(b - a)$, $g(u) = -\gamma u + \log(1 - \gamma + \gamma e^u)$ and $\gamma = -a/(b - a)$. Note that $g(0) = g'(0) = 0$. Also, $g''(u) \leq 1/4$ for all $u > 0$. By Taylor's theorem, there is a $\xi \in (0, u)$ such that
$$g(u) = g(0) + u g'(0) + \frac{u^2}{2} g''(\xi) = \frac{u^2}{2} g''(\xi) \leq \frac{u^2}{8} = \frac{t^2 (b - a)^2}{8}.$$
Hence, $E e^{tX} \leq e^{g(u)} \leq e^{t^2 (b-a)^2 / 8}$. $\square$
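Lemma 4 is easy to check numerically. The sketch below (added; not from the original notes) recentres a bounded sample so that $E(X) \approx 0$ and compares a Monte Carlo estimate of the moment generating function with the bound; the uniform distribution and the values of $a$, $b$, $t$ are arbitrary choices, NumPy assumed:

```python
# Check E(e^{tX}) <= e^{t^2 (b-a)^2 / 8} for a mean-zero bounded variable.
import numpy as np

rng = np.random.default_rng(7)
a, b, t = -1.0, 2.0, 1.5
X = rng.uniform(a, b, size=1_000_000)
X = X - X.mean()              # recentre: E(X) ~ 0, support width still b - a
mgf = np.mean(np.exp(t * X))  # Monte Carlo estimate of E(e^{tX})
bound = np.exp(t**2 * (b - a)**2 / 8)
print(f"E(e^(tX)) ~ {mgf:.4f} <= {bound:.4f}")
```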

Next, we need to use Chernoff's method.

Lemma 5 Let $X$ be a random variable. Then
$$P(X > \epsilon) \leq \inf_{t \geq 0} e^{-t\epsilon}\, E(e^{tX}).$$
Proof. For any $t > 0$,
$$P(X > \epsilon) = P(e^{X} > e^{\epsilon}) = P(e^{tX} > e^{t\epsilon}) \leq e^{-t\epsilon}\, E(e^{tX}).$$
Since this is true for every $t \geq 0$, the result follows. $\square$
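For example, if $X \sim N(0, 1)$ then $E(e^{tX}) = e^{t^2/2}$, and the infimum of $e^{-t\epsilon} e^{t^2/2}$ over $t$ is attained at $t = \epsilon$, giving $e^{-\epsilon^2/2}$. A minimal numerical check of this (an added illustration, NumPy assumed):

```python
# Chernoff's method for X ~ N(0,1): inf_t e^{-t eps} E(e^{tX}) = e^{-eps^2/2}.
import numpy as np

eps = 2.0
ts = np.linspace(0.01, 10, 1000)
numerical = np.min(np.exp(-ts * eps + ts**2 / 2))  # minimize over a grid of t
closed_form = np.exp(-eps**2 / 2)                  # attained at t = eps
print(f"numerical inf: {numerical:.6f}, closed form: {closed_form:.6f}")
```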
Theorem 6 (Hoeffding's Inequality) Let $Y_1, \ldots, Y_n$ be iid observations such that $E(Y_i) = \mu$ and $a \leq Y_i \leq b$. Then, for any $\epsilon > 0$,
$$P\left( |\bar{Y}_n - \mu| \geq \epsilon \right) \leq 2 e^{-2n\epsilon^2/(b-a)^2}. \qquad (4)$$

Corollary 7 If $X_1, X_2, \ldots, X_n$ are independent with $P(a \leq X_i \leq b) = 1$ and common mean $\mu$, then, with probability at least $1 - \delta$,
$$|\bar{X}_n - \mu| \leq \sqrt{\frac{c}{2n} \log\left( \frac{2}{\delta} \right)} \qquad (5)$$
where $c = (b - a)^2$.
Proof. Without loss of generality, we assume that $\mu = 0$. First we have
$$P(|\bar{Y}_n| \geq \epsilon) = P(\bar{Y}_n \geq \epsilon) + P(-\bar{Y}_n \geq \epsilon) = P(\bar{Y}_n \geq \epsilon) + P(\bar{Y}_n \leq -\epsilon).$$
Next we use Chernoff's method. For any $t > 0$, we have, from Markov's inequality, that
$$P(\bar{Y}_n \geq \epsilon) = P\left( \sum_{i=1}^n Y_i \geq n\epsilon \right) = P\left( e^{t \sum_{i=1}^n Y_i} \geq e^{tn\epsilon} \right) \leq e^{-tn\epsilon}\, E\left( e^{t \sum_{i=1}^n Y_i} \right) = e^{-tn\epsilon} \prod_i E(e^{tY_i}) = e^{-tn\epsilon} \left( E(e^{tY_1}) \right)^n.$$
From Lemma 4, $E(e^{tY_i}) \leq e^{t^2 (b-a)^2 / 8}$. So
$$P(\bar{Y}_n \geq \epsilon) \leq e^{-tn\epsilon}\, e^{t^2 n (b-a)^2 / 8}.$$
This is minimized by setting $t = 4\epsilon/(b - a)^2$, giving
$$P(\bar{Y}_n \geq \epsilon) \leq e^{-2n\epsilon^2/(b-a)^2}.$$
Applying the same argument to $P(-\bar{Y}_n \geq \epsilon)$ yields the result. $\square$


Example 8 Let $X_1, \ldots, X_n \sim \mathrm{Bernoulli}(p)$. From Hoeffding's inequality,
$$P(|\bar{X}_n - p| > \epsilon) \leq 2 e^{-2n\epsilon^2}.$$
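To see why Hoeffding is sharper than Chebyshev for large $n$, and what Corollary 7 gives as a confidence interval, here is a short comparison (an added sketch; $n$, $\epsilon$, $\delta$ are arbitrary choices, $c = (b-a)^2 = 1$ for Bernoulli variables, NumPy assumed):

```python
# Hoeffding vs Chebyshev for Bernoulli means, plus the Corollary 7 interval.
import numpy as np

n, eps, delta = 500, 0.1, 0.05
hoeffding = 2 * np.exp(-2 * n * eps**2)            # Example 8
chebyshev = 1 / (4 * n * eps**2)                   # earlier Chebyshev bound
half_width = np.sqrt(np.log(2 / delta) / (2 * n))  # Corollary 7 with c = 1

print(f"Hoeffding: {hoeffding:.2e}, Chebyshev: {chebyshev:.4f}")
print(f"interval half-width at delta = {delta}: {half_width:.4f}")
```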

The Bounded Difference Inequality

So far we have focused on sums of random variables. The following result extends Hoeffding's inequality to more general functions $g(x_1, \ldots, x_n)$. Here we consider McDiarmid's inequality, also known as the Bounded Difference inequality.

Theorem 9 (McDiarmid) Let $X_1, \ldots, X_n$ be independent random variables. Suppose that
$$\sup_{x_1, \ldots, x_n, x_i'} \left| g(x_1, \ldots, x_{i-1}, x_i, x_{i+1}, \ldots, x_n) - g(x_1, \ldots, x_{i-1}, x_i', x_{i+1}, \ldots, x_n) \right| \leq c_i \qquad (6)$$
for $i = 1, \ldots, n$. Then
$$P\left( g(X_1, \ldots, X_n) - E(g(X_1, \ldots, X_n)) \geq \epsilon \right) \leq \exp\left( -\frac{2\epsilon^2}{\sum_{i=1}^n c_i^2} \right). \qquad (7)$$

Proof. Let $V_i = E(g \mid X_1, \ldots, X_i) - E(g \mid X_1, \ldots, X_{i-1})$. Then $g(X_1, \ldots, X_n) - E(g(X_1, \ldots, X_n)) = \sum_{i=1}^n V_i$ and $E(V_i \mid X_1, \ldots, X_{i-1}) = 0$. Using a similar argument as in Hoeffding's Lemma we have
$$E(e^{tV_i} \mid X_1, \ldots, X_{i-1}) \leq e^{t^2 c_i^2 / 8}. \qquad (8)$$
Now, for any $t > 0$,
$$P\left( g(X_1, \ldots, X_n) - E(g(X_1, \ldots, X_n)) \geq \epsilon \right) = P\left( \sum_{i=1}^n V_i \geq \epsilon \right) = P\left( e^{t \sum_{i=1}^n V_i} \geq e^{t\epsilon} \right) \leq e^{-t\epsilon}\, E\left( e^{t \sum_{i=1}^n V_i} \right)$$
$$= e^{-t\epsilon}\, E\left( e^{t \sum_{i=1}^{n-1} V_i}\, E\left( e^{tV_n} \,\middle|\, X_1, \ldots, X_{n-1} \right) \right) \leq e^{-t\epsilon}\, e^{t^2 c_n^2 / 8}\, E\left( e^{t \sum_{i=1}^{n-1} V_i} \right) \leq \cdots \leq e^{-t\epsilon}\, e^{t^2 \sum_{i=1}^n c_i^2 / 8}.$$
The result follows by taking $t = 4\epsilon / \sum_{i=1}^n c_i^2$. $\square$
Example 10 If we take $g(x_1, \ldots, x_n) = n^{-1} \sum_{i=1}^n x_i$ then we get back Hoeffding's inequality.
Example 11 Suppose we throw $m$ balls into $n$ bins. What fraction of bins are empty? Let $Z$ be the number of empty bins and let $F = Z/n$ be the fraction of empty bins. We can write $Z = \sum_{i=1}^n Z_i$ where $Z_i = 1$ if bin $i$ is empty and $Z_i = 0$ otherwise. Then
$$\mu = E(Z) = \sum_{i=1}^n E(Z_i) = n(1 - 1/n)^m = n e^{m \log(1 - 1/n)} \approx n e^{-m/n}$$
and $\theta = E(F) = \mu/n \approx e^{-m/n}$. How close is $Z$ to $\mu$? Note that the $Z_i$'s are not independent, so we cannot just apply Hoeffding. Instead, we proceed as follows.

Define variables $X_1, \ldots, X_m$ where $X_s = i$ if ball $s$ falls into bin $i$. Then $Z = g(X_1, \ldots, X_m)$. If we move one ball into a different bin, then $Z$ can change by at most 1. Hence, (6) holds with $c_i = 1$ and so
$$P(|Z - \mu| > t) \leq 2 e^{-2t^2/m}.$$
Recall that the fraction of empty bins is $F = Z/n$ with mean $\theta = \mu/n$. We have
$$P(|F - \theta| > t) = P(|Z - \mu| > nt) \leq 2 e^{-2n^2 t^2/m}.$$
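A simulation of this example (an added sketch; the values of $n$, $m$, $t$ are arbitrary, NumPy assumed) shows how tightly $Z$ concentrates around $\mu$ relative to the McDiarmid bound:

```python
# Balls in bins: concentration of the number of empty bins Z around mu.
import numpy as np

rng = np.random.default_rng(2)
n, m, reps, t = 100, 200, 20_000, 10

Z = np.empty(reps)
for r in range(reps):
    bins = rng.integers(0, n, size=m)  # X_s = bin that ball s lands in
    Z[r] = n - np.unique(bins).size    # number of empty bins

mu = n * (1 - 1 / n) ** m
empirical = np.mean(np.abs(Z - mu) > t)
bound = 2 * np.exp(-2 * t**2 / m)      # McDiarmid with c_i = 1
print(f"mu = {mu:.2f}, P(|Z - mu| > {t}) ~ {empirical:.4f} <= {bound:.4f}")
```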

Bounds on Expected Values

Theorem 12 (Cauchy-Schwarz inequality) If $X$ and $Y$ have finite variances, then
$$E|XY| \leq \sqrt{E(X^2)\, E(Y^2)}. \qquad (9)$$

The Cauchy-Schwarz inequality can be written as
$$\mathrm{Cov}^2(X, Y) \leq \sigma_X^2\, \sigma_Y^2.$$

Recall that a function $g$ is convex if for each $x, y$ and each $\lambda \in [0, 1]$,
$$g(\lambda x + (1 - \lambda) y) \leq \lambda g(x) + (1 - \lambda) g(y).$$
If $g$ is twice differentiable and $g''(x) \geq 0$ for all $x$, then $g$ is convex. It can be shown that if $g$ is convex, then $g$ lies above any line that touches $g$ at some point, called a tangent line. A function $g$ is concave if $-g$ is convex. Examples of convex functions are $g(x) = x^2$ and $g(x) = e^x$. Examples of concave functions are $g(x) = -x^2$ and $g(x) = \log x$.
Theorem 13 (Jensen's inequality) If $g$ is convex, then
$$E g(X) \geq g(E X). \qquad (10)$$
If $g$ is concave, then
$$E g(X) \leq g(E X). \qquad (11)$$
Proof. Let $L(x) = a + bx$ be a line, tangent to $g(x)$ at the point $E(X)$. Since $g$ is convex, it lies above the line $L(x)$. So,
$$E g(X) \geq E L(X) = E(a + bX) = a + b E(X) = L(E(X)) = g(E X). \qquad \square$$


Example 14 From Jensen's inequality we see that $E(X^2) \geq (E X)^2$.

Example 15 (Kullback-Leibler Distance) Define the Kullback-Leibler distance between two densities $p$ and $q$ by
$$D(p, q) = \int p(x) \log \frac{p(x)}{q(x)}\, dx.$$
Note that $D(p, p) = 0$. We will use Jensen to show that $D(p, q) \geq 0$. Let $X \sim p$. Then, since $\log$ is concave,
$$-D(p, q) = E\left( \log \frac{q(X)}{p(X)} \right) \leq \log E\left( \frac{q(X)}{p(X)} \right) = \log \int p(x) \frac{q(x)}{p(x)}\, dx = \log \int q(x)\, dx = \log(1) = 0.$$
So $-D(p, q) \leq 0$ and hence $D(p, q) \geq 0$.
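For discrete distributions the integral becomes a sum, and the non-negativity is easy to check directly (an added illustration; the two distributions are arbitrary, NumPy assumed):

```python
# Kullback-Leibler distance for discrete distributions: D(p,q) >= 0, D(p,p) = 0.
import numpy as np

p = np.array([0.5, 0.3, 0.2])
q = np.array([0.4, 0.4, 0.2])

D_pq = np.sum(p * np.log(p / q))  # discrete analogue of the integral
D_pp = np.sum(p * np.log(p / p))
print(f"D(p, q) = {D_pq:.5f} (>= 0), D(p, p) = {D_pp:.5f}")
```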
Example 16 It follows from Jensen's inequality that three types of means can be ordered. Assume that $a_1, \ldots, a_n$ are positive numbers and define the arithmetic, geometric and harmonic means as
$$a_A = \frac{1}{n}(a_1 + \cdots + a_n), \qquad a_G = (a_1 \cdots a_n)^{1/n}, \qquad a_H = \frac{1}{\frac{1}{n}\left( \frac{1}{a_1} + \cdots + \frac{1}{a_n} \right)}.$$
Then $a_H \leq a_G \leq a_A$.
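The ordering can be verified on any positive sample (an added illustration; the sample itself is an arbitrary choice, NumPy assumed):

```python
# Harmonic <= geometric <= arithmetic mean for positive numbers.
import numpy as np

rng = np.random.default_rng(3)
a = rng.uniform(0.1, 10, size=20)  # arbitrary positive numbers

a_A = a.mean()                     # arithmetic mean
a_G = np.exp(np.log(a).mean())     # geometric mean, via logs for stability
a_H = 1 / np.mean(1 / a)           # harmonic mean
print(f"{a_H:.4f} <= {a_G:.4f} <= {a_A:.4f}")
```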

Suppose we have an exponential bound on $P(X_n > \epsilon)$. In that case we can bound $E(X_n)$ as follows.

Theorem 17 Suppose that $X_n \geq 0$ and that for every $\epsilon > 0$,
$$P(X_n > \epsilon) \leq c_1 e^{-c_2 n \epsilon^2} \qquad (12)$$
for some $c_2 > 0$ and $c_1 > 1/e$. Then,
$$E(X_n) \leq \sqrt{\frac{C}{n}} \qquad (13)$$
where $C = (1 + \log(c_1))/c_2$.


Proof. Recall that for any nonnegative random variable $Y$, $E(Y) = \int_0^\infty P(Y \geq t)\, dt$. Hence, for any $a > 0$,
$$E(X_n^2) = \int_0^\infty P(X_n^2 \geq t)\, dt = \int_0^a P(X_n^2 \geq t)\, dt + \int_a^\infty P(X_n^2 \geq t)\, dt \leq a + \int_a^\infty P(X_n^2 \geq t)\, dt.$$
Equation (12) implies that $P(X_n^2 > t) \leq c_1 e^{-c_2 n t}$. Hence,
$$E(X_n^2) \leq a + \int_a^\infty P(X_n^2 \geq t)\, dt \leq a + c_1 \int_a^\infty e^{-c_2 n t}\, dt = a + \frac{c_1 e^{-c_2 n a}}{c_2 n}.$$
Set $a = \log(c_1)/(n c_2)$ and conclude that
$$E(X_n^2) \leq \frac{\log(c_1)}{n c_2} + \frac{1}{n c_2} = \frac{1 + \log(c_1)}{n c_2}.$$
Finally, since $E(X_n) \leq \sqrt{E(X_n^2)}$, we have
$$E(X_n) \leq \sqrt{\frac{1 + \log(c_1)}{n c_2}}. \qquad \square$$


Now we consider bounding the maximum of a set of random variables.

Theorem 18 Let $X_1, \ldots, X_n$ be random variables. Suppose there exists $\sigma > 0$ such that $E(e^{tX_i}) \leq e^{t^2 \sigma^2 / 2}$ for all $t > 0$. Then
$$E\left( \max_{1 \leq i \leq n} X_i \right) \leq \sigma \sqrt{2 \log n}. \qquad (14)$$

Proof. By Jensen's inequality,
$$\exp\left\{ t\, E\left( \max_{1 \leq i \leq n} X_i \right) \right\} \leq E\left( \exp\left\{ t \max_{1 \leq i \leq n} X_i \right\} \right) = E\left( \max_{1 \leq i \leq n} \exp\{t X_i\} \right) \leq \sum_{i=1}^n E\left( \exp\{t X_i\} \right) \leq n e^{t^2 \sigma^2 / 2}.$$
Thus,
$$E\left( \max_{1 \leq i \leq n} X_i \right) \leq \frac{\log n}{t} + \frac{t \sigma^2}{2}.$$
The result follows by setting $t = \sqrt{2 \log n}/\sigma$. $\square$
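Standard normals satisfy the assumption with $\sigma = 1$, since $E(e^{tX}) = e^{t^2/2}$. The simulation below (an added sketch, NumPy assumed) compares a Monte Carlo estimate of $E(\max_i X_i)$ with $\sqrt{2 \log n}$:

```python
# E(max of n standard normals) vs the Theorem 18 bound sqrt(2 log n).
import numpy as np

rng = np.random.default_rng(4)
for n in [10, 100, 1000]:
    X = rng.standard_normal(size=(10_000, n))
    emp = X.max(axis=1).mean()      # Monte Carlo estimate of E(max_i X_i)
    bound = np.sqrt(2 * np.log(n))  # sigma = 1 for N(0, 1)
    print(f"n = {n}: E(max) ~ {emp:.3f} <= {bound:.3f}")
```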

$O_P$ and $o_P$

In statistics, probability and machine learning, we make use of $o_P$ and $O_P$ notation.

Recall first that $a_n = o(1)$ means that $a_n \to 0$ as $n \to \infty$, and $a_n = o(b_n)$ means that $a_n/b_n = o(1)$. Similarly, $a_n = O(1)$ means that $a_n$ is eventually bounded, that is, for all large $n$, $|a_n| \leq C$ for some $C > 0$, and $a_n = O(b_n)$ means that $a_n/b_n = O(1)$.

We write $a_n \asymp b_n$ if both $a_n/b_n$ and $b_n/a_n$ are eventually bounded. In computer science this is written as $a_n = \Theta(b_n)$, but we prefer $a_n \asymp b_n$ since, in statistics, $\Theta$ often denotes a parameter space.

Now we move on to the probabilistic versions. Say that $Y_n = o_P(1)$ if, for every $\epsilon > 0$,
$$P(|Y_n| > \epsilon) \to 0.$$
Say that $Y_n = o_P(a_n)$ if $Y_n/a_n = o_P(1)$. Say that $Y_n = O_P(1)$ if, for every $\epsilon > 0$, there is a $C > 0$ such that
$$P(|Y_n| > C) \leq \epsilon.$$
Say that $Y_n = O_P(a_n)$ if $Y_n/a_n = O_P(1)$.

Lets use Hoeffdings inequality to show that sample proportions are OP (1/ n) within the
the true mean. Let Y1 , . . . , Yn be coin flips i.e. Yi {0, 1}. Let p = P(Yi = 1). Let
n
1X
Yi .
pbn =
n i=1

We will show that: pbn p = oP (1) and pbn p = OP (1/ n).


We have that
2
P(|b
pn p| > ) 2e2n 0
and so pbn p = oP (1). Also,



C
pn p| > C) = P |b
pn p| >
P( n|b
n
2

2e2C <
if we pick C large enough. Hence,

n(b
pn p) = OP (1) and so


1
pbn p = OP
.
n
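The claim is easy to see in simulation (an added sketch; $p$, $C$ and the sample sizes are arbitrary choices, NumPy assumed): the distribution of $\sqrt{n}\,|\hat{p}_n - p|$ does not spread out as $n$ grows, and the exceedance probability stays below the $n$-free Hoeffding bound $2e^{-2C^2}$.

```python
# phat_n - p = O_P(1/sqrt(n)): sqrt(n)|phat_n - p| stays bounded in probability.
import numpy as np

rng = np.random.default_rng(5)
p, C = 0.3, 2.0
for n in [100, 1_000, 10_000]:
    phat = rng.binomial(n, p, size=50_000) / n
    prob = np.mean(np.sqrt(n) * np.abs(phat - p) > C)
    bound = 2 * np.exp(-2 * C**2)  # Hoeffding bound, independent of n
    print(f"n = {n}: P(sqrt(n)|phat - p| > {C}) ~ {prob:.5f} <= {bound:.5f}")
```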

Now consider $m$ coins with probabilities $p_1, \ldots, p_m$. Then
$$P\left( \max_j |\hat{p}_j - p_j| > \epsilon \right) \leq \sum_{j=1}^m P\left( |\hat{p}_j - p_j| > \epsilon \right) \qquad \text{(union bound)}$$
$$\leq \sum_{j=1}^m 2 e^{-2n\epsilon^2} \qquad \text{(Hoeffding)}$$
$$= 2m e^{-2n\epsilon^2} = 2 \exp\left( -(2n\epsilon^2 - \log m) \right).$$
Suppose that $m \leq e^{n^\gamma}$ where $0 \leq \gamma < 1$. Then
$$P\left( \max_j |\hat{p}_j - p_j| > \epsilon \right) \leq 2 \exp\left( -(2n\epsilon^2 - n^\gamma) \right) \to 0.$$
Hence,
$$\max_j |\hat{p}_j - p_j| = o_P(1).$$
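The union bound argument can also be checked by simulation (an added sketch; $m$, $n$, $\epsilon$ and the coin probabilities are arbitrary choices, NumPy assumed):

```python
# Maximum deviation over m coins vs the union-bound-plus-Hoeffding bound.
import numpy as np

rng = np.random.default_rng(6)
m, n, eps = 50, 2_000, 0.05
p = rng.uniform(0.2, 0.8, size=m)                # m coin probabilities
phat = rng.binomial(n, p, size=(10_000, m)) / n  # 10,000 repetitions
max_dev = np.abs(phat - p).max(axis=1)

empirical = np.mean(max_dev > eps)
bound = 2 * m * np.exp(-2 * n * eps**2)          # 2 m e^{-2 n eps^2}
print(f"P(max_j |phat_j - p_j| > {eps}) ~ {empirical:.4f} <= {bound:.4f}")
```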
