Lecture 4: September 4
Lecturer: Siva Balakrishnan
for all X1 , . . . , Xn , Y1 , . . . , Yn ∈ R.
For such functions we have that if X1 , . . . , Xn ∼ N (0, 1) then,

P(|f (X1 , . . . , Xn ) − E[f (X1 , . . . , Xn )]| ≥ t) ≤ 2 exp(−t²/(2L²)).
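As a quick sanity check, this bound is easy to simulate for a simple Lipschitz function. The choice f(x) = max_i x_i below (which is 1-Lipschitz with respect to the Euclidean norm, so L = 1) and all numerical values are illustrative assumptions, not from the notes:

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials, t = 50, 20_000, 1.5

# f(x) = max_i x_i is 1-Lipschitz w.r.t. the Euclidean norm, so L = 1
samples = rng.standard_normal((trials, n))
vals = samples.max(axis=1)

# empirical tail probability vs. the concentration bound 2 exp(-t^2 / (2 L^2))
emp = np.mean(np.abs(vals - vals.mean()) >= t)
bound = 2 * np.exp(-t**2 / 2)
print(emp, bound)
```

The empirical tail comes out far below the bound, which is typical: these concentration inequalities are worst-case over all Lipschitz f.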
In particular, if we consider a case when each event Ai is a failure of some type, then the
above inequality says that the probability that even a single failure occurs is at most the
sum of the probabilities of each failure.
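Because the indicator of a union is at most the sum of the indicators, the union bound holds sample-by-sample, even for highly dependent events, and is easy to verify numerically. The correlated "failure" events below are an invented illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
trials, k = 100_000, 5

# k correlated failure events A_i = {Z + noise_i > 2}, sharing one Gaussian Z
Z = rng.standard_normal(trials)
events = np.stack([Z + 0.5 * rng.standard_normal(trials) > 2 for _ in range(k)])

p_any = np.mean(events.any(axis=0))   # P(at least one failure occurs)
p_sum = events.mean(axis=1).sum()     # sum of the individual failure probabilities
print(p_any, p_sum)
```

Here p_any is noticeably smaller than p_sum because the events overlap heavily; the union bound is tight only when the events are nearly disjoint.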
Example: The Johnson-Lindenstrauss Lemma. One very nice application of χ2 tail
bounds is in the analysis of what are known as “random projections”. Suppose we have a
data set X1 , . . . , Xn ∈ Rd where d is quite large. Storing such a dataset might be expensive
and as a result we often resort to “sketching” or “random projection” where the goal is
to create a map F : Rd → Rm , with m ≪ d. We then instead store the mapped dataset
{F (X1 ), . . . , F (Xn )}. The challenge is to design this map F in a way that preserves essential
features of the original dataset. In particular, we would like that for every pair (Xi , Xj ) we
have that,
(1 − ε)||Xi − Xj||₂² ≤ ||F (Xi ) − F (Xj )||₂² ≤ (1 + ε)||Xi − Xj||₂²,

i.e. the map preserves all the pair-wise distances up to a (1 ± ε) factor. Of course, if m is
large we might expect this is not too difficult.
The Johnson-Lindenstrauss lemma is quite stunning: it says that a simple randomized con-
struction will produce such a map with probability at least 1 − δ provided that,
m ≥ 16 log(n/δ)/ε².
Notice that this is completely independent of the original dimension d and depends only
logarithmically on the number of points n. This map can result in huge savings in storage
cost while still essentially preserving all the pairwise distances.
The map itself is quite simple: we construct a matrix Z ∈ Rm×d , where each entry of Z is
i.i.d N (0, 1). We then define the map as:
F (Xi ) = Z Xi /√m.
Now let us fix a pair (Xj , Xk ) and consider,
||F (Xj ) − F (Xk )||₂² / ||Xj − Xk ||₂² = ||Z(Xj − Xk )||₂² / (m ||Xj − Xk ||₂²)

= (1/m) Σ_{i=1}^m ⟨Zi , (Xj − Xk )/||Xj − Xk ||₂⟩²,

where Zi denotes the i-th row of Z, and we write Ti = ⟨Zi , (Xj − Xk )/||Xj − Xk ||₂⟩² for the i-th summand.
Now, for fixed numbers a1 , . . . , ad the distribution of Σ_{j=1}^d aj Zij is Gaussian with mean 0 and
variance Σ_{j=1}^d aj². Since the vector (Xj − Xk )/||Xj − Xk ||₂ has unit norm, each term Ti is an
independent χ² random variable with 1 degree of freedom. Applying the χ² tail bound to the
average of the Ti 's yields the following.
Thus for the fixed pair (Xj , Xk ) the probability that our map fails to preserve the distance
is exponentially small, i.e. is at most 2 exp(−mε²/8). Now, to find the probability that our
map fails to preserve any of our (n choose 2) pairwise distances we simply apply the union bound to
conclude that the probability of any failure is at most:

P(failure) ≤ (n choose 2) · 2 exp(−mε²/8).
If we choose

m ≥ 16 log(n/δ)/ε²,
then this probability is at most δ as desired. An important point to note is that the expo-
nential concentration is what leads to such a small value for m (i.e. it only needs to grow
logarithmically with the sample size).
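As a sanity check on the whole construction, here is a minimal numerical sketch. The dataset is synthetic and the values of n, d, ε and δ are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, eps, delta = 50, 2000, 0.5, 0.1
m = int(np.ceil(16 * np.log(n / delta) / eps**2))  # note: independent of d

X = rng.standard_normal((n, d))     # synthetic dataset of n points in R^d
Z = rng.standard_normal((m, d))     # projection matrix with i.i.d. N(0,1) entries
F = X @ Z.T / np.sqrt(m)            # rows are F(X_i) = Z X_i / sqrt(m)

# verify every pairwise squared distance is preserved up to a (1 +/- eps) factor
ok = True
for i in range(n):
    for j in range(i + 1, n):
        orig = np.sum((X[i] - X[j]) ** 2)
        proj = np.sum((F[i] - F[j]) ** 2)
        ok = ok and (1 - eps) * orig <= proj <= (1 + eps) * orig
print(m, ok)
```

Note that m ≈ 400 here regardless of d, so taking d much larger changes the storage savings but not the projection dimension.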
In the rest of this lecture we discuss the convergence of random variables. At a high-level,
our first few lectures focused on non-asymptotic properties of averages i.e. the tail bounds we
derived applied for any fixed sample size n. For the next few lectures we focus on asymptotic
properties, i.e. we ask the question: what happens to the average of n i.i.d. random variables
as n → ∞.
Roughly, from a theoretical perspective the idea is that many expressions will considerably
simplify in the asymptotic regime. Rather than having many different tail bounds, we will
derive simple “universal results” that hold under extremely weak conditions.
From a slightly more practical perspective, asymptotic theory is often useful to obtain
approximate confidence intervals (and p-values and other useful things) that, although
approximate, are typically more useful in practice. We will follow quite closely Section 5.5 of Casella and
Berger.
Throughout, we will focus on the setting where we have a sequence of random variables
X1 , . . . , Xn and another random variable X, and would like to define what it means for the
sequence to converge to X. In each case, to simplify things you should also think about the
case when X is deterministic, i.e. when X = c with probability 1 (for some constant c).
Importantly, we will not assume that the RVs X1 , . . . , Xn are independent.
We will not use almost sure convergence in this course so you should feel free to ignore this
section. A natural analogue of the usual convergence would be to hope that,
lim_{n→∞} Xn = X.
These are both however random variables so one has to at least specify on what event we
are hoping for this statement to be true.
The correct analogue turns out to be to require:
P( lim_{n→∞} Xn = X ) = 1.
There are measure theoretic subtleties to be aware of here. In particular, the event
inside the probability statement involves the entire infinite sequence, and it requires some
machinery to make this precise.
There are other equivalent (this is somewhat difficult to see) ways to define almost sure
convergence. Equivalently, we say that Xn converges almost surely to X if there is a set Ω
of probability mass 1, i.e. P(Ω) = 1, such that for every ω ∈ Ω and every ε > 0, there is
some N (ω, ε) such that for all n ≥ N (ω, ε):

|Xn (ω) − X(ω)| ≤ ε.
Roughly, the way to think about this type of convergence is to imagine that there is some
set of exceptional events on which the random variables can disagree, but these exceptional
events have probability 0 as n → ∞. Barring these exceptional events, the sequence con-
verges just like sequences of real numbers do. The exceptional events are where the "almost"
in almost sure arises.
We say that Xn converges in probability to X if for every ε > 0,

lim_{n→∞} P(|Xn − X| ≥ ε) = 0.
To build intuition it is perhaps useful to consider the case when X is deterministic, i.e.
X = c with probability 1. Then convergence in probability is saying that as n gets large the
distribution of Xn gets more peaked around the value c.
Again somewhat roughly, convergence in probability can be viewed as a statement about the
convergence of probabilities, while almost sure convergence is a convergence of the values of
a sequence of random variables.
We will not prove this statement but convergence in probability is implied by almost sure
convergence. The notes contain a counterexample to the reverse implication but we most
likely will not cover this in lecture.
Example: Weak Law of Large Numbers Suppose that Y1 , . . . , Yn are i.i.d. with E[Yi ] =
µ and Var(Yi ) = σ² < ∞. Define, for i ∈ {1, . . . , n},

Xi = (1/i) Σ_{j=1}^i Yj .
Then by Chebyshev's inequality, for any ε > 0,

P(|Xn − µ| ≥ ε) ≤ Var(Xn )/ε² = σ²/(nε²) → 0,

as desired.
Notes:
1. Strictly speaking the WLLN is true even without the assumption of finite variance, as
long as the first absolute moment is finite. This proof is a bit more difficult.
2. There is a statement that says that under similar assumptions the average converges
almost surely to the expectation. This is known as the strong law of large numbers.
This is actually quite a bit more difficult to prove.
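The WLLN is easy to see numerically. A minimal sketch (all numerical values are illustrative assumptions) estimating P(|X̄n − µ| ≥ ε) for growing n, alongside the Chebyshev bound from the proof:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, eps = 2.0, 3.0, 0.5

# estimate P(|Xbar_n - mu| >= eps) over many repetitions as n grows
probs = {}
for n in (10, 100, 1000):
    Xbar = rng.normal(mu, sigma, size=(5000, n)).mean(axis=1)
    probs[n] = np.mean(np.abs(Xbar - mu) >= eps)
    cheb = min(sigma**2 / (n * eps**2), 1.0)  # Chebyshev bound from the proof
    print(n, probs[n], cheb)
```

The empirical probabilities decay well below the Chebyshev bound, which is quite loose here; the bound's only job in the proof is to tend to 0.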
Ŝn = (1/(n − 1)) Σ_{i=1}^n (Xi − µ̂n )²,

and by Chebyshev's inequality,

P(|Ŝn − σ²| ≥ ε) ≤ Var(Ŝn )/ε²,
Now one can check that this sequence converges in probability but not almost surely.
Roughly, the "1 + s" spike becomes less frequent down the sequence (allowing convergence
in probability), but the limit is not well defined: for any s, Xn (s) alternates between 1 and
1 + s forever.
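The notes' exact construction is omitted here, but a classical example with this behavior puts "spikes" on intervals that sweep [0, 1) with shrinking length; the specific intervals below are an assumption for illustration, not necessarily the notes' example:

```python
import numpy as np

# Intervals I_n of length 2^(-k) sweep [0, 1): write n = 2^k + j with 0 <= j < 2^k
def interval(n):
    k = int(np.floor(np.log2(n)))
    j = n - 2**k
    return j / 2**k, (j + 1) / 2**k

# X_n(s) spikes to 1 + s only when s falls in I_n, and equals 1 otherwise
def X(n, s):
    lo, hi = interval(n)
    return 1 + s if lo <= s < hi else 1.0

s = 0.3
hits = [n for n in range(1, 200) if X(n, s) != 1.0]   # spike indices for this fixed s
lengths = [interval(n)[1] - interval(n)[0] for n in (10, 100, 1000)]
print(hits, lengths)
```

Since P(Xn ≠ 1) equals the interval length, it tends to 0 (convergence in probability), yet every fixed s is hit by a spike in each sweep, so Xn (s) never settles (no almost sure convergence).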
An often useful way to show convergence in probability is to show something stronger known
as convergence in quadratic mean. We say that a sequence converges to X in quadratic mean
if:
E(Xn − X)² → 0,

as n → ∞. Finally, we say that Xn converges in distribution to X if

lim_{n→∞} F_{Xn}(t) = F_X (t),

for all points t where the CDF F_X is continuous. We will see why the exception matters in
a little while but for now it is worth noting that convergence in distribution is the weakest
form of convergence.
For instance, a sequence of i.i.d. N (0, 1) RVs converges in distribution to an independent
N (0, 1) RV, even though the values of the random variables are not close in any meaningful
sense (their distributions are however, identical). A famous example that we will spend a
chunk of the next lecture on is the central limit theorem. The central limit theorem says that
an average of i.i.d. random variables (appropriately normalized) converges in distribution to
a N (0, 1) random variable.
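A preview of the CLT in simulation: the CDF of the normalized average is compared to the standard normal CDF at a few points. The Bernoulli choice and all numerical values are illustrative assumptions:

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(0)
n, trials, p = 400, 20_000, 0.3

# normalized averages sqrt(n) * (Xbar - mu) / sigma for Bernoulli(p) samples
Y = rng.random((trials, n)) < p
Z = np.sqrt(n) * (Y.mean(axis=1) - p) / sqrt(p * (1 - p))

Phi = lambda t: 0.5 * (1 + erf(t / sqrt(2)))  # standard normal CDF
for t in (-1.0, 0.0, 1.0):
    print(t, np.mean(Z <= t), Phi(t))
```

Even though each Yi is far from Gaussian, the empirical CDF of Z is already close to Φ at n = 400, which is exactly the sense of convergence in distribution.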
The picture to keep in mind to understand the relationships is the following one:
We will re-visit this in the next lecture and perhaps try to prove some of the implications
(or disprove some of the non-implications).
P(|X(n) − 1| ≥ ε) = P(X(n) ≤ 1 − ε)

= Π_{i=1}^n P(Xi ≤ 1 − ε) = (1 − ε)^n → 0.
2. The random variable n(1 − X(n) ) converges in distribution to an Exp(1) RV. To see
this we compute: