
Convergence in Probability

One of the handiest tools in regression is the asymptotic analysis of estimators as the number of observations becomes large. This is handy for the following reason: for a nonlinear function $g(X)$ of a random variable, the expectation $E[g(X)]$ is not the same as $g(E[X])$. For example, for a mean-centered $X$, $E[X^2]$ is the variance, and this is not the same as $(E[X])^2 = 0^2 = 0$. However, as we will soon show, $\text{plim}[g(X_n)]$ is equal to $g(\text{plim}[X_n])$, so asymptotic properties can be passed through nonlinear operators. What is this “plim”?
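
As a quick numerical check, here is a minimal NumPy sketch (the distribution and sample size are illustrative choices, not part of the notes):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=2.0, size=1_000_000)  # mean-centered X with sd 2

# E[g(X)] for g(X) = X^2 is the variance (about 4 here)...
print(np.mean(x**2))   # ~4.0
# ...while g(E[X]) = (E[X])^2 is essentially 0.
print(np.mean(x)**2)   # ~0.0
```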

Suppose that we have a sequence of random variables $X_1, X_2, \ldots, X_n, \ldots$, each with cumulative distribution functions $F_1(x), F_2(x), \ldots, F_n(x), \ldots$. We say that this sequence converges in probability to a constant $c$ if
$$\lim_{n \to \infty} \Pr\left(|X_n - c| > \varepsilon\right) = 0$$
for any small $\varepsilon > 0$. In terms of the CDFs, this says that $F_n$ gets very close to a staircase step function at the value $c$, as seen in the figure below.
[Figure: the CDFs $F_1$ (left) and $F_n$ (right), each rising to 1.0; as $n$ grows, $F_n$ approaches a step function that jumps from 0 to 1 at $x = c$.]
That is, the limiting random variable is very unlikely to be either below or above the value of $c$. We use the notation $\text{plim}[X_n] = c$ to denote the statement $\lim_{n \to \infty} \Pr(|X_n - c| > \varepsilon) = 0$, and we say that the probability limit of $X_n$ is $c$.

If $g(X)$ is a continuous function, then values of $X$ close to $c$ produce values of $g(X)$ that are close to $g(c)$, since that is what “continuous” means. If $\text{plim}[X_n] = c$, then almost certainly $X_n$ is near $c$, so almost certainly $g(X_n)$ is near $g(c)$. That is, $\text{plim}[g(X_n)] = g(\text{plim}[X_n])$. For example, if $g(X) = X^2$ and $\text{plim}[\bar{x}_n] = \mu$, then $\text{plim}[\bar{x}_n^2] = (\text{plim}[\bar{x}_n])^2 = \mu^2$. It is not true, of course, that $E[\bar{x}_n^2] = (E[\bar{x}_n])^2 = \mu^2$. In fact, since $\text{var}(\bar{x}_n) = E[(\bar{x}_n - \mu)^2] = E[\bar{x}_n^2] - \mu^2$, we know that $E[\bar{x}_n^2] = \mu^2 + \text{var}(\bar{x}_n) = \mu^2 + \sigma^2/n \neq \mu^2$.
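
A small simulation makes the distinction concrete; this is a sketch with $\mu = 3$, $\sigma = 2$, and Monte Carlo settings chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma, n, reps = 3.0, 2.0, 10_000, 1_000

# Draw `reps` independent samples of size n and square each sample mean.
xbar = rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)

print(np.mean(xbar**2))       # ~ mu^2 + sigma^2/n = 9.0004
print(mu**2 + sigma**2 / n)   # the exact finite-sample value of E[xbar^2]
# xbar^2 itself concentrates tightly around mu^2 = 9 as n grows:
print(np.quantile(xbar**2, [0.025, 0.975]))
```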

Chebychev’s Inequality: $\Pr\left(|X - \mu| > k\sigma\right) \le \dfrac{1}{k^2}$.
Chebychev’s inequality (the name is sometimes transliterated Chebysheff) says that only a small portion of the probability can fall beyond a given number of standard deviations from the mean, regardless of the probability distribution. For example, no more than 25% of the probability can lie more than 2 standard deviations from the mean; of course, for a normal distribution we can be more specific: less than 5% of the probability lies more than 2 standard deviations from the mean.
Proof:
$$\sigma^2 \equiv \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx \ge \int_{-\infty}^{\mu - k\sigma} (x - \mu)^2 f(x)\,dx + \int_{\mu + k\sigma}^{\infty} (x - \mu)^2 f(x)\,dx,$$
because we have left out the middle piece of a sum of positive numbers.
a. Look at the first integral on the right of the inequality. In this region of integration $x \le \mu - k\sigma$, so $x - \mu \le -k\sigma$; moreover $x < \mu$, so $x - \mu < 0$. Multiply both sides of $x - \mu \le -k\sigma$ by $-1$ to get $|x - \mu| \ge k\sigma$, hence $(x - \mu)^2 \ge k^2\sigma^2$. Thus,
$$\int_{-\infty}^{\mu - k\sigma} (x - \mu)^2 f(x)\,dx \ge \int_{-\infty}^{\mu - k\sigma} k^2\sigma^2 f(x)\,dx = k^2\sigma^2 F(\mu - k\sigma).$$
b. By similar reasoning for the second integral on the right of the inequality,
$$\int_{\mu + k\sigma}^{\infty} (x - \mu)^2 f(x)\,dx \ge k^2\sigma^2 \left(1 - F(\mu + k\sigma)\right).$$
Putting a and b together with the first integral, we have
$$\sigma^2 \ge k^2\sigma^2 F(\mu - k\sigma) + k^2\sigma^2\left(1 - F(\mu + k\sigma)\right) = k^2\sigma^2 \Pr\left(|X - \mu| \ge k\sigma\right).$$
Dividing both sides by $k^2\sigma^2$ gives Chebychev’s inequality. Q.E.D.
Note: Setting $k = \theta/\sigma$, this implies that $\Pr\left(|X - \mu| > \theta\right) \le \dfrac{\sigma^2}{\theta^2}$.
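
The bound holds for any distribution with a finite variance, and it can be checked empirically; here is a minimal sketch using a (deliberately non-normal) exponential distribution as an illustrative choice:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.exponential(scale=1.0, size=1_000_000)  # skewed, non-normal example
mu, sigma = x.mean(), x.std()

for k in (1.5, 2.0, 3.0):
    tail = np.mean(np.abs(x - mu) > k * sigma)  # empirical tail probability
    print(f"k={k}: empirical {tail:.4f} <= Chebychev bound {1 / k**2:.4f}")
```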

Law of Large Numbers

Let us combine Chebychev’s inequality and “plim” as follows. Suppose that we have a sequence of independent and identically distributed (iid) random variables and we calculate their average: $\bar{x}_n \equiv \dfrac{x_1 + x_2 + \cdots + x_n}{n}$. We know that the average has mean $\mu$ and variance $\sigma^2/n$. Apply Chebychev’s inequality to get
$$\Pr\left(|\bar{x}_n - \mu| \ge k\frac{\sigma}{\sqrt{n}}\right) \le \frac{1}{k^2}.$$
Let $k = \dfrac{\varepsilon\sqrt{n}}{\sigma}$, so
$$\Pr\left(|\bar{x}_n - \mu| \ge \varepsilon\right) \le \frac{\sigma^2}{n\varepsilon^2}.$$
Take the limit as $n \to \infty$:
$$\lim_{n \to \infty} \Pr\left(|\bar{x}_n - \mu| \ge \varepsilon\right) \le \lim_{n \to \infty} \frac{\sigma^2}{n\varepsilon^2} = 0$$
for any $\varepsilon > 0$. Hence, $\text{plim}[\bar{x}_n] = \mu$.

This is the famous Law of Large Numbers, which states that the empirical mean of a growing list of iid random variables will, with probability approaching one, be arbitrarily close to the population mean of the random variable. If you flip enough fair coins, then the fraction of heads among all those flips will almost certainly be very close to 0.5.
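
A minimal simulation of the coin example (sample sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
flips = rng.integers(0, 2, size=100_000)  # fair coin: 1 = heads, 0 = tails
running_mean = np.cumsum(flips) / np.arange(1, flips.size + 1)

for n in (10, 100, 10_000, 100_000):
    print(n, running_mean[n - 1])         # drifts toward 0.5 as n grows
```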

Application of “plim” in Regression

Suppose Y=X+ and the OLS estimator is b=(X’X)-1X’Y=+(X’X/n)-1X’/n. The plim b=+
(X’X/n)-1plim(X’/n). For the kth variable, plim(Xiki/n)=0, since E[Xiki/n]=0 and
σ ΣX

var(Xiki/n)=2Xik2/n2. By Chebychev’s inequality
(
Pr ob |ΣX ik ε i /n|≥k
2
√n
ik
2
) ≤
1
k2 .
σ ΣX 2
√ σ ΣX 2

ik ik
k =θ Pr ob (|ΣX ik ε i /n|≥θ )≤ 2
Set √n and we have nθ . Taking the limit with
respect to n, plim(Xiki/n)=0. That is, the plim b =  and we say that the OLS estimator is
consistent.
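
A minimal simulation sketch of this consistency result (the true $\beta$, regressor design, and error scale are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(4)
beta = np.array([1.0, -2.0])  # true coefficients (illustrative)

for n in (100, 10_000, 1_000_000):
    X = np.column_stack([np.ones(n), rng.normal(size=n)])  # intercept + regressor
    eps = rng.normal(scale=2.0, size=n)                    # errors independent of X
    y = X @ beta + eps
    b = np.linalg.solve(X.T @ X, X.T @ y)                  # OLS: (X'X)^{-1} X'y
    print(n, b)  # b tightens around beta = (1, -2) as n grows
```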
