Section 6 - The Law of Large Numbers and The Central Limit Theorem
In this section we study two of the most important results in probability and statistics: the law of
large numbers and the central limit theorem.
Proof. We focus on the discrete case. Consider the random variable Z = Y − X. For all ω ∈ Ω we
have Z(ω) = Y (ω) − X(ω) ≥ 0 by the hypothesis. Thus SZ = {Z(ω) : ω ∈ Ω} ⊆ [0, ∞), and so
E[Y ] − E[X] = E[Z] = Σ_{z∈SZ} z · P(Z = z) ≥ 0.
The first equality uses linearity of expectation, the second is by definition of E[Z], the final inequality
holds since P(Z = z) ≥ 0 and z ≥ 0 for all z ∈ SZ . It follows that E[X] ≤ E[Y ].
Definition 6.2. Let A ⊆ Ω be an event. The indicator variable of the event A is the random variable
1A : Ω → R defined for all ω ∈ Ω by
1A (ω) = 1, if ω ∈ A, and 1A (ω) = 0, if ω ∉ A.
Remark 6.3. Note that 1A ∼ Berp with p = P (A), giving E[1A ] = P(A).
We are now ready to prove Markov’s inequality, a fundamental result. It is very intuitive – a non-negative random variable rarely takes values that are much larger than its expectation.
Theorem 6.4 (Markov’s inequality). Let X : Ω → R be a non-negative random variable with well-
defined expectation. Then, given any t > 0, we have
P(X ≥ t) ≤ E[X]/t.
Proof. Let A ⊆ Ω denote the event A := {ω ∈ Ω : X(ω) ≥ t}. We claim that t · 1A (ω) ≤ X(ω) for all ω ∈ Ω. To see this, we split according to whether ω ∈ A or ω ∉ A.
• If ω ∉ A then t · 1A (ω) = 0 ≤ X(ω). (We used that X is non-negative here, as this gave X(ω) ≥ 0.)
• If ω ∈ A then t · 1A (ω) = t ≤ X(ω), by the definition of A.
Taking expectations and using monotonicity and linearity of expectation, this gives t · P(A) = E[t · 1A ] ≤ E[X], and dividing by t > 0 yields P(X ≥ t) ≤ E[X]/t.
Remark 6.5. .
• Markov’s inequality is used very often in mathematics.
• Markov’s inequality is only really useful if t > E[X], as otherwise the trivial bound P(X ≥ t) ≤ 1 is better.
Example 6.6. Let X ∼ bin100,0.1 . Then E [X] = 100 · 0.1 = 10. By Markov’s inequality, we have
P (X ≥ 50) ≤ 10/50 = 0.2.
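As a sanity check, the Markov bound can be compared with the exact tail probability. A minimal sketch in Python (using only the standard library; the distribution and threshold are those of this example):

```python
from math import comb

# X ~ bin(100, 0.1): compare the exact tail P(X >= 50) with Markov's bound E[X]/50.
n, p, t = 100, 0.1, 50
exact_tail = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(t, n + 1))
markov_bound = (n * p) / t  # = 10/50 = 0.2
```

The exact tail turns out to be astronomically smaller than 0.2, illustrating that Markov's inequality is crude but completely general.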
Example 6.7. Let X be a non-negative random variable with P (X ≥ 10) = 0.3. Then, by Markov’s
inequality, E [X] ≥ 0.3 · 10 = 3.
Example 6.8. On a social network, an average user has 300 friends. Equivalently, if we select a
random person on the network and let X equal the number of their friends then E[X] = 300. Markov’s
inequality then gives P(X ≥ 900) ≤ 1/3, and so at most a third of the users have 900 friends or more.
One of the most important applications of Markov’s inequality is to prove Chebyshev’s inequality,
which shows that a random variable with small variance is typically close to its expectation.
Theorem 6.9 (Chebyshev’s inequality). Let X be a random variable with well-defined expectation
and variance. Then, for all ε > 0, we have
P(|X − E[X]| ≥ ε) ≤ Var(X)/ε².
Proof. Note that if ω ∈ Ω then |X(ω) − E[X]| ≥ ε if and only if |X(ω) − E[X]|² ≥ ε². This gives
P(|X − E[X]| ≥ ε) = P(|X − E[X]|² ≥ ε²).
Now apply Markov’s inequality to the non-negative random variable (X − E[X])², with t = ε², to get
P(|X − E[X]| ≥ ε) = P(|X − E[X]|² ≥ ε²) ≤ E[(X − E[X])²]/ε² = Var(X)/ε².
The last equality here holds by definition of variance.
Remark 6.10. Since Var(X) = σX², if we take ε = ασX with α > 0, Chebyshev’s inequality gives
P(|X − E[X]| ≥ ασX ) ≤ α⁻².
It follows that |X − E[X]| is typically not much larger than σX (see Remark 5.32).
Example 6.11. Let X ∼ bin50,0.4 . Then E [X] = 20 and Var(X) = 12 so P (|X − 20| ≥ 5) ≤ 12/25.
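Here too the bound 12/25 = 0.48 can be compared with the exact probability; a short sketch using only the standard library:

```python
from math import comb

# X ~ bin(50, 0.4): E[X] = 20, Var(X) = 12.
n, p = 50, 0.4
pmf = lambda k: comb(n, k) * p**k * (1 - p)**(n - k)

# Exact P(|X - 20| >= 5), i.e. P(X <= 15) + P(X >= 25).
exact = sum(pmf(k) for k in range(n + 1) if abs(k - 20) >= 5)
chebyshev_bound = 12 / 25  # Var(X) / 5^2 = 0.48
```

The exact probability is well below the Chebyshev bound, which is typical: Chebyshev's inequality is rarely tight.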
2 Check that the theorem is false if X can take negative values.
Example 6.12 (Binomial distribution). Let X ∼ binn,p for n ∈ N, p ∈ (0, 1). By Chebyshev’s
inequality, for all ε > 0, we have
P(|X − np| ≥ εn) ≤ Var(X)/(εn)² = np(1 − p)/(εn)² = p(1 − p)/(ε²n).
Hence, lim_{n→∞} P(|X − np| ≥ εn) = 0.
Example 6.13. We toss a coin which appears as tails with some unknown probability p ∈ (0, 1) a
total number of n times. Let Ŝn be the number of times that tails appears. Then, Ŝn ∼ binn,p . How
large must n be to guarantee that p̂n = Ŝn /n satisfies |p̂n − p| ≤ 0.01 with probability at least 0.95?
This question is ideal for applying Chebyshev’s inequality. First note that as Ŝn ∼ binn,p we have
Var(Ŝn ) = np(1 − p). Secondly, by Proposition 5.39 (ii) we have Var(Ŝn /n) = n−2 Var(Ŝn ). Therefore
Var(Ŝn /n) = Var(Ŝn )/n² = np(1 − p)/n² ≤ 1/(4n),
where the last inequality holds since p(1 − p) ≤ 1/4 for p ∈ [0, 1]. Thus
P(|p̂n − p| ≥ 0.01) = P(|Ŝn − np| ≥ 0.01n) ≤ 1/(4n · (0.01)²) = 2500/n.
Hence the desired guarantee holds provided 2500/n ≤ 1 − 0.95 = 0.05, that is, whenever n ≥ 50,000.
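A quick simulation suggests that n = 50,000 tosses is in fact rather conservative. A sketch, with an arbitrary choice p = 0.3 playing the role of the unknown probability:

```python
import random

random.seed(1)
p, n, trials = 0.3, 50_000, 100  # p is the "unknown" tail probability

within = 0
for _ in range(trials):
    tails = sum(random.random() < p for _ in range(n))
    if abs(tails / n - p) <= 0.01:
        within += 1

fraction = within / trials  # fraction of runs with |p_hat - p| <= 0.01
```

In practice essentially every run lands within 0.01 of p, since Chebyshev's inequality is far from tight here.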
Example 6.14 (Mean and variance estimation). We return to the estimators for mean µ and variance
σ 2 of an unknown distribution using independent random samples X̂1 , . . . , X̂n discussed in Section 5.6.
We introduced the sample average µ̂n = (X̂1 + · · · + X̂n )/n and the estimator for sample fluctuations σ̂n² := (1/n) Σᵢ₌₁ⁿ (X̂i − µ̂n )². We proved that E[µ̂n ] = µ, E[σ̂n²] = (1 − 1/n)σ², and that the variances of both µ̂n and σ̂n² tend to 0 as n → ∞. By Chebyshev’s inequality, it follows that for any ε > 0,
lim_{n→∞} P(|µ̂n − µ| ≥ ε) = 0
and
lim_{n→∞} P(|σ̂n² − (1 − 1/n)σ²| ≥ ε) = 0, which implies lim_{n→∞} P(|σ̂n² − σ²| ≥ ε) = 0.
An infinite collection of random variables X1 , X2 , . . . is independent and identically distributed 3 if:
• all Xi follow the same distribution, that is FXi (t) = FXj (t) for all i, j ∈ N and t ∈ R;
• every finite subcollection Xi1 , . . . , Xik is independent.
3 It is very common to write i.i.d. to abbreviate this condition.
Theorem 6.15 (Law of large numbers). Suppose that X1 , X2 , . . . are independent and identically
distributed random variables with well-defined expectations and variances, where E[Xi ] = µ and
Var(Xi ) ≤ c for all i ∈ N. Then, setting Sn = X1 + · · · + Xn , for any ε > 0 we have
lim_{n→∞} P(|Sn /n − µ| ≥ ε) = 0.
Proof. By independence, Var(Sn ) = Var(X1 ) + · · · + Var(Xn ) ≤ cn, using that Var(Xi ) ≤ c for i ∈ {1, . . . , n}. Since E[Sn ] = nµ by linearity, applying Chebyshev’s inequality to Sn gives
P(|Sn /n − µ| ≥ ε) = P(|Sn − nµ| ≥ εn) ≤ Var(Sn )/(εn)² ≤ cn/(ε²n²) = c/(nε²),
which tends to 0 as n → ∞.
Remark 6.16. We have thought of the expectation E[X] of a random variable X as a sort of average,
or typical value. The law of large numbers gives strong support to this view: if we take many
independent samples according to X (e.g. n repeated dice rolls) and average the results, the result is
extremely likely to be close to E[X].
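A short sketch of this effect for dice rolls (the sample sizes are arbitrary choices):

```python
import random

random.seed(0)

def dice_average(n):
    """Average of n independent fair dice rolls; the expectation is 3.5."""
    return sum(random.randint(1, 6) for _ in range(n)) / n

# The average stabilises near 3.5 as n grows, as the law of large numbers predicts.
averages = {n: dice_average(n) for n in (100, 10_000, 1_000_000)}
```

For n = 100 the average can easily be off by 0.1 or more, while for n = 1,000,000 it is almost always within a few thousandths of 3.5.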
Monte Carlo simulation. Here is a useful way to use randomness (and the law of large numbers) to approximate quantities. We are given a set S and a random variable Y which takes values in S, and would like to estimate P(Y ∈ A) for some set A ⊆ S.
A natural way to try to do this is to take independent random variables Y1 , . . . , Yn which all follow
the same distribution as Y and to use the ratio |{1 ≤ i ≤ n : Yi ∈ A}|/n to estimate P(Y1 ∈ A). But
does this work (most of the time)?
Yes, it does. Apply Theorem 6.15 to the random variables Xi = 1{Yi ∈A} , for i = 1, . . . , n. Then for
any ε > 0, we find
lim_{n→∞} P( | |{1 ≤ i ≤ n : Yi ∈ A}|/n − P(Y1 ∈ A) | > ε ) = 0, (1)
[Figures 1 and 2 omitted.]
Figure 1: The area of the yellow region between the curve f (x) and the x-axis represents ∫_a^b f (x) dx.
Figure 2: 11 of the 20 points lie between f (x) and the x-axis, so we approximate the integral by (11/20) · M (b − a).
In particular, area(B) = ∫_a^b f (x) dx. It follows that P(Y1 ∈ B) = (M (b − a))⁻¹ · ∫_a^b f (x) dx. As justified in (1), with probability close to 1 this gives
|{1 ≤ i ≤ n : Yi ∈ B}|/n ≈ P(Y1 ∈ B) = (∫_a^b f (x) dx)/(M (b − a)), and hence ∫_a^b f (x) dx ≈ M (b − a) · |{1 ≤ i ≤ n : Yi ∈ B}|/n.
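This recipe is easy to implement. A minimal sketch (the test integral ∫_0^π sin x dx = 2 and the bound M = 1 are illustrative choices, not part of the notes):

```python
import random
from math import sin, pi

random.seed(42)

def monte_carlo_integral(f, a, b, M, n=200_000):
    """Estimate the integral of f over [a, b] by sampling uniform points in
    [a, b] x [0, M] and counting the fraction that falls below the curve."""
    below = sum(random.uniform(0, M) <= f(random.uniform(a, b)) for _ in range(n))
    return M * (b - a) * below / n

estimate = monte_carlo_integral(sin, 0.0, pi, 1.0)  # true value: 2
```

With n = 200,000 samples the estimate is typically within a few thousandths of the true value.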
Example 6.18 (Finding π). A nice application of Monte Carlo simulation is in approximating π.
Let Y1 , Y2 , . . . , Yn be independent random variables with the uniform distribution on [−1, 1]2 =
{(x, y) : −1 ≤ x, y ≤ 1}. In other words, the two components of each Yi are independent ran-
dom variables with the continuous uniform distribution on [−1, 1]. For a set A ⊆ [−1, 1]2 , we
have P (Y1 ∈ A) = area(A)/4, where area(A) denotes the area of the set A 4 . As P (|Y1 | ≤ 1) =
P Y1 ∈ {(x, y) ∈ [−1, 1]2 : x2 + y 2 ≤ 1} = π/4, the law of large numbers gives
lim_{n→∞} P( | |{1 ≤ i ≤ n : |Yi | ≤ 1}|/n − π/4 | > ε ) = 0.
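A sketch of this estimator (the sample size and seed are arbitrary):

```python
import random

random.seed(7)
n = 400_000

# Count sample points from the square [-1, 1]^2 that land in the unit disc.
inside = sum(random.uniform(-1, 1) ** 2 + random.uniform(-1, 1) ** 2 <= 1
             for _ in range(n))
pi_estimate = 4 * inside / n
```

The law of large numbers guarantees that the estimate concentrates around π; the typical error here is of order a few thousandths.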
Chebyshev’s inequality shows that a random variable X with X ∼ binn,p typically differs from E[X] = np by about σX = √Var(X) = √(np(1 − p)). In this subsection we study the De Moivre–Laplace theorem, which gives a finer description of the behaviour of X, showing that X is well-approximated by a normal distribution. The central limit theorem, a later generalisation of De Moivre–Laplace, is widely considered the most important result in probability theory and statistics.
4 The definition of area is clear for rectangles, circles and other familiar objects. For more general sets, such a definition is the content of second and third year modules.
[Figure 3 omitted.]
Figure 3: Mass function of X ∼ bin100,0.5 . The smooth curve is x ↦ √(1/(50π)) exp(−(x − 50)²/50). The area of the shaded region equals P(47 ≤ X ≤ 55) and is approximated by the integral of the smooth curve from 46 to 55, or, more precisely, by the integral from 46.5 to 55.5.
Theorem 6.19 (De Moivre–Laplace). Let p ∈ (0, 1). Given n ∈ N let Xn ∼ binn,p . Then for any
t ∈ R,
lim_{n→∞} P( (Xn − np)/√(np(1 − p)) ≤ t ) = P(N ≤ t) = Φ(t) = (1/√(2π)) ∫_{−∞}^t e^{−x²/2} dx,
where N ∼ N (0, 1) follows the standard normal distribution.
Proof. Omitted.
The De Moivre–Laplace theorem is very useful as it allows us to estimate P(Xn ≤ k) for Xn ∼ binn,p
by a calculation for the standard normal distribution. Given such k, let
x(n, k) := (k − np)/√(np(1 − p)).
Theorem 6.19 then suggests the approximation
P(Xn ≤ k) ≈ Φ(x(n, k)). (2)
We note that if |x(n, k)| is not too large, the so-called midpoint rule 6 is a little more accurate:
P(Xn ≤ k) ≈ Φ( x(n, k) + 1/(2√(np(1 − p))) ) = P( np + √(np(1 − p)) · N ≤ k + 1/2 ). (3)
This is called the continuity correction and is supported by Figure 3 and the following example.
Example 6.20 (Dice). We roll a fair dice 600 times, and let X denote the number of times ‘6’ appears.
We are interested in the event {90 ≤ X ≤ 100}.
As √(np(1 − p)) = √(600 · (1/6) · (5/6)) = 9.1287 . . ., (2) gives
P(90 ≤ X ≤ 100) = P(X ≤ 100) − P(X ≤ 89) ≈ Φ(0) − Φ(−11/9.1287 . . .) = 0.3858 . . .
Similarly, the estimate (3) involving the continuity correction yields
P(90 ≤ X ≤ 100) = P(X ≤ 100) − P(X ≤ 89) ≈ Φ(0.0547 . . .) − Φ(−1.1502 . . .) = 0.3968 . . . .
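Both approximations are easy to reproduce numerically. A sketch using the error function from the standard library, via Φ(t) = (1 + erf(t/√2))/2:

```python
from math import erf, sqrt

def Phi(t):
    """Standard normal CDF."""
    return 0.5 * (1 + erf(t / sqrt(2)))

n, p = 600, 1 / 6
mu, sigma = n * p, sqrt(n * p * (1 - p))  # 100 and 9.1287...

def binom_cdf_approx(k, correction=False):
    x = (k - mu) / sigma
    if correction:
        x += 1 / (2 * sigma)  # the continuity correction from (3)
    return Phi(x)

plain = binom_cdf_approx(100) - binom_cdf_approx(89)                  # ~ 0.3858
corrected = binom_cdf_approx(100, True) - binom_cdf_approx(89, True)  # ~ 0.3968
```

The corrected value lies closer to the exact binomial probability.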
Example 6.21 (Greedy manager). A hotel has 250 rooms. Statistically, the probability of a no-show
is 0.15 per room and night. Looking to take advantage here, the hotel manager decides to overbook
and tolerate a probability of 0.03 to exceed capacities. How many rooms can the manager offer?
Let n be the number of bookings and X be the number of guests showing up. Then, X ∼ binn,0.85 .
Overbooking happens if and only if X ≥ 251, and this event shall have probability at most 3%. Using
the De Moivre-Laplace theorem with continuity correction, that is (3), we obtain
P(X ≥ 251) = 1 − P(X ≤ 250) ≈ 1 − Φ( x(n, 250) + 1/(2√(0.1275n)) ).
From Table 2, we can read off that the right hand side is smaller than 0.03 if (and only if)
x(n, 250) + 1/(2√(0.1275n)) ≥ 1.89.
As x(n, 250) = (250 − 0.85n)/√(0.1275n), this corresponds to n ≤ 281. Thus, the hotel manager can accept at most 281 bookings.
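The threshold n = 281 can be recovered by a direct search. A sketch, using the same normal approximation with continuity correction:

```python
from math import erf, sqrt

def Phi(t):
    return 0.5 * (1 + erf(t / sqrt(2)))

def p_overbooked(n, capacity=250, p_show=0.85):
    """Approximate P(more than `capacity` guests show up) among n bookings,
    via De Moivre-Laplace with continuity correction."""
    sigma = sqrt(n * p_show * (1 - p_show))
    x = (capacity - p_show * n) / sigma + 1 / (2 * sigma)
    return 1 - Phi(x)

# Largest number of bookings keeping the overbooking probability at most 3%.
max_bookings = max(n for n in range(250, 320) if p_overbooked(n) <= 0.03)  # 281
```

Since the overbooking probability increases with n, the search simply finds the last n below the 3% threshold.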
Example 6.22 (Bribes). A city has N = 1,000,000 residents, and two candidates, A and B, run for mayor. Every citizen flips a fair coin to reach a decision, and so the two candidates are equally likely to win. (A tie is possible, but very unlikely.)
Now suppose candidate A bribes 1, 000 voters. Does this increase the candidate’s chances significantly?
Assume that the remaining 999, 000 people still flip fair coins and denote by X the total number of votes
candidate A receives from these people. Then X follows the binomial distribution with parameters
999, 000 and 1/2. By (2), we have
P(A wins) = P(X ≥ 499,001) = 1 − P(X ≤ 499,000) = 1 − P( (X − 499,500)/499.7499 . . . ≤ −500/499.7499 . . . ) ≈ 1 − Φ(−1) = Φ(1) = 0.841 . . .
Thus, only 1, 000 bribes are enough to boost the candidate’s chances significantly!
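A sketch of this computation:

```python
from math import erf, sqrt

def Phi(t):
    return 0.5 * (1 + erf(t / sqrt(2)))

# Votes from the 999,000 honest voters: X ~ bin(999000, 1/2).
n, p = 999_000, 0.5
mu, sigma = n * p, sqrt(n * p * (1 - p))  # 499500 and 499.7499...

# A wins iff X >= 499001, i.e. X > 499000.
p_win = 1 - Phi((499_000 - mu) / sigma)   # ~ Phi(1) ~ 0.841
```

The 500-vote head start is almost exactly one standard deviation of X, which is why Φ(1) appears.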
The central limit theorem generalises the De Moivre-Laplace theorem to arbitrary distributions with
finite variance.
Theorem 6.23 (Central limit theorem). Let X1 , X2 , . . . be independent and identically distributed
random variables with well-defined expectations and variances, where µ := E [Xi ] and σ 2 := Var(Xi ) >
0. Then, setting Sn = X1 + · · · + Xn , for all t ∈ R we have
lim_{n→∞} P( (Sn − nµ)/(σ√n) ≤ t ) = P(N ≤ t) = Φ(t) = (1/√(2π)) ∫_{−∞}^t e^{−x²/2} dx, (4)
where N ∼ N (0, 1) follows the standard normal distribution.
Proof. Omitted, as all proofs are quite involved. See Statistics or Fourier Analysis in later years.
Remark 6.24. .
• The central limit theorem establishes the normal distribution as a key probability distribution.
It is remarkable that the same limit appears in (4), regardless of the distribution you start with
(i.e. the distribution of the random variables Xi ).
• The theorem also explains why the normal distribution appears so often in the real world: if a random quantity is determined as a sum of many (roughly) independent contributions, then it can be approximated by a random variable following the normal distribution with suitable expectation and variance 7 .
Example 6.25. Your friend rolls a fair dice N = 10, 000 times and claims to have obtained a total
sum of 35, 854. Should you believe this?
Let X1 , . . . , XN denote the results of the individual dice rolls and let S denote the total sum. Then X1 , . . . , XN all follow the uniform distribution on {1, . . . , 6} and these random variables are independent. We have E[X1 ] = 3.5 and Var(X1 ) = 35/12, as computed in Section 5. By linearity of expectation, E[S] = Σᵢ₌₁¹⁰⁰⁰⁰ E[Xi ] = 35,000, and your friend claims a deviation of at least 854, which has probability
P(|S − E[S]| ≥ 854) = P( |S − N · E[X1 ]|/(√N · σX1 ) ≥ 854/(100 · √(35/12)) ) ≈ 2 · (1 − Φ(√(12/35) · 8.54)).
Theorem 6.23 justifies the approximation. This is a tiny probability, so the claim is extremely unlikely.
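The size of this probability is easy to evaluate numerically; a sketch:

```python
from math import erf, sqrt

def Phi(t):
    return 0.5 * (1 + erf(t / sqrt(2)))

N, mu, var = 10_000, 3.5, 35 / 12
sigma_S = sqrt(N * var)            # standard deviation of the total sum S

z = 854 / sigma_S                  # ~ 5.0 standard deviations
p_deviation = 2 * (1 - Phi(z))     # two-sided tail, under the CLT approximation
```

A deviation of 854 is about five standard deviations, so the approximate probability is below one in a million.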
Confidence intervals. Let X̂1 , . . . , X̂n be independent random variables following the same distribution with mean µ and variance σ². We consider these random variables as observed data and would like to estimate the unknown mean µ. The natural guess is to use the sample average
µ̂n = (X̂1 + · · · + X̂n )/n. (5)
Beyond this single guess, given a confidence level α ∈ (0, 1), we would like to find an interval Iα = Iα (X̂1 , . . . , X̂n ), computed from the data, such that
P(µ ∈ Iα ) ≥ α. (6)
Here it is the interval Iα that is randomly chosen (based on X̂1 , . . . , X̂n ), and not µ (which is fixed, as µ is the mean of the distribution). Thus, in words, (6) says that the random interval Iα covers µ with probability at least α.
7 A nice example here is human height, which is roughly the sum of the lengths of many bones.
Here, we only construct approximate confidence intervals. Let us first assume that the value of σ is
known to us. Given the samples X̂1 , . . . , X̂n , take µ̂n as in (5) and set Iα to be
Iα = [µ̂n − zα σ/√n, µ̂n + zα σ/√n],
where we still need to select the value of zα . This approximately satisfies (6), as by Theorem 6.23
P(µ ∈ Iα ) = P( µ ∈ [µ̂n − zα σ/√n, µ̂n + zα σ/√n] ) = P( −zα ≤ (Σᵢ₌₁ⁿ X̂i − nµ)/(σ√n) ≤ zα )
≈ Φ(zα ) − Φ(−zα ) = 2Φ(zα ) − 1 = α,
provided we choose zα so that Φ(zα ) = (1 + α)/2 (for instance, z0.95 = 1.96 from Table 2).
It turns out that, if the unknown distribution is N (µ, σ 2 ), then Iα gives an exact confidence interval,
that is, P (µ ∈ Iα ) = α independently of the value of n.
Typically σ is unknown in applications. Then one can either use an upper bound for σ valid for all possible values of µ, or estimate σ from the data, adding a second level of approximation.
Example 6.26 (Swiss babies). Between 1901 and 2016, n = 9, 569, 478 babies were born in Switzer-
land, 4, 907, 770 of them were boys. We assume that the number of boys born followed the binomial
distribution with parameters n and p where p is unknown. In other words, we are in the situation of
the previous example with Bernoulli random variables X̂1 , X̂2 , . . . , X̂n with parameter p. The observed probability is p̂n = µ̂n = 0.512856 . . . 8 We do not know σ = √(p(1 − p)), but we can use the bound σ ≤ 1/2, valid for all p. 9 An approximate 95%-confidence interval is given by
I0.95 = [p̂n − 1.96/(2√n), p̂n + 1.96/(2√n)] = [p̂n − 0.000316 . . . , p̂n + 0.000316 . . .] = [0.512539 . . . , 0.51317 . . .].
We caution that this does not mean that I0.95 covers p with probability at least (roughly) 0.95. Indeed, there is no longer any randomness here, since p is fixed (but unknown) and I0.95 is also fixed above. Thus I0.95 either contains p or it does not. We can only say the following: if p ∉ I0.95 , then the probability that the observed interval I0.95 took such an extreme value was at most (roughly) 0.05.
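The numbers of this example can be checked directly; a sketch:

```python
from math import sqrt

n, boys = 9_569_478, 4_907_770
p_hat = boys / n                   # 0.512856...

# Half-width z_0.95 * sigma / sqrt(n) with the worst-case bound sigma <= 1/2.
half_width = 1.96 / (2 * sqrt(n))  # ~ 0.000316
interval = (p_hat - half_width, p_hat + half_width)
```

With almost ten million samples the interval is very narrow, and in particular lies entirely above 1/2.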
8 The ratio is similar in the UK and in other western societies.
9 As the true value of p should be close to 1/2, this bound is very precise.
Most important takeaways in this chapter. You should
• know the statements of Markov’s inequality, Chebyshev’s inequality, the law of large numbers
and the De Moivre-Laplace theorem.
• be familiar with the concept of confidence intervals, their interpretation and how to find them
in simple examples.
Table 2: Table for Φ(t) = (1/√(2π)) ∫_{−∞}^t e^{−x²/2} dx, t ≥ 0.
t 0 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0 0.5 0.50399 0.50798 0.51197 0.51595 0.51994 0.52392 0.5279 0.53188 0.53586
0.1 0.53983 0.5438 0.54776 0.55172 0.55567 0.55962 0.56356 0.56749 0.57142 0.57535
0.2 0.57926 0.58317 0.58706 0.59095 0.59483 0.59871 0.60257 0.60642 0.61026 0.61409
0.3 0.61791 0.62172 0.62552 0.6293 0.63307 0.63683 0.64058 0.64431 0.64803 0.65173
0.4 0.65542 0.6591 0.66276 0.6664 0.67003 0.67364 0.67724 0.68082 0.68439 0.68793
0.5 0.69146 0.69497 0.69847 0.70194 0.7054 0.70884 0.71226 0.71566 0.71904 0.7224
0.6 0.72575 0.72907 0.73237 0.73565 0.73891 0.74215 0.74537 0.74857 0.75175 0.7549
0.7 0.75804 0.76115 0.76424 0.7673 0.77035 0.77337 0.77637 0.77935 0.7823 0.78524
0.8 0.78814 0.79103 0.79389 0.79673 0.79955 0.80234 0.80511 0.80785 0.81057 0.81327
0.9 0.81594 0.81859 0.82121 0.82381 0.82639 0.82894 0.83147 0.83398 0.83646 0.83891
1 0.84134 0.84375 0.84614 0.84849 0.85083 0.85314 0.85543 0.85769 0.85993 0.86214
1.1 0.86433 0.8665 0.86864 0.87076 0.87286 0.87493 0.87698 0.879 0.881 0.88298
1.2 0.88493 0.88686 0.88877 0.89065 0.89251 0.89435 0.89617 0.89796 0.89973 0.90147
1.3 0.9032 0.9049 0.90658 0.90824 0.90988 0.91149 0.91309 0.91466 0.91621 0.91774
1.4 0.91924 0.92073 0.9222 0.92364 0.92507 0.92647 0.92785 0.92922 0.93056 0.93189
1.5 0.93319 0.93448 0.93574 0.93699 0.93822 0.93943 0.94062 0.94179 0.94295 0.94408
1.6 0.9452 0.9463 0.94738 0.94845 0.9495 0.95053 0.95154 0.95254 0.95352 0.95449
1.7 0.95543 0.95637 0.95728 0.95818 0.95907 0.95994 0.9608 0.96164 0.96246 0.96327
1.8 0.96407 0.96485 0.96562 0.96638 0.96712 0.96784 0.96856 0.96926 0.96995 0.97062
1.9 0.97128 0.97193 0.97257 0.9732 0.97381 0.97441 0.975 0.97558 0.97615 0.9767
2 0.97725 0.97778 0.97831 0.97882 0.97932 0.97982 0.9803 0.98077 0.98124 0.98169
2.1 0.98214 0.98257 0.983 0.98341 0.98382 0.98422 0.98461 0.985 0.98537 0.98574
2.2 0.9861 0.98645 0.98679 0.98713 0.98745 0.98778 0.98809 0.9884 0.9887 0.98899
2.3 0.98928 0.98956 0.98983 0.9901 0.99036 0.99061 0.99086 0.99111 0.99134 0.99158
2.4 0.9918 0.99202 0.99224 0.99245 0.99266 0.99286 0.99305 0.99324 0.99343 0.99361
2.5 0.99379 0.99396 0.99413 0.9943 0.99446 0.99461 0.99477 0.99492 0.99506 0.9952
2.6 0.99534 0.99547 0.9956 0.99573 0.99585 0.99598 0.99609 0.99621 0.99632 0.99643
2.7 0.99653 0.99664 0.99674 0.99683 0.99693 0.99702 0.99711 0.9972 0.99728 0.99736
2.8 0.99744 0.99752 0.9976 0.99767 0.99774 0.99781 0.99788 0.99795 0.99801 0.99807
2.9 0.99813 0.99819 0.99825 0.99831 0.99836 0.99841 0.99846 0.99851 0.99856 0.99861
3 0.99865 0.99869 0.99874 0.99878 0.99882 0.99886 0.99889 0.99893 0.99896 0.999