Solution: For every x ∈ R, the set {x} is the complement of an open set, and hence
Borel. Since there are only countably many rational numbers (see Proposition 1.3.13
on page 15 of our text), we may express Q as the countable union of Borel sets:
Q = ∪_{x∈Q} {x}. Therefore Q is a Borel set.
Remark: a countable union ∪_{n=1}^∞ Fn of countable sets is itself countable. Indeed,
let c map each point of the union to a pair (n, k) ∈ N × N, where n is the first index
with the point in Fn and the point is the k-th element in some fixed enumeration of Fn.
We know that N × N is countable, so there is a one-to-one map φ from N × N to N. The
composition φ ◦ c is a one-to-one map from ∪_{n=1}^∞ Fn into N.
Solution: The solution depends on the fact that we have a concrete way to identify
sets in A. Define F = {E ⊆ R | E is countable, or E^c is countable}; we claim
that A = F. If E is a countable set, then E = ∪_{x∈E} {x} is the countable union of
singletons, and so belongs to σ(C) = A. If E^c is countable, then E^c, and hence E,
belongs to σ(C) = A. This shows that F ⊆ A. To prove the other inclusion, we
note that C ⊆ F, so it suffices to prove that F is a σ-algebra.
(a) The empty set ∅ is countable, so R = ∅^c ∈ F.
(b) If E is countable, then E^c has countable complement, while if E has countable
complement, then E^c is countable. Either way, E ∈ F implies E^c ∈ F.
(c) Suppose that E1, E2, . . . belong to F. If all of the En's are countable, then
so is the union (see the remark above), and hence it belongs to F. On the other
hand, if one of the En's, say EN, has countable complement, then
(∪_n En)^c = ∩_n En^c ⊆ EN^c is countable, so that ∪_n En ∈ F. Either way,
∪_n En ∈ F.
Since singletons are Borel sets, so is every member of σ(C) = A. However, the
Borel set (0, 1) is not countable, and neither is its complement (−∞, 0] ∪ [1, ∞).
Thus (0, 1) is an example of a Borel set that does not belong to A.
5. Prove the following, where (Ω, F, P ) is a probability space and all sets are assumed to
be in F.
(i) If A ⊆ B, then P (A) ≤ P (B).
(ii) P(∪_{n=1}^∞ An) ≤ Σ_{n=1}^∞ P(An).
(iii) If An+1 ⊆ An for all n, then P(An) → P(∩_{n=1}^∞ An).
Solution:
(i) B is the disjoint union B = A ∪ (B \ A), so P (B) = P (A) + P (B \ A) ≥ P (A).
(ii) Define A′1 = A1 and, for n ≥ 2, A′n = An \ (∪_{i=1}^{n−1} Ai). Then the A′n's are
disjoint, A′n ⊆ An for each n, and ∪_n An = ∪_n A′n. Therefore

P(∪_n An) = P(∪_n A′n) = Σ_n P(A′n) ≤ Σ_n P(An).

(iii) Write A = ∩_{n=1}^∞ An. Then An is the disjoint union
An = (An \ An+1) ∪ (An+1 \ An+2) ∪ · · · ∪ A, so σ-additivity gives
P(An) = Σ_{m≥n} P(Am \ Am+1) + P(A). The sum is the tail of a convergent series,
hence tends to zero, and therefore P(An) → P(∩_{n=1}^∞ An).
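The disjointification in (ii) is easy to sanity-check on a finite probability space.
Below is a small Python sketch (the sample space, probabilities, and sets are
illustrative assumptions, not from the text): it builds the disjoint sets A′n and
verifies that the union is unchanged and that the union bound holds.

    from fractions import Fraction

    omega = range(6)                           # hypothetical sample space {0,...,5}
    P = {w: Fraction(1, 6) for w in omega}     # uniform probabilities

    def prob(A):
        # P(A) for a subset A of the sample space
        return sum(P[w] for w in A)

    A_sets = [{0, 1, 2}, {1, 2, 3}, {3, 4}]    # an arbitrary sequence A_1, A_2, A_3

    # A'_n = A_n \ (A_1 ∪ ... ∪ A_{n-1}): disjoint sets with the same union
    A_prime, seen = [], set()
    for A in A_sets:
        A_prime.append(A - seen)
        seen |= A

    union = set().union(*A_sets)
    assert union == set().union(*A_prime)                  # same union
    assert prob(union) == sum(prob(B) for B in A_prime)    # additivity on disjoint pieces
    assert prob(union) <= sum(prob(A) for A in A_sets)     # the union bound of (ii)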
6. Prove that a simple function s (as in Definition 2.1.1) is a random variable (as in
Definition 2.1.6).
Solution: Write s = Σ_{k=1}^n ak 1_{Ak}, where the Ak's are disjoint members of F. Then
for any λ ∈ R we have

{ω | s(ω) < λ} = ∪_{k : ak < λ} Ak.

This is a finite union of members of F, and hence belongs to F, so s is a random
variable.
7. Suppose that Ω = {0, 1, 2, . . .}, F = all subsets of Ω, and P({n}) = e^{−1}/n! for n ∈ Ω.
Calculate E(X) where X(n) = n³ for all n ∈ Ω.
Solution: We need to calculate the infinite sum Σ_{n=0}^∞ n³ e^{−1}/n!. Let's begin with
a simpler problem: Σ_{n=0}^∞ n e^{−1}/n!. Here the factor of n cancels nicely with part
of the factorial on the bottom to give

Σ_{n=0}^∞ n e^{−1}/n! = Σ_{n=1}^∞ e^{−1}/(n − 1)! = Σ_{k=0}^∞ e^{−1}/k! = 1.
Attempting the same trick with n² shows that we will not get the desired cancel-
lation unless we write n² = n(n − 1) + n:

Σ_{n=0}^∞ n² e^{−1}/n! = Σ_{n=0}^∞ [n(n − 1) + n] e^{−1}/n!
                       = Σ_{n=0}^∞ n(n − 1) e^{−1}/n! + Σ_{n=0}^∞ n e^{−1}/n!
                       = Σ_{n=2}^∞ e^{−1}/(n − 2)! + Σ_{n=0}^∞ n e^{−1}/n!
                       = Σ_{k=0}^∞ e^{−1}/k! + Σ_{n=0}^∞ n e^{−1}/n!
                       = 1 + 1
                       = 2.

To solve the original question, write n³ = n(n − 1)(n − 2) + 3n(n − 1) + n and
repeat the method above to get Σ_{n=0}^∞ n³ e^{−1}/n! = 1 + 3 + 1 = 5.
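These three moments are easy to check numerically (an illustration, not part of the
original solution); truncating each series at 100 terms is far more than double
precision requires.

    import math

    def moment(power, terms=100):
        # Truncation of sum_{n>=0} n^power * e^{-1} / n!
        return sum(n**power * math.exp(-1) / math.factorial(n) for n in range(terms))

    print(round(moment(1), 9), round(moment(2), 9), round(moment(3), 9))  # 1.0 2.0 5.0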
9. Prove that E(X)² ≤ E(X²) for any non-negative random variable X. Hint: First look
at simple functions.
Solution: If s = Σ_{k=1}^n ak 1_{Ak} is a simple function, then so is its square
s² = Σ_{k=1}^n ak² 1_{Ak}, and E(s) = Σ_{k=1}^n ak P(Ak) and E(s²) = Σ_{k=1}^n ak² P(Ak).
Applying the Cauchy-Schwarz inequality to the vectors x = (a1 P(A1)^{1/2}, . . . , an P(An)^{1/2})
and y = (P(A1)^{1/2}, . . . , P(An)^{1/2}) gives

E(s)² = ⟨x, y⟩² ≤ |x|² |y|² = (Σ_{k=1}^n ak² P(Ak)) (Σ_{k=1}^n P(Ak)) ≤ E(s²),

since Σ_{k=1}^n P(Ak) ≤ 1. For general non-negative X, choose simple functions sk with
0 ≤ sk ↑ X; then E(sk)² ≤ E(sk²) ≤ E(X²), and letting k → ∞ gives E(X)² ≤ E(X²) by
monotone convergence.
Here’s an even better proof that uses the variance of a random variable.
0 ≤ E((X − E(X))²)
  = E(X² − 2XE(X) + E(X)²)
  = E(X²) − 2E(X)E(X) + E(X)²
  = E(X²) − E(X)².
Show that if Xn(ω) → X(ω) for every ω ∈ Ω, then Var(X) ≤ lim inf_n Var(Xn).
Solution: We may as well assume that lim inf_n Var(Xn) < ∞; otherwise the
conclusion is trivial. At the same time, let's extract a subsequence X_{n′} so that
Var(X_{n′}) → lim inf_n Var(Xn) as n′ → ∞. In other words, without loss of gener-
ality we may assume that sup_n Var(Xn) < ∞; call this value K.
Since Var(Xn) < ∞, we have E(Xn²) < ∞ and, by the previous exercise, this
implies E(|Xn|) ≤ E(Xn²)^{1/2} < ∞. In other words, Xn is integrable. Our next job
is to show that E(Xn) is a bounded sequence of numbers. The triangle inequality
|X(ω) − E(Xn)| ≤ |X(ω) − Xn(ω)| + |Xn(ω) − E(Xn)| implies the following set
inclusion for any value M > 0:

{ω : |X(ω) − E(Xn)| > 2M} ⊆ {ω : |X(ω) − Xn(ω)| > M} ∪ {ω : |Xn(ω) − E(Xn)| > M},

and hence

P(|X − E(Xn)| > 2M) ≤ P(|X − Xn| > M) + P(|Xn − E(Xn)| > M).

Take expectations over the inequality 1_{|Xn−E(Xn)|>M} ≤ (Xn − E(Xn))²/M²
to give P(|Xn − E(Xn)| > M) ≤ Var(Xn)/M² ≤ K/M². Combined with the
previous inequality we obtain

P(|X − E(Xn)| > 2M) ≤ P(|X − Xn| > M) + K/M².
Fix M > 0 so large that K/M² < 1/8. The pointwise convergence of Xn to X
implies that the sets Bn = ∪_{m≥n} {|X − Xm| > M} decrease to ∅ as n → ∞; since
{|X − Xn| > M} ⊆ Bn, it follows that P(|X − Xn| > M) → 0. Therefore we can
choose N′ so large that n ≥ N′ implies P(|X − Xn| > M) ≤ 1/8, and thus
P(|X − E(Xn)| > 2M) ≤ 1/8 + K/M² ≤ 1/4.
The sets {|X| > N} decrease to ∅ as N → ∞, so for some large N we
have P(|X| > N) ≤ 1/4. Now let's define a set of good points:

G = {ω : |X(ω)| ≤ N} ∩ {ω : |X(ω) − E(Xn)| ≤ 2M}.

If G is not empty, then for ωg ∈ G we have |E(Xn)| ≤ 2M + |X(ωg)| ≤ 2M + N.
Our bounds show that P(G^c) = P({|X| > N} ∪ {|X − E(Xn)| > 2M}) ≤
1/4 + 1/4 = 1/2, so that G is non-empty for all n ≥ N′. In other words, |E(Xn)| ≤
2M + N for n ≥ N′, which implies that E(Xn) is a bounded sequence.
Now we have that E(Xn²) = Var(Xn) + E(Xn)² is a bounded sequence. Ap-
plying Fatou's lemma to the non-negative random variables Xn², we conclude that
X² is integrable and E(X²) ≤ lim inf_n E(Xn²). From problem 4, this also shows
that X is integrable since E(|X|) ≤ E(X²)^{1/2} < ∞.
For any random variable Y and constant c > 0, let's define the truncated
random variable Y^c = Y 1_{−c≤Y≤c}. For any c > 0, we have

|E(Xn) − E(X)| ≤ |E(Xn) − E(Xn^c)| + |E(Xn^c) − E(X^c)| + |E(X^c) − E(X)|
              ≤ |E(Xn 1_{|Xn|>c})| + |E(Xn^c) − E(X^c)| + |E(X 1_{|X|>c})|
              ≤ E(Xn²/c) + |E(Xn^c) − E(X^c)| + E(X²/c)
              ≤ sup_n E(Xn²)/c + |E(Xn^c) − E(X^c)| + E(X²)/c.
Now for every c, the sequence Xn^c is dominated by the integrable random
variable c1_Ω, and converges pointwise to X^c. Therefore the dominated convergence
theorem tells us E(Xn^c) → E(X^c). Letting n → ∞ and then c → ∞ in the above
inequality shows that, in fact, E(Xn) → E(X).
Finally, we may apply Fatou's lemma to the sequence (Xn − E(Xn))², which
converges pointwise to (X − E(X))², to obtain

Var(X) = E((X − E(X))²) ≤ lim inf_n E((Xn − E(Xn))²) = lim inf_n Var(Xn).

Whew!
Solution:
(1) ⇒ (2) For every n we can write An as the disjoint union of U sets An =
(An \ An+1 ) ∪ (An+1 \ An+2 ) ∪ . . . ∪ A, and use σ-additivity to obtain P(An ) =
P(An \ An+1 ) + P(An+1 \ An+2 ) + · · · + P(A). This shows that P(An ) − P(A) is
the tail of a convergent series, and thus converges to zero as n → ∞.
(2) ⇒ (3) (3) is a special case of (2).
(3) ⇒ (1) Suppose that (3) holds and that Em ∈ U are disjoint and E =
∪_{m=1}^∞ Em ∈ U. For each n ∈ N define the U set An = E \ (∪_{m=1}^{n−1} Em) = ∪_{m=n}^∞ Em.
We have An ⊇ An+1 and ∩_{n=1}^∞ An = ∅, so we know P(An) → 0. On the other hand,
by finite additivity we have P(E) = P(E1) + · · · + P(En−1) + P(An), so letting
n → ∞ we obtain P(E) = Σ_{m=1}^∞ P(Em), which says that P is σ-additive.
(2) ⇔ (5) This follows since U is closed under complementation and P is a
finitely additive probability, so that P(A^c) = 1 − P(A).
(3) ⇔ (4) These statements are contrapositives of each other.
(2) µ((a, c)) = lim_n µ((a, c − 1/n]) = lim_n (F(c − 1/n) − F(a)) = F(c−) − F(a).
(3) We have two different ways to calculate (µ × ν)(B). The first is by definition:
(µ × ν)(B) = µ((a, b])ν((a, b]) = (F(b) − F(a))(G(b) − G(a)). The second is by
adding (µ × ν)(B⁻) and (µ × ν)(B⁺). We already know that

(µ × ν)(B⁻) = ∫_{(a,b]} ν((a, x]) µ(dx) = ∫_{(a,b]} (G(x) − G(a)) F(dx),
(4) Starting with the equation above and multiplying out where possible gives
13. Prove that if Xn → X in L^∞, then Xn → X almost surely.
14. Prove that if Xn → X in probability, then Xn → X weakly.
Solution: Since Xn → X in probability, we have φ(Xn) → φ(X) in probability for any
continuous bounded φ. Using the dominated convergence theorem below (problem 15),
we have E(φ(Xn)) → E(φ(X)), which by definition means Xn → X weakly.
15. (Dominated convergence theorem) Prove that if Xn → X in probability and
|Xn| ≤ Y ∈ L¹, then E(Xn) → E(X).
16. Prove or disprove the following implication for convergence in L^p, almost surely, and
in probability: Xn → X implies (1/N) Σ_{n=1}^N Xn → X.
Solution: (a.s.) It suffices to show that Xn(ω) → X(ω) implies (1/N) Σ_{n=1}^N Xn(ω) →
X(ω). Suppose Xn(ω) → X(ω), and pick ε > 0. Let n_ε be such that sup_{n≥n_ε} |Xn(ω) −
X(ω)| ≤ ε. Choose N_ε so large that Σ_{n=1}^{n_ε} |Xn(ω) − X(ω)| ≤ εN_ε. Then for
N ≥ N_ε we get

|(1/N) Σ_{n=1}^N Xn(ω) − X(ω)| ≤ (1/N) Σ_{n=1}^N |Xn(ω) − X(ω)|
  ≤ (1/N) Σ_{n=1}^{n_ε} |Xn(ω) − X(ω)| + (1/N) Σ_{n=n_ε+1}^N |Xn(ω) − X(ω)|
  ≤ ε + ((N − n_ε)/N) ε
  ≤ 2ε,

which proves (1/N) Σ_{n=1}^N Xn(ω) → X(ω).
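The bookkeeping with n_ε and N_ε can be watched numerically. A Python sketch with
the hypothetical deterministic sequence xn = 1 + 1/n (so xn → 1) shows the running
averages converging to the same limit, just more slowly.

    xs = [1 + 1/n for n in range(1, 100001)]   # x_n -> 1
    s, avgs = 0.0, []
    for i, x in enumerate(xs, start=1):
        s += x
        avgs.append(s / i)                     # running average (1/N) sum_{n<=N} x_n
    print(xs[-1], avgs[-1])                    # both close to 1; the average lags behind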
Solution: (L^p) Pick ε > 0 and let n_ε be such that sup_{n≥n_ε} ‖Xn − X‖_p ≤ ε. Choose
N_ε so large that Σ_{n=1}^{n_ε} ‖Xn − X‖_p ≤ εN_ε. Then for N ≥ N_ε we get

‖(1/N) Σ_{n=1}^N Xn − X‖_p ≤ (1/N) Σ_{n=1}^N ‖Xn − X‖_p
  ≤ (1/N) Σ_{n=1}^{n_ε} ‖Xn − X‖_p + (1/N) Σ_{n=n_ε+1}^N ‖Xn − X‖_p
  ≤ ε + ((N − n_ε)/N) ε
  ≤ 2ε,

which proves (1/N) Σ_{n=1}^N Xn → X in L^p.
17. Prove that if sup_n E(Xn²) < ∞, then (Xn)_{n∈N} is uniformly integrable.
Solution: Let K = sup_n E(Xn²). On the set {|Xn| > M} we have |Xn| ≤ Xn²/M, so
E(|Xn| 1_{|Xn|>M}) ≤ E(Xn²)/M ≤ K/M. The bound K/M is independent of n and
tends to zero as M → ∞, which is exactly uniform integrability of (Xn).
18. Show that if X ≥ 0, then E[X | G] ≥ 0 almost surely.
Solution: Let G = {ω : E[X | G](ω) < 0}. Then E[X | G]1_G ≤ 0 and X1_G ≥ 0,
but G ∈ G so 0 ≤ ∫_G X dP = ∫_G E[X | G] dP ≤ 0. A non-positive random variable
with a zero integral must be zero: thus E[X | G]1_G = 0, and we conclude that
1_G = 0 almost surely, that is, P(G) = 0.
19. (Dominated convergence theorem) Prove that if Xn → X in probability and
|Xn| ≤ Y ∈ L¹, then E[Xn | G] → E[X | G] in probability.
Solution: Since |Xn − X| → 0 in probability and |Xn − X| ≤ 2Y, the dominated
convergence theorem (problem 15) tells us that E(|Xn − X|) → 0 as n → ∞. Since
|E[Xn − X | G]| ≤ E[|Xn − X| | G], taking expectations gives
E|E[Xn | G] − E[X | G]| ≤ E(|Xn − X|) → 0, which shows that E[Xn | G] → E[X | G]
in L¹ and hence also in probability.
21. True or false: If X and Y are independent, then E[X | G] and E[Y | G] are
independent for any G.
1. Determine the σ-algebra F of P∗-measurable subsets of R for the measure whose
distribution function is

F(x) = 1 if x ≥ 0, and F(x) = 0 if x < 0.
Solution: Before we try to determine F, let’s find out as much as we can about P
and P∗ . Define An = (−1/n, 0] so that An+1 ⊆ An for every n and ∩n An = {0}.
This implies that P(An ) → P({0}). Now P(An ) = F (0) − F (−1/n) = 1 for every
n and so we conclude that P({0}) = 1, and also P(R \ {0}) = 0.
For any subset E ⊆ R with 0 ∈ E we have P∗(E) ≥ P∗({0}) = P({0}) = 1.
On the other hand, if 0 ∉ E, then E ⊆ R \ {0} and so P∗(E) ≤ P∗(R \ {0}) =
P(R \ {0}) = 0. Therefore we have
P∗(E) = 1 if 0 ∈ E, and P∗(E) = 0 if 0 ∉ E.

In fact every subset of R is P∗-measurable, so F consists of all subsets of R. For
example, testing the set Q: since 0 ∈ Q, the point 0 lies in E ∩ Q exactly when it
lies in E, so if 0 ∈ E then

1 = P∗(E) = P∗(E ∩ Q) + P∗(E ∩ Q^c) = 1,

while if 0 ∉ E then

0 = P∗(E) = P∗(E ∩ Q) + P∗(E ∩ Q^c) = 0.

The same computation works with Q replaced by any A ⊆ R, verifying the Carathéodory
criterion.
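A quick computational way to convince yourself that every set passes the Carathéodory
test for this P∗; the finite universe below is an assumption made only so the check
can enumerate subsets.

    from itertools import chain, combinations

    def p_star(E):
        # Outer measure: 1 if 0 is in E, else 0
        return 1 if 0 in E else 0

    universe = {-2, -1, 0, 1, 2}
    all_A = chain.from_iterable(combinations(universe, r) for r in range(len(universe) + 1))
    tests = [{0, 1}, {-1, 2}, set(universe)]
    for A in map(set, all_A):
        for E in tests:
            # Caratheodory criterion: P*(E) = P*(E ∩ A) + P*(E \ A)
            assert p_star(E) == p_star(E & A) + p_star(E - A)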
2. If P is a probability measure on (R, B(R)) and F its distribution function, show that
F is continuous at x if and only if P({x}) = 0.
Solution: Since F is non-decreasing, it has a left limit at x given by

F(x−) = lim_n F(x − 1/n) = lim_n P((−∞, x − 1/n]) = P((−∞, x)).

Subtracting gives

P({x}) = P((−∞, x]) − P((−∞, x)) = F(x) − F(x−),

which shows that P({x}) = 0 if and only if F(x) = F(x−). This is the same as
continuity at x, since F is right-continuous.
P∗ (E ∩ (Q \ A)) ≤ P∗ (Q \ A) ≤ P∗ (B \ A) = P(B \ A) = 0.
Consequently we obtain
1. Give an example where Xn (ω) → X(ω) for every ω ∈ Ω, but E(X) < lim inf n E(Xn ).
Solution: Let Ω = N, F = all subsets of Ω, and P({n}) = 2^{−n} for n ≥ 1. Define
random variables by Xn = 2^n 1_{{n}}. For fixed ω, we have Xn(ω) = 0 for all n > ω,
so Xn(ω) → 0 =: X(ω). On the other hand, E(Xn) = 2^n P({n}) = 2^n 2^{−n} = 1 for
all n, so E(X) = 0 < 1 = lim inf_n E(Xn).
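The example can be confirmed exactly with rational arithmetic (the truncation at 60
terms is an arbitrary illustrative choice; every term other than ω = n is zero anyway).

    from fractions import Fraction

    def E_Xn(n, terms=60):
        # E(X_n) = sum_w X_n(w) P({w}) with X_n = 2^n on {n} and 0 elsewhere
        return sum((Fraction(2)**n if w == n else 0) * Fraction(1, 2)**w
                   for w in range(1, terms))

    assert all(E_Xn(n) == 1 for n in range(1, 20))
    # while X_n(w) -> 0 for every fixed w, so E(lim X_n) = 0 < 1 = lim inf E(X_n)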
2. Show that if E[(X − a)²] < ∞ for some a ∈ R, then E[(X − b)²] < ∞ for all b ∈ R.
Solution: If E[(X − a)²] < ∞, then the result follows for any b ∈ R by integrating
the inequality (X − b)² ≤ 2(X − a)² + 2(b − a)².
3. Prove that ∫ X dP = sup{∫ s dP | 0 ≤ s ≤ X, s simple} for non-negative X.
Solution: If 0 ≤ s ≤ X with s simple, then ∫ s dP ≤ ∫ X dP, and so
sup{∫ s dP | 0 ≤ s ≤ X, s simple} ≤ ∫ X dP. On the other hand, ∫ X dP is defined
as lim_k ∫ sk dP where sk is a sequence of simple functions that increases to X.
Therefore ∫ X dP = lim_k ∫ sk dP ≤ sup{∫ s dP | 0 ≤ s ≤ X, s simple}, and the two
quantities are equal.
4. Let P, Q be probabilities on (R, B(R)) where P has the density function f. Prove
that h(z) = ∫ f(z − x) Q(dx) is the density of the convolution P ⋆ Q.
Solution: For any Borel set B, we have by the definition of convolution and
Fubini's theorem

(P ⋆ Q)(B) = ∫_{R²} 1_B(x + y) (P × Q)(dx, dy) = ∫_R ∫_R 1_B(x + y) P(dx) Q(dy).
This is true whether or not the value is finite. Writing P(dx) = f(x) dx and
substituting z = x + y in the inner integral gives ∫_R 1_B(x + y) P(dx) = ∫_B f(z − y) dz,
so by Fubini again (P ⋆ Q)(B) = ∫_B (∫_R f(z − y) Q(dy)) dz = ∫_B h(z) dz, which says
that h is the density of P ⋆ Q.
Now if E[(X − y)²] = ∞ for all y ∈ R, then both Var(X) = E[(X − E(X))²] and
∫_R E[(X − y)²] P(dy) are infinite. Let's suppose that E[(X − y)²] < ∞ for some y ∈ R,
and hence by problem 2, for all y ∈ R. Then expanding the square is justified and
we obtain
∫_R E[(X − y)²] P(dy) = ∫_R E[X² − 2yX + y²] P(dy)
                      = ∫_R (E[X²] − 2yE[X] + y²) P(dy)
                      = ∫_R E[X²] P(dy) − 2E[X] ∫_R y P(dy) + ∫_R y² P(dy)
                      = E[X²] − 2E[X]E[X] + E[X²]
                      = 2Var(X).
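The identity checks out exactly on a small discrete distribution (the particular
values and weights below are an illustrative assumption; any distribution with finite
variance works).

    from fractions import Fraction

    dist = {0: Fraction(1, 2), 1: Fraction(1, 3), 4: Fraction(1, 6)}  # value -> probability

    EX = sum(x * p for x, p in dist.items())
    EX2 = sum(x * x * p for x, p in dist.items())
    var = EX2 - EX**2

    # Left side: integral of E[(X - y)^2] against P(dy), P the law of X
    lhs = sum(py * sum(px * (x - y)**2 for x, px in dist.items())
              for y, py in dist.items())
    assert lhs == 2 * var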
STAT 571 Final Exam
April 22, 1999

Instructions: This is an open book exam. You can use your text, your notes, or
any other book you care to bring. You have three hours.
1. If you roll a fair die, how long on average before the pattern “ . . . ... .. .. ..... ....
.. ” appears?
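The die-face pattern in the problem has not survived reproduction here, so the
following simulation sketch uses a stand-in pattern (1, 2, 3); substitute the intended
one. For a pattern with no self-overlap, such as (1, 2, 3), the estimate should be
near 6³ = 216.

    import random

    def waiting_time(pattern, rng):
        # Roll a fair die until the last len(pattern) rolls match the pattern
        rolls, k = [], len(pattern)
        while True:
            rolls.append(rng.randint(1, 6))
            if len(rolls) >= k and tuple(rolls[-k:]) == pattern:
                return len(rolls)

    rng = random.Random(0)
    trials = 10000
    print(sum(waiting_time((1, 2, 3), rng) for _ in range(trials)) / trials)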
2. Let Ω = (0, 1], F = B((0, 1]), and P be Lebesgue measure on (0, 1]. Define the
sub-σ-algebra G = σ{(0, 1/4], (1/4, 1/2], (1/2, 1]} and the random variable X(ω) = ω².
Write out an explicit formula for E(X | G)(ω).
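On each generating atom A, E(X | G) is constant and equals the average
(1/P(A)) ∫_A ω² dω. A short sketch of that computation (offered as an illustration;
the exact atom averages work out to 1/48, 7/48 and 7/12):

    atoms = [(0.0, 0.25), (0.25, 0.5), (0.5, 1.0)]

    def avg_square(a, b):
        # (1/(b-a)) * integral of w^2 over (a, b] = (b^3 - a^3) / (3(b - a))
        return (b**3 - a**3) / (3 * (b - a))

    for a, b in atoms:
        print((a, b), avg_square(a, b))   # 1/48, 7/48, 7/12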
4. For X ∈ L², prove that the random variable E(X | G) has smaller variance than X.
5. Without using any mathematical notation, explain in grammatical English the mean-
ing of a stopping time. Why are they defined in this way, and why do we only consider
random times with this special property?
7. Let S be a stopping time. Prove that T(ω) = inf{n > S(ω) : Xn(ω) ≤ 3} is an
(Fn)-stopping time, where (Xn)_{n∈N} is adapted to the filtration (Fn)_{n∈N}.