Lec Notes 10
10. Martingales
Associated reading: Chapter 6 of Ash and Doléans-Dade; Sec 5.2, 5.3, 5.4 of Durrett.
Overview
Martingales are elegant and powerful tools for studying sequences of dependent random variables.
The concept originated in gambling, where a gambler can adjust the bet according to the previous
results. In a simple version, a gambler bets 1 dollar in the first game. If he wins, he stops
playing; otherwise he doubles the bet until he wins. If each game is an i.i.d. coin toss with
non-zero winning probability and the gambler has an infinite amount of money, then he will win
one dollar with probability one.
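A minimal simulation sketch of this doubling strategy (the win probability p, seed, and helper names are illustrative assumptions, not from the text):

```python
import random

def double_until_win(p, rng):
    """Play the doubling strategy once: bet 1 and double after every loss.
    Returns (net_gain, final_bet); assumes unlimited capital and time."""
    bet, losses = 1, 0
    while rng.random() >= p:          # lose this round
        losses += bet
        bet *= 2
    return bet - losses, bet          # the winning play pays the current bet

rng = random.Random(0)
results = [double_until_win(0.5, rng) for _ in range(10_000)]
print(all(gain == 1 for gain, _ in results))   # True: the gambler always nets 1
print(max(bet for _, bet in results))          # ...but the stakes can grow very large
```

The net gain is always 2^k minus the accumulated losses 2^k − 1, i.e. exactly 1; the catch, as the overview notes, is the unbounded capital required along the way.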
1 Martingales
Let (Ω, F, P ) be a probability space.
Example 4 (R-N derivatives). Let (Ω, F, P) be a probability space. Let {F_n}_{n=1}^∞ be a
filtration. Let ν be a finite measure on (Ω, F) such that for every n, ν has a density X_n with
respect to P when both are restricted to (Ω, F_n). Then {X_n}_{n=1}^∞ is adapted to the filtration.
To see that we have a martingale, we need to show that for every n and A ∈ Fn
    ∫_A X_{n+1}(ω) dP(ω) = ∫_A X_n(ω) dP(ω).    (1)
Since Fn ⊆ Fn+1 , each A ∈ Fn is also in Fn+1 . Hence both sides of Equation (1) equal ν(A).
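To make Example 4 concrete, here is a small numerical sketch under assumed specifics: Ω = {0, . . . , 2^N − 1} with P uniform, F_n generated by the first n binary digits, ν given by arbitrary random weights, and X_n equal to ν(cell)/P(cell) on each F_n cell. The check at the end verifies Equation (1) in the equivalent form E(X_{n+1}|F_n) = X_n.

```python
import numpy as np

N = 10
P = np.full(2 ** N, 1.0 / 2 ** N)              # P uniform on Omega = {0, ..., 2^N - 1}
nu = np.random.default_rng(1).random(2 ** N)   # nu: an arbitrary finite measure (weights)

def X(n):
    """Density of nu w.r.t. P restricted to F_n, the sigma-field generated by the
    first n binary digits: constant on each of the 2^n dyadic cells."""
    ratio = nu.reshape(2 ** n, -1).sum(axis=1) / P.reshape(2 ** n, -1).sum(axis=1)
    return np.repeat(ratio, 2 ** (N - n))      # one value per outcome

def cond_mean(values, n):
    """E(values | F_n) under the uniform P: the plain average over each F_n cell."""
    return np.repeat(values.reshape(2 ** n, -1).mean(axis=1), 2 ** (N - n))

# Martingale property: E(X_{n+1} | F_n) = X_n for every n.
for n in range(N):
    assert np.allclose(cond_mean(X(n + 1), n), X(n))
print("E(X_{n+1} | F_n) = X_n verified")
```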
Example 6 (Likelihood ratio – general case). Let (Ω, F, P) be a probability space. Let
{Y_n}_{n=1}^∞ be a sequence of random variables and F_n = σ(Y_1, . . . , Y_n). Suppose that, for each
n, μ_{Y_1,...,Y_n} has a strictly positive density p_n with respect to Lebesgue measure λ_n. Let Q be
another probability on (Ω, F) such that Q((Y_1, . . . , Y_n)^{−1}(·)) has a density q_n with respect to
λ_n for each n. Define

    X_n = q_n(Y_1, . . . , Y_n) / p_n(Y_1, . . . , Y_n).

It is easy to check that {(X_n, F_n)}_{n=1}^∞ is a martingale.
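As a sanity check on Example 6, a simulation sketch under assumed models: p is the N(0, 1) density and q is the N(1, 1) density, so q(y)/p(y) = exp(y − 1/2). A martingale must have constant mean, and here E(X_n) = 1 for every n:

```python
import numpy as np

rng = np.random.default_rng(2)
reps, n_max = 100_000, 5
Y = rng.standard_normal((reps, n_max))   # data generated from p: Y_j iid N(0, 1)

# Likelihood ratio of q = N(1, 1) against p = N(0, 1): q(y)/p(y) = exp(y - 1/2),
# so X_n = exp(sum_{j<=n} (Y_j - 1/2)).
X = np.exp(np.cumsum(Y - 0.5, axis=1))
print(X.mean(axis=0))   # approximately 1 for every n (noisier as n grows)
```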
Example 8 (Lévy martingale). Let {F_n}_{n=1}^∞ be a filtration and let X be a random variable
with finite mean. Define X_n = E(X|F_n). By the law of total probability (the tower property of
conditional expectation) we have a martingale. Such a martingale is sometimes called a Lévy martingale.
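The verification is one line; here is a sketch of the computation, using F_n ⊆ F_{n+1}:

```latex
\begin{align*}
E(X_{n+1}\mid \mathcal{F}_n)
  &= E\bigl(E(X \mid \mathcal{F}_{n+1}) \mid \mathcal{F}_n\bigr)
     && \text{definition of } X_{n+1} \\
  &= E(X \mid \mathcal{F}_n)
     && \text{tower property, since } \mathcal{F}_n \subseteq \mathcal{F}_{n+1} \\
  &= X_n .
\end{align*}
```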
Let F_0 be the trivial σ-field. (We could let Y_0 be a random variable and let F_0 = σ(Y_0), but
then we would also have to expand F_n to σ(Y_0, . . . , Y_n).) Suppose that the gambler devises
a system for determining how much W_n ≥ 0 to bet on the nth play. We assume that W_n
is F_{n−1} measurable for each n. This forces the gambler to choose the amount to bet before
knowing what will happen. Now, define Z_n = Y_0 + Σ_{j=1}^n W_j Y_j. Since

    E(Z_{n+1}|F_n) = Z_n + E(W_{n+1} Y_{n+1}|F_n) = Z_n + W_{n+1} E(Y_{n+1}|F_n)

and W_{n+1} ≥ 0, we have that E(Z_{n+1}|F_n) is ≥, =, or ≤ Z_n depending on whether
E(Y_{n+1}|F_n) is ≥, =, or ≤ 0, respectively. That is, {(Z_n, F_n)}_{n=1}^∞ is a submartingale, a
martingale, or a supermartingale according as the unit-bet process {(Y_0 + Σ_{j=1}^n Y_j, F_n)}_{n=1}^∞
is a submartingale, a martingale, or a supermartingale. This result is often described by saying
that gambling systems cannot change whether a game is favorable, fair, or unfavorable to a gambler.
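A simulation sketch of such a system for a fair game (the capped doubling rule for W_n is a hypothetical choice; any rule that looks only at past outcomes is previsible):

```python
import numpy as np

rng = np.random.default_rng(3)
reps, n_max = 200_000, 20
Y = rng.choice([-1.0, 1.0], size=(reps, n_max))     # fair game: E(Y_j) = 0

# A previsible system: W_n may depend only on outcomes before play n.
# Hypothetical rule: double the bet after a loss (capped at 8), else bet 1.
W = np.ones((reps, n_max))
for n in range(1, n_max):
    W[:, n] = np.where(Y[:, n - 1] < 0, np.minimum(2 * W[:, n - 1], 8.0), 1.0)

Z = np.cumsum(W * Y, axis=1)         # Z_n = sum_{j<=n} W_j Y_j   (with Y_0 = 0)
print(np.abs(Z.mean(axis=0)).max())  # approximately 0: the game stays fair
```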
Theorem 12 (Doob decomposition). A submartingale {(X_n, F_n)}_{n=1}^∞ can be written uniquely
as X_n = Z_n + A_n, where {(Z_n, F_n)}_{n=1}^∞ is a martingale and {A_n}_{n=1}^∞ is a nondecreasing
previsible process with A_1 = 0.

Proof: Define A_1 = 0 and A_n = Σ_{k=2}^n [E(X_k|F_{k−1}) − X_{k−1}] for n > 1. Also, define
Z_n = X_n − A_n. Because E(X_k|F_{k−1}) ≥ X_{k−1} for all k > 1, we have A_n ≥ A_{n−1} for all
n > 1, so {A_n}_{n=1}^∞ is nondecreasing. Also, E(X_k|F_{k−1}) is F_{n−1}/B¹-measurable for all
1 < k ≤ n, so {A_n}_{n=1}^∞ is previsible. Finally, notice that

    E(Z_n|F_{n−1}) = E(X_n|F_{n−1}) − A_n = X_{n−1} + (A_n − A_{n−1}) − A_n = Z_{n−1},

so Z_n is a martingale.
For uniqueness, suppose that X_n = Y_n + W_n is another decomposition such that Y_n is a
martingale and W_n is previsible with W_1 = 0. Then write

    A_n = Σ_{k=2}^n [E(X_k|F_{k−1}) − X_{k−1}]
        = Σ_{k=2}^n [E(Y_k + W_k|F_{k−1}) − X_{k−1}]
        = Σ_{k=2}^n (Y_{k−1} + W_k − X_{k−1})
        = Σ_{k=2}^n (W_k − W_{k−1}) = W_n,

where the last sum telescopes because W_1 = 0. Hence W_n = A_n and Y_n = Z_n for all n.
The previsible process in Theorem 12 is called the compensator for the submartingale.
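A concrete sketch: for the submartingale X_n = S_n² built from iid mean-zero steps with variance σ² (an assumed example, not from the text), E(X_k|F_{k−1}) = S_{k−1}² + σ², so with the convention A_1 = 0 the compensator is A_n = (n − 1)σ²:

```python
import numpy as np

rng = np.random.default_rng(4)
reps, n_max, sigma = 200_000, 20, 1.0
Y = rng.choice([-sigma, sigma], size=(reps, n_max))
S = np.cumsum(Y, axis=1)                 # martingale S_n
X = S ** 2                               # submartingale X_n = S_n^2

# E(X_k | F_{k-1}) = S_{k-1}^2 + sigma^2, so A_n = (n - 1) * sigma^2 with A_1 = 0.
n = np.arange(1, n_max + 1)
A = (n - 1) * sigma ** 2
Z = X - A                                # the martingale part of the decomposition
print(Z.mean(axis=0))                    # approximately sigma^2 for every n
```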
2 Stopping Times
Let (Ω, F, P) be a probability space, and let {F_n}_{n=1}^∞ be a filtration. Recall that a random
variable τ taking values in {1, 2, . . .} ∪ {∞} is a stopping time if {τ = n} ∈ F_n for every
finite n,¹ and that F_τ = {A ∈ F : A ∩ {τ ≤ k} ∈ F_k for all finite k}, the σ-field of events
determined by time τ.
If {X_n}_{n=1}^∞ is adapted to the filtration and if τ < ∞ a.s., then X_τ is defined as X_{τ(ω)}(ω).
(Define X_τ equal to some arbitrary random variable X_∞ for τ = ∞.)
¹If your filtration starts at n = 0, you can allow stopping times to be nonnegative valued. Indeed, if your
filtration starts at an arbitrary integer k, then a stopping time can take any value from k on up. There is a
trivial extension of every filtration to one lower subscript. For example, if we start at n = 1, we can extend
to n = 0 by defining F_0 = {Ω, ∅}. Every martingale can also be extended by defining X_0 = E(X_1). For the
rest of the course, we will assume that the lowest possible value for a stopping time is 1.
Example 14. Let {X_n}_{n=1}^∞ be adapted to the filtration and let τ = k_0, a constant. Then
{τ = n} is either Ω or ∅ and it is in every F_n, so τ is a stopping time. Also, for every A ∈ F,

    A ∩ {τ ≤ k} = A if k_0 ≤ k, and A ∩ {τ ≤ k} = ∅ if k_0 > k.

So A ∩ {τ ≤ k} ∈ F_k for all finite k if and only if A ∈ F_{k_0}. So F_τ = F_{k_0}.
Example 15 (First passage). Let {X_n}_{n=1}^∞ be adapted to the filtration. Let B be a Borel
set and let τ = inf{n : X_n ∈ B}. As usual, inf ∅ = ∞. For each finite n,

    {τ = n} = {X_n ∈ B} ∩ ⋂_{k<n} {X_k ∈ B^C} ∈ F_n,

so τ is a stopping time.
For a general stopping time τ with τ < ∞ a.s. and A ∈ B¹, we have {X_τ ∈ A} ∩ {τ ≤ k} =
⋃_{n≤k} ({X_n ∈ A} ∩ {τ = n}) ∈ F_k for every finite k. This proves that X_τ is F_τ measurable.
Suppose that τ_1 and τ_2 are two stopping times such that τ_1 ≤ τ_2. Let A ∈ F_{τ_1}. Since
A ∩ {τ_2 ≤ k} = A ∩ {τ_1 ≤ k} ∩ {τ_2 ≤ k} for every such A (because {τ_2 ≤ k} ⊆ {τ_1 ≤ k}),
it follows that A ∩ {τ_2 ≤ k} ∈ F_k and A ∈ F_{τ_2}. Hence F_{τ_1} ⊆ F_{τ_2}. As an example, let τ be
an arbitrary stopping time (not necessarily finite a.s.) and define τ_k = min{k, τ} for finite
k. Then τ_k is a finite stopping time with τ_k ≤ τ. Hence X_{τ_k} is F_{τ_k} measurable for each k
and so X_{τ_k} is F_τ measurable. Similarly, τ_k ≤ k so that F_{τ_k} ⊆ F_k and X_{τ_k} is F_k measurable.
Example 16 (Gambler’s ruin). The gambler in Example 9 can try to build a stopping
time into a gambling system. For example, let τ = min{n : Zn ≥ Y0 + x} for some integer
x > 0. This would seem to guarantee winning at least x. There are two possible drawbacks.
One is that there may be positive probability that τ = ∞. Even if τ < ∞ a.s., it might require
unlimited resources to guarantee that we can survive until τ . For example, let Y0 = 0 and
let Y_n have equal probability of being 1 or −1 for all n. So, we stop as soon as we have won x
more than we have lost. If we modify the problem so that we have only finite resources (say
k units) then this becomes the classic gambler’s ruin problem. The probability of achieving
Z_n = x before Z_n = −k is k/(k + x), which goes to 1 as k → ∞. So, if we have unlimited
resources, the probability is 1 that τ < ∞; otherwise, we may never achieve the goal. If the
probability of winning on each game is less than 1/2, then P(τ = ∞) > 0.
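A simulation sketch of this ruin probability (the trial count and seed are arbitrary):

```python
import random

def prob_win_before_ruin(k, x, trials=100_000, seed=5):
    """Estimate P(reach +x before -k) for a symmetric +/-1 random walk from 0."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        z = 0
        while -k < z < x:
            z += 1 if rng.random() < 0.5 else -1
        wins += (z == x)
    return wins / trials

print(prob_win_before_ruin(9, 1))   # theory: k/(k + x) = 9/10
```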
Suppose that we start with a martingale {(X_n, F_n)}_{n=1}^∞ and a stopping time τ. We can
define

    X*_n = X_n if n ≤ τ, and X*_n = X_τ if n > τ; that is, X*_n = X_{min{τ,n}}.

We can call this the stopped martingale. It turns out that {X*_n}_{n=1}^∞ is also a martingale
relative to the filtration. First, note that X_{min{τ,n}} is F_n measurable. Next, notice that

    E(|X*_n|) = Σ_{k=1}^{n−1} ∫_{{τ=k}} |X_k| dP + ∫_{{τ≥n}} |X_n| dP ≤ Σ_{k=1}^n E(|X_k|) < ∞.
3 Optional Sampling
Let {(X_n, F_n)}_{n=1}^∞ be a martingale. Consider a sequence of a.s. finite stopping times {τ_n}_{n=1}^∞
such that 1 ≤ τ_j ≤ τ_{j+1} for all j. Then we can construct {(X_{τ_n}, F_{τ_n})}_{n=1}^∞ and ask whether or
not it is a martingale. In general, an unpleasant integrability condition is needed to prove
this. We shall do a simplified case.
Theorem 18 (Optional sampling theorem). Let {(X_n, F_n)}_{n=1}^∞ be a (sub)martingale.
Suppose that for each n, there is a finite constant M_n such that τ_n ≤ M_n a.s. Then
{(X_{τ_n}, F_{τ_n})}_{n=1}^∞ is a (sub)martingale.
The unpleasant integrability condition that can replace P (τn ≤ Mn ) = 1 is the following:
For every n,
• P (τn < ∞) = 1,
• E(|Xτn |) < ∞, and
• lim inf m→∞ E(|Xm |I(m,∞) (τn )) = 0.
Proof: [Theorem 18] Without loss of generality, assume that M_n ≤ M_{n+1} for every n.
Since τ_n ≤ M_n,

    E(|X_{τ_n}|) = Σ_{k=1}^{M_n} ∫_{{τ_n=k}} |X_k| dP ≤ Σ_{k=1}^{M_n} E(|X_k|) < ∞.
We already know that X_{τ_n} is F_{τ_n} measurable. Let A ∈ F_{τ_n}. We need to show that
∫_A X_{τ_{n+1}} dP (≥) = ∫_A X_{τ_n} dP. Write

    ∫_A [X_{τ_{n+1}} − X_{τ_n}] dP = ∫_{A∩{τ_{n+1}>τ_n}} [X_{τ_{n+1}} − X_{τ_n}] dP.
Since A ∈ F_{τ_n} and {τ_n < k ≤ τ_{n+1}} = {τ_n ≤ k − 1} ∩ {τ_{n+1} ≤ k − 1}^C, it follows that

    B_k = A ∩ {τ_n < k ≤ τ_{n+1}} ∈ F_{k−1},

for each k. So

    ∫_A [X_{τ_{n+1}} − X_{τ_n}] dP = Σ_{k=2}^{M_{n+1}} ∫_{B_k} (X_k − X_{k−1}) dP
                                   (≥) = Σ_{k=2}^{M_{n+1}} ∫_{B_k} [X_k − E(X_k|F_{k−1})] dP = 0.
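A numerical sketch of Theorem 18, with two bounded stopping times for a simple random walk (the hitting levels and caps are arbitrary choices); in the martingale case, E(X_{τ_1}) = E(X_{τ_2}) = E(X_1):

```python
import numpy as np

rng = np.random.default_rng(7)
reps, M = 100_000, 40
X = np.cumsum(rng.choice([-1, 1], size=(reps, M)), axis=1)   # martingale, E(X_n) = 0

def bounded_hit(level, cap):
    """First time (0-based index) the walk reaches `level`, capped at `cap`."""
    hits = X >= level
    first = np.where(hits.any(axis=1), np.argmax(hits, axis=1), cap)
    return np.minimum(first, cap)

tau1 = bounded_hit(2, 19)   # tau_1 <= M_1 = 20 plays
tau2 = bounded_hit(4, 39)   # tau_2 <= M_2 = 40 plays, and tau_1 <= tau_2 here
rows = np.arange(reps)
print(X[rows, tau1].mean(), X[rows, tau2].mean())   # both approximately 0
```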
4 Martingale Convergence
The upcrossing lemma says that, in expectation, a submartingale cannot cross a fixed
nondegenerate interval very many times. If the submartingale were to cross an interval infinitely
often, then its lim sup and lim inf would have to be different and it couldn't converge.
Lemma 19 (Upcrossing lemma). Let {(X_k, F_k)}_{k=1}^n be a submartingale. Let r < q, and
define V to be the number of times that the sequence X_1, . . . , X_n crosses from below r to
above q. Then

    E(V) ≤ (E(|X_n|) + |r|) / (q − r).    (2)
We will only give an outline of the proof of Lemma 19. Let Y_k = max{0, X_k − r}. Then
V is the number of times that Y_k moves from 0 to above q − r, and {(Y_k, F_k)}_{k=1}^n is a
submartingale. It is easy to see that V is at most the sum of the upcrossing increments
divided by q − r. That is,

    V ≤ (1/(q − r)) Σ_{k=2}^n (Y_k − Y_{k−1}) I_{E_k},

where E_k is the event that the path is crossing up at time k. Notice that E_k ∈ F_{k−1} for all
k. Hence, for each k ≥ 2,

    E([Y_k − Y_{k−1}] I_{E_k}) = ∫_{E_k} (Y_k − Y_{k−1}) dP = ∫_{E_k} [E(Y_k|F_{k−1}) − Y_{k−1}] dP.

Because E(Y_k|F_{k−1}) − Y_{k−1} ≥ 0 a.s. by the submartingale property, we can expand the
integral from E_k to all of Ω to get

    E([Y_k − Y_{k−1}] I_{E_k}) ≤ ∫_Ω [E(Y_k|F_{k−1}) − Y_{k−1}] dP = E(Y_k − Y_{k−1}).

Summing over k and telescoping gives (q − r)E(V) ≤ E(Y_n) − E(Y_1) ≤ E(Y_n) ≤ E(|X_n|) + |r|,
which is Equation (2).
Theorem 20 (Martingale convergence theorem). Let {(X_n, F_n)}_{n=1}^∞ be a submartingale
such that sup_n E(|X_n|) < ∞. Then X = lim_{n→∞} X_n exists a.s. and is integrable.

Proof: Let X* = lim sup_{n→∞} X_n and X_* = lim inf_{n→∞} X_n. Let B = {ω : X_*(ω) < X*(ω)}.
We will prove that P(B) = 0. We can write

    B = ⋃_{r<q, r,q rational} {ω : X*(ω) > q > r > X_*(ω)}.

Now, X*(ω) > q > r > X_*(ω) if and only if the values of X_n(ω) cross from being below r to
being above q infinitely often. For fixed r and q, we now prove that this has probability 0;
hence P(B) = 0. Let V_n equal the number of times that X_1, . . . , X_n cross from below r to
above q. According to Lemma 19,

    sup_n E(V_n) ≤ (sup_n E(|X_n|) + |r|) / (q − r) < ∞.

The number of times the values of {X_n(ω)}_{n=1}^∞ cross from below r to above q equals
lim_{n→∞} V_n(ω). By the monotone convergence theorem, E(lim_{n→∞} V_n) = lim_{n→∞} E(V_n) < ∞,
so lim_{n→∞} V_n < ∞ a.s., and the probability of infinitely many crossings is 0, as claimed.
Finally, Fatou's lemma gives E(|X|) ≤ lim inf_{n→∞} E(|X_n|) < ∞, so the limit is integrable.
Example 21 (Random walk). For the random walk martingale of Example 3, if the Y_n's
are iid with finite variance σ², then X_n/√n converges in distribution, so X_n can't converge
a.s. To check how the condition of Theorem 20 is violated, the Markov inequality says that

    E(|X_n|)/(c√n) ≥ P(|X_n| > c√n) → 2[1 − Φ(c/σ)],

for each positive c. So, eventually E(|X_n|) ≥ c√n[1 − Φ(c/σ)] and lim_{n→∞} E(|X_n|) = ∞.
However, if Σ_{n=1}^∞ Var(Y_n) < ∞, then the condition of Theorem 20 holds. Indeed, the Basic
L² Convergence Theorem already told us that the sum converges a.s.
For a Lévy martingale X_n = E(X|F_n), conditional Jensen gives E(|X_n|) ≤ E(|X|) < ∞
for all n, so the martingale converges. In Theorem 24, we can say even more about the limit.
We need the following result of uniform integrability of Lévy martingales before we can
identify the limit of a Lévy martingale. Recall that uniform integrability allows the exchange
of limit and integral under finite measure (Theorem 20 of Lecture Notes Set 3).
Lemma 23. Let {F_n}_{n=1}^∞ be a sequence of σ-fields. Let E(|X|) < ∞. Define X_n = E(X|F_n).
Then {X_n}_{n=1}^∞ is a uniformly integrable sequence.

Proof: Since E(X|F_n) = E(X⁺|F_n) − E(X⁻|F_n), and the sum of uniformly integrable
sequences is uniformly integrable, we will prove the result for nonnegative X. Let A_{c,n} =
{X_n ≥ c} ∈ F_n. So ∫_{A_{c,n}} X_n(ω) dP(ω) = ∫_{A_{c,n}} X(ω) dP(ω). If we can find, for every ε > 0,
a C such that ∫_{A_{c,n}} X(ω) dP(ω) < ε for all n and all c ≥ C, we are done. This is achieved
using absolute continuity, and the details are a homework problem.
Theorem 24 (Lévy's theorem). Let {F_n}_{n=1}^∞ be an increasing sequence of σ-fields. Let
F_∞ be the smallest σ-field containing all of the F_n's. Let E(|X|) < ∞. Define X_n = E(X|F_n)
and X_∞ = E(X|F_∞). Then lim_{n→∞} X_n = X_∞, a.s.
Proof: By Lemma 23, {X_n}_{n=1}^∞ is a uniformly integrable sequence. Let Y be the limit of
the martingale guaranteed by Theorem 20. Since Y is a limit of functions of the X_n, it is
measurable with respect to F_∞. It follows from uniform integrability that for every event A,
lim_{n→∞} E(X_n I_A) = E(Y I_A). Next, note that, for every m and A ∈ F_m,

    ∫_A Y dP = lim_{n→∞} ∫_A E(X|F_n) dP = lim_{n→∞} ∫_A X_n dP = ∫_A X dP,

where the last equality follows from the fact that A ∈ F_n for all n ≥ m, so ∫_A X_n dP = ∫_A X dP
because X_n = E(X|F_n). Since ∫_A Y dP = ∫_A X dP for all A ∈ F_m for all m, it holds for all
A in the field F = ⋃_{n=1}^∞ F_n. Since |X| is integrable and F is a field, we can conclude
that the equality holds for all A ∈ F_∞, the smallest σ-field containing F. The equality
E(X I_A) = E(Y I_A) for all A ∈ F_∞ together with the fact that Y is F_∞ measurable is
precisely what it means to say that Y = E(X|F_∞) = X_∞.
In Section 6 we shall see an important example where F_∞ ≠ F. Before this, we shall first
introduce reversed martingales.
5 Reversed Martingales
Definition 26 (Reversed martingales). For n = −1, −2, . . ., let F_{n−1} ⊆ F_n be sub-σ-fields,
suppose that X_n is F_n measurable, E(|X_n|) < ∞, and E(X_n|F_{n−1}) = X_{n−1}. Then
{(X_n, F_n)}_{n=−1}^{−∞} is a reversed martingale.
10
An equivalent way to think about reversed martingales is through a decreasing sequence of
σ-fields {F_n}_{n=1}^∞ such that F_{n+1} ⊆ F_n for n ≥ 1. The proofs of the next two theorems are
similar to the corresponding theorems for forward martingales.
Theorem 27 (Reversed martingale convergence theorem). If {(X_n, F_n)}_{n=−1}^{−∞} is a
reversed martingale, then X = lim_{n→−∞} X_n exists a.s. and E(X) = E(X_{−1}).

Proof: Just as in the proof of Theorem 20, we let V_n be the number of times that the finite
sequence X_n, X_{n+1}, . . . , X_{−1} crosses from below a rational r to above another rational q (for
n < 0). Lemma 19 says that

    E(V_n) ≤ (E(|X_{−1}|) + |r|) / (q − r) < ∞.

As in the proof of Theorem 20, it follows that X = lim_{n→−∞} X_n exists with probability 1.
Since X_n = E(X_{−1}|F_n) for each n < −1, Lemma 23 says that {X_n} is uniformly integrable,
and hence

    E(X) = lim_{n→−∞} E(X_n) = E(X_{−1}).
Notice that reversed martingales are all of the Lévy type. Not surprisingly, there is a version
of Lévy's theorem (Theorem 24) for reversed martingales. We state it in terms of decreasing σ-fields.
Theorem 28 (Lévy's theorem for reversed martingales). Let {F_n}_{n=1}^∞ be a decreasing
sequence of σ-fields. Let F_∞ = ⋂_{n=1}^∞ F_n. Let E(|X|) < ∞. Define X_n = E(X|F_n) and
X_∞ = E(X|F_∞). Then lim_{n→∞} X_n = X_∞ a.s.
For the key step, let Y = lim_{n→∞} X_n and let A ∈ F_∞. For every n, ∫_A X_n dP = ∫_A X_1 dP,
since A ∈ F_n and X_n = E(X_1|F_n). Once again, using Lemma 23, it follows that

    lim_{n→∞} ∫_A X_n(ω) dP(ω) = ∫_A Y(ω) dP(ω) = ∫_A X_1(ω) dP(ω),

so that Y = E(X_1|F_∞) = E(X|F_∞) = X_∞.
Theorem 28 allows us to prove a strong law of large numbers that is even more general than
the usual version. The greater generality comes from the fact that it applies to sequences
that are not necessarily independent.
6 Exchangeability and de Finetti's Theorem
A sequence of random quantities {X_n}_{n=1}^∞ is exchangeable if, for every n and all distinct
j_1, . . . , j_n, the joint distribution of (X_{j_1}, . . . , X_{j_n}) is the same as the joint distribution of
(X_1, . . . , X_n).
Example 29 (Conditionally iid random quantities). Let {X_n}_{n=1}^∞ be conditionally iid
given a σ-field C. Then {X_n}_{n=1}^∞ is an exchangeable sequence. The result follows easily from
the fact that

    μ_{X_{j_1},...,X_{j_n}|C} = μ_{X_1,...,X_n|C}, a.s.
Example 30. Let {X_n}_{n=1}^∞ be binary random variables with

    P(X_1 = x_1, . . . , X_n = x_n) = 1 / [(n + 1)(n choose y)],

where y = Σ_{j=1}^n x_j. One can show that this specifies consistent joint distributions. One can
also see that the sequence is exchangeable, since the joint probability depends on (x_1, . . . , x_n)
only through y.
Theorem 31 (SLLN for exchangeable sequences). Let {X_n}_{n=1}^∞ be an exchangeable
sequence with E(|X_1|) < ∞. Then (1/n) Σ_{j=1}^n X_j converges a.s. to E(X_1|F_∞), with F_∞
as defined in the proof below.

Proof: Define Y_n = (1/n) Σ_{j=1}^n X_j and let F_n be the σ-field generated by all functions of
(X_1, X_2, . . .) that are invariant under permutations of the first n coordinates. (For example,
Y_n is such a function.) Let Z_n = E(X_1|F_n). Theorem 28 says that Z_n converges a.s. to
E(X_1|F_∞), where F_∞ = ⋂_{n=1}^∞ F_n. We prove next that Z_n = Y_n, a.s. Since Y_n is F_n
measurable, we need only prove that, for all A ∈ F_n, E(I_A Y_n) = E(I_A X_1). Notice that I_A
can be written as h(X_1, X_2, . . .), a function of X_1, X_2, . . . that is invariant under
permutations of the first n coordinates. By exchangeability, for all j = 1, . . . , n, X_1 h(X_1, X_2, . . .) has
the same distribution as X_j h(X_j, X_2, . . . , X_{j−1}, X_1, X_{j+1}, . . .). But h is invariant under
permutations of its first n arguments, so X_j h(X_j, X_2, . . . , X_{j−1}, X_1, X_{j+1}, . . .) = X_j I_A.
Hence E(X_1 I_A) = E(X_j I_A) for each j ≤ n, and averaging over j gives E(I_A X_1) = E(I_A Y_n),
as required.
Clearly E(X_1|F_∞) has mean E(X_1). If the X_n's are independent, then the limit, being
measurable with respect to the tail σ-field, must be constant a.s., by the Kolmogorov 0-1 law.
The constant must equal the mean of the random variable, which is E(X_1).
Example 32. In Example 30, we know that Yn converges a.s., hence it converges in dis-
tribution. We can compute the distribution of Yn exactly: P (Yn = k/n) = 1/(n + 1) for
k = 0, . . . , n. Hence, Yn converges in distribution to uniform on the interval [0, 1], which
must be the distribution of the limit. The limit is not a.s. constant.
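A simulation sketch of Examples 30 and 32, using the standard fact (assumed here, and consistent with de Finetti's theorem below) that the joint probabilities of Example 30 are those of θ ~ uniform[0, 1] followed by conditionally iid Bernoulli(θ) draws:

```python
import numpy as np

rng = np.random.default_rng(8)
reps, n = 100_000, 400
theta = rng.random(reps)                        # theta ~ uniform on [0, 1]
X = rng.random((reps, n)) < theta[:, None]      # given theta, X_i iid Bernoulli(theta)
Y = X.mean(axis=1)                              # Y_n: relative frequency of 1's

print(np.abs(Y - theta).mean())                 # small: Y_n is close to theta
print(np.histogram(Y, bins=4, range=(0, 1))[0] / reps)  # ~[0.25, 0.25, 0.25, 0.25]
```

The histogram shows Y_n approximately uniform on [0, 1], matching the limit distribution computed in Example 32.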
There is a very useful theorem due to de Finetti about exchangeable random quantities that
relies upon the strong law of large numbers. To state the theorem, we need to recall the
concept of "random probability measure" that was introduced in Example 40 of Lecture
Notes Set 4. Let (X, B) be a Borel space, and let P be the set of all probability measures on
(X, B). We can think of P as a subset of the function space [0, 1]^B, hence it has a product
σ-field. Recall that the product σ-field is the smallest σ-field such that for all B ∈ B, the
function f_B : P → [0, 1] defined by f_B(Q) = Q(B) is measurable. These are the coordinate
projection functions.
Example 33 (Empirical probability measure). Let X_1, . . . , X_n be random quantities
taking values in X. For each B ∈ B, define P_n(ω)(B) = (1/n) Σ_{j=1}^n I_B(X_j(ω)). For each
ω, P_n(ω) is a probability measure on (X, B), and for each B ∈ B, P_n(·)(B) is a random
variable; hence P_n is a random probability measure, called the empirical probability measure
of X_1, . . . , X_n.
De Finetti's theorem says that a sequence of random quantities is exchangeable if and only
if it is conditionally iid given a random probability measure, and that random probability
measure is the limit of the empirical probability measures of X_1, . . . , X_n. That is, Example 29
is essentially the only example of an exchangeable sequence. A simple proof can be found in
Kingman (1978, Annals of Probability, Vol. 6, 183–197). A photocopy of the pages is
attached at the end of this note.
Theorem 34 (De Finetti's theorem). A sequence {X_n}_{n=1}^∞ of random quantities is
exchangeable if and only if P_n (the empirical probability measure of X_1, . . . , X_n) converges a.s.
to a random probability measure P and the X_n's are conditionally iid with distribution Q
given P = Q.
Example 35. In Example 30, the empirical probability measure is equivalent to Y_n =
Σ_{k=1}^n X_k/n, since Y_n is one minus the proportion of the observations less than or equal
to 0. So P is equivalent to the limit of Y_n, the limit of the relative frequency of 1's in the
sequence. Conditional on the limit of the relative frequency of 1's being x, the X_k's are iid
with Bernoulli distribution with parameter x.
A Upcrossing lemma
Proof: [Upcrossing lemma] Let Y_k = max{0, X_k − r} for every k so that {(Y_k, F_k)}_{k=1}^n is
a submartingale. Note that a consecutive set of X_k(ω) cross from below r to above q if and
only if the corresponding consecutive set of Y_k(ω) cross from 0 to above q − r. Let T_0(ω) = 0
and define T_m for m = 1, 2, . . . as

    T_m(ω) = min{k > T_{m−1}(ω) : Y_k(ω) = 0} for m odd,
    T_m(ω) = min{k > T_{m−1}(ω) : Y_k(ω) > q − r} for m even,

with min ∅ = ∞. Now V(ω) is one-half of the largest even m such that T_m(ω) ≤ n. Define, for k = 1, . . . , n,

    R_k(ω) = 1 if T_m(ω) < k ≤ T_{m+1}(ω) for some odd m, and R_k(ω) = 0 otherwise.

Then (q − r)V(ω) ≤ Σ_{k=1}^n R_k(ω)[Y_k(ω) − Y_{k−1}(ω)] = X̂(ω), where Y_0 ≡ 0 for convenience. First,
note that for all m and k, {T_m(ω) ≤ k} ∈ F_k. Next, note that for every k,

    {ω : R_k(ω) = 1} = ⋃_{m odd} ({T_m ≤ k − 1} ∩ {T_{m+1} ≤ k − 1}^C) ∈ F_{k−1}.    (3)
    E(X̂) = Σ_{k=1}^n ∫_{{ω:R_k(ω)=1}} [Y_k(ω) − Y_{k−1}(ω)] dP(ω)
          = Σ_{k=1}^n ∫_{{ω:R_k(ω)=1}} [E(Y_k|F_{k−1})(ω) − Y_{k−1}(ω)] dP(ω)
          ≤ Σ_{k=1}^n ∫_Ω [E(Y_k|F_{k−1})(ω) − Y_{k−1}(ω)] dP(ω)
          = Σ_{k=1}^n [E(Y_k) − E(Y_{k−1})] = E(Y_n),
where the second equality follows from Equation (3) and the inequality follows from the
fact that {(Y_k, F_k)}_{k=1}^n is a submartingale. It follows that (q − r)E(V) ≤ E(Y_n). Since
E(Y_n) ≤ |r| + E(|X_n|), it follows that Equation (2) holds.
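A simulation sketch of Lemma 19 for a simple random walk (the interval [r, q] = [−1, 1], horizon, and counting convention are arbitrary choices; the walk, being a martingale, is in particular a submartingale):

```python
import numpy as np

def upcrossings(path, r, q):
    """Count completed crossings of `path` from below r to above q."""
    count, below = 0, False
    for x in path:
        if x < r:
            below = True
        elif x > q and below:
            count += 1
            below = False
    return count

rng = np.random.default_rng(9)
reps, n, r, q = 20_000, 200, -1.0, 1.0
X = np.cumsum(rng.choice([-1.0, 1.0], size=(reps, n)), axis=1)  # a martingale
V = np.array([upcrossings(row, r, q) for row in X])
bound = (np.abs(X[:, -1]).mean() + abs(r)) / (q - r)
print(V.mean(), "<=", bound)   # Lemma 19: E(V) <= (E|X_n| + |r|)/(q - r)
```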