
Brownian Motion and

Stochastic Calculus
Chapters 0 to 7

Spring Term 2013

Alain-Sol Sznitman
Table of Contents
0 Introduction 1

1 Brownian Motion: Definition and Construction 5

2 Brownian Motion and Markov Property 23

3 Some Properties of the Brownian Sample Path 45

4 Stochastic Integrals 53

5 Stochastic Integrals for Continuous Local Martingales 73

6 Ito’s formula and first applications 89

7 Stochastic differential equations and Martingale problems 107

References 137
Chapter 0: Introduction
The object of this course is to present Brownian motion, develop the infinitesimal calculus
attached to Brownian motion, and discuss various applications to diffusion processes.
The name “Brownian motion” comes from Robert Brown who, at the time director of the British botanical museum, observed in 1827 the disordered motion of “pollen grains
suspended in water performing a continual swarming motion”. Louis Bachelier in his
thesis in 1900 used Brownian motion as a model of the stock market, and Albert Einstein
considered it in 1905 when discussing the motion of small particles in suspension in a
fluid, under the influence of shocks due to thermal agitation of molecules in the fluid. The
mathematical theory of Brownian motion was then put on a firm basis by Norbert Wiener
in 1923.
There are several ways to mathematically construct Brownian motion. One can for
instance construct Brownian motion as the limit of rescaled polygonal interpolations of a
simple random walk, choosing time units of order n² and space units of order n:

[Figure: on the left, St, the polygonal interpolation of the simple random walk; on the right, Bt^(n), the rescaled trajectory.]

(0.1)  X1, . . . , Xn, . . . , are i.i.d. with P[Xi = 1] = P[Xi = −1] = 1/2,
       Sm = X1 + · · · + Xm, m ≥ 1, S0 = 0,
       St, t ≥ 0, is the polygonal interpolation of Sm, m ≥ 0, and
       Bt^(n) = (1/n) S_{tn²}, t ≥ 0, is the rescaled (in time and space) trajectory.
From the central limit theorem, one knows that B1^(n) converges in law to a N(0, 1)-distribution, that is:

P[B1^(n) ≤ a] −→ (1/√(2π)) ∫_{−∞}^a e^{−x²/2} dx, as n → ∞, for a ∈ R.

In fact, much more is true, and the law of B^(n), viewed as a random continuous trajectory,
converges in a suitable sense to the law of Brownian motion (this is a special case of the
so-called “invariance principle” of Donsker).
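The following minimal numerical sketch (not part of the original text; the step count n and the sample size are arbitrary, and the polygonal interpolation between grid points is omitted) illustrates the rescaling (0.1) and the convergence in law of B1^(n):

```python
import numpy as np

def rescaled_walk(n, rng):
    """Rescaled trajectory B^(n) of (0.1), sampled at the grid times k/n^2, k = 0,...,n^2."""
    steps = rng.choice([-1.0, 1.0], size=n**2)      # i.i.d. steps X_1, ..., X_{n^2}
    S = np.concatenate(([0.0], np.cumsum(steps)))   # S_0, S_1, ..., S_{n^2}
    return S / n                                    # B^(n)_{k/n^2} = S_k / n

rng = np.random.default_rng(0)
samples = [rescaled_walk(100, rng)[-1] for _ in range(1000)]  # 1000 copies of B^(n)_1
print(np.mean(samples), np.var(samples))            # close to 0 and 1, as the CLT predicts
```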

An important advantage of continuous models versus discrete models is the presence
of the whole apparatus of “infinitesimal calculus”. However, in the case of a typical
realization of Brownian motion, the trajectory t ≥ 0 → Bt (ω) ∈ R, is continuous, but
very rough (in particular nowhere differentiable, and of infinite variation on any proper
interval).
The basic formula of calculus:

(0.2)  (d/dt) f(b(t)) = f′(b(t)) b′(t), for f and b two C¹-functions,

can still be given a meaning when b is continuous of finite variation, and f is C¹, namely:

(0.3)  f(b(t)) = f(b(0)) + ∫_0^t f′(b(s)) db(s), for t ≥ 0,

where db(s) stands for the Stieltjes measure on [0, ∞), such that ∫_{[0,a]} db(s) = b(a) − b(0), for 0 ≤ a < ∞.
However, this extension is of little help in the case of Brownian motion since t → Bt
is of infinite variation on any proper interval.
Nonetheless, we will develop an infinitesimal calculus based on a formula (Ito’s for-
mula), which brings into play an “extra term”:
(0.4)  f(Bt) = f(B0) + ∫_0^t f′(Bs) dBs + (1/2) ∫_0^t f″(Bs) ds, for f ∈ C²(R), t ≥ 0,

or in differential notation:

df(Bt) = f′(Bt) dBt + (1/2) f″(Bt) dt.

Of course, part of the work has to do with defining what is meant by “∫_0^t f′(Bs) dBs”,
since, as explained above, this expression has no meaning as a Stieltjes integral. This task
will correspond to the construction of stochastic integrals.
Once this infinitesimal calculus is at our disposal, we will be able to solve certain dif-
ferential equations with random perturbations, the so-called “stochastic differential equa-
tions” (SDEs):

(0.5)  dXt = b(Xt) dt + σ(Xt) dBt,

where the last term plays the role of the random perturbation.
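To give a concrete feel for what “solving” (0.5) amounts to numerically, here is a minimal Euler-Maruyama sketch (this scheme and the particular coefficients b, σ below are illustrative choices, not taken from the text):

```python
import numpy as np

def euler_maruyama(b, sigma, x0, T, n_steps, rng):
    """Crude time-discretization of dX_t = b(X_t) dt + sigma(X_t) dB_t on [0, T]."""
    dt = T / n_steps
    X = np.empty(n_steps + 1)
    X[0] = x0
    for k in range(n_steps):
        dB = rng.normal(0.0, np.sqrt(dt))            # Brownian increment over one time step
        X[k + 1] = X[k] + b(X[k]) * dt + sigma(X[k]) * dB
    return X

rng = np.random.default_rng(1)
# illustrative choice: mean-reverting drift b(x) = -x, constant diffusion sigma = 1
path = euler_maruyama(lambda x: -x, lambda x: 1.0, x0=0.0, T=1.0, n_steps=1000, rng=rng)
print(path[-1])
```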

There turns out to be a deep connection between solutions of such stochastic differential
equations and certain partial differential equations (PDEs).
For instance, when Bt = (Bt¹, . . . , Bt^d), where the B^i are independent real-valued Brownian motions, and D ⊆ R^d is a smooth bounded domain, e.g. a ball, one can consider the

Dirichlet problem: given f ∈ C(∂D), find u such that

(0.6)  (1/2) ∆u = 0 in D,  u|∂D = f,
or the

Poisson equation: for g ∈ C^α(D), find u such that

(0.7)  (1/2) ∆u = g in D,  u|∂D = 0.
The two problems have solutions, which can be expressed in terms of Brownian motion:

Setting for x ∈ D,

(0.8)  τx = inf{s ≥ 0; x + Bs ∈ ∂D},

one has

(0.9)  u_Dirichlet(x) = E[f(x + B_{τx})]

and

(0.10)  u_Poisson(x) = −E[∫_0^{τx} g(x + Bs) ds].
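As an illustration of (0.8), (0.9), here is a rough Monte Carlo sketch for the Dirichlet problem on the unit disc in R² (the boundary function, time step and sample size are arbitrary choices; exit times are only approximated on the discrete grid):

```python
import numpy as np

def dirichlet_mc(x, f, n_paths=1000, dt=1e-3, seed=2):
    """Estimate u(x) = E[f(x + B_{tau_x})], with D the unit disc, by simulating
    Brownian paths until they leave D, cf. (0.8), (0.9)."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_paths):
        pos = np.array(x, dtype=float)
        while np.linalg.norm(pos) < 1.0:
            pos += np.sqrt(dt) * rng.normal(size=2)    # one Brownian increment
        total += f(pos / np.linalg.norm(pos))           # evaluate f at the (approximate) exit point
    return total / n_paths

# f(y) = y_1 on the boundary; its harmonic extension is u(x) = x_1, so the estimate is close to 0.3
print(dirichlet_mc([0.3, 0.2], f=lambda y: y[0]))
```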

With stochastic differential equations, one is able to handle more general partial differential equations with (1/2)∆ replaced by:

(0.11)  L = (1/2) Σ_{i,j=1}^d (σ(x) ᵗσ(x))_{i,j} ∂_{i,j} + Σ_{i=1}^d b(x)_i ∂_i,

and, during this course, we will describe a number of applications of these ideas and
concepts.

Chapter 1: Brownian Motion: Definition and Construction
We will see that Brownian motion plays a prominent role as a canonical example of three
different notions:

- a continuous Gaussian process,


- a continuous Markov process,
- a continuous martingale.

In this chapter, we will mainly deal with the first of these three notions.
Definition 1.1. Let (Ω, A, P ) be a probability space. A d-dimensional Brownian motion
on (Ω, A, P ) is an Rd -valued stochastic process (i.e. for each t ≥ 0, Bt (·) is an Rd -valued
random variable defined on (Ω, A, P )), such that:
i) B0 = 0, P -a.s.,

ii) for any 0 = t0 < t1 < · · · < tn, Bt1 − Bt0, Bt2 − Bt1, . . . , Btn − Btn−1 are independent random variables (“independent increments”),

(1.1) iii) for t > 0, s ≥ 0, A ∈ B(R^d),

P[Bt+s − Bs ∈ A] = ∫_A (2πt)^{−d/2} e^{−|x|²/(2t)} dx  (“Bt+s − Bs is N(0, tI)-distributed”),

iv) P-a.s., t ≥ 0 → Bt(ω) ∈ R^d is continuous.

In the above definition (Ω, A, P ) is “arbitrary”. As we will see, there is a way to


construct a “canonical Brownian motion”, once we know that at least one Brownian motion
in the sense of the above definition exists.
We take as a model the “canonical” space
(1.2) C = C(R+ , Rd ) = {continuous functions R+ → Rd }.
On C we have the canonical coordinates:
(1.3) Xu : C → Rd , u ≥ 0, such that Xu (w) = w(u), for w ∈ C,
and the σ-algebra generated by these coordinates:
(1.4)  F = σ(Xu, u ≥ 0)  (i.e. the smallest σ-algebra on C for which all Xu, u ≥ 0, are measurable).
Lemma 1.2. (ψ: a map Ω → C)

(1.5)  ψ : (Ω, A) → (C, F) is measurable if and only if Xu ◦ ψ : (Ω, A) → (R^d, B(R^d)) is measurable for each u ≥ 0.

Proof. =⇒: immediate (the composition of two measurable maps is measurable).
⇐=: the collection S of B ⊆ C such that ψ −1 (B) ∈ A is a σ-algebra, which contains
Xu−1 (D) for D ∈ B(Rd ), and u ≥ 0. Hence, S contains F, the smallest σ-algebra for
which all Xu , u ≥ 0, are measurable. As a result for all F ∈ F, ψ −1 (F ) ∈ A, and ψ is
measurable.

We will later see that on a suitable (Ω, A, P ) we can construct a Brownian motion. For
such a Brownian motion we can pick by (1.1) iv), a negligible set N ∈ A (i.e. P (N ) = 0),
and define
B. : (Ω\N, A ∩ (Ω\N)) → (C, F)

(the notation A ∩ (Ω\N) means the collection of sets A ∩ (Ω\N), with A ∈ A).

The above map is measurable, indeed:

for u ≥ 0, Xu ◦ B.(ω) = Bu(ω) is measurable from (Ω\N, A ∩ (Ω\N)) to (R^d, B(R^d)),

and we can apply (1.5).


We can consider the image under B . of the probability measure P restricted to Ω\N
(i.e. P : A ∩ Ω\N → [0, 1]). We denote by W this image probability.
Proposition 1.3.

(1.6) The law W on (C, F) is uniquely determined,


(it is the so-called d -dimensional Wiener measure).

(1.7)  For 0 = t0 < t1 < · · · < tn and h ∈ bB((R^d)^{n+1}) (i.e. bounded measurable on (R^d)^{n+1}),

E^W[h(Xt0, Xt1, . . . , Xtn)] = ∫_{(R^d)^n} h(0, x1, . . . , xn) Π_{i=1}^n [2π(ti − ti−1)]^{−d/2} exp{−|xi − xi−1|²/(2(ti − ti−1))} dx1 . . . dxn.

(1.8) Xt (w), t ≥ 0, is a Brownian motion on (C, F, W ),


(it is the canonical d -dimensional Brownian motion).

Proof.
• (1.6): For h ∈ bB((R^d)^{n+1}) and 0 = t0 < t1 < · · · < tn, we have

(1.9)  a := E^W[h(Xt0, Xt1, . . . , Xtn)] = E^P[h(B0, Bt1, . . . , Btn)]
       = E^P[h(B0, B0 + (Bt1 − Bt0), . . . , B0 + (Bt1 − Bt0) + (Bt2 − Bt1) + · · · + (Btn − Btn−1))].

By (1.1), the Bti − Bti−1 , 1 ≤ i ≤ n, are independent, respectively N (0, (ti − ti−1 )I)-
distributed, and B0 = 0, P -a.s.. Hence we find
(1.10)  a = ∫_{(R^d)^n} h(0, y1, y1 + y2, . . . , y1 + · · · + yn) Π_{i=1}^n [2π(ti − ti−1)]^{−d/2} e^{−|yi|²/(2(ti − ti−1))} dy1 . . . dyn.

Picking h = 1D , where D ∈ B((Rd )n+1 ), we see that (1.9), (1.10) determine

W ({(X0 , Xt1 , . . . , Xtn ) ∈ D}) .

The class of sets of the form {(X0 , Xt1 , . . . , Xtn ) ∈ D}, n ≥ 1, 0 = t0 < t1 < · · · < tn , and
D ∈ B((Rd )n+1 ) arbitrary, is a π-system (i.e. is stable under intersection), which generates
F. From Dynkin’s lemma, W is completely determined on F, and in particular does not
depend on the specific (Ω, A, P, B) and N we used.

• (1.7): We perform the change of variables in (1.10)

(1.11) x1 = y 1 , x2 = y 1 + y 2 , . . . , xn = y 1 + y 2 + · · · + y n .

Note that Jac(y1 , . . . , yn |x1 , . . . , xn ) = 1, so that


a = ∫_{(R^d)^n} h(0, x1, . . . , xn) Π_{i=1}^n [2π(ti − ti−1)]^{−d/2} exp{−|xi − xi−1|²/(2(ti − ti−1))} dx1 . . . dxn,  by (1.10),

and this proves (1.7).

• (1.8): We pick h in (1.9), (1.10) of the form

h(x0 , x1 , . . . , xn ) = g(x0 , x1 − x0 , x2 − x1 , . . . , xn − xn−1 ), so that


h(0, y1 , y1 + y2 , . . . , y1 + · · · + yn ) = g(0, y1 , y2 , . . . , yn ) .

It then follows that Xt , t ≥ 0, fulfills (1.1), i), ii), iii). Since for all w ∈ C, t ≥ 0 →
Xt (w) ∈ Rd is continuous, (1.1) iv) holds as well, and Xt , t ≥ 0, is a Brownian motion on
(C, F, W ).

Definition 1.4.

• An R^d-valued process, Xt, t ∈ T, (T is some arbitrary non-empty set), defined on (Ω, A, P) is a centered Gaussian process (when T is finite, one also speaks of a centered Gaussian vector), if for any n ≥ 1, t1, . . . , tn ∈ T, λ1, . . . , λn ∈ R^d, Σ_{i=1}^n λi · Xti is a real-valued centered Gaussian variable (possibly ≡ 0); here “·” stands for the scalar product when d ≥ 2.

• The d × d-matrix valued function on T²:

(1.12)  Γ(u, v) = E[Xu ᵗXv] = (E[Xu^i Xv^j])_{1≤i,j≤d}, u, v ∈ T  (note that Γ(v, u) = ᵗΓ(u, v)),

is the covariance function of the process (note that E[Xt] = 0, for each t ∈ T).

Lemma 1.5. ((Xt )t∈T a centered Gaussian process with covariance function Γ)

(1.13)  The function Γ(u, v), u, v ∈ T, completely determines all finite-dimensional distributions, i.e. the laws of (Xt1, . . . , Xtn) on (R^d)^n, for any n ≥ 1 and t1, . . . , tn ∈ T.

Proof. For ξ = (λ1, . . . , λn) ∈ (R^d)^n, we set

ϕ(ξ) := E[exp{i Σ_{j=1}^n λj · Xtj}] = exp{−(1/2) E[(Σ_{j=1}^n λj · Xtj)²]},

since Σ_{j=1}^n λj · Xtj is a real-valued centered Gaussian variable. Hence

(1.14)  ϕ(ξ) = exp{−(1/2) Σ_{i,j=1}^n E[(λi · Xti)(λj · Xtj)]}
             = exp{−(1/2) Σ_{i,j=1}^n ᵗλi E[Xti ᵗXtj] λj}  (with ᵗλi a row vector, λj a column vector, and E[Xti ᵗXtj] a d × d matrix)
             = exp{−(1/2) Σ_{i,j=1}^n ᵗλi Γ(ti, tj) λj},  by (1.12).

But the characteristic function ϕ(·) completely determines the law of (Xt1, . . . , Xtn) on (R^d)^n.

We will now provide a characterization of Brownian motion as a continuous centered


Gaussian process.

Proposition 1.6. Let Bt , t ≥ 0, be an Rd -valued process defined on (Ω, A, P ) with P -a.s.


continuous trajectories:

(1.15)  Bt, t ≥ 0, is a Brownian motion ⟺ Bt, t ≥ 0, is a centered Gaussian process with Γ(s, t) = (s ∧ t) I_{d×d}, where I_{d×d} is the identity matrix.

Proof.
• ⟹: For n ≥ 1, 0 ≤ t1 < t2 < · · · < tn, λ1, . . . , λn ∈ R^d, P-a.s.,

a := Σ_{i=1}^n λi · Bti = Σ_{i=1}^n λi · (Σ_{j=1}^i (Btj − Btj−1)) = Σ_{j=1}^n (Σ_{i=j}^n λi) · (Btj − Btj−1)

(denoting t0 = 0, and using (1.1) i)). By (1.1), the (Btj − Btj−1), 1 ≤ j ≤ n, are independent, respectively N(0, (tj − tj−1) I_{d×d})-distributed. Therefore, a is a real-valued centered Gaussian variable (use the characteristic function), and we have shown that Bt, t ≥ 0, is a centered Gaussian process. Moreover for 0 ≤ s ≤ t, we have

(1.16)  Γ(s, t) = E[Bs ᵗBt] = E[Bs ᵗBs] + E[Bs ᵗ(Bt − Bs)] = s I_{d×d} + 0 = (s ∧ t) I_{d×d},

where the second term vanishes since Bs and Bt − Bs are independent and centered. Then, for t ≤ s, Γ(s, t) = ᵗΓ(t, s) = (t ∧ s) ᵗI_{d×d} = (s ∧ t) I_{d×d}, by (1.16).

• ⟸: If 0 < t1 < · · · < tn are given, and on some auxiliary probability space Yj, 1 ≤ j ≤ n, are independent N(0, (tj − tj−1) I_{d×d})-distributed (with t0 = 0), we can define Xj = Σ_{k=1}^j Yk. A repetition of the argument above shows that Xj, 1 ≤ j ≤ n, is a centered Gaussian process, and we can calculate its covariance as follows. For 1 ≤ i ≤ j ≤ n, one has:

(1.17)  Γ(i, j) := E[Xi ᵗXj] = E[Xi ᵗXi] + E[Xi ᵗ(Xj − Xi)] = E[(Σ_{k=1}^i Yk) ᵗ(Σ_{k=1}^i Yk)] = Σ_{1≤k≤i} E[Yk ᵗYk] = ti I_{d×d},

using that Xi and Xj − Xi are independent and centered, and that the Yk are independent and centered. As below (1.16), we thus see that:

(1.18)  Γ(i, j) = (ti ∧ tj) I_{d×d}, for 1 ≤ i, j ≤ n.

By (1.13), we thus see that (X1 , . . . , Xn ) has the same law as (Bt1 , . . . , Btn ). Therefore,
(Bt1 , Bt2 − Bt1 , . . . , Btn − Btn−1 ) has the same law as (X1 , X2 − X1 , . . . , Xn − Xn−1 ) =
(Y1 , . . . , Yn ). Hence (1.1) ii), iii) follow. Moreover (1.1) iv) holds by assumption and
E[B0 t B0 ] = 0 implies by looking at the diagonal coefficients of this matrix that P -a.s.,
B0 = 0. This proves that Bt , t ≥ 0, is a Brownian motion.

The above characterization is very helpful.
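As a quick numerical sanity check of (1.15) (a sketch, with an arbitrary grid, horizon and sample size), one can simulate Brownian motion on a grid and compare the empirical covariance E[Bs Bt] with s ∧ t:

```python
import numpy as np

rng = np.random.default_rng(3)
n_paths, n_steps, T = 5000, 100, 1.0
dt = T / n_steps
# each row is one path on the grid: cumulative sums of independent N(0, dt) increments
B = np.cumsum(rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps)), axis=1)

s_idx, t_idx = 29, 69                           # grid times s = 0.30, t = 0.70
emp = np.mean(B[:, s_idx] * B[:, t_idx])        # empirical E[B_s B_t]
print(emp, (s_idx + 1) * dt)                    # both close to s ∧ t = 0.3
```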

Examples:
1) Invariance by scaling:
Consider Bt , t ≥ 0, an Rd -valued Brownian motion, λ > 0, then

(1.19) λBt/λ2 , t ≥ 0, is also an Rd -valued Brownian motion .

Indeed: Bt^λ := λ B_{t/λ²}, t ≥ 0, is also a continuous centered Gaussian process, and for s ≥ 0, t ≥ 0:

E[Bs^λ ᵗBt^λ] = λ² E[B_{s/λ²} ᵗB_{t/λ²}] = λ² (s/λ² ∧ t/λ²) I_{d×d} = (s ∧ t) I_{d×d},

and (1.19) follows from (1.15).

2) Invariance by time inversion:


Consider Bt , t ≥ 0, an Rd -valued Brownian motion, and define

(1.20) β0 = 0, and βs = s B1/s , for s > 0 .

Then one has:

(1.21) βs , s ≥ 0, is an Rd -valued Brownian motion .

Indeed: βs, s ≥ 0, is a centered Gaussian process, and for 0 < s, t:

(1.22)  E[βs ᵗβt] = st E[B_{1/s} ᵗB_{1/t}] = st (1/s ∧ 1/t) I_{d×d} = (st/(s ∨ t)) I_{d×d} = (s ∧ t) I_{d×d},

and this formula immediately extends to the case 0 ≤ s, t.


It only remains to see that P-a.s., s ≥ 0 → βs is continuous, in order to conclude (1.21). To this end, we note that by (1.22) and (1.13),

(1.23)  the laws of βs, s > 0, and Bt, t > 0, on C((0, ∞), R^d) are identical

(note the interval open at 0!).

We let Xu , u > 0, denote the canonical process on C((0, ∞), Rd ), and G = σ(Xu , u > 0)
the canonical σ-algebra. If Q stands for the common law on (C((0, ∞), Rd ), G) of βs , s > 0,
or Bt , t > 0, then

(1.24)  {lim_{u→0} Xu = 0} ∈ G  (it is an “event”).

Indeed one has

{lim_{u→0} Xu = 0} = ∩_{n≥1} ∪_{m≥1} ∩_{u∈Q∩(0,1/m)} {|Xu| ≤ 1/n} ∈ G.

As a result we find

Q(lim_{u→0} Xu = 0) = P(lim_{u→0} Bu = 0) = 1,  by (1.1) i), iv),

and, by (1.23), Q(lim_{u→0} Xu = 0) = P(lim_{u→0} βu = 0), so that P-a.s. lim_{u→0} βu = 0 = β0, and hence βs, s ≥ 0, fulfills (1.1) iv) as well. □

Construction of Brownian motion:


We are now going to construct a Brownian motion on some (Ω, A, P ). It suffices to consider
the case d = 1, since by taking d independent copies of a real-valued Brownian motion,
we obtain a d-dimensional Brownian motion.
We follow a method of Paul Lévy (1948), later simplified by Z. Ciesielski (1961).
We recall the Haar functions on R+:

(1.25)  ϕℓ(t) = 1_{[ℓ,ℓ+1)}(t), ℓ ∈ N (the set of non-negative integers),
        ϕ_{m,k}(t) = 2^{m/2} 1_{[k/2^m, k/2^m + 1/2^{m+1})}(t) − 2^{m/2} 1_{[k/2^m + 1/2^{m+1}, (k+1)/2^m)}(t), with m, k ∈ N.

Fact:

(1.26) The ϕℓ , ℓ ≥ 0, ϕm,k , m, k ≥ 0, form a complete orthonormal basis of L2 (R+ , dt) .

Indeed:

- The functions ϕℓ , ϕm,k have unit L2 (R+ , dt)-norms.

- They are pairwise orthogonal in L2 (R+ , dt).

- The L²-closure of the span of the ϕℓ, ϕ_{m,k} is L²(R+, dt), because one sees by induction on m ≥ 0 that all 1_{[j/2^m, (j+1)/2^m)}, j ≥ 0, belong to the space generated by ϕℓ, ℓ ≥ 0, and ϕ_{m′,k}, 0 ≤ m′ < m, 0 ≤ k, and the above claim follows.

Heuristic (non-rigorous) description of the construction of Brownian motion


The idea is to use the formal development of Ḃ (the derivative of Brownian motion !!!) in the above Haar basis. Formally we have:

Ḃ(·) “=” Σ_{ℓ≥0} ϕℓ(·) (∫_0^∞ Ḃ ϕℓ dt) + Σ_{m,k≥0} ϕ_{m,k}(·) (∫_0^∞ Ḃ ϕ_{m,k} dt).

We then write

(1.27)  B(t) “=” ∫_0^t Ḃ(u) du “=” Σ_{ℓ≥0} (∫_0^t ϕℓ(u) du) (∫_0^∞ Ḃ ϕℓ dt) + Σ_{m,k≥0} (∫_0^t ϕ_{m,k}(u) du) (∫_0^∞ Ḃ ϕ_{m,k} dt).

We now define

(1.28)  ξℓ “=” ∫_0^∞ Ḃ ϕℓ dt “=” ∫_ℓ^{ℓ+1} Ḃ dt “=” B(ℓ + 1) − B(ℓ),

(1.29)  ξ_{m,k} “=” ∫_0^∞ Ḃ ϕ_{m,k} dt “=” 2^{m/2} (∫_{k/2^m}^{k/2^m + 1/2^{m+1}} Ḃ dt − ∫_{k/2^m + 1/2^{m+1}}^{(k+1)/2^m} Ḃ dt)
        “=” 2^{m/2} (B(k/2^m + 1/2^{m+1}) − B(k/2^m)) − 2^{m/2} (B((k+1)/2^m) − B(k/2^m + 1/2^{m+1})).

Now, if a Brownian motion exists, then the right-hand sides of (1.28) and (1.29) make
sense, and the above ξℓ , ξm,k are N (0, 1)-variables, and form a centered Gaussian family
(since they are linear combinations of B(t), t ≥ 0). Moreover, the variables ξℓ , ℓ ≥ 0, ξm,k ,
m, k ≥ 0, are pairwise orthogonal:

E[ξℓ ξℓ′] = 0, for ℓ ≠ ℓ′ ≥ 0,   E[ξℓ ξ_{m,k}] = 0, for ℓ ≥ 0, m, k ≥ 0,
E[ξ_{m,k} ξ_{m′,k′}] = 0, for (m, k) ≠ (m′, k′).

This orthogonality follows from (1.1) ii), iii) (for instance E[ξℓ ξℓ′] = 0, for ℓ ≠ ℓ′, is immediate to check; likewise E[ξℓ ξ_{m,k}] = 0 is immediate if [k/2^m, (k+1)/2^m) ∩ [ℓ, ℓ+1) = ∅, and on the other hand when the intersection is not empty, then [k/2^m, (k+1)/2^m) ⊆ [ℓ, ℓ+1), and one has

−E[ξℓ ξ_{m,k}] = 2^{m/2} E[(B(ℓ+1) − B(ℓ)) {(B((k+1)/2^m) − B(k/2^m + 1/2^{m+1})) − (B(k/2^m + 1/2^{m+1}) − B(k/2^m))}],

and writing

B(ℓ+1) − B(ℓ) = (B(ℓ+1) − B((k+1)/2^m)) + (B((k+1)/2^m) − B(k/2^m + 1/2^{m+1})) + (B(k/2^m + 1/2^{m+1}) − B(k/2^m)) + (B(k/2^m) − B(ℓ)),

we can use (1.1) ii), iii) to conclude that E[ξℓ ξ_{m,k}] = 0 as well; the last equality E[ξ_{m,k} ξ_{m′,k′}] = 0 for (m, k) ≠ (m′, k′) is shown with analogous considerations).

Since the ξℓ , ξm,k form a centered Gaussian family, are N (0, 1)-distributed, and are
pairwise uncorrelated, they are in fact independent (cf. (1.13)). Hence, the formal formulas
(1.27), (1.28), (1.29) tell us where we should “look for a Brownian motion”. We will now
see how the above non-rigorous considerations can be transformed into a real proof.

Mathematical construction:
We consider on some suitable probability space (Ω, A, P ) a (countable) family of i.i.d.
N (0, 1)-distributed variables ξℓ , ξm,k , ℓ ≥ 0, m, k ≥ 0 (for instance Ω = (0, 1), A =
B((0, 1)), P = Lebesgue measure on (0, 1), will do the job).
We then define for n ≥ 1 and t ≥ 0:

(1.30)  Xn(t) = Σ_{0≤ℓ<n} Φℓ(t) ξℓ + Σ_{0≤m<n} Σ_{0≤k<n2^m} Φ_{m,k}(t) ξ_{m,k},

where

(1.31)  Φℓ(t) = ∫_0^t ϕℓ(u) du and Φ_{m,k}(t) = ∫_0^t ϕ_{m,k}(u) du  (these are called Schauder functions).

[Figure: Φℓ is a “tent function” of height 1 on [ℓ, ℓ+1]; Φ_{m,k} is a tent function of height 2^{−(m/2+1)} supported on [k/2^m, (k+1)/2^m], with peak at (2k+1)/2^{m+1}.]
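A small numerical sketch of the partial sums (1.30) on the time interval [0, 1] (where only the term ℓ = 0 and the ϕ_{m,k} with k < 2^m contribute; the truncation level is arbitrary):

```python
import numpy as np

def schauder(m, k, t):
    """Schauder function Phi_{m,k} of (1.31): a tent of height 2^(-(m/2+1)) on [k/2^m, (k+1)/2^m]."""
    left, mid, right = k / 2**m, (2 * k + 1) / 2**(m + 1), (k + 1) / 2**m
    up = (t - left) / (mid - left)
    down = (right - t) / (right - mid)
    return 2 ** (-(m / 2 + 1)) * np.clip(np.minimum(up, down), 0.0, 1.0)

def levy_ciesielski(t, levels, rng):
    """Partial sum of (1.30) restricted to [0, 1], truncated at the given number of levels m."""
    X = rng.normal() * t                          # Phi_0(t) = t on [0, 1], coefficient xi_0
    for m in range(levels):
        for k in range(2**m):
            X = X + rng.normal() * schauder(m, k, t)
    return X

rng = np.random.default_rng(4)
t = np.linspace(0.0, 1.0, 513)
path = levy_ciesielski(t, levels=10, rng=rng)     # an approximate Brownian path on [0, 1]
print(path[-1])
```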

Lemma 1.7.
P -a.s., Xn (·, ω) converges uniformly on compact intervals of
(1.32)
R+ to a finite limit X(·, ω).

Proof. It suffices to prove that for any n0 ≥ 1, P -a.s., Xn (·, ω) converges uniformly on
[0, n0 ] to a finite limit (because we can then find N ∈ A, with P [N ] = 0, so that on N c ,
Xn (·, ω) converges uniformly on any [0, n0 ], and then define

X(t, ω) = lim_n Xn(t, ω), ω ∈ N^c, t ≥ 0,
        = 0, when ω ∈ N, t ≥ 0).
For t ∈ [0, n0], we have for n ≥ n0:

(1.33)  Xn(t) = Σ_{ℓ=0}^{n0−1} Φℓ(t) ξℓ + Σ_{0≤m<n} Σ_{0≤k<n0 2^m} Φ_{m,k}(t) ξ_{m,k},

and for each t ∈ [0, n0], and each m ≥ 0, there is at most one k ≥ 0, such that Φ_{m,k}(t) ≠ 0. We then define:

(1.34)  am(ω) = sup_{t∈[0,n0]} |Σ_{k<n0 2^m} Φ_{m,k}(t) ξ_{m,k}| ≤ 2^{−(m/2+1)} sup_{k<n0 2^m} |ξ_{m,k}|.

We will control this supremum. To this end, we note that for ξ N(0,1)-distributed:

(1.35)  P[|ξ| > a] ≤ √(2/π) (1/a) exp{−a²/2}, for a > 0

(indeed ∫_0^∞ exp{−(a+u)²/2} du ≤ ∫_0^∞ exp{−a²/2 − au} du = (1/a) exp{−a²/2}).
It follows that

Σ_{m≥1} P[sup_{0≤k<n0 2^m} |ξ_{m,k}| > √(2m)] ≤ Σ_{m≥1} √(2/π) (n0 2^m/√(2m)) e^{−m} < ∞.

Thus, the Borel-Cantelli lemma implies that for P-almost every ω, there exists m0(ω) such that for m ≥ m0(ω), sup_{k<n0 2^m} |ξ_{m,k}(ω)| ≤ √(2m). As a result:

(1.36)  P-a.s., Σ_{m≥m0(ω)} am(ω) ≤ Σ_{m≥m0(ω)} 2^{−(m/2+1)} √(2m) < ∞.

It follows that P -a.s., Xn (·, ω) converges uniformly on [0, n0 ] to a finite limit.

We will now see that the above defined X(t, ω), t ≥ 0, is a Brownian motion. First, we
observe that each Xn (t), t ≥ 0, is a centered Gaussian process (the Xn (t) are finite linear
combinations of the i.i.d. N (0, 1)-distributed ξℓ , ℓ ≥ 0, ξm,k , m, k ≥ 0). Note also that

(1.37)  for t ≥ 0, Xn(t, ω) → X(t, ω) in L²(Ω, A, P).

Indeed the ξℓ, ξ_{m,k} are orthogonal in L²(P) so that, cf. (1.30), Xn(t) and X_{n+m}(t) − Xn(t) are orthogonal. So, for 1 ≤ m < ℓ, E[(Xℓ(t) − Xm(t))²] = Σ_{m<k≤ℓ} E[(Xk(t) − X_{k−1}(t))²], and to prove that Xn(t), n ≥ 1, is a Cauchy sequence in L²(Ω, A, P), it suffices to show that Σ_{k≥1} E[(Xk(t) − X_{k−1}(t))²] < ∞ (with X0(t) = 0, by convention). Since E[Xn(t)²] = Σ_{1≤k≤n} E[(Xk(t) − X_{k−1}(t))²], we only need to check that sup_n E[Xn(t)²] < ∞, that is:

(1.38)  Σ_{ℓ≥0} Φℓ(t)² + Σ_{m,k≥0} Φ_{m,k}(t)² < ∞.

To check this last point, we observe that the above sum equals:

(1.39)  Σ_{ℓ≥0} (∫_0^t ϕℓ(u) du)² + Σ_{m,k≥0} (∫_0^t ϕ_{m,k}(u) du)² = ‖1_{[0,t]}‖²_{L²(R+,du)} = t < ∞,

by the Parseval relation and (1.26).

In a very similar vein, we calculate E[Xn(s) Xn(t)] for 0 ≤ s, t, as follows:

(1.40)  E[Xn(s) Xn(t)] = Σ_{0≤ℓ<n} Φℓ(t) Φℓ(s) + Σ_{0≤m<n, 0≤k<n2^m} Φ_{m,k}(t) Φ_{m,k}(s) = ⟨πn(1_{[0,t]}), πn(1_{[0,s]})⟩_{L²(R+,du)},  by (1.30),

with πn the orthogonal projection in L²(R+, du) on the space spanned by ϕℓ, 0 ≤ ℓ < n, ϕ_{m,k}, 0 ≤ m < n, 0 ≤ k < n2^m. Combining (1.37) and (1.40), and letting n → ∞ in

E[Xn(s) Xn(t)] = ⟨πn(1_{[0,s]}), πn(1_{[0,t]})⟩_{L²(R+,du)},

we find that

E[X(s) X(t)] = ⟨1_{[0,s]}, 1_{[0,t]}⟩_{L²(R+,du)} = s ∧ t.

Note that weak limits of centered Gaussian distributions are centered Gaussian (use characteristic functions). It follows that linear combinations of the X(t), which are limits, in L²(P) by (1.37), and thus in distribution, of linear combinations of the Xn(t), are centered Gaussian variables.
So X(t, ω), t ≥ 0, is a centered Gaussian process. From the above calculation Γ(s, t) =
s ∧ t, and due to (1.32) P -a.s., t ≥ 0 → X(t, ω) is continuous.
We have thus proved that X(t, ω) is a Brownian motion on the probability space
(Ω, A, P ) selected above (1.30).

Complement:
We will now discuss another construction of Brownian motion Bt , 0 ≤ t ≤ 1, on the time
interval [0, 1], which gives a proof of a result of Paley and Wiener (1934). This approach
will bring into play some important methods on how to control the modulus
of continuity of a stochastic process.
In place of the complete orthonormal basis of L2 (R+ , dt) given by the Haar functions,
cf. (1.25), (1.26), we consider

(1.41) ϕk (t), 0 ≤ t ≤ 1, k ≥ 0, an orthonormal basis of L2 ([0, 1], dt),

as well as the sequence


(1.42)  Φk(t) = ∫_0^t ϕk(u) du, 0 ≤ t ≤ 1, k ≥ 0.

Example: (Paley and Wiener)


The concrete choice of Paley and Wiener (1934) corresponds to:

(1.43)  ϕ0 ≡ 1, ϕk(t) = √2 cos(kπt), k ≥ 1, 0 ≤ t ≤ 1, so that
        Φ0(t) = t, and Φk(t) = (√2/(kπ)) sin(kπt), 0 ≤ t ≤ 1, k ≥ 1.

In the spirit of (1.30), we consider on some probability space (Ω, A, P ) (for instance Ω =
(0, 1), A = B((0, 1)), P = Lebesgue measure on (0, 1)), a sequence ξk , k ≥ 0, of i.i.d.
N (0, 1)-distributed variables.
Similarly to (1.30), we define for n ≥ 0, 0 ≤ t ≤ 1,

(1.44)  Xn(t) = Σ_{0≤k≤n} Φk(t) ξk.

(So, in the situation corresponding to the choice (1.43) of Paley and Wiener:

(1.45)  Xn(t) = t ξ0 + Σ_{1≤k≤n} (√2/(kπ)) sin(kπt) ξk, for 0 ≤ t ≤ 1, n ≥ 0 ).
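A corresponding numerical sketch of the partial sums (1.45) (the truncation level n and the grid are arbitrary):

```python
import numpy as np

def paley_wiener(t, n, rng):
    """Partial sum (1.45): X_n(t) = t*xi_0 + sum_{k=1}^n sqrt(2)/(k*pi) * sin(k*pi*t) * xi_k."""
    xi = rng.normal(size=n + 1)
    X = t * xi[0]
    for k in range(1, n + 1):
        X = X + np.sqrt(2.0) / (k * np.pi) * np.sin(k * np.pi * t) * xi[k]
    return X

rng = np.random.default_rng(5)
t = np.linspace(0.0, 1.0, 513)
path = paley_wiener(t, n=2000, rng=rng)     # an approximate Brownian path on [0, 1]
print(path[256])                             # value at t = 0.5, one sample of a roughly N(0, 0.5) variable
```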

We will see that P -a.s., Xn (t, ω) converges uniformly on [0, 1] to X(t, ω) distributed as the
time restriction to [0, 1] of a Brownian motion. Here again, the most delicate point has to
do with the fact that the convergence is P -a.s. uniform on [0, 1]. However, unlike for the
proof of (1.32), we cannot make use of the special properties of the orthonormal basis (see
for instance (1.34)). This will bring into play interesting considerations. We begin with a
lemma concerning functions.

Lemma 1.8. (T > 0)

For r, γ > 0, and f ∈ C([0, T], R^d), define

(1.46)  I = ∫_{[0,T]²} (|f(t) − f(s)|/|t − s|^γ)^r ds dt.

Assume I < ∞. Then for 0 ≤ s < t ≤ T, one has

(1.47)  |f(t) − f(s)| ≤ 8 ∫_0^{t−s} (4I/u²)^{1/r} γ u^{γ−1} du,

and hence, for 2/r < γ, one has

(1.47')  |f(t) − f(s)| ≤ 8 (γ/(γ − 2/r)) (4I)^{1/r} |t − s|^{γ−2/r}, for 0 ≤ s < t ≤ T.

Proof. We only need to prove (1.47) when

(1.48)  t = 1 = T, s = 0, and 2/r < γ.

The restriction 2/r < γ is clear (otherwise the right-hand side is infinite), and note that given f and 0 ≤ s < t ≤ T, as in Lemma 1.8, we can define

f̄(τ) = f(s + (t − s)τ), 0 ≤ τ ≤ 1,

so that if we know that (1.47) holds under (1.48), we find that:

(1.49)  |f(t) − f(s)| = |f̄(1) − f̄(0)| ≤ 8 (γ/(γ − 2/r)) (4Ī)^{1/r}, with
        Ī = ∫_{[0,1]²} (|f̄(u) − f̄(v)|/|u − v|^γ)^r du dv ≤ (t − s)^{−2+rγ} I  (by a change of variables),

and inserting in (1.49) we find (1.47') (and (1.47) as well).


We thus assume (1.48). We define for 0 ≤ u ≤ 1,

(1.50)  J(u) = ∫_0^1 (|f(u) − f(v)|/|u − v|^γ)^r dv.

Since ∫_0^1 J(u) du = I, there is a t0 ∈ (0, 1) s.t. J(t0) ≤ I. We will show that

(1.51)  there are tn, n ≥ 0, in (0,1) such that, with d_n^γ = (1/2) t_n^γ,
        t_{n+1} ∈ (0, dn),  J(t_{n+1}) ≤ 2I/dn,  and  (|f(t_{n+1}) − f(tn)|/|t_{n+1} − tn|^γ)^r ≤ 2J(tn)/dn.

Indeed given tn, define dn as indicated, and note that the Lebesgue measure of

{u ∈ (0, dn); J(u) > 2I/dn}  is < dn/2, since ∫_0^1 J(u) du = I,

and the Lebesgue measure of

{u ∈ (0, dn); (|f(u) − f(tn)|/|u − tn|^γ)^r > 2J(tn)/dn}  is < dn/2, since ∫ (|f(u) − f(tn)|/|u − tn|^γ)^r du = J(tn).

Hence we must be able to find t_{n+1} in (0, dn), for which the last line of (1.51) holds. This proves (1.51). We now have

(1.52)  1 > t0 > d0 > t1 > d1 > · · · > tn > dn > . . . ,

and since d_{n+1}^γ ≤ (1/2) d_n^γ, dn → 0, as n → ∞.
Note also that:

(1.53)  |tn − t_{n+1}|^γ ≤ t_n^γ = 2 d_n^γ = 4 (d_n^γ − (1/2) d_n^γ) ≤ 4 (d_n^γ − d_{n+1}^γ).

From the last line of (1.51), with the convention d−1 = 1, we find for n ≥ 0:

(1.54)  |f(tn) − f(t_{n+1})| ≤ (2J(tn)/dn)^{1/r} |tn − t_{n+1}|^γ ≤ (4I/(dn d_{n−1}))^{1/r} |tn − t_{n+1}|^γ
        ≤ 4 (4I/(dn d_{n−1}))^{1/r} (d_n^γ − d_{n+1}^γ) ≤ 4 ∫_{d_{n+1}}^{d_{n−1}} (4I/u²)^{1/r} γ u^{γ−1} du,

using (1.51) and (1.53). Summing over n, we find:

(1.55)  |f(t0) − f(0)| ≤ 4 ∫_0^{t0} (4I/u²)^{1/r} γ u^{γ−1} du.

If we introduce the function f̃(·) = f(1 − ·), for which the corresponding Ĩ of course equals I, we can pick t̃0 = 1 − t0, and obtain with (1.55) that

|f̃(t̃0) − f̃(0)| = |f(t0) − f(1)| ≤ 4 ∫_0^{t̃0} (4I/u²)^{1/r} γ u^{γ−1} du,

and combining with (1.55) deduce that

(1.56)  |f(1) − f(0)| ≤ 8 ∫_0^1 (4I/u²)^{1/r} γ u^{γ−1} du,

thus concluding the proof of Lemma 1.8.

Remark 1.9. The above lemma is a special case of a more general result (see [12], p. 170). The interest of the lemma is that it makes it possible to control the modulus of continuity of f in terms of an integral of f(·) corresponding to I. This will be handy when proving Kolmogorov's criterion below. The quantity I is also related to certain Besov norms (see for instance [1], p. 214). □

As an application of Lemma 1.8 we have

Theorem 1.10. (Kolmogorov’s criterion)


If Xn(t, ω), 0 ≤ t ≤ T, n ≥ 1, are d-dimensional stochastic processes on (Ω, A, P) with continuous trajectories such that for some r > 0, α > 0,

(1.57)  E[sup_{n≥1} |Xn(t) − Xn(s)|^r] ≤ C |t − s|^{1+α}, for 0 ≤ s ≤ t ≤ T,

then for each β ∈ (0, α/r) there is a K(r, α, β, T) > 0, such that:

(1.58)  P[sup_{n≥1, 0≤s<t≤T} |Xn(t) − Xn(s)|/(t − s)^β ≥ R] ≤ KC/R^r, for all R > 0.

(Note that the processes Xn(·) may very well all coincide with X1(·).)

Proof. Consider β ∈ (0, α/r) and set

(1.59)  γ := 2/r + β < (2 + α)/r.

Then, observe that (1.57) and Fubini's theorem imply that

(1.60)  E[J] := E[∫_{[0,T]²} sup_{n≥1} (|Xn(t) − Xn(s)|^r/|t − s|^{rγ}) dt ds] ≤ C ∫_{[0,T]²} |t − s|^{1+α−rγ} ds dt =: C K1(r, α, β, T) < ∞

(the exponent 1 + α − rγ is > −1 by (1.59); here J(ω) denotes the double integral inside the expectation). By Lemma 1.8, we know that for 0 ≤ s ≤ t ≤ T,

(1.61)  sup_{n≥1} |Xn(t, ω) − Xn(s, ω)| ≤ 8 (γ/β) (4J(ω))^{1/r} (t − s)^β,  by (1.47').

It thus follows that for R > 0

P[sup_{n≥1, 0≤s<t≤T} |Xn(t, ω) − Xn(s, ω)|/|t − s|^β ≥ R] ≤ P[8^r 4 (γ/β)^r J ≥ R^r]  (by (1.61))
        ≤ 4 · 8^r (γ/β)^r E[J]/R^r ≤ 4 · 8^r (γ/β)^r C K1/R^r  (by Markov's inequality and (1.60)),

and (1.58) follows.

Kolmogorov's criterion provides a powerful tool to estimate the modulus of continuity of stochastic processes.
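For instance, in the single-process case one may look numerically at the Hölder quotient that (1.58) controls (a rough sketch on a simulated Brownian path; the grid and the exponent β = 1/4 < 1/2 are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(6)
n, T, beta = 2000, 1.0, 0.25
t = np.linspace(0.0, T, n + 1)
B = np.concatenate(([0.0], np.cumsum(rng.normal(0.0, np.sqrt(T / n), size=n))))

# sup over grid points s < t of |B_t - B_s| / (t - s)^beta
diffs = np.abs(B[:, None] - B[None, :])
gaps = np.abs(t[:, None] - t[None, :])
mask = gaps > 0
print(np.max(diffs[mask] / gaps[mask] ** beta))   # a finite, moderate value for beta < 1/2
```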
The next result in the special case of (1.43), (1.45), recovers a famous result of Paley
and Wiener (1934).

Theorem 1.11. (under (1.41), (1.42))

(1.62)  P-a.s., Xn(t) = Σ_{0≤k≤n} Φk(t) ξk, 0 ≤ t ≤ 1, converges uniformly on [0, 1] to X(t, ω), 0 ≤ t ≤ 1, which has the same law on C([0, 1], R) as Bt, 0 ≤ t ≤ 1, if Bt, t ≥ 0, is a Brownian motion.

Proof. We introduce the filtration

(1.63) Fn = σ(ξ0 , ξ1 , . . . , ξn ), n ≥ 0 ,

and note that for each t ∈ [0, 1], and p ≥ 1:

(1.64) Xn (t) is an Fn -martingale, with finite Lp -norm.

It then follows from Doob's inequality (with p = 4), that for 0 ≤ s ≤ t ≤ 1:

(1.65)  E[sup_{n≥0} (Xn(t) − Xn(s))⁴] ≤ (4/3)⁴ sup_{n≥0} E[(Xn(t) − Xn(s))⁴].

Note that Xn(t) − Xn(s) is a centered Gaussian variable with variance

(1.66)  E[(Xn(t) − Xn(s))²] = Σ_{0≤k≤n} (∫_s^t ϕk(u) du)² ≤ Σ_{k≥0} (∫_s^t ϕk(u) du)² = ‖1_{[s,t]}‖²_{L²([0,1],du)} = t − s,

using (1.42), (1.44) and the Parseval identity. Now for Z an N(0,1) variable we have E[Z⁴] = 3 (indeed, integrating by parts,

∫_R x⁴ e^{−x²/2} dx/√(2π) = [−x³ e^{−x²/2}/√(2π)]_{−∞}^{+∞} + 3 ∫_R x² e^{−x²/2} dx/√(2π) = 3),

and hence, by scaling we find that for Z, N(0, σ²)-distributed,

(1.67)  E[Z⁴] = 3σ⁴.

Combining this identity with (1.65), (1.66), we see that

(1.68)  E[sup_{n≥0} (Xn(t) − Xn(s))⁴] ≤ C (t − s)², for 0 ≤ s ≤ t ≤ 1, with C = 3 (4/3)⁴.

We can then apply Kolmogorov's criterion, with α = 1, r = 4, in (1.57), and choosing β = 1/8 (< 1/4 = α/r), we deduce from (1.58) that

(1.69)  P-a.s., sup_{n≥1, 0≤s<t≤1} |Xn(t) − Xn(s)|/(t − s)^{1/8} < ∞.

Since Xn(0) = 0, for each n, it follows from the Ascoli-Arzelà theorem that

(1.70)  P-a.s., Xn(·, ω), n ≥ 0, is a relatively compact sequence in C([0, 1], R) endowed with the topology of uniform convergence.

By (1.68) with s = 0, (1.64) and the martingale convergence theorem we also see that

(1.71) P -a.s., for all t ∈ Q ∩ (0, 1), Xn (t, ω) has a finite limit .

Combining (1.70) and (1.71), we have thus shown that

(1.72) P -a.s., Xn (t, ω) converges uniformly on [0, 1]

(this plays the role of (1.32)).

We can thus define a stochastic process X(t, ω), 0 ≤ t ≤ 1, such that

(1.73)  t ∈ [0, 1] → X(t, ω) is continuous for each ω ∈ Ω, and P-a.s., lim_n sup_{t∈[0,1]} |Xn(t, ω) − X(t, ω)| = 0.

For each t ∈ [0, 1] the (Fn)-martingale Xn(t) also converges in L²(P), due to (1.66), with s = 0, see also below (1.37). Proceeding as in (1.40) (see also (1.66)), we find that for 0 ≤ s, t ≤ 1:

(1.74)  E[Xn(s) Xn(t)] → ⟨1_{[0,s]}, 1_{[0,t]}⟩_{L²([0,1],du)} = s ∧ t, as n → ∞, and hence E[X(s) X(t)] = s ∧ t.

Moreover X(t), 0 ≤ t ≤ 1, by the same arguments as below (1.40), is also a centered Gaussian process. Applying (1.13), we thus find that X(t), 0 ≤ t ≤ 1, has the same law on C([0, 1], R) as Bt, 0 ≤ t ≤ 1, if Bt, t ≥ 0, is a Brownian motion.

Chapter 2: Brownian Motion and Markov Property
We are going to successively discuss the “simple Markov property” and the “strong Markov
property”, and this chapter will revolve around the fact that Brownian motion is a canon-
ical example of a continuous Markov process.
Heuristically the simple Markov property states that if one “knows” the trajectory of
a Brownian motion X. until time s, then the trajectory after time s : X.+s , given this
information behaves like a Brownian motion starting from the random initial position Xs .
In particular, only the knowledge of Xs matters, in this prediction of the future after time
s given the past up to time s. The strong Markov property will extend this to stopping
times in place of the fixed time s.

Notation: (as in Chapter 1)

C = C(R+ , Rd ),

Xt : C → Rd , t ≥ 0 (“the canonical coordinates”),

F = σ(Xu , u ≥ 0),

W = Wiener measure on (C, F).

For x ∈ Rd , “Brownian motion starting from x” is described by the probability:

(2.1) Wx = the image of W under the map w(·) ∈ C → w(·) + x ∈ C,

(we write Ex for the corresponding expectation).

In particular, for h(x0, . . . , xn) ∈ bB((R^d)^{n+1}), 0 = t0 < t1 < · · · < tn,

(2.2)  Ex[h(Xt0, Xt1, . . . , Xtn)] = E^W[h(x + X0, x + Xt1, . . . , x + Xtn)]
       = ∫_{(R^d)^n} h(x, x1, . . . , xn) Π_{i=1}^n [2π(ti − ti−1)]^{−d/2} exp{−Σ_{i=1}^n |xi − xi−1|²/(2(ti − ti−1))} dx1 . . . dxn  (with x0 = x),

by (1.7).

On C we have the time-shift operators:

(2.3)  for s ≥ 0, θs : (C, F) → (C, F), θs(w)(·) = w(s + ·)  (measurable by (1.5)).

[Figure: a trajectory w(·), with the time s marked as the new origin of time.]

Note that:

(2.4)  f(Xt0, . . . , Xtn) ◦ θs (w) = f(w(s + t0), w(s + t1), . . . , w(s + tn))

(this concerns the trajectory after time s).

The information contained in the part of the trajectory up to time s is described by:

(2.5)  Fs = σ(Xu, u ≤ s), and

(2.6)  Fs+ = ∩_{ε>0} Fs+ε  (⊇ Fs),

so that “Fs+ peeks infinitesimally into the future after time s”. For instance the event “the trajectory immediately leaves its starting point”:

A = ∩_{n≥1} ∪_{r∈Q∩[0,1/n]} {Xr ≠ X0}  is in F0+ but not in F0.

Theorem 2.1. (simple Markov property)


Let Y ∈ bF, s ≥ 0, x ∈ R^d. Then

(2.7) Ex [Y ◦ θs | Fs+ ] = EXs [Y ], Wx -a.s. .

(2.8) Under Wx , (Xs+u − Xs )u≥0 is a Brownian motion independent of Fs+ .

Proof.
• (2.7): Note that

(2.9) y ∈ Rd → Ey [Y ] is in bB(Rd ) for Y ∈ bF .

Indeed this is true when Y = 1_{A0} ◦ Xt0 . . . 1_{An} ◦ Xtn, with t0 < t1 < · · · < tn and A0, . . . , An ∈ B(R^d), thanks to (2.2). We can then use Dynkin's lemma to conclude that this is also true when Y = 1_F, with F ∈ F, and then approximate a general Y ∈ bF by step-functions of the form Σ_{i=1}^m λi 1_{Fi} to obtain (2.9).

Then, note that for u0 = 0 < · · · < un = s, t0 = 0 < · · · < tk, with f, g bounded measurable one has

(2.10)  Ex[f(Xu0, . . . , Xun) g(Xs+t0, . . . , Xs+tk)]
        = ∫_{(R^d)^{n+k}} f(x, x1, . . . , xn) g(xn, . . . , xn+k) Π_{i=1}^n [2π(ui − ui−1)]^{−d/2} Π_{j=1}^k [2π(tj − tj−1)]^{−d/2}
          exp{−(1/2) Σ_{i=1}^n |xi − xi−1|²/(ui − ui−1) − (1/2) Σ_{j=1}^k |xn+j − xn+j−1|²/(tj − tj−1)} dx1 . . . dxn+k
        = ∫_{(R^d)^n} f(x, x1, . . . , xn) E_{xn}[g(Xt0, . . . , Xtk)] Π_{i=1}^n [2π(ui − ui−1)]^{−d/2} exp{−(1/2) Σ_{i=1}^n |xi − xi−1|²/(ui − ui−1)} dx1 . . . dxn  (by (2.2))
        = Ex[f(X0(w), . . . , Xun(w)) E_{Xs(w)}[g(Xt0, . . . , Xtk)]]  (again by (2.2); the last factor is the function w → E_{Xs(w)}[g(Xt0, . . . , Xtk)]).

Using Dynkin’s lemma, we see that for s ≥ 0, and A ∈ Fs :

(2.11) Ex [1A g(Xs+t0 , . . . , Xs+tk )] = Ex [1A EXs [g(Xt0 , . . . , Xtk )]] .

If we now pick g continuous and bounded, clearly

(2.12) y ∈ Rd → Ey [g(Xt0 , . . . , Xtk )] = E0 [g(Xt0 + y, . . . , Xtk + y)] ∈ R

is a continuous bounded function thanks to dominated convergence. We can apply (2.11)


with s+ε in place of s, A ∈ Fs+ ⊆ Fs+ε , and letting ε → 0, see, by dominated convergence,
that (2.11) holds for s ≥ 0, A ∈ Fs+ , and g bounded continuous. Then, by approximation,
it holds for g of the form

g(x0, . . . , xk) = Π_{i=0}^k 1_{Ki}(xi), with Ki, i = 0, . . . , k, closed in R^d.

Using Dynkin’s lemma once more we see that for s ≥ 0, A ∈ Fs+ , A′ ∈ F, Y = 1A′ :

(2.13) Ex [1A Y ◦ θs ] = Ex [1A EXs [Y ]] .

Then, using a uniform approximation by step functions, we see that (2.13) holds for
Y ∈ bF. This proves (2.7).

• (2.8): (Xs+u − Xs)_{u≥0} has continuous trajectories, and for f ∈ bB((R^d)^{n+1}), 0 = t0 < t1 < · · · < tn, Wx-a.s.,

(2.14)  Ex[f(Xs+t0 − Xs, . . . , Xs+tn − Xs) | Fs+] = Ex[f(Xt0 − X0, . . . , Xtn − X0) ◦ θs | Fs+]  (by (2.7))
        = E_{Xs}[f(Xt0 − X0, . . . , Xtn − X0)] = E0[f(Xt0, . . . , Xtn)]  (by (2.1)).

It now readily follows that (Xs+u − Xs )u≥0 fulfills (1.1), and is a Brownian motion on
(C, F, Wx ). Moreover it is straightforward from (2.14) (with Dynkin’s lemma) to see that
for any F (·) : C → R, bounded measurable F (Xs+. − Xs ) is independent of Fs+ . This
proves (2.8).

Corollary 2.2. (Blumenthal’s 0 − 1 law)

(2.15) For any x ∈ Rd , Wx (A) ∈ {0, 1}, when A ∈ F0+ .

Proof. 1A ◦ θ0 = 1A , since θ0 is the identity map. Therefore, we find that

Ex [1A | F0+ ] = 1A , Wx -a.s. (since A ∈ F0+ ),

and, by (2.7),

Ex[1_A | F0+] = Ex[1_A ◦ θ0 | F0+] = E_{X0}[1_A] = Wx(A), Wx-a.s., since Wx(X0 = x) = 1.

As a result, Wx -a.s., 1A = Wx (A), and the claim (2.15) follows.

As we now see, the σ-algebra F0+ contains some interesting events and this explains
the interest of Blumenthal’s 0 − 1 law.

Examples:
1) d = 1, let H̃+ := inf{s > 0; Xs > 0}, H̃− := inf{s > 0; Xs < 0} denote the respective hitting times of (0, ∞) and (−∞, 0).

Proposition 2.3.

(2.16)  W0-a.s., H̃+ = H̃− = 0.

Proof.

{H̃+ = 0} = ∩_{n≥1} ∪_{r∈(0,1/n]∩Q} {Xr ∈ (0, ∞)} ∈ F0+,

and for t > 0,

W0(H̃+ ≤ t) ≥ W0(Xt > 0) = 1/2,

so that, letting t decrease to 0, W0(H̃+ = 0) ≥ 1/2.

Thus by (2.15), we find that W0(H̃+ = 0) = 1. Of course, in the same way W0(H̃− = 0) = 1.

2) d ≥ 2, C some open cone with tip 0 in R^d (i.e. C is open, and for x ∈ R^d, x ∈ C ⟺ λx ∈ C, for all λ > 0).
Define the hitting time of C:

(2.17)  H̃_C = inf{s > 0; Xs ∈ C}.

Proposition 2.4.

(2.18)  W0-a.s., H̃_C = 0.

Proof. The argument is similar to the proof of (2.16). We use the fact that {H̃_C = 0} ∈ F0+ and

W0(H̃_C ≤ t) ≥ W0(Xt ∈ C) = W0(√t X1 ∈ C)  (by scaling)  = W0(X1 ∈ C) > 0  (since C is a cone with tip 0).

One then concludes as for (2.16).

3) d = 1, tn > 0, n ≥ 1, with lim tn = 0.

Proposition 2.5.

(2.19)  W0-a.s., lim sup_n Xtn/√tn = ∞.

Proof. For c > 0, note that A := lim sup_n {Xtn > c√tn} ∈ F0+. Indeed

A = ∩_{n≥1} An = ∩_{n≥n0} An, with An = ∪_{m≥n} {Xtm > c√tm},

so A ∈ Fε, if tm ≤ ε, for m ≥ n0. But ε > 0 can be chosen arbitrarily small.
By (2.15), W0(A) = 0 or 1; moreover An decreases with n, so that:

(2.20)  W0(A) = lim_n W0(An) ≥ lim_n W0(Xtn > c√tn) = W0(X1 > c) > 0  (by scaling).

As a result W0(A) = 1, for arbitrary c > 0. In particular, W0-a.s., lim sup_n Xtn/√tn ≥ c, and choosing c = k, k ≥ 1, we obtain (2.19).

Exercise 2.6. Show that:

Under W0, the asymptotic σ-field A = ∩_{u<∞} σ(Xv, v ≥ u) is trivial, that is W0(A) ∈ {0, 1}, for any A ∈ A.

In fact more is true:

A ∈ A ⟹ Wx(A) = 0, for all x ∈ R^d, or Wx(A) = 1, for all x ∈ R^d.

(Hint: use (1.21), Blumenthal's 0-1 law and the Markov property). □

We continue our discussion of the simple Markov property, and will in the spirit of
(1.15) (in the case of Gaussian processes), provide a Markovian characterization of Brow-
nian motion. For this purpose, we introduce the Brownian transition semigroup:

(2.21)  Rt f(x) = Ex[f(Xt)], x ∈ R^d, t ≥ 0, f ∈ bB(R^d)
               = ∫_{R^d} (2πt)^{−d/2} f(y) exp{−|y − x|²/(2t)} dy, when t > 0.

Rt , t ≥ 0, satisfies the semigroup property:

(2.22) Rt+s = Rt ◦ Rs , t, s ≥ 0 .

Indeed, one has with the help of the simple Markov property (2.7):

Rt+s f(x) = Ex[f(Xt+s)] = Ex[f(Xs) ◦ θt] = Ex[E_{Xt}[f(Xs)]] = Ex[Rs f(Xt)] = Rt(Rs f)(x),  using (2.21).
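A numerical check of the semigroup property (2.22) in d = 1 (a sketch; the test function, grid and truncation of R are arbitrary):

```python
import numpy as np

def R(t, fvals, x, grid, dx):
    """R_t f(x) of (2.21) in d = 1, with f given by its values fvals on the grid (Riemann sum)."""
    kernel = np.exp(-(grid - x) ** 2 / (2.0 * t)) / np.sqrt(2.0 * np.pi * t)
    return np.sum(kernel * fvals) * dx

grid = np.linspace(-20.0, 20.0, 4001)
dx = grid[1] - grid[0]
fvals = np.cos(grid)                               # a bounded test function

lhs = R(1.5, fvals, 0.3, grid, dx)                 # R_{t+s} f(x) with t = 1.0, s = 0.5, x = 0.3
Rsf = np.array([R(0.5, fvals, y, grid, dx) for y in grid])   # y -> R_s f(y) on the grid
rhs = R(1.0, Rsf, 0.3, grid, dx)                   # R_t(R_s f)(x)
print(lhs, rhs)                                    # agree up to discretization/truncation error
```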

One then has the following Markovian characterization of Brownian motion (compare with
(1.15) for Gaussian processes):

Proposition 2.7. Let Bt, t ≥ 0, be an R^d-valued process defined on (Ω, A, P), with P-a.s. continuous trajectories, and Gs = σ(Bu, u ≤ s). Then

(2.23)  Bt, t ≥ 0, is a Brownian motion ⟺ B0 = 0, P-a.s., and E[f(Bt+s) | Gs] = Rt f(Bs), P-a.s., for f ∈ bB(R^d) and t, s ≥ 0.

Proof.

• ⟹: Using (1.7) and (2.21), if Bt, t ≥ 0, is a Brownian motion, for 0 = t0 < · · · < tn = s, t > 0, and f0, . . . , fn, f ∈ bB(R^d):

E[f0(Bt0) . . . fn(Btn) f(Bt+s)] = E[f0(Bt0) . . . fn(Btn) Rt f(Bs)]  (by (1.7)),

and using Dynkin's lemma, for any G ∈ Gs:

E[1_G f(Bt+s)] = E[1_G Rt f(Bs)],

from which we deduce that P-a.s., E[f(Bt+s) | Gs] = Rt f(Bs). The fact that B0 = 0, P-a.s., is automatic.

• ⟸: By induction we see that for t0 = 0 < · · · < tn, f0, . . . , fn ∈ bB(R^d):

E[f0(Bt0) . . . fn(Btn)] = ∫_{(R^d)^n} f0(0) f1(x1) . . . fn(xn) Π_{i=1}^n [2π(ti − ti−1)]^{−d/2} exp{−(1/2) Σ_{i=1}^n |xi − xi−1|²/(ti − ti−1)} dx1 . . . dxn.

As a result Bt, t ≥ 0, has the same finite marginal distributions as Xt, t ≥ 0, under W (= Wiener measure), cf. (1.7), and it thus satisfies (1.1). Our claim follows.

Strong Markov property
In order to discuss the strong Markov property we need to introduce the notion of stop-
ping times.
In the case of a discrete filtration (Ω, G, (Gn )n≥0 ) (i.e. the σ-algebras Gn , n ≥ 0, G satisfy
G0 ⊆ G1 ⊆ · · · ⊆ Gn ⊆ · · · ⊆ G), a stopping time is defined as a map T : Ω → N ∪ {∞}
(recall N = {0, 1, 2, . . . }), such that {T = n} ∈ Gn , for each n ≥ 0.
In other words “the decision to stop at a certain time n is a function of the information
known by time n”.
In the case when time varies in R+ = [0, ∞) in place of N, the “right way” to interpret
the above sentence comes in the next:
Definition 2.8. (Ω, G, (Gt )t≥0 ) where (Gt )t≥0 is assumed to be a filtration (i.e. the σ-
algebras Gt , t ≥ 0, satisfy Gs ⊆ Gt ⊆ G, for 0 ≤ s ≤ t), then T : Ω → [0, ∞] is a
(Gt )-stopping time if:

(2.24) {T ≤ t} ∈ Gt , for all t ≥ 0 .

The “σ-algebra of the past of T ” is defined as:

GT = {A ∈ G; A ∩ {T ≤ t} ∈ Gt , for each t ≥ 0}
(2.25)
(this is indeed a σ-algebra!).

Remark 2.9. Note that when T is a (Gt)-stopping time, then for any t ≥ 0,

{T < t} = ∪_{s∈Q∩[0,t)} {T ≤ s} belongs to Gt, and

{T = t} = {T ≤ t} \ {T < t} belongs to Gt. □


Examples:
1) Entrance time in a closed set:
Consider the canonical space (C, F) and Ft , t ≥ 0, as in (2.5), as well as A a closed subset
of Rd . The entrance time of X. in A is

(2.26) HA = inf{s ≥ 0; Xs ∈ A} (by convention, HA = ∞ when {. . . } = ∅) .

We will now see that

(2.27) HA is an (Ft )-stopping time .

Indeed, for w ∈ C, {s ≥ 0; Xs(w) ∈ A} is a closed subset of R+ (A is closed and s → Xs(w) is continuous), which thus contains HA(w) when it is finite.
Hence for t ≥ 0:

HA(w) > t ⟺ ∀s ∈ [0, t], dist(Xs(w), A) > 0 ⟺ inf_{[0,t]} dist(Xs(w), A) > 0.

Therefore we see that

{HA > t} = ∪_{n≥1} ∩_{s∈Q∩[0,t]} {dist(Xs(w), A) > 1/n} ∈ Ft,

and (2.27) follows.
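A small sketch of (2.26) on a time-discretized path (the closed set A = [1, ∞) ⊆ R, the grid and the path are arbitrary illustrative choices; on a grid one only approximates HA):

```python
import numpy as np

def entrance_time(path, times, dist_to_A):
    """First grid time at which the path lies in the closed set A (np.inf if it never does),
    cf. (2.26); dist_to_A(x) should return dist(x, A)."""
    for s, x in zip(times, path):
        if dist_to_A(x) == 0.0:
            return s
    return np.inf

rng = np.random.default_rng(7)
n, T = 10000, 10.0
times = np.linspace(0.0, T, n + 1)
path = np.concatenate(([0.0], np.cumsum(rng.normal(0.0, np.sqrt(T / n), size=n))))

# A = [1, infinity): dist(x, A) = max(1 - x, 0)
print(entrance_time(path, times, dist_to_A=lambda x: max(1.0 - x, 0.0)))
```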

2) Entrance time in an open set:


We now replace A with O an open subset of Rd , and of course set

(2.28) HO = inf{s ≥ 0; Xs ∈ O} .

Observe that in general {HO = t} is not Ft-measurable (and hence HO is not an (Ft)-stopping time by Remark 2.9).

[Figure: an open set O and two possible trajectories which agree up to time t, one of which has HO = t and the other not.]

One needs to “peek a little bit into the future” to decide whether HO = t or not. This motivates the use of the filtration Ft+ (= ∩_{ε>0} Ft+ε), t ≥ 0.
Proposition 2.10.

(2.29) If O is an open set of Rd , HO is an (Ft+ )-stopping time .

Proof.

{HO < s} = ∪_{u∈Q∩[0,s)} {Xu ∈ O} ∈ Fs, for s ≥ 0,

and hence

{HO ≤ s} = ∩_{ε>0} {HO < s + ε} ∈ Fs+, for s ≥ 0.

Remark 2.11. By the above argument we also see that:

(2.30)  when (Ω, G, (Gt)_{t≥0}) is such that the filtration (Gt)_{t≥0} is right-continuous (i.e. Gt = ∩_{ε>0} Gt+ε, for all t ≥ 0), then T is a (Gt)-stopping time ⟺ {T < t} ∈ Gt, for all t > 0.

(Indeed “⟹” is immediate, and for “⟸”: {T ≤ s} = ∩_{n≥1} {T < s + 1/n}, a decreasing intersection in n, which belongs to ∩_{ε>0} Gs+ε = Gs.) □

Here are now some simple useful properties:
Proposition 2.12. (Ω, G, (Gt )t≥0 )

(2.31) T stopping time =⇒ T is GT -measurable.

(2.32)  S, T stopping times, then T ∧ S (= min(T, S)) and T ∨ S (= max(T, S)) are stopping times.

(2.33)  In the case of (C, F), if T, S are (Ft+)-stopping times, then T + S ◦ θT (= T(w) + S(θ_{T(w)}(w)) when T(w) < ∞, and = ∞ when T(w) = ∞) is also an (Ft+)-stopping time.

Proof.
• (2.31):
It suffices to show that {T ≤ u} ∈ GT , for u ≥ 0, and indeed

{T ≤ u} ∩ {T ≤ t} = {T ≤ u ∧ t} ∈ Gu∧t ⊆ Gt , for all t ≥ 0 ,

and by (2.25), {T ≤ u} ∈ GT , for all u ≥ 0.

• (2.32):

{T ∧ S ≤ t} = {T ≤ t} ∪ {S ≤ t} ∈ Gt , for all t ≥ 0. Hence, T ∧ S is a (Gt )-stopping time.


{T ∨ S ≤ t} = {T ≤ t} ∩ {S ≤ t} ∈ Gt , for all t ≥ 0. Hence, T ∨ S is a (Gt )-stopping time.

• (2.33):
(Ft+)_{t≥0} is a right-continuous filtration (check it!), and by (2.30) we only need to show that for t > 0:

(2.34)  {T + S ◦ θT < t} ∈ Ft+.

To this effect, note that

(2.35)  {T + S ◦ θT < t} = ∪_{u,v∈Q∩(0,∞), u+v<t} {T < u, S ◦ θT < v}.

We will use the following claim:

(2.36)  assume {T < u} ≠ ∅; then θT : ({T < u}, F_{v+u} ∩ {T < u}) → (C, Fv) is measurable.

Indeed, for 0 ≤ s ≤ v, Xs ◦ θT is measurable as a map ({T < u}, F_{v+u} ∩ {T < u}) → (R^d, B(R^d)), as follows from the equality valid for w ∈ {T < u}:

Xs ◦ θT(w) = X_{s+T(w)}(w) = lim_{n→∞} Σ_{1≤k≤n} X_{s+ku/n}(w) 1{(k−1)u/n ≤ T < ku/n},

where X_{s+ku/n} is measurable for F_{s+u} ∩ {T < u} ⊆ F_{v+u} ∩ {T < u}, and {(k−1)u/n ≤ T < ku/n} ∈ Fu+ ∩ {T < u} ⊆ F_{v+u} ∩ {T < u}. The claim (2.36) now follows from a similar argument as (1.5). On the other hand, {S < v} = ∪_{r<v, r∈Q∩(0,∞)} {S < r} ∈ Fv, and we see that the event in the union in the right-hand side of (2.35) satisfies:

{T < u, S ◦ θT < v} = {w ∈ {T < u}; θT(w) ∈ {S < v}} ∈ F_{u+v} ∩ {T < u} ⊆ F_{u+v},  by (2.36)

(and {T < u} ∈ Fu+, in fact ∈ Fu). Thus, coming back to (2.35) we have shown that {T + S ◦ θT < t} ∈ Ft ⊆ Ft+, and (2.34) is proved, whence (2.33).

Complement:
Special characterization of FT, when T is an (Ft)-stopping time on the canonical
space C. We have the identity:

(2.37) FT = σ(XT ∧s , s ≥ 0)

(in other words FT describes the information of the trajectory X. stopped at time T ).

Proof.
• “⊇”:
This is the easier direction. We need to show that

(2.38) XT ∧s is FT -measurable for each s ≥ 0 .

For this purpose we write

(2.39)  X_{T∧s} = lim_{n→∞} Σ_{k=0}^∞ X_{(k/2^n)∧s} 1{k/2^n ≤ T < (k+1)/2^n}.

Observe now that for A ∈ F_{(k/2^n)∧s} and u ≥ 0,

A ∩ {k/2^n ≤ T < (k+1)/2^n} ∩ {T ≤ u}

is empty if u < k/2^n; when k/2^n ≤ u < (k+1)/2^n, it coincides with A ∩ {T < k/2^n}^c ∩ {T ≤ u} ∈ Fu; and when u ≥ (k+1)/2^n, it equals A ∩ {k/2^n ≤ T < (k+1)/2^n} ∈ Fu.
It thus follows that A ∩ {k/2^n ≤ T < (k+1)/2^n} ∈ FT. Then, using an approximation of X_{(k/2^n)∧s} by step functions, we conclude that X_{(k/2^n)∧s} 1{k/2^n ≤ T < (k+1)/2^n} is FT-measurable, and (2.38) follows from (2.39).

• “⊆”:
This step is more involved. We introduce the notation

(2.40)  w^t(·) := w(· ∧ t) (∈ C), for any t ≥ 0, and w ∈ C.

We will use the following
Claim:

(2.41)  f(w) = f(w^{T(w)}), for any f ∈ bFT and w ∈ C.

To see that the claim holds note that it is obviously true when T(w) = ∞, since w^{T(w)} = w in this case, and we only need to check that

(2.42)  f(w) 1{T(w) = t} = f(w^t) 1{T(w) = t}, for all t ≥ 0, and f ∈ bFT.

To see this last point we argue as follows: Using Dynkin's lemma we find that for t ≥ 0,

(2.43)  Y(w) = Y(w^t), for any Y ∈ bFt, and w ∈ C,

and since T is an (Ft)-stopping time, {T = t} ∈ Ft, and hence with (2.43):

1{T(w) = t} = 1{T(w^t) = t}, for t ≥ 0, w ∈ C.

Similarly, when f(·) is bFT, using (2.25), (2.31), we see that

f(w) 1{T(w) = t} = f(w) 1{T(w) = t} 1{T(w) ≤ t} ∈ bFt  (since f 1{T = t} ∈ bFT).

As a result, by (2.43) and the identity below (2.43), we find that for w ∈ C, t ≥ 0,

f(w) 1{T(w) = t} = f(w^t) 1{T(w^t) = t} = f(w^t) 1{T(w) = t},

and this proves (2.42) and completes the proof of (2.41).

Now, from Dynkin's lemma we see that for any f ∈ bF, there exists F bounded measurable on (R^d)^N and a sequence (tk)_{k≥0} in [0, ∞) such that:

(2.44)  f(w) = F(w(t0), w(t1), . . . , w(tk), . . . ).

Applying the claim (2.41) we thus find that for f ∈ bFT

f(w) = f(w^{T(w)}) = F(w^{T(w)}(t0), w^{T(w)}(t1), . . . , w^{T(w)}(tk), . . . )  (by (2.44))
     = F(w(T(w) ∧ t0), w(T(w) ∧ t1), . . . , w(T(w) ∧ tk), . . . )  (by (2.40))
     = F(X_{T∧t0}(w), X_{T∧t1}(w), . . . , X_{T∧tk}(w), . . . ),

and this proves that FT ⊆ σ(X_{T∧s}, s ≥ 0). □

Exercise 2.13.
1) Show that FT is generated by a countable collection of events (hint: use (2.37)).
2) Given (Ω, G, (Gt)_{t≥0}) and S, T two (Gt)-stopping times, show that

a) if S ≤ T, then GS ⊆ GT,

b) {S < T} and {S ≤ T} belong to GS ∩ GT
   (hint: write {S < T} ∩ {T ≤ t} = ∪_{s∈Q∩[0,t]} {S ≤ s} ∩ {T > s} ∩ {T ≤ t}, and {S < T} ∩ {S ≤ t} = ({S ≤ t} ∩ {T > t}) ∪ ({S < T} ∩ {T ≤ t}), and use that {S < T} ∈ GT),

c) for A ∈ GS, A ∩ {S < T} and A ∩ {S ≤ T} belong to G_{S∧T}.

We continue with the discussion of the strong Markov property. We consider (C, F). We recall that (Ft+)_{t≥0} is a right-continuous filtration, see (2.30) and above (2.34). We further observe that:

(2.45)  for T an (Ft+)-stopping time, FT+ = {A ∈ F; A ∩ {T ≤ t} ∈ Ft+, ∀t ≥ 0}  (by (2.25))
                                            = {A ∈ F; A ∩ {T < t} ∈ Ft+, ∀t ≥ 0}.

Indeed “⊆” is immediate and for “⊇”, when A ∩ {T < t} ∈ Ft+, for all t ≥ 0, then

A ∩ {T ≤ t} = ∩_{n≥1} (A ∩ {T < t + 1/n}) ∈ ∩_{ε>0} Ft+ε = Ft+, for t ≥ 0.

Theorem 2.14. (strong Markov property)

Let T be an (Ft+)-stopping time, Y ∈ bF, x ∈ R^d, then

(2.46)  Ex[Y ◦ θT | FT+] = E_{XT}[Y], on {T < ∞}, Wx-a.s.

(in other words, θT : ({T < ∞}, F ∩ {T < ∞}) → (C, F) is measurable, the random variable E_{XT}[Y] defined on {T < ∞} is FT+ ∩ {T < ∞}-measurable, and for any A ∈ FT+ ∩ {T < ∞}, Ex[Y ◦ θT 1_A] = Ex[E_{XT}[Y] 1_A], the left-hand side being well-defined since A ⊆ {T < ∞}).

Rather than discussing the proof right away we first give an application.

The reflection principle:

[Figure: d = 1, a ≥ 0, b ≤ a; a path reaching level a at time Ha = inf{s ≥ 0; Xs = a} (the “entrance time in {a}”), together with its reflection after time Ha, which ends at 2a − b when the original path ends at b ≤ a at time t.]

Theorem 2.15. (d = 1, a ≥ 0, b ≤ a)

(2.47)  W0(Xt ≤ b, sup_{s≤t} Xs ≥ a) = W0(Xt ≥ 2a − b), for t > 0, and

(2.48)  W0(sup_{s≤t} Xs ≥ a) = 2 W0(Xt ≥ a)

(in particular sup_{s≤t} Xs under W0 has the same law as |Xt|).

Proof.
• (2.47):

(2.49)  W0(Xt ≤ b, sup_{s≤t} Xs ≥ a) = W0(Ha ≤ t, Xt ≤ b)
        = W0({w ∈ C; Ha(w) ≤ t, X_{(t−Ha(w))+}(θ_{Ha(w)}(w)) ≤ b}).

We will use the following


Lemma 2.16. (T an (Ft+)-stopping time)
If h(w1, w2) is in b(F ⊗ FT+), then for any x ∈ R^d, Wx-a.s. on {T < ∞},

(2.50)  Ex[h(θ_{T(w)}(w), w) | FT+] = ∫_C h(w1, w) dW_{XT(w)}(w1)

(w1 being the variable of integration).

Proof. For h = 1_{A1}(w1) 1_{A2}(w2), A1 ∈ F, A2 ∈ FT+, (2.46) implies that for any B ∈ FT+ ∩ {T < ∞}, one has:

(2.51)  Ex[h(θ_{T(w)}(w), w) 1_B] = Ex[1_B ∫_C h(w1, w) dW_{XT}(w1)].

Then, using Dynkin's lemma and approximation, (2.51) holds for any h ∈ b(F ⊗ FT+), and ∫_C h(w1, w) dW_{XT}(w1) (defined on {T < ∞}) is FT+ ∩ {T < ∞}-measurable. Our claim follows.

We now apply the above lemma with T = Ha, and

h(w1, w2) = 1{X_{(t−Ha(w2))+}(w1) ≤ b},

which is F ⊗ FHa+-measurable, because

(w, u) → Xu(w) = lim_n Σ_{k≥0} X_{k/2^n}(w) 1_{[k/2^n, (k+1)/2^n)}(u) is F ⊗ B(R+)-measurable,

and one can realize h(w1, w2) in the following steps:

(w1, w2) ∈ C × C → (w1, (t − Ha(w2))+) ∈ C × R+ → X_{(t−Ha(w2))+}(w1) ∈ R → h(w1, w2) ∈ R+,

each step being induced by a measurable transformation relative to the natural σ-algebras. We can thus apply (2.50) to the last line of (2.49) and find (writing W_{XHa}(X_{(t−Ha)+} ∈ ·) for the inner probability over the new trajectory w1, and noting that XHa = a on {Ha ≤ t}):

W0[Ha ≤ t, Xt ≤ b] = E0[1{Ha ≤ t} W_{XHa}(X_{(t−Ha)+} ≤ b)]
                   = E0[1{Ha ≤ t} W_{XHa}(X_{(t−Ha)+} ≥ 2a − b)]  (by symmetry)
                   = W0[Ha ≤ t, Xt ≥ 2a − b]  (going backward with (2.50))
                   = W0[Xt ≥ 2a − b]  (since b ≤ a, so {Xt ≥ 2a − b} ⊆ {Ha ≤ t}).

We have thus shown that:

(2.52)  W0(Ha ≤ t, Xt ≤ b) = W0(Xt ≥ 2a − b),

and together with (2.49) this proves (2.47).

• (2.48):

(2.53)  W0[Ha ≤ t] = W0[Ha ≤ t, Xt ≥ a] + W0[Ha ≤ t, Xt ≤ a]
                   = W0[Xt ≥ a] + W0[Ha ≤ t, Xt ≤ a]
                   = W0[Xt ≥ a] + W0[Xt ≥ a] = 2 W0[Xt ≥ a]  (by (2.52) with b = a),

and this proves (2.48).
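A Monte Carlo sanity check of (2.48) (a sketch with arbitrary grid and sample sizes; the discrete-time maximum slightly underestimates the true supremum):

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(8)
n_paths, n_steps, t, a = 20000, 500, 1.0, 1.0
dt = t / n_steps
B = np.cumsum(rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps)), axis=1)

emp = np.mean(B.max(axis=1) >= a)                    # empirical W0(sup_{s<=t} X_s >= a)
exact = 1.0 - erf(a / sqrt(2.0 * t))                 # 2 * W0(X_t >= a) = 1 - erf(a / sqrt(2t))
print(emp, exact)                                    # close, up to discretization bias
```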

Corollary 2.17. For a ∈ R, Ha is W0-a.s. finite and has distribution:

(2.54)  W0(Ha ∈ ds) = (1/√(2π)) (|a|/s^{3/2}) exp{−a²/(2s)} 1_{(0,∞)}(s) ds.

The joint law of Xt and sup_{s≤t} Xs, for t > 0, is given by:

(2.55)  W0(Xt ∈ db, sup_{s≤t} Xs ∈ da) = 2 ((2a − b)/√(2π t³)) exp{−(2a − b)²/(2t)} 1{a > 0, b < a} da db.
Proof.
• (2.54): For a > 0,

W0[Ha ≤ s] = W0[|Xs| ≥ a]  (by (2.48))  = W0[|X1| ≥ a/√s]  (by scaling)  → 1, as s → ∞,

and hence W0[Ha < ∞] = 1. Moreover we have:

(2.56)  W0[Ha ≤ s] = W0[sup_{u≤s} Xu ≥ a] = 2 W0[X1 ≥ a/√s] = (2/√(2π)) ∫_{a/√s}^∞ e^{−u²/2} du  (by (2.48)).

Differentiating in s we find (2.54).

• (2.55): Consider 0 < a, b < a; we have

(2.57)  W0[Xt ≤ b, sup_{s≤t} Xs ≥ a] = W0[Xt ≥ 2a − b]  (by (2.47))  = (1/√(2πt)) ∫_{2a−b}^∞ e^{−x²/(2t)} dx
        = (2/√(2πt)) ∫_a^∞ e^{−(2u−b)²/(2t)} du  (setting x = 2u − b)
        = ∫_{[a,∞)×(−∞,b]} f(u, v) du dv  (differentiating in b), if

f(u, v) = 2 ((2u − v)/√(2π t³)) exp{−(2u − v)²/(2t)} 1{u > 0, v < u}

is the probability density that appears in the right-hand side of (2.55). This probability density is concentrated on the open set:

∆ = {(x, y) ∈ R²; x > 0, y < x}.

Note that the same holds true for the joint law of (sup_{s≤t} Xs, Xt) under W0. Indeed observe that

(2.58)  Bs := X_{t−s} − Xt, 0 ≤ s ≤ t,

is a Brownian motion with time parameter [0, t] (because it is a centered Gaussian process with continuous trajectories and covariance E0[Bs Bs′] = s ∧ s′, 0 ≤ s, s′ ≤ t). We know from (2.16) that the hitting time of (0, ∞) by B. or by X. is a.s. equal to 0 (one can also see this from (2.47)). Therefore we have:

(2.59)  W0-a.s., sup_{s≤t} Xs > Xt, and sup_{s≤t} Xs > 0.

As a result the joint law of (sups≤t Xs , Xt ) under W0 is supported by ∆. Now the collection
of subsets of ∆ of the form [a, ∞) × (−∞, b], with a > 0, b < a, is a π-system, which
generates B(∆) (the Borel subsets of ∆). By (2.57) and Dynkin’s lemma we can conclude
that (2.55) holds.

Remark 2.18. The collection of subsets [a, ∞) × (−∞, b] of the closure ∆̄ = {(x, y) ∈ R²; x ≥ 0, y ≤ x}, with a ≥ 0, b ≤ a, is not rich enough to generate B(∆̄) (their trace on {(x, y) ∈ R²; x = y ≥ 0} ⊆ ∂∆ consists of at most a point). This is why we work with ∆ and not ∆̄. □
As a preparation for the proof of the strong Markov property, we introduce the following
Definition 2.19. Given (Ω, G, (Gt )t≥0 ), an Rd -valued process Zu (ω), u ≥ 0, ω ∈ Ω, is
called progressively measurable if the restriction of Z to Ω × [0, t] is Gt ⊗ B([0, t])-
measurable for each t ≥ 0.

Example:
A process Zu(ω) right-continuous in u and adapted (i.e. Zu(·) is Gu-measurable for each u ≥ 0) is progressively measurable, because on [0, t] × Ω:

(2.60)  Zs(ω) = lim_{n→∞} Σ_{k=1}^n Z_{(k/n)t}(ω) 1{((k−1)/n)t ≤ s < (k/n)t} + Zt(ω) 1{s=t},

and each summand is B([0, t]) ⊗ Gt-measurable.

The interest of this notion in our context comes from the next

Lemma 2.20. Given (Ω, G, (Gt)_{t≥0}), Z a progressively measurable process, and T a (Gt)-stopping time, one has

(2.61)  ZT (i.e. ω ∈ {T < ∞} → Z_{T(ω)}(ω) ∈ R^d) is GT ∩ {T < ∞}-measurable.

Proof. It suffices to show that for any t ≥ 0,

(2.62)  Z_T : ({T ≤ t}, Gt ∩ {T ≤ t}) → (R^d, B(R^d)) is measurable.

But the above map is the composition of the measurable maps

ω ∈ {T ≤ t} → (ω, T(ω)) ∈ Ω × [0, t]  and  (ω, u) ∈ Ω × [0, t] → Zu(ω) ∈ R^d

(relative to the σ-algebras Gt ∩ {T ≤ t}, Gt ⊗ B([0, t]) and B(R^d) respectively). The first map is measurable because both maps

Id : ({T ≤ t}, Gt ∩ {T ≤ t}) → (Ω, Gt)  and  T : ({T ≤ t}, Gt ∩ {T ≤ t}) → ([0, t], B([0, t]))

are measurable, and the second map is measurable because Z is progressively measurable.
We now turn to the proof of the strong Markov property:

Proof. We will prove Theorem 2.14 in a number of steps. The first step is to show that:

(2.63) θT : ({T < ∞}, F ∩ {T < ∞}) −→ (C, F) is measurable .

Due to (1.5) we only need to show that for s ≥ 0,

(2.64) Xs ◦ θT : ({T < ∞}, F ∩ {T < ∞}) −→ (Rd , B(Rd )) is measurable ,

and in the spirit of the proof of (2.36), we write for w ∈ {T < ∞}

Xs ◦ θT(w) = X_{s+T(w)}(w) = lim_{n→∞} Σ_{k≥1} X_{s+k/n}(w) 1{(k−1)/n ≤ T < k/n},

where each X_{s+k/n} is measurable for F ∩ {T < ∞} and each event {(k−1)/n ≤ T < k/n} belongs to F ∩ {T < ∞}, whence (2.64) and therefore (2.63).


The second step is then to show that:

(2.65) XT : ({T < ∞}, FT+ ∩ {T < ∞}) → (Rd , B(Rd )) is measurable .

To this effect we note that Xt (ω) is a progressively measurable process due to (2.60) and
the claim (2.65) now follows from (2.61).
Note that y ∈ Rd → Ey [Y ] is measurable for any Y ∈ bF, as shown in (2.9) (or in
other words y ∈ Rd , A ∈ F → Wy (A) ∈ [0, 1], is a probability kernel). Combining this
observation with (2.63) and (2.65), the statement in (2.46) now makes sense.
The third step is to show that:

when T takes an at most denumerable set of values in R+ ∪ {∞},


(2.66)
then (2.46) is true .

This step will follow from a direct application of the simple Markov property. We write
an , 0 ≤ n < N (≤ ∞) for the set of values of T in [0, ∞).
Then for A ∈ FT+ ∩ {T < ∞}, 0 = t0 < · · · < tk, f ∈ bB((R^d)^{k+1}) we find:

(2.67)  Ex[f(Xt0, . . . , Xtk) ◦ θT 1_A] = Ex[f(X_{T+t0}, . . . , X_{T+tk}) 1_A]
        = Σ_n Ex[f(X_{an+t0}, . . . , X_{an+tk}) 1_{A∩{T=an}}]  (with A ∩ {T = an} ∈ F_{an}+)
        = Σ_n Ex[E_{X_{an}}[f(Xt0, . . . , Xtk)] 1_{A∩{T=an}}]  (by the simple Markov property (2.7))
        = Ex[E_{XT}[f(Xt0, . . . , Xtk)] 1_A],

where the summation runs over the set 0 ≤ n < N (such that an ∈ [0, ∞)) in lines two and three of (2.67). We can now use Dynkin's lemma and approximation to deduce that for Y ∈ bF, one has

Ex[Y ◦ θT 1_A] = Ex[E_{XT}[Y] 1_A],

and obtain (2.66).
The last step of the proof will be:

(2.68) the claim (2.46) holds for T a general (Ft+ )-stopping time .

For this purpose, we use the discrete skeleton approximation of T:

(2.69)  Tn = Σ_{k≥0} ((k + 1)/2^n) 1{k/2^n ≤ T < (k+1)/2^n} + ∞ · 1{T = ∞}.

The key observation is that:

(2.70)  for each n, Tn is an (Ft+)-stopping time, and Tn ↓ T as n → ∞.

Indeed the fact that Tn ≥ T and Tn ↓ T as n → ∞ is obvious from (2.69). In addition, for k, n ≥ 0,

{Tn ≤ (k+1)/2^n} = {T < (k+1)/2^n} ∈ F_{(k+1)/2^n}+  (by (2.69)),

and for t ∈ [k/2^n, (k+1)/2^n) we have

{Tn ≤ t} = {Tn ≤ k/2^n} ∈ F_{k/2^n}+ ⊆ Ft+,

so that Tn is an (Ft+)-stopping time.


Since T ≤ Tn , it follows, cf. Exercise 2.13 2) a), that

(2.71) FT+ ⊆ FT+n , for n ≥ 0 .

Consider A ∈ FT+ ∩ {T < ∞}. Since {T < ∞} = {Tn < ∞}, we also have A ∈ FT+n ∩ {Tn <
∞}, and applying (2.66) we see that for Y ∈ bF:

Ex [Y ◦ θTn 1A ] = Ex [EXTn [Y ] 1A ] .

Specializing to the case where 0 = t0 < · · · < tk and Y = f (Xt0 , . . . , Xtk ), with f bounded
continuous on (Rd )k+1 , we obtain that for n ≥ 0:

(2.72) Ex [f (XTn +t0 , . . . , XTn +tk ) 1A ] = Ex [EXTn [f (Xt0 , . . . , Xtk )] 1A ] .

We also know from (2.12) that:

y ∈ Rd −→ Ey [f (Xt0 , . . . , Xtk )] is a bounded continuous function .

41
Therefore, letting n tend to infinity in (2.72) we find that

(2.73) Ex [f (XT +t0 , . . . , XT +tk ) 1A ] = Ex [EXT [f (Xt0 , . . . , Xtk )] 1A ] .

By the same argument as below (2.12), we then find that (2.73) holds for f (x0 , . . . , xk ) =
Qk d
i=0 1Ki (xi ), with Ki , i = 0, . . . , k, closed subsets of R , and then by Dynkin’s lemma
and approximation we obtain that

Ex [Y ◦ θT 1A ] = Ex [EXT [Y ] 1A ] ,

for Y ∈ bF and A ∈ FT+ ∩ {T < ∞}. This identity, recall (2.65), (2.9), now completes the
proof of (2.46).

Complement:
What can go wrong when going from the simple to the strong Markov property:

A typical example is given by the following process:

state space is R+ .
0

The process waits an exponential time of parameter 1 in 0, and afterwards moves with
unit speed to the right. If it starts in x > 0, it simply moves to the right with unit speed.
We denote by Px , x ≥ 0, the law on (C(R+ , R+ ), F) of the process starting at x ≥ 0,
with F the canonical σ-algebra on C(R+ , R+ ).
For t ≥ 0, one defines the operator Rt : bB(R+ ) → bB(R+ ) in analogy with (2.21), via:
def
Rt f (x) = Ex [f (Xt )] = f (x + t), if x > 0,
(2.74) Z t
= e−t f (0) + e−u f (t − u)du, if x = 0 .
0

Note that Rt , t ≥ 0, has the semigroup property:

(2.75) Rt+s = Rt ◦ Rs , for s, t ≥ 0 .

Indeed, when f ∈ bB(R+ ) and x > 0,

Rt (Rs f )(x) = (Rs f )(x + t) = f (x + t + s) = Rt+s f (x) ,

42
and when x = 0,
Z t
Rt (Rs f )(0) = e−t Rs f (0) + e−u Rs f (t − u) du
0
Z s Z t
−t−s −t −v
=e f (0) + e e f (s − v) dv + e−u f (t + s − u) du
0 0
Z s Z t
−(t+s) −(t+v)
=e f (0) + e f (s − v) dv + e−u f (t + s − u) du
0 0
Z t+s Z t
setting t+v=u −(t+s) −u
= e f (0) + e f (t + s − u) du + e−u f (t + s − u) du
t 0

= Rt+s f (0), whence (2.75) .

Moreover, one has the regularity:

Rt f (x) −→ f (x), for x ≥ 0, when f is continuous bounded


(2.76) t→0
(by direct inspection of (2.74)) .

One checks that Ex [f (Xt+s )|Fs ] = Rt f (Xs ), Px -a.s., for t, s ≥ 0, f ∈ bB(R+ ), and x ≥ 0
(one looks separately at the events {Xs > 0} and {Xs = 0}). From this identity one can
deduce that X. has the simple Markov property with respect to (Ft )t≥0 . One can further
check that:

(2.77) X. has the simple Markov property with respect to (Ft+ )t≥0 .

In essence as below (2.12) one uses the fact that for g continuous bounded, x ≥ 0,
ε→0
EXs+ε [g(Xt0 , . . . , Xtk )] −→ EXs [g(Xt0 , . . . , Xtk )], Px -a.s.

and this is done by looking separately at the events {Xs > 0} and {Xs = 0}.
However, the process is not strong Markov! For instance, H(0,∞) the entrance time
in (0, ∞) is an (Ft+ )-stopping time, cf. (2.29), and P0 -a.s. H(0,∞) > 0, but on the other
hand, H(0,∞) ◦ θH(0,∞) = 0, P0 -a.s., so that:
 
0 = E0 1{H(0,∞) > 0} ◦ θH(0,∞) ] 6=E[EXH [1{H(0,∞) > 0}] = 1
(0,∞)
(2.78)
(since XH(0,∞) = 0, P0 -a.s.).

Roughly speaking the problem is that P0 does not describe the motion of XH(0,∞) +· , i.e.
of X. after time H(0,∞) . Note that even when f is smooth one can have for t > 0,

(2.79) lim Rt f (x) = f (t) 6= Rt f (0) : Rt f is not continuous!


x→0+

43
So, the crucial property (2.12) in the Brownian case, which was used below (2.72), is not
n
satisfied in the present example. Indeed, if H(0,∞) denotes the discrete skeleton of H(0,∞) ,
cf. (2.69), for bounded continuous g,

EXH n [g(Xt0 , . . . , Xtk )] need not P0 -a.s., converge for n → ∞, to


(0,∞)
(2.80)
EXH [g(Xt0 , . . . , Xtk )] = E0 [g(Xt0 , . . . , Xtk )] .
(0,∞) P0 −a.s.

This point should be contrasted with (2.77). 

44
Chapter 3: Some Properties of the Brownian Sample Path
We will now discuss some typical properties of the Brownian sample paths. From this
discussion the “roughness” of the typical sample path will be apparent. We begin with
the quadratic variation and the variation of the sample path.
Theorem 3.1. (d = 1, on the canonical space (C, F, W0 ))
For t > 0, W0 -a.s., and in L2 (W0 ),
X 2
(3.1) lim X k+1 −X k = t,
n→∞ n 2 2n
k≥0, k+1
2n
≤t

(3.2) W0 -a.s., the map t ≥ 0 → Xt (w) ∈ R, has infinite variation


on any [a, b], 0 ≤ a < b .

Proof.
• (3.1): We set
def 2
(3.3) ∆k,n = X k+1 −X k , for k, n ≥ 0 .
n 2 2n

For fixed n, by (1.1), the ∆k,n , k ≥ 0, are i.i.d. under W0 , with mean 2−n . Moreover, we
find that
h X 2 i hn X  [t2n ]  o2 i
E0 ∆k,n − t = E0 (∆k,n − 2−n ) − t − n =
k+1 k+1 | {z 2 }
2n
≤t 2n
≤t
=an
h X i h X 2 i
(3.4) 2 −n −n
an − 2an E0 (∆k,n − 2 ) +E0 (∆k,n − 2 ) .
k+1 k+1
2n
≤t 2n
≤t
| {z }
=0

Since we have to do with the variance of a sum of i.i.d. centered variables, we find:
h X 2 i
E0 (∆k,n − 2−n ) = [2n t] E0 [(∆0,n − 2−n )2 ]
k+1
2n
≤t
(3.5) and since ∆0,n is distributed as 2−n X12 under W0
[2n t]
= E0 [(X12 − 1)2 ] .
22n
We have thus found that
h X 2 i [2n t]
(3.6) E0 ∆k,n − t = a2n + 2n E0 [(X12 − 1)2 ] is summable in n .
k+1
2
2n
≤t
P 2
From this we deduce that ( k+1
2n
≤t ∆k,n − t) converges a.s. and in L (W0 ) to 0 as n → ∞.
The claim (3.1) now follows.

45
• (3.2):
The set of w ∈ C for which there exists 0 ≤ a < b < ∞, such that t → Xt (w) has finite
variation on [a, b] equals the event
[
(3.7) {w ∈ C : Vr,s (w) < ∞} ,
r<s in Q∩[0,∞)

where Vr,s (w) denotes the random variable:


k
X
(3.8) Vr,s (w) = sup |Xti (w) − Xti−1 (w)| .
r=t0 <···<tk =s
rationals i=1

If (3.2) did not hold, then for some 0 ≤ r0 < s0 ∈ Q ∩ [0, ∞) one would have

(3.9) W0 [Vr0 ,s0 < ∞] > 0 .

However on the event {Vr0 ,s0 < ∞},


X 2
X
X k+1 − X kn ≤ sup |Xu − Xv | X k+1 −X k
n 2 2 |u−v|≤2−n
n2 2n
r0 ≤ 2kn , k+1
2n
≤s0 u,v≤s0 r0 ≤ 2kn , k+1
2n
≤s0
(3.10)
≤ sup |Xu − Xv | Vr0 ,s0 −→ 0, thanks to the continuity of the
|u−v|≤2−n
n→∞
u,v≤s0 trajectory t → Xt .

On the other hand by (3.1) and the continuity of the trajectory, we find that W0 -a.s.,
X 2
(3.11) X k+1 − X kn −→ s0 − r0 > 0, thus contradicting (3.10) .
n 2 2 n→∞
r0 ≤ 2kn , k+1
2n
≤s0

This proves (3.2).

Remark 3.2. Using Dini’s second theorem (i.e. a sequence of non-decreasing functions
on a compact interval I ⊆ R, converging to a continuous function, converges uniformly on
I to this function), we deduce from (3.1) that
X 2
(3.12) W0 -a.s., for any N ≥ 1, lim sup X k+1 −X k − t = 0.
n→∞ 0≤t≤N n2 2n
0≤ k+1
2n
≤t

Exercise 3.3. Consider for 0 ≤ r < s and w ∈ C, the function theoretic quadratic
variation of w on [r, s]
k
X
V2,r,s (w) = sup |Xti (w) − Xti−1 (w)|2 .
r≤t0 <···<tk ≤s
rationals i=1

46
Show that (in spite of (3.1)):

W0 -a.s., V2,r,s = ∞ for all 0 ≤ r < s < ∞ in Q ∩ [0, ∞) .

(Hint: Take advantage of (2.19) to construct partitions of [r, s] for which |Xti − Xti−1 | ≥

K ti − ti−1 occurs often. See also [5], Exercise 2.4, p. 345).


Our next objective is the law of the iterated logarithm.

Theorem 3.4. (A. Khinchin, 1933).


q 1

i) W0 -a.s., lim Xt 2t log log = 1, “small time behavior”
t→0 t
(3.13)
q 1

ii) W0 -a.s., lim Xt 2t log log = −1 ,
t→0 t

and
√ 
i) W0 -a.s., lim Xt 2t log log t = 1, “large time behavior”
t→∞
(3.14) √ 
ii) W0 -a.s., lim Xt 2t log log t = −1 .
t→∞

Proof. Under W0 , (−Xt )t≥0 is also a Brownian motion, so that we only need to prove
(3.13) i) and (3.14) i).
Moreover, we know from (1.20), (1.21), that

βs = s X1/s , s > 0
(3.15)
= 0, s = 0

is a Brownian motion. Thus, if we can prove (3.13) i), it follows that W0 -a.s.
r r
 1  2 1
1 = lim s X1/s 2s log log = lim X1/s log log .
s→0 s s→0 s s

Setting t = 1s , we then find (3.14) i).


As a result, we only need to prove (3.13) i).

First step: “the upper bound”.


q
def
We set ϕ(t) = 2t log log 1t , our goal is to prove that

Xt
(3.16) W0 -a.s., lim ≤ 1.
t→0 ϕ(t)

47
The idea is to use Borel-Cantelli’s lemma, and to produce some decoupling, we look at
geometrically decreasing times. Indeed, we choose δ > 0 and q ∈ (0, 1) (δ will be small
and q close to 1), so that

(3.17) (1 + δ)2 q > 1, and define

(3.18) tn = q n , n ≥ 0, (note that tn ↓ 0) ,

(3.19) An = {w ∈ C; for some t ∈ [tn+1 , tn ], Xt > (1 + δ) ϕ(t)}, n ≥ 0 .

Note that ϕ is non-decreasing on [0, T ], when T is small and positive, because

def ϕ2 (t) 1
ψ(t) = = t log log , so that
2 t
1 t 1 1 1
ψ ′ (t) = log log + 1 × − t = log log t − > 0, for t small .
t log t log 1t
As a result, we see that for large enough n

W0 (An ) ≤ W0 sup Xs > (1 + δ) ϕ(tn+1 ) (recall tn+1 < tn )
0≤s≤tn

(2.48)
(3.20) = 2W0 (Xtn > (1 + δ) ϕ(tn+1 ))
r n x2 o
(1.35) 2 1 def √
≤ exp − n , with xn = (1 + δ) ϕ(tn+1 )/ tn .
π xn 2

Note that
p
xn = (1 + δ) 2q n+1−n log log q −n−1
h  i 1
1 2
= (1 + δ) 2q log (n + 1) log
q
λ 1/2
= [2 log{(α(n + 1)) }] with α = log(1/q)
(3.17)
λ = q(1 + δ)2 > 1 .

Coming back to the last line of (3.20), we find that


r
2 1 1
(3.21) W0 (An ) ≤ for large n .
π α (n + 1)λ
λ

Since λ > 1, by (3.17), it follows that:


X
(3.22) W0 (An ) < ∞ ,
n

and by the first lemma of Borel-Cantelli, we see that

(3.23) W0 -a.s., An occurs only finitely many times .

48
As a result we obtain W0 -a.s., limt→0 Xt /ϕ(t) ≤ 1 + δ. Letting δ → 0 (this is possible,
cf. (3.17)), we obtain (3.16).

Second step: “the lower bound”


Xt
(3.24) W0 -a.s., lim ≥ 1.
t→0 ϕ(t)

To this end we choose q ∈ (0, 1), ε ∈ (0, 12 ), and define tn , n ≥ 0, as in (3.18). Here both
ε and q will be chosen small, see (3.29) below.
We will use the lower bound (in the spirit of (1.35)):

x 1 n x2 o
(3.25) P [ξ > x] ≥ √ exp − , where x > 0 and ξ is N (0, 1)-distributed
x2 + 1 2π 2
x2 R +∞ z2 R∞ z2
(indeed x−1 e− 2 = x (1 + z −2 ) e− 2 dz ≤ (1 + x−2 ) x e− 2 dz, whence (3.25)).
As a result, setting now xn = (1 − ε) ϕ(tn )(tn − tn+1 )−1/2 , we find for large n that:

(1.1) (3.25)
W0 [Xtn − Xtn+1 > (1 − ε) ϕ(tn )] = W0 [X1 > xn ] ≥
(3.26) n x2 o √ −1 n x2 o
√ −1
2π xn (1 + x2n )−1 exp − n ≥ 2π (2xn )−1 exp − n .
2 2
Moreover, we have
r 
1−ε 1 p
(3.27) xn = √ 2 log n log = β log(αn), with α = log 1q and
1−q q
(1 − ε)2
(3.28) β=2 .
1−q
We assume that q is small enough so that
ε2
(3.29) q< (and in particular, as a result β < 2) .
4

Then the variables Xtn − Xtn+1 , n ≥ 0, are independent and


c
(3.30) W0 (Xtn − Xtn+1 > (1 − ε) ϕ(tn )] ≥ n−β/2 , for large n .
(log n)1/2

The above expression is the general term of a divergent series. Hence, the second lemma
of Borel-Cantelli yields that

(3.31) W0 -a.s., for infinitely many n, Xtn − Xtn+1 > (1 − ε) ϕ(tn ) .

49
From the upper bound (3.16) applied to (−Xt )t≥0 , we see that W0 -a.s., for large n, Xtn ≥
−(1 + ε) ϕ(tn ), and therefore

W0 -a.s., for infinitely many n


(3.32) Xtn = Xtn − Xtn+1 + Xtn+1 ≥ (1 − ε) ϕ(tn ) − (1 + ε) ϕ(tn+1 )
= ϕ(tn )[1 − ε − (1 + ε) ϕ(tn+1 )/ϕ(tn )] .

Note that by (3.29):

√ (3.29) ε
(3.33) lim ϕ(tn+1 )/ϕ(tn ) = q < ,
n 2
and it follows from (3.32) that

(3.34) W0 -a.s., for infinitely many n, Xtn ≥ ϕ(tn )(1 − 2ε) ,

so that W0 -a.s., limt→0 Xt /ϕ(t) ≥ 1 − 2ε. Letting ε tend to zero along some sequence, we
deduce (3.24).

Remark 3.5. (further extensions and related results)


1) There is a “functional” extension of the law of the iterated logarithm due to V. Strassen
(1964). Given w ∈ C, one considers the subset of C([0, 1]; R) (endowed with the sup-norm):
n Xut (w) o
Fw = f ∈ C([0, 1]; R), for some t ≥ 10, f (u) = √ , 0≤u≤1 .
2t log log t
Theorem 3.6.

(3.35) W0 -a.s., Fw is relatively compact, and the set of limit points of


p
(Xut / 2t log log t)0≤u≤1 , as t → ∞, coincides with:

n Z u
(3.36) K = f ∈ C([0, 1]; R); f (u) = g(s) ds for some g ∈ L2 ([0, 1], ds)
0
Z 1 o
with g2 (s) ds ≤ 1 (of course K is compact) .
0

For the proof see [2], p. 21. Note that when f runs over K, f (1) runs over [−1, 1]
R1
(indeed |f (1)| ≤ ( 0 g 2 (s)ds)1/2 ) ≤ 1, and f (u) = au, with |a| ≤ 1 belongs to K). From
this one recovers that:
Xt
(3.37) W0 -a.s., the set of limit points of √ , as t → ∞, equals [−1, 1] ,
2t log log t
which in essence is a restatement of (3.14).

√ 3.7. Given T > 0, what is the W0 -a.s. set of limit points as t → ∞, of


Exercise
(Xut / 2t log log t)0≤u≤T in C([0, T ]; R) endowed with the sup-norm? 

50
2) Another related result is Lévy’s modulus of continuity for Brownian motion.
Theorem 3.8. (P. Lévy, 1937)

1
(3.38) W0 -a.s., lim q sup |Xt − Xs | = 1 .
u→0 1 0≤s<t≤1
2u log u t−s≤u

For the proof, which has a similar flavour as the proof of the law of the iterated
logarithm, see for instance [8], p. 114.
q
Note that in (3.38), 2t log log 1t in (3.13) is replaced with the “bigger” function
q
2t log 1t . This has to do with the fact that in (3.38) one also takes the supremum over
|X −Xs |
the “starting point Xs ”, whereas for fixed s, W0 -a.s., limu→0 q s+u = 1.
2u log log u1

3) A further law of the iterated logarithm was proved by K.L. Chung (1948). It governs
the small values of sup0≤s≤t |Xs |.
Theorem 3.9.
 log log t  1 π
2
(3.39) W0 -a.s., lim sup |Xs | = √ .
t→∞ t 0≤s≤t 8

This shows that sup0≤s≤t |Xs | cannot grow too slowly. On the other hand it follows
from (3.35), (3.36) that it cannot grow too fast and
 1 1
2
(3.40) W0 -a.s., lim sup |Xs | = 1 .
t→∞ 2t log log t 0≤s≤t


We will now conclude this short chapter with a discussion of the Hölder property of
the Brownian path.
Proposition 3.10. For γ ∈ (0, 12 ),

|Xt − Xs |
(3.41) W0 -a.s., for any T > 0, sup < ∞,
0≤s<t≤T (t − s)γ

and if γ ≥ 12 ,

|Xt − Xs |
(3.42) W0 -a.s., for any 0 ≤ a < b < ∞, sup =∞
a≤s<t≤b (t − s)γ

(so, the Brownian path is Hölder continuous with exponent γ for γ < 21 , but not for larger
γ).

51
Proof.
• (3.41):
We use Kolmogorov’s criterion, cf. (1.57), (1.58). Indeed for 0 ≤ s < t, m > 1:
scaling
E0 [(Xt − Xs )2m ] = (t − s)m E0 [X12m ] .

It now follows from (1.57), with the choices r = 2m, α = m − 1, β ∈ (0, m−1
2m ), that for
N ≥ 1, R > 0,
h |Xt − Xs | i K(m, β, N )
W0 sup ≥ R ≤ E0 [X12m ] ,
0≤s<t≤N (t − s)β R2m

so that for β ∈ (0, 12 − 1


2m ) one finds

|Xt − Xs |
(3.43) W0 -a.s., sup < ∞, for all N ≥ 1 .
0≤s<t≤N (t − s)β

Picking m large enough, (3.41) follows.

• (3.42):
From (3.13), we see that

|Xs+u − Xs |
W0 -a.s., for all s ∈ Q ∩ [0, ∞), lim q = 1,
u→0
u>0 2u log log u1

1
so that for γ ≥ 2

W0 -a.s., for all s ∈ Q ∩ [0, ∞),


q
(3.44) |Xs+u − Xs | |Xs+u − Xs | 2u log log u1
lim = lim q = ∞.
u→0
u>0
uγ u→0
u>0 2u log log 1 uγ
u

The claim (3.42) immediately follows.

Remark 3.11. Of course the above proposition offers a weaker result than the aforemen-
tioned Lévy’s modulus of continuity (3.38):
1
W0 -a.s., lim q sup |Xt − Xs | = 1 .
u→0 1 0≤s<t≤T
2u log u t−s≤u

52
Chapter 4: Stochastic Integrals
The fact that Brownian motion is a continuous martingale will now play a major role
in this chapter. We know from (3.2) that W0 -a.s., t → Xt (w) has infinite variation on any
non-trivial interval of R+ . As explained in the introduction, this precludes the definition
of a Stieltjes-type integral “dXs (w)”, because “dXs (w) is not a signed measure”. The
next proposition will play a key role.

Proposition 4.1. (d = 1, on the canonical space (C, F, W0 ))

(4.1) Xt is an (Ft+ )-martingale,

(4.2) Xt2 − t is an (Ft+ )-martingale.

Proof. For 0 ≤ s < t:

E0 [Xt − Xs | Fs+ ] = E0 [(Xt−s − X0 ) ◦ θs | Fs+ ]


(4.3) (2.7)
= EXs [Xt−s − X0 ] = 0, W0 -a.s.,

an likewise:
E0 [Xt2 − Xs2 − (t − s) | Fs+ ] = E0 [2(Xt − Xs ) Xs + (Xt − Xs )2 − (t − s) | Fs+ ] =
(4.3)
(4.4) 2Xs E0 [Xt − Xs | Fs+ ] + E0 [(Xt−s − X0 )2 ◦ θs | Fs+ ] − (t − s) =
(2.7)
EXs [(Xt−s − X0 )2 ] − (t − s) = 0, W0 -a.s. .

The claims (4.1), (4.2) now follow since Xt and Xt2 − t are (Ft+ )-adapted.

We will later see that the above two continuous martingales characterize Brow-
nian motion! (a fact due to Paul Lévy). The increasing process t that appears in (4.2)
coincides with the limit of the quadratic variation of the Brownian path as discussed in
(3.12).
Before discussing the construction of stochastic integrals, we introduce the following

Definition 4.2. We say that a filtered probability space (Ω, G, (Gt )t≥0 , P ) satisfies the
usual conditions if:

(4.5) G0 contains all sets N ∈ G with P (N ) = 0, and


T
(4.6) (Gt )t≥0 is right-continuous (i.e. Gt = ε>0 Gt+ε , for t ≥ 0) .

Example:
We consider the canonical space (C, F, W0 ) for the d-dimensional Brownian motion and
define for t ≥ 0 the σ-algebra:
def
(4.7) Ft = {A ∈ F; ∃B ∈ Ft with 1A = 1B , W0 -a.s.} .

53
Proposition 4.3.

(4.8) Ft+ ⊆ Ft , for t ≥ 0,

(4.9) (Ft )t≥0 , is right-continuous,

(4.10) (C, F, (Ft )t≥0 , W0 ) satisfies the usual conditions.

Proof.
• (4.8): Observe that for t ≥ 0,

(4.11) for any A ∈ F, there exists a Y ∈ bFt , such that E0 [1A | Ft+ ] = Y , W0 -a.s. .

Indeed, when A is of the form, t0 = 0 < · · · < tk = t, 0 < s1 < · · · < sm , Di ∈ B(Rd ), for
0 ≤ i ≤ k + m,

A = {Xt0 ∈ D0 , . . . , Xtk ∈ Dk , Xtk +s1 ∈ Dk+1 , . . . , Xtk +sm ∈ Dk+m } ,

it follows from (2.7) that


W0 −a.s.
E0 [1A | Ft+ ] = 1{Xt0 ∈ D0 , . . . , Xtk ∈ Dk } WXt [Xs1 ∈ Dk+1 , . . . , Xsm ∈ Dk+m ] ,

which is Ft -measurable.
The claim (4.11) now follows from Dynkin’s lemma.
As a result of (4.11), we see that for A ∈ Ft+ ,

(4.12) 1A = E0 [1A | Ft+ ] = Y, W0 -a.s., for some Y ∈ bFt .


def
/ {0, 1}) = 0, and hence Ye = 1{Y = 1} = Y , W0 -a.s.. Therefore,
It follows that W0 (Y ∈
we have

(4.13) 1A = 1{Y =1} , W0 -a.s., with {Y = 1} ∈ Ft ,

and (4.8) follows in view of (4.7).

• (4.9):
def T
Let A ∈ Ft+ ( = ε>0 Ft+ε ), then for each n ≥ 1, by (4.7) we can find Bn ∈ Ft+ 1 , with
n

(4.14) 1A = 1Bn , W0 -a.s. .

We now define B ∈ Ft+ via:

(4.15) 1B = lim sup 1Bn


n

(B is indeed Ft+ -measurable because it belongs to Ft+ε for each ε > 0). By (4.14) we find
that

(4.16) 1B = 1A , W0 -a.s.,

54
and by (4.8) since B ∈ Ft+ , we can find C ∈ Ft such that

1C = 1B = 1A , W0 -a.s. .

This proves that A ∈ Ft , and hence (4.9) holds.

• (4.10):
From (4.7) we see that N ∈ F with W0 (N ) = 0, belongs to F0 , so (4.5) holds. With (4.9)
it follows that (C, F, (Ft )t≥0 , W0 ) satisfies the usual conditions.

Remark 4.4.
1) Note that (4.8) can be seen as a generalization of Blumenthal’s 0− 1 law (2.15). Indeed,
when t = 0, (4.8) implies that for any A ∈ F0+ one can find a B ∈ F0 such that 1A = 1B ,
W0 -a.s.. Moreover, B ∈ F0 is of the form B = {X0 ∈ C}, for some C ∈ B(R). Hence,
W0 (B) = 1, if 0 ∈ C, and W0 (B) = 0, if 0 ∈/ C. This shows that W0 (A) = W0 (B) ∈ {0, 1},
and we recover (2.15).
2) From (4.1), (4.2) it naturally follows that

(4.17) Xt is an (Ft )t≥0 -martingale,

(4.18) Xt2 − t is an (Ft )t≥0 martingale,

when one considers (C, F, (Ft )t≥0 , W0 ), (d = 1). 

We will now begin the discussion of stochastic integrals. We assume that

(4.19) (Ω, G, (Gt )t≥0 , P ) is a filtered probability space satisfying


the usual conditions (4.5), (4.6),

(4.20) Xt , t ≥ 0, is a continuous square integrable (Gt )-martingale,

(4.21) Xt2 − t, t ≥ 0, is a (Gt )-martingale.

A concrete example of this situation occurs for instance in (4.10), (4.17), (4.18), when
considering (C, F, (Ft )t≥0 , W0 ) and the canonical process (Xt )t≥0 .

Remark 4.5. We will later see, cf. Theorem 5.2 in Chapter 5, that when (Mt )t≥0 is a
continuous square integrable martingale on (Ω, G, (Gt )t≥0 , P ), as in (4.19) (i.e. for each
t ≥ 0, E[Mt2 ] < ∞), one can construct a process hM it , t ≥ 0, such that

(4.22) t → hM it (ω) is non-decreasing continuous, for each ω ∈ Ω,

(4.23) hM i0 = 0,

(4.24) (hM it )t≥0 , is (Gt )-adapted, integrable,

(4.25) Mt2 − hM it , t ≥ 0 is a (Gt )-martingale.

55
Moreover, (hM it )t≥0 is essentially unique (i.e. two such processes agree for all t ≥ 0,
except maybe on a negligible set of ω ∈ Ω), and it is called the “quadratic variation
process’. The terminology stems from the fact that for t ≥ 0,
X  2
M k+1 − M k −→ hM it in P -probability,
n 2n 2 n→∞
k+1
2n
≤t

see [8], Section 1.5. 


Rt
We are going to define the integral 0 Hs dXs for some suitable basic integrands, for
which the definition is “natural”, and then we will use an isometry property to extend the
class of processes we integrate. Later, we will further extend the class of integrands by a
so-called “localization argument”.
Our building blocks are the basic processes:
(4.26) Hs (ω) = C(ω) 1{a < s ≤ b}, s ≥ 0, ω ∈ Ω, with C ∈ bGa , 0 ≤ a ≤ b .
For such an H as in (4.26), we define:
Z ∞
def 
(4.27) Hs dXs = C(ω) Xb (ω) − Xa (ω) ∈ L2 (P ) .
0

The restriction C(ω) ∈ bGa is not a priori natural. It is motivated by the fact that if we
define
Z ∞
def (4.27) 
(4.28) (H.X)t = Hs 1[0,t] (s) dXs = C(ω) Xb∧t (ω) − Xa∧t (ω)
0
տ Rt
also denoted 0 Hs dXs

we have the
Proposition 4.6.
(4.29) Mt = (H.X)t is a continuous square integrable (Gt )-martingale .
Proof.
(H.X)t = C(ω)(Xb∧t −Xa∧t )
= 0, if 0 ≤ t ≤ a ,
= C(Xt − Xa ), if a ≤ t ≤ b ,
= C(Xb − Xa ), if b ≤ t ≤ ∞ ,

clearly defines a continuous adapted process which is square integrable. Considering the
case a ≤ s < t ≤ b (the other cases are simpler), we see that
ւ bGa ⊆bGs
P −a.s.
E[Mt − Ms | Gs ] = E[C(Xt − Xs ) | Gs ] = C E[Xt − Xs | Gs ]
= 0.
Our claim follows.

56
The next step is the following
Proposition 4.7. If H, K are basic processes, then
hZ t i
(4.30) E[(H.X)t (K.X)t ] = E Hs (ω) Ks (ω)ds , for 0 ≤ t ≤ ∞ .
0
Proof. It suffices to consider the case t = ∞, because
(H.X)t = (H 1[0,t] .X)∞ .

It also suffices to check (4.30) when (a, b] = (c, d] or (a, b] “<” (c, d] (i.e. a ≤ b ≤ c ≤ d),
and H = C 1(a,b] , K = D 1(c,d] .
Indeed, one makes repeated use of identities such as
for 0 ≤ α ≤ β ≤ γ, H = C 1(α,γ] , H 1 = C 1(α,β] , H 2 = C 1(β,γ] ,
Z ∞ Z ∞ Z ∞
(4.31)
Hs dXs = Hs1 dXs + Hs2 dXs .
0 0 0

We will thus only need to check (4.30) in two cases:

• Case (a, b] = (c, d]:


∈bGa
z}|{
(4.32) E[(H.X)∞ (K.X)∞ ] = E[ CD(Xb − Xa )2 ] =
E[CDE[(Xb − Xa )2 | Ga ]] .
Note that:
E[(Xb − Xa )2 | Ga ] = E[Xb2 − 2Xb Xa + Xa2 | Ga ] =
(4.20)
(4.33) E[Xb2 | Ga ] − 2Xa E[Xb | Ga ] + Xa2 = E[Xb2 | Ga ] − Xa2 =
(4.21)
E[Xb2 − b | Ga ] + b − Xa2 = Xa2 + b − Xa2 = b − a .

Thus coming back to (4.32) we have shown that


hZ ∞ i
E[(H.X)∞ (K.X)∞ ] = E[CD(b − a)] = E Hs (ω) Ks (ω)ds ,
0
i.e. (4.30) holds.

• Case (a, b] “<” (c, d]:


E[ (H.X)∞ (K.X)∞ ] = E[(H.X)∞ E[(K.X)∞ | Gb ]]
| {z }
Gb −meas (4.29)
= E[(H.X)∞ (K.X)b ] = 0 .
| {z }
=0

Analogously we have
hZ ∞ i
E Hs (ω) Ks (ω) ds = 0, and (4.30) holds.
0

57
Remark 4.8. If one replaces (Xt )t≥0 satisfying (4.20), (4.21), with (Mt )t≥0 , a continuous
square integrable martingale, and defines for basic processes Hs (ω) = C(ω) 1{a < s ≤ b},
with C ∈ bGa ,
Z ∞ Z t Z ∞
(4.34) Hs dMs = C(Mb − Ma ), and Hs dMs = Hs 1[0,t] (s) dMs , 0 ≤ t ≤ ∞ ,
0 0 0

then (4.30) is replaced by


hZ t i
E[(H.M )t (K.M )t ] = E Hs (ω) Ks (ω) dhM is (ω) ,
0
(4.35) with hM i. the quadratic variation of M , and (H.M ).
defined as in (4.28) with M replacing X. .

We now define the class Λ1 of simple processes:

(4.36) Λ1 = {Hs (ω) = Hs1 (ω) + · · · + Hsn (ω), H i are basic} .

Proposition and definition:


Z ∞ n Z
X ∞
def
(4.37) For H ∈ Λ1 , Hs dXs = Hsi dXs , is well defined.
0 i=1 0

Proof. The only point toPcheckR is that when H 1 , . . . , H p are basic processes such that

H 1 + · · · + H p = 0, then pi=1 0 Hsi dXs = 0.
Making repeated use of (4.31) and

H = C 1(α,β] , K = D 1(α,β] basic processes, then


Z ∞ Z ∞ Z ∞
(4.38) Hs dXs + Ks dXs = Ls dXs , with
0 0 0
L = (C + D) 1(α,β] basic process,

we can assume that H i = Ci 1I i , 1 ≤ i ≤ p, with I i ∩ I j = ∅, when


P i 6=R j. In this case

H 1 + · · · + H p = 0 implies H 1 = H 2 = · · · = H p = 0, and hence pi=1 0 Hsi dXs = 0.
Our claim is thus proved.

As a consequence of (4.29) and (4.37) we see that for H, K ∈ Λ1 ,


Z t  Z ∞ 
def
Hs dXs = Hs 1[0,t] (s) dXs is a continuous square
0 0
(4.39) integrable martingale, and
hZ ∞ Z ∞ i hZ ∞ i
E Hs dXs Ks dXs = E Hs Ks ds .
0 0 0

58
Remark 4.9. In the case of a general continuous square integrable (Gt )-martingale (Mt )t≥0 ,
in place of (Xt )t≥0 , we can use the same construction as above. The role of ds is simply
replaced by dhM is (ω) so that for H, K ∈ Λ1 one has
hZ ∞ Z ∞ i hZ ∞ i
(4.40) E Hs dMs Ks dMs = E Hs (ω) Ks (ω) dhM is (ω) .
0 0 0


As a result of (4.39), we see that


Z ∞
K ∈ Λ1 −→ Ks dXs ∈ L2 (Ω, G, P ) is an isometry,
(4.41) 0
if Λ1 is viewed as a subspace of L2 (Ω × R+ , G ⊗ B(R+ ), dP ⊗ ds) .

Note that Λ1 is in general not dense in L2 (Ω × R+ , G ⊗ B(R+ ), dP ⊗ ds) since all K in Λ1


are progressively measurable processes. We hence consider

P = the σ-algebra of progressively measurable sets in Ω × R+ ,


(4.42) (i.e. of A ∈ G ⊗ B(R+ ) such that for all t ≥ 0,
A ∩ (Ω × [0, t]) ∈ Gt ⊗ B([0, t])) .

Remark 4.10. A process Zu (ω) on (Ω, G, (Gt )t≥0 , P ) is progressively measurable in the
sense of the definition below (2.61) exactly when
Z
(4.43) (Ω × R+ , P) −→ (Rd , B(Rd )) is measurable .

We then define:

(4.44) Λ2 = L2 (Ω × R+ , P, dP ⊗ ds) ,
R∞
the set of progressively measurable processes Hs (ω) for which E[ 0 Hs2 (ω)ds] < ∞. The
interest of this definition comes from the next

Proposition 4.11.

(4.45) Λ1 is a dense subset of Λ2 for the L2 (Ω × R+ , P, dP ⊗ ds)-distance,


Z ∞
(4.46) H→ Hs dXs extends uniquely into an isometry from Λ2 into L2 (Ω, G, P ) .
0
Rt R∞
(we will also write 0 Hs dXs for 0 Hs 1[0,t] (s) dXs , for 0 ≤ t ≤ ∞).

Proof. In view of (4.41) we see that (4.46) immediately follows from (4.45).
The proof of (4.45) will in fact rely on a lemma, which is more general than what
is needed to prove (4.45), but applies as well to the subsequent discussion of stochastic
integrals with respect to continuous square integrable martingales. The non-decreasing
process t → At (ω) in the next lemma plays the role of t → hM it (ω), cf. (4.22).

59
Lemma 4.12. Suppose that At , t ≥ 0, is a continuous (Gt )-adapted process, non-decreasing
in t, with A0 = 0, and E[At ] < ∞, for every t ≥ 0, then

(4.47) Λ1 is a dense subset of L2 (Ω × R+ , P, dP × dAs ) for the L2 (dP × dAs )-distance.

(The σ-finite measure dµ = dP × dAs is defined via


Z Z ∞ 
µ(B) = 1B (ω, u) dAu (ω) dP (ω), for B ∈ G ⊗ B(R+ ) ⊃ P ) .
Ω 0

With the above lemma, (4.45) clearly follows.


We have thus reduced the proof of the important Proposition 4.11, to the

Proof of Lemma 4.12: Since E[At ] < ∞, for each t ≥ 0, it follows that indeed Λ1 ⊂
def
Λ2 = L2 (Ω × R+ , P, dP × dAs ).
We further observe that
et = t + At , t ≥ 0 ,
A

et , t ≥ 0, implies (4.47)
satisfies the same assumptions as At , t ≥ 0, and proving (4.47) for A
for At , t ≥ 0. We thus assume that for ω ∈ Ω:

t ∈ [0, ∞) −→ At (ω) ∈ [0, ∞) is an increasing bijection,


(4.48)
and for 0 ≤ s ≤ t, t − s ≤ At (ω) − As (ω) .

We define for H ∈ Λ2 ,

(4.49) H n = 1[0,n] × {(−n) ∨ (H ∧ n)} ∈ Λ2 ,

and we find that by dominated convergence:

(4.50) kH − H n kL2 (dP ×dA) −→ 0 .


n→∞

We then introduce the inverse function of A.

(4.51) τu = inf{t ≥ 0; At > u}, for u ≥ 0

(we will sometimes write τ (u) in place of τu ).


Note that for f ≥ 0, B(R+ )-measurable and ω ∈ Ω:
Z ∞ Z ∞
(4.52) f (t) dAt = f (τu ) du “change of variable formula”.
0 0

(Indeed this identity holds when f = 1[a,b] with a ≤ b, since τu ∈ [a, b] is equivalent to
u ∈ [Aa , Ab ]. Then, by Dynkin’s lemma, (4.52) holds for any f = 1C , with C ∈ B([0, T ]),
T > 0, arbitrary, and the general case follows by approximation).

60
We then define for n ≥ 1, ℓ ≥ 0,
Z t
n,ℓ
Ht (ω) = 2 ℓ Hsn (ω) dAs (ω), for t ≥ 0, ω ∈ Ω ,
(4.53) τ (At −2−ℓ )

where by convention At = 0, for t ≤ 0, τ (u) = 0, for u ≤ 0 .

Clearly H.n,ℓ is bounded in absolute value by kH n k∞ . It is a continuous function of t,


(4.48)
and it vanishes when t > n + 2−ℓ (indeed An+2−ℓ ≥ An + 2−ℓ , so that for t > n + 2−ℓ ,
At − 2−ℓ > An , and hence τ (At − 2−ℓ ) > n, which in turns implies that the integral in
(4.53) vanishes in view of (4.49). Moreover

(4.54) H n,ℓ is (Gt )-adapted.

Indeed τ (At − 2−ℓ ) = inf{s ≥ 0; As > At − 2−ℓ } is Gt -measurable (simply observe that
for u ≤ t, {τ (At − 2−ℓ ) < u} = {for some v ∈ Q ∩ [0, u), Av > At − 2−ℓ } ∈ Gt , and
Rt
it equals Ω ∈ Gt , when u > t). Moreover for any F ∈ bGt ⊗ B([0, t]), 0 Fs (ω) dAs (ω)
is Gt -measurable, as follows from Dynkin’s lemma, approximation, and consideration of
functions of the form F = 1D×[a,b] , with D ∈ Gt , 0 ≤ a ≤ b ≤ t. Coming back to (4.53),
the claim (4.54) follows.

Now, as a result of (4.52) we find that:


Z ∞ Z ∞
n n,ℓ 2
(4.55) (Ht (ω) − Ht (ω)) dAt = (Hτnu (ω) − Hτn,ℓ
u
(ω))2 du
0 0

and for u ≥ 0,
Z τu Z τu
(4.53)
Hτn,ℓ
u
(ω) = 2ℓ Hsn (ω) dAs = 2ℓ Hsn (ω) dAs
τ ( Aτu −2−ℓ ) τ (u−2−ℓ )
|{z}
k
u
Z ∞

=2 1{τ (u − 2−ℓ ) ≤ s ≤ τu } Hsn (ω)dAs
(4.56) 0
Z ∞
(4.52)
= 2ℓ 1{τ (u − 2−ℓ ) ≤ τv ≤ τu } Hτnv (ω) dv
0
Z u
= 2ℓ Hτnv (ω) dv .
(u−2−ℓ )+

Note that for any g ∈ L2 (R+ , du)


Z ∞
ℓ L2 (du)
gℓ (u) = 2 1{u − 2−ℓ ≤ v ≤ u} g(v) dv −→ g(u)
0 ℓ→∞

(this follows directly from the continuity of translations in L2 (R, du)).

61
Thus, combining (4.55) and (4.56), it follows by dominated convergence that

kH n − H n,ℓ k2L2 (dP ×dA) =


(4.57) hZ ∞ i
E (Htn (ω) − Htn,ℓ (ω))2 dAt (ω) −→ 0, for any n ≥ 1 .
0 ℓ→∞

We can now define for n ≥ 1, ℓ, m ≥ 0:


X n,ℓ
(4.58) Htn,ℓ,m(ω) = H k (ω) 1( k k+1
, ] (t), for t ≥ 0, ω ∈ Ω .
2m 2m 2m
k≥0

Clearly H n,ℓ,m ∈ Λ1 are uniformly bounded in m, and for t > 0, ω ∈ Ω, thanks to the
continuity of H.n,ℓ (ω), Htn,ℓ,m (ω) −→ Htn,ℓ (ω). Since dAt does not give positive mass to
m→∞
{0}, we find that:

(4.59) kH n,ℓ − H n,ℓ,mk2L2 (dP ×dA) −→ 0, for n ≥ 1, ℓ ≥ 0 .


m→∞

Combining (4.50), (4.57), (4.59) we have proved (4.45).

This concludes the proof of the Proposition 4.11. 

Remark 4.13.
1) Reconstructing some trajectorial character to the stochastic integral.
Note that when H and K belong to Λ2 , and G ∈ G are such that

(4.60) Hs (ω) = Ks (ω) for all s ≥ 0, and ω ∈ G,

then we see from (4.49) that a similar identity holds for H n and K n , from (4.53), that the
same holds for H n,ℓ and K n,ℓ , and finally from (4.58), that the same holds for H n,ℓ,m and
K n,ℓ,m. As a result we can find H (i) and K (i) in Λ1 , i ≥ 1, with the property:

H (i) → H in L2 (dP ⊗ ds), K (i) → K in L2 (dP ⊗ ds), and for all i ≥ 1,


(4.61) (i) (i)
Hs (ω) = Ks (ω), for all s ≥ 0 and ω ∈ G .

On the other hand when H, K ∈ Λ1 are such that H.(ω) = K.(ω) for ω ∈ G, one checks
from (4.27), (4.28), (4.37), (4.39) that

(4.62) (H.X)t (ω) = (K.X)t (ω), for 0 ≤ t ≤ ∞, and ω ∈ G .

Combining this observation with (4.61) and (4.46), we see that:


Z ∞ Z ∞
(4.63) when H, K ∈ Λ2 satisfy (4.60), then Hs dXs = Ks dXs , P -a.s. on G .
0 0

This somehow reconstructs some trajectorial character to the stochastic integral.

62
2) The class of processes we can integrate has severe limitations.
If we consider the canonical space (C, F, (Ft )t≥0 , W0 ) with (Xt )t≥0 , the canonical pro-
cess, we can now consider
Z 1  Z ∞ 
eαXs dXs = 1[0,1] (s) eαXs dXs , for α ∈ R
0 0

because eαXs is progressively measurable and


hZ ∞ i Z 1 Z 1
2
2αXs 2αXs
E0 1[0,1] (s) e ds = E0 [e ] ds = e2α s ds < ∞ ,
0 0 0

so that 1[0,1] (s) eαXs belongs to Λ2 , for all α ∈ R.

On the other hand, if we consider α ∈ R and


Z 1
2
(4.64) eαXs dXs ,
0

then, we observe that


hZ ∞ i Z 1 Z
2 1 1 2
E0 1[0,1] (s) e2αXs ds = √ e(2α− 2s )x dx ds =
0 0 R 2πs
Z 1
−1 1
(1 − 4αs)+ 2 ds < ∞ when α < ,
0 4

1
= ∞ when α ≥ .
4

R1 1 2
Thus, at the present state of the construction of stochastic integrals, 0 e 10 Xs dXs is
R1 2
meaningful, but 0 eXs dXs is not!
R1 2
We will later extend the definition of stochastic integrals so that 0 eXs dXs (or even
R 1 (eXs )
0 e dXs !) are well-defined. However, in the theory we develop
Z 1
(4.65) X1 dXs will not be defined because 1[0,1] X1 is not P-measurable .
0

Observe that given H ∈ Λ2 and kH n − HkL2 (dP ⊗ds) −→ 0, with H n ∈ Λ1 for each n,
n→∞
we know that for each t ≥ 0, (H n .X)t −→ (H.X)t in L2 (Ω, G, P ) and in fact (H.X)t ∈
n→∞
L2 (Ω, Gt , P ). We are now going to select a nice version of the process (H.X)t , t ≥ 00, so
that it defines a continuous square integrable (Gt )-martingale. We recall Doob’s inequality
in the discrete setting:

63
Proposition 4.14. Consider a filtered probability space (Ω, F, (Fm )m≥0 , P ) and (Xm )m≥0 ,
an (Fm )-submartingale (i.e. Xm is Fm -measurable and integrable, and E[Xm+1 | Fm ] ≥
Xm , for m ≥ 0). Then for λ > 0, n ≥ 0, A = {ω ∈ Ω; sup0≤m≤n Xm (ω) ≥ λ}, one has

(4.66) λP [A] ≤ E[Xn 1A ] ≤ E[Xn+ ]

(see [5], p. 215).

In the continuous time set-up we obtain:

Proposition 4.15. Consider a filtered probability space (Ω, G, (Gt )t≥0 , P ) and (Xt )t≥0 , a
continuous (Gt )-submartingale. Then for λ > 0, t ≥ 0, and A = {sup0≤u≤t Xu ≥ λ} one
has
 
(4.67) λP sup Xu ≥ λ ≤ E[Xt 1A ] ≤ E[Xt+ ] .
0≤u≤t

Proof. It suffices to prove that for λ > 0:


 
(4.68) λP sup Xu > λ] ≤ E[Xt 1{ sup Xu > λ} .
0≤u≤t 0≤u≤t

One then applies (4.68) to λn ↑ λ and obtains (4.67). By the same argument with λn ↓ λ,
we deduce from (4.66) that for λ > 0, one has:
   
λP sup X mt > λ ≤ E Xt 1{ sup X mt > λ} .
0≤m≤2ℓ 2ℓ 0≤m≤2ℓ 2ℓ

Letting ℓ ↑ ∞, since {sup0≤m≤2ℓ X mt > λ} ↑ {sup0≤u≤t Xu > λ}, as ℓ ↑ ∞, we obtain


2ℓ
(4.68), and our claim is proved.
Rt
Doob’s inequality will be a key tool for the construction of a good version of 0 Hs dXs ,
when H ∈ Λ2 .

R t We now proceed to the construction of a good version of the stochastic integral


0 Hs dXs , for H ∈ Λ2 , cf. (4.44). We recall our standing assumptions (4.19), (4.20), (4.21).

Theorem 4.16. For Hs (ω) ∈ Λ2 , there is a process (It )0≤t≤∞ , essentially unique (i.e. two
such processes, except on a P -negligible set, agree for all t ≥ 0), continuous, (Gt )-adapted,
such that:
Z t
(4.69) for each 0 ≤ t ≤ ∞, It = Hs dXs , P -a.s.,
0

(4.70) (It )0≤t≤∞ is a continuous square integrable (Gt )-martingale,


(4.46)
hZ t i
2
(and of course E[It ] = E Hs2 (ω)ds , for 0 ≤ t ≤ ∞) .
0

64
Rt
Proof. When H ∈ Λ1 , our definition of 0 Hs dXs satisfies the above properties, see (4.37),
(4.28), (4.29). When H ∈ Λ2 , we pick H n ∈ Λ1 , n ≥ 0, with limn kH − H n kL2 (dP ⊗ds) = 0.
As a result of (4.46), for 0 ≤ s ≤ t, A ∈ Gs ,
hZ t i hZ s i
n
E Hu dXu 1A = E Hun dXu 1A
0 0
↓n→∞ ↓n→∞
hZ t i hZ s i
E Hu dXu 1A = E Hu dXu 1A
0 0

and by the discussion below (4.65), we thus find that:

(4.71) E[(H.X)t | Gs ] = (H.X)s , P -a.s.,

so the martingale property comes for free. We thus only need to find It (ω) a continuous
(Gt )-adapted process for which (4.69) holds. We choose nk → ∞ such that
X
(4.72) k4 kH nk − H nk+1 k2L2 (P,dP ⊗ds) < ∞ .
k

Then for each k ≥ 0, ((H nk − H nk+1 ).X)2t is a continuous submartingale and by Doob’s
inequality (4.67), for λ > 0:
  h Z ∞  2 i
2 nk nk+1 n
λ P sup |(H .X)u − (H .X)u | ≥ λ ≤ E (Hsnk − Hs k+1 dXs
(4.73) u≥0 0
= kH nk − H nk+1 k2L2 (P,dP ⊗ds) .

Choosing λ = k−2 , we obtain


 
P sup | (H nk .X)u − (H nk+1 .X)u | ≥ k−2 ≤ k4 kH nk − H nk+1 k2L2 (P,dP ⊗ds) .
u≥0

Applying Borel-Cantelli’s lemma, we can find N ∈ G with P (N ) = 0, such that for ω ∈


/ N,
we have k0 (ω) < ∞, such that
1
(4.74) sup |(H nk .X)u (ω) − (H nk+1 .X)u (ω)| ≤ , for k ≥ k0 (ω) .
u≥0 k2

/ N , (H nk .X)u (ω) converges uniformly on [0, ∞]. We thus define:


As a result for ω ∈

Iu (ω) = lim(H nk .X)u (ω), for ω ∈


/N,
(4.75) k
=0 , for ω ∈ N ,

so that u ∈ [0, ∞] → Iu (ω) is continuous for all ω ∈ Ω, and Iu (·) is Gu -measurable (we use
here the fact that Gu contains all negligible sets of G, see (4.5)).
L2 (P )
Observe that (H nk .X)u −→ (H.X)u , for 0 ≤ u ≤ ∞, and P -a.s., (H nk .X)u → Iu . As
P -a.s.
a result Iu = (H.X)u , and (4.69) holds. The theorem is proved.

65
From now on (H.X)t will denote the essentially unique regular version It . We will use
the following inequality:
Proposition 4.17. If (Xt )t≥0 is a continuous non-negative submartingale on a filtered
probability space, then for 0 ≤ t < ∞, p ∈ (1, ∞)
 p
(4.76) E sup Xsp ]1/p ≤ E[Xtp ]1/p .
s≤t p − 1
Proof. We apply the discrete time inequality to X ktn , 0 ≤ k ≤ 2n , and let n → ∞ (see for
2
instance [5], p. 216 for the discrete time inequality).
As an immediate application we have:
 1/2
(4.77) E sup(H.X)2t ≤ 2 kHkL2 (P,dP ⊗ds) , for H ∈ Λ2 .
t≥0

We now proceed to the next


Proposition 4.18. For H, K ∈ Λ2 , the essentially uniquely defined process
Z t
def
(4.78) Nt = (H.X)t (K.X)t − Hs (ω) Ks (ω) ds, 0 ≤ t ≤ ∞ ,
0
is a continuous (Gt )-martingale and
(4.79) sup |Nt | ∈ L1 (Ω, G, P ) .
t≥0
R∞
Proof. Note that by the Cauchy-Schwarz inequality E[ 0 |Hs (ω)| |Ks (ω)|ds] ≤
kHkL2 (dP ⊗ds) kKkL2 (dP ⊗ds) < ∞, so that (4.78) is well-defined for all 0 ≤ t ≤ ∞, and
R∞
ω outside the negligible set N where 0 |Hs (ω)| |Ks (ω)|ds = ∞. It also defines a process
Rt
with continuous trajectories outside N , and setting for instance 0 Hs (ω) Ks (ω)ds ≡ 0,
for ω ∈ N , the property (4.79) is an immediate consequence of (4.77), and the above
inequality. We thus only need to check that Nt is a (Gt )-martingale.

• 1st case: H, K are basic (similar to (4.30)):


We only need to treat the case of H = C 1(a,b] , K = D1(c,d] , with either (a, b] = (c, d] or
“(a, b] < (c, d]”.
If (a, b] = (c, d], then for t ≥ 0,
(4.80) Nt = CD{(Xt∧b − Xt∧a )2 − (t ∧ b − t ∧ a)} ((Gt )-measurable)
is a martingale because it is adapted, and when for instance a ≤ s < t ≤ b:
E[Nt | Gs ] = E[(Xt − Xa )2 − (t − a) | Gs ] CD
= E[Xt2 − 2Xt Xa + Xa2 − (t − a) | Gs ] CD
(4.81) (4.20),(4.21)
= (Xs2 − s − 2Xs Xa + Xa2 + a) CD
= {(Xs − Xa )2 − (s − a)} CD = Ns ,

and the other cases are easier to check.

66
If “(a, b] < (c, d]”:

(4.82) Nt = (Xt∧b − Xt∧a )(Xt∧d − Xt∧c ) CD, t ≥ 0 ,

is a martingale because it is adapted, and when for instance c ≤ s < t ≤ d:


(4.20)
(4.83) E[Nt | Gs ] = CD(Xb − Xa ) E[Xt − Xc | Gs ] = CD(Xb − Xa )(Xs − Xc ) = Ns ,

and the other cases are simpler to check.

• 2nd case: H, K ∈ Λ1 :
Immediate from the previous case by bilinearity.

• General case: H, K ∈ Λ2 :
We choose H n , K n , n ≥ 0, in Λ1 , respectively converging to H and K in L2 (P, dP ⊗ ds).
By (4.77) we see that
 
(4.84) E sup |(H n .X)t − (H.X)t |2 ≤ 4 kH n − Hk2L2 (P,dP ⊗ds) → 0,
t≥0

and a similar inequality for K. Note also that


Z t Z t Z ∞ Z ∞
n n n
Hs Ks ds − Hs Ks ds ≤ |Hs − Hs | |Ks | ds + |Hsn | |Ks − Ksn | ds ,
0 0 0 0

so taking expectations and using Cauchy-Schwarz’s inequality, we find that


h Z t Z t i
E sup Hs Ks ds − Hsn Ksn ds ≤ kH − H n kL2 (P,dP ⊗ds) kKkL2 (P,dP ⊗ds)
(4.85) t≥0 0 0

+kH n kL2 (P,dP ⊗ds) kK − K n kL2 (P,dP ⊗ds) → 0 .

As a result, we find that


(n)
(4.86) sup |Nt − Nt | → 0 in L1 (Ω, G, P ) ,
t≥0

(n)
if Nt denotes the martingale attached to H n , K n via (4.78). This is more than enough
to conclude that Nt , t ≥ 0, satisfies the martingale property, and this concludes the proof
of the Proposition.

Remark 4.19. Note that the above proposition shows that for H ∈ Λ2 ,
Z t
(H.X)2t − Hs2 (ω) ds is a continuous (Gt )-martingale ,
0

and the non-decreasing adapted process:


Z t
t ≥ 0 −→ Hs2 (ω) ds
0

67
fulfills the properties (4.22) - (4.25) relative to Mt = (H.X)t . We have thus constructed
by “bare hands”
Z t
(4.87) h(H.X)it = Hs2 (ω) ds, t ≥ 0
0

(as mentioned below (4.25), the process satisfying (4.22) - (4.25) is essentially unique). 
The good version of the stochastic integral, which we have produced, is, in essence,
based on an isometry. We will now reconstruct some trajectorial property of the
integral.
When T is a (Gt )-stopping time, the process
def
(4.88) (ω, s) → 1[0,T ] (ω, s) = 1{s ≤ T (ω)}

is progressively measurable (it is adapted, left-continuous in s, and a simple variant of


RT
(2.60) yields the claim). For such a T we have two “natural ways” to define “ 0 Hs dXs ”,
when H ∈ Λ2 :

• We can for instance use the continuous version (H.X)t (ω) and replace t by T (ω).
Observe that the essential uniqueness of the continuous version of (H.X)t (ω) ensures
that two different continuous versions of the stochastic integral give rise to resulting ran-
dom variables which differ on an at most negligible set. In other words:

(4.89) (H.X)T (ω) (ω) is uniquely defined up to a negligible set.

• Alternatively we can use the definition


Z ∞
(4.90) (1[0,T ] H)s dXs ,
0

once we note that 1[0,T ] H ∈ Λ2 .


As we now explain both definitions coincide.
Proposition 4.20. (stopping theorem for stochastic integrals)
Let T be a (Gt )-stopping time, and H ∈ Λ2 , then P -a.s.,
Z t∧T Z t
(4.91) Hs dXs = (1[0,T ] H)s dXs , for 0 ≤ t ≤ ∞ .
0 0

Proof. We consider (H.X)t , t ≥ 0, and (H1[0,T ] .X)t≥0 . For a given u ≥ 0,


def
(H1[0,u] )s (ω) = (H1[0,T ] 1[0,u] )s (ω), for all s ≥ 0, on G = {u ≤ T } .

It now follows from (4.63) that for u ≥ 0, P -a.s. on {u ≤ T },

(4.92) (H.X)u = (H1[0,u] .X)∞ = (H1[0,T ] 1[0,u] .X)∞ = (H1[0,T ] .X)u .

68
As a result we see that

P -a.s., for all u ∈ Q ∩ [0, ∞), u ≤ T (ω) =⇒ (H.X)u (ω) = (H1[0,T ] .X)u (ω) ,

and using continuity that

(4.93) P -a.s. for all 0 ≤ t ≤ T (ω), (H.X)t (ω) = (H1[0,T ] .X)t (ω) .

Analogously for u ≥ 0:
e = {T ≤ u} ,
H1[0,T ] 1[0,u] = H1[0,T ] on G

so that for u ≥ 0, P -a.s., on {T ≤ u}

(H1[0,T ] .X)u = (H1[0,T ] .X)∞ .

From this we deduce that

P -a.s., for all u ∈ Q ∩ [0, ∞), T (ω) ≤ u =⇒ (H1[0,T ] .X)u (ω) = (H1[0,T ] .X)∞ (ω) .

Using continuity as above, we thus find that

(4.94) P -a.s., for all t ≥ T (ω), (H1[0,T ] .X)t (ω) = (H1[0,T ] .X)∞ (ω) .

Combining (4.93) and (4.94), we see that

P -a.s., for all t ≥ 0, (H.X)t∧T (ω) (ω) = (H1[0,T ] .X)t∧T (ω) (ω) = (H1[0,T ] .X)t (ω) ,

and this proves (4.91).

We then have the following

Corollary 4.21. Given H, K ∈ Λ2 , T a (Gt )-stopping time such that “H = K on the


random interval [0, T ]” (i.e. H1[0,T ] = K1[0,T ] ), then one has
Z t Z t
(4.95) P -a.s., Hs dXs = Ks dXs , for 0 ≤ t ≤ T (ω) .
0 0

Proof.
Z t∧T Z t Z t
(4.91)
P -a.s., for t ≥ 0, Hs dXs = (1[0,T ] H)s dXs = (1[0,T ] K)s dXs
0 0 0
Z t∧T
(4.91)
= Ks dXs ,
0

and the claim follows.

69
The above corollary provides some “pathwise feeling” to the stochastic integral and
also has important consequences.
Our next item of discussion is the “localization of stochastic integrals”. We are
going to relax the integrability condition H ∈ Λ2 (i.e. H ∈ L2 (Ω × R+ , P, dP ⊗ ds))
in the definition of stochastic integrals. As mentioned previously, cf. (4.64), presently
R 1 αX 2 1
0 e
s dX
s has no meaning when α ≥ 4 (for the sake of definiteness we consider the
canonical space (C, F, (Ft )t≥0 , W0 ) and the canonical process Xt , t ≥ 0). We are going to
R1 2
remedy this feature and 0 eαXs dXs will become well-defined for any α ∈ R, as a result
of the construction below (together with many other stochastic integrals!).
We introduce
n
(4.96) Λ3 = K : P-measurable functions on Ω × [0, ∞), such that
Z t o
P -a.s., for all t ≥ 0, Ks2 (ω) ds < ∞ .
0

Remark 4.22.
1) Note that when Ks (ω) is (Gs )-adapted, for each s ≥ 0, and continuous in s, for each ω,
then automatically K ∈ Λ3 . In particular exp{αXs2 }, or exp{exp{Xs2 }} belong to Λ3 !

2) In the case where we consider a continuous square integrable martingale M. in place


of
R t X2. , the relevant condition will be that outside a P -negligible set of ω, one has
0 Ks (ω) dhM is (ω) < ∞, for all t ≥ 0. 
Lemma 4.23. When H ∈ Λ3 , there exists a non-decreasing sequence of (Gt )-stopping
times Sn , n ≥ 0, which is P -a.s. tending to +∞, such that for each n ≥ 0:

(4.97) H1[0,Sn ] ∈ Λ2 .
Rt
Proof. Note that (ω, t) → 0 Hs2 (ω)ds ∈ [0, ∞] is a continuous, non-decreasing, (Gt )-
adapted stochastic process. As a result
Z t
def
(4.98) Sn = inf{t ≥ 0; Hs2 (ω) ds ≥ n} ≤ ∞
0

is a (Gt )-stopping
R t 2 time (cf. (2.26), (2.27), in fact the proof is simpler here because
{Sn > t} = { 0 Hs (ω) ds < n} ∈ Gt , for each t ≥ 0). In addition we have:
hZ ∞ i
E (H 2 1[0,Sn ] )s (ω) ds ≤ n < ∞ ,
0

and (4.97) holds. Moreover since H ∈ Λ3 , it follows that Sn (ω) ↑ ∞ for P -a.e. ω. This
proves our claim.

We are now ready to extend the definition of the stochastic integral to all integrands
in Λ3 .

70
Definition and Theorem 4.24. Let H ∈ Λ3 and Sn , n ≥ 0, be any sequence of (Gt )-
stopping times, non-decreasing in n, P -a.s. tending to +∞, and such that (4.97) holds.
Then the event
[ 
N= ω ∈ Ω; ∃t ∈ Q+ , (H1[0,Sn ] .X)t∧Sn 6= (H1[0,Sn+1 ] .X)t∧Sn
(4.99) n≥0

∪ {ω ∈ Ω; lim Sn (ω) < ∞} is P -negligible,


n

and the process


def
(H.X)t (ω) = (H1[0,Sn ] .X)t (ω), for ω ∈
/ N , and t ≤ Sn (ω),
(4.100)
def
= 0, if ω ∈ N ,

is well-defined, continuous, adapted. Two such processes arising from two possible choices
of sequences of Sn , n ≥ 0, and versions of (H1[0,Sn ] .X)t (ω), agree for all t ≥ 0, except
maybe on a negligible set (i.e. (4.100) defines (H.X) in an essentially unique fashion).

Proof. Note that H1[0,Sn ] and H1[0,Sn+1 ] agree on [0, Sn ]. As a result of (4.95), the event
N in (4.99) is P -negligible. Note that

for ω ∈ N c , (H1[0,Sn ] .X)t (ω) = (H1[0,Sn′ ] .X)t (ω),


(4.101)
for n, n′ ≥ 0, and for 0 ≤ t ≤ Sn (ω) ∧ Sn′ (ω) .

Hence (H.X)t (ω) in (4.100) is well-defined. Moreover for t ≥ 0,

(H.X)t (ω) = lim (H1[0,Sn ] .X)t (ω), if ω ∈


/ N, = 0, if ω ∈ N .
n

Since (Ω, G, (Gt )t≥0 , P ) satisfies the usual conditions, cf. (4.5), (4.6), (H.X)t , t ≥ 0, is
(Gt )-adapted. Further t → (H.X)t (ω) is continuous, for each ω ∈ Ω.
If Sn , Sn′ , n ≥ 0, are two sequences satisfying the assumptions of the theorem, the
def
same holds for Tn = Sn ∧ Sn′ . From (4.95) we thus find that

P -a.s., for 0 ≤ t ≤ Tn (ω), (H1[0,Sn ] .X)t (ω) = (H1[0,Sn′ ] .X)t (ω),


(4.102)
= (H1[0,Tn ] .X)t (ω) .

The claim about the essential uniqueness in the claim (4.100) easily follows.

Remark 4.25. Of course Λ2 ⊆ Λ3 , and for H ∈ Λ2 , we can choose Sn ≡ ∞, for all


n ≥ 0, so that (4.97) holds. Noting that (H1[0,∞] .X)t , t ≥ 0, and (H.X)t , t ≥ 0, are
indistinguishable, we see that the definition (4.100) is consistent when H ∈ Λ2 ⊆ Λ3 . 
Rt
We now have given a meaning to expressions like 0 exp{exp{Xs2 }}dXs , and of course
we should not expect that we still keep the martingale property for H ∈ Λ3 (an indication
of this feature appeared below (4.64)). The adequate notion comes in the next Definition.

71
Definition 4.26. A process (Mt )t≥0 , such that there exists an increasing sequence of (Gt )-
stopping times Sn , P -a.s. tending to ∞, such that for each n, (Mt∧Sn )t≥0 is a continuous
square integrable martingale, is called a continuous (Gt ))-local martingale.

Remark 4.27. When M0 is bounded, one can replace “continuous square integrable”
with “continuous bounded”. Indeed for such an (Mt )t≥0 as above, with M0 bounded, one
defines the sequence of (Gt )-stopping times:

(4.103) Tm = inf{s ≥ 0; |Ms | ≥ m} ≤ ∞ ,

so that Tm ↑ ∞, as m → ∞. Then, for fixed m ≥ kM0 k∞ , we have |Mt∧Tm | ≤ m, for all


t ≥ 0. Hence, when 0 ≤ s < t, A ∈ Gs , we have
dom. conv.
(4.104) E[Mt∧Tm 1A ] = lim E[Mt∧Tm ∧Sn 1A ] .
n

By the stopping theorem, M ft∧Tm is a continuous martingale if M


ft is a continuous martin-
ft def
gale, and applying this to M = Mt∧Sn , we find that the last term of (4.104) equals
dom. conv.
lim E[Ms∧Tm ∧Sn 1A ] = E[Ms∧Tm 1A ] .
n

In other words (Mt∧Tm )t≥0 is a (Gt )-martingale, which is bounded and continuous, and
our claim follows. 

Exercise 4.28.
1) Deduce the continuous time stopping theorem we used above from the discrete time
version (see also [8], p. 19).

2) Show that a bounded continuous local martingale is a martingale. 

Continuous local martingales naturally arise in our context as shown by the next

Proposition 4.29.

(4.105) For H ∈ Λ3 , (H.X)t , t ≥ 0, is a continuous (Gt )-local martingale.

Proof. Consider an increasing sequence of stopping times Sn ↑ ∞, P -a.s., such that for
each n, H1[0,Sn ] ∈ Λ2 , then

(4.100)
P -a.s., for t ≥ 0, (H.X)t∧Sn = (H1[0,Sn ] .X)t∧Sn
(4.91)
(4.106) = (H1[0,Sn ] .X)t .

continuous square integrable martingale

Our claim follows.

72
5 Stochastic Integrals for Continuous Local Martingales
Rt
In this chapter we are going to define the stochastic integral 0 Hs dMs when the integrator
M
R t is2 a continuous local martingale, and H is progressively measurable and such that
0 Hs (ω) dhM is (ω) < ∞, where hM i is the so-called “quadratic variation of the local
martingale M ”. As in the previous chapter the filtered probability space (Ω, G, (Gt )t≥0 , P )
satisfies the “usual conditions”, see (4.5), (4.6). Our first task will be the construction of
hM i. We begin with the

Lemma 5.1. Suppose At , t ≥ 0, Bt , t ≥ 0, are continuous, (Gt )-adapted, non-decreasing


processes such that A0 = B0 = 0, and

(5.1) At − Bt is a (Gt )-local martingale,

then

(5.2) P -a.s.(ω), for all t ≥ 0, At (ω) = Bt (ω) .

Proof. Introduce the non-decreasing sequence of (Gt )-stopping times

(5.3) Sn = inf{s ≥ 0, As or Bs ≥ n} ,

and note that Sn ↑ ∞ as n → ∞. As in (4.103) we see that

(5.4) At∧Sn − Bt∧Sn , t ≥ 0, is a bounded martingale, for each n ≥ 0 .

It thus suffices to prove the theorem in the case where At , t ≥ 0, and Bt , t ≥ 0, are
uniformly bounded, and

(5.5) Mt = At − Bt , t ≥ 0, is a bounded continuous martingale.

We now observe that for t ≥ 0:


m −1
h 2X 2 i
E[Mt2 ] =E M k+1
m t
−M k
t
2 2m
k=0

and expanding the square, the cross terms disappear by the martingale property. So we
find
m −1
2X h 2 i h X  2 i
E[Mt2 ] = E M k+1
m t
−M k
t =E M k+1
m t
−M k
t
2 2m 2 2m
k=0 0≤k<2m
h X i
dom. conv.
≤E sup M k+1 t − M k
t × M k+1
t − M k
t −→ 0 .
0≤k<2m 2m 2m 2m 2m m→∞
0≤k<2m
| {z } | {z }
by continuity ↓ m→∞ ≤A∞ +B∞ ≤Const <∞
0

73
We have thus shown that for t ≥ 0:

(5.6) E[Mt2 ] = 0 ,

and hence P -a.s., for all t ∈ Q+ , Mt = At − Bt = 0. But by continuity we have P -a.s.(ω),


for all t ≥ 0, At (ω) = Bt (ω). This completes the proof of the lemma.

We now proceed with the construction of hM i, when M is a continuous square inte-


grable martingale. The result is a special case of the so-called Doob-Meyer decomposition
(see for instance [8], p. 24).
Theorem 5.2. Let Mt , t ≥ 0, be a continuous square integrable (Gt )-martingale. Then
there exists a continuous, non-decreasing, (Gt )-adapted process At , t ≥ 0, such that

(5.7) A0 = 0 ,

(5.8) At is integrable for each t ≥ 0 ,

(5.9) Mt2 − At is a (Gt )-martingale ,

and At , t ≥ 0 is essentially unique.


Proof. The essential uniqueness follows from (5.2). We only need to prove the existence
of At , t ≥ 0. Without loss of generality we assume that M0 = 0 (otherwise we replace Mt
with M ft = Mt − M0 ).
We are going to construct A. as a suitable limit of discrete quadratic variations of M. ,
along certain random grids with mesh tending to 0. For this purpose we define, for each
n ≥ 0 (n controls the mesh of the discrete grid), a sequence τkn , k ≥ 0, of stopping times
as follows:

(5.10) τk0 = k, for k ≥ 0 ,

and for n ≥ 1, by induction:

(5.11) n−1
τ0n = 0, and for ℓ ≥ 0, on the event {τkn−1 ≤ τℓn < τk+1 }, k ≥ 0,
n 1 o  1 
n n−1
τℓ+1 = inf t ≥ τℓn ; |Mt − Mτℓn | ≥ ∧ τℓn + ∧ τk+1 .
n n

Using the continuity of M. , we see that for ω ∈ Ω,


 n n (ω), for n, k ≥ 0 ,
 τk (ω) < τk+1





 {τ0n (ω), τ1n (ω), . . . } ⊆ {τ0n+1 (ω), τ1n+1 (ω), . . . } ,


 տ ր
(5.12) subsets of R+





 τkn (ω) → ∞, as k → ∞ ,



 M n 1
τk+1 (ω) (ω) − Mτkn (ω) (ω) ≤ .
n

74
We then choose K0 < K1 < · · · < Kn < . . . in N so that

n 1
(5.13) P (τK ≤ n) ≤ , for n ≥ 0 ,
n
n
and define
KX
n −1

(5.14) In (t) = n ∧t − Mτ n ∧t )
Mτkn (Mτk+1 k
k=0

as well as
KX
n −1
2
n ∧t − Mτ n ∧t ) .
(5.15) An (t) = (Mτk+1 k
k=0

Note that In (0) = 0, An (0) = 0, that In (·), An (·) are continuous and adapted (for instance
in the case of (5.14), the generic term vanishes on {τkn > t} and one has

1{τkn ≤ t}Mτkn Mτk+1
n ∧t − Mτkn ∧t is Gt -measurable ,
| {z } | {z } | {z }
↑ ↑ ↑
Gt −measurable Gτ n ∧t ⊆Gt −meas. Gτ n ∧t ⊆Gt −meas.
k+1 k

and the case of (5.15) is easier). In fact one has

(5.16) In (t), t ≥ 0, is a continuous (Gt )-martingale, bounded for each t .

Here we only need to check that for s ≤ t:

(5.17) n ∧t − Mτ n ∧t ) | Gs ] = Mτ n (Mτ n ∧s − Mτ n ∧s ) .
E[Mτkn (Mτk+1 k k k+1 k

But using the stopping theorem and the observation above (5.16), we have

E[1{τkn ≤ s} Mτkn (Mτk+1


n ∧t − Mτ n ∧t ) | Gs ] = right-hand side of (5.17)
k
| {z }
∈Gs

(note that the right-hand side of (5.17) equals 0 on {τkn > s}).
On the other hand 1{s < τkn ≤ t} Mτkn is Gτkn -measurable and for A ∈ Gs , 1A 1{s < τkn }
is also Gτkn -measurable and on this set τkn ≤ τk+1
n ∧ t. Hence,

E[1A 1{s < τkn ≤ t} Mτkn (Mτk+1


n ∧t − Mτ n ∧t )] =
k

E[1A 1{s < τkn ≤ t} Mτkn (Mτkn ∨(τk+1


n ∧t) − Mτ n )] = 0,
k

using the stopping theorem, cf. [8], p. 19, for the last equality.
As a result, E[1{s < τkn ≤ t} Mτkn (Mτk+1n ∧t − Mτ n ∧t ) | Gs ] = 0, and (5.17) now easily
k
n
follows since 1{τk > t} (Mτk+1 ∧t − Mτk ∧t ) = 0. This proves (5.16).
n n

75
By direct inspection of (5.15) and (5.11) we see that
1 1
(5.18) for t ≥ s + , An (t) + 2 ≥ An (s) .
n n
n > n}, M =
P
Moreover, for n ≥ 1, on {τK n t 0≤k<Kn (Mτk+1 ∧t − Mτk ∧t ), for 0 ≤ t ≤ n, so
n n

that expanding the square and regrouping terms, we see that, cf. (5.14), (5.15):

(5.19) Mt2 = 2In (t) + An (t), for 0 ≤ t ≤ n, on {τK


n
n
> n} .

The next step is to prove the P -a.s. uniform convergence on compact intervals of Inℓ (·)
for a suitably chosen subsequence nℓ . To this end we will use the next
Lemma 5.3. (T > 0, ε > 0)

(5.20) lim sup P [ sup |In (t) − Im (t)| ≥ ε] = 0 .


m→∞ n≥m 0≤t≤T

m ∧ τ n . By Doob’s inequality, cf. (4.67),


Proof. Choose n ≥ m ≥ T , and define S = T ∧ τK m Kn
we find that
  m ≤ m or τ n ≤ n] +
P sup |In (t) − Im (t)| ≥ ε ≤ P [τK m Kn
0≤t≤T
(5.21)   (4.67) 2 1
P sup |In (t ∧ S) − Im (t ∧ S)| ≥ ε ≤ + 2 E[(In (S) − Im (S))2 ] ,
0≤t≤T (5.13) m ε

where we have used (5.16) and the stopping theorem.


If we now define for k, ℓ ≥ 0,

(5.22) ρk = τkm ∧ S, σℓ = τℓn ∧ S ,

it follows from the second line of (5.12) that

{ρ0 (ω), . . . , ρk (ω), . . . } ⊆ {σ0 (ω), σ1 (ω), . . . , σℓ (ω), . . . } ,

and from (5.14) that


X X
In (S) − Im (S) = Mσℓ (Mσℓ+1 − Mσℓ ) − Mρk (Mρk+1 − Mρk ) =
ℓ≥0 k≥0
X
1{ρk ≤ σℓ < ρk+1 } Mσℓ (Mσℓ+1 − Mσℓ ) −
k,ℓ≥0
(5.23) X
1{ρk ≤ σℓ < ρk+1 } Mρk (Mσℓ+1 − Mσℓ ) =
k,ℓ≥0
X def
X
1{ρk ≤ σℓ < ρk+1 } (Mσℓ − Mρk )(Mσℓ+1 − Mσℓ ) = ak,ℓ (ω) .
k,ℓ≥0 k,ℓ≥0

76
Note that

(5.24) the ak,ℓ , k, ℓ ≥ 0, are pairwise orthogonal in L2 (P ) .

Indeed for ℓ < ℓ′ , cf. Exercise 2.13 2)

ak,ℓ = 1{ρk ≤ σℓ < ρk+1 }(Mσℓ − Mρk )(Mσℓ+1 − Mσℓ ) is Gσℓ+1 ⊆ Gσℓ′ -measurable

and:
1{ρk′ ≤ σℓ′ < ρk′ +1 }(Mσℓ′ − Mρk′ ) is Gσℓ′ -measurable as well.

Since E[Mσℓ′ +1 | Gσℓ′ ] = Mσℓ′ (see for instance [8], p. 19), it follows that E[ak,ℓ ak′ ,ℓ′ ] = 0,
for ℓ < ℓ′ . To obtain (5.24), one simply notes that for ℓ ≥ 0, k < k′ , ak,ℓ (ω) ak′ ,ℓ (ω) = 0.

Coming back to (5.23), we conclude by (5.24) that


X (5.11)
E[(In (S) − Im (S))2 ] = E[a2k,ℓ ] ≤
k,ℓ≥0
1 h X i 1 hX i
2 2
E 1{ρk ≤ σ ℓ < ρ k+1 }(Mσ − Mσ ) = E (Mσ − Mσ )
m2 ℓ+1 ℓ
m2 ℓ+1 ℓ
k,ℓ≥0 ℓ≥0
(5.25)
and since the above increments are pairwise orthogonal
1 1
= 2 E[MS2 ] ≤ 2 E[MT2 ] ,
m m
2
since Mt , t ≥ 0, is a continuous submartingale, and S ≤ T .

Inserting this inequality in (5.21) yields that


  2 1
(5.26) P sup |In (t) − Im (t)| ≥ ε ≤ + 2 2 E[MT2 ] .
0≤t≤T m m ε

In particular (5.20) follows, and the lemma is proved.

We can now extract nℓ ≥ ℓ2 , such that


h 1i 1
sup P sup |In (t) − Inℓ (t)| ≥ 2 ≤ ℓ ,
n≥nℓ 0≤t≤ℓ ℓ 2

so that
h 1i 1
(5.27) P sup |Inℓ+1 (t) − Inℓ (t)| ≥ 2
≤ ℓ.
0≤t≤ℓ ℓ 2

By (5.13) and Borel-Cantelli’s lemma we can choose N ∈ G with P (N ) = 0, so that


1
(5.28) sup |Inℓ+1 (t) − Inℓ (t)| ≤ , τ nℓ > ℓ2 , for ℓ ≥ ℓ0 (ω), when ω ∈
/N.
0≤t≤ℓ ℓ2 Knℓ

77
Thus Inℓ (·, ω) converges uniformly on compact intervals when ω ∈
/ N , and we define

I(t, ω) = lim Inℓ (t, ω), for ω ∈


/ N,

(5.29)
1
= Mt2 (ω), for ω ∈ N .
2

Therefore t ∈ R+ → I(t, ω) is continuous for ω ∈ Ω, and I(t, ω) is (Gt )-measurable for


each t ≥ 0 (we use the fact that G0 contains all negligible sets of G).
nℓ
By (5.19) and the fact that τK nℓ
→ ∞, when ω ∈
/ N , we see that when ω ∈
/ N , Anℓ (·, ω)
converges uniformly on compact intervals of R+ to
def
(5.30) At (ω) = Mt2 (ω) − 2I(t, ω), ω ∈ Ω.

Thus At (ω) is (Gt )-measurable, for all t ≥ 0, continuous in t for all ω ∈ Ω. Note that
A0 = 0, and due to (5.18) when ω ∈ / N , and to (5.29), (5.30) when ω ∈ N , t → At (ω) is
non-decreasing in t for all ω ∈ Ω.
We will now prove that for n0 ≥ 1, k0 ≥ 1, t ≥ 0,

(5.31) Inℓ (τkn00 ∧ t) −→ I(τkn00 ∧ t) in L1 (P ) .


ℓ→∞

We already know the P -a.s. convergence, cf. (5.29). It thus suffices to prove that Inℓ (τkn00 ∧
t), ℓ ≥ 0, are uniformly integrable. However writing for m ≥ n0 ,

νk = τkm ∧ t ∧ τkn00 ,

we see as in (5.23) that


X
(5.32) Im (τkn00 ∧ t) = Mνk (Mνk+1 − Mνk ) .
k≥0

Since we have the bound


(5.11) k0
|Mνk | ≤ sup |Mu | ≤ ,
n
0≤u≤τk 0 n0
0

a similar (but easier) calculation as in (5.25) yields that for m ≥ n0 :


 k 2 X  k 2
0 0
(5.33) E[Im (τkn00 ∧ t)2 ] ≤ E[(Mνk+1 − Mνk )2 ] ≤ E[Mt2 ] .
n0 n0
k≥0

This proves the asserted uniform integrability and (5.31) follows.

By the stopping theorem, Inℓ (τkn00 ∧ t), t ≥ 0, are martingales, and by (5.31) we deduce
that I(τkn00 ∧ t), t ≥ 0, is a (continuous) martingale. By (5.30) we now find that

E[A(τ^{n0}_{k0} ∧ t)] = E[M²(τ^{n0}_{k0} ∧ t)] ,
and letting k0 → ∞, using monotone convergence on the left-hand side and dominated convergence on the right-hand side (recall E[sup_{s≤t} |Ms|²] ≤ 4E[Mt²], cf. (4.76)), we obtain
E[A(t)] = E[M²(t)] .

This proves (5.8), and it also follows that


L1 (P )
(5.34) M 2 (τkn00 ∧ t)2 − A(τkn00 ∧ t) −→ M 2 (t) − A(t), for t ≥ 0 .
k0 →∞

The claim (5.9) follows and the theorem is proved.

Notation:
When M. is a continuous square integrable martingale, the essentially unique process
A. constructed in the above theorem is denoted by hM i, it is the so-called “quadratic
variation” of M (in some sense (5.15) explains the terminology).
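For readers who want to see the terminology at work numerically, here is a small sketch (not part of the text; it assumes a standard Python/NumPy environment, and all names in it are purely illustrative) which approximates hBi_t for a simulated Brownian path by sums of squared increments along finer and finer partitions; the sums stabilize near t, in agreement with hBi_t = t.

    import numpy as np

    rng = np.random.default_rng(0)
    T, N = 1.0, 2**16                     # fine grid on which the Brownian path is sampled
    dt = T / N
    B = np.concatenate(([0.0], np.cumsum(rng.normal(0.0, np.sqrt(dt), N))))

    for k in (2**8, 2**10, 2**12, 2**14, 2**16):
        incr = np.diff(B[::N // k])       # increments along a partition with k steps
        print(k, np.sum(incr**2))         # approaches T = 1 as the partition is refined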
When (Zt )t≥0 , is a stochastic process and T a random time, one introduces:
def
(5.35) ZtT = Zt∧T , t ≥ 0, the so-called stopped process .

Corollary 5.4. Let (Mt )t≥0 , be a continuous local martingale. Then, there exists an
essentially unique, continuous, non-decreasing, (Gt )-adapted process hM it , t ≥ 0, such
that

(5.36) hM i0 = 0 ,

(5.37) Mt2 − M02 − hM it , t ≥ 0, is a continuous local martingale.

Moreover, when T is a (Gt )-stopping time, one has

(5.38) P -a.s., for all t ≥ 0, hM T it = hM iT ∧t (= hM iTt ) .

Proof. The uniqueness part of the statement follows from (5.2). As for the existence part,
choose stopping times Tn ↑ ∞, P -a.s., so that M.Tn is a continuous square integrable
martingale. Note that by the stopping theorem,
(M^{Tn+1}_{t∧Tn})² − hM^{Tn+1} i_{t∧Tn} = M²_{t∧Tn} − hM^{Tn+1} i_{t∧Tn} , t ≥ 0 ,

is a (Gt )-martingale. From (5.2) it follows that

(5.39) P -a.s., for all t ≥ 0, hM Tn+1 it∧Tn = hM Tn it .

As a result we can find N in G with P (N ) = 0, so that

(5.40) when ω ∉ N, for all m ≥ n ≥ 0 and 0 ≤ t ≤ Tn(ω): hM^{Tm} i_t(ω) = hM^{Tn} i_t(ω) .

We thus define

(5.41) hM i_t(ω) = hM^{Tn} i_t(ω), for any n ≥ 0 with Tn(ω) ≥ t, if ω ∉ N, and hM i_t(ω) = 0, if ω ∈ N .

Note that hM it , t ≥ 0, is continuous, non-decreasing, (Gt )-adapted, and (5.36) holds.


Moreover hM iT. n and hM Tn i. are indistinguishable so that Mt∧T2
n
− M02 − hM it∧Tn is a
continuous martingale with value 0 at time 0. The argument below (4.104) shows that
(5.37) holds. As for (5.38) it directly follows from the previous existence and uniqueness
2
result, and the fact that Mt∧T − M02 − hM it∧T is a continuous local martingale.

Notation:
When M, N are continuous local martingales one writes:

(5.42) hM, N i_t = (1/4) ( hM + N i_t − hM − N i_t ), t ≥ 0   (polarization identity).

Corollary 5.5. When M, N are continuous local martingales, hM, N it , t ≥ 0, is a con-


tinuous adapted process with bounded variation on finite intervals, essentially unique such
that

(5.43) hM, N i0 = 0 ,

(5.44) Mt Nt − M0 N0 − hM, N it , t ≥ 0, is a continuous local martingale.

Proof. We only have to prove the uniqueness, the other properties being immediate. To
this end observe that when Ct , t ≥ 0, is a continuous adapted process with finite variation
on finite intervals, then
X
(5.45) Vt = lim C k+1 − C kn , t ≥ 0 ,
n→∞ n2 2
k+1
2n
≤t

is a continuous, non-decreasing, adapted process, and

(5.46) Vt − Ct , t ≥ 0, is non-decreasing as well .

We apply this observation to the difference of hM, N it with Dt , some other continuous
adapted process, with finite variation on finite intervals, satisfying similar conditions as in
(5.43), (5.44). By (5.2) we conclude that

(5.47) P -a.s., for all t ≥ 0, hM, N it = Dt .

We now turn to the construction of the stochastic integrals with respect to
continuous local martingales. This construction involves several steps, which often
are very similar to what has been done in the previous chapter (such steps will be merely
briefly discussed below).
For Hs (ω) a basic process (i.e. Hs (ω) = C(ω) 1{a < s ≤ b}, with C ∈ bGa , cf. (4.26)),
and Ms , s ≥ 0, a continuous square integrable martingale one defines in the spirit
of (4.27):
Z ∞
def
(5.48) Hs dMs = C(ω)(Mb (ω) − Ma (ω)) ,
0

and for 0 ≤ t ≤ ∞:
Z t Z ∞
def (5.48)
(5.49) Hs dMs = (H1[0,t] )s dMs = C(ω)(Mb∧t (ω) − Ma∧t (ω)) .
0 0

One immediately extends the definition to H ∈ Λ1 , i.e. H = H 1 + · · · + H n , with H i basic


processes, for 1 ≤ i ≤ n, cf. (4.36), by the formula
Z ∞ Xn Z ∞
def
(5.50) Hs dMs = Hsi dMs ,
0 i=1 0

and one checks that this is well-defined and that, as in (4.39), one has
hZ ∞ Z ∞ i hZ ∞ i
(5.51) E Hs dMs Ks dMs = E Hs (ω) Ks (ω) dhM is (ω) , for H, K ∈ Λ1 .
0 0 0
R∞
With the help of (4.47) one extends the definition of 0 Hs dMs to H in

(5.52) Λ2 (M ) = L2 (Ω × R+ , P, dP × dhM is ) ,

so that
Z ∞
(5.53) H ∈ Λ2 (M ) → Hs dMs ∈ L2 (P ) is an isometry .
0
Rt
One chooses a “good version” of 0 Hs dMs , t ≥ 0, with similar arguments as in the proof
of (4.69), (4.70), denoted by (H · M )t , t ≥ 0, such that
(5.54) (H.M )t , t ≥ 0, is a continuous, square integrable (Gt )-martingale
with value 0 at time 0 ,
Z t
(5.55) for each t ≥ 0, P -a.s., (H.M )t = Hs dMs ,
0
Z t
(5.56) Nt = (H.M )2t − Hs2 dhM is , t ≥ 0, is a continuous martingale,
0
with sup |Nt | ∈ L1 (P ) (and value 0 at time 0) .
t≥0

In particular, cf. (5.9), (5.2), (5.56),
Z t
(5.57) hH.M it = Hs2 dhM is , t ≥ 0 .
0
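The isometry (5.51), (5.53) can also be checked numerically in the simplest case where M is a Brownian motion and Hs = s; the following sketch (a hypothetical illustration assuming Python/NumPy, not part of the text) compares E[(∫_0^T s dBs)²] with ∫_0^T s² ds = T³/3.

    import numpy as np

    rng = np.random.default_rng(1)
    T, n, paths = 1.0, 1000, 20000
    dt = T / n
    t_left = np.arange(n) * dt                          # left endpoints of the subintervals
    dB = rng.normal(0.0, np.sqrt(dt), size=(paths, n))
    I = (t_left * dB).sum(axis=1)                       # left-point sums approximating the stochastic integral
    print((I**2).mean(), T**3 / 3)                      # both values are close to 1/3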

The above defined stochastic integral has the following property (with a similar argument
as for the proof of (4.62)): when H, K ∈ Λ2 (M ), G ∈ G are such that

(5.58) Hs (ω) = Ks (ω), for all s ≥ 0, when ω ∈ G ,

then

(5.59) P -a.s., for 0 ≤ t ≤ ∞, (H.M )t (ω) = (K.M )t (ω), for ω ∈ G .

Then, one has (in the notation of (5.35)):

Theorem 5.6. (stopping theorem for stochastic integrals)


Let T be a (Gt )-stopping time and H ∈ Λ2 (M ), then

P -a.s., for 0 ≤ t ≤ ∞,
(5.60)
((1[0,T ] H).M )t = (H.M )t∧T = (H.M T )t = ((1[0,T ] H).M T )t .

Proof. Note that MtT = Mt∧T , t ≥ 0, is also a continuous square integrable martingale
and, cf. (5.38), hM T i. = hM iT. , so that H ∈ Λ2 (M T ) as well. Then we find just as in
(4.91) that

(5.61) P -a.s., for 0 ≤ t ≤ ∞, ((1[0,T ] H).M )t = (H.M )t∧T .

Moreover since hM T i. = hM iT. , we see that

(5.62) H = 1[0,T ] H in L2 (Ω × R+ , P, dP × dhM T i) ,

so that

(5.63) P -a.s., for 0 ≤ t ≤ ∞, (H.M T )t = ((1[0,T ] H).M T )t .

Then, observe by coming back to (5.49) that for K basic process, and then K in Λ1 , one
has

(5.64) for 0 ≤ t ≤ ∞, (K.M )t∧T = (K.M T )t .

Then by approximation of H ∈ Λ2 (M ) (and hence in Λ2 (M T )) one finds that

(5.65) P -a.s., for 0 ≤ t ≤ ∞, (H.M )t∧T = (H.M T )t .

Combining (5.61), (5.63), (5.65), we obtain (5.60).

With the help of the stopping theorem for stochastic integrals, cf. (5.60), we will now
extend the definition of stochastic integrals.
For M a continuous local martingale with value 0 at time 0, we define

(5.66) Λ3(M) = { H : P-measurable functions on Ω × R+ such that P-a.s., ∀t ≥ 0, ∫_0^t Hs²(ω) dhM is(ω) < ∞ } .

One then considers a non-decreasing sequence of stopping times Tn , n ≥ 0, P -a.s. tending


to ∞, such that
(5.67) M Tn is a continuous square integrable martingale for each n ≥ 0 ,
(5.68) 1[0,Tn ] H ∈ Λ2 (M Tn ) .

One such sequence is for instance obtained by setting:


n Z s o
(5.69) Tn (ω) = inf s ≥ 0, |Ms (ω)| ≥ n or Hs2 (ω) dhM is (ω) ≥ n .
0

Definition and Theorem 5.7. If Tn ↑ ∞, P-a.s., is a sequence of stopping times satisfying (5.67), (5.68), then the event

(5.70) N = ∪_{n≥0} { ω ∈ Ω; ∃t ∈ Q+, ((H1[0,Tn+1]).M^{Tn+1})_{t∧Tn} ≠ ((H1[0,Tn]).M^{Tn})_{t∧Tn} } ∪ { ω ∈ Ω; lim_n Tn(ω) < ∞ } is P-negligible, and

(5.71) (H.M)_t(ω) def= ((H1[0,Tn]).M^{Tn})_t(ω), for 0 ≤ t ≤ Tn(ω), if ω ∉ N, and (H.M)_t(ω) = 0, if ω ∈ N,
is a continuous local martingale. It is defined in an essentially unique way if one uses
different choices of Tn , n ≥ 0, and of ((H1[0,Tn ] ).M Tn )t .
Proof. By (5.60), letting M Tn+1 play the role of M , and H1[0,Tn+1 ] of H, we find that
P -a.s., for 0 ≤ t ≤ ∞,
(5.72) ((H1[0,Tn+1 ] ).M Tn+1 )t∧Tn = ((H1[0,Tn ] ).M Tn )t = ((H1[0,Tn ] ).M Tn )t∧Tn ,

where the last equality follows from (5.60), with M replaced by M Tn and H by H1[0,Tn ] .
We thus find that P (N ) = 0, and it is immediate from (5.71) that (H.M )t defines a
continuous local martingale. Now, when Tn , Tn′ , n ≥ 0, are two sequences of stopping
times satisfying the assumptions of the theorem, setting Sn = Tn ∧ Tn′ ↑ ∞, P -a.s., we find
that P -a.s., for 0 ≤ t ≤ Sn (ω):
(5.60) (5.60) ′
(5.73) ((H1[0,Tn ] ).M Tn )t (ω) = ((H1[0,Sn ] ).M Sn )t (ω) = ((H1[0,Tn′ ] ).M Tn )t (ω) ,

and the claim about the essential uniqueness follows.

Remark 5.8. When M is a continuous square integrable martingale, with M0 = 0, and
H ∈ Λ2 (M ), we can take Tn ≡ ∞ in the previous definition, so that (5.71) agrees with
our previous definition of the stochastic integral. 

We will now give an alternative characterization of (H.M ). , by means of its bracket


h(H.M ), N i with other continuous local martingales. The following will be helpful.

Proposition 5.9. (Kunita-Watanabe’s inequality, 1967)


If H, K are progressively measurable, M, N are continuous local martingales, and
(5.74) P-a.s., ∫_0^∞ Hs²(ω) dhM is(ω) < ∞ and ∫_0^∞ Ks²(ω) dhN is(ω) < ∞ ,

then

(5.75) P-a.s., ∫_0^∞ Hs(ω) Ks(ω) d|hM, N i|s(ω) ≤ ( ∫_0^∞ Hs²(ω) dhM is(ω) )^{1/2} ( ∫_0^∞ Ks²(ω) dhN is(ω) )^{1/2}

(with |hM, N i|s denoting the total variation process of hM, N is , cf. (5.45)).

Proof. From (5.43), (5.44), we see that

(5.76) P -a.s., for all λ ∈ Q, all t ≥ 0, hM + λN it = hM it + 2λhM, N it + λ2 hN it .


def
Hence, P -a.s., for s ≤ t, λ ∈ Q (with h its = h it − h is )

(5.77) hM its + 2λhM, N its + λ2 hN its ≥ 0 ,

and thus, looking at the discriminant (in λ), we find that


1 1
P -a.s., for 0 ≤ s ≤ t, |hM, N its (ω)| ≤ hM its (ω) 2 (hN its (ω)) 2
(5.78)
1 1
≤ hM its (ω) + hN its (ω) .
2 2

This shows that on the above set of full P -measure


1 1 def
(5.79) d |hM, N i|s ≤ dhM is + dhN is = dνω (s)
2 2

and we can then introduce fsM (ω), fsN (ω), fsM,N (ω) the respective densities of dhM is (ω),
dhN is (ω) and dhM, N is (ω) with respect to dνω (s).

Coming back to (5.76), and expressing the density of dhM + λN is (ω) with respect to
dνω (s), we see that P -a.s., for νω -a.e. s,

(5.80) fsM (ω) + 2λ fsM,N (ω) + λ2 fsN (ω) ≥ 0, for λ ∈ Q, and hence λ ∈ R .

We first assume H, K ≥ 0, bounded and compactly supported in s. Setting sign(x) = 1{x ≥ 0} − 1{x < 0}, H̃s = Hs sign(f^{M,N}_s), and λ = γ Ks H̃s^{−1} 1{Hs ≠ 0}, we see, multiplying (5.80) by Hs² and considering the set of s where Hs = 0 separately, that

P-a.s., for νω-a.e. s, for all γ ∈ R,

Hs²(ω) f^M_s(ω) + 2γ H̃s(ω) Ks(ω) f^{M,N}_s(ω) + γ² Ks²(ω) f^N_s(ω) ≥ 0 .

Integrating over s, with respect to dνω (s), we find that P -a.s. for all γ ∈ R,
Z ∞ Z ∞ Z ∞
2 2
(5.81) Hs (ω) dhM is + 2γ Hs (ω) Ks (ω) d |hM, N i|s + γ Ks2 (ω) dhN is ≥ 0 ,
0 0 0

and looking at the discriminant in γ we find that P -a.s.,


Z ∞ Z ∞ 1 Z ∞ 1
2 2
Hs Ks d |hM, N i|s ≤ Hs2 (ω) dhM is Ks dhN is .
0 0 0

The case of non-negative H, K satisfying (5.74) follows by truncation and monotone con-
vergence, and then the general case is immediate.

We now have the following characterization of (H.M ) for H ∈ Λ3 (M ):

Theorem 5.10. Let M be a continuous local martingale with M0 = 0, H ∈ Λ3 (M ),


then (H.M ) is the unique continuous local martingale vanishing in 0, such that for all
continuous local martingales N , P -a.s.
Z t
(5.82) h(H.M ), N it = Hs (ω) dhM, N is , for all t ≥ 0
0

(note that (5.75) implies that P -a.s. the right-hand side is well-defined).

Proof.
• Uniqueness:
If I, Ie are continuous local martingales vanishing at time 0 such that hI, N i = hI,
e N i for
all continuous local martingales N , we have hI − Ii e = 0. Hence, we can find stopping
e 2
times Tn ↑ ∞, P -a.s., such that (I − I)t∧Tn , t ≥ 0, are bounded continuous martingales,
cf. Remark 4.27. Hence, we see that

(5.83) e 2 ] = 0, t ≥ 0, n ≥ 0 ,
E[(I − I)t∧Tn

so that P -a.s., for t ∈ Q ∩ [0, ∞), n ≥ 0, It∧Tn = Iet∧Tn . Since P -a.s., Tn ↑ ∞, using
continuity, we find that

(5.84) P -a.s., for all t ≥ 0, It = Iet .

• (5.82):
When H = C1(a,b] , with 0 ≤ a < b, C ∈ bGa , is a basic process and M, N are continuous
square integrable martingales,
Z t
Jt = (H.M )t Nt − Hs dhM, N is =
(5.85) 0

C (Mb∧t − Ma∧t ) Nt − hM, N ib∧t
a∧t is a continuous martingale.

For instance, when a ≤ s ≤ b, s < t:

(5.86) E[Jt | Gs] = C E[ E[ (Mb∧t − Ma) Nt − hM, N i^{b∧t}_a | Gb∧t ] | Gs ]
       = C E[ (Mb∧t − Ma) Nb∧t − hM, N i^{b∧t}_a | Gs ]
       (since Nt∧b, t ≥ 0, and Mb∧t Nb∧t − hM, N i_{b∧t} are martingales)
       = C { (Mb∧s − Ma) Nb∧s − hM, N i^{b∧s}_a } = Js ,

and the other cases are easier to check. We then find that (5.82) holds for H ∈ Λ1 , M, N
continuous square integrable martingales. Then, keeping M, N as above, for H ∈ Λ2 (M )
L2 (P )
we can choose H n in Λ1 , approximating H in Λ2 (M ), so that (H n .M )t −→ (H.M )t , for
t ≥ 0. Then, as a result of (5.75), for t ≥ 0,
hZ t i
E |Hs (ω) − Hsn (ω)| d |hM, N i|s ≤
0
Cauchy-
h Z ∞ 1 1 i Schwarz
(5.87) E (H − H n )2s (ω) dhM is (ω)
2
hN it
2

0
1
kH − H n kL2 (dP ×dhM i) E[hN it ] 2 −→ 0 .
n→∞

As a result, we see that for t ≥ 0,


Z t Z t
n n L1 (P )
(5.88) (H .M )t Nt − Hs dhM, N is −→ (H.M )t Nt − Hs dhM, N is ,
0 n→∞ 0

and the limit is a martingale as well.

Thus we have proved that (5.82) holds when M, N are continuous square integrable
martingales, and H ∈ Λ2 (M ).
Now, in the general case of the theorem, when H ∈ Λ3 (M ), we choose stopping times
Tn ↑ ∞, P -a.s., so that M Tn , N Tn are continuous square integrable martingales, and
H1[0,Tn ] ∈ Λ2 (M Tn ), for each n ≥ 0. Then we find from (5.71) that P -a.s., for all t ≥ 0,
Z t∧Tn
(H.M )t∧Tn Nt∧Tn − Hs (ω) dhM, N is =
0
Z t
(5.38)
(5.89) ((H1[0,Tn ] ).M Tn )t NtTn − (H1[0,Tn ] )s dhM, N is∧Tn =
0 (5.42)

Z t
((H1[0,Tn ] ).M Tn )t NtTn − (H1[0,Tn ] )s dhM Tn , N Tn is ,
0

which is a continuous martingale.


Rt
This proves that (H.M )t Nt − 0 Hs dhM, N is , t ≥ 0, is a continuous local martingale,
and by (5.43), (5.44) (recall that (H.M )0 = 0), the claim (5.82) follows in the general
case.

Corollary 5.11. For M, N continuous local martingales, vanishing at time 0, H ∈ Λ3 (M ),


K ∈ Λ3 (N ),
Z t
(5.90) P -a.s., h(H.M ), (K.N )it = Hs (ω) Ks (ω) dhM, N is , for t ≥ 0 .
0

Proof. By (5.82) we find that P -a.s.

dh(H.M ), (K.N )i = HdhM, (K.N )i = HKdhM, N i .

We have the following very useful consequence of this result:

Corollary 5.12. For M continuous local martingale with M0 = 0, H ∈ Λ3 (M ), K ∈


Λ3 ((H.M )) one has

(5.91) HK ∈ Λ3 (M ) and

(5.92) P -a.s., for t ≥ 0, (K.(H.M ))t = ((K · H).M )t .

Proof.
• (5.91):
By (5.90), we have P -a.s., dh(H.M )i = H 2 dhM i, so that K ∈ Λ3 ((H.M )) means that K
Rt
is P measurable and P -a.s., 0 Ks2 Hs2 dhM is < ∞, and therefore HK ∈ Λ3 (M ).

• (5.92):
By (5.82), for N continuous local martingale, P -a.s.

dh(K.(H.M )), N i = K dh(H.M ), N i = KH dhM, N i

and (5.92) now follows from the uniqueness part of (5.82).

6 Ito’s formula and first applications
In this chapter we will prove Ito’s formula, which is a fundamental “change of variable
formula” for stochastic integrals, and the source of many explicit calculations. We will
discuss some of its applications. Throughout this chapter (Ω, G, (Gt )t≥0 , P ) will denote a
filtered probability space, which satisfies the “usual conditions”, cf. (4.5), (4.6).
Definition 6.1. A continuous semimartingale (Yt )t≥0 on (Ω, G, (Gt )t≥0 , P ) is a con-
tinuous adapted process, which admits the decomposition
(6.1) Yt = Y0 + Mt + At , t ≥ 0 ,
where Mt , t ≥ 0, is a continuous local martingale such that M0 = 0, and At , t ≥ 0, is a
continuous adapted process with bounded variation on finite intervals, such that A0 = 0.
Remark 6.2. The same argument used in the proof of (5.43), (5.44) shows that when
(Yt )t≥0 is a continuous semimartingale,
(6.2) the decomposition (6.1) is essentially unique.

Notation:
For (Yt )t≥0 a continuous semimartingale we will write
n
Λ(Y ) = H : P-measurable on Ω × R+ such that P -a.s., for t ≥ 0,
(6.3) Z t Z t o
2
Hs dhM is < ∞ and |Hs | d|A|s < ∞ ,
0 0

where M and A are as in (6.1) and |A|. denotes the total variation process of A. Then,
for H ∈ Λ(Y ), we will use the notation
Z t Z t Z t
(6.4) Hs dYs = Hs dMs + Hs dAs , t ≥ 0 ,
0 0 0
Rt
so that (6.4) defines 0 Hs dYs , t ≥ 0, in an essentially unique fashion (with respect to the
various versions and decompositions in (6.1)).

Example:
Any continuous adapted process Hs (ω) is automatically in Λ(Y ). In particular an expres-
Rt
sion such as 0 exp{exp(Ys2 + s2 )} dYs (for instance) is well defined. 
An important first step towards Ito’s formula will be the next
Proposition 6.3. (Integration by parts formula)
If Yt , t ≥ 0, and Zt , t ≥ 0, are continuous semimartingales on (Ω, G, (Gt )t≥0 , P ), then
P -a.s., for all t ≥ 0,
Z t Z t
(6.5) Y t Zt = Y 0 Z0 + Ys dZs + Zs dYs + hY, Zit ,
0 0

where hY, Zit , t ≥ 0, denotes the bracket of the local martingale parts of Y and Z.

We begin with several reductions.
It is enough to prove for Y as above, that P -a.s.,
Z t
2 2
(6.6) Yt = Y0 + 2 Ys dYs + hY it , for t ≥ 0
0

(i.e. prove (6.5) when Y = Z). Indeed one then applies (6.6) to (Y + Z)2 and (Y − Z)2
and recovers (6.5).
Moreover, we can also replace Y by Y n = 1{Tn > 0} Y Tn , where Y.Tn = Y·∧Tn , n ≥ 1,
and Tn ↑ ∞ is the sequence of stopping times

(6.7) Tn = inf{s ≥ 0, |Ys | ≥ n, |Ms | ≥ n, |A|s ≥ n or hM is ≥ n}, n ≥ 1 ,

and, in this fashion, we can assume that Y. , M. , |A|. , hY i. are continuous bounded pro-
cesses (so M. is in fact a martingale). Since all processes, which appear in (6.6) are
continuous, it is also sufficient to prove (6.6) for fixed t.

Proof. We thus pick a fixed t ≥ 0, and define for m ≥ 1,


it
(6.8) ti = , 0 ≤ i ≤ m.
m
We then write:

(6.9) Yt² = ( Y0 + Σ_{i=0}^{m−1} (Yti+1 − Yti) )² = Y0² + Σ_{i<m} (Yti+1 − Yti)² + 2 Σ_{i<m} Yti (Yti+1 − Yti) .

We now analyze the convergence of the last two terms in the right-hand side of (6.9), as
m → ∞. We have:
X X X
Yti (Yti+1 − Yti ) = Yti (Mti+1 − Mti ) + Yti (Ati+1 − Ati )
i<m i<m i<m
(6.10) Z t Z t
= Ysm dMs + Ysm dAs ,
0 0

where
X
(6.11) Ysm (ω) = Yti (ω) 1(ti ,ti+1 ] (s) .
0≤i<m

Clearly, using dominated convergence, we find that:


hZ t i
E (Ys − Ysm )2 dhY is −→ 0, and
0 m→∞
hZ t i
E |Ys − Ysm | d|A|s −→ 0 .
0 m→∞

It thus follows that
Z t Z t Z t Z t
m L2 (P ) m L1 (P )
(6.12) Ys dMs −→ Ys dMs and Ys dAs −→ Ys dAs .
0 m→∞ 0 0 m→∞ 0

This shows that


X Z t
L1 (P )
(6.13) Yti (Yti+1 − Yti ) −→ Ys dYs .
m→∞ 0
i<m

We now come back to the second term of the right-hand side of (6.9). We write for
0 ≤ i < m,
def
(6.14) ∆m 2
i = (Mti+1 − Mti ) − (hY iti+1 − hY iti ) .

The calculation below resembles what we did in (3.4). For m ≥ 1,

(6.15) Am def= E[ ( Σ_{i<m} (Mti+1 − Mti)² − hY i_t )² ] = E[ ( Σ_{i<m} Δ^m_i )² ] = Σ_{i<m} E[(Δ^m_i)²] + 2 Σ_{i<j<m} E[Δ^m_i Δ^m_j] .

As we now explain, the Δ^m_i, 0 ≤ i < m, are pairwise orthogonal.
Indeed, by (5.9), we have for j < m,

(6.16) E[Δ^m_j | Gtj] = E[ M²_{tj+1} − 2 M_{tj+1} M_{tj} + M²_{tj} − (hY itj+1 − hY itj) | Gtj ]
       = E[ M²_{tj+1} − hM itj+1 | Gtj ] − 2 M_{tj} E[M_{tj+1} | Gtj] + M²_{tj} + hY itj   (using (5.9))
       = M²_{tj} − hM itj − 2 M²_{tj} + M²_{tj} + hY itj = 0 .

Thus, the last term of (6.15) vanishes (Δ^m_i is Gtj-measurable for i < j). One has
Am = Σ_{i<m} E[(Δ^m_i)²] ≤ Σ_{i<m} { 2E[(Mti+1 − Mti)⁴] + 2E[(hY iti+1 − hY iti)²] }
   ≤ 2E[ sup_{i<m} |Mti+1 − Mti|² Σ_{i<m} Δ^m_i ] + 2E[ ( sup_{i<m} |Mti+1 − Mti|² + sup_{i<m} |hY iti+1 − hY iti| ) hY i_t ] ,

and the last term tends to 0, as m → ∞, by dominated convergence. Further, note that 2ab ≤ 2a² + b²/2, for a, b in R. So,

(6.17) E[ sup_{i<m} |Mti+1 − Mti|² Σ_{i<m} Δ^m_i ] ≤ E[ sup_{i<m} |Mti+1 − Mti|⁴ ] + Am/4   (using (6.15)),

and the first term on the right-hand side tends to 0, as m → ∞, by dominated convergence.

Thus, coming back to the line above, we have shown that

(6.18) Am = Σ_{i<m} E[(Δ^m_i)²] −→ 0, as m → ∞ .

By (6.15), it means that

(6.19) Σ_{i<m} (Mti+1 − Mti)² −→ hY i_t in L2(P), as m → ∞ .

To prove an analogous statement for Σ_{i<m} (Yti+1 − Yti)², which is our main object of interest in view of (6.9), we write:

(6.20) | Σ_{i<m} (Yti+1 − Yti)² − Σ_{i<m} (Mti+1 − Mti)² | ≤ Σ_{i<m} { 2|Ati+1 − Ati| |Mti+1 − Mti| + |Ati+1 − Ati|² }
       ≤ 2 ( Σ_{i<m} |Ati+1 − Ati|² )^{1/2} ( Σ_{i<m} (Mti+1 − Mti)² )^{1/2} + Σ_{i<m} (Ati+1 − Ati)² ,

and observe that by dominated convergence and continuity,

Σ_{i<m} (Ati+1 − Ati)² ≤ sup_{i<m} |Ati+1 − Ati| · |A|_t −→ 0 in L2(P), as m → ∞ .

Thus, coming back to the last line of (6.20), we see using Cauchy-Schwarz's inequality for the first term and (6.19) that all terms converge to 0 in L2(P). We have shown that

(6.21) Σ_{i<m} (Yti+1 − Yti)² −→ hY i_t in L2(P), as m → ∞ .

Together with (6.13) this concludes the proof of (6.6) for fixed t, and thus our general
claim (6.5) has been established.

We now turn to the main result.

Theorem 6.4. (Ito’s formula)


Let F be a C 2 -function on Rd , and Y.1 , . . . , Y.d be continuous semimartingales on
(Ω, G, (Gt )t≥0 , P ). Then, writing Y. = (Y.1 , . . . , Y.d ), the real-valued process F (Yt ), t ≥ 0,
is a continuous semimartingale and P -a.s., for all t ≥ 0,
(6.22) F(Yt) = F(Y0) + Σ_{i=1}^d ∫_0^t ∂iF(Ys) dY^i_s + (1/2) Σ_{i,j=1}^d ∫_0^t ∂²_{i,j}F(Ys) dhY^i, Y^j i_s .

Proof. The formula (6.22) shows that F(Yt), t ≥ 0, is a continuous semimartingale. We
use several reductions to prove (6.22).

• First reduction:
Using “localization”, similarly as explained above (6.7), we can assume that Yti , |hY i , Y j i|t
and the total variation of the bounded variation processes entering the decomposition (6.1)
of the Yti , t ≥ 0, are uniformly bounded processes.

• Second reduction:
We can assume that F (·) is C 2 with compact support.

• Third reduction:
We can assume that F (·) is a polynomial.
Indeed note that it suffices now to prove (6.22) for F ∈ Cc^∞(Rd, R), since any F ∈ Cc²(Rd, R) is approximated, for instance by convolution, in C²-topology by such functions, and (6.22) remains true in the limit. But F ∈ Cc^∞(Rd, R) is approximated in C²-topology on any compact set by linear combinations of e^{iξ·x}, with ξ ∈ Rd (for instance using Fourier series: when L is large enough so that support(F) ⊂ (−L/2, L/2)^d, one has

F(x) = Σ_{k∈Z^d} ak e^{i2π(k/L)·x}, for x ∈ (−L/2, L/2)^d, with ak = (1/L^d) ∫_{(−L/2, L/2)^d} F(z) e^{−i2π(k/L)·z} dz ).

Now, we have the expansion e^{iξ·x} = Σ_{n≥0} (1/n!)(iξ·x)^n, which shows that e^{iξ·x} is approximated in C²-topology on compact sets by polynomials, and the third reduction follows.
We are thus reduced to proving (6.22) for F (·) a polynomial in the coordinate variables.
We will now prove that:

the validity of (6.22) for the polynomial F implies the validity of (6.22) for
(6.23)
the polynomials G(x1 , . . . , xd ) = xi0 F (x1 , . . . xd ), for any 1 ≤ i0 ≤ d .

Indeed, we apply (6.5) to Y^{i0} and F(Y), and find that

G(Yt) = G(Y0) + ∫_0^t Y^{i0}_s dF(Y)_s + ∫_0^t F(Ys) dY^{i0}_s + hY^{i0}, F(Y)i_t ,

and since (6.22) holds true for F(Y),

G(Yt) = G(Y0) + Σ_{i=1}^d ∫_0^t Y^{i0}_s ∂iF(Ys) dY^i_s + ∫_0^t F(Ys) dY^{i0}_s + (1/2) Σ_{i,j=1}^d ∫_0^t Y^{i0}_s ∂²_{i,j}F(Ys) dhY^i, Y^j i_s + hY^{i0}, F(Y)i_t .

Now by (5.82) and (6.22) we also have

hY^{i0}, F(Y)i_t = Σ_{i=1}^d ∫_0^t ∂iF(Ys) dhY^{i0}, Y^i i_s .

Inserting this identity in the last term of the previous formula, we find that:

G(Yt) = G(Y0) + Σ_{i=1}^d ∫_0^t ∂iG(Ys) dY^i_s + (1/2) Σ_{i,j=1}^d ∫_0^t ∂²_{i,j}G(Ys) dhY^i, Y^j i_s ,

and (6.23) is proved.

Since (6.22) clearly holds when F = constant, it follows by (6.23), that (6.22) holds
for all polynomials F . This, as explained above, yields the general claim.

Example: (Canonical d-dimensional Brownian motion)


We can apply the above theorem in the special case when Y. = X. (= (X.1 , . . . , X.d ))
is the canonical d-dimensional Brownian motion, and (C, F, (Ft )t≥0 , W0 ), cf. (4.7), is the
filtered probability space (which as we have seen satisfies the usual conditions). In this
example each X^i_t, t ≥ 0, is an (Ft)-martingale, and

(6.24) X^i_t X^j_t − δij t, t ≥ 0, are (Ft)-martingales, for 1 ≤ i, j ≤ d ,

where δij is Kronecker's symbol (when i = j, see (4.2), (4.18); the case i ≠ j uses a simple modification of (4.4)).
In view of (5.43), (5.44), this means that

(6.25) hX i , X j it = δi,j t, t ≥ 0, for 1 ≤ i, j ≤ d .

In particular, in the rightmost term of Ito’s formula, only the terms with i = j are present,
and hence for F ∈ C 2 (Rd , R), W0 -a.s., for all t ≥ 0,
(6.26) F(Xt) = F(0) + Σ_{i=1}^d ∫_0^t ∂iF(Xs) dX^i_s + (1/2) ∫_0^t ∆F(Xs) ds ,

where ∆F(x) def= Σ_{i=1}^d ∂²_iF(x) is the Laplacian of F. 
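A quick numerical sanity check of (6.26) (a sketch assuming Python/NumPy, not part of the text; step size and seed are arbitrary) with d = 1 and F(x) = x²: along a simulated path one should find B_T² ≈ 2 ∫_0^T B_s dB_s + T, the last term being the Ito correction.

    import numpy as np

    rng = np.random.default_rng(2)
    T, n = 1.0, 200000
    dt = T / n
    dB = rng.normal(0.0, np.sqrt(dt), n)
    B = np.concatenate(([0.0], np.cumsum(dB)))
    stoch_int = np.sum(B[:-1] * dB)          # left-point sums approximating the stochastic integral of B against B
    print(B[-1]**2, 2 * stoch_int + T)       # the two numbers agree up to discretization error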

We will now describe some first applications of Ito’s formula. We recall that (Ω, G, (G)t≥0 , P )
is a filtered probability space satisfying the “usual conditions”, cf. (4.5), (4.6).

Exponential Martingales:

Theorem 6.5. Let Mt , t ≥ 0, be a continuous (Gt )-local martingale, with M0 = 0.


(6.27) Zt = exp{ Mt − (1/2) hM i_t }, t ≥ 0, is a continuous (Gt)-local martingale,

which satisfies the "stochastic differential equation":

(6.28) P-a.s., for all t ≥ 0, Zt = 1 + ∫_0^t Zs dMs .

Moreover, if for some ε > 0 and 0 < T ≤ ∞,

(6.29) E[ exp{ ((1 + ε)/2) hM i_T } ] < ∞, then

(6.30) Zt, t ≤ T, is a continuous (Gt)-martingale.

Remark 6.6.
1) We will later see that (6.30) is still valid when (6.29) holds with ε = 0. This is the
so-called Novikov condition, cf. [8], [9]. For the time being we discuss this simpler result
which has an elementary proof and can be helpful in a number of situations.
2) If (6.29) holds with T = ∞, then the proof below will show that Z∞ = limt→∞ Zt exists
P -a.s., and for small η > 0,

(6.30’) Zt , 0 ≤ t ≤ ∞, is a continuous (Gt )-martingale bounded in L1+η .

Exercise 6.7. Show that when E[hM iT ] < ∞, (Mt , t ≥ 0, as above), then Mt , t ≤ T , is
a continuous square integrable martingale.

Proof of Theorem 6.5.


• (6.28): We introduce the function
n o
1
(6.31) f (x, t) = exp x − t , x, t ∈ R .
2

This function satisfies the equation:

∂f 2
1 ∂ f
(6.32) (x, t) + (x, t) = 0 .
∂t 2 ∂x2

We will now apply Ito’s formula (6.22) to Y = (Y 1 , Y 2 ) = (M, hM i). We first note that

hY 1 , Y 2 i = 0 = hY 2 i, and hY 1 i = hM i .

We thus find that P-a.s., for t ≥ 0:

(6.33) f(Mt, hM i_t) = f(0, 0) + ∫_0^t ∂x f(Ms, hM i_s) dMs + ∫_0^t ∂t f(Ms, hM i_s) dhM i_s + (1/2) ∫_0^t ∂²_x f(Ms, hM i_s) dhM i_s
       = 1 + ∫_0^t f(Ms, hM i_s) dMs   (by (6.32), since ∂x f = f) .

Since Zt = f (Mt , hM it ), this proves (6.28), as well as the fact that Zt , t ≥ 0, is a continuous
local martingale.

• (6.30):
We consider a sequence of finite stopping times Tn ↑ ∞, P -a.s., such that
(6.34) Mt∧Tn , 0 ≤ t ≤ T , is a bounded martingale for each n .

Observe that Zt∧Tn is a bounded continuous local martingale and hence, cf. (4.104),
(6.35) Zt∧Tn , 0 ≤ t ≤ T , is a bounded martingale for each n .
We will now see that
(6.36) for some q > 1, sup E[ZTq ∧Tn ] < ∞ .
n≥0

From Doob’s inequality (4.76), it will then follow that:


 q     q q
(6.37) E sup Zt = lim E sup Ztq ≤ lim E[ZTq ∧Tn ] < ∞ .
t≤T n→∞ t≤T ∧Tn n q − 1

Together with (6.35), this will imply by dominated convergence that:


(6.38) Zt , 0 ≤ t ≤ T , is a continuous martingale bounded in Lq .

This will prove (6.30) (and also (6.30’) in the case T = ∞; in this case Z∞ exists as a
P -a.s. limit, by the martingale convergence theorem, see Theorem 3.15, p. 17 of [8], and in
addition Z∞ = limt→∞ Zt in Lq (P ) as well, by dominated convergence, thanks to (6.37)).
There remains to prove (6.36). We pick q, α > 1, and write:

(6.39) E[Z^q_{T∧Tn}] = E[ exp{ q M_{T∧Tn} − (q/2) hM i_{T∧Tn} } ]
       = E[ exp{ q M_{T∧Tn} − (1/2) αq² hM i_{T∧Tn} + (1/2) q(αq − 1) hM i_{T∧Tn} } ]
       ≤ E[ exp{ αq M_{T∧Tn} − (1/2) α²q² hM i_{T∧Tn} } ]^{1/α} E[ exp{ (1/2) (α/(α−1)) q(αq − 1) hM i_{T∧Tn} } ]^{(α−1)/α}   (Hölder).

Note that exp{ αq M_{t∧Tn} − (1/2) α²q² hM i_{t∧Tn} } is a bounded martingale just as in (6.35), and the first term in the last line of (6.39) equals 1. So we see that for any n ≥ 0,

E[Z^q_{T∧Tn}] ≤ E[ exp{ (1/2) (α/(α−1)) q(αq − 1) hM i_T } ]^{(α−1)/α} .

Note that

lim_{α↓1} lim_{q↓1} (α/(α−1)) q(αq − 1) = 1 ,

and hence we can choose q, α > 1, so that

(α/(α−1)) q(αq − 1) < 1 + ε .

The combination of (6.29) and the above inequality yields (6.36). This concludes the proof
of (6.30). 

Example:
(Xt)t≥0, Brownian motion on R. Then for λ ∈ R,

(6.40) exp{ λXt − (λ²/2) t } is a martingale.

Consider a > 0 and the entrance time of X in {a}:

Ha = inf{s ≥ 0; Xs = a}   (the distribution of Ha under W0 appears in (2.54)).

Then, using the stopping theorem, we see that when λ > 0,

(6.41) exp{ λX_{t∧Ha} − (λ²/2)(t ∧ Ha) } is a martingale, which is bounded by e^{λa}.

As a result we obtain that

1 = E0[ exp{ λX_{t∧Ha} − (λ²/2)(t ∧ Ha) } ], and since Ha < ∞, W0-a.s., letting t → ∞ and using dominated convergence,

1 = E0[ exp{ λX_{Ha} − (λ²/2) Ha } ] = e^{λa} E0[ e^{−(λ²/2) Ha} ] .

Setting u = λ²/2, we obtain (by symmetry when a is negative):

(6.42) E0[exp{−u Ha}] = exp{−|a| √(2u)}, for a ∈ R, u ≥ 0 .
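The identity (6.42) lends itself to a Monte Carlo check; the sketch below (a hypothetical illustration in Python/NumPy, not part of the text) simulates discretized Brownian paths up to a large time horizon, records the first passage time to level a, and compares the empirical value of E0[e^{-u Ha}] with e^{-a √(2u)}. Discrete monitoring slightly misses some crossings, so a small bias is to be expected.

    import numpy as np

    rng = np.random.default_rng(3)
    a, u = 1.0, 0.5
    dt, n_steps, paths = 2.5e-3, 20000, 20000     # horizon 50; paths not hitting by then contribute ~e^{-25}
    x = np.zeros(paths)
    Ha = np.full(paths, np.inf)
    for k in range(1, n_steps + 1):
        x += rng.normal(0.0, np.sqrt(dt), paths)
        newly = (x >= a) & ~np.isfinite(Ha)       # paths crossing level a for the first time
        Ha[newly] = k * dt
    vals = np.where(np.isfinite(Ha), np.exp(-u * Ha), 0.0)
    print(vals.mean(), np.exp(-a * np.sqrt(2 * u)))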

As an application of exponential martingales, we will prove Paul Lévy’s character-
ization of Brownian motion. We introduce the following

Definition 6.8. A continuous adapted Rd -valued process (Xt )t≥0 , with X0 = 0, is called
(Gt )-Brownian motion if for 0 ≤ s < t,

(6.43) Xt − Xs is independent of Gs and N (0, (t − s)I)-distributed.

Remark 6.9. A (Gt )-Brownian motion is then of course in particular a d-dimensional


Brownian motion in the sense of the definition (1.1). However, the independence assump-
tion (6.43) is a (possibly) more stringent requirement (Gs may be strictly bigger than
σ(Xu , u ≤ s)).

Theorem 6.10. (P. Lévy’s characterization of Brownian motion)


If (Xt )t≥0 , is a d-dimensional continuous (Gt )-local martingale, such that X0 = 0, and

(6.44) hX i , X j it = δij t, for 1 ≤ i, j ≤ d ,

then

(6.45) Xt , t ≥ 0, is a d-dimensional (Gt )-Brownian motion.

Proof. The same calculation as in (6.33) shows that for ξ ∈ Rd ,


(6.46) Zt = exp{ iξ·Xt + (1/2) |ξ|² t }, t ≥ 0 ,

is a complex-valued, continuous (Gt)-local martingale, which is bounded when t remains bounded. Hence it is a continuous martingale and for 0 ≤ s ≤ t:

E[Zt | Gs] = Zs, P-a.s., so that (since |Zs| > 0)

1 = E[Zt Z_s^{−1} | Gs] = E[ exp{ iξ·(Xt − Xs) + (|ξ|²/2)(t − s) } | Gs ], P-a.s.

As a result, for 0 ≤ s < t, ξ ∈ Rd,

(6.47) E[ exp{ iξ·(Xt − Xs) } | Gs ] = exp{ −(1/2) |ξ|² (t − s) }, P-a.s.

This implies that for 0 ≤ s < t

Xt − Xs is independent of Gs and N (0, (t − s)I)-distributed.

The claim (6.45) now readily follows.

Remark 6.11. If we now look again at the assumptions (4.20), (4.21), when we began the
discussion of stochastic integrals, we see that they are equivalent to the fact that Xt − X0
is a (Gt )-Brownian motion and X0 ∈ L2 (Ω, G0 , dP ). This link with Brownian motion was
not clear at the time we introduced (4.20), (4.21). 

We will now give a further application of exponential martingales.
Proposition 6.12. (Bernstein’s inequality)
If Mt, t ≥ 0, is a continuous local martingale with M0 = 0, and hM i_t ≤ ct, for t ≥ 0, then Mt, t ≥ 0, is a martingale and

(6.48) P[ sup_{t≤T} Mt ≥ a ] ≤ exp{ −a²/(2cT) }, for all a, T > 0

(and hence P[ sup_{t≤T} |Mt| ≥ a ] ≤ 2 exp{−a²/(2cT)}, for all a, T > 0).

Proof. For λ ∈ R, by (6.27), (6.30), we know that

Zt = exp{ λMt − (λ²/2) hM i_t }, t ≥ 0 ,

is a continuous martingale. We can pick λ > 0, and we have by Doob's inequality, cf. (4.67):

(6.49) P[ sup_{t≤T} Mt ≥ a ] ≤ P[ sup_{t≤T} Zt ≥ exp{ λa − (λ²/2) cT } ] ≤ exp{ −λa + (λ²/2) cT } E[ZT]   (by (4.67)),

and E[ZT] = E[Z0] = 1. We can optimize over λ > 0, and choose λ = a/(cT), so that −λa + (λ²/2) cT = −a²/(2cT). As a result we find that

P[ sup_{t≤T} Mt ≥ a ] ≤ exp{ −a²/(2cT) } ,

that is, (6.48) holds. Applying this inequality to −M , we thus see that (Mt )t≥0 , is a
continuous local martingale, which is square integrable. It is therefore a martingale,
cf. (4.104) (alternatively use the exercise below (6.30’)).
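Here is a small numerical illustration of (6.48) (a sketch assuming Python/NumPy; the parameters are arbitrary) in the case Mt = Bt, a Brownian motion, for which hM i_t = t and one can take c = 1: the empirical probability of sup_{t≤T} Bt ≥ a indeed stays below the Bernstein bound.

    import numpy as np

    rng = np.random.default_rng(4)
    T, a, n, paths = 1.0, 2.0, 1000, 50000
    dt = T / n
    B = np.cumsum(rng.normal(0.0, np.sqrt(dt), size=(paths, n)), axis=1)
    p_emp = (B.max(axis=1) >= a).mean()           # empirical probability that the running maximum reaches a
    print(p_emp, np.exp(-a**2 / (2 * T)))         # the bound exp(-a^2/(2cT)) with c = 1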

We continue our discussion of some first applications of Ito’s formula.

Harmonic functions and Brownian motion


When U ⊆ Rd is a non-empty open set, a C²-function f on U is said to be harmonic (in U) when ∆f(x) (= Σ_{i=1}^d ∂²_i f(x)) = 0, for x ∈ U. In fact no regularity requirement on f is necessary, in the sense that the equation ∆f = 0 on U in the distribution sense implies that f is C^∞ on U and satisfies ∆f(x) = 0, x ∈ U, in the classical sense, cf. [3], p. 127. Harmonic functions play a very important role in the study of Brownian motion. Here is an example.
Proposition 6.13. When d ≥ 2 and x 6= 0 is a point of Rd , then

(6.50) Wx -a.s., Xt 6= 0, for all t ≥ 0 .

(in other words, “Brownian motion does not hit points when d ≥ 2”).

Proof. When g is a C²-function on (0, ∞), one can define the radial function

f(x) = g(|x|) = g( √(x1² + · · · + xd²) ), x ∈ Rd \ {0} ,

and one has the identity (exercise!)

(6.51) ∆f(x) = g''(r) + ((d − 1)/r) g'(r), with r = |x|, for x ∈ Rd \ {0} .

When d ≥ 3, we choose

(6.52) g(r) = r^{2−d} ,

so that

g''(r) + ((d − 1)/r) g'(r) = (2 − d)(1 − d) r^{−d} + (d − 1)(2 − d) r^{−d} = 0 ,

and therefore

(6.53) f(x) = 1/|x|^{d−2}, x ≠ 0, is harmonic in Rd \ {0} .
If x 6= 0, and a < |x| < b, we choose fa , a smooth radial function, equal to f (x) on
{y ∈ Rd ; |y| ≥ a}, so that applying Ito’s formula, cf. (6.26), one finds that:
Wx -a.s., for all t ≥ 0:
Z t Z t
1
(6.54) fa (Xt ) = fa (x) + ∇fa (Xs ) · dXs + ∆fa (Xs ) ds .
0 2 0

We then introduce the stopping time


(6.55) τ = inf{u ≥ 0; |Xu | ≤ a or |Xu | ≥ b} ,
and see that Wx-a.s., for all t ≥ 0,

|X_{t∧τ}|^{−(d−2)} = fa(X_{t∧τ}) = |x|^{−(d−2)} + ∫_0^{t∧τ} ∇fa(Xs)·dXs + (1/2) ∫_0^{t∧τ} ∆fa(Xs) ds   (by (6.54)),

and the last term vanishes, since ∆fa (Xs ) = 0, for 0 < s < τ . As a result |Xt∧τ |−(d−2) ,
t ≥ 0, is a local martingale, which is bounded, and hence:
(6.56) |Xt∧τ |2−d , t ≥ 0, is a martingale.
As a result, we find
(6.57) Ex [|Xt∧τ |2−d ] = |x|2−d , for t ≥ 0 .

Note that τ is Wx -a.s. finite, cf. Corollary 2.17. Letting t → ∞, and using dominated
convergence, we find that
|x|2−d = E[|Xτ |2−d ] = a2−d Wx (|Xτ | = a) + b2−d Wx (|Xτ | = b) .

(Figure: an illustration of Xs, 0 ≤ s ≤ τ, under Wx, started at x in the annulus a < |x| < b and stopped at Xτ.)
Since Wx(|Xτ| = a) + Wx(|Xτ| = b) = 1, we obtain for a < |x| < b

(6.58) Wx(|Xτ| = a) = (|x|^{2−d} − b^{2−d}) / (a^{2−d} − b^{2−d}),   Wx(|Xτ| = b) = (a^{2−d} − |x|^{2−d}) / (a^{2−d} − b^{2−d}) .

Letting a → 0, with b fixed, we see that

(6.59) Wx (H{0} < TB(0,b) ) = 0, for 0 < b,

with the notation TU = inf{s ≥ 0; Xs ∈


/ U }, the “exit time from U ”. It now follows that

Wx (H{0} < ∞) = Wx (Xt = 0, for some t ≥ 0) =


(6.60)
lim Wx (H{0} < TB(0,b) ) = 0 ,
b→∞

and this proves (6.50) when d ≥ 3.


When d = 2, we choose instead

(6.61) g(r) = log(1/r) ,

so that

g''(r) + ((d − 1)/r) g'(r) = 1/r² − 1/r² = 0 ,

and

(6.62) f(x) = log(1/|x|), x ≠ 0, is harmonic in R² \ {0} .

The repetition of the above proof now yields that for a < |x| < b,

(6.63) Wx(|Xτ| = a) = log(b/|x|) / log(b/a),   Wx(|Xτ| = b) = log(|x|/a) / log(b/a) ,

and one concludes as above, by letting a → 0, with b fixed, and then b → ∞.
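The exit probabilities (6.63) can be checked by simulation; the sketch below (hypothetical Python/NumPy code, not part of the text) runs a discretized planar Brownian motion started at |x| = 1 in the annulus a = 1/2 < |x| < b = 2 until it leaves, and compares the frequency of exits through the inner circle with log(b/|x|)/log(b/a) = 1/2. The time discretization introduces a small bias near the circles.

    import numpy as np

    rng = np.random.default_rng(5)
    a, b = 0.5, 2.0
    x0 = np.array([1.0, 0.0])                   # starting point with a < |x0| < b
    dt, paths = 1e-3, 2000
    hits_inner = 0
    for _ in range(paths):
        x = x0.copy()
        r = np.linalg.norm(x)
        while a < r < b:                        # run the path until it exits the annulus
            x = x + rng.normal(0.0, np.sqrt(dt), 2)
            r = np.linalg.norm(x)
        hits_inner += (r <= a)
    print(hits_inner / paths, np.log(b / 1.0) / np.log(b / a))   # empirical frequency vs (6.63), here 1/2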

As an application of the same circle of ideas we will discuss recurrence and tran-
sience properties of Brownian motion in Rd , when d ≥ 2.

Theorem 6.14. (transience of Brownian motion in Rd , d ≥ 3)


When d ≥ 3, then for x ∈ Rd ,

(6.64) Wx -a.s., lim |Xt | = ∞ .


t→∞

Proof. Since under Wx , (Xt + z)t≥0 is a Brownian motion starting from x + z, it suffices
to prove (6.64) for some x 6= 0. By (6.54) we know that, letting HB(0,a) stand for the
entrance time of X in B(0, a),

(6.65) |Xt∧HB(0,a) |2−d is a continuous bounded martingale under Wx .

Using Fatou's lemma for conditional expectations, we find that for s ≤ t, Wx-a.s.,

Ex[|Xt|^{2−d} | Fs] = Ex[ lim inf_n |X_{t∧H_{B(0,1/n)}}|^{2−d} | Fs ]   (by (6.50))
   ≤ lim inf_n Ex[ |X_{t∧H_{B(0,1/n)}}|^{2−d} | Fs ]   (Fatou)
   = lim inf_n |X_{s∧H_{B(0,1/n)}}|^{2−d}   (by (6.65))
   = |Xs|^{2−d}   (by (6.50)).

In other words we have proved that

(6.66) |Xt |2−d , t ≥ 0, is a continuous supermartingale under Wx .

Since this supermartingale is non-negative, it follows from the convergence theorem, see
[8], p. 17, that

(6.67) Wx -a.s., |Xt |2−d has a finite limit as t → ∞ .

On the other hand, looking at one of the components of Xt , we already know that

Wx -a.s., lim sup |Xt | = ∞ .


t

This observation combined with (6.67) implies that the finite limit in (6.67) is 0.

Exercise 6.15. Show that a non-negative continuous local martingale is a supermartin-


gale. 

We now turn to the two-dimensional situation.

Theorem 6.16. (recurrence of Brownian motion in R2 )


When d = 2, for any x ∈ R2 ,

(6.68) Wx -a.s., for any non-empty open set O ⊆ R2 , {t ≥ 0; Xt ∈ O} is unbounded.

Proof. By (6.63), we see, letting b → ∞, that when a < |x|, Wx -a.s., HB(0,a) < ∞. Of
course this remains true when |x| ≤ a, so that

(6.69) for any x ∈ R2 , a > 0, Wx -a.s., HB(0,a) < ∞ .

One can then define the sequence of (Ft+ )-stopping times, cf. (2.33),

S1 = HB(0,a) , S2 = S1 ◦ θS1 +1 + S1 + 1, and by induction for i ≥ 1:


Si+1 = S1 ◦ θSi +1 + Si + 1, so that Si ↑ ∞ .

Using the strong Markov property, cf. (2.46), we see that for any y ∈ R², for i ≥ 1,

(6.70) Wy[Si+1 < ∞] = Wy[Si < ∞ and θ^{−1}_{Si+1}(S1 < ∞)]
       = Ey[ Si < ∞, W_{X_{Si+1}}[S1 < ∞] ]   (by (2.46))
       = Wy[Si < ∞]   (since W_{X_{Si+1}}[S1 < ∞] = 1, by (6.69))
       = · · · = Wy[S1 < ∞] = 1   (by induction).

Note also that by construction, for i ≥ 1,

Wy-a.s., on {Si < ∞}, X_{Si} ∈ B(0, a) .

It thus follows from (6.70) and Si ↑ ∞ that for any y ∈ R², Wy-a.s., for any a = 1/n, the set {t ≥ 0; Xt ∈ B(0, 1/n)} is unbounded.
Since Wy is the law of (Xt + y)t≥0 under W0, the above property implies that (setting z = −y):

(6.71) W0-a.s., for all z ∈ Q², n ≥ 1, {t ≥ 0; Xt ∈ B(z, 1/n)} is unbounded .

This proves (6.68) when x = 0. The case of a general x follows since (Xt )t≥0 under Wx
has the law of (Xt + x)t≥0 under W0 , as was already used in the proof.

Exercise 6.17. Give a proof of (6.68) using (6.69) and (6.50) (without the introduction
of the stopping times Si , i ≥ 1).

Complement
We will now present Novikov's criterion, which refines the condition we gave in (6.29) to ensure that Zt is a martingale (and not merely a continuous local martingale).
Theorem 6.18. (Novikov’s criterion)
Let (Mt )t≥0 , be a continuous local martingale with M0 = 0, such that
(6.72) E[ exp{ (1/2) hM i∞ } ] < ∞ .

Then,

(6.73) E[ exp{ (1/2) sup_{t≥0} |Mt| } ] < ∞ ,

and

(6.74) Zt = exp{ Mt − (1/2) hM i_t }, t ≥ 0, is a uniformly integrable continuous martingale

(and of course, (Mt)t≥0 is a continuous martingale as well).

Remark 6.19. If instead of (6.72) we assume that for some T > 0,

(6.75) E[ exp{ (1/2) hM i_T } ] < ∞ ,

the above theorem can be applied to M_{t∧T}, t ≥ 0, and we find that

(6.76) Zt = exp{ Mt − (1/2) hM i_t }, t ≤ T, is a continuous martingale.


Proof. We first observe that E[hM i∞ ] < ∞ implies that

(6.77) (Mt )t≥0 , is a continuous martingale bounded in L2 .

Indeed (this is just as in the exercise below (6.30’)), one chooses a sequence Tn ↑ ∞ of
stopping times so that (Mt∧Tn )t≥0 , are bounded martingales. Then, one has

2 (5.36)
E[Mt∧Tn
] = E[hM it∧Tn ] ≤ E[hM i∞ ], and by Fatou’s lemma

E[Mt2 ] ≤ lim inf E[Mt∧T


2
n
] ≤ E[hM i∞ ] .
n

It now also follows with similar considerations as in (4.104) and Doob’s inequality (4.76)
P -a.s.
that (Mt )t≥0 , is a continuous martingale, with E[supt≥0 |Mt |2 ] < ∞ and that M∞ =
limt→∞ Mt is well-defined, by the martingale convergence theorem, cf. [8], p. 17.

• (6.73): Note that for 0 ≤ t ≤ ∞,

(6.78) E[ exp{ (1/2) Mt } ] = E[ exp{ (1/2) Mt − (1/4) hM i_t } exp{ (1/4) hM i_t } ]
       ≤ E[ exp{ Mt − (1/2) hM i_t } ]^{1/2} E[ exp{ (1/2) hM i∞ } ]^{1/2}   (Cauchy-Schwarz).

Since (Zt)t≥0 is a non-negative local martingale, it is also a supermartingale (see also the exercise below (6.67)). Indeed, for 0 ≤ s ≤ t,

E[Zt | Gs] = E[ lim_n Z_{t∧Tn} | Gs ] ≤ lim inf_n E[Z_{t∧Tn} | Gs]   (Fatou)
           = lim inf_n Z_{s∧Tn} = Zs , thus proving the claim.

As a result, E[Zt] ≤ E[Z0] = 1, and coming back to (6.78),

(6.79) E[ exp{ (1/2) Mt } ] ≤ E[ exp{ (1/2) hM i∞ } ]^{1/2}, for 0 ≤ t ≤ ∞ .

The same argument applied to −M yields that

(6.80) sup_{t≥0} E[ cosh( (1/2) Mt ) ] ≤ E[ exp{ (1/2) hM i∞ } ]^{1/2}, so that
       E[cosh(c Mt)] ≤ E[ exp{ (1/2) hM i∞ } ]^{1/2}, for 0 ≤ c ≤ 1/2, 0 ≤ t ≤ ∞ .

In particular, since cosh x ≥ (1/2) e^x,

(6.81) sup_{0≤t≤∞} E[e^{c Mt}] < ∞, for 0 ≤ c ≤ 1/2 .

Jensen’s inequality implies that ecMt , t ≥ 0, is a non-negative sub-martingale. It then


1
follows from Doob’s inequality (4.76) with p = 2c , and 0 < c < 12 , that
h n oi  p 
1
E sup exp Mt = E sup exp{cMt }
t≥0 2 t≥0
 p p
(4.76)
(6.82) ≤ sup E[exp{pcMt }]
p−1 t≥0
 p p n oi (6.81)
1
= sup E[exp Mt < ∞.
p−1 t≥0 2

Of course, a similar bound holds for −M in place of M . Note also that


1 1
supt≥0 (cosh( 12 Mt )) ≤ 12 supt≥0 e 2 Mt + 12 supt≥0 e− 2 Mt , and hence
h  i
1
(6.83) E sup cosh Mt < ∞ .
t≥0 2

This implies that E[exp{ 12 supt≥0 |Mt |}] < ∞, and (6.73) holds.

• (6.74):
We will use the next
Lemma 6.20.

(6.84) If E[Z∞ ] = 1, then Zt , 0 ≤ t ≤ ∞, is a uniformly integrable martingale.

Proof. By the supermartingale property of (Zt )t≥0 , it follows that 1 = E[Z∞ ] ≤ E[Zt ] ≤
E[Z0 ] = 1, for 0 ≤ t ≤ ∞, so that E[Zt ] = 1, for 0 ≤ t ≤ ∞.
n→∞
Note that P -a.s., Zt∧Tn −→ Zt , and E[Zt∧Tn ] = E[Zt ] = 1 and these variables are
non-negative.

It now follows that they are uniformly integrable and
L1
(6.85) Zt∧Tn −→ Zt , for 0 ≤ t ≤ ∞ ,
n→∞

see for instance [5], p. 224. This now implies that Zt , t ≥ 0, is a martingale. Moreover,
since P -a.s., Zt → Z∞ , as t → ∞, and E[Zt ] = E[Z∞ ] = 1, the same argument shows that

L1
(6.86) Zt −→ Z∞ , as t → ∞ ,

so that

(6.87) Zt = E[Z∞ | Gt ], for t ≥ 0 ,

and the conclusion of the lemma follows, cf. [5], p. 223.

We will now show that

(6.88) E[Z∞ ] ≥ 1 .

Since we already know that E[Z∞ ] ≤ 1, the claim (6.74) will now follow from the lemma.
By (6.29), (6.30), with T = ∞, we know from (6.72) that

E[ exp{ a M∞ − (a²/2) hM i∞ } ] = 1, for a ∈ [0, 1) .

Note that the following equality holds:

exp{ a M∞ − (a²/2) hM i∞ } = ( exp{ M∞ − (1/2) hM i∞ } )^{a²} ( exp{ (a/(1+a)) M∞ } )^{1−a²} .

Using Hölder's inequality with p = a^{−2} and q = (1 − a²)^{−1}, we find:

(6.89) 1 ≤ E[Z∞]^{a²} E[ exp{ (a/(1+a)) M∞ } ]^{1−a²} .

Using (6.73), we can use dominated convergence to argue that

lim_{a→1} E[ exp{ (a/(1+a)) M∞ } ] = E[ exp{ (1/2) M∞ } ] ∈ (0, ∞) .

As a result we obtain that

(6.90) lim_{a→1} E[ exp{ (a/(1+a)) M∞ } ]^{1−a²} = 1 ,

and the claim (6.88) now follows from (6.89) and (6.90). This concludes the proof of
(6.74).

7 Stochastic differential equations and Martingale problems
We begin with some heuristic considerations.
In this chapter we want to construct processes, which locally, near each point x in Rd ,
move like
x + tb(x) + σ(x) Bt ,

where b(x) ∈ Rd , σ(x) is a (d × n)-matrix and Bt is an n-dimensional Brownian motion.


We want that “infinitesimally”, the increment of the process we construct behaves as a
Gaussian variable with
mean: b(x)dt, and
covariance matrix: a(x)dt,
where for ξ ∈ Rd

t ξ a(x) ξ dt "=" E[ (t ξ · σ(x) · (B_{t+dt} − Bt))² ] = E[ (t(t σ(x)ξ) · (B_{t+dt} − Bt))² ] = |t σ(x)ξ|² dt = t ξ σ(x) t σ(x) ξ dt ,

in other words a(x) = σ(x) t σ(x) (a d × d-matrix).
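The relation a(x) = σ(x) t σ(x) is easy to see on simulated increments: the sketch below (hypothetical Python/NumPy code with an arbitrary 2 × 2 matrix σ, not part of the text) draws many increments σ · ΔB over a time step dt and compares their empirical covariance matrix with σ t σ dt.

    import numpy as np

    rng = np.random.default_rng(6)
    sigma = np.array([[1.0, 0.0],
                      [0.5, 2.0]])               # an arbitrary illustrative dispersion matrix
    dt, samples = 0.01, 200000
    dB = rng.normal(0.0, np.sqrt(dt), size=(samples, 2))
    dX = dB @ sigma.T                            # increments sigma * (B_{t+dt} - B_t)
    print(np.cov(dX, rowvar=False))              # empirical covariance of the increments
    print(sigma @ sigma.T * dt)                  # a(x) dt = sigma sigma^T dt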


We will consider two approaches to build such processes. The first approach will
rely on solving stochastic differential equations (SDE):
X^i_t = X^i_0 + ∫_0^t bi(Xs) ds + Σ_{j=1}^n ∫_0^t σ_{i,j}(Xs) dB^j_s , i = 1, . . . , d ,

or in vector notation:

(7.1) Xt = x + ∫_0^t b(Xs) ds + ∫_0^t σ(Xs) · dBs , x ∈ Rd .

The second approach will be based on a martingale problem, i.e. finding on (C(R+ , Rd ), F)
a probability Px , such that
(7.2) M^f_t def= f(Xt) − f(X0) − ∫_0^t Lf(Xs) ds, t ≥ 0, is an (Ft)-martingale under Px,
      when f ∈ Cc²(Rd), with Lf(y) def= (1/2) Σ_{i,j=1}^d ai,j(y) ∂²_{i,j}f(y) + Σ_{i=1}^d bi(y) ∂if(y), y ∈ Rd,
      and Px[X0 = x] = 1 .

This latter approach shares the same spirit as Lévy’s characterization of Brownian motion,
cf. (6.44), (6.45).

Notation:

• For b ∈ Rd, |b| = ( Σ_{i=1}^d bi² )^{1/2} .

• For σ ∈ Md×n, |σ| = ( Σ_{1≤i≤d, 1≤j≤n} σ_{i,j}² )^{1/2} = { Trace(σ t σ) }^{1/2} = { Trace(t σ σ) }^{1/2} (the traces being taken in Md×d and Mn×n respectively).

• (Ω, G, (Gt )t≥0 , P ) a probability space satisfying the usual conditions, cf. (4.5), (4.6).
• Bt , t ≥ 0, an n-dimensional (Gt )-Brownian motion, or equivalently in view of (6.44),
(6.45) for all 1 ≤ i, j ≤ n, Bti , t ≥ 0, are continuous local martingales with B0i = 0,
and Bti Btj − δij t, t ≥ 0, are continuous local martingales.

We now begin with the discussion of stochastic differential equations. The next
theorem provides a basic result.
Theorem 7.1. (Picard’s iteration method)
Assume that b(·) : Rd → Rd , and σ(·): Rd → Md×n satisfy the Lipschitz condition

(7.3) |b(y) − b(z)| + |σ(y) − σ(z)| ≤ K |y − z|, for y, z ∈ Rd .

Then, for any (Ω, G, (Gt )t≥0 , P ) and Bt , t ≥ 0, as above, and any x in Rd , there exists an
essentially unique continuous (Gt )-adapted (Xt )t≥0 with values in Rd , such that P -a.s., for
t ≥ 0,
Z t Z t
(7.4) Xt = x + b(Xs ) ds + σ(Xs ) · dBs .
0 0

Proof.
• Uniqueness:
Consider X. , Y. two solutions. For M > |x|, we define

(7.5) T = inf{u ≥ 0; |Xu | or |Yu | ≥ M } ,

so that P -a.s., for all t ≥ 0,


Z t∧T Z t∧T
Xt∧T − Yt∧T = (b(Xu ) − b(Yu ))du + (σ(Xu ) − σ(Yu )) · dBu .
0 0

As a result, we see that for t0 > 0:


h Z
  t∧T 2i
2
E sup |Xt∧T − Yt∧T | ≤ 2E sup (σ(Xu ) − σ(Yu )) · dBu
t≤t0 t≤t0 0
hZ t0 ∧T i
+ 2t0 E |b(Xu ) − b(Yu )|2 du .
0

Using Doob’s inequality (4.76), with p = 2, for each component of the Rd -valued stochastic
integral we find that

  h Z t0 ∧T 2i
2
E sup |Xt∧T − Yt∧T | ≤ 8E (σ(Xu ) − σ(Yu )) · dBu
t≤t0 0
(7.6)
hZ t0 ∧T i
+ 2t0 E |b(Xu ) − b(Yu )|2 du .
0

On the other hand one has

(7.7) E[ | ∫_0^{t0∧T} (σ(Xu) − σ(Yu)) · dBu |² ] = Σ_{i=1}^d E[ ( Σ_{j=1}^n ∫_0^{t0∧T} (σ_{i,j}(Xu) − σ_{i,j}(Yu)) dB^j_u )² ]
     = Σ_{i=1}^d Σ_{1≤j,k≤n} E[ ∫_0^{t0∧T} (σ_{i,j}(Xu) − σ_{i,j}(Yu)) dB^j_u · ∫_0^{t0∧T} (σ_{i,k}(Xu) − σ_{i,k}(Yu)) dB^k_u ]
     = Σ_{i=1}^d Σ_{1≤j,k≤n} E[ ∫_0^{t0∧T} (σ_{i,j}(Xu) − σ_{i,j}(Yu)) (σ_{i,k}(Xu) − σ_{i,k}(Yu)) dhB^j, B^k i_u ]   (by (5.90), with dhB^j, B^k i_u = δ_{j,k} du)
     = Σ_{i=1}^d Σ_{j=1}^n E[ ∫_0^{t0∧T} (σ_{i,j}(Xu) − σ_{i,j}(Yu))² du ] = E[ ∫_0^{t0∧T} |σ(Xu) − σ(Yu)|² du ] .

Inserting (7.7) in the right-hand side of (7.6), and taking (7.3) into account we find that
for any t0 ≥ 0:
Z t0

(7.8) E sup |Xs∧T − Ys∧T |2 ] ≤ (8K 2 + 2t0 K 2 ) E[|Xu∧T − Yu∧T |2 ] du .
s≤t0 0

The next result will be helpful.

Lemma 7.2. (Gronwall’s lemma)


Let f be a non-negative integrable function on [0, t] such that for some a, b ≥ 0, and all 0 ≤ u ≤ t:

f(u) ≤ a + b ∫_0^u f(s) ds ,

then

(7.9) f(u) ≤ a e^{bu}, for all 0 ≤ u ≤ t .

Proof. Iterating the inequality satisfied by f, we see that for 0 ≤ u ≤ t,

f(u) ≤ a + b ∫_0^u f(s) ds ≤ a + bau + b² ∫_0^u ds1 ∫_0^{s1} f(s2) ds2 ≤ . . .
     ≤ a + bau + b²a u²/2 + · · · + b^n a u^n/n! + b^{n+1} ∫_0^u ds1 ∫_0^{s1} ds2 . . . ∫_0^{s_n} f(s_{n+1}) ds_{n+1}
     ≤ a e^{bu} + b^{n+1} ∫_0^u ((u − s)^n/n!) f(s) ds ≤ a e^{bu} + b^{n+1} (u^n/n!) ∫_0^u f(s) ds .

Letting n → ∞, we find (7.9).

We will now apply the above lemma with the choice f (u) = E[sups≤u |Xs∧T − Ys∧T |2 ]
(≥ E[|Xu∧T − Yu∧T |2 ]), 0 ≤ u ≤ t, t > 0, a = 0, b = K 2 (8 + 2t), and find that (with t
some positive number)

(7.10) f (u) = 0, for 0 ≤ u ≤ t .

Letting M in (7.5) tend to infinity, and then t → ∞,

(7.11) P -a.s., Xu = Yu , for all u ≥ 0.

Such a statement is called a strong uniqueness result (sometimes it is also called path-
wise uniqueness).

• Existence:
We iteratively define for m ≥ 0, t ≥ 0,

(7.12) X^0_t ≡ x ,
       X^1_t = x + ∫_0^t b(X^0_s) ds + ∫_0^t σ(X^0_s) · dBs ,
       . . .
       X^{m+1}_t = x + ∫_0^t b(X^m_s) ds + ∫_0^t σ(X^m_s) · dBs .

Then, for m ≥ 1:

(7.13) X^{m+1}_t − X^m_t = ∫_0^t (b(X^m_s) − b(X^{m−1}_s)) ds + ∫_0^t (σ(X^m_s) − σ(X^{m−1}_s)) · dBs , for t ≥ 0 .

If we now pick M > |x|, and define

TM = inf{u ≥ 0; |X^m_u| or |X^{m+1}_u| ≥ M} ,

the same calculation as for (7.8) yields that for 0 ≤ t0 ≤ t,

(7.14) E[ sup_{s≤t0∧TM} |X^{m+1}_s − X^m_s|² ] ≤ (8 + 2t) K² ∫_0^{t0} E[|X^m_{u∧TM} − X^{m−1}_{u∧TM}|²] du .

Now sup_{s≤t} |X^0_s| = |x| and sup_{s≤t} |X^1_s| ∈ L2(P), by (7.12). We now see from (7.14), with m = 1, letting M → ∞, that sup_{s≤t} |X^2_s| ∈ L2(P), and, repeating the argument, that

(7.15) sup_{s≤t} |X^m_s| ∈ L2(P), for any m ≥ 0, and t ≥ 0 .

Coming back to (7.14) we can thus let M → ∞ and find that for 0 ≤ t0 ≤ t:

(7.16) E[ sup_{s≤t0} |X^{m+1}_s − X^m_s|² ] ≤ K²(8 + 2t) ∫_0^{t0} E[|X^m_u − X^{m−1}_u|²] du, and, iterating,
       ≤ {K²(8 + 2t)}^m ∫_0^{t0} dt1 ∫_0^{t1} dt2 . . . ∫_0^{t_{m−1}} dt_m E[|X^1_{t_m} − X^0_{t_m}|²] .

With (7.12) we also have:

(7.17) E[|X^1_{t_m} − X^0_{t_m}|²] ≤ 2|b(x)|² t_m² + 2E[|σ(x) · B_{t_m}|²] ≤ K1(x, t) t_m, for 0 ≤ t_m ≤ t .

Hence with (7.16) we obtain that

(7.18) E[ sup_{s≤t} |X^{m+1}_s − X^m_s|² ] ≤ K1(x, t) (8K² + 2tK²)^m t^{m+1} / (m + 1)!, for t > 0, m ≥ 0 .

We have thus proved that for t > 0,

(7.19) E[ Σ_{m≥0} sup_{s≤t} |X^{m+1}_s − X^m_s| ] ≤ Σ_{m≥0} E[ sup_{s≤t} |X^{m+1}_s − X^m_s|² ]^{1/2} < ∞ .

As a consequence of the finiteness of the expectation on the left-hand side, P-a.s., X^m_· converges uniformly on bounded time intervals to X^∞_·, which can be chosen (Gt)-adapted and continuous (see for instance (4.75)), and

(7.20) E[ sup_{s≤t} |X^∞_s − X^m_s|² ]^{1/2} ≤ lim inf_{p→∞} E[ sup_{s≤t} |X^p_s − X^m_s|² ]^{1/2}   (Fatou)
       ≤ Σ_{k=m}^∞ E[ sup_{s≤t} |X^{k+1}_s − X^k_s|² ]^{1/2} −→ 0, as m → ∞, for t > 0 .

Hence, by (7.20) and (7.3), we see that for t > 0, letting m → ∞ in

X^{m+1}_t = x + ∫_0^t b(X^m_u) du + ∫_0^t σ(X^m_u) · dBu ,

each of the three terms converges in L2(P), and we obtain

X^∞_t = x + ∫_0^t b(X^∞_u) du + ∫_0^t σ(X^∞_u) · dBu ,

and in view of the continuity of X^∞_·, we see that P-a.s.,

(7.21) X^∞_t = x + ∫_0^t b(X^∞_u) du + ∫_0^t σ(X^∞_u) · dBu , for all t ≥ 0 .

Therefore, X.∞ is a solution of (7.4).

Remark 7.3. From the definition of the X^m_· in (7.12), and the fact that P-a.s., X^m_· converges uniformly to X^∞_· on compact time intervals, we see that for each t ≥ 0:

(7.22) F^{X^∞·}_t def= the smallest σ-algebra containing all negligible sets of G and making X^∞_s measurable for s ≤ t
       ⊆ F^{B·}_t , defined analogously (with B· in place of X^∞·) .

Due to (7.22), X.∞ is called a strong solution of (7.4) (intuitively X.∞ is a function
of the “noise” B. ). The above theorem shows that for any (Ω, G, (Gt )t≥0 , P ), (Bt )t≥0 , we
have a strong solution of (7.4), which is strongly unique, cf. (7.11). 
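The Picard scheme (7.12) can be carried out numerically along a fixed discretized Brownian path; the sketch below (hypothetical Python/NumPy code, not part of the text) does this for the Lipschitz coefficients b(x) = −x, σ(x) = 1 (an Ornstein-Uhlenbeck type equation) and prints the uniform distance between successive iterates, which decreases rapidly, in line with (7.18), (7.19).

    import numpy as np

    rng = np.random.default_rng(7)
    # Illustrative Lipschitz coefficients: b(x) = -x, sigma(x) = 1.
    b = lambda x: -x
    sigma = lambda x: np.ones_like(x)

    T, n, x0 = 1.0, 2000, 1.0
    dt = T / n
    dB = rng.normal(0.0, np.sqrt(dt), n)          # one fixed Brownian path on the grid

    X = np.full(n + 1, x0)                        # X^0 = x
    for m in range(6):                            # discretized Picard iterations (7.12)
        drift = np.concatenate(([0.0], np.cumsum(b(X[:-1]) * dt)))
        noise = np.concatenate(([0.0], np.cumsum(sigma(X[:-1]) * dB)))
        X_new = x0 + drift + noise
        print(m, np.abs(X_new - X).max())         # uniform distance between X^{m+1} and X^m on the grid
        X = X_new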
We will now see that solutions of stochastic differential equations (SDE’s) can be
used to represent solutions of certain partial differential equations (PDE’s).
We begin with a result which will also be helpful in the subsequent discussion of
martingale problems.
Proposition 7.4. Assume that b(·): Rd → Rd , σ(·): Rd → Md×n are measurable, locally
bounded functions, x ∈ Rd , and on some (Ω, G, (Gt )t≥0 , P ), endowed with an n-dimensional
(Gt )-Brownian motion (Bt )t≥0 , a continuous adapted Rd -valued, (Xt )t≥0 , satisfies P -a.s.,
for t ≥ 0:
Z t Z t
(7.23) Xt = x + b(Xs ) ds + σ(Xs ) · dBs ,
0 0

then for any f ∈ C 2 (Rd , R),


Z t
def
(7.24) Mtf = f (Xt ) − f (X0 ) − Lf (Xs ) ds, t ≥ 0, is a continuous local martingale
0

where we used the notation

(7.25) Lf(y) = (1/2) Σ_{1≤i,j≤d} ai,j(y) ∂²_{i,j}f(y) + Σ_{1≤i≤d} bi(y) ∂if(y), and a(y) = σ(y) t σ(y) ∈ Md×d, for y ∈ Rd .

Proof. We apply Ito’s formula and find that P -a.s., for t ≥ 0:


d Z
X t d
X Z t
1
(7.26) f (Xt ) = f (X0 ) + ∂i f (Xs ) dXsi + 2
∂i,j f (Xs ) dhX i , X j is .
0 2 0
i=1 i,j=1

Note that by (5.90) and (7.23), it follows that

(7.27) hX^i, X^j i_t = h Σ_{k=1}^n ∫_0^· σ_{i,k}(Xu) dB^k_u , Σ_{ℓ=1}^n ∫_0^· σ_{j,ℓ}(Xu) dB^ℓ_u i_t = Σ_{k=1}^n ∫_0^t σ_{i,k}(Xu) σ_{j,k}(Xu) du = ∫_0^t ai,j(Xu) du .

Hence, coming back to (7.26), we find that P-a.s., for all t ≥ 0,

(7.28) f(Xt) = f(X0) + Σ_{i=1}^d ∫_0^t ∂if(Xs) bi(Xs) ds + Σ_{i=1}^d Σ_{k=1}^n ∫_0^t ∂if(Xs) σ_{i,k}(Xs) dB^k_s + (1/2) Σ_{1≤i,j≤d} ∫_0^t ai,j(Xs) ∂²_{i,j}f(Xs) ds
       = f(X0) + ∫_0^t Lf(Xs) ds + ∫_0^t ∇f(Xs) · σ(Xs) · dBs ,

where the last term involves the scalar product of the d-vector ∇f(Xs) with σ(Xs) · dBs.

The claim (7.24) now follows.
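Taking expectations in (7.24) gives E[f(Xt)] − f(x) = E[∫_0^t Lf(Xs) ds] whenever M^f is a true martingale; the sketch below (hypothetical Python/NumPy code, not part of the text) checks this numerically for the one-dimensional example b(x) = −x, σ(x) = 1, f(x) = x², for which Lf(x) = 1 − 2x², using an Euler discretization of (7.23).

    import numpy as np

    rng = np.random.default_rng(8)
    # Illustrative case d = n = 1, b(x) = -x, sigma(x) = 1, f(x) = x^2, so Lf(x) = 1 - 2 x^2.
    T, n, paths, x0 = 1.0, 1000, 100000, 1.0
    dt = T / n
    X = np.full(paths, x0)
    integral = 0.0
    for _ in range(n):
        integral += np.mean(1.0 - 2.0 * X**2) * dt            # accumulates E[ Lf(X_s) ] ds
        X = X - X * dt + rng.normal(0.0, np.sqrt(dt), paths)  # Euler step for dX = -X dt + dB
    print(np.mean(X**2) - x0**2, integral)                    # both numbers are close to -(1 - e^{-2})/2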

We will now see that the solutions of stochastic differential equations can be used
to provide probabilistic representation formulas for the solutions of certain second order
partial differential equations.
We consider the following Dirichlet-Poisson problem:

U ≠ ∅ is a bounded open subset of Rd, f ∈ Cb(U), g ∈ C(∂U), and we look for u ∈ C²(U) ∩ C(Ū) such that, see (7.25) for the notation:

(7.29) Lu(x) = −f(x), for x ∈ U ,
       u(x) = g(x), for x ∈ ∂U .

(Figure: the bounded domain U, with Lu = −f inside U and boundary data g ∈ C(∂U) on ∂U.)
The Dirichlet problem corresponds to f = 0, and the Poisson equation to g = 0 in (7.29).
In addition to the local boundedness and measurability of b(·), σ(·), we assume the
following elliptic condition:

(7.30) there is c > 0, so that t ξ a(x) ξ ≥ c|ξ|2 , for ξ ∈ Rd , x ∈ U .

It is known that when σ(·), b(·) in addition satisfy (7.3) (in fact a Hölder condition is good
enough), when f is bounded Hölder continuous in U , and U satisfies an exterior sphere
condition:
∀z ∈ ∂U , there is an open ball B, with B ∩ U = {z} ,

the problem (7.29) has a solution, cf. [6], p. 106.


Theorem 7.5. (b(·), σ(·), measurable, locally bounded, and (7.30))
If u is a solution of (7.29), and (Xt )t≥0 satisfies (7.23), for some x ∈ U , then the exit
time of X. from U

(7.31) TU = inf{s ≥ 0; Xs ∈
/ U } is P -integrable,

and
h Z TU i
(7.32) u(x) = E g(XTU ) + f (Xs ) ds .
0

Proof.
• (7.31):
Pick ϕ(y) = C(eαR − eαy1 ), where y = (y1 , . . . , yd ) ∈ U , then
α2 
Lϕ(y) = −C eαy1 a1,1 (y) + α b1 (y)
2
(7.30)
α2 
≤ −C eαy1 c − αM , with M = sup |b1 (·)| .
2
U

Choosing α, R large and then C large enough, we can make sure that

(7.33) Lϕ ≤ −1, on U ,

(7.34) ϕ > 0, on U .

By (7.24) we find that under P ,


Z t∧TU
ϕ(Xt∧TU ) − ϕ(x) − Lϕ(Xs ) ds, t ≥ 0 ,
0

is a local martingale, which is bounded. Hence it is a martingale. Taking expectation, we


find that
h Z t∧TU i
E[ϕ(Xt∧TU )] − ϕ(x) − E Lϕ(Xu ) du = 0 ,
0

and keeping in mind (7.33), (7.34), we thus find that
h Z t∧TU i
(7.35) sup ϕ ≥ E[ϕ(Xt∧Tu )] − E Lϕ(Xu ) du ≥ E[t ∧ TU ] .
U 0

Letting t → ∞, we obtain (7.31) and in fact the more precise estimate

(7.36) E[TU ] ≤ sup ϕ .


U

• (7.32):
Recall that x ∈ U by assumption. For m ≥ 1 large enough so that 1/m < d(x, U^c), we define

Tm = inf{ s ≥ 0; d(Xs, U^c) ≤ 1/m } ,

and construct um ∈ Cc²(Rd, R) such that

(7.37) u = um on { z ∈ U; d(z, U^c) ≥ 1/m } .

(Figure: the domain U and its complement U^c, with the starting point x in U and the stopped position X_{Tm} close to the boundary.)

By (7.24), we see that under P,

(7.38) um(X_{t∧Tm}) − um(x) − ∫_0^{t∧Tm} Lum(Xs) ds = u(X_{t∧Tm}) − u(x) + ∫_0^{t∧Tm} f(Xs) ds

is a bounded continuous local martingale, and hence a martingale. Taking expectation we conclude that

E[u(X_{t∧Tm})] − u(x) + E[ ∫_0^{t∧Tm} f(Xs) ds ] = 0 .

Since P-a.s., Tm ↑ TU < ∞, and TU is integrable, we can let t → ∞, and then m → ∞, and conclude that

(7.39) u(x) = E[u(X_{TU})] + E[ ∫_0^{TU} f(Xs) ds ] = E[ g(X_{TU}) + ∫_0^{TU} f(Xs) ds ]   (using (7.29)),

whence our claim (7.32).
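The representation (7.32) suggests a simple Monte Carlo method; the sketch below (hypothetical Python/NumPy code, not part of the text) treats the special case b = 0, σ = Id (so L = ∆/2), f = 0, U the open unit disc in R², and boundary data g(y) = y1, for which the solution of (7.29) is u(x) = x1: it runs Brownian paths from x until they leave U and averages g(X_{TU}). The overshoot at the boundary due to the time step causes a small bias.

    import numpy as np

    rng = np.random.default_rng(9)
    x0 = np.array([0.3, 0.4])                      # starting point inside the unit disc
    dt, paths = 1e-3, 4000
    total = 0.0
    for _ in range(paths):
        x = x0.copy()
        while x @ x < 1.0:                         # run Brownian motion until it exits U
            x = x + rng.normal(0.0, np.sqrt(dt), 2)
        total += x[0]                              # g(X_{T_U}) = first coordinate at exit
    print(total / paths, x0[0])                    # Monte Carlo estimate of u(x0) vs exact value 0.3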

We will now discuss some features of the martingale problem (7.2), and its link with
SDE’s.

Assumptions and notation:


b(·) : Rd → Rd , σ(·): Rd → Md×n are measurable, locally bounded, a(·) = (σ t σ)(·), and
for f ∈ C 2
d
X d
X
1 2
(7.40) Lf (y) = ai,j (y) ∂i,j f (y) + bi (y) ∂i f (y), y ∈ Rd .
2
i,j=1 i=1

Theorem 7.6. If on some (Ω, G, (Gt )t≥0 , P ) endowed with an n-dimensional (Gt )-Brownian
motion Bt , t ≥ 0, a continuous adapted process (Yt )t≥0 satisfies P -a.s., for all t ≥ 0:
Z t Z t
(7.41) Yt = x + b(Ys ) ds + σ(Ys ) · dBs ,
0 0

then
the law Px of (Yt )t≥0 , on (C(R+ , Rd ), F) is a solution
(7.42)
of the martingale problem (7.2) .

Conversely, if Px is a solution of the martingale problem (7.2), then there exists an


(Ω, G, (Gt )t≥0 , P ) endowed with an n-dimensional Brownian motion (βt )t≥0 , and a con-
tinuous adapted process (Zt )t≥0 , such that P -a.s.,
Z t Z t
(7.43) Zt = x + b(Zs ) ds + σ(Zs ) dβs , for all t ≥ 0 ,
0 0

and the law of Z (on (C(R+ , Rd ), F)) is Px .


Remark 7.7. One should not expect Z. (or Y. ) to be strong solutions of the SDE, as in
(7.22). An example of this feature comes for instance when considering d = 1 = n,

σ(x) = sign(x) = 1, when x ≥ 0


(7.44)
= −1, when x < 0 ,

and Y. = X. the canonical Brownian motion on (C(R+ , R), F, (Ft )t≥0 , W0 ). Then, in this
case
(7.45) Bt = ∫_0^t sign(Xs) dXs , t ≥ 0 ,

is, thanks to Lévy's characterization (6.44), (6.45), a Brownian motion. Moreover one has the identity: P-a.s., for all t ≥ 0,

Xt = ∫_0^t sign(Xs)² dXs = ∫_0^t sign(Xs) dBs   (by (5.92)).

In other words Y· = X· solves (7.41), but one can prove, see [8], p. 302, or (7.144) below, that, in the notation of (7.22), for all t ≥ 0,

F^{B·}_t = F^{|X·|}_t ⊊ F^{X·}_t = Ft .

As a matter of fact, one can show that whenever Y· satisfies (7.41) with σ as in (7.44), then F^B_t ⊊ F^Y_t, for t > 0. 
Proof.
• (7.42):
We know from (7.24) that for f ∈ Cc2 (Rd , R), under P
Z t
(7.46) f (Yt ) − f (Y0 ) − Lf (Ys ) ds is a (Gt )-martingale.
0

Hence, for 0 ≤ s0 < · · · < sm ≤ s < t, g0, . . . , gm ∈ bB(Rd), denoting by Px the law on C(R+, Rd) of Y· under P (and by X· the canonical process), we see that:

E^{Px}[ ( f(Xt) − f(Xs) − ∫_s^t (Lf)(Xu) du ) g0(Xs0) . . . gm(Xsm) ]
= E[ ( f(Yt) − f(Ys) − ∫_s^t (Lf)(Yu) du ) g0(Ys0) . . . gm(Ysm) ] = 0   (by (7.46)).

Using Dynkin’s lemma, it follows that under Px ,


M_t^f := f(X_t) − f(X_0) − ∫_0^t Lf(X_u) du

is an (Ft )-martingale for any f ∈ Cc2 (Rd , R). Moreover since Y0 = x, P -a.s., we see that
Px [X0 = x] = 1. Hence Px is a solution of the martingale problem (7.2).

• (7.43):
We will only prove (7.43) in a special case, namely when

(7.47)  n = d, and a(x) is locally elliptic (i.e. for U ≠ ∅ a bounded open subset of R^d, ∃ c(U) > 0, such that ᵗξ a(y) ξ ≥ c(U) |ξ|², for all ξ ∈ R^d, and y in U).

For a proof in the general case we refer to [11], p. 91.


Note that due to (7.47) and a(·) = σ(·) t σ(·), σ(·) is invertible and for y ∈ U , ξ ∈ Rd
one has

(7.48)  |ξ|² = ᵗξ σ^{-1}(y) a(y) ᵗσ^{-1}(y) ξ ≥ c(U) |ᵗσ^{-1}(y) ξ|²,

so that (using the explicit formula for σ^{-1}(·)):

(7.49)  σ^{-1}(·) is locally bounded measurable.

On (C(R+ , Rd ), F, Px ), we introduce for t ≥ 0, the σ-algebra Ht generated by Ft and the
negligible sets of Px and Gt = Ht+ , t ≥ 0 (satisfying the usual conditions). We define
(7.50)  M_t^i = X_t^i − X_0^i − ∫_0^t b_i(X_s) ds,  i = 1, . . . , d.

If we apply (7.2) and stopping we see (since for f (y) = y i , Lf (y) = bi (y)) that the Mti are
continuous (Gt )-local martingales (see exercise below). Analogously, choosing f (y) = y i y j ,
so that Lf (y) = ai,j (y) + y i bj (y) + y j bi (y), we see that
(7.51)  X_t^i X_t^j − X_0^i X_0^j − ∫_0^t (a_{i,j}(X_s) + X_s^i b_j(X_s) + X_s^j b_i(X_s)) ds
is a continuous (G_t)-local martingale under P_x.
From Ito’s formula we know that Px -a.s.,
(7.52)  X_t^i X_t^j = X_0^i X_0^j + ∫_0^t X_s^i dX_s^j + ∫_0^t X_s^j dX_s^i + ⟨X^i, X^j⟩_t
         = X_0^i X_0^j + ∫_0^t (X_s^i b_j(X_s) + X_s^j b_i(X_s)) ds + ⟨M^i, M^j⟩_t + continuous local martingale.

Comparing (7.51) and (7.52), we conclude that P -a.s.,


(7.53)  ⟨M^i, M^j⟩_t = ∫_0^t a_{i,j}(X_s) ds,  for t ≥ 0.

We now define
(7.54)  β_t = ∫_0^t σ^{-1}(X_s) · dM_s,  t ≥ 0,  that is,  β_t^i = Σ_{j=1}^d ∫_0^t σ^{-1}_{i,j}(X_s) dM_s^j,

so that
(7.55) βt , t ≥ 0, is an Rd -valued continuous (Gt )-local martingale
and
(7.56)  ⟨β^i, β^j⟩_t = ⟨Σ_{k=1}^d ∫_0^· σ^{-1}_{i,k}(X_s) dM_s^k , Σ_{ℓ=1}^d ∫_0^· σ^{-1}_{j,ℓ}(X_s) dM_s^ℓ⟩_t
         = Σ_{k,ℓ=1}^d ∫_0^t σ^{-1}_{i,k}(X_s) σ^{-1}_{j,ℓ}(X_s) d⟨M^k, M^ℓ⟩_s
         (7.53)=  Σ_{k,ℓ=1}^d ∫_0^t σ^{-1}_{i,k}(X_s) a_{k,ℓ}(X_s) σ^{-1}_{j,ℓ}(X_s) ds = ∫_0^t (σ^{-1}(X_s) a(X_s) ᵗσ^{-1}(X_s))_{i,j} ds = δ_{i,j} t.

It thus follows from Paul Lévy’s characterization, cf. (6.44), (6.45), that

(7.57) (βt )t≥0 is d-dimensional (Gt )-Brownian motion under Px .

Now, from (7.54) we deduce that Px -a.s., for t ≥ 0,


(7.58)  ∫_0^t σ(X_s) · dβ_s = ∫_0^t σ(X_s) σ^{-1}(X_s) · dM_s = M_t,  using (5.92),

and therefore Px -a.s., for t ≥ 0,


(7.59)  X_t = x + ∫_0^t b(X_s) ds + ∫_0^t σ(X_s) · dβ_s.

This yields the representation (7.43).
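The construction (7.50), (7.54) of the Brownian motion β from M can be mimicked on a simulated path: form the increments of M, multiply them by σ^{-1}(X_s), and check that the realized covariation of the resulting process is approximately δ_{i,j} t, as in (7.56). A small sketch (illustrative two-dimensional coefficients, Euler discretization) is given below.

import numpy as np

# Reconstruction of the driving Brownian motion as in (7.50), (7.54): from a simulated path
# of dX = b(X) dt + sigma(X) dB (illustrative two-dimensional coefficients), form
# dM = dX - b(X) dt and dbeta = sigma(X)^{-1} dM, and check that the realized covariation
# of beta is approximately delta_{ij} t, in line with (7.56).
rng = np.random.default_rng(2)
d, t_max, n_steps = 2, 1.0, 20000
dt = t_max / n_steps
b = lambda y: -y
sigma = lambda y: np.array([[1.0 + 0.2 * np.sin(y[0]), 0.3],
                            [0.0, 1.0 + 0.2 * np.cos(y[1])]])   # invertible for every y

X = np.zeros(d)
cov = np.zeros((d, d))                        # accumulates the sum of dbeta dbeta^t
for _ in range(n_steps):
    dB = rng.normal(scale=np.sqrt(dt), size=d)
    dX = b(X) * dt + sigma(X) @ dB
    dM = dX - b(X) * dt                       # increment of M in (7.50)
    dbeta = np.linalg.solve(sigma(X), dM)     # sigma^{-1}(X_s) dM_s, cf. (7.54)
    cov += np.outer(dbeta, dbeta)
    X = X + dX
print(cov)                                    # approximately t_max * identity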

Exercise 7.8. Show that when Px is a solution of the martingale problem (7.2) on
(C(R+ , Rd ), F), then for any f ∈ C 2 (Rd )
f(X_t) − f(X_0) − ∫_0^t Lf(X_s) ds,  t ≥ 0,

is a (G_t)-local martingale, where (G_t)_{t≥0} is the filtration G_t = H_{t+} (= ⋂_{ε>0} H_{t+ε}), where H_t = σ(F_t, N), and N is the collection of P_x-negligible sets of F (so that (C(R_+, R^d), F, (G_t)_{t≥0}, P_x) satisfies the usual conditions). □

We further discuss the martingale problem (7.2) and its link with the SDE (7.1). As
an application of Theorems 7.1 and 7.6 we have the following

Corollary 7.9. Assume that b(·): Rd → Rd and σ(·): Rd → Md×n satisfy the Lipschitz
condition (7.3). Then, for x ∈ Rd ,

there is a unique solution of the martingale problem (7.2) attached to

(7.60)  L = ½ Σ_{i,j=1}^d a_{ij}(·) ∂²_{i,j} + Σ_{i=1}^d b_i(·) ∂_i,  with a(·) = σ(·) ᵗσ(·)

(one says that the martingale problem attached to L is well-posed).

Proof.
• Existence:
We consider some filtered probability space (Ω, G, (Gt )t≥0 , P ) endowed with an n-dimensional
(G_t)-Brownian motion (B_t)_{t≥0}. (One can for instance pick the canonical space (C(R_+, R^d), F, (F_t)_{t≥0}, W_0), and B. = X., the canonical process.) By (7.4), we know that we can con-
struct a “solution”, i.e. a continuous adapted (Yt )t≥0 , such that P -a.s., for t ≥ 0,
(7.61)  Y_t = x + ∫_0^t b(Y_s) ds + ∫_0^t σ(Y_s) · dB_s.

It then follows from (7.42) that

(7.62)  the law of (Y_t)_{t≥0} on (C(R_+, R^d), F) is a solution of the martingale problem attached to L and x.

• Uniqueness:
Assume that Px is a solution of the martingale problem (7.2) attached to L and x. By (7.43)
we know that we can find some (Ω, G, (Gt )t≥0 , P ) and (βt )t≥0 , which is an n-dimensional
Brownian motion, and a continuous (Gt )-adapted Rd -valued process (Zt )t≥0 , such that
P -a.s., for t ≥ 0:
(7.63)  Z_t = x + ∫_0^t b(Z_s) ds + ∫_0^t σ(Z_s) · dβ_s,  and  P_x = law of (Z_t)_{t≥0} on (C(R_+, R^d), F).

From (7.11), (7.21), we see that P -a.s., for all t ≥ 0,

(7.64) Zt = Xt∞ ,

where X_t^∞ is, see below (7.19), defined as the P-a.s. uniform limit on compact intervals of X_t^m, t ≥ 0, where

(7.65)  X_t^0 ≡ x,  X_t^1 = x + ∫_0^t b(X_s^0) ds + ∫_0^t σ(X_s^0) · dβ_s,  and for m ≥ 1,
        X_t^{m+1} = x + ∫_0^t b(X_s^m) ds + ∫_0^t σ(X_s^m) · dβ_s.

By inspection of (7.65) we see that the laws of (X_t^m)_{t≥0} and of (X_t^∞)_{t≥0} are unchanged if instead of β. one uses the canonical n-dimensional Brownian motion. Combining this observation with (7.64), we see that the law of Z. (i.e. P_x) is uniquely determined.
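The Picard scheme (7.65) is easy to carry out on a fixed discretized realization of the driving Brownian motion: one iterates the map X ↦ x + ∫_0^· b(X_s) ds + ∫_0^· σ(X_s) dβ_s on a time grid. The following minimal sketch (d = n = 1; the Lipschitz coefficients b, σ and the left-point discretization are illustrative choices) displays the successive iterates converging.

import numpy as np

# Picard iteration (7.65) on one fixed realization of the driving Brownian motion
# (d = n = 1; the Lipschitz coefficients b, sigma are illustrative choices).
rng = np.random.default_rng(3)
b = lambda y: -y
sigma = lambda y: 0.5 * np.cos(y)

x, t_max, n_steps = 1.0, 1.0, 2000
dt = t_max / n_steps
dbeta = rng.normal(scale=np.sqrt(dt), size=n_steps)      # increments of beta

X = np.full(n_steps + 1, x)                              # X^0 = x identically
for m in range(8):
    # left-point discretization of x + \int_0^t b(X^m_s) ds + \int_0^t sigma(X^m_s) dbeta_s
    X_new = np.empty(n_steps + 1)
    X_new[0] = x
    X_new[1:] = x + np.cumsum(b(X[:-1]) * dt + sigma(X[:-1]) * dbeta)
    print("iteration", m + 1, "sup-distance to previous iterate:", np.max(np.abs(X_new - X)))
    X = X_new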

Remark 7.10. A not very satisfactory feature of the above theorem has to do with the
fact that the assumptions are made on the coefficients σ(·) and b(·), that appear in (7.3),
but the conclusion concerns the martingale problem where only a(·) and b(·) are involved.
It is clear that it does not suffice to assume a(·) Lipschitz continuous in order to find a Lipschitz continuous σ(·) such that σ ᵗσ = a (for instance, when d = 1, a(x) = |x| provides a counterexample).
However, one can show that when a(·): R^d → M_{d×d} satisfies a global ellipticity condition, i.e. for some ε > 0,

(7.66)  ᵗξ a(x) ξ ≥ ε |ξ|²,  for all x, ξ in R^d,

and a Lipschitz condition

(7.67) |a(y) − a(z)| ≤ K |y − z|, for y, z ∈ Rd ,

then a^{1/2}(·) satisfies a Lipschitz condition as well, cf. [11], p. 97. Of course a^{1/2}(·) can then play the role of σ(·) in Corollary 7.9.
A similar Lipschitz property of a^{1/2}(·) can be proved when instead of (7.66), (7.67) one assumes that
(7.68)  sup_{x∈R^d} |a(x)| ≤ C < ∞,  and

(7.69)  x ∈ R^d → a(x) ∈ M_{d×d} is a C²-function and  sup_{1≤i,j≤d} sup_{x∈R^d} |∂²_{i,j} a(x)| ≤ C < ∞.

In this last case, a(·) need not be uniformly elliptic. As a direct application of Corollary 7.9, one now has:

(7.70)  if b(·) satisfies a global Lipschitz condition, and a(·) either satisfies (7.66), (7.67), or (7.68), (7.69), then the martingale problem attached to L is well-posed.
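For a concrete σ(·), the square root a^{1/2}(x) appearing above can be computed pointwise from a spectral decomposition of a(x). The small sketch below (the field a(·), with d = 2, is an illustrative choice) does this and checks σ ᵗσ = a at one point.

import numpy as np

# Pointwise square root a^{1/2}(x) of a symmetric non-negative matrix field a(x), obtained
# from a spectral decomposition; a^{1/2}(x) can then play the role of sigma(x) in
# Corollary 7.9.  The field a(.) below (d = 2) is an illustrative choice.
def a(x):
    return np.array([[2.0 + np.cos(x[0]), 0.5],
                     [0.5, 2.0 + np.sin(x[1])]])

def sqrt_spd(m):
    # m = U diag(lam) U^t  =>  m^{1/2} = U diag(sqrt(lam)) U^t
    lam, U = np.linalg.eigh(m)
    return U @ np.diag(np.sqrt(np.clip(lam, 0.0, None))) @ U.T

x = np.array([0.3, -1.2])
sigma_x = sqrt_spd(a(x))
print(np.allclose(sigma_x @ sigma_x.T, a(x)))    # sigma sigma^t = a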


Girsanov transformations and applications to martingale problems


We will now bring into play certain exponential martingales, and use them as a tool to solve various martingale problems. An important role is played by a theorem due in various forms to Cameron-Martin (1944), Girsanov (1960), Maruyama (1954, 1955).

Setting:
(Ω, G, (Gt )t≥0 , P ) is a filtered probability space satisfying the usual conditions, cf. (4.5),
(4.6).
(M_t)_{t≥0} is a continuous local martingale such that

(7.71)  Z_t = exp{M_t − ½ ⟨M⟩_t},  t ≥ 0,  is a martingale.
As we know from Novikov's criterion, cf. (6.72), (6.74), this is for instance the case when E[exp{½ ⟨M⟩_t}] < ∞, for each t ≥ 0.
We pick a fixed T > 0, and since E[Z_T] = 1 (= E[Z_0]), we introduce the new probability Q on (Ω, G) defined by

(7.72)  Q := Z_T P.
Of course when A ∈ G_t with t ≤ T, one has Q(A) = E[1_A Z_T] = E[1_A Z_t], so that

(7.73)  dQ/dP |_{G_t} = Z_t,  0 ≤ t ≤ T,

in other words Zt , for t ≤ T , represents the density of the restriction of Q to Gt with


respect to the restriction of P to Gt . Note that P and Q have the same null sets.
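In computations, the change of measure (7.72), (7.73) simply amounts to reweighting samples drawn under P by the density Z_T. As a minimal illustration (the choice M_t = θB_t, with B a Brownian motion and θ a constant, is ours), the sketch below checks that E^Q[B_T] = θT, consistent with the fact that B_t − θt is a Q-Brownian motion (Theorem 7.11 below, with N = B).

import numpy as np

# Change of measure (7.72), (7.73) in a Monte Carlo computation, for the simple case
# M_t = theta * B_t with B a Brownian motion and theta a constant, so that
# Z_t = exp(theta*B_t - theta^2 t/2).  Under Q = Z_T P, B_t - theta*t is a Brownian motion,
# hence E^Q[B_T] = theta*T.
rng = np.random.default_rng(4)
theta, T, n_paths = 0.7, 2.0, 200000

B_T = rng.normal(scale=np.sqrt(T), size=n_paths)      # samples of B_T under P
Z_T = np.exp(theta * B_T - 0.5 * theta**2 * T)        # density dQ/dP on G_T

print("mean weight (should be close to 1):", Z_T.mean())
print("E^Q[B_T] ~", np.mean(Z_T * B_T), " vs theta*T =", theta * T)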

Theorem 7.11. (Cameron-Martin, Girsanov, Maruyama)
If (Nt )0≤t≤T is a continuous local martingale under P , the “Girsanov transform of N ”
defined as
(7.74)  Ñ_t = N_t − ⟨N, M⟩_t,  0 ≤ t ≤ T,  is a continuous local martingale under Q.
Moreover, if N 1 , N 2 are continuous local martingales under P , then P -a.s. (or equivalently
Q-a.s.),

(7.75)  ⟨Ñ¹, Ñ²⟩_t^Q = ⟨N¹, N²⟩_t^P,  for 0 ≤ t ≤ T.

Proof. Without loss of generality we assume N0 = 0. We first claim that:


(7.76)  Ñ_t Z_t,  0 ≤ t ≤ T,  is a local martingale under P.
Indeed, it follows by Ito’s formula (6.22) and by (6.28) that P -a.s., for t ≥ 0,
Ñ_t Z_t = ∫_0^t Z_s dÑ_s + ∫_0^t Ñ_s dZ_s + ⟨Ñ, Z⟩_t,  and

Z_t = 1 + ∫_0^t Z_s dM_s,  so that  ⟨Ñ, Z⟩_t = ⟨N − ⟨N, M⟩, Z⟩_t = ⟨N, Z⟩_t  (5.90)=  ∫_0^t Z_s d⟨N, M⟩_s.

As a result,
(7.77)  Ñ_t Z_t = ∫_0^t Z_s dN_s − ∫_0^t Z_s d⟨N, M⟩_s + ∫_0^t Ñ_s dZ_s + ∫_0^t Z_s d⟨N, M⟩_s = ∫_0^t Z_s dN_s + ∫_0^t Ñ_s dZ_s,  and (7.76) follows.

We will now see that Ñ_t, 0 ≤ t ≤ T, is a local martingale under Q. We define:
(7.78)  T_n = inf{u ≥ 0; |Ñ_u| ≥ n}  (↑ ∞, as n → ∞),
        S_m = inf{u ≥ 0; |M_u| ≥ m or ⟨M⟩_u ≥ m}  (↑ ∞, as m → ∞).
The calculation (7.77) applied to M_{t∧S_m}, Z_{t∧S_m}, N_{t∧T_n∧S_m}, Ñ_{t∧T_n∧S_m} shows that:

(7.79)  Ñ_{t∧T_n∧S_m} Z_{t∧S_m} is a continuous local martingale, which is bounded, and hence a martingale.
As a result, we see that for 0 ≤ s ≤ t ≤ T, and A ∈ G_s, we have:

(7.80)  E^P[1_A Ñ_{t∧T_n∧S_m} Z_{T∧S_m}] = E^P[1_A Ñ_{t∧T_n∧S_m} Z_{t∧S_m}] = E^P[1_A Ñ_{s∧T_n∧S_m} Z_{s∧S_m}] = E^P[1_A Ñ_{s∧T_n∧S_m} Z_{T∧S_m}]

(the first and last equalities use that 1_A Ñ_{t∧T_n∧S_m}, resp. 1_A Ñ_{s∧T_n∧S_m}, is G_t-measurable, resp. G_s-measurable, and that Z_{·∧S_m} is a martingale; the middle equality uses the martingale property (7.79)).

Letting m → ∞, and observing that Z_{T∧S_m} → Z_T in L¹(P) (by uniform integrability and a.s. convergence), and keeping in mind that |Ñ_{·∧T_n}| ≤ n, we find that
(7.81)  E^P[1_A Ñ_{t∧T_n} Z_T] = E^P[1_A Ñ_{s∧T_n} Z_T],  that is,  E^Q[1_A Ñ_{t∧T_n}] = E^Q[1_A Ñ_{s∧T_n}].

Since Tn ↑ ∞, P and Q-a.s., this proves that

(7.82)  Ñ_t,  0 ≤ t ≤ T,  is a continuous local martingale under Q.

We then turn to the proof of (7.75). We introduce


I_t := N_t² − ⟨N⟩_t = 2 ∫_0^t N_s dN_s

using Ito's formula (6.22) and the notation ⟨·⟩ = ⟨·⟩^P. By (7.82) we find that
(7.83)  Ĩ_t = I_t − ⟨I, M⟩_t  (7.74)=  N_t² − ⟨N⟩_t − 2 ∫_0^t N_s d⟨N, M⟩_s
is a continuous local martingale under Q.

As a result

(7.84)  Ñ_t² − ⟨N⟩_t  (7.74)=  N_t² − 2N_t ⟨N, M⟩_t + ⟨N, M⟩_t² − ⟨N⟩_t  (7.83)=  Ĩ_t + ⟨N, M⟩_t² + 2 ∫_0^t N_s d⟨N, M⟩_s − 2N_t ⟨N, M⟩_t.

It follows from Ito’s formula that:


J_t = 2N_t ⟨N, M⟩_t − 2 ∫_0^t N_s d⟨N, M⟩_s = 2 ∫_0^t ⟨N, M⟩_s dN_s = continuous local martingale under P,

and
(7.85)  J̃_t  (7.74)=  J_t − 2 ∫_0^t ⟨N, M⟩_s d⟨N, M⟩_s = J_t − ⟨N, M⟩_t².

We can then come back to (7.84) to conclude that

(7.86)  Ñ_t² − ⟨N⟩_t = Ĩ_t − J̃_t  (7.74)=  continuous local martingale under Q.

This shows that

(7.87)  ⟨Ñ⟩_t^Q = ⟨N⟩_t^P,  0 ≤ t ≤ T,

and the claim (7.75) now follows by polarization.

We will now apply the above theorem in order to construct the solution of certain
martingale problems.
Theorem 7.12. Assume that b(·), c(·): R^d → R^d and a(·): R^d → M_{d×d}^+ (i.e. the set of d × d non-negative matrices) are measurable locally bounded functions, with b, a, ᵗcac bounded. Then, there is a bijective correspondence between the solutions of the martingale problem attached to
(7.88)  L = ½ Σ_{i,j=1}^d a_{ij} ∂²_{i,j} + Σ_{i=1}^d b_i ∂_i,  x ∈ R^d,

and
(7.89)  L̃ = ½ Σ_{i,j=1}^d a_{ij} ∂²_{i,j} + Σ_{i=1}^d (b_i + (ac)_i) ∂_i,  x ∈ R^d.

This bijective correspondence is the following. To each solution P of the martingale problem attached to (L, x), one associates the law P̃ on (C(R_+, R^d), F), which is specified by the fact that

(7.90)  dP̃/dP |_{F_t} = exp{∫_0^t c(X_s) · dX̄_s − ½ ∫_0^t ᵗcac(X_s) ds},  for t ≥ 0  (· denotes the R^d-scalar product),

where X̄_t = X_t − ∫_0^t b(X_s) ds, t ≥ 0.

Proof. Note that under P, the process X̄ is a continuous local martingale and
(7.91)  ⟨X̄^i, X̄^j⟩_t = ∫_0^t a_{ij}(X_s) ds,  for t ≥ 0,  cf. (7.53).
Hence, ∫_0^t c(X_s) · dX̄_s is a continuous local martingale and P-a.s.:

(7.92)  ⟨∫_0^· c(X_s) · dX̄_s⟩_t = ∫_0^t ᵗcac(X_s) ds,  for t ≥ 0.
As a result, the expression in (7.90) is the stochastic exponential of ∫_0^· c(X_s) · dX̄_s. By Novikov's criterion (6.72) or by (6.29), we know that:

(7.93)  Z_t = exp{∫_0^t c(X_s) · dX̄_s − ½ ∫_0^t ᵗcac(X_s) ds},  t ≥ 0,  is a continuous martingale.

We will now use the following

Lemma 7.13. There is a unique probability Q on (C(R_+, R^d), F) such that

(7.94)  dQ/dP |_{F_t} = Z_t,  for each t ≥ 0.
Proof. Uniqueness follows from Dynkin’s lemma.

We now prove the existence. Consider for n ≥ 2 the laws πn on


Σ_n := C([0, 1], R^d) × C_0([0, 1], R^d) × · · · × C_0([0, 1], R^d)  ((n − 1) copies of C_0([0, 1], R^d)),
of ((X_·)_{0≤·≤1}, (X_{1+·} − X_1)_{0≤·≤1}, . . . , (X_{n−1+·} − X_{n−1})_{0≤·≤1}) under Z_n P

(here C_0([0, 1], R^d) := {w ∈ C([0, 1], R^d); w(0) = 0}).
The laws π_n, n ≥ 2, are consistent, i.e. the image of π_{n+1} on Σ_n under the "projection" from Σ_{n+1} → Σ_n, which drops the last component, is equal to π_n, for all n ≥ 2, thanks to the martingale property of Z_t, t ≥ 0, under P.
By Kolmogorov's extension theorem (see for instance [12], p. 129), there is a (unique) probability π on Σ = C([0, 1], R^d) × Π_{1}^{∞} C_0([0, 1], R^d), such that for each n ≥ 2, the image of π under the natural "projection" Σ → Σ_n is π_n. If one now defines the map ϕ: Σ → C(R_+, R^d) via
ϕ((w1 , w2 , . . . )) = w such that
w(t) = w1 (t), for 0 ≤ t ≤ 1 ,
(7.95)
w(t) = w1 (1) + w2 (t − 1), for 1 ≤ t ≤ 2 ,
w(t) = w1 (1) + w2 (1) + · · · + wn (1) + wn+1 (t − n), for n ≤ t ≤ n + 1 ,

then, the image Q of π under ϕ satisfies (by definition of πn ):


(7.96)  dQ/dP |_{F_n} = Z_n,  for any n ≥ 2.

The martingale property (7.93) now implies the property (7.94).


We will now see that the unique Q constructed by the above lemma satisfies:
(7.97)  Q solves the martingale problem (L̃, x).
To this end, we first note by (7.74) that under Zn P
X̄_t^i − ⟨X̄^i, ∫_0^· c(X_s) · dX̄_s⟩_t = X_t^i − ∫_0^t b_i(X_s) ds − Σ_{j=1}^d ∫_0^t c_j(X_s) d⟨X̄^i, X̄^j⟩_s  (7.91)=  X_t^i − ∫_0^t (b_i(X_s) + (ac)_i(X_s)) ds,  0 ≤ t ≤ n,

is a local martingale.

If we now define T_m = inf{u ≥ 0; |X_u − ∫_0^u (b + ac)(X_s) ds| ≥ m}, so that T_m ↑ ∞, as m → ∞ (ac = a^{1/2}(a^{1/2}c) is bounded, since a and ᵗcac are bounded),

(7.98)  X_{t∧T_m} − ∫_0^{t∧T_m} (b + ac)(X_s) ds,  0 ≤ t ≤ n,  is a martingale under Z_n P (and also under Q).

As a result, letting m vary, we see that the R^d-valued process

(7.99)  M_t = X_t − ∫_0^t (b + ac)(X_s) ds  is a local martingale under Q.

By (7.75) and (7.91) we see that Q-a.s.,


(7.100)  ⟨M^i, M^j⟩_t = ∫_0^t a_{ij}(X_s) ds,  t ≥ 0.

The application of Ito’s formula yields that for any f in C 2 (Rd ), Q-a.s., for t ≥ 0:
(7.101)  f(X_t) = f(X_0) + Σ_{i=1}^d ∫_0^t ∂_i f(X_s) dX_s^i + ½ Σ_{i,j=1}^d ∫_0^t ∂²_{ij} f(X_s) d⟨X^i, X^j⟩_s
         (7.99),(7.100)=  f(X_0) + ∫_0^t ∇f(X_s) · (b + ac)(X_s) ds + ∫_0^t ∇f(X_s) · dM_s + ½ Σ_{i,j=1}^d ∫_0^t ∂²_{ij} f(X_s) a_{ij}(X_s) ds
         = f(X_0) + ∫_0^t L̃f(X_s) ds + continuous local martingale under Q.

This completes the proof of (7.97).


Since Q coincides with P̃ in (7.90), we see that for any x ∈ R^d

(7.102)  P → P̃ sends solutions of the martingale problem (L, x) into solutions of the martingale problem (L̃, x).

It now remains to see that the correspondence is bijective. We first show that

(7.103)  the correspondence P → P̃ is injective.

Assume that P_1 → P̃ and P_2 → P̃. Then for t ≥ 0, P_1 ∼ P_2 ∼ P̃ on F_t, and the integral ∫_0^s c(X_u) · dX̄_u, 0 ≤ s ≤ t, is identical P_1-, P_2-, P̃-a.s. (this feature goes back to (4.50), (4.57), (4.59) and the explicit approximating sequences used to construct stochastic integrals, see
exercise 2) below). Now the equalities P̃ = Z_t P_1 = Z_t P_2 on F_t imply that P_1 = P_2 on F_t, for t ≥ 0, and (7.103) follows, by Dynkin's lemma.

(7.104)  The correspondence P → P̃ is surjective.

Indeed, consider Q a solution of the martingale problem (L̃, x). The above shows that there is a unique P on (C(R_+, R^d), F) such that for any t ≥ 0

(7.105)  dP/dQ |_{F_t} = exp{−∫_0^t c(X_s) · d(X_s − ∫_0^s (b + ac)(X_u) du) − ½ ∫_0^t ᵗcac(X_u) du} =: Z̃_t,
and P is a solution of the martingale problem (L, x) (this is an application of (7.102) with
Q playing the role of P ).
Note that for t ≥ 0,
Z̃_t = exp{−∫_0^t c(X_s) · dX̄_s + ½ ∫_0^t (ᵗcac)(X_u) du} = Z_t^{-1},

so that for t ≥ 0,

dP̃/dP |_{F_t} = Z_t = dQ/dP |_{F_t},

and hence P̃ = Q, thus completing the proof of (7.104).

Example:
We know by (7.60) that the martingale problem attached to L = ½∆ is well-posed (this is the case b = 0, σ = identity matrix (d × d)). The solution to the martingale problem attached to (L, x) is W_x, the "Wiener measure starting from x".
Consider now b(·): Rd → Rd , bounded measurable.
When L̃ = ½∆ + b · ∇, we see by (7.90) that the martingale problem attached to (L̃, x) has a unique solution, which is a probability W̃_x on (C(R_+, R^d), F) such that

(7.106)  dW̃_x/dW_x |_{F_t} = exp{∫_0^t b(X_s) · dX_s − ½ ∫_0^t |b(X_s)|² ds},  for any t ≥ 0.

Note that W̃_x-a.s.,

(7.107)  X_t = x + ∫_0^t b(X_s) ds + β_t,  for t ≥ 0,

where (β_t)_{t≥0} is an (F_t)-Brownian motion under W̃_x (thanks to (7.75) and Paul Lévy's characterization, Theorem 6.10).
In particular this yields a (weak) solution to the stochastic differential equation at-
tached to b(·) (which is only bounded measurable!) and σ(·) = Id.
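Formula (7.106) can be used directly for computation: an expectation under W̃_x can be estimated either by simulating the drifted dynamics (7.107), or by reweighting plain Brownian paths with the exponential density. The sketch below (d = 1; the bounded drift b and the test function g are illustrative choices) compares the two estimators.

import numpy as np

# Two estimators of E^{W~_x}[g(X_t)] (d = 1, bounded drift b and test function g chosen
# for illustration): (a) an Euler scheme for the drifted dynamics (7.107);
# (b) reweighting plain Brownian paths by the Girsanov density of (7.106).
rng = np.random.default_rng(5)
b = np.tanh
g = lambda y: y**2
x, t, n_steps, n_paths = 0.0, 1.0, 500, 50000
dt = t / n_steps

# (a) simulate X_t = x + \int_0^t b(X_s) ds + beta_t
X = np.full(n_paths, x)
for _ in range(n_steps):
    X = X + b(X) * dt + rng.normal(scale=np.sqrt(dt), size=n_paths)
est_a = g(X).mean()

# (b) Brownian paths started at x, weighted by exp(\int b(X) dX - 1/2 \int b(X)^2 ds)
Y = np.full(n_paths, x)
log_Z = np.zeros(n_paths)
for _ in range(n_steps):
    dY = rng.normal(scale=np.sqrt(dt), size=n_paths)
    log_Z += b(Y) * dY - 0.5 * b(Y)**2 * dt
    Y = Y + dY
est_b = np.mean(np.exp(log_Z) * g(Y))

print("Euler on (7.107):", est_a, "   Girsanov reweighting via (7.106):", est_b)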

Exercise 7.14.
1) Show that all solutions to (7.107) have the same law (hint: use Theorem 7.12).
2) Consider P a solution of the martingale problem (L, x), where L is as in (7.88) and P̃ such that (7.90) holds (with the same assumptions on a(·), b(·), c(·)).
a) Show that when H is a bounded progressively measurable process, ∫_0^t H_s dX̄_s is well-defined regardless of whether one uses that under P, X̄. is a continuous martingale or that under P̃, X̄. is a continuous semimartingale (hint: use the approximating sequences from (4.57), with A_s = s, and (4.59); alternatively use (5.82), (7.75)).
b) Show that ∫_0^t c(X_s) · dX̄_s is well-defined regardless of whether one works with P or P̃ to interpret the stochastic integral.
3) When b = 1 in (7.106), show that although the restrictions of W_x and W̃_x to F_t are equivalent, for each t ≥ 0, one has W_x ⊥ W̃_x (i.e. there is A ∈ F with W_x(A) = 1 = W̃_x(A^c)). □

Explosions of solutions of stochastic differential equations: an application of


Girsanov transformations.
We consider c(·) : Rd → Rd a locally Lipschitz function:

(7.108) ∀M > 0, ∃KM > 0, such that |c(x) − c(y)| ≤ KM |x − y|, for |x|, |y| ≤ M .

If we now consider (Xt )t≥0 , the canonical process on (C(R+ , Rd ), F, (Ft )t≥0 , Wx ), i.e. the
canonical Brownian motion starting from x, we know from (6.27) that
(7.109)  Z_t = exp{∫_0^t c(X_s) · dX_s − ½ ∫_0^t |c(X_s)|² ds},  t ≥ 0,

is a continuous local martingale.


However when c(·) “grows too fast at infinity”, it need not be a martingale. The key
quantity, cf. (6.84), is

(7.110) et = 1 − Ex [Zt ] ≥ 0 ,

since we know, from (6.84), that et = 0 implies that Zs , 0 ≤ s ≤ t, is a martingale.


We will now provide an interpretation of et , t ≥ 0, in terms of the possible “explosions”
of the SDE:
(7.111)  dY_t = c(Y_t) dt + dB_t,  Y_0 = x.

For this purpose we choose a sequence of bounded, Lipschitz functions cN (·) on Rd , such
that

(7.112) cN (·) = c(·) on B N = {z ∈ Rd ; |z| ≤ N } .

Using the canonical Brownian motion X. as "driving noise", we have by (7.3), (7.4) a unique solution (Y_t^N)_{t≥0} of

(7.113)  Y_t^N = X_t + ∫_0^t c_N(Y_s^N) ds

(since σ(·) = Id, Y.N is even (Ft )-adapted and actually a continuous function of X. , when
C(R+ , Rd ) is endowed with the topology of uniform convergence on compact time intervals,
cf. (7.21)).
We then define the (Ft )-stopping time:
(7.114) TN = inf{u ≥ 0 : |YuN | ≥ N } .
Lemma 7.15. (N, k ≥ 0)

(7.115)  For all t ≥ 0,  Y_{t∧T_N}^{N+k} = Y_{t∧T_N}^N.

Proof. Define TNk = inf{u ≥ 0; |YuN +k | ≥ N }. As below (7.5) we find that


Y_{t∧T_N∧T_N^k}^{N+k} − Y_{t∧T_N∧T_N^k}^N = ∫_0^{t∧T_N∧T_N^k} (c_{N+k}(Y_s^{N+k}) − c_N(Y_s^N)) ds  (by (7.113)),

and since |YsN +k |, |YsN | ≤ N for 0 < s ≤ TN ∧ TNk , and cN +k (·) = cN (·) = c(·) on B N , we
see that for t0 ≥ 0,
(7.116)  sup_{t≤t_0} |Y_{t∧T_N∧T_N^k}^{N+k} − Y_{t∧T_N∧T_N^k}^N| ≤ K_N ∫_0^{t_0} |Y_{s∧T_N∧T_N^k}^{N+k} − Y_{s∧T_N∧T_N^k}^N| ds,  using (7.108).

By Gronwall’s lemma, cf. (7.9), we see that


(7.117)  for all t_0 ≥ 0,  sup_{t≤t_0} |Y_{t∧T_N∧T_N^k}^{N+k} − Y_{t∧T_N∧T_N^k}^N| = 0.
t≤t0 N

This last equality immediately implies that TN = TNk , for all k ≥ 0, and the claim (7.115)
follows.

By (7.115), we see that for N′ ≥ N, Y^{N′} and Y^N coincide up to time T_N, and in particular,
(7.118) TN is a non-decreasing sequence of (Ft )-stopping times.

We can now define the explosion time of the SDE


Y_t = X_t + ∫_0^t c(Y_s) ds,  t ≥ 0,

as the (Ft )-stopping time:


(7.119) T = lim TN ∈ (0, ∞] .
N →∞

The relation between the explosion time and et in (7.110) comes in the following:

Theorem 7.16.

(7.120) For t ≥ 0, et = Wx [T ≤ t] .

Proof. The case t = 0 is obvious and we thus assume t > 0.


By (7.106), (7.107), we know that if we define the probability
(7.121)  Q_N = exp{∫_0^t c_N(X_s) · dX_s − ½ ∫_0^t |c_N(X_s)|² ds} W_x,

under QN ,
(7.122)  X_s = x + β_s + ∫_0^s c_N(X_u) du,  for 0 ≤ s ≤ t,

where (βs )0≤s≤t is a d-dimensional Brownian motion, and the law (on C([0, t], Rd )) of
(Xs )0≤s≤t under QN is that of the restriction to time [0, t] of the unique solution of the
martingale problem attached to L = ½∆ + c_N · ∇, and x. By (7.42) the law of (Y_s^N)_{0≤s≤t},
under Wx , with Y.N as in (7.113), coincides with the law of (Xs )0≤s≤t under QN . As a
result, we find that setting
S_N := T_{B_N} = inf{s ≥ 0; |X_s| ≥ N},

we have
(7.123)  W_x[T_N > t] = Q_N[S_N > t] = E_x[S_N > t, exp{∫_0^t c_N(X_s) · dX_s − ½ ∫_0^t |c_N(X_s)|² ds}].

Note that Wx -a.s. on {SN > t},


∫_0^t c_N(X_s) · dX_s = ∫_0^{t∧S_N} c_N(X_s) · dX_s = ∫_0^{t∧S_N} c(X_s) · dX_s = ∫_0^t c(X_s) · dX_s,  using (5.60), (7.112),

and therefore:

(7.124) Wx [TN > t] = Ex [SN > t, Zt ] .

Now Wx -a.s. SN ↑ ∞, and thus letting N → ∞, we find

W_x[T > t] = lim_{N→∞} W_x[T_N > t] = E_x[Z_t],
and in view of (7.110) the claim (7.120) follows.
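The interpretation (7.120) can be explored numerically: for a drift c growing superlinearly, one simulates dY = c(Y) dt + dB up to the exit from the ball of radius N, and W_x[T_N ≤ t] approximates the explosion probability e_t for large N. The sketch below (d = 1, with the illustrative choice c(y) = y³, which does lead to explosion in finite time) estimates this probability.

import numpy as np

# Estimate of W_x[T_N <= t] for dY = c(Y) dt + dB with the superlinear drift c(y) = y^3
# (d = 1, an illustrative locally Lipschitz choice): for large N this approximates the
# explosion probability e_t = W_x[T <= t] of (7.120).
rng = np.random.default_rng(6)
c = lambda y: y**3
x, t, dt, n_paths = 1.0, 1.0, 1e-4, 5000
n_steps = int(t / dt)

for N in (10.0, 100.0, 1000.0):
    Y = np.full(n_paths, x)
    alive = np.ones(n_paths, dtype=bool)         # paths that have not yet left B_N
    for _ in range(n_steps):
        dB = rng.normal(scale=np.sqrt(dt), size=n_paths)
        Y = np.where(alive, Y + c(Y) * dt + dB, Y)
        alive &= np.abs(Y) < N
    print("N =", N, "  estimate of W_x[T_N <= t]:", 1.0 - alive.mean())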

Complement: An example of SDE with no strong solution, which is weakly
well-posed
We resume here the discussion (cf. Remark 7.7) of the stochastic differential equation on
R:
(7.125)  Y_t = ∫_0^t sign(Y_s) dB_s,

where B is a one-dimensional Brownian motion and, cf. (7.44),

sign(x) = 1, when x ≥ 0 ,
= −1, when x < 0 .

We have explained below (7.45) that we can always find a solution (in a weak sense)
of this equation: if we have on some (Ω, G, (Gt )t≥0 , P ) satisfying the usual conditions a
(Gt )-Brownian motion Y. , we define
(7.126)  B_t = ∫_0^t sign(Y_s) dY_s,  t ≥ 0.

By Lévy’s characterization, (6.44), (6.45), we know that

(7.127) (Bt )t≥0 is a (Gt )-Brownian motion.

In addition, P -a.s.,
Y_t = ∫_0^t sign²(Y_s) dY_s = ∫_0^t sign(Y_s) dB_s,  for t ≥ 0,  using (5.92), (7.126).

In other words, with (Y, B) as above, we have a solution of the SDE (7.125), in the weak sense. We will now explain that the above structure is "typical" and necessarily, for all t > 0, F_t^B ⊊ F_t^Y, in the notation of (7.22).
By a weak solution of (7.125), we mean an (Ω, G, (Gt )t≥0 , P ) satisfying the usual
conditions, endowed with a (Gt )-Brownian motion (Bt )t≥0 , and a continuous adapted
process (Yt )t≥0 , such that (7.125) holds.

Theorem 7.17. Given a weak solution of (7.125), then Y. is a (Gt )-Brownian motion and
P -a.s.,
(7.128)  B_t = ∫_0^t sign(Y_s) dY_s,  for t ≥ 0.

Moreover, one has the identity:

(7.129) |Yt | = Bt + Lt , for t ≥ 0, “Tanaka’s formula”,

where (Lt )t≥0 , the “local time of Y at 0”, is a continuous, adapted, non-decreasing process,
with L0 = 0, characterized by the fact that P -a.s.,
(7.130)  L_t = ∫_0^t 1{Y_s = 0} dL_s,  for all t ≥ 0.

In addition, P -a.s.,

(7.131)  L_t = sup_{s≤t} (B_s)^- = − inf_{s≤t} B_s,  for t ≥ 0  (where x^- = max(−x, 0)),

and for all t ≥ 0,


|Y |
(7.132) Ft = FtB .

Proof.
• (7.128):
From (7.125) and P. Lévy’s characterization, it follows that Y. is a (Gt )-Brownian motion
and

∫_0^t sign(Y_s) dY_s = ∫_0^t sign²(Y_s) dB_s = B_t,  P-a.s., for all t ≥ 0,  using (5.92),
whence our claim.

• (7.129):
We consider a decreasing sequence of C², symmetric, convex functions ϕ_n on R, such that

(7.133)  ϕ_n(x) = |x| on [−1/n², 1/n²]^c,  ϕ_n′(x) ↑ 1 for x > 0,  ϕ_n′(x) ↓ −1 for x < 0.

[Figure: the graph of ϕ_n, which coincides with |x| outside the interval (−1/n², 1/n²).]
Note that ϕ_n(x) ↓ |x| for all x ∈ R, and in fact 0 ≤ ϕ_n(x) − |x| ≤ 1/n², so ϕ_n(·) converges uniformly to |·| on R. Also, by convexity and (7.133),

(7.134) −1 ≤ ϕ′n (x) ≤ 1, for x ∈ R, n ≥ 1 .

Applying Ito’s formula we find that:
(7.135)  ϕ_n(Y_t) = ϕ_n(Y_0) + ∫_0^t ϕ_n′(Y_s) dY_s + ½ ∫_0^t ϕ_n″(Y_s) ds.

We already know that |ϕ_n(Y_t) − |Y_t|| ≤ 1/n², and |ϕ_n(Y_0) − |Y_0|| ≤ 1/n², and in addition, by (7.128) and Doob's inequality,
(7.136)  E[sup_{0≤s≤t} |∫_0^s ϕ_n′(Y_u) dY_u − B_s|²]  (4.76)≤  4 E[∫_0^t (ϕ_n′(Y_s) − sign(Y_s))² ds]
         ≤ c ∫_0^t P[|Y_s| ≤ 1/n²] ds  ≤  c ∫_0^t ∫_{−1/n²}^{1/n²} du/√(2πs) ds  ≤  (c/n²) √t   (a convergent series in n),

where we used that Y. is a Brownian motion.

It now follows from (7.136) and (7.135) that


(7.137)  P-a.s., ∫_0^t ϕ_n′(Y_s) dY_s converges to B_t, uniformly on bounded time intervals, and
         L_t^n := ∫_0^t ½ ϕ_n″(Y_s) ds converges uniformly, on bounded time intervals, to a continuous non-decreasing process L_t.

Moreover, we have P -a.s., for all t ≥ 0,

(7.138) |Yt | = Bt + Lt (i.e. (7.129) holds).

By (7.137) we also see that P -a.s.,


∫_0^t ψ(s) dL_s^n  →  ∫_0^t ψ(s) dL_s  as n → ∞,  for all t ≥ 0, and continuous bounded ψ(·) on [0, ∞).

Applying this fact to ψ(s) = h(Ys ), where h is a continuous bounded function on R,


vanishing on some open interval containing 0, yields
∫_0^t h(Y_s) dL_s = lim_n ∫_0^t h(Y_s) dL_s^n = lim_n ∫_0^t h(Y_s) ½ ϕ_n″(Y_s) ds  (7.133)=  0.

It now follows that


(7.139)  P-a.s.,  L_t = ∫_0^t 1{Y_s = 0} dL_s,  for all t ≥ 0,

i.e. (7.130) holds. In addition, note that ϕ_n″(·) is symmetric so L_t^n = ∫_0^t ½ ϕ_n″(|Y_s|) ds, and (7.137) together with (7.138) imply that

(7.140)  F_t^L ⊆ F_t^{|Y|},  and  F_t^B ⊆ F_t^{|Y|},  for all t ≥ 0.
It only remains to prove that L necessarily satisfies (7.131); then the claim F_t^B ⊇ F_t^{|Y|} will follow from (7.129), (7.131). For this purpose we will use a deterministic lemma:
Lemma 7.18. (Skorohod)
Let b(·) be a continuous real-valued function on [0, ∞) such that b(0) ≥ 0. There exists a unique pair of continuous functions z(·) and ℓ(·) on [0, ∞) such that

(7.141)  i) z(·) = b(·) + ℓ(·),
         ii) z(·) ≥ 0,
         iii) ℓ(·) is non-decreasing, ℓ(0) = 0, and dℓ(s) is supported by {s ≥ 0; z(s) = 0}.

The function ℓ(·) is moreover given by

(7.142)  ℓ(t) = sup_{s≤t} (b(s))^-  (where x^- = (−x) ∨ 0);  in addition, when b(0) = 0, ℓ(t) = − inf_{s≤t} b(s).

Proof. Note first that z(t) := b(t) + sup_{s≤t}(b(s))^-, ℓ(t) := sup_{s≤t}(b(s))^-, t ≥ 0, satisfy (7.141) i), ii), iii); for iii) note that a time t with z(t) > 0 does not belong to the support of dℓ (note also that the right-hand side of (7.142) equals max(−inf_{s≤t} b(s), 0)).
To prove uniqueness, we consider another pair z′(·), ℓ′(·) of continuous functions on [0, ∞) satisfying (7.141). In particular z − z′ = ℓ − ℓ′ is continuous with bounded variation and vanishes at time 0. So,
0 ≤ (z(t) − z′(t))² = 2 ∫_0^t (z(s) − z′(s)) d(ℓ(s) − ℓ′(s))  (7.141) iii)=  −2 ∫_0^t z′(s) dℓ(s) − 2 ∫_0^t z(s) dℓ′(s)  (7.141) ii), iii)≤  0.

Hence, z(t) = z ′ (t) and ℓ(t) = ℓ′ (t), for all t ≥ 0.


Skorohod’s lemma immediately yields (7.131), and as noted above, this concludes the
proof of the theorem.
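Tanaka's formula (7.129) and the Skorohod construction (7.141), (7.142) are easy to check on a simulated path: take a discretized Brownian motion Y, form B by summing sign(Y) increments as in (7.128), set L_t = sup_{s≤t}(B_s)^-, and compare |Y_t| with B_t + L_t. A minimal sketch (left-point sums, illustrative step size) follows; the discrepancy vanishes as the mesh goes to 0.

import numpy as np

# Discrete check of Tanaka's formula (7.129) and of (7.131): for a simulated Brownian
# path Y, form B_t = \int_0^t sign(Y_s) dY_s by left-point sums as in (7.128), set
# L_t = sup_{s<=t} (B_s)^-, and compare |Y_t| with B_t + L_t (illustrative step size).
rng = np.random.default_rng(7)
t_max, n_steps = 1.0, 200000
dt = t_max / n_steps

dY = rng.normal(scale=np.sqrt(dt), size=n_steps)
Y = np.concatenate(([0.0], np.cumsum(dY)))                 # Brownian path, Y_0 = 0

sign = np.where(Y[:-1] >= 0, 1.0, -1.0)                    # sign as in (7.44)
B = np.concatenate(([0.0], np.cumsum(sign * dY)))          # discretized (7.128)
L = np.maximum.accumulate(np.maximum(-B, 0.0))             # L_t = sup_{s<=t} (B_s)^-

print("sup_t | |Y_t| - (B_t + L_t) | ~", np.max(np.abs(np.abs(Y) - (B + L))))
# the discrepancy is small and tends to 0 as dt -> 0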
Corollary 7.19. For any weak solution of (7.125),

(7.143) the law of (Y, B) on C(R+ , R)2 is uniquely determined

and
(7.144)  for any t > 0,  F_t^B = F_t^{|Y|} ⊊ F_t^Y.

Proof.
• (7.143):
This follows from the fact that Y. is a (G_t)-Brownian motion and (7.128).
• (7.144):
We note that for t ≥ 0:

(7.145) sign(Yt ) is equidistributed on {−1, 1}, and independent of |Y. | .

Indeed, since −Y has the same distribution as Y, we see that for any B in F (the canonical σ-algebra on C(R_+, R)), one has:

P[sign(Y_t) = 1, |Y| ∈ B] = P[sign(−Y_t) = 1, |−Y| ∈ B] = P[sign(Y_t) = −1, |Y| ∈ B] = ½ P[|Y| ∈ B]

(the middle equality uses that Y_t ≠ 0, P-a.s.),

and (7.145) follows.


By (7.132), the claim (7.144) now readily follows (if F_t^{|Y|} = F_t^Y were true, we would have, P-a.s., P[sign(Y_t) = 1 | F_t^{|Y|}] = 1{sign(Y_t) = 1}, whereas by (7.145) this conditional probability equals ½, a contradiction).

Remark 7.20.
1) We have thus shown that (7.125) is weakly well-posed in the sense that there are
weak solutions for (7.125) and the law of (Y, B) for any such weak solution is uniquely
determined.
However, there are no strong solutions of (7.125) due to (7.144). Note for instance that if we
choose B to be the canonical Brownian motion (i.e. Bt = Xt , t ≥ 0, and (Ω, G, (Gt )t≥0 , P ) =
(C(R+ , R), F, (Ft )t≥0 , W0 )), we cannot find an adapted stochastic process Y. such that
(7.125) holds.

2) We have in fact shown in (7.129) that given any (Gt )-Brownian motion Y , one has the
identity, P -a.s., for all t ≥ 0:
(7.146)  |Y_t| = ∫_0^t sign(Y_s) dY_s + L_t

where L_0 = 0, and L_t is a continuous, non-decreasing stochastic process such that L_t = ∫_0^t 1{Y_s = 0} dL_s.
This is the so-called Tanaka Formula for the local time of Brownian motion. We refer
to Chapter 3 §6 of [8] for more on this matter. □
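The local time L of (7.146) can also be approximated by the normalized occupation time of Y near 0, in the spirit of (7.137) with ϕ_n″/2 replaced by (1/(2ε)) 1{|y| ≤ ε}. The sketch below (discretized path; the step size and ε are illustrative choices) compares this with the Skorohod/Tanaka expression of (7.131).

import numpy as np

# Two approximations of the local time at 0 at time t of a Brownian motion Y:
#  (i)  L_t = sup_{s<=t} (B_s)^- with B_t = \int_0^t sign(Y_s) dY_s, cf. (7.131);
#  (ii) the occupation-time approximation (1/(2*eps)) Leb{s <= t : |Y_s| <= eps},
# in the spirit of (7.137) with phi_n''/2 replaced by a normalized indicator.
rng = np.random.default_rng(8)
t_max, n_steps, eps = 1.0, 400000, 0.01
dt = t_max / n_steps

dY = rng.normal(scale=np.sqrt(dt), size=n_steps)
Y = np.concatenate(([0.0], np.cumsum(dY)))

B = np.cumsum(np.where(Y[:-1] >= 0, 1.0, -1.0) * dY)
L_tanaka = np.max(np.maximum(-B, 0.0))                               # (i) at time t_max
L_occupation = dt * np.count_nonzero(np.abs(Y) <= eps) / (2 * eps)   # (ii)

print("Tanaka/Skorohod:", L_tanaka, "  occupation-time approximation:", L_occupation)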

References
[1] R.A. Adams. Sobolev Spaces. Academic Press, New York, 1975.

[2] J.D. Deuschel and D.W. Stroock. Large deviations. Academic Press, Boston, 1989.

[3] W.F. Donoghue. Distributions and Fourier transforms. Academic Press, Vol. 32, 1969.

[4] R. Durrett. Brownian motion and martingales in analysis. Wadsworth, Belmont CA,
1984.

[5] R. Durrett. Probability: Theory and Examples. Wadsworth and Brooks/Cole, Pacific
Grove, 1991.

[6] D. Gilbarg and N.S. Trudinger. Elliptic partial differential equations of second order.
Reprint of the 1998 ed. Springer, Berlin, 2001.

[7] N. Ikeda and S. Watanabe. Stochastic Differential Equations and Diffusion Processes.
North-Holland; Amsterdam, Kodansha, Ltd., Tokyo, 2nd edition, 1989.

[8] I. Karatzas and S. Shreve. Brownian motion and stochastic calculus. Springer, Berlin,
1988.

[9] D. Revuz and M. Yor. Continuous martingales and Brownian motion. 3rd ed., 3rd.
corrected printing, Springer, Berlin, 2005.

[10] L.C.G. Rogers and D. Williams. Diffusions, Markov processes, and martingales, vol-
ume 1 and 2. Wiley, Chichester, 1987, 1994.

[11] D.W. Stroock. Lectures on stochastic analysis: diffusion theory. London Math. Soc.
Student Text 6, Cambridge University Press, Cambridge, 1987.

[12] D.W. Stroock. Probability theory, an analytic view. Rev. ed., Cambridge University
Press, Cambridge, 2000.

[13] D.W. Stroock and S.R.S. Varadhan. Multidimensional diffusion processes. Springer,
Berlin, 1979.

Index

adapted process, 56, 60, 65, 67, 80, 89, 116, 131
Bachelier, 1
Bernstein inequality, 99
Blumenthal's 0–1 law, 26, 28, 55
Borel-Cantelli's lemma, 48, 49, 65, 77
Brownian motion, 1–3, 5–16, 19, 21, 23, 24, 26, 28, 29, 38, 47, 51, 53, 94, 97–99, 102, 116, 131
  d-dimensional, 5, 6, 53, 98
  n-dimensional, 120
  canonical, 116
  recurrence properties, 102
Brownian path, 51
  quadratic variation, 53
Brownian sample paths, 45
Cameron-Martin, 121, 122
canonical space, 45
change of variable formula, 60, 89
Ciesielski, 11
continuous semimartingale, 89, 92, 93, 128
continuous submartingale, 65
diffusion process, 1
Dirichlet problem, 3, 114
Doob's inequality, 63–65, 96, 99, 104, 105, 109, 133
Doob-Meyer decomposition, 74
Dynkin's lemma, 24–26, 29, 34, 37, 39, 41, 42, 54, 60, 61, 117, 125, 127
exit time, 101, 114
exponential martingales, 95, 98, 99
filtration, 19, 30, 31, 119
  discrete, 30
  right-continuous, 32, 35
Girsanov, 121, 122
harmonic functions, 99
Hölder continuous, 51
increasing process, 53
integration by parts, 89
invariance principle, 1
Kolmogorov criterion, 18–20, 52
Kolmogorov's extension theorem, 125
law of the iterated logarithm, 47, 50, 51
Lévy, 11, 51, 53
Lévy's characterization, 107, 116, 119, 127, 131, 132
Lévy's modulus of continuity, 51, 52
local time, 132
  of Brownian motion, 135
Markov process, 23
Markov property, 23, 28
  simple, 23, 24, 28, 29, 40, 43
  strong, 23, 30, 35, 39, 40, 42, 103
martingale, 20, 66, 67, 72, 100
  continuous, 53, 72, 124, 128
  continuous local, 72, 73, 79–81, 83–85, 87–89, 96, 99, 102, 103, 108, 112, 115, 118, 121–124, 126, 128
  continuous square integrable, 55, 58, 59, 70, 72, 86
  local, 89, 100, 104
martingales
  continuous, 86
Maruyama, 121, 122
modulus of continuity, 15, 18, 19
Novikov condition, 95
Novikov's criterion, 103, 124
quadratic variation, 45, 46
Skorohod, 134
stochastic differential equations, 2, 3, 95, 107, 108, 112, 113, 127, 128, 131
  strong solution, 131
stochastic integral, 53, 55, 59, 62–64, 68, 70, 73, 81–84, 89, 98
stochastic process, 5, 15, 21, 135
strong solution, 112, 116, 135
strong uniqueness, 110
supermartingale, 102, 104
Tanaka formula, 131, 135
usual conditions, 53–55, 71, 73, 89, 94, 108, 118, 119, 121, 131
weak solution, 127, 131, 134, 135
Wiener measure, 23, 29, 127
  d-dimensional, 6
