Introduction to stochastic calculus

Justin Salez

April 2, 2024
Contents

1 Preliminaries
1.1 Stochastic processes
1.2 Brownian motion
1.3 Martingales
1.4 Absolute and quadratic variation
1.5 Lévy's characterization of Brownian motion
1.6 Local martingales

2 Stochastic integration
2.1 The Wiener isometry
2.2 The Wiener integral as a process
2.3 Progressive processes
2.4 The Itô isometry
2.5 The Itô integral as a process
2.6 Generalized Itô integral

3 Stochastic differentiation
3.1 Itô processes
3.2 Quadratic variation of an Itô process
3.3 Itô's Formula
3.4 Exponential martingales
3.5 Girsanov's Theorem
3.6 An application

4 Stochastic differential equations
4.1 Motivations
4.2 Existence and uniqueness
4.3 Practical examples
4.4 Markov property for diffusions
4.5 Generator of a diffusion
4.6 Connection with partial differential equations

Disclaimer: this course is a minimal and practical introduction to the theory of stochastic
calculus, with an emphasis on examples and applications rather than abstract subtleties.

Acknowledgment: Thanks are due to Josué Corujo and Damiano De Gaspari for having
reported many typos in a preliminary version of these notes.

Chapter 1

Preliminaries

1.1 Stochastic processes

Stochastic processes. A stochastic process is simply a collection X = (Xt)t∈T of real-valued random variables defined on the same probability space (Ω, F, P) and indexed by an arbitrary set T. Two simple choices, which should be familiar to the reader, are T = {1, . . . , n} (random vectors) and T = N (random sequences). We shall soon focus on the more involved choice T = R+ (random functions), and interpret the parameter t as time: the stochastic process X may then be thought of as modeling a physical quantity which evolves at random through time.

Law of a process. Our collection X = (Xt)t∈T of one-dimensional random variables can equivalently be viewed as a single random variable taking values in the multi-dimensional space R^T, equipped with the product σ-field. Like any random variable, X has a well-defined law. The latter is a probability measure on R^T which, by Dynkin's lemma, is uniquely characterized by the laws of the random vectors (Xt1, . . . , Xtn) for all n ∈ N and all (t1, . . . , tn) ∈ T^n. In practice, specifying these finite-dimensional marginals can be very complicated, and one often restricts attention to two fundamental statistics of X: its mean and covariance.

Mean and covariance. A stochastic process X is called square-integrable if its coordinates are in L2(Ω, F, P), i.e. E[Xt²] < ∞ for all t ∈ T. By Cauchy-Schwarz, this ensures the well-definedness of the mean mX : T → R and the covariance γX : T² → R, given by

mX(t) := E[Xt],    γX(s, t) := Cov(Xs, Xt) = E[Xs Xt] − E[Xs]E[Xt].    (1.1)

Recall for future reference that the function γX is always symmetric in its two arguments, and positive semi-definite: for all n ∈ N, all (t1, . . . , tn) ∈ T^n, and all (λ1, . . . , λn) ∈ R^n, we have

∑_{j,k=1}^n λj λk γX(tj, tk) = Var( ∑_{j=1}^n λj Xtj ) ≥ 0.    (1.2)

Perhaps surprisingly, the two simple functions m X and γX capture a considerable amount of
structural information about the process, and play a major role in many practical aspects of
signal processing and forecasting. While they are far from characterizing the law of a general
square-integrable process, they do characterize it in the important case of Gaussian processes.


Gaussian processes. A stochastic process X = (Xt)t∈T is called Gaussian if every finite linear combination of its coordinates is a Gaussian random variable. More explicitly, for every n ∈ N, every (t1, . . . , tn) ∈ T^n, and every (λ1, . . . , λn) ∈ R^n, the scalar random variable

Z := λ1 Xt1 + · · · + λn Xtn    (1.3)

is a Gaussian random variable. In particular, we have

E[e^{iZ}] = exp( iE[Z] − Var(Z)/2 ).    (1.4)
Note that the left-hand side is precisely the characteristic function of the random vector (Xt1, . . . , Xtn) evaluated at the point (λ1, . . . , λn): the knowledge of this quantity for every (λ1, . . . , λn) ∈ R^n suffices to determine the law of (Xt1, . . . , Xtn). Since the right-hand side of (1.4) only depends on X through mX and γX, we conclude that the law of a Gaussian process is fully determined by its mean and covariance. We will henceforth write X ∼ N(m, γ) to mean that X is a Gaussian process with mean m and covariance γ. Such a process can be shown to exist for any choice of m : T → R and of the symmetric positive semi-definite function γ : T² → R. Here are a few particularly important choices, to which we will return repeatedly:

• T = R+ , m = 0, γ(s, t) = 1(s=t) (as in a white noise).

• T = R+ , m = 0, γ(s, t) = s ∧ t (as in a Brownian motion).

• T = R, m = 0, γ(s, t) = e−|t−s| (as in an Ornstein-Uhlenbeck process).

• T = [0, 1], m = 0, γ(s, t) = s ∧ t − st (as in a Brownian bridge).
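Since the law of a Gaussian process is determined by (m, γ), its finite-dimensional marginals on any time grid can be sampled by factoring the covariance matrix. Here is a minimal numerical sketch for the Brownian covariance γ(s, t) = s ∧ t; the grid, sample size, and use of NumPy's Cholesky factorization are illustrative choices, not part of the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Time grid on which we sample a finite-dimensional marginal of the process.
t = np.linspace(0.01, 1.0, 50)

# Covariance gamma(s, t) = s ∧ t of Brownian motion, as a matrix on the grid.
gamma = np.minimum.outer(t, t)

# A Gaussian vector with mean 0 and covariance gamma: L z, where gamma = L L^T
# and z is a standard Gaussian vector. Columns are independent samples.
L = np.linalg.cholesky(gamma)
paths = L @ rng.standard_normal((len(t), 20000))

# The empirical covariance of the samples should be close to gamma.
emp_cov = paths @ paths.T / paths.shape[1]
max_err = np.abs(emp_cov - gamma).max()
```

The same recipe applies verbatim to the other covariances listed above (Ornstein-Uhlenbeck, Brownian bridge), as long as the covariance matrix on the grid is positive semi-definite.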

Independence. Two stochastic processes (Xs)s∈S and (Yt)t∈T are independent if the random vectors (Xs1, . . . , Xsn) and (Yt1, . . . , Ytn) are independent for every n ∈ N and every choice of the indices (s1, . . . , sn) ∈ S^n and (t1, . . . , tn) ∈ T^n. In general, this may be quite hard to check, but a huge simplification occurs when the two processes (Xs)s∈S and (Yt)t∈T are jointly Gaussian, meaning that the concatenated process ((Xs)s∈S, (Yt)t∈T) is Gaussian. Indeed, the random vector (Xs1, . . . , Xsn, Yt1, . . . , Ytn) is then Gaussian, so its distribution is entirely determined by its mean and covariance. In particular, the independence between (Xs1, . . . , Xsn) and (Yt1, . . . , Ytn) reduces to the corresponding covariances being 0. Thus, two jointly Gaussian processes X = (Xs)s∈S and Y = (Yt)t∈T are independent if and only if they are decorrelated, in the sense that

∀(s, t) ∈ S × T,    Cov(Xs, Yt) = 0.    (1.5)

This simplification extends to more than two processes in the obvious way.

Indistinguishability. We will say that two processes X = (Xt)t∈T and Y = (Yt)t∈T are indistinguishable if the random variables X and Y coincide a.-s., i.e. if the set

{ω ∈ Ω : ∃t ∈ T : Xt ≠ Yt}

is P−negligible. In general, this is stronger than requiring that Y is a modification of X, i.e.

∀t ∈ T, P( Xt = Yt ) = 1. (1.6)

However, the two notions coincide when T is countable, or when T = R and X, Y are (right-)continuous. Note that (1.6) implies, in particular, that X and Y have the same law.


1.2 Brownian motion


We are now ready to introduce our most important stochastic process, the Brownian motion.
Definition 1.1 (Brownian motion). A Brownian motion is a stochastic process B = ( Bt )t≥0 such that
(i) B is Gaussian with mean m B (t) = 0 and covariance γB (s, t) = s ∧ t for all s, t ≥ 0.

(ii) B almost-surely has continuous trajectories. More precisely, the set

{ω ∈ Ω : the function t 7→ Bt (ω ) is not continuous} (1.7)

is P−negligible, which means that it lies inside an event E ∈ F such that P( E) = 0.


The existence of this object is not obvious, and we will admit it here.
Remark 1.1 (Continuous version). Note that one can always improve the almost-sure continuity to a pointwise continuity by re-defining B to be the zero function on the null event E above. Working with this continuous modification will sometimes be useful.
Remark 1.2 (A subtle point). As explained above, Condition (i) completely determines the law of B. This, however, does not determine whether Condition (ii) holds or not: there are processes satisfying (i) and (ii), and others satisfying (i) but not (ii). The reason is that the set of continuous trajectories C⁰(R+) ⊆ R^{R+} is not in the product σ-field: in words, trajectorial continuity is too sophisticated to be expressible in terms of finite-dimensional marginals only.


Let us now enumerate some simple properties of Brownian motion. We start with a simple
description of its distribution, based on increments.
Proposition 1.1 (Increments). Let B = ( Bt )t≥0 be a Brownian motion. Then,
(i) B0 = 0 almost-surely;

(ii) Bt − Bs ∼ N (0, t − s) for all 0 ≤ s ≤ t;

(iii) Bt2 − Bt1 , . . . , Btn − Btn−1 are independent for any n ∈ N and any 0 ≤ t1 ≤ . . . ≤ tn .
Conversely, any process satisfying these three properties has the law of a Brownian motion.
Proof. Since B is a Gaussian process with m B = 0 and γB (s, t) = s ∧ t, we have Bt ∼ N (0, t) for
all t ≥ 0. Taking t = 0 yields the first claim, and we now turn to the second. The fact that Bt − Bs
is a Gaussian random variable is clear, since B is a Gaussian process. Thus, it only remains to
compute its mean and variance: by linearity of expectations and bilinearity of covariances,

E[ Bt − Bs ] = m B (t) − m B (s) = 0
Var( Bt − Bs ) = γB (t, t) + γB (s, s) − 2γB (t, s) = t − s.

Finally, the random vector (Bt2 − Bt1, . . . , Btn − Btn−1) is Gaussian, because any linear combination of its coordinates is also a linear combination of coordinates of the Gaussian process B. Consequently, independence reduces to decorrelation. Now, for 1 ≤ j < k ≤ n, we have

Cov( Btj − Btj−1 , Btk − Btk−1 ) = γB(tj, tk) + γB(tj−1, tk−1) − γB(tj, tk−1) − γB(tj−1, tk)
                                = tj + tj−1 − tj − tj−1
                                = 0,

where the second line uses the fact that tj−1 ≤ tj ≤ tk−1 ≤ tk. This establishes (iii). The converse is a good exercise, which we leave to the reader.
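The increment description also gives the standard way of simulating Brownian motion on a grid: cumulate independent N(0, dt) increments. A small sanity check (step size, horizon, and path count are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)

n_paths, n_steps, dt = 20000, 100, 0.01  # horizon t = 1

# Properties (i)-(iii): B_0 = 0 and independent N(0, dt) increments.
increments = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
B = np.cumsum(increments, axis=1)

# Empirical checks of the law: Var(B_1) = 1, and the disjoint increments
# B_{1/2} - B_0 and B_1 - B_{1/2} are uncorrelated.
var_B1 = B[:, -1].var()
inc1 = B[:, 49]              # B_{1/2}
inc2 = B[:, -1] - B[:, 49]   # B_1 - B_{1/2}
corr = np.corrcoef(inc1, inc2)[0, 1]
```

Here decorrelation is only checked for one pair of disjoint increments; by the Gaussian structure established above, decorrelation of all such pairs is equivalent to independence.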


We now list three elementary but important invariance properties, which confirm the robust-
ness and canonical nature of Brownian motion.

Proposition 1.2 (Invariance). Let B = ( Bt )t≥0 be a Brownian motion. Then, in each of the following
cases, the process W = (Wt )t≥0 is also a Brownian motion.

(i) Wt := Ba+t − Ba for any fixed a ≥ 0 (invariance by translation).

(ii) Wt := Bat/√a for any fixed a > 0 (invariance by scaling).

(iii) Wt := t B1/t 1(t>0) (invariance by time inversion).

Proof. In each case, W is a Gaussian process because any linear combination of its coordinates is
also a linear combination of coordinates of the Gaussian process B. Moreover, direct computa-
tions reveal that W has the same mean and covariance as B. From this, we can already conclude
that W is distributed as a Brownian motion, but we still have to check the almost-sure continuity
of t 7→ Wt . The latter is clear in cases (i) and (ii), by composition of continuous functions. The
same argument works in (iii), except at t = 0. Thus, it only remains to check that Wt → 0
almost-surely as t → 0. Here is a short but subtle argument: if a function x : R+ → R is known to be continuous on (0, ∞), then its convergence to 0 at 0+ can be expressed as x ∈ E, where

E := ⋂_{n=1}^∞ ⋃_{k=1}^∞ ⋂_{t∈[0,1/k]∩Q} { |xt| ≤ 1/n }.

Clearly, this set is in the product σ-field. Since W and B have the same law, we can safely conclude that P(W ∈ E) = P(B ∈ E). But P(B ∈ E) = 1, by the trajectorial continuity of B.

Remark 1.3 (SLLN for the Brownian motion). The invariance (iii) has an interesting consequence: being a Brownian motion, the process t ↦ t B1/t 1(t>0) must tend to 0 almost-surely as t → 0, which means that

Bt/t → 0 almost-surely as t → ∞.    (1.8)

This classical fact is known as the strong law of large numbers for the Brownian motion.

Finally, let us complement the invariance by translation observed above.

Proposition 1.3 (Markov property for the Brownian motion). Let ( Bt )t≥0 be a Brownian motion,
and let a ≥ 0 be fixed. Then, the Brownian motion ( Bt+a − Ba )t≥0 is independent of ( Bt )t∈[0,a] .

Proof. The processes (Bt)t∈[0,a] and (Bt+a − Ba)t≥0 are jointly Gaussian, because their coordinates are linear combinations of coordinates of the same Gaussian process B. Thus, the claimed independence reduces to the decorrelation property

∀(s, t) ∈ [0, a] × R+ , Cov ( Bs , Bt+a − Ba ) = 0, (1.9)

which readily follows from the explicit expression of γB .


1.3 Martingales
From now on, we turn our probability space (Ω, F , P) into a filtered probability space
(Ω, F , (Ft )t≥0 , P), by equipping it with a filtration (Ft )t≥0 . In other words, each Ft ⊆ F is a
σ −field on Ω and Fs ⊆ Ft for each 0 ≤ s ≤ t. The intuition is that Ft represents the information
that is available by time t about the various stochastic processes under consideration. For this
interpretation to be valid, we shall restrict our attention to processes X = ( Xt )t≥0 that satisfy

∀t ≥ 0, Xt is Ft − measurable. (1.10)

Such processes are said to be adapted. A simple way to ensure that a given process X is adapted
is to choose its natural filtration F X = (FtX )t≥0 , defined by

FtX := σ ( Xs : s ≤ t) . (1.11)

Of course, any larger filtration (in the coordinate-wise sense) will also work.

Definition 1.2 (Martingale). A martingale is a stochastic process M = ( Mt )t≥0 such that

(i) M is adapted, i.e. Mt is Ft −measurable for each t ≥ 0.

(ii) M is integrable, i.e. E[| Mt |] < ∞ for each t ≥ 0.

(iii) M is fair, i.e. E[ Mt | Fs ] = Ms for each 0 ≤ s ≤ t.

Remark 1.4 (Constant mean). In particular, the mean of a martingale is a constant function, i.e.

∀t ≥ 0, E[ Mt ] = E[ M0 ]. (1.12)

Property (iii) is, however, a much deeper property: as we will see, it implies that (1.12) actually remains valid when the deterministic time t is replaced by any "sufficiently reasonable" random time T.

Example 1.1 (Some important martingales). Let B = ( Bt )t≥0 be a Brownian motion. Then, in each of
the following cases, the process ( Mt )t≥0 is a martingale with respect to the filtration F B .

(i) Mt := Bt

(ii) Mt := Bt² − t

(iii) Mt := exp( θBt − θ²t/2 ), for any fixed θ ∈ R.

Proof. In each case, the process M is adapted because we have Mt = f t ( Bt ) for some measurable
(in fact, continuous) function f t : R → R. The integrability is standard, since Bt ∼ N (0, t).
Finally, for 0 ≤ s ≤ t, we may write Bt = Bs + ( Bt − Bs ) and use the fact that Bs is Fs −measurable
while Bt − Bs is independent of Fs (this is the Markov property for B) to obtain

E[ Bt | Fs ] = Bs + E[Bt − Bs] = Bs
E[ Bt² | Fs ] = Bs² + E[(Bt − Bs)²] + 2Bs E[Bt − Bs] = Bs² + (t − s)
E[ e^{θBt} | Fs ] = e^{θBs} E[ e^{θ(Bt − Bs)} ] = e^{θBs + θ²(t−s)/2}.

Rearranging these identities readily gives the desired martingale property in each case.
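The mean-conservation property (1.12) of these three martingales can be checked numerically: at any fixed time t, each should have the same expectation as at time 0, namely 0, 0 and 1 respectively. A Monte Carlo sketch (t, θ and the sample size are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)

t, theta, n = 2.0, 0.5, 200000
B_t = rng.normal(0.0, np.sqrt(t), size=n)  # B_t ~ N(0, t)

# E[M_t] = E[M_0] for each of the three martingales of Example 1.1:
m1 = B_t.mean()                                      # E[B_t] = 0
m2 = (B_t**2 - t).mean()                             # E[B_t^2 - t] = 0
m3 = np.exp(theta * B_t - theta**2 * t / 2).mean()   # E[exp(θB_t - θ²t/2)] = 1
```

Of course, constancy of the mean is only a necessary consequence of the martingale property; the conditional identities above are strictly stronger.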

Remark 1.5. The same argument works for any filtration (Ft )t≥0 such that B is adapted to (Ft )t≥0 and
Bt − Bs is independent of Fs for every 0 ≤ s ≤ t. We then speak of a (Ft )t≥0 −Brownian motion.


Remark 1.6 (A general formula). The above computations are special cases of a useful general formula
for conditional expectations: if G ⊆ F is any σ−field, and if X and Y are two random variables, the first
being G−measurable and the second being independent of G , then

E [ f ( X, Y ) | G] = F ( X ), where F ( x ) := E [ f ( x, Y )] ,

for any measurable f such that E[| f(X, Y) |] < ∞. In particular, if B is a (Ft)t≥0-Brownian motion and 0 ≤ s ≤ t, we may apply this to X = Bs, Y = Bt − Bs, G = Fs and f(x, y) = φ(x + y) to obtain

E[ φ(Bt) | Fs ] = (1/√(2π)) ∫_R e^{−z²/2} φ( Bs + z√(t − s) ) dz,    (1.13)

for any measurable function φ : R → R such that E[| φ(Bt) |] < ∞.
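Formula (1.13) can be tested numerically in a case where the Gaussian integral has a closed form: for φ = cos, the integral evaluates to cos(Bs) e^{−(t−s)/2}. Below we freeze Bs at an arbitrary value x and compare a simple Riemann-sum quadrature against this closed form (the values of s, t, x and the grid are illustrative choices):

```python
import numpy as np

# Check (1.13) for phi = cos: the right-hand side should equal
# cos(B_s) * exp(-(t - s)/2), by the Gaussian characteristic function.
s, t, x = 1.0, 3.0, 0.7  # x plays the role of the frozen value of B_s

z = np.linspace(-8.0, 8.0, 20001)
integrand = np.exp(-z**2 / 2) * np.cos(x + z * np.sqrt(t - s))
quad = integrand.sum() * (z[1] - z[0]) / np.sqrt(2 * np.pi)

closed_form = np.cos(x) * np.exp(-(t - s) / 2)
```

The truncation to |z| ≤ 8 is harmless because the Gaussian weight is negligible beyond that range.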

As we will now see, the true strength of martingales lies in the fact that the time t in the
mean conservation identity (1.12) can, under appropriate conditions, be taken to be random.

Definition 1.3 (Stopping time). A stopping time is a [0, ∞]−valued random variable T such that

∀t ≥ 0, { T ≤ t} ∈ Ft . (1.14)

The intuition is that, at any given time, one should be able to determine whether the random
time T has already occurred or not, just by looking at the information available so far (and not
in the future). For example, the first time that a (Ft )t≥0 −Brownian motion reaches the value 1 is
a stopping time, but the last time that a Brownian motion reaches the value 0 in the time interval
[0, 1] is not. In practice, all stopping times that we shall encounter will be of the following form.

Proposition 1.4 (A useful criterion). Suppose that A ⊆ R is a closed set and that X = ( Xt )t≥0 is an
adapted, continuous process. Then, the hitting time of A by X, defined as

TA ( X ) := inf{t ≥ 0 : Xt ∈ A}, (1.15)

is always a stopping time (with the usual convention inf ∅ = +∞).

Proof. Using the continuity of X and the fact that A is closed, one can easily check that

{ TA(X) ≤ t } = ⋂_{k=1}^∞ ⋃_{s∈[0,t]∩Q} { dist(Xs, A) ≤ 1/k }.    (1.16)

Now, { dist(Xs, A) ≤ 1/k } ∈ Fs ⊆ Ft because X is adapted and z ↦ dist(z, A) is measurable. Thus, { TA(X) ≤ t } ∈ Ft as a countable union and intersection of events in Ft.

Exercise 1.1 (Stopping times). Show that if S, T are stopping times, then so are S ∧ T, S ∨ T, S + T.

We are now in position to recall the main result of martingale theory.

Theorem 1.1 (Doob’s optional stopping Theorem). If ( Mt )t≥0 is a continuous martingale and T a
stopping time, then the stopped process M T := ( Mt∧T )t≥0 is a (continuous) martingale. In particular,

∀t ≥ 0, E [ MT ∧t ] = E[ M0 ]. (1.17)

If ( Mt∧T )t≥0 is uniformly integrable and T < ∞ a.-s., then taking t → ∞ yields E[ MT ] = E[ M0 ].

Here is an example to illustrate the practical interest of Doob’s optional stopping Theorem.


Example 1.2 (Exit time from an interval). Fix two constants a, b > 0. How long will it take, on
average, for a Brownian motion B to exit the interval I = (− a, b) ? The variable of interest T is a
stopping time, because it is the hitting time of the closed set I c by the continuous and adapted process B.
Applying Doob's optional stopping Theorem to the continuous martingale (Bt² − t)t≥0, we deduce that

E[T ∧ t] = E[ B²_{T∧t} ],

for all t ≥ 0. We now send t → ∞. The left-hand side tends to E[T] by monotone convergence. Since the right-hand side is bounded by (a ∨ b)² independently of t, we already see that E[T] ≤ (a ∨ b)². In particular, T is a.-s. finite, and the domination B²_{T∧t} ≤ (a ∨ b)² now allows us to obtain the equality

E[T] = E[ B²_T ] = pb² + (1 − p)a²,

where p = P(BT = b). The second equality relies on the observation that BT takes values in the two-element set {−a, b}, by continuity of B. To compute p, we now apply Doob's optional stopping theorem to the martingale (Bt)t≥0. We already know that T < ∞ a.-s., and that |BT∧t| ≤ a ∨ b, so we may safely conclude that 0 = E[BT] = pb − (1 − p)a, i.e. p = a/(a + b). In conclusion, the answer is

E[T] = ab²/(a + b) + ba²/(a + b) = ab.
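Both conclusions of the example, E[T] = ab and P(BT = b) = a/(a + b), can be approximated by a crude Euler discretization of the Brownian path. The step size, horizon and path count below are arbitrary choices, and discrete monitoring biases the exit time slightly upward, so this is only a rough consistency check:

```python
import numpy as np

rng = np.random.default_rng(3)

# Exit of B from I = (-a, b); with a = b = 1: E[T] = ab = 1, P(B_T = b) = 1/2.
a, b, dt = 1.0, 1.0, 1e-3
n_paths, n_steps = 1000, 10000  # horizon 10, far beyond E[T] = 1

B = np.cumsum(rng.normal(0.0, np.sqrt(dt), (n_paths, n_steps)), axis=1)
out = (B >= b) | (B <= -a)
ever = out.any(axis=1)
# First exit index per path; paths that never exit (extremely rare here)
# are truncated at the horizon.
first = np.where(ever, out.argmax(axis=1), n_steps - 1)

T_mean = ((first + 1) * dt).mean()                        # should be near ab = 1
p_hat = (B[np.arange(n_paths), first] >= b)[ever].mean()  # near a/(a+b) = 1/2
```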
More generally, Doob's optional stopping theorem remains true for sub-martingales or super-martingales (defined by relaxing the equality E[Mt | Fs] = Ms into the inequality E[Mt | Fs] ≥ Ms or E[Mt | Fs] ≤ Ms, respectively). Such processes arise naturally when applying a convex or concave function to a martingale (by the conditional Jensen inequality). We end this section by mentioning a uniform refinement of Chebyshev's inequality in the case of martingales.
Theorem 1.2 (Doob's maximal inequality). If M is a square-integrable continuous martingale, then

∀a > 0, ∀t ≥ 0,    P( sup_{s∈[0,t]} |Ms| ≥ a ) ≤ E[Mt²]/a².    (1.18)
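For M = B on [0, t], the bound reads P(sup_{s≤t} |Bs| ≥ a) ≤ t/a², and the true probability is in fact much smaller. A quick Monte Carlo illustration on a discretized path (t, a, step size and path count are arbitrary choices; discrete monitoring can only underestimate the supremum, so the empirical frequency must still sit below the bound):

```python
import numpy as np

rng = np.random.default_rng(4)

# Doob's maximal inequality for M = B on [0, t]: P(sup |B_s| >= a) <= t/a^2.
t, a, dt = 1.0, 2.0, 1e-3
n_paths = 5000
n_steps = int(t / dt)

B = np.cumsum(rng.normal(0.0, np.sqrt(dt), (n_paths, n_steps)), axis=1)
p_hat = (np.abs(B).max(axis=1) >= a).mean()
bound = t / a**2  # = 0.25
```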
Here is a nice and useful application of Doob’s maximal inequality.
Proposition 1.5 (Limits of continuous martingales). Let ( Mn )n≥1 be continuous, square-integrable
martingales, and suppose that for each t ≥ 0 the limit Mt := limn→∞ Mtn exists in L2 . Then, the process
M = ( Mt )t≥0 (has a modification which) is a continuous square-integrable martingale.
Proof. The only real difficulty is continuity. By Doob's maximal inequality applied to the continuous square-integrable martingale M^n − M^m, we have, for fixed t ≥ 0 and k ∈ N,

P( sup_{s∈[0,t]} |M^n_s − M^m_s| ≥ 1/k² ) ≤ k⁴ E[ (M^n_t − M^m_t)² ].    (1.19)

Since (M^n_t)n≥1 converges in L², the right-hand side can be made arbitrarily small by choosing m ∧ n large. Consequently, there is an increasing sequence (Nk)k≥1 such that

∀k ∈ N,    P( sup_{s∈[0,t]} |M^{Nk+1}_s − M^{Nk}_s| ≥ 1/k² ) ≤ 1/k².    (1.20)

By the Borel-Cantelli lemma, we deduce that almost-surely,

∑_{k=1}^∞ sup_{s∈[0,t]} |M^{Nk+1}_s − M^{Nk}_s| < ∞.    (1.21)

This ensures that almost-surely, the sequence (M^{Nk})k≥1 is convergent in the space of continuous functions equipped with the topology of uniform convergence on every compact set. But the limit is necessarily a version of M, because for each t ≥ 0, we have M^n_t → Mt in L².


1.4 Absolute and quadratic variation


Despite being continuous, the Brownian motion – or any interesting martingale, as we will see – is extremely rough: it oscillates wildly. To formalize this idea, we define the absolute variation of a function f : R+ → R on the interval [s, t] as the (possibly infinite) quantity

V(f, s, t) := sup_{(tk)} ∑_k | f(tk) − f(tk−1) |,    (1.22)

where the supremum is taken over all subdivisions s = t0 ≤ t1 ≤ . . . ≤ tn = t (n ∈ N) of the interval [s, t]. Note that we have the chain rule

∀u ∈ [s, t],    V(f, s, t) = V(f, s, u) + V(f, u, t).

The function f has finite variation if V(f, s, t) < ∞ for every 0 ≤ s ≤ t. This is the case for most of the functions that we are used to manipulating: for example, we invite the reader to check that

(i) If f is continuously differentiable, then V(f, s, t) = ∫_s^t | f′(u) | du < ∞;

(ii) If f is monotone, then V(f, s, t) = | f(t) − f(s) | < ∞;

(iii) V(f + g, s, t) ≤ V(f, s, t) + V(g, s, t).
In particular, (ii) and (iii) imply that the difference of two non-decreasing functions has finite
variation. In fact, any function of finite variation is of this form.
Proposition 1.6 (Characterization of finite variation). A function f : R+ → R has finite variation if
and only if it can be written as f = f 1 − f 2 , where f 1 , f 2 : R+ → R are non-decreasing.
Proof. The 'if' part is trivial. Conversely, if f has finite variation, then it is immediate to check that the functions f1 : t ↦ V(f, 0, t) and f2 : t ↦ V(f, 0, t) − f(t) are non-decreasing.

It is now time to give an example of a function that fails to have finite variation.
Example 1.3 (Variation of the Brownian motion). Let B be a Brownian motion. Fix 0 ≤ s ≤ t and consider the subdivision (t0, . . . , tn) of [s, t] into n intervals of equal length, i.e. tk = s + (k/n)(t − s). Then,

∑_{k=1}^n | Btk − Btk−1 | =d √((t − s)/n) ( |ξ1| + · · · + |ξn| ),    (1.23)

where (ξk)k≥1 are i.i.d. with law N(0, 1). Now, the right-hand side diverges as n → ∞ by the strong law of large numbers, implying that P(V(B, s, t) = +∞) = 1. By taking s, t ∈ Q+ and noting that V(f, s, t) ≤ V(f, s′, t′) whenever [s, t] ⊆ [s′, t′], we conclude that

P( ∀s, t ≥ 0, V(B, s, t) = +∞ ) = 1.

Thus, Brownian motion oscillates much more than the typical functions that we are used to manipulating.
The computation appearing at (1.23) strongly suggests looking at squared increments when measuring the variations of Brownian motion. Indeed, the quadratic version of (1.23) is

∑_{k=1}^n | Btk − Btk−1 |² =d ((t − s)/n) ( |ξ1|² + · · · + |ξn|² ),    (1.24)

and the right-hand side now tends to t − s instead of +∞, by the strong law of large numbers. This idea of considering quadratic variation when the function of interest has infinite absolute variation turns out to work way beyond the specific example of Brownian motion, as shown in the following fundamental result.


Theorem 1.3 (Quadratic variation of square-integrable martingales). Let M = (Mt)t≥0 be a continuous, square-integrable martingale. Then, for each t ≥ 0, the limit

⟨M⟩t := lim_{n→∞} ∑_{k=1}^n ( M_{t_k^n} − M_{t_{k−1}^n} )²

exists in L¹ and does not depend on the subdivisions (t_k^n)0≤k≤n of [0, t], as long as the mesh max_{1≤k≤n} |t_k^n − t_{k−1}^n| tends to 0 as n → ∞. Moreover, the process ⟨M⟩ = (⟨M⟩t)t≥0 has the following properties:

(i) ⟨M⟩0 = 0;

(ii) t ↦ ⟨M⟩t is non-decreasing;

(iii) t ↦ ⟨M⟩t (has a modification which) is continuous;

(iv) ( Mt² − ⟨M⟩t )t≥0 is a martingale.




Proof. Let us here admit existence and continuity, and focus on the important martingale property (iv), whose proof is instructive (Properties (i) and (ii) are trivial). Fix 0 ≤ s ≤ t, and consider a subdivision (t_k^n) of [s, t] with max_{1≤k≤n} |t_k^n − t_{k−1}^n| → 0 as n → ∞. We may then write

E[ Mt² − Ms² | Fs ] = ∑_{k=1}^n E[ M²_{t_k^n} − M²_{t_{k−1}^n} | Fs ] = ∑_{k=1}^n E[ ( M_{t_k^n} − M_{t_{k−1}^n} )² | Fs ],

by the conditional orthogonality of martingale increments. On the other hand, by construction,

∑_{k=1}^n ( M_{t_k^n} − M_{t_{k−1}^n} )² → ⟨M⟩t − ⟨M⟩s in L¹ as n → ∞.

Taking conditional expectation w.r.t. Fs, we conclude from the above computation that

E[ Mt² − Ms² | Fs ] = E[ ⟨M⟩t − ⟨M⟩s | Fs ].

Since Ms and ⟨M⟩s are Fs-measurable, this proves the desired martingale property.

Example 1.4 (Brownian case). In the case of a Brownian motion B = (Bt)t≥0, the computation (1.24) shows that ⟨B⟩t = t, which does indeed satisfy Properties (i)-(iv) above.

Remark 1.7 (Quadratic covariation). If M, N are two continuous square-integrable martingales, then we may define their quadratic covariation by the polarization formula:

⟨M, N⟩ := (1/2) ( ⟨M + N⟩ − ⟨M⟩ − ⟨N⟩ ).

The above result implies that ⟨M, N⟩ is continuous, that MN − ⟨M, N⟩ is a martingale, and that

∑_{k=1}^n ( M_{t_k^n} − M_{t_{k−1}^n} ) ( N_{t_k^n} − N_{t_{k−1}^n} ) → ⟨M, N⟩t in L¹ as n → ∞,

for any subdivisions (t_k^n)0≤k≤n of [0, t] with max_{1≤k≤n} |t_k^n − t_{k−1}^n| → 0 as n → ∞.
Remark 1.8 (Absolute vs quadratic variation). If f has finite variation and g is continuous, then

| ∑_{k=1}^n ( f(tk) − f(tk−1) ) ( g(tk) − g(tk−1) ) | ≤ V(f, 0, t) max_{u,v∈[0,t], |u−v|≤∆} | g(u) − g(v) |,

where ∆ = max_{1≤k≤n} |tk − tk−1|. By uniform continuity of g on compact sets (Heine's theorem), the right-hand side tends to 0 as ∆ → 0, i.e. ⟨f, g⟩ = 0. Taking f = g shows that a continuous process with finite variation must have zero quadratic variation. When applied to martingales, this implies the following result, which considerably extends our observation about the roughness of Brownian motion.


Corollary 1.1 (No interesting martingale has finite variation). If M = (Mt)t≥0 is a continuous square-integrable martingale which has finite variation a.-s., then M is a.-s. constant in time:

P(∀t ≥ 0, Mt = M0 ) = 1. (1.25)

Proof. Fix t ≥ 0. By the above remark, we have ⟨ M ⟩t = 0. On the other hand, the orthogonality
of martingale increments and property (iv) above yield

E[( Mt − M0 )2 ] = E[ Mt2 ] − E[ M02 ] = E[⟨ M ⟩t ] = 0,

which shows that P( Mt = M0 ) = 1. This is true for any fixed t ≥ 0, so we may take t ∈ Q+ and
invoke the continuity of M to conclude that P(∀t ≥ 0, Mt = M0 ) = 1, as desired.

Remark 1.9 (Truncation). The result actually holds without the square-integrability assumption. Indeed,
stopping preserves both the finite variation and the martingale properties, so the conclusion applies to
M Tn , where Tn = inf{t ≥ 0 : | Mt | ≥ n}, and can then be transferred to M by sending n → ∞.
Remark 1.10 (Uniqueness). The quadratic variation ⟨M⟩ = (⟨M⟩t)t≥0 defined in Theorem 1.3 is the only process satisfying the properties (i)-(iv) therein. Indeed, if A = (At)t≥0 is another process with these properties, then ⟨M⟩ − A = (M² − A) − (M² − ⟨M⟩) is a continuous martingale (as the difference of two continuous martingales), and has finite variation a.-s. (as the difference of two non-decreasing processes). Thus, it must be a.-s. constantly equal to its initial value, which is zero.

1.5 Lévy's characterization of Brownian motion


The quadratic variation of a continuous martingale is a remarkable object. Perhaps surprisingly,
it completely determines the distribution of the underlying martingale. We shall here only prove
the following important special case, which constitutes a deep and celebrated result.
Theorem 1.4 (Lévy's characterization of Brownian motion). For a process M = (Mt)t≥0 on a filtered space (Ω, F, (Ft)t≥0, P), the following two statements are equivalent:

(i) M is a continuous square-integrable martingale with M0 = 0 and ⟨M⟩t = t for all t ≥ 0;

(ii) M is a (Ft)t≥0-Brownian motion.


Proof. The implication (ii) =⇒ (i) has already been established. Conversely, let us suppose that (i) holds. We will establish the following fact: for any twice-differentiable function F : R+ × R → C whose first and second-order derivatives are all bounded, the process Z defined by

Zt := F(t, Mt) − ∫_0^t ( ∂F/∂t + (1/2) ∂²F/∂x² )(u, Mu) du,    (1.26)

is a martingale. In particular, taking F(t, x) = exp( iθx + θ²t/2 ) makes the integral vanish, so that

E[ e^{iθ(Mt − Ms)} | Fs ] = e^{−θ²(t−s)/2},

for any 0 ≤ s ≤ t and θ ∈ R. This formula shows that Mt − Ms has law N(0, t − s) and is independent of Fs, as desired. To prove that the process Z defined at (1.26) is a martingale, we Taylor-expand F: for 0 ≤ t ≤ t′ and x, x′ ∈ R,

F(t′, x′) − F(t, x) = (t′ − t) ( ∂F/∂t (t, x) + o(1) )
                    + (x′ − x) ∂F/∂x (t, x) + ((x′ − x)²/2) ( ∂²F/∂x² (t, x) + o(1) ),


where the o(1) term is uniformly bounded and can be made arbitrarily small by choosing |x′ − x| + |t′ − t| small (this uses the boundedness assumption on the derivatives of F). We now choose (x, x′) = (Mt, Mt′) and take conditional expectation w.r.t. Ft. Using E[Mt′ − Mt | Ft] = 0 and E[(Mt′ − Mt)² | Ft] = t′ − t, we easily obtain

E[ F(t′, Mt′) − F(t, Mt) − (t′ − t) ( ∂F/∂t + (1/2) ∂²F/∂x² )(t, Mt) | Ft ] = (t′ − t) o(1).

Finally, fix 0 ≤ s ≤ t, set t_k^n = s + k(t − s)/n, and apply the above identity to t = t_{k−1}^n and t′ = t_k^n. By the tower property of conditional expectation, we may replace Ft by Fs. Summing the resulting identity over 1 ≤ k ≤ n yields

E[ F(t, Mt) − F(s, Ms) − ∑_{k=1}^n (t_k^n − t_{k−1}^n) ( ∂F/∂t + (1/2) ∂²F/∂x² )(t_{k−1}^n, M_{t_{k−1}^n}) | Fs ] = (t − s) o(1),

and taking n → ∞ completes the proof.
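The key identity of the proof, E[e^{iθ(Mt − Ms)} | Fs] = e^{−θ²(t−s)/2}, can at least be checked in its unconditional form for Brownian motion itself, by comparing the empirical characteristic function of an increment with the Gaussian one (θ, s, t and the sample size are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(6)

# Unconditional version of the identity in the proof:
# E[exp(i*theta*(M_t - M_s))] = exp(-theta^2 (t - s)/2) for M Brownian.
theta, s, t, n = 1.3, 0.5, 2.0, 200000
incr = rng.normal(0.0, np.sqrt(t - s), size=n)  # M_t - M_s ~ N(0, t - s)

emp_cf = np.exp(1j * theta * incr).mean()       # empirical characteristic function
target = np.exp(-theta**2 * (t - s) / 2)
```

The empirical characteristic function is complex-valued; its imaginary part should vanish by symmetry, which the comparison below captures through the modulus of the difference.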

1.6 Local martingales


To deal with more general processes, we will need to relax the integrability requirement in the
definition of a martingale. This leads to the following notion.
Definition 1.4 (Local martingale). A stochastic process M = ( Mt )t≥0 is a local martingale if there
exists a sequence ( Tn )n≥1 of stopping times (called a localizing sequence) such that
(i) for each n ∈ N, the stopped process M^{Tn} = ( Mt∧Tn )t≥0 is a martingale;

(ii) almost-surely, Tn ↑ ∞ as n ↑ ∞.
Of course, any continuous martingale is a local martingale (take Tn = +∞), but the converse
is far from true. In fact, a local martingale need not even be integrable! However, any local
martingale which is uniformly dominated is a true martingale, as we now show.
Proposition 1.7 (Uniform domination). For a local martingale M to be a martingale, it suffices that

    ∀t ≥ 0,   E[ sup_{s∈[0,t]} |Ms| ] < ∞.        (1.27)

Proof. As any local martingale, M is adapted: it is the pointwise limit of the sequence of adapted
processes ( M Tn )n≥1 , where ( Tn )n≥1 is a localizing sequence. Moreover, the above domination
ensures that M is integrable. Finally, fix 0 ≤ s ≤ t. For all n ∈ N, we know that

E [ MTn ∧t | Fs ] = MTn ∧s . (1.28)

To conclude that E [ Mt | Fs ] = Ms , we now take n → ∞: the random variables MTn ∧t and MTn ∧s
tend to Mt and Ms a.-s., because Tn ↑ ∞. Moreover, the domination | MTn ∧t | ≤ Z with Z :=
sups∈[0,t] | Ms | ∈ L1 allows us to safely interchange the limit and conditional expectation.

Local martingales are easy to work with, because we can always localize them to obtain
true martingales (for which we have a well-developed theory), and then transfer the desired
conclusion by taking a limit. As a consequence, many of the results that we have mentioned
about martingales extend easily to local martingales. Here are a few important examples, which
we really invite the reader to prove.


Proposition 1.8 (Doob’s optional stopping theorem for local martingales). If M is a continuous
local martingale and T a stopping time, then the stopped process M T = ( Mt∧T )t≥0 is a local martingale.
Proof. Let ( Tn )n≥1 be a localizing sequence for M, and fix n ∈ N. Since M Tn is a (continuous)
martingale and T a stopping time, the non-local version of Doob’s optional stopping Theorem
ensures that M Tn ∧T is a martingale. Thus, ( Tn )n≥1 is also a localizing sequence for M T .

Remark 1.11 (A smart localizing sequence). If M is a continuous local martingale with M0 = 0, then

Tn := inf{t ≥ 0 : | Mt | ≥ n}, (1.29)

is a stopping time for any n ∈ N (hitting time of a closed set by a continuous adapted process), so the
local version of Doob’s optional stopping Theorem ensures that M Tn is a local martingale. But M Tn is
[−n, n]−valued by construction, so the uniform domination (1.27) trivially holds, showing that M Tn
is in fact a martingale. Finally, Tn → +∞ a.-s. as n → ∞, because sups∈[0,t] | Ms | < ∞ a.-s.. In
conclusion, the sequence ( Tn )n≥1 defined by (1.29) is always a localizing sequence for M. It has the
additional advantage that the stopped martingale M Tn is bounded for every n ∈ N, which can be useful.
Proposition 1.9 (Addition of local martingales). Continuous local martingales form a vector space.
Proof. Let M and M̃ be two local martingales, with localizing sequences ( Tn )n≥1 and ( T̃n )n≥1 .
Fix n ∈ N. Since M^{Tn} and M̃^{T̃n} are continuous martingales, Doob's optional stopping Theorem
ensures that the stopped processes M^{Tn∧T̃n} and M̃^{Tn∧T̃n} are martingales. Thus, so is
λM^{Tn∧T̃n} + µM̃^{Tn∧T̃n}, for any λ, µ ∈ R. But this shows that ( Tn ∧ T̃n )n≥1 is a localizing
sequence for λM + µM̃, thereby completing the proof (note that Tn ∧ T̃n → ∞ because Tn , T̃n → ∞).

Proposition 1.10 (No interesting local martingale has finite variation). If M is a continuous local
martingale which has finite variation a.-s., then P(∀t ≥ 0, Mt = M0 ) = 1.
Proof. Assume without loss of generality that M0 = 0. The smart localizing sequence (1.29)
makes the stopped process M Tn a square-integrable martingale. Moreover, we have V ( M Tn , 0, t) =
V ( M, 0, t ∧ Tn ) < ∞ for all t ≥ 0. Thus, the non-local version of the result ensures that M Tn is
a.-s. constant in time, and letting n → ∞ shows that M is a.-s. constant in time, as desired.

Proposition 1.11 (Quadratic variation). Let M be a continuous local martingale. Then, the limit

    ⟨M⟩t := lim_{n→∞} Σ_{k=1}^n | M_{t_k^n} − M_{t_{k−1}^n} |²

exists in probability for each t ≥ 0, and does not depend on the subdivisions (t_k^n)_{0≤k≤n} of [0, t], as long as
max_{1≤k≤n} |t_k^n − t_{k−1}^n| → 0 as n → ∞. Moreover, ⟨M⟩ is the unique process (up to modification) such that
(i) ⟨M⟩0 = 0 ;

(ii) t 7→ ⟨M⟩t is a.-s. continuous ;

(iii) t 7→ ⟨M⟩t is a.-s. non-decreasing ;

(iv) ( Mt² − ⟨M⟩t )t≥0 is a local martingale.




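The convergence in Proposition 1.11 can be observed numerically. The following sketch (ours, not part of the text) simulates Brownian increments on a uniform mesh of [0, t] and checks that the sum of squared increments concentrates around ⟨B⟩t = t:

```python
# Illustration of Proposition 1.11 for M = B: along a uniform mesh of [0, t],
# the sum of squared Brownian increments concentrates around <B>_t = t.
import math
import random

def squared_increment_sum(n, t, rng):
    """Sum_{k=1}^n |B_{t_k} - B_{t_{k-1}}|^2 along the mesh t_k = kt/n,
    simulated via i.i.d. N(0, t/n) increments."""
    h = t / n
    return sum(rng.gauss(0.0, math.sqrt(h)) ** 2 for _ in range(n))

rng = random.Random(0)
t = 2.0
for n in (10, 1_000, 100_000):
    print(n, squared_increment_sum(n, t, rng))  # approaches t = 2.0 as n grows
```

The fluctuations of the sum around t have standard deviation of order sqrt(2t²/n), which explains the fast concentration.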
Exercise 1.2 (Square-integrable local martingales). Fix t ≥ 0. Show that a continuous local
martingale M = ( Ms )s∈[0,t] is a square-integrable martingale if and only if M0 ∈ L2 and ⟨ M ⟩t ∈ L1 .
Exercise 1.3 (Local martingales are unbounded). Let M be a continuous local martingale such that
a.-s., ⟨ M ⟩∞ = ∞. Prove that a.-s., lim supt→∞ Mt = +∞ and lim inft→∞ Mt = −∞.

Chapter 2

Stochastic integration

We have now arrived at the main theoretical challenge of this introductory course: giving a
proper meaning to a stochastic integral of the form
    I_t = ∫_0^t X_u dY_u ,        (2.1)

where X = ( Xt )t≥0 and Y = (Yt )t≥0 are stochastic processes. A natural idea is of course to define
this integral as a limit of Riemann sums, just as one would do if X and Y were deterministic:
    I_t := lim_{n→∞} Σ_{k=1}^n X_{t_{k−1}^n} ( Y_{t_k^n} − Y_{t_{k−1}^n} ),        (2.2)

where (t_k^n)_{0≤k≤n} is a subdivision of [0, t] such that max_{1≤k≤n} |t_k^n − t_{k−1}^n| → 0 as n → ∞.
Unfortunately, the almost-sure convergence of these Riemann sums requires the process Y to have
finite variation, thereby excluding Brownian motion as well as any interesting martingale.
The solution found by Itô consists in compensating roughness by randomness: with a bit
of work, it will be shown that the above limit does in fact exist when taken in the L2 sense,
for a wide class of stochastic processes X, Y which includes Brownian motion. The general
construction is rather delicate, but will eventually provide us with an extremely robust theory
of stochastic integration and differentiation. As a warm-up, let us first restrict our attention to
the special case where X is deterministic and Y is a Brownian motion. In this very comfortable
setting, the integral It is known as a Wiener integral, and it enjoys remarkable properties.

2.1 The Wiener isometry


Let H, H′ be Hilbert spaces. An isometry is an additive and norm-preserving map I : H → H′ :
∀ x, y ∈ H, I ( x + y) = I ( x ) + I (y) and ∥ I ( x )∥H′ = ∥ x ∥H . (2.3)
In particular, this implies that I is linear and continuous. When I is only defined on a vector
subspace V ⊆ H (and is linear and norm-preserving thereon), we speak of a partial isometry.
The following classical result will play a fundamental role in this chapter.
Theorem 2.1 (Isometry extension). Let I : V → H′ be a partial isometry whose domain V is dense in
H. Then I admits a unique continuous extension to H, and the latter is an isometry.
Proof. Fix x ∈ H \ V, and take ( xn )n≥1 in V which converges to x. Clearly, any continuous
extension must satisfy
I (x) = lim I ( xn ), (2.4)
n→∞


which establishes uniqueness. To prove existence, one would like to use (2.4) as a definition, but
there are two potential problems: it is not clear that the limit exists, and even if it does, it might
a priori depend on the particular sequence ( xn )n≥1 chosen to approximate x. Fortunately, both
issues are solved by the fact that I is a partial isometry. Indeed, for all n, m ∈ N, we have

    ∥I(xn) − I(xm)∥ = ∥I(xn − xm)∥ = ∥xn − xm∥ −−→ 0   as n ∧ m → ∞,        (2.5)

because ( xn )n≥1 is convergent. Thus, ( I ( xn ))n≥1 is a Cauchy sequence, hence the limit (2.4)
exists. Moreover, the latter does not depend on the chosen approximation ( xn )n≥1 . Indeed, if
(yn )n≥1 is another sequence in V which converges to x, then

    ∥I(xn) − I(yn)∥ = ∥xn − yn∥ −−→ 0   as n → ∞.        (2.6)

Thus, the formula (2.4) defines a continuous extension, and the latter is automatically linear and
norm-preserving, because these properties depend continuously on their arguments.

Now, given a Brownian motion B = ( Bt )t≥0 and a deterministic, square-integrable function


f : R+ → R, our goal is to give a meaning to the random variable
    I(f) = ∫_0^∞ f(u) dBu .        (2.7)

Of course, one should have I ( f ) = Bt in the basic case f = 1(0,t] . Also, as any reasonable
integral, f 7→ I ( f ) should be linear. Together, these two requirements impose that

    f = Σ_{k=1}^n a_k 1_{(t_{k−1}, t_k]}   =⇒   I(f) = Σ_{k=1}^n a_k ( B_{t_k} − B_{t_{k−1}} ),        (2.8)

for any n ∈ N, a1 , . . . , an ∈ R and 0 = t0 ≤ t1 ≤ . . . ≤ tn . With this definition, we have

    E[ (I(f))² ] = Σ_{k=1}^n a_k² (t_k − t_{k−1}) = ∫_0^∞ f²(u) du.

Thus, our map I is a partial isometry on the subspace E ⊆ L2 (R+ ) of all step functions:
    E := { Σ_{k=1}^n a_k 1_{(t_{k−1}, t_k]} : n ∈ N, (a_1, . . . , a_n) ∈ R^n, 0 = t_0 ≤ t_1 ≤ . . . ≤ t_n }.        (2.9)

It turns out that this set is large enough, in the precise sense that it is dense in L2 (R+ ).

Lemma 2.1 (Approximation by step functions). Any function f ∈ L²(R+) is the limit in L²(R+) of
the sequence of step functions ( Pn f )n≥1 , where

    Pn f := Σ_{k=1}^{n²} ( n ∫_{(k−1)/n}^{k/n} f(u) du ) 1_{(k/n, (k+1)/n]} .        (2.10)

Moreover, when f ∈ C_c^0(R+), we can replace the cell average n ∫_{(k−1)/n}^{k/n} f(u) du by f(k/n).
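A quick numerical sanity check (ours, with an arbitrary choice of test function, not part of the text): for a concrete f ∈ L²(R+), the L² distance between f and the step function Pn f of (2.10) decreases as n grows.

```python
# L^2(R+) distance between f and its step approximation P_n f from (2.10).
# Note the one-cell shift in (2.10): the average of f over ((k-1)/n, k/n]
# is carried by the NEXT cell (k/n, (k+1)/n].
import math

def f(u):
    # arbitrary test function in L^2(R+): exp(-u) on (0, 1], zero afterwards
    return math.exp(-u) if 0.0 < u <= 1.0 else 0.0

def cell_avg(g, lo, hi, m=64):
    # midpoint-rule approximation of the average of g over (lo, hi]
    h = (hi - lo) / m
    return sum(g(lo + (j + 0.5) * h) for j in range(m)) / m

def l2_error(n):
    # || P_n f - f ||_{L^2(R+)}: P_n f vanishes on (0, 1/n], and equals the
    # shifted cell average a_{n,k} on each (k/n, (k+1)/n], k = 1..n^2
    err2 = cell_avg(lambda u: f(u) ** 2, 0.0, 1.0 / n) / n
    for k in range(1, n * n + 1):
        a = cell_avg(f, (k - 1) / n, k / n)
        err2 += cell_avg(lambda u, a=a: (a - f(u)) ** 2, k / n, (k + 1) / n) / n
    return math.sqrt(err2)

print([round(l2_error(n), 4) for n in (2, 4, 8, 16)])  # decreasing towards 0
```

For this discontinuous f the decay is slow (the jump at u = 1 and the mass on (0, 1/n] cost order 1/√n), but the convergence predicted by the lemma is visible.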

Thus, the isometry extension theorem applies, leading to the following result.


Theorem 2.2 (Wiener isometry). Let B = ( Bt )t≥0 be a Brownian motion on (Ω, F , P). Then, there
exists a unique linear and continuous map I : L2 (R+ ) → L2 (Ω, F , P) such that for all t ≥ 0,
I (1(0,t] ) = Bt . (2.11)
Moreover, I is an isometry, in the sense that for all f ∈ L2 (R+ ),
∥ I ( f )∥ L2 (Ω) = ∥ f ∥ L2 (R+ ) . (2.12)
The map I is called the Wiener isometry, and denoted I(f) = ∫_0^∞ f(u) dBu .
Remark 2.1 (Explicit formula). Let us make several important remarks about this result.
1. By construction, for any f ∈ L²(R+), we have the explicit formula

    ∫_0^∞ f(t) dBt = lim_{n→∞} Σ_{k=1}^{n²} a_{n,k}(f) ( B_{(k+1)/n} − B_{k/n} ),        (2.13)

where a_{n,k}(f) = n ∫_{(k−1)/n}^{k/n} f(u) du, and where the limit is taken in the L² sense. Moreover, in the
particular case where f ∈ C_c^0(R+), we can take the simpler choice a_{n,k}(f) = f(k/n).
2. As any distributional limit of a sequence of (centered) Gaussian random variables, the Wiener
integral is a (centered) Gaussian random variable, with variance given by (2.12):

    ∀f ∈ L²(R+),   ∫_0^∞ f(u) dBu ∼ N( 0, ∫_0^∞ f²(u) du ).        (2.14)

3. Thanks to the polarization identity 2⟨f, g⟩ = ⟨f + g, f + g⟩ − ⟨f, f⟩ − ⟨g, g⟩, valid for any
symmetric bilinear form, the isometry property (2.12) leads to the covariance formula

    Cov( ∫_0^∞ f(u) dBu , ∫_0^∞ g(u) dBu ) = ∫_0^∞ f(u) g(u) du.        (2.15)
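To make (2.13) and (2.14) concrete, here is a small simulation (ours, with an arbitrary choice of f, not part of the text) that samples the Wiener integral by Riemann sums and compares the empirical mean and variance with the predicted N(0, ∫f²):

```python
# Sampling int_0^1 cos(u) dB_u via the Riemann sums (2.13) with a_{n,k} = f(k/n):
# by (2.14) the samples should be close in law to N(0, v) with
# v = int_0^1 cos^2(u) du = 1/2 + sin(2)/4 ~ 0.727.
import math
import random

def wiener_integral_sample(f, t, n, rng):
    """One approximate sample of int_0^t f dB on a uniform mesh of size t/n."""
    h = t / n
    return sum(f(k * h) * rng.gauss(0.0, math.sqrt(h)) for k in range(1, n + 1))

rng = random.Random(1)
samples = [wiener_integral_sample(math.cos, 1.0, 200, rng) for _ in range(5_000)]
mean = sum(samples) / len(samples)
var = sum((x - mean) ** 2 for x in samples) / len(samples)
print(round(mean, 3), round(var, 3))  # mean ~ 0, variance ~ 0.727
```

A histogram of `samples` would likewise look Gaussian, in line with point 2 above.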

2.2 The Wiener integral as a process


For 0 ≤ s ≤ t, it is natural to introduce the notation

    ∫_s^t f(u) dBu := ∫_0^∞ f(u) 1_{(s,t]}(u) dBu .        (2.16)

Note that for this definition to make sense, we only need the function f : R+ → R to be locally
square-integrable (written f ∈ L²_LOC), in the sense that ∫_0^t f²(u) du < ∞ for each t ≥ 0. Also, by
linearity of the Wiener integral, we have the Chasles relation

    ∫_0^t f(u) dBu = ∫_0^s f(u) dBu + ∫_s^t f(u) dBu   (0 ≤ s ≤ t).        (2.17)
To any f ∈ L²_LOC, we may now naturally associate a process M^f = ( M_t^f )t≥0 , given by

    M_t^f := ∫_0^t f(u) dBu .        (2.18)

Clearly, M^f is a centered Gaussian process, with covariance function given by

    Cov( M_t^f , M_s^f ) = ∫_0^{s∧t} f²(u) du.        (2.19)

But M f has an even more remarkable property, which justifies by itself the interest of stochastic
integration. In the following result, the underlying filtration (Ft )t≥0 can be taken to be the
natural filtration of B, or any filtration for which B is a (Ft )t≥0 −Brownian motion.


Theorem 2.3 (Wiener martingale). M^f is a continuous square-integrable martingale, with

    ∀t ≥ 0,   ⟨M^f⟩_t = ∫_0^t f²(u) du.        (2.20)

Proof. The square-integrability is clear, by construction of the Wiener integral. Now, for any
fixed 0 ≤ s ≤ t, the function f 1_{(s,t]} is the L²−limit of a sequence of step functions supported on
(s, t]. In view of our construction of the Wiener integral, this implies that

    M_t^f − M_s^f = ∫_s^t f(u) dBu ∈ Vect( Bu − Bs : u ∈ [s, t] ).        (2.21)

Since B is a (Ft)t≥0 −Brownian motion, it follows that M_t^f is Ft −measurable and that M_t^f − M_s^f
is independent of Fs . In particular, we have

    E[ M_t^f − M_s^f | Fs ] = E[ M_t^f − M_s^f ] = 0;
    E[ (M_t^f)² − (M_s^f)² | Fs ] = E[ (M_t^f − M_s^f)² | Fs ] = E[ (M_t^f − M_s^f)² ] = ∫_s^t f²(u) du,

where the last identity uses the isometry property. The first line shows that M^f is a martingale,
and the second that ( (M_t^f)² − ∫_0^t f²(u) du )t≥0 also is. It thus only remains to prove the
continuity of M^f. With f_n := Pn f as in (2.10), the L² convergence f_n 1_{(0,t]} → f 1_{(0,t]} implies that

    M_t^{f_n} −−→ M_t^f in L²,   as n → ∞,        (2.22)

for each t ≥ 0. In view of Proposition 1.5, we thus only have to establish the continuity of M^f
when f is a step function. By linearity, we may further assume that f = 1_{(a,b]} with 0 ≤ a ≤ b. But
then the result is trivial, since M_t^f = B_{b∧t} − B_{a∧t} .

When f = 1, we have M^f = B, so we recover earlier observations. Likewise, we had seen
earlier that ( e^{θBt − θ²t/2} )t≥0 is a martingale, for any θ ∈ R. Here is a considerable generalization.
f
Proposition 2.1 (Exponential martingale). For any f ∈ L²_LOC, the process Z^f = ( Z_t^f )t≥0 defined by

    Z_t^f := exp( ∫_0^t f(u) dBu − (1/2) ∫_0^t f²(u) du ),        (2.23)

is a (continuous, square-integrable) martingale.

Proof. The integrability poses no problem, because the stochastic integral is a Gaussian random
variable. Now, fix 0 ≤ s ≤ t. As already observed, the random variable ∫_0^s f(u) dBu is
Fs −measurable, while ∫_s^t f(u) dBu is independent of Fs . Consequently, we have

    E[ Z_t^f | Fs ] = Z_s^f E[ exp( ∫_s^t f(u) dBu − (1/2) ∫_s^t f²(u) du ) ] = Z_s^f ,

because ∫_s^t f(u) dBu ∼ N( 0, ∫_s^t f²(u) du ).
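As a quick sanity check (ours, not part of the text), take f ≡ 1 and t = 1 in (2.23): then Z_t = exp(B_t − t/2), and the martingale property forces E[Z_1] = E[Z_0] = 1, which a plain Monte-Carlo estimate confirms:

```python
# Monte-Carlo check that E[Z_1] = 1 for Z_t = exp(B_t - t/2), i.e. (2.23) with f = 1.
import math
import random

def mean_exponential_martingale(t, n_samples, rng):
    """Estimate E[exp(B_t - t/2)] using B_t ~ N(0, t)."""
    total = 0.0
    for _ in range(n_samples):
        total += math.exp(rng.gauss(0.0, math.sqrt(t)) - t / 2)
    return total / n_samples

rng = random.Random(2)
print(round(mean_exponential_martingale(1.0, 50_000, rng), 3))  # ~ 1.0
```

The underlying identity is E[e^X] = e^{σ²/2} for X ∼ N(0, σ²), exactly the computation closing the proof above.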

Exercise 2.1. Determine the law of the process X in the following two cases:

    X_t := (1 − t) ∫_0^t 1/(1 − u) dBu ,   t ∈ (0, 1);
    X_t := e^{−t} ( X_0 + ∫_0^t e^u dBu ),   t ∈ R+ ,   with X_0 ∼ N(0, 1/2) independent of B.


2.3 Progressive processes


As above, we consider a filtered space (Ω, F , (Ft )t≥0 , P) on which is given a (Ft )t≥0 −Brownian
motion B = ( Bt )t≥0 . We seek to extend the Wiener integral to the case where the deterministic
function f is replaced by a stochastic process ϕ = (ϕt )t≥0 . To do so, we will need to require that
ϕ is progressive, in the sense that for each fixed t ≥ 0, the function

([0, t] × Ω, B([0, t]) ⊗ Ft ) → (R, B(R))


(u, ω ) 7→ ϕu (ω )

is measurable. This is more than asking that ω 7→ ϕt (ω ) is Ft −measurable for each t ≥ 0,
and that t 7→ ϕt (ω ) is Borel-measurable for each ω ∈ Ω. It ensures, by Fubini's Theorem, that
∫_0^t ϕu du is Ft −measurable whenever ∫_0^t |ϕu| du < ∞ a.-s.
0 u

Remark 2.2 (Progressive σ−field). It is easy to check that the set P defined by

    P := ∩_{t≥0} { A ⊆ R+ × Ω : A ∩ ([0, t] × Ω) ∈ B([0, t]) ⊗ Ft } ,

is a σ−field, and that a process ϕ is progressive if and only if the map (t, ω) 7→ ϕt(ω) is P −measurable.

Proposition 2.2 (Sufficient conditions). The class of progressive processes include:

(o) any deterministic process ϕt (ω ) = f (t), where f : R+ → R is measurable.

(i) any process of the form ϕt (ω ) = X (ω )1(a,b] (t) where 0 ≤ a ≤ b, and where X is F a −measurable;

(ii) any process of the form ϕt (ω ) = 1[0,T (ω )] (t), where T is a stopping time;

(iii) any process of the form ϕt (ω ) = F( ϕ_t^1(ω), . . . , ϕ_t^n(ω) ), where F : R^n → R is a measurable
function and (ϕ^1, . . . , ϕ^n) are progressive processes (so sums, products, etc.);

(iv) any pointwise limit ϕ = limn→∞ ϕn of a sequence (ϕn )n≥1 of progressive processes;

(v) any continuous and adapted process.

Proof. For (i), we write, for any Borel set B ∈ B(R) which does not contain 0 (otherwise, take B^c),

    {(u, ω) ∈ [0, t] × Ω : ϕu(ω) ∈ B} = ( [0, t] ∩ (a, b] ) × {ω ∈ Ω : X(ω) ∈ B} ,

which is either empty (if t ≤ a), or of the form I × A with I ∈ B([0, t]) and A ∈ Ft (if t > a). For
(ii), we note that ϕ is {0, 1}−valued, and that

    {(u, ω) ∈ [0, t] × Ω : ϕu(ω) = 0} = ∪_{q∈[0,t]∩Q} (q, t] × {ω ∈ Ω : T(ω) ≤ q} .

For (iii) and (iv), we simply use the fact that limits and compositions of measurable functions
are measurable. Finally, (v) follows from (i), (iii) and (iv) once we observe that any continuous
adapted process ϕ is the pointwise limit of the sequence (ϕ^n)n≥1 , where

    ϕ_t^n(ω) := ϕ_0(ω) 1(t = 0) + Σ_{k=0}^{n²} ϕ_{k/n}(ω) 1_{(k/n, (k+1)/n]}(t).        (2.24)

This concludes the proof.


2.4 The Itô isometry


We let M²(R+) denote the space of progressive processes ϕ = (ϕt)t≥0 such that

    E[ ∫_0^∞ ϕu² du ] < ∞.        (2.25)

By Remark 2.2, M²(R+) = L²(R+ × Ω, P , dt ⊗ P(dω)) is a Hilbert space, with scalar product
⟨ψ, ϕ⟩_{M²} := E[ ∫_0^∞ ψu ϕu du ]. This space contains every elementary random step function

    ϕu(ω) := X(ω) 1_{(s,t]}(u),        (2.26)

where 0 ≤ s ≤ t and X ∈ L²(Ω, Fs , P). For such a basic process, it makes sense to define

    ∫_0^∞ ϕu dBu := X(ω) ( Bt − Bs ).        (2.27)

As in the Wiener case, this definition extends uniquely to the whole Hilbert space:

Theorem 2.4 (Itô integral). There exists a unique continuous and linear map I : M²(R+) → L²(Ω)
such that I(ϕ) = X(Bt − Bs) whenever ϕ is as in (2.26). Moreover, I is an isometry, i.e.

    ∀ψ, ϕ ∈ M²(R+),   E[ I(ψ) I(ϕ) ] = E[ ∫_0^∞ ψu ϕu du ].        (2.28)

We call I the Itô integral, and write I(ϕ) = ∫_0^∞ ϕu dBu .

Proof. If ϕ = (ϕt)t≥0 is a random step function of the form

    ϕt(ω) = Σ_{k=0}^{n−1} X_k(ω) 1_{(t_k, t_{k+1}]}(t),        (2.29)

with n ∈ N, 0 ≤ t0 ≤ . . . ≤ tn , and X_k ∈ L²(Ω, F_{t_k} , P) for each 0 ≤ k < n, we are forced to set

    I(ϕ) := Σ_{k=0}^{n−1} X_k ( B_{t_{k+1}} − B_{t_k} ).

Note that I(ϕ) ∈ L²(Ω, F , P). Moreover, for 0 ≤ j < k < n, we have

    E[ X_j (B_{t_{j+1}} − B_{t_j}) X_k (B_{t_{k+1}} − B_{t_k}) ] = E[ X_j (B_{t_{j+1}} − B_{t_j}) X_k ] E[ B_{t_{k+1}} − B_{t_k} ] = 0,

because X_j , X_k , (B_{t_{j+1}} − B_{t_j}) are F_{t_k} −measurable, while B_{t_{k+1}} − B_{t_k} is independent of F_{t_k} . Thus,

    E[ |I(ϕ)|² ] = Σ_{k=0}^{n−1} E[ X_k² ( B_{t_{k+1}} − B_{t_k} )² ] = Σ_{k=0}^{n−1} E[ X_k² ] (t_{k+1} − t_k) = E[ ∫_0^∞ ϕ_t² dt ].

This identity shows that our linear map I – so far defined on random step functions – is an
isometry. To conclude, it thus only remains to show that random step functions are dense in
M2 (R+ ). For this, we again use the approximation operators ( Pn )n≥1 from Lemma 2.1, i.e.
    (Pn ϕ)t = Σ_{k=1}^{n²} ( n ∫_{(k−1)/n}^{k/n} ϕu du ) 1_{(k/n, (k+1)/n]}(t).        (2.30)


Note that Pn ϕ is a random step function for any ϕ ∈ M²(R+) and n ∈ N, because the random
variable n ∫_{(k−1)/n}^{k/n} ϕs ds is F_{k/n} −measurable (the progressivity of ϕ is used here) with

    E[ ( n ∫_{(k−1)/n}^{k/n} ϕu du )² ] ≤ E[ n ∫_{(k−1)/n}^{k/n} ϕu² du ] < +∞        (2.31)

(∥ϕ∥_{M²} < ∞ is used here). Now, to prove that Pn ϕ → ϕ in M², we estimate

    ∥Pn ϕ − ϕ∥²_{M²} = E[ ∥Pn ϕ − ϕ∥²_{L²(R+)} ].        (2.32)

Lemma 2.1 ensures that the term inside the expectation tends a.-s. to 0 as n → ∞ (the random
function u 7→ ϕu is a.-s. in L²(R+), because ∥ϕ∥_{M²} < ∞). Moreover, we have the domination

    ∥Pn ϕ − ϕ∥²_{L²(R+)} ≤ ( ∥Pn ϕ∥_{L²(R+)} + ∥ϕ∥_{L²(R+)} )² ≤ 4∥ϕ∥²_{L²(R+)} ,        (2.33)

and the right-hand side has expectation 4∥ϕ∥²_{M²} < ∞.

Remark 2.3 (Important comments). Here are a few elementary but important observations.

1. For any ϕ ∈ M²(R+), we have, in the L² sense,

    ∫_0^∞ ϕu dBu = lim_{n→∞} Σ_{k=1}^{n²} ( n ∫_{(k−1)/n}^{k/n} ϕu du ) ( B_{(k+1)/n} − B_{k/n} ).        (2.34)

2. In the deterministic case ϕt(ω) = f(t) with f ∈ L²(R+), we recover the Wiener integral.

3. Regardless of whether the process ϕ ∈ M²(R+) is centered or not, we always have

    E[ ∫_0^∞ ϕu dBu ] = 0.        (2.35)

4. For any ϕ, ψ ∈ M²(R+), the isometry formula also reads (by polarization)

    Cov( ∫_0^∞ ϕu dBu , ∫_0^∞ ψu dBu ) = E[ ∫_0^∞ ϕu ψu du ].        (2.36)

5. Even in the elementary case (2.26), the random variable ∫_0^∞ ϕu dBu has no reason to be Gaussian!

2.5 The Itô integral as a process


As in the Wiener case, we adopt the natural notation

    ∫_s^t ϕu dBu := ∫_0^∞ ϕu 1_{(s,t]}(u) dBu ,        (2.37)

for all 0 ≤ s ≤ t. Note that the right-hand side makes sense as soon as ϕ is progressive with

    ∀t ≥ 0,   E[ ∫_0^t ϕu² du ] < ∞.        (2.38)

The space of such processes is strictly larger than M²(R+), and will be denoted by M². The
interest of stochastic integration is essentially contained in the following fundamental result.


Theorem 2.5 (Itô martingale). For any ϕ ∈ M², the process M^ϕ = ( M_t^ϕ )t≥0 defined by

    M_t^ϕ := ∫_0^t ϕu dBu        (2.39)

is a continuous square-integrable martingale, with quadratic variation

    ⟨M^ϕ⟩_t = ∫_0^t ϕu² du.        (2.40)

Proof. Let us first consider the case of an elementary random step function ϕt(ω) = X(ω) 1_{(a,b]}(t),
with 0 ≤ a ≤ b and X ∈ L²(Ω, F_a , P). By definition, we then have for t ≥ 0,

    M_t^ϕ = X ( B_{b∧t} − B_{a∧t} ) = X ( B_{b∧t} − B_a ) if a ≤ t, and 0 else.        (2.41)

The continuity of M^ϕ is clear from the first expression, and the adaptedness and square-
integrability easily follow from the second expression. Moreover, for 0 ≤ s ≤ t, we have

    M_t^ϕ − M_s^ϕ = X ( B_{b∧t} − B_{s∨a} ) if s ≤ b and a ≤ t, and 0 else.        (2.42)

In either case, we easily find

    E[ M_t^ϕ − M_s^ϕ | F_{s∨a} ] = 0;        (2.43)
    E[ ( M_t^ϕ − M_s^ϕ )² | F_{s∨a} ] = X² ∫_s^t 1_{(a,b]}(u) du = ∫_s^t ϕu² du.        (2.44)

By the tower property of conditional expectation, this implies E[ M_t^ϕ − M_s^ϕ | Fs ] = 0 and
E[ ( M_t^ϕ − M_s^ϕ )² − ∫_s^t ϕu² du | Fs ] = 0. Thus, M^ϕ is a martingale, and ⟨M^ϕ⟩_t = ∫_0^t ϕu² du.
Now, if ϕ̃t(ω) = X̃ 1_{(ã,b̃]}(t) is another elementary step function with b ≤ ã, then a similar reasoning
as above yields

    E[ ( M_t^ϕ − M_s^ϕ )( M_t^ϕ̃ − M_s^ϕ̃ ) | F_{s∨ã} ] = 0 = ∫_s^t ϕu ϕ̃u du.        (2.45)

This remains true if F_{s∨ã} is replaced by Fs , and we conclude that the formula

    ⟨M^ϕ , M^ϕ̃⟩_t = ∫_0^t ϕu ϕ̃u du

holds whenever the elementary random step functions ϕ, ϕ̃ are equal or have disjoint supports.
Now, if ϕ is an arbitrary random step function, then ϕ is a linear combination of elementary
random step functions with disjoint supports, so the above computations show that M^ϕ is a
square-integrable martingale with ⟨M^ϕ⟩_t = ∫_0^t ϕu² du. Finally, this extends to any ϕ ∈ M² by
Proposition 1.5, once we have observed that

    ∀t ≥ 0,   M_t^{ϕⁿ} → M_t^ϕ in L² as n → ∞,        (2.46)

with ϕⁿ = Pn ϕ, because (ω, s) 7→ ϕ_s^n(ω) 1_{(0,t]}(s) converges to (ω, s) 7→ ϕs(ω) 1_{(0,t]}(s) in M²(R+).
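Formula (2.40) can be tested numerically. The sketch below (ours, not part of the text) takes ϕ = B: the quadratic variation of M_t = ∫_0^t B_u dB_u should equal ∫_0^t B_u² du, and on a simulated path the sum of squared increments of M along a fine mesh indeed matches that integral.

```python
# Pathwise check of (2.40) for phi = B: the sum of squared increments of
# M_t = int_0^t B dB along a fine mesh should match int_0^t B_u^2 du.
import math
import random

def qv_vs_integral(n, t, rng):
    """Return (sum of squared increments of M, Riemann sum for int B^2 du)
    along the uniform mesh of [0, t] with n steps, on one simulated path."""
    h = t / n
    B, qv_M, int_B2 = 0.0, 0.0, 0.0
    for _ in range(n):
        dB = rng.gauss(0.0, math.sqrt(h))
        dM = B * dB            # left-point increment of the Ito integral
        qv_M += dM * dM
        int_B2 += B * B * h
        B += dB
    return qv_M, int_B2

rng = random.Random(3)
qv_M, int_B2 = qv_vs_integral(200_000, 1.0, rng)
print(round(qv_M, 4), round(int_B2, 4))  # the two numbers should be close
```

Note the left-point evaluation `B * dB` before updating `B`: this matches the Itô (non-anticipating) convention of the whole construction.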

Remark 2.4 (Quadratic covariation). By polarization, we have for all ϕ, ψ ∈ M² and all t ≥ 0,

    ⟨M^ϕ , M^ψ⟩_t = ∫_0^t ϕu ψu du.        (2.47)

Remark 2.5 (Two differences with the Wiener integral). Unlike the Wiener case, the random variable
∫_s^t ϕu dBu has, in general, no reason to be Gaussian, and no reason to be independent of Fs !


2.6 Generalized Itô integral


We now extend the Itô integral to the class M²_LOC of all progressive processes ϕ = (ϕt)t≥0 satisfying

    ∀t ≥ 0,   ∫_0^t ϕu² du < ∞,        (2.48)

almost-surely. Note that this new space is much larger than M²: in particular, it contains every
continuous adapted process! Now, fix ϕ ∈ M²_LOC and n ∈ N, and consider the stopping time

    Tn := inf{ t ≥ 0 : ∫_0^t ϕu² du ≥ n },        (2.49)

which is the hitting time of the closed set [n, ∞) by the continuous adapted process t 7→ ∫_0^t ϕu² du.
Thanks to Proposition 2.2, the truncated process

    ϕ_t^n(ω) := ϕt(ω) 1_{[0,Tn(ω)]}(t)

is progressive, and it is even in M²(R+) because

    ∫_0^∞ (ϕ_u^n)² du = ∫_0^{Tn} ϕu² du ≤ n,

by definition of Tn . Consequently, the process M^n given by

    M_t^n := ∫_0^t ϕ_u^n dBu = ∫_0^∞ ϕu 1_{[0,Tn∧t]}(u) dBu ,        (2.50)

is a perfectly well-defined continuous (square-integrable) martingale, for each n ∈ N. Now,
fix t ≥ 0. On the event {Tn ≥ t}, we have ϕ^m 1_{[0,t]} = ϕ^n 1_{[0,t]} for all m ≥ n. By virtue of (2.34),
this implies that M_t^m = M_t^n for all m ≥ n. Thus the sequence (M_t^m)m≥1 converges a.-s. on the
event {Tn ≥ t}. Since n ∈ N is arbitrary, it follows that (M_t^m)m≥1 converges a.-s. on the event
∪_{n≥1} {Tn ≥ t}. But the latter has probability 1, thanks to (2.48). Thus, (M_t^m)m≥1 converges a.-s.
to a limit Mt , and we may safely set

    ∫_0^t ϕu dBu := Mt = lim_{n→∞} ∫_0^t ϕu 1_{[0,Tn]}(u) dBu .        (2.51)

The fact that M_{t∧Tn} = M_t^n shows that M is a continuous local martingale. Let us sum this up.

Theorem 2.6 (Generalized Itô integral). For any ϕ ∈ M²_LOC, the process M^ϕ = ( M_t^ϕ )t≥0 defined by

    ∀t ≥ 0,   M_t^ϕ := ∫_0^t ϕu dBu ,

is a continuous local martingale with quadratic variation

    ∀t ≥ 0,   ⟨M^ϕ⟩_t = ∫_0^t ϕu² du.

Here is a useful stochastic analogue of Lebesgue’s dominated convergence Theorem.

Proposition 2.3 (Stochastic dominated convergence). Fix t ≥ 0. In order to ensure the convergence

    ∫_0^t ϕ_u^n dBu → ∫_0^t ϕu dBu in probability as n → ∞,        (2.52)

it suffices that the progressive processes ϕ, ϕ¹, ϕ², . . . satisfy:


(i) (simple convergence): for almost-every u ∈ [0, t], ϕ_u^n → ϕu in probability as n → ∞;

(ii) (domination): for all u ∈ [0, t] and n ∈ N, |ϕ_u^n| ≤ Ψu a.-s., with Ψ ∈ M²_LOC.

Proof. For k ∈ N, let Tk := inf{ t ≥ 0 : ∫_0^t Ψu² du ≥ k }. Then the isometry formula in M² yields

    E[ ( ∫_0^{Tk∧t} ϕ_u^n dBu − ∫_0^{Tk∧t} ϕu dBu )² ] = E[ ∫_0^{Tk∧t} (ϕ_u^n − ϕu)² du ] → 0 as n → ∞,        (2.53)

by dominated convergence. To conclude from this, we simply write, for ε > 0,

    P( |∫_0^t ϕ_u^n dBu − ∫_0^t ϕu dBu| ≥ ε ) ≤ P(Tk ≤ t) + P( |∫_0^{t∧Tk} ϕ_u^n dBu − ∫_0^{t∧Tk} ϕu dBu| ≥ ε ).

The first term can be made arbitrarily small by choosing k large enough, because Tk ↑ ∞ a.-s.
The second term can then be made arbitrarily small by choosing n large enough, by (2.53).

Corollary 2.1 (Approximation of the generalized Itô integral). If ϕ is continuous and adapted, then

    Σ_{k=0}^{n−1} ϕ_{t_k^n} ( B_{t_{k+1}^n} − B_{t_k^n} ) → ∫_0^t ϕu dBu in probability as n → ∞,

for every t ≥ 0 and any subdivision (t_k^n)_{0≤k≤n} of [0, t] with max_{0≤k<n} |t_{k+1}^n − t_k^n| → 0 as n → ∞.

Proof. Apply the above theorem with ϕ_t^n = Σ_{k=0}^{n−1} ϕ_{t_k^n} 1_{(t_k^n, t_{k+1}^n]}(t) and Ψt = sup_{u∈[0,t]} |ϕu|.

In the next chapter, we will compute stochastic integrals explicitly. Here is an example.

Example 2.1 (Brownian against Brownian). For all t ≥ 0, we have

    ∫_0^t Bu dBu = (1/2) ( Bt² − t ).        (2.54)

Proof. Fix t ≥ 0. Recall that for any continuous adapted process ϕ, we have

    Σ_{k=0}^{n−1} ϕ_{t_k^n} ( B_{t_{k+1}^n} − B_{t_k^n} ) → ∫_0^t ϕu dBu in probability,

where t_k^n = kt/n. Taking ϕ = 2B, we deduce that

    Σ_{k=0}^{n−1} 2B_{t_k^n} ( B_{t_{k+1}^n} − B_{t_k^n} ) → 2 ∫_0^t Bu dBu in probability.

On the other hand, we have seen at (1.24) that

    Σ_{k=0}^{n−1} ( B_{t_{k+1}^n} − B_{t_k^n} )² → t in probability.

Adding up those two lines, and observing that 2a(b − a) + (b − a)² = b² − a², we arrive at

    Σ_{k=0}^{n−1} ( B²_{t_{k+1}^n} − B²_{t_k^n} ) → 2 ∫_0^t Bu dBu + t in probability.

But the left-hand side is a telescopic sum, which equals Bt² independently of n. Thus,

    Bt² = 2 ∫_0^t Bu dBu + t,

almost-surely, as desired.
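Identity (2.54) is easy to test numerically (our sketch, not part of the text): along one simulated path, the left-point Riemann sum for ∫_0^t B dB stays close to (Bt² − t)/2.

```python
# Pathwise check of (2.54): by the telescoping identity of the proof above, the
# left-point Riemann sum equals (B_t^2 - sum of squared increments)/2 exactly,
# hence it is close to (B_t^2 - t)/2 once the mesh is fine.
import math
import random

def riemann_vs_formula(n, t, rng):
    """One path: return (left-point Riemann sum for int_0^t B dB, (B_t^2 - t)/2)."""
    h = t / n
    B, riemann = 0.0, 0.0
    for _ in range(n):
        dB = rng.gauss(0.0, math.sqrt(h))
        riemann += B * dB      # B_{t_k} (B_{t_{k+1}} - B_{t_k})
        B += dB
    return riemann, (B * B - t) / 2

rng = random.Random(4)
s, formula = riemann_vs_formula(100_000, 2.0, rng)
print(round(s, 4), round(formula, 4))  # the two numbers should be close
```

Replacing the left endpoint B_{t_k} by the right endpoint would shift the limit by t, which is exactly the quadratic-variation phenomenon the proof exploits.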

Chapter 3

Stochastic differentiation

3.1 Itô processes


In the previous chapter, we have learnt how to integrate a stochastic process ϕ = (ϕt)t≥0 against
our Brownian motion B, resulting in the generalized Itô integral

    t 7→ ∫_0^t ϕu dBu .        (3.1)

This process is well-defined as soon as ϕ ∈ M²_LOC, and it is always a continuous local martingale.
On the other hand, we of course also know how to integrate a stochastic process ψ = (ψt)t≥0
against the Lebesgue measure, resulting in the classical integral

    t 7→ ∫_0^t ψu du.        (3.2)

This process is well-defined, adapted, and continuous as soon as ψ belongs to the space M¹_LOC of
progressive processes satisfying, almost-surely,

    ∀t ≥ 0,   ∫_0^t |ψu| du < ∞.        (3.3)

Note that M²_LOC ⊆ M¹_LOC (by Cauchy–Schwarz), and that both spaces contain, in particular, every
continuous and adapted process. Also keep in mind that those two integrals are very different:
the process (3.1) is always a local martingale, while (3.2) has a.-s. finite variation! We will now
combine those two kinds of processes to construct the main object of stochastic calculus.

Definition 3.1 (Itô process). An Itô process is a stochastic process X = (Xt)t≥0 of the form

    ∀t ≥ 0,   Xt = X0 + ∫_0^t ϕu dBu + ∫_0^t ψu du,        (3.4)

with X0 ∈ F0 , ψ ∈ M¹_LOC and ϕ ∈ M²_LOC. The two integrals are called the martingale term and the drift
term, respectively. Instead of (3.4), we will often use the more convenient differential notation

    dXt = ϕt dBt + ψt dt.

Remark 3.1 (Linearity). Itô processes form a vector space: if X, Y are Itô processes, and λ, µ ∈ R, then
Z = λX + µY is of course an Itô process, and the martingale terms and drift terms behave linearly:

dZt = λ dXt + µ dYt . (3.5)


Note that an Itô process is always continuous, and adapted. Giving names to the two parts ϕ
and ψ of the decomposition (3.4) suggests that they are unique. This is indeed the case.

Proposition 3.1 (Uniqueness of the drift and martingale terms). If X simultaneously satisfies

    dXt = ϕt dBt + ψt dt   and   dXt = ϕ̃t dBt + ψ̃t dt,

for some ϕ, ϕ̃ ∈ M²_LOC and ψ, ψ̃ ∈ M¹_LOC, then ϕ, ϕ̃ are indistinguishable, and so are ψ, ψ̃.

Proof. By assumption, we have almost-surely,

    ∀t ≥ 0,   ∫_0^t (ϕu − ϕ̃u) dBu = ∫_0^t (ψu − ψ̃u) du.

Now, the left-hand side is a continuous local martingale, while the right-hand side has finite
variation almost-surely. Thus, both sides are null a.-s. In particular, the nullity of the left-hand
side implies that of its quadratic variation, i.e. a.-s.,

    ∀t ≥ 0,   ∫_0^t (ϕu − ϕ̃u)² du = 0.        (3.6)

Letting t → ∞ yields the indistinguishability of ϕ and ϕ̃. On the other hand, we have a.-s.,

    ∀t ≥ 0,   ∫_0^t (ψu − ψ̃u) du = 0.

It is a classical exercise on Lebesgue integrals that this forces the integrand to be null a.-e.

Remark 3.2 (Itô martingales). If X is as in (3.4), then it follows from the previous chapters that

1. X is a local martingale if and only if X0 ∈ L1 and ψ ≡ 0.

2. X is a square-integrable martingale if and only if X0 ∈ L2 , ψ ≡ 0 and ϕ ∈ M2 .

For this reason, determining the martingale term ϕ and the drift term ψ of an Itô process is essential.

Remark 3.3 (Integral against an Itô process). Let X be as in (3.4), and let Y be a continuous and
adapted process. Then, clearly, Yϕ ∈ M²_LOC and Yψ ∈ M¹_LOC, so it makes sense to define, for t ≥ 0,

    ∫_0^t Yu dXu := ∫_0^t Yu ϕu dBu + ∫_0^t Yu ψu du.

By the dominated convergence Theorem (and its stochastic version), we then have

    Σ_{k=0}^{n−1} Y_{t_k^n} ( X_{t_{k+1}^n} − X_{t_k^n} ) → ∫_0^t Yu dXu in probability,        (3.7)

along any subdivisions (t_k^n)_{0≤k≤n} of [0, t] with ∆n := max_{0≤k<n} (t_{k+1}^n − t_k^n) → 0 as n → ∞.

Example 3.1 (Squared Brownian motion). Our Brownian motion B is of course an Itô process (take
ϕ = 1 and ψ = 0). A less trivial example is B2 , for which the computation in Example 2.1 shows that

dBt2 = 2Bt dBt + dt.

Note the presence of the quadratic variation term dt, compared to the classical formula dXt2 = 2Xt dXt
that one would have in the case of a continuously differentiable process t 7→ Xt . We will come back to it!


3.2 Quadratic variation of an Itô process


Lemma 3.1 (Quadratic variation of an Itô process). Let X be an Itô process with stochastic differential

    dXt = ϕt dBt + ψt dt.        (3.8)

Then for any subdivision (t_k^n)_{0≤k≤n} of [0, t] with ∆n := max_{0≤k<n} (t_{k+1}^n − t_k^n) → 0 as n → ∞, we have

    Σ_{k=0}^{n−1} ( X_{t_{k+1}^n} − X_{t_k^n} )² → ∫_0^t ϕu² du in probability.        (3.9)

We will naturally denote the right-hand side by ⟨X⟩t , and call t 7→ ⟨X⟩t the quadratic variation of X.
More generally, if X̃ is another Itô process, with dX̃t = ϕ̃t dBt + ψ̃t dt, then

    Σ_{k=0}^{n−1} ( X_{t_{k+1}^n} − X_{t_k^n} )( X̃_{t_{k+1}^n} − X̃_{t_k^n} ) → ⟨X, X̃⟩t = ∫_0^t ϕu ϕ̃u du in probability.

We call t 7→ ⟨X, X̃⟩t the quadratic covariation of X and X̃, and write d⟨X, X̃⟩t = ϕt ϕ̃t dt.

Proof. We only have to prove the first claim, since the second follows by polarization. Now,
when $\psi = 0$, X is a continuous local martingale with quadratic variation $t\mapsto\int_0^t \varphi_u^2\,du$, so the
claim is Proposition 1.11. The general case then easily follows from the observation that
\[
\left| \sum_{k=0}^{n-1} (Y_{t^n_{k+1}} - Y_{t^n_k})(Z_{t^n_{k+1}} - Z_{t^n_k}) \right| \le V(Y,0,t)\sup_{u,v\in[0,t],\,|u-v|\le\Delta_n}|Z_u - Z_v| \xrightarrow[n\to\infty]{\ \mathrm{a.-s.}\ } 0,
\]
whenever Y has finite variation and Z is continuous (almost-surely).
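Lemma 3.1 lends itself to the same kind of numerical check. The sketch below uses the illustrative Itô process $X_t = \int_0^t B_u\,dB_u = (B_t^2-t)/2$ (so $\varphi_u = B_u$ and $\langle X\rangle_1 = \int_0^1 B_u^2\,du$), and compares the sum of squared increments of X along a fine grid with a Riemann sum for $\int_0^1 \varphi_u^2\,du$:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
dt = 1.0 / n
dB = rng.normal(0.0, np.sqrt(dt), n)
B = np.concatenate(([0.0], np.cumsum(dB)))
t = np.linspace(0.0, 1.0, n + 1)

X = (B ** 2 - t) / 2.0                  # X_t = int_0^t B_u dB_u, so phi_u = B_u

sq_incr = np.sum(np.diff(X) ** 2)       # sum of squared increments of X
qv = np.sum(B[:-1] ** 2) * dt           # Riemann sum for <X>_1 = int_0^1 B_u^2 du

print(sq_incr, qv)                      # close for a fine subdivision
```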

Proposition 3.2 (Stochastic integration by parts). If X, Y are Itô processes, then so is $(X_tY_t)_{t\ge0}$, and
\[
d(X_tY_t) = X_t\,dY_t + Y_t\,dX_t + d\langle X,Y\rangle_t.
\]

Proof. Fix $t\ge0$. Consider subdivisions $(t^n_k)_{0\le k\le n}$ of $[0,t]$ with $\max|t^n_{k+1}-t^n_k|\to0$. We have
\[
X_tY_t - X_0Y_0 = \sum_{k=0}^{n-1}\left( X_{t^n_{k+1}}Y_{t^n_{k+1}} - X_{t^n_k}Y_{t^n_k} \right)
= \sum_{k=0}^{n-1} X_{t^n_k}\left( Y_{t^n_{k+1}} - Y_{t^n_k} \right) + \sum_{k=0}^{n-1} Y_{t^n_k}\left( X_{t^n_{k+1}} - X_{t^n_k} \right) + \sum_{k=0}^{n-1}\left( X_{t^n_{k+1}} - X_{t^n_k} \right)\left( Y_{t^n_{k+1}} - Y_{t^n_k} \right),
\]
thanks to the identity $x'y' - xy = x(y'-y) + y(x'-x) + (x'-x)(y'-y)$. Letting $n\to\infty$ yields
\[
X_tY_t - X_0Y_0 = \int_0^t X_u\,dY_u + \int_0^t Y_u\,dX_u + \langle X,Y\rangle_t, \tag{3.10}
\]
by the above Lemma and the convergence (3.7) of Remark 3.3.

Remark 3.4 (Itô term). Here again, note the extra covariation term d⟨ X, Y ⟩t , compared to the classical
integration-by-parts formula d( Xt Yt ) = Xt dYt + Yt dXt for continuously differentiable trajectories.

Remark 3.5 (Squaring). In particular, if X is an Itô process, then so is $X^2$ and
\[
dX_t^2 = 2X_t\,dX_t + d\langle X\rangle_t, \tag{3.11}
\]
thereby generalizing the Brownian case studied in Example 2.1.


3.3 Itô’s Formula


We have seen how to differentiate the square of an Itô process. In fact, the square may be
replaced with any smooth function, thanks to the following fundamental formula, which is the
stochastic analogue of the classical rule dF ( Xt ) = F ′ ( Xt ) dXt for differentiating a composed
function. The stochastic version contains an extra term, due to the quadratic variation of X.
Theorem 3.1 (Itô's Formula). Consider an Itô process X, and a function $F\in\mathcal{C}^2(\mathbb{R})$. Then, the process
$(F(X_t))_{t\ge0}$ is again an Itô process, with stochastic differential
\[
dF(X_t) = F'(X_t)\,dX_t + \frac12 F''(X_t)\,d\langle X\rangle_t. \tag{3.12}
\]
Proof. As above, we fix $t\ge0$ and consider subdivisions $(t^n_k)_{0\le k\le n}$ of $[0,t]$ with $\max|t^n_{k+1}-t^n_k|\to0$. Since $F\in\mathcal{C}^2(\mathbb{R})$, we have the second-order Taylor expansion
\[
F(X_t) - F(X_0) = \sum_{k=0}^{n-1} F(X_{t^n_{k+1}}) - F(X_{t^n_k})
= \sum_{k=0}^{n-1} F'(X_{t^n_k})\left( X_{t^n_{k+1}} - X_{t^n_k} \right) + \frac12\sum_{k=0}^{n-1} F''(X_{U^n_k})\left( X_{t^n_{k+1}} - X_{t^n_k} \right)^2,
\]
for some $U^n_k\in[t^n_k, t^n_{k+1}]$. By the convergence (3.7) of Remark 3.3, we know that
\[
\sum_{k=0}^{n-1} F'(X_{t^n_k})\left( X_{t^n_{k+1}} - X_{t^n_k} \right) \xrightarrow[n\to\infty]{\ \mathbb{P}\ } \int_0^t F'(X_u)\,dX_u.
\]
Thus, it only remains to show that
\[
\sum_{k=0}^{n-1} F''(X_{U^n_k})\left( X_{t^n_{k+1}} - X_{t^n_k} \right)^2 \xrightarrow[n\to\infty]{\ \mathbb{P}\ } \int_0^t F''(X_u)\,d\langle X\rangle_u. \tag{3.13}
\]
By Lemma 3.1, we already know that
\[
\sum_{k=0}^{n-1} Y_{t^n_k}\left( X_{t^n_{k+1}} - X_{t^n_k} \right)^2 \xrightarrow[n\to\infty]{\ \mathbb{P}\ } \int_0^t Y_u\,d\langle X\rangle_u,
\]
in the elementary case where $Y_u = \mathbf{1}_{(0,s]}(u)$, for any $s\ge0$. By linearity, this immediately extends
to the case where Y is a random step function. By density, it further extends to the case where Y
is any continuous and adapted process. In particular, we may take $Y_u = F''(X_u)$, and this suffices
to yield (3.13), since $\max_{0\le k<n}|F''(X_{U^n_k}) - F''(X_{t^n_k})|\to0$ a.-s. (by uniform continuity).

The Itô formula admits a multivariate extension, allowing to combine several Itô processes.
Theorem 3.2 (Multivariate extension). Let $F\in\mathcal{C}^2(\mathbb{R}^d)$, and let $X^1,\dots,X^d$ be Itô processes. Then,
the process $t\mapsto F(X^1_t,\dots,X^d_t)$ is again an Itô process, with
\[
dF(X^1_t,\dots,X^d_t) = \sum_{i=1}^d \frac{\partial F}{\partial x_i}(X^1_t,\dots,X^d_t)\,dX^i_t + \frac12\sum_{i=1}^d\sum_{j=1}^d \frac{\partial^2 F}{\partial x_i\partial x_j}(X^1_t,\dots,X^d_t)\,d\langle X^i,X^j\rangle_t.
\]
Proof. The argument is the same as above, with the multivariate version of the Taylor expansion:
\[
F(y) = F(x) + \sum_{i=1}^d \frac{\partial F}{\partial x_i}(x)(y_i - x_i) + \frac12\sum_{i=1}^d\sum_{j=1}^d \frac{\partial^2 F}{\partial x_i\partial x_j}(z)(y_i - x_i)(y_j - x_j),
\]
valid for any $x,y\in\mathbb{R}^d$ and some $z\in\mathrm{Conv}(x,y)$. More precisely, we here take $x = (X^1_{t^n_k},\dots,X^d_{t^n_k})$
and $y = (X^1_{t^n_{k+1}},\dots,X^d_{t^n_{k+1}})$, then sum over $0\le k\le n-1$, and finally let $n\to\infty$.


Remark 3.6 (Special cases). Here are a few special cases of interest.

1. One recovers the integration-by-parts formula by taking F ( x, y) = xy.

2. One can add a time dependency by letting one of the Itô processes be $t\mapsto t$. For example,
\[
dF(t,X_t) = \frac{\partial F}{\partial x}(t,X_t)\,dX_t + \frac{\partial F}{\partial t}(t,X_t)\,dt + \frac12\,\frac{\partial^2 F}{\partial x^2}(t,X_t)\,d\langle X\rangle_t,
\]
for any $F\in\mathcal{C}^2(\mathbb{R}^2)$. Note that $t\mapsto t$ does not contribute to the last term (it has finite variation).

In view of Remark 3.2, Itô's Formula is extremely useful for finding martingales. Here is a
typical exercise to become familiar with this powerful technique.

Exercise 3.1 (Practicing with Itô's Formula). In each of the following cases, compute the stochastic
differential of the process $M = (M_t)_{t\ge0}$, and deduce that it is a martingale.

1. $M_t := B_t^2 - t$.

2. $M_t := B_t^3 - 3tB_t$.

3. $M_t := B_t^4 - 6tB_t^2 + 3t^2$.

4. $M_t := B_t^5 - 10tB_t^3 + 15t^2B_t$.

5. $M_t := \exp\left( \theta B_t - \frac{\theta^2}{2}t \right)$, with $\theta\in\mathbb{R}$.

6. $M_t := \cos(\theta B_t)\,e^{\frac{\theta^2}{2}t}$, with $\theta\in\mathbb{R}$.

7. $M_t := f(t,B_t)$, where $f\in\mathcal{C}^2(\mathbb{R}^2)$ satisfies an appropriate condition.

8. $M_t := B_t^n - \binom{n}{2}\int_0^t B_u^{n-2}\,du$, where $n\ge2$ is any integer.
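For items 1-6, the martingale property comes from the fact that each $M_t$ has the form $f(t,B_t)$ with f solving the backward heat equation $\partial_t f + \frac12\partial_x^2 f = 0$, which is precisely the "appropriate condition" of item 7: by Itô's formula (in the form of Remark 3.6), this condition makes the dt-term vanish. A quick symbolic verification of this PDE for items 1-6, using sympy:

```python
import sympy as sp

t, x, theta = sp.symbols('t x theta', positive=True)

candidates = [
    x**2 - t,
    x**3 - 3*t*x,
    x**4 - 6*t*x**2 + 3*t**2,
    x**5 - 10*t*x**3 + 15*t**2*x,
    sp.exp(theta*x - theta**2*t/2),
    sp.cos(theta*x) * sp.exp(theta**2*t/2),
]

for f in candidates:
    drift = sp.diff(f, t) + sp.diff(f, x, 2) / 2   # Itô drift of f(t, B_t)
    assert sp.simplify(drift) == 0
print("all drifts vanish")
```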

Exercise 3.2 (A typical exam problem). Let B be a Brownian motion and let F denote the cumulative
distribution function of a standard Gaussian random variable. Consider the process
\[
M_t := F\left( \frac{B_t}{\sqrt{1-t}} \right), \quad t\in[0,1). \tag{3.14}
\]
1. Compute the stochastic differential of M.

2. Show that $M_1 := \lim_{t\to1} M_t$ exists a.-s., and compute it.

3. Prove that $(M_t)_{t\in[0,1]}$ is a martingale.

4. Deduce the probability that B intersects the graph of $t\mapsto\sqrt{1-t}$, $t\in[0,1]$.

5. How did we guess that the Gaussian cumulative distribution function was a good choice for F?


3.4 Exponential martingales


An important property of the Wiener integral was that the process $\big(e^{\int_0^t f(u)\,dB_u - \frac12\int_0^t f(u)^2\,du}\big)_{t\ge0}$ is a
martingale, for any $f\in L^2_{\mathrm{LOC}}$. The argument used to prove this relied on a specific Gaussian
computation, which no longer applies in the Itô case. However, an elementary application of
Itô’s formula yields the following important result.
Lemma 3.2 (Doléans-Dade exponential). For any $\varphi\in\mathcal{M}^2_{\mathrm{LOC}}$, the process $Z^\varphi = (Z^\varphi_t)_{t\ge0}$ defined by
\[
Z^\varphi_t := \exp\left( \int_0^t \varphi_u\,dB_u - \frac12\int_0^t \varphi_u^2\,du \right), \tag{3.15}
\]
is a local martingale.
Proof. Applying Itô's formula with $F = \exp$ and $X_t = \int_0^t \varphi_u\,dB_u - \frac12\int_0^t \varphi_u^2\,du$ yields
\[
dZ^\varphi_t = e^{X_t}\,dX_t + \frac12 e^{X_t}\,d\langle X\rangle_t
= e^{X_t}\left( \varphi_t\,dB_t - \frac12\varphi_t^2\,dt \right) + \frac12 e^{X_t}\varphi_t^2\,dt
= e^{X_t}\varphi_t\,dB_t.
\]
Since $Z^\varphi_0 = 1$, we obtain
\[
\forall t\ge0, \quad Z^\varphi_t = 1 + \int_0^t Z^\varphi_u\varphi_u\,dB_u,
\]
and the result follows from the general properties of the Itô integral.
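A true martingale on $[0,t]$ must satisfy $\mathbb{E}[Z^\varphi_t] = 1$, which gives a cheap numerical diagnostic. Here is a Monte Carlo sketch with the illustrative bounded integrand $\varphi_u = \cos(B_u)$ (boundedness makes Novikov's condition of Theorem 3.3 below trivially hold), using an Euler discretization of $\log Z^\varphi$:

```python
import numpy as np

rng = np.random.default_rng(2)
paths, n, t = 50_000, 400, 1.0
dt = t / n

B = np.zeros(paths)
logZ = np.zeros(paths)
for _ in range(n):
    phi = np.cos(B)                          # bounded integrand phi_u = cos(B_u)
    dB = rng.normal(0.0, np.sqrt(dt), paths)
    logZ += phi * dB - 0.5 * phi ** 2 * dt   # Euler step for log Z^phi
    B += dB

mean_Z = np.exp(logZ).mean()
print(mean_Z)                                # should be close to E[Z_1^phi] = 1
```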

For reasons that will become clear in the next section, it is very important to ensure that
$Z^\varphi$ is really a martingale, and not just a local martingale. This holds, for example, when $\varphi$ is
deterministic: the result was proved in the section on Wiener's integral, and can be recovered
by checking that the process $(Z^\varphi_u\varphi_u)_{u\in[0,t]}$ appearing in the stochastic differential of $Z^\varphi$ is in $\mathcal{M}^2$.
The following criterion is much more general, but its proof is considerably more involved.
Theorem 3.3 (Novikov's Condition). Fix $T\in\mathbb{R}_+$. For $(Z^\varphi_t)_{t\in[0,T]}$ to be a martingale, it suffices that
\[
\mathbb{E}\left[ \exp\left( \frac12\int_0^T \varphi_u^2\,du \right) \right] < \infty. \tag{3.16}
\]
In the proof of this theorem, we will use the following elementary lemma.
Lemma 3.3 (Non-negative local martingales). If M = ( Mt )t∈[0,T ] is a non-negative local martingale,
then it is a super-martingale. Moreover, it is a martingale if and only if E[ MT ] ≥ E[ M0 ].
Proof. Let ( Tn )n≥1 be a localizing sequence, and let 0 ≤ s ≤ t ≤ T. For each n ∈ N, we have

E[ MTn ∧t | Fs ] = MTn ∧s .

We now take n → ∞. Since Tn → ∞ a.-s., the conditional version of Fatou’s Lemma yields

E [ Mt | F s ] ≤ Ms , (3.17)

which shows that M is a super-martingale. Now, suppose that E[ MT ] ≥ E[ M0 ]. This forces the
non-increasing map t 7→ E[ Mt ] to be constant on [0, T ]. In particular, for any 0 ≤ s ≤ t ≤ T, the
non-negative variable Ms − E[ Mt |Fs ] has zero mean, hence is null a.-s..


Proof of Theorem 3.3. Fix $0<\varepsilon<1$. It is straightforward to check that for all $0\le t\le T$,
\[
\left( Z^{(1-\varepsilon)\varphi}_t \right)^{\frac{1}{1-\varepsilon^2}} = \left( Z^{\varphi}_t \right)^{\frac{1}{1+\varepsilon}}\left( e^{\frac{(1+\varepsilon)(1-\varepsilon)^2}{2}\int_0^t \varphi_u^2\,du} \right)^{\frac{\varepsilon}{1+\varepsilon}} \le \left( Z^{\varphi}_t \right)^{\frac{1}{1+\varepsilon}}\left( e^{\frac12\int_0^t \varphi_u^2\,du} \right)^{\frac{\varepsilon}{1+\varepsilon}}.
\]
In particular, we may choose $t = T\wedge T_n$, where $(T_n)_{n\ge1}$ is a localizing sequence for $Z^{(1-\varepsilon)\varphi}$.
Taking expectations, and invoking Hölder's inequality, we arrive at
\[
\mathbb{E}\left[ \left( Z^{(1-\varepsilon)\varphi}_{T\wedge T_n} \right)^{\frac{1}{1-\varepsilon^2}} \right] \le \mathbb{E}\left[ Z^{\varphi}_{T\wedge T_n} \right]^{\frac{1}{1+\varepsilon}}\mathbb{E}\left[ e^{\frac12\int_0^{T\wedge T_n}\varphi_u^2\,du} \right]^{\frac{\varepsilon}{1+\varepsilon}}. \tag{3.18}
\]
Since $\mathbb{E}[Z^{\varphi}_{T\wedge T_n}] \le 1$ (a non-negative local martingale is a supermartingale, by the lemma below), the right-hand side is bounded by $\mathbb{E}\big[ e^{\frac12\int_0^{T}\varphi_u^2\,du} \big]^{\frac{\varepsilon}{1+\varepsilon}}$, independently of n.
This means that the sequence $(Z^{(1-\varepsilon)\varphi}_{T\wedge T_n})_{n\ge1}$ is bounded in $L^p$ with $p = \frac{1}{1-\varepsilon^2} > 1$, hence uniformly integrable. Thus,
\[
\mathbb{E}\left[ Z^{(1-\varepsilon)\varphi}_{T} \right] = \lim_{n\to\infty}\mathbb{E}\left[ Z^{(1-\varepsilon)\varphi}_{T\wedge T_n} \right] = 1. \tag{3.19}
\]
In particular, $\mathbb{E}\big[ (Z^{(1-\varepsilon)\varphi}_{T})^p \big] \ge 1$ by Jensen's inequality, so using (3.18) with T instead of $T\wedge T_n$ yields
\[
1 \le \mathbb{E}\left[ Z^{\varphi}_{T} \right]^{\frac{1}{1+\varepsilon}}\mathbb{E}\left[ e^{\frac12\int_0^{T}\varphi_u^2\,du} \right]^{\frac{\varepsilon}{1+\varepsilon}}.
\]
Taking $\varepsilon\to0$ yields $\mathbb{E}[Z^{\varphi}_T]\ge1$, which suffices to conclude, thanks to the above lemma.



3.5 Girsanov’s Theorem


The interest of ensuring that the exponential local martingale Z ϕ is a martingale – at least when
restricted to a finite time horizon [0, T ] – is contained in the following fundamental result.

Theorem 3.4 (Girsanov's Theorem). Fix $\varphi\in\mathcal{M}^2_{\mathrm{LOC}}$ and $T\ge0$, and suppose that the associated
exponential local martingale $(Z^\varphi_t)_{t\in[0,T]}$ is a martingale. Then, the formula
\[
\forall A\in\mathcal{F}_T, \quad \mathbb{Q}(A) := \mathbb{E}\left[ Z^\varphi_T\mathbf{1}_A \right], \tag{3.20}
\]
defines a probability measure on $(\Omega,\mathcal{F}_T)$, under which the process $X = (X_t)_{t\in[0,T]}$ defined by
\[
X_t := B_t - \int_0^t \varphi_u\,du, \tag{3.21}
\]
is an $(\mathcal{F}_t)_{t\in[0,T]}$-Brownian motion (restricted to the time horizon $[0,T]$).

Let us make a number of important comments before proceeding to the proof of this result.

Remark 3.7 (Some useful comments).

1. In practice, the most efficient way to verify the assumption is to check Novikov's Criterion:
\[
\mathbb{E}\left[ e^{\frac12\int_0^T \varphi_u^2\,du} \right] < +\infty. \tag{3.22}
\]

2. The statement that $\mathbb{Q}$ is a probability measure follows from the fact that $Z^\varphi_T\ge0$ and $\mathbb{E}[Z^\varphi_T] = 1$.


3. By linearity and density, (3.20) implies that for any $\mathcal{F}_T$-measurable non-negative variable Y,
\[
\mathbb{E}^{\mathbb{Q}}[Y] = \mathbb{E}\left[ YZ^\varphi_T \right] \quad \text{and} \quad \mathbb{E}[Y] = \mathbb{E}^{\mathbb{Q}}\left[ \frac{Y}{Z^\varphi_T} \right], \tag{3.23}
\]
where $\mathbb{E}^{\mathbb{Q}}$ denotes expectation under $\mathbb{Q}$. This is useful for transferring computations between $\mathbb{Q}$ and $\mathbb{P}$.

4. It follows from the martingale property that we also have $\mathbb{E}^{\mathbb{Q}}[Y] = \mathbb{E}[YZ^\varphi_t]$ for any $t\in[0,T]$ and
any non-negative $\mathcal{F}_t$-measurable random variable Y.

5. The practical interest of Girsanov’s Theorem is as follows: on our original probability space
(Ω, F , P), computing expectations about X is rather complicated. Moving to (Ω, F , Q) turns
X into a much simpler object, for which such computations become doable. One can then try to
transfer the results back to (Ω, F , P), using Formula (3.23). Practical examples will follow...

6. Under $\mathbb{Q}$, the process B is of course no longer a Brownian motion! Consequently, computing
expectations under $\mathbb{Q}$ typically requires expressing all quantities of interest in terms of X only.

7. The result admits the following $T=\infty$ version: suppose that the whole process $Z^\varphi = (Z^\varphi_t)_{t\ge0}$
is a martingale (this is the case, for example, when (3.22) holds for each $t\ge0$). Then, for each
$t\ge0$, the formula (3.20) can be used to define a probability measure $\mathbb{Q}_t$ on $(\Omega,\mathcal{F}_t)$. Moreover, for
$0\le s\le t$, the restriction of $\mathbb{Q}_t$ to $\mathcal{F}_s$ coincides with $\mathbb{Q}_s$, since
\[
\forall A\in\mathcal{F}_s, \quad \mathbb{Q}_t(A) = \mathbb{E}[Z^\varphi_t\mathbf{1}_A] = \mathbb{E}[Z^\varphi_s\mathbf{1}_A] = \mathbb{Q}_s(A),
\]
where the second equality uses the martingale property. Thus, $(\mathbb{Q}_t)_{t\ge0}$ is a consistent family of
probability measures, and the Kolmogorov extension theorem guarantees that these measures are
all restrictions of a common probability measure $\mathbb{Q}_\infty$ defined on $\mathcal{F}_\infty := \sigma\left( \bigcup_{t\ge0}\mathcal{F}_t \right)$. On the
probability space $(\Omega,\mathcal{F}_\infty,\mathbb{Q}_\infty)$, the whole process $X = (X_t)_{t\ge0}$ is then a Brownian motion.
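To see what the theorem buys in the simplest case, take a constant $\varphi\equiv\theta$. Then $Z_T = e^{\theta B_T - \theta^2 T/2}$ and $X_t = B_t - \theta t$, and the theorem asserts that $B_T\sim\mathcal{N}(\theta T, T)$ under $\mathbb{Q}$, i.e. $\mathbb{E}[Z_T\mathbf{1}_{B_T>a}] = \Phi((\theta T - a)/\sqrt{T})$. This identity can be verified by deterministic numerical integration (the values of $\theta$, T, a below are illustrative):

```python
import math
import numpy as np

theta, T, a = 0.7, 2.0, 1.0

# Right-hand side: Q(B_T > a), where B_T ~ N(theta*T, T) under Q
rhs = 0.5 * (1.0 + math.erf((theta * T - a) / math.sqrt(2 * T)))

# Left-hand side: E[Z_T 1_{B_T > a}] under P, where B_T ~ N(0, T);
# midpoint rule on [a, a + 12 sqrt(T)] (the neglected tail is negligible)
N = 200_000
dx = 12 * math.sqrt(T) / N
x = a + dx * (np.arange(N) + 0.5)
density = np.exp(-x ** 2 / (2 * T)) / math.sqrt(2 * math.pi * T)
Z = np.exp(theta * x - theta ** 2 * T / 2)   # Girsanov density Z_T on {B_T = x}
lhs = np.sum(Z * density) * dx

print(lhs, rhs)
```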

We now turn to the proof of Girsanov’s Theorem.

Proof. Let us first settle the special case where $(\varphi_t)_{t\in[0,T]}$ satisfies
\[
\int_0^T \varphi_u^2\,du \le C, \tag{3.24}
\]
for some deterministic $C<\infty$. In particular, for any $\theta\in\mathbb{R}$, the shifted process $\varphi+\theta$ satisfies
Novikov's Criterion, so $(Z^{\varphi+\theta}_t)_{t\in[0,T]}$ is a martingale. Thus, for any $0\le s\le t\le T$, we have
\[
\mathbb{E}\left[ Z^{\varphi+\theta}_t \,\middle|\, \mathcal{F}_s \right] = Z^{\varphi+\theta}_s.
\]
Since $Z^{\varphi+\theta}_u = Z^{\varphi}_u e^{\theta X_u - \frac{\theta^2}{2}u}$ for all $u\ge0$, we may rewrite this as follows:
\[
\mathbb{E}\left[ Z^{\varphi}_t e^{\theta(X_t - X_s)} \,\middle|\, \mathcal{F}_s \right] = e^{\frac{\theta^2}{2}(t-s)} Z^{\varphi}_s.
\]
In other words, for any $A\in\mathcal{F}_s$, we have
\[
\mathbb{E}\left[ Z^{\varphi}_t e^{\theta(X_t - X_s)}\mathbf{1}_A \right] = e^{\frac{\theta^2}{2}(t-s)}\mathbb{E}\left[ Z^{\varphi}_s\mathbf{1}_A \right].
\]
In view of Item 4 in the above remark, this may be further rewritten in terms of $\mathbb{Q}$ as follows:
\[
\mathbb{E}^{\mathbb{Q}}\left[ e^{\theta(X_t - X_s)}\mathbf{1}_A \right] = e^{\frac{\theta^2}{2}(t-s)}\mathbb{Q}(A).
\]


Taking $A=\Omega$ shows that $X_t - X_s$ has distribution $\mathcal{N}(0,t-s)$ under $\mathbb{Q}$, and the product form shows
that $X_t - X_s$ is independent of $\mathcal{F}_s$. But this holds for any $0\le s\le t\le T$, and X is continuous
by construction, so $(X_t)_{t\in[0,T]}$ is indeed an $(\mathcal{F}_t)_{t\in[0,T]}$-Brownian motion under $\mathbb{Q}$. To address the
general case, we of course introduce the truncated process $\varphi^n_t := \varphi_t\mathbf{1}_{T_n\ge t}$, where
\[
T_n := \inf\left\{ t\ge0 : \int_0^t \varphi_u^2\,du \ge n \right\}.
\]
Since $\varphi^n$ satisfies the condition (3.24) (with $C=n$), the first part of the proof implies that
\[
\mathbb{E}\left[ Z^{\varphi}_{t\wedge T_n} e^{i\theta(X^n_t - X^n_s)}\mathbf{1}_A \right] = e^{\frac{-\theta^2}{2}(t-s)}\mathbb{E}\left[ Z^{\varphi}_{s\wedge T_n}\mathbf{1}_A \right],
\]
for all $0\le s\le t\le T$, $\theta\in\mathbb{R}$ and $A\in\mathcal{F}_s$, where $X^n_t := B_t - \int_0^{t\wedge T_n}\varphi_u\,du$. Now, as $n\to\infty$,
we have $T_n\uparrow+\infty$, hence $X^n_t\to X_t$ a.-s. Moreover, by Scheffé's Lemma, the a.-s. convergence
$Z^{\varphi}_{t\wedge T_n}\to Z^{\varphi}_t$ also holds in $L^1$, for all $t\in[0,T]$. Thus, we may pass to the limit and obtain
\[
\mathbb{E}\left[ Z^{\varphi}_t e^{i\theta(X_t - X_s)}\mathbf{1}_A \right] = e^{\frac{-\theta^2}{2}(t-s)}\mathbb{E}\left[ Z^{\varphi}_s\mathbf{1}_A \right].
\]
Recalling the definition of $\mathbb{Q}$, this precisely means that
\[
\mathbb{E}^{\mathbb{Q}}\left[ e^{i\theta(X_t - X_s)}\mathbf{1}_A \right] = e^{\frac{-\theta^2}{2}(t-s)}\mathbb{Q}(A).
\]
The fact that this is true for every $A\in\mathcal{F}_s$ and every $\theta\in\mathbb{R}$ shows that under $\mathbb{Q}$, the random
variable $X_t - X_s$ has law $\mathcal{N}(0,t-s)$ and is independent of $\mathcal{F}_s$, as desired.

3.6 An application
Here is a good technical exercise to practice with Girsanov’s Theorem.
Exercise 3.3 (Joint distribution of $B_t^2$ and $\int_0^t B_s^2\,ds$). In order to understand the joint distribution of $B_t^2$
and $\int_0^t B_s^2\,ds$ for fixed $t\ge0$, one would naturally like to compute the following Laplace transform:
\[
L_t(a,b) := \mathbb{E}\left[ \exp\left( -aB_t^2 - \frac{b^2}{2}\int_0^t B_u^2\,du \right) \right] \qquad (a,b,t\ge0).
\]
1. Compute $L_t(a,0)$ for all $a,t\ge0$. We henceforth assume that $b>0$.

2. Find $\psi\in\mathcal{M}^1_{\mathrm{LOC}}$ so that the process Z defined below is a local martingale:
\[
Z_t := \exp\left( -b\int_0^t B_u\,dB_u - \int_0^t \psi_u\,du \right).
\]
3. Express $Z_t$ in terms of the random variables $B_t$ and $\int_0^t B_u^2\,du$ only, and deduce that
\[
L_t(a,b) = \mathbb{E}\left[ Z_t\exp\left( -\left(a - \frac{b}{2}\right)B_t^2 \right) \right]\exp\left( -\frac{bt}{2} \right).
\]
4. Fix $t\ge0$ and construct a probability measure $\mathbb{Q}_t$ on $(\Omega,\mathcal{F}_t)$ under which the process $W =
(W_s)_{s\in[0,t]}$ defined by $W_s := B_s + b\int_0^s B_u\,du$ is a Brownian motion. Show that for all $s\ge0$,
\[
B_s = \int_0^s e^{b(u-s)}\,dW_u.
\]
5. Determine the law of $B_t$ under $\mathbb{Q}_t$ and deduce the formula
\[
L_t(a,b) = \frac{1}{\sqrt{\cosh(bt) + \frac{2a}{b}\sinh(bt)}}.
\]
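The final formula is a good target for a Monte Carlo sanity check (illustrative values $t=1$, $a=0.3$, $b=1$; the time integral $\int_0^t B_u^2\,du$ is discretized by a left Riemann sum):

```python
import numpy as np

rng = np.random.default_rng(3)
paths, n = 50_000, 400
t, a, b = 1.0, 0.3, 1.0
dt = t / n

B = np.zeros(paths)
int_B2 = np.zeros(paths)
for _ in range(n):
    int_B2 += B ** 2 * dt                 # left Riemann sum for int_0^t B_u^2 du
    B += rng.normal(0.0, np.sqrt(dt), paths)

mc = np.mean(np.exp(-a * B ** 2 - 0.5 * b ** 2 * int_B2))
exact = 1.0 / np.sqrt(np.cosh(b * t) + (2 * a / b) * np.sinh(b * t))
print(mc, exact)
```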

Chapter 4

Stochastic differential equations

4.1 Motivations
An ordinary differential equation (abbreviated as ODE) is an equation involving an unknown
function x = ( xt )t≥0 and its derivative. Such equations are massively used to model physical
processes whose evolution in any infinitesimal time-interval [t, t + dt] only depends on the
considered time t, and the current value xt . In differential notation, they take the form

dxt = b(t, xt ) dt (4.1)

where the function b : R+ × R → R describes the underlying dynamics. The classical Picard-
Lindelöf theorem gives a simple sufficient condition on b for such an equation to be well-posed,
in the sense that it admits a unique solution that starts from each possible initial condition x0 .

Theorem 4.1 (Picard-Lindelöf). Let b : R+ × R → R be a measurable function satisfying

(i) (uniform spatial Lipschitz continuity): there exists a constant $\kappa<\infty$ such that
\[
\forall(t,x,y)\in\mathbb{R}_+\times\mathbb{R}\times\mathbb{R}, \quad |b(t,x) - b(t,y)| \le \kappa|x-y|.
\]
(ii) (local integrability in time): $\int_0^t |b(u,0)|\,du < \infty$ for each $t\ge0$.

Then, for each $z\in\mathbb{R}$, there exists a unique measurable function $x = (x_t)_{t\ge0}$ satisfying
\[
\forall t\ge0, \quad x_t = z + \int_0^t b(u,x_u)\,du. \tag{4.2}
\]

In many interesting situations however, the dynamics is intrinsically chaotic and unpre-
dictable: it is then natural to add a random external influence to the above evolution equation,
typically driven by a Brownian motion B = ( Bt )t≥0 . This naturally leads to the following
stochastic analogue of (4.1):

dXt = b(t, Xt ) dt + σ(t, Xt ) dBt ,

where $X = (X_t)_{t\ge0}$ is an (unknown) stochastic process, and $b : \mathbb{R}_+\times\mathbb{R}\to\mathbb{R}$ and $\sigma : \mathbb{R}_+\times\mathbb{R}\to\mathbb{R}$
are deterministic functions called the drift and diffusion coefficients, respectively. The
mathematical study of such stochastic differential equations (SDE) is a rich and active topic, to
which the present chapter only constitutes a modest introduction.


4.2 Existence and uniqueness


Our first task is to establish a stochastic analogue of the Picard-Lindelöf theorem, giving a simple
sufficient condition for the well-posedness of a stochastic differential equation of the form

dXt = b(t, Xt ) dt + σ(t, Xt ) dBt , (4.3)

where B = ( Bt )t≥0 is a given (Ft )t≥0 −Brownian motion on our filtered space (Ω, F , (Ft )t≥0 , P),
and $b : \mathbb{R}_+\times\mathbb{R}\to\mathbb{R}$ and $\sigma : \mathbb{R}_+\times\mathbb{R}\to\mathbb{R}$ are two given measurable functions. By a solution
to the stochastic equation (4.3), we will mean a progressive process $X = (X_t)_{t\ge0}$ defined on
$(\Omega,\mathcal{F},(\mathcal{F}_t)_{t\ge0},\mathbb{P})$, satisfying $(b(t,X_t))_{t\ge0}\in\mathcal{M}^1_{\mathrm{LOC}}$, $(\sigma(t,X_t))_{t\ge0}\in\mathcal{M}^2_{\mathrm{LOC}}$, and
\[
\forall t\ge0, \quad X_t = X_0 + \int_0^t b(s,X_s)\,ds + \int_0^t \sigma(s,X_s)\,dB_s. \tag{4.4}
\]

Note that X is then necessarily an Itô process (in particular, it is continuous and adapted).

Theorem 4.2 (Existence and uniqueness). Let $b,\sigma : \mathbb{R}_+\times\mathbb{R}\to\mathbb{R}$ be measurable functions such that

(i) (Uniform Lipschitz continuity in space): there exists $\kappa<\infty$ such that for all $(t,x,y)\in\mathbb{R}_+\times\mathbb{R}^2$,
\[
|b(t,x) - b(t,y)| \le \kappa|x-y| \quad \text{and} \quad |\sigma(t,x) - \sigma(t,y)| \le \kappa|x-y|.
\]
(ii) (Local square-integrability in time): for all $t\ge0$,
\[
\int_0^t |b(u,0)|^2\,du < \infty \quad \text{and} \quad \int_0^t |\sigma(u,0)|^2\,du < \infty.
\]
Then, for each initial condition $\zeta\in L^2(\Omega,\mathcal{F}_0,\mathbb{P})$, there exists a unique (up to indistinguishability)
solution X to the SDE (4.3) satisfying $X_0 = \zeta$. Moreover, we have $X\in\mathcal{M}^2$.

As in the proof of the Picard-Lindelöf theorem, the uniqueness uses Gronwall’s lemma:

Lemma 4.1 (Gronwall's Lemma). Let $(x_t)_{t\in[0,T]}$ be a non-negative function in $L^1([0,T])$ satisfying
\[
\forall t\in[0,T], \quad x_t \le \alpha + \beta\int_0^t x_u\,du,
\]
for some constants $\alpha,\beta\ge0$. Then, $x_t \le \alpha e^{\beta t}$ for all $t\in[0,T]$.

Proof. Set $\kappa := \alpha + \beta\int_0^T x_u\,du$. An immediate induction shows that
\[
\forall t\in[0,T], \quad x_t \le \alpha\sum_{k=0}^{n-1}\frac{(\beta t)^k}{k!} + \kappa\,\frac{(\beta t)^n}{n!},
\]
for every $n\in\mathbb{N}$. Sending $n\to\infty$ yields the result.

Proof of uniqueness in Theorem 4.2. Suppose that X, Y are two solutions of (4.3) satisfying $X_0 =
Y_0 = \zeta$. Fix $n\in\mathbb{N}$, and set $T_n := \inf\{ t\ge0 : \int_0^t (X_u-Y_u)^2\,du \ge n \}$. For $t\ge0$, we have
\[
\mathbb{E}\left[ \left( \int_0^{t\wedge T_n}\sigma(u,X_u)\,dB_u - \int_0^{t\wedge T_n}\sigma(u,Y_u)\,dB_u \right)^2 \right] \le \mathbb{E}\left[ \int_0^{t\wedge T_n}(\sigma(u,X_u) - \sigma(u,Y_u))^2\,du \right]
\le \kappa^2\,\mathbb{E}\left[ \int_0^{t\wedge T_n}(X_u - Y_u)^2\,du \right].
\]
On the other hand, by Cauchy-Schwarz,
\[
\mathbb{E}\left[ \left( \int_0^{t\wedge T_n} b(u,X_u)\,du - \int_0^{t\wedge T_n} b(u,Y_u)\,du \right)^2 \right] \le t\,\mathbb{E}\left[ \int_0^{t\wedge T_n}(b(u,X_u) - b(u,Y_u))^2\,du \right]
\le \kappa^2 t\,\mathbb{E}\left[ \int_0^{t\wedge T_n}(X_u - Y_u)^2\,du \right].
\]
Summing these two estimates and using $(u+v)^2 \le 2u^2 + 2v^2$, we obtain that
\[
\mathbb{E}\left[ (X_t - Y_t)^2\mathbf{1}_{(t\le T_n)} \right] \le 2\kappa^2(t+1)\int_0^t \mathbb{E}\left[ (X_u - Y_u)^2\mathbf{1}_{(u\le T_n)} \right] du. \tag{4.5}
\]
Note that the right-hand side is finite, by definition of $T_n$. Thus, we may invoke Gronwall's
Lemma with $\alpha = 0$, $\beta = 2\kappa^2(T+1)$ and $x_t = \mathbb{E}\left[ (X_t - Y_t)^2\mathbf{1}_{(t\le T_n)} \right]$ to conclude that
\[
\forall t\ge0, \quad \mathbb{E}\left[ (X_t - Y_t)^2\mathbf{1}_{(t\le T_n)} \right] = 0.
\]
Sending $n\to\infty$ yields $X\equiv Y$, as desired.

Proof of existence in Theorem 4.2. Let us construct a sequence of approximate solutions $(X^n)_{n\ge0}$
in $\mathcal{M}^2$ by setting $X^0\equiv0$ and then inductively, for each $n\in\mathbb{N}$ and each $t\in\mathbb{R}_+$,
\[
X^{n+1}_t := \zeta + \int_0^t \sigma(u,X^n_u)\,dB_u + \int_0^t b(u,X^n_u)\,du. \tag{4.6}
\]
Let us first check that this makes sense. Clearly, $X^0\equiv0\in\mathcal{M}^2$. Now, fix $n\in\mathbb{N}$, and suppose we
know that $X^n\in\mathcal{M}^2$. Then both integrands in (4.6) are in $\mathcal{M}^2$, because our assumptions imply
\[
|b(u,X^n_u)|^2 \le 4|b(u,0)|^2 + 4\kappa^2(X^n_u)^2 \quad \text{and} \quad |\sigma(u,X^n_u)|^2 \le 4|\sigma(u,0)|^2 + 4\kappa^2(X^n_u)^2.
\]
Since $t\mapsto\int_0^t \varphi_u\,du$ and $t\mapsto\int_0^t \varphi_u\,dB_u$ are in $\mathcal{M}^2$ whenever $\varphi\in\mathcal{M}^2$, we conclude that $X^{n+1}\in
\mathcal{M}^2$, so our induction makes sense. Now, by a similar argument as for (4.5), we have for $n\ge1$,
\[
\forall t\in[0,T], \quad \mathbb{E}\left[ \left( X^{n+1}_t - X^n_t \right)^2 \right] \le C_T\int_0^t \mathbb{E}\left[ \left( X^n_u - X^{n-1}_u \right)^2 \right] du,
\]
where $C_T = 4\kappa^2(T+1)$. By an immediate induction, this implies
\[
\mathbb{E}\left[ \left( X^{n+1}_t - X^n_t \right)^2 \right] \le \frac{M_T\,C_T^n\,t^{n-1}}{(n-1)!},
\]
where $M_T := \int_0^T \mathbb{E}\left[ \left( X^1_u - X^0_u \right)^2 \right] du$. This is more than enough to guarantee that
\[
\sum_{n=0}^{\infty}\left\| X^{n+1} - X^n \right\|_{\mathcal{M}^2([0,T])} < \infty, \tag{4.7}
\]
and hence that the sequence $(X^n)_{n\ge0}$ is convergent in the Hilbert space $\mathcal{M}^2([0,T])$. But this is
true for each $T\ge0$, so the limit is an element $X\in\mathcal{M}^2$, and passing to the limit in (4.6) yields
\[
X_t = \zeta + \int_0^t \sigma(u,X_u)\,dB_u + \int_0^t b(u,X_u)\,du, \tag{4.8}
\]
for all $t\ge0$, as desired.


Remark 4.1 (Useful comments). There are several things to note about the theorem.

1. Condition (ii) is only used in the proof of existence: it is not needed for the uniqueness part.

2. Thanks to Condition (i), the measurability of b, σ only needs to be checked w.r.t. the time variable.

3. If Conditions (i) and (ii) are only satisfied on some restricted time horizon [0, T ], then the above
proof still yields existence and uniqueness of a restricted solution X = ( Xt )t∈[0,T ] .

4. In particular, the conclusion of the theorem remains valid if the Lipschitz constant κ = κt appearing
in Condition (i) is allowed to depend on time t, as long as supt∈[0,T ] κt < ∞ for each T ≥ 0.

5. Condition (ii) trivially holds in the homogeneous case where the coefficients b(t, x ), σ(t, x ) do not
depend on the time t and more generally, when they depend continuously on t.

6. Our construction shows that for each $t\ge0$, $X_t$ is $\sigma(\zeta,(B_s)_{0\le s\le t})$-measurable. In other words,
\[
X_t = \Psi_t\left( \zeta, (B_s)_{s\in[0,t]} \right), \tag{4.9}
\]
for some measurable $\Psi_t : \mathbb{R}\times\mathbb{R}^{[0,t]}\to\mathbb{R}$ which only depends on t and the coefficients b and $\sigma$.
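In practice, solutions of well-posed SDEs such as (4.3) are rarely computed through the Picard iterates (4.6); one instead simulates them with the standard Euler-Maruyama scheme (not covered in these notes), which freezes b and σ on each small time step. A minimal sketch, with illustrative coefficients:

```python
import numpy as np

def euler_maruyama(b, sigma, x0, T, n, rng):
    """Simulate one path of dX = b(t,X) dt + sigma(t,X) dB on [0,T] with n steps."""
    dt = T / n
    X = np.empty(n + 1)
    X[0] = x0
    for k in range(n):
        t = k * dt
        dB = rng.normal(0.0, np.sqrt(dt))      # Brownian increment over [t, t+dt]
        X[k + 1] = X[k] + b(t, X[k]) * dt + sigma(t, X[k]) * dB
    return X

rng = np.random.default_rng(4)
# Illustrative coefficients: a mean-reverting drift and constant noise
path = euler_maruyama(lambda t, x: -x, lambda t, x: 0.5, x0=1.0, T=5.0, n=1000, rng=rng)
print(path[-1])
```

Under the Lipschitz conditions of Theorem 4.2, this scheme is known to converge to the true solution as the mesh goes to 0.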

Exercise 4.1 (Dependence in the initial condition). Consider the homogeneous SDE

dXt = b( Xt ) dt + σ( Xt ) dBt ,

where b, σ are Lipschitz functions. Let X, Y be the solutions starting from X0 , Y0 ∈ L2 (Ω, F0 , P), and

\[
\psi_t := \left( \frac{b(X_t) - b(Y_t)}{X_t - Y_t} \right)\mathbf{1}_{(X_t\ne Y_t)}, \quad \text{and} \quad \varphi_t := \left( \frac{\sigma(X_t) - \sigma(Y_t)}{X_t - Y_t} \right)\mathbf{1}_{(X_t\ne Y_t)}. \tag{4.10}
\]
1. Establish the following identity: almost-surely, for all $t\ge0$,
\[
X_t - Y_t = (X_0 - Y_0)\exp\left( \int_0^t \left( \psi_u - \frac{\varphi_u^2}{2} \right) du + \int_0^t \varphi_u\,dB_u \right). \tag{4.11}
\]

2. Deduce that almost-surely, the overlap {t ≥ 0 : Xt = Yt } is either equal to ∅, or to R+ .

3. Prove the existence of a constant $c\in(0,\infty)$ such that for all $t\ge0$ and all $p\ge1$,
\[
\mathbb{E}\left[ (X_t - Y_t)^p \right] \le \mathbb{E}\left[ (X_0 - Y_0)^p \right] e^{cp^2t}.
\]

4.3 Practical examples


Example 4.1 (Langevin equation). The following SDE was proposed by Paul Langevin in 1908 to
describe the random motion of a small particle in a fluid, due to collisions with the surrounding molecules:

dXt = −bXt dt + σ dBt , (4.12)

with $b,\sigma\in(0,\infty)$. This is a homogeneous SDE with $b(t,x) = -bx$ and $\sigma(t,x) = \sigma$. The above
theorem ensures existence and uniqueness, for any initial condition $\zeta\in L^2(\Omega,\mathcal{F}_0,\mathbb{P})$. In fact,
\[
\forall t\ge0, \quad X_t = \zeta e^{-bt} + \sigma\int_0^t e^{-b(t-u)}\,dB_u, \tag{4.13}
\]


as can be checked by differentiating. Let us investigate the long-term behavior of X: the first term on the
right-hand side tends to 0 a.-s. as $t\to\infty$, while the second term has law $\mathcal{N}\left(0,\frac{\sigma^2}{2b}(1-e^{-2bt})\right)$, so
\[
X_t \xrightarrow[t\to\infty]{\ d\ } \mathcal{N}\left(0,\frac{\sigma^2}{2b}\right),
\]
independently of the choice of the initial condition $\zeta$. Thus, the process X mixes: as time increases, the
random variable $X_t$ progressively forgets its initial distribution, and approaches a limit $\mathcal{N}\left(0,\frac{\sigma^2}{2b}\right)$.
This observation suggests starting directly from $\zeta\sim\mathcal{N}\left(0,\frac{\sigma^2}{2b}\right)$. Recall that $\zeta$ is always assumed to
be $\mathcal{F}_0$-measurable, hence independent of B. The right-hand side of (4.13) then belongs to the Gaussian
space $\mathrm{vect}(\zeta,B)$, so X is a Gaussian process. Its mean is clearly 0, and its covariance is easily computed:
\[
\forall s,t\ge0, \quad \mathrm{Cov}(X_s,X_t) = \frac{\sigma^2}{2b}\,e^{-b|t-s|}. \tag{4.14}
\]
A continuous centered Gaussian process with this covariance is called an Ornstein-Uhlenbeck process.
Since its covariance only depends on $|t-s|$, its distribution is invariant under time-translation:
\[
\forall a\ge0, \quad (X_{t+a})_{t\ge0} \stackrel{d}{=} (X_t)_{t\ge0}.
\]
This stationarity is a key property, which explains the importance of the Ornstein-Uhlenbeck process.
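The mixing described above can be watched numerically. The sketch below propagates many paths with the exact Gaussian transition implied by (4.13), namely $X_{t+\Delta} = X_t e^{-b\Delta} + \mathcal{N}\big(0,\frac{\sigma^2}{2b}(1-e^{-2b\Delta})\big)$, starting far from equilibrium, and compares the empirical variance at a large time with $\frac{\sigma^2}{2b}$ (parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
b, sigma = 2.0, 1.0
dt, steps, paths = 0.01, 1000, 20_000      # horizon t = 10 >> relaxation time 1/b

decay = np.exp(-b * dt)
noise_sd = sigma * np.sqrt((1 - np.exp(-2 * b * dt)) / (2 * b))

X = np.full(paths, 3.0)                    # deterministic start, far from equilibrium
for _ in range(steps):
    X = X * decay + noise_sd * rng.normal(size=paths)

print(X.mean(), X.var(), sigma ** 2 / (2 * b))
```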
Example 4.2 (Geometric Brownian motion). Fix ζ ∈ L2 (Ω, F0 , P), σ, µ ∈ R, and consider the SDE
dXt = Xt (σ dBt + µ dt) , X0 = ζ.
This homogeneous SDE with coefficients b(t, x ) = µx and σ(t, x ) = σx has a unique solution X =
( Xt )t≥0 . In light of what the answer would be in the deterministic case σ = 0, it is natural to expect a
solution of the form $X_t = \zeta e^{Y_t}$, where Y is an Itô process. By Itô's formula,
\[
d\left( \zeta e^{Y_t} \right) = \zeta e^{Y_t}\left( dY_t + \frac12\,d\langle Y\rangle_t \right).
\]
Writing $dY_t = \varphi_t\,dB_t + \psi_t\,dt$ and identifying, we see that $\varphi_t = \sigma$ and $\psi_t = \mu - \frac{\sigma^2}{2}$, yielding finally
\[
X_t := \zeta\,e^{\sigma B_t + \left( \mu - \frac{\sigma^2}{2} \right)t}.
\]
This important process is known as the geometric Brownian motion.
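One can double-check symbolically that this closed form solves the SDE: writing $X_t = f(t,B_t)$ with $f(t,x) = \zeta e^{\sigma x + (\mu-\sigma^2/2)t}$, Itô's formula (Remark 3.6, case 2) gives $dX_t = (\partial_t f + \frac12\partial_x^2 f)(t,B_t)\,dt + \partial_x f(t,B_t)\,dB_t$, and these coefficients must equal $\mu X_t$ and $\sigma X_t$. With sympy:

```python
import sympy as sp

t, x = sp.symbols('t x')
zeta, sigma, mu = sp.symbols('zeta sigma mu', positive=True)

f = zeta * sp.exp(sigma * x + (mu - sigma ** 2 / 2) * t)   # candidate solution f(t, B_t)

drift = sp.diff(f, t) + sp.diff(f, x, 2) / 2               # dt-coefficient from Itô
diffusion = sp.diff(f, x)                                  # dB-coefficient from Itô

assert sp.simplify(drift - mu * f) == 0
assert sp.simplify(diffusion - sigma * f) == 0
print("geometric Brownian motion solves the SDE")
```

(The positivity assumptions on the symbols are only there to help sympy simplify; the identity holds for any real μ.)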
Example 4.3 (Black-Scholes process). Fix ζ ∈ L2 (Ω, F0 , P) and two deterministic measurable
bounded functions σ = (σt )t≥0 and µ = (µt )t≥0 . Consider the inhomogeneous SDE
dXt = Xt (σt dBt + µt dt) , X0 = ζ.
The coefficients $b(t,x) = \mu_t x$ and $\sigma(t,x) = \sigma_t x$ satisfy the Lipschitz and square-integrability conditions,
thanks to the boundedness of $\sigma,\mu$. Thus, there is a unique solution X. As in the above example, it is
natural to expect that $X_t = \zeta e^{Y_t}$, where Y is an Itô process. Writing $dY_t = \varphi_t\,dB_t + \psi_t\,dt$, we have
\[
d\left( \zeta e^{Y_t} \right) = \zeta e^{Y_t}\left( \varphi_t\,dB_t + \psi_t\,dt + \frac12\varphi_t^2\,dt \right).
\]
Thus, it suffices to choose $\varphi_t = \sigma_t$ and $\psi_t = \mu_t - \frac{\sigma_t^2}{2}$, yielding finally
\[
X_t = \zeta\exp\left( \int_0^t \sigma_u\,dB_u + \int_0^t \left( \mu_u - \frac{\sigma_u^2}{2} \right) du \right).
\]
This natural generalization of the geometric Brownian motion is known as a Black-Scholes process.


Exercise 4.2 (Change of variable). Show that there is a unique Itô process $X = (X_t)_{t\ge0}$ satisfying
\[
dX_t = \left( \sqrt{1+X_t^2} + \frac{X_t}{2} \right) dt + \sqrt{1+X_t^2}\,dB_t, \qquad X_0 = x,
\]
then determine it explicitly by means of the change of variable $Y_t = \mathrm{argsh}(X_t)$.

4.4 Markov property for diffusions


From now on, we focus on the homogeneous case (diffusions). Specifically, we fix two Lipschitz
functions b, σ : R → R, and we let X = ( Xt )t≥0 be the unique solution to the well-posed SDE

\[
\begin{cases}
dX_t = b(X_t)\,dt + \sigma(X_t)\,dB_t, \\
X_0 = \zeta \in L^2(\Omega,\mathcal{F}_0,\mathbb{P}).
\end{cases} \tag{4.15}
\]
The process X enjoys a fundamental memoryless property, which can be described as independence
between the past and future, given the present. To formalize this, let us recall from (4.9) that we have
Xt = Ψt (ζ, ( Bu )u∈[0,t] ), (4.16)

for some deterministic, measurable map $\Psi_t : \mathbb{R}\times\mathbb{R}^{[0,t]}\to\mathbb{R}$ which only depends on the coefficients
b and $\sigma$. The following result shows that for any fixed time $s\ge0$, the shifted process
$\widetilde X = (X_{t+s})_{t\ge0}$ solves an SDE with the same coefficients b and $\sigma$, but initialized with $\widetilde X_0 = X_s$ and
driven by the shifted Brownian motion $\widetilde B = (B_{u+s} - B_s)_{u\ge0}$ (which is independent of $\mathcal{F}_s$).

Theorem 4.3 (Invariance under time shift). For any $s,t\ge0$, we have
\[
X_{t+s} = \Psi_t\left( X_s, (B_{u+s} - B_s)_{u\in[0,t]} \right).
\]

Proof. Fix $s\ge0$ and define $\widetilde B_t := B_{t+s} - B_s$ for $t\ge0$. Then, the change-of-variable formula
\[
\int_s^{t+s}\varphi_u\,dB_u = \int_0^t \varphi_{u+s}\,d\widetilde B_u,
\]
is clear in the elementary case where $\varphi_t(\omega) = X(\omega)\mathbf{1}_{]u,v]}(t)$ with $X\in L^2(\Omega,\mathcal{F}_u,\mathbb{P})$. By linearity
and density, it then extends to any $\varphi\in\mathcal{M}^2_{\mathrm{LOC}}$. In particular, we can write
\[
X_{t+s} = X_s + \int_s^{t+s} b(X_u)\,du + \int_s^{t+s}\sigma(X_u)\,dB_u
= X_s + \int_0^t b(X_{u+s})\,du + \int_0^t \sigma(X_{u+s})\,d\widetilde B_u.
\]
In other words, the process $\widetilde X := (X_{t+s})_{t\ge0}$ solves the well-posed SDE
\[
\begin{cases}
d\widetilde X_t = b(\widetilde X_t)\,dt + \sigma(\widetilde X_t)\,d\widetilde B_t \\
\widetilde X_0 = X_s
\end{cases}
\]
driven by the Brownian motion $\widetilde B$ on the filtered space $(\Omega,\mathcal{F},(\widetilde{\mathcal{F}}_t)_{t\ge0},\mathbb{P})$, where $\widetilde{\mathcal{F}}_t := \mathcal{F}_{t+s}$.
But this precisely means that $\widetilde X_t = \Psi_t(X_s,(\widetilde B_u)_{u\in[0,t]})$ for all $t\ge0$.

A direct consequence of Theorem 4.3 (along with the general Remark 1.6 about conditional
expectation) is the following fundamental formula, which is known as the Markov property.
Given $f\in L^\infty(\mathbb{R})$ and $t\ge0$, we define a new function $P_t f : \mathbb{R}\to\mathbb{R}$ by
\[
\forall x\in\mathbb{R}, \quad (P_t f)(x) := \mathbb{E}[f(X^x_t)], \tag{4.17}
\]
where X x denotes the unique solution to the SDE (4.15) with initial condition ζ = x.


Corollary 4.1 (Markov property). For any s, t ≥ 0 and any f ∈ L∞ , we have

E[ f ( Xt+s )|Fs ] = ( Pt f )( Xs ). (4.18)

In particular, the map ( x, A) 7→ ( Pt 1 A )( x ) is a transition kernel describing the conditional


distribution of the future state Xt+s given that the current state is Xs = x. Note that by successive
conditionings, the operators ( Pt )t≥0 actually allow one to recover the law of the entire process X
from that of X0 . This motivates a deeper study of the family ( Pt )t≥0 , called a semi-group because
of the second property in the following lemma.

Lemma 4.2 (Properties of the semi-group). The family ( Pt )t≥0 enjoys the following properties:

1. Pt is a bounded linear operator from L∞ (R) to L∞ (R) for each t ≥ 0.

2. We have P0 = Id and Pt+s = Pt ◦ Ps for all s, t ≥ 0.

3. If f is continuous, then so is t 7→ Pt f ( x ) for each fixed x ∈ R.

4. If f is monotone, then so is Pt f for each t ≥ 0.

5. If f is Lipschitz, then so is Pt f for each t ≥ 0.

6. If σ, b, f are in Cbk for some k ∈ N, then so is Pt f for each t ≥ 0.

Proof. The linearity of Pt readily follows from that of E[·]. Moreover, for any f ∈ L∞ (R) and
t ≥ 0, the function Pt f : x 7→ E[ f ( Xtx )] = E[( f ◦ Ψt )( x, B)] is measurable by Fubini’s theorem
(because f ◦ Ψt is bounded and measurable). Since ∥ Pt f ∥∞ ≤ ∥ f ∥∞ , the first assertion is proved.
The fact that P0 = Id is clear since X0x = x. To prove that Pt+s = Pt ◦ Ps for all s, t ≥ 0, we write

( Pt+s f )( x ) = E[ f ( Xtx+s )] = E[( Pt f )( Xsx )] = Ps ( Pt f )( x ),

where the second identity is obtained by taking expectations in (4.18). Finally, if f is continuous
and bounded, then so is the random function t 7→ f ( Xtx ), hence also its expectation. The last
three assertions are consequences of the identity (4.11), and the details are left to the reader.

4.5 Generator of a diffusion


It is an easy and pleasant exercise to show that any positive continuous function $t\mapsto p_t$ satisfying
$p_{t+s} = p_tp_s$ must take the form $p_t = e^{\lambda t}$, where $\lambda := \lim_{t\to0}\frac{p_t - p_0}{t}$. By analogy, it is natural to
hope for a representation of our semi-group $(P_t)_{t\ge0}$ under the form
\[
P_t = e^{tL}, \tag{4.19}
\]
where $L = \lim_{t\to0}\frac{P_t - P_0}{t}$. Of course, at this level, those identities do not really make any sense
and the analogy is purely formal. Nevertheless, this motivates the following fruitful definition.

Definition 4.1 (Generator). The generator of the semi-group $(P_t)_{t\ge0}$ is the linear operator L defined by
\[
\forall x\in\mathbb{R}, \quad (Lf)(x) := \lim_{t\to0}\frac{(P_t f)(x) - f(x)}{t}, \tag{4.20}
\]
for all $f\in L^\infty(\mathbb{R})$ such that the limit exists. Those functions form a vector space, denoted $\mathrm{Dom}(L)$.


The interest of this definition is summed up in the following important result. We recall that
Cc2 (R)
denotes the vector space of twice continuously differentiable functions f : R → R with
compact support, and that this space is dense in L∞ (R).

Theorem 4.4 (Properties of the generator). Let f ∈ Cc2 (R). Then,

1. Lf is well-defined and given by the formula:
\[
\forall x\in\mathbb{R}, \quad Lf(x) = b(x)f'(x) + \frac12\sigma^2(x)f''(x). \tag{4.21}
\]

2. The function $P_t f$ is in $\mathrm{Dom}(L)$ for all $t\ge0$, and it satisfies Kolmogorov's equation:
\[
\forall x\in\mathbb{R}, \quad \frac{d}{dt}(P_t f)(x) = (P_t Lf)(x) = (LP_t f)(x). \tag{4.22}
\]

3. The process $M = (M_t)_{t\ge0}$ defined as follows is a continuous square-integrable martingale:
\[
\forall t\ge0, \quad M_t := f(X_t) - f(X_0) - \int_0^t (Lf)(X_u)\,du. \tag{4.23}
\]

Proof. Let us define R f := b f ′ + 21 σ2 f ′′ . Applying Itô’s formula to f and X, we find

1
d f ( Xt ) = f ′ ( Xt ) dXt + f ′′ ( Xt ) d⟨ X ⟩t
 2 
1
= b( Xt ) f ′ ( Xt ) + σ2 ( Xt ) f ′′ ( Xt ) dt + f ′ ( Xt )σ( Xt ) dBt
2

= ( R f )( Xt ) dt + f ( Xt )σ( Xt ) dBt .

In other words, for all t ≥ 0,

   f(Xt) − f(X0) − ∫₀ᵗ (R f)(Xu) du = ∫₀ᵗ σ(Xu) f′(Xu) dBu.   (4.24)

Now, the fact that f ∈ Cc²(R) easily ensures that the functions u ↦ (R f)(Xu) and u ↦
σ(Xu) f′(Xu) are in M1 and M2, respectively. In particular, the right-hand side of (4.24) is
a square-integrable martingale. Taking expectations and using Fubini's theorem, we deduce that
   E[ f(Xt) ] = f(X0) + ∫₀ᵗ E[ (R f)(Xu) ] du.

Recalling the definition of Pt, we deduce that

   ∀x ∈ R,   (Pt f)(x) = f(x) + ∫₀ᵗ (Pu R f)(x) du.   (4.25)

But for each fixed x ∈ R, the function u ↦ (Pu R f)(x) is continuous (Lemma 4.2), so we conclude
that t ↦ (Pt f)(x) is continuously differentiable on R+, with derivative

   (∂/∂t)(Pt f)(x) = (Pt R f)(x).   (4.26)

Since the left-hand side equals lim_{h→0} (1/h)[ (Ph Pt f)(x) − (Pt f)(x) ], we see that Pt f ∈ Dom(L) and
that L Pt f = Pt R f. Finally, taking t = 0 shows that L f = R f, and the proof is complete.
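The formula (4.21) can also be observed numerically. Here is a minimal Python sketch, restricted to the Brownian case b ≡ 0, σ ≡ 1 (so that (Pt f)(x) = E[f(x + √t Z)] with Z standard Gaussian, and L f = ½ f″); the function names and parameter values are our own illustrative choices, and the semi-group is evaluated by simple quadrature rather than any exact formula:

```python
# Sanity check of (4.20)-(4.21) for b = 0, sigma = 1: the difference quotient
# ((P_t f)(x) - f(x)) / t should approach (1/2) f''(x) as t -> 0.
import math

def heat_semigroup(f, x, t, n=4000, cutoff=8.0):
    # Midpoint quadrature of E[f(x + sqrt(t) Z)] against the standard Gaussian density.
    h = 2 * cutoff / n
    total = 0.0
    for i in range(n):
        z = -cutoff + (i + 0.5) * h
        total += f(x + math.sqrt(t) * z) * math.exp(-z * z / 2) * h
    return total / math.sqrt(2 * math.pi)

f = math.cos
x, t = 0.7, 1e-3
approx_Lf = (heat_semigroup(f, x, t) - f(x)) / t   # difference quotient of (4.20)
exact_Lf = -0.5 * math.cos(x)                      # (1/2) f''(x), as in (4.21)
print(approx_Lf, exact_Lf)  # both close to -0.382
```

With f = cos one even knows (Pt f)(x) = e^{−t/2} cos(x) exactly, which makes the agreement easy to cross-check by hand.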


Remark 4.2 (Extension). Since X is square-integrable (Theorem 4.2), the definition of Pt f given at
(4.17) actually extends to all measurable functions f with quadratic growth (i.e. |f(x)| ≤ K(1 + x²) for
some constant K), and the general properties of (Pt)t≥0 established in Lemma 4.2 remain valid for this
extended definition. In particular, our proof of Theorem 4.4 carries over to the case where f ∈ Cb²(R),
meaning that f is twice continuously differentiable with f, f′, f″ bounded.

Remark 4.3 (Martingales). The family of martingales given by (4.23) is of course extremely useful for
studying the process X. For example, in the case of Brownian motion, we obtain that

   t ↦ f(Bt) − ½ ∫₀ᵗ f″(Bu) du

is a martingale for any f ∈ Cb²(R), generalizing the two simple cases (Bt)t≥0 and (Bt² − t)t≥0.

Remark 4.4 (Fokker-Planck equation). Writing ht for the distribution of Xt, the equation (4.22) gives

   (d/dt) ∫_R f(z) ht(dz) = ∫_R [ b(z) f′(z) + ½ σ²(z) f″(z) ] ht(dz),   (4.27)

for any f ∈ Cb². Integrating by parts in the sense of distributions, we obtain Fokker-Planck's equation:

   ∂ht/∂t = L⋆ ht,   where L⋆ h := ½ (σ² h)″ − (b h)′.   (4.28)
In particular, the equation L⋆ h = 0 characterizes those distributions h which are stationary.
Exercise 4.3 (Stationary distribution). Check that the Gaussian distribution N(0, σ²/(2b)) is stationary for
the Langevin equation dXt = −b Xt dt + σ dBt with b, σ > 0.
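The stationarity asserted in Exercise 4.3 can be observed empirically with an Euler-Maruyama discretization. The following Python sketch is our own illustration (the step size, horizon and sample count are arbitrary choices): started far from equilibrium, the empirical mean and variance at a large time should approach 0 and σ²/(2b).

```python
# Empirical check that N(0, sigma^2/(2b)) is stationary for dX_t = -b X_t dt + sigma dB_t.
# Euler-Maruyama scheme; all parameter values are illustrative.
import math
import random

def langevin_samples(b, sigma, x0, T, n_steps, n_paths, seed=0):
    rng = random.Random(seed)
    dt = T / n_steps
    out = []
    for _ in range(n_paths):
        x = x0
        for _ in range(n_steps):
            # One Euler-Maruyama step: drift -b x dt plus Gaussian increment of variance sigma^2 dt.
            x += -b * x * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        out.append(x)
    return out

b, sigma = 1.0, 1.0
xs = langevin_samples(b, sigma, x0=3.0, T=10.0, n_steps=500, n_paths=2000)
mean = sum(xs) / len(xs)
var = sum((x - mean) ** 2 for x in xs) / len(xs)
print(mean, var)  # mean near 0, variance near sigma^2 / (2 b) = 0.5
```

The residual bias in the variance is of order dt, consistent with the first-order accuracy of the Euler scheme.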

4.6 Connection with partial differential equations


The Kolmogorov equation (4.22) is the starting point of a far-reaching connection between SDEs
and partial differential equations (PDEs), which we now uncover. Consider again our diffusion

   dXt^x = b(Xt^x) dt + σ(Xt^x) dBt,   X0^x = x,   (4.29)

where b, σ : R → R are Lipschitz functions. Now, fix f ∈ L∞ (R), and consider the PDE

   ∂v/∂t (t, x) = b(x) ∂v/∂x (t, x) + ½ σ²(x) ∂²v/∂x² (t, x),   v(0, x) = f(x),   (4.30)

where v ∈ C^{1,2}(R+ × R) is unknown.

Theorem 4.5 (Connection). The evolutions (4.29) and (4.30) are linked as follows:

1. If v is a bounded solution to the PDE (4.30), then we must have

∀(t, x ) ∈ R+ × R, v(t, x ) = E[ f ( Xtx )]. (4.31)

2. If b, σ, f ∈ Cb2 , then conversely, the function (4.31) is a bounded solution to (4.30).


Proof. Fix v ∈ C^{1,2}(R+ × R) and (T, x) ∈ R+ × R, and consider the stochastic process M =
(Mt)t∈[0,T] defined by Mt := v(T − t, Xt^x). By Itô's formula, we have

   dMt = −∂v/∂t (T − t, Xt^x) dt + ∂v/∂x (T − t, Xt^x) dXt^x + ½ ∂²v/∂x² (T − t, Xt^x) d⟨X^x⟩t
       = [ b(Xt^x) ∂v/∂x (T − t, Xt^x) + ½ σ²(Xt^x) ∂²v/∂x² (T − t, Xt^x) − ∂v/∂t (T − t, Xt^x) ] dt
         + σ(Xt^x) ∂v/∂x (T − t, Xt^x) dBt.
In particular, if v solves (4.30), then the drift term vanishes, so M is a local martingale. If moreover
v is bounded, then M is bounded, so it is a true martingale. Thus, E[MT] = E[M0], i.e.

   E[ f(XT^x) ] = v(T, x).

This proves the first claim. Now, assume that b, σ, f ∈ Cb². Then the last item in Lemma 4.2
ensures that Pt f is in Cb²(R) for all t ≥ 0, so we may replace f with Pt f in (4.21) to obtain

   L Pt f(x) = b(x)(Pt f)′(x) + ½ σ²(x)(Pt f)″(x).

Thus, the Kolmogorov equation (∂/∂t) Pt f(x) = L Pt f(x) (along with the trivial identity P0 f(x) =
f(x)) precisely asserts that the bounded function v(t, x) := (Pt f)(x) solves the PDE (4.30).

The interest of this connection between SDEs and PDEs is two-fold: on the one hand, one can
use tools from PDE theory to understand the distribution of Xt^x; on the other hand, the probabilistic
representation (4.31) offers a practical way to solve the PDE (4.30) numerically, by simulation.
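To illustrate this simulation approach, here is a minimal Monte Carlo sketch in Python (names and parameters are our own), restricted to the simplest case b ≡ 0, σ ≡ 1: then Xt^x = x + Bt, the PDE (4.30) is the heat equation ∂v/∂t = ½ ∂²v/∂x², and for f = cos the bounded solution is known in closed form, v(t, x) = e^{−t/2} cos(x), which gives us something to compare against:

```python
# Monte Carlo evaluation of the probabilistic representation v(t, x) = E[f(X_t^x)]
# in the case b = 0, sigma = 1 (so X_t^x = x + B_t); parameter values are illustrative.
import math
import random

def mc_solution(f, x, t, n_paths=20000, seed=1):
    rng = random.Random(seed)
    # B_t has law N(0, t), so we sample x + sqrt(t) * Z with Z standard Gaussian.
    total = sum(f(x + math.sqrt(t) * rng.gauss(0.0, 1.0)) for _ in range(n_paths))
    return total / n_paths

x, t = 0.3, 1.0
approx = mc_solution(math.cos, x, t)
exact = math.exp(-t / 2) * math.cos(x)  # closed-form heat-equation solution for f = cos
print(approx, exact)
```

For a general drift b and diffusion coefficient σ, one would simply replace the exact Gaussian sample by an Euler-Maruyama approximation of the path.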
Here is an important extension, which incorporates a zero-order term into our PDE.
Theorem 4.6 (Feynman-Kac's formula). Let v ∈ C^{1,2}(R+ × R) be a bounded solution to the PDE

   ∂v/∂t (t, x) = −h(x) v(t, x) + b(x) ∂v/∂x (t, x) + ½ σ²(x) ∂²v/∂x² (t, x),   v(0, x) = f(x),   (4.32)

where f, h : R → R are measurable, with h non-negative. Then, we have the representation

   ∀(t, x) ∈ R+ × R,   v(t, x) = E[ f(Xt^x) e^{−∫₀ᵗ h(Xu^x) du} ].   (4.33)

Proof. Fix T ≥ 0 and x ∈ R, and consider the stochastic process (Mt)t∈[0,T] defined by

   Mt := Vt e^{−∫₀ᵗ h(Xu^x) du},   with Vt := v(T − t, Xt^x).

As above, using Itô's formula and the fact that v solves the PDE (4.32), we find for t ∈ [0, T],

   dVt = h(Xt^x) Vt dt + σ(Xt^x) ∂v/∂x (T − t, Xt^x) dBt.

Consequently,

   dMt = e^{−∫₀ᵗ h(Xu^x) du} ( dVt − h(Xt^x) Vt dt )
       = e^{−∫₀ᵗ h(Xu^x) du} σ(Xt^x) ∂v/∂x (T − t, Xt^x) dBt.

Thus, (Mt)t∈[0,T] is a local martingale. Since it is bounded (v is bounded and h ≥ 0), it is in fact a
true martingale. In particular, E[MT] = E[M0], i.e.

   E[ f(XT^x) e^{−∫₀ᵀ h(Xu^x) du} ] = v(T, x).

Since T ≥ 0 and x ∈ R are arbitrary, the claim is proved.
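The representation (4.33) also lends itself to simulation. As a minimal illustration (our own sketch, with illustrative parameters), take b ≡ 0, σ ≡ 1 and a constant killing rate h ≡ c: then the discount factor is exactly e^{−ct}, and v(t, x) = e^{−ct} E[f(x + Bt)], which for f = cos equals e^{−ct} e^{−t/2} cos(x). The code nevertheless discretizes the integral ∫₀ᵗ h(Xu) du along each path, so that it applies verbatim to a non-constant h:

```python
# Monte Carlo check of Feynman-Kac's formula (4.33) with b = 0, sigma = 1.
# The exponential weight is computed by a left-point Riemann sum along each path,
# so the same code works for non-constant h; parameter values are illustrative.
import math
import random

def feynman_kac_mc(f, h, x, t, n_steps=100, n_paths=5000, seed=2):
    rng = random.Random(seed)
    dt = t / n_steps
    total = 0.0
    for _ in range(n_paths):
        y = x
        integral = 0.0
        for _ in range(n_steps):
            integral += h(y) * dt                    # Riemann sum of int_0^t h(X_u) du
            y += math.sqrt(dt) * rng.gauss(0.0, 1.0) # Brownian increment (b = 0, sigma = 1)
        total += f(y) * math.exp(-integral)
    return total / n_paths

c, x, t = 0.7, 0.3, 1.0
approx = feynman_kac_mc(math.cos, lambda y: c, x, t)
exact = math.exp(-c * t) * math.exp(-t / 2) * math.cos(x)  # closed form for f = cos, h = c
print(approx, exact)
```

In mathematical finance, h plays the role of a discount rate, which is one reason this formula is so widely used.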
