
Stochastic Differential Equations

Steven P. Lalley
December 2, 2016

1 SDEs: Definitions
1.1 Stochastic differential equations
Many important continuous-time Markov processes — for instance, the Ornstein-Uhlenbeck pro-
cess and the Bessel processes — can be defined as solutions to stochastic differential equations with
drift and diffusion coefficients that depend only on the current value of the process. The general
form of such an equation (for a one-dimensional process with a one-dimensional driving Brownian
motion) is
dXt = µ(Xt ) dt + σ(Xt ) dWt , (1)
where {Wt }t≥0 is a standard Wiener process.

Definition 1. Let {Wt }t≥0 be a standard Brownian motion on a probability space (Ω, F, P ) with an
admissible filtration F = {Ft }t≥0 . A strong solution of the stochastic differential equation (1) with
initial condition x ∈ R is an adapted process Xt = Xt^x with continuous paths such that for all t ≥ 0,
Xt = x + ∫₀ᵗ µ(Xs) ds + ∫₀ᵗ σ(Xs) dWs   a.s.   (2)
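
Definition 1 also suggests a numerical scheme: discretize [0, t], replace the integrals in (2) by sums, and drive the recursion with independent Gaussian increments. Below is a minimal Euler–Maruyama sketch in Python (the function name and parameter values are illustrative, not from the text); it produces an approximate solution path, not an exact one. For Lipschitz coefficients, as in Theorem 1 below, this scheme is known to converge to the strong solution as the step size shrinks.

```python
import numpy as np

def euler_maruyama(mu, sigma, x0, T=1.0, n=1000, rng=None):
    """Approximate a strong solution of dX = mu(X) dt + sigma(X) dW
    on [0, T] by the Euler-Maruyama discretization of equation (2)."""
    rng = rng or np.random.default_rng()
    dt = T / n
    X = np.empty(n + 1)
    X[0] = x0
    # Brownian increments: independent N(0, dt) random variables
    dW = rng.normal(0.0, np.sqrt(dt), size=n)
    for k in range(n):
        X[k + 1] = X[k] + mu(X[k]) * dt + sigma(X[k]) * dW[k]
    return X

# Example: Ornstein-Uhlenbeck process, dX = -X dt + dW
path = euler_maruyama(lambda x: -x, lambda x: 1.0, x0=1.0)
```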

At first sight this definition seems to have little content except to give a more-or-less obvious in-
terpretation of the differential equation (1). However, there are a number of subtle points involved:
First, the existence of the integrals in (2) requires some degree of regularity on Xt and the functions
µ and σ; in particular, it must be the case that for all t ≥ 0, with probability one,
∫₀ᵗ |µ(Xs)| ds < ∞ and ∫₀ᵗ σ²(Xs) ds < ∞. (3)

Second, the solution is required to exist for all t < ∞ with probability one. In fact, there are
interesting cases of (1) for which solutions can be constructed up to a finite, possibly random time
T < ∞, but not beyond; this often happens because the solution Xt explodes (that is, runs off to ±∞)
in finite time. Third, the definition requires that the process Xt live on the same probability space
as the given Wiener process Wt , and that it be adapted to the given filtration. It turns out (as we
will see) that for certain coefficient functions µ and σ, solutions to the stochastic integral equation (2) may exist for some Wiener processes and some admissible filtrations but not for others.
Definition 2. A weak solution of the stochastic differential equation (1) with initial condition x is a continuous stochastic process Xt defined on some probability space (Ω, F, P) such that for some Wiener process Wt and some admissible filtration F the process Xt is adapted and satisfies the stochastic integral equation (2).

2 Existence and Uniqueness of Solutions
2.1 Itô’s existence/uniqueness theorem
The basic result, due to Itô, is that for uniformly Lipschitz functions µ(x) and σ(x) the stochastic
differential equation (1) has strong solutions, and that for each initial value X0 = x the solution is
unique.
Theorem 1. Assume that µ : R → R and σ : R → R+ are uniformly Lipschitz, that is, there exists a
constant C < ∞ such that for all x, y ∈ R,

|µ(x) − µ(y)| ≤ C|x − y| and (4)


|σ(x) − σ(y)| ≤ C|x − y|. (5)

Then the stochastic differential equation (1) has strong solutions: In particular, for any standard Brownian
motion {Wt }t≥0 , any admissible filtration F = {Ft }t≥0 , and any initial value x ∈ R there exists a unique
adapted process Xt = Xt^x with continuous paths such that
Xt = x + ∫₀ᵗ µ(Xs) ds + ∫₀ᵗ σ(Xs) dWs   a.s.   (6)

Furthermore, the solutions depend continuously on the initial data x; that is, the two-parameter process Xt^x is jointly continuous in t and x.
This parallels the main existence/uniqueness result for ordinary differential equations, or more
generally finite systems of ordinary differential equations

x′(t) = F(x(t)), (7)

which asserts that unique solutions exist for each initial value x(0) provided the function F is
uniformly Lipschitz. Without the hypothesis that the function F is Lipschitz, the theorem may fail
in any number of ways, even for ordinary differential equations.
Example 1. Consider the equation x′ = 2√|x|. This is the special case of equation (7) with F(x) = 2√|x|. This function fails the Lipschitz property at x = 0. Correspondingly, uniqueness of solutions
fails for the initial value x(0) = 0: the functions

x(t) ≡ 0 and y(t) = t²

are both solutions of the ordinary differential equation with initial value 0.
Example 2. Consider the equation x′ = x², the special case of (7) where F(x) = x². The function
F is C ∞ , hence Lipschitz on any finite interval, but it is not uniformly Lipschitz, as uniformly
Lipschitz functions cannot grow faster than linearly. For any initial value x0 > 0, the function

x(t) = (x0⁻¹ − t)⁻¹

solves the differential equation and has the right initial value, and it can be shown that there is no
other solution. The difficulty is that the function x(t) blows up as t → 1/x0 , so the solution does not
exist for all time t > 0. The same difficulty can arise with stochastic differential equations whose
coefficients grow too quickly: for stochastic differential equations, when solutions travel to ±∞ in
finite time they are said to explode.
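
A quick numerical check of Example 2 (a sketch; the constants are illustrative): the formula x(t) = (x0⁻¹ − t)⁻¹ satisfies x′ = x² on [0, 1/x0) and grows without bound as t approaches 1/x0.

```python
import numpy as np

x0 = 2.0                        # blow-up predicted at t = 1/x0 = 0.5
t = np.linspace(0.0, 0.4, 4001)
x = 1.0 / (1.0 / x0 - t)
dx_dt = np.gradient(x, t)       # numerical derivative of the candidate solution
print(np.max(np.abs(dx_dt - x**2)))   # small: x indeed solves x' = x^2
print(1.0 / (1.0 / x0 - 0.499))       # formula already ~ 10^3 just before t = 0.5
```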

2.2 Gronwall inequalities
The proof of Theorem 1 will make use of several basic results concerning the solutions of simple
differential inequalities due to Gronwall. These are also useful in the theory of ordinary differential
equations.
Lemma 1. Let y(t) be a nonnegative function that satisfies the following condition: For some T ≤ ∞ there
exist constants A, B ≥ 0 such that
y(t) ≤ A + B ∫₀ᵗ y(s) ds < ∞ for all 0 ≤ t ≤ T. (8)

Then
y(t) ≤ Ae^{Bt} for all 0 ≤ t ≤ T. (9)

Proof. Without loss of generality, we may assume that C := ∫₀ᵀ y(s) ds < ∞ and that T < ∞. It then follows, since y is nonnegative, that y(t) is bounded by D := A + BC on the interval [0, T]. Iterate the inequality (8) to obtain
y(t) ≤ A + B ∫₀ᵗ y(s) ds
     ≤ A + B ∫₀ᵗ (A + B ∫₀ˢ y(r) dr) ds
     ≤ A + BAt + B² ∫₀ᵗ ∫₀ˢ (A + B ∫₀ʳ y(q) dq) dr ds
     ≤ A + BAt + B²At²/2! + B³ ∫₀ᵗ ∫₀ˢ ∫₀ʳ (A + B ∫₀^q y(p) dp) dq dr ds
     ≤ ··· .

After k iterations, one has the first k terms in the series for Ae^{Bt} plus a (k + 1)-fold iterated integral Ik. Because y(t) ≤ D on the interval [0, T], the integral Ik is bounded by BᵏDtᵏ⁺¹/(k + 1)!. This converges to zero uniformly for t ≤ T as k → ∞. Hence, inequality (9) follows.
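
A numerical sanity check of Lemma 1 (a sketch with illustrative constants): for the extremal function y(t) = Ae^{Bt}, the inequality (8) holds with equality, so the Gronwall bound (9) is sharp.

```python
import numpy as np

# For y(t) = A*exp(B*t) the relation y(t) = A + B * int_0^t y(s) ds holds
# exactly, so (8) is an equality and the bound (9) cannot be improved.
A, B, T = 1.0, 2.0, 1.0
t = np.linspace(0.0, T, 2001)
dt = t[1] - t[0]
y = A * np.exp(B * t)
# cumulative trapezoidal approximation of the integral in (8)
integral = np.concatenate(([0.0], np.cumsum((y[1:] + y[:-1]) / 2 * dt)))
print(np.max(np.abs(y - (A + B * integral))))   # ~0 up to discretization error
```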

Lemma 2. Let yn(t) be a sequence of nonnegative functions such that for some constants B, C < ∞,

y0(t) ≤ C for all t ≤ T and
yn+1(t) ≤ B ∫₀ᵗ yn(s) ds < ∞ for all t ≤ T and n = 0, 1, 2, . . . . (10)

Then
yn(t) ≤ CBⁿtⁿ/n! for all t ≤ T. (11)
Proof. Exercise.

2.3 Proof of Theorem 1: Constant σ


It is instructive to first consider the special case where the function σ(x) ≡ σ is constant. (This includes the possibility σ ≡ 0, in which case the stochastic differential equation reduces to the ordinary differential equation x′ = µ(x).) In this case the Gronwall inequalities can be used pathwise to prove all three assertions of the theorem (existence, uniqueness, and continuous dependence on initial conditions). First, uniqueness: suppose that for some initial value x there are two continuous
solutions
Xt = x + ∫₀ᵗ µ(Xs) ds + ∫₀ᵗ σ dWs and
Yt = x + ∫₀ᵗ µ(Ys) ds + ∫₀ᵗ σ dWs.

Then the difference satisfies

Yt − Xt = ∫₀ᵗ (µ(Ys) − µ(Xs)) ds,
and since the drift coefficient µ is uniformly Lipschitz, it follows that for some constant B < ∞,
|Yt − Xt| ≤ B ∫₀ᵗ |Ys − Xs| ds

for all t < ∞. Lemma 1 now implies that Yt − Xt ≡ 0. Thus, the stochastic differential equation
can have at most one solution for any particular initial value x. A similar argument shows that
solutions depend continuously on initial conditions X0 = x.
Existence of solutions is proved by a variant of Picard’s method of successive approximations.
Fix an initial value x, and define a sequence of adapted processes Xn(t) by

X0(t) = x and Xn+1(t) = x + ∫₀ᵗ µ(Xn(s)) ds + σWt.

The processes Xn (t) are all well-defined and have continuous paths, by induction on n (using the
hypothesis that the function µ(y) is continuous). The strategy will be to show that the sequence
Xn (t) converges uniformly on compact time intervals. It will then follow, by the dominated con-
vergence theorem and the continuity of µ, that the limit process X(t) solves the stochastic integral
equation (6). Because µ(y) is Lipschitz,
|Xn+1(t) − Xn(t)| ≤ B ∫₀ᵗ |Xn(s) − Xn−1(s)| ds,

and so Lemma 2 implies that for any T < ∞,

|Xn+1(t) − Xn(t)| ≤ CBⁿTⁿ/n! for all t ≤ T.

It follows that the processes Xn (t) converge uniformly on compact time intervals [0, T ], and there-
fore that the limit process X(t) has continuous trajectories.
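
The successive approximations are easy to carry out numerically along a fixed simulated Brownian path, since for constant σ each iterate requires only an ordinary integral of the previous one. The sketch below (the function name and the Ornstein-Uhlenbeck drift are illustrative choices, not from the text) prints the sup-norm distance between consecutive iterates; by (11) these gaps should decay factorially.

```python
import numpy as np

def picard_iterates(mu, sigma, x0, W, T=1.0, n_iter=8):
    """Successive approximations X_{n+1}(t) = x + int_0^t mu(X_n(s)) ds + sigma*W(t)
    along one fixed discretized Brownian path W (array of length m+1)."""
    m = len(W) - 1
    dt = T / m
    X = np.full(m + 1, x0, dtype=float)       # X_0(t) = x
    for _ in range(n_iter):
        # left-endpoint Riemann sum for int_0^t mu(X_n(s)) ds
        drift = np.concatenate(([0.0], np.cumsum(mu(X[:-1]) * dt)))
        X_new = x0 + drift + sigma * W
        print(np.max(np.abs(X_new - X)))      # sup-norm gap between iterates
        X = X_new
    return X

rng = np.random.default_rng(0)
m = 1000
W = np.concatenate(([0.0], np.cumsum(rng.normal(0, np.sqrt(1.0 / m), m))))
X = picard_iterates(lambda x: -x, 1.0, 1.0, W)   # OU drift mu(x) = -x
```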

2.4 Proof of Theorem 1. General Case: Existence


The proof of Theorem 1 in the general case is more complicated, because when differences of so-
lutions or approximate solutions are taken, the Itô integrals no longer vanish. Thus, the Gronwall
inequalities cannot be applied directly. Instead, we will use Gronwall to control second moments.
Different arguments are needed for existence and uniqueness. Continuous dependence on initial
conditions can be proved using arguments similar to those used for the uniqueness proof; the de-
tails are left as an exercise.
To prove existence of solutions we use the same iterative method as in the case of constant σ to
generate approximate solutions:
X0(t) = x and Xn+1(t) = x + ∫₀ᵗ µ(Xn(s)) ds + ∫₀ᵗ σ(Xn(s)) dWs. (12)

By induction, the processes Xn (t) are well-defined and have continuous paths. The problem is
to show that these converge uniformly on compact time intervals, and that the limit process is a
solution to the stochastic differential equation.
First we will show that for each t ≥ 0 the sequence of random variables Xn(t) converges in L² to a random variable X(t), necessarily in L². The first two terms of the sequence are X0(t) ≡ x and X1(t) = x + µ(x)t + σ(x)Wt; for both of these the random variables Xj(t) are uniformly bounded in L² for t in any bounded interval [0, T], and so for each T < ∞ there exists C = C_T < ∞ such that

E(X1(t) − X0(t))² ≤ C for all t ≤ T.
Now by hypothesis, the functions µ and σ are uniformly Lipschitz, and hence, for a suitable constant B < ∞,

|µ(Xn(t)) − µ(Xn−1(t))| ≤ B|Xn(t) − Xn−1(t)| and (13)
|σ(Xn(t)) − σ(Xn−1(t))| ≤ B|Xn(t) − Xn−1(t)|

for all t ≥ 0. Thus, by Cauchy–Schwarz and the Itô isometry, together with the elementary inequality (x + y)² ≤ 2x² + 2y²,
E|Xn+1(t) − Xn(t)|² ≤ E( ∫₀ᵗ (µ(Xn(s)) − µ(Xn−1(s))) ds + ∫₀ᵗ (σ(Xn(s)) − σ(Xn−1(s))) dWs )²
    ≤ 2E( ∫₀ᵗ (µ(Xn(s)) − µ(Xn−1(s))) ds )² + 2E( ∫₀ᵗ (σ(Xn(s)) − σ(Xn−1(s))) dWs )²
    ≤ 2B²E( ∫₀ᵗ |Xn(s) − Xn−1(s)| ds )² + 2B² ∫₀ᵗ E|Xn(s) − Xn−1(s)|² ds
    ≤ 2B²E( t ∫₀ᵗ |Xn(s) − Xn−1(s)|² ds ) + 2B² ∫₀ᵗ E|Xn(s) − Xn−1(s)|² ds
    ≤ 2B²(T + 1) ∫₀ᵗ E|Xn(s) − Xn−1(s)|² ds   for all t ≤ T.

Lemma 2 now applies to yn(t) := E|Xn+1(t) − Xn(t)|² (recall that E|X1(t) − X0(t)|² ≤ C = C_T for all t ≤ T), yielding

E(Xn+1(t) − Xn(t))² ≤ C(2B² + 2B²T)ⁿtⁿ/n! ∀ t ≤ T. (14)
This clearly implies that for each t ≤ T the random variables Xn(t) converge in L². Furthermore, this L²-convergence is uniform for t ≤ T (because the bounds in (14) hold uniformly for t ≤ T), and the limit random variables X(t) := L²-lim_{n→∞} Xn(t) are bounded in L² for t ≤ T.
It remains to show that the limit process X(t) satisfies the stochastic differential equation (6). To this end, consider the random variables µ(Xn(t)) and σ(Xn(t)). Since Xn(t) → X(t) in L², the Lipschitz bounds (13) imply that

lim_{n→∞} ( E|µ(Xn(t)) − µ(X(t))|² + E|σ(Xn(t)) − σ(X(t))|² ) = 0
uniformly for t ≤ T. Hence, by the Itô isometry,

L²-lim_{n→∞} ∫₀ᵗ σ(Xn(s)) dWs = ∫₀ᵗ σ(X(s)) dWs

for each t ≤ T. Similarly, by Cauchy–Schwarz and Fubini,

L²-lim_{n→∞} ∫₀ᵗ µ(Xn(s)) ds = ∫₀ᵗ µ(X(s)) ds.

Thus, (12) implies that
X(t) = x + ∫₀ᵗ µ(X(s)) ds + ∫₀ᵗ σ(X(s)) dWs.

This shows that the process X(t) satisfies the stochastic integral equation (6). Both of the integrals
in this equation are continuous in t, and therefore so is X(t).

2.5 Proof of Theorem 1. General Case: Uniqueness


Suppose as before that for some initial value x there are two continuous solutions
Xt = x + ∫₀ᵗ µ(Xs) ds + ∫₀ᵗ σ(Xs) dWs and
Yt = x + ∫₀ᵗ µ(Ys) ds + ∫₀ᵗ σ(Ys) dWs.

Then the difference satisfies

Yt − Xt = ∫₀ᵗ (µ(Ys) − µ(Xs)) ds + ∫₀ᵗ (σ(Ys) − σ(Xs)) dWs. (15)

Although the second integral cannot be bounded pathwise, its second moment can be bounded,
since σ(y) is Lipschitz:
E( ∫₀ᵗ (σ(Ys) − σ(Xs)) dWs )² ≤ B² ∫₀ᵗ E(Ys − Xs)² ds,

where B is the Lipschitz constant. Of course, we have no way of knowing that the expectations E(Ys − Xs)² are finite, so the integral on the right side of the inequality may be ∞. Nevertheless, taking second moments on both sides of (15), using the inequality (a + b)² ≤ 2a² + 2b² and the Cauchy–Schwarz inequality, we obtain
E(Yt − Xt)² ≤ (2B² + 2B²T) ∫₀ᵗ E(Ys − Xs)² ds for all t ≤ T.

If the function f(t) := E(Yt − Xt)² were known to be finite and integrable on compact time intervals, then the Gronwall inequality (9) would imply that f(t) ≡ 0, and the proof of uniqueness would be complete.¹ To circumvent this difficulty, we use a localization argument: Define the stopping time

τ := τA = inf{t : Xt² + Yt² ≥ A}.
Since Xt and Yt are defined and continuous for all t, they are a.s. bounded on compact time inter-
vals, and so τA → ∞ as A → ∞. Hence, with probability one, t ∧ τA = t for all sufficiently large
A. Next, starting from the identity (15), stopping at time τ = τA , and proceeding as in the last
paragraph, we obtain
E(Yt∧τ − Xt∧τ)² ≤ (2B² + 2B²T) ∫₀ᵗ E(Ys∧τ − Xs∧τ)² ds for all t ≤ T.
¹ Øksendal seems to have fallen prey to this trap: in his proof of Theorem 5.2.1 he fails to check that the second moment is finite.

By definition of τ, both sides are finite, and so Gronwall's inequality (9) implies that

E(Yt∧τ − Xt∧τ)² = 0.

Since this is true for every τ = τA, and τA → ∞ as A → ∞, it follows that Xt = Yt a.s., for each t ≥ 0. Since Xt and Yt have continuous sample paths, it follows that with probability one, Xt = Yt for all t ≥ 0. A similar argument proves continuous dependence on initial conditions.

3 Example: The Feller diffusion


The Feller diffusion {Yt }t≥0 is a continuous-time Markov process on the half-line [0, ∞) with ab-
sorption at 0 that satisfies the stochastic differential equation
dYt = σ√Yt dWt (16)

up until the time τ = τ0 of the first visit to 0. Here σ > 0 is a parameter. The Itô existence/uniqueness theorem does not apply, at least directly, because the function √y is not Lipschitz. However, the localization lemma of Itô calculus can be used in a routine fashion to show that for any initial value y > 0 there is a continuous process Yt such that

Yt∧τ = y + σ ∫₀^{t∧τ} √Ys dWs where τ = inf{t > 0 : Yt = 0}.

(Exercise: Fill in the details.)
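
Even without filling in the details, the behavior of (16) is easy to explore numerically. The following is a rough Euler sketch (not from the text, and the parameter values are illustrative); the clamp max(·, 0) is a standard ad hoc fix for discretization steps that overshoot below zero, and paths are frozen once they hit 0 to respect the absorption.

```python
import numpy as np

def feller_path(y0, sigma=1.0, T=5.0, n=5000, rng=None):
    """Euler scheme for dY = sigma*sqrt(Y) dW with absorption at 0."""
    rng = rng or np.random.default_rng()
    dt = T / n
    Y = np.empty(n + 1)
    Y[0] = y0
    for k in range(n):
        if Y[k] == 0.0:                      # absorbed at the origin
            Y[k + 1:] = 0.0
            break
        dW = rng.normal(0.0, np.sqrt(dt))
        Y[k + 1] = max(Y[k] + sigma * np.sqrt(Y[k]) * dW, 0.0)
    return Y

# Most paths started at y0 = 1 are absorbed at 0 well before time T = 5.
paths = [feller_path(1.0, rng=np.random.default_rng(i)) for i in range(5)]
# index of first zero times dt (= 0.001); 0.0 means not absorbed by time T
print([float(np.argmax(p == 0.0)) / 1000 for p in paths])
```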


The importance of the Feller diffusion stems from the fact that it is the natural continuous-time analogue² of the critical Galton-Watson process. The Galton-Watson process is a discrete-time Markov chain Zn on the nonnegative integers that evolves according to the following rule: given that Zn = k and any realization of the past up to time n − 1, the random variable Zn+1 is distributed as the sum of k independent, identically distributed random variables with common distribution F, called the offspring distribution. The process is said to be critical if F has mean 1. Assume also that F has finite variance σ²; then the evolution rule implies that the increment Zn+1 − Zn has conditional expectation 0 and conditional variance σ²Zn, given the history of the process to time n. This corresponds to the stochastic differential equation (16), which roughly states that the increments of Yt have conditional expectation 0 and conditional variance σ²Yt dt, given Ft.
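
This correspondence can be checked by simulation. A sketch under illustrative assumptions — Poisson(1) offspring, which is critical with offspring variance σ² = 1 — verifying that the one-step increments have conditional mean 0 and conditional variance σ²k given Zn = k:

```python
import numpy as np

rng = np.random.default_rng(1)

def gw_step(k, rng):
    """One generation of a Galton-Watson process with Poisson(1) offspring
    (critical, offspring variance sigma^2 = 1): sum of k i.i.d. counts."""
    return rng.poisson(1.0, size=k).sum()

# Empirical conditional mean and variance of Z_{n+1} - Z_n given Z_n = k
k, trials = 100, 20000
increments = np.array([gw_step(k, rng) - k for _ in range(trials)])
print(increments.mean())   # approximately 0   (criticality)
print(increments.var())    # approximately sigma^2 * k = 100
```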
A natural question to ask about the Feller diffusion is this: If Y0 = y > 0, does the trajectory Yt reach the endpoint 0 of the state space in finite time? (That is, is τ < ∞ w.p.1?) To see that it does, consider the process √Yt. By Itô's formula, if Yt satisfies (16), or more precisely, if it satisfies

Yt = y + σ ∫₀ᵗ √Ys 1[0,τ](s) dWs, (17)

then

d(√Yt) = (1/2)Yt^{−1/2} dYt − (1/8)Yt^{−3/2} d[Y]t
       = (σ/2) dWt − (σ²/8)Yt^{−1/2} dt

up to time τ. Thus, up to the time of the first visit to 0 (if any), the process √Yt is a Brownian motion plus a negative drift. Since a Brownian motion started at √y will reach 0 in finite time, with probability one, so will √Yt.
² Actually, the Feller diffusion is more than just an analogue of the Galton-Watson process: it is a weak limit of rescaled Galton-Watson processes, in the same sense that Brownian motion is a weak limit of rescaled random walks.

Exercise 1. Scaling law for the Feller diffusion: Let Yt be a solution of the integral equation (17)
with volatility parameter σ > 0 and initial value Y0 = 1.
(A) Show that for any α > 0 the process

Ỹt := α⁻¹Yαt (18)

is a Feller diffusion with initial value α⁻¹ and volatility parameter σ.


(B) Use this to deduce a simple relationship between the distributions of the hitting time τ for the Feller diffusion under the different initial conditions Y0 = 1 and Y0 = α⁻¹, respectively.
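
Before proving (A) and (B), the scaling law can be sanity-checked by Monte Carlo (a rough sketch; the Euler discretization, the clamp at 0, and all parameter values are illustrative): by (18), the hitting time from Y0 = α⁻¹ should be distributed as 1/α times the hitting time from Y0 = 1.

```python
import numpy as np

def hitting_time(y0, sigma=1.0, T=20.0, dt=0.005, rng=None):
    """Euler estimate of the first time the Feller diffusion hits 0
    (returns np.inf if the path is not absorbed by time T)."""
    rng = rng or np.random.default_rng()
    y, t = y0, 0.0
    while t < T:
        if y == 0.0:
            return t
        y = max(y + sigma * np.sqrt(y) * rng.normal(0.0, np.sqrt(dt)), 0.0)
        t += dt
    return np.inf

rng = np.random.default_rng(2)
alpha = 4.0
taus_1 = [hitting_time(1.0, rng=rng) for _ in range(400)]
taus_a = [hitting_time(1.0 / alpha, rng=rng) for _ in range(400)]
# Scaling prediction from (18): tau(Y0 = 1/alpha) ~ tau(Y0 = 1)/alpha in law
print(np.median(taus_1) / np.median(taus_a))   # should be close to alpha
```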
Exercise 2. Superposition law for the Feller diffusion: Let Yt^A and Yt^B be independent Feller diffusion processes with initial values Y0^A = α and Y0^B = β: in particular, assume that Y^A and Y^B satisfy stochastic integral equations (17) with respect to independent Brownian motions W^A and W^B. Define the superposition of Y^A and Y^B to be the process

Yt^C := Yt^A + Yt^B.

(A) Show that Yt^C is a Feller diffusion with initial condition Y0^C = α + β.
(B) Use this to deduce a simple relationship among the hitting time distributions for the three
processes.
Exercise 3. Zero is not an entrance boundary: The stochastic differential equation (16) has a singularity at the endpoint 0 of the state space, in the sense that the volatility σ√y becomes 0 in a non-smooth manner as y → 0. Equation (16) has solutions up to time τ for any initial value Y0 = y > 0; however, it is unclear whether or not there are solutions Yt of (16) such that limt→0 Yt = 0. Use the scaling law to prove that there are no such solutions. HINT: Consider the time that it would take to get from 2^{−k−m} to 2^{−m}.

4 The Differential Operator Associated to an SDE


We have seen that the Laplacian differential operator ½∆ plays a central role in the study of Brownian motion:

(A) The transition probabilities pt(x, y) obey the backward and forward heat equations:

∂pt(x, y)/∂t = ½∆_x pt(x, y) = ½∆_y pt(x, y).

(B) For any C∞ function f with compact support, the Dynkin formula holds:

E^x f(Wt) = f(x) + E^x ∫₀ᵗ ½∆f(Ws) ds.

(C) For any C∞ function f with compact support, the Itô formula holds:

f(Wt) − f(W0) = ∫₀ᵗ ∇f(Ws) · dWs + ½ ∫₀ᵗ ∆f(Ws) ds.

Associated to any autonomous, time-homogeneous stochastic differential equation of the form (1) is a corresponding second-order differential operator, the so-called generator, which figures into the study of solutions Xt in much the same way as does the Laplacian for Brownian motion. The generator is defined as follows:

G = µ(x) d/dx + ½ σ(x)² d²/dx². (19)
Assume henceforth that the coefficients µ(x) and σ(x) are continuous.
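
In concrete examples the generator can be computed symbolically. A small sketch using sympy (the Ornstein-Uhlenbeck coefficients µ(x) = −x, σ(x) = 1 are an illustrative choice, not from the text):

```python
import sympy as sp

x = sp.symbols('x')

def generator(mu, sigma, expr):
    """Apply G = mu(x) d/dx + (1/2) sigma(x)^2 d^2/dx^2, as in (19), to expr."""
    return sp.expand(mu * sp.diff(expr, x)
                     + sp.Rational(1, 2) * sigma**2 * sp.diff(expr, x, 2))

# Ornstein-Uhlenbeck coefficients mu(x) = -x, sigma(x) = 1, applied to f(x) = x**2
print(generator(-x, 1, x**2))     # Gf(x) = 1 - 2*x**2
```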
Proposition 1. Let Xt be any weak solution to the stochastic differential equation (1) and let f : R → R be
a C ∞ function with compact support. Then
f(Xt) − f(X0) = ∫₀ᵗ µ(Xs)f′(Xs) ds + ∫₀ᵗ f′(Xs)σ(Xs) dWs + ½ ∫₀ᵗ σ(Xs)²f″(Xs) ds. (20)
Consequently, the process

Mt^f := f(Xt) − ∫₀ᵗ Gf(Xs) ds (21)

is a martingale.
Proof. Formula (20) is just a restatement of the Itô formula for Itô processes – see the Lecture Notes on Itô Calculus, sec. 2.2. The second assertion is a direct consequence of formula (20), because

Mt^f = f(X0) + ∫₀ᵗ f′(Xs)σ(Xs) dWs

and the integrand f′(Xs)σ(Xs) is uniformly bounded, since f has compact support.
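
The martingale property of (21) can be checked by Monte Carlo: E Mt^f should stay equal to f(X0) for all t. A sketch for the Ornstein-Uhlenbeck process (illustrative choices throughout; the test function e^{−x²} is not compactly supported, but it is bounded with bounded derivatives, which is enough for this check):

```python
import numpy as np

# Monte Carlo check that E M_t^f is constant in t for the OU process
# dX = -X dt + dW, whose generator gives Gf(x) = -x f'(x) + f''(x)/2.
f  = lambda x: np.exp(-x**2)
Gf = lambda x: (4*x**2 - 1) * np.exp(-x**2)   # = -x*f'(x) + f''(x)/2

rng = np.random.default_rng(3)
steps, dt, paths = 1000, 0.001, 20000
X = np.full(paths, 0.5)                  # all paths start at X_0 = 0.5
integral = np.zeros(paths)               # running int_0^t Gf(X_s) ds per path
for _ in range(steps):
    integral += Gf(X) * dt               # left-endpoint rule
    X += -X * dt + rng.normal(0.0, np.sqrt(dt), paths)
# At t = 1: E M_1^f = E f(X_1) - E int_0^1 Gf(X_s) ds should equal f(X_0)
print(f(X).mean() - integral.mean(), f(0.5))   # the two values nearly agree
```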
Equations (20) and (21) are the analogues of the Itô and Dynkin formulas, respectively. There is
also an analogue of Property (A) (the heat equation), but this is, unfortunately, much more difficult
to establish. In particular, it is by no means obvious that transition densities even exist (i.e., that for a
given initial value X0 = x the random variable Xt will have an absolutely continuous distribution).
Following is a weak analogue of Property (A).
Proposition 2. Let Xt be any weak solution to the stochastic differential equation (1) and let u : [0, T ]×R →
R be a continuous, bounded function that satisfies the diffusion equation


∂u(t, x)/∂t = Gu(t, x) for 0 < t < T. (22)
Then the process {u(T − t, Xt )}0≤t≤T is a martingale, and so

u(T, x) = E^x u(0, XT). (23)

Proof. It is implicit in the hypotheses that the function u is of class C¹,² in (0, T) × R, and so the Itô formula applies in this range: in particular,

du(T − t, Xt) = −∂u/∂t (T − t, Xt) dt + ∂u/∂x (T − t, Xt) dXt + ½ ∂²u/∂x² (T − t, Xt) d[X]t
             = (−∂/∂t + G) u(T − t, Xt) dt + ∂u/∂x (T − t, Xt) σ(Xt) dWt
             = ∂u/∂x (T − t, Xt) σ(Xt) dWt.

This implies that the process u(T − t, Xt) is a local martingale. But since the function u is bounded, it then follows by the dominated convergence theorem that u(T − t, Xt) is a proper martingale.
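
For standard Brownian motion (µ ≡ 0, σ ≡ 1, so G = ½∆) the representation (23) can be verified directly, since the bounded solution of the heat equation with initial data f is the Gaussian convolution u(t, x) = E f(x + Wt). A sketch comparing the Monte Carlo expectation with quadrature (the test function is an illustrative choice):

```python
import numpy as np

# For G = (1/2) d^2/dx^2, compare u(T, x0) = E f(x0 + W_T) computed by
# Monte Carlo against quadrature of the Gaussian convolution.
f = lambda z: 1.0 / (1.0 + z**2)      # bounded, smooth test function
T, x0 = 1.0, 0.3

rng = np.random.default_rng(4)
mc = f(x0 + np.sqrt(T) * rng.normal(size=200_000)).mean()

z = np.linspace(-10.0, 10.0, 20001)
g = f(z) * np.exp(-(z - x0)**2 / (2*T)) / np.sqrt(2*np.pi*T)
quad = np.sum((g[1:] + g[:-1]) / 2 * np.diff(z))   # trapezoid rule

print(mc, quad)    # the two estimates agree to a few decimal places
```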

5 Distributional Uniqueness and the Strong Markov Property
Recall that a weak solution to the stochastic differential equation (1) need not be adapted to the
minimal filtration for the driving Brownian motion, so the random variables Xt might somehow
incorporate some “exogenous randomness”. Thus, one might wonder whether the distribution of the process {Xt}t≥0 obtained from a weak solution could depend in a nontrivial way on the particular filtration.
Definition 3. Say that distributional uniqueness holds for the stochastic differential equation (1) if
any two weak solutions Xt and X̃t with the same initial value x have the same finite-dimensional
distributions.
When does distributional uniqueness hold? The answer, it turns out, depends on properties of
the generator G associated with the stochastic differential equation.
Definition 4. Let f : R → R be a bounded, continuous function. The Cauchy problem (more properly,
the Cauchy initial value problem) for the data G, f is the partial differential equation


∂u(t, x)/∂t = Gu(t, x) (24)
with initial condition
u(0, x) = f (x). (25)

Theorem 2. If for every C ∞ function f : R → R with compact support the Cauchy problem has a unique,
bounded solution u : [0, ∞) × R → R then distributional uniqueness holds for the stochastic differential
equation (1).
Thus, the problem of distributional uniqueness devolves to a problem in the theory of partial
differential equations. What do the PDE folks have to tell us about this problem? Quite a lot, it
seems (cf. Avner Friedman, Partial differential equations of parabolic type for an entire book on the
subject). Following is a simple sufficient condition.
Theorem 3. If the coefficients µ(x) and σ(x) are bounded and continuous, then for every C ∞ function
f : R → R with compact support the Cauchy problem has a unique, bounded solution.
I will not prove this, but I will remark that probability theory gives an explicit form for the solution: if Xt^x is any weak solution of the stochastic differential equation (1) with initial value x, then

u(t, x) = Ef(Xt^x) (26)
is the unique solution to the Cauchy problem for the initial data f. Distributional uniqueness implies that the expectation in (26) does not depend on the particular weak solution used, and a simple application of the Itô formula (essentially the same calculation as in the proof of Proposition 2) would show that (26) is a solution of the Cauchy problem if we could show that u is C¹,². Unfortunately, there seems to be no easy way to prove this directly.

Proof of Theorem 2. For any C∞ function f : R → R with compact support, let uf(t, x) be the unique solution to the Cauchy problem (24)–(25) with initial data f. To prove Theorem 2, we must show that for any weak solution Xt to the stochastic differential equation (1) with initial value X0 = x and any choice of times 0 ≤ t1 < · · · < tm < ∞, the joint distribution of (Xt1, Xt2, . . . , Xtm) is completely determined by the functions µ(x) and σ(x). By a routine induction (EXERCISE), it suffices to show that for any s, t > 0 and any C∞ function f with compact support,

E(f(Xt+s) | Fs) = uf(t, Xs) almost surely. (27)

But this follows easily from Proposition 2: since uf satisfies the partial differential equation (24), the process {uf(t − r, Xs+r)}0≤r≤t is a martingale relative to the filtration {Fs+r}0≤r≤t, and so

E(f(Xt+s) | Fs) = uf(t, Xs) almost surely.

Finally, we shall show that distributional uniqueness implies that any weak solution Xt to the stochastic differential equation (1) is a strong Markov process. Because the laws that govern the evolution of the process Xt are not spatially homogeneous, the strong Markov property cannot be formulated in quite the same way as for Brownian motion.

Theorem 4. Assume that distributional uniqueness holds for the stochastic differential equation (1) and that the coefficients µ(x), σ(x) satisfy the hypotheses of Itô's Existence/Uniqueness Theorem. Then any weak solution (Xt)t≥0 with non-random initial point X0 = x has the strong Markov property. More precisely: for any x ∈ R, denote by µx the distribution³ of any weak solution (Xt)t≥0 of (1) with initial point X0 = x. Denote by (Ft)t≥0 the filtration for the weak solution (Xt)t≥0, and let τ be any finite stopping time for this filtration. Then the conditional distribution of the post-τ process (Xτ+t)t≥0 given the σ-algebra Fτ is µXτ.
Proof. By an induction argument similar to that in the proof of Theorem 2, it suffices to show that for any C∞ function f with compact support and any t > 0,

E^x(f(Xτ+t) | Fτ) = uf(t, Xτ),

where uf is defined by equation (26). But the process (X̃t)t≥0 = (Xτ+t)t≥0 solves the stochastic differential equation

dX̃t = µ(X̃t) dt + σ(X̃t) dW̃t (28)

where W̃t = Wτ+t − Wτ and the relevant filtration is (Fτ+t)t≥0. By Itô's theorem, for any y ∈ R the equation (28) has a unique strong solution with initial value X̃0 = y. By hypothesis, this solution has the same distribution as does the process Xt with initial value X0 = y; consequently,

E^x(f(Xτ+t) | Fτ) = (E^y f(Xt))|_{y=Xτ} = uf(t, Xτ).

³ That is, the measure on the space C[0, ∞) of continuous paths x(t) such that for any Borel set B ⊂ C[0, ∞), P((Xt)t≥0 ∈ B) = µx(B).

