Stochastic Differential Equations: Florian Herzog 2010
Do not worry about your problems with mathematics, I assure you mine are far
greater.
Albert Einstein.
Florian Herzog
2010
Stochastic Differential Equations (SDE)
An ordinary differential equation (ODE)
$$\frac{dx(t)}{dt} = f(t, x), \qquad dx(t) = f(t, x)\,dt, \tag{1}$$
with initial condition $x(0) = x_0$ can be written in integral form
$$x(t) = x_0 + \int_0^t f(s, x(s))\,ds, \tag{2}$$
where $x(t) = x(t, x_0, t_0)$ is the solution with initial condition $x(t_0) = x_0$. An example is given as
$$\frac{dx(t)}{dt} = a(t)\,x(t), \qquad x(0) = x_0. \tag{3}$$
When we take the ODE (3) and assume that a(t) is not a deterministic parameter
but rather a stochastic parameter, we get a stochastic differential equation (SDE). The
stochastic parameter a(t) is given as
where ω denotes that X = X(t, ω) is a random variable and possesses the initial
condition X(0, ω) = X0 with probability one. As an example we have already
encountered
$$dY(t,\omega) = \mu(t)\,dt + \sigma(t)\,dW(t,\omega).$$
Furthermore, $f(t, X(t,\omega)) \in \mathbb{R}$, $g(t, X(t,\omega)) \in \mathbb{R}$, and $W(t,\omega) \in \mathbb{R}$. Similarly to (2), we may write (7) as an integral equation
$$X(t,\omega) = X_0 + \int_0^t f(s, X(s,\omega))\,ds + \int_0^t g(s, X(s,\omega))\,dW(s,\omega). \tag{8}$$
For the calculation of the stochastic integral $\int_0^T g(t,\omega)\,dW(t,\omega)$, we assume that $g(t,\omega)$ changes only at discrete time points $t_i$ ($i = 1, 2, 3, \ldots, N-1$), where $0 = t_0 < t_1 < t_2 < \ldots < t_{N-1} < t_N = T$. We define the integral
$$S = \int_0^T g(t,\omega)\,dW(t,\omega) \tag{9}$$
as the limit of the Riemann sum
$$S_N(\omega) = \sum_{i=1}^{N} g(t_{i-1},\omega)\big(W(t_i,\omega) - W(t_{i-1},\omega)\big) \tag{10}$$
with $N \to \infty$.
A random variable S is called the Itô integral of a stochastic process g(t, ω) with
respect to the Brownian motion W (t, ω) on the interval [0, T ] if
$$\lim_{N\to\infty} E\Big[\Big(S - \sum_{i=1}^{N} g(t_{i-1},\omega)\big(W(t_i,\omega) - W(t_{i-1},\omega)\big)\Big)^2\Big] = 0, \tag{11}$$
for each sequence of partitions $(t_0, t_1, \ldots, t_N)$ of the interval $[0, T]$ such that $\max_i (t_i - t_{i-1}) \to 0$. The limit in the above definition converges to the stochastic integral in the mean-square sense. Thus, the stochastic integral is a random variable, the samples of which depend on the individual realizations of the paths $W(\cdot,\omega)$.
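The simplest possible example is $g(t) = c$ for all $t$. This is still a stochastic process, but a simple one. Applying the definition, we get
$$\begin{aligned}
\int_0^T c\,dW(t,\omega) &= c \lim_{N\to\infty} \sum_{i=1}^{N} \big(W(t_i,\omega) - W(t_{i-1},\omega)\big) \\
&= c \lim_{N\to\infty} \big[(W(t_1,\omega) - W(t_0,\omega)) + (W(t_2,\omega) - W(t_1,\omega)) + \ldots + (W(t_N,\omega) - W(t_{N-1},\omega))\big] \\
&= c\,\big(W(T,\omega) - W(0,\omega)\big),
\end{aligned}$$
where $W(T,\omega)$ and $W(0,\omega)$ are values of the standard Brownian motion. With $W(0,\omega) = 0$, the last result becomes
$$\int_0^T c\,dW(t,\omega) = c\,W(T,\omega).$$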
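Example: $g(t,\omega) = W(t,\omega)$
$$\begin{aligned}
\int_0^T W(t,\omega)\,dW(t,\omega) &= \lim_{N\to\infty} \sum_{i=1}^{N} W(t_{i-1},\omega)\big(W(t_i,\omega) - W(t_{i-1},\omega)\big) \\
&= \lim_{N\to\infty} \Big[\frac{1}{2}\sum_{i=1}^{N} \big(W^2(t_i,\omega) - W^2(t_{i-1},\omega)\big) - \frac{1}{2}\sum_{i=1}^{N} \big(W(t_i,\omega) - W(t_{i-1},\omega)\big)^2\Big] \\
&= -\frac{1}{2} \lim_{N\to\infty} \sum_{i=1}^{N} \big(W(t_i,\omega) - W(t_{i-1},\omega)\big)^2 + \frac{1}{2} W^2(T,\omega).
\end{aligned} \tag{12}$$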
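We now take a detailed look at $\lim_{N\to\infty} \sum_{i=1}^{N} (W(t_i,\omega) - W(t_{i-1},\omega))^2$. Its expectation is
$$\begin{aligned}
E\Big[\lim_{N\to\infty} \sum_{i=1}^{N} \big(W(t_i,\omega) - W(t_{i-1},\omega)\big)^2\Big] &= \lim_{N\to\infty} \sum_{i=1}^{N} E\big[(W(t_i,\omega) - W(t_{i-1},\omega))^2\big] \\
&= \lim_{N\to\infty} \sum_{i=1}^{N} (t_i - t_{i-1}) \\
&= T,
\end{aligned}$$
and its variance is
$$\begin{aligned}
\mathrm{Var}\Big[\lim_{N\to\infty} \sum_{i=1}^{N} \big(W(t_i,\omega) - W(t_{i-1},\omega)\big)^2\Big] &= \lim_{N\to\infty} \sum_{i=1}^{N} \mathrm{Var}\big[(W(t_i,\omega) - W(t_{i-1},\omega))^2\big] \\
&= 2 \lim_{N\to\infty} \sum_{i=1}^{N} (t_i - t_{i-1})^2.
\end{aligned}$$
The remaining sum is bounded as follows:
$$\begin{aligned}
\lim_{N\to\infty} \sum_{i=1}^{N} (t_i - t_{i-1})^2 &\le \lim_{N\to\infty} \max_i (t_i - t_{i-1}) \sum_{i=1}^{N} (t_i - t_{i-1}) \\
&= \lim_{N\to\infty} \max_i (t_i - t_{i-1})\, T \\
&= 0,
\end{aligned} \tag{13}$$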
since $\max_i (t_i - t_{i-1}) \to 0$. Since the expected value of $\sum_{i=1}^{N} (W(t_i,\omega) - W(t_{i-1},\omega))^2$ is $T$ and the variance becomes zero, we get
$$\lim_{N\to\infty} \sum_{i=1}^{N} \big(W(t_i,\omega) - W(t_{i-1},\omega)\big)^2 = T. \tag{14}$$
Substituting (14) into (12) gives $\int_0^T W(t,\omega)\,dW(t,\omega) = \frac{1}{2}W^2(T,\omega) - \frac{1}{2}T$.
This is in contrast to our intuition from standard calculus. For a deterministic integral, $\int_0^T x(t)\,dx(t) = \frac{1}{2}x^2(T)$ (with $x(0) = 0$), whereas the Itô integral differs by the term $-\frac{1}{2}T$.
This example shows that the rules of differentiation (in particular the chain rule) and integration need to be reformulated in stochastic calculus.
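The result of this example can be checked numerically on a single simulated path. The following is a minimal sketch, assuming an equidistant partition and using only the Python standard library; the function name, the step count, and the seed are illustrative choices, not part of the original material. It accumulates the left-endpoint sum from (10) for $g = W$ together with the sum of squared increments from (14).

```python
import math
import random

def ito_integral_w_dw(T=1.0, n_steps=20_000, seed=0):
    """Left-endpoint Riemann sum for int_0^T W dW on one simulated path.

    Returns (riemann_sum, closed_form, quadratic_variation).
    """
    rng = random.Random(seed)
    dt = T / n_steps
    w = 0.0            # W(0) = 0
    riemann_sum = 0.0  # sum of W(t_{i-1}) * (W(t_i) - W(t_{i-1}))
    quad_var = 0.0     # sum of (W(t_i) - W(t_{i-1}))^2
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # increment with variance dt
        riemann_sum += w * dw               # integrand taken at the left endpoint
        quad_var += dw * dw
        w += dw
    closed_form = 0.5 * w * w - 0.5 * T     # (1/2) W(T)^2 - (1/2) T
    return riemann_sum, closed_form, quad_var

riemann_sum, closed_form, quad_var = ito_integral_w_dw()
print(riemann_sum, closed_form, quad_var)
```

For a fine partition the sum of squared increments should be close to $T$, and the Riemann sum should be close to $\frac{1}{2}W^2(T) - \frac{1}{2}T$ rather than to $\frac{1}{2}W^2(T)$.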
Proof:
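$$\begin{aligned}
E\Big[\int_0^T g(t,\omega)\,dW(t,\omega)\Big] &= E\Big[\lim_{N\to\infty} \sum_{i=1}^{N} g(t_{i-1},\omega)\big(W(t_i,\omega) - W(t_{i-1},\omega)\big)\Big] \\
&= \lim_{N\to\infty} \sum_{i=1}^{N} E[g(t_{i-1},\omega)]\; E\big[W(t_i,\omega) - W(t_{i-1},\omega)\big] \\
&= 0.
\end{aligned}$$
The second equality uses the independence of $g(t_{i-1},\omega)$ from the increment $W(t_i,\omega) - W(t_{i-1},\omega)$, which has mean zero.
The expectation of stochastic integrals is zero. This is what we would expect anyway.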
Proof:
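$$\begin{aligned}
\mathrm{Var}\Big[\int_0^T g(t,\omega)\,dW(t,\omega)\Big] &= E\Big[\Big(\int_0^T g(t,\omega)\,dW(t,\omega)\Big)^2\Big] \\
&= E\Big[\Big(\lim_{N\to\infty} \sum_{i=1}^{N} g(t_{i-1},\omega)\big(W(t_i,\omega) - W(t_{i-1},\omega)\big)\Big)^2\Big] \\
&= \lim_{N\to\infty} \sum_{i=1}^{N} \sum_{j=1}^{N} E\big[g(t_{i-1},\omega)\,g(t_{j-1},\omega)\,(W(t_i,\omega) - W(t_{i-1},\omega))(W(t_j,\omega) - W(t_{j-1},\omega))\big] \\
&= \lim_{N\to\infty} \sum_{i=1}^{N} E[g^2(t_{i-1},\omega)]\,(t_i - t_{i-1}) \\
&= \int_0^T E[g^2(t,\omega)]\,dt.
\end{aligned} \tag{17}$$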
The calculation of the variance of the Itô integral shows two important properties:
• $E\Big[\Big(\int_0^T g(t,\omega)\,dW(t,\omega)\Big)^2\Big] = \int_0^T E[g^2(t,\omega)]\,dt$
• $\int_0^T E[g^2(t,\omega)]\,dt < \infty$
The second property is the condition of existence for Itô integrals. The next property is
the linearity of Itô integrals:
$$\int_0^T [a_1 g_1(t,\omega) + a_2 g_2(t,\omega)]\,dW(t,\omega) = a_1 \int_0^T g_1(t,\omega)\,dW(t,\omega) + a_2 \int_0^T g_2(t,\omega)\,dW(t,\omega), \tag{18}$$
for numbers $a_1, a_2$ and stochastic functions $g_1(t,\omega)$, $g_2(t,\omega)$.
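The zero-mean property and the variance formula (17) can be illustrated by a small Monte Carlo experiment. The sketch below assumes $g(t,\omega) = W(t,\omega)$ and $T = 1$, for which the right-hand side of (17) is $\int_0^T t\,dt = T^2/2$. The function name, sample sizes, and seed are illustrative assumptions, and the estimates carry sampling and discretization error.

```python
import math
import random

def ito_integral_moments(T=1.0, n_steps=100, n_paths=3000, seed=1):
    """Monte Carlo estimates of E[I] and E[I^2] for I = int_0^T W dW."""
    rng = random.Random(seed)
    dt = T / n_steps
    sqrt_dt = math.sqrt(dt)
    total, total_sq = 0.0, 0.0
    for _ in range(n_paths):
        w, integral = 0.0, 0.0
        for _ in range(n_steps):
            dw = rng.gauss(0.0, sqrt_dt)
            integral += w * dw  # integrand evaluated at the left endpoint
            w += dw
        total += integral
        total_sq += integral * integral
    return total / n_paths, total_sq / n_paths

mean_I, second_moment_I = ito_integral_moments()
isometry_rhs = 0.5 * 1.0 ** 2  # int_0^T E[W^2(t)] dt = T^2 / 2 for T = 1
print(mean_I, second_moment_I, isometry_rhs)
```

The first estimate should be near zero and the second near $T^2/2$; both only agree up to Monte Carlo noise.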
As shown in the second example, the rules of classical calculus are not valid for stochastic integrals and differential equations. Itô's lemma is the equivalent of the chain rule in classical calculus. The problem can be stated as follows:
Given a stochastic differential equation
$$dX(t) = f(t, X(t))\,dt + g(t, X(t))\,dW(t)$$
and a process $Y(t) = \phi(t, X(t))$, where the function $\phi(t, X(t))$ is continuously differentiable in $t$ and twice continuously differentiable in $X$, find the stochastic differential equation for the process $Y(t)$.
In the case $g(t, X(t)) = 0$, we know the result: the chain rule of standard calculus, $dY(t) = [\phi_t(t, X) + \phi_x(t, X) f(t, X(t))]\,dt$. In the stochastic case, a Taylor series expansion of $\phi$ gives
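$$dY(t) = \phi_t(t, X)\,dt + \frac{1}{2}\phi_{tt}(t, X)\,dt^2 + \phi_x(t, X)\,dX(t) + \frac{1}{2}\phi_{xx}(t, X)\,(dX(t))^2 + \text{h.o.t.} \tag{21}$$
Inserting the SDE for $X(t)$ yields
$$\begin{aligned}
dY(t) = {} & \phi_t(t, X)\,dt + \phi_x(t, X)\big[f(t, X(t))\,dt + g(t, X(t))\,dW(t)\big] \\
& + \frac{1}{2}\Big(\phi_{tt}(t, X)\,dt^2 + \phi_{xx}(t, X)\big(f^2(t, X(t))\,dt^2 + g^2(t, X(t))\,dW^2(t) + 2 f(t, X(t))\,g(t, X(t))\,dt\,dW(t)\big)\Big) + \text{h.o.t.}
\end{aligned} \tag{22}$$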
The higher-order differentials tend to zero quickly: $dt^2 \to 0$ and $dt\,dW(t) \to 0$. The stochastic term $dW^2(t)$, according to the rules of Brownian motion, is given as
$$dW^2(t,\omega) = dt. \tag{23}$$
Omitting higher-order terms and using the properties of Brownian motion, we arrive at
$$dY(t) = \Big[\phi_t(t, X) + \phi_x(t, X)\,f(t, X(t)) + \frac{1}{2}\phi_{xx}(t, X)\,g^2(t, X(t))\Big]dt + \phi_x(t, X)\,g(t, X(t))\,dW(t). \tag{24}$$
The term $\frac{1}{2}\phi_{xx}(t, X)\,g^2(t, X(t))$ is often called the Itô correction term, since it does not occur in the deterministic case.
We apply Itô's formula to the following problem: $\phi(t, X) = X^2$ with the SDE $dX(t) = dW(t)$. From the SDE, we get $X(t) = W(t)$ and calculate the partial derivatives $\frac{\partial \phi(t,X)}{\partial X} = 2X$, $\frac{\partial^2 \phi(t,X)}{\partial X^2} = 2$, and $\frac{\partial \phi(t,X)}{\partial t} = 0$. The Itô lemma yields
$$d(W^2(t)) = 1\,dt + 2W(t)\,dW(t). \tag{28}$$
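As a consistency check, (28) can be integrated over $[0, T]$ to recover the value of the Itô integral of $W$ with respect to itself. The short derivation below assumes $W(0) = 0$:
$$W^2(T) - W^2(0) = \int_0^T 1\,dt + 2\int_0^T W(t)\,dW(t) = T + 2\int_0^T W(t)\,dW(t) \quad\Longrightarrow\quad \int_0^T W(t)\,dW(t) = \frac{1}{2}W^2(T) - \frac{1}{2}T.$$
This agrees with (12) once the limit (14) is inserted.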
We now allow the process $X(t)$ to be in $\mathbb{R}^n$. We let $W(t)$ be an $m$-dimensional standard Brownian motion, with $f(t, X(t)) \in \mathbb{R}^n$ and $g(t, X(t)) \in \mathbb{R}^{n\times m}$. Consider a scalar process $Y(t)$ defined by $Y(t) = \phi(t, X(t))$, where $\phi(t, X)$ is a scalar function which is continuously differentiable with respect to $t$ and twice continuously differentiable with respect to $X$. The Itô formula can be written in vector notation as follows:
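We want to find the SDE for the process $Y$ related to $S$ as follows: $Y(t) = \phi(t, S) = \ln(S(t))$. The partial derivatives are $\frac{\partial \phi(t,S)}{\partial S} = \frac{1}{S}$, $\frac{\partial^2 \phi(t,S)}{\partial S^2} = -\frac{1}{S^2}$, and $\frac{\partial \phi(t,S)}{\partial t} = 0$. Therefore, according to Itô, we get
$$dY(t) = \Big(\frac{\partial \phi(t,S)}{\partial t} + \frac{\partial \phi(t,S)}{\partial S}\,\mu S(t) + \frac{1}{2}\frac{\partial^2 \phi(t,S)}{\partial S^2}\,\sigma^2 S^2(t)\Big)dt + \Big(\frac{\partial \phi(t,S)}{\partial S}\,\sigma S(t)\Big)dW(t), \tag{34}$$
$$dY(t) = \Big(\mu - \frac{1}{2}\sigma^2\Big)dt + \sigma\,dW(t). \tag{35}$$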
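The mechanical part of Itô's formula (24) lends itself to a small helper. The sketch below is an illustration only: the function `ito_coefficients` and the numerical values of `mu` and `sigma` are assumptions introduced here. It takes the partial derivatives of $\phi$ and the coefficients $f$, $g$ as Python callables and returns the drift and diffusion of $Y = \phi(t, X)$. It is applied to the two examples of the text, $\phi = \ln S$ and $\phi = X^2$.

```python
def ito_coefficients(phi_t, phi_x, phi_xx, f, g):
    """Return (drift, diffusion) of Y = phi(t, X) according to Ito's formula (24)."""
    def drift(t, x):
        return phi_t(t, x) + phi_x(t, x) * f(t, x) + 0.5 * phi_xx(t, x) * g(t, x) ** 2

    def diffusion(t, x):
        return phi_x(t, x) * g(t, x)

    return drift, diffusion

# Example: phi = ln(S) with drift mu*S and diffusion sigma*S, cf. (34)
mu, sigma = 0.1, 0.2  # illustrative values, not taken from the text
log_drift, log_diffusion = ito_coefficients(
    phi_t=lambda t, s: 0.0,
    phi_x=lambda t, s: 1.0 / s,
    phi_xx=lambda t, s: -1.0 / s ** 2,
    f=lambda t, s: mu * s,
    g=lambda t, s: sigma * s,
)

# Example: phi = X^2 with dX = dW (f = 0, g = 1), cf. (28)
sq_drift, sq_diffusion = ito_coefficients(
    phi_t=lambda t, x: 0.0,
    phi_x=lambda t, x: 2.0 * x,
    phi_xx=lambda t, x: 2.0,
    f=lambda t, x: 0.0,
    g=lambda t, x: 1.0,
)

print(log_drift(0.0, 3.7), log_diffusion(0.0, 3.7))  # mu - sigma^2/2 and sigma
print(sq_drift(0.0, 1.5), sq_diffusion(0.0, 1.5))    # 1 and 2*x
```

For the logarithm the drift no longer depends on $S$, which is what makes the direct integration of (35) possible.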
Since the right-hand side of (35) is independent of $Y(t)$, we are able to compute the stochastic integral:
$$Y(t) = Y_0 + \int_0^t \Big(\mu - \frac{1}{2}\sigma^2\Big)ds + \int_0^t \sigma\,dW(s), \tag{36}$$
$$Y(t) = Y_0 + \Big(\mu - \frac{1}{2}\sigma^2\Big)t + \sigma W(t). \tag{37}$$
Since $Y(t) = \ln S(t)$, we have found a solution for $S(t)$:
$$\ln(S(t)) = \ln(S(0)) + \Big(\mu - \frac{1}{2}\sigma^2\Big)t + \sigma W(t), \tag{38}$$
$$S(t) = S(0)\,e^{(\mu - \frac{1}{2}\sigma^2)t + \sigma W(t)}. \tag{39}$$
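Because (39) expresses $S(t)$ directly in terms of $W(t)$, terminal values can be sampled without any time stepping. The following sketch assumes illustrative parameter values and hypothetical function names. It draws $W(t)$ as a Gaussian variable with variance $t$ and evaluates (39); the sample mean of $\ln S(t)$ can then be compared with $(\mu - \frac{1}{2}\sigma^2)t$ from (38), and the sample mean of $S(t)$ with the known log-normal mean $S(0)e^{\mu t}$.

```python
import math
import random

def gbm_exact(s0, mu, sigma, t, w_t):
    """Evaluate S(t) = S(0) exp((mu - sigma^2/2) t + sigma W(t)), cf. (39)."""
    return s0 * math.exp((mu - 0.5 * sigma ** 2) * t + sigma * w_t)

def sample_gbm_terminal(s0=1.0, mu=0.1, sigma=0.2, t=1.0, n_samples=20_000, seed=2):
    """Draw samples of S(t); W(t) is Gaussian with mean 0 and variance t."""
    rng = random.Random(seed)
    return [gbm_exact(s0, mu, sigma, t, rng.gauss(0.0, math.sqrt(t)))
            for _ in range(n_samples)]

samples = sample_gbm_terminal()
sample_mean = sum(samples) / len(samples)
log_mean = sum(math.log(s) for s in samples) / len(samples)
print(sample_mean, math.exp(0.1))  # compare with S(0) exp(mu t)
print(log_mean, 0.1 - 0.5 * 0.2 ** 2)  # compare with (mu - sigma^2/2) t
```

All samples are positive by construction, since (39) is an exponential.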
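We show that we obtain the same result as in the previous formula by applying Itô's lemma (40).
The partial derivatives of $U$ are $\frac{\partial U}{\partial X} = (X_2(t), X_1(t))^T$, $\frac{\partial^2 U}{\partial X^2} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, and $\frac{\partial U}{\partial t} = 0$.
$$\begin{aligned}
dU(t) = {} & \Big[\frac{\partial U}{\partial t} + \frac{\partial U}{\partial X}\,[f_1(t, X_1), f_2(t, X_2)]^T + \frac{1}{2}\,\mathrm{tr}\Big(\frac{\partial^2 U}{\partial X^2} \begin{pmatrix} g_1(t, X_1)^2 & g_1(t, X_1)\,g_2(t, X_2) \\ g_1(t, X_1)\,g_2(t, X_2) & g_2(t, X_2)^2 \end{pmatrix}\Big)\Big]dt \\
& + \frac{\partial U}{\partial X}\,[g_1(t, X_1), g_2(t, X_2)]^T\,dW(t) \\
= {} & [X_2(t)\,f_1(t, X_1) + X_1(t)\,f_2(t, X_2) + g_1(t, X_1)\,g_2(t, X_2)]\,dt \\
& + [X_2(t)\,g_1(t, X_1) + X_1(t)\,g_2(t, X_2)]\,dW(t)
\end{aligned}$$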
We classify SDEs into two large groups, linear SDEs and non-linear SDEs. Furthermore,
we distinguish between scalar linear and vector-valued linear SDEs.
We start with the easy case, the scalar linear SDEs. An SDE
for a one-dimensional stochastic process X(t) is called a linear (scalar) SDE if and
only if the functions f (t, X(t)) and g(t, X(t)) are affine functions of X(t) ∈ R and
thus
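$$X(t) = \Phi(t)\Big(x_0 + \int_0^t \Phi^{-1}(s)\Big[a(s) - \sum_{i=1}^{m} B_i(s)\,b_i(s)\Big]ds + \sum_{i=1}^{m} \int_0^t \Phi^{-1}(s)\,b_i(s)\,dW_i(s)\Big), \tag{42}$$
$$\Phi(t) = \exp\Big(\int_0^t \Big[A(s) - \sum_{i=1}^{m} \frac{B_i^2(s)}{2}\Big]ds + \sum_{i=1}^{m} \int_0^t B_i(s)\,dW_i(s)\Big), \tag{43}$$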
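We compute the expectation $m(t) = E[X(t)]$ and the second moment $P(t) = E[X^2(t)]$ for
$$dX(t) = (A(t)X(t) + a(t))\,dt + \sum_{i=1}^{m} (B_i(t)X(t) + b_i(t))\,dW_i(t). \tag{47}$$
The ODE for the expectation is derived by applying the expectation operator to both sides of (47).
$$\begin{aligned}
E[dX(t)] &= E\Big[(A(t)X(t) + a(t))\,dt + \sum_{i=1}^{m} (B_i(t)X(t) + b_i(t))\,dW_i(t)\Big] \\
&= (A(t)\,E[X(t)] + a(t))\,dt + \sum_{i=1}^{m} E[(B_i(t)X(t) + b_i(t))]\,\underbrace{E[dW_i(t)]}_{=0}
\end{aligned}$$
Hence $dm(t) = (A(t)\,m(t) + a(t))\,dt$.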
In order to compute the second moment, we need to derive the SDE for $Y(t) = X^2(t)$:
$$dY(t) = \Big[2X(t)(A(t)X(t) + a(t)) + \sum_{i=1}^{m} \big(B_i(t)X(t) + b_i(t)\big)^2\Big]dt + 2X(t)\sum_{i=1}^{m} \big(B_i(t)X(t) + b_i(t)\big)\,dW_i(t) \tag{51}$$
$$dY(t) = \Big[2A(t)X^2(t) + 2X(t)\,a(t) + \sum_{i=1}^{m} \big(B_i^2(t)X^2(t) + 2B_i(t)\,b_i(t)X(t) + b_i^2(t)\big)\Big]dt + 2X(t)\sum_{i=1}^{m} \big(B_i(t)X(t) + b_i(t)\big)\,dW_i(t) \tag{52}$$
Furthermore, we apply the expectation operator to (52) and use $P(t) = E[X^2(t)] = E[Y(t)]$ and $m(t) = E[X(t)]$.
$$\begin{aligned}
E[dY(t)] = {} & \Big[2A(t)\,E[X^2(t)] + 2a(t)\,E[X(t)] + \sum_{i=1}^{m} \big(B_i^2(t)\,E[X^2(t)] + 2B_i(t)\,b_i(t)\,E[X(t)] + b_i^2(t)\big)\Big]dt \\
& + E\Big[2X(t)\sum_{i=1}^{m} \big(B_i(t)X(t) + b_i(t)\big)\,dW_i(t)\Big]
\end{aligned}$$
$$dP(t) = \Big[2A(t)P(t) + 2a(t)\,m(t) + \sum_{i=1}^{m} \big(B_i^2(t)P(t) + 2B_i(t)\,b_i(t)\,m(t) + b_i^2(t)\big)\Big]dt$$
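The moment equations form a pair of ordinary differential equations that can be integrated numerically. The sketch below assumes constant coefficients and a deterministic initial value, so that $m(0) = x_0$ and $P(0) = x_0^2$. It uses the ODE for $P(t)$ derived above together with $dm(t) = (A\,m(t) + a)\,dt$, which follows from taking expectations of (47) because $E[dW_i(t)] = 0$. The function names, the classical Runge-Kutta integrator, and the parameter values are illustrative choices.

```python
import math

def moment_odes(m, p, A, a, B, b):
    """Right-hand sides of dm/dt and dP/dt for constant coefficients."""
    dm = A * m + a
    dp = 2 * A * p + 2 * a * m + sum(
        Bi ** 2 * p + 2 * Bi * bi * m + bi ** 2 for Bi, bi in zip(B, b))
    return dm, dp

def integrate_moments(x0, A, a, B, b, T=1.0, n_steps=1000):
    """Classical RK4 integration of (m, P) starting from m(0)=x0, P(0)=x0^2."""
    h = T / n_steps
    m, p = x0, x0 ** 2
    for _ in range(n_steps):
        k1 = moment_odes(m, p, A, a, B, b)
        k2 = moment_odes(m + 0.5 * h * k1[0], p + 0.5 * h * k1[1], A, a, B, b)
        k3 = moment_odes(m + 0.5 * h * k2[0], p + 0.5 * h * k2[1], A, a, B, b)
        k4 = moment_odes(m + h * k3[0], p + h * k3[1], A, a, B, b)
        m += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        p += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return m, p

# Geometric case: A = mu, a = 0, B_1 = sigma, b_1 = 0 (illustrative values)
m_T, P_T = integrate_moments(x0=1.0, A=0.1, a=0.0, B=[0.2], b=[0.0])
print(m_T, math.exp(0.1))                  # m(T) = x0 exp(mu T)
print(P_T, math.exp(2 * 0.1 + 0.2 ** 2))   # P(T) = x0^2 exp((2 mu + sigma^2) T)
```

In the geometric case both ODEs are linear and decoupled, so the numerical values can be compared with the exponentials printed next to them. The variance then follows as $P(T) - m^2(T)$.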
There are some specific scalar linear SDEs which are found to be quite useful in practice.
The simplest case of SDE is where the drift and the diffusion coefficients are independent
of the information received over time
This model has been used to simulate commodity prices, such as metals or agricultural
products.
The mean is $E[S(t)] = \mu t + S_0$ and the variance is $\mathrm{Var}[S(t)] = \sigma^2 t$. $S(t)$ fluctuates around the straight line $S_0 + \mu t$. The process is normally distributed with the given mean and variance.
The standard model of stock prices is the geometric Brownian motion as given by
Another very popular class of SDEs are mean reverting linear SDEs. The model is
obtained by
$$\mathrm{Var}[S(t)] = \frac{\sigma^2}{2\kappa}\big(1 - e^{-2\kappa t}\big).$$
$$\lim_{t\to\infty} E[S(t)] = \mu$$
and
$$\lim_{t\to\infty} \mathrm{Var}[S(t)] = \frac{\sigma^2}{2\kappa}.$$
This analysis shows that the process fluctuates around $\mu$ and has a long-run variance of $\frac{\sigma^2}{2\kappa}$, which depends on the parameter $\kappa$: the higher $\kappa$, the lower the variance. This is intuitive, since the higher $\kappa$, the faster the process reverts to its mean value.
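The dependence of the variance on $\kappa$ can be tabulated directly from the formula for $\mathrm{Var}[S(t)]$ given above. The snippet below is a small sketch with hypothetical function names and illustrative parameter values. It evaluates the variance at a few time points for several values of $\kappa$ and prints the limiting value $\sigma^2/(2\kappa)$ next to them.

```python
import math

def mean_reversion_variance(t, kappa, sigma):
    """Var[S(t)] = sigma^2 / (2 kappa) * (1 - exp(-2 kappa t))."""
    return sigma ** 2 / (2.0 * kappa) * (1.0 - math.exp(-2.0 * kappa * t))

def stationary_variance(kappa, sigma):
    """Limit of Var[S(t)] for t -> infinity."""
    return sigma ** 2 / (2.0 * kappa)

sigma = 0.5  # illustrative value
for kappa in (0.5, 2.0, 8.0):
    row = [round(mean_reversion_variance(t, kappa, sigma), 4) for t in (0.5, 1.0, 5.0)]
    print(f"kappa={kappa}: Var at t=0.5, 1, 5 -> {row}, "
          f"limit {stationary_variance(kappa, sigma):.4f}")
```

A larger $\kappa$ both lowers the limiting variance and makes the variance approach its limit faster.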
A popular extension is where the diffusion term scales with the current value, i.e., the geometric mean-reverting process:
The first mean-reversion model (57) may produce negative values even for $\mu > 0$. Since the second mean-reversion model always has positive realizations, it is also called log-normal mean reversion. This type of model is used to model interest rates or volatilities.
In this equation, X(t) is normally distributed because the Brownian motion is just
multiplied by time-dependent factors.
When we compute an optimal control law for this SDE, the deterministic optimal control
law (ignoring the Brownian motion) and the stochastic optimal control law are the same.
This feature is called certainty equivalence. For this reason, the stochastics are often
ignored in control engineering.
The logical extension of scalar SDEs is to allow X(t) ∈ Rn to be a vector. The rest of
this section proceeds in a similar fashion as for scalar linear SDEs. A stochastic vector
differential equation
$$X(t) = \Phi(t)\Big(x_0 + \int_0^t \Phi^{-1}(s)\Big[a(s) - \sum_{i=1}^{m} B_i(s)\,b_i(s)\Big]ds + \sum_{i=1}^{m} \int_0^t \Phi^{-1}(s)\,b_i(s)\,dW_i(s)\Big), \tag{61}$$
where the fundamental matrix $\Phi(t) \in \mathbb{R}^{n\times n}$ is the solution of the homogeneous stochastic differential equation
$$d\Phi(t) = A(t)\,\Phi(t)\,dt + \sum_{i=1}^{m} B_i(t)\,\Phi(t)\,dW_i(t), \tag{62}$$
with initial condition $\Phi(0) = I$, $I \in \mathbb{R}^{n\times n}$. We now prove that (61) and (62) are solutions of (59). We rewrite (61) as
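$$X(t) = \Phi(t)\Big(x_0 + \int_0^t \Phi^{-1}(s)\,dY(s)\Big)$$
with
$$dY(t) = \Big[a(t) - \sum_{i=1}^{m} B_i(t)\,b_i(t)\Big]dt + \sum_{i=1}^{m} b_i(t)\,dW_i(t).$$
Writing
$$X(t) = \Phi(t)\,Z(t), \qquad Z(t) = x_0 + \int_0^t \Phi^{-1}(s)\,dY(s),$$
we have
$$dZ(t) = \Phi^{-1}(t)\,dY(t).$$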
Noting that $Z(t) = \Phi^{-1}(t)X(t)$ and using the SDE for $Y(t)$, we get
$$\begin{aligned}
dX(t) &= dY(t) + A(t)\,\Phi(t)Z(t)\,dt + \sum_{i=1}^{m} B_i(t)\,\Phi(t)Z(t)\,dW_i(t) + \sum_{i=1}^{m} B_i(t)\,b_i(t)\,dt \\
&= \Big[a(t) - \sum_{i=1}^{m} B_i(t)\,b_i(t)\Big]dt + \sum_{i=1}^{m} b_i(t)\,dW_i(t) + A(t)X(t)\,dt + \sum_{i=1}^{m} B_i(t)X(t)\,dW_i(t) + \sum_{i=1}^{m} B_i(t)\,b_i(t)\,dt \\
&= [a(t) + A(t)X(t)]\,dt + \sum_{i=1}^{m} (B_i(t)X(t) + b_i(t))\,dW_i(t).
\end{aligned}$$
The expectation $m(t) = E[X(t)] \in \mathbb{R}^n$ and the second moment matrix $P(t) = E[X(t)X^T(t)] \in \mathbb{R}^{n\times n}$ can be computed as follows:
The covariance matrix for the system of linear SDEs is given by
$$V(t) = \mathrm{Var}\{x(t)\} = P(t) - m(t)\,m^T(t). \tag{65}$$
where
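As a first example of a linear vector-valued SDE, we consider a two-dimensional geometric Brownian motion:
$$dS_1(t) = \mu_1 S_1(t)\,dt + S_1(t)\big(\sigma_{11}\,dW_1(t) + \sigma_{12}\,dW_2(t)\big), \tag{66}$$
$$dS_2(t) = \mu_2 S_2(t)\,dt + S_2(t)\big(\sigma_{21}\,dW_1(t) + \sigma_{22}\,dW_2(t)\big). \tag{67}$$
Written in matrix form with $S = (S_1, S_2)^T$, the same SDE is given by:
$$A(t) = \begin{pmatrix} \mu_1 & 0 \\ 0 & \mu_2 \end{pmatrix}, \quad a(t) = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \quad B_1(t) = \begin{pmatrix} \sigma_{11} & 0 \\ 0 & \sigma_{21} \end{pmatrix}, \quad B_2(t) = \begin{pmatrix} \sigma_{12} & 0 \\ 0 & \sigma_{22} \end{pmatrix}$$
Both processes $S_1(t)$ and $S_2(t)$ are correlated if $\sigma_{12} = \sigma_{21} \neq 0$. This model can easily be extended to $n$ processes.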
The observed volatility of real price processes, such as stocks or bonds, is itself a stochastic process. The following model describes this observation:
where $\theta$ is the average volatility, $\sigma_1$ a volatility, and $\kappa$ the mean reversion rate of the volatility process $\sigma(t)$. If this model is used for stock prices, the transformation $P(t) = \ln(S(t))$ is useful. The two Brownian motions $dW_1(t)$ and $dW_2(t)$ are correlated with $\mathrm{corr}[dW_1(t), dW_2(t)] = \rho$. This model captures the behavior of real prices better, and its distribution of returns shows "fatter tails".
where $x(t) = (P(t), \sigma(t))^T$. The system (68) has the property that the variance of $P(t)$ depends on the initial condition $\sigma_0$. For the parameters $\mu = 0.1$, $\kappa = 2$, $\theta = 0.2$, $\sigma_1 = 0.5$ and $\rho = 0.5$, we calculate the standard deviation of $P(t)$ with $\sigma_0 = 0.1$ and alternatively with $\sigma_0 = 0.8$. The expected value of $\sigma(t)$ evolves over time as $m(t) = \theta + (\sigma_0 - \theta)e^{-\kappa t}$, and thus the variance of $P(t)$ depends on $\sigma_0$.
[Figure: standard deviation of $P(t)$ versus time (0 to 5) for $\sigma_0 = 0.1$ and $\sigma_0 = 0.8$.]
In comparison with linear SDEs, nonlinear SDEs are less well understood: no general solution theory exists, and there are no explicit formulae for calculating the moments. In this section, we show some examples of nonlinear SDEs and their properties.
In general, a scalar square root process can be written as
where A(t), a(t), and B(t) are real scalars. The nonlinear mean reverting SDEs differ
from the linear scalar equations by their nonlinear diffusion term. For this process, the
distribution and moments can be calculated.
For a specific square root process with $A(t) = 0$, $a(t) = 1$ and $B(t) = 2$ we are able to derive the analytical solution. The SDE
$$dX(t) = 1\,dt + 2\sqrt{X(t)}\,dW(t), \qquad X(0) = x_0,$$
has the solution $X(t) = (W(t) + \sqrt{x_0})^2$. We verify the solution using Itô's formula. We use $\Phi(t) = X(t) = (Y(t) + \sqrt{x_0})^2$ and $dY(t) = dW(t)$. The partial derivatives are $\Phi_t = 0$, $\Phi_Y = 2(Y(t) + \sqrt{x_0})$, and $\Phi_{YY} = 2$. Thus
$$d\Phi(t) = \Big[\Phi_t + \Phi_Y \cdot 0 + \frac{1}{2}\Phi_{YY} \cdot 1\Big]dt + \Phi_Y \cdot 1\,dW(t),$$
$$d\Phi(t) = 1\,dt + 2(Y(t) + \sqrt{x_0})\,dW(t) \quad\Rightarrow\quad dX(t) = 1\,dt + 2\sqrt{X(t)}\,dW(t),$$
since $\sqrt{X(t)} = Y(t) + \sqrt{x_0}$ as long as $Y(t) + \sqrt{x_0} \ge 0$.
The expected value for (69) is $E[S(t)] = S_0 e^{\mu t}$ and the variance is obtained by
$$\mathrm{Var}[S(t)] = \frac{\sigma^2 S_0}{\mu}\big(e^{2\mu t} - e^{\mu t}\big).$$
Another widely used mean reversion model is obtained by
Using the transformation $P(t) = \ln(S(t))$ yields the linear mean-reverting and normally distributed process $P(t)$:
$$dP(t) = \kappa\Big[\Big(\mu - \frac{\sigma^2}{2\kappa}\Big) - P(t)\Big]dt + \sigma\,dW(t). \tag{71}$$
Because of the transformation, S(t) is log-normally distributed. This model is used
to model stock prices, stochastic volatilities, and electricity prices. Because S(t) is
log-normally distributed, S(t) is always positive.
where the first integral is a path-wise Riemann integral and the second integral is an
Itô integral.
In this definition, it is assumed that the functions f (t, X(t)) and g(t, X(t)) are
sufficiently smooth in order to guarantee the existence of the solution X(t).
There are several ways of finding analytical solutions. One way is to guess a solution and use the Itô calculus to verify that it is a solution of the SDE under consideration.
For some classes of SDEs, analytical formulas exist to find the solution, e.g. consider the following SDE:
where $X(t) \in \mathbb{R}^n$, $f(t, X(t)) \in \mathbb{R}^n$ is an arbitrary function, $\sigma(t) \in \mathbb{R}^{n\times m}$ and $dW(t) \in \mathbb{R}^m$. This class of SDEs has the following general solution:
Since $F(t)$ is known, we are able to solve for $Y(t)$ as a function of $F(t)$.
Using Itô's lemma, we show that $X(t) = Y(t) + F(t)$ solves the SDE.
This solution is not very surprising, since $X(t)$ is the sum of the process $Y(t)$ and the Brownian motion term $F(t)$.
For another class of SDEs, there exists an analytical formula for the solution:
The proof is similar to the first case, since the diffusion is linear.
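$$dX(t) = \frac{dt}{X(t)} + \alpha X(t)\,dW(t), \qquad X(0) = x_0.$$
$$F(t) = e^{\frac{1}{2}\alpha^2 t - \alpha W(t)}, \qquad dY(t) = \frac{F(t)}{F^{-1}(t)\,Y(t)}\,dt = \frac{F^2(t)}{Y(t)}\,dt$$
$$dY(t)\,Y(t) = F^2(t)\,dt, \qquad \frac{1}{2}Y^2(t) = \int_0^t F^2(s)\,ds + C_0$$
$$Y(t) = \Big(x_0^2 + 2\int_0^t e^{\alpha^2 s - 2\alpha W(s)}\,ds\Big)^{\frac{1}{2}}$$
$$X(t) = e^{-\frac{1}{2}\alpha^2 t + \alpha W(t)}\Big(x_0^2 + 2\int_0^t e^{\alpha^2 s - 2\alpha W(s)}\,ds\Big)^{\frac{1}{2}}$$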
However, most SDEs, especially nonlinear SDEs, do not have analytical solutions so
that one has to resort to numerical approximation schemes in order to simulate sample
paths of solutions to the given equation.
The simplest scheme is obtained by using a first-order approximation. This is called the
Euler scheme
where $\varepsilon(\cdot)$ is a discrete-time Gaussian white noise process with mean 0 and standard deviation 1.
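A first-order scheme of this kind can be sketched in a few lines. The code below assumes the common Euler form $X(t_{k+1}) = X(t_k) + f(t_k, X(t_k))\,\Delta t + g(t_k, X(t_k))\,\sqrt{\Delta t}\,\varepsilon(t_k)$ for a scalar SDE, since the exact formula is not reproduced in the text here. The function name `euler_scheme`, the parameter values, and the seed are illustrative assumptions. As a check, the scheme is applied to geometric Brownian motion with drift $\mu S$ and diffusion $\sigma S$ and compared with the closed-form solution (39) driven by the same Brownian path.

```python
import math
import random

def euler_scheme(f, g, x0, T, n_steps, rng):
    """Simulate one path of dX = f(t, X) dt + g(t, X) dW on [0, T].

    Returns the lists of time points, states, and Brownian values W(t_k).
    """
    dt = T / n_steps
    t, x, w = 0.0, x0, 0.0
    ts, xs, ws = [t], [x], [w]
    for _ in range(n_steps):
        eps = rng.gauss(0.0, 1.0)   # Gaussian white noise, mean 0, std 1
        dw = math.sqrt(dt) * eps    # Brownian increment over one step
        x = x + f(t, x) * dt + g(t, x) * dw
        t += dt
        w += dw
        ts.append(t)
        xs.append(x)
        ws.append(w)
    return ts, xs, ws

mu, sigma, s0 = 0.1, 0.2, 1.0  # illustrative values
ts, xs, ws = euler_scheme(lambda t, s: mu * s, lambda t, s: sigma * s,
                          x0=s0, T=1.0, n_steps=1000, rng=random.Random(3))
# Closed-form solution (39) evaluated on the same Brownian path
exact_T = s0 * math.exp((mu - 0.5 * sigma ** 2) * ts[-1] + sigma * ws[-1])
euler_error = abs(xs[-1] - exact_T)
print(xs[-1], exact_T, euler_error)
```

The difference between the two terminal values is the discretization error of the scheme on this path; it shrinks as the number of steps grows. For SDEs without a closed-form solution no such reference value is available, and the step size has to be chosen by comparing runs with different resolutions.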