Continuous Time vs. Discrete Time
We need a statistical model that can apply to stock prices, returns, etc. The point of these notes is to quickly present the standard diffusion model that we use in asset pricing.
We use continuous time when it's simpler. In point of fact all our data are discretely sampled. Whether time is "really" continuous or discrete is an issue we'll leave to physicists and philosophers. Most of the approximations that took a lot of effort in Chapter 1 of Asset Pricing work much more easily in continuous time.
2.1 Preview/summary:
1. Brownian motion
$$z_{t+\Delta} - z_t \sim N(0, \Delta)$$
2. Differential
$$dz_t = \lim_{\Delta \downarrow 0}\,(z_{t+\Delta} - z_t)$$
3. $dz$ and $dt$
$$dz_t \sim O(\sqrt{dt}); \quad dz_t^2 = dt$$
$$E_t(dz_t) = 0$$
$$\mathrm{var}_t(dz_t) = E_t(dz_t^2) = dz_t^2 = dt$$
4. Diffusions
5. Examples
$$\frac{dp_t}{p_t} = \mu\,dt + \sigma\,dz_t$$
$$dx_t = -\phi(x_t - \mu)\,dt + \sigma\,dz_t$$
Please report typos to [email protected]. Make sure you have the latest version of the document. If you can report the context of the typo and not just the page number, that will help me to find it more quickly.
6. Ito's lemma
$$dx_t = \mu\,dt + \sigma\,dz_t; \quad y_t = f(t, x_t) \Longrightarrow$$
$$dy_t = \frac{\partial f}{\partial t}dt + \frac{\partial f}{\partial x}dx_t + \frac{1}{2}\frac{\partial^2 f}{\partial x^2}dx_t^2$$
$$dy_t = \left[\frac{\partial f}{\partial t} + \frac{\partial f}{\partial x}\mu + \frac{1}{2}\frac{\partial^2 f}{\partial x^2}\sigma^2\right]dt + \frac{\partial f}{\partial x}\sigma\,dz$$
7. Example 1: diffusion
$$dx_t = \mu\,dt + \sigma\,dz_t$$
$$\int_{t=0}^{T}dx_t = \int_{t=0}^{T}\mu\,dt + \int_{t=0}^{T}\sigma\,dz_t$$
$$x_T - x_0 = \mu T + \sigma(z_T - z_0)$$
8. Example 2: geometric diffusion
$$\ln x_T - \ln x_0 = \left(\mu - \frac{1}{2}\sigma^2\right)T + \sigma\int_0^T dz_t$$
$$x_T = x_0\, e^{\left(\mu - \frac{1}{2}\sigma^2\right)T + \sigma\int_0^T dz_s}$$
9. Example 3: AR(1)
$$dx_t = -\phi(x_t - \mu)dt + \sigma\,dz_t$$
$$x_T - \mu = e^{-\phi T}(x_0 - \mu) + \int_{t=0}^{T}\sigma e^{-\phi(T-t)}dz_t$$
10. Continuous-time MA processes
$$x_T = \int_{t=0}^{T}w(T-t)\,dz_t$$
11. Moments
(a) Geometric diffusion:
$$E_0(x_T) = x_0\, e^{\left(\mu - \frac{1}{2}\sigma^2\right)T + \frac{1}{2}\sigma^2 T} = x_0\, e^{\mu T}$$
(b) AR(1):
$$x_t - \mu = e^{-\phi t}(x_0 - \mu) + \sigma\int_{s=0}^{t}e^{-\phi(t-s)}dz_s$$
$$E_0(x_t) - \mu = e^{-\phi t}(x_0 - \mu)$$
$$\sigma_0^2(x_t) = \sigma^2\int_{s=0}^{t}e^{-2\phi(t-s)}ds = \sigma^2\,\frac{1 - e^{-2\phi t}}{2\phi}$$
12. Densities and expectations. The conditional expectation $f(x_t, t) = E_t[\Phi(x_T)]$ solves the backward equation, with boundary condition $f(x, T) = \Phi(x)$.
(a) Define $f(x, t|x_0)$ = the probability density of $x_t$ given $x_0$ at time 0. It's a delta function at $t = 0$, $x = x_0$. Looking forward, it solves the forward equation.
(b) This fact gives a nice formula for the stationary (unconditional) density. The formula is
$$f(x) = k\,\frac{e^{2\int^x \frac{\mu(s)}{\sigma^2(s)}ds}}{\sigma^2(x)}$$
13. What's a "return"?
$$dR_t = \frac{dp_t}{p_t} + \frac{x_t}{p_t}dt$$
where $x_t$ is the dividend flow.
14. What's a "cumulative value process"?
$$\frac{dV_t}{V_t} = dR_t$$
15. The basic pricing equation:
$$E_t\left[\frac{d(\Lambda_t p_t)}{\Lambda_t p_t} + \frac{x_t}{p_t}dt\right] = 0,$$
or $E_t\left[d(\Lambda_t V_t)\right] = 0$.
16. What's the equivalent of $R^f = 1/E(m)$?
$$r^f dt = -E_t\left(\frac{d\Lambda_t}{\Lambda_t}\right)$$
I'll explain continuous time by analogy with discrete time. A quick review of the essential concepts can't hurt.
1. A sequence of i.i.d. random variables (like a coin flip) forms the basic building block of time series,
$$\varepsilon_t \sim i.i.d.$$
i.i.d. means "independent and identically distributed." In particular, it implies that the conditional mean and variance are constant through time. Our building block series has a zero mean,
$$E_t(\varepsilon_{t+1}) = 0$$
$$\sigma_t^2(\varepsilon_{t+1}) = \sigma^2 = \text{const.}$$
The concept of conditional mean and variance is really important here! $E_t$ means "expectation conditional on all information at time $t$." We sometimes specify that $\varepsilon_t$ are normally distributed, but they don't have to be.
2. We build more interesting time series models from this building block. The AR(1) is the first canonical example,
$$x_{t+1} = \rho x_t + \varepsilon_{t+1}.$$
The sequence of conditional means follows
$$E_t x_{t+1} = E_t(\rho x_t + \varepsilon_{t+1}) = \rho x_t$$
$$E_t x_{t+2} = E_t(\rho x_{t+1} + \varepsilon_{t+2}) = \rho^2 x_t$$
$$E_t x_{t+k} = \rho^k x_t.$$
The AR(1) displays some persistence since after a shock it is expected to stay up for a while, and then decay gradually back again. A good exercise at this point is to work out the sequence of conditional variances of the AR(1). The answer is
$$\sigma_t^2(x_{t+1}) = \sigma_\varepsilon^2$$
$$\sigma_t^2(x_{t+2}) = (1 + \rho^2)\sigma_\varepsilon^2$$
$$\sigma_t^2(x_{t+k}) = (1 + \rho^2 + \ldots + \rho^{2(k-1)})\sigma_\varepsilon^2$$
The earlier notation is more common in discrete time; the latter notation is more common as we move to continuous time. In either case, note $E_t\varepsilon_{t+1} = 0$, so the shock is uncorrelated with $x_t$. Thus you can estimate an AR(1) by an OLS forecasting regression.
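To see the OLS point concretely, here is a minimal sketch in Python (the values $\rho = 0.9$, $\sigma = 1$ and the sample size are arbitrary illustrative choices): because $E_t\varepsilon_{t+1} = 0$, the shock is uncorrelated with the regressor $x_t$, so the slope of a regression of $x_{t+1}$ on $x_t$ recovers $\rho$.

```python
import numpy as np

rng = np.random.default_rng(0)
rho, sigma, T = 0.9, 1.0, 100_000   # illustrative parameter choices

# Simulate the AR(1): x_{t+1} = rho * x_t + eps_{t+1}
eps = rng.normal(0.0, sigma, T)
x = np.zeros(T)
for t in range(1, T):
    x[t] = rho * x[t - 1] + eps[t]

# OLS forecasting regression of x_{t+1} on x_t
xt, xt1 = x[:-1], x[1:]
rho_hat = (xt @ xt1) / (xt @ xt)
print(rho_hat)   # close to 0.9
```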
4. The canonical example 2 is the MA(1),
$$x_{t+1} = \varepsilon_{t+1} + \theta\varepsilon_t.$$
$$E_t(x_{t+1}) = \theta\varepsilon_t$$
$$E_t(x_{t+2}) = 0$$
$$E_t(x_{t+k}) = 0, \; k \geq 2.$$
It displays some persistence too, but only for one period. Problem: work out its conditional variances.
5. You can transform from AR to MA representations. You can think of this operation as "solving" the AR(1):
$$x_t = \rho x_{t-1} + \varepsilon_t$$
$$x_t = \rho(\rho x_{t-2} + \varepsilon_{t-1}) + \varepsilon_t = \rho^2 x_{t-2} + \rho\varepsilon_{t-1} + \varepsilon_t$$
$$x_t = \rho^3 x_{t-3} + \rho^2\varepsilon_{t-2} + \rho\varepsilon_{t-1} + \varepsilon_t$$
$$x_t = \rho^t x_0 + \sum_{j=0}^{t-1}\rho^j\varepsilon_{t-j}$$
"Solving" the difference equation means, really, expressing $x_t$ as a random variable, given time-0 information. Now we know $x_t$ has
$$E_0(x_t) = \rho^t x_0,$$
$$\sigma_0^2(x_t) = (1 + \rho^2 + \ldots + \rho^{2(t-1)})\sigma_\varepsilon^2.$$
Thus, an AR(1) is the same as an MA($\infty$). You can similarly write an MA(1) as an AR($\infty$). Choose which representation is easiest for what you're doing. An example: let's find the unconditional mean and variance of the AR(1). One way to do this is with the MA($\infty$) representation:
$$E(x_t) = \sum_{j=0}^{\infty}\rho^j E(\varepsilon_{t-j}) = 0$$
$$\mathrm{var}(x_t) = \mathrm{var}\left(\sum_{j=0}^{\infty}\rho^j\varepsilon_{t-j}\right) = \sum_{j=0}^{\infty}\rho^{2j}\sigma_\varepsilon^2 = \frac{\sigma_\varepsilon^2}{1 - \rho^2}$$
Notice: the $\varepsilon$ are all uncorrelated with each other. That's how we got rid of the covariance terms, and all the $\sigma^2$ are the same. We can also find the same quantities from the AR(1) representation.
7. All of the familiar tricks for linear ARMA models, including lag operator notation, have continuous-time counterparts. It takes a little work to make the translation. If you're interested, see my "Continuous-Time Linear Models," Foundations and Trends in Finance 6 (2011), 165–219, DOI: 10.1561/0500000037. It's on my webpage,
http://faculty.chicagobooth.edu/john.cochrane/research/papers/continuous_time_linear_models_FT.pdf
1. Preview. In discrete time, our building block is the i.i.d. (normal) shock $\varepsilon_t$ with variance $\sigma^2(\varepsilon_t) = 1$. We build up more complex time series from this building block with difference equation models such as the AR(1),
$$x_{t+1} = \rho x_t + \varepsilon_{t+1}$$
We "solve" such difference equations to represent $x_t$ as a function of its past shocks,
$$x_t = \sum_{j=0}^{t-1}\rho^j\varepsilon_{t-j} + \rho^t x_0.$$
(a) In discrete time, a random walk $z_t$ is defined as the sum of independent shocks,
$$z_t - z_0 = \sum_{j=1}^{t}\varepsilon_j.$$
$$\Delta z_t = (z_t - z_{t-1}) = \varepsilon_t.$$
It will turn out to be easier to think first about $z_t$ in continuous time and then take differences than to think directly about $\varepsilon_t$.
(b) Now, let's examine the properties of $z_t$. The variance of $z$ grows linearly with horizon,
$$z_{t+\Delta} - z_t \sim N(0, \Delta)$$
for any $\Delta$, not just integers. This is a Brownian motion, the continuous-time version of a random walk.
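A quick way to see the linear-in-horizon variance is to build $z$ from fine-grained i.i.d. shocks and check the moments of $z_T - z_0$ for a non-integer horizon. A minimal sketch (the horizon $T = 2.5$ and the grid and path counts are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)
n_paths, n_steps, T = 20_000, 250, 2.5   # T need not be an integer

dt = T / n_steps
# z_T - z_0 is the sum of i.i.d. N(0, dt) increments along each path
dz = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
zT = dz.sum(axis=1)

print(zT.mean(), zT.var())   # approximately 0 and T
```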
(d) I.i.d. property. In discrete time $E(\varepsilon_t\varepsilon_{t+1}) = 0$, i.e. $E[(z_{t+2} - z_{t+1})(z_{t+1} - z_t)] = 0$. The natural generalization is $E[(z_{t+\Delta} - z_t)(z_{t+\Delta+\Lambda} - z_{t+\Delta})] = 0$, or more generally: nonoverlapping differences of $\{z_t\}$ are uncorrelated with each other.
3. The differential $dz$
(a) Now let's take the "derivative," as we took the difference $\varepsilon_t = z_t - z_{t-1}$. Define
$$dz_t = \lim_{\Delta \downarrow 0}\,(z_{t+\Delta} - z_t).$$
Its properties:
$$E_t(dz_t) = 0 \iff E_t(z_{t+\Delta} - z_t) = 0$$
$$\mathrm{var}_t(dz_t) = dt \iff \mathrm{var}_t(z_{t+\Delta} - z_t) = \Delta$$
$$\mathrm{cov}(dz_t, dz_s) = 0,\; s \neq t \iff \mathrm{cov}(z_{t+\Delta} - z_t,\, z_{s+\Delta} - z_s) = 0$$
Watch out for notation. $E_t(dz_t) \neq dz_t$! $d$ is a forward difference operator, so $E_t(dz_t)$ means the expected value of how much $dz$ will change in the next instant. It really means $E_t(z_{t+\Delta} - z_t)$, which is obviously not the same thing as $z_{t+\Delta} - z_t$.
(e) $dz_t^2 = dt$. Variance and second moment are the same, since the mean is zero.
Second moments are nonstochastic! It's not just that $dz^2$ is of order $dt$, but in fact $dz^2 = dt$. We often write $E(dz^2)$ to remind ourselves that for any discrete interval these are second moments of random variables. But in the limit, in fact, second moments are nonstochastic. Similarly, if we have two Brownians $dz$ and $dw$, they can be correlated, and the cross product is
$$dz_t\, dw_t = \rho_{zw}\, dt.$$
(f) (Optional note.) The fact that squares of random variables are deterministic is hard to swallow. If $dz_t$ is a normal with standard deviation $\sqrt{dt}$, why is $dz_t^2$ a number, not a $\chi^2$-distributed random variable? To see why, compare $dz_t$ and $dz_t^2$. $z_{t+\Delta} - z_t$ is $N(0, \Delta)$, $(z_{t+\Delta} - z_t)/\sqrt{\Delta} \sim N(0, 1)$, so the probability that, say, $|z_{t+\Delta} - z_t| > 2\sqrt{\Delta}$ is 5%. The "typical size" of a movement $z_{t+\Delta} - z_t$ goes to zero at rate $\sqrt{\Delta}$.
On the other hand, $(z_{t+\Delta} - z_t)^2/\Delta = \chi^2_1$. ($\chi^2_1$, or "chi-squared with one degree of freedom," just means "the distribution of the square of a standard normal.") The fact that the distribution of $(z_{t+\Delta} - z_t)^2/\Delta$ stays the same as $\Delta$ gets small means that the probability that, say, $(z_{t+\Delta} - z_t)^2 > 2\Delta$ is fixed as $\Delta$ gets small. Thus $(z_{t+\Delta} - z_t)^2$ goes to zero at rate $\Delta$. The "typical size" of a movement $(z_{t+\Delta} - z_t)^2$, or the probability that it exceeds any value, goes to zero at rate $\Delta$. Things whose movements go to zero at rate $\Delta$ are differentiable, and hence nonstochastic. $(z_{t+\Delta} - z_t)^2$ is of order $\Delta$ and hence nonstochastic.
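This scaling argument is easy to verify by simulation. In the sketch below (the sample size and the grid of $\Delta$ values are arbitrary choices), the rescaled draws $\Delta z/\sqrt{\Delta}$ and $\Delta z^2/\Delta$ keep the same distribution as $\Delta$ shrinks, while $P(\Delta z^2 > 2\Delta) = P(\chi^2_1 > 2) \approx 0.16$ stays fixed:

```python
import numpy as np

rng = np.random.default_rng(2)
for delta in (0.1, 0.01, 0.001):
    dz = rng.normal(0.0, np.sqrt(delta), 100_000)   # draws of z_{t+Delta} - z_t
    print(np.std(dz) / np.sqrt(delta),   # ~1 for every delta: |dz| shrinks like sqrt(delta)
          np.mean(dz**2) / delta,        # ~1: E(dz^2) = delta
          np.mean(dz**2 > 2 * delta))    # ~0.16 regardless of delta: chi^2_1 tail
```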
4. Diffusion processes. We have the building block, the analog to $\varepsilon_t$. Now, we build more complex processes like the AR(1), ARMA, etc. in the same way as we do in discrete time.
(a) Brownian motion with drift:
$$dx_t = \mu\,dt + \sigma\,dz_t.$$
But watch:
$$E_t\left\{[dx_t - E_t(dx_t)]^2\right\} = E_t\left\{[dx_t - \mu\,dt]^2\right\} = E_t\left\{[\sigma\,dz_t]^2\right\} = \sigma^2 E_t(dz_t^2) = \sigma^2 dt$$
We get the same result whether we take the mean out or not. The $dz_t$ term is of order $\sqrt{dt}$, so almost always infinitely bigger than the $dt$ term.
(b) Geometric Brownian motion with drift. This is the standard asset price process, so worth careful attention.
$$\frac{dp_t}{p_t} = \mu\,dt + \sigma\,dz_t$$
$dp_t$ is the change in price, so $dp_t/p_t$ is the instantaneous percent (not annualized) change in price, or the instantaneous rate of return if there are no dividends. We'll see how to add dividends in a moment.
i. The moments are
$$E_t\left(\frac{dp_t}{p_t}\right) = \mu\,dt$$
$$\mathrm{var}_t\left(\frac{dp_t}{p_t}\right) = \sigma^2 dt$$
so $\mu$ and $\sigma$ can represent the mean and standard deviation of the arithmetic percent rate of return. Since they multiply $dt$, they have annual units. $\mu = 0.05$ and $\sigma^2 = 0.10^2$ are numbers to use for a 5% mean return and 10% standard deviation of return.
ii. Don't forget the $dt$ at the end of these expressions:
$$E_t\left(\frac{dp_t}{p_t}\right) = \mu\,dt; \quad \mathrm{var}_t\left(\frac{dp_t}{p_t}\right) = \sigma^2 dt$$
$$\frac{1}{dt}E_t\left(\frac{dp_t}{p_t}\right) = \mu; \quad \frac{1}{dt}\mathrm{var}_t\left(\frac{dp_t}{p_t}\right) = \sigma^2$$
iii. We can equivalently write the process as
$$dp_t = \mu p_t\,dt + \sigma p_t\,dz_t$$
iv. You can also see here the reason we call it a "stochastic differential equation." Without the $dz_t$ term, we would "solve" this equation to
$$p_t = p_0 e^{\mu t},$$
i.e. value grows exponentially. We will come back and similarly "solve" the stochastic differential equation with the $dz_t$ term.
(c) AR(1), Ornstein-Uhlenbeck process:
$$dx_t = -\phi(x_t - \mu)dt + \sigma\,dz_t$$
$$E_t(dx_t) = -\phi(x_t - \mu)dt$$
$$\mathrm{var}_t(dx_t) = \sigma^2 dt$$
The part in front of $dt$ is called the "drift" and the part in front of $dz_t$ is called the "diffusion" coefficient.
(d) In general, we build complex time series processes from the stochastic differential equation
$$dx_t = \mu(x_t, t, \ldots)dt + \sigma(x_t, t, \ldots)dz_t.$$
$\mu$ and $\sigma$ may be nonlinear functions (ARMA models in time series are just linear). Note we often suppress state and time dependence and just write $\mu$, $\sigma$, or $\mu(\cdot)$, $\sigma(\cdot)$, or $\mu_t$ and $\sigma_t$. Then,
$$E_t(dx_t) = \mu(x_t, t, \ldots)dt$$
$$\mathrm{var}_t(dx_t) = \sigma^2(x_t, t, \ldots)dt$$
5. Ito's lemma. Given a diffusion process for $x_t$, we can construct a new process $y_t = f(x_t)$. How can we find a diffusion representation for $y_t$ given the diffusion representation of $x_t$? For example, what does the log price process $\log(p_t)$ look like?
(a) Do a second-order Taylor expansion. Keep terms of order $dt$ and $dz_t \sim O(\sqrt{dt})$, and ignore higher-order terms. Don't forget $dz_t^2 = dt$. The answer is called Ito's lemma:
$$dy_t = \frac{\partial f}{\partial x}dx_t + \frac{1}{2}\frac{\partial^2 f}{\partial x^2}dx_t^2$$
$$dy_t = \left[\frac{\partial f}{\partial x}\mu + \frac{1}{2}\frac{\partial^2 f}{\partial x^2}\sigma^2\right]dt + \frac{\partial f}{\partial x}\sigma\,dz$$
(b) Intuition: Jensen's inequality says $E[y(x)] < y[E(x)]$ if $y(x)$ is concave, $E[u(c)] < u[E(c)]$ for example. The second derivative term takes care of this fact. You will see this in the "convexity" terms of option and bond pricing.
(c) An example: geometric growth. Given
$$\frac{dp_t}{p_t} = \mu_p\,dt + \sigma\,dz_t,$$
let $y_t = \ln p_t$. Then
$$dy_t = \frac{1}{p_t}dp_t - \frac{1}{2}\frac{1}{p_t^2}dp_t^2 = \left(\mu_p - \frac{1}{2}\sigma^2\right)dt + \sigma\,dz_t$$
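The $-\frac{1}{2}\sigma^2$ Jensen term is easy to see in a simulation. A minimal sketch (using $\mu_p = 0.05$, $\sigma = 0.10$ from the earlier example; the step and path counts are arbitrary choices): the level $p$ grows at rate $\mu_p$ on average, but $\ln p$ drifts at only $\mu_p - \frac{1}{2}\sigma^2$.

```python
import numpy as np

rng = np.random.default_rng(3)
mu_p, sigma = 0.05, 0.10
n_paths, n_steps, T = 50_000, 500, 1.0
dt = T / n_steps

# Euler steps for dp/p = mu_p dt + sigma dz, starting from p_0 = 1
p = np.ones(n_paths)
for _ in range(n_steps):
    p *= 1.0 + mu_p * dt + sigma * rng.normal(0.0, np.sqrt(dt), n_paths)

print(np.log(p).mean())   # near mu_p - sigma^2/2 = 0.045, not mu_p
print(p.mean())           # near exp(mu_p)
```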
(d) Conversely, given
$$dy_t = \mu_y\,dt + \sigma\,dz_t,$$
let $p_t = e^{y_t}$. Then
$$dp_t = e^{y_t}dy_t + \frac{1}{2}e^{y_t}dy_t^2$$
$$\frac{dp_t}{p_t} = \left(\mu_y + \frac{1}{2}\sigma^2\right)dt + \sigma\,dz_t$$
(e) Another example. We will often multiply two diffusions together. What is the product diffusion? Use the chain rule, but go out to second derivatives:
$$d(x_t y_t) = \frac{\partial(xy)}{\partial x}dx_t + \frac{\partial(xy)}{\partial y}dy_t + \frac{\partial^2(xy)}{\partial x\,\partial y}dx_t\,dy_t = y_t\,dx_t + x_t\,dy_t + dx_t\,dy_t$$
The second partials with respect to $x$ and $y$ alone, $\partial^2(xy)/\partial x^2$ and $\partial^2(xy)/\partial y^2$, are zero. Notice the extra term $dx_t\,dy_t$ relative to the usual chain rule.
(f) Ito's lemma more generally. If
$$y_t = f(x_t, t)$$
then
$$dy_t = \frac{\partial f}{\partial t}dt + \frac{\partial f}{\partial x}dx_t + \frac{1}{2}\frac{\partial^2 f}{\partial x^2}dx_t^2$$
$$dy_t = \left[\frac{\partial f}{\partial t} + \frac{\partial f}{\partial x}\mu + \frac{1}{2}\frac{\partial^2 f}{\partial x^2}\sigma^2\right]dt + \frac{\partial f}{\partial x}\sigma\,dz$$
The version with partial derivatives starting from $y_t = f(x_{1t}, x_{2t}, t)$ is obvious enough, and long enough, that I won't even write it down.
2.4 Solving stochastic differential equations
1. The problem: We want to do the analogue of "solving" difference equations. For example, when we write an AR(1),
$$x_t = \rho x_{t-1} + \varepsilon_t,$$
we solve backward to
$$x_t = \rho^t x_0 + \sum_{j=0}^{t-1}\rho^j\varepsilon_{t-j}.$$
What does it mean, analogously, to "solve"
$$dx_t = \mu\,dt$$
$$dx_t = \mu\,dt + \sigma\,dz_t?$$
What meaning can we give to the integrals? Why not just "add up all the little changes":
$$\int_0^T dz_t = (z_\Delta - z_0) + (z_{2\Delta} - z_\Delta) + (z_{3\Delta} - z_{2\Delta}) + \ldots + (z_T - z_{T-\Delta}) = z_T - z_0.$$
4. In sum, the building block of solutions is the opposite of our $dz_t$ differencing operator, and completes the analogy we started when defining $z_T$ in the first place:
$$z_T - z_0 = \int_{t=0}^{T}dz_t \;\longleftrightarrow\; z_T - z_0 = \sum_{t=1}^{T}\varepsilon_t$$
$\int_{t=0}^{T}dz_t$ is called a stochastic integral. This is a fundamentally different definition of "integral" which gives mathematicians a lot to play with. For us, it just means "add up all the little changes." Recall that $z_t$ is not differentiable. That's why we do not write the usual integral notation:
$$\text{Yes:}\quad z_T - z_0 = \int_0^T dz_t$$
$$\text{No:}\quad z_T - z_0 = \int_0^T \frac{dz(t)}{dt}dt$$
We do not define the integral as the "area under the curve" as you do in regular calculus. What we mean by "$\int_0^T dz_t$" in the end is just a normally distributed random variable with mean zero and variance $T$:
$$\int_0^T dz_t = z_T - z_0 \sim N(0, T)$$
5. Now, to more complex stochastic differential equations. In discrete time, we "solve" the difference equation:
$$x_t = (1 - \rho)\mu + \rho x_{t-1} + \varepsilon_t$$
$$x_t - \mu = \rho(x_{t-1} - \mu) + \varepsilon_t$$
$$x_T - \mu = \rho^T(x_0 - \mu) + \sum_{j=0}^{T-1}\rho^j\varepsilon_{T-j}.$$
This means we know the conditional distribution of the random variable $x_T$, and really that's what it means to have "solved" the stochastic difference equation:
$$x_T | x_0 \sim N\left[\mu + \rho^T(x_0 - \mu),\; \sigma_\varepsilon^2\sum_{j=0}^{T-1}\rho^{2j}\right]$$
(a) Random walk. If we start with $dz_t$ and integrate both sides,
$$z_T - z_0 = \int_{t=0}^{T}dz_t$$
$$z_T - z_0 \sim N(0, T)$$
(b) Random walk with drift:
$$dx_t = \mu\,dt + \sigma\,dz_t$$
$$\int_{t=0}^{T}dx_t = \int_{t=0}^{T}\mu\,dt + \int_{t=0}^{T}\sigma\,dz_t$$
$$x_T - x_0 = \mu T + \sigma\int_{t=0}^{T}dz_t$$
$$x_T - x_0 = \mu T + \sigma\varepsilon; \quad \varepsilon \sim N(0, T)$$
$$x_T - x_0 \sim N(\mu T, \sigma^2 T)$$
As in discrete time,
$$x_T = x_0 + \mu T + \sigma\sum_{t=1}^{T}\varepsilon_t.$$
(c) Geometric Brownian motion. Applying Ito's lemma to $\ln p_t$ as in the example above and integrating,
$$\ln p_T - \ln p_0 = \left(\mu - \frac{1}{2}\sigma^2\right)T + \sigma\int_0^T dz_t,$$
i.e.
$$p_T = p_0\, e^{\left(\mu - \frac{1}{2}\sigma^2\right)T + \sigma\sqrt{T}\varepsilon}; \quad \varepsilon \sim N(0, 1).$$
$p_T$ is lognormally distributed. (Lognormal means the log of $p_T$ is normally distributed.) A geometric diffusion, where the arithmetic return is instantaneously normal, means that the finite-period return is lognormally distributed.
(d) AR(1):
$$dx_t = -\phi(x_t - \mu)dt + \sigma\,dz_t$$
The solution is
$$x_T = \mu + e^{-\phi T}(x_0 - \mu) + \int_{s=0}^{T}\sigma e^{-\phi s}dz_{T-s};$$
this looks just like the discrete-time case
$$x_T = \mu + \rho^T(x_0 - \mu) + \sum_{j=0}^{T-1}\rho^j\varepsilon_{T-j}.$$
i. To derive this answer, you have to be a little clever. Find $d\left[e^{\phi t}(x_t - \mu)\right]$ by Ito's lemma and integrate.
In the continuous-time tradition, we usually express integrals going forward in time, as this one is. To connect with discrete time, you can also rearrange the integral to go back in time,
$$\int_{t=0}^{T}e^{-\phi(T-t)}dz_t = \int_{s=0}^{T}e^{-\phi s}dz_{T-s}.$$
ii. To check this answer, write it as
$$x_t = \mu + e^{-\phi t}(x_0 - \mu) + \int_{s=0}^{t}\sigma e^{-\phi(t-s)}dz_s$$
and take $dx_t$. The first term is just the usual time derivative. Then, we take the derivative of the stuff inside the integral as time changes, and lastly we account for the fact that the upper end of the integral changes:
$$dx_t = -\phi e^{-\phi t}(x_0 - \mu)dt + (-\phi)\left[\int_{s=0}^{t}\sigma e^{-\phi(t-s)}dz_s\right]dt + \sigma\,dz_t$$
$$dx_t = -\phi\left[e^{-\phi t}(x_0 - \mu) + \int_{s=0}^{t}\sigma e^{-\phi(t-s)}dz_s\right]dt + \sigma\,dz_t$$
$$dx_t = -\phi(x_t - \mu)dt + \sigma\,dz_t$$
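The solution can also be checked by brute force: simulate the SDE with small Euler steps and compare the simulated mean and variance at $T$ with the formulas $\mu + e^{-\phi T}(x_0 - \mu)$ and $\sigma^2(1 - e^{-2\phi T})/(2\phi)$. A minimal sketch (all parameter values are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(4)
phi, mu, sigma = 2.0, 1.0, 0.5            # illustrative parameters
x0, T, n_steps, n_paths = 3.0, 1.0, 400, 40_000
dt = T / n_steps

# Euler steps for dx = -phi (x - mu) dt + sigma dz
x = np.full(n_paths, x0)
for _ in range(n_steps):
    x += -phi * (x - mu) * dt + sigma * rng.normal(0.0, np.sqrt(dt), n_paths)

mean_exact = mu + np.exp(-phi * T) * (x0 - mu)
var_exact = sigma**2 * (1 - np.exp(-2 * phi * T)) / (2 * phi)
print(x.mean(), mean_exact)
print(x.var(), var_exact)
```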
(e) This example adds an important tool to our arsenal. Notice we can weight sums of $dz_t$ terms, just as we weight sums of $\varepsilon_t$ terms. We can produce or write results as continuous-time moving average processes,
$$x_T = \int_{t=0}^{T}w(T-t)\,dz_t$$
7. Once you have "solved" an SDE, you have expressed $x_T$ as a random variable. Naturally, you will want to know moments: means and variances of $x_T$. Here are some examples.
(a) Example: diffusion.
$$x_T - x_0 = \mu T + \sigma\int_{t=0}^{T}dz_t$$
$$E(x_T - x_0) = \mu T + \sigma\int_{t=0}^{T}E(dz_t) = \mu T$$
$$\mathrm{var}(x_T - x_0) = E\left[\left(\sigma\int_{t=0}^{T}dz_t\right)^2\right] = \sigma^2\int_{t=0}^{T}E(dz_t^2) = \sigma^2\int_{t=0}^{T}dt = \sigma^2 T$$
(b) Example: geometric Brownian motion.
$$\frac{dp_t}{p_t} = \mu\,dt + \sigma\,dz_t.$$
The solution is
$$p_t = p_0\, e^{\left(\mu - \frac{1}{2}\sigma^2\right)t + \sigma\int_{s=0}^{t}dz_s}$$
$$E_0(p_t) = p_0\, e^{\left(\mu - \frac{1}{2}\sigma^2\right)t}E_0\left(e^{\sigma\int_{s=0}^{t}dz_s}\right) = p_0\, e^{\left(\mu - \frac{1}{2}\sigma^2\right)t + \frac{1}{2}\sigma^2 t} = p_0\, e^{\mu t}$$
(In the last line I use the fact that for a normal $y$, $E(e^y) = e^{E(y) + \frac{1}{2}\sigma^2(y)}$.)
(c) Example: AR(1). Start with the solution, then use the facts that $E(dz_t) = 0$, $E(dz_t^2) = dt$, and $E(dz_t dz_s) = 0$ for $t \neq s$.
$$x_t - \mu = e^{-\phi t}(x_0 - \mu) + \int_{s=0}^{t}\sigma e^{-\phi(t-s)}dz_s$$
$$E_0(x_t) - \mu = e^{-\phi t}(x_0 - \mu)$$
$$\sigma_0^2(x_t) = E\left[\left(\int_{s=0}^{t}\sigma e^{-\phi(t-s)}dz_s\right)^2\right] = \sigma^2\int_{s=0}^{t}e^{-2\phi(t-s)}ds = \sigma^2\,\frac{1 - e^{-2\phi t}}{2\phi}$$
as in discrete time
$$E_0(x_{t+j}) - \mu = \rho^j(x_0 - \mu)$$
$$\mathrm{var}_0(x_t) = E\left[\left(\sum_{j=0}^{t-1}\rho^j\varepsilon_{t-j}\right)^2\right] = \sum_{j=0}^{t-1}\rho^{2j}\sigma_\varepsilon^2 = \sigma_\varepsilon^2\,\frac{1 - \rho^{2t}}{1 - \rho^2}$$
(d) Example: continuous-time MA. If
$$x_t = \int_0^t g(t-s)\,dz_s,$$
then
$$E_0(x_t) = E_0\left[\int_0^t g(t-s)\,dz_s\right] = \int_0^t g(t-s)E_s(dz_s) = 0$$
and
$$E_0(x_t^2) = E_0\left[\left(\int_0^t g(t-s)\,dz_s\right)^2\right] = \int_0^t g(t-s)^2 ds.$$
8. In general, solving SDEs is not so easy! As with regular differential equations, you can't just integrate both sides, because you might have an $x$ on the right side as well that you can't easily get rid of.
9. Simulation. You can easily find distributions and moments by simulating the solution to a stochastic differential equation. Use the random number generator and program
$$x_{t+\Delta} = x_t + \mu(x_t, t)\Delta + \sigma(x_t, t)\sqrt{\Delta}\,\varepsilon_{t+\Delta}; \quad \varepsilon_{t+\Delta} \sim N(0, 1)$$
10. A last example: the CIR (Cox-Ingersoll-Ross) square root process. Suppose we generate $x_t$ from $z_t$ by
$$dx_t = -\phi(x_t - \mu)dt + \sigma\sqrt{x_t}\,dz_t.$$
The AR(1) process ranges from $-\infty$ to $\infty$. Here, as $x \to 0$, volatility goes down, and the drift is pulling it back, so $x_t$ can't cross zero. That fact, and the fact that it's more volatile when the level is higher, makes it a good process for nominal interest rates. It's a nonlinear process, not in the ARMA class. A strong point for continuous time is that we can handle many of these nonlinear processes in continuous time, though we really can't do much with them in discrete time. For example, the discrete-time square root process is
$$x_t = (1-\rho)\mu + \rho x_{t-1} + \sqrt{x_{t-1}}\,\varepsilon_t.$$
Now what do you do? There are closed-form solutions for the CIR process and many other nonlinear processes. I won't give those here.
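The simulation recipe from the previous point handles this process immediately, even with no closed-form solution in hand. A minimal sketch (parameter values are arbitrary illustrative choices; the max(x, 0) inside the square root is a standard guard, since a discretized Euler step, unlike the continuous process itself, could in principle overshoot zero):

```python
import numpy as np

rng = np.random.default_rng(5)
phi, mu, sigma = 0.5, 0.04, 0.05   # illustrative interest-rate-like parameters
x0, T, n_steps, n_paths = 0.08, 5.0, 1250, 20_000
dt = T / n_steps

# Euler steps for dx = -phi (x - mu) dt + sigma sqrt(x) dz
x = np.full(n_paths, x0)
for _ in range(n_steps):
    dz = rng.normal(0.0, np.sqrt(dt), n_paths)
    x += -phi * (x - mu) * dt + sigma * np.sqrt(np.maximum(x, 0.0)) * dz

# The drift of the CIR process is linear, so the conditional mean is the
# same formula as for the AR(1): mu + exp(-phi T)(x0 - mu)
mean_exact = mu + np.exp(-phi * T) * (x0 - mu)
print(x.min(), x.mean(), mean_exact)
```

Note that every simulated path stays strictly positive, illustrating the no-crossing-zero argument above.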
Often you don't really need the whole solution; you only want moments. If $x_T$ is a cash payoff at time $T$, you might want to know its expected value, $E_t(x_T)$. You might want the expected value of a function of $x_T$, i.e. $E_t[\Phi(x_T)]$. This happens in option pricing, where $\Phi$ is the call option payoff function. The most basic asset pricing valuation formula is a moment, $E_0\left[\int e^{-\delta t}m_t x_t\,dt\right]$.
We saw above how to find moments after you have a full solution, but that's hard. You can often find moments without first finding the whole "solution" or distribution of the random variable, i.e. $x_T$ or $\Phi(x_T)$ itself.
1. But we can also find moments directly, because moments follow nonstochastic differential equations. You don't have to solve stochastic differential equations to find the moments.
(a) Example.
$$dx_t = \mu\,dt + \sigma(\cdot)dz$$
Let's suppose we want to find the mean. Now,
$$d\left[E_0(x_t)\right] = E_0(dx_t) = \mu\,dt.$$
Watch the birdie here. By $d[E_0(x_t)]$ I mean: how does the expectation $E_0(x_t)$ move forward in time? But now the point: since the $dz$ term is missing, you can find the mean of $x$ without solving the whole equation.
(b) Example: lognormal pricing.
$$\frac{dx}{x} = \mu\,dt + \sigma\,dz$$
The mean follows
$$\frac{dE_0(x_t)}{E_0(x_t)} = \mu\,dt$$
$$dE_0(x_t) = \mu E_0(x_t)dt$$
$$E_0(x_t) = x_0 e^{\mu t}$$
This is the same solution we had before. Look how much easier that was!
(c) Similarly, to find moments $E_0[f(x_t)]$, find the diffusion representation for $f(x_t)$ by Ito's lemma, and then the differential equation followed by its mean.
(d) Alas, this technique is limited. If the drift is a nonlinear function $\mu(x_t)$, then $E_0(dx_t) = E_0(\mu(x_t))dt$: you need to know the distribution of $x_t$ to get anywhere.
2. The "backward equation." What is $E_t[\Phi(x_T)]$? We know that $E_T[\Phi(x_T)] = \Phi(x_T)$. We can work backwards using Ito's lemma to find $E_t[\Phi(x_T)]$. We have a process
$$dx = \mu(\cdot)dt + \sigma(\cdot)dz$$
and define
$$f(x_t, t) = E_t[\Phi(x_T)].$$
Since it's a conditional expectation, $E_t[E_{t+\Delta}(\cdot)] = E_t(\cdot)$, so
$$E_t(df_t) = 0.$$
By Ito's lemma,
$$df = \frac{\partial f}{\partial t}dt + \frac{\partial f}{\partial x}dx + \frac{1}{2}\frac{\partial^2 f}{\partial x^2}dx^2$$
$$\frac{E_t(df)}{dt} = \frac{\partial f}{\partial t} + \frac{\partial f}{\partial x}\mu(\cdot) + \frac{1}{2}\frac{\partial^2 f}{\partial x^2}\sigma^2(\cdot) = 0$$
Thus, the conditional expectation solves this "backward equation," with terminal condition
$$f(x, T) = \Phi(x).$$
(a) This is a partial differential equation (ugh). The usual way to solve it is to guess and check: you guess a functional form with some free parameters, and then you see what the free parameters have to be to make it work. It is also the kind of equation you can easily solve numerically: use the spatial derivatives at time $t$ to find the time derivative, and hence find the function at time $t - \Delta$. Loop. That may be easier numerically than solving for $E_t(x_T)$ by Monte Carlo simulation of $x_T$ forward. Or, the Monte Carlo simulation may turn out to be an easy way to solve partial differential equations.
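Here is a minimal sketch of that numerical loop (the grid, $\sigma$, and the terminal payoff $\Phi(x) = x^2$ are arbitrary illustrative choices). For the driftless process $dx = \sigma\,dz$, the backward equation is $\partial f/\partial t + \frac{1}{2}\sigma^2\,\partial^2 f/\partial x^2 = 0$, with exact solution $f(x, t) = E_t(x_T^2) = x^2 + \sigma^2(T - t)$, which the grid solution should reproduce:

```python
import numpy as np

# Solve f_t + 0.5 sigma^2 f_xx = 0 backward from f(x,T) = x^2, for dx = sigma dz
sigma, T = 0.3, 1.0
x = np.linspace(-3.0, 3.0, 301)
dx = x[1] - x[0]
dt = 0.2 * dx**2 / sigma**2        # small explicit time step for stability
f = x**2                           # terminal condition Phi(x) = x^2

t = T
while t > 0:
    h = min(dt, t)
    # second spatial derivative by central differences
    fxx = np.empty_like(f)
    fxx[1:-1] = (f[2:] - 2.0 * f[1:-1] + f[:-2]) / dx**2
    fxx[0], fxx[-1] = fxx[1], fxx[-2]      # crude boundary treatment
    f = f + h * 0.5 * sigma**2 * fxx       # step from t back to t - h
    t -= h

print(f[150])   # value at x = 0: should be close to sigma^2 * T = 0.09
```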
3. Finding densities. You can also find the density by an Ito's-lemma differential equation rather than solving the whole thing as we did above. This is called the "forward equation."
(a) The density at $t + \Delta$ must be the density at $t$ times the transition density to go from $t$ to $t + \Delta$:
$$f(x_{t+\Delta}|x_0) = \int f(x_t|x_0)f(x_{t+\Delta}|x_t)dx_t$$
$$= \int f(x_t|x_0)\,N\!\left(\frac{x_{t+\Delta} - x_t - \mu(x_t)\Delta}{\sigma(x_t)\sqrt{\Delta}}\right)dx_t$$
...Ito's lemma and lots of algebra, allowing explicit $f(x, t)$, give the forward equation
$$\frac{\partial f(x, t|x_0)}{\partial t} = -\frac{\partial}{\partial x}\left[\mu(x)f(x, t|x_0)\right] + \frac{1}{2}\frac{\partial^2}{\partial x^2}\left[\sigma^2(x)f(x, t|x_0)\right].$$
(b) Setting $\partial f/\partial t = 0$ in the forward equation and integrating once,
$$\mu(x)f(x) = \frac{1}{2}\frac{d\left[\sigma^2(x)f(x)\right]}{dx}$$
$$\frac{2\mu(x)}{\sigma^2(x)}\left[\sigma^2(x)f(x)\right] = \frac{d\left[\sigma^2(x)f(x)\right]}{dx}$$
$$\sigma^2(x)f(x) = k\,e^{2\int^x \frac{\mu(s)}{\sigma^2(s)}ds}$$
$$f(x) = k\,\frac{e^{2\int^x \frac{\mu(s)}{\sigma^2(s)}ds}}{\sigma^2(x)}$$
where $k$ is a constant, so we integrate to one. Evaluating $k$ by normalizing so $\int f(x)dx = 1$,
$$f(x) = \left[\int \frac{e^{2\int^x \frac{\mu(s)}{\sigma^2(s)}ds}}{\sigma^2(x)}dx\right]^{-1}\frac{e^{2\int^x \frac{\mu(s)}{\sigma^2(s)}ds}}{\sigma^2(x)}$$
2.6 Problems
1. (a) Find $dy_t$. Express your answer in standard form $dy_t = \mu(y_t, t)dt + \sigma(y_t, t)dz_t$.
(b) $x_t$ varies over the whole real line, yet $y_t$ must remain positive. What force keeps $y$ from going negative? Hint: look at the drift and diffusion terms as $y$ approaches zero.
2. Consider the geometric Brownian motion $dp_t/p_t = \mu\,dt + \sigma\,dz_t$ as a model for stocks. Use $\mu = 0.06$ and $\sigma = 0.18$ (i.e. 6% and 18%).
(a) What are the mean and standard deviation of horizon log returns?