
Chapter V. STOCHASTIC PROCESSES IN CONTINUOUS TIME

§1. Brownian Motion.


The Scottish botanist Robert Brown observed pollen particles in suspen-
sion under a microscope in 1828 and 1829 (though this had been observed
before),[1] and noticed that they were in constant, irregular motion.
In 1900 L. Bachelier considered Brownian motion a possible model for
stock-market prices:
BACHELIER, L. (1900): Théorie de la spéculation. Ann. Sci. Ecole Nor-
male Supérieure 17, 21-86
– the first time Brownian motion had been used to model financial or eco-
nomic phenomena, and before a mathematical theory had been developed.
In 1905 Albert Einstein considered Brownian motion as a model of parti-
cles in suspension, and used it to estimate Avogadro's number (N ∼ 6 × 10^23),
based on the diffusion coefficient D in the Einstein relation

var Xt = Dt   (t > 0).

In 1923 Norbert Wiener defined and constructed Brownian motion rigor-


ously for the first time. The resulting stochastic process is often called the
Wiener process in his honour, and its probability measure (on path-space) is
called Wiener measure.
We define standard Brownian motion on R, BM or BM (R), to be a
stochastic process X = (Xt )t≥0 such that
1. X0 = 0,
2. X has independent increments: Xt+u − Xt is independent of σ(Xs : s ≤ t)
for u ≥ 0,
3. X has stationary increments: the law of Xt+u − Xt depends only on u,
4. X has Gaussian increments: Xt+u − Xt is normally distributed with mean
0 and variance u,
Xt+u − Xt ∼ N (0, u),
5. X has continuous paths: Xt is a continuous function of t, i.e. t ↦ Xt is
continuous in t.
For time t in a finite interval – [0, 1], say – we can use the following filtered
space: (i) Ω = C[0, 1], the space of all continuous functions on [0, 1]; (ii) the
points ω ∈ Ω are thus random functions, and we use the coordinate mappings:
Xt , or Xt (ω), := ωt ; (iii) the filtration is given by Ft := σ(Xs : 0 ≤ s ≤ t),
F := F1 ; (iv) P is the measure on (Ω, F) with finite-dimensional distribu-
tions specified by the requirement that the increments Xt+u − Xt are stationary
independent Gaussian N (0, u).

[1] The Roman author Lucretius observed this phenomenon in the gaseous phase – dust
particles dancing in sunbeams – in antiquity: De rerum natura, c. 50 BC.

Theorem (WIENER, 1923). Brownian motion exists.

The best way to prove this is by construction, and one that reveals some
properties. The result below is originally due to Paley, Wiener and Zygmund
(1933) and Lévy (1948), but is re-written in the modern language of wavelet
expansions. We omit the proof; for this, see e.g. [BK] 5.3.1, or SP L20-22.
The Haar system (Hn ) = (Hn (.)) is a complete orthonormal system (cons)
of functions in L2 [0, 1]. The Schauder system ∆n is obtained by integrating
the Haar system. Consider the triangular function (or ‘tent function’)
∆(t) := 2t on [0, 1/2),   2(1 − t) on [1/2, 1],   0 else.
With ∆0 (t) := t, ∆1 (t) := ∆(t), define the nth Schauder function ∆n by

∆n (t) := ∆(2^j t − k)   (n = 2^j + k ≥ 1, 0 ≤ k < 2^j).

Note that

∫_0^t H(u) du = (1/2) ∆(t),

and similarly

∫_0^t Hn (u) du = λn ∆n (t),

where λ0 = 1 and for n ≥ 1,

λn = (1/2) × 2^{-j/2}   (n = 2^j + k ≥ 1).
The Schauder system (∆n ) forms a basis (a Schauder basis) for C[0, 1].
We can now formulate the next result; for proof, see the references above.

Theorem (PWZ theorem: Paley-Wiener-Zygmund, 1933). For (Zn )n≥0
independent N (0, 1) random variables, λn , ∆n as above,

Wt := Σ_{n=0}^∞ λn Zn ∆n (t)

converges uniformly on [0, 1], a.s. The process W = (Wt : t ∈ [0, 1]) is
Brownian motion.
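The partial sums of this series are easy to simulate. The following is a minimal sketch (in Python with NumPy – an assumption, as the notes contain no code), truncating the series at the terms n < 2^J:

```python
import numpy as np

def schauder(n, t):
    """n-th Schauder function: Delta_n(t) = Delta(2^j t - k) for n = 2^j + k >= 1."""
    if n == 0:
        return np.asarray(t, dtype=float)          # Delta_0(t) = t
    j = int(np.floor(np.log2(n)))
    k = n - 2**j
    u = 2.0**j * np.asarray(t) - k                 # rescale to the tent's support
    return np.where((u >= 0) & (u <= 1), np.minimum(2*u, 2*(1 - u)), 0.0)

def pwz_path(J, t, rng):
    """Truncated PWZ series: sum over n < 2^J of lambda_n Z_n Delta_n(t)."""
    W = np.zeros_like(t, dtype=float)
    Z = rng.standard_normal(2**J)
    for n in range(2**J):
        lam = 1.0 if n == 0 else 0.5 * 2.0**(-np.floor(np.log2(n)) / 2)
        W += lam * Z[n] * schauder(n, t)
    return W

rng = np.random.default_rng(42)
t = np.linspace(0.0, 1.0, 513)
W = pwz_path(10, t, rng)       # an approximate Brownian path on [0, 1]
```

At t = 1 only the n = 0 term survives (∆n (1) = 0 for n ≥ 1), so W1 = Z0 ∼ N (0, 1); more generally one checks empirically that var Wt ≈ t.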

Thus the above description does indeed define a stochastic process X =


(Xt )t∈[0,1] on (C[0, 1], F, (Ft ), P ) (the uniform limit of continuous functions
is continuous). The construction gives X on C[0, n] for each n = 1, 2, · · · ,
and combining these: X exists on C[0, ∞). It is also unique (a stochastic
process is uniquely determined by its finite-dimensional distributions and the
restriction to path-continuity).
No construction of Brownian motion is easy: one needs both some work
and some knowledge of measure theory. But existence is really all we need,
and we assume this. For background, see any measure-theoretic text on
stochastic processes. The classic is Doob’s book, quoted above (see VIII.2
there). Excellent modern texts include Karatzas & Shreve [KS] (see particu-
larly §2.2-4 for construction and §5.8 for applications to economics), Revuz &
Yor [RY], Rogers & Williams [RW1] (Ch. 1) and [RW2] (Itô calculus – below).
We denote standard Brownian motion BM (R) – or just BM for short
– by B = (Bt ) (B for Brown), though W = (Wt ) (W for Wiener) is also
common. Standard Brownian motion BM (R^d ) in d dimensions is defined
by B(t) := (B1 (t), · · · , Bd (t)), where B1 , · · · , Bd are independent standard
Brownian motions in one dimension (independent copies of BM (R)).
Zeros.
It can be shown that Brownian motion oscillates:

lim supt→∞ Xt = +∞, lim inf t→∞ Xt = −∞ a.s.

Hence, for every n there are zeros (times t with Xt = 0) of X with t ≥ n


(indeed, infinitely many such zeros). So if

Z := {t ≥ 0 : Xt = 0}

denotes the zero-set of BM (R):


1. Z is an infinite set.

Next, if tn are zeros and tn → t, then by path-continuity B(tn ) → B(t); but
B(tn ) = 0, so B(t) = 0:
2. Z is a closed set (Z contains its limit points).
Less obvious are the next two properties:
3. Z is a perfect set: every point t ∈ Z is a limit point of points in Z. So
there are infinitely many zeros in every neighbourhood of every zero (so the
paths must oscillate amazingly fast!).
4. Z is a (Lebesgue) null set: Z has Lebesgue measure zero.
In particular, any diagram of a Brownian path grossly distorts Z: it is
impossible to draw a realistic picture of a Brownian path.
Brownian Scaling.
For each c ∈ (0, ∞), X(c^2 t) is N (0, c^2 t), so Xc (t) := c^{-1} X(c^2 t) is N (0, t).
Thus Xc has all the defining properties of a Brownian motion (check). So,
Xc IS a Brownian motion:

Theorem. If X is BM and c > 0, Xc (t) := c^{-1} X(c^2 t), then Xc is again a
BM .

Corollary. X is self-similar (reproduces itself under scaling), so a Brownian


path X(.) is a fractal. So too is the zero-set Z.
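The scaling property is easy to check by simulation – a sketch assuming NumPy (not part of the notes): simulate X on a grid over [0, c^2 T], rescale, and compare with the variance a standard BM should have at time T:

```python
import numpy as np

rng = np.random.default_rng(1)
n_paths, n_steps, T, c = 20_000, 256, 1.0, 3.0
dt = (c**2 * T) / n_steps                        # grid on [0, c^2 T]
incs = rng.standard_normal((n_paths, n_steps)) * np.sqrt(dt)
X_c2T = incs.sum(axis=1)                         # X(c^2 T) for each path
Xc_T = X_c2T / c                                 # X_c(T) := c^{-1} X(c^2 T)
# var X_c(T) should be T = 1, matching standard BM at time T
print(np.var(Xc_T))
```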

Brownian motion owes part of its importance to belonging to all the im-
portant classes of stochastic processes: it is (strong) Markov, a (continuous)
martingale, Gaussian, a diffusion, a Lévy process (process with stationary
independent increments), etc.

§2. Filtrations; Finite-Dimensional Distributions

The underlying set-up is as before, but now time is continuous rather


than discrete; thus the time-variable will be t ≥ 0 in place of n = 0, 1, 2, . . ..
The information available at time t is the σ-field Ft ; the collection of these as
t ≥ 0 varies is the filtration, modelling the information flow. The underlying
probability space, endowed with this filtration, gives us the stochastic basis
(filtered probability space) on which we work.
We assume that the filtration is complete (contains all subsets of null-sets
as null-sets), and right-continuous: Ft = Ft+ , i.e.

Ft = ∩s>t Fs

(the ‘usual conditions’ – right-continuity and completeness – in Meyer’s ter-
minology).
A stochastic process X = (Xt )t≥0 is a family of random variables defined
on a filtered probability space with Xt Ft -measurable for each t: thus Xt is
known when Ft is known, at time t.
If {t1 , · · · , tn } is a finite set of time-points in [0, ∞), (Xt1 , · · · , Xtn ), or
(X(t1 ), · · · , X(tn )) (for typographical convenience, we use both notations in-
terchangeably, with or without ω: Xt (ω), or X(t, ω)) is a random n-vector,
with a distribution, µ(t1 , · · · , tn ) say. The class of all such distributions as
{t1 , · · · , tn } ranges over all finite subsets of [0, ∞) is called the class of all
finite-dimensional distributions of X. These satisfy certain obvious consis-
tency conditions:
(i) deletion of one point ti can be obtained by ‘integrating out the unwanted
variable’, as usual when passing from joint to marginal distributions,
(ii) permutation of the ti permutes the arguments of the measure µ(t1 , · · · , tn )
on Rn .
Conversely, a collection of finite-dimensional distributions satisfying these
two consistency conditions arises from a stochastic process in this way (this
is the content of the DANIELL-KOLMOGOROV Theorem: P. J. Daniell in
1918, A. N. Kolmogorov in 1933).
Important though it is as a general existence result, however, the Daniell-
Kolmogorov theorem does not take us very far. It gives a stochastic process
X as a random function on [0, ∞), i.e. a random variable on R[0,∞) . This
is a vast and unwieldy space; we shall usually be able to confine attention
to much smaller and more manageable spaces, of functions satisfying reg-
ularity conditions. The most important of these is continuity: we want to
be able to realise X = (Xt (ω))t≥0 as a random continuous function, i.e. a
member of C[0, ∞); such a process X is called path-continuous (since the
map t → Xt (ω) is called the sample path, or simply path, given by ω) – or
more briefly, continuous. This is possible for the extremely important case of
Brownian motion, for example, and its relatives. Sometimes we need to allow
our random function Xt (ω) to have jumps. It is then customary, and con-
venient, to require Xt to be right-continuous with left limits (rcll), or càdlàg
(continu à droite, limite à gauche) – i.e. to have X in the space D[0, ∞) of
all such functions (the Skorohod space). This is the case, for instance, for the
Poisson process and its relatives.
General results on realisability – whether or not it is possible to realise, or
obtain, a process so as to have its paths in a particular function space – are

known, but it is usually better to construct the processes we need directly on
the function space on which they naturally live.
Given a stochastic process X, it is sometimes possible to improve the
regularity of its paths without changing its distribution (that is, without
changing its finite-dimensional distributions). For background on results of
this type (separability, measurability, versions, regularization, ...) see e.g.
Doob’s classic book [D].
The continuous-time theory is technically much harder than the discrete-
time theory, for two reasons:
(i) questions of path-regularity arise in continuous time but not in discrete
time,
(ii) uncountable operations (like taking sup over an interval) arise in contin-
uous time. But measure theory is constructed using countable operations:
uncountable operations risk losing measurability.

Filtrations and Insider Trading


Recall that a filtration models an information flow. In our context, this
is the information flow on the basis of which market participants – traders,
investors etc. – make their decisions, and commit their funds and effort.
All this is information in the public domain – necessarily, as stock exchange
prices are publicly quoted. Again necessarily, many people are involved in
major business projects and decisions (an important example: mergers and
acquisitions, or M&A) involving publicly quoted companies. Frequently, this
involves price-sensitive information. People in this position are – rightly –
prohibited by law from profiting by it directly, by trading on their own ac-
count, in publicly quoted stocks but using private information. This is rightly
regarded as theft at the expense of the investing public.[2] Instead, those
involved in M&A etc. should seek to benefit legitimately (and indirectly) –
enhanced career prospects, commission or fees, bonuses etc.
The regulatory authorities (the Securities and Exchange Commission (SEC)
in the US; the Financial Conduct Authority (FCA) and the Prudential Reg-
ulation Authority (PRA, part of the Bank of England) in the UK) monitor
all trading electronically. Their software alerts them to patterns of suspicious trades.
The software design (necessarily secret, in view of its value to criminals)
involves all the necessary elements of Mathematical Finance in exaggerated
[2] The plot of the film Wall Street revolves around such a case, and is based on real
life – recommended!

form: economic and financial insight, plus: mathematics, probability and
stochastic processes; statistics (especially pattern recognition, data mining
and machine learning); numerics and computation.

§3. Classes of Processes.


1. Martingales.
The martingale property in continuous time is as in discrete time:

E[Xt |Fs ] = Xs (s < t),

and similarly for submartingales and supermartingales. There are regular-


ization results, under which one can take Xt right-continuous in t. The
convergence results, and uniformly integrable (UI) martingales – important
as they occur in the risk-neutral valuation formula (RNVF) – are similar.
Among the contrasts: the Doob-Meyer decomposition, easy in discrete time
(IV.8), is deep in continuous time. For background, see e.g.
MEYER, P.-A. (1966): Probabilities and potentials. Blaisdell
– and subsequent work by Meyer and the French school (Dellacherie & Meyer,
Probabilités et potentiel, I-V, etc.)
2. Gaussian Processes.
Recall the multivariate normal distribution N (µ, Σ) in n dimensions. If
µ ∈ R^n and Σ is a non-negative definite n × n matrix, X has distribution
N (µ, Σ) if it has characteristic function

φX (t) := E exp{i t^T X} = exp{i t^T µ − (1/2) t^T Σ t}   (t ∈ R^n).
If further Σ is positive definite (so non-singular), X has density (Edgeworth's
Theorem of 1893: F. Y. Edgeworth (1845-1926), Irish statistician)

fX (x) = (2π)^{-n/2} |Σ|^{-1/2} exp{−(1/2) (x − µ)^T Σ^{-1} (x − µ)}.

A process X = (Xt )t≥0 is Gaussian if all its finite-dimensional distribu-


tions are Gaussian. Such a process can be specified by:
(i) a measurable function µ = µ(t) with EXt = µ(t),
(ii) a non-negative definite function σ(s, t) with

σ(s, t) = cov(Xs , Xt ).
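On a finite grid of times, (i) and (ii) alone suffice to sample such a process, via a Cholesky factor of the covariance matrix. A sketch (assuming NumPy; not from the notes) with µ = 0 and σ(s, t) = min(s, t), which reproduces Brownian motion:

```python
import numpy as np

t = np.linspace(0.01, 1.0, 50)        # grid of times (avoid t = 0: degenerate row)
Sigma = np.minimum.outer(t, t)        # sigma(s, t) = cov(X_s, X_t) = min(s, t)
L = np.linalg.cholesky(Sigma)         # Sigma = L L^T (positive definite here)
rng = np.random.default_rng(7)
X = L @ rng.standard_normal(len(t))   # one sample path at the grid times
```

Then X has covariance L L^T = Σ; replacing min(s, t) by another non-negative definite kernel samples a different Gaussian process.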

Gaussian processes have many interesting properties. Among these, we


quote Belyaev's dichotomy: with probability one, the paths of a Gaussian

process are either continuous, or extremely pathological: for example, un-
bounded above and below on any time-interval, however short. Naturally,
we shall confine attention in this course to continuous Gaussian processes.
3. Markov Processes.
X is Markov if for each t, each A ∈ σ(Xs : s > t) (the ‘future’) and
B ∈ σ(Xs : s < t) (the ‘past’),

P (A|Xt , B) = P (A|Xt ).

That is, if you know where you are (at time t), how you got there doesn’t
matter so far as predicting the future is concerned. Equivalently, past and
future are conditionally independent given the present.
The same definition applies to Markov processes in discrete time.
X is said to be strong Markov if the above holds with the fixed time t
replaced by a stopping time T (a random variable). This is a real restriction
of the Markov property in continuous time (though not in discrete time) –
another instance of the difference between the two.
4. Diffusions.
A diffusion is a path-continuous strong-Markov process such that for each
time t and state x the following limits exist:
µ(t, x) := lim_{h↓0} (1/h) E[(Xt+h − Xt )|Xt = x],

σ^2 (t, x) := lim_{h↓0} (1/h) E[(Xt+h − Xt )^2 |Xt = x].

Then µ(t, x) is called the drift, σ^2 (t, x) the diffusion coefficient. Then p(t, x, y),
the density of transitions from x to y in time t, satisfies the parabolic PDE

Lp = ∂p/∂t,   L := (1/2) σ^2 D^2 + µ D,   D := ∂/∂x.
The (2nd-order, linear) differential operator L is called the generator. Brow-
nian motion is the case σ = 1, µ = 0, and gives the heat equation (L = (1/2) D^2
in one dimension, half the Laplacian ∆ in higher dimensions).
It is not at all obvious, but it is true, that this definition does indeed
capture the nature of physical diffusion. Examples: heat diffusing through a
metal; smoke diffusing through air; dye diffusing through liquid; pollutants
diffusing through air or liquid.
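For BM the transition density is the Gaussian kernel p(t, x, y) = (2πt)^{-1/2} exp{−(y − x)^2 /2t}, and the heat equation above can be verified numerically by finite differences – a sketch assuming NumPy (not part of the notes):

```python
import numpy as np

def p(t, x, y):
    """Brownian transition density: the N(x, t) density evaluated at y."""
    return np.exp(-(y - x)**2 / (2*t)) / np.sqrt(2*np.pi*t)

t, x, y, h = 1.0, 0.0, 0.7, 1e-4
dp_dt = (p(t + h, x, y) - p(t - h, x, y)) / (2*h)                     # dp/dt
Lp = 0.5 * (p(t, x, y + h) - 2*p(t, x, y) + p(t, x, y - h)) / h**2    # (1/2) D^2 p
print(dp_dt, Lp)    # the two sides agree to finite-difference accuracy
```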

§4. Quadratic Variation (QV) of Brownian Motion; Itô’s Lemma

Recall that for ξ ∼ N (µ, σ^2 ), ξ has moment-generating function (MGF)

M (t) := E exp{tξ} = exp{µt + (1/2) σ^2 t^2 }.

Take µ = 0 below; for ξ ∼ N (0, σ^2 ),

M (t) := E exp{tξ} = exp{(1/2) σ^2 t^2 }
       = 1 + (1/2) σ^2 t^2 + (1/2!)((1/2) σ^2 t^2 )^2 + O(t^6 )
       = 1 + σ^2 t^2 /2! + 3σ^4 t^4 /4! + O(t^6 ).

So, as the Taylor coefficients of the MGF are the moments (hence the name!),

E(ξ^2 ) = var ξ = σ^2 ,   E(ξ^4 ) = 3σ^4 ,   so var(ξ^2 ) = E(ξ^4 ) − [E(ξ^2 )]^2 = 2σ^4 .
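These moments are easy to confirm by simulation (a sketch assuming NumPy, not part of the notes):

```python
import numpy as np

rng = np.random.default_rng(2)
sigma = 1.5
xi = sigma * rng.standard_normal(1_000_000)      # xi ~ N(0, sigma^2)
print(np.mean(xi**2))    # ~ sigma^2   = 2.25
print(np.mean(xi**4))    # ~ 3 sigma^4 = 15.1875
print(np.var(xi**2))     # ~ 2 sigma^4 = 10.125
```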

For B ∼ BM , this gives in particular

E Bt = 0,   var Bt = t,   E[(Bt )^2 ] = t,   var[(Bt )^2 ] = 2t^2 .

In particular, for t > 0 small, this shows that the variance of Bt^2 is negligible
compared with its expected value. Thus, the randomness in Bt^2 is negligible
compared to its mean for t small.
This suggests that if we take a fine enough partition P of [0, T ] – a finite
set of points
0 = t0 < t1 < · · · < tk = T
with |P| := max |ti − ti−1 | small enough – then writing

∆B(ti ) := B(ti ) − B(ti−1 ), ∆ti := ti − ti−1 ,

Σ(∆B(ti ))^2 will closely resemble Σ E[(∆B(ti ))^2 ], which is Σ∆ti = Σ(ti −
ti−1 ) = T . This is in fact true a.s.:

Σ(∆B(ti ))^2 → Σ∆ti = T   as max |ti − ti−1 | → 0.

This limit is called the quadratic variation V_T^2 of B over [0, T ]:

Theorem. The quadratic variation of a Brownian path over [0, T ] exists and
equals T , a.s.

For details of the proof, see e.g. [BK], §5.3.2, SP L22, SA L7,8.
If we increase t by a small amount to t + dt, the increase in the QV can
be written symbolically as (dBt )^2 , and the increase in t is dt. So, formally
we may summarise the theorem as

(dBt )^2 = dt.
Suppose now we look at the ordinary variation Σ|∆Bt |, rather than the
quadratic variation Σ(∆Bt )^2 . Then instead of Σ(∆Bt )^2 ∼ Σ∆t ∼ t, we get
Σ|∆Bt | ∼ Σ√∆t. Now for ∆t small, √∆t is of a larger order of magnitude
than ∆t. So if Σ∆t = t converges, Σ√∆t diverges to +∞. This suggests –
what is in fact true – the

Corollary. The paths of Brownian motion are of infinite variation - their


variation is +∞ on every interval, a.s.
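Both facts show up clearly in simulation. A sketch (assuming NumPy; not from the notes): refine the partition of [0, 1] and watch Σ(∆B)^2 settle at T = 1 while Σ|∆B| grows without bound:

```python
import numpy as np

rng = np.random.default_rng(3)
T, N = 1.0, 2**18
dB = rng.standard_normal(N) * np.sqrt(T / N)       # increments on the finest grid
for step in (2**10, 2**5, 1):                      # coarse -> fine partitions
    incs = dB.reshape(-1, step).sum(axis=1)        # Delta B on the coarser grid
    print(len(incs), (incs**2).sum(), np.abs(incs).sum())
    # columns: #intervals, Sigma (Delta B)^2 (-> T), Sigma |Delta B| (grows)
```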

The QV result above leads to Lévy’s 1948 result, the Martingale Char-
acterization of BM. Recall that Bt is a continuous martingale with respect
to its natural filtration (Ft ) and with QV t. There is a remarkable converse;
we give two forms.

Theorem (Lévy; Martingale Characterization of Brownian Mo-


tion). If M is any continuous local (Ft )-martingale with M0 = 0 and
quadratic variation t, then M is an (Ft )-Brownian motion.

Theorem (Lévy). If M is any continuous (Ft )-martingale with M0 = 0
and Mt^2 − t a martingale, then M is an (Ft )-Brownian motion.

For proof, see e.g. [RW1], I.2. Observe that for s < t,

Bt^2 = [Bs + (Bt − Bs )]^2 = Bs^2 + 2Bs (Bt − Bs ) + (Bt − Bs )^2 ,
E[Bt^2 |Fs ] = Bs^2 + 2Bs E[(Bt − Bs )|Fs ] + E[(Bt − Bs )^2 |Fs ] = Bs^2 + 0 + (t − s):
E[Bt^2 − t|Fs ] = Bs^2 − s:

Bt^2 − t is a martingale.

Quadratic Variation (QV).
The theory above extends to continuous martingales (bounded continu-
ous martingales in general, but we work on a finite time-interval [0, T ], so
continuity implies boundedness). We quote (for proof, see e.g. [RY], IV.1):

Theorem. A continuous martingale M is of finite quadratic variation ⟨M ⟩,
and ⟨M ⟩ is the unique continuous increasing adapted process vanishing at
zero with M ^2 − ⟨M ⟩ a martingale.

Corollary. A continuous martingale M has infinite variation.

Quadratic Covariation. We write ⟨M, M ⟩ for ⟨M ⟩, and extend ⟨ ⟩ to a bilin-
ear form ⟨., .⟩ with two different arguments by the polarization identity:

⟨M, N ⟩ := (1/4)(⟨M + N, M + N ⟩ − ⟨M − N, M − N ⟩).

If N is of finite variation, M ± N has the same QV as M , so ⟨M, N ⟩ = 0.

Itô’s Lemma.
We discuss Itô’s Lemma in more detail in §6 below; we pause here to give
the link with quadratic variation and covariation. We quote: if f (t, x1 , · · · , xd )
is C^1 in its zeroth (time) argument t and C^2 in its remaining d space argu-
ments xi , and M = (M ^1 , · · · , M ^d ) is a continuous vector martingale, then
(writing fi , fij for the first partial derivatives of f with respect to its ith
argument and the second partial derivatives with respect to the ith and jth
arguments) f (t, Mt ) has stochastic differential

df (t, Mt ) = f0 (t, Mt ) dt + Σ_{i=1}^d fi (t, Mt ) dMt^i + (1/2) Σ_{i,j=1}^d fij (t, Mt ) d⟨M ^i , M ^j ⟩t .
Integration by Parts. If f (t, x1 , x2 ) = x1 x2 , we obtain

d(M N )t = N dMt + M dNt + d⟨M, N ⟩t .
Similarly for stochastic integrals (defined below): if Zi := ∫ Hi dMi (i = 1, 2),
then d⟨Z1 , Z2 ⟩ = H1 H2 d⟨M1 , M2 ⟩.
Note. The integration-by-parts formula – a special case of Itô’s Lemma, as
above – is in fact equivalent to Itô’s Lemma: either can be used to derive the

other. Rogers & Williams [RW1, IV.32.4] describe the integration-by-parts
formula/Itô’s Lemma as ‘the cornerstone of stochastic calculus’.

Fractals Everywhere.
As we saw, a Brownian path is a fractal – a self-similar object. So too is
its zero-set Z. Fractals were studied, named and popularised by the French
mathematician Benoît B. Mandelbrot (1924-2010). See his books, and
Michael F. Barnsley: Fractals everywhere. Academic Press, 1988.
Fractals look the same at all scales – diametrically opposite to the familiar
functions of Calculus. In Differential Calculus, a differentiable function has a
tangent; this means that locally, its graph looks straight; similarly in Integral
Calculus. While most continuous functions we encounter are differentiable,
at least piecewise (i.e., except for ‘kinks’), there is a sense in which the typi-
cal, or generic, continuous function is nowhere differentiable. Thus Brownian
paths may look pathological at first sight – but in fact they are typical!

Hedging in continuous time.


Imagine hedging an option in continuous time. In discrete time, this
involves repeatedly rebalancing our portfolio between cash and stock; in con-
tinuous time, this has to be done continuously. The relevant stochastic pro-
cesses (Ch. VII) are geometric Brownian motion (GBM), relatives of BM,
which, like BM, have infinite variation (finite QV). This makes the rebal-
ancing problematic – indeed, impossible in these terms. Analogy: a cyclist
has to rebalance continuously, but does so smoothly, not with infinite varia-
tion! Or, think of continuous-time control of a manned space-craft (Kalman
filter). In practice, hedging has to be done discretely (as in Ch. V). Or, we
can use price processes with jumps – finite variation, but now the markets
are incomplete, so prices are no longer unique.

§5. Stochastic Integrals (Itô Calculus)

Stochastic integration was introduced by K. ITÔ in 1944, hence its name
Itô calculus. It gives a meaning to ∫_0^t X dY = ∫_0^t Xs (ω) dYs (ω), for suitable
stochastic processes X and Y , the integrand and the integrator. We shall con-
fine our attention here to the basic case with integrator Brownian motion:
fine our attention here to the basic case with integrator Brownian motion:
Y = B. Much greater generality is possible: for Y a continuous martingale,
see [KS] or [RY]; for a systematic general treatment, see
MEYER, P.-A. (1976): Un cours sur les intégrales stochastiques. Séminaire
de Probabilités X: Lecture Notes in Math. 511, 245-400, Springer.
The first thing to note is that stochastic integrals with respect to Brown-
ian motion, if they exist, must be quite different from the measure-theoretic
integral of III.2. For, the Lebesgue-Stieltjes integrals described there have
as integrators the difference of two monotone (increasing) functions (by Jor-
dan’s theorem), which are locally of finite (bounded) variation, FV. But we
know from §4 that Brownian motion is of infinite (unbounded) variation on
every interval. So Lebesgue-Stieltjes and Itô integrals must be fundamentally
different.
In view of the above, it is quite surprising that Itô integrals can be de-
fined at all. But if we take for granted Itô’s fundamental insight that they
can be, it is obvious how to begin and clear enough how to proceed. We
begin with the simplest possible integrands X, and extend successively much
as we extended the measure-theoretic integral of Ch. III.
1. Indicators.
If Xt (ω) = I[a,b] (t), there is exactly one plausible way to define ∫ X dB:

∫_0^t X dB, or ∫_0^t Xs (ω) dBs (ω), :=   0 if t ≤ a;   Bt − Ba if a ≤ t ≤ b;   Bb − Ba if t ≥ b.

2. Simple functions. Extend by linearity: if X is a linear combination of
indicators, X = Σ ci I[ai ,bi ] , we should define

∫_0^t X dB := Σ ci ∫_0^t I[ai ,bi ] dB.

Already one wonders how to extend this from constants ci to suitable ran-
dom variables, and one seeks to simplify the obvious but clumsy three-line
expressions above. It turns out that finite sums are not essential: one can
have infinite sums, but now we take the ci uniformly bounded.
We begin again, calling X simple if there is an infinite sequence

0 = t0 < t1 < · · · < tn < · · · → ∞

and uniformly bounded Ftn -measurable random variables ξn (|ξn | ≤ C for all
n and ω, for some C) such that Xt (ω) can be written in the form

Xt (ω) = ξ0 (ω)I{0} (t) + Σ_{i=0}^∞ ξi (ω)I(ti ,ti+1 ] (t)   (0 ≤ t < ∞, ω ∈ Ω).

The only definition of ∫_0^t X dB that agrees with the above for finite sums is,
if n is the unique integer with tn ≤ t < tn+1 ,

It (X) := ∫_0^t X dB = Σ_{i=0}^{n−1} ξi (B(ti+1 ) − B(ti )) + ξn (B(t) − B(tn ))
        = Σ_{i=0}^∞ ξi (B(t ∧ ti+1 ) − B(t ∧ ti ))   (0 ≤ t < ∞).
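This formula transcribes directly into code. A sketch (in Python – an assumption; for simplicity the ξi here are deterministic, whereas in general they may be random but Fti -measurable):

```python
def simple_integral(t, xi, knots, B):
    """I_t(X) = sum_i xi[i] * (B(t ^ knots[i+1]) - B(t ^ knots[i])),
    where xi[i] is the value of X on (knots[i], knots[i+1]] and B is a
    function giving the path of the integrator."""
    total = 0.0
    for i in range(len(xi)):
        a = min(t, knots[i])          # t ^ t_i
        b = min(t, knots[i + 1])      # t ^ t_{i+1}
        total += xi[i] * (B(b) - B(a))
    return total
```

As a sanity check, replacing the Brownian path by the smooth path B(t) = t reduces It (X) to the ordinary integral of the step function: with knots 0, 1/4, 1/2, 1 and values 1, −2, 3, I1 (X) = 1·(1/4) − 2·(1/4) + 3·(1/2) = 5/4.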

We note here some properties of the stochastic integral defined so far:


A. I0 (X) = 0 P − a.s.
B. Linearity. It (aX + bY ) = aIt (X) + bIt (Y ).
Proof. Linear combinations of simple functions are simple.
C. E[It (X)|Fs ] = Is (X) P -a.s. (0 ≤ s < t < ∞):
It (X) = ∫_0^t X dB is a continuous martingale.
Proof. There are two cases to consider.
(i) Both s and t belong to the same interval [tn , tn+1 ). Then

It (X) = Is (X) + ξn (B(t) − B(s)).

But ξn is Ftn -measurable, so Fs -measurable (tn ≤ s), so independent of


B(t) − B(s) (independent increments property of B). So

E[It (X)|Fs ] = Is (X) + ξn E[B(t) − B(s)|Fs ] = Is (X).

(ii) s and t belong to different intervals: s ∈ [tm , tm+1 ) for m < n. Then

E[It (X)|Fs ] = E(E[It (X)|Ftn ]|Fs )   (iterated conditional expectations)
             = E(Itn (X)|Fs ),

since ξn Ftn -measurable and the independent increments of B give

E[ξn (B(t) − B(tn ))|Ftn ] = ξn E[B(t) − B(tn )|Ftn ] = ξn · 0 = 0.

Continuing in this way, we can reduce successively to tm+1 :

E[It (X)|Fs ] = E[Itm (X)|Fs ].

But Itm (X) = Is (X) + ξm (B(s) − B(tm )); taking E[.|Fs ] the second term
gives zero as above, giving the result. //

Note. The stochastic integral for simple integrands is essentially a martingale


transform, and the above is essentially the proof of Ch. III that martingale

transforms are martingales.
We pause to note a property of martingales which we shall need below.
Call Xt − Xs the increment of X over (s, t]. Then for a martingale X,
the product of the increments over disjoint intervals has zero mean. For, if
s < t ≤ u < v,

E[(Xv − Xu )(Xt − Xs )] = E[E[(Xv − Xu )(Xt − Xs )|Fu ]]


= E[(Xt − Xs )E[(Xv − Xu )|Fu ]],

taking out what is known (as s, t ≤ u). The inner expectation is zero by the
martingale property, so the LHS is zero, as required.
D (Itô isometry). E[(It (X))^2 ], or E[(∫_0^t Xs dBs )^2 ], = E[∫_0^t Xs^2 ds].
Proof. The LHS above is E[It (X).It (X)], i.e.

E[(Σ_{i=0}^{n−1} ξi (B(ti+1 ) − B(ti )) + ξn (B(t) − B(tn )))^2 ].

Expanding the square, the cross-terms have expectation zero by the above, so
this is

E[Σ_{i=0}^{n−1} ξi^2 (B(ti+1 ) − B(ti ))^2 + ξn^2 (B(t) − B(tn ))^2 ].

Since ξi is Fti -measurable, each ξi^2 -term is independent of the squared Brown-
ian increment following it, which has expectation var(B(ti+1 ) − B(ti )) =
ti+1 − ti . So we obtain

Σ_{i=0}^{n−1} E[ξi^2 ](ti+1 − ti ) + E[ξn^2 ](t − tn ).

This is ∫_0^t E[Xu^2 ] du = E[∫_0^t Xu^2 du], as required.
E. Itô isometry (continued). It (X) − Is (X) = ∫_s^t Xu dBu satisfies

E[(∫_s^t Xu dBu )^2 |Fs ] = E[∫_s^t Xu^2 du|Fs ]   P -a.s.

Proof: as above.
F. Quadratic variation. The QV of It (X) = ∫_0^t Xu dBu is ∫_0^t Xu^2 du.
This is proved in the same way as the case X ≡ 1, that B has quadratic
variation process t.
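Property D can be checked by Monte Carlo for, say, Xu = Bu (adapted, continuous): E[(∫_0^1 B dB)^2 ] should equal E[∫_0^1 Bu^2 du] = ∫_0^1 u du = 1/2. A sketch assuming NumPy (not part of the notes), with the integral approximated by left-point sums:

```python
import numpy as np

rng = np.random.default_rng(5)
n_paths, N = 20_000, 256
dB = rng.standard_normal((n_paths, N)) * np.sqrt(1.0 / N)
B = np.cumsum(dB, axis=1)
B_left = np.hstack([np.zeros((n_paths, 1)), B[:, :-1]])   # B at left endpoints
I = (B_left * dB).sum(axis=1)     # Ito sums for int_0^1 B dB, one per path
print((I**2).mean())              # ~ 1/2, the Ito isometry (property D)
print(I.mean())                   # ~ 0, the martingale property (A, C)
```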

Integrands.
The properties above suggest that ∫_0^t X dB should be defined only for
processes with

∫_0^t E[Xu^2 ] du < ∞ for all t.
We shall restrict attention to such X in what follows. This gives us an L2 -
theory of stochastic integration (compare the L2 -spaces introduced in Ch.
II), for which Hilbert-space methods are available.

3. Approximation.
Recall steps 1 (indicators) and 2 (simple integrands). By analogy with
the integral of Ch. III, we seek a suitable class of integrands suitably ap-
proximable by simple integrands. It turns out that:
(i) The suitable class of integrands is the class of left-continuous adapted
processes X with ∫_0^t E[Xu^2 ] du < ∞ for all t > 0 (or all t ∈ [0, T ] with finite
time-horizon T , as here);
(ii) each such X may be approximated by a sequence of simple integrands
Xn so that the stochastic integral It (X) = ∫_0^t X dB may be defined as the
limit of It (Xn ) = ∫_0^t Xn dB;
(iii) the stochastic integral ∫_0^t X dB so defined still has properties A-F above.
It is not possible to include detailed proofs of these assertions in a course
of this type [recall that we did not construct the measure-theoretic integral
of Ch. III in detail either – and this is harder!]. The key technical ingredient
needed is the Kunita-Watanabe inequalities. See e.g. [KS], §§3.1-2.
One can define stochastic integration in much greater generality.
1. Integrands. The natural class of integrands X to use here is the class of
predictable processes. These include the left-continuous processes to which
we confine ourselves above.
2. Integrators. One can construct a closely analogous theory for stochastic
integrals with the Brownian integrator B above replaced by a continuous
local martingale integrator M (or more generally by a local martingale: see
below). The properties above hold, with D replaced by

E[(∫_0^t Xu dMu )^2 ] = E[∫_0^t Xu^2 d⟨M ⟩u ].

See e.g. [KS], [RY] for details.


One can generalise further to semimartingale integrators: these are pro-

cesses expressible as the sum of a local martingale and a process of (locally)
finite variation. Now C is replaced by: stochastic integrals of local martin-
gales are local martingales. See e.g. [RW1] or Meyer (1976) for details.

§6. Stochastic Differential Equations (SDEs) and Itô’s Lemma

Suppose that U, V are adapted processes, with U locally integrable (so
∫_0^t Us ds is defined as an ordinary integral, as in Ch. III), and V left-
continuous with ∫_0^t E[Vu^2 ] du < ∞ for all t (so ∫_0^t Vs dBs is defined as a
stochastic integral, as in §5). Then

Xt := x0 + ∫_0^t Us ds + ∫_0^t Vs dBs

defines a stochastic process X with X0 = x0 . It is customary, and convenient,


to express such an equation symbolically in differential form, in terms of the
stochastic differential equation

dXt = Ut dt + Vt dBt ,   X0 = x0 .   (SDE)
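Such SDEs can be simulated by the Euler-Maruyama scheme: replace dt and dBt by finite increments. A sketch (assuming NumPy, and taking U, V of the Markovian form U (t, x), V (t, x) for illustration – the text allows general adapted processes):

```python
import numpy as np

def euler_maruyama(U, V, x0, T, N, rng):
    """Approximate dX = U(t, X) dt + V(t, X) dB on [0, T] with N steps."""
    dt = T / N
    x, t = x0, 0.0
    path = [x0]
    for _ in range(N):
        dB = rng.standard_normal() * np.sqrt(dt)   # Brownian increment ~ N(0, dt)
        x = x + U(t, x) * dt + V(t, x) * dB
        t += dt
        path.append(x)
    return np.array(path)

# Example: dX = 0.05 X dt + 0.2 X dB, X_0 = 1 (geometric Brownian motion, Ch. VII).
rng = np.random.default_rng(11)
path = euler_maruyama(lambda t, x: 0.05 * x, lambda t, x: 0.2 * x, 1.0, 1.0, 500, rng)
```

With V ≡ 0 the scheme reduces to Euler's method for the ODE dx = U dt.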

Now suppose that f : R2 → R is a function, continuously differentiable


once in its first argument (which will denote time), and twice in its second
argument (space): f ∈ C 1,2 . The question arises of giving a meaning to the
stochastic differential df (t, Xt ) of the process f (t, Xt ), and finding it.
Recall the Taylor expansion of a smooth function of several variables,
f (x0 , x1 , · · · , xd ) say. We use suffices to denote partial derivatives: fi :=
∂f /∂xi , fi,j := ∂^2 f /∂xi ∂xj (recall that if partials not only exist but are
continuous, then the order of partial differentiation can be changed: fi,j =
fj,i , etc.). Then for x = (x0 , x1 , · · · , xd ) near u,

f (x) = f (u) + Σ_{i=0}^d (xi − ui )fi (u) + (1/2) Σ_{i,j=0}^d (xi − ui )(xj − uj )fi,j (u) + · · ·
In our case (writing t_0 in place of 0 for the starting time):

f(t, X_t) = f(t_0, X(t_0)) + (t - t_0) f_1(t_0, X(t_0)) + (X(t) - X(t_0)) f_2 + \frac{1}{2}(t - t_0)^2 f_{11} + (t - t_0)(X(t) - X(t_0)) f_{12} + \frac{1}{2}(X(t) - X(t_0))^2 f_{22} + \cdots,

which may be written symbolically as

df(t, X(t)) = f_1\,dt + f_2\,dX + \frac{1}{2} f_{11}(dt)^2 + f_{12}\,dt\,dX + \frac{1}{2} f_{22}(dX)^2 + \cdots.
In this, we
(i) substitute dX_t = U_t\,dt + V_t\,dB_t from above,
(ii) substitute (dB_t)^2 = dt, i.e. |dB_t| = \sqrt{dt}, from §4:

df = f_1\,dt + f_2(U\,dt + V\,dB) + \frac{1}{2} f_{11}(dt)^2 + f_{12}\,dt(U\,dt + V\,dB) + \frac{1}{2} f_{22}(U\,dt + V\,dB)^2 + \cdots
Now using (dB)^2 = dt,

(U\,dt + V\,dB)^2 = V^2\,dt + 2UV\,dt\,dB + U^2(dt)^2 = V^2\,dt + \text{higher-order terms}:

df = \Big(f_1 + U f_2 + \frac{1}{2} V^2 f_{22}\Big)dt + V f_2\,dB + \text{higher-order terms}.
Summarising, we obtain Itô’s Lemma, the analogue for the Itô or stochastic
calculus of the chain rule for ordinary (Newton-Leibniz) calculus:

Theorem (Itô's Lemma). If X_t has stochastic differential

dX_t = U_t\,dt + V_t\,dB_t, \qquad X_0 = x_0,

and f ∈ C^{1,2}, then f = f(t, X_t) has stochastic differential

df = \Big(f_1 + U f_2 + \frac{1}{2} V^2 f_{22}\Big)dt + V f_2\,dB_t.

That is, writing f_0 for f(0, x_0), the initial value of f,

f(t, X_t) = f_0 + \int_0^t \Big(f_1 + U f_2 + \frac{1}{2} V^2 f_{22}\Big)ds + \int_0^t V f_2\,dB_s.
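As an illustrative check (a standard example, not in the original text): take f(t, x) = x^2 and X = B itself, so U ≡ 0, V ≡ 1. Then f_1 = 0, f_2 = 2x, f_{22} = 2, and the Lemma gives

d(B_t^2) = dt + 2B_t\,dB_t, \qquad \text{i.e.} \qquad B_t^2 - t = 2\int_0^t B\,dB.

The extra dt term, absent in ordinary (Newton-Leibniz) calculus, exhibits B_t^2 - t as a stochastic integral, hence a martingale.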

This important result may be summarised as follows: use Taylor's theorem formally, together with the rule

(dt)^2 = 0, \quad dt\,dB = 0, \quad (dB)^2 = dt.

Itô's Lemma extends to higher dimensions, as does the rule above:

df = \Big(f_0 + \sum_{i=1}^d U_i f_i + \frac{1}{2}\sum_{i=1}^d V_i^2 f_{ii}\Big)dt + \sum_{i=1}^d V_i f_i\,dB_i

(where U_i, V_i, B_i denote the ith coordinates of the vectors U, V, B, and f_i, f_{ii} denote partials as above, with index 0 for time); here the formal rule is

(dt)^2 = 0, \quad dt\,dB_i = 0, \quad (dB_i)^2 = dt, \quad dB_i\,dB_j = 0 \ (i \neq j).
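The formal rule is shorthand for the quadratic-variation facts of §4. A minimal numerical sketch of the one-dimensional case (parameter values illustrative): over a fine partition of [0, t], the sum of (ΔB)^2 approximates t, while the sums of Δt ΔB and (Δt)^2 are negligible.

```python
import math
import random

random.seed(3)

def formal_rules(t=2.0, n_steps=100000):
    """Partition-sum versions of (dB)^2 = dt, dt dB = 0, (dt)^2 = 0
    over a uniform partition of [0, t]."""
    dt = t / n_steps
    sum_db2 = 0.0   # sum of (dB)^2: tends to t (quadratic variation)
    sum_dtdb = 0.0  # sum of dt * dB: tends to 0
    for _ in range(n_steps):
        db = random.gauss(0.0, math.sqrt(dt))
        sum_db2 += db * db
        sum_dtdb += dt * db
    sum_dt2 = n_steps * dt * dt  # sum of (dt)^2 = t^2/n: tends to 0
    return sum_db2, sum_dtdb, sum_dt2

qv, cross, dt2 = formal_rules()
# Theory: qv -> 2.0, cross -> 0, dt2 -> 0 as the partition refines.
```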


Corollary. E[f(t, X_t)] = f_0 + \int_0^t E\big[f_1 + U f_2 + \frac{1}{2} V^2 f_{22}\big]\,ds.

Proof. \int_0^t V f_2\,dB is a stochastic integral, so a martingale, so its expectation is constant (= 0, as it starts at 0). //

Note. Powerful as it is in the setting above, Itô’s Lemma really comes into its
own in the more general setting of semimartingales. It says there that if X is
a semimartingale and f is a smooth function as above, then f (t, X(t)) is also
a semimartingale. The ordinary differential dt gives rise to the bounded-variation part; the stochastic differential dB gives rise to the martingale part.
This closure property under very general non-linear operations is very pow-
erful and important.

Example: The Ornstein-Uhlenbeck Process.


The most important example of an SDE for us is that for geometric Brownian motion (VII.1 below). We close here with another example.
Consider now a model of the velocity V_t of a particle at time t (V_0 = v_0), moving through a fluid or gas, which exerts
(i) a frictional drag, assumed proportional to the velocity,
(ii) a noise term resulting from the random bombardment of the particle by the molecules of the surrounding fluid or gas. The basic model is the SDE

dV = −βV dt + cdB, (OU )

whose solution is called the Ornstein-Uhlenbeck (velocity) process with relaxation time 1/β and diffusion coefficient D := \frac{1}{2}c^2/\beta^2. It is a stationary Gaussian Markov process (not a stationary-increments Gaussian Markov process like Brownian motion), whose limiting (ergodic) distribution is N(0, βD) (the Maxwell-Boltzmann distribution of Statistical Mechanics) and whose limiting correlation function is e^{-\beta|\cdot|}.
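These limiting statements can be illustrated numerically. The sketch below (parameter values illustrative) uses the exact Gaussian transition of the OU process, V_{t+h} = e^{-βh} V_t + N(0, c^2(1 - e^{-2βh})/(2β)), rather than an Euler step, and checks that the long-run law approaches N(0, βD) = N(0, c^2/(2β)):

```python
import math
import random

random.seed(7)

def ou_path_end(v0, beta, c, T=10.0, n_steps=100):
    """Simulate V_T for dV = -beta*V dt + c dB using the exact
    Gaussian transition density of the OU process (no Euler bias)."""
    dt = T / n_steps
    decay = math.exp(-beta * dt)
    noise_sd = c * math.sqrt((1.0 - decay * decay) / (2.0 * beta))
    v = v0
    for _ in range(n_steps):
        v = decay * v + noise_sd * random.gauss(0.0, 1.0)
    return v

beta, c = 1.0, 0.5
# For T >> 1/beta the process forgets v0 and is near its ergodic
# law N(0, c^2/(2*beta)) = N(0, 0.125).
samples = [ou_path_end(2.0, beta, c) for _ in range(20000)]
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
```

Note how mean reversion shows up directly: the starting value v_0 = 2 is forgotten at rate e^{-βT}, and the sample mean ends near 0.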
If we integrate the OU velocity process to get the OU displacement pro-
cess, we lose the Markov property (though the process is still Gaussian).
Being non-Markov, the resulting process is much more difficult to analyse.
The OU process is the prototype of processes exhibiting mean reversion, or a central push: frictional drag acts as a restoring force tending to push the process back towards its mean. It is important in many areas, including
process back towards its mean. It is important in many areas, including
(i) statistical mechanics, where it originated,
(ii) mathematical finance, where it appears in the Vasicek model for the term structure of interest rates (the mean represents the 'natural' interest rate),
(iii) stochastic volatility models, where the volatility σ itself is now a stochastic process σ_t, subject to an SDE of OU type.

Theory of interest rates.


This subject dominates the mathematics of money markets, or bond markets. These are more important in today's world than stock markets, but are
more complicated, so we must be brief here. The area is crucially important
in macro-economic policy, and in political decision-making, particularly after
the financial crisis (“credit crunch”). Government policy is driven by fear of
speculators in the bond markets (rather than aimed at inter-governmental
cooperation against them). The mathematics is infinite-dimensional (at each
time-point t we have a whole yield curve over future times), but reduces to
finite-dimensionality: bonds are only offered at discrete times, with a tenor
structure (a finite set of maturity times).
Mean reversion is used in models, to reflect the underlying 'natural interest rate', from which deviations may occur due to short-term pressures.
Note.
The 'short-term pressures' arising from the Crash or Credit Crunch of 2007-8 and on have now lasted a decade! Interest rates have been historically low (to the benefit of borrowers such as mortgage-holders, and the detriment of savers, for example). In the last days of September 2017, the Governor of the Bank of England, Mark Carney, said that bank rate may well rise (we shall see – the decision is taken by the Monetary Policy Committee, on which the Governor has one vote out of nine). You may be interested to compare this with the actions of the Fed in recent years.
