
MATHEMATICAL METHODS FOR FINANCIAL ENGINEERING

BRAD BAXTER
Version: 201411291333

Contents
1. Introduction
1.1. Reading List
2. The Geometric Brownian Motion Universe
2.1. Analytic Values of European Puts and Calls
3. Brownian Motion
3.1. Simple Random Walk
3.2. Discrete Brownian Motion
3.3. Basic Properties of Brownian Motion
3.4. Martingales
3.5. Brownian Motion and Martingales
3.6. The Black-Scholes Equation
3.7. Itô Calculus
3.8. Multivariate Geometric Brownian Motion
3.9. Asian Options
4. The Binomial Model Universe
5. The Partial Differential Equation Approach
5.1. The Diffusion Equation
5.2. Finite Difference Methods for the Diffusion Equation
5.3. The Fourier Transform and the von Neumann Stability Test
5.4. Stability and the Fourier Transform
5.5. Option Pricing via the Fourier transform
5.6. Fourier Transform Conventions
6. Mathematical Background Material
6.1. Probability Theory
6.2. Differential Equations
6.3. Recurrence Relations
6.4. Mortgages: a once-exotic instrument
6.5. Pricing Mortgages via lack of arbitrage
6.6. Exponential Growth and Population Rhetoric
7. Numerical Linear Algebra
7.1. Orthogonal Matrices



1. Introduction
You can access these notes, and other material, on my office machine:
http://econ109.econ.bbk.ac.uk/brad/Methods/
These notes are updated fairly regularly, so please check for newer versions from
time to time.
Past exams can be downloaded from
http://econ109.econ.bbk.ac.uk/brad/FinEngExams/
However, in my experience, students will benefit enormously from the old-fashioned
method of taking notes as I lecture. I shall post further material on my office
machine throughout the year.
This is not a course in pricing, but I have tried to illustrate topics by examples
drawn from mathematical finance.
1.1. Reading List. There are many books on mathematical finance, but very few
good ones. My strongest recommendations are for the books by Higham and Kwok.
However, the following books are all useful.
(i) M. Baxter and A. Rennie, Financial Calculus, Cambridge University Press.
This gives a fairly informal description of the mathematics of pricing,
concentrating on martingales. It's not a source of information for efficient
numerical methods.
(ii) A. Etheridge, A Course in Financial Calculus, Cambridge University Press.
This does not focus on the algorithmic side but is very lucid.
(iii) D. Higham, An Introduction to Financial Option Valuation, Cambridge
University Press. This book provides many excellent Matlab examples,
although its mathematical level is undergraduate.
(iv) J. Hull, Options, Futures and Other Derivatives, 6th edition. [Earlier
editions are probably equally suitable for much of the course.] Fairly
clear, with lots of background information on finance. The mathematical
treatment is lower than the level of much of our course (and this is not
a mathematically rigorous book), but it's still the market leader in many
ways.
(v) D. Kennedy, Stochastic Financial Models, Chapman and Hall. This is
an excellent mathematical treatment, but probably best left until after
completing the first term of Methods.
(vi) J. Michael Steele, Stochastic Calculus and Financial Applications, Springer.
This is an excellent book, but is one to read near the end of this term,
once you are more comfortable with fundamentals.
(vii) P. Wilmott, S. Howison and J. Dewynne, The Mathematics of Financial
Derivatives, Cambridge University Press. This book is very useful for
its information on partial differential equations. If your first degree was
in engineering, mathematics or physics, then you probably spent many
happy hours learning about the diffusion equation. This book is very
much mathematical finance from the perspective of a traditional applied
mathematician. It places much less emphasis on probability theory than
most books on finance.
(viii) P. Wilmott, Paul Wilmott introduces Quantitative Finance, 2nd edition,
John Wiley. More chatty than his previous book. The author's ego grew


enormously between the appearance of these texts, but there's some good
material here.
(ix) Y.-K. Kwok, Mathematical Models of Financial Derivatives, Springer. Rather
dry, but very detailed treatment of finite difference methods. If you need
a single book for general reference work, then this is probably it.
There are lots of books suitable for mathematical revision. The Schaum series
publishes many good inexpensive textbooks providing worked examples. The inexpensive
paperback Calculus, by K. G. Binmore (Cambridge University Press) will also
be useful to students wanting an introduction to, say, multiple integrals, as will
Mathematical Methods for Science Students, by Geoff Stephenson. At a slightly
higher level, All you wanted to know about Mathematics but were afraid to ask, by
L. Lyons (Cambridge University Press, 2 vols), is useful and informal.
The ubiquitous Numerical Recipes in C++, by S. Teukolsky et al., is extremely
useful. Its coverage of numerical methods is generally reliable and it's available
online at www.nr.com. A good hard book on partial differential equations is that
of A. Iserles (Cambridge University Press).
At the time of writing, finance is going through a turbulent period, in which
politicians profess their longstanding doubts that the subject was well-founded;
surprisingly, many omitted to voice such doubts earlier! It is good to know that
we have been here before. The following books are included for general cultural
interest.
(i) M. Balen, A Very English Deceit: The Secret History of the South Sea
Bubble and the First Great Financial Scandal.
(ii) P. L. Bernstein, Capital Ideas Evolving. This is an optimistic history of
mathematical finance.
(iii) C. Eagleton and J. William (eds), Money: A History.
(iv) C. P. Kindleberger, R. Aliber and R. Solow, Manias, Panics, and Crashes:
A History of Financial Crises, Wiley.
(v) N. N. Taleb, The Black Swan. In my view, this is greatly over-rated, but
you should still read it.
No text is perfect: please report any slips to [email protected].


2. The Geometric Brownian Motion Universe


We shall begin with a brisk introduction to the main topics, filling in the details
later.
Let r be the risk-free interest rate, which we shall assume constant; we assume
that money can be borrowed and lent at this rate, so that such debts (or investments,
if lent) satisfy Bt = B0 exp(rt). We shall assume that every asset is described by a
random process called geometric Brownian motion:

(2.1)    S(t) = S(0) exp( (r − σ²/2)t + σWt ),    t > 0,

where Wt denotes Brownian motion and σ is a non-negative parameter called the
volatility of the asset. The later section on Brownian motion introduces this in
detail, but all we need for now is the fundamental property that

(2.2)    Wt ∼ N(0, t),    for all t > 0.

Thus, to generate sample prices S(T) at some future time T given the initial
price S(0), we use

(2.3)    S(T) = S(0) exp( (r − σ²/2)T + σ√T Z_T ),

where Z_T ∼ N(0, 1).
Example 2.1. Generating sample prices at a fixed time T using (2.3) is particularly
easy in Matlab and Octave:
S = S0*exp((r-sigma^2/2)*T + sigma*sqrt(T)*randn(m,1));
will construct a column vector of m sample prices once you've defined S0, r, sigma,
T and m. To calculate the sample average price, we type sum(S)/m.
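The same sampling step can be written in Python; the following NumPy sketch is a translation of the Matlab line above, with illustrative parameter values (S0, r, sigma, T and m are my choices, not values from the notes):

```python
import numpy as np

# Illustrative parameters (my choices, not from the notes).
S0, r, sigma, T = 42.0, 0.10, 0.20, 0.5
m = 200_000  # number of sample prices

rng = np.random.default_rng(0)
Z = rng.standard_normal(m)

# Risk-neutral GBM sample prices at time T, as in equation (2.3).
S = S0 * np.exp((r - sigma**2 / 2) * T + sigma * np.sqrt(T) * Z)

print(S.mean())  # should be close to S0*exp(r*T), anticipating Lemma 2.2
```

The sample average approximates ES(T) = S(0)e^{rT}, which is exactly the growth property proved in Lemma 2.2 below.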
To analytically calculate ES(T) we need the following simple, yet crucial, lemma.
Lemma 2.1. If W ∼ N(0, 1), then E exp(λW) = exp(λ²/2).
Proof. We have

E e^{λW} = ∫_{−∞}^{∞} e^{λt} (2π)^{−1/2} e^{−t²/2} dt = (2π)^{−1/2} ∫_{−∞}^{∞} e^{−(t² − 2λt)/2} dt.

The trick now is to complete the square in the exponent, that is,

t² − 2λt = (t − λ)² − λ².

Thus

E e^{λW} = (2π)^{−1/2} ∫_{−∞}^{∞} exp( −[(t − λ)² − λ²]/2 ) dt = e^{λ²/2}.

This is also described in detail in Example 6.3.
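Lemma 2.1 is easy to sanity-check by simulation. The following Python sketch (λ = 0.5 and the sample size are arbitrary choices) compares a Monte Carlo estimate of E exp(λW) with exp(λ²/2):

```python
import numpy as np

lam = 0.5                          # arbitrary test value of lambda
rng = np.random.default_rng(1)
W = rng.standard_normal(1_000_000)  # samples of W ~ N(0, 1)

mc = np.exp(lam * W).mean()        # Monte Carlo estimate of E exp(lam*W)
exact = np.exp(lam**2 / 2)         # value predicted by Lemma 2.1
print(mc, exact)                   # the two values should agree closely
```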

Lemma 2.2. For every σ ≥ 0, we have the expected growth

(2.4)    ES(t) = S(0) e^{rt},    t ≥ 0.

Proof. This is an easy consequence of Lemma 2.1.

The geometric Brownian motion universe might therefore seem rather strange,
because every asset has the same expected growth e^{rt} as the risk-free interest rate.
This is usually called the risk-neutral model and is chosen deliberately. Thus our
universe of all possible assets in a risk-neutral world is specified by one parameter
only: the volatility σ. It is not our task to discuss this model in detail at the present
time.
A financial derivative is any function f(S, t). We shall concentrate on the
following particular class of derivatives.
Definition 2.1. A European option is any function f ≡ f(S, t) that satisfies the
conditional expectation equation

(2.5)    f(S(t), t) = e^{−rh} E( f(S(t + h), t + h) | S(t) ),

for any h > 0.
We shall often simply write this as

f(S(t), t) = e^{−rh} E f(S(t + h), t + h),

but you should take care to remember that this is an expected future value given
the asset's current value S(t). We see that (2.5) describes a contract f(S, t) whose
current value is the discounted value of its expected future value in the risk-neutral
GBM universe.
We can learn a great deal by studying the mathematical consequences of (2.5)
and (2.1).
Example 2.2. A plain vanilla European put option is simply an insurance contract
that allows us to sell one unit of the asset, for exercise price K, at time T in the
future. If the asset's price S(T) is less than K at this expiry time, then the option is
worth K − S(T), otherwise it's worthless. Such contracts protect us if we're worried
that the asset's price might drop. The pricing problem here is to calculate the value
of the contract at time zero given its value at expiry, namely

(2.6)    fP(S(T), T) = (K − S(T))₊,

where (z)₊ := max{z, 0}.

Typically, we know the value of the option f(S(T), T) for all values of the asset
S(T) at some future time T. Our problem is to compute its value at some earlier
time, because we're buying or selling this option.
Example 2.3. A plain vanilla European call option gives us the right to buy one
unit of the asset at the exercise price K at time T. If the asset's price S(T) exceeds
K at this expiry time, then the option is worth S(T) − K, otherwise it's worthless,
implying the expiry value

(2.7)    fC(S(T), T) = (S(T) − K)₊,

using the same notation as Example 2.2. Such contracts protect us if we're worried
that the asset's price might rise.
How do we compute f(S(0), 0)? The difficult part is computing the expected
future value Ef(S(T), T). This can be done analytically for a tiny number of
options, including the European Put and Call (see Theorem 2.4), but usually we
must resort to a numerical calculation. This leads us to our first algorithm: Monte
Carlo simulation. Here we choose a large integer N and generate N pseudo-random
numbers Z1, Z2, …, ZN that have the normalized Gaussian distribution; in Matlab,
we simply write Z = randn(N,1). Using (2.1), these generate the future asset prices

(2.8)    Sk = S(0) exp( (r − σ²/2)T + σ√T Zk ),    k = 1, …, N.


We then approximate the future expected value by an average, that is, we take

(2.9)    f(S(0), 0) ≈ (e^{−rT}/N) Σ_{k=1}^{N} f(Sk, T).

Monte Carlo simulation has the great advantage that it is extremely simple to
program. Its disadvantage is that the error is usually a multiple of 1/√N, so that
very large N is needed for high accuracy (each decimal place of accuracy requires
about a hundred times more work). We note that (2.9) will compute the value of
any European option that is completely defined by a known final value f(S(T), T).
We shall now use Monte Carlo to approximately evaluate the European Call and
Put contracts. In fact, Put-Call parity, described below in Theorem 2.3, implies
that we only need a program to calculate one of these, because they are related by
the simple formula

(2.10)    fC(S(0), 0) − fP(S(0), 0) = S(0) − K e^{−rT}.
Here's the Matlab program for the European Put.
%
% These are the parameters chosen in Example 11.6 of
% OPTIONS, FUTURES AND OTHER DERIVATIVES,
% by John C. Hull (Prentice Hall, 4th edn, 2000)
%
% initial stock price
S0 = 42;
% unit of time = year
% 250 working days per year
% continuous compounding risk-free rate
r = 0.1;
% exercise price
K = 40;
% time to expiration in years
T = 0.5;
% volatility of 20 per cent annually
sigma = 0.2;
% number of Monte Carlo samples
N = 100000;
% generate asset prices at expiry
Z = randn(N,1);
ST = S0*exp( (r-(sigma^2)/2)*T + sigma*sqrt(T)*Z );
% calculate put contract values at expiry
fput = max(K - ST,0.0);
% average put values at expiry and discount to present
mc_put = exp(-r*T)*sum(fput)/N
% calculate analytic value of put contract
wK = (log(K/S0) - (r - (sigma^2)/2)*T)/(sigma*sqrt(T));
a_put = K*exp(-r*T)*Phi(wK) - S0*Phi(wK - sigma*sqrt(T))
The function Phi denotes the cumulative distribution function for the normalized
Gaussian distribution. Unfortunately, Matlab only provides the very similar error
function, defined by

erf(y) = (2/√π) ∫₀^y exp(−s²) ds,    y ∈ ℝ.

It's not hard to prove that

Φ(t) = ( 1 + erf(t/√2) ) / 2,    t ∈ ℝ.

We can add this to Matlab using the following function.
function Y = Phi(t)
Y = 0.5*(1.0 + erf(t/sqrt(2)));
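For readers working in Python rather than Matlab, the script above translates almost line for line; here is a sketch of that translation with the same Hull parameters, using math.erf in place of Matlab's erf (the sample size N is my choice):

```python
import numpy as np
from math import erf, exp, log, sqrt

def Phi(t):
    """Cumulative distribution function of the normalized Gaussian."""
    return 0.5 * (1.0 + erf(t / sqrt(2.0)))

# Parameters from Example 11.6 of Hull (4th edn, 2000), as in the Matlab script.
S0, r, K, T, sigma = 42.0, 0.1, 40.0, 0.5, 0.2
N = 100_000  # number of Monte Carlo samples (my choice)

rng = np.random.default_rng(2)
Z = rng.standard_normal(N)
ST = S0 * np.exp((r - sigma**2 / 2) * T + sigma * np.sqrt(T) * Z)

# Average the discounted put payoffs, as in (2.9).
mc_put = exp(-r * T) * np.maximum(K - ST, 0.0).mean()

# Analytic value from (2.13)-(2.14).
wK = (log(K / S0) - (r - sigma**2 / 2) * T) / (sigma * sqrt(T))
a_put = K * exp(-r * T) * Phi(wK) - S0 * Phi(wK - sigma * sqrt(T))

print(mc_put, a_put)  # both close to 0.81 for these parameters
```

The Monte Carlo and analytic values should agree to roughly 1/√N, which is the useful check described in the text.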
We have only revealed the tip of a massive iceberg in this brief introduction.
Firstly, the Black-Scholes model, where asset prices evolve according to (2.1), is
rather poor: reality is far messier. Further, there are many types of option which
are path-dependent: the value of the option at expiry depends not only on the final
price S(T), but on its previous values {S(t) : 0 ≤ t ≤ T}. In particular, there are
American options, where the contract can be exercised at any time before its expiry.
All of these points will be addressed in our course, but you should find that Hull's
book provides excellent background reading (although his mathematical treatment
is often sketchy). Higham provides a clear Matlab-based exposition.
Although the future expected value usually requires numerical computation,
there are some simple cases that are analytically tractable. These are particularly
important because they often arise in examinations!
2.1. Analytic Values of European Puts and Calls. It's not too hard to calculate
the values of these options analytically. Further, the next theorem gives an important
relation between the prices of call and put options.
Theorem 2.3 (Put-Call parity). European Put and Call options satisfy

(2.11)    fC(S(0), 0) − fP(S(0), 0) = S(0) − K e^{−rT}.

Proof. The trick is the observation that

y = y₊ − (−y)₊,    for any y ∈ ℝ.

Thus

S(T) − K = fC(S(T), T) − fP(S(T), T),

which implies

e^{−rT} ( ES(T) − K ) = fC(S(0), 0) − fP(S(0), 0).

Now

ES(T) = (2π)^{−1/2} ∫_{−∞}^{∞} S(0) e^{(r − σ²/2)T + σ√T w} e^{−w²/2} dw
      = S(0) e^{(r − σ²/2)T} (2π)^{−1/2} ∫_{−∞}^{∞} e^{−(w² − 2σ√T w)/2} dw
      = S(0) e^{rT},

and some simple algebraic manipulation completes the proof.


This is a useful check on the Monte Carlo approximations of the options' values.
To derive their analytic values, we shall need the function

(2.12)    Φ(y) = (2π)^{−1/2} ∫_{−∞}^{y} e^{−z²/2} dz,    y ∈ ℝ,

which is simply the cumulative distribution function of the Gaussian probability
density, that is, P(Z ≤ y) = Φ(y) and P(a ≤ Z ≤ b) = Φ(b) − Φ(a), for any
normalized Gaussian random variable Z.
Theorem 2.4. A European Put option satisfies

(2.13)    fP(S(0), 0) = K e^{−rT} Φ(w(K)) − S(0) Φ(w(K) − σ√T),

where w(K) is defined by the equation

K = S(0) e^{(r − σ²/2)T + σ√T w(K)},

that is,

(2.14)    w(K) = ( log(K/S(0)) − (r − σ²/2)T ) / (σ√T).

Proof. We have

E fP(S(T), T) = (2π)^{−1/2} ∫_{−∞}^{∞} ( K − S(0) e^{(r − σ²/2)T + σ√T w} )₊ e^{−w²/2} dw.

Now the function

w ↦ K − S(0) exp( (r − σ²/2)T + σ√T w )

is strictly decreasing, so that

K − S(0) e^{(r − σ²/2)T + σ√T w} ≥ 0

if and only if w ≤ w(K), where w(K) is given by (2.14). Hence

E fP(S(T), T) = (2π)^{−1/2} ∫_{−∞}^{w(K)} ( K − S(0) e^{(r − σ²/2)T + σ√T w} ) e^{−w²/2} dw
             = K Φ(w(K)) − S(0) e^{(r − σ²/2)T} (2π)^{−1/2} ∫_{−∞}^{w(K)} e^{−(w² − 2σ√T w)/2} dw
             = K Φ(w(K)) − S(0) e^{rT} Φ(w(K) − σ√T).

Discounting this expectation to its present value, we derive the option price.

Exercise 2.1. Modify the proof of this theorem to derive the analytic price of a
European Call option. Check that your price agrees with Put-Call parity.
3. Brownian Motion
3.1. Simple Random Walk. Let X1, X2, … be a sequence of independent random
variables all of which satisfy

(3.1)    P(Xi = 1) = P(Xi = −1) = 1/2,

and define

(3.2)    Sn = X1 + X2 + ··· + Xn.

We can represent this graphically by plotting the points {(n, Sn) : n = 1, 2, …},
and one way to imagine this is as a random walk, in which the walker takes identical
steps forwards or backwards, each with probability 1/2. This model is called simple
random walk and, whilst easy to define, is a useful laboratory in which to improve
probabilistic intuition.
Another way to imagine Sn is to consider a game in which a fair coin is tossed
repeatedly. If I win the toss, then I win 1; losing the toss implies a loss of 1.
Thus Sn is my fortune at time n.
Firstly note that

ESn = EX1 + ··· + EXn = 0.

Further, EXi² = 1, for all i, so that var Xi = 1. Hence

var Sn = var X1 + var X2 + ··· + var Xn = n,

since X1, …, Xn are independent random variables.
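These two moments are easy to confirm by simulation; here is a short Python check (walk length and sample count are arbitrary choices):

```python
import numpy as np

n = 100         # number of steps in each walk
paths = 50_000  # number of independent walks

rng = np.random.default_rng(3)
# Each step is +1 or -1, each with probability 1/2.
X = rng.choice([-1, 1], size=(paths, n))
Sn = X.sum(axis=1)  # the value S_n for each simulated walk

print(Sn.mean(), Sn.var())  # approximately 0 and n
```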

3.2. Discrete Brownian Motion. We begin with a slightly more complicated
random walk this time. We choose a timestep h > 0 and let Z1, Z2, … be independent
N(0, h) Gaussian random variables. We then define a curve B^{(h)}(t) by setting
B^{(h)}(0) = 0 and

(3.3)    B^{(h)}(kh) = Z1 + Z2 + ··· + Zk,

for positive integer k. We then join the dots to obtain a piecewise linear function.
More precisely, we define

B^{(h)}(t) = B^{(h)}(kh) + (t − kh) ( B^{(h)}((k+1)h) − B^{(h)}(kh) ) / h,    for t ∈ (kh, (k+1)h).

The resultant random walk is called discrete Brownian motion.
Proposition 3.1. If 0 ≤ a ≤ b ≤ c and a, b, c ∈ hℤ, then the discrete Brownian
motion increments B^{(h)}(c) − B^{(h)}(b) and B^{(h)}(b) − B^{(h)}(a) are independent
random variables. Further, B^{(h)}(c) − B^{(h)}(b) ∼ N(0, c − b) and
B^{(h)}(b) − B^{(h)}(a) ∼ N(0, b − a).
Proof. Exercise.
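Construction (3.3) is just a cumulative sum of independent N(0, h) increments; the following Python sketch builds one discrete Brownian path and evaluates the piecewise linear interpolant (timestep and horizon are arbitrary choices):

```python
import numpy as np

h = 0.01  # timestep
k = 1000  # number of steps, so the path covers [0, k*h]
rng = np.random.default_rng(4)

# Z_1, Z_2, ... are independent N(0, h) random variables.
Z = np.sqrt(h) * rng.standard_normal(k)

# B(jh) = Z_1 + ... + Z_j, with B(0) = 0, as in (3.3).
B = np.concatenate(([0.0], np.cumsum(Z)))
t = h * np.arange(k + 1)

# "Joining the dots": np.interp evaluates the piecewise linear interpolant.
print(np.interp(0.005, t, B))  # the value halfway between B(0) and B(h)
```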

3.3. Basic Properties of Brownian Motion. It's not obvious that discrete
Brownian motion has a limit, in some sense, when we allow the timestep h to
converge to zero. However, it can be shown that this is indeed the case (and we will
see the salient features of the Lévy-Ciesielski construction of this limit later). For
the moment, we shall state the defining properties of Brownian motion.
Definition 3.1. There exists a stochastic process Wt, called Brownian motion,
which satisfies the following conditions:
(i) W0 = 0;
(ii) If 0 ≤ a ≤ b ≤ c, then the Brownian increments Wc − Wb and Wb − Wa
are independent random variables. Further, Wc − Wb ∼ N(0, c − b) and
Wb − Wa ∼ N(0, b − a);
(iii) Wt is continuous almost surely.
Proposition 3.2. Wt ∼ N(0, t) for all t > 0.


Proof. Just set a = 0 and b = t in (ii) of Definition 3.1.
The increments of Brownian motion are independent Gaussian random variables,
but the actual values Wa and Wb are not independent random variables, as we shall
now see.
Proposition 3.3. If a, b ∈ [0, ∞), then E(Wa Wb) = min{a, b}.
Proof. We assume 0 < a < b, the remaining cases being easily checked. Then

E(Wa Wb) = E( Wa [Wb − Wa] + Wa² )
         = E( Wa [Wb − Wa] ) + E( Wa² )
         = 0 + a
         = a.

Brownian motion is continuous almost surely but it is easy to see that it cannot
be differentiable. The key observation is that

(3.4)    ( W_{t+h} − W_t ) / h ∼ N(0, 1/h).

In other words, instead of converging to some limiting value, the variance of the
random variable (W_{t+h} − W_t)/h tends to infinity, as h → 0.
3.4. Martingales. A martingale is a mathematical version of a fair game, as we
shall first illustrate for simple random walk.
Proposition 3.4. We have E(S_{n+k} | Sn) = Sn.
Proof. The key observation is that

S_{n+k} = Sn + X_{n+1} + X_{n+2} + ··· + X_{n+k}

and X_{n+1}, …, X_{n+k} are all independent of Sn = X1 + ··· + Xn. Thus

E(S_{n+k} | Sn) = Sn + EX_{n+1} + EX_{n+2} + ··· + EX_{n+k} = Sn.

To see why this encodes the concept of a fair game, let us consider a biased coin
with the property that

E(S_{n+10} | Sn) = 1.1 Sn.

In other words, the expected fortune grows by a factor of 1.1 every 10 tosses. For
example, if we ensure that S4 = 3, by fixing the first four coin tosses in some
fashion, then our expected fortune will grow by 10% every 10 tosses thereafter.


3.5. Brownian Motion and Martingales.
Proposition 3.5. Brownian motion is a martingale, that is, E(W_{t+h} | Wt) = Wt,
for any h > 0.
Proof.

E(W_{t+h} | Wt) = E( [W_{t+h} − Wt] + Wt | Wt )
              = E( [W_{t+h} − Wt] | Wt ) + Wt
              = E( W_{t+h} − Wt ) + Wt
              = 0 + Wt
              = Wt.


We can sometimes use a similar argument to prove that functionals of Brownian
motion are martingales.
Proposition 3.6. The stochastic process Xt = Wt² − t is a martingale, that is,
E(X_{t+h} | Xt) = Xt, for any h > 0.
Proof.

E(X_{t+h} | Xt) = E( [W_{t+h} − Wt + Wt]² − [t + h] | Wt )
              = E( [W_{t+h} − Wt]² + 2Wt[W_{t+h} − Wt] + Wt² − t − h | Wt )
              = E[W_{t+h} − Wt]² + Wt² − t − h
              = h + Wt² − t − h
              = Xt.


The following example will be crucial.
Proposition 3.7. Geometric Brownian motion

Yt = e^{α + μt + σWt}

is a martingale, that is, E(Y_{t+h} | Yt) = Yt, for any h > 0, if and only if μ = −σ²/2.
Proof.

E(Y_{t+h} | Yt) = E( e^{α + μ(t+h) + σW_{t+h}} | Yt )
             = E( Yt e^{μh + σ(W_{t+h} − Wt)} | Yt )
             = Yt E e^{μh + σ(W_{t+h} − Wt)}
             = Yt e^{(μ + σ²/2)h}.


In this course, the mathematical model chosen for option pricing is risk-neutral
geometric Brownian motion: we choose a geometric Brownian motion St with the
property that Yt = e^{−rt} St is a martingale. Writing St = e^{α + μt + σWt}, we have

Yt = e^{α + (μ − r)t + σWt},

and Proposition 3.7 implies that μ − r = −σ²/2, i.e.

St = e^{α + (r − σ²/2)t + σWt} = S0 e^{(r − σ²/2)t + σWt}.

3.6. The Black-Scholes Equation. We can also use (2.5) to derive the famous
Black-Scholes partial differential equation, which is satisfied by any European option.
The key is to choose a small positive h in (2.5) and expand. We shall need Taylor's
theorem for functions of two variables, which states that

(3.5)    G(x + Δx, y + Δy) = G(x, y) + (∂G/∂x)Δx + (∂G/∂y)Δy
         + (1/2)( (∂²G/∂x²)(Δx)² + 2(∂²G/∂x∂y)(Δx)(Δy) + (∂²G/∂y²)(Δy)² ) + ··· .
Further, it simplifies matters to use log-space: we introduce u(t) := log S(t),
where log ≡ log_e in these notes (not logarithms to base 10). In log-space, (2.1)
becomes

(3.6)    u(t + h) = u(t) + (r − σ²/2)h + σΔWt,

where

(3.7)    ΔWt = W_{t+h} − Wt ∼ N(0, h).

We also introduce

(3.8)    g(u(t), t) := f(exp(u(t)), t),

so that (2.5) takes the form

(3.9)    g(u(t), t) = e^{−rh} E g(u(t + h), t + h).

Now Taylor expansion yields the (initially daunting)

(3.10)    g(u(t + h), t + h) = g( u(t) + (r − σ²/2)h + σΔWt, t + h )
          = g(u(t), t) + (∂g/∂u)( (r − σ²/2)h + σΔWt )
          + (1/2)(∂²g/∂u²) σ² (ΔWt)² + h (∂g/∂t) + ··· ,

ignoring all terms of higher order than h. Further, since EΔWt = 0 and
E[(ΔWt)²] = h, we obtain

(3.11)    E g(u(t + h), t + h) = g(u(t), t)
          + h( (r − σ²/2)(∂g/∂u) + (σ²/2)(∂²g/∂u²) + ∂g/∂t ) + ··· .

Recalling that

e^{−rh} = 1 − rh + (rh)²/2 − ··· ,


we find

(3.12)    g(u(t), t) = (1 − rh + ···)( g + h( (r − σ²/2)(∂g/∂u) + (σ²/2)(∂²g/∂u²) + ∂g/∂t ) + ··· )
          = g + h( −rg + (r − σ²/2)(∂g/∂u) + (σ²/2)(∂²g/∂u²) + ∂g/∂t ) + ··· .

For this to be true for all h > 0, we must have

(3.13)    −rg + (r − σ²/2)(∂g/∂u) + (σ²/2)(∂²g/∂u²) + ∂g/∂t = 0,

and this is the celebrated Black-Scholes partial differential equation (PDE). Thus,
instead of computing an expected future value, we can calculate the solution of
the Black-Scholes PDE (3.13). The great advantage gained is that there are highly
efficient numerical methods for solving PDEs numerically. The disadvantages are
complexity of code and learning the mathematics needed to exploit these methods
effectively.
Exercise 3.1. Use the substitution S = exp(u) to transform (3.13) into the nonlinear
form of the Black-Scholes PDE.
3.7. Itô Calculus. Equation (3.11) is really quite surprising, because the second
derivative contributes to the O(h) term. This observation is at the root of the Itô
rules. We begin by considering the quadratic variation In[a, b] of Brownian motion
on the interval [a, b]. Specifically, we choose a positive integer n and let nh = b − a.
We then define

(3.14)    In[a, b] = Σ_{k=1}^{n} ( W_{a+kh} − W_{a+(k−1)h} )².

We shall prove that E In[a, b] = b − a, for every positive integer n, but that
var In[a, b] → 0, as n → ∞. We shall need the following simple property of Gaussian
random variables.
Lemma 3.8. Let Z ∼ N(0, 1). Then EZ⁴ = 3.
Proof. Integrating by parts, we obtain

EZ⁴ = ∫_{−∞}^{∞} s⁴ (2π)^{−1/2} e^{−s²/2} ds
    = −(2π)^{−1/2} ∫_{−∞}^{∞} s³ (d/ds)( e^{−s²/2} ) ds
    = (2π)^{−1/2} ( [ −s³ e^{−s²/2} ]_{−∞}^{∞} + ∫_{−∞}^{∞} 3s² e^{−s²/2} ds )
    = 3.

Exercise 3.2. Calculate EZ⁶ when Z ∼ N(0, 1). More generally, calculate EZ^{2m}
for any positive integer m.
Proposition 3.9. We have E In[a, b] = b − a and var In[a, b] = 2(b − a)²/n.


Proof. Firstly,

E In[a, b] = Σ_{k=1}^{n} E( W_{a+kh} − W_{a+(k−1)h} )² = Σ_{k=1}^{n} h = nh = b − a.

Further, the Brownian increments W_{a+kh} − W_{a+(k−1)h} are independent N(0, h)
random variables. We shall define independent N(0, 1) random variables Z1, Z2, …, Zn
by

W_{a+kh} − W_{a+(k−1)h} = √h Zk,    1 ≤ k ≤ n.

Hence

var In[a, b] = Σ_{k=1}^{n} var( (√h Zk)² )
            = Σ_{k=1}^{n} var( h Zk² )
            = Σ_{k=1}^{n} h² var( Zk² )
            = Σ_{k=1}^{n} h² ( EZk⁴ − (EZk²)² )
            = Σ_{k=1}^{n} 2h²
            = 2nh²
            = 2(b − a)²/n.
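Proposition 3.9 predicts that the quadratic variation concentrates at b − a as n grows. The following Python experiment (interval, n and path count are arbitrary choices) estimates the mean and variance of In[a, b] over many simulated Brownian paths:

```python
import numpy as np

a, b = 0.0, 2.0
n = 1000      # number of subintervals of [a, b]
paths = 5000  # number of independent Brownian paths
h = (b - a) / n

rng = np.random.default_rng(6)
# Brownian increments over [a, b] for every path: independent N(0, h).
dW = np.sqrt(h) * rng.standard_normal((paths, n))
In = (dW**2).sum(axis=1)  # quadratic variation I_n[a, b] for each path

print(In.mean(), In.var())  # approximately b - a and 2*(b - a)^2 / n
```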

With this in mind, we define

∫_a^b (dWt)² := b − a,

and observe that we have shown that

∫_a^b (dWt)² = ∫_a^b dt,

for any 0 ≤ a < b. Thus we have really shown the Itô rule

(dWt)² = dt.

Using a very similar technique, we can also prove that

dt dWt = 0.

We first define

Jn[a, b] = Σ_{k=1}^{n} h ( W_{a+kh} − W_{a+(k−1)h} ),

where nh = b − a, as before.

Proposition 3.10. We have E Jn[a, b] = 0 and var Jn[a, b] = (b − a)³/n².


Proof. Firstly,

E Jn[a, b] = Σ_{k=1}^{n} E h( W_{a+kh} − W_{a+(k−1)h} ) = 0.

The variance satisfies

var Jn[a, b] = Σ_{k=1}^{n} var( h( W_{a+kh} − W_{a+(k−1)h} ) )
            = Σ_{k=1}^{n} h² var( W_{a+kh} − W_{a+(k−1)h} )
            = Σ_{k=1}^{n} h³
            = nh³
            = (b − a)³/n².

With this in mind, we define

∫_a^b dt dWt := 0,    for any 0 ≤ a < b,

and observe that we have shown that

dt dWt = 0.
Exercise 3.3. Setting nh = b − a, define

Kn[a, b] = Σ_{k=1}^{n} h².

Prove that Kn[a, b] = (b − a)²/n → 0, as n → ∞. Thus

∫_a^b (dt)² = 0,

for any 0 ≤ a < b. Hence we have (dt)² = 0.


Proposition 3.11 (Itô Rules). We have (dWt)² = dt and dWt dt = (dt)² = 0.
Proof. See Propositions 3.9, 3.10 and Exercise 3.3.

The techniques used in Propositions 3.9 and 3.10 are crucial examples of the
basics of stochastic integration. We can generalize this technique to compute other
useful stochastic integrals, as we shall now see.
Proposition 3.12. We have

∫_0^t Ws dWs = ( Wt² − t ) / 2.


Proof. We have already seen that, when h = t/n,

(3.15)    Σ_{k=1}^{n} ( W_{kh} − W_{(k−1)h} )² → t,

as n → ∞. Further, we shall use the telescoping sum

(3.16)    Σ_{k=1}^{n} ( W_{kh}² − W_{(k−1)h}² ) = W_{nh}² − W_0² = Wt².

Subtracting (3.15) from (3.16), we obtain

(3.17)    Σ_{k=1}^{n} ( W_{kh}² − W_{(k−1)h}² ) − Σ_{k=1}^{n} ( W_{kh} − W_{(k−1)h} )²
          = 2 Σ_{k=1}^{n} W_{(k−1)h} ( W_{kh} − W_{(k−1)h} ).

Now the LHS converges to Wt² − t, whilst the RHS converges to

2 ∫_0^t Ws dWs,

whence Proposition 3.12 follows.
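Proposition 3.12 can be checked path by path: the left-endpoint Riemann sum defining the stochastic integral should be close to (Wt² − t)/2 for large n. A Python sketch (t and n are arbitrary choices):

```python
import numpy as np

t, n = 1.0, 100_000
h = t / n
rng = np.random.default_rng(7)

dW = np.sqrt(h) * rng.standard_normal(n)
W = np.concatenate(([0.0], np.cumsum(dW)))  # W_0, W_h, ..., W_t

# Left-endpoint sum: sum of W_{(k-1)h} * (W_{kh} - W_{(k-1)h}).
ito_sum = np.sum(W[:-1] * dW)
exact = 0.5 * (W[-1]**2 - t)

print(ito_sum, exact)  # the two values agree closely for large n
```

The left endpoint matters: evaluating the integrand at the right endpoint instead would converge to (Wt² + t)/2, which is why the Itô integral is defined the way it is.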

Example 3.1. Here we shall derive a useful formula for

(3.18)    ∫_0^t f(s) dWs,

where f is continuously differentiable. The corresponding discrete stochastic sum is

(3.19)    Sn = Σ_{k=1}^{n} f(kh) ( W_{kh} − W_{(k−1)h} ),

where nh = t, as usual. The key trick is to introduce another telescoping sum:

(3.20)    Σ_{k=1}^{n} ( f(kh) W_{kh} − f((k−1)h) W_{(k−1)h} ) = f(t) Wt.

Subtracting (3.19) from (3.20), we find

(3.21)    f(t) Wt − Sn = Σ_{k=1}^{n} ( f(kh) − f((k−1)h) ) W_{(k−1)h}
                      = Σ_{k=1}^{n} ( h f′(kh) + O(h²) ) W_{(k−1)h}
                      → ∫_0^t f′(s) Ws ds,

as n → ∞. Hence

(3.22)    ∫_0^t f(s) dWs = f(t) Wt − ∫_0^t f′(s) Ws ds.

Exercise 3.4. Modify the technique of Example 3.1 to prove that

(3.23)    E[ ( ∫_0^t h(s) dWs )² ] = ∫_0^t h(s)² ds.

This is the Itô isometry property.
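The isometry of Exercise 3.4 can be explored numerically; taking h(s) = s, the right-hand side is the integral of s² over [0, 1], namely 1/3. A Python experiment (discretization and path count are arbitrary choices):

```python
import numpy as np

t, n, paths = 1.0, 200, 20_000
h = t / n
rng = np.random.default_rng(8)

dW = np.sqrt(h) * rng.standard_normal((paths, n))
s_left = h * np.arange(n)  # left endpoints (k-1)h of each subinterval

# Discrete stochastic integral of h(s) = s against each Brownian path.
I = (s_left * dW).sum(axis=1)

print((I**2).mean())  # approximately the integral of s^2 over [0,1], i.e. 1/3
```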
We now come to Itô's lemma itself.
Lemma 3.13 (Itô's Lemma for univariate functions). If f is any infinitely differentiable
univariate function and Xt = f(Wt), then

(3.24)    dXt = f′(Wt) dWt + (1/2) f″(Wt) dt.

Proof. We have

X_{t+dt} = f(W_{t+dt})
        = f(Wt + dWt)
        = f(Wt) + f′(Wt) dWt + (1/2) f″(Wt) (dWt)²
        = Xt + f′(Wt) dWt + (1/2) f″(Wt) dt.

Subtracting Xt from both sides gives (3.24).
Example 3.2. Let Xt = e^{cWt}, where c ∈ ℂ. Then, setting f(x) = exp(cx) in
Lemma 3.13, we obtain

dXt = Xt ( c dWt + (1/2) c² dt ).

Example 3.3. Let Xt = Wt². Then, setting f(x) = x² in Lemma 3.13, we obtain

dXt = 2Wt dWt + dt.

Example 3.4. Let Xt = Wt^n, where n can be any positive integer. Then, setting
f(x) = x^n in Lemma 3.13, we obtain

dXt = n Wt^{n−1} dWt + (1/2) n(n−1) Wt^{n−2} dt.

We can also integrate Itô's Lemma, as follows.
We can also integrate It
os Lemma, as follows.
Example 3.5. Integrating (3.24) from a to b, we obtain

∫_a^b dXt = ∫_a^b f′(Wt) dWt + (1/2) ∫_a^b f″(Wt) dt,

i.e.

Xb − Xa = ∫_a^b f′(Wt) dWt + (1/2) ∫_a^b f″(Wt) dt.

Lemma 3.13 is not quite sufficient to deal with geometric Brownian motion,
hence the following bivariate variant.
Lemma 3.14 (Itô's Lemma for bivariate functions). If g(x1, t), for x1, t ∈ ℝ, is
any infinitely differentiable function and Yt = g(Wt, t), then

(3.25)    dYt = (∂g/∂x1)(Wt, t) dWt + ( (1/2)(∂²g/∂x1²)(Wt, t) + (∂g/∂t)(Wt, t) ) dt.


Proof. We have

Y_{t+dt} = g(W_{t+dt}, t + dt)
        = g(Wt + dWt, t + dt)
        = g(Wt, t) + (∂g/∂x1)(Wt, t) dWt + (1/2)(∂²g/∂x1²)(Wt, t) (dWt)² + (∂g/∂t)(Wt, t) dt
        = g(Wt, t) + (∂g/∂x1)(Wt, t) dWt + ( (1/2)(∂²g/∂x1²)(Wt, t) + (∂g/∂t)(Wt, t) ) dt.

Subtracting Yt from both sides gives (3.25).



Example 3.6. Let Xt = e^{α + μt + σWt}. Then, setting g(x1, t) = exp(α + μt + σx1)
in Lemma 3.14, we obtain

dXt = Xt ( σ dWt + ( μ + σ²/2 ) dt ).

Example 3.7. Let Xt = e^{α + (r − σ²/2)t + σWt}. Then, setting μ = r − σ²/2 in Example
3.6, we find

dXt = Xt ( σ dWt + r dt ).

Exercise 3.5. Let Xt = Wt² − t. Find dXt.
3.8. Multivariate Geometric Brownian Motion. So far we have considered
one asset only. In practice, we need to construct a multivariate GBM model that
allows us to incorporate dependencies between assets via a covariance matrix. To
do this, we first take a vector Brownian motion Wt ∈ ℝⁿ: its components are
independent Brownian motions. Its covariance matrix Ct at time t is simply a
multiple of the identity matrix:

Ct = E Wt WtT = tI.

Now take any real, invertible, symmetric n × n matrix A and define

Zt = A Wt.

The covariance matrix Dt for this new stochastic process is given by

Dt = E Zt ZtT = E( A Wt WtT A ) = A ( E Wt WtT ) A = tA²,

and A² is a symmetric positive definite matrix.

Exercise 3.6. Prove that A² is symmetric positive definite if A is real, symmetric
and invertible.
In practice, we calculate the covariance matrix M from historical data, hence
must construct a symmetric A satisfying A² = M. Now a covariance matrix is
precisely a symmetric positive definite matrix, so that the following linear algebra
is vital. We shall use ‖x‖ to denote the Euclidean norm of the vector x ∈ ℝⁿ, that
is,

(3.26)    ‖x‖ = ( Σ_{k=1}^{n} x_k² )^{1/2},    x ∈ ℝⁿ.


Further, great algorithmic and theoretical importance attaches to those n × n
matrices which preserve the Euclidean norm. More formally, an n × n matrix
Q is called orthogonal if ‖Qx‖ = ‖x‖, for all x ∈ ℝⁿ. It turns out that Q is an
orthogonal matrix if and only if QT Q = I, which is equivalent to stating that its
columns are orthonormal vectors. See Section 7 for further details.
Theorem 3.15. Let M Rnn be symmetric. Then it can be written as M =
QDQT , where Q is an orthogonal matrix and D is a diagonal matrix. The elements
of D are the eigenvalues of M , while the columns of Q are the eigenvectors. Further,
if M is positive definite, then its eigenvalues are all positive.
Proof. Any good linear algebra textbook should include a proof of this fact; a proof
is given in my numerical linear algebra notes.

Given the spectral decomposition M = QDQ^T, with D = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n), we define

D^{1/2} = \mathrm{diag}(\lambda_1^{1/2}, \lambda_2^{1/2}, \ldots, \lambda_n^{1/2})

when M is positive definite. We can now define the matrix square-root M^{1/2} by

(3.27)   M^{1/2} = Q D^{1/2} Q^T.

Exercise 3.7. Prove that (M^{1/2})^2 = M directly from (3.27).
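The construction (3.27) can be carried out by hand in the 2 × 2 case, where the eigenvalues and eigenvectors of a symmetric matrix have closed forms. The following is a pure-Python sketch (the function name and the closed-form eigenvector choice are mine, not from the notes):

```python
import math

def sqrt_sym_2x2(M):
    """Matrix square-root M^{1/2} = Q D^{1/2} Q^T of a symmetric
    positive definite 2x2 matrix, via its spectral decomposition."""
    a, b, c = M[0][0], M[0][1], M[1][1]
    tr, det = a + c, a * c - b * b
    disc = math.sqrt(tr * tr / 4 - det)
    l1, l2 = tr / 2 + disc, tr / 2 - disc     # eigenvalues, l1 >= l2
    if abs(b) > 1e-15:
        v1 = (l1 - c, b)                      # eigenvector for l1
    else:
        v1 = (1.0, 0.0) if a >= c else (0.0, 1.0)
    n = math.hypot(*v1)
    q11, q21 = v1[0] / n, v1[1] / n
    q12, q22 = -q21, q11                      # orthogonal complement
    s1, s2 = math.sqrt(l1), math.sqrt(l2)
    # assemble Q diag(sqrt(l1), sqrt(l2)) Q^T entrywise
    return [[s1*q11*q11 + s2*q12*q12, s1*q11*q21 + s2*q12*q22],
            [s1*q21*q11 + s2*q22*q12, s1*q21*q21 + s2*q22*q22]]
```

For example, sqrt_sym_2x2([[2, 1], [1, 2]]) squares back to the original matrix, which is exactly Exercise 3.7 checked numerically.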


Given the matrix square-root M 1/2 for a chosen symmetric. positive definite
matrix M , we now define the assets
1/2
k = 1, 2, . . . , n,
(3.28)
Sk (t) = e(rMkk /2)t+(M Wt )k ,

1/2
where M Wt k denotes the kth element of the vector M 1/2 Wt . We now need
to check that our assets remain risk-neutral.
Proposition 3.16. Let the assets stochastic processes be defined by (3.28). Then
ESk (t) = ert ,
for all k {1, 2, . . . , n}.
Proof. The key calculation is

E e^{(M^{1/2} W_t)_k} = E e^{\sum_{\ell=1}^{n} (M^{1/2})_{k\ell} W_t(\ell)}
                      = E \prod_{\ell=1}^{n} e^{(M^{1/2})_{k\ell} W_t(\ell)}
                      = \prod_{\ell=1}^{n} E e^{(M^{1/2})_{k\ell} W_t(\ell)}
                      = \prod_{\ell=1}^{n} e^{(M^{1/2})_{k\ell}^2 t/2}
                      = e^{(t/2) \sum_{\ell=1}^{n} (M^{1/2})_{k\ell}^2}
                      = e^{(t/2) M_{kk}},

using the independence of the components of W_t (and noting that \sum_{\ell} (M^{1/2})_{k\ell}^2 = M_{kk}, since M^{1/2} is symmetric).

Exercise 3.8. Compute E[S_k(t)^2].

Exercise 3.9. What is the covariance matrix for the assets S_1(t), \ldots, S_n(t)?


3.9. Asian Options. European put and call options provide a useful laboratory in which to understand and test methods. However, the main aim of Monte Carlo is to calculate option prices for which there is no convenient analytic formula. We shall illustrate this with Asian options. Specifically, we shall consider the option

(3.29)   f_A(S, T) = \left( S(T) - \frac{1}{T} \int_0^T S(\tau)\,d\tau \right)_+.

This is a path-dependent option: its value depends on the history of the asset price, not simply its final value.

Why would anyone trade Asian options? Consider a bank's corporate client trading in, say, Britain and the States. The client's business is exposed to exchange rate volatility: the pound's value in dollars varies over time. Therefore the client may well decide to hedge by buying an option to trade dollars for pounds at a set rate at time T. This can be an expensive contract for the writer of the option, because currency values can blip. An alternative contract is to make the exchange rate at time T a time-average, as in (3.29). Any contract containing time-averages of asset prices is usually called an Asian option, and there are many variants of these. For example, the option dual to (3.29) (in the sense that a call option is dual to a put option) is given by

(3.30)   g_A(S, T) = \left( \frac{1}{T} \int_0^T S(\tau)\,d\tau - S(T) \right)_+.

Pricing (3.29) via Monte Carlo is fairly simple. We choose a positive integer M and subdivide the time interval [0, T] into M equal subintervals. We evolve the asset price using the equation

(3.31)   S\left( \frac{(k+1)T}{M} \right) = S\left( \frac{kT}{M} \right) e^{(r - \sigma^2/2)\frac{T}{M} + \sigma\sqrt{T/M}\,Z_k},   k = 0, 1, \ldots, M-1,

where Z_0, Z_1, \ldots, Z_{M-1} are independent N(0,1) pseudorandom numbers. We can use the simple approximation

\frac{1}{T} \int_0^T S(\tau)\,d\tau \approx M^{-1} \sum_{k=0}^{M-1} S\left( \frac{kT}{M} \right).

Exercise 3.10. Write a Matlab program to price the discrete Asian option defined by

(3.32)   f_M(S, T) = \left( S(T) - M^{-1} \sum_{k=0}^{M-1} S(kT/M) \right)_+.
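The exercise asks for Matlab; as an illustration of the same Monte Carlo scheme, here is a Python sketch that evolves paths with the exact GBM step (3.31) and averages the discounted payoff (3.32). The function name and the particular parameter values used below are assumptions for demonstration.

```python
import math, random

def asian_mc(S0, r, sigma, T, M, npaths, seed=0):
    """Monte Carlo price of the discrete Asian option (3.32),
    evolving the asset with the exact GBM step (3.31)."""
    random.seed(seed)
    dt = T / M
    drift = (r - sigma**2 / 2) * dt
    vol = sigma * math.sqrt(dt)
    disc = math.exp(-r * T)
    total = 0.0
    for _ in range(npaths):
        S = S0
        running = 0.0
        for _ in range(M):
            running += S          # accumulates S(kT/M), k = 0,...,M-1
            S *= math.exp(drift + vol * random.gauss(0.0, 1.0))
        total += max(S - running / M, 0.0)
    return disc * total / npaths

price = asian_mc(1.0, 0.05, 0.2, 1.0, 50, 20000)
print(price)
```

Since the discounted payoff is bounded in expectation by S_0, the estimate should land strictly between 0 and S_0 for any reasonable sample size.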

We can also study the average

(3.33)   A(T) = T^{-1} \int_0^T S(t)\,dt

directly, and this is the subject of a recent paper of Raymond and myself. For example,

(3.34)   E A(T) = T^{-1} \int_0^T E S(t)\,dt.

Exercise 3.11. Prove that

(3.35)   E A(T) = S(0)\,\frac{e^{rT} - 1}{rT}.

Exercise 3.12. In a similar vein, find expressions for E S(a)S(b) and E[A(T)^2].


4. The Binomial Model Universe

The Geometric Brownian Motion universe is an infinite one and, for practitioners, has the added disadvantage of the mathematical difficulty of Brownian motion. It is also possible to construct finite models with similar properties. This was first demonstrated by Cox, Ross and Rubinstein in the late 1970s.

Our model will be entirely specified by two parameters, \alpha > 0 and p ∈ [0, 1]. We choose S_0 > 0 and define

(4.1)   S_k = S_{k-1} \exp(\alpha X_k),   k > 0,

where the independent random variables X_1, X_2, \ldots satisfy

(4.2)   P(X_k = 1) = p,   P(X_k = -1) = 1 - p =: q.

Thus

(4.3)   S_m = S_0 e^{\alpha(X_1 + X_2 + \cdots + X_m)},   m > 0.

It is usual to display this random process graphically.

[Binomial tree diagram: starting from S_0, each upward branch (probability p) multiplies the asset price by e^{\alpha} and each downward branch (probability q) by e^{-\alpha}; after four steps the tree shows the values e^{4\alpha}, e^{3\alpha}, e^{2\alpha}, \ldots, e^{-3\alpha}, e^{-4\alpha}.]

At this stage, we haven't specified p and \alpha. However, we can easily price a European option given these parameters. If S_k denotes our Binomial Model asset price at time kh, for some positive time interval h, then the Binomial Model European option requirement is given by

(4.4)   f(S_{k-1}, (k-1)h) = e^{-rh}\,E f(S_{k-1} e^{\alpha X_k}, kh)
                           = e^{-rh} \left( p f(S_{k-1} e^{\alpha}, kh) + (1-p) f(S_{k-1} e^{-\alpha}, kh) \right).

Thus, given the m+1 possible asset prices at expiry time mh, and their corresponding option prices, we use (4.4) to calculate the m possible values of the option at time (m-1)h. Recurring this calculation provides the value of the option at time 0. Let's illustrate this with an example.


Example 4.1. Suppose e^{\alpha} = 2, p = 1/2 and D = e^{-rh}. Let m = 4 and let's use the Binomial Model to calculate all earlier values of the call option whose expiry value is

f(S(mh), mh) = (S(mh) - 1)_+.

Taking S_0 = 1, the asset prices form the recombining tree

time 0:   1
time 1:   2,   1/2
time 2:   4,   1,   1/4
time 3:   8,   2,   1/2,   1/8
time 4:   16,  4,   1,   1/4,   1/16.

Using (4.4), the corresponding option prices are

time 4:   15,   3,   0,   0,   0
time 3:   9D,   3D/2,   0,   0
time 2:   21D^2/4,   3D^2/4,   0
time 1:   3D^3,   3D^3/8
time 0:   27D^4/16.
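The backward induction (4.4) is short enough to code directly. Here is a Python sketch (function name and argument layout are mine); with e^{\alpha} = 2, p = 1/2 and r = 0, so D = 1, it reproduces the root value 27D^4/16 of Example 4.1.

```python
import math

def binomial_price(S0, u, p, r, h, m, payoff):
    """Backward induction (4.4): each node's value is the discounted
    expected value one step ahead; u = e^alpha, down-factor is 1/u."""
    disc = math.exp(-r * h)
    # terminal values: j = number of down-moves, asset price S0 * u^(m-2j)
    vals = [payoff(S0 * u**(m - 2*j)) for j in range(m + 1)]
    for step in range(m, 0, -1):
        vals = [disc * (p * vals[j] + (1 - p) * vals[j + 1])
                for j in range(step)]
    return vals[0]

# Example 4.1: e^alpha = 2, p = 1/2, r = 0 (hence D = 1), m = 4, strike 1
v = binomial_price(1.0, 2.0, 0.5, 0.0, 1.0, 4, lambda s: max(s - 1.0, 0.0))
print(v)  # 27/16 = 1.6875
```

The same routine prices any European payoff in this universe; only the `payoff` function changes.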


How do we choose the constants \alpha and p? One way is to use them to mimic geometric Brownian motion. Thus we choose a positive number h and use

(4.5)   S(kh) = S((k-1)h)\,e^{(r - \sigma^2/2)h + \sigma\sqrt{h}\,Z},   k > 0,

where, as usual, Z ~ N(0,1).

Lemma 4.1. In the Geometric Brownian Motion Universe, we have

(4.6)   E[S(kh) \mid S((k-1)h)] = S((k-1)h)\,e^{rh}

and

(4.7)   E[S(kh)^2 \mid S((k-1)h)] = S((k-1)h)^2\,e^{(2r+\sigma^2)h}.

Proof. These are easy exercises if you have digested Lemma 2.1 and Lemma 2.2: everything rests on the fact that E \exp(cZ) = e^{c^2/2} when Z ~ N(0,1), for any real (or complex) number c.

There are analogous quantities in the Binomial Model.

Lemma 4.2. In the Binomial Model Universe, we have

(4.8)   E[S_k \mid S_{k-1}] = \left( p e^{\alpha} + (1-p) e^{-\alpha} \right) S_{k-1}

and

(4.9)   E[S_k^2 \mid S_{k-1}] = \left( p e^{2\alpha} + (1-p) e^{-2\alpha} \right) S_{k-1}^2.

Proof. You should find these to be very easy given the definitions (4.1), (4.2) and (4.3); revise elementary probability theory if this is not so!

One way to choose p and \alpha is to require that the right-hand sides of (4.6), (4.8) and (4.7), (4.9) agree, that is,

(4.10)   e^{rh} = p e^{\alpha} + (1-p) e^{-\alpha},
(4.11)   e^{(2r+\sigma^2)h} = p e^{2\alpha} + (1-p) e^{-2\alpha}.
This ensures that our Binomial Model preserves risk neutrality.

Rearranging (4.10) and (4.11), we find

(4.12)   p = \frac{e^{rh} - e^{-\alpha}}{e^{\alpha} - e^{-\alpha}} = \frac{e^{(2r+\sigma^2)h} - e^{-2\alpha}}{e^{2\alpha} - e^{-2\alpha}}.

Further, the elementary algebraic identity

e^{2\alpha} - e^{-2\alpha} = \left( e^{\alpha} - e^{-\alpha} \right) \left( e^{\alpha} + e^{-\alpha} \right)

transforms (4.12) into

(4.13)   e^{rh} - e^{-\alpha} = \frac{e^{(2r+\sigma^2)h} - e^{-2\alpha}}{e^{\alpha} + e^{-\alpha}}.

Exercise 4.1. Show that (4.12) implies the equation

(4.14)   e^{\alpha} + e^{-\alpha} = e^{(r+\sigma^2)h} + e^{-rh}.


How do we solve (4.14)? The following analysis is a standard part of the theory of hyperbolic trigonometric functions¹, but no background knowledge will be assumed. If we write

(4.15)   y = \frac{1}{2} \left( e^{(r+\sigma^2)h} + e^{-rh} \right),

then (4.14) becomes

(4.16)   e^{\alpha} + e^{-\alpha} = 2y,

that is,

(4.17)   (e^{\alpha})^2 - 2y(e^{\alpha}) + 1 = 0.

This quadratic in e^{\alpha} has solutions

(4.18)   e^{\alpha} = y \pm \sqrt{y^2 - 1}

and, since (4.15) implies y ≥ 1, we see that each of these possible solutions is positive. Thus the possible values for \alpha are

(4.19)   \alpha_1 = \log_e \left( y + \sqrt{y^2 - 1} \right)

and

(4.20)   \alpha_2 = \log_e \left( y - \sqrt{y^2 - 1} \right).

Now

(4.21)   \alpha_1 + \alpha_2 = \log_e \left( y + \sqrt{y^2 - 1} \right) + \log_e \left( y - \sqrt{y^2 - 1} \right)
                            = \log_e \left( \left( y + \sqrt{y^2 - 1} \right) \left( y - \sqrt{y^2 - 1} \right) \right)
                            = \log_e \left( y^2 - (y^2 - 1) \right)
                            = \log_e 1
                            = 0.

Since y + \sqrt{y^2 - 1} ≥ 1, for y ≥ 1, we deduce that \alpha_1 ≥ 0 and \alpha_2 = -\alpha_1. Since we have chosen \alpha > 0, we conclude

(4.22)   \alpha = \log_e \left( y + \sqrt{y^2 - 1} \right),

where y is given by (4.15).

Now (4.22) tells us the value of \alpha required, but the expression is somewhat complicated. However, if we return to (4.14), that is,

e^{\alpha} + e^{-\alpha} = e^{(r+\sigma^2)h} + e^{-rh},

and consider small h, then \alpha is also small and Taylor expansion yields

2 + \alpha^2 + \cdots = 1 + (r + \sigma^2)h + \cdots + 1 - rh + \cdots,

that is,

(4.23)   \alpha^2 + \cdots = \sigma^2 h + \cdots.

¹Specifically, this is the formula for the inverse hyperbolic cosine.
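The exact solution (4.22) and its small-h behaviour (4.23) are easy to verify numerically. The following Python sketch uses illustrative values r = 0.05, \sigma = 0.2 and a small h (all assumptions): the first identity holds to machine precision by construction, and \alpha agrees with \sigma h^{1/2} up to higher-order terms in h.

```python
import math

# assumed illustrative parameters
r, sigma, h = 0.05, 0.2, 1e-4
# (4.15): y, then (4.22): alpha = log(y + sqrt(y^2 - 1))
y = 0.5 * (math.exp((r + sigma**2) * h) + math.exp(-r * h))
alpha = math.log(y + math.sqrt(y * y - 1.0))
print(math.exp(alpha) + math.exp(-alpha), 2 * y)   # equal: (4.16) holds
print(alpha, sigma * math.sqrt(h))                 # close for small h: (4.23)
```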


Cox and Ross had the excellent idea of ignoring the messy higher order terms, since the model is only an approximation in any case. Thus the Cox–Ross Binomial Model chooses

(4.24)   \alpha = \sigma h^{1/2}.

The corresponding equation for the probability p becomes

(4.25)   p = \frac{e^{rh} - e^{-\sigma h^{1/2}}}{e^{\sigma h^{1/2}} - e^{-\sigma h^{1/2}}}.

It's useful, but tedious, to Taylor expand the RHS of (4.25). We obtain

(4.26)   p = \frac{1 + rh + \cdots - \left( 1 - \sigma h^{1/2} + \frac{1}{2}\sigma^2 h - \cdots \right)}{2\sigma h^{1/2} \left( 1 + \sigma^2 h/6 + \cdots \right)}
           = \frac{\sigma h^{1/2} + (r - \sigma^2/2)h + \cdots}{2\sigma h^{1/2} \left( 1 + \sigma^2 h/6 + \cdots \right)}
           = \frac{1}{2} \left( 1 + \sigma^{-1} h^{1/2} (r - \sigma^2/2) + \cdots \right) \left( 1 - \sigma^2 h/6 + \cdots \right)
           = \frac{1}{2} \left( 1 + \sigma^{-1} h^{1/2} (r - \sigma^2/2) + \cdots \right),

to highest order, so that

(4.27)   1 - p = \frac{1}{2} \left( 1 - \sigma^{-1} h^{1/2} (r - \sigma^2/2) + \cdots \right).

It's tempting to omit the higher order terms, but we would then lose risk neutrality in our Binomial Model.

Is the Binomial Model consistent with the Geometric Brownian Motion universe as h → 0? We shall now show that the definition of a sufficiently smooth European option in the Binomial Model still implies the Black–Scholes PDE in the limit as h → 0.
Proposition 4.3. Let f : [0, ∞) × [0, ∞) → ℝ be an infinitely differentiable function satisfying

(4.28)   f(S, t-h) = e^{-rh} \left( p f(S e^{\sigma h^{1/2}}, t) + (1-p) f(S e^{-\sigma h^{1/2}}, t) \right),

for all h > 0, where p is given by (4.25). Then f satisfies the Black–Scholes PDE.

Proof. As usual, it is much more convenient to use log-space. Thus we define u = \log S and

g(u, t) = f(S, t).

Hence (4.28) becomes

(4.29)   g(u, t-h) = e^{-rh} \left( p g(u + \sigma h^{1/2}, t) + (1-p) g(u - \sigma h^{1/2}, t) \right).

Using (4.26) and (4.27) and omitting terms whose order exceeds h for clarity, we obtain

(4.30)   g - h g_t + \cdots
         = e^{-rh} \Bigl( \frac{1}{2} \left[ 1 + \sigma^{-1} h^{1/2} (r - \sigma^2/2) \right] \left[ g + \sigma h^{1/2} g_u + \frac{1}{2} \sigma^2 h g_{uu} \right]
                        + \frac{1}{2} \left[ 1 - \sigma^{-1} h^{1/2} (r - \sigma^2/2) \right] \left[ g - \sigma h^{1/2} g_u + \frac{1}{2} \sigma^2 h g_{uu} \right] \Bigr)
         = e^{-rh} \left( g + h \left[ (r - \sigma^2/2) g_u + \frac{1}{2} \sigma^2 g_{uu} \right] \right)
         = \left( 1 - rh + O(h^2) \right) \left( g + h \left[ (r - \sigma^2/2) g_u + \frac{1}{2} \sigma^2 g_{uu} \right] \right)
         = g + h \left[ -rg + (r - \sigma^2/2) g_u + \frac{1}{2} \sigma^2 g_{uu} \right] + \cdots.

Equating the O(h) terms on both sides of equation (4.30) yields the Black–Scholes equation

g_t = rg - (r - \sigma^2/2) g_u - \frac{1}{2} \sigma^2 g_{uu}.



5. The Partial Differential Equation Approach

One important way to price options is to solve the Black–Scholes partial differential equation (PDE), or some variant of Black–Scholes. Hence we study the fundamentals of the numerical analysis of PDEs.

5.1. The Diffusion Equation. The diffusion equation arises in many physical and stochastic situations. In the hope that the baroque will serve as a mnemonic, we shall model the diffusion of poison along a line. Let u(x, t) be the density of poison at location x and time t and consider the stochastic model

(5.1)   u(x, t) = E u(x + \sigma\sqrt{h}\,Z, t - h),   x ∈ ℝ, t ≥ 0,

where \sigma is a positive constant and Z ~ N(0,1). The idea here is that the poison molecules perform a random walk along the line, just as share prices do in time. If we assume that u has sufficiently many derivatives, then we obtain

u(x, t) = E \left( u(x, t) + \sigma\sqrt{h}\,Z \frac{\partial u}{\partial x} + \frac{1}{2} \sigma^2 h Z^2 \frac{\partial^2 u}{\partial x^2} + O(h^{3/2}) - h \frac{\partial u}{\partial t} + O(h^2) \right)
        = u(x, t) + h \left( \frac{1}{2} \sigma^2 \frac{\partial^2 u}{\partial x^2} - \frac{\partial u}{\partial t} \right) + O(h^{3/2}).

In other words, dividing by h, we obtain

\frac{1}{2} \sigma^2 \frac{\partial^2 u}{\partial x^2} - \frac{\partial u}{\partial t} + O(h^{1/2}) = 0.

Letting h → 0, we have derived the diffusion equation

(5.2)   \frac{\partial u}{\partial t} = \frac{1}{2} \sigma^2 \frac{\partial^2 u}{\partial x^2}.

This important partial differential equation is often called the heat equation.

Exercise 5.1. The d-dimensional form of our stochastic model for diffusion is given by

u(x, t) = E u(x + \sigma\sqrt{h}\,Z, t - h),   x ∈ ℝ^d, t ≥ 0.

Here Z ∈ ℝ^d is a normalized Gaussian random vector: its components are independent N(0,1) random variables and its probability density function is

p(z) = (2\pi)^{-d/2} \exp(-\|z\|^2/2),   z ∈ ℝ^d.

Assuming u is sufficiently differentiable, prove that u satisfies the d-dimensional diffusion equation

\frac{\partial u}{\partial t} = \frac{\sigma^2}{2} \sum_{k=1}^{d} \frac{\partial^2 u}{\partial x_k^2}.

Variations on the diffusion equation occur in many fields including, of course, mathematical finance. For example, the neutron density² N(x, t) in Uranium 235 or Plutonium approximately obeys the partial differential equation

\frac{\partial N}{\partial t} = \lambda N + \sum_{k=1}^{d} \frac{\partial^2 N}{\partial x_k^2},

for a positive constant \lambda.

²In mathematical finance, we choose our model to avoid exponential growth, but this is not always the aim in nuclear physics.


In fact, the Black–Scholes PDE is really the diffusion equation in disguise, as we shall now show. In log-space, we consider any solution f(\tilde{S}, t) of the Black–Scholes equation, that is,

(5.3)   \frac{\partial f}{\partial t} + \frac{1}{2} \sigma^2 \frac{\partial^2 f}{\partial \tilde{S}^2} + (r - \sigma^2/2) \frac{\partial f}{\partial \tilde{S}} - rf = 0.

The inspired trick, a product of four centuries of mathematical play with differential equations, is to substitute

(5.4)   f(\tilde{S}, t) = u(\tilde{S}, t)\,e^{\alpha\tilde{S} + \beta t}

and to find the PDE satisfied by u. Now

\frac{\partial f}{\partial \tilde{S}} = \left( \alpha u + \frac{\partial u}{\partial \tilde{S}} \right) e^{\alpha\tilde{S} + \beta t}

and

\frac{\partial^2 f}{\partial \tilde{S}^2} = \left( \alpha^2 u + 2\alpha \frac{\partial u}{\partial \tilde{S}} + \frac{\partial^2 u}{\partial \tilde{S}^2} \right) e^{\alpha\tilde{S} + \beta t}.

Substituting in the Black–Scholes equation results in

\frac{\partial u}{\partial t} + \beta u + \frac{\sigma^2}{2} \left( \alpha^2 u + 2\alpha \frac{\partial u}{\partial \tilde{S}} + \frac{\partial^2 u}{\partial \tilde{S}^2} \right) + (r - \sigma^2/2) \left( \alpha u + \frac{\partial u}{\partial \tilde{S}} \right) - ru = 0,

or

0 = \frac{1}{2} \sigma^2 \frac{\partial^2 u}{\partial \tilde{S}^2} + \frac{\partial u}{\partial t} + \frac{\partial u}{\partial \tilde{S}} \left( (r - \sigma^2/2) + \sigma^2 \alpha \right) + u \left( -r + (r - \sigma^2/2)\alpha + \sigma^2\alpha^2/2 + \beta \right).

We can choose \alpha and \beta to be any real numbers we please. In particular, if we set \alpha = -\sigma^{-2}(r - \sigma^2/2), then the \partial u/\partial \tilde{S} term vanishes. We can then solve for \beta to kill the u term.

Exercise 5.2. Find the value of \beta that annihilates the u term.

The practical consequence of this clever trick is that every problem involving the Black–Scholes PDE can be transformed into an equivalent problem for the diffusion equation, as you will see in Pricing next year. Therefore we now study methods for solving the diffusion equation.
There is an analytic solution for the diffusion equation that is sometimes useful. If we set h = t in (5.1), then we obtain

(5.5)   u(x, t) = E u(x + \sigma\sqrt{t}\,Z, 0),

that is,

(5.6)   u(x, t) = \int_{-\infty}^{\infty} u(x + \sigma\sqrt{t}\,z, 0)\,(2\pi)^{-1/2} \exp(-z^2/2)\,dz
(5.7)           = \int_{-\infty}^{\infty} u(x - w, 0)\,G(w, t)\,dw,

using the substitution w = -\sigma\sqrt{t}\,z, where

G(w, t) = (2\pi\sigma^2 t)^{-1/2} \exp\left( -\frac{w^2}{2\sigma^2 t} \right),   w ∈ ℝ.

This is called the Green's function for the diffusion equation. Of course, we must now evaluate the integral. As for European options, analytic solutions exist for some simple cases, but numerical integration must be used in general.


5.2. Finite Difference Methods for the Diffusion Equation. The simplest finite difference method is called explicit Euler and it's a BAD method. Fortunately, the insight gained from understanding why it's bad enables us to construct good methods. There is another excellent reason for you to be taught bad methods and why they're bad: stupidity is a renewable resource. In other words, simple bad methods are often rediscovered.

We begin with some finite difference approximations to the time derivative

\frac{\partial u}{\partial t} \approx \frac{u(x, t+k) - u(x, t)}{k}

and the space derivative, using the second central difference

\frac{\partial^2 u}{\partial x^2} \approx \frac{u(x+h, t) - 2u(x, t) + u(x-h, t)}{h^2}.

Exercise 5.3. Show that

\frac{g(x+h) - 2g(x) + g(x-h)}{h^2} = g^{(2)}(x) + \frac{h^2}{12} g^{(4)}(x) + O(h^4)

and find the next two terms in the expansion.

Our model problem for this section will be the zero boundary value problem:

(5.8)   \frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2},   0 ≤ x ≤ 1, t ≥ 0,
        u(x, 0) = f(x),   0 ≤ x ≤ 1,
        u(0, t) = u(1, t) = 0,   t ≥ 0.

We now choose a positive integer M and positive numbers T and k. We then set h = 1/M, N = T/k and generate a discrete approximation

\{ U_m^n : 0 ≤ m ≤ M, 0 ≤ n ≤ N \}

to the values of the solution u at the points of the rectangular grid

\{ (mh, nk) : 0 ≤ m ≤ M, 0 ≤ n ≤ N \},

using the recurrence

(5.9)   U_m^{n+1} = U_m^n + \mu \left( U_{m+1}^n - 2U_m^n + U_{m-1}^n \right),   n ≥ 0, 1 ≤ m ≤ M-1,

where

(5.10)   \mu = \frac{k}{h^2},

and the boundary values for u imply the relations

(5.11)   U_0^n = U_M^n = 0   and   U_m^0 = u(mh, 0),   0 ≤ m ≤ M.

This is called explicit Euler.


In matrix terms³, we have

(5.12)   U^n = T U^{n-1},   n ≥ 1,

³How do we find T? Equation (5.12) implies

U_m^n = (T U^{n-1})_m = \sum_{\ell=1}^{M-1} T_{m\ell} U_\ell^{n-1} = \mu U_{m-1}^{n-1} + (1 - 2\mu) U_m^{n-1} + \mu U_{m+1}^{n-1}.

where

(5.13)   U^n = \begin{pmatrix} U_1^n \\ \vdots \\ U_{M-1}^n \end{pmatrix} ∈ ℝ^{M-1}

and T ∈ ℝ^{(M-1)×(M-1)} is the tridiagonal symmetric Toeplitz (TST) matrix defined by

(5.14)   T = \begin{pmatrix} 1-2\mu & \mu & & \\ \mu & 1-2\mu & \ddots & \\ & \ddots & \ddots & \mu \\ & & \mu & 1-2\mu \end{pmatrix}.

Hence

(5.15)   U^n = T^n U^0.
Unfortunately, explicit Euler is an unstable method unless \mu ≤ 1/2. In other words, the numbers \{ U_m^n : 1 ≤ m ≤ M-1 \} grow exponentially as n → ∞. Here's an example using Matlab.
Example 5.1. The following Matlab fragment generates the explicit Euler approximations.

% Choose our parameters
mu = 0.7;
M=100; N=20;
% Pick (Gaussian) random initial values
uold = randn(M-1,1);
% construct the tridiagonal symmetric Toeplitz matrix T
T = (1-2*mu)*diag(ones(M-1,1)) + mu*( diag(ones(M-2,1),1) + diag(ones(M-2,1),-1) );
% iterate and plot
plot(uold)
hold on
for k=1:N
    unew = T*uold;
    plot(unew)
    uold = unew;
end

If we run the above code for M = 6 and

U^0 = ( 0.034942, 0.065171, 0.964159, 0.406006, 1.450787 )^T,

then we obtain

U^{20} = ( 4972.4, 8614.6, 9950.7, 8620.5, 4978.3 )^T.


Further, ‖U^{40}‖ = 2.4 × 10^8. The exponential instability is obvious. Experiment with different values of \mu, M and N.

The restriction \mu ≤ 1/2, that is k ≤ h^2/2, might not seem particularly harmful at first. However, it means that small h values require tiny k values, and tiny timesteps imply lots of work: an inefficient method. Now let's derive this stability requirement. We begin by studying a more general problem based on (5.15). Specifically, let A ∈ ℝ^{n×n} be any symmetric matrix⁴. Its spectral radius \rho(A) is simply its largest eigenvalue in modulus, that is,

(5.16)   \rho(A) = \max\{ |\lambda_1|, |\lambda_2|, \ldots, |\lambda_n| \}.

Theorem 5.1. Let A ∈ ℝ^{n×n} be any symmetric matrix and define the recurrence x_k = A x_{k-1}, where x_0 is some initial vector.
(i) If \rho(A) < 1, then \lim_{k→∞} \|x_k\| = 0, for any initial vector x_0 ∈ ℝ^n.
(ii) If \rho(A) = 1, then the norms of the iterates \|x_1\|, \|x_2\|, \ldots remain bounded.
(iii) If \rho(A) > 1, then we can choose x_0 ∈ ℝ^n such that \lim_{k→∞} \|x_k\| = ∞.

Proof. We use the spectral decomposition introduced in Theorem 3.15, so that A = QDQ^T, where Q ∈ ℝ^{n×n} is an orthogonal matrix and D ∈ ℝ^{n×n} is a diagonal matrix whose diagonal elements are the eigenvalues \lambda_1, \ldots, \lambda_n of A. Then

x_k = A x_{k-1} = A^2 x_{k-2} = \cdots = A^k x_0

and

A^k = \left( QDQ^T \right) \left( QDQ^T \right) \cdots \left( QDQ^T \right) = Q D^k Q^T.

Hence

x_k = Q D^k Q^T x_0

and, introducing z_k := Q^T x_k, we obtain

z_k = D^k z_0,

and it is important to observe that \|z_k\| = \|Q^T x_k\| = \|x_k\|, because Q^T is an orthogonal matrix. Since D is a diagonal matrix, this matrix equation is simply n scalar linear recurrences, namely

z_k(\ell) = \lambda_\ell^k z_0(\ell),   \ell = 1, 2, \ldots, n.

The following consequences are easily checked.
(i) If \rho(A) < 1, then each of these scalar sequences tends to zero, as k → ∞, which implies that \|x_k\| → 0.
(ii) If \rho(A) = 1, then each of these scalar sequences is bounded, which implies that the sequence \|x_k\| remains bounded.
(iii) If \rho(A) > 1, then there is at least one eigenvalue, \lambda_i say, for which |\lambda_i| > 1. Hence, if z_0(i) ≠ 0, then |z_k(i)| = |\lambda_i^k z_0(i)| → ∞, as k → ∞.

Definition 5.1. Let A ∈ ℝ^{n×n} be any symmetric matrix. We say that A is spectrally stable if its spectral radius satisfies \rho(A) ≤ 1.

⁴All of this theory can be generalized to nonsymmetric matrices using the Jordan canonical form, but this advanced topic is not needed in this course.


Example 5.2. Let A ∈ ℝ^{n×n} be a symmetric matrix whose eigenvalues are 0.1, 0.1, \ldots, 0.1, 10. If x_0 ∈ ℝ^n contains no component of the eigenvector corresponding to the eigenvalue 10, i.e. z_0(n) = 0, then, in exact arithmetic, we shall still obtain \|x_k\| → 0, as k → ∞. However, a computer uses finite precision arithmetic, which implies that, even if z_0(n) = 0, it is highly likely that z_k(n) ≠ 0, for some k > 0, since the matrix-vector product is not computed exactly. This (initially small) nonzero component will grow exponentially.

Theorem 5.1 is only useful when we can deduce the magnitude of the spectral radius. Fortunately, this is possible for an important class of matrices.

Definition 5.2. We say that a matrix T(a, b) ∈ ℝ^{m×m} is tridiagonal, symmetric and Toeplitz (TST) if it has the form

(5.17)   T(a, b) = \begin{pmatrix} a & b & & \\ b & a & \ddots & \\ & \ddots & \ddots & b \\ & & b & a \end{pmatrix},   a, b ∈ ℝ.

TST matrices arise naturally in many applications. Fortunately, they're one of the few nontrivial classes of matrices for which the eigenvalues and eigenvectors can be analytically determined rather easily. In fact, every TST matrix has the same eigenvectors, because

(5.18)   T(a, b) = aI + 2b T_0,

where T_0 = T(0, 1/2) (this is not a recursive definition, simply an observation given (5.18)). Hence, if T_0 v = \lambda v, then T(a, b)v = (a + 2b\lambda)v. Thus we only need to study T_0.

In fact, every eigenvalue of T_0 lies in the interval [-1, 1]. The proof is interesting because it's our only example of using a different norm. For any vector w ∈ ℝ^m, we define its infinity norm to be

\|w\|_∞ = \max\{ |w_1|, |w_2|, \ldots, |w_m| \}.

Exercise 5.4. Show that

\|T_0 z\|_∞ ≤ \|z\|_∞,

for any vector z ∈ ℝ^m.
We shall state our result formally for ease of reference.

Lemma 5.2. Every eigenvalue \lambda of T_0 satisfies |\lambda| ≤ 1.

Proof. If T_0 v = \lambda v, then

|\lambda| \|v\|_∞ = \|\lambda v\|_∞ = \|T_0 v\|_∞ ≤ \|v\|_∞.

Hence |\lambda| ≤ 1.

Proposition 5.3. The eigenvalues of T_0 ∈ ℝ^{m×m} are given by

(5.19)   \lambda_j = \cos\left( \frac{j\pi}{m+1} \right),   j = 1, \ldots, m,

and the corresponding eigenvector v^{(j)} has components

(5.20)   v_k^{(j)} = \sin\left( \frac{jk\pi}{m+1} \right),   j, k = 1, \ldots, m.

Proof. Suppose v is an eigenvector for T_0, with eigenvalue \lambda, so that

v_{j+1} + v_{j-1} = 2\lambda v_j,   2 ≤ j ≤ m-1,

and

v_2 = 2\lambda v_1,   v_{m-1} = 2\lambda v_m.

Thus the elements of the vector v are m values of the recurrence relation defined by

v_{j+1} + v_{j-1} = 2\lambda v_j,   j ∈ ℤ,

where v_0 = v_{m+1} = 0. Here's a rather slick trick: we know that |\lambda| ≤ 1, and a general theoretical result states that the eigenvalues of a real symmetric matrix are real, so we can write \lambda = \cos\theta, for some \theta ∈ ℝ. The associated equation for this recurrence is therefore the quadratic

t^2 - 2t\cos\theta + 1 = 0,

which we can factorize as

\left( t - e^{i\theta} \right)\left( t - e^{-i\theta} \right) = 0.

Thus the general solution is

v_j = r e^{ij\theta} + s e^{-ij\theta},   j ∈ ℤ,

where r and s can be any complex numbers. But v_0 = 0 implies s = -r, so we obtain

v_j = \sin(j\theta),   j ∈ ℤ,

on using the fact that every multiple of a sequence satisfying the recurrence is another sequence satisfying the recurrence. The only other condition remaining to be satisfied is v_{m+1} = 0, so that

\sin((m+1)\theta) = 0,

which implies (m+1)\theta is some integer multiple of \pi.
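The eigenpairs (5.19)–(5.20) are easy to check numerically, because applying T_0 to a vector only averages neighbouring entries. Here is a small Python sketch (the helper name is mine):

```python
import math

def check_tst_eigenpair(m, j):
    """Return the max residual |(T0 v)_k - lambda v_k| for the
    eigenpair (5.19)-(5.20) of the m x m matrix T0 = T(0, 1/2)."""
    lam = math.cos(j * math.pi / (m + 1))
    v = [math.sin(j * k * math.pi / (m + 1)) for k in range(1, m + 1)]
    # (T0 v)_k = (v_{k-1} + v_{k+1}) / 2, with zero boundary entries
    Tv = [0.5 * ((v[k-1] if k > 0 else 0.0) + (v[k+1] if k < m - 1 else 0.0))
          for k in range(m)]
    return max(abs(Tv[k] - lam * v[k]) for k in range(m))

print(check_tst_eigenpair(7, 3))   # residual at rounding-error level
```

The residual vanishes up to rounding error for every m and every 1 ≤ j ≤ m, which is just the identity sin((k-1)θ) + sin((k+1)θ) = 2 cos θ sin(kθ) in disguise.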

Exercise 5.5. Prove that the eigenvectors given in Proposition 5.3 are orthogonal by direct calculation.

The spectral radius of the matrix T driving explicit Euler, defined by (5.15), is now an easy consequence of our more general analysis of TST matrices.

Corollary 5.4. Let T ∈ ℝ^{(M-1)×(M-1)} be the matrix driving explicit Euler, defined by (5.15). Then \rho(T) ≤ 1 if and only if \mu ≤ 1/2. Hence explicit Euler is spectrally stable if and only if \mu ≤ 1/2.

Proof. We need only observe that T = T(1 - 2\mu, \mu), so that Proposition 5.3 implies that its eigenvalues are

\lambda_k = 1 - 2\mu + 2\mu \cos\left( \frac{k\pi}{M} \right) = 1 - 4\mu \sin^2\left( \frac{k\pi}{2M} \right),   k = 1, 2, \ldots, M-1.

Thus \rho(T) ≤ 1 if and only if \mu ≤ 1/2, for otherwise |1 - 4\mu| > 1.


We can also use TST matrices to understand implicit Euler: here we use

(5.21)   U_m^{n+1} = U_m^n + \mu \left( U_{m+1}^{n+1} - 2U_m^{n+1} + U_{m-1}^{n+1} \right),   1 ≤ m ≤ M-1.

In matrix form, this becomes

(5.22)   T(1 + 2\mu, -\mu)\,U^{n+1} = U^n,

using the notation of (5.17). Before using Proposition 5.3 to derive its eigenvalues, we need a simple lemma.

Lemma 5.5. Let A ∈ ℝ^{n×n} be any invertible symmetric matrix, having spectral decomposition A = QDQ^T. Then A^{-1} = Q D^{-1} Q^T.

Proof. This is a very easy exercise.

Proposition 5.6. Implicit Euler is spectrally stable for all \mu ≥ 0.

Proof. By Proposition 5.3, the eigenvalues of T(1 + 2\mu, -\mu) are given by

\lambda_k = 1 + 2\mu - 2\mu \cos\left( \frac{k\pi}{M} \right) = 1 + 4\mu \sin^2\left( \frac{k\pi}{2M} \right).

Thus every eigenvalue of T(1 + 2\mu, -\mu) is at least 1, which implies (by Lemma 5.5) that every eigenvalue of its inverse lies in the interval (0, 1]. Thus implicit Euler is spectrally stable for all \mu ≥ 0.

We have yet to prove that the answers produced by these methods converge to the true solution as h → 0. We illustrate the general method using explicit Euler, for \mu ≤ 1/2, applied to the diffusion equation on [0, 1] with zero boundary (5.8). If we define the error

(5.23)   E_m^n := U_m^n - u(mh, nk),

then

(5.24)   E_m^{n+1} - E_m^n - \mu \left( E_{m+1}^n - 2E_m^n + E_{m-1}^n \right) = -L(mh, nk),

where the Local Truncation Error (LTE) L(x, t) is defined by

(5.25)   L(x, t) = u(x, t+k) - u(x, t) - \mu \left( u(x+h, t) - 2u(x, t) + u(x-h, t) \right),

recalling that, by definition,

(5.26)   0 = U_m^{n+1} - U_m^n - \mu \left( U_{m+1}^n - 2U_m^n + U_{m-1}^n \right),   1 ≤ m ≤ M-1.

Thus we form the LTE by replacing U_m^n by u(mh, nk) in (5.26)⁵. Taylor expanding and recalling that k = \mu h^2, we obtain

(5.27)   L(x, t) = k u_t(x, t) + O(k^2) - \mu \left( h^2 u_{xx}(x, t) + O(h^4) \right)
                 = \mu h^2 u_{xx}(x, t) - \mu h^2 u_{xx}(x, t) + O(h^4)
                 = O(h^4),

using the fact that u_t = u_{xx}. Now choose a time interval [0, T]. Since L(x, t) is a continuous function, (5.27) implies the inequality

(5.28)   |L(x, t)| ≤ C h^4,   for 0 ≤ x ≤ 1 and 0 ≤ t ≤ T,

⁵You will see the same idea in the next section, where this will be called the associated functional equation.


where the constant C depends on T. Further, rearranging (5.24) yields

(5.29)   E_m^{n+1} = E_m^n + \mu \left( E_{m+1}^n - 2E_m^n + E_{m-1}^n \right) - L(mh, nk),

and applying inequality (5.28), we obtain

(5.30)   |E_m^{n+1}| ≤ (1 - 2\mu)|E_m^n| + \mu|E_{m+1}^n| + \mu|E_{m-1}^n| + C h^4,

because 1 - 2\mu ≥ 0 for \mu ≤ 1/2. If we let \eta_n denote the maximum modulus error at time nk, i.e.

(5.31)   \eta_n = \max\{ |E_1^n|, |E_2^n|, \ldots, |E_{M-1}^n| \},

then (5.30) implies

(5.32)   |E_m^{n+1}| ≤ (1 - 2\mu)\eta_n + 2\mu\eta_n + C h^4 = \eta_n + C h^4,

whence

(5.33)   \eta_{n+1} ≤ \eta_n + C h^4.

Therefore, recurring (5.33),

(5.34)   \eta_n ≤ \eta_{n-1} + C h^4 ≤ \eta_{n-2} + 2C h^4 ≤ \cdots ≤ \eta_0 + nC h^4 = Cn h^4,

since E_m^0 ≡ 0. Now

(5.35)   n ≤ N := \frac{T}{k} = \frac{T}{\mu h^2},

so that (5.34) and (5.35) jointly provide the upper bound

(5.36)   |U_m^n - u(mh, nk)| ≤ \left( \frac{CT}{\mu} \right) h^2,

for 1 ≤ m ≤ M-1 and 0 ≤ n ≤ N. Hence we have shown that the explicit Euler approximation has uniform O(h^2) convergence for 0 ≤ t ≤ T. The key here is the order, not the constant in the bound: halving h reduces the error uniformly by a factor of 4.
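This O(h^2) rate can be observed numerically. The Python sketch below runs explicit Euler on (5.8) with the initial condition u(x, 0) = \sin(\pi x), whose exact solution is e^{-\pi^2 t}\sin(\pi x); the choices \mu = 0.4 and T = 0.1 are illustrative assumptions. Halving h should roughly quarter the maximum error.

```python
import math

def explicit_euler_error(M, mu=0.4, T=0.1):
    """Max error at time T for explicit Euler (5.9) on u_t = u_xx,
    u(x,0) = sin(pi x), zero boundary; exact u = exp(-pi^2 t) sin(pi x)."""
    h = 1.0 / M
    k = mu * h * h
    N = round(T / k)
    U = [math.sin(math.pi * m * h) for m in range(M + 1)]
    for _ in range(N):
        V = U[:]
        for m in range(1, M):
            V[m] = U[m] + mu * (U[m+1] - 2*U[m] + U[m-1])
        U = V
    decay = math.exp(-math.pi**2 * N * k)
    return max(abs(U[m] - decay * math.sin(math.pi * m * h))
               for m in range(M + 1))

e1, e2 = explicit_euler_error(20), explicit_euler_error(40)
print(e1 / e2)   # close to 4: halving h quarters the error
```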
Exercise 5.6. Refine the expansion of the LTE in (5.27) to obtain

L(x, t) = k u_t(x, t) + \frac{1}{2} k^2 u_{tt}(x, t) + O(k^3) - \mu \left( h^2 u_{xx}(x, t) + \frac{1}{12} h^4 u_{xxxx}(x, t) + O(h^6) \right).

Hence prove that

L(x, t) = \frac{1}{2} \mu h^4 u_{tt}(x, t) \left( \mu - 1/6 \right) + O(h^6).

Hence show that, if \mu = 1/6 in explicit Euler, we obtain the higher-order uniform error

|U_m^n - u(mh, nk)| ≤ D h^4,

for 1 ≤ m ≤ M-1 and 0 ≤ n ≤ T/k, where D depends on T.

Implicit Euler owes its name to the fact that we must solve linear equations to obtain the approximations at time (n+1)k from those at time nk. This linear system is tridiagonal, so Gaussian elimination only requires O(n) time to complete, rather than the O(n^3) time for a general n × n matrix. In fact, there is a classic method that provides a higher order than implicit Euler together with excellent stability: Crank–Nicolson is the implicit method defined by

(5.37)   U_m^{n+1} - \frac{\mu}{2} \left( U_{m+1}^{n+1} - 2U_m^{n+1} + U_{m-1}^{n+1} \right) = U_m^n + \frac{\mu}{2} \left( U_{m+1}^n - 2U_m^n + U_{m-1}^n \right).

In matrix form, we obtain

(5.38)   T(1 + \mu, -\mu/2)\,U^{n+1} = T(1 - \mu, \mu/2)\,U^n,

or

(5.39)   U^{n+1} = T(1 + \mu, -\mu/2)^{-1}\,T(1 - \mu, \mu/2)\,U^n.

Now every TST matrix has the same eigenvectors. Thus the eigenvalues of the product of TST matrices in (5.39) are given by

(5.40)   \lambda_k = \frac{1 - \mu + \mu\cos(\frac{k\pi}{M})}{1 + \mu - \mu\cos(\frac{k\pi}{M})} = \frac{1 - 2\mu\sin^2(\frac{k\pi}{2M})}{1 + 2\mu\sin^2(\frac{k\pi}{2M})}.

Hence |\lambda_k| < 1 for all \mu > 0.

Exercise 5.7. Calculate the LTE of Crank–Nicolson when h = k.
5.3. The Fourier Transform and the von Neumann Stability Test. Given any univariate function f : ℝ → ℝ for which the integral

(5.41)   \int_{-\infty}^{\infty} |f(x)|\,dx

is finite, we define its Fourier transform by the relation

(5.42)   \hat{f}(z) = \int_{-\infty}^{\infty} f(x) \exp(-ixz)\,dx,   z ∈ ℝ.

The Fourier transform is used in this course to understand stability properties, solve some partial differential equations and calculate the local truncation errors for finite difference methods. Next year, you will see it being used to derive analytic values of certain exotic options during your pricing course.

Proposition 5.7. (i) Let

(5.43)   T_a f(x) = f(x + a),   x ∈ ℝ.

We say that T_a f is the translate of f by a. Then

(5.44)   \widehat{T_a f}(z) = \exp(iaz)\,\hat{f}(z),   z ∈ ℝ.

(ii) The Fourier transform of the derivative is given by

(5.45)   \widehat{f'}(z) = iz\,\hat{f}(z),   z ∈ ℝ.


Proof. (i)

\widehat{T_a f}(z) = \int_{-\infty}^{\infty} T_a f(x)\,e^{-ixz}\,dx
                   = \int_{-\infty}^{\infty} f(x + a)\,e^{-ixz}\,dx
                   = \int_{-\infty}^{\infty} f(y)\,e^{-i(y-a)z}\,dy
                   = e^{iaz}\,\hat{f}(z).

(ii) Integrating by parts and using the fact that \lim_{x→±∞} f(x) = 0, which is a consequence of (5.41), we obtain

\widehat{f'}(z) = \int_{-\infty}^{\infty} f'(x)\,e^{-ixz}\,dx
                = \left[ f(x)\,e^{-ixz} \right]_{x=-\infty}^{\infty} + iz \int_{-\infty}^{\infty} f(x)\,e^{-ixz}\,dx
                = iz\,\hat{f}(z).

Exercise 5.8. We have \widehat{f^{(2)}}(z) = (iz)^2 \hat{f}(z) = -z^2 \hat{f}(z). Find \widehat{f^{(k)}}(z).

Many students will have seen some use of the Fourier transform to solve differential equations. It is also vitally important to finite difference operators.

Example 5.3. Let's analyse the second order central difference operator using the Fourier transform. Thus we take

g(x) = \frac{f(x+h) - 2f(x) + f(x-h)}{h^2}

and observe that

\hat{g}(z) = h^{-2} \left( e^{ihz} - 2 + e^{-ihz} \right) \hat{f}(z) = 2h^{-2} \left( \cos(hz) - 1 \right) \hat{f}(z).

Now⁶

\cos(hz) = 1 - \frac{h^2 z^2}{2} + \frac{h^4 z^4}{4!} - \cdots,

so that

\hat{g}(z) = 2h^{-2} \left( -\frac{h^2 z^2}{2} + \frac{h^4 z^4}{4!} - \cdots \right) \hat{f}(z)
           = -z^2 \hat{f}(z) + \frac{h^2 z^4}{12} \hat{f}(z) - \cdots
           = \widehat{f^{(2)}}(z) + \frac{h^2}{12} \widehat{f^{(4)}}(z) + \cdots.

Taking the inverse transform, we have computed the Taylor expansion of g:

g(x) = f^{(2)}(x) + \frac{h^2}{12} f^{(4)}(x) + \cdots.

⁶Commit this Taylor expansion to memory if you don't already know it!


Of course, there's no need to use the Fourier transform to analyse the second order central difference operator, but we have to learn to walk before we can run!

We shall also need the Fourier transform for functions of more than one variable. For any bivariate function f(x_1, x_2) that tends to zero sufficiently rapidly at infinity, we define

(5.46)   \hat{f}(z_1, z_2) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x_1, x_2)\,e^{-i(x_1 z_1 + x_2 z_2)}\,dx_1\,dx_2,   z_1, z_2 ∈ ℝ.

In fact, it's more convenient to write this using a slightly different notation:

(5.47)   \hat{f}(z) = \int_{ℝ^2} f(x) \exp(-i x^T z)\,dx,   z ∈ ℝ^2.

This is still a double integral, although only one integration sign is used. Similarly, for a function f(x), x ∈ ℝ^n, we define

(5.48)   \hat{f}(z) = \int_{ℝ^n} f(x) \exp(-i x^T z)\,dx,   z ∈ ℝ^n.

Here

x^T z = \sum_{k=1}^{n} x_k z_k,   x, z ∈ ℝ^n.

The multivariate version of Proposition 5.7 is as follows.


Proposition 5.8. (i) Let

(5.49)   T_a f(x) = f(x + a),   x ∈ ℝ^n.

We say that T_a f is the translate of f by a. Then

(5.50)   \widehat{T_a f}(z) = \exp(i a^T z)\,\hat{f}(z),   z ∈ ℝ^n.

(ii) Further, if \alpha_1, \ldots, \alpha_n are non-negative integers and |\alpha| = \alpha_1 + \cdots + \alpha_n, then

(5.51)   \widehat{\frac{\partial^{|\alpha|} f}{\partial x_1^{\alpha_1} \partial x_2^{\alpha_2} \cdots \partial x_n^{\alpha_n}}}(z) = (iz_1)^{\alpha_1} (iz_2)^{\alpha_2} \cdots (iz_n)^{\alpha_n}\,\hat{f}(z),   z ∈ ℝ^n.

Proof. The proof is not formally examinable, but is very similar to that of the univariate result.

5.4. Stability and the Fourier Transform. We can also use Fourier analysis to avoid eigenanalysis when studying stability. We shall begin abstractly, but soon apply the analysis to explicit and implicit Euler for the diffusion equation.

Suppose we have two sequences \{u_k\}_{k∈ℤ} and \{v_k\}_{k∈ℤ} related by

(5.52)   \sum_{k∈ℤ} b_k v_k = \sum_{k∈ℤ} a_k u_k.

In most applications, u_k ≈ u(kh), for some underlying function u, so we study the associated functional equation

(5.53)   \sum_{k∈ℤ} b_k v(x + kh) = \sum_{k∈ℤ} a_k u(x + kh).

The advantage of widening our investigation is that we can use the Fourier transform to study (5.53). Specifically, we have

(5.54)   \hat{v}(z) \sum_{k∈ℤ} b_k e^{ikhz} = \hat{u}(z) \sum_{k∈ℤ} a_k e^{ikhz},

or

(5.55)   \hat{v}(z) = \frac{a(hz)}{b(hz)}\,\hat{u}(z) =: R(hz)\,\hat{u}(z),

where

(5.56)   a(w) = \sum_{k∈ℤ} a_k e^{ikw}   and   b(w) = \sum_{k∈ℤ} b_k e^{ikw}.

Example 5.4. For explicit Euler, we have

v_k = \mu u_{k+1} + (1 - 2\mu) u_k + \mu u_{k-1},   k ∈ ℤ,

so that the associated functional equation is

v(x) = \mu u(x + h) + (1 - 2\mu) u(x) + \mu u(x - h),   x ∈ ℝ,

whose Fourier transform is given by

(5.57)   \hat{v}(z) = \left( \mu e^{ihz} + 1 - 2\mu + \mu e^{-ihz} \right) \hat{u}(z)
                    = \left( 1 - 2\mu(1 - \cos(hz)) \right) \hat{u}(z)
                    = \left( 1 - 4\mu \sin^2(hz/2) \right) \hat{u}(z).

Thus \hat{v}(z) = r(hz)\,\hat{u}(z), where r(w) = 1 - 4\mu \sin^2(w/2).

When we advance forwards n steps in time using explicit Euler, we obtain in
Fourier transform space

(5.58)    û_n(z) = r(hz)û_{n−1}(z) = (r(hz))² û_{n−2}(z) = ⋯ = (r(hz))^n û_0(z).

Thus, if |r(w)| < 1, for all w ∈ R, then lim_{n→∞} û_n(z) = 0, for all z ∈ R. However, if
|r(hz_0)| > 1, then, by continuity, |r(hz)| > 1 for z sufficiently close to z_0. Further,
since r(hz) is periodic, with period 2π/h, we deduce that |r(hz)| > 1 on all 2π/h-integer
shifts of an interval centred at z_0. Hence lim_{n→∞} û_n(z) = ∞. Further, there is an
intimate connection between u and û in the following sense.

Theorem 5.9 (Parseval's Theorem). If f : R → R is continuous and square-integrable, then

(5.59)    ∫_{−∞}^{∞} |f(x)|² dx = (1/2π) ∫_{−∞}^{∞} |f̂(z)|² dz.

Proof. Not examinable.

Hence lim_{n→∞} û_n(z) = ∞ implies

    lim_{n→∞} ∫_{−∞}^{∞} |u_n(x)|² dx = ∞.

This motivated the brilliant Hungarian mathematician John von Neumann to analyse
the stability of finite difference operators via the Fourier transform.
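Parseval's theorem is also easy to check numerically. The following Python sketch (an illustration, not part of the notes' formal development) verifies (5.59) for the Gaussian f(x) = exp(−x²), whose transform under the lecture convention is f̂(z) = √π exp(−z²/4), using the trapezoidal rule on truncated intervals:

```python
import math

# Numerical check of Parseval's theorem (5.59) for f(x) = exp(-x^2), whose
# Fourier transform (convention fhat(z) = integral of f(x) e^{-ixz} dx) is
# fhat(z) = sqrt(pi) * exp(-z^2/4).  Both sides are computed by the
# trapezoidal rule on truncated intervals; the Gaussian tails are negligible.
def trapezoid(g, a, b, n):
    h = (b - a) / n
    s = 0.5 * (g(a) + g(b)) + sum(g(a + k * h) for k in range(1, n))
    return s * h

lhs = trapezoid(lambda x: math.exp(-x * x) ** 2, -10.0, 10.0, 4000)
rhs = trapezoid(lambda z: math.pi * math.exp(-z * z / 2.0), -20.0, 20.0, 8000) / (2.0 * math.pi)
print(lhs, rhs)  # both close to sqrt(pi/2) = 1.2533...
```

Both integrals converge rapidly because the integrands decay like Gaussians.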
Definition 5.3. If |r(hz)| ≤ 1, for all z ∈ R, then we say that the finite difference
operator is von Neumann stable, or Fourier stable.

Theorem 5.10. Explicit Euler is von Neumann stable if and only if λ ≤ 1/2,
whilst implicit Euler is von Neumann stable for all λ > 0.

MATHEMATICAL METHODS FOR FINANCIAL ENGINEERING

41

Proof. For explicit Euler, we have already seen that

    û_n(z) = r(hz)û_{n−1}(z),    where    r(w) = 1 − 4λ sin²(w/2).

Thus |r(w)| ≤ 1, for all w ∈ R, if and only if |1 − 4λ| ≤ 1, i.e. λ ≤ 1/2.
For implicit Euler, we have the associated functional equation

    u_{n+1}(x) = u_n(x) + λ(u_{n+1}(x + h) − 2u_{n+1}(x) + u_{n+1}(x − h)),    x ∈ R.

Hence

    (−λe^{ihz} + (1 + 2λ) − λe^{−ihz}) û_{n+1}(z) = û_n(z),    z ∈ R,

or

    (1 + 4λ sin²(hz/2)) û_{n+1}(z) = û_n(z),    z ∈ R.

Therefore

    û_{n+1}(z) = (1 + 4λ sin²(hz/2))^{−1} û_n(z) =: r(hz)û_n(z),    z ∈ R,

and 0 ≤ r(hz) ≤ 1, for all z ∈ R.
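The amplification factors in this proof are easy to check numerically. A small Python sketch (the mesh ratios 0.4, 0.5, 0.6 are illustrative; λ is written lam) samples both factors on a grid over one period:

```python
import math

# Sample the amplification factors from the proof above on a grid over one
# period and test the von Neumann condition |r(w)| <= 1 numerically.
def r_explicit(w, lam):
    return 1.0 - 4.0 * lam * math.sin(w / 2.0) ** 2

def r_implicit(w, lam):
    return 1.0 / (1.0 + 4.0 * lam * math.sin(w / 2.0) ** 2)

ws = [2.0 * math.pi * k / 1000.0 for k in range(1001)]
max_exp = {lam: max(abs(r_explicit(w, lam)) for w in ws) for lam in (0.4, 0.5, 0.6)}
max_imp = {lam: max(abs(r_implicit(w, lam)) for w in ws) for lam in (0.4, 0.5, 0.6)}
print(max_exp)  # 1.0 for lam <= 1/2, but 1.4 at lam = 0.6: explicit Euler blows up
print(max_imp)  # never exceeds 1: implicit Euler is unconditionally stable
```

The worst case for explicit Euler occurs at w = π, where r(π) = 1 − 4λ, in agreement with the proof.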

Exercise 5.9. Prove that Crank–Nicolson (5.37) is von Neumann stable for all λ ≥ 0.
5.5. Option Pricing via the Fourier transform. The Fourier transform can
also be used to calculate solutions of the Black–Scholes equation, and its variants,
and this approach provides a powerful analytic and numerical technique.
We begin with the Black–Scholes equation in log-space:

(5.60)    0 = −rg + (r − σ²/2)g_x + (σ²/2)g_{xx} + g_t,

where the asset price S = e^x and subscripts denote partial derivatives. We now let
ĝ(z, t) denote the Fourier transform of the option price g(x, t) at time t, that is,

(5.61)    ĝ(z, t) = ∫_{−∞}^{∞} g(x, t)e^{−ixz} dx,    z ∈ R.

The Fourier transform of (5.60) is therefore given by

(5.62)    0 = −rĝ + iz(r − σ²/2)ĝ − (1/2)σ²z²ĝ + ĝ_t.

In other words, we have, for each fixed z ∈ R, the ordinary differential equation

(5.63)    ĝ_t = (r − iz(r − σ²/2) + (1/2)σ²z²) ĝ,

with solution

(5.64)    ĝ(z, t) = ĝ(z, t_0) e^{(r − iz(r − σ²/2) + (1/2)σ²z²)(t − t_0)}.

When pricing a European option, we know the option's expiry value g(x, T) and
wish to calculate its initial price g(x, 0). Substituting t = T and t_0 = 0 in (5.64),
we therefore obtain

(5.65)    ĝ(z, 0) = e^{−(r − iz(r − σ²/2) + (1/2)σ²z²)T} ĝ(z, T).

In order to apply this, we shall need to know the Fourier transform of a Gaussian.

Proposition 5.11. Let G(x) = exp(−λ‖x‖²), for x ∈ R^d, where λ is a positive
constant. Then its Fourier transform is the Gaussian

(5.66)    Ĝ(z) = (π/λ)^{d/2} exp(−‖z‖²/(4λ)),    z ∈ R^d.

Proof. It's usual to derive this result via contour integration, but here is a neat
proof via Itô's lemma and Brownian motion. Let c ∈ C be any complex number
and define the stochastic process X_t = exp(cW_t), for t ≥ 0. Then a straightforward
application of Itô's lemma implies the relation

    dX_t = X_t (c dW_t + (c²/2) dt).

Taking expectations and defining m(t) = EX_t, we obtain the differential equation
m′(t) = (c²/2)m(t), whence m(t) = exp(c²t/2). In other words,

    E e^{cW_t} = e^{c²t/2},

which implies, on recalling that W_t ∼ N(0, t) and setting θ = ct^{1/2},

    E e^{θZ} = e^{θ²/2},

for any complex number θ ∈ C, where Z denotes a normalized Gaussian.

Corollary 5.12. The Fourier transform of the univariate Gaussian probability
density function

    p(x) = (2πσ²)^{−1/2} e^{−x²/(2σ²)},    x ∈ R,

is

    p̂(z) = e^{−σ²z²/2},    z ∈ R.

Proof. We simply set λ = 1/(2σ²) in Proposition 5.11.
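Corollary 5.12 can be sanity-checked numerically (σ = 0.7 below is an arbitrary illustrative value; this is a check, not a proof):

```python
import math

# Numerical check of Corollary 5.12: approximate phat(z) = integral of
# p(x) e^{-ixz} dx by the trapezoidal rule.  Since p is even, the imaginary
# part cancels and phat(z) = integral of p(x) cos(xz) dx.
sigma = 0.7

def p(x):
    return math.exp(-x * x / (2.0 * sigma * sigma)) / math.sqrt(2.0 * math.pi * sigma * sigma)

def phat(z, a=10.0, n=4000):
    h = 2.0 * a / n
    total = 0.5 * (p(-a) * math.cos(-a * z) + p(a) * math.cos(a * z))
    total += sum(p(-a + k * h) * math.cos((-a + k * h) * z) for k in range(1, n))
    return total * h

for z in (0.0, 1.0, 2.5):
    print(z, phat(z), math.exp(-sigma * sigma * z * z / 2.0))  # pairs agree
```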

Exercise 5.10. Calculate the Fourier transform of the multivariate Gaussian
probability density function

    p(x) = (2πσ²)^{−d/2} e^{−‖x‖²/(2σ²)},    x ∈ R^d.
The cumulative distribution function (CDF) for the Gaussian probability density
N(0, σ²) is given by

(5.67)    Φ_{σ²}(x) = ∫_{−∞}^{x} (2πσ²)^{−1/2} e^{−y²/(2σ²)} dy,    x ∈ R.

Thus the fundamental theorem of calculus implies that

    Φ′_{σ²}(x) = (2πσ²)^{−1/2} e^{−x²/(2σ²)}.

Example 5.5. Calculate the price of the option whose expiry price is given by

    f(S(T), T) = { 1 if a ≤ S(T) ≤ b,
                   0 otherwise.

In other words, this option is simply a bet that pays 1 if the final asset price lies
in the interval [a, b].
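One sketch of a solution, under the risk-neutral geometric Brownian motion model used throughout these notes: S(T) = S(0) exp((r − σ²/2)T + σ√T Z) with Z ∼ N(0, 1), so the bet's price is e^{−rT} P(a ≤ S(T) ≤ b), a difference of two normal CDF values. All parameter values in the Python below are purely illustrative:

```python
import math, random

# Price of the bet of Example 5.5 under risk-neutral geometric Brownian motion:
# S(T) = S0*exp((r - sigma^2/2)T + sigma*sqrt(T)*Z), Z ~ N(0,1), so the price
# is exp(-rT) * P(a <= S(T) <= b).  Illustrative parameters only.
S0, r, sigma, T, a, b = 100.0, 0.05, 0.2, 1.0, 90.0, 110.0

def Phi(x):                       # standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def d(x):                         # P(S(T) <= x) = Phi(d(x))
    return (math.log(x / S0) - (r - 0.5 * sigma * sigma) * T) / (sigma * math.sqrt(T))

price = math.exp(-r * T) * (Phi(d(b)) - Phi(d(a)))

# Monte Carlo cross-check of the same expectation.
random.seed(1)
n = 200_000
hits = sum(a <= S0 * math.exp((r - 0.5 * sigma ** 2) * T
                              + sigma * math.sqrt(T) * random.gauss(0.0, 1.0)) <= b
           for _ in range(n))
mc_price = math.exp(-r * T) * hits / n
print(price, mc_price)  # the two estimates agree to a few decimal places
```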


5.6. Fourier Transform Conventions. There are several essentially identical
Fourier conventions in common use, but their minor differences are often confusing.
The most general definition is

(5.68)    f̂(z) = A ∫_{−∞}^{∞} f(x)e^{iCxz} dx,

where A and C are nonzero real constants. The Fourier Inversion Theorem then
takes the form

(5.69)    f(x) = (A^{−1}|C|/(2π)) ∫_{−∞}^{∞} f̂(z)e^{−iCxz} dz.

Example 5.6. The following four cases are probably the most commonly encountered.
(i) C = −1, A = 1:

    f̂(z) = ∫_{−∞}^{∞} f(x)e^{−ixz} dx,
    f(x) = (1/2π) ∫_{−∞}^{∞} f̂(z)e^{ixz} dz.

(ii) C = −2π, A = 1:

    f̂(z) = ∫_{−∞}^{∞} f(x)e^{−2πixz} dx,
    f(x) = ∫_{−∞}^{∞} f̂(z)e^{2πixz} dz.

(iii) C = −1, A = 1/√(2π):

    f̂(z) = (1/√(2π)) ∫_{−∞}^{∞} f(x)e^{−ixz} dx,
    f(x) = (1/√(2π)) ∫_{−∞}^{∞} f̂(z)e^{ixz} dz.

(iv) C = 1, A = 1:

    f̂(z) = ∫_{−∞}^{∞} f(x)e^{ixz} dx,
    f(x) = (1/2π) ∫_{−∞}^{∞} f̂(z)e^{−ixz} dz.

It's not hard to show that

(5.70)    T̂_a f(z) = e^{−iaCz} f̂(z)

and

(5.71)    (f′)^∧(z) = −iCz f̂(z),

where T_a f(x) = f(x + a), for any a ∈ R.

Example 5.7. For the same four examples given earlier, we obtain the following
shifting and differentiation formulae.
(i) C = −1, A = 1:

    T̂_a f(z) = e^{iaz} f̂(z),    (f′)^∧(z) = iz f̂(z).

(ii) C = −2π, A = 1:

    T̂_a f(z) = e^{2πiaz} f̂(z),    (f′)^∧(z) = 2πiz f̂(z).

(iii) C = −1, A = 1/√(2π):

    T̂_a f(z) = e^{iaz} f̂(z),    (f′)^∧(z) = iz f̂(z).

(iv) C = 1, A = 1:

    T̂_a f(z) = e^{−iaz} f̂(z),    (f′)^∧(z) = −iz f̂(z).


Which, then, should we choose? It's entirely arbitrary but, once made, the choice
is likely to be permanent, since changing convention greatly increases the chance
of algebraic error. I have chosen C = −1 and A = 1 in lectures, mainly because
it's probably the most common choice in applied mathematics. It was also the
convention chosen by my undergraduate lecturers at Cambridge, so the real reason
is probably habit!


6. Mathematical Background Material


I've collected here a miscellany of mathematical methods used (or reviewed)
during the course.
6.1. Probability Theory. A random variable X is said to have (continuous)
probability density function p(t) if

(6.1)    P(a < X < b) = ∫_a^b p(t) dt.

We shall assume that p(t) is a continuous function (no jumps in value). In particular,
we have

    1 = P(X ∈ R) = ∫_{−∞}^{∞} p(t) dt.

Further, because

    0 ≤ P(a < X < a + δa) = ∫_a^{a+δa} p(t) dt ≈ p(a)δa,

for small δa > 0, we conclude that p(t) ≥ 0, for all t ∈ R. In other words, a probability
density function is simply a non-negative function p(t) whose integral is one. Here
are two fundamental examples.
Example 6.1. The Gaussian probability density function, with mean μ and variance
σ², is defined by

(6.2)    p(t) = (2πσ²)^{−1/2} exp(−(t − μ)²/(2σ²)).

We say that the Gaussian is normalized if μ = 0 and σ = 1.

To prove that this is truly a probability density function, we require the important
identity

(6.3)    ∫_{−∞}^{∞} e^{−Cx²} dx = √(π/C),

which is valid for any C > 0. [In fact it's valid for any complex number C whose
real part is positive.]

Example 6.2. The Cauchy probability density function is defined by

(6.4)    p(t) = 1/(π(1 + t²)).

This distribution might also be called the Mad Machine Gunner distribution: imagine
our killer sitting at the origin of the (x, y) plane. He⁷ is firing (at a constant rate)
at the infinite line y = 1, his angle (with the x-axis) of fire being uniformly
distributed in the interval (0, π). Then the bullets have the Cauchy density.

If you draw some graphs of these probability densities, you should find that,
for small σ, the graph is concentrated around the value μ. For large σ, the graph
is rather flat. There are two important definitions that capture this behaviour
mathematically.
7The sexism is quite accurate, since males produce vastly more violent psychopaths than
females.


Definition 6.1. The mean, or expected value, of a random variable X with p.d.f.
p(t) is defined by

(6.5)    EX := ∫_{−∞}^{∞} t p(t) dt.

It's very common to write μ instead of EX when no ambiguity can arise. Its variance
Var X is given by

(6.6)    Var X := ∫_{−∞}^{∞} (t − μ)² p(t) dt.

Exercise 6.1. Show that the Gaussian p.d.f. really does have mean μ and variance σ².

Exercise 6.2. What happens when we try to determine the mean and variance of
the Cauchy probability density defined in Example 6.2?

Exercise 6.3. Prove that Var X = E(X²) − (EX)².
We shall frequently have to calculate the expected value of functions of random
variables.

Theorem 6.1. If

    ∫_{−∞}^{∞} |f(t)| p(t) dt

is finite, then

(6.7)    E(f(X)) = ∫_{−∞}^{∞} f(t) p(t) dt.

Example 6.3. Let X denote a normalized Gaussian random variable. We shall
show that

(6.8)    E e^{λX} = e^{λ²/2}.

Indeed, applying (6.7), we have

    E e^{λX} = ∫_{−∞}^{∞} e^{λt} (2π)^{−1/2} e^{−t²/2} dt = (2π)^{−1/2} ∫_{−∞}^{∞} e^{−(1/2)(t² − 2λt)} dt.

The trick now is to complete the square in the exponent, that is,

    t² − 2λt = (t − λ)² − λ².

Thus

    E e^{λX} = (2π)^{−1/2} ∫_{−∞}^{∞} exp(−(1/2)([t − λ]² − λ²)) dt = e^{λ²/2}.
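A quick Monte Carlo illustration of (6.8) in Python (λ = 0.5 is an illustrative value; this is a numerical check, not a proof):

```python
import math, random

# Monte Carlo illustration of (6.8): for a normalized Gaussian X,
# E exp(lam*X) = exp(lam^2/2).  The value lam = 0.5 is illustrative.
random.seed(0)
lam = 0.5
n = 200_000
estimate = sum(math.exp(lam * random.gauss(0.0, 1.0)) for _ in range(n)) / n
print(estimate, math.exp(lam * lam / 2.0))  # both close to exp(1/8)
```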

Exercise 6.4. Let W be any Gaussian random variable with mean zero. Prove that

(6.9)    E e^W = e^{(1/2)E(W²)}.


6.2. Differential Equations. A differential equation, or ordinary differential
equation (ODE), is simply a functional relationship specifying first, or higher,
derivatives of a function; the order of the equation is just the degree of its highest
derivative. For example,

    y′(t) = 4t³ + y(t)²

is a univariate first-order differential equation, whilst

    y′(t) = Ay(t),

where y(t) ∈ R^d and A ∈ R^{d×d}, is a first-order differential equation in d variables. A
tiny class of differential equations can be solved analytically, but numerical methods
are required for the vast majority. The numerical analysis of differential equations
has been one of the most active areas of research in computational mathematics
since the 1960s and excellent free software exists. It is extremely unlikely that any
individual can better this software without years of effort and scholarship, so you
should use this software for any practical problem. You can find lots of information
at www.netlib.org and www.nr.org. This section contains the minimum relevant
theory required to make use of this software.
You should commit to memory one crucial first-order ODE:

Proposition 6.2. The general solution to

(6.10)    y′(t) = λy(t),    t ∈ R,

where λ can be any complex number, is given by

(6.11)    y(t) = c exp(λt),    t ∈ R.

Here c ∈ C is a constant. Note that c = y(0), so we can also write the solution as
y(t) = y(0) exp(λt).

Proof. If we multiply the equation y′ − λy = 0 by the integrating factor exp(−λt),
then we obtain

    0 = d/dt (y(t) exp(−λt)),

that is,

    y(t) exp(−λt) = c,

for all t ∈ R.

In fact, there's a useful slogan for ODEs: try an exponential exp(λt) or use
reliable numerical software.
Example 6.4. If we try y(t) = exp(λt) as a trial solution in

    y″ + 2y′ − 3y = 0,

then we obtain

    0 = exp(λt)(λ² + 2λ − 3).

Since exp(λt) ≠ 0, for any t, we deduce the associated equation

    λ² + 2λ − 3 = 0.

The roots of this quadratic are 1 and −3, which is left as an easy exercise. Now
this ODE is linear: any linear combination of solutions is still a solution. Thus we
have a general family of solutions

    α exp(t) + β exp(−3t),

for any complex numbers α and β. We need two pieces of information to solve for
these constants, such as y(t_1) and y(t_2), or, more usually, y(t_1) and y′(t_1). In fact
this is the general solution of the equation.
In fact, we can always change an mth order equation in one variable into an
equivalent first order equation in m variables, a technique that I shall call vectorizing
(some books prefer the more pompous phrase reduction of order). Most ODE
software packages are designed for first order systems, so vectorizing has both
practical and theoretical importance.
For example, given

    y″(t) = sin(t) + (y′(t))³ − 2(y(t))²,

we introduce the vector function

    z(t) = (y(t), y′(t))^T.

Then

    z′(t) = (y′, y″)^T = (y′, sin(t) + (y′)³ − 2y²)^T.

In other words, writing z(t) = (z_1(t), z_2(t))^T = (y(t), y′(t))^T, we have derived

    z′ = (z_2, sin(t) + z_2³ − 2z_1²)^T,

which we can write as

    z′ = f(z, t).
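Once vectorized, the system can be handed to any standard ODE solver. Here is a minimal classical Runge–Kutta (RK4) sketch in Python; to make the answer checkable, it integrates the linear equation y″ + 2y′ − 3y = 0 of Example 6.4 (whose solution with y(0) = y′(0) = 1 is y(t) = e^t) rather than the nonlinear example above. In practice you should prefer mature library solvers, as discussed earlier.

```python
import math

# Minimal classical Runge-Kutta (RK4) integrator for a vectorized first-order
# system z' = f(z, t), applied to y'' + 2y' - 3y = 0 with y(0) = y'(0) = 1,
# whose analytic solution is y(t) = exp(t).
def rk4(f, z0, t0, t1, n):
    h = (t1 - t0) / n
    t, z = t0, list(z0)
    for _ in range(n):
        k1 = f(z, t)
        k2 = f([zi + 0.5 * h * ki for zi, ki in zip(z, k1)], t + 0.5 * h)
        k3 = f([zi + 0.5 * h * ki for zi, ki in zip(z, k2)], t + 0.5 * h)
        k4 = f([zi + h * ki for zi, ki in zip(z, k3)], t + h)
        z = [zi + (h / 6.0) * (a + 2 * b + 2 * c + d)
             for zi, a, b, c, d in zip(z, k1, k2, k3, k4)]
        t += h
    return z

# z = (y, y'), so z' = (y', -2y' + 3y).
f = lambda z, t: [z[1], -2.0 * z[1] + 3.0 * z[0]]
y1 = rk4(f, [1.0, 1.0], 0.0, 1.0, 1000)[0]
print(y1, math.e)  # RK4 reproduces exp(1) to high accuracy
```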
Exercise 6.5. You probably won't need to consider ODEs of order exceeding two
very often in finance, but the same trick works. Given

    y^{(n)}(t) = Σ_{k=0}^{n−1} a_k(t) y^{(k)}(t),

we define the vector function z(t) ∈ R^n by

    z_k(t) = y^{(k)}(t),    k = 0, 1, …, n − 1.

Then z′(t) = M z(t). Find the matrix M.

6.3. Recurrence Relations. In its most general form, a recurrence relation is
simply a sequence of vectors v^{(1)}, v^{(2)}, … for which some functional relation generates
v^{(n)} from the earlier iterates v^{(1)}, …, v^{(n−1)}. At this level of generality, very little
more can be said. However, the theory of linear recurrence relations is simple and
very similar to the techniques of differential equations.
The first order linear recurrence relation is simply the sequence {a_n : n = 0, 1, …}
of complex numbers defined by

    a_n = ca_{n−1}.

Thus

    a_n = ca_{n−1} = c²a_{n−2} = c³a_{n−3} = ⋯ = c^n a_0,

and the solution is complete.


The second order linear recurrence relation is slightly more demanding. Here

    a_{n+1} + pa_n + qa_{n−1} = 0

and, inspired by the solution for the first order recurrence, we try a_n = c^n, for some
c ≠ 0. Then

    0 = c^{n−1}(c² + pc + q),

or

    0 = c² + pc + q.

If this has two distinct roots c_1 and c_2, then one possible solution to the second
order recurrence is

    u_n = p_1 c_1^n + p_2 c_2^n,

for constants p_1 and p_2. However, is this the full set of solutions? What happens if
the quadratic has only one root?
Proposition 6.3. Let {a_n : n ∈ Z} be the sequence of complex numbers satisfying
the recurrence relation

    a_{n+1} + pa_n + qa_{n−1} = 0,    n ∈ Z.

If λ_1 and λ_2 are the roots of the associated quadratic

    t² + pt + q = 0,

then the general solution is

    a_n = c_1 λ_1^n + c_2 λ_2^n

when λ_1 ≠ λ_2. If λ_1 = λ_2, then the general solution is

    a_n = (v_1 n + v_2) λ_1^n.

Proof. The same vectorizing trick used to change second order differential equations
in one variable into first order differential equations in two variables can also be
used here. We define a new sequence {b^{(n)} : n ∈ Z} by

    b^{(n)} = ( a_{n−1} )
              ( a_n     ).

Thus

    b^{(n)} = ( a_{n−1}              )
              ( −pa_{n−1} − qa_{n−2} ),

that is,

(6.12)    b^{(n)} = Ab^{(n−1)},

where

(6.13)    A = (  0    1 )
              ( −q   −p ).

This first order recurrence has the simple solution

(6.14)    b^{(n)} = A^n b^{(0)},

so our analytic solution reduces to calculation of the matrix power A^n. Now let
us begin with the case when the eigenvalues λ_1 and λ_2 are distinct. Then the
corresponding eigenvectors w^{(1)} and w^{(2)} are linearly independent. Hence we can
write our initial vector b^{(0)} as a unique linear combination of these eigenvectors:

    b^{(0)} = b_1 w^{(1)} + b_2 w^{(2)}.


Thus

    b^{(n)} = b_1 A^n w^{(1)} + b_2 A^n w^{(2)} = b_1 λ_1^n w^{(1)} + b_2 λ_2^n w^{(2)}.

Looking at the second component of the vector, we obtain

    a_n = c_1 λ_1^n + c_2 λ_2^n.

Now the eigenvalues of A are the roots of the quadratic equation

    det(A − λI) = det( −λ    1      ) = λ² + pλ + q = 0,
                      ( −q   −p − λ )

in other words the roots of the quadratic

    λ² + pλ + q = 0.

Thus the associated equation is precisely the characteristic equation of the matrix
A in the vectorized problem. Hence a_n = c_1 λ_1^n + c_2 λ_2^n.
We only need this case in the course, but I shall lead you through a careful
analysis of the case of coincident roots. It's a good exercise for your matrix skills.
First note that the roots are coincident if and only if p² = 4q, in which case

    A = (  0       1 )
        ( −p²/4   −p ),

and the eigenvalue is −p/2. In fact, subsequent algebra is simplified if we substitute
ω = −p/2, obtaining

    A = (  0     1  )
        ( −ω²   2ω  ).

The remainder of the proof is left as the following exercise.

Exercise 6.6. Show that

    A = ωI + uv^T,

where

    u = ( 1 ),    v = ( −ω )
        ( ω )         (  1 ),

and note that v^T u = 0. Show also that

    A² = ω²I + 2ωuv^T,    A³ = ω³I + 3ω²uv^T,

and use proof by induction to demonstrate that

    A^n = ω^n I + nω^{n−1} uv^T.

Hence find the general solution for a_n.
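Proposition 6.3 is easy to illustrate numerically. The sketch below uses the Fibonacci recurrence a_{n+1} = a_n + a_{n−1} (so p = q = −1), whose associated quadratic t² − t − 1 = 0 has distinct roots (1 ± √5)/2; matching a_0 = 0, a_1 = 1 gives Binet's closed form:

```python
import math

# Distinct-roots case of Proposition 6.3: the Fibonacci recurrence
# a_{n+1} = a_n + a_{n-1} (p = q = -1) with a_0 = 0, a_1 = 1.
phi = (1.0 + math.sqrt(5.0)) / 2.0
psi = (1.0 - math.sqrt(5.0)) / 2.0

a = [0, 1]
for n in range(2, 31):
    a.append(a[-1] + a[-2])

for n in (10, 20, 30):
    closed = (phi ** n - psi ** n) / math.sqrt(5.0)
    print(n, a[n], round(closed))  # recurrence and closed form agree
```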


6.4. Mortgages: a once exotic instrument. The objective of this section is
to illustrate some of the above techniques for analysing difference and differential
equations via mortgage pricing. You are presumably all too familiar with a repayment
mortgage: we borrow a large sum M for a fairly large slice T of our lifespan, repaying
capital and interest using N regular payments. The interest rate is assumed to be
constant and it's a secured loan: our homes are forfeit on default. How do we
calculate our repayments?
Let h = T/N be the interval between payments, let D_h : [0, T] → R be our debt
as a function of time, and let A(h) be our payment. We shall assume that our
initial debt is D_h(0) = 1, because we can always multiply by the true initial cost
M of our house after the calculation. Thus D_h must satisfy the equations

(6.15)    D_h(0) = 1,    D_h(T) = 0    and    D_h(ℓh) = D_h((ℓ − 1)h)e^{rh} − A(h).

We see that D_h(h) = e^{rh} − A(h), while

    D_h(2h) = D_h(h)e^{rh} − A(h) = e^{2rh} − A(h)(1 + e^{rh}).

The pattern is now fairly obvious:

(6.16)    D_h(ℓh) = e^{ℓrh} − A(h) Σ_{k=0}^{ℓ−1} e^{krh},

and summing the geometric series⁸

(6.17)    D_h(ℓh) = e^{ℓrh} − A(h) (e^{ℓrh} − 1)/(e^{rh} − 1).

In order to achieve D_h(T) = 0, we choose

(6.18)    A(h) = (e^{rh} − 1)/(1 − e^{−rT}).

Exercise 6.7. What happens if T → ∞?


Exercise 6.8. Prove that

(6.19)    D_h(ℓh) = (1 − e^{−r(T−ℓh)})/(1 − e^{−rT}).

Thus, if t = ℓh is constant (so we increase ℓ as we reduce h), then

(6.20)    D_h(t) = (1 − e^{−r(T−t)})/(1 − e^{−rT}).

Almost all mortgages are repaid by 300 monthly payments over 25 years. However,
until recently, many mortgages calculated interest yearly, which means that we
choose h = 1 in (6.18) and then divide A(1) by 12 to obtain the monthly
payment.

Exercise 6.9. Calculate the monthly repayment A(1)/12 when M = 10⁵, T = 25,
r = 0.05 and h = 1. Now repeat the calculation using h = 1/12. Interpret your
result.
⁸Many students forget the simple formula. If S = 1 + a + a² + ⋯ + a^{m−2} + a^{m−1}, then
aS = a + a² + ⋯ + a^{m−1} + a^m. Subtracting these expressions implies (a − 1)S = a^m − 1, all
other terms cancelling.
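A sketch of the calculation requested in Exercise 6.9, using the payment formula (6.18) scaled by the initial debt M (the numbers are the exercise's own parameters):

```python
import math

# Exercise 6.9 via (6.18): A(h) = (exp(rh) - 1)/(1 - exp(-rT)), scaled by M.
# Interest recalculated yearly (h = 1, yearly payment split into 12 equal
# monthly instalments) versus monthly recalculation (h = 1/12).
M, T, r = 1e5, 25.0, 0.05

def payment(h):
    return M * (math.exp(r * h) - 1.0) / (1.0 - math.exp(-r * T))

monthly_from_yearly = payment(1.0) / 12.0
monthly_exact = payment(1.0 / 12.0)
print(monthly_from_yearly, monthly_exact)
# Yearly recalculation costs the borrower slightly more per month, in line
# with the expansion A(h)/(Ah) = (exp(rh) - 1)/(rh) = 1 + rh/2 + ... of (6.26).
```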


In principle, there's no reason why our repayment could not be continuous,
with interest being recalculated on our constantly decreasing debt. For continuous
repayment, our debt D : [0, T] → R satisfies the relations

(6.21)    D(0) = 1,    D(T) = 0    and    D(t + h) = D(t)e^{rh} − hA.

Exercise 6.10. Prove that

(6.22)    D′(t) − rD(t) = −A,

where, in particular, you should prove that (6.21) implies the differentiability of
D(t). Solve this differential equation using the integrating factor e^{−rt}. You should
find the solution

(6.23)    D(t)e^{−rt} − 1 = −A ∫_0^t e^{−rτ} dτ = A (e^{−rt} − 1)/r.

Hence show that

(6.24)    A = r/(1 − e^{−rT})

and

(6.25)    D(t) = (1 − e^{−r(T−t)})/(1 − e^{−rT}),

agreeing with (6.20), i.e. D_h(kh) = D(kh), for all k. Prove that lim_{r→∞} D(t) = 1
for 0 < t < T and interpret.
Observe that

(6.26)    A(h)/(Ah) = (e^{rh} − 1)/(rh) ≈ 1 + (rh/2),

so that continuous repayment is optimal for the borrower, but that the mortgage
provider is making a substantial profit. Greater competition has made yearly
recalculations much rarer, and interest is often paid daily, i.e. h = 1/250, which is
rather close to continuous repayment.

Exercise 6.11. Construct graphs of D(t) for various values of r. Calculate the
time t_0(r) at which half of the debt has been paid.

6.5. Pricing Mortgages via lack of arbitrage. There is a very slick arbitrage
argument to deduce the continuous repayment mortgage debt formula (6.25). Specifically,
the simple fact that D(t) is a deterministic financial instrument implies, via arbitrage,
that D(t) = a + b exp(rt), so we need only choose the constants a and b to satisfy
D(0) = 1 and D(T) = 0, which imply a + b = 1 and a + b exp(rT) = 0. Solving these
provides a = exp(rT)/(exp(rT) − 1) and b = −1/(exp(rT) − 1), and equivalence to
(6.25) is easily checked.


6.6. Exponential Growth and Population Rhetoric. From time to time, journalists
wishing to emphasize the rapidity of population growth make statements of the form
"There are more humans alive now than have ever lived in the past." The New
Zealand Population Statistics Unit provides a fascinating Myth Busters page, one
of which addresses this question directly:
http://www.population.govt.nz/myth-busters/myth-11.aspx
My concern here is to consider conditions on exponential growth for which the claim
would be true. Specifically, suppose our population satisfies

    N(t) = e^{α+βt},    for t ∈ R,

so that the total population who have ever lived at time t is given by

    P(t) = ∫_{−∞}^{t} N(s) ds = [e^{α+βs}/β]_{s=−∞}^{t} = N(t)/β.

Thus the inequality P(t) ≤ N(t) is satisfied for all t ∈ R if and only if β ≥ 1.

Exercise 6.12. Human growth is not exponential. For example, world population
is often estimated to have decreased by some 30% in the mid-Fourteenth Century,
due to the Black Death⁹. However, it is of interest to estimate the parameters of
N(t) = exp(α + βt) to fit the recent population estimates N(1970) = 3 × 10⁹ and
N(2000) = 6 × 10⁹. What do you find? Specifically, calculate P(2000) for these
parameters.
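A sketch of the fit requested in Exercise 6.12:

```python
import math

# Exercise 6.12: fit N(t) = exp(alpha + beta*t) to N(1970) = 3e9 and
# N(2000) = 6e9, then evaluate P(2000) = N(2000)/beta.
beta = math.log(6e9 / 3e9) / 30.0          # doubling every 30 years
alpha = math.log(3e9) - beta * 1970.0
P2000 = 6e9 / beta
print(beta, P2000)  # beta is far below 1, so P(2000) vastly exceeds N(2000)
```

Since β ≈ 0.023 is nowhere near 1, the fitted model says far more people have lived than are alive now, so the journalists' claim fails badly.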
Example 6.5. It's also of interest to consider the discrete analogue. Specifically,
suppose N_k = r^k and

    P_k = Σ_{ℓ=−∞}^{k−1} N_ℓ = Σ_{ℓ=−∞}^{k−1} r^ℓ.

Then

    P_k = r^{k−1}(1 + r^{−1} + r^{−2} + ⋯) = r^{k−1} · 1/(1 − r^{−1}) = r^{k−1} · r/(r − 1) = N_k/(r − 1).

Thus P_k ≤ N_k, for all k, if and only if r ≥ 2. My error in Monday's lecture was
to replace k − 1 by k in the definition of N_k, which led to nonsense! As a general
rule, one should never extemporize when exhausted at the end of term!
Exercise 6.13. A pessimist might object to the sanguine conclusions of Exercise
6.12: "Exercise 6.12 is an invalid estimate of P(t), because it counts people twice.
The population is now doubling roughly once every 30 years, since N(1940) = 1.5 × 10⁹,
N(1970) = 3 × 10⁹ and N(2000) = 6 × 10⁹. Surely that's enough for the total
population now to exceed the total number who have ever lived!" Show that

    N(1940) + N(1970) + N(2000) + ⋯ + N(1940 + 30(n − 1)) = N(1940 + 30n) − N(1940).

Is the pessimist correct? Were our earlier arguments valid?

⁹For those who wish to learn more, Plagues and Peoples, by W. H. McNeill, provides a
fascinating epidemiological history of humanity which is highly recommended.


7. Numerical Linear Algebra

I shall not include much explicitly here, because you have my longer lecture notes
on numerical linear algebra. However, do revise the first long chapter of those notes
if you have problems.

7.1. Orthogonal Matrices. Modern numerical linear algebra began with the computer
during the Second World War, its progress accelerating enormously as computers
became faster and more convenient in the 1960s. One of the most vital conclusions
of this research field is the enormous practical importance of matrices which leave
Euclidean length invariant. More formally:

Definition 7.1. We shall say that Q ∈ R^{n×n} is distance-preserving if ‖Qx‖ = ‖x‖,
for all x ∈ R^n.
The following simple result is very useful.

Lemma 7.1. Let M ∈ R^{n×n} be any symmetric matrix for which x^T M x = 0, for
every x ∈ R^n. Then M is the zero matrix.

Proof. Let e_1, e_2, …, e_n ∈ R^n be the usual coordinate vectors. Taking x = e_j
shows that the diagonal entries M_{jj} vanish. Then

    M_{jk} = e_j^T M e_k = (1/2)(e_j + e_k)^T M (e_j + e_k) = 0,    1 ≤ j, k ≤ n.

Theorem 7.2. The matrix Q ∈ R^{n×n} is distance-preserving if and only if Q^T Q = I.

Proof. If Q^T Q = I, then

    ‖Qx‖² = (Qx)^T (Qx) = x^T Q^T Qx = x^T x = ‖x‖²,

and Q is distance-preserving. Conversely, if ‖Qx‖² = ‖x‖², for all x ∈ R^n, then

    x^T (Q^T Q − I) x = 0,    x ∈ R^n.

Since Q^T Q − I is a symmetric matrix, Lemma 7.1 implies Q^T Q − I = 0, i.e.
Q^T Q = I.

The condition Q^T Q = I simply states that the columns of Q are orthonormal
vectors, that is, if the columns of Q are q_1, q_2, …, q_n, then ‖q_1‖ = ⋯ = ‖q_n‖ = 1
and q_j^T q_k = 0 when j ≠ k. For this reason, Q is also called an orthogonal matrix.
We shall let O(n) denote the set of all (real) n × n orthogonal matrices.
We shall let O(n) denote the set of all (real) n n orthogonal matrices.

Exercise 7.1. Let Q ∈ O(n). Prove that Q^{−1} = Q^T. Further, prove that O(n) is
closed under matrix multiplication, that is, Q_1 Q_2 ∈ O(n) when Q_1, Q_2 ∈ O(n). (In
other words, O(n) forms a group under matrix multiplication. This observation is
important, and O(n) is often called the Orthogonal Group.)
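A quick numerical illustration of Theorem 7.2, using a 2 × 2 rotation matrix (the angle is arbitrary; rotations satisfy Q^T Q = I and therefore preserve Euclidean length):

```python
import math

# Numerical illustration of Theorem 7.2 with a 2x2 rotation matrix.
theta = 0.7
Q = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]

def matvec(A, x):
    return [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(A))]

def norm(x):
    return math.sqrt(sum(xi * xi for xi in x))

x = [3.0, -4.0]
print(norm(matvec(Q, x)), norm(x))  # both equal 5.0 (up to rounding)

# Q^T Q should be the identity, up to rounding error.
QtQ = [[sum(Q[k][i] * Q[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
print(QtQ)
```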
Department of Economics, Mathematics and Statistics, Birkbeck College, University
of London, Malet Street, London WC1E 7HX, England
E-mail address: [email protected]
