
Adjoint methods for option pricing, Greeks and calibration using PDEs and SDEs
Mike Giles
[email protected]

Oxford University Mathematical Institute


Oxford-Man Institute of Quantitative Finance

Adjoints for finance – p. 1


Outline
introductory ideas
black-box adjoints
high-level linear algebra adjoints
automatic differentiation
backward and forward PDEs for pricing
backward and forward discrete equations
what can go wrong with PDEs?
modular approach to calibration
Monte Carlo pathwise sensitivities
path-dependent payoffs
binning
handling discontinuities
A question!
Given compatible matrices A, B, C does it matter how one
computes the product A B C ? (i.e. (A B) C or A (B C) ?)
Answer 1: no, in theory, and also in practice if A, B, C are
square
Answer 2: yes, in practice, if A, B, C have dimensions
$1 \times 10^5$, $10^5 \times 10^5$, $10^5 \times 10^5$.

(diagram: a row vector multiplying two square matrices)

Point: this is all about computational efficiency
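The cost gap between the two orderings can be checked by counting multiply-adds. This is a minimal sketch: the helper `matmul_flops` is a hypothetical name introduced here, and only the shapes are tracked, so the $10^5 \times 10^5$ matrices are never built.

```python
def matmul_flops(shape_left, shape_right):
    """Multiply-add count and result shape for a dense (m x k)(k x n) product."""
    m, k = shape_left
    k2, n = shape_right
    assert k == k2, "inner dimensions must match"
    return m * k * n, (m, n)

n = 10**5
A, B, C = (1, n), (n, n), (n, n)   # shapes only

# (A B) C: two vector-matrix products
cost_ab, shape_ab = matmul_flops(A, B)
cost_ab_c, _ = matmul_flops(shape_ab, C)
left_to_right = cost_ab + cost_ab_c          # 2 n^2

# A (B C): a matrix-matrix product, then a vector-matrix product
cost_bc, shape_bc = matmul_flops(B, C)
cost_a_bc, _ = matmul_flops(A, shape_bc)
right_to_left = cost_bc + cost_a_bc          # n^3 + n^2

print(left_to_right, right_to_left)
```

With these shapes, A (B C) costs roughly n/2 times as much as (A B) C, which is the efficiency argument behind reverse mode.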
Generic black-box problem
An input vector $u_0$ leads to a scalar output $u_N$:

$u_0$ --> [ ] --> [ ] --> ... --> [ ] --> $u_N$

Each box could be a mathematical step (calibration, spline,
pricing) or a computer code, or one computer instruction
Key assumption: each step is (locally) differentiable
Generic black-box problem
Let $\dot{u}_n$ represent the derivative of $u_n$ with respect to one
particular element of input $u_0$. Differentiating black-box
processes gives

\[ \dot{u}_{n+1} = D_n \, \dot{u}_n, \qquad D_n \equiv \frac{\partial u_{n+1}}{\partial u_n} \]

and hence

\[ \dot{u}_N = D_{N-1} D_{N-2} \cdots D_1 D_0 \, \dot{u}_0 \]

standard “forward mode” approach multiplies matrices
from right to left – very natural
each element of $u_0$ requires its own sensitivity
calculation – cost proportional to number of inputs
Generic black-box problem
Let $\bar{u}_n$ be the derivative of output $u_N$ with respect to $u_n$.

\[ \bar{u}_n \equiv \left( \frac{\partial u_N}{\partial u_n} \right)^T = \left( \frac{\partial u_N}{\partial u_{n+1}} \frac{\partial u_{n+1}}{\partial u_n} \right)^T = D_n^T \, \bar{u}_{n+1} \]

and hence

\[ \bar{u}_0 = D_0^T D_1^T \cdots D_{N-2}^T D_{N-1}^T \, \bar{u}_N \]

and $\bar{u}_N = 1$.

$\bar{u}_0$ gives sensitivity of $u_N$ to all elements of $u_0$ at a fixed
cost, not proportional to the size of $u_0$.
a different output would require a separate adjoint
calculation; cost proportional to number of outputs
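The two sweeps can be compared on a small chain of Jacobians. This is a sketch under stated assumptions: the matrices $D_n$ are random stand-ins rather than derivatives of a real calculation, and the helpers `matvec` and `matTvec` are names introduced here.

```python
import random

def matvec(D, x):
    """y = D x for a matrix stored as a list of rows."""
    return [sum(d * xj for d, xj in zip(row, x)) for row in D]

def matTvec(D, y):
    """x = D^T y without forming the transpose."""
    return [sum(D[i][j] * y[i] for i in range(len(D))) for j in range(len(D[0]))]

random.seed(0)
sizes = [4, 3, 5, 1]            # u_0 has 4 elements, u_N is a scalar
# Ds[k] has shape sizes[k+1] x sizes[k]
Ds = [[[random.uniform(-1, 1) for _ in range(sizes[k])]
       for _ in range(sizes[k + 1])] for k in range(len(sizes) - 1)]

# Forward mode: one sweep per element of u_0.
forward_grad = []
for j in range(sizes[0]):
    u_dot = [1.0 if i == j else 0.0 for i in range(sizes[0])]
    for D in Ds:
        u_dot = matvec(D, u_dot)
    forward_grad.append(u_dot[0])

# Reverse mode: a single sweep starting from u_bar_N = 1.
u_bar = [1.0]
for D in reversed(Ds):
    u_bar = matTvec(D, u_bar)

print(forward_grad)
print(u_bar)
```

Both lists hold the same gradient; forward mode needed four sweeps to obtain it, reverse mode one.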
Generic black-box problem
It looks easy (?) – what’s the catch?

need to do original nonlinear calculation to compute
and store $D_n$ before doing adjoint reverse pass
– storage requirements can be significant for PDEs
practical implementation can be tedious if hand-coded
– use automatic differentiation tools
need care in treating black-boxes which involve a fixed
point iteration
derivative may not be as accurate as original
approximation


Automatic differentiation
We now consider a single black-box component, which is
actually the outcome of a computer program.

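A computer instruction creates an additional new value:

\[ u_{n+1} = \mathbf{f}_n(u_n) \equiv \begin{pmatrix} u_n \\ f_n(u_n) \end{pmatrix}, \]

(here $\mathbf{f}_n$ is the full update of the state vector and $f_n$ is the single new value)

A computer program is the composition of N such steps:

\[ u_N = \mathbf{f}_{N-1} \circ \mathbf{f}_{N-2} \circ \cdots \circ \mathbf{f}_1 \circ \mathbf{f}_0 (u_0). \]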


Automatic differentiation
In forward mode, differentiation gives

\[ \dot{u}_{n+1} = D_n \, \dot{u}_n, \qquad D_n \equiv \begin{pmatrix} I_n \\ \partial f_n / \partial u_n \end{pmatrix}, \]

and hence

\[ \dot{u}_N = D_{N-1} D_{N-2} \cdots D_1 D_0 \, \dot{u}_0 . \]


Automatic differentiation
In reverse mode, we have

\[ \bar{u}_n = D_n^T \, \bar{u}_{n+1}, \]

and hence

\[ \bar{u}_0 = (D_0)^T (D_1)^T \cdots (D_{N-2})^T (D_{N-1})^T \, \bar{u}_N . \]

Note: need to go forward through original calculation to
compute/store the $D_n$, then go in reverse to compute $\bar{u}_n$


Automatic differentiation
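At the level of a single instruction

\[ c = f(a, b) \]

the forward mode is

\[ \begin{pmatrix} \dot{a} \\ \dot{b} \\ \dot{c} \end{pmatrix}_{n+1} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ \frac{\partial f}{\partial a} & \frac{\partial f}{\partial b} \end{pmatrix} \begin{pmatrix} \dot{a} \\ \dot{b} \end{pmatrix}_{n} \]

and so the reverse mode is

\[ \begin{pmatrix} \bar{a} \\ \bar{b} \end{pmatrix}_{n} = \begin{pmatrix} 1 & 0 & \frac{\partial f}{\partial a} \\ 0 & 1 & \frac{\partial f}{\partial b} \end{pmatrix} \begin{pmatrix} \bar{a} \\ \bar{b} \\ \bar{c} \end{pmatrix}_{n+1} \]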
Automatic differentiation
This gives a prescriptive algorithm for reverse mode
differentiation.

\[ \bar{a} = \bar{a} + \frac{\partial f}{\partial a} \, \bar{c} \]
\[ \bar{b} = \bar{b} + \frac{\partial f}{\partial b} \, \bar{c} \]

Manual implementation is possible but can be tedious,
so automated tools have been developed, following two
approaches:
operator overloading (ADOL-C, FADBAD++)
source code transformation (Tapenade, TAF/TAC++)
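The update rule can be applied by hand to a three-instruction program. The function below is an illustrative example chosen for this sketch (it is not from the slides): it computes e = sin(a b) + a and then applies the adjoint increments in reverse order of the instructions.

```python
import math

def f_and_gradient(a, b):
    # forward pass, keeping the intermediate values
    c = a * b
    d = math.sin(c)
    e = d + a

    # reverse pass: every adjoint starts at zero except the output
    a_bar = b_bar = c_bar = d_bar = 0.0
    e_bar = 1.0
    # adjoint of e = d + a
    d_bar += e_bar
    a_bar += e_bar
    # adjoint of d = sin(c)
    c_bar += math.cos(c) * d_bar
    # adjoint of c = a * b
    a_bar += b * c_bar
    b_bar += a * c_bar
    return e, a_bar, b_bar

e, de_da, de_db = f_and_gradient(0.7, 1.3)
print(e, de_da, de_db)
```

The results match the analytic derivatives $1 + b\cos(ab)$ and $a\cos(ab)$. An AD tool generates the same kind of code mechanically.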


Source code transformation
programmer supplies black-box code which takes $u$ as
input and produces $v = f(u)$ as output
in forward mode, AD tool generates new code which
takes $u$ and $\dot{u}$ as input, and produces $v$ and $\dot{v}$ as output

\[ \dot{v} = \left( \frac{\partial f}{\partial u} \right) \dot{u} \]

in reverse mode, AD tool generates new code which
takes $u$ and $\bar{v}$ as input, and produces $v$ and $\bar{u}$ as output

\[ \bar{u} = \left( \frac{\partial f}{\partial u} \right)^T \bar{v} \]


Linear algebra sensitivities
Low-level automatic differentiation is very helpful, but a
high-level approach is sometimes better (e.g. when using
libraries)

Won’t go through derivation – just present results.

Notation:

\[ \dot{C}_{ij} \equiv \frac{\partial C_{ij}}{\partial\, \text{input}}, \qquad \bar{C}_{ij} \equiv \frac{\partial\, \text{output}}{\partial C_{ij}} \]

(Note: some literature defines $\bar{C}$ as the transpose)


Linear algebra sensitivities

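\[
\begin{array}{lll}
\text{operation} & \text{forward mode} & \text{reverse mode} \\
C = A + B & \dot{C} = \dot{A} + \dot{B} & \bar{A} = \bar{C}, \quad \bar{B} = \bar{C} \\
C = A B & \dot{C} = \dot{A}\, B + A\, \dot{B} & \bar{A} = \bar{C}\, B^T, \quad \bar{B} = A^T \bar{C} \\
C = A^{-1} & \dot{C} = -\, C \dot{A}\, C & \bar{A} = -\, C^T \bar{C}\, C^T \\
C = A^{-1} B & \dot{C} = A^{-1} (\dot{B} - \dot{A}\, C) & \bar{B} = (A^T)^{-1} \bar{C}, \quad \bar{A} = -\bar{B}\, C^T
\end{array}
\]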
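A standard way to validate a forward/reverse pair is to check that $\sum_{ij} \bar{C}_{ij} \dot{C}_{ij} = \sum_{ij} \bar{A}_{ij} \dot{A}_{ij} + \sum_{ij} \bar{B}_{ij} \dot{B}_{ij}$ for arbitrary $\dot{A}, \dot{B}, \bar{C}$. The sketch below does this for the product rule $C = A B$ with random data; the helper names are introduced here for illustration.

```python
import random

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def inner(X, Y):
    """Sum of elementwise products, i.e. trace(X^T Y)."""
    return sum(x * y for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

def rand(m, n):
    return [[random.uniform(-1, 1) for _ in range(n)] for _ in range(m)]

random.seed(1)
A, B = rand(3, 4), rand(4, 2)
A_dot, B_dot = rand(3, 4), rand(4, 2)
C_bar = rand(3, 2)

# forward mode: C_dot = A_dot B + A B_dot
AdB, ABd = matmul(A_dot, B), matmul(A, B_dot)
C_dot = [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(AdB, ABd)]

# reverse mode: A_bar = C_bar B^T, B_bar = A^T C_bar
A_bar = matmul(C_bar, transpose(B))
B_bar = matmul(transpose(A), C_bar)

lhs = inner(C_bar, C_dot)
rhs = inner(A_bar, A_dot) + inner(B_bar, B_dot)
print(lhs, rhs)
```

The two numbers agree to rounding error. The same check applies to the inverse and linear-solve rules.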


Linear algebra sensitivities
One important little catch: when A is a tri-diagonal matrix,
and B and C are both vectors,

\[ C = A^{-1} B \]
\[ \dot{C} = A^{-1} (\dot{B} - \dot{A}\, C) \]
\[ \bar{B} = (A^T)^{-1} \bar{C}, \qquad \bar{A} = -\bar{B}\, C^T \]

this gives a dense matrix $\bar{A}$, at $O(n^2)$ cost – since $A$ is
tri-diagonal then only the tri-diagonal elements of $\bar{A}$ should
be computed, at $O(n)$ cost
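A minimal sketch of the $O(n)$ version, assuming $\bar{B}$ has already been obtained from a transposed tri-diagonal solve. The function name and the three-list storage layout are choices made for this example.

```python
def tridiag_adjoint(B_bar, C):
    """Tri-diagonal entries of A_bar = -B_bar C^T, without forming the dense matrix."""
    n = len(C)
    lower = [-B_bar[i] * C[i - 1] for i in range(1, n)]    # A_bar[i][i-1]
    diag = [-B_bar[i] * C[i] for i in range(n)]            # A_bar[i][i]
    upper = [-B_bar[i] * C[i + 1] for i in range(n - 1)]   # A_bar[i][i+1]
    return lower, diag, upper

B_bar = [1.0, 2.0, 3.0]
C = [4.0, 5.0, 6.0]
lower, diag, upper = tridiag_adjoint(B_bar, C)

# dense reference, used only to check the small example
dense = [[-B_bar[i] * C[j] for j in range(3)] for i in range(3)]
print(lower, diag, upper)
```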


Linear algebra sensitivities
Others:
matrix determinant
matrix polynomial $p_n(A)$ and exponential $\exp(A)$
eigenvalues and eigenvectors of A,
assuming no repeated eigenvalues
SVD (singular value decomposition) of A,
assuming no repeated singular values
Cholesky factorisation of symmetric A

Most of the adjoint results are 30-40 years old,
but not widely known.


Fixed point iteration
Suppose a black-box computes output $v$ from input $u$ by
solving the nonlinear equations

\[ f(u, v) = 0 \]

using the fixed-point iteration

\[ v_{n+1} = v_n - P(u, v_n) \, f(u, v_n). \]

For Newton iteration $P$ is the inverse Jacobian, but $P$ could
also correspond to a multigrid cycle in an iterative solver.


Fixed point iteration
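A naive forward mode differentiation uses the fixed-point
iteration

\[ \dot{v}_{n+1} = \dot{v}_n - \left( \frac{\partial P}{\partial u} \dot{u} + \frac{\partial P}{\partial v} \dot{v}_n \right) f(u, v_n) - P(u, v_n) \left( \frac{\partial f}{\partial u} \dot{u} + \frac{\partial f}{\partial v} \dot{v}_n \right) \]

but it is more efficient to use

\[ \dot{v}_{n+1} = \dot{v}_n - P(u, v) \left( \frac{\partial f}{\partial u} \dot{u} + \frac{\partial f}{\partial v} \dot{v}_n \right) \]

to iteratively solve

\[ \frac{\partial f}{\partial u} \dot{u} + \frac{\partial f}{\partial v} \dot{v} = 0 \]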


Fixed point iteration
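Since

\[ \dot{v} = - \left( \frac{\partial f}{\partial v} \right)^{-1} \frac{\partial f}{\partial u} \, \dot{u}, \]

the adjoint is

\[ \bar{u} = - \left( \frac{\partial f}{\partial u} \right)^T \left( \frac{\partial f}{\partial v} \right)^{-T} \bar{v} = \left( \frac{\partial f}{\partial u} \right)^T w \]

where

\[ \left( \frac{\partial f}{\partial v} \right)^T w + \bar{v} = 0. \]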


Fixed point iteration
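This can be solved iteratively using

\[ w_{n+1} = w_n - P^T(u, v) \left( \left( \frac{\partial f}{\partial v} \right)^T w_n + \bar{v} \right) \]

and this is guaranteed to converge (well!) since

\[ P^T(u, v) \left( \frac{\partial f}{\partial v} \right)^T \]

has the same eigenvalues as

\[ P(u, v) \, \frac{\partial f}{\partial v} . \]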
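A scalar sketch of the primal and adjoint iterations. The equation $f(u, v) = v^3 + v - u$ and the constant preconditioner $P = 0.2$ are assumptions chosen for this example; in the scalar case the transposes are trivial, but the structure of the two loops is the same as in the vector case.

```python
def f(u, v):
    return v**3 + v - u

def df_dv(u, v):
    return 3.0 * v**2 + 1.0

def df_du(u, v):
    return -1.0

P = 0.2      # fixed scalar preconditioner (hypothetical choice)
u = 2.0      # for this input the exact solution is v = 1

# primal fixed-point iteration: v_{n+1} = v_n - P f(u, v_n)
v = 0.0
for _ in range(200):
    v = v - P * f(u, v)

# adjoint iteration at the converged v:
# w_{n+1} = w_n - P^T ((df/dv)^T w_n + v_bar)
v_bar = 1.0
w = 0.0
for _ in range(200):
    w = w - P * (df_dv(u, v) * w + v_bar)

u_bar = df_du(u, v) * w      # u_bar = (df/du)^T w
print(v, u_bar)
```

Here $\bar{u}$ converges to $dv/du = 1/(3v^2 + 1) = 0.25$, and the adjoint loop contracts at the same rate as the linearised primal loop.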


Forward and reverse PDEs
Suppose we are interested in the forward PDE

\[ \frac{\partial p}{\partial t} = L_t \, p, \]

where $L_t$ is a spatial operator, subject to Dirac initial data
$p(x, 0) = \delta(x - x_0)$, and we want the value of the output
functional

\[ (p(\cdot, T), f) \equiv \int p(x, T) \, f(x) \, dx. \]

The adjoint spatial operator $L_t^*$ is defined by the identity

\[ (L_t v, w) = (v, L_t^* w), \quad \forall v, w \]

assuming certain homogeneous b.c.’s.
Forward and reverse PDEs

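If $u(x, t)$ is the solution of the adjoint PDE

\[ \frac{\partial u}{\partial t} = - L_t^* u, \]

subject to “initial” data $u(x, T) = f(x)$ then

\begin{align*}
(p(\cdot, T), u(\cdot, T)) - (p(\cdot, 0), u(\cdot, 0)) &= \int_0^T \left( \frac{\partial p}{\partial t}, u \right) + \left( p, \frac{\partial u}{\partial t} \right) dt \\
&= \int_0^T (L_t p, u) - (p, L_t^* u) \, dt \\
&= 0,
\end{align*}

and hence $u(x_0, 0) = (p(\cdot, T), f)$.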
Forward and reverse FDEs
Suppose the forward problem has the discrete equivalent

\[ p_{n+1} = A_n \, p_n \]

where $A_n$ is a square matrix.

If there are N timesteps, then the output has the form

\[ f^T M p_N = f^T M A_{N-1} A_{N-2} \cdots A_0 \, p_0 . \]

where $M$ is a symmetric “mass” matrix, which may be
diagonal.


Forward and reverse FDEs
Taking the transpose, this can be re-expressed as

\[ p_0^T v_0 \]

where

\[ v_0 = A_0^T \cdots A_{N-2}^T A_{N-1}^T M^T f \]

and the adjoint solution $v_n$ is defined by

\[ v_n = A_n^T v_{n+1} \]

subject to “initial” data $v_N = M^T f$.


Forward and reverse FDEs
It is sometimes more appropriate to work with

\[ u_n = (M^{-1})^T v_n, \]

in which case we have

\[ u_n = (M A_n M^{-1})^T u_{n+1} \]

subject to “initial” data

\[ u_N = f, \]

and the output functional is $p_0^T M u_0$.


Financial relevance
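Fokker-Planck (or forward Kolmogorov) equation:

\[ \frac{\partial p}{\partial t} + \frac{\partial}{\partial x} (a \, p) = \frac{1}{2} \frac{\partial^2}{\partial x^2} \left( b^2 p \right) \]

for probability density $p(x, t)$ for path $S_t$ satisfying the SDE

\[ dS_t = a(S_t, t) \, dt + b(S_t, t) \, dW_t . \]

Backward Kolmogorov (or discounted Feynman-Kac)
equation:

\[ \frac{\partial u}{\partial t} + a \frac{\partial u}{\partial x} + \frac{1}{2} b^2 \frac{\partial^2 u}{\partial x^2} = 0 \]

where $u(x, t) = E[f(S_T) \,|\, S_t = x]$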
Financial relevance
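The spatial operators are

\[ L p \equiv - \frac{\partial}{\partial x} (a \, p) + \frac{1}{2} \frac{\partial^2}{\partial x^2} \left( b^2 p \right) \]

and

\[ L^* u \equiv a \frac{\partial u}{\partial x} + \frac{1}{2} b^2 \frac{\partial^2 u}{\partial x^2} \]

The identity

\[ (L v, w) = (v, L^* w), \quad \forall v, w \]

can be verified by integration by parts, assuming
$a \, v \, w$, $\; b^2 v \, \frac{\partial w}{\partial x}$, $\; b^2 w \, \frac{\partial v}{\partial x}$ are zero on boundary.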
Financial relevance
Discrete equations are usually formulated for backward
equation:

\[ u_n = B_n \, u_{n+1} \]

subject to payoff data $u_N = f$, and the output is $e^T u_0$ where
$e$ is a unit vector with a single non-zero entry.

The equivalent discrete adjoint problem is

\[ P_{n+1} = B_n^T P_n \]

subject to initial data $P_0 = e$, and the output is $P_N^T f$.

$P_n$ is a vector of discrete probabilities – need to divide by
grid spacing to get approximation to probability density.
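The equivalence of the two outputs can be checked on a toy grid. In this sketch the one-step matrix $B$ is a hypothetical lazy random walk (rows sum to one, the same matrix at every timestep) and the payoff is call-like in the grid index; neither comes from the slides.

```python
def matvec(B, x):
    return [sum(b * xj for b, xj in zip(row, x)) for row in B]

def matTvec(B, y):
    return [sum(B[i][j] * y[i] for i in range(len(B))) for j in range(len(B[0]))]

J, N = 7, 10                          # grid points, timesteps
B = [[0.0] * J for _ in range(J)]
for i in range(J):
    B[i][i] = 0.5
    B[i][max(i - 1, 0)] += 0.25       # reflect at the lower boundary
    B[i][min(i + 1, J - 1)] += 0.25   # reflect at the upper boundary

f = [max(i - 3.0, 0.0) for i in range(J)]        # payoff on the grid
e = [1.0 if i == 3 else 0.0 for i in range(J)]   # starting node

# backward: u_n = B u_{n+1}, u_N = f, output e^T u_0
u = f
for _ in range(N):
    u = matvec(B, u)
price_backward = sum(ei * ui for ei, ui in zip(e, u))

# forward: P_{n+1} = B^T P_n, P_0 = e, output P_N^T f
P = e
for _ in range(N):
    P = matTvec(B, P)
price_forward = sum(pi * fi for pi, fi in zip(P, f))

print(price_backward, price_forward, sum(P))
```

The forward vector `P` can be reused with any other payoff vector at the cost of one dot product, which is the point made for multiple European options.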
Financial relevance
With implicit time-marching, we have an equation like

\[ A_n u_n = C_n u_{n+1} \]

so

\[ B_n \equiv A_n^{-1} C_n \]

In this case,

\[ B_n^T \equiv C_n^T (A_n^T)^{-1} \]

so

\[ P_{n+1} = C_n^T (A_n^T)^{-1} P_n \]

Note time reversal: multiply by $C_n$ and then by $A_n^{-1}$ turns
into multiply by $(A_n^T)^{-1}$ and then by $C_n^T$
Financial relevance
Which is better – forward or reverse?

forward is best for pricing multiple European options
– a single forward calculation and then a separate
vector dot product for each option
– particularly useful when calibrating a model to vanilla
options?
backward is only possibility for American options, and
also gives Delta and Gamma approximations for free


FDE sensitivities
Suppose we want to compute $P = e^T u_0$ where $u_N = f$ and

\[ u_n = B_n \, u_{n+1} . \]

Now suppose that $f$ and $B_n$ depend on some parameter $\theta$,
and we want to compute the sensitivity to $\theta$.

Standard “forward mode” sensitivity analysis gives $\dot{P} = e^T \dot{u}_0$
where $\dot{u}_N = \dot{f}$ and

\[ \dot{u}_n = B_n \, \dot{u}_{n+1} + \dot{b}_n \]

where

\[ \dot{b}_n \equiv \dot{B}_n \, u_{n+1} \]
FDE sensitivities
What is “reverse mode” adjoint?

Work “backwards” applying the linear algebra rules.

\[ \bar{u}_0 = e \]

\[ \bar{u}_{n+1} = B_n^T \bar{u}_n, \qquad \bar{b}_n = \bar{u}_n \]

\[ \bar{f} = \bar{u}_N \]


FDE sensitivities
This gives $\bar{f}$ and $\bar{b}_n$ and then payoff sensitivity is given by

\[ \bar{\theta} = \bar{f}^T \dot{f} + \sum_n \bar{b}_n^T \dot{b}_n \]

This can be evaluated using AD software, or hand-coded
following the AD algorithm.

$\theta, u_{n+1} \longrightarrow B_n u_{n+1}$ : original code
$\theta, u_{n+1} \longrightarrow \dot{B}_n u_{n+1}$ : forward mode, keeping $u_{n+1}$ fixed
$\theta, u_{n+1}, \bar{b}_n \longrightarrow \bar{\theta}$ incr : reverse mode, keeping $u_{n+1}$ fixed
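The forward and reverse recurrences can be compared on a small explicit example. The model here is an assumption made for the sketch: $B(\theta) = M_0 + \theta M_1$ at every timestep and $f(\theta) = f_0 + \theta \dot{f}$, with arbitrary small matrices.

```python
def matvec(B, x):
    return [sum(b * xj for b, xj in zip(row, x)) for row in B]

def matTvec(B, y):
    return [sum(B[i][j] * y[i] for i in range(len(B))) for j in range(len(B[0]))]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

theta, N, J = 0.3, 5, 4
M0 = [[0.5 if i == j else 0.1 for j in range(J)] for i in range(J)]
M1 = [[0.2 * (i + 1) / (j + 1) for j in range(J)] for i in range(J)]
B = [[M0[i][j] + theta * M1[i][j] for j in range(J)] for i in range(J)]
B_dot = M1                                   # dB/dtheta
f0, f_dot = [1.0, 2.0, 0.5, -1.0], [0.0, 1.0, 2.0, 3.0]
f = [a + theta * b for a, b in zip(f0, f_dot)]
e = [0.0, 1.0, 0.0, 0.0]

# primal sweep, storing every u_n
us = [None] * (N + 1)
us[N] = f
for n in range(N - 1, -1, -1):
    us[n] = matvec(B, us[n + 1])

# forward mode: u_dot_n = B u_dot_{n+1} + b_dot_n, b_dot_n = B_dot u_{n+1}
u_dot = f_dot
for n in range(N - 1, -1, -1):
    b_dot = matvec(B_dot, us[n + 1])
    u_dot = [x + y for x, y in zip(matvec(B, u_dot), b_dot)]
P_dot = dot(e, u_dot)

# reverse mode: u_bar_0 = e, b_bar_n = u_bar_n, u_bar_{n+1} = B^T u_bar_n
u_bar = e
theta_bar = 0.0
for n in range(N):
    b_bar = u_bar
    theta_bar += dot(b_bar, matvec(B_dot, us[n + 1]))
    u_bar = matTvec(B, u_bar)
theta_bar += dot(u_bar, f_dot)               # f_bar = u_bar_N

print(P_dot, theta_bar)
```

With several parameters, the reverse sweep for `u_bar` is done once and only the cheap `theta_bar` accumulation is repeated per parameter.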


FDE sensitivities
We now add 2 extra ingredients:
nonlinearity (e.g. American options using penalty
method)
implicit time-marching

Including these, “forward mode” sensitivity analysis gives
$\dot{P} = e^T \dot{u}_0$ where $\dot{u}_N = \dot{f}$ and

\[ A_n \dot{u}_n = C_n \dot{u}_{n+1} + \dot{b}_n, \]

for some $A_n, C_n, \dot{b}_n$, and “reverse mode” gives

\[ \bar{u}_{n+1} = C_n^T (A_n^T)^{-1} \bar{u}_n, \qquad \bar{b}_n = (A_n^T)^{-1} \bar{u}_n \]


FDE sensitivities
This again gives $\bar{f}$ and $\bar{b}_n$ and AD ideas can then be used to
compute $\bar{\theta}$.

So far, I have talked of $\theta$ being a single input parameter, but
it can be a vector of input parameters.

The key is that they all use the same $\bar{f}$ and $\bar{b}_n$, and it is just
this final AD step which depends on $\theta$, and the cost is
independent of the number of parameters.


What can go wrong?
Differentiation like this gives the sensitivity of the numerical
approximation to changes in the input parameters.

This is not necessarily a good approximation to the true
sensitivity

Simplest example: a digital put option with strike $K$ when
wanting to compute $\frac{\partial V}{\partial K}$, the sensitivity of the option price to
the strike


What can go wrong?
Using the simplest numerical approximation,

\[ f_i = H(K - S_i) \]

and so $\dot{f} = 0$ which leads to a zero sensitivity!

Using a better approximation

\[ f_i = \frac{1}{\Delta S} \int_{S_i - \frac{1}{2} \Delta S}^{S_i + \frac{1}{2} \Delta S} H(K - S) \, dS \]

gives an $O(\Delta S^2)$ approximation to the price, and an $O(\Delta S)$
approximation to the sensitivity to $K$.


What can go wrong?
More generally, discontinuities are not the only problem.

Suppose our analytic problem with input $x$ has solution

\[ u = x^2 \]

and our discrete approximation with step size $h \ll 1$ is

\[ u_h = x^2 + h^2 \sin(x/h) \]

then $u_h - u = O(h^2)$ but $u_h' - u' = O(h)$

This seems to be typical, that in bad cases you lose one
order of convergence each time you differentiate.
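The loss of one order per derivative in this example follows directly from differentiating the error term:

```latex
\[
u_h(x) - u(x) = h^2 \sin(x/h) = O(h^2),
\]
\[
u_h'(x) - u'(x) = h \cos(x/h) = O(h),
\qquad
u_h''(x) - u''(x) = - \sin(x/h) = O(1).
\]
```

Each differentiation brings down a factor $1/h$ from the oscillatory term, so the second derivative of the approximation does not converge at all.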


What can go wrong?
Careful construction of the approximation can usually avoid
these problems.

In the digital put case, the problem was the strike moving
across the grid.

Solution: move the grid with the strike at maturity $t = T$,
keeping the end at the current time $t = 0$ fixed.

\[ \log S_i(t) = \log S_i^{(0)} + \frac{t}{T} \left( \log K - \log K^{(0)} \right) \]

This uses a baseline grid $S_i^{(0)}$ corresponding to the true
strike $K^{(0)}$ then considers perturbations to this which move
with the strike.
Use of adjoint sensitivities
Fokker-Planck discretisation:
standard calculation goes forward in time, then
performs a separate vector dot product for each vanilla
European option
adjoint sensitivity calculation goes backward in time,
gives sensitivity of vanilla prices to initial prices, model
constants
if the Greeks are needed for each option, then a
separate adjoint calculation is needed for each – might
be better to use “forward mode” AD instead, depending
on number of parameters and options
one adjoint calculation can give a weighted average of
Greeks – useful for calibrating a model to market data
Use of adjoint sensitivities
A calibration procedure might find the optimum vector of
parameters $\theta$ which minimises the mean square difference
between vanilla option model prices and market prices:

\[ \frac{1}{2} \sum_k \left( C_{\text{model}}^{(k)}(\theta) - C_{\text{market}}^{(k)} \right)^2 \]

Gradient-based optimisation would need to compute

\[ \sum_k \left( C_{\text{model}}^{(k)} - C_{\text{market}}^{(k)} \right) \frac{\partial C_{\text{model}}^{(k)}}{\partial \theta} \]

which is just a weighted average (with both positive and
negative weights) of the Greeks.
Use of adjoint sensitivities
Since the vanilla option price is of the form

\[ C_{\text{model}}^{(k)} = f_k^T P_N \]

then, provided $f_k$ does not depend on $\theta$, the adjoint
calculation works backwards in time from the “initial”
condition:

\[ \bar{P}_N = \sum_k \left( C_{\text{model}}^{(k)} - C_{\text{market}}^{(k)} \right) f_k \]


Use of adjoint sensitivities
Black-Scholes / backward Kolmogorov discretisation:
standard calculation goes backward in time for pricing
an exotic option, with possible path-dependency and
optional exercise
adjoint sensitivity calculation goes forward in time,
giving sensitivity of price to initial prices, model
constants, etc.



Use of adjoint sensitivities
Many applications may involve a process which goes
through several stages:
market implied vol $\sigma_I$ $\Longrightarrow$ local vol $\sigma_l$ at a few points
using Dupire’s formula
local vol $\sigma_l$ at a few points $\Longrightarrow$ $\sigma_l, \sigma_l'$ through cubic
spline procedure
$\sigma_l, \sigma_l'$ $\Longrightarrow$ $\sigma$ at FDE grid points using cubic spline
interpolation
$\sigma$ at FDE grid points $\Longrightarrow$ exotic option value $V$ using
FDE calculation


Use of adjoint sensitivities
To obtain the sensitivity of the option value to changes in
the market implied vol, go through all of the stages in the
reverse order:
$\bar{V} \Longrightarrow \bar{\sigma}$
$\bar{\sigma} \Longrightarrow \bar{\sigma}_l, \bar{\sigma}_l'$
$\bar{\sigma}_l, \bar{\sigma}_l' \Longrightarrow \bar{\sigma}_l$
$\bar{\sigma}_l \Longrightarrow \bar{\sigma}_I$

Each stage needs to be developed and validated
separately, then they all fit together in a modular way.


Use of adjoint sensitivities
It is not necessary to use adjoint techniques at each stage.

For example, the final stage in the last example computes

\[ \bar{\sigma}_I = \left( \frac{\partial \sigma_l}{\partial \sigma_I} \right)^T \bar{\sigma}_l \]

The matrix $\frac{\partial \sigma_l}{\partial \sigma_I}$
can be obtained by forward mode sensitivity analysis (more
expensive), or approximated by bumping (more expensive
and less accurate)


Monte Carlo sensitivities
Pathwise sensitivity analysis is very simple, in concept

Monte Carlo estimate for option value

\[ M^{-1} \sum_{m=1}^{M} P(S^{(m)}) \]

Standard pathwise estimate for sensitivity

\[ M^{-1} \sum_{m=1}^{M} \frac{\partial P}{\partial S} \, \dot{S}^{(m)} \]

where $\dot{S}$ is path sensitivity, keeping fixed all of the random
numbers
Monte Carlo sensitivities
The corresponding adjoint (reverse mode) sensitivity is

\[ M^{-1} \sum_{m=1}^{M} \bar{\theta}^{(m)} \]

where $\bar{\theta}^{(m)}$ corresponds to $\left( \frac{\partial P}{\partial \theta} \right)^T$ for the mth path

Note: the adjoint sensitivity is the same as the standard
pathwise sensitivity, so it is valid under the same conditions
(e.g. $P(\theta)$ Lipschitz and piecewise differentiable)


Monte Carlo sensitivities
Largely a straightforward application of reverse mode AD,
but a few new things to discuss

path-dependent payoffs (Asian and lookback options)
efficiency improvement for handling multiple European
payoffs (Christoph Kaebe & Ekkehard Sachs)
binning for expensive pre-processing steps
(Luca Capriotti)
handling discontinuous payoffs


Path dependent payoffs
A single path calculation (for a given set of random
numbers) can be described by

\[ S_{n+1} = f_n(\theta; S_n), \quad n = 0, \ldots, N-1 \]

with payoff $P(S)$ depending on the whole path.

Forward mode sensitivity analysis gives

\[ \dot{S}_{n+1} = B_n \dot{S}_n + \dot{b}_n, \quad n = 0, \ldots, N-1 \]

with payoff sensitivity

\[ \dot{P} = \sum_{n=0}^{N} \frac{\partial P}{\partial S_n} \, \dot{S}_n \]
Path dependent payoffs
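When computing Delta, we have $\dot{b}_n = 0$ and so

\[ \dot{P} = \sum_{n=0}^{N} \frac{\partial P}{\partial S_n} \, B_{n-1} B_{n-2} \cdots B_0 \, \dot{S}_0 \]

This is equal to $\bar{S}_0^T \dot{S}_0$ if the adjoint solution is defined by

\[ \bar{S}_N = \left( \frac{\partial P}{\partial S_N} \right)^T \]

and

\[ \bar{S}_n = B_n^T \bar{S}_{n+1} + \left( \frac{\partial P}{\partial S_n} \right)^T, \quad n = N-1, \ldots, 0 \]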
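A minimal sketch of the two Delta recurrences for a path-dependent payoff. The setting is an assumption made for the example: a scalar Euler discretisation of geometric Brownian motion, so $B_n = 1 + r \Delta t + \sigma \sqrt{\Delta t} Z_n$, and a discounted arithmetic Asian call on $S_1, \ldots, S_N$. The function name is introduced here.

```python
import math
import random

def asian_delta_one_path(S0, r, sigma, T, K, N, Z):
    dt = T / N
    # forward sweep, storing the path and B_n = dS_{n+1}/dS_n
    S, B = [S0], []
    for n in range(N):
        B.append(1.0 + r * dt + sigma * math.sqrt(dt) * Z[n])
        S.append(S[n] * B[n])
    avg = sum(S[1:]) / N
    disc = math.exp(-r * T)
    # dP/dS_n for n = 0..N (S_0 is not part of the average)
    dP = [0.0] + [disc / N if avg > K else 0.0] * N

    # forward mode with S_dot_0 = 1
    S_dot, P_dot = 1.0, dP[0]
    for n in range(N):
        S_dot = B[n] * S_dot
        P_dot += dP[n + 1] * S_dot

    # reverse mode: S_bar_N = dP/dS_N, S_bar_n = B_n S_bar_{n+1} + dP/dS_n
    S_bar = dP[N]
    for n in range(N - 1, -1, -1):
        S_bar = B[n] * S_bar + dP[n]
    return P_dot, S_bar

random.seed(0)
results = [asian_delta_one_path(100.0, 0.05, 0.2, 1.0, 100.0, 50,
                                [random.gauss(0.0, 1.0) for _ in range(50)])
           for _ in range(200)]
delta_forward = sum(r[0] for r in results) / len(results)
delta_reverse = sum(r[1] for r in results) / len(results)
print(delta_forward, delta_reverse)
```

For a single input the two modes cost about the same; the adjoint pays off when $\bar{S}_0$ is a vector or when sensitivities to many parameters are needed from the same reverse sweep.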
Path dependent payoffs
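When $\dot{S}_0 = 0$, and there is just one $\dot{b}_n$ which is non-zero,
then the payoff sensitivity is

\[ \dot{P} = \frac{\partial P}{\partial S_N} \, B_{N-1} \cdots B_{n+1} \, \dot{b}_n = \bar{S}_{n+1}^T \dot{b}_n \]

In the most general case therefore, we have

\[ \dot{P} = \bar{S}_0^T \dot{S}_0 + \sum_{n=0}^{N-1} \bar{S}_{n+1}^T \dot{b}_n \]

so $\bar{b}_n \equiv \bar{S}_{n+1}$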


Path dependent payoffs
Having discussed the maths, the good news is that all of
the details should be handled automatically by the AD tools.

If step(n,theta,S) performs the nth timestep, taking
$\theta, S_n$ as input and returning $S_{n+1}$, then the adjoint routine
step_b(n,theta,theta_b,S,S_b) takes inputs
$\theta, \bar{\theta}, S_n, \bar{S}_{n+1}$ and returns an updated $\bar{\theta}$ and $\bar{S}_n$.

The only thing you have to add to $\bar{S}_n$ is $\left( \frac{\partial P}{\partial S_n} \right)^T$.

This could also be handled by AD, but maybe simpler to do
it by hand – e.g. for lookback options you just need to store
which timestep has the minimum or maximum, whereas AD
would need to store lots of other info.
Path dependent payoffs
An alternative point of view / approach is to make the payoff
depend only on the final state $S_N$ by augmenting the state:

$\sum_n S_n$ for Asian options
$\min_n S_n, \ \max_n S_n$ for lookback options

Doing it this way, the whole adjoint code can be generated
by AD.


Path dependent payoffs
Some more implementation detail:
first, go forward through the path storing the state $S_n$ at
each timestep (corresponds to “checkpointing” in AD
terminology)
then, go backwards through the path, using reverse
mode AD for each step – this will re-do the internal
calculations for the timestep and then do its adjoint
when hand-coding for maximum performance, I also
store the result of any very expensive operations
(typically exp) to avoid having to re-do these

Note that this is different from applying AD to the entire
path, which would require a lot of storage – it’s cheaper to
re-calculate the internal variables rather than fetch them
from main memory
Multiple European payoffs
Suppose that you have
$n_\theta$ input parameters
$n_P$ different payoffs
dimension $d$ path simulation

If $n_\theta$ is smallest, use forward mode sensitivity analysis

If $n_P$ is smallest, use reverse mode sensitivity analysis

What if $d$ is smallest?


Multiple European payoffs
Going back to original matrix question, what is the best way
of computing this?

(matrix product diagram omitted)


Multiple European payoffs
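The most efficient approach is
perform $d$ adjoint calculations to determine $\frac{\partial S_N}{\partial \theta}$
perform $d$ forward sensitivity calculations to determine $\frac{\partial P_k}{\partial S_N}$
combine these to obtain

\[ \frac{\partial P_k}{\partial \theta} = \frac{\partial P_k}{\partial S_N} \, \frac{\partial S_N}{\partial \theta} \]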


Binning
The need for binning is best demonstrated by the case of
correlation Greeks.

The standard pricing calculation has three stages
perform Cholesky factorisation
do M path calculations
compute average and confidence interval

How do we compute the adjoint sensitivity to the correlation
coefficients?


Binning
If we apply the reverse mode AD approach to the entire
calculation, then we get an estimate of
the sensitivity of the price
the sensitivity of the confidence interval,
not the confidence interval for the sensitivity!

To get the confidence interval for the sensitivity, for each
path we can do the adjoint of the Cholesky factorisation,
so we compute $\bar{\theta}$ for each path and then compute an
average and confidence interval in the usual way.

However, this greatly increases the computational cost.


Binning
The binning approach splits the $M$ paths into $K$ groups.

For each group, it uses the full AD approach to efficiently
compute an estimate of the price sensitivity.

It then uses the variability between the group averages to
estimate the confidence interval.

Needs
$K \gg 1$ to get a good estimate of the confidence interval
$K \ll M$ for cost of $K$ adjoint Cholesky calculations to
be smaller than $M$ path calculations
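The confidence-interval step of binning can be sketched as follows. The per-path values below are synthetic stand-ins (an assumption for this example); in a real calculation each bin average would come from one reverse sweep over that bin's paths followed by a single adjoint Cholesky step, and per-path values would not be available.

```python
import math
import random
import statistics

random.seed(42)
M, K = 10000, 50            # paths and bins, with 1 << K << M
# synthetic per-path sensitivities, used only to produce the bin averages
path_sens = [random.gauss(0.6, 2.0) for _ in range(M)]

size = M // K
bin_means = [sum(path_sens[k * size:(k + 1) * size]) / size for k in range(K)]

# estimate and standard error from the variability between bins
estimate = statistics.mean(bin_means)
std_err = statistics.stdev(bin_means) / math.sqrt(K)
ci = (estimate - 1.96 * std_err, estimate + 1.96 * std_err)
print(estimate, std_err, ci)
```

Because the bins have equal size, the mean of the bin averages equals the overall path average; only the error estimate changes, and it becomes noisier as K gets smaller.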


Binning
The same approach can be used for a Monte Carlo version
of the earlier example with local volatility:
market implied vol $\sigma_I$ $\Longrightarrow$ local vol $\sigma_l$ at a few points
using Dupire’s formula
local vol $\sigma_l$ at a few points $\Longrightarrow$ $\sigma_l, \sigma_l'$ through cubic
spline procedure
M Monte Carlo path calculations, using spline evaluation
to obtain local volatility
compute average and confidence interval

The adjoint of the path calculation will contribute increments
to $\bar{\sigma}_l$ and $\bar{\sigma}_l'$. Then, for each group of paths, can use adjoint
of first two stages to get an estimate for the sensitivity to
market implied vol data.
Non-smooth payoffs
The biggest limitation of the pathwise sensitivity method
(both forward mode and reverse mode) is that it cannot
handle discontinuous payoffs.

There are 3 main ways to deal with this:
explicitly smoothed payoffs
using conditional expectation to smooth the payoff
“vibrato” Monte Carlo

Of course, one can also switch to Likelihood Ratio Method
or Malliavin calculus, but then I don’t see how one gets the
efficiency benefits of adjoint methods.


Non-smooth payoffs
Explicitly-smoothed payoffs replace the discontinuous
payoff by a smooth (or at least continuous) alternative.

Digital options $P(S) \equiv H(S - K)$ can be replaced by a
piecewise linear version, or something much smoother:

\[ \Phi\left( \frac{S - K}{\delta} \right) \]

This introduces an $O(\delta^2)$ error due to the smoothing, but
with Richardson extrapolation this can be improved to $O(\delta^4)$
by using

\[ \frac{4}{3} \, \Phi\left( \frac{S - K}{\delta} \right) - \frac{1}{3} \, \Phi\left( \frac{S - K}{2 \delta} \right) \]
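The weights 4/3 and −1/3 can be checked in one line, assuming (this is an assumption of the sketch, plausible when the density of $S$ is smooth near $K$) that the smoothing error expands in even powers of $\delta$ with some constant $c$:

```latex
\[
E\left[ \Phi\left( \tfrac{S-K}{\delta} \right) \right] = V + c\, \delta^2 + O(\delta^4),
\qquad V \equiv E\left[ H(S-K) \right],
\]
\[
\tfrac{4}{3} \left( V + c\, \delta^2 \right) - \tfrac{1}{3} \left( V + 4 c\, \delta^2 \right) + O(\delta^4)
= V + O(\delta^4).
\]
```

The second term uses smoothing width $2\delta$, so its leading error is $4 c\, \delta^2$, which is exactly cancelled by the first.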


Non-smooth payoffs
Implicitly-smoothed payoffs use conditional expectations.

My favourite is for barrier options, where a Brownian Bridge
conditional expectation computes the probability that the
path has crossed the barrier within a timestep.
(see Glasserman’s book, pp. 366-370)

This improves the weak convergence to first order, and
makes the payoff differentiable.


Non-smooth payoffs
With digital options, can stop the path simulation one
timestep before maturity.

Conditional on the value $S_{N-1}$, an Euler discretisation for
the final timestep gives a Gaussian p.d.f. for $S_N$:

\[ S_N = S_{N-1} + \mu_{N-1} \Delta t + \sigma_{N-1} \Delta W_{N-1} \]

In simple cases one can then analytically evaluate

\[ E\left[ P(S_N) \,|\, S_{N-1} \right] \]

and this will be a smooth function of $S_{N-1}$ so we can use
the pathwise sensitivity method.
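For a digital call $P(S_N) = H(S_N - K)$ the conditional expectation is a normal CDF. The sketch below assumes $\mu$ and $\sigma$ are constants that do not depend on $S_{N-1}$ (an assumption of this example, so the derivative has a simple form); the function names are introduced here.

```python
import math

def Phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def digital_cond_exp(S_prev, K, mu, sigma, dt):
    """E[H(S_N - K) | S_{N-1}] and its derivative w.r.t. S_{N-1}."""
    d = (S_prev + mu * dt - K) / (sigma * math.sqrt(dt))
    value = Phi(d)
    dvalue_dS = phi(d) / (sigma * math.sqrt(dt))
    return value, dvalue_dS

value, slope = digital_cond_exp(100.0, 101.0, 5.0, 20.0, 0.01)

# central finite-difference check of the pathwise derivative
h = 1e-4
up, _ = digital_cond_exp(100.0 + h, 101.0, 5.0, 20.0, 0.01)
down, _ = digital_cond_exp(100.0 - h, 101.0, 5.0, 20.0, 0.01)
fd_slope = (up - down) / (2.0 * h)
print(value, slope, fd_slope)
```

The smoothed value is differentiable in $S_{N-1}$, so the usual forward or adjoint recurrences can be started from `slope` at timestep $N-1$.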


Non-smooth payoffs
Continuing this digital example, in more complicated
multi-dimensional cases it is not possible to analytically
evaluate the conditional expectation.

Instead, one can apply the Likelihood Ratio Method for the
final step – I called this the “vibrato” method because of the
uncertainty in the final value $S_N$

Need to read my paper for full details. Its main weakness is
that the variance is $O(\Delta t^{-1/2})$, but it is much better than the
$O(\Delta t^{-1})$ variance of the standard Likelihood Ratio Method,
and you get the benefit of adjoints.

Malliavin calculus will give an $O(1)$ variance, but no adjoint
efficiency gains, I think.
Conclusions
adjoints can be very efficient for option pricing,
calibration and sensitivity analysis
same result as “standard” approach but a much lower
computational cost
basic elements of discrete adjoint analysis are very
simple, although real applications can get quite complex
automatic differentiation ideas are very important, even
if you don’t use AD software



Further reading
M.B. Giles and P. Glasserman. ‘Smoking adjoints: fast Monte Carlo Greeks’, RISK, 19(1):88-92, 2006.
M.B. Giles and P. Glasserman. ‘Smoking Adjoints: fast evaluation of Greeks in Monte Carlo calculations’. Numerical Analysis report NA-05/15, 2005.
– original RISK paper, and longer version with appendix on AD

M. Leclerc, Q. Liang, I. Schneider. ‘Fast Monte Carlo Bermudan Greeks’, RISK, 22(7):84-88, 2009.
L. Capriotti and M.B. Giles. ‘Fast correlation Greeks by adjoint algorithmic differentiation’, RISK, 23(5):77-83, 2010.
– correlation Greeks and binning

L. Capriotti and M.B. Giles. ‘Algorithmic differentiation: adjoint Greeks made easy’, RISK, to appear, 2012.
– use of AD
Further reading
M.B. Giles. ‘Monte Carlo evaluation of sensitivities in computational finance’. Numerical Analysis report NA-07/12, 2007.
– use of AD, and introduction of Vibrato idea

M.B. Giles. ‘Vibrato Monte Carlo sensitivities’. In Monte Carlo and Quasi-Monte Carlo Methods 2008, Springer, 2009.
– Vibrato Monte Carlo for discontinuous payoffs

C. Kaebe, J.H. Maruhn and E.W. Sachs. ‘Adjoint-based Monte Carlo calibration of financial market models’. Finance and Stochastics, 13(3):351-379, 2009.
– adjoint Monte Carlo sensitivities and calibration


Further reading
M.B. Giles. ‘On the iterative solution of adjoint equations’, pp.145-152 in Automatic Differentiation: From Simulation to Optimization, G. Corliss, C. Faure, A. Griewank, L. Hascoet, U. Naumann, editors, Springer-Verlag, 2001.
– adjoint treatment of time-marching and fixed point iteration

M.B. Giles. ‘Collected matrix derivative results for forward and reverse mode algorithmic differentiation’. In Advances in Automatic Differentiation, Springer, 2008.
M.B. Giles. ‘An extended collection of matrix derivative results for forward and reverse mode algorithmic differentiation’. Numerical Analysis report NA-08/01, 2008.
– two papers on adjoint linear algebra, second has MATLAB code and tips on code development and validation
