Financial Risk Management - Part 2

Random variate generation
Simulation of stochastic processes
Monte Carlo methods

Agenda

Lecture 1: Introduction to Financial Risk Management


Lecture 2: Market Risk
Lecture 3: Credit Risk
Lecture 4: Counterparty Credit Risk and Collateral Risk
Lecture 5: Operational Risk
Lecture 6: Liquidity Risk
Lecture 7: Asset Liability Management Risk
Lecture 8: Model Risk
Lecture 9: Copulas and Extreme Value Theory
Lecture 10: Monte Carlo Simulation Methods
Lecture 11: Stress Testing and Scenario Analysis
Lecture 12: Credit Scoring Models

Thierry Roncalli Course 2023-2024 in Financial Risk Management 965 / 1695


Electronic copy available at: https://ssrn.com/abstract=4574403
Random variate generation
    Uniform random numbers
    Non-uniform random numbers
    Random vectors
    Random matrices
Simulation of stochastic processes
Monte Carlo methods

Uniform random numbers

The idea is to build a pseudorandom sequence S and to repeat this sequence as often as necessary


Linear congruential generator

The most famous and most widely used algorithm is the linear congruential generator (LCG):

xn = (a · xn−1 + c) mod m
un = xn / m

where:
a is the multiplicative constant
c is the additive constant
m is the modulus (or the order of the congruence)
The initial number x0 is called the seed
{x1 , x2 , . . . , xn } is a sequence of pseudorandom integer numbers
(0 ≤ xn < m)
{u1 , u2 , . . . , un } is a sequence of uniform random variates
The maximum period is m

Example #1
If we consider that a = 3, c = 0, m = 11 and x0 = 1, we obtain the
following sequence:

{1, 3, 9, 5, 4, 1, 3, 9, 5, 4, 1, 3, 9, 5, 4, . . .}

The period length is only five, meaning that only five uniform random
variates can be generated: 0.09091, 0.27273, 0.81818, 0.45455 and
0.36364
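The LCG recurrence is easy to sketch in Python (a minimal illustration; the generator function name `lcg` is our own). With the parameters of Example #1, it reproduces the period-5 cycle:

```python
def lcg(a, c, m, seed):
    # Linear congruential generator: x_n = (a * x_{n-1} + c) mod m, u_n = x_n / m
    x = seed
    while True:
        x = (a * x + c) % m
        yield x, x / m

# Parameters of Example #1: a = 3, c = 0, m = 11, x0 = 1
gen = lcg(a=3, c=0, m=11, seed=1)
seq = [next(gen)[0] for _ in range(10)]
print(seq)  # [3, 9, 5, 4, 1, 3, 9, 5, 4, 1]: the period length is only 5
```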



The minimal standard LCG proposed by Lewis et al. (1969) is defined by a = 7^5, c = 0 and m = 2^31 − 1
Its period length is equal to m − 1 = 2^31 − 2 ≈ 2.15 × 10^9

Table: Simulation of 10 uniform pseudorandom numbers (left columns: seed x0 = 1; right columns: seed x0 = 123 456)

n xn un xn un
0 1 0.000000 123 456 0.000057
1 16 807 0.000008 2 074 924 992 0.966212
2 282 475 249 0.131538 277 396 911 0.129173
3 1 622 650 073 0.755605 22 885 540 0.010657
4 984 943 658 0.458650 237 697 967 0.110687
5 1 144 108 930 0.532767 670 147 949 0.312062
6 470 211 272 0.218959 1 772 333 975 0.825307
7 101 027 544 0.047045 2 018 933 935 0.940139
8 1 457 850 878 0.678865 1 981 022 945 0.922486
9 1 458 777 923 0.679296 466 173 527 0.217079
10 2 007 237 709 0.934693 958 124 033 0.446161


Figure: Lattice structure of the linear congruential generator


Multiple recursive generator

We have:

xn = (a1 · xn−1 + · · · + ak · xn−k + c) mod m

The famous MRG32k3a generator of L’Ecuyer (1999) uses two 32-bit multiple recursive generators:

xn = (1403580 · xn−2 − 810728 · xn−3) mod m1
yn = (527612 · yn−1 − 1370589 · yn−3) mod m2

where m1 = 2^32 − 209 and m2 = 2^32 − 22853. The uniform random variate is then equal to:

un = (xn − yn + 1 {xn ≤ yn} · m1) / (m1 + 1)

The period length of this generator is equal to 2^191 ≈ 3 × 10^57
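A direct Python transcription of the two recurrences (a sketch: the seed values 12345 are our own assumption, not part of the slide):

```python
M1 = 2**32 - 209
M2 = 2**32 - 22853

def mrg32k3a(x_seeds, y_seeds):
    # Two 32-bit multiple recursive generators combined into one uniform stream
    x, y = list(x_seeds), list(y_seeds)   # states [x_{n-3}, x_{n-2}, x_{n-1}]
    while True:
        xn = (1403580 * x[-2] - 810728 * x[-3]) % M1
        yn = (527612 * y[-1] - 1370589 * y[-3]) % M2
        x = x[1:] + [xn]
        y = y[1:] + [yn]
        yield (xn - yn + (M1 if xn <= yn else 0)) / (M1 + 1)

gen = mrg32k3a([12345] * 3, [12345] * 3)
us = [next(gen) for _ in range(10)]
```

Since xn − yn + 1{xn ≤ yn}·m1 always lies in (0, m1], every variate falls strictly inside (0, 1).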



We now consider a random variable X whose distribution function is denoted by F. There are many ways to simulate X, but all of them are based on uniform random variates


Method of inversion
Continuous random variables

We assume that F is continuous
Let Y = F (X) be the integral transform of X
Its cumulative distribution function G is equal to:

G (y) = Pr {Y ≤ y}
      = Pr {F (X) ≤ y}
      = Pr {X ≤ F−1 (y)}
      = F (F−1 (y))
      = y

where G (0) = 0 and G (1) = 1


We deduce that F (X) has a uniform distribution U[0,1]:

F (X) ∼ U[0,1]

If U is a uniform random variable, then F−1 (U) is a random variable whose distribution function is F:

U ∼ U[0,1] ⇒ F−1 (U) ∼ F

To simulate a sequence of random variates {x1, . . . , xn}, we can simulate a sequence of uniform random variates {u1, . . . , un} and apply the transform xi ← F−1 (ui)


Example #2
If we consider the generalized uniform distribution U[a,b] , we have
F (x) = (x − a) / (b − a) and F−1 (u) = a + (b − a) u. The simulation of
random variates xi is deduced from the uniform random variates ui by
using the following transform:

xi ← a + (b − a) ui


Example #3
In the case of the exponential distribution E (λ), we have F (x) = 1 − exp (−λx). We deduce that:

xi ← − ln (1 − ui) / λ

Since 1 − U is also a uniformly distributed random variable, we have:

xi ← − ln (ui) / λ
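A quick sanity check of the exponential inversion in Python (a sketch; the helper name `rexp` is ours):

```python
import math
import random

def rexp(lam, n, rng):
    # Inversion: x = -ln(u) / lam, with u drawn in (0, 1] so the log is defined
    return [-math.log(1.0 - rng.random()) / lam for _ in range(n)]

rng = random.Random(42)
xs = rexp(lam=2.0, n=100_000, rng=rng)
mean = sum(xs) / len(xs)  # the theoretical mean is 1 / lam = 0.5
```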


Example #4
In the case of the Pareto distribution P (α, x−), we have F (x) = 1 − (x/x−)^−α and F−1 (u) = x− (1 − u)^−1/α. We deduce that:

xi ← x− / (1 − ui)^1/α


The method of inversion is easy to implement when we know the analytical expression of F−1
When this is not the case, we can use the Newton-Raphson algorithm:

xi(m+1) = xi(m) + (ui − F (xi(m))) / f (xi(m))

where xi(m) is the approximate solution of the equation F (x) = ui at iteration m
If we apply this algorithm to the Gaussian distribution N (0, 1), we have:

xi(m+1) = xi(m) + (ui − Φ (xi(m))) / φ (xi(m))
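A sketch of this Newton-Raphson inversion of Φ using only the standard library (`math.erf` gives Φ; the function names are ours):

```python
import math

def norm_cdf(x):
    # Phi(x) expressed with the error function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def inv_norm(u, x0=0.0, tol=1e-12, max_iter=100):
    # Newton-Raphson iterations: x <- x + (u - Phi(x)) / phi(x)
    x = x0
    for _ in range(max_iter):
        step = (u - norm_cdf(x)) / norm_pdf(x)
        x += step
        if abs(step) < tol:
            break
    return x

x = inv_norm(norm_cdf(1.0))  # recovers x = 1 from u = Phi(1)
```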


Method of inversion
Discrete random variables

In the case of a discrete probability distribution {(x1, p1), (x2, p2), . . . , (xn, pn)} where x1 < x2 < . . . < xn, we have:

F−1 (u) = x1 if 0 ≤ u ≤ p1
          x2 if p1 < u ≤ p1 + p2
          ...
          xn if p1 + · · · + pn−1 < u ≤ 1


We assume that:
xi 1 2 4 6 7 9 10
pi 10% 20% 10% 5% 20% 30% 5%
F (xi ) 10% 30% 40% 45% 65% 95% 100%

The inverse function is a step function
If u = 0.5517, then X = F−1 (u) = F−1 (0.5517) = 7
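The step-function inverse amounts to a search in the cumulative probabilities, which a binary search does efficiently (a sketch; the helper name `inv_discrete` is ours):

```python
from bisect import bisect_left

# Discrete distribution of the table above
values = [1, 2, 4, 6, 7, 9, 10]
probs = [0.10, 0.20, 0.10, 0.05, 0.20, 0.30, 0.05]

# cumulative distribution F(x_k)
cdf = []
acc = 0.0
for p in probs:
    acc += p
    cdf.append(acc)

def inv_discrete(u):
    # smallest x_k such that u <= F(x_k)
    return values[bisect_left(cdf, u)]

x = inv_discrete(0.5517)  # returns 7, as in the slide example
```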


Figure: Inversion method when X is a discrete random variable


Example #5
If we apply the method of inversion to the Bernoulli distribution B (p), we have:

x ← 0 if 0 ≤ u ≤ 1 − p
x ← 1 if 1 − p < u ≤ 1

or:

x ← 1 if u ≤ p
x ← 0 if u > p


Method of inversion
Piecewise distribution functions

A piecewise distribution function is defined as follows:

F (x) = Fm (x) if x ∈ [x⋆m−1, x⋆m]

where the x⋆m are the knots of the piecewise function and:

Fm+1 (x⋆m) = Fm (x⋆m)

In this case, the simulated value xi is obtained using a search algorithm:

xi ← F−1m (ui) if F (x⋆m−1) < ui ≤ F (x⋆m)


We consider the piecewise exponential model
The survival function has the following expression:

S (t) = S (t⋆m−1) e^{−λm (t−t⋆m−1)} if t ∈ [t⋆m−1, t⋆m]

We know that S (τ) ∼ U[0,1]
It follows that:

ti ← t⋆m−1 + (1/λm) ln (S (t⋆m−1) / ui) if S (t⋆m) < ui ≤ S (t⋆m−1)


Example #6
We model the default time τ with the piecewise exponential model and
the following parameters:

λ = 5% if t is less than or equal to one year
    8% if t is between one and five years
    12% if t is larger than five years


We have S (0) = 1, S (1) = 0.9512 and S (5) = 0.6907. We deduce that:



 0 + (1/0.05) · ln (1/ui ) if ui ∈ [0.9512, 1]
ti ← 1 + (1/0.08) · ln (0.9512/ui ) if ui ∈ [0.6907, 0.9512[
5 + (1/0.12) · ln (0.6907/ui ) if ui ∈ [0, 0.6907[

Table: Simulation of the piecewise exponential model


ui t⋆m−1 S (t⋆m−1) λm ti
0.9950 0 1.0000 0.05 0.1003
0.3035 5 0.6907 0.12 11.8531
0.5429 5 0.6907 0.12 7.0069
0.9140 1 0.9512 0.08 1.4991
0.7127 1 0.9512 0.08 4.6087
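The table can be reproduced with a short Python sketch of the piecewise exponential inversion (the helper names are ours):

```python
import math

# Example #6: lambda = 5% on [0, 1], 8% on (1, 5], 12% beyond 5 years
knots = [0.0, 1.0, 5.0]
lambdas = [0.05, 0.08, 0.12]

# Survival values at the knots: S(0) = 1, S(1) = e^{-0.05}, S(5) = e^{-0.37}
S = [1.0]
for (t0, t1), lam in zip(zip(knots, knots[1:]), lambdas):
    S.append(S[-1] * math.exp(-lam * (t1 - t0)))

def simulate_tau(u):
    # Find the segment with S(t_m) < u <= S(t_{m-1}) and invert the survival function
    for m, lam in enumerate(lambdas):
        lower = S[m + 1] if m + 1 < len(S) else 0.0
        if lower < u <= S[m]:
            return knots[m] + math.log(S[m] / u) / lam
    raise ValueError("u must lie in (0, 1]")

print(round(simulate_tau(0.9950), 4))  # 0.1003, the first row of the table
```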


Method of transformation

Let {Y1, Y2, . . .} be a vector of independent random variables. The simulation of the random variable X = g (Y1, Y2, . . .) is straightforward if we know how to easily simulate the random variables Yi. We notice that the inversion method is a particular case of the transformation method, because we have:

X = g (U) = F−1 (U)


The binomial random variable is the sum of n iid Bernoulli random variables:

B (n, p) = ∑_{i=1}^{n} Bi (p)

We simulate the binomial random variate x using n uniform random numbers:

x = ∑_{i=1}^{n} 1 {ui ≤ p}
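This transform is a one-liner in Python (a sketch; the helper name `rbinom` is ours):

```python
import random

def rbinom(n, p, rng):
    # Count how many of the n uniforms fall below p (sum of n Bernoulli variates)
    return sum(1 for _ in range(n) if rng.random() <= p)

rng = random.Random(7)
draws = [rbinom(20, 0.3, rng) for _ in range(50_000)]
mean = sum(draws) / len(draws)  # the theoretical mean is n * p = 6
```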


To simulate the chi-squared random variable χ2 (ν), we can use the following relationship:

χ2 (ν) = ∑_{i=1}^{ν} χ2i (1) = ∑_{i=1}^{ν} (Ni (0, 1))^2


Box-Muller algorithm
If U1 and U2 are two independent uniform random variables, then X1 and X2 defined by:

X1 = √(−2 ln U1) · cos (2πU2)
X2 = √(−2 ln U1) · sin (2πU2)

are independent and follow the Gaussian distribution N (0, 1)
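The Box-Muller transform can be sketched as follows (a minimal illustration; the function name is ours):

```python
import math
import random

def box_muller(rng):
    # Map two independent uniforms to two independent N(0,1) variates
    u1 = 1.0 - rng.random()   # force u1 into (0, 1] so that log(u1) is defined
    u2 = rng.random()
    r = math.sqrt(-2.0 * math.log(u1))
    return r * math.cos(2.0 * math.pi * u2), r * math.sin(2.0 * math.pi * u2)

rng = random.Random(0)
zs = [z for _ in range(50_000) for z in box_muller(rng)]
mean = sum(zs) / len(zs)
var = sum(z * z for z in zs) / len(zs)  # close to 0 and 1 respectively
```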

If Nt is a Poisson process with intensity λ, the duration T between two consecutive events is exponentially distributed:

Pr (T ≤ t) = 1 − e^−λt

Since the durations are independent, we have:

T1 + T2 + . . . + Tn = ∑_{i=1}^{n} Ei

where Ei ∼ E (λ)
Because the Poisson random variable is the number of events that occur in the unit interval of time, we also have:

X = max {n : T1 + T2 + . . . + Tn ≤ 1} = max {n : ∑_{i=1}^{n} Ei ≤ 1}


We notice that:

∑_{i=1}^{n} Ei = −(1/λ) ∑_{i=1}^{n} ln Ui = −(1/λ) ln ∏_{i=1}^{n} Ui

where the Ui are iid uniform random variables
We deduce that:

X = max {n : −(1/λ) ln ∏_{i=1}^{n} Ui ≤ 1} = max {n : ∏_{i=1}^{n} Ui ≥ e^−λ}


We can then simulate the Poisson random variable with the following
algorithm:
1 set n = 0 and p = 1;
2 calculate n = n + 1 and p = p · ui where ui is a uniform random
variate;
3 if p ≥ e −λ , go back to step 2; otherwise, return X = n − 1
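The three steps above translate directly into Python (a sketch; the helper name `rpois` is ours):

```python
import math
import random

def rpois(lam, rng):
    # Multiply uniforms until the product drops below e^{-lam}, then return n - 1
    n, prod, threshold = 0, 1.0, math.exp(-lam)
    while prod >= threshold:
        n += 1
        prod *= rng.random()
    return n - 1

rng = random.Random(123)
draws = [rpois(3.0, rng) for _ in range(50_000)]
mean = sum(draws) / len(draws)  # the theoretical mean is lam = 3
```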


Rejection sampling

Theorem
F (x) and G (x) are two distribution functions, with density functions f (x) and g (x), such that f (x) ≤ c · g (x) for all x with c > 1
We note X ∼ G and consider an independent uniform random
variable U ∼ U[0,1]
Then, the conditional distribution function of X given that
U ≤ f (X ) / (cg (X )) is F (x)


Proof
Let us introduce the random variables B and Z:

B = 1 {U ≤ f (X) / (cg (X))} and Z = X | U ≤ f (X) / (cg (X))

We have:

Pr {B = 1} = Pr {U ≤ f (X) / (cg (X))}
           = E [f (X) / (cg (X))]
           = ∫_{−∞}^{+∞} (f (x) / (cg (x))) g (x) dx
           = (1/c) ∫_{−∞}^{+∞} f (x) dx
           = 1/c

Proof
The distribution function of Z is defined by:

Pr {Z ≤ x} = Pr {X ≤ x | U ≤ f (X) / (cg (X))}

We deduce that:

Pr {Z ≤ x} = Pr {X ≤ x, U ≤ f (X) / (cg (X))} / Pr {U ≤ f (X) / (cg (X))}
           = c ∫_{−∞}^{x} ∫_{0}^{f (x)/(cg (x))} g (x) du dx
           = c ∫_{−∞}^{x} (f (x) / (cg (x))) g (x) dx
           = ∫_{−∞}^{x} f (x) dx
           = F (x)

This proves that Z ∼ F

Acceptance-rejection algorithm
1 generate two independent random variates x and u from G and U[0,1] ;
2 calculate v as follows:

v = f (x) / (cg (x))

3 if u ≤ v, return x (‘accept’); otherwise, go back to step 1 (‘reject’)

Remark
The underlying idea of this algorithm is then to simulate the distribution
function F by assuming that it is easier to generate random numbers from
G, which is called the proposal distribution. However, some of these
random numbers must be ‘rejected’, because the function c · g (x)
‘dominates’ the density function f (x)


The number of iterations N needed to successfully generate Z has a geometric distribution G (p), where p = Pr {B = 1} = c^−1 is the acceptance ratio
The average number of iterations is equal to:

E [N] = 1/p = c

To maximize the efficiency (or the acceptance ratio) of the algorithm, we have to choose the constant c such that:

c = sup_x f (x) / g (x)


We consider the normal distribution N (0, 1)
We use the Cauchy distribution as the proposal distribution:

g (x) = 1 / (π (1 + x^2))

We can show that:

φ (x) ≤ (√(2π) / e^0.5) · g (x)

meaning that c = √(2π) / e^0.5 ≈ 1.52
We have:

G (x) = 1/2 + (1/π) arctan x

and:

G−1 (u) = tan (π (u − 1/2))


Figure: Rejection sampling applied to the normal distribution


Acceptance-rejection algorithm for simulating N (0, 1)
1 generate two independent uniform random variates u1 and u2 and set:

x ← tan (π (u1 − 1/2))

2 calculate v as follows:

v = e^0.5 φ (x) / (√(2π) g (x)) = (1 + x^2) / (2 e^{(x^2−1)/2})

3 if u2 ≤ v, accept x; otherwise, go back to step 1

The acceptance ratio is 1/1.52 ≈ 65.8%
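This algorithm can be sketched in Python (the function name is ours; v is computed with exp(−·) so that large Cauchy proposals underflow to v = 0 instead of overflowing):

```python
import math
import random

def rnorm_rejection(rng):
    # Cauchy proposal x = tan(pi * (u1 - 1/2)); accept when u2 <= v
    while True:
        u1, u2 = rng.random(), rng.random()
        x = math.tan(math.pi * (u1 - 0.5))
        # v = (1 + x^2) / (2 * e^{(x^2 - 1)/2}), written with exp(-...) for stability
        v = (1.0 + x * x) * math.exp(-(x * x - 1.0) / 2.0) / 2.0
        if u2 <= v:
            return x

rng = random.Random(2024)
draws = [rnorm_rejection(rng) for _ in range(20_000)]
mean = sum(draws) / len(draws)
var = sum(x * x for x in draws) / len(draws)  # close to 0 and 1 respectively
```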


Table: Simulation of the standard Gaussian distribution using the acceptance-rejection algorithm

u1 u2 x v test z
0.9662 0.1291 9.3820 0.0000 reject
0.0106 0.1106 −30.0181 0.0000 reject
0.3120 0.8253 −0.6705 0.9544 accept −0.6705
0.9401 0.9224 5.2511 0.0000 reject
0.2170 0.4461 −1.2323 0.9717 accept −1.2323
0.6324 0.0676 0.4417 0.8936 accept 0.4417
0.6577 0.1344 0.5404 0.9204 accept 0.5404
0.1596 0.6670 −1.8244 0.6756 accept −1.8244
0.4183 0.3872 −0.2625 0.8513 accept −0.2625
0.9625 0.0752 8.4490 0.0000 reject


Figure: Comparison of the exact and simulated densities


Method of mixtures
A finite mixture can be decomposed as a weighted sum of distribution functions:

F (x) = ∑_{k=1}^{n} πk · Gk (x)

where πk ≥ 0 and ∑_{k=1}^{n} πk = 1
The probability density function is:

f (x) = ∑_{k=1}^{n} πk · gk (x)

To simulate the probability distribution F, we introduce the random variable B, whose probability mass function is defined by:

p (k) = Pr {B = k} = πk

It follows that:

F (x) = ∑_{k=1}^{n} Pr {B = k} · Gk (x)

We deduce the following algorithm:


1 generate the random variate b from the probability mass function
p (k)
2 generate the random variate x from the probability distribution Gb (x)
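As an illustration of the two steps, here is a sketch for a two-component Gaussian mixture (the weights and component parameters are our own example, not from the slides):

```python
import random

weights = [0.3, 0.7]    # pi_k
means = [-2.0, 1.0]     # parameters of the components G_k
sigmas = [0.5, 1.0]

def rmixture(rng):
    # step 1: draw the component index b from the probability mass function p(k) = pi_k
    b = 0 if rng.random() <= weights[0] else 1
    # step 2: draw x from the selected component G_b
    return rng.gauss(means[b], sigmas[b])

rng = random.Random(1)
xs = [rmixture(rng) for _ in range(100_000)]
mean = sum(xs) / len(xs)  # theoretical mean: 0.3 * (-2) + 0.7 * 1 = 0.1
```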


The previous approach can be easily extended to continuous mixtures:

f (x) = ∫_Ω π (ω) g (x; ω) dω

where ω ∈ Ω is a parameter of the distribution G


The negative binomial distribution is a gamma-Poisson mixture distribution:

N B (r, p) ∼ P (Λ) where Λ ∼ G (r, (1 − p) /p)

To simulate the negative binomial distribution, we simulate
1 the gamma random variate g ∼ G (r, (1 − p) /p)
2 and then the Poisson random variable, whose parameter λ is equal to g
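A sketch of the two-step simulation (the helper names are ours; under this parameterization the mixture mean is r (1 − p) / p):

```python
import math
import random

def rpois(lam, rng):
    # product-of-uniforms Poisson sampler (see the method of transformation)
    n, prod, threshold = 0, 1.0, math.exp(-lam)
    while prod >= threshold:
        n += 1
        prod *= rng.random()
    return n - 1

def rnegbin(r, p, rng):
    # step 1: gamma mixing variate with shape r and scale (1 - p) / p
    lam = rng.gammavariate(r, (1.0 - p) / p)
    # step 2: Poisson variate with parameter lam
    return rpois(lam, rng)

rng = random.Random(5)
draws = [rnegbin(2.0, 0.5, rng) for _ in range(50_000)]
mean = sum(draws) / len(draws)  # theoretical mean: r * (1 - p) / p = 2
```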


Random vectors

The random vector X = (X1 , . . . , Xn ) has a given distribution function


F (x) = F (x1 , . . . , xn )


Method of conditional distributions

If X1, . . . , Xn are independent, we have:

F (x1, . . . , xn) = ∏_{i=1}^{n} Fi (xi)

To simulate X, we can then generate each component Xi ∼ Fi individually, for example by applying the method of inversion


If X1, . . . , Xn are dependent, we have:

F (x1, . . . , xn) = F1 (x1) F2|1 (x2 | x1) F3|1,2 (x3 | x1, x2) × · · · × Fn|1,...,n−1 (xn | x1, . . . , xn−1)
                  = ∏_{i=1}^{n} Fi|1,...,i−1 (xi | x1, . . . , xi−1)

where Fi|1,...,i−1 (xi | x1, . . . , xi−1) is the conditional distribution of Xi given X1 = x1, . . . , Xi−1 = xi−1
This ‘conditional’ random variable is denoted by Yi = Xi | X1 = x1, . . . , Xi−1 = xi−1
The random variables (Y1, . . . , Yn) are independent


We obtain the following algorithm:


1 generate x1 from F1 (x) and set i = 2
2 generate xi from Fi|1,...,i−1 (x | x1 , . . . , xi−1 ) given
X1 = x1 , . . . , Xi−1 = xi−1 and set i = i + 1
3 repeat step 2 until i = n


Fi|1,...,i−1 (x | x1, . . . , xi−1) is a univariate distribution function, which depends on the argument x and the parameters x1, . . . , xi−1. To simulate it, we can therefore use the method of inversion:

xi ← F−1i|1,...,i−1 (ui | x1, . . . , xi−1)

where F−1i|1,...,i−1 is the inverse of the conditional distribution function and ui is a uniform random variate


Example #7
We consider the bivariate logistic distribution defined as:

F (x1, x2) = (1 + e^−x1 + e^−x2)^−1


We have F1 (x1) = F (x1, +∞) = (1 + e^−x1)^−1. We deduce that the conditional distribution of X2 given X1 = x1 is:

F2|1 (x2 | x1) = F (x1, x2) / F1 (x1)
              = (1 + e^−x1) / (1 + e^−x1 + e^−x2)

We obtain:

F−11 (u) = ln u − ln (1 − u)

and:

F−12|1 (u | x1) = ln u − ln (1 − u) − ln (1 + e^−x1)



We deduce the following algorithm:
1 generate two independent uniform random variates u1 and u2 ;
2 generate x1 from u1 :

x1 ← ln u1 − ln (1 − u1 )

3 generate x2 from u2 and x1 :

x2 ← ln u2 − ln (1 − u2 ) − ln (1 + e −x1 )

Because we have (1 + e −x1 )−1 = u1 , the last step can be replaced by:
3 generate x2 from u2 and u1 :

x2 ← ln (u1 u2 / (1 − u2 ))
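As a minimal sketch of this two-step algorithm (Python with NumPy is an assumed choice, not part of the slides):

```python
import numpy as np

def simulate_bivariate_logistic(n, rng):
    """Method of conditional distributions for Example #7:
    x1 = F1^{-1}(u1), then x2 = F^{-1}_{2|1}(u2 | x1)."""
    u1 = rng.uniform(size=n)
    u2 = rng.uniform(size=n)
    x1 = np.log(u1) - np.log(1.0 - u1)       # inverse of the logistic cdf
    x2 = np.log(u2) - np.log(1.0 - u2) - np.log(1.0 + np.exp(-x1))
    return x1, x2

rng = np.random.default_rng(0)
x1, x2 = simulate_bivariate_logistic(100000, rng)
```

Since (1 + e −x1 )−1 = u1 , the last line is equivalent to the shortcut x2 ← ln (u1 u2 / (1 − u2 )), and the simulated x1 follows a standard logistic distribution (median zero).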


The method of conditional distributions can be used for simulating uniform random vectors (U1 , . . . , Un ) generated by copula functions
We have:

C (u1 , . . . , un ) = C1 (u1 ) C2|1 (u2 | u1 ) C3|1,2 (u3 | u1 , u2 ) × · · · × Cn|1,...,n−1 (un | u1 , . . . , un−1 )
                    = ∏ni=1 Ci|1,...,i−1 (ui | u1 , . . . , ui−1 )

where Ci|1,...,i−1 (ui | u1 , . . . , ui−1 ) is the conditional distribution of Ui given U1 = u1 , . . . , Ui−1 = ui−1
By definition, we have C1 (u1 ) = u1



We obtain the following algorithm:
1 generate n independent uniform random variates v1 , . . . , vn ;
2 generate u1 ← v1 and set i = 2;
3 generate ui by finding the root of the equation:

Ci|1,...,i−1 (ui | u1 , . . . , ui−1 ) = vi

and set i = i + 1;
4 repeat step 3 until i = n.
For some copula functions, there exists an analytical expression of the
inverse of the conditional copula. In this case, the third step is replaced by:
3 generate ui by the inversion method:

ui ← C−1i|1,...,i−1 (vi | u1 , . . . , ui−1 )
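When no analytical inverse exists, step 3 is a monotone one-dimensional root search. A sketch with a simple bisection solver, using the Clayton conditional copula C2|1 (u2 | u1 ) = ∂1 C (u1 , u2 ) as a test case (Python/NumPy and the value θ = 1.5 are illustrative assumptions):

```python
import numpy as np

theta = 1.5  # illustrative Clayton parameter

def clayton_cond(u2, u1):
    """C_{2|1}(u2 | u1) = dC(u1, u2)/du1 for the Clayton copula."""
    s = u1 ** -theta + u2 ** -theta - 1.0
    return u1 ** (-theta - 1.0) * s ** (-1.0 / theta - 1.0)

def invert_conditional(v, u1, tol=1e-10):
    """Step 3: solve C_{2|1}(u2 | u1) = v by bisection
    (C_{2|1} is increasing in u2, from 0 to 1)."""
    lo, hi = 1e-12, 1.0 - 1e-12
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if clayton_cond(mid, u1) < v:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

u2 = invert_conditional(0.4351, 0.2837)
```

For these inputs the numerical root agrees with the analytical conditional inverse of the Clayton copula (u2 ≈ 0.3296).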



For any probability distribution, the conditional distribution can be calculated as follows:

Fi|1,...,i−1 (xi | x1 , . . . , xi−1 ) = F (x1 , . . . , xi−1 , xi ) / F (x1 , . . . , xi−1 )

In particular, we have:

∂1 F (x1 , x2 ) = f1 (x1 ) · F2|1 (x2 | x1 )

For copula functions, the density f1 (x1 ) is equal to 1, meaning that:

C2|1 (u2 | u1 ) = ∂1 C (u1 , u2 )

We can generalize this result and show that the conditional copula given some random variables Ui for i ∈ Ω is equal to the cross-derivative of the copula function C with respect to the arguments ui for i ∈ Ω


Archimedean copulas are defined as:

C (u1 , u2 ) = ϕ−1 (ϕ (u1 ) + ϕ (u2 ))

where ϕ (u) is the generator function
We have:

ϕ (C (u1 , u2 )) = ϕ (u1 ) + ϕ (u2 )

and:

ϕ0 (C (u1 , u2 )) · ∂ C (u1 , u2 ) / ∂ u1 = ϕ0 (u1 )

We deduce the following expression of the conditional copula:

C2|1 (u2 | u1 ) = ∂ C (u1 , u2 ) / ∂ u1 = ϕ0 (u1 ) / ϕ0 (ϕ−1 (ϕ (u1 ) + ϕ (u2 )))

The calculation of the inverse function gives:

C−12|1 (v | u1 ) = ϕ−1 (ϕ (ϕ0−1 (ϕ0 (u1 ) / v)) − ϕ (u1 ))

We obtain the following algorithm for simulating Archimedean copulas:
1 generate two independent uniform random variates v1 and v2 ;
2 generate u1 ← v1 ;
3 generate u2 by the inversion method:

u2 ← ϕ−1 (ϕ (ϕ0−1 (ϕ0 (u1 ) / v2 )) − ϕ (u1 ))


Example #8
We consider the Clayton copula:

C (u1 , u2 ) = (u1−θ + u2−θ − 1)−1/θ


The Clayton copula is an Archimedean copula, whose generator function is:

ϕ (u) = u −θ − 1

We deduce that:

ϕ−1 (u) = (1 + u)−1/θ
ϕ0 (u) = −θ u −(θ+1)
ϕ0−1 (u) = (−u/θ)−1/(θ+1)

We obtain:

C−12|1 (v | u1 ) = (1 + u1−θ (v −θ/(θ+1) − 1))−1/θ

Table: Simulation of the Clayton copula

 Random uniform      Clayton copula     Clayton copula
    variates            θ = 0.01           θ = 1.5
   v1       v2        u1       u2        u1       u2
 0.2837   0.4351    0.2837   0.4342    0.2837   0.3296
 0.0386   0.2208    0.0386   0.2134    0.0386   0.0297
 0.3594   0.5902    0.3594   0.5901    0.3594   0.5123
 0.3612   0.3268    0.3612   0.3267    0.3612   0.3247
 0.0797   0.6479    0.0797   0.6436    0.0797   0.1704
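The table can be reproduced with a few lines of NumPy (a sketch; the language choice is an assumption):

```python
import numpy as np

def clayton_pair(v1, v2, theta):
    """Map independent uniforms (v1, v2) to Clayton draws (u1, u2)
    through the analytical conditional inverse C^{-1}_{2|1}."""
    u1 = np.asarray(v1, dtype=float)
    w = np.asarray(v2, dtype=float) ** (-theta / (theta + 1.0)) - 1.0
    u2 = (1.0 + u1 ** -theta * w) ** (-1.0 / theta)
    return u1, u2

v1 = np.array([0.2837, 0.0386, 0.3594, 0.3612, 0.0797])
v2 = np.array([0.4351, 0.2208, 0.5902, 0.3268, 0.6479])
u1, u2 = clayton_pair(v1, v2, theta=1.5)
```

As θ → 0 the Clayton copula tends to the product copula, which is why the θ = 0.01 columns stay close to the input uniforms.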


Method of transformation

To simulate a Gaussian random vector X ∼ N (µ, Σ), we consider the following transformation:

X = µ + A · N

where AA> = Σ and N ∼ N (0, I )
Since Σ is a positive definite symmetric matrix, it has a unique Cholesky decomposition:

Σ = PP >

where P is a lower triangular matrix
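A sketch of the Cholesky-based simulation (NumPy assumed; the values of µ and Σ below are illustrative, not taken from the slides):

```python
import numpy as np

mu = np.array([0.0, 1.0])
Sigma = np.array([[1.0, 0.5],
                  [0.5, 2.0]])

P = np.linalg.cholesky(Sigma)          # lower triangular with P P' = Sigma

rng = np.random.default_rng(0)
N = rng.standard_normal((100000, 2))   # N ~ N(0, I), one row per draw
X = mu + N @ P.T                       # row-wise X = mu + P N
```

The sample mean and covariance of the simulated rows recover µ and Σ up to Monte Carlo error.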


The decomposition AA> = Σ is not unique. For instance, if we use the eigendecomposition:

Σ = UΛU >

we can set A = UΛ1/2 . Indeed, we have:

AA> = UΛ1/2 Λ1/2 U > = UΛU > = Σ


To simulate a multivariate Student’s t distribution


Y = (Y1 , . . . , Yn ) ∼ Tn (Σ, ν), we use the relationship:

Xi
Yi = p
Z /ν

where the random vector X = (X1 , . . . , Xn ) ∼ N (0, Σ) and the random


variable Z ∼ χ2 (ν) are independent
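A sketch of the corresponding simulation (the correlation matrix and ν below are illustrative assumptions):

```python
import numpy as np

def simulate_multivariate_t(n, Sigma, nu, rng):
    """Y_i = X_i / sqrt(Z / nu) with X ~ N(0, Sigma), Z ~ chi2(nu)."""
    P = np.linalg.cholesky(Sigma)
    X = rng.standard_normal((n, Sigma.shape[0])) @ P.T
    Z = rng.chisquare(nu, size=(n, 1))   # one chi-squared draw per vector
    return X / np.sqrt(Z / nu)

Sigma = np.array([[1.0, 0.7],
                  [0.7, 1.0]])
rng = np.random.default_rng(0)
Y = simulate_multivariate_t(200000, Sigma, nu=5.0, rng=rng)
```

Each component then has variance ν/(ν − 2) = 5/3, reflecting the heavier tails of the Student's t distribution.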


If X = (X1 , . . . , Xn ) ∼ F, then the probability distribution of the random vector U = (U1 , . . . , Un ) defined by:

Ui = Fi (Xi )

is the copula function C associated to F
To simulate the Normal copula with the matrix of parameters ρ, we simulate N ∼ N (0, I ) and apply the transformation:

U = Φ (P · N)

where P is the Cholesky decomposition of the correlation matrix ρ
To simulate the Student’s t copula with the matrix of parameters ρ and ν degrees of freedom, we simulate T ∼ Tn (ρ, ν) and apply the transformation:

Ui = Tν (Ti )
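A sketch of the Normal copula simulation (Python; the standard normal cdf is built from math.erf to stay dependency-free, and ρ1,2 = 80% is an illustrative value):

```python
import numpy as np
from math import erf, sqrt

def norm_cdf(x):
    """Standard normal cdf Phi computed via the error function."""
    return np.array([0.5 * (1.0 + erf(v / sqrt(2.0))) for v in x])

rho = np.array([[1.0, 0.8],
                [0.8, 1.0]])
P = np.linalg.cholesky(rho)

rng = np.random.default_rng(0)
N = rng.standard_normal((20000, 2)) @ P.T    # correlated N(0, rho) vectors
U = np.column_stack([norm_cdf(N[:, 0]), norm_cdf(N[:, 1])])
```

The components of U are uniform on [0, 1] with the positive dependence induced by ρ.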


Figure: Simulation of the Normal copula


Figure: Simulation of the t1 copula


Frailty copulas are defined as:

C (u1 , . . . , un ) = ψ (ψ −1 (u1 ) + . . . + ψ −1 (un ))

where ψ (x) is the Laplace transform of a random variable X
They can be generated using the following algorithm:
1 simulate n independent uniform random variates v1 , . . . , vn ;
2 simulate the frailty random variate x with the Laplace transform ψ;
3 apply the transformation:

(u1 , . . . , un ) ← (ψ (− ln v1 / x) , . . . , ψ (− ln vn / x))


The Clayton copula is a frailty copula where ψ (x) = (1 + x)−1/θ is the Laplace transform of the gamma random variable G (1/θ, 1)
The algorithm to simulate the Clayton copula is:
1 x ← G (1/θ, 1)
2 (u1 , . . . , un ) ← ((1 − ln v1 / x)−1/θ , . . . , (1 − ln vn / x)−1/θ )
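This frailty algorithm can be sketched as follows (NumPy assumed; θ = 1.5 is an illustrative value):

```python
import numpy as np

def clayton_frailty(n, dim, theta, rng):
    """Simulate the Clayton copula through its gamma frailty:
    x ~ G(1/theta, 1), then u_j = (1 - ln(v_j) / x)^(-1/theta)."""
    x = rng.gamma(1.0 / theta, 1.0, size=(n, 1))
    v = rng.uniform(size=(n, dim))
    return (1.0 - np.log(v) / x) ** (-1.0 / theta)

rng = np.random.default_rng(0)
U = clayton_frailty(100000, 2, theta=1.5, rng=rng)
```

The marginals are uniform, and the empirical copula at (0.5, 0.5) approaches C (0.5, 0.5) = (2 × 0.5−θ − 1)−1/θ ≈ 0.3586 for θ = 1.5.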


Figure: Simulation of the Clayton copula


We consider the multivariate distribution F (x1 , . . . , xn ), whose canonical decomposition is defined as:

F (x1 , . . . , xn ) = C (F1 (x1 ) , . . . , Fn (xn ))

If (U1 , . . . , Un ) ∼ C, the random vector (X1 , . . . , Xn ) = (F−11 (U1 ) , . . . , F−1n (Un )) follows the distribution function F
We deduce the following algorithm:

(u1 , . . . , un ) ← C
(x1 , . . . , xn ) ← (F−11 (u1 ) , . . . , F−1n (un ))
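A sketch of this two-step algorithm with exponential marginals, whose quantile function is closed-form (the Clayton copula with θ = 2 and the intensity values are illustrative assumptions):

```python
import numpy as np

def clayton_pair(n, theta, rng):
    """Draw (u1, u2) from the Clayton copula via the conditional inverse."""
    v1 = rng.uniform(size=n)
    v2 = rng.uniform(size=n)
    u1 = v1
    u2 = (1.0 + u1 ** -theta * (v2 ** (-theta / (theta + 1.0)) - 1.0)) ** (-1.0 / theta)
    return u1, u2

rng = np.random.default_rng(0)
u1, u2 = clayton_pair(100000, theta=2.0, rng=rng)

lam1, lam2 = 0.05, 0.10
x1 = -np.log(1.0 - u1) / lam1    # F1^{-1}(u1) for the E(lam1) marginal
x2 = -np.log(1.0 - u2) / lam2    # F2^{-1}(u2) for the E(lam2) marginal
```

The marginals have means 1/λ1 and 1/λ2, while the positive dependence of the copula carries over to (x1, x2).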


We assume that τ ∼ E (5%) and LGD ∼ B (2, 2)


We also assume that the default time and the loss given default are
correlated and the dependence function is a Clayton copula


Figure: Simulation of the correlated random vector (τ , LGD)


Remark
The previous algorithms suppose that we know the analytical expression Fi of the univariate probability distributions in order to calculate the quantile F−1i . This is not always the case. For instance, in operational risk, the loss of the bank is equal to the sum of aggregate losses:

L = ∑Kk=1 Sk

where Sk is itself the sum of individual losses for the k th cell of the mapping matrix. In practice, the probability distribution of Sk is estimated by the method of simulations


The method of the empirical quantile function is implemented as follows:
1 for each random variable Xi , simulate m1 random variates x⋆i,m and estimate the empirical distribution F̂i ;
2 simulate a random vector (u1 , . . . , un ) from the copula function C (u1 , . . . , un );
3 simulate the random vector (x1 , . . . , xn ) by inverting the empirical distributions F̂i :

xi ← F̂−1i (ui )

We also have:

xi ← inf { x : (1/m1 ) ∑m1m=1 1 {x⋆i,m ≤ x} ≥ ui }

4 repeat steps 2 and 3 m2 times
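A sketch of the empirical quantile function F̂−1i on a sorted sample (NumPy assumed; the standard normal is used here as a stand-in for a distribution that is only known by simulation):

```python
import numpy as np

rng = np.random.default_rng(0)

# step 1: m1 simulated variates define the empirical distribution
x_star = np.sort(rng.standard_normal(200000))
m1 = x_star.size

def empirical_quantile(u):
    """F_hat^{-1}(u) = inf {x : F_hat(x) >= u} on the sorted sample."""
    k = np.ceil(np.asarray(u) * m1).astype(int) - 1
    return x_star[np.clip(k, 0, m1 - 1)]

# steps 2-3: plug copula draws into the empirical quantile function
u = rng.uniform(size=1000)
x = empirical_quantile(u)
```

With a Gaussian stand-in, F̂−1 (0.975) should be close to the true quantile 1.96.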


X1 ∼ N (0, 1)
X2 ∼ N (0, 1)
The dependence function of (X1 , X2 ) is the Clayton copula with
parameter θ = 3


Figure: Convergence of the method of the empirical quantile function


X1 ∼ N (−1, 2), X2 ∼ N (0, 1), Y1 ∼ G (0.5) and Y2 ∼ G (1, 2) are four independent random variables
Let (Z1 = X1 + Y1 , Z2 = X2 · Y2 ) be the random vector
The dependence function of Z is the t copula with parameters ν = 2 and ρ = −70%
It is not possible to find an analytical expression of the marginal distributions of Z1 and Z2


Figure: Simulation of the random variables Z1 and Z2


Figure: Simulation of the random vector (Z1 , Z2 )


Random matrices

Orthogonal and covariance matrices


Correlation matrices
Wishart matrices
⇒ HFRM, Chapter 13, Section 13.1.4, pages 807-813


Brownian motion
A Brownian motion (or a Wiener process) is a stochastic process W (t), whose increments are stationary and independent:

W (t) − W (s) ∼ N (0, t − s)

We have:

W (0) = 0
W (t) = W (s) + ε (s, t)

where ε (s, t) ∼ N (0, t − s) are iid random variables
To simulate W (t) at different dates t1 , t2 , . . ., we have:

Wm+1 = Wm + √(tm+1 − tm ) · εm

where Wm is the numerical realization of W (tm ) and εm ∼ N (0, 1) are iid random variables
In the case of fixed-interval times tm+1 − tm = h, we obtain the recursion:

Wm+1 = Wm + √h · εm
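The fixed-interval recursion can be sketched in a few vectorized lines (NumPy assumed):

```python
import numpy as np

def simulate_brownian(n_paths, n_steps, h, rng):
    """W_{m+1} = W_m + sqrt(h) * eps_m, with W_0 = 0."""
    eps = rng.standard_normal((n_paths, n_steps))
    paths = np.sqrt(h) * np.cumsum(eps, axis=1)
    return np.concatenate([np.zeros((n_paths, 1)), paths], axis=1)

rng = np.random.default_rng(0)
W = simulate_brownian(50000, 252, 1.0 / 252, rng)   # one year, daily steps
```

At the horizon t = 1, W (1) ∼ N (0, 1), which the sample moments recover.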

Geometric Brownian motion
The geometric Brownian motion is described by the following SDE:

dX (t) = µX (t) dt + σX (t) dW (t)
X (0) = x0

Its solution is given by:

X (t) = x0 · exp ((µ − σ 2 /2) t + σW (t)) = g (W (t))



1 Simulating the geometric Brownian motion X (t) can be done by applying the transform method to the process W (t)
2 Another approach to simulate X (t) consists in using the following formula:

X (t) = X (s) · exp ((µ − σ 2 /2) (t − s) + σ (W (t) − W (s)))

We have:

Xm+1 = Xm · exp ((µ − σ 2 /2) (tm+1 − tm ) + σ √(tm+1 − tm ) · εm )

where Xm = X (tm ) and εm ∼ N (0, 1) are iid random variables
3 If we consider fixed-interval times, the numerical realization becomes:

Xm+1 = Xm · exp ((µ − σ 2 /2) h + σ √h · εm )
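The exact fixed-interval scheme can be sketched as follows (parameter values are illustrative):

```python
import numpy as np

def simulate_gbm(x0, mu, sigma, n_paths, n_steps, h, rng):
    """X_{m+1} = X_m exp((mu - sigma^2/2) h + sigma sqrt(h) eps_m)."""
    eps = rng.standard_normal((n_paths, n_steps))
    log_increments = (mu - 0.5 * sigma ** 2) * h + sigma * np.sqrt(h) * eps
    return x0 * np.exp(np.cumsum(log_increments, axis=1))

rng = np.random.default_rng(0)
X = simulate_gbm(100.0, 0.05, 0.20, 50000, 252, 1.0 / 252, rng)
```

The scheme is exact in distribution: ln X (1) ∼ N (ln x0 + µ − σ 2 /2, σ 2 ) whatever the step size h.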


Figure: Simulation of the geometric Brownian motion


Ornstein-Uhlenbeck process
The stochastic differential equation of the Ornstein-Uhlenbeck process is:

dX (t) = a (b − X (t)) dt + σ dW (t)
X (0) = x0

The solution of the SDE is:

X (t) = x0 e −at + b (1 − e −at ) + σ ∫0t e a(θ−t) dW (θ)

We also have:

X (t) = X (s) e −a(t−s) + b (1 − e −a(t−s) ) + σ ∫st e a(θ−t) dW (θ)

where:

∫st e a(θ−t) dW (θ) ∼ N (0, (1 − e −2a(t−s) ) / (2a))

If we consider fixed-interval times, we obtain the following simulation scheme:

Xm+1 = Xm e −ah + b (1 − e −ah ) + σ √((1 − e −2ah ) / (2a)) · εm

where εm ∼ N (0, 1) are iid random variables
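A sketch of this exact scheme (parameter values are illustrative; after many steps the sample settles at the stationary distribution N (b, σ 2 / (2a))):

```python
import numpy as np

def simulate_ou(x0, a, b, sigma, n_paths, n_steps, h, rng):
    """X_{m+1} = X_m e^{-ah} + b (1 - e^{-ah})
                 + sigma sqrt((1 - e^{-2ah}) / (2a)) eps_m."""
    decay = np.exp(-a * h)
    vol = sigma * np.sqrt((1.0 - np.exp(-2.0 * a * h)) / (2.0 * a))
    X = np.full(n_paths, float(x0))
    for _ in range(n_steps):
        X = X * decay + b * (1.0 - decay) + vol * rng.standard_normal(n_paths)
    return X

rng = np.random.default_rng(0)
X = simulate_ou(0.0, a=2.0, b=0.05, sigma=0.10, n_paths=50000,
                n_steps=1000, h=0.01, rng=rng)
```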


Figure: Simulation of the Ornstein-Uhlenbeck process


Stochastic differential equations without an explicit solution

Let X (t) be the solution of the following SDE:

dX (t) = µ (t, X ) dt + σ (t, X ) dW (t)
X (0) = x0

The Euler-Maruyama scheme uses the following approximation:

X (t) − X (s) ≈ µ (s, X (s)) · (t − s) + σ (s, X (s)) · (W (t) − W (s))

If we consider fixed-interval times, the Euler-Maruyama scheme becomes:

Xm+1 = Xm + µ (tm , Xm ) h + σ (tm , Xm ) √h · εm

where εm ∼ N (0, 1) are iid random variables
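A generic sketch of the scheme, where µ and σ are passed as functions (the drift used in the sanity check is an illustrative choice):

```python
import numpy as np

def euler_maruyama(x0, mu, sigma, n_steps, h, rng):
    """Fixed-interval Euler-Maruyama scheme for
    dX = mu(t, X) dt + sigma(t, X) dW."""
    x, t = float(x0), 0.0
    for _ in range(n_steps):
        eps = rng.standard_normal()
        x = x + mu(t, x) * h + sigma(t, x) * np.sqrt(h) * eps
        t += h
    return x

# sanity check: with sigma = 0 the scheme integrates the ODE dX = 0.05 X dt
rng = np.random.default_rng(0)
x_T = euler_maruyama(1.0, lambda t, x: 0.05 * x, lambda t, x: 0.0,
                     n_steps=10000, h=1.0 / 10000, rng=rng)
```

With zero diffusion the recursion reduces to the explicit Euler ODE solver, so x_T approaches e^0.05 as h decreases.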


The fixed-interval Milstein scheme is:

Xm+1 = Xm + µ (tm , Xm ) h + σ (tm , Xm ) √h · εm + (1/2) σ (tm , Xm ) ∂x σ (tm , Xm ) h (ε2m − 1)


If we consider the geometric Brownian motion, the Euler-Maruyama scheme is:

Xm+1 = Xm + µXm h + σXm √h · εm

whereas the Milstein scheme is:

Xm+1 = Xm + µXm h + σXm √h · εm + (1/2) σ 2 Xm h (ε2m − 1)
     = Xm + (µ − σ 2 /2) Xm h + σXm √h (1 + (1/2) σ √h εm ) εm


Figure: Comparison of exact, Euler-Maruyama and Milstein schemes (monthly discretization)

When we don’t know the analytical solution of X (t), it is natural to simulate the numerical solution of X (t) using the Euler-Maruyama and Milstein schemes. However, it may sometimes be more efficient to find the numerical solution of Y (t) = f (t, X (t)) instead of X (t) itself, in particular when Y (t) is more regular than X (t)

By Itô’s lemma, we have:

dY (t) = (∂t f (t, X ) + µ (t, X ) ∂x f (t, X ) + (1/2) σ 2 (t, X ) ∂2x f (t, X )) dt + σ (t, X ) ∂x f (t, X ) dW (t)

By using the inverse function X (t) = f −1 (t, Y (t)), we obtain:

dY (t) = µ0 (t, Y ) dt + σ 0 (t, Y ) dW (t)

where µ0 (t, Y ) and σ 0 (t, Y ) are functions of µ (t, X ), σ (t, X ) and f (t, X )
We can then simulate the solution of Y (t) using an approximation scheme and deduce the numerical solution of X (t) by applying the transformation method:

Xm = f −1 (tm , Ym )
Let us consider the geometric Brownian motion X (t). The solution of Y (t) = ln X (t) is equal to:

dY (t) = (µ − σ 2 /2) dt + σ dW (t)

We deduce that the Euler-Maruyama (or Milstein) scheme with fixed-interval times is:

Ym+1 = Ym + (µ − σ 2 /2) h + σ √h · εm

It follows that:

ln Xm+1 = ln Xm + (µ − σ 2 /2) h + σ √h · εm

The CIR process is dX (t) = (α + βX (t)) dt + σ √X (t) dW (t). Using the transformation Y (t) = √X (t), we obtain the following SDE:

dY (t) = ((α + βX (t)) / (2 √X (t)) − σ 2 X (t) / (8 X (t)3/2 )) dt + (σ √X (t) / (2 X (t))) dW (t)
       = (1 / (2Y (t))) (α + βY 2 (t) − σ 2 /4) dt + (σ/2) dW (t)

We deduce that the Euler-Maruyama scheme of Y (t) is:

Ym+1 = Ym + (1 / (2Ym )) (α + βY 2m − σ 2 /4) h + (σ/2) √h · εm

It follows that:

Xm+1 = (√Xm + (1 / (2 √Xm )) (α + βXm − σ 2 /4) h + (σ/2) √h · εm )2

Poisson process

Let tm be the time when the mth event occurs. The numerical algorithm is then:
1 we set t0 = 0 and N (t0 ) = 0
2 we generate a uniform random variate u and calculate the random variate e ∼ E (λ) with the formula:

e = − (ln u) / λ

3 we update the Poisson process with:

tm+1 ← tm + e and N (tm+1 ) ← N (tm ) + 1

4 we go back to step 2
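The four steps translate directly (a sketch; the values of λ and the horizon T are illustrative):

```python
import numpy as np

def simulate_poisson_times(lam, T, rng):
    """Event times of a homogeneous Poisson process on [0, T], using
    exponential inter-arrival times e = -ln(u) / lambda."""
    times = []
    t = 0.0
    while True:
        t -= np.log(rng.uniform()) / lam
        if t > T:
            break
        times.append(t)
    return np.array(times)

rng = np.random.default_rng(0)
counts = [simulate_poisson_times(3.0, 10.0, rng).size for _ in range(2000)]
```

The number of events on [0, T ] is Poisson distributed with mean λT.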


Mixed Poisson process (MPP)

The algorithm is initialized with a realization λ of the random intensity Λ


Non-homogeneous Poisson process (NHPP)
λ (t) varies with time
The inter-arrival times remain independent and exponentially distributed with:

Pr {T1 > t} = exp (−Λ (t))

where T1 is the duration of the first event and Λ (t) is the integrated intensity function:

Λ (t) = ∫0t λ (s) ds

It follows that:

Pr {T1 > Λ−1 (t)} = exp (−t) ⇔ Pr {Λ (T1 ) > t} = exp (−t)





We deduce that if {t1 , t2 , . . . , tM } are the occurrence times of the NHPP of intensity λ (t), then {Λ (t1 ) , Λ (t2 ) , . . . , Λ (tM )} are the occurrence times of the homogeneous Poisson process (HPP) of intensity one. Therefore, the algorithm is:
1 we simulate t′m , the arrival times of the homogeneous Poisson process with intensity λ = 1
2 we apply the transform tm = Λ−1 (t′m )
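A sketch of the time-change algorithm with the illustrative intensity λ (t) = t, so that Λ (t) = t2 /2 and Λ−1 (s) = √(2s):

```python
import numpy as np

def simulate_nhpp(lambda_inv, T, rng):
    """NHPP occurrence times t_m = Lambda^{-1}(t'_m), where t'_m are the
    arrival times of a unit-intensity homogeneous Poisson process."""
    times = []
    s = 0.0
    while True:
        s -= np.log(rng.uniform())      # HPP(1) arrival times t'_m
        t = lambda_inv(s)
        if t > T:
            break
        times.append(t)
    return np.array(times)

rng = np.random.default_rng(0)
counts = [simulate_nhpp(lambda s: np.sqrt(2.0 * s), 10.0, rng).size
          for _ in range(2000)]
```

For this intensity, E [N (T )] = Λ (T ) = 50 at T = 10, which the Monte Carlo average recovers.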


Figure: Simulation of a non-homogeneous Poisson process with cyclical intensity


Multidimensional Brownian motion
Let W (t) = (W1 (t) , . . . , Wn (t)) be an n-dimensional Brownian motion
Each component Wi (t) is a Brownian motion:

Wi (t) − Wi (s) ∼ N (0, t − s)

We have:

E [Wi (t) Wj (s)] = min (t, s) · ρi,j

where ρi,j is the correlation between the two Brownian motions Wi and Wj
We deduce that:

W (0) = 0
W (t) = W (s) + ε (s, t)

where ε (s, t) ∼ Nn (0, (t − s) ρ) are iid random vectors



It follows that the numerical solution is:

Wm+1 = Wm + √(tm+1 − tm) · P · εm

where P is the Cholesky decomposition of the correlation matrix ρ and εm ∼ Nn (0, I ) are iid random vectors
In the case of fixed-interval times tm+1 − tm = h, the recursion becomes:

Wm+1 = Wm + √h · P · εm
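A minimal NumPy sketch of this recursion (the function name and the parameter values are hypothetical, chosen only for illustration):

```python
import numpy as np

def simulate_brownian(rho, n_steps, h, seed=0):
    """Simulate an n-dimensional Brownian motion with correlation matrix rho
    on the fixed grid t_m = m*h, using W_{m+1} = W_m + sqrt(h) * P * eps_m."""
    rng = np.random.default_rng(seed)
    rho = np.asarray(rho, dtype=float)
    n = rho.shape[0]
    P = np.linalg.cholesky(rho)                  # P @ P.T = rho
    eps = rng.standard_normal((n_steps, n))      # iid N(0, I) vectors
    dW = np.sqrt(h) * eps @ P.T                  # correlated increments
    W = np.vstack([np.zeros(n), np.cumsum(dW, axis=0)])  # W(0) = 0
    return W

rho = np.array([[1.0, 0.85], [0.85, 1.0]])
W = simulate_brownian(rho, n_steps=100_000, h=1e-3, seed=42)
```

The sample correlation of the simulated increments should be close to ρ1,2 = 85%, as in the figure that follows.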

Multidimensional Brownian motion

Figure: Brownian motion in the plane (independent case)

Multidimensional Brownian motion

Figure: Brownian motion in the plane (ρ1,2 = 85%)

Multidimensional geometric Brownian motion


We consider the multidimensional geometric Brownian motion:

dX (t) = µ X (t) dt + diag (σ X (t)) dW (t)
X (0) = x0

where X (t) = (X1 (t) , . . . , Xn (t)), µ = (µ1 , . . . , µn ), σ = (σ1 , . . . , σn ) and W (t) = (W1 (t) , . . . , Wn (t)) is an n-dimensional Brownian motion with E[W (t) W (t)⊤] = ρ t
If we consider the j th component of X (t), we have:

dXj (t) = µj Xj (t) dt + σj Xj (t) dWj (t)

The solution of the multidimensional SDE is a multivariate log-normal process with:

Xj (t) = Xj (0) · exp((µj − σj²/2) t + σj Wj (t))

where W (t) ∼ Nn (0, ρ t)
Multidimensional geometric Brownian motion

We deduce that the exact scheme to simulate the multivariate GBM is:

X1,m+1 = X1,m · exp((µ1 − σ1²/2) (tm+1 − tm) + σ1 √(tm+1 − tm) · ε1,m)
⋮
Xj,m+1 = Xj,m · exp((µj − σj²/2) (tm+1 − tm) + σj √(tm+1 − tm) · εj,m)
⋮
Xn,m+1 = Xn,m · exp((µn − σn²/2) (tm+1 − tm) + σn √(tm+1 − tm) · εn,m)

where (ε1,m , . . . , εn,m ) ∼ Nn (0, ρ)
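A vectorized sketch of this exact scheme on a fixed grid (the function name and the parameter values are assumptions, not from the slides):

```python
import numpy as np

def simulate_gbm(x0, mu, sigma, rho, n_steps, h, seed=0):
    """Exact scheme for the multivariate GBM:
    X_{j,m+1} = X_{j,m} * exp((mu_j - sigma_j^2/2) h + sigma_j sqrt(h) eps_{j,m})
    with (eps_{1,m}, ..., eps_{n,m}) ~ N_n(0, rho)."""
    rng = np.random.default_rng(seed)
    x0, mu, sigma = map(np.asarray, (x0, mu, sigma))
    P = np.linalg.cholesky(np.asarray(rho))          # P @ P.T = rho
    eps = rng.standard_normal((n_steps, len(x0))) @ P.T
    log_increments = (mu - 0.5 * sigma**2) * h + sigma * np.sqrt(h) * eps
    log_paths = np.log(x0) + np.cumsum(log_increments, axis=0)
    return np.vstack([x0, np.exp(log_paths)])        # first row is X(0)

X = simulate_gbm(x0=[100.0, 50.0], mu=[0.05, 0.03], sigma=[0.2, 0.3],
                 rho=[[1.0, 0.5], [0.5, 1.0]], n_steps=252, h=1 / 252, seed=1)
```

Because the scheme is exact, the discretization grid only controls how finely the paths are observed, not the accuracy of the terminal distribution.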

Euler-Maruyama and Milstein schemes

We consider the general SDE:

dX (t) = µ (t, X (t)) dt + σ (t, X (t)) dW (t)
X (0) = x0

where X (t) and µ (t, X (t)) are n × 1 vectors, σ (t, X (t)) is an n × p matrix and W (t) is a p × 1 vector
We assume that E[W (t) W (t)⊤] = ρ t, where ρ is a p × p correlation matrix

Euler-Maruyama and Milstein schemes

The corresponding Euler-Maruyama scheme is:

Xm+1 = Xm + µ (tm , Xm ) · (tm+1 − tm ) + σ (tm , Xm ) √(tm+1 − tm) · εm

where εm ∼ Np (0, ρ)
In the case of a diagonal system, we retrieve the one-dimensional scheme:

Xj,m+1 = Xj,m + µj (tm , Xj,m ) · (tm+1 − tm ) + σj,j (tm , Xj,m ) · √(tm+1 − tm) · εj,m

However, the random variables εj,m and εj′,m may be correlated

Euler-Maruyama and Milstein schemes

We consider the Heston model:

dX (t) = µ X (t) dt + √v (t) X (t) dW1 (t)
dv (t) = a (b − v (t)) dt + σ √v (t) dW2 (t)

where E [W1 (t) W2 (t)] = ρ t. By applying the fixed-interval Euler-Maruyama scheme to (ln X (t) , v (t)), we obtain:

ln Xm+1 = ln Xm + (µ − vm/2) h + √(vm h) · ε1,m

and:

vm+1 = vm + a (b − vm ) h + σ √(vm h) · ε2,m

Here, ε1,m and ε2,m are two standard Gaussian random variables with correlation ρ
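A minimal sketch of this Euler-Maruyama scheme (the zero floor applied to the variance before taking the square root is an assumption not discussed in the slides, and all parameter values are hypothetical):

```python
import numpy as np

def heston_euler(x0, v0, mu, a, b, sigma, rho, n_steps, h, seed=0):
    """Fixed-interval Euler-Maruyama scheme for (ln X(t), v(t)) in the
    Heston model, with the variance floored at zero (assumption)."""
    rng = np.random.default_rng(seed)
    ln_x, v = np.log(x0), v0
    for _ in range(n_steps):
        e1 = rng.standard_normal()
        e2 = rho * e1 + np.sqrt(1.0 - rho**2) * rng.standard_normal()  # corr = rho
        vp = max(v, 0.0)                      # floor the variance (assumption)
        ln_x += (mu - 0.5 * vp) * h + np.sqrt(vp * h) * e1
        v += a * (b - vp) * h + sigma * np.sqrt(vp * h) * e2
    return np.exp(ln_x), v

x_T, v_T = heston_euler(x0=100.0, v0=0.04, mu=0.05, a=2.0, b=0.04,
                        sigma=0.3, rho=-0.7, n_steps=252, h=1 / 252, seed=7)
```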

Euler-Maruyama and Milstein schemes


The multidimensional version of the Milstein scheme is:

Xj,m+1 = Xj,m + µj (tm , Xm ) (tm+1 − tm ) + Σ_{k=1}^{p} σj,k (tm , Xm ) ∆Wk,m + Σ_{k=1}^{p} Σ_{k′=1}^{p} L(k) σj,k′ (tm , Xm ) I(k,k′)

where ∆Wk,m = Wk (tm+1 ) − Wk (tm ) and:

L(k) f (t, x) = Σ_{k″=1}^{n} σk″,k (tm , Xm ) ∂ f (t, x) / ∂ xk″

and:

I(k,k′) = ∫_{tm}^{tm+1} ∫_{tm}^{s} dWk (t) dWk′ (s)

Euler-Maruyama and Milstein schemes

In the case of a diagonal system, the Milstein scheme may be simplified as follows:

Xj,m+1 = Xj,m + µj (tm , Xj,m ) (tm+1 − tm ) + σj,j (tm , Xj,m ) ∆Wj,m + L(j) σj,j (tm , Xj,m ) I(j,j)

where:

I(j,j) = ∫_{tm}^{tm+1} ∫_{tm}^{s} dWj (t) dWj (s)
       = ∫_{tm}^{tm+1} (Wj (s) − Wj (tm )) dWj (s)
       = ((∆Wj,m )² − (tm+1 − tm )) / 2

Euler-Maruyama and Milstein schemes

We deduce that the Milstein scheme is:

Xj,m+1 = Xj,m + µj (tm , Xj,m ) (tm+1 − tm ) + σj,j (tm , Xj,m ) √(tm+1 − tm) · εj,m + ½ σj,j (tm , Xj,m ) ∂xj σj,j (tm , Xj,m ) (tm+1 − tm ) (ε²j,m − 1)

Euler-Maruyama and Milstein schemes

If we apply the fixed-interval Milstein scheme to the Heston model, we obtain:

ln Xm+1 = ln Xm + (µ − vm/2) h + √(vm h) · ε1,m

and:

vm+1 = vm + a (b − vm ) h + σ √(vm h) · ε2,m + ¼ σ² h (ε²2,m − 1)

Here, ε1,m and ε2,m are two standard Gaussian random variables with correlation ρ
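A minimal sketch of one step of this scheme, with the Milstein correction ¼ σ² h (ε²2,m − 1) applied only to the variance update (the zero floor on the variance and the parameter values are assumptions, not from the slides):

```python
import numpy as np

def heston_milstein_step(ln_x, v, mu, a, b, sigma, rho, h, rng):
    """One fixed-interval step: Euler for ln X(t), Milstein for v(t)."""
    e1 = rng.standard_normal()
    e2 = rho * e1 + np.sqrt(1.0 - rho**2) * rng.standard_normal()  # corr = rho
    vp = max(v, 0.0)                          # floor the variance (assumption)
    ln_x_next = ln_x + (mu - 0.5 * vp) * h + np.sqrt(vp * h) * e1
    v_next = (v + a * (b - vp) * h + sigma * np.sqrt(vp * h) * e2
              + 0.25 * sigma**2 * h * (e2**2 - 1.0))   # Milstein correction
    return ln_x_next, v_next

rng = np.random.default_rng(3)
ln_x, v = np.log(100.0), 0.04
for _ in range(252):
    ln_x, v = heston_milstein_step(ln_x, v, 0.05, 2.0, 0.04, 0.3, -0.7, 1 / 252, rng)
```

When σ = 0 the correction vanishes and the variance update becomes deterministic, which is a quick sanity check of the implementation.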

Euler-Maruyama and Milstein schemes

Remark
The multidimensional Milstein scheme is generally not used, because the terms L(k) σj,k′ (tm , Xm ) I(k,k′) are complicated to simulate. For the Heston model, we obtain a very simple scheme, because we only apply the Milstein scheme to the process v (t) and not to the vector process (ln X (t) , v (t))

Euler-Maruyama and Milstein schemes

If we also apply the Milstein scheme to ln X (t), we obtain:

ln Xm+1 = ln Xm + (µ − vm/2) h + √(vm h) · ε1,m + Am

where:

Am = Σ_{k=1}^{2} Σ_{k′=1}^{2} (Σ_{k″=1}^{2} σk″,k (tm , Xm ) ∂ σ1,k′ (tm , Xm ) / ∂ xk″) I(k,k′)
   = σ √v (t) · (1 / (2 √v (t))) · I(2,1)
   = (σ/2) · I(2,1)

Euler-Maruyama and Milstein schemes


Let W2 (t) = ρ W1 (t) + √(1 − ρ²) W⋆ (t) where W⋆ (t) is a Brownian motion independent from W1 (t). It follows that:

I(2,1) = ∫_{tm}^{tm+1} ∫_{tm}^{s} dW2 (t) dW1 (s)
       = ∫_{tm}^{tm+1} (ρ W1 (s) + √(1 − ρ²) W⋆ (s)) dW1 (s) − ∫_{tm}^{tm+1} (ρ W1 (tm ) + √(1 − ρ²) W⋆ (tm )) dW1 (s)
       = ρ ∫_{tm}^{tm+1} (W1 (s) − W1 (tm )) dW1 (s) + √(1 − ρ²) ∫_{tm}^{tm+1} (W⋆ (s) − W⋆ (tm )) dW1 (s)

and:

I(2,1) = ½ ρ ((∆W1,m )² − (tm+1 − tm )) + Bm
Euler-Maruyama and Milstein schemes

We finally deduce that the multidimensional Milstein scheme of the Heston model is:

ln Xm+1 = ln Xm + (µ − vm/2) h + √(vm h) · ε1,m + ¼ ρ σ h (ε²1,m − 1) + Bm

and:

vm+1 = vm + a (b − vm ) h + σ √(vm h) · ε2,m + ¼ σ² h (ε²2,m − 1)

where Bm is a correction term defined by:

Bm = √(1 − ρ²) ∫_{tm}^{tm+1} (W⋆ (s) − W⋆ (tm )) dW1 (s)

Random variate generation Computing integrals
Simulation of stochastic processes Variance reduction
Monte Carlo methods Quasi-Monte Carlo simulation methods

A basic example
Suppose we have a circle with radius r and a 2r × 2r square with the same center. Since the area of the circle is equal to πr², the numerical calculation of π is equivalent to computing the area of the circle with r = 1
In this case, the area of the square is 4, and we have:

π = 4 · A (circle) / A (square)

To determine π, we simulate nS random vectors (us , vs ) of uniform random variables U[−1,1] and we obtain:

π = 4 · lim_{nS →∞} nc / nS

where nc is the number of points (us , vs ) in the circle:

nc = Σ_{s=1}^{nS} 1{us² + vs² ≤ r²}
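This hit-or-miss estimator takes a few lines in Python (the function name is hypothetical):

```python
import numpy as np

def estimate_pi(n_s, seed=0):
    """Hit-or-miss estimator of pi: draw (u_s, v_s) uniformly on [-1, 1]^2
    and count the points falling inside the unit circle."""
    rng = np.random.default_rng(seed)
    u = rng.uniform(-1.0, 1.0, n_s)
    v = rng.uniform(-1.0, 1.0, n_s)
    n_c = np.count_nonzero(u**2 + v**2 <= 1.0)   # points inside the circle
    return 4.0 * n_c / n_s

pi_hat = estimate_pi(1_000_000, seed=123)
```

With nS = 10⁶ draws the standard error of the estimator is about 1.6 × 10⁻³.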


A basic example

Figure: Computing π with 1 000 simulations


Theoretical framework
We consider the multiple integral:

I = ∫ · · · ∫ ϕ (x1 , . . . , xn ) dx1 · · · dxn

Let X = (X1 , . . . , Xn ) be a uniform random vector with probability distribution U[Ω] , such that Ω is inscribed within the hypercube [Ω]
The pdf is:

f (x1 , . . . , xn ) = 1

We deduce that:

I = ∫ · · · ∫_{[Ω]} 1 {(x1 , . . . , xn ) ∈ Ω} · ϕ (x1 , . . . , xn ) dx1 · · · dxn
  = E [1 {(X1 , . . . , Xn ) ∈ Ω} · ϕ (X1 , . . . , Xn )]
  = E [h (X1 , . . . , Xn )]

where:

h (x1 , . . . , xn ) = 1 {(x1 , . . . , xn ) ∈ Ω} · ϕ (x1 , . . . , xn )

Theoretical framework
Let IˆnS be the random variable defined by:

IˆnS = (1/nS) Σ_{s=1}^{nS} h (X1,s , . . . , Xn,s )

where {X1,s , . . . , Xn,s }s≥1 is a sequence of iid random vectors with probability distribution U[Ω]
Using the strong law of large numbers, we obtain:

lim_{nS →∞} IˆnS = E [h (X1 , . . . , Xn )] = ∫ · · · ∫ ϕ (x1 , . . . , xn ) dx1 · · · dxn

Moreover, the central limit theorem states that:

lim_{nS →∞} √nS · (IˆnS − I) / σ (h (X1 , . . . , Xn )) = N (0, 1)


Theoretical framework

When nS is large, we can deduce the following confidence interval:

[IˆnS − cα · ŜnS/√nS , IˆnS + cα · ŜnS/√nS]

where α is the confidence level, cα = Φ−1 ((1 + α) /2) and ŜnS is the usual estimate of the standard deviation:

ŜnS = √( (1/(nS − 1)) Σ_{s=1}^{nS} (h (X1,s , . . . , Xn,s ) − IˆnS)² )
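As a sketch (not from the slides), the estimate and its interval can be computed together; `statistics.NormalDist` gives cα, and the hit-or-miss estimator of π serves as the test function h:

```python
import numpy as np
from statistics import NormalDist

def mc_confidence_interval(h_values, alpha=0.95):
    """Return (I_hat, lower, upper): the MC estimate of I = E[h(X)] and the
    asymptotic interval I_hat -/+ c_alpha * S_hat / sqrt(n_S)."""
    h_values = np.asarray(h_values, dtype=float)
    n_s = h_values.size
    i_hat = h_values.mean()
    s_hat = h_values.std(ddof=1)                      # usual estimate of sigma(h)
    c_alpha = NormalDist().inv_cdf((1.0 + alpha) / 2.0)
    half = c_alpha * s_hat / np.sqrt(n_s)
    return i_hat, i_hat - half, i_hat + half

# hit-or-miss estimator of pi: h(u, v) = 4 * 1{u^2 + v^2 <= 1}
rng = np.random.default_rng(0)
u, v = rng.uniform(-1.0, 1.0, (2, 100_000))
pi_hat, lo, hi = mc_confidence_interval(4.0 * (u**2 + v**2 <= 1.0))
```

For α = 95%, cα = Φ−1 (0.975) ≈ 1.96.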


Theoretical framework

Figure: Density function of π̂nS


Extension to the calculation of mathematical expectations


Let X = (X1 , . . . , Xn ) be a random vector with probability distribution F. We have:

E [ϕ (X1 , . . . , Xn )] = ∫ · · · ∫ ϕ (x1 , . . . , xn ) dF (x1 , · · · , xn )
                        = ∫ · · · ∫ ϕ (x1 , . . . , xn ) f (x1 , · · · , xn ) dx1 · · · dxn
                        = ∫ · · · ∫ h (x1 , . . . , xn ) dx1 · · · dxn

where f is the density function
The Monte Carlo estimator of this integral is:

IˆnS = (1/nS) Σ_{s=1}^{nS} ϕ (X1,s , . . . , Xn,s )

where {X1,s , . . . , Xn,s }s≥1 is a sequence of iid random vectors with probability distribution F

Extension to the calculation of mathematical expectations


The price of the look-back option with maturity T is given by:

C = e^{−rT} E[(S (T ) − min_{0≤t≤T} S (t))⁺]

The price S (t) of the underlying asset is given by the following SDE:

dS (t) = rS (t) dt + σS (t) dW (t)

where r is the interest rate and σ is the volatility of the asset
For a given simulation s, we have:

Sm+1(s) = Sm(s) · exp((r − σ²/2) (tm+1 − tm ) + σ √(tm+1 − tm) · εm(s))

where εm(s) ∼ N (0, 1) and T = tM
The Monte Carlo estimator of the option price is then equal to:

Ĉ = (e^{−rT}/nS) Σ_{s=1}^{nS} (SM(s) − min_m Sm(s))⁺
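A minimal sketch of this pricer, simulating all paths at once (the function name and all parameter values are hypothetical):

```python
import numpy as np

def lookback_call_mc(s0, r, sigma, T, n_steps, n_s, seed=0):
    """MC price of the floating-strike look-back option
    C = exp(-rT) E[(S(T) - min_t S(t))^+] under the exact GBM scheme."""
    rng = np.random.default_rng(seed)
    h = T / n_steps
    eps = rng.standard_normal((n_s, n_steps))
    log_s = np.log(s0) + np.cumsum((r - 0.5 * sigma**2) * h
                                   + sigma * np.sqrt(h) * eps, axis=1)
    s = np.hstack([np.full((n_s, 1), s0), np.exp(log_s)])
    payoff = s[:, -1] - s.min(axis=1)   # S(T) - min_m S_m, always >= 0 here
    return np.exp(-r * T) * payoff.mean()

price = lookback_call_mc(s0=100.0, r=0.05, sigma=0.2, T=1.0,
                         n_steps=252, n_s=20_000, seed=11)
```

Note that the payoff is automatically non-negative because the running minimum includes the terminal value S (T ).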


Extension to the calculation of mathematical expectations

Figure: Computing the look-back option price


Extension to the calculation of mathematical expectations

Let us consider the following integral:

I = ∫ · · · ∫ h (x1 , . . . , xn ) dx1 · · · dxn

We can write it as follows:

I = ∫ · · · ∫ (h (x1 , . . . , xn ) / f (x1 , · · · , xn )) · f (x1 , · · · , xn ) dx1 · · · dxn

where f (x1 , · · · , xn ) is a multidimensional density function
We deduce that:

I = E [h (X1 , . . . , Xn ) / f (X1 , . . . , Xn )]

This implies that we can compute an integral with the MC method by using any multidimensional distribution function


Extension to the calculation of mathematical expectations


If we apply this result to the calculation of π, we have:

π = ∬_{x²+y²≤1} dx dy = ∬ 1{x² + y² ≤ 1} dx dy
  = ∬ (1{x² + y² ≤ 1} / (φ (x) φ (y))) · φ (x) φ (y) dx dy

We deduce that:

π = E[1{X² + Y² ≤ 1} / (φ (X ) φ (Y ))]

where X and Y are two independent standard Gaussian random variables. We can then estimate π by:

π̂nS = (1/nS) Σ_{s=1}^{nS} 1{xs² + ys² ≤ 1} / (φ (xs ) φ (ys ))

where xs and ys are two independent random variates from the probability distribution N (0, 1)

Extension to the calculation of mathematical expectations

Figure: Computing π with normal random numbers


Variance reduction

We consider two unbiased estimators Iˆ(1)nS and Iˆ(2)nS of the integral I, meaning that E[Iˆ(1)nS] = E[Iˆ(2)nS] = I
We say that Iˆ(1)nS is more efficient than Iˆ(2)nS if the inequality var(Iˆ(1)nS) ≤ var(Iˆ(2)nS) holds for all values of nS that are larger than nS⋆
Variance reduction is then the search for more efficient estimators


Antithetic variates

We have:

I = E [ϕ (X1 , . . . , Xn )] = E [Y ]

where Y = ϕ (X1 , . . . , Xn ) is a one-dimensional random variable
It follows that:

IˆnS = ȲnS = (1/nS) Σ_{s=1}^{nS} Ys

We now consider the estimators ȲnS and Ȳ′nS based on two different samples and define Ȳ⋆ as follows:

Ȳ⋆ = (ȲnS + Ȳ′nS) / 2


Antithetic variates

We have:

E[Ȳ⋆] = E[(ȲnS + Ȳ′nS) / 2] = E[ȲnS] = I

and:

var(Ȳ⋆) = var((ȲnS + Ȳ′nS) / 2)
        = ¼ var(ȲnS) + ¼ var(Ȳ′nS) + ½ cov(ȲnS , Ȳ′nS)
        = ((1 + ρ⟨ȲnS , Ȳ′nS⟩) / 2) · var(ȲnS)
        = ((1 + ρ⟨Ys , Ys′⟩) / 2) · var(ȲnS)

where ρ⟨Ys , Ys′⟩ is the correlation between Ys and Ys′


Antithetic variates

Because we have ρ⟨Ys , Ys′⟩ ≤ 1, we deduce that:

var(Ȳ⋆) ≤ var(ȲnS)

If we simulate the random variates Ys and Ys′ independently, ρ⟨Ys , Ys′⟩ is equal to zero and the variance of the estimator is divided by 2
However, the number of simulations has been multiplied by two. The efficiency of the estimator has therefore not been improved


Antithetic variates

The underlying idea of antithetic variables is therefore to use two perfectly dependent random variables Ys and Ys′:

Ys′ = ψ (Ys )

where ψ is a deterministic function
This implies that:

Ȳ⋆nS = (1/nS) Σ_{s=1}^{nS} Ys⋆

where:

Ys⋆ = (Ys + Ys′) / 2 = (Ys + ψ (Ys )) / 2

It follows that:

ρ⟨ȲnS , Ȳ′nS⟩ = ρ⟨Y , Y′⟩ = ρ⟨Y , ψ (Y )⟩


Antithetic variates

Minimizing the variance var(Ȳ⋆) is then equivalent to minimizing the correlation ρ⟨Y , ψ (Y )⟩
We also know that the correlation reaches its lower bound if the dependence function between Y and ψ (Y ) is equal to the lower Fréchet copula:

C⟨Y , ψ (Y )⟩ = C−

However, ρ⟨Y , ψ (Y )⟩ is not necessarily equal to −1 except in some special cases


Antithetic variates

We consider the one-dimensional case with Y = ϕ (X )
If we assume that ϕ is an increasing function, it follows that:

C⟨Y , ψ (Y )⟩ = C⟨ϕ (X ) , ψ (ϕ (X ))⟩ = C⟨X , ψ (X )⟩

To obtain the lower bound C−, X and ψ (X ) must be countermonotonic:

ψ (X ) = F−1 (1 − F (X ))

where F is the probability distribution of X
For instance, if X ∼ U[0,1] , we have X′ = 1 − X. In the case where X ∼ N (0, 1), we have:

X′ = Φ−1 (1 − Φ (X )) = Φ−1 (Φ (−X )) = −X


Antithetic variates

Example #9
We consider the following functions:
1 ϕ1 (x) = x 3 + x + 1
2 ϕ2 (x) = x 4 + x 2 + 1
3 ϕ3 (x) = x 4 + x 3 + x 2 + x + 1


Antithetic variates
For each function, we want to estimate I = E [ϕ (N (0, 1))] using the antithetic estimator:

Ȳ⋆nS = (1/nS) Σ_{s=1}^{nS} (ϕ (Xs ) + ϕ (−Xs )) / 2

where Xs ∼ N (0, 1)
Let X ∼ N (0, 1). We have E[X²] = 1, E[X^{2m}] = (2m − 1) · E[X^{2m−2}] and E[X^{2m+1}] = 0 for m ∈ N
We obtain the following results:

ϕ (x)                              ϕ1 (x)   ϕ2 (x)   ϕ3 (x)
E [ϕ (Xs )] or E [ϕ (−Xs )]           1        5        5
var (ϕ (Xs )) or var (ϕ (−Xs ))      22      122      144
cov (ϕ (Xs ) , ϕ (−Xs ))            −22      122      100
ρ⟨ϕ (Xs ) , ϕ (−Xs )⟩                −1        1      25/36
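The last row of the table can be checked empirically (this simulation check is an illustration, not part of the slides):

```python
import numpy as np

# Empirical correlations rho<phi(X), phi(-X)> for the three functions
rng = np.random.default_rng(5)
x = rng.standard_normal(2_000_000)

phi = {
    "phi1": lambda t: t**3 + t + 1,                # odd part only: rho = -1
    "phi2": lambda t: t**4 + t**2 + 1,             # even function: rho = +1
    "phi3": lambda t: t**4 + t**3 + t**2 + t + 1,  # mixed: rho = 25/36
}
rho_hat = {name: np.corrcoef(f(x), f(-x))[0, 1] for name, f in phi.items()}
```

Since ϕ1 (−x) = 2 − ϕ1 (x) and ϕ2 (−x) = ϕ2 (x), the first two correlations are exactly −1 and +1; only ϕ3 needs a large sample to approach 25/36 ≈ 0.694.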


Antithetic variates

To understand these numerical results, we must study the relationship between C⟨X , X′⟩ and C⟨Y , Y′⟩. Indeed, we have:

C⟨X , X′⟩ = C− ⇒ C⟨Y , Y′⟩ = C− ⇔ ϕ′ (x) ≥ 0



Antithetic variates

Figure: Functions ϕ1 (x), ϕ2 (x) and ϕ3 (x)


Application to the geometric Brownian motion


In the Gaussian case X ∼ N (0, 1), the antithetic variable is:

X′ = −X

As the simulation of Y ∼ N (µ, σ²) is obtained using the relationship Y = µ + σX, we deduce that the antithetic variable is:

Y′ = µ − σX = µ − σ (Y − µ) /σ = 2µ − Y

If we consider the geometric Brownian motion, the fixed-interval scheme is:

Xm+1 = Xm · exp((µ − σ²/2) h + σ √h · εm)

whereas the antithetic path is given by:

X′m+1 = X′m · exp((µ − σ²/2) h − σ √h · εm)
2

Application to the geometric Brownian motion

Figure: Antithetic simulation of the GBM process


Application to the geometric Brownian motion


In the multidimensional case, we recall that:

Xj,m+1 = Xj,m · exp((µj − σj²/2) h + σj √h · εj,m)

where εm = (ε1,m , . . . , εn,m ) ∼ Nn (0, ρ)
We simulate εm by using the relationship εm = P · ηm where ηm ∼ Nn (0, In ) and P is the Cholesky matrix satisfying PP⊤ = ρ
The antithetic trajectory is then:

X′j,m+1 = X′j,m · exp((µj − σj²/2) h + σj √h · ε′j,m)

where:

ε′m = −P · ηm = −εm

We verify that ε′m = (ε′1,m , . . . , ε′n,m) ∼ Nn (0, ρ)


Application to the geometric Brownian motion

In the Black-Scholes model, the price of the spread option with maturity T and strike K is given by:

C = e^{−rT} E[(S1 (T ) − S2 (T ) − K)⁺]

where the prices S1 (t) and S2 (t) of the underlying assets are given by the following SDE:

dS1 (t) = rS1 (t) dt + σ1 S1 (t) dW1 (t)
dS2 (t) = rS2 (t) dt + σ2 S2 (t) dW2 (t)

and E [W1 (t) W2 (t)] = ρ t


Application to the geometric Brownian motion

To calculate the option price using Monte Carlo methods, we simulate the bivariate GBM S1 (t) and S2 (t) and the MC estimator is:

ĈMC = (e^{−rT}/nS) Σ_{s=1}^{nS} (S1(s) (T ) − S2(s) (T ) − K)⁺

where Sj(s) (T ) is the s th simulation of the terminal value Sj (T )
For the AV estimator, we obtain:

ĈAV = (e^{−rT}/nS) Σ_{s=1}^{nS} ((S1(s) (T ) − S2(s) (T ) − K)⁺ + (S1′(s) (T ) − S2′(s) (T ) − K)⁺) / 2

where Sj′(s) (T ) is the antithetic variate of Sj(s) (T )
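A minimal sketch of ĈAV: since the payoff only depends on the terminal values, each pair (S1 (T ) , S2 (T )) can be simulated exactly in one step, and the antithetic draw simply flips the sign of the correlated Gaussians (the function name and parameter values are hypothetical):

```python
import numpy as np

def spread_option_av(s1, s2, k, r, sig1, sig2, rho, T, n_s, seed=0):
    """AV estimator of the spread option price e^{-rT} E[(S1(T)-S2(T)-K)^+]."""
    rng = np.random.default_rng(seed)
    e = rng.standard_normal((n_s, 2))
    e[:, 1] = rho * e[:, 0] + np.sqrt(1.0 - rho**2) * e[:, 1]  # corr(e1, e2) = rho

    def payoff(sign):
        # sign = +1 gives the original draws, sign = -1 the antithetic ones
        f1 = s1 * np.exp((r - 0.5 * sig1**2) * T + sign * sig1 * np.sqrt(T) * e[:, 0])
        f2 = s2 * np.exp((r - 0.5 * sig2**2) * T + sign * sig2 * np.sqrt(T) * e[:, 1])
        return np.maximum(f1 - f2 - k, 0.0)

    return np.exp(-r * T) * (0.5 * (payoff(+1.0) + payoff(-1.0))).mean()

price = spread_option_av(s1=110.0, s2=100.0, k=5.0, r=0.05,
                         sig1=0.2, sig2=0.3, rho=0.5, T=1.0, n_s=100_000, seed=4)
```

Flipping the sign of ε preserves the correlation ρ, so the antithetic pair has the same joint distribution as the original one.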


Application to the geometric Brownian motion

Figure: Probability density function of ĈMC and ĈAV (nS = 1 000)


Control variates

Let Y = ϕ (X1 , . . . , Xn ) and V be a random variable with known mean E [V ]
We define Z as follows: Z = Y + c · (V − E [V ])
We deduce that:

E [Z ] = E [Y + c · (V − E [V ])]
       = E [Y ] + c · E [V − E [V ]]
       = E [ϕ (X1 , . . . , Xn )]

and:

var (Z ) = var (Y + c · (V − E [V ]))
         = var (Y ) + 2 · c · cov (Y , V ) + c² · var (V )


Control variates

It follows that:

var (Z ) ≤ var (Y ) ⇔ 2 · c · cov (Y , V ) + c² · var (V ) ≤ 0
                    ⇒ c · cov (Y , V ) ≤ 0

In order to obtain a lower variance, a necessary condition is that c and cov (Y , V ) have opposite signs
The minimum is obtained when ∂c var (Z ) = 0 or equivalently when:

c⋆ = −cov (Y , V ) / var (V ) = −β


Control variates

The optimal value c⋆ is then equal to the opposite of the beta of Y with respect to the control variate V. In this case, we have:

Z = Y − (cov (Y , V ) / var (V )) · (V − E [V ])

and:

var (Z ) = var (Y ) − cov² (Y , V ) / var (V ) = (1 − ρ²⟨Y , V⟩) · var (Y )

This implies that we have to choose a control variate V that is highly (positively or negatively) correlated with Y in order to reduce the variance


Control variates

Example
We consider that X ∼ U[0,1] and ϕ (x) = e^x. We would like to estimate:

I = E [ϕ (X )] = ∫₀¹ e^x dx

We set Y = e^X and V = X
We know that E [V] = 1/2 and var (V) = 1/12
It follows that:

var (Y) = E [Y²] − E² [Y]
        = ∫₀¹ e^{2x} dx − (∫₀¹ e^x dx)²
        = [e^{2x}/2]₀¹ − (e¹ − e⁰)²
        = (4e − e² − 3) / 2
        ≈ 0.2420

We have:

cov (Y, V) = E [VY] − E [V] E [Y]
           = ∫₀¹ x e^x dx − (1/2) (e¹ − e⁰)
           = [x e^x]₀¹ − ∫₀¹ e^x dx − (1/2) (e¹ − e⁰)
           = (3 − e) / 2
           ≈ 0.1409

If we consider the CV estimator Z defined by:

Z = Y − (cov (Y, V) / var (V)) · (V − E [V]) = Y − (18 − 6e) · (V − 1/2)
We have β ≈ 1.6903
We obtain:

var (Z) = var (Y) − cov² (Y, V) / var (V) = (4e − e² − 3)/2 − 3 · (3 − e)² ≈ 0.0039

We conclude that we have dramatically reduced the variance of the estimator, because we have:

var (ÎCV) / var (ÎMC) = var (Z) / var (Y) = 1.628%
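These calculations can be checked numerically. The sketch below is not taken from the slides (the sample size and seed are arbitrary choices); it estimates I = E[e^X] with and without the control variate V = X, using the optimal coefficient β = 18 − 6e and the known mean E[V] = 1/2:

```python
# Control-variate estimation of I = E[exp(X)], X ~ U[0,1], with V = X.
import math
import random

def mc_and_cv_estimates(n_sims, seed=0):
    rng = random.Random(seed)
    beta = 18.0 - 6.0 * math.e                  # optimal beta ~ 1.6903
    sy = sy2 = sz = sz2 = 0.0
    for _ in range(n_sims):
        x = rng.random()
        y = math.exp(x)                         # plain MC draw
        z = y - beta * (x - 0.5)                # CV draw, with E[Z] = E[Y]
        sy += y; sy2 += y * y
        sz += z; sz2 += z * z
    var_y = sy2 / n_sims - (sy / n_sims) ** 2   # empirical var of Y
    var_z = sz2 / n_sims - (sz / n_sims) ** 2   # empirical var of Z
    return sy / n_sims, sz / n_sims, var_y, var_z
```

With nS = 100 000 draws, both estimates are close to e − 1 ≈ 1.7183, and the empirical ratio var_z / var_y is close to the theoretical 1.628%.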

Figure: Understanding the variance reduction in control variates

Ŷ is the conditional expectation of Y with respect to V :

E [Y | V ] = E [Y ] + β (V − E [V ])

This is the best linear estimator of Y


The residual U of the linear regression is then equal to:

U = Y − Ŷ = (Y − E [Y ]) − β (V − E [V ])

The CV estimator Z is a translation of the residual in order to satisfy E [Z] = E [Y]:

Z = E [Y] + U = Y − β (V − E [V])

By construction, the variance of the residual U is lower than the variance of the random variable Y. We conclude that:

var (Z ) = var (U) ≤ var (Y )


We can therefore obtain a large variance reduction if the following conditions are satisfied:
the control variate V largely explains the random variable Y
the relationship between Y and V is almost linear


The price of an arithmetic Asian call option is given by:

C = e^{−rT} E[(S̄ − K)⁺]

where K is the strike of the option and S̄ denotes the average of S (t) on a given number of fixing dates {t1, . . . , tnF} (with tnF = T):

S̄ = (1/nF) Σ_{m=1}^{nF} S (tm)

We can estimate the option price using the Black-Scholes model
We can also reduce the variance of the MC estimator by considering the following control variates:
1 the terminal value V1 = S (T) of the underlying asset;
2 the average value V2 = S̄;
3 the discounted payoff of the call option V3 = e^{−rT} (S (T) − K)⁺;
4 the discounted payoff of the geometric Asian call option V4 = e^{−rT} (S̃ − K)⁺ where:

S̃ = (∏_{m=1}^{nF} S (tm))^{1/nF}


For these control variates, we know the expected value
In the first case, we have:

E [S (T)] = S0 e^{rT}

In the second case, we have:

E [S̄] = (S0/nF) Σ_{m=1}^{nF} e^{r tm}
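As an illustration (a sketch not taken from the slides — the parameters, the equally spaced fixing grid and the sample size are arbitrary assumptions), the code below prices the arithmetic Asian call with V = S̄ as control variate, estimating β from the simulated sample and using the closed-form value of E[S̄]:

```python
# Arithmetic Asian call under Black-Scholes with the control variate V = S_bar,
# whose mean E[S_bar] = S0/nF * sum_m exp(r * t_m) is known in closed form.
import math
import random

def asian_call_cv(s0, k, r, sigma, t, n_fix, n_sims, seed=0):
    rng = random.Random(seed)
    dt = t / n_fix                              # equally spaced fixing dates
    disc = math.exp(-r * t)
    e_v = s0 / n_fix * sum(math.exp(r * (m + 1) * dt) for m in range(n_fix))
    ys, vs = [], []
    for _ in range(n_sims):
        s, s_sum = s0, 0.0
        for _ in range(n_fix):                  # exact GBM scheme on the grid
            s *= math.exp((r - 0.5 * sigma ** 2) * dt
                          + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0))
            s_sum += s
        s_bar = s_sum / n_fix
        ys.append(disc * max(s_bar - k, 0.0))   # discounted payoff Y
        vs.append(s_bar)                        # control variate V
    m_y, m_v = sum(ys) / n_sims, sum(vs) / n_sims
    cov_yv = sum((y - m_y) * (v - m_v) for y, v in zip(ys, vs)) / n_sims
    var_v = sum((v - m_v) ** 2 for v in vs) / n_sims
    beta = cov_yv / var_v                       # estimated optimal beta
    return sum(y - beta * (v - e_v) for y, v in zip(ys, vs)) / n_sims
```

Estimating β on the same sample introduces a bias of order O(1/nS), which is negligible here; a pilot run can be used instead if needed.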


The expected value of the third control variate is given by the Black-Scholes formula of the European call option
For the fourth control variate, we have:

S̃ = (∏_{m=1}^{nF} S0 e^{(r − σ²/2) tm + σ W(tm)})^{1/nF} = S0 · exp ((r − σ²/2) t̄ + σ W̄)

where:

t̄ = (1/nF) Σ_{m=1}^{nF} tm

and:

W̄ = (1/nF) Σ_{m=1}^{nF} W (tm)

Because S̃ has a log-normal distribution, we deduce that the expected value of the fourth control variate is also given by a Black-Scholes formula

Figure: CV estimator of the arithmetic Asian call option

The previous approach can be extended in the case of several control variates:

Z = Y + Σ_{i=1}^{nCV} ci · (Vi − E [Vi]) = Y + c⊤ (V − E [V])

where c = (c1, . . . , cnCV) and V = (V1, . . . , VnCV)
We can show that the optimal value of c is equal to:

c⋆ = − cov (V, V)⁻¹ · cov (V, Y)

Minimizing the variance of Z is equivalent to minimizing the variance of U:

U = Y − Ŷ = Y − (α + β⊤ V)

We deduce that c⋆ = −β. It follows that:

var (Z) = var (U) = (1 − R²) · var (Y)

where R² is the R-squared coefficient of the linear regression Y = α + β⊤ V + U
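Returning to the example Y = e^X, the multivariate formula can be sketched with two control variates V1 = X and V2 = X², for which E[V1] = 1/2 and E[V2] = 1/3; this is an illustration, not taken from the slides, and c⋆ is estimated by solving the 2×2 normal equations by hand:

```python
# Two control variates for Y = exp(X), X ~ U[0,1]: c* = -Cov(V)^{-1} cov(V, Y)
import math
import random

def cv2_estimate(n_sims, seed=0):
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n_sims)]
    ys = [math.exp(x) for x in xs]
    v1, v2 = xs, [x * x for x in xs]

    def mean(a):
        return sum(a) / len(a)

    def cov(a, b):
        ma, mb = mean(a), mean(b)
        return sum((u - ma) * (w - mb) for u, w in zip(a, b)) / len(a)

    s11, s12, s22 = cov(v1, v1), cov(v1, v2), cov(v2, v2)
    b1, b2 = cov(v1, ys), cov(v2, ys)
    det = s11 * s22 - s12 * s12
    c1 = -(s22 * b1 - s12 * b2) / det        # -Cov(V)^{-1} cov(V, Y), row 1
    c2 = -(s11 * b2 - s12 * b1) / det        # -Cov(V)^{-1} cov(V, Y), row 2
    zs = [y + c1 * (a - 0.5) + c2 * (b - 1.0 / 3.0)
          for y, a, b in zip(ys, v1, v2)]    # E[V1] = 1/2, E[V2] = 1/3
    return mean(zs)
```

Since e^x is almost quadratic on [0, 1], the residual variance — and hence the standard error of the estimator — becomes very small.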

Table: Linear regression between the Asian call option and the control variates

α̂ β̂1 β̂2 β̂3 β̂4 R² 1 − R²


−51.482 0.036 0.538 90.7% 9.3%
−24.025 −0.346 0.595 0.548 96.5% 3.5%
−4.141 0.069 0.410 81.1% 18.9%
−38.727 0.428 0.174 92.9% 7.1%
−1.559 −0.040 0.054 0.111 0.905 99.8% 0.2%

Importance sampling

Let X = (X1 , . . . , Xn ) be a random vector with distribution function F


We have:

I = E [ϕ (X1, . . . , Xn) | F] = ∫ · · · ∫ ϕ (x1, . . . , xn) f (x1, . . . , xn) dx1 · · · dxn

where f (x1 , . . . , xn ) is the probability density function of X


It follows that:

I = ∫ · · · ∫ ϕ (x1, . . . , xn) (f (x1, . . . , xn) / g (x1, . . . , xn)) g (x1, . . . , xn) dx1 · · · dxn
  = E [ϕ (X1, . . . , Xn) · (f (X1, . . . , Xn) / g (X1, . . . , Xn)) | G]
  = E [ϕ (X1, . . . , Xn) L (X1, . . . , Xn) | G]

where g (x1, . . . , xn) is the probability density function of G and L is the likelihood ratio:

L (x1, . . . , xn) = f (x1, . . . , xn) / g (x1, . . . , xn)

The values taken by L (x1, . . . , xn) are also called the importance sampling weights


Using the vector notation, the relationship becomes:

E [ϕ (X) | F] = E [ϕ (X) L (X) | G]

It follows that:

E [ÎMC] = E [ÎIS] = I

where ÎMC and ÎIS are the Monte Carlo and importance sampling estimators of I
We also deduce that:

var (ÎIS) = var (ϕ (X) L (X) | G)


It follows that:

var (ÎIS) = E [ϕ² (X) L² (X) | G] − E² [ϕ (X) L (X) | G]
          = ∫ ϕ² (x) L² (x) g (x) dx − I²
          = ∫ ϕ² (x) (f² (x) / g² (x)) g (x) dx − I²
          = ∫ ϕ² (x) (f² (x) / g (x)) dx − I²


If we compare the variance of the two estimators ÎMC and ÎIS, we obtain:

var (ÎIS) − var (ÎMC) = ∫ ϕ² (x) (f² (x) / g (x)) dx − ∫ ϕ² (x) f (x) dx
                      = ∫ ϕ² (x) (f (x) / g (x) − 1) f (x) dx
                      = ∫ ϕ² (x) (L (x) − 1) f (x) dx

The difference may be negative if the weights L (x) are small (L (x) ≪ 1), because the values of ϕ² (x) f (x) are positive
The importance sampling approach then changes the importance of some values x by transforming the original probability distribution F into another probability distribution G

The first-order condition is:

−ϕ² (x) · f² (x) / g² (x) = λ

where λ is a constant
We have:

g⋆ (x) = arg min var (ÎIS) = arg min ∫ ϕ² (x) (f² (x) / g (x)) dx = c · |ϕ (x)| · f (x)

where c is the normalizing constant such that ∫ g⋆ (x) dx = 1
A good choice of the IS density g (x) is then an approximation of |ϕ (x)| · f (x) such that g (x) can easily be simulated

Remark
In order to simplify the notation and avoid confusion, we consider that X ∼ F and Z ∼ G in the sequel. This means that ÎMC = ϕ (X) and ÎIS = ϕ (Z) L (Z)


We consider the estimation of the probability p = Pr {X ≥ 3} ≈ 0.1350% when X ∼ N (0, 1)
We have:

ϕ (x) = 1 {x ≥ 3}

Importance sampling with Z ∼ N (µz, σz²), µz = 3 and σz = 1 ⇒ the probability Pr {Z ≥ 3} is equal to 50%
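A quick numerical check (this sketch and its sample size are illustrative, not from the slides): with Z ∼ N(3, 1), the likelihood ratio is φ(z)/φ(z − 3) = exp(−3z + 9/2), and the IS estimator averages 1{z ≥ 3} times this ratio:

```python
# Importance sampling estimate of p = Pr{X >= 3}, X ~ N(0,1), sampling
# Z ~ N(mu, 1) with mu = 3; the likelihood ratio is exp(-mu*z + mu^2/2).
import math
import random

def tail_prob_is(n_sims, mu=3.0, seed=0):
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(n_sims):
        z = rng.gauss(mu, 1.0)                       # draw from the IS density
        if z >= 3.0:
            acc += math.exp(-mu * z + 0.5 * mu * mu) # weight by f(z)/g(z)
    return acc / n_sims

# Exact value: p = 1 - Phi(3) = 0.5 * erfc(3 / sqrt(2))
p_true = 0.5 * math.erfc(3.0 / math.sqrt(2.0))
```

With nS = 100 000, the estimate matches p ≈ 0.1350% to within a fraction of a percent, whereas a plain MC run of the same size would observe only about 135 exceedances.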


Figure: Histogram of the MC and IS estimators (nS = 1 000)


Figure: Standard deviation (in %) of the estimator p̂IS (nS = 1 000)


We consider the pricing of the put option:

P = e^{−rT} E[(K − S (T))⁺]

We can estimate the option price by using the Monte Carlo method with:

ϕ (x) = e^{−rT} (K − x)⁺

In the case where K ≪ S (0), the probability of exercise Pr {S (T) ≤ K} is very small
Therefore, we have to increase the probability of exercise in order to obtain a more efficient estimator


In the case of the Black-Scholes model, the density function of S (T) is equal to:

f (x) = (1 / (x σx)) φ ((ln x − µx) / σx)

where µx = ln S0 + (r − σ²/2) T and σx = σ √T
We consider the IS density g (x) defined by:

g (x) = (1 / (x σz)) φ ((ln x − µz) / σz)

where µz = θ + µx and σz = σx


For instance, we can choose θ such that the probability of exercise is equal to 50%. It follows that:

Pr {Z ≤ K} = 1/2 ⇔ Φ ((ln K − θ − µx) / σx) = 1/2
           ⇔ θ = ln K − µx
           ⇔ θ = ln (K / S0) − (r − σ²/2) T


We deduce that:

P = E [ϕ (S (T))] = E [ϕ (S′ (T)) · L (S′ (T))]

where:

L (x) = ((1/(x σx)) φ ((ln x − µx)/σx)) / ((1/(x σz)) φ ((ln x − µz)/σz)) = exp (θ²/(2σx²) − ((ln x − µx)/σx) · (θ/σx))

and S′ (T) is the same geometric Brownian motion as S (T), but with another initial value:

S′ (0) = S (0) e^θ = K e^{−(r − σ²/2) T}


Example #10
We assume that S0 = 100, K = 60, r = 5%, σ = 20% and T = 2. If we consider the previous method, the IS process is simulated using the initial value S′ (0) = K e^{−(r − σ²/2) T} = 56.506, whereas the value of θ is equal to −0.5708
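A hedged numerical sketch of this scheme (the simulation size and seed are arbitrary; the closed-form Black-Scholes put is only used as a benchmark): S(T) is drawn from the shifted lognormal with µz = θ + µx, and each payoff is weighted by the likelihood ratio L:

```python
# IS pricing of a deep out-of-the-money put under Black-Scholes: the shift
# theta = ln(K/S0) - (r - sigma^2/2)*T makes the exercise probability 50%.
import math
import random

def put_price_is(s0, k, r, sigma, t, n_sims, seed=0):
    rng = random.Random(seed)
    mu_x = math.log(s0) + (r - 0.5 * sigma ** 2) * t
    sig_x = sigma * math.sqrt(t)
    theta = math.log(k) - mu_x
    disc = math.exp(-r * t)
    acc = 0.0
    for _ in range(n_sims):
        ln_s = mu_x + theta + sig_x * rng.gauss(0.0, 1.0)   # IS draw of ln S(T)
        lr = math.exp(0.5 * theta ** 2 / sig_x ** 2
                      - (ln_s - mu_x) / sig_x * theta / sig_x)  # likelihood ratio
        acc += disc * max(k - math.exp(ln_s), 0.0) * lr
    return acc / n_sims
```

With the parameters of Example #10, the estimate converges to the Black-Scholes put price (about 0.11), while a plain MC run of the same size would exercise the option only rarely.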


Figure: Density function of the estimators P̂MC and P̂IS (nS = 1 000)

Quasi-Monte Carlo simulation methods

We consider the following Monte Carlo problem:

I = ∫ · · · ∫_{[0,1]ⁿ} ϕ (x1, . . . , xn) dx1 · · · dxn

Let X be the random vector of independent uniform random variables. It follows that I = E [ϕ (X)]
The Monte Carlo method consists in generating uniform coordinates in the hypercube [0, 1]ⁿ
Quasi-Monte Carlo methods use non-random (deterministic) coordinates in order to obtain a more evenly spread set of points


A low discrepancy sequence U = {u1, . . . , unS} is a set of deterministic points distributed in the hypercube [0, 1]ⁿ
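As an illustration of how such a sequence is built, here is a minimal pure-Python sketch of the Halton sequence (one coprime base per dimension); this is illustrative code, not the generator used for the figures below:

```python
# Van der Corput radical inverse: write i in the given base and mirror its
# digits around the radix point, e.g. i = 6 = 110 in base 2 -> 0.011 = 3/8.
def van_der_corput(i, base):
    x, denom = 0.0, 1.0
    while i > 0:
        i, digit = divmod(i, base)
        denom *= base
        x += digit / denom
    return x

# A 2-dimensional Halton sequence uses the coprime bases 2 and 3
def halton_2d(n):
    return [(van_der_corput(i, 2), van_der_corput(i, 3)) for i in range(1, n + 1)]
```

The first coordinates in base 2 are 1/2, 1/4, 3/4, 1/8, 5/8, . . . — each new point falls in the largest gap left by the previous ones, which is exactly the low-discrepancy property.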


Figure: Comparison of different low discrepancy sequences


Figure: The Sobol generator


Figure: Quasi-random points on the unit sphere


Example #11
We consider a spread option whose payoff is equal to (S1 (T) − S2 (T) − K)⁺. The price is calculated using the Black-Scholes model and the following parameters: S1 (0) = S2 (0) = 100, σ1 = σ2 = 20%, ρ = 50% and r = 5%. The maturity T of the option is set to one year, whereas the strike K is equal to 5. The true price of the spread option is equal to 5.8198.


Table: Pricing of the spread option using quasi-Monte Carlo methods

nS              10²      10³      10⁴      10⁵      10⁶      5 × 10⁶
LCG (1)         4.3988   5.9173   5.8050   5.8326   5.8215   5.8139
LCG (2)         6.1504   6.1640   5.8370   5.8219   5.8265   5.8198
LCG (3)         6.1469   5.7811   5.8125   5.8015   5.8142   5.8197
Hammersley (1) 32.7510  26.5326  21.5500  16.1155   9.0914   5.8199
Hammersley (2) 32.9082  26.4629  21.5465  16.1149   9.0914   5.8199
Halton (1)      8.6256   6.1205   5.8493   5.8228   5.8209   5.8208
Halton (2)     10.6415   6.0526   5.8544   5.8246   5.8208   5.8207
Halton (3)      8.5292   6.0575   5.8474   5.8235   5.8212   5.8208
Sobol           5.7181   5.7598   5.8163   5.8190   5.8198   5.8198
Faure           5.7256   5.7718   5.8157   5.8192   5.8197   5.8198


Exercises

Exercise 13.4.1 – Simulating random numbers using the inversion method
Exercise 13.4.6 – Simulation of the bivariate Normal copula
Exercise 13.4.7 – Computing the capital charge for operational risk


References

Devroye, L. (1986)
Non-Uniform Random Variate Generation, Springer-Verlag.
Roncalli, T. (2020)
Handbook of Financial Risk Management, Chapman and Hall/CRC
Financial Mathematics Series, Chapter 13.

The method of scoring
Statistical methods
Performance evaluation criteria and score consistency

Agenda

Lecture 1: Introduction to Financial Risk Management


Lecture 2: Market Risk
Lecture 3: Credit Risk
Lecture 4: Counterparty Credit Risk and Collateral Risk
Lecture 5: Operational Risk
Lecture 6: Liquidity Risk
Lecture 7: Asset Liability Management Risk
Lecture 8: Model Risk
Lecture 9: Copulas and Extreme Value Theory
Lecture 10: Monte Carlo Simulation Methods
Lecture 11: Stress Testing and Scenario Analysis
Lecture 12: Credit Scoring Models


Credit scoring

Credit scoring refers to statistical models used to measure the creditworthiness of a person or a company
Mortgage, credit card, personal loan, etc.
Credit scoring first emerged in the United States
The FICO score was introduced in 1989 by Fair Isaac Corporation

The method of scoring The emergence of credit scoring
Statistical methods Variable selection
Performance evaluation criteria and score consistency Score modeling, validation and follow-up

Judgmental credit systems versus credit scoring systems

In 1941, Durand presented a statistical analysis of credit valuation
He showed that credit analysts use similar factors, and proposed a credit rating formula based on nine factors: (1) age, (2) sex, (3) stability of residence, (4) occupation, (5) industry, (6) stability of employment, (7) bank account, (8) real estate and (9) life insurance
The score is additive and can take values between 0 and 3.46
From an industrial point of view, a credit scoring system has two main advantages compared to a judgmental credit system:
1 it is cost efficient, and can treat a huge number of applicants;
2 the decision-making process is rapid and consistent across customers.


Scoring models for corporate bankruptcy


Altman Z score model (1968)
The score was equal to:

Z = 1.2 · X1 + 1.4 · X2 + 3.3 · X3 + 0.6 · X4 + 1.0 · X5

The variables Xj represent the following financial ratios:

Xj Ratio
X1 Working capital / Total assets
X2 Retained earnings / Total assets
X3 Earnings before interest and tax / Total assets
X4 Market value of equity / Total liabilities
X5 Sales / Total assets

If we denote by Zi the score of firm i, we can calculate the normalized score:

Zi⋆ = (Zi − mz) / σz

where mz and σz are the mean and standard deviation of the observed scores
A low value of Zi⋆ (for instance Zi⋆ < 2.5) indicates that the firm has a high probability of default
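The Z score and its normalization can be written directly; the sketch below only illustrates the arithmetic, and the sample ratio values used in the usage example are purely hypothetical:

```python
# Altman (1968) Z score: Z = 1.2*X1 + 1.4*X2 + 3.3*X3 + 0.6*X4 + 1.0*X5
def altman_z(x1, x2, x3, x4, x5):
    return 1.2 * x1 + 1.4 * x2 + 3.3 * x3 + 0.6 * x4 + 1.0 * x5

def normalize(scores):
    """Z* = (Z - m_z) / sigma_z over the observed sample of scores."""
    m = sum(scores) / len(scores)
    s = (sum((z - m) ** 2 for z in scores) / len(scores)) ** 0.5
    return [(z - m) / s for z in scores]
```

For instance, a firm with hypothetical ratios X1 = 0.2, X2 = 0.1, X3 = 0.15, X4 = 0.9 and X5 = 1.1 gets Z = 2.515; its normalized score then depends on the cross-section of observed scores.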

New developments

Default of corporate firms


Consumer credit and retail debt management (credit cards,
mortgages, etc.)
Statistical methods: discriminant analysis, logistic regression, survival
model, machine learning techniques


Choice of the risk factors

The five Cs:


1 Capacity measures the applicant’s ability to meet the loan payments
(e.g., debt-to-income, job stability, cash flow dynamics)
2 Capital is the size of assets that are held by the borrower (e.g. net
wealth of the borrower)
3 Character measures the willingness to repay the loan (e.g. payment
history of the applicant)
4 Collateral concerns additional forms of security that the borrower can
provide to the lender
5 Conditions refer to the characteristics of the loan and the economic
conditions that might affect the borrower (e.g. maturity, interests
paid)


Table: An example of risk factors for consumer credit

Character   Age of applicant
            Marital status
            Number of children
            Educational background
            Time with bank
            Time at present address
Capacity    Annual income
            Current living expenses
            Current debts
            Time with employer
Capital     Purpose of the loan
            Home status
            Saving account
Condition   Maturity of the loan
            Paid interests


Scores are developed by banks and financial institutions, but they can also be developed by consultancy companies
This is the case of the FICO® scores, which are the most widely used credit scoring systems in the world

5 main categories:
1 Payment history (35%)
2 Amount of debt (30%)
3 Length of credit history (15%)
4 New credit (10%)
5 Credit mix (10%)

Range: generally from 300 to 850 (the average score of US consumers is 695), graded as Exceptional (800+), Very good (740-799), Good (670-739), Fair (580-669) and Poor (below 580)


Corporate credit scoring systems use financial ratios:


1 Profitability: gross profit margin, operating profit margin,
return-on-equity (ROE), etc.
2 Solvency: debt-to-assets ratio, debt-to-equity ratio, interest coverage
ratio, etc.
3 Leverage: liabilities-to-assets ratio (financial leverage ratio),
long-term debt/assets, etc.
4 Liquidity: current assets/current liabilities (current ratio), quick
assets/current liabilities (quick or cash ratio), total net working
capital, assets with maturities of less than one year, etc.


Data preparation

Check the data and remove outliers or fill missing values


Variable transformation
Slicing-and-dicing segmentation
Potential interaction


Variable selection
Many candidate variables X = (X1, . . . , Xm) for explaining the variable Y
The variable selection problem consists in finding the best set of optimal variables
We assume the following statistical model:

Y = f (X) + u

where u ∼ N (0, σ²)
We denote the prediction by Ŷ = f̂ (X). We have:

E[(Y − Ŷ)²] = E[(f (X) + u − f̂ (X))²]
            = (E[f̂ (X)] − f (X))² + E[(f̂ (X) − E[f̂ (X)])²] + σ²
            = Bias² + Variance + Error



Best subset selection:

    AIC (α) = −2 ℓ(k) (θ̂) + α · df(k)^(model)

Stepwise approach:

    F = (RSS (θ̂(k)) − RSS (θ̂(k+1))) / (RSS (θ̂(k+1)) / df(k+1)^(residual))

Lasso approach:

    yi = Σ_{k=1}^K βk xi,k + ui   s.t.   Σ_{k=1}^K |βk| ≤ τ
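As an illustration, a minimal lasso can be sketched with coordinate descent. Note that the sketch solves the equivalent penalized form min ½‖y − Xβ‖² + λ‖β‖₁ rather than the constrained form with bound τ (the two are related by duality); the data and the value of λ are made up for the example.

```python
import numpy as np

def soft_threshold(z, lam):
    # proximal operator of the L1 penalty
    return np.sign(z) * max(abs(z) - lam, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    # coordinate descent for 0.5*||y - X b||^2 + lam*||b||_1
    n, k = X.shape
    beta = np.zeros(k)
    col_sq = (X**2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(k):
            r_j = y - X @ beta + X[:, j] * beta[j]   # partial residual
            beta[j] = soft_threshold(X[:, j] @ r_j, lam) / col_sq[j]
    return beta

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5))
beta_true = np.array([2.0, 0.0, 0.0, -1.5, 0.0])
y = X @ beta_true + rng.normal(scale=0.1, size=100)
beta_hat = lasso_cd(X, y, lam=20.0)
print(beta_hat)  # the three irrelevant coefficients are driven to exactly zero
```

The soft-thresholding step is what produces exact zeros, which is why the lasso performs variable selection while the ridge penalty does not.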


Score modeling, validation and follow-up

Cross-validation approach (leave-p-out cross-validation or LpOCV, leave-one-out cross-validation or LOOCV, Press statistic)
Score modeling
S = f (X; θ̂) is the score
Decision rule:
S < s ⇒ Y = 0 ⇒ reject
S ≥ s ⇒ Y = 1 ⇒ accept
Score follow-up
Stability
Rejected applicants (reject inference)
Backtesting
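The Press statistic mentioned above has a well-known shortcut: for least squares, the leave-one-out residual equals ei /(1 − hii), where hii is the i-th diagonal element of the hat matrix, so LOOCV needs no refitting. The sketch below (synthetic data, assumed for illustration) checks the shortcut against brute-force refitting.

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 40, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)

# PRESS via the hat-matrix shortcut: e_(-i) = e_i / (1 - h_ii)
H = X @ np.linalg.solve(X.T @ X, X.T)
e = y - H @ y
press_fast = np.sum((e / (1 - np.diag(H)))**2)

# brute-force leave-one-out cross-validation for comparison
press_slow = 0.0
for i in range(n):
    mask = np.arange(n) != i
    b = np.linalg.lstsq(X[mask], y[mask], rcond=None)[0]
    press_slow += (y[i] - X[i] @ b)**2

print(press_fast, press_slow)  # identical up to numerical precision
```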

The method of scoring Unsupervised learning
Statistical methods Parametric supervised methods
Performance evaluation criteria and score consistency Non-parametric supervised methods

Statistical methods

Unsupervised learning is a branch of statistical learning where the data do not include a response variable
It is opposed to supervised learning, whose goal is to predict the value
of the response variable Y given a set of explanatory variables X
In the case of unsupervised learning, we only know the X -values,
because the Y -values do not exist or are not observed
Supervised and unsupervised learning are also called ‘learning
with/without a teacher ’ (Hastie et al., 2009)


Clustering

K -means clustering
Hierarchical clustering


Clustering

Figure: An example of dendrogram


Dimension reduction

Principal component analysis


Non-negative matrix factorization


Discriminant analysis

Figure: Classification statistical problem


Discriminant analysis
The two-dimensional case

Using the Bayes theorem, we have:

Pr {A ∩ B} = Pr {A | B} · Pr {B} = Pr {B | A} · Pr {A}

It follows that:

    Pr {A | B} = Pr {B | A} · Pr {A} / Pr {B}

If we apply this result to the conditional probability Pr {i ∈ C1 | X = x}, we obtain:

    Pr {i ∈ C1 | X = x} = Pr {X = x | i ∈ C1} · Pr {i ∈ C1} / Pr {X = x}


Discriminant analysis
The two-dimensional case

The log-probability ratio is then equal to:

    ln (Pr {i ∈ C1 | X = x} / Pr {i ∈ C2 | X = x}) = ln (Pr {X = x | i ∈ C1} / Pr {X = x | i ∈ C2} · Pr {i ∈ C1} / Pr {i ∈ C2})
                                                   = ln (f1 (x) / f2 (x)) + ln (π1 / π2)

where πj = Pr {i ∈ Cj} is the probability of the j th class and fj (x) = Pr {X = x | i ∈ Cj} is the conditional pdf of X
By construction, the decision boundary is defined such that we are indifferent to the assignment rule (i ∈ C1 or i ∈ C2), implying that:

    Pr {i ∈ C1 | X = x} = Pr {i ∈ C2 | X = x} = 1/2

Finally, we deduce that the decision boundary satisfies the following equation:

    ln (f1 (x) / f2 (x)) + ln (π1 / π2) = 0

Discriminant analysis
Quadratic discriminant analysis (QDA)

If we model each class density as a multivariate normal distribution:

    X | i ∈ Cj ∼ N (µj , Σj)

we have:

    fj (x) = (2π)^(−K/2) |Σj|^(−1/2) exp (−(1/2) (x − µj)ᵀ Σj⁻¹ (x − µj))

We deduce that:

    ln (f1 (x) / f2 (x)) = (1/2) ln (|Σ2| / |Σ1|) − (1/2) (x − µ1)ᵀ Σ1⁻¹ (x − µ1) + (1/2) (x − µ2)ᵀ Σ2⁻¹ (x − µ2)

The decision boundary is then given by:

    (1/2) ln (|Σ2| / |Σ1|) − (1/2) (x − µ1)ᵀ Σ1⁻¹ (x − µ1) + (1/2) (x − µ2)ᵀ Σ2⁻¹ (x − µ2) + ln (π1 / π2) = 0


Discriminant analysis
Linear discriminant analysis (LDA)

If we assume that Σ1 = Σ2 = Σ, we obtain:

    (1/2) (x − µ2)ᵀ Σ⁻¹ (x − µ2) − (1/2) (x − µ1)ᵀ Σ⁻¹ (x − µ1) + ln (π1 / π2) = 0

We deduce that:

    (µ2 − µ1)ᵀ Σ⁻¹ x = (1/2) (µ2ᵀ Σ⁻¹ µ2 − µ1ᵀ Σ⁻¹ µ1) + ln (π1 / π2)

The decision boundary is then linear in x (and not quadratic)


Discriminant analysis

Example #1
We consider two classes and two explanatory variables X = (X1 , X2 ) where
π1 = 50%, π2 = 1 − π1 = 50%, µ1 = (1, 3), µ2 = (4, 1), Σ1 = I2 and
Σ2 = γI2 where γ = 1.5.
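A quick numerical check of this example (our own sketch, using the two-class boundary equation ln f1 (x)/f2 (x) + ln (π1/π2) = 0 of the previous slides):

```python
import numpy as np

def log_gauss(x, mu, cov):
    # log-density of a bivariate normal N(mu, cov), K = 2
    d = x - mu
    _, logdet = np.linalg.slogdet(cov)
    return -np.log(2 * np.pi) - 0.5 * logdet - 0.5 * d @ np.linalg.solve(cov, d)

mu1, mu2 = np.array([1.0, 3.0]), np.array([4.0, 1.0])
cov1, cov2 = np.eye(2), 1.5 * np.eye(2)
pi1 = pi2 = 0.5

def boundary(x):
    # positive => assign to C1, negative => assign to C2
    return log_gauss(x, mu1, cov1) - log_gauss(x, mu2, cov2) + np.log(pi1 / pi2)

print(boundary(mu1) > 0, boundary(mu2) < 0)  # prints: True True
```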


Discriminant analysis

Figure: Boundary decision of discriminant analysis


Discriminant analysis

Figure: Impact of the parameters on LDA/QDA boundary decisions


Discriminant analysis
The general case

We can generalize the previous analysis to J classes
The Bayes formula gives:

    Pr {i ∈ Cj | X = x} = Pr {X = x | i ∈ Cj} · Pr {i ∈ Cj} / Pr {X = x}
                        = c · fj (x) · πj

where c = 1/ Pr {X = x} is a normalization constant that does not depend on j
We note Sj (x) = ln Pr {i ∈ Cj | X = x} the discriminant score function for the j th class
We have:

    Sj (x) = ln c + ln fj (x) + ln πj


Discriminant analysis
The general case

If we again assume that X | i ∈ Cj ∼ N (µj , Σj), the QDA score function is:

    Sj (x) = ln c′ + ln πj − (1/2) ln |Σj| − (1/2) (x − µj)ᵀ Σj⁻¹ (x − µj)
           ∝ ln πj − (1/2) ln |Σj| − (1/2) (x − µj)ᵀ Σj⁻¹ (x − µj)

where ln c′ = ln c − (K/2) ln 2π
Given an input x, we calculate the scores Sj (x) for j = 1, . . . , J and we choose the label j⋆ with the highest score value


Discriminant analysis
The general case

If we assume an homoscedastic model (Σj = Σ), the LDA score function becomes:

    Sj (x) = ln c′′ + ln πj − (1/2) (x − µj)ᵀ Σ⁻¹ (x − µj)
           ∝ ln πj + µjᵀ Σ⁻¹ x − (1/2) µjᵀ Σ⁻¹ µj

where ln c′′ = ln c′ − (1/2) ln |Σ| − (1/2) xᵀ Σ⁻¹ x

Remark
In practice, the parameters πj , µj and Σj are unknown. We replace them
by the corresponding estimates π̂j , µ̂j and Σ̂j . For the linear discriminant
analysis, Σ̂ is estimated by pooling all the classes.


Discriminant analysis
The general case

Example #2
We consider the classification problem of 33 observations with two
explanatory variables X1 and X2 , and three classes C1 , C2 and C3 :
i Cj X1 X2 i Cj X1 X2 i Cj X1 X2
1 1 1.03 2.85 12 2 3.70 5.08 23 3 3.55 0.58
2 1 0.20 3.30 13 2 2.81 1.99 24 3 3.86 1.83
3 1 1.69 3.73 14 2 3.66 2.61 25 3 5.39 0.47
4 1 0.98 3.52 15 2 5.63 4.19 26 3 3.15 −0.18
5 1 0.98 5.15 16 2 3.35 3.64 27 3 4.93 1.91
6 1 3.47 6.56 17 2 2.97 3.55 28 3 3.87 2.61
7 1 3.94 4.68 18 2 3.16 2.92 29 3 4.09 1.43
8 1 1.55 5.99 19 3 3.00 0.98 30 3 3.80 2.11
9 1 1.15 3.60 20 3 3.09 1.99 31 3 2.79 2.10
10 2 1.20 2.27 21 3 5.45 0.60 32 3 4.49 2.71
11 2 3.66 5.49 22 3 3.59 −0.46 33 3 3.51 1.82


Discriminant analysis
The general case

Table: Parameter estimation of the discriminant analysis

Class    C1                 C2                 C3
π̂j      0.273              0.273              0.455
µ̂j      (1.666, 4.376)     (3.349, 3.527)     (3.904, 1.367)
Σ̂j      1.525  0.929       1.326  0.752        0.694  −0.031
         0.929  1.663       0.752  1.484       −0.031   0.960

For the LDA method, we have:

    Σ̂ = (  1.91355  −0.71720 )
        ( −0.71720   3.01577 )
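With these estimated parameters, the QDA scores of the next slide can be reproduced directly. The sketch below evaluates the proportional form Sj (x) ∝ ln πj − ½ ln |Σj| − ½ (x − µj)ᵀ Σj⁻¹ (x − µj) for the first observation x1 = (1.03, 2.85).

```python
import numpy as np

# parameter estimates taken from the table above
pi_hat = np.array([0.273, 0.273, 0.455])
mu_hat = [np.array([1.666, 4.376]),
          np.array([3.349, 3.527]),
          np.array([3.904, 1.367])]
sigma_hat = [np.array([[1.525, 0.929], [0.929, 1.663]]),
             np.array([[1.326, 0.752], [0.752, 1.484]]),
             np.array([[0.694, -0.031], [-0.031, 0.960]])]

def qda_score(x, j):
    # ln pi_j - 0.5 ln|Sigma_j| - 0.5 Mahalanobis distance (constant dropped)
    d = x - mu_hat[j]
    _, logdet = np.linalg.slogdet(sigma_hat[j])
    return np.log(pi_hat[j]) - 0.5 * logdet - 0.5 * d @ np.linalg.solve(sigma_hat[j], d)

x1 = np.array([1.03, 2.85])  # first observation of Example #2
scores = [qda_score(x1, j) for j in range(3)]
print(scores)  # close to the first QDA row (-2.28, -3.69, -7.49)
```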


Discriminant analysis
The general case

Table: Computation of the discriminant scores Sj (x)


i    QDA: S1 (x) S2 (x) S3 (x)    LDA: S1 (x) S2 (x) S3 (x)    LDA2: S1 (x) S2 (x) S3 (x)
1 −2.28 −3.69 −7.49 0.21 −0.96 −0.79 6.93 5.60 5.76
2 −2.28 −6.36 −12.10 −0.26 −2.17 −2.34 1.38 −2.13 −1.89
3 −1.76 −3.13 −6.79 2.84 2.16 1.71 12.13 12.01 11.38
4 −1.80 −4.43 −8.88 1.35 0.09 −0.22 7.73 6.20 5.93
5 −2.36 −7.75 −13.70 4.32 2.93 1.45 8.12 5.54 4.76
6 −3.16 −5.63 −14.68 10.75 11.36 8.95 14.82 13.99 12.96
7 −3.79 −1.92 −6.32 8.06 9.22 8.15 17.36 19.03 17.89
8 −2.85 −8.43 −15.23 6.73 5.76 3.70 10.47 8.09 7.15
9 −1.74 −4.12 −8.37 1.76 0.64 0.27 8.94 7.77 7.39
10 −3.14 −3.21 −6.17 −0.58 −1.56 −0.98 6.59 5.55 6.15
11 −2.87 −3.01 −9.45 9.10 9.96 8.31 16.89 17.65 16.42
12 −3.04 −2.38 −7.77 8.42 9.34 7.98 17.28 18.50 17.28
13 −6.32 −2.29 −1.62 1.41 1.82 2.64 12.48 13.94 14.46
14 −6.91 −2.07 −1.42 3.86 4.94 5.34 15.15 17.41 17.34
15 −9.79 −3.62 −7.12 9.79 12.43 11.75 12.58 14.01 13.50
16 −3.90 −1.47 −3.44 5.25 5.99 5.65 16.84 18.82 18.03
17 −3.31 −1.55 −3.61 4.50 4.92 4.63 16.25 17.95 17.21
18 −4.84 −1.60 −2.19 3.65 4.28 4.45 15.51 17.48 17.14
19 −10.21 −4.12 −1.27 −0.13 0.52 2.06 8.98 9.99 11.70
20 −7.05 −2.41 −1.24 1.85 2.50 3.32 12.99 14.72 15.22
21 −23.11 −11.16 −2.56 2.98 5.75 7.61 3.79 4.57 7.26
22 −19.22 −9.53 −2.42 −1.84 −0.57 2.01 1.81 1.53 5.51
23 −13.86 −5.92 −1.01 −0.01 1.15 2.98 7.65 8.67 10.95
24 −10.01 −3.43 −0.70 2.75 4.07 5.02 12.84 14.95 15.65
25 −23.48 −11.44 −2.54 2.65 5.38 7.33 3.40 4.09 6.95
26 −15.87 −7.59 −2.30 −2.01 −1.14 1.23 3.19 3.02 6.50
27 −14.09 −5.40 −1.52 4.56 6.78 7.70 11.17 13.24 14.08
28 −7.55 −2.27 −1.39 4.18 5.45 5.85 15.10 17.44 17.40
29 −12.40 −4.67 −0.61 2.38 3.92 5.17 11.21 13.14 14.33
30 −8.85 −2.87 −0.88 3.17 4.41 5.17 13.77 15.97 16.37
31 −5.97 −2.17 −1.72 1.58 1.97 2.70 12.78 14.26 14.67
32 −9.40 −2.97 −1.81 5.33 7.11 7.46 14.55 16.95 16.93
33 −8.84 −3.01 −0.80 2.19 3.21 4.16 12.82 14.77 15.45


Discriminant analysis
The general case

Figure: Comparing QDA, LDA and LDA2 predictions


Discriminant analysis
The general case

Figure: QDA, LDA and LDA2 decision regions


Discriminant analysis
Class separation maximization

We note xi = (xi,1 , . . . , xi,K) the K × 1 vector of exogenous variables X for the i th observation
The mean vector and the variance (or scatter) matrix of class Cj are equal to µ̂j = (1/nj) Σ_{i∈Cj} xi and Sj = nj Σ̂j = Σ_{i∈Cj} (xi − µ̂j) (xi − µ̂j)ᵀ, where nj is the number of observations in the j th class
If we consider the total population, we also have µ̂ = (1/n) Σ_{i=1}^n xi and S = n Σ̂ = Σ_{i=1}^n (xi − µ̂) (xi − µ̂)ᵀ


Discriminant analysis
Class separation maximization

We notice that:

    µ̂ = (1/n) Σ_{j=1}^J nj µ̂j

We define the between-class variance matrix as:

    SB = Σ_{j=1}^J nj (µ̂j − µ̂) (µ̂j − µ̂)ᵀ

and the within-class variance matrix as:

    SW = Σ_{j=1}^J Sj

We can show that the total variance matrix can be decomposed into the sum of the within-class and between-class variance matrices:

    S = SW + SB

Discriminant analysis
Class separation maximization

The discriminant analysis consists in finding the discriminant linear combination βᵀX that has the maximum between-class variance relative to the within-class variance:

    β⋆ = arg max J (β)

where J (β) is the Fisher criterion:

    J (β) = (βᵀ SB β) / (βᵀ SW β)

Since the objective function is invariant if we rescale the vector β (J (β′) = J (β) if β′ = cβ), we can impose that βᵀ SW β = 1. It follows that:

    β̂ = arg max βᵀ SB β   s.t.   βᵀ SW β = 1

Discriminant analysis
Class separation maximization

The Lagrange function is:

    L (β; λ) = βᵀ SB β − λ (βᵀ SW β − 1)

We deduce that the first-order condition is equal to:

    ∂ L (β; λ) / ∂ β = 2 SB β − 2λ SW β = 0

It is remarkable that we obtain a generalized eigenvalue problem SB β = λ SW β, or equivalently:

    SW⁻¹ SB β = λβ

Even if SW and SB are two symmetric matrices, it is not necessarily the case for the product SW⁻¹ SB
Using the eigendecomposition SB = V Λ Vᵀ, we have SB^(1/2) = V Λ^(1/2) Vᵀ

Discriminant analysis
Class separation maximization

With the parametrization α = SB^(1/2) β, the first-order condition becomes:

    SB^(1/2) SW⁻¹ SB^(1/2) α = λα

because β = SB^(−1/2) α
We have a right regular eigenvalue problem
Let λk and vk be the k th eigenvalue and eigenvector of the symmetric matrix SB^(1/2) SW⁻¹ SB^(1/2)
It is obvious that the optimal solution α⋆ is the first eigenvector v1 corresponding to the largest eigenvalue λ1
We conclude that the estimator is β̂ = SB^(−1/2) v1 and the discriminant linear relationship is Yc = v1ᵀ SB^(−1/2) X
Moreover, we have:

    λ1 = J (β̂) = (β̂ᵀ SB β̂) / (β̂ᵀ SW β̂)

Discriminant analysis
Class separation maximization

Example #3
We consider a problem with two classes C1 and C2 , and two explanatory
variables (X1 , X2 ). Class C1 is composed of 7 observations: (1, 2), (1, 4),
(3, 6), (3, 3), (4, 2), (5, 6), (5, 5), whereas class C2 is composed of 6
observations: (1, 0), (2, 1), (4, 1), (3, 2), (6, 4) and (6, 5).
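A numerical sketch for this example: in the two-class case SB has rank one, so the largest eigenvector of SW⁻¹ SB is collinear with SW⁻¹ (µ̂1 − µ̂2), which the code below checks on the data of Example #3.

```python
import numpy as np

C1 = np.array([[1, 2], [1, 4], [3, 6], [3, 3], [4, 2], [5, 6], [5, 5]], float)
C2 = np.array([[1, 0], [2, 1], [4, 1], [3, 2], [6, 4], [6, 5]], float)

mu1, mu2 = C1.mean(axis=0), C2.mean(axis=0)
mu = np.vstack([C1, C2]).mean(axis=0)
SW = (C1 - mu1).T @ (C1 - mu1) + (C2 - mu2).T @ (C2 - mu2)   # within-class scatter
SB = (len(C1) * np.outer(mu1 - mu, mu1 - mu)
      + len(C2) * np.outer(mu2 - mu, mu2 - mu))              # between-class scatter

# top eigenvector of SW^{-1} SB
eigval, eigvec = np.linalg.eig(np.linalg.solve(SW, SB))
beta = eigvec[:, np.argmax(eigval.real)].real
beta /= np.linalg.norm(beta)

# closed-form two-class direction
direction = np.linalg.solve(SW, mu1 - mu2)
direction /= np.linalg.norm(direction)

cross = beta[0] * direction[1] - beta[1] * direction[0]
print(beta, abs(cross) < 1e-8)  # the two directions are collinear
```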


Discriminant analysis
Class separation maximization

Figure: Linear projection and the Fisher solution


Discriminant analysis
Class separation maximization

Concerning the assignment decision, we can consider the midpoint rule:

    si < µ̄ ⇒ i ∈ C1
    si > µ̄ ⇒ i ∈ C2

where µ̄ = (µ̄1 + µ̄2) /2, µ̄1 = βᵀ µ̂1 and µ̄2 = βᵀ µ̂2
This rule is not optimal because it does not depend on the variances s̄1² and s̄2² of each class


Discriminant analysis
Class separation maximization

Figure: Class separation and the cut-off criterion


Binary choice models


General framework

We assume that Y can take two values 0 and 1
We consider models that link the outcome to a set of factors X:

    Pr {Y = 1 | X = x} = F (xᵀβ)

F must be a cumulative distribution function in order to ensure that F (z) ∈ [0, 1]
We also assume that the model is symmetric, implying that F (z) + F (−z) = 1
Given a sample {(xi , yi) , i = 1, . . . , n}, the log-likelihood function is equal to:

    ℓ (θ) = Σ_{i=1}^n ln Pr {Yi = yi}

where yi takes the values 0 or 1

Binary choice models


General framework

We have:

    Pr {Yi = yi} = pi^yi · (1 − pi)^(1−yi)

where pi = Pr {Yi = 1 | Xi = xi}
We deduce that:

    ℓ (θ) = Σ_{i=1}^n yi ln pi + (1 − yi) ln (1 − pi)
          = Σ_{i=1}^n yi ln F (xiᵀβ) + (1 − yi) ln (1 − F (xiᵀβ))

We notice that the vector θ includes only the parameters β


Binary choice models


General framework

By noting f (z) the probability density function, it follows that the associated score vector of the log-likelihood function is:

    S (β) = ∂ ℓ (β) / ∂ β
          = Σ_{i=1}^n (f (xiᵀβ) / (F (xiᵀβ) F (−xiᵀβ))) (yi − F (xiᵀβ)) xi


Binary choice models


General framework

The Hessian matrix is:

    H (β) = ∂² ℓ (β) / (∂ β ∂ βᵀ) = − Σ_{i=1}^n Hi · xi xiᵀ

where:

    Hi = f (xiᵀβ)² / (F (xiᵀβ) F (−xiᵀβ)) −
         (yi − F (xiᵀβ)) · ( f′ (xiᵀβ) / (F (xiᵀβ) F (−xiᵀβ)) − f (xiᵀβ)² (1 − 2 F (xiᵀβ)) / (F (xiᵀβ)² F (−xiᵀβ)²) )


Binary choice models


General framework

Once β̂ is estimated by the method of maximum likelihood, we can calculate the predicted probability for the i th observation:

    p̂i = F (xiᵀβ̂)

Like a linear regression model, we can define the residual as the difference between the observation yi and the predicted value p̂i
We can also exploit the property that the conditional distribution of Yi is a Bernoulli distribution B (pi)
It is better to use the standardized (or Pearson) residuals:

    ûi = (yi − p̂i) / √(p̂i (1 − p̂i))

These residuals are related to the Pearson's chi-squared statistic:

    χ²Pearson = Σ_{i=1}^n ûi² = Σ_{i=1}^n (yi − p̂i)² / (p̂i (1 − p̂i))


Binary choice models


General framework

This statistic may be used to measure the goodness-of-fit of the model
Under the assumption H0 that there is no lack-of-fit, we have χ²Pearson ∼ χ²(n−K) where K is the number of exogenous variables
Another goodness-of-fit statistic is the likelihood ratio. For the 'saturated' model, the estimated probability p̂i is exactly equal to yi
We deduce that the likelihood ratio is equal to:

    −2 ln Λ = 2 Σ_{i=1}^n ( yi ln (yi / p̂i) + (1 − yi) ln ((1 − yi) / (1 − p̂i)) )
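Both statistics are straightforward to compute once the fitted probabilities are available; the sketch below uses a toy vector of observations and fitted probabilities (numbers assumed for illustration only).

```python
import numpy as np

y = np.array([1, 0, 1, 1, 0, 0, 1, 0], dtype=float)
p_hat = np.array([0.8, 0.3, 0.6, 0.9, 0.2, 0.4, 0.7, 0.1])

u_hat = (y - p_hat) / np.sqrt(p_hat * (1 - p_hat))   # Pearson residuals
chi2_pearson = np.sum(u_hat**2)

# likelihood ratio (deviance): for binary y, the term y*ln(y/p) reduces to
# -ln(p) when y = 1, and (1-y)*ln((1-y)/(1-p)) to -ln(1-p) when y = 0
deviance = 2 * np.sum(np.where(y == 1, -np.log(p_hat), -np.log(1 - p_hat)))
print(chi2_pearson, deviance)
```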


Binary choice models


General framework

In binomial choice models, D = −2 ln Λ is also called the deviance and we have D ∼ χ²(n−K)
In the case of a perfect fit (p̂i = yi), the likelihood ratio is exactly equal to zero
The forecasting procedure consists of estimating the probability p̂ = F (xᵀβ̂) for a given set of variables x and using the following decision criterion:

    Y = 1 ⇔ p̂ ≥ 1/2


Binary choice models


Logistic regression

The logit model uses the following cumulative distribution function:

    F (z) = 1 / (1 + e⁻ᶻ) = eᶻ / (eᶻ + 1)

The probability density function is then equal to:

    f (z) = e⁻ᶻ / (1 + e⁻ᶻ)²

The log-likelihood function is equal to:

    ℓ (β) = Σ_{i=1}^n (1 − yi) ln (1 − F (xiᵀβ)) + yi ln F (xiᵀβ)
          = Σ_{i=1}^n (1 − yi) ln (e^(−xiᵀβ) / (1 + e^(−xiᵀβ))) − yi ln (1 + e^(−xiᵀβ))
          = Σ_{i=1}^n − ln (1 + e^(−xiᵀβ)) − (1 − yi) xiᵀβ

Binary choice models


Logistic regression

We also have:

    S (β) = Σ_{i=1}^n (yi − F (xiᵀβ)) xi

and:

    H (β) = − Σ_{i=1}^n f (xiᵀβ) · xi xiᵀ
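These closed-form expressions make Newton-Raphson estimation of the logit model straightforward, since for the logistic distribution f (z) = F (z)(1 − F (z)). The sketch below (synthetic data, parameters assumed for the example) iterates β ← β − H (β)⁻¹ S (β).

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([0.5, -1.0])
p = 1 / (1 + np.exp(-X @ beta_true))
y = (rng.uniform(size=n) < p).astype(float)

beta = np.zeros(2)
for _ in range(25):
    F = 1 / (1 + np.exp(-X @ beta))
    score = X.T @ (y - F)                            # S(beta)
    hessian = -(X * (F * (1 - F))[:, None]).T @ X    # H(beta), using f = F(1-F)
    beta = beta - np.linalg.solve(hessian, score)

print(beta)  # close to beta_true for a sample of this size
```

At convergence the score vector is (numerically) zero, which is the first-order condition of maximum likelihood.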


Binary choice models


Probit analysis

The probit model assumes that F (z) is the Gaussian distribution
The log-likelihood function is then:

    ℓ (β) = Σ_{i=1}^n (1 − yi) ln (1 − Φ (xiᵀβ)) + yi ln Φ (xiᵀβ)

The probit model can be seen as a latent variable model
Let us consider the linear model Y⋆ = βᵀX + U where U ∼ N (0, σ²)
We assume that we do not observe Y⋆ but Y = g (Y⋆)
For example, if g (z) = 1 {z > 0}, we obtain:

    Pr {Y = 1 | X = x} = Pr {βᵀX + U > 0 | X = x} = Φ (βᵀx / σ)

We notice that only the ratio β/σ is identifiable
Since we can set σ = 1, we obtain the probit model

Binary choice models


Regularization

The regularized log-likelihood function is equal to:

    ℓ (θ; λ) = ℓ (θ) − (λ/p) ‖θ‖p^p

The case p = 1 is equivalent to considering a lasso penalization
The case p = 2 corresponds to the ridge regularization


Binary choice models


Extension to multinomial logistic regression

We assume that Y can take J labels (L1 , . . . , LJ) or belongs to J disjoint classes (C1 , . . . , CJ)
We define the conditional probability as follows:

    pj (x) = Pr {Y = Lj | X = x} = Pr {Y ∈ Cj | X = x} = e^(βjᵀx) / (1 + Σ_{k=1}^{J−1} e^(βkᵀx))

The probability of the last label is then equal to:

    pJ (x) = 1 − Σ_{j=1}^{J−1} pj (x) = 1 / (1 + Σ_{j=1}^{J−1} e^(βjᵀx))

The log-likelihood function becomes:

    ℓ (θ) = Σ_{i=1}^n ln ( Π_{j=1}^J pj (xi)^(1{i ∈ Cj}) )

where θ is the vector of parameters (β1 , . . . , βJ−1)
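The probabilities pj (x) above can be sketched as follows, with the last class acting as the reference category (implicit βJ = 0); the parameter values are toy numbers assumed for the example.

```python
import numpy as np

def multinomial_probs(x, betas):
    # betas: (J-1) x K array of coefficients; class J has implicit zeros
    z = np.exp(betas @ x)          # e^{beta_j' x} for j = 1, ..., J-1
    denom = 1.0 + z.sum()
    return np.append(z / denom, 1.0 / denom)   # p_1, ..., p_{J-1}, p_J

# J = 3 classes, K = 2 explanatory variables (toy coefficients)
betas = np.array([[0.4, -0.2],
                  [-0.1, 0.3]])
p = multinomial_probs(np.array([1.0, 2.0]), betas)
print(p, p.sum())  # a valid probability vector summing to one
```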



Non-parametric supervised methods

k-nearest neighbor classifier (k-NN)


Neural networks (NN)
Support vector machines (SVM)
Model averaging (bagging or bootstrap aggregation, random forests,
boosting)

The method of scoring Shannon entropy
Statistical methods Graphical methods
Performance evaluation criteria and score consistency Statistical measures

Definition and properties

The entropy is a measure of unpredictability or uncertainty of a random variable
Let (X , Y) be a random vector where pi,j = Pr {X = xi , Y = yj}, pi = Pr {X = xi} and pj = Pr {Y = yj}
The Shannon entropy of the discrete random variable X is given by:

    H (X) = − Σ_{i=1}^n pi ln pi

We have the property 0 ≤ H (X) ≤ ln n
The Shannon entropy is a measure of the average information of the system
The lower the Shannon entropy, the more informative the system


Definition and properties

For a random vector (X , Y ), we have:


$$H(X,Y) = -\sum_{i=1}^{n}\sum_{j=1}^{n} p_{i,j} \ln p_{i,j}$$

We deduce that the conditional information of Y given X is equal to:

$$H(Y \mid X) = \mathbb{E}_X\left[H(Y \mid X = x)\right] = -\sum_{i=1}^{n}\sum_{j=1}^{n} p_{i,j} \ln \frac{p_{i,j}}{p_i} = H(X,Y) - H(X)$$


Definition and properties

We have the following properties:


if X and Y are independent, we have H (Y | X ) = H (Y ) and
H (X , Y ) = H (Y ) + H (X );
if X and Y are perfectly dependent, we have H (Y | X ) = 0 and
H (X , Y ) = H (X ).
The amount of information obtained about one random variable through the other is measured by the mutual information:
$$I(X,Y) = H(Y) + H(X) - H(X,Y) = \sum_{i=1}^{n}\sum_{j=1}^{n} p_{i,j} \ln \frac{p_{i,j}}{p_i p_j}$$
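These quantities can be computed from a joint probability matrix; a minimal sketch that reproduces the two-dice example (independent dice versus perfectly dependent dice):

```python
# Joint entropy, marginal entropies and mutual information -- a minimal
# sketch reproducing the two-dice example (natural logarithms)
import math

def H(ps):
    return sum(-p * math.log(p) for p in ps if p > 0.0)

def entropies(joint):
    # joint: matrix of p_{i,j}; returns H(X), H(Y), H(X,Y) and I(X,Y)
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    hx, hy = H(px), H(py)
    hxy = H([p for row in joint for p in row])
    return hx, hy, hxy, hx + hy - hxy

independent = [[1.0 / 36] * 6 for _ in range(6)]
dependent = [[1.0 / 6 if i == j else 0.0 for j in range(6)] for i in range(6)]
hx, hy, hxy, mi = entropies(independent)
print(round(hx, 3), round(hxy, 3), round(abs(mi), 3))  # -> 1.792 3.584 0.0
print(round(entropies(dependent)[3], 3))               # -> 1.792
```

The outputs match the figures shown below for the independent and perfectly dependent cases.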


Definition and properties

Figure: Examples of Shannon entropy calculation. Left panel: two independent uniform variables with pi,j = 1/36, so that H(X) = H(Y) = 1.792, H(X,Y) = 3.584 and I(X,Y) = 0. Right panel: two perfectly dependent variables, so that H(X) = H(Y) = 1.792, H(X,Y) = 1.792 and I(X,Y) = 1.792.


Definition and properties

Figure: Examples of Shannon entropy calculation (two further joint distributions, whose cells are omitted here). Left panel: H(X) = H(Y) = 1.683, H(X,Y) = 2.774 and I(X,Y) = 0.593. Right panel: H(X) = 1.658, H(Y) = 1.328 and I(X,Y) = 0.750.


Application to scoring

Let S and Y be the score and the control variable


For instance, Y is a binary random variable that may indicate a bad
credit (Y = 0) or a good credit (Y = 1)
We consider the following decision rule:
S ≤ 0 ⇒ S⋆ = 0
S > 0 ⇒ S⋆ = 1


Application to scoring

We note ni,j the number of observations such that S⋆ = i and Y = j
We obtain the following system (S⋆, Y):

          Y = 0   Y = 1
S⋆ = 0    n0,0    n0,1
S⋆ = 1    n1,0    n1,1

where n = n0,0 + n0,1 + n1,0 + n1,1 is the total number of observations
The hit rate is the ratio of good bets:
$$H = \frac{n_{0,0} + n_{1,1}}{n}$$
This statistic can be viewed as an information measure of the system
(S, Y )
When there are more states, we can consider the Shannon entropy
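The hit rate can be read directly off the contingency table; a minimal sketch, using made-up counts (chosen to match the order of magnitude of the confusion matrices shown later in this section):

```python
# Hit rate H = (n00 + n11) / n of a binary decision rule -- a minimal
# sketch (the counts below are made up for illustration)
n = [[386, 616],     # n00, n01: S* = 0 (predicted bad)
     [1614, 7384]]   # n10, n11: S* = 1 (predicted good)
total = sum(sum(row) for row in n)
hit_rate = (n[0][0] + n[1][1]) / total
print(total, round(hit_rate, 3))   # -> 10000 0.777
```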


Application to scoring

Figure: Scorecards S1 and S2 (contingency tables of score classes s1, . . . , s6 versus states y1, . . . , y5; the cells are omitted here). For S1: H(S1) = 1.767, H(Y) = 1.609, H(S1,Y) = 2.614 and I(S1,Y) = 0.763. For S2: H(S2) = 1.771, H(Y) = 1.609, H(S2,Y) = 2.745 and I(S2,Y) = 0.636.

Graphical methods

We assume that the control variable Y can take two values


Y = 0 corresponds to a bad risk (or bad signal)
Y = 1 corresponds to a good risk (or good signal)


Graphical methods

We assume that the probability Pr {Y = 1 | S ≥ s} is increasing with


respect to the level s ∈ [0, 1], which corresponds to the rate of
acceptance.
We deduce that the decision rule is the following:
if the score of the observation is above the threshold s, the
observation is selected;
if the score of the observation is below the threshold s, the
observation is not selected.
If s is equal to one, we select no observation
If s is equal to zero, we select all the observations


Performance curve

The performance curve is the parametric function y = P(x) defined by:
$$\begin{cases} x(s) = \Pr\{S \ge s\} \\ y(s) = \dfrac{\Pr\{Y = 0 \mid S \ge s\}}{\Pr\{Y = 0\}} \end{cases}$$
where x(s) corresponds to the proportion of selected observations and y(s) corresponds to the ratio between the proportion of selected bad risks and the proportion of bad risks in the population
The score is efficient if the ratio is below one
If y (s) > 1, the score selects more bad risks than those we can find in
the population
If y (s) = 1, the score is random and the performance is equal to zero.
In this case, the selected population is representative of the total
population


Selection curve

The selection curve is the parametric curve y = S(x) defined by:
$$\begin{cases} x(s) = \Pr\{S \ge s\} \\ y(s) = \Pr\{S \ge s \mid Y = 0\} \end{cases}$$
where y(s) corresponds to the proportion of observations that are wrongly selected
By construction, we would like the curve y = S(x) to lie below the bisecting line y = x, so that Pr{S ≥ s | Y = 0} < Pr{S ≥ s}


Performance and selection curves

We have:
$$\Pr\{S \ge s \mid Y = 0\} = \frac{\Pr\{S \ge s, Y = 0\}}{\Pr\{Y = 0\}} = \Pr\{S \ge s\} \cdot \frac{\Pr\{S \ge s, Y = 0\}}{\Pr\{S \ge s\} \Pr\{Y = 0\}} = \Pr\{S \ge s\} \cdot \frac{\Pr\{Y = 0 \mid S \ge s\}}{\Pr\{Y = 0\}}$$
The performance and selection curves are related as follows:
$$S(x) = x\,P(x)$$
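Both curves can be computed empirically from a scored sample, and the identity holds exactly by construction; a minimal sketch with simulated scores (the Gaussian score distributions are made up for illustration):

```python
# Empirical selection and performance curves, and a numerical check of the
# identity S(x) = x P(x) -- a minimal sketch with simulated scores
import random

random.seed(0)
# good risks (Y = 1) tend to receive higher scores than bad risks (Y = 0)
data = [(random.gauss(1.0, 1.0), 1) for _ in range(500)]
data += [(random.gauss(-1.0, 1.0), 0) for _ in range(500)]
n = len(data)
n0 = sum(1 for _, y in data if y == 0)

def curves(s):
    # returns x(s), S(x) and P(x) at the threshold s
    selected = [y for score, y in data if score >= s]
    x = len(selected) / n                       # x(s) = Pr{S >= s}
    bad = sum(1 for y in selected if y == 0)
    sel = bad / n0                              # S(x) = Pr{S >= s | Y = 0}
    perf = (bad / len(selected)) / (n0 / n)     # P(x) = Pr{Y=0|S>=s}/Pr{Y=0}
    return x, sel, perf

x, sel, perf = curves(0.0)
print(abs(sel - x * perf) < 1e-9)   # -> True: S(x) = x P(x)
print(sel < x)                      # the score is efficient at this threshold
```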


Discriminant curve

The discriminant curve is the parametric curve y = D(x) defined by:
$$D(x) = g_1\left(g_0^{-1}(x)\right)$$
where:
$$g_y(s) = \Pr\{S \ge s \mid Y = y\}$$
It represents the proportion of good risks in the selected population
with respect to the proportion of bad risks in the selected population
The score is said to be discriminant if the curve y = D (x) is located
above the bisecting line y = x
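An empirical version of D(x) can be obtained by inverting g0 over the observed thresholds; a minimal sketch with made-up score samples:

```python
# Empirical discriminant curve D(x) = g1(g0^{-1}(x)) -- a minimal sketch
# (the two score samples are made up; g_y(s) = Pr{S >= s | Y = y})
bad = [0.1, 0.2, 0.3, 0.5, 0.6]    # scores of the Y = 0 population
good = [0.4, 0.6, 0.7, 0.8, 0.9]   # scores of the Y = 1 population

def g(sample, s):
    return sum(1 for v in sample if v >= s) / len(sample)

def D(x):
    # smallest observed threshold s such that g0(s) <= x, then g1(s)
    grid = sorted(set(bad + good))
    s = min((t for t in grid if g(bad, t) <= x), default=grid[-1])
    return g(good, s)

for x in (0.2, 0.4, 0.8):
    print(x, D(x))   # a discriminant score satisfies D(x) >= x
```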


Some properties

1. The performance curve (respectively, the selection curve) is located below the line y = 1 (respectively, the bisecting line y = x) if and only if cov(f(Y), g(S)) ≥ 0 for any increasing functions f and g
2. The performance curve is increasing if and only if cov(f(Y), g(S) | S ≥ s) ≥ 0 for any increasing functions f and g, and any threshold level s
3. The selection curve is convex if and only if E[f(Y) | S = s] is increasing with respect to the threshold level s for any increasing function f
4. We can show that (3) ⇒ (2) ⇒ (1)


Some properties

A score is perfect or optimal if there is a threshold level s⋆ such that Pr{Y = 1 | S ≥ s⋆} = 1 and Pr{Y = 0 | S < s⋆} = 1
It separates the population between good and bad risks
Graphically, the selection curve of a perfect score is equal to:
$$y = \mathbb{1}\{x > \Pr\{Y = 1\}\} \cdot \left(1 + \frac{x - 1}{\Pr\{Y = 0\}}\right)$$
Using the relationship S(x) = xP(x), we deduce that the performance curve of a perfect score is given by:
$$y = \mathbb{1}\{x > \Pr\{Y = 1\}\} \cdot \frac{x - \Pr\{Y = 1\}}{x \cdot \Pr\{Y = 0\}}$$
For the discriminant curve, a perfect score satisfies D(x) = 1
When the score is random, we have S(x) = D(x) = x and P(x) = 1


Some properties

Figure: Performance, selection and discriminant curves


Some properties

The score S1 is more performing on the population P1 than the score


S2 on the population P2 if and only if the performance (or selection)
curve of (S1 , P1 ) is below the performance (or selection) curve of
(S2 , P2 )
The score S1 is more discriminatory on the population P1 than the
score S2 on the population P2 if and only if the discriminant curve of
(S1 , P1 ) is above the discriminant curve of (S2 , P2 )


Some properties

Figure: The score S1 is better than the score S2


Some properties

Figure: Illustration of the partial ordering between two scores


Kolmogorov-Smirnov test

We consider the cumulative distribution functions:

F0 (s) = Pr {S ≤ s | Y = 0}

and:
F1 (s) = Pr {S ≤ s | Y = 1}
The score S is relevant if we have the stochastic dominance order F0 ≽ F1, that is F0(s) ≥ F1(s) for all s
In this case, the score quality is measured by the Kolmogorov-Smirnov statistic:
$$KS = \max_s |F_0(s) - F_1(s)|$$

It takes the value 1 if the score is perfect


The KS statistic may be used to verify that the score is not random
(H0 : KS = 0)
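Empirically, the maximum of |F0 − F1| is attained on the pooled sample of observed scores; a minimal sketch with made-up score samples:

```python
# Kolmogorov-Smirnov statistic KS = max_s |F0(s) - F1(s)| -- a minimal
# sketch (the score samples are made up; F_y(s) = Pr{S <= s | Y = y})
bad = [0.1, 0.2, 0.3, 0.5, 0.6]    # scores with Y = 0
good = [0.4, 0.6, 0.7, 0.8, 0.9]   # scores with Y = 1

def F(sample, s):
    return sum(1 for v in sample if v <= s) / len(sample)

grid = sorted(set(bad + good))     # the maximum is attained on the pooled sample
ks = max(abs(F(bad, s) - F(good, s)) for s in grid)
print(round(ks, 3))                # -> 0.6
```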


Kolmogorov-Smirnov test

Figure: Comparison of the distributions F0 (s) and F1 (s)


Gini coefficient
The Lorenz curve

Let X and Y be two random variables


The Lorenz curve y = L (x) is the parametric curve defined by:

x = Pr {X ≤ x}
y = Pr {Y ≤ y | X ≤ x}

In economics, x represents the proportion of individuals that are


ranked by income while y represents the proportion of income
In this case, the Lorenz curve is a graphical representation of the
distribution of income and is used for illustrating inequality of the
wealth distribution between individuals


Gini coefficient
The Lorenz curve

Figure: An example of Lorenz curve


Gini coefficient
Definition

We define the Gini coefficient by:
$$\mathrm{Gini}(L) = \frac{A}{A + B}$$
where A is the area between the Lorenz curve and the curve of perfect equality, and B is the area between the curve of perfect concentration and the Lorenz curve
By construction, we have 0 ≤ Gini(L) ≤ 1
The Gini coefficient is equal to zero in the case of perfect equality and one in the case of perfect concentration
We have:
$$\mathrm{Gini}(L) = 1 - 2\int_0^1 L(x)\,\mathrm{d}x$$
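The integral formula lends itself to a direct numerical check with the trapezoidal rule; a minimal sketch (the Lorenz curves below are made up):

```python
# Gini coefficient of a Lorenz curve located below the diagonal, using
# Gini(L) = 1 - 2 * integral of L over [0, 1] -- a minimal sketch
def gini(points):
    # points: list of (x, L(x)) pairs with x increasing from 0 to 1;
    # the integral is approximated by the trapezoidal rule
    area = sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))
    return 1.0 - 2.0 * area

equality = [(i / 10.0, i / 10.0) for i in range(11)]   # L(x) = x
concentrated = [(0.0, 0.0), (0.5, 0.2), (1.0, 1.0)]    # some inequality
print(round(abs(gini(equality)), 6))                   # -> 0.0
print(round(gini(concentrated), 6))                    # -> 0.3
```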


Gini coefficient
Application to credit scoring

We can interpret the selection curve as a Lorenz curve


We recall that F (s) = Pr {S ≤ s}, F0 (s) = Pr {S ≤ s | Y = 0} and
F1 (s) = Pr {S ≤ s | Y = 1}
The selection curve is defined by the following parametric coordinates:

x (s) = 1 − F (s)
y (s) = 1 − F0 (s)

The selection curve measures the capacity of the score for not
selecting bad risks
We could also build the Lorenz curve that measures the capacity of
the score for selecting good risks:

x (s) = Pr {S ≥ s} = 1 − F (s)
y (s) = Pr {S ≥ s | Y = 1} = 1 − F1 (s)

It is called the precision curve



Gini coefficient
Application to credit scoring

Another popular graphical tool is the receiver operating characteristic (or ROC curve), which is defined by:
$$\begin{cases} x(s) = \Pr\{S \ge s \mid Y = 0\} = 1 - F_0(s) \\ y(s) = \Pr\{S \ge s \mid Y = 1\} = 1 - F_1(s) \end{cases}$$
When the Lorenz curve L is located above the bisecting line (as the precision and ROC curves are), the Gini coefficient becomes:
$$\mathrm{Gini}(L) = 2\int_0^1 L(x)\,\mathrm{d}x - 1$$
The Gini coefficient of the score S is then computed as follows:
$$\mathrm{Gini}^\star(S) = \frac{\mathrm{Gini}(L)}{\mathrm{Gini}(L^\star)}$$
where L⋆ is the Lorenz curve associated to the perfect score
An alternative to the Gini coefficient is the AUC measure, which corresponds to the area under the ROC curve:
$$\mathrm{Gini}(\mathrm{ROC}) = 2 \times \mathrm{AUC}(\mathrm{ROC}) - 1$$
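The AUC can be computed without tracing the ROC curve, since it equals the probability that a randomly chosen good risk scores above a randomly chosen bad risk (the rank, or Mann-Whitney, statistic); a minimal sketch with made-up score samples:

```python
# AUC as the rank (Mann-Whitney) statistic Pr{S1 > S0} and the resulting
# Gini coefficient Gini(ROC) = 2 AUC - 1 -- a minimal sketch
bad = [0.1, 0.2, 0.3, 0.5, 0.6]    # scores with Y = 0
good = [0.4, 0.6, 0.7, 0.8, 0.9]   # scores with Y = 1

pairs = [(g, b) for g in good for b in bad]
auc = sum(1.0 if g > b else 0.5 if g == b else 0.0
          for g, b in pairs) / len(pairs)
gini_roc = 2.0 * auc - 1.0
print(auc, round(gini_roc, 3))     # -> 0.9 0.8
```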

Gini coefficient
Application to credit scoring

Figure: Selection, precision and ROC curves


Choice of the optimal cut-off


Confusion matrix

A confusion matrix is a special case of contingency matrix
Each row of the matrix represents the frequency in a predicted class while each column represents the frequency in an actual class
Using the test set, it takes the following form:

          Y = 0                 Y = 1
S < s     n0,0                  n0,1
S ≥ s     n1,0                  n1,1
          n0 = n0,0 + n1,0      n1 = n0,1 + n1,1

where ni,j represents the number of observations of the cell (i, j)


Choice of the optimal cut-off


Confusion matrix

We notice that each cell of this table can be interpreted as follows:
(S < s, Y = 0): it is rejected and it is a bad risk (true negative)
(S < s, Y = 1): it is rejected, but it is a good risk (false negative)
(S ≥ s, Y = 0): it is accepted, but it is a bad risk (false positive)
(S ≥ s, Y = 1): it is accepted and it is a good risk (true positive)

The cells (S < s, Y = 0) and (S ≥ s, Y = 1) correspond to


observations that are well-classified: true negative (TN) and true
positive (TP)
The cells (S ≥ s, Y = 0) and (S < s, Y = 1) correspond to two types
of errors:
1 a false positive (FP) can induce a future loss, because it may default:
this is a type I error
2 a false negative (FN) potentially corresponds to a loss of a future
P&L: this is a type II error

Choice of the optimal cut-off


Classification ratios

We have:
True positive rate: TPR = TP / (TP + FN)
False negative rate: FNR = FN / (FN + TP) = 1 − TPR
True negative rate: TNR = TN / (TN + FP)
False positive rate: FPR = FP / (FP + TN) = 1 − TNR
The true positive rate (TPR) is also known as the sensitivity or the recall
It measures the proportion of real good risks that are correctly predicted as good risks


Choice of the optimal cut-off


Classification ratios

The precision or the positive predictive value (PPV) is:
PPV = TP / (TP + FP)
It measures the proportion of predicted good risks that are real good risks
The accuracy considers the classification of both negatives and positives:
ACC = (TP + TN) / (P + N) = (TP + TN) / (TP + FN + TN + FP)
The F1 score is the harmonic mean of precision and sensitivity:
F1 = 2 / (1/PPV + 1/TPR) = (2 · PPV · TPR) / (PPV + TPR)
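These ratios follow mechanically from the four cells of a confusion matrix; a minimal sketch, checked against the S1 scorecard at cut-off s = 100 (TN = 386, FN = 616, FP = 1614, TP = 7384, as in the tables of this section):

```python
# Binary classification ratios from a confusion matrix -- a minimal sketch
# (the counts are those of the S1 scorecard at cut-off s = 100)
def ratios(tn, fn, fp, tp):
    tpr = tp / (tp + fn)                    # sensitivity (recall)
    tnr = tn / (tn + fp)                    # specificity
    ppv = tp / (tp + fp)                    # precision
    acc = (tp + tn) / (tp + fn + tn + fp)   # accuracy
    f1 = 2.0 * ppv * tpr / (ppv + tpr)      # harmonic mean of PPV and TPR
    return tpr, tnr, ppv, acc, f1

tpr, tnr, ppv, acc, f1 = ratios(tn=386, fn=616, fp=1614, tp=7384)
print([round(100.0 * v, 1) for v in (tpr, tnr, ppv, acc, f1)])
# -> [92.3, 19.3, 82.1, 77.7, 86.9]
```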


Choice of the optimal cut-off


Classification ratios

Table: Confusion matrix of three scoring systems and three cut-off values s (each 2 × 2 block reads: first row S < s, second row S ≥ s; first column Y = 0, second column Y = 1)

Score        s = 100        s = 200        s = 500
S1           386    616     698   1304     1330   3672
             1614  7384     1302  6696      670   4328
S2           372    632     700   1304     1386   3616
             1628  7368     1300  6696      614   4384
S3           382    616     656   1344     1378   3624
             1618  7384     1344  6656      622   4376
Perfect     1000      0     2000     0     2000   3000
            1000   8000        0  8000        0   5000

Choice of the optimal cut-off


Classification ratios

Table: Binary classification ratios (in %) of the three scoring systems

Score s TPR FNR TNR FPR PPV ACC F1


100 92.3 7.7 19.3 80.7 82.1 77.7 86.9
S1 200 83.7 16.3 34.9 65.1 83.7 73.9 83.7
500 54.1 45.9 66.5 33.5 86.6 56.6 66.6
100 92.1 7.9 18.6 81.4 81.9 77.4 86.7
S2 200 83.7 16.3 35.0 65.0 83.7 74.0 83.7
500 54.8 45.2 69.3 30.7 87.7 57.7 67.5
100 92.3 7.7 19.1 80.9 82.0 77.7 86.9
S3 200 83.2 16.8 32.8 67.2 83.2 73.1 83.2
500 54.7 45.3 68.9 31.1 87.6 57.5 67.3
100 100.0 0.0 50.0 50.0 88.9 90.0 94.1
Perfect 200 100.0 0.0 100.0 0.0 100.0 100.0 100.0
500 62.5 37.5 100.0 0.0 100.0 70.0 76.9


Choice of the optimal cut-off


Classification ratios

Table: Best scoring system

Cut-off TPR FNR TNR FPR PPV ACC F1


100 S1 /S3 S1 /S3 S1 S1 S1 S1 S1
200 S1 /S2 S1 /S2 S2 S2 S2 S2 S2
500 S2 S2 S2 S2 S2 S2 S2

The method of scoring
Statistical methods
Performance evaluation criteria and score consistency

Exercises

Exercise 15.4.5 – Two-class separation maximization


Exercise 15.4.6 – Maximum likelihood estimation of the probit model


References

Gouriéroux, C., and Jasiak, J. (2007)


The Econometrics of Individual Risk: Credit, Insurance, and
Marketing, Princeton University Press.
Roncalli, T. (2020)
Handbook of Financial Risk Management, Chapman and Hall/CRC
Financial Mathematics Series, Chapter 15.
