SSRN Id2944341

MLEMVD: A R Package for Maximum Likelihood
Estimation of Multivariate Diffusion Models

Matthew Dixon
Stuart School of Business
Illinois Institute of Technology
[email protected]
November 18th, 2017
Abstract
Continuous-time Markov processes are typically defined by stochastic differential equations
describing the evolution of one or more state variables. Maximum likelihood estimation
of the model parameters from historical observations is only possible when at least one of the
state variables is observable. In these cases, the form of the transition function corresponding
to the stochastic differential equations must be known to assess the efficacy of fitting a
continuous model to discrete samples. This paper makes two contributions: (i) we describe a
new R package, MLEMVD, available from https://github.com/mfrdixon/MLEMVD, for
calibrating general multivariate diffusion models using maximum likelihood estimates; and
(ii) we present an algorithm for calibrating the Heston model to option prices using maximum
likelihood estimation and assess the robustness of the approach using Monte Carlo simulation.
1 Introduction
Continuous-time Markov processes are typically defined by stochastic differential equations, de-
scribing the evolution of one or more state variables. Maximum likelihood estimation of the
model parameters to historical observations is only possible when at least one of the state variables
is observable. In these cases, the form of the transition function corresponding to the stochastic
differential equations must be known to assess the efficacy of fitting a continuous model to dis-
crete samples.
Aït-Sahalia [1] provides closed-form expansions for the likelihood function of a general class of
univariate diffusion models. The same author later extended the approach to multivariate diffusion
models [2] and, in particular, to stochastic volatility models [3]. The latter work describes an
approach for the case where only one of the state variables is observed in financial time series and
the other state variable is estimated from both the observed state variable and the corresponding
at-the-money constant maturity option prices. The approach is applied to calibrate the Heston
model [7], a model which has received considerable attention in the context of calibration owing to
the many practical challenges and material defects. However, the authors of [3], like those of other
similar approaches, most notably [10], only consider the calibration of the Heston model to
observations of the stock and the at-the-money option over time.
It is important for option pricing models to be calibrated to the surface of implied volatilities
across different moneyness and maturities. Traders typically first calibrate their models to liquid
vanilla options, and then use the calibrated models to price exotic options, to hedge their trading
books, and to determine the relative ’cheapness and expensiveness’ of options being offered in
the market when making trading decisions. Failure to calibrate one’s model properly could result
in mispricing in customer trades, losses due to inaccurate hedge ratios, or being ’picked-on’
(becoming a victim of arbitrage) by other traders.
In addition, it is also important for the pricing model to be consistent with the stochastic
evolution of the implied volatility surface over time. There is also a danger of over-fitting to in-
sample data. Ideally, one would like the calibrated parameters to be stable over time, unless there
is a regime change in the market that justifies markedly different parameters.
Practitioners use least squares errors to calibrate the Heston model to the surface of implied
volatilities. The authors draw attention to the fact that the calibration procedure is non-trivial – it
Electronic copy available at: https://ssrn.com/abstract=2944341
is a non-linear programming problem with a non-linear constraint and non-convex objective func-
tion. Since multiple local-minima may exist, [11] propose using a combination of global search
and local optimizers. The authors further note that the use of common stochastic algorithms for
global search, such as simulated annealing, generally renders the calibration problem more com-
putationally burdensome and unstable. The global optimizers that the authors consider include
the differential evolution (DE) algorithm and simulated annealing (SA), both of which have been
employed elsewhere in the quantitative finance literature [4].
The work of Aït-Sahalia provides a more rigorous alternative to calibrating by least squares,
replacing a non-smooth, non-convex or non-concave objective function with smooth convex or
concave marginal likelihood functions. The calibration of the Heston model to at-the-money
option prices is not without its own share of numerical stability challenges, in particular in
regions where one or more components of the Jacobian vanish. A numerical study is therefore
required to assess the robustness of estimating Heston model parameters from option prices.
Overview This paper makes two contributions. First, we provide an R package, MLEMVD, for
implementing the maximum likelihood estimation of general multivariate diffusions [1]. An example
of how to use MLEMVD for pricing the Heston model is given in Section A. We bring to
the reader's attention that a Matlab implementation, accompanying [1], is available for calibrating
multivariate diffusions to state vectors. However, that implementation only supports the case
where the state vector is fully observed. As such, it must be adapted to calibrate to option prices.
The approach for calibrating the Heston model to option prices, described in [3], requires a
numerically robust and efficient implementation. Our second contribution is to describe and
evaluate such an implementation in Sections 4 and 5.
We begin in the next sections with a review of maximum likelihood estimation for diffusion
models as described in [1]. Then, in Section 3.3, we review the method of [3] for approximating
likelihood functions for option prices. We then turn to the computational aspects of the approach,
first reviewing in Section 3.4 an efficient implementation of the Heston pricing model that far
outperforms FFT, before proceeding to the description of the calibration and presenting numerical
results evaluating the approach applied to simulated ATM options. In future work we seek to
extend this approach to calibrating to historical observations of the implied volatility surface.
2 Maximum Likelihood Estimation

The principle of maximum likelihood estimation (MLE), originally developed by R.A. Fisher a
century ago and presented in 1922 [6], states that the desired parametric probability distribution
is the one that renders the observed data most probable. The maximum likelihood estimator
(MLE) is the parameter vector that maximizes the likelihood function. We shall now introduce
the necessary terminology and notation to explain maximum likelihood estimation of general
diffusion processes.
Let $i$ index the observations, whose values are $x_i$. Let $x \mapsto f_i(x \mid p)$ be a smooth positive
density parametrized by $p \in \mathbb{R}^m$. Let the $X_i$ be independent random variables with densities
$f_i(\cdot \mid p)$, which are not necessarily identically distributed.
The data is modeled as observed values of $X_i$ for $i \in \{1, 2, \dots, n\}$. The log-likelihood function is
$$L(p) = \sum_{i=1}^{n} \log f_i(X_i \mid p). \tag{1}$$
The first and second partial derivatives of $L$ with respect to $p$ are referred to as the score and the
Hessian and are given by
$$D(p) = \frac{\partial L}{\partial p} \tag{2}$$
and
$$H(p)_{ij} = \frac{\partial^2 L}{\partial p_i \, \partial p_j}. \tag{3}$$
In the absence of model specification error, we first consider the curvature of the log likelihood
function at the stationary point. A large curvature represents more confidence in the MLE and
hence a lower standard error. The curvature is represented by the information matrix, the negative
of the expected value of the Hessian matrix:
$$I(p) = -E[H(p)]. \tag{4}$$
The variance-covariance matrix of the parameter is
$$\mathrm{var}(p) = [I(p)]^{-1}. \tag{5}$$
The standard errors of the estimator are just the square roots of the diagonal terms in the variance-
covariance matrix.
By the Cramér-Rao Theorem, under certain regularity conditions on the distribution, the variance
of any unbiased estimator of a parameter $p$ must be at least as large as
$$\mathrm{var}(p) \geq [-E[H(p)]]^{-1}. \tag{6}$$
An unbiased estimator which achieves this lower bound is said to be efficient. Such a solution
achieves the lowest possible mean squared error among all unbiased methods, and is therefore the
minimum variance unbiased estimator.
The Cramér-Rao Theorem implies that the maximum likelihood estimator is (asymptotically) efficient,
but only under our assumption that the data is generated from the model, which is often too strong.
2.1 Huber Sandwich Estimator

If the model is not well-specified, but the mean function is correctly specified and the variance
function is reasonably specified, then the maximum likelihood estimator $\hat{p}$ is asymptotically
normal with the following variance-covariance matrix:
$$\mathrm{var}(\hat{p}) = [I(\hat{p})]^{-1} \, E[D(\hat{p}) D(\hat{p})^T] \, [I(\hat{p})]^{-1}. \tag{7}$$
The square roots of the diagonal entries of this matrix provide robust
standard error estimates that are asymptotically correct, even when the model is mis-specified.
This is the maximum likelihood analogue of White's consistent standard errors. The reader is
referred to [8] for a lucid interpretation of the Huber Sandwich Estimator.
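As a small illustration of Equations 5 and 7, the following R sketch fits a normal model to heavy-tailed data and compares the information-based and sandwich standard errors. The data, the choice of model and all variable names are assumptions made for this example; the observed Hessian and the sum of per-observation score outer products are used as sample estimates of the expectations, and none of this is taken from the MLEMVD source.

```r
set.seed(1)
x <- rt(500, df = 5)   # heavy-tailed sample, so the normal model is mis-specified
n <- length(x)

# Closed-form MLE of the normal model, p = (mu, sigma)
mu_hat    <- mean(x)
sigma_hat <- sqrt(mean((x - mu_hat)^2))

# Per-observation scores (n x 2 matrix): d/dmu and d/dsigma of log f(x_i | p)
scores <- cbind((x - mu_hat) / sigma_hat^2,
                -1 / sigma_hat + (x - mu_hat)^2 / sigma_hat^3)

# Hessian of the total log likelihood, evaluated at the MLE
h12 <- -2 * sum(x - mu_hat) / sigma_hat^3
H <- matrix(c(-n / sigma_hat^2, h12,
              h12, n / sigma_hat^2 - 3 * sum((x - mu_hat)^2) / sigma_hat^4), 2, 2)

info_inv <- solve(-H)              # inverse information, Equation 5
meat     <- crossprod(scores)      # sum over i of s_i s_i^T
sandwich <- info_inv %*% meat %*% info_inv   # Equation 7

se_std <- sqrt(diag(info_inv))     # standard errors
se_rob <- sqrt(diag(sandwich))     # Huber sandwich standard errors
print(rbind(se_std, se_rob))
```

For the mean parameter the two estimates coincide in this model, while for the scale parameter they differ whenever the sample kurtosis departs from that of a normal distribution.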
3 Diffusion Models
Following [1], consider the multivariate time-homogeneous Markovian diffusion of the form
$$dX_t = \mu(X_t)\,dt + \Sigma(X_t)\,dW_t \tag{8}$$
where $X_t, \mu(X_t) \in \mathbb{R}^m$, $\Sigma(X_t) \in \mathbb{R}^{m \times m}$ and $W_t \in \mathbb{R}^m$ is a vector of independent Wiener processes.

Prior to the pioneering work of [1], the log of the transition function $f_X(x \mid x_0, \Delta)$ was only
given in closed form under severe restrictions on the form of $\mu$ and $\Sigma$. We shall refer to
$v(x) := \Sigma \Sigma^T$ as the variance-covariance matrix. [1] constructs closed-form expansions for the
log-transition function for a large class of multivariate Markovian diffusions. The primary use of
such closed-form expansions is to permit the computation of the MLE without relying on less
desirable approaches that infer the log transition function numerically: solving a partial
differential equation, simulating the process to Monte Carlo integrate the transition density, or
approximating the process with binomial trees.
We observe $X$ at times $t_0, t_1, \dots, t_n$, where $\Delta$ denotes the difference between successive
observation times and is assumed constant. For this finite sample, the log-likelihood takes the form
$$l_n(p, \Delta) := \sum_{i=0}^{n-1} l_X(x_{i+1} \mid x_i, \Delta), \tag{9}$$
where $l_X := \ln f_X$ is the log of the transition density. Using a Hermite expansion of $l_X$ and
a number of transformations, [1] eventually arrives at the following compact closed-form
expression with $K$ terms:
$$l_X^{(K)}(x \mid x_0) = -\frac{m}{2}\ln(2\pi\Delta) - D_v(x) + \frac{C_X^{(-1)}(x \mid x_0)}{\Delta} + \sum_{k=0}^{K} C_X^{(k)}(x \mid x_0)\,\frac{\Delta^k}{k!}, \tag{10}$$
where
$$D_v(x) := \frac{1}{2}\ln(\mathrm{Det}[v(x)]). \tag{11}$$
We drop the $(K)$ superscript to lighten the notation slightly; our references to the likelihood
function shall refer to this closed-form approximation unless stated otherwise. Having
outlined the fundamentals of likelihood function estimation, we now turn to specific models to
illustrate and extend the approach further, starting with geometric Brownian motion and then
considering the Heston model. The MLEMVD R package currently implements over twenty univariate
and bivariate diffusion models, which are listed in Section A.1 of the Appendix.
3.1 Geometric Brownian motion

A geometric Brownian motion (GBM) is a continuous-time stochastic process in which the logarithm
of the random state variable follows a Brownian motion (also called a Wiener process)
with drift. With drift $\mu$ and volatility $\sigma$ it is given by
$$dX_t = \mu X_t\,dt + \sigma X_t\,dW_t. \tag{12}$$
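The transition function takes the form
$$f_X(x \mid x_0, t) = \frac{1}{x\,\sigma\sqrt{2\pi t}} \exp\left(-\frac{(\ln x - \ln x_0 - (\mu - \sigma^2/2)t)^2}{2\sigma^2 t}\right) \tag{13}$$
and the exact log likelihood function, evaluated over a uniform time series of $n$ observations of
the state variable with spacing $\Delta$, is
$$l_n(p, \Delta) := \sum_{i=1}^{n-1} l_X(x_{i+1} \mid x_i, \Delta) = -\frac{1}{2}\sum_{i=1}^{n-1}\left[\ln(2\pi\Delta\sigma^2 x_{i+1}^2) + \frac{(\ln[x_{i+1}/x_i] - (\mu - \sigma^2/2)\Delta)^2}{\sigma^2\Delta}\right]. \tag{14}$$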
Section 5 uses the exact likelihood function and the corresponding exact information matrix to
assess the error in the approximation approach.
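The exact GBM likelihood can be checked numerically with a few lines of base R. The sketch below simulates a weekly GBM path, maximizes the exact log likelihood of Equation 14 with a general-purpose optimizer, and compares the result with the closed-form estimator obtained from the normality of log returns. The parameter values, the starting point and the use of `optim` with its default Nelder-Mead method are assumptions for illustration and do not reproduce the package's own calibration routine.

```r
set.seed(42)
mu <- 0.5; sigma <- 0.2; delta <- 1 / 52; n <- 500

# Exact simulation of n GBM observations at spacing delta
z <- rnorm(n - 1)
x <- 100 * exp(cumsum(c(0, (mu - sigma^2 / 2) * delta + sigma * sqrt(delta) * z)))

# Exact log likelihood (Equation 14); p = c(mu, sigma)
gbm_loglik <- function(p, x, delta) {
  r      <- diff(log(x))   # log returns ln(x_{i+1} / x_i)
  x_next <- x[-1]
  -0.5 * sum(log(2 * pi * delta * p[2]^2 * x_next^2) +
             (r - (p[1] - p[2]^2 / 2) * delta)^2 / (p[2]^2 * delta))
}

# Maximize by minimizing the negative log likelihood from a deliberately poor start
fit <- optim(c(0.1, 0.5), function(p) -gbm_loglik(p, x, delta))

# Closed-form MLE from the log returns, for comparison
r        <- diff(log(x))
sigma_cf <- sqrt(mean((r - mean(r))^2) / delta)
mu_cf    <- mean(r) / delta + sigma_cf^2 / 2

print(rbind(numerical = c(fit$par[1], abs(fit$par[2])),
            closed_form = c(mu_cf, sigma_cf)))
```

The likelihood depends on the volatility only through its square, so the absolute value of the second fitted parameter is reported.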
3.2 Heston Model

Under the pricing measure $Q$, the Heston model describes the evolution of the log of the stock price
$s_t = \ln S_t$, whose variance $Y_t$ follows a mean-reverting square root process:
$$ds_t = (a + bY_t)\,dt + \sqrt{Y_t}\,dW_1^Q(t), \tag{15}$$
$$dY_t = \kappa'(\theta' - Y_t)\,dt + \sigma\sqrt{Y_t}\,dW_2^Q(t), \tag{16}$$
where
$$a = r - d, \qquad b = -\frac{1}{2}. \tag{17}$$
A key characteristic of the model is that the Wiener processes are correlated: $dW_1^Q \cdot dW_2^Q = \rho\,dt$.
This feature enables the model to exhibit the 'leverage effect'. There are five parameters in the
model:
• κ: mean-reversion rate
• θ: long-term variance
• σ: volatility of variance
• ρ: instantaneous correlation between $dW_1^Q$ and $dW_2^Q$
• y0: initial variance
In this paper, we assume that the unknown parameter set is $p := [\kappa, \theta, \sigma, \rho]$ and that a proxy for
the initial variance exists. Bounds are placed on each parameter so that $p$ lies in a four-dimensional
feasible region $F \subset \mathbb{R}^4$. The unknown parameters must also satisfy a non-linear constraint,
known as the 'Feller condition', $2\kappa\theta - \sigma^2 > 0$, to ensure that $Y_t$ is always positive.
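The dynamics in Equations 15 and 16 can be simulated with a simple Euler scheme. The R sketch below is a minimal illustration: the function name `simulate_heston`, the full-truncation treatment of negative variance and the parameter values are assumptions made for this example, not the simulation code shipped with MLEMVD.

```r
# Euler simulation of the log price s and variance y at n observation times,
# using `substeps` intermediate steps between observations
simulate_heston <- function(n, delta, s0, y0, r, d, kappa, theta, sigma, rho,
                            substeps = 10) {
  stopifnot(2 * kappa * theta - sigma^2 > 0)   # Feller condition
  h <- delta / substeps
  s <- numeric(n + 1); y <- numeric(n + 1)
  s[1] <- s0; y[1] <- y0
  s_cur <- s0; y_cur <- y0
  for (i in seq_len(n)) {
    for (j in seq_len(substeps)) {
      z1 <- rnorm(1)
      z2 <- rho * z1 + sqrt(1 - rho^2) * rnorm(1)   # correlated shocks
      y_pos <- max(y_cur, 0)   # full truncation: use max(y, 0) in drift and diffusion
      s_cur <- s_cur + (r - d - 0.5 * y_pos) * h + sqrt(y_pos * h) * z1
      y_cur <- y_cur + kappa * (theta - y_pos) * h + sigma * sqrt(y_pos * h) * z2
    }
    s[i + 1] <- s_cur
    y[i + 1] <- max(y_cur, 0)
  }
  data.frame(t = (0:n) * delta, s = s, y = y)
}

set.seed(7)
path <- simulate_heston(n = 50, delta = 1 / 52, s0 = log(100), y0 = 0.2,
                        r = 0.02, d = 0, kappa = 3, theta = 0.2,
                        sigma = 0.25, rho = -0.8)
head(path)
```

The discretized variance can still dip below zero even when the Feller condition holds, which is why the truncation is applied inside the loop.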
3.3 Likelihood function estimation
Consider, for a moment, a fully observed state vector $X_t := [s_t; Y_t]'$ following the Heston model,
with the conditional density of $X_{t+\Delta} = x$ given $X_t = x_0$ denoted
by $f_X(\Delta, x \mid x_0; p)$. The log likelihood function for observations at times $t_0, t_1, \dots, t_n$ is given
by
$$l_n(p) = \frac{1}{n}\sum_{i=1}^{n} l_X(t_i - t_{i-1}, x_{t_i} \mid x_{t_{i-1}}; p), \tag{18}$$
where $l_X(\Delta, x \mid x_0; p) := \ln f_X(\Delta, x \mid x_0; p)$ is given in closed form. We now turn to the
problem that exists in practice: the case when $X_t := [s_t; Y_t]'$ is only partially observed and hence $l_n$
cannot be directly estimated from time series data.
Revisiting [3], we approximate the likelihood of the observed state vector $G_t := [s_t; C_t]'$,
where $C_t$ is the ATM constant maturity option price. The transition density function for the
conditional density of $G_{t+\Delta} = g$ given $G_t = g_0$ is now denoted by $f_G(\Delta, g \mid g_0; p)$ and the log
likelihood function is given by
$$l_n(p) := \frac{1}{n}\sum_{i=1}^{n} l_G(\Delta t_i, g(t_i) \mid g(t_{i-1}); p), \tag{19}$$
where $l_G(\Delta, g \mid g_0; p) := \ln f_G(\Delta, g \mid g_0; p)$.

For ease of exposition, let the stock and option prices be expressed as a function of the state
vector, $G_t = f(X_t; p)$, so that the inverse of the function gives the state vector as a function of the
option and stock prices, $X_t = f^{-1}(G_t; p)$. Under a change of variables from $G_t$ to $X_t$, the log of
the transition density $f_G$ can be expressed in terms of $f_X$ through a Jacobian $J_t$ to give:
$$l_G(\Delta, g \mid g_0; p) := \ln f_G(\Delta, g \mid g_0; p) = -\ln J_t(\Delta, g \mid g_0; p) + l_X(\Delta, f^{-1}(g; p) \mid f^{-1}(g_0; p); p). \tag{20}$$
In Section 4, we shall introduce a numerical approximation for estimating f −1 (G; p), the key
contribution of this paper which has enabled the R package to be applied to option prices. Before
we proceed to the description of the calibration, we shall review the pricing model approximation
that leads to an efficient and robust implementation.
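To make the role of the Jacobian in the change of variables concrete, the short derivation below sketches it for a map that leaves the log stock price unchanged and replaces the variance by the option price. The lower-case symbols $s$, $y$ and the notation $\partial f / \partial x'$ are introduced here for illustration only.

```latex
% Change of variables g = f(x; p), with x = (s, y)' and g = (s, C(s, y))'
f_G(\Delta, g \mid g_0; p)
  = \frac{f_X\big(\Delta, f^{-1}(g; p) \mid f^{-1}(g_0; p); p\big)}
         {\left|\det \partial f / \partial x'\right|},
\qquad
\frac{\partial f}{\partial x'} =
\begin{pmatrix} 1 & 0 \\ \partial C/\partial s & \partial C/\partial y \end{pmatrix}
\;\Rightarrow\;
J_t = \left|\frac{\partial C}{\partial y}\right| .
% Taking logarithms gives l_G = -\ln J_t + l_X
```

Because the Jacobian matrix is lower triangular with a unit entry for the stock component, its determinant reduces to the sensitivity of the option price to the variance.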
3.4 Pricing
With marginal loss of generality, we restrict the scope of this section to European equity
options. The Heston stochastic volatility model permits closed-form solutions for computing
risk-neutral European option prices. The price can be represented as a weighted sum of $P_1$, the
delta of the European call option, and $P_2$, the probability that the asset price will exceed the strike
price at maturity. Adopting standard option pricing notation, the price of a vanilla European call
option struck at $K$ and expiring at time $T$ is
$$C(S_t, Y_t, K, \tau; p) = S_t P_1 - K e^{-(r-q)\tau} P_2, \tag{21}$$
where $\tau = T - t$ and $P_1$ and $P_2$ can be expressed as
$$P_j = \frac{1}{2} + \frac{1}{\pi}\int_0^{\infty} \Re\left[\frac{\phi_j(S_t, Y_t, \tau, u; p)\,e^{-iu\ln K}}{iu}\right] du, \qquad j = 1, 2, \tag{22}$$
where $\phi_j$ are the Heston analytic characteristic functions, given in a convenient form in [9],
and $p$ is the vector of Heston model parameters. Following Fang and Oosterlee [5], the entire
inverse Fourier integral in Equation 22 is reconstructed from the Fourier-cosine series expansion of
the integrand to give the following approximation of the call price:
$$C(S_t, Y_t, K, \tau; p) \approx K e^{-r\tau}\,\Re\left[\sum_{k=0}^{N-1} \phi\left(\frac{k\pi}{b-a}; p\right) e^{ik\pi\frac{x_t - a}{b-a}}\, U_k\right], \tag{23}$$
where $x_t := \ln(S_t/K)$ is the log moneyness, $\phi(w; p)$ denotes the Heston characteristic function
of the log-asset price, $U_k$ are the payoff series coefficients, $[a, b]$ is the truncation interval of
the expansion, and $N$ denotes the number of terms in the cosine series expansion (typically 128
will suffice).
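To illustrate the mechanics of the Fourier-cosine expansion without reproducing the Heston characteristic function, the R sketch below applies the same series to geometric Brownian motion, whose characteristic function is elementary, and compares the result with the Black-Scholes formula. The substitution of GBM for Heston, the function names, the truncation rule $[a, b] = x + c_1 \pm L\sqrt{c_2}$ and the call payoff coefficients (written following the standard presentation of the method in [5]) are assumptions for illustration and are not taken from the MLEMVD source.

```r
# COS approximation of a European call under GBM (stand-in for Heston)
cos_call_gbm <- function(S, K, r, sigma, tau, N = 128, L = 10) {
  x  <- log(S / K)                      # log moneyness
  c1 <- (r - 0.5 * sigma^2) * tau       # first cumulant of the log return
  c2 <- sigma^2 * tau                   # second cumulant of the log return
  a  <- x + c1 - L * sqrt(c2)           # truncation interval [a, b]
  b  <- x + c1 + L * sqrt(c2)
  stopifnot(a < 0, b > 0)               # payoff coefficients integrate over [0, b]
  k  <- 0:(N - 1)
  u  <- k * pi / (b - a)
  phi <- exp(1i * u * c1 - 0.5 * c2 * u^2)   # characteristic function of the log return
  # Cosine coefficients of exp(y) and of 1 on [0, b]
  chi <- (cos(u * (b - a)) * exp(b) - cos(u * (0 - a)) +
          u * sin(u * (b - a)) * exp(b) - u * sin(u * (0 - a))) / (1 + u^2)
  psi <- c(b, (sin(u[-1] * (b - a)) - sin(u[-1] * (0 - a))) / u[-1])
  Uk  <- 2 / (b - a) * (chi - psi)      # payoff series coefficients for a call
  terms <- Re(phi * exp(1i * u * (x - a))) * Uk
  terms[1] <- 0.5 * terms[1]            # the first term of the series is halved
  K * exp(-r * tau) * sum(terms)
}

# Black-Scholes reference price
bs_call_ref <- function(S, K, r, sigma, tau) {
  d1 <- (log(S / K) + (r + 0.5 * sigma^2) * tau) / (sigma * sqrt(tau))
  S * pnorm(d1) - K * exp(-r * tau) * pnorm(d1 - sigma * sqrt(tau))
}

cos_price <- cos_call_gbm(S = 100, K = 100, r = 0.05, sigma = 0.2, tau = 1)
bs_price  <- bs_call_ref(S = 100, K = 100, r = 0.05, sigma = 0.2, tau = 1)
print(c(cos = cos_price, black_scholes = bs_price))
```

Replacing `phi` with the Heston characteristic function, and choosing the truncation interval from the Heston cumulants, would give the pricer described in this section.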
4 Calibration
The mapping between $X_t$ and $G_t$ is given by
$$f(X_t; p) = \begin{pmatrix} s_t \\ C(S_t, Y_t, K, \Delta; p) \end{pmatrix}, \tag{24}$$
where $C(\cdot)$ is the Heston model option price defined above and $\Delta = T - t$ is the constant
time to maturity of the option. Given a sequence $\{g_{t_i}\}_{i=1}^{n}$ of observed underlying prices and
corresponding constant maturity ATM option prices, we seek the maximum likelihood
estimate $p^*$.
Since $Y_t$, as previously mentioned, is unobserved, we seek the inverse
$$f^{-1}(G_t; p) = \begin{pmatrix} s_t \\ C^{-1}(S_t, C_t, K, \Delta; p) \end{pmatrix} = \begin{pmatrix} s_t \\ Y_t \end{pmatrix} \tag{25}$$
to map from $G_t$ to $X_t$ and hence imply the volatility $y_t$ from the observed state vector $g_t$.
The inverse does not exist in closed form, so we now turn to a multi-step numerical
approximation for estimating $p^*$.
We assume that the state variable Xt is observed at time t0 only. In practice, this relies on
assuming an initial value yt0 = y0 , estimated by a filtering method. Now starting at time t = t1 ,
we generate a particular value of the parameter vector. The calibration procedure is outlined
below and then specified more precisely by Algorithm 1.
Step 1: In the first step, we fix $p$ and find the corresponding implied volatility $y^p_{t_i}$ by
minimizing the absolute error between the observed option price $c_{t_i}$ and the model price $C(\cdot)$:
$$y^p_{t_i} = \arg\min_{y} \left|c_{t_i} - C(S_{t_i}, y, K, \Delta; p)\right|. \tag{26}$$
Unlike a least-squared error problem over $p$, this objective function is convex and so has a unique
solution, independent of the initial choice of $y$ in a simple root-finding method such as a
Newton-Raphson solver.
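Step 1 amounts to a one-dimensional root search. The R sketch below shows the idea with a stand-in pricing function: a Black-Scholes call parametrized by the variance `y` replaces the Heston price, and base R's `uniroot` bracketing solver replaces the Newton-Raphson iteration. Both substitutions, the function names and the bracketing interval are assumptions made so that the example is self-contained.

```r
# Stand-in pricer (assumption): Black-Scholes call written in terms of the variance y
bs_call <- function(S, y, K, tau, r = 0) {
  sd <- sqrt(y * tau)
  d1 <- (log(S / K) + r * tau) / sd + 0.5 * sd
  S * pnorm(d1) - K * exp(-r * tau) * pnorm(d1 - sd)
}

# Imply the variance by solving model price = observed price for y
implied_variance <- function(c_obs, S, K, tau, r = 0, lower = 1e-8, upper = 25) {
  uniroot(function(y) bs_call(S, y, K, tau, r) - c_obs,
          lower = lower, upper = upper, tol = 1e-12)$root
}

# Round trip: price an ATM option at a known variance, then recover that variance
y_true <- 0.2
c_obs  <- bs_call(S = 100, y = y_true, K = 100, tau = 0.25)
y_hat  <- implied_variance(c_obs, S = 100, K = 100, tau = 0.25)
print(c(true = y_true, implied = y_hat))
```

Because the price is increasing in the variance, the root is unique inside the bracket, which is the property Step 1 relies on.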
Step 2: Using $(y^p_{t_i}, p)$ we compute the Jacobian of $G_t$ w.r.t. $X_t$ and the log of the transition
density function $l_X(\cdot)$:
$$l_G(\Delta, g_{t_i} \mid g_{t_{i-1}}; p) = -\ln J_{t_i} + l_X(\Delta, x^p_{t_i} \mid x^p_{t_{i-1}}; p) \tag{27}$$
where $x^p_{t_i} := [s_{t_i}; y^p_{t_i}]'$. When one option price at each point in time is used to calibrate the
Heston model, as is the case here, the Jacobian is shown by [3] to be equal to the vega of
the option.
Step 3: Steps 1 and 2 are now repeated for all remaining times $t_i$ and the log likelihood function
is evaluated for the combination $(y^p, p)$ using Equation 19. The value of $y^p_{t_{i-1}}$ is used to initialize
the solver for the minimization problem given by Equation 26.
Step 4: A new value of $p$ is generated and Steps 1 to 3 are repeated until the likelihood
function has been maximized by a numerical solver:
$$p^* \leftarrow \arg\min_{p \in F} \; -l_n(p). \tag{28}$$
Steps 1-3 can be expressed more rigorously in the following algorithm for approximating the
maximum likelihood from option prices. Note that for ease of exposition some of the details have
been omitted.
The calibration problem is then to minimize the negative log likelihood function under the
constraints on the parameter vector. Unlike Least-Squares based calibration, the optimization
using the log likelihood function does not contain local optima and hence is more robust. Addi-
tionally, we can provide the information matrix to characterize the uncertainty in our parameter
estimate. Large uncertainty corresponds to a flattening of the log likelihood function which can
in practice result in stagnation of the solver. For this reason, we use a differential evolution
algorithm to avoid stagnation in these regions.
Algorithm 1: LogLikelihoodFunction(p)
Input: g, K, T, y0, p0
Output: l_n(p)
  p ← p0
  l_n(p) ← 0
  y^p_{t_0} ← y0
  for i = 1 to n do
    y ← y^p_{t_{i-1}}
    y^p_{t_i} ← arg min_{y>0} |c_{t_i} − C(S_{t_i}, y, K, ∆; p)|
    J_{t_i} ← ∂C(S_{t_i}, Y_{t_i}, K, ∆; p)/∂Y_t evaluated at Y_{t_i} = y^p_{t_i}
    l_G(∆, g_{t_i} | g_{t_{i-1}}; p) ← −ln J_{t_i} + l_X(∆, x_{t_i} | x_{t_{i-1}}; p)
    l_n(p) ← l_n(p) + l_G(∆, g_{t_i} | g_{t_{i-1}}; p)
  end
  return l_n(p)
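The loop in Algorithm 1 can be sketched in R once a pricer and a transition log-density are available. In the example below both are stand-ins chosen to keep the code short: a Black-Scholes call in the variance `y` (which, unlike the Heston price, does not depend on `p`) and a first-order Euler (Gaussian) approximation of the Heston transition density with a = 0 and b = -1/2. These substitutions, the function names and the toy data are assumptions for illustration; the package itself uses the closed-form expansion and the Heston pricer described earlier.

```r
# Stand-in pricer (assumption): Black-Scholes call in the variance y, zero rates
price <- function(s, y, K, tau) {
  sd <- sqrt(y * tau)
  d1 <- (s - log(K)) / sd + 0.5 * sd
  exp(s) * pnorm(d1) - K * pnorm(d1 - sd)
}
# Vega with respect to the variance y, by central differences
vega <- function(s, y, K, tau) {
  h <- 1e-4 * y
  (price(s, y + h, K, tau) - price(s, y - h, K, tau)) / (2 * h)
}
# Stand-in transition log-density (assumption): Euler approximation of the
# Heston dynamics; p = c(kappa, theta, sigma, rho), x = c(s, y)
lx_euler <- function(delta, x, x0, p) {
  y0 <- x0[2]
  m  <- x0 + c(-0.5 * y0, p[1] * (p[2] - y0)) * delta
  V  <- matrix(c(y0, p[4] * p[3] * y0,
                 p[4] * p[3] * y0, p[3]^2 * y0), 2, 2) * delta
  e  <- x - m
  -log(2 * pi) - 0.5 * log(det(V)) - 0.5 * sum(e * solve(V, e))
}

log_likelihood <- function(p, s, c_obs, K, tau, delta, y0) {
  ll <- 0
  y_prev <- y0
  for (i in seq_len(length(s) - 1)) {
    # Step 1: imply the variance from the observed option price
    y_i <- uniroot(function(y) price(s[i + 1], y, K[i + 1], tau) - c_obs[i + 1],
                   lower = 1e-8, upper = 25, tol = 1e-12)$root
    # Step 2: Jacobian (vega) and transition log-density
    J  <- vega(s[i + 1], y_i, K[i + 1], tau)
    ll <- ll - log(J) + lx_euler(delta, c(s[i + 1], y_i), c(s[i], y_prev), p)
    y_prev <- y_i
  }
  ll
}

# Toy data: four weekly log prices with ATM strikes and known variances
s      <- log(c(100, 101, 99.5, 100.5))
y_true <- c(0.04, 0.045, 0.05, 0.042)
K      <- exp(s)
c_obs  <- price(s, y_true, K, tau = 0.25)
p      <- c(3, 0.04, 0.25, -0.8)
ll <- log_likelihood(p, s, c_obs, K, tau = 0.25, delta = 1 / 52, y0 = y_true[1])
print(ll)
```

Passing the negative of `log_likelihood` to a constrained optimizer over `p` would correspond to Step 4.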
4.1 Implementation
We implemented the above algorithm together with the Fourier-Cosine method for pricing the
Heston model in R and C++. The C++ implementations of the Heston price and vega are called
by the non-linear optimization R packages nloptr and DEoptim [13, 12]. More specifically,
we combine the DEoptim global optimizer with one of three constrained local optimization
solvers provided in the NLopt package. These optimizers are (i) the Sequential Least SQuares
Programming (SLSQP) method; (ii) the L-BFGS-B algorithm; and (iii) the Truncated Newton
(TNC) method. Each method exploits the smoothness of the error function over the feasible
region by approximating the Jacobian with first-order forward differences under perturbations of
each parameter. A small number of Hessian vectors are also computed at each main iteration
in the L-BFGS-B algorithm. The nloptr methods described above incorporate the non-linear
inequality constraint to enforce the Feller condition.
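The two numerical ingredients mentioned above, forward-difference gradients and the Feller inequality constraint, are simple to express in base R. The sketch below is illustrative only: the function names, the step-size rule and the parameter ordering `c(kappa, theta, sigma, rho)` are assumptions, and the constraint is written in the `g(p) <= 0` form commonly expected by constrained solvers rather than copied from the package.

```r
# First-order forward-difference gradient of fn at p
forward_gradient <- function(fn, p, h = 1e-6) {
  f0 <- fn(p)
  vapply(seq_along(p), function(j) {
    step <- h * max(abs(p[[j]]), 1)   # scale the perturbation to the parameter
    q <- p
    q[j] <- q[[j]] + step
    (fn(q) - f0) / step
  }, numeric(1))
}

# Feller condition 2 * kappa * theta - sigma^2 > 0, rewritten as g(p) <= 0
feller_constraint <- function(p) p[[3]]^2 - 2 * p[[1]] * p[[2]]

p <- c(kappa = 3, theta = 0.2, sigma = 0.25, rho = -0.8)
g_val  <- feller_constraint(p)                    # negative when the condition holds
g_grad <- forward_gradient(feller_constraint, p)  # approximates (-2*theta, -2*kappa, 2*sigma, 0)
print(g_val)
print(g_grad)
```

Each gradient costs one extra function evaluation per parameter, which is why the evaluation count per iteration grows with the number of model parameters.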
Since we numerically compute the gradient and Hessian, the number of function evaluations
per iteration depends on the number of model parameters. The global optimizer is
terminated if either the objective function is below a threshold or the number of iterations exceeds
a limit. The specifiable stopping criterion varies for each of the local optimizers. However, for
ease of comparison of convergence properties between each, it is possible to terminate if either the
absolute difference in function values between successive iterations is within a tolerance or the
number of function evaluations exceeds a limit. In practice, a tolerance on the absolute difference
of the function value is neither intuitive nor ideally suited to calibration. In further experiments,
not reported here, we find that specifying the tolerance on the norm of the difference in solution
iterates leads to more stable parameters over successive calibrations. Of the three aforementioned
local solvers, only the TNC method permits a tolerance of this form.
5 Results
This section describes the evaluation of maximum likelihood estimation using simulated historical
data. To assess the numerical error in the calibration, we begin by considering geometric
Brownian motion. In this case, the exact form of the likelihood function and its derivatives are
known. All approximations of the marginal log likelihood function are second order ($K = 2$).
Ten iterations of the differential evolution method are performed with 100 candidates before using
the NLOPT_LN_COBYLA constrained optimizer. To test for numerical robustness, the initial
condition for the solver is chosen to be different from the parameter vector used in the Monte
Carlo simulation.
5.1 Geometric Brownian Motion

All results are presented for the case when there are $n = 500$ simulated weekly observations
using an Euler scheme with 10 intermediate time steps between observations, for a total of
5000 time steps. The approximate marginal log likelihood functions for the state variable $S_t$ with
respect to µ (parameter 1) and σ (parameter 2) are shown in Figure 1. In each case, the red point
shows the maximum likelihood estimate. Table 1 lists the various error estimates and the amount
of numerical error in each estimated parameter. Table 2 provides further diagnostics of some of
the other estimated quantities.
[Figure 1 plots: log likelihood versus param 1 (left panel) and param 2 (right panel).]
Figure 1: This figure shows the marginal log likelihood function with respect to each parameter
of the geometric Brownian diffusion model. The mean is represented by Parameter 1 and the
volatility by Parameter 2.
                                     µ                 σ
Actual                               0.5               0.2
Estimated                            0.525             0.204
Est. Std. Error                      6.415 × 10^-2     5.975 × 10^-3
Est. Huber Sandwich Error            6.420 × 10^-2     6.624 × 10^-3
Error in Est. Std. Error             5.169 × 10^-13    9.339 × 10^-12
Error in Est. Huber Sandwich Error   4.214 × 10^-13    7.931 × 10^-12
Table 1: This table lists the correct parameters, the estimates and the standard error estimates
using 500 simulated stock prices.

Actual maximum log likelihood        1326.4464
Est. maximum log likelihood          1326.204
L2 Norm of Score Error               1.659 × 10^7
L2 Norm of Hessian Error             1.448 × 10^7
L2 Norm of Information matrix        1.962 × 10^6
Table 2: This table lists the numerical error in a selection of estimated values.
5.2 Heston Model

All results are presented for the case when there are $n = 50$ simulated weekly observations
using an Euler scheme with 10 intermediate time steps between observations, for a total of
500 time steps. The approximate marginal log likelihood functions for the state variable $G_t$ with
respect to ρ (parameter 1), κ (parameter 2), θ (parameter 3) and σ (parameter 4) are shown in
Figure 2. In each case, the red point shows the maximum likelihood estimate. Table 3 lists the
various error estimates.
[Figure 2 plots: log likelihood versus param 1, param 2, param 3 and param 4 (four panels).]
Figure 2: This figure shows the marginal log likelihood function with respect to each parameter
of the Heston model applied to the simulated underlying and option prices. Parameters 1-4 represent ρ,
κ, θ and σ.
Figures 3 and 4 show the absolute error in the calibrated model option prices and volatilities
versus the simulated states $\{g_{t_i}\}_{i=1}^{n}$ at weekly intervals.
[Figure 3 plot: abs(price.model − price.sim), weekly from 2017-01-01 to 2017-12-10; last value 0.000486872616368927.]
Figure 3: This figure shows the absolute error in the calibrated model prices versus the simulated
prices.
[Figure 4 plot: abs(vol.sim − vol.model), weekly from 2017-01-01 to 2017-12-10; last value 0.00122376798258172.]
Figure 4: This figure shows the absolute error in the implied volatilities versus the simulated
volatilities.
                        ρ                κ                θ                σ
Actual                  −0.8             3.0              0.2              0.25
Estimated               −0.81            3.4              0.201            0.2476
Std. Error              6.702 × 10^-3    1.092            1.496 × 10^-3    4.687 × 10^-2
Huber Sandwich Error    4.903 × 10^-4    9.311 × 10^-1    1.117 × 10^-2    3.356 × 10^-2
Table 3: This table lists the correct parameters, the estimates and the lower bounds on the standard
error using 50 simulated observations of the stock and the ATM option price.
6 Conclusion
Continuous-time Markov processes are typically defined by stochastic differential equations
describing the evolution of one or more state variables. Maximum likelihood estimation of the
model parameters from historical observations is only possible when at least one of the state
variables is observable. In these cases, the form of the transition function corresponding to the
stochastic differential equations must be known to assess the efficacy of fitting a continuous model
to discrete samples. This paper makes two contributions: (i) we describe an R package, MLEMVD, for
calibrating general multivariate diffusion models using maximum likelihood estimates; and (ii)
we present an algorithm for calibrating the Heston model to option prices using maximum
likelihood estimation and assess the robustness of the approach using Monte Carlo simulation. In
future work we seek to extend this approach to calibrating to historical observations of the
implied volatility surface.
A Overview of Package
The current implementation of the package supports maximum likelihood estimation for a number
of univariate and bivariate diffusion processes. The current version can be installed using the
following commands
library("devtools")
install_github("mfrdixon/MLEMVD")
All model agnostic functionality for estimating the maximum likelihood function is provided
in the core directory. All models are provided in the models directory and examples illustrating
the calibration of diffusion models to simulated data are provided in the examples directory.
See the documentation in each source file for further details of each function. The following
tables provide a brief mathematical specification of all models currently supported at the time of
writing this article.
A.1 Model Reference
Model   µ                                                   σ                                    constraints
U1      $x(a + bx)$                                         $\sigma x^{3/2}$
U2      $a + bx$                                            $dx$
U3      $b(a - x)$                                          $c x^{d}$
U4      $\kappa(\alpha - x)$                                $\sigma x^{1/2}$
U5      $\sum_{i=0}^{3} \theta_i x^i$                       $\gamma x^{\rho}$                    $\rho \geq 1$
U6      $a + bx + cx^2 + dx^3$                              $f$
U7      $\kappa(\alpha - x)$                                $\sigma$
U8      $a_{-1}/x + a_0 + a_1 x + a_2 x^2$                  $\sigma x^{\rho}$                    $\rho \geq 1$
U9      $a_{-1}/x + a_0 + a_1 x + a_2 x^2$                  $(b_0 + b_1 x + b_2 x^{b_3})^{1/2}$
U10     $a_{-1}/x + a_0 + a_1 x + a_2 x^2$                  $b_0 + b_1 x + b_2 x^{b_3}$          $\rho \geq 1$
U11     $a + bx$                                            $f + dx$
U12     $\beta/x - \alpha x^3$                              $\gamma x^{1/2}$
U13     $a_{-1}/x + a_0 + a_1 x + a_2 x^2 + a_3 x^3$        $\sigma x^{\rho}$                    $\rho \geq 1$
Table 4: The specification of various univariate diffusion models currently supported by the
package.
Model µ(x1 , x2 ) Σ(x1 , x2 ) constraints
p !
ρ (x2 ) 0

a + bx2
B1 q
c + dx2 h (1 − ρ2 )x2

a0 + a1 x1 + a2 x2 c0 + c1 x1 + c2 x2 0
B2
b0 + b1 x1 + b2 x2 0 d0 + d1 x1 + d2 x2
√ !

µ − x2 /2
x2 q 0
B3 γ γ
α + βx2 σρx2 σ 1 − ρ 2 x2
!  q 
a 0 + a 1 x2 2
p p
1−ρ a + f (x2 − a) ρ a + f (x2 − a) 
B4 βp 
b(a − x2 ) + λgx2 a + f (x2 − a) β
0 gx2
√ !

bx1
hx1 x2 0
B5 √
q
√
c − dx2 gρ x2 g 1−ρ 2 x2
√ !

m − x2 /2
x2 0
B6 q
√ √ 2a > σ 2
a − bx2 σ 1 − ρ 2 x2 σρ x2
 
2x1 2ηx1
√

0  γ x2 γ
B7 
a1 − a2 x2 √
2 x2 0
γ
dx1 ex2

a + bx1 0
B8
cx2 0 f
γ
dx1 ex2

a + bx1 0
B9
cx2 0 f

b1 (a1 − x1 ) g1 0
B10 √
b2 (a2 − x2 ) 0 g2 x2
q !
√ √
B11
k 1 + k 2 x2 1 − ρ 2 x2 ρ x2
κ(θ − x2 ) 0 σx2
cx1 ex2
!
q 0

ax1
B12
−bx2 dr d 1 − r2

b11 (a1 − x1 ) + b12 (a2 − x2 ) σ11 σ12
B13
b21 (a1 − x1 ) + b22 (a2 − x2 ) σ σ22
√ 21
k1 (x2 − x1 ) σ x1 0
B14 √
k2 (θ − x2 ) 0 σ2 x2
√
a + bx1 x1 p 0
B15
f x1 + dx2 h 1 + gx1
√
a + bx1 + gx2 x1 0
B16 √
d + ηx1 + f x2 h x2
q q
2 + nu g )(pa + a x b+d
! !
a00 − (a1 + a2 x2 )/2 + (n0 1 − g1 2 p
1 − g1 a1 + a2 x2
p
g1 a 1 + a 2 x 2
B17 1 1 1 2 2
a01 + a11 x2 + (nu1 g11 )( a1 + a2 x2 b+d b
p p
0 g11 ( a1 + a2 x2 )
g11 ex1
!
q0

b1 x1
B18
a2 + b2 x2 g22 r g22 1 − r 2
ex2
!
q0

b1 x1
B19
a2 + b2 x2 g22 r g22 1 − r 2
√ !

a1 + b1 x1
x2 0
B20 √
q
√
a2 + b2 x2 gr x2 g 1 − r 2 x2
√
a1 (b1 − x1 ) x 0
B21 √1 √
a21 (b1 − x1 ) + a2 (b2 − x2 ) g x1 g22 x1
 q 21 
√ √
1 − r 2 x2

k 1 + k 2 x2 r x2 
B22 
k(a − x2 ) 0 sx2b
Table 5: The specification of various bivariate diffusion models currently supported by the pack-
age.
References
[1] Y. Ait-Sahalia. Maximum likelihood estimation of discretely sampled diffusions: A closed-
form approximation approach. Econometrica, 70(1):223–262, 2002.
[2] Y. Ait-Sahalia. Closed-form likelihood expansions for multivariate diffusions. The Annals
of Statistics, 36(2):906–937, 2008.
[3] Y. Ait-Sahalia and R. Kimmel. Maximum likelihood estimation of stochastic volatility mod-
els. Journal of Financial Economics, 83(2):413 – 452, 2007.
[4] D. Ardia, J. David, O. Arango, and N. Gomez. Jump-Diffusion Calibration using Differen-
tial Evolution. Wilmott Magazine, 55:76–79, Sept. 2011.
[5] F. Fang and C. W. Oosterlee. A Novel Pricing Method for European Options based on
Fourier-Cosine Series Expansions. SIAM Journal on Scientific Computing, 31:826–848,
2008.
[6] R. A. Fisher. On the mathematical foundations of theoretical statistics. Philosophical
Transactions of the Royal Society of London A: Mathematical, Physical and Engineering
Sciences, 222(594-604):309–368, 1922.
[7] S. Heston. A Closed-form Solution for Options with Stochastic Volatility. Review of Finan-
cial Studies, 6:327–343, 1993.
[8] P. J. Huber. The behavior of maximum likelihood estimates under nonstandard conditions.
In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability,
Volume 1: Statistics, pages 221–233. University of California Press, 1967.
[9] J. Kienitz and D. Wetterau. Financial Modelling: Theory, Implementation and Practice with
MATLAB Source. 2013.
[10] F. Mariani, G. Pacelli, and F. Zirilli. Maximum likelihood estimation of the heston stochastic
volatility model using asset and option prices: an application of nonlinear filtering theory.
Optimization Letters, 2(2):177–222, 2008.
[11] S. Mikhailov and U. Nögel. Heston’s Stochastic Volatility Model Implementation, calibra-
tion and some extensions. Wilmott Magazine, 4:74–79, July 2003.
[12] K. Mullen, D. Ardia, D. Gil, D. Windover, and J. Cline. DEoptim: An R package for global
optimization by differential evolution. Journal of Statistical Software, 40(6):1–26, 2011.
[13] K. V. Price, R. M. Storn, and J. A. Lampinen. Differential Evolution - A Practical Ap-
proach to Global Optimization. Natural Computing. Springer-Verlag, January 2006. ISBN
540209506.
