
The Annals of Applied Probability

2011, Vol. 21, No. 1, 283–311


DOI: 10.1214/10-AAP695
© Institute of Mathematical Statistics, 2011

MULTILEVEL MONTE CARLO ALGORITHMS FOR LÉVY-DRIVEN SDES WITH GAUSSIAN CORRECTION

BY STEFFEN DEREICH
Philipps-Universität Marburg
We introduce and analyze multilevel Monte Carlo algorithms for the
computation of Ef (Y ), where Y = (Yt )t∈[0,1] is the solution of a multidi-
mensional Lévy-driven stochastic differential equation and f is a real-valued
function on the path space. The algorithm relies on approximations obtained
by simulating large jumps of the Lévy process individually and applying a
Gaussian approximation for the small jump part. Upper bounds are provided
for the worst case error over the class of all measurable real functions f that
are Lipschitz continuous with respect to the supremum norm. These upper
bounds are easily tractable once one knows the behavior of the Lévy measure
around zero.
In particular, one can derive upper bounds from the Blumenthal–Getoor
index of the Lévy process. In the case where the Blumenthal–Getoor index
is larger than one, this approach is superior to algorithms that do not apply a
Gaussian approximation. If the Lévy process does not incorporate a Wiener
process or if the Blumenthal–Getoor index β is larger than 4/3, then the upper
bound is of order τ^{-(4-β)/(6β)} when the runtime τ tends to infinity, whereas,
in the case where β lies in [1, 4/3] and the Lévy process has a Gaussian com-
ponent, we obtain bounds of order τ^{-β/(6β-4)}. In particular, the error is at
most of order τ^{-1/6}.

1. Introduction. Let d_Y ∈ ℕ and denote by D[0, 1] the Skorokhod space of
functions mapping [0, 1] to ℝ^{d_Y}, endowed with its Borel σ-field. In this article, we
analyze numerical schemes for the evaluation of
S(f ) := E[f (Y )],
where
• Y = (Y_t)_{t∈[0,1]} is a solution to a multivariate stochastic differential equation
driven by a multidimensional Lévy process (with state space ℝ^{d_Y}), and
• f : D[0, 1] → ℝ is a Borel measurable function that is Lipschitz continuous with
respect to the supremum norm.
This is a classical problem which appears, for instance, in finance, where Y models
the risk-neutral stock price and f denotes the payoff of a (possibly path-dependent)
option; in the past, several concepts have been employed for dealing with it.

Received August 2009; revised January 2010.
AMS 2000 subject classifications. Primary 60H35; secondary 60H05, 60H10, 60J75.
Key words and phrases. Multilevel Monte Carlo, Komlós–Major–Tusnády coupling, weak approximation, numerical integration, Lévy-driven stochastic differential equation.
A common stochastic approach is to perform a Monte Carlo simulation of nu-
merical approximations to the solution Y . Typically, the Euler or Milstein schemes
are used to obtain approximations. Higher-order schemes can also be applied, provided
that samples of iterated Itô integrals are supplied and the coefficients of the
equation are sufficiently regular. In general, the problem is closely related to weak
approximation which is, for instance, extensively studied in the monograph by
Kloeden and Platen [12] for diffusions.
Essentially, one distinguishes between two cases. Either f (Y ) depends only on
the state of Y at a fixed time or alternatively it depends on the whole trajectory of Y .
In the former case, extrapolation techniques can often be applied to increase the
order of convergence, see [21]. For Lévy-driven stochastic differential equations,
the Euler scheme was analyzed in [17] under the assumption that the increments of
the Lévy process are simulatable. Approximate simulations of the Lévy increments
are considered in [11].
In this article, we consider functionals f that depend on the whole trajectory.
Concerning results for diffusions, we refer the reader to the monograph [12]. For
Lévy-driven stochastic differential equations, limit theorems in distribution are
provided in [10] and [18] for the discrepancy between the genuine solution and
Euler approximations.
Recently, Giles [7, 8] (see also [9]) introduced the so-called multilevel Monte
Carlo method to compute S(f ). It is very efficient when Y is a diffusion. Indeed,
it can even be shown to be, in some sense, optimal, see [5]. For Lévy-driven
stochastic differential equations, multilevel Monte Carlo algorithms are first intro-
duced and studied in [6]. Let us explain their findings in terms of the Blumenthal–
Getoor index (BG-index) of the driving Lévy process, which is an index in [0, 2].
It measures the frequency of small jumps, see (3); a large index corresponds
to a process which has small jumps at high frequencies. In particular, all Lévy
processes with finitely many jumps on compact time intervals have BG-index zero.
Whenever the BG-index is smaller than or equal to one, the algorithms of [6] have
worst case errors at most of order τ^{-1/2} when the runtime τ tends to infinity.
Unfortunately, the efficiency decreases significantly for larger Blumenthal–Getoor indices.
Typically, it is not feasible to simulate the increments of the Lévy process perfectly,
and one needs to work with approximations. This necessity typically worsens
the performance of an algorithm when the BG-index is larger than one, due to
the higher frequency of small jumps, which represent the main bottleneck in the
simulation. In this article, we consider approximative Lévy increments that simulate
the large jumps and approximate the small ones by a normal distribution (Gaussian
approximation) in the spirit of Asmussen and Rosiński [2] (see also [4]). Whenever
the BG-index is larger than one, this approach is superior to the approach taken in
[6], which neglects small jumps in the simulation of Lévy increments.
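The decomposition behind this approach can be illustrated with a small simulation sketch. This is our own hypothetical illustration, not the algorithm analyzed below: for the symmetric one-dimensional Lévy measure ν(dx) = |x|^{-1-α} dx with α ∈ (0, 2), jumps of magnitude at least h form a compound Poisson process that is simulated exactly, while the compensated small-jump part is replaced by a centered Gaussian of matching variance. All function names are ours.

```python
import math
import random

# Illustration (not the paper's algorithm): Asmussen-Rosinski-type
# decomposition for the symmetric Levy measure nu(dx) = |x|^(-1-alpha) dx.

def nu_tail(h, alpha):
    # nu(B(0,h)^c) = 2 * int_h^inf x^(-1-alpha) dx = 2 h^(-alpha) / alpha
    return 2.0 * h ** (-alpha) / alpha

def small_jump_var(h, alpha):
    # F(h) = int_{|x|<h} x^2 nu(dx) = 2 h^(2-alpha) / (2 - alpha)
    return 2.0 * h ** (2.0 - alpha) / (2.0 - alpha)

def _poisson(lam, rng):
    # Knuth's multiplication method (adequate for small lam)
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def simulate_increment(t, h, alpha, rng):
    # Large jumps: compound Poisson with Pareto-distributed magnitudes;
    # by symmetry, the compensator of the large-jump part vanishes.
    big = 0.0
    for _ in range(_poisson(nu_tail(h, alpha) * t, rng)):
        magnitude = h * (1.0 - rng.random()) ** (-1.0 / alpha)
        big += magnitude if rng.random() < 0.5 else -magnitude
    # Small jumps: replaced by a Gaussian with variance F(h) * t
    return big + rng.gauss(0.0, math.sqrt(small_jump_var(h, alpha) * t))
```

As h decreases, the tail mass ν(B(0,h)^c), and hence the simulation cost, grows like h^{-α}, which is the trade-off analyzed in the rest of the article.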

FIG. 1. Order of convergence in dependence on the Blumenthal–Getoor index.

To be more precise, we establish a new estimate for the Wasserstein metric be-
tween an approximative solution with Gaussian approximation and the genuine
solution, see Theorem 3.1. It is based on a consequence of Zaitsev’s generalization
[22] of the Komlós–Major–Tusnády coupling [13, 14], which might be of interest
in its own right, see Theorem 6.1. With these new estimates, we analyze a class of
multilevel Monte Carlo algorithms together with a cost function which measures
the computational complexity of the individual algorithms. We provide upper error
bounds for individual algorithms and optimize the error over the parameters under
a given cost constraint. When the BG-index is larger than one, appropriately ad-
justed algorithms lead to significantly smaller worst case errors over the class of
Lipschitz functionals than the ones analyzed so far, see Theorem 1.1, Corollary 1.2
and Figure 1. In particular, one always obtains numerical schemes with errors at
most of order τ −1/6 when the runtime τ of the algorithm tends to infinity.

Notation and universal assumptions. We denote by |·| the Euclidean norm
for vectors as well as the Frobenius norm for matrices, and we let ‖·‖ denote the
supremum norm over the interval [0, 1]. X = (X_t)_{t≥0} denotes a d_X-dimensional
L²-integrable Lévy process. By the Lévy–Khintchine formula, it is characterized
by a square integrable Lévy measure ν [a Borel measure on ℝ^{d_X}\{0} with
∫|x|² ν(dx) < ∞], a positive semi-definite matrix ΣΣ* (Σ being a d_X × d_X
matrix), and a drift b ∈ ℝ^{d_X} via

$$\mathbb{E}\,e^{i\langle\theta,X_t\rangle} = e^{t\psi(\theta)},$$

where

$$\psi(\theta) = -\tfrac{1}{2}\,|\Sigma^*\theta|^2 + i\langle b,\theta\rangle + \int_{\mathbb{R}^{d_X}} \bigl(e^{i\langle\theta,x\rangle} - 1 - i\langle\theta,x\rangle\bigr)\,\nu(dx).$$

Briefly, we call X a (ν, ΣΣ*, b)-Lévy process and, when b = 0, a (ν, ΣΣ*)-Lévy
martingale. All Lévy processes under consideration are assumed to be càdlàg. As
is well known, we can represent X as the sum of three independent processes

$$X_t = \Sigma W_t + L_t + bt,$$

where W = (W_t)_{t≥0} is a d_X-dimensional Wiener process and L = (L_t)_{t≥0} is an
L²-martingale that comprises the compensated jumps of X. We consider the integral
equation

$$(1)\qquad Y_t = y_0 + \int_0^t a(Y_{s-})\,dX_s,$$

where y_0 ∈ ℝ^{d_Y} is a fixed deterministic initial value. We impose the standard
Lipschitz assumption on the function a : ℝ^{d_Y} → ℝ^{d_Y×d_X}: for a fixed K < ∞ and all
y, y′ ∈ ℝ^{d_Y}, one has

$$|a(y) - a(y')| \le K\,|y - y'| \quad\text{and}\quad |a(y_0)| \le K.$$

Furthermore, we assume without further mentioning that

$$\int |x|^2\,\nu(dx) \le K^2,\qquad |\Sigma| \le K \quad\text{and}\quad |b| \le K.$$
We refer to the monographs [3] and [20] for details concerning Lévy processes.
Moreover, a comprehensive introduction to the stochastic calculus for discon-
tinuous semimartingales and, in particular, Lévy processes can be found in [16]
and [1].
In order to approximate the small jumps of the Lévy process, we need to impose
a uniform ellipticity assumption.

ASSUMPTION UE. There are h̄ ∈ (0, 1], ϑ ≥ 1 and a linear subspace H of
ℝ^{d_X} such that, for all h ∈ (0, h̄], the Lévy measure ν|_{B(0,h)} is supported on H and
satisfies

$$\frac{1}{\vartheta}\int_{B(0,h)} \langle y,x\rangle^2\,\nu(dx) \;\le\; \int_{B(0,h)} \langle y',x\rangle^2\,\nu(dx) \;\le\; \vartheta \int_{B(0,h)} \langle y,x\rangle^2\,\nu(dx)$$

for all y, y′ ∈ H with |y| = |y′|.

Main results. We consider a class of multilevel Monte Carlo algorithms A
together with a cost function cost : A → [0, ∞) that are introduced explicitly in
Section 2. For each algorithm Ŝ ∈ A, we denote by Ŝ(f) a real-valued random
variable representing the random output of the algorithm when applied to a given
measurable function f : D[0, 1] → ℝ. We work in the real number model of computation,
which means that we assume that arithmetic operations with real numbers
and comparisons can be done in one time unit, see also [15]. Our cost function
represents the runtime of the algorithm reasonably well when supposing that

• one can sample from the distribution ν|B(0,h)c /ν(B(0, h)c ) and the uniform dis-
tribution on [0, 1] in constant time,
• one can evaluate a at any point y ∈ RdY in constant time, and
• f can be evaluated for piecewise constant functions in less than a constant multiple
of (number of breakpoints plus one) time units.
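The third assumption can be made concrete. For a piecewise constant càdlàg path stored through its breakpoints, typical path functionals are computable in a single pass over the breakpoints. The following sketch is our own illustration (the function names and the lookback-type payoff are not taken from the paper):

```python
def sup_norm_payoff(times, values, strike):
    """Lookback-type payoff f(y) = max(sup_t |y_t| - strike, 0) for a
    piecewise constant cadlag path y_t = values[i] on [times[i], times[i+1]).
    Cost is linear in the number of breakpoints."""
    return max(max(abs(v) for v in values) - strike, 0.0)

def integral_average(times, values):
    """Computes int_0^1 y_t dt for the same path representation;
    times[0] == 0 and the path is constant after the last breakpoint."""
    total = 0.0
    for i, v in enumerate(values):
        t_next = times[i + 1] if i + 1 < len(times) else 1.0
        total += v * (t_next - times[i])
    return total
```

Both functions run in time proportional to the number of breakpoints, matching the cost assumption above.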
As pointed out below, in that case the average runtime to evaluate Ŝ(f) is less
than a constant multiple of cost(Ŝ). We analyze the minimal worst case error

$$\mathrm{err}(\tau) = \inf_{\substack{\hat S\in\mathbb{A}:\\ \mathrm{cost}(\hat S)\le\tau}}\ \sup_{f\in\mathrm{Lip}(1)} \mathbb{E}\bigl[|S(f) - \hat S(f)|^2\bigr]^{1/2},\qquad \tau\ge 1.$$

Here and elsewhere, Lip(1) denotes the class of measurable functions f : D[0, 1] → ℝ
that are Lipschitz continuous with respect to the supremum norm with coefficient one.
In this article, we use asymptotic comparisons. We write f ≈ g if 0 <
lim inf f/g ≤ lim sup f/g < ∞, and f ≲ g or, equivalently, g ≳ f, if lim sup f/g < ∞.
Our main findings are summarized in the following theorem.

THEOREM 1.1. Assume that Assumption UE is valid and let g : (0, ∞) →
(0, ∞) be a decreasing and invertible function such that, for all h > 0,

$$\int \Bigl(\frac{|x|^2}{h^2}\wedge 1\Bigr)\,\nu(dx) \le g(h)$$

and, for a fixed γ > 1,

$$(2)\qquad g\Bigl(\frac{\gamma}{2}\,h\Bigr) \ge 2\,g(h)$$

for all sufficiently small h > 0.

(I) If Σ = 0 or x^{-3/4} ≲ g^{-1}(x) as x → ∞, then

$$\mathrm{err}(\tau) \lesssim g^{-1}\bigl((\tau\log\tau)^{2/3}\bigr)\,\tau^{1/6}(\log\tau)^{2/3}\qquad\text{as } \tau\to\infty.$$

(II) If Σ ≠ 0 and g^{-1}(x) ≲ x^{-3/4} as x → ∞, then

$$\mathrm{err}(\tau) \lesssim \sqrt{\frac{\log\tau}{g^*(\tau)}}\qquad\text{as } \tau\to\infty,$$

where g^*(τ) = inf{x > 1 : x³ g^{-1}(x)² (log x)^{-1} ≥ τ}.

The class of algorithms A together with appropriate parameters which establish
the error estimates above are stated explicitly in Section 2.
In terms of the Blumenthal–Getoor index

$$(3)\qquad \beta := \inf\Bigl\{p > 0 : \int_{B(0,1)} |x|^p\,\nu(dx) < \infty\Bigr\} \in [0, 2],$$

we get the following corollary.

COROLLARY 1.2. Assume that Assumption UE is valid and that the BG-index
satisfies β ≥ 1. If Σ = 0 or β ≥ 4/3, then

$$\sup\{\gamma \ge 0 : \mathrm{err}(\tau) \lesssim \tau^{-\gamma}\} \ge \frac{4-\beta}{6\beta},$$

and, if Σ ≠ 0 and β < 4/3, then

$$\sup\{\gamma \ge 0 : \mathrm{err}(\tau) \lesssim \tau^{-\gamma}\} \ge \frac{\beta}{6\beta-4}.$$
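As a worked example (our own, for illustration): for the symmetric measure ν(dx) = |x|^{-1-α} dx with α ∈ (0, 2), one has ∫_{B(0,1)} |x|^p ν(dx) = 2∫_0^1 x^{p-1-α} dx < ∞ exactly for p > α, so β = α, and g(h) = F(h)/h² + ν(B(0,h)^c) = 2(1/(2-α) + 1/α) h^{-α} is an admissible choice in Theorem 1.1. The helper below encodes this g and the convergence orders of the corollary:

```python
def g(h, alpha):
    """g(h) = F(h)/h^2 + nu(B(0,h)^c) for nu(dx) = |x|^(-1-alpha) dx,
    which decays exactly like h^(-alpha)."""
    return 2.0 * (1.0 / (2.0 - alpha) + 1.0 / alpha) * h ** (-alpha)

def rate(alpha, has_gaussian_part):
    """Order of convergence guaranteed by Corollary 1.2 for BG-index
    alpha >= 1 (alpha plays the role of beta)."""
    if not has_gaussian_part or alpha >= 4.0 / 3.0:
        return (4.0 - alpha) / (6.0 * alpha)
    return alpha / (6.0 * alpha - 4.0)
```

For α = 2 both branches meet the worst case order 1/6 quoted in the abstract, and at α = 4/3 the two expressions agree (both give 1/3), so the guaranteed rate varies continuously in α.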

Visualization of the results and relationship to other work. Figure 1 illustrates


our findings and related results. The x-axis and y-axis represent the Blumenthal–
Getoor index and the order of convergence, respectively. Note that MLMC 0 stands
for the multilevel Monte Carlo algorithm which does not apply a Gaussian approx-
imation, see [6]. Both lines marked as MLMC 1 illustrate Corollary 1.2, where the
additional (G) refers to the case where the SDE comprises a Wiener process.
These results are to be compared with the results of Jacod et al. [11], where an
approximate Euler method is analyzed by means of weak approximation. In contrast
to our investigation, the object of that article is to compute Ef(X_T) for a fixed time
T > 0. Under quite strong assumptions (for instance, a and f have to be four times
continuously differentiable and the eighth moment of the Lévy process needs to be
finite), they provide error bounds for a numerical scheme which is based on Monte
Carlo simulation of one approximative solution. In the figure, the two lines labeled
JKMP represent the order of convergence for general, respectively pseudo-symmetric,
Lévy processes. In addition to the illustrated schemes, [11] provides an
expansion which admits a Romberg extrapolation under additional assumptions.
We stress the fact that our analysis is applicable to general path-dependent
functionals and that our error criterion is the worst case error over the class of
Lipschitz continuous functionals with respect to the supremum norm. In particular,
our class contains most of the continuous payoffs appearing in finance.
We remark that our results provide upper bounds for the inferred error; so far,
no lower bounds are known. The worst exponent appearing in our estimates
is 1/6, which we obtain for Lévy processes with Blumenthal–Getoor index 2. Interestingly,
this is also the worst exponent appearing in [19] in the context of strong
approximation of SDEs driven by subordinated Lévy processes.

Agenda. The article is organized as follows. In Section 2, we introduce a class


of multilevel Monte Carlo algorithms together with a cost function. Here, we also
provide the crucial estimate for the mean squared error which motivates the con-
sideration of the Wasserstein distance between an approximative and the genuine
solution, see (6). Section 3 states the central estimate for the former Wasserstein
distance, see Theorem 3.1. In this section, we explain the strategy of the proof
and the structure of the remaining article in detail. For the proof, we couple the
driving Lévy process with a Lévy process constituted by the large jumps plus a
Gaussian compensation of the small jumps and we write the difference between
the approximative and the genuine solution as a telescoping sum including further
auxiliary processes, see (9) and (10). The individual errors are then controlled in
Sections 4 and 5 for the terms which do not depend on the particular choice of
the coupling and in Section 7 for the error terms that do depend on the particular
choice. In between, in Section 6, we establish the crucial KMT like coupling result
for the Lévy process. Finally, in Section 8, we combine the approximation result
for the Wasserstein metric (Theorem 3.1) with estimates for strong approximation
of stochastic differential equations from [6] to prove the main results stated above.

2. Multilevel Monte Carlo. Based on a number of parameters, we define a
multilevel Monte Carlo algorithm Ŝ: we denote by m and n_1, ..., n_m natural numbers
and let ε_1, ..., ε_m and h_1, ..., h_m denote decreasing sequences of positive reals.
Formally, the algorithm Ŝ can be represented as a tuple constituted by these
parameters, and we denote by A the set of all possible choices for Ŝ. We continue
with defining processes that depend on the latter parameters. For ease of notation,
the parameters are omitted in the definitions below.
We choose a square matrix Σ^{(m)} such that

$$\bigl(\Sigma^{(m)}(\Sigma^{(m)})^*\bigr)_{i,j} = \int_{B(0,h_m)} x_i x_j\,\nu(dx).$$

Moreover, for k = 1, ..., m, we let L^{(k)} = (L^{(k)}_t)_{t≥0} denote the (ν|_{B(0,h_k)^c}, 0)-Lévy
martingale which comprises the compensated jumps of L that are larger
than h_k, that is,

$$L^{(k)}_t = \sum_{s\le t} \mathbf{1}_{\{|\Delta L_s|\ge h_k\}}\,\Delta L_s \;-\; t\int_{B(0,h_k)^c} x\,\nu(dx).$$

Here and elsewhere, we denote ΔL_t = L_t − L_{t−}. We let B = (B_t)_{t≥0} be an
independent Wiener process (independent of W and the L^{(k)}), and consider, for
k = 1, ..., m, the processes X^{(k)} = (ΣW_t + Σ^{(m)}B_t + L^{(k)}_t + bt)_{t≥0} as driving
processes. Let ϒ^{(k)} denote the solution to

$$\Upsilon^{(k)}_t = y_0 + \int_0^t a\bigl(\Upsilon^{(k)}_{s-}\bigr)\,dX^{(k)}_{\iota^{(k)}(s)},$$

where (ι^{(k)}(t))_{t≥0} is given via ι^{(k)}(t) = max(I^{(k)} ∩ [0, t]) and the set I^{(k)} is
constituted by the random times (T^{(k)}_j)_{j∈ℤ₊} that are inductively defined via T^{(k)}_0 = 0
and

$$T^{(k)}_{j+1} = \inf\bigl\{t \in \bigl(T^{(k)}_j, \infty\bigr) : |\Delta L_t| \ge h_k \text{ or } t = T^{(k)}_j + \varepsilon_k\bigr\}.$$
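The random time set just defined interleaves the large-jump times with a deterministic refresh of step ε_k. A small sketch of this construction (illustrative only; here the large-jump times are passed in as a list, whereas in an actual implementation they are themselves random):

```python
def skeleton(jump_times, eps, horizon=1.0):
    """Times T_0 = 0 and T_{j+1} = min(next large-jump time after T_j,
    T_j + eps), truncated to [0, horizon]."""
    grid = [0.0]
    jumps = sorted(t for t in jump_times if 0.0 < t <= horizon)
    while True:
        t = grid[-1]
        nxt = t + eps  # deterministic refresh after step eps ...
        for j in jumps:
            if j > t:  # ... unless a large jump occurs earlier
                nxt = min(nxt, j)
                break
        if nxt > horizon:
            return grid
        grid.append(nxt)
```

The expected number of grid points on [0, 1] is at most ν(B(0, h_k)^c) + 1/ε_k + 1, which is exactly the quantity entering the cost function below.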

Clearly, ϒ^{(k)} is constant on each interval [T^{(k)}_j, T^{(k)}_{j+1}) and one has

$$(4)\qquad \Upsilon^{(k)}_{T^{(k)}_{j+1}} = \Upsilon^{(k)}_{T^{(k)}_j} + a\bigl(\Upsilon^{(k)}_{T^{(k)}_j}\bigr)\bigl(X^{(k)}_{T^{(k)}_{j+1}} - X^{(k)}_{T^{(k)}_j}\bigr).$$

Note that we can write

$$\mathbb{E}\bigl[f\bigl(\Upsilon^{(m)}\bigr)\bigr] = \sum_{k=2}^m \mathbb{E}\bigl[f\bigl(\Upsilon^{(k)}\bigr) - f\bigl(\Upsilon^{(k-1)}\bigr)\bigr] + \mathbb{E}\bigl[f\bigl(\Upsilon^{(1)}\bigr)\bigr].$$

The multilevel Monte Carlo algorithm, identified with Ŝ, estimates each expectation
E[f(ϒ^{(k)}) − f(ϒ^{(k−1)})] [resp., E[f(ϒ^{(1)})]] individually by independently sampling
n_k (resp., n_1) versions of f(ϒ^{(k)}) − f(ϒ^{(k−1)}) [resp., f(ϒ^{(1)})] and taking
the average. The output of the algorithm is then the sum of the individual estimates.
We denote by Ŝ(f) a random variable that models the random output of the
algorithm when applied to f.
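The resulting estimator has the generic multilevel form. The following schematic is our own sketch; `sample_level1` and `sample_diff` are hypothetical stand-ins for simulating f(ϒ^{(1)}) and the coupled differences f(ϒ^{(k)}) − f(ϒ^{(k−1)}):

```python
def mlmc_estimate(sample_level1, sample_diff, n):
    """Multilevel estimator: mean of n[0] samples of f(Y^(1)) plus, for
    k = 2..m, the mean of n[k-1] independent samples of the coupled
    difference f(Y^(k)) - f(Y^(k-1)); the level estimates are summed."""
    m = len(n)
    est = sum(sample_level1() for _ in range(n[0])) / n[0]
    for k in range(2, m + 1):
        est += sum(sample_diff(k) for _ in range(n[k - 1])) / n[k - 1]
    return est
```

By the telescoping identity above, the estimator is (up to the bias of level m) unbiased for E[f(ϒ^{(m)})], and its variance is the sum of the per-level variances divided by the respective replication numbers.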

The mean squared error of an algorithm. The Monte Carlo algorithm introduced
above induces the mean squared error

$$\mathrm{mse}(\hat S, f) = \bigl|\mathbb{E}[f(Y)] - \mathbb{E}\bigl[f\bigl(\Upsilon^{(m)}\bigr)\bigr]\bigr|^2 + \sum_{k=2}^m \frac{1}{n_k}\,\mathrm{var}\bigl(f\bigl(\Upsilon^{(k)}\bigr) - f\bigl(\Upsilon^{(k-1)}\bigr)\bigr) + \frac{1}{n_1}\,\mathrm{var}\bigl(f\bigl(\Upsilon^{(1)}\bigr)\bigr)$$
when applied to f . For two D[0, 1]-valued random elements Z (1) and Z (2) , we
denote by W (Z (1) , Z (2) ) the Wasserstein metric of second-order with respect to
supremum norm, that is
 
   (1)    1/2
(5) W Z (1) , Z (2) = inf z − z(2) 2 dξ z(1) , z(2) ,
ξ

where the infimum is taken over all probability measures ξ on D[0, 1] × D[0, 1]
having first marginal PZ (1) and second marginal PZ (2) . Clearly, the Wasserstein
distance depends only on the distributions of Z (1) and Z (2) . Now, we get for f ∈
Lip(1), that
$$(6)\qquad \mathrm{mse}(\hat S, f) \le W\bigl(Y, \Upsilon^{(m)}\bigr)^2 + \sum_{k=2}^m \frac{1}{n_k}\,\mathbb{E}\bigl[\bigl\|\Upsilon^{(k)} - \Upsilon^{(k-1)}\bigr\|^2\bigr] + \frac{1}{n_1}\,\mathbb{E}\bigl[\bigl\|\Upsilon^{(1)} - y_0\bigr\|^2\bigr].$$

We set

$$\mathrm{mse}(\hat S) = \sup_{f\in\mathrm{Lip}(1)} \mathrm{mse}(\hat S, f),$$


and remark that estimate (6) remains valid for the worst case error mse(Ŝ).
The main task of this article is to provide good estimates for the Wasserstein
metric W (Y, ϒ (m) ). The remaining terms on the right-hand side of (6) are con-
trolled with estimates from [6].

The cost function. In order to simulate one pair (ϒ^{(k−1)}, ϒ^{(k)}), we need to
simulate all displacements of L of size larger than or equal to h_k on the time interval
[0, 1]. Moreover, we need the increments of the Wiener process on the time skeleton
(I^{(k−1)} ∪ I^{(k)}) ∩ [0, 1]. Then we can construct our approximation via (4). In the
real number model of computation (under the assumptions described in the Introduction),
this can be performed with runtime less than a multiple of the number of
entries in I^{(k)} ∩ [0, 1], see [6] for a detailed description of an implementation of a
similar scheme. Since

$$\mathbb{E}\bigl[\#\bigl(I^{(k)}\cap[0,1]\bigr)\bigr] \le 1 + \frac{1}{\varepsilon_k} + \mathbb{E}\Bigl[\sum_{t\in[0,1]} \mathbf{1}_{\{|\Delta L_t|\ge h_k\}}\Bigr] = \nu\bigl(B(0,h_k)^c\bigr) + \frac{1}{\varepsilon_k} + 1,$$

we define, for Ŝ ∈ A,

$$\mathrm{cost}(\hat S) = \sum_{k=1}^m n_k\Bigl(\nu\bigl(B(0,h_k)^c\bigr) + \frac{1}{\varepsilon_k} + 1\Bigr).$$

Then supposing that ε_1 ≤ 1 and ν(B(0,h_k)^c) ≤ 1/ε_k for k = 1, ..., m yields that

$$(7)\qquad \mathrm{cost}(\hat S) \le 3\sum_{k=1}^m \frac{n_k}{\varepsilon_k}.$$

Algorithms achieving the error rates of Theorem 1.1. Let us now quote the
choice of parameters which establishes the error rates of Theorem 1.1. In general,
one chooses ε_k = 2^{−k} and h_k = g^{−1}(2^k) for k ∈ ℤ₊. Moreover, in case (I), for
sufficiently large τ, one picks

$$m = \bigl\lceil \log_2 C_1 (\tau\log\tau)^{2/3} \bigr\rceil \quad\text{and}\quad n_k = \Bigl\lceil C_2\,\tau^{1/3}(\log\tau)^{-2/3}\,\frac{g^{-1}(2^k)}{g^{-1}(2^m)} \Bigr\rceil \quad\text{for } k = 1,\dots,m,$$

where C_1 and C_2 are appropriate constants that do not depend on τ. In case (II),
one chooses

$$m = \bigl\lceil \log_2 C_1\,g^*(\tau) \bigr\rceil \quad\text{and}\quad n_k = \Bigl\lceil C_2\,\frac{g^*(\tau)^2}{\log g^*(\tau)}\,\frac{g^{-1}(2^k)}{g^{-1}(2^m)} \Bigr\rceil \quad\text{for } k = 1,\dots,m,$$

where again C_1 and C_2 are appropriate constants. We refer the reader to the proof
of Theorem 1.1 for the error estimates of this choice.
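For concreteness, the case (I) choice can be written out as follows. This is a sketch under the assumption g(h) = h^{-β} with β ≥ 4/3, so that g^{-1}(x) = x^{-1/β}; the constants C_1, C_2 are set to one purely for illustration:

```python
import math

def parameters_case_I(tau, beta, C1=1.0, C2=1.0):
    """eps_k = 2^-k, h_k = g^{-1}(2^k), and the level number m and
    replication numbers n_k from case (I), for g(h) = h^(-beta)."""
    g_inv = lambda x: x ** (-1.0 / beta)
    m = max(1, math.ceil(math.log2(C1 * (tau * math.log(tau)) ** (2.0 / 3.0))))
    n = [math.ceil(C2 * tau ** (1.0 / 3.0) * math.log(tau) ** (-2.0 / 3.0)
                   * g_inv(2.0 ** k) / g_inv(2.0 ** m))
         for k in range(1, m + 1)]
    eps = [2.0 ** (-k) for k in range(1, m + 1)]
    h = [g_inv(2.0 ** k) for k in range(1, m + 1)]
    return m, n, eps, h
```

Note that n_k decreases in k: fine levels are expensive but have small coupled variance, so they receive few samples, which is the usual multilevel budget allocation.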

3. Weak approximation. In this section, we provide the central estimate for
the Wasserstein metric appearing in (6). For ease of notation, we denote by ε and h
two positive parameters which correspond to ε_m and h_m above. We denote by Σ′ a
square matrix with

$$\Sigma'(\Sigma')^* = \Bigl(\int_{B(0,h)} x_i x_j\,\nu(dx)\Bigr)_{i,j\in\{1,\dots,d_X\}}.$$

Moreover, we let L′ denote the process constituted by the compensated jumps of L
of size larger than h, and let B = (B_t)_{t≥0} be a d_X-dimensional Wiener process that
is independent of W and L′. Then we consider the solution ϒ = (ϒ_t)_{t≥0} of the
integral equation

$$\Upsilon_t = y_0 + \int_0^t a(\Upsilon_{\iota(s-)})\,d\hat X_s,$$

where X̂ = (X̂_t)_{t≥0} is given as X̂_t = ΣW_t + Σ′B_t + L′_t + bt and ι(t) = max(I ∩ [0, t]),
where I is, in analogy to the above, the set of random times (T_j)_{j∈ℤ₊} defined
inductively via T_0 = 0 and

$$T_{j+1} = \inf\{t \in (T_j, \infty) : |\Delta L_t| \ge h \text{ or } t = T_j + \varepsilon\} \qquad\text{for } j\in\mathbb{Z}_+.$$

The process ϒ is closely related to ϒ^{(m)} from Section 2: choosing ε = ε_m
and h = h_m implies that (ϒ_{ι(t)})_{t≥0} and ϒ^{(m)} are identically distributed.
We need to introduce two further crucial quantities: for h > 0, let

$$F(h) = \int_{B(0,h)} |x|^2\,\nu(dx) \quad\text{and}\quad F_0(h) = \int_{B(0,h)^c} x\,\nu(dx).$$

THEOREM 3.1. Suppose that Assumption UE is valid. There exists a finite
constant κ that depends only on K, d_X and ϑ such that, for ε ∈ (0, 1/2], ε′ ∈ [2ε, 1]
and h ∈ (0, h̄] with ν(B(0, h)^c) ≤ 1/ε, one has

$$W\bigl(Y, \Upsilon_{\iota(\cdot)}\bigr)^2 \le \kappa\Bigl[F(h)\,\varepsilon' + \frac{h^2}{\varepsilon'}\log\Bigl(\frac{\varepsilon' F(h)}{h^2}\vee e\Bigr) + \varepsilon\log\frac{e}{\varepsilon}\Bigr],$$

and, if Σ = 0, one has

$$W\bigl(Y, \Upsilon_{\iota(\cdot)}\bigr)^2 \le \kappa\Bigl[F(h)\Bigl(\varepsilon' + \varepsilon\log\frac{e}{\varepsilon}\Bigr) + \frac{h^2}{\varepsilon'}\log\Bigl(\frac{\varepsilon' F(h)}{h^2}\vee e\Bigr) + |b - F_0(h)|^2\varepsilon^2\Bigr].$$
COROLLARY 3.2. Under Assumption UE, there exists a constant κ =
κ(K, d_X, ϑ) such that, for all ε ∈ (0, 1/4] and h ∈ (0, h̄] with ν(B(0, h)^c) ∨ F(h)/h² ≤ 1/ε,
one has

$$W\bigl(Y, \Upsilon_{\iota(\cdot)}\bigr)^2 \le \kappa\Bigl(\frac{h^2}{\sqrt{\varepsilon}} + \varepsilon\Bigr)\log\frac{e}{\varepsilon},$$

and, in the case where Σ = 0,

$$W\bigl(Y, \Upsilon_{\iota(\cdot)}\bigr)^2 \le \kappa\Bigl(\frac{h^2}{\sqrt{\varepsilon}}\,\log\frac{e}{\varepsilon} + |b - F_0(h)|^2\varepsilon^2\Bigr).$$

PROOF. Choose ε′ = √ε and observe that ε′ ≥ 2ε since ε ≤ 1/4. Using that
F(h)/h² ≤ 1/ε, it is straightforward to verify the estimates with Theorem 3.1. □

3.1. Strategy of the proof of Theorem 3.1 and main notation. We represent X
as

$$X_t = \Sigma W_t + L'_t + L''_t + bt,$$

where L″ = (L″_t)_{t≥0} = L − L′ is the process which comprises the compensated
jumps of L of size smaller than h. Based on an additional parameter ε′ ∈ [2ε, 1],
we couple L″ with Σ′B. The introduction of the explicit coupling is deferred to
Section 7. Let us roughly explain the idea behind the parameter ε′. In classical
Euler schemes, the coefficients of the SDE are updated in either a deterministic or
a random number of steps of a given (typical) length. Our approximation updates
the coefficients at steps of order ε as the classical Euler method does. However, in our
case the Lévy process that comprises the small jumps is ignored for most of the
time steps; it is only taken into account on steps of order ε′.
On the one hand, a large ε′ reduces the accuracy of the approximation. On
the other hand, the small-jump part has to be approximated by a Wiener
process, and the error inferred from the coupling decreases in ε′. This explains the
increasing and decreasing terms in Theorem 3.1. Balancing ε and ε′ then leads to
Corollary 3.2.
We need some auxiliary processes. Analogously to I and ι, we let J denote the
set of random times (T′_j)_{j∈ℤ₊} defined inductively by T′_0 = 0 and

$$T'_{j+1} = \min\bigl(I \cap (T'_j + \varepsilon' - \varepsilon, \infty)\bigr),$$

so that the mesh size of J is less than or equal to ε′. Moreover, we set η(t) =
max(J ∩ [0, t]).
Let us now introduce the first auxiliary processes. We set X′ = (X_t − L″_t)_{t≥0}
and we consider the solution Ȳ = (Ȳ_t)_{t≥0} of the integral equation

$$(8)\qquad \bar Y_t = y_0 + \int_0^t a(\bar Y_{\iota(s-)})\,dX'_s + \int_0^t a(\bar Y_{\eta(s-)})\,dL''_{\eta(s)}$$

and the process Ȳ′ = (Ȳ′_t)_{t≥0} given by

$$\bar Y'_t = \bar Y_t + a(\bar Y_{\eta(t)})\bigl(L''_t - L''_{\eta(t)}\bigr).$$

It coincides with Ȳ for all times in J and satisfies

$$\bar Y'_t = y_0 + \int_0^t a(\bar Y_{\iota(s-)})\,dX'_s + \int_0^t a(\bar Y_{\eta(s-)})\,dL''_s.$$

Next, we replace the term L″ by the Gaussian term Σ′B in the above integral
equations and obtain analogs of Ȳ and Ȳ′, which are denoted by ϒ̄ and ϒ̄′. To be
more precise, ϒ̄ = (ϒ̄_t)_{t≥0} is the solution of the stochastic integral equation

$$\bar\Upsilon_t = y_0 + \int_0^t a(\bar\Upsilon_{\iota(s-)})\,dX'_s + \int_0^t a(\bar\Upsilon_{\eta(s-)})\,\Sigma'\,dB_{\eta(s)},$$

and ϒ̄′ = (ϒ̄′_t)_{t≥0} is given via

$$\bar\Upsilon'_t = \bar\Upsilon_t + a(\bar\Upsilon_{\eta(t)})\,\Sigma'\bigl(B_t - B_{\eta(t)}\bigr).$$
We now focus on the discrepancy between Y and ϒ_{ι(·)}. By the triangle inequality,
one has

$$(9)\qquad \|Y - \Upsilon_{\iota(\cdot)}\| \le \|Y - \bar Y'\| + \|\bar Y' - \bar\Upsilon'\| + \|\bar\Upsilon' - \Upsilon\| + \|\Upsilon - \Upsilon_{\iota(\cdot)}\|.$$

Moreover, the second term on the right satisfies

$$(10)\qquad \|\bar Y' - \bar\Upsilon'\| \le \|\bar Y - \bar\Upsilon\| + \|\bar Y' - \bar Y - (\bar\Upsilon' - \bar\Upsilon)\|.$$

In order to prove Theorem 3.1, we control the error terms individually. The first
term on the right-hand side of (9) is considered in Proposition 4.1. The third and
fourth terms are treated in Propositions 5.1 and 5.2, respectively. The terms on the
right-hand side of (10) are investigated in Propositions 7.1 and 7.2, respectively.
Note that only the latter two expressions depend on the particular choice of the
coupling of L″ and Σ′B. Once the above-mentioned propositions are proved, the
statement of Theorem 3.1 follows immediately by combining these estimates and
identifying the dominant terms.

4. Approximation of Y by Ȳ′.

PROPOSITION 4.1. There exists a constant κ > 0 depending on K only such
that, for ε ∈ (0, 1/2], ε′ ∈ [2ε, 1] and h > 0 with ν(B(0, h)^c) ≤ 1/ε, one has

$$\mathbb{E}\Bigl[\sup_{t\in[0,1]} |Y_t - \bar Y'_t|^2\Bigr] \le \kappa\bigl[F(h)\,\varepsilon' + |b - F_0(h)|^2\varepsilon^2\bigr]$$

if Σ = 0, and

$$(11)\qquad \mathbb{E}\Bigl[\sup_{t\in[0,1]} |Y_t - \bar Y'_t|^2\Bigr] \le \kappa\bigl[\varepsilon + F(h)\,\varepsilon'\bigr]$$

for general Σ.

PROOF. For t ≥ 0, we consider Z_t = Y_t − Ȳ′_t, Z′_t = Y_t − Ȳ_{ι(t)}, Z″_t = Y_t − Ȳ_{η(t)}
and z(t) = E[sup_{s∈[0,t]} |Z_s|²]. The main task of the proof is to establish an estimate
of the form

$$z(t) \le \alpha_1 \int_0^t z(s)\,ds + \alpha_2$$

for appropriate values α_1, α_2 > 0. Since z is finite (see, for instance, [6]),
Gronwall's inequality then implies the upper bound

$$\mathbb{E}\Bigl[\sup_{s\in[0,1]} |Y_s - \bar Y'_s|^2\Bigr] \le \alpha_2 \exp(\alpha_1).$$

We proceed in two steps.
1st step. Note that

$$Z_t = \underbrace{\int_0^t \bigl(a(Y_{s-}) - a(\bar Y_{\iota(s-)})\bigr)\,d(\Sigma W_s + L'_s) + \int_0^t \bigl(a(Y_{s-}) - a(\bar Y_{\eta(s-)})\bigr)\,dL''_s}_{=:M_t} + \int_0^t \bigl(a(Y_{s-}) - a(\bar Y_{\iota(s-)})\bigr)\,b\,ds,$$

so that

$$(12)\qquad |Z_t|^2 \le 2|M_t|^2 + 2\Bigl|\int_0^t \bigl(a(Y_{s-}) - a(\bar Y_{\iota(s-)})\bigr)\,b\,ds\Bigr|^2.$$

For t ∈ [0, 1], we conclude with the Cauchy–Schwarz inequality that the second
term on the right-hand side is bounded by 2K⁴ ∫₀ᵗ |Z′_{s−}|² ds.
Certainly, (M_t) is a (local) martingale with respect to the canonical filtration,
and we apply the Doob inequality together with Lemma A.1 to deduce that

$$\mathbb{E}\Bigl[\sup_{s\in[0,t]}|M_s|^2\Bigr] \le 4\,\mathbb{E}\Bigl[\int_0^t \bigl|a(Y_{s-}) - a(\bar Y_{\iota(s-)})\bigr|^2\,d\langle \Sigma W + L'\rangle_s + \int_0^t \bigl|a(Y_{s-}) - a(\bar Y_{\eta(s-)})\bigr|^2\,d\langle L''\rangle_s\Bigr].$$

Here and elsewhere, for a multivariate local L²-martingale S = (S_t)_{t≥0}, we denote
⟨S⟩ = Σ_j ⟨S^{(j)}⟩, where ⟨S^{(j)}⟩ denotes the predictable compensator of the classical
bracket process of the jth coordinate S^{(j)} of S. Note that d⟨ΣW + L′⟩_t = (|Σ|² +
∫_{B(0,h)^c} |x|² ν(dx)) dt ≤ 2K² dt and d⟨L″⟩_t = F(h) dt. Consequently,

$$\mathbb{E}\Bigl[\sup_{s\in[0,t]}|M_s|^2\Bigr] \le 4\,\mathbb{E}\Bigl[2K^4\int_0^t |Z'_s|^2\,ds + K^2 F(h)\int_0^t |Z''_s|^2\,ds\Bigr].$$

Hence, by (12) and Fubini's theorem, one has

$$\mathbb{E}\Bigl[\sup_{s\in[0,t]}|Z_s|^2\Bigr] \le \kappa_1\int_0^t \bigl(z(s) + \mathbb{E}[|Z'_s|^2] + F(h)\,\mathbb{E}[|Z''_s|^2]\bigr)\,ds$$

for a constant κ_1 that depends only on K. Since Z′_t = Z_t + Ȳ′_t − Ȳ_{ι(t)} and Z″_t =
Z_t + Ȳ′_t − Ȳ_{η(t)}, we get

$$(13)\qquad z(t) \le \kappa_2\int_0^t \bigl(z(s) + \mathbb{E}\bigl[|\bar Y'_s - \bar Y_{\iota(s)}|^2\bigr] + F(h)\,\mathbb{E}\bigl[|\bar Y'_s - \bar Y_{\eta(s)}|^2\bigr]\bigr)\,ds$$

for an appropriate constant κ_2 = κ_2(K).
2nd step. In the second step, we provide appropriate estimates for E[|Ȳ′_t − Ȳ_{ι(t)}|²]
and E[|Ȳ′_t − Ȳ_{η(t)}|²]. The processes W and L′ are independent of the random time
ι(t). Moreover, L′ has no jumps in (ι(t), t), and we obtain

$$\bar Y'_t - \bar Y_{\iota(t)} = a(\bar Y_{\iota(t)})\bigl(\Sigma(W_t - W_{\iota(t)}) + (b - F_0(h))(t - \iota(t))\bigr) + a(\bar Y_{\eta(t)})\bigl(L''_t - L''_{\eta(t)}\bigr),$$

so that

$$\mathbb{E}\bigl[|\bar Y'_t - \bar Y_{\iota(t)}|^2\bigr] \le 3K^2\Bigl(\mathbb{E}\bigl[\bigl(|\bar Y_{\iota(t)} - y_0| + 1\bigr)^2\bigr]\bigl(|\Sigma|^2\varepsilon + |b - F_0(h)|^2\varepsilon^2\bigr) + \mathbb{E}\bigl[\bigl(|\bar Y_{\eta(t)} - y_0| + 1\bigr)^2\bigr]\,F(h)\,\varepsilon'\Bigr).$$

By Lemma A.2, there exists a constant κ_3 = κ_3(K) such that

$$(14)\qquad \mathbb{E}\bigl[|\bar Y'_t - \bar Y_{\iota(t)}|^2\bigr] \le \kappa_3\bigl[|\Sigma|^2\varepsilon + |b - F_0(h)|^2\varepsilon^2 + F(h)\,\varepsilon'\bigr].$$

Similarly, we estimate E[|Ȳ′_t − Ȳ_{η(t)}|²]. Given η(t), the process (L′_{η(t)+u} −
L′_{η(t)})_{u∈[0,(ε′−ε)∧(t−η(t))]} is distributed as the unconditioned Lévy process L′ on
the time interval [0, (ε′ − ε) ∧ (t − η(t))]. Moreover, we have dL′_u = −F_0(h) du
on (η(t) + ε′ − ε, t]. Consequently,

$$\bar Y'_t - \bar Y_{\eta(t)} = \int_{\eta(t)}^t \mathbf{1}_{\{s-\eta(t)\le\varepsilon'-\varepsilon\}}\,a(\bar Y_{\iota(s-)})\,d(\Sigma W_s + L'_s + bs) + \int_{\eta(t)}^t \mathbf{1}_{\{s-\eta(t)>\varepsilon'-\varepsilon\}}\,a(\bar Y_{\iota(s-)})\,d\bigl(\Sigma W_s + (b - F_0(h))s\bigr) + a(\bar Y_{\eta(t)})\bigl(L''_t - L''_{\eta(t)}\bigr),$$

and analogously to (14) we now get

$$\mathbb{E}\bigl[|\bar Y'_t - \bar Y_{\eta(t)}|^2\bigr] \le \kappa_4\bigl[\varepsilon' + |b - F_0(h)|^2\varepsilon^2\bigr]$$

for a constant κ_4 = κ_4(K). Next, note that, by the Cauchy–Schwarz inequality,
|F_0(h)|² ≤ ∫_{B(0,h)^c} |x|² ν(dx) · ν(B(0, h)^c) ≤ K²/ε, so that we arrive at

$$\mathbb{E}\bigl[|\bar Y'_t - \bar Y_{\eta(t)}|^2\bigr] \le \kappa_5\,\varepsilon'.$$

Combining this estimate with (13) and (14), we obtain

$$z(t) \le \kappa_2\int_0^t z(s)\,ds + \kappa_6\bigl[|\Sigma|^2\varepsilon + F(h)\,\varepsilon' + |b - F_0(h)|^2\varepsilon^2\bigr].$$

In the case where Σ = 0, the statement of the proposition follows immediately
via Gronwall's inequality. For general Σ, we obtain the result by recalling that
|F_0(h)|² ≤ K²/ε. □

5. Approximation of ϒ̄′ by ϒ_{ι(·)}.

PROPOSITION 5.1. Under the assumptions of Proposition 4.1, one has

$$\mathbb{E}\bigl[\|\bar\Upsilon' - \Upsilon\|^2\bigr] \le \kappa\,\varepsilon' F(h)$$

for a constant κ depending only on K.

PROOF. The proposition can be proved like Proposition 4.1; therefore, we only
provide a sketch of the proof. The arguments from the first step give, for t ∈ [0, 1],

$$z(t) \le \kappa_1\int_0^t \bigl(z(s) + \mathbb{E}\bigl[|\bar\Upsilon'_{\iota(s)} - \bar\Upsilon_{\iota(s)}|^2\bigr] + F(h)\,\mathbb{E}\bigl[|\bar\Upsilon'_{\iota(s)} - \bar\Upsilon_{\eta(s)}|^2\bigr]\bigr)\,ds,$$

where z(t) = E[sup_{s∈[0,t]} |ϒ_s − ϒ̄′_s|²] and κ_1 = κ_1(K) is an appropriate constant.
Moreover, based on Lemma A.2, the second step leads to

$$\mathbb{E}\bigl[|\bar\Upsilon'_{\iota(t)} - \bar\Upsilon_{\iota(t)}|^2\bigr] \le \kappa_2\,\varepsilon' F(h) \quad\text{and}\quad \mathbb{E}\bigl[|\bar\Upsilon'_{\iota(t)} - \bar\Upsilon_{\eta(t)}|^2\bigr] \le \kappa_3\,\varepsilon'$$

for appropriate constants κ_2 = κ_2(K) and κ_3 = κ_3(K). Then Gronwall's lemma
again implies the statement of the proposition. □

PROPOSITION 5.2. Under the assumptions of Proposition 4.1, there exists a
constant κ depending only on K and d_X such that, if Σ = 0,

$$\mathbb{E}\Bigl[\sup_{t\in[0,1]}\bigl|\Upsilon_t - \Upsilon_{\iota(t)}\bigr|^2\Bigr] \le \kappa\Bigl[F(h)\,\varepsilon\log\frac{e}{\varepsilon} + |b - F_0(h)|^2\varepsilon^2\Bigr]$$

and, in the general case,

$$\mathbb{E}\Bigl[\sup_{t\in[0,1]}\bigl|\Upsilon_t - \Upsilon_{\iota(t)}\bigr|^2\Bigr] \le \kappa\,\varepsilon\log\frac{e}{\varepsilon}.$$

P ROOF. Recall that by definition


 t
 
ϒt − ϒι(t) = a ϒι(s−) dXs
ι(t)

so that
      
ϒt − ϒι(t) 2 ≤ K 2 ϒι(t) − y0  + 1 2 Xt − Xι(t) 2 .

Next, we apply Lemma A.4. For j ∈ Z+ , we choose


 
Uj = |ϒTj ∧1 − y0 |2 and Vj = sup Xt − Xι(t) 2
s∈[Tj ,Tj +1 ∧1)
298 S. DEREICH

with the convention that the supremum of the empty set is zero. Then
       
E sup ϒt − ϒι(t)  ≤ E sup Uj · E sup Vj
2
t∈[0,1] j ∈Z+ j ∈Z+
   
≤ E sup (|ϒt − y0 | + 1)2 · E sup |Xt − Xs |2 .
t∈[0,1] 0≤s<t≤1
t−s≤ε

By Proposition 5.1 and Lemma A.2, E[supt∈[0,1] (|ϒt − y0 | + 1)2 ] is bounded by a


constant that depends only on K. √
Consider ϕ : [0, 1] → [0, ∞), δ → δ log(e/δ). By Lévy’s modulus of continu-
ity,
|Wt − Ws |
W ϕ := sup
0≤s<t≤1 ϕ(t − s)

is finite almost surely, so that Fernique’s theorem implies that E[W 2ϕ ] is finite
too. Consequently,
   
E sup Xs − Xι(s) 
2
s∈[0,t]
(15)  
  e
≤ 3 ||2 + F (h) E[W 2ϕ ]ε log + |b − F0 (h)|2 ε2 .
ε
K2
The result follows immediately by using that |F0 (h)|2 ≤ ε and ruling out the
asymptotically negligible terms. 

6. Gaussian approximation via Komlós, Major and Tusnády. In this section,
we prove the following theorem.

THEOREM 6.1. Let h > 0 and let L = (L_t)_{t≥0} be a d-dimensional (ν, 0)-Lévy
martingale whose Lévy measure ν is supported on B(0, h). Moreover, we suppose
that, for a ϑ ≥ 1, one has

$$\int \langle y',x\rangle^2\,\nu(dx) \le \vartheta \int \langle y,x\rangle^2\,\nu(dx)$$

for any y, y′ ∈ ℝ^d with |y| = |y′|, and we set σ² = ∫|x|² ν(dx).
There exist constants c_1, c_2 > 0 depending only on d such that the following
statement is true. For every T ≥ 0, one can couple the process (L_t)_{t∈[0,T]} with a
Wiener process (B_t)_{t∈[0,T]} such that

$$\mathbb{E}\exp\Bigl(\frac{c_1}{\sqrt{\vartheta}\,h}\,\sup_{t\in[0,T]}|L_t - \Sigma B_t|\Bigr) \le \exp\Bigl(c_2\log\Bigl(\frac{\sigma^2 T}{h^2}\vee e\Bigr)\Bigr),$$

where Σ is a square matrix with ΣΣ* = cov L_1.
MLMC FOR LÉVY SDE WITH GAUSSIAN CORRECTION 299

The proof of the theorem is based on Zaitsev’s generalization [22] of the


Komlós–Major–Tusnády coupling. In this context, a key quantity is the Zaitsev
parameter: Let Z be a d-dimensional random variable with finite exponential mo-
ments in a neighborhood of zero and set
(θ ) = log E exp{θ, Z }
for all θ ∈ C with integrable expectation. Then the parameter is defined as
τ (Z) = inf{τ > 0 : |∂w ∂v2 (θ )| ≤ τ covZ v, v for all θ ∈ Cd , v, w ∈ Rd
with |θ| ≤ τ −1 and |w| = |v| = 1}.
In the latter set, we implicitly only consider τ ’s for which  is finite on a neigh-
borhood of {x ∈ Cd : |x| ≤ 1/τ }. Moreover, covZ denotes the covariance matrix
of Z.

PROOF OF THEOREM 6.1. 1st step: First, consider a d-dimensional infinitely divisible random variable Z with

Φ(θ) := log E e⟨θ,Z⟩ = ∫ ( e⟨θ,x⟩ − ⟨θ, x⟩ − 1 ) ν′(dx),

where the Lévy measure ν′ is supported on the ball B(0, h′) for a fixed h′ > 0. Then

∂w ∂v² Φ(θ) = ∫_{B(0,h′)} ⟨w, x⟩ ⟨v, x⟩² e⟨θ,x⟩ ν′(dx)

and

⟨cov Z v, v⟩ = var⟨v, Z⟩ = ∂v² Φ(0) = ∫_{B(0,h′)} ⟨v, x⟩² ν′(dx).

We choose ζ > 0 with e^ζ = 1/ζ, and observe that for any θ ∈ Cd, v, w ∈ Rd with |θ| ≤ ζ/h′ and |w| = |v| = 1,

|∂w ∂v² Φ(θ)| ≤ h′ e^{|θ|h′} ⟨cov Z v, v⟩ ≤ (h′/ζ) ⟨cov Z v, v⟩.

Hence,

τ(Z) ≤ h′/ζ.

2nd step: In the next step, we apply Zaitsev's coupling to piecewise constant interpolations of (Lt). Fix m ∈ N and consider L(m) = (L(m)t)t∈[0,T] given via

L(m)t = L_{⌊2^m t/T⌋ 2^{−m} T}.

Moreover, we consider a d-dimensional Wiener process B = (Bt)t≥0 and its piecewise constant interpolation B(m) given by B(m) = (B_{⌊2^m t/T⌋ 2^{−m} T})t∈[0,T].
Since cov L1 is self-adjoint, we find a representation cov Lt = t U D U* with D diagonal and U orthogonal. Hence, for At := (tD)^{−1/2} U* we get cov(At Lt) = Id. We denote by λ1 the leading and by λ2 the minimal eigenvalue of D (or cov L1). Then At Lt is again infinitely divisible and the corresponding Lévy measure is supported on B(0, h/√(λ2 t)). By part one, we conclude that

τ(At Lt) ≤ h / (ζ √(λ2 t)).

Now the discontinuities of A_{2^{−m}T} L(m) are i.i.d. with unit covariance and Zaitsev parameter less than or equal to 2^{m/2} h / (ζ √(T λ2)). By [22], Theorem 1.3, one can couple the processes L and B on an appropriate probability space such that

E exp{ κ1 (√(T λ2)/(2^{m/2} h)) sup_{t∈[0,T]} |A_{2^{−m}T} L(m)t − A_{2^{−m}T} ΣB(m)t| } ≤ exp( κ2 log( (ζ² T λ2 / h²) ∨ e ) ),

where κ1, κ2 > 0 are constants only depending on the dimension d. The smallest eigenvalue of A_{2^{−m}T} is 2^{m/2} (T λ1)^{−1/2} and, by assumption, λ1 ≤ ϑλ2. Since λ2 ≤ σ², we get

E exp{ κ1 (1/(√ϑ h)) sup_{t∈[0,T]} |L(m)t − ΣB(m)t| } ≤ exp( κ2 log( (ζ² T σ² / h²) ∨ e ) ).

3rd step: The general result follows by approximation. First, note that sup_{t∈[0,T]} |Lt − L(m)t| converges as m → ∞ to sup_{t∈[0,T]} |Lt − Lt−| so that by dominated convergence

lim_{m→∞} E exp{ κ1 (1/(√ϑ h)) sup_{t∈[0,T]} |Lt − L(m)t| } = E exp{ κ1 (1/(√ϑ h)) sup_{t∈[0,T]} |Lt − Lt−| } ≤ e^{κ1}.

Analogously, lim_{m→∞} E exp{ κ1 (1/(√ϑ h)) sup_{t∈[0,T]} |ΣBt − ΣB(m)t| } = 1. Next, we choose κ3 ≥ 1 with e^{κ1} + 1 ≤ e^{κ2+κ3} and we fix m ∈ N such that

E exp{ κ1 (1/(√ϑ h)) sup_{t∈[0,T]} |Lt − L(m)t| } + E exp{ κ1 (1/(√ϑ h)) sup_{t∈[0,T]} |ΣBt − ΣB(m)t| } ≤ e^{κ2+κ3}.

We apply the coupling introduced in step 2 and estimate

E exp{ (κ1/3) (1/(√ϑ h)) sup_{t∈[0,T]} |Lt − ΣBt| }
  ≤ E exp{ κ1 (1/(√ϑ h)) sup_{t∈[0,T]} |Lt − L(m)t| } + E exp{ κ1 (1/(√ϑ h)) sup_{t∈[0,T]} |L(m)t − ΣB(m)t| }
    + E exp{ κ1 (1/(√ϑ h)) sup_{t∈[0,T]} |ΣB(m)t − ΣBt| }
  ≤ exp( κ2 log( (T σ²/h²) ∨ e ) ) + e^{κ2+κ3}.

Straightforwardly, one obtains the assertion of the theorem for c1 = κ1/3 and c2 = κ2 + 2κ3. □

COROLLARY 6.2. The coupling introduced in Theorem 6.1 satisfies

E[ sup_{t∈[0,T]} |Lt − ΣBt|² ]^{1/2} ≤ (√ϑ h / c1) ( c2 log( (σ²T/h²) ∨ e ) + 2 ),

where c1 and c2 are as in the theorem.

PROOF. We set Z = sup_{t∈[0,T]} |Lt − ΣBt| and t0 = (√ϑ h / c1) c2 log( (σ²T/h²) ∨ e ), and use that

(16)  E[Z²] = 2 ∫_0^∞ t P(Z ≥ t) dt ≤ t0² + 2 ∫_{t0}^∞ t P(Z ≥ t) dt.

By the Markov inequality and Theorem 6.1, one has for s ≥ 0

P(Z ≥ s + t0) ≤ E[exp{ (c1/(√ϑ h)) Z }] / exp{ (c1/(√ϑ h))(s + t0) } ≤ exp( −(c1/(√ϑ h)) s ).

We set α = √ϑ h / c1, and deduce together with (16) that

E[Z²] ≤ t0² + 2 ∫_0^∞ (s + t0) exp(−s/α) ds = t0² + 2 t0 α + 2α² ≤ (t0 + 2α)². □
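The closed-form evaluation of the exponential tail integral in the last step can be checked numerically; the following is only a sanity-check sketch, with arbitrary values for t0 and α (written in ASCII as t0 and alpha):

```python
import math

# Check of the identity used above:  integral_0^infty (s + t0) e^{-s/alpha} ds
#                                    = t0*alpha + alpha^2.
def tail_integral(t0: float, alpha: float, upper: float = 100.0, n: int = 200_000) -> float:
    """Midpoint-rule approximation of the integral over [0, upper]."""
    step = upper / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * step
        total += (s + t0) * math.exp(-s / alpha)
    return total * step

t0, alpha = 1.7, 0.9
numeric = tail_integral(t0, alpha)
closed = t0 * alpha + alpha ** 2
assert abs(numeric - closed) < 1e-4
# The final estimate then reads E[Z^2] <= t0^2 + 2(t0*alpha + alpha^2) <= (t0 + 2*alpha)^2:
assert t0 ** 2 + 2 * closed <= (t0 + 2 * alpha) ** 2
print("integral check passed")
```

The truncation at upper = 100 is harmless here since the integrand decays like e^{−s/α}.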
7. Coupling the Gaussian approximation. We are now in the position to couple the processes L and ΣB introduced in Section 3.1. We adopt again the notation of Section 3.1.
To introduce the coupling, we need to assume that Assumption UE is valid, and that ε ∈ (0, 1/2], ε′ ∈ [2ε, 1] and h ∈ (0, h̄] are such that ν(B(0, h)c) ≤ 1/ε. Recall that L is independent of W and L′. In particular, it is independent of the times in J, and given W and L′ we couple the Wiener process B with L on each interval [Tj, Tj+1] according to the coupling provided by Theorem 6.1.
More explicitly, the coupling is established in such a way that, given J, each pair of processes (Bt+Tj − BTj)t∈[0,Tj+1−Tj] and (Lt+Tj − LTj)t∈[0,Tj+1−Tj] is independent of W, L′ and the other pairings, and satisfies

(17)  E[ exp{ (c1/(√ϑ h)) sup_{t∈[Tj,Tj+1]} |Lt − LTj − (ΣBt − ΣBTj)| } | J ] ≤ exp( c2 log( (F(h)(Tj+1 − Tj)/h²) ∨ e ) )

for positive constants c1 and c2 depending only on dX; see Theorem 6.1. In particular, by Corollary 6.2, one has

(18)  E[ sup_{t∈[Tj,Tj+1]} |Lt − LTj − (ΣBt − ΣBTj)|² | J ]^{1/2} ≤ c3 h log( (F(h)(Tj+1 − Tj)/h²) ∨ e )

for a constant c3 = c3(dX, ϑ).
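The decomposition being coupled here (large jumps simulated individually, compensated small jumps replaced by a Gaussian term with variance F(h)) can be sketched in a few lines. This is only an illustration with a hypothetical one-dimensional Lévy measure ν(dx) = |x|^{−1−β} 1_{{0<|x|≤1}} dx and hand-picked parameters; it is not the multivariate coupling construction of the paper:

```python
import math
import random

beta = 1.2   # hypothetical Blumenthal-Getoor-type index of this toy measure
h = 0.05     # cutoff: jumps of modulus larger than h are simulated individually

# intensity of jumps with |x| in (h, 1]:  2 * int_h^1 x^{-1-beta} dx
lam = 2.0 * (h ** -beta - 1.0) / beta
# variance of the small-jump part:  F(h) = int_{|x|<=h} x^2 nu(dx)
F_h = 2.0 * h ** (2.0 - beta) / (2.0 - beta)

def sample_big_jump(rng: random.Random) -> float:
    """Inverse-transform sample of a jump size with modulus in (h, 1]."""
    u = rng.random()
    mag = (h ** -beta - u * (h ** -beta - 1.0)) ** (-1.0 / beta)
    return mag if rng.random() < 0.5 else -mag

def simulate_path(T: float, rng: random.Random):
    """Jump times/sizes on [0, T] plus one Gaussian correction for the small jumps."""
    times, jumps, t = [], [], 0.0
    while True:
        t += rng.expovariate(lam)      # exponential waiting times of rate lam
        if t > T:
            break
        times.append(t)
        jumps.append(sample_big_jump(rng))
    gaussian_correction = math.sqrt(F_h * T) * rng.gauss(0.0, 1.0)
    return times, jumps, gaussian_correction

rng = random.Random(1)
times, jumps, corr = simulate_path(1.0, rng)
print(len(times), "large jumps simulated")
```

The Gaussian correction term plays the role of the increment ΣB over the interval; Sections 6 and 7 quantify how closely it can be coupled to the true small-jump martingale.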

PROPOSITION 7.1. Under Assumption UE, there exists a constant κ depending only on K, ϑ and dX such that for any ε ∈ (0, 1/2], ε′ ∈ [2ε, 1] and h ∈ (0, h̄] with ν(B(0, h)c) ≤ 1/ε, one has

E[ sup_{t∈[0,1]} |Ȳt − ϒ̄t|² ] ≤ κ (1/ε′) h² log²( (ε′F(h)/h²) ∨ e ).
[0,1] ε h2

PROOF. For ease of notation, we write

At = Lη(t)   and   A′t = ΣBη(t).

By construction, (At) and (A′t) are martingales with respect to the filtration (Ft) induced by the processes (Wt), (L′t), (At) and (A′t). Let Zt = Ȳt − ϒ̄t, Z′t = Ȳι(t) − ϒ̄ι(t), Z′′t = Ȳη(t) − ϒ̄η(t) and z(t) = E[sup_{s∈[0,t]} |Zs|²]. The proof is similar to the proof of Proposition 4.1.
Again, we write

(19)  Zt = ∫_0^t ( a(Ȳι(s−)) − a(ϒ̄ι(s−)) ) d(ΣWs + L′s) + ∫_0^t a(Ȳη(s−)) dAs − ∫_0^t a(ϒ̄η(s−)) dA′s    [=: Mt, a local martingale]
          + ∫_0^t ( a(Ȳι(s)) − a(ϒ̄ι(s)) ) b ds.

Denoting M′ = ΣW + L′, we get

dMt = ( a(Ȳι(t−)) − a(ϒ̄ι(t−)) ) dM′t + a(Ȳη(t−)) d(At − A′t) + ( a(Ȳη(t−)) − a(ϒ̄η(t−)) ) dA′t

and, by Doob's inequality and Lemma A.1, we have

(20)  E[ sup_{s∈[0,t]} |Ms|² ] ≤ κ1 ( E ∫_0^t |Z′s−|² d⟨M′⟩s + E ∫_0^t |Z′′s−|² d⟨A′⟩s + E ∫_0^t (|Ȳη(s−)| + 1)² d⟨A − A′⟩s ).

Each bracket ⟨·⟩ in the latter formula can be chosen with respect to a (possibly different) filtration such that the integrand is predictable and the integrator is a local L²-martingale. As noticed before, with respect to the canonical filtration (Ft) one has d⟨M′⟩t = (|Σ|² + ∫_{B(0,h)c} |x|² ν(dx)) dt ≤ 2K² dt. Moreover, we have with respect to the enlarged filtration (Ft ∨ σ(J))t≥0,

⟨A′⟩t = Σ_{j∈N : Tj≤t} (Tj − Tj−1) F(h) = max(J ∩ [0, t]) · F(h),

and, by (18), for j ∈ N,

Δ⟨A − A′⟩Tj = E[ |LTj − LTj−1 − (ΣBTj − ΣBTj−1)|² | J ] ≤ c3² ξ²,

where ξ := h log( (ε′F(h)/h²) ∨ e ). Note that two discontinuities of ⟨A − A′⟩ are at least ε′/2 units apart and the integrands of the last two integrals in (20) are constant on (Tj−1, Tj] so that altogether

E[ sup_{s∈[0,t]} |Ms|² ] ≤ κ1 ( 2K² E ∫_0^t |Zs|² ds + F(h) E ∫_0^t |Zs|² ds + c3² ξ² (2/ε′) E ∫_0^t (|Ȳη(s−)| + 1)² ds ).

With Lemma A.2 and Fubini's theorem, we arrive at

E[ sup_{s∈[0,t]} |Ms|² ] ≤ κ2 ( ∫_0^t z(s) ds + (1/ε′) ξ² ).

Moreover, by Jensen's inequality, one has

E[ sup_{s∈[0,t]} | ∫_0^s ( a(Ȳι(u−)) − a(ϒ̄ι(u−)) ) b du |² ] ≤ K⁴ ∫_0^t E[ |Z′s−|² ] ds.

Combining the latter two estimates with (19) and applying Gronwall's inequality yields the statement of the proposition. □

PROPOSITION 7.2. There exists a constant κ depending only on K and dX such that

E[ ‖Ȳ − Ȳ′ − (ϒ̄ − ϒ̄′)‖² ]^{1/2} ≤ κ [ h ( log(1 + 2/ε′) + log( (F(h)ε′/h²) ∨ e ) ) + ( F(h) ε′ log(e/ε′) )^{1/2} E[ ‖Ȳ − ϒ̄‖² ]^{1/2} ].
ε
PROOF. Note that

Ȳt − Ȳ′t − (ϒ̄t − ϒ̄′t) = a(Ȳη(t)) (Lt − Lη(t)) − a(ϒ̄η(t)) (ΣBt − ΣBη(t))
  = a(Ȳη(t)) ( Lt − Lη(t) − (ΣBt − ΣBη(t)) ) + ( a(Ȳη(t)) − a(ϒ̄η(t)) ) (ΣBt − ΣBη(t)).

Similarly as in the proof of Proposition 5.2, we apply Lemma A.4 to deduce that

(21)  E[ ‖Ȳ − Ȳ′ − (ϒ̄ − ϒ̄′)‖² ]^{1/2} ≤ K E[ (‖Ȳ‖ + 1)² ]^{1/2} E[ sup_{t∈[0,1]} |Lt − Lη(t) − (ΣBt − ΣBη(t))|² ]^{1/2}
        + K E[ ‖Ȳ − ϒ̄‖² ]^{1/2} E[ sup_{t∈[0,1]} |ΣBt − ΣBη(t)|² ]^{1/2}.

Next, we estimate E[sup_{t∈[0,1]} |Lt − Lη(t) − (ΣBt − ΣBη(t))|²]. Recall that conditional on J, each pairing of (Lt+Tj − LTj)t∈[0,Tj+1−Tj] and (Bt+Tj − BTj)t∈[0,Tj+1−Tj] is coupled according to Theorem 6.1, and individual pairs are independent of each other.
Let us first assume that the times in J are deterministic with mesh smaller than or equal to ε′. We denote by n the number of entries of J which fall into [0, 1], and we denote, for j = 1, ..., n,

Δj = sup_{t∈[Tj−1,Tj]} |Lt − LTj−1 − (ΣBt − ΣBTj−1)|.

By (17) and the Markov inequality, one has, for u ≥ 0,

P( max_{j=1,...,n} Δj ≥ u ) ≤ Σ_{j=1}^n P(Δj ≥ u) ≤ n exp( c2 log( (F(h)ε′/h²) ∨ e ) − (c1/(√ϑ h)) u ).

Let now α = c1/(√ϑ h), β = F(h)/h² and u0 = (1/α)( log n + c2 log(βε′ ∨ e) ). Then for u ≥ 0

P( max_{j=1,...,n} Δj ≥ u ) ≤ e^{−α(u−u0)}

so that

E[ max_{j=1,...,n} Δj² ] = 2 ∫_0^∞ u P( max_{j=1,...,n} Δj ≥ u ) du ≤ u0² + 2 ∫_{u0}^∞ u e^{−α(u−u0)} du = u0² + 2u0/α + 2/α² ≤ (u0 + 2/α)².

Note that the upper bound depends only on the number of entries in J ∩ [0, 1], and, since #(J ∩ [0, 1]) is uniformly bounded by 2/ε′ + 1, we thus get in the general random setting that

E[ sup_{t∈[0,1]} |Lt − Lη(t) − (ΣBt − ΣBη(t))|² ]^{1/2} ≤ (√ϑ h / c1) [ log(1 + 2/ε′) + c2 log( (F(h)ε′/h²) ∨ e ) + 2 ].

Together with Lemma A.2, this gives the appropriate upper bound for the first summand in (21).
By the argument preceding (15), one has

E[ sup_{t∈[0,1]} |ΣBt − ΣBη(t)|² ]^{1/2} ≤ κ1 |Σ| ( ε′ log(e/ε′) )^{1/2} = κ1 ( F(h) ε′ log(e/ε′) )^{1/2},

where κ1 is a constant that depends only on dX. This estimate is used for the second summand in (21) and putting everything together yields the statement. □

8. Proof of the main results.

Proof of Theorem 1.1. We consider a multilevel Monte Carlo algorithm Ŝ ∈ A partially specified by εk := 2^{−k} and hk := g^{−1}(2^k) for k ∈ Z+. The maximal index m ∈ N and the number of iterations n1, ..., nm ∈ N are fixed explicitly below in such a way that hm ≤ h̄ and m ≥ 2. Recall that

mse(Ŝ) ≤ W(Y, ϒ(m))² + Σ_{k=2}^m (1/nk) E[ ‖ϒ(k) − ϒ(k−1)‖² ] + (1/n1) E[ ‖ϒ(1) − y0‖² ];

see (6). We control the Wasserstein metric via Corollary 3.2. Moreover, we deduce from [6], Theorem 2, that there exists a constant κ0 that depends only on K and dX such that, for k = 2, ..., m,

E[ ‖ϒ(k) − ϒ(k−1)‖² ] ≤ κ0 ( εk−1 log(e/εk−1) + F(hk−1) )

and

E[ ‖ϒ(1) − y0‖² ] ≤ κ0 ( ε0 log(e/ε0) + F(h0) ).

Consequently, one has

(22)  mse(Ŝ) ≤ κ1 ( hm² (1/√εm) + εm log(e/εm) + Σ_{k=0}^{m−1} (1/n_{k+1}) ( F(hk) + εk log(e/εk) ) )

in the general case, and

(23)  mse(Ŝ) ≤ κ2 ( hm² (1/√εm) log(e/εm) + |b − F0(hm)|² εm² + Σ_{k=0}^{m−1} (1/n_{k+1}) ( F(hk) + εk log(e/εk) ) )

in the case where Σ = 0. Note that F(hk) ≤ hk² g(hk) = g^{−1}(2^k)² 2^k. With Lemma A.3, we conclude that hk = g^{−1}(2^k) ≳ (γ/2)^k so that εk log(e/εk) = 2^{−k} log(e 2^k) ≲ g^{−1}(2^k)² 2^k. Hence, we can bound F(hk) + εk log(e/εk) from above by a multiple of hk² g(hk) in (22) and (23).

By Lemma A.3, we have |F0(hm)| ≲ hm/εm as m → ∞. Moreover, in the case with general Σ and g^{−1}(x) ≳ x^{−3/4}, we have hm² (1/√εm) ≳ εm. Hence, in case (I), there exists a constant κ3 such that

(24)  mse(Ŝ) ≤ κ3 ( hm² (1/√εm) log(e/εm) + Σ_{k=0}^{m−1} (1/n_{k+1}) hk² g(hk) ).

Conversely, in case (II), i.e. g^{−1}(x) ≲ x^{−3/4}, the term hm² (1/√εm) is negligible in (22), and we get

(25)  mse(Ŝ) ≤ κ4 ( εm log(e/εm) + Σ_{k=0}^{m−1} (1/n_{k+1}) hk² g(hk) )

for an appropriate constant κ4.
Now, we specify n1, ..., nm in dependence on a positive parameter Z with Z ≥ 1/g^{−1}(2^m). We set n_{k+1} = n_{k+1}(Z) = ⌊Z g^{−1}(2^k)⌋ ≥ (1/2) Z g^{−1}(2^k) for k = 0, ..., m−1 and conclude that, by (30),

(26)  Σ_{k=0}^{m−1} (1/n_{k+1}) hk² g(hk) = Σ_{k=0}^{m−1} (1/n_{k+1}) 2^k g^{−1}(2^k)² ≤ κ5 (1/Z) Σ_{k=0}^{m−1} 2^k g^{−1}(2^m) (2/γ)^{m−k}
        = κ5 (1/Z) 2^m g^{−1}(2^m) Σ_{k=0}^{m−1} γ^{−(m−k)} ≤ κ5 (1/(1 − γ^{−1})) (1/Z) 2^m g^{−1}(2^m).

Similarly, we get with (7)

(27)  cost(Ŝ) ≤ 3 Σ_{k=0}^{m−1} 2^{k+1} n_{k+1} ≤ κ6 Z 2^m g^{−1}(2^m).
We proceed with case (I). By (24) and (26),

(28)  mse(Ŝ) ≤ κ7 ( g^{−1}(2^m)² 2^{m/2} m + (1/Z) 2^m g^{−1}(2^m) )

so that, for Z := 2^{m/2}/(m g^{−1}(2^m)),

mse(Ŝ) ≤ 2κ7 g^{−1}(2^m)² 2^{m/2} m

and, by (27),

cost(Ŝ) ≤ κ6 2^{(3/2)m}/m.

For a positive parameter τ, we choose m = m(τ) ∈ N as the maximal integer with κ6 2^{(3/2)m}/m ≤ τ. Here, we suppose that τ is sufficiently large to ensure the existence of such an m and the property hm ≤ h̄. Then cost(Ŝ) ≤ τ. Since 2^m ≈ (τ log τ)^{2/3}, we conclude that

mse(Ŝ) ≲ g^{−1}((τ log τ)^{2/3})² τ^{1/3} (log τ)^{4/3}.
mse(S)
It remains to consider case (II). Here, (25) and (26) yield

mse(Ŝ) ≤ κ8 ( 2^{−m} m + (1/Z) 2^m g^{−1}(2^m) )

so that, for Z := (1/m) 2^{2m} g^{−1}(2^m),

mse(Ŝ) ≤ 2κ8 2^{−m} m

and, by (27),

cost(Ŝ) ≤ κ6 (1/m) 2^{3m} g^{−1}(2^m)².

Next, let l ∈ N be such that 2κ6 2^{−l} γ^{−2l} ≤ 1. Again we let τ be a positive parameter which is assumed to be sufficiently large so that we can pick m = m(τ) as the maximal natural number larger than l and satisfying 2^{m+l} ≤ g*(τ). Then, by (29),

cost(Ŝ) ≤ κ6 (1/m) 2^{3m} g^{−1}(2^m)² ≤ 2κ6 2^{−3l} (2/γ)^{2l} (1/(m+l)) 2^{3(m+l)} g^{−1}(2^{m+l})² ≤ τ.

Conversely, since 2^{−m} ≤ 2^{l+1} g*(τ)^{−1},

mse(Ŝ) ≤ 2κ8 2^{l+1} g*(τ)^{−1} log₂ g*(τ).

Moreover, g^{−1}(x) ≳ x^{−1} so that x³ g^{−1}(x)²/log x ≳ x/log x, as x → ∞. This implies that log g*(τ) ≈ log τ.

Proof of Corollary 1.2. We fix β′ ∈ (β, 2], or β′ = 2 in the case where β = 2, and note that, by definition of β,

κ1 := ∫_{B(0,1)} |x|^{β′} ν(dx)

is finite. We consider ḡ : (0, ∞) → (0, ∞), h ↦ ∫ (|x|²/h²) ∧ 1 ν(dx). For h ∈ (0, 1], one has

ḡ(h) = ∫_{B(0,1)} (|x|²/h²) ∧ 1 ν(dx) + ∫_{B(0,1)c} (|x|²/h²) ∧ 1 ν(dx) ≤ ∫_{B(0,1)} (|x|^{β′}/h^{β′}) ν(dx) + ∫_{B(0,1)c} 1 ν(dx) ≤ κ2 h^{−β′},

where κ2 = κ1 + ν(B(0,1)c). Hence, we find a decreasing and invertible function g : (0, ∞) → (0, ∞) that dominates ḡ and satisfies g(h) = κ2 h^{−β′} for h ∈ (0, 1]. Then for γ = 2^{1−1/β′}, one has g((γ/2)h) = 2g(h) for h ∈ (0, 1] and we are in the position to apply Theorem 1.1: In the first case, we get

err(τ) ≲ τ^{−(4−β′)/(6β′)} (log τ)^{(2/3)(1−1/β′)}.

In the second case, we assume that β′ ≤ 4/3 and obtain g*(τ) ≈ (τ log τ)^{β′/(3β′−2)} so that

err(τ) ≲ τ^{−β′/(6β′−4)} (log τ)^{(β′−1)/(3β′−2)}.

These estimates yield immediately the statement of the corollary.
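The exponents in the two cases can be cross-checked by exact rational arithmetic; in particular, the two rates coincide at β′ = 4/3, the boundary between the cases. This is only a verification sketch, not part of the proof:

```python
from fractions import Fraction

def rate_case_I(bp: Fraction) -> Fraction:
    """Exponent of tau in the first case: (4 - beta')/(6*beta')."""
    return (4 - bp) / (6 * bp)

def rate_case_II(bp: Fraction) -> Fraction:
    """Exponent of tau in the second case: beta'/(6*beta' - 4)."""
    return bp / (6 * bp - 4)

# The two cases agree at beta' = 4/3, where both give exponent 1/3:
bp = Fraction(4, 3)
print(rate_case_I(bp), rate_case_II(bp))          # 1/3 1/3
# Limiting values in case (I): beta' -> 1 gives 1/2, beta' = 2 gives 1/6.
print(rate_case_I(Fraction(1)), rate_case_I(Fraction(2)))  # 1/2 1/6
```

This makes the statement of the introduction quantitative: the rate degrades continuously from nearly τ^{−1/2} for small activity toward τ^{−1/6} as β′ → 2.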

APPENDIX

LEMMA A.1. Let (At) be a previsible process with state space R^{dY×dX}, let (Lt) be a square integrable R^{dX}-valued Lévy martingale and denote by ⟨L⟩ the process given via

⟨L⟩t = Σ_{j=1}^{dX} ⟨L(j)⟩t,

where ⟨L(j)⟩ denotes the predictable compensator of the classical bracket process for the jth coordinate of L. One has, for any stopping time τ with finite expectation E ∫_0^τ |As|² d⟨L⟩s, that (∫_0^{t∧τ} As dLs)t≥0 is a uniformly square integrable martingale which satisfies

E[ | ∫_0^τ As dLs |² ] ≤ E ∫_0^τ |As|² d⟨L⟩s.

The statement of the lemma follows from the Itô isometry for Lévy-driven stochastic differential equations. See, for instance, [6], Lemma 3, for a proof.

LEMMA A.2. The processes Ȳ and ϒ̄ introduced in Section 3.1 satisfy

E[ sup_{s∈[0,1]} |Ȳs − y0|² ] ≤ κ   and   E[ sup_{s∈[0,1]} |ϒ̄s − y0|² ] ≤ κ,

where κ is a constant that depends only on K.

PROOF. The result is proven via a standard Gronwall inequality type argument that is similar to the proofs of the above propositions. It is therefore omitted. □

LEMMA A.3. Let h̄ > 0, γ ∈ (1, 2) and g : (0, ∞) → (0, ∞) be an invertible and decreasing function such that, for h ∈ (0, h̄],

g( (γ/2) h ) ≥ 2 g(h).

Then

(29)  (γ/2) g^{−1}(u) ≤ g^{−1}(2u)

for all u ≥ g(h̄). Moreover, there exists a finite constant κ1 depending only on g such that for all k, l ∈ Z+ with k ≤ l one has

(30)  g^{−1}(2^k) ≤ κ1 (2/γ)^{l−k} g^{−1}(2^l).

If ν(B(0, h)c) ≤ g(h) for all h > 0, and ν has a second moment, then

∫_{B(0,h)c} |x| ν(dx) ≤ κ2 ( h g(h) + 1 ),

where κ2 is a constant that depends only on g and ∫ |x|² ν(dx).

PROOF. First, note that property (2) is equivalent to

(γ/2) g^{−1}(u) ≤ g^{−1}(2u)

for all sufficiently large u > 0. This implies that there exists a finite constant κ1 depending only on g such that for all k, l ∈ Z+ with k ≤ l one has

g^{−1}(2^k) ≤ κ1 (2/γ)^{l−k} g^{−1}(2^l).

For general h > 0, one has

∫_{B(0,h)c} |x| ν(dx) ≤ ∫_{B(0,h)c ∩ B(0,h̄)} |x| ν(dx) + (1/h̄) ∫ |x|² ν(dx).

Moreover,

∫_{B(0,h)c ∩ B(0,h̄)} |x| ν(dx) ≤ Σ_{n=0}^∞ ν( B(0, (2/γ)^n h)^c ∩ B(0, h̄) ) (2/γ)^{n+1} h
  ≤ Σ_{n=0}^∞ 1_{{h(2/γ)^n ≤ h̄}} g( (2/γ)^n h ) (2/γ)^{n+1} h ≤ 2 h g(h) Σ_{n=0}^∞ γ^{−(n+1)},

where we used that g((2/γ)^n h) ≤ 2^{−n} g(h) on the event appearing in the indicator. □

LEMMA A.4. Let n ∈ N and (Gj)j=0,1,...,n denote a filtration. Moreover, let, for j = 0, ..., n−1, Uj and Vj denote nonnegative random variables such that Uj is Gj-measurable, and Vj is Gj+1-measurable and independent of Gj. Then one has

E[ max_{j=0,...,n−1} Uj Vj ] ≤ E[ max_{j=0,...,n−1} Uj ] · E[ max_{j=0,...,n−1} Vj ].

PROOF. See [6]. □
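The inequality of Lemma A.4 can be illustrated by a small Monte Carlo experiment (an illustration only, not a proof, with a hypothetical filtration: Uj built from a Gaussian random walk known at time j, Vj an independent exponential innovation):

```python
import random

def one_sample(rng: random.Random, n: int = 5):
    """One realization of (U_j, V_j): U_j is G_j-measurable, V_j independent of G_j."""
    s = 0.0
    us, vs = [], []
    for _ in range(n):
        us.append(1.0 + abs(s))          # depends only on the past of the walk
        vs.append(rng.expovariate(1.0))  # fresh innovation, independent of G_j
        s += rng.gauss(0.0, 1.0)
    return us, vs

rng = random.Random(0)
N = 50_000
lhs = rhs_u = rhs_v = 0.0
for _ in range(N):
    us, vs = one_sample(rng)
    lhs += max(u * v for u, v in zip(us, vs)) / N
    rhs_u += max(us) / N
    rhs_v += max(vs) / N
print(lhs, "<=", rhs_u * rhs_v)
```

In this example the gap is strict because the indices achieving the two maxima typically differ, while the lemma only guarantees the inequality.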

REFERENCES

[1] APPLEBAUM, D. (2004). Lévy Processes and Stochastic Calculus. Cambridge Studies in Advanced Mathematics 93. Cambridge Univ. Press, Cambridge. MR2072890
[2] ASMUSSEN, S. and ROSIŃSKI, J. (2001). Approximations of small jumps of Lévy processes with a view towards simulation. J. Appl. Probab. 38 482–493. MR1834755
[3] BERTOIN, J. (1998). Lévy Processes. Cambridge Univ. Press. MR1406564
[4] COHEN, S. and ROSIŃSKI, J. (2007). Gaussian approximation of multivariate Lévy processes with applications to simulation of tempered stable processes. Bernoulli 13 195–210. MR2307403
[5] CREUTZIG, J., DEREICH, S., MÜLLER-GRONBACH, T. and RITTER, K. (2009). Infinite-dimensional quadrature and approximation of distributions. Found. Comput. Math. 9 391–429. MR2519865
[6] DEREICH, S. and HEIDENREICH, F. (2009). A multilevel Monte Carlo algorithm for Lévy driven stochastic differential equations. Preprint.
[7] GILES, M. B. (2008). Improved multilevel Monte Carlo convergence using the Milstein scheme. In Monte Carlo and Quasi-Monte Carlo Methods 2006 343–358. Springer, Berlin. MR2479233
[8] GILES, M. B. (2008). Multilevel Monte Carlo path simulation. Oper. Res. 56 607–617. MR2436856
[9] HEINRICH, S. (1998). Monte Carlo complexity of global solution of integral equations. J. Complexity 14 151–175. MR1629093
[10] JACOD, J. (2004). The Euler scheme for Lévy driven stochastic differential equations: Limit theorems. Ann. Probab. 32 1830–1872. MR2073179
[11] JACOD, J., KURTZ, T. G., MÉLÉARD, S. and PROTTER, P. (2005). The approximate Euler method for Lévy driven stochastic differential equations. Ann. Inst. H. Poincaré Probab. Statist. 41 523–558. MR2139032
[12] KLOEDEN, P. E. and PLATEN, E. (1992). Numerical Solution of Stochastic Differential Equations. Applications of Mathematics (New York) 23. Springer, Berlin. MR1214374
[13] KOMLÓS, J., MAJOR, P. and TUSNÁDY, G. (1975). An approximation of partial sums of independent RV's and the sample DF. I. Z. Wahrsch. Verw. Gebiete 32 111–131. MR0375412
[14] KOMLÓS, J., MAJOR, P. and TUSNÁDY, G. (1976). An approximation of partial sums of independent RV's, and the sample DF. II. Z. Wahrsch. Verw. Gebiete 34 33–58. MR0402883
[15] NOVAK, E. (1995). The real number model in numerical analysis. J. Complexity 11 57–73. MR1319050
[16] PROTTER, P. (2005). Stochastic Integration and Differential Equations, 2nd ed. Stochastic Modelling and Applied Probability 21. Springer, Berlin. MR2273672
[17] PROTTER, P. and TALAY, D. (1997). The Euler scheme for Lévy driven stochastic differential equations. Ann. Probab. 25 393–423. MR1428514
[18] RUBENTHALER, S. (2003). Numerical simulation of the solution of a stochastic differential equation driven by a Lévy process. Stochastic Process. Appl. 103 311–349. MR1950769
[19] RUBENTHALER, S. and WIKTORSSON, M. (2003). Improved convergence rate for the simulation of stochastic differential equations driven by subordinated Lévy processes. Stochastic Process. Appl. 108 1–26. MR2008599
[20] SATO, K. (1999). Lévy Processes and Infinitely Divisible Distributions. Cambridge Studies in Advanced Mathematics 68. Cambridge Univ. Press, Cambridge. MR1739520
[21] TALAY, D. and TUBARO, L. (1990). Expansion of the global error for numerical schemes solving stochastic differential equations. Stochastic Anal. Appl. 8 483–509. MR1091544
[22] ZAITSEV, A. Y. (1998). Multidimensional version of the results of Komlós, Major and Tusnády for vectors with finite exponential moments. ESAIM Probab. Statist. 2 41–108. MR1616527

PHILIPPS-UNIVERSITÄT MARBURG
FB. 12, MATHEMATIK UND INFORMATIK
HANS-MEERWEIN-STRASSE
D-35032 MARBURG
GERMANY
E-MAIL: [email protected]
URL: http://www.mathematik.uni-marburg.de/~dereich
