
The approximate Euler method for Lévy driven stochastic differential equations

Jean Jacod∗, Thomas G. Kurtz†, Sylvie Méléard‡ and Philip Protter§

Abstract

This paper is concerned with the numerical approximation of the expected value
IE(g(X_t)), where g is a suitable test function and X is the solution of a stochastic
differential equation driven by a Lévy process Y. More precisely, we consider an Euler
scheme or an "approximate" Euler scheme with stepsize 1/n, giving rise to a simulable
variable X^n_t, and we study the error δ_n(g) = IE(g(X^n_t)) − IE(g(X_t)).
For a genuine Euler scheme we typically get that δ_n(g) is of order 1/n, and we even
have an expansion of this error in successive powers of 1/n; the assumptions are an
integrability condition on the driving process and appropriate smoothness of the
coefficient of the equation and of the test function g.
For an approximate Euler scheme, that is, when we replace the non-simulable increments
of Y by simulable variables close enough to the desired increments, the order of
magnitude of δ_n(g) is the supremum of 1/n and a kind of "distance" between the
increments of Y and the actually simulated variables. In this situation, a second
order expansion is also available.

Mathematics Subject Classifications (1991): 60H10, 65U05, 60J30


Key words: Euler scheme; stochastic differential equations; simulation; approximate simulations.

∗ Laboratoire de Probabilités et Modèles Aléatoires, CNRS UMR 7599 and Université P. et M. Curie, 4 Place Jussieu, 75252 Paris Cedex 05, France

† Departments of Mathematics and Statistics, University of Wisconsin–Madison, 480 Lincoln Drive, Madison, WI 53706-1388, USA

‡ MODAL'X, Université Paris-10, 200 Avenue de la République, 92000 Nanterre, France, and Laboratoire de Probabilités et Modèles Aléatoires

§ Supported in part by NSF grant DMS-0202958 and NSA grant MDA-904-00-1-0035; ORIE, 219 Rhodes Hall, Cornell University, Ithaca, NY 14853-3801, USA
1 Introduction
1) Approximating Markov process expectations. In applications of Markov pro-
cesses, it is frequently necessary to compute IE(g(Xt )), where X is the process modelling
the system of interest. While this expectation can sometimes be obtained by direct nu-
merical computation, for example, by applying numerical schemes for partial differential
equations, Monte Carlo methods may provide the only effective approach. In the simplest
Monte Carlo approach, the desired expectation is approximated as

    IE(g(X_t)) ≈ (1/N) Σ_{i=1}^N g(X̂^i_t),

where the X̂^i_t are simulated, independent copies of X_t.
to simulate draws from the distribution of Xt exactly, so the Monte Carlo approximation
may introduce a bias
∆g = IEg(X̂t ) − IEg(Xt ),
where X̂ is a simulatable approximation of X. We are interested in developing methods
for estimating this bias for a large class of Markov processes and corresponding approxi-
mations.
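For illustration only (this sketch is ours, not part of the paper): the plain Monte Carlo estimator above, with the driver Y taken to be a standard Brownian motion so that the Euler increments are exactly simulatable; the function names are assumptions of the sketch.

```python
import math
import random

def euler_endpoint(x0, f, n, rng):
    """One Euler path for dX = f(X) dY on [0, 1], with Y a standard
    Brownian motion, so each increment is N(0, 1/n); returns X^n_1."""
    x = x0
    for _ in range(n):
        x = x + f(x) * rng.gauss(0.0, math.sqrt(1.0 / n))
    return x

def monte_carlo(g, x0, f, n, N, seed=0):
    """Average of g over N independent simulated copies of X^n_1: the
    statistical error is O(1/sqrt(N)) and the bias is Delta_n g."""
    rng = random.Random(seed)
    return sum(g(euler_endpoint(x0, f, n, rng)) for _ in range(N)) / N

# Toy check: for dX = X dY with X_0 = 1 we have IE(X_1) = 1.
est = monte_carlo(lambda x: x, 1.0, lambda x: x, n=50, N=2000)
```

For this linear g the bias vanishes (the Euler scheme preserves the martingale property of the toy driver); a nonlinear g exhibits the O(1/n) bias that the results below quantify.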
The simulated process used in the Monte Carlo approximation will typically be a
discrete time Markov chain in which the discrete time-step is identified with a small
interval on the real-time axis. To simplify notation, we take the length of this interval
to be 1/n for some integer n and assume that t = 1. We will denote the approximating
process by X n to emphasize the dependence on the time-step in the simulation. We also
note that the bias will depend on the initial state x, particularly if g is unbounded, so we
want to estimate the bias

    ∆_n g(x) = IE g(X^n_1) − IE g(X_1)          (1.1)

as a function of x and n.
Let (P_t)_{t≥0} denote the transition semigroup for the process X, let Q^n_1 denote the
one-step transition operator for the approximating process X^n, and let Q^n_j = (Q^n_1)^j
be the j-step operator, that is,

    Q^n_j h(x) = IE[h(X^n_{j/n}) | X^n_0 = x].

Defining P^n_t = Q^n_{[nt]} ([nt] is the integer part of the number nt), we can write the bias as

    ∆_n g(x) = P^n_1 g(x) − P_1 g(x)          (1.2)
             = Σ_{j=1}^n P^n_{1−j/n} (Q^n_1 − P_{1/n}) P_{(j−1)/n} g(x)
             = ∫_0^1 P^n_{1−η_n(s)−1/n} E^n P_{η_n(s)} g(x) ds,

where η_n(s) = [ns]/n and

    E^n h(x) = n (Q^n_1 − P_{1/n}) h(x).

If X^n is a good approximation of X, then the operator P^n_t is "close" to the operator P_t.
Hence the identity (1.2) suggests that the rate of convergence of ∆_n g(x) to 0 is governed
by the rate at which the error operator E^n goes to 0. More precisely, one might expect to
obtain estimates of the form

    |E^n h(x)| ≤ ε_n ρ(x) ||h||_E,

for some sequence ε_n → 0, a function ρ ≥ 0, and for h in some collection of functions
D_E with || · ||_E a norm on D_E. If further P_t maps D_E into itself (a requirement which,
for diffusion processes, is sometimes implied by regularity results for partial differential
equations but, as we shall show, can also be obtained by direct probabilistic calculations),
then we may expect a bound of the form
    |∆_n g(x)| ≤ ε_n ∫_0^1 P^n_{1−η_n(s)} ρ(x) ||P_{η_n(s)} g||_E ds.

This analysis suggests the possibility of an exact asymptotic limit of the form

    Γ^{(1)} g(x) ≡ lim_{n→∞} ε_n^{−1} ∆_n g(x) = ∫_0^1 P_{1−s} E P_s g(x) ds,

where

    E h(x) = lim_{n→∞} ε_n^{−1} E^n h(x),

for h ∈ DE . One can also consider the rate of convergence in this limit or attempt to
derive higher order expansions.

2) Euler approximations for solutions of stochastic differential equations. We


develop the desired estimates for solutions of a stochastic differential equation

    X_t = x + ∫_0^t f(X_{s−}) dY_s          (1.3)

driven by a Lévy process Y. The process X is d-dimensional, while Y is d′-dimensional,
so f takes its values in IR^d ⊗ IR^{d′}, and we systematically use (column) vector and matrix
notation. The initial value is some given x ∈ IR^d. The precise assumptions are stated
later.
The process X^n will be given by an Euler approximation with stepsize 1/n, defined
recursively at times i/n by

    X^n_0 = x,    X^n_{(i+1)/n} = X^n_{i/n} + f(X^n_{i/n}) (Y_{(i+1)/n} − Y_{i/n}),          (1.4)

or, since the true increments Y_{(i+1)/n} − Y_{i/n} may be difficult to simulate in practice,
we may substitute i.i.d. random variables ζ^n_i which are close enough to the true increments
and are exactly simulatable. That is, instead of the genuine Euler scheme given by (1.4),
we consider the approximate Euler scheme, still denoted by X^n, given by

    X^n_0 = x,    X^n_{(i+1)/n} = X^n_{i/n} + f(X^n_{i/n}) ζ^n_{i+1}.          (1.5)
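For illustration (our own sketch, not from the paper): the recursions (1.4) and (1.5) are the same loop fed with different increments. Here the toy driver is the compensated Poisson process Y_t = N_t − t, whose genuine increments are simulatable, and the "approximate" increments ζ^n_i are Gaussians matching their mean and variance — a crude stand-in chosen only to exhibit the substitution, not one of the constructions studied below.

```python
import math
import random

def euler_scheme(x0, f, increments):
    """Recursions (1.4)/(1.5): X_{(i+1)/n} = X_{i/n} + f(X_{i/n}) * z_i."""
    x = x0
    for z in increments:
        x = x + f(x) * z
    return x

def poisson(lam, rng):
    """Knuth's inversion method for a Poisson(lam) draw (fine for small lam)."""
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

rng = random.Random(0)
n = 50
# Genuine increments of Y_t = N_t - t over steps of length 1/n.
genuine = [poisson(1.0 / n, rng) - 1.0 / n for _ in range(n)]
# Approximate increments zeta_i: same mean (0) and variance (1/n).
approx = [rng.gauss(0.0, math.sqrt(1.0 / n)) for _ in range(n)]
x_genuine = euler_scheme(1.0, lambda x: x, genuine)
x_approx = euler_scheme(1.0, lambda x: x, approx)
```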

In this approximation, we have two sources of error, the discretization error and the error
due to the approximation of the increment. The latter error does not have a natural
generic characterization, so we will simply assume that we have estimates for

    δ_n(h) = IE(h(ζ^n_1)) − IE(h(Y_{1/n}))          (1.6)

of the form

    |δ_n(h)| = |IE(h(ζ^n_1)) − IE(h(Y_{1/n}))| ≤ (K u_n / n) ||h||_E,
for h in a sufficiently large space DE with an appropriate norm and a sequence un going
to 0. For higher order results, we need an expansion of δn (h) around 0.
The space D_E will be of the form C^k_α(IR^d) for some integer k ≥ 0 and some function
α ≥ 1, where C^k_α(IR^d) is the space of k-times continuously differentiable functions on IR^d
with norm

    ||h||_{α,k} = inf{a > 0 : |∇^i h(x)| ≤ a α(x) for i = 0, . . . , k}.
For all p ≥ 0 we introduce the function

    α_p(x) = 1 + |x|^p  if p > 0,    α_p(x) = 1  if p = 0,          (1.7)

and we write C^k_p(IR^d) and ||h||_{p,k} instead of C^k_{α_p}(IR^d) and ||h||_{α_p,k}.

3) The basic problems. We are interested in results of the following type:

(A) An estimate of the bias, that is

    |∆_n g(x)| ≤ C(g, x) / n          (1.8)
for some constant C(g, x) depending on g and on x, and also of course on the
characteristics of the driving process Y and on the coefficient f .

(B) A first order expansion, that is the existence of an operator Γ(1) such that

    |∆_n g(x) − (1/n) Γ^{(1)} g(x)| ≤ C_1(g, x) / n^2.          (1.9)

(C) Higher order expansions, that is, the existence of operators Γ(k) for k = 1, . . . , m,
such that
    |∆_n g(x) − Σ_{k=1}^m (1/n^k) Γ^{(k)} g(x)| ≤ C_m(g, x) / n^{m+1}.          (1.10)

If f is bounded, then versions of our results for (A) and (B) can be stated in a
straightforward manner; the rates 1/n and 1/n2 respectively are obtained for genuine
Euler schemes, while rates may be different for approximate schemes. Results for (C) and
for unbounded f require somewhat more cumbersome assumptions.

Theorem 1.1 Assume that the coefficient f is in C^4_0(IR^d), that Y_t has finite moments up
to order 8, and that

    h ∈ C^4_0(IR^{d′})  ⟹  |δ_n(h)| ≤ (K u_n / n) ||h||_{0,4}          (1.11)

for some constant K and some sequence (u_n) of positive numbers. Then

    g ∈ C^4_0(IR^d)  ⟹  |∆_n(g)| ≤ K′ (u_n ∨ 1/n) ||g||_{0,4}          (1.12)

for another constant K′ which depends on K, on f and on the law of Y, but not on the
starting point x.

Remark 1.2 For the genuine Euler scheme ζ^n_1 = Y_{1/n}, (1.11) with u_n = 0 is obviously
satisfied, hence we recover the results of Protter and Talay [13], under stronger assumptions
(but we will see weaker assumptions below). The comparison between the rate 1/n due
to the Euler scheme and the rate u_n due to the approximation of Y_{1/n} by ζ^n_1 (in law) is
instructive: since the time needed for the simulation is proportional to n and is also usually
an increasing function of 1/u_n, from a practical point of view it is best (asymptotically)
to choose u_n = 1/n. This is why we assume u_n = 1/n in the next result, which allows for
a Romberg-type method of simulation:

Theorem 1.3 Assume that the coefficient f is in C^{10}_0(IR^d), that Y_t has finite moments
up to order 20, that we have (1.11) with u_n = 1/n, and also

    h ∈ C^6_0(IR^{d′})  ⟹  |δ_n(h) − (1/n^2) β(h)| ≤ (K u′_n / n) ||h||_{0,6}          (1.13)

for some constant K, some linear map β on C^6_0(IR^{d′}) and some sequence (u′_n) with
n u′_n → 0. Then there is a linear map γ_x on C^6_0(IR^d) (where x is the starting point), such that

    g ∈ C^6_0(IR^d)  ⟹  |∆_n(g) − (1/n) γ_x(g)| ≤ K′ (u′_n ∨ 1/n^2) ||g||_{0,6}          (1.14)

for another constant K′ depending on K, on f and on the law of Y, but not on x.

Remark 1.4 For the genuine Euler scheme, we have (1.13) with u′_n = 0, so we again
recover the results of Protter and Talay [13] under stronger assumptions. It is noteworthy
that, except for genuine Euler schemes, we have in general (1.13) with n^2 u′_n → ∞,
unfortunately, as we will see in the examples below.

4) Relationship to other work. The results above, starting with (A), are in deep
contrast with “pathwise” rates of convergence obtained for example in [9], [5], or [6],
where one looks for sequences vn increasing to infinity and such that vn (X n − X) is tight
(as processes, or at some time t) with non-zero limits. The rate v_n depends on the
characteristics of the Lévy process Y, ranging from v_n = √n when Y has a non-vanishing
Wiener part, to v_n = (n/log n)^{1/α} when Y is a symmetric stable process with index α, and
even to "v_n = ∞" (that is, X^n ≡ X for n big enough, depending on ω of course) when
Y is a compound Poisson process. The mathematical reason for this discrepancy is a
lack of uniform integrability which prevents exchanging limits of random variables and
expectations. It is interesting to observe that ∆n g(x) is always of order 1/n, irrespective
of the characteristics of Y , provided Y has some integrability. The reason for that is quite
clear when Y is a compound Poisson process, and we will devote some space to that special
case (although from a practical point of view there is a way to simulate X “exactly” in
that case and one should not use an Euler scheme).
The identity (1.2) has been used by a number of authors to estimate the error in
Markov process approximation. See for example [11], [3], and Section 1.6 of [4].

5) Implications for simulation. The main motivations for these types of results are
practical: we want to estimate IE(g(X_1)). We run the Euler scheme N times, giving rise to
the simulated numbers X^{n,1}_1, . . . , X^{n,N}_1, and we take the estimate
G_{N,n} = (1/N) Σ_{i=1}^N g(X^{n,i}_1).
When g is bounded, for example, and if (A) holds, the error |G_{N,n} − IE(g(X_1))| is of order
1/√N + 1/n, while the time required for the procedure is proportional to nN. Consequently,
it is best to take N = O(n^2) and, in order to achieve an expected error ε, we need a time
T_ε of order 1/ε^3.
If (B) holds, we can use a Romberg-type method: we run N simulations of the
Euler scheme twice, once with stepsize 1/n and once with stepsize 1/2n, and we take the
estimate to be G_{N,n} = (1/N) Σ_{i=1}^N (2 g(X^{2n,i}_1) − g(X^{n,i}_1)). Since
2 IE(g(X^{2n}_1)) − IE(g(X^n_1)) − IE(g(X_1)) is of order 1/n^2, the error
|G_{N,n} − IE(g(X_1))| is of order 1/√N + 1/n^2, while the time required for the procedure
is still proportional to nN. Consequently, it is best to take N = O(n^4). Hence, in order
to achieve an expected error ε, we need a time T_ε of order 1/ε^{5/2}, which can be
substantially less than the time O(1/ε^3) required if we do not use the Romberg approach.
Observe that we can do no better than T_ε = O(1/ε^2), the time required in the case
where one can exactly simulate g(X_1) without any bias.
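A sketch of the Romberg estimator (ours, with a Brownian toy driver; the paper does not specify how the two step sizes share randomness — coupling the coarse path to the fine path's increments, as below, is a common variance-reduction choice, not a requirement of the bias analysis):

```python
import math
import random

def euler_endpoint(x0, f, increments):
    """Euler endpoint X^n_1 for dX = f(X) dY from a list of increments."""
    x = x0
    for z in increments:
        x = x + f(x) * z
    return x

def romberg_estimate(g, x0, f, n, N, seed=0):
    """Average of 2 g(X^{2n,i}_1) - g(X^{n,i}_1) over N draws; the O(1/n)
    bias terms cancel, leaving a bias of order 1/n^2."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(N):
        fine = [rng.gauss(0.0, math.sqrt(0.5 / n)) for _ in range(2 * n)]
        # Coarse path on stepsize 1/n reuses the fine path's randomness.
        coarse = [fine[2 * i] + fine[2 * i + 1] for i in range(n)]
        total += 2.0 * g(euler_endpoint(x0, f, fine)) - g(euler_endpoint(x0, f, coarse))
    return total / N

# Toy check: dX = X dY, X_0 = 1, so IE(X_1) = 1.
est = romberg_estimate(lambda x: x, 1.0, lambda x: x, n=20, N=2000)
```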
The expansion (C) is mathematically interesting, but its practical relevance is more
dubious: in principle it lays the foundation for studying higher order Romberg schemes,
but these are probably quite unstable from the numerical point of view, especially for a
discontinuous Lévy process Y .

6) Organization of the paper. In Section 2 we state the main results in full gener-
ality. Some “practical” examples are expounded in Section 3, with emphasis put on the
evaluation of the time necessary to perform simulations. In Section 4 we state and prove
a version of the first order expansion when the driving process Y is a compound Poisson
process and for the genuine Euler scheme: this is easy to prove and serves as a good in-
troduction to the general case. Section 5 is devoted to recalling some more or less known
results on Equation (1.3). In Section 6 we prove various technical lemmas, and the last
three sections are devoted to the proof of the main results.

2 The main results
1) Some notation. We suppose that the time interval is bounded, and without loss of
generality that it is [0, 1]. The starting point x plays a role in the results, and a crucial
role in the proofs. So, instead of X_t we write X^x_t in (1.3). Similarly, we write X^{n,x}_{i/n} for
the solution (1.5) of the approximate Euler scheme. For more coherent notation, we also
define X^{n,x}_t for all t ∈ [0, 1], by setting X^{n,x}_t = X^{n,x}_{i/n} when i ≤ nt < i + 1.
We also consider the processes

    Y^n_t = Σ_{i=1}^{[nt]} ζ^n_i,          (2.1)

where [s] denotes the integer part of s. For a genuine Euler scheme, we have of course
Y^n_t = Y_{[nt]/n}. In general, Y^n is a non-homogeneous process with independent increments;
however, its increments over intervals whose length is a multiple of 1/n are stationary. If we
use the notation ϕ_n(t) = i/n when i < nt ≤ i + 1, then we can rewrite (1.5) as

    X^{n,x}_t = x + ∫_0^t f(X^{n,x}_{ϕ_n(s)}) dY^n_s.          (2.2)

Therefore, by well-known results on the stability of stochastic differential equations
(plus the fact that a sequence of processes with independent increments converging in
law to a Lévy process is "predictably uniformly tight" (PUT): see Slominski [15], or also
Theorem IX.6.9 of [7]), we readily obtain:

    Y^n →(law) Y        ⟹  X^{n,x} →(law) X^x,
    Y^n_t = Y_{[nt]/n}  ⟹  sup_{t∈[0,1]} |X^{n,x}_t − X^x_t| →(prob.) 0          (2.3)

(the second case corresponds to the genuine Euler scheme).


From time to time we need a filtration. For the genuine Euler scheme we take for (F_t)
the filtration generated by Y. Otherwise, we have convergence in law only (see (2.3)), so it
is no restriction to assume that all processes are defined on the same probability space, that
the Y^n's are mutually independent and independent of Y, and that (F_t) is the filtration
generated by all the processes Y and Y^n for n ≥ 1.

2) Assumptions on f and Y . We now state our assumptions, starting with those on


the coefficient f :

Assumption H(l, N): The coefficient f is N times continuously differentiable and all
partial derivatives of order l, l + 1, . . . , N are bounded. □

Here N is an integer, and l will be either 0 or 1. We usually assume at least H(1, 1),
except when Y is a compound Poisson process. Clearly H(l, N + 1) ⇒ H(l, N).

Next, we denote by (b, c, F) the characteristics of the Lévy process Y, in the sense that

    IE(e^{i⟨u,Y_t⟩}) = exp t ( i⟨u, b⟩ − ⟨u, cu⟩/2 + ∫ F(dy) (e^{i⟨u,y⟩} − 1 − i⟨u, τ(y)⟩) ),

where τ is a truncation function on IR^{d′}, that is, a map from IR^{d′} into itself which is
bounded and coincides with the identity near 0, and whose components can be assumed
to be in C^∞_0(IR^{d′}) without loss of generality. Then we need the following integrability
assumption, where p is some nonnegative real number:

Assumption F(p): We have ∫_{{|y|>1}} |y|^p F(dy) < ∞ (equivalently, Y_t ∈ IL^p for all t). □

3) Assumptions on the variables ζ^n_i. As said before, for each n the d′-dimensional
variables ζ^n_i, i = 1, . . . , n, are i.i.d. The discrepancy between ζ^n_1 and Y_{1/n} is
measured by the quantities δ_n(h) of (1.6), and we make different assumptions according
to the kind of result (of type (A), (B) or (C)) which we want to prove. Below, p ∈ IR_+
and N ∈ IN* are arbitrary, and (u_n)_{n≥1} and (u′_n)_{n≥1} are arbitrary sequences of positive
numbers tending to 0 and such that u′_n/u_n → 0:

Assumption G({u_n}, p): We have F(p), and there is a constant K such that

    h ∈ C^4_p(IR^{d′})  ⟹  |δ_n(h)| ≤ (K u_n / n) ||h||_{p,4}.          (2.4)

Observe that we put u_n/n and not u_n on the right: this is because the variable Y_{1/n} is close
to 0, and indeed "of order 1/n" already as n → ∞, in the sense that IE(h(Y_{1/n})) = O(1/n)
for any h ∈ C^0_p(IR^{d′}) under F(p). We will see later that this assumption is enough to
ensure the first convergence in law in (2.3).

Assumption G′({u_n}, {u′_n}, p): We have G({u_n}, p), and there are a constant K and a
linear map φ on the space C^6_p(IR^{d′}) such that

    h ∈ C^6_p(IR^{d′})  ⟹  |δ_n(h) − (u_n/n) φ(h)| ≤ (K u′_n / n) ||h||_{p,6}.          (2.5)

Assumption G″(N, p): We have F(p) and there are a constant K and linear maps φ_k
on the spaces C^{2k+4}_p(IR^{d′}) for k = 1, . . . , N, such that (with an empty sum set equal to 0):

    k = 0, . . . , N,  h ∈ C^{2k+4}_p(IR^{d′})  ⟹  |δ_n(h) − Σ_{i=1}^k (1/n^{i+1}) φ_i(h)| ≤ (K / n^{k+2}) ||h||_{p,2k+4}.          (2.6)

Clearly G″(N, p) ⇒ G″(N − 1, p), and G″(1, p) = G′({1/n}, {1/n^2}, p), and also
G′({u_n}, {u′_n}, p) ⇒ G({u_n}, p).
Finally, observe that for the genuine Euler scheme we have δn (h) = 0 for all h, hence
all the above assumptions are trivially fulfilled in this case.

4) The main results. Our aim is to evaluate the "error" involved by the (approximate
or genuine) Euler scheme, measured through the quantity

    ∆_{n,t} g(x) = IE(g(X^{n,x}_{[nt]/n})) − IE(g(X^x_{[nt]/n}))          (2.7)
for suitable test functions g.
Our first result is an estimate on ∆n,t g(x), that is, it solves problem (A) for approximate
Euler schemes:

Theorem 2.1 Let p ≥ 0 and l = 0 or l = 1. Assume H(l, 4) and G({u_n}, 4 + 4 ∨ p) for
some sequence u_n decreasing to 0. Then there is a constant K depending on p, f, Y only,
such that for any g ∈ C^4_p(IR^d) we have, for all t ∈ [0, 1], n ≥ 1, x ∈ IR^d:

    |∆_{n,t} g(x)| ≤ Kt (u_n ∨ 1/n) ||g||_{p,4} (1 + |x|^{p+4l}).          (2.8)

As said before, for the genuine Euler scheme F(p) implies G({u_n}, p) with u_n = 0, so
we recover the estimates of Protter and Talay in [13]. For the approximate Euler scheme
this result allows us to separate the contribution of the error of the Euler scheme (that is,
1/n) from that of the simulation discrepancy (which is u_n).
The second main result is a first order expansion for ∆_{n,t} g(x). The result goes as
follows, and we see that there are in fact two "first order terms", corresponding respectively
to the Euler scheme error and to the simulation discrepancy.

Theorem 2.2 Let l = 0 or l = 1. Assume H(l, 10) and G′({u_n}, {u′_n}, 10 + 10 ∨ p) for
some p ≥ 0 and some sequences u_n and u′_n with u_n → 0 and u′_n/u_n → 0. Then there is a
constant K depending on p, f, Y only, and linear operators U_t and V_t on C^{10}_p(IR^d), such
that for any g ∈ C^{10}_p(IR^d) the functions U_t g and V_t g belong to C^4_{p+6l}(IR^d), and
for all t ∈ [0, 1], n ≥ 1, x ∈ IR^d:

    ||U_t g||_{p+6l,4} ≤ Kt ||g||_{p,10},    ||V_t g||_{p+6l,4} ≤ Kt ||g||_{p,10},          (2.9)

    |∆_{n,t} g(x) − u_n U_t g(x) − (1/n) V_t g(x)| ≤ Kt (u′_n ∨ u_n^2 ∨ 1/n^2) ||g||_{p,10} (1 + |x|^{p+8l}).          (2.10)
For the genuine Euler scheme and under F(10 + 10 ∨ p) we have the previous hypotheses
with u_n = u′_n = 0, so the first order term is (1/n) V_t g(x) and the remainder is of order 1/n^2.
Theorems 1.1 and 1.3 are Theorems 2.1 and 2.2 respectively, when l = 0 and p = 0.
Finally we state the result about Problem (C), that is, expansions of arbitrary order,
of the form:

    ∆_{n,t} g(x) = Σ_{k=1}^N (1/n^k) Γ^{(k)}_{[nt]/n} g(x) + (1/n^{N+1}) R_{N,n,t} g(x).          (2.11)

Theorem 2.3 Let p ≥ 0, l = 0 or l = 1, and N ≥ 1, and assume H(l, 6N + 4)
and G″(N, 6N + 4 + (6N + 4) ∨ p). Then there is a constant K depending only on p,
f, Y and N, and linear operators Γ^{(k)}_t on C^{6k}_p(IR^d) for k = 1, . . . , N, such that for
r = 6k, 6k + 1, . . . , 6N + 4 we have

    g ∈ C^r_p(IR^d)  ⟹  Γ^{(k)}_t g ∈ C^{r−6k}_{p+6lk}(IR^d),    ||Γ^{(k)}_t g||_{p+6lk,r−6k} ≤ Kt ||g||_{p,r},          (2.12)

and moreover, if g ∈ C^{6N+4}_p(IR^d), we have the expansion (2.11) with a remainder satisfying

    |R_{N,n,t} g(x)| ≤ Kt ||g||_{p,6N+4} (1 + |x|^{p+4(N+1)l}).          (2.13)

Remark 2.4 In a sense (2.11) is not a true expansion, because the "coefficients" Γ^{(k)}_{[nt]/n} g(x)
depend on n, except when t = 1 of course. But, except for t = 1 again, ∆_{n,t} g(x) is not
really a function of t but rather of the discretized time [nt]/n, so having an expansion
which depends on time through [nt]/n is also natural.
For the genuine Euler scheme, one could also use Y instead of Y^n in (2.2) (the two
Euler approximations coincide at all times i/n). Instead of (2.7) one naturally takes
∆_{n,t} g(x) = IE(g(X^{n,x}_t)) − IE(g(X^x_t)). Then we have (2.8), but not (2.10), essentially
because nt − [nt] oscillates (when t ≠ 1) between 0 and 1 as n varies. □

Remark 2.5 For the genuine Euler scheme one can slightly improve the result of Theorem
2.3: we only need H(l, 4N + 2) and F(4N + 2 + (4N + 2) ∨ p), and g ∈ C^{4N+2}_p(IR^d), and
then Γ^{(k)}_t g ∈ C^{4N+2−4k}_{p+4lk}(IR^d). Similarly, Theorem 2.2 holds under H(l, 6) and
F(6 + 6 ∨ p) and for g ∈ C^6_p(IR^d) (and of course U_t g does not show up in that case). □

Remark 2.6 When u_n = 1/n in Theorem 2.2, the sum U_t + V_t is equal to the operator
Γ^{(1)}_t of Theorem 2.3. □

Remark 2.7 At the end of the paper (Remark 9.1) we give an "explicit" form for Γ^{(1)} in
the 1-dimensional case. This operator is well defined under F(2) only, so it is likely that
we cannot drop integrability assumptions, even when g is bounded (but the assumption
that Y_t has finite moments of order 20 (or 12 in the genuine Euler case, see Remark 2.5)
is obviously too strong!). □

Remark 2.8 From a practical point of view, only the first two theorems are interesting:
the first one for the plain (approximate) Euler scheme, and the second one if one wishes to
use the Romberg method. In the latter case, the method can be applied only when the two
"first order" terms are comparable, that is, when u_n = 1/n. This is not a true practical
restriction, since the probabilist or statistician can indeed choose the accuracy u_n (at the
price of a more or less long simulation time for a single variable ζ^n_1): we explain in
Section 3 below how this works on a particular example. □

5) Let us end this section with another set of assumptions, in a particular case. Actually,
checking Assumptions G({u_n}, p) or G′({u_n}, {u′_n}, p) may be quite difficult when we have
a procedure to approximate Y_{1/n} by variables ζ^n_1 of which we know only the laws (not
to mention G″(N, p)).
However, there is a situation which occurs often in practice and for which we have
simpler conditions: we will say that we have a restricted approximate Euler scheme if
each ζ^n_1 is (in law) the value at time 1/n of a Lévy process Y′^n (equivalently, the law of
ζ^n_1 is infinitely divisible). That is, for each n we have a Lévy process Y′^n, and we take
ζ^n_i = Y′^n_{i/n} − Y′^n_{(i−1)/n}. Then of course the process Y^n of (2.1) is the discretization
of Y′^n, that is Y^n_t = Y′^n_{[nt]/n}.
In this situation it is usually the case that the characteristics of Y′^n are known; they
are denoted by (b′_n, c′_n, F′_n) (w.r.t. the same truncation function τ as Y). We also

consider the second modified characteristics of Y and Y′^n, given by

    c̃ = c + F(τ τ*),    c̃′_n = c′_n + F′_n(τ τ*)          (2.14)

(here and below, we write F(g) instead of ∫ g(y) F(dy)). With this notation, and if we
further denote by C′^k_p(IR^{d′}) (for k ≥ 2) the set of all h ∈ C^k_p(IR^{d′}) such that
∇^i h(0) = 0 for i = 0, 1, 2, we can introduce the following two assumptions (we suppose, as above, that
u_n → 0 and u′_n/u_n → 0):

Assumption Ĝ({u_n}, p): We have F(p), and there is a constant K such that

    |b′_n − b| ≤ K u_n,    |c̃′_n − c̃| ≤ K u_n,
    h ∈ C′^4_p(IR^{d′})  ⟹  |F′_n(h) − F(h)| ≤ K u_n ||h||_{p,4}.          (2.15)

Assumption Ĝ′({u_n}, {u′_n}, p): We have F(p), and there are a vector β ∈ IR^{d′}, a
d′ × d′ matrix σ, a linear map Φ on C′^6_p(IR^{d′}) and a constant K such that

    |b′_n − b − u_n β| ≤ K u′_n,    |c̃′_n − c̃ − u_n σ| ≤ K u′_n,
    h ∈ C′^4_p(IR^{d′})  ⟹  |F′_n(h) − F(h)| ≤ K u_n ||h||_{p,4},          (2.16)
    h ∈ C′^6_p(IR^{d′})  ⟹  |F′_n(h) − F(h) − u_n Φ(h)| ≤ K u′_n ||h||_{p,6}.

Proposition 2.9 In the case of a restricted approximate Euler scheme, that is,
ζ^n_i = Y′^n_{i/n} − Y′^n_{(i−1)/n} for some Lévy process Y′^n with characteristics
(b′_n, c′_n, F′_n) and second modified characteristic c̃′_n, we have for any p ≥ 2:
a) Ĝ({u_n}, p) implies G({u_n ∨ 1/n}, p).
b) If u_n ≥ 1/n, then Ĝ′({u_n}, {u′_n}, p) implies G′({u_n}, {u′_n ∨ u_n/n}, p).

The proof of this proposition is given in Subsection 6.1.

3 Some examples
In this section we consider some practical examples and compute the time necessary to
achieve a given precision in the computation of IE(g(X_1)) via a Monte Carlo method, as
explained in the Introduction (and especially §5). We draw N independent copies of the
approximation X^n_1 of X_1 (or N copies of X^n_1 and X^{2n}_1 if we use the Romberg
technique). Assuming g bounded, the expected error e(N, n) is the sum of the statistical
error, of order 1/√N, plus the bias ∆_n g(x) (or 2∆_{2n} g(x) − ∆_n g(x) for the Romberg method).
Therefore if Theorem 1.1 applies we get

    e(N, n) = O(1/√N + (1/n ∨ u_n)),          (3.1)

while for the Romberg method, if we can apply Theorem 1.3, we get

    e(N, n) = O(1/√N + (1/n^2 ∨ u′_n)).          (3.2)

In both cases the time needed is T(N, n) = O(N n α_n), where α_n is the time necessary to
calculate a single time step.

1) Genuine Euler scheme: If we can simulate exactly the increments of Y, the time
α_n is α_n = O(1), and we have (2.4) and (2.5) with u_n = u′_n = 0. Optimizing the choices
of n and N in (3.1), subject to the condition e(N, n) ≤ ε, leads us to take N = O(n^2) and
n = O(1/ε), and injecting into T(N, n) = O(N n) gives a time T_ε necessary to achieve
a precision ε which satisfies:

    T_ε = O(ε^{−3}).          (3.3)

If we use the Romberg method, we apply (3.2) instead of (3.1): this leads us to take
N = O(n^4) and n = O(1/√ε), and injecting into T(N, n) = O(N n) gives

    T_ε = O(ε^{−5/2}).          (3.4)

2) Approximate Euler scheme: We can exactly simulate the drift (of course !) and
the Wiener part of Y , but not the jump part (except when this jump part is compound
Poisson, or is a stable process, but in the latter case the integrability assumptions of this
paper are not fulfilled). Otherwise, we cannot exactly simulate the increments of Y . To
approximate them, most methods resort to deleting in some way or another the “small
jumps” of Y , so for the discontinuous part we are left with a compound Poisson process,
which can usually be simulated. The reader can look at the papers [14] of Rosiński or [1]
of Asmussen and Rosiński for various possibilities. Below we use the most simple–minded
one, with a view towards minimizing the time needed. This method works if we can
simulate a variable whose law is the (normalized) restriction of F to the complement of
any neighborhood of 0: since F is often explicitly known, this is in general feasible.
So we truncate the jumps at some cut-off size v_n (going to 0 as n → ∞). This
amounts to a restricted approximate Euler scheme, the characteristics (b′_n, c′_n, F′_n) and c̃′_n
being chosen such that Ĝ({u_n}, p) or Ĝ′({u_n}, {u′_n}, p) holds for suitable sequences u_n
and/or u′_n, with F′_n(dx) = 1_{{|x|>v_n}} F(dx). Then we can of course choose b′_n = b, and
c′_n in such a way that c̃′_n = c̃, so only the last parts of (2.15) or (2.16) have to be checked.
Observe that Y′^n is the sum of a drift, a Wiener process, and a compound Poisson
process. So we can simulate ζ^n_1 = Y′^n_{1/n} exactly by using a Gaussian variable, plus a
Poisson variable Z (the number of jumps on the interval (0, 1/n]), plus Z variables drawn
from the (normalized) law F′_n. The time necessary to do that is random, with expectation

    α_n = O(1 + IE(Z)) = O(1 + (1/n) F′_n(IR^{d′})).          (3.5)

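As an illustration of this recipe (our sketch; the one-sided jump measure F(dx) = A x^{−1−α} dx on (v, ∞) and all names are assumptions, chosen so that the normalized jump law has an explicit Pareto inverse CDF):

```python
import math
import random

def sample_poisson(lam, rng):
    """Knuth's inversion method for Poisson(lam), adequate for small lam."""
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def simulate_zeta(n, b, c, A, alpha, v, rng):
    """One increment zeta^n_1 = Y'^n_{1/n}: drift + Gaussian part, plus a
    Poisson number Z of jumps on (0, 1/n], each drawn from the normalized
    restriction of F(dx) = A x^{-1-alpha} dx to (v, infinity)."""
    dt = 1.0 / n
    z = b * dt + rng.gauss(0.0, math.sqrt(c * dt))
    lam = A * v ** (-alpha) / alpha  # total mass F'_n((v, inf)) = jump intensity
    for _ in range(sample_poisson(lam * dt, rng)):
        z += v * (1.0 - rng.random()) ** (-1.0 / alpha)  # Pareto inverse CDF
    return z

rng = random.Random(0)
draws = [simulate_zeta(100, 0.1, 1.0, 0.5, 1.5, 0.05, rng) for _ in range(1000)]
```

Per step, IE(Z) = lam/n, matching (3.5); lowering the cut-off v raises the intensity like v^{−α} and hence the expected cost.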
Now we introduce the assumptions. First, we suppose F(p) for all the values of p
necessary to apply our theorems. Next, we set β(t) = F ({y : |y| > t}) for t > 0, and we

assume that

    t ≤ 1  ⟹  β(t) ≤ C / t^α          (3.6)

for some constants C > 0 and α ∈ [0, 2]: this assumption is always satisfied for α = 2,
because F integrates x ↦ |x|^2 near 0, and if it holds for some α′ it also holds for any
α > α′. If it holds with α = 0, then β(0) < ∞ and the purely discontinuous part of Y is
compound Poisson. Note also that it holds for some α ∈ (0, 2) as soon as the Lévy measure
F, in a neighborhood of 0, is dominated by the Lévy measure of an α-stable process.
For q ≥ 2 we also introduce the functions

    β̃_q(t) = ∫_{{|y|≤t}} |y|^q F(dy) = q ∫_0^t s^{q−1} (β(s) − β(t)) ds ≤ (C q / (q − α)) t^{q−α},          (3.7)

where the last inequality holds under (3.6).
We say that we are in the pseudo-symmetrical case if ∫_{{|y|≤t}} y_i y_j y_k F(dy) = 0 for all
i, j, k when t is small enough (here y_i is the ith coordinate of y ∈ IR^{d′}; this holds
e.g. when F is invariant under all rotations of IR^{d′}).
Note that, in view of (3.5), we have α_n = O(1 + 1/(n v_n^α)) under (3.6). Therefore the
expected time necessary to perform the computation, namely T(N, n) = O(N n α_n), is

    T(N, n) = O(N n + N / v_n^α).          (3.8)

Now we want Ĝ({u_n}, p). As said before, only the last part of (2.15) has to be checked.
For any function h on IR^{d′} we have F′_n(h) − F(h) = −∫ h(y) 1_{{|y|≤v_n}} F(dy). If
h ∈ C′^4_p(IR^{d′}), by using a Taylor expansion of h around 0, up to order 4, we obtain

    |F′_n(h) − F(h)| ≤ K β̃_4(v_n) ||h||_{p,4}   in the pseudo-symmetrical case,          (3.9)
    |F′_n(h) − F(h)| ≤ K β̃_3(v_n) ||h||_{p,4}   otherwise,
for some constant K. Hence (3.7) yields Ĝ({u_n}, p) with u_n = v_n^{4−α} in the pseudo-
symmetrical case, and u_n = v_n^{3−α} otherwise. Then Theorem 1.1 and Proposition 2.9
yield e(N, n) = O(1/√N + (u_n ∨ 1/n)). So, in view of minimizing (3.8), it is best to take
u_n = O(1/n) and N = O(n^2). This leads to an expected time T_ε necessary to achieve an
error smaller than ε which is

    T_ε = O(ε^{−3})                   in the pseudo-symmetrical case, or if α ≤ 3/2,          (3.10)
    T_ε = O(ε^{−(6−α)/(3−α)})         otherwise.

Moreover, it is noteworthy to observe that the expected number of jumps to simulate in


a single interval is always smaller than 1 in the first case above.

If we want to use the Romberg method, based upon Theorem 1.3, we need the last two parts of (2.16) with u_n = 1/n; for this an assumption like (3.6) is not enough, and we need an equivalent of β(t) (or β_±(t)) as t → 0. To keep things simple, we consider the very particular case where d' = 1 and the Lévy measure F satisfies
$$F(dx)\,1_{[-v,v]}(x) = \Big(\frac{A_+}{x^{1+\alpha}}\,1_{(0,v)}(x) + \frac{A_-}{(-x)^{1+\alpha}}\,1_{(-v,0)}(x)\Big)\,dx, \eqno(3.11)$$
where α ∈ (0, 2), A_+, A_− ≥ 0, and v > 0 is some number. This of course implies (3.6) with the same α. We already know that (2.15) holds with u_n = v_n^{4−α} in the pseudo-symmetrical case (corresponding to A_+ = A_− here), and with u_n = v_n^{3−α} otherwise. Hence we take v_n = n^{−1/(4−α)} if A_+ = A_− and v_n = n^{−1/(3−α)} otherwise. Then a simple calculation shows that the last assertion in (2.16) holds if h ∈ C^{06}_p(IR^{d'}), with
$$\Phi(h) = \frac{A_+ + A_-}{24(4-\alpha)}\,h^{(4)}(0), \qquad u'_n = n^{-\frac{6-\alpha}{4-\alpha}}$$
if A_+ = A_−, and otherwise
$$\Phi(h) = \frac{A_+ - A_-}{6(3-\alpha)}\,h^{(3)}(0), \qquad u'_n = n^{-\frac{4-\alpha}{3-\alpha}}.$$

Then we can apply Theorem 1.3 and (3.2) to get that e(N, n) = O(1/√N + 1/n^{(6−α)/(4−α)}) if A_+ = A_−, and e(N, n) = O(1/√N + 1/n^{(4−α)/(3−α)}) otherwise. Then we take N = O(n^{(12−2α)/(4−α)}) in the first case, and N = O(n^{(8−2α)/(3−α)}) in the second case. This leads to an expected time T_ε necessary to achieve an error smaller than ε which is
$$T_\varepsilon = \begin{cases} O\big(\varepsilon^{-\frac{16-3\alpha}{6-\alpha}}\big) & \text{if } A_+ = A_-\\[2pt] O\big(\varepsilon^{-\frac{11-3\alpha}{4-\alpha}}\big) & \text{if } A_+ \ne A_- \text{ and } \alpha \le 3/2\\[2pt] O\big(\varepsilon^{-\frac{8-\alpha}{4-\alpha}}\big) & \text{if } A_+ \ne A_- \text{ and } \alpha > 3/2. \end{cases} \eqno(3.12)$$
In all cases we have T_ε = ε^{−ρ(α)}, and the smaller ρ(α) is, the better the result. We can summarize all the results by stating the behavior of ρ(α) as a function of α, as follows:

    α :                                0       3/2      2
    genuine simple                     3   →   3    →   3
    genuine Romberg                   2.5  →  2.5   →  2.5
    approximate simple, symm.          3   →   3    →   3
    approximate simple, non-symm.      3   →   3    ↗   4
    approximate Romberg, A_+ = A_−   2.66  ↘  2.55  ↘  2.5
    approximate Romberg, A_+ ≠ A_−   2.75  ↘  2.6   ↗   3

The reader will observe that the rates of convergence are quite reasonable for the approximate scheme, compared with those for the genuine scheme. Also, the improvement brought by the Romberg method is not really significant.
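The exponents in the last two rows can be tabulated directly from (3.12); the following small script (a sketch, with the constant rows of the table omitted) reproduces them up to rounding:

```python
# rho(alpha) in T_eps = eps^{-rho(alpha)} for the approximate Romberg scheme,
# as given in (3.12); the other rows of the summary table are constant in alpha.
def rho_symmetric(alpha):        # case A_+ = A_-
    return (16 - 3 * alpha) / (6 - alpha)

def rho_asymmetric(alpha):       # case A_+ != A_-
    if alpha <= 1.5:
        return (11 - 3 * alpha) / (4 - alpha)
    return (8 - alpha) / (4 - alpha)

for a in (0.0, 1.5, 2.0):
    print(a, round(rho_symmetric(a), 2), round(rho_asymmetric(a), 2))
```

At α = 2 the two cases give 2.5 and 3 respectively, as in the table.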

4 The compound Poisson case


In this section we suppose that Y is a compound Poisson process. This can be expressed through its characteristics as follows: c = 0, F(IR^{d'}) < ∞ and b = ∫ τ(y) F(dy). We only consider the genuine Euler scheme, since we can simulate Y exactly in this situation. Actually, we can even simulate X^x exactly, so the result below is given only for the sake of comparison with the general result and because of the simplicity of its proof, and we restrict ourselves to the first order expansion.

In this situation, Equation (1.3) has a unique solution X x , with no assumption at all
on the coefficient f . This is why we assume nothing like H(l, N ) below. Observe also that
there is no integrability assumption on the jumps of Y like F(p) below.

Theorem 4.1 If Y is a compound Poisson process with Lévy measure F, and if Y^n_t = Y_{[nt]/n} (the genuine Euler scheme), then for any bounded measurable function g on IR^d we have the expansion (2.11) for N = 1 with the operator Γ^{(1)}_t given by
$$\Gamma^{(1)}_t g(x) = \frac12\int_0^t P_s H_{t-s}\,g(x)\,ds, \eqno(4.1)$$
$$H_s g(y) = \int F(du)F(dv)\,\Big(P_s g\big(y + f(y)(u+v)\big) - P_s g\big(y + f(y)u + f(y+f(y)u)v\big)\Big), \eqno(4.2)$$
where (P_t)_{t≥0} is the transition semi-group of the process X^x, and for some constant K depending on F and f we have (below ‖g‖_∞ denotes the sup-norm):
$$\big|R_{1,n,t}\,g(x)\big| \le tK\|g\|_\infty, \qquad \big|\Gamma^{(1)}_t g(x)\big| \le tK\|g\|_\infty. \eqno(4.3)$$

The reason such a result holds is simple enough: recall that in this case we have X^{n,x} ≡ X^x on the set A_n where Y has at most one jump on each interval I^n_i = ((i−1)/n, i/n], and IP(A_n) → 1. The set B_n on which there is exactly one interval I^n_i on which two jumps of Y occur, while Y jumps at most once on all other intervals I^n_j, has a probability of order 1/n, and the complement of A_n ∪ B_n has a probability of order 1/n². On B_n the values of X^{n,x}_1 and of X^x_1 are possibly far apart, so IE((g(X^{n,x}_1) − g(X^x_1)) 1_{C_n}) is 0 when C_n = A_n, of order 1/n when C_n = B_n, and of course of order 1/n² if C_n is the complement of A_n ∪ B_n (when g is bounded).
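These orders of magnitude can be checked directly from the Poisson distribution of the interval jump counts: with λ = F(IR^{d'}), the n counts are i.i.d. Poisson(λ/n), so IP(A_n) = (e^{−λ/n}(1 + λ/n))^n. A quick numerical sanity check (a sketch with an arbitrary value of λ, not taken from the paper):

```python
import math

def prob_at_most_one_jump_each(lam, n):
    """P(A_n): each of the n intervals carries at most one jump,
    the counts being i.i.d. Poisson(lam / n)."""
    p_interval = math.exp(-lam / n) * (1 + lam / n)
    return p_interval ** n

lam = 1.0
for n in (10, 100, 1000):
    deficit = 1 - prob_at_most_one_jump_each(lam, n)
    # the deficit 1 - P(A_n) should approach lam^2 / (2n)
    print(n, deficit, lam**2 / (2 * n))
```

The printed deficits approach λ²/2n, in line with the first part of (4.4) below.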

Proof. We set λ = F(IR^{d'}) and G = F/λ (a probability measure, which is the law of the jumps of Y). We denote by N^n_i the number of jumps of Y within the interval ((i−1)/n, i/n], and we set for 1 ≤ i ≤ k ≤ n:
$$C_{n,i} = \bigcap_{j=1}^{i}\{N^n_j \le 1\}, \qquad D_{n,i,k} = \Big(\bigcap_{j\ne i,\,1\le j\le k}\{N^n_j \le 1\}\Big) \cap \{N^n_i = 2\}.$$
The sets D_{n,i,k} for i = 1, …, k are pairwise disjoint, with a union denoted by D_{n,k}. By well-known properties of Poisson processes, IP(D_{n,i,k}) does not depend on i, and we have
$$IP(C_{n,n}) = 1 - \frac{\lambda^2}{2n} + O(1/n^2), \quad IP\big((C_{n,n}\cup D_{n,k})^c\big) \le \frac{K}{n^2}, \quad IP(D_{n,i,k}) = \frac{\lambda^2}{2n^2} + O(1/n^3). \eqno(4.4)$$
Set Q^n_j g(x) = IE\big(g(X^x_{j/n})\,1_{C_{n,j}}\big) and
$$H^n_j g(y) = \int F(du)F(dv)\,\Big(Q^n_j g\big(y + f(y)(u+v)\big) - Q^n_j g\big(y + f(y)u + f(y+f(y)u)v\big)\Big)$$
for j = 0, 1, …, n. We deduce from the first part of (4.4) that
$$\big|P_{j/n}\,g(x) - Q^n_j g(x)\big| \le \frac{K}{n}\,\|g\|_\infty, \eqno(4.5)$$
hence if H_s is defined by (4.2) we get
$$\big|H^n_j g(x) - H_{j/n}\,g(x)\big| \le \frac{K}{n}\,\|g\|_\infty. \eqno(4.6)$$
Observe that X^{n,x}_s = X^x_s for all s ≤ i/n on the set C_{n,i}, so the second part of (4.4) gives:
$$\Big|\Delta_{n,t}\,g(x) - \sum_{i=1}^{[nt]} IE\Big(\big(g(X^{n,x}_{[nt]/n}) - g(X^x_{[nt]/n})\big)\,1_{D_{n,i,[nt]}}\Big)\Big| \le \frac{K}{n^2}\,\|g\|_\infty. \eqno(4.7)$$
Now, if we denote by W_1 and W_2 the sizes of the two jumps of Y on the interval ((i−1)/n, i/n], when there are exactly two of them, we have on the set D_{n,i,k} for any k ≥ i:
$$X^x_{i/n} = X^x_{(i-1)/n} + f\big(X^x_{(i-1)/n}\big)W_1 + f\Big(X^x_{(i-1)/n} + f\big(X^x_{(i-1)/n}\big)W_1\Big)W_2,$$
$$X^{n,x}_{i/n} = X^x_{(i-1)/n} + f\big(X^x_{(i-1)/n}\big)(W_1 + W_2).$$
Moreover, it is obvious that for i ≤ [nt],
$$IE\big(g(X^x_{[nt]/n})\,1_{D_{n,i,[nt]}}\,\big|\,\mathcal F_{i/n}\big) = Q^n_{[nt]-i}\,g\big(X^x_{i/n}\big)\,1_{D_{n,i,i}},$$
$$IE\big(g(X^{n,x}_{[nt]/n})\,1_{D_{n,i,[nt]}}\,\big|\,\mathcal F_{i/n}\big) = Q^n_{[nt]-i}\,g\big(X^{n,x}_{i/n}\big)\,1_{D_{n,i,i}}.$$
Therefore, taking into account the fact that conditionally on D_{n,i,[nt]} the two variables W_1 and W_2 are i.i.d. with law G = F/λ, we get for i ≤ [nt]:
$$IE\big(g(X^x_{[nt]/n})\,1_{D_{n,i,[nt]}}\big) = \frac{e^{-\lambda/n}}{2n^2}\int Q^n_{i-1}(x,dy)\,F(du)F(dv)\;Q^n_{[nt]-i}\,g\big(y + f(y)u + f(y+f(y)u)v\big),$$
$$IE\big(g(X^{n,x}_{[nt]/n})\,1_{D_{n,i,[nt]}}\big) = \frac{e^{-\lambda/n}}{2n^2}\int Q^n_{i-1}(x,dy)\,F(du)F(dv)\;Q^n_{[nt]-i}\,g\big(y + f(y)u + f(y)v\big).$$
Hence we have
$$IE\Big(\big(g(X^{n,x}_{[nt]/n}) - g(X^x_{[nt]/n})\big)\,1_{D_{n,i,[nt]}}\Big) = \frac{e^{-\lambda/n}}{2n^2}\,Q^n_{i-1}H^n_{[nt]-i}\,g(x). \eqno(4.8)$$
Since |e^{−λ/n} − 1| ≤ K/n, the previous equality and (4.5), (4.6) and (4.7) yield
$$\Big|\Delta_{n,t}\,g(x) - \frac{1}{2n^2}\sum_{i=1}^{[nt]} P_{\frac{i-1}{n}}H_{\frac{[nt]-i}{n}}\,g(x)\Big| \le \frac{K}{n^2}\,\|g\|_\infty.$$
Furthermore it is obvious that for s ≤ r ≤ t we have |P_s g(x) − P_r g(x)| ≤ K‖g‖_∞(r − s), hence we also have
$$\frac{i-1}{n} \le s \le \frac{i}{n} \;\Longrightarrow\; \Big|P_{\frac{i-1}{n}}H_{\frac{[nt]-i}{n}}\,g(x) - P_s H_{\frac{[nt]}{n}-s}\,g(x)\Big| \le \frac{K}{n}.$$
Then the sum showing up in (4.7) is in fact equal, up to a term smaller than K‖g‖_∞/n², to the integral \frac{1}{2n}\int_0^{[nt]/n} P_s H_{[nt]/n-s}\,g(x)\,ds, and the result follows. □

As said before, there is no assumption here on the size of the jumps, nor on f. On the other hand, as soon as Y is not a compound Poisson process, and even if it is a "compound Poisson process with drift", the previous result becomes wrong, and one needs at least F(1), because ∫(y − τ(y))F(dy) enters the explicit form of the operator Γ^{(1)}_t (see Remark 9.1), and of course also H(1, 1) in order to have a solution to the equation. Evidently, the operator Γ^{(1)}_t of Theorem 2.3 formally takes the expression (4.1) when c = 0, F(IR^{d'}) < ∞ and b = ∫ τ(y)F(dy).

On the other hand, if g is unbounded then the two terms on the right of (2.7) might be infinite or not defined: so if we want the previous result to hold for, say, g ∈ C⁰_p(IR^d) (or g ∈ C^k_p(IR^d) for some k; the smoothness of g makes no difference here), that is if we want (2.8) or (2.11) for N = 1 to hold in the situation of Theorem 4.1, then F(p) is required.

5 Lévy driven stochastic differential equations


In this section we gather some results on equation (1.3), whose solution is denoted by X^x. These results are part of the folklore of the subject, but we could not find them explicitly proved, under the assumptions we need, in any paper.

Below, K_α (or K(α)) denotes a constant which may change from line to line, and depends only on the parameter α and on the dimensions d and/or d'.

1) First we need estimates on stochastic integrals w.r.t. Y . The forthcoming result is


taken from [13], but we give here a simpler proof.

Lemma 5.1 For any predictable (matrix-valued) process H and any p ≥ 2, and if β_p = ∫|y|^p F(dy), we have (recall that time belongs to [0, 1]):
$$IE\Big(\sup_{s\le t}\Big|\int_0^s H_r\,dY_r\Big|^p\Big) \le K_p\Big(|b|^p + |c|^{p/2} + (\beta_2)^{p/2} + \beta_p\Big)\int_0^t IE\big(|H_s|^p\big)\,ds.$$

Proof. It is enough to prove the result when β_p < ∞. In this case, b' = b + ∫(y − τ(y))F(dy) exists and satisfies |b'| ≤ |b| + Kβ_2 for a constant K depending on the truncation function τ only, and Y_t = b't + Y^c_t + M_t, where M is a purely discontinuous martingale. Then it is enough to prove our inequality separately when Y_t = b't, Y = Y^c, and Y = M. In the first two cases the result is well known (and easy), so we assume that Y = M. It is also clearly enough to consider the case where Y and H are 1-dimensional. By the Burkholder–Davis–Gundy inequality the left side of the inequality is smaller than K_p IE(Z_t^{p/2}), where Z_t = Σ_{s≤t} H_s² ΔY_s². So it remains to prove that IE(Z_t^{p/2}) ≤ K_p((β_2)^{p/2} + β_p) a_t, where a_t = ∫_0^t IE(|H_s|^p) ds.

Set q = p/2 ≥ 1. For all x, z ≥ 0 we have first (x+z)^q − x^q ≤ 2^{q−1}(x^{q−1}z + z^q), and second x^{q−1}z ≤ εx^q + z^q/ε^{q−1} for all ε > 0. Hence for all ε > 0 and x, u, v ≥ 0 we have
$$(x + uv)^q - x^q \le 2^{q-1}\Big(\varepsilon x^q u + \frac{1}{\varepsilon^{q-1}}\,u v^q + u^q v^q\Big).$$

Then if T_m = inf(t : Z_t ≥ m), and since Z is non-decreasing and purely discontinuous,
$$Z^q_{t\wedge T_m} = \sum_{s\le t\wedge T_m}\big((Z_{s-}+\Delta Z_s)^q - Z^q_{s-}\big) = \int_0^{t\wedge T_m}\!\!\int \big((Z_{s-} + H_s^2 y^2)^q - Z^q_{s-}\big)\,\mu(ds,dy),$$
where μ is the jump measure of Y. The predictable compensator of μ is ν(ds, dy) = ds ⊗ F(dy), so we get
$$IE\big(Z^q_{t\wedge T_m}\big) \le 2^{q-1}\,IE\Big(\int_0^{t\wedge T_m}\!\!\int \Big(\varepsilon Z^q_{s-}\,y^2 + \frac{1}{\varepsilon^{q-1}}\,|H_s|^{2q} y^2 + |H_s|^{2q}|y|^{2q}\Big)F(dy)\,ds\Big)$$
$$\le 2^{q-1}\Big(\varepsilon\beta_2\,IE\big(m\wedge Z^q_{t\wedge T_m}\big) + \Big(\frac{1}{\varepsilon^{q-1}}\,\beta_2 + \beta_p\Big)a_t\Big),$$
because Z is increasing and Z_{s−} ≤ m if s ≤ T_m. The right side above is finite, hence the left side as well. Then it remains to take ε = 1/(2^q β_2), let m → ∞ and apply the monotone convergence theorem: we get the result with K_p = 2^{q^2}. □

For further reference, we set
$$\eta_p = |b| + |c| + \int\Big(|y|^2 1_{\{|y|\le 1\}} + |y|^p 1_{\{|y|>1\}}\Big)F(dy), \eqno(5.1)$$
so F(p) amounts to saying that η_p < ∞. With this notation, it follows from the previous lemma that for any p ≥ 2 and any predictable process H we have
$$2 \le p' \le p \;\Longrightarrow\; IE\Big(\sup_{s\le t}\Big|\int_0^s H_r\,dY_r\Big|^{p'}\Big) \le K(p,\eta_p)\int_0^t IE\big(|H_s|^{p'}\big)\,ds, \eqno(5.2)$$
where K(p, η_p) denotes a constant which depends only on p and η_p, and on the dimensions of Y and H.

2) Now we turn to estimates on the solution X^x of (1.3). We know that it is a Markov process, whose semigroup is denoted by (P_t). The following estimates on P_t are crucial (when we write ‖g‖_{p,k} < ∞ for a function g on IR^d, this automatically implies that g ∈ C^k_p(IR^d)):

Proposition 5.2 a) Under H(1, 1) and F(2∨p) for some p ≥ 0, we have for some constant K = K(p, f, η_{2∨p}) (recall (1.7)):
$$IE\Big(\sup_s |X^x_s|^p\Big) \le K\alpha_p(x), \qquad g \in C^0_p(IR^d) \;\Longrightarrow\; \|P_t g\|_{p,0} \le K\|g\|_{p,0}. \eqno(5.3)$$
b) Under H(1, N) and F(N + N∨p) for some p ≥ 0 and N ≥ 1, we have for some constant K = K(p, f, η_{N+N∨p}):
$$g \in C^N_p(IR^d) \;\Longrightarrow\; \|P_t g\|_{p,N} \le K\|g\|_{p,N}. \eqno(5.4)$$

The first property in (5.3) is a consequence of (5.2) and of Gronwall's inequality (recall that F(p) ⇔ η_p < ∞). The second property in (5.3) is a trivial consequence of the first one.

For (b) above we first need some facts about the differentiability of x ↦ X^x. We say that it is continuously differentiable in IL^p if there are d×d-dimensional processes X^{x,(1)} which satisfy IE(sup_s |X^{x,(1)}_s|^p) < ∞, together with
$$IE\Big(\sup_s\big|X^y_s - X^x_s - X^{x,(1)}_s\cdot(y-x)\big|^p\Big) = o(|y-x|^p), \qquad IE\Big(\sup_s\big|X^{y,(1)}_s - X^{x,(1)}_s\big|^p\Big) \to 0 \text{ as } y \to x.$$
By induction, it is N times continuously differentiable in IL^p if the (N−1)st derivative process X^{x,(N−1)} exists and is continuously differentiable in IL^p. Observe that X^{x,(N)} is d^N-dimensional.

It is well known, using Gronwall's Lemma and (5.2), that under H(1, 1) and F(p) for some p ≥ 2, X^x is once continuously differentiable in IL^p and X^{x,(1)} is the unique solution of the following linear equation (with I_d the d×d identity matrix):
$$X^{x,(1)}_t = I_d + \int_0^t \nabla f\big(X^x_{s-}\big)X^{x,(1)}_{s-}\,dY_s, \eqno(5.5)$$
and furthermore x ↦ IE(sup_s |X^{x,(1)}_s|^p) is bounded. More generally, we have:

Lemma 5.3 Under H(1, N) for some N ≥ 1 and F(Np) for some p ≥ 2, x ↦ X^x is N times continuously differentiable in IL^p, and we have for some constant K = K(N, p, f, η_{Np}):
$$IE\Big(\sup_s\big|X^{x,(N)}_s\big|^p\Big) \le K. \eqno(5.6)$$

Proof. Not only do we get (5.6), but we also obtain that the Nth derivative is the unique solution of the following linear equation (when the X^{x,(j)} for j = 1, …, N−1 are supposed to be known):
$$X^{x,(N)}_t = \int_0^t \nabla f\big(X^x_{s-}\big)X^{x,(N)}_{s-}\,dY_s + \sum_{i=2}^{N}\int_0^t \nabla^i f\big(X^x_{s-}\big)\,F_{N,i}\big(X^{x,(1)}_{s-},\dots,X^{x,(N-i+1)}_{s-}\big)\,dY_s \eqno(5.7)$$
if N ≥ 2 (and (5.5) if N = 1). Here, the components of F_{N,i}(x^{(1)}, …, x^{(N−i+1)}) are sums of terms of the form
$$\prod_{j=1}^{N-i+1}\prod_{r=1}^{\alpha_j} x^{(j),l_r}, \qquad \text{where } \sum_j j\alpha_j = N, \eqno(5.8)$$
and where x^{(j),l} is the lth component of x^{(j)} ∈ IR^{d^j}, and an "empty" product equals 1.

The proof is by induction on N, using Gronwall's Lemma and Lemma 5.1, and as is well known it boils down to proving, first, that by formal differentiation of (5.7) for N−1 we get Equation (5.7) for N, and, second, that the solution of (5.7) satisfies the estimate (5.6).

Hence we assume the result for all N' < N. By formally differentiating x ↦ X^{x,(N−1)} in Equation (5.7) written for N−1 we readily get (5.7) for N, with (using matrix notation, and F_{N−1,1}(x^{(1)}, …, x^{(N−1)}) = x^{(N−1)}):
$$F_{N,i}\big(x^{(1)},\dots,x^{(N-i+1)}\big) = x^{(1)}F_{N-1,i-1}\big(x^{(1)},\dots,x^{(N-i+1)}\big) + \sum_{j=1}^{N-i}\frac{\partial}{\partial x^{(j)}}F_{N-1,i}\big(x^{(1)},\dots,x^{(N-i)}\big)\,x^{(j+1)}.$$
Then if all F_{N',i} for N' < N are sums of terms like in (5.8), the same is true of F_{N,i}.

Next, we prove that the solution of (5.7) satisfies (5.6) (assuming again that this is true for all N' < N). By Gronwall's Lemma and the fact that ∇f is bounded, the only thing to prove is that
$$IE\Big(\sup_t\Big|\int_0^t \nabla^i f\big(X^x_{s-}\big)\,F_{N,i}\big(X^{x,(1)}_{s-},\dots,X^{x,(N-i+1)}_{s-}\big)\,dY_s\Big|^p\Big) \le K$$
for all i = 2, …, N and for some constant K = K(N, p, f, η_{Np}) (in the remainder of the proof K = K(N, p, f, η_{Np}) varies from line to line). And of course, it is enough to prove that if G is any monomial like in (5.8), then
$$IE\Big(\sup_t\Big|\int_0^t \nabla^i f\big(X^x_{s-}\big)\,G\big(X^{x,(1)}_{s-},\dots,X^{x,(N-i+1)}_{s-}\big)\,dY_s\Big|^p\Big) \le K. \eqno(5.9)$$
For this we use Lemma 5.1 and the fact that ∇^i f is bounded. By (5.2) the left side of (5.9) is smaller than
$$K\,IE\Big(\sup_s \prod_{j=1}^{N+1-i}\big|X^{x,(j)}_s\big|^{p\alpha_j}\Big) \le K\prod_{j=1}^{N+1-i}\Big(IE\big(\sup_s\big|X^{x,(j)}_s\big|^{Np/j}\big)\Big)^{j\alpha_j/N}$$
by Hölder's inequality, since Σ_j jα_j = N. The induction hypothesis yields that each expectation above is smaller than some constant K(p, N, f, η_{Np}), so we obtain (5.9). □

Proof of Proposition 5.2-(b). Let g ∈ C^N_p(IR^d). By Lemma 5.3, for any k = 1, …, N, x ↦ X^x is k times continuously differentiable in IL^{r_k}, where r_k = (N + N∨p)/k; further, if X^{x,(0)} = X^x and X̃^{x,(j)} = sup_t |X^{x,(j)}_t|, then with K = K(N, p, f, η_{N+N∨p}):
$$IE\big(|\tilde X^{x,(j)}|^r\big) \le \begin{cases} K(1+|x|^r) & \text{if } j = 0 \text{ and } r \in [0, r_1],\\[2pt] K & \text{if } j = 1,\dots,N \text{ and } r \in [0, r_j]. \end{cases} \eqno(5.10)$$
Then any kth partial derivative (for k = 1, …, N) of x ↦ g(X^x_t) exists (in probability), is continuous in probability, and is smaller than a sum of terms of the form
$$Z_{x,p,k,\{\alpha_j\}} = a\big(1 + |\tilde X^{x,(0)}|^p\big)\prod_{j=1}^k |\tilde X^{x,(j)}|^{\alpha_j}, \qquad \text{where } \sum_{j=1}^k j\alpha_j = k,\ \alpha_j \in I\!N,$$
with an empty product equal to 1. Then it is enough to prove that, under our assumptions, each Z_{x,p,k,{α_j}} as above satisfies IE(Z_{x,p,k,{α_j}}) ≤ aK(1 + |x|^p). But Hölder's inequality and Σ_{j=1}^k jα_j = k ≤ r_1 − p yield
$$IE\big(Z_{x,p,k,\{\alpha_j\}}\big) \le a\Bigg(\prod_{j=1}^k\Big(IE\big(|\tilde X^{x,(j)}|^{k/j}\big)\Big)^{j\alpha_j/k} + \Big(IE\big(|\tilde X^{x,(0)}|^{r_1 p/(r_1-k)}\big)\Big)^{(r_1-k)/r_1}\prod_{j=1}^k\Big(IE\big(|\tilde X^{x,(j)}|^{r_j}\big)\Big)^{j\alpha_j/r_1}\Bigg).$$
Then the result readily follows from (5.10). □

3) The generator of X^x. As is well known, the "extended generator" of the Markov process (X^x_t) is the operator A acting on C² functions g on IR^d as follows (where ∇g is a row vector and τ is the truncation function):
$$Ag(x) = \nabla g(x)f(x)b + \frac12\sum_{i,j=1}^d \frac{\partial^2 g}{\partial x^i\partial x^j}(x)\,\big(f(x)\,c\,f(x)^\star\big)^{ij} + \int F(dy)\,\Big(g\big(x+f(x)y\big) - g(x) - \nabla g(x)f(x)\tau(y)\Big). \eqno(5.11)$$
In the next lemma, we denote by C^{N,1}_p(IR^d × [0,1]) the set of all families (g_t)_{t∈[0,1]} of functions on IR^d such that g'_t(x) = ∂g_t(x)/∂t exists and is continuous for all x, and such that the functions g_t and g'_t all belong to C^N_p(IR^d) with sup_t(‖g_t‖_{p,N} + ‖g'_t‖_{p,N}) < ∞.

Lemma 5.4 Let l = 0 or l = 1, p ≥ 0 and N ∈ IN.
a) Under H(l, 1∨N) and F(p+N) there is a constant K = K(p, N, f, η_{p+N}) such that
$$g \in C^{N+2}_p(IR^d) \;\Longrightarrow\; \|Ag\|_{p+2l,N} \le K\|g\|_{p,N+2}.$$
b) Under H(l, 1) and F(p), for any (g_t) ∈ C^{2,1}_p(IR^d × [0,1]) the function t ↦ Ag_t(x) is continuously differentiable and its derivative is Ag'_t(x).

Proof. We prove (b) first. Observe that under our assumptions on (g_t) the partial derivatives of order 1 and 2 w.r.t. x commute with the partial derivative w.r.t. t, hence the claim readily follows from (5.11) and the dominated convergence theorem. It is even simpler to check that ‖Ag‖_{p+2l,0} ≤ K‖g‖_{p,2} for some K = K(p, f, η_p) when g ∈ C²_p(IR^d).

It remains to prove (a) when N ≥ 1, which is done by induction on N. For example, if N = 1 and if we denote by ∂_k the derivative w.r.t. the kth coordinate of x, we have ∂_k Ag = A∂_k g + A'_k g (by applying (b) and again the dominated convergence theorem, and using H(l, 1)), where
$$A'_k g(x) = \nabla g(x)\,\partial_k f(x)\,b + \frac12\sum_{i,j=1}^d \frac{\partial^2 g}{\partial x^i\partial x^j}(x)\,\partial_k\big(f(x)\,c\,f(x)^\star\big)^{ij} + \int F(dy)\,\Big(\nabla g\big(x+f(x)y\big)\,\partial_k f(x)\,y - \nabla g(x)\,\partial_k f(x)\,\tau(y)\Big). \eqno(5.12)$$
We have already seen that ‖A∂_k g‖_{p+2l,0} ≤ K‖g‖_{p,N+2}, and the same argument shows that ‖A'_k g‖_{p+2l,0} ≤ K‖g‖_{p,N+2} as well: hence the result for N = 1. We can obviously iterate the procedure and get the result for arbitrary N; details are left to the reader. □

We denote by A^k the kth iterate of A (and A⁰ is the identity). A straightforward iteration of the above result yields the following lemma.

Lemma 5.5 Let l = 0 or l = 1, p ≥ 0 and k ≥ 1, and assume H(l, 1∨(N+2k−2)) and F(p+N+2k−2) for some N ∈ IN.
a) There is a constant K = K(p, N, k, η_{p+N+2k−2}) such that
$$g \in C^{N+2k}_p(IR^d) \;\Longrightarrow\; \|A^k g\|_{p+2lk,N} \le K\|g\|_{p,N+2k}.$$
b) If (g_t) ∈ C^{2k,1}_p(IR^d × [0,1]), then the function t ↦ A^k g_t(x) is continuously differentiable and its derivative is A^k g'_t(x).

Another very important property for us is the next one, well known in general but perhaps not under these hypotheses:

Lemma 5.6 Let l = 0 or l = 1 and p ≥ 0.
a) Assume H(l, 1) and F(2∨(p+2l)). For any g ∈ C²_p(IR^d) we have
$$P_t g(x) = g(x) + \int_0^t P_s Ag(x)\,ds. \eqno(5.13)$$
In particular, the map t ↦ P_t g(x) is differentiable and
$$\frac{d}{dt}P_t g(x) = P_t Ag(x) = AP_t g(x). \eqno(5.14)$$
b) Let N, N' ≥ 0 and assume H(l, 1∨(2N+N')) and F(2∨(p+2l)∨(p+2N+N')). If g ∈ C^{2N+2+N'}_p(IR^d) we have
$$P_t g(x) = \sum_{k=0}^N \frac{t^k}{k!}\,A^k g(x) + \frac{1}{N!}\int_0^t (t-s)^N P_s A^{N+1} g(x)\,ds. \eqno(5.15)$$

Proof. a) An application of Itô's formula yields that the process
$$M_t = g(X^x_t) - g(x) - \int_0^t Ag\big(X^x_{s-}\big)\,ds$$
is a local martingale. Further, Lemma 5.4 yields Ag ∈ C⁰_{p+2l}(IR^d), hence sup_t |M_t| ≤ K(1 + |x|^p + sup_t |X^x_t|^{p+2l}) for some constant K, and this quantity is integrable by (5.3). Hence M is a martingale, and taking expectations above yields (5.13). This shows that the map t ↦ P_t g(x) is first continuous, and second differentiable with derivative P_t Ag(x). For any given s the function g' = P_s g is also in C²_p(IR^d), and the derivative of t ↦ P_{t+s} g(x) = P_t g'(x) at t = 0 is P_s Ag(x) and also Ag'(x) = AP_s g(x), so that (5.14) holds.

b) Observe that (5.15) for N = 0 is precisely (5.13), and the proof for arbitrary N is by induction. In fact, it is clearly enough to prove
$$\int_0^t (t-s)^{N-1} P_s A^N g(x)\,ds = \frac{t^N}{N}\,A^N g(x) + \frac{1}{N}\int_0^t (t-s)^N P_s A^{N+1} g(x)\,ds.$$
But this follows from (5.14) applied to A^N g, which is in C²_{p+2lN}(IR^d) by Lemma 5.5. □

4) The generator of Y. All the previous results hold of course when d' = d and f(x) is equal to the identity matrix for all x: we then get X⁰ = Y, so (P_t) is the semigroup of Y, and A is replaced by the generator B of Y, which acts on C² functions h on IR^{d'} as follows:
$$Bh(x) = \nabla h(x)b + \frac12\sum_{i,j=1}^{d'}\frac{\partial^2 h}{\partial y^i\partial y^j}(x)\,c^{ij} + \int F(dy)\big(h(x+y) - h(x) - \nabla h(x)\tau(y)\big). \eqno(5.16)$$
In this case, observe that in Lemma 5.4-(a) we need only F(p): indeed H(0, N) is trivially fulfilled for all N, and we have A'_k g = 0 in (5.12), so ∂_k Ag = A∂_k g; hence in order to have Ag = Bg ∈ C¹_p(IR^{d'}) we need only F(p) and g ∈ C³_p(IR^{d'}), and our claim follows by a trivial induction. Therefore, with B^k denoting the kth iterate of B, Lemma 5.5 reads as follows:

Lemma 5.7 Let p ≥ 0 and k, N ∈ IN with k ≥ 1, and assume F(p).
a) There is a constant K = K(p, N, k, η_p) such that
$$h \in C^{N+2k}_p(IR^{d'}) \;\Longrightarrow\; \|B^k h\|_{p,N} \le K\|h\|_{p,N+2k}.$$
b) If (h_t) ∈ C^{2k,1}_p(IR^{d'} × [0,1]), then the function t ↦ B^k h_t(y) is continuously differentiable and its derivative is B^k h'_t(y).

Similarly, Lemma 5.6 is true with F(2∨p) as the only assumption, and for all N. Therefore, using the previous lemma, we readily obtain:

Lemma 5.8 Under F(2∨p), for any k ∈ IN there is a constant K = K(p, k, η_p) such that if h ∈ C^{2k+4}_p(IR^{d'}), then
$$\Big|n\big(IE(h(Y_{1/n})) - h(0)\big) - \sum_{i=0}^k \frac{1}{(i+1)!\,n^i}\,B^{i+1}h(0)\Big| \le \frac{K}{n^{k+1}}\,\|h\|_{p,2k+4}. \eqno(5.17)$$
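As a concrete illustration of (5.17), in the purely Gaussian case (b = 0, c = 1, F = 0, so Bh = h''/2) and with h(y) = y⁴, the left side is computable in closed form and the expansion with k = 1 is already exact. The following sketch checks this; the choice of h and the moment formula IE(Y_{1/n}⁴) = 3/n² are elementary facts, not taken from the paper:

```python
import math

# Expansion (5.17) for Y a standard Brownian motion and h(y) = y^4:
# B h(0) = 0, B^2 h(0) = 6, and higher iterates of B vanish at 0.
def lhs(n):
    # n (IE h(Y_{1/n}) - h(0)), using IE(Y_{1/n}^4) = 3/n^2 for Y_{1/n} ~ N(0, 1/n)
    return n * (3.0 / n**2 - 0.0)

def rhs(n, k):
    B_at_0 = [0.0, 6.0, 0.0]           # values of B^{i+1} h(0), i = 0, 1, 2
    return sum(B_at_0[i] / (math.factorial(i + 1) * n**i) for i in range(k + 1))

for n in (2, 10, 100):
    print(n, lhs(n), rhs(n, 1))        # the two sides agree (up to rounding)
```

Here both sides equal 3/n, so the remainder in (5.17) vanishes for k = 1, consistent with the bound K/n^{k+1}.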

6 Some technical lemmas

6.1 Some consequences of the assumptions on ζ^n_1

Let us associate with ζ^n_1 its "normalized" distribution F_n, and also the vector b_n and the matrix c̃_n, as follows:
$$F_n(A) = n\,IP(\zeta^n_1 \in A), \qquad b_n = F_n(\tau), \qquad \tilde c_n = F_n(\tau\tau^\star). \eqno(6.1)$$
By results in [7] (see Theorem VII-3-4), the convergence Y^n → Y in law is equivalent to having
$$b_n \to b, \qquad \tilde c_n \to \tilde c, \qquad h \text{ bounded continuous, null around } 0 \;\Rightarrow\; F_n(h) \to F(h). \eqno(6.2)$$
We also introduce an operator B_n acting on C¹ functions h on IR^{d'} as follows:
$$B_n(h) = \nabla h(0)b_n + \int F_n(dy)\big(h(y) - h(0) - \nabla h(0)\tau(y)\big) = n\,IE\big(h(\zeta^n_1) - h(0)\big). \eqno(6.3)$$
Let us also recall that C^{0k}_p(IR^{d'}) is the set of all functions in C^k_p(IR^{d'}) which vanish at 0, together with their first and second derivatives. The next lemma shows in particular that G({u_n}, p) for any p ≥ 2 and any sequence u_n → 0 implies (6.2).

Lemma 6.1 If u_n is a sequence satisfying u_n ≥ 1/n and if p ≥ 2, then Assumption G({u_n}, p) is equivalent to each of the following two properties:
(a) We have F(p) and there is a constant K such that
$$h \in C^4_p(IR^{d'}) \;\Longrightarrow\; |B_n(h) - Bh(0)| \le K u_n\|h\|_{p,4}. \eqno(6.4)$$
(b) We have F(p) and there is a constant K such that (recall (2.14) for c̃):
$$|b_n - b| \le K u_n, \qquad |\tilde c_n - \tilde c| \le K u_n, \qquad h \in C^{04}_p(IR^{d'}) \;\Rightarrow\; |F_n(h) - F(h)| \le K u_n\|h\|_{p,4}. \eqno(6.5)$$

Proof. First, we have Bτ(0) = b, B_n(τ) = b_n, B(ττ^⋆)(0) = c̃, B_n(ττ^⋆) = c̃_n, and h ∈ C^{04}_p(IR^{d'}) implies Bh(0) = F(h) and B_n(h) = F_n(h). Since the components of τ and ττ^⋆ belong to C⁴_p(IR^{d'}) for all p ≥ 0, we get (a) ⇒ (b).

Next, we can rewrite Bh(0) and B_n(h) as follows:
$$Bh(0) = \nabla h(0)b + \frac12\sum_{i,j=1}^{d'}\frac{\partial^2 h}{\partial y^i\partial y^j}(0)\,\tilde c^{ij} + F(\tilde h), \qquad B_n(h) = \nabla h(0)b_n + \frac12\sum_{i,j=1}^{d'}\frac{\partial^2 h}{\partial y^i\partial y^j}(0)\,\tilde c^{ij}_n + F_n(\tilde h), \eqno(6.6)$$
where
$$\tilde h(y) = h(y) - h(0) - \nabla h(0)\tau(y) - \frac12\sum_{i,j=1}^{d'}\frac{\partial^2 h}{\partial y^i\partial y^j}(0)\,\tau^i(y)\tau^j(y). \eqno(6.7)$$
Observe that there is a constant C such that ‖h̃‖_{p,4} ≤ C‖h‖_{p,4} and ∇^i h̃(0) = 0 for i = 0, 1, 2 (recall that τ is C^∞ with compact support and τ(y) = y for |y| small). Thus (b) ⇒ (a).

Third, (1.6) and (6.3) yield
$$B_n(h) = n\big(IE(h(\zeta^n_1)) - h(0)\big) = n\delta_n(h) + n\big(IE(h(Y_{1/n})) - h(0)\big). \eqno(6.8)$$
Combining this with (5.17) for k = 0 immediately yields the equivalence of G({u_n}, p) with (a), since u_n ≥ 1/n. □

In the next corollary we use the notation (to be compared with (5.1)):
$$\eta'_p = \sup_n\Big(|b_n| + \int F_n(dy)\big(|y|^2 1_{\{|y|\le 1\}} + |y|^p 1_{\{|y|>1\}}\big)\Big). \eqno(6.9)$$

Corollary 6.2 Suppose that G({u_n}, p) holds for some sequence u_n → 0 and some p ≥ 2. Then η'_p < ∞.

Proof. It is of course no restriction here to assume that u_n ≥ 1/n. Hence we have (6.5) by the previous lemma. Since we can find a function h ∈ C^{04}_p(IR^{d'}) such that
$$|y|^2 1_{\{|y|\le 1\}} + |y|^p 1_{\{|y|>1\}} \le |\tau(y)|^2 + h(y),$$
the result readily follows from (6.5) and F(p). □

We have seen that G({u_n}, p) gives us an estimate on the difference B_n(h) − Bh(0). Our other assumptions will give us expansions of B_n(h) around Bh(0):

Lemma 6.3 (a) Under G'({u_n}, {u'_n}, p∨2) for some p ≥ 0 there is a constant K such that for all h ∈ C⁶_p(IR^{d'}) (recall (2.5) for φ):
$$|\varphi(h)| + |B^2h(0)| \le K\|h\|_{p,6}, \qquad \Big|B_n(h) - Bh(0) - u_n\varphi(h) - \frac{1}{2n}\,B^2h(0)\Big| \le K\Big(u'_n \vee \frac{1}{n^2}\Big)\|h\|_{p,6}. \eqno(6.10)$$
(b) Under G"(N, p∨2) for some p ≥ 0 and some N ≥ 1 there is a constant K such that for all k = 0, …, N+1 and h ∈ C^{2k+2}_p(IR^{d'}):
$$|B^{(k)}(h)| \le K\|h\|_{p,2k+2}, \qquad \Big|B_n(h) - \sum_{i=1}^k \frac{1}{n^{i-1}\,i!}\,B^{(i)}(h)\Big| \le \frac{K}{n^k}\,\|h\|_{p,2k+2}, \eqno(6.11)$$
where B^{(1)}(h) = Bh(0) and B^{(k)}(h) = B^k h(0) + k!\,φ_{k−1}(h) for k ≥ 2.

Proof. (a) The first inequality follows from combining (2.4) and (2.5), plus the fact that u'_n/u_n → 0, and from Lemma 5.7. The second inequality follows from combining (6.8) with (2.5) and (5.17) for k = 1.

(b) The second inequality follows from combining (6.8) with (2.6) for k−1 and (5.17) for k. For the first inequality, in view of Lemma 5.7 it suffices to prove that |φ_{k−1}(h)| ≤ K‖h‖_{p,2k+2} for k ≥ 2. We set Φ_{n,k} = δ_n − Σ_{i=1}^{k−1}(1/n^{i+1})φ_i. We know that |Φ_{n,k}(h)| ≤ K‖h‖_{p,2k+2}/n^{k+1} and also |Φ_{n,k−1}(h)| ≤ K‖h‖_{p,2k+2}/n^k. Since φ_{k−1} = n^k(Φ_{n,k−1} − Φ_{n,k}), the result is then obvious. □

The operators B^{(k)} and φ above are linear, and we need to check that they commute with differentiation. This is obvious for B^{(1)} by Lemma 5.7, but otherwise it needs a proof.

Lemma 6.4 Let p ≥ 0 and k ≥ 2 and (h_t) ∈ C^{2k+2,1}_p(IR^{d'} × [0,1]).
a) Under G'({u_n}, {u'_n}, p∨2) and if k = 2, the function t ↦ φ(h'_t) is continuous and is the derivative of t ↦ φ(h_t).
b) Under G"(k−1, p∨2), the function t ↦ B^{(k)}(h'_t) is continuous and is the derivative of t ↦ B^{(k)}(h_t).

Proof. We prove only (b), since for (a) the proof is similar (simpler in fact, because we do not need the induction step).

In view of Lemma 5.7, it is enough to prove the result with φ_{k−1} instead of B^{(k)}, and for this we use an induction: we suppose that the result holds for all k' ≤ k−1. We consider the operators Φ_{n,k} of the previous proof, with Φ_{n,1} = 0.

We have F(p) and (2.4), thus |Y_{1/n}|^p and |ζ^n_1|^p are integrable. It follows from Lebesgue's theorem that t ↦ δ_n(h'_t) is continuous and is the derivative of t ↦ δ_n(h_t). Then the induction hypothesis yields that
$$t \mapsto \Phi_{n,k-1}(h'_t) \text{ is continuous, and}\qquad \Phi_{n,k-1}(h_{t+s}) - \Phi_{n,k-1}(h_t) - \int_0^s \Phi_{n,k-1}(h'_{t+u})\,du = 0. \eqno(6.12)$$
Next, (2.6) for k−1 yields for all t, and for some constant K:
$$|\Phi_{n,k}(h_t)| + |\Phi_{n,k}(h'_t)| \le \frac{K}{n^{k+1}}. \eqno(6.13)$$
Moreover, φ_{k−1} = n^k(Φ_{n,k−1} − Φ_{n,k}). Hence we first deduce from (6.13) and t ∈ [0, 1] that
$$\big|\varphi_{k-1}(h'_t) - \varphi_{k-1}(h'_s)\big| \le n^k\,\big|\Phi_{n,k-1}(h'_t) - \Phi_{n,k-1}(h'_s)\big| + \frac{2K}{n}.$$
If we use the first part of (6.12) and let first s → t and next n → ∞, we deduce that φ_{k−1}(h'_s) → φ_{k−1}(h'_t) as s → t. Second, taking account of the second part of (6.12), we see that
$$\varphi_{k-1}(h_{t+s}) - \varphi_{k-1}(h_t) - \int_0^s \varphi_{k-1}(h'_{t+u})\,du = -n^k\Big(\Phi_{n,k}(h_{t+s}) - \Phi_{n,k}(h_t) - \int_0^s \Phi_{n,k}(h'_{t+u})\,du\Big).$$
By (6.13), the right side above is smaller than a constant times 1/n. This being true for all n, we get φ_{k−1}(h_{t+s}) − φ_{k−1}(h_t) − ∫_0^s φ_{k−1}(h'_{t+u})du = 0: this finishes the proof. □

6.2 Proof of Proposition 2.9

In Case (a) we assume Ĝ({u_n}, p) and u_n → 0; in Case (b) we assume Ĝ'({u_n}, {u'_n}, p) with u_n ≥ 1/n, u'_n → 0 and u'_n/u_n → 0; and in both cases we suppose p ≥ 2.

1) Set
$$\eta''_p = \sup_n\Big(|b'_n| + |c'_n| + \int F'_n(dy)\big(|y|^2 1_{\{|y|\le 1\}} + |y|^p 1_{\{|y|>1\}}\big)\Big).$$
Exactly as in Corollary 6.2 we see that in both cases we have η''_p < ∞. In the remainder of the proof K denotes a constant which changes from line to line and depends only on p, η_p and η''_p.

We denote by B'_n the generator of the Lévy process Y'^n. Lemmas 5.7 and 5.8, together with the fact that Y'^n_{1/n} = ζ^n_1, hence B_n(h) = n(IE(h(Y'^n_{1/n})) − h(0)), yield for k = 0 and k = 1:
$$h \in C^6_p(IR^{d'}) \;\Longrightarrow\; \|B'_n h\|_{p,4} \le K\|h\|_{p,6}, \eqno(6.14)$$
$$h \in C^{4+2k}_p(IR^{d'}) \;\Longrightarrow\; \Big|B_n(h) - \sum_{i=0}^k \frac{1}{(i+1)!\,n^i}\,B'^{\,i+1}_n h(0)\Big| \le \frac{K}{n^{k+1}}\,\|h\|_{p,4+2k}. \eqno(6.15)$$

On the other hand, similarly to (6.6), we have
$$(B'_n - B)h(x) = \nabla h_x(0)\big(b'_n - b\big) + \frac12\sum_{i,j=1}^{d'}\frac{\partial^2 h_x}{\partial y^i\partial y^j}(0)\big(\tilde c'^{\,ij}_n - \tilde c^{ij}\big) + (F'_n - F)(\tilde h_x), \eqno(6.16)$$
where h_x(y) = h(x+y) and h̃_x is the transform of h_x given by (6.7). Then, comparing this with (6.6), and since h ∈ C^k_p(IR^{d'}) yields ‖h̃‖_{p,k} ≤ C‖h‖_{p,k} for some constant C, we immediately deduce from (6.16) with x = 0 that
$$h \in C^4_p(IR^{d'}) \;\Longrightarrow\; |B'_n h(0) - Bh(0)| \le K u_n\|h\|_{p,4} \eqno(6.17)$$
in Case (a). In Case (b) we have the same, and also
$$h \in C^6_p(IR^{d'}) \;\Longrightarrow\; \big|B'_n h(0) - Bh(0) - u_n\varphi(h)\big| \le K u'_n\|h\|_{p,6}, \eqno(6.18)$$
provided we have set (recall (2.16) for β, σ and Φ)
$$\varphi(h) = \nabla h(0)\beta + \frac12\sum_{i,j=1}^{d'}\frac{\partial^2 h}{\partial y^i\partial y^j}(0)\,\sigma^{ij} + \Phi(\tilde h).$$

2) In Case (a), the result is then a trivial consequence of (6.17), of (5.17) with k = 0, and of (6.15) with k = 0 as well, applied to the equality (6.8).

3) Now we assume that we are in Case (b). The assumptions of this case imply those of Case (a), so (2) above yields Ĝ({u_n}, p) (recall that now u_n ≥ 1/n), so it remains to prove (2.5) with u'_n ∨ (u_n/n) instead of u'_n.
Let us consider (6.16) with some h ∈ C⁶_p(IR^{d'}). Exactly as in Lemma 5.7, we can differentiate up to 2 times in x, and any partial derivative of the left side is given by the right side applied to the same partial derivatives of x ↦ h_x or x ↦ h̃_x; moreover h̃_x belongs to C^{04}_p(IR^{d'}) and satisfies ‖h̃_x‖_{p,4} ≤ (1+|x|^p)‖h‖_{p,6}. Then we deduce from (6.17) that
$$\|(B'_n - B)h\|_{p,2} \le K u_n\|h\|_{p,6}.$$
Next, Lemma 5.7 yields
$$|B(B'_n - B)h(0)| \le K u_n\|h\|_{p,6}.$$
On the other hand, combining (6.14) and (6.17) gives us
$$|(B'_n - B)B'_n h(0)| \le K u_n\|h\|_{p,6}$$
as well, and since B'^{\,2}_n − B² = (B'_n − B)B'_n + B(B'_n − B) we finally get:
$$h \in C^6_p(IR^{d'}) \;\Longrightarrow\; |B'^{\,2}_n h(0) - B^2h(0)| \le K u_n\|h\|_{p,6}. \eqno(6.19)$$
At this point we can inject (5.17) for k = 1 and (6.15) for k = 1 as well into (6.8); in view of (6.18) and (6.19) we obtain:
$$\Big|\delta_n(h) - \frac{u_n}{n}\,\varphi(h)\Big| \le K\Big(\frac{u'_n}{n} + \frac{u_n}{n^2} + \frac{1}{n^3}\Big)\|h\|_{p,6},$$
and since u_n ≥ 1/n the result readily follows.

6.3 Estimates for X^{n,x}

Next we turn to studying the solution X^{n,x} of Equation (2.2), with Y^n given by (2.1). We first give a result similar to Lemma 5.1:

Lemma 6.5 For any adapted (matrix-valued) process H and any p ≥ 2, and if β^n_p = ∫|y|^p F_n(dy), we have
$$IE\Big(\sup_{s\le t}\Big|\int_0^s H_{\varphi_n(r)}\,dY^n_r\Big|^p\Big) \le K_p\Big(|b_n|^p + (\beta^n_2)^{p/2} + \beta^n_p\Big)\int_0^t IE\big(|H_{\varphi_n(s)}|^p\big)\,ds.$$

Proof. It is enough to prove the result when β^n_p < ∞, and in the 1-dimensional case. Then b'_n = b_n + ∫(y − τ(y))F_n(dy) exists and satisfies |b'_n| ≤ |b_n| + Kβ^n_2 for a constant K depending on the truncation function τ only, and Y^n_t = b'_n[nt]/n + M^n_t, where M^n_t = Σ_{i=1}^{[nt]} ξ^n_i and ξ^n_i = ζ^n_i − IE(ζ^n_i). As in Lemma 5.1, the result is obvious when Y^n_t = b'_n[nt]/n, hence we can assume Y^n = M^n. Note that M^n is a martingale w.r.t. the filtration (F_{[nt]/n})_{t≥0}. So we reproduce the proof of Lemma 5.1 with Z_t = Σ_{i=1}^{[nt]} H²_{(i−1)/n}(ξ^n_i)² and a_t = (1/n)Σ_{i=1}^{[nt]} IE(|H_{(i−1)/n}|^p) ≤ ∫_0^t IE(|H_{φ_n(s)}|^p)ds, and we have to prove again that IE(Z^{p/2}_t) ≤ K_p((β^n_2)^{p/2} + β^n_p)a_t. With T_m as in Lemma 5.1, we get
$$IE\big(Z^q_{t\wedge T_m}\big) = IE\Big(\sum_{i=1}^{[nt]}\Big(\big(Z_{\frac{i-1}{n}} + H^2_{\frac{i-1}{n}}(\xi^n_i)^2\big)^q - \big(Z_{\frac{i-1}{n}}\big)^q\Big)1_{\{nT_m\ge i\}}\Big) = \frac1n\int F_n(dy)\,IE\Big(\sum_{i=1}^{[nt]}\Big(\big(Z_{\frac{i-1}{n}} + H^2_{\frac{i-1}{n}}y^2\big)^q - \big(Z_{\frac{i-1}{n}}\big)^q\Big)1_{\{nT_m\ge i\}}\Big),$$
because the set {nT_m ≥ i} = {nT_m > i−1} is F_{(i−1)/n}-measurable. Then we finish as in Lemma 5.1 again. □

As a consequence we get, using the notation (6.9), and similarly to (5.2):

    2 ≤ p' ≤ p  ⇒  IE( sup_{s≤t} | ∫_0^s H_{ϕ_n(u)} dY^n_u |^{p'} ) ≤ K(p', η'_{p'}) ∫_0^t IE(|H_{ϕ_n(s)}|^{p'}) ds.    (6.20)
At this point we can do for X^{n,x} exactly what we have done for X^x in the previous
section. First, although X^{n,x} is not a Markov process, we introduce the analogue of its
semigroup by putting

    P^n_t g(x) = IE[g(X^{n,x}_t)].    (6.21)

Observe that P^n_t g(x) = P^n_{i/n} g(x) whenever i ≤ nt < i + 1. Then the analogue of Proposition 5.2 reads as:

Proposition 6.6 a) Under H(1, 1) and η'_{2∨p} < ∞ for some p ≥ 0, we have for some
constant K = K(p, f, η'_{2∨p}):

    IE( sup_s |X^{n,x}_s|^p ) ≤ K α_p(x),
    g ∈ C^0_p(IR^d) ⇒ ‖P^n_t g‖_{p,0} ≤ K ‖g‖_{p,0}.    (6.22)

b) Under H(1, N) and η'_{N+N∨p} < ∞ for some p ≥ 0 and N ≥ 1, we have for some
constant K = K(p, f, η'_{N+N∨p}):

    g ∈ C^N_p(IR^d) ⇒ ‖P^n_t g‖_{p,N} ≤ K ‖g‖_{p,N}.    (6.23)
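Concretely, P^n_t g(x) = IE[g(X^{n,x}_t)] is the quantity one estimates by simulation: draw i.i.d. copies of the increments ζ^n_i of Y^n, iterate the Euler recursion X_{i/n} = X_{(i−1)/n} + f(X_{(i−1)/n}) ζ^n_i, and average g over the simulated endpoints. A minimal sketch — the increment law below (drift plus Gaussian part plus compound-Poisson jumps) and all its parameters are an illustrative choice, not the F_n of the text:

```python
import numpy as np

def euler_estimate(g, x0, f, n, t, n_paths, increment_sampler, rng):
    """Monte Carlo estimate of P^n_t g(x0) = E[g(X^{n,x0}_t)] for the Euler
    chain X_{i/n} = X_{(i-1)/n} + f(X_{(i-1)/n}) * zeta_i."""
    x = np.full(n_paths, float(x0))
    for _ in range(int(n * t)):
        x += f(x) * increment_sampler(n_paths, rng)
    return g(x).mean()

def make_sampler(n, b=0.1, c=0.04, lam=2.0, delta=0.25):
    """Illustrative law of one increment zeta^n_i of the driver over a step 1/n:
    drift + Gaussian part + compound-Poisson jumps (made-up parameters)."""
    h = 1.0 / n
    def sampler(size, rng):
        gauss = rng.normal(0.0, np.sqrt(c * h), size)
        k = rng.poisson(lam * h, size)               # number of jumps in the step
        jumps = rng.normal(0.0, delta * np.sqrt(k))  # sum of k i.i.d. N(0, delta^2) jumps
        return b * h + gauss + jumps
    return sampler

rng = np.random.default_rng(0)
n = 200
est = euler_estimate(g=lambda x: x ** 2, x0=1.0, f=lambda x: 0.5 * x,
                     n=n, t=1.0, n_paths=20000,
                     increment_sampler=make_sampler(n), rng=rng)
print(est)  # Monte Carlo approximation of P^n_1 g(1)
```

The bias of this estimator relative to P_t g(x) is exactly the quantity Δ_{n,t} g(x) studied below; the Monte Carlo error on top of it decreases like (number of paths)^{−1/2}.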

Let us define the following operators A_n, acting on C¹ functions:

    A_n g(x) = n IE( g(x + f(x) ζ^n_1) − g(x) )
             = ∇g(x) f(x) b_n + ∫ F_n(dy) ( g(x + f(x)y) − g(x) − ∇g(x) f(x) τ(y) ).    (6.24)

This operator obviously satisfies (by (2.2) and (6.24)):

    P^n_{(i+1)/n} g(x) = P^n_{i/n} g(x) + (1/n) P^n_{i/n} A_n g(x).    (6.25)

So it plays the role of the generator for the process X^{n,x}. The proof of Lemma 5.4 holds
("uniformly" in n) in that case as well, and we can state the following lemma.

Lemma 6.7 Let l = 0 or l = 1 and p ≥ 0 and N ∈ IN.

a) Under H(l, 1 ∨ N) and η'_{(p+N)∨2} < ∞ there is a constant K = K(p, N, f, η'_{(p+N)∨2})
such that

    g ∈ C^{N+2}_p(IR^d) ⇒ ‖A_n g‖_{p+2l,N} ≤ K ‖g‖_{p,N+2}.

b) Under H(l, 1) and η'_{p∨2} < ∞, for any (g_t) ∈ C^{2,1}_p(IR^d × [0, 1]) the function t ↦
A_n g_t(x) is continuously differentiable and its derivative is A_n g'_t(x).
6.4 Expansion of the generators

Observe that we can write A_n in a different form. For any C² function g we put

    L_x g(y) = g(x + f(x)y).    (6.26)

Then we have for n ≥ 1 (recall (6.3)):

    A_n g(x) = B_n(L_x g).    (6.27)

Note that we also have A g(x) = B L_x g(0) (see (5.11) and (5.16)). Then under G''(N, p),
and similarly to (6.27), it is natural to set for k = 1, . . . , N + 1:

    A^{(k)} g(x) = B^{(k)}(L_x g),    (6.28)

while under G'({u_n}, {u'_n}, p) we set

    U g(x) = φ(L_x g),   V g(x) = B² L_x g(0).    (6.29)

Since B^{(1)} g = B g(0), we see that

    A^{(1)} = A.    (6.30)

Lemma 6.8 Let l = 0 or l = 1 and p ≥ 0 and N ∈ IN and k ≥ 2. Assume H(l, N ∨ 1)
and G''(k − 1, (p + N) ∨ 2).

a) There is a constant K such that

    g ∈ C^{N+2k+2}_p(IR^d) ⇒ ‖A^{(k)} g‖_{p+2(k+1)l,N} ≤ K ‖g‖_{p,N+2k+2}.

b) If (g_t) ∈ C^{2k+2,1}_p(IR^d × [0, 1]), then t ↦ A^{(k)} g_t(x) is continuously differentiable and
its derivative is A^{(k)} g'_t(x).

Proof. We denote by ∇^r_x (resp. ∇^r_y) the rth iterate of the gradient w.r.t. x (resp. y).
Let g ∈ C^{N+2k+2}_p(IR^d). We clearly have for 0 ≤ i + r ≤ N + 2k + 2 and r ≤ N and some
constant K (which varies from line to line in this proof):

    |∇^r_x ∇^i_y L_x g(y)| ≤ K α_{p+il}(x) (1 + |y|^{p+r}) ‖g‖_{p,N+2k+2}.

Therefore

    r = 0, . . . , N ⇒ ‖∇^r_x L_x g‖_{p+N,2k+2} ≤ K α_{p+2l(k+1)}(x) ‖g‖_{p,N+2k+2}.    (6.31)

Then applying Lemma 6.4 r times, with t replaced by the component of x w.r.t. which we
differentiate, we obtain that

    ∇^r_x B^{(k)}(L_x g) = B^{(k)}(∇^r_x L_x g)    (6.32)

as soon as r ≤ N and G''(k − 1, p + N) holds. In view of (6.28), the properties (6.31) and
(6.32) and (6.11) imply (a).

If further g = g_t depends on t ∈ [0, 1] in a continuously differentiable way, we can add
a differentiation w.r.t. t above, and this differentiation again commutes with B^{(k)}, hence with
A^{(k)}: so we have (b). □
Remark 6.9 For the genuine Euler scheme A^{(k)} g(x) = B^k L_x g(0). Hence by Lemma
5.7 the above result holds with g ∈ C^{N+2k}_p(IR^d) instead of g ∈ C^{N+2k+2}_p(IR^d), and then
‖A^{(k)} g‖_{p+2kl,N} ≤ K ‖g‖_{p,N+2k}. □

The same proof, based on (a) of Lemmas 5.7 and 6.4, yields also the following:

Lemma 6.10 Let l = 0 or l = 1 and p ≥ 0 and N ∈ IN. Assume H(l, N ∨ 1) and
G'({u_n}, {u'_n}, (p + N) ∨ 2).

a) There is a constant K such that

    g ∈ C^{N+6}_p(IR^d) ⇒ ‖U g‖_{p+6l,N} ≤ K ‖g‖_{p,N+6},   ‖V g‖_{p+6l,N} ≤ K ‖g‖_{p,N+6}.

b) If (g_t) ∈ C^{6,1}_p(IR^d × [0, 1]), then t ↦ U g_t(x) and t ↦ V g_t(x) are continuously
differentiable and their derivatives are U g'_t(x) and V g'_t(x).

Lemma 6.11 Let l = 0 or l = 1, and p ≥ 0, and N ∈ IN, and assume H(l, N ∨ 1).

a) Under G({u_n}, (p + N) ∨ 2) there is a constant K such that

    g ∈ C^{N+4}_p(IR^d) ⇒ ‖A_n g − A g‖_{p+4l,N} ≤ K u_n ‖g‖_{p,N+4}.

b) Under G'({u_n}, {u'_n}, (p + N) ∨ 2) there is a constant K such that

    g ∈ C^{N+6}_p(IR^d) ⇒ ‖A_n g − A g − u_n U g − (1/2n) V g‖_{p+6l,N} ≤ K ( u'_n ∨ 1/n² ) ‖g‖_{p,N+6}.

c) Under G''(k − 1, (p + N) ∨ 2) for some k ≥ 2 there is a constant K such that

    g ∈ C^{N+2k+2}_p(IR^d) ⇒ ‖A_n g − Σ_{i=1}^{k} (1/(n^{i−1} i!)) A^{(i)} g‖_{p+2l(k+1),N} ≤ (K/n^k) ‖g‖_{p,N+2k+2}.

Proof. a) Let g ∈ C^{N+4}_p(IR^d). Since B_n and B commute with derivations we have (as in
(6.32)) for r = 0, . . . , N:

    ∇^r_x ( B_n(L_x g) − B L_x g(0) ) = B_n(∇^r_x L_x g) − B ∇^r_x L_x g(0).

We also have (6.31) with k = 1, hence ‖∇^r_x L_x g‖_{p+N,4} ≤ K α_{p+4l}(x) ‖g‖_{p,N+4} for r =
0, . . . , N. Then the result readily follows from (6.4).

b) Let g ∈ C^{N+6}_p(IR^d). Exactly as above (and using Lemma 6.10) we get

    ∇^r_x ( B_n(L_x g) − B L_x g(0) − u_n φ(L_x g) − (1/2n) B² L_x g(0) )
        = B_n(∇^r_x L_x g) − B ∇^r_x L_x g(0) − u_n φ(∇^r_x L_x g) − (1/2n) B² ∇^r_x L_x g(0).

By (6.31) for k = 2, we have ‖∇^r_x L_x g‖_{p+N,6} ≤ K α_{p+6l}(x) ‖g‖_{p,N+6} for r = 0, . . . , N. Hence
the result readily follows from (6.10).

c) Let g ∈ C^{N+2k+2}_p(IR^d). Using now Lemma 6.8, we get

    ∇^r_x ( B_n − Σ_{i=1}^{k} (1/(n^{i−1} i!)) B^{(i)} )(L_x g) = ( B_n − Σ_{i=1}^{k} (1/(n^{i−1} i!)) B^{(i)} )(∇^r_x L_x g)

and also ‖∇^r_x L_x g‖_{p+N,2k+2} ≤ K α_{p+2l(k+1)}(x) ‖g‖_{p,N+2k+2} for r = 0, . . . , N. Then the
result follows from (6.11). □

Now we define the operators which come in the definition of Γ^{(k)}_t in the expansion
(2.11). We set, as soon as A^{(k)} is well defined:

    D_k = (1/k!) ( A^{(k)} − A^k ).    (6.33)

Observe that D_1 = 0. By combining Lemmas 5.5 and 6.8, we readily get:

Lemma 6.12 Let l = 0 or l = 1, and p ≥ 0, and N ∈ IN, and assume H(l, N + 2k − 2)
and G''(k, p + N + 2k − 2) for some k ≥ 2.

a) There is a constant K such that

    g ∈ C^{N+2k+2}_p(IR^d) ⇒ ‖D_k g‖_{p+2(k+1)l,N} ≤ K ‖g‖_{p,N+2k+2}.

b) If (g_t) ∈ C^{2k+2,1}_p(IR^d × [0, 1]), then t ↦ D_k g_t is continuously differentiable and its
derivative is D_k g'_t.

6.5 The operators U_t, V_t and Γ^{(k)}_t

At this point we can define the operators U_t, V_t and Γ^{(k)}_t coming in (2.10) and (2.11).
First, U_t and V_t are defined as follows:

    U_t g(x) = ∫_0^t P_s U P_{t−s} g(x) ds,   V_t g(x) = ∫_0^t P_s (V − A²) P_{t−s} g(x) ds.    (6.34)

For Γ^{(k)}_t we start by defining a sequence of numbers by induction on n:

    d_0 = 1,   d_{n+1} = − Σ_{k=1}^{n+1} d_{n+1−k} / (k+1)!.    (6.35)
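These constants are easy to generate exactly from (6.35); the first values are d_1 = −1/2, d_2 = 1/12, d_3 = 0, d_4 = −1/720, which coincide with B_r/r! for the Bernoulli numbers B_r (with the convention B_1 = −1/2) — an identification we note as an observation, not something the text asserts. A minimal sketch with exact rational arithmetic:

```python
from fractions import Fraction
from math import factorial

def d_constants(m):
    """d_0, ..., d_m from the recursion (6.35):
    d_0 = 1,  d_{n+1} = - sum_{k=1}^{n+1} d_{n+1-k} / (k+1)!"""
    d = [Fraction(1)]
    for n in range(m):
        d.append(-sum(d[n + 1 - k] / factorial(k + 1) for k in range(1, n + 2)))
    return d

# d_1 = -1/2, d_2 = 1/12, d_3 = 0, d_4 = -1/720
print(d_constants(4))
```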

Then Γ^{(q)}_t is defined by induction on q, starting with Γ^{(0)}_t = P_t, and setting for q ≥ 1:

    Γ^{(q)}_t g(x) = Σ_{k≥1, u,r≥0: k+u+r≤q} (−1)^r (d_{q−k−r−u}/r!) ∫_0^t Γ^{(u)}_s D_{k+1} (∂^{q−k−r−u}/∂s^{q−k−r−u}) P_{t−s} A^r g(x) ds.    (6.36)

Of course one has to prove that this makes sense. For q = 1, the previous equation takes
a simpler form:

    Γ^{(1)}_t g(x) = ∫_0^t P_s D_2 P_{t−s} g(x) ds.    (6.37)
More generally, the right side of (6.36) involves the operators Γ^{(i)}_u for i = 0, . . . , q − 1, so
this formula is indeed an induction formula.

In order to give a precise meaning to the previous formulas, we need some prerequisites.
Let j ≥ 1 and n, q ≥ 2. We say that an operator Q_{u_1,...,u_m} acting on functions over IR^d,
where u_i ∈ [0, 1] and m ≥ 1, is "of type A_q(n, j)" if it is the composition (in an arbitrary
order) of the operators P_{u_i − u_{i'}}, of j operators D_{k_i} with 2 ≤ k_i ≤ q, and of j' times the
operator A, with j' + k_1 + . . . + k_j = n.

Lemma 6.13 Let l = 0 or l = 1, and p ≥ 0, and j ≥ 1 and n, q ≥ 2. Let Q_{u_1,...,u_m} be an
operator of type A_q(n, j). Assume H(l, N) and G''(q, N + N∨p) for some N.

a) If N ≥ 2n + 2j there is a constant K such that

    g ∈ C^N_p(IR^d) ⇒ ‖Q_{u_1,...,u_m} g‖_{p+2l(n+j),N−2n−2j} ≤ K ‖g‖_{p,N}.

b) If N ≥ 2n + 2j + 2, then (u_1, . . . , u_m) ↦ Q_{u_1,...,u_m} g(x) is continuously differentiable,
and any one of the partial derivatives is the action over g and at point x of a linear
combination of operators of type A_q(n + 1, j), containing exactly the same D_k's as Q_{u_1,...,u_m}
does.

Proof. Q_{u_1,...,u_m} is a product R_s B_s R_{s−1} B_{s−1} . . . R_1 B_1 R_0, where each R_k is either P_{u_1} or
P_{u_i − u_{i−1}} or the identity, and each B_i is either some D_{k_i} (we then set k'_i = k_i + 1), or A
(we then set k_i = k'_i = 1), with k_1 + . . . + k_s = n and 2 ≤ k_i ≤ q if B_i = D_{k_i}. We write
T_k = B_k R_{k−1} B_{k−1} . . . R_1 B_1 R_0, and also r_i = k'_1 + . . . + k'_i. Observe that n + j = r_s. Below,
the constant K will change from line to line.

Then apply repeatedly Proposition 5.2-(b) and Lemmas 5.4-(a) and 6.12-(a): first, R_0 sends C^N_p(IR^d)
continuously into C^N_p(IR^d); then T_1 = B_1 R_0 sends C^N_p(IR^d) continuously into C^{N−2r_1}_{p+2lr_1}(IR^d)
because N + N∨p ≥ p + (N − 2r_1).

Suppose that T_i sends C^N_p(IR^d) continuously into C^{N−2r_i}_{p+2lr_i}(IR^d). If i ≤ s, then R_i T_i
sends C^N_p(IR^d) continuously into C^{N−2r_i}_{p+2lr_i}(IR^d) as well by (5.4), because 2 ∨ ((N − 2r_i) +
((N − 2r_i) ∨ (p + 2lr_i))) ≤ N + N∨p; and if i < s, then T_{i+1} = B_{i+1} R_i T_i sends C^N_p(IR^d)
continuously into C^{N−2r_{i+1}}_{p+2lr_{i+1}}(IR^d) by Lemmas 6.12 or 5.4. Since Q_{u_1,...,u_m} = R_s T_s and
n + j = r_s, we finally get ‖Q_{u_1,...,u_m} g‖_{p+2l(n+j),N−2n−2j} ≤ K ‖g‖_{p,N}.

For (b) we can apply repeatedly (5.14): with the notation above, only the operators
R_i have to be differentiated w.r.t. some given u_j. Assume that R_i = P_{u_j − u_{j−1}} (resp.
= P_{u_{j+1} − u_j}). Then we differentiate R_i applied to T_i g, which is in C^{N−2r_i}_{p+2lr_i}(IR^d), so we
need F(p + 2lr_i + 2l), which holds because 2r_s + 2 = 2n + 2j + 2 ≤ N, and the differential is
A R_i T_i g (resp. −A R_i T_i g), which belongs to C^{N−2r_i−2}_{p+2lr_i+2l}(IR^d); then we have to check that this
differentiation commutes with the action of R_s B_s . . . B_{i+1}: for this we use Lemmas 5.4-(b)
or 6.12-(b) and we need N ≥ 2n + 2j + 2. Hence the partial derivative of Q_{u_1,...,u_m} g(x)
w.r.t. u_j is the sum, over all i such that R_i is as above, of the same operator except that
we introduce an additional operator A or −A at the ith place. □

In a similar way, but with the help of Lemma 6.10, we get:

Lemma 6.14 Let l = 0 or l = 1, and p ≥ 0, and N ≥ 6. Let Q_{s,t} = P_s U P_{t−s} or
Q_{s,t} = P_s (V − A²) P_{t−s}. Assume H(l, N) and G'({u_n}, {u'_n}, N + N∨p).

a) We have a constant K such that

    g ∈ C^N_p(IR^d) ⇒ ‖Q_{s,t} g‖_{p+6l,N−6} ≤ K ‖g‖_{p,N}.

b) If N ≥ 8, then s ↦ Q_{s,t} g(x) is continuously differentiable, and its derivative is the
action over g and at a point x of an operator which sends C^N_p(IR^d) into C^{N−8}_{p+8l}(IR^d).

Lemma 6.15 Let N ≥ 6 and l = 0 or l = 1 and p ≥ 0, and assume H(l, N).

a) Under G'({u_n}, {u'_n}, N + N∨p) there is a constant K such that

    g ∈ C^N_p(IR^d) ⇒ ‖U_t g‖_{p+6l,N−6} ≤ K t ‖g‖_{p,N},   ‖V_t g‖_{p+6l,N−6} ≤ K t ‖g‖_{p,N}.

b) Under G''(q, N + N∨p) for some q ≥ 1 and N ≥ 6q the formula (6.36) defines an
operator Γ^{(q)}_t on C^N_p, and there is a constant K such that

    g ∈ C^N_p(IR^d) ⇒ ‖Γ^{(q)}_t g‖_{p+6lq,N−6q} ≤ K t ‖g‖_{p,N}.

Proof. (a) is obvious (because of the previous lemma), so we concentrate on (b). In
all the proof we assume H(l, N) and G''(q + 1, N + N∨p) with N ≥ 6q, and r ranges
through {2, . . . , q + 1}. Here again K changes from line to line.

1) An operator R_{t,v} is said to be "of type B_r(n, j)" if its action over g is a linear
combination of terms of the form

    ∫_0^t du_1 . . . ∫_0^{u_{m−1}} du_m  Q_{u_1,...,u_m,t,v} g(x)    (6.38)

(when m = 0 this is just R_t = Q_t), where each Q_{u_1,...,u_m,t,v} is of type A_r(n', j') for some
n' ≤ n and j' ≤ j. Of course, the second argument v may be lacking, and then we just
write R_t.

If R_{t,v} is of type B_q(n, j), the previous lemma readily gives, for any t > 0:

    N ≥ 2n + 2j, g ∈ C^N_p(IR^d) ⇒ ‖R_{t,v} g‖_{p+2l(n+j),N−2n−2j} ≤ K t ‖g‖_{p,N}.    (6.39)

Next, if we formally differentiate the expression (6.38), say Π_{t,v} g(x), we get

    (∂/∂v) Π_{t,v} g(x) = ∫_0^t du_1 . . . ∫_0^{u_{m−1}} du_m (∂/∂v) Q_{u_1,...,u_m,t,v} g(x),

    (∂/∂t) Π_{t,v} g(x) = ∫_0^t du_1 . . . ∫_0^{u_{m−1}} du_m (∂/∂t) Q_{u_1,...,u_m,t,v} g(x)
                          + ∫_0^t du_2 . . . ∫_0^{u_{m−1}} du_m Q_{t,u_2,...,u_m,t,v} g(x).

Therefore the second part of the previous lemma gives us for any operator R_{t,v} of type
B_r(n, j):

    N ≥ 2n + 2j + 2, g ∈ C^N_p(IR^d) ⇒ (t, v) ↦ R_{t,v} g(x) is continuously
    differentiable, and its partial derivatives are the action over g    (6.40)
    and at point x of another operator of type B_r(n + 1, j).

Two other trivial facts are as follows:

    R_{s,t} is of type B_r(n, j) ⇒ R'_t = ∫_0^t R_{s,t} ds is of type B_r(n, j).    (6.41)

    R_t is of type B_r(n, j) and Q_{t,v} is of type A_r(n', j')
        ⇒ R_t Q_{t,v} is of type B_r(n + n', j + j').    (6.42)

2) Now we prove by induction on m that

    Γ^{(m)}_t is of type B_{m+1}(2m, m)    (6.43)

for all m = 1, . . . , q. Observe that this is true for m = 1, in an obvious way, by (6.37).

Let us assume that (6.43) holds for all m' ≤ m − 1, for some m between 2 and q. In
order to prove (6.43) for m, and in view of (6.36), it is enough to prove that for any k ≥ 1
and i, w, r ≥ 0 with i + k + r + w = m, the operator

    R_t = ∫_0^t Γ^{(w)}_s D_{k+1} (∂^i/∂s^i) P_{t−s} A^r ds   is of type B_{m+1}(2m, m)    (6.44)

(recall that Γ^{(0)}_s = P_s). For w ≥ 1 our induction hypothesis yields that Γ^{(w)}_s is of type
B_{w+1}(2w, w), hence Γ^{(w)}_s D_{k+1} P_{t−s} A^r is of type B_{1+k∨w}(2w + k + r + 1, w + 1) by (6.42);
and the same is obviously true when w = 0. Therefore (6.40) applied repeatedly and
(6.41) imply that, provided N ≥ 6w + 2k + 2r + 2i + 2, then R_t is of type B_{1+k∨w}(2w +
k + r + i + 1, w + 1). Since the maxima of w (resp. 2w + k + r + i + 1 = m + w + 1, resp.
6w + 2k + 2r + 2i + 2) over our possible choices of (w, k, i, r) are achieved simultaneously
and are equal to m − 1 (resp. 2m, resp. 6m − 2), and since k ∨ w ≤ m, we deduce from
N ≥ 6m − 2 that indeed (6.44) holds: hence we get (6.43) whenever m ≤ q.

At this stage, (6.43) with m = q and (6.39) gives the result. □

7 Proof of Theorem 2.1

Let us set for n ≥ 1 and j = 1, . . . , n and i = 0, . . . , j:

    β_{n,i,j} g(x) = P^n_{i/n} P_{(j−i)/n} g(x),

and also for i = 1, . . . , j:

    γ_{n,i,j} g(x) = β_{n,i,j} g(x) − β_{n,i−1,j} g(x).

Observe that:

    Δ_{n,t} g(x) = β_{n,[nt],[nt]} g(x) − β_{n,0,[nt]} g(x) = Σ_{i=1}^{[nt]} γ_{n,i,[nt]} g(x).    (7.1)

Below, we assume H(l, 4) for l = 0 or l = 1, and also G({u_n}, 4 + 4∨p) for some p ≥ 0
and for some sequence (u_n) decreasing to 0. By Corollary 6.2 we have η'_{4+4∨p} < ∞. We
also take g ∈ C^4_p(IR^d). In view of (5.15) for N = 0 and of (6.25), a simple computation
shows that

    γ_{n,i,j} g(x) = (1/n) P^n_{(i−1)/n} A_n P_{(j−i)/n} g(x) − P^n_{(i−1)/n} ∫_0^{1/n} P_s A P_{(j−i)/n} g(x) ds.    (7.2)

Proposition 5.2 and Lemma 6.11 for N = 0 and N' = 1 and (6.30) yield that ‖(A_n −
A) P_t g‖_{p+2l,0} ≤ K u_n ‖g‖_{p,4} for all t and some constant K. Hence if

    γ'_{n,i,j} g(x) = (1/n) P^n_{(i−1)/n} A P_{(j−i)/n} g(x) − P^n_{(i−1)/n} ∫_0^{1/n} P_s A P_{(j−i)/n} g(x) ds
                   = P^n_{(i−1)/n} ∫_0^{1/n} (I − P_s) A P_{(j−i)/n} g(x) ds,    (7.3)

where I denotes the identity operator, then by virtue of (6.23) and (7.1), we clearly have

    | Δ_{n,t} g(x) − Σ_{i=1}^{[nt]} γ'_{n,i,[nt]} g(x) | ≤ K t u_n α_{p+2l}(x) ‖g‖_{p,4}.    (7.4)

Next we apply (5.15) for N = 0 again and to the function g' = A P_{(j−i)/n} g, which
satisfies ‖g'‖_{p+2l,2} ≤ K ‖g‖_{p,4} by Proposition 5.2 and Lemma 5.4, to get that ‖P_s g' −
g'‖_{p+4l,0} ≤ K s ‖g‖_{p,4} (uniformly in i and j). Using also (6.23), we readily deduce that

    |γ'_{n,i,j} g(x)| ≤ (K/n²) α_{p+4l}(x) ‖g‖_{p,4},

and the estimate (2.8) thus follows from (7.4).

8 Proof of Theorem 2.2

Let us start with a lemma, which will also be used in the next section, and which shows
in particular how the constants d_n of (6.35) come into the picture through expansions of
some integrals. This lemma is a simple variation on Taylor's formula.

Lemma 8.1 Let M ∈ IN, and h be an M + 1 times differentiable function over [0, 1],
whose derivatives of order 0, 1, . . . , M + 1 are all bounded by a constant ρ. Then we have
for all t ∈ [0, 1]:

    | (1/n) Σ_{i=1}^{[nt]} h((i−1)/n) − Σ_{r=0}^{M} (d_r/n^r) ∫_0^{[nt]/n} h^{(r)}(s) ds | ≤ 2^M t ρ / n^{M+1}.    (8.1)
Proof. For all l = 0, . . . , M we set

    A_{n,l} = (1/n) Σ_{i=1}^{[nt]} h^{(l)}((i−1)/n),   B_{n,l} = ∫_0^{[nt]/n} h^{(l)}(s) ds.

Clearly,

    0 ≤ l ≤ M ⇒ |A_{n,l}| ≤ tρ, |B_{n,l}| ≤ tρ.    (8.2)

Recalling ϕ_n(s) in (2.2), we observe that

    | h^{(l)}(s) − Σ_{i=0}^{M−l} (1/i!) h^{(l+i)}(ϕ_n(s)) (s − ϕ_n(s))^i | ≤ ρ / ( n^{M−l+1} (M − l + 1)! ).

Then we get

    B_{n,l} = Σ_{i=0}^{M−l} (1/(n^i (i+1)!)) A_{n,l+i} + ε_{n,l,M},   |ε_{n,l,M}| ≤ tρ / ( n^{M−l+1} (M − l + 1)! ),

and this can be "inverted" to give

    A_{n,l} = B_{n,l} − Σ_{i=1}^{M−l} (1/(n^i (i+1)!)) A_{n,l+i} − ε_{n,l,M}.    (8.3)

Now, we define d_r by (6.35), and we consider the relations for l = 0, . . . , M:

    A_{n,l} = Σ_{i=0}^{M−l} (d_i/n^i) B_{n,l+i} + ε'_{n,l,M},   |ε'_{n,l,M}| ≤ 2^{M−l} t ρ / n^{M+1−l}.    (8.4)

By (8.3) this relation holds if l = M (recall d_0 = 1), with ε'_{n,M,M} = −ε_{n,M,M}. Assume
that it holds for all l = L + 1, L + 2, . . . , M for some L. Then (8.3) yields

    A_{n,L} = B_{n,L} − Σ_{i=1}^{M−L} (1/(n^i (i+1)!)) ( ε'_{n,i+L,M} + Σ_{r=0}^{M−i−L} (d_r/n^r) B_{n,i+L+r} ) − ε_{n,L,M}

           = B_{n,L} − Σ_{j=1}^{M−L} (1/n^j) ( Σ_{i=1}^{j} d_{j−i}/(i+1)! ) B_{n,L+j} + ε'_{n,L,M},

where ε'_{n,L,M} = −ε_{n,L,M} − Σ_{i=1}^{M−L} (1/(n^i (i+1)!)) ε'_{n,i+L,M}. Then both properties in (8.4) are
satisfied for l = L (use (6.35)). Finally, (8.1) reduces to (8.4) applied with l = 0. □
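Numerically, Lemma 8.1 is easy to check. Take M = 1 with t = 1: using d_0 = 1 and d_1 = −1/2 from (6.35), the corrected quantity ∫_0^1 h(s) ds − (1/2n) ∫_0^1 h′(s) ds should approximate the left Riemann sum to O(1/n²), while the plain integral only matches it to O(1/n). A sketch with h = exp, a convenient smooth test function:

```python
import math

def left_sum(h, n):
    """(1/n) * sum_{i=1}^{n} h((i-1)/n): the left Riemann sum of h on [0, 1]."""
    return sum(h((i - 1) / n) for i in range(1, n + 1)) / n

h = math.exp              # h' = h, and both integrate to e - 1 over [0, 1]
n = 1000
int_h = math.e - 1.0      # int_0^1 h(s) ds
int_dh = math.e - 1.0     # int_0^1 h'(s) ds
err_plain = abs(left_sum(h, n) - int_h)                      # O(1/n)
err_corr = abs(left_sum(h, n) - (int_h - 0.5 / n * int_dh))  # O(1/n^2), d_1 = -1/2
print(err_plain, err_corr)
```

With n = 1000 the plain error is of order (e − 1)/(2n) ≈ 1e-3 while the corrected one drops to order (e − 1)/(12n²) ≈ 1e-7, as the lemma predicts.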

Now we assume H(l, 10) and G'({u_n}, {u'_n}, 10 + 10∨p) for some p ≥ 0. Recall that
u_n and u'_n/u_n go to 0. Take a function g ∈ C^{10}_p(IR^d). For simplicity we also write
u''_n = u'_n ∨ u_n² ∨ 1/n².

We still have (7.1) and (7.2). By Proposition 5.2 and Lemma 6.11-(b) we have for some
constant K (which below will change from line to line) that ‖(A_n − A − u_n U − (1/2n) V) P_t g‖_{p+6l,0} ≤
K u''_n ‖g‖_{p,10}. Hence, if instead of (7.3) we set

    γ'_{n,i,j} g(x) = P^n_{(i−1)/n} ∫_0^{1/n} ( (I − P_s) A + u_n U + (1/2n) V ) P_{(j−i)/n} g(x) ds,

by virtue of (6.23) and (7.1), we get

    | γ_{n,i,[nt]} g(x) − γ'_{n,i,[nt]} g(x) | ≤ (K u''_n / n) α_{p+6l}(x) ‖g‖_{p,10}.    (8.5)

Next we apply (5.15) with N' = 0 and N = 1, and Lemma 5.5-(b) with k = 2 and
N = 0, to the function A P_{(j−i)/n} g, which satisfies ‖A P_{(j−i)/n} g‖_{p+2l,4} ≤ K ‖g‖_{p,10} (by
Proposition 5.2 and Lemma 5.4), to get that ‖(P_s − I − sA) A P_{(j−i)/n} g‖_{p+6l,0} ≤ K s² ‖g‖_{p,10}.
Using also (6.23), we readily deduce that if

    γ''_{n,i,j} g(x) = P^n_{(i−1)/n} ∫_0^{1/n} ( u_n U + (1/2n) V − s A² ) P_{(j−i)/n} g(x) ds
                    = (1/n) P^n_{(i−1)/n} ( u_n U + (V − A²)/(2n) ) P_{(j−i)/n} g(x),

then

    | γ'_{n,i,[nt]} g(x) − γ''_{n,i,[nt]} g(x) | ≤ (K/n³) α_{p+6l}(x) ‖g‖_{p,10}.    (8.6)

Next we apply (5.15) again with N = 0, and Lemma 5.5-(b) with k = 1, to get that
‖P_{(j−i)/n} g − P_{(j−i+1)/n} g‖_{p+2l,6} ≤ (K/n) ‖g‖_{p,10}. Since by Lemmas 5.5 and 6.8 the operators U
and V − A² send C^6_{p+2l}(IR^d) continuously into C^0_{p+8l}(IR^d), and by (6.23), we obtain

    | γ''_{n,i,j} g(x) − (1/n) P^n_{(i−1)/n} ( u_n U + (V − A²)/(2n) ) P_{(j−i+1)/n} g(x) | ≤ (K u''_n / n) α_{p+8l}(x) ‖g‖_{p,10}.    (8.7)

Next we observe once more that the ‖·‖_{p+4l,4} norms of the functions U P_s g and (V −
A²) P_s g are smaller than K ‖g‖_{p,10}: we can apply Theorem 2.1 to these functions and, since
u_n² and 1/n² are smaller than u''_n, we get

    | P^n_{(i−1)/n} ( u_n U + (V − A²)/(2n) ) P_{(j−i+1)/n} g(x) − P_{(i−1)/n} ( u_n U + (V − A²)/(2n) ) P_{(j−i+1)/n} g(x) |
        ≤ K u''_n α_{p+8l}(x) ‖g‖_{p,10}.

Then putting this together with (7.1), (8.5), (8.6) and (8.7) gives us, with the notation
G_{x;t}(s) = P_s U P_{t−s} g(x) and H_{x;t}(s) = P_s (V − A²) P_{t−s} g(x):

    | Δ_{n,t} g(x) − (u_n/n) Σ_{i=1}^{[nt]} G_{x;[nt]/n}((i−1)/n) − (1/2n²) Σ_{i=1}^{[nt]} H_{x;[nt]/n}((i−1)/n) |
        ≤ K t u''_n α_{p+8l}(x) ‖g‖_{p,10}.    (8.8)

It remains to observe that the two functions G_{x;t} and H_{x;t} (on the interval [0, t])
satisfy the assumptions of Lemma 8.1 with M = 0 and for some constant ρ, by virtue
of Lemma 6.14-(b). Then with the notation (6.34) we readily deduce (2.10) from the
previous inequality and the fact that u_n/n ≤ u''_n.
9 Proof of Theorem 2.3

The proof of this theorem is similar to the proof of the previous one, except that here we
need an induction on N, after observing that the result for N = 0 is nothing else than
Theorem 2.1.

So below we assume that N ≥ 1 and that H(l, 6N + 4) and G''(N, 6N + 4 + (6N+4)∨p)
hold, and we take g ∈ C^{6N+4}_p(IR^d). We also assume that the expansion (2.11) with a
remainder satisfying (2.13) holds for all integers from 0 up to N − 1. The claims concerning
the operators Γ^{(k)}_t are in Lemma 6.15, so we concentrate on the expansion.

We still have (7.1) and (7.2). By Proposition 5.2 and Lemma 6.11-(c), and since
3 + N ≤ 6N + 4, we have

    ‖ ( A_n − Σ_{k=1}^{N+1} (1/(n^{k−1} k!)) A^{(k)} ) P_t g ‖_{p+2l(N+2),0} ≤ (K/n^{N+1}) ‖g‖_{p,6N+4}

for all t, for some constant K (which again changes from line to line). Hence, recalling
that A^{(1)} = A, and if instead of (7.3) we set

    γ'_{n,i,j} g(x) = P^n_{(i−1)/n} ∫_0^{1/n} ( (I − P_s) A + Σ_{k=2}^{N+1} (1/(n^{k−1} k!)) A^{(k)} ) P_{(j−i)/n} g(x) ds,

by virtue of (6.23) and (7.1), we get

    | γ_{n,i,[nt]} g(x) − γ'_{n,i,[nt]} g(x) | ≤ (K/n^{N+2}) α_{p+2l(N+2)}(x) ‖g‖_{p,6N+4}.    (9.1)

Next we apply (5.15) with N' = 0, to the function A P_{(j−i)/n} g: taking advantage of
Proposition 5.2 and Lemma 5.4, we get that

    ‖ ( P_s − I − Σ_{k=1}^{N} (s^k/k!) A^k ) A P_{(j−i)/n} g ‖_{p+2l(N+2),0} ≤ K s^{N+1} ‖g‖_{p,6N+4}.

Using also (6.23) and the notation D_k of (6.33), we readily deduce that if

    γ''_{n,i,j} g(x) = P^n_{(i−1)/n} ∫_0^{1/n} ( Σ_{k=2}^{N+1} (1/(n^{k−1} k!)) A^{(k)} − Σ_{k=1}^{N} (s^k/k!) A^{k+1} ) P_{(j−i)/n} g(x) ds
                    = (1/n) P^n_{(i−1)/n} ( Σ_{k=1}^{N} (1/n^k) D_{k+1} ) P_{(j−i)/n} g(x),

then

    | γ'_{n,i,[nt]} g(x) − γ''_{n,i,[nt]} g(x) | ≤ (K/n^{N+2}) α_{p+2l(N+2)}(x) ‖g‖_{p,6N+4}.    (9.2)
We easily deduce from (5.15) that, under the same assumptions as in Lemma 5.6-(b),
we have

    g(x) = Σ_{k=0}^{N} (−1)^k (t^k/k!) P_t A^k g(x) + (1/N!) ∫_0^t (−s)^N A^{N+1} P_s g(x) ds.    (9.3)

(For checking this, we use P_t A^k = A^k P_t and we can replace P_t A^k g in the kth summand
above by the right side of (5.15) written for N − k instead of N; a repeated use of the
binomial formula then gives us that the right side of (9.3) equals g(x).)
Coming back to our problem, we deduce from (9.3) that the function ψ_{n,N−k} = g −
Σ_{r=0}^{N−k} (−1)^r (1/(n^r r!)) P_{1/n} A^r g, for any k ≤ N, satisfies

    ‖ψ_{n,N−k}‖_{p+2l(N−k+1),4N+2k+2} ≤ (K/n^{N−k+1}) ‖g‖_{p,6N+4},

and so does the function φ_{i,j,n,N−k} = P_{(j−i)/n} ψ_{n,N−k}. Hence

    ‖P^n_{(i−1)/n} D_{k+1} φ_{i,j,n,N−k}‖_{p+2l(N+2),4N−2} ≤ (K/n^{N−k+1}) ‖g‖_{p,6N+4}

by Proposition 6.6 and Lemma 6.12, and if

    γ'''_{n,i,j} g(x) = (1/n) Σ_{k=1}^{N} Σ_{r=0}^{N−k} ((−1)^r/(n^{k+r} r!)) P^n_{(i−1)/n} D_{k+1} P_{(j−i+1)/n} A^r g(x),

we get

    | γ''_{n,i,[nt]} g(x) − γ'''_{n,i,[nt]} g(x) | ≤ (K/n^{N+2}) α_{p+2l(N+2)}(x) ‖g‖_{p,6N+4}.    (9.4)
Now we apply the induction hypothesis. Observe that if k + r ≤ N the function
g' = D_{k+1} P_t A^r g satisfies ‖g'‖_{p+2l(k+r+1),6N−2k−2r} ≤ K ‖g‖_{p,6N+4} by Lemma 6.12. So our
assumptions and the fact that 6(N − k − r) + 4 ≤ 6N − 2k − 2r (remember that k + r ≥ 1)
allow us to apply the expansion (2.11) to this function at the order N − k − r, which gives

    | ( P^n_{(i−1)/n} − P_{(i−1)/n} − Σ_{u=1}^{N−k−r} (1/n^u) Γ^{(u)}_{(i−1)/n} ) D_{k+1} P_{(j−i+1)/n} A^r g(x) |
        ≤ (K/n^{N−k−r+1}) α_{p+2l(2N+3−k−r)}(x) ‖g‖_{p,6N+4}.

Hence, if we set Γ^{(0)}_t = P_t and

    ξ^{k,r,u}_{n,i,j} g(x) = Γ^{(u)}_{(i−1)/n} D_{k+1} P_{(j−i+1)/n} A^r g(x)

for 1 ≤ k ≤ N and 0 ≤ r ≤ N − k and 0 ≤ u ≤ N − k − r, then

    | γ'''_{n,i,[nt]} g(x) − (1/n) Σ_{k=1}^{N} Σ_{r=0}^{N−k} Σ_{u=0}^{N−r−k} ((−1)^r/(n^{k+r+u} r!)) ξ^{k,r,u}_{n,i,[nt]} g(x) |
        ≤ (K/n^{N+2}) α_{p+4l(N+1)}(x) ‖g‖_{p,6N+4}.    (9.5)
In other words, if we fix t and introduce the functions

    g_{x;k,r,u,t}(s) = Γ^{(u)}_s D_{k+1} P_{t−s} A^r g(x),

by putting together (7.1), (9.1), (9.2), (9.4) and (9.5), we obtain

    | Δ_{n,t} g(x) − Σ_{k=1}^{N} Σ_{r=0}^{N−k} Σ_{u=0}^{N−r−k} ((−1)^r/(n^{k+r+u} r!)) (1/n) Σ_{i=1}^{[nt]} g_{x;k,r,u,t}((i−1)/n) |
        ≤ (Kt/n^{N+1}) α_{p+4l(N+1)}(x) ‖g‖_{p,6N+4}.    (9.6)
Now, by (6.43), g_{x;k,r,u,t}(s) is the action over g and at point x of an operator of type
B_{1+u∨k}(2u + k + r + 1, u + 1). Hence (6.40) applied repeatedly and the fact that 6N + 4 ≥
2(2u + k + r + 1) + 2(u + 1) + 2(N − k − r − u) for all u, k, r with k ≥ 1 and k + r + u ≤ N
show that the function g_{x;k,r,u,t} is differentiable up to order N − k − r − u + 1, with all
partial derivatives up to that order being bounded by K α_{p+4l(N+1)}(x) ‖g‖_{p,6N+4}. Then
Lemma 8.1 applied with M = N − k − r − u gives:

    | (1/n) Σ_{i=1}^{[nt]} g_{x;k,r,u,t}((i−1)/n) − Σ_{v=0}^{N−k−r−u} (d_v/n^v) ∫_0^{[nt]/n} g^{(v)}_{x;k,r,u,t}(s) ds |
        ≤ ( K α_{p+4l(N+1)}(x) / n^{N+1−k−r−u} ) ‖g‖_{p,6N+4}.

Injecting this into (9.6) gives

    | Δ_{n,t} g(x) − Σ_{k=1}^{N} Σ_{r=0}^{N−k} Σ_{u=0}^{N−r−k} Σ_{v=0}^{N−k−r−u} ((−1)^r d_v/(n^{k+r+u+v} r!)) ∫_0^{[nt]/n} g^{(v)}_{x;k,r,u,t}(s) ds |
        ≤ (Kt/n^{N+1}) α_{p+4l(N+1)}(x) ‖g‖_{p,6N+4}.

At this point, it suffices to use the definition (6.36) of Γ^{(q)}_t and to reorder the sums above:
we readily get (2.11) and (2.13), and we are done.

Remark 9.1 We can compute "explicitly" the operators Γ^{(k)}_t, although this becomes
incredibly tedious when k grows. For example, in the 1-dimensional case (r = d = 1),
formula (6.37) reads Γ^{(1)}_t g(x) = ∫_0^t P_s D_2 P_{t−s} g(x) ds, so it suffices to know the operator
D_2 explicitly; it is given by

    D_2 g(x) = −b² (g′ f f′)(x) − (bc/2) (4 g″ f² f′ + g′ f² f″)(x)
        − (c²/2) (2 g‴ f³ f′ + g″ f³ f″ + g″ f² f′²)(x)
        − b f(x) f′(x) ∫ F(dy) ( g′(x + f(x)y) y − g′(x) τ(y) )
        − b ∫ F(dy) ( g′(x + f(x)y)(f(x + f(x)y) − f(x)) − g′(x) f(x) f′(x) τ(y) )
        − c f(x)² f′(x) ∫ F(dy) ( g″(x + f(x)y) y − g″(x) τ(y) )
        − (c/2) f(x)² f″(x) ∫ F(dy) ( g′(x + f(x)y) y − g′(x) τ(y) )
        − (c/2) f(x)² f′(x)² ∫ F(dy) g″(x + f(x)y) y²
        − (c/2) ∫ F(dy) ( g″(x + f(x)y)(f(x + f(x)y)² − f(x)²) − 2 g″(x) f(x)² f′(x) τ(y) )
        − ∫ F(dy) ∫ F(dy′) ( g(x + f(x)y′ + f(x + f(x)y′)y) − g(x + f(x)y′ + f(x)y)
            − g′(x + f(x)y′)(f(x + f(x)y′) − f(x)) τ(y)
            − ( g′(x + f(x)y) y − g′(x) τ(y) ) f(x) f′(x) τ(y′) ).

If we are in the compound Poisson case (i.e. c = 0 and F(IR) < ∞ and b = ∫ τ(y) F(dy)),
we see that D_2 P_s g = H_s g where H_s is defined by (4.2), as it should be.

On the other hand, as soon as b ≠ ∫ τ(y) F(dy), and even if F(IR) < ∞, then D_2 g is
well defined only under F(1), and we even need F(2) if further c ≠ 0. So, although this is
no true proof, it seems quite unlikely that Theorems 2.2 and 2.3 stay true when we drop
all integrability assumptions on the jumps of Y, even when g is bounded. □

