A Course On Rough Paths
March 2020
Last update to this version: March 25, 2022
Springer
To Waltraud and Rudolf Friz
and
To Xue-Mei
Preface to the Second Edition
It has been a joy seeing the subject of “rough analysis” flourish over the last few
years. As far as this book is concerned, this comes at the price of an increasingly
long list of (important) omissions. A systematic presentation of higher-level geometric
and then branched (possibly càdlàg) rough paths remains beyond the scope
of this book, despite being an excellent preparation for the algebraic thinking
later required for regularity structures. (The references [LCL07, FV10b, CF19]
and [Gub10, HK15, FZ18, BCFP19] partially make up for this.) Also absent remains
a systematic mathematical study of signatures. This topic, together with recent applications
to data science and machine learning, may well fill a book in its own right;
until then the reader may consult Lyons’ ICM article [Lyo14] and the survey [CK16].
The theory of regularity structures, a major extension of rough path theory, has,
since the appearance of the first edition of this book, grown into an essentially
complete solution theory for general singular, subcritical semilinear (and quasilinear)
stochastic partial differential equations. Despite this progress, our running example
of a singular SPDE remains the KPZ equation, originally solved with rough paths
[Hai13], later also with the Gubinelli–Imkeller–Perkowski theory of paracontrolled
distributions [GIP15, GP15, GP17], another topic that deserves a book in its own
right.
As far as the content of this second edition is concerned, we added many new
examples and updated notations throughout in order to bring it closer to current
practice in the literature. Our short incursion into low regularity (a.k.a. higher order)
rough paths in Section 2.4 has been expanded, and the recently obtained stochastic sewing
lemma is presented in Section 4.6. Section 9.4 shows how the Laplace method allows
one to elegantly obtain precise asymptotics in the large deviation principle, while
Section 12.1 contains a detailed discussion of rough transport equations. We also
expanded and updated large parts of Chapters 13-15 dealing with regularity structures.
In particular, we give a more modern and self-contained proof of the reconstruction
theorem (not relying on wavelet bases anymore), as well as a thorough discussion
of an application of regularity structures to a “rough” stochastic volatility model in
Section 14.5, and a detailed description of the KPZ structure and renormalisation
groups in Sections 15.3 and 15.5.
We also take the opportunity here to thank, in addition to those friends and
colleagues already named in the first edition, Yvain Bruned, Ajay Chandra, Ilya
Chevyrev, Rosa Preiß, and Lorenzo Zambotti for many interesting discussions over
the last few years. Of the many people who communicated to us lists of typos and
minor issues we especially thank Christian Litterer. We also thank Carlo Bellingeri,
Oleg Butkovsky, Andris Gerasimovics, Mate Gerencsér, Tom Klose, Khoa Lê, Mario
Maurelli and Nikolas Tapia for feedback on various aspects of the new content. The
first author also thanks ETH Zürich (FIM) for its hospitality during the finalisation
of this second edition.
Last but not least, we would like to acknowledge financial support: PKF is supported
by the European Research Council under the European Union’s Horizon 2020
research and innovation programme through Consolidator Grant 683164 (GPSART),
by DFG research unit FOR2402, and the Einstein Foundation Berlin through an
Einstein professorship. MH was supported by the European Research Council under
the European Union’s Seventh Framework Programme through Consolidator Grant
615897 (CRITICAL), by the Leverhulme trust through a leadership award, and by
the Royal Society through a research professorship.
Preface to the First Edition
Programme (FP7/2007-2013) / ERC grant agreement nr. 258237 and DFG, SPP 1324.
MH was supported by the Leverhulme trust through a leadership award and by the
Royal Society through a Wolfson research award.
Contents
1 Introduction
1.1 What is it all about?
1.2 Analogies with other branches of mathematics
1.3 Regularity structures
1.4 Frequently used notations
1.5 Rough path theory works in infinite dimensions
References
Index
Chapter 1
Introduction
We give a short overview of the scopes of both the theory of rough paths and the
theory of regularity structures. The main ideas are introduced and we point out some
analogies with other branches of mathematics.
where the $(\xi_i)$ are i.i.d. standard Gaussian random variables. Based on martingale
theory, Itô’s stochastic differential equations (SDEs) have provided a rigorous and
extremely useful mathematical framework for all this. And yet, stability is lost in the
passage to continuous time: while it is trivial to solve (1.2) for a fixed realisation of
$\xi_i(\omega)$, after all $(\xi_1, \dots, \xi_T; Y_0) \mapsto Y_i$ is surely a continuous map, the continuity of
the solution as a function of the driving noise is lost in the limit.
Taking $\dot X = \xi$ to be white noise in time (which amounts to saying that $X$ is a
Brownian motion, say $B$), the solution map $S : B \mapsto Y$ to (1.1), known as the Itô map,
is a measurable map which in general lacks continuity, whatever norm one uses to
equip the space of realisations of $B$.¹ Actually, one can show the following negative
result (see [Lyo91, LCL07] as well as Exercise 5.7 below):
Proposition 1.1. There exists no separable Banach space $\mathcal{B} \subset C([0, 1])$ with the
following properties:
1. Sample paths of Brownian motion lie in $\mathcal{B}$ almost surely.
2. The map $(f, g) \mapsto \int_0^\cdot f(t)\, \dot g(t)\, dt$ defined on smooth functions extends to a
continuous map from $\mathcal{B} \times \mathcal{B}$ into the space of continuous functions on $[0, 1]$.
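The obstruction behind Proposition 1.1 can already be observed on smooth paths. The following is a minimal sketch in Python (standard library only); the spiral paths and all function names are illustrative choices, not taken from the text. It samples $f_n(t) = \cos(2\pi n^2 t)/n$ and $g_n(t) = \sin(2\pi n^2 t)/n$, which tend to zero uniformly, and approximates $\int_0^1 f_n(t)\, \dot g_n(t)\, dt$, which equals $\pi$ for every $n$.

```python
import math

def spiral(n, samples_per_turn=2000):
    """Sample f_n(t) = cos(2*pi*n^2*t)/n and g_n(t) = sin(2*pi*n^2*t)/n on [0, 1]."""
    num = n * n * samples_per_turn  # n^2 full turns, equally resolved for every n
    angles = [2.0 * math.pi * k / samples_per_turn for k in range(num + 1)]
    f = [math.cos(a) / n for a in angles]
    g = [math.sin(a) / n for a in angles]
    return f, g

def integral_f_dg(f, g):
    """Approximate int_0^1 f(t) g'(t) dt by a Riemann-Stieltjes sum,
    using the average of the two endpoint values of f on each step."""
    return sum(0.5 * (f[k] + f[k + 1]) * (g[k + 1] - g[k]) for k in range(len(f) - 1))

def sup_norm(values):
    """Largest absolute value among the samples."""
    return max(abs(v) for v in values)

if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        f, g = spiral(n)
        # The sup norms shrink like 1/n while the integral stays near pi.
        print(n, max(sup_norm(f), sup_norm(g)), integral_f_dg(f, g))
```

The pair $(f_n, g_n)$ converges to $(0, 0)$ in the supremum norm, yet the integrals do not converge to $0$, so the bilinear map in item 2 cannot be continuous for the uniform topology.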
Since, for any two distinct indices $i$ and $j$, the map
\[ B \mapsto \int_0^\cdot B^i(t)\, \dot B^j(t)\, dt , \tag{1.3} \]
is itself the solution of one of the simplest possible differential equations driven by
$B$ (take $Y \in \mathbf{R}^2$ solving $\dot Y^1 = \dot B^i$ and $\dot Y^2 = Y^1 \dot B^j$), this shows that it takes very
little for $S$ to lack continuity. In this sense, solving SDEs is an analytically ill-posed
task! On the other hand, there are well-known probabilistic well-posedness results
for SDEs of the form²
Theorem 1.2. Let $\xi_\varepsilon = \delta_\varepsilon * \xi$ denote the regularisation of white noise in time with a
compactly supported smooth mollifier $\delta_\varepsilon$. Denote by $Y^\varepsilon$ the solutions to (1.1) driven
by $\dot X = \xi_\varepsilon$. Then $Y^\varepsilon$ converges in probability (uniformly on compact sets). The
limiting process does not depend on the choice of mollifier $\delta_\varepsilon$, and in fact is the
Stratonovich solution to (1.4).
There are many variations on such “Wong–Zakai” results, another popular choice
being $\xi_\varepsilon = \dot B^{(\varepsilon)}$ where $B^{(\varepsilon)}$ is a piecewise linear approximation (of mesh size
$\sim \varepsilon$) to Brownian motion. However, as a consequence of the aforementioned lack of
continuity of the Itô map, there are also reasonable approximations to white noise for
which the above convergence fails. (We shall see an explicit example in Section 3.4.)
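A Wong–Zakai statement of this kind can be checked by hand in one simple case. The sketch below is an illustration under explicit assumptions: the scalar equation $dY = Y\, dB$ with $Y_0 = 1$ is a hypothetical choice (the text does not fix the coefficients), picked because its Stratonovich solution is $\exp(B_t)$ and its Itô solution is $\exp(B_t - t/2)$. Driving the equation by a piecewise linear interpolation of $B$ gives an ordinary differential equation whose solution at the grid points is exactly $\exp(B_t)$, whereas an Euler scheme of Itô type approaches $\exp(B_t - t/2)$. All function names are made up for this example.

```python
import math
import random

def brownian_increments(n, T=1.0, seed=0):
    """Independent N(0, T/n) increments of a simulated Brownian path on [0, T]."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, math.sqrt(T / n)) for _ in range(n)]

def piecewise_linear_endpoint(increments, y0=1.0):
    """Solve dY/dt = Y * dB_eps/dt, with B_eps the piecewise linear interpolation.
    On each linear piece the ODE has the closed-form flow y -> y * exp(dB)."""
    y = y0
    for db in increments:
        y *= math.exp(db)
    return y

def euler_endpoint(increments, y0=1.0):
    """Ito-type Euler scheme Y_{k+1} = Y_k + Y_k * dB_k."""
    y = y0
    for db in increments:
        y += y * db
    return y

if __name__ == "__main__":
    inc = brownian_increments(100_000, seed=1)
    b_T = sum(inc)
    print("B_T                          ", b_T)
    print("log of piecewise linear limit", math.log(piecewise_linear_endpoint(inc)))
    print("log of Euler approximation   ", math.log(euler_endpoint(inc)))
```

The two logarithms differ by roughly $T/2$, the usual Itô–Stratonovich correction for this equation.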
Perhaps rather surprisingly, it turns out that well-posedness is restored via the
iterated integrals (1.3) which are in fact the only data that is missing to turn S into
a continuous map. The role of (1.3) was already appreciated in [INY78, Thm 4.1]
and related works in the seventies, but statements at the time were probabilistic
in nature, such as Theorem 1.2 above. Rough path analysis, introduced by Terry
Lyons in the seminal article [Lyo98] and by now exposed in several monographs
[LQ02, LCL07, FV10b], provides the following remarkable insight: Itô’s solution
map can be factorised into a measurable “universal” map $\Psi$ and a continuous solution
map $\hat S$ as
\[ B(\omega) \xmapsto{\;\Psi\;} (B, \mathbb{B})(\omega) \xmapsto{\;\hat S\;} Y(\omega). \tag{1.5} \]
¹ This lack of regularity is the raison d’être for Malliavin calculus, a Sobolev type theory of $C([0, T])$
equipped with Wiener measure, the law of Brownian motion.
² For the purpose of this introduction, all coefficients are assumed to be sufficiently nice.
The map $\Psi$ is universal in the sense that it depends neither on the initial condition, nor
on the vector fields driving the stochastic differential equation, but merely consists
of enhancing Brownian motion with iterated integrals of the form
\[ \mathbb{B}^{i,j}(s, t) = \int_s^t \big( B^i(r) - B^i(s) \big)\, dB^j(r) . \tag{1.6} \]
At this stage, the choice of stochastic integration in (1.6) (e.g. Itô or Stratonovich)
does matter and probabilistic techniques are required for the construction of Ψ .
Indeed, the map Ψ is only measurable and usually requires the use of some sort
of stochastic integration theory (or some equivalent construction, see for example
Section 10 below for a general construction in a Gaussian, non-semimartingale
context).
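That the choice of stochastic integral in (1.6) matters can be seen numerically in the simplest case $i = j$ of a scalar Brownian motion. The sketch below is only an illustration on simulated increments, with made-up function names. It compares left-point Riemann sums (Itô) with sums using the average of the two endpoint values of the integrand (Stratonovich). The second sum telescopes to $\frac12 B_T^2$, and the two differ by half the discrete quadratic variation, which is close to $T/2$.

```python
import math
import random

def brownian_increments(n, T=1.0, seed=0):
    """Independent N(0, T/n) increments of a simulated Brownian path on [0, T]."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, math.sqrt(T / n)) for _ in range(n)]

def iterated_sums(increments):
    """Return (ito, stratonovich, b_T, quadratic_variation) for int_0^T (B_r - B_0) dB_r."""
    b = ito = strat = qv = 0.0
    for db in increments:
        ito += b * db                 # integrand evaluated at the left endpoint
        strat += (b + 0.5 * db) * db  # integrand averaged over both endpoints
        qv += db * db                 # discrete quadratic variation
        b += db
    return ito, strat, b, qv

if __name__ == "__main__":
    ito, strat, b_T, qv = iterated_sums(brownian_increments(100_000, seed=3))
    print("Ito sum         ", ito)
    print("Stratonovich sum", strat, "  B_T^2 / 2 =", 0.5 * b_T ** 2)
    print("difference      ", strat - ito, "  qv / 2 =", 0.5 * qv)
```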
The solution map $\hat S$ on the other hand, the solution map to a rough differential
equation (RDE), also known as the Itô–Lyons map and discussed in Section 8.1, is purely
deterministic and only makes use of analytical constructions. More precisely, it allows
input signals to be arbitrary rough paths which, as discussed in Chapter 2, are objects
(thought of as enhanced paths) of the form $(X, \mathbb{X})$, defined via certain algebraic
properties (which mimic the interplay between a path and its iterated integrals) and
certain analytical, Hölder-type regularity conditions. In Chapter 3 these conditions
will be seen to hold true a.s. for $(B, \mathbb{B})$; a typical realisation is thus called a Brownian
rough path.
The Itô–Lyons map turns out, cf. Section 8.6, to be “nice” in the sense that it is a
continuous map of both its initial condition and the driving noise $(X, \mathbb{X})$, provided
that the dependency on the latter is measured in a suitable “rough path” metric. In
other words, rough path analysis allows for a pathwise solution theory for SDEs, i.e.
for a fixed realisation of the Brownian rough path. The solution map $\hat S$ is however
a much richer object than the original Itô map, since its construction is completely
independent of the choice of stochastic integral and even of the knowledge that the
driving path is Brownian. For example, if we denote by $\Psi^I$ (resp. $\Psi^S$) the maps
$B \mapsto (B, \mathbb{B})$ obtained by Itô (resp. Stratonovich) integration, then we have the almost
sure identities
\[ S^I = \hat S \circ \Psi^I , \qquad S^S = \hat S \circ \Psi^S , \]
where $S^I$ (resp. $S^S$) denotes the solution to (1.4) interpreted in the Itô (resp.
Stratonovich) sense. Returning to Theorem 1.2, we see that the convergence there
is really a deterministic consequence of the probabilistic question whether or not
$\Psi^S(B^\varepsilon) \to \Psi^S(B)$ in probability and rough path topology, with $\dot B^\varepsilon = \xi_\varepsilon$. This
can be shown to hold in the case of mollifier, piecewise linear, and many other
approximations.
So how is this Itô–Lyons map $\hat S$ built? In order to solve (1.1), we need to be able
to make sense of the expression
\[ \int_0^t f(Y_s)\, dX_s , \tag{1.7} \]
where $Y$ is itself the as yet unknown solution. Here is where the usual pathwise
approach breaks down: as we have seen in Proposition 1.1 it is in general impossible,
even in the simplest cases, to find a Banach space of functions containing Brownian
sample paths and in which (1.7) makes sense. Actually, if we measure regularity in
terms of Hölder exponents, then (1.7) makes sense as a limit of Riemann sums for $X$
and $Y$ that are arbitrary $\alpha$-Hölder continuous functions if and only if $\alpha > \frac12$. The key
word here is arbitrary: in our case the function $Y$ is anything but arbitrary! Actually,
since the function $Y$ solves (1.1), one would expect the small-scale fluctuations of $Y$
to look exactly like a scaled version of the small-scale fluctuations of $X$ in the sense
that one would expect that
\[ Y_{s,t} = f(Y_s)\, X_{s,t} + R_{s,t} , \]
where, for any path $F$ with values in a linear space, we set $F_{s,t} = F_t - F_s$, and
where $R_{s,t}$ is some remainder that one would expect to be “of higher order” in the
sense that $|R_{s,t}| \lesssim |t - s|^\beta$ for some $\beta > \alpha$. (We will see later that $\beta = 2\alpha$ is a
natural choice.)
Suppose now that $X$ is a “rough path”, which is to say that it has been “enhanced”
with a two-parameter function $\mathbb{X}$ which should be interpreted as postulating the
values for
\[ \mathbb{X}^{i,j}(s, t) = \int_s^t X^i_{s,r}\, dX^j_r . \tag{1.8} \]
Note here that this identity should be read in the reverse order from what one may be
used to: it is the right-hand side that is defined by the left-hand side and not the other
way around! The idea here is that if $X$ is too rough, then we do not a priori know
how to define the integral of $X$ against itself, so we simply postulate its values. Of
course, $\mathbb{X}$ cannot just be anything, but should satisfy a number of natural algebraic
identities and analytical bounds, which will be discussed in detail in Chapter 2.
Anyway, assuming that we are provided with the data $(X, \mathbb{X})$, then we know how
to give meaning to the integral of components of $X$ against other components of $X$:
this is precisely what $\mathbb{X}$ encodes. Intuitively, this suggests that if we similarly encode
the fact that $Y$ “looks like $X$ at small scales”, then one should be able to extend
the definition of (1.7) to a large enough class of integrands to include solutions to
(1.1), even when $\alpha < \frac12$. One of the achievements of rough path theory is to make
this intuition precise. Indeed, in the framework of rough integration sketched here
and made precise in Chapter 4, the barrier $\alpha = \frac12$ can be lowered to $\alpha = \frac13$. In
principle, this can be lowered further by further enhancing $X$ with iterated integrals
of higher order, but we decided to focus on the first non-trivial case for the sake of
simplicity and because it already covers the most important case when $X$ is given
by a Brownian motion, or a stochastic process with properties similar to those of
Brownian motion. We do however indicate briefly in Sections 2.4, 4.5 and 7.6 how
the theory can be modified to cover the case $\alpha \le \frac13$, at least in the “geometric” case
when $X$ is a limit of smooth paths.
The simplest way for $Y$ to “look like $X$” is when $Y = G(X)$ for some sufficiently
regular function $G$. Despite what one might guess, it turns out that this particular
class of functions $Y$ is already sufficiently rich so that knowing how to define
integrals of the form $\int_0^t G(X_s)\, dX_s$ for (non-gradient) functions $G$ allows one to give a
meaning to equations of the type (1.1), which is the approach originally developed in
[Lyo98]. A few years later, Gubinelli realised in [Gub04] that, in order to be able to
give a meaning to $\int_0^t Y_s\, dX_s$ given the data $(X, \mathbb{X})$, it is sufficient that $Y$ admits a
“derivative” $Y'$ such that
\[ Y_{s,t} = Y'_s X_{s,t} + R_{s,t} , \]
with a remainder satisfying $R_{s,t} = O(|t - s|^{2\alpha})$. This extension of the original theory
turns out to be quite convenient, especially when applying it to problems other than
the resolution of evolution equations of the type (1.1).
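For the basic example $Y = G(X)$ with $Y' = G'(X)$, the bound on the remainder is just Taylor's theorem: $|R_{s,t}| \le \frac12 \sup|G''|\, |X_{s,t}|^2$, which is $O(|t - s|^{2\alpha})$ as soon as $X$ is $\alpha$-Hölder. The sketch below checks this inequality on a sampled path. The scalar random-walk path, the choice $G = \sin$ (so that $|G''| \le 1$) and the function names are assumptions made for illustration only.

```python
import math
import random

def sample_path(n, seed=0):
    """A rough-looking scalar sample path: a scaled random walk with n steps."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n):
        x.append(x[-1] + rng.gauss(0.0, 1.0) / math.sqrt(n))
    return x

def remainder(x, s, t, G=math.sin, dG=math.cos):
    """R_{s,t} = Y_{s,t} - Y'_s X_{s,t} for Y = G(X) and Y' = G'(X); s, t are grid indices."""
    return (G(x[t]) - G(x[s])) - dG(x[s]) * (x[t] - x[s])

def max_excess(x):
    """Largest value of |R_{s,t}| - 0.5 * |X_{s,t}|^2 over all pairs of grid indices.
    Taylor's theorem with |G''| <= 1 predicts that this is never positive."""
    n = len(x)
    return max(
        abs(remainder(x, s, t)) - 0.5 * (x[t] - x[s]) ** 2
        for s in range(n)
        for t in range(n)
    )

if __name__ == "__main__":
    print(max_excess(sample_path(200)))
```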
An intriguing question is to what extent rough path theory, essentially a theory
of controlled ordinary differential equations, can be extended to partial differential
equations. In the case of finite-dimensional noise, and very loosely stated, one has
for instance a statement of the following type. (See [CF09, CFO11, FO14, GT10,
Tei11, DGT12] as well as Section 12.2 below.)
Theorem 1.3. Classes of SPDEs of the form $du = F[u]\, dt + H[u] \circ dB$, with
second and first order differential operators $F$ and $H$, respectively, and driven
by finite-dimensional noise, with the Zakai equation from filtering and stochastic
Hamilton–Jacobi–Bellman (HJB) equations as examples, can be solved pathwise, i.e.
for a fixed realisation of the Brownian rough path. As in the SDE case, the SPDE
solution map factorises as $S^S = \hat S \circ \Psi^S$ where $\hat S$, the solution map to a rough partial
differential equation (RPDE), is continuous in the rough path topology.
diverse mathematical fields, including for example quantum field theory [GL09],
nonlinear PDEs [Gub12], Malliavin calculus [CFV09], non-Markovian Hörmander
and ergodic theory [CF10, HP13, CHLT15], and the multiscale analysis of chaotic
behaviour in fast-slow systems [KM16, KM17, CFK+19b].
In view of these developments, we believe that it is an opportune time to try to
summarise some of the main results of the theory in a way that is as elementary as
possible, yet sufficiently precise to provide a technical working knowledge of the
theory. We therefore include elementary but essentially complete proofs of several
of the main results, including the continuity and definition of the Itô–Lyons map,
the lifting of a class of Gaussian processes to the space of rough paths, etc. In
contrast to the available textbook literature [LQ02, LCL07, FV10b], we emphasise
Gubinelli’s view on rough integration [Gub04, Gub10] which allows one to linearise
many considerations and to simplify the exposition. That said, the resulting theory
of rough differential equations is (immediately) seen to be equivalent to Davie’s
definition [Dav08] and, generally, we have tried to give a good idea of what other
perspectives one can take on what amounts to essentially the same objects.
As we have just seen, the main idea of the theory of rough paths is to “enhance”
a path $X$ with some additional data $\mathbb{X}$, namely the integral of $X$ against itself, in
order to restore continuity of the Itô map. The general idea of building a larger
object containing additional information in order to restore the continuity of some
nonlinear transformation is of course very old and there are several other theories
that have a similar “flavour” to the theory of rough paths, one of them being the
theory of Young measures (see for example the notes [Bal00]) where the value of
a function is replaced by a probability measure, thus allowing one to describe limits of
highly oscillatory functions.
Nevertheless, when first confronted with some of the notions just outlined, the first
reaction of the reader might be that simply postulating the values for the right-hand
side of (1.8) makes no sense. Indeed, if $X$ is smooth, then we “know” that there is
only one “reasonable” choice for the integral $\mathbb{X}$ of $X$ against itself, and this is the
Riemann integral. How could this be replaced by something else and how can one
expect to still get a consistent theory with a natural interpretation? These questions
will of course be fully answered in these notes.
For the moment, let us draw an analogy with a very well established branch of
geometric measure theory, namely the theory of varifolds [Alm66, LY02].
Varifolds arise as natural extensions of submanifolds in the context of certain
variational problems. We are not going into details here, but loosely speaking a
$k$-dimensional varifold in $\mathbf{R}^n$ is a (Radon) measure $v$ on $\mathbf{R}^n \times G(k, n)$, where
$G(k, n)$ denotes the space of all $k$-dimensional subspaces of $\mathbf{R}^n$. Here, one should
interpret $G(k, n)$ as the space of all possible tangent spaces at any given point for
a $k$-dimensional submanifold of $\mathbf{R}^n$. The projection of $v$ onto $\mathbf{R}^n$ should then be
[Figure: a sequence of manifolds $M_\varepsilon$ converging ($\Rightarrow$) to a limit $M$.]
It is intuitively clear that, as ε → 0, this converges to a circle, but the right half has
twice as much “weight” as the left half so that, if we were to describe the limit M
simply as a manifold, we would have lost some information about the convergence of
the surface measures in the process. More dramatically, there are situations where one
has a sequence of smooth manifolds such that the limit is again a smooth manifold,
but with a limiting “tangent space” which has nothing to do with the actual tangent
space of the limit! Indeed, consider the sequence of one-dimensional submanifolds
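of $\mathbf{R}^2$ given by
[Figure: a sequence of one-dimensional submanifolds of $\mathbf{R}^2$, indexed by $\varepsilon$.]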
This time, the limit is a piece of straight line, which is in principle a perfectly nice
smooth submanifold, but the limiting tangent space is deterministic and makes a 45°
angle with the canonical tangent space associated to the limit.
The situation here is philosophically very similar to that of the theory of rough
paths: a subset $M \subset \mathbf{R}^n$ may be sufficiently “rough” so that there is no way of
canonically associating to it either a k-dimensional Riemannian volume element,
or a k-dimensional tangent space, so we simply postulate them. The two examples
given above show that even in situations where M is a nice smooth manifold, it
still makes sense to associate to it a volume element and / or tangent space that are
different from the ones that one would construct canonically. A similar situation
arises in the theory of rough paths. Indeed, it may so happen that X is actually
given by a smooth function. Even so, this does not automatically mean that the
right-hand side of (1.8) is given by the usual Riemann integral of X against itself.
An explicit example illustrating this fact is given in Exercise 2.10 below. Similarly to
The problem with this equation is that, if anything, one has $(\partial_x h)^2 = +\infty$ (a
consequence of the roughness of (1 + 1)-dimensional space-time white noise) and
one would have to compensate with $C = +\infty$. It has instead become customary
to define the solution of the KPZ equation as the logarithm of the solution to the (multiplicative)
stochastic heat equation $\partial_t u = \partial_x^2 u + u\xi$, essentially ignoring the (infinite) Itô-correction
term.³ The so-constructed solutions are called Hopf–Cole solutions and,
to cite J. Quastel [Qua11],
The evidence for the Hopf–Cole solutions is now overwhelming. Whatever the physicists
mean by KPZ, it is them.
³ This requires one of course to know that solutions to $\partial_t u = \partial_x^2 u + u\xi$ stay strictly positive with
probability one, provided $u_0 > 0$ a.s., but this turns out to be the case.
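For a smooth forcing $\xi$ the Hopf–Cole transform is an exercise in the classical chain rule: if $h$ solves $\partial_t h = \partial_x^2 h + (\partial_x h)^2 + \xi$, then $u = e^h$ solves $\partial_t u = \partial_x^2 u + u\xi$, with no correction term. The sketch below checks this by finite differences. The test function $h(t, x) = e^{-t}\sin x$ is an arbitrary illustrative choice, $\xi$ is defined from it so that the first equation holds exactly, and the function names are made up; none of this touches the white-noise case, where the correction discussed above appears.

```python
import math

def h(t, x):
    """A smooth test function (arbitrary illustrative choice)."""
    return math.exp(-t) * math.sin(x)

def xi(t, x):
    """Smooth forcing defined so that h solves d_t h = d_x^2 h + (d_x h)^2 + xi exactly."""
    h_t = -math.exp(-t) * math.sin(x)
    h_x = math.exp(-t) * math.cos(x)
    h_xx = -math.exp(-t) * math.sin(x)
    return h_t - h_xx - h_x ** 2

def u(t, x):
    """Hopf-Cole transform u = exp(h)."""
    return math.exp(h(t, x))

def heat_residual(t, x, d=1e-3):
    """Central-difference approximation of d_t u - d_x^2 u - u * xi."""
    u_t = (u(t + d, x) - u(t - d, x)) / (2.0 * d)
    u_xx = (u(t, x + d) - 2.0 * u(t, x) + u(t, x - d)) / (d * d)
    return u_t - u_xx - u(t, x) * xi(t, x)

if __name__ == "__main__":
    for t, x in [(0.1, -1.0), (0.5, 0.3), (1.0, 2.0)]:
        # The residual is zero up to discretisation error.
        print(t, x, heat_residual(t, x))
```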
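Theorem 1.4. Consider KPZ and $\Phi^4_3$ on a bounded square spatial domain with
periodic boundary conditions. Let $\xi_\varepsilon = \delta_\varepsilon * \xi$ denote the regularisation of space-time
white noise with a compactly supported smooth mollifier $\delta_\varepsilon$ that is scaled by $\varepsilon$ in
the spatial direction(s) and by $\varepsilon^2$ in the time direction. Denote by $h_\varepsilon$ and $\Phi_\varepsilon$ the
solutions to
\[ \begin{aligned}
\partial_t h_\varepsilon &= \partial_x^2 h_\varepsilon + (\partial_x h_\varepsilon)^2 - C_\varepsilon + \xi_\varepsilon , \\
\partial_t \Phi_\varepsilon &= \Delta \Phi_\varepsilon + \tilde C_\varepsilon \Phi_\varepsilon - \Phi_\varepsilon^3 + \xi_\varepsilon .
\end{aligned} \]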
In the case of the KPZ equation, the topology in which one obtains convergence is
that of convergence in probability in a suitable space of space-time Hölder continuous
functions. Let us also emphasise that in this case the resulting renormalised solutions
coincide indeed with the Hopf–Cole solutions.
In the case of the dynamical $\Phi^4_3$ model, convergence takes place instead in some
space of space-time distributions. One caveat that also has to be dealt with in the
latter case is that the limiting process Φ may in principle explode in finite time for
some instances of the driving noise. (Although this is of course not expected.)
Chapters 13 and 14 of this book give a short and mostly self-contained introduction
to the theory of regularity structures and the last chapter shows how it can
be used to provide a robust solution theory for the KPZ equation. The material in
these chapters differs significantly in presentation from the remainder of the book.
Indeed, since a detailed and rigorous exposition of this material would require an
entire book by itself (see the rather lengthy articles [Hai13] and [Hai14b]), we made
a conscious decision to keep the exposition mostly at an intuitive level. We therefore
omit virtually all proofs (with the notable exception of the proof of the reconstruction
theorem, Theorem 13.12, which is the fundamental result on which the theory builds)
and instead merely give short glimpses of the main ideas involved.
Basics: Natural numbers, including zero, are denoted by $\mathbf{N}$, integers by $\mathbf{Z}$, real and
complex numbers are denoted by $\mathbf{R}$ and $\mathbf{C}$, respectively. Strictly positive reals are
denoted by $\mathbf{R}_+$. For $x$ real, $\lfloor x \rfloor$ (resp. $\lceil x \rceil$) is the largest (resp. smallest) integer $n$
such that $n \le x$ (resp. $n \ge x$). We also write $\{x\} \in (0, 1]$ for the non-zero fractional
part so that $x - \{x\} \in \mathbf{Z}$. A $d$-dimensional multi-index is an element $k \in \mathbf{N}^d$, and
given $x \in \mathbf{R}^d$, we write $x^k$ as a shorthand for $x_1^{k_1} \cdots x_d^{k_d}$ and $k!$ as a shorthand for
$k_1! \cdots k_d!$.
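These conventions translate directly into code. The following small sketch uses only the Python standard library; the function names are made up for this illustration. Note in particular that the non-zero fractional part of an integer is $1$, not $0$.

```python
import math

def nonzero_fractional_part(x):
    """{x} in (0, 1] such that x - {x} is an integer, i.e. x - (ceil(x) - 1)."""
    return x - (math.ceil(x) - 1)

def multi_power(x, k):
    """x^k = x_1^{k_1} * ... * x_d^{k_d} for x in R^d and a multi-index k in N^d."""
    return math.prod(xi ** ki for xi, ki in zip(x, k))

def multi_factorial(k):
    """k! = k_1! * ... * k_d! for a multi-index k."""
    return math.prod(math.factorial(ki) for ki in k)

if __name__ == "__main__":
    print(math.floor(-1.5), math.ceil(-1.5))          # largest n <= x, smallest n >= x
    print(nonzero_fractional_part(2.5), nonzero_fractional_part(2))
    print(multi_power((2.0, 3.0), (3, 2)), multi_factorial((3, 2)))
```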
Tensors: We shall deal with paths with values in, as well as maps between, Banach
spaces, typically denoted by V, W , equipped with their respective norms, always
written as | · |. Continuous linear maps from V to W form a Banach space, denoted
by $L(V, W)$. It will be important to consider tensor products of Banach spaces. If
$V, W$ are finite-dimensional, say $V \cong \mathbf{R}^m$ and $W \cong \mathbf{R}^n$, the tensor product $V \otimes W$
can be identified with the matrix space $\mathbf{R}^{m \times n}$. Indeed, if $(e_i : 1 \le i \le m)$ [resp.
$(f_j : 1 \le j \le n)$] is a basis of $V$ [resp. $W$], then $(e_i \otimes f_j : 1 \le i \le m, 1 \le j \le n)$
is a basis of $V \otimes W$. If $V$ and $W$ are Hilbert spaces and $(e_i)$ and $(f_j)$ are orthonormal
bases it is natural to define a Euclidean structure on $V \otimes W$ by declaring the $(e_i \otimes f_j)$
to be orthonormal. This induces a norm on $V \otimes W$, also denoted by $|\cdot|$, which is
compatible in the sense that $|v \otimes w| \le |v| \cdot |w|$ for all $v \in V$, $w \in W$. This tensor
norm is furthermore symmetric, namely $|u \otimes v| = |v \otimes u|$, equivalently expressed as
invariance under transposition $x \mapsto x^T$.
We also introduce the symmetric and antisymmetric parts of $x \in V \otimes V$:
\[ \mathrm{Sym}(x) = \tfrac12 (x + x^T), \qquad \mathrm{Anti}(x) = \tfrac12 (x - x^T) . \]
The defining feature of tensor product spaces is their ability to linearise bilinear
maps,⁴
\[ L^{(2)}(V \times \bar V, W) \cong L\big(V, L(\bar V, W)\big) \cong L(V \otimes \bar V, W) . \tag{1.11} \]
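In finite dimensions all of this reduces to elementary matrix manipulations. The sketch below assumes $V = \bar V = \mathbf{R}^3$ and $W = \mathbf{R}$, and uses nested Python lists with made-up function names. It identifies $v \otimes w$ with the matrix $(v_i w_j)$, forms $\mathrm{Sym}$ and $\mathrm{Anti}$, evaluates the Euclidean tensor norm, and shows how a bilinear map $b(v, w) = \sum_{i,j} a_{ij} v_i w_j$ becomes a linear map on $V \otimes \bar V$.

```python
import math

def tensor(v, w):
    """Elementary tensor v (x) w, identified with the m x n matrix (v_i * w_j)."""
    return [[vi * wj for wj in w] for vi in v]

def transpose(x):
    """Transposition map x -> x^T."""
    return [list(column) for column in zip(*x)]

def combine(x, y, a, b):
    """Entrywise a*x + b*y for matrices of equal shape."""
    return [[a * p + b * q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]

def sym(x):
    return combine(x, transpose(x), 0.5, 0.5)

def anti(x):
    return combine(x, transpose(x), 0.5, -0.5)

def euclidean_norm(x):
    """Norm obtained by declaring the e_i (x) f_j orthonormal."""
    return math.sqrt(sum(entry * entry for row in x for entry in row))

def bilinear(a, v, w):
    """A scalar-valued bilinear map b(v, w) = sum_ij a[i][j] * v_i * w_j."""
    return sum(a[i][j] * v[i] * w[j] for i in range(len(v)) for j in range(len(w)))

def linearised(a, x):
    """The linear map on the tensor product corresponding to b."""
    return sum(a[i][j] * x[i][j] for i in range(len(x)) for j in range(len(x[0])))

if __name__ == "__main__":
    v, w = [1.0, 2.0, -1.0], [0.5, 0.0, 4.0]
    a = [[1.0, 0.0, 2.0], [0.0, -1.0, 0.5], [3.0, 1.0, 0.0]]
    x = tensor(v, w)
    print(bilinear(a, v, w), linearised(a, x))   # the same number, computed in two ways
    print(euclidean_norm(x), math.sqrt(6.0) * math.sqrt(16.25))
    print(sym(x), anti(x))
```

For elementary tensors the Euclidean tensor norm satisfies $|v \otimes w| = |v| \cdot |w|$, so the compatibility inequality holds with equality in this case.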
We briefly discuss the extension to infinite dimensions. Given Banach spaces $V, \bar V$
one completes the algebraic tensor product $V \otimes_a \bar V$ under a compatible tensor norm
to obtain a Banach space $V \otimes \bar V$. By [Rya02, Thm 2.9], the second⁵ identification in
(1.11) requires one to work with the projective tensor norm
\[ |x|_{\mathrm{proj}} \overset{\mathrm{def}}{=} \inf \Big\{ \sum_i |v_i|\, |\bar v_i| \;:\; x = \sum_i v_i \otimes \bar v_i \Big\} , \]
where all sums are finite and $|\cdot|$ stands for either norm in $V$ or $\bar V$. This norm is
obviously compatible and symmetric. Symmetric and antisymmetric parts of
$x \in V \otimes V$ are defined as before (note that the transposition map $V \otimes W \to W \otimes V$
⁴ This will arise naturally, with $\bar V = V$, when pairing the second Fréchet derivatives (of some
$F : V \to W$) with second iterated integrals with values in $V \otimes V$.
⁵ The first identification holds for general Banach spaces.
The spaces $\mathcal{C}^\gamma$ and $\mathcal{C}_b^\gamma$ satisfy the obvious inclusions and continuous embeddings,
respectively. Warning. The $\mathrm{Lip}^\gamma$-spaces frequently seen in the rough path literature
are precisely our $\mathcal{C}_b^\gamma$-spaces for $\gamma \notin \mathbf{N}$ (at least when $V$ is finite-dimensional),
whereas $F \in \mathrm{Lip}^{n+1}$ means $F \in \mathcal{C}^n$ with globally Lipschitz $D^n F$; some authors
also interpret $\mathcal{C}_b^\gamma$-spaces, for integer $\gamma$, in this way.
Path spaces: We say that $X : [0, T] \to V$ is continuously (Fréchet) differentiable if
$X_\bullet = X_0 + \int_0^\bullet \dot X_t\, dt$, for some continuous path $\dot X : [0, T] \to V$, the derivative of $X$.
\[ \|X\|_\alpha \overset{\mathrm{def}}{=} \sup_{s,t \in [0,T]} \frac{|X_{s,t}|}{|t - s|^\alpha} < \infty , \]
where we define the path increment $X_{s,t} \overset{\mathrm{def}}{=} X_t - X_s$ (and also use the convention
$0/0 \overset{\mathrm{def}}{=} 0$); we also write $\delta X$ for the map $(s, t) \mapsto X_t - X_s$. This seminorm fails to
separate constants; the norm on $\mathcal{C}^\alpha$ is then given by $\|X\|_{\mathcal{C}^\alpha} \overset{\mathrm{def}}{=} |X_0| + \|X\|_\alpha$. We write
⁶ One checks that every $F \in \mathcal{C}^1$ is locally Lipschitz (though not necessarily $\mathcal{C}_b^1$ on bounded sets).
Here one works with partitions or dissections of $[0, T]$; since every dissection
$D = \{0 = t_0 < t_1 < \cdots < t_n = T\} \subset [0, T]$ can be thought of as a partition of $[0, T]$
into (essentially) disjoint intervals, $\mathcal{P} = \{[t_{i-1}, t_i] : i = 1, \dots, n\}$, and vice-versa, we
shall use whatever is (notationally) more convenient.
We further recall that $\lim_{|\mathcal{P}| \to 0}$, typically defined via nets, means convergence
along any sequence $(\mathcal{P}_k)$ with mesh $|\mathcal{P}_k| \to 0$, with identical limit along each such
sequence. Here, the mesh $|\mathcal{P}|$ of a partition $\mathcal{P}$ is the length of its largest element, i.e.
$|\mathcal{P}| = \sup_{k \in \{1, \dots, n\}} |t_k - t_{k-1}|$ if $\mathcal{P}$ is as above.
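The correspondence between dissections and partitions, the mesh, and convergence along arbitrary sequences of partitions can be sketched as follows. The smooth example $\int_0^1 t\, d(t^2) = \frac23$, the two families of dissections and the function names are illustrative assumptions. For this example the left-point sums differ from the limit by at most the mesh, whichever family is used.

```python
def intervals(dissection):
    """Turn a dissection t_0 < ... < t_n into the list of intervals [t_{i-1}, t_i]."""
    return list(zip(dissection[:-1], dissection[1:]))

def mesh(dissection):
    """Length of the largest interval of the associated partition."""
    return max(t - s for s, t in intervals(dissection))

def left_point_sum(f, g, dissection):
    """Riemann-Stieltjes sum: sum of f(s) * (g(t) - g(s)) over the intervals [s, t]."""
    return sum(f(s) * (g(t) - g(s)) for s, t in intervals(dissection))

def uniform(n):
    """Uniform dissection of [0, 1] with n intervals."""
    return [k / n for k in range(n + 1)]

def quadratic(n):
    """A non-uniform dissection of [0, 1], finer near 0."""
    return [(k / n) ** 2 for k in range(n + 1)]

if __name__ == "__main__":
    f = lambda t: t
    g = lambda t: t * t
    for n in (10, 100, 1000):
        for d in (uniform(n), quadratic(n)):
            print(n, mesh(d), left_point_sum(f, g, d))
```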
Two parameter spaces: Every $V$-valued path $X$ gives rise to its increment function
$\delta X : (s, t) \mapsto X_{s,t} = X_t - X_s$. More generally, consider $(s, t) \mapsto \Xi_{s,t}$, with some
sort of “on-diagonal” $\beta$-Hölder regularity, formalised by the Banach space $\mathcal{C}_2^\beta$ of
maps $\Xi : [0, T]^2 \to V$ with finite norm,
\[ \|\Xi\|_\beta \overset{\mathrm{def}}{=} \sup_{s,t \in [0,T]} \frac{|\Xi_{s,t}|}{|t - s|^\beta} < \infty . \]
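On sampled data these Hölder-type quantities can be evaluated over a finite grid, which gives a lower bound for the true supremum. In the sketch below the scalar path $X_t = \sqrt{t}$ on $[0, 1]$ and the function names are illustrative assumptions. The path has $\|\delta X\|_{1/2} = 1$, the two-parameter map $\Xi_{s,t} = (X_{s,t})^2$ has $\|\Xi\|_1 = 1$, and for $\alpha > \frac12$ the grid value of $\|\delta X\|_\alpha$ grows as the grid is refined.

```python
import math

def two_parameter_norm(xi, times, beta):
    """Grid version of sup |xi(s, t)| / |t - s|**beta over s != t.
    Assumes xi(s, s) = 0, so that the diagonal contributes 0 (convention 0/0 = 0)."""
    best = 0.0
    for s in times:
        for t in times:
            if s != t:
                best = max(best, abs(xi(s, t)) / abs(t - s) ** beta)
    return best

def increments(path):
    """The increment function delta X : (s, t) -> X_t - X_s of a scalar path."""
    return lambda s, t: path(t) - path(s)

if __name__ == "__main__":
    times = [k / 200 for k in range(201)]
    dX = increments(math.sqrt)
    print(two_parameter_norm(dX, times, 0.5))
    print(two_parameter_norm(lambda s, t: dX(s, t) ** 2, times, 1.0))
    print(two_parameter_norm(dX, times, 0.75))
```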
Unless explicitly otherwise stated, all rough path results in this book are valid in
infinite dimensions. This is rather obvious in the case of Young integration, say with
L(V, W )-valued integrand and V -valued integrator, for general Banach spaces V
and $W$. In the case of rough integration, Section 4.2, of an $L(V, W)$-valued one-form
$F$, against a $V \oplus V^{\otimes 2}$-valued rough path, the pairing of $DF \cdot F$, with values in
$L(V, L(V, W))$, with $V^{\otimes 2}$ is crucial and requires (1.11). As was explained there,
this is guaranteed by equipping $V \otimes V$ with the projective norm which will be our
standing assumption for the rest of this text, unless otherwise stated.
Alternatively, Lyons [Lyo98], [LQ02, pp. 28, 110] or [LCL07, p. 75] adjusts the
notion of $\mathcal{C}^\gamma$-regularity required for $F$ in a way that basically forces $DF \cdot F$ to
take values in $L(V \otimes V, W)$, with the consequence that the regularity condition on
$F$ then depends on the chosen tensor product norm. This modification entails no
changes in subsequent arguments. Of course, there is no difference whatsoever when
$\dim V < \infty$.
The same remarks apply to solving rough differential equations. The Young case
of Section 8.3 is not affected by tensor norms, whereas the typical second order
approximation for RDEs, as e.g. seen in 8.13 later on, immediately points to the
need for a well-defined pairing of the form (1.11). This is ensured by having $V \otimes V$
equipped with the projective norm. Alternatively, and as before, it is possible to
replace the projective norm by weaker compatible norms, but this then forces one to
think more carefully about the necessary modifications on the precise assumptions
on the space of vector fields when solving RDEs. This can be important when
the existence of the $V^{\otimes 2}$-valued rough path in the projective tensor product space
is problematic, as happens in the case of Banach valued Brownian motion (e.g.
[LLQ02] and Exercise 3.5, used e.g. in [IK06, IK07]). See also [Lyo98, Def. 1.2.4]
or [KM17, Proof of Thm 3.3], where $\dim W < \infty$ is noted to be helpful, and [LCL07,
pp. 19–20] and [LQ02, pp. 28, 111] for more information.
Chapter 2
The space of rough paths
We define the space of (Hölder continuous) rough paths, as well as the subspace of
“geometric” rough paths which preserve the usual rules of calculus. The latter can
be interpreted in a natural way as paths with values in a certain nilpotent Lie group.
At the end of the chapter, we give a short discussion showing how these definitions
should be generalised to treat paths of arbitrarily low regularity.
which we assume to hold for every triplet of times $(s, u, t)$. Since $X_{t,t} = 0$, it
immediately follows (take $s = u = t$) that we also have $\mathbb{X}_{t,t} = 0$ for every $t$. As
already mentioned in the introduction, one should think of $\mathbb{X}$ as postulating the value
of the quantity
\[ \int_s^t X_{s,r} \otimes dX_r \overset{\mathrm{def}}{=} \mathbb{X}_{s,t} , \tag{2.2} \]
where we take the right-hand side as a definition for the left-hand side. (And not
the other way around!) We insist (cf. Exercise 2.4 below) that as a consequence
of (2.1), knowledge of the path $t \mapsto (X_{0,t}, \mathbb{X}_{0,t})$ already determines the entire
second order process $\mathbb{X}$. In this sense, the pair $(X, \mathbb{X})$ is indeed a path, and not
some two-parameter object, although it is often more convenient to consider it
as one. If $X$ is a smooth function and we read (2.2) from right to left, then it is
straightforward to verify (see Exercise 2.1 below) that the relation (2.1) does indeed
hold. Furthermore, one can convince oneself that if $f \mapsto \int f\, dX$ denotes any form
of “integration” which is linear in $f$, has the property that $\int_s^t dX_r = X_{s,t}$, and is
such that $\int_s^t f(r)\, dX_r + \int_t^u f(r)\, dX_r = \int_s^u f(r)\, dX_r$ for any admissible integrand
$f$, and if we use such a notion of “integral” to define $\mathbb{X}$ via (2.2), then (2.1) does
automatically hold. This makes it a very natural postulate in our setting.
Note that the algebraic relations (2.1) are by themselves not sufficient to determine
$\mathbb{X}$ as a function of $X$. Indeed, for any $V \otimes V$-valued function $F$, the substitution
$\mathbb{X}_{s,t} \mapsto \mathbb{X}_{s,t} + F_t - F_s$ leaves the left-hand side of (2.1) invariant. We will see later
on how one should interpret such a substitution. It remains to discuss what are the
natural analytical conditions one should impose for $\mathbb{X}$. We are going to assume that
the path $X$ itself is $\alpha$-Hölder continuous, so that $|X_{s,t}| \lesssim |t - s|^\alpha$. The archetype of
an $\alpha$-Hölder continuous function is one which is self-similar with index $\alpha$, so that
$X_{\lambda s, \lambda t} \sim \lambda^\alpha X_{s,t}$.
(We intentionally do not give any mathematical definition of self-similarity here,
just think of $\sim$ as having the vague meaning of “looks like”.) Given (2.2), it is then
very natural to expect $\mathbb{X}$ to also be self-similar, but with $\mathbb{X}_{\lambda s, \lambda t} \sim \lambda^{2\alpha} \mathbb{X}_{s,t}$. This
discussion motivates the following definition of our basic spaces of rough paths.
Definition 2.1. For $\alpha \in (\frac13, \frac12]$, define the space of $\alpha$-Hölder rough paths (over $V$),
in symbols $\mathscr{C}^\alpha([0, T], V)$, as those pairs $(X, \mathbb{X}) =: \mathbf{X}$ such that
this way.¹ We have the strict inclusion $\mathscr{L}(C^\infty) \subset \mathscr{C}^\infty$, the class of smooth rough
paths,² by which we mean a genuine rough path with the additional property that the
$V$-valued (resp. $V \otimes V$-valued) maps $X$ and $\mathbb{X}_{s,\bullet}$ are smooth, for every basepoint $s$.
For instance, $\mathbf{X} \equiv (0, 0)$ is the trivial canonical rough path associated to the scalar
zero path, as opposed to the smooth “pure second level” rough path (over $\mathbb{R}$) given
by $(s, t) \mapsto (0, t - s)$; see also Exercise 2.10 for a natural example with $\dim V > 1$.
Remark 2.2. Any scalar path $X \in C^\alpha$ can be lifted to a rough path (over $\mathbb{R}$), simply
by setting $\mathbb{X}_{s,t} := (X_{s,t})^2/2$. However, for a vector-valued path $X \in C^\alpha$, with
values in some Banach space $V$, it is far from obvious that one can find suitable
“second order increments” $\mathbb{X}$ such that $X$ lifts to a rough path $(X, \mathbb{X}) \in \mathscr{C}^\alpha$. The
Lyons–Victoir extension theorem (Exercise 2.14) asserts that this can always be done,
even in a continuous fashion, provided that $1/\alpha \notin \mathbb{N}$, which means $\alpha \in (\frac13, \frac12)$ in our
present discussion. (A counterexample for $\alpha = \frac12$ is hinted at in Exercise 2.13.) The
reader may wonder how this continuity property dovetails with Proposition 1.1. The
point is that if we define $X \mapsto \mathbb{X}$ by an application of the Lyons–Victoir extension
theorem, this map restricted to smooth paths does in general not coincide with the
Riemann–Stieltjes integral of $X$ against itself.
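As a quick sanity check of the scalar lift in Remark 2.2 (a worked computation added for illustration; it uses only additivity of increments, $X_{s,t} = X_{s,u} + X_{u,t}$, and the form of Chen's relation (2.1) used throughout this chapter), one can expand the square:

```latex
\begin{align*}
\mathbb{X}_{s,t} - \mathbb{X}_{s,u} - \mathbb{X}_{u,t}
  &= \tfrac12 (X_{s,u} + X_{u,t})^2 - \tfrac12 X_{s,u}^2 - \tfrac12 X_{u,t}^2 \\
  &= X_{s,u} \, X_{u,t} ,
\end{align*}
```

which is (2.1) in the scalar case, where $\otimes$ is ordinary multiplication. The analytic bound is equally direct: $|\mathbb{X}_{s,t}| = \tfrac12 |X_{s,t}|^2 \le \tfrac12 \|X\|_\alpha^2 |t - s|^{2\alpha}$.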
If one ignores the nonlinear constraint (2.1), the quantities defined in (2.3) suggest
to think of $(X, \mathbb{X})$ as an element of the Banach space $C^\alpha \oplus C_2^{2\alpha}$ with (semi-)norm
$\|X\|_\alpha + \|\mathbb{X}\|_{2\alpha}$ (which vanishes when $X$ is constant and $\mathbb{X} \equiv 0$). However, taking
into account (2.1) we see that $\mathscr{C}^\alpha$ is not a linear space, although it is a closed subset
of the aforementioned Banach space; see Exercise 2.7. We will need (some sort of) a
norm and metric on $\mathscr{C}^\alpha$. The induced “natural” norm on $\mathscr{C}^\alpha$ given by $\|X\|_\alpha + \|\mathbb{X}\|_{2\alpha}$
fails to respect the structure of (2.1) which is homogeneous with respect to a natural
dilation on $\mathscr{C}^\alpha$, given by $\delta_\lambda : (X, \mathbb{X}) \mapsto (\lambda X, \lambda^2 \mathbb{X})$. This suggests to introduce the
$\alpha$-Hölder homogeneous rough path norm
\[ |||\mathbf{X}|||_\alpha \stackrel{\text{def}}{=} \|X\|_\alpha + \sqrt{\|\mathbb{X}\|_{2\alpha}} , \tag{2.4} \]
which, although not a norm in the usual sense of normed linear spaces, is a very
adequate concept for the rough path $\mathbf{X} = (X, \mathbb{X})$. On the other hand, (2.3) leads to a
natural notion of rough path metric (and then rough path topology).
¹ We note immediately that “smooth” can be replaced by “sufficiently smooth”, such as $C^1$ and
even $C^\alpha$, with $\alpha > 1/2$, in view of Young integration, Section 4.1.
² We deviate here from the early rough path literature, including [LQ02], where smooth rough paths
meant canonical rough paths. Instead, we are aligned with the terminology of regularity structures,
where (canonical, smooth) models generalise the corresponding notions of rough paths.
The perhaps easiest way to show convergence with respect to this rough path
metric is based on interpolation: in essence, it is enough to establish pointwise
convergence together with uniform “rough path” bounds of the form (2.3); see
Exercise 2.9. Let us also note that C α ([0, T ], V ) endowed with this distance is a
complete metric space; the reader is asked to work out the details in Exercise 2.7.
We conclude this part with two important remarks. First, we can ask ourselves up
to which point the relations (2.1) are already sufficient to determine $\mathbb{X}$. Assume that
we can associate to a given function $X$ two different second order processes $\mathbb{X}$ and
$\bar{\mathbb{X}}$, and set $G_{s,t} = \mathbb{X}_{s,t} - \bar{\mathbb{X}}_{s,t}$. It then follows immediately from (2.1) that
\[ G_{s,t} = G_{s,u} + G_{u,t} , \]
so that in particular $G_{s,t} = G_{0,t} - G_{0,s}$. Since, conversely, we already noted that
setting $\bar{\mathbb{X}}_{s,t} = \mathbb{X}_{s,t} + F_t - F_s$ for an arbitrary continuous function $F$ does not change
the left-hand side of (2.1), we conclude that $\mathbb{X}$ is in general determined only up to
the increments of some function $F \in C^{2\alpha}(V \otimes V)$. The choice of $F$ does usually
matter and there is in general no obvious canonical choice.
The second remark is that this construction can possibly be useful only if $\alpha \le \frac12$.
Indeed, if $\alpha > \frac12$, then a canonical choice of $\mathbb{X}$ is given by reading (2.2) from
right to left and interpreting the left-hand side as a simple Young integral [You36].
Furthermore, it is clear in this case that $\mathbb{X}$ must be unique, since any additional
increment should be $2\alpha$-Hölder continuous by (2.3), which is of course only possible
if $\alpha \le \frac12$ (a function that is Hölder continuous with exponent $2\alpha > 1$ is constant).
Let us stress once more however that this is not to say that $\mathbb{X}$ is uniquely
determined by $X$ if the latter is smooth, when it is interpreted as an element of $\mathscr{C}^\alpha$
for some $\alpha \le \frac12$. Indeed, if $\alpha \le \frac12$, $F$ is any $2\alpha$-Hölder continuous function with
values in $V \otimes V$ and $\mathbb{X}_{s,t} = F_t - F_s$, then the path $(0, \mathbb{X})$ is a perfectly “legal”
element of $\mathscr{C}^\alpha$, even though one cannot get any smoother than the function 0. The
impact of perturbing $\mathbb{X}$ by some $F \in C^{2\alpha}$ in the context of integration is considered
in Example 4.14 below. In Chapter 5, we shall use this for a pathwise understanding
of how exactly Itô and Stratonovich integrals differ.
³ As was already emphasised, $\mathscr{C}^\alpha$ is not a linear space but is naturally embedded in the Banach
space $C^\alpha \oplus C_2^{2\alpha}$ (cf. Exercise 2.7); the (inhomogeneous) rough path metric is then essentially the
induced metric. While this may not appear intrinsic (the situation is somewhat similar to using the
(restricted) Euclidean metric on $\mathbb{R}^3$ on the 2-sphere), the ultimate justification is that the Itô map
will turn out to be locally Lipschitz continuous in this metric.
Remark 2.5. There are some simple variations on the definition of a rough path, and
it can be very helpful to switch from one view-point to the other. (The analytic
conditions are not affected by this.)
a) From the “full increment” view point one has $(X, \mathbb{X}) : [0, T]^2 \to V \oplus V^{\otimes 2}$,
$(s, t) \mapsto (X_{s,t}, \mathbb{X}_{s,t})$ subject to the “full” Chen relation
$\mathbf{X}_s^{-1} \otimes \mathbf{X}_t =: \mathbf{X}_{s,t} =: (X_{s,t}, \mathbb{X}_{s,t})$.
Chen’s relation (2.5) is nothing but the trivial identity $\mathbf{X}_{s,u} \otimes \mathbf{X}_{u,t} = \mathbf{X}_{s,t}$ so that
any such group-valued path $\mathbf{X}$ induces an increment map $(X, \mathbb{X})$, of the form
discussed in a). Conversely, such increments determine $\mathbf{X}$ modulo constants as
seen from $\mathbf{X}_t = \mathbf{X}_0 \otimes \mathbf{X}_{0,t}$. If we restrict to $\mathbf{X}_0 = \mathbf{1} = (1, 0, 0)$, or identify
paths $\mathbf{X}, \tilde{\mathbf{X}}$ for which $\tilde{\mathbf{X}} \otimes \mathbf{X}^{-1}$ is constant, then there is no difference. (Such
a “base-point free” object corresponds to “fat” $\Pi$ in the theory of regularity
structures and induces a model $(\Pi, \Gamma)$ in great generality.)
c) Our Definition 2.1 is a compromise in the sense that we want to start from a
familiar object, namely a path $X : [0, T] \to V$, together with minimal second
level increment information to define (in Section 4.2) the prototypical rough
integral $\int F(X) \, d(X, \mathbb{X})$. From the “increment” view point, we have thus supplied
more than necessary (namely $X_0$), whereas from the “full path” view point,
we have supplied $\mathbf{X}$, with $\mathbf{X}_0 = (1, X_0, *)$ specified on the first level only. (Of
course, this affects in no way the second level increments $\mathbb{X}_{s,t}$.)
While (2.1) does capture the most basic (additivity) property that one expects any
decent theory of integration to respect, it does not imply any form of integration by
parts / chain rule. Now, if one looks for a first order calculus setting, such as is valid
in the context of smooth paths or the Stratonovich stochastic calculus, then for any
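pair $e_i^*, e_j^*$ of elements in $V^*$, writing $X_t^i = e_i^*(X_t)$ and $\mathbb{X}^{ij}_{s,t} = (e_i^* \otimes e_j^*)(\mathbb{X}_{s,t})$, one
would expect to have the identity
\begin{align*}
\mathbb{X}^{ij}_{s,t} + \mathbb{X}^{ji}_{s,t}
  \text{ “=” } & \int_s^t X^i_{s,r} \, dX^j_r + \int_s^t X^j_{s,r} \, dX^i_r \\
  = {} & \int_s^t d(X^i X^j)_r - X^i_s X^j_{s,t} - X^j_s X^i_{s,t} \\
  = {} & (X^i X^j)_{s,t} - X^i_s X^j_{s,t} - X^j_s X^i_{s,t} \\
  = {} & X^i_{s,t} X^j_{s,t} ,
\end{align*}
so that the symmetric part of $\mathbb{X}$ is determined by $X$. In other words, for all times $s, t$
we have the “first order calculus” condition
\[ \mathrm{Sym}(\mathbb{X}_{s,t}) = \tfrac12 X_{s,t} \otimes X_{s,t} . \tag{2.6} \]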
However, if we take $X$ to be an $n$-dimensional Brownian path and define $\mathbb{X}$ by Itô
integration, then (2.1) still holds, but (2.6) certainly does not.
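To make this failure concrete, here is the standard computation, given as a sketch under the assumption that $B$ is a standard $n$-dimensional Brownian motion and with the notation $\mathbb{B}^{\text{Itô}}_{s,t} := \int_s^t B_{s,r} \otimes dB_r$ (Itô sense), which is introduced here for illustration only. Itô's product rule carries an extra quadratic variation term:

```latex
\begin{align*}
d(B^i B^j)_r &= B^i_r \, dB^j_r + B^j_r \, dB^i_r + \delta^{ij} \, dr , \\
\mathbb{B}^{\text{Itô};ij}_{s,t} + \mathbb{B}^{\text{Itô};ji}_{s,t}
  &= B^i_{s,t} B^j_{s,t} - \delta^{ij} (t - s) , \\
\mathrm{Sym}\big(\mathbb{B}^{\text{Itô}}_{s,t}\big)
  &= \tfrac12 B_{s,t} \otimes B_{s,t} - \tfrac12 (t - s) \, I .
\end{align*}
```

The correction $-\tfrac12 (t - s) I$ is an increment $F_t - F_s$ of the function $F_t = -\tfrac12 t I$, so it does not disturb (2.1), but it is exactly what prevents (2.6) from holding.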
There are two natural ways to define a set of “geometric” rough paths for which
(2.6) holds. On the one hand, we can define the space of weakly geometric ($\alpha$-Hölder)
rough paths,
\[ \mathscr{C}_g^\alpha([0, T], V) \subset \mathscr{C}^\alpha([0, T], V) , \]
by stipulating that $(X, \mathbb{X}) \in \mathscr{C}_g^\alpha$ if and only if $(X, \mathbb{X}) \in \mathscr{C}^\alpha$ and (2.6) holds as
equality in $V \otimes V$, for every $s, t \in [0, T]$. Note that $\mathscr{C}_g^\alpha$ is a closed subset of $\mathscr{C}^\alpha$.
On the other hand, we have already seen that every smooth path can be lifted
canonically to an element in $\mathscr{L}(C^\infty) \subset \mathscr{C}^\alpha$ by reading the definition (2.2) from
right to left. This choice of $\mathbb{X}$ then obviously satisfies (2.6) and we can define the
space of geometric ($\alpha$-Hölder) rough paths,
below. The situation is similar to the classical situation of the set of α-Hölder
continuous functions being strictly larger than the closure of smooth functions under
the α-Hölder norm. (Or the set of bounded measurable functions being strictly larger
than C, the closure of smooth functions under the supremum norm.) In practice, at
least when $\dim V < \infty$, the distinction between weakly and “genuinely” geometric
rough paths rarely matters for the following reason: similar to classical Hölder spaces,
one has the converse inclusion $\mathscr{C}_g^\beta \subset \mathscr{C}_g^{0,\alpha}$ whenever $\beta > \alpha$, see Proposition 2.8
below and also Exercise 2.12. For this reason, we will often casually speak of
“geometric rough paths”, even when we mean weakly geometric rough paths. (There
is no confusion in precise statements when we write $\mathscr{C}_g^{0,\alpha}$ or $\mathscr{C}_g^\alpha$.) Let us finally
mention that non-geometric rough paths can always be embedded in a space of
geometric rough paths at the expense of adding new components; in the present
(level-2) setting this can be accomplished in terms of a rough path bracket, see
Exercise 2.11 and also Section 5.3.
2.3 Rough paths as Lie group valued paths
We now present a very fruitful view of rough paths, taken over a Banach space $V$.
Consider $X : [0, T] \to V$, $\mathbb{X} : [0, T]^2 \to V^{\otimes 2}$ subject to (2.1) and define, with
$X_{s,t} = X_t - X_s$ as usual,
The space $T^{(2)}(V)$ is itself a Banach space, with the norm of an element $(a, b, c)$
given by $|a| + |b| + |c|$, where in abusive notation $|\cdot|$ stands for any of the norms
in $\mathbb{R}$, $V$ and $V \otimes V$, the norm on the latter assumed compatible and symmetric, cf.
Section 1.4. More interestingly for our purposes, this space is a Banach algebra,
non-commutative when $\dim V > 1$ and with unit element $(1, 0, 0)$, when endowed
with the product
We call T (2) (V ) the step-2 truncated tensor algebra over V . This multiplicative
structure is very well adapted to our needs since Chen’s relation (2.1), combined
with the obvious identity Xs,t = Xs,u + Xu,t , takes the elegant form
⁴ The Lie group $T_1^{(2)}(V)$ is finite-dimensional if and only if $\dim V < \infty$.
Identifying $1, b, c$ with elements $(1, 0, 0), (0, b, 0), (0, 0, c) \in T^{(2)}(V)$, we may
write $(1, b, c) = 1 + b + c$. The resulting calculus is familiar from formal power series
in non-commuting indeterminates. For instance, the usual power series
$(1 + x)^{-1} = 1 - x + x^2 - \dots$ leads to, omitting tensors of order 3 and higher,
\[ (1 + b + c)^{-1} = 1 - (b + c) + (b + c) \otimes (b + c) = 1 - b - c + b \otimes b , \]
Having identified $T_1^{(2)}(V)$ as the natural state space of (step-2) rough paths, we now
equip it with a homogeneous, symmetric and subadditive norm. For $x = (1, b, c)$,
\[ |||x||| \stackrel{\text{def}}{=} \tfrac12 \big( N(x) + N(x^{-1}) \big) \quad \text{with} \quad N(x) = \max\big\{ |b|, \sqrt{2|c|} \big\} , \tag{2.10} \]
noting $|||\delta_\lambda x||| = |\lambda| \, |||x|||$, homogeneity with respect to dilation, and
$|||x \otimes x'||| \le |||x||| + |||x'|||$, a consequence of subadditivity for $N(\cdot)$ which requires a short argument
left to the reader. It is clear that
\[ d(x, x') \stackrel{\text{def}}{=} |||x^{-1} \otimes x'||| \]
defines a bona fide (left-invariant) metric on the group $T_1^{(2)}(V)$. Important for us, the
graded Hölder regularity (2.3) of $\mathbf{X} = (X, \mathbb{X})$, part of the definition of a rough path,
can now be condensed to demand the “metric” Hölder seminorm
\[ \sup_{s \ne t \in [0,T]} \frac{d(\mathbf{X}_s, \mathbf{X}_t)}{|t - s|^\alpha} \asymp \|X\|_\alpha + \sqrt{\|\mathbb{X}\|_{2\alpha}} = |||\mathbf{X}|||_{\alpha;[0,T]} \tag{2.11} \]
to be finite.
The usual power series and / or basic Lie group theory suggest to define
\[ \log(1 + b + c) \stackrel{\text{def}}{=} b + c - \tfrac12 b \otimes b , \tag{2.12} \]
\[ \exp(b + c) \stackrel{\text{def}}{=} 1 + b + c + \tfrac12 b \otimes b , \tag{2.13} \]
which allow us to identify $T_0^{(2)}(V) \cong V \oplus V^{\otimes 2}$ with $T_1^{(2)}(V) = \exp(V \oplus V^{\otimes 2})$.
The following Lie bracket makes $T_0^{(2)}(V)$ a Lie algebra. For $b, b' \in V$, $c, c' \in V^{\otimes 2}$,
\[ [b + c, b' + c'] \stackrel{\text{def}}{=} b \otimes b' - b' \otimes b , \]
and $T_0^{(2)}(V)$ is step-2 nilpotent in the sense that all iterated brackets of length 2 vanish.
Define $\mathfrak{g}^{(2)}(V) \subset T_0^{(2)}(V)$ as the closed Lie subalgebra generated by $V \subset T_0^{(2)}(V)$,
explicitly given by
called the free step-2 nilpotent Lie algebra over $V$. Note that in finite dimensions, say
$V = \mathbb{R}^d$, the closing procedure is unnecessary and $[V, V]$ is nothing but the space
of antisymmetric $d \times d$ matrices, with linear basis $([e_i, e_j] : 1 \le i < j \le d)$, where
$(e_i : 1 \le i \le d)$ denotes the standard basis of $\mathbb{R}^d$. Thanks to step-2 nilpotency, one
checks by hand the Baker–Campbell–Hausdorff formula
The image of $\mathfrak{g}^{(2)}$ under the exponential map then defines a closed Lie subgroup,
\[ G^{(2)}(V) \stackrel{\text{def}}{=} \exp\big( \mathfrak{g}^{(2)}(V) \big) \subset T_1^{(2)}(V) , \]
called the free step-2 nilpotent group over $V$. These considerations provide us with
an elegant characterisation of weakly geometric rough paths. (The proof is immediate
from the previous proposition and rewriting (2.6) as $\mathbb{X}_{s,t} - \tfrac12 X_{s,t} \otimes X_{s,t} \in [V, V]$.)
Proposition 2.7 (Weakly geometric case).
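a) Assume $(X, \mathbb{X}) \in \mathscr{C}_g^\alpha([0, T], V)$. Then the path $t \mapsto \mathbf{X}_t = (1, X_{0,t}, \mathbb{X}_{0,t})$, with
values in $G^{(2)}(V)$, is $\alpha$-Hölder continuous (with respect to the metric $d$).
b) Conversely, if $[0, T] \ni t \mapsto \mathbf{X}_t$ is a $G^{(2)}(V)$-valued and $\alpha$-Hölder continuous
path, then $(X, \mathbb{X}) \in \mathscr{C}_g^\alpha([0, T], V)$ with $(1, X_{s,t}, \mathbb{X}_{s,t}) := \mathbf{X}_s^{-1} \otimes \mathbf{X}_t$.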
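The group operations of this section are easy to experiment with numerically. The following is a minimal sketch, assuming $V = \mathbb{R}^d$, with the Euclidean norm on $\mathbb{R}^d$ and the Frobenius norm on $\mathbb{R}^d \otimes \mathbb{R}^d$ chosen here as one compatible, symmetric pair of norms. An element $(1, b, c)$ is stored as the pair `(b, c)`, the product used is $(1, b, c) \otimes (1, b', c') = (1, b + b', c + c' + b \otimes b')$, and all function names (`t2_mul`, `t2_inv`, and so on) are hypothetical helpers, not notation from the text.

```python
import math

def tensor(b1, b2):
    """Outer product b1 (x) b2 of two vectors, as a d x d nested list."""
    return [[x * y for y in b2] for x in b1]

def t2_mul(x, y):
    """Group product (1,b,c) (x) (1,b',c') = (1, b+b', c+c'+b(x)b')."""
    (b1, c1), (b2, c2) = x, y
    d = len(b1)
    bb = tensor(b1, b2)
    b = [b1[i] + b2[i] for i in range(d)]
    c = [[c1[i][j] + c2[i][j] + bb[i][j] for j in range(d)] for i in range(d)]
    return (b, c)

def t2_inv(x):
    """Inverse (1,b,c)^{-1} = (1, -b, -c + b(x)b), tensors of order 3 dropped."""
    b, c = x
    d = len(b)
    bb = tensor(b, b)
    return ([-v for v in b],
            [[-c[i][j] + bb[i][j] for j in range(d)] for i in range(d)])

def t2_exp(v):
    """exp(v) = (1, v, v(x)v/2) for v in V, cf. (2.13) with c = 0."""
    d = len(v)
    vv = tensor(v, v)
    return (list(v), [[0.5 * vv[i][j] for j in range(d)] for i in range(d)])

def t2_log(x):
    """log(1+b+c) = b + (c - b(x)b/2), cf. (2.12); returned as a pair."""
    b, c = x
    d = len(b)
    bb = tensor(b, b)
    return (list(b), [[c[i][j] - 0.5 * bb[i][j] for j in range(d)] for i in range(d)])

def t2_dilate(lam, x):
    """Dilation delta_lambda (1,b,c) = (1, lam*b, lam^2*c)."""
    b, c = x
    return ([lam * v for v in b], [[lam * lam * v for v in row] for row in c])

def t2_norm(x):
    """Homogeneous norm (2.10); Euclidean / Frobenius norms are an assumption."""
    def n_fun(z):
        b, c = z
        nb = math.sqrt(sum(v * v for v in b))
        nc = math.sqrt(sum(v * v for row in c for v in row))
        return max(nb, math.sqrt(2.0 * nc))
    return 0.5 * (n_fun(x) + n_fun(t2_inv(x)))

# Two straight segments, first along e_1 and then along e_2.
x = t2_mul(t2_exp([1.0, 0.0]), t2_exp([0.0, 1.0]))
log_x = t2_log(x)   # second component is antisymmetric, i.e. lies in [V, V]
```

In this example the second component of `log_x` is the antisymmetric matrix $\tfrac12 [e_1, e_2]$, so `x` lies in $G^{(2)}(\mathbb{R}^2)$, in line with the characterisation of weakly geometric rough paths above.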
It is clear from the discussion in Section 2.2 that any sufficiently smooth path, say
$\gamma \in C^1([0, 1], V)$, produces an element in $G^{(2)}(V)$ by iterated integration, namely
\[ S^{(2)}(\gamma) = \Big( 1, \int_0^1 d\gamma(t), \int_0^1 \int_0^t d\gamma(s) \otimes d\gamma(t) \Big) \in G^{(2)}(V) . \]
The map $S^{(2)}$, which maps (sufficiently regular) paths on a fixed interval, here $[0, 1]$,
into the above collection of tensors is known as the step-2 signature map. We note in
passing that Chen’s relation here has the pretty interpretation that the signature map
is a morphism from the space of paths, equipped with concatenation product, to
the tensor algebra. The inclusion $S^{(2)}(C^1) \subset G^{(2)}$ becomes an equality in finite
dimensions,
\[ \{ S^{(2)}(\gamma) : \gamma \in C^1([0, 1], \mathbb{R}^d) \} = G^{(2)}(\mathbb{R}^d) . \tag{2.14} \]
To see this, fix $b + c \in \mathfrak{g}^{(2)}(\mathbb{R}^d)$ and try to find finitely many, say $n$, affine linear
paths $\gamma_i$, with each signature determined by the direction $\gamma_i(1) - \gamma_i(0) = v_i \in \mathbb{R}^d$,
so that
\[ \exp(v_1) \otimes \dots \otimes \exp(v_n) = \exp(b + c) . \]
Properly applied, the Baker–Campbell–Hausdorff formula allows to “break up”
the exponential $\exp\big( \sum_i b^i e_i + \sum_{j,k} c^{jk} [e_j, e_k] \big)$. In conjunction with the identity
\[ \|x\|_C \stackrel{\text{def}}{=} \inf\Big\{ \int_0^1 |\dot\gamma(t)| \, dt : \gamma \in C^1([0, 1], \mathbb{R}^d) , \ S^{(2)}(\gamma) = x \Big\} , \tag{2.15} \]
which, despite its dependence on the dimension $d$, is sufficient for many practical
purposes. As a useful application, we now state an approximation result for weakly
geometric rough paths over $\mathbb{R}^d$. With the preparations made, the interested reader will
have no trouble providing a full proof for
with uniform rough path bound $\sup_{n \ge 1} |||X^n, \mathbb{X}^n|||_\beta \lesssim |||X, \mathbb{X}|||_\beta$. By interpolation,
convergence holds in $\mathscr{C}^\alpha$, for any $\alpha < \beta$.
Remark 2.9. By definition, every geometric rough path $\mathbf{X} \in \mathscr{C}_g^{0,\beta}$ is the limit of
canonical rough path lifts $(X^n, \mathbb{X}^n) = \mathbf{X}^n$; trivially then, $|||\mathbf{X}^n|||_\beta \to |||\mathbf{X}|||_\beta$. This is
not true for a generic weakly geometric rough path $\mathbf{X} \in \mathscr{C}_g^\beta$. However, the above
proposition supplies approximations $(\mathbf{X}^n)$, which converge uniformly with uniform
rough path bounds. In such a case, $|||\mathbf{X}|||_\beta \le \liminf_{n \to \infty} |||\mathbf{X}^n|||_\beta$ and this can be strict.
This lower-semicontinuous behaviour of the rough path norm is reminiscent of norms
on Hilbert spaces under weak convergence and led to the terminology of “weakly”
geometric rough paths.
The interpretation given above gives a strong hint on how to construct geometric
rough paths with $\alpha$-Hölder regularity for $\alpha \le \frac13$: setting $N = \lfloor 1/\alpha \rfloor$, one defines the
step-$N$ truncated tensor algebra over a Banach space $V$
\[ T^{(N)}(V) \stackrel{\text{def}}{=} \bigoplus_{n=0}^{N} V^{\otimes n} , \]
with the natural convention that $V^{\otimes 0} = \mathbb{R}$. The product in $T^{(N)}(V)$ is simply the
tensor product $\otimes$, but we truncate it in a natural way by postulating that $a \otimes b = 0$ for
$a \in V^{\otimes k}$, $b \in V^{\otimes \ell}$ with $k + \ell > N$. A homogeneous, symmetric and subadditive
norm which generalises (2.10) to the step-$N$ case is given by
\[ |||x||| \stackrel{\text{def}}{=} \tfrac12 \big( N(x) + N(x^{-1}) \big) \quad \text{with} \quad N(x) = \max_{n=1,\dots,N} \big( n! \, |x^n| \big)^{1/n} , \tag{2.17} \]
where every $x = (1, x^1, \dots, x^N) \in T_1^{(N)}(V)$, element with scalar component 1,
is invertible, and where $|\cdot|$ denotes any of the tensor norms on $V^{\otimes n}$, assumed
compatible and symmetric (permutation invariant).⁶
Proposition 2.6 suggests the naïve definition of an $\alpha$-Hölder rough path over
$V$ as a path $\mathbf{X}$, on $[0, T]$ say, with values in the group $T_1^{(N)}(V)$ which is $\alpha$-Hölder
continuous with respect to $d(x, x') = |||x^{-1} \otimes x'|||$. Modulo knowledge of $\mathbf{X}_0$ this is
equivalent to a multiplicative map $(s, t) \mapsto \mathbf{X}_{s,t} \in T_1^{(N)}(V)$, multiplicative in the
sense that Chen’s relation holds,
\[ \mathbf{X}_{s,t} = \mathbf{X}_{s,u} \otimes \mathbf{X}_{u,t} , \tag{2.18} \]
for every triplet of times $(s, u, t)$, and with graded Hölder regularity,
\[ |\mathbf{X}^n_{s,t}| \lesssim |t - s|^{n\alpha} , \quad n = 1, \dots, N . \]
⁶ The definitions from Section 1.4 for $N = 2$ extend easily to $N > 2$, see also [LCL07, Def 1.25].
|Xns,t | ≲ |t − s|nα , n = 1, . . . , N .
Here, again multiplicative means validity of Chen’s relation as spelled out in (2.18)
above.
We now assume, for notational convenience, $V = \mathbb{R}^d$, which allows us to
think of components of some fixed rough path increment $\mathbf{X}_{s,t} \in T_1^{(N)}(\mathbb{R}^d)$ as being
indexed by words $w$ of length at most $N$ with letters in the alphabet $\{1, \dots, d\}$.
Similarly to before, given a word $w = w_1 \cdots w_n$, the corresponding component $\mathbf{X}^w$,
which we also write as $\langle \mathbf{X}, w \rangle$, is then interpreted as the $n$-fold integral
\[ \langle \mathbf{X}_{s,t}, w \rangle = \int_s^t \int_s^{s_n} \cdots \int_s^{s_2} dX^{w_1}_{s_1} \cdots dX^{w_n}_{s_n} , \tag{2.19} \]
and $|||\mathbf{X}_{s,t}||| \lesssim |t - s|^\alpha$ is equivalent to, for all words with length $|w| \le \lfloor 1/\alpha \rfloor$,
In order to describe the constraints imposed on these iterated integrals by the chain
rule, we define the shuffle product between two words as the formal sum over all
possible ways of interleaving them. For example, one has
We already remarked earlier, that in the (weakly) geometric case, the assumed
chain rule (now in the form of (2.21)) allows to reduce such expressions to linear
combinations of iterated integrals. In general, one should define a rough path as the
enhancement of a path X with additional functions that are interpreted as the various
formal expressions that can be formed by the two operations “multiplication” and
“integration against X”. The resulting algebraic construction is more involved and
gives rise to the concept of branched rough path X due to Gubinelli [Gub10]. The
terminology comes from the fact that the natural way of indexing the components
of such an object is no longer given by words, but by labelled trees, as suggested
in (2.22) above with labels i, j, k ∈ {1, . . . , d}. As detailed in [Gub10], see also
[HK15, BCFP19], branched rough paths take values in the character group of the
Connes–Kreimer Hopf algebra of trees [CK00], also known as the Butcher group
[But72]. A concise description of the branched rough path regularity via explicit
homogeneous subadditive norms on this Lie group, similar to (2.17), can be found in
[TZ18], cf. also [HS90].
2.5 Exercises
where $\Delta^{(3)}_{s,t} = \{u : s < u_1 < u_2 < u_3 < t\}$ and $T((V)) \stackrel{\text{def}}{=} \prod_{k=0}^{\infty} V^{\otimes k}$ is the
space of tensor series over $V$, equipped with the obvious algebra structure (cf.
Section 2.4). Show that the following general form of Chen’s relation holds:
\[ \mathbf{X}_{s,t} = \mathbf{X}_{s,u} \otimes \mathbf{X}_{u,t} . \]
The element $\mathbf{X}_{s,t} \in T((V))$ is known as the signature of $X$ on the interval $[s, t]$.
c) Show that the indefinite signature $S := \mathbf{X}_{0,\bullet}$ solves the linear differential equation
\[ dS = S \otimes dX , \quad S_0 = 1 . \]
We will see later (Exercises 4.6 and 8.9) that the signature can be defined for every
rough path.
Hint: For point (b), it suffices to consider the projection of $\mathbf{X}_{s,t}$ to $V^{\otimes n}$, for an
arbitrary integer $n$, given by the $n$-fold integral of $dX_{u_1} \otimes \cdots \otimes dX_{u_n}$ over the
simplex $\{s < u_1 < \cdots < u_n < t\}$.
♯ Exercise 2.2 (Shuffle) Let $V = \mathbb{R}^d$. As discussed in (2.19), the collection $\mathbf{X}_{s,t}$ of
all iterated integrals over a fixed interval $[s, t]$ can also be viewed as
\[ \big\{ \mathbf{X}^w_{s,t} = \langle \mathbf{X}_{s,t}, w \rangle : w \text{ word on } A \big\} , \]
with alphabet $A = \{1, \dots, d\}$, where we recall that a word on $A$ is a finite sequence
of elements of $A$, including the empty sequence $\varnothing$, called the empty word. By
convention, $\mathbf{X}^{\varnothing}_{s,t} = 1$. Write $uv$ for the concatenation of two words $u$ and $v$, and
accordingly $ui$ for attaching a letter $i \in A$ to the right of $u$. The linear span of such
words (which can be identified with polynomials in $d$ non-commuting indeterminates)
carries an important commutative product known as the shuffle product $\sqcup\!\sqcup$. It is defined
recursively by requiring $\varnothing$ to be the neutral element, i.e. $u \sqcup\!\sqcup \varnothing = \varnothing \sqcup\!\sqcup u = u$, and
then
\[ ui \sqcup\!\sqcup vj = (u \sqcup\!\sqcup vj) \, i + (ui \sqcup\!\sqcup v) \, j . \]
Let $\mathbf{X}_{s,t}$ be the signature of a smooth path $X$, as given in (2.23). Show that, for all
words $u, v$,
\[ \langle \mathbf{X}_{s,t}, u \sqcup\!\sqcup v \rangle = \langle \mathbf{X}_{s,t}, u \rangle \langle \mathbf{X}_{s,t}, v \rangle . \tag{2.24} \]
The case of single letter words $u = i$, $v = j$ gives $i \sqcup\!\sqcup j = ij + ji$ and expresses
precisely the product rule from calculus, which leads us to the level-2 geometricity
condition (2.6).
Hint: Proceed by induction in joint length: express $\langle \mathbf{X}_{s,t}, ui \rangle \langle \mathbf{X}_{s,t}, vj \rangle$ by the product
rule as an integral over $[s, t]$ and use the hypothesis for words of joint length
$|u| + |v| + 1 < |ui| + |vj|$.
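The recursive definition of the shuffle product translates directly into code, and the identity (2.24) can then be tested numerically. The sketch below is illustrative only: words are represented as tuples of letters, `shuffle` implements the recursion $ui \sqcup\!\sqcup vj = (u \sqcup\!\sqcup vj)\,i + (ui \sqcup\!\sqcup v)\,j$, and the test path is a concatenation of two straight segments in $\mathbb{R}^2$, whose signature is obtained from Chen's relation and the fact that a straight segment with increment $v$ has $\langle \exp(v), w \rangle = v_{w_1} \cdots v_{w_n} / n!$. The helper names `linear_sig` and `two_segment_sig` are hypothetical.

```python
from collections import Counter
from math import factorial

def shuffle(u, v):
    """Shuffle product of two words (tuples) as a Counter: word -> multiplicity."""
    if not u:
        return Counter({v: 1})
    if not v:
        return Counter({u: 1})
    out = Counter()
    # (u' shuffle v) followed by the last letter of u
    for w, m in shuffle(u[:-1], v).items():
        out[w + (u[-1],)] += m
    # (u shuffle v') followed by the last letter of v
    for w, m in shuffle(u, v[:-1]).items():
        out[w + (v[-1],)] += m
    return out

def linear_sig(v, w):
    """<exp(v), w> for a straight segment with increment v (letters are 1-based)."""
    p = 1.0
    for letter in w:
        p *= v[letter - 1]
    return p / factorial(len(w))

def two_segment_sig(v1, v2, w):
    """Signature of 'segment v1, then segment v2', via Chen's relation."""
    return sum(linear_sig(v1, w[:k]) * linear_sig(v2, w[k:])
               for k in range(len(w) + 1))

v1, v2 = [0.3, -1.2], [0.7, 0.4]
u, v = (1, 2), (2, 1, 1)
lhs = sum(m * two_segment_sig(v1, v2, w) for w, m in shuffle(u, v).items())
rhs = two_segment_sig(v1, v2, u) * two_segment_sig(v1, v2, v)
print(lhs, rhs)  # the two numbers should agree up to rounding
```

For single letters, `shuffle((1,), (2,))` returns the two words `(1, 2)` and `(2, 1)` with multiplicity one each, which is the relation $i \sqcup\!\sqcup j = ij + ji$ noted above.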
∗ Exercise 2.3 Call a tensor series $x \in T((\mathbb{R}^d))$ group-like, in symbols $x \in G((\mathbb{R}^d))$,
if for all words $u, v$,
\[ \langle x, u \sqcup\!\sqcup v \rangle = \langle x, u \rangle \langle x, v \rangle . \tag{2.25} \]
An element in $T((\mathbb{R}^d))$ is called a Lie series if, for all $N \in \mathbb{N}$, its projection to
$T^{(N)} = T^{(N)}(\mathbb{R}^d)$ is a Lie polynomial, i.e. an element of $\mathfrak{g}^{(N)}$, which was defined
in Section 2.4 as the Lie algebra generated by $\mathbb{R}^d \subset T_0^{(N)}$. Given $x \in T((\mathbb{R}^d))$, show
that $x$ is group-like, i.e. $x \in G((\mathbb{R}^d))$, if and only if $\log x$ is a Lie series.
♯ Exercise 2.4
a) It is common to define the $(V \otimes V)$-valued map $\mathbb{X}$ on $\Delta_{0,T} := \{(s, t) : 0 \le s \le t \le T\}$
rather than $[0, T]^2$. There is no difference however: if $\mathbb{X}_{s,t}$ is only
defined for $s \le t$, show that the relation (2.1) implies
b) In fact, show that knowledge of the path $t \mapsto (X_{0,t}, \mathbb{X}_{0,t})$ already determines
the entire second order process $\mathbb{X}$. In this sense $(X, \mathbb{X})$ is indeed a path, and not
some two-parameter object, cf. Remark 2.5.
c) Specialise to the case of geometric rough paths and show the identity $\mathbb{X}_{t,s} = \mathbb{X}_{s,t}^T$,
where $(\dots)^T$ denotes the transpose. (When $\dim V = 1$, so that $\mathbb{X}$ is scalar
valued, this is a trivial consequence of $\mathbb{X}_{s,t} = X_{s,t}^2/2$.)
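Exercise 2.5 Consider $s \equiv \tau_0 < \tau_1 < \cdots < \tau_N \equiv t$. Show that (2.1) implies
\begin{align*}
\mathbb{X}_{s,t} &= \sum_{0 \le i < N} \mathbb{X}_{\tau_i,\tau_{i+1}} + \sum_{0 \le j < i < N} X_{\tau_j,\tau_{j+1}} \otimes X_{\tau_i,\tau_{i+1}} \\
 &= \sum_{i=0}^{N-1} \big( \mathbb{X}_{\tau_i,\tau_{i+1}} + X_{s,\tau_i} \otimes X_{\tau_i,\tau_{i+1}} \big) . \tag{2.26}
\end{align*}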
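Formula (2.26) is also a practical recipe: it assembles $\mathbb{X}_{s,t}$ from data on the pieces of a dissection. The following sketch uses it for a piecewise linear path in $\mathbb{R}^d$, under the assumption that each straight piece carries its canonical second level $\tfrac12 \, \Delta X \otimes \Delta X$, and then checks Chen's relation (2.1) and the geometricity condition (2.6) numerically. The function names and the sample points are illustrative choices, not taken from the text.

```python
def increments(points):
    """Increments X_{tau_i, tau_{i+1}} of the piecewise linear path through `points`."""
    return [[q[k] - p[k] for k in range(len(p))]
            for p, q in zip(points, points[1:])]

def lift(points, i0, i1):
    """Return (X_{s,t}, XX_{s,t}) for s = tau_{i0}, t = tau_{i1}, using (2.26)."""
    d = len(points[0])
    x = [0.0] * d                          # running increment X_{s, tau_i}
    xx = [[0.0] * d for _ in range(d)]     # running second level
    for dx in increments(points[i0:i1 + 1]):
        for a in range(d):
            for b in range(d):
                # XX on the straight piece, plus the cross term X_{s,tau_i} (x) dX
                xx[a][b] += 0.5 * dx[a] * dx[b] + x[a] * dx[b]
        x = [x[a] + dx[a] for a in range(d)]
    return x, xx

points = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0), (-1.0, 3.0), (0.5, 0.5)]
s, u, t = 0, 2, 4
x_st, xx_st = lift(points, s, t)
x_su, xx_su = lift(points, s, u)
x_ut, xx_ut = lift(points, u, t)

# Defect in Chen's relation (2.1); every entry should vanish up to rounding.
chen_defect = [[xx_st[a][b] - xx_su[a][b] - xx_ut[a][b] - x_su[a] * x_ut[b]
                for b in range(2)] for a in range(2)]
```

The same loop shows why knowledge of $\mathbb{X}$ on arbitrarily fine pieces, together with the first level increments, determines $\mathbb{X}_{s,t}$.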
Together with Exercise 2.7, b), this shows that Cg0,α is Polish.
b) Show that the closure of smooth rough paths,
$C^\alpha$ in the respective topologies (we could even take $\alpha = 1$), we have more than
enough to assert that every lifted smooth path, $(X, \int X \otimes dX)$, is the limit in $\mathscr{C}^\alpha$ of
lifted paths in $\Lambda$. It is then easy to see that every limit point of lifted smooth paths is
also the limit of lifted paths in $\Lambda$.
♯ Exercise 2.9 (Interpolation) Assume that $\mathbf{X}^n \in \mathscr{C}^\beta$, for $1/3 < \alpha < \beta$, with
uniform bounds
and uniform convergence $X^n_{s,t} \to X_{s,t}$ and $\mathbb{X}^n_{s,t} \to \mathbb{X}_{s,t}$, i.e. uniformly over $s, t \in [0, T]$.
Show that this implies $\mathbf{X} \in \mathscr{C}^\beta$ and $\mathbf{X}^n \to \mathbf{X}$ in $\mathscr{C}^\alpha$. Show furthermore that
the assumption of uniform convergence can be weakened to pointwise convergence:
\[ \forall t \in [0, T] : \quad X^n_{0,t} \to X_{0,t} \quad \text{and} \quad \mathbb{X}^n_{0,t} \to \mathbb{X}_{0,t} . \]
Solution. Using the uniform bounds and pointwise convergence, there exists $C$ such
that uniformly in $s, t$
\[ |X_{s,t}| = \lim_n |X^n_{s,t}| \le C |t - s|^\beta , \qquad |\mathbb{X}_{s,t}| = \lim_n |\mathbb{X}^n_{s,t}| \le C |t - s|^{2\beta} . \]
♯ Exercise 2.10 (Pure area rough path) Identify $\mathbb{R}^2$ with the complex numbers and
consider
\[ [0, 1] \ni t \mapsto n^{-1} \exp(2\pi i n^2 t) \equiv X^n . \]
a) Set $\mathbb{X}^n_{s,t} := \int_s^t X^n_{s,r} \otimes dX^n_r$. Show that, for fixed $s < t$,
\[ X^n_{s,t} \to 0 , \qquad \mathbb{X}^n_{s,t} \to \pi (t - s) \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} . \tag{2.27} \]
b) Establish the uniform bounds $\sup_n \|X^n\|_{1/2} < \infty$ and $\sup_n \|\mathbb{X}^n\|_1 < \infty$.
\[ \mathbb{X}^n_{s,t} = \tfrac12 X^n_{s,t} \otimes X^n_{s,t} + A^n_{s,t} = O(1/n^2) + A^n_{s,t} \]
where $A^n_{s,t} \in \mathfrak{so}(2)$ is the antisymmetric part of $\mathbb{X}^n_{s,t}$. To avoid cumbersome
notation, we identify
\[ \begin{pmatrix} 0 & a \\ -a & 0 \end{pmatrix} \in \mathfrak{so}(2) \leftrightarrow a \in \mathbb{R} . \]
$A^n_{s,t}$ then represents the signed area between the curve $(X^n_r : s \le r \le t)$ and
the straight chord from $X^n_t$ to $X^n_s$. (This is a simple consequence of Stokes’
theorem: the exterior derivative of the 1-form $\frac12 (x \, dy - y \, dx)$, which vanishes
along straight chords, is the volume form $dx \wedge dy$.) With $s < t$, $(X^n_r : s \le r \le t)$
makes $\lfloor n^2 (t - s) \rfloor$ full spins around the origin, at radius $1/n$. Each full spin
contributes area $\pi (1/n)^2$, while the final incomplete spin contributes some area
less than $\pi (1/n)^2$. The total signed area, with multiplicity, is thus
\[ A^n_{s,t} = \big( n^2 (t - s) + O(1) \big) \frac{\pi}{n^2} = \pi (t - s) + \frac{C_{s,t}}{n^2} , \]
where $|C_{s,t}| \le \pi$ uniformly in $s, t$. It follows that
\[ \mathbb{X}^n_{s,t} = \pi (t - s) \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} + O(1/n^2) . \tag{2.28} \]
b) The following two estimates for path increments of $n^{-1} \exp(2\pi i n^2 t) \equiv X^n_t$
hold true:
\[ |X^n_{s,t}| \le \|\dot X^n\|_\infty |t - s| = 2\pi n |t - s| , \qquad |X^n_{s,t}| \le 2 \|X^n\|_\infty = 2/n . \]
Since $a \wedge b \le \sqrt{ab}$, it immediately follows that
\[ |X^n_{s,t}| \le \sqrt{4\pi |t - s|} , \]
uniformly in $n, s, t$. In other words, $\sup_n \|X^n\|_{1/2} < \infty$. The argument for the
uniform bounds on $\mathbb{X}^n_{s,t}$ is similar. On the one hand, we have the bound (2.28).
On the other hand, we also have
\[ |\mathbb{X}^n_{s,t}| = \Big| \iint_{s<u<v<t} \dot X^n_u \otimes \dot X^n_v \, du \, dv \Big| \le \|\dot X^n\|_\infty^2 \frac{|t - s|^2}{2} = 2\pi^2 n^2 |t - s|^2 . \]
The required uniform bound on $\|\mathbb{X}^n\|_1$ follows by using (2.28) for $n^2 |t - s| > 1$
and the above bound for $n^2 |t - s| \le 1$.
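The convergence (2.27) can be observed numerically. The sketch below is illustrative only: it replaces $X^n$ on $[s, t]$ by a fine piecewise linear interpolation (400 points per spin, an arbitrary choice), accumulates the exact iterated integrals of that interpolation via Chen's relation, and returns the resulting $2 \times 2$ matrix. The names `spinning_path` and `spinning_second_level` are hypothetical.

```python
import cmath
import math

def spinning_path(n, t):
    """X^n_t = exp(2*pi*i*n^2*t) / n, returned as a point (x, y) in R^2."""
    z = cmath.exp(2j * math.pi * n * n * t) / n
    return z.real, z.imag

def spinning_second_level(n, s, t, steps_per_spin=400):
    """Second level of a piecewise linear interpolation of X^n over [s, t]."""
    steps = max(1, int(math.ceil(n * n * (t - s) * steps_per_spin)))
    x0, y0 = spinning_path(n, s)
    px, py = x0, y0
    xx = [[0.0, 0.0], [0.0, 0.0]]
    for k in range(1, steps + 1):
        qx, qy = spinning_path(n, s + (t - s) * k / steps)
        inc = (px - x0, py - y0)   # increment from time s to the current node
        d = (qx - px, qy - py)     # increment over the current straight piece
        for i in range(2):
            for j in range(2):
                xx[i][j] += inc[i] * d[j] + 0.5 * d[i] * d[j]
        px, py = qx, qy
    return xx

xx = spinning_second_level(10, 0.0, 1.0)
area = 0.5 * (xx[0][1] - xx[1][0])   # antisymmetric part, identified with a real number
print(area, math.pi)                 # area is close to pi * (t - s) = pi
```

For $s = 0$, $t = 1$ and $n = 10$ the path completes exactly $n^2 = 100$ full spins, so the only error in `area` comes from the polygonal interpolation; for general $s < t$ one expects a further deviation of size $O(1/n^2)$, in line with (2.28).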
Show that $2 \, \delta H = (\delta X)^{\otimes 2} - 2 \, \mathrm{Sym}(\mathbb{X}) =: [\mathbf{X}]$, called the bracket of the rough path
$\mathbf{X}$, further studied in Section 5.3.
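Exercise 2.12 (Vanishing Hölder oscillation) a) Let $X \in C^\alpha([0, T], V)$ with
Hölder exponent $\alpha \in (0, 1]$. Define the space of Hölder paths with “vanishing
Hölder oscillation”,
\[ C^{\mathrm{van},\alpha} \stackrel{\text{def}}{=} \Big\{ X \in C^\alpha : \sup_{s,t : |t - s| < \varepsilon} \frac{|X_{s,t}|}{|t - s|^\alpha} \to 0 \text{ as } \varepsilon \to 0 \Big\} . \]
Show that for $\alpha \in (0, 1)$ we have $C^{\mathrm{van},\alpha} = C^{0,\alpha}$, the closure of smooth paths
in $C^\alpha$. (For $\alpha = 1$ this fails, why?) Show by explicit example that the inclusion
$C^{0,\alpha} \subset C^\alpha$ is strict. (Hint: consider the function $t \mapsto t^\alpha$.)
b) Let $\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}_g^\alpha([0, T], V)$ with $\alpha \in (\frac13, \frac12]$. Define the space of Hölder
rough paths with “vanishing Hölder oscillation”,
\[ \mathscr{C}_g^{\mathrm{van},\alpha} \stackrel{\text{def}}{=} \Big\{ \mathbf{X} \in \mathscr{C}_g^\alpha : \sup_{|t - s| < \varepsilon} \frac{|X_{s,t}|}{|t - s|^\alpha} + \sup_{|t - s| < \varepsilon} \frac{|\mathbb{X}_{s,t}|}{|t - s|^{2\alpha}} \to 0 \text{ as } \varepsilon \to 0 \Big\} . \]
i) Show the inclusions $\mathscr{C}_g^{0,\alpha} \subset \mathscr{C}_g^{\mathrm{van},\alpha}$ and also $\mathscr{C}_g^\beta \subset \mathscr{C}_g^{\mathrm{van},\alpha}$, whenever
$\alpha < \beta$. Show that the inclusion $\mathscr{C}_g^{\mathrm{van},\alpha} \subset \mathscr{C}_g^\alpha$ is strict.
ii) Assume $\dim V < \infty$ from here on. Show $\mathscr{C}_g^{0,\alpha} = \mathscr{C}_g^{\mathrm{van},\alpha}$. (Hint: use the
“geodesic” approximations from Proposition 2.8.)
iii) From ii) we have $\mathscr{C}_g^\beta \subset \mathscr{C}_g^{0,\alpha} \subset \mathscr{C}_g^\alpha$, whenever $\frac13 < \alpha < \beta \le \frac12$. Show that
one has the compact embedding (Hint: Arzelà–Ascoli)
\[ \mathscr{C}_g^\beta \hookrightarrow \mathscr{C}_g^{0,\alpha} . \]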
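Solution. We use $\mathscr{C}_g^{0,\alpha} \subset \mathscr{C}_g^{\mathrm{van},\alpha}$, Exercise 2.12, for $\alpha = 1/2$. Consider a dissection
$\{s = \tau_0 < \tau_1 < \cdots < \tau_N = t\}$ with mesh $\le \varepsilon$. It follows from Chen’s relation (2.1),
in the form (2.26),
\[ \Big| \mathbb{X}_{s,t} - \sum_{0 \le i < N} X_{s,\tau_i} \otimes X_{\tau_i,\tau_{i+1}} \Big| = \Big| \sum_{0 \le i < N} \mathbb{X}_{\tau_i,\tau_{i+1}} \Big| \le C(\varepsilon) \sum_{0 \le i < N} |\tau_{i+1} - \tau_i|^{2\alpha} \le T \, C(\varepsilon) . \]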
♯∗ Exercise 2.14 (Lyons–Victoir extension [LV07]) Let α ∈ (0, 1/2) and consider
X ∈ C α ([0, T ], L(V, W )), Y ∈ C α ([0, T ], V ) and Z ∈ C22α ([0, T ], W ). We omit
[0, T ] and the precise target space in what follows. We here say that Chen’s relation
holds if, for every triple of times (s, t, u),
(Y, X) 7→ Z := Φ(Y, X)
1
an ⩽ 2−(1−2α) an−1 + ∥Y ∥α ∥X∥α ,
2
so that the sequence (an ) is bounded since 1 − 2α > 0. In fact, one easily obtains
the bound
sup |an | ≲ ∥Y ∥α ∥X∥α ,
n⩾0
with proportionality constant only depending on α < 1/2. This implies the estimate
(⋆) and also settles continuity of Φ = Φ(Y, X). It remains to show that t 7→ Z0,t ∈ C β
whenever Y, X ∈ C β and β ∈ (1/2, 1). But this is an immediate consequence of the
bound
|Z0,t − Z0,s | ⩽ |Zs,t | + |X0,s | · |Xs,t |,
noting that, thanks to the first part of the theorem, |Zs,t | ≲ |t − s|2α for all 2α < 1.
direction h is given by
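\[ T_h(\mathbf{X}) \stackrel{\text{def}}{=} \big( X^h, \mathbb{X}^h \big) , \]
where $X^h := X + h$ and
\[ \mathbb{X}^h_{s,t} := \mathbb{X}_{s,t} + \int_s^t h_{s,r} \otimes dX_r + \int_s^t X_{s,r} \otimes dh_r + \int_s^t h_{s,r} \otimes dh_r . \tag{2.29} \]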
a) Assume $h \in C^1$. (In particular, the last three integrals above are well-defined
Riemann–Stieltjes integrals.) Show that for fixed $h$, the translation operator
$T_h : \mathbf{X} \mapsto T_h(\mathbf{X})$ is a continuous map from $\mathscr{C}^\alpha$ into itself.
b) By convention, $h \in C^1$ means Lipschitz or equivalently $h \in W^{1,\infty}$, where $W^{1,q}$
denotes the space of absolutely continuous paths $h$ with derivative $\dot h \in L^q$.
Weaken the assumption on $h$ by only requiring $\dot h \in L^q$, for suitable $q = q(\alpha)$.
Show that $q = 2$ (“Cameron–Martin paths of Brownian motion”) works for all
$\alpha \le 1/2$. (As a matter of fact, the integrals appearing in (2.29) make sense for
every $q \ge 1$, but the resulting translated “rough path” falls out of the class of
Hölder rough paths. One can resolve this issue by switching to $(1/\alpha)$-variation
rough paths.)
c) Call any $\mathbf{h} = (h, H) : [0, T] \to \mathbb{R}^d \oplus (\mathbb{R}^d)^{\otimes 2} = T_0^{(2)}$, with $h \in W^{1,2}$ and
$H \in C^{2\alpha}$, an admissible perturbation. With some notational overloading ($T$ is
also used for the second order translation introduced in Exercise 2.11), show that
\[ T_{\mathbf{h}} := T_h \circ T_H = T_H \circ T_h \]
is a well-defined action on $\mathscr{C}^\alpha$, in the sense of $T_{\mathbf{g}} \circ T_{\mathbf{h}} = T_{\mathbf{g} + \mathbf{h}}$. Show that for any
fixed $(a, b) \in T_0^{(2)}$, the constant speed perturbation $t \mapsto (at, bt)$ is admissible,
which then yields an action of $T_0^{(2)}$ with its additive structure on $\mathscr{C}^\alpha$. Show that
these statements remain true for $\mathscr{C}_g^\alpha$ provided admissible perturbations take
values in the Lie algebra $\mathfrak{g}^{(2)} = \mathbb{R}^d \oplus \mathfrak{so}(d)$ as introduced in Section 2.3.
2.6 Comments
Many early works in stochastic analysis starting from Itô (and then in no particular
order Kunita, Yamato, Sugita, Azencott, Ben Arous [BA89], etc) and in control theory
(Magnus, Brocket, Sussmann, Fliess [FNC82], etc) have recognised the importance
of iterated integrals of the driving noise / signal; many references are given [Lyo98]
and the books [LQ02, LCL07, FV10b].
The notion of rough path is due to Lyons and was introduced in [Lyo98] in p-
variation sense, p ∈ [1, ∞), and over Banach spaces. Earlier notes [Lyo94, Lyo95]
already dealt with α-Hölder rough paths for $\alpha \in (\tfrac13, \tfrac12]$.
The analytical aspects of rough paths are related to Young’s seminal work
[You36], revisited in Chapter 4. On the algebraic side, Chen’s relation is rooted
in [Che54, Che57] and encodes abstractly basic additivity properties of iterated
integrals. A key observation of Chen [Che57, Che58] was that log signatures are
Lie series, the description via shuffles (cf. Section 2.4) is due to Ree [Ree58] (see
also [Che71]). It follows from the works of Chow and Rashevskii [Cho39, Ras38],
also [Che57, Che58], that this map is, upon truncation, onto: for every element
$x \in G^{(N)}(\mathbb{R}^d) := \exp(\mathfrak{g}^{(N)}(\mathbb{R}^d))$ there exists a smooth path $\gamma : [0, 1] \to \mathbb{R}^d$
with prescribed signature $x = S^{(N)}(\gamma)$. The shortest such path can be viewed as a
sub-Riemannian geodesic; concatenation of such geodesics is then a natural way to
approximate weakly geometric rough paths (cf. Proposition 2.8) and underlies the
geometric approach of Friz–Victoir [FV05, FV10b], surveyed from a sub-Riemannian
perspective in [FG16a]. The polynomial nature of (truncated) shuffle relations
and log Lie conditions recently led Améndola, Friz and Sturmfels [AFS19] to the
study of signature varieties in computational algebraic geometry.
Up to equivalence under a generalised notion of reparameterisation of paths known
as treelike equivalence, the “full” signature map $\gamma \mapsto S(\gamma) \in G((V)) \subset T((V))$ was
shown to be injective by Chen [Che58] in the case of piecewise smooth paths, Hambly–
Lyons [HL10] in the case of rectifiable paths, and Boedihardjo et al. [BGLY16] in the case of
weakly geometric rough paths of arbitrarily low regularity; see also Boedihardjo, Ni
and Qian [BNQ14]. The inversion problem “signature ↦ path” is studied by Lyons–
Xu [LX17, LX18] and [AFS19]. All this is part of the mathematical justification of
the signature method in machine learning, see e.g. Lyons’ ICM article [Lyo14] and
the survey [CK16].
For some constructions of level-2 geometric rough paths motivated from harmonic
analysis see Hara–Lyons [HL07] and Lyons–Yang [LY13]; see also the comments in
Section 3.8 for some martingale constructions related to harmonic analysis. Lyons–
Qian, in their monograph [LQ02] work with geometric rough paths (over a Banach
space V ), per definition limits of canonically lifted smooth paths. The strict inclusion
“geometric ⊂ weakly geometric” was somewhat blurred in the earlier rough paths
literature. For dim V < ∞, matters were clarified in [FV06a]. For a discussion
of weakly geometric rough paths over Banach spaces in their own right, see e.g.
[CDLL16]; see also the supplementary appendix [BGLY15] of [BGLY16]. The
discussion in Section 2.4, the “shuffle” view on weakly geometric rough paths and
then Gubinelli’s branched rough paths [Gub10], also extends from V = Rd to infinite
dimension but setting up basis-independent notations is somewhat more involved.
See for example [CW16, CCHS20] for some recent results in this direction.
“Naïve” higher order non-geometric rough paths with values in $T_1^{(N)}(V)$ are
called in [Lyo98] multiplicative functionals (with α-Hölder or p-variation regularity,
⌊p⌋ = N ), insisting on their inability to handle nonlinearities when N ≥ 3. The
notion of branched rough path, for any α ∈ (0, 1], further studied in [HK15, FZ18,
BCFP19, BC19, TZ18] provides the required extra information when N ≥ 3; for
N = ⌊1/α⌋ = 2 there is no difference. It is possible to embed spaces of non-
geometric rough paths of low regularity into suitable spaces of geometric rough
paths, see [LV06] or Exercise 2.11 part c) when N = 2. The case of very low
regularities, with N large, is much more involved and studied by Hairer–Kelly
[HK15] and later Boedihardjo–Chevyrev [BC19].
Rough paths with jumps, in p-variation scale, are studied in [Wil01, FS17, FZ18,
CF19]; previously introduced discrete rough paths [Kel16] are also accommodated, e.g.
by the càdlàg rough path setting of [FZ18]. See also the comments in Sections 4.8, 5.6
and 9.6. Rough paths in a geometric ambient space have been studied by Cass, Driver,
Litterer and Lyons in [CLL12, CDL15], see also Bailleul [Bai19] for rough paths on
Banach manifolds.
Chapter 3
Brownian motion as a rough path
In this chapter, we consider the most important example of a rough path, which is the
one associated to Brownian motion. We discuss the difference, at the level of rough
paths, between Itô and Stratonovich Brownian motion. We also provide a natural
example of approximation to Brownian motion which converges to neither of them.
The integration here is understood either in Itô or Stratonovich sense (in the latter
case, we would write ◦dB); sometimes we indicate this by writing BItô resp. BStrat . It
should be noted that the antisymmetric part of B, also known as Lévy’s stochastic
area, with values in so(d), is not affected by the choice of stochastic integration.
Condition (2.1) is seen to be valid with either choice, while condition (2.6) only
holds in the Stratonovich case. We now address the question of α- resp. 2α-Hölder
regularity of X resp. X by a suitable extension of the classical Kolmogorov criterion;
the application to Brownian motion is then carried out in detail in the following
subsection.
Recalling that B ∈ C α ([0, T ], Rd ), a.s. for any α < 1/2, we now address the
question of 2α-Hölder regularity for B.
Theorem 3.1 (Kolmogorov criterion for rough paths). Let q ≥ 2, β > 1/q.
Assume, for all s, t in [0, T],
$$|X_{s,t}|_{L^q} \le C|t-s|^{\beta}, \qquad |\mathbb{X}_{s,t}|_{L^{q/2}} \le C|t-s|^{2\beta}, \tag{3.2}$$
for some constant C < ∞. Then, for all α ∈ [0, β − 1/q), there exists a modification
of $(X, \mathbb{X})$ (also denoted by $(X, \mathbb{X})$) and random variables $K_\alpha \in L^q$, $\mathbb{K}_\alpha \in L^{q/2}$
such that, for all s, t in [0, T],
$$|X_{s,t}| \le K_\alpha(\omega)|t-s|^{\alpha}, \qquad |\mathbb{X}_{s,t}| \le \mathbb{K}_\alpha(\omega)|t-s|^{2\alpha}. \tag{3.3}$$
Proof. The proof is almost the same as the classical proof of Kolmogorov’s continuity
criterion, as exposed for example in [RY99]. Without loss of generality take T = 1
and let $D_n$ denote the set of integer multiples of $2^{-n}$ in $[0, 1)$. As in the usual
criterion, it suffices to consider $s, t \in \bigcup_n D_n$, with the values at the remaining times
filled in using continuity. (This is why in general one ends up with a modification.)
Note that the number of elements in $D_n$ is given by $\# D_n = 1/|D_n| = 2^n$. Set
$$K_n = \sup_{t \in D_n} |X_{t,t+2^{-n}}|, \qquad \mathbb{K}_n = \sup_{t \in D_n} |\mathbb{X}_{t,t+2^{-n}}|.$$
$$E(K_n^q) \le E \sum_{t \in D_n} |X_{t,t+2^{-n}}|^q \le \frac{1}{|D_n|}\, C^q |D_n|^{\beta q} = C^q |D_n|^{\beta q - 1},$$
where (τi , τi+1 ) ∈ Dn for some n ≥ m + 1, and for each fixed n ≥ m + 1 there are
at most two such intervals taken from Dn . In this context, such a type of multiscale
decomposition is sometimes called a “chaining argument”. It follows that
$$|X_{s,t}| \le \max_{0 \le i < N} |X_{s,\tau_{i+1}}| \le \sum_{i=0}^{N-1} |X_{\tau_i,\tau_{i+1}}| \le 2 \sum_{n \ge m+1} K_n,$$
and similarly,
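$$\begin{aligned}
|\mathbb{X}_{s,t}| &= \Big|\sum_{i=0}^{N-1} \big(\mathbb{X}_{\tau_i,\tau_{i+1}} + X_{s,\tau_i} \otimes X_{\tau_i,\tau_{i+1}}\big)\Big| \le \sum_{i=0}^{N-1} \big(|\mathbb{X}_{\tau_i,\tau_{i+1}}| + |X_{s,\tau_i}|\,|X_{\tau_i,\tau_{i+1}}|\big) \\
&\le \sum_{i=0}^{N-1} |\mathbb{X}_{\tau_i,\tau_{i+1}}| + \max_{0 \le i < N} |X_{s,\tau_{i+1}}| \sum_{j=0}^{N-1} |X_{\tau_j,\tau_{j+1}}| \\
&\le 2 \sum_{n \ge m+1} \mathbb{K}_n + \Big(2 \sum_{n \ge m+1} K_n\Big)^2.
\end{aligned}$$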
We thus obtain
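$$\frac{|X_{s,t}|}{|t-s|^{\alpha}} \le \sum_{n \ge m+1} \frac{1}{|D_{m+1}|^{\alpha}}\, 2K_n \le \sum_{n \ge m+1} \frac{2K_n}{|D_n|^{\alpha}} \le K_\alpha,$$
where $K_\alpha := 2\sum_{n \ge 0} K_n / |D_n|^{\alpha}$ is in $L^q$. Indeed, since α < β − 1/q by assumption
and $|D_n|$ to any positive power is summable, we have
$$\|K_\alpha\|_{L^q} \le \sum_{n \ge 0} \frac{2}{|D_n|^{\alpha}}\, |E(K_n^q)|^{1/q} \le \sum_{n \ge 0} \frac{2C}{|D_n|^{\alpha}}\, |D_n|^{\beta - 1/q} < \infty.$$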
Similarly,
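$$\frac{|\mathbb{X}_{s,t}|}{|t-s|^{2\alpha}} \le \sum_{n \ge m+1} \frac{1}{|D_{m+1}|^{2\alpha}}\, 2\mathbb{K}_n + \Big(\sum_{n \ge m+1} \frac{1}{|D_{m+1}|^{\alpha}}\, 2K_n\Big)^2 \le \mathbb{K}_\alpha + K_\alpha^2,$$
where $\mathbb{K}_\alpha := 2\sum_{n \ge 0} \mathbb{K}_n / |D_n|^{2\alpha}$ is in $L^{q/2}$. Indeed,
$$\|\mathbb{K}_\alpha\|_{L^{q/2}} \le \sum_{n \ge 0} \frac{2}{|D_n|^{2\alpha}}\, \big|E\big(\mathbb{K}_n^{q/2}\big)\big|^{2/q} \le \sum_{n \ge 0} \frac{2C}{|D_n|^{2\alpha}}\, |D_n|^{2\beta - 2/q} < \infty.$$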
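The multiscale decomposition used in the proof can be made concrete. The following Python sketch is not part of the original text; the function name `dyadic_chain` and the greedy selection rule are illustrative assumptions. For dyadic s < t it splits [s, t) into dyadic intervals taken from $D_n$ with n ≥ m + 1, where $|D_{m+1}| < t - s \le |D_m|$, so that one can check on examples that each level n contributes at most two intervals.

```python
from collections import Counter
from fractions import Fraction


def dyadic_chain(s, t):
    """Partition [s, t) into dyadic intervals [u, u + 2^-n) with n >= m + 1.

    Returns (m, pieces), where 2^-(m+1) < t - s <= 2^-m and pieces is a list
    of (left endpoint u, level n). Assumes s < t are dyadic rationals in [0, 1].
    """
    s, t = Fraction(s), Fraction(t)
    assert 0 <= s < t <= 1
    m = 0
    while Fraction(1, 2 ** (m + 1)) >= t - s:
        m += 1
    pieces, u = [], s
    while u < t:
        n = m + 1
        # Greedy choice (an assumption of this sketch): the longest interval of
        # level >= m + 1 whose left endpoint u is a multiple of 2^-n and which
        # stays inside [s, t).
        while (u * 2 ** n).denominator != 1 or u + Fraction(1, 2 ** n) > t:
            n += 1
        pieces.append((u, n))
        u += Fraction(1, 2 ** n)
    return m, pieces


m, pieces = dyadic_chain(Fraction(1, 8), Fraction(7, 8))
levels = [n for _, n in pieces]          # levels used: [3, 2, 2, 3]
per_level = Counter(levels)              # at most two intervals per level
print(m, [(str(u), n) for u, n in pieces], dict(per_level))
```

With at most two intervals per level, the sum over the partition is bounded by $2\sum_{n \ge m+1} K_n$, which is exactly the bound used for $|X_{s,t}|$ in the proof.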
The reader will notice that the classical Kolmogorov criterion (KC) is contained
in the above proof and theorem by simply ignoring all considerations related to the
second-order process X. Let us also note in this context that the classical KC works
for processes (Xt : 0 ≤ t ≤ 1) with values in an arbitrary (separable) metric space
(it suffices to replace |Xs,t | by d(Xs , Xt ) in the argument). This observation actually
gives an alternative and immediate proof of Theorem 3.1. All we have to do is to
remember from Proposition 2.6 that rough paths can always be viewed as bona fide
paths with values in a metric space, namely $T_1^{(2)}$, equipped with the homogeneous
left-invariant metric $d(\mathbf{X}_s, \mathbf{X}_t) \asymp |X_{s,t}| + |\mathbb{X}_{s,t}|^{1/2}$. The moment assumption (3.2) is
then equivalent to $|d(\mathbf{X}_s, \mathbf{X}_t)|_{L^q} \le C|t-s|^{\beta}$ and we can conclude with the “metric”
form of KC. From Section 2.4, a version of this KC for “level-N ” low regularity
rough paths is then also immediate. The reason we still like the pedestrian step-2
proof is that it is easily tweaked, e.g. to the case of the $\mathbb{R}^2$-valued process $(B^H, B)$, the
pair of a fractional and a standard Brownian motion, independent say, with Itô second
level $\mathbb{B}^H := \int B^H\, dB$, in the rough regime $H \in (0, 1/2]$. In this case β should
be replaced by the vector $(\beta_1, \beta_2) = (H, 1/2)$ of regularities, and the conclusion
can be stated with α resp. 2α replaced by the vector $(\alpha_1, \alpha_2) = (H^-, 1/2^-)$ resp.
$(H + 1/2)^-$.
Remark 3.2 (Warning). It is not possible to obtain (3.3) by applying the classical
KC to the $(V \otimes V)$-valued process $(\mathbb{X}_{0,t} : 0 \le t \le T)$. Doing so only gives
$|\mathbb{X}_{s,t}| = O(|t-s|^{\alpha})$ a.s. since one misses a crucial cancellation inherent in (cf. (2.1))
$$\mathbb{X}_{s,t} = \mathbb{X}_{0,t} - \mathbb{X}_{0,s} - X_{0,s} \otimes X_{s,t}.$$
That said, it is possible [Fri05] (but tedious) to use a 2-parameter version of the KC
to see that $(s, t) \mapsto \mathbb{X}_{s,t}/|t-s|^{2\alpha}$ admits a continuous modification, which implies
that $\|\mathbb{X}\|_{2\alpha}$ is finite almost surely.
Here is a similar result for rough path distances, say between X and X̃. Note
that, due to the nonlinear structure of rough path spaces, one cannot simply apply
Theorem 3.1 to the “difference” of two rough paths. Indeed, if we consider X̃ − X,
where addition is taken in the ambient Banach space $C^{\alpha} \oplus C_2^{2\alpha}$, then Chen’s relation
is in general not satisfied.
Theorem 3.3 (Kolmogorov criterion for rough path distance). Let α, β, q be as
above in Kolmogorov’s criterion (KC), Theorem 3.1. Assume that both X̃ = (X̃, X̃)
and X = (X, X) satisfy the moment condition in the statement of KC with some
constant C. Set
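$$\Delta X := \tilde{X} - X, \qquad \Delta\mathbb{X} := \tilde{\mathbb{X}} - \mathbb{X},$$
and assume that for some ε > 0 and all s, t ∈ [0, T]
$$|\Delta X_{s,t}|_{L^q} \le C\varepsilon|t-s|^{\beta}, \qquad |\Delta\mathbb{X}_{s,t}|_{L^{q/2}} \le C\varepsilon|t-s|^{2\beta}.$$
In particular, if $\beta - \tfrac1q > \tfrac13$ then, for every $\alpha \in (\tfrac13, \beta - \tfrac1q)$,
we have $|||\tilde{\mathbf{X}}|||_\alpha, |||\mathbf{X}|||_\alpha \in L^q$
and
$$|\varrho_\alpha(\tilde{\mathbf{X}}, \mathbf{X})|_{L^{q/2}} \le M\varepsilon.$$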
3.2 Itô Brownian motion 43
Proof. The proof is a straightforward modification of the proof of Theorem 3.1 and
is left as an exercise to the reader. □
In fact, the homogeneous rough path norm |||BItô |||α has Gaussian tails.
Proof. Using Brownian scaling and finite moments of $\mathbb{B}_{0,1}$, which are immediate
from integrability properties of the (homogeneous) second Wiener–Itô chaos, the
KC for rough paths applies with β = 1/2 and all q < ∞. (As an exercise, the
reader may want to show finite moments of $\mathbb{B}_{0,1}$ without chaos arguments; an
elementary way to do so is via conditioning, Itô isometry, and reflection principle.)
The integrability $|||\mathbf{B}^{\text{Itô}}|||_\alpha \in L^q$, any q < ∞, is clear from KC. The Gaussian
integrability (and hence tails) can be obtained by carefully tracking the moment
growth in Theorem 3.1 applied to $\mathbf{B}^{\text{Itô}}$ (alternatively, see Theorem 11.9 below for an
elegant Gaussian argument). □
Observe that Brownian motion enhanced with its iterated Itô integrals (2nd order
calculus!) yields a (random) rough path but not a geometric rough path which is, by
definition, an object with hardwired first order behaviour. Indeed, Itô’s formula yields
the identity
$$d(B^i B^j) = B^i\, dB^j + B^j\, dB^i + d\langle B^i, B^j\rangle_t, \qquad i, j = 1, \dots, d,$$
so that, writing Id for the identity matrix in d dimensions, we have for s < t,
$$\mathrm{Sym}\big(\mathbb{B}^{\text{Itô}}_{s,t}\big) = \tfrac12 B_{s,t} \otimes B_{s,t} - \tfrac12 \mathrm{Id}\,(t-s) \ne \tfrac12 B_{s,t} \otimes B_{s,t},$$
in contradiction with (2.6).
Let us finally mention that Brownian motion with values in infinite-dimensional
spaces can also be lifted to rough paths, see the exercise section.
and has the advantage of a first order calculus. For instance, one has the first order
product rule
d(M N ) = M ◦ dN + N ◦ dM .
One can then define BStrat by (component-wise) Stratonovich integration of Brownian
motion against itself. Using basic results on quadratic variation of Brownian motion,
namely $d\langle B^i, B^j\rangle_t = \delta^{i,j}\, dt$ where $\delta^{i,j} = 1$ if i = j, zero else, we see that
$$\mathbb{B}^{\mathrm{Strat}}_{s,t} = \mathbb{B}^{\text{Itô}}_{s,t} + \tfrac12 \mathrm{Id}\,(t-s). \tag{3.5}$$
Note that the difference between BStrat and BItô is symmetric, so that the antisymmet-
ric parts of the two processes (Lévy’s stochastic area) are identical.
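The relation (3.5) has an exact discrete counterpart, which the following Python sketch illustrates on a simulated d-dimensional Gaussian random walk. This is an illustration under stated assumptions, not the book's construction: the helper names (`second_levels`, `sym`, `anti`) are hypothetical, the Itô integral is replaced by left-point sums and the Stratonovich integral by trapezoidal sums (equivalently, iterated integrals of the piecewise linear interpolation).

```python
import random


def outer(u, v):
    """Tensor product u (x) v as a d x d matrix (list of rows)."""
    return [[ui * vj for vj in v] for ui in u]


def axpy(A, B, c=1.0):
    """Entrywise A + c * B."""
    return [[a + c * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def sym(A):
    d = len(A)
    return [[0.5 * (A[i][j] + A[j][i]) for j in range(d)] for i in range(d)]


def anti(A):
    d = len(A)
    return [[0.5 * (A[i][j] - A[j][i]) for j in range(d)] for i in range(d)]


def second_levels(increments):
    """Return (B_T, left-point sum, trapezoidal sum, sum of dB (x) dB)."""
    d = len(increments[0])
    B = [0.0] * d
    ito = [[0.0] * d for _ in range(d)]
    qv = [[0.0] * d for _ in range(d)]
    for dB in increments:
        ito = axpy(ito, outer(B, dB))   # left-point rule: B_{t_k} (x) dB
        qv = axpy(qv, outer(dB, dB))    # discrete quadratic variation
        B = [b + x for b, x in zip(B, dB)]
    strat = axpy(ito, qv, 0.5)          # trapezoidal rule = left-point + qv / 2
    return B, ito, strat, qv


random.seed(0)
N, T, d = 20000, 1.0, 2
steps = [[random.gauss(0.0, (T / N) ** 0.5) for _ in range(d)] for _ in range(N)]
B, ito, strat, qv = second_levels(steps)

print("Sym(strat) - B(x)B/2:", axpy(sym(strat), outer(B, B), -0.5))
print("Anti(strat) - Anti(ito):", axpy(anti(strat), anti(ito), -1.0))
print("strat - ito (close to Id * T / 2):", axpy(strat, ito, -1.0))
```

The first two printed matrices vanish up to rounding for any choice of increments: the trapezoidal sums have symmetric part $\tfrac12 B \otimes B$, and both discretisations share the same antisymmetric part (the discrete Lévy area). Only the third matrix is random, and it approaches $\tfrac12 \mathrm{Id}\,T$ as the mesh is refined.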
Proposition 3.5. For any α ∈ (1/3, 1/2), with probability one,
and here again the homogeneous rough path norm |||BStrat |||α has Gaussian tails.
Proof. Using (3.5), rough path regularity of BStrat is immediately reduced to the
already established Itô case. (Alternatively, one can use again the Kolmogorov
criterion for rough paths; the only – insignificant – difference is that now $\mathbb{B}^{\mathrm{Strat}}_{0,1}$ takes
values in the inhomogeneous second chaos, due to the deterministic part Id/2.) At
last, $\mathbf{B}(\omega)$ is geometric since
$$\mathrm{Sym}\big(\mathbb{B}^{\mathrm{Strat}}_{s,t}\big) = \tfrac12 B_{s,t} \otimes B_{s,t},$$
an immediate consequence of the first order product rule. Finally, integrability of
$\mathbf{B}^{\mathrm{Strat}}$ is clear from the already seen integrability of $\mathbf{B}^{\text{Itô}}$, proving the final claim. □
A typical realisation $\mathbf{B}(\omega)$ is called a Brownian rough path; as a process, $\mathbf{B} = \mathbf{B}^{\mathrm{Strat}}$
is also known as (Stratonovich) enhanced Brownian motion. It is a deterministic feature
of every weakly geometric rough path (X, X) that it can be approximated – in the
precise sense of Proposition 2.8 – by smooth paths in the rough path topology. Such
approximations require knowledge not only of the underlying path X, but of the
entire rough path, including the second-order information X.
In contrast, one has the probabilistic statement that piecewise linear, mollifier
and many other “obvious” approximations still converge in rough path sense. More
specifically, in the present context of d-dimensional standard Brownian motion, we
now give an elegant proof of this based on (discrete-time!) martingale arguments.
Proposition 3.6. Consider dyadic piecewise linear approximations $(B^{(n)})$ to B on
[0, T]. That is, $B^{(n)}_t = B_t$ whenever $t = iT/2^n$ for some integer i, and linearly
interpolated on intervals $[iT/2^n, (i+1)T/2^n]$. Then, with probability one,
$$\Big(B^{(n)}, \int_0^{\cdot} B^{(n)} \otimes dB^{(n)}\Big) \to (B, \mathbb{B}^{\mathrm{Strat}}) \quad \text{in } \mathscr{C}_g^{\alpha}.$$
and upon conditioning with respect to $\sigma\{B_{kT2^{-n}} : 0 \le k \le 2^n\}$, the same bounds
hold for $B^{(n);i}$ and for $\int_0^{\cdot} B^{(n);i}\, dB^{(n);j}$. In fact, $K_\alpha$, $\mathbb{K}_\alpha$ have (more than enough)
integrability to apply Doob’s maximal inequality. This leads, with probability one, to
the bound
$$\sup_n \Big\| \Big(B^{(n)}, \int_0^{\cdot} B^{(n)} \otimes dB^{(n)}\Big) \Big\|_{2\alpha} < \infty.$$
The reader should be warned that there are perfectly smooth and uniform ap-
proximations to Brownian motion, which do not converge to Stratonovich enhanced
Brownian motion, but instead to some different geometric (random) rough path, such
as
$$\bar{\mathbf{B}} = (B, \bar{\mathbb{B}}), \quad \text{where} \quad \bar{\mathbb{B}}_{s,t} = \mathbb{B}^{\mathrm{Strat}}_{s,t} + (t-s)A, \qquad A \in so(d).$$
Note that the difference between B̄ and BStrat is now antisymmetric, i.e. B̄ has a
stochastic area that is different from Lévy’s area. To construct such approximations,
it suffices to include oscillations (at small scales) such as to create the desired
effect in the area, while they do not affect the limiting path, see Exercise 2.10.
(In the context of Brownian motion and SDEs driven by Brownian motion such
approximations were studied by McShane, Ikeda–Watanabe and others, see [McS72,
IW89].) Although such “twisted” approximations do not seem to be the most obvious
way to approximate Brownian motion, they also arise naturally in some perfectly
reasonable situations.
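To illustrate how small-scale oscillations can change the area without affecting the limiting path, here is a minimal deterministic Python sketch. The specific family of paths (circles of radius 1/n traversed n² times on [0, 1]) is an assumption chosen for illustration and is not taken from the text; the function names are hypothetical.

```python
import math


def small_loops(n, steps_per_loop=200):
    """Sample x_n(t) = ((cos(2 pi n^2 t) - 1) / n, sin(2 pi n^2 t) / n) on [0, 1]."""
    total = n * n * steps_per_loop
    pts = []
    for k in range(total + 1):
        angle = 2.0 * math.pi * k / steps_per_loop
        pts.append(((math.cos(angle) - 1.0) / n, math.sin(angle) / n))
    return pts


def levy_area(path):
    """Area (1/2) * integral of (x1 dx2 - x2 dx1) of the piecewise linear path,
    computed from increments relative to the starting point."""
    x0, y0 = path[0]
    area = 0.0
    for (x, y), (xn, yn) in zip(path, path[1:]):
        area += 0.5 * ((x - x0) * (yn - y) - (y - y0) * (xn - x))
    return area


for n in (5, 10, 20):
    path = small_loops(n)
    sup_norm = max(max(abs(x), abs(y)) for x, y in path)
    print(n, sup_norm, levy_area(path))
```

The sup norm is at most 2/n and so tends to zero, while the area accumulated over [0, 1] stays close to π for every n: each loop encloses area about π/n², and there are n² loops. Adding such oscillations to an approximation of Brownian motion is one way to produce a limit of the form $\mathbb{B}^{\mathrm{Strat}}_{s,t} + (t-s)A$ with antisymmetric A.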
Newton’s second law for a particle in R3 with mass m, and position x = x(t), (for
simplicity: constant) frictions α1 , α2 , α3 > 0 in orthonormal directions, subject
to a (3-dimensional) white noise in time, i.e. the distributional derivative of a 3-
dimensional Brownian motion B, reads
Note that these second order dynamics can be rewritten as evolution equation for the
momentum p(t) = mẋ(t),
$$\dot{p} = -M\dot{x} + \dot{B} = -\frac{1}{m} M p + \dot{B}.$$
As we shall see, $X = X^m$, indexed by “mass” m, converges in a quite non-trivial
way to Brownian motion on the level of rough paths. In fact, the correct limit in
rough path sense is $\bar{\mathbf{B}} = (B, \bar{\mathbb{B}})$, where
$$\bar{\mathbb{B}}_{s,t} = \mathbb{B}^{\mathrm{Strat}}_{s,t} + (t-s)A, \tag{3.7}$$
Theorem 3.8. Let $M \in \mathbb{R}^{d \times d}$ be a square matrix in dimension d such that all its
eigenvalues have strictly positive real part. Let B be a d-dimensional standard
Brownian motion, m > 0, and consider the stochastic differential equations
$$dX = \frac{1}{m} P\, dt, \qquad dP = -\frac{1}{m} M P\, dt + dB,$$
with zero initial position X and momentum P. Then, for any q ≥ 1 and α ∈
(1/3, 1/2), as mass m → 0,
$$\Big(MX, \int MX \otimes d(MX)\Big) \to \bar{\mathbf{B}} \quad \text{in } \mathscr{C}^{\alpha} \text{ and } L^q.$$
$$Y^{\varepsilon}_t = P_t/\varepsilon.$$
By assumption, there exists λ > 0 such that the real part of every eigenvalue of M
is (strictly) bigger than λ. For later reference, we note that this implies the estimate
$|\exp(-\tau M)| = O(\exp(-\lambda\tau))$ as τ → ∞. For fixed ε, define the Brownian motion
$\tilde{B} = \varepsilon^{-1} B_{\varepsilon^2 \cdot}$ so that $\varepsilon^{-1}\, dB_t = d\tilde{B}_{\varepsilon^{-2} t}$, and consider the SDEs
Note that the law of the solutions does not depend on ε. Furthermore, when solved
with identical initial data, we have pathwise equality
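For each t (and in particular for t = 0), the law of $\tilde{Y}^{\mathrm{stat}}_t$ is precisely ν. We then see
that
$$\Sigma = E\big(\tilde{Y}^{\mathrm{stat}}_0 \otimes \tilde{Y}^{\mathrm{stat}}_0\big) = \int_{-\infty}^{0} e^{-M(-s)} e^{-M^*(-s)}\, ds = \int_0^{\infty} e^{-Ms} e^{-M^* s}\, ds.$$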
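for all reasonable test functions f; we shall only use it for quadratics. Using $dX^{\varepsilon} = \varepsilon^{-1} Y^{\varepsilon}\, dt$ we can then write
$$\begin{aligned}
\int_0^t M X^{\varepsilon}_s \otimes d(M X^{\varepsilon})_s &= \int_0^t M X^{\varepsilon}_s \otimes dB_s - \varepsilon \int_0^t M X^{\varepsilon}_s \otimes dY^{\varepsilon}_s \\
&= \int_0^t M X^{\varepsilon}_s \otimes dB_s - M X^{\varepsilon}_t \otimes (\varepsilon Y^{\varepsilon}_t) + \varepsilon \int_0^t d(M X^{\varepsilon})_s \otimes Y^{\varepsilon}_s \\
&= \int_0^t M X^{\varepsilon}_s \otimes dB_s - M X^{\varepsilon}_t \otimes (\varepsilon Y^{\varepsilon}_t) + \int_0^t M Y^{\varepsilon}_s \otimes Y^{\varepsilon}_s\, ds \\
&\to \int_0^t B_s \otimes dB_s - 0 + t \int (M y \otimes y)\, \nu(dy) \\
&= \int_0^t B_s \otimes dB_s + t M \Sigma = \mathbb{B}^{\mathrm{Strat}}_{0,t} + t\Big(M\Sigma - \tfrac12 \mathrm{Id}\Big),
\end{aligned}$$
¹ In its standard form, see e.g. Stroock [Str11] or Kallenberg [Kal02], test functions are assumed to
be bounded. In our setting an easy truncation argument yields the extension to quadratics.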
Step 2. (Uniform rough path bounds in Lq .) We claim that, for any q < ∞,
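$$\sup_{\varepsilon \in (0,1]} E\big[\|M X^{\varepsilon}\|_{\alpha}^{q}\big] < \infty, \qquad \sup_{\varepsilon \in (0,1]} E\Big[\Big\|\int M X^{\varepsilon} \otimes d(M X^{\varepsilon})\Big\|_{2\alpha}^{q}\Big] < \infty.$$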
Since X is Gaussian, it follows from integrability properties of the first two Wiener–
Itô chaoses that it is enough to show these bounds for q = 2. Furthermore, we note
that the desired estimates are a consequence of the bounds
$$E\big[|\tilde{X}_{s,t}|^2\big] \lesssim |t-s|, \tag{3.10}$$
$$E\Big[\Big|\int_s^t \tilde{X}_{s,u} \otimes d\tilde{X}_u\Big|^2\Big] \lesssim |t-s|^2, \tag{3.11}$$
where the implied proportionality constants are uniform over t, s ∈ (0, ∞). Indeed,
this follows directly from writing
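$$E\big[|X^{\varepsilon}_{s,t}|^2\big] = E\big[|\varepsilon \tilde{X}_{\varepsilon^{-2}s,\varepsilon^{-2}t}|^2\big] \lesssim \varepsilon^2\, |\varepsilon^{-2}t - \varepsilon^{-2}s| = |t-s|,$$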
(note the uniformity in ε), and similarly for the second moment of the iterated
integral.
In order to check (3.10), it is enough to note that $M\tilde{X}_{s,t} = \tilde{B}_{s,t} - \tilde{Y}_{s,t}$, combined
with the estimate
$$E\big[|\tilde{Y}_{s,t}|^2\big] = E\big|(e^{-M(t-s)} - I)\tilde{Y}_s\big|^2 + \int_s^t \mathrm{Tr}\big(e^{-Mu} e^{-M^* u}\big)\, du \lesssim |t-s|,$$
where we used the fact that $\mathrm{Real}\{\sigma(M)\} \subset (0, \infty)$ to get a uniform bound. In order
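to control (3.11), we consider one of the components and write
$$\begin{aligned}
E\Big[\Big(\int_s^t \tilde{X}^i_{s,u}\, d\tilde{X}^j_u\Big)^2\Big] &= E\Big[\Big(\int_s^t \int_s^u \tilde{Y}^i_r \tilde{Y}^j_u\, dr\, du\Big)^2\Big] \\
&= \int_{[s,t]^4} E\big[\tilde{Y}^i_r \tilde{Y}^j_u \tilde{Y}^i_q \tilde{Y}^j_v\big]\, \mathbf{1}_{\{r \le u;\, q \le v\}}\, dr\, du\, dq\, dv \\
&\le \int_{[s,t]^4} \Big( \big|E[\tilde{Y}^i_r \tilde{Y}^j_u]\, E[\tilde{Y}^i_q \tilde{Y}^j_v]\big| + \big|E[\tilde{Y}^i_r \tilde{Y}^i_q]\, E[\tilde{Y}^j_u \tilde{Y}^j_v]\big| \\
&\qquad\qquad + \big|E[\tilde{Y}^i_r \tilde{Y}^j_v]\, E[\tilde{Y}^j_u \tilde{Y}^i_q]\big| \Big)\, dr\, du\, dq\, dv \\
&\lesssim \Big( \int_{[s,t]^2} \big|E[\tilde{Y}_r \otimes \tilde{Y}_u]\big|\, dr\, du \Big)^2 \\
&\lesssim \Big( \int_{[s,t]^2} \big|E[\tilde{Y}_r \otimes \tilde{Y}_u]\big|\, \mathbf{1}_{\{r \le u\}}\, dr\, du \Big)^2,
\end{aligned}$$
where we have used the fact that $\tilde{Y}$ is Gaussian (which yields Wick’s formula for the
expectation of products) in order to get the bound on the third line. But for r ≤ u,
$E[\tilde{Y}_u \mid \tilde{Y}_r] = e^{-M(u-r)} \tilde{Y}_r$, so that
$$\begin{aligned}
\int_{[s,t]^2} \big|E[\tilde{Y}_r \otimes \tilde{Y}_u]\big|\, \mathbf{1}_{\{r \le u\}}\, dr\, du &= \int_{[s,t]^2} \big|E[\tilde{Y}_r \otimes e^{-M(u-r)} \tilde{Y}_r]\big|\, \mathbf{1}_{\{r \le u\}}\, dr\, du \\
&\lesssim \int_s^t \int_r^t e^{-\lambda(u-r)}\, du\; E|\tilde{Y}_r|^2\, dr \lesssim |t-s|.
\end{aligned}$$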
order to find such cubature formulae, the mandatory first step, on which we focus
here, is the computation of the expectations of the n-fold iterated integrals²
$$E\Big(\int_{0 < t_1 < \dots < t_n < T} \circ dB \otimes \cdots \otimes \circ dB\Big).$$
Let us combine all of these integrals into one single object, also known as
(Stratonovich) signature of Brownian motion, by writing
$$S(B)_{0,T} = 1 + \sum_{n \ge 1} \int_{0 < t_1 < \dots < t_n < T} \circ dB \otimes \cdots \otimes \circ dB.$$
The signature $S(B)_{0,T}$ naturally takes values in $T((\mathbb{R}^d))$, the algebra of infinite formal tensor
series, effectively the closure of the space of tensor polynomials given
by $\bigoplus_{n \ge 0} (\mathbb{R}^d)^{\otimes n}$. It turns out that in the case of Brownian motion, the expected
signature can be expressed in a particularly concise and elegant form.
² We remark that all n-fold iterated Stratonovich integrals can be obtained from the “level-2” rough
path $(B(\omega), \mathbb{B}^{\mathrm{Strat}}(\omega)) \in \mathscr{C}_g^{\alpha}$ by a continuous map. In fact, this so-called Lyons lift allows one to view
any geometric rough path as a “level-n” rough path for arbitrary n ≥ 2.

Proof. (Shekhar) Set $\varphi_t := E S(B)_{0,t}$. (It is not hard to see, by Wiener–Itô chaos
integrability or otherwise, that all involved iterated integrals are integrable so that φ
is well-defined.) By Chen’s formula (in its general form, see Exercise 2.1) and the
independence of Brownian increments, one has the identity
$$\varphi_{t+s} = \varphi_t \otimes \varphi_s.$$
For integers m, n we have $\log \varphi_m = n \log \varphi_{m/n}$ and $\log \varphi_m = m \log \varphi_1$. It follows
that
$$\log \varphi_t = t \log \varphi_1,$$
first for $t = m/n \in \mathbb{Q}$, then for any real t by continuity. On the other hand, for t > 0,
Brownian scaling implies that $\varphi_t = \delta_{\sqrt{t}}\, \varphi_1$ where $\delta_\lambda$ is the dilation operator, which
acts by multiplication with $\lambda^n$ on the nth tensor level, $(\mathbb{R}^d)^{\otimes n}$. Since $\delta_\lambda$ commutes
with ⊗ (and thus also with log, defined as power series),
Recall that in this expression, “1” is identified with (1, 0, 0) in the truncated tensor
algebra, and similarly for the other summands, and addition also takes place in
$T^{(2)}(\mathbb{R}^d)$. Taking the logarithm (in the tensor algebra truncated beyond level 2; in
this case $\log(1 + a + b) = a + b - \tfrac12 a \otimes a$ if a is a 1-tensor, b a 2-tensor) then
immediately gives the desired identification. □
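The level-2 logarithm formula quoted in the proof can be checked mechanically. The sketch below is an illustration only; the representation of elements of $T^{(2)}(\mathbb{R}^d)$ as Python tuples and the names `t2_exp`, `t2_log` are assumptions made here. It implements the truncated exponential and logarithm and verifies on an example that they invert each other.

```python
def outer(u, v):
    """Tensor product u (x) v as a d x d matrix (list of rows)."""
    return [[ui * vj for vj in v] for ui in u]


def t2_exp(a, b):
    """exp(a + b) in the level-2 truncated tensor algebra.

    a is a 1-tensor (list), b a 2-tensor (matrix); all terms of order 3 and
    higher are dropped, so exp(a + b) = 1 + a + (b + a (x) a / 2).
    """
    aa = outer(a, a)
    level2 = [[bij + 0.5 * aij for bij, aij in zip(rb, ra)] for rb, ra in zip(b, aa)]
    return (1.0, list(a), level2)


def t2_log(x):
    """log(1 + a + c) = a + c - a (x) a / 2 for an element with scalar part 1."""
    one, a, c = x
    assert one == 1.0
    aa = outer(a, a)
    level2 = [[cij - 0.5 * aij for cij, aij in zip(rc, ra)] for rc, ra in zip(c, aa)]
    return (list(a), level2)


a = [1.0, -2.0]
b = [[0.0, 3.0], [-3.0, 0.5]]
print(t2_log(t2_exp(a, b)))   # recovers (a, b) up to rounding

# A pure 2-tensor exponentiates to itself at level 2: exp(Id / 2) has
# level-2 projection Id / 2 and vanishing level 1.
half_id = [[0.5, 0.0], [0.0, 0.5]]
print(t2_exp([0.0, 0.0], half_id))
```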
Theorem 3.10 (Kolmogorov tightness criterion for rough paths). Let q ≥ 2, β >
1/q. Assume, for all s, t in [0, T ]
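$$E_n |X^n_{s,t}|^q \le C|t-s|^{\beta q}, \qquad E_n |\mathbb{X}^n_{s,t}|^{q/2} \le C|t-s|^{\beta q}, \tag{3.12}$$
for some constant C < ∞. Assume $\beta - \tfrac1q > \tfrac13$. Then for every $\alpha \in (\tfrac13, \beta - \tfrac1q)$, the
$\mathbf{X}^n$’s are tight in $\mathscr{C}^{0,\alpha}$.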
In typical applications, the X n are only defined for discrete times, such as s =
j/n, t = k/n for integers j, k. The non-trivial work then consists, for a suitable
$$E_n \big|X^n_{j/n,\,k/n}\big|^q \le C\,\Big|\frac{j-k}{n}\Big|^{\beta q}, \qquad E_n \big|\mathbb{X}^n_{j/n,\,k/n}\big|^{q/2} \le C\,\Big|\frac{j-k}{n}\Big|^{\beta q}. \tag{3.13}$$
defined on discrete times only, by piecewise linear interpolation to all times and
construct Xn = (X n , Xn ) by iterated (Riemann–Stieltjes) integration. Then the
tightness estimates in Theorem 3.10 hold with β = 1/2 and all q < ∞.
Proof. The iterated integrals of a linear (or affine) path with increment $v \in \mathbb{R}^d$
take the simple form exp(v) in terms of the tensor exponential introduced in (2.13).
Chen’s relation then implies
$$\mathbf{X}^n_{j/n,\,k/n} = \exp\big(X^n_{j/n,\,(j+1)/n}\big) \otimes \cdots \otimes \exp\big(X^n_{(k-1)/n,\,k/n}\big). \tag{3.14}$$
The simple calculus on the level-2 tensor algebra $T^{(2)}(\mathbb{R}^d)$ leads to an explicit
expression for $\mathbb{X}^n_{j/n,\,k/n}$, to which one can apply the (discrete) Burkholder–Davis–Gundy
inequality in order to get the discrete tightness estimates (3.13). The extension to all
times is straightforward. Details are left to the reader (see e.g. [BF13]). An alternative
argument, not restricted to level 2, is found in Breuillard et al. [BFH09]. □
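The product formula (3.14) is easy to evaluate at level 2. The following Python sketch is an illustration under assumptions: elements of the truncated algebra are stored as pairs (level 1, level 2) with the scalar part 1 left implicit, and the function names are hypothetical. It multiplies the tensor exponentials of the increments of a piecewise linear path using Chen's relation.

```python
def t2_exp_vec(v):
    """Level-2 truncation of exp(v) for a 1-tensor v: (v, v (x) v / 2)."""
    return (list(v), [[0.5 * vi * vj for vj in v] for vi in v])


def chen(x, y):
    """Product (1, a1, a2) (x) (1, b1, b2) = (1, a1 + b1, a2 + b2 + a1 (x) b1)."""
    a1, a2 = x
    b1, b2 = y
    d = len(a1)
    level1 = [p + q for p, q in zip(a1, b1)]
    level2 = [[a2[i][j] + b2[i][j] + a1[i] * b1[j] for j in range(d)] for i in range(d)]
    return (level1, level2)


def signature_level2(increments):
    """Level-2 signature of the piecewise linear path with the given increments."""
    d = len(increments[0])
    sig = ([0.0] * d, [[0.0] * d for _ in range(d)])   # unit element
    for v in increments:
        sig = chen(sig, t2_exp_vec(v))
    return sig


# Unit square traversed counterclockwise: the path closes up, so level 1 is zero
# and level 2 is purely antisymmetric, with entries given by the enclosed area.
square = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
print(signature_level2(square))
```

For the unit square the result has vanishing first level and second level [[0, 1], [-1, 0]]. In general the symmetric part of the second level equals half the tensor square of the total increment, which reflects that piecewise linear lifts are geometric rough paths.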
Note that $\mathbf{X}^n$, as constructed above, is a (random) geometric rough path. Recall
that such rough paths can be viewed as genuine paths with values in the Lie group
$G^{(2)}(\mathbb{R}^d) \subset T^{(2)}(\mathbb{R}^d)$. On the other hand, from (3.14), we see that $\mathbf{X}^n$ restricted to
discrete times $\{j/n : j \in \mathbb{N}\}$ is a Lie group valued random walk, rescaled with the aid
of the dilation operator. By using central limit theorems available on such Lie groups,
one can see that Xn at unit time converges weakly to Brownian motion, enhanced
with its iterated integrals in the Stratonovich sense. Under the additional assumption
that E(X ⊗ X) = Id, the identity matrix, this Brownian motion is in fact a standard
Brownian motion. This is enough to characterise the finite-dimensional distributions
of any weak limit point and one has the following “Donsker” type result.
Theorem 3.12. In the rescaled random walk setting of Proposition 3.11, and under
the additional assumption that E(X ⊗ X) = Id, we have the weak convergence
Xn =⇒ BStrat
3.7 Exercises
exists a.s. and in L2 , uniformly on compacts and so defines X with values in H ⊗HS H,
the closure of the algebraic tensor product H ⊗a H under the Hilbert–Schmidt norm.
Consider both the case of Itô and Stratonovich integration and verify that with either
choice, (X, X) ∈ C α a.s. for any α < 1/2.
∗ Exercise 3.5 (Banach-valued Brownian motion as rough path [LLQ02]) Given
a separable Banach space V equipped with a centred Gaussian measure µ, a standard
construction (cf. [Led96]) gives rise to a so-called abstract Wiener space (V, H, µ),
with H ⊂ V the Cameron–Martin space of µ. (Examples to have in mind are V =
H = Rd with µ = N (0, I), or the usual Wiener space V = C([0, 1]) equipped with
Wiener measure, H is then the space of absolutely continuous paths starting at zero
with L2 -derivative.) There then exists a V -valued Brownian motion (Bt : t ∈ [0, T ])
such that
• $B_0 = 0$,
• B has independent increments,
• $\langle B_{s,t}, v^* \rangle \sim N\big(0, (t-s)|v^*|_H^2\big)$ whenever 0 ≤ s < t ≤ T and $v^* \in V^* \hookrightarrow H^* \cong H$.
We assume that V ⊗ V is equipped with an exact tensor norm (with respect to µ)
in the sense that there exists γ ∈ [1/2, 1) and a constant C > 0 such that for any
sequence {Gk ⊗ G̃k : k ≥ 1} of independent V -valued Gaussian random variables
with identical distribution µ,
$$E\Big\|\sum_{k=1}^{N} G_k \otimes \tilde{G}_k\Big\|^2_{V \otimes V} \le C N^{2\gamma} = o(N).$$
a) Verify that exactness holds with γ = 1/2 whenever dim V < ∞. (More gener-
ally, exactness with γ = 1/2 always holds true if one works with the injective
tensor product space, V ⊗inj V , the injective norm being the smallest possible.
For the largest possible norm, the projective norm, the o(N )-estimate remains
true but can be as slow as one wishes. Exactness may then fail, see for example
[LLQ02]. Exactness of the usual Wiener space, with uniform or Hölder norm, is
also known to be true.)
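b) Fix α < 1/2. Show that dyadic piecewise linear approximations $B^n$, enhanced
with $\mathbb{B}^n = \int B^n \otimes dB^n$, converge in α-Hölder rough path metric to a limit
$\mathbf{B}$ in $\mathscr{C}^{\alpha}([0, T], V)$. More precisely, use the previous exercise to show that the
sequence $\mathbf{B}^n = (B^n, \mathbb{B}^n)$ is Cauchy in the sense that
$$\begin{aligned}
\big|\mathbb{B}^{n+1}_{0,1} - \mathbb{B}^{n}_{0,1}\big|^2_{L^2} &\sim E\Big\|\sum_{k=1}^{2^n} B_{t^{n+1}_{2k-2},\,t^{n+1}_{2k-1}} \otimes B_{t^{n+1}_{2k-1},\,t^{n+1}_{2k}}\Big\|^2_{V \otimes V} \\
&\sim \frac{1}{2^{2n+2}}\, E\Big\|\sum_{k=1}^{2^n} 2^{\frac{n+1}{2}} B_{t^{n+1}_{2k-2},\,t^{n+1}_{2k-1}} \otimes 2^{\frac{n+1}{2}} B_{t^{n+1}_{2k-1},\,t^{n+1}_{2k}}\Big\|^2_{V \otimes V} \\
&\sim 2^{-2n-2}\, E\Big\|\sum_{k=1}^{2^n} G_k \otimes \tilde{G}_k\Big\|^2_{V \otimes V} \lesssim 2^{-2n-2}\, 2^{2\gamma n} \\
&\sim 2^{-2n(1-\gamma)},
\end{aligned}$$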
Exercise 3.6 In the context of Theorem 3.8, show that for M normal the Lévy area
correction takes the form
$$A = \tfrac12\, \mathrm{Anti}(M)\, \mathrm{Sym}(M)^{-1}.$$
Conclude that the correction vanishes if and only if M is symmetric. Is this also true
without the assumption that M is normal?
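Before attempting a proof, the formula can be explored numerically. The Python sketch below makes several assumptions that should be read as such: it takes the correction to be $A = M\Sigma - \tfrac12 \mathrm{Id}$, as read off from the Step 1 computation above; it obtains Σ by solving the linear (Lyapunov) equation $M\Sigma + \Sigma M^* = \mathrm{Id}$, which follows from the integral formula for Σ by differentiating the integrand; and it uses $\mathrm{Anti}(M) = \tfrac12(M - M^*)$, $\mathrm{Sym}(M) = \tfrac12(M + M^*)$. All function names are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(n):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def stationary_covariance(M):
    """Solve M S + S M^T = Id for S, written as a d^2 x d^2 linear system."""
    d = len(M)
    idx = [(i, j) for i in range(d) for j in range(d)]
    A = [[(M[i][k] if l == j else 0.0) + (M[j][l] if k == i else 0.0)
          for (k, l) in idx] for (i, j) in idx]
    rhs = [1.0 if i == j else 0.0 for (i, j) in idx]
    s = solve(A, rhs)
    return [[s[i * d + j] for j in range(d)] for i in range(d)]


def area_correction(M):
    """A = M Sigma - Id / 2 (assumed form of the correction)."""
    d = len(M)
    MS = matmul(M, stationary_covariance(M))
    return [[MS[i][j] - (0.5 if i == j else 0.0) for j in range(d)] for i in range(d)]


def normal_formula(M):
    """(1/2) Anti(M) Sym(M)^{-1}, the expression stated in the exercise."""
    d = len(M)
    S = [[0.5 * (M[i][j] + M[j][i]) for j in range(d)] for i in range(d)]
    An = [[0.5 * (M[i][j] - M[j][i]) for j in range(d)] for i in range(d)]
    cols = [solve(S, [1.0 if i == j else 0.0 for i in range(d)]) for j in range(d)]
    S_inv = [[cols[j][i] for j in range(d)] for i in range(d)]
    return [[0.5 * x for x in row] for row in matmul(An, S_inv)]


M_normal = [[2.0, 1.0], [-1.0, 2.0]]   # normal, not symmetric
M_sym = [[2.0, 1.0], [1.0, 3.0]]       # symmetric
print(area_correction(M_normal), normal_formula(M_normal))
print(area_correction(M_sym))
```

For the normal example both expressions agree, and for the symmetric example the correction vanishes up to rounding. Feeding in a non-normal matrix with eigenvalues in the right half-plane is a quick way to experiment with the last question of the exercise.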
Exercise 3.7 In the context of Theorem 3.8, show that “physical Brownian motion
with mass m” converges as m → 0, in $\varrho_\alpha$ and $L^q$, α ∈ (1/3, 1/2) and q < ∞, with
rate
1
O , any θ < 1/2 − α.
mθ
Hint: Use Theorem 3.3 to show rough path convergence. (The computations are a
little longer, but of similar type, with the additional feature that the use of the ergodic
theorem can be avoided.)
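Exercise 3.8 Consider physical Brownian motion in dimension d = 2, with
$$M = I - \alpha \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \qquad \alpha \in \mathbb{R}.$$
Show that the area correction of $X^m$, in the (small mass) limit m → 0, is given by
$$\frac{\alpha}{2(1+\alpha^2)} \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}.$$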
Here, the exponential should be interpreted as the exponential in the tensor algebra,
i.e.
1 1
exp(u) = 1 + u + u ⊗ u + u ⊗ u ⊗ u + . . .
2! 3!
Exercise 3.10 (Expected signature for Lévy processes [FS17]) Consider a com-
pound Poisson process Y with intensity λ and jumps distributed like J = J(ω) ∼ ν.
in other words, Y is Lévy with triplet (0, 0, K) where the Lévy measure is given by
K = λν. A sample path of Y gives rise to piecewise linear, continuous path; simply
by connecting J1 , J1 + J2 etc. Show that, under a suitable integrability condition
for J,
$$E S(Y)_{0,T} = \exp\big(T \lambda\, E(e^{J} - 1)\big).$$
Can you handle the case of a general Lévy process?
±1
Call the resulting process (Xt (ω) : t ∈ [0, 1]) and compute the expected signature
up to level 3, that is
$$E\Big(1,\; X_{0,1},\; \int_{0<t_1<t_2<1} dX_{t_1} \otimes dX_{t_2},\; \int_{0<t_1<t_2<t_3<1} dX_{t_1} \otimes dX_{t_2} \otimes dX_{t_3}\Big).$$
Then,
$$\int_{0<t_1<t_2<1} dX_{t_1} \otimes dX_{t_2} = \frac12 \sum_{i,j} Z_i Z_j\, e_i \otimes e_j = \frac12 \mathrm{Id} + (\text{zero mean})$$
and so the expected value at level 2 matches $\pi_2 \exp(\tfrac12 \mathrm{Id}) = \tfrac12 \mathrm{Id}$. A similar
expansion on level 3 shows that every summand either contains, for some i, a factor
$E Z_i = 0$ or $E Z_i^3 = 0$. In other
words, the expected signature at level 3 is zero,
in agreement with $\pi_3 \exp(\tfrac12 \mathrm{Id}) = 0$. We conclude that the expected signatures, of
µ on the one hand and Wiener measure on the other hand, agree up to level 3.
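The level-by-level computation above can be confirmed by exact enumeration. The Python sketch below rests on an assumption about the setup that is only partly visible here: X is taken to be the linear path on [0, 1] with increment $\sum_i Z_i e_i$, where the $Z_i$ are independent and uniform on {−1, +1}, so that its signature is the tensor exponential of the increment. The function name is hypothetical.

```python
from fractions import Fraction
from itertools import product


def expected_signature(d):
    """Exact expected signature, levels 1 to 3, of the linear path with
    increment Z = (Z_1, ..., Z_d), Z uniform on {-1, +1}^d (an assumption)."""
    signs = list(product((-1, 1), repeat=d))
    w = Fraction(1, len(signs))
    r = range(d)
    # Signature of a linear path with increment z: levels z, z(x)z/2!, z(x)z(x)z/3!.
    level1 = [sum(w * z[i] for z in signs) for i in r]
    level2 = [[sum(w * Fraction(z[i] * z[j], 2) for z in signs) for j in r] for i in r]
    level3 = [[[sum(w * Fraction(z[i] * z[j] * z[k], 6) for z in signs)
                for k in r] for j in r] for i in r]
    return level1, level2, level3


l1, l2, l3 = expected_signature(3)
print(l1)   # all zero
print(l2)   # one half times the identity matrix
print(all(x == 0 for plane in l3 for row in plane for x in row))
```

Under this assumption the first and third levels vanish and the second level is $\tfrac12 \mathrm{Id}$, in line with the text.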
3.8 Comments
The modification of Kolmogorov’s criterion for rough paths (Theorem 3.1) is a minor
variation on a rather well-known theme. Rough path regularity of Brownian motion
was first established in the thesis of Sipiläinen, [Sip93].
For extensions to infinite-dimensional Wiener processes (and also convergence
of piecewise linear approximations in rough path sense) see Ledoux, Lyons and
Qian [LLQ02] and Dereich [Der10]; much of the interest here is to go beyond the
Hilbert space setting. The resulting stochastic integration theory against Banach space
valued Brownian motion, which in essence cannot be done by classical methods, has
proven crucial in some recent applications (cf. the works of Kawabi–Inahama [IK06],
Dereich [Der10]).
Early proofs of Brownian rough path regularity were typically established by
convergence of dyadic piecewise linear approximations to (B, BStrat ) in (p-variation)
rough path metric; see e.g. Lyons–Qian [LQ02]. Many other “obvious” (but as we
have seen: not all reasonable) approximations are seen to yield the same Brownian
rough path limit. The discussion of Brownian motion in a magnetic field follows
closely Friz, Gassiat and Lyons [FGL15]. Semimartingales [CL05, FV08a, LP18,
CF19] and large classes of Markovian processes [Lej06, FV08c] lift in a natural way
to random rough paths. For Gaussian rough paths see Chapter 10. Infinite dimensional
rough path constructions from free probability include [CDM01, Vic04].
Friz–Victoir [FV08a] extend Lépingle’s classical p-variation Burkholder–Davis–
Gundy (BDG) inequality [Lep76] for martingales to continuous martingale rough
paths (a.k.a. enhanced martingales). This was further extended to càdlàg martingale
rough paths by Chevyrev–Friz [CF19] and a precise “off-diagonal” variation estimate
for $\int M\, dN$, with M and N two martingales, was given by Kovač and Zorin–Kranich [KZK19],
extending a variational estimate of Do, Muscalu and Thiele [DMT12], with motivation
from harmonic analysis.
Lyons–Zeitouni [LZ99] use rough paths to bound Stratonovich iterated stochas-
tic integrals under conditioning, with application to Onsager-Machlup functionals.
The componentwise expectation of (Stratonovich) iterated integrals, expected signa-
ture of Brownian motion, was first computed in the thesis of Fawcett [Faw04];
different proofs were then given by Lyons–Victoir, Baudoin and Friz–Shekhar,
[LV04, Bau04, FS17]. Fawcett’s formula is central to the Kusuoka–Lyons–Victoir
cubature method [Kus01, LV04]. More generally, expected signatures capture im-
portant aspects of the law of a stochastic process, see Chevyrev–Lyons [CL16]. The
computation of expected signatures of large classes of stochastic processes including
fractional Brownian motion, Schramm–Loewner trace, stopped Brownian motion
and Lévy processes has been pursued by a number of people including Baudoin
[Bau04], Werness [Wer12], Lyons–Ni [LN15], Friz–Shekhar [FS17]. The Donsker
type theorem, Theorem 3.12, in uniform topology, is a consequence of Stroock–
Varadhan [SV73]; the rough path case is due to Breuillard, Friz and Huesmann
[BFH09]. Applications to cubature are discussed in [BF13]. Several authors have
studied functional CLTs in rough paths topology in more complicated settings, includ-
ing [LS17, LS18, LO18], see also [IKN18]. The case of random walks in random
3.8 Comments 59
The aim of this chapter is to give a meaning to the expression $\int Y_t\,dX_t$ for a suitable
class of integrands Y , integrated against a rough path X. We first discuss the case
originally studied by Lyons where Y = F (X). We then introduce the notion of a
controlled rough path and show that this forms a natural class of integrands.
4.1 Introduction
We consider the problem of giving a meaning to the expression $\int Y_t\,dX_t$, for $X \in C^\alpha([0,T],V)$
and $Y$ some continuous function with values in $L(V,W)$, the space
of bounded linear operators from V into some other Banach space W . Of course,
such an integral cannot be defined for arbitrary continuous functions $Y$, especially if
we want the map $(X,Y) \mapsto \int Y\,dX$ to be continuous in the relevant topologies. We
therefore also want to identify a “good” class of integrands Y for the rough path X.
A natural approach would be to try to define the integral as a limit of Riemann–
Stieltjes sums, that is
\[ \int_0^1 Y_t\,dX_t = \lim_{|\mathcal{P}|\to 0} \sum_{[s,t]\in\mathcal{P}} Y_s X_{s,t}, \tag{4.1} \]
62 4 Integration against rough paths
\[ \Big| \int_0^1 (Y_r - Y_0)\,dX_r \Big| \le C \|Y\|_{\beta;[0,1]} \|X\|_{\alpha;[0,1]}, \tag{4.2} \]
with C depending on α + β > 1. Given paths X, Y defined on [s, t] rather than [0, 1]
it is an easy consequence of the scaling properties of Hölder seminorms, that
\[ \Big| \int_s^t Y_r\,dX_r - Y_s X_{s,t} \Big| \le C \|Y\|_\beta \|X\|_\alpha |t-s|^{\alpha+\beta}. \tag{4.3} \]
In particular, when α = β > 1/2, the right-hand side is proportional to $|t-s|^{2\alpha} = o(|t-s|)$,
which is to be compared with the estimate (4.22) below.
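To make the Young regime concrete, the following minimal sketch evaluates the left-point sums of (4.1) for two scalar power paths. The choices $X_t = t^{0.75}$, $Y_t = t^{0.6}$ and the helper name `young_riemann_sum` are illustrative assumptions, not taken from the text; they are picked because α + β = 1.35 > 1 and the limit is known in closed form, namely α/(α + β).

```python
def young_riemann_sum(Y, X, n, a=0.0, b=1.0):
    """Left-point sum of Y_s * X_{s,t} over the uniform partition of [a, b] into n intervals."""
    pts = [a + (b - a) * i / n for i in range(n + 1)]
    return sum(Y(s) * (X(t) - X(s)) for s, t in zip(pts, pts[1:]))

# Illustrative (assumed) example: X is 0.75-Hoelder, Y is 0.6-Hoelder on [0, 1].
ALPHA, BETA = 0.75, 0.6
X = lambda t: t ** ALPHA
Y = lambda t: t ** BETA

# int_0^1 t^BETA d(t^ALPHA) = ALPHA / (ALPHA + BETA), by the classical change of variables.
EXACT = ALPHA / (ALPHA + BETA)

# Absolute error of the Riemann sums along finer and finer uniform partitions.
errors = {n: abs(young_riemann_sum(Y, X, n) - EXACT) for n in (2 ** 6, 2 ** 10, 2 ** 14)}
```

The errors shrink as the mesh decreases, in line with the summability that α + β > 1 provides; nothing comparable is available once α + β ≤ 1.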
The main insight of the theory of rough paths is that this seemingly insurmountable
barrier of α + β > 1 (which reduces to α > 1/2 in the case α = β which is our
main interest$^1$) can be broken by adding additional structure to the problem. Indeed,
for a rough path $\mathbf{X} = (X,\mathbb{X})$, we postulate the values $\mathbb{X}_{s,t}$ of the integral of $X$ against itself,
see (2.2). It is then intuitively clear that one should be able to define $\int Y\,dX$ in a
consistent way, provided that Y “looks like X”, at least on very small scales (in the
precise sense of (4.18) below). The easiest way for a function Y to “look like X”
is to have Yt = F (Xt ) for some sufficiently smooth F : V → L(V, W ), called a
one-form.
for r in some (small) interval [s, t], say. Recall (see sections 1.4 and 1.5 concerning
the infinite-dimensional case) that$^2$
\[ L(V, L(V,W)) \cong L(V \otimes V, W), \]
side of
\[ \int_0^1 F(X_s)\,dX_s \approx \sum_{[s,t]\in\mathcal{P}} \big( F(X_s)X_{s,t} + DF(X_s)\mathbb{X}_{s,t} \big), \tag{4.5} \]
does exist and call it the rough integral. In fact, in this section we shall construct the
(indefinite) rough integral $Z = \int_0^\cdot F(X)\,dX$ as an element of $C^\alpha$, i.e. as a path, similar
to the construction of stochastic integrals as processes rather than random variables.
Even this may not be sufficient in applications: one often wants to have an extended
meaning of the rough integral, such as $(Z,\mathbb{Z}) \in \mathscr{C}^\alpha$, a point of view emphasised in
[Lyo98, LQ02, LCL07], or something similar (such as “Z controlled by X” in the
sense of Definition 4.6 below, to be discussed in the next section).
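Lemma 4.1. Let $F : V \to L(V,W)$ be a $C_b^2$ function and let $(X,\mathbb{X}) \in \mathscr{C}^\alpha$ for some
$\alpha > \frac13$. Set $Y_s := F(X_s)$, $Y'_s := DF(X_s)$ and $R^Y_{s,t} := Y_{s,t} - Y'_s X_{s,t}$. Then
\[ \|Y\|_\alpha \le \|DF\|_\infty \|X\|_\alpha, \qquad \|Y'\|_\alpha \le \|D^2F\|_\infty \|X\|_\alpha, \qquad \|R^Y\|_{2\alpha} \le \tfrac12 \|D^2F\|_\infty \|X\|_\alpha^2. \]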
2
3 Recall that $\lim_{|\mathcal{P}|\to 0}$ means convergence along any sequence $(\mathcal{P}_n)$ with mesh $|\mathcal{P}_n| \to 0$, with
identical limit along each such sequence. In particular, it is not enough to establish convergence
along a particular sequence $(\mathcal{P}_n)$, although a particular sequence may be used to identify the limit.
4 Of course, we can and will consider intervals other than [0, 1]. Without further notice, P always
denotes a partition of the interval under consideration.
Proof. Cb2 regularity of F implies that F and DF are both Lipschitz continuous with
Lipschitz constants ∥DF ∥∞ and ∥D2 F ∥∞ respectively. The α-Hölder bounds on Y
and Y ′ are then immediate. For the remainder term, consider the function
A Taylor expansion, with intermediate value remainder, yields ξ ∈ (0, 1) such that
\[ R^Y_{s,t} = F(X_t) - F(X_s) - DF(X_s)X_{s,t} = \tfrac12 D^2F(X_s + \xi X_{s,t})(X_{s,t}, X_{s,t}). \]
The claimed 2α-Hölder estimate, in the sense that $|R^Y_{s,t}| \lesssim |t-s|^{2\alpha}$, then follows at
once. □
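The remainder bound of Lemma 4.1 can be checked numerically in the scalar case. The sketch below is only an illustration under assumed choices: V = W = ℝ, F = sin (so that $\|D^2F\|_\infty = 1$), and the path $X_t = t^{0.4}$ on [0, 1], whose 0.4-Hölder seminorm equals 1. The function name `remainder_ratio` is hypothetical.

```python
import math

def remainder_ratio(F, DF, X, alpha, n):
    """Largest |R^Y_{s,t}| / |t-s|^(2*alpha) over all grid pairs s < t in {0, 1/n, ..., 1}."""
    pts = [i / n for i in range(n + 1)]
    worst = 0.0
    for i, s in enumerate(pts):
        for t in pts[i + 1:]:
            incr = X(t) - X(s)
            # R^Y_{s,t} = Y_{s,t} - Y'_s X_{s,t} with Y = F(X), Y' = DF(X)
            rem = F(X(t)) - F(X(s)) - DF(X(s)) * incr
            worst = max(worst, abs(rem) / (t - s) ** (2 * alpha))
    return worst

ALPHA = 0.4
path = lambda t: t ** ALPHA              # assumed example path with ||X||_alpha = 1 on [0, 1]
ratio = remainder_ratio(math.sin, math.cos, path, ALPHA, 200)
bound = 0.5 * 1.0 * 1.0 ** 2             # (1/2) * ||D^2 F||_inf * ||X||_alpha^2
```

On the grid, `ratio` stays below `bound`, as the third estimate of the lemma predicts.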
Before we prove that the rough integral (4.6) exists, we discuss some sort of
abstract Riemann integration. In what follows, at first reading, one may have in mind
the construction of a Riemann–Stieltjes (or Young) integral $Z_t := \int_0^t Y_r\,dX_r$. From
Young’s inequality (4.3), one has (with $Z_{s,t} = Z_t - Z_s$ as usual)
and Ξs,t := Ys Xs,t is a sufficiently good local approximation in the sense that it
fully determines the integral Z via the limiting procedure given in (4.1)). In this
sense Z = IΞ is the well-defined image of Ξ under some abstract integration map
I. Note that Zs,t = Zs,u + Zu,t , i.e. increments are additive (or “multiplicative” if
one regards + as group operation5 ) whereas a similar property fails for Ξ. In the
language of [Lyo98], such a Ξ corresponds to an “almost multiplicative functional”
and it is a key result in the theory that there is a unique associated “multiplicative
functional” (here: Z = IΞ). Following [FdLP06] we call “sewing” the step from a
(good enough) local approximation Ξ to some (abstract) integral IΞ; the concrete
estimate which quantifies how well IΞ is approximated by Ξ will be called “sewing
lemma”. It plays an analogous role to Davie’s lemma (cf. Section 8.7) in the context
of (rough) differential equations.
We now formalise what we mean by Ξ being a good enough local approximation.
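For this, we introduce the space $C_2^{\alpha,\beta}([0,T],W)$ of functions Ξ from the 2-simplex
{(s, t) : 0 ≤ s ≤ t ≤ T } into W such that $\Xi_{t,t} = 0$ and such that
\[ \|\Xi\|_{\alpha,\beta} \overset{\text{def}}{=} \|\Xi\|_\alpha + \|\delta\Xi\|_\beta < \infty, \tag{4.8} \]
where $\|\Xi\|_\alpha = \sup_{s<t} \frac{|\Xi_{s,t}|}{|t-s|^\alpha}$ as usual, and also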
5
This terminology becomes natural if one considers Z together with its iterated integrals as
group-valued path, increments of which satisfy Chen’s “multiplicative” relation, see (2.8).
4.2 Integration of one-forms 65
Provided that β > 1, it turns out that such functions are “almost” of the form
Ξs,t = Ft − Fs , for some α-Hölder continuous function F (they would be if and
only if δΞ = 0). Indeed, it is possible to construct in a canonical way a function Ξ̂
with δ Ξ̂ = 0 and such that Ξ̂s,t ≈ Ξs,t for |t − s| ≪ 1:
Lemma 4.2 (Sewing lemma). Let α and β be such that 0 < α ≤ 1 < β. Then,
there exists a unique continuous linear map $I : C_2^{\alpha,\beta}([0,T],W) \to C^\alpha([0,T],W)$
such that $(I\Xi)_0 = 0$ and
\[ |(I\Xi)_{s,t} - \Xi_{s,t}| \le C|t-s|^\beta, \tag{4.9} \]
where C only depends on β and $\|\delta\Xi\|_\beta$. (The α-Hölder norm of $I\Xi$ also depends
on $\|\Xi\|_\alpha$ and hence on $\|\Xi\|_{\alpha,\beta}$.)
Because of its importance we give two independent but related arguments. The
first argument is based on successive (dyadic) refinement to construct $I_{s,t}$ with the
desired bound (4.9), followed by an argument for additivity. Fix [s, t] ⊂ [0, T ] and
let $\mathcal{P}_n$ be the level-n dyadic partition of [s, t], which contains $2^n$ intervals, each of
length $2^{-n}|t-s|$, starting with the trivial partition $\mathcal{P}_0 = \{[s,t]\}$. Define $I^0_{s,t} = \Xi_{s,t}$
and then the nth level approximation by
\[ I^{n+1}_{s,t} \overset{\text{def}}{=} \sum_{[u,v]\in\mathcal{P}_{n+1}} \Xi_{u,v} = I^n_{s,t} - \sum_{[u,v]\in\mathcal{P}_n} \delta\Xi_{u,m,v}, \]
where $m$ denotes the midpoint of $[u,v]$ and it is a straightforward exercise to check that the second equality holds. It then
follows immediately from the definition of $\|\delta\Xi\|_\beta$ that
\[ |I^{n+1}_{s,t} - I^n_{s,t}| \le 2^{n(1-\beta)} |t-s|^\beta \|\delta\Xi\|_\beta. \]
Since β > 1, these terms are summable whence we conclude that the sequence
$(I^n_{s,t} : n \in \mathbb{N})$ is Cauchy. Its limit $I_{s,t}$ is such that, summing up the bound above,
\[ |I_{s,t} - \Xi_{s,t}| \le \sum_{n\ge 0} |I^{n+1}_{s,t} - I^n_{s,t}| \le C\|\delta\Xi\|_\beta |t-s|^\beta, \tag{4.11} \]
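The dyadic construction can be traced numerically. The sketch below is an illustration under assumed choices: the two-parameter function $\Xi_{s,t} = s\,(t^2 - s^2)$, i.e. $Y_s X_{s,t}$ with $Y_t = t$ and $X_t = t^2$, for which $|\delta\Xi_{s,u,t}| \le \frac12 |t-s|^2$ on [0, 1] (so β = 2) and the sewn limit on [0, 1] is $\int_0^1 r\,d(r^2) = 2/3$. The helper names are hypothetical.

```python
def delta(Xi, s, u, t):
    """delta Xi_{s,u,t} = Xi_{s,t} - Xi_{s,u} - Xi_{u,t}."""
    return Xi(s, t) - Xi(s, u) - Xi(u, t)

def riemann(Xi, s, t, n):
    """Sum of Xi over the level-n dyadic partition of [s, t]."""
    k = 2 ** n
    pts = [s + (t - s) * i / k for i in range(k + 1)]
    return sum(Xi(u, v) for u, v in zip(pts, pts[1:]))

def dyadic_sewing(Xi, s, t, levels):
    """Return [I^0, ..., I^levels], built via I^{n+1} = I^n - sum over P_n of delta Xi_{u,m,v}."""
    approx = [Xi(s, t)]
    for n in range(levels):
        k = 2 ** n
        pts = [s + (t - s) * i / k for i in range(k + 1)]
        correction = sum(delta(Xi, u, 0.5 * (u + v), v) for u, v in zip(pts, pts[1:]))
        approx.append(approx[-1] - correction)
    return approx

# Assumed example: Xi_{s,t} = Y_s X_{s,t} with Y_t = t and X_t = t^2.
Xi = lambda s, t: s * (t * t - s * s)
levels = dyadic_sewing(Xi, 0.0, 1.0, 12)
```

Each entry `levels[n]` agrees with the direct Riemann sum `riemann(Xi, 0.0, 1.0, n)`, which is the "straightforward exercise" behind the second equality in the definition of $I^{n+1}_{s,t}$, and the sequence approaches 2/3.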
for some constant C depending only on β, which is precisely the required
bound (4.9). Unfortunately, additivity of I does not follow from this argument, so
we have to be a little smarter (but see Remark 4.3). Taking T = 1 without loss of
generality (and for notational simplicity only), we restrict the previous construction
to elementary dyadic intervals of the form $[s,t] = 2^{-k}[\ell,\ell+1]$ for some k ≥ 0 and
$\ell \in \{0,\dots,2^k-1\}$. The advantage is that now mid-point additivity holds in the
sense that
\[ I_{s,t} = I_{s,u} + I_{u,t}, \qquad u = \frac{s+t}{2}, \tag{4.12} \]
as a simple consequence of taking limits in the identity $I^{n+1}_{s,t} = I^n_{s,u} + I^n_{u,t}$. The
natural additive extension of I to non-elementary dyadic intervals $2^{-k}[\ell,m]$ is then
given by postulating that
\[ I_{2^{-k}\ell,\,2^{-k}m} = \sum_{j=\ell}^{m-1} I_{2^{-k}j,\,2^{-k}(j+1)}, \tag{4.13} \]
which is indeed well-defined (note that $2^{-k}[\ell,m] = 2^{-k-1}[2\ell,2m]$ for example
so (4.13) can be written in several ways) by (4.12). This defines $I_{s,t}$ for all dyadic
numbers s, t and the construction guarantees additivity. We leave the fact that $I_{s,t}$
satisfies (4.9) for all dyadic s, t (and therefore for all s, t ∈ [0, 1] by continuous
extension) as Exercise 4.3.
The second argument, which is essentially due to Young, yields immediately the
convergence (4.10), as |P| → 0, i.e. the same limit is obtained along any sequence
$\mathcal{P}_n$ with mesh tending to zero. This has the important consequence that additivity of
increments (δI = 0) is a consequence of (4.10) and requires no additional argument.
(Another advantage of Young’s construction is that it also works under variation-type
rather than Hölder-type assumptions and thus, in applications, allows one to deal with
jumps.) Consider a partition P of [s, t] and let r ≥ 1 be the number of intervals in P.
When r ≥ 2 there exists u ∈ [s, t] such that $[u_-,u], [u,u_+] \in \mathcal{P}$ and
\[ |u_+ - u_-| \le \frac{2}{r-1}|t-s|. \]
Indeed, assuming otherwise gives the contradiction $2|t-s| \ge \sum_{u\in\mathcal{P}^\circ} |u_+ - u_-| > 2|t-s|$.
Hence, $\big|\int_{\mathcal{P}\setminus\{u\}}\Xi - \int_{\mathcal{P}}\Xi\big| = |\delta\Xi_{u_-,u,u_+}| \le \|\delta\Xi\|_\beta \,(2|t-s|/(r-1))^\beta$
and by iterating this procedure until the partition is reduced to P = {[s, t]}, we arrive
at the maximal inequality,
\[ \sup_{\mathcal{P}\subset[s,t]} \Big| \Xi_{s,t} - \int_{\mathcal{P}} \Xi \Big| \le 2^\beta \|\delta\Xi\|_\beta\,\zeta(\beta)\,|t-s|^\beta, \]
But then, for any P with |P| ≤ ε we can use the maximal inequality to see that
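\[ \Big| \int_{\mathcal{P}}\Xi - \int_{\mathcal{P}'}\Xi \Big| \le 2^\beta\zeta(\beta)\|\delta\Xi\|_\beta \sum_{[u,v]\in\mathcal{P}} |v-u|^\beta = O\big(|\mathcal{P}|^{\beta-1}\big) = O(\varepsilon^{\beta-1}). \]
This concludes the Young argument (with no hidden tedium left to the reader). □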
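Young's point-removal procedure is easy to simulate. The sketch below is an illustration under assumed choices: a random partition of [0, 1] with 200 intervals and the example $\Xi_{s,t} = s\,(t^2 - s^2)$, for which β = 2 and $\|\delta\Xi\|_2 \le \frac12$ on [0, 1]. At each step it removes the interior point u with the smallest $|u_+ - u_-|$, records $\delta\Xi_{u_-,u,u_+}$, and checks the gap estimate $|u_+ - u_-| \le 2|t-s|/(r-1)$. The names `coarsen`, `DELTA_NORM` and `ZETA_2` are hypothetical.

```python
import math
import random

def delta(Xi, s, u, t):
    """delta Xi_{s,u,t} = Xi_{s,t} - Xi_{s,u} - Xi_{u,t}."""
    return Xi(s, t) - Xi(s, u) - Xi(u, t)

def coarsen(Xi, points):
    """Remove interior points one by one, always the one with the smallest |u_+ - u_-|.

    Returns the accumulated sum of delta Xi_{u_-,u,u_+} and a flag telling whether
    every removed point satisfied |u_+ - u_-| <= 2|t-s|/(r-1), r = number of intervals.
    """
    pts = list(points)
    s, t = pts[0], pts[-1]
    total, gaps_ok = 0.0, True
    while len(pts) > 2:
        r = len(pts) - 1
        j = min(range(1, r), key=lambda i: pts[i + 1] - pts[i - 1])
        gap = pts[j + 1] - pts[j - 1]
        gaps_ok = gaps_ok and gap <= 2 * (t - s) / (r - 1) + 1e-15
        total += delta(Xi, pts[j - 1], pts[j], pts[j + 1])
        del pts[j]
    return total, gaps_ok

rng = random.Random(0)
partition = [0.0] + sorted(rng.random() for _ in range(199)) + [1.0]
Xi = lambda s, t: s * (t * t - s * s)        # assumed example with beta = 2
riemann_sum = sum(Xi(u, v) for u, v in zip(partition, partition[1:]))
total, gaps_ok = coarsen(Xi, partition)

BETA, DELTA_NORM = 2.0, 0.5                  # |delta Xi_{s,u,t}| <= 0.5 |t-s|^2 on [0, 1]
ZETA_2 = math.pi ** 2 / 6                    # zeta(2)
bound = 2 ** BETA * DELTA_NORM * ZETA_2
```

The accumulated corrections telescope to $\Xi_{s,t} - \int_{\mathcal{P}}\Xi$, and their size stays below the right-hand side of the maximal inequality.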
Remark 4.3. The first argument ultimately suffered from the tedium of checking
the additivity property δIΞ = 0. In some situations this extra step can be avoided,
notably in the case where all one wants are uniform rough path estimates for classical
Riemann–Stieltjes integrals. More precisely, consider the case that X : [0, T ] → V
is smooth, $\mathbb{X} = \int X \otimes dX$, and one is only interested in an error estimate for second
order approximations of Riemann–Stieltjes integrals, of the form
\[ \Big| \int_s^t F(X_r)\,dX_r - F(X_s)X_{s,t} - DF(X_s)\mathbb{X}_{s,t} \Big| = O(|t-s|^{3\alpha}), \]
uniform over all (smooth) paths X with $\|X\|_\alpha + \|\mathbb{X}\|_{2\alpha}$ bounded. In the context of
the above proof, this estimate is contained in the first step, applied with (cf. the proof
of Theorem 4.4)
\[ \Xi_{s,t} = F(X_s)X_{s,t} + DF(X_s)\mathbb{X}_{s,t}. \]
But here we know already from classical Riemann integration theory that $(I\Xi)_{s,t}$,
constructed as limit of dyadic partitions of [s, t], is precisely the Riemann–Stieltjes
integral $\int_s^t F(X_r)\,dX_r$ and therefore additive. (The contribution of $DF(X)\mathbb{X}$ effectively
constitutes a higher-order approximation and surely does not affect the limit,
as can be seen from the estimate $|\mathbb{X}_{u,v}| \lesssim |v-u|^2$, thanks to smoothness of X.)
We now apply the sewing lemma to the construction of (4.6).
Theorem 4.4 (Lyons). Let $\mathbf{X} = (X,\mathbb{X}) \in \mathscr{C}^\alpha([0,T],V)$ for some T > 0 and
$\alpha > \frac13$, and let $F : V \to L(V,W)$ be a $C_b^2$ function. Then, the rough integral defined
in (4.6) exists and one has the bound
\[ \Big| \int_s^t F(X_r)\,dX_r - F(X_s)X_{s,t} - DF(X_s)\mathbb{X}_{s,t} \Big| \lesssim \|F\|_{C_b^2}\big( \|X\|_\alpha^3 + \|X\|_\alpha\|\mathbb{X}\|_{2\alpha} \big)|t-s|^{3\alpha}, \tag{4.15} \]
where the constant C only depends on T and α and can be chosen uniformly in
T ≤ 1. Furthermore, $|||\mathbf{X}|||_\alpha = \|X\|_\alpha + \sqrt{\|\mathbb{X}\|_{2\alpha}}$ denotes again the homogeneous
α-Hölder rough path norm.
Remark 4.5. We will see in Section 4.4 that the map $(X,\mathbb{X}) \in \mathscr{C}^\alpha \mapsto \int_0^\cdot F(X)\,dX \in C^\alpha$
is continuous in α-Hölder rough path metric.
Proof. Let us stress the fact that the argument given here only relies on the properties
of the integrand Y = F (X) collected in Lemma 4.1 above. In particular, the general-
isation to “extended” integrands (Y, Y ′ ), which replace (F (X), DF (X)), subject to
(4.7), will be immediate. (We shall develop this “Gubinelli” point of view further in
Section 4.3 below.)
The result follows as a consequence of Lemma 4.2. With the notation that we just
introduced, the classical Young integral [You36] can be defined as the usual limit of
Riemann sums by
\[ \int_s^t Y_r\,dX_r = (I\Xi)_{s,t}, \qquad \Xi_{s,t} = Y_s X_{s,t}. \]
so that, except in trivial cases, the required bound (4.8) is satisfied only if Y and
X are Hölder continuous with Hölder exponents adding up to β > 1. In order to
be able to cover the situation $\alpha < \frac12$, it follows that we need to consider a better
approximation to the Riemann sums, as discussed above. To this end, we use the
notation from Lemma 4.1, namely
and then set $\Xi_{s,t} = Y_s X_{s,t} + Y'_s \mathbb{X}_{s,t}$. Note that, for any u ∈ (s, t), we have the
identity
\[ \delta\Xi_{s,u,t} = -R^Y_{s,u} X_{u,t} - Y'_{s,u} \mathbb{X}_{u,t}. \]
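The identity for $\delta\Xi$ is purely algebraic: it uses only the definition of $R^Y$ and Chen's relation. The sketch below checks it in the scalar case under assumed choices: the second level is taken to be $\mathbb{X}_{s,t} = \frac12 X_{s,t}^2$ (which satisfies Chen's relation in one dimension), the path is $X_r = \sqrt{r}\cos(7r)$, and F = sin. The function name `check_identity` is hypothetical.

```python
import math
import random

def check_identity(X, F, DF, trials=1000, seed=0):
    """Largest violation of  delta Xi_{s,u,t} = -R^Y_{s,u} X_{u,t} - Y'_{s,u} XX_{u,t}
    over random triples s < u < t in [0, 1], for Xi_{s,t} = Y_s X_{s,t} + Y'_s XX_{s,t}."""
    rng = random.Random(seed)
    Y = lambda r: F(X(r))
    Yp = lambda r: DF(X(r))
    inc = lambda s, t: X(t) - X(s)
    lift = lambda s, t: 0.5 * inc(s, t) ** 2          # assumed second level XX_{s,t}
    Xi = lambda s, t: Y(s) * inc(s, t) + Yp(s) * lift(s, t)
    worst = 0.0
    for _ in range(trials):
        s, u, t = sorted(rng.random() for _ in range(3))
        lhs = Xi(s, t) - Xi(s, u) - Xi(u, t)
        rem = Y(u) - Y(s) - Yp(s) * inc(s, u)         # R^Y_{s,u}
        rhs = -rem * inc(u, t) - (Yp(u) - Yp(s)) * lift(u, t)
        worst = max(worst, abs(lhs - rhs))
    return worst

worst = check_identity(lambda r: math.sqrt(r) * math.cos(7 * r), math.sin, math.cos)
```

The discrepancy `worst` is at the level of floating-point rounding, regardless of how irregular the chosen path is.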
Thanks to the α-Hölder regularity of $X$, $Y'$ and the 2α-regularity of $R^Y$, $\mathbb{X}$, the triangle
inequality shows that (4.8) holds true with the given α > 1/3 and β := 3α > 1. The
4.3 Integration of controlled rough paths 69
which is the claimed estimate (4.16) in the limit α ↓ 1/3. However, one can do better
by realising that the above estimate is best for |t − s| small, whereas for t − s large
it is better to split up |Zs,t | into the sum of small increments. To make this more
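precise, set $\varrho := |||\mathbf{X}|||_\alpha$ and write (hiding the factor C = C(α, T ) in ≲ below)
\[ |Z_{s,t}| \lesssim \varrho|t-s|^\alpha + \varrho^2|t-s|^{2\alpha} + \varrho^3|t-s|^{3\alpha} \le 3\varrho|t-s|^\alpha \quad \text{for } \varrho^{1/\alpha}|t-s| \le 1. \]
Increments of Z over [s, t] with length greater than $h := \varrho^{-1/\alpha}$ are handled by
cutting them into pieces of length h. More precisely (cf. Exercise 4.5) we have
$\|Z\|_{\alpha;h} \le 3\varrho$ which entails
\[ \|Z\|_\alpha \le 3\varrho\big( 1 \vee 2h^{-(1-\alpha)} \big) \le 6\big( \varrho \vee \varrho^{1/\alpha} \big). \]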
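The passage from a local to a global Hölder bound can be observed on a grid. The sketch below is an illustration under assumed choices: T = 1, the path $Z_t = \sin(40t)$, α = 1/2, and h equal to 20 grid steps of size 1/400, so that cutting an interval into pieces of length h stays on the grid. The helper name `holder_seminorms` is hypothetical.

```python
import math

def holder_seminorms(Z, alpha, h_steps, n):
    """Return (local, global) alpha-Hoelder seminorms of Z over the grid {i/n} in [0, 1];
    'local' only uses pairs with t - s <= h_steps / n."""
    vals = [Z(i / n) for i in range(n + 1)]
    local = glob = 0.0
    for i in range(n):
        for j in range(i + 1, n + 1):
            q = abs(vals[j] - vals[i]) / ((j - i) / n) ** alpha
            glob = max(glob, q)
            if j - i <= h_steps:
                local = max(local, q)
    return local, glob

ALPHA, N, H_STEPS = 0.5, 400, 20
local, glob = holder_seminorms(lambda t: math.sin(40 * t), ALPHA, H_STEPS, N)
h = H_STEPS / N
patched_bound = local * max(1.0, 2 * h ** (-(1 - ALPHA)))   # M (1 v 2 h^{-(1-alpha)})
```

Here `glob` is bounded by `patched_bound`, the discrete counterpart of the implication used above with M = `local`.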
Motivated by Lemma 4.1 and the observation that rough integration essentially relies
on the properties (4.7) we introduce the notion of a controlled path Y , relative to
some “reference” path X, due to Gubinelli [Gub04]. For the sake of the following
definition we assume that Y takes values in some Banach space, say W̄ . When
it comes to the definition of a rough integral we typically take W̄ = L(V, W );
although other choices can be useful (see e.g. Remark 4.12). In the context of rough
satisfies $\|R^Y\|_{2\alpha} < \infty$. This defines the space of controlled rough paths,
\[ (Y,Y') \in \mathscr{D}_X^{2\alpha}([0,T],\bar W). \]
Although Y ′ is not, in general, uniquely determined from Y (cf. Remark 4.7 and
Section 6 below) we call any such Y ′ the Gubinelli derivative of Y (with respect to
X).
Here, $R^Y_{s,t}$ takes values in $\bar W$, and the norm $\|\cdot\|_{2\alpha}$ for a function with two
arguments is given by (2.3) as before. We endow the space $\mathscr{D}_X^{2\alpha}$ with the seminorm
Remark 4.7. Since we only assume that ∥Y ∥α < ∞, but then impose that ∥RY ∥2α <
∞, it is in general the case that a genuine cancellation takes place in (4.18). The
question arises to what extent Y determines Y ′ . Somewhat contrary to the classical
situation, where a smooth function has a unique derivative, too much regularity of
the underlying rough path X leads to less information about Y ′ . For instance, if Y is
smooth, or in fact in C 2α , and the underlying rough path X happens to have a path
component X that is also C 2α , then we may take Y ′ = 0, but as a matter of fact
any continuous path Y ′ would satisfy (4.18) with ∥R∥2α < ∞. On the other hand,
if X is far from smooth, i.e. genuinely rough on all (small) scales, uniformly in all
directions, then Y ′ is uniquely determined by Y , cf. Section 6 below.
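Remark 4.8. It is important to note that while the space of rough paths $\mathscr{C}^\alpha$ is not
even a vector space, the space $\mathscr{D}_X^{2\alpha}$ is a perfectly normal Banach space for any given
$\mathbf{X} = (X,\mathbb{X}) \in \mathscr{C}^\alpha$. The twist of course is that the space in question depends in a
crucial way on the choice of X. The set of all pairs (X; (Y, Y ′ )) gives rise to the total
space
\[ \mathscr{C}^\alpha \ltimes \mathscr{D}^{2\alpha} \overset{\text{def}}{=} \bigsqcup_{\mathbf{X}\in\mathscr{C}^\alpha} \{\mathbf{X}\} \times \mathscr{D}_X^{2\alpha}, \]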
Remark 4.9. While the notion of “controlled rough path” has many appealing fea-
tures, it does not come with a natural approximation theory. To wit, consider
$(X,\mathbb{X}) \in \mathscr{C}_g^\alpha([0,T],\mathbb{R}^d)$ as limit of smooth paths $X_n : [0,T] \to \mathbb{R}^d$ in the sense
of Proposition 2.8. Then it is natural to approximate Y = F (X) by Yn = F (Xn ),
which is again smooth (to the extent that F permits). There is no obvious analogue
of this for controlled rough paths. However, there is a non-canonical approximation
result, based on the Lyons–Victoir extension, which the reader is invited to explore
in Exercise 4.8.
where we took $\bar W = L(V,W)$ and used the canonical injection $L(V,L(V,W)) \hookrightarrow L(V\otimes V,W)$
in writing $Y'_s\mathbb{X}_{s,t}$. With these notations, the resulting integral takes
values in W .
With these notations at hand, it is now straightforward to prove the following
result, which is a slight reformulation of [Gub04, Prop.1]:
Theorem 4.10 (Gubinelli). Let T > 0, let $\mathbf{X} = (X,\mathbb{X}) \in \mathscr{C}^\alpha([0,T],V)$ for some
$\alpha \in (\frac13,\frac12]$, and let $(Y,Y') \in \mathscr{D}_X^{2\alpha}([0,T],L(V,W))$. Then there exists a constant
C depending only on α such that
a) The integral defined in (4.21) exists and, for every pair s, t, one has the bound
\[ \Big| \int_s^t Y_r\,dX_r - Y_sX_{s,t} - Y'_s\mathbb{X}_{s,t} \Big| \le C\big( \|X\|_\alpha\|R^Y\|_{2\alpha} + \|\mathbb{X}\|_{2\alpha}\|Y'\|_\alpha \big)|t-s|^{3\alpha}. \tag{4.22} \]
b) The map from $\mathscr{D}_X^{2\alpha}([0,T],L(V,W))$ to $\mathscr{D}_X^{2\alpha}([0,T],W)$ given by
\[ (Y,Y') \mapsto (Z,Z') := \Big( \int_0^\cdot Y_t\,dX_t,\; Y \Big), \tag{4.23} \]
6
Note the abuse of notation: we hide dependence on Y ′ which in general affects the limit but is
usually clear from the context.
is a continuous linear map between Banach spaces and one has the bound 7
Remark 4.11. One actually obtains better information than just $(Z,Z') \in \mathscr{D}_X^{2\alpha}$,
namely one has control up to order 3α in the sense that
\[ |Z_{s,t} - Y_sX_{s,t} - Y'_s\mathbb{X}_{s,t}| \lesssim |t-s|^{3\alpha}, \]
see (4.34). Similar consideration will lead to the more general concept of modelled
distribution in the theory of regularity structures, see in particular Definition 13.10.
Remark 4.12. As in the above theorem, assume that $(X,\mathbb{X}) \in \mathscr{C}^\alpha([0,T],V)$ and
consider Y and Z two paths controlled by X. More precisely, we assume
$(Y,Y') \in \mathscr{D}_X^{2\alpha}([0,T],L(\bar V,W))$ and $(Z,Z') \in \mathscr{D}_X^{2\alpha}([0,T],\bar V)$, where of course $V$, $\bar V$, $W$ are
all Banach spaces. Then, in terms of the abstract integration map I (cf. the sewing
lemma) we may define the integral of Y against Z, with values in W , as follows,
\[ \int_s^t Y_u\,dZ_u \overset{\text{def}}{=} (I\Xi)_{s,t}, \qquad \Xi_{u,v} = Y_uZ_{u,v} + Y'_uZ'_u\mathbb{X}_{u,v}. \tag{4.24} \]
Here, we use the fact that Zu′ ∈ L(V, V̄ ) can be canonically identified with an opera-
tor in L(V ⊗V, V ⊗ V̄ ) by acting only on the second factor, and Yu′ ∈ L(V, L(V̄ , W ))
is identified as before with an operator in L(V ⊗ V̄ , W ). The reader may be helped
to see this spelled out in coordinates, assuming finite dimensions: using indices i, j
in W, V̄ respectively, and then k, l in V :
\[ (\Xi_{u,v})^i = (Y_u)^i_j\,(Z_{u,v})^j + (Y'_u)^i_{k,j}\,(Z'_u)^j_l\,(\mathbb{X}_{u,v})^{k,l}. \]
A short computation, similar to the one that justified the application of the sewing
lemma for the construction of the rough integral introduced in (4.21), gives
\[ -\delta\Xi_{s,u,t} = R^Y_{s,u}Z_{u,t} + Y'_sX_{s,u}R^Z_{u,t} + Y'_sX_{s,u}Z'_{s,u}X_{u,t} + (Y'Z')_{s,u}\mathbb{X}_{u,t}. \]
7
As in (4.20), this implies ∥Z, Z ′ ∥X,2α ≲ |Y0′ | + T α ∥Y, Y ′ ∥X,2α , uniformly over bounded X.
It immediately follows that ∥δΞ∥3α < ∞ so that, since 3α > 1, the right-hand
side of (4.24) is well defined. The sewing lemma furthermore yields the following
generalisation of (4.22), with Ξ as given in (4.24),
\[ \Big| \int_s^t Y\,dZ - \Xi_{s,t} \Big| \lesssim \big( \|R^Y\|_{2\alpha}\|Z\|_\alpha + (*) + \|Y'Z'\|_\alpha\|\mathbb{X}\|_{2\alpha} \big)|t-s|^{3\alpha}, \tag{4.25} \]
Note that (∗) duly vanishes when Z = X and Z ′ is the identity operator, since then
RZ ≡ 0 and Z ′ , constant in time, has vanishing α-Hölder seminorm. In that case, we
recover precisely the previously obtained estimate for the rough integral introduced
in (4.21). Furthermore, in the smooth case, one can check that we again recover the
usual Riemann / Young integral.
Remark 4.13. If, in the notation of the proof of Theorem 4.4, Ξ and Ξ̃ are such that
Ξ − Ξ̃ ∈ C2β for some β > 1, i.e.
which converges to 0 as |P| → 0. (This remains true if O(|t − s|β ) with β > 1 is
replaced by o(|t − s|).)
This also shows that, if X and Y are smooth functions and $\mathbb{X}$ is defined by (2.2),
the integral that we just defined does coincide with the usual Riemann–Stieltjes
integral. However, if we change $\mathbb{X}$, then the resulting integral does change, as will be
seen in the next example.
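The dependence of the integral on the second level can be seen already for a smooth scalar path. The sketch below is an illustration under assumed choices: $X_t = t$ on [0, 1], $F(x) = x^2$, the canonical second level $\mathbb{X}_{s,t} = \frac12 X_{s,t}^2$, and a modified one $\bar{\mathbb{X}}_{s,t} = \mathbb{X}_{s,t} + c\,(t-s)$ with c = 1/2. The modification is an additive increment, so Chen's relation still holds. The helper name `compensated_sum` is hypothetical.

```python
def compensated_sum(F, DF, X, lift, n, T=1.0):
    """Compensated Riemann sum of F(X_s) X_{s,t} + DF(X_s) lift(s, t) over a uniform partition."""
    pts = [T * i / n for i in range(n + 1)]
    return sum(F(X(s)) * (X(t) - X(s)) + DF(X(s)) * lift(s, t)
               for s, t in zip(pts, pts[1:]))

X = lambda t: t                      # assumed smooth example path
F = lambda x: x * x
DF = lambda x: 2 * x

canonical = lambda s, t: 0.5 * (X(t) - X(s)) ** 2      # second level of a smooth scalar path
C = 0.5                                                # assumed size of the perturbation
shifted = lambda s, t: canonical(s, t) + C * (t - s)   # same path X, different second level

I_canonical = compensated_sum(F, DF, X, canonical, 1000)
I_shifted = compensated_sum(F, DF, X, shifted, 1000)
```

With the canonical second level the sums approach the Riemann–Stieltjes value $\int_0^1 t^2\,dt = 1/3$, while the shifted second level adds approximately $c\int_0^1 DF(X_r)\,dr = c$.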
Example 4.14. Let f be a 2α-Hölder continuous function and let $\mathbf{X} = (X,\mathbb{X})$ and
$\bar{\mathbf{X}} = (\bar X,\bar{\mathbb{X}})$ be two rough paths such that
Here, the second term on the right-hand side is a simple Young integral, which is
well-defined since α + 2α > 1 by assumption.
Remark 4.15. As we will see in Section 5.2 below, (4.26) can be interpreted as a
generalisation of the usual expression relating Itô integrals to Stratonovich integrals.
Remark 4.16. The bound (4.22) does behave in a very natural way under dilations.
Indeed, the integral is invariant under the transformation
The same is true for the right-hand side of (4.22), since under this dilation, we also
have $R^Y \mapsto \lambda^{-1}R^Y$.
will be useful. Even when $\mathbf{X} = \tilde{\mathbf{X}}$, it is not a proper metric for it fails to separate
$(Y,Y')$ and $(Y + cX + \bar c,\, Y' + c)$ for any two constants $c$ and $\bar c$. When $\mathbf{X} \ne \tilde{\mathbf{X}}$,
the assertion “zero distance implies $(Y,Y') = (\tilde Y,\tilde Y')$” does not even make sense.
(The two objects live in completely different spaces!) That said, for every fixed
$(X,\mathbb{X}) \in \mathscr{C}^\alpha$, one has (with $R^Y_{s,t} = Y_{s,t} - Y'_sX_{s,t}$ as usual), a canonical map
\[ \iota_X : (Y,Y') \in C_X^\alpha \mapsto (Y',R^Y) \in C^\alpha \oplus C_2^{2\alpha}. \]
Given $Y_0 = \xi$, this map is injective since one can reconstruct Y by $Y_t = \xi + Y'_0X_{0,t} + R^Y_{0,t}$.
From this point of view, one simply has
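provided $|Y'_0|$, $\|Y'\|_\infty$, $\|X\|_\alpha$, and also with tilde, are bounded by R. It follows that
\[ \|Y - \tilde Y\|_\alpha \le C\big( \|X - \tilde X\|_\alpha + |Y'_0 - \tilde Y'_0| + T^\alpha \|Y,Y';\tilde Y,\tilde Y'\|_{X,\tilde X,2\alpha} \big). \tag{4.30} \]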
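$\mathbf{X} = (X,\mathbb{X})$, $\tilde{\mathbf{X}} = (\tilde X,\tilde{\mathbb{X}}) \in \mathscr{C}^\alpha$, $(Y,Y') \in \mathscr{D}_X^{2\alpha}$, $(\tilde Y,\tilde Y') \in \mathscr{D}_{\tilde X}^{2\alpha}$ in a bounded
set, in the sense
\[ (Z,Z') := \Big( \int_0^\cdot Y\,dX,\; Y \Big) \in \mathscr{D}_X^{2\alpha}, \]
and similarly for $(\tilde Z,\tilde Z')$. Then, the following local Lipschitz estimates hold true,
\[ \|Z,Z';\tilde Z,\tilde Z'\|_{X,\tilde X,2\alpha} \le C\big( \varrho_\alpha(\mathbf{X},\tilde{\mathbf{X}}) + |Y'_0 - \tilde Y'_0| + T^\alpha \|Y,Y';\tilde Y,\tilde Y'\|_{X,\tilde X,2\alpha} \big), \tag{4.31} \]
and also
\[ \|Z - \tilde Z\|_\alpha \le C\big( \varrho_\alpha(\mathbf{X},\tilde{\mathbf{X}}) + |Y_0 - \tilde Y_0| + |Y'_0 - \tilde Y'_0| + T^\alpha \|Y,Y';\tilde Y,\tilde Y'\|_{X,\tilde X,2\alpha} \big), \tag{4.32} \]
where $C = C_M = C(M,\alpha)$ is a suitable constant.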
Proof. (The reader is advised to review the proofs of Theorems 4.4, 4.10.) We first
note that (4.30) applied to $Z$, $\tilde Z$ (note: $Z'_0 - \tilde Z'_0 = Y_0 - \tilde Y_0$) shows that (4.32) is an
immediate consequence of the first estimate (4.31). Thus, we only need to discuss
the first estimate. By definition of $d_{X,\tilde X,2\alpha}$, we need to estimate
\[ \|Z' - \tilde Z'\|_\alpha + \|R^Z - R^{\tilde Z}\|_{2\alpha} = \|Y - \tilde Y\|_\alpha + \|R^Z - R^{\tilde Z}\|_{2\alpha}. \]
Thanks to (4.30), the first summand is clearly bounded by the right-hand side of
(4.31). For the second summand we recall
\[ R^Z_{s,t} = Z_{s,t} - Z'_sX_{s,t} = \int_s^t Y\,dX - Y_sX_{s,t} = (I\Xi)_{s,t} - \Xi_{s,t} + Y'_s\mathbb{X}_{s,t}, \]
where $\Xi_{s,t} = Y_sX_{s,t} + Y'_s\mathbb{X}_{s,t}$ and similarly for $R^{\tilde Z}$. Setting $\Delta = \Xi - \tilde\Xi$, we use
(4.11) with β = 3α and Ξ replaced by ∆, so that
\[ |R^Z_{s,t} - R^{\tilde Z}_{s,t}| = \big| (I\Delta)_{s,t} - \Delta_{s,t} + Y'_s\mathbb{X}_{s,t} - \tilde Y'_s\tilde{\mathbb{X}}_{s,t} \big| \le C\|\delta\Delta\|_{3\alpha}|t-s|^{3\alpha} + \big| Y'_s\mathbb{X}_{s,t} - \tilde Y'_s\tilde{\mathbb{X}}_{s,t} \big|, \]
where $\delta\Delta_{s,u,t} = R^{\tilde Y}_{s,u}\tilde X_{u,t} - R^Y_{s,u}X_{u,t} + \tilde Y'_{s,u}\tilde{\mathbb{X}}_{u,t} - Y'_{s,u}\mathbb{X}_{u,t}$. We then conclude with
some elementary estimates of the type (4.29), just like in the proof of Theorem 4.10. □
Recall that we showed in Section 2.3 how an α-Hölder rough path X could be defined
as a path with values in the free step-N nilpotent Lie group G(N ) (Rd ) ⊂ T (N ) (Rd ),
with N = ⌊1/α⌋. It does not seem obvious at all a priori how one would define a
controlled rough path in this context. One way of interpreting Definition 4.6 is as a
kind of local “Taylor expansion” up to order 2α. It seems natural in the light of the
previous subsections that if $\alpha \le \frac13$, a controlled rough path should have a kind of
“Taylor expansion” up to order N α.
As a consequence, if we expand $\mathbf{X}_{s,t} \overset{\text{def}}{=} \mathbf{X}_s^{-1} \otimes \mathbf{X}_t$ as
\[ \mathbf{X}_{s,t} = \sum_{|w|\le N} \mathbf{X}^w_{s,t}\,e_w, \]
where |w| denotes the length of the word w, one would expect that a controlled rough
path should have an expansion of the form
\[ \delta Y_{s,t} = \sum_{|w|\le N-1} Y^w_s\,\mathbf{X}^w_{s,t} + R^Y_{s,t}, \tag{4.33} \]
with $|R^Y_{s,t}| \lesssim |t-s|^{N\alpha}$. Here, given a word $w = w_1\cdots w_k$ with letters in {1, . . . , d},
we write $e_w = e_{w_1} \otimes \cdots \otimes e_{w_k}$ for the corresponding basis vector of $T^{(N)}(\mathbb{R}^d)$. As in
Section 2.4, we then identify the words themselves as the dual basis of $T^{(N)}(\mathbb{R}^d)^*$.
Note that $e_\emptyset = 1 \in \mathbb{R} \simeq (\mathbb{R}^d)^{\otimes 0} \subset T^{(N)}(\mathbb{R}^d)$.
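The word-indexed expansion can be made concrete with a small truncated tensor algebra. The sketch below is an illustration under assumed choices: d = 2, N = 3, elements of $T^{(N)}(\mathbb{R}^d)$ stored as dictionaries from words (tuples of letters) to coefficients, and the signature of a straight line segment with increment v given by the truncated exponential, i.e. coefficient $v_{w_1}\cdots v_{w_k}/k!$ on the word w. All function names are hypothetical.

```python
from itertools import product
from math import factorial, prod

def words(d, N):
    """All words of length <= N over the alphabet {1, ..., d}, the empty word included."""
    return [w for k in range(N + 1) for w in product(range(1, d + 1), repeat=k)]

def exp_segment(v, N):
    """Truncated signature of a straight line with increment v: coefficient prod(v_{w_i}) / |w|!."""
    return {w: prod(v[i - 1] for i in w) / factorial(len(w)) for w in words(len(v), N)}

def tensor(a, b):
    """Truncated tensor product: (a (x) b)^w = sum over splittings w = uv of a^u b^v."""
    return {w: sum(a[w[:cut]] * b[w[cut:]] for cut in range(len(w) + 1)) for w in a}

N = 3
first = exp_segment([1.0, 2.0], N)           # assumed first linear piece
second = exp_segment([0.5, -1.0], N)         # assumed second linear piece
sig = tensor(first, second)                  # Chen: signature of the concatenated path

back = tensor(exp_segment([1.0, 2.0], N), exp_segment([-1.0, -2.0], N))   # X (x) X^{-1}
```

The level-one coefficients of `sig` add up the increments, the products of coefficients obey the shuffle relation expected of a geometric rough path (for instance `sig[(1,)] * sig[(2,)] == sig[(1, 2)] + sig[(2, 1)]` up to rounding), and `back` reduces to the unit element $e_\emptyset$.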
Recall that in Definition 4.6 we also needed a regularity condition on the “deriva-
tive process” Y ′ . The equivalent statement in the present context is that the Ysw
should themselves be described by a local “Taylor expansion”, but this time only up
to order (N − |w|)α. A neat way of packaging this into a compact statement is to
view a controlled rough path as a T (N −1) (Rd )∗ -valued function. Definition 4.6 then
generalises as follows.8
Definition 4.18. Let α ∈ (0, 1), let N = ⌊1/α⌋, and let $\mathbf{X}$ be a geometric α-Hölder
rough path as defined in Section 2.4. A controlled rough path is a $T^{(N-1)}(\mathbb{R}^d)^*$-valued
function $\mathbf{Y}$ such that, for every word w with |w| ≤ N − 1, one has the
bound
\[ \big| \langle e_w, \mathbf{Y}_t \rangle - \langle \mathbf{X}_{s,t} \otimes e_w, \mathbf{Y}_s \rangle \big| \le C|t-s|^{(N-|w|)\alpha}. \tag{4.34} \]
8
This is for Y with values in R, but the extension to vector-valued Y is straightforward.
4.6 Stochastic sewing 77
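We call $\mathbf{Y}$ a lift of $Y_t := \langle e_\emptyset, \mathbf{Y}_t \rangle$ and write $\mathscr{D}_X^{N\alpha}$ for the space of such controlled
rough paths.
It is convenient to write $\mathbf{Y}^w_t$ instead of $\langle e_w, \mathbf{Y}_t \rangle$. Given such a controlled rough path
$\mathbf{Y}$, it is then natural to define its integral against any component $X^i$ by
\[ Z_t \overset{\text{def}}{=} \int_0^t Y_s\,dX^i_s = \lim_{|\mathcal{P}|\to 0} \sum_{[r,s]\in\mathcal{P}} \sum_{|w|\le N-1} \mathbf{Y}^w_r\,\langle \mathbf{X}_{r,s}, wi \rangle, \tag{4.35} \]
where wi denotes the concatenation of w with the letter i. It turns out [Gub10, HK15]
that Z can be lifted as controlled rough path $\mathbf{Z}$ in the sense of Definition 4.18. It
suffices to set $\mathbf{Z}^\emptyset_t \overset{\text{def}}{=} \langle e_\emptyset, \mathbf{Z}_t \rangle = Z_t$,
\[ \langle e_w \otimes e_i, \mathbf{Z}_t \rangle \overset{\text{def}}{=} \mathbf{Y}^w_t, \]
and $\mathbf{Z}^w_t = 0$ for all non-empty words w that do not terminate with the letter i.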
We saw in Theorem 4.10 that suitably controlled integrands, such as $F(B)$, $F \in C_b^2$,
can be integrated against a Brownian rough path $\mathbf{B} = (B,\mathbb{B})$, as constructed in
Chapter 3. In this case (see the proof of Theorem 4.4) one applies the sewing lemma
with $\tilde\Xi(s,t) = F(B_s)B_{s,t} + DF(B_s)\mathbb{B}_{s,t}$, crucially using that $\delta\tilde\Xi$ is of order
3α = 1 + ε > 1, in the sense that $|\delta\tilde\Xi_{sut}| \lesssim |t-s|^{1+\varepsilon}$ uniformly over s < u < t
in [0, T ]. We leave it to Chapter 5 to reconcile this construction a posteriori with
classical stochastic integration. In the present section we show that stochastic and
rough analysis can also be combined a priori; the resulting stochastic sewing lemma
obtained by K. Lê in [Lê18] has proved very useful in a number of recent applications.
The setting is similar to that of the sewing lemma, but the to-be-sewed two-parameter
function Ξ is now a sufficiently integrable random field. As a running example,
consider the Itô left point approximation $\Xi_{s,t} = F(B_s)B_{s,t}$. With this choice
of Ξ (i.e. without the term $DF(B_s)\mathbb{B}_{s,t}$), classical sewing fails since
$\delta\Xi_{s,u,t} = -F(B)_{s,u}B_{u,t}$ is at best of order 2α < 1. Note however that the martingale property
of Brownian motion makes this problem disappear upon inserting a conditional
expectation. Indeed, writing $\mathbf{E}_s$ for the conditional expectation with respect to $\mathcal{F}_s$ for
some fixed filtration $\mathcal{F} = (\mathcal{F}_t)_{t\le T}$ such that B is $\mathcal{F}$-adapted we have, always with
s < u < t,
This is of course very similar to the reason why classical Itô integration works: even
though $\Xi_{s,t}$ is of size about $|t-s|^{1/2}$ so that there is no reason a priori to believe that
Riemann sums converge, they do so thanks to the stochastic cancellations encoded in
the fact that $\mathbf{E}_s\Xi_{s,t} = 0$. The idea now is to obtain a version of the sewing lemma
which combines the “best of both worlds”: its assumptions should be strictly weaker
than those of Lemma 4.2 and it should exploit improvements from situations in which
the conditional expectation of an expression is much smaller than the expression
itself.
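The stochastic cancellation can be observed in a small simulation. The sketch below is an illustration under assumed choices: F(x) = x, so that $\Xi_{s,t} = B_sB_{s,t}$, the interval [0, 1], and the classical Itô value $\frac12(B_1^2 - 1)$ as reference limit. It estimates the root mean square error of the left-point sums over 200 simulated paths at two mesh sizes; the function names and the sample sizes are hypothetical.

```python
import math
import random

def ito_sum_error(increments):
    """Left-point sum of B_{t_i} (B_{t_{i+1}} - B_{t_i}) minus (B_1^2 - 1)/2,
    for a path on [0, 1] described by its successive increments."""
    b, s = 0.0, 0.0
    for db in increments:
        s += b * db
        b += db
    return s - 0.5 * (b * b - 1.0)

def rms_errors(n_fine=1024, block=16, paths=200, seed=0):
    """Root mean square errors on a coarse (n_fine / block intervals) and a fine partition,
    computed from the same simulated Brownian increments."""
    rng = random.Random(seed)
    sd = math.sqrt(1.0 / n_fine)
    fine_sq = coarse_sq = 0.0
    for _ in range(paths):
        fine = [rng.gauss(0.0, sd) for _ in range(n_fine)]
        coarse = [sum(fine[i:i + block]) for i in range(0, n_fine, block)]
        fine_sq += ito_sum_error(fine) ** 2
        coarse_sq += ito_sum_error(coarse) ** 2
    return math.sqrt(coarse_sq / paths), math.sqrt(fine_sq / paths)

rms_coarse, rms_fine = rms_errors()
```

Although each summand is only of size $|t-s|^{1/2}$, the L² error decays as the mesh is refined (for n intervals it equals $(2n)^{-1/2}$ in this example), which is the behaviour the stochastic sewing lemma is designed to capture.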
Throughout this section, we assume that we are working with $L^2$ random variables
on a filtered probability space $(\Omega, (\mathcal{F}_t)_{0\le t\le T}, \mathbf{P})$ and we write $L^2_s$ for the space of
$\mathcal{F}_s$-measurable square integrable random variables. We also write as usual
$\|X\|_{L^2} \overset{\text{def}}{=} (\mathbf{E}X^2)^{1/2}$. In fact, using the Burkholder–Davis–Gundy inequality, it is not difficult
to extend the following results to an $L^q$ setting with 2 ≤ q < ∞.
Proposition 4.19 (Stochastic Sewing Lemma). Let $(s,t) \mapsto \Xi_{s,t} \in L^2_t$ for 0 ≤
s ≤ t ≤ T be continuous (viewed as a map with values in $L^2$) with $\Xi_{t,t} = 0$ for
all t. Suppose that there are constants $\Gamma_1, \Gamma_2 \ge 0$ and $\varepsilon_1, \varepsilon_2 > 0$ such that for all
0 ≤ s ≤ u ≤ t ≤ T,
\[ \|\delta\Xi_{sut}\|_{L^2} \le \Gamma_1|t-s|^{\frac12+\varepsilon_1} \tag{4.36} \]
and
\[ \|\mathbf{E}_s\,\delta\Xi_{sut}\|_{L^2} \le \Gamma_2|t-s|^{1+\varepsilon_2}. \tag{4.37} \]
Then there exists a unique continuous (again as a map $[0,T] \to L^2$) process
$t \mapsto X_t \in L^2_t$ with $X_0 = 0$ and a suitable constant C such that, for all 0 ≤ s ≤ t ≤ T ,
\[ \|X_t - X_s - \Xi_{s,t}\|_{L^2} \le C\Gamma_1|t-s|^{\frac12+\varepsilon_1} + C\Gamma_2|t-s|^{1+\varepsilon_2} \tag{4.38} \]
and
\[ \|\mathbf{E}_s(X_t - X_s - \Xi_{s,t})\|_{L^2} \le C\Gamma_2|t-s|^{1+\varepsilon_2}. \tag{4.39} \]
Proof. (Uniqueness) Assuming there are two adapted processes $X$, $\bar X$ with the stated
properties (4.38) and (4.39), we show that $\Delta_t := X_t - \bar X_t = 0$ almost surely
for every t. Let n be a positive integer and set $t_i = ti/n$. The abusive notation
$X_i := X_{t_i,t_{i+1}}$ and similarly for ∆ and Ξ is convenient. Note that $L^2$ estimates for
$\Delta_i = (X_i - \Xi_i) - (\bar X_i - \Xi_i)$, as well as $\mathbf{E}_{t_i}\Delta_i$, are immediate from (4.38) and
(4.39). We have
\[ \Delta_t = \sum_{i=0}^{n-1} (\Delta_i - \mathbf{E}_{t_i}\Delta_i) + \sum_{i=0}^{n-1} \mathbf{E}_{t_i}\Delta_i =: \Delta^{(1)}_t + \Delta^{(2)}_t, \]
which is nothing but Doob’s decomposition of the partial sum process $\sum_i \Delta_i$ into
martingale and predictable component. Using the orthogonality of martingale
increments, the $L^2$-contraction property of the conditional expectation, and (4.38), we
have
\[ \|\Delta^{(1)}_t\|_{L^2} = \Big( \sum_{i=0}^{n-1} \|\Delta_i - \mathbf{E}_{t_i}\Delta_i\|^2_{L^2} \Big)^{\frac12} \le 2\Big( \sum_{i=0}^{n-1} \|\Delta_i\|^2_{L^2} \Big)^{\frac12} \lesssim n^{1/2} \cdot \Big( \frac1n \Big)^{1/2+\varepsilon_1}. \]
Since n is arbitrary, it follows that $\Delta^{(1)}_t = 0$ a.s. The same conclusion for $\Delta^{(2)}_t$ is
immediate from the triangle inequality and (4.39), since
\[ \|\Delta^{(2)}_t\|_{L^2} \le \sum_i \|\mathbf{E}_{t_i}\Delta_i\|_{L^2} \lesssim n \cdot \Big( \frac1n \Big)^{1+\varepsilon_2}. \]
(Existence) The proof follows the “dyadic refinement” proof of the sewing lemma
given earlier. Fix 0 ≤ s < t ≤ T and consider dyadic refinements $(t^k_i)$ of [s, t], so
that the kth level approximation is given by
\[ I^k_{s,t} = \sum_{i=0}^{2^k-1} \Xi_{t^k_i,t^k_{i+1}} \in L^2_t. \]
With midpoint $u^k_i \in [t^k_i,t^k_{i+1}]$ and, for fixed k, $\delta\Xi_i := \delta\Xi_{t^k_i,u^k_i,t^k_{i+1}}$, we again work
with the Doob decomposition
\[ I^{k+1}_{s,t} - I^k_{s,t} = -\sum_{i=0}^{2^k-1} \delta\Xi_i = I^{k;(1)}_{s,t} + I^{k;(2)}_{s,t}. \tag{4.40} \]
Arguing as in the uniqueness part, the first (resp. second) term is estimated (in $L^2$)
with (4.36) (resp. (4.37)) and one arrives at
\[ \|I^{k+1}_{s,t} - I^k_{s,t}\|_{L^2} \lesssim |t-s|^{\frac12+\varepsilon_1}2^{-k\varepsilon_1} + |t-s|^{1+\varepsilon_2}2^{-k\varepsilon_2}, \]
which implies existence of $I_{s,t} := \lim_{k\to\infty} I^k_{s,t}$ in $L^2_t$, uniformly in 0 ≤ s ≤ t ≤ T ,
with a local estimate of the form (4.38) with $X_t - X_s$ replaced by $I_{s,t}$. (By assumption
Ξ, and hence all $I^k$, are $L^2$-continuous, and so is the uniform limit I.) Moreover,
since $\mathbf{E}_sI^{k;(1)}_{s,t} = 0$, for all k, a better estimate, of the form (4.39), is obtained for
$\mathbf{E}_sI_{s,t} = \lim_{k\to\infty} \mathbf{E}_sI^k_{s,t}$. At last, as in the “dyadic” proof of the deterministic sewing
lemma, one needs to argue that I is additive, a non-trivial exercise left to the reader,
and hence the increment of a unique $L^2$-path I started from $I_0 = 0$ which is nothing
but the desired square-integrable process X = X(t, ω). □
4.7 Exercises
Exercise 4.1 a) In the setting of Young integration, deduce (4.3) from (4.2).
b) Show that there is a constant $C$ depending only on $T > 0$ and $\alpha + \beta > 1$ such that
\[
\Big\| \int_0^\cdot Y\,dX \Big\|_{\alpha;[0,T]} \le C\big(|Y_0| + \|Y\|_{\beta;[0,T]}\big)\|X\|_{\alpha;[0,T]} . \tag{4.41}
\]
holds true whenever X is a geometric rough path. (Hence, from a rough path per-
spective, integration of gradient 1-forms against geometric rough paths is trivial for
the outcome does not depend on X.) What about non-geometric rough paths?
Exercise 4.3 Complete the first “dyadic” proof of the sewing Lemma 4.2.
Solution. To show that (4.9) is valid for all intervals $[s,t] \subset [0,1]$ it suffices, by continuity, to consider $s < t$ dyadic. As in the proof of the Kolmogorov criterion, Theorem 3.1, we consider a (finite) partition $P = (\tau_i)$ of $[s,t]$, which “efficiently” exhausts $[s,t]$ with dyadic intervals of length $\sim 2^{-n}$, $n \ge m$, in the sense that no three intervals have the same length. Note that $|P| \equiv \max\{|v-u| : [u,v] \in P\} = 2^{-m} \le |t-s|$ (and in fact $\sim |t-s|$ due to the minimal choice of $m$). Thanks to the additivity of $I$ and (4.9) for dyadic intervals,
\begin{align*}
|I_{s,t} - \Xi_{s,t}| &= \Big| \sum_{[u,v]\in P} (I_{u,v} - \Xi_{u,v}) - \Big(\Xi_{s,t} - \sum_{[u,v]\in P} \Xi_{u,v}\Big) \Big| \\
&\lesssim \sum_{[u,v]\in P} |v-u|^\beta + \Big| \Xi_{s,t} - \sum_{[u,v]\in P} \Xi_{u,v} \Big| \\
&\le |t-s|^\beta + \sum_{i=0}^\infty \Big( \big|\delta\Xi_{s,\tau_{-(i+1)},\tau_{-i}}\big| + \big|\delta\Xi_{\tau_i,\tau_{i+1},t}\big| \Big),
\end{align*}
so that $|I_{s,t} - \Xi_{s,t}| \lesssim |t-s|^\beta$, as required.
Exercise 4.4 Adapt the proof of Theorem 4.4 to obtain Young’s estimate (4.3).
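Exercise 4.5 Fix $\alpha \in (0,1]$, $h > 0$ and $M > 0$. Consider a path $Z : [0,T] \to V$ and show that
\[
\|Z\|_{\alpha;h} \equiv \sup_{\substack{0 \le s < t \le T \\ t-s \le h}} \frac{|Z_{s,t}|}{|t-s|^\alpha} \le M \quad\Longrightarrow\quad \|Z\|_{\alpha;[0,T]} \le M\big(1 \vee 2h^{-(1-\alpha)}\big) .
\]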
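The local-to-global bound of Exercise 4.5 can be sanity-checked on a discretised path. The following is only a sketch under simplifying assumptions that are ours, not the text's: the horizon is $T = 1$, all suprema run over grid points only, $h$ is a multiple of the grid step, and the names `holder_seminorms`, `local`, `glob` and `bound` are illustrative.

```python
import math
import random


def holder_seminorms(path, dt, alpha, h_steps):
    """Return (local, global) discrete alpha-Hoelder seminorms of `path`.

    `local` only uses pairs of grid points at most `h_steps` steps apart,
    `global` uses all pairs of grid points.
    """
    n = len(path)
    local, glob = 0.0, 0.0
    for i in range(n):
        for j in range(i + 1, n):
            ratio = abs(path[j] - path[i]) / ((j - i) * dt) ** alpha
            glob = max(glob, ratio)
            if j - i <= h_steps:
                local = max(local, ratio)
    return local, glob


rng = random.Random(0)
n, alpha, h_steps = 400, 0.4, 20  # grid on [0, 1], so T = 1 (assumption)
dt = 1.0 / n
path = [0.0]
for _ in range(n):
    path.append(path[-1] + rng.gauss(0.0, math.sqrt(dt)))

h = h_steps * dt
local, glob = holder_seminorms(path, dt, alpha, h_steps)
bound = local * max(1.0, 2.0 * h ** (-(1.0 - alpha)))
print(local, glob, bound)
```

On the grid the inequality `glob <= bound` follows from the same chaining argument as in the continuous setting: an increment over a long interval is split into at most $2(t-s)/h$ increments over intervals of length at most $h$.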
Show that, with $C_n^n = \frac{1}{n!}$, and for all $0 \le s \le t \le T$,
\[
|X^{(n)}_{s,t}|^{\frac1n} \le C_n \|X\|_1 |t-s| .
\]
b) Show an analogous result in the Young case, i.e. when $X \in C^\alpha([0,T],V)$, $\alpha > \frac12$.
c) Fix $\mathbf{X} = (X,\mathbb{X}) \in \mathscr{C}^\alpha([0,T],V)$, $\alpha \in (\frac13,\frac12]$, and define $X^{(n)}_{s,t} \in V^{\otimes n}$, any $n \ge 1$, by the right-hand side above, via iterated integration of controlled rough paths. Noting $(X^{(1)}, X^{(2)}) = (\delta X, \mathbb{X})$, define the $T^{(N)}(V)$-valued extension of $\mathbf{X}$ by
\[
\bar{\mathbf{X}} := (1, X^{(1)}, X^{(2)}, X^{(3)}, \dots, X^{(N)}) ,
\]
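for any integer $N > \lfloor \frac1\alpha \rfloor = 2$. Show the validity of Chen’s relation, i.e. $\bar{\mathbf{X}}_{s,t} = \bar{\mathbf{X}}_{s,u} \otimes \bar{\mathbf{X}}_{u,t}$, $0 \le s < u < t \le T$, as equation in $T^{(N)}(V)$, and the estimate
\[
|X^{(n)}_{s,t}|^{\frac1n} \le C_{n,\alpha}\, |||\mathbf{X}|||_\alpha\, |t-s|^\alpha ,
\]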
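$\mathscr{D}^{2\alpha}_X([s,t])$ and, with all norms on $[s,t]$, one has $\|\mathbb{X}_{s,\cdot}, X_{s,\cdot}\|_{2\alpha,X} \equiv \|X_{s,\cdot}\|_\alpha +$
\[
X^{(3)}_{s,t} = \int_s^t \mathbb{X}_{s,\cdot} \otimes dX = \int_0^1 \hat{\mathbb{X}}_{0,\tau} \otimes d\hat X_\tau = c^3 \int_0^1 \tilde{\mathbb{X}}_{0,\tau} \otimes d\tilde X_\tau ,
\]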
\[
(Y_\varepsilon, Y'_\varepsilon) \in \mathscr{D}^{2\alpha}_{X_\varepsilon}
\]
(Such an approximation result was first suggested in [GH19, Rem 5.5], for a general-
isation to modelled distributions in the theory of regularity structures see [ST18].)
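\[
Y'_\varepsilon := Y' * \psi_\varepsilon \in C^\infty
\]
and also, thanks to the first part of that theorem, with $\bar R^\varepsilon := \Phi(Y'_\varepsilon, X_\varepsilon)$, uniformly in $C_2^{2\alpha}$,
\[
\bar Y_\varepsilon(t) = \bar Y_\varepsilon(s) + Y'_\varepsilon(s) X_\varepsilon(s,t) + \bar R^\varepsilon_{s,t} .
\]
By continuity of $\Phi$, it is clear that $\bar R^\varepsilon \to \Phi(Y', X) \in C_2^{2\alpha}$, uniformly, with uniform $2\alpha$-Hölder bounds. (As before, this entails $C_2^{2\alpha^-}$-convergence.) It remains to deal with the (mostly cosmetic) problem that $\bar Y_\varepsilon$ is not smooth. But then $Y_\varepsilon := \bar Y_\varepsilon * \psi_\varepsilon \in C^\infty$ converges uniformly with uniform $1^-$-Hölder bounds and from
\[
R^\varepsilon_{s,t} := Y_\varepsilon(s,t) - Y'_\varepsilon(s) X_\varepsilon(s,t) = \bar R^\varepsilon_{s,t} + Y_\varepsilon(s,t) - \bar Y_\varepsilon(s,t)
\]
we see that $R^\varepsilon - \bar R^\varepsilon \to 0$ uniformly, also with uniform $1^-$-Hölder bounds (and hence $R^\varepsilon \to \Phi(Y', X)$ with uniform $2\alpha$-Hölder bounds).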
Solution. As in the solution to the previous exercise, we can use the Lyons–Victoir extension theorem (see Exercise 2.14) to find a continuous map $I : C^\alpha \times C^\alpha \to C^\alpha$ with the property that $Z = I(X, Y')$ satisfies $(Z, Y') \in \mathscr{D}^{2\alpha}_X$. (One should think of $I(X, Y')$ as being a “plausible candidate” for $\int_0^\cdot Y'_s\, d\mathbb{X}_s$, which is of course ill-defined since we do not assume that $Y'$ is controlled by $X$.)
In particular, the map $\tilde I : (X, \tilde Y, Y') \mapsto \big(X, \tilde Y + I(X, Y'), Y'\big)$ is continuous
\[
(X, Y, Y') \mapsto \big(X, Y - I(X, Y'), Y'\big) ,
\]
which concludes the proof. Note that this construction is far from being canonical due to the lack of a canonical map $I$ having these properties.
Exercise 4.10 (Rough Fubini) Let $\mathbf{X} = (X,\mathbb{X}) \in \mathscr{C}^\alpha([0,T],V)$, $\alpha > \frac13$, and consider a measurable map from some measure space $(\Omega, \mathcal{F}, \mu)$ to $\mathscr{D}^{2\alpha}_X$, so that
With a pointwise definition of the $\mu$-integrated controlled rough path on the right-hand side, show that both sides are well-defined and equality holds,
\[
\int_\Omega \Big( \int_0^T (Y^\omega, (Y^\omega)')\, d\mathbf{X} \Big) \mu(d\omega) = \int_0^T \Big( \int_\Omega (Y^\omega, (Y^\omega)')\, \mu(d\omega) \Big) d\mathbf{X} .
\]
Hint: Apply the integration by parts formula for bounded variation paths to the
indefinite integrals of Y and Ỹ against X.
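b) Let now $\mathbf{X} = (X,\mathbb{X}) \in \mathscr{C}^\alpha([0,T])$ for some $\alpha > 1/3$, and $(Y, Y'), (\tilde Y, \tilde Y') \in \mathscr{D}^{2\alpha}_X$. Set $Z_{s,t} := Y_s \otimes Y_t$. Show that
\[
\int_0^T \!\! \int_0^t Z_{s,t}\, d\mathbf{X}_s\, d\mathbf{X}_t = \int_0^T \!\! \int_s^T Z_{s,t}\, d\mathbf{X}_t\, d\mathbf{X}_s + \int_0^T Z_{t,t}\, d[\mathbf{X}]_t ,
\]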
where the final integral is a Young integral against [X] ∈ C 2α , the bracket
introduced in Exercise 2.11.
Hint: If X is the canonical lift of some smooth X, then both [X] and [X] vanish
and the equality follows from part a) and consistency of rough with Riemann–
Stieltjes integration in case of smooth integrators. Treat the case of X ∈ Cgα
with the approximation result of Exercise 4.8 and then X ∈ C α as “second level
perturbation”, as in Exercise 2.11.
Exercise 4.12 (Singular rough paths, improper rough integration [BFG20])
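a) (Young case) Consider $0 < \alpha \le 1$ and $\eta \le \alpha$ and a path $Y$ defined on $(0,T]$. Show that
\[
\|Y\|_{\alpha,\eta} \stackrel{\mathrm{def}}{=} \sup_{0 < s < t \le T} \frac{|Y_t - Y_s|}{s^{\eta-\alpha} |t-s|^\alpha} < \infty
\]
if and only if $\|Y\|_{\alpha;[\varepsilon,T]} = O(\varepsilon^{\eta-\alpha})$ as $\varepsilon \downarrow 0$, and write $Y \in C^{\alpha,\eta}((0,T])$ for the resulting class of “singular” Hölder paths. Fix $X \in C^\alpha([0,T])$, $\alpha > 1/2$, and assume $\eta + \alpha > 0$, $\eta \ne 0$. Show that the improper Young integral
\[
Z_t := \int_{0+}^t Y\, dX \stackrel{\mathrm{def}}{=} \lim_{\varepsilon \downarrow 0} \int_\varepsilon^t Y\, dX, \qquad 0 < t \le T ,
\]
\[
\Big| \int_s^t Y\, dX \Big| \lesssim |Y_s|\, |t-s|^\alpha + \|Y\|_{\alpha;[s,T]}\, |t-s|^{2\alpha}
\]
with $s = 2^{-(n+1)}$, $t = 2^{-n}$ and show that $I_n := \int_{2^{-n}}^T Y\, dX$ is a Cauchy sequence.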
b) (Rough path case) Let $\mathbf{X} = (X,\mathbb{X}) \in \mathscr{C}^\alpha([0,T])$ for some $\alpha > 1/3$, and let $(Y, Y')$ be defined on $(0,T]$ so that, for some $\eta \le 2\alpha$,
exists and defines a singular Hölder path $Z \in C^{\alpha,\eta\wedge 0+\alpha}((0,T])$. In fact, show that $(Z, Z') := \big(\int_{0+} Y\, d\mathbf{X}, Y\big) \in \mathscr{D}_X^{2\alpha,\eta\wedge 0+\alpha}$. (Such singular controlled rough paths are examples of singular modelled distributions in the theory of regularity structures, [Hai14b, Ch. 6].)
Exercise 4.13 Check that Definition 4.18 is consistent with Definition 4.6 in the case when $\alpha \in (\frac13, \frac12]$. Check also that if one takes $w = \varnothing$, the empty word, then (4.34) reduces to (4.33) with $|R^Y_{s,t}| \lesssim |t-s|^{N\alpha}$.
Exercise 4.14 (From [Lê18]) Let $B$ be a Brownian motion. Assume $F$ is bounded and $\varepsilon$-Hölder continuous for some $\varepsilon > 0$. Apply the stochastic sewing lemma with $\Xi_{s,t} = F(B_s) B_{s,t}$ and identify the resulting process $X$ as the indefinite Itô integral $\int F(B)\, dB$.
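The germ $\Xi_{s,t} = F(B_s)B_{s,t}$ of Exercise 4.14 can be explored numerically. The sketch below is illustrative only: it assumes $F = \cos$ (bounded and Lipschitz), a uniform grid on $[0,1]$ and a simulated path, and it compares the left-point sum with the value predicted by Itô's formula for $\sin(B)$. The names `left_point_sum`, `riemann` and `reference` are ours.

```python
import math
import random


def left_point_sum(F, increments):
    """Sum of the germs F(B_{t_i}) * B_{t_i, t_{i+1}} along a sampled path."""
    b, total = 0.0, 0.0
    for db in increments:
        total += F(b) * db
        b += db
    return total


rng = random.Random(1)
T, n = 1.0, 2 ** 15
dt = T / n
increments = [rng.gauss(0.0, math.sqrt(dt)) for _ in range(n)]

riemann = left_point_sum(math.cos, increments)

# Reference value from Ito's formula applied to sin(B):
#   int_0^T cos(B) dB = sin(B_T) + (1/2) int_0^T sin(B_t) dt
b, time_integral = 0.0, 0.0
for db in increments:
    time_integral += math.sin(b) * dt
    b += db
reference = math.sin(b) + 0.5 * time_integral

print(riemann, reference)
```

The two numbers differ by a martingale-type error of order $n^{-1/2}$, which is the kind of cancellation the stochastic sewing lemma exploits.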
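Exercise 4.15 (Hybrid stochastic rough integral) Let $B$ be a Brownian motion and $\mathbf{X} = (X,\mathbb{X}) \in \mathscr{C}^\alpha([0,T],V)$ a (deterministic) rough path, $\alpha \in (\frac13, \frac12]$. Apply the stochastic sewing lemma with
a) Let $0 < \gamma \le 1 < \mu$. Show that there exists a unique continuous linear map $I : \hat{C}_2^{\gamma,\mu}([0,T], \mathcal{H}_\alpha) \to C^\gamma([0,T], \mathcal{H}_\alpha)$ such that $(I\Xi)_0 = 0$ and
\[
\|Z, Z'\|^\wedge_{X,2\gamma;\alpha} \lesssim \|Y\|^\wedge_{\gamma;\alpha} + \big(\|Y'_0\|_{\mathcal{H}_\alpha} + \|(Y, Y')\|^\wedge_{X,2\gamma;\alpha}\big)\big(\|X\|_\gamma + \|\mathbb{X}\|_{2\gamma}\big). \tag{4.49}
\]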
c) Make the (notational) adjustment to handle general d ∈ N.
Exercise 4.18 (Integration against step-N rough paths) Any path $\mathbf{X} : [0,T] \to T_1^{(N)}(\mathbb{R}^d)$ gives rise to increments $\mathbf{X}_s^{-1} \otimes \mathbf{X}_t =: \mathbf{X}_{s,t}$ so that Chen’s relation becomes a tautology. Assume also $|\langle \mathbf{X}_{s,t}, w \rangle| \lesssim |t-s|^{\alpha|w|}$, $|w| \le N = \lfloor 1/\alpha \rfloor$. (These are the naïve higher order rough paths introduced in Section 2.4.) Show that the rough integral $\int \mathbf{Y}\, d\mathbf{X}$ defined as in (4.35) is well-defined and detail its structure. (Naïve rough paths are ill-suited to integrate $f(\mathbf{Y})$ with regular but non-linear $f$; in Section 7.6 this is resolved for geometric rough paths.)
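The remark that Chen's relation becomes a tautology for increments $\mathbf{X}_s^{-1} \otimes \mathbf{X}_t$ can be made concrete in the simplest non-trivial case. The sketch below assumes $N = 2$ and $d = 2$, represents an element $(1, a^1, a^2)$ of $T_1^{(2)}(\mathbb{R}^2)$ as a pair (vector, matrix), and uses helper names (`mul`, `inv`, `increment`, `distance`) that are purely illustrative.

```python
import random

D = 2  # dimension d of the underlying space (illustrative choice)


def mul(a, b):
    """Product of (1, a1, a2) and (1, b1, b2) in the step-2 truncated tensor algebra."""
    a1, a2 = a
    b1, b2 = b
    c1 = [a1[i] + b1[i] for i in range(D)]
    c2 = [[a2[i][j] + b2[i][j] + a1[i] * b1[j] for j in range(D)]
          for i in range(D)]
    return c1, c2


def inv(a):
    """Inverse of (1, a1, a2): level one is -a1, level two is -a2 + a1 (x) a1."""
    a1, a2 = a
    return ([-x for x in a1],
            [[-a2[i][j] + a1[i] * a1[j] for j in range(D)] for i in range(D)])


def increment(path, s, t):
    """X_{s,t} := X_s^{-1} (x) X_t."""
    return mul(inv(path[s]), path[t])


def distance(a, b):
    """Largest coordinate-wise difference between two group elements."""
    d1 = max(abs(x - y) for x, y in zip(a[0], b[0]))
    d2 = max(abs(a[1][i][j] - b[1][i][j]) for i in range(D) for j in range(D))
    return max(d1, d2)


rng = random.Random(0)
# An arbitrary path with values in T_1^{(2)}(R^2), sampled at five time points.
path = [([rng.uniform(-1, 1) for _ in range(D)],
         [[rng.uniform(-1, 1) for _ in range(D)] for _ in range(D)])
        for _ in range(5)]

s, u, t = 0, 2, 4
chen_defect = distance(mul(increment(path, s, u), increment(path, u, t)),
                       increment(path, s, t))
print(chen_defect)  # zero up to rounding
```

No regularity of the path enters here: Chen's relation follows from associativity alone, which is why the analytic condition $|\langle \mathbf{X}_{s,t}, w \rangle| \lesssim |t-s|^{\alpha|w|}$ has to be imposed separately.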
4.8 Comments
Young integration [You36], which can be seen as level-1 rough integration, was a key
inspiration for the analytical aspects of Lyons’ rough integration [Lyo94, Lyo98],
and has remained the “entrance test” for every subsequent (re)interpretation of rough
integration, including [Gub04, FdLP06, Pic08, HN09, GIP15, GIP16, FS17]. From
a harmonic analysis perspective, Young integration in Hölder scale, as presented here, implies that the product of smooth functions extends naturally to $C^\beta \times C^{-\alpha}$ into $\mathcal{D}'(\mathbb{R})$ if and only if $\beta > \alpha$. Similar statements, replacing one-dimensional
space ([0, T ] ⊂ R, “time”) by Rd are well known, cf. e.g. [BCD11, Thm 2.52]
and Theorem 13.18 later on in the book. Young (and later rough) integration is
naturally formulated in p-variation scale, examples with p < 2 are plentiful and
range from Schramm–Loewner trace [Wer12, FT17], fractional Brownian motion (cf.
Section 10.3) with Hurst parameter H > 1/2 to Lévy processes and homogenisation
problems [CFKM19]. Of course, p = 2+ is the correct scale for semimartingales,
also in the càdlàg setting, see Section 3.8. The sewing lemma, obtained independently
by Feyel–De La Pradelle (in an early version of [FdLP06]) and Gubinelli [Gub04],
formalises abstract Riemann–Young integration and is a flexible real analysis lemma,
with many variations found in [FDM08, GT10, BL19, Yas18, GH19, GHN19] and
also [FS17, FZ18] for a sewing lemma, and subsequent integration theory, with
jumps. An application of sewing to level sets in the Heisenberg group is given
in [MST18]. The applications of Lê’s important stochastic sewing lemma [Lê18],
Section 4.6, include regularisation by noise [Lê18], the construction of rough Markov
diffusions [FHL20] by solving hybrid Itô-rough differential equations in the spirit
of Section 12.2.1, and an averaging result for SDEs driven by fractional Brownian
motion [HL19].
Integration of one-forms against continuous p-variation geometric rough paths for
any p ∈ [1, ∞) was developed by Lyons [Lyo98]; see also [LQ02, LCL07, FV10b,
LY15]. For a careful discussion of the integration of weakly geometric rough paths
in infinite dimensions we refer to Cass et al. [CDLL16].
Rough integration against controlled paths is due to Gubinelli, see [Gub04]
where it is developed in an α-Hölder setting, α > 1/3. Loosely speaking, it allows
to “linearise” many considerations (the space of controlled paths is a Banach space,
while a typical space of rough paths is not). This point of view has been generalised
to arbitrary α (both in the geometric and the non-geometric setting) in [Gub10],
see also [HK15, FZ18]. Rough convolution, Exercise 4.17, follows [GT10, GH19] and is
crucial for “mild” RPDE solutions, cf. Section 12.5.
The controlled rough path integration point of view can be pushed even further
and, as a matter of fact, the theory of regularity structures developed in [Hai14b] and
exposed in Chapter 13 onwards, provides a unified framework in which the Gubinelli
derivative and the regular derivatives are but two examples of a more general theory
of objects behaving “like Taylor expansions” and allowing to describe the small-scale
structure of a function and / or distribution in terms of “known” objects (polynomials
in the case of Taylor expansions, the underlying rough path in the case of controlled
paths).
Chapter 5
Stochastic integration and Itô’s formula
In this chapter, we compare the integration theory developed in the previous chapter
to the usual theories of stochastic integration, be it in the Itô or the Stratonovich
sense.
Recall from Section 3 that Brownian motion $B$ can be enhanced to a (random) rough path $\mathbf{B} = (B, \mathbb{B})$. Presently our focus is the case when $\mathbb{B}$ is given by the iterated Itô integral$^1$
\[
\mathbb{B}_{s,t} = \mathbb{B}^{\text{Itô}}_{s,t} \stackrel{\mathrm{def}}{=} \int_s^t B_{s,u} \otimes dB_u
\]
and the so enhanced Brownian motion has almost surely (non-geometric) α-Hölder rough sample paths, for any $\alpha \in (\frac13, \frac12)$. That is, $\mathbf{B}(\omega) = (B(\omega), \mathbb{B}(\omega)) \in \mathscr{C}^\alpha$ for every $\omega \in N_1^c$ where, here and in the sequel, $N_i$, $i = 1, 2, \dots$ denote suitable null sets. We now show that rough integrals (against $\mathbf{B} = \mathbf{B}^{\text{Itô}}$) and Itô integrals, whenever both are well-defined, coincide.
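Proposition 5.1. Assume $(Y(\omega), Y'(\omega)) \in \mathscr{D}^{2\alpha}_{B(\omega)}$ for every $\omega \in N_2^c$. Set $N_3 = N_1 \cup N_2$. Then the rough integral
\[
\int_0^T Y\, d\mathbf{B} = \lim_{n\to\infty} \sum_{[u,v]\in\mathcal{P}_n} \big(Y_u B_{u,v} + Y'_u \mathbb{B}_{u,v}\big)
\]
exists, for each fixed $\omega \in N_3^c$, along any sequence $(\mathcal{P}_n)$ with mesh $|\mathcal{P}_n| \downarrow 0$. If $Y, Y'$ are adapted then, almost surely,
\[
\int_0^T Y\, d\mathbf{B} = \int_0^T Y\, dB .
\]
$^1$ The case when $\mathbb{B}$ is given via iterated Stratonovich integration is left to Section 5.2 below.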
Proof. Without loss of generality $T = 1$. The existence of the rough integral for $\omega \in N_3^c$ under the stated assumptions is immediate from Theorem 4.10, applied to $Y(\omega)$, controlled by $B(\omega)$, for $\omega \in N_2^c$ fixed. Recall (e.g. [RY99]) that for any continuous, adapted process $Y$ the Itô integral against Brownian motion has the representation
\[
\int_0^1 Y\, dB = \lim_{n\to\infty} \sum_{[u,v]\in\mathcal{P}_n} Y_u B_{u,v} \qquad \text{(in probability)}
\]
\[
\sup_{\omega\in N_5^c} |Y'(\omega)|_\infty \le M .
\]
(This is the case in the “model” situation Y = F (X), Y ′ = DF (X) where F was
in particular assumed to have bounded derivatives; the general case is obtained by
localisation and left to Exercise 5.1.)
The claim is that the rough and Itô integral coincide on $N_5^c$. With a look at the respective Riemann sums, convergent away from $N_5$, basic analysis tells us that
\[
\forall \omega \in N_5^c : \quad \exists \lim_n \sum_{[u,v]\in\mathcal{P}_n} Y'_u \mathbb{B}_{u,v} ,
\]
and that this limit equals the difference of rough and Itô integrals (on $N_5^c$, a set of full measure). Of course, $|\mathcal{P}_n| \downarrow 0$, and to see that the above limit is indeed zero (at least on a set of full measure), it will be enough to show that
\[
\Big\| \sum_{[u,v]\in\mathcal{P}} Y'_u \mathbb{B}_{u,v} \Big\|^2_{L^2} = O(|\mathcal{P}|) . \tag{5.1}
\]
as desired. □
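The key estimate (5.1) says that the compensating sums $\sum Y'_u \mathbb{B}_{u,v}$ are small in $L^2$ as the mesh shrinks. A rough numerical illustration, under assumptions of our own choosing: scalar Brownian motion, the model case $Y = \sin(B)$, $Y' = \cos(B)$ (so $|Y'| \le 1$), a uniform partition of $[0,1]$, and the scalar Itô second level $\mathbb{B}_{u,v} = \frac12 (B_{u,v}^2 - (v-u))$. The function name `compensator_sum` is hypothetical.

```python
import math
import random


def compensator_sum(n, seed):
    """Return sum over [u,v] of Y'_u * BB_{u,v} with Y' = cos(B) on a uniform grid of [0, 1].

    BB_{u,v} = (B_{u,v}^2 - (v - u)) / 2 is the Ito second level of scalar
    Brownian motion over [u, v].
    """
    rng = random.Random(seed)
    dt = 1.0 / n
    b, total = 0.0, 0.0
    for _ in range(n):
        db = rng.gauss(0.0, math.sqrt(dt))
        total += math.cos(b) * 0.5 * (db * db - dt)
        b += db
    return total


for n in (2 ** 6, 2 ** 10, 2 ** 14):
    print(n, compensator_sum(n, seed=0))
```

Each run uses a freshly simulated path, so the printed values are single samples; their typical size decays like $n^{-1/2}$, in line with a squared $L^2$-norm of order $|\mathcal{P}|$.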
5.2 Stratonovich integration
Almost surely, this construction then yields geometric α-Hölder rough sample paths, for any $\alpha \in (\frac13, \frac12)$. Recall that, by definition, the Stratonovich integral is given by
\[
\int_0^T Y \circ dB \stackrel{\mathrm{def}}{=} \int_0^T Y\, dB + \frac12 [Y, B]_T
\]
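Proof. $\mathbb{B}^{\text{Strat}}_{s,t} = \mathbb{B}^{\text{Itô}}_{s,t} + f_{s,t}$ where $f(t) = \frac{t}{2}\,\mathrm{Id}$. This entails, as was discussed in Example 4.14,
\[
\int_0^1 Y\, d\mathbf{B}^{\text{Strat}} = \int_0^1 Y\, d\mathbf{B}^{\text{Itô}} + \int_0^1 Y'\, df .
\]
Thanks to Proposition 5.1, it only remains to identify $2\int_0^1 Y'\, df = \int_0^1 Y'_t\, dt$ with $[Y, B]_1$. To see this, write
\begin{align*}
\sum_{[u,v]\in\mathcal{P}} Y_{u,v} B_{u,v} &= \sum_{[u,v]\in\mathcal{P}} \big( (Y'_u B_{u,v}) B_{u,v} + R_{u,v} B_{u,v} \big) \\
&= \sum_{[u,v]\in\mathcal{P}} Y'_u (B_{u,v} \otimes B_{u,v}) + O\big(|\mathcal{P}|^{3\alpha-1}\big) ,
\end{align*}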
where we used that $\sum R_{u,v} B_{u,v} = O\big(|\mathcal{P}|^{3\alpha-1}\big)$ thanks to $R \in C_2^{2\alpha}$ and $B \in C^\alpha$.
Note that
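We have seen in the proof of Proposition 5.1 that any limit (in probability, say) of
\[
\sum_{[u,v]\in\mathcal{P}} Y'_u \mathbb{B}^{\text{Itô}}_{u,v}
\]
must be zero. In fact, a look at the argument reveals that this remains true with $\mathbb{B}^{\text{Itô}}_{u,v}$ replaced by $\mathrm{Sym}(\mathbb{B}^{\text{Itô}}_{u,v})$. It follows that
\[
\lim_{|\mathcal{P}|\to 0} \sum_{[u,v]\in\mathcal{P}} Y_{u,v} B_{u,v} = \lim_{|\mathcal{P}|\to 0} \sum_{[u,v]\in\mathcal{P}} Y'_u (v-u) = \int_0^1 Y'_t\, dt ,
\]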
in the sense that the compensated Riemann–Stieltjes sums appearing on the right-hand side converge with mesh $|\mathcal{P}| \to 0$. Let us split $\mathbb{X}$ into symmetric part, $\mathbb{S}_{s,t} := \mathrm{Sym}(\mathbb{X}_{s,t})$, and antisymmetric (“area”) part, $\mathrm{Anti}(\mathbb{X}_{s,t}) := \mathbb{A}_{s,t}$. Then
and the final term disappears in the gradient case, i.e. when $G = DF$. Indeed, the contraction of a symmetric tensor (here: $D^2F$) with an antisymmetric tensor (here: $\mathbb{A}$) always vanishes. In other words, area matters very much for general integrals of 1-forms but not at all for gradient 1-forms. Note also that, contrary to $\mathbb{A}$, the symmetric part $\mathbb{S}$ is a nice function of the underlying path $X$. For instance, for Itô enhanced Brownian motion in $\mathbb{R}^d$, one has the identity
\[
\mathbb{S}^{i,j}_{s,t} = \frac12 \int_s^t \big( B^i_{s,r}\, dB^j_r + B^j_{s,r}\, dB^i_r \big) = \frac12 \big( B^i_{s,t} B^j_{s,t} - \delta^{ij}(t-s) \big), \qquad 1 \le i, j \le d .
\]
These considerations suggest that the following definition encapsulates all the data
required for the integration of gradient 1-forms.
Remark 5.7. While this notion of bracket does not rely on any sort of “quadratic
variation”, it is consistent with the product (a.k.a. integration by parts) formula from
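Itô calculus. Indeed, for any semimartingale $X = X(t,\omega)$, with $X_0 = 0$ say, we have
\[
\int_0^t X^i_s\, dX^j_s + \int_0^t X^j_s\, dX^i_s = X^i_t X^j_t - \langle X^i, X^j \rangle_t ; \tag{5.2}
\]
from a rough path perspective, the left-hand side is precisely $\mathbb{X}^{i,j}_{0,t} + \mathbb{X}^{j,i}_{0,t} = 2\mathbb{S}^{i,j}_{0,t}$.
Here, writing $\mathcal{P}$ for partitions of $[0,t]$, the first integral is given by$^2$
\[
\int_0^t DF(X_s)\, d\mathbf{X}_s \stackrel{\mathrm{def}}{=} \lim_{|\mathcal{P}|\to 0} \sum_{[u,v]\in\mathcal{P}} \big( DF(X_u) X_{u,v} + D^2F(X_u) \mathbb{S}_{u,v} \big) , \tag{5.3}
\]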
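Proof. Consider first the geometric case, $\mathbb{S} = \bar{\mathbb{S}}$, in which case the bracket is zero. The proof is straightforward. Indeed, thanks to α-Hölder regularity of $X$ with $\alpha > 1/3$, we obtain
\begin{align*}
F(X_T) - F(X_0) &= \sum_{[u,v]\in\mathcal{P}} \big( F(X_v) - F(X_u) \big) \\
&= \sum_{[u,v]\in\mathcal{P}} \Big( DF(X_u) X_{u,v} + \frac12 D^2F(X_u)(X_{u,v}, X_{u,v}) + o(|v-u|) \Big) \\
&= \sum_{[u,v]\in\mathcal{P}} \Big( DF(X_u) X_{u,v} + D^2F(X_u) \bar{\mathbb{S}}_{u,v} + o(|v-u|) \Big) .
\end{align*}
$^2$ Note consistency with the rough integral when $\mathbf{X} \in \mathscr{C}^\alpha$.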
We conclude by taking the limit $|\mathcal{P}| \to 0$, also noting that $\sum_{[u,v]\in\mathcal{P}} o(|v-u|) \to 0$. For the non-geometric situation, just substitute
\[
\bar{\mathbb{S}}_{u,v} = \mathbb{S}_{u,v} + \frac12 [\mathbf{X}]_{u,v} .
\]
Since $D^2F$ is Lipschitz, $D^2F(X_\cdot) \in C^\alpha$ and we can split up the “bracket” term and note that
\[
\sum_{[u,v]\in\mathcal{P}} D^2F(X_u) [\mathbf{X}]_{u,v} \to \int_0^t D^2F(X_u)\, d[\mathbf{X}]_u ,
\]
where the convergence to the Young integral follows from $[\mathbf{X}] \in C^{2\alpha}$. The rest is now obvious. □
Example 5.9. Consider the case when $\mathbf{X} = \mathbf{B}$, Itô enhanced Brownian motion. Then $\mathbb{X}$ is given by iterated Itô integrals and, thanks to the Itô product rule (5.2),
\[
2\mathbb{S}^{i,j}_{0,t} = \int_0^t \big( B^i\, dB^j + B^j\, dB^i \big) = B^i_t B^j_t - \langle B^i, B^j \rangle_t .
\]
The usual Itô formula is then recovered from the fact that
\[
[\mathbf{B}]^{i,j}_t = B^i_{0,t} B^j_{0,t} - 2\mathbb{S}^{i,j}_{0,t} = \langle B^i, B^j \rangle_{0,t} = \delta^{i,j} t .
\]
We conclude this section with a short discussion on Föllmer’s calcul d’Itô sans probabilités [Föl81]. For simplicity of notation, we take $V = \mathbb{R}^d$, $W = \mathbb{R}^e$ in what follows. With regard to (5.3), let us insist that the compensation is necessary and one cannot, in general, separate the sum into two convergent sums. On the other hand, we can combine the converging sums and write
\begin{align*}
F(X)_{0,t} &= \lim_{|\mathcal{P}|\to 0} \Big( \sum_{[u,v]\in\mathcal{P}} \big( DF(X_u) X_{u,v} + D^2F(X_u) \mathbb{S}_{u,v} \big) + \frac12 \sum_{[u,v]\in\mathcal{P}} D^2F(X_u) [\mathbf{X}]_{u,v} \Big) \tag{5.4} \\
&= \lim_{|\mathcal{P}|\to 0} \sum_{[u,v]\in\mathcal{P}} \Big( DF(X_u) X_{u,v} + \frac12 D^2F(X_u)(X_{u,v}, X_{u,v}) \Big) .
\end{align*}
We now put forward an assumption that allows to break up the above sum.
Definition 5.10. Let $\pi = (\mathcal{P}_n)_{n\ge 0}$ be a sequence of partitions of $[0,T]$ with mesh $|\mathcal{P}_n| \to 0$. We say that $X : [0,T] \to \mathbb{R}^d$ has finite quadratic variation in the sense of Föllmer along $\pi$ if, for every $t \in [0,T]$ and $1 \le i, j \le d$ the limit
\[
[X^i, X^j]^\pi_t := \lim_{n\to\infty} \sum_{[u,v]\in\mathcal{P}_n} \big( X^i_{v\wedge t} - X^i_{u\wedge t} \big)\big( X^j_{v\wedge t} - X^j_{u\wedge t} \big)
\]
exists. Write $[X, X]^\pi$ for the resulting path with values in $\mathrm{Sym}(\mathbb{R}^d \otimes \mathbb{R}^d)$, i.e. the
\[
\lim_{n\to\infty} \sum_{\substack{[u,v]\in\mathcal{P}_n \\ u<t}} G(u)\big( X_{u,v}, X_{u,v} \big) = \int_0^t G(u)\, d[X, X]^\pi_u \in \mathbb{R}^e .
\]
Proof. For the first statement, it is enough to argue component by component. Set $[X^i]^\pi := [X^i, X^i]^\pi$. By polarisation,
\[
[X^i, X^j]^\pi_t = \frac12 \Big( [X^i + X^j]^\pi_t - [X^i]^\pi_t - [X^j]^\pi_t \Big) .
\]
Since each term on the right-hand side is monotone in $t$, we see that $t \mapsto [X^i, X^j]^\pi_t$
Indeed, we can apply this for each component, with $g = G^k_{i,j}$ and
\[
Y \in \big\{ X^i + X^j,\ X^i,\ X^j \big\} .
\]
To see that (5.5) holds, write $\sum_{[u,v]\in\mathcal{P}_n, u<t} g(u) Y^2_{u,v} = \int_{[0,t)} g(u)\, d\mu_n(u)$ with
\[
\mu_n = \sum_{[u,v]\in\mathcal{P}_n,\, u<t} Y^2_{u,v}\, \delta_u .
\]
Combination of the above lemma with (5.4) gives the Itô–Föllmer formula,
\[
F(X_t) = F(X_0) + \int_0^t DF(X_s)\, dX_s + \frac12 \int_0^t D^2F(X_s)\, d[X, X]_s , \qquad 0 \le t \le T , \tag{5.6}
\]
where the middle integral is given by the (now existent) limit of left-point Riemann–Stieltjes approximations
\[
\lim_{n\to\infty} \sum_{[u,v]\in\mathcal{P}_n} DF(X_u) X_{u,v} =: \int_0^t DF(X)\, dX .
\]
In fact, we encourage the reader to verify as an exercise that this formula is valid whenever $X : [0,T] \to \mathbb{R}^d$ is continuous, of finite quadratic variation, with $t \mapsto [X, X]^\pi_t$ continuous. Note, however, that Föllmer’s notion of quadratic variation (and the above integral) can and will depend in general on the sequence $(\mathcal{P}_n)$.
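The pathwise nature of the Itô–Föllmer formula can be seen on a single sampled path, with no probability involved once the path is fixed. The sketch below makes assumptions of our own: a scalar path on a uniform grid (simulated here as a Gaussian random walk, but any list of numbers would do), and $F(x) = x^3$, for which the second-order Taylor remainder over each interval is exactly the cubed increment. The name `ito_foellmer_defect` is illustrative.

```python
import math
import random


def ito_foellmer_defect(path, F, DF, D2F):
    """F(X_T) - F(X_0) minus the first- and second-order sums along `path`."""
    first = sum(DF(u) * (v - u) for u, v in zip(path, path[1:]))
    second = 0.5 * sum(D2F(u) * (v - u) ** 2 for u, v in zip(path, path[1:]))
    return F(path[-1]) - F(path[0]) - first - second


rng = random.Random(3)
n = 2 ** 12
path = [0.0]
for _ in range(n):
    path.append(path[-1] + rng.gauss(0.0, math.sqrt(1.0 / n)))

defect = ito_foellmer_defect(
    path, lambda x: x ** 3, lambda x: 3 * x ** 2, lambda x: 6 * x
)
cubic_variation = sum((v - u) ** 3 for u, v in zip(path, path[1:]))
print(defect, cubic_variation)
```

For this $F$ the defect equals the sum of cubed increments, which vanishes in the limit for paths of finite quadratic variation; what survives is the left-point sum plus the quadratic-variation term, as in (5.6).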
Given a Brownian motion $B = B_t(\omega)$, one can define the backward Itô integral
\[
\int_0^T f_t\, \overleftarrow{dB}_t := \lim_n \sum_{[s,t]\in\mathcal{P}_n} f_t B_{s,t} ,
\]
whenever $|\mathcal{P}_n| \to 0$ and this limit exists, in probability and uniformly on compact time intervals, and does not depend on the sequence of partitions $(\mathcal{P}_n)$ of $[0,T]$. For instance,
\[
\int_0^T B_t\, \overleftarrow{dB}_t = \frac12 B_T^2 + \frac{T}{2} .
\]
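The example $\int_0^T B_t\, \overleftarrow{dB}_t = \frac12 B_T^2 + \frac{T}{2}$ rests on an elementary identity for right-point sums, which the following sketch checks on a simulated path. The uniform grid, the seed and the variable names (`forward`, `backward`, `quad_var`) are our own illustrative choices.

```python
import math
import random

rng = random.Random(4)
T, n = 1.0, 2 ** 12
dt = T / n
B = [0.0]
for _ in range(n):
    B.append(B[-1] + rng.gauss(0.0, math.sqrt(dt)))

# Left-point (forward Ito) and right-point (backward Ito) Riemann sums.
forward = sum(B[i] * (B[i + 1] - B[i]) for i in range(n))
backward = sum(B[i + 1] * (B[i + 1] - B[i]) for i in range(n))
# Discrete quadratic variation along the same partition.
quad_var = sum((B[i + 1] - B[i]) ** 2 for i in range(n))

print(backward, 0.5 * B[-1] ** 2 + 0.5 * quad_var)
print(backward - forward, quad_var)
```

Both printed pairs agree up to rounding for any path; the probabilistic input is only that `quad_var` is close to $T$ for Brownian motion.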
In many applications one encounters integrands $f = f_t(\omega)$ that are backward adapted in the sense that each $f_t$ is measurable with respect to the σ-field $\mathcal{F}_t^T := \sigma(B_{u,v} : t \le u \le v \le T)$. For example,
\[
\int_0^T (B_T - B_t)\, \overleftarrow{dB}_t = B_T^2 - \int_0^T B_t\, \overleftarrow{dB}_t = \frac12 B_T^2 - \frac{T}{2}
\]
and we note (in contrast to the previous example) the zero mean property, which of course comes from a backward martingale structure. Indeed, $\hat B_t := B_T - B_{T-t}$ is a standard Brownian motion, adapted to $\hat{\mathcal{F}}_t := \mathcal{F}^T_{T-t}$ and so is $\hat f_t = f_{T-t}$. The
Also, by analogy with its forward counterpart, the backward Stratonovich integral is
defined as the backward Itô integral, minus 1/2 times the quadratic variation of the
integrand.
The purpose of this section is to understand backward integration as rough integration. To this end, recall that the “forward” rough integral of $(Y, Y') \in \mathscr{D}^{2\alpha}_X$ against $\mathbf{X} = (X, \mathbb{X})$ was given in Theorem 4.10 by
\[
\int_0^T Y\, d\mathbf{X} = \lim_{|\mathcal{P}|\downarrow 0} \sum_{[s,t]\in\mathcal{P}} \big( Y_s X_{s,t} + Y'_s \mathbb{X}_{s,t} \big) \tag{5.8}
\]
where P are partitions of [0, T ] with mesh-size |P |. Clearly, some sort of “left-point”
evaluation has been hard-wired into our definition of rough integral. On the other
hand, one can expect that feeding in explicit second order information makes this
choice somewhat less important than in the case of classical stochastic integration.
The next proposition, purely deterministic, answers the question to what extent one can replace left-point by right-point evaluation. In fact, it provides the natural analogue of (5.7)$^3$ but without any need of “backward” rough integration: both rough integrals which appear in the following proposition are “forward” in the sense of (5.8).
with $\overleftarrow{X}(t) = X(T-t)$ and similar for $Y$ and $Y'$.
Proof. It is clear from (5.8) that the rough integral is given as (compensated) Riemann–Stieltjes limit
\[
\int_r^T Y\, d\mathbf{X} = \lim_{|\mathcal{P}|\downarrow 0} \sum_{[s,t]\in\mathcal{P}} \big( Y_s X_{s,t} + Y'_s \mathbb{X}_{s,t} + (*)_{s,t} \big)
\]
$^3$ With regard to (5.7), note that $d\hat B = -d\overleftarrow{B}$ where $\overleftarrow{B}_t = B_{T-t}$, not to be mixed up with the backward Itô differential $\overleftarrow{dB}$.
whenever $(*)_{s,t} \approx 0$ in the sense that $(*)_{s,t} = O\big(|t-s|^{3\alpha}\big) = o(|t-s|)$, so that it does not contribute to the limit. (Recall (4.21) and Lemma 4.2.) But then
which settles the first equality in (5.9). The second one follows from $X_{s,t} = -X_{t,s}$ and, from Chen’s relation, $\mathbb{X}_{s,t} + \mathbb{X}_{t,s} + X_{s,t} \otimes X_{t,s} = \mathbb{X}_{s,s} = 0$. For the final equality, note that every partition $\mathcal{P}$ of $[r,T]$ induces a time-reversed partition $\overleftarrow{\mathcal{P}}$ of $[0, T-r]$, with each $[s,t]$ replaced by $[T-t, T-s]$. By Exercise 2.6, the (time $T$) time-reversal of $\mathbf{X}$ is again a rough path, $\overleftarrow{\mathbf{X}} \in \mathscr{C}^\alpha$, and since (easy to see) $(Y, Y') \in \mathscr{D}^{2\alpha}_X$ if and only if $(\overleftarrow{Y}, \overleftarrow{Y'}) \in \mathscr{D}^{2\alpha}_{\overleftarrow{X}}$, we obtain the final equality. □
At this stage, one could rephrase the defining condition for $(Y, Y') \in \mathscr{D}^{2\alpha}_X$ in terms of a “backward” controlledness condition for $(\hat Y, \hat Y') := (Y, -(Y')^T)$, together with a “backward” rough integral given by$^5$
\[
\lim_{|\mathcal{P}|\downarrow 0} \sum_{[s,t]\in\mathcal{P}} \big( \hat Y_t X_{s,t} + \hat Y'_t \mathbb{X}_{s,t} \big) =: \int_0^T (\hat Y, \hat Y')\, \overleftarrow{d\mathbf{X}} . \tag{5.11}
\]
However, this is no different than the “forward” integral $\int (Y, Y')\, d\mathbf{X}$. Comparing (5.8) with (5.11), one changed left- to right-point evaluation, followed by twisting the meaning of controlled rough path, to make sure nothing happened!
As should be clear at this point, a naïve backward rough integral of $(Y, Y') \in \mathscr{D}^{2\alpha}_X$ is, in general, not well-defined. In fact, in view of Proposition 5.12, existence of this limit is equivalent to existence of (either)
\[
\lim_{|\mathcal{P}|\downarrow 0} \sum_{[s,t]\in\mathcal{P}} Y'_t\, X_{s,t} \otimes X_{s,t} = \lim_{|\mathcal{P}|\downarrow 0} \sum_{[s,t]\in\mathcal{P}} Y'_s\, X_{s,t} \otimes X_{s,t} .
\]
$^4$ In coordinates: $(Y'\mathbb{X})^k = (Y')^k_{i,j} \mathbb{X}^{i,j}$ vs. $((Y')^T \mathbb{X})^k = (Y')^k_{j,i} \mathbb{X}^{i,j}$ with implicit summation over $i, j = 1, \dots, d$.
$^5$ Not to be confused with a standard “forward” rough integral $\int (\dots)\, d\overleftarrow{\mathbf{X}}$ seen in (5.9).
There is no reason why, for a general path X ∈ C α , the above limits should exist. On
the other hand, we already considered such sums in the context of the Itô–Föllmer
formula, cf. Lemma 5.11. The appropriate condition for X was seen to be “quadratic
variation (in the sense of Föllmer, along some $(\mathcal{P}_n)$)”. And under this assumption,
\[
\sum_{[s,t]\in\mathcal{P}^n} Y'_s\, X_{s,t} \otimes X_{s,t} \to \int_0^T Y'_s\, d[X]^\pi_s . \tag{5.12}
\]
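ii) Assume $(Y(\omega), Y'(\omega)) \in \mathscr{D}^{2\alpha}_{B(\omega)}$ a.s. and $Y_t, Y'_t$ are $\mathcal{F}_t^T$-measurable for all $t < T$. Then with probability one, for all $r \in [0,T]$,
\[
\int_r^T Y\, d\mathbf{B}^{\text{Strat}} = \int_r^T Y_t\, \overleftarrow{dB}_t - \frac12 \int_r^T Y'_t\, \mathrm{Id}\, dt = \int_r^T Y_s \circ \overleftarrow{dB}_s ,
\]
\[
\int_r^T Y\, d\mathbf{B}^{\text{back}} = \int_r^T Y_t\, \overleftarrow{dB}_t .
\]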
Proof. Regarding point i), it follows from the definition of the rough integral (see
also Example 4.14) that
\[
\int_0^t Y\, d\mathbf{B}^{\text{back}} = \int_0^t Y\, d\mathbf{B}^{\text{Itô}} + \int_0^t Y'\, \mathrm{Id}\, ds .
\]
The claim then follows from Proposition 5.1. The Stratonovich case is similar, now
using Corollary 5.2.
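using $Y'_{s,t}(X_{s,t} \otimes X_{s,t}) \approx 0$ and $Y'_{s,t}\, \mathrm{Id}(t-s) \approx 0$. (As before $(*)_{s,t} \approx 0$ means $(*)_{s,t} = o(|t-s|)$.) Now we know that with probability 1, $B(\omega)$ has finite quadratic variation $[B]^\pi_t = \mathrm{Id}\, t$, in the sense of Föllmer along some sequence $\pi = (\mathcal{P}^n)$. As a purely deterministic consequence, cf. (5.12), on the same set of full measure,
\[
\lim_{n\to\infty} \sum_{[s,t]\in\mathcal{P}^n} Y'_s\, B_{s,t} \otimes B_{s,t} = \int_0^T Y'_s\, d[B]^\pi_s = \lim_{n\to\infty} \sum_{[s,t]\in\mathcal{P}^n} Y'_s\, \mathrm{Id}(t-s) .
\]
Since $\mathbb{B}^{\text{Itô}}_{s,t}$ is independent from $\mathcal{F}_t^T$ and $Y_t, Y'_t$ are $\mathcal{F}_t^T$-measurable, a (backward) martingale argument shows that
\[
\lim_{n\to\infty} \sum_{[s,t]\in\mathcal{P}^n} Y'_t\, \mathbb{B}^{\text{Itô}}_{s,t} = 0 .
\]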
5.5 Exercises
Exercise 5.1 Complete the proof of Proposition 5.1 in the case of unbounded Y ′ .
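and note that $\lim_{M\to\infty} \tau_M = \infty$ almost surely. The stopped process $S^{\tau_M}_\cdot$ is also a martingale, and we see as above that, for every fixed $M > 0$,
\[
\Big\| \sum_{\substack{[u,v]\in\mathcal{P} \\ u \le \tau_M}} Y'_u \mathbb{B}_{u,v} \Big\|^2_{L^2} = O(|\mathcal{P}|) .
\]
\[
\tilde A_T(Y) = \frac{Y_T^2 - y_0^2 - \sigma^2 T}{2 \int_0^T Y_t^2\, dt} . \tag{5.14}
\]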
Exercise 5.3 (Rough vs. anticipating Skorokhod integration) We have seen that
Itô integration coincides with rough integration against BItô (ω), subject to natural
conditions (in particular: adaptedness of (Y, Y ′ ) which guarantees that both are
well-defined). A well-known extension of the Itô integral to non-adapted integrands
is given by the Skorokhod integral, details of which are found in any textbook on
Malliavin calculus, see for example [Nua06].
a) Let $B$ denote one-dimensional Brownian motion on $[0,T]$. Show that the Skorokhod integral of $B_T$ against $B$ over $[0,T]$, in symbols $\int_0^T B_T\, \delta B_t$, is given by $B_T^2 - T$.
b) Set $Y_t(\omega) := B_T(\omega)$, with (zero) increments (trivially) controlled by $B$ with $Y' := 0$. (In view of true roughness of Brownian motion, cf. Section 6, there is no other choice for $Y'$.) Show that the rough integral of $Y$ against Brownian motion over $[0,T]$ equals $B_T^2$. Conclude that Skorokhod and rough integrals (against Itô enhanced Brownian motion) do not coincide beyond adapted integrals.
Exercise 5.4 (Rough vs. anticipating Stratonovich integration [CFV07]) In the
spirit of Nualart–Pardoux [NP88], define the Stratonovich anticipating stochastic
integral by
\[
\int_0^t u(s,\omega) \circ dB_s(\omega) \stackrel{\mathrm{def}}{=} \lim_{n\to\infty} \int_0^t u(s,\omega)\, \frac{dB^n_s(\omega)}{ds}\, ds ,
\]
where the limit on the right-hand side exists in the almost sure sense. Conclude that in this case rough integration against $\mathbf{B}^{\text{Strat}}$ coincides almost surely with Stratonovich anticipating stochastic integration, i.e.
\[
\int_0^\cdot F_\omega(B_s)\, d\mathbf{B}^{\text{Strat}}(\omega) \equiv \int_0^\cdot F_\omega(B_s) \circ dB_s(\omega) .
\]
Exercise 5.5 Fix $t > 0$ and a sequence of dissections $(\mathcal{P}_n) \subset [0,t]$ with mesh $|\mathcal{P}_n| \to 0$. Consider the Itô–Föllmer integral given by
\[
\int_0^t DF(X)\, dX \stackrel{\mathrm{def}}{=} \lim_{n\to\infty} \sum_{[u,v]\in\mathcal{P}_n} DF(X_u) X_{u,v} ,
\]
whenever this limit exists. Show that this limit does not exist, in general, when $X = B^H$, a $d$-dimensional fractional Brownian motion with Hurst parameter $H < 1/2$.
Hint: Consider the simplest possible non-trivial case, namely $d = 1$ and $F(x) = x^2$.
Solution. Assume convergence, in probability say, along some $(\mathcal{P}_n)$ for the approximating (left-point) sum,
\[
\sum_{[u,v]\in\mathcal{P}_n} X_u X_{u,v} .
\]
We look for a contradiction. Elementary “calculus for sums” implies that the mid-point sum converges, i.e. where $X_u$ above is replaced by $X_u + X_{u,v}/2$. It follows that convergence of the left-point sums is equivalent to existence of quadratic variation, i.e. existence of
\[
\lim_{n\to\infty} \sum_{[u,v]\in\mathcal{P}_n} |X_{u,v}|^2 .
\]
Note that, for dyadic intervals of length $2^{-n}$, $\mathbb{E}|X_{u,v}|^2 = (1/2^n)^{2H}$ so that the expectation of this sum equals $2^{n(1-2H)}$, which diverges when $H < 1/2$. In particular, quadratic variation does not exist as $L^1$ limit. But it also cannot exist as a limit in probability, for both types of convergence are equivalent on any finite Wiener–Itô chaos.
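The divergence in the solution above is easy to tabulate. The helper below (its name is ours) evaluates the expected sum of squared increments of fractional Brownian motion over the $2^n$ dyadic intervals of $[0,1]$, using only the variance formula $\mathbb{E}|X_{u,v}|^2 = |v-u|^{2H}$ quoted in the text.

```python
def expected_dyadic_quadratic_variation(n, hurst):
    """Expected sum of |X_{u,v}|^2 over the 2^n dyadic intervals of [0, 1]."""
    intervals = 2 ** n
    return intervals * (1.0 / intervals) ** (2.0 * hurst)


for hurst in (0.25, 0.5, 0.75):
    row = [expected_dyadic_quadratic_variation(n, hurst) for n in (4, 8, 12)]
    print(hurst, row)
```

The rows grow like $2^{n(1-2H)}$ for $H < 1/2$, stay equal to one for $H = 1/2$, and tend to zero for $H > 1/2$.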
Solution. Without loss of generality, we consider the problem on the interval $[0, 2\pi]$. Assume by contradiction that there is a space $\mathcal{B} \subset C([0, 2\pi])$ which carries the law $\mu$ of Brownian motion and such that $(f, g) \mapsto \int f\, dg$ is continuous on $\mathcal{B}$. By definition, the Cameron–Martin space of $\mu$ is $\mathcal{H} = W_0^{1,2}([0, 2\pi])$, which has an orthonormal basis $\{e_n\}_{n\in\mathbb{Z}}$ given by
\[
e_0(t) = \frac{t}{\sqrt{2\pi}}, \qquad e_k(t) = \frac{\sin kt}{k\sqrt{\pi}}, \qquad e_{-k}(t) = \frac{1 - \cos kt}{k\sqrt{\pi}} ,
\]
for $k > 0$. It follows from standard Gaussian measure theory [Bog98] that, given a sequence $\xi_n$ of i.i.d. standard Gaussian random variables, the sequence $X_N = \sum_{n=-N}^N e_n \xi_n$ converges almost surely in $\mathcal{B}$ to a limit $X$ such that the law of $X$ is $\mu$. Write now $Y_N = \sum_{n=-N}^N \mathrm{sign}(n)\, e_n \xi_n$, so that one also has $Y_N \to Y$ with law of $Y$ given by $\mu$.
This immediately leads to a contradiction: on the one hand, assuming that $(f, g) \mapsto \int f\, dg$ is continuous on $\mathcal{B}$, this implies that $\int_0^{2\pi} X_N(t)\, dY_N(t)$ converges to some finite (random) real number. On the other hand, an explicit calculation yields
\[
\int_0^{2\pi} X_N(t)\, dY_N(t) = \frac{\xi_0^2}{2} + \sum_{n=1}^N \frac{\xi_n^2 + \xi_{-n}^2}{n} .
\]
5.6 Comments
Rough integrals of 1-forms against the Brownian rough path (and also continuous
semimartingales enhanced to rough paths) are well known to coincide with stochastic
integrals, see [LQ02, FV10b] and the references therein, [FS17, CF19] for the case
of càdlàg semimartingales. Chouk and Tindel [TC15] discuss, from a rough path view, Skorokhod and Stratonovich integration in the plane. Pathwise integration à la Föllmer is revisited and extended by Ananova, Cont and Perkowski [AC17, CP19]. Sharp rough path type p-variation and integrability estimates on martingale transforms (and then stochastic integrals against general càdlàg semimartingales) are given by Friz and Zorin-Kranich [FZK20]; this extends and unifies the relevant parts of [Lep76, FV08a, KZK19], see also [DOP19] for the use of such an estimate. This recently led to the notion of rough semimartingale [FZK20], which leads to a simultaneous development of (càdlàg) rough and stochastic integration. A parallel development
[FHL20], in a Hölder setting, is based on stochastic sewing (Section 4.6), see also
Exercise 4.15.
Chapter 6
Doob–Meyer type decomposition for rough paths
M ≡ M̃ and A ≡ Ã .
hence, by the first part, M τ , Aτ ≡ 0. This also implies that the quadratic variation of
M τ , denoted by ⟨M τ ⟩, vanishes. Since ⟨M τ ⟩ = ⟨M ⟩τ (see e.g. [RY99, Ch. IV]) it
indeed follows that ⟨M⟩ ≡ 0 on [0, τ). □
1
As opposed to Hölder regularity which quantifies “roughness from above”, in the sense of an
upper estimate of the increment.
6.2 Uniqueness of the Gubinelli derivative and Doob–Meyer
Here and in the sequel of this section we fix α ∈ ( 13 , 12 ], a rough path X = (X, X) ∈
C α ([0, T ], V ) and a controlled rough path (Y, Y ′ ) ∈ DX 2α
. We first address the
question to what extent X and Y determine the Gubinelli derivative Y ′ . As it turns
out, Y ′ is uniquely determined, provided that X is sufficiently “rough from below, in
all directions”. A Doob–Meyer type decomposition will then follow as a corollary.
Let us first consider the case when X is scalar, i.e. with values in V = R. Assume
that for some given s ∈ [0, T ), there exists a sequence of times tn ↓ s such that
2α
|Xs,tn |/|tn − s| → ∞, i.e.
|Xs,t |
lim 2α = +∞.
t↓s |t − s|
Then $Y'_s$ is uniquely determined from $Y$ by (4.18) and the condition that $\|R^Y\|_{2\alpha} < \infty$.
In fact, one necessarily has $X_{s,t_n} \in \mathbf{R} \setminus \{0\}$ for $n$ large enough and so, from the
very definition of $R^Y$,
\[
Y'_s = \frac{Y_{s,t_n}}{X_{s,t_n}} - \frac{R^Y_{s,t_n}}{|t_n - s|^{2\alpha}} \, \frac{|t_n - s|^{2\alpha}}{X_{s,t_n}} ,
\]
which implies that $\lim_{n \to \infty} Y_{s,t_n} / X_{s,t_n}$ exists and equals $Y'_s$. The multidimensional
case is not that different, and the above consideration suggests the following defini-
tion.
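The scalar argument above can be illustrated numerically. The following is a minimal sketch under stated assumptions: the path X_t = t^0.4, the remainder t^(2α) with α = 0.4, and the value 3 for the Gubinelli derivative at s = 0 are all hypothetical choices made for illustration, not taken from the text. Since |X_{0,t}| / t^(2α) diverges as t ↓ 0, the increment quotient Y_{0,t} / X_{0,t} should recover the derivative.

```python
# Toy scalar example; every concrete choice below is an illustrative assumption.
alpha = 0.4                # Hoelder exponent in (1/3, 1/2]
true_derivative = 3.0      # the value of Y'_0 that we try to recover

def X(t):
    # t**0.4 is alpha-Hoelder and |X_{0,t}| / t**(2*alpha) = t**(-0.4) blows up as t -> 0
    return t ** 0.4

def Y(t):
    # Y_{0,t} = Y'_0 X_{0,t} + R_{0,t} with remainder R_{0,t} = t**(2*alpha)
    return true_derivative * X(t) + t ** (2 * alpha)

def ratio(t):
    # increment quotient Y_{0,t} / X_{0,t}
    return (Y(t) - Y(0.0)) / (X(t) - X(0.0))

times = [10.0 ** (-k) for k in range(1, 11)]
ratios = [ratio(t) for t in times]
errors = [abs(r - true_derivative) for r in ratios]
```

In this toy case the error equals t^0.4, which is exactly the factor |t − s|^(2α) / |X_{s,t}| appearing in the display above.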
\[
\forall v^* \in V^* \setminus \{0\} : \quad \limsup_{t \downarrow s} \frac{|v^*(X_{s,t})|}{|t - s|^{2\alpha}} = \infty .
\]
where the second equality follows from the assumption made in (6.2). Now, $Y'_s X_{s,t}$
takes values in $\bar W$, the same Banach space in which $Y$ takes its values. For every
$w^* \in \bar W^*$, the map $V \ni v \mapsto w^*(Y'_s v)$ defines an element $v^* \in V^*$ so that
\[
\frac{|v^*(X_{s,t})|}{|t - s|^{2\alpha}} = \frac{|w^*(Y'_s X_{s,t})|}{|t - s|^{2\alpha}} = O(1) \quad \text{as } t \downarrow s .
\]
Unless $v^* = 0$, the assumption that “X is rough at time s” implies that, along some
sequence $t_n \downarrow s$, we have the divergent behaviour $|v^*(X_{s,t_n})| / |t_n - s|^{2\alpha} \to \infty$,
which contradicts that the same expression is $O(1)$ as $t_n \downarrow s$. We thus conclude that
$v^* = 0$. In other words,
\[
\forall w^* \in \bar W^*, \ v \in V : \quad w^*(Y'_s v) = 0 ,
\]
and this clearly implies $Y'_s = 0$. This finishes the proof of the implication stated in
(6.2). □
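Theorem 6.5 (Doob–Meyer for rough paths). Assume that $X$ is rough at some
time $s \in [0, T)$ and let $(Y, Y') \in \mathscr{D}_X^{2\alpha}$. Then
\[
\int_s^t Y \, d\mathbf{X} = O\big(|t - s|^{2\alpha}\big) \ \text{as } t \downarrow s \quad \Longrightarrow \quad Y_s = 0 . \tag{6.3}
\]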
where the last inequality is just the statement that $|t - s| = O\big(|t - s|^{2\alpha}\big)$ as $t \downarrow s$,
thanks to $\alpha \le 1/2$. We then conclude using (6.3) that $Y_s = \tilde Y_s$. If we now assume
true roughness of X, this conclusion holds for a dense set of times s and hence, by
(Attention that the above notation “hides” the dependence on $Y'$ resp. $\tilde Y'$.) But then
(6.4) implies
\[
\int_0^t Z_r \, dr \equiv \int_0^t \tilde Z_r \, dr \quad \text{for } t \in [0, T],
\]
and we conclude by differentiation with respect to $t$. □
Recall that (say, d-dimensional standard) Brownian motion satisfies the so-called
(Khintchine) law of the iterated logarithm, that is
\[
\forall t \ge 0 : \quad \mathbf{P}\Big( \limsup_{h \downarrow 0} \frac{|B_{t,t+h}|}{h^{1/2} (\ln \ln 1/h)^{1/2}} = \sqrt{2} \Big) = 1 . \tag{6.5}
\]
See [McK69, p.18] or [RY99, Ch. II] for instance, typically proved with exponential
martingales. Remark that it is enough to consider t = 0 since (Bt,t+h : h ≥ 0) is
also a Brownian motion.
Theorem 6.6. With probability one, Brownian motion on V = Rd is truly rough,
relative to any Hölder exponent α ∈ [1/4, 1/2).
Proof. It is enough to show that, for fixed time $s$, and any $\theta \in [1/2, 1)$,
\[
\mathbf{P}\Big( \forall v^* \in V^*, \ |v^*| = 1 : \ \limsup_{t \downarrow s} \frac{|v^*(B_{s,t})|}{|t - s|^{\theta}} = +\infty \Big) = 1 .
\]
(Then take s ∈ Q and conclude that the above event holds true, simultaneously for
all such s, with probability one.)
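To this end, set $h^{1/2} (\ln \ln 1/h)^{1/2} \equiv \psi(h)$. We need the following two
consequences of (6.5). There exists $c > 0$ (here $c = \sqrt{2}$) such that for every fixed unit dual
vector $v^* \in V^* = (\mathbf{R}^d)^*$ and every fixed $s \in [0, T)$
\[
\mathbf{P}\Big[ \limsup_{t \downarrow s} |v^*(B_{s,t})| / \psi(t - s) \ge c \Big] = 1 ,
\qquad
\mathbf{P}\Big[ \limsup_{t \downarrow s} \frac{|B_{s,t}|}{\psi(t - s)} < \infty \Big] = 1 .
\]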
On the other hand, every unit dual vector v ∗ ∈ V ∗ is the limit of some (vn∗ ) ⊂ K.
Then
\[
\frac{|v_n^*(B_{s,t})|}{\psi(t - s)} \le \frac{|v^*(B_{s,t})|}{\psi(t - s)} + |v_n^* - v^*|_{V^*} \frac{|B_{s,t}|}{\psi(t - s)}
\]
so that, using $\limsup (|a| + |b|) \le \limsup (|a|) + \limsup (|b|)$, and restricting to the above set
of full measure,
\[
0 < c \le \limsup_{t \downarrow s} \frac{|v^*(B_{s,t})|}{\psi(t - s)} .
\]
Hence, for a.e. sample B = B(ω) we can pick a sequence (tn ) converging to s such
that |v ∗ (Bs,tn )|/ψ(tn − s) ≥ c − 1/n. On the other hand, for any θ ≥ 1/2 we have
Observe that, indeed, any element in $\mathscr{C}^\alpha$ which is θ-Hölder rough for θ < 2α
is truly rough. (We shall see in the next section that multidimensional Brownian
motion is θ-Hölder rough for any θ > 1/2.) The following result can be viewed as
a quantitative version of Proposition 6.4.
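Proposition 6.8. Let $(X, \mathbb{X}) \in \mathscr{C}^\alpha([0, T], V)$ be such that $X$ is θ-Hölder rough for
some $\theta \in (0, 1]$. Then, for every controlled rough path $(Y, Y') \in \mathscr{D}_X^{2\alpha}([0, T], W)$
one has,
\[
\forall \varepsilon \in (0, \varepsilon_0] : \quad L \varepsilon^{\theta} \|Y'\|_\infty \le \mathrm{osc}(Y, \varepsilon) + \|R^Y\|_{2\alpha} \, \varepsilon^{2\alpha} . \tag{6.7}
\]
As immediate consequence, if $\theta < 2\alpha$, $Y'$ is uniquely determined from $Y$, i.e. if
$(Y, Y')$ and $(\tilde Y, \tilde Y')$ both belong to $\mathscr{D}_X^{2\alpha}$ and $Y \equiv \tilde Y$, then $Y' \equiv \tilde Y'$.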
Proof. Let us start with the consequence: apply estimate (6.7) with $Y$ replaced by
$Y - \tilde Y = 0$ and similarly $Y'$ replaced by $Y' - \tilde Y'$. Thanks to $L > 0$ it follows that
\[
\|Y' - \tilde Y'\|_\infty = O\big(\varepsilon^{2\alpha - \theta}\big)
\]
(Note that one has indeed (Ys′ )∗ : W ∗ → V ∗ .) Combining both (6.8) and (6.9), we
thus obtain that
Taking the supremum over all such w∗ ∈ W ∗ of unit length and using the fact that
the norm of a linear operator is equal to the norm of its dual, we obtain
Theorem 6.10 (Norris lemma for rough paths). Let $\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}^\alpha([0, T], V)$
\[
I_t = \int_0^t Y_s \, d\mathbf{X}_s + \int_0^t Z_s \, ds .
\]
Then there exist constants $r > 0$ and $q > 0$ such that, setting
\[
R := 1 + L_\theta(X)^{-1} + |||\mathbf{X}|||_\alpha + \|Y, Y'\|_{X;2\alpha} + |Y_0| + |Y'_0| + \|Z\|_\alpha + |Z_0|
\]
We now turn to Hölder-roughness of Brownian motion. Our focus will be on the unit
interval T = 1, and we consider scales up to ε0 = 1/2 for the sake of argument.
Proposition 6.11. Let B be a standard Brownian motion on [0, 1] taking values in
$\mathbf{R}^d$. Then, for every $\theta > \frac12$, the sample paths of $B$ are almost surely θ-Hölder rough.
Moreover, with scale ε0 = 1/2 and writing Lθ (B) for the modulus of θ-Hölder
roughness, there exist constants M and c such that
Proof. The standard small ball estimate for Brownian motion (see for example
[LS01]) yields the bound
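\[
\sup_{|\varphi| = 1} \mathbf{P}\Big( \sup_{t \in [0, \delta]} |\langle \varphi, B(t) \rangle| \le \varepsilon \Big) \le C \exp(-c \delta \varepsilon^{-2}) . \tag{6.11}
\]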
The required estimate then follows from a standard chaining argument, as in [Nor86,
p. 127]: cover the sphere $|\varphi| = 1$ with $\varepsilon^{-2(d-1)}$ balls of radius $\varepsilon^2$, say, centred
at $\varphi_i$. We then use the fact that, since the supremum of $B$ has Gaussian tails, if
$\sup_{t \in [0, \delta]} |\langle \varphi_i, B(t) \rangle| \le \varepsilon$, then the same bound, but with $\varepsilon$ replaced by $2\varepsilon$, holds
with probability exponentially close to 1 uniformly over all $\varphi$ in the ball of radius $\varepsilon^2$
centred at $\varphi_i$. Since there are only polynomially many such balls required to cover
the whole sphere, (6.10) follows. Note that this chaining argument uses in a crucial
way that the number of balls of radius $\varepsilon^2$ required to cover the sphere $\|\varphi\| = 1$ grows
only polynomially with $\varepsilon^{-1}$.
It is clear that bounds of the type (6.10) break down in infinite dimensions: if we
consider a cylindrical Wiener process, then (6.11) still holds, but the unit sphere of a
Hilbert space cannot be covered by a finite number of small balls anymore. If on the
other hand, we consider a process with a non-trivial covariance, then we can get the
chaining argument to work, but the bound (6.11) would break down due to the fact
that $\langle \varphi, B(t) \rangle$ can then have arbitrarily small variance. □
where the inf is taken over |φ| = 1, s ∈ [0, 1] and ε ∈ (0, 1/2]. We then define the
“discrete analog” Dθ (X) of Lθ (X) to be given by
Therefore, by the triangle inequality, we conclude that the magnitude of the difference
between ⟨φ, Xs ⟩ and one of the two terms ⟨φ, Xti ⟩, i = 1, 2 (say t1 ) is at least
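\[
|\langle \varphi, X_{s,t_1} \rangle| \ge \frac12 \, 2^{-n\theta} D_\theta(X)
\]
and therefore
\[
\frac{|\langle \varphi, X_{s,t_1} \rangle|}{\varepsilon^{\theta}} \ge \frac12 \, \frac{2^{-n\theta}}{\varepsilon^{\theta}} \, D_\theta(X) \ge \frac12 \, \frac{1}{2^{\theta}} \, D_\theta(X) .
\]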
Since s, ε and φ were chosen arbitrarily, the claim (6.12) follows.
Applying this to Brownian sample paths, X = B(ω), it follows that it is sufficient
to obtain the requested bound on P(Dθ (B) < ε). We have the straightforward bound
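\[
\begin{aligned}
\mathbf{P}(D_\theta(B) < \varepsilon)
&\le \mathbf{P}\Big( \inf_{\|\varphi\| = 1} \inf_{n \ge 1} \inf_{k \le 2^n} \sup_{s,t \in I_{k,n}} \frac{|\langle \varphi, B_{s,t} \rangle|}{2^{-n\theta}} < \varepsilon \Big) \\
&\le \sum_{n=1}^{\infty} \sum_{k=1}^{2^n} \mathbf{P}\Big( \inf_{\|\varphi\| = 1} \sup_{s,t \in I_{k,n}} |\langle \varphi, B_{s,t} \rangle| < 2^{-n\theta} \varepsilon \Big) .
\end{aligned}
\]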
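Since trivially $\sup_{s,t \in I_{k,n}} |\langle \varphi, B_{s,t} \rangle| \ge \sup_{t \in I_{k,n}} |\langle \varphi, B_{r,t} \rangle|$, where $r$ is the left boundary
of the interval $I_{k,n}$, we can bound this by applying Lemma 6.12. Noting that the
bound obtained in this way is independent of $k$, we conclude that
\[
\mathbf{P}(D_\theta(B) < \varepsilon) \le M \sum_{n=1}^{\infty} 2^n \exp\big( -c \, 2^{(2\theta - 1) n} \varepsilon^{-2} \big) \le \tilde M \sum_{n=1}^{\infty} \exp\big( -\tilde c \, n \, \varepsilon^{-2} \big) .
\]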
Here, we used the fact that as soon as $\theta > \frac12$, we can find constants $K$ and $\tilde c$ such that
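uniformly over all $\varepsilon < 1$ and all $n \ge 1$. (Consider separately the cases $\varepsilon^2 \in (0, 1/n)$
and $\varepsilon^2 \in [1/n, 1)$.) We deduce from this the bound
\[
\mathbf{P}(D_\theta(B) < \varepsilon) \le M \Big( e^{-\tilde c \varepsilon^{-2}} + \int_1^{\infty} \exp\big( -\tilde c \, \varepsilon^{-2} x \big) \, dx \Big) ,
\]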
Note that the proof given above is quite robust. In particular, we did not really
make use of the fact that B has independent increments. In fact, it transpires that all
that is required in order to prove the Hölder roughness of sample paths of a Gaussian
process W with stationary increments is a small ball estimate of the type
\[
\mathbf{P}\Big( \sup_{t \in [0, \delta]} |W_t - W_0| \le \varepsilon \Big) \le C \exp(-c \, \delta^{\alpha} \varepsilon^{-\beta}) ,
\]
for some exponents α, β > 0. These kinds of estimates are available for example for
fractional Brownian motion with arbitrary Hurst parameter H ∈ (0, 1).
6.6 Exercises
Exercise 6.1 Show that the Q-Wiener process (as introduced in Exercise 3.4) is truly
rough.
Exercise 6.2 State precisely and prove: multidimensional fractional Brownian
motion $B^H$, $H \in (1/3, 1/2]$, is truly rough.
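Exercise 6.3 In (6.7), estimate $\mathrm{osc}(Z, \varepsilon)$ by $2\|Y\|_\infty$ (or alternatively by $\|Y\|_\alpha \varepsilon^{\alpha}$)
and deduce the estimate
\[
\|Z'\|_\infty \le \frac{1}{L} \inf_{\varepsilon \in (0, \varepsilon_0]} \Big( 2 \varepsilon^{-\theta} \|Y\|_\infty + \|R^Z\|_{2\alpha} \, \varepsilon^{2\alpha - \theta} \Big) .
\]
Carry out the elementary optimisation, e.g. when $\varepsilon_0 = T/2$, to see that
\[
\|Z'\|_\infty \le \frac{4 \|Y\|_\infty}{L(\theta, X)} \Big( \|R^Z\|_{2\alpha}^{\theta / 2\alpha} \, \|Y\|_\infty^{-\theta / 2\alpha} \vee T^{-\theta} \Big) .
\]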
∗ Exercise 6.4 (Norris’ lemma for rough paths; [HP13]) Give a complete proof of
Theorem 6.10.
6.7 Comments
The notion of θ-roughness was first introduced in Hairer–Pillai [HP13], which also
contains Proposition 6.8, although some of the ideas underlying the concepts pre-
sented here were already apparent in Baudoin–Hairer [BH07] and Hairer–Mattingly
[HM11]. A version of this “Norris lemma” in the context of SDEs driven by fractional
Brownian motion was proposed independently by Hu–Tindel [HT13]. The simplified
condition of “true” roughness (which may be verified in infinite dimensions), targeted
directly at a Doob–Meyer decomposition, is taken from Friz–Shekhar [FS13]; the
quantitative “Norris lemma” is taken from Hairer–Pillai [HP13]. These results also
hold in “rougher” situations, i.e. when α ≤ 1/3, see [FS13, CHLT15].
Chapter 7
Operations on controlled rough paths
At first sight, the notation $\int Y \, d\mathbf{X}$ introduced in Chapter 4 is ambiguous since the
resulting controlled rough path depends in general on the choices of both the second-
order process X and the derivative process Y ′ . Fortunately, this “lack of completeness”
in our notations is mitigated by the fact that in virtually all situations of interest, Y
is constructed by using a small number of elementary operations described in this
chapter. For all of these operations, it turns out to be intuitively rather clear how the
corresponding derivative process is constructed.
where Yu′ ⊗Yu′ ∈ L(V ⊗V, W ⊗W ) is given by (Yu′ ⊗Yu′ )(v⊗ṽ) = (Yu′ (v))⊗(Yu′ (ṽ)).
The fact that ∥Y∥2α is finite is then a consequence of (4.25). On the other hand, the
algebraic relations (2.1) already hold for the “Riemann sum” approximations to the
three integrals, provided that the partition used for the approximation of Ys,t is the
union of the one used for the approximation of Ys,u with the one used for Yu,t .
1
It can also be useful to consider t 7→ X0,t as a path “controlled by X”, resulting in the controlled
rough path (X, X); cf. Exercise 4.6.
Here, the left-hand side uses (4.24) to define the integral of two controlled rough
paths against each other and the right-hand side uses the original definition (4.21)
of the integral of a controlled rough path against its reference path.
Proof. By assumption, one has Ys,t = Ys′ Xs,t + O(|t − s|2α ) and Z̃s,t = Z̃s′ Ys,t +
O(|t − s|2α ). Combining these identities, it follows immediately that
Zs,t = Z̃s′ Ys′ Xs,t + O(|t − s|2α ) = Zs′ Xs,t + O(|t − s|2α ) ,
so that $(Z, Z') \in \mathscr{D}_X^{2\alpha}$ as required. Now the left-hand side of (7.1) is given by $\mathcal{I}\Xi_{0,t}$
with Ξs,t = Zs Ys,t + Zs′ Ys′ Xs,t , whereas the right-hand side is given by I Ξ̃0,t ,
where we set Ξ̃s,t = Z̃s Ỹs,t + Z̃s′ Ys,t . Since |Ys,t − Ys′ Ys′ Xs,t | ≤ C|t − s|3α by
(4.22), the claim now follows from Remark 4.13. ⊔ ⊓
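Remark 7.2. It is straightforward to see that if $\frac13 < \beta < \alpha$, then $\mathscr{C}^\alpha \hookrightarrow \mathscr{C}^\beta$ and, for
every $\mathbf{X} \in \mathscr{C}^\alpha$, we have a canonical embedding $\mathscr{D}_X^{2\alpha} \hookrightarrow \mathscr{D}_X^{2\beta}$. Furthermore, in view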
of the definition (4.10) of I, the values of the integrals defined above do not depend
on the interpretation of the integrand and integrator as elements of one or the other
space.
Let $W$ and $\bar W$ be two Banach spaces and let $\varphi : W \to \bar W$ be a function in $C_b^2$. Let
furthermore $(Y, Y') \in \mathscr{D}_X^{2\alpha}([0, T], W)$ for some $X \in \mathcal{C}^\alpha$. (In applications $X$ will
be part of some $\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}^\alpha$ but this is irrelevant here.) Then one can define a
(candidate) controlled rough path $(\varphi(Y), \varphi(Y)') \in \mathscr{D}_X^{2\alpha}([0, T], \bar W)$ by
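which shows that $\varphi(Y), \varphi(Y)' \in \mathcal{C}^\alpha$. Furthermore, $R^{\varphi} \equiv R^{\varphi(Y)}$ is given by
\[
\begin{aligned}
R^{\varphi}_{s,t} &= \varphi(Y_t) - \varphi(Y_s) - D\varphi(Y_s) Y'_s X_{s,t} \\
&= \varphi(Y_t) - \varphi(Y_s) - D\varphi(Y_s) Y_{s,t} + D\varphi(Y_s) R^Y_{s,t}
\end{aligned}
\]
so that,
\[
\|R^{\varphi}\|_{2\alpha} \le \frac12 |D^2\varphi|_\infty \|Y\|_\alpha^2 + |D\varphi|_\infty \|R^Y\|_{2\alpha} .
\]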
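It follows that
\[
\begin{aligned}
\|\varphi(Y), \varphi(Y)'\|_{X,2\alpha}
&\le \|D\varphi(Y_\cdot)\|_\infty \|Y'_\cdot\|_\alpha + \|Y'_\cdot\|_\infty \|D^2\varphi(Y_\cdot)\|_\infty \|Y_\cdot\|_\alpha
 + \frac12 |D^2\varphi|_\infty \|Y\|_\alpha^2 + |D\varphi|_\infty \|R^Y\|_{2\alpha} \\
&\le \|\varphi\|_{C_b^2} \Big( \|Y'_\cdot\|_\alpha + \|Y'_\cdot\|_\infty \|Y_\cdot\|_\alpha + \|Y\|_\alpha^2 + \|R^Y\|_{2\alpha} \Big) \\
&\le C_{\alpha,T} \|\varphi\|_{C_b^2} (1 + \|X\|_\alpha)^2 \big( 1 + |Y'_0| + \|Y, Y'\|_{X,2\alpha} \big)
 \times \big( |Y_0| + \|Y, Y'\|_{X,2\alpha} \big) ,
\end{aligned}
\]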
′
In Lemma 7.3 we showed that controlled rough paths composed with (sufficiently)
regular functions are again controlled rough paths. We shall be interested to quantify
the continuity of this operation. As a useful warm-up, we start with the case of Hölder
paths.
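Lemma 7.5. Assume $\varphi \in C_b^2(W, \bar W)$ and $T \le 1$. Then there exists a constant $C_{\alpha,K}$
such that for all $X, Y \in \mathcal{C}^\alpha([0, T], W)$ with $\|X\|_{\alpha;[0,T]}, \|Y\|_{\alpha;[0,T]} \le K \in [1, \infty)$,
\[
\|\varphi(X) - \varphi(Y)\|_{\alpha;[0,T]} \le C_{\alpha,K} \|\varphi\|_{C_b^2} \big( |X_0 - Y_0| + \|X - Y\|_{\alpha;[0,T]} \big) .
\]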
The idea is to use a division property of sufficiently smooth functions. In the present
context, this simply means that one has
\[
\varphi(x) - \varphi(y) = g(x, y)(x - y) \quad \text{with} \quad g(x, y) := \int_0^1 D\varphi(t x + (1 - t) y) \, dt ,
\]
|(g(x, y) − g(x̃, ỹ))| ≤ ∥g∥Lip |(x − x̃, y − ỹ)| ≤ C∥D2 φ∥∞ (|x − x̃| + |y − ỹ|).
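The division property and the Lipschitz bound on g can be checked numerically in a scalar toy case. This is a sketch only: the choice φ = sin (so that Dφ = cos and |D²φ|∞ = 1), the midpoint quadrature, and the function names are assumptions made for illustration.

```python
import math

def g(x, y, n=2000):
    # midpoint rule for g(x, y) = int_0^1 Dphi(t x + (1 - t) y) dt, with phi = sin
    total = 0.0
    for i in range(n):
        t = (i + 0.5) / n
        total += math.cos(t * x + (1.0 - t) * y)
    return total / n

def division_gap(x, y):
    # |phi(x) - phi(y) - g(x, y) (x - y)|, which should vanish up to quadrature error
    return abs(math.sin(x) - math.sin(y) - g(x, y) * (x - y))

def lipschitz_quotient(x, y, xt, yt):
    # |g(x, y) - g(xt, yt)| / (|x - xt| + |y - yt|), bounded by |D^2 phi|_inf = 1 here
    return abs(g(x, y) - g(xt, yt)) / (abs(x - xt) + abs(y - yt))
```

For instance, `division_gap(0.3, 1.7)` should be at the level of the quadrature error, and `lipschitz_quotient(0.3, 1.7, 0.5, 1.2)` should not exceed 1.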
We can now show the analogous statement for controlled rough paths, using
notation previously introduced in Section 4.4.
and similarly for Z̃, Z̃ ′ . Then, one has the local Lipschitz estimates
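\[
\|Z, Z'; \tilde Z, \tilde Z'\|_{X,\tilde X,2\alpha} \le C_M \Big( \|X - \tilde X\|_\alpha + |Y_0 - \tilde Y_0| + |Y'_0 - \tilde Y'_0| + \|Y, Y'; \tilde Y, \tilde Y'\|_{X,\tilde X,2\alpha} \Big) , \tag{7.4}
\]
as well as
\[
\|Z - \tilde Z\|_\alpha \le C_M \Big( \|X - \tilde X\|_\alpha + |Y_0 - \tilde Y_0| + |Y'_0 - \tilde Y'_0| + \|Y, Y'; \tilde Y, \tilde Y'\|_{X,\tilde X,2\alpha} \Big) , \tag{7.5}
\]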
for a suitable constant CM = C(M, α, φ).
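Proof. (The reader is urged to revisit Lemma 7.3 where the composition (7.3) was
seen to be well-defined for $\varphi \in C_b^2$.) Similarly to the previous proof, noting that
\[
|Z'_0 - \tilde Z'_0| = |D\varphi(Y_0) Y'_0 - D\varphi(\tilde Y_0) \tilde Y'_0| \le C_M \big( |Y_0 - \tilde Y_0| + |Y'_0 - \tilde Y'_0| \big)
\]
\[
\|D\varphi(Y) Y' - D\varphi(\tilde Y) \tilde Y'\|_\alpha + \|R^Z - R^{\tilde Z}\|_{2\alpha} .
\]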
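Write $C_M(\varepsilon_X + \varepsilon_0 + \varepsilon'_0 + \varepsilon)$ for the right-hand side of (7.4). Note that with this
notation, from (4.30),
\[
\|Y - \tilde Y\|_\alpha \lesssim \varepsilon_X + \varepsilon'_0 + \varepsilon =: \varepsilon_Y ,
\]
and also $\|Y - \tilde Y\|_{\infty;[0,T]} \lesssim \varepsilon_0 + \varepsilon_Y$ (uniformly over $T \le 1$). Since $D\varphi \in C_b^2$, we
know from Lemma 7.5 that
\[
\|D\varphi(\tilde Y) - D\varphi(Y)\|_{\mathcal{C}^\alpha} = |D\varphi(\tilde Y_0) - D\varphi(Y_0)| + \|D\varphi(\tilde Y) - D\varphi(Y)\|_\alpha \le C(\varepsilon_0 + \varepsilon_Y)
\]
where $C$ depends on the $C_b^3$-norm of $\varphi$. Also, $\|Y' - \tilde Y'\|_{\mathcal{C}^\alpha} \le \varepsilon'_0 + \varepsilon$. Clearly then
($\mathcal{C}^\alpha$ is a Banach algebra under pointwise multiplication), we have, for a constant $C_M$,
\[
\begin{aligned}
\|D\varphi(Y) Y' - D\varphi(\tilde Y) \tilde Y'\|_\alpha &\le C_M (\varepsilon_0 + \varepsilon_Y + \varepsilon'_0 + \varepsilon) \\
&\lesssim C_M (\varepsilon_X + \varepsilon_0 + \varepsilon'_0 + \varepsilon) .
\end{aligned}
\]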
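Taking the difference with $R^{\tilde Z}$ (replace $Y, Y', R^Y$ above by $\tilde Y, \tilde Y', R^{\tilde Y}$) leads to the
bound $|R^Z_{s,t} - R^{\tilde Z}_{s,t}| \le T_1 + T_2$ where
\[
\begin{aligned}
T_1 &:= \Big| \varphi(Y_t) - \varphi(Y_s) - D\varphi(Y_s) Y_{s,t} - \big( \varphi(\tilde Y_t) - \varphi(\tilde Y_s) - D\varphi(\tilde Y_s) \tilde Y_{s,t} \big) \Big| \\
&= \Big| \int_0^1 \Big( D^2\varphi(Y_s + \theta Y_{s,t})(Y_{s,t}, Y_{s,t}) - D^2\varphi(\tilde Y_s + \theta \tilde Y_{s,t})(\tilde Y_{s,t}, \tilde Y_{s,t}) \Big) (1 - \theta) \, d\theta \Big| , \\
T_2 &:= \Big| D\varphi(Y_s) R^Y_{s,t} - D\varphi(\tilde Y_s) R^{\tilde Y}_{s,t} \Big| .
\end{aligned}
\]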
As for the second term, we know $|R^Y_{s,t} - R^{\tilde Y}_{s,t}| \le (\varepsilon'_0 + \varepsilon) |t - s|^{2\alpha}$, for all $s, t$, while
noting that this estimate is uniform in $s, t \in [0, T]$ and $\theta \in [0, 1]$. It then suffices
to insert / subtract $D^2\varphi(Y_s + \theta Y_{s,t})(\tilde Y_{s,t}, \tilde Y_{s,t})$ under the integral $\int \ldots (1 - \theta) \, d\theta$
appearing in the definition of $T_1$ and conclude with the triangle inequality and some
simple estimates, keeping in mind that $\|Y - \tilde Y\|_\alpha \le \varepsilon_Y$ and $\|Y\|_\alpha, \|\tilde Y\|_\alpha \lesssim C_M$. □
and now ask for a similar formula for $F(Y_t)$, when $(Y, Y') \in \mathscr{D}_X^{2\alpha}$ is a controlled
rough path. It turns out that we need to be more specific and assume
\[
Y_t = Y_0 + \int_0^t Y'_s \, d\mathbf{X}_s + \Gamma_t , \tag{7.7}
\]
with $(Y', Y'') \in \mathscr{D}_X^{2\alpha}$, such as to have a well-defined rough integral; some flexibility
is added in form of a “drift” term Γ , assumed regular in time. Such paths arise
naturally as rough integrals of 1-forms, cf. Section 4.2, and also if Y is the solution
to a rough differential equation driven by X to be discussed in Section 8.1. In analogy
with similar Itô formulae from stochastic calculus, we expect
\[
F(Y_t) = F(Y_0) + \int_0^t DF(Y_s) Y'_s \, d\mathbf{X}_s + \int_0^t DF(Y_s) \, d\Gamma_s
 + \frac12 \int_0^t D^2F(Y_s)(Y'_s, Y'_s) \, d[\mathbf{X}]_s . \tag{7.8}
\]
Before going on, we note that the right-hand side above is indeed meaningful: the last
two integrals are Young integrals and the first is a bona-fide rough integral. Indeed,
by Lemma 7.3 and Corollary 7.4, the integrand Z ′ := DF (Y )Y ′ is controlled by X,
with Gubinelli derivative Z ′′ = D2 F (Y )(Y ′ , Y ′ ) + DF (Y )Y ′′ , so that the rough
integral, following Theorem 4.10,
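\[
\int_0^t DF(Y_s) Y'_s \, d\mathbf{X}_s = \int_0^t Z'_s \, d\mathbf{X}_s = \lim_{|\mathcal{P}| \to 0} \sum_{[u,v] \in \mathcal{P}} \big( Z'_u X_{u,v} + Z''_u \mathbb{X}_{u,v} \big) , \tag{7.9}
\]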
(7.11)
where $\mathbb{Y}_{u,v} = \int_u^v Y_{u,\cdot} \otimes dY$ in the sense of Remark 4.12, noting that $\mathbb{Y}_{u,v} =
Y'_u Y'_u \mathbb{X}_{u,v} + o(|v - u|)$. Also,
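Let us also subtract / add $DF(Y_u) Y''_u \mathbb{X}_{u,v}$ from (7.11). Then $F(Y_t) - F(Y_0)$ equals
\[
\begin{aligned}
& \lim_{|\mathcal{P}| \to 0} \sum_{[u,v] \in \mathcal{P}} \Big( DF(Y_u)(Y_{u,v} - Y''_u \mathbb{X}_{u,v}) + DF(Y_u) Y''_u \mathbb{X}_{u,v} + D^2F(Y_u) Y'_u Y'_u \mathbb{X}_{u,v} \Big) \\
& \qquad + \lim_{|\mathcal{P}| \to 0} \sum_{[u,v] \in \mathcal{P}} D^2F(Y_u) Y'_u Y'_u [\mathbf{X}]_{u,v} \\
&= \lim_{|\mathcal{P}| \to 0} \sum_{[u,v] \in \mathcal{P}} \Big( DF(Y_u) Y'_u X_{u,v} + \big( DF(Y_u) Y''_u + D^2F(Y_u) Y'_u Y'_u \big) \mathbb{X}_{u,v} \Big) \\
& \qquad + \lim_{|\mathcal{P}| \to 0} \sum_{[u,v] \in \mathcal{P}} DF(Y_u) \Gamma_{u,v} + \int_0^t D^2F(Y_u) Y'_u Y'_u \, d[\mathbf{X}]_u .
\end{aligned}
\]
In view of (7.9), also noting the appearance of two Young integrals in the last line,
the proof is complete. □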
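Formula (7.8) can be sanity-checked on a scalar toy example. The sketch below rests on several assumptions that are not in the text: the reference path is X_t = sin t, the second level is the non-geometric choice 𝕏_{s,t} = ½ X_{s,t}² − ½ (t − s) (which satisfies Chen's relation and is of order |t − s|), the bracket is taken in the scalar form [𝐗]_{s,t} = X_{s,t}² − 2𝕏_{s,t} = t − s, and we take Y = X, so Y′ = 1, Y″ = 0, Γ = 0, with F(x) = x³. The rough integral is approximated by the compensated Riemann sum of (7.9).

```python
import math

def ito_check(n=10000, T=1.0):
    # Reference path X_t = sin(t) with the assumed non-geometric second level
    #   XX_{s,t} = 0.5 * X_{s,t}**2 - 0.5 * (t - s),
    # so that the bracket is X_{s,t}**2 - 2 * XX_{s,t} = t - s.
    X = lambda t: math.sin(t)
    F = lambda x: x ** 3
    DF = lambda x: 3.0 * x ** 2
    D2F = lambda x: 6.0 * x

    rough_integral = 0.0   # compensated Riemann sum with Y = X, Y' = 1, Y'' = 0
    bracket_term = 0.0     # Riemann sum for (1/2) int D^2F(X) d[X]
    for i in range(n):
        u, v = i * T / n, (i + 1) * T / n
        dX = X(v) - X(u)
        XX = 0.5 * dX ** 2 - 0.5 * (v - u)
        bracket = dX ** 2 - 2.0 * XX
        rough_integral += DF(X(u)) * dX + D2F(X(u)) * XX
        bracket_term += 0.5 * D2F(X(u)) * bracket
    lhs = F(X(T)) - F(X(0.0))
    return lhs, rough_integral + bracket_term, rough_integral

lhs, with_bracket, without_bracket = ito_check()
```

In this example the sum including the bracket term matches F(X_T) − F(X_0) up to discretisation error, whereas the rough integral alone misses it by roughly 3(1 − cos 1), which is the size of the bracket correction.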
It is worth having a different perspective on this Itô formula and take $\Gamma = 0$ for
an unobstructed view. Then assumption (7.7) means exactly that $(Y, Y', Y'') \in \mathscr{D}_X^{3\alpha}$
in the sense (cf. Definition 4.18)
Z = (Z, Z ′ , Z ′′ ) := (F (Y ), DF (Y )Y ′ , DF (Y )Y ′′ + D2 F (Y )(Y ′ , Y ′ ))
Remark 7.9. The conclusion $Z \in \mathscr{D}_X^{3\alpha}$ can be “itemised”, similar to (7.12). The $k\alpha$
estimates ($k = 1, 2, 3$) are then uniform over $F \in C_b^3$, in analogy with the estimate
for elements in $\mathscr{D}_X^{2\alpha}$, as was detailed in Lemma 7.3.
Proof. We give a direct proof, without intermediate use of rough integrals (and in
fact no need for $\alpha > 1/3$) to emphasise the analogy with our previous Lemma 7.3
on composition of elements in $\mathscr{D}_X^{2\alpha}$ with regular functions. By Taylor’s theorem,
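\[
F(Y_t) \underset{3\alpha}{=} F(Y_s) + DF(Y_s)\big( Y'_s X_{s,t} + Y''_s \mathbb{X}_{s,t} \big) + \frac12 D^2F(Y_s)(Y_{s,t}, Y_{s,t}) .
\]
Note that $Y_{s,t} \otimes Y_{s,t} \underset{3\alpha}{=} (Y'_s X_{s,t})^{\otimes 2}$ and by geometricity $\frac12 X_{s,t}^{\otimes 2} = \mathbb{X}_{s,t}$, so that the
second order term in the Taylor expansion can be replaced by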
Let us conclude this section by showing how these canonical operations can be lifted
to the case of controlled rough paths of low regularity, i.e. when $\alpha < \frac13$. Recall
from Section 4.5 that basis vectors in $T^{(N)}(\mathbf{R}^d)$ are of the form $e_w = e_{w_1} \otimes \ldots \otimes e_{w_k}$,
for words of the form $w = w_1 \cdots w_k$ with letters in $\{1, \ldots, d\}$, whereas words
themselves are identified via the dual basis of $T^{(N)}(\mathbf{R}^d)^*$,
\[
w \leftrightarrow e^*_w .
\]
Controlled rough paths $\mathbf{Y}$ are $T^{(N-1)}(\mathbf{R}^d)^*$-valued functions, which are controlled
by increments of $\mathbf{X}$ in the sense of Definition 4.18.
This suggests that, in order to define the product of two controlled rough paths
$\mathbf{Y}$ and $\bar{\mathbf{Y}}$, we should first ask ourselves how a product of the type $\mathbf{X}^{w}_{s,t} \mathbf{X}^{\bar w}_{s,t}$ for two
different words $w$ and $\bar w$ can be rewritten as a linear combination of the increments of
$\mathbf{X}$. It was seen in Section 2.4 that such a product is described by the shuffle product
of words.
With this definition at hand, we saw that any (weakly) geometric rough path $\mathbf{X}$
satisfies the identity
\[
\mathbf{X}^{w}_{s,t} \, \mathbf{X}^{\bar w}_{s,t} = \mathbf{X}^{w \sqcup\!\sqcup \bar w}_{s,t} .
\]
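The shuffle identity can be made concrete with a small computation. The sketch below is illustrative only: words are represented as tuples of letters, the shuffle product as a multiset of words, and the identity is tested on the simplest geometric example, a straight line with increment vector a, whose iterated integrals are the products of the a_{w_i} divided by |w|!. The function names and this test path are assumptions chosen for the example.

```python
from collections import Counter
from math import factorial

def shuffle(w, wbar):
    # all interleavings of two words (tuples of letters), counted with multiplicity
    if not w:
        return Counter({tuple(wbar): 1})
    if not wbar:
        return Counter({tuple(w): 1})
    out = Counter()
    for word, mult in shuffle(w[1:], wbar).items():
        out[(w[0],) + word] += mult
    for word, mult in shuffle(w, wbar[1:]).items():
        out[(wbar[0],) + word] += mult
    return out

def linear_path_component(word, increment):
    # X^w for a straight line with the given increments per letter:
    # product of the increments along the letters, divided by |w|!
    value = 1.0
    for letter in word:
        value *= increment[letter]
    return value / factorial(len(word))

def shuffle_identity_gap(w, wbar, increment):
    # |X^w X^wbar - sum over the shuffle of w and wbar of X^u|
    lhs = linear_path_component(w, increment) * linear_path_component(wbar, increment)
    rhs = sum(m * linear_path_component(u, increment) for u, m in shuffle(w, wbar).items())
    return abs(lhs - rhs)
```

For example, `shuffle((1,), (2,))` contains the words (1, 2) and (2, 1), each once, and `shuffle_identity_gap((1, 2), (2,), {1: 0.7, 2: -1.3})` should vanish up to rounding.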
This strongly suggests that the “correct” way of multiplying two controlled rough
paths Y and Ȳ is to define their product Z by
Zt = Yt ⋆ Ȳt .
7.7 Exercises
♯ Exercise 7.1 Verify that $\mathbb{X}_{s,t} = \int_s^t X_{s,r} \otimes dX_r$ where the integral is to be interpreted
in the sense of (4.24), taking $(Y, Y')$ to be $(X, I)$. In fact, check that this holds not
only in the limit $|\mathcal{P}| \to 0$ but in fact for every fixed partition $\mathcal{P}$, i.e. $\mathbb{X}_{s,t} = \int_{\mathcal{P}} \Xi$. Compare
this with formula (2.26), obtained in Exercise 2.4.
Exercise 7.2 Let φ : W × [0, T ] → W̄ be a function which is uniformly C 2 in its
first argument (i.e. φ is bounded and both Dy φ and Dy2 φ are bounded, where Dy
denotes the Fréchet derivative with respect to the first argument) and uniformly C 2α
in its second argument. Let furthermore $(Y, Y') \in \mathscr{D}_X^{2\alpha}([0, T], W)$. Show that
where we denote by ∥φ∥2α;t the supremum over y of the 2α-Hölder norm of φ(y, ·).
Show that $(Z_t, Z'_t) := (F(Y_t), DF(Y_t) \circ Y'_t)$ defines an element $(Z, Z') \in
\mathscr{D}_{S,X}^{2\gamma}([0, T], \mathcal{H}_\alpha)$ with the quantitative bound
in Theorem 7.7, and similarly for Ȳ . Assume X is geometric, so that the bracket [X]
vanishes. Then the following product formula holds
\[
Y_t \bar Y_t = Y_0 \bar Y_0 + \int_0^t (M, M') \, d\mathbf{X} + \int_0^t \big( (d\Gamma_s) \bar Y_s + Y_s \, d\bar\Gamma_s \big)
\]
with Ms = Ys′ Ȳs + Ys Ȳs′ , Ms′ = Ys′′ Ȳs + 2Ys′ Ȳs′ + Ys Ȳs′′ .
7.8 Comments
Stability of controlled rough paths under composition with regular functions goes
back to Gubinelli [Gub04], also in an α-Hölder setting $\alpha > \frac13$, similar to our
Sections 7.3 and 7.4. Extensions to lower order regularity and then the “branched”
setting are given in [Gub10, HK15, FZ18], see also [BDFT20, Thm 2.11] for a
concise proof in the geometric setting and connections to a multivariate Faà di Bruno
formula.
Our discussion of Itô’s formula, Section 7.5, expands on a similar section of the
first edition (2014), and makes more explicit the point that Itô’s formula is really a
composition formula for higher order controlled rough paths. Assuming $\alpha > \frac13$ for
the sake of argument,
Such formulae are sometimes directly given for RDE solutions, in which case
the equation dictates a particular controlled structure, as seen spelled out directly in
Davie’s approach, Section 8.7. This is also a natural way to define manifold valued
RDE solutions, similar to the definition of manifold valued semimartingales. See
also the comments in Section 12.5 for some pointers to Itô formulae in the context of rough
and stochastic PDEs.
Chapter 8
Solutions to rough differential equations
We show how to solve differential equations driven by rough paths by a simple Picard
iteration argument. This yields a pathwise solution theory mimicking the standard
solution theory for ordinary differential equations. We start with the simple case of
differential equations driven by a signal that is sufficiently regular for Young’s theory
of integration to apply and then proceed to the case of more general rough signals.
8.1 Introduction
ingredients are estimates for rough integrals (cf. Theorem 4.10) and the composition
of controlled paths with smooth maps (Lemma 7.3). Recall that, for rather trivial
reasons (of the sort |t − s|2α ≤ |t − s|, when 0 ≤ s ≤ t ≤ T ≤ 1), all constants in
these estimates were seen to be uniform in T ∈ (0, 1].
Let us postulate that there exists a solution to a differential equation in Young’s sense
and let us derive an a-priori estimate. (In finite dimension, this can actually be used
to prove the existence of solutions. Note that the regularity requirement here is “one
degree less” than what is needed for the corresponding uniqueness result.)
Proposition 8.1. Assume $X, Y \in \mathcal{C}^\beta([0, 1], V)$ for some $\beta \in (1/2, 1]$ such that,
given $\xi \in W$, $f \in C_b^1(W, L(V, W))$, we have
Proof. By assumption, for $0 \le s < t \le 1$, $Y_{s,t} = \int_s^t f(Y_r) \, dX_r$. Using Young’s
inequality (4.3), with $C = C(\beta)$,
\[
|Y_{s,t} - f(Y_s) X_{s,t}| = \Big| \int_s^t (f(Y_r) - f(Y_s)) \, dX_r \Big|
\le C \|Df\|_\infty \|Y\|_{\beta;[s,t]} \|X\|_{\beta;[s,t]} |t - s|^{2\beta}
\]
so that
\[
|Y_{s,t}| / |t - s|^{\beta} \le \|f\|_\infty \|X\|_\beta + C \|Df\|_\infty \|Y\|_{\beta;[s,t]} \|X\|_{\beta;[s,t]} |t - s|^{\beta} .
\]
Write $\|Y\|_{\beta;h} \equiv \sup |Y_{s,t}| / |t - s|^{\beta}$ where the sup is restricted to times $s, t \in [0, 1]$
for which $|t - s| \le h$. Clearly then,
and upon taking $h$ small enough, s.t. $\delta h^{\beta} \asymp 1$, with $\delta = \|X\|_\beta$, more precisely s.t.
\[
C \|Df\|_\infty \|X\|_\beta h^{\beta} \le C \big( 1 + \|f\|_{C_b^1} \big) \|X\|_\beta h^{\beta} \le 1/2
\]
(we will take $h$ such that the second ≤ becomes an equality; adding 1 avoids trouble
when $f \equiv 0$)
\[
\frac12 \|Y\|_{\beta;h} \le \|f\|_\infty \|X\|_\beta .
\]
It then follows from Exercise 4.5 that, with $h \propto \|X\|_\beta^{-1/\beta}$,
\[
\|Y\|_\beta \le \|Y\|_{\beta;h} \big( 1 \vee h^{-(1-\beta)} \big) \le C \|X\|_\beta \big( 1 \vee h^{-(1-\beta)} \big)
= C \big( \|X\|_\beta \vee \|X\|_\beta^{1/\beta} \big) .
\]
Here, we have absorbed the dependence on $f \in C_b^1$ into the constants. By scaling
(any non-zero $f$ may be normalised to $\|f\|_{C_b^1} = 1$ at the price of replacing $X$ by
$\|f\|_{C_b^1} \times X$) we then get immediately the claimed estimate. □
The reader may be helped by first reviewing the classical Picard argument in a
Young setting, i.e. when $\beta \in (1/2, 1]$. Given $\xi \in W$, $f \in C_b^2(W, L(V, W))$, $X \in
\mathcal{C}^\beta([0, 1], V)$ and $Y : [0, T] \to W$ of suitable Hölder regularity, $T \in (0, 1]$, one
defines the map $\mathcal{M}_T$ by
\[
\mathcal{M}_T(Y) := \Big( \xi + \int_0^t f(Y_s) \, dX_s : t \in [0, T] \Big) .
\]
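The Picard map can be explored numerically in a smooth scalar toy case. The following sketch makes several assumptions that are not part of the text: V = W = R, f = sin, ξ = 1, the driver X_t = sin(3t) on [0, 1], and integrals replaced by left-point Riemann sums on a uniform grid. For a smooth driver the equation reduces to the ODE dY = sin(Y) dX, whose solution satisfies tan(Y_t / 2) = tan(ξ / 2) exp(X_t − X_0), which provides a reference value.

```python
import math

def picard_iterates(f, X, xi, n_iter):
    # X: list of path values on a grid; integrals are left-point Riemann sums
    Y = [xi] * len(X)          # initial guess: the constant path
    gaps = []                  # sup-distance between consecutive iterates
    for _ in range(n_iter):
        new_Y = [xi]
        for i in range(len(X) - 1):
            new_Y.append(new_Y[-1] + f(Y[i]) * (X[i + 1] - X[i]))
        gaps.append(max(abs(a - b) for a, b in zip(new_Y, Y)))
        Y = new_Y
    return Y, gaps

N = 20000
grid = [i / N for i in range(N + 1)]
X = [math.sin(3.0 * t) for t in grid]
Y, gaps = picard_iterates(math.sin, X, 1.0, 30)

# reference value from the closed-form ODE solution (valid for this smooth driver only)
exact_end = 2.0 * math.atan(math.tan(0.5) * math.exp(X[-1] - X[0]))
```

Here the iteration is run on the whole interval at once, which works for this smooth driver; in the Hölder setting of the text, the contraction is only obtained on a small interval [0, T0] and the solution is then patched together.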
and so the α-Hölder norm of X has the desired behaviour. As previously, when no
confusion is possible, we write ∥ · ∥α ≡ ∥ · ∥α;[0,T ] .
To avoid norm versus seminorm considerations, it is convenient to work on
the space of paths started at $\xi$, namely $\{ Y \in \mathcal{C}^\alpha([0, T], W) : Y_0 = \xi \}$. This affine
subspace is a complete metric space under $(Y, \tilde Y) \mapsto \|Y - \tilde Y\|_\alpha$ and so is the closed
unit ball
\[
B_T = \{ Y \in \mathcal{C}^\alpha([0, T], W) : Y_0 = \xi, \ \|Y\|_\alpha \le 1 \} .
\]
Young’s inequality (4.41) shows that there is a constant C which only depends on α
(thanks to T ≤ 1) such that for every Y ∈ BT ,
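Similarly, for $Y, \tilde Y \in B_T$, using Young, $f(Y_0) = f(\tilde Y_0)$ and Lemma 7.5 (with
$K = 1$)
\[
\begin{aligned}
\|\mathcal{M}_T(Y) - \mathcal{M}_T(\tilde Y)\|_\alpha
&= \Big\| \int_0^{\cdot} f(Y_s) \, dX_s - \int_0^{\cdot} f(\tilde Y_s) \, dX_s \Big\|_\alpha \\
&\le C \Big( |f(Y_0) - f(\tilde Y_0)| + \|f(Y) - f(\tilde Y)\|_\alpha \Big) \|X\|_\alpha
\end{aligned}
\]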
We now consider a priori estimates for rough differential equations, similar to Sec-
tion 8.2. Recall that the homogeneous rough path norm |||X|||α was introduced in
(2.4).
Proposition 8.2. Let $\xi \in W$, $f \in C_b^2(W, L(V, W))$ and a rough path $\mathbf{X} = (X, \mathbb{X}) \in
\mathscr{C}^\alpha$ with $\alpha \in (1/3, 1/2]$ and assume that $(Y, Y') = (Y, f(Y)) \in \mathscr{D}_X^{2\alpha}$ is an RDE
\[
{} + \|\mathbb{X}\|_{2\alpha;I} \, |t - s|^{2\alpha} . \tag{8.3}
\]
Recall that ∥ · ∥α is the usual Hölder seminorm over [0, T ], while ∥ · ∥α;I denotes
the same norm, but over I ⊂ [0, T ], so that trivially ∥X∥α;I ≤ ∥X∥α . Whenever
notationally convenient, multiplicative constants depending on α and f are absorbed
in ≲, at the very end we can use scaling to make the f dependence reappear. We
will also write ∥ · ∥α;h for the supremum of ∥ · ∥α;I over all intervals I ⊂ [0, T ] with
length |I| ≤ h. Again, one trivially has ∥X∥α;I ≤ ∥X∥α;h whenever |I| ≤ h. Using
this notation, we conclude from (8.3) that
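\[
\|R^Y\|_{2\alpha;h} \lesssim \|\mathbb{X}\|_{2\alpha;h} + \Big( \|X\|_{\alpha;h} \|R^{f(Y)}\|_{2\alpha;h} + \|\mathbb{X}\|_{2\alpha;h} \|f(Y)\|_{\alpha;h} \Big) h^{\alpha} .
\]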
so that,
\[
\|R^{f(Y)}\|_{2\alpha;h} \le \frac12 |D^2 f|_\infty \|Y\|_{\alpha;h}^2 + |Df|_\infty \|R^Y\|_{2\alpha;h}
\lesssim \|Y\|_{\alpha;h}^2 + \|R^Y\|_{2\alpha;h} .
\]
1
Later we will establish existence and uniqueness under $C_b^3$-regularity.
Hence, also using ∥f (Y )∥α;h ≲ ∥Y ∥α;h , there exists c1 > 0, not dependent on X or
Y , such that
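\[
\begin{aligned}
\|R^Y\|_{2\alpha;h} \le {} & c_1 \|\mathbb{X}\|_{2\alpha;h} + c_1 \|X\|_{\alpha;h} h^{\alpha} \|Y\|_{\alpha;h}^2 \\
& + c_1 \|X\|_{\alpha;h} h^{\alpha} \|R^Y\|_{2\alpha;h} + c_1 \|\mathbb{X}\|_{2\alpha;h} h^{\alpha} \|Y\|_{\alpha;h} .
\end{aligned} \tag{8.4}
\]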
with $c_2 = (2c_1 + 1)$. On the other hand, since $Y_{s,t} = f(Y_s) X_{s,t} + R^Y_{s,t}$ and $f$ is
bounded, we have the bound
ψh ≤ λh + ψh2 .
(and similarly: $\lim_{g \downarrow h} \psi_g \le 3 \psi_h$) which rules out any jumps of relative jump size
greater than 3. However, given that $\psi_h \ge 1/2$ in the first regime and $\psi_h < 1/6$ in the
second, we can never jump from the second into the first regime, as $h$ increases (from
zero). And so, we indeed must be in the second regime for all $h \le h_0$. Elementary
estimates on $\psi_-$, as function of $\lambda_h$, then show that
\[
\|Y\|_{\alpha;h} \le c_6 |||\mathbf{X}|||_\alpha ,
\]
for all $h \le h_0 \sim |||\mathbf{X}|||_\alpha^{-1/\alpha}$. We conclude with Exercise 4.5, arguing exactly as in the
Young case, Proposition 8.1. □
The aim of this section is to show that if $f$ is regular enough and $(X, \mathbb{X}) \in \mathscr{C}^\beta$ with
$\beta > \frac13$, then we can solve differential equations driven by the rough path $\mathbf{X} = (X, \mathbb{X})$
of the type
\[
dY = f(Y) \, d\mathbf{X} .
\]
Such an equation will yield solutions in $\mathscr{D}_X^{2\alpha}$ and will be interpreted in the
corresponding integral formulation, where the integral of $f(Y)$ against $\mathbf{X}$ is defined using
Lemma 7.3 and Theorem 4.10. More precisely, one has the following local existence
and uniqueness result. (The construction of a maximal solution is left as Exercise 8.4.)
Here, the integral is interpreted in the sense of Theorem 4.10 and $f(Y) \in \mathscr{D}_X^{2\beta}$ is
built from $Y$ by Lemma 7.3. Moreover, if $f$ is linear or $f \in C_b^3$, we may take $T_0 = T$,
and thus global existence holds on $[0, T]$.
Remark 8.4. The condition $Y' = f(Y)$ (and then $f(Y)' = Df(Y) Y'$ by Lemma 7.3)
is crucial for uniqueness. To see what can happen, consider the canonical lift of
$X \in \mathcal{C}^1$ to $\mathbf{X} = (X, \int X \otimes dX)$, in which case any choice of $f(Y)' \in \mathcal{C}^\beta$ yields a
pair $(f(Y), f(Y)') \in \mathscr{D}_X^{2\beta}$. (Indeed, thanks to $|X_{s,t}| \lesssim |t - s|$, the term $f(Y)'_s X_{s,t}$
can always be absorbed in the 2β-remainder.) On the other hand, regardless of the
choice of $Y'$, or $f(Y)'$, the rough integral in (8.6) here always agrees with the
Riemann-Stieltjes integral $\int f(Y) \, dX$, so that (8.6) is satisfied whenever $Y$ solves
the ODE $\dot Y = f(Y) \dot X$, with $Y_0 = \xi$.
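Proof. With $\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}^\beta \subset \mathscr{C}^\alpha$, $\frac13 < \alpha < \beta$ and $(Y, Y') \in \mathscr{D}_X^{2\alpha}$ we know
from Lemma 7.3 that
\[
(\Xi, \Xi') := \big( f(Y), f(Y)' \big) := (f(Y), Df(Y) Y') \in \mathscr{D}_X^{2\alpha} .
\]
Restricting from $[0, 1]$ to $[0, T]$, any $T \le 1$, Theorem 4.10 allows to define the map
\[
\mathcal{M}_T(Y, Y') \overset{\mathrm{def}}{=} \Big( \xi + \int_0^{\cdot} \Xi_s \, d\mathbf{X}_s , \ \Xi \Big) \in \mathscr{D}_X^{2\alpha} .
\]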
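The RDE solution on $[0, T]$ we are looking for is a fixed point of this map. Strictly
speaking, this would only yield a solution $(Y, Y')$ in $\mathscr{D}_X^{2\alpha}$. But since $\mathbf{X} \in \mathscr{C}^\beta$, it
turns out that this solution is automatically an element of $\mathscr{D}_X^{2\beta}$. Indeed, $|Y_{s,t}| \le
|Y'|_\infty |X_{s,t}| + \|R^Y\|_{2\alpha} |t - s|^{2\alpha}$, so that $Y \in \mathcal{C}^\beta$. From the fixed point property it
then follows that $Y' = f(Y) \in \mathcal{C}^\beta$ and also $R^Y \in \mathcal{C}_2^{2\beta}$, since $\mathbb{X} \in \mathcal{C}_2^{2\beta}$ and
\[
|R^Y_{s,t}| = |Y_{s,t} - Y'_s X_{s,t}| = \Big| \int_s^t (f(Y_r) - f(Y_s)) \, d\mathbf{X}_r \Big|
\le |Y'|_\infty |\mathbb{X}_{s,t}| + O\big( |t - s|^{3\alpha} \big) .
\]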
Note that if (Y, Y ′ ) is such that (Y0 , Y0′ ) = (ξ, f (ξ)), then the same is true for
MT (Y, Y ′ ). Therefore, MT can be viewed as map on the space of controlled paths
started at (ξ, f (ξ)), i.e.
\[
\big\{ (Y, Y') \in \mathscr{D}_X^{2\alpha}([0, T], W) : Y_0 = \xi, \ Y'_0 = f(\xi) \big\} .
\]
Since $\mathscr{D}_X^{2\alpha}$ is a Banach space (under the norm $(Y, Y') \mapsto |Y_0| + |Y'_0| + \|Y, Y'\|_{X,2\alpha}$)
the above (affine) subspace is a complete metric space under the induced metric. This
is also true for the (closed) unit ball $B_T$ centred at, say
\[
t \mapsto (\xi + f(\xi) X_{0,t}, f(\xi)) .
\]
(Note here that the apparently simpler choice $t \mapsto (\xi, f(\xi))$ does in general not
belong to $\mathscr{D}_X^{2\alpha}$.) In other words, $B_T$ is the set of all $(Y, Y') \in \mathscr{D}_X^{2\alpha}([0, T], W)$:
In fact, $\|(Y - f(\xi) X_{0,\cdot}, Y'_\cdot - f(\xi))\|_{X,2\alpha} = \|Y, Y'_\cdot\|_{X,2\alpha}$ as a consequence of the
triangle inequality and $\|(f(\xi) X_{0,\cdot}, f(\xi))\|_{X,2\alpha} = \|f(\xi)\|_\alpha + \|0\|_{2\alpha} = 0$, so that
\[
B_T = \big\{ (Y, Y') \in \mathscr{D}_X^{2\alpha}([0, T], W) : Y_0 = \xi, \ Y'_0 = f(\xi), \ \|(Y, Y'_\cdot)\|_{X,2\alpha} \le 1 \big\} .
\]
Let us also note that, for all $(Y, Y') \in B_T$, one has the bound
\[
|Y'_0| + \|(Y, Y')\|_{X,2\alpha} \le |f|_\infty + 1 =: M \in [1, \infty) . \tag{8.7}
\]
We now show that, for T small enough, MT leaves BT invariant and in fact is
contracting. Constants below are denoted by C, may change from line to line and
may depend on α, β, X, X without special indication. They are, however, uniform
in T ∈ (0, 1] and we prefer to be explicit (enough) with respect to f such as to
see where Cb3 -regularity is used. With these conventions, we recall the following
estimates, direct consequences from Lemma 7.3 and Theorem 4.10 , respectively,
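\[
{} + C \big( \|X\|_\alpha \|R^{\Xi}\|_{2\alpha} + \|\mathbb{X}\|_{2\alpha} \|\Xi'\|_\alpha \big)
\]
\[
\begin{aligned}
\|\mathcal{M}_T(Y, Y')\|_{X,2\alpha}
&= \Big\| \int_0^{\cdot} \Xi_s \, d\mathbf{X}_s , \ \Xi \Big\|_{X,2\alpha} \\
&\le \|\Xi\|_\alpha + C \big( |\Xi'_0| + \|\Xi, \Xi'\|_{X,2\alpha} \big) T^{\beta - \alpha} \\
&\le \|f\|_{C_b^1} \|Y\|_\alpha + C \Big( \|f\|_{C_b^1}^2 + C_M \|f\|_{C_b^2} \big( |Y'_0| + \|Y, Y'\|_{X,2\alpha} \big) \Big) T^{\beta - \alpha} \\
&\le \|f\|_{C_b^1} (\|f\|_\infty + 1) T^{\beta - \alpha} + C_M \Big( \|f\|_{C_b^1}^2 + \|f\|_{C_b^2} (\|f\|_\infty + 1) \Big) T^{\beta - \alpha} ,
\end{aligned}
\]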
where in the last step we used (8.7) and also $\|Y\|_{\alpha;[0,T]} \le C_f T^{\beta-\alpha}$, seen from
\[
\begin{aligned}
|Y_{s,t}| &\le |Y'|_\infty |X_{s,t}| + \|R^Y\|_{2\alpha} |t-s|^{2\alpha} \\
&\le (|Y_0'| + \|Y'\|_\alpha) \|X\|_\beta |t-s|^\beta + \|R^Y\|_{2\alpha} |t-s|^{2\alpha} .
\end{aligned}
\]
Then, using $T^\alpha \le T^{\beta-\alpha}$ and $\|R^Y\|_{2\alpha} \le \|Y, Y'\|_{X,2\alpha} \le 1$, we obtain the bound
In other words, $\|\mathcal{M}_T(Y, Y')\|_{X,2\alpha} = \|\mathcal{M}_T(Y, Y')\|_{X,2\alpha;[0,T]} = O(T^{\beta-\alpha})$ with
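\[
\begin{aligned}
&\le \|\Delta\|_\alpha + C\big( |\Delta_0'| + \|\Delta, \Delta'\|_{X,2\alpha} \big) T^{\beta-\alpha} \\
&\le C \|f\|_{C_b^2} \|Y - \tilde Y\|_\alpha + C \|\Delta, \Delta'\|_{X,2\alpha} T^{\beta-\alpha} .
\end{aligned}
\]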
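The contraction property is obvious, provided that we can establish the following two estimates:
\[ \|Y - \tilde Y\|_\alpha \le C T^{\beta-\alpha} \|Y - \tilde Y, Y' - \tilde Y'\|_{X,2\alpha} , \tag{8.9} \]
\[ \|\Delta, \Delta'\|_{X,2\alpha} \le C \|Y - \tilde Y, Y' - \tilde Y'\|_{X,2\alpha} . \tag{8.10} \]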
To obtain (8.9), replace Y by Y − Ỹ in (8.8), noting Y0′ − Ỹ0′ = 0, and this shows
We now turn to (8.10). Similar to the proof of Lemma 7.5, $f \in C^3$ allows us to write $\Delta_s = G_s H_s$ where
\[ G_s := g(Y_s, \tilde Y_s), \qquad H_s := Y_s - \tilde Y_s , \]
and $g \in C_b^2$ with $\|g\|_{C_b^2} \le C \|f\|_{C_b^3}$. Lemma 7.3 tells us that $(G, G') \in \mathscr{D}_X^{2\alpha}$ (with $G' = (D_Y g) Y' + (D_{\tilde Y} g) \tilde Y'$) and in fact immediately yields an estimate of the form
uniformly over $(Y, Y'), (\tilde Y, \tilde Y') \in B_T$ and $T \le 1$. On the other hand, $\mathscr{D}_X^{2\alpha}$ is an algebra in the sense that $(GH, (GH)') \in \mathscr{D}_X^{2\alpha}$ with $(GH)' = G'H + GH'$. In fact, we leave it as an easy exercise to the reader to check that
\[ \lesssim \|Y - \tilde Y, Y' - \tilde Y'\|_{X,2\alpha} , \]
where we made use of $\|g\|_\infty, \|g\|_{C_b^1} \lesssim \|f\|_{C_b^3}$ and $|Y_0'| = |\tilde Y_0'| = |f(\xi)| \le |f|_\infty$.
The argument from here on is identical to the Young case: the previous estimates allow for a small enough $T_0 \le 1$ such that $\mathcal{M}_{T_0}(B_{T_0}) \subset B_{T_0}$ and for all $(Y, Y'), (\tilde Y, \tilde Y') \in B_{T_0}$:
\[ \big\| \mathcal{M}_{T_0}(Y, Y') - \mathcal{M}_{T_0}(\tilde Y, \tilde Y') \big\|_{X,2\alpha} \le \tfrac12 \|Y - \tilde Y, Y' - \tilde Y'\|_{X,2\alpha} \]
and so $\mathcal{M}_{T_0}(\cdot)$ admits a unique fixed point $(Y, Y') \in B_{T_0}$, which is then the unique solution $Y$ to (8.1) on the (possibly rather small) interval $[0, T_0]$. Noting that the choice of $T_0$ can again be done uniformly in the starting point, the solution on $[0, 1]$ is then constructed iteratively as before. □
instead of (8.6). On the one hand, it is possible to recast (8.11) in the form (8.6) by writing it as an RDE for $\hat Y_t = (Y_t, t)$ driven by $\hat{\mathbf{X}} = (\hat X, \hat{\mathbb{X}})$ where $\hat X_t = (X_t, t)$ and $\hat{\mathbb{X}}$ is given by $\mathbb{X}$ and the “remaining cross integrals” of $X_t$ and $t$, given by usual Riemann-Stieltjes integration. However, it is possible to exploit the structure of (8.11) to obtain somewhat better bounds on the solutions. See [FV10b, Ch. 12].
8.6 Stability III: Continuity of the Itô–Lyons map
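Theorem 8.5 (Rough path stability of the Itô–Lyons map). Let $f \in C_b^3$ and, for $\alpha \in (\tfrac13, \tfrac12]$, let $(Y, f(Y)) \in \mathscr{D}_X^{2\alpha}$ be the unique RDE solution given by Theorem 8.3 to
\[ dY = f(Y) \, d\mathbf{X}, \qquad Y_0 = \xi \in W . \]
Similarly, let $(\tilde Y, f(\tilde Y))$ be the RDE solution driven by $\tilde{\mathbf{X}}$ and started at $\tilde\xi$ where $\mathbf{X}, \tilde{\mathbf{X}} \in \mathscr{C}^\alpha$. Assuming
\[ |||\mathbf{X}|||_\alpha, \ |||\tilde{\mathbf{X}}|||_\alpha \le M < \infty \]
we have the local Lipschitz estimates
\[ d_{X,\tilde X,2\alpha}\big( Y, f(Y); \tilde Y, f(\tilde Y) \big) \le C_M \big( |\xi - \tilde\xi| + \varrho_\alpha(\mathbf{X}, \tilde{\mathbf{X}}) \big) , \]
and also
\[ \|Y - \tilde Y\|_\alpha \le C_M \big( |\xi - \tilde\xi| + \varrho_\alpha(\mathbf{X}, \tilde{\mathbf{X}}) \big) , \]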
Remark 8.6. The proof only uses the a priori information that RDE solutions remain
bounded if the driving rough paths do, combined with basic stability properties of
rough integration and composition.
α
and similarly for M̃T Ỹ , f Ỹ ∈ CX̃ . Then, thanks to the fixed point property
(similarly with tilde) and the local Lipschitz estimate for rough integration, Theorem 4.17, and writing $(\Xi, \Xi') := (f(Y), f(Y)')$ for the integrand, we obtain the bound
\[ \lesssim \varrho_\alpha(\mathbf{X}, \tilde{\mathbf{X}}) + |\xi - \tilde\xi| + T^\alpha d_{X,\tilde X,2\alpha}\big( \Xi, \Xi'; \tilde\Xi, \tilde\Xi' \big) , \]
Thanks to the local Lipschitz estimate for composition, Theorem 7.6, uniform in
T ≤ 1,
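\[ d_{X,\tilde X,2\alpha}\big( Y, f(Y); \tilde Y, f(\tilde Y) \big) \le C \Big( \varrho_\alpha(\mathbf{X}, \tilde{\mathbf{X}}) + |\xi - \tilde\xi| + T^\alpha d_{X,\tilde X,2\alpha}\big( Y, f(Y); \tilde Y, f(\tilde Y) \big) \Big) . \]
\[ d_{X,\tilde X,2\alpha}\big( Y, f(Y); \tilde Y, f(\tilde Y) \big) \le 2C \big( \varrho_\alpha(\mathbf{X}, \tilde{\mathbf{X}}) + |\xi - \tilde\xi| \big) , \]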
which is precisely the required bound. The bound on $\|Y - \tilde Y\|_\alpha$ then follows as in (4.32), and these bounds can be iterated to cover a time interval of arbitrary (fixed) length. □
8.7 Davie’s definition and numerical schemes
Fix $f \in C_b^2(W, L(V, W))$ and $\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}^\beta([0, T], V)$ with $\beta > 1/3$. Under these assumptions, the rough differential equation $dY = f(Y) \, d\mathbf{X}$ makes sense as a well-defined integral equation. (In Theorem 8.3 we used additional regularity, namely $C_b^3$, to establish existence of a unique solution on $[0, T]$.) By the very definition of an RDE solution, unique or not, $(Y, f(Y)) \in \mathscr{D}_X^{2\beta}$, i.e.
\[ Y_{s,t} = f(Y_s) X_{s,t} + O\big( |t-s|^{2\beta} \big) , \]
and we recognise a step of first-order Euler approximation, $Y_{s,t} \approx f(Y_s) X_{s,t}$, started from $Y_s$. Clearly $O(|t-s|^{2\beta}) = o(|t-s|)$ if and only if $\beta > 1/2$ and one can show that iteration of such steps along a partition $\mathcal{P}$ of $[0, T]$ yields a convergent “Euler” scheme as $|\mathcal{P}| \downarrow 0$, see [Dav08] or [FV10b].
In the case $\beta \in (\tfrac13, \tfrac12]$ we have to exploit that we know more than just $(Y, f(Y)) \in \mathscr{D}_X^{2\beta}$. Indeed, since $Y_{s,t} = \int_s^t f(Y) \, d\mathbf{X}$, estimate (4.22) for rough integrals tells us that, for all pairs $s, t$
\[ Y_{s,t} = f(Y_s) X_{s,t} + (f(Y))'_s \, \mathbb{X}_{s,t} + O\big( |t-s|^{3\beta} \big) . \tag{8.12} \]
Using the identity $f(Y)' = Df(Y) Y' = Df(Y) f(Y)$, this can be spelled out further to
\[ Y_{s,t} = f(Y_s) X_{s,t} + Df(Y_s) f(Y_s) \mathbb{X}_{s,t} + o(|t-s|) \tag{8.13} \]
and, omitting the small remainder term, we recognise a step of a second-order Euler or Milstein approximation. Again, one can show that iteration of such steps along a partition $\mathcal{P}$ of $[0, T]$ yields a convergent “Euler” scheme as $|\mathcal{P}| \downarrow 0$; see [Dav08] or [FV10b].
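The two steps just described can be tried out numerically. The following is a minimal sketch under simplifying assumptions that are not part of the text: the driver is a smooth scalar path, so that its second level over a step is simply $X_{s,t}^2/2$, and the test equation is $dY = Y\,dX$ with explicit solution $Y_t = Y_0 \exp(X_t - X_0)$. The names `davie_scheme`, `f` and `Dff` are hypothetical helpers chosen for this illustration.

```python
import numpy as np

def davie_scheme(f, Dff, y0, X, XX=None):
    """Iterate Y_{s,t} ~ f(Y_s) X_{s,t} (+ Df(Y_s) f(Y_s) XX_{s,t}) along a partition.

    f(y)   : array of shape (e, d)
    Dff(y) : array of shape (e, d, d) with entries sum_l d_l f^i_k(y) f^l_j(y)
    X      : array of shape (n + 1, d), the path sampled on the partition
    XX     : array of shape (n, d, d) of second-level increments per step,
             or None for the first-order (Euler) step
    """
    y = np.asarray(y0, dtype=float)
    path = [y.copy()]
    for i in range(len(X) - 1):
        step = f(y) @ (X[i + 1] - X[i])                # f(Y_s) X_{s,t}
        if XX is not None:                             # Df(Y_s) f(Y_s) XX_{s,t}
            step = step + np.einsum("ijk,jk->i", Dff(y), XX[i])
        y = y + step
        path.append(y.copy())
    return np.array(path)

# Assumed toy example: scalar driver X_t = sin(3t), equation dY = Y dX, Y_0 = 1.
t = np.linspace(0.0, 1.0, 201)
X = np.sin(3.0 * t)[:, None]
dX = np.diff(X, axis=0)
XX = 0.5 * dX[:, :, None] * dX[:, None, :]             # scalar geometric lift: X_{s,t}^2 / 2

f = lambda y: y.reshape(1, 1)
Dff = lambda y: y.reshape(1, 1, 1)

exact = np.exp(X[-1, 0] - X[0, 0])
err_euler = abs(davie_scheme(f, Dff, [1.0], X)[-1, 0] - exact)
err_milstein = abs(davie_scheme(f, Dff, [1.0], X, XX)[-1, 0] - exact)
print(err_euler, err_milstein)
```

For a genuinely rough multidimensional driver the array `XX` would have to carry the full second level, including its antisymmetric (Lévy area) part; the symmetric choice above is only adequate in this scalar toy case.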
Remark 8.7. These schemes can be understood from simple Taylor expansions based on the differential equation $dY = f(Y) \, dX$, at least when $X$ is smooth (enough), or via Itô’s formula in a semimartingale setting. With focus on the smooth case, the Euler approximation is obtained by a “left-point freezing” approximation $f(Y_\cdot) \approx f(Y_s)$ over $[s, t]$ in the integral equation,
\[ Y_{s,t} = \int_s^t f(Y_r) \, dX_r \approx f(Y_s) X_{s,t} \]
whereas the Milstein scheme, with $\mathbb{X}_{s,t} = \int_s^t X_{s,r} \, dX_r$ for smooth paths, is obtained from the next-best approximation
It turns out that the description (8.13) is actually a formulation that is equivalent
to the RDE solution built previously in the following sense.
Proof. We already discussed how (8.13) is obtained from an RDE solution to (8.6). Conversely, (8.13) implies immediately $Y_{s,t} = f(Y_s) X_{s,t} + O(|t-s|^{2\beta})$ which shows that $Y \in C^\beta$ and also $Y' := f(Y) \in C^\beta$, thanks to $f \in C_b^2$, so that $(Y, f(Y)) \in \mathscr{D}_X^{2\beta}$. It remains to see, in the notation of the proof of Theorem 4.10, that $Y_{s,t} = (I\Xi)_{s,t}$ with
\[ \Xi_{s,t} = f(Y_s) X_{s,t} + (f(Y))'_s \, \mathbb{X}_{s,t} = f(Y_s) X_{s,t} + Df(Y_s) f(Y_s) \mathbb{X}_{s,t} . \]
To see this, we note that trivially $Y_{s,t} = (I\tilde\Xi)_{s,t}$ with $\tilde\Xi_{s,t} := Y_{s,t}$. But $\tilde\Xi_{s,t} = \Xi_{s,t} + o(|t-s|)$ and one sees as in Remark 4.13 that $I\tilde\Xi = I\Xi$. □
It is possible to check that $\bar\Xi^s \in C_2^{\alpha,3\alpha}$ for every fixed $s$ (see the proof of Theorem 4.10) so that the second line makes sense. It is also straightforward to check that $(Z, \mathbb{Z})$ satisfies (2.1), so that it does indeed belong to $\mathscr{C}^\alpha$. Actually, one can see that
\[ Z_t = \int_0^t F(X_s) \, d\mathbf{X}_s , \qquad \mathbb{Z}_{s,t} = \int_s^t Z_{s,r} \otimes dZ_r , \]
where the integrals are defined as in the previous sections, where $F(X) \in \mathscr{D}_X^{2\alpha}$ as in Section 7.3.
2 As always, we only consider the step-2 α-Hölder case, i.e. α > 1/3, whereas Lyons’ theory is valid for every Hölder exponent α ∈ (0, 1] (or: variation parameter p ≥ 1) at the complication of having to deal with ⌊p⌋ levels.
We can now define solutions to (8.6) in the following way.
8.9 Linear rough differential equations
Let X ∈ C 1 ([0, 1], V ), A ∈ L(W, L(V, W )) with finite operator norm ∥A∥op = a ∈
[0, ∞), and consider the linear differential equation dY = AY dX, with initial data
$Y_0 \in W$, written in integral form as
\[ Y_t = Y_0 + \int_0^t A Y_s \, dX_s . \]
Clearly $|Y_t| \le |Y_0| + a \int_0^t |Y_s| \, d|X|_s$ in terms of the Lipschitz path $|X|_t := \int_0^t |\dot X_s| \, ds$, and the classical Gronwall lemma gives
The following lemma, applied with α = 1, then leads to a similar conclusion. More
importantly, it will be seen to be applicable in rough situations with α < 1.
Lemma 8.10. (Rough Gronwall) Assume Y ∈ C([0, 1]), α ∈ (0, 1], and
Remark 8.11. Since |Ys,t | ≤ 2∥Y ∥∞;[s,t] the assumption is trivially satisfied for
“distant” times s, t such that M |t − s|α ≥ 2. It then suffices to check the assumption
for “nearby” times with $M|t-s|^\alpha \le \theta$ with $\theta = 2$, and in fact any $\theta > 0$, at the price of replacing $M$ by $\frac{2M}{\theta \wedge 2}$.
Proof. For any $\xi \in [s, t]$ we have $|Y_\xi| \le |Y_s| + |Y_{s,\xi}| \le |Y_s| + M \|Y\|_{\infty;[s,t]} |t-s|^\alpha$, and so
\[ \|Y\|_{\infty;[s,t]} (1 - M|t-s|^\alpha) \le |Y_s| . \]
Since $e^{-2x} \le 1 - x$ for $x \in [0, 1/2]$, we have, for $M|t-s|^\alpha \in [0, 1/2]$,
\[ \|Y\|_{\infty;[s,t]} \le |Y_s| e^{2M|t-s|^\alpha} \le e |Y_s| . \]
This induces a greedy partition of $[0, 1]$, of mesh-size $(2M)^{-1/\alpha}$ and hence no more than $(2M)^{1/\alpha} + 1$ intervals. The final estimate is then
\[ \|Y\|_{\infty;[0,1]} \le e^{1 + (2M)^{1/\alpha}} |Y_0| , \]
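The final estimate can be sanity-checked numerically. The sketch below assumes the hypothesis in the form used in the proof, $|Y_{s,t}| \le M \|Y\|_{\infty;[s,t]} |t-s|^\alpha$, and takes the illustrative choice $Y_t = e^{ct}$ with $\alpha = 1$ and $M = c$; this example and the helper names `satisfies_hypothesis` and `gronwall_bound` are assumptions made here, not taken from the text.

```python
import numpy as np

def satisfies_hypothesis(Y, t, M, alpha, tol=1e-12):
    """Check |Y_{s,t}| <= M * sup_{[s,t]} |Y| * |t - s|**alpha for all grid pairs s <= t."""
    for i in range(len(t)):
        running_sup = np.maximum.accumulate(np.abs(Y[i:]))   # sup of |Y| over [t_i, t_j]
        increments = np.abs(Y[i:] - Y[i])
        if np.any(increments > M * running_sup * (t[i:] - t[i]) ** alpha + tol):
            return False
    return True

def gronwall_bound(Y0, M, alpha):
    """Right-hand side e^{1 + (2M)^{1/alpha}} |Y_0| of the final estimate."""
    return np.exp(1.0 + (2.0 * M) ** (1.0 / alpha)) * abs(Y0)

c, alpha = 1.5, 1.0
t = np.linspace(0.0, 1.0, 401)
Y = np.exp(c * t)

hypothesis_ok = satisfies_hypothesis(Y, t, M=c, alpha=alpha)
sup_norm = np.max(np.abs(Y))
bound = gronwall_bound(Y[0], M=c, alpha=alpha)
print(hypothesis_ok, sup_norm <= bound)
```

The bound is far from sharp in this smooth example (it gives $e^{4}$ against the true supremum $e^{1.5}$); its value lies in applying equally for $\alpha < 1$.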
We now apply this to linear (Young and rough) differential equations, without
loss of generality posed on [0, 1]. By general theory, Theorem 8.3, we have a (non-
explosive) solution.
Proposition 8.12. Let Y solve the linear Young differential equation dY = AY dX,
started from Y0 and driven by X ∈ C α ([0, 1]), α > 1/2, with A of finite operator
norm a. Then there exists c = c(α) ∈ (0, ∞) so that
\[ \|Y\|_{\infty;[0,1]} \le c \exp\big( c (a \|X\|_{\alpha;[0,1]})^{1/\alpha} \big) |Y_0| . \]
and so $\tfrac12 \|Y\|_{\alpha;[s,t]} \le a |Y|_{\infty;[s,t]}$ whenever $ca|t-s|^\alpha \le 1/2$. Re-insert the estimate on $\|Y\|_{\alpha;[s,t]}$ (and also use $ca|t-s|^\alpha \le 1/2$) above to obtain precisely
\[ |Y_{s,t}| \le a |Y_s| |t-s|^\alpha + a |Y|_{\infty;[s,t]} |t-s|^\alpha \le 2a \|Y\|_{\infty;[s,t]} |t-s|^\alpha . \]
This holds whenever $ca|t-s|^\alpha \le 1/2$ and so we can conclude with the rough Gronwall lemma (and the remark after it). The constant $c$ is allowed to change of course, but remains $c = c(\alpha)$. □
Proposition 8.13. Let $Y$ solve the linear rough differential equation $dY = AY \, d\mathbf{X}$, started from $Y_0$ and driven by $\mathbf{X} \in \mathscr{C}^\alpha([0, 1])$, $\alpha > 1/3$, with $A$ of finite operator norm $a$. Then there exists $c = c(\alpha) \in (0, \infty)$ so that
\[ \|Y\|_{\infty;[0,1]} \le c \exp\big( c (a |||\mathbf{X}|||_{\alpha;[0,1]})^{1/\alpha} \big) |Y_0| . \]
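Proof. By scaling $A$, we can again assume unit (homogeneous) rough path norm for $\mathbf{X}$. By a basic estimate for rough integrals it then holds, with $c = c(\alpha) \in [1, \infty)$ and $a = |A|$,
\[
\begin{aligned}
|Y^\natural_{s,t}| &\le c \|AY, A^2 Y\|_{2\alpha,X} |t-s|^{3\alpha} \le ca \|Y, AY\|_{2\alpha,X} |t-s|^{3\alpha} \\
&= ca \big( \|AY\|_\alpha + \|Y^\#\|_{2\alpha} \big) |t-s|^{3\alpha} ,
\end{aligned}
\]
using musical notation $Y_{s,t} \equiv A Y_s X_{s,t} + Y^\#_{s,t} \equiv A Y_s X_{s,t} + A^2 Y_s \mathbb{X}_{s,t} + Y^\natural_{s,t}$. This entails
\[ |Y^\#_{s,t}| \le |A^2 Y_s \mathbb{X}_{s,t}| + |Y^\natural_{s,t}| \le a^2 |Y_s| |t-s|^{2\alpha} + \big( a \|Y\|_\alpha + \|Y^\#\|_{2\alpha} \big) ca |t-s|^{3\alpha} \]
\[ |Y_{s,t}| \le a |Y_s| |t-s|^\alpha + 8a^2 \|Y\|_{\infty;[s,t]} |t-s|^{2\alpha} \le 5a \|Y\|_{\infty;[s,t]} |t-s|^\alpha . \]
We conclude with the rough Gronwall lemma, just as in the Young case. □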
Remark 8.14. All this can be vector-valued. Assuming $X$ takes values in some space $V$ and $Y$ takes values in $W$, we should view $A$ as a linear map $A : W \otimes V \to W$. The operator $A^2 : W \otimes V \otimes V \to W$ should then be interpreted as $A \circ (A \otimes \mathrm{Id})$.
Write π(f ) (0, y; X) = Y for this solution. Note that the inverse flow exists trivially,
by following the RDE driven by X(t − .),
We call the map y 7→ π(f ) (0, y; X) the flow associated to the above RDE. Moreover,
if X ϵ is a smooth approximation to X (in rough path metric), then the corresponding
ODE solution Y ϵ is close to Y , with a local Lipschitz estimate as given in Section 8.6.
It is natural to ask if the flow depends smoothly on $y$. Given a multi-index $k = (k_1, \dots, k_e) \in \mathbb{N}^e$, write $D_k$ for the partial derivative with respect to $y^1, \dots, y^e$. The proof of the following statement is an easy consequence of [FV10b, Chapter 12].
Theorem 8.15. Let $\alpha \in (1/3, 1/2]$ and $\mathbf{X}, \tilde{\mathbf{X}} \in \mathscr{C}_g^\alpha$. Assume $f \in C_b^{3+n}$ for some integer $n$. Then the associated flow is of regularity $C^{n+1}$ in $y$, as is its inverse flow. The resulting family of partial derivatives, $\{ D_k \pi_{(f)}(0, \xi; \mathbf{X}), |k| \le n \}$ satisfies the RDE obtained by formally differentiating $dY = f(Y) \, d\mathbf{X}$.
At last, for every $M > 0$ there exist $C, K$ depending on $M$ and the norm of $f$ such that, whenever $|||\mathbf{X}|||_\alpha, |||\tilde{\mathbf{X}}|||_\alpha \le M < \infty$ and $|k| \le n$,
\[ \sup_{\xi \in \mathbb{R}^e} \big| D_k \pi_{(f)}(0, \xi; \mathbf{X}) - D_k \pi_{(f)}(0, \xi; \tilde{\mathbf{X}}) \big|_{\alpha;[0,t]} \le C \varrho_\alpha(\mathbf{X}, \tilde{\mathbf{X}}) , \]
\[ \sup_{\xi \in \mathbb{R}^e} \big| D_k \pi_{(f)}(0, \xi; \mathbf{X})^{-1} - D_k \pi_{(f)}(0, \xi; \tilde{\mathbf{X}})^{-1} \big|_{\alpha;[0,t]} \le C \varrho_\alpha(\mathbf{X}, \tilde{\mathbf{X}}) , \]
8.11 Exercises
\[ f = (f_1, \dots, f_d) \in C_b^\infty\big( \mathbb{R}^e, L(\mathbb{R}^d, \mathbb{R}^e) \big) , \]
a) First assume $f_0$ to have the same regularity as $f$, in which case you may solve $dY = \bar f(Y) \, d\bar{\mathbf{X}}$ with $\bar f = (f, f_0)$ and $\bar{\mathbf{X}}$ as (canonical) space-time rough path extension of $\mathbf{X}$. (The missing integrals $\int X^i \, dt$, $\int t \, dX^i$, $i = 1, \dots, d$ are canoni-
Show also that RDE solutions are β-Hölder, uniformly over (X, X) ∈ BR , any
R < ∞.
Exercise 8.7 Show that $\|Y, f(Y); Y^n, f(Y^n)\|_{X,X^n,2\alpha} \to 0$, together with $\mathbf{X}^n \to \mathbf{X}$ in $\mathscr{C}^\beta$ implies that also $(Y^n, \mathbb{Y}^n) \to (Y, \mathbb{Y})$ in $\mathscr{C}^\alpha$. Since, at the price of replacing $f$ by $F$, cf. Definition 8.9, there is no loss of generality in solving for the controlled rough path $Z = (X, Y)$, conclude that continuity of the RDE solution map (Itô–Lyons map) also holds with Lyons’ definition of a solution.
Exercise 8.9 (Lyons extension theorem revisited) Let $\alpha \in (\tfrac13, \tfrac12]$ and consider $\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}^\alpha([0, T], V)$. Show that $\bar{\mathbf{X}} = (1, X^{(1)}, X^{(2)}, X^{(3)}, \dots, X^{(N)})$, the (level-$N$) Lyons lift of $\mathbf{X}$ from Exercise 4.6, solves a linear RDE. Use this and a scaling argument for another proof of the estimate, $0 \le s < t \le T$, $n = 1, \dots, N$,
\[ |X^{(n)}_{s,t}|^{1/n} \lesssim |||\mathbf{X}|||_\alpha |t-s|^\alpha . \]
8.12 Comments
ODEs driven by not too rough paths, i.e. paths that are α-Hölder continuous for some
α > 1/2 or of finite p-variation with p < 2, understood in the (Young) integral sense
were first studied by Lyons in [Lyo94]; nonetheless, the terminology Young-ODEs is
now widely used. Existence and uniqueness for such equations via Picard iterations
is by now classical; our discussion in Section 8.3 is a mild variation of [LCL07, p.22]
where also the division property (cf. proof of Lemma 7.5) is emphasised. Existence
and uniqueness of solutions to RDEs via Picard iteration in the (Banach!) space of
type derivatives, etc) and studied e.g. in [Lyo98, FV10b, CL14], see also [HH10]
for related analysis. Solutions can be estimated by the rough Gronwall lemma
[DGHT19b, Hof18], in a sense a real-analysis abstraction of previously used argu-
ments for linear RDE solutions, [HN07, FV10b].
The existence and uniqueness results for rough differential equations have seen
many variations over recent years. Gubinelli, Imkeller and Perkowski apply their
theory of paracontrolled distributions to (level-2) RDEs with Hölder drivers [GIP15,
Sec.3], extended to Besov drivers by Prömel–Trabs [PT16], revisited with “classical”
rough path tools in [FP18].
Rough/stochastic Volterra equations are discussed from a rough path point of
view in [DT09, HT19, Com19], from a paracontrolled point of view in [PT18] and
in a regularity structure context in [BFG+ 19, Sec.5]. Bailleul–Diehl then study the
inverse problem for rough differential equations [BD15]. For a “joint development”
of RDEs and SDEs with stochastic sewing, by a fixed point argument in a space of
stochastic controlled rough paths, see [FHL20]. Rough partial differential equations
are discussed in Chapter 12.
Last but not least, we note that the point of view to construct RDE solutions by fixed
point arguments in the (linear) space of controlled rough paths, where the rough path
figures as parameter of the fixed point problem, extends naturally to the framework of
regularity structures developed in [Hai14b], cf. Chapter 13 onwards. In that context,
solutions (to singular SPDEs, say) are found by similar fixed point arguments in a
linear space of “modelled distributions”, with enhanced noise (“the model”) again
as parameter of the fixed point problem. (The question of renormalisation is a priori
disconnected from the construction of a solution and only concerns the model / rough
path. However, one would like to understand the equation driven by renormalised
noise, at least when the latter is smooth. In the setting of rough differential equations
such effects have been observed in [FO09], a systematic study in case of branched
RDE is found in Bruned et al. [BCFP19], see also [BCEF20].)
Chapter 9
Stochastic differential equations
In particular, we may use almost every realisation of $(B, \mathbb{B})$ as the driving signal of a rough differential equation. This RDE is then solved “pathwise” i.e. for a fixed realisation of $(B(\omega), \mathbb{B}(\omega))$. Recall that the choice of $\mathbb{B}$ is never unique: two important choices are the Itô and the Stratonovich lift, we write $\mathbf{B}^{\text{Itô}}$ and $\mathbf{B}^{\text{Strat}}$, where $\mathbb{B}$ is defined as $\int B \otimes dB$ and $\int B \otimes \circ dB$ respectively. We now discuss the interplay with classical stochastic differential equations (SDEs).
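Theorem 9.1. Let $f \in C_b^3(\mathbb{R}^e, L(\mathbb{R}^d, \mathbb{R}^e))$, let $f_0 : \mathbb{R}^e \to \mathbb{R}^e$ be Lipschitz continu-
\[ dY = f_0(Y) \, dt + f(Y) \, d\mathbf{B}^{\text{Itô}} , \qquad Y_0 = \xi . \]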
A classical result (e.g. [IW89, p.392]) asserts that SDE approximations based on
piecewise linear approximations to the driving Brownian motions converge to the
solution of the Stratonovich equation. Using the machinery built in the previous
sections, we can now give a simple proof of this by combining Proposition 3.6,
Theorem 8.5 and the understanding that RDEs driven by BStrat yield solutions to the
Stratonovich equation (Theorem 9.1).
Theorem 9.3 (Wong–Zakai, Clark, Stroock–Varadhan). Let f, f0 , ξ be as in Theo-
rem 9.1 above. Let α < 1/2. Consider dyadic piecewise linear approximations (B n )
to B on [0, T ], as defined in Proposition 3.6. Write Y n for the (random) ODE solu-
tions to dY n = f0 (Y n )dt + f (Y n )dB n and Y for the Stratonovich SDE solution to
dY = f0 (Y )dt + f (Y ) ◦ dB, all started at ξ. Then the Wong–Zakai approximations
converge a.s. to the Stratonovich solution. More precisely, with probability one,
∥Y − Y n ∥α;[0,T ] → 0.
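The convergence can be observed in a small simulation. The sketch below is an illustration under assumptions made here, not the setting of the proof: it takes the scalar case $d = e = 1$ with $f_0 = 0$, $f = \sin$ and $\xi = \pi/2$, for which the Stratonovich solution is explicitly $Y_t = 2\arctan(e^{B_t})$; the Brownian path is itself sampled on a fine dyadic grid; and the error is measured in the uniform norm rather than the α-Hölder norm of the theorem. All function names are hypothetical.

```python
import numpy as np

def brownian_path(n_steps, rng):
    """Brownian motion on [0, 1] sampled on a uniform grid with n_steps steps."""
    increments = rng.normal(0.0, np.sqrt(1.0 / n_steps), size=n_steps)
    return np.concatenate([[0.0], np.cumsum(increments)])

def dyadic_piecewise_linear(B, level):
    """Piecewise linear interpolation of B at the dyadic points k 2^{-level}."""
    n_steps = len(B) - 1
    stride = n_steps // 2 ** level
    t = np.linspace(0.0, 1.0, n_steps + 1)
    return np.interp(t, t[::stride], B[::stride])

def solve_ode_heun(f, y0, X):
    """Heun scheme for the classical ODE dY = f(Y) dX along a piecewise linear path X."""
    Y = np.empty(len(X))
    Y[0] = y0
    for i in range(len(X) - 1):
        dx = X[i + 1] - X[i]
        predictor = Y[i] + f(Y[i]) * dx
        Y[i + 1] = Y[i] + 0.5 * (f(Y[i]) + f(predictor)) * dx
    return Y

def wong_zakai_sup_error(level, fine_level=12, seed=0):
    """Uniform distance between the Wong-Zakai ODE solution and the Stratonovich solution."""
    rng = np.random.default_rng(seed)
    B = brownian_path(2 ** fine_level, rng)
    Bn = dyadic_piecewise_linear(B, level)
    Yn = solve_ode_heun(np.sin, np.pi / 2, Bn)
    Y = 2.0 * np.arctan(np.exp(B))   # explicit Stratonovich solution for f = sin, Y_0 = pi/2
    return np.max(np.abs(Y - Yn))

for level in (2, 4, 6, 8, 10):
    print(level, wong_zakai_sup_error(level))
```

In one dimension the second level of the driver carries no Lévy area, so this experiment does not probe the genuinely rough-path aspect of the result; it only illustrates the statement itself.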
The only reason for dyadic piecewise linear approximations in the above statement
is the formulation of the martingale-based Proposition 3.6. In Section 10 we shall
present a direct analysis (going far beyond the setting of Brownian drivers) which
easily entails quantitative convergence (in probability and Lq , any q < ∞) for all
piecewise linear approximations towards a (Gaussian) rough path.
In the forthcoming Exercise 10.2 it will be seen that (non-dyadic) piecewise linear approximations of mesh size $\sim 1/n$, viewed canonically as rough paths, converge a.s. in $\mathscr{C}^\alpha$ with rate anything less than $1/2 - \alpha$. As long as $\alpha > 1/3$, it then follows from (local) Lipschitzness of the Itô–Lyons map that Wong–Zakai approximations also converge with rate $(1/2 - \alpha)^-$. Note that the “best” rate one obtains in this way is $(1/2 - 1/3)^- = 1/6^-$; the reason being that rate is measured in some Hölder space with exponent $1/3^+$, rather than the uniform norm. The well-known almost sure “strong” rate $1/2^-$ can be obtained from rough path theory at the price of working in rough path spaces of much lower regularity, see [FR14].
9.3 Support theorem and large deviations
We briefly discuss two fundamental results in diffusion theory and explain how
the theory of rough paths provides elegant proofs, reducing a question for general
diffusion to one for Brownian motion and its Lévy area.
The results discussed in this section were among the very first applications of
rough path theory to stochastic analysis, see Ledoux et al. [LQZ02]. Much more
on these topics is found in [FV10b], so we shall be brief. The first result, due to
Stroock–Varadhan [SV72] concerns the support of diffusion processes.
(where Euclidean norm is used for the conditioning ∥B − h∥∞,[0,T ] < ε). As a
consequence, the support of the law of Y , viewed as measure on the pathspace
C 0,α ([0, T ], Re ), is precisely the α-Hölder closure of {y h : ḣ ∈ L2 ([0, T ], Rd )}.
Proof. Using Theorem 9.1 we can and will take $Y$ as RDE solution driven by $\mathbf{B}^{\text{Strat}}(\omega)$. For $h \in H$ and some fixed $\alpha \in (\tfrac13, \tfrac12)$, we furthermore denote by $S^{(2)}(h) = (h, \int h \otimes dh) \in \mathscr{C}_g^{0,\alpha}$ the canonical lift given by computing the it-
\[ \lim_{\varepsilon \to 0} P\Big( \varrho_{\alpha;[0,T]}\big( \mathbf{B}^{\text{Strat}}, S^{(2)}(h) \big) < \delta \,\Big|\, \|B - h\|_{\infty;[0,T]} < \varepsilon \Big) = 1 . \tag{9.3} \]
1 Strictly speaking, this was shown for $h \in C^2$; the extension to $h \in H$ is non-trivial and found in [FV10b].
The conditional statement then follows easily from continuity of the Itô–Lyons map
and so yields the “difficult” support inclusion: every y h is in the support of Y . The
easy inclusion, support of Y contained in the closure of {y h }, follows from the
Wong–Zakai theorem, Theorem 9.3. If one is only interested in the support statement,
but without the conditional statement (9.2), there are “softer” proofs; see Exercise 9.1
below. ⊔ ⊓
Here $I$ is Schilder’s rate function for Brownian motion, i.e. $I(h) = \tfrac12 \|\dot h\|^2_{L^2([0,T],\mathbb{R}^d)}$ for $h \in H$ and $I(h) = +\infty$ otherwise.
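For orientation, the rate function is easy to evaluate on discretised paths. The sketch below assumes $h$ is piecewise linear on a uniform grid with $h(0) = 0$, so that the integral reduces to a finite sum; the helper name `schilder_rate` is hypothetical, and the case $I(h) = +\infty$ for paths outside $H$ is of course invisible at this discrete level.

```python
import numpy as np

def schilder_rate(h, T=1.0):
    """I(h) = 0.5 * int_0^T |h'(t)|^2 dt for a piecewise linear path sampled on a uniform grid."""
    h = np.asarray(h, dtype=float)
    if h.ndim == 1:
        h = h[:, None]                      # treat a scalar path as R^1-valued
    dt = T / (len(h) - 1)
    dh = np.diff(h, axis=0)
    return 0.5 * np.sum(dh ** 2) / dt

t = np.linspace(0.0, 1.0, 1001)
rate_linear = schilder_rate(2.0 * t)                     # h(t) = 2t, exact value 2
rate_planar = schilder_rate(np.stack([t, -t], axis=1))   # h(t) = (t, -t), exact value 1
print(rate_linear, rate_planar)
```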
Proof. The key remark is that large deviation principles are robust under continuous
maps, a simple fact known as contraction principle. The problem is then reduced to
establishing a suitable large deviation principle for the Stratonovich lift of εB (which
is exactly $\delta_\varepsilon \mathbf{B}^{\text{Strat}}$) in the α-Hölder rough path topology. Readers familiar with general
facts of large deviation theory, in particular the inverse and generalised contraction
principles, are invited to complete the proof along Exercise 9.2 below. ⊔ ⊓
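Theorem 9.6. Let $Y^\varepsilon$ be the unique Stratonovich SDE solution on $[0, T]$ in the small noise regime from Theorem 9.5. Under conditions (H1-H4), the following precise Laplace asymptotic holds
\[ E\big[ \exp\big( -F(Y^\varepsilon)/\varepsilon^2 \big) \big] = \exp\Big( -\frac{F_\Lambda(\gamma)}{\varepsilon^2} \Big) (c_0 + o(1)) \quad \text{as } \varepsilon \downarrow 0 , \tag{9.6} \]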
Since (δε B) satisfies an LDP with good rate function, (H1) implies that there exists
d > a := FΛ (γ) and ε0 > 0 such that for all ε ∈ (0, ε0 )
Hence this term does not contribute to the asymptotics (9.6). In the sequel, we shall
take, for some ϱ > 0,
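(ii) Cameron–Martin shift. It is easy to see that, for Wiener a.e. $\omega$, one has $\mathbf{B}(\omega + h) = T_h \mathbf{B}(\omega)$. In particular, the Cameron–Martin shift $\varepsilon B \rightsquigarrow \varepsilon B + \gamma$ (or $\omega \rightsquigarrow \omega + \gamma/\varepsilon$) induces a translation of $\delta_\varepsilon \mathbf{B}$ in the sense that
\[ \delta_\varepsilon \mathbf{B} = \Big( \varepsilon B, \int \varepsilon B \otimes d(\varepsilon B) \Big) \rightsquigarrow \Big( \varepsilon B + \gamma, \int (\varepsilon B + \gamma) \otimes d(\varepsilon B + \gamma) \Big) = T_\gamma \delta_\varepsilon \mathbf{B} . \]
From the Cameron–Martin theorem, with all integrals below understood over $[0, T]$,
\[
\begin{aligned}
J_\varrho(\varepsilon) &= E\Big[ \exp\Big( -\frac{\|\gamma\|_H^2}{2\varepsilon^2} - \frac{\int \dot\gamma \, d(\varepsilon B)}{\varepsilon^2} \Big) \exp\Big( -\frac{F \circ \Phi(T_\gamma \delta_\varepsilon \mathbf{B})}{\varepsilon^2} \Big) ; \ |||\delta_\varepsilon \mathbf{B}||| < \varrho \Big] \\
&= \exp\Big( -\frac{\tfrac12 \|\gamma\|_H^2 + F \circ \Phi(\gamma)}{\varepsilon^2} \Big) E\Big[ \exp\Big( -\frac{(*)}{\varepsilon^2} \Big) ; \ |||\delta_\varepsilon \mathbf{B}||| < \varrho \Big] ;
\end{aligned}
\]
where we recognise $F_\Lambda(\gamma)$ in the first exponential and also set
\[ (*) = F \circ \Phi(T_\gamma \delta_\varepsilon \mathbf{B}) - F \circ \Phi(\gamma) + \varepsilon \int \dot\gamma \, dB . \]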
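(iii) Local analysis around the minimiser. We argue on a fixed rough path realisation $\mathbf{X} := \mathbf{B}(\omega)$. One checks that $\varepsilon \mapsto \Phi(T_\gamma \delta_\varepsilon \mathbf{X})$ is sufficiently smooth so that
\[ \Phi(T_\gamma \delta_\varepsilon \mathbf{X}) = \Phi(\gamma) + \varepsilon G^1(\mathbf{X}) + \frac{\varepsilon^2}{2} G^2(\mathbf{X}) + \varepsilon^3 R_\varepsilon(\mathbf{X}) \]
with remainder $R_\varepsilon(\mathbf{X})$, uniformly bounded in $\varepsilon \in (0, 1]$. We now use (H3) to obtain the expansion
\[ + \frac{\varepsilon^2}{2} \underbrace{\Big[ DF|_\varphi \, G^2(\mathbf{X}) + D^2 F|_\varphi \big( G^1(\mathbf{X}), G^1(\mathbf{X}) \big) \Big]}_{=: Q(\mathbf{X})} + \varepsilon^3 R^F_\varepsilon(\mathbf{X}) , \]
where (H3) requires us to take $\varepsilon$ less than some $\varepsilon_1(\mathbf{X})$, with remainder $R^F_\varepsilon(\mathbf{X})$, uniformly bounded in $\varepsilon \in (0, \varepsilon_1)$. Write $G^1 = G^1(h)$, and similar for $G^2$, $Q$, when evaluated at the canonical lift of an element $h \in H$. We note for later
\[ Q(h) = \frac{\partial^2}{\partial \varepsilon^2} \Big|_{\varepsilon = 0} (F \circ \Phi)(\gamma + \varepsilon h) . \]
Since $\gamma$ minimises $F_\Lambda = F \circ \Phi + I$, first order optimality leads precisely to
\[ DF|_\varphi \, G^1(h) + \int \dot\gamma \, dh = 0 , \tag{9.8} \]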
To see why this is so, we first show integrability and even exp [−Q(B)/2] ∈ L1+β ,
for some β > 0, as consequence of the non-degeneracy assumption on the minimizer.
The claimed integrability follows from the tail estimate P(−Q(B)/2 ≥ r) ≤ e−Cr ,
with C > 1 and for sufficiently large r. Now Q is “quadratic” in the precise sense
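$Q(\delta_\lambda \mathbf{X}) = \lambda^2 Q(\mathbf{X})$, $\lambda > 0$, so that upon setting $r \equiv 1/\varepsilon^2$, we are left to show
\[ P\big( -Q(\delta_\varepsilon \mathbf{B}) \ge 2 \big) \le e^{-C/\varepsilon^2} . \]
\[ 1 \le -Q(h^*)/2 = \frac12 \frac{\partial^2}{\partial \varepsilon^2} \Big|_{\varepsilon = 0} (-F \circ \Phi)(\gamma + \varepsilon h^*) < \frac12 \|h^*\|_H^2 . \]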
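This establishes $\exp[-Q(\mathbf{B})/2] \in L^{1+\beta}$. This additional amount of integrability, $\beta > 0$, is now used to give a uniform $L^1$-bound on $\exp\big( -Q(\mathbf{B})/2 + \varepsilon R^F_\varepsilon(\mathbf{B}) \big)$ over $|||\delta_\varepsilon \mathbf{B}||| < \varrho$, after which one can conclude by dominated convergence. To this end, we revert to a pathwise consideration, $\mathbf{X} := \mathbf{B}(\omega)$. We need the remainder estimate, Exercise 9.4,
\[ \sup_{\varepsilon \in (0, \varepsilon_1]} \big| R^F_\varepsilon(\mathbf{X}) \big| \lesssim 1 + |||\mathbf{X}|||^3 , \tag{9.9} \]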
valid whenever ε|||X||| = |||δε X||| remains bounded. It follows that, on |||δε B||| < ϱ, we
have the (uniform in small ε) estimate
and this estimate is uniform over ε ∈ (0, 1]. By Fernique’s estimate for the (homoge-
neous!) rough path norm |||B||| of B = B(ω) and by choosing ϱ = ϱ(β) small enough,
we can guarantee that
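\[ e^{\varepsilon R^F_\varepsilon(\mathbf{B})} 1_{\{ |||\delta_\varepsilon \mathbf{B}||| < \varrho \}} \lesssim \exp\big( C \varrho |||\mathbf{B}|||^2 \big) \in L^{\beta'} , \]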
9.5 Exercises
support. The “easy” inclusion, supp µ ⊂ Cg0,α is clear from Proposition 3.6. For the
other inclusion, recall the translation operator from Exercise 2.15 and follow the
steps below.
a) (Cameron–Martin theorem for Brownian rough path) Let h ∈ [0, T ] ∈ H =
W01,2 . Show that X ∈ supp µ implies Th (X) ∈ supp µ.
b) Show that the support of µ contains at least one point, say X̂ ∈ Cg0,α with
the property that there exists a sequence of Lipschitz paths (h(n) ) so that
Th(n) (X̂) → (0, 0) in α-Hölder rough path metric.
Hint: Almost every realisation of BStrat (ω) will do, with −h(n) = B (n) , the
dyadic piecewise linear approximations from Proposition 3.6.
c) Conclude that $(0, 0) = \lim_{n \to \infty} T_{h^{(n)}}(\hat{\mathbf{X}}) \in \operatorname{supp} \mu$.
d) As a consequence, any $(h, \int h \otimes dh) = T_h(0, 0) \in \operatorname{supp} \mu$, for any $h \in H$ and taking the closure yields the “difficult” inclusion.
e) Appeal to continuity of the Itô–Lyons map to obtain the “difficult” support
inclusion (“every y h is in the support of Y ” ) in the context of Theorem 9.4.
Exercise 9.2 (“Schilder” large deviations, see [FV10b]) Fix $\alpha \in (\tfrac13, \tfrac12)$ and consider
\[ \delta_\varepsilon \mathbf{B}^{\text{Strat}} = (\varepsilon B, \varepsilon^2 \mathbb{B}^{\text{Strat}}) , \]
the laws of which are viewed as probability measures $\mu^\varepsilon$ on the Polish space $\mathscr{C}^{0,\alpha}_{g,0}$. Show that $(\mu^\varepsilon : \varepsilon > 0)$ satisfies a large deviation principle in α-Hölder rough path topology with good rate function
\[ J(\mathbf{X}) = I(X) , \]
where $\mathbf{X} = (X, \mathbb{X})$ and $I$ is Schilder’s rate function for Brownian motion, i.e. $I(h) = \tfrac12 \|\dot h\|^2_{L^2([0,T],\mathbb{R}^d)}$ for $h \in H = W_0^{1,2}$ and $I(h) = +\infty$ otherwise.
Hint: Thanks to Gaussian integrability for the homogeneous rough paths norm of
BStrat it is actually enough to establish a large deviation principle for (δε BStrat : ε >
0) in the (much coarser) uniform topology, which is not very hard to do “by hand”,
cf. [FV10b].
such that |rε (X)| ≲ |X1 |3 , uniformly in ε ∈ (0, 1], provided |εX1 | remains
bounded.
b) Show that an extra ε-dependent drift, say εX replaced by εX + εµ for some
fixed µ ∈ C([0, 1], Rd ), alters the remainder estimate to |rε (X)| ≲ 1 + |X1 |3 .
c) Generalise a) and b) to the situation when Φ is C 3 -regular in Fréchet sense. (This
trivially covers the case F ◦ Φ, with another F ∈ C 3 .)
d) Prove the real thing, i.e. the remainder estimate (9.9) based on the expansion
of ε 7→ F ◦ Φ(Tγ δε X) where Φ is the Itô–Lyons map. (See e.g. [IK07, Thm 5.1]
and the references therein. For a similar estimate in a slightly different setting,
see also [FGP18].)
9.6 Comments
The rough path approach to solving stochastic differential equations (SDEs) driven
by d-dimensional noise, can be seen as far-reaching extension of the works of Doss
and Sussmann [Dos77, Sus78], and the Wong–Zakai approximation result [WZ65]
(d = 1) and Clark [Cla66], Stroock-Varadhan [SV72] for d > 1. Lyons [Lyo98]
used the Wong–Zakai theorem in conjunction with his continuity result to deduce
the fact that RDE solutions (driven by the Brownian rough path BStrat ) coincide with
solution to (Stratonovich) stochastic differential equations. Similar to Friz–Victoir
[FV10b], the logic is reversed in our presentation: thanks to an a priori identification
of $\int f(Y) \, d\mathbf{B}^{\text{Strat}}$ as a Stratonovich stochastic integral, the Wong–Zakai result is
of the Wong–Zakai theorem for a singular SPDEs with space-time white noise via
regularity structures is established by Hairer–Pardoux [HP15].
Almost sure rates for Wong–Zakai approximations in Brownian (and then more
general Gaussian) rough path situations, were studied by Hu–Nualart [HN09], Deya–
Neuenkirch–Tindel [DNT12] and Friz–Riedel [FR14]; see also Riedel–Xu [RX13].
Let us also note that Lq -rates for the convergence of approximations are not easy
to obtain with rough path techniques (in contrast to Itô calculus which is ideally
suited for moment calculations). Nonetheless, such rates can be obtained by Gaussian
techniques, as discussed in Section 11.2.3 below; applications include multi-level
Monte Carlo for SDEs and more generally Gaussian RDEs [BFRS16]. The rough
path approach to SDEs (and more generally Gaussian RDEs) leads naturally to
random dynamical systems, cf. comment Section 10.5.
The rough path approach to the Stroock-Varadhan support theorem [SV72] in
Section 9.3 goes back to Ledoux–Qian–Zhang [LQZ02] in p-variation and Friz
[Fri05] in Hölder topology, simplified and extended with Victoir in [FV05, FV07,
FV10b]; the conditional estimate (9.3) is due to Friz, Lyons and Stroock [FLS06].
We note that this strategy of proof applies whenever one has rough path stability,
which includes many stochastic partial differential equations (with finite-dimensional
noise) discussed in Chapter 12. In the case of infinite-dimensional noise, a general
support theorem for singular SPDEs was obtained via regularity structures by Hairer–
Schönbauer [HS19] and extends the paracontrolled work of Chouk–Friz [CF18], as
well as classical results such as the work of Bally, Millet and Sanz-Sole [BMSS95].
The rough path approach to Freidlin–Wentzell (small noise) large deviations in
Section 9.3 also goes back to Ledoux, Qian and Zhang [LQZ02] in p-variation,
strengthened to Hölder topology in [FV05]; Inahama studies large deviations for
pinned diffusions [Ina15], see also [Ina16a]. Once more, the strategy of proof applies
whenever one has rough path stability, and thus applies to many stochastic partial
differential equations as discussed in Chapter 12. Large deviations for Banach valued
Wiener–Itô chaos proved useful in extensions to Gaussian rough paths and then Gaus-
sian models (in the sense of regularity structures), see [FV07] and [HW15], where
Hairer–Weber establish small noise large deviations for large classes of singular
SPDEs.
Theorem 9.6 is an elegant application of rough paths, due to Aida [Aid07], to the
classical theme of Laplace method on Wiener space, in a setting close to Ben Arous
[BA88]; see also Inahama [Ina06], his work with Kawabi [IK07] and [Ina13]. Our
presentation borrows from Friz, Gassiat and Pigato [FGP18]. See Friz–Klose [FK20]
for a recent extension of these works to singular SPDEs via regularity structures.
Recent applications to heat kernel expansions include [IT17].
The pathwise approach has also been useful to study mean field or McKean–Vlasov
stochastic differential equations. This goes back to Tanaka [Tan84], with pathwise
analysis of additive noise, revisited and extended by Coghi et al. [CDFM18]. The
rough path case was pioneered by Cass–Lyons [CL15], with measure dependent drift,
followed by Bailleul, Catellier and Delarue [BCD20, BCD19] to a setup that includes
the important case of measure dependent noise vector fields. Dawson–Gärtner type
large deviations from the McKean–Vlasov limit of weakly interacting diffusions are
studied in [Tan84, CDFM18], and also in Deuschel et al. [DFMS18] via rough
paths, always under additive noise. Coghi–Nilssen [CN19] study, from a rough path
point of view, McKean-Vlasov diffusion with “common” noise.
The Lions–Sznitman theory of reflecting SDEs [LS84] was revisited from a
purely analytic rough path perspective by Aida [Aid15] and Deya et al. [DGHT19a]
(existence); Gassiat [Gas20] shows non-uniqueness.
Homogenisation has also seen much impetus from rough path theory. After early
works by Lejay–Lyons [LL03], we mention Bailleul–Catellier [BC17] and Kelly–Melbourne
[KM16, KM17], who pioneered applications to deterministic homogenisation
for fast-slow systems with chaotic noise, work continued by Chevyrev et al.
[CFK+ 19b, CFK+ 19a, CFKM19].
Stochastic differential equations with jumps, driven by Lévy or general semimartingale
noise, are well known [KPP95, Pro05, App09] to require a careful
interpretation: forward vs. geometric (a.k.a. Marcus canonical) sense. The pathwise
interpretation of such differential equations was started by Williams [Wil01] and
essentially completed by Chevyrev, Friz, Shekhar and Zhang [FS17, FZ18, CF19];
consistency with the corresponding stochastic theories is also shown.
Rough analysis is “strong” by nature, yet has also proven a powerful tool for
“weak” (or martingale) problems. This was pioneered by Delarue–Diehl [DD16],
using rough paths to study a one-dimensional SDE with distributional drift, with
applications to polymer measures. The extension to higher dimensions was carried
out with paracontrolled methods by Cannizzaro–Chouk [CC18a].
Bruned et al. [BCF18] construct examples of renormalised SDE solutions, par-
tially based on the “Hoff” process [Hof06, FHL16], related to Itô SDE solutions as
averaging Stratonovich solutions [LY16].
Chapter 10
Gaussian rough paths
\[ X(\omega) : [0,T] \to \mathbb{R}^d \]
and may take the underlying probability space as $C([0,T], \mathbb{R}^d)$, equipped with a
Gaussian measure $\mu$ so that $X_t(\omega) = \omega(t)$. Recall that $\mu$, the law of $X$, is fully
determined by its covariance function
\[ R : [0,T]^2 \to \mathbb{R}^{d \times d}, \qquad (s,t) \mapsto \mathbb{E}[X_s \otimes X_t] . \]
In this section, a major role will be played by the rectangular increments of the
covariance, namely
\[ R\binom{s,t}{s',t'} \overset{\mathrm{def}}{=} \mathbb{E}[X_{s,t} \otimes X_{s',t'}] . \]
As far as the Hölder regularity of sample paths is concerned, we have the following
classical result, which is nothing but a special case of Kolmogorov’s continuity
criterion:
Proposition 10.1. Assume there exist positive $\varrho$ and $M$ such that for every $0 \le s \le t \le T$,
\[ \left| R\binom{s,t}{s,t} \right| \le M |t-s|^{1/\varrho} . \tag{10.1} \]
Then, for every $\alpha < 1/(2\varrho)$ there exists $K_\alpha \in L^q$, for all $q < \infty$, such that
\[ |X_{s,t}(\omega)| \le K_\alpha(\omega) |t-s|^{\alpha} . \]
Proof. We may argue componentwise and thus take d = 1 without loss of generality.
Since
\[ |X_{s,t}|_{L^2} = \left( \mathbb{E}[X_{s,t} X_{s,t}] \right)^{1/2} \le \left| R\binom{s,t}{s,t} \right|^{1/2} \le M^{1/2} |t-s|^{\frac{1}{2\varrho}} \]
and $|X_{s,t}|_{L^q} \le c_q |X_{s,t}|_{L^2}$ by Gaussianity, we conclude immediately with an
application of the Kolmogorov criterion. □
Whenever the above proposition applies with ϱ < 1, the resulting sample paths
can be taken with Hölder exponent $\alpha \in (\frac{1}{2}, \frac{1}{2\varrho})$; differential equations driven by X
can then be handled with Young’s theory, cf. Section 8.3. Therefore, our focus will be
on Gaussian processes which satisfy a suitable modification of condition (10.1) with
ϱ ≥ 1 such that the process X allows for a probabilistic construction of a suitable
second order process¹
\[ \mathbb{X}(\omega) : [0,T]^2 \to \mathbb{R}^{d \times d} , \]
which is tantamount to making sense of the “formal” stochastic integrals
\[ \int_s^t X^i_{s,r} \, dX^j_r \quad \text{for } 0 \le s < t \le T, \ 1 \le i,j \le d , \tag{10.2} \]
such that almost every realisation $\mathbb{X}(\omega)$ satisfies the algebraic and analytical properties
of Section 2, notably (2.1) and (2.3) for some $\alpha \in (\frac{1}{3}, \frac{1}{2}]$. We shall also look
for $(X, \mathbb{X})$ as (random) geometric rough path; thanks to (2.6), only the case i < j in
(10.2) then needs to be considered.
At the risk of being repetitive, the reader should keep in mind the following three
points: (i) the sample paths X(ω) will not have, in general, enough regularity to
define (10.2) as Young integrals; (ii) the process X will not be, in general, a semimartingale,
so (10.2) cannot be defined using classical stochastic integrals; (iii) a lift
of the process X to $(X, \mathbb{X}) \in \mathscr{C}_g^{\alpha}$ for some $\alpha \in (\frac{1}{3}, \frac{1}{2}]$, if at all possible, will never
be unique (as discussed in Chapter 2, one can always perturb the area, i.e. $\mathrm{Anti}(\mathbb{X})$,
by the increments of a 2α-Hölder path). But there might still be one distinguished
canonical choice for $\mathbb{X}$, in the same way as $\mathbb{B}^{\mathrm{Strat}}$ is canonically obtained as limit
(in probability) of $\int B^n \otimes dB^n$, for many natural approximations $B^n$ of Brownian
motion B.
¹ Despite the two parameters (s, t) one should not think of a random field here: as was noted in
Exercise 2.4, $(X, \mathbb{X})$ is really a path.
10.2 Stochastic integration and variation regularity of the covariance
where the limit is understood in probability, say. Classical stochastic analysis (e.g.
[RY99, p144]) tells us that care is necessary: if X, X̃ are semimartingales, the
choice ξ = s (“left-point evaluation”) leads to the Itô integral; ξ = t (“right-point
evaluation”) to the backward Itô – and ξ = (s + t)/2 to the Stratonovich integral.
On the other hand, all these integrals only differ by a bracket term ⟨X, X̃⟩ which
vanishes if X, X̃ are independent. While we do not assume a semimartingale structure
here, we do have the standing assumption of componentwise independence. This
suggests a Riemann sum approximation of (10.2) in which we expect the precise
point of evaluation to play no rôle; we thus consider left-point evaluation (but mid-
or right-point evaluation would lead to the same result; cf. Exercise 10.5 (ii) below).
Given a partition $\mathcal{P}$ of an interval and an integrand F, we set
\[ \int_{\mathcal{P}} F_s \, d\tilde X_s := \sum_{[s,t] \in \mathcal{P}} F_s \tilde X_{s,t} , \]
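A minimal numerical sketch of this Riemann sum, assuming the integrand and the integrator are given as deterministic functions of time. The helper names `riemann_sum` and `uniform_partition` and the `point` switch are hypothetical and serve only as illustration. For smooth paths the left- and right-point sums differ by $\sum F_{s,t} \tilde X_{s,t}$, which vanishes with the mesh; this is the deterministic analogue of the vanishing bracket term discussed above.

```python
def uniform_partition(a, b, n):
    """Partition points a = t_0 < t_1 < ... < t_n = b with equal spacing."""
    return [a + (b - a) * k / n for k in range(n + 1)]


def riemann_sum(F, X_tilde, partition, point="left"):
    """Sum of F(xi) * (X_tilde(t) - X_tilde(s)) over consecutive [s, t].

    point = "left"  evaluates F at xi = s,
    point = "mid"   evaluates F at xi = (s + t) / 2,
    point = "right" evaluates F at xi = t.
    """
    total = 0.0
    for s, t in zip(partition, partition[1:]):
        xi = {"left": s, "mid": 0.5 * (s + t), "right": t}[point]
        total += F(xi) * (X_tilde(t) - X_tilde(s))
    return total


# Toy example with F_r = r and X_tilde_r = r on [0, 1].
identity = lambda r: r
P = uniform_partition(0.0, 1.0, 4)
left = riemann_sum(identity, identity, P, "left")
right = riemann_sum(identity, identity, P, "right")

# On a finer partition the gap between right- and left-point sums shrinks.
P_fine = uniform_partition(0.0, 1.0, 1000)
gap_fine = (riemann_sum(identity, identity, P_fine, "right")
            - riemann_sum(identity, identity, P_fine, "left"))
```

In this toy case the gap equals the sum of squared increments of the identity path, which is of the order of the mesh size.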
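Let us now assume that R has finite ϱ-variation in the sense $\|R\|_{\varrho;[0,1]^2} < \infty$ where
the ϱ-variation on a rectangle $I \times I'$ is given by
\[ \|R\|_{\varrho; I \times I'} := \Bigg( \sup_{\substack{\mathcal{P} \subset I, \\ \mathcal{P}' \subset I'}} \sum_{\substack{[s,t] \in \mathcal{P}, \\ [s',t'] \in \mathcal{P}'}} \left| R\binom{s,t}{s',t'} \right|^{\varrho} \Bigg)^{1/\varrho} < \infty , \tag{10.5} \]
and similarly for $\tilde R$, with $\theta = 1/\varrho + 1/\tilde\varrho > 1$. A generalisation of Young’s maximal
inequality due to Towghi [Tow02] states that²
\[ \sup_{\substack{\mathcal{P} \subset I, \\ \mathcal{P}' \subset I'}} \left| \int_{\mathcal{P} \times \mathcal{P}'} R \, d\tilde R \right| \le C(\theta) \, \|R\|_{\varrho; I \times I'} \, \|\tilde R\|_{\tilde\varrho; I \times I'} . \]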
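As a small illustration of (10.5), the following sketch evaluates the sum inside the supremum for one fixed pair of partitions, in the scalar case. The function names are hypothetical. A single grid only yields a lower bound on the ϱ-variation; for Brownian motion with ϱ = 1 every rectangular increment is the (non-negative) length of the overlap of the two intervals, so every grid on $[0,1]^2$ returns the same value 1.

```python
def rect_increment(R, s, t, sp, tp):
    """Rectangular increment E[X_{s,t} X_{s',t'}] of a scalar covariance R."""
    return R(t, tp) - R(t, sp) - R(s, tp) + R(s, sp)


def grid_variation_sum(R, P, P_prime, rho):
    """(sum over the grid P x P' of |rectangular increment|^rho)^(1/rho).

    This is the quantity inside the sup of (10.5) for one fixed grid,
    hence only a lower bound for the rho-variation itself.
    """
    total = 0.0
    for s, t in zip(P, P[1:]):
        for sp, tp in zip(P_prime, P_prime[1:]):
            total += abs(rect_increment(R, s, t, sp, tp)) ** rho
    return total ** (1.0 / rho)


def brownian_cov(s, t):
    """Covariance of standard Brownian motion."""
    return min(s, t)


P = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
Q = [0.0, 0.3, 0.7, 1.0]
value = grid_variation_sum(brownian_cov, P, Q, 1.0)
```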
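Hence, $\int_0^1 X_{0,r} \, d\tilde X_r$ exists as the $L^2$-limit of $\int_{\mathcal{P}} X_{0,r} \, d\tilde X_r$ as $|\mathcal{P}| \downarrow 0$ and
\[ \mathbb{E}\left[ \left( \int_0^1 X_{0,r} \, d\tilde X_r \right)^2 \right] \le C \, \|R\|_{\varrho;[0,1]^2} \, \|\tilde R\|_{\varrho;[0,1]^2} . \tag{10.7} \]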
Proof. At first glance, the situation looks similar to Young’s part in the proof of
Theorem 4.10 where we deduce (4.14) from Young’s maximal inequality. However,
the same argument fails if re-run with Ξs,t = X0,s X̃s,t and | · | replaced by | · |L2 ;
in effect, the triangle inequality is too crude and does not exploit probabilistic
cancellations present here. We now present two arguments for the key estimate (10.6).
First argument: at the price of adding / subtracting P ∩ P ′ , we may assume without
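loss of generality that $\mathcal{P}'$ refines $\mathcal{P}$. This allows us to write
\[ \int_{\mathcal{P}'} X_{0,r} \, d\tilde X_r - \int_{\mathcal{P}} X_{0,r} \, d\tilde X_r = \sum_{[u,v] \in \mathcal{P}} \int_{\mathcal{P}' \cap [u,v]} X_{u,r} \, d\tilde X_r \overset{\mathrm{def}}{=} I , \]
\[ = \sum_{[u,v] \in \mathcal{P}} \sum_{[u',v'] \in \mathcal{P}} \int_{\mathcal{P}' \cap [u,v] \times \mathcal{P}' \cap [u',v']} R \, d\tilde R . \]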
Thanks to Towghi’s maximal inequality, the absolute value of this term is bounded
from above by a constant C = C(ϱ) times
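\[ \sum_{[u,v] \in \mathcal{P}} \sum_{[u',v'] \in \mathcal{P}} \|R\|_{\varrho;[u,v] \times [u',v']} \, \|\tilde R\|_{\varrho;[u,v] \times [u',v']} \le \sum_{[u,v] \in \mathcal{P}} \sum_{[u',v'] \in \mathcal{P}} \omega([u,v] \times [u',v'])^{\frac{1}{\varrho}} \, \tilde\omega([u,v] \times [u',v'])^{\frac{1}{\varrho}} , \]
where $\omega = \omega([s,t] \times [s',t'])$ (and similarly for $\tilde\omega$) is a so-called 2D control [FV11]:
super-additive, continuous and zero when s = t or s′ = t′. A possible choice, if
finite, is
\[ \omega([s,t] \times [s',t']) \overset{\mathrm{def}}{=} \sup_{Q \subset [s,t] \times [s',t']} \sum_{[u,v] \times [u',v'] \in Q} \left| R\binom{u,v}{u',v'} \right|^{\varrho} . \tag{10.8} \]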
The difference to (10.5) is that the sup is taken over all (finite) partitions Q of
[s, t]×[s′ , t′ ] into rectangles; not just “grid-like” partitions induced by P ×P ′ . At this
stage it looks like one should change the assumption “covariance of finite ϱ-variation”
to “finite controlled ϱ-variation”, which by definition means $\omega([0,1]^2) < \infty$. But
in fact there is little difference [FV11]: finite controlled ϱ-variation trivially implies
finite ϱ-variation; conversely, finite ϱ-variation implies finite controlled ϱ′-variation
for any ϱ′ > ϱ. Since (10.6) does not depend on ϱ, we may as well (at the price
of replacing ϱ by ϱ′) assume finite controlled ϱ-variation. The Cauchy–Schwarz
inequality for finite sums shows that $\bar\omega := \omega^{1/2} \tilde\omega^{1/2}$ is again a 2D control; the above
where we used the facts that |P| ↓ 0, ϱ < 2 and super-additivity of ω̄ to obtain
the last inequality. This is precisely the required bound. The second argument
makes use of Riemann-Stieltjes theory, applicable after mollification of X̃, and a
uniformity property of ϱ-variation upon mollification. Let us thus denote by $\tilde X^n := \tilde X * f_n$
the convolution of $t \mapsto \tilde X_t$ with $(f_n)$, a family of smooth, compactly supported
probability density functions, weakly convergent to a Dirac at 0. Writing $\tilde R^n_{s,t} := \mathbb{E}[\tilde X^n_s \tilde X^n_t]$
for the covariance of $\tilde X^n$, and also $\tilde S^n_{s,t} := \mathbb{E}[\tilde X_s \tilde X^n_t]$ for the “mixed”
covariance, we leave the fact that
as an easy exercise for the reader. (Hint: Note $\tilde R^n = \tilde R * (f_n \otimes f_n)$, $\tilde S^n = \tilde R * (\delta \otimes f_n)$;
estimate then the rectangular increments of $\tilde R^n$, respectively $\tilde S^n$, to the
power ϱ with Jensen’s inequality.)
Since $\tilde X^n$ has finite variation sample paths, basic Riemann–Stieltjes theory implies
\[ \int_{\mathcal{P}} X_{0,r} \, d\tilde X^n_r \to \int X_{0,r} \, d\tilde X^n_r \quad \text{as } |\mathcal{P}| \to 0. \tag{10.10} \]
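In fact, this convergence (n fixed) also takes place in $L^2$, which may be seen as a consequence
of Lemma 10.2. On the other hand, pick ϱ′ ∈ (ϱ, 2) and apply Lemma 10.2
to obtain³
\[ \sup_{\mathcal{P}} \left| \int_{\mathcal{P}} X_{0,r} \, d\tilde X_r - \int_{\mathcal{P}} X_{0,r} \, d\tilde X^n_r \right|^2_{L^2} \le C \, \|R_X\|_{\varrho';[0,1]^2} \, \|R_{\tilde X - \tilde X^n}\|_{\varrho';[0,1]^2} \]
\[ \le C \, \|R_X\|_{\varrho';[0,1]^2} \, \|R_{\tilde X - \tilde X^n}\|^{\varrho/\varrho'}_{\varrho;[0,1]^2} \, \|R_{\tilde X - \tilde X^n}\|^{1-\varrho/\varrho'}_{\infty;[0,1]^2} , \tag{10.11} \]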
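where C = C(ϱ). Now ϱ′ > ϱ implies $\|R_X\|_{\varrho';[0,1]^2} \le \|R_X\|_{\varrho;[0,1]^2}$ (immediate
consequence of $|x|_{\varrho'} \le |x|_{\varrho} \equiv (\sum_{i=1}^m |x_i|^{\varrho})^{1/\varrho}$ on $\mathbb{R}^m$) and thanks to (10.9) we
also have the (uniform in n) estimate
\[ \|R_{\tilde X - \tilde X^n}\|_{\varrho;[0,1]^2} \le C_\varrho \left( \|R_{\tilde X}\|_{\varrho;[0,1]^2} + 2 \|\tilde S^n\|_{\varrho;[0,1]^2} + \|R_{\tilde X^n}\|_{\varrho;[0,1]^2} \right) \le 4 C_\varrho \|\tilde R\|_{\varrho;[0,1]^2} . \]
³ Define $|f|_{\infty;[0,1]^2} = \sup \left| f\binom{u,v}{u',v'} \right|$ where the sup is taken over all $[u,v], [u',v'] \subset [0,1]$.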
Note that there was nothing special about the time horizon [0, 1] in the above
discussion. Indeed, given any time horizon [s, t] of interest, it suffices to apply the
same argument to the process $(X_{s+\tau(t-s)} : 0 \le \tau \le 1)$. Since variation norms are
conveniently invariant under reparametrisation, (10.7) translates immediately to an
estimate of the form
\[ \mathbb{E}\left[ \left( \int_s^t X_{s,r} \, d\tilde X_r \right)^2 \right] \le C \, \|R\|_{\varrho;[s,t]^2} \, \|\tilde R\|_{\varrho;[s,t]^2} , \tag{10.12} \]
first for the approximating Riemann–Stieltjes sums and then for their L2 -limits.
and then also (the algebraic conditions (2.1) and (2.6) leave no other choice!)
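\[ \mathbb{X}^{i,i}_{s,t} := \frac{1}{2} \left( X^i_{s,t} \right)^2 \quad \text{and} \quad \mathbb{X}^{j,i}_{s,t} := -\mathbb{X}^{i,j}_{s,t} + X^i_{s,t} X^j_{s,t} . \tag{10.14} \]
Then, the following properties hold:
a) For every q ∈ [1, ∞) there exists $C_1 = C_1(q, \varrho, d, T)$ such that for all 0 ≤ s ≤ t ≤ T,
\[ \mathbb{E}\left( |X_{s,t}|^{2q} + |\mathbb{X}_{s,t}|^{q} \right) \le C_1 M^q |t-s|^{q/\varrho} . \tag{10.15} \]
c) For any $\alpha < \frac{1}{2\varrho}$, with probability one, the pair $(X, \mathbb{X})$ satisfies conditions (2.1),
(2.3) and (2.6). In particular, for $\varrho \in [1, \frac{3}{2})$ and any $\alpha \in (\frac{1}{3}, \frac{1}{2\varrho})$ we have
$(X, \mathbb{X}) \in \mathscr{C}_g^{\alpha}$ almost surely.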
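The algebraic relations behind (10.14) can be checked on a toy example. The sketch below assumes a piecewise-linear path, for which the second level over one linear segment is half the tensor square of the increment, and glues segments together with Chen's relation (2.1). The function name `lift_piecewise_linear` is hypothetical; the result is the canonical lift of a smooth path, not the Gaussian construction of the theorem.

```python
def lift_piecewise_linear(points):
    """Return (X_{0,T}, XX_{0,T}) for the piecewise-linear path through points.

    XX is the d x d matrix of iterated integrals int X_{0,r} (x) dX_r.
    Chen's relation is used segment by segment:
        XX_{0,t} = XX_{0,u} + X_{0,u} (x) X_{u,t} + XX_{u,t},
    with XX_{u,t} = 0.5 * dX (x) dX on a linear segment.
    """
    d = len(points[0])
    x = [0.0] * d                          # first level X_{0,u}
    xx = [[0.0] * d for _ in range(d)]     # second level XX_{0,u}
    for p, q in zip(points, points[1:]):
        dx = [q[i] - p[i] for i in range(d)]
        for i in range(d):
            for j in range(d):
                xx[i][j] += x[i] * dx[j] + 0.5 * dx[i] * dx[j]
        x = [x[i] + dx[i] for i in range(d)]
    return x, xx


# Three sides of the unit square, traversed from the origin.
x, xx = lift_piecewise_linear([(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)])
```

For this path the diagonal entries equal half the squared increments and the off-diagonal entries are tied together exactly as in (10.14).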
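Assume that there exist ϱ ∈ [1, 2) and M ∈ (0, ∞) such that the bounds
\[ \|R_{X^i}\|_{\varrho;[s,t]^2} \le M |t-s|^{1/\varrho} , \quad \|R_{Y^i}\|_{\varrho;[s,t]^2} \le M |t-s|^{1/\varrho} , \quad \|R_{X^i - Y^i}\|_{\varrho;[s,t]^2} \le \varepsilon^2 M |t-s|^{1/\varrho} , \tag{10.17} \]
(Here, $\varrho_\alpha(\mathbf{X}, \mathbf{Y})$ denotes the α-Hölder rough path distance between $\mathbf{X} = (X, \mathbb{X})$
and $\mathbf{Y} = (Y, \mathbb{Y})$ in $\mathscr{C}_g^{\alpha}$.)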
Proof. By scaling we may without loss of generality assume M = 1. As for a) we
note (again) that equivalence of $L^q$- and $L^2$-norm on Wiener–Itô chaos allows us to
reduce our discussion to q = 2. The first level estimate being easy, we focus on
the second level estimate; to this end fix i ̸= j. Since L2 -convergence implies a.s.
convergence along a subsequence there exists (Pn ), with mesh tending to zero, so
that we can use Fatou’s lemma to estimate
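\[ \mathbb{E}\left| \mathbb{Y}^{i,j}_{s,t} - \mathbb{X}^{i,j}_{s,t} \right|^2 = \mathbb{E}\left( \lim_{n \to \infty} \left| \int_{\mathcal{P}_n} \left( Y^i_{s,r} \, dY^j_r - X^i_{s,r} \, dX^j_r \right) \right|^2 \right) \]
\[ \le \liminf_{n} \, \mathbb{E}\left| \int_{\mathcal{P}_n} \left( Y^i_{s,r} \, dY^j_r - X^i_{s,r} \, dX^j_r \right) \right|^2 \le \sup_{\mathcal{P}} \, \mathbb{E}\left| \int_{\mathcal{P}} \left( Y^i_{s,r} \, dY^j_r - X^i_{s,r} \, dX^j_r \right) \right|^2 . \]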
where we estimate the second moment of each term on the right-hand side by the
respective variation norms of the covariances; e.g.
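\[ \mathbb{E}\left| \int_{\mathcal{P}} Y^i_{s,r} \, d(Y - X)^j_r \right|^2 \le C \, \|R_{Y^i}\|_{\varrho;[s,t]^2} \, \|R_{Y^j - X^j}\|_{\varrho;[s,t]^2} \le C \varepsilon^2 |t-s|^{\frac{2}{\varrho}} . \]
\[ \mathbb{E}\left| \mathbb{Y}^{i,i}_{s,t} - \mathbb{X}^{i,i}_{s,t} \right|^2 = \frac{1}{4} \mathbb{E}\left| \left( Y^i_{s,t} \right)^2 - \left( X^i_{s,t} \right)^2 \right|^2 = \frac{1}{4} \mathbb{E}\left[ \left( Y^i_{s,t} - X^i_{s,t} \right)^2 \left( Y^i_{s,t} + X^i_{s,t} \right)^2 \right] , \]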
then conclude with Cauchy–Schwarz.
Regarding b), given the pointwise $L^q$-estimates as stated in a), the $L^q$-estimates
for $\|X - Y\|_\alpha$ and $\|\mathbb{Y} - \mathbb{X}\|_{2\alpha}$ are obtained from Theorem 3.3. The last statement
is then an immediate consequence of the definition of $\varrho_\alpha$. □
Corollary 10.6. As above, let $(X, Y) = (X^1, Y^1, \ldots, X^d, Y^d)$ be a centred continuous
Gaussian process such that $(X^i, Y^i)$ is independent of $(X^j, Y^j)$ when i ≠ j.
Proof. At the price of replacing (X, Y) by the rescaled process $M^{-1/2}(X, Y)$ we
may take M = 1. (The concluding $L^q$-estimate on $\varrho_\alpha(M^{-1/2} X, M^{-1/2} Y)$ is then
readily translated into an estimate on $\varrho_\alpha(X, Y)$, given that we allow the final constant
to depend on M.) Assumption (10.18) then spells out precisely to
\[ \|R_{X^i}\|_{\varrho;[s,t]^2} \le |t-s|^{1/\varrho} , \qquad \|R_{Y^i}\|_{\varrho;[s,t]^2} \le |t-s|^{1/\varrho} \]
\[ \|R_{X^i - Y^i}\|_{\varrho;[s,t]^2} \le C_\varrho \left( \|R_{X^i}\|_{\varrho;[s,t]^2} + 2 \|R_{(X^i, Y^i)}\|_{\varrho;[s,t]^2} + \|R_{Y^i}\|_{\varrho;[s,t]^2} \right) \le 4 C_\varrho |t-s|^{1/\varrho} , \]
\[ \eta := \max\{ \|R_{X^i - Y^i}\|_{\infty;[0,T]^2} : 1 \le i \le d \} \]
for any given $\theta \in (0, \frac{1}{2} - \varrho\alpha)$. At last, take $i^* \in \{1, \ldots, d\}$ as the arg max in the
Lemma 10.8. Assume that σ 2 (·) is concave on [0, h] for some h > 0. Then, one
has non-positive correlation of non-overlapping increments in the sense that, for
0 ≤ s ≤ t ≤ u ≤ v ≤ h,
\[ \mathbb{E}[X_{s,t} X_{u,v}] = R\binom{s,t}{u,v} \le 0 . \]
The first claim now easily follows from concavity, cf. [MR06, Lemma 7.2.7].
To show the second bound, note that Xs,t Xu,v = (a + b + c)b where a = Xs,u ,
b = Xu,v , and c = Xv,t . Applying the algebraic identity
\[ 2(a + b + c)b = (a + b)^2 - a^2 + (c + b)^2 - c^2 \]
where we used that σ 2 (·) is non-decreasing. On the other hand, using (a + b + c)b =
b2 + ab + cb and the non-positive correlation of non-overlapping increments, we
have
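\[ \mathbb{E}[X_{s,t} X_{u,v}] = \mathbb{E}[X_{u,v}^2] + \mathbb{E}[X_{s,u} X_{u,v}] + \mathbb{E}[X_{v,t} X_{u,v}] \le \mathbb{E}[X_{u,v}^2] , \]
\[ |\sigma^2(\tau)| \le L |\tau|^{1/\varrho} . \]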
for all intervals [s, t] with length |t − s| ≤ h and some M = M (ϱ, L) > 0.
Proof. Consider some interval [s, t] with length |t − s| ≤ h. The proof relies on
separating “diagonal” and “off-diagonal” contributions. Let D = {ti }, D′ = {t′j } be
two dissections of [s, t]. For fixed i, we have
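\[ 3^{1-\varrho} \sum_{t'_j \in D'} \left| \mathbb{E} X_{t_i,t_{i+1}} X_{t'_j,t'_{j+1}} \right|^{\varrho} \le 3^{1-\varrho} \left\| \mathbb{E} X_{t_i,t_{i+1}} X_{\cdot} \right\|^{\varrho}_{\varrho\text{-var};[s,t]} \tag{10.21} \]
\[ \le \left\| \mathbb{E} X_{t_i,t_{i+1}} X_{\cdot} \right\|^{\varrho}_{\varrho\text{-var};[s,t_i]} + \left\| \mathbb{E} X_{t_i,t_{i+1}} X_{\cdot} \right\|^{\varrho}_{\varrho\text{-var};[t_i,t_{i+1}]} + \left\| \mathbb{E} X_{t_i,t_{i+1}} X_{\cdot} \right\|^{\varrho}_{\varrho\text{-var};[t_{i+1},t]} . \]
\[ \le 2 \sigma^2(t_{i+1} - t_i) . \]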
The third term is bounded analogously. For the middle term in (10.21) we estimate
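\[ \left\| \mathbb{E} X_{t_i,t_{i+1}} X_{\cdot} \right\|^{\varrho}_{\varrho\text{-var};[t_i,t_{i+1}]} = \sup_{D'} \sum_{t'_j \in D'} \left| \mathbb{E} X_{t_i,t_{i+1}} X_{t'_j,t'_{j+1}} \right|^{\varrho} \]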
where we used the second estimate of Lemma 10.8 for the penultimate bound and
the assumption on σ 2 for the last bound. Using these estimates in (10.21) yields
\[ \sum_{t'_j \in D'} \left| \mathbb{E} X_{t_i,t_{i+1}} X_{t'_j,t'_{j+1}} \right|^{\varrho} \le C |t_{i+1} - t_i| , \]
and (10.20) follows by summing over $t_i$ and taking the supremum over all dissections
of [s, t]. □
Corollary 10.10. Let X = (X 1 , . . . , X d ) be a centred continuous Gaussian process
with independent components such that each X i satisfies the assumption of the
previous theorem, with common values of h, L and ϱ ∈ [1, 3/2). Then X, restricted
to any interval [0, T], lifts to $\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}_g^{\alpha}([0,T], \mathbb{R}^d)$.
Proof. Set $I_n = [(n-1)h, nh]$ so that $[0,T] \subset I_1 \cup I_2 \cup \cdots \cup I_{[T/h]+1}$. On each
interval $I_n$, we may apply Theorem 10.4 to lift $X_n := X|_{I_n}$ to a (random) rough
path $\mathbf{X}_n \in \mathscr{C}_g^{\alpha}(I_n, \mathbb{R}^d)$. The concatenation of $\mathbf{X}_1, \mathbf{X}_2, \ldots$ then yields the desired
rough path lift on [0, T]. □
Example 10.11 (Fractional Brownian motion). Clearly, d-dimensional fractional
Brownian motion $B^H$ with Hurst parameter $H \in (\frac{1}{3}, \frac{1}{2}]$ satisfies the assumptions of
the above theorem / corollary for all components with
\[ \sigma^2(u) = u^{2H} , \]
obviously non-decreasing and concave for $H \le \frac{1}{2}$ and on any time interval [0, T].
This also identifies
\[ \varrho = \frac{1}{2H} \]
and $\varrho < \frac{3}{2}$ translates to $H > \frac{1}{3}$, in which case we obtain a canonical geometric rough
path $\mathbf{B}^H = (B^H, \mathbb{B}^H)$ associated to fBm. In fact, a canonical “level-3” rough path
$\mathbf{B}^H$ can be constructed as long as $\varrho < \varrho^* = 2$, corresponding to H > 1/4, but this
requires level-3 considerations which we do not discuss here (see [FV10b, Ch.15]).
Example 10.12 (Ornstein-Uhlenbeck process). Consider the d-dimensional (station-
ary) OU process, consisting of i.i.d. copies of a scalar Gaussian process X with
covariance
\[ \mathbb{E}[X_s X_t] = K(|t-s|) , \qquad K(u) = \exp(-cu) , \]
where c > 0 is fixed. Note that $\sigma^2(u) = \mathbb{E}X^2_{t,t+u} = \mathbb{E}X^2_{t+u} + \mathbb{E}X^2_t - 2\mathbb{E}[X_t X_{t+u}] = 2[K(0) - K(u)] = 2[1 - \exp(-cu)]$,
so that $\sigma^2(u)$ is indeed increasing and concave.
One also has the bound $\sigma^2(u) = 2[1 - \exp(-cu)] \le 2cu$, which shows that the
assumptions of the above corollary are satisfied with ϱ = 1, L = 2c and arbitrary
h > 0.
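The sign condition of Lemma 10.8 is easy to probe numerically for the two examples above. The sketch below assumes a scalar process with stationary increments, so that the variance of an increment depends only on its length through $\sigma^2$. Under that assumption, expanding squares gives $\mathbb{E}[X_{s,t} X_{u,v}] = \frac{1}{2}[\sigma^2(v-s) + \sigma^2(u-t) - \sigma^2(u-s) - \sigma^2(v-t)]$ for s ≤ t ≤ u ≤ v. All function names are hypothetical.

```python
import math


def increment_cov(sigma2, s, t, u, v):
    """E[X_{s,t} X_{u,v}] for s <= t <= u <= v, assuming stationary increments
    with variance function sigma2(tau) = E[X_{r,r+tau}^2]."""
    return 0.5 * (sigma2(v - s) + sigma2(u - t) - sigma2(u - s) - sigma2(v - t))


def fbm_sigma2(H):
    """Variance function of fractional Brownian motion: tau^(2H)."""
    return lambda tau: tau ** (2.0 * H)


def ou_sigma2(c):
    """Variance function 2 [K(0) - K(tau)] with K(u) = exp(-c u)."""
    K = lambda u: math.exp(-c * u)
    return lambda tau: 2.0 * (K(0.0) - K(tau))


def rho_from_hurst(H):
    """The exponent rho = 1 / (2H) identified in the fBm example."""
    return 1.0 / (2.0 * H)


cov_fbm_rough = increment_cov(fbm_sigma2(0.4), 0.0, 1.0, 1.0, 2.0)   # H < 1/2
cov_brownian = increment_cov(fbm_sigma2(0.5), 0.0, 1.0, 2.0, 3.0)    # H = 1/2
cov_fbm_smooth = increment_cov(fbm_sigma2(0.7), 0.0, 1.0, 1.0, 2.0)  # H > 1/2
cov_ou = increment_cov(ou_sigma2(1.0), 0.0, 1.0, 1.5, 2.0)
```

Disjoint increments come out negatively correlated for H < 1/2 and for the Ornstein–Uhlenbeck variance function, uncorrelated for H = 1/2, and positively correlated for H > 1/2, where $\sigma^2$ is no longer concave.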
10.4 Exercises
Use a Borel–Cantelli argument to show that, also for any θ < 1/2 − α,
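\[ \|B - B^n\|_{\alpha} + \|\mathbb{B} - \mathbb{B}^n\|_{2\alpha} \le C(\omega) \frac{1}{n^{\theta}} . \]
When $\alpha \in (\frac{1}{3}, \frac{1}{2})$, we can conclude convergence in α-Hölder rough path metric, i.e.
\[ \varrho_\alpha((B, \mathbb{B}), (B^n, \mathbb{B}^n)) \to 0 , \]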
This remains true when the mixed derivative is a signed measure, which in turn is the
case when R(s, t) = K(|t − s|) for some C 2 -function K. Indeed, write H and 2δ
for the distributional derivatives of | • |. Formal application of the chain-rule gives
∂t R = K ′ (|t − s|)H(t − s) and then, using |H| ≤ 1 a.s.,
\[ \left| \partial^2_{s,t} R(s,t) \right| \le |K''(|t-s|)| + 2 |K'(|t-s|)| \, \delta(t-s). \]
Integration again over $[s,t]^2 \subset [0,T]^2$ yields
\[ \|R\|_{1\text{-var};[s,t]^2} = \int_{[s,t]^2} \left| \partial^2_{u,v} R(u,v) \right| du \, dv \le \left( T |K''|_{\infty} + 2 |K'(0)| \right) |t-s|. \]
This is easily made rigorous by replacing | • | (and then H, 2δ) by a mollified version,
say $| \bullet |_{\varepsilon}$ (and $H_\varepsilon$, $2\delta_\varepsilon$), noting that variation norms behave in a lower semicontinuous fashion
under pointwise limits; that is
in probability (and Lq , any q < ∞) for any sequence of partitions (Pn ) of [0, T ]
with mesh |Pn | → 0.
(ii) Under the assumptions of (i), show that there exists (Pn ) with |Pn | → 0 so
that, with probability one, the quadratic (co)variation $[X^i, X^j]$, in the sense of
Definition 5.10, vanishes, for any i ≠ j, with i, j ∈ {1, . . . , d}.
Conclude that, with regard to Theorem 10.4, the off-diagonal elements $\mathbb{X}^{i,j}_{s,t}$,
defined as the $L^2$ limit of left-point Riemann–Stieltjes sums, could have been
equivalently defined via mid- or right-point Riemann sums.
(iii) Assume ϱ = 1. Show that, for all i = 1, . . . , d, there exists a sequence $(\mathcal{P}_n)$ with
mesh $|\mathcal{P}_n| \to 0$ so that, with probability one, the quadratic variation $[X^i, X^i]$,
in the sense of Definition 5.10, exists and equals
\[ [X^i]_t := \lim_{\varepsilon \to 0} \sup_{|\mathcal{P}| < \varepsilon} \sum_{\substack{[u,v] \in \mathcal{P} \\ u < t}} \mathbb{E}\left[ \left( X^i_{u,v} \right)^2 \right] . \]
Verify that ϱ = 1 and compute [X]. (This example is related to the stochastic heat
equation, where s, t should be thought of as spatial variables, cf. Lemma 12.30)
Solution. (i) Using Wick’s formula for the expectation of products of centred
Gaussians, namely
On the other hand, at the price of passing to another subsequence also denoted
by P̃n , we have
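\[ \sup_{t \in [0,T]} \Bigg| \sum_{\substack{[u,v] \in \tilde{\mathcal{P}}_n \\ u < t}} \left( X^2_{u,v} - \mathbb{E}(X^2_{u,v}) \right) \Bigg| \to 0 \quad \text{almost surely,} \]
(iv) One has $\mathbb{E}(X^2_{s,t}) = \cosh(-\pi) - \cosh(|t-s| - \pi) = \sinh(\pi) |t-s| + o(|t-s|)$
and so $[X]_t = t \sinh(\pi)$.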
Exercise 10.6 Assume finite 1-variation of the covariance (as e.g. defined in (10.5))
of a zero-mean Gaussian process X and $\mathbb{E}[X^2_{t,t+h}] = f(t)h + o(h)$ as h ↓ 0, for
some f ∈ C([0, T], R). Show that, for every smooth test function φ,
\[ \int_0^T \varphi(t) \frac{X^2_{t,t+h}}{h} \, dt \to \int_0^T \varphi(t) f(t) \, dt \quad \text{as } h \to 0, \]
where the convergence takes place in $L^q$ for any q < ∞ (and hence also in probability).
Solution. Since all types of Lq -convergence are equivalent on the finite Wiener–Itô
chaos (here we only need the chaos up to level 2), it suffices to consider q = 2. A
dissection (tk ) of [0, T ] is given by tk = kh ∧ T . We have
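\[ \sum_k \frac{1}{h} \int_{t_k}^{t_{k+1}} \varphi(t) X^2_{t,t+h} \, dt = \int_0^1 d\theta \sum_k \varphi(t_k + \theta h) X^2_{t_k+\theta h, t_k+\theta h+h} \equiv \int_0^1 \langle \varphi, \mu_{\theta,h} \rangle \, d\theta , \]
where the random measure $\mu_{\theta,h} := \sum_k \delta_{t_k+\theta h} X^2_{t_k+\theta h, t_k+\theta h+h}$ acts on test functions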
uniformly in θ ∈ [0, 1], t ∈ [0, T ]. On the other hand, the Gaussian (or Wick) identity
E(A2 B 2 ) − E[A2 ]E(B 2 ) = 2(E(AB))2 , applied with A = Xtk +θh,tk +θh+h and
B = Xtj +θh,tj +θh+h , gives
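\[ \mathbb{E}\left| F(t) - \bar F(t) \right|^2 = \mathbb{E} F^2(t) - \bar F^2(t) = 2 \sum_{\substack{k : t_k + \theta h \le t \\ j : t_j + \theta h \le t}} \left| R_X\binom{t_k + \theta h, \, t_k + \theta h + h}{t_j + \theta h, \, t_j + \theta h + h} \right|^2 \lesssim \operatorname{osc}\left( R^{2-\varrho}; h \right) \to 0 \quad \text{as } h \to 0 , \]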
in L2 , again uniformly in t and θ. Now, for fixed smooth φ, one has the bound
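\[ \left| \int \varphi(t) \mu_{\theta,h}(dt) - \int \varphi(t) f(t) \, dt \right|^2 = \left| \int \left( \int_0^t f(s) \, ds - \mu_{\theta,h}([0,t]) \right) \dot\varphi(t) \, dt \right|^2 \lesssim \int_0^1 \left| \int_0^t f(s) \, ds - \mu_{\theta,h}([0,t]) \right|^2 dt \]
and so
\[ \mathbb{E}\left| \int \varphi(t) \mu_{\theta,h}(dt) - \int \varphi(t) f(t) \, dt \right|^2 \lesssim \int_0^1 \mathbb{E}\left| \int_0^t f(s) \, ds - \mu_{\theta,h}([0,t]) \right|^2 dt . \]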
10.5 Comments
Classes of Gaussian processes which admit (canonical) lifts to random rough paths
were first studied by Coutin–Qian [CQ02], with focus on fBm with Hurst parameter
H > 1/4. Ledoux, Qian and Zhang [LQZ02] used Gaussian techniques to establish
large deviation and support results for the Brownian rough path; extensions to fractional
Brownian motion were investigated by Millet–Sanz-Solé [MSS06], Feyel and de
la Pradelle [FdLP06], Friz–Victoir [FV07, FV06a]. When H ≤ 1/4, there is no
canonical rough path lift: as noted in [CQ02], the L2 -norm of the area associated to
piecewise linear approximations to fBm diverges. See however the works of Unter-
berger and then Nualart–Tindel [Unt10, NT11]. Parameter estimation for fractional
SDEs via rough paths is studied in Papavasiliou–Ladroue [PL11], see also [DFM16].
The notion of two-dimensional ϱ-variation of the covariance, as adopted in this
chapter, is due to Friz–Victoir, [FV10a], [FV10b, Ch.15], [FV11], and allows for
an elegant and general construction of Gaussian rough paths. It also leads naturally
to useful Cameron–Martin embeddings, see Section 11.1. If restricted to the “diag-
onal”, ϱ-variation of the covariance relates to a classical criterion of Jain–Monrad
[JM83]. The question remains how one checks finite ϱ-variation when faced with a
non-trivial (and even non-explicit, e.g. given as Fourier series) covariance function.
A general criterion based on a certain covariance measure structure (reminiscent of
Kruk, Russo and Tudor [KRT07]) was recently given by Friz, Gess, Gulisashvili
and Riedel [FGGR16], a special case of which is the “concavity criterion” of Theorem 10.9.
Cass–Lim establish a Stratonovich–Skorohod integral formula for Gaussian
rough paths. Multi-level Monte Carlo for Gaussian RDEs is analysed by Bayer et
al. [BFRS16]. Bailleul, Riedel and Scheutzow [BRS17] show that random RDEs
driven by suitable Gaussian rough paths constitute random dynamical systems. It is
interesting to note that many key results for Gaussian rough paths (tail estimate, sup-
port, densities, . . .) can be shown with different tools to hold in a Markovian setting
[CO17, CO18], using the framework of Markovian rough paths [FV08c, FV10b].
Chapter 11
Cameron–Martin regularity and applications
with supremum taken over all partitions of [0, T ] and this constitutes a seminorm
on $C^{p\text{-var}}$. The 1-variation (p = 1) of such a path is of course nothing but its length,
possibly +∞. Hölder regularity implies variation regularity: one has the immediate estimate
\[ \|X\|_{p\text{-var};[0,T]} \le T^{\alpha} \|X\|_{\alpha;[0,T]} . \]
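This estimate can be checked on sampled data. The sketch below assumes p = 1/α and a scalar path observed on a finite grid. It computes the discrete p-variation exactly, by dynamic programming over all sub-partitions of the grid, and compares it with the discrete α-Hölder seminorm on the same grid. The function names are hypothetical, and both quantities are only grid approximations of the continuum norms.

```python
import math


def p_variation(xs, p):
    """Discrete p-variation of the samples xs: the sup over all sub-partitions
    of the index set of the sum of |increment|^p, raised to the power 1/p."""
    best = [0.0] * len(xs)   # best[j] = sup of the sums over partitions ending at j
    for j in range(1, len(xs)):
        best[j] = max(best[i] + abs(xs[j] - xs[i]) ** p for i in range(j))
    return best[-1] ** (1.0 / p)


def holder_seminorm(ts, xs, alpha):
    """Discrete alpha-Hoelder seminorm: max over pairs of |x_t - x_s| / |t - s|^alpha."""
    return max(abs(xs[j] - xs[i]) / (ts[j] - ts[i]) ** alpha
               for i in range(len(ts)) for j in range(i + 1, len(ts)))


T = 2.0
alpha = 0.5
p = 1.0 / alpha
ts = [T * k / 50 for k in range(51)]
xs = [math.sin(5.0 * t) for t in ts]

lhs = p_variation(xs, p)
rhs = T ** alpha * holder_seminorm(ts, xs, alpha)
```

On the grid the inequality follows as in the continuum: every increment is at most the Hölder seminorm times the time step to the power α, and the p-th powers of these bounds sum to at most T times the p-th power of the seminorm.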
is satisfied.
We are now interested in the regularity of Cameron–Martin paths. As in the
last section, X is an Rd -valued, continuous and centred Gaussian process on [0, T ],
realised as $X(\omega) = \omega \in C([0,T], \mathbb{R}^d)$, a Banach space under the uniform norm,
\[ Z = \sum_{i=1}^{d} \int_0^T g^i_s \, dB^i_s \equiv \int_0^T \langle g, dB \rangle . \]
By Itô’s isometry, $h^i_t := \mathbb{E}[Z B^i_t] = \int_0^t g^i_s \, ds$ so that $\dot h = g$ and $\|h\|^2_{\mathcal{H}} := \mathbb{E}[Z^2] = \int_0^T |g_s|^2 \, ds = \|\dot h\|^2_{L^2}$
where | • | denotes Euclidean norm on $\mathbb{R}^d$. Clearly, h is of finite
1-variation, and its length is given by $\|\dot h\|_{L^1}$. On the other hand, Cauchy–Schwarz
shows any h ∈ H is 1/2-Hölder which, in general, “only” implies 2-variation.
The proposition below applies to Brownian motion with ϱ = 1, also recalling that
∥R∥1;[s,t]2 = |t − s| in the Brownian motion case.
scaling), that $\|h\|^2_{\mathcal{H}} := \mathbb{E}[Z^2] = 1$. Let $(t_j)$ be a dissection of [s, t]. Let ϱ′ be the
¹ The case ϱ = 1 may be seen directly by taking $\beta_j = \operatorname{sgn}(h_{t_j,t_{j+1}})$.
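\[ \le \sup_{\beta, \, |\beta|_{l^{\varrho'}} \le 1} \sqrt{ \sum_{j,k} \left\langle \beta_j \otimes \beta_k, \, \mathbb{E}\left( X_{t_j,t_{j+1}} \otimes X_{t_k,t_{k+1}} \right) \right\rangle } \]
\[ \le \sup_{\beta, \, |\beta|_{l^{\varrho'}} \le 1} \sqrt{ \Big( \sum_{j,k} |\beta_j|^{\varrho'} |\beta_k|^{\varrho'} \Big)^{\frac{1}{\varrho'}} \Big( \sum_{j,k} \left| \mathbb{E}\left( X_{t_j,t_{j+1}} \otimes X_{t_k,t_{k+1}} \right) \right|^{\varrho} \Big)^{\frac{1}{\varrho}} } \]
\[ \le \Big( \sum_{j,k} \left| \mathbb{E}\left( X_{t_j,t_{j+1}} \otimes X_{t_k,t_{k+1}} \right) \right|^{\varrho} \Big)^{1/(2\varrho)} \le \sqrt{ \|R\|_{\varrho\text{-var};[s,t]^2} } . \]
The proof is then completed by taking the supremum over all dissections $(t_j)$ of [s, t]. □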
Remark 11.3. It is typical (e.g. for Brownian or fractional Brownian motion, with
ϱ = 1/(2H) ≥ 1) that
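\[ \forall s < t \text{ in } [0,T] : \quad \|R\|_{\varrho\text{-var};[s,t]^2} \le M |t-s|^{1/\varrho} . \]
In such a situation, Proposition 11.2 implies that
\[ |h_{s,t}| \le \|h\|_{\varrho\text{-var};[s,t]} \le \|h\|_{\mathcal{H}} M^{1/2} |t-s|^{1/(2\varrho)} , \]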
(As before, we shall drop [0, T ] from our notation whenever the time horizon is
fixed.) The homogeneous p-variation rough path norm (over [0, T ]) is then given by
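\[ |||\mathbf{X}|||_{p\text{-var};[0,T]} = |||\mathbf{X}|||_{p\text{-var}} \overset{\mathrm{def}}{=} \|X\|_{p\text{-var}} + \sqrt{ \|\mathbb{X}\|_{p/2\text{-var}} } . \tag{11.4} \]
Of course, a geometric rough path of finite p-variation, $\mathbf{X} \in \mathscr{C}_g^{p\text{-var}}$, is one for which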
the “first order calculus” condition (2.6) holds.
The following results will prove crucial in Section 11.2 where we will derive,
based on the Gaussian isoperimetric inequality, good probabilistic estimates on
Gaussian rough path objects. They are equally crucial for developing the Malliavin
calculus for (Gaussian) rough differential equations in Section 11.3.
Recall from Exercise 2.15 that the translation of a rough path $\mathbf{X} = (X, \mathbb{X})$ in
direction h is given by
\[ T_h(\mathbf{X}) \overset{\mathrm{def}}{=} \left( X^h, \mathbb{X}^h \right) \tag{11.5} \]
where $X^h := X + h$ and
\[ \mathbb{X}^h_{s,t} := \mathbb{X}_{s,t} + \int_s^t h_{s,r} \otimes dX_r + \int_s^t X_{s,r} \otimes dh_r + \int_s^t h_{s,r} \otimes dh_r , \tag{11.6} \]
provided that h is sufficiently regular to make the final three integrals above well-defined.
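Formula (11.6) can be sanity-checked in the smooth setting. The sketch below assumes that both X and h are piecewise linear with common vertices, so that all four integrals are ordinary Riemann–Stieltjes integrals which can be evaluated exactly segment by segment. It compares the second level of the lift of X + h with the right-hand side of (11.6) over the whole time interval. The helper names are hypothetical.

```python
def iterated_integral(a, b):
    """Exact matrix int a_{0,r} (x) db_r over the full time interval, for
    piecewise-linear paths a, b given by their values at common vertices."""
    d = len(a[0])
    out = [[0.0] * d for _ in range(d)]
    for k in range(len(a) - 1):
        da = [a[k + 1][i] - a[k][i] for i in range(d)]
        db = [b[k + 1][j] - b[k][j] for j in range(d)]
        for i in range(d):
            # a_{0,r} is linear on the k-th segment; this is its average there
            avg = (a[k][i] - a[0][i]) + 0.5 * da[i]
            for j in range(d):
                out[i][j] += avg * db[j]
    return out


def translate_second_level(X, h):
    """Right-hand side of (11.6) over the full interval, for smooth X and h."""
    terms = [iterated_integral(X, X), iterated_integral(h, X),
             iterated_integral(X, h), iterated_integral(h, h)]
    d = len(X[0])
    return [[sum(term[i][j] for term in terms) for j in range(d)]
            for i in range(d)]


X = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
h = [(0.0, 0.0), (0.5, 0.5), (1.0, 2.0)]
X_plus_h = [tuple(x + y for x, y in zip(p, q)) for p, q in zip(X, h)]

direct = iterated_integral(X_plus_h, X_plus_h)   # second level of the lift of X + h
via_formula = translate_second_level(X, h)       # second level of T_h applied to the lift of X
```

In the smooth case the agreement is nothing but bilinearity of the iterated integral; the content of the rough path statement is that the cross-integrals remain meaningful as Young integrals under complementary regularity.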
Lemma 11.4. i) Let $\mathbf{X} \in \mathscr{C}_g^{p\text{-var}}([0,T], \mathbb{R}^d)$, with p ∈ [2, 3) and consider a function
$h \in C^{q\text{-var}}([0,T], \mathbb{R}^d)$ with complementary Young regularity in the sense
that
\[ 1/p + 1/q > 1 . \]
Then the translation of $\mathbf{X}$ in direction h is well-defined in the sense that the
integrals appearing in (11.6) are well-defined Young integrals and $T_h : \mathbf{X} \mapsto T_h(\mathbf{X})$
maps $\mathscr{C}_g^{p\text{-var}}([0,T], \mathbb{R}^d)$ into itself. Moreover, one has the estimate, for
some constant C = C(p, q),
\[ |||T_h(\mathbf{X})|||_{p\text{-var}} \le C \left( |||\mathbf{X}|||_{p\text{-var}} + \|h\|_{q\text{-var}} \right) . \]
Let $\alpha \in (\frac{1}{3}, \frac{1}{2\varrho}]$ and $\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}^{\alpha}([0,T], \mathbb{R}^d)$ a.s. be the random Gaussian
rough path constructed in Theorem 10.4. Then there exists a null set N such that for
every $\omega \in N^c$ and every $h \in \mathcal{H}$,
\[ T_h(\mathbf{X}(\omega)) = \mathbf{X}(\omega + h) . \]
Proof. Note that complementary Young regularity holds, with $p = \frac{1}{\alpha} < 3$ and
$q = \varrho < \frac{3}{2}$, as is seen from $\frac{1}{p} + \frac{1}{q} > \frac{1}{3} + \frac{2}{3} = 1$. As a consequence of Lemma 11.4,
the translation $T_h(\mathbf{X}(\omega))$ is well-defined whenever $\mathbf{X}(\omega) \in \mathscr{C}^{\alpha}$. The proof requires
a close look at the precise construction of $\mathbf{X}(\omega) = (X(\omega), \mathbb{X}(\omega))$ in Theorem 10.4,
using Kolmogorov’s criterion to build a suitable (continuous, and then Hölder) modification
from $\mathbb{X}$ restricted to dyadic times. We recall that $X(\omega) = \omega \in C([0,T], \mathbb{R}^d)$.
Let $N_1$ be the null set of ω where X(ω) fails to be of α-Hölder (or p-variation)
regularity. Note that $\omega \in N_1^c$ implies $\omega + h \in N_1^c$ for all $h \in \mathcal{H}$. By the very
construction of $\mathbb{X}_{s,t}$ as an $L^2$-limit, for fixed s, t there exists a sequence of partitions
$(\mathcal{P}^m)$ of [s, t] such that $\mathbb{X}_{s,t}(\omega) = \lim_m \int_{\mathcal{P}^m} X \otimes dX$ exists for a.e. ω, and we write
$N_{2;s,t}$ for the null set on which this fails. The union of all these, for dyadic
times s, t, is again a null set, denoted by $N_2$. Now take $\omega \in N_1^c \cap N_2^c$. For fixed
dyadic s, t, consider the aforementioned partitions $(\mathcal{P}^m)$ and note
\[ \int_{\mathcal{P}^m} X(\omega + h) \otimes dX(\omega + h) = \int_{\mathcal{P}^m} X(\omega) \otimes dX(\omega) + \int_{\mathcal{P}^m} h \otimes dX + \int_{\mathcal{P}^m} X \otimes dh + \int_{\mathcal{P}^m} h \otimes dh . \]
for the cumulative distribution function of a standard Gaussian, noting the elementary
tail estimate
\[ \bar\Phi(y) := 1 - \Phi(y) \le \exp\left( -y^2/2 \right) , \qquad y \ge 0. \]
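A quick numerical check of this tail estimate, using the identity $\bar\Phi(y) = \frac{1}{2} \operatorname{erfc}(y/\sqrt{2})$ and the complementary error function from the Python standard library. The grid of test points and the helper name `gaussian_tail` are arbitrary choices made for illustration.

```python
import math


def gaussian_tail(y):
    """Upper tail 1 - Phi(y) of a standard Gaussian, via erfc."""
    return 0.5 * math.erfc(y / math.sqrt(2.0))


# Compare the tail with exp(-y^2 / 2) on a grid of points in [0, 10].
grid = [0.05 * k for k in range(201)]
bound_holds = all(gaussian_tail(y) <= math.exp(-0.5 * y * y) for y in grid)
```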
Theorem 11.6 (Borell’s inequality). Let (E, H, µ) be an abstract Wiener space and
A ⊂ E a measurable Borel set with µ(A) > 0 so that
\[ A_a = \{ x : g(x) \le a \} \]
Assume furthermore that there exists a null-set N such that for all x ∈ N c , h ∈ H :
Then f has a Gaussian tail. More precisely, for all r > a and with ā := â − a/σ,
Proof. Note that $\mu(A_a) > 0$ implies $\hat a = \Phi^{-1}(\mu(A_a)) > -\infty$. We have for all
$x \notin N$ and arbitrary r, M > 0 and $h \in r\mathcal{K}$,
Example 11.8 (Classical Fernique estimate). Take f (x) = g(x) = ∥x∥E . Then the
assumptions of the generalised Fernique Theorem are satisfied with σ equal to the
operator norm of the continuous embedding H ,→ E. This applies in particular to
Wiener measure on $C([0,T], \mathbb{R}^d)$.
Remark 11.10. Recall that the homogeneous “norm” $|||\mathbf{X}|||_\alpha$ was defined in (2.4) as
the sum of $\|X\|_\alpha$ and $\sqrt{\|\mathbb{X}\|_{2\alpha}}$. Since $\mathbb{X}$ is “quadratic” in X (more precisely: in the
second Wiener–Itô chaos), the square root is crucial for the Gaussian estimate (11.9)
to hold.
Proof. Combining Theorem 11.5 with Lemma 11.4 and Proposition 11.2 shows that
for a.e. ω and all h ∈ H
\[ |||\mathbf{X}(\omega)|||_\alpha \le C \left( |||\mathbf{X}(\omega - h)|||_\alpha + M^{1/2} \|h\|_{\mathcal{H}} \right) . \]
We can thus apply the generalised Fernique Theorem with $f(\omega) = |||\mathbf{X}|||_\alpha(\omega)$ and
g(ω) = Cf(ω), noting that $|||\mathbf{X}|||_\alpha(\omega) < \infty$ almost surely implies that
\[ A_a \overset{\mathrm{def}}{=} \{ x : g(x) \le a \} \]
has positive probability for a large enough (and in fact, any a > 0 thanks to a
support theorem for Gaussian rough paths, [FV10b]). Gaussian integrability of the
homogeneous rough path norm, for a fixed Gaussian rough path $\mathbf{X}$, is thus established.
The claimed uniformity, η = η(M, T, α, ϱ) and not depending on the particular $\mathbf{X}$
under consideration, requires an additional argument. We need to make sure that
$\mu(A_a)$ is uniformly positive over all $\mathbf{X}$ with given bounds on the parameters (in
particular M, ϱ, a, d); but this is easy, using (10.16),
\[ \mu\left( |||\mathbf{X}|||_\alpha \le a \right) \ge 1 - \frac{1}{a^2} \mathbb{E}|||\mathbf{X}|||^2_\alpha \ge 1 - \frac{1}{a^2} C , \]
where C = C(M, ϱ, α, d) and so, say, $a = \sqrt{2C}$ would do. □
The price of a pathwise integration / SDE theory is that all estimates (have to) deal
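with the worst possible scenario. To wit, given $\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}_g^{\alpha}$ and a nice 1-form,
$F \in C_b^2$ say, we had the estimate
\[ \left| \int_0^T F(X) \, d\mathbf{X} \right| \le C \left( |||\mathbf{X}|||_{\alpha;[0,T]} \vee |||\mathbf{X}|||^{1/\alpha}_{\alpha;[0,T]} \right) , \]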
reparametrisation. For the same reason, the integration domain [0, T ] in (11.10) may
be replaced by any other interval.
Example 11.11. The estimate (11.10) is sharp, at least when p = 1/α = 2, in the
following sense. Consider the (“pure-area”) rough path given by
\[ t \mapsto (0, At) , \qquad A = \begin{pmatrix} 0 & c \\ -c & 0 \end{pmatrix} , \]
for some c > 0. The homogeneous (p-variation, or α-Hölder) rough path norm here
scales with c1/2 . Hence, the right-hand side of (11.10) scales like c (for c large), as
does the left-hand side which in fact is given by T |DF (0)A|.
\[ |||\mathbf{X}|||_{p\text{-var};[\tau_i,\tau_{i+1}]} = 1, \]
i.e. for all but the very last interval for which one has $|||\mathbf{X}|||_{p\text{-var};[\tau_N,\tau_{N+1}]} \le 1$. One
can then exploit rough path estimates such as (11.10) on (small) intervals [τi , τi+1 ]
on which estimates are linear in |||X|||p-var ∼ 1. The problem of estimating rough
integrals is thus reduced to estimating N = N (X) and it was a key technical result
in [CLL13] to use Borell’s inequality to establish good (probabilistic) estimates on
N when X = X(ω) is a Gaussian rough path. (Our proof below is different from
[CLL13] and makes good use of the generalised Fernique estimate.)
To formalise this construction, we fix a (1D) control function w = w(s, t), i.e.
a continuous map on {0 ≤ s ≤ t ≤ T}, super-additive, continuous and zero on the
⁵ The construction is purely deterministic. Of course, when $\mathbf{X} = \mathbf{X}(\omega)$ is random, then so is the
partition.
so that $w(\tau_i, \tau_{i+1}) = \beta$ for all i < N, while $w(\tau_N, \tau_{N+1}) \le \beta$, where N is given
by
\[ N(w) \equiv N_\beta(w; [0,T]) := \sup\{ i \ge 0 : \tau_i < T \}. \]
As immediate consequence of super-additivity of controls,
\[ \beta N_\beta(w; [0,T]) = \sum_{i=0}^{N-1} w(\tau_i, \tau_{i+1}) \le w(0, \tau_N) \le w(0, \tau_{N+1}) = w(0, T). \]
Note also that N is monotone in w, i.e. $w \le \tilde w$ implies $N(w) \le N(\tilde w)$. At last, let us
set $N(\mathbf{X}) = N(w_{\mathbf{X}})$. The following (purely deterministic) lemma is most naturally
stated in variation regularity.
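The greedy construction of the times $\tau_i$ can be sketched as follows, assuming a control that can be evaluated as a function `w(s, t)` and is continuous and non-decreasing in t. Each new point is located by bisection as (approximately) the first time at which the control accumulated since the previous point reaches β, capped at T. The name `greedy_partition` and the bisection depth are hypothetical choices; the toy control w(s, t) = t − s is used only as a check.

```python
def greedy_partition(w, T, beta, iters=60):
    """Times 0 = tau_0 < tau_1 < ... < tau_{N+1} = T such that
    w(tau_i, tau_{i+1}) is (numerically) beta for i < N and at most beta
    on the last interval. Assumes t -> w(s, t) is continuous and
    non-decreasing with w(s, s) = 0."""
    taus = [0.0]
    while taus[-1] < T:
        s = taus[-1]
        if w(s, T) <= beta:
            taus.append(T)          # last interval: control at most beta
            break
        lo, hi = s, T               # invariant: w(s, lo) < beta <= w(s, hi)
        for _ in range(iters):
            mid = 0.5 * (lo + hi)
            if w(s, mid) >= beta:
                hi = mid
            else:
                lo = mid
        taus.append(hi)
    return taus


w = lambda s, t: t - s              # toy (additive) control
T, beta = 1.0, 0.3
taus = greedy_partition(w, T, beta)
N = len(taus) - 2                   # N = sup{i >= 0 : tau_i < T}
```

With these toy values the partition is approximately 0, 0.3, 0.6, 0.9, 1, so N = 3 and βN = 0.9 ≤ w(0, T) = 1, in line with the super-additivity bound above.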
Proof. (Riedel) It is easy to see that all Nβ , Nβ ′ , with β, β ′ > 0 are comparable, it
is therefore enough to prove the lemma for some fixed β > 0.
q
Given h ∈ C q-var , wh (s, t) = |||h|||q-var;[s,t] is a control and so is whθ whenever
θ ≥ 1. (Noting 1 ≤ q ≤ p, we shall use this fact with θ = p/q.) From Lemma 11.4
we have, for any interval I
p p
(s, t) 7→ |||Th X|||p-var;[s,t] ≤ C |||X|||p-var;[s,t] + ∥h∥pq-var;[s,t] =: C w̃(s, t) .
Footnote 6: Do not confuse a control w with “randomness” ω.
Footnote 7: Super-additivity, i.e. $w(s,t) + w(t,u) \le w(s,u)$ whenever $s \le t \le u$, is immediate, but
continuity is non-trivial, see e.g. [FV10b, Prop. 5.8].
Replace $\mathbf{X} = T_h T_{-h}\mathbf{X}$ by $T_{-h}\mathbf{X}$ and then use elementary estimates of the type
$(a+b)^{1/q} \le a^{1/q} + b^{1/q}$ for non-negative reals a, b, to obtain the claimed estimate
(11.12). □
The previous lemma, combined with variation regularity of Cameron–Martin
paths (Proposition 11.2) and the generalised Fernique Theorem 11.7 then gives
immediately
Theorem 11.13 (Cass–Litterer–Lyons). Let $\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}_g^\alpha$ a.s. be a Gaussian
rough path, as in Theorem 11.9. (In particular, the covariance is assumed to have
finite 2D ϱ-variation.) Then the integer-valued random variable
has a Weibull tail with shape parameter $2/\varrho$ (by which we mean that $N^{1/\varrho}$ has a
Gaussian tail).
Let us quickly illustrate how to use the above estimate.
Corollary 11.14. Let $\mathbf{X}$ be as in the previous theorem and assume $F \in C_b^2$. Then the
random rough integral
$$Z(\omega) \overset{\text{def}}{=} \int_0^T F(X(\omega))\,d\mathbf{X}(\omega)$$
has a Weibull tail with shape parameter $2/\varrho$, by which we mean that $|Z|^{1/\varrho}$ has a
Gaussian tail.
Proof. Let $(\tau_i)$ be the (random) partition associated to the p-variation of $\mathbf{X}(\omega)$ as
defined in (11.11), with $\beta = 1$ and $w = w_{\mathbf{X}}$. Thanks to (11.10) we may estimate
$$\Big|\int_0^T F(X(\omega))\,d\mathbf{X}(\omega)\Big| \le \sum_{[\tau_i,\tau_{i+1}]\in\mathcal{P}} \Big|\int_{\tau_i}^{\tau_{i+1}} F(X(\omega))\,d\mathbf{X}(\omega)\Big|$$
$$\lesssim (N(\omega)+1)\,\sup_i\Big(|||\mathbf{X}|||_{p\text{-var};[\tau_i,\tau_{i+1}]} \vee |||\mathbf{X}|||^p_{p\text{-var};[\tau_i,\tau_{i+1}]}\Big)$$
$$= (N(\omega)+1),$$
where the proportionality constant may depend on F, T and $\alpha \in (\tfrac13, \tfrac{1}{2\varrho})$. □
In this section, we assume that the reader is already familiar with the basics of
Malliavin calculus as exposed for example in the monographs [Mal97, Nua06].
Consider some abstract Wiener space (W, H, µ) and a Wiener functional of the form
$F : W \to \mathbf{R}^e$. In the context of stochastic – or rough – differential equations driven
by Gaussian signals, the Banach space W is of the form $C([0,T],\mathbf{R}^d)$ where µ
describes the statistics of the driving noise. If F denotes the solution to a stochastic
differential equation at some time $t \in (0,T]$, then, in general, F is not a continuous,
let alone Fréchet regular, function of the driving path. However, as we will see in this
section, it can be the case that for µ-almost every ω, the map $H \ni h \mapsto F(\omega + h)$, i.e.
$F(\omega + \cdot)$ restricted to the Cameron–Martin space $(H, \langle\cdot,\cdot\rangle)$, is Fréchet differentiable.
(This implies $\mathbb{D}^{1,p}_{\mathrm{loc}}$-regularity, based on the commonly used Shigekawa Sobolev space
$\mathbb{D}^{1,p}$; our notation here follows [Mal97] or [Nua06, Sec. 1.2, 1.3.4].) More precisely,
we introduce the following notion, see for example [Nua06, Sec. 4.1.3]:
Definition 11.15. Given an abstract Wiener space (W, H, µ), a random variable
$F : W \to \mathbf{R}$ is said to be continuously H-differentiable, in symbols $F \in C^1_H$, if for
µ-almost every ω, the map
$$H \ni h \mapsto F(\omega + h)$$
The following well-known criterion of Bouleau–Hirsch, see [BH91, Thm 5.2.2] and
[Nua06, Sec. 1.2, 1.3.4] then provides a condition under which the law of F has a
density with respect to Lebesgue measure:
Remark 11.17. Higher-order differentiability, together with control of inverse moments
of M, allows one to strengthen this result and obtain smoothness of this density.
As beautifully explained in his own book [Mal97], Malliavin realised that the
strong solution to the stochastic differential equation
$$dY_t = \sum_{i=1}^d V_i(Y_t)\circ dB_t^i, \tag{11.14}$$
$$\mathrm{Lie}\{V_1,\dots,V_d\}\big|_{y_0} = \mathbf{R}^e. \tag{H}$$
(Here, Lie V denotes the Lie algebra generated by a collection V of smooth vector
fields.) There are many variations on this theme: one can include a drift vector
field (which gives rise to a modified Hörmander condition), and under the same
assumptions one can show that $Y_T$ admits a smooth density. This result can also
(and was originally, see [Hör67, Koh78]) be obtained by using purely functional
analytic techniques, exploiting the fact that the density solves Kolmogorov’s forward
equation. On the other hand, Malliavin’s approach is purely stochastic and allows one to
go beyond the Markovian / PDE setting. In particular, we will see that it is possible
to replace B by a somewhat generic sufficiently non-degenerate Gaussian process,
with the interpretation of (11.14) as a random RDE driven by some Gaussian rough
path X rather than Brownian motion.
for any α-Hölder geometric driving rough path $\mathbf{X} = (X,\mathbb{X}) \in \mathscr{C}_g^{0,\alpha}$, which may
be obtained as limit of smooth, or piecewise smooth, paths in α-Hölder rough path
metric. Set p = 1/α. Recall that, thanks to continuity of the Itô–Lyons map, RDE
solutions are limits of the corresponding ODE solutions.
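The unique RDE solution (11.15) passing through $Y_{t_0} = y_0$ gives rise to the
solution flow $y_0 \mapsto U^{\mathbf{X}}_{t\leftarrow t_0}(y_0) = Y_t$. We call the derivative of the flow with respect
to the starting point the Jacobian and denote it by $J^{\mathbf{X}}_{t\leftarrow t_0}$, so that
$$J^{\mathbf{X}}_{t\leftarrow t_0}\,a = \frac{d}{d\varepsilon} U^{\mathbf{X}}_{t\leftarrow t_0}(y_0 + \varepsilon a)\Big|_{\varepsilon=0}.$$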
$$D_h U^{\mathbf{X}}_{t\leftarrow 0} = \frac{d}{d\varepsilon} U^{T_{\varepsilon h}\mathbf{X}}_{t\leftarrow 0}\Big|_{\varepsilon=0},$$
for any sufficiently smooth path $h : \mathbf{R}_+ \to \mathbf{R}^d$. Recall that the translation operator
$T_h$ was defined in (11.5). In particular, we have seen in Lemma 11.4 that, if $\mathbf{X}$ arises
from a smooth path X together with its iterated integrals, then the translated rough
path $T_h\mathbf{X}$ is nothing but X + h together with its iterated integrals. In the general case,
given $h \in C^{q\text{-var}}$ of complementary Young regularity, i.e. with 1/p + 1/q > 1, the
translation $T_h\mathbf{X}$ can be written in terms of $\mathbf{X}$ and cross-integrals between X and h.
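Suppose for a moment that the rough path $\mathbf{X}$ is the canonical lift of a smooth
$\mathbf{R}^d$-valued path X. Then, it is classical to prove that $J^{\mathbf{X}}_{t\leftarrow t_0} = J^{X}_{t\leftarrow t_0}$, where $J^{X}_{t\leftarrow t_0}$
solves the linear ODE
$$dJ^{X}_{t\leftarrow t_0} = \sum_{i=1}^d DV_i(Y_t)\,J^{X}_{t\leftarrow t_0}\,dX^i_t, \tag{11.16}$$
and satisfies $J^{X}_{t_2\leftarrow t_0} = J^{X}_{t_2\leftarrow t_1}\cdot J^{X}_{t_1\leftarrow t_0}$. Furthermore, the variation of constants
formula leads to
$$D_h U^{X}_{t\leftarrow 0} = \int_0^t \sum_{i=1}^d J^{X}_{t\leftarrow s} V_i(Y_s)\,dh^i_s. \tag{11.17}$$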
where [V, W ] denotes the Lie bracket between the vector fields V and W . All this
extends to the rough path limit without difficulties. For instance, (11.16) can be
interpreted as a linear equation driven by the rough path X, using the fact that
$DV(Y)$ is controlled by X to give meaning to the equation. It is then still the case
that $J^{\mathbf{X}}_{t\leftarrow t_0}$ is the derivative of the flow associated to (11.15) with respect to its initial
condition.
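For a smooth driver, both (11.16) and the variation of constants formula (11.17) can be checked numerically. The following is a minimal sketch, not taken from the text: the vector fields V_1, V_2 on R^2, the driver X, the direction h, the starting point and the step sizes are all hypothetical choices, and a hand-written RK4 scheme stands in for an ODE solver. It compares a central finite difference of ε ↦ U^{X+εh}_{T←0}(y_0) with the integral in (11.17), using J_{T←s} = J_{T←0} (J_{s←0})^{-1}, which follows from the multiplicative property.

```python
import numpy as np

def V(y):
    # Columns are V_1(y) = (1, sin y_0) and V_2(y) = (-y_1, y_0).
    return np.array([[1.0, -y[1]],
                     [np.sin(y[0]), y[0]]])

def DV(y):
    # Jacobian matrices DV_1(y), DV_2(y) of the two vector fields.
    return [np.array([[0.0, 0.0], [np.cos(y[0]), 0.0]]),
            np.array([[0.0, -1.0], [1.0, 0.0]])]

def xdot(t):   # derivative of the smooth driver X_t = (cos t, sin 2t)
    return np.array([-np.sin(t), 2.0 * np.cos(2.0 * t)])

def hdot(t):   # derivative of the perturbation h_t = (t^2, t)
    return np.array([2.0 * t, 1.0])

def rhs(t, y, J, drive):
    v = drive(t)
    A = sum(D * vi for D, vi in zip(DV(y), v))   # sum_i DV_i(y) dX^i/dt
    return V(y) @ v, A @ J

def solve(drive, y0, T=1.0, n=1000):
    """RK4 for dY = V(Y) dX and dJ = sum_i DV_i(Y) J dX^i, with J_0 = I."""
    dt = T / n
    Y = np.empty((n + 1, 2))
    J = np.empty((n + 1, 2, 2))
    Y[0], J[0] = y0, np.eye(2)
    for k in range(n):
        t, y, j = k * dt, Y[k], J[k]
        a1, b1 = rhs(t, y, j, drive)
        a2, b2 = rhs(t + dt / 2, y + dt / 2 * a1, j + dt / 2 * b1, drive)
        a3, b3 = rhs(t + dt / 2, y + dt / 2 * a2, j + dt / 2 * b2, drive)
        a4, b4 = rhs(t + dt, y + dt * a3, j + dt * b3, drive)
        Y[k + 1] = y + dt / 6 * (a1 + 2 * a2 + 2 * a3 + a4)
        J[k + 1] = j + dt / 6 * (b1 + 2 * b2 + 2 * b3 + b4)
    return Y, J

T, n, y0, eps = 1.0, 1000, np.array([0.3, -0.2]), 1e-5
ts = np.linspace(0.0, T, n + 1)
Y, J = solve(xdot, y0, T, n)

# Right-hand side of (11.17), integrated with the trapezoidal rule.
f = np.array([J[-1] @ np.linalg.solve(J[k], V(Y[k]) @ hdot(ts[k]))
              for k in range(n + 1)])
formula = (T / n) * (f[0] / 2 + f[1:-1].sum(axis=0) + f[-1] / 2)

# Central finite difference of the flow in the direction h.
Yp, _ = solve(lambda t: xdot(t) + eps * hdot(t), y0, T, n)
Ym, _ = solve(lambda t: xdot(t) - eps * hdot(t), y0, T, n)
finite_diff = (Yp[-1] - Ym[-1]) / (2 * eps)
print(formula, finite_diff)
```

The two printed vectors should agree up to discretisation error; this only illustrates the smooth case, the rough case being obtained by the limiting argument described in the text.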
$$D_h U^{\mathbf{X}}_{t\leftarrow 0}(y_0) = \int_0^t \sum_{i=1}^d J^{\mathbf{X}}_{t\leftarrow s} V_i\big(U^{\mathbf{X}}_{s\leftarrow 0}\big)\,dh^i_s \tag{11.19}$$
Consider now an RDE driven by a Gaussian rough path X = X(ω). We now show
that the $\mathbf{R}^e$-valued random variable obtained from solving this random RDE enjoys
$C^1_H$-regularity.
Proposition 11.19. With $\varrho \in [1, \tfrac32)$ and $\alpha \in (\tfrac13, \tfrac{1}{2\varrho})$, let $\mathbf{X} = (X,\mathbb{X}) \in \mathscr{C}_g^\alpha$ be a
Gaussian rough path as constructed in Theorem 10.4. For fixed $t \ge 0$, the $\mathbf{R}^e$-valued
random variable
$$\omega \mapsto U^{\mathbf{X}(\omega)}_{t\leftarrow 0}(y_0)$$
is continuously H-differentiable.
Proof. Recall $h \in H \subset C^{\varrho\text{-var}}$ so that a.e. $X(\omega)$ and h enjoy complementary Young
regularity. As a consequence, we saw that the event
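Note that $s \mapsto Z_{i,s}$ is of finite p-variation, with p = 1/α. We have, with implicit
summation over i,
$$\big|D_h U^{\mathbf{X}}_{t\leftarrow 0}(y_0)\big| = \Big|\int_0^t J^{\mathbf{X}}_{t\leftarrow s} V_i\big(U^{\mathbf{X}}_{s\leftarrow 0}\big)\,dh^i_s\Big| = \Big|\int_0^t Z_i\,dh^i\Big|$$
$$\lesssim \big(\|Z\|_{p\text{-var}} + |Z(0)|\big) \times \|h\|_{\varrho\text{-var}}$$
$$\lesssim \big(\|Z\|_{p\text{-var}} + |Z(0)|\big) \times \|h\|_H.$$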
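Hence, the linear map $DU^{\mathbf{X}}_{t\leftarrow 0}(y_0) : h \mapsto D_h U^{\mathbf{X}}_{t\leftarrow 0}(y_0) \in \mathbf{R}^e$ is bounded and each
component is an element of $H^*$. We just showed that
$$h \mapsto \frac{d}{d\varepsilon} U^{T_{\varepsilon h}\mathbf{X}(\omega)}_{t\leftarrow 0}(y_0)\Big|_{\varepsilon=0} = \Big\langle DU^{\mathbf{X}(\omega)}_{t\leftarrow 0}(y_0), h\Big\rangle_H$$
and hence
$$h \mapsto \frac{d}{d\varepsilon} U^{\mathbf{X}(\omega+\varepsilon h)}_{t\leftarrow 0}(y_0)\Big|_{\varepsilon=0} = \Big\langle DU^{\mathbf{X}(\omega)}_{t\leftarrow 0}(y_0), h\Big\rangle_H$$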
emphasizing again that $\mathbf{X}(\omega + h) \equiv T_h\mathbf{X}(\omega)$ almost surely for all $h \in H$ simultaneously.
Repeating the argument with $T_g\mathbf{X}(\omega) = \mathbf{X}(\omega + g)$ shows that the Gâteaux
differential of $U^{\mathbf{X}(\omega+\cdot)}_{t\leftarrow 0}$ at $g \in H$ is given by
$$DU^{\mathbf{X}(\omega+g)}_{t\leftarrow 0} = DU^{T_g\mathbf{X}(\omega)}_{t\leftarrow 0}.$$
(b) It remains to be seen that $g \in H \mapsto DU^{T_g\mathbf{X}(\omega)}_{t\leftarrow 0} \in L(H,\mathbf{R}^e)$, the space of linear
bounded maps equipped with operator norm, is continuous. We leave this as an exercise
to the reader, cf. Exercise 11.4 below. □
Condition 2 With probability one, sample paths of X are truly rough, at least in a
right-neighbourhood of 0.
These conditions obviously hold for d-dimensional Brownian motion: the first
condition is satisfied because 0 is the only (continuous) function orthogonal to all of
$L^2([0,T],\mathbf{R}^d)$; the second condition was already verified in Section 6.3. More interestingly,
these conditions are very robust and also hold for the Ornstein–Uhlenbeck
process, a Brownian bridge which returns to the origin at a time strictly greater than
T , and some non-semimartingale examples such as fractional Brownian motion,
including the rough regime of Hurst parameter less than 1/2. We now show that
under these conditions the process admits a density at strictly positive times. Note
that the aforementioned situations are not at all covered by the “usual” Hörmander
theorem.
Before we proceed we note that, by the multiplicative property of $J^{\mathbf{X}}_{t\leftarrow s}$, see the
remark following (11.16), one has
$$M_t = J^{\mathbf{X}}_{t\leftarrow 0}\,\tilde M_t\,\big(J^{\mathbf{X}}_{t\leftarrow 0}\big)^{\top},$$
on [0, t]. Thanks to Condition 2, true roughness of X, we can apply Theorem 6.5 to
conclude that one has
$$z^\top J^{\mathbf{X}}_{0\leftarrow\cdot}[V_i,V_j](Y_\cdot) \equiv 0,$$
for every $i,j \in \{1,\dots,d\}$. Iterating this argument shows that, with non-zero probability,
the processes $s \mapsto z^\top J^{\mathbf{X}}_{0\leftarrow s} W(Y_s)$ vanish identically for every vector field
W obtained as a Lie bracket of the vector fields $V_i$. In particular, this is the case for
s = 0, which implies that with positive probability, z is orthogonal to $W(y_0)$ for
all such vector fields. Since Hörmander’s condition (H) asserts precisely that these
vector fields span the tangent space at the starting point $y_0$, we conclude that z = 0
with positive probability, which is in contradiction with the fact that z is a random
unit vector and thus concludes the proof. □
11.4 Exercises
Solution. In the notation of (the proof of) this proposition, we have to show that
$g \in H \mapsto DU^{T_g\mathbf{X}(\omega)}_{t\leftarrow 0} \in L(H,\mathbf{R}^e)$ is continuous. To this end, assume $g_n \to g$ in H
(and hence in $C^{\varrho\text{-var}}$). Continuity properties of the Young integral imply continuity of
the translation operator viewed as map $h \in C^{\varrho\text{-var}} \mapsto T_h\mathbf{X}(\omega)$ and so
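depends continuously on $\mathbf{x}$ with respect to p-variation rough path metric: using the
fact that $J^{\mathbf{x}}_{t\leftarrow\cdot}$ and $U^{\mathbf{x}}_{\cdot\leftarrow 0}$ both satisfy rough differential equations driven by $\mathbf{x}$, this is
just a consequence of Lyons’ limit theorem (the universal limit theorem of rough path
theory). We apply this with $\mathbf{x} = \mathbf{X}(\omega)$ where ω remains a fixed element in (11.20). It
follows that
$$\big\|DU^{T_{g_n}\mathbf{X}(\omega)}_{t\leftarrow 0} - DU^{T_g\mathbf{X}(\omega)}_{t\leftarrow 0}\big\|_{\mathrm{op}} = \sup_{h:\|h\|_H=1}\big|D_h U^{T_{g_n}\mathbf{X}(\omega)}_{t\leftarrow 0} - D_h U^{T_g\mathbf{X}(\omega)}_{t\leftarrow 0}\big|$$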
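and defining $Z^g_i(s) \equiv J^{T_g\mathbf{X}(\omega)}_{t\leftarrow s} V_i\big(U^{T_g\mathbf{X}(\omega)}_{s\leftarrow 0}\big)$, and similarly $Z^{g_n}_i(s)$, the same reasoning
as in part (a) leads to the estimate
$$\big\|DU^{T_{g_n}\mathbf{X}(\omega)}_{t\leftarrow 0} - DU^{T_g\mathbf{X}(\omega)}_{t\leftarrow 0}\big\|_{\mathrm{op}} \le c\big(|Z^{g_n} - Z^g|_{p\text{-var}} + |Z^{g_n}(0) - Z^g(0)|\big).$$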
From the explanations just given this tends to zero as n → ∞ which establishes
continuity of the Gâteaux differential, as required, and the proof is finished.
Exercise 11.5 Prove Theorem 11.20 in the presence of a drift vector field $V_0$. In particular,
show that in this case condition (H) can be weakened to
11.5 Comments
of course classical, see the monographs [Nua06, Mal97] or the original articles
[Mal78, KS84, KS85, KS87, Bis81b, Bis81a, Nor86], a short self-contained proof
can be found in [Hai11a]. In the case of rough differential equations driven by less
regular Gaussian rough paths (including the case of fBm with H > 1/4), the relevance
of complementary Young regularity of Cameron–Martin paths to Malliavin regularity
of (Gaussian) RDE solutions was first recognised by Cass, Friz and Victoir [CFV09].
Existence of a density under Hörmander’s condition for such RDEs was obtained
by Cass–Friz [CF10], see also [FV10b, Ch.20], but with a Stroock–Varadhan support
type argument instead of true roughness (already commented on at the end of
Chapter 6). Smoothness of densities was subsequently established by Hairer–Pillai
[HP13] in the case of fBm and then Cass, Hairer, Litterer and Tindel [CHLT15] in
the general Gaussian setting of Chapter 10, making crucial use of the integrability
estimates discussed in Section 11.2. Indeed, combined with known estimates for
the Jacobian of RDE flows (Friz–Victoir, [FV10b, Thm 10.16]) one readily obtains
finite moments of the Jacobian of the inverse flow. This is a key ingredient in the
smoothness proof via Malliavin calculus, as is the higher-order Malliavin differentia-
bility of Gaussian RDE solutions established by Inahama [Ina14]. Several authors
have studied the resulting density, see e.g. [BNOT16, Ina16b, GOT19, IN19] and the
references therein.
We note that existence of densities via Malliavin calculus for singular SPDEs,
in the framework of regularity structures, has been studied by Cannizzaro, Friz and
Gassiat [CFG17], Gassiat–Labbé [GL20] and in great generality by Schönbauer
[Sch18].
Chapter 12
Stochastic partial differential equations
Second order stochastic partial differential equations are discussed from a rough path
point of view. In the linear and finite-dimensional noise case we follow a Feynman–
Kac approach which makes good use of concentration of measure results, as those
obtained in Section 11.2. Alternatively, one can proceed by flow decomposition
and this approach also works in a number of nonlinear situations. Secondly, now
motivated by some semilinear SPDEs of Burgers’ type with infinite-dimensional noise,
we study the stochastic heat equation (in space dimension 1) as evolution in Gaussian
rough path space relative to the spatial variable, in the sense of Chapter 10.
As a prototypical linear first order PDE with noise we consider the transport equation,
posed (without loss of generality) as a terminal value problem. That is,
$$-\partial_t u(t,x) = \sum_{i=1}^d f_i(x)\cdot D_x u(t,x)\,\dot W^i_t \equiv \Gamma u_t(x)\,\dot W_t, \qquad u(T,\bullet) = g, \tag{12.1}$$
provided $g \in C^1$ and the vector fields $f_1,\dots,f_d$ are nice enough ($C_b^1$ will do) to
ensure a $C^1$ solution flow for the ODE $\dot X = \sum_{i=1}^d f_i(X)\dot W^i \equiv f(X)\dot W$; here $X^{s,x}$
denotes the unique solution started from $X_s = x$.
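As a quick sanity check of the characteristics representation u(s, x) = g(X_T^{s,x}) used throughout this section, the following sketch verifies the transport equation (12.1) by finite differences in a scalar example. Everything specific here is an assumption made for illustration: d = 1, f(x) = sin x (whose flow is known in closed form), the driver W_t = sin 3t, the terminal datum g = cos and T = 1.

```python
import numpy as np

T = 1.0

def W(t):          # hypothetical smooth driver
    return np.sin(3.0 * t)

def Wdot(t):
    return 3.0 * np.cos(3.0 * t)

def flow(a, x):
    # Solution at "time" a of x' = sin(x) started at x in (-pi, pi):
    # tan(x/2) is multiplied by exp(a).
    return 2.0 * np.arctan(np.tan(x / 2.0) * np.exp(a))

def u(s, x, g=np.cos):
    # u(s, x) = g(X_T^{s,x}); in one dimension X_T^{s,x} = flow(W_T - W_s, x).
    return g(flow(W(T) - W(s), x))

def residual(s, x, h=1e-5):
    """-d/ds u - f(x) d/dx u * Wdot(s), by central differences."""
    du_ds = (u(s + h, x) - u(s - h, x)) / (2 * h)
    du_dx = (u(s, x + h) - u(s, x - h)) / (2 * h)
    return -du_ds - np.sin(x) * du_dx * Wdot(s)

for s, x in [(0.2, 0.5), (0.7, -1.3), (0.4, 2.0)]:
    print(s, x, residual(s, x))
```

The residuals should vanish up to finite-difference error. The closed-form flow is what keeps the example short; for general vector fields one would solve the characteristic ODE numerically instead.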
We start with a rough path stability result for the transport equation, the proof of
which is an immediate consequence of our results on flow stability of RDEs.
Proposition 12.1. Let $g \in C(\mathbf{R}^m)$ and $W^\varepsilon \in C^1([0,T],\mathbf{R}^d)$, with geometric rough
path limit $\mathbf{W} \in \mathscr{C}_g^{0,\alpha}$, $\alpha > 1/3$. Write $u^\varepsilon(s,x) := u(s,x;W^\varepsilon)$, defined as in (12.3)
with W replaced by $W^\varepsilon$. Let $f \in C_b^3$. Then $u^\varepsilon$ converges locally uniformly to
where $X^{s,x}$ denotes the (unique) RDE solution to $dX = f(X)\,d\mathbf{W}$, started from
$X_s = x$. (In particular, the limit depends on $\mathbf{W}$ but not on the approximating
sequence.)
It is instructive to consider the case of Brownian motion B = B(t, ω) with
Stratonovich lift as prototypical example of a (random) geometric rough path.
The RDE solution X is then equivalently described by a Stratonovich SDE and
$u(t,x;\omega) = g(X_T^{t,x}(\omega))$ is $\mathcal{F}_t^T$-measurable. The so-defined random field should
then constitute a (backward adapted) solution to the Stratonovich backward stochastic
partial differential equation
$$-du_t(x) = \Gamma u_t(x)\circ\overleftarrow{dB}_t, \qquad u(T,\bullet) = g, \tag{12.5}$$
where $\overleftarrow{dB}$ stands for backward Stratonovich integration (cf. Section 5.4) provided g
(and then $\Gamma u_t$) are sufficiently regular to make this Stratonovich integral meaningful.
If rewritten in Itô form, a matrix-valued second order operator $\Gamma^2 = (\Gamma_i\Gamma_j)_{1\le i,j\le d}$ appears,
which of course must not change the hyperbolic nature of the stochastic transport
equation. (In classical SPDE theory one has the stochastic parabolicity condition,
which in the transport case is fully degenerate.)
All this strongly suggests that rough transport noise must be geometric (i.e.
$\mathbf{W} \in \mathscr{C}_g^\alpha$). We now prepare the definition of (regular, backward) solution to the rough
transport equation. Since we are in the fortunate position to have an explicit solution
(candidate) we derive a graded set of rough path estimates that provide a natural
generalisation of the classical transport differential equation. In what follows we
abbreviate estimates of the form $|(a) - (b)| \lesssim |t-s|^\gamma$ by writing $(a) \overset{\gamma}{=} (b)$. (Both
sides may depend on s, t and the multiplicative constant hidden in ≲ is assumed
uniform over bounded intervals.)
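$$u_s(x) \overset{3\alpha}{=} u_t(x) + \Gamma_i u_t(x)\,W^i_{s,t} + \Gamma_i\Gamma_j u_t(x)\,\mathbb{W}^{i,j}_{s,t},$$
$$\Gamma_i u_s(x) \overset{2\alpha}{=} \Gamma_i u_t(x) + \Gamma_i\Gamma_j u_t(x)\,W^j_{s,t},$$
$$\Gamma_i\Gamma_j u_s(x) \overset{\alpha}{=} \Gamma_i\Gamma_j u_t(x),$$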
Remark 12.3. The first 3α estimate is nothing but Davie’s definition of solution for
a linear RDE, here of the form −du = Γ u dW. In finite dimensions, a linear map
Γ is necessarily bounded (equivalently: continuous) as linear operator, so that the
cascade of lower order (2α, α) estimates are a trivial consequence of the first. This is
different in the present situation, where $u_t$ takes values in a function space where
each application of Γ amounts to taking one derivative. These estimates then have the
interpretation that time regularity of u, in the stated (“kα”) controlled sense, can be
traded against space regularity.
Remark 12.4. The rough integral formulation needs explanation. Indeed, while it is
clear from $\delta\Xi \overset{3\alpha}{=} \delta u(x) = 0$ that $\Xi_{s,t} = \Gamma_i u_t(x)W^i_{s,t} + \Gamma_i\Gamma_j u_t(x)\mathbb{W}^{i,j}_{s,t}$ has a sewing
limit, the right-point evaluation requires attention, cf. Proposition 5.12 and the subsequent
discussion about the subtleties of “right-point” rough integrals. Fortunately,
one checks that $(\Gamma u, -(\Gamma^2 u)^T) \in \mathscr{D}_X^{2\alpha}$ so that, thanks to (5.10), Remark 5.13, this
sewing limit, over all partitions of [0, T] say, is exactly identified as
$$\lim_{|\mathcal{P}|\downarrow 0}\sum_{[s,t]\in\mathcal{P}} \Big(\Gamma u_t\,X_{s,t} - (\Gamma^2 u_t)^T\,\mathbb{X}_{s,t}\Big) = \int_0^T (\Gamma u, -\Gamma^2 u^T)\,d\mathbf{X},$$
where we omitted x for better readability. (Since the matrix $\Gamma^2 u_t = (\Gamma_i\Gamma_j u_t)_{1\le i,j\le d}$
is in general not symmetric, a careful check of the controlledness condition is best
spelled out in coordinates.)
Definition 12.5. Any $C^{\alpha,3}$-function $u : [0,T]\times\mathbf{R}^n \to \mathbf{R}$, for which the (locally
uniform) estimates in Proposition 12.2 hold is called a regular solution to the rough
backward transport equation
$$-du = \Gamma u\,d\mathbf{W}.$$
$$X_t \overset{3\alpha}{=} x + f(x)W_{s,t} + f'f(x)\mathbb{W}_{s,t}.$$
Fix times s < t < T. By uniqueness of RDE flow, $X_T^{t,y} = X_T^{s,x}$ whenever $y = X_t^{s,x}$.
From $u(s,x) := g(X_T^{s,x})$ and uniqueness of the RDE flow it is clear that, for all such
t,
$$u(s,x) = u(t, X_t^{s,x}).$$
Note that $u_t = u(t,\bullet) \in C^3$ follows from $g \in C^3$, $f \in C^5$; the claimed $C^{\alpha,3}$ regularity
is then easy to see. We can expand
$$u_t(X_t^{s,x}) \overset{3\alpha}{=} u_t(x) + Du_t(x)\big(f(x)W_{s,t} + (Df)f(x)\mathbb{W}_{s,t}\big) + \tfrac12 D^2u_t(x)\big(f(x)W_{s,t}\big)^2$$
where the final term is really the contraction $\partial_{ij}u_t\, f^i_k f^j_l\,(\tfrac12 W_{s,t}\otimes W_{s,t})^{k,l}$ with
summation over all repeated indices. Using geometricity of $\mathbf{W}$ and symmetry of
$D^2u_t(x)(f,f)$ the right-hand side becomes
(We essentially repeated the proof of Itô’s formula here, cf. Section 7.5.) In terms of
the first order differential operators Γi associated to the vector fields fi this can be
written elegantly as
We can now show that solutions in the sense of Definition 12.5 are unique.
Proof. Existence is clear, since Proposition 12.2 exactly says that $(s,x) \mapsto g(X_T^{s,x})$
gives a regular solution. Let now u be any solution with $u_T = g$. We show that,
whenever X solves $dX = f(X)\,d\mathbf{W}$,
$$u(t,X_t) - u(s,X_s) \overset{3\alpha}{=} 0.$$
$$\Gamma^{3-k}u_t(X_t) \overset{k\alpha}{=} \Gamma^{3-k}u_s(X_s).$$
From the (third) defining property of a solution, the first difference on the right-hand
side is of order α. Since solutions are $C^3$ in space, hence $\Gamma^2 u(s,\bullet) \in C^1$, always
uniformly in $s \in [0,T]$, the final difference is also of order α, as required.
(Case k = 2.) Write
By the second defining property of a solution, the first difference on the right-hand
side equals $-\Gamma^2 u_t(X_t)W_{s,t}$ (up to order 2α). On the other hand, $\Gamma u_s \in C^2$ so that
the final difference can be replaced by
Put together we have $\Gamma u_t(X_t) - \Gamma u_s(X_s) = (\Gamma^2 u_s(X_s) - \Gamma^2 u_t(X_t))W_{s,t}$. We see
that this is of (desired) order 2α, thanks to the case k = 1 and $W_{s,t} \overset{\alpha}{=} 0$.
(Case k = 3.) We write
By the (first) defining property of a solution, the first difference on the right-hand
side equals $-\Gamma u_t(X_t)W_{s,t} - \Gamma^2 u_t(X_t)\mathbb{W}_{s,t}$ (up to order 3α). On the other hand,
$u(s,\bullet) \in C^3$ so that the final difference can be replaced, using a second order Taylor
expansion, exactly as in the proof of Proposition 7.8, by
$$Du_s(X_s)\big(f(X_s)W_{s,t} + f'f(X_s)\mathbb{W}_{s,t}\big) + \tfrac12 D^2u_s\big(f(X_s),f(X_s)\big)W_{s,t}\otimes W_{s,t}$$
$$= \Gamma u_s(X_s)W_{s,t} + \Gamma^2 u_s(X_s)\mathbb{W}_{s,t}.$$
Putting things together (and using the cases k = 1, 2) gives the desired estimate. □
(Note $\Gamma\varphi, \Gamma^2\varphi \in C_b$ so all pairings are well-defined. Formally, the second and third
estimates follow from the first with φ replaced by $\Gamma\varphi$ and $\Gamma^2\varphi$; however, doing so
would require test functions up to $\Gamma^4\varphi \notin C_b$. Itemizing the estimates allows us to
keep track of the correct regularity of φ.)
These estimates imply immediately the following (analytically) weak formulation
$$\forall\varphi \in C_b^3:\quad \varrho_t(\varphi) - \varrho_0(\varphi) = \int_0^t \big(\varrho_s(\Gamma\varphi), \varrho_s(\Gamma^2\varphi)\big)\,d\mathbf{W}_s,$$
but the finer information, as put forward in the definition, is crucial for uniqueness.
(Remark 12.9 below comments on time-dependent test functions.)
$$\varphi(X_t) \overset{3\alpha}{=} \varphi(X_s) + D\varphi(X_s)X'_s W_{s,t} + \big(D\varphi(X_s)X''_s + D^2\varphi(X_s)(X'_s,X'_s)\big)\mathbb{W}_{s,t},$$
Combining this with $\varrho_t(\varphi) := \varphi(X_t)$ yields the claimed 3α-estimate. Similarly, but
now using standard facts on composition of controlled rough paths with regular
functions, we obtain
$$\varphi(X_t) \overset{2\alpha}{=} \varphi(X_s) + \Gamma\varphi(X_s)W_{s,t},$$
uniformly over φ bounded in $C_b^2$. At last, the third estimate comes from α-Hölder
regularity of $t \mapsto \varrho_t(\Gamma^2\varphi) = \Gamma^2\varphi(X_t)$, itself a manifest consequence of $\Gamma^2\varphi \in C_b^1$
and α-Hölder regularity of X.
We are not yet done, because until now, we have only handled the case of Dirac
initial data $\varrho_0 = \delta_x$. (Since $\varrho_0(\varphi) = \varphi(X_0^{0,x}) = \varphi(x)$.) Fortunately, we are in a
linear situation so that, given $\varrho_0 = \nu \in \mathcal{M}$, it suffices to generalise our construction
and define
$$\varrho_t(\varphi) := \int \varphi(X_t^{0,x})\,\nu(dx).$$
It remains to see that such an integration in x respects all graded 3α, 2α, α estimates.
This is indeed the case, because all required estimates are uniform in X0 = x. (A
pleasant consequence of dealing with bounded vector fields so that all quantitative
bounds are invariant under shift.)
(Uniqueness) Given any $g = u_T \in C_b^3$, there exists a regular backward RPDE
solution, $u_t = u(t,\bullet) \in C_b^3$, with
(and then $u' = \Gamma u \in C_b^2$ etc). Write $u_{s,t} = u_t - u_s$ and similarly for ϱ. Then
The first summand on the right-hand side expands, using the very definition of weak
solution (applied with $\varphi = u_t \in C_b^3$, uniformly in $t \in [0,T]$),
$$\varrho_{s,t}(u_t) \overset{3\alpha}{=} \varrho_s(\Gamma u_t)W_{s,t} + \varrho_s(\Gamma^2 u_t)\mathbb{W}_{s,t}.$$
The second summand on the other hand expands, using the defining property of
regular backward equation,
(Here one needs to argue that the 3α-bound on $u_{s,t}(x) + \Gamma u_t(x)W_{s,t} + \Gamma^2 u_t(x)\mathbb{W}_{s,t}$
is uniform in x, for $u_T \in C_b^3$, and thus the same 3α-estimate holds after integrating
against $\varrho_s(dx)$.) Taken together we see a perfect cancellation so that
$\varrho_t(u_t) - \varrho_s(u_s) \overset{3\alpha}{=} 0$. By a familiar argument (using 3α > 1) this implies that $t \mapsto \varrho_t(u_t)$ is
constant and thus
Remark 12.9. The uniqueness part of the proof actually shows that analytically weak
solutions to the rough PDE (12.6) can be tested in space-time with test functions
φ = φ(t, x) that have a precise controlled structure, starting with
(and then 2α, resp. α expansions for φ′ and φ′′ ). This space of test functions is
tailored to the realisation of the noise W.
As motivation, consider the second order stochastic partial differential equation with
d-dimensional Brownian noise in (backward) Stratonovich form, posed as terminal
value problem,
$$-du = L[u]\,dt + \Gamma[u]\circ\overleftarrow{dB}, \qquad u(T,\bullet) = g, \tag{12.8}$$
Footnote 1: In contrast to the space $C_b$ we shall equip BC with the topology of locally uniform convergence.
We have already treated the fully degenerate case L = 0, with pure transport noise,
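$\Gamma_i = \beta_i(x)\cdot D_x$, in Section 12.1.1. Since geometric rough paths are limits of smooth
paths, we start with the case when W is replaced by $\dot W\,dt$, for $W \in C^1([0,T],\mathbf{R}^d)$.
$$(W^\varepsilon, \mathbb{W}^\varepsilon) := \Big(W^\varepsilon, \int_0^\cdot W^\varepsilon_{0,t}\otimes dW^\varepsilon_t\Big) \to \mathbf{W}$$
$$u^\varepsilon = S[W^\varepsilon; g] \to u =: S[\mathbf{W}; g]$$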
This is clearly not an equation that can be solved by Itô theory alone. But it is also
not immediately well-posed as a rough differential equation since for this we would
need to understand B and $\mathbf{W} = (W,\mathbb{W})$ jointly as a rough path. In view of the
Itô differential dB in (12.15), we take $(B, \mathbb{B}^{\text{Itô}})$, as constructed in Section 3.2,
and are basically short of the cross-integrals between B and W. (For simplicity
of notation only, pretend over the next few lines W, B to be scalar.) We can define
$\int W\,dB(\omega)$ as Wiener integral (Itô with deterministic integrand), and then $\int B\,dW = WB - \int W\,dB$
by imposing integration by parts. We then easily get the estimate
$$\mathbf{E}\Big[\Big(\int_s^t W_{s,r}\,dB_r\Big)^2\Big] \lesssim \|W\|_\alpha^2\,|t-s|^{2\alpha+1},$$
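also when switching the roles of W, B, thanks to the integration by parts formula. It
follows from Kolmogorov’s criterion that $\mathbf{Z}^{\mathbf{W}}(\omega) := \mathbf{Z} = (Z,\mathbb{Z}) \in \mathscr{C}^{\alpha'}$ a.s. for any
$\alpha' \in (1/3,\alpha)$ where
$$Z_t = \begin{pmatrix} B_t(\omega) \\ W_t \end{pmatrix}, \qquad \mathbb{Z}_{s,t} = \begin{pmatrix} \mathbb{B}^{\text{Itô}}_{s,t}(\omega) & \int_s^t W_{s,r}\otimes dB_r(\omega) \\ \int_s^t B_{s,r}\otimes dW_r(\omega) & \mathbb{W}_{s,t} \end{pmatrix}.$$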
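For the reader's convenience, the second moment bound for the Wiener integral of $W_{s,\cdot}$ against B can be spelled out in the scalar case via the Itô isometry. The display below is a short worked derivation, assuming W deterministic and α-Hölder with seminorm $\|W\|_\alpha$; the explicit constant 1/(2α+1) is what the text absorbs into ≲.

```latex
\mathbf{E}\Big[\Big(\int_s^t W_{s,r}\,dB_r\Big)^2\Big]
  = \int_s^t |W_{s,r}|^2\,dr
  \le \|W\|_\alpha^2 \int_s^t (r-s)^{2\alpha}\,dr
  = \frac{\|W\|_\alpha^2}{2\alpha+1}\,|t-s|^{2\alpha+1}.
```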
We are hence able to say that a solution X = X(ω) of (12.15) is, by definition, a
solution to the genuine (random) rough differential equation
limit, e.g. in probability and uniformly on [0, T], of classical Itô SDE solutions $X^\varepsilon$,
obtained by replacing $dW_t$ by $\dot W^\varepsilon_t\,dt$ in (12.15).
Step 2: Given (s, x) we have a solution $(X_t : s \le t \le T)$ to the hybrid equation
(12.15), started at $X_s = x$. In fact $(X, X') \in \mathscr{D}_Z^{2\alpha'}$ with $X' = (\sigma,\beta)(X)$. In
particular, the rough integral
$$\int \gamma(X)\,d\mathbf{W} := \int (0,\gamma(X))\,d\mathbf{Z}$$
One can see, similar to (11.10), but now also relying on RDE growth estimates as
established in Proposition 8.2, with $p = 1/\alpha'$,
$$\Big|\int_s^t \gamma(X)\,d\mathbf{W}\Big| \lesssim |||\mathbf{Z}|||_{p\text{-var};[s,t]}$$
is indeed well-defined and the pointwise limit of $u^\varepsilon$ (defined in the same way, with
$\mathbf{W}$ replaced by $W^\varepsilon$). By an Arzelà–Ascoli argument, the limit is locally uniform. At
last, the claimed continuity of the solution map follows from the same arguments,
essentially by replacing $W^\varepsilon$ by $\mathbf{W}^\varepsilon$ everywhere in the above argument, and of course
using (12.19) with g, $\mathbf{W}$ replaced by $g^\varepsilon$, $\mathbf{W}^\varepsilon$, respectively. □
Remark 12.12. The proof actually shows that our solution $u = u(s,x;\mathbf{W})$ to the
linear RPDE (12.10) enjoys a Feynman–Kac type representation, namely (12.19),
in terms of the process constructed as solution to the hybrid Itô-rough differential
equation (12.15). Assume now W is a Brownian motion, independent of
B, and $\mathbf{W}(\omega) = \mathbf{W}^{\mathrm{Strat}} = (W, \mathbb{W}^{\mathrm{Strat}}) \in \mathscr{C}_g^{0,\alpha}$ a.s. It is not difficult to show
that $u = u(\cdot,\cdot,\mathbf{W}^{\mathrm{Strat}}(\omega))$ coincides with the Feynman–Kac SPDE solution derived
by Pardoux [Par79] or Kunita [Kun82], via conditional expectations given
$\sigma(\{W_{u,v} : s \le u \le v \le T\})$, and so provides an identification with classical SPDE
theory. In conjunction with continuity of the solution map $S = S[\mathbf{W}; g]$ one obtains,
along the lines of Section 9.2, SPDE limit theorems of Wong–Zakai type,
In Exercise 12.1 the reader is invited to check that our Feynman–Kac solution is
indeed a weak solution in this sense. In particular, the final integral term is a bona
fide rough integral of the controlled rough path $(Y, Y') \in \mathscr{D}_W^{2\alpha}$
against $\mathbf{W}$, where
It is seen in [DFS17] that a uniqueness result for such weak RPDE solutions
holds, provided in the definition a suitable uniformity over the test function φ is
required. The strategy is very similar to what was seen in Section 12.1.2: arguing
(for convenience) on the terminal value formulation (12.10), we construct a regular
forward solution and then employ a forward-backward argument to obtain uniqueness.
This is essentially the uniqueness argument employed in Theorem 12.8, with switched
roles of forward and backward evolution. Alternatively, in [HH18] the unbounded
rough driver framework of [DGHT19b] has been adapted to linear second order
RPDEs with L in divergence form.
Remark 12.15. Let u = u(t, x; X) be a weak solution in the sense of (12.21), and
W be a Brownian motion with Stratonovich rough path lift W = WStrat (ω). Then,
thanks to Theorem 5.14, it follows that u(t, x; ω) := u(t, x; WStrat (ω)) yields an
analytically weak SPDE solution in the sense that for every φ ∈ D and 0 ≤ t ≤ T
one has, with probability one,
$$\langle u_t,\varphi\rangle = \langle u_0,\varphi\rangle + \int_0^t \langle u_s, L^*\varphi\rangle\,ds + \int_0^t \langle u_s,\Gamma^*\varphi\rangle\circ dW_s,$$
We want to give meaning to the rough partial differential equation (12.23). Similar
to (12.21), there is a natural – still formal – analytically weak formulation: for every
h ∈ Dom(L) ⊂ H and 0 ≤ t ≤ T the following integral formula holds (angle
brackets denote the inner product in H):
$$\langle u_t,h\rangle = \langle\xi,h\rangle + \int_0^t \langle u_s, Lh\rangle\,ds + \int_0^t \langle F(u_s),h\rangle\,d\mathbf{X}_s. \tag{12.25}$$
On the other hand, if $(S_t)_{t\ge 0}$ denotes the associated semigroup $S_t = e^{Lt}$ (which is
analytic since L is assumed to be selfadjoint) one expects a mild formulation of the
form, for all $0 \le t \le T$,
$$u_t = S_t\xi + \int_0^t S_{t-s}F(u_s)\,d\mathbf{X}_s, \tag{12.26}$$
where the identity holds between elements in H. The regularity of F will be measured
in Fréchet sense, as map from $H_\alpha$ to itself, for a range of $\alpha \in \mathbf{R}$ to be specified (see footnote 3).
Here, for $\alpha \ge 0$, the interpolation space $H_\alpha = \mathrm{Dom}((-L)^\alpha)$ is a Hilbert space
when endowed with the norm $\|\bullet\|_{H_\alpha} = \|(-L)^\alpha\bullet\|_H$. Similarly, $H_{-\alpha}$ is defined as
the completion of H with respect to the norm $\|\bullet\|_{H_{-\alpha}} = \|(-L)^{-\alpha}\bullet\|_H$. Note that
this setting is compatible with that of Exercise 4.16.
The weak formulation requires of course that s 7→ ⟨F (us ), h⟩ has meaning as a
controlled rough path, so that (12.25) is well-defined. In the mild formulation (12.26)
on the other hand we recognise the rough convolution integral previously defined in
(4.47), provided that s 7→ F (us ) is mildly controlled in the sense of (4.46). It can
be seen that weak and mild solutions coincide [GH19]. (The proof of this involves a
simple variant of the rough Fubini theorem from Exercise 4.11.) In what follows we
only consider the mild formulation.
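We introduce the following spaces which are a slight strengthening of the spaces
$\mathscr{D}^{2\gamma}_{S,X}$ introduced in Exercise 4.17:
$$\mathcal{D}^{2\gamma}_X([0,T],H_\alpha) = \mathscr{D}^{2\gamma}_{S,X}([0,T],H_\alpha) \cap \big(\hat{C}^\gamma([0,T],H_{\alpha+2\gamma}) \times L^\infty([0,T],H_{\alpha+2\gamma})\big).$$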
The basic ingredients, stability of mildly controlled rough paths under rough convolution
and composition with regular functions, were already established in Exercises
4.17 and 7.3. Taken together, they show that the image of $(Y,Y') \in \mathcal{D}^{2\gamma}_X([0,T],H)$
under the map
$$\mathcal{M}_T(Y,Y')_t := \Big(S_t\xi + \int_0^t S_{t-u}F(Y_u)\,d\mathbf{X}_u,\; F(Y_t)\Big) \tag{12.27}$$
yields again an element of $\mathcal{D}^{2\gamma}_X([0,T],H)$. We now show that for small enough times
this map has a unique fixed point:
Footnote 3: This rules out taking any derivatives in F. In particular, the previously considered transport noise,
which involved $D_x u$, is not accommodated in this setting.
Proof. First note $\mathbf{X} = (X,\mathbb{X}) \in \mathscr{C}^\gamma \subset \mathscr{C}^\eta$ for $1/3 < \eta < \gamma \le 1/2$. Fixing T < 1,
we will find a solution $(Y,Y') \in \mathcal{D}^{2\eta}_X([0,T],H_{2\eta-2\gamma})$ as a fixed point of the map
$\mathcal{M}_T$ given by (12.27). In the end we will briefly describe how one can make an
improvement and show that one actually has $(Y,Y') \in \mathcal{D}^{2\gamma}_X([0,T],H)$. The proof
is analogous to that of Theorem 8.3, the only difference being that we have two different
scales of space regularity for which we need to be able to obtain the bound (7.14), as
prepared in Exercise 4.17. We will therefore show only invariance of the solution
map (12.27), because proving it already contains all the techniques that are not
present in Theorem 8.3.
Note that if (Y, Y ′ ) is such that (Y0 , Y0′ ) = (ξ, F (ξ)) then the same is true for
MT (Y, Y ′ ), so we can view MT as a map on the complete metric space
$$B_T = \Big\{(Y,Y') \in \mathcal{D}^{2\eta}_X([0,T],H_{2\eta-2\gamma}) : Y_0 = \xi,\ Y'_0 = F(\xi),\ \|(Y,Y')\|^{\wedge}_{X,2\eta;-2\gamma} + \|Y - S_\bullet F(\xi)X_{0,\bullet}\|_{\eta;2\eta-2\gamma} + \|Y' - S_\bullet F(\xi)\|_{\infty;2\eta-2\gamma} \le 1\Big\}.$$
(We use the same notational convention as in Exercise 4.17, namely indices after
a semicolon indicate in which of the $H_\alpha$ spaces the norms are taken.) Note that by the
triangle inequality for $(Y,Y') \in B_T$ we have
0 X,2η
Since (Y, Y ′ ) ∈ BT , we have the bound ∥Y ∥η,−2γ ≤ (∥X∥γ + 1)T γ−η . One can
also show along the same lines as in Exercise 7.3 that
and since 2η < 1 we can use a better bound from (4.48) to deduce:
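\begin{align*}
\big\|\hat\delta\big(\mathcal M_T(Y) - S_\bullet F(\xi)X_{0,\bullet}\big)_{t,s}\big\|_{\mathcal H_{2\eta-2\gamma}}
&= \Big\| \int_s^t S_{t-u}F(Y_u)\,d\mathbf X_u - S_tF(\xi)X_{s,t} \Big\|_{\mathcal H_{2\eta-2\gamma}} \\
&\le \big(\|F(\xi)\|_{\mathcal H} + \|Z\|_{\infty;-2\gamma}\big)\|X\|_\eta|t-s|^{\eta} + \|Z'\|_{\infty;-2\gamma}\|\mathbb X\|_{2\eta}|t-s|^{2\eta} \\
&\quad + C\big(\|X\|_\eta|R^Z|_{2\eta} + \|\mathbb X\|_{2\eta}\|Z'\|_\eta\big)|t-s|^{3\eta-2\eta} \\
&\lesssim \big(\|F(\xi)\|_{\mathcal H} + \|Z_0'\|_{\mathcal H_{-2\gamma}} + \|(Z,Z')\|^{\wedge}_{X,2\eta;-2\gamma}\big)|t-s|^{\eta} \\
&\lesssim T^{\gamma-\eta}|t-s|^{\eta}.
\end{align*}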
\[ +\ \|\mathcal M_T(Y,Y')\|^{\wedge}_{X,2\eta;-2\gamma} \lesssim T^{\gamma-\eta}. \]
If T is small enough we guarantee that the left-hand side of the above expression
is smaller than 1, thus proving that BT is invariant under MT . In order to show
contractivity of MT , one can use analogous steps to first show
This guarantees contractivity for small enough T , completing the fixed point argu-
ment and thus showing the existence of the unique maximal solution to (12.28).
Let now $(Y,Y') \in \mathscr D^{2\eta}_X([0,T],\mathcal H_{2\eta-2\gamma})$ be the solution constructed above. We sketch an argument showing that in fact it belongs to $\mathscr D^{2\gamma}_X([0,T],\mathcal H)$. We know that
Taking β = 2γ and using (12.30) again, we show that $Y \in \hat C^\gamma([0,T],\mathcal H)$, which completes the proof that $(Y,Y') \in \mathscr D^{2\gamma}_X([0,T],\mathcal H)$. □
and semilinear
Hi [u] = Hi (x, u, Du) , i = 1, . . . , d .
We essentially rule out nonlinear dependence on Du, hence the terminology “semilinear noise”, which makes a (global) flow transformation method work. In a stochastic setting such transformations (at least in the linear case) are attributed to Kunita. As already noted in the context of first order equations, the case $H_i = H_i(x,Du)$ requires a subtle local version of such a transformation and is the topic of the pathwise Lions–Souganidis theory of stochastic viscosity solutions for fully nonlinear SPDEs; see [LS98a, LS98b, LS00b] and [Sou19] for a recent overview.
As in the previous section we aim to replace ◦dW by a “rough” differential dW,
for some geometric rough path W ∈ Cg0,α ([0, T ], Rd ), and show that an RPDE
solution arises as the unique limit under approximations (W ε , Wε ) → W. Of course,
224 12 Stochastic partial differential equations
there is little one can say at this level of generality and we have not even clarified in
which sense we mean to solve (12.31) when W ∈ C 1 ! Let us postpone this discussion
and assume momentarily that F and H are sufficiently “nice” so that, for every $W \in C^1$ and $g \in BC$, say, there is a classical solution u = u(t, x) for t > 0. With noise of the form $H[u]\dot W = \sum_i H_i(x,u,Du)\,\dot W^i$, we shall focus on the following three cases.
a) Transport noise. For sufficiently nice vector fields βi on Rn ,
Hi [u] = βi (x) · Du ;
We now develop the “calculus” for the transformations associated to each of the
above cases. All proofs consist of elementary computations and are left to the reader.
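\[ \partial_t v - F^{\psi}\big(t,x,v,Dv,D^2v\big) = 0 \]
\[ F^{\psi}(t,\psi_t(x),r,p,X) \overset{\text{def}}{=} F\big(x,\, r,\, \langle p, D\psi_t^{-1}\rangle,\, \langle X, D\psi_t^{-1}\otimes D\psi_t^{-1}\rangle + \langle p, D^2\psi_t^{-1}\rangle\big). \]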
Proposition 12.20 (Case b). For any fixed x ∈ Rn , assume that the one-dimensional
ODE
φ̇ = H(x, φ)Ẇ , φ(0; x) = r ,
has a unique solution flow φ = φW = φ(t, r; x) which is of class C 2 as a function
of both r and x. Then u is a classical solution to
if and only if v(t, x) = φ−1 (t, u(t, x), x), or equivalently φ(t, v(t, x), x) = u(t, x) ,
is a solution of
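\[ \partial_t v - {}^{\varphi}F\big(t,x,r,Dv,D^2v\big) = 0, \]
with
\[ {}^{\varphi}F(t,x,r,p,X) \overset{\text{def}}{=} \frac{1}{\varphi'}\,F\big(t,x,\varphi,\ D\varphi + \varphi' p,\ \varphi''\,p\otimes p + D\varphi'\otimes p + p\otimes D\varphi' + D^2\varphi + \varphi' X\big), \tag{12.32} \]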
Remark 12.22. Note that all dependence on Ẇ has disappeared in (12.33), and
consequently (12.32). In the SPDE / filtering context this is known as robustification:
the transformed PDE $(\partial_t - {}^{\varphi}F)v = 0$ can be solved for any $W \in C([0,T],\mathbb R^d)$. This provides a way to solve SPDEs of the form $du = F[u]\,dt + \sum_{i=1}^d \gamma_i(x)\,u\circ dW^i_t$ pathwise, so that u depends continuously on W in uniform topology.
We now turn our attention to case c). The point here is that the “inner” and “outer”
transformation seen above, namely
Proposition 12.23 (Case c). Let $\psi = \psi^W$ be as in case a) and set $\varphi(t,r,x) = r\exp\big(\int_0^t \gamma(\psi_s(x))\,dW_s\big)$. Then u is a (classical) solution to
if and only if $v(t,x) = u(t,\psi_t(x))\exp\big(-\int_0^t \gamma(\psi_s(x))\,dW_s\big)$ is a (classical) solution to
\[ \partial_t v - {}^{\varphi}(F^{\psi})\big(t,x,v,Dv,D^2v\big) = 0. \]
again a linear operator. Because of the appearance of quadratic terms in Du, this
is not true for the inner transformation F → φ F unless φ′′ = 0. Fortunately, this
happens in the linear case and it follows that the transformation F → φ (F ψ ) used in
case c) above does preserve the class of linear operators.
Let us reflect for a moment on what has been achieved. We started with a PDE
that involves Ẇ and in all cases we managed to transform the original problem to a
PDE where all dependence on Ẇ has been isolated in some auxiliary ODEs. In the
stochastic context (◦dW instead of dW = Ẇ dt) this is nothing but the reduction,
via stochastic flows, from a stochastic PDE to a random PDE, to be solved ω-wise.
In the same spirit, the rough case is now handled with the aid of flows for RDEs and
their stability properties.
Given W ∈ Cg0,α , we pick an approximating sequence (W ε ), and transform
(in abusive notation) and the function F ε which appears on the right-hand side above
converges (e.g. locally uniformly) as ε → 0, due to stability properties of flows
associated to RDEs as discussed in Section 8.10.
All one now needs is a (deterministic) PDE framework with a number of good
properties, along the following “wish list”.
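1. All approximate problems, i.e. with $W^\varepsilon \in C^1([0,T],\mathbb R^d)$,
\[ \partial_t u^\varepsilon = F[u^\varepsilon] + \sum_{i=1}^d H_i[u^\varepsilon]\,\dot W^{\varepsilon,i}_t, \qquad u^\varepsilon(0,\bullet) = g^\varepsilon, \]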
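\[ (W^\varepsilon,\mathbb W^\varepsilon) := \Big(W^\varepsilon, \int_0^{\bullet} W^\varepsilon_{0,t}\otimes dW^\varepsilon_t\Big) \to \mathbf W \]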
uε = S[W ε ; g] → u =: S[W; g]
as ε → 0 in U. This u is the unique solution to the RPDE (12.36) in the sense of the
above definition. Moreover, the resulting solution map,
S : Cg0,α ([0, T ], Rd ) × G → U
is continuous.
It remains to identify suitable PDE frameworks, depending on the nonlinearity F .
When ∂t u = F [u] is a scalar conservation law, entropy solutions actually provide
a suitable framework to handle additional rough noise, at least of (linear) type c),
[FG16b]. On the other hand, when F = F [u] is a fully nonlinear second order opera-
tor, say of Hamilton–Jacobi–Bellman (HJB) or Isaacs type, the natural framework
is viscosity theory [CIL92, FS06] and the problem of handling additional “rough”
4
Given the roughness in t of our transformations, typically α-Hölder, it would not be wise to
incorporate temporal C 1 -regularity in the definition of the space U .
and some technical conditions hold.6 Without going into technical details, the
conditions are met for F = L as in (12.9) and are robust under taking inf
and sup (provided the regularity of the coefficients holds uniformly). As a
consequence, HJB and Isaacs type nonlinearities, where F takes the form
infa La , infa supa′ La,a′ , are also covered.
2. The change of variables “calculus” of Propositions 12.19–12.23 remains valid for
(continuous) viscosity solutions. This can be checked directly from the definition
of a viscosity solution.
5
the space of bounded uniformly continuous functions
6
. . .the most important of which is [CIL92, (3.14)]. Additional assumptions on F are necessary,
however, in particular due to the unboundedness of the domain Rn , and these are not easily found
in the literature; see [DFO14]. One can also obtain existence and uniqueness result in BUC.
In the context of RPDEs above, again with focus on the transport case a) for
the sake of argument, F 0 = F ψ where ψ = ψ W , where ψ is a flow of C 3 -
diffeomorphisms (associated to the RDE dY = −β(Y )dW thereby leading to
the assumption β ∈ Cb5 ). As a structural condition on F , we may simply assume
“ψ-invariant comparison” meaning that comparison holds for ∂t − F ψ , for any C 3 -
diffeomorphism with bounded derivatives. Checking this condition turns out to be
easy. First, when F = L is linear, we have F ψ = Lψ also linear, with similar bounds
on the coefficients as L due to the stringent assumptions on the derivatives of ψ.
From the above discussion, and in particular from what was said in 1., it is then
clear that L satisfies ψ-invariant comparison. In fact, stability of the condition in 1.
under taking inf and sup, also implies that HJB and Isaacs type nonlinearities satisfy
ψ-invariant comparison.
It is now possible to implement the arguments of the previous Theorem 12.25
in the viscosity framework [CFO11], see also [FO11] for applications to splitting
methods. We tacitly assume that all approximate problems of the form (12.40) below
have a viscosity solution, for all W ε ∈ C 1 and g ∈ BU C, but see Remark 12.27.
where F satisfies ψ-invariant comparison. Then there exists u = u(t, x) ∈ BC, not
dependent on the approximating (W ε ) but only on W ∈ Cg0,α ([0, T ], Rd ), so that
uε = S[W ε ; g] → u =: S[W; g]
as ε → 0 in local uniform sense. This u is the unique solution to the RPDE (12.36)
with transport noise H[u] = ⟨β(x), Du⟩ in the sense of the definition given previous
to Theorem 12.25. Moreover, we have continuity of the solution map,
Remark 12.27. In the above theorem, existence of RPDE solutions actually relies on
existence of approximate solutions uε , which one of course expects from standard
viscosity theory. Mild structural conditions on F , satisfied by HJB and Isaacs exam-
ples, which imply this existence are reviewed in [DFO14]. One can also establish a
modulus of continuity for RPDE solutions, so that u ∈ BU C after all.
Unfortunately, in case b), it turns out that the structural assumptions one has to impose
on F in order to have the necessary comparison for $\partial_t - F^0 = 0$ are rather restrictive,
although semilinear situations are certainly covered. Even in this case, due to the
appearance of a quadratic nonlinearity in Du, the argument is involved and requires
a careful analysis on consecutive small time intervals, rather than [0, T ]; see [LS00a,
DF12]. A nonlinear Feynman–Kac representation, in terms of rough backward
stochastic differential equations is given in [DF12].
At last, we return to the fully linear case of Section 12.2.3. That is, we consider
the (linear noise) case c) with linear F = L. With some care [FO14], the double
transformation leading to the transformed equation ∂t − φ (F ψ ) = 0 can be imple-
mented with the aid of coupled flows of rough differential equations. We can then
recover Theorem 12.11, but with somewhat different needs concerning the regularity
of the coefficients. (For instance, in the aforementioned theorem we really needed
$\sigma,\beta \in C_b^3$ whereas now, using flow decomposition, we need $\beta \in C_b^5$ but only $\sigma \in C_b^1$.)
Nonlinear stochastic partial differential equations driven by very singular noise, say
space-time white noise, may suffer from the fact that their nonlinearities are ill-posed.
For instance, even in space dimension one, there is no obvious way of giving “weak”
meaning to Burgers-like stochastic PDEs of the type
\[ \partial_t u^i = \partial_x^2 u^i + f(u) + \sum_{j=1}^n g^i_j(u)\,\partial_x u^j + \xi^i, \qquad i = 1,\dots,n, \tag{12.41} \]
copies of scalar space-time white noise). Recall that, at least formally, space-time
white noise is a Gaussian generalised stochastic process such that
12.3 Stochastic heat equation as a rough path 231
As a consequence of the lack of regularity of ξ, it turns out that the solution to the
stochastic heat equation (i.e. the case f = g = 0 in (12.41) above) is only α-Hölder
continuous in the spatial variable x for any α < 1/2. In other words, one would
not expect any solution u to (12.41) to exhibit spatial regularity better than that of a
Brownian motion.
As a consequence, even when aiming for a weak solution theory, it is not clear
how to define the integral of a spatial test function φ against the nonlinearity. Indeed,
this would require us to make sense of expressions of the type
\[ \int \varphi(x)\,g^i_j(u)\,\partial_x u^j(t,x)\,dx, \]
for fixed t. When g happens to be a gradient, such an integral can be defined by postulating that the chain rule holds and integrating by parts. For a general g, as arising in applications from path sampling [HSV07], this approach fails. This suggests seeking an understanding of u(t, •) as a spatial rough path. Indeed, this would solve the problem just explained by allowing us to define the nonlinearity in a weak sense as
\[ \int \varphi(x)\,g^i_j(u)\,du^j(t,x), \]
∂t ψ = ∂x2 ψ + ξ .
Indeed, writing u = ψ + v and proceeding formally for the moment, we then see that
v should solve
\[ \partial_t v^i = \partial_x^2 v^i + f(v+\psi) + \sum_{j=1}^n g^i_j(v+\psi)\big(\partial_x\psi^j + \partial_x v^j\big). \]
If we were able to make sense of the term appearing in the right-hand side of this
equation, one would expect it to have the same regularity as ∂x ψ so that, since
ψ(t, • ) turns out to belong to C α for every α < 1/2, one would expect v(t, • ) to be
of regularity C α+1 for every α < 1/2. In particular, we would not expect the term
involving ∂x v j to cause any trouble, so that it only remains to provide a meaning for
the term gji (v + ψ)∂x ψ j . If we know that v ∈ C 1 and we have an interpretation of
ψ(t, • ) as a rough path ψ (in space), then this can be interpreted as the distribution
whose action, when tested against a test function φ, is given by
\[ \int \varphi(x)\,g^i_j(\psi+v)\,d\boldsymbol\psi^j(t,x). \]
This reasoning can actually be made precise, see the original article [Hai11b]. In this
section we limit ourselves to providing the construction of ψ and giving some of its
basic properties.
We now study the model problem in this context: the construction of a spatial rough
path associated, in essence, to the above SPDE in the case f = g = 0. More precisely,
we consider the stationary (in time) solution to the stochastic heat equation⁷,
Thanks to the fact that we chose λ > 0, the stochastic heat equation (12.42) has indeed a stationary solution which, by taking Fourier transforms, may be decomposed as $\psi(x,t;\omega) = \sum_k Y^k_t(\omega)\,e_k(x)$. The components $Y^k_t$ are then a family of independent stationary one-dimensional Ornstein–Uhlenbeck processes given by
where K is given by
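\[ K(u) := \frac{\sigma^2}{4\pi}\sum_{k\in\mathbb Z}\frac{\cos(ku)}{\mu_k} = \frac{\sigma^2}{4\sqrt\lambda\,\sinh(\sqrt\lambda\,\pi)}\cosh\big(\sqrt\lambda\,(u-\pi)\big). \]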
Here, the second equality holds for u restricted to [0, 2π]. In fact, the cosine series is
the periodic continuation of the r.h.s. restricted to [0, 2π].
Proof. From the basic identity cos (α − β) = cos α cos β + sin α sin β,
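\[ e_{-k}(x)e_{-k}(y) + e_k(x)e_k(y) = \frac{1}{\pi}\cos\big(k(x-y)\big), \qquad k\in\mathbb Z. \]
\[ K(x) = \frac{\sigma^2}{4\pi}\sum_{k\in\mathbb Z}\frac{\cos(kx)}{\lambda+k^2}. \]
At last, expand the (even) function $\cosh\big(\sqrt\lambda\,(|\bullet|-\pi)\big)$ in its (cosine) Fourier series to get the claimed equality. □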
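The closed form for K lends itself to a quick numerical sanity check. The following is only a sketch under stated assumptions: the names `K_series` and `K_closed`, the truncation level `n_terms` and the sample values of λ and σ are illustrative choices, and we take $\mu_k = \lambda + k^2$ as in the proof above. It compares a truncated cosine series with the hyperbolic expression for u in [0, 2π].

```python
import math

def K_series(u, lam=1.0, sigma=1.0, n_terms=20000):
    """Truncated series sigma^2/(4 pi) * sum_{|k| <= n_terms} cos(k u) / (lam + k^2)."""
    total = 1.0 / lam  # contribution of k = 0
    for k in range(1, n_terms + 1):
        # the terms for k and -k coincide because cos is even
        total += 2.0 * math.cos(k * u) / (lam + k * k)
    return sigma ** 2 / (4.0 * math.pi) * total

def K_closed(u, lam=1.0, sigma=1.0):
    """Closed form of K, valid for u restricted to [0, 2 pi]."""
    r = math.sqrt(lam)
    return sigma ** 2 * math.cosh(r * (u - math.pi)) / (4.0 * r * math.sinh(r * math.pi))

if __name__ == "__main__":
    for u in (0.0, 1.0, math.pi, 5.0):
        print(u, K_series(u), K_closed(u))
```

The truncation error of the series is of order 1/`n_terms`, so agreement can only be expected up to that accuracy; outside [0, 2π] the series is the periodic continuation and the closed form no longer applies.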
Proposition 12.31. Fix t ≥ 0. Then ψt (x; ω) = ψ(t, x; ω), indexed by x ∈ [0, 2π],
is a centred Gaussian process with covariance of finite 1-variation. More precisely,
\[ \big\|R_{\psi(t,\bullet)}\big\|_{1;[x,y]^2} \le 2\pi\,\|K\|_{C^2;[0,2\pi]}\,|x-y|, \]
and so (cf. Theorem 10.4), for each fixed t ≥ 0, the Rd -valued process
Remark 12.32. There are ad-hoc ways to construct a (spatial) rough path lift associated to the stochastic heat equation, for instance by writing ψ(t, •) as a Brownian bridge plus a random smooth function. In this way, however, one ignores the large
body of results available for general Gaussian rough paths: for instance, rough path
convergence of hyper-viscosity or Galerkin approximation, extensions to fractional
stochastic heat equations, concentration of measure can all be deduced from general
principles.
We now show that solutions to the stochastic heat equation induce a continuous
stochastic evolution in rough path space.
Theorem 12.33. There exists a continuous modification of the map $t \mapsto \boldsymbol\psi_t$ with
values in $\mathscr C^\alpha_g\big([0,2\pi],\mathbb R^d\big)$.
Proof. Fix s and t. The proof then proceeds in two steps. First, we will verify the assumptions of Corollary 10.6, namely we will show that
\[ \big|\varrho_\alpha(\boldsymbol\psi_s,\boldsymbol\psi_t)\big|_{L^q} \le C \sup_{x,y\in[0,2\pi]}\Big[\mathbf E\big(|\psi_s(x,y)-\psi_t(x,y)|^2\big)\Big]^{\theta}, \]
for some constant C that is independent of s and t. In the second step, we will show that (here we may assume d = 1), with $\psi_s(x,y) := \psi_s(y)-\psi_s(x)$, one has the bound
\[ \sup_{x,y\in[0,2\pi]}\mathbf E\big[|\psi_s(x,y)-\psi_t(x,y)|^2\big] = O\big(|t-s|^{1/2}\big). \]
k∈Z
\[ = \frac{\sigma^2}{4\pi}\sum_{k\in\mathbb Z}\frac{\cos(k(x-y))}{\lambda+k^2}\,e^{-(\lambda+k^2)|t-s|} =: R^{\tau}(x,y). \]
After integrating over $[u,v]^2$, we see that the error made above is actually of order $O(|v-u|^2)$. This is more than enough to conclude that
\[ \big\|R_{(X^1,Y^1)}\big\|_{1\text{-var};[u,v]^2} \le C|v-u| < \infty, \]
the question reduces to a similar bound on E|ψs1 (x)−ψt1 (x)|2 , uniform in x ∈ [0, 2π].
This quantity is equal to
where we used that $1 - e^{-cx} \le cx$ for c, x > 0 in the first sum. We then take $N \sim |t-s|^{-1/2}$, so that the first sum is of order $O(|t-s|^{1/2})$. For the second sum, we use the trivial bound $1 - e^{-(\lambda+k^2)|t-s|} \le 1$. It then suffices to note that
\[ \sum_{k\ge N}\frac{1}{\lambda+k^2} \le \sum_{k\ge N}\frac{1}{k^2} = O(1/N) = O\big(|t-s|^{1/2}\big), \]
also implies “almost 1/4-Hölder” temporal regularity of the stochastic heat equation.
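The two-sum splitting used in the proof can be illustrated numerically. The sketch below makes several assumptions that are not in the text: it takes the quantity to be bounded to be, up to constants, $\sum_{k\in\mathbb Z}(1-e^{-(\lambda+k^2)\tau})/(\lambda+k^2)$ with τ = |t − s|, it fixes λ = 1, and the names `increment_sum`, `split_bound` and the cutoff `k_max` are purely illustrative. It compares the (truncated) sum with the bound obtained by splitting at N = ⌈τ^{-1/2}⌉.

```python
import math

def increment_sum(tau, lam=1.0, k_max=100000):
    """Truncated sum over k in Z of (1 - exp(-(lam + k^2) tau)) / (lam + k^2)."""
    total = (1.0 - math.exp(-lam * tau)) / lam  # k = 0
    for k in range(1, k_max + 1):
        mu = lam + k * k
        total += 2.0 * (1.0 - math.exp(-mu * tau)) / mu  # k and -k
    return total

def split_bound(tau):
    """Bound from the splitting argument with N = ceil(tau^(-1/2))."""
    N = math.ceil(tau ** -0.5)
    low = (2 * N - 1) * tau                  # |k| < N, using 1 - exp(-c x) <= c x
    high = 2.0 * (1.0 / N ** 2 + 1.0 / N)    # |k| >= N, using sum_{k >= N} 1/k^2 <= 1/N^2 + 1/N
    return low + high

if __name__ == "__main__":
    for tau in (1e-1, 1e-2, 1e-3, 1e-4):
        s = increment_sum(tau)
        print(tau, s / math.sqrt(tau), split_bound(tau) / math.sqrt(tau))
```

Both printed ratios should remain bounded as τ decreases, which is the $O(|t-s|^{1/2})$ behaviour for second moments behind the “almost 1/4-Hölder” temporal regularity.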
12.4 Exercises
where Yt is the observation filtration and g is a suitably chosen test function. Measure
theory tells us that there exists a Borel-measurable map θtg : C([0, t], RdY ) → R,
such that a.s. πt (g) = θtg (Y ) where we consider Y = Y (ω) as a C([0, t], RdY )-
valued random variable. Note that θtg is not uniquely determined (after all, modifica-
tions on null sets are always possible). On the other hand, there is obvious interest to
have a robust filter, in the sense of having a continuous version of θtg , so that close
observations lead to nearby conclusions about the signal.
a) Give an example showing that, in general, θtg does not admit a continuous
version.
b) Let α ∈ (1/3, 1/2). Show that there exists a continuous map on rough path
space
Θtg : Cg0,α ([0, t], RdY ) → R ,
such that a.s.
πt (g) = Θtg (Y) , (12.46)
where Y is the random geometric rough path obtained from Y by iterated
Stratonovich integration.
Hint: You may use the “Kallianpur–Striebel formula”, a standard result in filtering
theory which asserts that
\[ \pi_t(g) = \frac{p_t(g)}{p_t(1)}, \qquad p_t(g) := \mathbf E_0\big[g(X_t,Y_t)\,v_t \,\big|\, \mathcal Y_t\big] \]
where
\[ \frac{d\mathbf P_0}{d\mathbf P}\bigg|_{\mathcal F_t} = \exp\bigg(-\sum_i\int_0^t h^i(X_s,Y_s)\,dW^i_s - \frac12\int_0^t \|h(X_s,Y_s)\|^2\,ds\bigg) \]
and v = {vt , t > 0} is defined as the right-hand side above with −W replaced by
Y.
Exercise 12.4 Show almost sure “(1/4 − ε)-Hölder” temporal regularity of ψ =
ψt (x; ω), solution to the stochastic heat equation. Show that, for fixed x, ψt (x; ω) is
not a semimartingale.
Exercise 12.5 (Spatial Itô–Stratonovich correction [HM12]) Writing T for the
interval [0, 2π] with periodic boundary, let us say that
u = u(t, x; ω) : [0, T ] × T × Ω → R
\[ u\,\frac{u(\cdot+\varepsilon)-u}{\varepsilon}\,; \]
Assume v ε = uε − ψ → v := u − ψ and its first (spatial) derivatives converge
locally uniformly in probability. Show that u is an analytically weak solution to
the perturbed equation
\[ \partial_t u = \partial_{xx}u + \frac12\,\partial_x u^2 + C + \xi \]
with C ≠ 0. Determine the value of C. Hint: Use Exercise 10.6.
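b) We note
\[ \frac12 D_{\varepsilon,r}u^2 = \frac12\,\frac{u(\cdot+\varepsilon)^2-u^2}{\varepsilon} = \frac12\,\frac{\big(u(\cdot+\varepsilon)+u\big)\big(u(\cdot+\varepsilon)-u\big)}{\varepsilon} = u\,\frac{u(\cdot+\varepsilon)-u}{\varepsilon} + \frac{1}{2\varepsilon}\big(u(\cdot+\varepsilon)-u\big)^2. \]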
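The discrete product rule just noted is purely algebraic and can be checked pointwise. Here is a small sketch; the function names and the choice u = sin are illustrative assumptions, and $D_{\varepsilon,r}$ is taken to be the right difference quotient $(f(\cdot+\varepsilon)-f)/\varepsilon$.

```python
import math

def half_difference_of_square(u, x, eps):
    """(1/2) D_{eps,r} u^2 at x, with D_{eps,r} f = (f(. + eps) - f) / eps."""
    return 0.5 * (u(x + eps) ** 2 - u(x) ** 2) / eps

def product_plus_correction(u, x, eps):
    """u * D_{eps,r} u + (u(. + eps) - u)^2 / (2 eps) at x."""
    du = u(x + eps) - u(x)
    return u(x) * du / eps + du ** 2 / (2.0 * eps)

if __name__ == "__main__":
    for x in (0.0, 0.3, 2.0):
        print(half_difference_of_square(math.sin, x, 1e-3),
              product_plus_correction(math.sin, x, 1e-3))
```

For smooth u the quadratic correction is of order ε and disappears in the limit; for u of Brownian-like spatial regularity it is of order one, which is the source of the constant C.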
It follows that
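\[ \partial_t\langle v^\varepsilon,\varphi\rangle = \langle v^\varepsilon,\partial_{xx}\varphi\rangle - \langle v^\varepsilon,\varphi\rangle + \Big\langle u^\varepsilon\,\frac{u^\varepsilon(\cdot+\varepsilon)-u^\varepsilon}{\varepsilon},\varphi\Big\rangle \]
\[ = \langle v^\varepsilon,\partial_{xx}\varphi\rangle - \langle v^\varepsilon,\varphi\rangle - \Big\langle\frac12(u^\varepsilon)^2, D_{\varepsilon,l}\varphi\Big\rangle - \Big\langle\frac{1}{2\varepsilon}\big(u^\varepsilon(\cdot+\varepsilon)-u^\varepsilon\big)^2,\varphi\Big\rangle. \]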
[uε (. + ε) − uε ] = ψ(. + ε) − ψ + v ε (. + ε) − v ε
= ψ(. + ε) − ψ + O(ε)
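Since $K(u) = \frac{\cosh(u-\pi)}{4\sinh(\pi)}$, we have $C = -2K'(0) = \frac12$, and it follows from Exercise 10.6 that
\[ \Big\langle\frac{1}{2\varepsilon}\big(\psi(\cdot+\varepsilon)-\psi\big)^2,\varphi\Big\rangle = \frac12\int\varphi(x)\,\frac{\psi^2_{x,x+\varepsilon}}{\varepsilon}\,dx \to \frac12\int\varphi(x)\,C\,dx = \Big\langle\frac14,\varphi\Big\rangle, \]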
12.5 Comments
Section 12.1: The explicit solution of the rough transport equation in Section 12.1.1
is a (geometric) rough-pathification of the classical method of characteristics and Ku-
nita’s (Stratonovich) stochastic version thereof [Kun84], first pointed out in [CF09].
on a suitable scale of Banach spaces, which satisfy an operator Chen relation and then
the (operator) geometricity condition A2s,t /2 = As,t . The rough transport equation,
say dut = Γ ut dW if written as initial value problem, then fits into an abstract rough
linear equation of the form
dut = A(dt)ut .
An analytically weak formulation (somewhat similar to our Section 12.1.2, but now
formulated via Banach duals) then allows them to obtain existence and uniqueness
under $C_b^3$ assumptions on the vector fields, at the price of a doubling of variables
argument related in spirit to DiPerna–Lions [DL89].
Entropy solutions to scalar conservation laws with rough forcing are studied by
Friz–Gess [FG16b]; in [HNS20] Hocquet et al. study a generalized Burgers equation
with rough transport noise. A different class of rough scalar conservation laws,
closely related to rough transport, is given by
one can rewrite (12.47) in its (formal) kinetic form: for T > 0 fixed,
dt χ + ∂u A(x, ξ) · Dx χ − divx A(x, ξ)∂ξ χ dW = (∂ξ m)dt , (12.49)
known as defect measure, which is part of the solution. The definition of rough
kinetic solution [GS15] is then given as analytically weak solution of (12.49), with
test functions obtained as (spatially) regular solutions to an auxiliary rough transport
equation, similar in spirit to Section 12.1.2. See also Gess et al. [GPS16] for a semi-
discretisation. The idea of test functions with (here: temporal) structure tailor-made
to a realisation of the noise (a.k.a. rough path) is central to RPDEs. A well-posedness
result for rough kinetic solutions was also obtained by Deya et al. [DGHT19b], in an
extended setting of RPDEs with (unbounded) rough drivers, of the form
where the abstract assumptions on the drift term µ are seen to accommodate the
defect measure. Rough Hamilton–Jacobi equations are of the form
We give a short introduction to the main concepts of the general theory of regularity
structures. This theory unifies the theory of (controlled) rough paths with the usual
theory of Taylor expansions and allows one to treat situations where the underlying space
is multidimensional.
13.1 Introduction
While a full exposition of the theory of regularity structures is well beyond the
scope of this book, we aim to give a concise overview of most of its concepts and
to show how the theory of controlled rough paths fits into it. In most cases, we will
only state results in a rather informal way and give some ideas as to how the proofs
work, focusing on conceptual rather than technical issues. The only exception is
the “reconstruction theorem”, Theorem 13.12 below, which is one of the linchpins
of the whole theory. Since its proof (or rather a slightly simplified version of it) is
relatively concise, we provide a fully self-contained version. For precise statements
and complete proofs of most of the results exposed here, we refer to the original
article [Hai14b]. See also the review articles [Hai15, Hai14a] for shorter expositions
that complement the one given here.
It should be clear by now that a controlled rough path $(Y,Y') \in \mathscr D^{2\alpha}_W$ bears a
strong resemblance to a differentiable function, with the Gubinelli derivative Y ′
describing the coefficient in front of a “first-order Taylor expansion” of the type
Compare this to the fact that a function f : R → R is of class $C^\gamma$ with γ ∈ (k, k+1) if for every s ∈ R there exist coefficients $f_s^{(1)},\dots,f_s^{(k)}$ such that
\[ f_t = f_s + \sum_{\ell=1}^{k} f_s^{(\ell)}(t-s)^{\ell} + O\big(|t-s|^{\gamma}\big). \tag{13.2} \]
244 13 Introduction to regularity structures
Of course, $f_s^{(\ell)}$ is nothing but the ℓth derivative of f at the point s, divided by ℓ!.
In this sense, one should really think of a controlled rough path $(Y,Y') \in \mathscr D^{2\alpha}_W$
Remark 13.2. In principle, the index set A can be infinite. By analogy with the
polynomials, it is then natural to interpret T as the set of all formal series of the form
$\sum_{\alpha\in A}\tau_\alpha$, where only finitely many of the $\tau_\alpha$’s are non-zero. This also dovetails
nicely with the particular form of elements in G. In practice however we will only
ever work with finite subsets of A so that the precise topology on T does not matter as
long as each of the Tα is finite-dimensional, which is the case in all of the examples
we will consider here.
T = ⟨τ1 , τ2 , . . . ⟩. (13.6)
1
In [Hai14b], T was called model space, somewhat in clash with the space of models.
We start with two simple special cases followed by the general polynomial structure.
Fix γ ∈ (0, 1) and consider a real-valued function belonging to the Hölder space
of exponent γ, say $f \in C^\gamma$. In other words, f : R → R, and $|f_x - f_y| \lesssim |y-x|^{\gamma}$
uniformly for x, y on compacts. The trivial regularity structure
\[ T = T_0 = \langle\mathbf 1\rangle \cong \mathbb R, \qquad G = \{\mathrm{Id}\}, \]
\[ x \mapsto \boldsymbol f(x) := f_x\,\mathbf 1. \]
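\[ T = T_0\oplus T_1\oplus T_2 = \langle\mathbf 1, X, X^2\rangle \cong \mathbb R^3, \]
with structure group $G = \{\Gamma_h \in L(T,T) : h \in (\mathbb R,+)\}$ where $\Gamma_h$ is given, with respect to the ordered basis $\mathbf 1, X, X^2$, by the matrix
\[ \Gamma_h \cong \begin{pmatrix} 1 & h & h^2 \\ 0 & 1 & 2h \\ 0 & 0 & 1 \end{pmatrix}. \]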
In other words,
Γh 1 = 1 , Γh X = X + h1 , Γh X 2 = (X + h1)2 ,
Γh P (X) = P (X + h1) , h ∈ Rd ,
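As a concrete illustration of the structure group in the one-dimensional case with basis (1, X, X²), the following sketch checks the group law $\Gamma_h\Gamma_k = \Gamma_{h+k}$ and the identity $\Gamma_h P(X) = P(X + h\mathbf 1)$ on coefficient vectors. The helper names `gamma`, `matmul`, `apply` and `evaluate` are our own illustrative choices, and polynomials are represented by their coefficients in increasing degree.

```python
def gamma(h):
    """Matrix of Γ_h in the ordered basis (1, X, X^2); columns are images of basis vectors."""
    return [[1.0, h, h * h],
            [0.0, 1.0, 2.0 * h],
            [0.0, 0.0, 1.0]]

def matmul(A, B):
    """Plain matrix product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][m] * B[m][j] for m in range(n)) for j in range(n)] for i in range(n)]

def apply(A, v):
    """Apply the matrix A to a coefficient vector v = (a0, a1, a2)."""
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

def evaluate(coeffs, x):
    """Evaluate a0 + a1 x + a2 x^2 at the real number x."""
    return sum(c * x ** n for n, c in enumerate(coeffs))

if __name__ == "__main__":
    P = [2.0, -1.0, 3.0]              # P(X) = 2 - X + 3 X^2
    Q = apply(gamma(0.7), P)          # coefficients of P(X + 0.7)
    print(evaluate(Q, 1.0), evaluate(P, 1.7))
```

The matrix acts on coefficient vectors, so applying `gamma(h)` re-expands the same polynomial around a shifted base point, which is exactly the role the structure group plays in general.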
We start again from simple examples. What structure would be appropriate for Young
integration? Fix α ∈ (0, 1) and consider the problem of integrating a (continuous) path Y against a scalar $W \in C^\alpha$. In the case of smooth W, the indefinite integral $Z = \int Y\,dW$ exists in the Riemann–Stieltjes sense and one has $\dot Z = Y\dot W$. In general, $\dot W$ only exists as a distribution, more precisely an element of the negative Hölder space $C^{\alpha-1}$. A regularity structure allowing one to describe this situation is given by
\[ s \mapsto \dot Z(s) := Y_s\,\dot W. \]
The right-hand side above is some sort of Taylor expansion, based on W ∈ C α , which
describes Y well near the (time) point s. We want to formalise this by attaching to
each time s the “jet”
Y (s) := Ys 1 + Ys′ W .
\[ T = T_0\oplus T_\alpha = \langle\mathbf 1\rangle\oplus\langle W\rangle \cong \mathbb R^2, \]
Γh 1 = 1 , Γh W = W + h1 .
this suggests (rather informally at this stage), that in the vicinity of any fixed time s,
the distributional derivative of Z should have an expansion of the type
The case of multi-component rough paths just needs more basis vectors Ẇ i , Ẇj,k ,
W l (with 1 ≤ i, j, k, l ≤ e). This suggests the following definition.
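Definition 13.4. Let α ∈ (1/3, 1/2]. The regularity structure for α-Hölder rough paths (over $\mathbb R^e$) is given by $T = T_{\alpha-1}\oplus T_{2\alpha-1}\oplus T_0\oplus T_\alpha \cong \mathbb R^{e+e^2+1+e}$ with
\[ T_0 = \langle\mathbf 1\rangle, \quad T_\alpha = \langle W^1,\dots,W^e\rangle, \quad T_{\alpha-1} = \langle\dot W^1,\dots,\dot W^e\rangle, \quad T_{2\alpha-1} = \langle\dot{\mathbb W}^{ij} : 1\le i,j\le e\rangle, \]
\[ \Gamma_h\mathbf 1 = \mathbf 1, \quad \Gamma_hW^i = W^i + h^i\mathbf 1, \quad \Gamma_h\dot W^i = \dot W^i, \quad \Gamma_h\dot{\mathbb W}^{ij} = \dot{\mathbb W}^{ij} + h^i\dot W^j. \tag{13.12} \]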
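That the maps in (13.12) compose as a group, $\Gamma_h\Gamma_k = \Gamma_{h+k}$, can be checked mechanically. The sketch below is only an illustration: the encoding of basis symbols as tuples (`("1",)`, `("W", i)`, `("dW", i)`, `("dWW", i, j)`) and the helper names `gamma`, `compose` and `close` are assumptions made for this example, with h ranging over $\mathbb R^e$.

```python
def gamma(h):
    """Γ_h from (13.12), stored as: basis symbol -> image (dict symbol -> coefficient)."""
    e = len(h)
    images = {("1",): {("1",): 1.0}}
    for i in range(e):
        images[("W", i)] = {("W", i): 1.0, ("1",): h[i]}   # Γ_h W^i = W^i + h^i 1
        images[("dW", i)] = {("dW", i): 1.0}               # Γ_h Ẇ^i = Ẇ^i
        for j in range(e):
            # Γ_h of the second-level symbol (i, j) adds h^i times Ẇ^j
            images[("dWW", i, j)] = {("dWW", i, j): 1.0, ("dW", j): h[i]}
    return images

def compose(A, B):
    """Basis images of the composition A∘B (apply B first, then A), extended linearly."""
    result = {}
    for symbol, image in B.items():
        acc = {}
        for s, c in image.items():
            for s2, c2 in A[s].items():
                acc[s2] = acc.get(s2, 0.0) + c * c2
        result[symbol] = acc
    return result

def close(A, B, tol=1e-12):
    """Compare two linear maps given by their basis images."""
    return A.keys() == B.keys() and all(
        abs(A[s].get(t, 0.0) - B[s].get(t, 0.0)) <= tol
        for s in A for t in set(A[s]) | set(B[s]))

if __name__ == "__main__":
    h, k = (0.5, -2.0), (1.5, 0.25)
    print(close(compose(gamma(h), gamma(k)), gamma((2.0, -1.75))))
```

Since each $\Gamma_h$ only adds multiples of symbols of strictly lower degree, the maps are unipotent, in line with the general requirements on a structure group.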
It will be seen later in Proposition 13.21 that in this framework the function $\dot{\boldsymbol Z}$ defined in (13.11) does indeed give rise naturally to $\dot Z$, the distributional derivative of the indefinite rough integral $\int Y\,d\mathbf W$.
In a Brownian (rough path) context, one has Hölder regularity with exponent
α = 1/2 − κ, for arbitrarily small κ > 0. The above index set A, relevant for a
“regularity structure view” on stochastic integration, then becomes $A = \{-\tfrac12-\kappa,\ -2\kappa,\ 0,\ \tfrac12-\kappa\}$, which, in abusive but convenient notation, we write as
13.3 Definition of a model and first examples 249
\[ A = \Big\{ -\tfrac12^{-},\ 0^{-},\ 0,\ \tfrac12^{-} \Big\}. \]
Index sets of this form (“half-integers− ”) will also be typical in later SPDE situations
driven by spatial or space-time white noise.
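\[ \Pi : \mathbb R^d \to L\big(T,\mathcal D'(\mathbb R^d)\big),\ \ x\mapsto\Pi_x, \qquad \Gamma : \mathbb R^d\times\mathbb R^d \to G,\ \ (x,y)\mapsto\Gamma_{xy}, \]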
such that Γxy Γyz = Γxz and Πx Γxy = Πy . Write r for the smallest integer such that
r > |min A| ≥ 0 and impose that for every compact set K ⊂ Rd and every γ > 0,
there exists a constant C = C(K, γ) such that the bounds
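\[ \big|(\Pi_x\tau)(\varphi^\lambda_x)\big| \le C\lambda^{\alpha}\|\tau\|_\alpha, \qquad \|\Gamma_{xy}\tau\|_\beta \le C|x-y|^{\alpha-\beta}\|\tau\|_\alpha, \tag{13.13} \]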
One very important remark is that the space M of all models for a given regularity
structure is not a linear space. However, it can be viewed as a closed subset (deter-
mined by the nonlinear constraints Γxy ∈ G, Γxy Γyz = Γxz , and Πy = Πx Γxy )
of the linear space with seminorms (indexed by the compact set K and the upper
bound γ) given by the smallest constant C in (13.13). In particular, there is a natural
collection of “distances” between models (Π, Γ ) and (Π̄, Γ̄ ) given by the smallest
constant C in (13.13), when replacing Πx by Πx − Π̄x and Γxy by Γxy − Γ̄xy .
Since this collection is essentially countable (consider for example the sequence of pseudometrics $d_n$ corresponding to the choices $(K_n,\gamma_n)$ with $K_n$ the centred ball of radius n and $\gamma_n = n$), it determines a metrisable topology (take for example $d = \sum_{n\ge1}2^{-n}(d_n\wedge 1)$).
Remark 13.6. The precise choice of r in Definition 13.5 is not very important, as
one can see that any other choice r > |min A| ≥ 0 leads to the same definition. See
Lemma 14.13 for a similar statement in the context of Hölder spaces.
Remark 13.7. The test functions appearing in (13.13) are smooth. It turns out that if
these bounds hold for smooth elements of Br , then Πx τ can be extended canonically
to allow any C r test function with compact support.
Remark 13.8. The identity Πx Γxy = Πy reflects the fact that Γxy is the linear map
that takes an expansion around y and turns it into an expansion around x. The first
bound in (13.13) states what we mean precisely when we say that τ ∈ Tα represents
a term that vanishes at order α. (See Exercise 13.2; note that α can be negative, so
that this may actually not vanish at all!) The second bound in (13.13) is very natural
in view of both (13.3) and (13.4). It states that when expanding a monomial of order
α around a new point at distance h from the old one, the coefficient appearing in
front of lower-order monomials of order β is of order at most hα−β .
Remark 13.9. In many cases of interest, it is natural to scale the different directions of
Rd in a different way. This is the case for example when using the theory of regularity
structures to build solution theories for parabolic stochastic PDEs, in which case
the time direction “counts as” two space directions. This “parabolic scaling” can be
formalised by the integer vector (2, 1, . . . , 1). More generally, one can introduce a scaling s of $\mathbb R^d$, which is just a collection of d scalars $s_i \in [1,\infty)$, and define $\varphi^\lambda_x$ in such a way that the ith direction is scaled by $\lambda^{s_i}$. The polynomial structure introduced earlier, in particular (13.7), should be changed accordingly by postulating that the degree of $X^k$ is given by $|k|_s = \sum_{i=1}^d s_ik_i$. In this case, the Euclidean distance between two points $x, y \in \mathbb R^d$ should be replaced everywhere by the corresponding scaled distance $|x-y|_s = \sum_i |x_i-y_i|^{1/s_i}$. See [Hai14b] for more details.
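The scaled degree and scaled distance of this remark are straightforward to compute. The following is a minimal sketch; the function names `scaled_degree` and `scaled_distance` are illustrative, and the example uses the parabolic scaling s = (2, 1, 1) in one time and two space dimensions.

```python
def scaled_degree(k, s):
    """|k|_s = sum_i s_i k_i for a multi-index k and a scaling s."""
    return sum(si * ki for si, ki in zip(s, k))

def scaled_distance(x, y, s):
    """|x - y|_s = sum_i |x_i - y_i|^(1 / s_i)."""
    return sum(abs(xi - yi) ** (1.0 / si) for xi, yi, si in zip(x, y, s))

if __name__ == "__main__":
    s = (2, 1, 1)                      # parabolic scaling: time counts twice
    print(scaled_degree((1, 2, 0), s))
    print(scaled_distance((0.0, 0.0, 0.0), (4.0, 3.0, 0.0), s))
```

With this choice, rescaling the time coordinate by λ² and the space coordinates by λ multiplies the scaled distance by λ, which is the homogeneity that makes the “parabolic scaling” natural.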
Remark 13.11. (Compare with Remark 4.8 in the rough path context.) It is important to note that while the space of models $\mathscr M$ is not a linear space, the space $\mathscr D^\gamma_M$ is a linear (in fact: Fréchet) space given a model $M \in \mathscr M$. The twist of course is that the space in question depends in a crucial way on the choice of M. The total space then is the disjoint union
\[ \mathscr M\ltimes\mathscr D^\gamma \overset{\text{def}}{=} \bigsqcup_{M\in\mathscr M}\{M\}\times\mathscr D^\gamma_M, \]
with base space $\mathscr M$ and “fibres” $\mathscr D^\gamma_M$.
The most fundamental result in the theory of regularity structures then states that
given f ∈ D γ with γ > 0, there exists a unique distribution Rf on Rd such that, for
every x ∈ Rd , Rf “looks like Πx f (x) near x”. More precisely, one has
Remark 13.13. With a look to Remark 13.11, and M = (Π, Γ ) ∈ M , one should
really view R = RM f as a map from M ⋉ D γ into D′ . Since the space M ⋉ D γ is
not a linear space, this shows that the map R isn’t actually linear, despite appearances.
However, the map (Π, Γ, f ) 7→ Rf turns out to be locally Lipschitz continuous
provided that the distance between (Π, Γ, f ) and (Π̄, Γ̄ , f¯) is given by the smallest
constant C such that
Here, in order to obtain bounds on Rf − R̄f¯ (ψ) for some smooth compactly
supported test function ψ, the above bounds should hold uniformly for x and y in a
neighbourhood of the support of ψ. The proof that this stronger continuity property
also holds is actually crucial when showing that sequences of solutions to mollified
equations all converge to the same limiting object. However, its proof is somewhat
more involved which is why we chose not to give it here but refer instead to [Hai14b,
Thm 3.10].
Remark 13.14. There are obvious analogies between the construction of the recon-
struction operator R and that of the “rough integral” in Section 4. As a matter of fact,
there exists a slightly more abstract formulation of the reconstruction theorem which
can be interpreted as a multidimensional analogue to the sewing lemma, Lemma 4.2,
see [Hai14b, Prop. 3.25].
Remark 13.15. The reconstruction theorem with γ < 0 allows one to recover the
Lyons–Victoir extension theorem previously obtained in Exercise 2.14, see also
Exercise 13.6. Note that the reconstruction theorem does not hold for γ = 0 (even if
we forego uniqueness of R), for the same reason that the Lyons–Victoir extension
theorem fails for α = 1/2 (and more generally when 1/α ∈ N).
We postpone the proof of the reconstruction theorem to Section 13.4 and turn instead
to our previous list of regularity structures, now adding the relevant models and
indicating the interest of the reconstruction map.
\[ \Pi_x X^k = \big( y \mapsto (y - x)^k \big) , \qquad \Gamma_{xy} = \Gamma_h \big|_{h = x - y} . \]
We leave it as an exercise to the reader to verify that this does indeed satisfy the
bounds and relations of Definition 13.5.
In the sense of the following proposition, modelled distributions in the context of
the polynomial model are nothing but classical Hölder functions.
\[ \mathbf{f}(x) = f(x) \mathbf{1} + \sum_{1 \le |k| \le n} \frac{f^{(k)}(x)}{k!} X^k . \]
uniformly over all φ ∈ Br and λ ∈ (0, 1], and locally uniformly in x. Given any
compact set K, the best possible constant such that the above bound holds uniformly
over x ∈ K yields a seminorm. The collection of these seminorms endows C −α with
a Fréchet space structure.
Remark 13.17. In terms of the scale of classical Besov spaces, the space $\mathcal{C}^{-\alpha}$ is a
local version of $B^{-\alpha}_{\infty,\infty}$. It is in some sense the largest space of distributions that is
invariant under the scaling $\varphi(\cdot) \mapsto \lambda^{-\alpha} \varphi(\lambda^{-1} \cdot)$, see for example [BP08].
B : C β × C −α → D′ (Rd )
Proof. Assume from now on that g = ξ ∈ C −α for some α > 0 and that f ∈ C β
for some β > α. We then build a regularity structure T in the following way. For
the index set A, we take A = N ∪ (N − α) and for T , we set T = V ⊕ W , where
each one of the spaces V and W is a copy of the polynomial regularity structure (in
with the obvious abuse of notation in the second expression. It is then straightforward
to verify that Πy = Πx ◦ Γxy and that the relevant analytical bounds are satisfied, so
that this is indeed a model.
Denote now by Rξ the reconstruction map associated to the model (Π ξ , Γ ) and,
for f ∈ C β , denote by f the element in D β given by the local Taylor expansion of
f of order β at each point. Note that even though the space D β does in principle
depend on the choice of model, in our situation f ∈ D β for any choice of ξ. It
follows immediately from the definitions that the map x ↦ Ξf (x) belongs to D β−α
so that, provided that β > α, one can apply the reconstruction operator to it. This
suggests that the multiplication operator we are looking for can be defined as
B(f, ξ) = Rξ Ξf .
Let us see now how some of the results of Section 4 can be reinterpreted in the light
of this theory. Fix α ∈ (1/3, 1/2] and let $T$ be the rough path regularity structure
put forward in Definition 13.4. Recall that this means that $T_0 = \langle \mathbf{1} \rangle$, $T_\alpha$ and $T_{\alpha-1}$
are copies of $\mathbf{R}^e$ with respective basis vectors $W^j$ and $\dot W^j$, and $T_{2\alpha-1}$ is a copy of
$\mathbf{R}^{e \times e}$ with basis vectors $\dot{\mathbb{W}}^{ij}$. The structure group G is isomorphic to $\mathbf{R}^e$ and, for
$h \in \mathbf{R}^e$, acts on $T$ via
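\[ \Gamma_h \mathbf{1} = \mathbf{1} , \quad \Gamma_h \dot W^i = \dot W^i , \quad \Gamma_h W^i = W^i + h^i \mathbf{1} , \quad \Gamma_h \dot{\mathbb{W}}^{ij} = \dot{\mathbb{W}}^{ij} + h^i \dot W^j . \tag{13.16} \]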
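The group property Γ_h Γ_k = Γ_{h+k} implicit in (13.16) can be verified by hand or numerically. The sketch below does so in the simplest case e = 1, writing Γ_h as a matrix in the basis (1, W, Ẇ, 𝕎̇); this basis ordering and the function name `gamma` are assumptions made only for this illustration.

```python
import numpy as np

def gamma(h):
    """Matrix of Gamma_h for e = 1 in the basis (1, W, W_dot, WW_dot).

    Columns hold the images of the basis vectors, following (13.16):
    Gamma_h W = W + h 1 and Gamma_h WW_dot = WW_dot + h W_dot,
    while 1 and W_dot are left unchanged.
    """
    G = np.eye(4)
    G[0, 1] = h   # component of Gamma_h W along 1
    G[2, 3] = h   # component of Gamma_h WW_dot along W_dot
    return G

h, k = 0.7, -1.3
group_law_holds = bool(np.allclose(gamma(h) @ gamma(k), gamma(h + k)))
inverse_holds = bool(np.allclose(gamma(h) @ gamma(-h), np.eye(4)))
```

Both checks succeed because the off-diagonal part of the matrix squares to zero, which is the matrix form of the fact that Γ_h only adds terms of strictly lower degree.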
Let now $\mathbf{W} = (W, \mathbb{W})$ be an α-Hölder continuous rough path over $\mathbf{R}^e$. It turns out
that this defines a model for T in the following way:
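Lemma 13.20. Given an α-Hölder continuous rough path $\mathbf{W}$, one can define a model
$M = M_{\mathbf{W}}$ for $T$ on $\mathbf{R}$ by setting $\Gamma_{t,s} = \Gamma_{W_{s,t}}$ and
\[ (\Pi_s \mathbf{1})(t) = 1 , \qquad (\Pi_s W^j)(t) = W^j_{s,t} , \]
\[ (\Pi_s \dot W^j)(\psi) = \int \psi(t) \, dW^j_t , \qquad (\Pi_s \dot{\mathbb{W}}^{ij})(\psi) = \int \psi(t) \, d\mathbb{W}^{ij}_{s,t} . \]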
Here, both integrals are perfectly well-defined Riemann integrals, with the differential
in the second case taken with respect to the variable t. Given a controlled rough path
$(Y, Y') \in \mathscr{D}^{2\alpha}_W$, this then defines an element $Y \in \mathcal{D}^{2\alpha}_M$
by
Proof. We first check that the algebraic properties of Definition 13.5 are satisfied.
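It is clear that $\Gamma_{s,u} \Gamma_{u,t} = \Gamma_{s,t}$ and that $\Pi_s \Gamma_{s,u} \tau = \Pi_u \tau$ for $\tau \in \{\mathbf{1}, W^j, \dot W^j\}$.
Regarding $\dot{\mathbb{W}}^{ij}$, we differentiate Chen’s relations (2.1) which yields the identity
\[ d\mathbb{W}^{i,j}_{s,t} = d\mathbb{W}^{i,j}_{u,t} + W^i_{s,u} \, dW^j_t . \]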
The last missing algebraic relation then follows at once. The required analytic bounds
follow immediately (exercise!) from the definition of the rough path space C α .
Regarding the function Y defined in the statement, we have
so that the condition (13.14) with γ = 2α does indeed coincide with the definition of
a controlled rough path. □
Theorems 4.4 and 4.10 can then be recovered as a particular case of the recon-
struction theorem in the following way.
and such that $Z_{s,t} = Y(s) \, W^j_{s,t} + Y'_i(s) \, \mathbb{W}^{i,j}_{s,t} + O(|t - s|^{3\alpha})$.
By a simple approximation argument, see Exercise 13.10, one can take for ψ the
indicator function of the interval [0, 1], so that
\[ \eta(\mathbf{1}_{[s,t]}) = Y(s) \, W^j_{s,t} + Y'_i(s) \, \mathbb{W}^{i,j}_{s,t} + O(|t - s|^{3\alpha}) . \]
Here, the reason why one obtains an exponent 3α rather than 3α − 1 is that it is
really $|t - s|^{-1} \mathbf{1}_{[s,t]}$ that scales like an approximate δ-distribution as t → s. □
Remark 13.22. Using the formula (13.26), it is straightforward to verify that if $W$
happens to be a smooth function and $\mathbb{W}$ is defined from $W$ via (2.2), but this time
viewing it as a definition for the right-hand side, with the left-hand side given by
a usual Riemann integral, then the function Z constructed in Proposition 13.21
coincides with the usual Riemann integral of Y against W j .
Remark 13.23. The theory of (controlled) rough paths of lower regularity already
hinted at in Section 2.4 can be recovered from the reconstruction operator and a
suitable choice of regularity structure (essentially two copies of the truncated tensor
algebra) in virtually the same way.
where ∗ denotes convolution. We also set $\varphi^{(n)} = \lim_{m \to \infty} \varrho^{(n,m)}$, so that in particular
$\varphi^{(n)} = \varrho^{(n)} * \varphi^{(n+1)}$ and we write $\varrho^{(n)}_x(y) = \varrho^{(n)}(y - x)$ and similarly for $\varphi^{(n)}_x$;
see Exercise 13.7 to see that the limit $\varphi^{(n)}$ exists and belongs to $\mathcal{C}^\infty_c$. We then have
the following preliminary lemma.
Proof. Let λ ∈ (0, 1] and let ψλ be a test function that is supported in the ball of
radius λ and such that |Dk ψ| ≤ λ−d−|k| for all |k| ≤ α + 1. In order to show that
ξn is Cauchy in C −β it then suffices to exhibit a bound of the type
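\[ \Big| \psi_\lambda(y) - T^{(\alpha)}_x(y) \Big| \stackrel{\text{def}}{=} \Big| \psi_\lambda(y) - \sum_{|k| \le \alpha} \frac{D^k \psi_\lambda(x)}{k!} (y - x)^k \Big| \lesssim \lambda^{-N-d} |y - x|^N , \tag{13.20} \]
where $N = \lceil \alpha \rceil$. Since, by (13.17), one has $\varrho^{(n)} * T^{(\alpha)}_x = T^{(\alpha)}_x$ and since $T^{(\alpha)}_x(x) =
\psi_\lambda(x)$, one has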
showing that ξn = φ(n) ∗ ξ as required. (Here we use the fact that the convergence
ϱ(n,m) → φ(n) takes place in C r for r = rβ by Exercise 13.7.)
The proof of the second claim follows the same lines. We write
\[ \xi(\psi^\lambda_x) = \xi_n(\psi^\lambda_x) + \sum_{k \ge n} (\xi_{k+1} - \xi_k)(\psi^\lambda_x) , \]
Since N > α and N > −γ, this is summable and its sum is again of order $\lambda^\gamma$, thus
concluding the proof. □
Remark 13.25. Note the strong similarity of this setting with that of multiresolution
analysis [Mey92]: the image of the convolution operator with φ(n) plays the role of
Vn and convolution with ϱ(n) plays the role of the projection Vn+1 → Vn .
Let us now restate the reconstruction theorem for the reader’s convenience. (We
only consider the case γ > 0 here.)
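Theorem 13.26. Let $T$ be a regularity structure as above and let (Π, Γ) be a model
for $T$ on $\mathbf{R}^d$. Then, for γ > 0, there exists a unique linear map $\mathcal{R} : \mathcal{D}^\gamma \to \mathcal{D}'(\mathbf{R}^d)$
such that
\[ \big| \big( \mathcal{R} f - \Pi_x f(x) \big)(\psi^\lambda_x) \big| \lesssim \lambda^\gamma , \]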
13.4 Proof of the reconstruction theorem 259
uniformly over ψ ∈ Br and λ ∈ (0, 1], and locally uniformly in x. The statement still
holds for γ < 0, except that uniqueness fails.
Proof. We first define operators R(m,m) by
The idea then is to obtain R as the limit of R(m,m) as m → ∞. This however turns
out not to be that easy to obtain directly. Instead, we try to make use of Lemma 13.24
and define, for m > n,
At this stage we note that, as a consequence of the analytical bounds (13.13) imposed
in the definition of a model, the quantity $\Pi_y \tau(\varphi^{(m+1)}_z)$ is bounded by
$C 2^{-\alpha m} \|\tau\|_\alpha$, uniformly over $|y - z| \lesssim 2^{-m}$ and $\tau \in T_\alpha$. On the other hand,
the definition of the spaces $\mathcal{D}^\gamma$ guarantees that the component of $f(y) - \Gamma_{yz} f(z)$
in $T_\alpha$ is bounded by $2^{(\alpha - \gamma)m}$, again uniformly over $|y - z| \lesssim 2^{-m}$. Since
$\int |\varrho^{(n,m-1)}_x(y)| \, dy \lesssim 1$, uniformly over m and n, we conclude that
\[ \big\| \mathcal{R}^{(n,m)} f - \mathcal{R}^{(n,m+1)} f \big\|_{L^\infty} \lesssim 2^{-\gamma m} , \tag{13.22} \]
where α denotes the smallest degree in the ambient regularity structure. It follows
that R(n) f = limm→∞ R(n,m) f is well-defined and also satisfies the bound (13.23).
Since the identity
R(n,m) f = ϱ(n) ∗ R(n+1,m) f
holds for every m ≥ n + 1, it follows that R(n) f = ϱ(n) ∗ R(n+1) f , so that
Rf = limn→∞ R(n) f exists in C α for every α < α by Lemma 13.24.
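It remains to show that one has the bound
\[ \big| \big( \mathcal{R} f - \Pi_x f(x) \big)(\psi^\lambda_x) \big| \lesssim \lambda^\gamma . \tag{13.24} \]
For this, we note first that if we define $f_x \in \mathcal{D}^\gamma$ by $f_x(y) = \Gamma_{yx} f(x)$, then one has
$\mathcal{R}^{(n)} f_x = \varphi^{(n)} * \Pi_x f(x)$, so that (13.24) can be written as
\[ \big| \mathcal{R} \big( f - f_x \big)(\psi^\lambda_x) \big| \lesssim \lambda^\gamma . \tag{13.25} \]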
By (13.22) the same bound also holds for R(n) , so that the claim follows from the
second part of Lemma 13.24.
The case γ < 0 works in a similar way, but this time we explicitly define
\[ \mathcal{R} f = \mathcal{R}^{(0,0)} f + \sum_n \big( \varrho^{(n)} - \delta \big) * \mathcal{R}^{(n,n)} f , \]
where δ denotes the Dirac delta-distribution. We leave it as an exercise for the reader
to verify that this sum does indeed converge in C α for every α < α and that the limit
satisfies the required bound. □
13.5 Exercises
Exercise 13.1 a) Relate Theorem 13.18, in case d = 1, with the Young integral.
b) Draw inspiration from Weierstrass’s construction of a continuous nowhere
differentiable function to construct examples demonstrating the “only if” part of
Theorem 13.18.
Exercise 13.2 (Hölder spaces) For k ∈ N and α ∈ (0, 1), it is customary to define
C k+α as the space of k times continuously differentiable functions f : Rd → R such
that their derivatives of order k are α-Hölder continuous. Show that this agrees with
the obvious extension to Rd of the definition given earlier in (13.2).
Exercise 13.3 Show that in general, the function Z from Proposition 13.21 coincides,
up to an additive constant, with the rough integral $\int_0^t Y(s) \, dX^j_s$, in the sense of
Remark 4.12.
♯ Exercise 13.4 Let γ̄ ≥ γ > 0 and let f ∈ C(Rd , T<γ̄ ) be such that the “modelled
distribution” bound (13.14) holds for every α < γ.
∥f ∥D γ < ∞ .
Show that fx,τ ∈ D γ and that one has Rfx,τ = Πx τ . Use this to give another proof
of Lyons’ extension theorem (Exercise 4.6).
13.6 Comments
The original motivation for the development of the theory of regularity structures
was to provide robust solution theories for singular stochastic PDEs like the KPZ
equation or the dynamical $\Phi^4_3$ model. The idea is to reformulate them as fixed point
problems in some space D γ (or rather a slightly modified version that takes into
account possible singular behaviour near time 0) based on a suitable random model
in a regularity structure purpose-built for the problem at hand. In order to achieve
this, this chapter provides a systematic way of formulating the standard operations
arising in the construction of the corresponding fixed point problem (differentiation,
multiplication, composition by a regular function, convolution with the heat kernel)
as operations on the spaces D γ .
14.1 Differentiation
264 14 Operations on modelled distributions
for some δ > 0. Here, we defined ψxλ as before. By the assumption on the model Π,
we have the identity
One of the main purposes of the theory presented here is to give a robust way to
multiply distributions (or functions with distributions) that goes beyond the barrier
illustrated by Theorem 13.18. Provided that our functions / distributions are repre-
sented as elements in D γ for some model and regularity structure, we can multiply
their “Taylor expansions” pointwise, provided that we give ourselves a table of
multiplication on T .
It is natural to consider products with the following properties.
Definition 14.3. Given a regularity structure (T, G) and two sectors V, V̄ ⊂ T , a
product on (V, V̄ ) is a bilinear map ⋆ : V × V̄ → T such that, for any τ ∈ Vα and
τ̄ ∈ V̄β , one has τ ⋆ τ̄ ∈ Tα+β and such that, for any element Γ ∈ G, one has
Γ (τ ⋆ τ̄ ) = Γ τ ⋆ Γ τ̄ .
Remark 14.4. The condition that degrees add up under multiplication is very natural,
bearing in mind the case of the polynomial regularity structure. The second condition
is also very natural since it merely states that if one reexpands the product of two
“polynomials” around a different point, one should obtain the same result as if one
reexpands each factor first and then multiplies them together.
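For the polynomial regularity structure in one variable, with Γ_h acting by X ↦ X + h, the compatibility condition Γ(τ ⋆ τ̄) = Γτ ⋆ Γτ̄ is just the statement that re-expanding a product of polynomials agrees with multiplying the re-expanded factors. The following sketch checks this numerically; the helper name `gamma_h` and the sample coefficients are hypothetical.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def gamma_h(coeffs, h):
    """Re-expand sum_k c_k X^k as sum_k c_k (X + h)^k.

    Coefficients are listed in increasing degree, as in numpy.polynomial.
    """
    out = np.zeros(1)
    for k, c in enumerate(coeffs):
        out = P.polyadd(out, c * P.polypow([h, 1.0], k))
    return out

tau = [1.0, 2.0, 0.0, 3.0]        # 1 + 2 X + 3 X^3
tau_bar = [0.0, -1.0, 4.0]        # -X + 4 X^2
h = 0.5

lhs = gamma_h(P.polymul(tau, tau_bar), h)                 # Gamma(tau * tau_bar)
rhs = P.polymul(gamma_h(tau, h), gamma_h(tau_bar, h))     # Gamma(tau) * Gamma(tau_bar)
compatible = bool(np.allclose(lhs, rhs))
```

Degrees also add up here, since the product of monomials of degrees a and b is a monomial of degree a + b.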
14.2 Products and composition by regular functions 265
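Given such a product, we can ask ourselves when the pointwise product of an
element in $\mathcal{D}^{\gamma_1}$ with an element in $\mathcal{D}^{\gamma_2}$ again belongs to some $\mathcal{D}^\gamma$. In order to answer
this question, we introduce the notation $\mathcal{D}^\gamma_\alpha$ to denote those elements $f \in \mathcal{D}^\gamma$ such
that furthermore
\[ f(x) \in T_{\ge \alpha} = \bigoplus_{\beta \ge \alpha} T_\beta , \]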
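Theorem 14.5. Let $f_1 \in \mathcal{D}^{\gamma_1}_{\alpha_1}(V)$, $f_2 \in \mathcal{D}^{\gamma_2}_{\alpha_2}(\bar V)$, and let ⋆ be a product on $(V, \bar V)$.
Then, the function f given by $f(x) = f_1(x) \star f_2(x)$ belongs to $\mathcal{D}^\gamma_\alpha$ with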
It follows from the properties of the product ⋆ that the first term in (14.2) is bounded
by a constant times
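\[ \sum_{\beta_1 + \beta_2 = \beta} \|\Gamma_{xy} f_1(y) - f_1(x)\|_{\beta_1} \, \|\Gamma_{xy} f_2(y) - f_2(x)\|_{\beta_2} \lesssim \sum_{\beta_1 + \beta_2 = \beta} \|x - y\|^{\gamma_1 - \beta_1} \|x - y\|^{\gamma_2 - \beta_2} \lesssim \|x - y\|^{\gamma_1 + \gamma_2 - \beta} . \]
\[ \lesssim \|x - y\|^{\gamma_1 + \alpha_2 - \beta} , \]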
Remark 14.6. Strictly speaking, it is the projection of $f(x) = f_1(x) \star f_2(x)$ to $T_{<\gamma}$
that belongs to $\mathcal{D}^\gamma_\alpha$, see Exercise 13.4.
Remark 14.7. It is clear that the formula (14.1) for γ is optimal in general as can
be seen from the following two “reality checks”. First, consider the case of the
polynomial model and take $f_i \in \mathcal{C}^{\gamma_i}$. In this case, the (abstract) truncated Taylor
series $\mathbf{f}_i$ for $f_i$ belong to $\mathcal{D}^{\gamma_i}_0$. It is clear that in this case, the product cannot be
expected to have better regularity than $\gamma_1 \wedge \gamma_2$ in general, which is indeed what (14.1)
states. The second reality check comes from (the proof of) Theorem 13.18. In this
case, with β > α ≥ 0, one has $\mathbf{f} \in \mathcal{D}^\beta_0$, while the constant function $x \mapsto \Xi$ belongs
to $\mathcal{D}^\infty_{-\alpha}$ so that, according to (14.1), one expects their product to belong to $\mathcal{D}^{\beta - \alpha}_{-\alpha}$,
which is indeed the case.
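The two reality checks can be replayed mechanically. The sketch below assumes that (14.1) takes the form α = α₁ + α₂ and γ = (γ₁ + α₂) ∧ (γ₂ + α₁), which is how the corresponding statement reads in [Hai14b]; this explicit form is an assumption of the example, and the function name is hypothetical.

```python
import math

def product_exponents(alpha1, gamma1, alpha2, gamma2):
    """Exponents (alpha, gamma) of a product, under the assumed form of (14.1):
    alpha = alpha1 + alpha2, gamma = min(gamma1 + alpha2, gamma2 + alpha1)."""
    return alpha1 + alpha2, min(gamma1 + alpha2, gamma2 + alpha1)

# First check: two Hoelder functions in the polynomial model (alpha_i = 0).
g1, g2 = 1.5, 0.7
check_polynomial = product_exponents(0.0, g1, 0.0, g2)

# Second check: f in D^beta_0 multiplied by the constant symbol Xi in D^infinity_{-alpha}.
beta, alpha = 0.8, 0.5
check_noise = product_exponents(0.0, beta, -alpha, math.inf)
```

In the first case the resulting γ is γ₁ ∧ γ₂, and in the second case the result is the pair (−α, β − α), matching the two statements of Remark 14.7.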
and where $Q_{<\gamma} : T \to T_{<\gamma}$ is the natural projection. Here, $G^{(k)}$ denotes the kth
derivative of G and $\tau^{\star k}$ denotes the k-fold product τ ⋆ · · · ⋆ τ . We also used the usual
conventions $G^{(0)} = G$ and $\tau^{\star 0} = \mathbf{1}$.
Note that as long as G is $C^\infty$, this expression is well-defined. Indeed, by assumption,
there exists some $\alpha_0 > 0$ such that $\tilde f(x) \in T_{\ge \alpha_0}$. By the properties of
14.3 Classical Schauder estimates 267
the product, this implies that one has $\tilde f(x)^{\star k} \in T_{\ge k \alpha_0}$. As a consequence, when
considering the component of G ◦ f in $T_\beta$ for β < γ, the only terms that give a
contribution are those with $k < \gamma/\alpha_0$. Since we cannot possibly hope in general that
$G \circ f \in \mathcal{D}^{\gamma'}$ for some γ′ > γ, this is all we really need.
It turns out that if G is sufficiently regular, then the map $f \mapsto G \circ f$ enjoys
similarly nice continuity properties to what we are used to from classical Hölder
spaces. The following result is the analogue in this context to Lemma 7.3:
Proposition 14.8. In the same setting as above, provided that G is of class $C^k$ with
$k > \gamma/\alpha_0$, the map $f \mapsto G \circ f$ is continuous from $\mathcal{D}^\gamma(V)$ into itself. If $k > \gamma/\alpha_0 + 1$,
then it is locally Lipschitz continuous.
The proof of the first statement can be found in [Hai14b], while the second
statement was shown in [HP15]. It is a somewhat lengthy, but ultimately rather
straightforward calculation.
One of the reasons why the theory of regularity structures is very successful at
providing detailed descriptions of the small-scale features of solutions to semilinear
(S)PDEs is that it comes with very sharp Schauder estimates. A full proof of the
Schauder estimates for regularity structures is beyond the scope of this book, but we
want to convey the flavour of the proof. The aim of this section is therefore to give
a self-contained proof of the classical Schauder estimates which state that for any
(compactly supported) kernel K that is approximately homogeneous of degree β − d,
the convolution map $\zeta \mapsto K * \zeta$ is continuous from $\mathcal{C}^\alpha$ to $\mathcal{C}^{\alpha + \beta}$, provided that α + β
is not a positive integer. We first make precise our assumptions on the kernel K.
Immediate examples are (smooth truncations of) the Newton potential in dimension
d ≥ 3, proportional to $1/|x|^{d-2}$ and hence 2-regularising, and the fractional Volterra
kernel $x^{H - 1/2} \mathbf{1}_{x > 0}$ with d = 1 and β = H + 1/2. The heat kernel on space-time
$\mathbf{R}^{n+1}$, proportional to $(t, x) \mapsto t^{-n/2} \exp(-|x|^2/(4t)) \mathbf{1}_{t > 0}$, also fits in this setting (and
is 2-regularising), provided one works with “parabolic” scaling (cf. Remark 13.9).
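The homogeneity behind these examples is easy to test numerically. The sketch below checks, for the untruncated kernels and up to their constant prefactors, that the Newton potential in d = 3 is homogeneous of degree 2 − d, and that the heat kernel with n = 2 satisfies K(λ²t, λx) = λ^{−n} K(t, x), i.e. it is homogeneous of degree 2 − (n + 2) with respect to the parabolic scaling (2, 1, . . . , 1). Function names are hypothetical.

```python
import numpy as np

def newton_potential(x):
    """Newton potential up to a constant: |x|**(-(d - 2)) for x in R^d, d >= 3."""
    x = np.asarray(x, dtype=float)
    return np.linalg.norm(x) ** (-(x.size - 2))

def heat_kernel(t, x):
    """Heat kernel up to a constant: t**(-n/2) * exp(-|x|^2 / (4 t)) for t > 0, else 0."""
    x = np.asarray(x, dtype=float)
    if t <= 0:
        return 0.0
    return t ** (-x.size / 2) * np.exp(-np.dot(x, x) / (4 * t))

lam = 0.5

x3 = np.array([0.3, -0.4, 1.2])                      # point in R^3
newton_ratio = newton_potential(lam * x3) / newton_potential(x3)

t, x2 = 0.7, np.array([0.2, -0.5])                   # point in R^{2+1}
heat_ratio = heat_kernel(lam**2 * t, lam * x2) / heat_kernel(t, x2)
```

Here `newton_ratio` should equal λ^{2−d} = λ^{−1} and `heat_ratio` should equal λ^{−n} = λ^{−2}, in line with both kernels being 2-regularising.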
As in Section 13.3, and for any r ∈ N, we work with $\mathcal{B}_r \subset \mathcal{D}$, the set of smooth
test functions with $C^r$-norm bounded by 1 and supported in the unit ball. It will be
convenient for the purpose of this section to write $\mathcal{B}^\lambda_{r,x}$ for the set of all test functions
of the form $\varphi^\lambda_x$ with $\varphi \in \mathcal{B}_r$. Such $\psi \in \mathcal{B}^\lambda_{r,x}$ are characterised by having support in
the ball of radius λ centred at x and derivative bounds $|D^k \psi| \le \lambda^{-d-|k|}$ for |k| ≤ r.
We also note that, for any real s ∈ [0, r], the estimate $\|\psi\|_{C^s} \lesssim \lambda^{-d-s}$ holds true.
The following simple proposition is the first crucial ingredient in our approach.
Loosely speaking, it states that the convolution of two test functions localised at two
distinct scales is localised at the sum (or equivalently maximum) of the two scales
and that one gains in amplitude if the tighter of the two test functions annihilates
polynomials of a certain degree.
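Proposition 14.11. There exists C > 0 such that, for all $\varphi \in \mathcal{B}^\lambda_{r,x}$ and $\psi \in \mathcal{B}^\mu_{r,y}$, one
has $\psi * \varphi \in C \mathcal{B}^{\lambda + \mu}_{r,x+y}$. If furthermore λ ≤ µ and $\int P(z) \varphi(z) \, dz = 0$ for every polynomial
P with deg P < γ ≤ r, some $\gamma \in \mathbf{R}_+$, then $\psi * \varphi \in C (\lambda/\mu)^\gamma \mathcal{B}^{2\mu}_{\lfloor r - \gamma \rfloor, x+y}$.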
\[ = \int \big( \psi^{(k)}(\bullet - z) - P^{\gamma;(k)}_\bullet(\bullet - z) \big) \varphi(z) \, dz , \]
for 0 ≤ |k| ≤ r − γ, where $P^{\gamma;(k)}_\bullet$ denotes the Taylor expansion (at the dotted
We only need to consider z in the support of φ, and in fact can assume without loss of
generality that x = 0 (otherwise subtract another annihilated Taylor polynomial. . .),
so that $\int |z|^\gamma |\varphi(z)| \, dz \le \lambda^\gamma \int |\varphi(z)| \, dz \lesssim \lambda^\gamma$. The desired estimate now follows. □
Definition 14.12. For α ∈ R, write $r = r_o(\alpha)$ for the smallest non-negative integer
such that r + α > 0. We then define $\mathcal{Z}^\alpha$ as the space of distributions ζ on $\mathbf{R}^d$ such that
for every compact set $K \subset \mathbf{R}^d$ there exists a constant C such that the bound
\[ |\zeta(\varphi)| \le C \lambda^\alpha , \]
holds uniformly over λ ∈ (0, 1], x ∈ K and all $\varphi \in \mathcal{B}^\lambda_{r,x}$ such that $\int \varphi(z) P(z) \, dz = 0$
for all polynomials P with deg P ≤ α. For any compact set K, the best possible
constant such that the above bound holds uniformly over x ∈ K yields a seminorm.
The collection of these seminorms endows $\mathcal{Z}^\alpha$ with a Fréchet space structure.
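The integer r_o(α) used in this definition is elementary to compute. A minimal sketch, with a hypothetical function name, follows directly from the phrase "smallest non-negative integer such that r + α > 0":

```python
import math

def r_o(alpha):
    """Smallest non-negative integer r such that r + alpha > 0."""
    return max(0, math.floor(-alpha) + 1)

examples = {alpha: r_o(alpha) for alpha in (-2.3, -1, -0.5, 0, 0.5, 1.5)}
```

For instance α = −1/2 (the regularity of white noise in one dimension, up to an arbitrarily small loss) gives r = 1, while every α > 0 gives r = 0.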
The precise choice of r in Definition 14.12 is not very important, as one could
have taken any other choice $r \ge r_o(\alpha)$. More precisely, one has the following result.
Lemma 14.13. For $r \ge r_o(\alpha)$, write $\mathcal{Z}^\alpha_r$ for $\mathcal{Z}^\alpha$ as defined above, but with $r_o(\alpha)$
replaced by r. Then $\mathcal{Z}^\alpha_r = \mathcal{Z}^\alpha$.
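Proof. We fix a partition of unity $\{\chi_y\}_{y \in \Lambda}$ for $\mathbf{R}^d$ such that all the $\chi_y$ are translates
of $\chi_0$ by $y \in \mathbf{R}^d$ and $\Lambda \subset \mathbf{R}^d$ is a lattice. In particular, we make sure that $\chi_y \in \mathcal{B}^\lambda_{r,y}$.
Given any λ > 0, we write $\chi_{y,\lambda}(x) = \chi_{y/\lambda}(x/\lambda)$ and we set $\Lambda_\lambda = \Lambda/\lambda$. We also
fix a function $\psi \in \mathcal{C}^\infty$ with support in the centred unit ball and such that
\[ \int_{\mathbf{R}^d} x^k \psi(x) \, dx = \delta_{k,0} , \qquad \forall k : |k| \le r . \tag{14.3} \]
(Such functions exist by Exercise 13.8.) We then write $\tilde\psi(x) = 2^d \psi(2x) - \psi(x)$ and
note that by (14.3) one has $\int_{\mathbf{R}^d} x^k \tilde\psi(x) \, dx = 0$ for |k| ≤ r.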
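The vanishing of the moments of ψ̃ follows from a change of variables y = 2x in the first term; for the reader's convenience, the short computation reads as follows.

```latex
\int_{\mathbf{R}^d} x^k \tilde\psi(x)\,dx
  = 2^d \int_{\mathbf{R}^d} x^k \psi(2x)\,dx - \int_{\mathbf{R}^d} x^k \psi(x)\,dx
  = 2^{-|k|} \int_{\mathbf{R}^d} y^k \psi(y)\,dy - \delta_{k,0}
  = \bigl(2^{-|k|} - 1\bigr)\,\delta_{k,0}
  = 0 , \qquad |k| \le r .
```

The last equality holds because the prefactor vanishes for k = 0 and the Kronecker delta vanishes for k ≠ 0.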
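Let now α < 0 and take $\zeta \in \mathcal{Z}^\alpha_r$; we want to show that $\zeta \in \mathcal{Z}^\alpha$. Given $\varphi \in \mathcal{B}^\lambda_{r_o,x}$
and setting $\lambda_n = 2^{-n} \lambda$, we write
\[ \varphi = \varphi * \psi^\lambda + \sum_{n \ge 0} \sum_{y \in \Lambda_{\lambda_n}} \varphi_{n,y} , \qquad \varphi_{n,y} = \big( \varphi * \tilde\psi^{\lambda_n} \big) \cdot \chi_{y,\lambda_n} . \tag{14.4} \]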
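As a simple consequence of the Taylor remainder theorem, one has the bound
\[ \big\| \varphi * D^k \tilde\psi^{\lambda_n} \big\|_\infty \lesssim \lambda^{-d} 2^{-r_o n} \lambda_n^{-|k|} = 2^{-(d + r_o)n} \lambda_n^{-d-|k|} , \]
so that there exists a constant C independent of φ such that $\varphi_{n,y} \in C 2^{-(d + r_o)n} \mathcal{B}^{\lambda_n}_{r,y}$,
which in particular implies that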
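Since the number of terms in $\Lambda_{\lambda_n}$ such that $\varphi_{n,y}$ is non-zero is of order $2^{nd}$, we
conclude that
\[ |\zeta(\varphi)| \lesssim \lambda^\alpha + \lambda^\alpha \sum_{n \ge 0} 2^{-(r_o + \alpha)n} \lesssim \lambda^\alpha , \]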
Proof. There is nothing to prove for α < 0, so let α > 0. We first show that
$\mathcal{C}^\alpha \subset \mathcal{Z}^\alpha$, this inclusion also being valid for integer values of α. In fact, it suffices to
note that, given $f \in \mathcal{C}^\alpha$ and $\varphi \in \mathcal{B}^\lambda_{r,x}$ as in Definition 14.12, one has
\[ \int f(y) \varphi(y) \, dy = \int \big( f(y) - P^\alpha_x(y - x) \big) \varphi(y) \, dy \lesssim \lambda^\alpha , \]
where the identity follows from the fact that φ annihilates $P^\alpha_x$, the Taylor expansion
at order α of f , based at x, and the bound is as in the proof of Proposition 14.11.
For the converse inclusion, we first consider the case α ∈ (0, 1) and let ζ ∈ Z α .
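Let $\varrho : \mathbf{R}^d \to \mathbf{R}$ be a smooth function that is compactly supported in the unit ball
around the origin and such that $\int \varrho(z) \, dz = 1$. Note first that, for any $x \in \mathbf{R}^d$ and
λ ∈ (0, 1], it follows from the definition of $\mathcal{Z}^\alpha$ that one has the bound
\[ |\zeta(\varrho^{2^{-n}\lambda}_x) - \zeta(\varrho^{2^{-n-1}\lambda}_x)| = |\zeta(\varrho^{2^{-n}\lambda}_x - \varrho^{2^{-n-1}\lambda}_x)| \le C \lambda^\alpha 2^{-\alpha n} . \]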
It follows that $f(x) = \lim_{n \to \infty} \zeta(\varrho^{2^{-n}\lambda}_x)$ is well-defined and that
\[ |f(x) - \zeta(\varrho^\lambda_x)| \lesssim \lambda^\alpha . \]
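This last bound is obtained by telescoping over dyadic scales and summing a geometric series, which converges because α > 0 in this part of the proof. Spelled out, with C the constant from the increment bound:

```latex
|f(x) - \zeta(\varrho^{\lambda}_x)|
  \le \sum_{n \ge 0} \bigl| \zeta(\varrho^{2^{-n-1}\lambda}_x) - \zeta(\varrho^{2^{-n}\lambda}_x) \bigr|
  \le C \lambda^{\alpha} \sum_{n \ge 0} 2^{-\alpha n}
  = \frac{C}{1 - 2^{-\alpha}} \, \lambda^{\alpha} .
```

The same estimate shows that the sequence defining f(x) is Cauchy, so that the limit exists and does not depend on λ.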
Choosing λ = |x − y|, it follows that $f \in \mathcal{C}^\alpha$. The fact that f = ζ in the sense that
$\zeta(\varphi) = \int f(z) \varphi(z) \, dz$ follows immediately from the fact that
\[ \zeta(\varphi) = \lim_{\lambda \to 0} \zeta(\varphi * \varrho^\lambda) = \lim_{\lambda \to 0} \int \zeta(\varrho^\lambda_x) \varphi(x) \, dx . \]
The claim for general non-integer α can then be seen from the fact that $\zeta \in \mathcal{Z}^\alpha$
implies $D^k \zeta \in \mathcal{Z}^{\alpha - |k|}$ (interpreted as distributional derivatives) for every multi-index
k. Details are left to the reader. □
Remark 14.16. For n ∈ N, the spaces $\mathcal{Z}^n$ are usually called Hölder–Zygmund spaces
in the literature (thus our choice of symbol $\mathcal{Z}$). They are distinct from the usual
Hölder spaces since one can check that $x \mapsto x^n \log x$ belongs to $\mathcal{Z}^n$, but not to $\mathcal{C}^n$.
With all of these preliminaries in place, we can give a very simple proof of
Schauder’s theorem. (See for example [Sim97] for an alternative proof of a very
similar statement.)
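with $2^{\beta n} K_n \in C \mathcal{B}^{2^{-n}}_{r,0}$ for some C > 0. It then follows from Proposition 14.11
(applied with $\mu = 2^{-n}$, noting that $K_n * \varphi$ also annihilates polynomials of degree
up to α + β) and the definition of $\mathcal{Z}^\alpha$ that
\[ |\zeta(2^{\beta n} K_n * \varphi)| \lesssim \begin{cases} \lambda^\alpha & \text{if } 2^{-n} \le \lambda, \\ (2^n \lambda)^\gamma 2^{-\alpha n} & \text{otherwise,} \end{cases} \]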
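To see how a two-regime bound of this kind yields the gain of β, one can sum over n. The following sketch assumes that the kernel decomposes as K = Σ_{n≥0} K_n (as in Lemma 14.10) and that the exponent γ can be chosen with γ > α + β; both are assumptions of this computation rather than statements taken from the text.

```latex
|\zeta(K * \varphi)|
  \le \sum_{n \ge 0} 2^{-\beta n} \, |\zeta(2^{\beta n} K_n * \varphi)|
  \lesssim \lambda^{\alpha} \sum_{n \,:\, 2^{-n} \le \lambda} 2^{-\beta n}
     \;+\; \lambda^{\gamma} \sum_{n \,:\, 2^{-n} > \lambda} 2^{(\gamma - \alpha - \beta) n}
  \lesssim \lambda^{\alpha} \lambda^{\beta}
     + \lambda^{\gamma} \lambda^{-(\gamma - \alpha - \beta)}
  \lesssim \lambda^{\alpha + \beta} .
```

The first sum is a geometric series dominated by its first term (using β > 0), the second by its last term (using γ > α + β); both contributions are of order λ^{α+β}.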
As we saw in the previous section, the classical Schauder estimates state that if
K : Rd → R is a kernel that is smooth everywhere, except for a singularity at the
origin that is approximately homogeneous of degree β − d for some fixed β > 0 (i.e.
it is β-regularising in the sense of Definition 14.9), then the operator $f \mapsto K * f$
maps $\mathcal{C}^\alpha$ into $\mathcal{C}^{\alpha + \beta}$ for every α ∈ R, except for those values for which α + β ∈ N.
It turns out that similar Schauder estimates hold in the context of general regularity
structures in the sense that it is in general possible to build an operator $\mathcal{K} : \mathcal{D}^\gamma \to
\mathcal{D}^{\gamma + \beta}$ with the property that $\mathcal{R} \mathcal{K} f = K * \mathcal{R} f$. We call such a statement a “multi-level
Schauder estimate” since it is a form of Schauder estimate for all the components of
f in Tα for all α < γ. Of course, such a statement can only be expected to hold if
our regularity structure contains not only the objects necessary to describe Rf up to
order γ, but also those required to describe K ∗ Rf up to order γ + β. What are these
objects? At this stage, it might be useful to reflect on the effect of the convolution of
a singular function (or distribution) with K.
Let us assume for a moment that a given real-valued function f is smooth ev-
erywhere, except at some point x0 . It is then straightforward to convince ourselves
that K ∗ f is also smooth everywhere, except at x0 . Indeed, for any δ > 0, we can
write K = Kδ + Kδc , where Kδ is supported in a ball of radius δ around 0 and
Kδc is a smooth function. Similarly, we can decompose f as f = fδ + fδc , where
fδ is supported in a δ-ball around x0 and fδc is smooth. Since the convolution of
a smooth function with an arbitrary distribution is smooth, it follows that the only
non-smooth component of K ∗ f is given by Kδ ∗ fδ , which is supported in a ball of
radius 2δ around x0 . Since δ was arbitrary, the statement follows. By linearity, this
strongly suggests that the local structure of the singularities of K ∗ f can be described
completely by only using knowledge on the local structure of the singularities of f .
14.4 Multilevel Schauder estimates and admissible models 273
It also suggests that the “singular part” of the operator K should be local, with the
non-local parts of K only contributing to the “regular part”.
This discussion suggests that we need the following ingredients to build an
operator K with the desired properties:
• The polynomial structure should be part of our regularity structure in order to be
able to describe the “regular parts”.
• We should be given an “abstract integration operator” I (of order β) on T which
describes how the “singular parts” of Rf transform under convolution by K.
• We should restrict ourselves to models which are “compatible” with the action
of I in the sense that the behaviour of Πx Iτ should relate in a suitable way to
the behaviour of K ∗ Πx τ near x.
One way to implement these ingredients is to assume first that our regularity structure
contains abstract polynomials in the following sense.
Assumption 14.20 There exists a sector $\bar T \subset T$ isomorphic to the polynomial
regularity structure. In other words, $\bar T_\alpha \neq 0$ if and only if α ∈ N, and one can
find basis vectors $X^k$ of $T_{|k|}$ such that every element Γ ∈ G acts on $\bar T$ by $\Gamma X^k =
(X + h \mathbf{1})^k$ for some $h \in \mathbf{R}^d$.
Furthermore, we assume that there exists an abstract integration operator I, of
fixed order β > 0, with the following properties.
Assumption 14.21 There exists a linear map I : V → T for some sector V ⊂ T
such that IVα ⊂ Tα+β and, for every Γ ∈ G and τ ∈ T ,
Γ Iτ − IΓ τ ∈ T̄ . (14.7)
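\[ (\Pi_x X^k)(y) = (y - x)^k , \qquad \Pi_x \mathcal{I} \tau = K * \Pi_x \tau - \Pi_x \mathcal{J}_x \tau , \tag{14.8} \]
\[ \mathcal{J}_x \tau = \sum_{|k| < \deg \tau + \beta} \frac{X^k}{k!} \int D^k K(x - y) \, (\Pi_x \tau)(dy) . \tag{14.9} \]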
which is clearly consistent with the constraint (14.7) and which one can show guar-
antees that Πx Γxy Iτ = Πy Iτ . See also Exercise 14.6.
with $K_n$ as in Lemma 14.10. The scaling properties of the $K_n$ ensure that the function
$2^{(\beta - |k|)n} D^k K_n(x - \bullet)$ is a test function that is localised around x at scale $2^{-n}$. As
a consequence, one has
\[ \big| (\Pi_x \tau) \big( D^k K_n(x - \bullet) \big) \big| \lesssim 2^{(|k| - \beta - \deg \tau)n} , \]
Remark 14.28. As a matter of fact, it turns out that the above definition of an ad-
missible model dovetails very nicely with our axioms defining a general model.
Indeed, starting from any regularity structure T , any model (Π, Γ ) for T , and a
β-regularising kernel K, it is usually possible to build a larger regularity structure
T̂ containing T (in the “obvious” sense that T ⊂ T̂ and the action of Ĝ on T is
compatible with that of G) and endowed with an abstract integration map I, as well
as an admissible model (Π̂, Γ̂ ) on T̂ which reduces to (Π, Γ ) when restricted to T .
See [Hai14b] for more details.
The only exception to this rule arises when the original structure T contains some
homogeneous element τ which does not represent a polynomial and which is such
that deg τ + β ∈ N. Since the bounds appearing both in the definition of a model
and in that of a β-regularising kernel are only upper bounds, it is in practice easy to
exclude such a situation by slightly tweaking the definition of either the exponent β
or of the original regularity structure T .
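With all of these definitions in place, we can finally build the operator $\mathcal{K} : \mathcal{D}^\gamma \to \mathcal{D}^{\gamma + \beta}$
announced at the beginning of this section. Recalling the definition of $\mathcal{J}$ from
(14.9), we set
\[ \mathcal{K} f(x) = \mathcal{I} f(x) + \mathcal{J}_x f(x) + \mathcal{N} f(x) , \tag{14.11} \]
where the operator $\mathcal{N}$ is given by
\[ \mathcal{N} f(x) = \sum_{|k| < \gamma + \beta} \frac{X^k}{k!} \int D^k K(x - y) \, \big( \mathcal{R} f - \Pi_x f(x) \big)(dy) . \tag{14.12} \]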
Note first that thanks to the reconstruction theorem, it is possible to verify that the
right-hand side of (14.12) does indeed make sense for every f ∈ D γ in virtually the
same way as in Remark 14.27. One has:
Proof. The complete proof of this result can be found in [Hai14b] and will not
be given here. Since it is rather straightforward, we did however give a proof
of Schauder’s estimate in the classical case (i.e. that of the polynomial regularity
structure) in Section 14.3 above.
Let us simply show that one has indeed RKf = K ∗ Rf in the particular case
when our model consists of continuous functions so that Remark 13.27 applies. In
this case, one has
\[ \mathcal{R} \mathcal{K} f(x) = \big( \Pi_x ( \mathcal{I} f(x) + \mathcal{J}_x f(x) ) \big)(x) + \big( \Pi_x \mathcal{N} f(x) \big)(x) . \]
As a consequence of (14.8), the first term appearing in the right-hand side of this
expression is given by
\[ \big( \Pi_x ( \mathcal{I} f(x) + \mathcal{J}_x f(x) ) \big)(x) = \big( K * \Pi_x f(x) \big)(x) . \]
On the other hand, the only term contributing to the second term is the one with
k = 0 (which is always present since γ > 0 by assumption) which then yields
\[ \big( \Pi_x \mathcal{N} f(x) \big)(x) = \int K(x - y) \, \big( \mathcal{R} f - \Pi_x f(x) \big)(dy) . \]
Adding both of these terms, we see that the expression $\big( K * \Pi_x f(x) \big)(x)$ cancels,
leaving us with the desired result. □
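Written out, the cancellation in this last step is the following one-line computation, in which the integral against K(x − ·) is read as a convolution evaluated at x:

```latex
\mathcal{R}\mathcal{K} f(x)
  = \bigl(K * \Pi_x f(x)\bigr)(x)
    + \int K(x - y)\,(\mathcal{R} f)(dy)
    - \int K(x - y)\,\bigl(\Pi_x f(x)\bigr)(dy)
  = \bigl(K * \mathcal{R} f\bigr)(x) .
```

The first and third terms are the same quantity with opposite signs, so only the convolution of K with the reconstructed distribution survives.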
with Riemann–Liouville kernel $K^H(x) = x^{H - 1/2} \mathbf{1}_{x > 0}$. Since $K^H \in L^2_{\mathrm{loc}}(\mathbf{R})$ but
not in $L^2(\mathbf{R})$, we replace it in the sequel by a compactly supported K, smooth away
from zero and equal to $K^H$ in some neighbourhood of zero. We then require W to
be a two-sided Brownian motion, so that $\xi := \dot W$ defines Gaussian white noise on R,
and
\[ \widehat W = K * \xi . \tag{14.15} \]
Alternatively, as done in [BFG+19], see also [BFG20], one can restrict integration in
(14.14) to [0, t] with the benefit of exactly recovering Brownian motion $\widehat W = W$ for
H = 1/2 in which case the integral (14.13) fits squarely into rough integration theory
(namely Theorem 4.4, applied with the Itô Brownian rough path from Proposition 3.4).
However, for H ∈ (0, 1/2) rough integration must fail. Indeed, K is (1/2 + H)-
regularising so that it follows from Schauder’s Theorem 14.17 that $\widehat W$ and then
$\sigma(\widehat W)$ have generically $H^-$-Hölder regularity and hence cannot be expected to be
controlled by $W \in \mathcal{C}^{1/2^-}$. We can make (minor) progress by noting that $(\widehat W, \bar W)$
is a 2-dimensional Gaussian process with independent components. At least for
H > 1/3, the results of Section 10.3 for Gaussian rough paths apply essentially
14.5 Rough volatility and robust Itô integration revisited 277
We are interested in a robust form of this Itô stochastic integral. In case of $\widehat W = W$
we can in fact express (14.16) via Itô’s formula, which immediately gives a version
of this integral which is continuous in W , even in uniform topology. Certainly, this
trick fails when $\widehat W \neq W$.
In this section we set up a regularity structure that provides a full solution to this
problem. Needless to say, this structure is much simpler than what is needed for the
KPZ equation in the next chapter. Yet, it showcases a number of features omnipresent
for singular SPDEs, but without some of the added complexity coming from PDE
theory.
Recall that the Hölder exponent of $\widehat W$ is H − κ for any κ > 0. As a result, we
have $|\widehat W^m_{s,t}| \lesssim |t - s|^{m(H - \kappa)}$ and the building blocks for a robust representation of
(14.16) are
\[ \mathbb{W}^m_{s,t} = \int_s^t (\widehat W_{s,r})^m \, dW_r , \tag{14.17} \]
for small enough κ > 0. For definiteness, let us focus on the case
\[ H > \tfrac{1}{8} , \qquad M = 3 . \]
We first define symbols (these will be the basis vectors of our regularity structure) to
represent (Ŵ_{s,t})^m , 0 ≤ m ≤ 3. If Ξ ≡ is the symbol for white noise ξ ≡ Ẇ , we
can write the required symbols indifferently as
{ , , , } (with for example = I(Ξ)2 Ξ). We then define the structure space of
278 14 Operations on modelled distributions
our regularity structure as the free vector space generated by these symbols, namely
T =⟨, , , , 1, , , ⟩. (14.18)
The partial product defined on T (for example = ) does not extend to all of T .1 It
is natural to postulate that Ξ has degree deg Ξ = (−1/2)^− (the presence of the exponent
‘−’ reflects the fact that in order for the bound (13.13) to be satisfied when Πt Ξ is
given by white noise, we need to make sure that deg Ξ is strictly smaller than −1/2,
but by how much exactly is irrelevant as long as it is a small enough quantity), that I
increases degree by H + 1/2, and that the degree is additive under multiplication. Since
it is natural to take deg 1 = 0 to retain consistency with the polynomial regularity
structure, this uniquely determines the degree of each of the basis vectors of T , for
instance

    deg ΞI(Ξ)^3 = deg Ξ + 3 deg I(Ξ) = (3H − 1/2)^− .

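The degree rules just stated can be turned into a small bookkeeping routine. The following is only a sketch: the helper name `degree` and the encoding of the symbol Ξ^a I(Ξ)^m by the pair (a, m) are assumptions made for illustration, and the exponent ‘−’ is tracked as an integer multiple of an unspecified small κ > 0 (the text itself does not distinguish between different multiples).

```python
from fractions import Fraction

def degree(a, m, H):
    """Degree of Xi^a * I(Xi)^m, returned as (value, n),
    to be read as value - n*kappa for a small kappa > 0.

    Rules taken from the text: deg Xi = -1/2 - kappa, the map I
    raises the degree by H + 1/2, degrees add under multiplication."""
    deg_xi = (Fraction(-1, 2), 1)                      # -1/2 - kappa
    deg_ixi = (deg_xi[0] + H + Fraction(1, 2), 1)      # H - kappa
    value = a * deg_xi[0] + m * deg_ixi[0]
    n_kappa = a * deg_xi[1] + m * deg_ixi[1]
    return value, n_kappa

H = Fraction(1, 4)  # hypothetical Hurst parameter in (1/8, 1/2)
for m in range(4):
    print("deg Xi I(Xi)^%d =" % m, degree(1, m, H))
```

For m = 3 the value part is 3H − 1/2, in line with the formula displayed above.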
To understand the structure group, we shift from a base point s to a new base
point t. Basic additivity properties of the integral in (14.17) show that

    𝕎^3_{s,•} = 𝕎^3_{t,•} + 3 𝕎^2_{t,•} Ŵ_{s,t} + 3 𝕎^1_{t,•} (Ŵ_{s,t})^2 + 𝕎^0_{t,•} (Ŵ_{s,t})^3 + 𝕎^3_{s,t} .

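The change of base point is a purely algebraic (binomial) identity, so it already holds exactly for left-point Riemann sums on a grid. The sketch below checks this numerically; the smooth stand-ins for W and Ŵ, the function name `chen_check` and the grid indices are all hypothetical choices made for illustration, not part of the construction in the text.

```python
import math

def chen_check(n=400, i_s=0, i_t=150, i_u=400):
    """Compare both sides of the re-expansion of W^3 from base point s
    to base point t, using left-point sums on a uniform grid of [0, 1]."""
    grid = [k / n for k in range(n + 1)]
    W = [math.sin(3 * x) for x in grid]          # stand-in for W
    What = [math.cos(2 * x) + x for x in grid]   # stand-in for W-hat

    def bbW(m, a, b):
        # left-point sum approximating int_a^b (What_{a,r})^m dW_r
        return sum((What[k] - What[a]) ** m * (W[k + 1] - W[k])
                   for k in range(a, b))

    inc = What[i_t] - What[i_s]                  # increment What_{s,t}
    lhs = bbW(3, i_s, i_u)
    rhs = (bbW(3, i_t, i_u)
           + 3 * bbW(2, i_t, i_u) * inc
           + 3 * bbW(1, i_t, i_u) * inc ** 2
           + bbW(0, i_t, i_u) * inc ** 3
           + bbW(3, i_s, i_t))
    return lhs, rhs

print(chen_check())
```

The two returned numbers agree up to floating-point rounding, whatever paths are used as stand-ins.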
This suggests to “break up” the symbol ΞI(Ξ)^3 (for ∂𝕎^3_{∗,•}) in the form

    ∆+(ΞI(Ξ)^3) := ΞI(Ξ)^3 ⊗ 1 + 3 ΞI(Ξ)^2 ⊗ J(Ξ) + 3 ΞI(Ξ) ⊗ J(Ξ)^2 + Ξ ⊗ J(Ξ)^3 ∈ T ⊗ T+ ,

where the introduction of a new space T + is justified by the fact that elements in T +
represent functions of two variables (s and t here), while elements of T represent
functions of one variable (the base point s resp. t) that are distributions in the
remaining free variable. In particular, it is rather natural that T + (unlike T ) contains
no symbols of negative degree and that elements of T + can be multiplied freely. In
other words, it is natural in this context to define T + as the free commutative algebra
def
generated by the single element = J ( ). The difference between T + and T is
emphasised in our notation by drawing basis vectors of T + in black.
The action of the linear map ∆+ : T → T ⊗ T + has the appealing graphical
interpretation of cutting off positive branches: for instance, the summand 3 ⊗ =
⊗3 in ∆+ ( ) is explained as follows: there are three ways to “cut off” a “lollipop”
from , which are then painted black and put as 3 ∈ T + to the right-hand side;
the remaining “pruned” tree ∈ T goes to the left. Similarly, there are three ways to
cut off two lollipops from , which then appear as 3 ∈ T + on the right-hand side,
while the pruned remainder ∈ T appears on the left.
1
For instance, we do not want our regularity structure to contain a symbol Ξ 2 denoting the square
of white-noise. We also have no need for trees with ≥ 4 branches so that products like ,
etc. remain deliberately undefined within T .
∆+ 1 = 1 ⊗ 1 , ∆+ Ξ = Ξ ⊗ 1 ,
∆+ (τ τ̄ ) = ∆+ τ · ∆+ τ̄ ,
∆+ I(τ ) = (I ⊗ Id)∆+ τ + 1 ⊗ J (τ ) .
000 1
One can check that Nc Nc̄ = Nc+c̄ with c, c̄ ∈ R so that, as a group, G is isomorphic
to (R, +). This completes the construction of the regularity structure (T, G). We
leave it to the reader to identify pairs of sectors on which (the usually omitted) ⋆
defines a product in the sense of Section 14.2 and to show that I is indeed an abstract
integration operator3 in the sense of Definition 14.21.
2
The multiplicative property is understood for all symbols τ , τ̄ ∈ T which can be multiplied in T .
3
In the present setting there is no need to include higher order abstract polynomials X, X 2 , . . . as
part of T .
As already hinted at, the natural Itô model MItô := (Π, Γ ) in this context is
defined by setting

    Πs 1 = 1 ,  Πs Ξ = Ẇ ,  Πs (I(Ξ)^m) = (Ŵ_{s,•})^m ,  Πs (ΞI(Ξ)^m) = ∂𝕎^m_{s,•} ,

as well as Γst = Γ_{g_{s,t}} with g_{s,t}(J(Ξ)) = Ŵ_{s,t}. We leave it to the reader to check that
M^Itô satisfies the required bounds (13.13) and therefore really defines a random
model for the regularity structure (T, G). We also note that the model is admissible
in the sense of Definition 14.23: in essence, this is seen from the identity

    Πs IΞ = K ∗ Πs Ξ − Πs J (s)Ξ = K ∗ Ẇ − (K ∗ Ẇ )(s) = Ŵ_{s,•} .    (14.22)

On the other hand, we can replace white noise Ẇ = Ẇ (ω) by a mollification
Ẇ^ε := δ^ε ∗ Ẇ with δ^ε(t) = ε^{−1} ϱ(ε^{−1} t), for some ϱ ∈ C_c^∞ with ∫ϱ = 1, or indeed
any smooth function ξ, and define the associated canonical model L (ξ) = (Π, Γ )
by prescribing

    Πs Ξ = ξ ,  Πs (I(Ξ)^m) = ((K ∗ ξ)_{s,•})^m ,  Πs (ΞI(Ξ)^m) = ξ(·) ((K ∗ ξ)_{s,•})^m ,

as well as g_{s,t}(J(Ξ)) = (K ∗ ξ)_{s,t}. We again leave it to the reader to check that L (ξ) is
indeed an admissible model for our regularity structure.
It is interesting to consider the canonical model L (Ẇ ε ) as ε → 0. Formally, one
would expect convergence to a “Stratonovich model”, but this does not exist because
of an infinite Itô–Stratonovich correction. To wit, assume the approximate bracket

    [W, Ŵ]^π := Σ_{[s,t]∈π} W_{s,t} Ŵ_{s,t}

converges, say in L^1, upon refinement |π| → 0. Then the mean would have to
converge, which is contradicted by the computation, using Itô isometry,

    E[W_{s,t} Ŵ_{s,t}] = ∫_s^t K(t − r) dr = ∫_0^{t−s} K(r) dr ∼ ∫_0^{t−s} K^H(r) dr = c_H (t − s)^{H+1/2} ,

and the standing assumption that H < 1/2. As a consequence, the canonical model
L (Ẇ ε ) will not converge as ε → 0, although the previous discussion suggests to
“cure” this by subtracting a diverging term, namely to consider4
4
This is an instance of Wick renormalisation where one replaces the product of two scalar Gaussian
random variables X, Y by X ⋄ Y := XY − E[XY ].

    ∫ Ŵ^ε dW^ε − E ∫ Ŵ^ε dW^ε ,    (14.23)

Wick renormalisation at the level of generalised increments may destroy the algebraic
Chen relations. (Indeed, they only hold when the expectation is proportional to [s, t],
which has no reason to be the case in general.)
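To see the divergence of the mean of the approximate bracket concretely, one can sum the one-interval expectation c_H (t − s)^{H+1/2} over a uniform partition. The sketch below assumes the exact power-law kernel K^H(r) = r^{H−1/2} (so that c_H = 1/(H + 1/2)) and a uniform partition of [0, 1]; the function name `mean_bracket` is hypothetical.

```python
def mean_bracket(n, H):
    """Mean of the approximate bracket over a uniform partition of [0, 1]
    into n intervals, with K replaced by K^H(r) = r^(H - 1/2):
    each interval contributes (1/n)^(H + 1/2) / (H + 1/2)."""
    h = 1.0 / n
    return n * h ** (H + 0.5) / (H + 0.5)

for n in (10, 100, 1000, 10000):
    print(n, mean_bracket(n, H=0.1), mean_bracket(n, H=0.5))
```

For H = 1/2 the sum is independent of the mesh, whereas for H < 1/2 it grows like n^{1/2−H}, which is the divergence that the subtraction in (14.23) is meant to remove.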
In fact, our admissible model (Π, Γ ) here can be described in terms of a single
“base-point free” realisation map Π : T → D′ which enjoys somewhat more natural
relations, such as
ΠIΞ = K ∗ ΠΞ = K ∗ Ẇ = K ∗ ξ
instead of (14.22) in the Itô-model case, and similarly for Π ε with Ẇ replaced by
Ẇ ε = ξ ε . The full specification reads5

    Π^ε 1 = 1 ,  Π^ε Ξ = ξ^ε ,
    Π^ε (I(Ξ)^m) = (K ∗ ξ^ε)^m ,  Π^ε (ΞI(Ξ)^m) = ξ^ε (K ∗ ξ^ε)^m .    (14.24)


    = ∫_R (K ∗ δ^ε)(t − s) δ^ε(t − s) ds = (K ∗ δ̄^ε)(0) ,

where we recall δ^ε = ε^{−1} ϱ(ε^{−1} •); and similarly for δ̄^ε with ϱ̄ = ϱ(−(•)) ∗ ϱ. Since
K(x) = x^{H−1/2} 1_{x>0} in a neighbourhood of zero, there is no loss of generality in
assuming that this includes the support of ϱ̄. For ε ∈ (0, 1], it follows that8
5
One defines Π(ΞI(Ξ)m ) as the distributional derivative of an Itô integral.
6
. . .and similarly in the canonical one, with (K ∗ ξ)t replaced by (K ∗ ξ ε )t . . .
7
Thanks to stationarity, this quantity is independent of t. In particular, one could immediately take
t = 0.
8
In the case of H = 1/2, so that K H ≡ 1, noting that ϱ( • ), and hence ϱ̄ = ϱ(−( • )) ∗ ϱ, has unit
mass, the constant equals 1/2, which is the same 1/2 appearing in the Itô–Stratonovich correction.

    = (K ∗ δ̄^ε)(0) = ∫_0^∞ K^H(s) (1/ε) ϱ̄(s/ε) ds = ε^{H−1/2} ∫_0^∞ K^H(s) ϱ̄(s) ds .

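The scaling of this constant can be checked numerically for a concrete, purely illustrative choice. The sketch below assumes K^H(s) = s^{H−1/2} and takes ϱ̄(s) = max(1 − |s|, 0), which arises from the (non-smooth, hence hypothetical) mollifier ϱ = 1_{[0,1]}; the function name `c1` is likewise an assumption. For this choice the integral equals 1/((H + 1/2)(H + 3/2)).

```python
def c1(eps, H, n=20000):
    """Approximate eps^(H - 1/2) * int_0^inf s^(H - 1/2) * rho_bar(s) ds
    for rho_bar(s) = max(1 - |s|, 0).

    The substitution s = u^2 turns the integrand into
    2 * u^(2H) * (1 - u^2) on [0, 1], which removes the singularity
    at zero; a midpoint rule is then applied."""
    total = 0.0
    for k in range(n):
        u = (k + 0.5) / n
        total += 2.0 * u ** (2.0 * H) * (1.0 - u * u)
    return eps ** (H - 0.5) * total / n

for eps in (1.0, 0.1, 0.01):
    print(eps, c1(eps, H=0.1), c1(eps, H=0.5))
```

For H = 1/2 the value is 1/2 for every ε, consistent with the Itô–Stratonovich correction, while for H < 1/2 it blows up like ε^{H−1/2}.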
and hence affects all higher levels (m = 1, 2, . . .). While V̇ = 1 naturally has 1
as associated symbol, V̂ leads to a new symbol, indifferently written as I1 ≡ I(1)
or , in agreement with our earlier convention to represent the action of I as a single
downfacing line.
9
This is nothing but a variation of the concept of translation of rough paths.
have identical law. This reduces the renormalisation group to (R3 , +) and reflects a
general principle: symmetries help to reduce the dimension of the renormalisation
group. See [BGHZ19] for an example where this principle takes centre stage in a
striking manner.
In general one proceeds as follows. Define T − as the free commutative algebra
generated by all negative symbols in T ; that is,
T − := Alg({ , , , }) . (14.26)
− 12 1 − 3 + + 4
3 ∈ T− ,
where 1 denotes the empty forest. As before, it is useful to introduce a linear map
∆− : T → T − ⊗ T which iterates over all possible ways of extracting possibly
empty collections of subtrees of negative degree, putting them as a forest on the
left-hand side, and leaving the remaining tree (where all “extracted” subtrees have
now been contracted to a point) on the T -valued right-hand side. For instance,
Mg = (g ⊗ Id)∆−
The resulting renormalised model Π ε;ren ≡ Π ε Mgε realises, for instance, the
symbol as
it is immediate from (14.27) that one has indeed E(Π ε Mgε ) = 0. Further-
more, since first and third moments of centred Gaussians vanish, we also have
E(Π ε Mgε ) = E(Π ε Mgε ) = 0 as a consequence of the fact that we set
g( ) = g( ) = 0. Finally, it follows from Wick’s formula that
so that Π ε Mgε has vanishing mean if and only if we also choose cε3 = 0.
We have made it plausible that
indeed gives rise to an (admissible) model, with all analytic bounds and algebraic
constraints intact, and such that in the sense of model convergence,
The main result of [CH16] is that the convergence Mε;ren → MBPHZ remains true in
vastly greater generality and that the limiting model is independent of the specific
choice of Mε for a large class of stationary approximations ξ ε to the noise ξ.
At last, we leave it to the reader to adapt the material of Section 13.3.2 to define
the modelled distribution that allows to reconstruct the Itô integral ∫_0^t f(Ŵ_s) dW_s
and further deduce from (14.28) the following (renormalised) Wong–Zakai result,

    ∫_0^t f(Ŵ^ε_s) dW^ε_s − c^ε_1 ∫_0^t f′(Ŵ^ε_s) ds → ∫_0^t f(Ŵ_s) dW_s ,    (14.29)

where we recall that c^ε_1 = ε^{H−1/2} ∫_0^∞ K^H(s) ϱ̄(s) ds. Noting that ϱ̄ = ϱ(−(•)) ∗ ϱ
is even and has unit mass, we see that c^ε_1 = 1/2 when H = 1/2. We can then pass to
the limit for each term on the right-hand side of (14.29) separately. This allows us to
recover the identity

    ∫_0^t f(W_s) ∘ dW_s − (1/2) ∫_0^t f′(W_s) ds = ∫_0^t f(W_s) dW_s ,

14.6 Exercises
fails.
b) Transfer Exercise 2.10 to the present context.
Solution. (We only address the first part.) Consider for instance the regularity struc-
ture given by A = (−2κ, −κ, 0) for fixed κ > 0 with each Tα being a copy of R
given by T−nκ = ⟨Ξ n ⟩. We furthermore take for G the trivial group. This regularity
structure comes with an obvious product by setting Ξ m ⋆ Ξ n = Ξ m+n provided
that m + n ≤ 2.
Then, we could for example take as a model for T = (T, G):
Since our group G is trivial, one has fi ∈ D γ provided that each of the fi belongs to
C γ and each of the f̃i belongs to C γ+κ . (And one has γ + κ < 1.) One furthermore
has the identity Rfi (x) = fi (x).
However, the pointwise product is given by

    (f1 ⋆ f2)(x) = f1(x) f2(x) Ξ^0 + (f̃1(x) f2(x) + f̃2(x) f1(x)) Ξ + f̃1(x) f̃2(x) Ξ^2 ,

which by Theorem 14.5 belongs to D γ−κ . Provided that γ > κ, one can then apply
the reconstruction operator to this product and one obtains
which is obviously quite different from the pointwise product (Rf1 )(x) · (Rf2 )(x).
How should this be interpreted? For n > 0, we could have defined a model Π (n)
by

    Π_x^{(n)} Ξ^0 (y) = 1 ,  Π_x^{(n)} Ξ (y) = √(2c) sin(ny) ,  Π_x^{(n)} Ξ^2 (y) = 2c sin^2(ny) .

as well as R(n) (f1 ⋆ f2 ) = R(n) f1 · R(n) f2 . As a model, the model Π (n) actually
converges to the limiting model Π defined in (14.30). As a consequence of the
continuity of the reconstruction operator, this implies that
which is of course also easy to see “by hand”. This shows that in some cases, the
“non-canonical” models as in (14.30) can be interpreted as limits of “canonical”
models for which the usual rules of calculus hold. Even this is however not always
the case (think of the Itô Brownian rough path).
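The convergence of the models Π^(n) can be observed “by hand” by pairing them with a test function. The sketch below assumes, as the discussion suggests, that the limiting model sends Ξ to 0 and Ξ^2 to the constant c; the value c = 1.5, the Gaussian test function `phi` and the helper names `pair` and `model_pairings` are hypothetical choices made for illustration.

```python
import math

def pair(f, phi, a=-8.0, b=8.0, n=20000):
    """Midpoint-rule approximation of int_a^b f(y) * phi(y) dy."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        y = a + (k + 0.5) * h
        total += f(y) * phi(y)
    return h * total

c = 1.5                                   # hypothetical constant c > 0

def phi(y):
    # test function; its integral over the real line is sqrt(pi)
    return math.exp(-(y - 0.3) ** 2)

def model_pairings(n_freq):
    """Pair sqrt(2c) sin(n y) and 2c sin^2(n y) with the test function."""
    lin = pair(lambda y: math.sqrt(2 * c) * math.sin(n_freq * y), phi)
    sq = pair(lambda y: 2 * c * math.sin(n_freq * y) ** 2, phi)
    return lin, sq

for n_freq in (1, 2, 4, 8):
    print(n_freq, model_pairings(n_freq))
print("c * integral of phi:", c * math.sqrt(math.pi))
```

As the frequency grows, the first pairing tends to zero while the second tends to c times the integral of the test function, so the square of the limit is not the limit of the squares.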
Solution. As in Lemma 14.13, we aim to bound |ζ(φ)| for φ ∈ Brλo ,x and ζ ∈ Zrα
for some r ≥ ro . One strategy is to consider a compactly supported wavelet basis of
regularity r and to separately bound the terms in the wavelet expansion of φ.
If we wish to rely purely on elementary arguments, one strategy goes as follows.
a) Show first that ζ ∈ Zrα if and only if ζχ ∈ Zrα for every smooth compactly
supported function χ. This allows us to reduce ourselves to the case when ζ
itself is compactly supported and we assume this from now on.
b) Show that if ζ ∈ Zr0 is supported in a ball of radius 1 and if ψ is such that
∫ψ(x) dx = 0 and such that |D^k ψ(x)| ≤ (1 + |x|)^{−β−|k|} for |k| ≤ r and some
large enough exponent β, then |ζ(ψ_x^λ)| ≲ 1, uniformly over such ψ and over
x ∈ R^d and λ ∈ (0, 1].
c) Choose a function ψ with the property that its Fourier transform is smooth,
identically 1 in the ball of radius 1, and identically 0 outside of the ball of radius
2 and define ψ̃ as in the proof of Lemma 14.13. Write

    φ = φ ∗ ψ^λ + Σ_{n≥0} φ ∗ ψ̃^{λ_n}

ζ(φ ∗ ψ̃ λn ) = ⟨ζ ∗ ψ̃ λn , φ ∗ χλn ⟩ .
provided that γ > max{1, −α, 1 + α + β}. Furthermore, the explicit expression for
K shows that
where (. . .) denotes terms that either belong to the polynomial part of the regularity
structure or are of degree strictly greater than α + β + 1 (which is the degree of
I(XΞ)). In particular, the truncation of F at level α + β + 1 belongs to DPα+β+1 ,
and we conclude by the second part of Proposition 13.16.
Exercise 14.5 Consider space-time R^d with one temporal and (d − 1) spatial
dimensions, under the parabolic scaling (2, 1, . . . , 1), as introduced in Remark 13.9.
Denote by G the heat kernel (i.e. the Green’s function of the operator ∂_t − ∂_x^2). Show
that one has the decomposition
G = K + K̂ ,
where the kernel K satisfies all of the assumptions of Section 14.4 (with β = 2) and
the remainder K̂ is smooth and bounded.
Exercise 14.6 (From [Bru18]) In the context of Remark 14.25, establish the recursion

    Γxy Iτ = I(Γxy τ ) − Γxy Jxy τ ,    (14.31)

with

    Jxy τ := Σ_{|k|<deg τ+β} (X^k / k!) Πx (Ik (Γxy τ ))(y) .

Exercise 14.7 Show that if one defines Γxy Iτ in such a way that (14.10) holds, then
it guarantees that Πx Γxy Iτ = Πy Iτ .
Exercise 14.8 Adapt the material in Section 14.5 and construct a suitable regularity
structure and model so that the two-dimensional Itô integral (14.13) is obtained as
reconstruction of a suitable modelled distribution.
14.7 Comments
We show how the theory of regularity structures can be used to build a robust
solution theory for the KPZ equation. We also give a very short survey of the original
approach to the same problem using controlled rough paths and we discuss how the
two approaches are linked.
Let us now briefly explain how the theory of regularity structures can be used to
make sense of solutions to very singular semilinear stochastic PDEs. We will keep
the discussion in this chapter at a very informal level without attempting to make
mathematically precise statements. The interested reader may find more details in
[Hai13, Hai14b].
For definiteness, we focus on the case of the KPZ equation [KPZ86], which is
formally given by
    ∂_t h = ∂_x^2 h + (∂_x h)^2 + ξ − C ,    (15.1)
where ξ denotes space-time white noise, the spatial variable takes values in the
one-dimensional torus T, i.e. in the interval [0, 2π] endowed with periodic boundary
conditions, and C is a fixed constant. The problem with such an equation is that even
the solution to the linear part of the equation, namely
    ∂_t Ψ = ∂_x^2 Ψ + ξ ,
290 15 Application to the KPZ equation
This has usually been interpreted in the following way. Assuming for a moment
that ξ is a smooth function, a simple consequence of the change of variables formula
shows that if we define h = log Z, then Z satisfies the PDE
    ∂_t Z = ∂_x^2 Z + Z ξ .
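For the reader's convenience, the change of variables can be spelled out as follows. This is only a sketch for smooth ξ and strictly positive Z; keeping the constant C in the computation is an assumption about the form of the smooth equation being transformed.

```latex
\begin{align*}
\partial_t h &= \frac{\partial_t Z}{Z}, \qquad
\partial_x h = \frac{\partial_x Z}{Z}, \qquad
\partial_x^2 h = \frac{\partial_x^2 Z}{Z} - \frac{(\partial_x Z)^2}{Z^2}, \\
\partial_x^2 h + (\partial_x h)^2 &= \frac{\partial_x^2 Z}{Z}, \\
\partial_t h = \partial_x^2 h + (\partial_x h)^2 + \xi - C
\quad &\Longleftrightarrow \quad
\partial_t Z = \partial_x^2 Z + Z\,(\xi - C).
\end{align*}
```

The nonlinearity is thus absorbed exactly by the logarithm, and the displayed equation for Z corresponds to the case C = 0.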
The only ill-posed product appearing in this equation now is the product of the
solution Z with white noise ξ. As long as Z takes values in L2 , this product can
be given a meaning as a classical Itô integral, so that the equation for Z can be
interpreted as the Itô equation
    dZ = ∂_x^2 Z dt + Z dW ,    (15.2)
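[Commutative diagram (15.4). Top row: Sa : F × M × C^α → D^γ. Bottom row: Sc : F × C × C^α → C([0, T ], C^α), sending (C, ξ, h0) to h. The vertical arrows are labelled L (left) and R (right).]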
Here, Sc denotes the classical solution map Sc (C, ξ, h0 ) which provides the solution
(up to some fixed final time T ) to the equation
for regular instances of the noise ξ. The space F of “formal right-hand sides” is in
this case just a copy of R which holds the value of the constant C appearing in (15.5).
The diagram commutes in the sense that if M ∈ R, then
than one noise as a subsymbol. This in particular explains why fractional Brownian
motion B^H with Hurst parameter H can only be lifted to a rough path when H > 1/4
even though SDEs driven by fractional Brownian motion are “subcritical” for every
H > 0. Indeed, for H = 1/4, the natural degree of the symbol Ẇ of Section 13.2.2
(which would be represented by in the graphical notation used earlier and contains
two instances of the noise) would be (2H − 1)^− = (−1/2)^− < −d/2.
where ξ_ε = δ_ε ∗ ξ with δ_ε(t, x) = ε^{−3} ϱ(ε^{−2} t, ε^{−1} x), for some smooth compactly
supported function ϱ with ∫ϱ = 1, and ξ denotes space-time white noise. Then, there
exists a (diverging) choice of constants Cε such that the sequence hε converges in
probability to a limiting process h.
Furthermore, one can ensure that the limiting process h does not depend on the
choice of mollifier ϱ and that it coincides with the Hopf–Cole solution to the KPZ
equation.
Remark 15.3. It is important to note that although the limiting process is independent
of the choice of mollifier ϱ, the constant Cε does very much depend on this choice,
as we already alluded to earlier.
Remark 15.4. Regarding the initial condition, one can take h0 ∈ C β for any fixed
β > 0. Unfortunately, this result does not cover the case of “infinite wedge” initial
conditions, see for example [Cor12].
The aim of this section is to sketch how the theory of regularity structures can be
used to obtain this kind of convergence results and how (15.4) is constructed. First of
all, we note that while our solution h will be a Hölder continuous space-time function
(or rather an element of D γ for some regularity structure with a model over R^2), the
“time” direction has a different scaling behaviour from the “space” direction.
As a consequence, it turns out to be effective to slightly change our definition of
“localised test functions” by setting
Our first step is to build a regularity structure that is sufficiently large to allow to
reformulate (15.1) as a fixed point in D γ for some γ > 0. Denoting by G the heat
kernel (i.e. the Green’s function of the operator ∂_t − ∂_x^2), we can rewrite the solution
to (15.1) with initial condition h0 as
where ∗ denotes space-time convolution and where we denote by Gh0 the harmonic
extension of h0 . (That is the solution to the heat equation with initial condition h0 .)
Remark 15.5. We view (15.7) as an equation on the whole space by considering its
periodic extension.
In order to have a chance of fitting this into the framework described above, we
first decompose the heat kernel G as in Exercise 14.5 as
G = K + K̂ ,
where the kernel K satisfies all of the assumptions of Section 14.4 (with β = 2) and
the remainder K̂ is smooth. If we consider any regularity structure containing the
usual Taylor polynomials and equipped with an admissible model, it is straightforward
to associate to K̂ an operator K̂ : D γ → D ∞ via

    (K̂f )(z) = Σ_k (X^k / k!) (D^k K̂ ∗ Rf )(z) ,

where z denotes a space-time point and k runs over all possible 2-dimensional
multiindices. Similarly, the harmonic extension of h0 can be lifted to an element
in D ∞ which we denote again by Gh0 by considering its Taylor expansion around
every space-time point. At this stage, we note that we actually cheated a little: while
Gh0 is smooth in {(t, x) : t > 0, x ∈ T} and vanishes when t < 0, it is of course
singular on the time-0 hyperplane {(0, x) : x ∈ T}. This problem can be cured
by introducing weighted versions of the spaces D γ allowing for singularities on
a given hyperplane. A precise definition of these singular model spaces and their
behaviour under multiplication and the action of the integral operator K can be found
in [Hai14b]; but see Exercise 4.12 for the (singular, controlled) rough path analogue.
For the purpose of the informal discussion given here, we will simply ignore this
problem.
This suggests that the “abstract” formulation of (15.1) should be given by
We then simply add to T all of the formal expressions that an application of the
right-hand side of (15.9) can generate for the description of H, ∂H, and (∂H)2 .
The degree of a given expression is furthermore completely determined by the rules
deg Iτ = deg τ + 2, deg ∂τ = deg τ − 1 and deg τ τ̄ = deg τ + deg τ̄ . For example,
it follows from (15.9) that the symbol I(Ξ) is required for the description of H, so
that I ′ (Ξ) is required for the description of ∂H. This then implies that I ′ (Ξ)2 is
required for the description of the right-hand side of (15.9), which in turn implies
that I(I ′ (Ξ)2 ) is also required for the description of H, etc. This “Picard iteration”
yields the (formal) expansion, writing z for a generic space-time point,1
Remark 15.7. Here we made a distinction between I(Ξ), interpreted as the linear
map I applied to the symbol Ξ, and the symbol I(Ξ). Since the map I is then
1
Note that h′ is treated as an independent function (similar to the Gubinelli derivative of a controlled
path); we do not even expect h to be differentiable!
15.2 Construction of the associated regularity structure 295
defined by I(Ξ) := I(Ξ), this distinction is somewhat moot and will be blurred in
the sequel. Similarly, the abstract (spatial) differentiation operator ∂ acts on suitable
symbols as ∂(I(. . .)) := I ′ (. . .), plus of course ∂(X_0^{k_0} X_1^{k_1}) := k_1 X_0^{k_0} X_1^{k_1−1}, for
every multi-index (k0 , k1 ).
More formally, denote by U the collection of those formal expressions that are
required to describe H. This is then defined as the smallest collection containing X k
for all multiindices k ≥ 0, I(Ξ), and such that
τ1 , τ2 ∈ U =⇒ I(∂τ1 ∂τ2 ) ∈ U .
We then set
W = U ∪ {Ξ} ∪ {∂τ1 ∂τ2 : τi ∈ U} , (15.10)
and define T as the set of all linear combinations of elements in a finite subset
W0 ⊂ W, sufficiently large to allow one to close the fixed point problem (15.8). Remark
that this defines (implicitly!) a multiplication between some (but not all) of the
symbols, notably ∂τ1 ⋆ ∂τ2 := ∂τ1 ∂τ2 so that we can safely omit ⋆ in the sequel.
Naturally, Tα consists of those linear combinations that only involve elements in W0
of degree α. (Already W contains only finitely many elements of degree less than α,
which reflects subcriticality of the problem.)
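The recursive definition of U and the degree rules can be sketched as follows. Several things here are assumptions made for illustration: the value deg Ξ = −3/2 − κ for space-time white noise under parabolic scaling, the omission of the polynomial symbols X^k, the string encoding of symbols, and the helper names. Degrees are stored as pairs (value, n), to be read as value − nκ for a small κ > 0.

```python
from fractions import Fraction as F

DEG_XI = (F(-3, 2), 1)   # assumption: deg Xi = -3/2 - kappa

def deg_I(d):            # deg I(tau) = deg tau + 2
    return (d[0] + 2, d[1])

def deg_d(d):            # deg (d tau) = deg tau - 1
    return (d[0] - 1, d[1])

def deg_mul(d1, d2):     # degrees add under multiplication
    return (d1[0] + d2[0], d1[1] + d2[1])

def grow(U):
    """One round of the rule: tau1, tau2 in U => I(d tau1 * d tau2) in U."""
    new = dict(U)
    names = sorted(U)
    for i, a in enumerate(names):
        for b in names[i:]:
            new[f"I(d{a}.d{b})"] = deg_I(deg_mul(deg_d(U[a]), deg_d(U[b])))
    return new

U = {"I(Xi)": deg_I(DEG_XI)}
for _ in range(2):
    U = grow(U)
for name, d in sorted(U.items(), key=lambda kv: kv[1]):
    print(name, d)
```

Each application of the rule produces symbols of strictly higher degree than those it consumes, which is the mechanism behind the finiteness statement above.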
In order to simplify expressions later, we use again a shorthand graphical notation
for elements of W as we already did in Section 14.5. Similarly to before, Ξ is
represented by a small circle, while the integration map I is represented by a downfacing
wavy line and I ′ = ∂I is represented by a downfacing plain line. For example, we
write
H = h1 + + + h′ X1 + 2 + 2h′ + ...
{∂τ1 ∂τ2 : τi ∈ U} = { , , , , , , , 1, . . .} .
As it turns out, provided that we also include the noise itself, the 14 symbols encoun-
tered so far already generate a sufficiently large structure space, given by
def
T = TKPZ = ⟨W0 ⟩ = ⟨Ξ, , , , , , , , 1, , , X1 , , ⟩. (15.11)
that deg X1 = 1 for the abstract space variable, whereas due to parabolic scaling the
abstract time variable has deg X0 = 2 and does not show up here.
Note that at this stage, we have not defined a regularity structure yet, as we have
not described a structure group G acting on T . However, similarly to what was done
in (14.24), it is already natural to consider “representations” of the existing structure,
which are linear maps Π from T into some suitable space of functions / distributions
respecting a form of admissibility condition. For the sake of the present discussion,
we assume that all objects are smooth. Given a (smooth) realisation of a “driving
noise” ξ, we can then define its canonical lift by setting
ΠX^k (x) = x^k ,
ΠΞ (x) = ξ(x) , (15.12)
Πτ τ̄ = Πτ · Π τ̄ , ΠIτ = K ∗ Πτ . (15.13)
In general, we say that a linear map Π : T → C(Rd ) is admissible if one has the
relations
ΠIτ = K ∗ Πτ , Π1 = 1 , ΠX^k τ = ( • )^k Πτ . (15.14)
Remark 15.8 (Where to truncate?). The (14-dimensional) space TKPZ is indeed suffi-
cient to treat the KPZ equation. Indeed, once in possession of an admissible model,
thanks to Theorem 14.5, the fixed point problem (15.8) can be solved in D γ as soon
as γ is a little bit greater than 3/2. This is why we only need to keep track of terms
describing the abstract KPZ solution up to degree 3/2. Regarding the terms required
to describe the right-hand side of the fixed point problem, we need to go up to degree
0, which guarantees that the reconstruction operator (and therefore also the integra-
tion operator K) is well-defined. This is similar to T = T<1/2 , as in Definition 13.4,
being sufficient to treat rough / stochastic integration (and then SDEs) in a Brown-
ian rough path / model context. Indeed, in that context (Proposition 13.21) consider
Y ∈ D_0^{2α} (now for α to be determined!) and abstract Brownian noise Ẇ ∈ D^∞_{(−1/2)^−}.
Then f (Y ), composition with a nice function f , is also in D_0^{2α} and the product is
in D^{(2α−1/2)^−}. We needed this exponent to be positive to have a well-defined rough
integration which in turn allows to formulate a fixed point problem, so that we need
2α ≥ 1/2. By definition of D 2α , this means that we need Y to take values in T<1/2
which is of course what we did by working in ⟨Ẇ , Ẇ, 1, W ⟩, ignoring all symbols
of higher degree.
15.3 The structure group and positive renormalisation 297
Recall that the purpose of the group G is to provide a class of linear maps Γ : T → T
arising as possible candidates for the action of “reexpanding” a “Taylor series” around
a different point. In our case, in view of (14.8) and Definition 14.3, the coefficients
of these reexpansions will naturally be some polynomials in x and in the expressions
appearing in (14.9). This suggests that we should define a space T + whose basis
vectors consist of formal expressions of the type

    X^k ∏_{i=1}^N J_{ℓ_i}(τ_i) ,    (15.15)

where N is an arbitrary but finite number, the τi are canonical basis elements in
W defined in (15.10), and the ℓi are d-dimensional multiindices satisfying |ℓi | <
deg τi + 2. (The last bound is a reflection of the restriction of the summands in (14.9)
with β = 2.) The space T + , which also contains the empty product 1, is endowed
with a natural commutative product, written as · or (usually) omitted. ((T + , ·, 1)
is nothing but the free commutative algebra over the symbols {Xi , Jℓ (τ )} with
i ∈ {1, . . . , d} and τ ∈ W with deg Jℓ (τ ) := deg τ + 2 − |ℓ| > 0.)

    ∆+ 1 = 1 ⊗ 1 ,  ∆+ Ξ = Ξ ⊗ 1 ,  ∆+ Xi = Xi ⊗ 1 + 1 ⊗ Xi ,
    ∆+ (τ τ̄ ) = ∆+ τ · ∆+ τ̄ ,
    ∆+ I(τ ) = (I ⊗ Id)∆+ τ + Σ_ℓ (X^ℓ / ℓ!) ⊗ Jℓ (τ ) ,
    ∆+ I ′ (τ ) = (I ′ ⊗ Id)∆+ τ + Σ_ℓ (X^ℓ / ℓ!) ⊗ Jℓ+(0,1) (τ ) .

Remark 15.10. A less explicit way to define G is to simply take it as the set of
all linear maps that are ‘allowed’ in the sense that they are upper triangular with
the identity on the diagonal as imposed by (13.5), commute with derivatives as in
Definition 14.1, are multiplicative with respect to the product as in Definition 14.3,
and satisfy (14.7). See for example [Hai16].
Example 15.11 (KPZ structure group). Running through this procedure, and restrict-
ing to T = TKPZ reveals G as a 7-dimensional (non-commutative) matrix group,
canonically realised as a subgroup of the invertible maps T → T , themselves repre-
sentable as 16 × 16-matrix. Full details are left for Exercise 15.1.
{J ′ ( ), J ′ ( ), . . . , J (Ξ), J ( ), X1 , J ( ), J ( ), . . .}
with (non-negative) degrees {1/2^≡, 1/2^−, . . . , 1^=, 1, 3/2^≡, 3/2^−, . . .} and shorthands J =
J(0,0) , J ′ = J(0,1) . We note that all symbols here can be represented by elementary
trees,2 where J (τ ) (resp. J ′ (τ )) is represented by attaching a single downfacing
wavy (resp. plain) line to the root of τ . For instance
3 · 1 − J (Ξ) + 2 · J ′ ( ) · J ′ ( ) ∈ T+
but the symbol J ′ (Ξ) (which would be of negative homogeneity) is not an element
of T + .
Before we show that G does indeed form a group (actually a subgroup of the
invertible maps from T to T ), we show how to use it to turn an admissible linear
2
With some goodwill this even includes X-factors, which then appear as polynomial decorations
of the trees.
map Π : T → C ∞ (Rd ) (in the sense of (15.14)) into a model (Π, Γ ). Consider the
recursion

    fx (Jℓ (τ )) = − Σ_{|k+ℓ|<|τ|+2} ((−x)^k / k!) ∫ D^{ℓ+k} K(x − y) Πx τ (dy) ,
    Πx τ = (Π ⊗ fx )∆+ τ ,    (15.18)

where we furthermore impose that the fx are characters, namely that they extend
to all of (T + )∗ in a multiplicative fashion, fx (σσ̄) = fx (σ)fx (σ̄). We leave it as a
simple exercise to verify that these two identities are sufficient to define the fx and
the Πx uniquely.
Remark 15.13. The correspondence Π ⇔ (Π, Γ ) can also be inverted and the two
notions of admissibility are consistent, so that these are two completely equivalent
ways of looking at admissible models for our regularity structure. Indeed, it suffices
to set Πτ = RHτ , where the elements Hτ ∈ D ∞ (i.e. one can make sure that
Hτ ∈ D γ for any fixed γ) are given by HX k (x) = (X + x)k , HΞ (x) = Ξ, and
then recursively by
In particular, this correspondence does not at all rely on the fact that the model
was built by lifting a smooth function. Note that this is strongly reminiscent of the
construction given in Exercise 13.11. See also Exercise 15.3.
it follows immediately from (15.18) that the Πx and the maps Γxy do indeed satisfy
the desired algebraic relation Πx Γxy = Πy . We also note that the coefficients of
the linear maps Γxy are expressed as polynomials of the numbers fx (Jℓi (τi )) and
fy (Jℓi (τi )) for suitable expressions τi and multiindices ℓi . Note that the linear maps
Fx : T → T perform a kind of “recentering” of Π around x in the sense that (15.18)
guarantees that, at least when Π is sufficiently smooth, Πx I(τ ) vanishes at the order
determined by the degree of τ . As a matter of fact, one could even have taken this as
the defining property of the maps Fx (together with the fact that they are of the form
(15.19) for some multiplicative functional fx ). We will see in Section 15.5 below
that the renormalisation procedure required to give a meaning to singular SPDEs
like the KPZ equation can equally be interpreted as a type of recentering procedure,
but this time in “probability space”. This also explains the terminology “positive
renormalisation” which is sometimes encountered for the maps Fx .
We now argue that G as defined above actually forms a group, so that in particular
the maps Fx are invertible. To this end, define a linear map ∆+ : T + → T + ⊗ T + ,
very similarly to the previously defined map ∆+ : T → T ⊗ T + , by
∆+ 1 = 1 ⊗ 1 , ∆+ X = X ⊗ 1 + 1 ⊗ X ,
Γf ◦g = Γf Γg .
Also, the element e is neutral in the sense that Γe is the identity operator, and as a
consequence Γ_{f^{−1}} = (Γ_f)^{−1} whenever f ∈ G+ . In particular then,
The fact that ∆+ preserves degree (as can be seen by induction from its definition)
and that elements of T + all have strictly positive degree, except for 1, leads to
the conclusion that, for every Γ ∈ G and every τ ∈ T , Γ τ is indeed of the form
(13.5). The multiplicativity property of ∆+ furthermore guarantees that the constraint
mentioned in Definition 14.3 does hold. This justifies our definition of structure group
G associated to T as the set of all multiplicative linear functionals on T + , acting on
T via (15.16), as given in (15.17), for G has group structure induced from G+ .
Returning to the relation between Πx and Π, we showed actually more, namely
that the knowledge of Π and the knowledge of (Π, Γ ) are equivalent. Indeed, on
the one hand one has Π = Πx Fx−1 and the map Fx can be recovered from Πx by
(15.18) and (15.19). On the other hand however, one also has of course Πx = ΠFx
and, if we equip T with an adequate recursive structure, then we have already seen
that the coefficients fx are uniquely determined by Π.
Furthermore, the correspondence (Π, Γ ) ↔ Π outlined above works for any
admissible model and does not at all rely on the fact that it was built by lifting a
continuous function. In particular, it does not rely on the fact that Πx and Π are
multiplicative. In the general case, the first identity in (15.13) may then of course
fail to be true, even if Πτ happens to be a continuous function for every τ ∈ T . The
only reason why our definition of an admissible model does not simply consist of
the single map Π is that there seems to be no simple way of describing the topology
given by Definition 13.5 in terms of Π.
Recall that, given any sufficiently regular function ξ (say a continuous space-time
function), there is a canonical way of lifting ξ to an admissible model L ξ = (Π, Γ )
for T by imposing (15.12) and (15.13), and then turning Π into a model as described
in the previous paragraph. With such a model L ξ at hand, it follows from (15.13)
and (13.26) that the associated reconstruction operator satisfies the properties
RKf = K ∗ Rf , R(f g) = Rf · Rg ,
as long as all the functions to which R is applied belong to D γ for some γ > 0. As
a consequence, applying the reconstruction operator R to both sides of (15.8), we
see that if H solves (15.8) then, provided that the model (Π, Γ ) = L ξ was built as
above starting from any continuous realisation ξ of the driving noise, the function
h = RH solves the equation (15.1).
At this stage, the situation is as follows. For any continuous realisation ξ of the
driving noise, we have factorised the solution map (h0 , ξ) 7→ h associated to (15.1)
into maps
(h0 , ξ) 7→ (h0 , L ξ) 7→ H 7→ h = RH ,
where the middle arrow corresponds to the solution to (15.8) in some weighted
D γ -space. The advantage of such a factorisation is that the last two arrows yield
continuous maps, even in topologies sufficiently weak to be able to describe driving
noise having the lack of regularity of space-time white noise. The only arrow that
isn’t continuous in such a weak topology is the first one. At this stage, it should
be believable that a similar construction can be performed for a very large class of
semilinear stochastic PDEs, provided that certain scaling properties are satisfied.
This is indeed the case and large parts of this programme have been carried out in
[Hai14b].
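The failure of continuity of the first arrow already has a well-known finite-dimensional analogue in rough path theory, where the lift of a smooth path consists of the path together with its iterated integrals. The following sketch is an illustration by analogy only and is not taken from the construction above: the planar loops x_n(t) = (cos(n^2 t), sin(n^2 t))/n tend to zero uniformly, yet their Lévy area over [0, 2π] stays equal to π. The function names and the step counts are our choices.

```python
import math


def path(n, t):
    """Smooth planar loop of radius 1/n traversed at angular speed n^2."""
    return (math.cos(n * n * t) / n, math.sin(n * n * t) / n)


def sup_norm(n, steps):
    """Largest Euclidean norm of the path on a uniform grid of [0, 2 pi]."""
    return max(math.hypot(*path(n, 2 * math.pi * i / steps))
               for i in range(steps + 1))


def levy_area(n, steps):
    """Trapezoidal approximation of (1/2) * integral of (x dy - y dx)
    over [0, 2 pi], i.e. the signed area swept by the path."""
    area = 0.0
    x0, y0 = path(n, 0.0)
    for i in range(1, steps + 1):
        x1, y1 = path(n, 2 * math.pi * i / steps)
        area += 0.5 * (x0 * y1 - y0 * x1)
        x0, y0 = x1, y1
    return area


if __name__ == "__main__":
    for n in (2, 5, 10):
        steps = 200 * n * n  # about 200 grid points per revolution
        print(n, sup_norm(n, steps), levy_area(n, steps))
```

The path component of the lift converges to zero while the second-order component does not, which is the kind of discontinuity that the factorisation isolates in the map ξ ↦ L ξ.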
Given this construction, one is led naturally to the following question: given a
sequence ξε of “natural” regularisations of space-time white noise, for example as
in (15.6), do the lifts L ξε converge in probability in a suitable space of admissible
models? Unfortunately, unlike in the theory of rough paths where this is very often
the case (see Section 10), the answer to this question in the context of SPDEs is often
an emphatic no. Indeed, if it were the case for the KPZ equation, then one would
have been able to choose the constant Cε to be independent of ε in (15.6), which is
certainly not the case.
One way of circumventing the fact that L ξε does not converge to a limiting model
as ε → 0 is to consider instead a sequence of renormalised models. The main idea
is to exploit the fact that our definition of an admissible model does not impose the
multiplicative identity
Πτ τ̄ = Πτ · Π τ̄ ,
used in (15.13) for the canonical lift, even in situations where ξ itself happens to be a
continuous function. One question that then imposes itself is: what are the natural
ways of “deforming” the usual product which still lead to lifts to an admissible model?
It turns out that the regularity structure whose construction was sketched above comes
equipped with a natural finite-dimensional group of continuous transformations R
on its space of admissible models (henceforth called the “renormalisation group”),
which essentially amounts to the space of all natural deformations of the product. It
then turns out that even though the canonical lift L ξε does not converge, it is possible
to find a sequence Mε of elements in R such that the sequence Mε L ξε converges
to a limiting model (Π̂, Γ̂ ). Unfortunately, the elements Mε do not preserve the
image of L in the space of admissible models. As a consequence, when solving the
fixed point map (15.8) with respect to the model Mε L ξε and inserting the solution
into the reconstruction operator, it is not clear a priori that the resulting function
(or distribution) can again be interpreted as the solution to some modified PDE. It
turns out that in the present setting this is again the case and the modified equation
is precisely given by (15.6), where Cε is some linear combination of the constants
appearing in the description of Mε .
15.5 Renormalisation of the KPZ equation
How does all this help with the identification of a natural class of deformations for
the usual product? Throughout this section, we will only consider models constructed
from a single map Π by the recursive procedure given in (15.18), combined with
(15.20). At this point, we crucially note that if Π : T → C ∞ (Rd ) is an arbitrary
admissible linear map (in the sense that ΠIτ = K ∗ Πτ as before), then there is no
reason in general for (15.18) and (15.20) to define a model. The reason is that while
these definitions do guarantee that Πx Iτ satisfies the first bound in (13.13), there
is no reason in general for Πx τ (y) to vanish at the right order as y → x for an
arbitrary symbol τ that is not obtained by applying the integration map to some other
symbol. It is however the case that these bounds hold whenever Π is obtained as the
canonical lift of a smooth function, as can easily be seen from the multiplicativity
property of the canonical lift.
This suggests defining a space M∞ consisting of those admissible maps Π : T →
C ∞ (Rd ) which do generate a model by the above procedure. By Remark 15.13, there
is a canonical bijection between M∞ and the set of all smooth admissible models,
so we henceforth also call an element Π ∈ M∞ simply a model (or an admissible
model). Note that even though the space of linear maps T → C ∞ (Rd ) is linear, the
space M∞ is far from being a linear space.
At this stage, we would like to introduce probability into the game. For this, note
first that we have a natural action S of the group of translations (Rd , +) onto T by
setting Sh X k = (X + h)k , Sh Ξ = Ξ, and then recursively by
Sh Iτ = ISh τ , Sh τ τ̄ = Sh τ Sh τ̄ .
Π M τ = ΠM τ , (15.24)
one would like to have again Π M ∈ M∞ . It is clear that in order to guarantee this,
M needs to commute with the integration operators I and I ′ , but this alone is by no
means sufficient.
It turns out that the construction of a natural family of operators with the required
properties goes in a way that is strongly reminiscent of the construction of the
structure group, but with many aspects of the construction “reversed”. A natural
starting point of the construction is given by the set W− ⊂ W consisting of the
canonical basis vectors of strictly negative degree of our regularity structure T which
furthermore have the property that they can be built from products and integrations
applied to Ξ, i.e. do not involve any X k for k > 0. We then define T − similarly to
T + as the free unital algebra generated by W− , i.e.3
T − def= Alg{ , , , , , , , } ,
the algebra given by all polynomials with real coefficients and indeterminates in
W− ; the unit is denoted by 1 (or, equivalently, as the empty forest ̸#). The reason
why W− is expected to play a major role is that, by combining Exercise 13.11 with
admissibility and multiplicativity of the action of Γ , Πτ for deg τ > 0 is uniquely
determined by the knowledge of Πτ for all symbols τ with deg τ ≤ 0.
By analogy with the BPHZ renormalisation procedure in quantum field theory
[BP57, Hep69, Zim69], it is natural to look for renormalisation maps that consist in
“contracting subtrees of negative degree”. In order to formalise such an operation, we
take more seriously the interpretation of the canonical basis elements of T as “trees”.
More precisely, we consider labelled trees τ = (V, E, ϱ, n, e), where V is a finite
vertex set, E ⊂ V × V is an edge set, ϱ ∈ V is a root, n : V → Nd is a “polynomial
label” and e : E → {Ξ, I, I ′ } is an “edge label”. As usual, we identify labelled
trees if they can be related by a tree isomorphism preserving the root and labels.
The way this correspondence works is as follows. The symbol X k is represented as
the (unique) tree with a sole vertex V = {ϱ} and polynomial label n(ϱ) = k. The
symbol Ξ is represented by the tree with two vertices V = {ϱ, •}, one (oriented)
edge E = {e} = {(•, ϱ)}, and labels n = 0, e(e) = Ξ. Integration is then performed
by adding an edge of the corresponding type to the root, i.e. we have for example
3
As in the case of rough volatility, cf. 14.26, we colour basis elements of T − differently to
distinguish them from those of T and / or T + . Elements in T − are naturally represented as
(unordered) forests.
where In(ϱ̄) = 0 and otherwise agrees with n, while Ie((ϱ, ϱ̄)) = I and again
otherwise agrees with e. Multiplication is obtained by joining roots:
(V, E, ϱ, n, e) · (V̄ , Ē, ϱ̄, n̄, ē) = ((V ⊔ V̄ )/{ϱ, ϱ̄}, E ⊔ Ē, {ϱ, ϱ̄}, n ⊔ n̄, e ⊔ ē) ,
∆− = ⊗1+1⊗ +2 ⊗ + ⊗
+ ⊗ +2 ⊗ + ⊗ + ⊗ (15.26)
+2 ⊗ +2 ⊗ +2 ⊗ ,
where we used red symbols to denote elements of T − just as in Section 14.5. In most
situations it is natural to only consider characters of T − that vanish on planted trees,
4
Mind that ≡ ⊂ in three distinct ways which explains the terms 2 ⊗ + ⊗ .
i.e. trees with only one edge incident to the root,5 in which case this simplifies to
∆− = ⊗1+1⊗ +2 ⊗ + ⊗ .
Note also that there is for example no term ⊗ appearing in (15.26); indeed
fails to have negative degree, hence is not an element of T − and killed by Q− .
Remark 15.17. While the present construction is sufficient for KPZ, in full generality,
one should also allow polynomial decorations for elements in T − in which case
the expression for ∆− involves additional combinatorial factors, similarly to the
definition of ∆+ .
Mg τ = (g ⊗ Id)∆− τ , (15.27)
then corresponds to iterating over all ways of contracting subtrees of negative degree
contained in τ and replacing them by the corresponding constant assigned to it by g.
This corresponds to replacing a kernel of possibly several variables by a multiple of
a Dirac delta function forcing all arguments to collapse.
Similarly to before, one can also define an operator ∆− : T − → T − ⊗ T − by
setting
∆− B = ∑_{A⊂B} Q− A ⊗ Q− R_A B ,
exactly the same as discussed here, except that there are two “noises” Ξ1 and Ξ2
and every occurrence of Ξ can be replaced by either of them, the old definition
would for example include the map M that swaps the two noises in a consistent
way. (Consistency is in the sense that M I ′ (Ξ2 )I ′ (I ′ (Ξ1 )2 ) = I ′ (Ξ1 )I ′ (I ′ (Ξ2 )2 )
for example.) This is not an operation that is described by a character of T − . The
advantage of the present definition is that it is much more explicit. Furthermore, it
follows from the analytical results of [CH16] that it is sufficiently large to serve the
purpose of renormalising divergent models.
∆− = ⊗1+1⊗ +2 ⊗ + ⊗ + ⊗ .
Note that we have not considered the simplification of removing planted trees. Instead,
the analogues of the remaining terms appearing in (15.26) are killed by the projection
Q− . We also note that this expression is symmetric in the two factors T − which
is the case for all the symbols appearing in the analysis of the KPZ equation. This
implies that the KPZ renormalisation group R is abelian. (In general though, the
presence of “overlapping divergencies” can cause R to be non-abelian.)
Π^g_x = Π_x M_g ,   Γ^g_{xy} = M_g^{−1} Γ_{xy} M_g .   (15.28)
Proof. We sketch the proof. Recall that ∆− has been defined (with notational overload)
as a map T → T − ⊗ T and T − → T − ⊗ T − ; we now also define
∆− : T + → T − ⊗ T + as a multiplicative linear map, determined by
∆− Xi = 1 ⊗ Xi , ∆− Jℓ (τ ) = (Id ⊗ Jℓ (·))∆− τ .
In the special case of KPZ one can check by hand that, thanks in particular to the
fact that I ′ (1) = 0 by Remark 15.6 (which correctly suggests that we should also
impose J ′ (1) = 0),
(i) On T one has the cointeraction formula
Recall the correspondence Π ⇔ (Π, Γ ) given in Remark 15.13. With the special
properties (i)-(ii) it is straightforward to verify that, for g ∈ R arbitrary, Π g = ΠMg
defines a model Π g ⇔ (Π g , Γ g ) with
One then uses (i), on T , to show that the required identity for Π^g_z holds. Finally, one
uses (i), on T + to show that if one views Mg = (g ⊗ Id)∆− as acting on T + , then
its action distributes over the product in the character group defined in (15.23) in the
sense that (Mg f ) ◦ (Mg f¯) = Mg (f ◦ f¯), which then implies the required identity
for γ^g_{xy} . The fact that the action of Mg does not decrease degrees guarantees that
(Π g , Γ g ) is again a model (since (Π, Γ ) is). □
Remark 15.22. In general (i.e. in the case of similar regularity structures set up
for different examples of subcritical semilinear SPDEs), the cointeraction property
(15.29) may fail. It turns out however that it can still be rescued by working in a
suitably extended regularity structure, see [Hai16, BHZ19].
One important feature of this theorem is that the last statement provides quantita-
tive bounds on the map Π 7→ Π g which show that it can be extended to a continuous
action of R onto the space M of all admissible models. A crucial property of R is
that it is sufficiently large to allow us to “recenter” models in a natural way.
Definition 15.23. Let ξ be a (smooth) stationary stochastic process and let Π be its
canonical lift. Then, there exists a unique character g BPHZ ∈ R such that Π BPHZ =
ΠMgBPHZ satisfies E(Π BPHZ τ )(0) = 0 for every canonical basis vector τ ∈ T with
deg τ < 0. We call Π BPHZ the BPHZ lift of ξ.
Remark 15.24. This is named after Bogoliubow, Parasiuk, Hepp and Zimmermann
[BP57, Hep69, Zim69] who introduced an analogous renormalisation procedure in
the context of perturbative quantum field theory in the sixties.
Remark 15.25. Note also that while the BPHZ lift of a noise ξ is “canonical”, it
does depend on the choice of kernel K for our notion of admissibility. In particular,
different truncations of the heat kernel will in general lead to different values for the
BPHZ renormalisation constants.
A beautiful property of the BPHZ lift is that it is much more stable than the
canonical lift. Indeed, it was shown in [CH16] that one can introduce a natural
measure of the “size” N (ξ) of a stationary noise ξ which is such that for any sequence
ξn such that supn N (ξn ) < ∞ and ξn → ξ in probability as random distributions, the
corresponding BPHZ lifts Π^BPHZ_n converge to a limiting model Π^BPHZ . This limiting
model is furthermore independent of the choice of approximating sequence.
g( ) = C0 , g( ) = C1 , g( ) = C2 , g( ) = C3 , (15.30)
and set to vanish on the remaining symbols which require no renormalisation. The
resulting renormalisation map M : T → T is then given by M := (g ⊗ Id)∆− .
(It turns out that we only need a three-parameter subgroup of R to renormalise the
equation, but in order to explain the procedure we prefer to work with the larger
4-dimensional subgroup of R.) It is now rather straightforward to show the following:
Proof. By Theorem 14.5, it turns out that (15.8) can be solved in D γ as soon as γ is
a little bit greater than 3/2. Therefore, we only need to keep track of its solution H
up to terms of degree 3/2. By repeatedly applying the identity (15.9), we see that the
solution H ∈ D γ for γ close enough to 3/2 is necessarily of the form
H = h1 + + + h′ X1 + 2 + 2h′ ,
∂H = + + h′ 1 + 2 + 2h′ , (15.31)
It follows from the definition of M that one then has the identity
M ∂H = ∂H − 4C0 ,
so that, as an element of D γ with very small (but positive) γ, one has the identity
As a consequence, after neglecting all terms of strictly positive order, one has the
identity (writing c instead of c1 for real constants c)
Combining this with the fact that M and ∂ commute, the claim now follows at once. □
Remark 15.27. It turns out that, thanks to the symmetry x 7→ −x enjoyed by our
problem, the corresponding model can be renormalised by a map M as above, but
with C0 = 0. The reason why we considered the general case here is twofold. First, it
shows that it is possible to obtain renormalised equations that differ from the original
equation in a more complicated way than just by the addition of a large constant.
Second, if one tries to approximate the KPZ equation by a microscopic model which
is not symmetric under space inversion, then the constant C0 plays a non-trivial role,
see for example [HS17].
It remains to argue why one expects to be able to find constants C_i^ε such that the
sequence of renormalised models M^ε L ξε with M^ε = exp(∑_{i=1}^{3} C_i^ε L_i ) converges
to a limiting model. Instead of considering the actual sequence of models, we only
consider the sequence of stationary processes Π̂^ε τ := Π^ε M^ε τ , where Π^ε is
associated to (Π^ε , Γ^ε ) = L ξε as in Section 15.5.1.
Remark 15.28. It is important to note that we do not attempt here to give a full proof
that the renormalised model converges to a limit in the correct topology for the space
of admissible models. We only aim to argue that it is plausible that Π̂^ε converges
to a limit in some topology. A full proof of convergence (but in a slightly different
setting) can be found in [Hai13], see also [Hai14b, Section 10] and [CH16] for most
general statements.
Since there are general arguments available to deal with all the expressions τ
of positive degree as well as expressions of the type I ′ (τ ) and Ξ itself, we restrict
ourselves to those that remain. Inspecting (15.11), we see that they are given by
, , , , .
For this part, some elementary notions from the theory of Wiener chaos expansions
are required, but we’ll try to hide this as much as possible. At a formal level, one has
the identity
Π ε = K ′ ∗ ξε = Kε′ ∗ ξ ,
where the kernel Kε′ is given by Kε′ = K ′ ∗ δε . This shows that, at least formally,
one has
Π^ε (z) = (K′ ∗ ξ_ε)(z)^2 = ∬ K′_ε(z − z_1) K′_ε(z − z_2) ξ(z_1) ξ(z_2) dz_1 dz_2 .
Similar but more complicated expressions can be found for any formal expression τ .
This naturally leads to the study of random variables of the type
I_k(f) = ∫ · · · ∫ f(z_1, . . . , z_k) ξ(z_1) · · · ξ(z_k) dz_1 · · · dz_k .   (15.33)
Ideally, one would hope to have an Itô isometry of the type EIk (f )Ik (g) =
⟨f sym , g sym ⟩, where ⟨·, ·⟩ denotes the L2 -scalar product and f sym denotes the sym-
metrisation of f . This is unfortunately not the case. Instead, one should replace the
products in (15.33) by Wick products, which are formally generated by all possible
contractions of the type
If we then set
Î_k(f) = ∫ · · · ∫ f(z_1, . . . , z_k) ξ(z_1) ⋄ · · · ⋄ ξ(z_k) dz_1 · · · dz_k ,
Finally, one has EIˆk (f )Iˆℓ (g) = 0 if k ̸= ℓ. Random variables of the form Iˆk (f ) for
some k ≥ 0 and some square integrable function f are said to belong to the kth
homogeneous Wiener chaos.
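The role of the contractions can be checked by hand in a finite-dimensional analogue, in which white noise is replaced by finitely many i.i.d. standard Gaussians ξ_1, . . . , ξ_N and kernels by N × N arrays. The sketch below is written under that assumption, with function names of our choosing. It computes Gaussian moments by summing over pairings (Isserlis' formula), then compares the plain quadratic form I_2(f ) with its Wick-ordered version built from ξ_i ⋄ ξ_j = ξ_i ξ_j − δ_ij. In this discrete normalisation the covariance of the Wick-ordered forms comes out as 2⟨f sym , g sym ⟩, and the normalisation used in the continuum statement may differ from this by convention.

```python
def pairings(items):
    """Yield all ways of splitting the list `items` into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        for tail in pairings(rest[:i] + rest[i + 1:]):
            yield [(first, partner)] + tail


def moment(indices):
    """E[xi_{i1} ... xi_{in}] for i.i.d. standard Gaussians: the number of
    pairings of the positions in which every pair carries equal indices."""
    return sum(all(a == b for a, b in p) for p in pairings(list(indices)))


def mean_I2(f):
    """E[I_2(f)] for I_2(f) = sum_{i,j} f[i][j] xi_i xi_j (plain products)."""
    n = len(f)
    return sum(f[i][j] * moment((i, j)) for i in range(n) for j in range(n))


def cov_wick_I2(f, g):
    """E[I^_2(f) I^_2(g)] with Wick products xi_i <> xi_j = xi_i xi_j - delta_ij."""
    n = len(f)
    total = 0.0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                for l in range(n):
                    # E[(xi_i xi_j - d_ij)(xi_k xi_l - d_kl)]
                    e = moment((i, j, k, l)) - (i == j) * (k == l)
                    total += f[i][j] * g[k][l] * e
    return total


def sym_inner(f, g):
    """Inner product of the symmetrised arrays f_sym and g_sym."""
    n = len(f)
    return sum((f[i][j] + f[j][i]) * (g[i][j] + g[j][i]) / 4.0
               for i in range(n) for j in range(n))


if __name__ == "__main__":
    f = [[1.0, 2.0], [0.0, 3.0]]
    g = [[0.5, -1.0], [4.0, 1.0]]
    print(mean_I2(f))          # the trace of f: the plain form is not centred
    print(cov_wick_I2(f, g), 2 * sym_inner(f, g))
```

The nonzero mean of the plain quadratic form is the discrete counterpart of the diverging constants that the renormalisation procedure removes.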
Returning to our problem, we first argue that it should be possible to choose M^ε
in such a way that Π̂^ε converges to a limit as ε → 0. The above considerations
suggest that one should rewrite Π^ε as
Π^ε (z) = (K′ ∗ ξ_ε)(z)^2 = ∬ K′_ε(z − z_1) K′_ε(z − z_2) ξ(z_1) ⋄ ξ(z_2) dz_1 dz_2 + C_ε^{(1)} ,   (15.34)
where the constant C_ε^{(1)} is given by the contraction
C_ε^{(1)} := ∫ (K′_ε(z))^2 dz .
Note now that K′_ε is an ε-approximation of the kernel K′ which has the same singular
behaviour as the derivative of the heat kernel. In terms of the parabolic distance, the
singularity of the derivative of the heat kernel scales like K′(z) ∼ |z|^{−2} for z → 0.
(Recall that we consider the parabolic distance |(t, x)| = √|t| + |x|, so that this is
consistent with the fact that the derivative of the heat kernel is bounded by t^{−1} .) This
suggests that one has (K′_ε(z))^2 ∼ |z|^{−4} for |z| ≫ ε. Since parabolic space-time has
scaling dimension 3 (time counts double!), this is a non-integrable singularity. As a
matter of fact, there is a whole power of z missing to make it borderline integrable,
which suggests that one has
C_ε^{(1)} ∼ 1/ε .
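This power counting can be tested numerically. The sketch below makes several simplifying assumptions that are ours and not part of the argument above: K is taken to be the heat kernel of ∂t − ∂x2 on the whole line, the mollification is replaced by the hard cutoff t > ε^2, and the kernel is truncated at t < 1. Under these assumptions the integral of (K′)^2 equals (1/ε − 1)/(4√(2π)) in closed form, so that ε times the constant should settle near 0.0997.

```python
import math


def heat_kernel_dx(t, x):
    """Spatial derivative of K(t, x) = exp(-x^2 / (4 t)) / sqrt(4 pi t)."""
    k = math.exp(-x * x / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)
    return -x / (2.0 * t) * k


def spatial_l2_norm_sq(t, n=400):
    """Midpoint rule for the integral over x of K'(t, x)^2."""
    width = 12.0 * math.sqrt(t)  # the integrand is negligible beyond this
    h = 2.0 * width / n
    return h * sum(heat_kernel_dx(t, -width + (i + 0.5) * h) ** 2
                   for i in range(n))


def truncated_constant(eps, n_t=200):
    """Integral of K'(t, x)^2 over eps^2 < t < 1 and all x, using a
    midpoint rule in the variable u = log t (so that dt = t du)."""
    a = math.log(eps * eps)
    h = (0.0 - a) / n_t
    total = 0.0
    for i in range(n_t):
        t = math.exp(a + (i + 0.5) * h)
        total += spatial_l2_norm_sq(t) * t * h
    return total


if __name__ == "__main__":
    for eps in (0.1, 0.01, 0.001):
        print(eps, eps * truncated_constant(eps))
```

The product ε · C stabilises as ε decreases, in line with a divergence of order 1/ε rather than a logarithmic one.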
This already shows that one should not expect Π^ε to converge to a limit as ε → 0.
However, it turns out that the first term in (15.34) converges to a distribution-valued
stationary space-time process, so that one would like to somehow get rid of this
diverging constant C_ε^{(1)} . This is exactly where the renormalisation map M^ε (in
particular the factor exp(−C1 L1 )) enters into play. Following the above definitions,
we see that one has
Π̂^ε (z) = Π^ε M (z) = Π^ε (z) − C1 .
This suggests that if we make the choice C1 = C_ε^{(1)} , then Π̂^ε does indeed converge
to a non-trivial limit as ε → 0. This limit is a distribution given, at least formally, by
Π^ε (ψ) = ∬ ψ(z) K′(z − z_1) K′(z − z_2) dz ξ(z_1) ⋄ ξ(z_2) dz_1 dz_2 .
Using again the scaling properties of the kernel K ′ , it is not too difficult to show that
this yields indeed a random variable belonging to the second homogeneous Wiener
chaos for every choice of smooth test function ψ.
The case τ = is treated in a somewhat similar way. This time one has
where the constant C_ε^{(0)} is given by the contraction
C_ε^{(0)} := ∫ K′_ε(z) (K′ ∗ K′_ε)(z) dz .
This time however K′_ε is an odd function (in the spatial variable) and K′ ∗ K′_ε is an
even function, so that C_ε^{(0)} vanishes for every ε > 0. This is why we can set C0 = 0
and no renormalisation is required for .
Remark 15.29. The factor 2 comes from the fact that the contraction (15.35) appears
twice, since it is equal to the contraction . In principle, one would think that the
contraction also contributes to C_ε^{(2)} . This term however vanishes due to the fact
that the integral of K′_ε vanishes.
C_ε^{(3)} := 2 ∫ K′(z) K′(z̄) Q_ε(z̄) Q_ε(z + z̄) dz dz̄ ,
which diverges logarithmically for exactly the same reason as C_ε^{(2)} . Setting C2 =
C_ε^{(2)} , this diverging constant can again be cancelled out. The combinatorial factor 2
arises in essentially the same way as for and the contribution of the term where
the two top nodes are contracted vanishes for the same reason as previously.
It remains to consider the contribution of Π ε to the second Wiener chaos. This
contribution consists of three terms, which correspond to the contractions
It turns out that the first one of these terms does not give rise to any singularity. The
last two terms can be treated in essentially the same way, so we focus on the last one,
Note that RW has the property that if φ(0) = 0, then it simply corresponds to
integration against W , which is the standard way of associating a distribution to
a function. Furthermore, the above expression is always well-defined, since φ is
smooth and therefore the factor (φ(x) − φ(0)) cancels out the singularity of W at
the origin. It is also straightforward to verify that if Wε is a sequence of smooth
approximations to W (say one has Wε (x) = W (x) for |x| > ε and |Wε | ≲ 1/ε
otherwise) which has the property that each Wε integrates to 0, then Wε → RW in
a distributional sense.
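A one-dimensional toy case makes this convergence easy to check numerically. The sketch below assumes, purely for illustration, that W (x) = 1/|x| on |x| < 1 and W = 0 elsewhere, and that RW (φ) is the integral of W (x)(φ(x) − φ(0)), which is the form suggested by the two properties just listed. The approximation Wε agrees with W on ε < |x| < 1 and is a constant cε on |x| ≤ ε, chosen so that Wε integrates to zero; in this borderline example cε is of order log(1/ε)/ε. All function names are ours.

```python
import math


def midpoint(f, a, b, n=20000):
    """Midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def renormalised(phi):
    """RW(phi) = integral over |x| < 1 of (phi(x) - phi(0)) / |x|.
    The integrand is bounded but jumps at 0, so the two sides are split."""
    g = lambda x: (phi(x) - phi(0.0)) / abs(x)
    return midpoint(g, -1.0, 0.0) + midpoint(g, 0.0, 1.0)


def approximation(phi, eps):
    """Pairing of phi with W_eps, where W_eps = 1/|x| on eps < |x| < 1 and
    W_eps = c_eps on |x| <= eps, with c_eps fixed by total integral zero."""
    c_eps = math.log(eps) / eps  # solves 2 * eps * c_eps = -2 * log(1 / eps)
    # Outer part after substituting x = +-exp(u), for which dx / |x| = du.
    outer = midpoint(lambda u: phi(math.exp(u)) + phi(-math.exp(u)),
                     math.log(eps), 0.0)
    inner = c_eps * midpoint(phi, -eps, eps)
    return outer + inner


if __name__ == "__main__":
    target = renormalised(math.exp)
    for eps in (0.1, 0.01, 0.001):
        print(eps, approximation(math.exp, eps) - target)
```

The two large contributions in `approximation` cancel to leading order, and what remains approaches the renormalised pairing as ε decreases.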
In the same way, one can show that Q̂ε converges as ε → 0 to a limiting distribu-
tion R Q̂. As a consequence, one can show that η ε converges to a limiting (random)
distribution η given by
Z
η(ψ) = ψ(z0 ) R Q̂(z0 −z1 )K ′ (z2 −z1 )K ′ (z3 −z2 )K ′ (z4 −z2 ) ξ(z3 )⋄ξ(z4 ) dz .
It should be clear from this whole discussion that while the precise values of the
constants Ci depend on the details of the mollifier δε , the limiting (random) model
(Π̂, Γ̂ ) obtained in this way is independent of it. Combining this with the continuity
of the solution to the fixed point map (15.8) and of the reconstruction operator R
with respect to the underlying model, we see that the statement of Theorem 15.2
follows almost immediately.
15.6 The KPZ equation and rough paths
In the particular case of the KPZ equation, it turns out that it is possible to give a robust
solution theory by only using “classical” controlled rough path theory, as exposed in
the earlier part of this book. This is actually how it was originally treated in [Hai13].
To see how this can be the case, we make the following crucial remarks:
1. First, looking at the expression (15.31) for ∂H, we see that most symbols come
with constant coefficients. The only non-constant coefficients that appear are
in front of the term 1, which is some kind of renormalised value for ∂H, and
in front of the term . This suggests that the problem of finding a solution h to
the KPZ equation (or equivalently a solution h′ to the corresponding Burgers’
equation) can be simplified considerably by considering instead the function v
given by
v = ∂x h − Π + + 2 , (15.36)
where Π is the operator given by (15.12–15.14).
2. The only symbol τ appearing in ∂H such that deg τ + deg < 0 is the symbol
. Furthermore, one has
∆1 = 1 ⊗ 1 , ∆ = ⊗ 1 + 1 ⊗ J ′( ) ,
∆ = ⊗1, ∆ = ⊗ 1 + ⊗ J ′( ) .
It then follows from this and the definition (15.16) of the structure group G that
the space ⟨ , , 1, ⟩ ⊂ T is invariant under the action of G. Furthermore, its ac-
tion on this subspace is completely described by one real number corresponding
to J ′ ( ). Finally, viewing this subspace as a regularity structure in its own right,
we see that it is nothing but the regularity structure of Section 13.3.2, provided
that we make the identifications ∼ Ẇ , ∼ W , and ∼ Ẇ.
3. One has the identities
∆ = ⊗ 1 + ⊗ J ′( ), ∆ = ⊗ 1 + ⊗ J ′( ),
so that the pair of symbols { , } could also have played the role of {W , Ẇ}
in the previous remark.
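The action of the structure group on such a four-dimensional sector can be written out as explicit matrices. The sketch below assumes the ordered basis (Ẇ, 𝕎̇, 1, W ) suggested by the identifications in the second remark, and the action Γc W = W + c 1, Γc 𝕎̇ = 𝕎̇ + c Ẇ, with Ẇ and 1 left fixed, which is how we read the coproduct formulas stated there; the names `gamma` and `matmul` are ours. It checks that c ↦ Γc is a one-parameter group.

```python
def gamma(c):
    """Matrix of Gamma_c in the ordered basis (Wdot, WWdot, 1, W).
    Column j holds the coordinates of the image of the j-th basis vector."""
    return [
        [1.0, c, 0.0, 0.0],    # Wdot-component: WWdot -> WWdot + c * Wdot
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, c],    # 1-component:    W -> W + c * 1
        [0.0, 0.0, 0.0, 1.0],
    ]


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


if __name__ == "__main__":
    # Composition adds the parameters, so Gamma_{-c} inverts Gamma_c.
    for row in matmul(gamma(0.3), gamma(-1.2)):
        print(row)
```

The matrices are upper triangular with unit diagonal, so Γc − Id only produces terms of strictly lower degree.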
Let now ξ be a smooth function and let h be given by the solution to the unrenor-
malised KPZ equation (15.1). Defining Π by ΠΞ = ξ and then recursively as in
(15.13), and defining v by (15.36), we then obtain for v the equation
∂t v = ∂x2 v + ∂x v Π + 4 Π
+R, (15.37)
where the “remainder” R belongs to C α for every α < −1. Similarly to before, it also
turns out that if we replace Π by Π̂ = Π M defined as in (15.24) (with C0 = 0) and
take h to be the solution to the renormalised KPZ equation (15.6) with Cε = C1 + C2 + 4C3 ,
then v also satisfies (15.37), but with Π replaced by the renormalised model Π̂.
∂t Z = ∂x2 Z + ∂x2 Y ,
15.7 Exercises
Exercise 15.1 (KPZ Structure Group) Consider the 16-dimensional KPZ regular-
ity structure with T = TKPZ given by
T = ⟨ Ξ, , , , , , , , 1, , , , , X1 , , ⟩.
Ξ 1 X1
Ξ 1
1
1
1 c1 c2
1
1
1
1
1 1 c1 c2 c3 c4 c5 c6 c7
1
1
1
1
X1
1 c1 c2
1
1
where empty entries mean zeros. Note that the upper-triangular form reflects the fact
that Γ − Id is only allowed to produce lower order terms. (Remark: It is immediate
from this representation that ⟨ , , 1, ⟩ and ⟨ , , 1, ⟩ are indeed sectors, with
“rough path” index set {−1/2^−, 0^−, 0, 1/2^−}, and action of the structure group exactly
as in the rough path case (13.12) (with “h” replaced by c1 and c2 , respectively).)
Solution. We first derive the coaction on all the symbols, and here prefer to write ∆
for the coaction and keep ∆+ for the coproduct on T + . By definition of the coaction,
∆(Ξ) = Ξ ⊗ 1 and
∆( ) = I′(Ξ) ⊗ 1 + ∑_{k∈N^2} (X^k / k!) ⊗ J′_k(Ξ) = ⊗ 1 ,
since deg Jk′ (Ξ) = deg Jk+(0,1) (Ξ) = deg Ξ + 1 − |k| < 0 so that Jk′ (Ξ) = 0.
Similarly, write ∆ instead of ∆+ for better readability,
∆( ) = ∆( )∆( ) = ( ⋆ ) ⊗ 1 = ⊗ 1,
∆ = ∆I′( ) = . . . = ⊗ 1,
∆ = ∆( )∆ = . . . = ⊗ 1,
∆ = ∆ ∆ = ⊗ 1,
∆ = ⊗ 1 + 1 ⊗ J′( ) ,
∆ = ∆( )∆ = ⊗ 1 + ⊗ J′( ) .
Note the interpretation of cutting off positive branches: deg J′( ) = 1 + 3(−3/2^−) +
4 = 1/2^− > 0, and also deg J′( ) = 1/2^− as seen in
∆ = ⊗ 1 + 1 ⊗ J′( ),
∆ = ∆ ∆( ) = ⊗ 1 + ⊗ J′( ).
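Degree computations of this kind are purely mechanical and can be delegated to a few lines of code. The sketch below assumes the usual KPZ bookkeeping: deg Ξ = −3/2 − κ for an arbitrarily small κ > 0, degrees add under products, integration against the kernel (I or J ) gains 2, and integration against its spatial derivative (I′ or J′) gains 1. Storing a degree a − mκ as the pair (a, m), and all function names, are our choices; the notation a^− in the text does not keep track of m.

```python
from fractions import Fraction

# A degree a - m * kappa (kappa > 0 arbitrarily small) is stored as (a, m).
XI = (Fraction(-3, 2), 1)  # deg Xi = -3/2 - kappa


def times(*degs):
    """Degrees add under multiplication of symbols."""
    return (sum(d[0] for d in degs), sum(d[1] for d in degs))


def integrate(deg):
    """I(tau) or J(tau): integration against the kernel gains 2."""
    return (deg[0] + 2, deg[1])


def integrate_dx(deg):
    """I'(tau) or J'(tau): the spatial derivative of the kernel gains 1."""
    return (deg[0] + 1, deg[1])


if __name__ == "__main__":
    dxi = integrate_dx(XI)                     # I'(Xi)
    square = times(dxi, dxi)                   # I'(Xi)^2
    cubic = times(dxi, integrate_dx(square))   # I'(Xi) I'(I'(Xi)^2)
    for name, deg in [("I'(Xi)", dxi), ("I'(Xi)^2", square),
                      ("I'(Xi) I'(I'(Xi)^2)", cubic),
                      ("J' of the latter", integrate_dx(cubic)),
                      ("J(Xi)", integrate(XI))]:
        print(name, deg)
```

For the symbol with three noises and four derivative kernels this reproduces the count 1 + 3(−3/2^−) + 4 = 1/2^− made above.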
To deal with = I(Ξ), note deg J (Ξ) > 0, deg J ′ (Ξ) < 0 so that the latter term
does not figure (same reasoning for = I( )), and obtain
∆( ) = ⊗ 1 + 1 ⊗ J (Ξ),
∆( ) = ⊗ 1 + 1 ⊗ J ( ).
Inspecting the above reveals that we need 1 and then the following 7 “positive”
symbols (also viewable as trees) in T + ,
J ′( ), J ′ ( ), J (Ξ), J ( ), X1 , J ( ), J ( ), (15.38)
of resp. homogeneities 1/2^−, 1/2^−, 1/2^−, 1^−, 1, 3/2^−, 3/2^−. On the other hand, T + was
introduced abstractly as the free commutative algebra generated by all of the above
symbols (with unit element 1). Even upon truncation, say T + = T +_{<3/2} with abusive
notation, this leaves us with 10 + 4 + 1 = 15 generating symbols,
J ′( ), J ′ ( ), . . . , J ′ ( ); J (Ξ), . . . , J ( ); X1 (15.39)
(of which only 7 are needed). Of course, T + also contains (free) products such as
J ′ ( )J ′ ( ), X1 J ′ ( ), J ′ ( )J ( , ) (all of degree < 3/2), however by working
in T these did not appear as “right-hand side”-image of ∆ above.
Consider now a character of the algebra T + ; that is, an element g ∈ (T + )∗ ,
so that g(1) = 1 and g(σσ̄) = g(σ)g(σ̄). (Actually, in view of the truncation we
impose this only for σ, σ̄ with deg(σσ̄) = deg σ + deg σ̄ < 3/2.) Such g is obviously
determined by its value on each of the 15 basis symbols listed in (15.39). Now T +
can be given a Hopf structure, with coproduct ∆+ and antipode, so that the set of
characters forms the group G+ , with product given by
(f ◦ g)(σ) = (f ⊗ g)∆+ σ = ∑_{(σ)} ⟨f, σ′⟩⟨g, σ′′⟩ ;
inverses are given in terms of the antipode. One thus sees that G+ is a 15-dimensional
(Lie) group. However, only a 7-dimensional subgroup is needed, for we only care
c1 = g(J′( )) , . . . , c7 = g(J ( )).
Ξ 1 X1
Ξ 1
1
1
2C0 1
1
1
C0 1
2C0 1
1 C1 C2 C3 C0 1
1
2C0 1
1
1
X1
1
1
2C0 1
Exercise 15.3 Show that the two procedures for recovering Π from the knowledge
of (Π, Γ ) outlined in Remark 15.13 and on page 301 are equivalent.
15.8 Comments
The original proof [Hai13] of well-posedness of the KPZ equation without using the
Cole–Hopf transform did not use regularity structures but instead viewed the solution
at any fixed time as a spatial rough path controlled by the solution to the linearised
equation, in the spirit of Section 12.3. An alternative approach using paracontrolled
distributions as developed in [GIP15] was used in [GP17] to obtain a number of
additional properties of the solutions, including a clean variational formulation.
Given that the KPZ equation is expected to enjoy a form of “universality”, a very
natural question is that of showing that “most” classes of interface fluctuation models
converge to it in the weakly asymmetric regime. The first result in this direction was
obtained by Bertini–Giacomin [BG97], but this relied crucially on a microscopic
version of the Hopf–Cole transform to show that the transformed particle system
converges to the multiplicative stochastic heat equation. A first more general result
was obtained by Gonçalves–Jara [GJ14] who showed that the large scale fluctuations
of a large number of particle systems solve the KPZ equation in a relatively weak
sense. It has been an open problem for quite some time now whether such a weak
notion of solution characterises solutions to the KPZ equation uniquely. Major progress in
this direction was obtained by Gubinelli–Perkowski [GP18] who showed that this
is indeed the case at stationarity under an additional structural assumption on the
generator of the particle system that can be verified for a number of systems of
interest.
On the other hand, a large class of interface fluctuation models that fall outside of
this approach is given by solutions to an equation of the type
\[ \partial_t h_\varepsilon = \partial_x^2 h_\varepsilon + \sqrt{\varepsilon}\, F(\partial_x h_\varepsilon) + \eta(t,x) , \tag{15.40} \]
where $\eta$ is a (smooth) space-time random field with sufficiently good mixing
properties, $F : \mathbb{R} \to \mathbb{R}$ is an even function growing at infinity, and $\varepsilon > 0$ is a parameter
controlling the asymmetry of the problem. Under rather weak assumptions on $\eta$ and $F$
one then expects to be able to find constants $C_\varepsilon$ such that $\varepsilon^{1/2} h_\varepsilon(\varepsilon^{-2} t, \varepsilon^{-1} x) - C_\varepsilon t$
converges to solutions to the KPZ equation. This was shown to be indeed the case in
various special cases of increasing generality in [HS17, HQ18, HX19, FG19]. (The
last reference treats a different class of models but its proofs could be adapted to the
setting of (15.40).)
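A heuristic scaling computation may help explain the form of this rescaling. It is only a sketch under assumptions not made in the text: take the model case $F(u) = u^2$, leave the exponent $a$ free, and set $\tilde h_\varepsilon(t,x) = \varepsilon^{a} h_\varepsilon(\varepsilon^{-2}t, \varepsilon^{-1}x)$ (the symbols $a$ and $\tilde h_\varepsilon$ are introduced here purely for illustration). Applying the chain rule to (15.40) gives:

```latex
% Sketch only: F(u) = u^2 and the free exponent a are illustrative assumptions.
% On the right-hand sides, h_\varepsilon and its derivatives are evaluated
% at the rescaled point (\varepsilon^{-2} t, \varepsilon^{-1} x).
\begin{align*}
\tilde h_\varepsilon(t,x)
  &= \varepsilon^{a}\, h_\varepsilon(\varepsilon^{-2}t, \varepsilon^{-1}x), \\
\partial_t \tilde h_\varepsilon
  &= \varepsilon^{a-2}\, \partial_t h_\varepsilon, \qquad
\partial_x \tilde h_\varepsilon
   = \varepsilon^{a-1}\, \partial_x h_\varepsilon, \qquad
\partial_x^2 \tilde h_\varepsilon
   = \varepsilon^{a-2}\, \partial_x^2 h_\varepsilon, \\
\partial_t \tilde h_\varepsilon
  &= \partial_x^2 \tilde h_\varepsilon
   + \varepsilon^{a-\frac32}\, (\partial_x h_\varepsilon)^2
   + \varepsilon^{a-2}\, \eta(\varepsilon^{-2}t, \varepsilon^{-1}x) \\
  &= \partial_x^2 \tilde h_\varepsilon
   + \varepsilon^{\frac12 - a}\, (\partial_x \tilde h_\varepsilon)^2
   + \varepsilon^{a-2}\, \eta(\varepsilon^{-2}t, \varepsilon^{-1}x).
\end{align*}
```

For $a = 1/2$ the nonlinearity has coefficient one and the forcing becomes $\varepsilon^{-3/2}\eta(\varepsilon^{-2}t, \varepsilon^{-1}x)$, which is the parabolic rescaling under which a smooth, sufficiently mixing field formally approximates a multiple of space-time white noise in one space dimension. The counterterm $C_\varepsilon t$ then plays the role of absorbing the diverging mean of the quadratic term.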
There is a natural generalisation of the KPZ equation going in a completely
different direction. Indeed, given a Riemannian manifold (M, g) (where g denotes
the metric tensor), we can ask ourselves what the natural “stochastic heat equation
with values in M” looks like. A moment’s thought suggests that it should be given,
in local coordinates, by an equation of the form
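\[ \partial_t u^\alpha = \partial_x^2 u^\alpha + \Gamma^\alpha_{\beta\gamma}(u)\, \partial_x u^\beta\, \partial_x u^\gamma + \sigma_i^\alpha(u)\, \xi_i , \tag{15.41} \]
where the $\xi_i$ are i.i.d. space-time white noises, $\Gamma^\alpha_{\beta\gamma}$ are the Christoffel symbols for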
M, the σi are any finite collection of vector fields such that
[AC17] A. Ananova and R. Cont. Pathwise integration with respect to paths of finite
quadratic variation. J. Math. Pures Appl. (9) 107, no. 6, (2017), 737–757. doi:
10.1016/j.matpur.2016.10.004.
[AC19] A. L. Allan and S. N. Cohen. Pathwise stochastic control with applications
to robust filtering. arXiv e-prints (2019), 1–42. Ann. Appl. Probab., to appear.
arXiv:1902.05434.
[AD99] L. Andersson and B. K. Driver. Finite-dimensional approximations to Wiener
measure and path integral formulas on manifolds. J. Funct. Anal. 165, no. 2, (1999),
430–498. doi:10.1006/jfan.1999.3413.
[AFS19] C. Améndola, P. Friz, and B. Sturmfels. Varieties of signature tensors. Forum
Math. Sigma 7, (2019), e10, 54. doi:10.1017/fms.2019.3.
[Aid07] S. Aida. Semi-classical limit of the bottom of spectrum of a Schrödinger operator on
a path space over a compact Riemannian manifold. J. Funct. Anal. 251, no. 1, (2007),
59–121. doi:10.1016/j.jfa.2007.06.009.
[Aid15] S. Aida. Reflected rough differential equations. Stochastic Processes Appl. 125,
no. 9, (2015), 3570–3595. doi:10.1016/j.spa.2015.03.008.
[Alm66] F. J. Almgren, Jr. Plateau’s problem: An invitation to varifold geometry. W. A.
Benjamin, Inc., New York-Amsterdam, 1966, xii+74.
[App09] D. Applebaum. Lévy Processes and Stochastic Calculus. Cambridge Studies in
Advanced Mathematics. Cambridge University Press, 2 ed., 2009. doi:10.1017/
CBO9780511809781.
[AR91] S. Albeverio and M. Röckner. Stochastic differential equations in infinite
dimensions: solutions via Dirichlet forms. Probab. Theory Related Fields 89, no. 3,
(1991), 347–386. doi:10.1007/BF01198791.
[BA88] G. Ben Arous. Méthodes de Laplace et de la phase stationnaire sur
l’espace de Wiener. Stochastics 25, no. 3, (1988), 125–153. doi:10.1080/
17442508808833536.
[BA89] G. Ben Arous. Flots et séries de Taylor stochastiques. Probab. Theory Related
Fields 81, no. 1, (1989), 29–77. doi:10.1007/BF00343737.
[Bai14] I. Bailleul. Flows driven by Banach space-valued rough paths. In Séminaire de
Probabilités XLVI, vol. 2123 of Lecture Notes in Math., 195–205. Springer, Cham,
2014. doi:10.1007/978-3-319-11970-0_7.
[Bai15a] I. Bailleul. Flows driven by rough paths. Revista Matemática Iberoamericana 31,
no. 3, (2015), 901–934. doi:10.4171/rmi/858.
[Bai15b] I. Bailleul. Regularity of the Itô-Lyons map. Confluentes Math. 7, no. 1, (2015),
3–11. doi:10.5802/cml.15.
[Bai19] I. Bailleul. Rough integrators on Banach manifolds. Bull. Sci. Math. 151, (2019),
51–65. doi:10.1016/j.bulsci.2018.12.001.
324 References
[Bal00] E. J. Balder. Lectures on Young measure theory and its applications in economics.
Rend. Istit. Mat. Univ. Trieste 31, no. suppl. 1, (2000), 1–69. Workshop on Measure
Theory and Real Analysis (Italian) (Grado, 1997).
[Bau04] F. Baudoin. An introduction to the geometry of stochastic flows. Imperial College
Press, London, 2004, x+140. doi:10.1142/9781860947261.
[BB19] I. Bailleul and F. Bernicot. High order paracontrolled calculus. Forum Math.
Sigma 7, (2019), e44, 94. doi:10.1017/fms.2019.44.
[BBR+18] C. Bayer, D. Belomestny, M. Redmann, S. Riedel, and J. Schoenmakers.
Solving linear parabolic rough partial differential equations. arXiv e-prints (2018),
1–36. arXiv:1803.09488.
[BC17] I. Bailleul and R. Catellier. Rough flows and homogenization in stochastic
turbulence. J. Differential Equations 263, no. 8, (2017), 4894–4928. doi:10.1016/
j.jde.2017.06.006.
[BC19] H. Boedihardjo and I. Chevyrev. An isomorphism between branched and
geometric rough paths. Ann. Inst. Henri Poincaré Probab. Stat. 55, no. 2, (2019),
1131–1148. doi:10.1214/18-aihp912.
[BCCH17] Y. Bruned, A. Chandra, I. Chevyrev, and M. Hairer. Renormalising SPDEs
in regularity structures. arXiv e-prints (2017), 1–85. J. Eur. Math. Soc., to appear.
arXiv:1711.10239.
[BCD11] H. Bahouri, J.-Y. Chemin, and R. Danchin. Fourier analysis and nonlinear
partial differential equations, vol. 343 of Grundlehren der Mathematischen
Wissenschaften. Springer, Heidelberg, 2011, xvi+523. doi:10.1007/
978-3-642-16830-7.
[BCD19] I. Bailleul, R. Catellier, and F. Delarue. Propagation of chaos for mean field
rough differential equations. arXiv e-prints (2019), 1–61. arXiv:1907.00578.
[BCD20] I. Bailleul, R. Catellier, and F. Delarue. Solving mean field rough differential
equations. Electron. J. Probab. 25, (2020), 51 pp. doi:10.1214/19-EJP409.
[BCEF20] Y. Bruned, C. Curry, and K. Ebrahimi-Fard. Quasi-shuffle algebras and
renormalisation of rough differential equations. Bull. Lond. Math. Soc. 52, no. 1,
(2020), 43–63. doi:10.1112/blms.12305.
[BCF18] Y. Bruned, I. Chevyrev, and P. K. Friz. Examples of renormalized SDEs.
In Stochastic partial differential equations and related fields, in Honor of Michael
Röckner, Bielefeld 2016, vol. 229 of Springer Proc. Math. Stat., 303–317. Springer,
Cham, 2018. doi:10.1007/978-3-319-74929-7_19.
[BCFP19] Y. Bruned, I. Chevyrev, P. K. Friz, and R. Preiss. A rough path perspective
on renormalization. J. Funct. Anal. 277, no. 11, (2019), 108283, 60. doi:10.1016/
j.jfa.2019.108283.
[BD15] I. Bailleul and J. Diehl. The inverse problem for rough controlled differential
equations. SIAM J. Control Optim. 53, no. 5, (2015), 2762–2780. doi:10.1137/
140995982.
[BDFT20] C. Bellingeri, A. Djurdjevac, P. K. Friz, and N. Tapia. Transport and
continuity equations with (very) rough noise. arXiv e-prints (2020), 1–20. arXiv:
2002.10432.
[Bel20] C. Bellingeri. An Itô type formula for the additive stochastic heat equation.
Electron. J. Probab. 25, (2020), 52 pp. doi:10.1214/19-EJP404.
[BF13] C. Bayer and P. K. Friz. Cubature on Wiener space: pathwise convergence. Appl.
Math. Optim. 67, no. 2, (2013), 261–278. doi:10.1007/s00245-012-9187-8.
[BFG+19] C. Bayer, P. K. Friz, P. Gassiat, J. Martin, and B. Stemper. A regularity
structure for rough volatility. Math. Financ. (2019), 1–51. doi:10.1111/mafi.
12233.
[BFG20] C. Bellingeri, P. K. Friz, and M. Gerencsér. Singular paths spaces and
applications. arXiv e-prints (2020), 1–15. arXiv:2003.03352.
[BFH09] E. Breuillard, P. Friz, and M. Huesmann. From random walks to rough
paths. Proc. Amer. Math. Soc. 137, no. 10, (2009), 3487–3496. doi:10.1090/
S0002-9939-09-09930-4.
[CF09] M. Caruana and P. Friz. Partial differential equations driven by rough paths. J.
Differential Equations 247, no. 1, (2009), 140–173. doi:10.1016/j.jde.2009.01.
026.
[CF10] T. Cass and P. Friz. Densities for rough differential equations under Hörmander’s
condition. Ann. of Math. (2) 171, no. 3, (2010), 2115–2141. doi:10.4007/annals.
2010.171.2115.
[CF18] K. Chouk and P. K. Friz. Support theorem for a singular SPDE: the case of
gPAM. Ann. Inst. Henri Poincaré Probab. Stat. 54, no. 1, (2018), 202–219. doi:
10.1214/16-AIHP800.
[CF19] I. Chevyrev and P. K. Friz. Canonical RDEs and general semimartingales as rough
paths. Ann. Probab. 47, no. 1, (2019), 420–463. doi:10.1214/18-AOP1264.
[CFG17] G. Cannizzaro, P. K. Friz, and P. Gassiat. Malliavin calculus for regularity
structures: the case of gPAM. J. Funct. Anal. 272, no. 1, (2017), 363–419. doi:
10.1016/j.jfa.2016.09.024.
[CFK+19a] I. Chevyrev, P. K. Friz, A. Korepanov, I. Melbourne, and H. Zhang.
Deterministic homogenization for discrete-time fast-slow systems under optimal
moment assumptions. arXiv e-prints (2019), 1–24. arXiv:1903.10418.
[CFK+19b] I. Chevyrev, P. K. Friz, A. Korepanov, I. Melbourne, and H. Zhang.
Multiscale systems, homogenization, and rough paths. In Probability and analysis
in interacting physical systems, In Honor of S.R.S. Varadhan, Berlin, August, 2016,
vol. 283 of Springer Proc. Math. Stat., 17–48. Springer, Cham, 2019. doi:10.1007/
978-3-030-15338-0.
[CFKM19] I. Chevyrev, P. K. Friz, A. Korepanov, and I. Melbourne. Superdiffusive
limits for deterministic fast-slow dynamical systems. arXiv e-prints (2019), 1–35.
arXiv:1907.04825.
[CFO11] M. Caruana, P. K. Friz, and H. Oberhauser. A (rough) pathwise approach to a
class of non-linear stochastic partial differential equations. Ann. Inst. H. Poincaré Anal.
Non Linéaire 28, no. 1, (2011), 27–46. doi:10.1016/j.anihpc.2010.11.002.
[CFV07] L. Coutin, P. Friz, and N. Victoir. Good rough path sequences and applications
to anticipating stochastic calculus. Ann. Probab. 35, no. 3, (2007), 1172–1193.
doi:10.1214/009117906000000827.
[CFV09] T. Cass, P. Friz, and N. Victoir. Non-degeneracy of Wiener functionals arising
from rough differential equations. Trans. Amer. Math. Soc. 361, no. 6, (2009), 3359–
3371. doi:10.1090/S0002-9947-09-04677-7.
[CH16] A. Chandra and M. Hairer. An analytic BPHZ theorem for regularity structures.
arXiv e-prints (2016), 1–129. arXiv:1612.08138.
[Che54] K.-T. Chen. Iterated integrals and exponential homomorphisms. Proc. London Math.
Soc. (3) 4, (1954), 502–512. doi:10.1112/plms/s3-4.1.502.
[Che57] K.-T. Chen. Integration of paths, geometric invariants and a generalized
Baker-Hausdorff formula. Ann. of Math. (2) 65, no. 1, (1957), 163–178.
doi:10.2307/1969671.
[Che58] K.-T. Chen. Integration of paths—a faithful representation of paths by
non-commutative formal power series. Trans. Amer. Math. Soc. 89, (1958), 395–407.
doi:10.2307/1993193.
[Che71] K.-T. Chen. Algebras of iterated path integrals and fundamental groups. Trans.
Amer. Math. Soc. 156, (1971), 359–379. doi:10.2307/1995617.
[Che72] K. Cheng. Quantization of a general dynamical system by Feynman’s path
integration formulation. J. Math. Phys. 13, no. 11, (1972), 1723–1726. doi:
10.1063/1.1665897.
[Che18] I. Chevyrev. Random walks and Lévy processes as rough paths. Probab. Theory
Related Fields 170, no. 3-4, (2018), 891–932. doi:10.1007/s00440-017-0781-1.
[CHLT15] T. Cass, M. Hairer, C. Litterer, and S. Tindel. Smoothness of the density for
solutions to Gaussian rough differential equations. Ann. Probab. 43, no. 1, (2015),
188–239. doi:10.1214/13-AOP896.
[Cho39] W.-L. Chow. Über Systeme von linearen partiellen Differentialgleichungen erster
Ordnung. Math. Ann. 117, (1939), 98–105. doi:10.1007/BF01450011.
[CIL92] M. G. Crandall, H. Ishii, and P.-L. Lions. User’s guide to viscosity solutions of
second order partial differential equations. Bull. Amer. Math. Soc. (N.S.) 27, no. 1,
(1992), 1–67. doi:10.1090/s0273-0979-1992-00266-5.
[CK00] A. Connes and D. Kreimer. Renormalization in quantum field theory and
the Riemann-Hilbert problem. I. The Hopf algebra structure of graphs and the
main theorem. Comm. Math. Phys. 210, no. 1, (2000), 249–273. doi:10.1007/
s002200050779.
[CK16] I. Chevyrev and A. Kormilitzin. A primer on the signature method in machine
learning. arXiv e-prints (2016), 1–45. arXiv:1603.03788.
[CL05] L. Coutin and A. Lejay. Semi-martingales and rough paths theory. Electron. J.
Probab. 10, (2005), no. 23, 761–785. doi:10.1214/EJP.v10-162.
[CL14] L. Coutin and A. Lejay. Perturbed linear rough differential equations. Ann. Math.
Blaise Pascal 21, no. 1, (2014), 103–150. doi:10.5802/ambp.338.
[CL15] T. Cass and T. Lyons. Evolving communities with individual preferences. Proc.
Lond. Math. Soc. (3) 110, no. 1, (2015), 83–107. doi:10.1112/plms/pdu040.
[CL16] I. Chevyrev and T. Lyons. Characteristic functions of measures on geometric
rough paths. Ann. Probab. 44, no. 6, (2016), 4049–4082. doi:10.1214/
15-AOP1068.
[CL18] L. Coutin and A. Lejay. Sensitivity of rough differential equations: an approach
through the omega lemma. J. Differential Equations 264, no. 6, (2018), 3899–3917.
doi:10.1016/j.jde.2017.11.031.
[Cla66] M. Clark. The representation of non-linear stochastic systems with applications to
filtering. Ph.D. thesis, Imperial College, 1966.
[CLL12] T. Cass, C. Litterer, and T. Lyons. Rough paths on manifolds. In New trends in
stochastic analysis and related topics, vol. 12 of Interdiscip. Math. Sci., 33–88. World
Sci. Publ., Hackensack, NJ, 2012. doi:10.1142/9789814360920_0002.
[CLL13] T. Cass, C. Litterer, and T. Lyons. Integrability and tail estimates for Gaussian
rough differential equations. Ann. Probab. 41, no. 4, (2013), 3026–3050. doi:
10.1214/12-AOP821.
[CN19] M. Coghi and T. Nilssen. Rough nonlocal diffusions. arXiv e-prints (2019), 1–54.
arXiv:1905.07270.
[CO17] T. Cass and M. Ogrodnik. Tail estimates for Markovian rough paths. Ann. Probab.
45, no. 4, (2017), 2477–2504. doi:10.1214/16-AOP1117.
[CO18] I. Chevyrev and M. Ogrodnik. A support and density theorem for Markovian
rough paths. Electron. J. Probab. 23, (2018), Paper No. 56, 16. doi:10.1214/
18-ejp184.
[Col51] J. D. Cole. On a quasi-linear parabolic equation occurring in aerodynamics. Quart.
Appl. Math. 9, (1951), 225–236. doi:10.1090/qam/42889.
[Com19] G. Comi. Semi-Linear Heat Equation and Singular Volterra Equation. Ph.D. thesis,
Università degli studi di Milano Bicocca, Università degli studi di Pavia, 2019.
[Cor12] I. Corwin. The Kardar-Parisi-Zhang equation and universality class.
Random Matrices Theory Appl. 1, no. 1, (2012), 1130001, 76. doi:10.1142/
S2010326311300014.
[CP19] R. Cont and N. Perkowski. Pathwise integration and change of variable formulas
for continuous paths with arbitrary regularity. Trans. Am. Math. Soc., Ser. B 6, (2019),
161–186. doi:10.1090/btran/34.
[CQ02] L. Coutin and Z. Qian. Stochastic analysis, rough path analysis and fractional
Brownian motions. Probab. Theory Related Fields 122, no. 1, (2002), 108–140.
doi:10.1007/s004400100158.
[CW16] T. Cass and M. P. Weidner. Tree algebras over topological vector spaces in rough
path theory. arXiv e-prints (2016), 1–25. arXiv:1604.07352.
[Dos77] H. Doss. Liens entre équations différentielles stochastiques et ordinaires. Ann. Inst.
H. Poincaré Sect. B (N.S.) 13, no. 2, (1977), 99–125.
[DPD03] G. Da Prato and A. Debussche. Strong solutions to the stochastic quantization
equations. Ann. Probab. 31, no. 4, (2003), 1900–1916. doi:10.1214/aop/
1068646370.
[DPZ92] G. Da Prato and J. Zabczyk. Stochastic equations in infinite dimensions, vol. 44
of Encyclopedia of Mathematics and its Applications. Cambridge University Press,
Cambridge, 1992, xviii+454. doi:10.1017/CBO9780511666223.
[DT09] A. Deya and S. Tindel. Rough Volterra equations. I. The algebraic integration
setting. Stoch. Dyn. 9, no. 3, (2009), 437–477. doi:10.1142/S0219493709002737.
[Faw04] T. Fawcett. Non-commutative harmonic analysis. Ph.D. thesis, University of
Oxford, 2004.
[FdLP06] D. Feyel and A. de La Pradelle. Curvilinear integrals along enriched paths.
Electron. J. Probab. 11, (2006), no. 34, 860–892. doi:10.1214/EJP.v11-356.
[FDM08] D. Feyel, A. de La Pradelle, and G. Mokobodzki. A non-commutative
sewing lemma. Electron. Commun. Probab. 13, (2008), 24–34. doi:10.1214/ECP.
v13-1345.
[FG16a] P. Friz and P. Gassiat. Geometric foundations of rough paths. In Geometry,
analysis and dynamics on sub-Riemannian manifolds. Vol. II, EMS Ser. Lect. Math.,
171–210. Eur. Math. Soc., Zürich, 2016. doi:10.4171/163-1/3.
[FG16b] P. K. Friz and B. Gess. Stochastic scalar conservation laws driven by rough
paths. Ann. Inst. H. Poincaré Anal. Non Linéaire 33, no. 4, (2016), 933–963. doi:
10.1016/j.anihpc.2015.01.009.
[FG19] M. Furlan and M. Gubinelli. Weak universality for a class of 3d stochastic
reaction-diffusion models. Probab. Theory Related Fields 173, no. 3-4, (2019),
1099–1164. doi:10.1007/s00440-018-0849-6.
[FGGR16] P. K. Friz, B. Gess, A. Gulisashvili, and S. Riedel. The Jain-Monrad criterion
for rough paths and applications to random Fourier series and non-Markovian
Hörmander theory. Ann. Probab. 44, no. 1, (2016), 684–738. arXiv:1307.3460.
doi:10.1214/14-AOP986.
[FGL15] P. Friz, P. Gassiat, and T. Lyons. Physical Brownian motion in a magnetic
field as a rough path. Trans. Amer. Math. Soc. 367, no. 11, (2015), 7939–7955.
arXiv:1302.2531. doi:10.1090/S0002-9947-2015-06272-2.
[FGLS17] P. K. Friz, P. Gassiat, P.-L. Lions, and P. E. Souganidis. Eikonal equations
and pathwise solutions to fully non-linear SPDEs. Stoch. Partial Differ. Equ.
Anal. Comput. 5, no. 2, (2017), 256–277. arXiv:1602.04746. doi:10.1007/
s40072-016-0087-9.
[FGP18] P. K. Friz, P. Gassiat, and P. Pigato. Precise asymptotics: robust stochastic
volatility models. arXiv e-prints (2018), 1–34. arXiv:1811.00267.
[FHL16] G. Flint, B. Hambly, and T. Lyons. Discretely sampled signals and the rough
Hoff process. Stochastic Process. Appl. 126, no. 9, (2016), 2593–2614. doi:10.
1016/j.spa.2016.02.011.
[FHL20] P. K. Friz, A. Hocquet, and K. Lê. Rough Markov diffusions and stochastic
differential equations, 2020. In preparation.
[FK20] P. K. Friz and T. Klose. Precise Laplace Asymptotics for Singular Stochastic
Partial Differential Equations: The case of the 2D generalised Parabolic Anderson
Model, 2020. In preparation.
[FLS06] P. Friz, T. Lyons, and D. Stroock. Lévy’s area under conditioning. Ann. Inst.
H. Poincaré Probab. Statist. 42, no. 1, (2006), 89–101. doi:10.1016/j.anihpb.
2005.02.003.
[FNC82] M. Fliess and D. Normand-Cyrot. Algèbres de Lie nilpotentes, formule de
Baker-Campbell-Hausdorff et intégrales itérées de K. T. Chen. In Seminar on Proba-
bility, XVI, vol. 920 of Lecture Notes in Math., 257–267. Springer, Berlin-New York,
1982.
[FO09] P. Friz and H. Oberhauser. Rough path limits of the Wong-Zakai type with a
modified drift term. J. Funct. Anal. 256, no. 10, (2009), 3236–3256. doi:10.1016/
j.jfa.2009.02.010.
[FO10] P. Friz and H. Oberhauser. A generalized Fernique theorem and applications.
Proc. Amer. Math. Soc. 138, (2010), 3679–3688. doi:10.1090/
S0002-9939-2010-10528-2.
[FO11] P. Friz and H. Oberhauser. On the splitting-up method for rough (partial)
differential equations. J. Differential Equations 251, no. 2, (2011), 316–338. doi:
10.1016/j.jde.2011.02.009.
[FO14] P. Friz and H. Oberhauser. Rough path stability of (semi-)linear SPDEs.
Probab. Theory Related Fields 158, no. 1-2, (2014), 401–434. doi:10.1007/
s00440-013-0483-2.
[Föl81] H. Föllmer. Calcul d’Itô sans probabilités. In Seminar on Probability, XV (Univ.
Strasbourg, Strasbourg, 1979/1980) (French), vol. 850 of Lecture Notes in Math.,
143–150. Springer, Berlin, 1981. doi:10.1007/bfb0088364.
[FP18] P. K. Friz and D. J. Prömel. Rough path metrics on a Besov-Nikolskii-type scale.
Trans. Amer. Math. Soc. 370, no. 12, (2018), 8521–8550. doi:10.1090/tran/7264.
[FR11] P. Friz and S. Riedel. Convergence rates for the full Brownian rough paths with
applications to limit theorems for stochastic flows. Bull. Sci. Math. 135, no. 6-7,
(2011), 613–628. doi:10.1016/j.bulsci.2011.07.006.
[FR13] P. Friz and S. Riedel. Integrability of (non-)linear rough differential equations and
integrals. Stoch. Anal. Appl. 31, no. 2, (2013), 336–358. doi:10.1080/07362994.
2013.759758.
[FR14] P. Friz and S. Riedel. Convergence rates for the full Gaussian rough paths. Ann.
Inst. Henri Poincaré Probab. Stat. 50, no. 1, (2014), 154–194. doi:10.1214/
12-AIHP507.
[Fri05] P. K. Friz. Continuity of the Itô-map for Hölder rough paths with applications to the
support theorem in Hölder norm. In Probability and partial differential equations in
modern applied mathematics, vol. 140 of IMA Vol. Math. Appl., 117–135. Springer,
New York, 2005. doi:10.1007/978-0-387-29371-4_8.
[FS06] W. H. Fleming and H. M. Soner. Controlled Markov processes and viscosity
solutions, vol. 25 of Stochastic Modelling and Applied Probability. Springer, New
York, second ed., 2006, xviii+429. doi:10.1007/0-387-31071-1.
[FS13] P. Friz and A. Shekhar. Doob-Meyer for rough paths. Bull. Inst. Math. Acad. Sin.
(N.S.) 8, no. 1, (2013), 73–84. arXiv:1205.2505.
[FS17] P. K. Friz and A. Shekhar. General rough integration, Lévy rough paths and a
Lévy-Kintchine-type formula. Ann. Probab. 45, no. 4, (2017), 2707–2765. arXiv:
1212.5888. doi:10.1214/16-AOP1123.
[FT17] P. K. Friz and H. Tran. On the regularity of SLE trace. Forum Math. Sigma 5,
(2017), e19, 17. doi:10.1017/fms.2017.18.
[FV05] P. Friz and N. Victoir. Approximations of the Brownian rough path with applications
to stochastic analysis. Ann. Inst. H. Poincaré Probab. Statist. 41, no. 4, (2005),
703–724. doi:10.1016/j.anihpb.2004.05.003.
[FV06a] P. Friz and N. Victoir. A note on the notion of geometric rough paths.
Probab. Theory Related Fields 136, no. 3, (2006), 395–416. doi:10.1007/
s00440-005-0487-7.
[FV06b] P. Friz and N. Victoir. A variation embedding theorem and applications. J. Funct.
Anal. 239, no. 2, (2006), 631–637. doi:10.1016/j.jfa.2005.12.021.
[FV07] P. Friz and N. Victoir. Large deviation principle for enhanced Gaussian processes.
Ann. Inst. H. Poincaré Probab. Statist. 43, no. 6, (2007), 775–785. doi:10.1016/
j.anihpb.2006.11.002.
[FV08a] P. Friz and N. Victoir. The Burkholder-Davis-Gundy inequality for enhanced
martingales. In Séminaire de probabilités XLI, vol. 1934 of Lecture Notes in Math.,
421–438. Springer, Berlin, 2008. doi:10.1007/978-3-540-77913-1_20.
[FV08b] P. Friz and N. Victoir. Euler estimates for rough differential equations.
J. Differential Equations 244, no. 2, (2008), 388–412. doi:10.1016/j.jde.2007.10.008.
[FV08c] P. Friz and N. Victoir. On uniformly subelliptic operators and stochastic area.
Probab. Theory Related Fields 142, no. 3-4, (2008), 475–523. doi:10.1007/
s00440-007-0113-y.
[FV10a] P. Friz and N. Victoir. Differential equations driven by Gaussian signals. Ann. Inst.
H. Poincaré Probab. Statist. 46, no. 2, (2010), 369–413. doi:10.1214/09-AIHP202.
[FV10b] P. Friz and N. Victoir. Multidimensional Stochastic Processes as Rough Paths, vol.
120 of Cambridge Studies in Advanced Mathematics. Cambridge University Press,
Cambridge, 2010, xiv+670. doi:10.1017/CBO9780511845079.
[FV11] P. Friz and N. Victoir. A note on higher dimensional p-variation. Electron. J.
Probab. 16, (2011), 1880–1899. doi:10.1214/EJP.v16-951.
[FZ18] P. K. Friz and H. Zhang. Differential equations driven by rough paths with jumps.
J. Differential Equations 264, no. 10, (2018), 6226–6301. doi:10.1016/j.jde.
2018.01.031.
[FZK20] P. K. Friz and P. Zorin-Kranich. Rough semimartingales and p-variation
estimates for martingale transforms, 2020. In preparation.
[Gas20] P. Gassiat. Non-uniqueness for reflected rough differential equations. arXiv e-prints
(2020), 1–25. arXiv:2001.11914.
[GGLS20] P. Gassiat, B. Gess, P.-L. Lions, and P. E. Souganidis. Speed of propagation
for Hamilton–Jacobi equations with multiplicative rough time dependence and convex
Hamiltonians. Probab. Theory Related Fields 176, no. 1, (2020), 421–448. doi:
10.1007/s00440-019-00921-5.
[GH19] A. Gerasimovics and M. Hairer. Hörmander’s theorem for semilinear SPDEs.
Electron. J. Probab. 24, (2019), 56 pp. doi:10.1214/19-EJP387.
[GHN19] A. Gerasimovics, A. Hocquet, and T. Nilssen. Non-autonomous rough
semilinear PDEs and the multiplicative sewing lemma. arXiv e-prints (2019), 1–48.
arXiv:1907.13398.
[GIP15] M. Gubinelli, P. Imkeller, and N. Perkowski. Paracontrolled distributions
and singular PDEs. Forum Math. Pi 3, (2015), e6, 75. doi:10.1017/fmp.2015.2.
[GIP16] M. Gubinelli, P. Imkeller, and N. Perkowski. A Fourier analytic approach
to pathwise stochastic integration. Electron. J. Probab. 21, (2016), Paper No. 2, 37.
doi:10.1214/16-EJP3868.
[GJ14] P. Gonçalves and M. Jara. Nonlinear fluctuations of weakly asymmetric
interacting particle systems. Arch. Ration. Mech. Anal. 212, no. 2, (2014), 597–644.
doi:10.1007/s00205-013-0693-x.
[GL09] M. Gubinelli and J. Lörinczi. Gibbs measures on Brownian currents. Comm.
Pure Appl. Math. 62, no. 1, (2009), 1–56. doi:10.1002/cpa.20260.
[GL20] P. Gassiat and C. Labbé. Existence of densities for the dynamic $\Phi^4_3$ model.
Ann. Inst. Henri Poincaré Probab. Stat. 56, no. 1, (2020), 326–373. doi:10.1214/
19-AIHP963.
[GLP99] G. Giacomin, J. L. Lebowitz, and E. Presutti. Deterministic and stochastic
hydrodynamic equations arising from simple microscopic model systems. In Stochas-
tic partial differential equations: six perspectives, vol. 64 of Math. Surveys Monogr.,
107–152. Amer. Math. Soc., Providence, RI, 1999. doi:10.1090/surv/064/03.
[GOT19] B. Gess, C. Ouyang, and S. Tindel. Density bounds for solutions to differential
equations driven by Gaussian rough paths. J. Theoret. Probab. (2019). doi:10.1007/
s10959-019-00967-0.
[GP15] M. Gubinelli and N. Perkowski. Lectures on singular stochastic PDEs, vol. 29 of
Ensaios Matemáticos [Mathematical Surveys]. Sociedade Brasileira de Matemática,
Rio de Janeiro, 2015, 89.
[GP17] M. Gubinelli and N. Perkowski. KPZ reloaded. Comm. Math. Phys. 349, no. 1,
(2017), 165–269. arXiv:1508.03877. doi:10.1007/s00220-016-2788-3.
[GP18] M. Gubinelli and N. Perkowski. Energy solutions of KPZ are unique. J. Amer.
Math. Soc. 31, no. 2, (2018), 427–471. doi:10.1090/jams/889.
[HW15] M. Hairer and H. Weber. Large deviations for white-noise driven, nonlinear
stochastic PDEs in two and three dimensions. Ann. Fac. Sci. Toulouse Math. (6) 24,
no. 1, (2015), 55–92. doi:10.5802/afst.1442.
[HX19] M. Hairer and W. Xu. Large scale limit of interface fluctuation models. Ann.
Probab. 47, no. 6, (2019), 3478–3550. doi:10.1214/18-aop1317.
[IK06] Y. Inahama and H. Kawabi. Large deviations for heat kernel measures on loop
spaces via rough paths. J. London Math. Soc. (2) 73, no. 3, (2006), 797–816. doi:
10.1112/S0024610706022654.
[IK07] Y. Inahama and H. Kawabi. Asymptotic expansions for the Laplace approximations
for Itô functionals of Brownian rough paths. J. Funct. Anal. 243, no. 1, (2007),
270–322. doi:10.1016/j.jfa.2006.09.016.
[IKN18] S. Ishiwata, H. Kawabi, and R. Namba. Central limit theorems for non-symmetric
random walks on nilpotent covering graphs: Part II. arXiv e-prints (2018), 1–41.
arXiv:1808.08856.
[IM85] A. Inoue and Y. Maeda. On integral transformations associated with a certain
Lagrangian—as a prototype of quantization. J. Math. Soc. Japan 37, no. 2, (1985),
219–244. doi:10.2969/jmsj/03720219.
[IN19] Y. Inahama and N. Naganuma. Asymptotic expansion of the density for hypoelliptic
rough differential equation. arXiv e-prints (2019), 1–33. arXiv:1902.05219.
[Ina06] Y. Inahama. Laplace’s method for the laws of heat processes on loop spaces. J.
Funct. Anal. 232, no. 1, (2006), 148–194. doi:10.1016/j.jfa.2005.06.006.
[Ina10] Y. Inahama. A stochastic Taylor-like expansion in the rough path theory. J. Theor.
Probab. 23, (2010), 671–714. doi:10.1007/s10959-010-0287-6.
[Ina13] Y. Inahama. Laplace approximation for rough differential equation driven by
fractional Brownian motion. Ann. Probab. 41, no. 1, (2013), 170–205. doi:10.
1214/11-AOP733.
[Ina14] Y. Inahama. Malliavin differentiability of solutions of rough differential equations.
J. Funct. Anal. 267, no. 5, (2014), 1566–1584. doi:10.1016/j.jfa.2014.06.011.
[Ina15] Y. Inahama. Large deviation principle of Freidlin-Wentzell type for pinned diffusion
processes. Trans. Amer. Math. Soc. 367, no. 11, (2015), 8107–8137. doi:10.1090/
S0002-9947-2015-06290-4.
[Ina16a] Y. Inahama. Large deviations for rough path lifts of Watanabe’s pullbacks of
delta functions. Int. Math. Res. Not. IMRN 2016, no. 20, (2016), 6378–6414. doi:
10.1093/imrn/rnv349.
[Ina16b] Y. Inahama. Short time kernel asymptotics for rough differential equation driven
by fractional Brownian motion. Electron. J. Probab. 21, (2016), Paper No. 34, 29.
doi:10.1214/16-EJP4144.
[INY78] N. Ikeda, S. Nakao, and Y. Yamato. A class of approximations of Brownian
motion. Publ. Res. Inst. Math. Sci. 13, no. 1, (1977/78), 285–300. doi:10.2977/
prims/1195190109.
[IT17] Y. Inahama and S. Taniguchi. Short time full asymptotic expansion of hypoelliptic
heat kernel at the cut locus. Forum Math. Sigma 5, (2017), e16, 74. doi:10.1017/
fms.2017.14.
[IW89] N. Ikeda and S. Watanabe. Stochastic differential equations and diffusion
processes. North-Holland Publishing Co., Amsterdam, second ed., 1989, xvi+555.
[JLM85] G. Jona-Lasinio and P. K. Mitter. On the stochastic quantization of field theory.
Comm. Math. Phys. 101, no. 3, (1985), 409–436. doi:10.1007/bf01216097.
[JM83] N. C. Jain and D. Monrad. Gaussian measures in $B_p$. Ann. Probab. 11, no. 1,
(1983), 46–57. doi:10.1214/aop/1176993659.
[Kal02] O. Kallenberg. Foundations of modern probability. Probability and its
Applications (New York). Springer-Verlag, New York, second ed., 2002, xx+638.
doi:10.1007/978-1-4757-4015-8.
[Kel16] D. Kelly. Rough path recursions and diffusion approximations. Ann. Appl. Probab.
26, no. 1, (2016), 425–461. doi:10.1214/15-aap1096.
[LS17] O. Lopusanschi and D. Simon. Area anomaly in the rough path Brownian scaling
limit of hidden Markov walks. arXiv e-prints (2017), 1–27. arXiv:1709.04288.
[LS18] O. Lopusanschi and D. Simon. Lévy area with a drift as a renormalization limit
of Markov chains on periodic graphs. Stochastic Process. Appl. 128, no. 7, (2018),
2404–2426. doi:10.1016/j.spa.2017.09.004.
[LV04] T. Lyons and N. Victoir. Cubature on Wiener space. Proc. R. Soc. Lond. Ser.
A Math. Phys. Eng. Sci. 460, no. 2041, (2004), 169–198. Stochastic analysis with
applications to mathematical finance. doi:10.1098/rspa.2003.1239.
[LV06] A. Lejay and N. Victoir. On (p, q)-rough paths. J. Differential Equations 225,
no. 1, (2006), 103–133. doi:10.1016/j.jde.2006.01.018.
[LV07] T. Lyons and N. Victoir. An extension theorem to rough paths. Ann. Inst. H.
Poincaré Anal. Non Linéaire 24, no. 5, (2007), 835–847. doi:10.1016/j.anihpc.
2006.07.004.
[LX13] T. J. Lyons and W. Xu. A uniform estimate for rough paths. Bull. Sci. Math. 137,
no. 7, (2013), 867–879. doi:10.1016/j.bulsci.2013.04.004.
[LX17] T. J. Lyons and W. Xu. Hyperbolic development and inversion of signature. J.
Funct. Anal. 272, no. 7, (2017), 2933–2955. doi:10.1016/j.jfa.2016.12.024.
[LX18] T. J. Lyons and W. Xu. Inverting the signature of a path. J. Eur. Math. Soc. (JEMS)
20, no. 7, (2018), 1655–1687. doi:10.4171/JEMS/796.
[LY02] F. Lin and X. Yang. Geometric measure theory—an introduction, vol. 1 of Advanced
Mathematics (Beijing/Boston). Science Press Beijing, Beijing, 2002, x+237.
[LY13] T. J. Lyons and D. Yang. The partial sum process of orthogonal expansions as
geometric rough process with Fourier series as an example—an improvement of
Menshov-Rademacher theorem. J. Funct. Anal. 265, no. 12, (2013), 3067–3103.
doi:10.1016/j.jfa.2013.08.032.
[LY15] T. J. Lyons and D. Yang. The theory of rough paths via one-forms and the extension
of an argument of Schwartz to rough differential equations. J. Math. Soc. Japan 67,
no. 4, (2015), 1681–1703. doi:10.2969/jmsj/06741681.
[LY16] T. Lyons and D. Yang. Recovering the pathwise Itô solution from averaged
Stratonovich solutions. Electron. Commun. Probab. 21, (2016), Paper No. 7, 18.
doi:10.1214/16-ECP3795.
[Lyo91] T. LYONS. On the nonexistence of path integrals. Proc. Roy. Soc. London Ser. A 432,
no. 1885, (1991), 281–290. doi:10.1098/rspa.1991.0017.
[Lyo94] T. LYONS. Differential equations driven by rough signals. I. An extension of an
inequality of L. C. Young. Math. Res. Lett. 1, no. 4, (1994), 451–464.
doi:10.4310/MRL.1994.v1.n4.a5.
[Lyo95] T. J. LYONS. The interpretation and solution of ordinary differential equations driven
by rough signals. In Stochastic analysis (Ithaca, NY, 1993), vol. 57 of Proc. Sympos.
Pure Math., 115–128. Amer. Math. Soc., Providence, RI, 1995.
doi:10.1090/pspum/057/1335466.
[Lyo98] T. J. LYONS. Differential equations driven by rough signals. Rev. Mat. Iberoamericana
14, no. 2, (1998), 215–310. doi:10.4171/RMI/240.
[Lyo14] T. LYONS. Rough paths, signatures and the modelling of functions on streams. In
Proceedings of the International Congress of Mathematicians—Seoul 2014. Vol. IV,
163–184. Kyung Moon Sa, Seoul, 2014. arXiv:1405.4537.
[LZ99] T. LYONS and O. ZEITOUNI. Conditional exponential moments for iterated
Wiener integrals. Ann. Probab. 27, no. 4, (1999), 1738–1749.
doi:10.1214/aop/1022677546.
[Mal78] P. MALLIAVIN. Stochastic calculus of variations and hypoelliptic operators.
Proc. Intern. Symp. SDE (1978), 195–263.
[Mal97] P. MALLIAVIN. Stochastic analysis, vol. 313 of Grundlehren der Mathematischen
Wissenschaften [Fundamental Principles of Mathematical Sciences]. Springer-Verlag,
Berlin, 1997, xii+343. doi:10.1007/978-3-642-15074-6.
[McK69] H. P. MCKEAN, JR. Stochastic integrals. Probability and Mathematical Statistics, No.
5. Academic Press, New York-London, 1969, xiii+140. doi:10.1090/chel/353.
[PW81] G. PARISI and Y. S. WU. Perturbation theory without gauge fixing. Sci. Sinica 24,
no. 4, (1981), 483–496. doi:10.1360/ya1981-24-4-483.
[Qua11] J. QUASTEL. Introduction to KPZ. Current Developments in Mathematics 2011,
(2011), 125–194. doi:10.4310/cdm.2011.v2011.n1.a3.
[Ras38] P. RASHEVSKII. About connecting two points of complete non-holonomic space by
admissible curve (in Russian). Uch. Zapiski ped. inst. Libknexta 2, (1938), 83–94.
[Ree58] R. REE. Lie elements and an algebra associated with shuffles. Ann. of Math. (2) 68,
(1958), 210–220. doi:10.2307/1970243.
[Rie17] S. RIEDEL. Transportation–cost inequalities for diffusions driven by Gaussian
processes. Electron. J. Probab. 22, (2017), 26 pp. doi:10.1214/17-EJP40.
[RS17] S. RIEDEL and M. SCHEUTZOW. Rough differential equations with unbounded drift
term. J. Differential Equations 262, no. 1, (2017), 283–312.
doi:10.1016/j.jde.2016.09.021.
[RX13] S. RIEDEL and W. XU. A simple proof of distance bounds for Gaussian rough paths.
Electron. J. Probab. 18, (2013), no. 108, 1–18. doi:10.1214/EJP.v18-2387.
[RY99] D. REVUZ and M. YOR. Continuous martingales and Brownian motion, vol. 293
of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of
Mathematical Sciences]. Springer-Verlag, Berlin, third ed., 1999, xiv+602.
doi:10.1007/978-3-662-06400-9.
[Rya02] R. A. RYAN. Introduction to tensor products of Banach spaces. Springer Monographs
in Mathematics. Springer-Verlag London Ltd., London, 2002, xiv+225.
doi:10.1007/978-1-4471-3903-4_1.
[Sch18] P. SCHÖNBAUER. Malliavin calculus and density for singular stochastic partial
differential equations. arXiv e-prints (2018), 1–63. arXiv:1809.03570.
[See18a] B. SEEGER. Approximation schemes for viscosity solutions of fully nonlinear
stochastic partial differential equations. arXiv e-prints (2018), 1–40.
arXiv:1802.04740.
[See18b] B. SEEGER. Perron’s method for pathwise viscosity solutions. Comm. Partial
Differential Equations 43, no. 6, (2018), 998–1018.
doi:10.1080/03605302.2018.1488262.
[Sim97] L. SIMON. Schauder estimates by scaling. Calc. Var. Partial Differential Equations 5,
no. 5, (1997), 391–407. doi:10.1007/s005260050072.
[Sip93] E.-M. SIPILÄINEN. A pathwise view of solutions of stochastic differential equations.
Ph.D. thesis, University of Edinburgh, 1993.
[Sou19] P. E. SOUGANIDIS. Pathwise solutions for fully nonlinear first- and second-order
partial differential equations with multiplicative rough time dependence. In
F. FLANDOLI, M. GUBINELLI, and M. HAIRER, eds., Singular Random Dynamics:
Cetraro, Italy 2016, 75–220. Springer International Publishing, Cham, 2019.
doi:10.1007/978-3-030-29545-5_3.
[ST78] V. N. SUDAKOV and B. S. TSIREL’SON. Extremal properties of half-spaces for
spherically invariant measures. J. Sov. Math. 9, no. 1, (1978), 9–18.
doi:10.1007/BF01086099.
[ST18] H. SINGH and J. TEICHMANN. An elementary proof of the reconstruction theorem.
arXiv e-prints (2018), 1–25. arXiv:1812.03082.
[Str11] D. W. STROOCK. Probability theory. Cambridge University Press, Cambridge,
second ed., 2011, xxii+527. An analytic view. doi:10.1017/cbo9780511974243.
[Sus78] H. J. SUSSMANN. On the gap between deterministic and stochastic ordinary
differential equations. Ann. Probability 6, no. 1, (1978), 19–41.
doi:10.1214/aop/1176995608.
[Sus91] H. J. SUSSMANN. Limits of the Wong-Zakai type with a modified drift term. In
Stochastic analysis, Proc. Conf. Honor Moshe Zakai 65th Birthday, Haifa/Isr.,
475–493. Academic Press, Boston, MA, 1991.
[SV72] D. W. STROOCK and S. R. S. VARADHAN. On the support of diffusion processes with
applications to the strong maximum principle. In Proceedings of the Sixth Berkeley