A Course On Rough Paths

This book provides an introduction to the theory of rough paths and regularity structures. It aims to make these topics more accessible to probabilists by focusing on driving signals that are Brownian motion, avoiding more complex algebraic concepts. It also covers recent developments in using these theories to solve stochastic partial differential equations.


Peter K. Friz and Martin Hairer

A Course on Rough Paths


With an introduction to regularity structures

March 2020
Last update to this version: March 25, 2022

Springer
To Waltraud and Rudolf Friz

and

To Xue-Mei
Preface to the Second Edition

It has been a joy seeing the subject of “rough analysis” flourish over the last few
years. As far as this book is concerned, this comes at the price of an increasingly
long list of (important) omissions. A systematic presentation of higher-level geometric
and then branched (possibly càdlàg) rough paths remains beyond the scope
of this book, despite being an excellent preparation for the algebraic thinking
later required for regularity structures. (The references [LCL07, FV10b, CF19]
and [Gub10, HK15, FZ18, BCFP19] partially make up for this.) Also absent remains
a systematic mathematical study of signatures. This topic, together with recent applications
to data science and machine learning, may well fill a book in its own right;
until then the reader may consult Lyons’ ICM article [Lyo14] and the survey [CK16].
The theory of regularity structures, a major extension of rough path theory, has,
since the appearance of the first edition of this book, grown into an essentially
complete solution theory for general singular, subcritical semilinear (and quasilinear)
stochastic partial differential equations. Despite this progress, our running example
of a singular SPDE remains the KPZ equation, originally solved with rough paths
[Hai13], later also with the Gubinelli–Imkeller–Perkowski theory of paracontrolled
distributions [GIP15, GP15, GP17], another topic that deserves a book in its own
right.
As far as the content of this second edition is concerned, we added many new
examples and updated notations throughout in order to bring it closer to current
practice in the literature. Our short incursion into low regularity (a.k.a. higher order)
rough paths in Section 2.4 has been expanded, and the recently obtained stochastic sewing
lemma is presented in Section 4.6. Section 9.4 shows how the Laplace method allows
one to elegantly obtain precise asymptotics in the large deviation principle, while
Section 12.1 contains a detailed discussion of rough transport equations. We also
expanded and updated large parts of Chapters 13-15 dealing with regularity structures.
In particular, we give a more modern and self-contained proof of the reconstruction
theorem (not relying on wavelet bases anymore), as well as a thorough discussion
of an application of regularity structures to a “rough” stochastic volatility model in
Section 14.5, and a detailed description of the KPZ structure and renormalisation
groups in Sections 15.3 and 15.5.


We also take the opportunity here to thank, in addition to those friends and
colleagues already named in the first edition, Yvain Bruned, Ajay Chandra, Ilya
Chevyrev, Rosa Preiß, and Lorenzo Zambotti for many interesting discussions over
the last few years. Of the many people who communicated to us lists of typos and
minor issues we especially thank Christian Litterer. We also thank Carlo Bellingeri,
Oleg Butkovsky, Andris Gerasimovics, Mate Gerencsér, Tom Klose, Khoa Lê, Mario
Maurelli and Nikolas Tapia for feedback on various aspects of the new content. The
first author also thanks ETH Zürich (FIM) for its hospitality during the finalisation
of this second edition.
Last but not least, we would like to acknowledge financial support: PKF is supported
by the European Research Council under the European Union’s Horizon 2020
research and innovation programme through Consolidator Grant 683164 (GPSART),
by DFG research unit FOR2402, and the Einstein Foundation Berlin through an
Einstein professorship. MH was supported by the European Research Council under
the European Union’s Seventh Framework Programme through Consolidator Grant
615897 (CRITICAL), by the Leverhulme trust through a leadership award, and by
the Royal Society through a research professorship.

Berlin and London, March 2020
Peter K. Friz
Martin Hairer
Preface to the First Edition

Since its original development in the mid-nineties by Terry Lyons, culminating in
the landmark paper [Lyo98], the theory of rough paths has grown into a mature and
widely applicable mathematical theory, and there are by now several monographs
dedicated to the subject, notably Lyons–Qian [LQ02], Lyons et al. [LCL07] and
Friz–Victoir [FV10b]. So why do we believe that there is room for yet another book
on this matter? Our reasons for writing this book are twofold.
First, the theory of rough paths has gathered the reputation of being difficult to
access for “mainstream” probabilists because it relies on some non-trivial algebraic
and / or geometric machinery. It is true that if one wishes to apply it to signals
of arbitrary roughness, the general theory relies on several objects (in particular
on the Hopf-algebraic properties of the free tensor algebra and the free nilpotent
group embedded in it) that are unfamiliar to most probabilists. However, in our
opinion, some of the most interesting applications of the theory arise in the context
of stochastic differential equations, where the driving signal is Brownian motion. In
this case, the theory simplifies dramatically and essentially no non-trivial algebraic
or geometric objects are required at all. This simplification is certainly not novel.
Indeed, early notes by Lyons, and then of Davie and Gubinelli, all took place in
this simpler setting (which allows one to incorporate Brownian motion and Lévy’s area).
However, it does appear to us that all these ideas can nowadays be put together in
unprecedented simplicity, and we made a conscious choice to restrict ourselves to
this simpler case throughout most of this book.
The second and main raison d’être of this book is that the scope of the theory
has expanded dramatically over the past few years and that, in this process, the
point of view has slightly shifted from the one exposed in the aforementioned
monographs. While Lyons’ theory was built on the integration of 1-forms, Gubinelli
gave a natural extension to the integration of so-called “controlled rough paths”. As a
benefit, differential equations driven by rough paths can now be solved by fixed point
arguments in linear Banach spaces which contain a sufficiently accurate (second
order) local description of the solution.
This shift in perspective has first enabled the use of rough paths to provide solution
theories for a number of classically ill-posed stochastic partial differential equations
with one-dimensional spatial variables, including equations of Burgers type and
the KPZ equation. More recently, the perspective which emphasises linear spaces
containing sufficiently accurate local descriptions modelled on some (rough) input,
spurred the development of the theory of “regularity structures” which allows one to
give consistent interpretations for a number of ill-posed equations, also in higher
dimensions. It can be viewed as an extension of the theory of controlled rough paths,
although its formulation is somewhat different. In the last chapters of this book, we
give a short and rather informal (i.e. very few proofs) introduction to that theory,
which in particular also sheds new light on some of the definitions of the theory of
rough paths.
This book does not have the ambition to provide an exhaustive description of the
theory of rough paths, but rather to complement the existing literature on the subject.
As a consequence, there are a number of aspects that we chose not to touch, or to do
so only barely. One omission is the study of rough paths of arbitrarily low regularity:
we do provide hints at the general theory at the end of several chapters, but these are
self-contained and can be skipped without impacting the understanding of the rest
of the book. Another serious omission concerns the systematic study of signatures,
that is the collection of all iterated integrals over a fixed interval associated to a
sufficiently regular path, providing an intriguing nonlinear characterisation.
We have used several parts of this book for lectures and mini-courses. In particular,
over the last years, the material on rough paths was given repeatedly by the first
author at TU Berlin (Chapters 1-12, in the form of a 4h/week, full semester lecture for
an audience of beginning graduate students in stochastics) and in some mini-courses
(Vienna, Columbia, Rennes, Toulouse; e.g. Chapters 1-5 with a selection of further
topics). The material of Chapters 13-15 originates in a number of minicourses by
the second author (Bonn, ETHZ, Toulouse, Columbia, XVII Brazilian School of
Probability, 44th St. Flour School of Probability, etc). The “KPZ and rough paths”
summer school in Rennes (2013) was a particularly good opportunity to try out much
of the material here in joint mini-course form – we are very grateful to the organisers
for their efforts. Chapters 13-15 are, arguably, a little harder to present in a classroom.
Jointly with Paul Gassiat, the first author gave this material as full lecture at TU
Berlin (with examples classes run by Joscha Diehl, and more background material
on Schwartz distributions, Hölder spaces and wavelet theory than what is found
in this book); we also started to use colours consistently on our handouts. We felt
the resulting improvement in readability was significant enough to try it out also
in the present book and take the opportunity to thank Jörg Sixt from Springer for
making this possible, aside from his professional assistance concerning all other
aspects of this book project. We are very grateful for all the feedback we received
from participants at all these courses. Furthermore, we would like to thank Bruce
Driver, Paul Gassiat, Massimiliano Gubinelli, Terry Lyons, Etienne Pardoux, Jeremy
Quastel and Hendrik Weber for many interesting discussions on how to present this
material. In addition, Khalil Chouk, Joscha Diehl and Sebastian Riedel kindly offered
to partially proofread the final manuscript.
Finally, we would like to acknowledge financial support: PKF was supported by
the European Research Council under the European Union’s Seventh Framework
Programme (FP7/2007-2013) / ERC grant agreement nr. 258237 and DFG, SPP 1324.
MH was supported by the Leverhulme trust through a leadership award and by the
Royal Society through a Wolfson research award.

Berlin and Coventry, June 2014
Peter K. Friz
Martin Hairer
Contents

1 Introduction
1.1 What is it all about?
1.2 Analogies with other branches of mathematics
1.3 Regularity structures
1.4 Frequently used notations
1.5 Rough path theory works in infinite dimensions

2 The space of rough paths
2.1 Basic definitions
2.2 The space of geometric rough paths
2.3 Rough paths as Lie group valued paths
2.4 Geometric rough paths of low regularity
2.5 Exercises
2.6 Comments

3 Brownian motion as a rough path
3.1 Kolmogorov criterion for rough paths
3.2 Itô Brownian motion
3.3 Stratonovich Brownian motion
3.4 Brownian motion in a magnetic field
3.5 Cubature on Wiener Space
3.6 Scaling limits of random walks
3.7 Exercises
3.8 Comments

4 Integration against rough paths
4.1 Introduction
4.2 Integration of one-forms
4.3 Integration of controlled rough paths
4.4 Stability I: rough integration
4.5 Controlled rough paths of lower regularity
4.6 Stochastic sewing
4.7 Exercises
4.8 Comments

5 Stochastic integration and Itô’s formula
5.1 Itô integration
5.2 Stratonovich integration
5.3 Itô’s formula and Föllmer
5.4 Backward integration
5.5 Exercises
5.6 Comments

6 Doob–Meyer type decomposition for rough paths
6.1 Motivation from stochastic analysis
6.2 Uniqueness of the Gubinelli derivative and Doob–Meyer
6.3 Brownian motion is truly rough
6.4 A deterministic Norris’ lemma
6.5 Brownian motion is Hölder rough
6.6 Exercises
6.7 Comments

7 Operations on controlled rough paths
7.1 Relation between rough paths and controlled rough paths
7.2 Lifting of regular paths
7.3 Composition with regular functions
7.4 Stability II: Regular functions of controlled rough paths
7.5 Itô’s formula revisited
7.6 Controlled rough paths of low regularity
7.7 Exercises
7.8 Comments

8 Solutions to rough differential equations
8.1 Introduction
8.2 Review of the Young case: a priori estimates
8.3 Review of the Young case: Picard iteration
8.4 Rough differential equations: a priori estimates
8.5 Rough differential equations
8.6 Stability III: Continuity of the Itô–Lyons map
8.7 Davie’s definition and numerical schemes
8.8 Lyons’ original definition
8.9 Linear rough differential equations
8.10 Stability IV: Flows
8.11 Exercises
8.12 Comments

9 Stochastic differential equations
9.1 Itô and Stratonovich equations
9.2 The Wong–Zakai theorem
9.3 Support theorem and large deviations
9.4 Laplace method
9.5 Exercises
9.6 Comments

10 Gaussian rough paths
10.1 A simple criterion for Hölder regularity
10.2 Stochastic integration and variation regularity of the covariance
10.3 Fractional Brownian motion and beyond
10.4 Exercises
10.5 Comments

11 Cameron–Martin regularity and applications
11.1 Complementary Young regularity
11.2 Concentration of measure
11.2.1 Borell’s inequality
11.2.2 Fernique theorem for Gaussian rough paths
11.2.3 Integrability of rough integrals and related topics
11.3 Malliavin calculus for rough differential equations
11.3.1 Bouleau–Hirsch criterion and Hörmander’s theorem
11.3.2 Calculus of variations for ODEs and RDEs
11.3.3 Hörmander’s theorem for Gaussian RDEs
11.4 Exercises
11.5 Comments

12 Stochastic partial differential equations
12.1 First order rough partial differential equations
12.1.1 Rough transport equation
12.1.2 Continuity equation and analytically weak formulation
12.2 Second order rough partial differential equations
12.2.1 Linear theory: Feynman–Kac
12.2.2 Mild solutions to semilinear RPDEs
12.2.3 Fully nonlinear equations with semilinear rough noise
12.2.4 Rough viscosity solutions
12.3 Stochastic heat equation as a rough path
12.3.1 The linear stochastic heat equation
12.4 Exercises
12.5 Comments

13 Introduction to regularity structures
13.1 Introduction
13.2 Definition of a regularity structure and first examples
13.2.1 The polynomial structure
13.2.2 The rough path structure
13.3 Definition of a model and first examples
13.3.1 The polynomial model
13.3.2 The rough path model
13.4 Proof of the reconstruction theorem
13.5 Exercises
13.6 Comments

14 Operations on modelled distributions
14.1 Differentiation
14.2 Products and composition by regular functions
14.3 Classical Schauder estimates
14.4 Multilevel Schauder estimates and admissible models
14.5 Rough volatility and robust Itô integration revisited
14.6 Exercises
14.7 Comments

15 Application to the KPZ equation
15.1 Formulation of the main result
15.2 Construction of the associated regularity structure
15.3 The structure group and positive renormalisation
15.4 Reconstruction for canonical lifts
15.5 Renormalisation of the KPZ equation
15.5.1 The renormalisation group
15.5.2 The renormalised equations
15.5.3 Convergence of the renormalised models
15.6 The KPZ equation and rough paths
15.7 Exercises
15.8 Comments

References

Index
Chapter 1
Introduction

We give a short overview of the scopes of both the theory of rough paths and the
theory of regularity structures. The main ideas are introduced and we point out some
analogies with other branches of mathematics.

1.1 What is it all about?

Differential equations are omnipresent in modern pure and applied mathematics;
many “pure” disciplines in fact originate in attempts to analyse differential equations
from various application areas. Classical ordinary differential equations (ODEs) are
of the form $\dot Y_t = f(Y_t, t)$; an important subclass is given by controlled ODEs of the
form
$$\dot Y_t = f_0(Y_t) + f(Y_t)\,\dot X_t , \tag{1.1}$$
where $X$ models the input (taking values in $\mathbf{R}^d$, say), and $Y$ is the output (in $\mathbf{R}^e$, say)
of some system modelled by nonlinear functions $f_0$ and $f$, and by the initial state
$Y_0$. The need for a non-smooth theory arises naturally when the system is subject to
white noise, which can be understood as the scaling limit as $h \to 0$ of the discrete
evolution equation

$$Y_{i+1} = Y_i + h f_0(Y_i) + \sqrt{h}\, f(Y_i)\, \xi_{i+1} , \tag{1.2}$$

where the $(\xi_i)$ are i.i.d. standard Gaussian random variables. Based on martingale
theory, Itô’s stochastic differential equations (SDEs) have provided a rigorous and
extremely useful mathematical framework for all this. And yet, stability is lost in the
passage to continuous time: while it is trivial to solve (1.2) for a fixed realisation of
$\xi_i(\omega)$, after all $(\xi_1, \ldots, \xi_T; Y_0) \mapsto Y_i$ is surely a continuous map, the continuity of
the solution as a function of the driving noise is lost in the limit.
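
To make the discrete evolution (1.2) concrete, here is a minimal Python sketch. It assumes scalar $Y$, takes the noise term with the factor $\sqrt{h}$ under which a white-noise scaling limit exists, and uses hypothetical coefficients `f0` and `f` chosen purely for illustration; the function name `simulate` is likewise an assumption, not notation from the text.

```python
import math
import random


def simulate(f0, f, y0, xi, h):
    """Iterate Y_{i+1} = Y_i + h*f0(Y_i) + sqrt(h)*f(Y_i)*xi_{i+1} for a fixed noise realisation."""
    ys = [y0]
    for x in xi:
        y = ys[-1]
        ys.append(y + h * f0(y) + math.sqrt(h) * f(y) * x)
    return ys


# Hypothetical coefficients, chosen only for illustration.
f0 = lambda y: -y
f = lambda y: math.cos(y)

rng = random.Random(0)
n = 1000
h = 1.0 / n
xi = [rng.gauss(0.0, 1.0) for _ in range(n)]  # i.i.d. standard Gaussians

path = simulate(f0, f, 1.0, xi, h)

# For a fixed number of steps the map (xi, Y_0) -> Y is continuous:
# a tiny perturbation of the noise moves the output only slightly.
xi_perturbed = [x + 1e-8 for x in xi]
path_perturbed = simulate(f0, f, 1.0, xi_perturbed, h)
gap = max(abs(a - b) for a, b in zip(path, path_perturbed))
print(len(path), gap)
```

The perturbation check only illustrates continuity for a fixed mesh size $h$; the point made in the text is that no such continuity survives the limit $h \to 0$.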
Taking $\dot X = \xi$ to be white noise in time (which amounts to saying that $X$ is a
Brownian motion, say $B$), the solution map $S : B \mapsto Y$ to (1.1), known as the Itô map,
is a measurable map which in general lacks continuity, whatever norm one uses to
equip the space of realisations of $B$.¹ Actually, one can show the following negative
result (see [Lyo91, LCL07] as well as Exercise 5.7 below):
Proposition 1.1. There exists no separable Banach space $\mathcal{B} \subset C([0, 1])$ with the
following properties:
1. Sample paths of Brownian motion lie in $\mathcal{B}$ almost surely.
2. The map $(f, g) \mapsto \int_0^\cdot f(t)\,\dot g(t)\,dt$ defined on smooth functions extends to a
continuous map from $\mathcal{B} \times \mathcal{B}$ into the space of continuous functions on $[0, 1]$.

Since, for any two distinct indices $i$ and $j$, the map
$$B \mapsto \int_0^\cdot B^i(t)\,\dot B^j(t)\,dt , \tag{1.3}$$

is itself the solution of one of the simplest possible differential equations driven by
$B$ (take $Y \in \mathbf{R}^2$ solving $\dot Y^1 = \dot B^i$ and $\dot Y^2 = Y^1 \dot B^j$), this shows that it takes very
little for $S$ to lack continuity. In this sense, solving SDEs is an analytically ill-posed
task! On the other hand, there are well-known probabilistic well-posedness results
for SDEs of the form²

$$dY_t = f_0(Y_t)\,dt + f(Y_t) \circ dB_t , \tag{1.4}$$


(see e.g. [INY78, Thm 4.1]), which imply for instance

Theorem 1.2. Let $\xi_\varepsilon = \delta_\varepsilon * \xi$ denote the regularisation of white noise in time with a
compactly supported smooth mollifier $\delta_\varepsilon$. Denote by $Y^\varepsilon$ the solutions to (1.1) driven
by $\dot X = \xi_\varepsilon$. Then $Y^\varepsilon$ converges in probability (uniformly on compact sets). The
limiting process does not depend on the choice of mollifier $\delta_\varepsilon$, and in fact is the
Stratonovich solution to (1.4).

There are many variations on such “Wong–Zakai” results, another popular choice
being $\xi_\varepsilon = \dot B^{(\varepsilon)}$ where $B^{(\varepsilon)}$ is a piecewise linear approximation (of mesh size
$\sim \varepsilon$) to Brownian motion. However, as a consequence of the aforementioned lack of
continuity of the Itô map, there are also reasonable approximations to white noise for
which the above convergence fails. (We shall see an explicit example in Section 3.4.)
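
A purely deterministic toy example, which is an illustration added here and not the example of Section 3.4, shows the mechanism behind such failures. The smooth paths $x_n(t) = n^{-1}(\cos(n^2 t), \sin(n^2 t))$ converge uniformly to zero, yet $\int_0^t x_n^1\,\dot x_n^2\,ds = t/2 + \sin(2n^2 t)/(4n^2)$ converges to $t/2$ rather than to $0$. The sketch below checks this numerically with a midpoint rule; the function name and the step count are assumptions made for the illustration.

```python
import math


def sup_norm_and_area(n, t_end=1.0, steps=20_000):
    """For x_n(t) = (cos(n^2 t), sin(n^2 t)) / n return
    (sup norm of x_n on [0, t_end], midpoint-rule value of int_0^t_end x^1 dx^2)."""
    dt = t_end / steps
    sup = 0.0
    integral = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt                     # midpoint of the k-th subinterval
        x1 = math.cos(n * n * t) / n
        x2 = math.sin(n * n * t) / n
        dx2 = n * math.cos(n * n * t)          # time derivative of x^2
        sup = max(sup, math.hypot(x1, x2))
        integral += x1 * dx2 * dt
    return sup, integral


results = {n: sup_norm_and_area(n) for n in (5, 10, 20)}
for n, (sup, area) in results.items():
    # The sup norm shrinks like 1/n while the iterated integral stays near 1/2.
    print(n, sup, area)
```

The paths become uniformly small while their iterated integrals do not, so the iterated integral cannot be a continuous function of the path in the uniform norm.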
Perhaps rather surprisingly, it turns out that well-posedness is restored via the
iterated integrals (1.3) which are in fact the only data that is missing to turn S into
a continuous map. The role of (1.3) was already appreciated in [INY78, Thm 4.1]
and related works in the seventies, but statements at the time were probabilistic
in nature, such as Theorem 1.2 above. Rough path analysis, introduced by Terry
Lyons in the seminal article [Lyo98] and by now exposed in several monographs
[LQ02, LCL07, FV10b], provides the following remarkable insight: Itô’s solution
map can be factorised into a measurable “universal” map $\Psi$ and a continuous solution
map $\hat S$ as
$$B(\omega) \overset{\Psi}{\longmapsto} (B, \mathbb{B})(\omega) \overset{\hat S}{\longmapsto} Y(\omega) . \tag{1.5}$$

¹ This lack of regularity is the raison d’être for Malliavin calculus, a Sobolev type theory of $C([0, T])$
equipped with Wiener measure, the law of Brownian motion.
² For the purpose of this introduction, all coefficients are assumed to be sufficiently nice.

The map $\Psi$ is universal in the sense that it depends neither on the initial condition, nor
on the vector fields driving the stochastic differential equation, but merely consists
of enhancing Brownian motion with iterated integrals of the form
$$\mathbb{B}^{i,j}(s, t) = \int_s^t \bigl(B^i(r) - B^i(s)\bigr)\,dB^j(r) . \tag{1.6}$$

At this stage, the choice of stochastic integration in (1.6) (e.g. Itô or Stratonovich)
does matter and probabilistic techniques are required for the construction of Ψ .
Indeed, the map Ψ is only measurable and usually requires the use of some sort
of stochastic integration theory (or some equivalent construction, see for example
Section 10 below for a general construction in a Gaussian, non-semimartingale
context).
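
The dependence on the choice of stochastic integral can be seen already for $i = j$, where the Itô integral in (1.6) equals $\tfrac12\bigl((B(t) - B(s))^2 - (t - s)\bigr)$ and the Stratonovich integral equals $\tfrac12 (B(t) - B(s))^2$. The following sketch, in which the function name and the discretisation are assumptions made for illustration, compares left-point and trapezoidal Riemann sums over one simulated path on an interval of length $1$.

```python
import random


def iterated_integrals(increments):
    """Left-point (Ito-type) and trapezoidal (Stratonovich-type) Riemann sums for
    int (B_r - B_s) dB_r, given the Brownian increments over a partition of [s, t]."""
    b = 0.0        # B_r - B_s at the left endpoint of the current subinterval
    ito = 0.0
    strat = 0.0
    for db in increments:
        ito += b * db                    # integrand evaluated at the left endpoint
        strat += (b + 0.5 * db) * db     # average of the two endpoint values
        b += db
    return b, ito, strat


rng = random.Random(1)
n = 100_000
dt = 1.0 / n                             # partition of an interval of length 1
increments = [rng.gauss(0.0, dt ** 0.5) for _ in range(n)]

total, ito, strat = iterated_integrals(increments)
print(strat - 0.5 * total ** 2)          # zero up to rounding: the trapezoidal sum telescopes
print(strat - ito)                       # half the sum of squared increments, close to 1/2
```

The trapezoidal sum telescopes exactly to half the squared total increment, whereas the left-point sum differs from it by half the discrete quadratic variation, which approaches $(t - s)/2$ as the mesh is refined.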
The map Ŝ, on the other hand, is the solution map to a rough differential
equation (RDE), also known as the Itô–Lyons map and discussed in Section 8.1. It is
purely deterministic and only makes use of analytical constructions. More precisely, it
allows input signals to be arbitrary rough paths which, as discussed in Chapter 2, are
objects (thought of as enhanced paths) of the form (X, 𝕏), defined via certain algebraic
properties (which mimic the interplay between a path and its iterated integrals) and
certain analytical, Hölder-type regularity conditions. In Chapter 3 these conditions
will be seen to hold true a.s. for (B, 𝔹); a typical realisation is thus called a Brownian
rough path.
The Itô–Lyons map turns out, cf. Section 8.6, to be “nice” in the sense that it is a
continuous map of both its initial condition and the driving noise (X, 𝕏), provided
that the dependency on the latter is measured in a suitable “rough path” metric. In
other words, rough path analysis allows for a pathwise solution theory for SDEs, i.e.
for a fixed realisation of the Brownian rough path. The solution map Ŝ is however
a much richer object than the original Itô map, since its construction is completely
independent of the choice of stochastic integral and even of the knowledge that the
driving path is Brownian. For example, if we denote by Ψ^I (resp. Ψ^S) the maps
B ↦ (B, 𝔹) obtained by Itô (resp. Stratonovich) integration, then we have the almost
sure identities
\[
S^I = \hat S \circ \Psi^I, \qquad S^S = \hat S \circ \Psi^S,
\]
where S^I (resp. S^S) denotes the solution to (1.4) interpreted in the Itô (resp.
Stratonovich) sense. Returning to Theorem 1.2, we see that the convergence there
is really a deterministic consequence of the probabilistic question whether or not
Ψ^S(B^ε) → Ψ^S(B) in probability and rough path topology, with Ḃ^ε = ξ^ε. This
can be shown to hold in the case of mollifier, piecewise linear, and many other
approximations.
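The difference between the Itô and Stratonovich solution maps can be made concrete on a toy example. The sketch below assumes the scalar linear equation dY = Y dB with Y_0 = 1 (our choice for illustration, not equation (1.4) itself) and compares, along one sampled Brownian path, the Euler–Maruyama scheme with the solution of the ordinary differential equation driven by the piecewise linear interpolation of B. The former approximates the Itô solution exp(B_T − T/2), the latter equals exp(B_T) at the final time, which is the Stratonovich solution. The function name `toy_solutions` and all parameters are hypothetical.

```python
import math
import random


def toy_solutions(n_steps, T=1.0, seed=1):
    """Scalar toy equation dY = Y dB, Y_0 = 1, along one sampled Brownian path.

    Returns (B_T, Euler-Maruyama value, piecewise-linear ODE value) at time T.
    """
    rng = random.Random(seed)
    sd = math.sqrt(T / n_steps)
    increments = [rng.gauss(0.0, sd) for _ in range(n_steps)]
    b_T = sum(increments)

    # Euler-Maruyama scheme: approximates the Ito solution exp(B_T - T/2).
    y_euler = 1.0
    for dB in increments:
        y_euler *= 1.0 + dB

    # ODE driven by the piecewise linear interpolation of B: on each linear
    # piece the solution is multiplied by exp(dB), so Y_T = exp(B_T) exactly.
    y_piecewise_linear = 1.0
    for dB in increments:
        y_piecewise_linear *= math.exp(dB)

    return b_T, y_euler, y_piecewise_linear


b_T, y_ito, y_strat = toy_solutions(20000)
print(math.log(y_ito) - b_T, math.log(y_strat) - b_T)
```

On a fine grid the first printed number is close to −T/2 and the second is zero up to rounding, in line with the statement that piecewise linear approximations select the Stratonovich interpretation.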
So how is this Itô–Lyons map Ŝ built? In order to solve (1.1), we need to be able
to make sense of the expression
\[
\int_0^t f(Y_s)\, dX_s, \tag{1.7}
\]

where Y is itself the as yet unknown solution. Here is where the usual pathwise
approach breaks down: as we have seen in Proposition 1.1 it is in general impossible,
even in the simplest cases, to find a Banach space of functions containing Brownian
sample paths and in which (1.7) makes sense. Actually, if we measure regularity in
terms of Hölder exponents, then (1.7) makes sense as a limit of Riemann sums for X
and Y that are arbitrary α-Hölder continuous functions if and only if α > 1/2. The key
word here is arbitrary: in our case the function Y is anything but arbitrary! Actually,
since the function Y solves (1.1), one would expect the small-scale fluctuations of Y
to look exactly like a scaled version of the small-scale fluctuations of X in the sense
that one would expect that

\[
Y_{s,t} = f(Y_s)\, X_{s,t} + R_{s,t}
\]

where, for any path F with values in a linear space, we set F_{s,t} = F_t − F_s, and
where R_{s,t} is some remainder that one would expect to be “of higher order” in the
sense that |R_{s,t}| ≲ |t − s|^β for some β > α. (We will see later that β = 2α is a
natural choice.)
Suppose now that X is a “rough path”, which is to say that it has been “enhanced”
with a two-parameter function 𝕏 which should be interpreted as postulating the
values for
\[
\mathbb{X}^{i,j}(s,t) = \int_s^t X^i_{s,r}\, dX^j_r. \tag{1.8}
\]
Note here that this identity should be read in the reverse order from what one may be
used to: it is the right-hand side that is defined by the left-hand side and not the other
way around! The idea here is that if X is too rough, then we do not a priori know
how to define the integral of X against itself, so we simply postulate its values. Of
course, 𝕏 cannot just be anything, but should satisfy a number of natural algebraic
identities and analytical bounds, which will be discussed in detail in Chapter 2.
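The main algebraic identity alluded to here is Chen's relation (2.1) of Chapter 2, 𝕏_{s,t} − 𝕏_{s,u} − 𝕏_{u,t} = X_{s,u} ⊗ X_{u,t}. As a sanity check, the Python sketch below builds the second-order term of a sampled path from left-point Riemann sums and verifies that this relation holds up to rounding errors. The names `second_order` and `chen_defect` and the test curve are hypothetical choices made for this illustration.

```python
import math


def second_order(path, s, t):
    """Left-point Riemann sum for the integral of (X_r - X_s) (x) dX_r
    over the grid indices s, ..., t."""
    d = len(path[0])
    out = [[0.0] * d for _ in range(d)]
    for r in range(s, t):
        for i in range(d):
            for j in range(d):
                out[i][j] += (path[r][i] - path[s][i]) * (path[r + 1][j] - path[r][j])
    return out


def chen_defect(path, s, u, t):
    """Largest entry of XX_{s,t} - XX_{s,u} - XX_{u,t} - X_{s,u} (x) X_{u,t}."""
    d = len(path[0])
    xx_st = second_order(path, s, t)
    xx_su = second_order(path, s, u)
    xx_ut = second_order(path, u, t)
    defect = 0.0
    for i in range(d):
        for j in range(d):
            tensor = (path[u][i] - path[s][i]) * (path[t][j] - path[u][j])
            defect = max(defect, abs(xx_st[i][j] - xx_su[i][j] - xx_ut[i][j] - tensor))
    return defect


# Hypothetical smooth test curve in R^3, sampled at 401 grid points.
path = [(math.cos(3 * k / 400), math.sin(2 * k / 400), k / 400) for k in range(401)]
print(chen_defect(path, 20, 150, 390))
```

For these discrete sums the relation is an exact algebraic identity, which is why it can reasonably be imposed as an axiom when 𝕏 is merely postulated.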
Anyway, assuming that we are provided with the data (X, 𝕏), then we know how
to give meaning to the integral of components of X against other components of X:
this is precisely what 𝕏 encodes. Intuitively, this suggests that if we similarly encode
the fact that Y “looks like X at small scales”, then one should be able to extend
the definition of (1.7) to a large enough class of integrands to include solutions to
(1.1), even when α < 1/2. One of the achievements of rough path theory is to make
this intuition precise. Indeed, in the framework of rough integration sketched here
and made precise in Chapter 4, the barrier α = 1/2 can be lowered to α = 1/3. In
principle, this can be lowered further by further enhancing X with iterated integrals
of higher order, but we decided to focus on the first non-trivial case for the sake of
simplicity and because it already covers the most important case when X is given
by a Brownian motion, or a stochastic process with properties similar to those of
Brownian motion. We do however indicate briefly in Sections 2.4, 4.5 and 7.6 how
the theory can be modified to cover the case α ≤ 1/3, at least in the “geometric” case
when X is a limit of smooth paths.
The simplest way for Y to “look like X” is when Y = G(X) for some sufficiently
regular function G. Despite what one might guess, it turns out that this particular
class of functions Y is already sufficiently rich so that knowing how to define
integrals of the form ∫_0^t G(X_s) dX_s for (non-gradient) functions G allows to give a
meaning to equations of the type (1.1), which is the approach originally developed in
[Lyo98]. A few years later, Gubinelli realised in [Gub04] that, in order to be able to
give a meaning to ∫_0^t Y_s dX_s given the data (X, 𝕏), it is sufficient that Y admits a
“derivative” Y′ such that
\[
Y_{s,t} = Y'_s X_{s,t} + R_{s,t},
\]
with a remainder satisfying R_{s,t} = O(|t − s|^{2α}). This extension of the original theory
turns out to be quite convenient, especially when applying it to problems other than
the resolution of evolution equations of the type (1.1).
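The role of the pair (Y, Y′) can be illustrated with compensated Riemann sums of the form Σ (Y_s X_{s,t} + Y′_s 𝕏_{s,t}), which is how the rough integral is built in Chapter 4. The sketch below is a minimal scalar example under explicit assumptions: we take Y = X, Y′ = 1 and the "geometric" choice 𝕏_{s,t} = X_{s,t}^2 / 2, and we sample X as a Brownian path. The function names are hypothetical. With these choices each compensated summand equals (X_t^2 − X_s^2)/2, so the sum telescopes to X_T^2/2 on every grid, whereas the plain left-point sum misses half the quadratic variation.

```python
import random


def sample_path(n_steps, T=1.0, seed=2):
    """Scalar Brownian path started at 0, on a uniform grid of [0, T]."""
    rng = random.Random(seed)
    sd = (T / n_steps) ** 0.5
    x = [0.0]
    for _ in range(n_steps):
        x.append(x[-1] + rng.gauss(0.0, sd))
    return x


def riemann_sum(x):
    """Plain left-point sum of Y_s X_{s,t} with Y = X."""
    return sum(a * (b - a) for a, b in zip(x, x[1:]))


def compensated_sum(x):
    """Sum of Y_s X_{s,t} + Y'_s XX_{s,t} with Y = X, Y' = 1 and the
    assumed second-order term XX_{s,t} = X_{s,t}^2 / 2."""
    return sum(a * (b - a) + 1.0 * 0.5 * (b - a) ** 2 for a, b in zip(x, x[1:]))


x = sample_path(20000)
print(riemann_sum(x), compensated_sum(x), 0.5 * x[-1] ** 2)
```

A different admissible choice of 𝕏 would change the value of the compensated sum, which reflects the fact that the integral depends on the enhanced path and not on X alone.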
An intriguing question is to what extent rough path theory, essentially a theory
of controlled ordinary differential equations, can be extended to partial differential
equations. In the case of finite-dimensional noise, and very loosely stated, one has
for instance a statement of the following type. (See [CF09, CFO11, FO14, GT10,
Tei11, DGT12] as well as Section 12.2 below.)

Theorem 1.3. Classes of SPDEs of the form du = F [u] dt + H[u] ◦ dB, with
second and first order differential operators F and H, respectively, and driven
by finite-dimensional noise, with the Zakai equation from filtering and stochastic
Hamilton–Jacobi–Bellman (HJB) equations as examples, can be solved pathwise, i.e.
for a fixed realisation of the Brownian rough path. As in the SDE case, the SPDE
solution map factorises as S^S = Ŝ ◦ Ψ^S where Ŝ, the solution map to a rough partial
differential equation (RPDE) is continuous in the rough path topology.

As a consequence, if ξ_ε = δ_ε ∗ ξ denotes the regularisation of white noise in
time with a compactly supported smooth mollifier δ_ε that is scaled by ε, and if u_ε
denotes the random PDE solutions driven by ξ_ε dt (instead of ◦dB) then u_ε converges
in probability. The limiting process does not depend on the choice of mollifier δ_ε,
and is viewed as Stratonovich SPDE solution. The same conclusion holds whenever
Ψ^S(B^ε) → Ψ^S(B) in probability and rough path topology.
The case of SPDEs driven by infinite-dimensional noise poses entirely different
problems. Already the stochastic heat equation in space dimension one has not
enough spatial regularity for additional nonlinearities of the type g(u)∂_x u (which
arises in applications from path sampling [Hai11b, HW13]) or (∂_x u)^2 (the Kardar–
Parisi–Zhang equation) to be well-defined. In space dimension one, “spatial” rough
paths indexed by x, rather than t, have proved useful here and the quest to handle
dimension larger than one led to the general theory of regularity structures, see
Section 1.3 below.
Rather than trying to survey all applications to date of rough paths to stochastics,
let us say that the past few years have seen an explosion of results made possible by
the use of rough paths theory. New stimulus to the field was given by its use in rather
diverse mathematical fields, including for example quantum field theory [GL09],
nonlinear PDEs [Gub12], Malliavin calculus [CFV09], non-Markovian Hörmander
and ergodic theory [CF10, HP13, CHLT15], and the multiscale analysis of chaotic
behaviour in fast-slow systems [KM16, KM17, CFK+ 19b].
In view of these developments, we believe that it is an opportune time to try to
summarise some of the main results of the theory in a way that is as elementary as
possible, yet sufficiently precise to provide a technical working knowledge of the
theory. We therefore include elementary but essentially complete proofs of several
of the main results, including the continuity and definition of the Itô–Lyons map,
the lifting of a class of Gaussian processes to the space of rough paths, etc. In
contrast to the available textbook literature [LQ02, LCL07, FV10b], we emphasise
Gubinelli’s view on rough integration [Gub04, Gub10] which allows to linearise
many considerations and to simplify the exposition. That said, the resulting theory
of rough differential equations is (immediately) seen to be equivalent to Davie’s
definition [Dav08] and, generally, we have tried to give a good idea what other
perspectives one can take on what amounts to essentially the same objects.

1.2 Analogies with other branches of mathematics

As we have just seen, the main idea of the theory of rough paths is to “enhance”
a path X with some additional data 𝕏, namely the integral of X against itself, in
order to restore continuity of the Itô map. The general idea of building a larger
object containing additional information in order to restore the continuity of some
nonlinear transformation is of course very old and there are several other theories
that have a similar “flavour” to the theory of rough paths, one of them being the
theory of Young measures (see for example the notes [Bal00]) where the value of
a function is replaced by a probability measure, thus allowing to describe limits of
highly oscillatory functions.
Nevertheless, when first confronted with some of the notions just outlined, the first
reaction of the reader might be that simply postulating the values for the right-hand
side of (1.8) makes no sense. Indeed, if X is smooth, then we “know” that there is
only one “reasonable” choice for the integral 𝕏 of X against itself, and this is the
Riemann integral. How could this be replaced by something else and how can one
expect to still get a consistent theory with a natural interpretation? These questions
will of course be fully answered in these notes.
For the moment, let us draw an analogy with a very well established branch of
geometric measure theory, namely the theory of varifolds [Alm66, LY02].
Varifolds arise as natural extensions of submanifolds in the context of certain
variational problems. We are not going into details here, but loosely speaking a
k-dimensional varifold in R^n is a (Radon) measure v on R^n × G(k, n), where
G(k, n) denotes the space of all k-dimensional subspaces of R^n. Here, one should
interpret G(k, n) as the space of all possible tangent spaces at any given point for
a k-dimensional submanifold of R^n. The projection of v onto R^n should then be
interpreted as a generalisation of the natural “surface measure” of a submanifold,
while the conditional (probability) measure on G(k, n) induced at almost every point
by disintegration should be interpreted as selecting a (possibly random) tangent
space at each point. Why is this a reasonable extension of the notion of submanifold?
Consider the following sequence M_ε of one-dimensional submanifolds of R^2:

[Figure: sketch of the curve M_ε, with the length scale ε marked, next to its limit M.]

It is intuitively clear that, as ε → 0, this converges to a circle, but the right half has
twice as much “weight” as the left half so that, if we were to describe the limit M
simply as a manifold, we would have lost some information about the convergence of
the surface measures in the process. More dramatically, there are situations where one
has a sequence of smooth manifolds such that the limit is again a smooth manifold,
but with a limiting “tangent space” which has nothing to do with the actual tangent
space of the limit! Indeed, consider the sequence of one-dimensional submanifolds
of R2 given by

[Figure: sketch of the sequence of curves, with the length scale ε^2 marked.]

This time, the limit is a piece of straight line, which is in principle a perfectly nice
smooth submanifold, but the limiting tangent space is deterministic and makes a 45°
angle with the canonical tangent space associated to the limit.
The situation here is philosophically very similar to that of the theory of rough
paths: a subset M ⊂ R^n may be sufficiently “rough” so that there is no way of
canonically associating to it either a k-dimensional Riemannian volume element,
or a k-dimensional tangent space, so we simply postulate them. The two examples
given above show that even in situations where M is a nice smooth manifold, it
still makes sense to associate to it a volume element and / or tangent space that are
different from the ones that one would construct canonically. A similar situation
arises in the theory of rough paths. Indeed, it may so happen that X is actually
given by a smooth function. Even so, this does not automatically mean that the
right-hand side of (1.8) is given by the usual Riemann integral of X against itself.
An explicit example illustrating this fact is given in Exercise 2.10 below. Similarly to
the examples of “non-canonical” varifolds given above, “non-canonical” rough paths
can also be constructed as limits of ordinary smooth paths (with the second-order
term 𝕏 defined by (1.8) where the integral is the usual Riemann integral), provided
that one takes limits in a suitably weak topology.
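A standard example of this phenomenon (which may or may not coincide with the one in Exercise 2.10) is given by the small, fast circles X^n_t = n^{-1}(cos(2πn^2 t), sin(2πn^2 t)) on [0, 1]. The paths converge uniformly to a constant, yet the area they sweep out, computed with ordinary Riemann integrals, stays equal to π. The Python sketch below checks this numerically; the function names and the number of sample points per turn are illustrative assumptions.

```python
import math


def circle_path(n, points_per_turn=200):
    """Sample X^n_t = (cos(2 pi n^2 t), sin(2 pi n^2 t)) / n on a uniform grid of [0, 1]."""
    total = n * n * points_per_turn
    path = []
    for k in range(total + 1):
        angle = 2.0 * math.pi * k / points_per_turn   # equals 2 pi n^2 t_k with t_k = k / total
        path.append((math.cos(angle) / n, math.sin(angle) / n))
    return path


def sup_distance_from_start(path):
    """Largest distance |X_t - X_0| along the sampled path."""
    x0, y0 = path[0]
    return max(math.hypot(x - x0, y - y0) for x, y in path)


def levy_area(path):
    """Antisymmetric part of the iterated integral over [0, 1], from left-point sums."""
    x0, y0 = path[0]
    area = 0.0
    for (x, y), (x_next, y_next) in zip(path, path[1:]):
        area += 0.5 * ((x - x0) * (y_next - y) - (y - y0) * (x_next - x))
    return area


for n in (5, 10):
    p = circle_path(n)
    print(n, sup_distance_from_start(p), levy_area(p))
```

As n grows the first printed quantity tends to zero while the second stays near π, so the limiting object has a trivial path component but a non-trivial second-order component.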

1.3 Regularity structures

Recently, a new theory of “regularity structures” was introduced [Hai14b], unifying
various flavours of the theory of rough paths (including Gubinelli’s controlled
rough paths [Gub04], as well as his branched rough paths [Gub10]), as well as the
usual Taylor expansions. While it has its conceptual roots in the theory of rough
paths, the main advantage of this new theory is that it is no longer tied to the one-
dimensionality of the time parameter, which makes it also suitable for the description
of solutions to stochastic partial differential equations, rather than just stochastic
ordinary differential equations.
The main achievement of the theory of regularity structures is that it allows to
give a (pathwise!) meaning to ill-posed stochastic PDEs that arise naturally when
trying to describe the macroscopic behaviour of models from statistical mechanics
near criticality. One example of such an equation is the KPZ equation arising as a
natural model for one-dimensional interface motion [KPZ86, BG97, Hai13]:

\[
\partial_t h = \partial_x^2 h + (\partial_x h)^2 - C + \xi. \tag{1.9}
\]

The problem with this equation is that, if anything, one has (∂_x h)^2 = +∞ (a
consequence of the roughness of (1 + 1)-dimensional space-time white noise) and
one would have to compensate with C = +∞. It has instead become customary
to define the solution of the KPZ equation as the logarithm of the solution to the
(multiplicative) stochastic heat equation ∂_t u = ∂_x^2 u + uξ, essentially ignoring the
(infinite) Itô-correction term.³ The so-constructed solutions are called Hopf–Cole
solutions and,
to cite J. Quastel [Qua11],
The evidence for the Hopf–Cole solutions is now overwhelming. Whatever the physicists
mean by KPZ, it is them.

It should be emphasised that previous to [Hai13], to be discussed in Chapter 15, no
direct mathematical meaning had been given to the actual KPZ equation.
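The formal computation behind the Hopf–Cole transform is easiest to see in a smooth, noise-free setting, where no Itô correction arises and one can take C = 0: if u > 0 solves the heat equation ∂_t u = ∂_x^2 u, then h = log u satisfies ∂_t h = ∂_x^2 h + (∂_x h)^2. The Python sketch below checks this with central finite differences for one explicit solution u. The particular u, the wave number `K` and the step size are assumptions made for this illustration only, and the check says nothing about the singular stochastic case.

```python
import math

K = 1.5  # wave number of the (hypothetical) test solution


def u(t, x):
    """A strictly positive smooth solution of the heat equation u_t = u_xx."""
    return 2.0 + math.exp(-K * K * t) * math.cos(K * x)


def h(t, x):
    """Hopf-Cole transform of u."""
    return math.log(u(t, x))


def kpz_residual(t, x, step=1e-4):
    """Finite-difference value of h_t - h_xx - (h_x)^2 at the point (t, x)."""
    h_t = (h(t + step, x) - h(t - step, x)) / (2 * step)
    h_x = (h(t, x + step) - h(t, x - step)) / (2 * step)
    h_xx = (h(t, x + step) - 2 * h(t, x) + h(t, x - step)) / step ** 2
    return h_t - h_xx - h_x ** 2


for point in [(0.1, 0.3), (0.5, -1.2), (1.0, 2.0)]:
    print(point, kpz_residual(*point))
```

The residuals vanish up to discretisation and rounding errors, which is the deterministic identity that the Hopf–Cole definition extends, formally, to the stochastic setting.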
Another example is the dynamical Φ^4_3 model arising for example in the stochastic
quantisation of Euclidean quantum field theory [PW81, JLM85, AR91, DPD03,
Hai14b], as well as a universal model for phase coexistence near the critical point
[GLP99]:
\[
\partial_t \Phi = \Delta\Phi + C\Phi - \Phi^3 + \xi. \tag{1.10}
\]

³ This requires one of course to know that solutions to ∂_t u = ∂_x^2 u + uξ stay strictly
positive with probability one, provided u_0 > 0 a.s., but this turns out to be the case.

Here, ξ denotes (3 + 1)-dimensional space-time white noise. In contrast to the KPZ
equation where the Hopf–Cole solution is a Hölder continuous random field, here
Φ is at best a random Schwartz distribution, making the term Φ^3 ill-defined. Again,
one formally needs to set C = ∞ to create suitable cancellations and so, again, the
stochastic partial differential equation (1.10) has no “naïve” mathematical meaning.
Loosely speaking, the type of well-posedness results that can be proven with the
help of the theory of regularity structures can be formulated as follows.

Theorem 1.4. Consider KPZ and Φ^4_3 on a bounded square spatial domain with
periodic boundary conditions. Let ξ_ε = δ_ε ∗ ξ denote the regularisation of space-time
white noise with a compactly supported smooth mollifier δ_ε that is scaled by ε in
the spatial direction(s) and by ε^2 in the time direction. Denote by h_ε and Φ_ε the
solutions to

\[
\begin{aligned}
\partial_t h_\varepsilon &= \partial_x^2 h_\varepsilon + (\partial_x h_\varepsilon)^2 - C_\varepsilon + \xi_\varepsilon, \\
\partial_t \Phi_\varepsilon &= \Delta\Phi_\varepsilon + \tilde C_\varepsilon \Phi_\varepsilon - \Phi_\varepsilon^3 + \xi_\varepsilon.
\end{aligned}
\]

Then, there exist choices of constants C_ε and C̃_ε diverging as ε → 0, as well as
processes h and Φ such that h_ε → h and Φ_ε → Φ in probability. Furthermore, while
the constants C_ε and C̃_ε do depend crucially on the choice of mollifiers δ_ε, the limits
h and Φ do not depend on them.

In the case of the KPZ equation, the topology in which one obtains convergence is
that of convergence in probability in a suitable space of space-time Hölder continuous
functions. Let us also emphasise that in this case the resulting renormalised solutions
coincide indeed with the Hopf–Cole solutions.
In the case of the dynamical Φ^4_3 model, convergence takes place instead in some
space of space-time distributions. One caveat that also has to be dealt with in the
latter case is that the limiting process Φ may in principle explode in finite time for
some instances of the driving noise. (Although this is of course not expected.)
Chapters 13 and 14 of this book give a short and mostly self-contained introduction
to the theory of regularity structures and the last chapter shows how it can
be used to provide a robust solution theory for the KPZ equation. The material in
these chapters differs significantly in presentation from the remainder of the book.
Indeed, since a detailed and rigorous exposition of this material would require an
entire book by itself (see the rather lengthy articles [Hai13] and [Hai14b]), we made
a conscious decision to keep the exposition mostly at an intuitive level. We therefore
omit virtually all proofs (with the notable exception of the proof of the reconstruction
theorem, Theorem 13.12, which is the fundamental result on which the theory builds)
and instead merely give short glimpses of the main ideas involved.

1.4 Frequently used notations

Basics: Natural numbers, including zero, are denoted by N, integers by Z, real and
complex numbers are denoted by R and C, respectively. Strictly positive reals are
denoted by R_+. For x real, ⌊x⌋ (resp. ⌈x⌉) is the largest (resp. smallest) integer n
such that n ≤ x (resp. n ≥ x). We also write {x} ∈ (0, 1] for the non-zero fractional
part so that x − {x} ∈ Z. A d-dimensional multi-index is an element k ∈ N^d, and
given x ∈ R^d, we write x^k as a shorthand for x_1^{k_1} ··· x_d^{k_d} and k! as a shorthand for
k_1! ··· k_d!.
Tensors: We shall deal with paths with values in, as well as maps between, Banach
spaces, typically denoted by V, W , equipped with their respective norms, always
written as | · |. Continuous linear maps from V to W form a Banach space, denoted
by L(V, W). It will be important to consider tensor products of Banach spaces. If
V, W are finite-dimensional, say V ≅ R^m and W ≅ R^n, the tensor product V ⊗ W
can be identified with the matrix space R^{m×n}. Indeed, if (e_i : 1 ≤ i ≤ m) [resp.
(f_j : 1 ≤ j ≤ n)] is a basis of V [resp. W], then (e_i ⊗ f_j : 1 ≤ i ≤ m, 1 ≤ j ≤ n)
is a basis of V ⊗ W. If V and W are Hilbert spaces and (e_i) and (f_j) are orthonormal
bases it is natural to define a Euclidean structure on V ⊗ W by declaring the (e_i ⊗ f_j)
to be orthonormal. This induces a norm on V ⊗ W, also denoted by | · |, which is
compatible in the sense that |v ⊗ w| ≤ |v| · |w| for all v ∈ V, w ∈ W. This tensor
norm is furthermore symmetric, namely |u ⊗ v| = |v ⊗ u|, equivalently expressed as
invariance under transposition x ↦ x^T.
We also introduce the symmetric and antisymmetric parts of x ∈ V ⊗ V:

\[
\mathrm{Sym}(x) = \tfrac12 (x + x^T), \qquad \mathrm{Anti}(x) = \tfrac12 (x - x^T).
\]

The defining feature of tensor product spaces is their ability to linearise bilinear
maps,⁴

\[
L^{(2)}(V \times \bar V, W) \cong L\bigl(V, L(\bar V, W)\bigr) \cong L(V \otimes \bar V, W). \tag{1.11}
\]
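In finite dimensions these identifications are elementary linear algebra, and the following Python sketch spells them out for V = R^3, V̄ = R^2 and W = R, using nested lists as matrices. All function names are hypothetical, and the norm used is the Euclidean (Frobenius) tensor norm described above, not the projective norm discussed below. The sketch checks that a bilinear map (v, w) ↦ Σ m_{ij} v_i w_j coincides with a linear functional applied to v ⊗ w, that |v ⊗ w| = |v| · |w| for this norm, and that Sym and Anti decompose a tensor.

```python
import math


def outer(v, w):
    """Elementary tensor v (x) w, identified with an m x n matrix."""
    return [[vi * wj for wj in w] for vi in v]


def transpose(x):
    return [list(col) for col in zip(*x)]


def sym(x):
    xt = transpose(x)
    return [[0.5 * (a + b) for a, b in zip(r, rt)] for r, rt in zip(x, xt)]


def anti(x):
    xt = transpose(x)
    return [[0.5 * (a - b) for a, b in zip(r, rt)] for r, rt in zip(x, xt)]


def frobenius(x):
    """Euclidean norm on the tensor product (matrix Frobenius norm)."""
    return math.sqrt(sum(e * e for row in x for e in row))


def bilinear(m, v, w):
    """Real-valued bilinear map (v, w) -> sum_ij m[i][j] v_i w_j."""
    return sum(m[i][j] * v[i] * w[j] for i in range(len(v)) for j in range(len(w)))


def linear_on_tensors(m, x):
    """The same coefficients m, now acting linearly on a tensor x."""
    return sum(m[i][j] * x[i][j] for i in range(len(x)) for j in range(len(x[0])))


v, w = [1.0, -2.0, 0.5], [3.0, 4.0]
m = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(bilinear(m, v, w), linear_on_tensors(m, outer(v, w)))
```

The two printed values agree, which is the content of the second identification in (1.11) in this finite-dimensional case.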
We briefly discuss the extension to infinite dimensions. Given Banach spaces V, V̄
one completes the algebraic tensor product V ⊗_a V̄ under a compatible tensor norm
to obtain a Banach space V ⊗ V̄. By [Rya02, Thm 2.9], the second⁵ identification in
(1.11) requires one to work with the projective tensor norm

\[
|x|_{\mathrm{proj}} \overset{\mathrm{def}}{=} \inf\Bigl\{ \sum_i |v_i|\,|\bar v_i| \;:\; x = \sum_i v_i \otimes \bar v_i \Bigr\},
\]

where all sums are finite and | · | stands for either norm in V or V̄. This norm is
obviously compatible and symmetric. Symmetric and antisymmetric part of x ∈
V ⊗ V are defined as before (note that the transposition map V ⊗ W → W ⊗ V
given by v ⊗ w ↦ w ⊗ v defined on the algebraic tensor product uniquely extends
to its completion for any symmetric compatible tensor norm). Without going into
further detail, we note that the projective tensor norm is the largest compatible tensor
norm (by the triangle inequality) satisfying |u ⊗ v| = |u| · |v|, and thus produces the
smallest Banach tensor product space.

⁴ This will arise naturally, with V̄ = V, when pairing the second Fréchet derivatives (of some
F : V → W) with second iterated integrals with values in V ⊗ V.
⁵ The first identification holds for general Banach spaces.
Differentiable maps: Given (possibly infinite-dimensional) Banach spaces V, W we
write C^n = C^n(V; W), n ∈ N, for the space of continuous maps from V to W which
are n times continuously differentiable in Fréchet sense. A Banach space C_b^n ⊂ C^n is
given by those F ∈ C^n with

\[
\|F\|_{C_b^n} \overset{\mathrm{def}}{=} \|F\|_\infty + \|DF\|_\infty + \ldots + \|D^n F\|_\infty < \infty, \tag{1.12}
\]

where we recall DF(v) ∈ L(V, W), D^2F ∈ L(V, L(V, W)) ≅ L^{(2)}(V × V, W),
the space of continuous bilinear maps from V × V to W. For γ ∈ (0, 1), we say that
F is locally γ-Hölder continuous, in symbols F ∈ C^γ, if for every x ∈ V there exists
a neighbourhood N = N(x) and constant C = C(x), such that for all y, z ∈ N,
|F(z) − F(y)| ≤ C|z − y|^γ. (In finite dimensions, one equivalently demands this
estimate to hold on bounded sets.) The case γ = 1 is meaningful (“locally Lipschitz”)
but not denoted by C^1 for the sake of notational consistency.⁶ More generally, we
say F ∈ C^γ, for non-integer γ = n + {γ}, with fractional part {γ} ∈ (0, 1), when
F ∈ C^n and D^nF ∈ C^{{γ}}. A Banach space C_b^γ ⊂ C^γ is introduced via

\[
\|F\|_{C_b^\gamma} \overset{\mathrm{def}}{=} \|F\|_{C_b^n} + \sup_{y \neq z} \frac{|D^n F(z) - D^n F(y)|}{|z - y|^{\{\gamma\}}} < \infty. \tag{1.13}
\]

The spaces C^γ and C_b^γ satisfy the obvious inclusions and continuous embeddings,
respectively. Warning. The Lip^γ-spaces frequently seen in the rough path literature
are precisely our C_b^γ-spaces for γ ∉ N (at least when V is finite-dimensional),
whereas F ∈ Lip^{n+1} means F ∈ C^n with globally Lipschitz D^nF; some authors
also interpret C_b^γ-spaces, for integer γ, in this way.
Path spaces: We say that X : [0, T] → V is continuously (Fréchet) differentiable if
X = X_0 + ∫_0^· Ẋ_t dt, for some continuous path Ẋ : [0, T] → V, the derivative of X.
We say that X is smooth, X ∈ C^∞ = C^∞([0, T], V), if X and all its derivatives are
continuously differentiable. The Banach space C^α = C^α([0, T], V), for α ∈ (0, 1),
consists of α-Hölder paths, with finite α-Hölder seminorm,

\[
\|X\|_\alpha \overset{\mathrm{def}}{=} \sup_{s,t \in [0,T]} \frac{|X_{s,t}|}{|t - s|^\alpha} < \infty,
\]

where we define the path increment X_{s,t} ≝ X_t − X_s (and also use the convention
0/0 ≝ 0); we also write δX for the map (s, t) ↦ X_t − X_s. This seminorm fails to
separate constants; the norm on C^α is then given by ‖X‖_{C^α} ≝ |X_0| + ‖X‖_α. We write
C^{0,α} ⊂ C^α for the closure of smooth paths. As above, C^1 is potentially ambiguous,
and we adopt (in the context of paths) the Lipschitz (1-Hölder) interpretation; this
is convenient e.g. to include piecewise smooth approximations of less regular paths.

⁶ One checks that every F ∈ C^1 is locally Lipschitz (though not necessarily C_b^1 on bounded sets).

We will sometimes say that X ∈ C α as a shorthand for the statement that X ∈ C β
for every β < α. This abuse of notation will also be used for other scales of “Hölder-type”
spaces depending on a regularity index α. Similarly, one introduces the Banach
space C^{p-var}([0, T], V), for p ∈ [1, ∞), of continuous paths with finite p-variation
seminorm,

\[
\|X\|_{p\text{-var};[0,T]} \overset{\mathrm{def}}{=} \Bigl( \sup_{\mathcal{P}} \sum_{[s,t] \in \mathcal{P}} |X_{s,t}|^p \Bigr)^{1/p} < \infty.
\]

Here one works with partitions or dissections of [0, T]; since every dissection D =
{0 = t_0 < t_1 < ··· < t_n = T} ⊂ [0, T] can be thought of as a partition of [0, T]
into (essentially) disjoint intervals, P = {[t_{i−1}, t_i] : i = 1, ..., n}, and vice-versa, we
shall use whatever is (notationally) more convenient.
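For a path known only at finitely many sample times, both seminorms can be computed exactly over the sample grid. The Python sketch below is one possible way to do so for scalar paths: the p-variation uses a simple quadratic-time dynamic programme over dissections whose points are sample indices, and the Hölder seminorm takes the maximum over all pairs of sample times. The names `p_variation` and `holder_seminorm` are hypothetical, and restricting to the sample grid is an assumption of this sketch (it gives a lower bound for the seminorms of an underlying continuous path).

```python
def p_variation(x, p):
    """p-variation seminorm of the scalar samples x[0], ..., x[n], with the
    supremum taken over all dissections made of sample indices."""
    n = len(x)
    best = [0.0] * n  # best[j]: sup of sum |increment|^p over dissections of [0, j]
    for j in range(1, n):
        best[j] = max(best[i] + abs(x[j] - x[i]) ** p for i in range(j))
    return best[-1] ** (1.0 / p)


def holder_seminorm(x, times, alpha):
    """alpha-Hoelder seminorm of the samples x at the given times,
    with the supremum taken over all pairs of sample times."""
    worst = 0.0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            worst = max(worst, abs(x[j] - x[i]) / (times[j] - times[i]) ** alpha)
    return worst


print(p_variation([0.0, 1.0, 0.0, 1.0], 1))
print(p_variation([0.0, 1.0, 2.0], 2))
print(holder_seminorm([0.0, 1.0, 2.0], [0.0, 1.0, 2.0], 0.5))
```

In the second example the coarsest dissection is optimal: for p > 1 it can pay to skip intermediate points of a monotone stretch, which is why the supremum over dissections cannot be replaced by the finest one.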
We further recall that lim_{|P|→0}, typically defined via nets, means convergence
along any sequence (P_k) with mesh |P_k| → 0, with identical limit along each such
sequence. Here, the mesh |P| of a partition P is the length of its largest element, i.e.
|P| = sup_{k∈{1,...,n}} |t_k − t_{k−1}| if P is as above.
Two parameter spaces: Every V-valued path X gives rise to its increment function
δX : (s, t) ↦ X_{s,t} = X_t − X_s. More generally, consider (s, t) ↦ Ξ_{s,t}, with some
sort of “on-diagonal” β-Hölder regularity, formalised by the Banach space C_2^β of
maps Ξ : [0, T]^2 → V with finite norm,

\[
\|\Xi\|_\beta \overset{\mathrm{def}}{=} \sup_{s,t \in [0,T]} \frac{|\Xi_{s,t}|}{|t - s|^\beta} < \infty.
\]

(Note X ∈ C^β if and only if δX ∈ C_2^β.)


Rough path spaces: The symbols 𝒞^α, 𝒟_X^α etc. refer to spaces of rough paths
and controlled rough paths, respectively. In the given order, ℒ(C^∞) ⊂ 𝒞_g^{0,α} ⊂ 𝒞_g^α
denote the spaces of canonically lifted smooth paths, geometric and weakly geometric
rough paths; 𝒞^∞ is the space of smooth rough paths. Every level-2 rough path
𝐗 ∈ 𝒞^α admits a bracket [𝐗] that quantifies deviations from the classical chain rule.
Hölder spaces and distributions: Local and global regularity of maps f : R^d → R
can be measured in the above-mentioned Hölder space scale C^γ and C_b^γ, for γ ∈ R_+.
We write D(R^d) or D for C_c^∞, the space of smooth, compactly supported functions.
Upon equipping D with a suitable topology, the topological dual D′ = D′(R^d) is
the space of generalised functions or distributions. The Hölder scale extends to
negative γ, and then agrees (for non-integer γ) with the Zygmund spaces Z^γ; precise
definitions are left to Chapter 14.
Stochastic analysis: We expect the reader is familiar with (d-dimensional standard)
Brownian motion B = B(t, ω) and basics of Itô calculus for (continuous)
semimartingales as exposed e.g. in [RY99]. In particular, for (continuous)
semimartingales X, Y, the angle-bracket ⟨X, Y⟩ denotes the usual quadratic covariation
process, so that ⟨X, Y⟩_T = lim Σ_{[s,t]∈P_n} X_{s,t} Y_{s,t} along any sequence (P_n) of
deterministic partitions of [0, T] with mesh |P_n| → 0. We also write ⟨X⟩ = ⟨X, X⟩. The
square-bracket [X, Y] is understood in Föllmer sense (and tacitly depends on a fixed
sequence of partitions); here too we write [X] = [X, X].
Miscellaneous: We will use the notation A = O(x) if there exists a constant C such
that the bound |A| ≤ Cx holds for every x ≤ 1 (or every x ≥ 1, depending on the
context). Similarly, we write A = o(x) if the constant C can be made arbitrarily
small as x → 0 (or as x → ∞, depending on the context). We will also occasionally
write C for a generic constant that only depends on the data of the problem under
consideration and which can change value from one line to the other without further
notice. We further write x ≲ y for two positive quantities to express an estimate
x ≤ Cy; and x ≍ y if in addition y ≲ x. Dependence on a parameter δ may be
indicated by writing ≍_δ. We often consider quantities A = A_{s,t} and B = B_{s,t} with
s, t ∈ [0, T], for fixed T > 0, and then write
\[
A \overset{\gamma}{=} B
\]
for |A_{s,t} − B_{s,t}| ≲ |t − s|^γ, with (hidden) constant uniform in s, t ∈ [0, T].
We write int(A), cl(A) for the interior and closure of a subset A in (some topo-
logical space) X.
Exercises: The difficulty of exercises is indicated with the same convention as in
[RY99]: one star ∗ denotes difficult ones and two stars ∗∗ denote very difficult
ones. The symbol ♯ denotes exercises that are important for the comprehension of
subsequent material.

1.5 Rough path theory works in infinite dimensions

Unless explicitly otherwise stated, all rough path results in this book are valid in
infinite dimensions. This is rather obvious in the case of Young integration, say with
L(V, W )-valued integrand and V -valued integrator, for general Banach spaces V
and W. In the case of rough integration, Section 4.2, of a L(V, W)-valued one-form
F, against a V ⊕ V^{⊗2}-valued rough path, the pairing of DF · F, with values in
L(V, L(V, W)), with V^{⊗2} is crucial and requires (1.11). As was explained there,
this is guaranteed by equipping V ⊗ V with the projective norm which will be our
standing assumption for the rest of this text, unless otherwise stated.
Alternatively, Lyons [Lyo98], [LQ02, pp. 28, 110] or [LCL07, pp.75] adjusts the
notion of C^γ-regularity required for F in a way that basically forces DF · F to
take values in L(V ⊗ V, W ), with the consequence that the regularity condition on
F then depends on the chosen tensor product norm. This modification entails no
changes in subsequent arguments. Of course, there is no difference whatsoever when
dim V < ∞.
The same remarks apply to solving rough differential equations. The Young case
of Section 8.3 is not affected by tensor norms, whereas the typical second order
approximation for RDEs, as e.g. seen in 8.13 later on, immediately points to the
need for a well-defined pairing of the form (1.11). This is ensured by having V ⊗ V
equipped with the projective norm. Alternatively, and as before, it is possible to
replace the projective norm by weaker compatible norms, but this then forces one to
think more carefully about the necessary modifications on the precise assumptions
on the space of vector fields when solving RDEs. This can be important when
the existence of the V^{⊗2}-valued rough path in the projective tensor product space
is problematic, as happens in the case of Banach valued Brownian motion (e.g.
[LLQ02] and Exercise 3.5, used e.g. in [IK06, IK07]). See also [Lyo98, Def. 1.2.4]
or [KM17, Proof of Thm 3.3], where dim W < ∞ is noted to be helpful, and [LCL07,
pp.19–20] and [LQ02, pp. 28, 111] for more information.
Chapter 2
The space of rough paths

We define the space of (Hölder continuous) rough paths, as well as the subspace of
“geometric” rough paths which preserve the usual rules of calculus. The latter can
be interpreted in a natural way as paths with values in a certain nilpotent Lie group.
At the end of the chapter, we give a short discussion showing how these definitions
should be generalised to treat paths of arbitrarily low regularity.

2.1 Basic definitions

In this section, we give a practical definition of the space of Hölder continuous
rough paths. Our choice of Hölder spaces is chiefly motivated by our hope that most
readers will already be familiar with the classical Hölder spaces from real analysis.
We could in the sequel have replaced “α-Hölder continuous” by “finite p-variation”
for p = 1/α in many statements. This choice would also have been quite natural,
due to the fact that one of our primary goals will be to give meaning to integrals
of the form ∫ f(X) dX or solutions to controlled differential equations of the form
dY = f(Y) dX for rough paths X. The value of such an integral / solution does not
depend on the parametrisation of X, which dovetails nicely with the fact that the
p-variation of a function is also independent of its parametrisation. This motivated its
choice in the original development of the theory. In some other applications however
(like the solution theory to rough stochastic partial differential equations developed
in [Hai11b, HW13, Hai13] and more generally the theory of regularity structures
[Hai14b] exposed in the last chapters), parametrisation-independence is lost and the
choice of Hölder norms is more natural.
A rough path on an interval $[0,T]$ with values in a Banach space $V$ then consists
of a continuous function $X : [0,T] \to V$, as well as a continuous “second order
process” $\mathbb{X} : [0,T]^2 \to V \otimes V$, subject to certain algebraic and analytical conditions.
Regarding the former, the behaviour of iterated integrals, such as (2.2) below, suggests
imposing the algebraic relation (“Chen’s relation”),


\[ \mathbb{X}_{s,t} - \mathbb{X}_{s,u} - \mathbb{X}_{u,t} = X_{s,u} \otimes X_{u,t} , \tag{2.1} \]

which we assume to hold for every triplet of times $(s,u,t)$. Since $X_{t,t} = 0$, it
immediately follows (take $s = u = t$) that we also have $\mathbb{X}_{t,t} = 0$ for every $t$. As
already mentioned in the introduction, one should think of $\mathbb{X}$ as postulating the value
of the quantity
\[ \int_s^t X_{s,r} \otimes dX_r \overset{\mathrm{def}}{=} \mathbb{X}_{s,t} , \tag{2.2} \]
where we take the right-hand side as a definition for the left-hand side. (And not
the other way around!) We insist (cf. Exercise 2.4 below) that as a consequence
of (2.1), knowledge of the path $t \mapsto (X_{0,t}, \mathbb{X}_{0,t})$ already determines the entire
second order process $\mathbb{X}$. In this sense, the pair $(X, \mathbb{X})$ is indeed a path, and not
some two-parameter object, although it is often more convenient to consider it
as one. If $X$ is a smooth function and we read (2.2) from right to left, then it is
straightforward to verify (see Exercise 2.1 below) that the relation (2.1) does indeed
hold. Furthermore, one can convince oneself that if $f \mapsto \int f\,dX$ denotes any form
of “integration” which is linear in $f$, has the property that $\int_s^t dX_r = X_{s,t}$, and is
such that $\int_s^t f(r)\,dX_r + \int_t^u f(r)\,dX_r = \int_s^u f(r)\,dX_r$ for any admissible integrand
$f$, and if we use such a notion of “integral” to define $\mathbb{X}$ via (2.2), then (2.1) does
automatically hold. This makes it a very natural postulate in our setting.
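As a numerical illustration of Chen's relation, the following minimal sketch computes the second order process of a piecewise-linear path in the plane, for which the iterated integral in (2.2) can be evaluated exactly segment by segment, and checks (2.1) over all ordered triples of nodes. The sample path and the helper names (`outer`, `increment`, `second_level`, `chen_defect`) are illustrative assumptions, not notation from the text.

```python
# Sketch: exact canonical lift of a piecewise-linear path in R^2, sampled at its nodes.
# All function names and the sample path are illustrative choices.

def outer(u, v):
    """Tensor product u (x) v of two vectors, stored as a matrix."""
    return [[ui * vj for vj in v] for ui in u]

def increment(path, s, t):
    """First-level increment X_{s,t} = X_t - X_s between node indices s and t."""
    return [b - a for a, b in zip(path[s], path[t])]

def second_level(path, s, t):
    """Exact value of int_s^t X_{s,r} (x) dX_r for node indices s <= t."""
    d = len(path[0])
    total = [[0.0] * d for _ in range(d)]
    for k in range(s, t):
        base = increment(path, s, k)       # X_{s,t_k}
        step = increment(path, k, k + 1)   # increment over the k-th linear piece
        for i in range(d):
            for j in range(d):
                # On a linear piece the integral equals base (x) step + step (x) step / 2.
                total[i][j] += base[i] * step[j] + 0.5 * step[i] * step[j]
    return total

def chen_defect(path, s, u, t):
    """Largest entry of |XX_{s,t} - XX_{s,u} - XX_{u,t} - X_{s,u} (x) X_{u,t}|."""
    a = second_level(path, s, t)
    b = second_level(path, s, u)
    c = second_level(path, u, t)
    rhs = outer(increment(path, s, u), increment(path, u, t))
    d = len(path[0])
    return max(abs(a[i][j] - b[i][j] - c[i][j] - rhs[i][j])
               for i in range(d) for j in range(d))

path = [[0.0, 0.0], [1.0, 0.5], [0.5, 2.0], [-1.0, 1.0], [0.0, 3.0]]
defects = [chen_defect(path, s, u, t)
           for s in range(5) for u in range(s, 5) for t in range(u, 5)]
print(max(defects))  # should vanish up to rounding
```

The check is restricted to ordered node indices only because `second_level` sums over segments from s to t; the relation itself is assumed in the text for every triplet of times.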
Note that the algebraic relations (2.1) are by themselves not sufficient to determine
$\mathbb{X}$ as a function of $X$. Indeed, for any $V \otimes V$-valued function $F$, the substitution
$\mathbb{X}_{s,t} \mapsto \mathbb{X}_{s,t} + F_t - F_s$ leaves the left-hand side of (2.1) invariant. We will see later
on how one should interpret such a substitution. It remains to discuss what are the
natural analytical conditions one should impose for $\mathbb{X}$. We are going to assume that
the path $X$ itself is α-Hölder continuous, so that $|X_{s,t}| \lesssim |t-s|^\alpha$. The archetype of
an α-Hölder continuous function is one which is self-similar with index α, so that
$X_{\lambda s,\lambda t} \sim \lambda^\alpha X_{s,t}$.
(We intentionally do not give any mathematical definition of self-similarity here,
just think of ∼ as having the vague meaning of “looks like”.) Given (2.2), it is then
very natural to expect $\mathbb{X}$ to also be self-similar, but with $\mathbb{X}_{\lambda s,\lambda t} \sim \lambda^{2\alpha} \mathbb{X}_{s,t}$. This
discussion motivates the following definition of our basic spaces of rough paths.

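Definition 2.1. For $\alpha \in (\tfrac13, \tfrac12]$, define the space of α-Hölder rough paths (over $V$),
in symbols $\mathscr{C}^\alpha([0,T],V)$, as those pairs $(X,\mathbb{X}) =: \mathbf{X}$ such that
\[ \|X\|_\alpha \overset{\mathrm{def}}{=} \sup_{s \ne t \in [0,T]} \frac{|X_{s,t}|}{|t-s|^{\alpha}} < \infty , \qquad \|\mathbb{X}\|_{2\alpha} \overset{\mathrm{def}}{=} \sup_{s \ne t \in [0,T]} \frac{|\mathbb{X}_{s,t}|}{|t-s|^{2\alpha}} < \infty , \tag{2.3} \]
and such that the algebraic constraint (2.1) is satisfied.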

The obvious example is the canonical rough path lift of a smooth path $X$, of the
form $(X, \int X \otimes dX)$, and we write $\mathscr{L}(\mathcal{C}^\infty)$ for the class of rough paths obtained in
this way.$^1$ We have the strict inclusion $\mathscr{L}(\mathcal{C}^\infty) \subset \mathscr{C}^\infty$, the class of smooth rough
paths,$^2$ by which we mean a genuine rough path with the additional property that the
$V$-valued (resp. $V \otimes V$-valued) maps $X$ and $\mathbb{X}_{s,\bullet}$ are smooth, for every basepoint $s$.
For instance, $\mathbf{X} \equiv (0,0)$ is the trivial canonical rough path associated to the scalar
zero path, as opposed to the smooth “pure second level” rough path (over $\mathbb{R}$) given
by $(s,t) \mapsto (0, t-s)$; see also Exercise 2.10 for a natural example with $\dim V > 1$.

Remark 2.2. Any scalar path $X \in \mathcal{C}^\alpha$ can be lifted to a rough path (over $\mathbb{R}$), simply
by setting $\mathbb{X}_{s,t} := (X_{s,t})^2/2$. However, for a vector-valued path $X \in \mathcal{C}^\alpha$, with
values in some Banach space $V$, it is far from obvious that one can find suitable
“second order increments” $\mathbb{X}$ such that $X$ lifts to a rough path $(X,\mathbb{X}) \in \mathscr{C}^\alpha$. The
Lyons–Victoir extension theorem (Exercise 2.14) asserts that this can always be done,
even in a continuous fashion, provided that $1/\alpha \notin \mathbb{N}$, which means $\alpha \in (\tfrac13,\tfrac12)$ in our
present discussion. (A counterexample for $\alpha = \tfrac12$ is hinted at in Exercise 2.13.) The
reader may wonder how this continuity property dovetails with Proposition 1.1. The
point is that if we define $X \mapsto \mathbb{X}$ by an application of the Lyons–Victoir extension
theorem, this map restricted to smooth paths does in general not coincide with the
Riemann–Stieltjes integral of $X$ against itself.
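The scalar lift of Remark 2.2 is easy to test numerically. The sketch below assumes an arbitrary sample path (the function `X` is an illustrative choice) and checks that the second level given by half the squared increment satisfies Chen's relation (2.1) for every triple of sample times, ordered or not.

```python
import math

def X(t):
    """A sample scalar path (illustrative choice, not from the text)."""
    return math.sin(3.0 * t) + 0.5 * t

def XX(s, t):
    """Second-level increment of the scalar lift: (X_{s,t})^2 / 2."""
    return 0.5 * (X(t) - X(s)) ** 2

def chen_defect(s, u, t):
    """XX_{s,t} - XX_{s,u} - XX_{u,t} - X_{s,u} X_{u,t}; should be zero."""
    return XX(s, t) - XX(s, u) - XX(u, t) - (X(u) - X(s)) * (X(t) - X(u))

times = [0.0, 0.13, 0.4, 0.77, 1.0]
worst = max(abs(chen_defect(s, u, t)) for s in times for u in times for t in times)
print(worst)  # zero up to rounding
```

The identity behind the check is the elementary expansion of (a + b)^2 / 2 with a = X_{s,u} and b = X_{u,t}; it uses commutativity of the product, which is why the same recipe does not carry over to vector-valued paths.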

Remark 2.3. In typical applications to stochastic processes with α-Hölder continuous
sample paths, $\alpha \in (\tfrac13,\tfrac12)$, such as Brownian motion, rough path lift(s) are constructed
via probability, and one does not rely on the extension theorem. In many cases, one
has a “canonical” (a.k.a. Stratonovich, Wong–Zakai) lift of a process given as limit (in
probability and rough path topology) of canonically lifted sample path mollifications
of the process. Examples where such a construction works include a large class of
Gaussian processes, in particular Brownian motion, and more generally fractional
Brownian motion for every Hurst parameter $H > \tfrac14$, cf. Section 10. However, this
may not be the only meaningful construction: already in Section 3, we will discuss
three natural, but different, ways to lift Brownian motion to a rough path. For a
detailed discussion of Markov (with uniformly elliptic generator in divergence form)
and semimartingale rough paths we refer to [FV10b].

If one ignores the nonlinear constraint (2.1), the quantities defined in (2.3) suggest
thinking of $(X,\mathbb{X})$ as an element of the Banach space $\mathcal{C}^\alpha \oplus \mathcal{C}_2^{2\alpha}$ with (semi-)norm
$\|X\|_\alpha + \|\mathbb{X}\|_{2\alpha}$ (which vanishes when $X$ is constant and $\mathbb{X} \equiv 0$). However, taking
into account (2.1) we see that $\mathscr{C}^\alpha$ is not a linear space, although it is a closed subset
of the aforementioned Banach space; see Exercise 2.7. We will need (some sort of) a
norm and metric on $\mathscr{C}^\alpha$. The induced “natural” norm on $\mathscr{C}^\alpha$ given by $\|X\|_\alpha + \|\mathbb{X}\|_{2\alpha}$
fails to respect the structure of (2.1) which is homogeneous with respect to a natural
dilation on $\mathscr{C}^\alpha$, given by $\delta_\lambda : (X,\mathbb{X}) \mapsto (\lambda X, \lambda^2 \mathbb{X})$. This suggests introducing the
α-Hölder homogeneous rough path norm
\[ |||\mathbf{X}|||_\alpha \overset{\mathrm{def}}{=} \|X\|_\alpha + \sqrt{\|\mathbb{X}\|_{2\alpha}} , \tag{2.4} \]
which, although not a norm in the usual sense of normed linear spaces, is a very
adequate concept for the rough path $\mathbf{X} = (X,\mathbb{X})$. On the other hand, (2.3) leads to a
natural notion of rough path metric (and then rough path topology).

$^1$ We note immediately that “smooth” can be replaced by “sufficiently smooth”, such as $\mathcal{C}^1$ and
even $\mathcal{C}^\alpha$, with α > 1/2, in view of Young integration, Section 4.1.
$^2$ We deviate here from the early rough path literature, including [LQ02], where smooth rough paths
meant canonical rough paths. Instead, we are aligned with the terminology of regularity structures,
where (canonical, smooth) models generalise the corresponding notions of rough paths.

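Definition 2.4. Given rough paths $\mathbf{X}, \mathbf{Y} \in \mathscr{C}^\alpha([0,T],V)$, we define the (inhomogeneous)
α-Hölder rough path metric$^3$
\[ \varrho_\alpha(\mathbf{X},\mathbf{Y}) := \sup_{s \ne t \in [0,T]} \frac{|X_{s,t} - Y_{s,t}|}{|t-s|^{\alpha}} + \sup_{s \ne t \in [0,T]} \frac{|\mathbb{X}_{s,t} - \mathbb{Y}_{s,t}|}{|t-s|^{2\alpha}} . \]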

Perhaps the easiest way to show convergence with respect to this rough path
metric is based on interpolation: in essence, it is enough to establish pointwise
convergence together with uniform “rough path” bounds of the form (2.3); see
Exercise 2.9. Let us also note that $\mathscr{C}^\alpha([0,T],V)$ endowed with this distance is a
complete metric space; the reader is asked to work out the details in Exercise 2.7.
We conclude this part with two important remarks. First, we can ask ourselves up
to which point the relations (2.1) are already sufficient to determine $\mathbb{X}$. Assume that
we can associate to a given function $X$ two different second order processes $\mathbb{X}$ and
$\bar{\mathbb{X}}$, and set $G_{s,t} = \mathbb{X}_{s,t} - \bar{\mathbb{X}}_{s,t}$. It then follows immediately from (2.1) that
\[ G_{s,t} = G_{u,t} + G_{s,u} , \]
so that in particular $G_{s,t} = G_{0,t} - G_{0,s}$. Since, conversely, we already noted that
setting $\bar{\mathbb{X}}_{s,t} = \mathbb{X}_{s,t} + F_t - F_s$ for an arbitrary continuous function $F$ does not change
the left-hand side of (2.1), we conclude that $\mathbb{X}$ is in general determined only up to
the increments of some function $F \in \mathcal{C}^{2\alpha}(V \otimes V)$. The choice of $F$ does usually
matter and there is in general no obvious canonical choice.
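The non-uniqueness just described can be observed on a small example. In the sketch below, a scalar path is lifted in two ways: once by half the squared increment, and once with the increments of an additional function F added. The particular choices of `X` and `F` are illustrative assumptions; the code checks that both second order processes satisfy (2.1) and that their difference is additive.

```python
import math

def X(t):
    """Sample scalar path (illustrative)."""
    return math.cos(2.0 * t)

def F(t):
    """Sample perturbation (illustrative): Lipschitz, hence 2*alpha-Hoelder for alpha <= 1/2."""
    return t

def XX(s, t):
    """Reference second level: (X_{s,t})^2 / 2."""
    return 0.5 * (X(t) - X(s)) ** 2

def XX_bar(s, t):
    """Perturbed second level: XX_{s,t} + F_t - F_s."""
    return XX(s, t) + F(t) - F(s)

def chen_defect(second, s, u, t):
    """Defect in Chen's relation (2.1) for a given second order process."""
    return second(s, t) - second(s, u) - second(u, t) - (X(u) - X(s)) * (X(t) - X(u))

def G(s, t):
    """Difference of the two second order processes."""
    return XX_bar(s, t) - XX(s, t)

times = [k / 8 for k in range(9)]
triples = [(s, u, t) for s in times for u in times for t in times]
worst_chen = max(abs(chen_defect(XX_bar, s, u, t)) for s, u, t in triples)
worst_additivity = max(abs(G(s, t) - G(s, u) - G(u, t)) for s, u, t in triples)
print(worst_chen, worst_additivity)  # both zero up to rounding
```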
The second remark is that this construction can possibly be useful only if $\alpha \le \tfrac12$.
Indeed, if $\alpha > \tfrac12$, then a canonical choice of $\mathbb{X}$ is given by reading (2.2) from
right to left and interpreting the left-hand side as a simple Young integral [You36].
Furthermore, it is clear in this case that $\mathbb{X}$ must be unique, since any additional
increment should be 2α-Hölder continuous by (2.3), which is of course only possible
if $\alpha \le \tfrac12$. Let us stress once more however that this is not to say that $\mathbb{X}$ is uniquely
determined by $X$ if the latter is smooth, when it is interpreted as an element of $\mathscr{C}^\alpha$
for some $\alpha \le \tfrac12$. Indeed, if $\alpha \le \tfrac12$, $F$ is any 2α-Hölder continuous function with
values in $V \otimes V$ and $\mathbb{X}_{s,t} = F_t - F_s$, then the path $(0,\mathbb{X})$ is a perfectly “legal”
element of $\mathscr{C}^\alpha$, even though one cannot get any smoother than the function 0. The
impact of perturbing $\mathbb{X}$ by some $F \in \mathcal{C}^{2\alpha}$ in the context of integration is considered
in Example 4.14 below. In Chapter 5, we shall use this for a pathwise understanding
of how exactly Itô and Stratonovich integrals differ.

$^3$ As was already emphasised, $\mathscr{C}^\alpha$ is not a linear space but is naturally embedded in the Banach
space $\mathcal{C}^\alpha \oplus \mathcal{C}_2^{2\alpha}$ (cf. Exercise 2.7); the (inhomogeneous) rough path metric is then essentially the
induced metric. While this may not appear intrinsic (the situation is somewhat similar to using the
(restricted) Euclidean metric on $\mathbb{R}^3$ on the 2-sphere), the ultimate justification is that the Itô map
will turn out to be locally Lipschitz continuous in this metric.

Remark 2.5. There are some simple variations on the definition of a rough path, and
it can be very helpful to switch from one view-point to the other. (The analytic
conditions are not affected by this.)
a) From the “full increment” view point one has $(X,\mathbb{X}) : [0,T]^2 \to V \oplus V^{\otimes 2}$,
$(s,t) \mapsto (X_{s,t}, \mathbb{X}_{s,t})$ subject to the “full” Chen relation
\[ X_{s,t} = X_{s,u} + X_{u,t} , \qquad \mathbb{X}_{s,t} = \mathbb{X}_{s,u} + \mathbb{X}_{u,t} + X_{s,u} \otimes X_{u,t} . \tag{2.5} \]
Every path $X : [0,T] \to V$ induces (vector) increments $X_{s,t} \equiv (\delta X)_{s,t} = X_t - X_s$
for which the first equality is a triviality. Conversely, increments
determine a path modulo constants. In particular, $X_t = X_0 + X_{0,t}$ and this
definition is equivalent to what we had in Definition 2.1, if restricted to paths
with $X_0 = 0$ (or, less rigidly, by identifying paths $X, \bar{X}$ for which $\bar{X} - X$ is
constant). In many situations, notably differential equations driven by $(X,\mathbb{X})$,
this difference does not matter. (This increment view point is also closest to
“models” $(\Pi,\Gamma)$ in the theory of regularity structures, Section 13.3, where $s$ is
regarded as base-point and one is given a collection of functions $(X_{s,\cdot}, \mathbb{X}_{s,\cdot})$.
The Chen relation (2.5) then has the interpretation of shifting the base-point.)
b) The “full path” view point starts with $\mathbf{X} : [0,T] \to \{1\} \times V \oplus V^{\otimes 2} \equiv T_1^{(2)}(V)$,
a Lie group under the (truncated) tensor product, the details of which are left to
Section 2.3 below. Every such path has group increments defined by
\[ \mathbf{X}_s^{-1} \otimes \mathbf{X}_t =: \mathbf{X}_{s,t} =: (X_{s,t}, \mathbb{X}_{s,t}) . \]
Chen’s relation (2.5) is nothing but the trivial identity $\mathbf{X}_{s,u} \otimes \mathbf{X}_{u,t} = \mathbf{X}_{s,t}$ so that
any such group-valued path $\mathbf{X}$ induces an increment map $(X,\mathbb{X})$, of the form
discussed in a). Conversely, such increments determine $\mathbf{X}$ modulo constants as
seen from $\mathbf{X}_t = \mathbf{X}_0 \otimes \mathbf{X}_{0,t}$. If we restrict to $\mathbf{X}_0 = \mathbf{1} = (1,0,0)$, or identify
paths $\mathbf{X}, \tilde{\mathbf{X}}$ for which $\tilde{\mathbf{X}} \otimes \mathbf{X}^{-1}$ is constant, then there is no difference. (Such
a “base-point free” object corresponds to “fat” $\boldsymbol{\Pi}$ in the theory of regularity
structures and induces a model $(\Pi,\Gamma)$ in great generality.)
c) Our Definition 2.1 is a compromise in the sense that we want to start from a
familiar object, namely a path $X : [0,T] \to V$, together with minimal second
level increment information to define (in Section 4.2) the prototypical rough
integral $\int F(X)\,d(X,\mathbb{X})$. From the “increment” view point, we have thus supplied
more than necessary (namely $X_0$), whereas from the “full path” view point,
we have supplied $\mathbf{X}$, with $\mathbf{X}_0 = (1, X_0, *)$ specified on the first level only. (Of
course, this affects in no way the second level increments $\mathbb{X}_{s,t}$.)

2.2 The space of geometric rough paths

While (2.1) does capture the most basic (additivity) property that one expects any
decent theory of integration to respect, it does not imply any form of integration by
parts / chain rule. Now, if one looks for a first order calculus setting, such as is valid
in the context of smooth paths or the Stratonovich stochastic calculus, then for any
pair $e_i^*, e_j^*$ of elements in $V^*$, writing $X_t^i = e_i^*(X_t)$ and $\mathbb{X}_{s,t}^{ij} = (e_i^* \otimes e_j^*)(\mathbb{X}_{s,t})$, one
would expect to have the identity
\begin{align*}
\mathbb{X}^{ij}_{s,t} + \mathbb{X}^{ji}_{s,t} \ \text{“=”}\ & \int_s^t X^i_{s,r}\,dX^j_r + \int_s^t X^j_{s,r}\,dX^i_r \\
= {} & \int_s^t d(X^i X^j)_r - X^i_s X^j_{s,t} - X^j_s X^i_{s,t} \\
= {} & (X^i X^j)_{s,t} - X^i_s X^j_{s,t} - X^j_s X^i_{s,t} \\
= {} & X^i_{s,t} X^j_{s,t} ,
\end{align*}
so that the symmetric part of $\mathbb{X}$ is determined by $X$. In other words, for all times $s, t$
we have the “first order calculus” condition
\[ \mathrm{Sym}(\mathbb{X}_{s,t}) = \tfrac12 X_{s,t} \otimes X_{s,t} . \tag{2.6} \]
However, if we take $X$ to be an $n$-dimensional Brownian path and define $\mathbb{X}$ by Itô
integration, then (2.1) still holds, but (2.6) certainly does not.
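The contrast between a lift satisfying (2.6) and one violating it can be seen without any probability. The sketch below, under illustrative assumptions, takes the exact canonical lift of a piecewise-linear path in the plane and compares it with a second lift obtained by subtracting (t − s)/2 times the identity. This deterministic shift only mimics the algebraic shape of an Itô-type correction; no Brownian motion is simulated, and the names `canonical`, `shifted` and `sym_defect` are hypothetical helpers.

```python
def increment(path, s, t):
    """X_{s,t} between node indices s and t."""
    return [b - a for a, b in zip(path[s], path[t])]

def canonical(path, s, t):
    """Exact int_s^t X_{s,r} (x) dX_r for a piecewise-linear path (node indices s <= t)."""
    d = len(path[0])
    total = [[0.0] * d for _ in range(d)]
    for k in range(s, t):
        base, step = increment(path, s, k), increment(path, k, k + 1)
        for i in range(d):
            for j in range(d):
                total[i][j] += base[i] * step[j] + 0.5 * step[i] * step[j]
    return total

def shifted(path, times, s, t):
    """Canonical lift minus (t - s)/2 times the identity (assumed Ito-type shape)."""
    lift = canonical(path, s, t)
    for i in range(len(lift)):
        lift[i][i] -= 0.5 * (times[t] - times[s])
    return lift

def sym_defect(level2, x):
    """Largest entry of |Sym(XX) - x (x) x / 2|, the defect in condition (2.6)."""
    d = len(x)
    return max(abs(0.5 * (level2[i][j] + level2[j][i]) - 0.5 * x[i] * x[j])
               for i in range(d) for j in range(d))

times = [0.0, 0.25, 0.5, 0.75, 1.0]
path = [[0.0, 0.0], [1.0, 0.5], [0.5, 2.0], [-1.0, 1.0], [0.0, 3.0]]
x = increment(path, 0, 4)
geometric_defect = sym_defect(canonical(path, 0, 4), x)
shifted_defect = sym_defect(shifted(path, times, 0, 4), x)
print(geometric_defect, shifted_defect)
```

Since the shift is the increment of the function t ↦ −(t/2)·Id, the shifted object still satisfies Chen's relation; only the symmetry condition is lost.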
There are two natural ways to define a set of “geometric” rough paths for which
(2.6) holds. On the one hand, we can define the space of weakly geometric (α-Hölder)
rough paths,
\[ \mathscr{C}_g^\alpha([0,T],V) \subset \mathscr{C}^\alpha([0,T],V) , \]
by stipulating that $(X,\mathbb{X}) \in \mathscr{C}_g^\alpha$ if and only if $(X,\mathbb{X}) \in \mathscr{C}^\alpha$ and (2.6) holds as
equality in $V \otimes V$, for every $s,t \in [0,T]$. Note that $\mathscr{C}_g^\alpha$ is a closed subset of $\mathscr{C}^\alpha$.
On the other hand, we have already seen that every smooth path can be lifted
canonically to an element in $\mathscr{L}(\mathcal{C}^\infty) \subset \mathscr{C}^\alpha$ by reading the definition (2.2) from
right to left. This choice of $\mathbb{X}$ then obviously satisfies (2.6) and we can define the
space of geometric (α-Hölder) rough paths,
\[ \mathscr{C}_g^{0,\alpha}([0,T],V) \subset \mathscr{C}^\alpha([0,T],V) , \]
as the closure of $\mathscr{L}(\mathcal{C}^\infty)$ in $\mathscr{C}^\alpha$. We leave it as an exercise to the reader to see that $\mathcal{C}^\infty$
here may be replaced by $\mathcal{C}^1$ paths without changing the resulting space of geometric
rough paths.
One has the obvious inclusion $\mathscr{C}_g^{0,\alpha} \subset \mathscr{C}_g^\alpha$, which turns out to be strict. In fact,
$\mathscr{C}_g^{0,\alpha}$ is separable (provided $V$ is separable), whereas $\mathscr{C}_g^\alpha$ is not, cf. Exercise 2.8
below. The situation is similar to the classical situation of the set of α-Hölder
continuous functions being strictly larger than the closure of smooth functions under
the α-Hölder norm. (Or the set of bounded measurable functions being strictly larger
than $\mathcal{C}$, the closure of smooth functions under the supremum norm.) In practice, at
least when $\dim V < \infty$, the distinction between weakly and “genuinely” geometric
rough paths rarely matters for the following reason: similar to classical Hölder spaces,
one has the converse inclusion $\mathscr{C}_g^\beta \subset \mathscr{C}_g^{0,\alpha}$ whenever $\beta > \alpha$, see Proposition 2.8
below and also Exercise 2.12. For this reason, we will often casually speak of
“geometric rough paths”, even when we mean weakly geometric rough paths. (There
is no confusion in precise statements when we write $\mathscr{C}_g^{0,\alpha}$ or $\mathscr{C}_g^\alpha$.) Let us finally
mention that non-geometric rough paths can always be embedded in a space of
geometric rough paths at the expense of adding new components; in the present
(level-2) setting this can be accomplished in terms of a rough path bracket, see
Exercise 2.11 and also Section 5.3.

2.3 Rough paths as Lie group valued paths

We now present a very fruitful view of rough paths, taken over a Banach space $V$.
Consider $X : [0,T] \to V$, $\mathbb{X} : [0,T]^2 \to V^{\otimes 2}$ subject to (2.1) and define, with
$X_{s,t} = X_t - X_s$ as usual,
\[ \mathbf{X}_{s,t} := (1, X_{s,t}, \mathbb{X}_{s,t}) \in \mathbb{R} \oplus V \oplus V^{\otimes 2} \overset{\mathrm{def}}{=} T^{(2)}(V) . \tag{2.7} \]

The space $T^{(2)}(V)$ is itself a Banach space, with the norm of an element $(a,b,c)$
given by $|a| + |b| + |c|$, where in abusive notation $|\cdot|$ stands for any of the norms
in $\mathbb{R}$, $V$ and $V \otimes V$, the norm on the latter assumed compatible and symmetric, cf.
Section 1.4. More interestingly for our purposes, this space is a Banach algebra,
non-commutative when $\dim V > 1$ and with unit element $(1,0,0)$, when endowed
with the product
\[ (a,b,c) \otimes (a',b',c') \overset{\mathrm{def}}{=} (aa',\ ab' + a'b,\ ac' + a'c + b \otimes b') . \]

We call $T^{(2)}(V)$ the step-2 truncated tensor algebra over $V$. This multiplicative
structure is very well adapted to our needs since Chen’s relation (2.1), combined
with the obvious identity $X_{s,t} = X_{s,u} + X_{u,t}$, takes the elegant form
\[ \mathbf{X}_{s,t} = \mathbf{X}_{s,u} \otimes \mathbf{X}_{u,t} . \tag{2.8} \]


Set $T_a^{(2)}(V) \overset{\mathrm{def}}{=} \{(a,b,c) : b \in V, c \in V \otimes V\}$. As suggested by (2.7), the affine
subspace $T_1^{(2)}(V)$ will play a special role for us. We remark that each of its elements
has an inverse given by
\[ (1,b,c) \otimes (1,-b,-c + b \otimes b) = (1,-b,-c + b \otimes b) \otimes (1,b,c) = (1,0,0) , \tag{2.9} \]
so that $T_1^{(2)}(V)$ is a Lie group.$^4$ It follows that $\mathbf{X}_{s,t} = \mathbf{X}_{0,s}^{-1} \otimes \mathbf{X}_{0,t}$ are the natural
increments of the group valued path $t \mapsto \mathbf{X}_{0,t} =: \mathbf{X}_t$.

$^4$ The Lie group $T_1^{(2)}(V)$ is finite-dimensional if and only if $\dim V < \infty$.
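The product on the truncated tensor algebra and the inverse formula (2.9) translate directly into code. The following is a minimal sketch for V = R^2, where an element (a, b, c) is stored as a float, a list and a nested list; the storage format and the names `mul`, `inverse` and `gap` are illustrative assumptions.

```python
# Sketch of the step-2 truncated tensor algebra over R^2 (illustrative names).
# An element (a, b, c) is stored as (float, list, list of lists).

def mul(x, y):
    """(a, b, c) (x) (a', b', c') = (aa', ab' + a'b, ac' + a'c + b (x) b')."""
    a, b, c = x
    a2, b2, c2 = y
    d = len(b)
    level1 = [a * b2[i] + a2 * b[i] for i in range(d)]
    level2 = [[a * c2[i][j] + a2 * c[i][j] + b[i] * b2[j] for j in range(d)]
              for i in range(d)]
    return (a * a2, level1, level2)

def inverse(x):
    """Inverse of (1, b, c) as in (2.9): (1, -b, -c + b (x) b)."""
    _, b, c = x
    d = len(b)
    return (1.0, [-bi for bi in b],
            [[-c[i][j] + b[i] * b[j] for j in range(d)] for i in range(d)])

def gap(x, y):
    """Largest componentwise difference between two elements."""
    (a, b, c), (a2, b2, c2) = x, y
    diffs = [abs(a - a2)] + [abs(p - q) for p, q in zip(b, b2)]
    diffs += [abs(p - q) for r, r2 in zip(c, c2) for p, q in zip(r, r2)]
    return max(diffs)

unit = (1.0, [0.0, 0.0], [[0.0, 0.0], [0.0, 0.0]])
x = (1.0, [1.0, -2.0], [[0.5, 3.0], [-1.0, 2.0]])
y = (1.0, [0.5, 4.0], [[1.0, 0.0], [2.0, -3.0]])
z = (1.0, [-1.5, 0.25], [[0.0, 1.0], [1.0, 0.5]])

print(gap(mul(x, inverse(x)), unit))                # right inverse
print(gap(mul(inverse(x), x), unit))                # left inverse
print(gap(mul(mul(x, y), z), mul(x, mul(y, z))))    # associativity
print(gap(mul(x, y), mul(y, x)))                    # non-zero: the product does not commute
```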

Identifying $1, b, c$ with elements $(1,0,0), (0,b,0), (0,0,c) \in T^{(2)}(V)$, we may
write $(1,b,c) = 1 + b + c$. The resulting calculus is familiar from formal power series
in non-commuting indeterminates. For instance, the usual power series $(1+x)^{-1} =
1 - x + x^2 - \dots$ leads to, omitting tensors of order 3 and higher,
\begin{align*}
(1 + b + c)^{-1} &= 1 - (b + c) + (b + c) \otimes (b + c) \\
&= 1 - b - c + b \otimes b ,
\end{align*}

allowing us to recover (2.9). We also introduce the dilation operator $\delta_\lambda$ on $T^{(2)}(V)$,
with $\lambda \in \mathbb{R}$, which acts by multiplication with $\lambda^n$ on the $n$th tensor level $V^{\otimes n}$,
namely
\[ \delta_\lambda : (a,b,c) \mapsto (a, \lambda b, \lambda^2 c) . \]
Having identified $T_1^{(2)}(V)$ as the natural state space of (step-2) rough paths, we now
equip it with a homogeneous, symmetric and subadditive norm. For $x = (1,b,c)$,
\[ |||x||| \overset{\mathrm{def}}{=} \tfrac12 \big( N(x) + N(x^{-1}) \big) \quad\text{with}\quad N(x) = \max\big\{ |b|, \sqrt{2|c|} \big\} , \tag{2.10} \]
noting $|||\delta_\lambda x||| = |\lambda|\,|||x|||$, homogeneity with respect to dilation, and $|||x \otimes x'||| \le
|||x||| + |||x'|||$, a consequence of subadditivity for $N(\cdot)$ which requires a short argument
left to the reader. It is clear that
\[ (x,x') \mapsto |||x^{-1} \otimes x'||| \overset{\mathrm{def}}{=} d(x,x') \]
defines a bona fide (left-invariant) metric on the group $T_1^{(2)}(V)$. Important for us, the
graded Hölder regularity (2.3) of $\mathbf{X} = (X,\mathbb{X})$, part of the definition of a rough path,
can now be condensed to demand the “metric” Hölder seminorm
\[ \sup_{s \ne t \in [0,T]} \frac{d(\mathbf{X}_s,\mathbf{X}_t)}{|t-s|^{\alpha}} \asymp \|X\|_\alpha + \sqrt{\|\mathbb{X}\|_{2\alpha}} = |||\mathbf{X}|||_{\alpha;[0,T]} \tag{2.11} \]

to be finite. To summarise, we arrived at the following appealing characterisation of
(Banach space valued) rough paths.
Proposition 2.6. (Hölder continuity is with respect to the left-invariant metric $d$.)
a) Assume $(X,\mathbb{X}) \in \mathscr{C}^\alpha([0,T],V)$. Then the path $t \mapsto \mathbf{X}_t = (1, X_{0,t}, \mathbb{X}_{0,t})$, with
values in $T_1^{(2)}(V)$, is α-Hölder continuous.
b) Conversely, if $[0,T] \ni t \mapsto \mathbf{X}_t$ is a $T_1^{(2)}(V)$-valued and α-Hölder continuous
path, then $(X,\mathbb{X}) \in \mathscr{C}^\alpha([0,T],V)$ with $(1, X_{s,t}, \mathbb{X}_{s,t}) := \mathbf{X}_s^{-1} \otimes \mathbf{X}_t$.
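The homogeneous norm (2.10) and the induced left-invariant metric can be explored numerically. The sketch below works over R^2, stores an element (1, b, c) as the pair (b, c), and assumes the Euclidean norm on level one and the Frobenius norm on level two (one possible compatible and symmetric choice). It spot-checks homogeneity under dilation, symmetry, subadditivity, left-invariance and the triangle inequality on a few sample elements; this is a sanity check under these assumptions, not a proof.

```python
import math

# Elements (1, b, c) of the group over R^2 are stored as pairs (b, c); names are illustrative.

def mul(x, y):
    (b, c), (b2, c2) = x, y
    d = len(b)
    return ([b[i] + b2[i] for i in range(d)],
            [[c[i][j] + c2[i][j] + b[i] * b2[j] for j in range(d)] for i in range(d)])

def inverse(x):
    b, c = x
    d = len(b)
    return ([-bi for bi in b],
            [[-c[i][j] + b[i] * b[j] for j in range(d)] for i in range(d)])

def dilate(lam, x):
    """Dilation: multiply level n by lam**n."""
    b, c = x
    return ([lam * bi for bi in b], [[lam * lam * cij for cij in row] for row in c])

def N(x):
    """N(x) = max(|b|, sqrt(2|c|)) with Euclidean and Frobenius norms (assumed choice)."""
    b, c = x
    nb = math.sqrt(sum(bi * bi for bi in b))
    nc = math.sqrt(sum(cij * cij for row in c for cij in row))
    return max(nb, math.sqrt(2.0 * nc))

def hom_norm(x):
    """Homogeneous norm (2.10)."""
    return 0.5 * (N(x) + N(inverse(x)))

def dist(x, y):
    """Left-invariant metric d(x, y) = |||x^{-1} (x) y|||."""
    return hom_norm(mul(inverse(x), y))

x = ([1.0, -2.0], [[0.5, 3.0], [-1.0, 2.0]])
y = ([0.5, 4.0], [[1.0, 0.0], [2.0, -3.0]])
z = ([-1.5, 0.25], [[0.0, 1.0], [1.0, 0.5]])

print(hom_norm(dilate(-3.0, x)), 3.0 * hom_norm(x))          # homogeneity
print(hom_norm(mul(x, y)), hom_norm(x) + hom_norm(y))        # subadditivity
print(dist(mul(z, x), mul(z, y)), dist(x, y))                # left-invariance
```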

The usual power series and / or basic Lie group theory suggest defining
\[ \log(1 + b + c) \overset{\mathrm{def}}{=} b + c - \tfrac12\, b \otimes b , \tag{2.12} \]
\[ \exp(b + c) \overset{\mathrm{def}}{=} 1 + b + c + \tfrac12\, b \otimes b , \tag{2.13} \]
which allow us to identify $T_0^{(2)}(V) \cong V \oplus V^{\otimes 2}$ with $T_1^{(2)}(V) = \exp(V \oplus V^{\otimes 2})$.
The following Lie bracket makes $T_0^{(2)}(V)$ a Lie algebra. For $b, b' \in V$, $c, c' \in V^{\otimes 2}$,
\[ [b + c, b' + c'] \overset{\mathrm{def}}{=} b \otimes b' - b' \otimes b , \]
and $T_0^{(2)}(V)$ is step-2 nilpotent in the sense that all iterated brackets of length 2 vanish.
Define $\mathfrak{g}^{(2)}(V) \subset T_0^{(2)}(V)$ as the closed Lie subalgebra generated by $V \subset T_0^{(2)}(V)$,
explicitly given by
\[ \mathfrak{g}^{(2)}(V) = V \oplus [V,V] \quad\text{with}\quad [V,V] \overset{\mathrm{def}}{=} \mathrm{cl}\big(\mathrm{span}\{[v,w] : v,w \in V\}\big) , \]

called the free step-2 nilpotent Lie algebra over $V$. Note that in finite dimensions, say
$V = \mathbb{R}^d$, the closing procedure is unnecessary and $[V,V]$ is nothing but the space
of antisymmetric $d \times d$ matrices, with linear basis $([e_i,e_j] : 1 \le i < j \le d)$, where
$(e_i : 1 \le i \le d)$ denotes the standard basis of $\mathbb{R}^d$. Thanks to step-2 nilpotency, one
checks by hand the Baker–Campbell–Hausdorff formula
\[ \exp(b + c) \otimes \exp(b' + c') = \exp\big(b + b' + c + c' + \tfrac12 [b,b']\big) . \]
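The step-2 Baker–Campbell–Hausdorff formula can also be checked numerically. The sketch below, written for V = R^2 with elements (1, b, c) stored as pairs (b, c), implements the exponential and logarithm of (2.12) and (2.13) and compares both sides of the formula on sample data. As a by-product it checks that, with the product and bracket of this section, exp(v) ⊗ exp(w) ⊗ exp(−v) ⊗ exp(−w) equals exp([v, w]). All names and sample values are illustrative assumptions.

```python
# Elements (1, b, c) over R^2 stored as pairs (b, c); all names are illustrative.

def outer(u, v):
    return [[ui * vj for vj in v] for ui in u]

def mat_add(p, q, scale=1.0):
    """Entrywise p + scale * q."""
    return [[pij + scale * qij for pij, qij in zip(rp, rq)] for rp, rq in zip(p, q)]

def mul(x, y):
    """Group product: levels add, plus the cross term b (x) b' on level 2."""
    (b, c), (b2, c2) = x, y
    return ([u + v for u, v in zip(b, b2)], mat_add(mat_add(c, c2), outer(b, b2)))

def exp(b, c):
    """exp(b + c) = 1 + b + c + b (x) b / 2, see (2.13)."""
    return (list(b), mat_add(c, outer(b, b), 0.5))

def log(x):
    """log(1 + b + c) = b + c - b (x) b / 2, see (2.12)."""
    b, c = x
    return (list(b), mat_add(c, outer(b, b), -0.5))

def bracket(u, v):
    """[u, v] = u (x) v - v (x) u for u, v in R^2."""
    return mat_add(outer(u, v), outer(v, u), -1.0)

def gap(x, y):
    (b, c), (b2, c2) = x, y
    return max([abs(p - q) for p, q in zip(b, b2)]
               + [abs(p - q) for r, r2 in zip(c, c2) for p, q in zip(r, r2)])

zero = [[0.0, 0.0], [0.0, 0.0]]
b1, c1 = [1.0, -2.0], [[0.0, 0.75], [-0.75, 0.0]]
b2, c2 = [0.5, 4.0], [[0.0, -1.5], [1.5, 0.0]]

# Both sides of the Baker-Campbell-Hausdorff formula.
lhs = mul(exp(b1, c1), exp(b2, c2))
rhs = exp([u + v for u, v in zip(b1, b2)],
          mat_add(mat_add(c1, c2), bracket(b1, b2), 0.5))
bch_gap = gap(lhs, rhs)

# Group commutator of exponentials versus exponential of the bracket.
v, w = [1.0, 0.0], [0.0, 1.0]
neg_v, neg_w = [-vi for vi in v], [-wi for wi in w]
commutator = mul(mul(exp(v, zero), exp(w, zero)), mul(exp(neg_v, zero), exp(neg_w, zero)))
commutator_gap = gap(commutator, exp([0.0, 0.0], bracket(v, w)))

log_gap = gap(log(exp(b1, c1)), (b1, c1))
print(bch_gap, commutator_gap, log_gap)
```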

The image of $\mathfrak{g}^{(2)}$ under the exponential map then defines a closed Lie subgroup,
\[ G^{(2)}(V) \overset{\mathrm{def}}{=} \exp\big(\mathfrak{g}^{(2)}(V)\big) \subset T_1^{(2)}(V) , \]
called the free step-2 nilpotent group over $V$. These considerations provide us with
an elegant characterisation of weakly geometric rough paths. (The proof is immediate
from the previous proposition and rewriting (2.6) as $\mathbb{X}_{s,t} - \tfrac12 X_{s,t} \otimes X_{s,t} \in [V,V]$.)
Proposition 2.7 (Weakly geometric case).
a) Assume $(X,\mathbb{X}) \in \mathscr{C}_g^\alpha([0,T],V)$. Then the path $t \mapsto \mathbf{X}_t = (1, X_{0,t}, \mathbb{X}_{0,t})$, with
values in $G^{(2)}(V)$, is α-Hölder continuous (with respect to the metric $d$).
b) Conversely, if $[0,T] \ni t \mapsto \mathbf{X}_t$ is a $G^{(2)}(V)$-valued and α-Hölder continuous
path, then $(X,\mathbb{X}) \in \mathscr{C}_g^\alpha([0,T],V)$ with $(1, X_{s,t}, \mathbb{X}_{s,t}) := \mathbf{X}_s^{-1} \otimes \mathbf{X}_t$.

It is clear from the discussion in Section 2.2 that any sufficiently smooth path, say
$\gamma \in \mathcal{C}^1([0,1],V)$, produces an element in $G^{(2)}(V)$ by iterated integration, namely
\[ S^{(2)}(\gamma) = \Big( 1, \int_0^1 d\gamma(t), \int_0^1 \!\! \int_0^t d\gamma(s) \otimes d\gamma(t) \Big) \in G^{(2)}(V) . \]
The map $S^{(2)}$, which maps (sufficiently regular) paths on a fixed interval, here $[0,1]$,
into the above collection of tensors is known as the step-2 signature map. We note in
passing that Chen’s relation here has the pretty interpretation that the signature map
is a morphism from the space of paths, equipped with concatenation product, to
the tensor algebra. The inclusion $S^{(2)}(\mathcal{C}^1) \subset G^{(2)}$ becomes an equality in finite
dimensions,
\[ \{ S^{(2)}(\gamma) : \gamma \in \mathcal{C}^1([0,1],\mathbb{R}^d) \} = G^{(2)}(\mathbb{R}^d) . \tag{2.14} \]
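For piecewise-linear paths the step-2 signature can be computed by multiplying the exponentials of the successive increments, which also makes the morphism property concrete. The sketch below, for paths in R^2 and with elements (1, b, c) stored as pairs (b, c), checks that the signature of a concatenation is the product of the signatures, that the result satisfies the symmetry condition (2.6), and that a unit square traversed along e1, e2, −e1, −e2 has no first-level increment and a purely antisymmetric second level. Function names and sample data are illustrative assumptions.

```python
# Step-2 signature of a piecewise-linear path in R^2; elements (1, b, c) stored as (b, c).
# Illustrative sketch: the function names are not taken from the text.

def mul(x, y):
    (b, c), (b2, c2) = x, y
    d = len(b)
    return ([b[i] + b2[i] for i in range(d)],
            [[c[i][j] + c2[i][j] + b[i] * b2[j] for j in range(d)] for i in range(d)])

def linear_signature(v):
    """Signature of a straight segment with increment v, i.e. exp(v)."""
    return (list(v), [[0.5 * vi * vj for vj in v] for vi in v])

def signature(steps):
    """Signature of the concatenation of straight segments with the given increments."""
    d = len(steps[0])
    sig = ([0.0] * d, [[0.0] * d for _ in range(d)])
    for v in steps:
        sig = mul(sig, linear_signature(v))
    return sig

def sym_defect(sig):
    """Largest entry of |Sym(c) - b (x) b / 2|; it vanishes exactly on G^(2)."""
    b, c = sig
    d = len(b)
    return max(abs(0.5 * (c[i][j] + c[j][i]) - 0.5 * b[i] * b[j])
               for i in range(d) for j in range(d))

def gap(x, y):
    (b, c), (b2, c2) = x, y
    return max([abs(p - q) for p, q in zip(b, b2)]
               + [abs(p - q) for r, r2 in zip(c, c2) for p, q in zip(r, r2)])

first = [[1.0, 0.5], [-0.5, 2.0]]
second = [[0.25, -1.0], [2.0, 2.0], [-1.0, 0.0]]
whole = signature(first + second)                      # signature of the concatenated path
pieces = mul(signature(first), signature(second))      # product of the two signatures

square = signature([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
print(gap(whole, pieces), sym_defect(whole))
print(square)
```

For the square loop, half the difference of the two off-diagonal entries of the second level is the signed area enclosed by the path, here equal to 1.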

To see this, fix $b + c \in \mathfrak{g}^{(2)}(\mathbb{R}^d)$ and try to find finitely many, say $n$, affine linear
paths $\gamma_i$, with each signature determined by the direction $\gamma_i(1) - \gamma_i(0) = v_i \in \mathbb{R}^d$,
so that
\[ \exp(v_1) \otimes \dots \otimes \exp(v_n) = \exp(b + c) . \]
Properly applied, the Baker–Campbell–Hausdorff formula allows one to “break up”
the exponential $\exp\big(\sum_i b_i e_i + \sum_{j,k} c_{jk} [e_j,e_k]\big)$. In conjunction with the identity
$e^{[v,w]} = e^{v} \otimes e^{w} \otimes e^{-v} \otimes e^{-w}$ it is easy to find a possible choice of $v_1, \dots, v_n$.
By concatenation of the $\gamma_i$’s one has constructed a path $\gamma$ with prescribed signature
$S^{(2)}(\gamma) = \exp(b + c)$. This path is clearly in $\mathcal{C}^1$, the space of Lipschitz paths.$^5$ This
gives a very natural way to introduce another (homogeneous, symmetric, subadditive)
norm on $G^{(2)}(\mathbb{R}^d)$, namely
\[ \|x\|_C \overset{\mathrm{def}}{=} \inf\Big\{ \int_0^1 |\dot\gamma(t)|\,dt : \gamma \in \mathcal{C}^1([0,1],\mathbb{R}^d) ,\ S^{(2)}(\gamma) = x \Big\} , \tag{2.15} \]

known as the Carnot–Carathéodory norm. (In infinite dimensions, there is no guarantee
for the set on the right-hand side to be non-empty.) When equipped with
its Euclidean structure, $\mathbb{R}^d$ defines a “horizontal” subspace $\mathbb{R}^d \times \{0\} \subset \mathfrak{g}^{(2)}(\mathbb{R}^d)$,
seen as tangent space to $G^{(2)}(\mathbb{R}^d)$ at $(1,0,0)$ which in turn induces a left-invariant
sub-Riemannian structure on $G^{(2)}(\mathbb{R}^d)$. The associated left-invariant Carnot–Carathéodory
metric $d_C$ can then be seen as the minimal length of “horizontal” paths
connecting two points. Any minimising sequence in (2.15), parametrised by constant
speed, is equicontinuous so that by Arzelà–Ascoli such minimisers, also called
sub-Riemannian geodesics, exist and must be in $\mathcal{C}^1$. Such geodesics are a key tool in the
approach of Friz–Victoir [FV10b]. The explicit computation of such geodesics (and
Carnot–Carathéodory norms) is a difficult problem, with explicit formulae available
for $d = 2$, noting that, as Lie groups, $G^{(2)}(\mathbb{R}^2) \cong \mathbb{H}^3$, the 3-dimensional Heisenberg
group, see e.g. [Mon02]. Fortunately, a compactness argument, as detailed for
example in [FV10b, Sec 7.5], shows that all continuous homogeneous norms are
equivalent. Upon checking continuity of the Carnot–Carathéodory norm, one gets,
for $x = \exp(b + c) \in G^{(2)}(\mathbb{R}^d)$,
\[ \|x\|_C \asymp_d |b| + |c|^{1/2} \asymp \max\{ |b|, |c|^{1/2} \} , \tag{2.16} \]
which, despite its dependence on the dimension $d$, is sufficient for many practical
purposes. As a useful application, we now state an approximation result for weakly
geometric rough paths over $\mathbb{R}^d$. With the preparations made, the interested reader will
have no trouble providing a full proof for

Proposition 2.8 (Geodesic approximation). For every $(X,\mathbb{X}) \in \mathscr{C}_g^\beta([0,T],\mathbb{R}^d)$,
there exists a sequence of smooth paths $X^n : [0,T] \to \mathbb{R}^d$ such that
\[ (X^n,\mathbb{X}^n) := \Big( X^n, \int_0^{\cdot} X^n_{0,t} \otimes dX^n_t \Big) \to (X,\mathbb{X}) \quad \text{uniformly on } [0,T] \]
with uniform rough path bound $\sup_{n \ge 1} |||X^n,\mathbb{X}^n|||_\beta \lesssim |||X,\mathbb{X}|||_\beta$. By interpolation,
convergence holds in $\mathscr{C}^\alpha$, for any $\alpha < \beta$.

$^5$ In fact, by smoothly slowing down speed to zero whenever switching directions, the path $\gamma$ can
also be parametrized to be smooth. In particular, in (2.14) and (2.15) below we could have replaced
$\mathcal{C}^1$ by $\mathcal{C}^\infty$.

Remark 2.9. By definition, every geometric rough path $\mathbf{X} \in \mathscr{C}_g^{0,\beta}$ is the limit of
canonical rough path lifts $(X^n,\mathbb{X}^n) = \mathbf{X}^n$; trivially then, $|||\mathbf{X}^n|||_\beta \to |||\mathbf{X}|||_\beta$. This is
not true for a generic weakly geometric rough path $\mathbf{X} \in \mathscr{C}_g^\beta$. However, the above
proposition supplies approximations $(\mathbf{X}^n)$, which converge uniformly with uniform
rough path bounds. In such a case, $|||\mathbf{X}|||_\beta \le \liminf_{n \to \infty} |||\mathbf{X}^n|||_\beta$ and this can be strict.
This lower-semicontinuous behaviour of the rough path norm is reminiscent of norms
on Hilbert spaces under weak convergence and led to the terminology of “weakly”
geometric rough paths.

2.4 Geometric rough paths of low regularity

The interpretation given above gives a strong hint on how to construct geometric
rough paths with α-Hölder regularity for $\alpha \le \tfrac13$: setting $N = \lfloor 1/\alpha \rfloor$, one defines the
step-$N$ truncated tensor algebra over a Banach space $V$
\[ T^{(N)}(V) \overset{\mathrm{def}}{=} \bigoplus_{n=0}^{N} V^{\otimes n} , \]
with the natural convention that $V^{\otimes 0} = \mathbb{R}$. The product in $T^{(N)}(V)$ is simply the
tensor product ⊗, but we truncate it in a natural way by postulating that $a \otimes b = 0$ for
$a \in V^{\otimes k}$, $b \in V^{\otimes \ell}$ with $k + \ell > N$. A homogeneous, symmetric and subadditive
norm which generalises (2.10) to the step-$N$ case is given by
\[ |||x||| \overset{\mathrm{def}}{=} \tfrac12 \big( N(x) + N(x^{-1}) \big) \quad\text{with}\quad N(x) = \max_{n=1,\dots,N} \big( n!\,|x^n| \big)^{1/n} , \tag{2.17} \]
where every $x = (1, x^1, \dots, x^N) \in T_1^{(N)}(V)$, element with scalar component 1,
is invertible, and where $|\cdot|$ denotes any of the tensor norms on $V^{\otimes n}$, assumed
compatible and symmetric (permutation invariant).$^6$
Proposition 2.6 suggests the naïve definition of an α-Hölder rough path over
$V$ as a path $\mathbf{X}$, on $[0,T]$ say, with values in the group $T_1^{(N)}(V)$ which is α-Hölder
continuous with respect to $d(x,x') = |||x^{-1} \otimes x'|||$. Modulo knowledge of $\mathbf{X}_0$ this is
equivalent to a multiplicative map $(s,t) \mapsto \mathbf{X}_{s,t} \in T_1^{(N)}(V)$, multiplicative in the
sense that Chen’s relation holds,
\[ \mathbf{X}_{s,t} = \mathbf{X}_{s,u} \otimes \mathbf{X}_{u,t} , \tag{2.18} \]
for every triplet of times $(s,u,t)$, and with graded Hölder regularity,
\[ |\mathbf{X}^n_{s,t}| \lesssim |t-s|^{n\alpha} , \qquad n = 1, \dots, N , \]

$^6$ The definitions from Section 1.4 for $N = 2$ extend easily to $N > 2$, see also [LCL07, Def 1.25].

uniformly over $s,t \in [0,T]$. The interpretation of rough paths discussed at length in
the step-2 setting is unchanged and $\mathbf{X}^n_{s,t} \in V^{\otimes n}$ should be thought of as a substitute
for the (possibly ill-defined) $n$-fold integral $\int dX_{u_1} \otimes \cdots \otimes dX_{u_n}$ over the $n$-simplex
$\{s < u_1 < \cdots < u_n < t\}$. Such a notion of naïve higher order rough path is
sometimes sufficient, e.g. for solving linear rough differential equations, see also
Exercise 4.18, but does not contain the necessary information to deal with
non-linearities, already seen in the simple example of the form $\int_s^t (X_r - X_s)^{\otimes 2} \otimes dX_r$.
Higher order (weakly) geometric rough paths resolve this problem by imposing
a chain rule. In the above example, $(\delta X)^{\otimes 2}/2 = \mathrm{Sym}(\mathbf{X}^2)$, formerly written as
$\mathrm{Sym}(\mathbb{X})$, and the situation is reduced to (a linear combination of) 3-fold iterated
integrals. To proceed in a systematic fashion, we first introduce the correct state
space as the free step-$N$ nilpotent Lie group over $V$
\[ G^{(N)}(V) \overset{\mathrm{def}}{=} \exp\big(\mathfrak{g}^{(N)}(V)\big) \subset T_1^{(N)}(V) \]
where the exponential map is defined via its power series and $\mathfrak{g}^{(N)} \subset T_0^{(N)}(V)$ is the
(closed) Lie algebra generated by all elements of the form $(0, c, 0, \dots, 0)$ with $c \in V$
via the natural Lie bracket $[a,b] = a \otimes b - b \otimes a$. The neutral element $\mathbf{1} \in G^{(N)}(V)$
is given by $\mathbf{1} = (1, 0, \dots, 0)$. Given any $\alpha \in (0,1]$ and $N = \lfloor 1/\alpha \rfloor$ as the number of
“levels”, Proposition 2.7 now suggests the definition of a weakly geometric α-Hölder
rough path over $V$ as a path $\mathbf{X}$, on $[0,T]$ say, with values in the group $G^{(N)}(V)$ which
is α-Hölder continuous with respect to $d(x,x') = |||x^{-1} \otimes x'|||$. Modulo knowledge of
$\mathbf{X}_0$ this is equivalent to a multiplicative map $(s,t) \mapsto \mathbf{X}_{s,t} \in G^{(N)}(V)$ with graded
Hölder regularity, uniformly over $s,t \in [0,T]$,
\[ |\mathbf{X}^n_{s,t}| \lesssim |t-s|^{n\alpha} , \qquad n = 1, \dots, N . \]
Here, again multiplicative means validity of Chen’s relation as spelled out in (2.18)
above.
We now assume, for notational convenience, $V = \mathbb{R}^d$, which allows us to
think of components of some fixed rough path increment $\mathbf{X}_{s,t} \in T_1^{(N)}(\mathbb{R}^d)$ as being
indexed by words $w$ of length at most $N$ with letters in the alphabet $\{1, \dots, d\}$.
Similarly to before, given a word $w = w_1 \cdots w_n$, the corresponding component $\mathbf{X}^w$,
which we also write as $\langle \mathbf{X}, w \rangle$, is then interpreted as the $n$-fold integral
\[ \langle \mathbf{X}_{s,t}, w \rangle = \int_s^t \!\! \int_s^{s_n} \!\! \cdots \int_s^{s_2} dX^{w_1}_{s_1} \cdots dX^{w_n}_{s_n} , \tag{2.19} \]
and $|||\mathbf{X}_{s,t}||| \lesssim |t-s|^\alpha$ is equivalent to, for all words with length $|w| \le \lfloor 1/\alpha \rfloor$,
\[ |\langle \mathbf{X}_{s,t}, w \rangle| \lesssim |t-s|^{\alpha |w|} . \tag{2.20} \]



In order to describe the constraints imposed on these iterated integrals by the chain
rule, we define the shuffle product ⧢ between two words as the formal sum over all
possible ways of interleaving them. For example, one has
\[ a ⧢ x = ax + xa , \qquad ab ⧢ xy = abxy + axby + xaby + axyb + xayb + xyab , \]
with the empty word acting as the neutral element. With this notation at hand, it was
already remarked by Ree [Ree58] (see also [Che71]) that the chain rule implies the
identity
\[ \langle \mathbf{X}_{s,t}, v \rangle \langle \mathbf{X}_{s,t}, w \rangle = \langle \mathbf{X}_{s,t}, v ⧢ w \rangle . \tag{2.21} \]
(The reader is asked to show this in Exercise 2.2.) It is a remarkable fact that the
algebraic properties of the tensor and shuffle algebras combine in such a way that the
set of elements $\mathbf{X} \in T^{(N)}$ satisfying (2.21) is not only stable under the product ⊗,
but forms a group, which in turn was shown in [Ree58] to be nothing but the group
$G^{(N)}(\mathbb{R}^d)$. In the language of Hopf algebras, this group is exactly the character
group for the (truncated) shuffle Hopf algebra.
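The shuffle identity (2.21) can be tested on the truncated signature of a piecewise-linear path. The sketch below is a minimal illustration under stated assumptions: the alphabet is {0, 1}, the truncation level is 4, a straight segment with increment v has signature exp(v) (the word w of length n receives the product of the corresponding components of v divided by n!), and segments are concatenated with the truncated tensor product. The names `segment_signature`, `tensor_product`, `shuffle` and `shuffle_defect` are hypothetical helpers.

```python
from itertools import product as words_of
from math import factorial

D, N = 2, 4  # alphabet {0, 1} and truncation level (illustrative choices)

def all_words(max_len):
    """All words of length at most max_len, as tuples of letters."""
    return [w for n in range(max_len + 1) for w in words_of(range(D), repeat=n)]

def segment_signature(v):
    """Signature of a straight line with increment v: exp(v) truncated at level N."""
    sig = {}
    for w in all_words(N):
        coeff = 1.0
        for letter in w:
            coeff *= v[letter]
        sig[w] = coeff / factorial(len(w))
    return sig

def tensor_product(x, y):
    """Truncated tensor product: each word is split into a prefix and a suffix."""
    out = {w: 0.0 for w in all_words(N)}
    for w in out:
        for k in range(len(w) + 1):
            out[w] += x[w[:k]] * y[w[k:]]
    return out

def shuffle(v, w):
    """All interleavings of the words v and w, listed with multiplicity."""
    if not v:
        return [w]
    if not w:
        return [v]
    return ([(v[0],) + u for u in shuffle(v[1:], w)]
            + [(w[0],) + u for u in shuffle(v, w[1:])])

steps = [(1.0, 0.5), (-0.5, 2.0), (0.25, -1.0)]
sig = segment_signature(steps[0])
for step in steps[1:]:
    sig = tensor_product(sig, segment_signature(step))

def shuffle_defect(v, w):
    """|<X, v><X, w> - <X, v shuffle w>| for the signature computed above."""
    return abs(sig[v] * sig[w] - sum(sig[u] for u in shuffle(v, w)))

# Words of length at most 2 on each side keep the shuffles within the truncation level.
worst = max(shuffle_defect(v, w) for v in all_words(2) for w in all_words(2))
print(worst)  # zero up to rounding
```

As a usage note, `shuffle(("a", "b"), ("x", "y"))` lists the six interleavings appearing in the example of the text.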
In general, one may decide to forego the chain rule (after all, it doesn’t hold in the
context of Itô integration, as is manifest in Itô’s formula) in which case there is no
reason to impose (2.21). In this case, considering a rough path as an enhancement
of a path X by iterated integrals of the type (2.19) no longer provides sufficient
additional data. Indeed, in order to solve differential equations driven by X, one
would like to give meaning to expressions like for example
Z tZ r  Z r 
j
dXui dXvj dXrk =: ⟨Xs,t , i k ⟩ . (2.22)
s s s

We already remarked earlier, that in the (weakly) geometric case, the assumed
chain rule (now in the form of (2.21)) allows to reduce such expressions to linear
combinations of iterated integrals. In general, one should define a rough path as the
enhancement of a path X with additional functions that are interpreted as the various
formal expressions that can be formed by the two operations “multiplication” and
“integration against X”. The resulting algebraic construction is more involved and
gives rise to the concept of branched rough path X due to Gubinelli [Gub10]. The
terminology comes from the fact that the natural way of indexing the components
of such an object is no longer given by words, but by labelled trees, as suggested
in (2.22) above with labels i, j, k ∈ {1, . . . , d}. As detailed in [Gub10], see also
[HK15, BCFP19], branched rough paths take values in the character group of the
Connes–Kreimer Hopf algebra of trees [CK00], also known as the Butcher group
[But72]. A concise description of the branched rough path regularity via explicit
homogeneous subadditive norms on this Lie group, similar to (2.17), can be found in
[TZ18], cf. also [HS90].

2.5 Exercises

♯ Exercise 2.1 Let X be a smooth V -valued path.


a) Show that 𝕏_{s,t} := ∫_s^t X_{s,r} ⊗ Ẋ_r dr satisfies Chen’s relation (2.1).
b) Consider the collection of all iterated integrals over [s, t],

\[ \mathbf{X}_{s,t} := \Big(1, X_{s,t}, \mathbb{X}_{s,t}, \int_{\Delta^{(3)}_{s,t}} dX_{u_1}\otimes dX_{u_2}\otimes dX_{u_3}, \ldots\Big) \in T((V)), \tag{2.23} \]

where ∆^(3)_{s,t} = {u : s < u_1 < u_2 < u_3 < t} and T((V)) := ∏_{k=0}^∞ V^{⊗k} is the
space of tensor series over V, equipped with the obvious algebra structure (cf.
Section 2.4). Show that the following general form of Chen’s relation holds:

\[ \mathbf{X}_{s,t} = \mathbf{X}_{s,u}\otimes \mathbf{X}_{u,t}. \]

The element 𝐗_{s,t} ∈ T((V)) is known as the signature of X on the interval [s, t].
c) Show that the indefinite signature S := 𝐗_{0,·} solves the linear differential equation

\[ dS = S\otimes dX, \qquad S_0 = 1. \]
We will see later (Exercises 4.6 and 8.9) that the signature can be defined for every
rough path.
Hint: For point (b), it suffices to consider the projection of Xs,t to V ⊗n , for an
arbitrary integer n, given by the n-fold integral of dXu1 ⊗ · · · ⊗ dXun over the
simplex {s < u1 < · · · < un < t}.
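For part a), the key step can be sketched as follows. This is only an outline under the assumption that relation (2.1) reads 𝕏_{s,t} − 𝕏_{s,u} − 𝕏_{u,t} = X_{s,u} ⊗ X_{u,t}: split the integral at an intermediate time u and use X_{s,r} = X_{s,u} + X_{u,r} for r ≥ u.

```latex
\begin{align*}
\mathbb{X}_{s,t}
 &= \int_s^u X_{s,r}\otimes \dot X_r\,dr
    + \int_u^t \big(X_{s,u}+X_{u,r}\big)\otimes \dot X_r\,dr \\
 &= \mathbb{X}_{s,u}
    + X_{s,u}\otimes \int_u^t \dot X_r\,dr
    + \int_u^t X_{u,r}\otimes \dot X_r\,dr \\
 &= \mathbb{X}_{s,u} + X_{s,u}\otimes X_{u,t} + \mathbb{X}_{u,t}.
\end{align*}
```

The same splitting, applied to the n-fold integral over the simplex, is what the hint for part b) suggests.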
♯ Exercise 2.2 (Shuffle) Let V = R^d. As discussed in (2.19), the collection 𝐗_{s,t} of
all iterated integrals over a fixed interval [s, t] can also be viewed as

\[ \big( \mathbf{X}^{w}_{s,t} = \langle \mathbf{X}_{s,t}, w\rangle : w \text{ word on } A \big), \]

with alphabet A = {1, . . . , d}, where we recall that a word on A is a finite sequence
of elements of A, including the empty sequence ∅, called the empty word. By
convention, 𝐗^∅_{s,t} = 1. Write uv for the concatenation of two words u and v, and
accordingly ui for attaching a letter i ∈ A to the right of u. The linear span of such
words (which can be identified with polynomials in d non-commuting indeterminates)
carries an important commutative product known as the shuffle product. It is defined
recursively by requiring ∅ to be the neutral element, i.e. u ⧢ ∅ = ∅ ⧢ u = u, and
then

\[ ui \shuffle vj = (u \shuffle vj)\,i + (ui \shuffle v)\,j. \]

Let 𝐗_{s,t} be the signature of a smooth path X, as given in (2.23). Show that, for all
words u, v,

\[ \langle \mathbf{X}_{s,t}, u \shuffle v\rangle = \langle \mathbf{X}_{s,t}, u\rangle\langle \mathbf{X}_{s,t}, v\rangle. \tag{2.24} \]

The case of single letter words u = i, v = j gives i ⧢ j = ij + ji and expresses
precisely the product rule from calculus, which leads us to the level-2 geometricity
condition (2.6).
Hint: Proceed by induction in joint length: express ⟨Xs,t , ui⟩⟨Xs,t , vj⟩ by the product
rule as an integral over [s, t] and use the hypothesis for words of joint length |u| +
|v| + 1 < |ui| + |vj|.
∗ Exercise 2.3 Call a tensor series x ∈ T((R^d)) group-like, in symbols x ∈ G((R^d)),
if for all words u, v,

\[ \langle x, u \shuffle v\rangle = \langle x, u\rangle\langle x, v\rangle. \tag{2.25} \]

An element in T((R^d)) is called a Lie series if, for all N ∈ N, its projection to
T^(N) = T^(N)(R^d) is a Lie polynomial, i.e. an element of 𝔤^(N), which was defined
in Section 2.4 as the Lie algebra generated by R^d ⊂ T_0^(N). Given x ∈ T((R^d)), show
that x is group-like, i.e. x ∈ G((R^d)), if and only if log x is a Lie series.
♯ Exercise 2.4
a) It is common to define the (V ⊗ V)-valued map 𝕏 on ∆_{0,T} := {(s, t) : 0 ≤
s ≤ t ≤ T} rather than [0, T]². There is no difference however: if 𝕏_{s,t} is only
defined for s ≤ t, show that the relation (2.1) implies

\[ \mathbb{X}_{t,s} = -\mathbb{X}_{s,t} + X_{s,t}\otimes X_{s,t}. \]

b) In fact, show that knowledge of the path t ↦ (X_{0,t}, 𝕏_{0,t}) already determines
the entire second order process 𝕏. In this sense (X, 𝕏) is indeed a path, and not
some two-parameter object, cf. Remark 2.5.
c) Specialise to the case of a geometric rough path and show the identity 𝕏_{t,s} = 𝕏_{s,t}^T,
where (. . .)^T denotes the transpose. (When dim V = 1, so that X is scalar
valued, this is a trivial consequence of 𝕏_{s,t} = X_{s,t}²/2.)
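Exercise 2.5 Consider s ≡ τ_0 < τ_1 < · · · < τ_N ≡ t. Show that (2.1) implies

\[ \begin{aligned}
\mathbb{X}_{s,t} &= \sum_{0\le i<N}\mathbb{X}_{\tau_i,\tau_{i+1}} + \sum_{0\le j<i<N} X_{\tau_j,\tau_{j+1}}\otimes X_{\tau_i,\tau_{i+1}} \\
&= \sum_{i=0}^{N-1}\big(\mathbb{X}_{\tau_i,\tau_{i+1}} + X_{s,\tau_i}\otimes X_{\tau_i,\tau_{i+1}}\big).
\end{aligned} \tag{2.26} \]

This identity effectively compares 𝕏_{s,t} with a left-point Riemann–Stieltjes approximation
Σ_{i=0}^{N−1} X_{s,τ_i} ⊗ X_{τ_i,τ_{i+1}} of the “motivating” integral expression in (2.2).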
Exercise 2.6 Following Section 2.3 and Exercise 2.4, view 𝐗 ∈ 𝒞^α([0, T], V) as a
one-parameter path and define the (time T) time reversal of 𝐗 in the “naïve” way as

\[ \overleftarrow{\mathbf{X}}_t = \mathbf{X}_{T-t}, \qquad 0\le t\le T. \]

Verify that $\overleftarrow{\mathbf{X}}$ is again a rough path, i.e. $\overleftarrow{\mathbf{X}}$ ∈ 𝒞^α. Show furthermore that $\overleftarrow{\mathbf{X}}$ is
geometric if and only if 𝐗 is geometric.

♯ Exercise 2.7 Let V be a Banach space.


a) Let α ∈ (0, 1]. Show that the linear space of all continuous maps 𝕏 : [0, T]² →
V ⊗ V s.t. ∥𝕏∥_{2α} := sup |𝕏_{s,t}|/|t − s|^{2α} < ∞ is a Banach space, denoted by C_2^{2α}.
Deduce that C^α ⊕ C_2^{2α} is also Banach, with seminorm ∥•, •∥_{α,2α} = ∥•∥_α + ∥•∥_{2α}.
(A genuine norm is given by (X, 𝕏) ↦ |X_0| + ∥X, 𝕏∥_{α,2α}.)
b) Show that the rough path spaces Cgα and C α are complete metric spaces. In fact,
both are closed subspaces, defined through (nonlinear) algebraic relations, of
the infinite-dimensional Banach space C α ⊕ C22α .
c) Show that the rough path spaces Cgα and C α over V = R (and a fortiori every
V ̸= 0) are not separable. (You may use the well-known fact that the Hölder
spaces C α ([0, T ], R) are non-separable.)
Exercise 2.8 (Separable rough path spaces) Let V be a separable Banach space
and α ∈ (1/3, 1/2].
a) Show separability of the space of geometric (α-Hölder) rough paths

\[ \mathscr{C}_g^{0,\alpha}([0,T],V) \overset{\mathrm{def}}{=} \mathrm{cl}\big(\mathscr{L}(C^\infty)\big) \subset \mathscr{C}^{\alpha}([0,T],V). \]

Together with Exercise 2.7, b), this shows that 𝒞_g^{0,α} is Polish.
b) Show that the closure of smooth rough paths,

\[ \mathscr{C}^{0,\alpha}([0,T],V) \overset{\mathrm{def}}{=} \mathrm{cl}\big(\mathscr{C}^\infty\big) \subset \mathscr{C}^{\alpha}([0,T],V), \]

is also separable (and hence Polish).


Solution. (a) Let Q be a countable, dense subset of V and consider the space
Λ_n of paths which are piecewise linear between level-n dyadic rationals D_n :=
{kT/2^n : 0 ≤ k ≤ 2^n}, and, at level-n dyadic points, take values in Q. Clearly Λ =
∪Λ_n is countable, for each Λ_n is in one-to-one correspondence with the (2^n + 1)-fold
Cartesian product of Q. It is easy to see that each smooth X is the limit in C¹ of
some sequence (X^n) ⊂ Λ. Indeed, one can take X^n to be the piecewise linear
dyadic approximation, modified such that X^n|_{D_n} takes values in Q and such that
|(X^n − X)|_{D_n}| < 1/n. By continuity of the map X ∈ C¹ ↦ (X, ∫X ⊗ dX) ∈
𝒞^α in the respective topologies (we could even take α = 1), we have more than
enough to assert that every lifted smooth path, (X, ∫X ⊗ dX), is the limit in 𝒞^α of
lifted paths in Λ. It is then easy to see that every limit point of lifted smooth paths is
also the limit of lifted paths in Λ.
♯ Exercise 2.9 (Interpolation) Assume that 𝐗^n ∈ 𝒞^β, for 1/3 < α < β, with
uniform bounds

\[ \sup_n \|X^n\|_\beta < \infty \quad\text{and}\quad \sup_n \|\mathbb{X}^n\|_{2\beta} < \infty \]

and uniform convergence X^n_{s,t} → X_{s,t} and 𝕏^n_{s,t} → 𝕏_{s,t}, i.e. uniformly over s, t ∈
[0, T]. Show that this implies 𝐗 ∈ 𝒞^β and 𝐗^n → 𝐗 in 𝒞^α. Show furthermore that
the assumption of uniform convergence can be weakened to pointwise convergence:

\[ \forall t\in[0,T]:\quad X^n_{0,t}\to X_{0,t} \quad\text{and}\quad \mathbb{X}^n_{0,t}\to\mathbb{X}_{0,t}. \]

Solution. Using the uniform bounds and pointwise convergence, there exists C such
that uniformly in s, t

\[ |X_{s,t}| = \lim_n |X^n_{s,t}| \le C|t-s|^{\beta}, \qquad |\mathbb{X}_{s,t}| = \lim_n |\mathbb{X}^n_{s,t}| \le C|t-s|^{2\beta}. \]

It readily follows that 𝐗 = (X, 𝕏) ∈ 𝒞^β. In combination with the assumed uniform
convergence, there exists ε_n → 0, such that, uniformly in s, t,

\[ \begin{aligned}
|X^n_{s,t}-X_{s,t}| &\le \varepsilon_n, & |X^n_{s,t}-X_{s,t}| &\le 2C|t-s|^{\beta}, \\
|\mathbb{X}^n_{s,t}-\mathbb{X}_{s,t}| &\le \varepsilon_n, & |\mathbb{X}^n_{s,t}-\mathbb{X}_{s,t}| &\le 2C|t-s|^{2\beta}.
\end{aligned} \]

By geometric interpolation (a ∧ b ≤ a^{1−θ} b^θ when a, b > 0 and 0 < θ < 1) with
θ = α/β we have

\[ |X^n_{s,t}-X_{s,t}| \lesssim \varepsilon_n^{1-\alpha/\beta}|t-s|^{\alpha}, \qquad |\mathbb{X}^n_{s,t}-\mathbb{X}_{s,t}| \lesssim \varepsilon_n^{1-\alpha/\beta}|t-s|^{2\alpha}, \]

and the desired convergence in 𝒞^α follows.
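Spelled out, the interpolation step reads as follows; this is only a sketch that makes the implicit constant explicit, with θ = α/β and with C and ε_n as in the solution.

```latex
\begin{align*}
|X^n_{s,t}-X_{s,t}|
 &\le \varepsilon_n \wedge 2C|t-s|^{\beta}
  \le \varepsilon_n^{1-\theta}\,\big(2C|t-s|^{\beta}\big)^{\theta}
  = (2C)^{\alpha/\beta}\,\varepsilon_n^{1-\alpha/\beta}\,|t-s|^{\alpha}, \\
|\mathbb{X}^n_{s,t}-\mathbb{X}_{s,t}|
 &\le \varepsilon_n \wedge 2C|t-s|^{2\beta}
  \le \varepsilon_n^{1-\theta}\,\big(2C|t-s|^{2\beta}\big)^{\theta}
  = (2C)^{\alpha/\beta}\,\varepsilon_n^{1-\alpha/\beta}\,|t-s|^{2\alpha}.
\end{align*}
```

Dividing by |t − s|^α resp. |t − s|^{2α} and taking the supremum over s, t shows that both Hölder-type distances are bounded by a constant multiple of ε_n^{1−α/β}, which tends to zero.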


It remains to weaken the assumption to pointwise convergence. By Chen’s relation,
pointwise convergence of 𝐗^n_{0,t} for all t actually implies pointwise convergence of
𝐗^n_{s,t} for all s, t. We claim that, thanks to the uniform Hölder bounds, this implies
uniform convergence. Indeed, given ε > 0, pick a (finite) dissection D of [0, T]
with small enough mesh so that C|D|^β < ε/8. Given s, t ∈ [0, T] write ŝ, t̂ for the
nearest points in D and note that

\[ \begin{aligned}
|X_{s,t}-X^n_{s,t}| &\le |X_{\hat s,\hat t}-X^n_{\hat s,\hat t}| + |X_{s,\hat s}| + |X^n_{s,\hat s}| + |X_{t,\hat t}| + |X^n_{t,\hat t}| \\
&\le |X_{\hat s,\hat t}-X^n_{\hat s,\hat t}| + \varepsilon/2.
\end{aligned} \]

By picking n large enough, |X_{ŝ,t̂} − X^n_{ŝ,t̂}| can also be bounded by ε/2, uniformly
over the (finitely many!) points in D, so that X^n → X uniformly. Although the
second level is handled similarly, the non-additivity of (s, t) ↦ 𝕏_{s,t} requires some
extra care, cf. (2.1). For simplicity of notation only, we assume s < ŝ < t = t̂ so that

\[ |\mathbb{X}_{s,t}-\mathbb{X}^n_{s,t}| \le |\mathbb{X}_{s,\hat s}-\mathbb{X}^n_{s,\hat s}| + |\mathbb{X}_{\hat s,t}-\mathbb{X}^n_{\hat s,t}| + |X_{s,\hat s}\otimes X_{\hat s,t} - X^n_{s,\hat s}\otimes X^n_{\hat s,t}|. \]

It remains to write the last summand as |X^n_{s,ŝ} ⊗ (X_{ŝ,t} − X^n_{ŝ,t}) − (X^n_{s,ŝ} − X_{s,ŝ}) ⊗ X_{ŝ,t}|
and to repeat the same reasoning as in the first level.

♯ Exercise 2.10 (Pure area rough path) Identify R² with the complex numbers and
consider

\[ [0,1]\ni t\mapsto n^{-1}\exp\big(2\pi i n^2 t\big) \equiv X^n. \]

a) Set 𝕏^n_{s,t} := ∫_s^t X^n_{s,r} ⊗ dX^n_r. Show that, for fixed s < t,

\[ X^n_{s,t}\to 0, \qquad \mathbb{X}^n_{s,t}\to \pi(t-s)\begin{pmatrix}0&1\\-1&0\end{pmatrix}. \tag{2.27} \]

b) Establish the uniform bounds sup_n ∥X^n∥_{1/2} < ∞ and sup_n ∥𝕏^n∥_1 < ∞.

c) Conclude that (X^n, 𝕏^n) converges in 𝒞^α, for any α < 1/2.


Solution. a) Obviously, X^n_{s,t} = O(1/n) → 0 uniformly in s, t. Then

\[ \mathbb{X}^n_{s,t} = \tfrac12 X^n_{s,t}\otimes X^n_{s,t} + A^n_{s,t} = O\big(1/n^2\big) + A^n_{s,t}, \]

where A^n_{s,t} ∈ so(2) is the antisymmetric part of 𝕏^n_{s,t}. To avoid cumbersome
notation, we identify

\[ \begin{pmatrix}0&a\\-a&0\end{pmatrix}\in so(2) \;\leftrightarrow\; a\in\mathbb{R}. \]

A^n_{s,t} then represents the signed area between the curve (X^n_r : s ≤ r ≤ t) and
the straight chord from X^n_t to X^n_s. (This is a simple consequence of Stokes’
theorem: the exterior derivative of the 1-form ½(x dy − y dx), which vanishes
along straight chords, is the volume form dx ∧ dy.) With s < t, (X^n_r : s ≤ r ≤ t)
makes ⌊n²(t − s)⌋ full spins around the origin, at radius 1/n. Each full spin
contributes area π(1/n)², while the final incomplete spin contributes some area
less than π(1/n)². The total signed area, with multiplicity, is thus

\[ A^n_{s,t} = \big(n^2(t-s) + O(1)\big)\frac{\pi}{n^2} = \pi(t-s) + \frac{C_{s,t}}{n^2}, \]

where |C_{s,t}| ≤ π uniformly in s, t. It follows that

\[ \mathbb{X}^n_{s,t} = \pi(t-s)\begin{pmatrix}0&1\\-1&0\end{pmatrix} + O\big(1/n^2\big) \tag{2.28} \]

and the claimed uniform convergence follows.
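The convergence (2.27) is easy to observe numerically. The sketch below is an illustration rather than part of the solution: it samples X^n on a fine grid, treats the samples as a piecewise linear path, and computes the second level exactly for that piecewise linear path. The helper names `pure_area_path` and `level2_lift` and the grid size are assumptions made for this example.

```python
import numpy as np


def pure_area_path(n, s, t, num_points=200_001):
    """Sample X^n_r = exp(2*pi*i*n^2*r)/n on [s, t], as points in R^2."""
    r = np.linspace(s, t, num_points)
    angle = 2.0 * np.pi * n**2 * r
    return np.column_stack([np.cos(angle), np.sin(angle)]) / n


def level2_lift(path):
    """Second level int X_{s,r} (x) dX_r of the piecewise linear path through `path`."""
    inc = np.diff(path, axis=0)      # increments over each linear piece
    start = path[:-1] - path[0]      # X_{s,t_k} at the left endpoint of each piece
    # On each linear piece the integral equals X_{s,t_k} (x) inc + inc (x) inc / 2.
    return start.T @ inc + 0.5 * inc.T @ inc


J = np.array([[0.0, 1.0], [-1.0, 0.0]])
s, t = 0.1, 0.6123
for n in (5, 10, 20):
    error = np.abs(level2_lift(pure_area_path(n, s, t)) - np.pi * (t - s) * J).max()
    print(n, error)   # the error is expected to shrink like 1/n^2
```

The first level stays of order 1/n while the second level converges to a nonzero antisymmetric matrix, which is the "pure area" phenomenon of the exercise.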

b) The following two estimates for path increments of n^{−1} exp(2πi n² t) ≡ X^n_t
hold true:

\[ |X^n_{s,t}| \le |\dot X^n|_\infty |t-s| = 2\pi n|t-s|, \qquad |X^n_{s,t}| \le 2|X^n|_\infty = 2/n. \]

Since a ∧ b ≤ √(ab), it immediately follows that

\[ |X^n_{s,t}| \le \sqrt{4\pi|t-s|}, \]

uniformly in n, s, t. In other words, sup_n ∥X^n∥_{1/2} < ∞. The argument for the
uniform bounds on 𝕏^n_{s,t} is similar. On the one hand, we have the bound (2.28).
On the other hand, we also have

\[ |\mathbb{X}^n_{s,t}| = \Big|\iint_{s<u<v<t}\dot X^n_u\otimes\dot X^n_v\,du\,dv\Big| \le |\dot X^n|_\infty^2\,\frac{|t-s|^2}{2} = 2\pi^2 n^2|t-s|^2. \]

The required uniform bound on ∥𝕏^n∥_1 follows by using (2.28) for n²|t − s| > 1
and the above bound for n²|t − s| ≤ 1.

c) The interpolation argument is left to the reader.

Exercise 2.11 (Second order translation and bracket) Fix α ∈ (1/3, 1/2] and 𝐗 =
(X, 𝕏) ∈ 𝒞^α([0, T], V). Define the (second order) translation of 𝐗 in direction
H ∈ C^{2α}([0, T], V ⊗ V) by

\[ T_H(\mathbf{X}) \overset{\mathrm{def}}{=} (X, \mathbb{X}+\delta H), \]

where (δH) denotes the map (s, t) ↦ H_t − H_s.


a) Show that T_H(𝐗) ∈ 𝒞^α. In fact, show that the (linear) space C^{2α} acts freely on
the (nonlinear) rough path space 𝒞^α in the sense that, for all G, H ∈ C^{2α}, we
have

\[ T_G\big(T_H(\mathbf{X})\big) = (T_G\circ T_H)(\mathbf{X}) = T_{G+H}(\mathbf{X}). \]

Fix 𝐗 ∈ 𝒞^α. Is H ↦ T_H(𝐗) injective?
b) When does TH preserve the space Cgα ([0, T ], V )?
c) Show that any X = (X, X) ∈ C α ([0, T ], V ) can be written, in a unique way, as
TH (Xg ), where Xg ∈ Cgα ([0, T ], V ) for some H ∈ C 2α ([0, T ], Sym(V ⊗ V )),
so that we have the bijection

C α ([0, T ], V ) ↔ Cgα ([0, T ], V ) × C 2α ([0, T ], Sym(V ⊗ V )).

Show that 2δH = (δX)^{⊗2} − 2 Sym(𝕏) =: [𝐗], called the bracket of the rough path
𝐗, further studied in Section 5.3.
Exercise 2.12 (Vanishing Hölder oscillation) a) Let X ∈ C^α([0, T], V) with
Hölder exponent α ∈ (0, 1]. Define the space of Hölder paths with “vanishing
Hölder oscillation”,

\[ C^{\mathrm{van},\alpha} \overset{\mathrm{def}}{=} \Big\{ X\in C^\alpha : \sup_{s,t:|t-s|<\varepsilon}\frac{|X_{s,t}|}{|t-s|^\alpha}\to 0 \text{ as } \varepsilon\to 0\Big\}. \]

Show that for α ∈ (0, 1) we have C^{van,α} = C^{0,α}, the closure of smooth paths
in C^α. (For α = 1 this fails, why?) Show by explicit example that the inclusion
C^{0,α} ⊂ C^α is strict. (Hint: consider the function t ↦ t^α.)
b) Let 𝐗 = (X, 𝕏) ∈ 𝒞_g^α([0, T], V) with α ∈ (1/3, 1/2]. Define the space of Hölder
rough paths with “vanishing Hölder oscillation”,

\[ \mathscr{C}_g^{\mathrm{van},\alpha} \overset{\mathrm{def}}{=} \Big\{ \mathbf{X}\in\mathscr{C}_g^\alpha : \sup_{|t-s|<\varepsilon}\frac{|X_{s,t}|}{|t-s|^\alpha} + \sup_{|t-s|<\varepsilon}\frac{|\mathbb{X}_{s,t}|}{|t-s|^{2\alpha}}\to 0\text{ as }\varepsilon\to 0\Big\}. \]

i) Show the inclusions 𝒞_g^{0,α} ⊂ 𝒞_g^{van,α} and also 𝒞_g^β ⊂ 𝒞_g^{van,α}, whenever
α < β. Show that the inclusion 𝒞_g^{van,α} ⊂ 𝒞_g^α is strict.
ii) Assume dim V < ∞ from here on. Show 𝒞_g^{0,α} = 𝒞_g^{van,α}. (Hint: use the
“geodesic” approximations from Proposition 2.8.)
iii) From ii) we have 𝒞_g^β ⊂ 𝒞_g^{0,α} ⊂ 𝒞_g^α, whenever 1/3 < α < β ≤ 1/2. Show that
one has the compact embedding (Hint: Arzelà–Ascoli)

\[ \mathscr{C}_g^\beta \hookrightarrow \mathscr{C}_g^{0,\alpha}. \]

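c) Discuss similar statements for non-geometric rough path spaces. In particular,
discuss the validity of

\[ \mathscr{C}^{0,\alpha} \overset{\mathrm{def}}{=} \mathrm{cl}(\mathscr{C}^\infty) = \mathscr{C}^{\mathrm{van},\alpha}, \]

and also, cf. Exercise 2.11, c),

\[ \mathscr{C}^{0,\alpha} \leftrightarrow \mathscr{C}_g^{0,\alpha}\times C^{0,2\alpha}; \]

for α = 1/2 this fails, why?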


Remark: This is essentially taken from [FV06a], for a recent extension to
∗ Exercise 2.13 Show that for every geometric 1/2-Hölder rough path, 𝐗 ∈ 𝒞_g^{0,1/2},
𝕏 is necessarily the iterated Riemann–Stieltjes integral of the underlying path X ∈
C^{0,1/2}. Show also that there exists X ∈ C^{0,1/2} (with values in R²) such that the
iterated Riemann–Stieltjes integrals do not exist. This further shows that the Lyons–
Victoir extension (Exercise 2.14, part d) can fail for α-Hölder rough paths when
1/α ∈ N.

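Solution. We use 𝒞_g^{0,α} ⊂ 𝒞_g^{van,α}, Exercise 2.12, for α = 1/2. Consider a dissection
{s = τ_0 < τ_1 < · · · < τ_N = t} with mesh ≤ ε. It follows from Chen’s relation (2.1),
in the form (2.26), that

\[ \Big|\mathbb{X}_{s,t} - \sum_{0\le i<N} X_{s,\tau_i}\otimes X_{\tau_i,\tau_{i+1}}\Big| = \Big|\sum_{0\le i<N}\mathbb{X}_{\tau_i,\tau_{i+1}}\Big| \le C(\varepsilon)\sum_{0\le i<N}|\tau_{i+1}-\tau_i| \le T\,C(\varepsilon), \]

where C(ε) → 0 as ε → 0 by the vanishing Hölder oscillation of 𝕏 (recall 2α = 1).
It follows that 𝕏_{s,t} is the limit of the above Riemann–Stieltjes sum.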


Regarding the second question, a counterexample is found in [FV10b, Ex.9.14
(iii)].

♯∗ Exercise 2.14 (Lyons–Victoir extension [LV07]) Let α ∈ (0, 1/2) and consider
X ∈ C^α([0, T], L(V, W)), Y ∈ C^α([0, T], V) and Z ∈ C_2^{2α}([0, T], W). We omit
[0, T] and the precise target space in what follows. We here say that Chen’s relation
holds if, for every triple of times (s, t, u),

\[ Z_{s,u} = Z_{s,t} + Z_{t,u} + Y_{s,t}X_{t,u}. \]

(This is the algebraic relation satisfied by (s, t) ↦ ∫_s^t Y_{s,r} dX_r whenever X ∈ C¹.)
a) Show that there exists a bilinear continuous map Φ : C^α × C^α → C_2^{2α},

\[ (Y,X)\mapsto Z := \Phi(Y,X) \]

such that Chen’s relation holds.

b) Show that the restriction of Φ to Hölder paths with exponent β ∈ (1/2, 1)
cannot possibly be continuous as a map C^β × C^β → C_2^{2β}. (Hint: the Chen
relation would force Φ(Y, X) to coincide with the Young integral ∫Y dX. In
particular, Φ_{0,·} would have to coincide with ∫_0^· Y(t)Ẋ(t) dt in the case of smooth
paths. Proposition 1.1 then allows one to conclude.)
c) Show however that Φ can be constructed such that its restriction to a map
C β × C β → C β , where the image is now regarded as path t 7→ Φ(Y, X)0,t , is a
bilinear continuous map.
d) Let α ∈ (1/3, 1/2). Show that every path X ∈ C α ([0, T ], V ) admits a (if so
desired: geometric) rough path lift (X, X) ∈ C α ([0, T ], V ).
e) Conclude that the nonlinear rough path space C α ([0, T ], V ) is in (non-canonical)
one-one correspondence with the linear space C α ([0, T ], V ) ⊕ C 2α ([0, T ], V ⊗
V ). (For a generalisation of this to rough paths of low regularity see [TZ18].)

Solution. We show a) and c) together; d) is really a variation / consequence of a)


and we leave b) and e) to the reader. Without loss of generality, T = 1. Write
Z(s,t] ≡ Zs,t and similarly for the path increments of Y, X. We want to construct Z
such that
ZI = ZL + ZR + YL ⊗ XR
whenever I = (s, t] is the union of two adjacent “left and right” intervals L and R,
and such that
\[ |Z_I| \lesssim |I|^{2\alpha} \tag{$\star$} \]

where |I| = |t − s|. By a continuity and chaining argument (see the proof of
Theorem 3.1 below), it is enough to do so for dyadic times, i.e. s, t ∈ ⋃_{n⩾0} D_n
where D_0 = {(0, 1]}, D_1 = {(0, 1/2], (1/2, 1]} and so on. We start with the (ad-
hoc!) choice Z_{0,1} ≡ Z_{(0,1]} = 0 and note its (trivial) bilinearity in (Y, X). Assume
now Z_I for I ∈ D_{n−1} has been constructed. Write I as the union of two nth level
dyadic intervals, I = L ∪ R. Make the (ad-hoc) imposition Z_L = Z_R which leads to

\[ Z_L = Z_R = \tfrac12\big(Z_I - Y_L\otimes X_R\big). \]

(Note that bilinear dependence in Y, X is preserved.) On the analytic side, we have

\[ |Z_L| = |Z_R| = \tfrac12|Z_I - Y_L\otimes X_R| \le \tfrac12|Z_I| + \tfrac12|Y_L|\cdot|X_R| \]

and, setting a_n := sup_{J∈D_n} |Z_J|/|J|^{2α} = 2^{2nα} sup_{J∈D_n} |Z_J|, it follows that

\[ a_n \le 2^{-(1-2\alpha)}a_{n-1} + \tfrac12\|Y\|_\alpha\|X\|_\alpha, \]

so that the sequence (a_n) is bounded since 1 − 2α > 0. In fact, one easily obtains
the bound

\[ \sup_{n\ge 0}|a_n| \lesssim \|Y\|_\alpha\|X\|_\alpha, \]

with proportionality constant only depending on α < 1/2. This implies the estimate
(⋆) and also settles continuity of Φ = Φ(Y, X). It remains to show that t ↦ Z_{0,t} ∈ C^β
whenever Y, X ∈ C^β and β ∈ (1/2, 1). But this is an immediate consequence of the
bound

\[ |Z_{0,t}-Z_{0,s}| \le |Z_{s,t}| + |Y_{0,s}|\cdot|X_{s,t}|, \]

noting that, thanks to the first part of the solution, |Z_{s,t}| ≲ |t − s|^{2α} for all 2α < 1.
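The dyadic construction in this solution is short enough to implement. The following sketch is an illustration under simplifying assumptions (scalar-valued paths Y and X on [0, 1], so that ⊗ is ordinary multiplication); the names `lyons_victoir_dyadic` and `scaled_sup` are hypothetical. It builds Z on dyadic intervals via Z_L = Z_R = (Z_I − Y_L X_R)/2 and reports the quantities a_n.

```python
def lyons_victoir_dyadic(y, x, depth):
    """Z on dyadic intervals of [0, 1]: Z[(n, k)] belongs to (k 2^-n, (k+1) 2^-n]."""
    Z = {(0, 0): 0.0}                       # ad-hoc choice on the top interval (0, 1]
    for n in range(1, depth + 1):
        h = 2.0 ** (-n)
        for k in range(2 ** (n - 1)):       # k indexes the parent interval I at level n-1
            left, mid, right = 2 * k * h, (2 * k + 1) * h, (2 * k + 2) * h
            y_L = y(mid) - y(left)          # increment of Y over the left half L
            x_R = x(right) - x(mid)         # increment of X over the right half R
            # ad-hoc imposition Z_L = Z_R, forced to satisfy Z_I = Z_L + Z_R + Y_L X_R
            Z[(n, 2 * k)] = Z[(n, 2 * k + 1)] = 0.5 * (Z[(n - 1, k)] - y_L * x_R)
    return Z


def scaled_sup(Z, n, alpha):
    """a_n = sup over level-n dyadic intervals J of |Z_J| / |J|^(2 alpha)."""
    return 2.0 ** (2 * n * alpha) * max(abs(Z[(n, k)]) for k in range(2 ** n))


alpha = 0.4
x = lambda t: t ** alpha            # alpha-Hölder with Hölder constant at most 1
y = lambda t: (1.0 - t) ** alpha    # likewise
Z = lyons_victoir_dyadic(y, x, depth=12)
print([round(scaled_sup(Z, n, alpha), 3) for n in range(1, 13)])  # stays bounded in n
```

With these test paths the recursion a_n ≤ 2^{−(1−2α)} a_{n−1} + 1/2 gives a uniform bound on the printed values, in line with the estimate (⋆) on dyadic intervals.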

Exercise 2.15 (Translation of rough paths) Fix α ∈ (1/3, 1/2] and 𝐗 = (X, 𝕏) ∈
𝒞^α([0, T], R^d). For sufficiently smooth h : [0, T] → R^d, the translation of 𝐗 in
direction h is given by

\[ T_h(\mathbf{X}) \overset{\mathrm{def}}{=} \big(X^h, \mathbb{X}^h\big), \]

where X^h := X + h and

\[ \mathbb{X}^h_{s,t} := \mathbb{X}_{s,t} + \int_s^t h_{s,r}\otimes dX_r + \int_s^t X_{s,r}\otimes dh_r + \int_s^t h_{s,r}\otimes dh_r. \tag{2.29} \]

a) Assume h ∈ C¹. (In particular, the last three integrals above are well-defined
Riemann–Stieltjes integrals.) Show that for fixed h, the translation operator
T_h : 𝐗 ↦ T_h(𝐗) is a continuous map from 𝒞^α into itself.
b) By convention, h ∈ C¹ means Lipschitz or equivalently h ∈ W^{1,∞}, where W^{1,q}
denotes the space of absolutely continuous paths h with derivative ḣ ∈ L^q.
Weaken the assumption on h by only requiring ḣ ∈ L^q, for suitable q = q(α).
Show that q = 2 (“Cameron–Martin paths of Brownian motion”) works for all
α ≤ 1/2. (As a matter of fact, the integrals appearing in (2.29) make sense for
every q ≥ 1, but the resulting translated “rough path” falls out of the class of
Hölder rough paths. One can resolve this issue by switching to (1/α)-variation
rough paths.)
c) Call any 𝐡 = (h, H) : [0, T] → R^d ⊕ (R^d)^{⊗2} = T_0^{(2)}, with h ∈ W^{1,2} and
H ∈ C^{2α}, an admissible perturbation. With some notational overloading (T is
also used for the second order translation introduced in Exercise 2.11), show that

\[ T_{\mathbf h} := T_h\circ T_H = T_H\circ T_h \]

is a well-defined action on 𝒞^α, in the sense of T_𝐠 ∘ T_𝐡 = T_{𝐠+𝐡}. Show that for any
fixed (a, b) ∈ T_0^{(2)}, the constant speed perturbation t ↦ (at, bt) is admissible,
which then yields an action of T_0^{(2)} with its additive structure on 𝒞^α. Show that
these statements remain true for 𝒞_g^α provided admissible perturbations take
values in the Lie algebra 𝔤^{(2)} = R^d ⊕ so(d) as introduced in Section 2.3.

Remark: Some far-reaching extensions of this are found in [BCFP19]. Constant
speed perturbations respect stationarity of the noise (stationary increments of the
process) and thus serve as elementary examples of (algebraic) renormalisation
of models in regularity structures. The (abelian) groups (𝔤^{(2)}, +) and (T_0^{(2)}, +),
together with their action 𝐡 ↦ T_𝐡, are examples of a renormalisation group in
the sense of Section 15.5.1.

2.6 Comments

Many early works in stochastic analysis starting from Itô (and then in no particular
order Kunita, Yamato, Sugita, Azencott, Ben Arous [BA89], etc) and in control theory
(Magnus, Brockett, Sussmann, Fliess [FNC82], etc) have recognised the importance
of iterated integrals of the driving noise / signal; many references are given in [Lyo98]
and the books [LQ02, LCL07, FV10b].
The notion of rough path is due to Lyons and was introduced in [Lyo98] in p-
variation sense, p ∈ [1, ∞), and over Banach spaces. Earlier notes [Lyo94, Lyo95]
already dealt with α-Hölder rough paths for α ∈ (1/3, 1/2].


The analytical aspects of rough paths are related to Young’s seminal work
[You36], revisited in Chapter 4. On the algebraic side, Chen’s relation is rooted
in [Che54, Che57] and encodes abstractly basic additivity properties of iterated
integrals. A key observation of Chen [Che57, Che58] was that log signatures are
Lie series; the description via shuffles (cf. Section 2.4) is due to Ree [Ree58] (see
also [Che71]). It follows from the works of Chow and Rashevskii [Cho39, Ras38],
also [Che57, Che58], that this map is, upon truncation, onto: for every element
x ∈ G^(N)(R^d) := exp(𝔤^(N)(R^d)) there exists a smooth path γ : [0, 1] → R^d
with prescribed signature x = S^(N)(γ). The shortest such path can be viewed as a
sub-Riemannian geodesic; concatenation of such geodesics is then a natural way to
approximate weakly geometric rough paths (cf. Proposition 2.8) and underlies the
geometric approach of Friz–Victoir [FV05, FV10b], surveyed from a sub-Riemannian
perspective in [FG16a]. The polynomial nature of (truncated) shuffle relations
and log Lie conditions recently led Améndola, Friz and Sturmfels [AFS19] to the
study of signature varieties in computational algebraic geometry.
Up to equivalence under a generalised notion of reparameterisation of paths known
as treelike equivalence, the “full” signature map γ ↦ S(γ) ∈ G((V)) ⊂ T((V)) was
shown to be injective by Chen [Che58] in the case of piecewise smooth paths, Hambly–
Lyons [HL10] in the case of rectifiable paths, and Boedihardjo et al. [BGLY16] in the case of
weakly geometric rough paths of arbitrarily low regularity, see also Boedihardjo, Ni
and Qian [BNQ14]. The inversion problem “signature ↦ path” is studied by Lyons–
Xu [LX17, LX18] and [AFS19]. All this is part of the mathematical justification of
the signature method in machine learning, see e.g. Lyons’ ICM article [Lyo14] and
the survey [CK16].
For some constructions of level-2 geometric rough paths motivated from harmonic
analysis see Hara–Lyons [HL07] and Lyons–Yang [LY13], see also the comments in
Section 3.8 for some martingale constructions related to harmonic analysis. Lyons–
Qian, in their monograph [LQ02], work with geometric rough paths (over a Banach
space V), by definition limits of canonically lifted smooth paths. The strict inclusion
“geometric ⊂ weakly geometric” was somewhat blurred in the earlier rough paths
literature. For dim V < ∞, matters were clarified in [FV06a]. For a discussion
of weakly geometric rough paths over Banach spaces in their own right, see e.g.
[CDLL16], see also the supplementary appendix [BGLY15] of [BGLY16]. The
discussion in Section 2.4, the “shuffle” view on weakly geometric rough paths and
then Gubinelli’s branched rough paths [Gub10], also extends from V = Rd to infinite
dimension but setting up basis-independent notations is somewhat more involved.
See for example [CW16, CCHS20] for some recent results in this direction.
“Naïve” higher order non-geometric rough paths with values in T_1^(N)(V) are
called in [Lyo98] multiplicative functionals (with α-Hölder or p-variation regularity,
⌊p⌋ = N), insisting on their inability to handle nonlinearities when N ≥ 3. The
notion of branched rough path, for any α ∈ (0, 1], further studied in [HK15, FZ18,
BCFP19, BC19, TZ18] provides the required extra information when N ≥ 3; for
N = ⌊1/α⌋ = 2 there is no difference. It is possible to embed spaces of non-
geometric rough paths of low regularity into suitable spaces of geometric rough
paths, see [LV06] or Exercise 2.11 part c) when N = 2. The case of very low
regularities, with N large, is much more involved and studied by Hairer–Kelly
[HK15] and later Boedihardjo–Chevyrev [BC19].
Rough paths with jumps, in p-variation scale, are studied in [Wil01, FS17, FZ18,
CF19]; previously introduced discrete rough paths [Kel16] are also accommodated e.g.
by the càdlàg rough path setting of [FZ18]. See also the comment Sections 4.8, 5.6
and 9.6. Rough paths in a geometric ambient space have been studied by Cass, Driver,
Litterer and Lyons in [CLL12, CDL15], see also Bailleul [Bai19] for rough paths on
Banach manifolds.
Chapter 3
Brownian motion as a rough path

In this chapter, we consider the most important example of a rough path, which is the
one associated to Brownian motion. We discuss the difference, at the level of rough
paths, between Itô and Stratonovich Brownian motion. We also provide a natural
example of approximation to Brownian motion which converges to neither of them.

3.1 Kolmogorov criterion for rough paths


Consider random X(ω) : [0, T] → V and 𝕏(ω) : [0, T]² → V ⊗ V, subject to (2.1).
Equivalently, following Exercise 2.4, we can think of

\[ \mathbf{X}(\omega) \equiv (X,\mathbb{X})(\omega) : [0,T]\to V\oplus(V\otimes V) \]

as a (random) path. The basic example, of course, is that of d-dimensional standard
Brownian motion B enhanced with

\[ \mathbb{B}_{s,t} \overset{\mathrm{def}}{=} \int_s^t B_{s,r}\otimes dB_r \in \mathbb{R}^d\otimes\mathbb{R}^d \cong \mathbb{R}^{d\times d}. \tag{3.1} \]

The integration here is understood either in Itô or Stratonovich sense (in the latter
case, we would write ◦dB); sometimes we indicate this by writing 𝔹^{Itô} resp. 𝔹^{Strat}. It
should be noted that the antisymmetric part of 𝔹, also known as Lévy’s stochastic
area, with values in so(d), is not affected by the choice of stochastic integration.
Condition (2.1) is seen to be valid with either choice, while condition (2.6) only
holds in the Stratonovich case. We now address the question of α- resp. 2α-Hölder
regularity of X resp. 𝕏 by a suitable extension of the classical Kolmogorov criterion;
the application to Brownian motion is then carried out in detail in the following
subsection.
Recalling that B ∈ C^α([0, T], R^d), a.s. for any α < 1/2, we now address the
question of 2α-Hölder regularity for 𝔹.
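The algebraic statements in the paragraph on Itô versus Stratonovich integration have exact discrete analogues, which the following sketch illustrates on a simulated Brownian path. It is not taken from the text: the helper names (`brownian_path`, `ito_lift`, `strat_lift`) are hypothetical, left-point sums stand in for Itô integrals and mid-point-corrected sums for Stratonovich integrals, and condition (2.6) is assumed to read Sym(𝔹) = ½ B ⊗ B.

```python
import numpy as np


def brownian_path(num_steps, dim, seed=0):
    """Brownian motion on [0, 1], sampled on a uniform grid with num_steps steps."""
    rng = np.random.default_rng(seed)
    dB = rng.normal(scale=np.sqrt(1.0 / num_steps), size=(num_steps, dim))
    return np.vstack([np.zeros((1, dim)), np.cumsum(dB, axis=0)])


def ito_lift(B, i, j):
    """Left-point Riemann sum for int B_{s,r} (x) dB_r between grid indices i <= j."""
    inc = np.diff(B[i:j + 1], axis=0)
    start = B[i:j] - B[i]
    return start.T @ inc


def strat_lift(B, i, j):
    """Stratonovich-type sum: Itô sum plus half of the discrete quadratic variation."""
    inc = np.diff(B[i:j + 1], axis=0)
    return ito_lift(B, i, j) + 0.5 * inc.T @ inc


B = brownian_path(2**12, dim=2)
i, j, k = 0, 1000, 4096
increment = B[k] - B[i]
ito, strat = ito_lift(B, i, k), strat_lift(B, i, k)

# Chen's relation holds for both choices (shown here for the Itô sums).
chen_rhs = ito_lift(B, i, j) + ito_lift(B, j, k) + np.outer(B[j] - B[i], B[k] - B[j])
print(np.allclose(ito, chen_rhs))
# The symmetric part equals (1/2) B (x) B only in the Stratonovich case.
print(np.allclose(strat + strat.T, np.outer(increment, increment)))
print(np.allclose(ito + ito.T, np.outer(increment, increment)))
# The antisymmetric part (Lévy area) does not depend on the choice.
print(np.allclose(ito - ito.T, strat - strat.T))
```

The difference `strat - ito` is half the discrete quadratic variation, which for a fine grid is close to ½(t − s) times the identity matrix.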


Using Brownian scaling and exponential integrability of 𝔹_{0,1}, which is an imme-
diate consequence of the integrability properties of the second Wiener chaos, the
following result applies with β = 1/2 and all q < ∞. It gives the desired 2α-Hölder
regularity for 𝔹, a.s. for any α < 1/2. As a consequence, (B, 𝔹) ∈ 𝒞^α almost surely,
where we may take any α ∈ (1/3, 1/2) and 𝐁 ≡ (B, 𝔹) is known as Brownian rough
path or enhanced Brownian motion. In the Stratonovich case, thanks to (2.6), we
obtain a geometric rough path, i.e. (B, 𝔹^{Strat}) ∈ 𝒞_g^α.
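The scaling behind the choice β = 1/2 can be checked by simulation. The sketch below is a Monte Carlo illustration under stated assumptions: the function name `ito_area_second_moment`, the sample sizes and the seeds are all illustrative, and the Itô integral is replaced by a left-point sum. It estimates the second moment of one off-diagonal entry of 𝔹_{0,h}, which should scale like h², so that the L² norm of 𝔹_{s,t} is proportional to |t − s| = |t − s|^{2β}.

```python
import numpy as np


def ito_area_second_moment(h, num_steps=64, num_samples=20_000, seed=0):
    """Monte Carlo estimate of E[(int_0^h B^1_r dB^2_r)^2] using left-point sums."""
    rng = np.random.default_rng(seed)
    dB = rng.normal(scale=np.sqrt(h / num_steps), size=(num_samples, num_steps, 2))
    B1_left = np.cumsum(dB[:, :, 0], axis=1) - dB[:, :, 0]   # B^1 at left endpoints
    off_diagonal = np.sum(B1_left * dB[:, :, 1], axis=1)     # one entry of the second level
    return float(np.mean(off_diagonal ** 2))


m_long = ito_area_second_moment(1.0, seed=0)
m_short = ito_area_second_moment(0.25, seed=1)
print(m_long, m_short, m_long / m_short)   # the ratio should be close to (1 / 0.25)^2 = 16
```

For the continuous-time integral the second moment is h²/2; the discrete sum with N steps gives h²(1 − 1/N)/2, so the estimate for h = 1 should be slightly below 1/2.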

Theorem 3.1 (Kolmogorov criterion for rough paths). Let q ≥ 2, β > 1/q.
Assume, for all s, t in [0, T]

\[ |X_{s,t}|_{L^q} \le C|t-s|^{\beta}, \qquad |\mathbb{X}_{s,t}|_{L^{q/2}} \le C|t-s|^{2\beta}, \tag{3.2} \]

for some constant C < ∞. Then, for all α ∈ [0, β − 1/q), there exists a modification
of (X, 𝕏) (also denoted by (X, 𝕏)) and random variables K_α ∈ L^q, 𝕂_α ∈ L^{q/2}
such that, for all s, t in [0, T]

\[ |X_{s,t}| \le K_\alpha(\omega)|t-s|^{\alpha}, \qquad |\mathbb{X}_{s,t}| \le \mathbb{K}_\alpha(\omega)|t-s|^{2\alpha}. \tag{3.3} \]

In particular, if β − 1/q > 1/3 then, for every α ∈ (1/3, β − 1/q), we have homogeneous
rough path norm |||𝐗|||_α ∈ L^q and hence 𝐗 = (X, 𝕏) ∈ 𝒞^α almost surely.

Proof. The proof is almost the same as the classical proof of Kolmogorov’s continuity
criterion, as exposed for example in [RY99]. Without loss of generality take T = 1
and let D_n denote the set of integer multiples of 2^{−n} in [0, 1). As in the usual
criterion, it suffices to consider s, t ∈ ⋃_n D_n, with the values at the remaining times
filled in using continuity. (This is why in general one ends up with a modification.)
Note that the number of elements in D_n is given by #D_n = 1/|D_n| = 2^n. Set

\[ K_n = \sup_{t\in D_n}|X_{t,t+2^{-n}}|, \qquad \mathbb{K}_n = \sup_{t\in D_n}|\mathbb{X}_{t,t+2^{-n}}|. \]

It follows from (3.2) that

\[ \begin{aligned}
\mathbf{E}\big(K_n^q\big) &\le \mathbf{E}\sum_{t\in D_n}|X_{t,t+2^{-n}}|^q \le \frac{1}{|D_n|}C^q|D_n|^{\beta q} = C^q|D_n|^{\beta q-1}, \\
\mathbf{E}\big(\mathbb{K}_n^{q/2}\big) &\le \mathbf{E}\sum_{t\in D_n}|\mathbb{X}_{t,t+2^{-n}}|^{q/2} \le \frac{1}{|D_n|}C^{q/2}|D_n|^{2\beta q/2} = C^{q/2}|D_n|^{\beta q-1}.
\end{aligned} \]

Fix s < t in ⋃_n D_n and choose m : |D_{m+1}| < t − s ≤ |D_m|. The interval [s, t) can
be expressed as the finite disjoint union of intervals of the form [u, v) ∈ D_n with
n ≥ m + 1 and where no three intervals have the same length. In other words, we
have a partition of [s, t) of the form

\[ s = \tau_0 < \tau_1 < \cdots < \tau_N = t, \]

where (τ_i, τ_{i+1}) ∈ D_n for some n ≥ m + 1, and for each fixed n ≥ m + 1 there are
at most two such intervals taken from D_n. In this context, such a type of multiscale
decomposition is sometimes called a “chaining argument”. It follows that

\[ |X_{s,t}| \le \max_{0\le i<N}|X_{s,\tau_{i+1}}| \le \sum_{i=0}^{N-1}|X_{\tau_i,\tau_{i+1}}| \le 2\sum_{n\ge m+1}K_n, \]

and similarly,

\[ \begin{aligned}
|\mathbb{X}_{s,t}| &= \Big|\sum_{i=0}^{N-1}\big(\mathbb{X}_{\tau_i,\tau_{i+1}} + X_{s,\tau_i}\otimes X_{\tau_i,\tau_{i+1}}\big)\Big| \le \sum_{i=0}^{N-1}\big(|\mathbb{X}_{\tau_i,\tau_{i+1}}| + |X_{s,\tau_i}|\,|X_{\tau_i,\tau_{i+1}}|\big) \\
&\le \sum_{i=0}^{N-1}|\mathbb{X}_{\tau_i,\tau_{i+1}}| + \max_{0\le i<N}|X_{s,\tau_{i+1}}|\sum_{j=0}^{N-1}|X_{\tau_j,\tau_{j+1}}| \\
&\le 2\sum_{n\ge m+1}\mathbb{K}_n + \Big(2\sum_{n\ge m+1}K_n\Big)^2.
\end{aligned} \]

We thus obtain

\[ \frac{|X_{s,t}|}{|t-s|^{\alpha}} \le \sum_{n\ge m+1}\frac{1}{|D_{m+1}|^{\alpha}}2K_n \le \sum_{n\ge m+1}\frac{2K_n}{|D_n|^{\alpha}} \le K_\alpha, \]

where K_α := 2 Σ_{n≥0} K_n/|D_n|^α is in L^q. Indeed, since α < β − 1/q by assumption
and |D_n| to any positive power is summable, we have

\[ \|K_\alpha\|_{L^q} \le \sum_{n\ge 0}\frac{2}{|D_n|^{\alpha}}\big|\mathbf{E}\big(K_n^q\big)\big|^{1/q} \le \sum_{n\ge 0}\frac{2C}{|D_n|^{\alpha}}|D_n|^{\beta-1/q} < \infty. \]

Similarly,

\[ \frac{|\mathbb{X}_{s,t}|}{|t-s|^{2\alpha}} \le \sum_{n\ge m+1}\frac{1}{|D_{m+1}|^{2\alpha}}2\mathbb{K}_n + \Big(\sum_{n\ge m+1}\frac{1}{|D_{m+1}|^{\alpha}}2K_n\Big)^2 \le \mathbb{K}_\alpha + K_\alpha^2, \]

where 𝕂_α := 2 Σ_{n≥0} 𝕂_n/|D_n|^{2α} is in L^{q/2}. Indeed,

\[ \|\mathbb{K}_\alpha\|_{L^{q/2}} \le \sum_{n\ge 0}\frac{2}{|D_n|^{2\alpha}}\big|\mathbf{E}\big(\mathbb{K}_n^{q/2}\big)\big|^{2/q} \le \sum_{n\ge 0}\frac{2C}{|D_n|^{2\alpha}}|D_n|^{2\beta-2/q} < \infty, \]

thus concluding the proof. □


The reader will notice that the classical Kolmogorov criterion (KC) is contained
in the above proof and theorem by simply ignoring all considerations related to the
second-order process 𝕏. Let us also note in this context that the classical KC works
for processes (X_t : 0 ≤ t ≤ 1) with values in an arbitrary (separable) metric space
(it suffices to replace |X_{s,t}| by d(X_s, X_t) in the argument). This observation actually
gives an alternative and immediate proof of Theorem 3.1. All we have to do is to
remember from Proposition 2.6 that rough paths can always be viewed as bona fide
paths with values in a metric space, namely T_1^{(2)}, equipped with the homogeneous left-
invariant metric d(𝐗_s, 𝐗_t) ≍ |X_{s,t}| + |𝕏_{s,t}|^{1/2}. The moment assumption (3.2) is
then equivalent to |d(𝐗_s, 𝐗_t)|_{L^q} ≤ C|t − s|^β and we can conclude with the “metric”
form of KC. From Section 2.4, a version of this KC for “level-N” low regularity
rough paths is then also immediate. The reason we still like the pedestrian step-2
proof is that it is easily tweaked, e.g. to the case of the R²-valued process (B^H, B), the
pair of a fractional and standard Brownian motion, independent say, with Itô second
level 𝔹^H := ∫B^H dB, in the rough regime H ∈ (0, 1/2]. In this case β should
be replaced by the vector (β_1, β_2) = (H, 1/2) of regularities, and the conclusion
can be stated with α resp. 2α replaced by the vector (α_1, α_2) = (H^−, 1/2^−) resp.
(H + 1/2)^−.
Remark 3.2 (Warning). It is not possible to obtain (3.3) by applying the classical KC to the $(V \otimes V)$-valued process $(\mathbb{X}_{0,t} : 0 \le t \le T)$. Doing so only gives $|\mathbb{X}_{s,t}| = O(|t-s|^{\alpha})$ a.s. since one misses a crucial cancellation inherent in (cf. (2.1))

$$\mathbb{X}_{s,t} = \mathbb{X}_{0,t} - \mathbb{X}_{0,s} - X_{0,s} \otimes X_{s,t} .$$

That said, it is possible [Fri05] (but tedious) to use a 2-parameter version of the KC to see that $(s,t) \mapsto \mathbb{X}_{s,t}/|t-s|^{2\alpha}$ admits a continuous modification, which implies that $\|\mathbb{X}\|_{2\alpha}$ is finite almost surely.
Here is a similar result for rough path distances, say between X and X̃. Note
that, due to the nonlinear structure of rough path spaces, one cannot simply apply
Theorem 3.1 to the “difference” of two rough paths. Indeed, if we consider X̃ − X,
where addition is taken in the ambient Banach space $\mathcal{C}^{\alpha} \oplus \mathcal{C}_2^{2\alpha}$, then Chen's relation
is in general not satisfied.
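Theorem 3.3 (Kolmogorov criterion for rough path distance). Let $\alpha, \beta, q$ be as above in Kolmogorov's criterion (KC), Theorem 3.1. Assume that both $\tilde{\mathbf{X}} = (\tilde X, \tilde{\mathbb{X}})$ and $\mathbf{X} = (X, \mathbb{X})$ satisfy the moment condition in the statement of KC with some constant $C$. Set

$$\Delta X := \tilde X - X , \qquad \Delta\mathbb{X} := \tilde{\mathbb{X}} - \mathbb{X} ,$$

and assume that for some $\varepsilon > 0$ and all $s, t \in [0, T]$

$$|\Delta X_{s,t}|_{L^q} \le C\varepsilon|t-s|^{\beta} , \qquad |\Delta\mathbb{X}_{s,t}|_{L^{q/2}} \le C\varepsilon|t-s|^{2\beta} .$$

Then there exists $M$, depending increasingly on $C$, so that

$$\big|\,\|\Delta X\|_{\alpha}\big|_{L^q} \le M\varepsilon , \qquad \big|\,\|\Delta\mathbb{X}\|_{2\alpha}\big|_{L^{q/2}} \le M\varepsilon .$$

In particular, if $\beta - \frac1q > \frac13$ then, for every $\alpha \in \big(\frac13, \beta - \frac1q\big)$ we have $|||\tilde{\mathbf{X}}|||_{\alpha}, |||\mathbf{X}|||_{\alpha} \in L^q$ and

$$\big|\varrho_{\alpha}\big(\tilde{\mathbf{X}}, \mathbf{X}\big)\big|_{L^{q/2}} \le M\varepsilon .$$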

Proof. The proof is a straightforward modification of the proof of Theorem 3.1 and is left as an exercise to the reader. □

Often one has a sequence of (random) rough paths $\{\mathbf{X}^n \equiv (X^n, \mathbb{X}^n) : 1 \le n \le \infty\}$, such that the moment conditions in the statement of Kolmogorov's criterion hold with a constant $C$, uniformly over $1 \le n \le \infty$, and such that the difference bounds of Theorem 3.3 hold for the pair $(\mathbf{X}^n, \mathbf{X}^\infty)$ with $\varepsilon = \varepsilon_n \to 0$. Theorem 3.3 now quantifies the convergence $\mathbf{X}^n \to \mathbf{X}^\infty$, with rates given by

$$|\varrho_{\alpha}(\mathbf{X}^n, \mathbf{X}^\infty)|_{L^{q/2}} \lesssim \varepsilon_n .$$

Of course, when $\varepsilon_n$ decays sufficiently fast, a Borel–Cantelli argument also gives almost sure convergence with suitable rates.

3.2 Itô Brownian motion

Consider a $d$-dimensional standard Brownian motion $B$ enhanced with its iterated integrals

$$\mathbb{B}_{s,t} \overset{\text{def}}{=} \int_s^t B_{s,r} \otimes dB_r \in \mathbb{R}^d \otimes \mathbb{R}^d \cong \mathbb{R}^{d\times d} , \tag{3.4}$$

where the stochastic integration is understood in the sense of Itô. Sometimes we indicate this by writing $\mathbb{B}^{\text{Itô}}$. We shall assume straight away that $B_t$ and $\mathbb{B}_{s,t}$ are continuous in $t$ and $s, t$ respectively, with probability one. For instance, if one takes as granted that, almost surely, Brownian motion and indefinite Itô integrals against Brownian motion (such as $\mathbb{B}_{0,\cdot}$) are continuous, then it suffices to (re)define the second order increments as $\mathbb{B}_{s,t} = \mathbb{B}_{0,t} - \mathbb{B}_{0,s} - B_s \otimes B_{s,t}$. Of course, by additivity of the Itô integral, this coincides a.s. with the earlier definition. En passant, (2.1) is then immediately satisfied, for all times, on a common set of probability one.
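The (re)definition of the second order increments can be tried out on a discretised path. The following is only a sketch under stated assumptions: the Itô integral is replaced by left-point Riemann sums on a grid, and the names `ito_sums` and `second_level` are hypothetical helpers, not notation from the text. For such sums, Chen's relation $\mathbb{B}_{s,t} = \mathbb{B}_{s,u} + \mathbb{B}_{u,t} + B_{s,u} \otimes B_{u,t}$ holds exactly, simultaneously for all grid times.

```python
import numpy as np

def ito_sums(B):
    """Left-point sums I[k] = sum_{l<k} B[l] (x) (B[l+1] - B[l]), with I[0] = 0."""
    dB = np.diff(B, axis=0)
    terms = np.einsum("ki,kj->kij", B[:-1], dB)
    d = B.shape[1]
    return np.concatenate([np.zeros((1, d, d)), np.cumsum(terms, axis=0)])

def second_level(B, I, s, t):
    """BB_{s,t} := BB_{0,t} - BB_{0,s} - B_s (x) B_{s,t}, for grid indices s <= t."""
    return I[t] - I[s] - np.outer(B[s], B[t] - B[s])

rng = np.random.default_rng(0)
n, d = 1000, 2
dB = rng.normal(scale=np.sqrt(1.0 / n), size=(n, d))
B = np.concatenate([np.zeros((1, d)), np.cumsum(dB, axis=0)])
I = ito_sums(B)

s, u, t = 100, 400, 900  # three grid indices with s < u < t
lhs = second_level(B, I, s, t)
rhs = (second_level(B, I, s, u) + second_level(B, I, u, t)
       + np.outer(B[u] - B[s], B[t] - B[u]))
chen_defect = np.abs(lhs - rhs).max()

# The redefinition agrees with summing increments started afresh at time s.
direct = (B[s:t] - B[s]).T @ dB[s:t]
redefinition_defect = np.abs(lhs - direct).max()
```

Both defects vanish up to floating point error, which mirrors the remark that (2.1) holds for all times on one common set of full probability.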

Proposition 3.4. For any $\alpha \in \big(\frac13, \frac12\big)$, with probability one,

$$\mathbf{B}^{\text{Itô}} = (B, \mathbb{B}^{\text{Itô}}) \in \mathscr{C}^{\alpha}([0,T], \mathbb{R}^d) .$$

In fact, the homogeneous rough path norm $|||\mathbf{B}^{\text{Itô}}|||_{\alpha}$ has Gaussian tails.

Proof. Using Brownian scaling and finite moments of $\mathbb{B}_{0,1}$, which are immediate from integrability properties of the (homogeneous) second Wiener–Itô chaos, the KC for rough paths applies with $\beta = 1/2$ and all $q < \infty$. (As an exercise, the reader may want to show finite moments of $\mathbb{B}_{0,1}$ without chaos arguments; an elementary way to do so is via conditioning, Itô isometry, and reflection principle.) The integrability $|||\mathbf{B}^{\text{Itô}}|||_{\alpha} \in L^q$, any $q < \infty$, is clear from KC. The Gaussian integrability (and hence tails) can be obtained by carefully tracking the moment growth in Theorem 3.1 applied to $\mathbf{B}^{\text{Itô}}$; alternatively, see Theorem 11.9 below for an elegant Gaussian argument. □

Observe that Brownian motion enhanced with its iterated Itô integrals (2nd order calculus!) yields a (random) rough path but not a geometric rough path which is, by definition, an object with hardwired first order behaviour. Indeed, Itô's formula yields the identity

$$d(B^i B^j) = B^i\,dB^j + B^j\,dB^i + d\langle B^i, B^j\rangle_t , \qquad i, j = 1, \dots, d ,$$

so that, writing Id for the identity matrix in $d$ dimensions, we have for $s < t$,

$$\mathrm{Sym}\big(\mathbb{B}^{\text{Itô}}_{s,t}\big) = \frac12 B_{s,t} \otimes B_{s,t} - \frac12 \mathrm{Id}\,(t-s) \ne \frac12 B_{s,t} \otimes B_{s,t} ,$$

in contradiction with (2.6).
Let us finally mention that Brownian motion with values in infinite-dimensional
spaces can also be lifted to rough paths, see the exercise section.
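The failure of (2.6) for the Itô lift has an exact discrete counterpart, sketched below under the assumption that the Itô integral is approximated by left-point sums over a uniform grid (the variable names are illustrative only). The symmetric part of the left-point sum equals $\frac12 B_{0,T} \otimes B_{0,T}$ minus one half of the discrete quadratic variation, and the latter is close to $T\,\mathrm{Id}$ for a fine grid.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, T = 100_000, 3, 1.0
dB = rng.normal(scale=np.sqrt(T / n), size=(n, d))
B = np.concatenate([np.zeros((1, d)), np.cumsum(dB, axis=0)])

ito = B[:-1].T @ dB   # sum_l B_l (x) (B_{l+1} - B_l), left-point rule
qv = dB.T @ dB        # discrete quadratic (co)variation matrix
BT = B[-1]

sym_ito = 0.5 * (ito + ito.T)
# Exact algebraic identity: Sym(ito) = B_T (x) B_T / 2 - qv / 2.
exact_defect = np.abs(sym_ito - (0.5 * np.outer(BT, BT) - 0.5 * qv)).max()
# Random, but small for large n: qv is close to T * Id.
qv_error = np.abs(qv - T * np.eye(d)).max()
```

The first quantity is zero up to rounding for every sample path; the second is a statistical statement and only small with high probability.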

3.3 Stratonovich Brownian motion

In the previous section we defined $\mathbb{B}^{\text{Itô}}$ by Itô integration of $d$-dimensional Brownian motion $B$ against itself. Now, for (scalar) continuous semimartingales, $M, N$ say, the Stratonovich integral is defined as

$$\int_0^t M \circ dN := \int_0^t M\,dN + \frac12 \langle M, N\rangle_t$$

and has the advantage of a first order calculus. For instance, one has the first order product rule

$$d(MN) = M \circ dN + N \circ dM .$$

One can then define $\mathbb{B}^{\text{Strat}}$ by (component-wise) Stratonovich integration of Brownian motion against itself. Using basic results on quadratic variation of Brownian motion, namely $d\langle B^i, B^j\rangle_t = \delta^{i,j}\,dt$ where $\delta^{i,j} = 1$ if $i = j$, zero else, we see that

$$\mathbb{B}^{\text{Strat}}_{s,t} = \mathbb{B}^{\text{Itô}}_{s,t} + \frac12 \mathrm{Id}\,(t-s) . \tag{3.5}$$

Note that the difference between $\mathbb{B}^{\text{Strat}}$ and $\mathbb{B}^{\text{Itô}}$ is symmetric, so that the antisymmetric parts of the two processes (Lévy's stochastic area) are identical.
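The first order product rule and the Itô–Stratonovich correction can be illustrated on a grid. In the sketch below, the Stratonovich integral is replaced by trapezoidal sums and the Itô integral by left-point sums; this discretisation, and the helper names `left_sum` and `trapezoid_sum`, are assumptions made for illustration. With these choices all three identities checked are exact at the discrete level.

```python
import numpy as np

rng = np.random.default_rng(2)
n, T = 100_000, 1.0
dM = rng.normal(scale=np.sqrt(T / n), size=n)
dN = rng.normal(scale=np.sqrt(T / n), size=n)
M = np.concatenate([[0.0], np.cumsum(dM)])
N = np.concatenate([[0.0], np.cumsum(dN)])

def left_sum(X, dY):
    """Ito-type sum: integrand evaluated at the left end point."""
    return float(np.sum(X[:-1] * dY))

def trapezoid_sum(X, dY):
    """Stratonovich-type sum: integrand averaged over both end points."""
    return float(np.sum(0.5 * (X[:-1] + X[1:]) * dY))

bracket = float(np.sum(dM * dN))  # discrete version of <M, N>_T

# d(MN) = M o dN + N o dM  (first order product rule)
strat_rule_defect = abs(M[-1] * N[-1] - trapezoid_sum(M, dN) - trapezoid_sum(N, dM))
# d(MN) = M dN + N dM + d<M, N>  (Ito product rule)
ito_rule_defect = abs(M[-1] * N[-1] - left_sum(M, dN) - left_sum(N, dM) - bracket)
# int M o dN = int M dN + <M, N> / 2
correction_defect = abs(trapezoid_sum(M, dN) - left_sum(M, dN) - 0.5 * bracket)
```

Since the correction term is symmetric in $(M, N)$, the antisymmetric combination `trapezoid_sum(M, dN) - trapezoid_sum(N, dM)` coincides with its left-point analogue, in line with the remark on Lévy's area.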
Proposition 3.5. For any $\alpha \in (1/3, 1/2)$, with probability one,

$$\mathbf{B}^{\text{Strat}} = (B, \mathbb{B}^{\text{Strat}}) \in \mathscr{C}_g^{\alpha}([0,T], \mathbb{R}^d) ,$$

and here again the homogeneous rough path norm $|||\mathbf{B}^{\text{Strat}}|||_{\alpha}$ has Gaussian tails.

Proof. Using (3.5), rough path regularity of $\mathbf{B}^{\text{Strat}}$ is immediately reduced to the already established Itô case. (Alternatively, one can use again the Kolmogorov criterion for rough paths; the only, insignificant, difference is that now $\mathbb{B}^{\text{Strat}}_{0,1}$ takes values in the inhomogeneous second chaos, due to the deterministic part $\mathrm{Id}/2$.) Next, $\mathbf{B}(\omega)$ is geometric since

$$\mathrm{Sym}\big(\mathbb{B}^{\text{Strat}}_{s,t}\big) = \frac12 B_{s,t} \otimes B_{s,t} ,$$

an immediate consequence of the first order product rule. Finally, integrability of $\mathbf{B}^{\text{Strat}}$ is clear from the already seen integrability of $\mathbf{B}^{\text{Itô}}$, proving the final claim. □

A typical realisation B(ω) is called Brownian rough path, as a process B = BStrat
is a.k.a. (Stratonovich) enhanced Brownian motion. It is a deterministic feature
of every weakly geometric rough path (X, X) that it can be approximated – in the
precise sense of Proposition 2.8 – by smooth paths in the rough path topology. Such
approximations require knowledge not only of the underlying path X, but of the
entire rough path, including the second-order information X.
In contrast, one has the probabilistic statement that piecewise linear, mollifier
and many other “obvious” approximations still converge in rough path sense. More
specifically, in the present context of d-dimensional standard Brownian motion, we
now give an elegant proof of this based on (discrete-time!) martingale arguments.
Proposition 3.6. Consider dyadic piecewise linear approximations $(B^{(n)})$ to $B$ on $[0, T]$. That is, $B^{(n)}_t = B_t$ whenever $t = iT/2^n$ for some integer $i$, and linearly interpolated on intervals $[iT/2^n, (i+1)T/2^n]$. Then, with probability one,

$$\Big(B^{(n)}, \int_0^{\cdot} B^{(n)} \otimes dB^{(n)}\Big) \to (B, \mathbb{B}^{\text{Strat}}) \quad \text{in } \mathscr{C}_g^{\alpha} .$$

(The integral on the left-hand side is understood as classical Riemann–Stieltjes integral.)
Remark 3.7. With Theorem 3.3, one can see rough path convergence (in probability,
and actually Lq , any q < ∞) of piecewise linear approximation along any sequence
of dissections with mesh tending to zero. Moreover, this approach will give the rate
θ, any θ < 1/2 − α.
Proof. It is easy to check that $B$ gives $B^{(n)}$ via conditioning on $B$ at dyadic times,

$$B^{(n)} = E\big(B \,\big|\, \sigma\{B_{kT2^{-n}} : 0 \le k \le 2^n\}\big) .$$

By independence of the components $B^i, B^j$ for $i \ne j$, the same holds for $\mathbb{B}^{\text{Strat}}$ off-diagonal; the on-diagonal terms require no further attention since $\mathbb{B}^{\text{Strat};i,i}_{s,t} = \frac12 (B^i_{s,t})^2$. Almost sure pointwise convergence then readily follows from martingale convergence. Furthermore, Theorem 3.1 implies

$$|B^i_{s,t}| \le K_{\alpha}(\omega)|t-s|^{\alpha} , \qquad |\mathbb{B}^{\text{Strat};i,j}_{s,t}| \le \mathbb{K}_{\alpha}(\omega)|t-s|^{2\alpha} ,$$

and upon conditioning with respect to $\sigma\{B_{kT2^{-n}} : 0 \le k \le 2^n\}$, the same bounds hold for $B^{(n);i}$ and for $\int_0^{\cdot} B^{(n);i}\,dB^{(n);j}$. In fact, $K_{\alpha}, \mathbb{K}_{\alpha}$ have (more than enough) integrability to apply Doob's maximal inequality. This leads, with probability one, to the bound

$$\sup_n \Big\| \Big( B^{(n)}, \int_0^{\cdot} B^{(n)} \otimes dB^{(n)} \Big) \Big\|_{2\alpha} < \infty .$$

Together with a.s. pointwise convergence, a (deterministic) interpolation argument shows a.s. convergence with respect to the $\alpha$-Hölder rough path metric $\varrho_{\alpha}$. □
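Dyadic piecewise linear approximations are easy to experiment with, since the Riemann–Stieltjes iterated integral of a polygonal path is a finite sum over its segments. The sketch below assumes that a Brownian path sampled on a fine dyadic grid stands in for $B$; the helper `pl_iterated_integral` and the variable names are hypothetical. It records, level by level, the Lévy area of $B^{(n)}$ over $[0, T]$ and checks that the symmetric part is always $\frac12 B_{0,T} \otimes B_{0,T}$, as it must be for a geometric rough path.

```python
import numpy as np

def pl_iterated_integral(points):
    """int_0^T x_{0,r} (x) dx_r for the piecewise linear path through `points`.

    On each segment the integrand is linear and dx is constant, so the
    contribution is (segment midpoint - x_0) (x) (segment increment).
    """
    inc = np.diff(points, axis=0)
    mid = 0.5 * (points[:-1] + points[1:]) - points[0]
    return mid.T @ inc

rng = np.random.default_rng(3)
levels, d, T = 14, 2, 1.0
n_fine = 2 ** levels
dB = rng.normal(scale=np.sqrt(T / n_fine), size=(n_fine, d))
B = np.concatenate([np.zeros((1, d)), np.cumsum(dB, axis=0)])

areas, sym_defects = [], []
for n in range(levels + 1):
    pts = B[:: 2 ** (levels - n)]          # B sampled at times i T / 2^n
    bb = pl_iterated_integral(pts)
    areas.append(0.5 * (bb[0, 1] - bb[1, 0]))
    sym_defects.append(np.abs(0.5 * (bb + bb.T) - 0.5 * np.outer(B[-1], B[-1])).max())

area_steps = np.abs(np.diff(areas))        # |area at level n+1 - area at level n|
```

In typical runs the entries of `area_steps` shrink as the level grows, which is the numerical shadow of the martingale convergence used in the proof; the code makes no claim about rates.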

The reader should be warned that there are perfectly smooth and uniform approximations to Brownian motion, which do not converge to Stratonovich enhanced Brownian motion, but instead to some different geometric (random) rough path, such as

$$\bar{\mathbf{B}} = (B, \bar{\mathbb{B}}) , \quad \text{where } \bar{\mathbb{B}}_{s,t} = \mathbb{B}^{\text{Strat}}_{s,t} + (t-s)A , \qquad A \in so(d) .$$

Note that the difference between $\bar{\mathbb{B}}$ and $\mathbb{B}^{\text{Strat}}$ is now antisymmetric, i.e. $\bar{\mathbf{B}}$ has a stochastic area that is different from Lévy's area. To construct such approximations, it suffices to include oscillations (at small scales) such as to create the desired effect in the area, while they do not affect the limiting path, see Exercise 2.10. (In the context of Brownian motion and SDEs driven by Brownian motion such approximations were studied by McShane, Ikeda–Watanabe and others, see [McS72, IW89].) Although such “twisted” approximations do not seem to be the most obvious way to approximate Brownian motion, they also arise naturally in some perfectly reasonable situations.
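A standard way to see how small-scale oscillations create area is the planar path $x^n(t) = n^{-1}(\cos(n^2 t), \sin(n^2 t))$; choosing this particular example is an assumption on our part and need not coincide with the construction of Exercise 2.10. The path tends to zero uniformly, yet its iterated integral over $[0,1]$ approaches the antisymmetric matrix $\frac12\begin{pmatrix}0 & 1\\ -1 & 0\end{pmatrix}$. The sketch evaluates the iterated integral of a fine polygonal interpolation.

```python
import numpy as np

def second_level(points):
    """int_0^1 x_{0,r} (x) dx_r for the piecewise linear path through `points`."""
    inc = np.diff(points, axis=0)
    mid = 0.5 * (points[:-1] + points[1:]) - points[0]
    return mid.T @ inc

def oscillation(n, num_points=200_001):
    """Samples of x^n(t) = (cos(n^2 t), sin(n^2 t)) / n on a uniform grid of [0, 1]."""
    t = np.linspace(0.0, 1.0, num_points)
    return np.stack([np.cos(n ** 2 * t), np.sin(n ** 2 * t)], axis=1) / n

n = 20
x = oscillation(n)
sup_norm = np.abs(x).max()   # equals 1/n, so the path itself is small
area = second_level(x)       # close to [[0, 1/2], [-1/2, 0]]
```

Adding such an oscillation to a well-behaved approximation of $B$ is the mechanism behind a limit of the form $\bar{\mathbb{B}}_{s,t} = \mathbb{B}^{\text{Strat}}_{s,t} + (t-s)A$.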

3.4 Brownian motion in a magnetic field

Newton’s second law for a particle in R3 with mass m, and position x = x(t), (for
simplicity: constant) frictions α1 , α2 , α3 > 0 in orthonormal directions, subject
to a (3-dimensional) white noise in time, i.e. the distributional derivative of a 3-
dimensional Brownian motion B, reads

mẍ = −M ẋ + Ḃ, (3.6)

assuming $M$ symmetric with spectrum $\alpha_1, \alpha_2, \alpha_3$. The process $x(t)$ describes what is known as physical Brownian motion. It is well known that in the small mass regime, $m \ll 1$, of obvious physical relevance when dealing with particles, a good approximation is given by (mathematical) Brownian motion (with non-standard covariance). To see this formally, it suffices to take $m = 0$ in (3.6), in which case $x = M^{-1}B$.
Let us now assume that our particle (with position x and momentum mẋ) carries
a non-zero electric charge and moves in a magnetic field which we assume to be
constant. Recall that such a particle experiences a sideways force (“Lorentz force”)
that is proportional to the strength of the magnetic field, the component of the velocity
that is perpendicular to the magnetic field and the charge of the particle. In terms
of our assumptions, this simply means that a non-zero antisymmetric component is
added to M . We shall hence drop the assumption of symmetry, and instead consider

for M a general square matrix with

Real{σ(M )} ⊂ (0, ∞).

Note that these second order dynamics can be rewritten as an evolution equation for the momentum $p(t) = m\dot{x}(t)$,

$$\dot{p} = -M\dot{x} + \dot{B} = -\frac{1}{m} M p + \dot{B} .$$

As we shall see, $X = X^m$, indexed by “mass” $m$, converges in a quite non-trivial way to Brownian motion on the level of rough paths. In fact, the correct limit in rough path sense is $\bar{\mathbf{B}} = (B, \bar{\mathbb{B}})$, where

$$\bar{\mathbb{B}}_{s,t} = \mathbb{B}^{\text{Strat}}_{s,t} + (t-s)A , \tag{3.7}$$

in terms of an antisymmetric matrix $A$; written explicitly as $A = \frac12 (M\Sigma - \Sigma M^*) \in so(d)$, where

$$\Sigma = \int_0^{\infty} e^{-Ms} e^{-M^* s}\,ds .$$

When $M$ is normal, i.e. $M^* M = M M^*$, it is an exercise in linear algebra to show that this expression simplifies to

$$A = \frac12 \mathrm{Anti}(M)\,\mathrm{Sym}(M)^{-1} ,$$

where $\mathrm{Anti}(M)$ denotes the antisymmetric part of a matrix and $\mathrm{Sym}(M)$ its symmetric part. We can now state the result in full detail.
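The matrix $A$ is straightforward to compute numerically. The sketch below relies on one fact not spelled out above: differentiating $s \mapsto e^{-Ms}e^{-M^*s}$ and integrating over $(0, \infty)$ shows that $\Sigma$ solves the Lyapunov equation $M\Sigma + \Sigma M^* = \mathrm{Id}$. The function names are hypothetical, and the linear system is solved by plain vectorisation, which is adequate only for small $d$.

```python
import numpy as np

def stationary_covariance(M):
    """Solve M S + S M^T = Id for S (real M with spectrum in the right half-plane)."""
    d = M.shape[0]
    eye = np.eye(d)
    # Row-major vectorisation: vec(M S) = kron(M, Id) vec(S), vec(S M^T) = kron(Id, M) vec(S).
    lhs = np.kron(M, eye) + np.kron(eye, M)
    return np.linalg.solve(lhs, eye.reshape(-1)).reshape(d, d)

def area_correction(M):
    """A = (M Sigma - Sigma M^T) / 2."""
    S = stationary_covariance(M)
    return 0.5 * (M @ S - S @ M.T)

# A normal example: identity plus a rotation generator.
a = 0.7
M_normal = np.array([[1.0, a], [-a, 1.0]])
A_normal = area_correction(M_normal)
sym = 0.5 * (M_normal + M_normal.T)
anti = 0.5 * (M_normal - M_normal.T)
A_formula = 0.5 * anti @ np.linalg.inv(sym)

# A non-normal example with eigenvalues 1 and 3.
M_general = np.array([[1.0, 2.0], [0.0, 3.0]])
S_general = stationary_covariance(M_general)
A_general = area_correction(M_general)
```

For the normal matrix the two expressions for $A$ agree; for the non-normal one, $A$ is still antisymmetric and equals $M\Sigma - \frac12 \mathrm{Id}$, the form in which it appears in the proof below.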

Theorem 3.8. Let $M \in \mathbb{R}^{d\times d}$ be a square matrix in dimension $d$ such that all its eigenvalues have strictly positive real part. Let $B$ be a $d$-dimensional standard Brownian motion, $m > 0$, and consider the stochastic differential equations

$$dX = \frac{1}{m} P\,dt , \qquad dP = -\frac{1}{m} M P\,dt + dB ,$$

with zero initial position $X$ and momentum $P$. Then, for any $q \ge 1$ and $\alpha \in (1/3, 1/2)$, as mass $m \to 0$,

$$\Big(MX, \int MX \otimes d(MX)\Big) \to \bar{\mathbf{B}} \quad \text{in } \mathscr{C}^{\alpha} \text{ and } L^q .$$

Proof. Step 1. (Pointwise convergence in $L^q$.) In order to exploit Brownian scaling, it is convenient to set $m = \varepsilon^2$ and then $Y^{\varepsilon}$ as rescaled momentum,

$$Y^{\varepsilon}_t = P_t/\varepsilon .$$

We shall also write $X^{\varepsilon} = X$, to emphasise dependence on $\varepsilon$. We then have

$$dY^{\varepsilon}_t = -\varepsilon^{-2} M Y^{\varepsilon}_t\,dt + \varepsilon^{-1}\,dB_t , \qquad dX^{\varepsilon}_t = \varepsilon^{-1} Y^{\varepsilon}_t\,dt .$$

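By assumption, there exists $\lambda > 0$ such that the real part of every eigenvalue of $M$ is (strictly) bigger than $\lambda$. For later reference, we note that this implies the estimate $|\exp(-\tau M)| = O(\exp(-\lambda\tau))$ as $\tau \to \infty$. For fixed $\varepsilon$, define the Brownian motion $\tilde B = \varepsilon^{-1} B_{\varepsilon^2 \cdot}$ so that $\varepsilon^{-1}\,dB_t = d\tilde B_{\varepsilon^{-2} t}$, and consider the SDEs

$$d\tilde Y_t = -M \tilde Y_t\,dt + d\tilde B_t , \qquad d\tilde X_t = \tilde Y_t\,dt .$$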

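Note that the law of the solutions does not depend on $\varepsilon$. Furthermore, when solved with identical initial data, we have pathwise equality

$$\big(Y^{\varepsilon}_t, \varepsilon^{-1} X^{\varepsilon}_t\big) = \big(\tilde Y_{\varepsilon^{-2} t}, \tilde X_{\varepsilon^{-2} t}\big) . \tag{3.8}$$

Thanks to our assumption on $M$, $\tilde Y$ is ergodic; the stationary solution has (zero mean, Gaussian) law $\nu = N(0, \Sigma)$ for some covariance matrix $\Sigma$. To compute it, write down the stationary solution

$$\tilde Y^{\text{stat}}_t = \int_{-\infty}^{t} e^{-M(t-s)}\,dB_s .$$

For each $t$ (and in particular for $t = 0$), the law of $\tilde Y^{\text{stat}}_t$ is precisely $\nu$. We then see that

$$\Sigma = E\big(\tilde Y^{\text{stat}}_0 \otimes \tilde Y^{\text{stat}}_0\big) = \int_{-\infty}^{0} e^{-M(-s)} e^{-M^*(-s)}\,ds = \int_0^{\infty} e^{-Ms} e^{-M^* s}\,ds .$$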

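Since $\sup_{0\le t<\infty} E|\tilde Y_t|^2 < \infty$, it is clear that $\varepsilon \tilde Y_{\varepsilon^{-2} t} = \varepsilon Y^{\varepsilon}_t \to 0$ in $L^2$ uniformly in $t$ (and hence in $L^q$ for any $q < \infty$). Noting that $M X^{\varepsilon}_t = B_t - \varepsilon Y^{\varepsilon}_{0,t}$, the first part of the theorem is now obvious. Moreover, by the ergodic theorem¹,

$$\int_0^t f(Y^{\varepsilon}_s)\,ds \to t \int f(y)\,\nu(dy) , \quad \text{in } L^q \text{ for any } q < \infty , \tag{3.9}$$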
0

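for all reasonable test functions $f$; we shall only use it for quadratics. Using $dX^{\varepsilon} = \varepsilon^{-1} Y^{\varepsilon}\,dt$ we can then write

$$\begin{aligned}
\int_0^t M X^{\varepsilon}_s \otimes d(M X^{\varepsilon})_s
&= \int_0^t M X^{\varepsilon}_s \otimes dB_s - \varepsilon \int_0^t M X^{\varepsilon}_s \otimes dY^{\varepsilon}_s \\
&= \int_0^t M X^{\varepsilon}_s \otimes dB_s - M X^{\varepsilon}_t \otimes (\varepsilon Y^{\varepsilon}_t) + \varepsilon \int_0^t d(M X^{\varepsilon})_s \otimes Y^{\varepsilon}_s \\
&= \int_0^t M X^{\varepsilon}_s \otimes dB_s - M X^{\varepsilon}_t \otimes (\varepsilon Y^{\varepsilon}_t) + \int_0^t M Y^{\varepsilon}_s \otimes Y^{\varepsilon}_s\,ds \\
&\to \int_0^t B_s \otimes dB_s - 0 + t \int (M y \otimes y)\,\nu(dy) \\
&= \int_0^t B_s \otimes dB_s + t M \Sigma = \mathbb{B}^{\text{Strat}}_{0,t} + t\Big(M\Sigma - \frac12 \mathrm{Id}\Big) ,
\end{aligned}$$

¹ In its standard form, see e.g. Stroock [Str11] or Kallenberg [Kal02], test functions are assumed to be bounded. In our setting an easy truncation argument yields the extension to quadratics.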

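where the convergence is in $L^q$ for any $q \ge 2$. By considering the symmetric part of the above equation,

$$\frac12 (M X^{\varepsilon}_t) \otimes (M X^{\varepsilon}_t) \to \frac12 B_t \otimes B_t + t\,\mathrm{Sym}\Big(M\Sigma - \frac12 \mathrm{Id}\Big) ,$$

we see that $M\Sigma - \frac12 \mathrm{Id}$ is antisymmetric, and hence also equals $\frac12 (M\Sigma - \Sigma M^*)$. This settles pointwise convergence, in the sense that

$$S(M X^{\varepsilon})_t := \Big(M X^{\varepsilon}_t, \int_0^t M X^{\varepsilon}_s \otimes d(M X^{\varepsilon})_s\Big) \to \big(B_t, \bar{\mathbb{B}}_{0,t}\big) .$$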

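Step 2. (Uniform rough path bounds in $L^q$.) We claim that, for any $q < \infty$,

$$\sup_{\varepsilon\in(0,1]} E\big[\|M X^{\varepsilon}\|^q_{\alpha}\big] < \infty , \qquad \sup_{\varepsilon\in(0,1]} E\Big[\Big\|\int M X^{\varepsilon} \otimes d(M X^{\varepsilon})\Big\|^q_{2\alpha}\Big] < \infty ,$$

which, in view of Theorem 3.1, is an immediate consequence of the bounds

$$\sup_{\varepsilon\in(0,1]} E\big[|X^{\varepsilon}_{s,t}|^q\big] \lesssim |t-s|^{q/2} , \qquad \sup_{\varepsilon\in(0,1]} E\Big[\Big|\int_s^t X^{\varepsilon}_{s,\cdot} \otimes dX^{\varepsilon}\Big|^q\Big] \lesssim |t-s|^q .$$

Since $X$ is Gaussian, it follows from integrability properties of the first two Wiener–Itô chaoses that it is enough to show these bounds for $q = 2$. Furthermore, we note that the desired estimates are a consequence of the bounds

$$E\big[|\tilde X_{s,t}|^2\big] \lesssim |t-s| , \tag{3.10}$$

$$E\Big[\Big|\int_s^t \tilde X_{s,u} \otimes d\tilde X_u\Big|^2\Big] \lesssim |t-s|^2 , \tag{3.11}$$

where the implied proportionality constants are uniform over $t, s \in (0, \infty)$. Indeed, this follows directly from writing

$$E\big[|X^{\varepsilon}_{s,t}|^2\big] = E\big[|\varepsilon \tilde X_{\varepsilon^{-2}s, \varepsilon^{-2}t}|^2\big] \lesssim \varepsilon^2 \big|\varepsilon^{-2} t - \varepsilon^{-2} s\big| = |t-s| ,$$

(note the uniformity in $\varepsilon$), and similarly for the second moment of the iterated integral.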
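In order to check (3.10), it is enough to note that $M \tilde X_{s,t} = \tilde B_{s,t} - \tilde Y_{s,t}$, combined with the estimate

$$E\big[|\tilde Y_{s,t}|^2\big] = E\big[\big|(e^{-M(t-s)} - I)\tilde Y_s\big|^2\big] + \int_s^t \mathrm{Tr}\big(e^{-Mu} e^{-M^* u}\big)\,du \lesssim |t-s| ,$$

where we used the fact that $\mathrm{Real}\{\sigma(M)\} \subset (0, \infty)$ to get a uniform bound. In order to control (3.11), we consider one of the components and write

$$\begin{aligned}
E\Big[\Big|\int_s^t \tilde X^i_{s,u}\,d\tilde X^j_u\Big|^2\Big]
&= E\Big[\Big|\int_s^t \int_s^u \tilde Y^i_r \tilde Y^j_u\,dr\,du\Big|^2\Big] \\
&= \int_{[s,t]^4} E\big[\tilde Y^i_r \tilde Y^j_u \tilde Y^i_q \tilde Y^j_v\big]\,\mathbf{1}_{\{r\le u;\, q\le v\}}\,dr\,du\,dq\,dv \\
&\le \int_{[s,t]^4} \Big( \big|E\big[\tilde Y^i_r \tilde Y^j_u\big] E\big[\tilde Y^i_q \tilde Y^j_v\big]\big| + \big|E\big[\tilde Y^i_r \tilde Y^i_q\big] E\big[\tilde Y^j_u \tilde Y^j_v\big]\big| \\
&\qquad\qquad + \big|E\big[\tilde Y^i_r \tilde Y^j_v\big] E\big[\tilde Y^j_u \tilde Y^i_q\big]\big| \Big)\,dr\,du\,dq\,dv \\
&\lesssim \Big( \int_{[s,t]^2} \big|E\big[\tilde Y_r \otimes \tilde Y_u\big]\big|\,dr\,du \Big)^2 \\
&\lesssim \Big( \int_{[s,t]^2} \big|E\big[\tilde Y_r \otimes \tilde Y_u\big]\big|\,\mathbf{1}_{\{r\le u\}}\,dr\,du \Big)^2 ,
\end{aligned}$$

where we have used the fact that $\tilde Y$ is Gaussian (which yields Wick's formula for the expectation of products) in order to get the bound on the third line. But for $r \le u$, $E\big[\tilde Y_u \,\big|\, \tilde Y_r\big] = e^{-M(u-r)} \tilde Y_r$, so that

$$\begin{aligned}
\int_{[s,t]^2} \big|E\big[\tilde Y_r \otimes \tilde Y_u\big]\big|\,\mathbf{1}_{\{r\le u\}}\,dr\,du
&= \int_{[s,t]^2} \big|E\big[\tilde Y_r \otimes e^{-M(u-r)} \tilde Y_r\big]\big|\,\mathbf{1}_{\{r\le u\}}\,dr\,du \\
&\lesssim \int_s^t \Big( \int_r^t e^{-\lambda(u-r)}\,du \Big) E\big[|\tilde Y_r|^2\big]\,dr \lesssim |t-s| .
\end{aligned}$$

It now suffices to recall that $|\exp(-\tau M)| = O(\exp(-\lambda\tau))$ to conclude the proof of (3.11).

Step 3. (Rough path convergence in $L^q$.) The remainder of the proof is an easy application of interpolation, along the lines of Exercise 2.9. □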

3.5 Cubature on Wiener Space

Quadrature rules replace Lebesgue measure $\lambda$ on $[0, 1]$ by a finite, convex linear combination of point masses, say $\mu = \sum a_i \delta_{x_i}$, where weights $(a_i)$ and points $(x_i)$ are chosen such that all monomials (and hence all polynomials) up to degree $N$ are correctly evaluated. In other words, one first computes the moments of $\lambda$, namely

$$\int_0^1 x^n\,d\lambda(x) = \frac{1}{n+1} ,$$

for all $n \ge 0$. One then looks for a measure $\mu$ such that $\int_0^1 x^n\,d\mu(x) = 1/(n+1)$ for all $n \in \{0, 1, \dots, N\}$. The same can be done on Wiener space: the monomial $x^n$ is then replaced by the $n$-fold iterated integrals (in the sense of Stratonovich), integration is on $C([0,T], \mathbb{R}^d)$ against standard $d$-dimensional Wiener measure. In order to find such cubature formulae, the mandatory first step, on which we focus here, is the computation of the expectations of the $n$-fold iterated integrals²

$$E\Big( \int_{0<t_1<\dots<t_n<T} \circ dB \otimes \cdots \otimes \circ dB \Big) .$$
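For the classical one-dimensional analogue recalled at the start of this section, Gauss–Legendre rules provide a concrete instance. The sketch below uses NumPy's `numpy.polynomial.legendre.leggauss`, which returns nodes and weights on $[-1, 1]$; mapping them to $[0, 1]$ and picking three nodes are illustrative choices, not something prescribed by the text. A rule with $k$ nodes matches the moments $1/(n+1)$ for all $n \le N = 2k - 1$.

```python
import numpy as np

def quadrature_on_unit_interval(num_points):
    """Gauss-Legendre points and weights, mapped affinely from [-1, 1] to [0, 1]."""
    nodes, weights = np.polynomial.legendre.leggauss(num_points)
    return 0.5 * (nodes + 1.0), 0.5 * weights

num_points = 3
N = 2 * num_points - 1   # highest degree integrated exactly
points, weights = quadrature_on_unit_interval(num_points)

moments = [float(np.sum(weights * points ** n)) for n in range(N + 1)]
exact = [1.0 / (n + 1) for n in range(N + 1)]

# One degree higher, the rule no longer reproduces the Lebesgue moment.
next_moment_error = abs(float(np.sum(weights * points ** (N + 1))) - 1.0 / (N + 2))
```

The weights are positive and sum to one, so $\mu = \sum a_i \delta_{x_i}$ is indeed a convex combination of point masses.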

Let us combine all of these integrals into one single object, also known as (Stratonovich) signature of Brownian motion, by writing

$$S(B)_{0,T} = 1 + \sum_{n\ge 1} \int_{0<t_1<\dots<t_n<T} \circ dB \otimes \cdots \otimes \circ dB .$$

The signature $S(B)_{0,T}$ naturally takes values in the algebra of infinite formal tensor series $T((\mathbb{R}^d))$, effectively the closure of the space of tensor polynomials given by $\bigoplus_{n\ge 0} (\mathbb{R}^d)^{\otimes n}$. It turns out that in the case of Brownian motion, the expected signature can be expressed in a particularly concise and elegant form.

Theorem 3.9 (Fawcett). Consider $S(B)_{0,T}$ as above as a $T((\mathbb{R}^d))$-valued random variable. Then

$$E S(B)_{0,T} = \exp\Big( \frac{T}{2} \sum_{i=1}^d e_i \otimes e_i \Big) .$$

Proof. (Shekhar) Set $\varphi_t := E S(B)_{0,t}$. (It is not hard to see, by Wiener–Itô chaos integrability or otherwise, that all involved iterated integrals are integrable so that $\varphi$ is well-defined.) By Chen's formula (in its general form, see Exercise 2.1) and the independence of Brownian increments, one has the identity

$$\varphi_{t+s} = \varphi_t \otimes \varphi_s .$$

Since $\varphi_t \otimes \varphi_s = \varphi_s \otimes \varphi_t$, we have $[\varphi_s, \varphi_t] = 0$, so that

$$\log \varphi_{t+s} = \log \varphi_t + \log \varphi_s .$$

For integers $m, n$ we have $\log \varphi_m = n \log \varphi_{m/n}$ and $\log \varphi_m = m \log \varphi_1$. It follows that

$$\log \varphi_t = t \log \varphi_1 ,$$

first for $t = \frac{m}{n} \in \mathbb{Q}$, then for any real $t$ by continuity. On the other hand, for $t > 0$, Brownian scaling implies that $\varphi_t = \delta_{\sqrt t}\,\varphi_1$ where $\delta_{\lambda}$ is the dilation operator, which acts by multiplication with $\lambda^n$ on the $n$th tensor level, $(\mathbb{R}^d)^{\otimes n}$. Since $\delta_{\lambda}$ commutes with $\otimes$ (and thus also with $\log$, defined as power series),

$$\log \varphi_t = \delta_{\sqrt t} \log \varphi_1$$

and it follows that one necessarily has

$$\log \varphi_1 \in (\mathbb{R}^d)^{\otimes 2} .$$

It remains to identify $\log \varphi_1$ with $\frac12 \sum_{i=1}^d e_i \otimes e_i$. To this end it suffices to compute the expected signature up to level two, which yields

$$E S^{(2)}(B) = E\Big( 1 + B_{0,1} + \int_0^1 B \otimes \circ dB \Big) = 1 + \frac12 \sum_{i=1}^d e_i \otimes e_i .$$

Recall that in this expression, “1” is identified with $(1, 0, 0)$ in the truncated tensor algebra, and similarly for the other summands, and addition also takes place in $T^{(2)}(\mathbb{R}^d)$. Taking the logarithm (in the tensor algebra truncated beyond level 2; in this case $\log(1 + a + b) = a + b - \frac12 a \otimes a$ if $a$ is a 1-tensor, $b$ a 2-tensor) then immediately gives the desired identification. □

² We remark that all $n$-fold iterated Stratonovich integrals can be obtained from the “level-2” rough path $(B(\omega), \mathbb{B}^{\text{Strat}}(\omega)) \in \mathscr{C}_g^{\alpha}$ by a continuous map. In fact, this so-called Lyons lift allows one to view any geometric rough path as a “level-$n$” rough path for arbitrary $n \ge 2$.
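The level-2 computations used in the last step of the proof can be replayed mechanically. In the sketch below, elements of $T^{(2)}(\mathbb{R}^d)$ are stored as triples (scalar, vector, matrix); this representation and the function names are assumptions made for illustration. The code checks that the truncated logarithm inverts the truncated exponential, and that the level-2 expected signature $(1, 0, \frac{t}{2}\mathrm{Id})$ is multiplicative in $t$.

```python
import numpy as np

def t2_mul(x, y):
    """Tensor product in T^(2)(R^d), discarding everything above level 2."""
    x0, x1, x2 = x
    y0, y1, y2 = y
    return (x0 * y0, x0 * y1 + y0 * x1, x0 * y2 + y0 * x2 + np.outer(x1, y1))

def t2_exp(a, b):
    """exp(a + b) = 1 + a + (b + a (x) a / 2) for a 1-tensor a and a 2-tensor b."""
    return (1.0, a, b + 0.5 * np.outer(a, a))

def t2_log(x):
    """log(1 + a + b) = a + (b - a (x) a / 2); the scalar part of x must be 1."""
    _, a, b = x
    return (a, b - 0.5 * np.outer(a, a))

def expected_signature_level2(t, d):
    """Level-2 truncation of exp((t/2) sum_i e_i (x) e_i), as in Theorem 3.9."""
    return (1.0, np.zeros(d), 0.5 * t * np.eye(d))

rng = np.random.default_rng(4)
d = 3
a, b = rng.normal(size=d), rng.normal(size=(d, d))
log_a, log_b = t2_log(t2_exp(a, b))

product = t2_mul(expected_signature_level2(0.3, d), expected_signature_level2(0.7, d))
log_phi1 = t2_log(expected_signature_level2(1.0, d))
```

Here `log_phi1` has vanishing first level and second level $\frac12 \mathrm{Id}$, matching the identification of $\log \varphi_1$.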

The (constructive) existence of cubature formulae, that is, of a finite family of piecewise smooth paths with associated probabilities that mimics the behaviour of the expected signature up to a given level, is not a trivial problem (although much has been achieved to date). The reader can explore a simple case in Exercise 3.11 below.

3.6 Scaling limits of random walks

Consider a family of continuous processes $\mathbf{X}^n = (X^n, \mathbb{X}^n)$, with values in $V \oplus (V \otimes V)$ where $\dim V < \infty$. Assume $\mathbf{X}^n_0 = (0, 0)$ for all $n$. We leave the proof of the following result as exercise.

Theorem 3.10 (Kolmogorov tightness criterion for rough paths). Let $q \ge 2$, $\beta > 1/q$. Assume, for all $s, t$ in $[0, T]$

$$E_n|X^n_{s,t}|^q \le C|t-s|^{\beta q} , \qquad E_n|\mathbb{X}^n_{s,t}|^{q/2} \le C|t-s|^{\beta q} , \tag{3.12}$$

for some constant $C < \infty$. Assume $\beta - \frac1q > \frac13$. Then for every $\alpha \in \big(\frac13, \beta - \frac1q\big)$, the $\mathbf{X}^n$'s are tight in $\mathscr{C}^{0,\alpha}$.

In typical applications, the $X^n$ are only defined for discrete times, such as $s = j/n$, $t = k/n$ for integers $j, k$. The non-trivial work then consists, for a suitable choice of $\mathbb{X}^n$, in checking the following discrete tightness estimates,

$$E_n\big|X^n_{\frac{j}{n},\frac{k}{n}}\big|^q \le C\Big|\frac{j-k}{n}\Big|^{\beta q} , \qquad E_n\big|\mathbb{X}^n_{\frac{j}{n},\frac{k}{n}}\big|^{q/2} \le C\Big|\frac{j-k}{n}\Big|^{\beta q} . \tag{3.13}$$

The analogous continuous tightness estimates are typically obtained by suitable extension of $\mathbf{X}^n$ to continuous times (e.g. piecewise geodesic).
Proposition 3.11. Consider a $d$-dimensional random walk $(X_j : j \in \mathbb{N})$, with i.i.d. increments of zero mean, finite moments of any order $q < \infty$, and unit covariance matrix. Extend the rescaled random walk

$$X^n_{\frac{j}{n}} := \frac{1}{\sqrt n} X_j ,$$

defined on discrete times only, by piecewise linear interpolation to all times and construct $\mathbf{X}^n = (X^n, \mathbb{X}^n)$ by iterated (Riemann–Stieltjes) integration. Then the tightness estimates in Theorem 3.10 hold with $\beta = 1/2$ and all $q < \infty$.
Proof. The iterated integrals of a linear (or affine) path with increment $v \in \mathbb{R}^d$ take the simple form $\exp(v)$ in terms of the tensor exponential introduced in (2.13). Chen's relation then implies

$$\mathbf{X}^n_{\frac{j}{n},\frac{k}{n}} = \exp\Big(X^n_{\frac{j}{n},\frac{j+1}{n}}\Big) \otimes \cdots \otimes \exp\Big(X^n_{\frac{k-1}{n},\frac{k}{n}}\Big) . \tag{3.14}$$

The simple calculus on the level-2 tensor algebra $T^{(2)}(\mathbb{R}^d)$ leads to an explicit expression for $\mathbb{X}^n_{\frac{j}{n},\frac{k}{n}}$, to which one can apply the (discrete) Burkholder–Davis–Gundy inequality in order to get the discrete tightness estimates (3.13). The extension to all times is straightforward. Details are left to the reader (see e.g. [BF13]). An alternative argument, not restricted to level 2, is found in Breuillard et al. [BFH09]. □
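The product formula (3.14) and the resulting explicit expression for the second level can be sketched as follows. The representation of group-like elements by pairs (level 1, level 2), the Rademacher step distribution, and all function names are assumptions made for illustration; any i.i.d. steps with zero mean and unit covariance would do.

```python
import numpy as np

def t2_exp(v):
    """Levels 1 and 2 of the tensor exponential exp(v) = 1 + v + v (x) v / 2."""
    return v.copy(), 0.5 * np.outer(v, v)

def t2_mul(x, y):
    """Product of (1, x1, x2) and (1, y1, y2) in T^(2)(R^d), scalar part omitted."""
    x1, x2 = x
    y1, y2 = y
    return x1 + y1, x2 + y2 + np.outer(x1, y1)

def walk_signature(steps, n):
    """Right-hand side of (3.14) over all given steps, each rescaled by 1/sqrt(n)."""
    d = steps.shape[1]
    sig = (np.zeros(d), np.zeros((d, d)))
    for xi in steps:
        sig = t2_mul(sig, t2_exp(xi / np.sqrt(n)))
    return sig

rng = np.random.default_rng(5)
n, d = 500, 2
steps = rng.choice([-1.0, 1.0], size=(n, d))   # zero mean, unit covariance
level1, level2 = walk_signature(steps, n)

# Explicit level-2 expression: sum_m S_{m-1} (x) xi_m + (1/2) sum_m xi_m (x) xi_m.
scaled = steps / np.sqrt(n)
partial = np.concatenate([np.zeros((1, d)), np.cumsum(scaled, axis=0)])[:-1]
direct = partial.T @ scaled + 0.5 * scaled.T @ scaled
```

The first sum in `direct` is a discrete martingale transform, which is where a Burkholder–Davis–Gundy type inequality enters; the symmetric part of `level2` is $\frac12$ `level1` $\otimes$ `level1`, as expected for a geometric rough path.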
Note that $\mathbf{X}^n$, as constructed above, is a (random) geometric rough path. Recall that such rough paths can be viewed as genuine paths with values in the Lie group $G^{(2)}(\mathbb{R}^d) \subset T^{(2)}(\mathbb{R}^d)$. On the other hand, from (3.14), we see that $\mathbf{X}^n$ restricted to discrete times $\{\frac{j}{n} : j \in \mathbb{N}\}$ is a Lie group valued random walk, rescaled with the aid of the dilation operator. By using central limit theorems available on such Lie groups, one can see that $\mathbf{X}^n$ at unit time converges weakly to Brownian motion, enhanced with its iterated integrals in the Stratonovich sense. Under the additional assumption that $E(X \otimes X) = \mathrm{Id}$, the identity matrix, this Brownian motion is in fact a standard Brownian motion. This is enough to characterise the finite-dimensional distributions of any weak limit point and one has the following “Donsker” type result.
Theorem 3.12. In the rescaled random walk setting of Proposition 3.11, and under the additional assumption that $E(X \otimes X) = \mathrm{Id}$, we have the weak convergence

$$\mathbf{X}^n \Longrightarrow \mathbf{B}^{\text{Strat}}$$

in the rough path space $\mathscr{C}^{\alpha}([0,T], \mathbb{R}^d)$, any $\alpha < 1/2$.



Recall that, by definition, weak convergence is stable under pushforward by continuous maps. The interest in this result is therefore clearly given by the fact that stochastic integrals and the Itô map can be viewed as continuous maps on rough path spaces, as will be discussed in later chapters.

3.7 Exercises

Exercise 3.1 Complete the proof of Theorem 3.3.


Exercise 3.2 Bypass the use of Wiener–Itô chaos integrability in Proposition 3.4 by showing directly that the matrix-valued random variable $\mathbb{B}^{\text{Itô}}_{0,1}$ has moments of all orders. Hint: This is trivial for the on-diagonal entries, for the off-diagonal entries one can argue via conditioning, Itô isometry, and reflection principle.
♯ Exercise 3.3 Show that d-dimensional Brownian motion B enhanced with Lévy’s
stochastic area is a degenerate diffusion process and find its generator.
Exercise 3.4 (Q-Wiener process as rough path) Given a separable Hilbert space $H$ with orthonormal basis $(e_k)$, $(\lambda_k) \in l^1$, $\lambda_k > 0$ for all $k$, and a countable sequence $(\beta^k)$ of independent standard Brownian motions, the limit

$$X_t := \sum_{k=1}^{\infty} \lambda_k^{1/2} \beta^k_t\, e_k$$

exists a.s. and in $L^2$, uniformly on compacts. This defines a $Q$-Wiener process in the sense of [DPZ92], where $Q = \sum_k \lambda_k \langle e_k, \cdot\rangle e_k$ is symmetric, non-negative and trace-class; conversely, any such operator $Q$ on $H$ can be written in this form and thus gives rise to a $Q$-Wiener process. Show that

$$\mathbb{X}_{s,t} := \sum_{j,k=1}^{\infty} \lambda_j^{1/2} \lambda_k^{1/2} \int_s^t \beta^j_{s,r}\,d\beta^k_r\; e_j \otimes e_k$$

exists a.s. and in $L^2$, uniformly on compacts and so defines $\mathbb{X}$ with values in $H \otimes_{HS} H$, the closure of the algebraic tensor product $H \otimes_a H$ under the Hilbert–Schmidt norm. Consider both the case of Itô and Stratonovich integration and verify that with either choice, $(X, \mathbb{X}) \in \mathscr{C}^{\alpha}$ a.s. for any $\alpha < 1/2$.
∗ Exercise 3.5 (Banach-valued Brownian motion as rough path [LLQ02]) Given a separable Banach space $V$ equipped with a centred Gaussian measure $\mu$, a standard construction (cf. [Led96]) gives rise to a so-called abstract Wiener space $(V, H, \mu)$, with $H \subset V$ the Cameron–Martin space of $\mu$. (Examples to have in mind are $V = H = \mathbb{R}^d$ with $\mu = N(0, I)$, or the usual Wiener space $V = C([0,1])$ equipped with Wiener measure; $H$ is then the space of absolutely continuous paths starting at zero with $L^2$-derivative.) There then exists a $V$-valued Brownian motion $(B_t : t \in [0, T])$ such that

• $B_0 = 0$,
• $B$ has independent increments,
• $\langle B_{s,t}, v^*\rangle \sim N\big(0, (t-s)|v^*|^2_H\big)$ whenever $0 \le s < t \le T$ and $v^* \in V^* \hookrightarrow H^* \cong H$.

We assume that $V \otimes V$ is equipped with an exact tensor norm (with respect to $\mu$) in the sense that there exists $\gamma \in [1/2, 1)$ and a constant $C > 0$ such that for any sequence $\{G_k \otimes \tilde G_k : k \ge 1\}$ of independent $V$-valued Gaussian random variables with identical distribution $\mu$,

$$E\Big( \Big| \sum_{k=1}^N G_k \otimes \tilde G_k \Big|^2_{V\otimes V} \Big) \le C N^{2\gamma} = o(N) .$$

a) Verify that exactness holds with $\gamma = 1/2$ whenever $\dim V < \infty$. (More generally, exactness with $\gamma = 1/2$ always holds true if one works with the injective tensor product space, $V \otimes_{\text{inj}} V$, the injective norm being the smallest possible. For the largest possible norm, the projective norm, the $o(N)$-estimate remains true but can be as slow as one wishes. Exactness may then fail, see for example [LLQ02]. Exactness of the usual Wiener space, with uniform or Hölder norm, is also known to be true.)

b) Fix $\alpha < 1/2$. Show that dyadic piecewise linear approximations $B^n$, enhanced with $\mathbb{B}^n = \int B^n \otimes dB^n$, converge in $\alpha$-Hölder rough path metric to a limit $\mathbf{B}$ in $\mathscr{C}^{\alpha}([0,T], V)$. More precisely, use the previous exercise to show that the sequence $\mathbf{B}^n = (B^n, \mathbb{B}^n)$ is Cauchy in the sense that

$$|\varrho_{\alpha}(\mathbf{B}^n, \mathbf{B}^m)|_{L^q} \to 0 \quad \text{with } n, m \to \infty .$$

Conclude that $\mathbf{B}^n$ converges in $\mathscr{C}^{\alpha}$ and $L^q$ to some limit $\mathbf{B} \in \mathscr{C}^{\alpha}([0,T], V)$ a.s.

c) Show that $\mathbf{B}$ is the $L^q$-limit in $\alpha$-Hölder rough path metric for all piecewise linear approximations, say $B^{D_n}$, as long as mesh $|D_n| \to 0$ with $n \to \infty$. Show that the convergence is almost sure if $|D_n| \sim 2^{-n}$ and also $|D_n| \sim 1/n$.
Solution. We only sketch the main step in the proof of b). Without loss of generality, we set $T = 1$. The crux of the matter is to show that $\mathbb{B}^n_{0,1}$ converges in $V \otimes V$. The rest follows from scaling and equivalence of moments in the first two Wiener chaoses. Set $t^n_k = k/2^n$. Then

$$\begin{aligned}
\big| \mathbb{B}^{n+1}_{0,1} - \mathbb{B}^n_{0,1} \big|^2_{L^2}
&\sim E \Big| \sum_{k=1}^{2^n} B_{t^{n+1}_{2k-2}, t^{n+1}_{2k-1}} \otimes B_{t^{n+1}_{2k-1}, t^{n+1}_{2k}} \Big|^2_{V\otimes V} \\
&\sim \frac{1}{2^{2n+2}}\, E \Big| \sum_{k=1}^{2^n} 2^{\frac{n+1}{2}} B_{t^{n+1}_{2k-2}, t^{n+1}_{2k-1}} \otimes 2^{\frac{n+1}{2}} B_{t^{n+1}_{2k-1}, t^{n+1}_{2k}} \Big|^2_{V\otimes V} \\
&\sim 2^{-2n-2}\, E \Big| \sum_{k=1}^{2^n} G_k \otimes \tilde G_k \Big|^2_{V\otimes V} \lesssim 2^{-2n-2}\, 2^{2\gamma n} \\
&\sim 2^{-2n(1-\gamma)} ,
\end{aligned}$$

where the penultimate bound was obtained by exactness. By definition of exactness $1 - \gamma > 0$ and so $\mathbb{B}^n_{0,1}$ is Cauchy in the $L^2$-space of $V \otimes V$-valued random variables.

Exercise 3.6 In the context of Theorem 3.8, show that for $M$ normal the Lévy area correction takes the form

$$A = \frac12 \mathrm{Anti}(M)\,\mathrm{Sym}(M)^{-1} .$$
Conclude that the correction vanishes if and only if M is symmetric. Is this also true
without the assumption that M is normal?

Exercise 3.7 In the context of Theorem 3.8, show that “physical Brownian motion
with mass m” converges as m → 0, in ϱα and Lq , α ∈ (1/3, 1/2) and q < ∞, with
rate  
1
O , any θ < 1/2 − α.

Hint: Use Theorem 3.3 to show rough path convergence. (The computations are a
little longer, but of similar type, with the additional feature that the use of the ergodic
theorem can be avoided.)
Exercise 3.8 Consider physical Brownian motion in dimension $d = 2$, with

$$M = I - \alpha \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} , \qquad \alpha \in \mathbb{R} .$$

Show that the area correction of $X^m$, in the (small mass) limit $m \to 0$, is given by

$$\frac{\alpha}{2(1+\alpha^2)} \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} .$$

(This correction is computed by multiscale / homogenisation techniques in [PS08].)


Exercise 3.9 Consider $X_t = bt + \sigma B_t$ where $b \in \mathbb{R}^d$, $a = \sigma\sigma^* \in (\mathbb{R}^d)^{\otimes 2}$. In other words, $X$ is a Lévy process with triplet $(a, b, 0)$. Show that the expected signature of $X$ over $[0, T]$ is given by

$$E S(X)_{0,T} = \exp\Big( T\Big( b + \frac12 a \Big) \Big) .$$

Here, the exponential should be interpreted as the exponential in the tensor algebra, i.e.

$$\exp(u) = 1 + u + \frac{1}{2!}\, u \otimes u + \frac{1}{3!}\, u \otimes u \otimes u + \dots$$
Exercise 3.10 (Expected signature for Lévy processes [FS17]) Consider a compound Poisson process $Y$ with intensity $\lambda$ and jumps distributed like $J = J(\omega) \sim \nu$. In other words, $Y$ is Lévy with triplet $(0, 0, K)$ where the Lévy measure is given by $K = \lambda\nu$. A sample path of $Y$ gives rise to a piecewise linear, continuous path, simply by connecting $J_1$, $J_1 + J_2$ etc. Show that, under a suitable integrability condition for $J$,

$$E S(Y)_{0,T} = \exp\big( T\lambda\, E(e^J - 1) \big) .$$

Can you handle the case of a general Lévy process?

Exercise 3.11 (Level-3 cubature formula) Define a measure $\mu$ on $C([0,1], \mathbb{R}^d)$ by assigning equal weight $2^{-d}$ to each of the paths

$$t \mapsto t \begin{pmatrix} \pm 1 \\ \pm 1 \\ \vdots \\ \pm 1 \end{pmatrix} \in \mathbb{R}^d .$$

Call the resulting process $(X_t(\omega) : t \in [0, 1])$ and compute the expected signature up to level 3, that is

$$E\Big( 1,\; X_{0,1},\; \int_{0<t_1<t_2<1} dX_{t_1} \otimes dX_{t_2},\; \int_{0<t_1<t_2<t_3<1} dX_{t_1} \otimes dX_{t_2} \otimes dX_{t_3} \Big) .$$

Compare with expected signature of Brownian motion, the tensor exponential $\exp(\frac12 I)$, projected to the first 3 levels.

Solution. One can write $X_t(\omega) = t \sum_i Z_i(\omega) e_i$ with i.i.d. random variables $Z_i$ taking values $+1, -1$ with equal probability. Clearly,

$$E \int_{0<t_1<1} dX_{t_1} = E X_{0,1} = 0 .$$

Then,

$$\int_{0<t_1<t_2<1} dX_{t_1} \otimes dX_{t_2} = \frac12 \sum_{i,j} Z_i Z_j\, e_i \otimes e_j = \frac12 \mathrm{Id} + (\text{zero mean})$$

and so the expected value at level 2 matches $\pi_2\big(\exp(\frac12 I)\big) = \frac12 \mathrm{Id}$. A similar expansion on level 3 shows that every summand either contains, for some $i$, a factor $E Z_i = 0$ or $E Z_i^3 = 0$. In other words, the expected signature at level 3 is zero, in agreement with $\pi_3\big(\exp(\frac12 \mathrm{Id})\big) = 0$. We conclude that the expected signatures, of $\mu$ on the one hand and Wiener measure on the other hand, agree up to level 3.
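The solution can also be confirmed by brute force, since $\mu$ charges only $2^d$ straight lines. The sketch below uses the fact, recalled in the proof of Proposition 3.11, that the signature of a linear path with increment $v$ is $\exp(v)$, so that its first three levels are $v$, $v^{\otimes 2}/2!$ and $v^{\otimes 3}/3!$. The function names and the choice $d = 3$ are illustrative assumptions.

```python
import itertools
import numpy as np

def linear_path_signature(v):
    """Levels 1, 2, 3 of the signature of t -> t v over [0, 1]."""
    level2 = np.einsum("i,j->ij", v, v) / 2.0
    level3 = np.einsum("i,j,k->ijk", v, v, v) / 6.0
    return v, level2, level3

def expected_signature(d):
    """Average of the level 1..3 signatures over the 2^d sign patterns, weight 2^-d each."""
    weight = 2.0 ** (-d)
    levels = [np.zeros(d), np.zeros((d, d)), np.zeros((d, d, d))]
    for signs in itertools.product([-1.0, 1.0], repeat=d):
        v = np.array(signs)
        for acc, term in zip(levels, linear_path_signature(v)):
            acc += weight * term   # in-place accumulation
    return levels

d = 3
level1, level2, level3 = expected_signature(d)
```

The three averages are $0$, $\frac12 \mathrm{Id}$ and $0$, which are the projections of $\exp(\frac12 \mathrm{Id})$ to levels 1, 2 and 3.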

Exercise 3.12 Prove the Kolmogorov tightness criterion, Theorem 3.10.



3.8 Comments

The modification of Kolmogorov’s criterion for rough paths (Theorem 3.1) is a minor
variation on a rather well-known theme. Rough path regularity of Brownian motion
was first established in the thesis of Sipiläinen, [Sip93].
For extensions to infinite-dimensional Wiener processes (and also convergence
of piecewise linear approximations in rough path sense) see Ledoux, Lyons and
Qian [LLQ02] and Dereich [Der10]; much of the interest here is to go beyond the
Hilbert space setting. The resulting stochastic integration theory against Banach space
valued Brownian motion, which in essence cannot be done by classical methods, has
proven crucial in some recent applications (cf. the works of Kawabi–Inahama [IK06],
Dereich [Der10]).
Early proofs of Brownian rough path regularity were typically established by
convergence of dyadic piecewise linear approximations to (B, BStrat ) in (p-variation)
rough path metric; see e.g. Lyons–Qian [LQ02]. Many other “obvious” (but as we
have seen: not all reasonable) approximations are seen to yield the same Brownian
rough path limit. The discussion of Brownian motion in a magnetic field follows
closely Friz, Gassiat and Lyons [FGL15]. Semimartingales [CL05, FV08a, LP18,
CF19] and large classes of Markovian processes [Lej06, FV08c] lift in a natural way
to random rough paths. For Gaussian rough paths see Chapter 10. Infinite dimensional
rough path constructions from free probability include [CDM01, Vic04].
Friz–Victoir [FV08a] extend Lépingle’s classical p-variation Burkholder–Davis–
Gundy (BDG) inequality [Lep76] for martingales to continuous martingale rough
paths (a.k.a. enhanced martingales). This was further extended to càdlàg martingale
rough paths by Chevyrev–Friz [CF19] and a precise “off-diagonal” variation estimate
for $\int M \, dN$, with M and N two martingales, was given by Kovač and Zorin–Kranich [KZK19],
extending a variational estimate of Do, Muscalu and Thiele [DMT12], with motivation
from harmonic analysis.
Lyons–Zeitouni [LZ99] use rough paths to bound Stratonovich iterated stochastic
integrals under conditioning, with application to Onsager–Machlup functionals.
The componentwise expectation of (Stratonovich) iterated integrals, i.e. the expected
signature of Brownian motion, was first computed in the thesis of Fawcett [Faw04];
different proofs were then given by Lyons–Victoir, Baudoin and Friz–Shekhar,
[LV04, Bau04, FS17]. Fawcett’s formula is central to the Kusuoka–Lyons–Victoir
cubature method [Kus01, LV04]. More generally, expected signatures capture im-
portant aspects of the law of a stochastic process, see Chevyrev–Lyons [CL16]. The
computation of expected signatures of large classes of stochastic processes including
fractional Brownian motion, Schramm–Loewner trace, stopped Brownian motion
and Lévy processes has been pursued by a number of people including Baudoin
[Bau04], Werness [Wer12], Lyons–Ni [LN15], Friz–Shekhar [FS17]. The Donsker
type theorem, Theorem 3.12, in uniform topology, is a consequence of Stroock–
Varadhan [SV73]; the rough path case is due to Breuillard, Friz and Huesmann
[BFH09]. Applications to cubature are discussed in [BF13]. Several authors have
studied functional CLTs in rough paths topology in more complicated settings, includ-
ing [LS17, LS18, LO18], see also [IKN18]. The case of random walks in random
environments is a consequence of a Kipnis–Varadhan view on additive functionals as
rough paths [DOP19]. Convergence to Brownian rough paths, with area anomaly, is
also generic in the context of homogenisation; Section 9.6 contains precise references.
Chevyrev [Che18] considers random walks and Lévy processes on homogeneous
groups from a rough path point of view.
Chapter 4
Integration against rough paths

The aim of this chapter is to give a meaning to the expression $\int Y_t \, d\mathbf{X}_t$ for a suitable
class of integrands Y , integrated against a rough path $\mathbf{X}$. We first discuss the case
originally studied by Lyons where Y = F (X). We then introduce the notion of a
controlled rough path and show that this forms a natural class of integrands.

4.1 Introduction
We consider the problem of giving a meaning to the expression $\int Y_t \, d\mathbf{X}_t$, for
$\mathbf{X} \in \mathscr{C}^\alpha([0,T], V)$ and Y some continuous function with values in L(V, W ), the space
of bounded linear operators from V into some other Banach space W . Of course,
such an integral cannot be defined for arbitrary continuous functions Y , especially if
we want the map $(\mathbf{X}, Y) \mapsto \int Y \, d\mathbf{X}$ to be continuous in the relevant topologies. We
therefore also want to identify a “good” class of integrands Y for the rough path $\mathbf{X}$.
A natural approach would be to try to define the integral as a limit of Riemann–
Stieltjes sums, that is
\[ \int_0^1 Y_t \, dX_t = \lim_{|\mathcal{P}| \to 0} \sum_{[s,t] \in \mathcal{P}} Y_s X_{s,t}, \tag{4.1} \]
where $\mathcal{P}$ denotes a partition of [0, 1] (interpreted as a finite collection of essentially
disjoint intervals such that $\bigcup \mathcal{P} = [0,1]$) and $|\mathcal{P}|$ denotes the length of the largest
element of $\mathcal{P}$. Such a definition – the Young integral – was studied in detail in the
seminal paper by Young [You36], where it was shown that such a sum converges
if $X \in C^\alpha$ and $Y \in C^\beta$, provided α + β > 1, and that the resulting bilinear map
is continuous. This result is sharp in the sense that one can construct sequences of
smooth functions $Y^n$ and $X^n$ such that $Y^n \to 0$ and $X^n \to 0$ in $C^{1/2}([0,1], \mathbb{R})$, but
such that $\int Y^n \, dX^n \to \infty$.
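
As an illustration of the sums in (4.1), here is a minimal sketch in the Young regime. The paths $X_t = Y_t = t^{0.6}$ are an illustrative assumption, not taken from the text: both are 0.6-Hölder, so α + β = 1.2 > 1, and the limit is $\tfrac12 X_1^2 = \tfrac12$. The helper `left_point_sum` is a hypothetical name.

```python
def left_point_sum(Y, X, n):
    """Sum of Y(s) * (X(t) - X(s)) over the uniform partition of [0, 1] into n intervals."""
    total = 0.0
    for k in range(n):
        s, t = k / n, (k + 1) / n
        total += Y(s) * (X(t) - X(s))
    return total

# Illustrative Hoelder-0.6 paths (assumption): X_t = Y_t = t^0.6, Young integral = 1/2.
X = lambda t: t ** 0.6
Y = lambda t: t ** 0.6

errors = [abs(left_point_sum(Y, X, n) - 0.5) for n in (10, 100, 1000)]
```

The entries of `errors` should decrease as the mesh 1/n shrinks, in line with convergence of the Riemann–Stieltjes sums.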
As a consequence of Young’s inequality [You36], one has the bound
\[ \Big| \int_0^1 (Y_r - Y_0) \, dX_r \Big| \le C \|Y\|_{\beta;[0,1]} \|X\|_{\alpha;[0,1]}, \tag{4.2} \]

with C depending on α + β > 1. Given paths X, Y defined on [s, t] rather than [0, 1]
it is an easy consequence of the scaling properties of Hölder seminorms that
\[ \Big| \int_s^t Y_r \, dX_r - Y_s X_{s,t} \Big| \le C \|Y\|_\beta \|X\|_\alpha |t-s|^{\alpha+\beta}. \tag{4.3} \]


In particular, when α = β > 1/2, the right-hand side is proportional to $|t-s|^{2\alpha} =
o(|t-s|)$, which is to be compared with the estimate (4.22) below.
The main insight of the theory of rough paths is that this seemingly unsurmountable
barrier of α + β > 1 (which reduces to α > 1/2 in the case α = β which is our
main interest¹) can be broken by adding additional structure to the problem. Indeed,
for a rough path $\mathbf{X}$, we postulate the values $\mathbb{X}_{s,t}$ of the integral of X against itself,
see (2.2). It is then intuitively clear that one should be able to define $\int Y \, d\mathbf{X}$ in a
consistent way, provided that Y “looks like X”, at least on very small scales (in the
precise sense of (4.18) below). The easiest way for a function Y to “look like X”
is to have $Y_t = F(X_t)$ for some sufficiently smooth F : V → L(V, W ), called a
one-form.

4.2 Integration of one-forms

We aim to integrate Y = F(X) against $\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}^\alpha$. When F : V → L(V, W )
is in $C^1$, or better, a Taylor approximation gives
\[ F(X_r) \approx F(X_s) + DF(X_s) X_{s,r}, \tag{4.4} \]
for r in some (small) interval [s, t], say. Recall (see Sections 1.4 and 1.5 concerning
the infinite-dimensional case) that²
\[ L(V, L(V, W)) \cong L(V \otimes V, W), \]

so that $DF(X_s)$ may be regarded as an element in $L(V \otimes V, W)$. Since the Young
integral defined in (4.1), when applied to Y = F(X), is effectively based on the
approximation $F(X_r) \approx F(X_s)$, for r ∈ [s, t], it is natural to hope, with a motivating
look at (2.2), that the compensated Riemann–Stieltjes sum appearing at the right-hand
side of
\[ \int_0^1 F(X_s) \, d\mathbf{X}_s \approx \sum_{[s,t] \in \mathcal{P}} \big( F(X_s) X_{s,t} + DF(X_s) \mathbb{X}_{s,t} \big), \tag{4.5} \]
provides a good enough approximation (say, is Cauchy as $|\mathcal{P}| \to 0$) even when
X ceases to have α-Hölder regularity for α > 1/2 (as required by Young theory),
but assuming instead $\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}^\alpha$, $\alpha \in (\tfrac13, \tfrac12]$. Why should this be good
enough? The intuition is as follows: given $\alpha \in (\tfrac13, \tfrac12]$, neither $|X_{s,t}| \sim |t-s|^{\alpha}$ nor
$|\mathbb{X}_{s,t}| \sim |t-s|^{2\alpha}$ in the above sum will be negligible as $|\mathcal{P}| \to 0$. Continuing in the
same fashion, one expects (in fact one can show it) that the third iterated integral
$\mathbb{X}^{(3)}_{s,t}$ is of order $|\mathbb{X}^{(3)}_{s,t}| \sim |t-s|^{3\alpha} = o(|t-s|)$, so that adding a third term of the form
$D^2F(X_s)\mathbb{X}^{(3)}_{s,t}$ in the sum of (4.5), at the very least, will not affect any limit, should
it exist. In the following, we will see that this limit,³
\[ \int_0^1 F(X_s) \, d\mathbf{X}_s = \lim_{|\mathcal{P}| \to 0} \sum_{[s,t] \in \mathcal{P}} \big( F(X_s) X_{s,t} + DF(X_s) \mathbb{X}_{s,t} \big), \tag{4.6} \]
does exist, and call it the rough integral.⁴ In fact, in this section we shall construct the
(indefinite) rough integral $Z = \int_0^{\cdot} F(X) \, d\mathbf{X}$ as an element in $C^\alpha$, i.e. as a path, similar
to the construction of stochastic integrals as processes rather than random variables.
Even this may not be sufficient in applications: one often wants to have an extended
meaning of the rough integral, such as $(Z, \mathbb{Z}) \in \mathscr{C}^\alpha$, a point of view emphasised in
[Lyo98, LQ02, LCL07], or something similar (such as “Z controlled by X” in the
sense of Definition 4.6 below, to be discussed in the next section).

¹ ...but see Exercise 4.7.
² In coordinates, when dim V, dim W < ∞, $G = DF(X_s)$ takes the form of a (1, 2)-tensor
$(G^k_{i,j})$ and the identification amounts to
\[ v \mapsto \Big( \tilde v \mapsto \Big( \sum_{i,j} G^k_{i,j} v^i \tilde v^j \Big)_k \Big) \quad \text{versus} \quad M \mapsto \Big( \sum_{i,j} G^k_{i,j} M^{i,j} \Big)_k. \]
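
To make the role of the second-order term in (4.6) concrete, here is a minimal scalar sketch, under assumptions that are not from the text: V = W = ℝ, the one-form is F = sin, the path is a random-walk surrogate for a Brownian sample path, and the second level is the geometric choice $\mathbb{X}_{s,t} = \tfrac12 X_{s,t}^2$. In this scalar geometric setting the compensated sums should approach the chain-rule value $\cos X_0 - \cos X_1$. The name `compensated_sum` is hypothetical.

```python
import numpy as np

def compensated_sum(F, DF, X):
    """Sum over consecutive grid points of F(X_s) X_{s,t} + DF(X_s) XX_{s,t},
    using the scalar geometric second level XX_{s,t} = X_{s,t}^2 / 2 (assumption)."""
    dX = np.diff(X)
    XX = 0.5 * dX ** 2
    return np.sum(F(X[:-1]) * dX + DF(X[:-1]) * XX)

rng = np.random.default_rng(0)
n = 2 ** 14
# Random-walk surrogate for a Brownian sample path on [0, 1] (illustrative choice).
X = np.concatenate([[0.0], np.cumsum(rng.normal(0.0, np.sqrt(1.0 / n), size=n))])

approx = compensated_sum(np.sin, np.cos, X)
exact = np.cos(X[0]) - np.cos(X[-1])          # chain-rule value for the geometric lift
plain = np.sum(np.sin(X[:-1]) * np.diff(X))   # uncompensated left-point sum, for comparison
```

Comparing `plain` with `approx` shows how much the term $DF(X_s)\mathbb{X}_{s,t}$ contributes when the path is not better than $\tfrac12$-Hölder: it does not vanish in the limit.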

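Lemma 4.1. Let F : V → L(V, W ) be a $C^2_b$ function and let $(X, \mathbb{X}) \in \mathscr{C}^\alpha$ for some
$\alpha > \tfrac13$. Set $Y_s := F(X_s)$, $Y'_s := DF(X_s)$ and $R^Y_{s,t} := Y_{s,t} - Y'_s X_{s,t}$. Then
\[ Y, Y' \in C^\alpha \quad \text{and} \quad R^Y \in C^{2\alpha}_2. \tag{4.7} \]
(In the terminology of the forthcoming Definition 4.6: “Y is controlled by X with
Gubinelli derivative Y′; in symbols $(Y, Y') \in \mathscr{D}^{2\alpha}_X$”.) More precisely, we have the
estimates
\begin{align*}
\|Y\|_\alpha &\le \|DF\|_\infty \|X\|_\alpha, \\
\|Y'\|_\alpha &\le \|D^2F\|_\infty \|X\|_\alpha, \\
\|R^Y\|_{2\alpha} &\le \tfrac12 \|D^2F\|_\infty \|X\|_\alpha^2.
\end{align*}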

³ Recall that $\lim_{|\mathcal{P}| \to 0}$ means convergence along any sequence $(\mathcal{P}_n)$ with mesh $|\mathcal{P}_n| \to 0$, with
identical limit along each such sequence. In particular, it is not enough to establish convergence
along a particular sequence $(\mathcal{P}_n)$, although a particular sequence may be used to identify the limit.
⁴ Of course, we can and will consider intervals other than [0, 1]. Without further notice, $\mathcal{P}$ always
denotes a partition of the interval under consideration.

Proof. $C^2_b$ regularity of F implies that F and DF are both Lipschitz continuous with
Lipschitz constants $\|DF\|_\infty$ and $\|D^2F\|_\infty$ respectively. The α-Hölder bounds on Y
and Y′ are then immediate. For the remainder term, consider the function
\[ [0,1] \ni \xi \mapsto F(X_s + \xi X_{s,t}). \]
A Taylor expansion, with intermediate value remainder, yields ξ ∈ (0, 1) such that
\[ R^Y_{s,t} = F(X_t) - F(X_s) - DF(X_s) X_{s,t} = \tfrac12 D^2F(X_s + \xi X_{s,t})(X_{s,t}, X_{s,t}). \]
The claimed 2α-Hölder estimate, in the sense that $|R^Y_{s,t}| \lesssim |t-s|^{2\alpha}$, then follows at
once. □
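
The pointwise bounds behind Lemma 4.1 can be sanity-checked numerically. This is only a sketch under illustrative assumptions: the scalar case with F = sin (so that $\|DF\|_\infty = \|D^2F\|_\infty = 1$) and arbitrary sample values standing in for $X_s$. It checks $|Y_{s,t}| \le \|DF\|_\infty |X_{s,t}|$ and $|R^Y_{s,t}| \le \tfrac12 \|D^2F\|_\infty |X_{s,t}|^2$ for all pairs, from which the Hölder estimates follow.

```python
import numpy as np

rng = np.random.default_rng(1)
X = 0.1 * np.cumsum(rng.normal(size=200))   # arbitrary sample values of the path (assumption)
Y, Yp = np.sin(X), np.cos(X)                # Y = F(X), Y' = DF(X) with F = sin

# Entry [s, t] holds the increment from index s to index t.
dX = X[None, :] - X[:, None]
dY = Y[None, :] - Y[:, None]
R = dY - Yp[:, None] * dX                   # R^Y_{s,t} = Y_{s,t} - Y'_s X_{s,t}

sup_DF, sup_D2F = 1.0, 1.0                  # sup|cos| and sup|sin|
ok_Y = bool(np.all(np.abs(dY) <= sup_DF * np.abs(dX) + 1e-12))
ok_R = bool(np.all(np.abs(R) <= 0.5 * sup_D2F * dX ** 2 + 1e-12))
```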

Before we prove that the rough integral (4.6) exists, we discuss some sort of
abstract Riemann integration. In what follows, at first reading, one may have in mind
the construction of a Riemann–Stieltjes (or Young) integral $Z_t := \int_0^t Y_r \, dX_r$. From
Young’s inequality (4.3), one has (with $Z_{s,t} = Z_t - Z_s$ as usual)
\[ Z_{s,t} = Y_s X_{s,t} + o(|t-s|) \]
and $\Xi_{s,t} := Y_s X_{s,t}$ is a sufficiently good local approximation in the sense that it
fully determines the integral Z via the limiting procedure given in (4.1). In this
sense $Z = \mathcal{I}\Xi$ is the well-defined image of Ξ under some abstract integration map
$\mathcal{I}$. Note that $Z_{s,t} = Z_{s,u} + Z_{u,t}$, i.e. increments are additive (or “multiplicative” if
one regards + as group operation⁵) whereas a similar property fails for Ξ. In the
language of [Lyo98], such a Ξ corresponds to an “almost multiplicative functional”
and it is a key result in the theory that there is a unique associated “multiplicative
functional” (here: $Z = \mathcal{I}\Xi$). Following [FdLP06] we call “sewing” the step from a
(good enough) local approximation Ξ to some (abstract) integral $\mathcal{I}\Xi$; the concrete
estimate which quantifies how well $\mathcal{I}\Xi$ is approximated by Ξ will be called “sewing
lemma”. It plays an analogous role to Davie’s lemma (cf. Section 8.7) in the context
of (rough) differential equations.
We now formalise what we mean by Ξ being a good enough local approximation.
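For this, we introduce the space $C_2^{\alpha,\beta}([0,T], W)$ of functions Ξ from the 2-simplex
{(s, t) : 0 ≤ s ≤ t ≤ T } into W such that $\Xi_{t,t} = 0$ and such that
\[ \|\Xi\|_{\alpha,\beta} \overset{\text{def}}{=} \|\Xi\|_\alpha + \|\delta\Xi\|_\beta < \infty, \tag{4.8} \]
where $\|\Xi\|_\alpha = \sup_{s<t} \frac{|\Xi_{s,t}|}{|t-s|^\alpha}$ as usual, and also
\[ \delta\Xi_{s,u,t} \overset{\text{def}}{=} \Xi_{s,t} - \Xi_{s,u} - \Xi_{u,t}, \qquad \|\delta\Xi\|_\beta \overset{\text{def}}{=} \sup_{s<u<t} \frac{|\delta\Xi_{s,u,t}|}{|t-s|^\beta}. \]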

⁵ This terminology becomes natural if one considers Z together with its iterated integrals as
a group-valued path, increments of which satisfy Chen’s “multiplicative” relation, see (2.8).

Provided that β > 1, it turns out that such functions are “almost” of the form
Ξs,t = Ft − Fs , for some α-Hölder continuous function F (they would be if and
only if δΞ = 0). Indeed, it is possible to construct in a canonical way a function Ξ̂
with δ Ξ̂ = 0 and such that Ξ̂s,t ≈ Ξs,t for |t − s| ≪ 1:

Lemma 4.2 (Sewing lemma). Let α and β be such that 0 < α ≤ 1 < β. Then,
there exists a unique continuous linear map $\mathcal{I} : C_2^{\alpha,\beta}([0,T], W) \to C^\alpha([0,T], W)$
such that $(\mathcal{I}\Xi)_0 = 0$ and
\[ \big| (\mathcal{I}\Xi)_{s,t} - \Xi_{s,t} \big| \le C |t-s|^\beta, \tag{4.9} \]
where C only depends on β and $\|\delta\Xi\|_\beta$. (The α-Hölder norm of $\mathcal{I}\Xi$ also depends
on $\|\Xi\|_\alpha$ and hence on $\|\Xi\|_{\alpha,\beta}$.)

Proof. Since $\mathcal{I}$ is a linear map, its continuity will be an immediate consequence of its
boundedness. We shall construct the path $\mathcal{I}\Xi =: I$, with $I_0 = 0$, via its increments
$I_{s,t} = I_t - I_s$. Additivity of these increments (δI = 0) is an important aspect of the
proof. Uniqueness of I is immediate: assuming two paths I and $\bar I$ both satisfy (4.9),
it follows that $I - \bar I$ satisfies $(I - \bar I)_0 = 0$ and
$|(I - \bar I)_{s,t}| = |(I - \bar I)_t - (I - \bar I)_s| \lesssim |t-s|^\beta$.
Since β > 1 by assumption, we conclude that $I - \bar I$ vanishes identically. In
fact, (4.9) shows that I is necessarily given as Riemann-type limit: writing $\mathcal{P}$ for a
partition of [s, t] and $|\mathcal{P}|$ for its mesh size, we have
\[ \Big| I_{s,t} - \sum_{[u,v] \in \mathcal{P}} \Xi_{u,v} \Big| = \Big| \sum_{[u,v] \in \mathcal{P}} \big( I_{u,v} - \Xi_{u,v} \big) \Big| = O\big( |\mathcal{P}|^{\beta-1} \big), \]
which is nothing but a quantitative form of
\[ (\mathcal{I}\Xi)_{s,t} = \lim_{|\mathcal{P}| \to 0} \sum_{[u,v] \in \mathcal{P}} \Xi_{u,v}. \tag{4.10} \]

Because of its importance we give two independent but related arguments. The
first argument is based on successive (dyadic) refinement to construct $I_{s,t}$ with the
desired bound (4.9), followed by an argument for additivity. Fix [s, t] ⊂ [0, T ] and
let $\mathcal{P}_n$ be the level-n dyadic partition of [s, t], which contains $2^n$ intervals, each of
length $2^{-n}|t-s|$, starting with the trivial partition $\mathcal{P}_0 = \{[s,t]\}$. Define $I^0_{s,t} = \Xi_{s,t}$
and then the level-(n+1) approximation by
\[ I^{n+1}_{s,t} \overset{\text{def}}{=} \sum_{[u,v] \in \mathcal{P}_{n+1}} \Xi_{u,v} = I^n_{s,t} - \sum_{[u,v] \in \mathcal{P}_n} \delta\Xi_{u,m,v}, \]
where m = (u + v)/2 denotes the midpoint of [u, v] and it is a straightforward exercise
to check that the second equality holds. It then
follows immediately from the definition of $\|\delta\Xi\|_\beta$ that
\[ \big| I^{n+1}_{s,t} - I^n_{s,t} \big| \le 2^{n(1-\beta)} |t-s|^\beta \|\delta\Xi\|_\beta. \]
Since β > 1, these terms are summable whence we conclude that the sequence
$(I^n_{s,t} : n \in \mathbb{N})$ is Cauchy. Its limit $I_{s,t}$ is such that, summing up the bound above,
\[ \big| I_{s,t} - \Xi_{s,t} \big| \le \sum_{n \ge 0} \big| I^{n+1}_{s,t} - I^n_{s,t} \big| \le C \|\delta\Xi\|_\beta |t-s|^\beta, \tag{4.11} \]

for some universal constant C depending only on β, which is precisely the required
bound (4.9). Unfortunately, additivity of I is no consequence of this argument so
we have to be a little smarter (but see Remark 4.3). Taking T = 1 without loss of
generality (and for notational simplicity only), we restrict the previous construction
to elementary dyadic intervals of the form $[s,t] = 2^{-k}[\ell, \ell+1]$ for some k ≥ 0 and
$\ell \in \{0, \dots, 2^k - 1\}$. The advantage is that now mid-point additivity holds in the
sense that
\[ I_{s,t} = I_{s,u} + I_{u,t}, \qquad u = \frac{s+t}{2}, \tag{4.12} \]
as a simple consequence of taking limits in the identity $I^{n+1}_{s,t} = I^n_{s,u} + I^n_{u,t}$. The
natural additive extension of I to non-elementary dyadic intervals $2^{-k}[\ell, m]$ is then
given by postulating that
\[ I_{2^{-k}\ell,\, 2^{-k}m} = \sum_{j=\ell}^{m-1} I_{2^{-k}j,\, 2^{-k}(j+1)}, \tag{4.13} \]
which is indeed well-defined (note that $2^{-k}[\ell, m] = 2^{-k-1}[2\ell, 2m]$ for example
so (4.13) can be written in several ways) by (4.12). This defines $I_{s,t}$ for all dyadic
numbers s, t and the construction guarantees additivity. We leave the fact that $I_{s,t}$
satisfies (4.9) for all dyadic s, t (and therefore for all s, t ∈ [0, 1] by continuous
extension) as Exercise 4.3.
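
The dyadic refinement step of this first argument can be traced numerically. The sketch below rests on illustrative assumptions that are not from the text: the Young-type germ $\Xi_{s,t} = Y_s X_{s,t}$ with $Y_t = t$, $X_t = t^2$ on [0, 1], for which $|\delta\Xi_{s,u,t}| = (u-s)(t-u)(t+u) \le \tfrac12 |t-s|^2$, so one may take β = 2 and $\|\delta\Xi\|_\beta \le \tfrac12$. The function names are hypothetical.

```python
def xi(s, t):
    """Germ Xi_{s,t} = Y_s X_{s,t} for the illustrative choice Y_t = t, X_t = t^2."""
    return s * (t * t - s * s)

def dyadic_sum(s, t, n):
    """I^n_{s,t}: sum of Xi over the level-n dyadic partition of [s, t]."""
    h = (t - s) / 2 ** n
    return sum(xi(s + k * h, s + (k + 1) * h) for k in range(2 ** n))

s, t, beta = 0.0, 1.0, 2.0
delta_norm = 0.5  # upper bound for ||delta Xi||_beta with beta = 2 on [0, 1]

levels = [dyadic_sum(s, t, n) for n in range(12)]
steps = [abs(levels[n + 1] - levels[n]) for n in range(11)]
bounds = [2 ** (n * (1 - beta)) * (t - s) ** beta * delta_norm for n in range(11)]
```

Each entry of `steps` should stay below the matching entry of `bounds`, and `levels` should approach the Riemann–Stieltjes value $\int_0^1 t \, d(t^2) = 2/3$.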
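The second argument, which is essentially due to Young, yields immediately the
convergence (4.10), as $|\mathcal{P}| \to 0$, i.e. the same limit is obtained along any sequence
$\mathcal{P}_n$ with mesh tending to zero. This has the important consequence that additivity of
increments (δI = 0) is a consequence of (4.10) and requires no additional argument.
(Another advantage of Young’s construction is that it also works under variation
rather than Hölder type assumptions and thus in applications allows one to deal with
jumps.) Consider a partition $\mathcal{P}$ of [s, t] and let r ≥ 1 be the number of intervals in $\mathcal{P}$.
We write $\int_{\mathcal{P}} \Xi$ for the sum $\sum_{[u,v] \in \mathcal{P}} \Xi_{u,v}$ and $\mathcal{P}^\circ$ for the set of interior points of $\mathcal{P}$.
When r ≥ 2 there exists u ∈ [s, t] such that $[u_-, u], [u, u_+] \in \mathcal{P}$ and
\[ |u_+ - u_-| \le \frac{2}{r-1} |t-s|. \]
Indeed, assuming otherwise gives the contradiction
$2|t-s| \ge \sum_{u \in \mathcal{P}^\circ} |u_+ - u_-| > 2|t-s|$. Hence,
$\big| \int_{\mathcal{P} \setminus \{u\}} \Xi - \int_{\mathcal{P}} \Xi \big| = |\delta\Xi_{u_-,u,u_+}| \le \|\delta\Xi\|_\beta \big( 2|t-s|/(r-1) \big)^\beta$
and by iterating this procedure until the partition is reduced to $\mathcal{P} = \{[s,t]\}$, we arrive
at the maximal inequality,
\[ \sup_{\mathcal{P} \subset [s,t]} \Big| \Xi_{s,t} - \int_{\mathcal{P}} \Xi \Big| \le 2^\beta \|\delta\Xi\|_\beta \, \zeta(\beta) |t-s|^\beta, \]
where ζ denotes the classical ζ function. It then remains to show that
\[ \sup_{|\mathcal{P}| \vee |\mathcal{P}'| < \varepsilon} \Big| \int_{\mathcal{P}} \Xi - \int_{\mathcal{P}'} \Xi \Big| \to 0 \quad \text{as } \varepsilon \downarrow 0, \tag{4.14} \]
which implies existence of $\mathcal{I}\Xi$ as the limit $\lim_{|\mathcal{P}| \to 0} \int_{\mathcal{P}} \Xi$. To this end, at the price
of adding / subtracting $\mathcal{P} \cup \mathcal{P}'$, we can assume without loss of generality that $\mathcal{P}'$
refines $\mathcal{P}$. In particular, then $|\mathcal{P}| \vee |\mathcal{P}'| = |\mathcal{P}|$ and
\[ \int_{\mathcal{P}} \Xi - \int_{\mathcal{P}'} \Xi = \sum_{[u,v] \in \mathcal{P}} \Big( \Xi_{u,v} - \int_{\mathcal{P}' \cap [u,v]} \Xi \Big). \]
But then, for any $\mathcal{P}$ with $|\mathcal{P}| \le \varepsilon$ we can use the maximal inequality to see that
\[ \Big| \int_{\mathcal{P}} \Xi - \int_{\mathcal{P}'} \Xi \Big| \le 2^\beta \zeta(\beta) \|\delta\Xi\|_\beta \sum_{[u,v] \in \mathcal{P}} |v-u|^\beta = O\big( |\mathcal{P}|^{\beta-1} \big) = O(\varepsilon^{\beta-1}). \]
This concludes the Young argument (with no hidden tedium left to the reader). □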
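
The point-removal step of Young's argument lends itself to a small experiment. The sketch below makes illustrative assumptions that are not from the text: a random partition of [0, 1], and the same toy germ $\Xi_{s,t} = Y_s X_{s,t}$ with $Y_t = t$, $X_t = t^2$ (so β = 2, $\|\delta\Xi\|_\beta \le \tfrac12$, ζ(2) = π²/6). It removes, at each step, the interior point u with the smallest $|u_+ - u_-|$ and records that gap.

```python
import math
import random

def xi(s, t):
    """Illustrative germ Xi_{s,t} = Y_s X_{s,t} with Y_t = t, X_t = t^2."""
    return s * (t * t - s * s)

def partition_sum(pts):
    """Sum of Xi over the intervals of the partition with sorted points pts."""
    return sum(xi(u, v) for u, v in zip(pts[:-1], pts[1:]))

def coarsen(pts):
    """Remove interior points one at a time, always the one with smallest |u_+ - u_-|.
    Returns a list of (r, gap), r being the number of intervals before the removal."""
    pts = list(pts)
    record = []
    while len(pts) > 2:
        r = len(pts) - 1
        i = min(range(1, r), key=lambda k: pts[k + 1] - pts[k - 1])
        record.append((r, pts[i + 1] - pts[i - 1]))
        del pts[i]
    return record

random.seed(0)
s, t = 0.0, 1.0
pts = [s] + sorted(random.random() for _ in range(50)) + [t]
record = coarsen(pts)

beta, delta_norm = 2.0, 0.5            # valid for this germ on [0, 1]
zeta_beta = math.pi ** 2 / 6           # zeta(2)
lhs = abs(xi(s, t) - partition_sum(pts))
rhs = 2 ** beta * delta_norm * zeta_beta * (t - s) ** beta
```

Every recorded gap should satisfy the pigeonhole bound $|u_+ - u_-| \le 2|t-s|/(r-1)$, and `lhs` should not exceed `rhs`, as the maximal inequality predicts.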

Remark 4.3. The first argument ultimately suffered from the tedium of checking
the additivity property $\delta \mathcal{I}\Xi = 0$. In some situations this extra step can be avoided,
notably in the case where all one wants are uniform rough path estimates for classical
Riemann–Stieltjes integrals. More precisely, consider the case that X : [0, T ] → V
is smooth, $\mathbb{X} = \int X \otimes dX$, and one is only interested in an error estimate for second
order approximations of Riemann–Stieltjes integrals, of the form
\[ \Big| \int_s^t F(X_r) \, dX_r - F(X_s) X_{s,t} - DF(X_s) \mathbb{X}_{s,t} \Big| \le O\big( |t-s|^{3\alpha} \big), \]
uniform over all (smooth) paths X with $\|X\|_\alpha + \|\mathbb{X}\|_{2\alpha}$ bounded. In the context of
the above proof, this estimate is contained in the first step, applied with (cf. the proof
of Theorem 4.4)
\[ \Xi_{s,t} = F(X_s) X_{s,t} + DF(X_s) \mathbb{X}_{s,t}. \]
But here we know already from classical Riemann integration theory that $(\mathcal{I}\Xi)_{s,t}$,
constructed as limit of dyadic partitions of [s, t], is precisely the Riemann–Stieltjes
integral $\int_s^t F(X_r) \, dX_r$ and therefore additive. (The contribution of $DF(X)\mathbb{X}$ effectively
constitutes a higher-order approximation and surely does not affect the limit,
as can be seen from the estimate $|\mathbb{X}_{u,v}| \lesssim |v-u|^2$, thanks to smoothness of X.)
We now apply the sewing lemma to the construction of (4.6).
Theorem 4.4 (Lyons). Let $\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}^\alpha([0,T], V)$ for some T > 0 and
$\alpha > \tfrac13$, and let F : V → L(V, W ) be a $C^2_b$ function. Then, the rough integral defined
in (4.6) exists and one has the bound
\[ \Big| \int_s^t F(X_r) \, d\mathbf{X}_r - F(X_s) X_{s,t} - DF(X_s) \mathbb{X}_{s,t} \Big| \lesssim \|F\|_{C^2_b} \big( \|X\|_\alpha^3 + \|X\|_\alpha \|\mathbb{X}\|_{2\alpha} \big) |t-s|^{3\alpha}, \tag{4.15} \]
where the proportionality constant depends only on α. Furthermore, the indefinite
rough integral is α-Hölder continuous on [0, T ] and we have the following quantitative
estimate,
\[ \Big\| \int_0^{\cdot} F(X) \, d\mathbf{X} \Big\|_\alpha \le C \|F\|_{C^2_b} \big( |||\mathbf{X}|||_\alpha \vee |||\mathbf{X}|||_\alpha^{1/\alpha} \big), \tag{4.16} \]
where the constant C only depends on T and α and can be chosen uniformly in
T ≤ 1. Furthermore, $|||\mathbf{X}|||_\alpha = \|X\|_\alpha + \sqrt{\|\mathbb{X}\|_{2\alpha}}$ denotes again the homogeneous
α-Hölder rough path norm.

Remark 4.5. We will see in Section 4.4 that the map $(X, \mathbb{X}) \in \mathscr{C}^\alpha \mapsto \int_0^{\cdot} F(X) \, d\mathbf{X} \in C^\alpha$
is continuous in α-Hölder rough path metric.
Proof. Let us stress the fact that the argument given here only relies on the properties
of the integrand Y = F (X) collected in Lemma 4.1 above. In particular, the general-
isation to “extended” integrands (Y, Y ′ ), which replace (F (X), DF (X)), subject to
(4.7), will be immediate. (We shall develop this “Gubinelli” point of view further in
Section 4.3 below.)
The result follows as a consequence of Lemma 4.2. With the notation that we just
introduced, the classical Young integral [You36] can be defined as the usual limit of
Riemann sums by
\[ \int_s^t Y_r \, dX_r = (\mathcal{I}\Xi)_{s,t}, \qquad \Xi_{s,t} = Y_s X_{s,t}. \]
Unfortunately, this definition satisfies the identity
\[ \delta\Xi_{s,u,t} = -Y_{s,u} X_{u,t}, \]
so that, except in trivial cases, the required bound (4.8) is satisfied only if Y and
X are Hölder continuous with Hölder exponents adding up to β > 1. In order to
be able to cover the situation $\alpha < \tfrac12$, it follows that we need to consider a better
approximation to the Riemann sums, as discussed above. To this end, we use the
notation from Lemma 4.1, namely
\[ Y_s := F(X_s), \qquad Y'_s := DF(X_s) \qquad \text{and} \qquad R^Y_{s,t} := Y_{s,t} - Y'_s X_{s,t}, \]
and then set $\Xi_{s,t} = Y_s X_{s,t} + Y'_s \mathbb{X}_{s,t}$. Note that, for any u ∈ (s, t), we have the
identity
\[ \delta\Xi_{s,u,t} = -R^Y_{s,u} X_{u,t} - Y'_{s,u} \mathbb{X}_{u,t}. \]
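
For the reader's convenience, here is a sketch of the short computation behind this identity. It uses only the definition of δΞ, the additivity $X_{s,t} = X_{s,u} + X_{u,t}$, and Chen's relation $\mathbb{X}_{s,t} - \mathbb{X}_{s,u} - \mathbb{X}_{u,t} = X_{s,u} \otimes X_{u,t}$, together with the identification of $Y'_s$ with an element of $L(V \otimes V, W)$:
\begin{align*}
\delta\Xi_{s,u,t}
&= Y_s (X_{s,t} - X_{s,u}) - Y_u X_{u,t} + Y'_s (\mathbb{X}_{s,t} - \mathbb{X}_{s,u}) - Y'_u \mathbb{X}_{u,t} \\
&= -Y_{s,u} X_{u,t} + Y'_s \big( \mathbb{X}_{u,t} + X_{s,u} \otimes X_{u,t} \big) - Y'_u \mathbb{X}_{u,t} \\
&= -\big( Y_{s,u} - Y'_s X_{s,u} \big) X_{u,t} - Y'_{s,u} \mathbb{X}_{u,t}
 = -R^Y_{s,u} X_{u,t} - Y'_{s,u} \mathbb{X}_{u,t}.
\end{align*}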
Thanks to the α-Hölder regularity of X, Y′ and the 2α-regularity of $R^Y$, $\mathbb{X}$, the triangle
inequality shows that (4.8) holds true with the given α > 1/3 and β := 3α > 1. The
fact that the integral is well-defined, and the bound
\[ \Big| \int_s^t Y \, d\mathbf{X} - Y_s X_{s,t} - Y'_s \mathbb{X}_{s,t} \Big| \lesssim \big( \|X\|_\alpha \|R^Y\|_{2\alpha} + \|\mathbb{X}\|_{2\alpha} \|Y'\|_\alpha \big) |t-s|^{3\alpha} \tag{4.17} \]
then follow immediately from (4.11). Upon substituting the estimate obtained in
Lemma 4.1, we obtain (4.15).
We now turn to the proof of (4.16). Writing $Z = \int F(X) \, d\mathbf{X}$ and using the triangle
inequality in (4.15) gives
\begin{align*}
|Z_{s,t}| &\le \|F\|_\infty |X_{s,t}| + \|DF\|_\infty |\mathbb{X}_{s,t}| + C \|F\|_{C^2_b} \big( \|X\|_\alpha^3 + \|X\|_\alpha \|\mathbb{X}\|_{2\alpha} \big) |t-s|^{3\alpha} \\
&\le C \|F\|_{C^2_b} \big[ A_1 |t-s|^{\alpha} + A_2 |t-s|^{2\alpha} + A_3 |t-s|^{3\alpha} \big],
\end{align*}
with $A_i \le |||\mathbf{X}|||_\alpha^i$, for 1 ≤ i ≤ 3. Allowing C to change, this already implies
\[ \|Z\|_\alpha \le C \|F\|_{C^2_b} \big( |||\mathbf{X}|||_\alpha \vee |||\mathbf{X}|||_\alpha^3 \big), \]
which is the claimed estimate (4.16) in the limit α ↓ 1/3. However, one can do better
by realising that the above estimate is best for |t − s| small, whereas for t − s large
it is better to split up $|Z_{s,t}|$ into the sum of small increments. To make this more
precise, set $\varrho := |||\mathbf{X}|||_\alpha$ and write (hiding the factor C = C(α, T ) in ≲ below)
\begin{align*}
|Z_{s,t}| &\lesssim \varrho |t-s|^{\alpha} + \varrho^2 |t-s|^{2\alpha} + \varrho^3 |t-s|^{3\alpha} \\
&\le 3 \varrho |t-s|^{\alpha} \qquad \text{for } \varrho^{1/\alpha} |t-s| \le 1.
\end{align*}
Increments of Z over [s, t] with length greater than $h := \varrho^{-1/\alpha}$ are handled by
cutting them into pieces of length h. More precisely (cf. Exercise 4.5) we have
$\|Z\|_{\alpha;h} \le 3\varrho$ which entails
\[ \|Z\|_\alpha \le 3\varrho \big( 1 \vee 2 h^{-(1-\alpha)} \big) \le 6 \big( \varrho \vee \varrho^{1/\alpha} \big). \]
At last, we note that C = C(α, T ) can be chosen uniformly in T ≤ 1. □


4.3 Integration of controlled rough paths

Motivated by Lemma 4.1 and the observation that rough integration essentially relies
on the properties (4.7) we introduce the notion of a controlled path Y , relative to
some “reference” path X, due to Gubinelli [Gub04]. For the sake of the following
definition we assume that Y takes values in some Banach space, say W̄ . When
it comes to the definition of a rough integral we typically take W̄ = L(V, W );
although other choices can be useful (see e.g. Remark 4.12). In the context of rough
differential equations, with solutions in W̄ = W , we actually need to integrate


f (Y ), which will be seen to be controlled by X for sufficiently smooth coefficients
f : W → L(V, W ).
Definition 4.6. Given a path $X \in C^\alpha([0,T], V)$, we say that $Y \in C^\alpha([0,T], \bar W)$ is
controlled by X if there exists $Y' \in C^\alpha([0,T], L(V, \bar W))$ so that the remainder term
$R^Y$ given implicitly through the relation
\[ Y_{s,t} = Y'_s X_{s,t} + R^Y_{s,t}, \tag{4.18} \]
satisfies $\|R^Y\|_{2\alpha} < \infty$. This defines the space of controlled rough paths,
\[ (Y, Y') \in \mathscr{D}^{2\alpha}_X([0,T], \bar W). \]
Although Y′ is not, in general, uniquely determined from Y (cf. Remark 4.7 and
Section 6 below) we call any such Y′ the Gubinelli derivative of Y (with respect to
X).
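Here, $R^Y_{s,t}$ takes values in $\bar W$, and the norm $\| \cdot \|_{2\alpha}$ for a function with two
arguments is given by (2.3) as before. We endow the space $\mathscr{D}^{2\alpha}_X$ with the seminorm
\[ \|Y, Y'\|_{X,2\alpha} \overset{\text{def}}{=} \|Y'\|_\alpha + \|R^Y\|_{2\alpha}. \tag{4.19} \]
As in the case of classical Hölder spaces, $\mathscr{D}^{2\alpha}_X$ is a Banach space under the norm
$(Y, Y') \mapsto |Y_0| + |Y'_0| + \|Y, Y'\|_{X,2\alpha}$. This quantity also controls the α-Hölder
regularity of Y since, uniformly over X bounded in α-Hölder seminorm,
\begin{align*}
\|Y\|_\alpha &\le \|Y'\|_\infty \|X\|_\alpha + T^\alpha \|R^Y\|_{2\alpha} \le |Y'_0| \|X\|_\alpha + T^\alpha \big\{ \|Y'\|_\alpha \|X\|_\alpha + \|R^Y\|_{2\alpha} \big\} \\
&\le (1 + \|X\|_\alpha) \big( |Y'_0| + T^\alpha \|Y, Y'\|_{X,2\alpha} \big) \lesssim |Y'_0| + T^\alpha \|Y, Y'\|_{X,2\alpha}. \tag{4.20}
\end{align*}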

Remark 4.7. Since we only assume that ∥Y ∥α < ∞, but then impose that ∥RY ∥2α <
∞, it is in general the case that a genuine cancellation takes place in (4.18). The
question arises to what extent Y determines Y ′ . Somewhat contrary to the classical
situation, where a smooth function has a unique derivative, too much regularity of
the underlying rough path X leads to less information about Y ′ . For instance, if Y is
smooth, or in fact in C 2α , and the underlying rough path X happens to have a path
component X that is also C 2α , then we may take Y ′ = 0, but as a matter of fact
any continuous path Y ′ would satisfy (4.18) with ∥R∥2α < ∞. On the other hand,
if X is far from smooth, i.e. genuinely rough on all (small) scales, uniformly in all
directions, then Y ′ is uniquely determined by Y , cf. Section 6 below.
Remark 4.8. It is important to note that while the space of rough paths $\mathscr{C}^\alpha$ is not
even a vector space, the space $\mathscr{D}^{2\alpha}_X$ is a perfectly normal Banach space for any given
$\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}^\alpha$. The twist of course is that the space in question depends in a
crucial way on the choice of X. The set of all pairs (X; (Y, Y′)) gives rise to the total
space
\[ \mathscr{C}^\alpha \ltimes \mathscr{D}^{2\alpha} \overset{\text{def}}{=} \bigsqcup_{\mathbf{X} \in \mathscr{C}^\alpha} \{\mathbf{X}\} \times \mathscr{D}^{2\alpha}_X, \]
with base space $\mathscr{C}^\alpha$ and “fibres” $\mathscr{D}^{2\alpha}_X$. We will see in Exercise 4.9 that
$\mathscr{C}^\alpha \ltimes \mathscr{D}^{2\alpha}$ is actually a “trivial” infinite-dimensional fibre bundle in the sense that it is
homeomorphic to $\mathscr{C}^\alpha \times (C^{2\alpha} \oplus C^\alpha)$, albeit not in a canonical way. (At least when
$\alpha \ne \tfrac12$.) At the intuitive level, this clashes with the results of Chapter 6 which suggest
that, the rougher the underlying path X, the “smaller” is $\mathscr{D}^{2\alpha}_X$.

Remark 4.9. While the notion of “controlled rough path” has many appealing features,
it does not come with a natural approximation theory. To wit, consider
$(X, \mathbb{X}) \in \mathscr{C}^\alpha_g([0,T], \mathbb{R}^d)$ as limit of smooth paths $X^n : [0,T] \to \mathbb{R}^d$ in the sense
of Proposition 2.8. Then it is natural to approximate Y = F(X) by $Y^n = F(X^n)$,
which is again smooth (to the extent that F permits). There is no obvious analogue
of this for controlled rough paths. However, there is a non-canonical approximation
result, based on the Lyons–Victoir extension, which the reader is invited to explore
in Exercise 4.8.

We are now ready to extend Young’s integral to that of a path controlled by
X against $\mathbf{X} = (X, \mathbb{X})$. Recall from Lemma 4.1 that Y = F(X), with Y′ =
DF(X), is somewhat the prototype of a controlled rough path. The definition of the
rough integral $\int F(X) \, d\mathbf{X}$ in terms of compensated Riemann sums, cf. (4.6), then
immediately suggests defining the integral of Y against $\mathbf{X}$ by⁶
\[ \int_0^1 Y \, d\mathbf{X} \overset{\text{def}}{=} \lim_{|\mathcal{P}| \to 0} \sum_{[s,t] \in \mathcal{P}} \big( Y_s X_{s,t} + Y'_s \mathbb{X}_{s,t} \big), \tag{4.21} \]
where we took $\bar W = L(V, W)$ and used the canonical injection $L(V, L(V, W)) \hookrightarrow
L(V \otimes V, W)$ in writing $Y'_s \mathbb{X}_{s,t}$. With these notations, the resulting integral takes
values in W .
With these notations at hand, it is now straightforward to prove the following
result, which is a slight reformulation of [Gub04, Prop.1]:

Theorem 4.10 (Gubinelli). Let T > 0, let $\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}^\alpha([0,T], V)$ for some
$\alpha \in (\tfrac13, \tfrac12]$, and let $(Y, Y') \in \mathscr{D}^{2\alpha}_X([0,T], L(V, W))$. Then there exists a constant
C depending only on α such that
a) The integral defined in (4.21) exists and, for every pair s, t, one has the bound
\[ \Big| \int_s^t Y_r \, d\mathbf{X}_r - Y_s X_{s,t} - Y'_s \mathbb{X}_{s,t} \Big| \le C \big( \|X\|_\alpha \|R^Y\|_{2\alpha} + \|\mathbb{X}\|_{2\alpha} \|Y'\|_\alpha \big) |t-s|^{3\alpha}. \tag{4.22} \]
b) The map from $\mathscr{D}^{2\alpha}_X([0,T], L(V, W))$ to $\mathscr{D}^{2\alpha}_X([0,T], W)$ given by
\[ (Y, Y') \mapsto (Z, Z') := \Big( \int_0^{\cdot} Y_t \, d\mathbf{X}_t, \; Y \Big), \tag{4.23} \]
is a continuous linear map between Banach spaces and one has the bound⁷
\[ \|Z, Z'\|_{X,2\alpha} \le \|Y\|_\alpha + \|Y'\|_\infty \|\mathbb{X}\|_{2\alpha} + C T^\alpha \big( \|X\|_\alpha \|R^Y\|_{2\alpha} + \|\mathbb{X}\|_{2\alpha} \|Y'\|_\alpha \big). \]

⁶ Note the abuse of notation: we hide dependence on Y′ which in general affects the limit but is
usually clear from the context.




Proof. Part a) is an immediate consequence of Lemma 4.2, as already pointed out in
the proof of Theorem 4.4. The estimate (4.22) was pointed out explicitly in (4.17).
It remains to show the bound on $\|Z, Z'\|_{X,2\alpha}$. Splitting up the left-hand side
of (4.22) after the first term, using the triangle inequality, gives immediately an α-Hölder
estimate on $\int_s^t Y_r \, d\mathbf{X}_r = Z_{s,t}$, so that $Z \in C^\alpha$. ($Z' = Y \in C^\alpha$ is trivial,
by the very nature of Y since it is controlled by X.) Similarly, splitting up the
left-hand side of (4.22) after the second term, gives a 2α-Hölder type estimate on
$\int_s^t Y_r \, d\mathbf{X}_r - Y_s X_{s,t} = Z_{s,t} - Z'_s X_{s,t} =: R^Z_{s,t}$, i.e. on the remainder term in the sense
of (4.18). The explicit estimate for $\|Z, Z'\|_{X,2\alpha} = \|Y\|_\alpha + \|R^Z\|_{2\alpha}$ is then obvious.

Remark 4.11. One actually obtains better information than just $(Z, Z') \in \mathscr{D}^{2\alpha}_X$,
namely one has control up to order 3α in the sense that
\[ \big| Z_{s,t} - Y_s X_{s,t} - Y'_s \mathbb{X}_{s,t} \big| \lesssim |t-s|^{3\alpha}, \]
see (4.34). Similar consideration will lead to the more general concept of modelled
distribution in the theory of regularity structures, see in particular Definition 13.10.

Remark 4.12. As in the above theorem, assume that $(X, \mathbb{X}) \in \mathscr{C}^\alpha([0,T], V)$ and
consider Y and Z two paths controlled by X. More precisely, we assume
$(Y, Y') \in \mathscr{D}^{2\alpha}_X([0,T], L(\bar V, W))$ and $(Z, Z') \in \mathscr{D}^{2\alpha}_X([0,T], \bar V)$, where of course $V, \bar V, W$ are
all Banach spaces. Then, in terms of the abstract integration map $\mathcal{I}$ (cf. the sewing
lemma) we may define the integral of Y against Z, with values in W , as follows,
\[ \int_s^t Y_u \, dZ_u \overset{\text{def}}{=} (\mathcal{I}\Xi)_{s,t}, \qquad \Xi_{u,v} = Y_u Z_{u,v} + Y'_u Z'_u \mathbb{X}_{u,v}. \tag{4.24} \]
Here, we use the fact that $Z'_u \in L(V, \bar V)$ can be canonically identified with an operator
in $L(V \otimes V, V \otimes \bar V)$ by acting only on the second factor, and $Y'_u \in L(V, L(\bar V, W))$
is identified as before with an operator in $L(V \otimes \bar V, W)$. The reader may be helped
to see this spelled out in coordinates, assuming finite dimensions: using indices i, j
in $W, \bar V$ respectively, and then k, l in V :
\[ (\Xi_{u,v})^i = (Y_u)^i_j (Z_{u,v})^j + (Y'_u)^i_{k,j} (Z'_u)^j_l (\mathbb{X}_{u,v})^{k,l}. \]
A short computation, similar to the one that justified the application of the sewing
lemma for the construction of the rough integral introduced in (4.21), gives
\[ -\delta\Xi_{s,u,t} = R^Y_{s,u} Z_{u,t} + Y'_s X_{s,u} R^Z_{u,t} + Y'_s X_{s,u} Z'_{s,u} X_{u,t} + (Y'Z')_{s,u} \mathbb{X}_{u,t}. \]
It immediately follows that $\|\delta\Xi\|_{3\alpha} < \infty$ so that, since 3α > 1, the right-hand
side of (4.24) is well defined. The sewing lemma furthermore yields the following
generalisation of (4.22), with Ξ as given in (4.24),
\[ \Big| \int_s^t Y \, dZ - \Xi_{s,t} \Big| \lesssim \big( \|R^Y\|_{2\alpha} \|Z\|_\alpha + (*) + \|Y'Z'\|_\alpha \|\mathbb{X}\|_{2\alpha} \big) |t-s|^{3\alpha}, \tag{4.25} \]
with additional term
\[ (*) = \|Y'\|_\infty \|X\|_\alpha \big( \|R^Z\|_{2\alpha} + \|Z'\|_\alpha \|X\|_\alpha \big). \]
Note that (∗) duly vanishes when Z = X and Z′ is the identity operator, since then
$R^Z \equiv 0$ and Z′, constant in time, has vanishing α-Hölder seminorm. In that case, we
recover precisely the previously obtained estimate for the rough integral introduced
in (4.21). Furthermore, in the smooth case, one can check that we again recover the
usual Riemann / Young integral.

⁷ As in (4.20), this implies $\|Z, Z'\|_{X,2\alpha} \lesssim |Y'_0| + T^\alpha \|Y, Y'\|_{X,2\alpha}$, uniformly over bounded X.

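Remark 4.13. If, in the notation of the proof of Theorem 4.4, Ξ and $\tilde\Xi$ are such that
$\Xi - \tilde\Xi \in C_2^\beta$ for some β > 1, i.e.
\[ |\Xi_{s,t} - \tilde\Xi_{s,t}| = O(|t-s|^\beta), \]
then $\mathcal{I}\Xi = \mathcal{I}\tilde\Xi$. Indeed, it is immediate that
\[ \sum_{[u,v] \in \mathcal{P}} |\Xi_{u,v} - \tilde\Xi_{u,v}| = O\big( |\mathcal{P}|^{\beta-1} \big), \]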

which converges to 0 as |P| → 0. (This remains true if O(|t − s|β ) with β > 1 is
replaced by o(|t − s|).)
This also shows that, if X and Y are smooth functions and X is defined by (2.2),
the integral that we just defined does coincide with the usual Riemann–Stieltjes
integral. However, if we change X, then the resulting integral does change, as will be
seen in the next example.

Example 4.14. Let $f$ be a 2α-Hölder continuous function and let $\mathbf{X} = (X, \mathbb{X})$ and
$\bar{\mathbf{X}} = (\bar X, \bar{\mathbb{X}})$ be two rough paths such that

\[
\bar X_t = X_t , \qquad \bar{\mathbb{X}}_{s,t} = \mathbb{X}_{s,t} + f(t) - f(s) .
\]

Let furthermore $(Y, Y') \in \mathscr{D}_X^{2\alpha}$ as above. Then also $(\bar Y, \bar Y') := (Y, Y') \in \mathscr{D}_{\bar X}^{2\alpha}$.
However, it follows immediately from (4.21) that

\[
\int_s^t \bar Y_r \, d\bar{\mathbf{X}}_r = \int_s^t Y_r \, d\mathbf{X}_r + \int_s^t Y'_r \, df(r) . \tag{4.26}
\]

Here, the second term on the right-hand side is a simple Young integral, which is
well-defined since $\alpha + 2\alpha > 1$ by assumption.
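
The identity (4.26) can be checked numerically in a toy scalar setting. The following is a minimal sketch, not taken from the text, under these assumptions: $X_t = \sin t$ on $[0,1]$, integrand $Y = X^2$ with $Y' = 2X$, the canonical second level $\mathbb{X}_{s,t} = \tfrac12 (X_{s,t})^2$, and the perturbation $f(t) = t$. The helper name `compensated_sum` is hypothetical.

```python
import math

def compensated_sum(X, XX, Y, Yp, n, T=1.0):
    """Compensated Riemann sum  sum Y_u X_{u,v} + Y'_u XX_{u,v}  over n uniform steps."""
    total = 0.0
    for k in range(n):
        u, v = k * T / n, (k + 1) * T / n
        total += Y(u) * (X(v) - X(u)) + Yp(u) * XX(u, v)
    return total

# Assumed toy data: smooth scalar path X, integrand Y = F(X) with F(x) = x^2, Y' = F'(X).
X = math.sin
Y = lambda t: math.sin(t) ** 2
Yp = lambda t: 2.0 * math.sin(t)

# Canonical (geometric) second level in one dimension: XX_{s,t} = (X_{s,t})^2 / 2.
XX = lambda s, t: 0.5 * (X(t) - X(s)) ** 2
# Perturbed second level: XXbar_{s,t} = XX_{s,t} + f(t) - f(s), here with f(t) = t.
f = lambda t: t
XXbar = lambda s, t: XX(s, t) + f(t) - f(s)

n = 20000
I = compensated_sum(X, XX, Y, Yp, n)
Ibar = compensated_sum(X, XXbar, Y, Yp, n)

exact = math.sin(1.0) ** 3 / 3.0            # G(X_1) - G(X_0) with G(x) = x^3 / 3
correction = 2.0 * (1.0 - math.cos(1.0))    # int_0^1 Y'_r df(r) = int_0^1 2 sin(r) dr
print(I - exact, (Ibar - I) - correction)   # both differences should be small
```

With the canonical second level the sum reproduces the classical value $G(X_1) - G(X_0)$, and changing only the second level shifts the result by the Young integral of $Y'$ against $f$.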

Remark 4.15. As we will see in Section 5.2 below, (4.26) can be interpreted as a
generalisation of the usual expression relating Itô integrals to Stratonovich integrals.

Remark 4.16. The bound (4.22) does behave in a very natural way under dilations.
Indeed, the integral is invariant under the transformation

\[
(Y, Y', X, \mathbb{X}) \mapsto (\lambda^{-1} Y, \lambda^{-2} Y', \lambda X, \lambda^{2} \mathbb{X}) . \tag{4.27}
\]

The same is true for the right-hand side of (4.22), since under this dilation, we also
have $R^Y \mapsto \lambda^{-1} R^Y$.

4.4 Stability I: rough integration

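Consider $\mathbf{X} = (X, \mathbb{X})$, $\tilde{\mathbf{X}} = (\tilde X, \tilde{\mathbb{X}}) \in \mathscr{C}^\alpha$ with $(Y, Y') \in \mathscr{D}_X^{2\alpha}$, $(\tilde Y, \tilde Y') \in \mathscr{D}_{\tilde X}^{2\alpha}$. As
earlier, we consider a fixed time horizon $[0, T]$. Although $(Y, Y')$ and $(\tilde Y, \tilde Y')$ live,
in general, in different Banach spaces, the “distance”

\[
\|Y, Y' ; \tilde Y, \tilde Y'\|_{X,\tilde X,2\alpha}
  \stackrel{\mathrm{def}}{=} \|Y' - \tilde Y'\|_\alpha + \|R^Y - R^{\tilde Y}\|_{2\alpha}
\tag{4.28}
\]

will be useful. Even when $X = \tilde X$, it is not a proper metric for it fails to separate
$(Y, Y')$ and $(Y + cX + \bar c, Y' + c)$ for any two constants $c$ and $\bar c$. When $X \ne \tilde X$,
the assertion “zero distance implies $(Y, Y') = (\tilde Y, \tilde Y')$” does not even make sense.
(The two objects live in completely different spaces!) That said, for every fixed
$(X, \mathbb{X}) \in \mathscr{C}^\alpha$, one has (with $R^Y_{s,t} = Y_{s,t} - Y'_s X_{s,t}$ as usual), a canonical map

\[
\iota_X : (Y, Y') \in \mathcal{C}_X^{\alpha} \mapsto (Y', R^Y) \in \mathcal{C}^\alpha \oplus \mathcal{C}_2^{2\alpha} .
\]

Given $Y_0 = \xi$, this map is injective since one can reconstruct $Y$ by
$Y_t = \xi + Y'_0 X_{0,t} + R^Y_{0,t}$. From this point of view, one simply has

\[
\| \bullet \, ; * \|_{X,\tilde X,2\alpha} = \| \iota_X(\bullet) - \iota_{\tilde X}(*) \|_{\alpha,2\alpha} ,
\]

and one is back in a normal Banach setting, where $\| \bullet , \bullet \|_{\alpha,2\alpha} = \| \bullet \|_\alpha + \| \bullet \|_{2\alpha}$ is a
natural seminorm on $\mathcal{C}^\alpha \oplus \mathcal{C}_2^{2\alpha}$; cf. Exercise 2.7. Elementary estimates of the form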

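\[
|ab - \tilde a \tilde b| \le |a|\,|b - \tilde b| + |a - \tilde a|\,|\tilde b| \tag{4.29}
\]

then lead to, with a constant $C = C_R$,

\[
\begin{aligned}
|Y_{s,t} - \tilde Y_{s,t}|
 &= \big| \big( Y'_0 X_{s,t} + Y'_{0,s} X_{s,t} + R^Y_{s,t} \big)
      - \big( \tilde Y'_0 \tilde X_{s,t} + \tilde Y'_{0,s} \tilde X_{s,t} + R^{\tilde Y}_{s,t} \big) \big| \\
 &\le C |t-s|^\alpha \Big( |Y'_0 - \tilde Y'_0| + \|X - \tilde X\|_\alpha
      + \|Y'_{0,\cdot} - \tilde Y'_{0,\cdot}\|_\infty + \|R^Y - R^{\tilde Y}\|_\alpha \Big) \\
 &\le C |t-s|^\alpha \Big( |Y'_0 - \tilde Y'_0| + \|X - \tilde X\|_\alpha
      + T^\alpha \big( \|Y' - \tilde Y'\|_\alpha + \|R^Y - R^{\tilde Y}\|_{2\alpha} \big) \Big) ,
\end{aligned}
\]

provided $|Y'_0|$, $\|Y'\|_\infty$, $\|X\|_\alpha$, and also with tilde, are bounded by $R$. It follows that

\[
\|Y - \tilde Y\|_\alpha \le C \Big( \|X - \tilde X\|_\alpha + |Y'_0 - \tilde Y'_0|
   + T^\alpha \|Y, Y' ; \tilde Y, \tilde Y'\|_{X,\tilde X,2\alpha} \Big) . \tag{4.30}
\]

An estimate of the proper α-Hölder norm of $Y - \tilde Y$ (rather than its seminorm) is
obtained by adding $|Y_0 - \tilde Y_0|$ to both sides.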

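Theorem 4.17 (Stability of rough integration). For $\alpha \in (\tfrac13, \tfrac12]$ as before, consider

\[
\mathbf{X} = (X, \mathbb{X}), \ \tilde{\mathbf{X}} = (\tilde X, \tilde{\mathbb{X}}) \in \mathscr{C}^\alpha ,
\qquad (Y, Y') \in \mathscr{D}_X^{2\alpha} , \ (\tilde Y, \tilde Y') \in \mathscr{D}_{\tilde X}^{2\alpha}
\]

in a bounded set, in the sense

\[
|Y_0| + |Y'_0| + \|Y, Y'\|_{X,2\alpha} \le M , \qquad
\varrho_\alpha(0, \mathbf{X}) \equiv \|X\|_\alpha + \|\mathbb{X}\|_{2\alpha} \le M ,
\]

with identical bounds for $(\tilde X, \tilde{\mathbb{X}})$, $(\tilde Y, \tilde Y')$, for some $M < \infty$. Define

\[
(Z, Z') := \Big( \int_0^\cdot Y \, d\mathbf{X} , \, Y \Big) \in \mathscr{D}_X^{2\alpha} ,
\]

and similarly for $(\tilde Z, \tilde Z')$. Then, the following local Lipschitz estimates hold true,

\[
\|Z, Z' ; \tilde Z, \tilde Z'\|_{X,\tilde X,2\alpha}
 \le C \Big( \varrho_\alpha(\mathbf{X}, \tilde{\mathbf{X}}) + |Y'_0 - \tilde Y'_0|
   + T^\alpha \|Y, Y' ; \tilde Y, \tilde Y'\|_{X,\tilde X,2\alpha} \Big) ,
\tag{4.31}
\]

and also

\[
\|Z - \tilde Z\|_\alpha
 \le C \Big( \varrho_\alpha(\mathbf{X}, \tilde{\mathbf{X}}) + |Y_0 - \tilde Y_0| + |Y'_0 - \tilde Y'_0|
   + T^\alpha \|Y, Y' ; \tilde Y, \tilde Y'\|_{X,\tilde X,2\alpha} \Big) ,
\tag{4.32}
\]

where $C = C_M = C(M, \alpha)$ is a suitable constant.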

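Proof. (The reader is advised to review the proofs of Theorems 4.4, 4.10.) We first
note that (4.30) applied to $Z, \tilde Z$ (note: $Z'_0 - \tilde Z'_0 = Y_0 - \tilde Y_0$) shows that (4.32) is an
immediate consequence of the first estimate (4.31). Thus, we only need to discuss
the first estimate. By definition of the distance (4.28), we need to estimate

\[
\|Z' - \tilde Z'\|_\alpha + \|R^Z - R^{\tilde Z}\|_{2\alpha}
  = \|Y - \tilde Y\|_\alpha + \|R^Z - R^{\tilde Z}\|_{2\alpha} .
\]

Thanks to (4.30), the first summand is clearly bounded by the right-hand side of
(4.31). For the second summand we recall

\[
R^Z_{s,t} = Z_{s,t} - Z'_s X_{s,t} = \int_s^t Y \, d\mathbf{X} - Y_s X_{s,t}
  = (\mathcal{I}\Xi)_{s,t} - \Xi_{s,t} + Y'_s \mathbb{X}_{s,t}
\]

where $\Xi_{s,t} = Y_s X_{s,t} + Y'_s \mathbb{X}_{s,t}$ and similar for $R^{\tilde Z}$. Setting $\Delta = \Xi - \tilde\Xi$, we use
(4.11) with $\beta = 3\alpha$ and $\Xi$ replaced by $\Delta$, so that

\[
\begin{aligned}
|R^Z_{s,t} - R^{\tilde Z}_{s,t}|
 &= \big| (\mathcal{I}\Delta)_{s,t} - \Delta_{s,t} + Y'_s \mathbb{X}_{s,t} - \tilde Y'_s \tilde{\mathbb{X}}_{s,t} \big| \\
 &\le C \|\delta\Delta\|_{3\alpha} |t-s|^{3\alpha} + \big| Y'_s \mathbb{X}_{s,t} - \tilde Y'_s \tilde{\mathbb{X}}_{s,t} \big| ,
\end{aligned}
\]

where $\delta\Delta_{s,u,t} = R^{\tilde Y}_{s,u} \tilde X_{u,t} - R^Y_{s,u} X_{u,t} + \tilde Y'_{s,u} \tilde{\mathbb{X}}_{u,t} - Y'_{s,u} \mathbb{X}_{u,t}$. We then conclude with
some elementary estimates of the type (4.29), just like in the proof of Theorem 4.10.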

4.5 Controlled rough paths of lower regularity

Recall that we showed in Section 2.3 how an α-Hölder rough path $\mathbf{X}$ could be defined
as a path with values in the free step-$N$ nilpotent Lie group $G^{(N)}(\mathbb{R}^d) \subset T^{(N)}(\mathbb{R}^d)$,
with $N = \lfloor 1/\alpha \rfloor$. It does not seem obvious at all a priori how one would define a
controlled rough path in this context. One way of interpreting Definition 4.6 is as a
kind of local “Taylor expansion” up to order $2\alpha$. It seems natural in the light of the
previous subsections that if $\alpha \le \tfrac13$, a controlled rough path should have a kind of
“Taylor expansion” up to order $N\alpha$.
As a consequence, if we expand $\mathbf{X}_{s,t} \stackrel{\mathrm{def}}{=} \mathbf{X}_s^{-1} \otimes \mathbf{X}_t$ as

\[
\mathbf{X}_{s,t} = \sum_{|w| \le N} \mathbf{X}^w_{s,t} \, e_w ,
\]

where $|w|$ denotes the length of the word $w$, one would expect that a controlled rough
path should have an expansion of the form

\[
\delta Y_{s,t} = \sum_{|w| \le N-1} \mathbf{Y}^w_s \, \mathbf{X}^w_{s,t} + R^Y_{s,t} , \tag{4.33}
\]

with $|R^Y_{s,t}| \lesssim |t-s|^{N\alpha}$. Here, given a word $w = w_1 \cdots w_k$ with letters in $\{1, \ldots, d\}$,
we write $e_w = e_{w_1} \otimes \ldots \otimes e_{w_k}$ for the corresponding basis vector of $T^{(N)}(\mathbb{R}^d)$. As in
Section 2.4, we then identify the words themselves as the dual basis of $T^{(N)}(\mathbb{R}^d)^*$.
Note that $e_{\varnothing} = 1 \in \mathbb{R} \simeq (\mathbb{R}^d)^{\otimes 0} \subset T^{(N)}(\mathbb{R}^d)$.
Recall that in Definition 4.6 we also needed a regularity condition on the “derivative
process” $Y'$. The equivalent statement in the present context is that the $\mathbf{Y}^w_s$
should themselves be described by a local “Taylor expansion”, but this time only up
to order $(N - |w|)\alpha$. A neat way of packaging this into a compact statement is to
view a controlled rough path as a $T^{(N-1)}(\mathbb{R}^d)^*$-valued function. Definition 4.6 then
generalises as follows.⁸

Definition 4.18. Let $\alpha \in (0, 1)$, let $N = \lfloor 1/\alpha \rfloor$, and let $\mathbf{X}$ be a geometric α-Hölder
rough path as defined in Section 2.4. A controlled rough path is a $T^{(N-1)}(\mathbb{R}^d)^*$-valued
function $\mathbf{Y}$ such that, for every word $w$ with $|w| \le N - 1$, one has the
bound

\[
\big| \langle e_w , \mathbf{Y}_t \rangle - \langle \mathbf{X}_{s,t} \otimes e_w , \mathbf{Y}_s \rangle \big| \le C |t-s|^{(N - |w|)\alpha} . \tag{4.34}
\]

We call $\mathbf{Y}$ a lift of $Y_t := \langle e_{\varnothing} , \mathbf{Y}_t \rangle$ and write $\mathscr{D}_X^{N\alpha}$ for the space of such controlled
rough paths.

⁸ This is for $Y$ with values in $\mathbb{R}$, but the extension to vector-valued $Y$ is straightforward.
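It is convenient to write $\mathbf{Y}^w_t$ instead of $\langle e_w , \mathbf{Y}_t \rangle$. Given such a controlled rough path
$\mathbf{Y}$, it is then natural to define its integral against any component $X^i$ by

\[
Z_t \stackrel{\mathrm{def}}{=} \int_0^t \mathbf{Y}_s \, d\mathbf{X}^i_s
  = \lim_{|\mathcal{P}| \to 0} \sum_{[r,s] \in \mathcal{P}} \ \sum_{|w| \le N-1} \mathbf{Y}^w_r \, \langle \mathbf{X}_{r,s} , wi \rangle , \tag{4.35}
\]

where $wi$ denotes the concatenation of $w$ with the letter $i$. It turns out [Gub10, HK15]
that $Z$ can be lifted as controlled rough path $\mathbf{Z}$ in the sense of Definition 4.18. It
suffices to set $\mathbf{Z}^{\varnothing}_t \stackrel{\mathrm{def}}{=} \langle e_{\varnothing} , \mathbf{Z}_t \rangle = Z_t$,

\[
\langle e_w \otimes e_i , \mathbf{Z}_t \rangle \stackrel{\mathrm{def}}{=} \mathbf{Y}^w_t ,
\]

and $\mathbf{Z}^w_t = 0$ for all non-empty words $w$ that do not terminate with the letter $i$.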

4.6 Stochastic sewing

We saw in Theorem 4.10 that suitably controlled integrands, such as $F(B)$ with $F \in \mathcal{C}^2_b$,
can be integrated against a Brownian rough path $\mathbf{B} = (B, \mathbb{B})$, as constructed in
Chapter 3. In this case (see the proof of Theorem 4.4) one applies the sewing lemma
with $\tilde\Xi(s, t) = F(B_s) B_{s,t} + DF(B_s) \mathbb{B}_{s,t}$, crucially using that $\delta\tilde\Xi$ is of order
$3\alpha = 1 + \varepsilon > 1$, in the sense that $|\delta\tilde\Xi_{sut}| \lesssim |t-s|^{1+\varepsilon}$ uniformly over $s < u < t$
in $[0, T]$. We leave it to Chapter 5 to reconcile this construction a posteriori with
classical stochastic integration. In the present section we show that stochastic and
rough analysis can also be combined a priori; the resulting stochastic sewing lemma
obtained by K. Lê in [Lê18] has proved very useful in a number of recent applications.
The setting is similar to that of the sewing lemma, but the to-be-sewed two-parameter
function $\Xi$ is now a sufficiently integrable random field. As a running example,
consider the Itô left point approximation $\Xi_{s,t} = F(B_s) B_{s,t}$. With this choice
of $\Xi$ (i.e. without the term $DF(B_s) \mathbb{B}_{s,t}$), classical sewing fails since
$\delta\Xi_{s,u,t} = -F(B)_{s,u} B_{u,t}$ is at best of order $2\alpha < 1$. Note however that the martingale property
of Brownian motion makes this problem disappear upon inserting a conditional
expectation. Indeed, writing $E_s$ for the conditional expectation with respect to $\mathcal{F}_s$ for
some fixed filtration $\mathcal{F} = (\mathcal{F}_t)_{t \le T}$ such that $B$ is $\mathcal{F}$-adapted we have, always with
$s < u < t$,

\[
E_s \delta\Xi_{sut} = E_s E_u \delta\Xi_{sut} = -E_s \big( F(B)_{s,u} \, E_u B_{u,t} \big) = 0 .
\]

This is of course very similar to the reason why classical Itô integration works: even
though $\Xi_{s,t}$ is of size about $|t-s|^{1/2}$ so that there is no reason a priori to believe that
Riemann sums converge, they do so thanks to the stochastic cancellations encoded in
the fact that $E_s \Xi_{s,t} = 0$. The idea now is to obtain a version of the sewing lemma
which combines the “best of both worlds”: its assumptions should be strictly weaker
than those of Lemma 4.2 and it should exploit improvements from situations in which
the conditional expectation of an expression is much smaller than the expression
itself.
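
The stochastic cancellation behind the convergence of left-point sums can be observed on a simulated path. The following is a minimal sketch under assumptions not made in the text: the integrand is $F(x) = x$ (so the Itô integral has the closed form $\tfrac12 (B_T^2 - T)$), the path is sampled on a uniform grid, and the seed and grid size are arbitrary.

```python
import math
import random

random.seed(0)
T, n = 1.0, 20000
h = T / n

B = 0.0          # current value of the simulated Brownian path
ito_sum = 0.0    # sum of Xi_{u,v} = F(B_u) B_{u,v} with F(x) = x
quad_var = 0.0   # sum of squared increments (approximates T)
for _ in range(n):
    dB = random.gauss(0.0, math.sqrt(h))
    ito_sum += B * dB
    quad_var += dB * dB
    B += dB

ito_exact = 0.5 * (B * B - T)    # closed form of the Ito integral of B against B
print(ito_sum - ito_exact, quad_var - T)
```

Each summand is of size about $h^{1/2}$, yet the sum is close to the closed form: algebraically the error equals $\tfrac12 (T - \sum B_{u,v}^2)$, which is small only because of cancellations between centred increments.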
Throughout this section, we assume that we are working with $L^2$ random variables
on a filtered probability space $(\Omega, (\mathcal{F}_t)_{0 \le t \le T}, \mathbf{P})$ and we write $L^2_s$ for the space of
$\mathcal{F}_s$-measurable square integrable random variables. We also write as usual
$\|X\|_{L^2} \stackrel{\mathrm{def}}{=} (E X^2)^{1/2}$. In fact, using the Burkholder–Davis–Gundy inequality, it is not difficult
to extend the following results to an $L^q$ setting with $2 \le q < \infty$.

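Proposition 4.19 (Stochastic Sewing Lemma). Let $(s, t) \mapsto \Xi_{s,t} \in L^2_t$ for
$0 \le s \le t \le T$ be continuous (viewed as a map with values in $L^2$) with $\Xi_{t,t} = 0$ for
all $t$. Suppose that there are constants $\Gamma_1, \Gamma_2 \ge 0$ and $\varepsilon_1, \varepsilon_2 > 0$ such that for all
$0 \le s \le u \le t \le T$,

\[
\|\delta\Xi_{sut}\|_{L^2} \le \Gamma_1 |t-s|^{\frac12 + \varepsilon_1} \tag{4.36}
\]

and

\[
\|E_s \delta\Xi_{sut}\|_{L^2} \le \Gamma_2 |t-s|^{1 + \varepsilon_2} . \tag{4.37}
\]

Then there exists a unique continuous (again as a map $[0, T] \to L^2$) process
$t \mapsto X_t \in L^2_t$ with $X_0 = 0$ and a suitable constant $C$ such that, for all $0 \le s \le t \le T$,

\[
\|X_t - X_s - \Xi_{s,t}\|_{L^2} \le C \Gamma_1 |t-s|^{\frac12 + \varepsilon_1} + C \Gamma_2 |t-s|^{1 + \varepsilon_2} \tag{4.38}
\]

and

\[
\|E_s (X_t - X_s - \Xi_{s,t})\|_{L^2} \le C \Gamma_2 |t-s|^{1 + \varepsilon_2} . \tag{4.39}
\]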

Proof. (Uniqueness) Assuming there are two adapted processes $X, \bar X$ with the stated
properties (4.38) and (4.39), we show that $\Delta_t := X_t - \bar X_t = 0$ almost surely
for every $t$. Let $n$ be a positive integer and set $t_i = ti/n$. The abusive notation
$X_i := X_{t_i, t_{i+1}}$ and similarly for $\Delta$ and $\Xi$ is convenient. Note that $L^2$ estimates for
$\Delta_i = (X_i - \Xi_i) - (\bar X_i - \Xi_i)$, as well as $E_{t_i} \Delta_i$, are immediate from (4.38) and
(4.39). We have

\[
\Delta_t = \sum_{i=0}^{n-1} (\Delta_i - E_{t_i} \Delta_i) + \sum_{i=0}^{n-1} E_{t_i} \Delta_i =: \Delta^{(1)}_t + \Delta^{(2)}_t ,
\]

which is nothing but Doob’s decomposition of the partial sum process $\sum_i \Delta_i$ into
martingale and predictable component. Using the orthogonality of martingale
increments, the $L^2$-contraction property of the conditional expectation, and (4.38), we
have

\[
\|\Delta^{(1)}_t\|_{L^2} = \Big( \sum_{i=0}^{n-1} \|\Delta_i - E_{t_i} \Delta_i\|^2_{L^2} \Big)^{\frac12}
 \le 2 \Big( \sum_{i=0}^{n-1} \|\Delta_i\|^2_{L^2} \Big)^{\frac12}
 \lesssim n^{1/2} \cdot \Big( \frac1n \Big)^{1/2 + \varepsilon_1} .
\]

Since $n$ is arbitrary, it follows that $\Delta^{(1)}_t = 0$ a.s. The same conclusion for $\Delta^{(2)}_t$ is
immediate from the triangle inequality and (4.39), since

\[
\|\Delta^{(2)}_t\|_{L^2} \le \sum_i \|E_{t_i} \Delta_i\|_{L^2} \lesssim n \cdot \Big( \frac1n \Big)^{1 + \varepsilon_2} .
\]

(Existence) The proof follows the “dyadic refinement” proof of the sewing lemma
given earlier. Fix $0 \le s < t \le T$ and consider dyadic refinements $(t^k_i)$ of $[s, t]$, so
that the $k$th level approximation is given by

\[
I^k_{s,t} = \sum_{i=0}^{2^k - 1} \Xi_{t^k_i, t^k_{i+1}} \in L^2_t .
\]

With midpoint $u^k_i \in [t^k_i, t^k_{i+1}]$ and, for fixed $k$, $\delta\Xi_i := \delta\Xi_{t^k_i, u^k_i, t^k_{i+1}}$, we again work
with the Doob decomposition

\[
I^{k+1}_{s,t} - I^k_{s,t} = - \sum_{i=0}^{2^k - 1} \delta\Xi_i = I^{k;(1)}_{s,t} + I^{k;(2)}_{s,t} . \tag{4.40}
\]

Arguing as in the uniqueness part, the first (resp. second) term is estimated (in $L^2$)
with (4.36) (resp. (4.37)) and one arrives at

\[
\|I^{k+1}_{s,t} - I^k_{s,t}\|_{L^2} \lesssim |t-s|^{\frac12 + \varepsilon_1} 2^{-k \varepsilon_1} + |t-s|^{1 + \varepsilon_2} 2^{-k \varepsilon_2} ,
\]

which implies existence of $I_{s,t} := \lim_{k \to \infty} I^k_{s,t}$ in $L^2_t$, uniformly in $0 \le s \le t \le T$,
with a local estimate of the form (4.38) with $X_t - X_s$ replaced by $I_{s,t}$. (By assumption
$\Xi$, and hence all $I^k$, are $L^2$-continuous, and so is the uniform limit $I$.) Moreover,
since $E_s I^{k;(1)}_{s,t} = 0$ for all $k$, a better estimate, of the form (4.39), is obtained for
$E_s I_{s,t} = \lim_{k \to \infty} E_s I^k_{s,t}$. At last, as in the “dyadic” proof of the deterministic sewing
lemma, one needs to argue that $I$ is additive, a non-trivial exercise left to the reader,
and hence the increment of a unique $L^2$-path $I$ started from $I_0 = 0$ which is nothing
but the desired square-integrable process $X = X(t, \omega)$. □
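
The geometric decay of successive dyadic approximations in the existence part can be estimated by Monte Carlo. This is a rough sketch under assumptions of our own: $\Xi_{s,t} = F(B_s) B_{s,t}$ with the bounded Lipschitz choice $F = \sin$ on $[0, 1]$, for which (4.36) holds with $\varepsilon_1 = \tfrac12$ and the conditional expectation in (4.37) vanishes, so that $\|I^{k+1} - I^k\|_{L^2}$ should shrink by a factor of about $2^{-1/2}$ per level. The helper `dyadic_sums`, the sample size and the seed are hypothetical choices.

```python
import math
import random

def dyadic_sums(path, K, F):
    """Return [I^0, ..., I^K] for Xi_{s,t} = F(B_s) B_{s,t} on [0, 1].

    `path` holds B on the grid j / 2^K; level k uses every 2^(K-k)-th grid point."""
    sums = []
    for k in range(K + 1):
        step = 2 ** (K - k)
        pts = path[::step]
        sums.append(sum(F(a) * (b - a) for a, b in zip(pts, pts[1:])))
    return sums

random.seed(1)
K, M = 8, 1000            # finest dyadic level and number of sample paths
h = 1.0 / 2 ** K
second_moments = [0.0] * K
for _ in range(M):
    path = [0.0]
    for _ in range(2 ** K):
        path.append(path[-1] + random.gauss(0.0, math.sqrt(h)))
    I = dyadic_sums(path, K, math.sin)
    for k in range(K):
        second_moments[k] += (I[k + 1] - I[k]) ** 2 / M

# Empirical L^2 norms of I^{k+1} - I^k and their successive ratios.
l2_norms = [math.sqrt(m) for m in second_moments]
ratios = [l2_norms[k + 1] / l2_norms[k] for k in range(K - 1)]
print(ratios)   # roughly 2 ** -0.5, up to Monte Carlo error
```

The ratios are only statistical estimates; the point is the summable decay in $k$, which is what makes $(I^k)$ a Cauchy sequence in $L^2$.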

4.7 Exercises

Exercise 4.1 a) In the setting of Young integration, deduce (4.3) from (4.2).
b) Show that there is a constant $C$ depending only on $T > 0$ and $\alpha + \beta > 1$ such
that

\[
\Big\| \int_0^\cdot Y \, dX \Big\|_{\alpha;[0,T]} \le C \big( |Y_0| + \|Y\|_{\beta;[0,T]} \big) \|X\|_{\alpha;[0,T]} . \tag{4.41}
\]

In fact, show that $C$ can be chosen uniformly over $T \in (0, 1]$.


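Solution. a) Given $X$ on $[s, t]$, define $\tilde X : [0, 1] \ni u \mapsto X(s + u(t - s))$ and
verify $\|\tilde X\|_{\alpha;[0,1]} = |t-s|^{\alpha} \|X\|_{\alpha;[s,t]}$. Proceeding similarly for $Y$ (with exponent $\beta$), applying
(4.2) to $\tilde X, \tilde Y$ then gives (4.3).
b) Write $Z$ for the indefinite integral. From (4.3), for every $0 \le s < t \le T$,

\[
\begin{aligned}
|Z_{s,t}| &\le |Y_s| |X_{s,t}| + C \|Y\|_{\beta;[s,t]} \|X\|_{\alpha;[s,t]} |t-s|^{\alpha+\beta} \\
 &\le \big( |Y_0| + \|Y\|_{\beta;[0,T]} T^\beta \big) |X_{s,t}| + C \|Y\|_{\beta;[0,T]} \|X\|_{\alpha;[0,T]} T^\beta |t-s|^{\alpha} \\
 &\le \big[ |Y_0| + \|Y\|_{\beta;[0,T]} T^\beta (1 + C) \big] \|X\|_{\alpha;[0,T]} |t-s|^{\alpha} \\
 &\le (1 \vee T^\beta)(1 + C) \big[ |Y_0| + \|Y\|_{\beta;[0,T]} \big] \|X\|_{\alpha;[0,T]} |t-s|^{\alpha} ,
\end{aligned}
\]

and this entails the claimed estimates.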


Exercise 4.2 Let $\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}^\alpha([0, T], V)$, $\alpha \in (\tfrac13, \tfrac12]$, and assume that
$F : V \to L(V, W)$ is of gradient form, i.e. $F = DG$ where $G : V \to W$ is
sufficiently smooth, say $\mathcal{C}^3_b$. Show that the relation

\[
\int_s^t F(X) \, d\mathbf{X} = G(X_t) - G(X_s)
\]

holds true whenever $\mathbf{X}$ is a geometric rough path. (Hence, from a rough path
perspective, integration of gradient 1-forms against geometric rough paths is trivial for
the outcome does not depend on $\mathbb{X}$.) What about non-geometric rough paths?
Exercise 4.3 Complete the first “dyadic” proof of the sewing Lemma 4.2.
Solution. To show that (4.9) is valid for all intervals $[s, t] \subset [0, 1]$ it suffices to
consider $s < t$ dyadic by continuity. As in the proof of the Kolmogorov criterion,
Theorem 3.1, we consider a (finite) partition $P = (\tau_i)$ of $[s, t]$, which “efficiently”
exhausts $[s, t]$ with dyadic intervals of length $\sim 2^{-n}$, $n \ge m$, in the sense that no
three intervals have the same length. Note that $|P| \equiv \max \{ |v - u| : [u, v] \in P \} =
2^{-m} \le |t - s|$ (and in fact $\sim |t - s|$ due to minimal choice of $m$). Thanks to the
additivity of $I$ and (4.9) for dyadic intervals,

\[
\begin{aligned}
|I_{s,t} - \Xi_{s,t}|
 &= \Big| \sum_{[u,v] \in P} (I_{u,v} - \Xi_{u,v}) - \Big( \Xi_{s,t} - \sum_{[u,v] \in P} \Xi_{u,v} \Big) \Big| \\
 &\lesssim \sum_{[u,v] \in P} |v - u|^{\beta} + \Big| \Xi_{s,t} - \sum_{[u,v] \in P} \Xi_{u,v} \Big| \\
 &\le |t - s|^{\beta} + \sum_{i \ge 0} \Big( \big| \delta\Xi_{s, \tau_{-(i+1)}, \tau_{-i}} \big| + \big| \delta\Xi_{\tau_i, \tau_{i+1}, t} \big| \Big) ,
\end{aligned}
\]

where the sum is actually finite. Possibly allowing equality (“$\tau_i = \tau_{i+1}$” for some $i$),
we may assume $|\tau_{i+1} - \tau_i| = |\tau_{-i} - \tau_{-(i+1)}| \lesssim 1/2^{m+i}$, so that

\[
|t - \tau_i| = \sum_{j \ge i} |\tau_{j+1} - \tau_j| \lesssim \sum_{j=i}^{\infty} 1/2^{m+j} \sim 1/2^{m+i} ,
\]

and similarly, $|\tau_{-i} - s| \lesssim 1/2^{m+i}$. As a consequence, one obtains

\[
\sum_{i \ge 0} \Big( \big| \delta\Xi_{s, \tau_{-(i+1)}, \tau_{-i}} \big| + \big| \delta\Xi_{\tau_i, \tau_{i+1}, t} \big| \Big)
 \lesssim 2 \sum_{n \ge m} (1/2^{n})^{\beta} \sim 1/2^{m\beta} \sim |t - s|^{\beta} ,
\]

so that $|I_{s,t} - \Xi_{s,t}| \lesssim |t - s|^{\beta}$, as required.

Exercise 4.4 Adapt the proof of Theorem 4.4 to obtain Young’s estimate (4.3).
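Exercise 4.5 Fix $\alpha \in (0, 1]$, $h > 0$ and $M > 0$. Consider a path $Z : [0, T] \to V$
and show that

\[
\|Z\|_{\alpha;h} \equiv \sup_{\substack{0 \le s < t \le T \\ t - s \le h}} \frac{|Z_{s,t}|}{|t-s|^{\alpha}} \le M
 \quad \Longrightarrow \quad \|Z\|_{\alpha;[0,T]} \le M \big( 1 \vee 2 h^{-(1-\alpha)} \big) .
\]

Solution. By scaling it suffices to consider $M = 1$. Fix $0 \le s < t \le T$; we need
to show that $|Z_{s,t}| / |t-s|^{\alpha}$ is bounded by $1 \vee 2 h^{\alpha-1}$. There is nothing to show for
$|t - s| \le h$. We therefore assume $h \le |t - s|$ and define $t_i = (s + ih) \wedge t$, for
$i = 0, 1, \ldots$, noting that $t_N = t$ for $N \ge |t - s|/h$ and also $t_{i+1} - t_i \le h$ for all $i$.
It then suffices to estimate

\[
|Z_{s,t}| \le \sum_{0 \le i < |t-s|/h} |Z_{t_i, t_{i+1}}|
 \le h^{\alpha} (1 + |t - s|/h) = h^{\alpha-1} (h + |t - s|) \le 2 h^{\alpha-1} |t - s| .
\]

(Dividing by $|t-s|^{\alpha}$ gives the claim when $|t - s| \le 1$; for $T > 1$ the same argument
yields the bound with an additional factor $T^{1-\alpha}$.)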
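
The local-to-global Hölder bound of Exercise 4.5 can be tested on a grid. The following is a minimal sketch under assumptions of our own: $T = 1$, $\alpha = \tfrac12$, $h$ a multiple of the grid step (so that the chaining points $t_i = s + ih$ lie on the grid), and an arbitrary test path. The helper `holder_seminorm` is hypothetical.

```python
import math

def holder_seminorm(z, dt, alpha, max_lag=None):
    """sup |z_j - z_i| / |t_j - t_i|^alpha over grid pairs with 0 < j - i <= max_lag."""
    n = len(z)
    max_lag = n - 1 if max_lag is None else max_lag
    best = 0.0
    for lag in range(1, max_lag + 1):
        scale = (lag * dt) ** alpha
        for i in range(n - lag):
            best = max(best, abs(z[i + lag] - z[i]) / scale)
    return best

alpha, n, lag_h = 0.5, 400, 20          # T = 1 and h = lag_h / n = 0.05
dt = 1.0 / n
h = lag_h * dt
# Assumed test path: a square-root cusp at 0 plus a few oscillations.
z = [math.sqrt(k * dt) + 0.3 * math.sin(25 * k * dt) for k in range(n + 1)]

local = holder_seminorm(z, dt, alpha, max_lag=lag_h)   # plays the role of M
glob = holder_seminorm(z, dt, alpha)                   # seminorm over all grid pairs
bound = local * max(1.0, 2.0 * h ** (alpha - 1.0))
print(local, glob, bound)
```

On the grid, the global seminorm never exceeds the bound built from the local one, in line with the chaining argument of the solution.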

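♯ Exercise 4.6 (Lyons extension theorem) a) Let $X \in \mathcal{C}^1([0, T], V)$, so that the
Lipschitz seminorm $\|X\|_1$ is finite, and consider the $n$-fold iterated (Riemann–Stieltjes)
integral with values in $V^{\otimes n}$,

\[
\mathbb{X}^{(n)}_{s,t} = \int_{s < t_1 < \ldots < t_n < t} dX \otimes \cdots \otimes dX .
\]

Show that, with $C_n^n = \frac{1}{n!}$, and for all $0 \le s \le t \le T$,

\[
|\mathbb{X}^{(n)}_{s,t}|^{\frac1n} \le C_n \|X\|_1 |t - s| .
\]

b) Show an analogous result in the Young case i.e. when $X \in \mathcal{C}^\alpha([0, T], V)$, $\alpha > \tfrac12$.
c) Fix $\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}^\alpha([0, T], V)$, $\alpha \in (\tfrac13, \tfrac12]$, and define $\mathbb{X}^{(n)}_{s,t} \in V^{\otimes n}$, any
$n \ge 1$, by the right-hand side above, via iterated integration of controlled rough
paths. Noting $(\mathbb{X}^{(1)}, \mathbb{X}^{(2)}) = (\delta X, \mathbb{X})$, define the $T^{(N)}(V)$-valued extension of
$\mathbf{X}$ by

\[
\bar{\mathbf{X}} := (1, \mathbb{X}^{(1)}, \mathbb{X}^{(2)}, \mathbb{X}^{(3)}, \ldots, \mathbb{X}^{(N)}) ,
\]

for any integer $N > \lfloor \frac1\alpha \rfloor = 2$. Show the validity of Chen’s relation, i.e.
$\bar{\mathbf{X}}_{s,t} = \bar{\mathbf{X}}_{s,u} \otimes \bar{\mathbf{X}}_{u,t}$, $0 \le s < u < t \le T$, as equation in $T^{(N)}(V)$, and the estimate

\[
|\mathbb{X}^{(n)}_{s,t}|^{\frac1n} \le C_{n,\alpha} |||\mathbf{X}|||_\alpha |t - s|^{\alpha} ,
\]

for $0 \le s < t \le T$ and $n = 1, \ldots, N$. Show that these properties uniquely
determine $\bar{\mathbf{X}}$, called (level-$N$) Lyons lift of $\mathbf{X}$. Show that Lyons’ extension map
$\mathbf{X} \mapsto \bar{\mathbf{X}}$ is continuous in the appropriate rough path spaces. Is $\bar{\mathbf{X}}$ geometric when
$\mathbf{X}$ is?
Hint: Use induction for the analytic estimate. To get started, note
$(\mathbb{X}_{s,\bullet}, X_{s,\bullet}) \in \mathscr{D}_X^{2\alpha}([s, t])$ and, with all norms on $[s, t]$, one has
$\|\mathbb{X}_{s,\bullet}, X_{s,\bullet}\|_{2\alpha,X} \equiv \|X_{s,\bullet}\|_\alpha + \|R^{\mathbb{X}_{s,\bullet}}\|_{2\alpha} = \|X\|_\alpha + \|\mathbb{X}\|_{2\alpha}$. Then

\[
\mathbb{X}^{(3)}_{s,t} = \int_s^t \mathbb{X}_{s,\bullet} \otimes d\mathbf{X}
 = \int_0^1 \hat{\mathbb{X}}_{0,\tau} \otimes d\hat{\mathbf{X}}_\tau
 = c^3 \int_0^1 \tilde{\mathbb{X}}_{0,\tau} \otimes d\tilde{\mathbf{X}}_\tau ,
\]

in terms of $\hat{\mathbf{X}} : \tau \mapsto \mathbf{X}(s + \tau(t - s))$, noting $|||\hat{\mathbf{X}}|||_{\alpha;[0,1]} = |||\mathbf{X}|||_{\alpha;[s,t]} |t - s|^{\alpha} =: c$,
and then “unit size” $\tilde{\mathbf{X}} = \delta_{1/c} \hat{\mathbf{X}}$, with “$\asymp 1$-estimate” for the final rough
integral.
Remark: One knows that $C^n_{n,\alpha}$ is of order $1/(n\alpha)! = 1/\Gamma(n\alpha + 1)$ as a consequence
of the Lyons–Hara–Hino neo-classical inequality [Lyo98, HH10]. For continuity of
the extension map, uniform over $n \in \mathbb{N}$, see also [LX13]. For extensions to branched
rough paths see [Gub10, Boe18].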
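
For part a), the factorial decay and Chen’s relation can be explored numerically for a piecewise linear path. This is a minimal sketch, not the construction asked for in the exercise, under assumptions of our own: $V = \mathbb{R}^2$, truncation level $N = 3$, tensors stored as dictionaries indexed by words, the coordinatewise maximum as tensor norm, and a path interpolating a circle arc. The names `tensor_mul`, `segment_signature` and `signature` are hypothetical helpers.

```python
import math
from itertools import product

def tensor_mul(a, b, N):
    """Product in the truncated tensor algebra T^(N); keys are words (tuples of letters)."""
    out = {}
    for u, x in a.items():
        for v, y in b.items():
            if len(u) + len(v) <= N:
                out[u + v] = out.get(u + v, 0.0) + x * y
    return out

def segment_signature(dx, N):
    """Iterated integrals of a straight segment with increment dx: level n is dx^{(x)n} / n!."""
    d = len(dx)
    sig = {(): 1.0}
    for n in range(1, N + 1):
        for w in product(range(d), repeat=n):
            sig[w] = math.prod(dx[i] for i in w) / math.factorial(n)
    return sig

def signature(points, N):
    """Iterated integrals up to level N of the piecewise linear path through `points`,
    assembled segment by segment with Chen's relation."""
    d = len(points[0])
    sig = {(): 1.0}
    for p, q in zip(points, points[1:]):
        dx = [q[i] - p[i] for i in range(d)]
        sig = tensor_mul(sig, segment_signature(dx, N), N)
    return sig

# Assumed example: piecewise linear interpolation of a circle arc in R^2 over [0, 1].
m, N = 200, 3
pts = [(math.cos(2 * k / m), math.sin(2 * k / m)) for k in range(m + 1)]
S = signature(pts, N)

# Lipschitz constant of the interpolation in the time variable t = k / m.
lip = max(math.dist(p, q) for p, q in zip(pts, pts[1:])) * m
# Largest value of n! |<X^(n), w>| / lip^n over all non-empty words w (here |t - s| = 1).
worst = max(abs(c) * math.factorial(len(w)) / lip ** len(w) for w, c in S.items() if w)
print(worst)
```

Every coordinate of the level-$n$ term stays below $(\|X\|_1 |t-s|)^n / n!$, and splitting the path at an interior point and multiplying the two pieces reproduces the same tensor, which is Chen’s relation.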

Exercise 4.7 Show that the assumption $Y \in \mathscr{D}_X^{2\alpha}$ can be weakened to
$Y \in \mathscr{D}_X^{2\alpha'}$, $\alpha' < \alpha$, provided $\alpha + 2\alpha' > 1$, and reformulate Theorem 4.10 accordingly.
In particular, show that the estimate (4.22) holds upon replacing the final factor
$|t - s|^{3\alpha}$ by $|t - s|^{\alpha + 2\alpha'}$, and $\|Y'\|_\alpha$ (resp. $\|R^Y\|_{2\alpha}$) by $\|Y'\|_{\alpha'}$ (resp. $\|R^Y\|_{2\alpha'}$).
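∗∗ Exercise 4.8 (Approximation of controlled rough paths) Let $\alpha \in (\tfrac13, \tfrac12]$.
Assume $X \in \mathcal{C}^\alpha$ and $(Y, Y') \in \mathscr{D}_X^{2\alpha}$. Consider smooth approximations $X_\varepsilon$ such
that $X_\varepsilon \to X$ in $\mathcal{C}^\alpha$. Show that there then exist smooth paths

\[
(Y_\varepsilon, Y'_\varepsilon) \in \mathscr{D}_{X_\varepsilon}^{2\alpha}
\]

such that $(Y_\varepsilon, Y'_\varepsilon) \to (Y, Y')$ uniformly with uniform bounds in $\mathscr{D}_{X_\varepsilon}^{2\alpha}$. By interpolation,
for any $\alpha' < \alpha$,

\[
\|Y'_\varepsilon - Y'\|_{\alpha'} + \|R^{Y_\varepsilon} - R^Y\|_{2\alpha'} \to 0 .
\]

(Such an approximation result was first suggested in [GH19, Rem 5.5], for a
generalisation to modelled distributions in the theory of regularity structures see [ST18].)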

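Solution. Let $\Phi : \mathcal{C}^\alpha \times \mathcal{C}^\alpha \to \mathcal{C}_2^{2\alpha}$ be the map constructed in part a) of Exercise 2.14.
Set $Z := \Phi(Y', X) \in \mathcal{C}_2^{2\alpha}$ and also $\bar Y := Z_{0,\cdot} \in \mathcal{C}^\alpha$. From the properties of $\Phi$
(“Chen’s relation”)

\[
\bar Y_t = \bar Y_s + Y'_s X_{s,t} + Z_{s,t}
\]

which shows $(\bar Y, Y') \in \mathscr{D}_X^{2\alpha}$. On the other hand, $(Y, Y') \in \mathscr{D}_X^{2\alpha}$ means that
$Y_t = Y_s + Y'_s X_{s,t} + R^Y_{s,t}$ with remainder of order $2\alpha$. Upon taking the difference we see
that $\Gamma := R^Y - Z \in \mathcal{C}_2^{2\alpha}$ can be written as

\[
Y_t - Y_s - (\bar Y_t - \bar Y_s) = \Gamma_{s,t}
\]

which identifies $\Gamma$ as the increment of a path; we write $\Gamma \in \mathcal{C}^{2\alpha}$ accordingly. Let
$\psi_\varepsilon$ be an approximation of the identity so that

\[
Y'_\varepsilon := Y' * \psi_\varepsilon \in \mathcal{C}^\infty
\]

converges uniformly, with uniform α-Hölder bounds, to $Y'$. (By interpolation, this
entails convergence in $\mathcal{C}^{\alpha'}$.) On the other hand, thanks to part c) of Exercise 2.14,

\[
\bar Y_\varepsilon := \Phi(Y'_\varepsilon, X_\varepsilon)_{0,\cdot} \in \mathcal{C}^1 ,
\]

and also, thanks to the first part of that theorem, with $\bar R^\varepsilon := \Phi(Y'_\varepsilon, X_\varepsilon)$, uniformly
in $\mathcal{C}_2^{2\alpha}$,

\[
\bar Y_\varepsilon(t) = \bar Y_\varepsilon(s) + Y'_\varepsilon(s) X_\varepsilon(s, t) + \bar R^\varepsilon_{s,t} .
\]

By continuity of $\Phi$, it is clear that $\bar R^\varepsilon \to \Phi(Y', X) \in \mathcal{C}_2^{2\alpha}$, uniformly, with uniform
$2\alpha$-Hölder bounds. (As before, this entails $\mathcal{C}_2^{2\alpha'}$-convergence.) It remains to deal with
the (mostly cosmetic) problem that $\bar Y_\varepsilon$ is not smooth. But then $Y_\varepsilon := \bar Y_\varepsilon * \psi_\varepsilon \in \mathcal{C}^\infty$
converges uniformly with uniform $1^-$-Hölder bounds and from

\[
R^\varepsilon_{s,t} := Y_\varepsilon(s, t) - Y'_\varepsilon(s) X_\varepsilon(s, t) = \bar R^\varepsilon_{s,t} + Y_\varepsilon(s, t) - \bar Y_\varepsilon(s, t)
\]

we see that $R^\varepsilon - \bar R^\varepsilon \to 0$ uniformly, also with uniform $1^-$-Hölder bounds (and
hence $R^\varepsilon \to \Phi(Y', X)$ with uniform $2\alpha$-Hölder bounds).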

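∗ Exercise 4.9 For $\alpha \in (\tfrac13, \tfrac12)$, consider the space $\mathscr{C}^\alpha \ltimes \mathscr{D}^{2\alpha}$ as in Remark 4.8
endowed with the distance

\[
d(\mathbf{X}, (Y, Y') ; \bar{\mathbf{X}}, (\bar Y, \bar Y')) = \varrho_\alpha(\mathbf{X}, \bar{\mathbf{X}}) + \|Y, Y' ; \bar Y, \bar Y'\|_{X,\bar X,2\alpha} ,
\]

see (4.28). Show that this space is homeomorphic to $\mathscr{C}^\alpha \times (\mathcal{C}^{2\alpha} \oplus \mathcal{C}^\alpha)$. (Here, $\mathcal{C}^{2\alpha}$
denotes the usual space of $2\alpha$-Hölder functions in one variable. See also [TZ18,
BH19] for generalisations of this statement.)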

Solution. As in the solution to the previous exercise, we can use the Lyons–Victoir
extension theorem (see Exercise 2.14), to find a continuous map $I : \mathcal{C}^\alpha \times \mathcal{C}^\alpha \to \mathcal{C}^\alpha$
with the property that $Z = I(X, Y')$ satisfies $(Z, Y') \in \mathscr{D}_X^{2\alpha}$. (One should think
of $I(X, Y')$ as being a “plausible candidate” for $\int_0^\cdot Y'_s \, dX_s$, which is of course
ill-defined since we do not assume that $Y'$ is controlled by $X$.)
In particular, the map $\tilde I : (\mathbf{X}, \tilde Y, Y') \mapsto \big( \mathbf{X}, \tilde Y + I(X, Y'), Y' \big)$ is continuous
from $\mathscr{C}^\alpha \times (\mathcal{C}^{2\alpha} \oplus \mathcal{C}^\alpha)$ to $\mathscr{C}^\alpha \ltimes \mathscr{D}^{2\alpha}$. Its inverse map is given by

\[
(\mathbf{X}, Y, Y') \mapsto \big( \mathbf{X}, Y - I(X, Y'), Y' \big) ,
\]

which concludes the proof. Note that this construction is far from being canonical
due to the lack of a canonical map $I$ having these properties.

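Exercise 4.10 (Rough Fubini) Let $\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}^\alpha([0, T], V)$, $\alpha > \tfrac13$, and
consider a measurable map from some measure space $(\Omega, \mathcal{F}, \mu)$ to $\mathscr{D}_X^{2\alpha}$, so that

\[
\{ \omega \mapsto |Y^\omega_0| + \|Y^\omega, (Y^\omega)'\|_{2\alpha,X} \} \in L^1(\Omega, \mathcal{F}, \mu) .
\]

With a pointwise definition of the µ-integrated controlled rough path on the right-hand
side, show that both sides are well-defined and equality holds,

\[
\int_\Omega \Big( \int_0^T (Y^\omega, (Y^\omega)') \, d\mathbf{X} \Big) \mu(d\omega)
 = \int_0^T \Big( \int_\Omega (Y^\omega, (Y^\omega)') \, \mu(d\omega) \Big) d\mathbf{X} .
\]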

Exercise 4.11 (Rough Fubini d’après [GH19])
a) As a warmup, consider a real-valued càdlàg path $X$ of bounded variation on
$[0, T]$, so that integration of càglàd integrands against $dX$ can be understood
equivalently in Lebesgue or Riemann–Stieltjes sense. Write $[X]_t$ for the
sum-square of all jumps at times in $(0, t]$. Given two real-valued càglàd paths $Y, \tilde Y$,
set $Z_{s,t} := Y_s \tilde Y_t$ and show

\[
\int_0^T \int_0^t Z_{s,t} \, dX_s \, dX_t = \int_0^T \int_s^T Z_{s,t} \, dX_t \, dX_s + \int_0^T Z_{t,t} \, d[X]_t .
\]

Hint: Apply the integration by parts formula for bounded variation paths to the
indefinite integrals of $Y$ and $\tilde Y$ against $X$.
b) Let now $\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}^\alpha([0, T])$ for some $\alpha > 1/3$, and $(Y, Y'), (\tilde Y, \tilde Y') \in \mathscr{D}_X^{2\alpha}$.
Set $Z_{s,t} := Y_s \otimes \tilde Y_t$. Show that

\[
\int_0^T \int_0^t Z_{s,t} \, d\mathbf{X}_s \, d\mathbf{X}_t = \int_0^T \int_s^T Z_{s,t} \, d\mathbf{X}_t \, d\mathbf{X}_s + \int_0^T Z_{t,t} \, d[\mathbf{X}]_t ,
\]

where the final integral is a Young integral against $[\mathbf{X}] \in \mathcal{C}^{2\alpha}$, the bracket
introduced in Exercise 2.11.
Hint: If $\mathbf{X}$ is the canonical lift of some smooth $X$, then both $[X]$ and $[\mathbf{X}]$ vanish
and the equality follows from part a) and consistency of rough with
Riemann–Stieltjes integration in case of smooth integrators. Treat the case of $\mathbf{X} \in \mathscr{C}^\alpha_g$
with the approximation result of Exercise 4.8 and then $\mathbf{X} \in \mathscr{C}^\alpha$ as “second level
perturbation”, as in Exercise 2.11.
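Exercise 4.12 (Singular rough paths, improper rough integration [BFG20])
a) (Young case) Consider $0 < \alpha \le 1$ and $\eta \le \alpha$ and a path $Y$ defined on $(0, T]$.
Show that

\[
\|Y\|_{\alpha,\eta} \stackrel{\mathrm{def}}{=} \sup_{0 < s < t \le T} \frac{|Y_t - Y_s|}{s^{\eta-\alpha} |t-s|^{\alpha}} < \infty
\]

if and only if $\|Y\|_{\alpha;[\varepsilon,T]} = O(\varepsilon^{\eta-\alpha})$ as $\varepsilon \downarrow 0$, and write $Y \in \mathcal{C}^{\alpha,\eta}((0, T])$ for
the resulting class of “singular” Hölder paths. Fix $X \in \mathcal{C}^\alpha([0, T])$, $\alpha > 1/2$,
and assume $\eta + \alpha > 0$, $\eta \ne 0$. Show that the improper Young integral

\[
Z_t := \int_{0+}^t Y \, dX \stackrel{\mathrm{def}}{=} \lim_{\varepsilon \downarrow 0} \int_\varepsilon^t Y \, dX , \qquad 0 < t \le T ,
\]

exists and defines a singular Hölder path $Z \in \mathcal{C}^{\alpha, (\eta \wedge 0) + \alpha}((0, T])$.
Hint: For a start, apply the Young estimate

\[
\Big| \int_s^t Y \, dX \Big| \lesssim |Y_s| |t-s|^{\alpha} + \|Y\|_{\alpha;[s,T]} |t-s|^{2\alpha}
\]

with $s = 2^{-(n+1)}$, $t = 2^{-n}$ and show that $I_n := \int_{2^{-n}}^T Y \, dX$ is a Cauchy
sequence.
b) (Rough path case) Let $\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}^\alpha([0, T])$ for some $\alpha > 1/3$, and let
$(Y, Y')$ be defined on $(0, T]$ so that, for some $\eta \le 2\alpha$,

\[
\|Y, Y'\|_{X,2\alpha;[\varepsilon,1]} = O(\varepsilon^{\eta-2\alpha}) , \qquad \varepsilon \downarrow 0 .
\]

Show that this estimate is equivalent to finiteness of

\[
\|Y, Y'\|_{X,2\alpha,\eta} \stackrel{\mathrm{def}}{=}
 \sup_{0 \le s < t \le T} \frac{|Y'_t - Y'_s|}{s^{\eta-2\alpha} |t-s|^{\alpha}}
 + \sup_{0 \le s < t \le T} \frac{|Y_t - Y_s - Y'_s X_{s,t}|}{s^{\eta-2\alpha} |t-s|^{2\alpha}}
\]

and write $(Y, Y') \in \mathscr{D}_X^{2\alpha,\eta}$ for the resulting class of singular controlled rough
paths. Show that under this condition, provided that $-\alpha < \eta \le 2\alpha$ and $\eta \ne 0$,
the improper rough integral

\[
Z_t := \int_{0+}^t Y \, d\mathbf{X} \stackrel{\mathrm{def}}{=} \lim_{\varepsilon \downarrow 0} \int_\varepsilon^t Y \, d\mathbf{X} ,
\]

exists and defines a singular Hölder path $Z \in \mathcal{C}^{\alpha, (\eta \wedge 0) + \alpha}((0, T])$. In fact, show
that $(Z, Z') := \big( \int_{0+}^\cdot Y \, d\mathbf{X}, Y \big) \in \mathscr{D}_X^{2\alpha, (\eta \wedge 0) + \alpha}$. (Such singular controlled rough
paths are examples of singular modelled distributions in the theory of regularity
structures, [Hai14b, Ch. 6].)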
Exercise 4.13 Check that Definition 4.18 is consistent with Definition 4.6 in the case
when $\alpha \in (\tfrac13, \tfrac12]$. Check also that if one takes $w = \varnothing$, the empty word, then (4.34)
reduces to (4.33) with $|R^Y_{s,t}| \lesssim |t-s|^{N\alpha}$.
Exercise 4.14 (From [Lê18]) Let $B$ be a Brownian motion. Assume $F$ is bounded
and ε-Hölder continuous for some $\varepsilon > 0$. Apply the stochastic sewing lemma with
$\Xi_{s,t} = F(B_s) B_{s,t}$ and identify the resulting process $X$ as the indefinite Itô integral
$\int F(B) \, dB$.
Exercise 4.15 (Hybrid stochastic rough integral) Let $B$ be a Brownian motion
and $\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}^\alpha([0, T], V)$ a (deterministic) rough path, $\alpha \in (\tfrac13, \tfrac12]$. Apply
the stochastic sewing lemma with

\[
\Xi_{s,t} = F(B_s + X_s) X_{s,t} + DF(B_s + X_s) \mathbb{X}_{s,t}
\]

to define the stochastic rough integral

\[
\int F(B_t + X_t) \, d\mathbf{X}_t .
\]

Detail the assumptions on $F$. Since $\int F(B_t + X_t) \, dB_t$ is automatically well-defined
as Itô integral this settles integration against “$B + X$”.
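Exercise 4.16 (Mild sewing, [GT10, GH19]) Consider a strongly continuous
semigroup $(S_t)_{t \ge 0}$ acting on a scale of Hilbert spaces $(H_\alpha : \alpha \in \mathbb{R})$ with $H_\alpha \subset H_\beta$
densely whenever $\alpha \ge \beta$, such that, for all $\alpha \ge \beta$ and $\gamma \in [0, 1]$, one has

\[
\|S_t u\|_{H_\alpha} \lesssim t^{\beta-\alpha} \|u\|_{H_\beta} , \qquad
\|S_t u - u\|_{H_{\beta-\gamma}} \lesssim t^{\gamma} \|u\|_{H_\beta} , \tag{4.42}
\]

uniformly over $t \in (0, 1]$ and $u \in H_\beta$. (This situation is typical when $S$ is an analytic
semigroup, for example generated by a self-adjoint operator, cf. Section 12.2.2.) We
define $\hat{\mathcal{C}}_2^{\gamma,\mu}([0, T], H_\alpha)$ as functions $\Xi$ from the simplex $\{0 \le s < t \le T\}$ into $H_\alpha$
such that

\[
\|\Xi\|_\gamma + \|\hat\delta\Xi\|_\mu < \infty ,
\]

where we used the modified second order increment operator

\[
\hat\delta\Xi_{s,u,t} := \Xi_{s,t} - S_{u,t} \Xi_{s,u} - \Xi_{u,t} .
\]

a) Let $0 < \gamma \le 1 < \mu$. Show that there exists a unique continuous linear map
$\mathcal{I} : \hat{\mathcal{C}}_2^{\gamma,\mu}([0, T], H_\alpha) \to \mathcal{C}^\gamma([0, T], H_\alpha)$ such that $(\mathcal{I}\Xi)_0 = 0$ and

\[
\|(\mathcal{I}\Xi)_{s,t} - \Xi_{s,t}\|_{H_\alpha} \lesssim |t - s|^{\mu} . \tag{4.43}
\]

Hint: $(\mathcal{I}\Xi)_{s,t} = \lim_{|\mathcal{P}| \to 0} \sum_{[u,v] \in \mathcal{P}} S_{t-v} \Xi_{u,v}$.
b) If in addition $\hat\delta\Xi_{u,m,v} = S_{v-m} \tilde\Xi_{u,m,v}$ for some $H_\alpha$-valued function
$\tilde\Xi = \tilde\Xi(u, m, v)$, with $0 \le u < m < v \le T$, for which there exists $M > 0$ with

\[
\|\tilde\Xi_{u,m,v}\|_{H_\alpha} \le M |v - m|^{\mu-1} |v - u| , \tag{4.44}
\]

then for every $\beta \in [0, \mu)$ the following inequality holds:

\[
\|\mathcal{I}\Xi_{s,t} - \Xi_{s,t}\|_{H_{\alpha+\beta}} \lesssim_{\mu,\beta} M |t - s|^{\mu-\beta} . \tag{4.45}
\]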

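Exercise 4.17 (Rough convolution, [GT10, GH19]) We continue in the Hilbert /
semigroup setting of Exercise 4.16 and fix $\alpha \in \mathbb{R}$. Consider a rough path
$\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}^\gamma([0, T], \mathbb{R}^d)$ for some $\gamma \in (1/3, 1/2]$ and take $d = 1$ for notational
simplicity only. In the semigroup setting of the previous exercise, write $Y \in \hat{\mathcal{C}}^\gamma H_\alpha$ if
$Y : [0, T] \to H_\alpha$ with $\|Y\|^\wedge_{\gamma;\alpha} := \sup \frac{\|\hat\delta Y_{s,t}\|_{H_\alpha}}{|t-s|^{\gamma}} < \infty$, where $\hat\delta Y_{s,t} = Y_t - S_{t-s} Y_s$.
We say that $(Y, Y') \in \mathscr{D}^{2\gamma}_{S,X}([0, T], H_\alpha)$ and call it a mildly controlled rough path if
$(Y, Y') \in \hat{\mathcal{C}}^\gamma H_\alpha \times \hat{\mathcal{C}}^\gamma H_\alpha$ and

\[
R^Y_{s,t} := \hat\delta Y_{s,t} - S_{t-s} Y'_s X_{s,t} \tag{4.46}
\]

belongs to $\mathcal{C}_2^{2\gamma} H_\alpha$. With $\|Y, Y'\|^\wedge_{X,2\gamma;\alpha} := \|Y'\|^\wedge_{\gamma;\alpha} + \|R^Y\|_{2\gamma;\alpha}$ a seminorm is
defined on $\mathscr{D}^{2\gamma}_{S,X}$. We show that mildly controlled rough paths are stable under rough
convolution.
a) Apply the modified sewing lemma of Exercise 4.16 to show that the rough
convolution integral

\[
\int_s^t S_{t-u} Y_u \, d\mathbf{X}_u := \lim_{|\mathcal{P}| \to 0} \sum_{[u,v] \in \mathcal{P}} S_{t-u} (Y_u X_{u,v} + Y'_u \mathbb{X}_{u,v}) \tag{4.47}
\]

exists as an element of $\hat{\mathcal{C}}^\gamma H_\alpha$ and satisfies for every $0 \le \beta < 3\gamma$:

\[
\Big\| \int_s^t S_{t-u} Y_u \, d\mathbf{X}_u - S_{t-s} Y_s X_{s,t} - S_{t-s} Y'_s \mathbb{X}_{s,t} \Big\|_{H_{\alpha+\beta}}
 \lesssim \big( \|R^Y\|_{2\gamma;\alpha} \|X\|_\gamma + \|Y'\|^\wedge_{\gamma;\alpha} \|\mathbb{X}\|_{2\gamma} \big) |t - s|^{3\gamma-\beta} . \tag{4.48}
\]

b) Show that the map $(Y, Y') \mapsto (Z, Z') := \big( \int_0^\cdot S_{\cdot-u} Y_u \, d\mathbf{X}_u , Y \big)$ is continuous
from $\mathscr{D}^{2\gamma}_{S,X}([0, T], H_\alpha)$ to $\mathscr{D}^{2\gamma}_{S,X}([0, T], H_\alpha)$ and one has the bound:

\[
\|Z, Z'\|^\wedge_{X,2\gamma;\alpha} \lesssim \|Y\|^\wedge_{\gamma;\alpha}
 + \big( \|Y'_0\|_{H_\alpha} + \|(Y, Y')\|^\wedge_{X,2\gamma;\alpha} \big) \big( \|X\|_\gamma + \|\mathbb{X}\|_{2\gamma} \big) . \tag{4.49}
\]

c) Make the (notational) adjustment to handle general $d \in \mathbb{N}$.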
Exercise 4.18 (Integration against step-N rough paths) Any path
$\mathbf{X} : [0, T] \to T_1^{(N)}(\mathbb{R}^d)$ gives rise to increments $\mathbf{X}_s^{-1} \otimes \mathbf{X}_t =: \mathbf{X}_{s,t}$ so that Chen’s relation becomes
a tautology. Assume also $|\langle \mathbf{X}_{s,t}, w \rangle| \lesssim |t - s|^{\alpha |w|}$, $|w| \le N = \lfloor 1/\alpha \rfloor$. (These
are the naïve higher order rough paths introduced in Section 2.4.) Show that the
rough integral $\int \mathbf{Y} \, d\mathbf{X}$ defined as in (4.35) is well-defined and detail its structure.
(Naïve rough paths are ill-suited to integrate $f(\mathbf{Y})$ with regular but non-linear $f$; in
Section 7.6 this is resolved for geometric rough paths.)

4.8 Comments

Young integration [You36], which can be seen as level-1 rough integration, was a key
inspiration for the analytical aspects of Lyons’ rough integration [Lyo94, Lyo98],
and has remained the “entrance test” for every subsequent (re)interpretation of rough
integration, including [Gub04, FdLP06, Pic08, HN09, GIP15, GIP16, FS17]. From
a harmonic analysis perspective, Young integration in the Hölder scale, as presented
here, implies that the product of smooth functions extends naturally to $\mathcal{C}^\beta \times \mathcal{C}^{-\alpha}$,
with values in $\mathcal{D}'(\mathbb{R})$, if and only if $\beta > \alpha$. Similar statements, replacing one-dimensional
space ($[0, T] \subset \mathbb{R}$, “time”) by $\mathbb{R}^d$ are well known, cf. e.g. [BCD11, Thm 2.52]
and Theorem 13.18 later on in the book. Young (and later rough) integration is
naturally formulated in p-variation scale, examples with p < 2 are plentiful and
range from Schramm–Loewner trace [Wer12, FT17], fractional Brownian motion (cf.
Section 10.3) with Hurst parameter H > 1/2 to Lévy processes and homogenisation
problems [CFKM19]. Of course, $p = 2^+$ is the correct scale for semimartingales,
also in the càdlàg setting, see Section 3.8. The sewing lemma, obtained independently
by Feyel–De La Pradelle (in an early version of [FdLP06]) and Gubinelli [Gub04],
formalises abstract Riemann–Young integration and is a flexible real analysis lemma,
with many variations found in [FDM08, GT10, BL19, Yas18, GH19, GHN19] and
also [FS17, FZ18] for a sewing lemma, and subsequent integration theory, with
jumps. An application of sewing to level sets in the Heisenberg group is given
in [MST18]. The applications of Lê’s important stochastic sewing lemma [Lê18],
Section 4.6, include regularisation by noise [Lê18], the construction of rough Markov
diffusions [FHL20] by solving hybrid Itô-rough differential equation in the spirit
of Section 12.2.1, and an averaging result for SDEs driven by fractional Brownian
motion [HL19].
Integration of one-forms against continuous p-variation geometric rough paths for
any p ∈ [1, ∞) was developed by Lyons [Lyo98]; see also [LQ02, LCL07, FV10b,
LY15]. For a careful discussion of the integration of weakly geometric rough paths
in infinite dimensions we refer to Cass et al. [CDLL16].
Rough integration against controlled paths is due to Gubinelli, see [Gub04]
where it is developed in an α-Hölder setting, $\alpha > \tfrac13$. Loosely speaking, it allows one
to “linearise” many considerations (the space of controlled paths is a Banach space,
while a typical space of rough paths is not). This point of view has been generalised
to arbitrary α (both in the geometric and the non-geometric setting) in [Gub10],
see also [HK15, FZ18]. Rough convolution, Exercise 4.17, follows [GT10, GH19],
crucial for “mild” RPDE solution, cf. Section 12.5.
The controlled rough path integration point of view can be pushed even further
and, as a matter of fact, the theory of regularity structures developed in [Hai14b] and
exposed in Chapter 13 onwards, provides a unified framework in which the Gubinelli
derivative and the regular derivatives are but two examples of a more general theory
of objects behaving “like Taylor expansions” and allowing one to describe the small-scale
structure of a function and / or distribution in terms of “known” objects (polynomials
in the case of Taylor expansions, the underlying rough path in the case of controlled
paths).
Chapter 5
Stochastic integration and Itô’s formula

In this chapter, we compare the integration theory developed in the previous chapter
to the usual theories of stochastic integration, be it in the Itô or the Stratonovich
sense.

5.1 Itô integration

Recall from Section 3 that Brownian motion $B$ can be enhanced to a (random) rough
path $\mathbf{B} = (B, \mathbb{B})$. Presently our focus is the case when $\mathbb{B}$ is given by the iterated Itô
integral¹
$$\mathbb{B}_{s,t} \overset{\text{def}}{=} \mathbb{B}^{\text{Itô}}_{s,t} = \int_s^t B_{s,u} \otimes dB_u$$
and the so enhanced Brownian motion has almost surely (non-geometric) α-Hölder
rough sample paths, for any $\alpha \in (\tfrac13, \tfrac12)$. That is, $\mathbf{B}(\omega) = (B(\omega), \mathbb{B}(\omega)) \in \mathscr{C}^\alpha$ for
every $\omega \in N_1^c$ where, here and in the sequel, $N_i$, $i = 1, 2, \ldots$ denote suitable null
sets. We now show that rough integrals (against $\mathbf{B} = \mathbf{B}^{\text{Itô}}$) and Itô integrals, whenever
both are well-defined, coincide.
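Proposition 5.1. Assume $(Y(\omega), Y'(\omega)) \in \mathscr{D}^{2\alpha}_{B(\omega)}$ for every $\omega \in N_2^c$. Set $N_3 = N_1 \cup N_2$.
Then the rough integral
$$\int_0^T Y \, d\mathbf{B} = \lim_{n \to \infty} \sum_{[u,v] \in \mathcal{P}_n} \big( Y_u B_{u,v} + Y'_u \mathbb{B}_{u,v} \big)$$
exists, for each fixed $\omega \in N_3^c$, along any sequence $(\mathcal{P}_n)$ with mesh $|\mathcal{P}_n| \downarrow 0$. If $Y, Y'$
are adapted then, almost surely,
$$\int_0^T Y \, d\mathbf{B} = \int_0^T Y \, dB .$$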
¹ The case when $\mathbb{B}$ is given via iterated Stratonovich integration is left to Section 5.2 below.


Proof. Without loss of generality $T = 1$. The existence of the rough integral for
$\omega \in N_3^c$ under the stated assumptions is immediate from Theorem 4.10, applied
to $Y(\omega)$, controlled by $B(\omega)$, for $\omega \in N_2^c$ fixed. Recall (e.g. [RY99]) that for any
continuous, adapted process $Y$ the Itô integral against Brownian motion has the
representation
$$\int_0^1 Y \, dB = \lim_{n \to \infty} \sum_{[u,v] \in \mathcal{P}_n} Y_u B_{u,v} \qquad \text{(in probability)}$$
along any sequence $(\mathcal{P}_n)$ with mesh $|\mathcal{P}_n| \downarrow 0$. By switching to a subsequence, if
necessary, we can assume that the convergence holds almost surely, say on $N_4^c$. Set
$N_5 := N_3 \cup N_4$. We shall complete the proof under the assumption that there exists
a (deterministic) constant $M > 0$ such that
$$\sup_{\omega \in N_5^c} |Y'(\omega)|_\infty \le M .$$
(This is the case in the “model” situation $Y = F(X)$, $Y' = DF(X)$ where $F$ was
in particular assumed to have bounded derivatives; the general case is obtained by
localisation and left to Exercise 5.1.)
The claim is that the rough and Itô integral coincide on $N_5^c$. With a look at the
respective Riemann-sums, convergent away from $N_5$, basic analysis tells us that
$$\forall \omega \in N_5^c : \quad \exists \lim_n \sum_{[u,v] \in \mathcal{P}_n} Y'_u \mathbb{B}_{u,v} ,$$
and that this limit equals the difference of rough and Itô integrals (on $N_5^c$, a set of
full measure). Of course, $|\mathcal{P}_n| \downarrow 0$, and to see that the above limit is indeed zero (at
least on a set of full measure), it will be enough to show that
$$\Big| \sum_{[u,v] \in \mathcal{P}} Y'_u \mathbb{B}_{u,v} \Big|^2_{L^2} = O(|\mathcal{P}|) . \tag{5.1}$$
To this end, assume the partition is of the form $\mathcal{P} = \{0 = \tau_0 < \cdots < \tau_N = 1\}$
and define a (discrete-time) martingale started at $S_0 := 0$ with increments
$S_{k+1} - S_k = Y'_{\tau_k} \mathbb{B}_{\tau_k, \tau_{k+1}}$. Since $|\mathbb{B}_{\tau_k, \tau_{k+1}}|^2_{L^2}$ is proportional to $|\tau_{k+1} - \tau_k|^2$, as may be
seen from Brownian scaling, we then have
$$\Big| \sum_{[u,v] \in \mathcal{P}} Y'_u \mathbb{B}_{u,v} \Big|^2_{L^2} = \Big| \sum_{k=0}^{N-1} (S_{k+1} - S_k) \Big|^2_{L^2} = \sum_{k=0}^{N-1} |S_{k+1} - S_k|^2_{L^2} \le M^2 \sum_{k=0}^{N-1} \big| \mathbb{B}_{\tau_k, \tau_{k+1}} \big|^2_{L^2} = O(|\mathcal{P}|) ,$$
as desired. □
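The estimate (5.1) can be illustrated numerically in dimension $d = 1$, where $\mathbb{B}^{\text{Itô}}_{u,v} = \frac12 (B_{u,v}^2 - (v-u))$. The following is only a minimal Monte Carlo sketch, not part of the proof: the function names, the choice $Y = \sin(B)$, $Y' = \cos(B)$ (so that $M = 1$), the uniform partitions and the seeds are all assumptions made for illustration.

```python
import math
import random

def compensator_sum(n, seed=0):
    """Sum of Y'_u * BB^Ito_{u,v} over the uniform partition of [0, 1]
    with n intervals, for the illustrative choice Y' = cos(B)."""
    rng = random.Random(seed)
    h = 1.0 / n
    b = 0.0
    total = 0.0
    for _ in range(n):
        db = rng.gauss(0.0, math.sqrt(h))
        ito_second_level = 0.5 * (db * db - h)  # iterated Ito integral in d = 1
        total += math.cos(b) * ito_second_level  # left-point evaluation of Y'
        b += db
    return total

def mean_square(n, samples=200):
    """Monte Carlo estimate of the squared L^2 norm appearing in (5.1)."""
    return sum(compensator_sum(n, seed=k) ** 2 for k in range(samples)) / samples

# The estimates should shrink roughly in proportion to the mesh 1/n.
for n in (16, 64, 256):
    print(n, mean_square(n))
```

Since the increments of the discrete martingale are orthogonal, the squared $L^2$ norm is bounded by $M^2 \sum_k h^2/2 = h/2$ for mesh $h = 1/n$, which is the behaviour the printed values are expected to reflect up to sampling error.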


5.2 Stratonovich integration

We could equally well have enhanced Brownian motion with
$$\mathbb{B}^{\text{Strat}}_{s,t} := \int_s^t B_{s,u} \otimes \circ dB_u = \mathbb{B}^{\text{Itô}}_{s,t} + \frac12 (t-s) I .$$
Almost surely, this construction then yields geometric α-Hölder rough sample paths,
for any $\alpha \in (\tfrac13, \tfrac12)$. Recall that, by definition, the Stratonovich integral is given by
$$\int_0^T Y \circ dB \overset{\text{def}}{=} \int_0^T Y \, dB + \frac12 [Y, B]_T$$
whenever the Itô integral is well-defined and the quadratic covariation of $Y$ and
$B$ exists in the sense that $[Y, B]_T := \lim_{|\mathcal{P}| \to 0} \sum_{[u,v] \in \mathcal{P}} Y_{u,v} B_{u,v}$ exists as limit in
probability.
In complete analogy to the Itô case, we now show that rough integration against
Stratonovich enhanced Brownian motion coincides with usual Stratonovich integra-
tion against Brownian motion under some natural assumptions guaranteeing that
both notions of integral are well-defined.
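Corollary 5.2. As above, assume $Y = Y(\omega) \in \mathscr{C}^\alpha_{B(\omega)}$ for every $\omega \in N_2^c$. Set
$N_3 = N_1 \cup N_2$. Then the rough integral of $Y$ against $\mathbf{B} = \mathbf{B}^{\text{Strat}}$ exists,
$$\int_0^T Y \, d\mathbf{B} = \lim_{n \to \infty} \sum_{[u,v] \in \mathcal{P}_n} \big( Y_u B_{u,v} + Y'_u \mathbb{B}^{\text{Strat}}_{u,v} \big) .$$
Moreover, if $Y, Y'$ are adapted, the quadratic covariation of $Y$ and $B$ exists and,
almost surely,
$$\int_0^T Y \, d\mathbf{B} = \int_0^T Y \circ dB .$$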

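Proof. $\mathbb{B}^{\text{Strat}}_{s,t} = \mathbb{B}^{\text{Itô}}_{s,t} + f_{s,t}$ where $f(t) = \frac{t}{2} \mathrm{Id}$. This entails, as was discussed in
Example 4.14,
$$\int_0^1 Y \, d\mathbf{B}^{\text{Strat}} = \int_0^1 Y \, d\mathbf{B}^{\text{Itô}} + \int_0^1 Y' \, df .$$
Thanks to Proposition 5.1, it only remains to identify $2 \int_0^1 Y' \, df = \int_0^1 Y'_t \, dt$ with
$[Y, B]_1$. To see this, write
$$\sum_{[u,v] \in \mathcal{P}} Y_{u,v} B_{u,v} = \sum_{[u,v] \in \mathcal{P}} \big( (Y'_u B_{u,v}) B_{u,v} + R_{u,v} B_{u,v} \big) = \Big( \sum_{[u,v] \in \mathcal{P}} Y'_u (B_{u,v} \otimes B_{u,v}) \Big) + O\big( |\mathcal{P}|^{3\alpha - 1} \big) ,$$
where we used that $\sum R_{u,v} B_{u,v} = O\big( |\mathcal{P}|^{3\alpha - 1} \big)$ thanks to $R \in \mathcal{C}_2^{2\alpha}$ and $B \in \mathcal{C}^\alpha$.
Note that
$$B_{u,v} \otimes B_{u,v} = 2 \, \mathrm{Sym}\big( \mathbb{B}^{\text{Strat}}_{u,v} \big) = 2 \, \mathrm{Sym}\big( \mathbb{B}^{\text{Itô}}_{u,v} \big) + (v - u) I .$$
We have seen in the proof of Proposition 5.1 that any limit (in probability, say) of
$$\sum_{[u,v] \in \mathcal{P}} Y'_u \mathbb{B}^{\text{Itô}}_{u,v}$$
must be zero. In fact, a look at the argument reveals that this remains true with $\mathbb{B}^{\text{Itô}}_{u,v}$
replaced by $\mathrm{Sym}\big( \mathbb{B}^{\text{Itô}}_{u,v} \big)$. It follows that
$$\lim_{|\mathcal{P}| \to 0} \sum_{[u,v] \in \mathcal{P}} Y_{u,v} B_{u,v} = \lim_{|\mathcal{P}| \to 0} \sum_{[u,v] \in \mathcal{P}} Y'_u (v - u) = \int_0^1 Y'_t \, dt ,$$
thus concluding the proof. □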
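As a concrete check of the relation between the two integrals, take $d = 1$ and $Y = B$, $Y' = 1$, so that $\int_0^T B \, dB = \frac12 (B_T^2 - T)$ and $\int_0^T B \circ dB = \frac12 B_T^2$. The sketch below (function names, grid size and seed are illustrative assumptions) builds the left-point Itô sum and the discrete covariation on one simulated path; at the discrete level, the Itô sum plus half the covariation telescopes exactly to $\frac12 B_T^2$.

```python
import math
import random

def simulate_path(n, T=1.0, seed=1):
    """Brownian path sampled on a uniform grid with n steps (B_0 = 0)."""
    rng = random.Random(seed)
    h = T / n
    path = [0.0]
    for _ in range(n):
        path.append(path[-1] + rng.gauss(0.0, math.sqrt(h)))
    return path

def left_point_sum(y, x):
    """Riemann sum of y against x with left-point evaluation."""
    return sum(y[k] * (x[k + 1] - x[k]) for k in range(len(x) - 1))

def covariation_sum(y, x):
    """Discrete approximation of the quadratic covariation [y, x]_T."""
    return sum((y[k + 1] - y[k]) * (x[k + 1] - x[k]) for k in range(len(x) - 1))

n, T = 20000, 1.0
B = simulate_path(n, T)
ito = left_point_sum(B, B)
cov = covariation_sum(B, B)
strat = ito + 0.5 * cov  # Stratonovich = Ito + (1/2) covariation

print("Ito sum          :", ito, " vs (B_T^2 - T)/2 =", 0.5 * (B[-1] ** 2 - T))
print("Stratonovich sum :", strat, " vs B_T^2 / 2     =", 0.5 * B[-1] ** 2)
```

The Stratonovich identity holds up to rounding for every sample path, whereas the Itô identity only holds up to the sampling error of the covariation sum around $T$.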


5.3 Itô’s formula and Föllmer

Given a smooth path $X : [0, T] \to V$ and a map $F : V \to W$ in $\mathcal{C}^1_b$, where $V, W$ are
Banach spaces as usual, the chain rule from classical “first order” calculus tells us
that
$$F(X_t) = F(X_0) + \int_0^t DF(X_s) \, dX_s , \qquad 0 \le t \le T .$$
Unsurprisingly, the same change of variables formula holds for geometric rough
paths $\mathbf{X} = (X, \mathbb{X})$, which are essentially limits of smooth paths, and it is not hard
to figure out, in view of Example 4.14, that a “second order” correction, involving
$D^2 F$, appears in the non-geometric case. In other words, one can write down Itô
formulae for rough paths.
Before doing so, however, an important preliminary discussion is in order. Namely,
much of our effort so far was devoted to the understanding of (rough) integration
against 1-forms, say $G = G(X)$ and indeed we found
$$\int G(X) \, d\mathbf{X} \approx \sum_{[s,t] \in \mathcal{P}} \big( G(X_s) X_{s,t} + DG(X_s) \mathbb{X}_{s,t} \big)$$
in the sense that the compensated Riemann-Stieltjes sums appearing on the right-hand
side converge with mesh $|\mathcal{P}| \to 0$. Let us split $\mathbb{X}$ into symmetric part,
$\mathbb{S}_{s,t} := \mathrm{Sym}(\mathbb{X}_{s,t})$, and antisymmetric (“area”) part, $\mathrm{Anti}(\mathbb{X}_{s,t}) := \mathbb{A}_{s,t}$. Then
$$DG(X_s) \mathbb{X}_{s,t} = DG(X_s) \mathbb{S}_{s,t} + DG(X_s) \mathbb{A}_{s,t}$$



and the final term disappears in the gradient case, i.e. when $G = DF$. Indeed, the
contraction of a symmetric tensor (here: $D^2 F$) with an antisymmetric tensor (here:
$\mathbb{A}$) always vanishes. In other words, area matters very much for general integrals
of 1-forms but not at all for gradient 1-forms. Note also that, contrary to $\mathbb{A}$, the
symmetric part $\mathbb{S}$ is a nice function of the underlying path $X$. For instance, for Itô
enhanced Brownian motion in $\mathbb{R}^d$, one has the identity
$$\mathbb{S}^{i,j}_{s,t} = \frac12 \Big( \int_s^t B^i_{s,r} \, dB^j_r + \int_s^t B^j_{s,r} \, dB^i_r \Big) = \frac12 \big( B^i_{s,t} B^j_{s,t} - \delta^{ij} (t - s) \big) , \qquad 1 \le i, j \le d .$$
These considerations suggest that the following definition encapsulates all the data
required for the integration of gradient 1-forms.

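Definition 5.3. We call $\mathbf{X} = (X, \mathbb{S})$ a reduced rough path, in symbols
$\mathbf{X} \in \mathscr{C}^\alpha_r([0, T], V)$, if $X = X_t$ takes values in a Banach space $V$, $\mathbb{S} = \mathbb{S}_{s,t}$ takes values
in $\mathrm{Sym}(V \otimes V)$, and the following hold:
i) a “reduced” Chen relation
$$\mathbb{S}_{s,t} - \mathbb{S}_{s,u} - \mathbb{S}_{u,t} = \mathrm{Sym}(X_{s,u} \otimes X_{u,t}) , \qquad 0 \le s, t, u \le T ,$$
ii) the usual analytical conditions, $X_{s,t} = O(|t - s|^\alpha)$, $\mathbb{S}_{s,t} = O\big( |t - s|^{2\alpha} \big)$, for
some $\alpha > 1/3$.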

Clearly, any $\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}^\alpha([0, T], V)$ induces a reduced rough path by
ignoring its area $\mathbb{A} = \mathrm{Anti}(\mathbb{X})$. More importantly, and in stark contrast to the general
rough path case, a lift of a path $X \in \mathcal{C}^\alpha$ to a reduced rough path can be trivially
obtained via its square-increments $\frac12 X_{s,t} \otimes X_{s,t}$. We have the following result.

Lemma 5.4. Given $X \in \mathcal{C}^\alpha$, $\alpha \in (1/3, 1/2]$, the “geometric” choice
$\bar{\mathbb{S}}_{s,t} = \frac12 X_{s,t} \otimes X_{s,t}$ yields a reduced rough path, i.e. $(X, \bar{\mathbb{S}}) \in \mathscr{C}^\alpha_r$. Moreover, for any
2α-Hölder path $\gamma$ with values in $\mathrm{Sym}(V \otimes V)$, the perturbation
$$\mathbb{S}_{s,t} = \bar{\mathbb{S}}_{s,t} + \frac12 (\gamma_t - \gamma_s) = \frac12 \big( X_{s,t} \otimes X_{s,t} + \gamma_{s,t} \big)$$
also yields a reduced rough path $(X, \mathbb{S})$. Finally, all reduced rough path lifts of $X$
are obtained in this fashion.

Proof. A simple exercise for the reader. □
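The algebraic part of Lemma 5.4, namely the reduced Chen relation of Definition 5.3, can be sanity-checked numerically. The sketch below is an illustration under stated assumptions, not a proof: the path `path` in $\mathbb{R}^2$ and the symmetric-matrix-valued perturbation `gamma` are arbitrary hypothetical choices (smooth for simplicity, since only the algebra is tested), and all function names are made up for this example.

```python
import math

def outer(a, b):
    """Tensor product a (x) b of two vectors, as a nested list."""
    return [[ai * bj for bj in b] for ai in a]

def sym(m):
    """Symmetric part of a square matrix."""
    d = len(m)
    return [[0.5 * (m[i][j] + m[j][i]) for j in range(d)] for i in range(d)]

def path(t):
    """Hypothetical path X in R^2."""
    return [math.sin(3.0 * t), t * t]

def gamma(t):
    """Hypothetical perturbation with values in symmetric 2x2 matrices."""
    return [[t, 0.5 * t * t], [0.5 * t * t, -2.0 * t]]

def increment(s, t):
    return [b - a for a, b in zip(path(s), path(t))]

def reduced_lift(s, t):
    """S_{s,t} = (X_{s,t} (x) X_{s,t} + gamma_{s,t}) / 2, as in Lemma 5.4."""
    x = increment(s, t)
    square = outer(x, x)
    gs, gt = gamma(s), gamma(t)
    return [[0.5 * (square[i][j] + gt[i][j] - gs[i][j]) for j in range(2)]
            for i in range(2)]

def chen_defect(s, u, t):
    """Largest entry of S_{s,t} - S_{s,u} - S_{u,t} - Sym(X_{s,u} (x) X_{u,t})."""
    st, su, ut = reduced_lift(s, t), reduced_lift(s, u), reduced_lift(u, t)
    rhs = sym(outer(increment(s, u), increment(u, t)))
    return max(abs(st[i][j] - su[i][j] - ut[i][j] - rhs[i][j])
               for i in range(2) for j in range(2))

print(chen_defect(0.1, 0.4, 0.9))  # zero up to floating-point rounding
```

The check works because $(a + b) \otimes (a + b) - a \otimes a - b \otimes b = 2 \, \mathrm{Sym}(a \otimes b)$ with $a = X_{s,u}$, $b = X_{u,t}$, while the increments of $\gamma$ are additive and drop out.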


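The previous lemma gives in particular a one-one correspondence between $\mathbb{S}$ and
$\gamma$. We thus formalise the role of $\gamma$.

Definition 5.5 (Bracket of a reduced rough path). Given $\mathbf{X} = (X, \mathbb{S}) \in \mathscr{C}^\alpha_r(V)$,
we define the bracket
$$[\mathbf{X}] : [0, T] \to \mathrm{Sym}(V \otimes V) , \qquad t \mapsto [\mathbf{X}]_t \overset{\text{def}}{=} X_{0,t} \otimes X_{0,t} - 2 \mathbb{S}_{0,t} .$$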

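Note that, as a consequence of the previous lemma, $[\mathbf{X}] \in \mathcal{C}^{2\alpha}$. Furthermore, if one
defines
$$[\mathbf{X}]_{s,t} \overset{\text{def}}{=} X_{s,t} \otimes X_{s,t} - 2 \mathbb{S}_{s,t} ,$$
then one has the identity $[\mathbf{X}]_{s,t} = [\mathbf{X}]_{0,t} - [\mathbf{X}]_{0,s}$ for any two times $s, t$.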

Remark 5.6. We already encountered $[\mathbf{X}]$ as a way to decompose $\mathbf{X} \in \mathscr{C}^\alpha$ into a
geometric rough path plus extra information (Exercise 2.11). Our motivation here is
different in that we exploit the fact that $[\mathbf{X}]$ requires no knowledge of the area
$\mathrm{Anti}(\mathbb{X}) := \mathbb{A}$, a central object for rough path theory.

Remark 5.7. While this notion of bracket does not rely on any sort of “quadratic
variation”, it is consistent with the product (a.k.a. integration by parts) formula from
Itô calculus. Indeed, for any semimartingale $X = X(t, \omega)$, with $X_0 = 0$ say, we
have
$$\int_0^t X^i_s \, dX^j_s + \int_0^t X^j_s \, dX^i_s = X^i_t X^j_t - \langle X^i, X^j \rangle_t ; \tag{5.2}$$
from a rough path perspective, the left-hand side is precisely $\mathbb{X}^{i,j}_{0,t} + \mathbb{X}^{j,i}_{0,t} = 2 \mathbb{S}^{i,j}_{0,t}$.

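Proposition 5.8 (Itô formula for reduced rough paths). Let $F : V \to W$ be of
class $\mathcal{C}^3_b$ and let $\mathbf{X} = (X, \mathbb{S}) \in \mathscr{C}^\alpha_r([0, T], V)$ with $\alpha > 1/3$. Then
$$F(X_t) = F(X_0) + \int_0^t DF(X_s) \, d\mathbf{X}_s + \frac12 \int_0^t D^2 F(X_s) \, d[\mathbf{X}]_s , \qquad 0 \le t \le T .$$
Here, writing $\mathcal{P}$ for partitions of $[0, t]$, the first integral is given by²
$$\int_0^t DF(X_s) \, d\mathbf{X}_s \overset{\text{def}}{=} \lim_{|\mathcal{P}| \to 0} \sum_{[u,v] \in \mathcal{P}} \big( DF(X_u) X_{u,v} + D^2 F(X_u) \mathbb{S}_{u,v} \big) , \tag{5.3}$$
while the second integral is a well-defined Young integral.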

Proof. Consider first the geometric case, $\mathbb{S} = \bar{\mathbb{S}}$, in which case the bracket is zero. The
proof is straightforward. Indeed, thanks to α-Hölder regularity of $X$ with $\alpha > 1/3$,
we obtain
$$\begin{aligned}
F(X_T) - F(X_0) &= \sum_{[u,v] \in \mathcal{P}} \big( F(X_v) - F(X_u) \big) \\
&= \sum_{[u,v] \in \mathcal{P}} \Big( DF(X_u) X_{u,v} + \frac12 D^2 F(X_u)(X_{u,v}, X_{u,v}) + o(|v - u|) \Big) \\
&= \sum_{[u,v] \in \mathcal{P}} \Big( DF(X_u) X_{u,v} + D^2 F(X_u) \bar{\mathbb{S}}_{u,v} + o(|v - u|) \Big) .
\end{aligned}$$
We conclude by taking the limit $|\mathcal{P}| \to 0$, also noting that $\sum_{[u,v] \in \mathcal{P}} o(|v - u|) \to 0$.
For the non-geometric situation, just substitute
$$\bar{\mathbb{S}}_{u,v} = \mathbb{S}_{u,v} + \frac12 [\mathbf{X}]_{u,v} .$$
Since $D^2 F$ is Lipschitz, $D^2 F(X_\cdot) \in \mathcal{C}^\alpha$ and we can split up the “bracket” term and
note that
$$\sum_{[u,v] \in \mathcal{P}} D^2 F(X_u) [\mathbf{X}]_{u,v} \to \int_0^t D^2 F(X_u) \, d[\mathbf{X}]_u ,$$
where the convergence to the Young integral follows from $[\mathbf{X}] \in \mathcal{C}^{2\alpha}$. The rest is
now obvious. □

² Note consistency with the rough integral when $\mathbf{X} \in \mathscr{C}^\alpha$.
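Proposition 5.8 can be illustrated in the simplest setting $V = W = \mathbb{R}$, with $X$ a simulated Brownian path and the Itô choice $\mathbb{S}_{u,v} = \frac12 (X_{u,v}^2 - (v - u))$, so that $[\mathbf{X}]_{u,v} = v - u$. The sketch below is a numerical illustration under these assumptions only; the test function $F = \sin$, the uniform grid, the seed and the function name are illustrative choices. It returns the gap between $F(X_T) - F(X_0)$ and the discrete version of the right-hand side.

```python
import math
import random

def ito_formula_gap(n, T=1.0, seed=2):
    """F(X_T) - F(X_0) minus the compensated Riemann sum (5.3) and the
    discrete bracket term, for F = sin on a simulated Brownian path."""
    rng = random.Random(seed)
    h = T / n
    x = 0.0
    compensated = 0.0  # sum of DF(X_u) X_{u,v} + D^2F(X_u) S_{u,v}
    bracket = 0.0      # sum of D^2F(X_u) [X]_{u,v}
    for _ in range(n):
        dx = rng.gauss(0.0, math.sqrt(h))
        s_uv = 0.5 * (dx * dx - h)       # Ito choice of the symmetric part
        bracket_uv = dx * dx - 2.0 * s_uv  # equals h for this choice
        compensated += math.cos(x) * dx - math.sin(x) * s_uv
        bracket += -math.sin(x) * bracket_uv
        x += dx
    return (math.sin(x) - math.sin(0.0)) - (compensated + 0.5 * bracket)

for n in (100, 1000, 10000):
    print(n, ito_formula_gap(n))
```

The gap is a sum of third-order Taylor remainders, bounded by $\frac16 \sum |X_{u,v}|^3$, and is therefore expected to decay as the mesh is refined.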

Example 5.9. Consider the case when $\mathbf{X} = \mathbf{B}$, Itô enhanced Brownian motion. Then
$\mathbb{X}$ is given by iterated Itô integrals and, thanks to the Itô product rule (5.2),
$$2 \mathbb{S}^{i,j}_{0,t} = \int_0^t \big( B^i \, dB^j + B^j \, dB^i \big) = B^i_t B^j_t - \langle B^i, B^j \rangle_t .$$
The usual Itô formula is then recovered from the fact that
$$[\mathbf{B}]^{i,j}_t = B^i_{0,t} B^j_{0,t} - 2 \mathbb{S}^{i,j}_{0,t} = \langle B^i, B^j \rangle_{0,t} = \delta^{i,j} t .$$

We conclude this section with a short discussion on Föllmer’s calcul d’Itô sans
probabilités [Föl81]. For simplicity of notation, we take $V = \mathbb{R}^d$, $W = \mathbb{R}^e$ in what
follows. With regard to (5.3), let us insist that the compensation is necessary and one
cannot, in general, separate the sum into two convergent sums. On the other hand,
we can combine the converging sums and write
$$\begin{aligned}
F(X)_{0,t} &= \lim_{|\mathcal{P}| \to 0} \sum_{[u,v] \in \mathcal{P}} \Big( DF(X_u) X_{u,v} + D^2 F(X_u) \mathbb{S}_{u,v} + \frac12 D^2 F(X_u) [\mathbf{X}]_{u,v} \Big) \\
&= \lim_{|\mathcal{P}| \to 0} \sum_{[u,v] \in \mathcal{P}} \Big( DF(X_u) X_{u,v} + \frac12 D^2 F(X_u)(X_{u,v}, X_{u,v}) \Big) .
\end{aligned} \tag{5.4}$$
We now put forward an assumption that allows us to break up the above sum.

Definition 5.10. Let $\pi = (\mathcal{P}_n)_{n \ge 0}$ be a sequence of partitions of $[0, T]$ with mesh
$|\mathcal{P}_n| \to 0$. We say that $X : [0, T] \to \mathbb{R}^d$ has finite quadratic variation in the sense of
Föllmer along $\pi$ if, for every $t \in [0, T]$ and $1 \le i, j \le d$ the limit
$$[X^i, X^j]^\pi_t := \lim_{n \to \infty} \sum_{[u,v] \in \mathcal{P}_n} \big( X^i_{v \wedge t} - X^i_{u \wedge t} \big) \big( X^j_{v \wedge t} - X^j_{u \wedge t} \big)$$
exists. Write $[X, X]^\pi$ for the resulting path with values in $\mathrm{Sym}(\mathbb{R}^d \otimes \mathbb{R}^d)$, i.e. the
space of symmetric $d \times d$ matrices.
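To get a feeling for Definition 5.10, one can compute the approximating sums at $t = T = 1$ along dyadic partitions for a simulated Brownian path and, for comparison, for a smooth path. This is only a rough sketch: a simulation on a finite grid cannot establish the existence of the limit, and the function names, the finest level and the seed are illustrative assumptions.

```python
import math
import random

def sample_brownian_path(levels, seed=3):
    """Brownian path on the dyadic grid of [0, 1] with 2**levels steps."""
    n = 2 ** levels
    rng = random.Random(seed)
    h = 1.0 / n
    path = [0.0]
    for _ in range(n):
        path.append(path[-1] + rng.gauss(0.0, math.sqrt(h)))
    return path

def dyadic_quadratic_variation(path, level):
    """Sum of squared increments along the dyadic partition of [0, 1]
    with 2**level intervals (path must be sampled on a finer dyadic grid)."""
    n = len(path) - 1
    step = n // (2 ** level)
    points = path[::step]
    return sum((b - a) ** 2 for a, b in zip(points, points[1:]))

LEVELS = 12
brownian = sample_brownian_path(LEVELS)
smooth = [math.sin(2.0 * math.pi * k / 2 ** LEVELS) for k in range(2 ** LEVELS + 1)]

for level in (4, 8, 12):
    print(level,
          dyadic_quadratic_variation(brownian, level),
          dyadic_quadratic_variation(smooth, level))
```

For the Brownian sample the sums are expected to settle near $t = 1$, while for the smooth path they tend to zero, in line with the fact that paths of bounded variation carry no quadratic variation.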

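Lemma 5.11. Assume $X : [0, T] \to \mathbb{R}^d$ is continuous and has finite quadratic
variation in the sense of Föllmer, along $\pi = (\mathcal{P}_n)_{n \ge 0}$. Then the map
$t \mapsto [X, X]^\pi_t$ is of bounded variation on $[0, T]$ and, for any continuous
$G : [0, T] \to L^{(2)}(\mathbb{R}^d \times \mathbb{R}^d, \mathbb{R}^e)$,
$$\lim_{n \to \infty} \sum_{\substack{[u,v] \in \mathcal{P}_n \\ u < t}} G(u) \big( X_{u,v}, X_{u,v} \big) = \int_0^t G(u) \, d[X, X]^\pi_u \in \mathbb{R}^e .$$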

Proof. For the first statement, it is enough to argue component by component. Set
$[X^i]^\pi := [X^i, X^i]^\pi$. By polarisation,
$$[X^i, X^j]^\pi_t = \frac12 \Big( [X^i + X^j]^\pi_t - [X^i]^\pi_t - [X^j]^\pi_t \Big) .$$
Since each term on the right-hand side is monotone in $t$, we see that $t \mapsto [X^i, X^j]^\pi_t$
is indeed of bounded variation.
Regarding the second statement, it is enough to check that, for continuous
$g : [0, T] \to \mathbb{R}$ and $Y$ of finite quadratic variation, with continuous bracket $t \mapsto [Y]^\pi_t$,
$$\lim_{n \to \infty} \sum_{\substack{[u,v] \in \mathcal{P}_n \\ u < t}} g(u) Y_{u,v}^2 = \int_0^t g(u) \, d[Y]^\pi_u . \tag{5.5}$$
Indeed, we can apply this for each component, with $g = G^k_{i,j}$ and
$$Y \in \big\{ X^i + X^j , \; X^i , \; X^j \big\} ,$$
which then also gives, by polarisation,
$$\sum_{\substack{[u,v] \in \mathcal{P}_n \\ u < t}} G^k_{i,j}(u) X^i_{u,v} X^j_{u,v} \to \int_0^t G^k_{i,j}(u) \, d[X^i, X^j]^\pi_u .$$
To see that (5.5) holds, write $\sum_{[u,v] \in \mathcal{P}_n, u < t} g(u) Y_{u,v}^2 = \int_{[0,t)} g(u) \, d\mu_n(u)$ with
$$\mu_n = \sum_{[u,v] \in \mathcal{P}_n, \, u < t} Y_{u,v}^2 \, \delta_u .$$
Note that $\mu_n$ is a finite measure on $[0, t)$ with distribution function
$$F_n(s) := \mu_n([0, s]) = \sum_{\substack{[u,v] \in \mathcal{P}_n \\ u \le s}} Y_{u,v}^2 .$$
As $n \to \infty$, $F_n(s) \to [Y]^\pi_s$ for any $s \le t$ by continuity of $Y$. Pointwise convergence
of the distribution functions implies weak convergence of the measures $\mu_n$ to the
measure $d[Y]^\pi$ on $[0, t)$, with distribution function the right-continuous modification
of $[Y]^\pi$. Since $g|_{[0,t)}$ is continuous, (5.5) follows. □

Combination of the above lemma with (5.4) gives the Itô–Föllmer formula,
$$F(X_t) = F(X_0) + \int_0^t DF(X_s) \, dX_s + \frac12 \int_0^t D^2 F(X_s) \, d[X, X]^\pi_s , \qquad 0 \le t \le T , \tag{5.6}$$
where the middle integral is given by the (now existent) limit of left-point
Riemann-Stieltjes approximations
$$\lim_{n \to \infty} \sum_{[u,v] \in \mathcal{P}_n} DF(X_u) X_{u,v} =: \int_0^t DF(X) \, dX .$$
In fact, we encourage the reader to verify as an exercise that this formula is valid
whenever $X : [0, T] \to \mathbb{R}^d$ is continuous, of finite quadratic variation, with
$t \mapsto [X, X]^\pi_t$ continuous. Note, however, that Föllmer’s notion of quadratic variation (and
the above integral) can and will depend in general on the sequence $(\mathcal{P}_n)$.

5.4 Backward integration

Given a Brownian motion $B = B_t(\omega)$, one can define the backward Itô-integral
$$\int_0^T f_t \, \overleftarrow{dB}_t := \lim_n \sum_{[s,t] \in \mathcal{P}_n} f_t B_{s,t} ,$$
whenever $|\mathcal{P}_n| \to 0$ and this limit exists, in probability and uniformly on compact
time intervals, and does not depend on the sequence of partitions $(\mathcal{P}_n)$ of $[0, T]$. For
instance,
$$\int_0^T B_t \, \overleftarrow{dB}_t = \frac12 B_T^2 + \frac{T}{2} .$$
In many applications one encounters integrands $f = f_t(\omega)$ that are backward
adapted in the sense that each $f_t$ is measurable with respect to the σ-field
$\mathcal{F}^T_t := \sigma(B_{u,v} : t \le u \le v \le T)$. For example,
$$\int_0^T (B_T - B_t) \, \overleftarrow{dB}_t = B_T^2 - \int_0^T B_t \, \overleftarrow{dB}_t = \frac12 B_T^2 - \frac{T}{2}$$
and we note (in contrast to the previous example) the zero mean property, which
of course comes from a backward martingale structure. Indeed, $\hat{B}_t := B_T - B_{T-t}$
is a standard Brownian motion, adapted to $\hat{\mathcal{F}}_t := \mathcal{F}^T_{T-t}$ and so is $\hat{f}_t = f_{T-t}$. The
backward integral can then be written as classical (forward) Itô integral
$$\int_0^T f_t \, \overleftarrow{dB}_t = \int_0^T \hat{f}_t \, d\hat{B}_t = \lim_n \sum_{[s,t] \in \mathcal{P}_n} \hat{f}_s \hat{B}_{s,t} . \tag{5.7}$$
Also, by analogy with its forward counterpart, the backward Stratonovich integral is
defined as the backward Itô integral, minus 1/2 times the quadratic variation of the
integrand.
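The two examples above can be reproduced with right-point Riemann sums on a simulated path. The sketch below is illustrative only; the function names, the grid size and the seed are assumptions. At the discrete level, $\sum B_t B_{s,t} = \frac12 B_T^2 + \frac12 \sum B_{s,t}^2$ holds exactly (with $B_0 = 0$), which explains the correction $+T/2$ in the limit.

```python
import math
import random

def simulate_path(n, T=1.0, seed=4):
    """Brownian path sampled on a uniform grid with n steps (B_0 = 0)."""
    rng = random.Random(seed)
    h = T / n
    path = [0.0]
    for _ in range(n):
        path.append(path[-1] + rng.gauss(0.0, math.sqrt(h)))
    return path

def right_point_sum(f, b):
    """Riemann sum of f against b with right-point evaluation."""
    return sum(f[k + 1] * (b[k + 1] - b[k]) for k in range(len(b) - 1))

n, T = 20000, 1.0
B = simulate_path(n, T)
squared_increments = sum((B[k + 1] - B[k]) ** 2 for k in range(n))

backward = right_point_sum(B, B)                       # approximates int B dB (backward)
reversed_integrand = [B[-1] - b for b in B]            # backward adapted integrand
backward_adapted = right_point_sum(reversed_integrand, B)

print(backward, "vs", 0.5 * B[-1] ** 2 + 0.5 * T)
print(backward_adapted, "vs", 0.5 * B[-1] ** 2 - 0.5 * T)
```

The first sum carries the positive correction, whereas the second, with its backward adapted integrand, carries the negative one, consistent with the zero mean property noted above.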
The purpose of this section is to understand backward integration as rough integration.
To this end, recall that the “forward” rough integral of $(Y, Y') \in \mathscr{D}^{2\alpha}_X$ against
$\mathbf{X} = (X, \mathbb{X})$ was given in Theorem 4.10 by
$$\int_0^T Y \, d\mathbf{X} = \lim_{|\mathcal{P}| \downarrow 0} \sum_{[s,t] \in \mathcal{P}} \big( Y_s X_{s,t} + Y'_s \mathbb{X}_{s,t} \big) \tag{5.8}$$
where $\mathcal{P}$ are partitions of $[0, T]$ with mesh-size $|\mathcal{P}|$. Clearly, some sort of “left-point”
evaluation has been hard-wired into our definition of rough integral. On the other
hand, one can expect that feeding in explicit second order information makes this
choice somewhat less important than in the case of classical stochastic integration.
The next proposition, purely deterministic, answers the question to what extent
one can replace left-point by right-point evaluation. In fact, it provides the natural
analogue of (5.7)³ but without any need of “backward” rough integration: both rough
integrals which appear in the following proposition are “forward” in the sense of
(5.8).

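Proposition 5.12 (Backward representation of rough integral). Given a rough
path $\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}^\alpha$ with $\alpha > 1/3$ and $(Y, Y') \in \mathscr{D}^{2\alpha}_X$ we have, for all
$r \in [0, T]$,
$$\begin{aligned}
\int_r^T (Y, Y') \, d\mathbf{X} &= \lim_{|\mathcal{P}| \downarrow 0} \sum_{[s,t] \in \mathcal{P}} \big( Y_t X_{s,t} + Y'_t (\mathbb{X}_{s,t} - X_{s,t} \otimes X_{s,t}) \big) \\
&= - \lim_{|\mathcal{P}| \downarrow 0} \sum_{[s,t] \in \mathcal{P}} \big( Y_t X_{t,s} + Y'_t \mathbb{X}_{t,s} \big) = - \int_0^{T-r} (\overleftarrow{Y}, \overleftarrow{Y}') \, d\overleftarrow{\mathbf{X}} ,
\end{aligned} \tag{5.9}$$
with $\overleftarrow{X}(t) = X(T - t)$ and similar for $Y$ and $Y'$.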

Proof. It is clear from (5.8) that the rough integral is given as (compensated)
Riemann–Stieltjes limit
$$\int_r^T Y \, d\mathbf{X} = \lim_{|\mathcal{P}| \downarrow 0} \sum_{[s,t] \in \mathcal{P}} \big( Y_s X_{s,t} + Y'_s \mathbb{X}_{s,t} + (*)_{s,t} \big)$$
whenever $(*)_{s,t} \approx 0$ in the sense that $(*)_{s,t} = O\big( |t - s|^{3\alpha} \big) = o(|t - s|)$, so that it
does not contribute to the limit. (Recall (4.21) and Lemma 4.2.) But then
$$\begin{aligned}
Y_s X_{s,t} + Y'_s \mathbb{X}_{s,t} &= Y_t X_{s,t} - Y_{s,t} X_{s,t} + Y'_s \mathbb{X}_{s,t} \\
&\approx Y_t X_{s,t} - Y'_s X_{s,t} \otimes X_{s,t} + Y'_s \mathbb{X}_{s,t} \\
&\approx Y_t X_{s,t} + Y'_t (\mathbb{X}_{s,t} - X_{s,t} \otimes X_{s,t}) ,
\end{aligned}$$
which settles the first equality in (5.9). The second one follows from $X_{s,t} = -X_{t,s}$
and, from Chen’s relation, $\mathbb{X}_{s,t} + \mathbb{X}_{t,s} + X_{s,t} \otimes X_{t,s} = \mathbb{X}_{s,s} = 0$. For the final
equality, note that every partition $\mathcal{P}$ of $[r, T]$ induces a time-reversed partition $\overleftarrow{\mathcal{P}}$
of $[0, T - r]$, with each $[s, t]$ replaced by $[T - t, T - s]$. By Exercise 2.6, the (time
$T$) time-reversal of $\mathbf{X}$ is again a rough path, $\overleftarrow{\mathbf{X}} \in \mathscr{C}^\alpha$, and since (easy to see)
$(Y, Y') \in \mathscr{D}^{2\alpha}_X$ if and only if $(\overleftarrow{Y}, \overleftarrow{Y}') \in \mathscr{D}^{2\alpha}_{\overleftarrow{X}}$, we obtain the final equality. □

³ With regard to (5.7), note that $d\hat{B} = -d\overleftarrow{B}$ where $\overleftarrow{B}_t = B_{T-t}$, not to be mixed up with the
backward Itô differential $\overleftarrow{dB}$.

Remark 5.13 (Backward geometric integration). For $\mathbf{X} \in \mathscr{C}^\alpha_g([0, T], \mathbb{R}^d)$, it was seen
(in Exercise 2.4) that $\mathbb{X}_{t,s} = \mathbb{X}^T_{s,t}$. It then follows from (5.9) that⁴
$$\int_0^T (Y, Y') \, d\mathbf{X} = \lim_{|\mathcal{P}| \downarrow 0} \sum_{[s,t] \in \mathcal{P}} \big( Y_t X_{s,t} - (Y'_t)^T \mathbb{X}_{s,t} \big) . \tag{5.10}$$
At this stage, one could rephrase the defining condition for $(Y, Y') \in \mathscr{D}^{2\alpha}_X$ in terms
of a “backward” controlledness condition for $(\hat{Y}, \hat{Y}') := (Y, -(Y')^T)$, together with
a “backward” rough integral given by⁵
$$\lim_{|\mathcal{P}| \downarrow 0} \sum_{[s,t] \in \mathcal{P}} \big( \hat{Y}_t X_{s,t} + \hat{Y}'_t \mathbb{X}_{s,t} \big) =: \int_0^T (\hat{Y}, \hat{Y}') \, \overleftarrow{d\mathbf{X}} . \tag{5.11}$$
However, this is no different than the “forward” integral $\int (Y, Y') \, d\mathbf{X}$. Comparing
(5.8) with (5.11), one changed left- to right-point evaluation, followed by twisting
the meaning of controlled rough path, to make sure nothing happened!
As should be clear at this point, a naïve backward rough integral of $(Y, Y') \in \mathscr{D}^{2\alpha}_X$
against $\mathbf{X} \in \mathscr{C}^\alpha([0, T], \mathbb{R}^d)$, with left- replaced by right-point evaluation in (5.8),
$$\lim_{|\mathcal{P}| \downarrow 0} \sum_{[s,t] \in \mathcal{P}} \big( Y_t X_{s,t} + Y'_t \mathbb{X}_{s,t} \big) ,$$
is, in general, not well-defined. In fact, in view of Proposition 5.12, existence of this
limit is equivalent to existence of (either)
$$\lim_{|\mathcal{P}| \downarrow 0} \sum_{[s,t] \in \mathcal{P}} Y'_t \big( X_{s,t} \otimes X_{s,t} \big) = \lim_{|\mathcal{P}| \downarrow 0} \sum_{[s,t] \in \mathcal{P}} Y'_s \big( X_{s,t} \otimes X_{s,t} \big) .$$

⁴ In coordinates: $(Y' \mathbb{X})^k = (Y')^k_{i,j} \mathbb{X}^{i,j}$ vs. $((Y')^T \mathbb{X})^k = (Y')^k_{j,i} \mathbb{X}^{i,j}$ with implicit summation
over $i, j = 1, \ldots, d$.
⁵ Not to be confused with a standard “forward” rough integral $\int (\ldots) \, d\overleftarrow{\mathbf{X}}$ seen in (5.9).

There is no reason why, for a general path $X \in \mathcal{C}^\alpha$, the above limits should exist. On
the other hand, we already considered such sums in the context of the Itô–Föllmer
formula, cf. Lemma 5.11. The appropriate condition for $X$ was seen to be “quadratic
variation (in the sense of Föllmer, along some $(\mathcal{P}^n)$)”. And under this assumption,
$$\sum_{[s,t] \in \mathcal{P}^n} Y'_s \big( X_{s,t} \otimes X_{s,t} \big) \to \int_0^T Y'_s \, d[X]^\pi_s . \tag{5.12}$$
Of course, with probability one, $d$-dimensional standard Brownian motion has
quadratic variation in the sense of Föllmer, along dyadic partitions, for instance, with
$[B, B]^\pi_t = t \, \mathrm{Id}$. These remarks are crucial for proving the following.
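Theorem 5.14. Define the random rough paths $\mathbf{B}^{\text{Strat}} = (B, \mathbb{B}^{\text{Strat}})$ and
$\mathbf{B}^{\text{back}} = (B, \mathbb{B}^{\text{back}})$ by
$$\mathbb{B}^{\text{Strat}}_{s,t} \overset{\text{def}}{=} \int_s^t B_{s,r} \otimes \circ dB_r = \mathbb{B}^{\text{Itô}}_{s,t} + \frac12 \mathrm{Id} (t - s) ,$$
$$\mathbb{B}^{\text{back}}_{s,t} \overset{\text{def}}{=} \int_s^t B_{s,r} \otimes \overleftarrow{dB}_r = \mathbb{B}^{\text{Itô}}_{s,t} + \mathrm{Id} (t - s) .$$
Then, the following statements hold.
i) Assume $(Y(\omega), Y'(\omega)) \in \mathscr{D}^{2\alpha}_{B(\omega)}$ a.s. and $Y, Y'$ are adapted as processes. Then,
with probability one, for all $t \in [0, T]$,
$$\int_0^t Y \, d\mathbf{B}^{\text{Strat}} = \int_0^t Y_s \, dB_s + \frac12 \int_0^t Y'_s \, \mathrm{Id} \, ds = \int_0^t Y_s \circ dB_s ,$$
$$\int_0^t Y \, d\mathbf{B}^{\text{back}} = \int_0^t Y_s \, dB_s + \int_0^t Y'_s \, \mathrm{Id} \, ds .$$
ii) Assume $(Y(\omega), Y'(\omega)) \in \mathscr{D}^{2\alpha}_{B(\omega)}$ a.s. and $Y_t, Y'_t$ are $\mathcal{F}^T_t$-measurable for all
$t < T$. Then with probability one, for all $r \in [0, T]$,
$$\int_r^T Y \, d\mathbf{B}^{\text{Strat}} = \int_r^T Y_t \, \overleftarrow{dB}_t - \frac12 \int_r^T Y'_t \, \mathrm{Id} \, dt = \int_r^T Y_s \circ \overleftarrow{dB}_s ,$$
$$\int_r^T Y \, d\mathbf{B}^{\text{back}} = \int_r^T Y_t \, \overleftarrow{dB}_t .$$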

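Proof. Regarding point i), it follows from the definition of the rough integral (see
also Example 4.14) that
$$\int_0^t Y \, d\mathbf{B}^{\text{back}} = \int_0^t Y \, d\mathbf{B}^{\text{Itô}} + \int_0^t Y' \, \mathrm{Id} \, ds .$$
The claim then follows from Proposition 5.1. The Stratonovich case is similar, now
using Corollary 5.2.
We now turn to point ii). Thanks to the backward representation established in
Proposition 5.12,
$$\begin{aligned}
\int_r^T Y \, d\mathbf{B}^{\text{back}} &= \lim_{n \to \infty} \sum_{[s,t] \in \mathcal{P}^n} \Big( Y_t B_{s,t} + Y'_t \big( \mathbb{B}^{\text{Itô}}_{s,t} + \mathrm{Id} (t - s) - B_{s,t} \otimes B_{s,t} \big) \Big) \\
&= \lim_{n \to \infty} \sum_{[s,t] \in \mathcal{P}^n} \Big( Y_t B_{s,t} + Y'_t \mathbb{B}^{\text{Itô}}_{s,t} - Y'_s \big( B_{s,t} \otimes B_{s,t} - \mathrm{Id} (t - s) \big) \Big) ,
\end{aligned}$$
using $Y'_{s,t} (X_{s,t} \otimes X_{s,t}) \approx 0$ and $Y'_{s,t} \, \mathrm{Id} (t - s) \approx 0$. (As before $(*)_{s,t} \approx 0$ means
$(*)_{s,t} = o(|t - s|)$.) Now we know that with probability 1, $B(\omega)$ has finite quadratic
variation $[B]^\pi_t = \mathrm{Id} \, t$, in the sense of Föllmer along some sequence $\pi = (\mathcal{P}^n)$. As a
purely deterministic consequence, cf. (5.12), on the same set of full measure,
$$\lim_{n \to \infty} \sum_{[s,t] \in \mathcal{P}^n} Y'_s \big( B_{s,t} \otimes B_{s,t} \big) = \int_r^T Y'_s \, d[B]^\pi_s = \lim_{n \to \infty} \sum_{[s,t] \in \mathcal{P}^n} Y'_s \, \mathrm{Id} (t - s) .$$
It follows at once that
$$\int_r^T Y \, d\mathbf{B}^{\text{back}}(\omega) = \lim_{n \to \infty} \sum_{[s,t] \in \mathcal{P}^n} \big( Y_t B_{s,t} + Y'_t \mathbb{B}^{\text{Itô}}_{s,t} \big) .$$
Since $\mathbb{B}^{\text{Itô}}_{s,t}$ is independent from $\mathcal{F}^T_t$ and $Y_t, Y'_t$ are $\mathcal{F}^T_t$-measurable, a (backward)
martingale argument shows that
$$\lim_{n \to \infty} \sum_{[s,t] \in \mathcal{P}^n} Y'_t \mathbb{B}^{\text{Itô}}_{s,t} = 0 .$$
As a consequence, with probability one,
$$\int_r^T Y \, d\mathbf{B}^{\text{back}}(\omega) = \lim_{n \to \infty} \sum_{[s,t] \in \mathcal{P}^n} Y_t B_{s,t} = \int_r^T Y \, \overleftarrow{dB} .$$
The (backward) Stratonovich case is then treated as simple perturbation,
$$\begin{aligned}
\int_r^T Y \, d\mathbf{B}^{\text{Strat}} &= \lim_{n \to \infty} \sum_{[s,t] \in \mathcal{P}^n} \Big( Y_t B_{s,t} + Y'_t \big( \mathbb{B}^{\text{Itô}}_{s,t} + \mathrm{Id} (t - s) - B_{s,t} \otimes B_{s,t} \big) - \frac12 Y'_t \, \mathrm{Id} (t - s) \Big) \\
&= \int_r^T Y_t \, \overleftarrow{dB}_t - \frac12 \int_r^T Y'_t \, \mathrm{Id} \, dt ,
\end{aligned}$$
thus concluding the proof. □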




5.5 Exercises

Exercise 5.1 Complete the proof of Proposition 5.1 in the case of unbounded Y ′ .

Solution. It suffices to show the convergence of (5.1) in probability; to this end, we
introduce stopping times
$$\tau_M \overset{\text{def}}{=} \max \big\{ t \in \mathcal{P} : |Y'_t| < M \big\} \in [0, T] \cup \{+\infty\}$$
and note that $\lim_{M \to \infty} \tau_M = \infty$ almost surely. The stopped process $S^{\tau_M}_\cdot$ is also a
martingale, and we see as above that, for every fixed $M > 0$,
$$\Big| \sum_{\substack{[u,v] \in \mathcal{P} \\ u \le \tau_M}} Y'_u \mathbb{B}_{u,v} \Big|^2_{L^2} = O(|\mathcal{P}|) .$$
The proof is then easily finished by sending $M$ to infinity.

Exercise 5.2 (Applications to statistics [DFM16]) Let $B$ be a $d$-dimensional Brownian
motion. Consider a $d \times d$ matrix $A$, a non-degenerate volatility matrix $\sigma$ of the
same dimension and a sufficiently nice map $h : \mathbb{R}^d \to \mathbb{R}^d$ so that the Itô stochastic
differential equation
$$dY_t = A \, h(Y_t) \, dt + \sigma \, dB_t \tag{5.13}$$
has a unique solution, starting from any $Y_0 = y_0 \in \mathbb{R}^d$. (As a matter of fact, this
SDE can be solved pathwise by considering the random ODE for $Z_t = Y_t - \sigma B_t$.)
We are interested in the maximum likelihood estimation of the drift parameter $A$ over
a fixed time horizon $[0, T]$, given some observation path $Y = Y(\omega)$. Recall that this
estimator, $\hat{A}_T(\omega)$, is based on the Radon–Nikodym density on pathspace, as given
by Girsanov’s theorem, relative to the drift free diffusion.
a) Let $d = 1$, $h(y) = y$. Show that the estimator $\hat{A}$ can be “robustified” in the
sense that $\hat{A}_T(\omega) = \tilde{A}_T(Y(\omega))$ where
$$\tilde{A}_T(Y) = \frac{Y_T^2 - y_0^2 - \sigma^2 T}{2 \int_0^T Y_t^2 \, dt} \tag{5.14}$$
is defined deterministically for any non-zero $Y \in \mathcal{C}([0, T], \mathbb{R}^d)$, and continuous
with respect to uniform topology.
b) Take again $h(x) = x$, but now in dimension $d > 1$. Show that $\hat{A}$ admits a
robust representation on rough path space, i.e. one has $\hat{A}_T(\omega) = \tilde{A}_T(\mathbf{Y}(\omega))$
where $\tilde{A}_T = \tilde{A}_T(\mathbf{Y})$ is deterministically defined and continuous with respect to
α-Hölder rough path topology for any fixed $\alpha \in (1/3, 1/2)$. Here, $\mathbf{Y}(\omega)$ is the
geometric rough path constructed from $Y$ by iterated Stratonovich integration.
Explain why there cannot be a robust representation on path space (as was the
case when $d = 1$). What about more general $h$?

Exercise 5.3 (Rough vs. anticipating Skorokhod integration) We have seen that
Itô integration coincides with rough integration against $\mathbf{B}^{\text{Itô}}(\omega)$, subject to natural
conditions (in particular: adaptedness of $(Y, Y')$ which guarantees that both are
well-defined). A well-known extension of the Itô integral to non-adapted integrands
is given by the Skorokhod integral, details of which are found in any textbook on
Malliavin calculus, see for example [Nua06].
a) Let $B$ denote one-dimensional Brownian motion on $[0, T]$. Show that the Skorokhod
integral of $B_T$ against $B$ over $[0, T]$, in symbols $\int_0^T B_T \, \delta B_t$, is given by
$B_T^2 - T$.
b) Set $Y_t(\omega) := B_T(\omega)$, with (zero) increments (trivially) controlled by $B$ with
$Y' := 0$. (In view of true roughness of Brownian motion, cf. Section 6, there is no
other choice for $Y'$.) Show that the rough integral of $Y$ against Brownian motion
over $[0, T]$ equals $B_T^2$. Conclude that Skorokhod and rough integrals (against
Itô enhanced Brownian motion) do not coincide beyond adapted integrals.
Exercise 5.4 (Rough vs. anticipating Stratonovich integration [CFV07]) In the
spirit of Nualart–Pardoux [NP88], define the Stratonovich anticipating stochastic
integral by
$$\int_0^t u(s, \omega) \circ dB_s(\omega) \overset{\text{def}}{=} \lim_{n \to \infty} \int_0^t u(s, \omega) \frac{dB^n_s(\omega)}{ds} \, ds ,$$
where $B^n$ is the dyadic piecewise linear approximation to a ($d$-dimensional) Brownian
motion $B$, whenever this limit exists in probability and uniformly on compacts.
Consider (possibly anticipating) random 1-forms, $u(s, \omega) = F_\omega(B_s)$ with $F_\omega \in \mathcal{C}^2_b$, for a.e.
$\omega$. Show that with probability one,
$$\int_0^\cdot F_\omega(B_s) \, d\mathbf{B}^{\text{Strat}}(\omega) \equiv \lim_{n \to \infty} \int_0^\cdot F_\omega(B_s) \frac{dB^n_s(\omega)}{ds} \, ds ,$$
where the limit on the right-hand side exists in the almost sure sense. Conclude that
in this case rough integration against $\mathbf{B}^{\text{Strat}}$ coincides almost surely with Stratonovich
anticipating stochastic integration, i.e.
$$\int_0^\cdot F_\omega(B_s) \, d\mathbf{B}^{\text{Strat}}(\omega) \equiv \int_0^\cdot F_\omega(B_s) \circ dB_s(\omega) .$$
Hint: It is useful to consider the pair $(\mathbf{B}^{\text{Strat}}, B^n)$, canonically viewed as (geometric)
rough paths over $\mathbb{R}^{2d}$, followed by its rough path convergence to the “doubled”
rough path $(\mathbf{B}^{\text{Strat}}, \mathbf{B}^{\text{Strat}})$ (which needs to be defined rigorously).
Remark. Nualart–Pardoux actually define their integral in terms of arbitrary
deterministic (not necessarily dyadic) piecewise linear approximations and demand
that the limit does not depend on the choice of the sequence of partitions. At the
price of giving up the martingale argument, which made dyadic approximations easy
(Proposition 3.6), everything can also be done in the general case; see Exercises 10.1
and 10.2 below.

Exercise 5.5 Fix $t > 0$ and a sequence of dissections $(\mathcal{P}_n) \subset [0, t]$ with mesh
$|\mathcal{P}_n| \to 0$. Consider the Itô–Föllmer integral given by
$$\int_0^t DF(X) \, dX \overset{\text{def}}{=} \lim_{n \to \infty} \sum_{[u,v] \in \mathcal{P}_n} DF(X_u) X_{u,v} ,$$
whenever this limit exists. Show that this limit does not exist, in general, when
$X = B^H$, a $d$-dimensional fractional Brownian motion with Hurst parameter
$H < 1/2$.
Hint: Consider the simplest possible non-trivial case, namely $d = 1$ and $F(x) = x^2$.

Solution. Assume convergence in probability, say along some $(\mathcal{P}_n)$, for the
approximating (left-point) sum,
$$\sum_{[u,v] \in \mathcal{P}_n} X_u X_{u,v} .$$
We look for a contradiction. Elementary “calculus for sums” implies that the mid-point
sum converges, i.e. where $X_u$ above is replaced by $X_u + X_{u,v}/2$. It follows
that convergence of the left-point sums is equivalent to existence of quadratic
variation, i.e. existence of
$$\lim_{n \to \infty} \sum_{[u,v] \in \mathcal{P}_n} |X_{u,v}|^2 .$$
Taking for concreteness $t = 1$ and $\mathcal{P}_n$ the dyadic partition with mesh $2^{-n}$, note that
$E|X_{u,v}|^2 = (1/2^n)^{2H}$ so that the expectation of this sum equals $2^{n(1-2H)}$,
which diverges when $H < 1/2$. In particular, quadratic variation does not exist as $L^1$
limit. But it also cannot exist as a limit in probability, for both types of convergence
are equivalent on any finite Wiener–Itô chaos.
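The divergence of the expected quadratic variation is elementary arithmetic, and the following small sketch tabulates it. It assumes, as in the solution above, $d = 1$, $t = 1$ and dyadic partitions with $2^n$ intervals, so that each interval contributes $(2^{-n})^{2H}$; the function name is made up for illustration.

```python
def expected_dyadic_qv(level, hurst):
    """Expected sum of squared fBm increments over the 2**level dyadic
    intervals of [0, 1]: 2**level * (2**-level)**(2H) = 2**(level*(1-2H))."""
    n_intervals = 2 ** level
    return n_intervals * (1.0 / n_intervals) ** (2.0 * hurst)

for hurst in (0.25, 0.5, 0.75):
    print(hurst, [expected_dyadic_qv(level, hurst) for level in (2, 6, 10)])
```

For $H = 1/2$ (Brownian motion) the expectation stays equal to 1 at every level, for $H > 1/2$ it decays to zero, and for $H < 1/2$ it blows up, which is the obstruction used in the solution.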

Exercise 5.6 In Proposition 5.8, replace the assumption that $\mathbf{X} = (X, \mathbb{S}) \in \mathscr{C}^\alpha_r([0, T], V)$
with $\alpha > 1/3$, by a suitable $p$-variation assumption with $p < 3$.
Show that $[\mathbf{X}]$ has finite $p/2$-variation and that $\int D^2 F(X) \, d[\mathbf{X}]$, as it appears in Itô’s
formula for reduced rough paths, remains a Young integral.


♯ Exercise 5.7 Prove Proposition 1.1.

Solution. Without loss of generality, we consider the problem on the interval $[0, 2\pi]$.
Assume by contradiction that there is a space $\mathcal{B} \subset \mathcal{C}([0, 2\pi])$ which carries the law $\mu$
of Brownian motion and such that $(f, g) \mapsto \int f \, dg$ is continuous on $\mathcal{B}$. By definition,
the Cameron–Martin space of $\mu$ is $\mathcal{H} = W^{1,2}_0([0, 2\pi])$, which has an orthonormal
basis $\{e_n\}_{n \in \mathbb{Z}}$ given by
$$e_0(t) = \frac{t}{\sqrt{2\pi}} , \qquad e_k(t) = \frac{\sin kt}{k \sqrt{\pi}} , \qquad e_{-k}(t) = \frac{1 - \cos kt}{k \sqrt{\pi}} ,$$
for $k > 0$. It follows from standard Gaussian measure theory [Bog98] that, given
a sequence $\xi_n$ of i.i.d. normal Gaussian random variables, the sequence
$X_N = \sum_{n=-N}^{N} e_n \xi_n$ converges almost surely in $\mathcal{B}$ to a limit $X$ such that the law of $X$ is $\mu$.
Write now $Y_N = \sum_{n=-N}^{N} \mathrm{sign}(n) e_n \xi_n$, so that one also has $Y_N \to Y$ with law of
$Y$ given by $\mu$.
This immediately leads to a contradiction: on the one hand, assuming that
$(f, g) \mapsto \int f \, dg$ is continuous on $\mathcal{B}$, this implies that $\int_0^{2\pi} X_N(t) \, dY_N(t)$ converges to some
finite (random) real number. On the other hand, an explicit calculation yields
$$\int_0^{2\pi} X_N(t) \, dY_N(t) = \frac{\xi_0^2}{2} + \sum_{n=1}^{N} \frac{\xi_n^2 + \xi_{-n}^2}{n} .$$
It is now straightforward to verify that this diverges logarithmically, thus concluding
the proof.

5.6 Comments

Rough integrals of 1-forms against the Brownian rough path (and also continuous
semimartingales enhanced to rough paths) are well known to coincide with stochastic
integrals, see [LQ02, FV10b] and the references therein, [FS17, CF19] for the case
of càdlàg semimartingales. Chouk and Tindel [TC15] discuss, from a rough path
view, Skorokhod and Stratonovich integration in the plane. Pathwise integration à la
Föllmer is revisited and extended by Ananova, Cont and Perkowski [AC17, CP19].
Sharp rough path type p-variation and integrability estimates on martingale transforms
(and then stochastic integrals against general càdlàg semimartingales) are given
by Friz and Zorin-Kranich [FZK20]; this extends and unifies the relevant parts of
[Lep76, FV08a, KZK19], see also [DOP19] for the use of such an estimate. This recently
led to the notion of rough semimartingale [FZK20], which leads to a simultaneous
development of (càdlàg) rough and stochastic integration. A parallel development
[FHL20], in a Hölder setting, is based on stochastic sewing (Section 4.6), see also
Exercise 4.15.
Chapter 6
Doob–Meyer type decomposition for rough paths

A deterministic Doob–Meyer type decomposition is established. It is closely related
to the question to what extent $Y'$ is determined by $Y$, given that $(Y, Y') \in \mathscr{D}^{2\alpha}_X$. The
crucial property is true roughness of $X$, a deterministic property that guarantees that
$X$ varies in all directions, all the time.

6.1 Motivation from stochastic analysis

Consider a continuous semimartingale $(S_t : t \ge 0)$. By definition (e.g. [RY99, Ch. IV])
this means that $S = M + A$ where $M \in \mathcal{M}$, the space of continuous local
martingales, and $A \in \mathcal{V}$, the space of continuous adapted processes of finite variation.
Then it is well known that the decomposition $S = M + A$ is unique in the following
sense.

Proposition 6.1. Assume M, M̃ ∈ M, vanishing at zero, and A, Ã ∈ V such that


M + A ≡ M̃ + Ã (i.e. the respective processes are indistinguishable). Then

M ≡ M̃ and A ≡ Ã .

Furthermore, if S = M + A ≡ 0 on some random interval [0, τ ) where τ is a


stopping time, then ⟨M ⟩ ≡ 0 on [0, τ ) and A ≡ 0 on [0, τ ).

Proof. Assume M + A ≡ M̃ + Ã. Then M − M̃ ∈ V, and null at zero. By a standard


result in martingale theory, see for example [RY99, IV, Prop 1.2], this entails that
M − M̃ ≡ 0. But then A ≡ Ã and the proof is complete.
Regarding the second statement, consider the stopped semimartingale, $S^\tau = M^\tau + A^\tau$ where $M^\tau_t = M_{t\wedge\tau}$ and similarly for $A$. By assumption $S^\tau \equiv 0$ and
hence, by the first part, $M^\tau, A^\tau \equiv 0$. This also implies that the quadratic variation of
$M^\tau$, denoted by $\langle M^\tau\rangle$, vanishes. Since $\langle M^\tau\rangle = \langle M\rangle^\tau$ (see e.g. [RY99, Ch. IV]) it
indeed follows that $\langle M\rangle \equiv 0$ on $[0, \tau)$. □


The above proposition applies in particular when $M$ is given as a multidimensional
(say $\mathbb{R}^e$-valued) stochastic integral of a suitable $L(\mathbb{R}^d, \mathbb{R}^e)$-valued integrand $Y$
(continuous and adapted will do) against $d$-dimensional Brownian motion $B$, while
$A$ is the indefinite integral of some suitable $\mathbb{R}^e$-valued process $Z$ (again, continuous
and adapted will do). We then have
Corollary 6.2. Let B be a d-dimensional Brownian motion and let Y , Z, Ỹ , Z̃ be
continuous stochastic processes adapted to the filtration generated by B. Assume, in
the sense of indistinguishability of left- and right-hand sides, that
\[
\int_0^\cdot Y\,dB + \int_0^\cdot Z\,dt \equiv \int_0^\cdot \tilde Y\,dB + \int_0^\cdot \tilde Z\,dt \quad \text{on } [0, T]. \tag{6.1}
\]

Then Y ≡ Ỹ and Z ≡ Z̃ on [0, T ].

Proof. We may set the dimension to $e = 1$ by arguing componentwise. Also, by
linearity, it suffices to consider the case $\tilde Y = 0$, $\tilde Z = 0$. By the second part of the
previous proposition
\[
\Big\langle \int_0^\cdot Y\,dB \Big\rangle \equiv \Big\langle \sum_{k=1}^{d} \int_0^\cdot Y_k\,dB^k \Big\rangle \equiv 0 \quad \text{on } [0, T].
\]
On the other hand, since $\langle B^k, B^l\rangle_t = t$ if $k = l$, and zero otherwise,
\[
\Big\langle \sum_{k=1}^{d} \int_0^\cdot Y_k\,dB^k \Big\rangle \equiv \sum_{k,l=1}^{d} \int_0^\cdot Y_k Y_l \, d\langle B^k, B^l\rangle = \sum_{k=1}^{d} \int_0^\cdot Y_k^2\,dt .
\]
It follows that $Y \equiv 0$ as claimed. By differentiation, it then follows that also $Z \equiv 0$. □



Clearly, the martingale and quadratic (co-)variation – i.e. probabilistic – properties


of B play a key role in the proof of Corollary 6.2. It is worth noting that, with β
a scalar Brownian motion and B 1 = B 2 = β the conclusion fails; try non-zero
Y 1 ≡ −Y 2 , Z ≡ 0. It is crucial that d-dimensional standard Brownian motion
“moves in all directions”, captured through the non-degeneracy of the quadratic
covariation matrix ⟨B k , B l ⟩t .
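
The degenerate example above can be checked on a discretisation. The following Python sketch (a minimal illustration; the grid size, the seed and the particular integrand are arbitrary choices, not taken from the text) builds left-point Itô sums for $B^1 = B^2 = \beta$ and non-zero $Y^1 \equiv -Y^2$, and confirms that the stochastic integral vanishes although $Y \not\equiv 0$.

```python
import numpy as np

rng = np.random.default_rng(0)
n_steps, T = 1000, 1.0
dt = T / n_steps

# Increments of a scalar Brownian motion beta; B^1 = B^2 = beta is degenerate.
d_beta = rng.normal(0.0, np.sqrt(dt), size=n_steps)
beta = np.concatenate(([0.0], np.cumsum(d_beta)))

# Non-zero adapted integrands with Y^1 = -Y^2, evaluated at left endpoints.
Y1 = 1.0 + beta[:-1] ** 2
Y2 = -Y1

# Discrete Ito sums of Y^1 dB^1 + Y^2 dB^2 with B^1 = B^2 = beta.
ito_sum = np.cumsum(Y1 * d_beta + Y2 * d_beta)
max_abs = float(np.max(np.abs(ito_sum)))
print(max_abs)
```

The two contributions cancel term by term, so uniqueness of the integrand cannot hold without non-degeneracy of the covariation matrix.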
Surprisingly perhaps, one can formulate a purely deterministic decomposition
of the form (6.1): the stochastic integrals will be replaced by rough integrals, the
relevant probabilistic properties of B by certain conditions (“roughness from below1 ,
in all directions”) on the sample path.

1
As opposed to Hölder regularity which quantifies “roughness from above”, in the sense of an
upper estimate of the increment.

6.2 Uniqueness of the Gubinelli derivative and Doob–Meyer

Here and in the sequel of this section we fix $\alpha \in (\frac13, \frac12]$, a rough path $\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}^\alpha([0, T], V)$ and a controlled rough path $(Y, Y') \in \mathscr{D}_X^{2\alpha}$. We first address the
question to what extent $X$ and $Y$ determine the Gubinelli derivative $Y'$. As it turns
out, $Y'$ is uniquely determined, provided that $X$ is sufficiently “rough from below, in
all directions”. A Doob–Meyer type decomposition will then follow as a corollary.
Let us first consider the case when $X$ is scalar, i.e. with values in $V = \mathbb{R}$. Assume
that for some given $s \in [0, T)$, there exists a sequence of times $t_n \downarrow s$ such that
$|X_{s,t_n}| / |t_n - s|^{2\alpha} \to \infty$, i.e.
\[
\limsup_{t \downarrow s} \frac{|X_{s,t}|}{|t - s|^{2\alpha}} = +\infty .
\]
Then $Y'_s$ is uniquely determined from $Y$ by (4.18) and the condition that $\|R^Y\|_{2\alpha} < \infty$. In fact, one necessarily has $X_{s,t_n} \in \mathbb{R} \setminus \{0\}$ for $n$ large enough and so, from the
very definition of $R^Y$,
\[
Y'_s = \frac{Y_{s,t_n}}{X_{s,t_n}} - \frac{R^Y_{s,t_n}}{|t_n - s|^{2\alpha}} \, \frac{|t_n - s|^{2\alpha}}{X_{s,t_n}} ,
\]
which implies that $\lim_{n\to\infty} Y_{s,t_n} / X_{s,t_n}$ exists and equals $Y'_s$. The multidimensional
case is not that different, and the above consideration suggests the following definition.
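
The scalar argument can be made concrete with a toy deterministic example. The Python sketch below is an illustration under assumptions of our own choosing: $s = 0$, $\alpha = 0.4$, $X_t = t^{0.4}$ (so that $|X_{0,t}|/t^{2\alpha} = t^{-0.4} \to \infty$) and $Y_t = 3X_t + t$, whose remainder $R^Y_{0,t} = t$ is $O(t^{2\alpha})$. The quotient of increments then recovers $Y'_0 = 3$.

```python
def increment_ratio(t, y_prime=3.0):
    """Y_{0,t} / X_{0,t} for X_t = t**0.4 and Y_t = y_prime * X_t + t."""
    x = t ** 0.4
    y = y_prime * x + t  # remainder R_{0,t} = t = O(t**0.8) = O(t**(2*alpha))
    return y / x

for t in (1e-2, 1e-4, 1e-6, 1e-8):
    # The correction term is t**0.6, which vanishes as t decreases to 0.
    print(t, increment_ratio(t))
```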

Definition 6.3. For fixed $s \in [0, T)$ we call $X \in C^\alpha([0, T], V)$ “rough at time $s$” if
\[
\forall v^* \in V^* \setminus \{0\} : \quad \limsup_{t \downarrow s} \frac{|v^*(X_{s,t})|}{|t - s|^{2\alpha}} = \infty .
\]
If $X$ is rough on some dense set of $[0, T]$, we call it truly rough.

This definition is vindicated by the following result.

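Proposition 6.4 (Uniqueness of $Y'$). Let $\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}^\alpha$, $(Y, Y') \in \mathscr{D}_X^{2\alpha}$, so
that the rough integral $\int Y\,d\mathbf{X}$ exists. Assume $X$ is rough at some time $s \in [0, T)$.
Then
\[
Y_{s,t} = O\big(|t - s|^{2\alpha}\big) \text{ as } t \downarrow s \quad \Longrightarrow \quad Y'_s = 0 . \tag{6.2}
\]
As a consequence, if $X$ is truly rough and $(Y, \tilde Y') \in \mathscr{D}_X^{2\alpha}$ is another controlled rough
path (with respect to $X$) then $Y' \equiv \tilde Y'$.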

Proof. From the definition of $(Y, Y') \in \mathscr{D}_X^{2\alpha}$, we have
\[
Y_{s,t} = Y'_s X_{s,t} + O\big(|t - s|^{2\alpha}\big) .
\]
Hence, for $t \in (s, s + \varepsilon)$,
\[
\frac{Y'_s X_{s,t}}{|t - s|^{2\alpha}} = \frac{Y_{s,t}}{|t - s|^{2\alpha}} + O(1) = O(1) ,
\]
where the second equality follows from the assumption made in (6.2). Now, $Y'_s X_{s,t}$
takes values in $\bar W$, the same Banach space in which $Y$ takes its values. For every
$w^* \in \bar W^*$, the map $V \ni v \mapsto w^*(Y'_s v)$ defines an element $v^* \in V^*$ so that
\[
\frac{|v^*(X_{s,t})|}{|t - s|^{2\alpha}} = \frac{|w^*(Y'_s X_{s,t})|}{|t - s|^{2\alpha}} = O(1) \quad \text{as } t \downarrow s .
\]
Unless $v^* = 0$, the assumption that “$X$ is rough at time $s$” implies that, along some
sequence $t_n \downarrow s$, we have the divergent behaviour $|v^*(X_{s,t_n})| / |t_n - s|^{2\alpha} \to \infty$,
which contradicts that the same expression is $O(1)$ as $t_n \downarrow s$. We thus conclude that
$v^* = 0$. In other words,
\[
\forall w^* \in \bar W^*, \ v \in V : \quad w^*(Y'_s v) = 0 ,
\]
and this clearly implies $Y'_s = 0$. This finishes the proof of the implication stated in
(6.2). The consequence follows by applying (6.2) to the controlled path $(0, Y' - \tilde Y')$ at each time $s$ at which $X$ is rough, together with the continuity of $Y' - \tilde Y'$. □

The following result should be compared with Corollary 6.2.

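Theorem 6.5 (Doob–Meyer for rough paths). Assume that $X$ is rough at some
time $s \in [0, T)$ and let $(Y, Y') \in \mathscr{D}_X^{2\alpha}$. Then
\[
\int_s^t Y\,d\mathbf{X} = O\big(|t - s|^{2\alpha}\big) \text{ as } t \downarrow s \quad \Longrightarrow \quad Y_s = 0 . \tag{6.3}
\]
As a consequence, if $X$ is truly rough, $(\tilde Y, \tilde Y') \in \mathscr{D}_X^{2\alpha}$ and $Z, \tilde Z \in C([0, T], W)$,
then the identity
\[
\int_0^\cdot Y\,d\mathbf{X} + \int_0^\cdot Z\,dt \equiv \int_0^\cdot \tilde Y\,d\mathbf{X} + \int_0^\cdot \tilde Z\,dt \tag{6.4}
\]
on $[0, T]$ implies that $(Y, Y') \equiv (\tilde Y, \tilde Y')$ and $Z \equiv \tilde Z$ on $[0, T]$.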

Proof. Recall from Theorem 4.10 that $(I, I') := \big(\int_0^\cdot Y\,d\mathbf{X}, Y\big)$ is controlled by $X$,
i.e. $(I, I') \in \mathscr{D}_X^{2\alpha}$. The statement (6.3) is then an immediate consequence of (6.2).
The claim is now straightforward. Pick any $s \in [0, T)$ such that $X$ is rough at
time $s$. From (6.4), and for all $0 \le s \le t \le T$,
\[
\int_s^t \big(Y - \tilde Y\big)\,d\mathbf{X} = -\int_s^t \big(Z_r - \tilde Z_r\big)\,dr = O(|t - s|) = O\big(|t - s|^{2\alpha}\big) ,
\]
where the last equality is just the statement that $|t - s| = O\big(|t - s|^{2\alpha}\big)$ as $t \downarrow s$,
thanks to $\alpha \le 1/2$. We then conclude using (6.3) that $Y_s = \tilde Y_s$. If we now assume
true roughness of $X$, this conclusion holds for a dense set of times $s$ and hence, by
continuity of $Y$ and $\tilde Y$, we have $Y \equiv \tilde Y$. But then, by Proposition 6.4, we also have
$Y' \equiv \tilde Y'$ and so
\[
\int_0^\cdot Y\,d\mathbf{X} \equiv \int_0^\cdot \tilde Y\,d\mathbf{X} .
\]
(Note that the above notation “hides” the dependence on $Y'$ resp. $\tilde Y'$.) But then
(6.4) implies
\[
\int_0^t Z_r\,dr \equiv \int_0^t \tilde Z_r\,dr \quad \text{for } t \in [0, T],
\]
and we conclude by differentiation with respect to $t$. □

6.3 Brownian motion is truly rough

Recall that (say, $d$-dimensional standard) Brownian motion satisfies the so-called
(Khintchine) law of the iterated logarithm, that is
\[
\forall t \ge 0 : \quad \mathbf{P}\bigg( \limsup_{h \downarrow 0} \frac{|B_{t,t+h}|}{h^{1/2} (\ln\ln 1/h)^{1/2}} = \sqrt{2} \bigg) = 1 . \tag{6.5}
\]
See [McK69, p.18] or [RY99, Ch. II] for instance, typically proved with exponential
martingales. Remark that it is enough to consider $t = 0$ since $(B_{t,t+h} : h \ge 0)$ is
also a Brownian motion.
Theorem 6.6. With probability one, Brownian motion on V = Rd is truly rough,
relative to any Hölder exponent α ∈ [1/4, 1/2).
Proof. It is enough to show that, for fixed time $s$, and any $\theta \in [1/2, 1)$,
\[
\mathbf{P}\bigg( \forall v^* \in V^*, |v^*| = 1 : \ \limsup_{t \downarrow s} \frac{|v^*(B_{s,t})|}{|t - s|^{\theta}} = +\infty \bigg) = 1 .
\]
(Then take $s \in \mathbb{Q}$ and conclude that the above event holds true, simultaneously for
all such $s$, with probability one.)
To this end, set $h^{1/2} (\ln\ln 1/h)^{1/2} \equiv \psi(h)$. We need the following two consequences of (6.5). There exists $c > 0$ (here $c = \sqrt{2}$) such that for every fixed unit dual
vector $v^* \in V^* = \mathbb{R}^d$ and every fixed $s \in [0, T)$
\[
\mathbf{P}\Big( \limsup_{t \downarrow s} |v^*(B_{s,t})| / \psi(t - s) \ge c \Big) = 1 ,
\qquad
\mathbf{P}\Big( \limsup_{t \downarrow s} \frac{|B_{s,t}|}{\psi(t - s)} < \infty \Big) = 1 .
\]
Take $K \subset V^*$ to be any dense, countable set of dual unit vectors. Since $K$ is
countable, the set on which the first condition holds simultaneously for all $v^* \in K$
has full measure,
\[
\mathbf{P}\Big( \forall v^* \in K : \ \limsup_{t \downarrow s} |v^*(B_{s,t})| / \psi(t - s) \ge c \Big) = 1 .
\]
On the other hand, every unit dual vector $v^* \in V^*$ is the limit of some $(v_n^*) \subset K$.
Then
\[
\frac{|v_n^*(B_{s,t})|}{\psi(t - s)} \le \frac{|v^*(B_{s,t})|}{\psi(t - s)} + |v_n^* - v^*|_{V^*} \frac{|B_{s,t}|_V}{\psi(t - s)}
\]
so that, using $\limsup (|a| + |b|) \le \limsup |a| + \limsup |b|$, and restricting to the above set
of full measure,
\[
c \le \limsup_{t \downarrow s} \frac{|v_n^*(B_{s,t})|}{\psi(t - s)} \le \limsup_{t \downarrow s} \frac{|v^*(B_{s,t})|}{\psi(t - s)} + |v_n^* - v^*|_{V^*} \limsup_{t \downarrow s} \frac{|B_{s,t}|_V}{\psi(t - s)} .
\]
Sending $n \to \infty$ gives, with probability one,
\[
0 < c \le \limsup_{t \downarrow s} \frac{|v^*(B_{s,t})|}{\psi(t - s)} .
\]
Hence, for a.e. sample $B = B(\omega)$ we can pick a sequence $(t_n)$ converging to $s$ such
that $|v^*(B_{s,t_n})| / \psi(t_n - s) \ge c - 1/n$. On the other hand, for any $\theta \ge 1/2$ we have
\[
\frac{|v^*(B_{s,t_n}(\omega))|}{|t_n - s|^{\theta}} = \frac{|v^*(B_{s,t_n}(\omega))|}{\psi(t_n - s)} \, \frac{\psi(t_n - s)}{|t_n - s|^{\theta}}
\ge (c - 1/n)\, |t_n - s|^{\frac12 - \theta} L(t_n - s) \to \infty ,
\]
with $L(\tau) = (\ln\ln 1/\tau)^{1/2}$, where in the borderline case $\theta = 1/2$ (which corresponds to $\alpha = 1/4$) this divergence is only logarithmic. □
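
The final divergence is purely deterministic and easy to tabulate. The following Python sketch (function names are hypothetical) evaluates $\psi(h)/h^{\theta} = h^{1/2-\theta}(\ln\ln 1/h)^{1/2}$ for decreasing $h$, contrasting the polynomial blow-up for $\theta > 1/2$ with the very slow, iterated-logarithmic growth in the borderline case $\theta = 1/2$.

```python
import math

def psi(h):
    """psi(h) = h^{1/2} (ln ln 1/h)^{1/2}; requires 0 < h < 1/e."""
    return math.sqrt(h) * math.sqrt(math.log(math.log(1.0 / h)))

def lil_ratio(h, theta):
    """psi(h) / h^theta, the deterministic factor in the last display of the proof."""
    return psi(h) / h ** theta

for h in (1e-2, 1e-4, 1e-8, 1e-16):
    print(h, lil_ratio(h, 0.5), lil_ratio(h, 0.75))
```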

6.4 A deterministic Norris’ lemma

We now turn our attention to a quantitative version of true roughness. In essence, we


now replace 2α in Definition 6.3 by θ and quantify the divergence, uniformly over
all directions.

Definition 6.7. A path $X : [0, T] \to V$ with values in a Banach space $V$ is said to
be $\theta$-Hölder rough for $\theta \in (0, 1)$, on scale (smaller than) $\varepsilon_0 > 0$, if there exists
a constant $L := L_\theta(X) := L(\theta, \varepsilon_0, T; X) > 0$ such that for every $v^* \in V^*$, $s \in [0, T]$ and $\varepsilon \in (0, \varepsilon_0]$ there exists $t \in [0, T]$ such that
\[
|t - s| < \varepsilon , \quad \text{and} \quad |v^*(X_{s,t})| \ge L_\theta(X)\, \varepsilon^{\theta} |v^*| . \tag{6.6}
\]
The largest such value of $L$ is called the modulus of $\theta$-Hölder roughness of $X$.


Observe that, indeed, any element in $C^\alpha$ which is $\theta$-Hölder rough for $\theta < 2\alpha$
is truly rough. (We shall see in the next section that multidimensional Brownian
motion is $\theta$-Hölder rough for any $\theta > 1/2$.) The following result can be viewed as
a quantitative version of Proposition 6.4.
Proposition 6.8. Let $(X, \mathbb{X}) \in \mathscr{C}^\alpha([0, T], V)$ be such that $X$ is $\theta$-Hölder rough for
some $\theta \in (0, 1]$. Then, for every controlled rough path $(Y, Y') \in \mathscr{D}_X^{2\alpha}([0, T], W)$
one has,
\[
\forall \varepsilon \in (0, \varepsilon_0] : \quad L \varepsilon^{\theta} \|Y'\|_\infty \le \operatorname{osc}(Y, \varepsilon) + \|R^Y\|_{2\alpha}\, \varepsilon^{2\alpha} . \tag{6.7}
\]
As an immediate consequence, if $\theta < 2\alpha$, $Y'$ is uniquely determined from $Y$, i.e. if
$(Y, Y')$ and $(\tilde Y, \tilde Y')$ both belong to $\mathscr{D}_X^{2\alpha}$ and $Y \equiv \tilde Y$, then $Y' \equiv \tilde Y'$.
Proof. Let us start with the consequence: apply estimate (6.7) with $Y$ replaced by
$Y - \tilde Y = 0$ and similarly $Y'$ replaced by $Y' - \tilde Y'$. Thanks to $L > 0$ it follows that
\[
\|Y' - \tilde Y'\|_\infty = O\big(\varepsilon^{2\alpha - \theta}\big)
\]
and we send $\varepsilon \to 0$ to conclude $Y' = \tilde Y'$. The remainder of the proof is devoted to
establishing (6.7). Fix $s \in [0, T]$ and $\varepsilon \in (0, \varepsilon_0]$. From the definition of the remainder
$R^Y$ in (4.18), it then follows that
\[
\sup_{|t-s| \le \varepsilon} |Y'_s X_{s,t}| \le \sup_{|t-s| \le \varepsilon} \big( |Y_{s,t}| + |R^Y_{s,t}| \big) \le \operatorname{osc}(Y, \varepsilon) + \|R^Y\|_{2\alpha}\, \varepsilon^{2\alpha} . \tag{6.8}
\]
Let now $w^* \in W^*$ be such that $|w^*| = 1$. Since $X$ is $\theta$-Hölder rough by assumption,
there exists $u = u(w^*) \in [0, T]$ with $|u - s| < \varepsilon$ such that
\[
|w^*(Y'_s X_{s,u})| = \big| \big((Y'_s)^* w^*\big)(X_{s,u}) \big| \ge L \varepsilon^{\theta} |(Y'_s)^* w^*| . \tag{6.9}
\]
(Note that one has indeed $(Y'_s)^* : W^* \to V^*$.) Combining both (6.8) and (6.9), we
thus obtain that
\[
L \varepsilon^{\theta} |(Y'_s)^* w^*| \le \operatorname{osc}(Y, \varepsilon) + \|R^Y\|_{2\alpha}\, \varepsilon^{2\alpha} .
\]
Taking the supremum over all such $w^* \in W^*$ of unit length and using the fact that
the norm of a linear operator is equal to the norm of its dual, we obtain
\[
L \varepsilon^{\theta} |Y'_s| \le \operatorname{osc}(Y, \varepsilon) + \|R^Y\|_{2\alpha}\, \varepsilon^{2\alpha} .
\]
Since $s$ was also arbitrary, the stated bound follows at once. □



Remark 6.9. Even though the argument presented above is independent of the di-
mension of V , we are not aware of any example where L = L(θ, X) > 0 and
dim V = ∞. The reason why this definition works well only in the finite-dimensional
case will be apparent in the proof of Proposition 6.11 below.
This leads us to the following quantitative version of our previous Doob–Meyer
result for rough paths, Theorem 6.5. As usual, we assume that α ∈ (1/3, 1/2).

Theorem 6.10 (Norris lemma for rough paths). Let $\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}^\alpha([0, T], V)$
be such that $X$ is $\theta$-Hölder rough with $\theta < 2\alpha$. Let $(Y, Y') \in \mathscr{D}_X^{2\alpha}([0, T], L(V, W))$
and $Z \in C^\alpha([0, T], W)$ and set
\[
I_t = \int_0^t Y_s\,d\mathbf{X}_s + \int_0^t Z_s\,ds .
\]
Then there exist constants $r > 0$ and $q > 0$ such that, setting
\[
R := 1 + L_\theta(X)^{-1} + |||\mathbf{X}|||_\alpha + \|Y, Y'\|_{X;2\alpha} + |Y_0| + |Y'_0| + \|Z\|_\alpha + |Z_0| ,
\]
one has the bound
\[
\|Y\|_\infty + \|Z\|_\infty \le M R^{q} \|I\|_\infty^{r} ,
\]
for a constant $M$ depending only on $\alpha$, $\theta$, and the final time $T$.
Proof. We leave the details of the proof as an exercise, see [HP13], and only sketch
its broad lines.
First, we conclude from Proposition 6.8 that $I$ small in the supremum norm
implies that $\|Y\|_\infty$ is also small. Then, we use interpolation to conclude from this
that $(Y, Y')$ is small when viewed as an element of $\mathscr{D}_X^{2\bar\alpha}$ for $\bar\alpha < \alpha$, thus implying
that $\int Y\,d\mathbf{X}$ is necessarily small. This implies that $\int Z\,ds$ is itself small from which,
using again interpolation, we finally conclude that $Z$ itself must be small in the
supremum norm. □

6.5 Brownian motion is Hölder rough

We now turn to Hölder-roughness of Brownian motion. Our focus will be on the unit
interval T = 1, and we consider scales up to ε0 = 1/2 for the sake of argument.
Proposition 6.11. Let $B$ be a standard Brownian motion on $[0, 1]$ taking values in
$\mathbb{R}^d$. Then, for every $\theta > \frac12$, the sample paths of $B$ are almost surely $\theta$-Hölder rough.
Moreover, with scale $\varepsilon_0 = 1/2$ and writing $L_\theta(B)$ for the modulus of $\theta$-Hölder
roughness, there exist constants $M$ and $c$ such that
\[
\mathbf{P}(L_\theta(B) < \varepsilon) \le M \exp\big( -c\, \varepsilon^{-2} \big) ,
\]
for all $\varepsilon \in (0, 1)$.


The proof of Proposition 6.11 relies on the following variation of the standard
small ball estimate for Brownian motion:
Lemma 6.12. Let $B$ be a $d$-dimensional standard Brownian motion. Then, there
exist constants $c > 0$ and $C > 0$ such that
\[
\mathbf{P}\Big( \inf_{|\varphi| = 1} \sup_{t \in [0, \delta]} |\langle \varphi, B(t) \rangle| \le \varepsilon \Big) \le C \exp\big( -c\, \delta\, \varepsilon^{-2} \big) . \tag{6.10}
\]

Proof. The standard small ball estimate for Brownian motion (see for example
[LS01]) yields the bound
\[
\sup_{|\varphi| = 1} \mathbf{P}\Big( \sup_{t \in [0, \delta]} |\langle \varphi, B(t) \rangle| \le \varepsilon \Big) \le C \exp\big( -c\, \delta\, \varepsilon^{-2} \big) . \tag{6.11}
\]
The required estimate then follows from a standard chaining argument, as in [Nor86,
p. 127]: cover the sphere $|\varphi| = 1$ with $\varepsilon^{-2(d-1)}$ balls of radius $\varepsilon^2$, say, centred
at $\varphi_i$. We then use the fact that, since the supremum of $B$ has Gaussian tails, if
$\sup_{t \in [0, \delta]} |\langle \varphi_i, B(t) \rangle| \le \varepsilon$, then the same bound, but with $\varepsilon$ replaced by $2\varepsilon$ holds
with probability exponentially close to 1 uniformly over all $\varphi$ in the ball of radius $\varepsilon^2$
centred at $\varphi_i$. Since there are only polynomially many such balls required to cover
the whole sphere, (6.10) follows. Note that this chaining argument uses in a crucial
way that the number of balls of radius $\varepsilon^2$ required to cover the sphere $\|\varphi\| = 1$ grows
only polynomially with $\varepsilon^{-1}$.
It is clear that bounds of the type (6.10) break down in infinite dimensions: if we
consider a cylindrical Wiener process, then (6.11) still holds, but the unit sphere of a
Hilbert space cannot be covered by a finite number of small balls anymore. If on the
other hand, we consider a process with a non-trivial covariance, then we can get the
chaining argument to work, but the bound (6.11) would break down due to the fact
that ⟨φ, B(t)⟩ can then have arbitrarily small variance. ⊔ ⊓

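Proof (Proposition 6.11). With $T = 1$, $\varepsilon_0 = 1/2$, a different way of formulating
Definition 6.7 is given by
\[
L_\theta(X) = \inf \sup_{t : |t - s| \le \varepsilon} \frac{1}{\varepsilon^{\theta}} |\langle \varphi, X_{s,t} \rangle| ,
\]
where the inf is taken over $|\varphi| = 1$, $s \in [0, 1]$ and $\varepsilon \in (0, 1/2]$. We then define the
“discrete analog” $D_\theta(X)$ of $L_\theta(X)$ to be given by
\[
D_\theta(X) = \inf \sup_{s,t \in I_{k,n}} 2^{n\theta} |\langle \varphi, X_{s,t} \rangle| ,
\]
where $I_{k,n} = \big[ \frac{k-1}{2^n}, \frac{k}{2^n} \big]$ and the inf is taken over $|\varphi| = 1$, $n \ge 1$ and $k \le 2^n$. We
first claim that
\[
L_\theta(X) \ge \frac12 \, \frac{1}{2^{\theta}} D_\theta(X) . \tag{6.12}
\]
To this end, fix a unit vector $\varphi \in V^*$, $s \in [0, 1]$ and $\varepsilon \in (0, 1/2]$. Pick $n \ge 1$ such that
$\varepsilon/2 < 2^{-n} \le \varepsilon$. It follows that there exists some $k$ such that $I_{k,n}$ is included in the
set $\{t : |t - s| \le \varepsilon\}$. Then, by definition of $D_\theta$, for any unit vector $\varphi$ there exist two
points $t_1, t_2 \in I_{k,n}$ such that
\[
|\langle \varphi, X_{t_1,t_2} \rangle| \ge 2^{-n\theta} D_\theta(X) .
\]
Therefore, by the triangle inequality, we conclude that the magnitude of the difference
between $\langle \varphi, X_s \rangle$ and one of the two terms $\langle \varphi, X_{t_i} \rangle$, $i = 1, 2$ (say $t_1$) is at least
\[
|\langle \varphi, X_{s,t_1} \rangle| \ge \frac12 \, 2^{-n\theta} D_\theta(X)
\]
and therefore
\[
\frac{|\langle \varphi, X_{s,t_1} \rangle|}{\varepsilon^{\theta}} \ge \frac12 \, \frac{2^{-n\theta}}{\varepsilon^{\theta}} D_\theta(X) \ge \frac12 \, \frac{1}{2^{\theta}} D_\theta(X) .
\]
Since $s$, $\varepsilon$ and $\varphi$ were chosen arbitrarily, the claim (6.12) follows.
Applying this to Brownian sample paths, $X = B(\omega)$, it follows that it is sufficient
to obtain the requested bound on $\mathbf{P}(D_\theta(B) < \varepsilon)$. We have the straightforward bound
\[
\mathbf{P}(D_\theta(B) < \varepsilon) \le \mathbf{P}\Big( \inf_{\|\varphi\| = 1} \inf_{n \ge 1} \inf_{k \le 2^n} \sup_{s,t \in I_{k,n}} \frac{|\langle \varphi, B_{s,t} \rangle|}{2^{-n\theta}} < \varepsilon \Big)
\le \sum_{n=1}^{\infty} \sum_{k=1}^{2^n} \mathbf{P}\Big( \inf_{\|\varphi\| = 1} \sup_{s,t \in I_{k,n}} |\langle \varphi, B_{s,t} \rangle| < 2^{-n\theta} \varepsilon \Big) .
\]
Since trivially $\sup_{s,t \in I_{k,n}} |\langle \varphi, B_{s,t} \rangle| \ge \sup_{t \in I_{k,n}} |\langle \varphi, B_{r,t} \rangle|$, where $r$ is the left boundary
of the interval $I_{k,n}$, we can bound this by applying Lemma 6.12 (with $\delta = 2^{-n}$ and $\varepsilon$ replaced by $2^{-n\theta}\varepsilon$). Noting that the
bound obtained in this way is independent of $k$, we conclude that
\[
\mathbf{P}(D_\theta(B) < \varepsilon) \le M \sum_{n=1}^{\infty} 2^n \exp\big( -c\, 2^{(2\theta - 1)n} \varepsilon^{-2} \big) \le \tilde M \sum_{n=1}^{\infty} \exp\big( -\tilde c\, n\, \varepsilon^{-2} \big) .
\]
Here, we used the fact that as soon as $\theta > \frac12$, we can find constants $K$ and $\tilde c$ such that
\[
n \log 2 - c\, 2^{(2\theta - 1)n} \varepsilon^{-2} \le K - \tilde c\, n\, \varepsilon^{-2} ,
\]
uniformly over all $\varepsilon < 1$ and all $n \ge 1$. (Consider separately the cases $\varepsilon^2 \in (0, 1/n)$
and $\varepsilon^2 \in [1/n, 1)$.) We deduce from this the bound
\[
\mathbf{P}(D_\theta(B) < \varepsilon) \le M \Big( e^{-\tilde c \varepsilon^{-2}} + \int_1^{\infty} \exp\big( -\tilde c\, \varepsilon^{-2} x \big)\,dx \Big) ,
\]
which immediately implies the result. □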
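
The discrete quantity $D_\theta$ lends itself to direct computation on sampled data. The Python sketch below is an illustration under simplifying assumptions of our own: the path is scalar (so $|\varphi| = 1$ means $\varphi = \pm 1$ and the inner supremum is the oscillation on $I_{k,n}$), it is known only at dyadic sample points, and the infimum is truncated at a finest level `n_max`. The function name `discrete_roughness` is hypothetical, and the result only approximates $D_\theta$ since oscillations are taken over grid points.

```python
import numpy as np

def discrete_roughness(path, theta, n_max):
    """Level-truncated version of D_theta for a scalar path on [0, 1].

    `path` holds samples at the 2**m + 1 dyadic points j / 2**m, with m >= n_max.
    Intervals are indexed by k = 0, ..., 2**n - 1 here (k = 1, ..., 2**n in the text).
    """
    path = np.asarray(path, dtype=float)
    m = (len(path) - 1).bit_length() - 1
    if 2 ** m != len(path) - 1 or m < n_max:
        raise ValueError("need 2**m + 1 samples with m >= n_max")
    best = np.inf
    for n in range(1, n_max + 1):
        step = 2 ** (m - n)  # number of grid steps per dyadic interval of level n
        for k in range(2 ** n):
            seg = path[k * step:(k + 1) * step + 1]
            osc = seg.max() - seg.min()  # sup over s, t in I_{k,n} of |X_{s,t}|
            best = min(best, 2.0 ** (n * theta) * osc)
    return float(best)

# A smooth path is not Hoelder rough: the truncated value decays as n_max grows.
line = np.linspace(0.0, 1.0, 2 ** 10 + 1)
print([discrete_roughness(line, 0.75, n) for n in (2, 4, 8)])
```

For the straight line the level-$n$ contribution is $2^{-n(1-\theta)}$, so the truncated infimum tends to zero, in contrast with what Proposition 6.11 asserts for Brownian sample paths.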


Note that the proof given above is quite robust. In particular, we did not really
make use of the fact that B has independent increments. In fact, it transpires that all
that is required in order to prove the Hölder roughness of sample paths of a Gaussian
process $W$ with stationary increments is a small ball estimate of the type
\[
\mathbf{P}\Big( \sup_{t \in [0, \delta]} |W_t - W_0| \le \varepsilon \Big) \le C \exp\big( -c\, \delta^{\alpha} \varepsilon^{-\beta} \big) ,
\]

for some exponents α, β > 0. These kinds of estimates are available for example for
fractional Brownian motion with arbitrary Hurst parameter H ∈ (0, 1).

6.6 Exercises

Exercise 6.1 Show that the Q-Wiener process (as introduced in Exercise 3.4) is truly
rough.
Exercise 6.2 Prove and state precisely: multidimensional fractional Brownian mo-
tion B H , H ∈ (1/3, 1/2], is truly rough.
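Exercise 6.3 In (6.7), estimate $\operatorname{osc}(Y, \varepsilon)$ by $2\|Y\|_\infty$ (or alternatively by $\|Y\|_\alpha \varepsilon^\alpha$)
and deduce the estimate
\[
\|Y'\|_\infty \le \frac{1}{L} \inf_{\varepsilon \in (0, \varepsilon_0]} \Big( 2 \varepsilon^{-\theta} \|Y\|_\infty + \|R^Y\|_{2\alpha}\, \varepsilon^{2\alpha - \theta} \Big) .
\]
Carry out the elementary optimisation, e.g. when $\varepsilon_0 = T/2$, to see that
\[
\|Y'\|_\infty \le \frac{4 \|Y\|_\infty}{L(\theta, X)} \Big( \|R^Y\|_{2\alpha}^{\frac{\theta}{2\alpha}} \, \|Y\|_\infty^{-\frac{\theta}{2\alpha}} \vee T^{-\theta} \Big) .
\]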

∗ Exercise 6.4 (Norris’ lemma for rough paths; [HP13]) Give a complete proof of
Theorem 6.10.

6.7 Comments

The notion of θ-roughness was first introduced in Hairer–Pillai [HP13], which also
contains Proposition 6.8, although some of the ideas underlying the concepts pre-
sented here were already apparent in Baudoin–Hairer [BH07] and Hairer–Mattingly
[HM11]. A version of this “Norris lemma” in the context of SDEs driven by fractional
Brownian motion was proposed independently by Hu–Tindel [HT13]. The simplified
condition of “true” roughness (which may be verified in infinite dimensions), targeted
directly at a Doob–Meyer decomposition, is taken from Friz–Shekhar [FS13]; the
quantitative “Norris lemma” is taken from Hairer–Pillai [HP13]. These results also
hold in “rougher” situations, i.e. when α ≤ 1/3, see [FS13, CHLT15].
Chapter 7
Operations on controlled rough paths

At first sight, the notation $\int Y\,d\mathbf{X}$ introduced in Chapter 4 is ambiguous since the
resulting controlled rough path depends in general on the choices of both the second-
order process X and the derivative process Y ′ . Fortunately, this “lack of completeness”
in our notations is mitigated by the fact that in virtually all situations of interest, Y
is constructed by using a small number of elementary operations described in this
chapter. For all of these operations, it turns out to be intuitively rather clear how the
corresponding derivative process is constructed.

7.1 Relation between rough paths and controlled rough paths

Consider $\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}^\alpha([0, T], V)$. It is easy to see that $X$ itself can be interpreted
as a path “controlled by $X$”. Indeed, we can identify $X$ with the element
$(X, \mathrm{Id}) \in \mathscr{D}_X^{2\alpha}$, where $\mathrm{Id}$ is the identity matrix (more precisely: the constant path
with value $\mathrm{Id}$ for all times).¹ Conversely, an element $(Y, Y') \in \mathscr{D}_X^{2\alpha}([0, T], W)$ can
itself be interpreted as a rough path again, say $\mathbf{Y} = (Y, \mathbb{Y})$. Indeed, with the interpretation
of the integral in the sense of (4.24), below fully spelled out for the reader’s
convenience, we can set
\[
\mathbb{Y}_{s,t} \stackrel{\text{def}}{=} \int_s^t Y_{s,r} \otimes dY_r = \lim_{|\mathcal{P}| \to 0} \sum_{[u,v] \in \mathcal{P}} \Xi_{u,v} , \qquad \Xi_{u,v} = Y_{s,u} \otimes Y_{u,v} + (Y'_u \otimes Y'_u)\, \mathbb{X}_{u,v} ,
\]
where $Y'_u \otimes Y'_u \in L(V \otimes V, W \otimes W)$ is given by $(Y'_u \otimes Y'_u)(v \otimes \tilde v) = (Y'_u(v)) \otimes (Y'_u(\tilde v))$.
The fact that $\|\mathbb{Y}\|_{2\alpha}$ is finite is then a consequence of (4.25). On the other hand, the
algebraic relations (2.1) already hold for the “Riemann sum” approximations to the
three integrals, provided that the partition used for the approximation of $\mathbb{Y}_{s,t}$ is the
union of the one used for the approximation of $\mathbb{Y}_{s,u}$ with the one used for $\mathbb{Y}_{u,t}$.

1
It can also be useful to consider t 7→ X0,t as a path “controlled by X”, resulting in the controlled
rough path (X, X); cf. Exercise 4.6.


We summarise the above consideration in saying that for every fixed $\mathbf{X} \in \mathscr{C}^\alpha([0, T], V)$, we have a continuous canonical injection
\[
\mathscr{D}_X^{2\alpha}([0, T], W) \hookrightarrow \mathscr{C}^\alpha([0, T], W) .
\]
Furthermore, this interpretation of elements of $\mathscr{D}_X^{2\alpha}$ as elements of $\mathscr{C}^\alpha$ is coherent in
terms of the theory of integration constructed in the previous section, as can be seen
by the following result:

Proposition 7.1. Let $(X, \mathbb{X}) \in \mathscr{C}^\alpha$, let $(Y, Y') \in \mathscr{D}_X^{2\alpha}$, and let $\mathbf{Y} = (Y, \mathbb{Y}) \in \mathscr{C}^\alpha$
be the associated rough path constructed as above. If $(\tilde Z, \tilde Z') \in \mathscr{D}_Y^{2\alpha}$, then $(Z, Z') \in \mathscr{D}_X^{2\alpha}$, where $Z_t = \tilde Z_t$ and $Z'_t = \tilde Z'_t Y'_t$. Furthermore, one has the identity
\[
\int_0^t Z_s\,dY_s = \int_0^t \tilde Z_s\,d\mathbf{Y}_s . \tag{7.1}
\]
Here, the left-hand side uses (4.24) to define the integral of two controlled rough
paths against each other and the right-hand side uses the original definition (4.21)
of the integral of a controlled rough path against its reference path.

Proof. By assumption, one has $Y_{s,t} = Y'_s X_{s,t} + O(|t - s|^{2\alpha})$ and $\tilde Z_{s,t} = \tilde Z'_s Y_{s,t} + O(|t - s|^{2\alpha})$. Combining these identities, it follows immediately that
\[
Z_{s,t} = \tilde Z'_s Y'_s X_{s,t} + O(|t - s|^{2\alpha}) = Z'_s X_{s,t} + O(|t - s|^{2\alpha}) ,
\]
so that $(Z, Z') \in \mathscr{D}_X^{2\alpha}$ as required. Now the left-hand side of (7.1) is given by $\mathcal{I}\Xi_{0,t}$
with $\Xi_{s,t} = Z_s Y_{s,t} + Z'_s Y'_s \mathbb{X}_{s,t}$, whereas the right-hand side is given by $\mathcal{I}\tilde\Xi_{0,t}$,
where we set $\tilde\Xi_{s,t} = \tilde Z_s Y_{s,t} + \tilde Z'_s \mathbb{Y}_{s,t}$. Since $|\mathbb{Y}_{s,t} - Y'_s Y'_s \mathbb{X}_{s,t}| \le C |t - s|^{3\alpha}$ by
(4.22), the claim now follows from Remark 4.13. □

Remark 7.2. It is straightforward to see that if $\frac13 < \beta < \alpha$, then $\mathscr{C}^\alpha \hookrightarrow \mathscr{C}^\beta$ and, for
every $\mathbf{X} \in \mathscr{C}^\alpha$, we have a canonical embedding $\mathscr{D}_X^{2\alpha} \hookrightarrow \mathscr{D}_X^{2\beta}$. Furthermore, in view
of the definition (4.10) of $\mathcal{I}$, the values of the integrals defined above do not depend
on the interpretation of the integrand and integrator as elements of one or the other
space.

7.2 Lifting of regular paths.

There is a canonical embedding $\iota : C^{2\alpha} \hookrightarrow \mathscr{D}_X^{2\alpha}$ given by $\iota Y = (Y, 0)$, since in this
case $R_{s,t} = Y_{s,t}$ does indeed satisfy $\|R\|_{2\alpha} < \infty$. Recall that we are only interested
in the case $\alpha \le \frac12$. After all, if $Y_{s,t} = O(|t - s|^{2\alpha})$ with $\alpha > \frac12$, then $Y$ has a
vanishing derivative and must be constant.

7.3 Composition with regular functions.

Let $W$ and $\bar W$ be two Banach spaces and let $\varphi : W \to \bar W$ be a function in $C_b^2$. Let
furthermore $(Y, Y') \in \mathscr{D}_X^{2\alpha}([0, T], W)$ for some $X \in C^\alpha$. (In applications $X$ will
be part of some $\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}^\alpha$ but this is irrelevant here.) Then one can define a
(candidate) controlled rough path $(\varphi(Y), \varphi(Y)') \in \mathscr{D}_X^{2\alpha}([0, T], \bar W)$ by
\[
\varphi(Y)_t = \varphi(Y_t) , \qquad \varphi(Y)'_t = D\varphi(Y_t) Y'_t . \tag{7.2}
\]
It is straightforward to check that the corresponding remainder term does indeed
satisfy the required bound. It is also straightforward to check that, as a consequence
of the chain rule, this definition is consistent in the sense that $(\varphi \circ \psi)(Y, Y') = \varphi(\psi(Y, Y'))$. We have the following result. Note that, since $\varphi$ (and its derivatives)
are only evaluated in a compact set (namely $Y([0, T]) \subset W$), there is no loss in
generality in assuming $\varphi$ (and its derivatives) bounded.
Lemma 7.3. Let $\varphi \in C_b^2$, $(Y, Y') \in \mathscr{D}_X^{2\alpha}([0, T], W)$ for some $X \in C^\alpha$ with $|Y'_0| + \|Y, Y'\|_{X,2\alpha} \le M \in [1, \infty)$. Let $(\varphi(Y), \varphi(Y)') \in \mathscr{D}_X^{2\alpha}([0, T], \bar W)$ be given by
(7.2). Then, there exists a constant $C$ depending only on $T > 0$ and $\alpha > \frac13$ such that
one has the bound
\[
\|\varphi(Y), \varphi(Y)'\|_{X,2\alpha} \le C_{\alpha,T}\, M\, \|\varphi\|_{C_b^2} (1 + \|X\|_\alpha)^2 \big( |Y'_0| + \|Y, Y'\|_{X,2\alpha} \big) .
\]
At last, $C$ can be chosen uniformly over $T \in (0, 1]$.


′
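Proof. We have $(\varphi(Y), \varphi(Y)') = (\varphi(Y_\cdot), D\varphi(Y_\cdot) Y'_\cdot) \in \mathscr{D}_X^{2\alpha}$. Indeed,
\begin{align*}
\|\varphi(Y_\cdot)\|_\alpha &\le \|D\varphi\|_\infty \|Y_\cdot\|_\alpha , \\
\|\varphi(Y)'\|_\alpha &\le \|D\varphi(Y_\cdot)\|_\infty \|Y'_\cdot\|_\alpha + \|Y'_\cdot\|_\infty \|D\varphi(Y_\cdot)\|_\alpha \\
&\le \|D\varphi(Y_\cdot)\|_\infty \|Y'_\cdot\|_\alpha + \|Y'_\cdot\|_\infty \|D^2\varphi(Y_\cdot)\|_\infty \|Y_\cdot\|_\alpha ,
\end{align*}
which shows that $\varphi(Y), \varphi(Y)' \in C^\alpha$. Furthermore, $R^\varphi \equiv R^{\varphi(Y)}$ is given by
\begin{align*}
R^\varphi_{s,t} &= \varphi(Y_t) - \varphi(Y_s) - D\varphi(Y_s) Y'_s X_{s,t} \\
&= \varphi(Y_t) - \varphi(Y_s) - D\varphi(Y_s) Y_{s,t} + D\varphi(Y_s) R^Y_{s,t}
\end{align*}
so that,
\[
\|R^\varphi\|_{2\alpha} \le \frac12 \|D^2\varphi\|_\infty \|Y\|_\alpha^2 + |D\varphi|_\infty \|R^Y\|_{2\alpha} .
\]
It follows that
\begin{align*}
\|\varphi(Y), \varphi(Y)'\|_{X,2\alpha} &\le \|D\varphi(Y_\cdot)\|_\infty \|Y'_\cdot\|_\alpha + \|Y'_\cdot\|_\infty \|D^2\varphi(Y_\cdot)\|_\infty \|Y_\cdot\|_\alpha \\
&\quad + \frac12 \|D^2\varphi\|_\infty \|Y\|_\alpha^2 + |D\varphi|_\infty \|R^Y\|_{2\alpha} \\
&\le \|\varphi\|_{C_b^2} \Big( \|Y'_\cdot\|_\alpha + \|Y'_\cdot\|_\infty \|Y_\cdot\|_\alpha + \|Y\|_\alpha^2 + \|R^Y\|_{2\alpha} \Big) \\
&\le C_{\alpha,T} \|\varphi\|_{C_b^2} (1 + \|X\|_\alpha)^2 \big( 1 + |Y'_0| + \|Y, Y'\|_{X,2\alpha} \big) \\
&\quad \times \big( |Y'_0| + \|Y, Y'\|_{X,2\alpha} \big) ,
\end{align*}
where we used in particular (4.20). □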



It follows immediately that one has the following “Leibniz rule”, the proof of
which is left to the reader:
Corollary 7.4. Let $(Y, Y')$ and $(Z, Z')$ be two controlled paths in $\mathscr{D}_X^{2\alpha}$ for some
$X \in C^\alpha$. Then the path $U = YZ$, with Gubinelli derivative $U' = Y Z' + Z Y'$ also
belongs to $\mathscr{D}_X^{2\alpha}$.
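
The remainder estimate from the proof of Lemma 7.3 can be sanity-checked in the simplest scalar situation. The Python sketch below is only an illustration under assumptions of our own: $Y = X$ with $Y' \equiv 1$ (so that $R^Y = 0$) and $\varphi = \sin$, for which the bound reduces to $|R^\varphi_{s,t}| \le \frac12 \|D^2\varphi\|_\infty |Y_{s,t}|^2$. The helper name `composition_remainder` is hypothetical.

```python
import math

def composition_remainder(phi, dphi, y_s, y_t, y_prime_s, x_st):
    """R^phi_{s,t} = phi(Y_t) - phi(Y_s) - Dphi(Y_s) Y'_s X_{s,t} (scalar case)."""
    return phi(y_t) - phi(y_s) - dphi(y_s) * y_prime_s * x_st

# Y = X with Y' = 1, hence X_{s,t} = Y_t - Y_s and R^Y = 0.
y_s, y_t = 0.3, 0.5
r = composition_remainder(math.sin, math.cos, y_s, y_t, 1.0, y_t - y_s)
bound = 0.5 * 1.0 * (y_t - y_s) ** 2  # ||D^2 sin||_inf = 1
print(abs(r), bound)
```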

7.4 Stability II: Regular functions of controlled rough paths

In Lemma 7.3 we showed that controlled rough paths composed with (sufficiently)
regular functions are again controlled rough paths. We shall be interested to quantify
the continuity of this operation. As a useful warm-up, we start with the case of Hölder
paths.
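Lemma 7.5. Assume $\varphi \in C_b^2(W, \bar W)$ and $T \le 1$. Then there exists a constant $C_{\alpha,K}$
such that for all $X, Y \in C^\alpha([0, T], W)$ with $\|X\|_{\alpha;[0,T]}, \|Y\|_{\alpha;[0,T]} \le K \in [1, \infty)$,
\[
\|\varphi(X) - \varphi(Y)\|_{\alpha;[0,T]} \le C_{\alpha,K} \|\varphi\|_{C_b^2} \Big( |X_0 - Y_0| + \|X - Y\|_{\alpha;[0,T]} \Big) .
\]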

Proof. Consider the difference

φ(X)s,t − φ(Y )s,t = (φ(Xt ) − φ(Yt )) − (φ(Xs ) − φ(Ys )).

The idea is to use a division property of sufficiently smooth functions. In the present
context, this simply means that one has
Z 1
φ(x) − φ(y) = g(x, y)(x − y) with g(x, y) := Dφ(tx + (1 − t)y) dt ,
0

where g : W × W → L(W, W̄ ) is obviously bounded by ∥Dφ∥∞ and in fact


Lipschitz with ∥g∥Lip ≤ C∥D2 φ∥∞ for some constant C ≥ 1 relative to any
product norm on W × W , such as |(x, y)|W ×W = |x| + |y|. It follows that

|(g(x, y) − g(x̃, ỹ))| ≤ ∥g∥Lip |(x − x̃, y − ỹ)| ≤ C∥D2 φ∥∞ (|x − x̃| + |y − ỹ|).

Setting $\Delta_t = X_t - Y_t$ then allows us to write
\begin{align*}
\big| \varphi(X)_{s,t} - \varphi(Y)_{s,t} \big| &= |g(X_t, Y_t) \Delta_t - g(X_s, Y_s) \Delta_s| \\
&= |g(X_t, Y_t)(\Delta_t - \Delta_s) + (g(X_t, Y_t) - g(X_s, Y_s)) \Delta_s| \\
&\le \|g\|_\infty |X_{s,t} - Y_{s,t}| + \|g\|_{\mathrm{Lip}} |(X_{s,t}, Y_{s,t})|_{W \times W} |X_s - Y_s| \\
&\le \|D\varphi\|_\infty |X_{s,t} - Y_{s,t}| + C \|D^2\varphi\|_\infty (|X_{s,t}| + |Y_{s,t}|) \|X - Y\|_{\infty;[0,T]} \\
&\lesssim |t - s|^\alpha \Big( \|D\varphi\|_\infty \|X - Y\|_\alpha + K \|D^2\varphi\|_\infty \|X - Y\|_{\infty;[0,T]} \Big) .
\end{align*}
Since $T \le 1$ we can also estimate $\|X - Y\|_{\infty;[0,T]} \le |X_0 - Y_0| + \|X - Y\|_{\alpha;[0,T]}$
and the claimed estimate on $\varphi(X) - \varphi(Y)$ follows immediately. □
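
The division property used in this proof can be verified numerically in one dimension. The Python sketch below (an illustration; the midpoint quadrature, the test function $\varphi(u) = u^3$ and the sample points are our own choices) approximates $g(x, y) = \int_0^1 D\varphi(tx + (1 - t)y)\,dt$ and compares $g(x, y)(x - y)$ with $\varphi(x) - \varphi(y)$.

```python
def g(dphi, x, y, n=1000):
    """Midpoint-rule approximation of int_0^1 Dphi(t x + (1 - t) y) dt."""
    total = 0.0
    for i in range(n):
        t = (i + 0.5) / n
        total += dphi(t * x + (1.0 - t) * y)
    return total / n

phi = lambda u: u ** 3
dphi = lambda u: 3.0 * u ** 2

x, y = 1.2, 0.7
lhs = phi(x) - phi(y)
rhs = g(dphi, x, y) * (x - y)
print(lhs, rhs)
```

For this cubic, $g(x, y) = x^2 + xy + y^2$ in closed form, which is visibly bounded by $\|D\varphi\|_\infty$ on bounded sets and Lipschitz in $(x, y)$, as used in the estimate.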

We can now show the analogous statement for controlled rough paths, using
notation previously introduced in Section 4.4.

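Theorem 7.6 (Stability of composition). Let $X, \tilde X \in C^\alpha([0, T])$ with $T \le 1$,
$(Y, Y') \in \mathscr{D}_X^{2\alpha}$, $(\tilde Y, \tilde Y') \in \mathscr{D}_{\tilde X}^{2\alpha}$. For $\varphi \in C_b^3$ define
\[
(Z, Z') := (\varphi(Y), D\varphi(Y) Y') \in \mathscr{D}_X^{2\alpha} \tag{7.3}
\]
and similarly for $(\tilde Z, \tilde Z')$. Then, one has the local Lipschitz estimates
\[
\|Z, Z'; \tilde Z, \tilde Z'\|_{X,\tilde X,2\alpha} \le C_M \Big( \|X - \tilde X\|_\alpha + |Y_0 - \tilde Y_0| + |Y'_0 - \tilde Y'_0| + \|Y, Y'; \tilde Y, \tilde Y'\|_{X,\tilde X,2\alpha} \Big) , \tag{7.4}
\]
as well as
\[
\|Z - \tilde Z\|_\alpha \le C_M \Big( \|X - \tilde X\|_\alpha + |Y_0 - \tilde Y_0| + |Y'_0 - \tilde Y'_0| + \|Y, Y'; \tilde Y, \tilde Y'\|_{X,\tilde X,2\alpha} \Big) , \tag{7.5}
\]
for a suitable constant $C_M = C(M, \alpha, \varphi)$.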

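Proof. (The reader is urged to revisit Lemma 7.3 where the composition (7.3) was
seen to be well-defined for $\varphi \in C_b^2$.) Similarly to the previous proof, noting that
\[
|Z'_0 - \tilde Z'_0| = \big| D\varphi(Y_0) Y'_0 - D\varphi(\tilde Y_0) \tilde Y'_0 \big| \le C_M \big( |Y_0 - \tilde Y_0| + |Y'_0 - \tilde Y'_0| \big) ,
\]
it suffices to establish the first estimate, for (7.5) is an immediate consequence of
(7.4) and (4.30). In order to establish the first estimate we need to bound
\[
\big\| D\varphi(Y) Y' - D\varphi(\tilde Y) \tilde Y' \big\|_\alpha + \big\| R^Z - R^{\tilde Z} \big\|_{2\alpha} .
\]
Write $C_M(\varepsilon_X + \varepsilon_0 + \varepsilon'_0 + \varepsilon)$ for the right-hand side of (7.4). Note that with this
notation, from (4.30),
\[
\|Y - \tilde Y\|_\alpha \lesssim \varepsilon_X + \varepsilon'_0 + \varepsilon =: \varepsilon_Y ,
\]
and also $\|Y - \tilde Y\|_{\infty;[0,T]} \lesssim \varepsilon_0 + \varepsilon_Y$ (uniformly over $T \le 1$). Since $D\varphi \in C_b^2$, we
know from Lemma 7.5 that
\[
\big\| D\varphi(\tilde Y) - D\varphi(Y) \big\|_{C^\alpha} = \big| D\varphi(\tilde Y_0) - D\varphi(Y_0) \big| + \big\| D\varphi(\tilde Y) - D\varphi(Y) \big\|_\alpha \le C(\varepsilon_0 + \varepsilon_Y) ,
\]
where $C$ depends on the $C_b^3$-norm of $\varphi$. Also, $\|Y' - \tilde Y'\|_{C^\alpha} \le \varepsilon'_0 + \varepsilon$. Clearly then
($C^\alpha$ is a Banach algebra under pointwise multiplication), we have, for a constant $C_M$,
\[
\big\| D\varphi(Y) Y' - D\varphi(\tilde Y) \tilde Y' \big\|_\alpha \le C_M(\varepsilon_0 + \varepsilon_Y + \varepsilon'_0 + \varepsilon) \lesssim C_M(\varepsilon_X + \varepsilon_0 + \varepsilon'_0 + \varepsilon) .
\]
To deal with $R^Z - R^{\tilde Z}$, write
\begin{align*}
R^Z_{s,t} &= \varphi(Y_t) - \varphi(Y_s) - D\varphi(Y_s) Y'_s X_{s,t} \\
&= \varphi(Y_t) - \varphi(Y_s) - D\varphi(Y_s) Y_{s,t} + D\varphi(Y_s) R^Y_{s,t} .
\end{align*}
Taking the difference with $R^{\tilde Z}$ (replace $Y, Y', R^Y$ above by $\tilde Y, \tilde Y', R^{\tilde Y}$) leads to the
bound $|R^Z_{s,t} - R^{\tilde Z}_{s,t}| \le T_1 + T_2$ where
\begin{align*}
T_1 &:= \Big| \varphi(Y_t) - \varphi(Y_s) - D\varphi(Y_s) Y_{s,t} - \big( \varphi(\tilde Y_t) - \varphi(\tilde Y_s) - D\varphi(\tilde Y_s) \tilde Y_{s,t} \big) \Big| \\
&\;= \Big| \int_0^1 \Big( D^2\varphi(Y_s + \theta Y_{s,t})(Y_{s,t}, Y_{s,t}) - D^2\varphi(\tilde Y_s + \theta \tilde Y_{s,t})(\tilde Y_{s,t}, \tilde Y_{s,t}) \Big) (1 - \theta)\,d\theta \Big| , \\
T_2 &:= \big| D\varphi(Y_s) R^Y_{s,t} - D\varphi(\tilde Y_s) R^{\tilde Y}_{s,t} \big| .
\end{align*}
As for the second term, we know $|R^Y_{s,t} - R^{\tilde Y}_{s,t}| \le (\varepsilon'_0 + \varepsilon) |t - s|^{2\alpha}$, for all $s, t$, while
\[
\big| D\varphi(\tilde Y_s) - D\varphi(Y_s) \big| \le \|D^2\varphi\|_\infty \|\tilde Y - Y\|_\infty \le \|D^2\varphi\|_\infty (\varepsilon_0 + \varepsilon_Y) .
\]
By elementary estimates of the form $|ab - \tilde a \tilde b| \le |a| |b - \tilde b| + |a - \tilde a| |\tilde b|$ it then
follows immediately that one has $T_2 \le C(\varepsilon_X + \varepsilon_0 + \varepsilon'_0 + \varepsilon) |t - s|^{2\alpha}$.
One argues similarly for the first term. This time, we consider the expression
under the above integral $\int (\ldots)(1 - \theta)\,d\theta$ for fixed integration variable $\theta \in [0, 1]$.
Since $D^2\varphi$ is Lipschitz with constant $\|D^3\varphi\|_\infty$, we obtain
\begin{align*}
\big| D^2\varphi(\tilde Y_s + \theta \tilde Y_{s,t}) - D^2\varphi(Y_s + \theta Y_{s,t}) \big| &\le \|D^3\varphi\|_\infty \big( |\tilde Y_s - Y_s| + |\tilde Y_{s,t} - Y_{s,t}| \big) \\
&\le 3 \|D^3\varphi\|_\infty \|\tilde Y - Y\|_\infty \lesssim \varepsilon_0 + \varepsilon_Y ,
\end{align*}
noting that this estimate is uniform in $s, t \in [0, T]$ and $\theta \in [0, 1]$. It then suffices
to insert / subtract $D^2\varphi(Y_s + \theta Y_{s,t})(\tilde Y_{s,t}, \tilde Y_{s,t})$ under the integral $\int \ldots (1 - \theta)\,d\theta$
appearing in the definition of $T_1$ and conclude with the triangle inequality and some
simple estimates, keeping in mind that $\|Y - \tilde Y\|_\alpha \le \varepsilon_Y$ and $\|Y\|_\alpha, \|\tilde Y\|_\alpha \lesssim C_M$. □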



7.5 Itô’s formula revisited

Let $\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}^\alpha$, with $\alpha \in (\frac13, \frac12]$ as usual. In Proposition 5.8 we derived the
following Itô formula
\[
F(X_t) = F(X_0) + \int_0^t DF(X_s)\,d\mathbf{X}_s + \frac12 \int_0^t D^2F(X_s)\,d[\mathbf{X}]_s , \tag{7.6}
\]

and now ask for a similar formula for $F(Y_t)$, when $(Y, Y') \in \mathscr{D}_X^{2\alpha}$ is a controlled
rough path. It turns out that we need to be more specific and assume
\[
Y_t = Y_0 + \int_0^t Y'_s\,d\mathbf{X}_s + \Gamma_t , \tag{7.7}
\]
with $(Y', Y'') \in \mathscr{D}_X^{2\alpha}$, such as to have a well-defined rough integral; some flexibility
is added in form of a “drift” term $\Gamma$, assumed regular in time. Such paths arise
naturally as rough integrals of 1-forms, cf. Section 4.2, and also if $Y$ is the solution
to a rough differential equation driven by $\mathbf{X}$ to be discussed in Section 8.1. In analogy
with similar Itô formulae from stochastic calculus, we expect
\begin{align*}
F(Y_t) = F(Y_0) &+ \int_0^t DF(Y_s) Y'_s\,d\mathbf{X}_s + \int_0^t DF(Y_s)\,d\Gamma_s \\
&+ \frac12 \int_0^t D^2F(Y_s)\big( Y'_s, Y'_s \big)\,d[\mathbf{X}]_s . \tag{7.8}
\end{align*}

Before going on, we note that the right-hand side above is indeed meaningful: the last
two integrals are Young integrals and the first is a bona-fide rough integral. Indeed,
by Lemma 7.3 and Corollary 7.4, the integrand $Z' := DF(Y)Y'$ is controlled by $X$,
with Gubinelli derivative $Z'' = D^2F(Y)(Y', Y') + DF(Y)Y''$, so that the rough
integral, following Theorem 4.10,
\[
\int_0^t DF(Y_s)Y'_s\,d\mathbf{X}_s = \int_0^t Z'_s\,d\mathbf{X}_s = \lim_{|\mathcal{P}| \to 0} \sum_{[u,v] \in \mathcal{P}} \bigl(Z'_u X_{u,v} + Z''_u \mathbb{X}_{u,v}\bigr), \tag{7.9}
\]
is well-defined. (The extra structure $(Y', Y'') \in \mathscr{D}_X^{2\alpha}$ was crucially used.)
We note that, when $\mathbf{X} = \mathbf{B}^{\text{Itô}}(\omega)$, Itô enhanced Brownian motion, and $Y, Y', Y''$
are all adapted, then so is $(Z', Z'')$ and the rough integral in (7.9) becomes, by
Proposition 5.1, a classical Itô integral.

Theorem 7.7 (Itô formula II). Let $F : V \to W$ in $C^3$, $\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}^\alpha$ and
$(Y, Y') \in \mathscr{D}_X^{2\alpha}$ a controlled rough path of the form (7.7) for some controlled rough
path $(Y', Y'') \in \mathscr{D}_X^{2\alpha}$ and some path $\Gamma \in C^{2\alpha}$. Then the Itô formula (7.8) holds
true.

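Proof. Assumption (7.7) implies that increments of $Y$ are of the form
\[
Y_{s,t} = Y'_s X_{s,t} + Y''_s \mathbb{X}_{s,t} + \Gamma_{s,t} + o(|t - s|) . \tag{7.10}
\]
Thanks to (7.6), we know that $F(Y_t) - F(Y_0)$ equals
\[
\lim_{|\mathcal{P}| \to 0} \sum_{[u,v] \in \mathcal{P}} \bigl(DF(Y_u)Y_{u,v} + D^2F(Y_u)\mathbb{Y}_{u,v}\bigr) + \frac12 \lim_{|\mathcal{P}| \to 0} \sum_{[u,v] \in \mathcal{P}} D^2F(Y_u)[\mathbf{Y}]_{u,v} \tag{7.11}
\]
where $\mathbb{Y}_{u,v} = \int_u^v Y_{u,\cdot} \otimes dY$ in the sense of Remark 4.12, noting that $\mathbb{Y}_{u,v} =
Y'_u Y'_u \mathbb{X}_{u,v} + o(|v - u|)$. Also,
\[
\begin{aligned}
[\mathbf{Y}]_{u,v} &= Y_{u,v} \otimes Y_{u,v} - 2\,\mathrm{Sym}(\mathbb{Y}_{u,v}) \\
&= Y'_u Y'_u \bigl(X_{u,v} \otimes X_{u,v} - 2\,\mathrm{Sym}(\mathbb{X}_{u,v})\bigr) + o(|v - u|) \\
&= Y'_u Y'_u [\mathbf{X}]_{u,v} + o(|v - u|).
\end{aligned}
\]
Let us also subtract / add $DF(Y_u)Y''_u \mathbb{X}_{u,v}$ from (7.11). Then $F(Y_t) - F(Y_0)$ equals
\[
\begin{aligned}
&\lim_{|\mathcal{P}| \to 0} \sum_{[u,v] \in \mathcal{P}} \bigl(DF(Y_u)(Y_{u,v} - Y''_u \mathbb{X}_{u,v}) + DF(Y_u)Y''_u \mathbb{X}_{u,v} + D^2F(Y_u)Y'_u Y'_u \mathbb{X}_{u,v}\bigr) \\
&\qquad + \frac12 \lim_{|\mathcal{P}| \to 0} \sum_{[u,v] \in \mathcal{P}} D^2F(Y_u)Y'_u Y'_u [\mathbf{X}]_{u,v} \\
&= \lim_{|\mathcal{P}| \to 0} \sum_{[u,v] \in \mathcal{P}} \bigl(DF(Y_u)Y'_u X_{u,v} + \bigl(DF(Y_u)Y''_u + D^2F(Y_u)Y'_u Y'_u\bigr) \mathbb{X}_{u,v}\bigr) \\
&\qquad + \lim_{|\mathcal{P}| \to 0} \sum_{[u,v] \in \mathcal{P}} DF(Y_u)\Gamma_{u,v} + \frac12 \int_0^t D^2F(Y_u)Y'_u Y'_u\,d[\mathbf{X}]_u .
\end{aligned}
\]
In view of (7.9), also noting the appearance of two Young integrals in the last line,
the proof is complete. ⊔⊓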
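The compensated sums in (7.11) can be checked numerically in the simplest case. The following is only a sketch under stated assumptions: the path is scalar, X is a Gaussian random walk on a uniform grid standing in for a Brownian path, Y = X, and the second level is the scalar Itô choice (X_{u,v}^2 - (v - u))/2, so that the bracket of each increment equals v - u. The function name `ito_formula_gap` and the choice F = sin are illustrative and not taken from the text.

```python
import math
import random

def ito_formula_gap(n_steps=20000, T=1.0, seed=0):
    """Return |F(X_T) - F(X_0) - (compensated sums)| for F = sin.

    Assumptions (illustrative only): X is a discretised Brownian path and
    the second level is XX_{u,v} = (X_{u,v}^2 - (v - u)) / 2, so that the
    bracket X_{u,v}^2 - 2 * XX_{u,v} equals v - u.
    """
    rng = random.Random(seed)
    dt = T / n_steps
    F, DF, D2F = math.sin, math.cos, lambda x: -math.sin(x)

    x = 0.0
    rough_sum = 0.0    # sum of DF(X_u) X_{u,v} + D2F(X_u) XX_{u,v}
    bracket_sum = 0.0  # sum of D2F(X_u) [X]_{u,v}
    for _ in range(n_steps):
        dx = rng.gauss(0.0, math.sqrt(dt))
        xx = 0.5 * (dx * dx - dt)        # Ito-type second level on [u, v]
        bracket = dx * dx - 2.0 * xx     # equals dt up to rounding
        rough_sum += DF(x) * dx + D2F(x) * xx
        bracket_sum += D2F(x) * bracket
        x += dx
    lhs = F(x) - F(0.0)
    rhs = rough_sum + 0.5 * bracket_sum
    return abs(lhs - rhs)

gap = ito_formula_gap()
print(f"gap between F(X_T) - F(X_0) and the compensated sums: {gap:.2e}")
```

In this scalar setting each summand reduces to a second-order Taylor term, so the gap is controlled by the sum of cubed increments and shrinks as the grid is refined.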

It is worth having a different perspective on this Itô formula and take $\Gamma = 0$ for
an unobstructed view. Then assumption (7.7) means exactly that $(Y, Y', Y'') \in \mathscr{D}_X^{3\alpha}$
in the sense (cf. Definition 4.18)
\[
\delta Y_{s,t} \overset{3\alpha}{=} Y'_s X_{s,t} + Y''_s \mathbb{X}_{s,t}, \qquad
\delta Y'_{s,t} \overset{2\alpha}{=} Y''_s X_{s,t}, \qquad
\delta Y''_{s,t} \overset{\alpha}{=} 0 . \tag{7.12}
\]
If we furthermore restrict to $\mathbf{X}$ geometric, so that $[\mathbf{X}] \equiv 0$, Itô’s formula takes the
form of a classical chain rule,
\[
F(Y_t) - F(Y_s) = Z_{s,t} = \int_s^t Z'_r\,d\mathbf{X}_r \overset{3\alpha}{=} Z'_s X_{s,t} + Z''_s \mathbb{X}_{s,t} .
\]
On the other hand, $(Z', Z'') \in \mathscr{D}_X^{2\alpha}$ means exactly $\delta Z'_{s,t} \overset{2\alpha}{=} Z''_s X_{s,t}$, $\delta Z''_{s,t} \overset{\alpha}{=} 0$.
This discussion leads us to the following.

Proposition 7.8. Let $F \in C^3$ and $\mathbf{Y} = (Y, Y', Y'') \in \mathscr{D}_X^{3\alpha}$ with geometric $\mathbf{X} =
(X, \mathbb{X}) \in \mathscr{C}_g^\alpha$. Then
\[
\mathbf{Z} = (Z, Z', Z'') := \bigl(F(Y),\; DF(Y)Y',\; DF(Y)Y'' + D^2F(Y)(Y', Y')\bigr)
\]
is also an element in $\mathscr{D}_X^{3\alpha}$. By abuse of notation, we write $\mathbf{Z} = F(\mathbf{Y})$.

Remark 7.9. The conclusion $\mathbf{Z} \in \mathscr{D}_X^{3\alpha}$ can be “itemised”, similar to (7.12). The $k\alpha$
estimates ($k = 1, 2, 3$) are then uniform over $F \in C_b^3$, in analogy with the estimate
for elements in $\mathscr{D}_X^{2\alpha}$, as was detailed in Lemma 7.3.

Proof. We give a direct proof, without intermediate use of rough integrals (and in
fact no need for $\alpha > 1/3$) to emphasise the analogy with our previous Lemma 7.3
on composition of elements in $\mathscr{D}_X^{2\alpha}$ with regular functions. By Taylor’s theorem,
\[
F(Y_t) \overset{3\alpha}{=} F(Y_s) + DF(Y_s)\bigl(Y'_s X_{s,t} + Y''_s \mathbb{X}_{s,t}\bigr) + \frac12 D^2F(Y_s)(Y_{s,t}, Y_{s,t}).
\]
Note that $Y_{s,t} \otimes Y_{s,t} \overset{3\alpha}{=} (Y'_s X_{s,t})^{\otimes 2}$ and by geometricity $\tfrac12 X_{s,t}^{\otimes 2} = \mathrm{Sym}(\mathbb{X}_{s,t})$, so that
(recalling that $D^2F$ is symmetric) the second order term in the Taylor expansion can be replaced by
\[
D^2F(Y_s)(Y'_s, Y'_s)\mathbb{X}_{s,t} .
\]
The remaining details are left to the reader. ⊔⊓

As will be discussed in the next section, similar composition formulae can be
obtained for arbitrary $\mathbf{Y} \in \mathscr{D}_X^{\gamma}$ as long as $\gamma > 0$.

7.6 Controlled rough paths of low regularity

Let us conclude this section by showing how these canonical operations can be lifted
to the case of controlled rough paths of low regularity, i.e. when $\alpha < \tfrac13$. Recall
from Section 4.5 that basis vectors in $T^{(N)}(\mathbb{R}^d)$ are of the form $e_w = e_{w_1} \otimes \ldots \otimes e_{w_k}$,
for words of the form $w = w_1 \cdots w_k$ with letters in $\{1, \ldots, d\}$, whereas words
themselves are identified via the dual basis of $T^{(N)}(\mathbb{R}^d)^*$,
\[
w \leftrightarrow e^*_w .
\]
Controlled rough paths $\mathbf{Y}$ are $T^{(N-1)}(\mathbb{R}^d)^*$-valued functions, which are controlled
by increments of $\mathbf{X}$ in the sense of Definition 4.18.
This suggests that, in order to define the product of two controlled rough paths
$\mathbf{Y}$ and $\bar{\mathbf{Y}}$, we should first ask ourselves how a product of the type $\mathbf{X}^{w}_{s,t} \mathbf{X}^{\bar w}_{s,t}$ for two
different words $w$ and $\bar w$ can be rewritten as a linear combination of the increments of
$\mathbf{X}$. It was seen in Section 2.4 that such a product is described by the shuffle product
of words.
With this definition at hand, we saw that any (weakly) geometric rough path $\mathbf{X}$
satisfies the identity
\[
\mathbf{X}^{w}_{s,t}\, \mathbf{X}^{\bar w}_{s,t} = \mathbf{X}^{w \sqcup\!\sqcup \bar w}_{s,t} .
\]
Also, $T^{(N)}(\mathbb{R}^d)^*$ becomes a commutative algebra, the shuffle algebra, via
\[
e^*_w \star e^*_{\bar w} = e^*_{w \sqcup\!\sqcup \bar w} .
\]
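To make the shuffle product concrete, here is a small sketch (an illustration, not code from the text). Words are represented as tuples of letters, and the shuffle of two words is returned as a multiset of words. As a check of the product identity above, it uses the standard fact that for a straight-line path with increment x the iterated integral indexed by a word w_1...w_k equals x_{w_1}...x_{w_k}/k!. The helper names `shuffle` and `linear_path_integral`, and the numerical increment, are hypothetical.

```python
from collections import Counter
from math import factorial, isclose, prod

def shuffle(w, v):
    """Shuffle product of two words (tuples of letters), returned as a
    Counter mapping each resulting word to its multiplicity."""
    if not w:
        return Counter({tuple(v): 1})
    if not v:
        return Counter({tuple(w): 1})
    out = Counter()
    # The last letter of a shuffle comes either from w or from v.
    for word, mult in shuffle(w[:-1], v).items():
        out[word + (w[-1],)] += mult
    for word, mult in shuffle(w, v[:-1]).items():
        out[word + (v[-1],)] += mult
    return out

def linear_path_integral(word, increment):
    """Iterated integral of a straight-line path with the given increment:
    the component for the word w_1...w_k is x_{w_1} ... x_{w_k} / k!."""
    return prod(increment[i] for i in word) / factorial(len(word))

increment = {1: 0.3, 2: -1.2, 3: 0.7}   # hypothetical increment in R^3
w, v = (1, 2), (3, 1)
lhs = linear_path_integral(w, increment) * linear_path_integral(v, increment)
rhs = sum(m * linear_path_integral(u, increment)
          for u, m in shuffle(w, v).items())
print(isclose(lhs, rhs))
```

The recursion on the last letter mirrors the usual inductive definition of the shuffle; a Counter is used so that repeated words, such as the shuffle of a letter with itself, keep their multiplicity.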

This strongly suggests that the “correct” way of multiplying two controlled rough
paths $\mathbf{Y}$ and $\bar{\mathbf{Y}}$ is to define their product $\mathbf{Z}$ by
\[
\mathbf{Z}_t = \mathbf{Y}_t \star \bar{\mathbf{Y}}_t .
\]
It is possible to check that $\mathbf{Z}$ is indeed again a controlled rough path. Similarly, if $F$ is
a (sufficiently) smooth function and $\mathbf{Y}$ is a controlled rough path, we abuse notation
and define $F(\mathbf{Y})$ by
\[
F(\mathbf{Y})_t \stackrel{\text{def}}{=} F(\mathbf{Y}_t^{\not\#}) + \sum_{k=1}^{N-1} \frac{1}{k!} F^{(k)}(\mathbf{Y}_t^{\not\#})\, \tilde{\mathbf{Y}}_t^{\star k}, \tag{7.13}
\]
where $F^{(k)}$ denotes the $k$th derivative of $F$ and $\tilde{\mathbf{Y}}_t \stackrel{\text{def}}{=} \mathbf{Y}_t - \mathbf{Y}_t^{\not\#}$ is the part describing
the “local fluctuations”. It is again possible to show that $F(\mathbf{Y})$ is a controlled rough
path if $\mathbf{Y}$ is a controlled rough path and $F$ is sufficiently smooth (class $C^p$ will do).
This is nothing but the natural generalisation of the Itô formula in the formulation of
Proposition 7.8. (The detailed verification of this is left to the reader in Exercise 7.5.)
Remark 7.10. A generalisation of (7.13) in the context of regularity structures is
given in Proposition 14.8.

7.7 Exercises
♯ Exercise 7.1 Verify that $\mathbb{X}_{s,t} = \int_s^t X_{s,r} \otimes d\mathbf{X}_r$ where the integral is to be interpreted
in the sense of (4.24), taking $(Y, Y')$ to be $(X, I)$. In fact, check that this holds not
only in the limit $|\mathcal{P}| \to 0$ but in fact for every fixed partition $\mathcal{P}$, i.e. $\mathbb{X}_{s,t} = \int_{\mathcal{P}} \Xi$. Compare
this with formula (2.26), obtained in Exercise 2.4.
Exercise 7.2 Let $\varphi : W \times [0, T] \to \bar W$ be a function which is uniformly $C^2$ in its
first argument (i.e. $\varphi$ is bounded and both $D_y\varphi$ and $D_y^2\varphi$ are bounded, where $D_y$
denotes the Fréchet derivative with respect to the first argument) and uniformly $C^{2\alpha}$
in its second argument. Let furthermore $(Y, Y') \in \mathscr{D}_X^{2\alpha}([0, T], W)$. Show that
\[
\varphi(Y)_t = \varphi(Y_t, t) , \qquad \varphi(Y)'_t = D_y\varphi(Y_t, t)Y'_t
\]
defines an element $(\varphi(Y), \varphi(Y)') \in \mathscr{D}_X^{2\alpha}([0, T], \bar W)$. In fact, show that there exists
a constant $C$, depending only on $T$, such that one has the bound
\[
\|\varphi(Y)\|_{X,2\alpha} \le C\bigl(\|D_y^2\varphi\|_\infty + \|\varphi\|_\infty + \|\varphi\|_{2\alpha;t}\bigr)\bigl(1 + \|X\|_\alpha\bigr)^2\bigl(1 + \|Y\|_{X,2\alpha}\bigr)^2 ,
\]
where we denote by $\|\varphi\|_{2\alpha;t}$ the supremum over $y$ of the $2\alpha$-Hölder norm of $\varphi(y, \cdot)$.

Exercise 7.3 (Composition with smooth functions; from [GH19]) We return to
the Hilbert / semigroup setting of Exercise 4.16. Let $\alpha \in \mathbb{R}$ and let $F \in C^2(H_\beta, H_\beta)$,
consistently for every $\beta \ge \alpha$, with derivatives up to order 2 bounded. Let $\mathbf{X} =
(X, \mathbb{X}) \in \mathscr{C}^\gamma([0, T], \mathbb{R}^d)$ for $\gamma \in (1/3, 1/2]$ and $(Y, Y') \in \mathscr{D}^{2\gamma}_{S,X}([0, T], H_\alpha)$. Moreover
assume that in addition $Y \in L^\infty([0, T], H_{\alpha+2\gamma})$ and $Y' \in L^\infty([0, T], H_{\alpha+2\gamma})$.
Show that $(Z_t, Z'_t) := (F(Y_t), DF(Y_t) \circ Y'_t)$ defines an element $(Z, Z') \in
\mathscr{D}^{2\gamma}_{S,X}([0, T], H_\alpha)$ with the quantitative bound
\[
\|(Z, Z')\|_{X,2\gamma,\alpha} \lesssim (1 + |X|_\gamma)^2 \bigl(1 + \|Y\|_{\infty,\alpha+2\gamma} + \|Y'\|_{\infty,\alpha+2\gamma} + \|(Y, Y')\|_{X,2\gamma}\bigr)^2 . \tag{7.14}
\]
The proportionality constant depends on the bounds on $F$ and its derivatives. It also
depends on time $T$, but is uniform over $T \in (0, 1]$.
Exercise 7.4 (Rough product formula) Assume $Y = Y_0 + \int (Y', Y'')\,d\mathbf{X} + \Gamma$ as
in Theorem 7.7, and similarly for $\bar Y$. Assume $\mathbf{X}$ is geometric, so that the bracket $[\mathbf{X}]$
vanishes. Then the following product formula holds
\[
Y_t \bar Y_t = Y_0 \bar Y_0 + \int_0^t (M, M')\,d\mathbf{X} + \int_0^t \bigl((d\Gamma_s)\bar Y_s + Y_s\,d\bar\Gamma_s\bigr)
\]
with $M_s = Y'_s \bar Y_s + Y_s \bar Y'_s$, $M'_s = Y''_s \bar Y_s + 2Y'_s \bar Y'_s + Y_s \bar Y''_s$.

(If $Y, \bar Y$ take values in a Banach space $V$, the formula is understood as an identity in
$V \otimes V$.)
Hint: Apply Theorem 7.7 with $F(y, \bar y) = y\bar y$ (or $y \otimes \bar y$).
Exercise 7.5 a) Consider a controlled rough path $(Y, Y') \in \mathscr{D}_X^{2\alpha}$, with $X \in C^\alpha$,
and verify that the composition formula (7.13), with $p = 2$, is consistent with
Lemma 7.3.
b) Consider then a controlled rough path $(Y, Y', Y'') \in \mathscr{D}_X^{3\alpha}$, with $\mathbf{X} = (X, \mathbb{X}) \in
\mathscr{C}_g^\alpha$, and verify that the composition formula (7.13), with $p = 3$, is consistent
with Proposition 7.8.

7.8 Comments

Stability of controlled rough paths under composition with regular functions goes
back to Gubinelli [Gub04], also in an α-Hölder setting α > 1/3, similar to our
Sections 7.3 and 7.4. Extensions to lower order regularity and then the “branched”
setting are given in [Gub10, HK15, FZ18], see also [BDFT20, Thm 2.11] for a
concise proof in the geometric setting and connections to a multivariate Faà di Bruno
formula.
Our discussion of Itô’s formula, Section 7.5, expands on a similar section of the
first edition (2014), and makes more explicit the point that Itô’s formula is really a
composition formula for higher order controlled rough paths. Assuming α > 1/3 for
the sake of argument,

Such formulae are sometimes directly given for RDE solutions, in which case
the equation dictates a particular controlled structure, as seen spelled out directly in
Davie’s approach, Section 8.7. This is also a natural way to define manifold valued
RDE solutions, similar to the definition of manifold valued semimartingales. See
also comment Section 12.5 for some pointers to Itô formulae in the context of rough
and stochastic PDEs.
Chapter 8
Solutions to rough differential equations

We show how to solve differential equations driven by rough paths by a simple Picard
iteration argument. This yields a pathwise solution theory mimicking the standard
solution theory for ordinary differential equations. We start with the simple case of
differential equations driven by a signal that is sufficiently regular for Young’s theory
of integration to apply and then proceed to the case of more general rough signals.

8.1 Introduction

We now turn our attention to (rough) differential equations of the form

dYt = f (Yt ) dXt , Y0 = ξ ∈ W . (8.1)

Here, X : [0, T ] → V is the driving or input signal, while Y : [0, T ] → W is the


output signal. As usual V and W are Banach spaces, and f : W → L(V, W ). When
dim V = d < ∞, one may think of f as a collection of vector fields (f1 , . . . , fd ) on
W . As usual, the reader is welcome to think V = Rd and W = Rn but there is really
no difference in the argument. Such equations are familiar from the theory of ODEs,
and more specifically, control theory, where X is typically assumed to be absolutely
continuous so that dXt = Ẋt dt. The case of SDEs, stochastic differential equations,
with dX interpreted as Itô or Stratonovich differential of Brownian motion, is also
well known. Both cases will be seen as special examples of RDEs, rough differential
equations.
We may consider (8.1) on the unit time interval. Indeed, equation (8.1) is invariant
under time-reparametrisation so that any (finite) time horizon may be rescaled to [0, 1].
Alternatively, global solutions on a larger time horizon are constructed successively,
i.e. by concatenating Y |[0,1] (started at Y0 ) with Y |[1,2] (started at Y1 ) and so on.
As a matter of fact, we shall construct solutions by a variation of the classical
Picard iteration on intervals [0, T ], where T ∈ (0, 1] will be chosen sufficiently
small to guarantee invariance of suitable balls and the contraction property. Our key
ingredients are estimates for rough integrals (cf. Theorem 4.10) and the composition
of controlled paths with smooth maps (Lemma 7.3). Recall that, for rather trivial
reasons (of the sort $|t - s|^{2\alpha} \le |t - s|^{\alpha}$, when $0 \le s \le t \le T \le 1$), all constants in
these estimates were seen to be uniform in $T \in (0, 1]$.

8.2 Review of the Young case: a priori estimates

Let us postulate that there exists a solution to a differential equation in Young’s sense
and let us derive an a-priori estimate. (In finite dimension, this can actually be used
to prove the existence of solutions. Note that the regularity requirement here is “one
degree less” than what is needed for the corresponding uniqueness result.)

Proposition 8.1. Assume X, Y ∈ C β ([0, 1], V ) for some β ∈ (1/2, 1] such that,
given ξ ∈ W, f ∈ Cb1 (W, L(V, W )), we have

dYt = f (Yt )dXt , Y0 = ξ ,

in the sense of a Young integral equation. Then
\[
\|Y\|_\beta \le C\Bigl(\bigl(\|f\|_{C_b^1}\|X\|_\beta\bigr) \vee \bigl(\|f\|_{C_b^1}\|X\|_\beta\bigr)^{1/\beta}\Bigr).
\]

Proof. By assumption, for $0 \le s < t \le 1$, $Y_{s,t} = \int_s^t f(Y_r)\,dX_r$. Using Young’s
inequality (4.3), with $C = C(\beta)$,
\[
|Y_{s,t} - f(Y_s)X_{s,t}| = \Bigl|\int_s^t (f(Y_r) - f(Y_s))\,dX_r\Bigr|
\le C\|Df\|_\infty \|Y\|_{\beta;[s,t]} \|X\|_{\beta;[s,t]} |t - s|^{2\beta}
\]
so that
\[
|Y_{s,t}|/|t - s|^{\beta} \le \|f\|_\infty \|X\|_\beta + C\|Df\|_\infty \|Y\|_{\beta;[s,t]} \|X\|_{\beta;[s,t]} |t - s|^{\beta} .
\]
Write $\|Y\|_{\beta;h} \equiv \sup |Y_{s,t}|/|t - s|^{\beta}$ where the sup is restricted to times $s, t \in [0, 1]$
for which $|t - s| \le h$. Clearly then,
\[
\|Y\|_{\beta;h} \le \|f\|_\infty \|X\|_\beta + C\|Df\|_\infty \|Y\|_{\beta;h} \|X\|_\beta h^\beta
\]
and upon taking $h$ small enough, s.t. $\delta h^\beta \asymp 1$, with $\delta = \|X\|_\beta$, more precisely s.t.
\[
C\|Df\|_\infty \|X\|_\beta h^\beta \le C\bigl(1 + \|f\|_{C_b^1} \|X\|_\beta\bigr) h^\beta \le 1/2
\]
(we will take $h$ such that the second $\le$ becomes an equality; adding 1 avoids trouble
when $f \equiv 0$), we obtain
\[
\frac12 \|Y\|_{\beta;h} \le \|f\|_\infty \|X\|_\beta .
\]
It then follows from Exercise 4.5 that, with $h \propto \|X\|_\beta^{-1/\beta}$,
\[
\|Y\|_\beta \le \|Y\|_{\beta;h}\bigl(1 \vee h^{-(1-\beta)}\bigr) \le C\|X\|_\beta\bigl(1 \vee h^{-(1-\beta)}\bigr)
= C\bigl(\|X\|_\beta \vee \|X\|_\beta^{1/\beta}\bigr).
\]
Here, we have absorbed the dependence on $f \in C_b^1$ into the constants. By scaling
(any non-zero $f$ may be normalised to $\|f\|_{C_b^1} = 1$ at the price of replacing $X$ by
$\|f\|_{C_b^1} \times X$) we then get immediately the claimed estimate. ⊔⊓

8.3 Review of the Young case: Picard iteration

The reader may be helped by first reviewing the classical Picard argument in a
Young setting, i.e. when β ∈ (1/2, 1]. Given ξ ∈ W , f ∈ Cb2 (W, L(V, W )), X ∈
C β ([0, 1], V ) and Y : [0, T ] → W of suitable Hölder regularity, T ∈ (0, 1], one
defines the map $\mathcal{M}_T$ by
\[
\mathcal{M}_T(Y) := \Bigl(\xi + \int_0^t f(Y_s)\,dX_s : t \in [0, T]\Bigr).
\]

Following a classical pattern of proof, we shall establish invariance of suitable balls,


and then a contraction property upon taking T = T0 small enough. The resulting
unique fixed point is then obviously the unique solution to (8.1) on [0, T0 ]. The
unique solution on [0, 1] is then constructed successively, i.e. by concatenating the
solution Y on [0, T0 ], started at Y0 = ξ, with the solution Y on [T0 , 2T0 ] started
at YT0 and so on. Care is necessary to ensure that T0 can be chosen uniformly;
for instance, if $f$ were only $C^2$ (without the boundedness assumption) one can still
obtain local existence on $[0, T_1]$, and then $[T_1, T_2]$, etc, but the resulting maximal
solution (with respect to extension of solutions) may only exist on $[0, \tau)$, for some
$\lim_n T_n = \tau \le T = 1$. In finite dimension, $\tau$ can be identified as explosion time,
see also Exercise 8.4. (The situation here is completely analogous to the theory of
Banach valued ODEs.)
We will need the Hölder norm of X over [0, T ] to tend to zero as T ↓ 0. Now, as
the example of the map t 7→ t and β = 1 shows, this may not be true relative to the
β-Hölder norm; the (cheap) trick is to take α ∈ (1/2, β) and to view MT as map
from the Banach space C α ([0, T ], W ), rather than C β ([0, T ], W ), into itself. Young’s
inequality is still applicable since all paths involved will be (at least) α-Hölder
continuous with α > 1/2. On the other hand,
\[
\|X\|_{\alpha;[0,T]} \le T^{\beta-\alpha} \|X\|_{\beta;[0,T]} ,
\]
and so the α-Hölder norm of X has the desired behaviour. As previously, when no
confusion is possible, we write ∥ · ∥α ≡ ∥ · ∥α;[0,T ] .
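The gain of the factor T^{β−α} can be observed on a grid. The sketch below is illustrative only: it assumes the sample path t ↦ t^β with β = 0.75, takes α = 0.6, and replaces the Hölder seminorms by their discrete analogues over grid points; the function name `holder_seminorm` is hypothetical.

```python
def holder_seminorm(times, values, exponent):
    """Discrete Hölder seminorm: sup over grid pairs s < t of
    |X_t - X_s| / (t - s)**exponent."""
    best = 0.0
    for i in range(len(times)):
        for j in range(i + 1, len(times)):
            ratio = abs(values[j] - values[i]) / (times[j] - times[i]) ** exponent
            best = max(best, ratio)
    return best

alpha, beta = 0.6, 0.75
for T in (1.0, 0.1, 0.01):
    times = [T * k / 200 for k in range(201)]
    values = [t ** beta for t in times]   # sample path t -> t^beta
    lhs = holder_seminorm(times, values, alpha)
    rhs = T ** (beta - alpha) * holder_seminorm(times, values, beta)
    print(f"T={T}: alpha-seminorm {lhs:.4f}, T^(beta-alpha) * beta-seminorm {rhs:.4f}")
```

For this path the β-seminorm does not decrease as T shrinks, while the α-seminorm does, which is the behaviour the change of exponent is meant to produce.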
To avoid norm versus seminorm considerations, it is convenient to work on
the space of paths started at $\xi$, namely $\{Y \in C^\alpha([0, T], W) : Y_0 = \xi\}$. This affine
subspace is a complete metric space under $(Y, \tilde Y) \mapsto \|Y - \tilde Y\|_\alpha$ and so is the closed
unit ball
\[
\mathcal{B}_T = \{Y \in C^\alpha([0, T], W) : Y_0 = \xi, \ \|Y\|_\alpha \le 1\} .
\]
Young’s inequality (4.3) shows that there is a constant $C$ which only depends on $\alpha$
(thanks to $T \le 1$) such that for every $Y \in \mathcal{B}_T$,

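\[
\begin{aligned}
\|\mathcal{M}_T(Y)\|_\alpha &\le C\bigl(|f(Y_0)| + \|f(Y)\|_\alpha\bigr)\|X\|_\alpha \\
&\le C\bigl(|f(\xi)| + \|Df\|_\infty \|Y\|_\alpha\bigr)\|X\|_\alpha \\
&\le C\bigl(|f|_\infty + \|Df\|_\infty\bigr)\|X\|_\alpha \le C|f|_{C_b^1} \|X\|_\beta T^{\beta-\alpha} .
\end{aligned}
\]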

 
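Similarly, for $Y, \tilde Y \in \mathcal{B}_T$, using Young, $f(Y_0) = f(\tilde Y_0)$ and Lemma 7.5 (with
$K = 1$)
\[
\begin{aligned}
\bigl\|\mathcal{M}_T(Y) - \mathcal{M}_T(\tilde Y)\bigr\|_\alpha &= \Bigl\|\int_0^\cdot f(Y_s)\,dX_s - \int_0^\cdot f(\tilde Y_s)\,dX_s\Bigr\|_\alpha \\
&\le C\bigl(|f(Y_0) - f(\tilde Y_0)| + \|f(Y) - f(\tilde Y)\|_\alpha\bigr)\|X\|_\alpha \\
&\le C\|f\|_{C_b^2} \|X\|_\beta T^{\beta-\alpha} \|Y - \tilde Y\|_\alpha .
\end{aligned}
\]
It is clear from the previous estimates that a small enough $T_0 = T_0(f, \alpha, \beta, X) \le 1$
can be found such that $\mathcal{M}_{T_0}(\mathcal{B}_{T_0}) \subset \mathcal{B}_{T_0}$ and, for all $Y, \tilde Y \in \mathcal{B}_{T_0}$,
\[
\bigl\|\mathcal{M}_{T_0}(Y) - \mathcal{M}_{T_0}(\tilde Y)\bigr\|_{\alpha;[0,T_0]} \le \frac12 \|Y - \tilde Y\|_{\alpha;[0,T_0]} .
\]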
Therefore, MT0 (·) admits a unique fixed point Y ∈ BT0 which is the unique solution
Y to (8.1) on the (small) interval [0, T0 ]. Noting that the choice T0 = T0 (f, α, β, X)
can indeed be done uniformly (in particular it does not change when the starting point
ξ is replaced by YT0 ), the unique solution on [0, 1] is then constructed iteratively, as
explained in the beginning.
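The fixed-point argument can be imitated on a grid. The following is a minimal sketch under explicit assumptions, not the construction of the text: the driver is the smooth path X_t = sin(3t), the vector field is f = sin, and the Young integral is replaced by left-point Riemann-Stieltjes sums. For this particular f the equation is separable, which gives a closed-form solution to compare against. The function name `picard_young` is hypothetical.

```python
import math

def picard_young(f, X, xi, n_iter):
    """Iterate Y -> xi + int_0^. f(Y_s) dX_s on a grid, the integral being
    approximated by left-point Riemann-Stieltjes sums."""
    n = len(X) - 1
    Y = [xi] * (n + 1)                   # initial guess: constant path
    for _ in range(n_iter):
        new = [xi]
        for i in range(n):
            new.append(new[-1] + f(Y[i]) * (X[i + 1] - X[i]))
        Y = new
    return Y

n = 2000
times = [k / n for k in range(n + 1)]
X = [math.sin(3.0 * t) for t in times]   # smooth driver (illustrative choice)
xi = 1.0
Y = picard_young(math.sin, X, xi, n_iter=30)

# For f = sin and a smooth driver the equation is separable, with solution
# Y_t = 2 * atan(tan(xi / 2) * exp(X_t - X_0)).
exact = [2.0 * math.atan(math.tan(xi / 2.0) * math.exp(x - X[0])) for x in X]
error = max(abs(a - b) for a, b in zip(Y, exact))
print(f"sup-distance to the closed-form solution: {error:.2e}")
```

The fixed point of the discretised map is the explicit Euler scheme on the grid, so the remaining error after many iterations reflects the discretisation rather than the iteration.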

8.4 Rough differential equations: a priori estimates

We now consider a priori estimates for rough differential equations, similar to Sec-
tion 8.2. Recall that the homogeneous rough path norm |||X|||α was introduced in
(2.4).

Proposition 8.2. Let $\xi \in W$, $f \in C_b^2(W, L(V, W))$ and a rough path $\mathbf{X} = (X, \mathbb{X}) \in
\mathscr{C}^\alpha$ with $\alpha \in (1/3, 1/2]$ and assume that $(Y, Y') = (Y, f(Y)) \in \mathscr{D}_X^{2\alpha}$ is an RDE
solution to $dY = f(Y)\,d\mathbf{X}$ started at $Y_0 = \xi \in W$. That is, for all $t \in [0, T]$,
\[
Y_t = \xi + \int_0^t f(Y_s)\,d\mathbf{X}_s , \tag{8.2}
\]
with integral interpreted in the sense of Theorem 4.10 and $(f(Y), f(Y)') \in \mathscr{D}_X^{2\alpha}$
built from $Y$ by Lemma 7.3. (Thanks to $C_b^2$-regularity of $f$ and Lemma 7.3 the above
rough integral equation (8.2) is well-defined.¹)
Then the following (a priori) estimate holds true
\[
\|Y\|_\alpha \le C\Bigl(\bigl(\|f\|_{C_b^2} |||\mathbf{X}|||_\alpha\bigr) \vee \bigl(\|f\|_{C_b^2} |||\mathbf{X}|||_\alpha\bigr)^{1/\alpha}\Bigr)
\]
where $C = C(\alpha)$ is a suitable constant.


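Proof. Consider an interval $I := [s, t]$ so that, using basic estimates for rough
integrals (cf. Theorem 4.10),
\[
\begin{aligned}
|R^Y_{s,t}| &= |Y_{s,t} - f(Y_s)X_{s,t}| \\
&\le \Bigl|\int_s^t f(Y)\,d\mathbf{X} - f(Y_s)X_{s,t} - Df(Y_s)f(Y_s)\mathbb{X}_{s,t}\Bigr| + |Df(Y_s)f(Y_s)\mathbb{X}_{s,t}| \\
&\lesssim \bigl(\|X\|_{\alpha;I} \|R^{f(Y)}\|_{2\alpha;I} + \|\mathbb{X}\|_{2\alpha;I} \|f(Y)\|_{\alpha;I}\bigr)|t - s|^{3\alpha}
+ \|\mathbb{X}\|_{2\alpha;I} |t - s|^{2\alpha} .
\end{aligned} \tag{8.3}
\]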

Recall that $\|\cdot\|_\alpha$ is the usual Hölder seminorm over $[0, T]$, while $\|\cdot\|_{\alpha;I}$ denotes
the same norm, but over $I \subset [0, T]$, so that trivially $\|X\|_{\alpha;I} \le \|X\|_\alpha$. Whenever
notationally convenient, multiplicative constants depending on $\alpha$ and $f$ are absorbed
in $\lesssim$; at the very end we can use scaling to make the $f$ dependence reappear. We
will also write $\|\cdot\|_{\alpha;h}$ for the supremum of $\|\cdot\|_{\alpha;I}$ over all intervals $I \subset [0, T]$ with
length $|I| \le h$. Again, one trivially has $\|X\|_{\alpha;I} \le \|X\|_{\alpha;h}$ whenever $|I| \le h$. Using
this notation, we conclude from (8.3) that
\[
\|R^Y\|_{2\alpha;h} \lesssim \|\mathbb{X}\|_{2\alpha;h} + \bigl(\|X\|_{\alpha;h} \|R^{f(Y)}\|_{2\alpha;h} + \|\mathbb{X}\|_{2\alpha;h} \|f(Y)\|_{\alpha;h}\bigr) h^\alpha .
\]

We would now like to relate $R^{f(Y)}$ to $R^Y$. As in the proof of Lemma 7.3, we obtain
the bound
\[
\begin{aligned}
R^{f(Y)}_{s,t} &= f(Y_t) - f(Y_s) - Df(Y_s)Y'_s X_{s,t} \\
&= f(Y_t) - f(Y_s) - Df(Y_s)Y_{s,t} + Df(Y_s)R^Y_{s,t}
\end{aligned}
\]
so that,
\[
\|R^{f(Y)}\|_{2\alpha;h} \le \frac12 |D^2 f|_\infty \|Y\|_{\alpha;h}^2 + |Df|_\infty \|R^Y\|_{2\alpha;h}
\lesssim \|Y\|_{\alpha;h}^2 + \|R^Y\|_{2\alpha;h} .
\]

¹ Later we will establish existence and uniqueness under $C_b^3$-regularity.

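Hence, also using $\|f(Y)\|_{\alpha;h} \lesssim \|Y\|_{\alpha;h}$, there exists $c_1 > 0$, not dependent on $\mathbf{X}$ or
$Y$, such that
\[
\begin{aligned}
\|R^Y\|_{2\alpha;h} \le{}& c_1 \|\mathbb{X}\|_{2\alpha;h} + c_1 \|X\|_{\alpha;h} h^\alpha \|Y\|_{\alpha;h}^2 \\
&+ c_1 \|X\|_{\alpha;h} h^\alpha \|R^Y\|_{2\alpha;h} + c_1 \|\mathbb{X}\|_{2\alpha;h} h^\alpha \|Y\|_{\alpha;h} .
\end{aligned} \tag{8.4}
\]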

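We now restrict ourselves to $h$ small enough so that $|||\mathbf{X}|||_\alpha h^\alpha \ll 1$. More precisely,
we choose it such that
\[
c_1 \|X\|_\alpha h^\alpha \le \frac12 , \qquad c_1 \|\mathbb{X}\|_{2\alpha}^{1/2} h^\alpha \le \frac12 .
\]
Inserting this bound into (8.4), we conclude that
\[
\|R^Y\|_{2\alpha;h} \le c_1 \|\mathbb{X}\|_{2\alpha;h} + \frac12 \|Y\|_{\alpha;h}^2 + \frac12 \|R^Y\|_{2\alpha;h} + \|\mathbb{X}\|_{2\alpha;h}^{1/2} \|Y\|_{\alpha;h} .
\]
This in turn yields the bound
\[
\begin{aligned}
\|R^Y\|_{2\alpha;h} &\le 2c_1 \|\mathbb{X}\|_{2\alpha;h} + \|Y\|_{\alpha;h}^2 + 2\|\mathbb{X}\|_{2\alpha;h}^{1/2} \|Y\|_{\alpha;h} \\
&\le c_2 \|\mathbb{X}\|_{2\alpha;h} + 2\|Y\|_{\alpha;h}^2 ,
\end{aligned} \tag{8.5}
\]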

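with $c_2 = (2c_1 + 1)$. On the other hand, since $Y_{s,t} = f(Y_s)X_{s,t} + R^Y_{s,t}$ and $f$ is
bounded, we have the bound
\[
\|Y\|_{\alpha;h} \lesssim \|X\|_\alpha + \|R^Y\|_{2\alpha;h} h^\alpha .
\]
Combining this bound with (8.5) yields
\[
\begin{aligned}
\|Y\|_{\alpha;h} &\le c_3 \|X\|_\alpha + c_3 \|\mathbb{X}\|_{2\alpha;h} h^\alpha + c_3 \|Y\|_{\alpha;h}^2 h^\alpha \\
&\le c_3 \|X\|_\alpha + c_4 \|\mathbb{X}\|_{2\alpha;h}^{1/2} + c_3 \|Y\|_{\alpha;h}^2 h^\alpha ,
\end{aligned}
\]
for some constants $c_3$ and $c_4$. Multiplication with $c_3 h^\alpha$ then yields, with $\psi_h :=
c_3 \|Y\|_{\alpha;h} h^\alpha$ and $\lambda_h := c_5 |||\mathbf{X}|||_\alpha h^\alpha \to 0$ as $h \to 0$,
\[
\psi_h \le \lambda_h + \psi_h^2 .
\]
Clearly, for all $h$ small enough depending on $Y$ (so that $\psi_h \le 1/2$), $\psi_h \le \lambda_h + \psi_h/2$
implies $\psi_h \le 2\lambda_h$ and so
\[
\|Y\|_{\alpha;h} \le c_6 |||\mathbf{X}|||_\alpha .
\]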
To see that this is true for all $h$ small enough without dependence on $Y$, pick $h_0$
small enough so that $\lambda_{h_0} < 1/4$. It then follows that for each $h \le h_0$, one of the
following two estimates must hold true
\[
\begin{aligned}
\psi_h &\ge \psi_+ \equiv \frac12 + \sqrt{\frac14 - \lambda_h} \ge \frac12 , \\
\psi_h &\le \psi_- \equiv \frac12 - \sqrt{\frac14 - \lambda_h} = \frac12\bigl(1 - \sqrt{1 - 4\lambda_h}\bigr) \sim \lambda_h \quad \text{as } h \downarrow 0.
\end{aligned}
\]
(In fact, for reasons that will become apparent shortly, we may decrease $h_0$ further to
guarantee that for $h < h_0$ we have not only $\psi_h < 1/2$ but $\psi_h < 1/6$.) We already
know that we are in the regime of the second estimate above as $h \downarrow 0$. Noting that
$\psi_h\ (< 1/6) < 1/2$ in the second regime, the only reason that could prevent us from
being in the second regime for all $h < h_0$ is an (upwards) jump of the (increasing)
function $(0, h_0] \ni h \mapsto \psi_h$. But $\psi_h \le 3 \lim_{g \uparrow h} \psi_g$, as seen from
\[
\|Y\|_{\alpha;h} \le 3\|Y\|_{\alpha;h/3} \le 3 \lim_{g \uparrow h} \|Y\|_{\alpha;g} ,
\]
(and similarly: $\lim_{g \downarrow h} \psi_g \le 3\psi_h$) which rules out any jumps of relative jump size
greater than 3. However, given that $\psi_h \ge 1/2$ in the first regime and $\psi_h < 1/6$ in the
second, we can never jump from the second into the first regime, as $h$ increases (from
zero). And so, we indeed must be in the second regime for all $h \le h_0$. Elementary
estimates on $\psi_-$, as function of $\lambda_h$, then show that
\[
\|Y\|_{\alpha;h} \le c_6 |||\mathbf{X}|||_\alpha ,
\]
for all $h \le h_0 \sim |||\mathbf{X}|||_\alpha^{-1/\alpha}$. We conclude with Exercise 4.5, arguing exactly as in the
Young case, Proposition 8.1. ⊔⊓
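The two regimes of the quadratic inequality ψ ≤ λ + ψ² used in the proof can be tabulated directly. The short sketch below is illustrative; the function name `psi_roots` is hypothetical. It computes the two roots of ψ = λ + ψ² and shows that the lower root stays between λ and 2λ, while the upper root stays above 1/2. The lower root equals 1/6 exactly when λ = 5/36, which quantifies the further decrease of h_0 mentioned in the proof.

```python
import math

def psi_roots(lam):
    """Roots psi_- <= psi_+ of psi = lam + psi**2, for 0 <= lam <= 1/4."""
    disc = math.sqrt(0.25 - lam)
    return 0.5 - disc, 0.5 + disc

for lam in (0.2, 5.0 / 36.0, 0.1, 0.01, 0.001):
    lower, upper = psi_roots(lam)
    print(f"lambda={lam:.4f}: psi_-={lower:.6f}, 2*lambda={2 * lam:.6f}, psi_+={upper:.6f}")
```

The bound ψ_- ≤ 2λ follows from 1 − √(1 − x) ≤ x on [0, 1], applied with x = 4λ.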

8.5 Rough differential equations

The aim of this section is to show that if $f$ is regular enough and $(X, \mathbb{X}) \in \mathscr{C}^\beta$ with
$\beta > \tfrac13$, then we can solve differential equations driven by the rough path $\mathbf{X} = (X, \mathbb{X})$
of the type
\[
dY = f(Y)\,d\mathbf{X} .
\]
Such an equation will yield solutions in $\mathscr{D}_X^{2\beta}$ and will be interpreted in the corresponding
integral formulation, where the integral of $f(Y)$ against $\mathbf{X}$ is defined using
Lemma 7.3 and Theorem 4.10. More precisely, one has the following local existence
and uniqueness result. (The construction of a maximal solution is left as Exercise 8.4.)

Theorem 8.3. Given $\xi \in W$, $f \in C^3(W, L(V, W))$ and a rough path $\mathbf{X} = (X, \mathbb{X}) \in
\mathscr{C}^\beta([0, T], V)$ with $\beta \in (\tfrac13, \tfrac12)$, there exists $0 < T_0 \le T$ and a unique element
$(Y, Y') \in \mathscr{D}_X^{2\beta}([0, T_0], W)$, with $Y' = f(Y)$, such that, for all $0 \le t \le T_0$,
\[
Y_t = \xi + \int_0^t f(Y_s)\,d\mathbf{X}_s . \tag{8.6}
\]
Here, the integral is interpreted in the sense of Theorem 4.10 and $f(Y) \in \mathscr{D}_X^{2\beta}$ is
built from $Y$ by Lemma 7.3. Moreover, if $f$ is linear or $f \in C_b^3$, we may take $T_0 = T$,
and thus global existence holds on $[0, T]$.
Remark 8.4. The condition $Y' = f(Y)$ (and then $f(Y)' = Df(Y)Y'$ by Lemma 7.3)
is crucial for uniqueness. To see what can happen, consider the canonical lift of
$X \in C^1$ to $\mathbf{X} = (X, \int X \otimes dX)$, in which case any choice of $f(Y)' \in C^\beta$ yields a
pair $(f(Y), f(Y)') \in \mathscr{D}_X^{2\beta}$. (Indeed, thanks to $|X_{s,t}| \lesssim |t - s|$, the term $f(Y)'_s X_{s,t}$
can always be absorbed in the $2\beta$-remainder.) On the other hand, regardless of the
choice of $Y'$, or $f(Y)'$, the rough integral in (8.6) here always agrees with the
Riemann-Stieltjes integral $\int f(Y)\,dX$, so that (8.6) is satisfied whenever $Y$ solves
the ODE $\dot Y = f(Y)\dot X$, with $Y_0 = \xi$.
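For a smooth scalar driver with its canonical lift, the local expansion Y_{s,t} ≈ f(Y_s)X_{s,t} + Df(Y_s)f(Y_s)𝕏_{s,t} suggests a simple second-order stepping scheme. The sketch below is an illustration under stated assumptions, not a construction from the text: the driver is X_t = sin(3t), the vector field is f = sin, and in the scalar canonical case the second level is X_{u,v}²/2. The function names `step2_euler` and `first_order_euler` are hypothetical; the first-order scheme is included for comparison only.

```python
import math

def step2_euler(f, Df, X, xi):
    """Second-order scheme Y_v = Y_u + f(Y_u) X_{u,v} + Df(Y_u) f(Y_u) XX_{u,v}
    for a scalar driver with canonical second level XX_{u,v} = X_{u,v}**2 / 2."""
    y = xi
    for u in range(len(X) - 1):
        dx = X[u + 1] - X[u]
        y += f(y) * dx + Df(y) * f(y) * 0.5 * dx * dx
    return y

def first_order_euler(f, X, xi):
    """First-order scheme Y_v = Y_u + f(Y_u) X_{u,v}."""
    y = xi
    for u in range(len(X) - 1):
        y += f(y) * (X[u + 1] - X[u])
    return y

n = 200
X = [math.sin(3.0 * k / n) for k in range(n + 1)]   # smooth driver on [0, 1]
xi = 1.0
# Closed form for f = sin: Y_t = 2 * atan(tan(xi / 2) * exp(X_t - X_0)).
exact = 2.0 * math.atan(math.tan(xi / 2.0) * math.exp(X[-1] - X[0]))
err2 = abs(step2_euler(math.sin, math.cos, X, xi) - exact)
err1 = abs(first_order_euler(math.sin, X, xi) - exact)
print(f"second-order error {err2:.2e}, first-order error {err1:.2e}")
```

For a genuinely rough driver the second-level term cannot be dropped, whereas here, as in Remark 8.4, the driver is smooth and both schemes approximate the same ODE solution.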
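Proof. With $\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}^\beta \subset \mathscr{C}^\alpha$, $\tfrac13 < \alpha < \beta$ and $(Y, Y') \in \mathscr{D}_X^{2\alpha}$ we know
from Lemma 7.3 that
\[
(\Xi, \Xi') := \bigl(f(Y), f(Y)'\bigr) := (f(Y), Df(Y)Y') \in \mathscr{D}_X^{2\alpha} .
\]
Restricting from $[0, 1]$ to $[0, T]$, any $T \le 1$, Theorem 4.10 allows to define the map
\[
\mathcal{M}_T(Y, Y') \stackrel{\text{def}}{=} \Bigl(\xi + \int_0^\cdot \Xi_s\,d\mathbf{X}_s,\; \Xi\Bigr) \in \mathscr{D}_X^{2\alpha} .
\]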

The RDE solution on $[0, T]$ we are looking for is a fixed point of this map. Strictly
speaking, this would only yield a solution $(Y, Y')$ in $\mathscr{D}_X^{2\alpha}$. But since $\mathbf{X} \in \mathscr{C}^\beta$, it
turns out that this solution is automatically an element of $\mathscr{D}_X^{2\beta}$. Indeed, $|Y_{s,t}| \le
|Y'|_\infty |X_{s,t}| + \|R^Y\|_{2\alpha} |t - s|^{2\alpha}$, so that $Y \in C^\beta$. From the fixed point property it
then follows that $Y' = f(Y) \in C^\beta$ and also $R^Y \in C_2^{2\beta}$, since $\mathbb{X} \in C_2^{2\beta}$ and
\[
|R^Y_{s,t}| = |Y_{s,t} - Y'_s X_{s,t}| = \Bigl|\int_s^t (f(Y_r) - f(Y_s))\,d\mathbf{X}_r\Bigr|
\le |Y'|_\infty |\mathbb{X}_{s,t}| + O\bigl(|t - s|^{3\alpha}\bigr) .
\]

Note that if $(Y, Y')$ is such that $(Y_0, Y'_0) = (\xi, f(\xi))$, then the same is true for
$\mathcal{M}_T(Y, Y')$. Therefore, $\mathcal{M}_T$ can be viewed as map on the space of controlled paths
started at $(\xi, f(\xi))$, i.e.
\[
\bigl\{(Y, Y') \in \mathscr{D}_X^{2\alpha}([0, T], W) : Y_0 = \xi, \ Y'_0 = f(\xi)\bigr\} .
\]
Since $\mathscr{D}_X^{2\alpha}$ is a Banach space (under the norm $(Y, Y') \mapsto |Y_0| + |Y'_0| + \|Y, Y'\|_{X,2\alpha}$)
the above (affine) subspace is a complete metric space under the induced metric. This
is also true for the (closed) unit ball $\mathcal{B}_T$ centred at, say
\[
t \mapsto (\xi + f(\xi)X_{0,t}, f(\xi)).
\]

(Note here that the apparently simpler choice $t \mapsto (\xi, f(\xi))$ does in general not
belong to $\mathscr{D}_X^{2\alpha}$.) In other words, $\mathcal{B}_T$ is the set of all $(Y, Y') \in \mathscr{D}_X^{2\alpha}([0, T], W)$ :
$Y_0 = \xi$, $Y'_0 = f(\xi)$ and
\[
\begin{aligned}
&|Y_0 - \xi| + |Y'_0 - f(\xi)| + \|(Y - (\xi + f(\xi)X_{0,\cdot}), Y'_\cdot - f(\xi))\|_{X,2\alpha} \\
&\qquad = \|(Y - f(\xi)X_{0,\cdot}, Y'_\cdot - f(\xi))\|_{X,2\alpha} \le 1.
\end{aligned}
\]
In fact, $\|(Y - f(\xi)X_{0,\cdot}, Y'_\cdot - f(\xi))\|_{X,2\alpha} = \|Y, Y'_\cdot\|_{X,2\alpha}$ as a consequence of the
triangle inequality and $\|(f(\xi)X_{0,\cdot}, f(\xi))\|_{X,2\alpha} = \|f(\xi)\|_\alpha + \|0\|_{2\alpha} = 0$, so that
\[
\mathcal{B}_T = \bigl\{(Y, Y') \in \mathscr{D}_X^{2\alpha}([0, T], W) : Y_0 = \xi, \ Y'_0 = f(\xi) : \|(Y, Y'_\cdot)\|_{X,2\alpha} \le 1\bigr\} .
\]
Let us also note that, for all $(Y, Y') \in \mathcal{B}_T$, one has the bound
\[
|Y'_0| + \|(Y, Y')\|_{X,2\alpha} \le |f|_\infty + 1 =: M \in [1, \infty). \tag{8.7}
\]

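We now show that, for $T$ small enough, $\mathcal{M}_T$ leaves $\mathcal{B}_T$ invariant and in fact is
contracting. Constants below are denoted by $C$, may change from line to line and
may depend on $\alpha, \beta, X, \mathbb{X}$ without special indication. They are, however, uniform
in $T \in (0, 1]$ and we prefer to be explicit (enough) with respect to $f$ such as to
see where $C_b^3$-regularity is used. With these conventions, we recall the following
estimates, direct consequences from Lemma 7.3 and Theorem 4.10, respectively,
\[
\begin{aligned}
\|\Xi, \Xi'\|_{X,2\alpha} &\le CM \|f\|_{C_b^2} \bigl(|Y'_0| + \|Y, Y'\|_{X,2\alpha}\bigr) , \\
\Bigl\|\int_0^\cdot \Xi_s\,d\mathbf{X}_s, \Xi\Bigr\|_{X,2\alpha} &\le \|\Xi\|_\alpha + \|\Xi'\|_\infty \|\mathbb{X}\|_{2\alpha}
+ C\bigl(\|X\|_\alpha \|R^\Xi\|_{2\alpha} + \|\mathbb{X}\|_{2\alpha} \|\Xi'\|_\alpha\bigr) \\
&\le \|\Xi\|_\alpha + C\bigl(|\Xi'_0| + \|\Xi, \Xi'\|_{X,2\alpha}\bigr)\bigl(\|X\|_\alpha + \|\mathbb{X}\|_{2\alpha}\bigr) \\
&\le \|\Xi\|_\alpha + C\bigl(|\Xi'_0| + \|\Xi, \Xi'\|_{X,2\alpha}\bigr) T^{\beta-\alpha} .
\end{aligned}
\]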




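Invariance: For $(Y, Y') \in \mathcal{B}_T$, noting that $\|\Xi\|_\alpha = \|f(Y)\|_\alpha \le \|f\|_{C_b^1} \|Y\|_\alpha$ and
that $|\Xi'_0| = |Df(Y_0)Y'_0| \le \|f\|_{C_b^1}^2$, we obtain the bound
\[
\begin{aligned}
\|\mathcal{M}_T(Y, Y')\|_{X,2\alpha} &= \Bigl\|\int_0^\cdot \Xi_s\,d\mathbf{X}_s, \Xi\Bigr\|_{X,2\alpha} \\
&\le \|\Xi\|_\alpha + C\bigl(|\Xi'_0| + \|\Xi, \Xi'\|_{X,2\alpha}\bigr) T^{\beta-\alpha} \\
&\le \|f\|_{C_b^1} \|Y\|_\alpha + C\Bigl(\|f\|_{C_b^1}^2 + CM \|f\|_{C_b^2} \bigl(|Y'_0| + \|Y, Y'\|_{X,2\alpha}\bigr)\Bigr) T^{\beta-\alpha} \\
&\le \|f\|_{C_b^1} (\|f\|_\infty + 1) T^{\beta-\alpha} + CM\Bigl(\|f\|_{C_b^1}^2 + \|f\|_{C_b^2} (\|f\|_\infty + 1)\Bigr) T^{\beta-\alpha} ,
\end{aligned}
\]
where in the last step we used (8.7) and also $\|Y\|_{\alpha;[0,T]} \le C_f T^{\beta-\alpha}$, seen from
\[
\begin{aligned}
|Y_{s,t}| &\le |Y'|_\infty |X_{s,t}| + \|R^Y\|_{2\alpha} |t - s|^{2\alpha} \\
&\le \bigl(|Y'_0| + \|Y'\|_\alpha\bigr) \|X\|_\beta |t - s|^{\beta} + \|R^Y\|_{2\alpha} |t - s|^{2\alpha} .
\end{aligned}
\]
Then, using $T^\alpha \le T^{\beta-\alpha}$ and $\|R^Y\|_{2\alpha} \le \|Y, Y'\|_{X,2\alpha} \le 1$, we obtain the bound
\[
\begin{aligned}
\|Y\|_{\alpha;[0,T]} &\le \bigl(|Y'_0| + \|Y, Y'\|_{X,2\alpha}\bigr) \|X\|_\beta T^{\beta-\alpha} + \|R^Y\|_{2\alpha} T^{\beta-\alpha} \\
&\le \bigl((\|f\|_\infty + 1)\|X\|_\beta + 1\bigr) T^{\beta-\alpha} .
\end{aligned} \tag{8.8}
\]
In other words, $\|\mathcal{M}_T(Y, Y')\|_{X,2\alpha} = \|\mathcal{M}_T(Y, Y')\|_{X,2\alpha;[0,T]} = O(T^{\beta-\alpha})$ with
constant only depending on $\alpha, \beta, \mathbf{X}$ and $f \in C_b^2$. By choosing $T = T_0$ small enough,
we obtain the bound $\|\mathcal{M}_{T_0}(Y, Y')\|_{X,2\alpha;[0,T_0]} \le 1$ so that $\mathcal{M}_{T_0}$ leaves $\mathcal{B}_{T_0}$ invariant,
as desired.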
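Contraction: Setting $\Delta_s = f(Y_s) - f(\tilde Y_s)$ as a shorthand, we have the bound
\[
\begin{aligned}
\bigl\|\mathcal{M}_T(Y, Y') - \mathcal{M}_T(\tilde Y, \tilde Y')\bigr\|_{X,2\alpha} &= \Bigl\|\int_0^\cdot \Delta_s\,d\mathbf{X}_s, \Delta\Bigr\|_{X,2\alpha} \\
&\le \|\Delta\|_\alpha + C\bigl(|\Delta'_0| + \|\Delta, \Delta'\|_{X,2\alpha}\bigr) T^{\beta-\alpha} \\
&\le C\|f\|_{C_b^2} \|Y - \tilde Y\|_\alpha + C\|\Delta, \Delta'\|_{X,2\alpha} T^{\beta-\alpha} .
\end{aligned}
\]
The contraction property is obvious, provided that we can establish the following
two estimates:
\[
\|Y - \tilde Y\|_\alpha \le C T^{\beta-\alpha} \|Y - \tilde Y, Y' - \tilde Y'\|_{X,2\alpha} , \tag{8.9}
\]
\[
\|\Delta, \Delta'\|_{X,2\alpha} \le C \|Y - \tilde Y, Y' - \tilde Y'\|_{X,2\alpha} . \tag{8.10}
\]
To obtain (8.9), replace $Y$ by $Y - \tilde Y$ in (8.8), noting $Y'_0 - \tilde Y'_0 = 0$, and this shows
\[
\begin{aligned}
\|Y - \tilde Y\|_\alpha &\le \|Y' - \tilde Y'\|_\alpha \|X\|_\beta T^{\beta-\alpha} + \|R^Y - R^{\tilde Y}\|_{2\alpha} T^{\beta-\alpha} \\
&\le C T^{\beta-\alpha} \|Y - \tilde Y, Y' - \tilde Y'\|_{X,2\alpha} .
\end{aligned}
\]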

We now turn to (8.10). Similar to the proof of Lemma 7.5, $f \in C_b^3$ allows to write
$\Delta_s = G_s H_s$ where
\[
G_s := g(Y_s, \tilde Y_s), \qquad H_s := Y_s - \tilde Y_s ,
\]
and $g \in C_b^2$ with $\|g\|_{C_b^2} \le C\|f\|_{C_b^3}$. Lemma 7.3 tells us that $(G, G') \in \mathscr{D}_X^{2\alpha}$ (with
$G' = (D_Y g)Y' + (D_{\tilde Y} g)\tilde Y'$) and in fact immediately yields an estimate of the form
\[
\|G, G'\|_{X,2\alpha} \le C\|f\|_{C_b^3} ,
\]
uniformly over $(Y, Y'), (\tilde Y, \tilde Y') \in \mathcal{B}_T$ and $T \le 1$. On the other hand, $\mathscr{D}_X^{2\alpha}$ is an
algebra in the sense that $(GH, (GH)') \in \mathscr{D}_X^{2\alpha}$ with $(GH)' = G'H + GH'$. In fact,
we leave it as easy exercise to the reader to check that
\[
\|GH, (GH)'\|_{X,2\alpha} \lesssim \bigl(|G_0| + |G'_0| + \|G, G'\|_{X,2\alpha}\bigr) \times \bigl(|H_0| + |H'_0| + \|H, H'\|_{X,2\alpha}\bigr) .
\]
In our situation, $H_0 = Y_0 - \tilde Y_0 = \xi - \xi = 0$, and similarly $H'_0 = 0$, so that, for all
$(Y, Y'), (\tilde Y, \tilde Y') \in \mathcal{B}_T$, we have
\[
\begin{aligned}
\|\Delta, \Delta'\|_{X,2\alpha} &\lesssim \bigl(|G_0| + |G'_0| + \|G, G'\|_{X,2\alpha}\bigr) \|H, H'\|_{X,2\alpha} \\
&\lesssim \Bigl(\|g\|_\infty + \|g\|_{C_b^1} \bigl(|Y'_0| + |\tilde Y'_0|\bigr) + C\|f\|_{C_b^3}\Bigr) \|Y - \tilde Y, Y' - \tilde Y'\|_{X,2\alpha} \\
&\lesssim \|Y - \tilde Y, Y' - \tilde Y'\|_{X,2\alpha} ,
\end{aligned}
\]
where we made use of $\|g\|_\infty, \|g\|_{C_b^1} \lesssim \|f\|_{C_b^3}$ and $|Y'_0| = |\tilde Y'_0| = |f(\xi)| \le |f|_\infty$.
The argument from here on is identical to the Young case: the previous estimates
allow for a small enough $T_0 \le 1$ such that $\mathcal{M}_{T_0}(\mathcal{B}_{T_0}) \subset \mathcal{B}_{T_0}$ and for all
$(Y, Y'), (\tilde Y, \tilde Y') \in \mathcal{B}_{T_0}$:
\[
\bigl\|\mathcal{M}_{T_0}(Y, Y') - \mathcal{M}_{T_0}(\tilde Y, \tilde Y')\bigr\|_{X,2\alpha} \le \frac12 \|Y - \tilde Y, Y' - \tilde Y'\|_{X,2\alpha}
\]
and so $\mathcal{M}_{T_0}(\cdot)$ admits a unique fixed point $(Y, Y') \in \mathcal{B}_{T_0}$, which is then the unique
solution $Y$ to (8.1) on the (possibly rather small) interval $[0, T_0]$. Noting that the
choice of $T_0$ can again be done uniformly in the starting point, the solution on $[0, 1]$
is then constructed iteratively as before. ⊔⊓

In many situations, one is interested in solutions to an equation of the type

dY = f0 (Y, t) dt + f (Y, t) dXt , (8.11)

instead of (8.6). On the one hand, it is possible to recast (8.11) in the form (8.6) by
writing it as an RDE for $\hat Y_t = (Y_t, t)$ driven by $\hat{\mathbf{X}}_t = (\hat X, \hat{\mathbb{X}})$ where $\hat X = (X_t, t)$
and $\hat{\mathbb{X}}$ is given by $\mathbb{X}$ and the “remaining cross integrals” of $X_t$ and $t$, given by usual
Riemann-Stieltjes integration. However, it is possible to exploit the structure of (8.11)
to obtain somewhat better bounds on the solutions. See [FV10b, Ch. 12].

8.6 Stability III: Continuity of the Itô–Lyons map

We now obtain continuity of solutions to rough differential equations as function of


their (rough) driving signals.

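Theorem 8.5 (Rough path stability of the Itô–Lyons map). Let $f \in C_b^3$ and, for
$\alpha \in (\tfrac13, \tfrac12]$, let $(Y, f(Y)) \in \mathscr{D}_X^{2\alpha}$ be the unique RDE solution given by Theorem 8.3
to
\[
dY = f(Y)\,d\mathbf{X}, \qquad Y_0 = \xi \in W .
\]
Similarly, let $(\tilde Y, f(\tilde Y))$ be the RDE solution driven by $\tilde{\mathbf{X}}$ and started at $\tilde\xi$ where
$\mathbf{X}, \tilde{\mathbf{X}} \in \mathscr{C}^\alpha$. Assuming
\[
|||\mathbf{X}|||_\alpha , \ |||\tilde{\mathbf{X}}|||_\alpha \le M < \infty
\]
we have the local Lipschitz estimates
\[
d_{X,\tilde X,2\alpha}\bigl(Y, f(Y); \tilde Y, f(\tilde Y)\bigr) \le C_M\bigl(|\xi - \tilde\xi| + \varrho_\alpha(\mathbf{X}, \tilde{\mathbf{X}})\bigr),
\]
and also
\[
\|Y - \tilde Y\|_\alpha \le C_M\bigl(|\xi - \tilde\xi| + \varrho_\alpha(\mathbf{X}, \tilde{\mathbf{X}})\bigr),
\]
where $C_M = C(M, \alpha, f)$ is a suitable constant.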

Remark 8.6. The proof only uses the a priori information that RDE solutions remain
bounded if the driving rough paths do, combined with basic stability properties of
rough integration and composition.

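Proof. Recall that, for given $\mathbf X \in \mathscr C^\alpha$, the RDE solution $(Y, f(Y)) \in \mathscr D_X^{2\alpha}$ is constructed as the unique fixed point of

\[
\mathcal M_T(Y, Y') := (Z, Z') := \Bigl(\xi + \int_0^\cdot f(Y_s)\,d\mathbf X_s,\ f(Y_\cdot)\Bigr) \in \mathscr D_X^{2\alpha},
\]

and similarly for $\tilde{\mathcal M}_T\bigl(\tilde Y, f(\tilde Y)\bigr) \in \mathscr D_{\tilde X}^{2\alpha}$. Then, thanks to the fixed point property

\[
(Y, f(Y)) = (Y, Y') = (Z, Z') = (Z, f(Y)),
\]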

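(similarly with tilde) and the local Lipschitz estimate for rough integration, Theorem 4.17, and writing $(\Xi, \Xi') := \bigl(f(Y), f(Y)'\bigr)$ for the integrand, we obtain the bound

\[
\begin{aligned}
d_{X,\tilde X,2\alpha}\bigl(Y, Y'; \tilde Y, \tilde Y'\bigr)
&= d_{X,\tilde X,2\alpha}\bigl(Z, Z'; \tilde Z, \tilde Z'\bigr) \\
&\lesssim \varrho_\alpha(\mathbf X, \tilde{\mathbf X}) + |\xi - \tilde\xi| + T^\alpha\, d_{X,\tilde X,2\alpha}\bigl(\Xi, \Xi'; \tilde\Xi, \tilde\Xi'\bigr).
\end{aligned}
\]

Thanks to the local Lipschitz estimate for composition, Theorem 7.6, uniform in $T \le 1$,

\[
d_{X,\tilde X,2\alpha}\bigl(\Xi, \Xi'; \tilde\Xi, \tilde\Xi'\bigr) \lesssim \varrho_\alpha(\mathbf X, \tilde{\mathbf X}) + |\xi - \tilde\xi| + d_{X,\tilde X,2\alpha}\bigl(Y, f(Y); \tilde Y, f(\tilde Y)\bigr).
\]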

In summary, for some constant $C = C(\alpha, f, M)$, we have the bound

\[
d_{X,\tilde X,2\alpha}\bigl(Y, f(Y); \tilde Y, f(\tilde Y)\bigr) \le C \Bigl(\varrho_\alpha(\mathbf X, \tilde{\mathbf X}) + |\xi - \tilde\xi| + T^\alpha\, d_{X,\tilde X,2\alpha}\bigl(Y, f(Y); \tilde Y, f(\tilde Y)\bigr)\Bigr).
\]

By taking $T = T_0(M, \alpha, f)$ smaller, if necessary, we may assume that $C T^\alpha \le 1/2$, from which it follows that

\[
d_{X,\tilde X,2\alpha}\bigl(Y, f(Y); \tilde Y, f(\tilde Y)\bigr) \le 2C \bigl(\varrho_\alpha(\mathbf X, \tilde{\mathbf X}) + |\xi - \tilde\xi|\bigr),
\]

which is precisely the required bound. The bound on $\|Y - \tilde Y\|_\alpha$ then follows as in (4.32), and these bounds can be iterated to cover a time interval of arbitrary (fixed) length. ⊔ ⊓

8.7 Davie’s definition and numerical schemes

Fix $f \in C_b^2(W, L(V, W))$ and $\mathbf X = (X, \mathbb X) \in \mathscr C^\beta([0, T], V)$ with $\beta > \frac13$. Under these assumptions, the rough differential equation $dY = f(Y)\,d\mathbf X$ makes sense as a well-defined integral equation. (In Theorem 8.3 we used additional regularity, namely $C_b^3$, to establish existence of a unique solution on $[0, T]$.) By the very definition of an RDE solution, unique or not, $(Y, f(Y)) \in \mathscr D_X^{2\beta}$, i.e.

\[
Y_{s,t} = f(Y_s) X_{s,t} + O\bigl(|t - s|^{2\beta}\bigr),
\]

and we recognise a step of first-order Euler approximation, $Y_{s,t} \approx f(Y_s) X_{s,t}$, started from $Y_s$. Clearly $O(|t - s|^{2\beta}) = o(|t - s|)$ if and only if $\beta > 1/2$ and one can show that iteration of such steps along a partition $\mathcal P$ of $[0, T]$ yields a convergent “Euler” scheme as $|\mathcal P| \downarrow 0$, see [Dav08] or [FV10b].
In the case $\beta \in (\frac13, \frac12]$ we have to exploit that we know more than just $(Y, f(Y)) \in \mathscr D_X^{2\beta}$. Indeed, since $Y_{s,t} = \int_s^t f(Y)\,d\mathbf X$, estimate (4.22) for rough integrals tells us that, for all pairs $s, t$,

\[
Y_{s,t} = f(Y_s) X_{s,t} + (f(Y))'_s\, \mathbb X_{s,t} + O\bigl(|t - s|^{3\beta}\bigr). \tag{8.12}
\]

Using the identity $f(Y)' = Df(Y) Y' = Df(Y) f(Y)$, this can be spelled out further to

\[
Y_{s,t} = f(Y_s) X_{s,t} + Df(Y_s) f(Y_s)\, \mathbb X_{s,t} + o(|t - s|) \tag{8.13}
\]
and, omitting the small remainder term, we recognise a step of a second-order Euler
or Milstein approximation. Again, one can show that iteration of such steps along a
partition P of [0, T ] yields a convergent “Euler” scheme as |P| ↓ 0; see [Dav08] or
[FV10b].
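
The two steps can be compared numerically in the simplest possible setting. The sketch below is an illustration under stated assumptions, not the scheme analysed in [Dav08]: the driver is a smooth scalar path, so the second level is taken to be $\mathbb X_{s,t} = \frac12 X_{s,t}^2$, and the linear vector field $f(y) = y$ (covered by the linear theory rather than by $C_b^2$) is chosen because the exact solution $Y_t = \xi \exp(X_{0,t})$ is available. All function names are hypothetical.

```python
import math

def euler_step(y, dx, f):
    """First-order step: Y_{s,t} ~ f(Y_s) X_{s,t}."""
    return y + f(y) * dx

def milstein_step(y, dx, dxx, f, df):
    """Second-order step as in (8.13): adds Df(Y_s) f(Y_s) XX_{s,t}."""
    return y + f(y) * dx + df(y) * f(y) * dxx

def run_schemes(x, xi, f, df):
    """Iterate both steps along the grid on which x is sampled."""
    y_e = y_m = xi
    for k in range(len(x) - 1):
        dx = x[k + 1] - x[k]
        dxx = 0.5 * dx * dx  # second level of a scalar geometric rough path
        y_e = euler_step(y_e, dx, f)
        y_m = milstein_step(y_m, dx, dxx, f, df)
    return y_e, y_m

# Illustrative data (assumptions): X_t = sin(3t) on [0, 1], f(y) = y, xi = 1.
n = 200
x = [math.sin(3.0 * k / n) for k in range(n + 1)]
y_euler, y_milstein = run_schemes(x, 1.0, lambda y: y, lambda y: 1.0)
exact = math.exp(x[-1] - x[0])
```

On this example the second-order step reduces the error at the terminal time by roughly two orders of magnitude compared with the first-order step on the same partition.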

Remark 8.7. These schemes can be understood from simple Taylor expansions based on the differential equation $dY = f(Y)\,dX$, at least when $X$ is smooth (enough), or via Itô’s formula in a semimartingale setting. With focus on the smooth case, the Euler approximation is obtained by a “left-point freezing” approximation $f(Y_\cdot) \approx f(Y_s)$ over $[s, t]$ in the integral equation,

\[
Y_{s,t} = \int_s^t f(Y_r)\,dX_r \approx f(Y_s) X_{s,t},
\]

whereas the Milstein scheme, with $\mathbb X_{s,t} = \int_s^t X_{s,r}\,dX_r$ for smooth paths, is obtained from the next-best approximation

\[
f(Y_r) \approx f(Y_s) + Df(Y_s) Y_{s,r} \approx f(Y_s) + Df(Y_s) f(Y_s) X_{s,r}.
\]

It turns out that the description (8.13) is actually a formulation that is equivalent
to the RDE solution built previously in the following sense.

Proposition 8.8. The following two statements are equivalent:

i) (Y, f (Y )) is an RDE solution to (8.6), as constructed in Theorem 8.3.
ii) Y ∈ C([0, T ], W ) is an “RDE solution in the sense of Davie”, i.e. in the sense of (8.13).

Proof. We already discussed how (8.13) is obtained from an RDE solution to (8.6). Conversely, (8.13) implies immediately $Y_{s,t} = f(Y_s) X_{s,t} + O(|t - s|^{2\beta})$, which shows that $Y \in C^\beta$ and also $Y' := f(Y) \in C^\beta$, thanks to $f \in C_b^2$, so that $(Y, f(Y)) \in \mathscr D_X^{2\beta}$. It remains to see, in the notation of the proof of Theorem 4.10, that $Y_{s,t} = (\mathcal I \Xi)_{s,t}$ with

\[
\Xi_{s,t} = f(Y_s) X_{s,t} + (f(Y))'_s\, \mathbb X_{s,t} = f(Y_s) X_{s,t} + Df(Y_s) f(Y_s)\, \mathbb X_{s,t}.
\]

To see this, we note that trivially $Y_{s,t} = (\mathcal I \tilde\Xi)_{s,t}$ with $\tilde\Xi_{s,t} := Y_{s,t}$. But $\tilde\Xi_{s,t} = \Xi_{s,t} + o(|t - s|)$ and one sees as in Remark 4.13 that $\mathcal I \tilde\Xi = \mathcal I \Xi$. ⊔ ⊓

8.8 Lyons’ original definition

A slightly different notion of solution was originally introduced in [Lyo98] by Lyons.² This notion only uses the spaces $\mathscr C^\alpha$, without ever requiring the use of the spaces $\mathscr D_X^{2\alpha}$ of “controlled rough paths”. Indeed, for $\mathbf X = (X, \mathbb X) \in \mathscr C^\alpha([0, T], V)$ and $F \in C_b^2(V, L(V, W))$ we can define an element $\mathbf Z = (Z, \mathbb Z) = I_F(\mathbf X) \in \mathscr C^\alpha([0, T], W)$ directly by

\[
\begin{aligned}
Z_t &\stackrel{\text{def}}{=} (\mathcal I \Xi)_{0,t}, & \Xi_{s,t} &= F(X_s) X_{s,t} + DF(X_s)\, \mathbb X_{s,t}, \\
\mathbb Z_{s,t} &\stackrel{\text{def}}{=} (\mathcal I \bar\Xi^s)_{s,t}, & \bar\Xi^s_{u,v} &= Z_{s,u} Z_{u,v} + \bigl(F(X_u) \otimes F(X_u)\bigr)\, \mathbb X_{u,v}.
\end{aligned}
\]

It is possible to check that $\bar\Xi^s \in C_2^{\alpha,3\alpha}$ for every fixed $s$ (see the proof of Theorem 4.10) so that the second line makes sense. It is also straightforward to check that $(Z, \mathbb Z)$ satisfies (2.1), so that it does indeed belong to $\mathscr C^\alpha$. Actually, one can see that

\[
Z_t = \int_0^t F(X_s)\,d\mathbf X_s, \qquad \mathbb Z_{s,t} = \int_s^t Z_{s,r} \otimes dZ_r,
\]

where the integrals are defined as in the previous sections, where $F(X) \in \mathscr D_X^{2\alpha}$ as in Section 7.3.

² As always, we only consider the step-2 α-Hölder case, i.e. $\alpha > \frac13$, whereas Lyons’ theory is valid for every Hölder exponent $\alpha \in (0, 1]$ (or: variation parameter $p \ge 1$) at the complication of having to deal with $\lfloor p \rfloor$ levels.
We can now define solutions to (8.6) in the following way.

Definition 8.9. A rough path $\mathbf Y = (Y, \mathbb Y) \in \mathscr C^\alpha([0, T], W)$ is a solution in the sense of Lyons to (8.6) if there exists $\mathbf Z = (Z, \mathbb Z) \in \mathscr C^\alpha(V \oplus W)$ such that the projection of $(Z, \mathbb Z)$ onto $\mathscr C^\alpha(V)$ is equal to $(X, \mathbb X)$, the projection onto $\mathscr C^\alpha(W)$ is equal to $(Y, \mathbb Y)$, and $\mathbf Z = I_F(\mathbf Z)$ where

\[
F(x, y) = \begin{pmatrix} I & 0 \\ f(y) & 0 \end{pmatrix}.
\]

It is straightforward to see that if $(Y, Y') \in \mathscr D_X^{2\alpha}(W)$ is a solution to (8.6) in the sense of the previous section, then the path $Z = (X, Y) \in V \oplus W$ is controlled by $X$. As seen in Section 7.1, it can therefore be interpreted as an element of $\mathscr C^\alpha$. It follows immediately from the definitions that it is then also a solution in the sense of Lyons. Conversely, if $(Y, \mathbb Y)$ is a solution in the sense of Lyons, then one can check that one necessarily has $(Y, f(Y)) \in \mathscr D_X^{2\alpha}(W)$ and that this is a solution in the sense of the previous section. We leave the verification of this fact as an exercise to the reader.

8.9 Linear rough differential equations

Let $X \in C^1([0, 1], V)$, $A \in L(W, L(V, W))$ with finite operator norm $\|A\|_{\mathrm{op}} = a \in [0, \infty)$, and consider the linear differential equation $dY = AY\,dX$, with initial data $Y_0 \in W$, written in integral form as

\[
Y_t = Y_0 + \int_0^t A Y_s\,dX_s.
\]

Clearly $|Y_t| \le |Y_0| + a \int_0^t |Y_s|\,d|X|_s$ in terms of the Lipschitz path $|X|_t := \int_0^t |\dot X_s|\,ds$, and the classical Gronwall lemma gives

\[
\|Y\|_{\infty;[0,1]} \le |Y_0| \exp\bigl(a \|X\|_{1;[0,1]}\bigr),
\]

with $\|X\|_{1;[0,1]} = \sup_{0 \le s < t \le 1} \frac{|X_{s,t}|}{|t - s|} = \sup_{0 \le s \le 1} |\dot X_s|$. Alternatively, one can extract from the integral formulation the estimate, valid for all $0 \le s < t \le 1$,

\[
|Y_{s,t}| \le a \|X\|_{1;[0,1]}\, \|Y\|_{\infty;[s,t]}\, |t - s|.
\]

The following lemma, applied with α = 1, then leads to a similar conclusion. More
importantly, it will be seen to be applicable in rough situations with α < 1.

Lemma 8.10 (Rough Gronwall). Assume $Y \in C([0, 1])$, $\alpha \in (0, 1]$, and

\[
|Y_{s,t}| \le M \|Y\|_{\infty;[s,t]}\, |t - s|^\alpha
\]

whenever $0 \le s < t \le 1$. Then there exists $c = c_\alpha < \infty$ such that

\[
\|Y\|_{\infty;[0,1]} \le c \exp\bigl(c M^{1/\alpha}\bigr) |Y_0|.
\]

Remark 8.11. Since $|Y_{s,t}| \le 2\|Y\|_{\infty;[s,t]}$ the assumption is trivially satisfied for “distant” times $s, t$ such that $M|t - s|^\alpha \ge 2$. It then suffices to check the assumption for “nearby” times with $M|t - s|^\alpha \le \theta$ with $\theta = 2$, and in fact any $\theta > 0$, at the price of replacing $M$ by $\frac{2M}{\theta \wedge 2}$.

Proof. For any $\xi \in [s, t]$ we have $|Y_\xi| \le |Y_s| + |Y_{s,\xi}| \le |Y_s| + M \|Y\|_{\infty;[s,t]} |t - s|^\alpha$, and so

\[
\|Y\|_{\infty;[s,t]} \bigl(1 - M|t - s|^\alpha\bigr) \le |Y_s|.
\]

Since $e^{-2x} \le 1 - x$ for $x \in [0, 1/2]$, we have, for $M|t - s|^\alpha \in [0, 1/2]$,

\[
\|Y\|_{\infty;[s,t]} \le |Y_s|\, e^{2M|t - s|^\alpha} \le e\,|Y_s|.
\]

This induces a greedy partition of $[0, 1]$, of mesh-size $(2M)^{-1/\alpha}$ and hence no more than $(2M)^{1/\alpha} + 1$ intervals. The final estimate is then

\[
\|Y\|_{\infty;[0,1]} \le e^{1 + (2M)^{1/\alpha}} |Y_0|,
\]

so that the claimed estimate holds with $c = e \vee 2^{1/\alpha}$. ⊔ ⊓
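
The elementary ingredients of this proof are easy to check numerically. The following sketch (function names are illustrative assumptions) verifies $e^{-2x} \le 1 - x$ on a grid of $[0, 1/2]$, evaluates the bound $e^{1 + (2M)^{1/\alpha}}|Y_0|$ obtained from the greedy partition, and compares it with the claimed form $c \exp(c M^{1/\alpha})|Y_0|$ for $c = e \vee 2^{1/\alpha}$.

```python
import math

def elementary_inequality_holds(n_grid=1000):
    """Check exp(-2x) <= 1 - x on a uniform grid of [0, 1/2]."""
    return all(
        math.exp(-2.0 * x) <= 1.0 - x
        for x in (0.5 * k / n_grid for k in range(n_grid + 1))
    )

def partition_bound(M, alpha, y0):
    """Bound e^{1 + (2M)^{1/alpha}} |Y_0| from the greedy partition argument."""
    return math.exp(1.0 + (2.0 * M) ** (1.0 / alpha)) * abs(y0)

def claimed_bound(M, alpha, y0):
    """Bound c exp(c M^{1/alpha}) |Y_0| with c = max(e, 2^{1/alpha})."""
    c = max(math.e, 2.0 ** (1.0 / alpha))
    return c * math.exp(c * M ** (1.0 / alpha)) * abs(y0)

# Sanity check on Y_t = exp(M t), which satisfies the assumption with alpha = 1.
M = 3.0
sup_norm = math.exp(M)  # sup of |Y_t| over [0, 1], with Y_0 = 1
```

The example path $Y_t = e^{Mt}$ satisfies $|Y_{s,t}| = M\int_s^t Y_r\,dr \le M\|Y\|_{\infty;[s,t]}|t - s|$, so the lemma applies with $\alpha = 1$; the resulting bound is far from sharp, as expected from a partition argument.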


We now apply this to linear (Young and rough) differential equations, without
loss of generality posed on [0, 1]. By general theory, Theorem 8.3, we have a (non-
explosive) solution.

Proposition 8.12. Let $Y$ solve the linear Young differential equation $dY = AY\,dX$, started from $Y_0$ and driven by $X \in C^\alpha([0, 1])$, $\alpha > 1/2$, with $A$ of finite operator norm $a$. Then there exists $c = c(\alpha) \in (0, \infty)$ so that

\[
\|Y\|_{\infty;[0,1]} \le c \exp\Bigl(c \bigl(a \|X\|_{\alpha;[0,1]}\bigr)^{1/\alpha}\Bigr) |Y_0|.
\]

Proof. By scaling $A$, we can and will assume $\|X\|_{\alpha;[0,1]} = 1$. Young’s inequality gives, with $a = |A|$ and $c = c(\alpha)$,

\[
|Y_{s,t}| \le |A Y_s X_{s,t}| + \Bigl|\int_s^t A(Y_r - Y_s)\,dX_r\Bigr| \le a |Y_s| |t - s|^\alpha + c a \|Y\|_{\alpha;[s,t]} |t - s|^{2\alpha}
\]

and so $\frac12 \|Y\|_{\alpha;[s,t]} \le a |Y|_{\infty;[s,t]}$ whenever $c a |t - s|^\alpha \le 1/2$. Re-insert the estimate on $\|Y\|_{\alpha;[s,t]}$ (and also use $c a |t - s|^\alpha \le 1/2$) above to obtain precisely

\[
|Y_{s,t}| \le a |Y_s| |t - s|^\alpha + a |Y|_{\infty;[s,t]} |t - s|^\alpha \le 2a \|Y\|_{\infty;[s,t]} |t - s|^\alpha.
\]

This holds whenever $c a |t - s|^\alpha \le 1/2$ and so we can conclude with the rough Gronwall lemma (and the remark after it). The constant $c$ is allowed to change of course, but remains $c = c(\alpha)$. ⊔ ⊓

A similar result also holds in the rough case.

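Proposition 8.13. Let $Y$ solve the linear rough differential equation $dY = AY\,d\mathbf X$, started from $Y_0$ and driven by $\mathbf X \in \mathscr C^\alpha([0, 1])$, $\alpha > 1/3$, with $A$ of finite operator norm $a$. Then there exists $c = c(\alpha) \in (0, \infty)$ so that

\[
\|Y\|_{\infty;[0,1]} \le c \exp\Bigl(c \bigl(a\, |||\mathbf X|||_{\alpha;[0,1]}\bigr)^{1/\alpha}\Bigr) |Y_0|.
\]

Proof. By scaling $A$, we can again assume unit (homogeneous) rough path norm for $\mathbf X$. By a basic estimate for rough integrals it then holds, with $c = c(\alpha) \in [1, \infty)$ and $a = |A|$,

\[
|Y^\natural_{s,t}| \le c \|AY, A^2 Y\|_{2\alpha,X} |t - s|^{3\alpha} \le c a \|Y, AY\|_{2\alpha,X} |t - s|^{3\alpha} = c a \bigl(\|AY\|_\alpha + \|Y^\#\|_{2\alpha}\bigr) |t - s|^{3\alpha},
\]

using musical notation $Y_{s,t} \equiv A Y_s X_{s,t} + Y^\#_{s,t} \equiv A Y_s X_{s,t} + A^2 Y_s \mathbb X_{s,t} + Y^\natural_{s,t}$. This entails

\[
|Y^\#_{s,t}| \le |A^2 Y_s \mathbb X_{s,t}| + |Y^\natural_{s,t}| \le a^2 |Y_s| |t - s|^{2\alpha} + \bigl(a \|Y\|_\alpha + \|Y^\#\|_{2\alpha}\bigr) c a |t - s|^{3\alpha}
\]

and so for all $s < t$ with $c a |t - s|^\alpha \le 1/2$ we obtain

\[
\tfrac12 \|Y^\#\|_{2\alpha;[s,t]} \le a^2 \|Y\|_{\infty;[s,t]} + \tfrac{a}{2} \|Y\|_\alpha.
\]

Similarly, $|Y_{s,t}| \le |A Y_s X_{s,t}| + |Y^\#_{s,t}| \le a |Y_s| |t - s|^\alpha + \bigl(2a^2 \|Y\|_{\infty;[s,t]} + a \|Y\|_\alpha\bigr) |t - s|^{2\alpha}$ and so

\[
\tfrac12 \|Y\|_{\alpha;[s,t]} \le a \|Y\|_{\infty;[s,t]} + 2a^2 \|Y\|_{\infty;[s,t]} |t - s|^\alpha \le 3a \|Y\|_{\infty;[s,t]}
\]

for all $s < t$ with $a |t - s|^\alpha \le 1/2$. Re-inserting this and the bound for $\|Y\|_\alpha = \|Y\|_{\alpha;[s,t]}$ in the above estimate for $|Y_{s,t}|$, we obtain

\[
|Y_{s,t}| \le a |Y_s| |t - s|^\alpha + 8a^2 \|Y\|_{\infty;[s,t]} |t - s|^{2\alpha} \le 5a \|Y\|_{\infty;[s,t]} |t - s|^\alpha.
\]

We conclude with the rough Gronwall lemma, just as in the Young case. ⊔ ⊓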

Remark 8.14. All this can be vector-valued. Assuming $X$ takes values in some space $V$ and $Y$ takes values in $W$, we should view $A$ as a linear map $A : W \otimes V \to W$. The operator $A^2 : W \otimes V \otimes V \to W$ should then be interpreted as $A \circ (A \otimes \mathrm{Id})$.

8.10 Stability IV: Flows

We briefly state, without proof, a result concerning regularity of flows associated to rough differential equations, as well as local Lipschitz estimates of the Itô–Lyons maps on the level of such flows. More precisely, given a geometric rough path $\mathbf X \in \mathscr C_g^\alpha([0, T], \mathbb R^d)$, we saw in Theorem 8.3 that, for $C_b^3$ vector fields $f = (f_1, \dots, f_d)$ on $\mathbb R^e$, there is a unique global solution to the rough integral equation

\[
Y_t = y + \int_0^t f(Y_s)\,d\mathbf X_s, \qquad t \ge 0. \tag{8.14}
\]

Write $\pi_{(f)}(0, y; \mathbf X) = Y$ for this solution. Note that the inverse flow exists trivially, by following the RDE driven by $\mathbf X(t - \cdot)$,

\[
\pi_{(f)}(0, \bullet\,; \mathbf X)_t^{-1} = \pi_{(f)}(0, \bullet\,; \mathbf X(t - \cdot))_t.
\]

We call the map $y \mapsto \pi_{(f)}(0, y; \mathbf X)$ the flow associated to the above RDE. Moreover, if $X^\epsilon$ is a smooth approximation to $\mathbf X$ (in rough path metric), then the corresponding ODE solution $Y^\epsilon$ is close to $Y$, with a local Lipschitz estimate as given in Section 8.6.
It is natural to ask if the flow depends smoothly on $y$. Given a multi-index $k = (k_1, \dots, k_e) \in \mathbb N^e$, write $D^k$ for the partial derivative with respect to $y^1, \dots, y^e$. The following statement is an easy consequence of [FV10b, Chapter 12].

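Theorem 8.15. Let $\alpha \in (1/3, 1/2]$ and $\mathbf X, \tilde{\mathbf X} \in \mathscr C_g^\alpha$. Assume $f \in C_b^{3+n}$ for some integer $n$. Then the associated flow is of regularity $C^{n+1}$ in $y$, as is its inverse flow. The resulting family of partial derivatives, $\{D^k \pi_{(f)}(0, \xi; \mathbf X), |k| \le n\}$, satisfies the RDE obtained by formally differentiating $dY = f(Y)\,d\mathbf X$.
At last, for every $M > 0$ there exist $C, K$ depending on $M$ and the norm of $f$ such that, whenever $|||\mathbf X|||_\alpha, |||\tilde{\mathbf X}|||_\alpha \le M < \infty$ and $|k| \le n$,

\[
\begin{aligned}
\sup_{\xi \in \mathbb R^e} \bigl\| D^k \pi_{(f)}(0, \xi; \mathbf X) - D^k \pi_{(f)}(0, \xi; \tilde{\mathbf X}) \bigr\|_{\alpha;[0,t]} &\le C \varrho_\alpha(\mathbf X, \tilde{\mathbf X}), \\
\sup_{\xi \in \mathbb R^e} \bigl\| D^k \pi_{(f)}(0, \xi; \mathbf X)^{-1} - D^k \pi_{(f)}(0, \xi; \tilde{\mathbf X})^{-1} \bigr\|_{\alpha;[0,t]} &\le C \varrho_\alpha(\mathbf X, \tilde{\mathbf X}), \\
\sup_{\xi \in \mathbb R^e} \bigl\| D^k \pi_{(f)}(0, \xi; \mathbf X) \bigr\|_{\alpha;[0,t]} &\le K, \\
\sup_{\xi \in \mathbb R^e} \bigl\| D^k \pi_{(f)}(0, \xi; \mathbf X)^{-1} \bigr\|_{\alpha;[0,t]} &\le K.
\end{aligned}
\]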

8.11 Exercises

Exercise 8.1 a) Consider the case of a smooth, one-dimensional driving signal $X : [0, T] \to \mathbb R$. Show that the solution map to the (ordinary) differential equation $dY = f(Y)\,dX$, for sufficiently nice $f$ (say bounded with bounded derivatives) and started at some fixed point $Y_0 = \xi$, is locally Lipschitz continuous with respect to the driving signal in the supremum norm on $[0, T]$. Conclude that it admits a unique continuous extension to every continuous driving signal $X$.
b) Show by an example that no such continuous extension is possible, in general, in a multi-dimensional situation, with vector fields $f = (f_1, \dots, f_d)$ driven by a $d$-dimensional signal $X : [0, T] \to \mathbb R^d$, with $d > 2$.
♯ c) Show that a continuous extension is possible for commuting vector fields, in the sense that all Lie brackets $[f_i, f_j]$, $1 \le i, j \le d$, vanish or, equivalently, their flows commute.

Exercise 8.2 (Explicit solution, Chen–Strichartz formula) View

\[
f = (f_1, \dots, f_d) \in C_b^\infty\bigl(\mathbb R^e, L(\mathbb R^d, \mathbb R^e)\bigr)
\]

as a collection of $d$ (smooth, bounded with bounded derivatives of all orders) vector fields on $\mathbb R^e$. Assume that $f$ is step-2 nilpotent in the sense that $[f_i, [f_j, f_k]] \equiv 0$ for all $i, j, k \in \{1, \dots, d\}$. Here, $[\cdot, \cdot]$ denotes the Lie bracket between two vector fields. Let $(Y, f(Y))$ be the RDE solution to $dY = f(Y)\,d\mathbf X$ started at some $\xi \in \mathbb R^e$ and assume that the rough path $\mathbf X$ is geometric. Give an explicit formula of the type $Y_t = \exp(\dots)\xi$ where $\exp$ denotes the unit time solution flow along a vector field $(\dots)$ which you should write down explicitly.

∗ Exercise 8.3 (Explosion along linear-growth vector fields) Give an example of smooth $f$ with linear growth, and $\mathbf X \in \mathscr C^\alpha$ so that $dY = f(Y)\,d\mathbf X$ started at some $\xi$ fails to have a global solution.
♯ Exercise 8.4 (Maximal RDE solution) We are in the setting of the local existence
and uniqueness Theorem 8.3, with C 3 -regular coefficients, f ∈ C 3 (W, L(V, W )),
and local solution Y to (8.6) with values in the Banach space W .
a) Show that Y can either be extended to a global solution on the whole interval
[0, T ] or only on a subinterval [0, τ ) which is maximal with respect to extension
of solutions.
b) Show that τ = τ (X) is a lower semicontinuous function of the driving rough
path, i.e. limn→∞ τ (Xn ) ≥ τ (X) whenever Xn → X ∈ C α .
c) Assume f is C 3 -bounded on bounded sets. (This is always the case for f ∈
C 3 with W, V finite-dimensional.) If a solution only exists on [0, τ ), then
limt↑τ |Yt | = +∞ and we call τ ∈ (0, T ] explosion time.
Remark: In infinite dimensions, there are examples of Banach-valued ODEs
with smooth coefficients, where global existence fails but the solution does not
explode. In essence, this is possible because a smooth vector field need not map
bounded sets into bounded sets.
Exercise 8.5 Let $T > 0$, $\alpha \in (1/3, 1/2]$ and $\mathbf X, \tilde{\mathbf X} \in \mathscr C^\alpha([0, T], \mathbb R^d)$. Establish existence, continuity and stability for rough differential equations with drift (cf. (8.6)),

\[
dY_t = f_0(Y_t)\,dt + f(Y_t)\,d\mathbf X_t. \tag{8.15}
\]

a) First assume $f_0$ to have the same regularity as $f$, in which case you may solve $dY = \bar f(Y)\,d\bar{\mathbf X}$ with $\bar f = (f, f_0)$ and $\bar{\mathbf X}$ as (canonical) space-time rough path extension of $\mathbf X$. (The missing integrals $\int X^i\,dt$, $\int t\,dX^i$, $i = 1, \dots, d$, are canonically defined as Riemann–Stieltjes integrals.)
b) Give a direct analysis for $f_0 \in C_b^1$ (or in fact $f_0$ Lipschitz continuous, without boundedness assumption).
Exercise 8.6 Let $f \in C_b^2$ and assume $(Y, f(Y))$ is an RDE solution to (8.6), as constructed in Theorem 8.3. Show that the $o$-term in Davie’s definition, (8.13), can be bounded uniformly over $(X, \mathbb X) \in B_R$, any $R < \infty$, where

\[
B_R := \bigl\{ (X, \mathbb X) \in \mathscr C^\beta : \|X\|_\beta + \|\mathbb X\|_{2\beta} \le R \bigr\}.
\]

Show also that RDE solutions are β-Hölder, uniformly over $(X, \mathbb X) \in B_R$, any $R < \infty$.
Exercise 8.7 Show that ∥Y, f (Y ); Y n , f (Y n )∥X,X n ,2α → 0, together with X →
Xn in C β implies that also (Y n , Yn ) → (Y, Y) in C α . Since, at the price of replacing
f by F , cf. Definition 8.9, there is no loss of generality in solving for the controlled
rough path Z = (X, Y ), conclude that continuity of the RDE solution map (Itô–
Lyons map) also holds with Lyons’ definition of a solution.
Exercise 8.8 Show that ∥Y, f (Y ); Y n , f (Y n )∥X,X n ,2α → 0, together with X →
Xn in C β implies that also (Y n , Yn ) → (Y, Y) in C α . Since, at the price of replacing
f by F , cf. Definition 8.9, there is no loss of generality in solving for the controlled
rough path Z = (X, Y ), conclude that continuity of the RDE solution map (Itô–
Lyons map) also holds with Lyons’ definition of a solution.
Exercise 8.9 (Lyons extension theorem revisited) Let $\alpha \in (\frac13, \frac12]$ and consider $\mathbf X = (X, \mathbb X) \in \mathscr C^\alpha([0, T], V)$. Show that $\bar{\mathbf X} = (1, \mathbb X^{(1)}, \mathbb X^{(2)}, \mathbb X^{(3)}, \dots, \mathbb X^{(N)})$, the (level-$N$) Lyons lift of $\mathbf X$ from Exercise 4.6, solves a linear RDE. Use this and a scaling argument for another proof of the estimate, $0 \le s < t \le T$, $n = 1, \dots, N$,

\[
\bigl|\mathbb X^{(n)}_{s,t}\bigr|^{1/n} \lesssim |||\mathbf X|||_\alpha\, |t - s|^\alpha.
\]

8.12 Comments

ODEs driven by not too rough paths, i.e. paths that are α-Hölder continuous for some
α > 1/2 or of finite p-variation with p < 2, understood in the (Young) integral sense
were first studied by Lyons in [Lyo94]; nonetheless, the terminology Young-ODEs is
now widely used. Existence and uniqueness for such equations via Picard iterations
is by now classical; our discussion in Section 8.3 is a mild variation of [LCL07, p.22], where also the division property (cf. proof of Lemma 7.5) is emphasised. Existence and uniqueness of solutions to RDEs via Picard iteration in the (Banach!) space of controlled rough paths originates in [Gub04] for regularity $\alpha \in (\frac13, \frac12]$. This approach also allows one to treat arbitrary regularities, [Gub10, HK15]. In the case of driving rough paths with jumps, one has to distinguish between forward (think Itô or branched) and geometric (think Marcus canonical) sense; this was started in [Wil01], and the general forward resp. geometric case completed by Chevyrev, Friz and Zhang in [FZ18, CF19], see also comment Section 9.6.
The continuity result of Theorem 8.5 is due to T. Lyons; proofs of uniform
continuity on bounded sets were given in [Lyo98, LQ02, LCL07]. Local Lipschitz
estimates were pointed out subsequently and in different settings by various authors
including Lyons–Qian [LQ02], Gubinelli [Gub04], Friz–Victoir [FV10b], Inahama
[Ina10], Deya et al. [DNT12]; Bailleul [Bai15a, Bai14] and Bailleul–Riedel [BR19]
take a flow perspective, initially studied in [LQ98]. Smoothness of the Lyons–Itô
map is discussed in [LL06, FV10b, Bai15b, CL18], see also comment Section 11.5.
The name universal limit theorem was suggested by P. Malliavin, meaning con-
tinuity of the Itô–Lyons map in rough path metrics. As we tried to emphasise, the
stability in rough path metrics is seen at all levels of the theory.
Lyons’ original argument (for arbitrary regularity) also involves a Picard iteration,
see e.g. [LCL07, p.88]. In his p-variation setting, vector fields are assumed $\mathrm{Lip}^\gamma$, $\gamma > p$, which agrees with our $C_b^\gamma$ in finite dimensions, cf. Sections 1.4 and 1.5, with the usual disclaimer $\gamma \notin \mathbb N$ (Lipschitz vs. continuously differentiable). In finite
dimensions, existence results are given for γ > p − 1, see [Dav08, FV10b] for p < 3
and general p respectively. In infinite dimensions, due to lack of compactness, extra
assumptions on the vector fields are necessary; a Peano existence theorem, as in
the case of Banach valued RDEs is shown by Caruana [Car10]. On the other hand,
under local C γ regularity one has a unique (in infinite dimensions: not necessarily
exploding) maximal solution, cf. Exercise 8.4. In finite dimensions, global existence
is guaranteed by non-explosion, discussed in [Dav08, FV10b, Lej12, RS17].
For regularity 1/p = α > 1/3, Davie [Dav08] establishes existence and unique-
ness for Young resp. rough differential equations via discrete Euler resp. Milstein
approximations. Step-N Euler schemes, with ⌊p⌋ ≤ N, are studied in [FV08b] via
sub-Riemannian geodesics in G(N ) (Rd ), Boutaib et al. [BGLY14] establish simi-
lar estimates in the Banach setting, Boedihardjo, Lyons and Yang [BLY15] study
N → ∞.
Our regularity assumption as stated in Theorem 8.3, namely $C^3$ for a unique (local) solution, is not sharp; it is straightforward to push this to $C^\gamma$, any $\gamma > 1/\alpha$, for $\alpha \in (\frac13, \frac12]$ (due to our level-2 exposition), in agreement with [Lyo98, Dav08]. It is
less straightforward [Dav08, FV10b] to show that uniqueness also holds for γ = 1/α
and this is optimal, with counter-examples constructed in [Dav08]. Local existence
results on the other hand are available for γ > (1/α) − 1. Setting α = 1, this is
consistent with the theory of ODEs where it is well known that, at least modulo
possible logarithmic divergencies and in finite dimensions, Lipschitz continuity of
the coefficients is required for the uniqueness of local solutions, but continuity is
sufficient for their existence.
Theorem 8.3 gives global existence for f ∈ Cb3 or (affine) linear f . Linear rough
differential equations are important (Jacobian of the flow, equations for Malliavin

type derivatives, etc) and studied e.g. in [Lyo98, FV10b, CL14], see also [HH10]
for related analysis. Solutions can be estimated by the rough Gronwall lemma
[DGHT19b, Hof18], in a sense a real-analysis abstraction of previously used argu-
ments for linear RDE solutions, [HN07, FV10b].
The existence and uniqueness results for rough differential equations have seen
many variations over recent years. Gubinelli, Imkeller and Perkowski apply their
theory of paracontrolled distributions to (level-2) RDEs with Hölder drivers [GIP15,
Sec.3], extended to Besov drivers by Prömel–Trabs [PT16], revisited with “classical”
rough path tools in [FP18].
Rough/stochastic Volterra equations are discussed from a rough path point of
view in [DT09, HT19, Com19], from a paracontrolled point of view in [PT18] and
in a regularity structure context in [BFG+ 19, Sec.5]. Bailleul–Diehl then study the
inverse problem for rough differential equations [BD15]. For a “joint development”
of RDEs and SDEs with stochastic sewing, by a fixed point argument in a space of
stochastic controlled rough paths, see [FHL20]. Rough partial differential equations
are discussed in Chapter 12.
Last but not least, we note that the point of view to construct RDE solutions by fixed point arguments in the (linear) space of controlled rough paths, where the rough path figures as parameter of the fixed point problem, extends naturally to the framework of regularity structures developed in [Hai14b], cf. Chapter 13 onwards. In that context, solutions (to singular SPDEs, say) are found by similar fixed point arguments in a linear space of “modelled distributions”, with enhanced noise (“the model”) again
as parameter of the fixed point problem. (The question of renormalisation is a priori
disconnected from the construction of a solution and only concerns the model / rough
path. However, one would like to understand the equation driven by renormalised
noise, at least when the latter is smooth. In the setting of rough differential equations
such effects have been observed in [FO09], a systematic study in case of branched
RDE is found in Bruned et al. [BCFP19], see also [BCEF20].)
Chapter 9
Stochastic differential equations

We identify the solution to a rough differential equation driven by the Itô or Stratonovich lift of Brownian motion with the solution to the corresponding stochastic differential equation. In combination with continuity of the Itô–Lyons maps, a quick proof of the Wong–Zakai theorem is given. Applications to Stroock–Varadhan support theory and Freidlin–Wentzell large deviations are briefly discussed.

9.1 Itô and Stratonovich equations

We saw in Section 3 that $d$-dimensional Brownian motion lifts in an essentially canonical way to $\mathbf B = (B, \mathbb B) \in \mathscr C^\alpha([0, T], \mathbb R^d)$ almost surely, for any $\alpha \in (\frac13, \frac12)$. In particular, we may use almost every realisation of $(B, \mathbb B)$ as the driving signal of a rough differential equation. This RDE is then solved “pathwise”, i.e. for a fixed realisation of $(B(\omega), \mathbb B(\omega))$. Recall that the choice of $\mathbb B$ is never unique: two important choices are the Itô and the Stratonovich lift, we write $\mathbf B^{\text{Itô}}$ and $\mathbf B^{\mathrm{Strat}}$, where $\mathbb B$ is defined as $\int B \otimes dB$ and $\int B \otimes \circ dB$ respectively. We now discuss the interplay with classical stochastic differential equations (SDEs).
Theorem 9.1. Let $f \in C_b^3(\mathbb R^e, L(\mathbb R^d, \mathbb R^e))$, let $f_0 : \mathbb R^e \to \mathbb R^e$ be Lipschitz continuous, and let $\xi \in \mathbb R^e$. Then,

i) With probability one, $\mathbf B^{\text{Itô}}(\omega) \in \mathscr C^\alpha$, any $\alpha \in (1/3, 1/2)$, and there is a unique RDE solution $(Y(\omega), f(Y(\omega))) \in \mathscr D^{2\alpha}_{B(\omega)}$ to

\[
dY = f_0(Y)\,dt + f(Y)\,d\mathbf B^{\text{Itô}}, \qquad Y_0 = \xi.
\]

Moreover, $Y = (Y_t(\omega))$ is a strong solution to the Itô SDE $dY = f_0(Y)\,dt + f(Y)\,dB$ started at $Y_0 = \xi$.
ii) Similarly, the RDE solution driven by $\mathbf B^{\mathrm{Strat}}$ yields a strong solution to the Stratonovich SDE $dY = f_0(Y)\,dt + f(Y) \circ dB$ started at $Y_0 = \xi$.
Proof. We assume zero drift $f_0$, but see Exercise 8.5. The map

\[
B|_{[0,t]} \mapsto (B, \mathbb B^{\mathrm{Strat}})|_{[0,t]} \in \mathscr C_g^{0,\alpha}\bigl([0, t], \mathbb R^d\bigr)
\]

is measurable, where $\mathscr C_g^{0,\alpha}$ denotes the (separable, hence Polish) subspace of $\mathscr C^\alpha$ obtained by taking the closure, in α-Hölder rough path metric, of piecewise smooth paths. This follows, for instance, from Proposition 3.6. By the continuity of the Itô–Lyons map (adding a drift vector field is left as an easy exercise) the RDE solution $Y_t \in \mathbb R^e$ is the continuous image of the driving signal $(B, \mathbb B^{\mathrm{Strat}})|_{[0,t]} \in \mathscr C_g^{0,\alpha}([0, t], \mathbb R^d)$. It follows that $Y_t$ is adapted to

\[
\sigma\{B_{r,s}, \mathbb B_{r,s} : 0 \le r \le s \le t\} = \sigma\{B_s : 0 \le s \le t\},
\]

and it suffices to apply Corollary 5.2. Since $\mathbb B^{\text{Itô}}_{s,t} = \mathbb B^{\mathrm{Strat}}_{s,t} - \frac12 (t - s) I$, measurability is also guaranteed and we conclude with the same argument, using Proposition 5.1.
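
The relation between the two lifts can be seen on a discretised scalar path. The following sketch is illustrative only (the function names, the seed and the grid size are assumptions): it approximates $\int_0^T B_{0,r}\,dB_r$ by left-point sums (Itô) and by trapezoidal sums (Stratonovich). Their difference equals half the discrete quadratic variation, which is close to $T/2$ for a fine partition.

```python
import math
import random

def brownian_path(n, T, seed=0):
    """Sample a scalar Brownian path on a uniform grid of [0, T]."""
    rng = random.Random(seed)
    sd = math.sqrt(T / n)
    b = [0.0]
    for _ in range(n):
        b.append(b[-1] + rng.gauss(0.0, sd))
    return b

def iterated_integrals(b):
    """Left-point (Ito) and trapezoidal (Stratonovich) sums for int B_{0,r} dB_r."""
    ito = strat = qv = 0.0
    for k in range(len(b) - 1):
        db = b[k + 1] - b[k]
        ito += (b[k] - b[0]) * db
        strat += 0.5 * ((b[k] - b[0]) + (b[k + 1] - b[0])) * db
        qv += db * db  # discrete quadratic variation
    return ito, strat, qv

T = 1.0
b = brownian_path(20000, T)
ito, strat, qv = iterated_integrals(b)
```

The trapezoidal sum telescopes to $\frac12 B_{0,T}^2$, consistent with the first-order calculus obeyed by the Stratonovich lift.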


Remark 9.2. In contrast to standard SDE theory, the present solution constructed
via RDEs is immediately well-defined as a flow, i.e. for all ξ on a common set of
probability one. The price to pay is that of C 3 regularity of f , as opposed to the mere
Lipschitz regularity required for the standard theory.

9.2 The Wong–Zakai theorem

A classical result (e.g. [IW89, p.392]) asserts that SDE approximations based on
piecewise linear approximations to the driving Brownian motions converge to the
solution of the Stratonovich equation. Using the machinery built in the previous
sections, we can now give a simple proof of this by combining Proposition 3.6,
Theorem 8.5 and the understanding that RDEs driven by BStrat yield solutions to the
Stratonovich equation (Theorem 9.1).
Theorem 9.3 (Wong–Zakai, Clark, Stroock–Varadhan). Let f, f0 , ξ be as in Theo-
rem 9.1 above. Let α < 1/2. Consider dyadic piecewise linear approximations (B n )
to B on [0, T ], as defined in Proposition 3.6. Write Y n for the (random) ODE solu-
tions to dY n = f0 (Y n )dt + f (Y n )dB n and Y for the Stratonovich SDE solution to
dY = f0 (Y )dt + f (Y ) ◦ dB, all started at ξ. Then the Wong–Zakai approximations
converge a.s. to the Stratonovich solution. More precisely, with probability one,

∥Y − Y n ∥α;[0,T ] → 0.

The only reason for dyadic piecewise linear approximations in the above statement
is the formulation of the martingale-based Proposition 3.6. In Section 10 we shall
present a direct analysis (going far beyond the setting of Brownian drivers) which
easily entails quantitative convergence (in probability and Lq , any q < ∞) for all
piecewise linear approximations towards a (Gaussian) rough path.
In the forthcoming Exercise 10.2 it will be seen that (non-dyadic) piecewise linear approximations of mesh size $\sim 1/n$, viewed canonically as rough paths, converge a.s. in $\mathscr C^\alpha$ with rate anything less than $1/2 - \alpha$. As long as $\alpha > 1/3$, it then follows from (local) Lipschitzness of the Itô–Lyons map that Wong–Zakai approximations also converge with rate $(1/2 - \alpha)^-$. Note that the “best” rate one obtains in this way is $(1/2 - 1/3)^- = 1/6^-$; the reason being that rate is measured in some Hölder space with exponent $1/3^+$, rather than the uniform norm. The well-known almost sure “strong” rate $1/2^-$ can be obtained from rough path theory at the price of working in rough path spaces of much lower regularity, see [FR14].
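
A small simulation illustrates the phenomenon in the scalar, driftless case. The sketch below makes several simplifying assumptions that are not part of the text: $f = \sin$, $\xi = 1$, a uniform (not necessarily dyadic) partition, and one classical Runge–Kutta step per linear piece to solve the ODE along the piecewise linear path. In the scalar case the Stratonovich solution is explicit, $Y_T = 2\arctan(\tan(\xi/2)\,e^{B_T})$, so the Wong–Zakai value can be compared with it, and with a left-point (Euler–Maruyama) scheme that approximates the Itô solution instead.

```python
import math
import random

def rk4_flow(y, db, f):
    """Solve dy/ds = f(y) * db for s in [0, 1] with one classical RK4 step."""
    k1 = f(y) * db
    k2 = f(y + 0.5 * k1) * db
    k3 = f(y + 0.5 * k2) * db
    k4 = f(y + k3) * db
    return y + (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

def simulate(n, T, xi, f, seed=1):
    """Return (Wong-Zakai ODE value, Euler-Maruyama value, B_T) at time T."""
    rng = random.Random(seed)
    sd = math.sqrt(T / n)
    y_wz = y_em = xi
    b = 0.0
    for _ in range(n):
        db = rng.gauss(0.0, sd)
        y_wz = rk4_flow(y_wz, db, f)   # ODE along one linear piece of B^n
        y_em = y_em + f(y_em) * db     # left-point (Ito-type) step
        b += db
    return y_wz, y_em, b

y_wz, y_em, b_T = simulate(1000, 1.0, 1.0, math.sin)
strat_exact = 2.0 * math.atan(math.tan(0.5) * math.exp(b_T))
```

In one dimension the ODE solution along the piecewise linear path coincides with the Stratonovich solution at the grid points, so the agreement of `y_wz` with `strat_exact` is limited only by the ODE solver; the multi-dimensional case, where the Lévy area matters, is not captured by this toy example.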

9.3 Support theorem and large deviations

We briefly discuss two fundamental results in diffusion theory and explain how
the theory of rough paths provides elegant proofs, reducing a question for general
diffusion to one for Brownian motion and its Lévy area.
The results discussed in this section were among the very first applications of
rough path theory to stochastic analysis, see Ledoux et al. [LQZ02]. Much more
on these topics is found in [FV10b], so we shall be brief. The first result, due to
Stroock–Varadhan [SV72] concerns the support of diffusion processes.

Theorem 9.4 (Stroock–Varadhan support theorem). Let $f, f_0, \xi$ be as in Theorem 9.1 above. Let $\alpha < 1/2$, $B$ be a $d$-dimensional Brownian motion and consider the unique Stratonovich SDE solution $Y$ on $[0, T]$ to

\[
dY = f_0(Y)\,dt + \sum_{i=1}^d f_i(Y) \circ dB^i \tag{9.1}
\]

started at $Y_0 = \xi \in \mathbb R^e$. Write $y^h$ for the ODE solution obtained by replacing $\circ dB$ with $dh \equiv \dot h\,dt$, whenever $h \in \mathcal H = W_0^{1,2}$, i.e. absolutely continuous, $h(0) = 0$ and $\dot h \in L^2([0, T], \mathbb R^d)$. Then, for every $\delta > 0$,

\[
\lim_{\varepsilon \to 0} \mathbb P\Bigl( \|Y - y^h\|_{\alpha;[0,T]} < \delta \Bigm| \|B - h\|_{\infty;[0,T]} < \varepsilon \Bigr) = 1 \tag{9.2}
\]

(where Euclidean norm is used for the conditioning $\|B - h\|_{\infty;[0,T]} < \varepsilon$). As a consequence, the support of the law of $Y$, viewed as measure on the pathspace $C^{0,\alpha}([0, T], \mathbb R^e)$, is precisely the α-Hölder closure of $\{y^h : \dot h \in L^2([0, T], \mathbb R^d)\}$.

Proof. Using Theorem 9.1 we can and will take $Y$ as RDE solution driven by $\mathbf B^{\mathrm{Strat}}(\omega)$. For $h \in \mathcal H$ and some fixed $\alpha \in (\frac13, \frac12)$, we furthermore denote by $S^{(2)}(h) = (h, \int h \otimes dh) \in \mathscr C_g^{0,\alpha}$ the canonical lift given by computing the iterated integrals using usual Riemann–Stieltjes integration. It was then shown in [FLS06]¹ that for every $\delta > 0$,

\[
\lim_{\varepsilon \to 0} \mathbb P\Bigl( \varrho_{\alpha;[0,T]}\bigl(\mathbf B^{\mathrm{Strat}}, S^{(2)}(h)\bigr) < \delta \Bigm| \|B - h\|_{\infty;[0,T]} < \varepsilon \Bigr) = 1. \tag{9.3}
\]

The conditional statement then follows easily from continuity of the Itô–Lyons map and so yields the “difficult” support inclusion: every $y^h$ is in the support of $Y$. The easy inclusion, support of $Y$ contained in the closure of $\{y^h\}$, follows from the Wong–Zakai theorem, Theorem 9.3. If one is only interested in the support statement, but without the conditional statement (9.2), there are “softer” proofs; see Exercise 9.1 below. ⊔ ⊓

¹ Strictly speaking, this was shown for $h \in C^2$; the extension to $h \in \mathcal H$ is non-trivial and found in [FV10b].

The second result to be discussed here, due to Freidlin–Wentzell, concerns the behaviour of diffusions in the singular (ε → 0) limit when B is replaced by εB. We assume the reader is familiar with large deviation theory.

Theorem 9.5 (Freidlin–Wentzell large deviations). Let $f, f_0, \xi$ be as in Theorem 9.1 above. Let $\alpha < 1/2$, $B$ be a $d$-dimensional Brownian motion and consider the unique Stratonovich SDE solution $Y = Y^\varepsilon$ on $[0,T]$ to

\[ dY = f_0(Y)\,dt + \sum_{i=1}^{d} f_i(Y) \circ \varepsilon\, dB^i \tag{9.4} \]

started at $Y_0 = \xi \in \mathbb{R}^e$. Write $Y^h$ for the ODE solution obtained by replacing $\circ\, \varepsilon\, dB$ with $dh$ where $h \in \mathcal{H} = W_0^{1,2}$. Then $(Y^\varepsilon_t : 0 \le t \le T)$ satisfies a large deviation principle (in $\alpha$-Hölder topology) with good rate function on pathspace given by

\[ J(y) = \inf\{ I(h) : Y^h = y \} . \]

Here $I$ is Schilder’s rate function for Brownian motion, i.e. $I(h) = \frac12 \|\dot h\|^2_{L^2([0,T],\mathbb{R}^d)}$ for $h \in \mathcal{H}$ and $I(h) = +\infty$ otherwise.

Proof. The key remark is that large deviation principles are robust under continuous maps, a simple fact known as the contraction principle. The problem is then reduced to establishing a suitable large deviation principle for the Stratonovich lift of $\varepsilon B$ (which is exactly $\delta_\varepsilon \mathbf{B}^{\mathrm{Strat}}$) in the $\alpha$-Hölder rough path topology. Readers familiar with general facts of large deviation theory, in particular the inverse and generalised contraction principles, are invited to complete the proof along Exercise 9.2 below. □
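To make Schilder's rate function concrete, the following sketch evaluates $I(h) = \frac12\|\dot h\|^2_{L^2}$ for a piecewise-linear path $h$, where the integral reduces to a finite sum. The function name `schilder_rate` and the representation of a path as a list of tuples are assumptions made for this illustration only; they do not come from the text.

```python
import math


def schilder_rate(times, path):
    """Schilder rate I(h) = 1/2 * int |h'(t)|^2 dt for a piecewise-linear h.

    times : increasing list of time points t_0 < t_1 < ... < t_n
    path  : list of d-dimensional points h(t_k), given as tuples
    Returns +inf if h(0) != 0, since such h lies outside the
    Cameron-Martin space.
    """
    if any(c != 0.0 for c in path[0]):
        return math.inf
    total = 0.0
    for k in range(len(times) - 1):
        dt = times[k + 1] - times[k]
        # On [t_k, t_{k+1}] the derivative is constant: (h_{k+1} - h_k) / dt,
        # so the contribution to int |h'|^2 dt is |h_{k+1} - h_k|^2 / dt.
        sq = sum((b - a) ** 2 for a, b in zip(path[k], path[k + 1]))
        total += sq / dt
    return 0.5 * total


# h(t) = (t, 2t) on [0, 1], sampled at three points
times = [0.0, 0.5, 1.0]
path = [(0.0, 0.0), (0.5, 1.0), (1.0, 2.0)]
rate = schilder_rate(times, path)
```

For the straight line $h(t) = (t, 2t)$ one has $|\dot h|^2 = 5$, so the sketch should return $5/2$; a path not started at the origin is assigned $+\infty$.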

9.4 Laplace method

We have seen that $(Y^\varepsilon_t : 0 \le t \le T)$, given as continuous images of the rescaled Brownian rough path, $Y^\varepsilon = \Phi(\delta_\varepsilon \mathbf{B}^{\mathrm{Strat}})$, satisfies a large deviations principle (in Hölder and hence also in uniform topology) with rate function

\[ J(y) = \inf\{ I(h) : \Phi(h) = y,\ h \in \mathcal{H} \} \tag{9.5} \]

with the mild abuse of notation $\Phi(h) \equiv \Phi(\mathbf{h})$ where $\mathbf{h} = (h, \int h \otimes dh)$ is the canonical lift of $h \in \mathcal{H}$. A standard fact of large deviation theory, Varadhan’s lemma, implies the following Laplace principle: for bounded continuous $F : C([0,T],\mathbb{R}^e) \to \mathbb{R}$,

\[ \lim_{\varepsilon\to 0} \varepsilon^2 \log E\big[ \exp\big( -F(Y^\varepsilon)/\varepsilon^2 \big) \big] = - \inf\{ F_\Lambda(h) : h \in \mathcal{H} \} , \]

where we set $F_\Lambda = F \circ \Phi + I$, for $I$ as in Theorem 9.5. We are interested in precise asymptotics, hence the following collection of hypotheses.
(H1) The function $F$ is bounded continuous on $C([0,T],\mathbb{R}^e)$.
(H2) The function $F_\Lambda$ attains its unique minimum at $\gamma \in \mathcal{H}$.
(H3) The function $F$ is $C^3$ in the Fréchet sense at $\varphi := \Phi(\gamma)$.
(H4) The element $\gamma$ is a non-degenerate minimum of $F_\Lambda$ restricted to $\mathcal{H}$; namely, for all $h \in \mathcal{H}\setminus\{0\}$,

\[ D^2 F_\Lambda(\gamma)(h,h) = D^2 (F \circ \Phi)\big|_\gamma (h,h) + \|h\|^2_{\mathcal{H}} > 0 . \]


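Theorem 9.6. Let $Y^\varepsilon$ be the unique Stratonovich SDE solution on $[0,T]$ in the small noise regime from Theorem 9.5. Under conditions (H1)–(H4), the following precise Laplace asymptotic holds

\[ E\big[ \exp\big( -F(Y^\varepsilon)/\varepsilon^2 \big) \big] = \exp\Big( -\frac{F_\Lambda(\gamma)}{\varepsilon^2} \Big) \big( c_0 + o(1) \big) \quad \text{as } \varepsilon \downarrow 0, \tag{9.6} \]

for some constant $c_0 \in (0,\infty)$.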


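Proof. (i) Localisation around the minimiser. We regard $\mathbf{B} = \mathbf{B}^{\mathrm{Strat}}$ (and its $\varepsilon$-dilations) as random variables in the (Polish) rough path space $\mathscr{C} := \mathscr{C}_g^{0,\alpha}([0,T])$. Write $\boldsymbol{\gamma} := (\gamma, \int \gamma \otimes d\gamma) \in \mathscr{C}$ for the canonical lift of the minimiser $\gamma \in \mathcal{H}$. Take now an arbitrary neighbourhood $O$ of $\boldsymbol{\gamma} \in \mathscr{C}$ and decompose

\[ E\big[ \exp\big( -F(Y^\varepsilon)/\varepsilon^2 \big) \big] = E\big[ \exp\big( -F \circ \Phi(\delta_\varepsilon \mathbf{B})/\varepsilon^2 \big) \big] = E\big[ \ldots ; \{\delta_\varepsilon \mathbf{B} \in O\} \big] + E\big[ \ldots ; \{\delta_\varepsilon \mathbf{B} \in O\}^c \big] . \]

Since $(\delta_\varepsilon \mathbf{B})$ satisfies an LDP with good rate function, (H1) implies that there exists $d > a := F_\Lambda(\gamma)$ and $\varepsilon_0 > 0$ such that for all $\varepsilon \in (0, \varepsilon_0)$

\[ E\big[ \exp\big( -F \circ \Phi(\delta_\varepsilon \mathbf{B})/\varepsilon^2 \big) ; \{\delta_\varepsilon \mathbf{B} \in O\}^c \big] \le \exp\big( -d/\varepsilon^2 \big) . \tag{9.7} \]

Hence this term does not contribute to the asymptotics (9.6). In the sequel, we shall take, for some $\varrho > 0$,

\[ O := O_\varrho := \{ T_\gamma \mathbf{X} : \mathbf{X} \in \mathscr{C},\ |||\mathbf{X}||| < \varrho \} = \{ \mathbf{X} \in \mathscr{C} : |||T_{-\gamma} \mathbf{X}||| < \varrho \} . \]

(By continuity of the translation operator, this is indeed an open neighbourhood of $T_\gamma 0 = \boldsymbol{\gamma}$.) We are thus left to analyse

\[ J_\varrho := E\big[ \exp\big( -F \circ \Phi(\delta_\varepsilon \mathbf{B})/\varepsilon^2 \big) ; |||T_{-\gamma} \delta_\varepsilon \mathbf{B}||| < \varrho \big] . \]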

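(ii) Cameron–Martin shift. It is easy to see that, for Wiener a.e. $\omega$, one has $\mathbf{B}(\omega + h) = T_h \mathbf{B}(\omega)$. In particular, the Cameron–Martin shift $\varepsilon B \rightsquigarrow \varepsilon B + \gamma$ (or $\omega \rightsquigarrow \omega + \gamma/\varepsilon$) induces a translation of $\delta_\varepsilon \mathbf{B}$ in the sense that

\[ \delta_\varepsilon \mathbf{B} = \Big( \varepsilon B, \int \varepsilon B \otimes d(\varepsilon B) \Big) \rightsquigarrow \Big( \varepsilon B + \gamma, \int (\varepsilon B + \gamma) \otimes d(\varepsilon B + \gamma) \Big) = T_\gamma \delta_\varepsilon \mathbf{B} . \]

From the Cameron–Martin theorem, with all integrals below understood over $[0,T]$,

\[ J_\varrho(\varepsilon) = E\Big[ \exp\Big( -\frac{\|\gamma\|^2_{\mathcal{H}}}{2\varepsilon^2} - \frac{\int \dot\gamma\, d(\varepsilon B)}{\varepsilon^2} \Big) \exp\Big( -\frac{F \circ \Phi(T_\gamma \delta_\varepsilon \mathbf{B})}{\varepsilon^2} \Big) ; |||\delta_\varepsilon \mathbf{B}||| < \varrho \Big] \]
\[ \phantom{J_\varrho(\varepsilon)} = \exp\Big( -\frac{\frac12 \|\gamma\|^2_{\mathcal{H}} + F \circ \Phi(\gamma)}{\varepsilon^2} \Big)\, E\Big[ \exp\Big( -\frac{(*)}{\varepsilon^2} \Big) ; |||\delta_\varepsilon \mathbf{B}||| < \varrho \Big] ; \]

where we recognise $F_\Lambda(\gamma) = F \circ \Phi(\gamma) + \frac12 \|\gamma\|^2_{\mathcal{H}}$ in the first exponential and also set

\[ (*) = F \circ \Phi(T_\gamma \delta_\varepsilon \mathbf{B}) - F \circ \Phi(\gamma) + \varepsilon \int \dot\gamma\, dB . \]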

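(iii) Local analysis around the minimiser. We argue on a fixed rough path realisation $\mathbf{X} := \mathbf{B}(\omega)$. One checks that $\varepsilon \mapsto \Phi(T_\gamma \delta_\varepsilon \mathbf{X})$ is sufficiently smooth so that

\[ \Phi(T_\gamma \delta_\varepsilon \mathbf{X}) = \Phi(\gamma) + \varepsilon G_1(\mathbf{X}) + \frac{\varepsilon^2}{2} G_2(\mathbf{X}) + \varepsilon^3 R_\varepsilon(\mathbf{X}) \]

with remainder $R_\varepsilon(\mathbf{X})$, uniformly bounded in $\varepsilon \in (0,1]$. We now use (H3) to obtain the expansion

\[ (F \circ \Phi)(T_\gamma \delta_\varepsilon \mathbf{X}) = (F \circ \Phi)(\gamma) + \varepsilon\, DF|_\varphi \big[ G_1(\mathbf{X}) \big] + \frac{\varepsilon^2}{2} \underbrace{\Big[ DF|_\varphi \big[ G_2(\mathbf{X}) \big] + D^2 F|_\varphi \big[ G_1(\mathbf{X}), G_1(\mathbf{X}) \big] \Big]}_{=: Q(\mathbf{X})} + \varepsilon^3 R^F_\varepsilon(\mathbf{X}) , \]

where (H3) requires us to take $\varepsilon$ less than some $\varepsilon_1(\mathbf{X})$, with remainder $R^F_\varepsilon(\mathbf{X})$, uniformly bounded in $\varepsilon \in (0, \varepsilon_1)$. Write $G_1 = G_1(h)$, and similarly for $G_2$, $Q$, when evaluated at the canonical lift of an element $h \in \mathcal{H}$. We note for later

\[ Q(h) = \frac{\partial^2}{\partial \varepsilon^2} \Big|_{\varepsilon = 0} (F \circ \Phi)(\gamma + \varepsilon h) . \]

Since $\gamma$ minimises $F_\Lambda = F \circ \Phi + I$, first order optimality leads precisely to

\[ DF|_\varphi \big[ G_1(h) \big] + \int \dot\gamma\, dh = 0 , \tag{9.8} \]

for any $h \in \mathcal{H}$. By continuous extension we have $DF|_\varphi [ G_1(\mathbf{B}(\omega)) ] + \int \dot\gamma\, dB = 0$, see Exercise 9.3 b), and so

\[ J_\varrho(\varepsilon) = \exp\Big( -\frac{F_\Lambda(\gamma)}{\varepsilon^2} \Big)\, E\Big[ \exp\big( -Q(\mathbf{B})/2 + \varepsilon R^F_\varepsilon(\mathbf{B}) \big) ; |||\delta_\varepsilon \mathbf{B}||| < \varrho \Big] . \]

We claim that, as one would expect from exchanging $\varepsilon \to 0$ with expectation,

\[ \lim_{\varepsilon \to 0} E\Big[ \exp\big( -Q(\mathbf{B})/2 + \varepsilon R^F_\varepsilon(\mathbf{B}) \big) ; |||\delta_\varepsilon \mathbf{B}||| < \varrho \Big] = E\big[ \exp\big( -Q(\mathbf{B})/2 \big) \big] < \infty . \]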

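To see why this is so, we first show integrability and even $\exp[-Q(\mathbf{B})/2] \in L^{1+\beta}$, for some $\beta > 0$, as consequence of the non-degeneracy assumption on the minimiser. The claimed integrability follows from the tail estimate $P(-Q(\mathbf{B})/2 \ge r) \le e^{-Cr}$, with $C > 1$ and for sufficiently large $r$. Now $Q$ is “quadratic” in the precise sense $Q(\delta_\lambda \mathbf{X}) = \lambda^2 Q(\mathbf{X})$, $\lambda > 0$, so that upon setting $r \equiv 1/\varepsilon^2$, we are left to show

\[ P\big( -Q(\delta_\varepsilon \mathbf{B}) \ge 2 \big) \le e^{-C/\varepsilon^2} . \]

Since $Q$ is seen to be continuous on rough path space, we have a good Large Deviations Principle for $\{ -Q(\delta_\varepsilon \mathbf{B}) : \varepsilon > 0 \}$, and using the upper LDP bound

\[ P\big( -Q(\delta_\varepsilon \mathbf{B})/2 \ge 1 \big) \le e^{-(C^* + o(1))/\varepsilon^2} , \]

it remains to see $1 < C^*$, where, using goodness of the rate function,

\[ C^* = \inf\big\{ \tfrac12 \|h\|^2_{\mathcal{H}} : h \in \mathcal{H},\ -Q(h)/2 \ge 1 \big\} = \tfrac12 \|h^*\|^2_{\mathcal{H}} \quad \text{for some } h^* \in \mathcal{H} . \]

But this follows exactly from “$D^2(F \circ \Phi + I)(\gamma) > 0$” in direction $h^*$,

\[ 1 \le -Q(h^*)/2 = \frac12 \frac{\partial^2}{\partial \varepsilon^2} \Big|_{\varepsilon = 0} (-F \circ \Phi)(\gamma + \varepsilon h^*) < \frac12 \|h^*\|^2_{\mathcal{H}} . \]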
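This establishes $\exp[-Q(\mathbf{B})/2] \in L^{1+\beta}$. This additional amount of integrability, $\beta > 0$, is now used to give a uniform $L^1$-bound on $\exp( -Q(\mathbf{B})/2 + \varepsilon R^F_\varepsilon(\mathbf{B}) )$ over $|||\delta_\varepsilon \mathbf{B}||| < \varrho$, after which one can conclude by dominated convergence. To this end, we revert to a pathwise consideration, $\mathbf{X} := \mathbf{B}(\omega)$. We need the remainder estimate, Exercise 9.4,

\[ \sup_{\varepsilon \in (0, \varepsilon_1]} \big| R^F_\varepsilon(\mathbf{X}) \big| \lesssim 1 + |||\mathbf{X}|||^3 , \tag{9.9} \]

valid whenever $\varepsilon |||\mathbf{X}||| = |||\delta_\varepsilon \mathbf{X}|||$ remains bounded. It follows that, on $|||\delta_\varepsilon \mathbf{B}||| < \varrho$, we have the (uniform in small $\varepsilon$) estimate

\[ \varepsilon \big| R^F_\varepsilon(\mathbf{B}) \big| \lesssim 1 + \varepsilon |||\mathbf{B}|||^3 \lesssim 1 + \varrho |||\mathbf{B}|||^2 \tag{9.10} \]

and this estimate is uniform over $\varepsilon \in (0,1]$. By Fernique’s estimate for the (homogeneous!) rough path norm $|||\mathbf{B}|||$ of $\mathbf{B} = \mathbf{B}(\omega)$ and by choosing $\varrho = \varrho(\beta)$ small enough, we can guarantee that

\[ e^{\varepsilon R^F_\varepsilon(\mathbf{B})}\, 1_{\{ |||\delta_\varepsilon \mathbf{B}||| < \varrho \}} \lesssim \exp\big( C \varrho |||\mathbf{B}|||^2 \big) \in L^{\beta'} , \]

where $\beta' < \infty$ is the Hölder conjugate of $1 + \beta > 1$. Hence $\exp[ -Q(\mathbf{B})/2 + C\varrho |||\mathbf{B}|||^2 ] \in L^1$ serves as the uniform $L^1$-bound we were looking for and the proof is complete.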

9.5 Exercises

Exercise 9.1 (Support of Brownian rough path [FV10b]) Fix $\alpha \in (\frac13, \frac12)$ and view the law $\mu$ of $\mathbf{B}^{\mathrm{Strat}}$ as probability measure on the Polish space $\mathscr{C}_{g,0}^{0,\alpha}$, the (closed) subspace of $\mathscr{C}_g^{0,\alpha}$ of rough paths $\mathbf{X}$ started at $X_0 = 0$. Show that $\mathbf{B}^{\mathrm{Strat}}$ has full support. The “easy” inclusion, $\operatorname{supp} \mu \subset \mathscr{C}_g^{0,\alpha}$, is clear from Proposition 3.6. For the other inclusion, recall the translation operator from Exercise 2.15 and follow the steps below.
a) (Cameron–Martin theorem for Brownian rough path) Let $h \in \mathcal{H} = W_0^{1,2}$. Show that $\mathbf{X} \in \operatorname{supp} \mu$ implies $T_h(\mathbf{X}) \in \operatorname{supp} \mu$.
b) Show that the support of $\mu$ contains at least one point, say $\hat{\mathbf{X}} \in \mathscr{C}_g^{0,\alpha}$, with the property that there exists a sequence of Lipschitz paths $(h^{(n)})$ so that $T_{h^{(n)}}(\hat{\mathbf{X}}) \to (0,0)$ in $\alpha$-Hölder rough path metric.
Hint: Almost every realisation of $\mathbf{B}^{\mathrm{Strat}}(\omega)$ will do, with $-h^{(n)} = B^{(n)}$, the dyadic piecewise linear approximations from Proposition 3.6.
c) Conclude that $(0,0) = \lim_{n \to \infty} T_{h^{(n)}}(\hat{\mathbf{X}}) \in \operatorname{supp} \mu$.
d) As a consequence, $(h, \int h \otimes dh) = T_h(0,0) \in \operatorname{supp} \mu$ for any $h \in \mathcal{H}$, and taking the closure yields the “difficult” inclusion.
e) Appeal to continuity of the Itô–Lyons map to obtain the “difficult” support inclusion (“every $y^h$ is in the support of $Y$”) in the context of Theorem 9.4.
Exercise 9.2 (“Schilder” large deviations, see [FV10b]) Fix $\alpha \in (\frac13, \frac12)$ and consider

\[ \delta_\varepsilon \mathbf{B}^{\mathrm{Strat}} = (\varepsilon B, \varepsilon^2 \mathbb{B}^{\mathrm{Strat}}) , \]

the laws of which are viewed as probability measures $\mu^\varepsilon$ on the Polish space $\mathscr{C}_{g,0}^{0,\alpha}$. Show that $(\mu^\varepsilon : \varepsilon > 0)$ satisfies a large deviation principle in $\alpha$-Hölder rough path topology with good rate function

\[ J(\mathbf{X}) = I(X) , \]

where $\mathbf{X} = (X, \mathbb{X})$ and $I$ is Schilder’s rate function for Brownian motion, i.e. $I(h) = \frac12 \|\dot h\|^2_{L^2([0,T],\mathbb{R}^d)}$ for $h \in \mathcal{H} = W_0^{1,2}$ and $I(h) = +\infty$ otherwise.
Hint: Thanks to Gaussian integrability for the homogeneous rough path norm of $\mathbf{B}^{\mathrm{Strat}}$ it is actually enough to establish a large deviation principle for $(\delta_\varepsilon \mathbf{B}^{\mathrm{Strat}} : \varepsilon > 0)$ in the (much coarser) uniform topology, which is not very hard to do “by hand”, cf. [FV10b].

Exercise 9.3 In the context of Laplace asymptotics given in Theorem 9.6:


a) Detail the localisation estimate (9.7).
b) Derive the first order optimality condition (9.8) and justify its “continuous
extension”, i.e. replacing h by B(ω).
c) Show that $G_2 = G_2(\mathbf{X})$ is continuous in rough path sense. Conclude that the same holds for $Q = Q(\mathbf{X})$.
Remark: Related results appear in [BA88] (on path space) and [Ina06, Lemma 8.2].
Exercise 9.4 (Stochastic Taylor-like rough path expansion) We aim to show the remainder estimate (9.9).
a) As a warmup, consider $\Phi : C([0,1],\mathbb{R}^d) \to \mathbb{R}$ so that $\Phi(X) = \varphi(X_1)$, for some $\varphi \in C^3(\mathbb{R}^d)$. Fix $\gamma \in C([0,1],\mathbb{R}^d)$ and establish the expansion

\[ \Phi(\gamma + \varepsilon X) \equiv g_0 + \varepsilon g_1(X) + \varepsilon^2 g_2(X) + \varepsilon^3 r_\varepsilon(X) , \]

such that $|r_\varepsilon(X)| \lesssim |X_1|^3$, uniformly in $\varepsilon \in (0,1]$, provided $|\varepsilon X_1|$ remains bounded.
b) Show that an extra $\varepsilon$-dependent drift, say $\varepsilon X$ replaced by $\varepsilon X + \varepsilon \mu$ for some fixed $\mu \in C([0,1],\mathbb{R}^d)$, alters the remainder estimate to $|r_\varepsilon(X)| \lesssim 1 + |X_1|^3$.
c) Generalise a) and b) to the situation when $\Phi$ is $C^3$-regular in Fréchet sense. (This trivially covers the case $F \circ \Phi$, with another $F \in C^3$.)
d) Prove the real thing, i.e. the remainder estimate (9.9) based on the expansion of $\varepsilon \mapsto F \circ \Phi(T_\gamma \delta_\varepsilon \mathbf{X})$ where $\Phi$ is the Itô–Lyons map. (See e.g. [IK07, Thm 5.1] and the references therein. For a similar estimate in a slightly different setting, see also [FGP18].)

9.6 Comments

The rough path approach to solving stochastic differential equations (SDEs) driven by d-dimensional noise can be seen as a far-reaching extension of the works of Doss and Sussmann [Dos77, Sus78], and the Wong–Zakai approximation result [WZ65] (d = 1) and Clark [Cla66], Stroock–Varadhan [SV72] for d > 1. Lyons [Lyo98] used the Wong–Zakai theorem in conjunction with his continuity result to deduce the fact that RDE solutions (driven by the Brownian rough path $\mathbf{B}^{\mathrm{Strat}}$) coincide with solutions to (Stratonovich) stochastic differential equations. Similar to Friz–Victoir [FV10b], the logic is reversed in our presentation: thanks to an a priori identification of $\int f(Y)\, d\mathbf{B}^{\mathrm{Strat}}$ as a Stratonovich stochastic integral, the Wong–Zakai result is obtained. Ikeda–Watanabe [IW89] present “twisted” Wong–Zakai approximations, based on McShane [McS72], in which case an additional limiting drift vector field appears; see also [Sus91, FO09]. Wong–Zakai type results for SPDEs (with finite-dimensional noise) are a straightforward consequence of continuity statements for rough partial differential equations, as discussed in Sections 12.1 and 12.2. A version of the Wong–Zakai theorem for singular SPDEs with space-time white noise via regularity structures is established by Hairer–Pardoux [HP15].
Almost sure rates for Wong–Zakai approximations in Brownian (and then more
general Gaussian) rough path situations, were studied by Hu–Nualart [HN09], Deya–
Neuenkirch–Tindel [DNT12] and Friz–Riedel [FR14]; see also Riedel–Xu [RX13].
Let us also note that Lq -rates for the convergence of approximations are not easy
to obtain with rough path techniques (in contrast to Itô calculus which is ideally
suited for moment calculations). Nonetheless, such rates can be obtained by Gaussian
techniques, as discussed in Section 11.2.3 below; applications include multi-level
Monte Carlo for SDEs and more generally Gaussian RDEs [BFRS16]. The rough
path approach to SDEs (and more generally Gaussian RDEs) leads naturally to
random dynamical systems, cf. comment Section 10.5.
The rough path approach to the Stroock-Varadhan support theorem [SV72] in
Section 9.3 goes back to Ledoux–Qian–Zhang [LQZ02] in p-variation and Friz
[Fri05] in Hölder topology, simplified and extended with Victoir in [FV05, FV07,
FV10b]; the conditional estimate (9.3) is due to Friz, Lyons and Stroock [FLS06].
We note that this strategy of proof applies whenever one has rough path stability,
which includes many stochastic partial differential equations (with finite-dimensional
noise) discussed in Chapter 12. In the case of infinite-dimensional noise, a general
support theorem for singular SPDEs was obtained via regularity structures by Hairer–
Schönbauer [HS19] and extends the paracontrolled work of Chouk–Friz [CF18], as
well as classical results such as the work of Bally, Millet and Sanz-Solé [BMSS95].
The rough path approach to Freidlin–Wentzell (small noise) large deviations in Section 9.3 also goes back to Ledoux, Qian and Zhang [LQZ02] in p-variation, strengthened to Hölder topology in [FV05]; Inahama studies large deviations for pinned diffusions [Ina15], see also [Ina16a]. Once more, the strategy of proof applies
whenever one has rough path stability, and thus applies to many stochastic partial
differential equations as discussed in Chapter 12. Large deviations for Banach valued
Wiener–Itô chaos proved useful in extensions to Gaussian rough paths and then Gaus-
sian models (in the sense of regularity structures), see [FV07] and [HW15], where
Hairer–Weber establish small noise large deviations for large classes of singular
SPDEs.
Theorem 9.6 is an elegant application of rough paths, due to Aida [Aid07], to the
classical theme of Laplace method on Wiener space, in a setting close to Ben Arous
[BA88]; see also Inahama [Ina06], his work with Kawabi [IK07] and [Ina13]. Our
presentation borrows from Friz, Gassiat and Pigato [FGP18]. See Friz–Klose [FK20]
for a recent extension of these works to singular SPDEs via regularity structures.
Recent applications to heat kernel expansions include [IT17].
The pathwise approach has also been useful to study mean field or McKean–Vlasov
stochastic differential equations. This goes back to Tanaka [Tan84], with pathwise
analysis of additive noise, revisited and extended by Coghi et al. [CDFM18]. The
rough path case was pioneered by Cass–Lyons [CL15], with measure dependent drift,
followed by Bailleul, Catellier and Delarue [BCD20, BCD19] to a setup that includes
the important case of measure dependent noise vector fields. Dawson–Gärtner type large deviations from the McKean–Vlasov limit of weakly interacting diffusions are studied in [Tan84, CDFM18], and also in Deuschel et al. [DFMS18] via rough paths, always under additive noise. Coghi–Nilssen [CN19] study, from a rough path point of view, McKean–Vlasov diffusions with “common” noise.
The Lions–Sznitman theory of reflecting SDEs [LS84] was revisited from a purely analytic rough path perspective by Aida [Aid15] and Deya et al. [DGHT19a] (existence); Gassiat [Gas20] shows non-uniqueness.
Homogenisation has also seen much impetus from rough path theory. After early works by Lejay–Lyons [LL03], we mention Bailleul–Catellier [BC17] and Kelly–Melbourne [KM16, KM17], who pioneered applications to deterministic homogenisation for fast-slow systems with chaotic noise, work continued by Chevyrev et al. [CFK+19b, CFK+19a, CFKM19].
Stochastic differential equations with jumps, driven by Lévy or general semimartingale noise, are well known [KPP95, Pro05, App09] to require a careful interpretation: forward vs. geometric (a.k.a. Marcus canonical) sense. The pathwise interpretation of such differential equations was started by Williams [Wil01] and essentially completed by Chevyrev, Friz, Shekhar and Zhang [FS17, FZ18, CF19]; consistency with the corresponding stochastic theories is also shown.
Rough analysis is “strong” by nature, yet has also proven a powerful tool for
“weak” (or martingale) problems. This was pioneered by Delarue–Diehl [DD16],
using rough paths to study a one-dimensional SDE with distributional drift, with
applications to polymer measures. The extension to higher dimensions was carried
out with paracontrolled methods by Cannizzaro–Chouk [CC18a].
Bruned et al. [BCF18] construct examples of renormalised SDE solutions, par-
tially based on the “Hoff” process [Hof06, FHL16], related to Itô SDE solutions as
averaging Stratonovich solutions [LY16].
Chapter 10
Gaussian rough paths

We investigate when multidimensional stochastic processes can be viewed, in a “canonical” fashion, as random rough paths. Gaussianity only enters through equivalence of moments. A simple criterion is given which applies in particular to fractional Brownian motion with suitable Hurst parameter.

10.1 A simple criterion for Hölder regularity

We now consider a driving signal modelled by a continuous, centred Gaussian process with values in $V = \mathbb{R}^d$. We thus have continuous sample paths

\[ X(\omega) : [0,T] \to \mathbb{R}^d \]

and may take the underlying probability space as $C([0,T],\mathbb{R}^d)$, equipped with a Gaussian measure $\mu$ so that $X_t(\omega) = \omega(t)$. Recall that $\mu$, the law of $X$, is fully determined by its covariance function

\[ R : [0,T]^2 \to \mathbb{R}^{d \times d}, \qquad (s,t) \mapsto E[X_s \otimes X_t] . \]

In this section, a major role will be played by the rectangular increments of the covariance, namely

\[ R\begin{pmatrix} s , t \\ s' , t' \end{pmatrix} \overset{\mathrm{def}}{=} E[X_{s,t} \otimes X_{s',t'}] . \]

As far as the Hölder regularity of sample paths is concerned, we have the following classical result, which is nothing but a special case of Kolmogorov’s continuity criterion:

Proposition 10.1. Assume there exist positive $\varrho$ and $M$ such that for every $0 \le s \le t \le T$,

\[ \Big| R\begin{pmatrix} s , t \\ s , t \end{pmatrix} \Big| \le M |t - s|^{1/\varrho} . \tag{10.1} \]

Then, for every $\alpha < 1/(2\varrho)$ there exists $K_\alpha \in L^q$, for all $q < \infty$, such that

\[ |X_{s,t}(\omega)| \le K_\alpha(\omega) |t - s|^\alpha . \]

Proof. We may argue componentwise and thus take $d = 1$ without loss of generality. Since

\[ |X_{s,t}|_{L^2} = \big( E[X_{s,t} X_{s,t}] \big)^{1/2} \le \Big| R\begin{pmatrix} s , t \\ s , t \end{pmatrix} \Big|^{1/2} \le M^{1/2} |t - s|^{\frac{1}{2\varrho}} \]

and $|X_{s,t}|_{L^q} \le c_q |X_{s,t}|_{L^2}$ by Gaussianity, we conclude immediately with an application of the Kolmogorov criterion. □
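The constant $c_q$ in the Gaussian moment comparison used above can be made explicit in the scalar case: for a standard normal $Z$ one has $E|Z|^q = 2^{q/2}\,\Gamma((q+1)/2)/\sqrt{\pi}$, so that a centred scalar Gaussian $X$ satisfies $|X|_{L^q} = c_q |X|_{L^2}$ with $c_q = (E|Z|^q)^{1/q}$. The sketch below evaluates this; the function name `gaussian_moment_constant` is an assumption made for illustration.

```python
import math


def gaussian_moment_constant(q):
    """Return c_q = (E|Z|^q)^(1/q) for a standard normal Z.

    Uses the closed form E|Z|^q = 2^(q/2) * Gamma((q+1)/2) / sqrt(pi),
    valid for q > 0.
    """
    abs_moment = 2.0 ** (q / 2.0) * math.gamma((q + 1.0) / 2.0) / math.sqrt(math.pi)
    return abs_moment ** (1.0 / q)


c2 = gaussian_moment_constant(2.0)  # second moment of Z is 1
c4 = gaussian_moment_constant(4.0)  # fourth moment of Z is 3
```

Since $c_q$ is finite for every $q < \infty$, the $L^2$ bound on increments upgrades to $L^q$ bounds with the same power of $|t - s|$, which is what the Kolmogorov criterion requires.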

Whenever the above proposition applies with $\varrho < 1$, the resulting sample paths can be taken with Hölder exponent $\alpha \in (\frac12, \frac{1}{2\varrho})$; differential equations driven by $X$ can then be handled with Young’s theory, cf. Section 8.3. Therefore, our focus will be on Gaussian processes which satisfy a suitable modification of condition (10.1) with $\varrho \ge 1$ such that the process $X$ allows for a probabilistic construction of a suitable second order process¹

\[ \mathbb{X}(\omega) : [0,T]^2 \to \mathbb{R}^{d \times d} , \]

which is tantamount to making sense of the “formal” stochastic integrals

\[ \int_s^t X^i_{s,r}\, dX^j_r \quad \text{for } 0 \le s < t \le T,\ 1 \le i, j \le d , \tag{10.2} \]

such that almost every realisation $\mathbb{X}(\omega)$ satisfies the algebraic and analytical properties of Chapter 2, notably (2.1) and (2.3) for some $\alpha \in (\frac13, \frac12)$. We shall also look for $(X, \mathbb{X})$ as (random) geometric rough path; thanks to (2.6), only the case $i < j$ in (10.2) then needs to be considered.

At the risk of being repetitive, the reader should keep in mind the following three points: (i) the sample paths $X(\omega)$ will not have, in general, enough regularity to define (10.2) as Young integrals; (ii) the process $X$ will not be, in general, a semimartingale, so (10.2) cannot be defined using classical stochastic integrals; (iii) a lift of the process $X$ to $(X, \mathbb{X}) \in \mathscr{C}_g^\alpha$ for some $\alpha \in (\frac13, \frac12)$, if at all possible, will never be unique (as discussed in Chapter 2, one can always perturb the area, i.e. $\mathrm{Anti}(\mathbb{X})$, by the increments of a $2\alpha$-Hölder path). But there might still be one distinguished canonical choice for $\mathbb{X}$, in the same way as $\mathbb{B}^{\mathrm{Strat}}$ is canonically obtained as limit (in probability) of $\int B^n \otimes dB^n$, for many natural approximations $B^n$ of Brownian motion $B$.

¹ Despite the two parameters $(s,t)$ one should not think of a random field here: as was noted in Exercise 2.4, $(X, \mathbb{X})$ is really a path.
10.2 Stochastic integration and variation regularity of the covariance

Our standing assumption from here on is independence of the $d$ components of $X$, which is tantamount to saying that the covariance takes values in the diagonal matrices. Basic examples to have in mind are $d$-dimensional standard Brownian motion $B$ with

\[ R(s,t) = (s \wedge t)\, I_d \in \mathbb{R}^{d \times d} \]

(here $I_d$ denotes the identity matrix in $\mathbb{R}^{d \times d}$) or fractional Brownian motion $B^H$, with

\[ R^H(s,t) = \frac12 \Big[ s^{2H} + t^{2H} - |t - s|^{2H} \Big] I_d \in \mathbb{R}^{d \times d} \]

where $H \in (0,1)$; note the implication $E\big[ (B^H_t - B^H_s)^2 \big] = |t - s|^{2H}$ for each component. The reader should observe that Proposition 10.1 above applies with $\varrho = 1/(2H)$; the focus on $\varrho \ge 1$ (to avoid trivial situations covered by Young theory) translates to $H \le 1/2$.
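The covariance of (scalar) fractional Brownian motion and its rectangular increments are easy to evaluate numerically. The following sketch, in which the names `fbm_cov` and `rect_increment` are chosen for illustration and are not notation from the text, checks the identity $E[(B^H_t - B^H_s)^2] = |t-s|^{2H}$ and the reduction to $s \wedge t$ at $H = 1/2$.

```python
def fbm_cov(s, t, hurst):
    """Covariance R^H(s, t) of scalar fractional Brownian motion."""
    h2 = 2.0 * hurst
    return 0.5 * (s ** h2 + t ** h2 - abs(t - s) ** h2)


def rect_increment(cov, s, t, s2, t2):
    """Rectangular increment E[X_{s,t} X_{s2,t2}] of a covariance function.

    Expanding X_{s,t} = X_t - X_s and X_{s2,t2} = X_{t2} - X_{s2} gives
    R(t, t2) - R(t, s2) - R(s, t2) + R(s, s2).
    """
    return cov(t, t2) - cov(t, s2) - cov(s, t2) + cov(s, s2)


hurst = 0.25
s, t = 0.2, 0.9
# Variance of the increment B^H_t - B^H_s, i.e. R evaluated on [s,t] x [s,t]
increment_variance = rect_increment(lambda a, b: fbm_cov(a, b, hurst), s, t, s, t)
expected_variance = abs(t - s) ** (2.0 * hurst)
```

With $H = 1/4$ the variance of an increment scales like $|t-s|^{1/2}$, so (10.1) holds with $\varrho = 2$, outside the Young regime.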
We return to the task of making sense of (10.2), componentwise for fixed $i < j$, and it will be enough to do so for the unit interval; the interval $[s,t]$ is handled by considering $(X_{s + \tau(t-s)} : 0 \le \tau \le 1)$. Writing $(X, \tilde X)$, rather than $(X^i, X^j)$, we attempt a definition of the form

\[ \int_0^1 X_{0,u}\, d\tilde X_u \overset{\mathrm{def}}{=} \lim_{|\mathcal{P}| \downarrow 0} \sum_{[s,t] \in \mathcal{P}} X_{0,\xi}\, \tilde X_{s,t} \quad \text{with } \xi \in [s,t] , \tag{10.3} \]

where the limit is understood in probability, say. Classical stochastic analysis (e.g. [RY99, p144]) tells us that care is necessary: if $X, \tilde X$ are semimartingales, the choice $\xi = s$ (“left-point evaluation”) leads to the Itô integral, $\xi = t$ (“right-point evaluation”) to the backward Itô integral, and $\xi = (s+t)/2$ to the Stratonovich integral. On the other hand, all these integrals only differ by a bracket term $\langle X, \tilde X \rangle$ which vanishes if $X, \tilde X$ are independent. While we do not assume a semimartingale structure here, we do have the standing assumption of componentwise independence. This suggests a Riemann sum approximation of (10.2) in which we expect the precise point of evaluation to play no rôle; we thus consider left-point evaluation (but mid- or right-point evaluation would lead to the same result; cf. Exercise 10.5, (ii) below).
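The claim that the point of evaluation plays no rôle under independence can be observed in a simulation. In the sketch below (all names are illustrative assumptions, and Brownian motion is used only because it is easy to sample), the right-point and left-point Riemann sums differ exactly by $\sum_{[s,t]} X_{s,t} \tilde X_{s,t}$, a discrete bracket term that is small for independent $X$, $\tilde X$.

```python
import random


def brownian_increments(n, rng):
    """Increments of a scalar Brownian motion over a uniform grid of [0, 1]."""
    dt = 1.0 / n
    return [rng.gauss(0.0, dt ** 0.5) for _ in range(n)]


def riemann_sums(dx, dxt):
    """Left- and right-point Riemann sums for int_0^1 X_{0,u} dX~_u.

    dx, dxt : increments of X and X~ over the same partition.
    """
    left = 0.0
    right = 0.0
    x = 0.0  # running value of X_{0,s}
    for a, b in zip(dx, dxt):
        left += x * b   # integrand evaluated at the left end point s
        x += a          # advance to X_{0,t}
        right += x * b  # integrand evaluated at the right end point t
    return left, right


rng = random.Random(0)  # fixed seed, so the run is reproducible
n = 20000
dx = brownian_increments(n, rng)
dxt = brownian_increments(n, rng)  # sampled independently of dx
left, right = riemann_sums(dx, dxt)
gap = right - left
bracket = sum(a * b for a, b in zip(dx, dxt))  # discrete bracket <X, X~>
```

Here `gap` coincides with `bracket` up to rounding, and for independent Brownian increments its standard deviation is $n^{-1/2}$, so it vanishes as the mesh tends to zero. Replacing `dxt` by `dx` would instead give a gap close to 1, the quadratic variation of Brownian motion on $[0,1]$.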
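Given a partition $\mathcal{P}$ of an interval and an integrand $F$, we set

\[ \int_{\mathcal{P}} F_s\, d\tilde X_s := \sum_{[s,t] \in \mathcal{P}} F_s\, \tilde X_{s,t} , \]

so that under the assumption that $X$ and $\tilde X$ are independent, we have

\[ E\Big[ \int_{\mathcal{P}} X_{0,s}\, d\tilde X_s \int_{\mathcal{P}'} X_{0,s}\, d\tilde X_s \Big] = \sum_{\substack{[s,t] \in \mathcal{P} \\ [s',t'] \in \mathcal{P}'}} R\begin{pmatrix} 0 , s \\ 0 , s' \end{pmatrix} \tilde R\begin{pmatrix} s , t \\ s' , t' \end{pmatrix} . \tag{10.4} \]

On the right-hand side we recognise a 2D Riemann–Stieltjes sum and set

\[ \int_{\mathcal{P} \times \mathcal{P}'} R\, d\tilde R := \sum_{\substack{[s,t] \in \mathcal{P} \\ [s',t'] \in \mathcal{P}'}} R\begin{pmatrix} 0 , s \\ 0 , s' \end{pmatrix} \tilde R\begin{pmatrix} s , t \\ s' , t' \end{pmatrix} . \]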
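Let us now assume that $R$ has finite $\varrho$-variation in the sense $\|R\|_{\varrho;[0,1]^2} < \infty$ where the $\varrho$-variation on a rectangle $I \times I'$ is given by

\[ \|R\|_{\varrho; I \times I'} := \sup_{\substack{\mathcal{P} \subset I, \\ \mathcal{P}' \subset I'}} \Bigg( \sum_{\substack{[s,t] \in \mathcal{P} \\ [s',t'] \in \mathcal{P}'}} \Big| R\begin{pmatrix} s , t \\ s' , t' \end{pmatrix} \Big|^\varrho \Bigg)^{1/\varrho} < \infty , \tag{10.5} \]

and similarly for $\tilde R$ with exponent $\tilde\varrho$, where $\theta = 1/\varrho + 1/\tilde\varrho > 1$. A generalisation of Young’s maximal inequality due to Towghi [Tow02] states that²

\[ \sup_{\substack{\mathcal{P} \subset I, \\ \mathcal{P}' \subset I'}} \Big| \int_{\mathcal{P} \times \mathcal{P}'} R\, d\tilde R \Big| \le C(\theta)\, \|R\|_{\varrho; I \times I'}\, \|\tilde R\|_{\tilde\varrho; I \times I'} . \]

In particular, if the covariance of $\tilde X$ has similar variation regularity as $X$, the condition simplifies to $\varrho < 2$ and we obtain the following $L^2$-maximal inequality.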
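Lemma 10.2. Let $X, \tilde X$ be independent, continuous, centred Gaussian processes with respective covariances $R, \tilde R$ of finite $\varrho$-variation, some $\varrho < 2$. Then

\[ \sup_{\mathcal{P} \subset [0,1]} E\Big[ \Big( \int_{\mathcal{P}} X_{0,r}\, d\tilde X_r \Big)^2 \Big] \le C\, \|R\|_{\varrho;[0,1]^2}\, \|\tilde R\|_{\varrho;[0,1]^2} , \]

where the constant $C$ depends on $\varrho$.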
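For a fixed pair of partitions, both the sum inside (10.5) and the 2D Riemann–Stieltjes sum in (10.4) are finite sums that can be evaluated directly. The sketch below does this for the covariance $R(s,t) = s \wedge t$ of scalar Brownian motion on a uniform grid; the helper names are assumptions made for illustration, and a single grid only gives a lower bound for the supremum in (10.5).

```python
def rect_increment(cov, s, t, s2, t2):
    """Rectangular increment of cov over [s, t] x [s2, t2]."""
    return cov(t, t2) - cov(t, s2) - cov(s, t2) + cov(s, s2)


def grid_variation_sum(cov, grid, grid2, rho):
    """(sum over grid rectangles of |increment|^rho)^(1/rho).

    This is the quantity inside the sup of (10.5) for one fixed pair of
    partitions, hence a lower bound for the rho-variation.
    """
    total = 0.0
    for s, t in zip(grid, grid[1:]):
        for s2, t2 in zip(grid2, grid2[1:]):
            total += abs(rect_increment(cov, s, t, s2, t2)) ** rho
    return total ** (1.0 / rho)


def riemann_stieltjes_2d(cov, cov_tilde, grid, grid2):
    """2D Riemann-Stieltjes sum of cov against cov_tilde, as in (10.4)."""
    total = 0.0
    for s, t in zip(grid, grid[1:]):
        for s2, t2 in zip(grid2, grid2[1:]):
            total += (rect_increment(cov, 0.0, s, 0.0, s2)
                      * rect_increment(cov_tilde, s, t, s2, t2))
    return total


def bm_cov(s, t):
    """Covariance of scalar Brownian motion."""
    return min(s, t)


n = 100
grid = [k / n for k in range(n + 1)]
one_variation = grid_variation_sum(bm_cov, grid, grid, 1.0)
second_moment = riemann_stieltjes_2d(bm_cov, bm_cov, grid, grid)
```

For Brownian motion, the rectangular increment over $[s,t] \times [s',t']$ is the length of $[s,t] \cap [s',t']$, so only diagonal cells contribute: `one_variation` is $1$ up to rounding, and `second_moment` equals $\sum_k s_k \Delta t = (n-1)/(2n)$, which tends to $E[(\int_0^1 B\, d\tilde B)^2] = 1/2$ for independent Brownian motions $B$, $\tilde B$.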


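We can now show existence of (10.3) as $L^2$-limit.

Proposition 10.3. Under the assumptions of the previous lemma,

\[ \lim_{\varepsilon \to 0}\ \sup_{\substack{\mathcal{P}, \mathcal{P}' \subset [0,1]: \\ |\mathcal{P}| \vee |\mathcal{P}'| < \varepsilon}} \Big| \int_{\mathcal{P}} X_{0,r}\, d\tilde X_r - \int_{\mathcal{P}'} X_{0,r}\, d\tilde X_r \Big|_{L^2} = 0 . \tag{10.6} \]

Hence, $\int_0^1 X_{0,r}\, d\tilde X_r$ exists as the $L^2$-limit of $\int_{\mathcal{P}} X_{0,r}\, d\tilde X_r$ as $|\mathcal{P}| \downarrow 0$ and

\[ E\Big[ \Big( \int_0^1 X_{0,r}\, d\tilde X_r \Big)^2 \Big] \le C\, \|R\|_{\varrho;[0,1]^2}\, \|\tilde R\|_{\varrho;[0,1]^2} \tag{10.7} \]

with a constant $C = C(\varrho)$.

² This holds more generally if $R$ is evaluated at $[0, \xi] \times [0, \xi']$ where $\xi \in [s,t]$, $\xi' \in [s',t']$.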

Proof. At first glance, the situation looks similar to Young’s part in the proof of Theorem 4.10 where we deduce (4.14) from Young’s maximal inequality. However, the same argument fails if re-run with $\Xi_{s,t} = X_{0,s} \tilde X_{s,t}$ and $|\cdot|$ replaced by $|\cdot|_{L^2}$; in effect, the triangle inequality is too crude and does not exploit probabilistic cancellations present here. We now present two arguments for the key estimate (10.6).

First argument: at the price of adding / subtracting $\mathcal{P} \cap \mathcal{P}'$, we may assume without loss of generality that $\mathcal{P}'$ refines $\mathcal{P}$. This allows us to write

\[ \int_{\mathcal{P}'} X_{0,r}\, d\tilde X_r - \int_{\mathcal{P}} X_{0,r}\, d\tilde X_r = \sum_{[u,v] \in \mathcal{P}} \int_{\mathcal{P}' \cap [u,v]} X_{u,r}\, d\tilde X_r \overset{\mathrm{def}}{=} I , \]

and we need to show convergence of $I$ to zero in $L^2$ as $|\mathcal{P}| = |\mathcal{P}| \vee |\mathcal{P}'| \to 0$. To see this, we rewrite the expectation of the square of this quantity as

\[ E I^2 = \sum_{[u,v] \in \mathcal{P}} \sum_{[u',v'] \in \mathcal{P}} E\Big( \int_{\mathcal{P}' \cap [u,v]} X_{u,r}\, d\tilde X_r \int_{\mathcal{P}' \cap [u',v']} X_{u',r'}\, d\tilde X_{r'} \Big) = \sum_{[u,v] \in \mathcal{P}} \sum_{[u',v'] \in \mathcal{P}} \int_{\mathcal{P}' \cap [u,v] \times \mathcal{P}' \cap [u',v']} R\, d\tilde R . \]

Thanks to Towghi’s maximal inequality, the absolute value of this term is bounded from above by a constant $C = C(\varrho)$ times

\[ \sum_{[u,v] \in \mathcal{P}} \sum_{[u',v'] \in \mathcal{P}} \|R\|_{\varrho;[u,v] \times [u',v']}\, \|\tilde R\|_{\varrho;[u,v] \times [u',v']} \le \sum_{[u,v] \in \mathcal{P}} \sum_{[u',v'] \in \mathcal{P}} \omega([u,v] \times [u',v'])^{\frac1\varrho}\, \tilde\omega([u,v] \times [u',v'])^{\frac1\varrho} , \]

where $\omega = \omega([s,t] \times [s',t'])$ (and similarly for $\tilde\omega$) is a so-called 2D control [FV11]: super-additive, continuous and zero when $s = t$ or $s' = t'$. A possible choice, if finite, is

\[ \omega([s,t] \times [s',t']) \overset{\mathrm{def}}{=} \sup_{\mathcal{Q} \subset [s,t] \times [s',t']} \sum_{[u,v] \times [u',v'] \in \mathcal{Q}} \Big| R\begin{pmatrix} u , v \\ u' , v' \end{pmatrix} \Big|^\varrho . \tag{10.8} \]

The difference to (10.5) is that the sup is taken over all (finite) partitions $\mathcal{Q}$ of $[s,t] \times [s',t']$ into rectangles, not just “grid-like” partitions induced by $\mathcal{P} \times \mathcal{P}'$. At this stage it looks like one should change the assumption “covariance of finite $\varrho$-variation” to “finite controlled $\varrho$-variation”, which by definition means $\omega([0,1]^2) < \infty$. But in fact there is little difference [FV11]: finite controlled $\varrho$-variation trivially implies finite $\varrho$-variation; conversely, finite $\varrho$-variation implies finite controlled $\varrho'$-variation, any $\varrho' > \varrho$. Since (10.6) does not depend on $\varrho$, we may as well (at the price of replacing $\varrho$ by $\varrho'$) assume finite controlled $\varrho$-variation. The Cauchy–Schwarz inequality for finite sums shows that $\bar\omega := \omega^{1/2} \tilde\omega^{1/2}$ is again a 2D control; the above
estimates can then be continued to

\[ E I^2 \le C \sum_{[u,v] \in \mathcal{P}} \sum_{[u',v'] \in \mathcal{P}} \bar\omega([u,v] \times [u',v'])^{2/\varrho} \le C \max_{\substack{[u,v] \in \mathcal{P} \\ [u',v'] \in \mathcal{P}}} \bar\omega([u,v] \times [u',v'])^{\frac{2 - \varrho}{\varrho}} \times \sum_{[u,v] \in \mathcal{P}} \sum_{[u',v'] \in \mathcal{P}} \bar\omega([u,v] \times [u',v']) \le o(1) \times \bar\omega([0,1] \times [0,1]) , \]

where we used the facts that $|\mathcal{P}| \downarrow 0$, $\varrho < 2$ and super-additivity of $\bar\omega$ to obtain the last inequality. This is precisely the required bound.

The second argument makes use of Riemann–Stieltjes theory, applicable after mollification of $\tilde X$, and a uniformity property of $\varrho$-variation upon mollification. Let us thus denote by $\tilde X^n := \tilde X * f_n$ the convolution of $t \mapsto \tilde X_t$ with $(f_n)$, a family of smooth, compactly supported probability density functions, weakly convergent to a Dirac at 0. Writing $\tilde R^n_{s,t} := E[\tilde X^n_s \tilde X^n_t]$ for the covariance of $\tilde X^n$, and also $\tilde S^n_{s,t} := E[\tilde X_s \tilde X^n_t]$ for the “mixed” covariance, we leave the fact that

\[ \sup_n \|\tilde R^n\|_{\varrho;[0,1]^2} ,\ \sup_n \|\tilde S^n\|_{\varrho;[0,1]^2} \le \|\tilde R\|_{\varrho;[0,1]^2} , \tag{10.9} \]

as an easy exercise for the reader. (Hint: Note $\tilde R^n = \tilde R * (f_n \otimes f_n)$, $\tilde S^n = \tilde R * (\delta \otimes f_n)$; estimate then the rectangular increments of $\tilde R^n$, respectively $\tilde S^n$, to the power $\varrho$ with Jensen’s inequality.)
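Since $\tilde X^n$ has finite variation sample paths, basic Riemann–Stieltjes theory implies

\[ \int_{\mathcal{P}} X_{0,r}\, d\tilde X^n_r \to \int X_{0,r}\, d\tilde X^n_r \quad \text{as } |\mathcal{P}| \to 0 . \tag{10.10} \]

In fact, this convergence ($n$ fixed) takes also place in $L^2$ which may be seen as consequence of Lemma 10.2. On the other hand, pick $\varrho' \in (\varrho, 2)$ and apply Lemma 10.2 to obtain³

\[ \sup_{\mathcal{P}} \Big| \int_{\mathcal{P}} X_{0,r}\, d\tilde X_r - \int_{\mathcal{P}} X_{0,r}\, d\tilde X^n_r \Big|^2_{L^2} \le C \|R_X\|_{\varrho';[0,1]^2}\, \|R_{\tilde X - \tilde X^n}\|_{\varrho';[0,1]^2} \le C \|R_X\|_{\varrho';[0,1]^2}\, \|R_{\tilde X - \tilde X^n}\|^{\varrho/\varrho'}_{\varrho;[0,1]^2}\, \|R_{\tilde X - \tilde X^n}\|^{1 - \varrho/\varrho'}_{\infty;[0,1]^2} , \tag{10.11} \]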

where $C = C(\varrho)$. Now $\varrho' > \varrho$ implies $\|R_X\|_{\varrho';[0,1]^2} \le \|R_X\|_{\varrho;[0,1]^2}$ (immediate consequence of $|x|_{\varrho'} \le |x|_\varrho \equiv ( \sum_{i=1}^m |x_i|^\varrho )^{1/\varrho}$ on $\mathbb{R}^m$) and thanks to (10.9) we also have the (uniform in $n$) estimate

\[ \|R_{\tilde X - \tilde X^n}\|_{\varrho;[0,1]^2} \le C_\varrho \Big( \|R_{\tilde X}\|_{\varrho;[0,1]^2} + 2 \|\tilde S^n\|_{\varrho;[0,1]^2} + \|R_{\tilde X^n}\|_{\varrho;[0,1]^2} \Big) \le 4 C_\varrho \|\tilde R\|_{\varrho;[0,1]^2} . \]

³ Define $|f|_{\infty;[0,1]^2} = \sup \big| f\begin{pmatrix} u , v \\ u' , v' \end{pmatrix} \big|$ where the sup is taken over all $[u,v], [u',v'] \subset [0,1]$.

Since $\tilde X^n$ converges to $\tilde X$ uniformly and in $L^2$, it is not hard to see that $R_{\tilde X - \tilde X^n} \to 0$ uniformly on $[0,1]^2$. We then see that (10.11) tends to zero as $n \to \infty$. It is now an elementary exercise to combine this with (10.10) to conclude the (second) proof of (10.6).
At last, the $L^2$-estimate is an immediate corollary of the maximal inequality given in Lemma 10.2 and $L^2$-convergence of the approximating Riemann–Stieltjes sums.

Note that there was nothing special about the time horizon $[0,1]$ in the above discussion. Indeed, given any time horizon $[s,t]$ of interest, it suffices to apply the same argument to the process $(X_{s + \tau(t-s)} : 0 \le \tau \le 1)$. Since variation norms are conveniently invariant under reparametrisation, (10.7) translates immediately to an estimate of the form

\[ E\Big[ \Big( \int_s^t X_{s,r}\, d\tilde X_r \Big)^2 \Big] \le C\, \|R\|_{\varrho;[s,t]^2}\, \|\tilde R\|_{\varrho;[s,t]^2} , \tag{10.12} \]

first for the approximating Riemann–Stieltjes sums and then for their $L^2$-limits.

Theorem 10.4. Let $(X_t : 0 \le t \le T)$ be a $d$-dimensional, continuous, centred Gaussian process with independent components and covariance $R$ such that there exists $\varrho \in [1,2)$ and $M < \infty$ such that for every $i \in \{1, \ldots, d\}$ and $0 \le s \le t \le T$,

\[ \|R_{X^i}\|_{\varrho;[s,t]^2} \le M |t - s|^{1/\varrho} . \tag{10.13} \]

Define, for $1 \le i < j \le d$ and $0 \le s \le t \le T$, in $L^2$-sense (cf. Proposition 10.3),

\[ \mathbb{X}^{i,j}_{s,t} := \lim_{|\mathcal{P}| \to 0} \int_{\mathcal{P}} \big( X^i_r - X^i_s \big)\, dX^j_r , \]

and then also (the algebraic conditions (2.1) and (2.6) leave no other choice!)

\[ \mathbb{X}^{i,i}_{s,t} := \frac12 \big( X^i_{s,t} \big)^2 \quad \text{and} \quad \mathbb{X}^{j,i}_{s,t} := -\mathbb{X}^{i,j}_{s,t} + X^i_{s,t} X^j_{s,t} . \tag{10.14} \]

Then, the following properties hold:
a) For every $q \in [1,\infty)$ there exists $C_1 = C_1(q, \varrho, d, T)$ such that for all $0 \le s \le t \le T$,

\[ E\big( |X_{s,t}|^{2q} + |\mathbb{X}_{s,t}|^q \big) \le C_1 M^q |t - s|^{q/\varrho} . \tag{10.15} \]

b) There exists a continuous modification of $\mathbb{X}$, denoted by the same letter from here on. Moreover, for any $\alpha < 1/(2\varrho)$ and $q \in [1,\infty)$ there exists $C_2 = C_2(q, \varrho, d, \alpha)$ such that

\[ E\big( \|X\|_\alpha^{2q} + \|\mathbb{X}\|_{2\alpha}^q \big) \le C_2 M^q . \tag{10.16} \]

c) For any $\alpha < \frac{1}{2\varrho}$, with probability one, the pair $(X, \mathbb{X})$ satisfies conditions (2.1), (2.3) and (2.6). In particular, for $\varrho \in [1, \frac32)$ and any $\alpha \in (\frac13, \frac{1}{2\varrho})$ we have $(X, \mathbb{X}) \in \mathscr{C}_g^\alpha$ almost surely.

Proof. By scaling, we can take M = 1 without loss of generality. Regarding the
first property, the “first level” estimates are contained in Proposition 10.1. Thus,
in view of (10.14), in order to establish (10.15) only $E|\mathbb{X}^{i,j}_{s,t}|^q$ for i < j needs
to be considered. For q = 2 this is an immediate consequence of (10.12) and our
assumption (10.13). The case of general q follows from the well-known equivalence
of $L^q$- and $L^2$-norm on the second Wiener–Itô chaos (e.g. [FV10b, Appendix D]).
Regarding the remaining two properties, almost sure validity of the algebraic
constraint (2.1) for any fixed pair of times is an easy consequence of algebraic identities
for Riemann sums. The construction of a continuous modification of $(s, t) \mapsto \mathbb{X}_{s,t}$
under the assumed bound is then standard (in fact, the proof of Theorem 3.1 shows
this for dyadic times and the unique continuous extension is the desired modification).
At last, Theorem 3.1 yields $K_\alpha$, $\mathbb{K}_\alpha$, with moments of all orders, such that

\[
|X_{s,t}| \le K_\alpha(\omega) |t - s|^{\alpha} , \qquad |\mathbb{X}_{s,t}| \le \mathbb{K}_\alpha(\omega) |t - s|^{2\alpha} .
\]

The dependence of the moments of $K_\alpha$ and $\mathbb{K}_\alpha$ on M finally follows by simple
rescaling. □
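As a sanity check on the scaling in (10.15), consider the simplest case of two independent standard Brownian motions (so ϱ = 1 and M = 1). The following Python sketch, whose function names are illustrative and not taken from the text, evaluates in closed form the second moment of the left-point Riemann–Stieltjes sum that approximates the off-diagonal second-level increment over [s, t], and compares it with the limiting value (t − s)²/2.

```python
def riemann_second_moment(s, t, n):
    """Second moment of sum_k B1_{s,t_k} * (B2_{t_{k+1}} - B2_{t_k}) on a uniform
    n-step partition of [s, t], for independent standard Brownian motions B1, B2.

    Cross terms vanish by independence, which leaves
    sum_k E[B1_{s,t_k}^2] * E[(B2_{t_{k+1}} - B2_{t_k})^2] = sum_k (k*dt) * dt.
    """
    dt = (t - s) / n
    return sum((k * dt) * dt for k in range(n))


def limit_second_moment(s, t):
    """Second moment of the L2-limit: the integral of (r - s) dr over [s, t]."""
    return 0.5 * (t - s) ** 2
```

Both quantities scale like |t − s|², which is what (10.15) predicts for q = 2 and ϱ = 1.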

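Theorem 10.5. Let $(X, Y) = (X^1, Y^1, \dots, X^d, Y^d)$ be a centred continuous Gaussian
process on [0, T] such that $(X^i, Y^i)$ is independent of $(X^j, Y^j)$ when $i \ne j$.
Assume that there exists $\varrho \in [1, 2)$ and $M \in (0, \infty)$ such that the bounds

\[
\|R_{X^i}\|_{\varrho;[s,t]^2} \le M |t - s|^{1/\varrho} , \qquad
\|R_{Y^i}\|_{\varrho;[s,t]^2} \le M |t - s|^{1/\varrho} , \qquad
\|R_{X^i - Y^i}\|_{\varrho;[s,t]^2} \le \varepsilon^2 M |t - s|^{1/\varrho} , \tag{10.17}
\]

hold for all $i \in \{1, \dots, d\}$ and all $0 \le s \le t \le T$. Then

a) For every $q \in [1, \infty)$, the bounds

\[
E\big( |Y_{s,t} - X_{s,t}|^q \big)^{1/q} \lesssim \varepsilon \sqrt{M} \, |t - s|^{\frac{1}{2\varrho}} , \qquad
E\big( |\mathbb{Y}_{s,t} - \mathbb{X}_{s,t}|^q \big)^{1/q} \lesssim \varepsilon M |t - s|^{\frac{1}{\varrho}} ,
\]

hold for all $0 \le s \le t \le T$.

b) For any $\alpha < 1/(2\varrho)$ and $q \in [1, \infty)$, one has

\[
\big| E\big( \|Y - X\|_{\alpha}^q \big) \big|^{1/q} \lesssim \varepsilon \sqrt{M} , \qquad
\big| E\big( \|\mathbb{Y} - \mathbb{X}\|_{2\alpha}^q \big) \big|^{1/q} \lesssim \varepsilon M .
\]

c) For $\varrho \in [1, \frac{3}{2})$ and any $\alpha \in (\frac{1}{3}, \frac{1}{2\varrho})$, $q < \infty$, one has

\[
|\varrho_\alpha(\mathbf{X}, \mathbf{Y})|_{L^q} \lesssim \varepsilon .
\]

(Here, $\varrho_\alpha(\mathbf{X}, \mathbf{Y})$ denotes the α-Hölder rough path distance between $\mathbf{X} = (X, \mathbb{X})$
and $\mathbf{Y} = (Y, \mathbb{Y})$ in $\mathcal{C}_g^{\alpha}$.)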
Proof. By scaling we may without loss of generality assume M = 1. As for a) we
note (again) that equivalence of $L^q$- and $L^2$-norm on Wiener–Itô chaos allows us to
reduce our discussion to q = 2. The first level estimate being easy, we focus on
the second level estimate; to this end fix i ̸= j. Since L2 -convergence implies a.s.
convergence along a subsequence there exists (Pn ), with mesh tending to zero, so
that we can use Fatou’s lemma to estimate
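\[
\begin{aligned}
E\big| \mathbb{Y}^{i,j}_{s,t} - \mathbb{X}^{i,j}_{s,t} \big|^2
&= E\Big[ \lim_{n \to \infty} \Big| \int_{\mathcal{P}_n} \big( Y^i_{s,r} \, dY^j_r - X^i_{s,r} \, dX^j_r \big) \Big|^2 \Big] \\
&\le \liminf_{n} E\Big[ \Big| \int_{\mathcal{P}_n} \big( Y^i_{s,r} \, dY^j_r - X^i_{s,r} \, dX^j_r \big) \Big|^2 \Big] \\
&\le \sup_{\mathcal{P}} E\Big[ \Big| \int_{\mathcal{P}} \big( Y^i_{s,r} \, dY^j_r - X^i_{s,r} \, dX^j_r \big) \Big|^2 \Big] .
\end{aligned}
\]

The result now follows from the bound

\[
\Big| \int_{\mathcal{P}} \big( Y^i_{s,r} \, dY^j_r - X^i_{s,r} \, dX^j_r \big) \Big|
\le \Big| \int_{\mathcal{P}} Y^i_{s,r} \, d(Y - X)^j_r \Big| + \Big| \int_{\mathcal{P}} (Y - X)^i_{s,r} \, dX^j_r \Big| ,
\]

where we estimate the second moment of each term on the right-hand side by the
respective variation norms of the covariances; e.g.

\[
E\Big[ \Big| \int_{\mathcal{P}} Y^i_{s,r} \, d(Y - X)^j_r \Big|^2 \Big]
\le C \|R_{Y^i}\|_{\varrho;[s,t]^2} \|R_{Y^j - X^j}\|_{\varrho;[s,t]^2}
\le C \varepsilon^2 |t - s|^{\frac{2}{\varrho}} .
\]

The case i = j is easier: it suffices to note that

\[
E\big| \mathbb{Y}^{i,i}_{s,t} - \mathbb{X}^{i,i}_{s,t} \big|^2
= \frac{1}{4} E\big| (Y^i_{s,t})^2 - (X^i_{s,t})^2 \big|^2
= \frac{1}{4} E\big| \big( Y^i_{s,t} - X^i_{s,t} \big) \big( Y^i_{s,t} + X^i_{s,t} \big) \big|^2 ,
\]

then conclude with Cauchy–Schwarz.
Regarding b), given the pointwise $L^q$-estimates as stated in a), the $L^q$-estimates
for $\|X - Y\|_\alpha$ and $\|\mathbb{Y} - \mathbb{X}\|_{2\alpha}$ are obtained from Theorem 3.3. The last statement
is then an immediate consequence of the definition of $\varrho_\alpha$. □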
Corollary 10.6. As above, let $(X, Y) = (X^1, Y^1, \dots, X^d, Y^d)$ be a centred continuous
Gaussian process such that $(X^i, Y^i)$ is independent of $(X^j, Y^j)$ when $i \ne j$.
Assume that there exists $\varrho \in [1, \frac{3}{2})$ and $M \in (0, \infty)$ such that

\[
\|R_{(X,Y)}\|_{\varrho;[s,t]^2} \le M |t - s|^{1/\varrho} \qquad \forall \, 0 \le s \le t \le T. \tag{10.18}
\]

Then, for every $\alpha \in (\frac{1}{3}, \frac{1}{2\varrho})$, every $\theta \in (0, \frac{1}{2} - \varrho\alpha)$ and $q < \infty$, there exists a
constant C such that

\[
|\varrho_\alpha(\mathbf{X}, \mathbf{Y})|_{L^q} \le C \sup_{s,t \in [0,T]} \Big[ E|X_{s,t} - Y_{s,t}|^2 \Big]^{\theta} . \tag{10.19}
\]

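Proof. At the price of replacing (X, Y) by the rescaled process $M^{-1/2}(X, Y)$ we
may take M = 1. (The concluding $L^q$-estimate on $\varrho_\alpha(M^{-1/2}\mathbf{X}, M^{-1/2}\mathbf{Y})$ is then
readily translated into an estimate on $\varrho_\alpha(\mathbf{X}, \mathbf{Y})$, given that we allow the final constant
to depend on M.) Assumption (10.18) then spells out precisely to

\[
\|R_{X^i}\|_{\varrho;[s,t]^2} \le |t - s|^{1/\varrho} , \qquad \|R_{Y^i}\|_{\varrho;[s,t]^2} \le |t - s|^{1/\varrho}
\]

and (not present in the assumptions of the previous theorem!)

\[
\|R_{(X^i,Y^i)}\|_{\varrho;[s,t]^2} \le |t - s|^{1/\varrho}
\]

where $R_{(X^i,Y^i)}(u, v) = E\big( X^i_u Y^i_v \big)$. Thanks to this assumption we have

\[
\|R_{X^i - Y^i}\|_{\varrho;[s,t]^2}
\le C_\varrho \Big( \|R_{X^i}\|_{\varrho;[s,t]^2} + 2 \|R_{(X^i,Y^i)}\|_{\varrho;[s,t]^2} + \|R_{Y^i}\|_{\varrho;[s,t]^2} \Big)
\le 4 C_\varrho |t - s|^{1/\varrho} ,
\]

which is handy in the following interpolation argument. Set

\[
\eta := \max\big\{ \|R_{X^i - Y^i}\|_{\infty;[0,T]^2} : 1 \le i \le d \big\}
\]

and note that, for any $\varrho' > \varrho$,

\[
\|R_{X^i - Y^i}\|_{\varrho';[s,t]^2}
\le \|R_{X^i - Y^i}\|_{\infty;[s,t]^2}^{1 - \varrho/\varrho'} \, \|R_{X^i - Y^i}\|_{\varrho;[s,t]^2}^{\varrho/\varrho'}
\le (4 C_\varrho)^{\varrho/\varrho'} \, \eta^{1 - \varrho/\varrho'} \, |t - s|^{1/\varrho'} .
\]

Also, with $\tilde M = 1 \vee T^{1/\varrho - 1/\varrho'}$, and then similar for $R_{Y^i}$,

\[
\|R_{X^i}\|_{\varrho';[s,t]^2} \le \|R_{X^i}\|_{\varrho;[s,t]^2} \le |t - s|^{1/\varrho} \le \tilde M |t - s|^{1/\varrho'}
\]

and so, picking $\varrho' = \frac{\varrho}{1 - 2\theta}$, the previous theorem (with $\varrho' \leftarrow \varrho$ and
$\varepsilon^2 \leftarrow \eta^{1 - \varrho/\varrho'}$, $M \leftarrow \tilde M \vee (4 C_\varrho)^{\varrho/\varrho'}$) yields

\[
|\varrho_\alpha(\mathbf{X}, \mathbf{Y})|_{L^q} \le C \varepsilon = C \eta^{\frac{1}{2} - \frac{\varrho}{2\varrho'}} = C \eta^{\theta} ,
\]

for any given $\theta \in (0, \frac{1}{2} - \varrho\alpha)$. At last, take $i_* \in \{1, \dots, d\}$ as the arg max in the
definition of η and set $\Delta = X^{i_*} - Y^{i_*}$. Then, by Cauchy–Schwarz,

\[
\eta = \|R_\Delta\|_{\infty;[0,T]^2}
= \sup_{\substack{0 \le s \le t \le T \\ 0 \le s' \le t' \le T}} \big| E(\Delta_{s,t} \Delta_{s',t'}) \big|
\le \sup_{0 \le s \le t \le T} E \Delta_{s,t}^2
\]

and the proof is finished. □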


Remark 10.7. Corollary 10.6 suggests an alternative route to the construction of a
rough path lift $\mathbf{X} = (X, \mathbb{X})$ for some Gaussian process X as in Theorem 10.4. The
idea is to establish the crucial estimate (10.19) only for processes with regular sample
paths, in which case $\mathbb{X}$ is canonically given by iterated Riemann–Stieltjes integration.
Apply this to piecewise linear (or mollifier) approximations $X^n$, $X^m$ to see that
$(X^n, \mathbb{X}^n)$ is Cauchy, in probability and rough path metric, in the space $\mathcal{C}_g^{0,\alpha}$. The
resulting limiting (random) rough path $\mathbf{X}$ is easily seen to be indistinguishable from
the one constructed in Theorem 10.4. All estimates are then seen to remain valid in
the limit. (This is the approach taken in [FV10a, FV10b].)

10.3 Fractional Brownian motion and beyond

We remarked in the beginning of Section 10.2 that (d-dimensional) fractional Brownian
motion $B^H$, with Hurst parameter H ∈ (0, 1), determined through its covariance

\[
R^H(s, t) = \frac{1}{2} \Big[ s^{2H} + t^{2H} - |t - s|^{2H} \Big] \, \mathrm{Id} \in \mathbf{R}^{d \times d} ,
\]

has α-Hölder sample paths for any α < H. For H > 1/2, there is little need for
rough path analysis - after all, Young’s theory is applicable. For H = 1/2, one deals
with d-dimensional standard Brownian motion which, of course, renders the classical
martingale based stochastic analysis applicable. For H < 1/2, however, all these
theories fail but rough path analysis works. In the remainder of this section we detail
the construction of a fractional Brownian rough path.
In fact, we shall consider centred, continuous Gaussian processes with independent
components $X = (X^1, \dots, X^d)$ and stationary increments. The construction
of a (geometric) rough path associated to X then naturally passes through an
understanding of the two-dimensional ϱ-variation of $R = R_X$, the covariance of X; cf.
Theorem 10.4. To this end, it is enough to focus on one component and we may take
X to be scalar until further notice. The law of such a process is fully determined by

\[
\sigma^2(u) := E\big[ X_{t,t+u}^2 \big] = R \begin{pmatrix} t, t + u \\ t, t + u \end{pmatrix} .
\]

Lemma 10.8. Assume that $\sigma^2(\cdot)$ is concave on [0, h] for some h > 0. Then, one
has non-positive correlation of non-overlapping increments in the sense that, for
$0 \le s \le t \le u \le v \le h$,

\[
E[X_{s,t} X_{u,v}] = R \begin{pmatrix} s, t \\ u, v \end{pmatrix} \le 0 .
\]

If in addition $\sigma^2(\cdot)$ restricted to [0, h] is non-decreasing (which is always the case
for some possibly smaller h), then for $0 \le s \le u \le v \le t \le h$,

\[
0 \le E[X_{s,t} X_{u,v}] = |E[X_{s,t} X_{u,v}]| \le E\big[ X_{u,v}^2 \big] = \sigma^2(v - u) .
\]

Proof. Using the identity $2ac = (a + b + c)^2 + b^2 - (b + c)^2 - (a + b)^2$ with
$a = X_{s,t}$, $b = X_{t,u}$ and $c = X_{u,v}$, we see that

\[
\begin{aligned}
2 E[X_{s,t} X_{u,v}] &= E\big[ X_{s,v}^2 \big] + E\big[ X_{t,u}^2 \big] - E\big[ X_{t,v}^2 \big] - E\big[ X_{s,u}^2 \big] \\
&= \sigma^2(v - s) + \sigma^2(u - t) - \sigma^2(v - t) - \sigma^2(u - s) .
\end{aligned}
\]

The first claim now easily follows from concavity, cf. [MR06, Lemma 7.2.7].
To show the second bound, note that $X_{s,t} X_{u,v} = (a + b + c) b$ where $a = X_{s,u}$,
$b = X_{u,v}$, and $c = X_{v,t}$. Applying the algebraic identity

\[
2 (a + b + c) b = (a + b)^2 - a^2 + (c + b)^2 - c^2
\]

and taking expectations yields

\[
\begin{aligned}
2 E[X_{s,t} X_{u,v}] &= E\big[ X_{s,v}^2 \big] - E\big[ X_{s,u}^2 \big] + E\big[ X_{u,t}^2 \big] - E\big[ X_{v,t}^2 \big] \\
&= \sigma^2(v - s) - \sigma^2(u - s) + \sigma^2(t - u) - \sigma^2(t - v) \ge 0 ,
\end{aligned}
\]

where we used that $\sigma^2(\cdot)$ is non-decreasing. On the other hand, using
$(a + b + c) b = b^2 + ab + cb$ and the non-positive correlation of non-overlapping
increments, we have

\[
E[X_{s,t} X_{u,v}] = E\big[ X_{u,v}^2 \big] + E[X_{s,u} X_{u,v}] + E[X_{v,t} X_{u,v}] \le E\big[ X_{u,v}^2 \big] ,
\]

thus concluding the proof. □
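The two covariance formulas in the proof can be evaluated directly for a concrete variance function. The following Python sketch assumes the fractional Brownian choice σ²(u) = u^{2H} (function names are illustrative, not from the text) and checks both claims of Lemma 10.8 over all ordered quadruples of a small grid.

```python
from itertools import combinations


def sigma2(u, H):
    """Variance of an increment of length u, assuming sigma^2(u) = |u|^(2H)."""
    return abs(u) ** (2 * H)


def cov_disjoint(s, t, u, v, H):
    """E[X_{s,t} X_{u,v}] for s <= t <= u <= v, from the first identity in the proof."""
    return 0.5 * (sigma2(v - s, H) + sigma2(u - t, H)
                  - sigma2(v - t, H) - sigma2(u - s, H))


def cov_nested(s, u, v, t, H):
    """E[X_{s,t} X_{u,v}] for s <= u <= v <= t, from the second identity in the proof."""
    return 0.5 * (sigma2(v - s, H) - sigma2(u - s, H)
                  + sigma2(t - u, H) - sigma2(t - v, H))


def check_lemma(H, grid, tol=1e-12):
    """True if both conclusions of the lemma hold for all a < b < c < d in grid."""
    for a, b, c, d in combinations(grid, 4):
        if cov_disjoint(a, b, c, d, H) > tol:        # increments over [a,b] and [c,d]
            return False
        inner = cov_nested(a, b, c, d, H)            # increments over [a,d] and [b,c]
        if inner < -tol or inner > sigma2(c - b, H) + tol:
            return False
    return True
```

For H ≤ 1/2 the variance function is concave and the check passes; for H > 1/2 it is convex, disjoint increments become positively correlated, and the check fails, as expected.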


Theorem 10.9. Let X be a real-valued Gaussian process with stationary increments
and $\sigma^2(\cdot)$ concave and non-decreasing on [0, h], some h > 0. Assume also, for
constants L, ϱ ≥ 1, and all τ ∈ [0, h],

\[
|\sigma^2(\tau)| \le L |\tau|^{1/\varrho} .
\]

Then the covariance of X has finite ϱ-variation. More precisely

\[
\|R_X\|_{\varrho\text{-var};[s,t]^2} \le M |t - s|^{1/\varrho} \tag{10.20}
\]

for all intervals [s, t] with length |t − s| ≤ h and some M = M(ϱ, L) > 0.

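Proof. Consider some interval [s, t] with length |t − s| ≤ h. The proof relies on
separating “diagonal” and “off-diagonal” contributions. Let $D = \{t_i\}$, $D' = \{t'_j\}$ be
two dissections of [s, t]. For fixed i, we have

\[
\begin{aligned}
3^{1-\varrho} \sum_{t'_j \in D'} \big| E X_{t_i,t_{i+1}} X_{t'_j,t'_{j+1}} \big|^{\varrho}
&\le 3^{1-\varrho} \big\| E X_{t_i,t_{i+1}} X_{\cdot} \big\|^{\varrho}_{\varrho\text{-var};[s,t]} \\
&\le \big\| E X_{t_i,t_{i+1}} X_{\cdot} \big\|^{\varrho}_{\varrho\text{-var};[s,t_i]}
 + \big\| E X_{t_i,t_{i+1}} X_{\cdot} \big\|^{\varrho}_{\varrho\text{-var};[t_i,t_{i+1}]}
 + \big\| E X_{t_i,t_{i+1}} X_{\cdot} \big\|^{\varrho}_{\varrho\text{-var};[t_{i+1},t]} .
\end{aligned} \tag{10.21}
\]

By Lemma 10.8 above, we have

\[
\big\| E X_{t_i,t_{i+1}} X_{\cdot} \big\|_{\varrho\text{-var};[s,t_i]}
\le \big| E X_{t_i,t_{i+1}} X_{s,t_i} \big|
\le \big| E X_{t_i,t_{i+1}} X_{s,t_{i+1}} \big| + \big| E X_{t_i,t_{i+1}}^2 \big|
\le 2 \sigma^2(t_{i+1} - t_i) .
\]

The third term is bounded analogously. For the middle term in (10.21) we estimate

\[
\begin{aligned}
\big\| E X_{t_i,t_{i+1}} X_{\cdot} \big\|^{\varrho}_{\varrho\text{-var};[t_i,t_{i+1}]}
&= \sup_{D'} \sum_{t'_j \in D'} \big| E X_{t_i,t_{i+1}} X_{t'_j,t'_{j+1}} \big|^{\varrho} \\
&\le \sup_{D'} \sum_{t'_j \in D'} \big| \sigma^2(t'_{j+1} - t'_j) \big|^{\varrho}
\le L |t_{i+1} - t_i| ,
\end{aligned}
\]

where we used the second estimate of Lemma 10.8 for the penultimate bound and
the assumption on $\sigma^2$ for the last bound. Using these estimates in (10.21) yields

\[
\sum_{t'_j \in D'} \big| E X_{t_i,t_{i+1}} X_{t'_j,t'_{j+1}} \big|^{\varrho} \le C |t_{i+1} - t_i| ,
\]

and (10.20) follows by summing over $t_i$ and taking the supremum over all dissections
of [s, t]. □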

Corollary 10.10. Let $X = (X^1, \dots, X^d)$ be a centred continuous Gaussian process
with independent components such that each $X^i$ satisfies the assumption of the
previous theorem, with common values of h, L and ϱ ∈ [1, 3/2). Then X, restricted
to any interval [0, T], lifts to $\mathbf{X} = (X, \mathbb{X}) \in \mathcal{C}_g^{\alpha}([0, T], \mathbf{R}^d)$.
Proof. Set $I_n = [(n - 1)h, nh]$ so that $[0, T] \subset I_1 \cup I_2 \cup \dots \cup I_{[T/h]+1}$. On each
interval $I_n$, we may apply Theorem 10.4 to lift $X^n := X|_{I_n}$ to a (random) rough
path $\mathbf{X}^n \in \mathcal{C}_g^{\alpha}(I_n, \mathbf{R}^d)$. The concatenation of $\mathbf{X}^1, \mathbf{X}^2, \dots$ then yields the desired
rough path lift on [0, T]. □

Example 10.11 (Fractional Brownian motion). Clearly, d-dimensional fractional
Brownian motion $B^H$ with Hurst parameter $H \in (\frac{1}{3}, \frac{1}{2}]$ satisfies the assumptions of
the above theorem / corollary for all components with

\[
\sigma^2(u) = u^{2H} ,
\]

obviously non-decreasing and concave for $H \le \frac{1}{2}$ and on any time interval [0, T].
This also identifies

\[
\varrho = \frac{1}{2H}
\]

and $\varrho < \frac{3}{2}$ translates to $H > \frac{1}{3}$, in which case we obtain a canonical geometric rough
path $\mathbf{B}^H = (B^H, \mathbb{B}^H)$ associated to fBm. In fact, a canonical “level-3” rough path
$\mathbf{B}^H$ can be constructed as long as $\varrho < \varrho^* = 2$, corresponding to H > 1/4, but this
requires level-3 considerations which we do not discuss here (see [FV10b, Ch.15]).
Example 10.12 (Ornstein–Uhlenbeck process). Consider the d-dimensional (stationary)
OU process, consisting of i.i.d. copies of a scalar Gaussian process X with
covariance

\[
E[X_s X_t] = K(|t - s|) , \qquad K(u) = \exp(-cu) ,
\]

where c > 0 is fixed. Note that

\[
\sigma^2(u) = E X_{t,t+u}^2 = E X_{t+u}^2 + E X_t^2 - 2 E[X_t X_{t+u}] = 2 [K(0) - K(u)] = 2 [1 - \exp(-cu)] ,
\]

so that $\sigma^2(u)$ is indeed increasing and concave:

\[
\partial_u \sigma^2(u) = 2c \exp(-cu) > 0 , \qquad
\partial_u^2 \sigma^2(u) = -2c^2 \exp(-cu) < 0 .
\]

One also has the bound $\sigma^2(u) = 2[1 - \exp(-cu)] \le 2cu$, which shows that the
assumptions of the above corollary are satisfied with ϱ = 1, L = 2c and arbitrary
h > 0.
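The monotonicity, concavity and linear growth used in this example are easy to confirm numerically. The sketch below (illustrative names; the variance function is computed directly from K via σ²(u) = 2[K(0) − K(u)]) tests the three properties on a uniform grid of [0, h], using the elementary bound 1 − e^{−x} ≤ x for the linear estimate with constant 2c.

```python
import math


def K(u, c):
    """Stationary covariance kernel K(u) = exp(-c u)."""
    return math.exp(-c * u)


def sigma2(u, c):
    """E[X_{t,t+u}^2] = E[X_{t+u}^2] + E[X_t^2] - 2 E[X_t X_{t+u}] = 2 [K(0) - K(u)]."""
    return 2.0 * (K(0.0, c) - K(u, c))


def check_ou(c, h, n=200, tol=1e-12):
    """Return (increasing, concave, linearly bounded) for sigma2 on a grid of [0, h]."""
    us = [h * k / n for k in range(n + 1)]
    vals = [sigma2(u, c) for u in us]
    increasing = all(b >= a - tol for a, b in zip(vals, vals[1:]))
    # discrete concavity: all second differences are non-positive
    concave = all(vals[k + 1] - 2 * vals[k] + vals[k - 1] <= tol for k in range(1, n))
    linear_bound = all(v <= 2.0 * c * u + tol for u, v in zip(us, vals))
    return increasing, concave, linear_bound
```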

10.4 Exercises

Exercise 10.1 Let $X^D$ be a piecewise linear approximation to X. Show that $(\mathbb{X}_{s,t})$
as constructed in Theorem 10.4 is the limit, in probability and uniformly on
$\{(s, t) : 0 \le s \le t \le T\}$ say, of $\int_s^t X^D_{s,u} \otimes dX^D_u$ as $|D| \to 0$. (In particular, any
algebraic relations which hold for (piecewise) smooth paths and their iterated
integrals then hold true in the limit. This yields an alternative proof that $(X, \mathbb{X})$ satisfies
conditions (2.1) and (2.6).)
Exercise 10.2 (Convergence to Brownian rough path [HN09, FR11]) Let X =
B and Y = $B^n$ be a d-dimensional Brownian motion and its piecewise linear
approximation with mesh size 1/n, respectively. Show that the covariance of $(B, B^n)$
has finite 1-variation, uniformly in n. Show also that

\[
\sup_{s,t \in [0,T]} E\Big[ \big( B_{s,t} - B^n_{s,t} \big)^2 \Big] = O\Big( \frac{1}{n} \Big) .
\]

Conclude that, for any θ < 1/2 − α

\[
\Big| \|B - B^n\|_{\alpha} + \|\mathbb{B} - \mathbb{B}^n\|_{2\alpha} \Big|_{L^q} = O\Big( \frac{1}{n^{\theta}} \Big) .
\]

Use a Borel–Cantelli argument to show that, also for any θ < 1/2 − α,

\[
\|B - B^n\|_{\alpha} + \|\mathbb{B} - \mathbb{B}^n\|_{2\alpha} \le C(\omega) \frac{1}{n^{\theta}} .
\]

When $\alpha \in (\frac{1}{3}, \frac{1}{2})$, we can conclude convergence in α-Hölder rough path metric, i.e.

\[
\varrho_\alpha\big( (B, \mathbb{B}), (B^n, \mathbb{B}^n) \big) \to 0 ,
\]

almost surely with rate 1/2 − α − ε for every ε > 0.
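The O(1/n) estimate of this exercise can be explored numerically without simulation, since the relevant second moment is a quadratic form in the Brownian covariance min(u, v). The following Python sketch treats a scalar component on [0, 1] (function names are hypothetical, not from the text) and computes the second moment of the increment of B − Bⁿ exactly.

```python
import numpy as np


def diff_increment_variance(s, t, n):
    """E[(B_{s,t} - B^n_{s,t})^2] for scalar Brownian motion B and its piecewise
    linear interpolation B^n on the mesh {k/n}."""
    def coeffs(r):
        # B_r - B^n_r = B_r - (1 - lam) * B_a - lam * B_b, with a <= r <= b = a + 1/n
        a = np.floor(r * n) / n
        b = a + 1.0 / n
        lam = (r - a) * n
        return [r, a, b], [1.0, -(1.0 - lam), -lam]

    times_t, c_t = coeffs(t)
    times_s, c_s = coeffs(s)
    times = np.array(times_t + times_s)
    c = np.array(c_t + [-x for x in c_s])      # coefficients of D_t - D_s, D = B - B^n
    cov = np.minimum.outer(times, times)       # E[B_u B_v] = min(u, v)
    return float(c @ cov @ c)


def sup_variance(n, grid):
    """Maximum of the second moment over all s <= t in the given grid."""
    return max(diff_increment_variance(s, t, n) for s in grid for t in grid if s <= t)
```

On each mesh interval B − Bⁿ is a Brownian bridge, so one expects the supremum to equal 1/(2n), attained when s and t are midpoints of two different mesh intervals; evaluating `sup_variance` on a grid containing those midpoints is consistent with this.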


Exercise 10.3 Let $(B, \tilde B)$ be a 2-dimensional standard Brownian motion. The (Gaussian)
process given by

\[
X_t = (B_t, B_t + \tilde B_t)
\]

fails to have independent components and yet lifts to a Gaussian rough path. Explain
how and detail the construction.
Exercise 10.4 Assume R(s, t) = K(|t − s|) for some C 2 -function K. (This was
exactly the situation in the above Ornstein–Uhlenbeck case, Example 10.12.) Give a
direct proof that R has finite 2-dimensional 1-variation, more precisely,

∥R∥1-var;[s,t]2 ≤ C|t − s| , ∀0≤s≤t≤T ,

for a constant C which depends on T and K.


Solution. If $(s, t) \mapsto R(s, t) := E[X_s X_t]$ is smooth, the 2-dimensional 1-variation
is given by

\[
\|R\|_{1\text{-var};[0,T]^2} = \int_{[0,T]^2} \big| \partial^2_{s,t} R(s, t) \big| \, ds \, dt .
\]

This remains true when the mixed derivative is a signed measure, which in turn is the
case when R(s, t) = K(|t − s|) for some $C^2$-function K. Indeed, write H and 2δ
for the first and second distributional derivatives of | • |. Formal application of the
chain-rule gives $\partial_t R = K'(|t - s|) H(t - s)$ and then, using |H| ≤ 1 a.s.,

\[
\big| \partial^2_{s,t} R(s, t) \big| \le |K''(|t - s|)| + 2 |K'(|t - s|)| \, \delta(t - s) .
\]

Integration again over $[s, t]^2 \subset [0, T]^2$ yields

\[
\|R\|_{1\text{-var};[s,t]^2} = \int_{[s,t]^2} \big| \partial^2_{u,v} R(u, v) \big| \, du \, dv
\le \big( T |K''|_{\infty} + 2 |K'(0)| \big) |t - s| .
\]

This is easily made rigorous by replacing | • | (and then H, 2δ) by a mollified version,
say $| \bullet |_\varepsilon$ (and $H_\varepsilon$, $2\delta_\varepsilon$), noting that variation norms are lower semicontinuous
under pointwise limits; that is

\[
\|R\|_{1\text{-var};[s,t]^2} \le \liminf_{\varepsilon \to 0} \|R^{\varepsilon}\|_{1\text{-var};[s,t]^2}
\]

whenever $R^{\varepsilon} \to R$ pointwise. To see this, it suffices to take arbitrary dissections
$D = (t_i)$ and $D' = (t'_j)$ of [u, v] and note that

\[
\sum_{i,j} \Big| R \begin{pmatrix} t_{i-1}, t_i \\ t'_{j-1}, t'_j \end{pmatrix} \Big|
= \lim_{\varepsilon \to 0} \sum_{i,j} \Big| R^{\varepsilon} \begin{pmatrix} t_{i-1}, t_i \\ t'_{j-1}, t'_j \end{pmatrix} \Big|
\le \liminf_{\varepsilon \to 0} \|R^{\varepsilon}\|_{1\text{-var};[u,v]^2} .
\]
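The linear bound on the two-dimensional 1-variation can be probed numerically: summing absolute rectangular increments over any grid gives a lower bound on the variation, which must therefore stay below the constant times |t − s|. The sketch below uses the Ornstein–Uhlenbeck kernel K(u) = exp(−cu) as a test case, for which sup|K''| = c² and |K'(0)| = c; the function names and the uniform grid are illustrative choices.

```python
import numpy as np


def grid_one_variation(K, s, t, n):
    """Sum of |rectangular increments| of R(u, v) = K(|v - u|) over the uniform
    n x n grid on [s, t]^2. This is a lower bound for the 2D 1-variation."""
    pts = np.linspace(s, t, n + 1)
    R = K(np.abs(pts[:, None] - pts[None, :]))
    # R(u_{i+1}, v_{j+1}) - R(u_i, v_{j+1}) - R(u_{i+1}, v_j) + R(u_i, v_j)
    inc = R[1:, 1:] - R[:-1, 1:] - R[1:, :-1] + R[:-1, :-1]
    return float(np.abs(inc).sum())


def ou_bound(c, T, s, t):
    """(T * sup|K''| + 2 |K'(0)|) * |t - s| for K(u) = exp(-c u)."""
    return (T * c ** 2 + 2.0 * c) * abs(t - s)
```

Refining the grid can only increase the sum (by the triangle inequality), so doubling n gives a monotone sequence of lower bounds that remains below `ou_bound`.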

Exercise 10.5 Assume $X = (X^1, \dots, X^d)$ is a centred, continuous Gaussian process
with independent components.

(i) Assume covariance of finite ϱ-variation with ϱ < 2. Show that each component
$X = X^i$, for i = 1, . . . , d, has almost surely vanishing compensated quadratic
variation on [0, T] by which we mean

\[
\lim_{n \to \infty} \sum_{[s,t] \in \mathcal{P}_n} \big( X_{s,t}^2 - E(X_{s,t}^2) \big) = 0 ,
\]

in probability (and $L^q$, any q < ∞) for any sequence of partitions $(\mathcal{P}_n)$ of [0, T]
with mesh $|\mathcal{P}_n| \to 0$.
(ii) Under the assumptions of (i), show that there exists $(\mathcal{P}_n)$ with $|\mathcal{P}_n| \to 0$ so
that, with probability one, the quadratic (co)variation $[X^i, X^j]$, in the sense of
Definition 5.10, vanishes, for any $i \ne j$, with i, j ∈ {1, . . . , d}.
Conclude that, with regard to Theorem 10.4, the off-diagonal elements $\mathbb{X}^{i,j}_{s,t}$,
defined as the $L^2$ limit of left-point Riemann–Stieltjes sums, could have been
equivalently defined via mid- or right-point Riemann sums.
(iii) Assume ϱ = 1. Show that, for all i = 1, . . . , d, there exists a sequence $(\mathcal{P}_n)$ with
mesh $|\mathcal{P}_n| \to 0$ so that, with probability one, the quadratic variation $[X^i, X^i]$,
in the sense of Definition 5.10, exists and equals

\[
[\![ X^i ]\!]_t := \lim_{\varepsilon \to 0} \sup_{|\mathcal{P}| < \varepsilon} \sum_{\substack{[u,v] \in \mathcal{P} \\ u < t}} E\big( X_{u,v}^2 \big) .
\]

Discuss the possibility of lifting X to a (random) non-geometric rough path,
similar to the Itô-lift of Brownian motion.
(iv) Consider the case of a zero-mean, stationary Gaussian process on [0, 2π] with
i.i.d. components, each specified by

\[
E(X_{s,t}^2) = \cosh(-\pi) - \cosh(|t - s| - \pi) .
\]

Verify that ϱ = 1 and compute [X]. (This example is related to the stochastic heat
equation, where s, t should be thought of as spatial variables, cf. Lemma 12.30)

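Solution. (i) Using Wick’s formula for the expectation of products of centred
Gaussians, namely

\[
E(ABCD) = E(AB)E(CD) + E(AC)E(BD) + E(AD)E(BC) ,
\]

we obtain the identity

\[
\begin{aligned}
E\Big[ \Big( \sum_{[s,t] \in \mathcal{P}_n} \big( X_{s,t}^2 - E(X_{s,t}^2) \big) \Big)^2 \Big]
&= \sum_{[s,t] \in \mathcal{P}_n} \sum_{[s',t'] \in \mathcal{P}_n} \Big( E\big( X_{s,t}^2 X_{s',t'}^2 \big) - E\big( X_{s,t}^2 \big) E\big( X_{s',t'}^2 \big) \Big) \\
&= \sum_{[s,t] \in \mathcal{P}_n} \sum_{[s',t'] \in \mathcal{P}_n} 2 E(X_{s,t} X_{s',t'}) E(X_{s,t} X_{s',t'}) \\
&= 2 \sum_{[s,t] \in \mathcal{P}_n} \sum_{[s',t'] \in \mathcal{P}_n} \Big| R \begin{pmatrix} s, t \\ s', t' \end{pmatrix} \Big|^2 \\
&\le 2 \sup_{\substack{t - s \le |\mathcal{P}_n| \\ t' - s' \le |\mathcal{P}_n|}} \Big| R \begin{pmatrix} s, t \\ s', t' \end{pmatrix} \Big|^{2 - \varrho} \, \|R\|^{\varrho}_{\varrho\text{-var};[0,T]^2} .
\end{aligned}
\]

Since ϱ < 2 and R is uniformly continuous, the right-hand side converges to 0 as
$|\mathcal{P}_n| \to 0$. This gives $L^2$-convergence and hence convergence in probability.
Convergence in $L^q$ for any q < ∞ follows from general facts on Wiener–Itô chaos.
(ii) Left to the reader.
(iii) We fix i and drop the index. We easily see that (i) holds uniformly on compacts,
say, in the sense that

\[
\sup_{t \in [0,T]} \Big| \sum_{\substack{[u,v] \in \mathcal{P}_n \\ u < t}} \big( X_{u,v}^2 - E(X_{u,v}^2) \big) \Big| \to 0 \quad \text{as } n \to \infty
\]

in probability whenever $|\mathcal{P}_n| \to 0$. On the other hand,

\[
\sup_{|\mathcal{P}| < \varepsilon} \sum_{\substack{[u,v] \in \mathcal{P} \\ u < t}} E\big( X_{u,v}^2 \big) < \infty
\]

thanks to finite 1-variation of the covariance. By monotonicity, the limit as
ε = 1/n → 0 exists, and we call it $[\![X]\!]_t$. Then, along a suitable sequence
$\tilde{\mathcal{P}}_n$,

\[
[\![X]\!]_t = \lim_{n} \sum_{\substack{[u,v] \in \tilde{\mathcal{P}}_n \\ u < t}} E\big( X_{u,v}^2 \big) .
\]

On the other hand, at the price of passing to another subsequence also denoted
by $\tilde{\mathcal{P}}_n$, we have

\[
\sup_{t \in [0,T]} \Big| \sum_{\substack{[u,v] \in \tilde{\mathcal{P}}_n \\ u < t}} \big( X_{u,v}^2 - E(X_{u,v}^2) \big) \Big| \to 0 \quad \text{almost surely,}
\]

and so with probability one, and uniformly in t ∈ [0, T],

\[
\sum_{\substack{[u,v] \in \tilde{\mathcal{P}}_n \\ u < t}} X_{u,v}^2 \to [\![X]\!]_t .
\]

(iv) One has $E(X_{s,t}^2) = \cosh(-\pi) - \cosh(|t - s| - \pi) = \sinh(\pi) |t - s| + o(|t - s|)$
and so $[X]_t = t \sinh(\pi)$.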
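The first-order expansion behind part (iv) is simple to confirm numerically. The following Python sketch (illustrative names only) evaluates the variance function of part (iv), checks that its slope at zero is sinh(π), and sums expected squared increments over a uniform partition of [0, t] to approximate the bracket.

```python
import math


def second_moment(u):
    """E[X_{s,t}^2] for |t - s| = u in the example of part (iv)."""
    return math.cosh(-math.pi) - math.cosh(abs(u) - math.pi)


def expected_quadratic_variation(t, n):
    """Sum of E[X_{u,v}^2] over the uniform n-step partition of [0, t]."""
    return n * second_moment(t / n)
```

As n grows, `expected_quadratic_variation(t, n)` approaches t·sinh(π), consistent with the stated value of the bracket.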

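Exercise 10.6 Assume finite 1-variation of the covariance (as e.g. defined in (10.5))
of a zero-mean Gaussian process X and $E[X_{t,t+h}^2] = f(t) h + o(h)$ as h ↓ 0, for
some f ∈ C([0, T], R). Show that, for every smooth test function φ,

\[
\int_0^T \varphi(t) \frac{X_{t,t+h}^2}{h} \, dt \to \int_0^T \varphi(t) f(t) \, dt \quad \text{as } h \to 0 ,
\]

where the convergence takes place in $L^q$ for any q < ∞ (and hence also in probability).
Solution. Since all types of $L^q$-convergence are equivalent on the finite Wiener–Itô
chaos (here we only need the chaos up to level 2), it suffices to consider q = 2. A
dissection $(t_k)$ of [0, T] is given by $t_k = kh \wedge T$. We have

\[
\sum_k \frac{1}{h} \int_{t_k}^{t_{k+1}} \varphi(t) X_{t,t+h}^2 \, dt
= \int_0^1 d\theta \sum_k \varphi(t_k + \theta h) X_{t_k + \theta h, t_k + \theta h + h}^2
\equiv \int_0^1 \langle \varphi, \mu_{\theta,h} \rangle \, d\theta ,
\]

where the random measure $\mu_{\theta,h} := \sum_k \delta_{t_k + \theta h} X_{t_k + \theta h, t_k + \theta h + h}^2$ acts on test
functions by integration. It obviously suffices to establish $\langle \varphi, \mu_{\theta,h} \rangle \to \langle \varphi, f \rangle$ in $L^2$,
uniformly in θ ∈ [0, 1]. Define the (random) distribution function of $\mu_{\theta,h}$

\[
F(t) := \mu_{\theta,h}([0, t]) = \sum_{k : t_k + \theta h \le t} X_{t_k + \theta h, t_k + \theta h + h}^2 ,
\]

and also $\bar F(t) = E F(t)$. Note that,

\[
\bar F(t) = \sum_{k : t_k + \theta h \le t} \big( f(t_k + \theta h) h + o(h) \big) \sim \int_0^t f(s) \, ds \quad \text{as } h \downarrow 0 ,
\]

uniformly in θ ∈ [0, 1], t ∈ [0, T]. On the other hand, the Gaussian (or Wick) identity
$E(A^2 B^2) - E(A^2) E(B^2) = 2 (E(AB))^2$, applied with $A = X_{t_k + \theta h, t_k + \theta h + h}$ and
$B = X_{t_j + \theta h, t_j + \theta h + h}$, gives

\[
\begin{aligned}
E\big| F(t) - \bar F(t) \big|^2 &= E F^2(t) - \bar F^2(t) \\
&= 2 \sum_{\substack{k : t_k + \theta h \le t \\ j : t_j + \theta h \le t}} \Big| R_X \begin{pmatrix} t_k + \theta h, t_k + \theta h + h \\ t_j + \theta h, t_j + \theta h + h \end{pmatrix} \Big|^2 \\
&\lesssim \operatorname{osc}\big( R^{2 - \varrho}; h \big) \to 0 \quad \text{as } h \to 0 ,
\end{aligned}
\]

uniformly in θ ∈ [0, 1], t ∈ [0, T]. It follows that

\[
F(t) = \mu_{\theta,h}([0, t]) \to \int_0^t f(s) \, ds
\]

in $L^2$, again uniformly in t and θ. Now, for fixed smooth φ, one has the bound

\[
\begin{aligned}
\Big| \int \varphi(t) \mu_{\theta,h}(dt) - \int \varphi(t) f(t) \, dt \Big|^2
&= \Big| \int \Big( \int_0^t f(s) \, ds - \mu_{\theta,h}([0, t]) \Big) \dot\varphi(t) \, dt \Big|^2 \\
&\lesssim \int_0^T \Big| \int_0^t f(s) \, ds - \mu_{\theta,h}([0, t]) \Big|^2 dt
\end{aligned}
\]

and so

\[
E \Big| \int \varphi(t) \mu_{\theta,h}(dt) - \int \varphi(t) f(t) \, dt \Big|^2
\lesssim \int_0^T E \Big| \int_0^t f(s) \, ds - \mu_{\theta,h}([0, t]) \Big|^2 dt .
\]

This expression converges to 0 as h → 0, uniformly in θ, thus completing the proof.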

10.5 Comments

Classes of Gaussian processes which admit (canonical) lifts to random rough paths
were first studied by Coutin–Qian [CQ02], with focus on fBm with Hurst parameter
H > 1/4. Ledoux, Qian and Zhang [LQZ02] used Gaussian techniques to establish
large deviation and support results for the Brownian rough path; extensions to fractional
Brownian motion were investigated by Millet–Sanz-Solé [MSS06], Feyel and de
la Pradelle [FdLP06], Friz–Victoir [FV07, FV06a]. When H ≤ 1/4, there is no
canonical rough path lift: as noted in [CQ02], the L2 -norm of the area associated to
piecewise linear approximations to fBm diverges. See however the works of Unter-
berger and then Nualart–Tindel [Unt10, NT11]. Parameter estimation for fractional
SDEs via rough paths is studied in Papavasiliou–Ladroue [PL11], see also [DFM16].
The notion of two-dimensional ϱ-variation of the covariance, as adopted in this
chapter, is due to Friz–Victoir, [FV10a], [FV10b, Ch.15], [FV11], and allows for
an elegant and general construction of Gaussian rough paths. It also leads naturally
to useful Cameron–Martin embeddings, see Section 11.1. If restricted to the “diag-
onal”, ϱ-variation of the covariance relates to a classical criterion of Jain–Monrad
[JM83]. The question remains how one checks finite ϱ-variation when faced with a
non-trivial (and even non-explicit, e.g. given as Fourier series) covariance function.
A general criterion based on a certain covariance measure structure (reminiscent of
Kruk, Russo and Tudor [KRT07]) was recently given by Friz, Gess, Gulisashvili
and Riedel [FGGR16], a special case of which is the “concavity criterion” of Theorem 10.9.
Cass–Lim establish a Stratonovich–Skorohod integral formula for Gaussian
rough paths. Multi-level Monte Carlo for Gaussian RDEs is analysed by Bayer et
al. [BFRS16]. Bailleul, Riedel and Scheutzow [BRS17] show that random RDEs
driven by suitable Gaussian rough paths constitute random dynamical systems. It is
interesting to note that many key results for Gaussian rough paths (tail estimate, sup-
port, densities, . . .) can be shown with different tools to hold in a Markovian setting
[CO17, CO18], using the framework of Markovian rough paths [FV08c, FV10b].
Chapter 11
Cameron–Martin regularity and applications

A continuous Gaussian process gives rise to a Gaussian measure on path-space.


Thanks to variation regularity properties of Cameron–Martin paths, powerful tools
from the analysis on Gaussian spaces become available. A general Fernique type
theorem leads us to integrability properties of rough integrals with Gaussian integrator
akin to those of classical stochastic integrals. We then discuss Malliavin calculus for
differential equations driven by Gaussian rough paths. As an application, a version of
Hörmander’s theorem in this non-Markovian setting is established.

11.1 Complementary Young regularity

Although we have chosen to introduce (rough) paths subject to α-Hölder regularity,


the arguments are not difficult to adapt to continuous paths with finite p-variation
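with p = 1/α ∈ [1, ∞). Recall that $C^{p\text{-var}}([0, T], \mathbf{R}^d)$ is the space of continuous
paths $X : [0, T] \to \mathbf{R}^d$ so that

\[
\|X\|_{p\text{-var};[0,T]} \overset{\text{def}}{=} \Big( \sup_{\mathcal{P}} \sum_{[s,t] \in \mathcal{P}} |X_{s,t}|^p \Big)^{\frac{1}{p}} < \infty , \tag{11.1}
\]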

with supremum taken over all partitions of [0, T ] and this constitutes a seminorm
on C p-var . The 1-variation (p = 1) of such a path is of course nothing but its length,
possibly +∞. Hölder regularity implies variation regularity: one has the immediate estimate

\[
\|X\|_{p\text{-var};[0,T]} \le T^{\alpha} \|X\|_{\alpha;[0,T]} .
\]

Conversely, a time-change renders p-variation paths Hölder continuous with exponent


α = 1/p. Given two paths $X \in C^{p\text{-var}}([0, T], \mathbf{R}^d)$, $h \in C^{q\text{-var}}([0, T], \mathbf{R}^d)$ let us say
that they enjoy complementary Young regularity if Young’s condition

\[
\frac{1}{p} + \frac{1}{q} > 1 , \tag{11.2}
\]

is satisfied.
We are now interested in the regularity of Cameron–Martin paths. As in the
last section, X is an $\mathbf{R}^d$-valued, continuous and centred Gaussian process on [0, T],
realised as $X(\omega) = \omega \in C([0, T], \mathbf{R}^d)$, a Banach space under the uniform norm,
equipped with a Gaussian measure. General principles of Gaussian measures on
(separable) Banach spaces thus apply, see e.g. [Led96]. Specialising to the situation
at hand, the associated Cameron–Martin space $\mathcal{H} \subset C([0, T], \mathbf{R}^d)$ consists of paths
$t \mapsto h_t = E(Z X_t)$ where $Z \in \mathcal{W}^1$ is an element in the so-called first Wiener
chaos, the $L^2$-closure of $\operatorname{span}\{X^i_t : t \in [0, T], 1 \le i \le d\}$, consisting of Gaussian
random variables. We recall that if $h' = E(Z' X_\cdot)$ denotes another element in $\mathcal{H}$, the
inner product $\langle h, h' \rangle_{\mathcal{H}} = E(Z Z')$ makes $\mathcal{H}$ a Hilbert space; $Z \mapsto h$ is an isometry
between $\mathcal{W}^1$ and $\mathcal{H}$.
Example 11.1. (Brownian motion). Let B be a d-dimensional Brownian motion, let
$g \in L^2([0, T], \mathbf{R}^d)$, and set

\[
Z = \sum_{i=1}^{d} \int_0^T g^i_s \, dB^i_s \equiv \int_0^T \langle g, dB \rangle .
\]

By Itô’s isometry, $h^i_t := E(Z B^i_t) = \int_0^t g^i_s \, ds$ so that $\dot h = g$ and
$\|h\|^2_{\mathcal{H}} := E(Z^2) = \int_0^T |g_s|^2 \, ds = \|\dot h\|^2_{L^2}$, where | • | denotes Euclidean norm on $\mathbf{R}^d$.
Clearly, h is of finite 1-variation, and its length is given by $\|\dot h\|_{L^1}$. On the other hand,
Cauchy–Schwarz shows any $h \in \mathcal{H}$ is 1/2-Hölder which, in general, “only” implies 2-variation.
The proposition below applies to Brownian motion with ϱ = 1, also recalling that
$\|R\|_{1;[s,t]^2} = |t - s|$ in the Brownian motion case.

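Proposition 11.2. Assume the covariance $R : (s, t) \mapsto E(X_s \otimes X_t)$ is of finite
ϱ-variation (in 2D sense) for ϱ ∈ [1, ∞). Then $\mathcal{H}$ is continuously embedded in the
space of continuous paths of finite ϱ-variation. More precisely, for all $h \in \mathcal{H}$ and all
s < t in [0, T],

\[
\|h\|_{\varrho\text{-var};[s,t]} \le \|h\|_{\mathcal{H}} \sqrt{ \|R\|_{\varrho\text{-var};[s,t]^2} } .
\]

Proof. We assume X, h to be scalar, the extension to d-dimensional X is straightforward
(and even trivial when X has independent components, which will always be
the case for us). Setting $h = E(Z X_\cdot)$, we may assume without loss of generality (by
scaling), that $\|h\|^2_{\mathcal{H}} := E(Z^2) = 1$. Let $(t_j)$ be a dissection of [s, t]. Let ϱ′ be the
Hölder conjugate of ϱ. Using duality for $l^{\varrho}$-spaces, we have¹

\[
\begin{aligned}
\Big( \sum_j \big| h_{t_j,t_{j+1}} \big|^{\varrho} \Big)^{1/\varrho}
&= \sup_{\beta, |\beta|_{l^{\varrho'}} \le 1} \sum_j \big\langle \beta_j, h_{t_j,t_{j+1}} \big\rangle \\
&= \sup_{\beta, |\beta|_{l^{\varrho'}} \le 1} E\Big( Z \sum_j \big\langle \beta_j, X_{t_j,t_{j+1}} \big\rangle \Big) \\
&\le \sup_{\beta, |\beta|_{l^{\varrho'}} \le 1} \sqrt{ \sum_{j,k} \big\langle \beta_j \otimes \beta_k, E\big( X_{t_j,t_{j+1}} \otimes X_{t_k,t_{k+1}} \big) \big\rangle } \\
&\le \sup_{\beta, |\beta|_{l^{\varrho'}} \le 1} \sqrt{ \Big( \sum_{j,k} |\beta_j|^{\varrho'} |\beta_k|^{\varrho'} \Big)^{\frac{1}{\varrho'}} \Big( \sum_{j,k} \big| E\big( X_{t_j,t_{j+1}} \otimes X_{t_k,t_{k+1}} \big) \big|^{\varrho} \Big)^{\frac{1}{\varrho}} } \\
&\le \Big( \sum_{j,k} \big| E\big( X_{t_j,t_{j+1}} \otimes X_{t_k,t_{k+1}} \big) \big|^{\varrho} \Big)^{1/(2\varrho)}
\le \sqrt{ \|R\|_{\varrho\text{-var};[s,t]^2} } .
\end{aligned}
\]

¹ The case ϱ = 1 may be seen directly by taking $\beta_j = \operatorname{sgn}(h_{t_j,t_{j+1}})$.

The proof is then completed by taking the supremum over all dissections $(t_j)$ of [s, t].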
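In the Brownian case of Example 11.1 (ϱ = 1, where the 2D 1-variation of the covariance over [s, t]² equals |t − s|), the embedding reduces to the bound that the 1-variation of h on [s, t] is at most the Cameron–Martin norm of h times the square root of |t − s|. The sketch below checks this on a grid for a piecewise constant density g; the discretisation and all names are assumptions made for illustration.

```python
import numpy as np


def cameron_martin_path(g, T):
    """Grid values of h_t = int_0^t g_s ds and the norm ||h||_H = ||g||_{L^2([0,T])},
    for a scalar density g that is piecewise constant on a uniform grid."""
    g = np.asarray(g, dtype=float)
    dt = T / len(g)
    h = np.concatenate(([0.0], np.cumsum(g) * dt))
    norm_H = float(np.sqrt(np.sum(g ** 2) * dt))
    return h, norm_H, dt


def embedding_holds(g, T, tol=1e-12):
    """Check ||h||_{1-var;[t_i,t_j]} <= ||h||_H * sqrt(t_j - t_i) for all grid pairs."""
    h, norm_H, dt = cameron_martin_path(g, T)
    steps = np.abs(np.diff(h))
    cum = np.concatenate(([0.0], np.cumsum(steps)))  # cum[j] - cum[i]: 1-variation on [t_i, t_j]
    n = len(steps)
    for i in range(n):
        for j in range(i + 1, n + 1):
            if cum[j] - cum[i] > norm_H * np.sqrt((j - i) * dt) + tol:
                return False
    return True
```

In this discrete setting the inequality is just Cauchy–Schwarz applied to the sum of |g| over the grid cells in [s, t].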

Remark 11.3. It is typical (e.g. for Brownian or fractional Brownian motion, with
ϱ = 1/(2H) ≥ 1) that

\[
\forall s < t \text{ in } [0, T] : \quad \|R\|_{\varrho\text{-var};[s,t]^2} \le M |t - s|^{1/\varrho} .
\]

In such a situation, Proposition 11.2 implies that

\[
|h_{s,t}| \le \|h\|_{\varrho\text{-var};[s,t]} \le \|h\|_{\mathcal{H}} M^{1/2} |t - s|^{1/(2\varrho)} ,
\]

which tells us that $\mathcal{H}$ is continuously embedded in the space of 1/(2ϱ)-Hölder
continuous paths (which can also be seen directly from $h_{s,t} = E(Z X_{s,t})$ and
Cauchy–Schwarz). The point is that 1/(2ϱ)-Hölder only implies 2ϱ-variation regularity, in
contrast to the sharper result of Proposition 11.2.

In part i) of the following lemma we allow $\mathbf{X} = (X, \mathbb{X})$ to be a (continuous) rough
path of finite p-variation rather than of α-Hölder regularity. More formally, we write
$\mathbf{X} \in \mathcal{C}^{p\text{-var}}([0, T], \mathbf{R}^d)$ when p ∈ [2, 3) and the analytic Hölder type condition (2.3)
in the definition of a rough path is replaced by $\|X\|_{p\text{-var};[0,T]} < \infty$ and the second
order regularity condition

\[
\|\mathbb{X}\|_{p/2\text{-var};[0,T]} \overset{\text{def}}{=} \Big( \sup_{\mathcal{P}} \sum_{[s,t] \in \mathcal{P}} |\mathbb{X}_{s,t}|^{p/2} \Big)^{2/p} < \infty . \tag{11.3}
\]

(As before, we shall drop [0, T] from our notation whenever the time horizon is
fixed.) The homogeneous p-variation rough path norm (over [0, T]) is then given by

\[
|||\mathbf{X}|||_{p\text{-var};[0,T]} \overset{\text{def}}{=} |||\mathbf{X}|||_{p\text{-var}} = \|X\|_{p\text{-var}} + \sqrt{ \|\mathbb{X}\|_{p/2\text{-var}} } . \tag{11.4}
\]

Of course, a geometric rough path of finite p-variation, $\mathbf{X} \in \mathcal{C}_g^{p\text{-var}}$, is one for which
the “first order calculus” condition (2.6) holds.
The following results will prove crucial in Section 11.2 where we will derive,
based on the Gaussian isoperimetric inequality, good probabilistic estimates on
Gaussian rough path objects. They are equally crucial for developing the Malliavin
calculus for (Gaussian) rough differential equations in Section 11.3.
Recall from Exercise 2.15 that the translation of a rough path $\mathbf{X} = (X, \mathbb{X})$ in
direction h is given by

\[
T_h(\mathbf{X}) \overset{\text{def}}{=} \big( X^h, \mathbb{X}^h \big) \tag{11.5}
\]

where $X^h := X + h$ and

\[
\mathbb{X}^h_{s,t} := \mathbb{X}_{s,t} + \int_s^t h_{s,r} \otimes dX_r + \int_s^t X_{s,r} \otimes dh_r + \int_s^t h_{s,r} \otimes dh_r , \tag{11.6}
\]

provided that h is sufficiently regular to make the final three integrals above
well-defined.
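At the level of Riemann sums, the translation formula (11.6) is nothing but bilinearity of the iterated sum in its two arguments. The following Python sketch illustrates this for discretely sampled paths, with left-point sums standing in for the integrals; this discretisation and the function names are assumptions made for illustration, and the genuinely rough case requires the Young integrals discussed in the text.

```python
import numpy as np


def second_level(x, y):
    """Left-point sum  sum_k (x_{t_k} - x_{t_0}) (x) (y_{t_{k+1}} - y_{t_k}),
    for paths given as arrays of shape (n + 1, d); returns a d x d matrix."""
    x0 = x[:-1] - x[0]          # increments x_{t_0, t_k}
    dy = np.diff(y, axis=0)     # increments y_{t_k, t_{k+1}}
    return x0.T @ dy


def translated_second_level(x, h):
    """Right-hand side of the translation formula, with all integrals replaced by
    the corresponding left-point sums over the sampling grid."""
    return (second_level(x, x) + second_level(h, x)
            + second_level(x, h) + second_level(h, h))
```

Comparing `translated_second_level(x, h)` with `second_level(x + h, x + h)` reproduces, for sampled paths, the partition-level identity that the translation formula passes to the limit.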
Lemma 11.4. i) Let $\mathbf{X} \in \mathcal{C}_g^{p\text{-var}}([0, T], \mathbf{R}^d)$, with p ∈ [2, 3) and consider a function
$h \in C^{q\text{-var}}([0, T], \mathbf{R}^d)$ with complementary Young regularity in the sense
that

\[
1/p + 1/q > 1 .
\]

Then the translation of $\mathbf{X}$ in direction h is well-defined in the sense that the
integrals appearing in (11.6) are well-defined Young integrals and $T_h : \mathbf{X} \mapsto T_h(\mathbf{X})$
maps $\mathcal{C}_g^{p\text{-var}}([0, T], \mathbf{R}^d)$ into itself. Moreover, one has the estimate, for
some constant C = C(p, q),

\[
|||T_h(\mathbf{X})|||_{p\text{-var}} \le C \big( |||\mathbf{X}|||_{p\text{-var}} + \|h\|_{q\text{-var}} \big) .
\]

ii) Similarly, let $\alpha = 1/p \in (\frac{1}{3}, \frac{1}{2}]$, $\mathbf{X} \in \mathcal{C}_g^{\alpha}([0, T], \mathbf{R}^d)$ and $h : [0, T] \to \mathbf{R}^d$ again
of complementary Young regularity, but now “respectful” of α-Hölder regularity
in the sense that²

\[
\|h\|_{q\text{-var};[s,t]} \le K |t - s|^{\alpha} , \tag{11.7}
\]

uniformly in 0 ≤ s < t ≤ T. Write $\|h\|_{q,\alpha}$ for the smallest constant K in the
bound (11.7). Then again $T_h$ is well-defined and now maps $\mathcal{C}_g^{\alpha}([0, T], \mathbf{R}^d)$ into
itself. Moreover, one has the estimate, again with C = C(p, q),

\[
|||T_h(\mathbf{X})|||_{\alpha} \le C \big( |||\mathbf{X}|||_{\alpha} + \|h\|_{q,\alpha} \big) .
\]

² From Remark 11.3, $\|h\|_{\varrho,\alpha} \lesssim \|h\|_{\mathcal{H}}$ for all $\alpha \le \frac{1}{2\varrho}$.

Proof. This is essentially a consequence of Young’s inequality which gives

\[
\Big| \int_s^t h_{s,r} \otimes dX_r \Big| \le C \|h\|_{q\text{-var};[s,t]} \|X\|_{p\text{-var};[s,t]} ,
\]

and then similar estimates for the other (Young) integrals appearing in the definition
of $\mathbb{X}^h$. One then uses elementary estimates of the form $\sqrt{ab} \le a + b$ (for non-negative
reals a, b), in view of the definition of homogeneous norm (which involves $\mathbb{X}^h$ with a
square root). Details are left to the reader. □
By combining the Cameron–Martin regularity established in Proposition 11.2, see
also Remark 11.3, with the previous lemma we obtain the following result.

Theorem 11.5. Assume (Xt : 0 ≤ t ≤ T ) is a continuous d-dimensional, centred


Gaussian process with independent components and covariance R such that there
exists ϱ ∈ [1, 32 ) and M < ∞ such that for every i ∈ {1, . . . , d} and 0 ≤ s ≤ t ≤ T ,
1/ϱ
∥RX i ∥ϱ-var;[s,t]2 ≤ M |t − s| .

1
Let α ∈ ( 13 , 2ϱ ] and X = (X, X) ∈ C α [0, T ], Rd a.s. be the random Gaussian


rough path constructed in Theorem 10.4. Then there exists a null set N such that for
every ω ∈ N c and every h ∈ H,

Th (X(ω)) = X(ω + h) .

Proof. Note that complementary Young regularity holds, with p = α1 < 3 and
q = ϱ < 32 , as is seen from p1 + 1q > 13 + 32 = 1. As a consequence of Lemma 11.4,
the translation Th (X(ω)) is well-defined whenever X(ω) ∈ C α . The proof requires
a close look at the precise construction of X(ω) = (X(ω), X(ω)) in Theorem 10.4,
using Kolmogorov’s criterion to build a suitable (continuous, and then Hölder) modi-
fication from X restricted to dyadic times. We recall that X(ω) = ω ∈ C([0, T ], Rd ).
Let $N_1$ be the null set of $\omega$ where $X(\omega)$ fails to be of $\alpha$-Hölder (or $p$-variation)
regularity. Note that $\omega \in N_1^c$ implies $\omega + h \in N_1^c$ for all $h \in \mathcal{H}$. By the very
construction of $\mathbb{X}_{s,t}$ as an $L^2$-limit, for fixed $s, t$ there exists a sequence of partitions
$(\mathcal{P}^m)$ of $[s, t]$ such that $\mathbb{X}_{s,t}(\omega) = \lim_m \int_{\mathcal{P}^m} X \otimes dX$ exists for a.e. $\omega$, and we write
$N_{2;s,t}$ for the null set on which this fails. The union of all these, for dyadic
times $s, t$, is again a null set, denoted by $N_2$. Now take $\omega \in N_1^c \cap N_2^c$. For fixed
dyadic $s, t$, consider the aforementioned partitions $(\mathcal{P}^m)$ and note

$$\int_{\mathcal{P}^m} X(\omega + h) \otimes dX(\omega + h) = \int_{\mathcal{P}^m} X(\omega) \otimes dX(\omega) + \int_{\mathcal{P}^m} h \otimes dX + \int_{\mathcal{P}^m} X \otimes dh + \int_{\mathcal{P}^m} h \otimes dh.$$

Thanks to $\omega \in N_1^c$ and Proposition 11.2, $X(\omega)$ and $h$ have complementary
Young regularities, which guarantees convergence of the last three integrals to
their respective Young integrals. On the other hand, $\omega \in N_2^c$ guarantees that
$\int_{\mathcal{P}^m} X(\omega) \otimes dX(\omega) \to \mathbb{X}_{s,t}(\omega)$. This shows that the left-hand side converges,
the limit being by definition $\mathbb{X}_{s,t}(\omega + h)$. In other words, for all $\omega \in N_1^c \cap N_2^c$, $h \in \mathcal{H}$
and dyadic times $s, t$,

$$T_h(\mathbf{X}(\omega))_{s,t} = \mathbf{X}(\omega + h)_{s,t}.$$

The construction of $\mathbb{X}_{s,t}$ for non-dyadic times was obtained by continuity (see
Theorem 10.4) and the above almost sure identity remains valid. ⊔⊓
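
The purely algebraic step in this proof, the splitting of a partition sum for $X + h$ into four bilinear pieces, can be checked numerically. The following is a minimal Python sketch, assuming that $\int_{\mathcal{P}} a \otimes db$ stands for the left-point sum $\sum (a_u - a_s) \otimes (b_v - b_u)$ over consecutive partition points $u < v$; the sample data (a scaled random walk standing in for $X$, a smooth curve standing in for $h$) and all function names are hypothetical choices made for illustration only.

```python
import math
import random

def partition_sum(a, b):
    """Left-point sum  sum_k (a[k] - a[0]) (x) (b[k+1] - b[k])  as a d x d matrix.

    a and b are lists of d-dimensional points sampled on the same partition.
    """
    d = len(a[0])
    out = [[0.0] * d for _ in range(d)]
    for k in range(len(a) - 1):
        for i in range(d):
            for j in range(d):
                out[i][j] += (a[k][i] - a[0][i]) * (b[k + 1][j] - b[k][j])
    return out

def mat_add(*mats):
    """Entrywise sum of square matrices of equal size."""
    d = len(mats[0])
    return [[sum(m[i][j] for m in mats) for j in range(d)] for i in range(d)]

random.seed(0)
n, d = 200, 2
# Hypothetical sample data: a scaled random walk for X, a smooth curve for h.
X = [[0.0] * d]
for _ in range(n):
    X.append([x + random.gauss(0.0, 1.0) / math.sqrt(n) for x in X[-1]])
h = [[math.sin(k / n), (k / n) ** 2] for k in range(n + 1)]
Xh = [[x + y for x, y in zip(p, q)] for p, q in zip(X, h)]

lhs = partition_sum(Xh, Xh)
rhs = mat_add(partition_sum(X, X), partition_sum(h, X),
              partition_sum(X, h), partition_sum(h, h))
max_gap = max(abs(lhs[i][j] - rhs[i][j]) for i in range(d) for j in range(d))
print(max_gap)  # only floating-point rounding separates the two sides
```

The identity holds for every fixed partition by bilinearity; the analytic content of the proof lies entirely in the convergence of each of the four terms as the mesh is refined.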


11.2 Concentration of measure

11.2.1 Borell’s inequality

Let us first recall a remarkable isoperimetric inequality for Gaussian measures.


Following [Led96], we state it in the form due to C. Borell [Bor75], but an essentially
equivalent result was obtained independently by Sudakov and Tsirelson [ST78].
In order to state things in their natural generality, we consider in this section an
abstract Wiener space $(E, \mathcal{H}, \mu)$. The reader may have in mind the Banach space
$E = C([0, T], \mathbf{R}^d)$, equipped with norm $\|x\|_E := \sup_{0 \le t \le T} |x_t|$ and a Gaussian
measure $\mu$, the law of a $d$-dimensional, continuous centred Gaussian process $X$. In
this example, the Cameron–Martin space is given by $\mathcal{H} = \{E(X_\cdot Z) : Z \in \mathcal{W}^1\}$ with
$\|h\|_{\mathcal{H}} = (E Z^2)^{1/2}$ for $h = E(X_\cdot Z)$. Let us write

$$\Phi(y) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{y} e^{-x^2/2}\,dx$$

for the cumulative distribution function of a standard Gaussian, noting the elementary
tail estimate

$$\bar{\Phi}(y) := 1 - \Phi(y) \le \exp(-y^2/2), \qquad y \ge 0.$$


Theorem 11.6 (Borell’s inequality). Let $(E, \mathcal{H}, \mu)$ be an abstract Wiener space and
$A \subset E$ a measurable Borel set with $\mu(A) > 0$ so that

$$\hat{a} := \Phi^{-1}(\mu(A)) \in (-\infty, \infty].$$

Then, if $K$ denotes the unit ball in $\mathcal{H}$, for every $r \ge 0$,

$$\mu\big((A + rK)^c\big) \le \bar{\Phi}(\hat{a} + r),$$

where $A + rK = \{x + rh : x \in A, h \in K\}$ is the so-called Minkowski sum.3

Theorem 11.7 (Generalised Fernique Theorem). Let a, σ ∈ (0, ∞) and consider


measurable maps f, g : E → [0, ∞] such that

Aa = {x : g(x) ≤ a}

has (strictly) positive µ measure4 and set

â := Φ−1 (µ(Aa )) ∈ (−∞, ∞].

Assume furthermore that there exists a null-set N such that for all x ∈ N c , h ∈ H :

f (x) ≤ g(x − h) + σ∥h∥H . (11.8)


3 Measurability is a delicate matter but circumventable by reading µ as outer measure; see [Led96].
4 Unless g = +∞ almost surely, this holds true for sufficiently large a.

Then f has a Gaussian tail. More precisely, for all r > a and with ā := â − a/σ,

µ({x : f (x) > r}) ≤ Φ̄(ā + r/σ).

Proof. Note that $\mu(A_a) > 0$ implies $\hat{a} = \Phi^{-1}(\mu(A_a)) > -\infty$. We have for all
$x \notin N$ and arbitrary $r, M > 0$ and $h \in rK$,

$$\begin{aligned}
\{x : f(x) \le M\} &\supset \{x : g(x - h) + \sigma \|h\|_{\mathcal{H}} \le M\} \\
&\supset \{x : g(x - h) + \sigma r \le M\} \\
&= \{x + h : g(x) \le M - \sigma r\}.
\end{aligned}$$

Since $h \in rK$ was arbitrary, this immediately implies the inclusion

$$\{x : f(x) \le M\} \supset \bigcup_{h \in rK} \{x + h : g(x) \le M - \sigma r\} = \{x : g(x) \le M - \sigma r\} + rK,$$

and we see that

$$\mu(f(x) \le M) \ge \mu\big(\{x : g(x) \le M - \sigma r\} + rK\big).$$

Setting $M = \sigma r + a$ and $A := \{x : g(x) \le a\}$, it then follows from Borell’s
inequality that

$$\mu(f(x) > \sigma r + a) \le \mu\big((A + rK)^c\big) \le \bar{\Phi}(\hat{a} + r).$$

It then suffices to rewrite the estimate in terms of $\tilde{r} := \sigma r + a > a$, noting that
$\hat{a} + r = \bar{a} + \tilde{r}/\sigma$. ⊔⊓

Example 11.8 (Classical Fernique estimate). Take f (x) = g(x) = ∥x∥E . Then the
assumptions of the generalised Fernique Theorem are satisfied with σ equal to the
operator norm of the continuous embedding $\mathcal{H} \hookrightarrow E$. This applies in particular to
Wiener measure on $C([0, T], \mathbf{R}^d)$.
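
To get a feeling for the size of the bound in the generalised Fernique Theorem, one can evaluate $\bar{\Phi}(\bar{a} + r/\sigma)$ numerically. The following is a minimal Python sketch using only the standard library; the function names and the input values ($\mu(A_a) = 1/2$, $a = \sigma = 1$, $r = 3$) are hypothetical, chosen for illustration rather than derived from a particular Gaussian measure.

```python
import math
from statistics import NormalDist

def gaussian_tail(y):
    """Phi_bar(y) = 1 - Phi(y), computed via the complementary error function."""
    return 0.5 * math.erfc(y / math.sqrt(2.0))

def fernique_tail_bound(mu_A, a, sigma, r):
    """Upper bound Phi_bar(a_bar + r / sigma) on mu(f > r), stated for r > a.

    mu_A plays the role of mu(A_a); this sketch excludes the degenerate
    case mu_A = 1 (where a_hat = +infinity and the bound is zero).
    """
    if not (0.0 < mu_A < 1.0):
        raise ValueError("this sketch needs 0 < mu_A < 1")
    if r <= a:
        raise ValueError("the bound is stated for r > a")
    a_hat = NormalDist().inv_cdf(mu_A)   # a_hat = Phi^{-1}(mu(A_a))
    a_bar = a_hat - a / sigma
    return gaussian_tail(a_bar + r / sigma)

# Hypothetical inputs: mu(A_a) = 1/2 gives a_hat = 0, so the bound is Phi_bar(2).
bound = fernique_tail_bound(mu_A=0.5, a=1.0, sigma=1.0, r=3.0)

# Sanity check of the elementary estimate Phi_bar(y) <= exp(-y^2 / 2) on a grid.
tail_ok = all(gaussian_tail(0.1 * k) <= math.exp(-(0.1 * k) ** 2 / 2.0)
              for k in range(81))
print(bound, tail_ok)
```

Increasing $\mu(A_a)$ raises $\hat{a}$ and hence sharpens the bound, which is why uniform lower bounds on $\mu(A_a)$ translate into uniform Gaussian tails.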

11.2.2 Fernique theorem for Gaussian rough paths

Theorem 11.9. Let $(X_t : 0 \le t \le T)$ be a $d$-dimensional, centred Gaussian process
with independent components and covariance $R$ such that there exists $\varrho \in [1, \frac{3}{2})$ and
$M < \infty$ such that for every $i \in \{1, \dots, d\}$ and $0 \le s \le t \le T$,

$$\|R_{X^i}\|_{\varrho\text{-var};[s,t]^2} \le M |t - s|^{1/\varrho}.$$

Then, for any $\alpha \in (\frac{1}{3}, \frac{1}{2\varrho})$, the associated rough path $\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}_g^\alpha$ built in
Theorem 10.4 is such that there exists $\eta = \eta(M, T, \alpha, \varrho)$ with

$$E \exp\big(\eta\, |||\mathbf{X}|||_\alpha^2\big) < \infty. \tag{11.9}$$

Remark 11.10. Recall that the homogeneous “norm” $|||\mathbf{X}|||_\alpha$ was defined in (2.4) as
the sum of $\|X\|_\alpha$ and $\sqrt{\|\mathbb{X}\|_{2\alpha}}$. Since $\mathbb{X}$ is “quadratic” in $X$ (more precisely: in the
second Wiener–Itô chaos), the square root is crucial for the Gaussian estimate (11.9)
to hold.

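Proof. Combining Theorem 11.5 with Lemma 11.4 and Proposition 11.2 shows that
for a.e. $\omega$ and all $h \in \mathcal{H}$,

$$|||\mathbf{X}(\omega)|||_\alpha \le C \big( |||\mathbf{X}(\omega - h)|||_\alpha + M^{1/2} \|h\|_{\mathcal{H}} \big).$$

We can thus apply the generalised Fernique Theorem with $f(\omega) = |||\mathbf{X}|||_\alpha(\omega)$ and
$g(\omega) = C f(\omega)$, noting that $|||\mathbf{X}|||_\alpha(\omega) < \infty$ almost surely implies that

$$A_a \overset{\mathrm{def}}{=} \{x : g(x) \le a\}$$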

has positive probability for a large enough (and in fact, any a > 0 thanks to a
support theorem for Gaussian rough paths, [FV10b]). Gaussian integrability of the
homogeneous rough path norm, for a fixed Gaussian rough path X is thus established.
The claimed uniformity, η = η(M, T, α, ϱ) and not depending on the particular X
under consideration requires an additional argument. We need to make sure that
µ(Aa ) is uniformly positive over all X with given bounds on the parameters (in
particular M, ϱ, a, d); but this is easy, using (10.16),
$$\mu\big(|||\mathbf{X}|||_\alpha \le a\big) \ge 1 - \frac{1}{a^2} E|||\mathbf{X}|||_\alpha^2 \ge 1 - \frac{1}{a^2} C,$$

where $C = C(M, \varrho, \alpha, d)$ and so, say, $a = 2C$ would do. ⊔⊓

11.2.3 Integrability of rough integrals and related topics

The price of a pathwise integration / SDE theory is that all estimates (have to) deal
with the worst possible scenario. To wit, given $\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}_g^\alpha$ and a nice 1-form,
$F \in C_b^2$ say, we had the estimate

$$\Big| \int_0^T F(X)\,d\mathbf{X} \Big| \le C \big( |||\mathbf{X}|||_{\alpha;[0,T]} \vee |||\mathbf{X}|||_{\alpha;[0,T]}^{1/\alpha} \big),$$

where $C$ may depend on $F$, $T$ and $\alpha \in (\frac{1}{3}, \frac{1}{2}]$. In terms of $p$-variation, $p = 1/\alpha$, one
can show similarly, with $|||\mathbf{X}|||_{p\text{-var};[0,T]}$ as introduced earlier, cf. (11.4),

$$\Big| \int_0^T F(X)\,d\mathbf{X} \Big| \le C \big( |||\mathbf{X}|||_{p\text{-var};[0,T]} \vee |||\mathbf{X}|||_{p\text{-var};[0,T]}^{p} \big), \tag{11.10}$$

where $C$ depends on $F$ and $\alpha \in (\frac{1}{3}, \frac{1}{2}]$ but not on $T$, thanks to invariance under
reparametrisation. For the same reason, the integration domain $[0, T]$ in (11.10) may
be replaced by any other interval.

Example 11.11. The estimate (11.10) is sharp, at least when p = 1/α = 2, in the
following sense. Consider the (“pure-area”) rough path given by

$$t \mapsto (0, At), \qquad A = \begin{pmatrix} 0 & c \\ -c & 0 \end{pmatrix},$$

for some c > 0. The homogeneous (p-variation, or α-Hölder) rough path norm here
scales with c1/2 . Hence, the right-hand side of (11.10) scales like c (for c large), as
does the left-hand side which in fact is given by T |DF (0)A|.

The “trouble”, in Brownian ($\varrho = 1$) or worse ($\varrho > 1$) regimes of Gaussian rough
paths is that, despite Gaussian tails of the random variable $|||\mathbf{X}(\omega)|||_\alpha$, established
in Theorem 11.9, the above estimate (11.10) fails to deliver Gaussian, or even
exponential, integrability of the “random” rough integral

$$Z(\omega) \overset{\mathrm{def}}{=} \int_0^T F(X(\omega))\,d\mathbf{X}(\omega),$$

something which is rather straightforward in the context of (Itô or Stratonovich)
stochastic integration against Brownian motion.
As we shall now see, Borell’s inequality, in the manifestation of our generalised
Fernique estimate, allows us to fully close this “gap” between integrability properties.
The key idea, due to Cass–Litterer–Lyons [CLL13], is to define, for a fixed rough path
X of finite homogeneous p-variation in the sense of (11.4), a tailor-made partition5
of [0, T ], say
P = {[τi , τi+1 ] : i = 0, . . . , N }
with the property that for all i < N

|||X|||p-var;[τi ,τi+1 ] = 1,

i.e. for all but the very last interval for which one has |||X|||p-var;[τN ,τN +1 ] ≤ 1. One
can then exploit rough path estimates such as (11.10) on (small) intervals [τi , τi+1 ]
on which estimates are linear in |||X|||p-var ∼ 1. The problem of estimating rough
integrals is thus reduced to estimating N = N (X) and it was a key technical result
in [CLL13] to use Borell’s inequality to establish good (probabilistic) estimates on
N when X = X(ω) is a Gaussian rough path. (Our proof below is different from
[CLL13] and makes good use of the generalised Fernique estimate.)
5 The construction is purely deterministic. Of course, when X = X(ω) is random, then so is the partition.

To formalise this construction, we fix a (1D) control function $w = w(s, t)$, i.e.
a continuous map on $\{0 \le s \le t \le T\}$, super-additive and zero on the
diagonal.6 The canonical example of a control in this context is7

$$w_{\mathbf{X}}(s, t) = |||\mathbf{X}|||_{p\text{-var};[s,t]}^{p}.$$

Thanks to continuity of $w = w_{\mathbf{X}}$ we can then define a partition tailor-made for $\mathbf{X}$
based on eating up unit ($\beta = 1$ below) pieces of $p$-variation as follows. Set
τ0 = 0 , τi+1 = inf {t : w(τi , t) ≥ β, τi < t ≤ T } ∧ T , (11.11)

so that $w(\tau_i, \tau_{i+1}) = \beta$ for all $i < N$, while $w(\tau_N, \tau_{N+1}) \le \beta$, where $N$ is given
by

$$N(w) \equiv N_\beta(w; [0, T]) := \sup\{i \ge 0 : \tau_i < T\}.$$

As an immediate consequence of super-additivity of controls,

$$\beta N_\beta(w; [0, T]) = \sum_{i=0}^{N-1} w(\tau_i, \tau_{i+1}) \le w(0, \tau_N) \le w(0, \tau_{N+1}) = w(0, T).$$
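
The greedy construction (11.11) and the resulting count $N_\beta$ can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: time is restricted to a finite grid (so $w(\tau_i, \tau_{i+1})$ may slightly exceed $\beta$ instead of equalling it), and the linear control $w(s, t) = c\,(t - s)$ is a hypothetical stand-in for $w_{\mathbf{X}}$, whose evaluation would require computing a $p$-variation norm. Function names are likewise hypothetical.

```python
def greedy_partition(w, grid, beta=1.0):
    """Stopping times tau_0 = 0 < tau_1 < ... on a grid, following (11.11).

    tau_{i+1} is the first grid time t > tau_i with w(tau_i, t) >= beta,
    capped at the final time T = grid[-1].
    """
    taus = [0]                       # indices into the grid
    last = len(grid) - 1
    while taus[-1] < last:
        i = taus[-1]
        j = next((j for j in range(i + 1, last + 1)
                  if w(grid[i], grid[j]) >= beta), last)
        taus.append(j)
    return [grid[k] for k in taus]

def count_intervals(w, grid, beta=1.0):
    """N_beta(w; [0, T]) = sup{i >= 0 : tau_i < T}."""
    taus = greedy_partition(w, grid, beta)
    return max(i for i, tau in enumerate(taus) if tau < grid[-1])

# Hypothetical controls: linear in the interval length, hence super-additive.
T = 1.0
grid = [k / 1000 for k in range(1001)]
w_small = lambda s, t: 7.3 * (t - s)
w_large = lambda s, t: 14.6 * (t - s)

N_small = count_intervals(w_small, grid)
N_large = count_intervals(w_large, grid)
print(N_small, N_large)
```

On the grid the first $N$ intervals each carry at least $\beta$ units of the control, so the bound $\beta N_\beta(w; [0, T]) \le w(0, T)$ survives the discretisation, as does monotonicity of $N$ in the control.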

Note also that N is monotone in w, i.e. w ≤ w̃ implies N (w) ≤ N (w̃). At last, let us
set N (X) = N (wX ). The following (purely deterministic) lemma is most naturally
stated in variation regularity.

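Lemma 11.12. Assume $\mathbf{X} \in \mathscr{C}_g^{p\text{-var}}$, $p \in [2, 3)$, and $h \in C^{q\text{-var}}$, $q \ge 1$, of complementary
Young regularity in the sense that $\frac{1}{p} + \frac{1}{q} > 1$. Then there exists $C = C(p, q)$ so
that

$$N_1(\mathbf{X}; [0, T])^{\frac{1}{q}} \le C \Big( |||T_{-h}(\mathbf{X})|||_{p\text{-var};[0,T]}^{\frac{p}{q}} + \|h\|_{q\text{-var};[0,T]} \Big). \tag{11.12}$$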

Proof. (Riedel) It is easy to see that all $N_\beta$, $N_{\beta'}$, with $\beta, \beta' > 0$, are comparable; it
is therefore enough to prove the lemma for some fixed $\beta > 0$.
Given $h \in C^{q\text{-var}}$, $w_h(s, t) = \|h\|_{q\text{-var};[s,t]}^{q}$ is a control and so is $w_h^\theta$ whenever
$\theta \ge 1$. (Noting $1 \le q \le p$, we shall use this fact with $\theta = p/q$.) From Lemma 11.4
we have, for any interval $I$,

$$|||T_h \mathbf{X}|||_{p\text{-var};I} \lesssim |||\mathbf{X}|||_{p\text{-var};I} + \|h\|_{q\text{-var};I}.$$

Raise everything to the $p$th power to see that

$$(s, t) \mapsto |||T_h \mathbf{X}|||_{p\text{-var};[s,t]}^{p} \le C \Big( |||\mathbf{X}|||_{p\text{-var};[s,t]}^{p} + \|h\|_{q\text{-var};[s,t]}^{p} \Big) =: C \tilde{w}(s, t),$$

where $C = C(p, q)$ and $\tilde{w}$ is a control. Choose $\beta = C$. By monotonicity of $N_\beta$ in
the control,

6 Do not confuse a control w with “randomness” ω.
7 Super-additivity, i.e. w(s, t) + w(t, u) ≤ w(s, u) whenever s ≤ t ≤ u, is immediate, but
continuity is non-trivial; see e.g. [FV10b, Prop. 5.8].

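$$N_\beta(T_h \mathbf{X}; [0, T]) \le N_\beta(C \tilde{w}; [0, T]) = N_1(\tilde{w}; [0, T]).$$

By definition, $\tilde{N} := N_1(\tilde{w}; [0, T])$ is the number of consecutive intervals $[\tau_i, \tau_{i+1}]$
for which

$$1 = \tilde{w}(\tau_i, \tau_{i+1}) = |||\mathbf{X}|||_{p\text{-var};[\tau_i,\tau_{i+1}]}^{p} + \|h\|_{q\text{-var};[\tau_i,\tau_{i+1}]}^{p}.$$

Using the manifest estimate $\|h\|_{q\text{-var};[\tau_i,\tau_{i+1}]}^{p} \le 1$ and $q/p \le 1$ we have

$$1 \le |||\mathbf{X}|||_{p\text{-var};[\tau_i,\tau_{i+1}]}^{p} + \|h\|_{q\text{-var};[\tau_i,\tau_{i+1}]}^{q} = w_{\mathbf{X}}(\tau_i, \tau_{i+1}) + w_h(\tau_i, \tau_{i+1})$$

for $0 \le i < \tilde{N}$. Summation over $i$ yields

$$\tilde{N} \le w_{\mathbf{X}}(0, \tau_{\tilde{N}}) + w_h(0, \tau_{\tilde{N}}) \le |||\mathbf{X}|||_{p\text{-var};[0,T]}^{p} + \|h\|_{q\text{-var};[0,T]}^{q}.$$

Combination of these estimates hence shows that

$$N_\beta(T_h \mathbf{X}; [0, T]) \le |||\mathbf{X}|||_{p\text{-var};[0,T]}^{p} + \|h\|_{q\text{-var};[0,T]}^{q}.$$

Replace $\mathbf{X} = T_h T_{-h} \mathbf{X}$ by $T_{-h} \mathbf{X}$ and then use elementary estimates of the type
$(a + b)^{1/q} \le a^{1/q} + b^{1/q}$ for non-negative reals $a, b$, to obtain the claimed estimate
(11.12). ⊔⊓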
The previous lemma, combined with variation regularity of Cameron–Martin
paths (Proposition 11.2) and the generalised Fernique Theorem 11.7 then gives
immediately
Theorem 11.13 (Cass–Litterer–Lyons). Let X = (X, X) ∈ Cgα a.s. be a Gaussian
rough path, as in Theorem 11.9. (In particular, the covariance is assumed to have
finite 2D ϱ-variation.) Then the integer-valued random variable

N (ω) := N1 (X(ω); [0, T ])

has a Weibull tail with shape parameter 2/ϱ (by which we mean that N 1/ϱ has a
Gaussian tail).
Let us quickly illustrate how to use the above estimate.
Corollary 11.14. Let $\mathbf{X}$ be as in the previous theorem and assume $F \in C_b^2$. Then the
random rough integral

$$Z(\omega) \overset{\mathrm{def}}{=} \int_0^T F(X(\omega))\,d\mathbf{X}(\omega)$$

has a Weibull tail with shape parameter $2/\varrho$, by which we mean that $|Z|^{1/\varrho}$ has a
Gaussian tail.
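Proof. Let $(\tau_i)$ be the (random) partition associated to the $p$-variation of $\mathbf{X}(\omega)$ as
defined in (11.11), with $\beta = 1$ and $w = w_{\mathbf{X}}$. Thanks to (11.10) we may estimate

$$\begin{aligned}
\Big| \int_0^T F(X(\omega))\,d\mathbf{X}(\omega) \Big| &\le \sum_{[\tau_i,\tau_{i+1}] \in \mathcal{P}} \Big| \int_{\tau_i}^{\tau_{i+1}} F(X(\omega))\,d\mathbf{X}(\omega) \Big| \\
&\lesssim (N(\omega) + 1) \sup_i \Big( |||\mathbf{X}|||_{p\text{-var};[\tau_i,\tau_{i+1}]} \vee |||\mathbf{X}|||_{p\text{-var};[\tau_i,\tau_{i+1}]}^{p} \Big) \\
&= (N(\omega) + 1),
\end{aligned}$$

where the proportionality constant may depend on $F$, $T$ and $\alpha \in (\frac{1}{3}, \frac{1}{2\varrho})$. The claimed
tail behaviour of $Z$ then follows from that of $N$ established in Theorem 11.13. ⊔⊓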

11.3 Malliavin calculus for rough differential equations

In this section, we assume that the reader is already familiar with the basics of
Malliavin calculus as exposed for example in the monographs [Mal97, Nua06].

11.3.1 Bouleau–Hirsch criterion and Hörmander’s theorem

Consider some abstract Wiener space $(W, \mathcal{H}, \mu)$ and a Wiener functional of the form
$F : W \to \mathbf{R}^e$. In the context of stochastic – or rough – differential equations driven
by Gaussian signals, the Banach space $W$ is of the form $C([0, T], \mathbf{R}^d)$ where $\mu$
describes the statistics of the driving noise. If $F$ denotes the solution to a stochastic
differential equation at some time $t \in (0, T]$, then, in general, $F$ is not a continuous,
let alone Fréchet regular, function of the driving path. However, as we will see in this
section, it can be the case that for $\mu$-almost every $\omega$, the map $\mathcal{H} \ni h \mapsto F(\omega + h)$, i.e.
$F(\omega + \cdot)$ restricted to the Cameron–Martin space $(\mathcal{H}, \langle\cdot,\cdot\rangle)$, is Fréchet differentiable.
(This implies $\mathbb{D}^{1,p}_{\mathrm{loc}}$-regularity, based on the commonly used Shigekawa Sobolev space
$\mathbb{D}^{1,p}$; our notation here follows [Mal97] or [Nua06, Sec. 1.2, 1.3.4].) More precisely,
we introduce the following notion, see for example [Nua06, Sec. 4.1.3]:

Definition 11.15. Given an abstract Wiener space $(W, \mathcal{H}, \mu)$, a random variable
$F : W \to \mathbf{R}$ is said to be continuously $\mathcal{H}$-differentiable, in symbols $F \in C^1_{\mathcal{H}}$, if for
$\mu$-almost every $\omega$, the map

$$\mathcal{H} \ni h \mapsto F(\omega + h)$$

is continuously Fréchet differentiable. A vector-valued random variable is said to be
in $C^1_{\mathcal{H}}$ if this is the case for each of its components. In particular, $\mu$-almost surely,
$DF(\omega) = (DF^1(\omega), \dots, DF^e(\omega))$ is a linear bounded map from $\mathcal{H}$ to $\mathbf{R}^e$.

Given an $\mathbf{R}^e$-valued random variable $F$ in $C^1_{\mathcal{H}}$, we define the Malliavin covariance
matrix

$$M_{ij}(\omega) \overset{\mathrm{def}}{=} \big\langle DF^i(\omega), DF^j(\omega) \big\rangle_{\mathcal{H}}. \tag{11.13}$$

The following well-known criterion of Bouleau–Hirsch, see [BH91, Thm 5.2.2] and
[Nua06, Sec. 1.2, 1.3.4] then provides a condition under which the law of F has a
density with respect to Lebesgue measure:

Theorem 11.16. Let $(W, \mathcal{H}, \mu)$ be an abstract Wiener space and let $F$ be an
$\mathbf{R}^e$-valued random variable in $C^1_{\mathcal{H}}$. If the associated Malliavin matrix $M$ is invertible
$\mu$-almost surely, then the law of $F$ has a density with respect to Lebesgue measure
on $\mathbf{R}^e$.

Remark 11.17. Higher order differentiability, together with control of inverse moments
of M, allows one to strengthen this result to obtain smoothness of this density.

As beautifully explained in his own book [Mal97], Malliavin realised that the
strong solution to the stochastic differential equation
$$dY_t = \sum_{i=1}^{d} V_i(Y_t) \circ dB_t^i, \tag{11.14}$$

started at $Y_0 = y_0 \in \mathbf{R}^e$ and driven along $C^\infty$-bounded vector fields $V_i$ on $\mathbf{R}^e$, gives
rise to a non-degenerate Wiener functional $F = Y_T$, admitting a density with respect
to Lebesgue measure, provided that the vector fields satisfy Hörmander’s famous
“bracket condition” at the starting point $y_0$:

$$\operatorname{Lie}\{V_1, \dots, V_d\}\big|_{y_0} = \mathbf{R}^e. \tag{H}$$

(Here, Lie V denotes the Lie algebra generated by a collection V of smooth vector
fields.) There are many variations on this theme: one can include a drift vector
field (which gives rise to a modified Hörmander condition) and under the same
assumptions one can show that YT admits a smooth density. This result can also
(and was originally, see [Hör67, Koh78]) be obtained by using purely functional
analytic techniques, exploiting the fact that the density solves Kolmogorov’s forward
equation. On the other hand, Malliavin’s approach is purely stochastic and allows one to
go beyond the Markovian / PDE setting. In particular, we will see that it is possible
to replace B by a somewhat generic sufficiently non-degenerate Gaussian process,
with the interpretation of (11.14) as a random RDE driven by some Gaussian rough
path X rather than Brownian motion.
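
The bracket condition (H) is easy to test numerically for concrete vector fields. The following is a minimal Python sketch, not taken from the text: it uses the Heisenberg-type fields $V_1(y) = (1, 0, -y_2/2)$, $V_2(y) = (0, 1, y_1/2)$ on $\mathbf{R}^3$ as a hypothetical example, approximates Jacobians by central finite differences, and adopts the sign convention $[V, W] = DW \cdot V - DV \cdot W$ (an assumption; the span at $y_0$ does not depend on the sign). All function names are illustrative.

```python
def jacobian(V, y, eps=1e-6):
    """Central finite-difference Jacobian DV(y), returned as a list of rows."""
    n = len(y)
    cols = []
    for k in range(n):
        yp, ym = list(y), list(y)
        yp[k] += eps
        ym[k] -= eps
        cols.append([(a - b) / (2 * eps) for a, b in zip(V(yp), V(ym))])
    # cols[k][i] approximates dV_i/dy_k; transpose into rows indexed by i
    return [[cols[k][i] for k in range(n)] for i in range(n)]

def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def lie_bracket(V, W):
    """[V, W](y) = DW(y) V(y) - DV(y) W(y)  (sign convention assumed here)."""
    def bracket(y):
        dv, dw = jacobian(V, y), jacobian(W, y)
        return [a - b for a, b in zip(matvec(dw, V(y)), matvec(dv, W(y)))]
    return bracket

def rank(vectors, tol=1e-8):
    """Rank of a list of vectors via Gaussian elimination with pivoting."""
    rows = [list(v) for v in vectors]
    r = 0
    for c in range(len(rows[0])):
        if r == len(rows):
            break
        p = max(range(r, len(rows)), key=lambda i: abs(rows[i][c]))
        if abs(rows[p][c]) < tol:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

# Hypothetical example: two vector fields on R^3 that are not spanning by themselves.
V1 = lambda y: [1.0, 0.0, -0.5 * y[1]]
V2 = lambda y: [0.0, 1.0, 0.5 * y[0]]
y0 = [0.0, 0.0, 0.0]

rank_fields = rank([V1(y0), V2(y0)])
rank_with_bracket = rank([V1(y0), V2(y0), lie_bracket(V1, V2)(y0)])
print(rank_fields, rank_with_bracket)
```

Here two driving fields in three dimensions satisfy (H) only thanks to the first bracket, which is the typical hypoelliptic (as opposed to elliptic) situation.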

11.3.2 Calculus of variations for ODEs and RDEs

Throughout, we assume that V = (V1 , . . . , Vd ) is a given set of smooth vector fields,


bounded and with bounded derivatives of all orders. In particular, there is a unique
solution flow to the RDE
dY = V (Y ) dX , (11.15)

for any α-Hölder geometric driving rough path X = (X, X) ∈ Cg0,α , which may
be obtained as limit of smooth, or piecewise smooth, paths in α-Hölder rough path
metric. Set p = 1/α. Recall that, thanks to continuity of the Itô–Lyons maps, RDE
solutions are limits of the corresponding ODE solutions.
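The unique RDE solution (11.15) passing through $Y_{t_0} = y_0$ gives rise to the
solution flow $y_0 \mapsto U^{\mathbf{X}}_{t \leftarrow t_0}(y_0) = Y_t$. We call the derivative of the flow with respect
to the starting point the Jacobian and denote it by $J^{\mathbf{X}}_{t \leftarrow t_0}$, so that

$$J^{\mathbf{X}}_{t \leftarrow t_0}\, a = \frac{d}{d\varepsilon} U^{\mathbf{X}}_{t \leftarrow t_0}(y_0 + \varepsilon a) \Big|_{\varepsilon = 0}.$$

We also consider the directional derivative

$$D_h U^{\mathbf{X}}_{t \leftarrow 0} = \frac{d}{d\varepsilon} U^{T_{\varepsilon h} \mathbf{X}}_{t \leftarrow 0} \Big|_{\varepsilon = 0},$$

for any sufficiently smooth path $h : \mathbf{R}_+ \to \mathbf{R}^d$. Recall that the translation operator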
Th was defined in (11.5). In particular, we have seen in Lemma 11.4 that, if X arises
from a smooth path X together with its iterated integrals, then the translated rough
path Th X is nothing but X + h together with its iterated integrals. In the general case,
given h ∈ C q-var of complementary Young regularity, i.e. with 1/p + 1/q > 1, the
translation Th X can be written in terms of X and cross-integrals between X and h.
Suppose for a moment that the rough path $\mathbf{X}$ is the canonical lift of a smooth
$\mathbf{R}^d$-valued path $X$. Then, it is classical to prove that $J^{\mathbf{X}}_{t \leftarrow t_0} = J^{X}_{t \leftarrow t_0}$, where $J^{X}_{t \leftarrow t_0}$
solves the linear ODE

$$dJ^{X}_{t \leftarrow t_0} = \sum_{i=1}^{d} DV_i(Y_t)\, J^{X}_{t \leftarrow t_0}\, dX_t^i, \tag{11.16}$$

and satisfies $J^{X}_{t_2 \leftarrow t_0} = J^{X}_{t_2 \leftarrow t_1} \cdot J^{X}_{t_1 \leftarrow t_0}$. Furthermore, the variation of constants
formula leads to

$$D_h U^{X}_{t \leftarrow 0} = \int_0^t \sum_{i=1}^{d} J^{X}_{t \leftarrow s} V_i(Y_s)\, dh_s^i. \tag{11.17}$$

Similarly, given any smooth vector field $W$, a straightforward application of the
chain rule yields

$$d\big( J^{X}_{0 \leftarrow t} W(Y_t) \big) = \sum_{i=1}^{d} J^{X}_{0 \leftarrow t} [V_i, W](Y_t)\, dX_t^i, \tag{11.18}$$

where $[V, W]$ denotes the Lie bracket between the vector fields $V$ and $W$. All this
extends to the rough path limit without difficulties. For instance, (11.16) can be
interpreted as a linear equation driven by the rough path $\mathbf{X}$, using the fact that
$DV(Y)$ is controlled by $X$ to give meaning to the equation. It is then still the case
that $J^{\mathbf{X}}_{t \leftarrow t_0}$ is the derivative of the flow associated to (11.15) with respect to its initial
condition.
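
The variation of constants formula (11.17) can be sanity-checked in the simplest smooth setting. The following is a minimal Python sketch, assuming $d = e = 1$ and the linear vector field $V(y) = y$, chosen only because the flow is then explicit, $U_{t \leftarrow s}(y) = y \exp(X_t - X_s)$; this field is not bounded and so lies outside the standing assumptions of this section. The driver $X$, the perturbation $h$ and all function names are hypothetical.

```python
import math

def X(t):
    return math.sin(3.0 * t)      # hypothetical smooth driver, X(0) = 0

def h(t):
    return t * t                  # hypothetical smooth perturbation, h(0) = 0

y0, t_end = 2.0, 1.0

def flow(t, s, y, eps=0.0):
    """Explicit solution at time t of dY = Y d(X + eps*h), started from y at time s."""
    return y * math.exp((X(t) - X(s)) + eps * (h(t) - h(s)))

def jacobian(t, s):
    """J_{t<-s}: derivative of the flow in its starting point; solves dJ = J dX."""
    return math.exp(X(t) - X(s))

def duhamel(t, n=2000):
    """Midpoint Riemann-Stieltjes sum for int_0^t J_{t<-s} V(Y_s) dh_s with V(y) = y."""
    total = 0.0
    for k in range(n):
        s0, s1 = t * k / n, t * (k + 1) / n
        mid = 0.5 * (s0 + s1)
        total += jacobian(t, mid) * flow(mid, 0.0, y0) * (h(s1) - h(s0))
    return total

# Left-hand side of (11.17): derivative in eps of the flow driven by X + eps*h.
eps = 1e-6
finite_diff = (flow(t_end, 0.0, y0, eps) - flow(t_end, 0.0, y0, -eps)) / (2 * eps)
rhs = duhamel(t_end)
print(finite_diff, rhs)
```

In this example the integrand $J_{t \leftarrow s} Y_s$ does not depend on $s$, so both sides reduce to $Y_t\,(h_t - h_0)$; for non-linear vector fields the same check would require a numerical ODE solver.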

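Proposition 11.18. Let $\mathbf{X} \in \mathscr{C}_g^{0,\alpha}([0, T], \mathbf{R}^d)$ and $h \in C^{q\text{-var}}([0, T], \mathbf{R}^d)$ with
$\alpha \in (\frac{1}{3}, \frac{1}{2}]$ and complementary Young regularity in the sense that $\alpha + \frac{1}{q} > 1$. Then

$$D_h U^{\mathbf{X}}_{t \leftarrow 0}(y_0) = \int_0^t \sum_{i=1}^{d} J^{\mathbf{X}}_{t \leftarrow s} \big( V_i(U^{\mathbf{X}}_{s \leftarrow 0}) \big)\, dh_s^i \tag{11.19}$$

where the right-hand side is well-defined as Young integral.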


Proof. Both $J^{\mathbf{X}}_{t \leftarrow 0}$ and $D_h U^{\mathbf{X}}_{t \leftarrow 0}$ satisfy (jointly with $U^{\mathbf{X}}_{t \leftarrow 0}$) an RDE driven by $\mathbf{X}$.
This is well known in the ODE case, i.e. when both $X, h$ are smooth (Duhamel’s
principle, variation of constants formula, . . .), and remains valid in the geometric rough
path limit by appealing to continuity of the Itô–Lyons map and continuity properties
of the Young integral. A little care is needed since the resulting vector fields are
not bounded anymore. It suffices to rule out explosion so that the problem can be
localised. The required remark is that $J^{\mathbf{X}}_{t \leftarrow 0}$ also satisfies a linear RDE of the form

$$dJ^{\mathbf{X}}_{t \leftarrow 0} = dM^{\mathbf{X}} \cdot J^{\mathbf{X}}_{t \leftarrow 0}(y_0)$$

and linear RDEs do not explode. ⊔⊓


Consider now an RDE driven by a Gaussian rough path $\mathbf{X} = \mathbf{X}(\omega)$. We now show
that the $\mathbf{R}^e$-valued random variable obtained from solving this random RDE enjoys
$C^1_{\mathcal{H}}$-regularity.

Proposition 11.19. With $\varrho \in [1, \frac{3}{2})$ and $\alpha \in (\frac{1}{3}, \frac{1}{2\varrho})$, let $\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}_g^\alpha$ be a
Gaussian rough path as constructed in Theorem 10.4. For fixed $t \ge 0$, the $\mathbf{R}^e$-valued
random variable

$$\omega \mapsto U^{\mathbf{X}(\omega)}_{t \leftarrow 0}(y_0)$$

is continuously $\mathcal{H}$-differentiable.

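Proof. Recall $h \in \mathcal{H} \subset C^{\varrho\text{-var}}$ so that a.e. $\mathbf{X}(\omega)$ and $h$ enjoy complementary Young
regularity. As a consequence, we saw that the event

$$\{\omega : \mathbf{X}(\omega + h) \equiv T_h \mathbf{X}(\omega) \text{ for all } h \in \mathcal{H}\} \tag{11.20}$$

has full measure. We show that $h \in \mathcal{H} \mapsto U^{\mathbf{X}(\omega + h)}_{t \leftarrow 0}(y_0)$ is continuously Fréchet
differentiable for every $\omega$ in the above set of full measure. By basic facts of Fréchet
theory, it is sufficient to show (a) Gâteaux differentiability and (b) continuity of the
Gâteaux differential.

Ad (a): Using $\mathbf{X}(\omega + g + h) \equiv T_g T_h \mathbf{X}(\omega)$ for $g, h \in \mathcal{H}$ it suffices to show Gâteaux
differentiability of $U^{\mathbf{X}(\omega + \cdot)}_{t \leftarrow 0}(y_0)$ at $0 \in \mathcal{H}$. For fixed $t$, define

$$Z_{i,s} \equiv J^{\mathbf{X}}_{t \leftarrow s} \big( V_i(U^{\mathbf{X}}_{s \leftarrow 0}) \big).$$

Note that $s \mapsto Z_{i,s}$ is of finite $p$-variation, with $p = 1/\alpha$. We have, with implicit
summation over $i$,

$$\begin{aligned}
\big| D_h U^{\mathbf{X}}_{t \leftarrow 0}(y_0) \big| &= \Big| \int_0^t J^{\mathbf{X}}_{t \leftarrow s} \big( V_i(U^{\mathbf{X}}_{s \leftarrow 0}) \big)\, dh_s^i \Big| = \Big| \int_0^t Z_i\, dh^i \Big| \\
&\lesssim \big( \|Z\|_{p\text{-var}} + |Z(0)| \big) \times \|h\|_{\varrho\text{-var}} \\
&\lesssim \big( \|Z\|_{p\text{-var}} + |Z(0)| \big) \times \|h\|_{\mathcal{H}}.
\end{aligned}$$

Hence, the linear map $DU^{\mathbf{X}}_{t \leftarrow 0}(y_0) : h \mapsto D_h U^{\mathbf{X}}_{t \leftarrow 0}(y_0) \in \mathbf{R}^e$ is bounded and each
component is an element of $\mathcal{H}^*$. We just showed that

$$h \mapsto \frac{d}{d\varepsilon} U^{T_{\varepsilon h} \mathbf{X}(\omega)}_{t \leftarrow 0}(y_0) \Big|_{\varepsilon = 0} = \Big\langle DU^{\mathbf{X}(\omega)}_{t \leftarrow 0}(y_0), h \Big\rangle_{\mathcal{H}}$$

and hence

$$h \mapsto \frac{d}{d\varepsilon} U^{\mathbf{X}(\omega + \varepsilon h)}_{t \leftarrow 0}(y_0) \Big|_{\varepsilon = 0} = \Big\langle DU^{\mathbf{X}(\omega)}_{t \leftarrow 0}(y_0), h \Big\rangle_{\mathcal{H}},$$

emphasizing again that $\mathbf{X}(\omega + h) \equiv T_h \mathbf{X}(\omega)$ almost surely for all $h \in \mathcal{H}$ simultaneously.
Repeating the argument with $T_g \mathbf{X}(\omega) = \mathbf{X}(\omega + g)$ shows that the Gâteaux
differential of $U^{\mathbf{X}(\omega + \cdot)}_{t \leftarrow 0}$ at $g \in \mathcal{H}$ is given by

$$DU^{\mathbf{X}(\omega + g)}_{t \leftarrow 0} = DU^{T_g \mathbf{X}(\omega)}_{t \leftarrow 0}.$$

(b) It remains to be seen that $g \in \mathcal{H} \mapsto DU^{T_g \mathbf{X}(\omega)}_{t \leftarrow 0} \in L(\mathcal{H}, \mathbf{R}^e)$, the space of linear
bounded maps equipped with operator norm, is continuous. We leave this as an exercise
to the reader, cf. Exercise 11.4 below. ⊔⊓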

11.3.3 Hörmander’s theorem for Gaussian RDEs

Recall that $\varrho \in [1, \frac{3}{2})$, $\alpha \in (\frac{1}{3}, \frac{1}{2\varrho})$ and $\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}_g^\alpha$ a.s. is the Gaussian
rough path constructed in Theorem 10.4. Any $h \in \mathcal{H} \subset C^{\varrho\text{-var}}$ and a.e. $X(\omega)$ enjoy
complementary Young regularity. We now present the remaining conditions on $X$,
followed by some commentary on each of the conditions, explaining their significance
in the context of the problem and verifying them for some explicit examples of
Gaussian processes.

Condition 1 Fix $T > 0$. For every $t \in (0, T]$ we assume non-degeneracy of the law
of $X$ on $[0, t]$ in the following sense. Given $f \in C^\alpha([0, t], \mathbf{R}^d)$, if $\sum_{j=1}^{d} \int_0^t f_j\, dh^j = 0$
for all $h \in \mathcal{H}$, then one has $f = 0$.

Note that, thanks to complementary Young regularity, the integral $\int_0^t f_j\, dh^j$ makes
sense as a Young integral. Some assumption along the lines of Condition 1 is certainly
necessary: just consider the trivial rough differential equation $dY = dX$, starting at
$Y_0 = 0$, with driving process $X = X(\omega)$ given by a Brownian bridge which returns
to the origin at time $T$ (i.e. $X_t = B_t - \frac{t}{T} B_T$ in terms of a standard Brownian motion
$B$). Clearly $Y_T = X_T = 0$ and so $Y_T$ does not admit a density, despite the equation
$dY = dX$ being even “elliptic”. However, it is straightforward to verify that in this
example $\int_0^T dh = 0$ for every $h$ belonging to the Cameron–Martin space of the
Brownian bridge, so that Condition 1 is violated by taking for $f$ a non-zero constant
function.
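
The Brownian bridge counterexample can be made concrete. Cameron–Martin paths of the form $h_t = E(Z X_t)$ with $Z$ a finite linear combination of the $X_{s_k}$ are linear combinations of covariance slices $R(s_k, \cdot)$, and for the bridge $R(s, t) = \min(s, t) - st/T$ vanishes at $t = T$. The following is a minimal Python sketch of this observation; the coefficients, time points and function names are hypothetical, and general elements of the Cameron–Martin space arise as limits of such finite combinations.

```python
def bridge_cov(s, t, T=1.0):
    """Covariance of a Brownian bridge on [0, T] returning to the origin at time T."""
    return min(s, t) - s * t / T

def cm_path(coeffs, times, T=1.0):
    """h_t = E(Z X_t) for Z = sum_k c_k X_{s_k}: a combination of covariance slices."""
    return lambda t: sum(c * bridge_cov(s, t, T) for c, s in zip(coeffs, times))

T = 1.0
# Hypothetical coefficients c_k and time points s_k.
h = cm_path([1.0, -2.5, 0.7], [0.2, 0.5, 0.9], T)

total_increment = h(T) - h(0.0)    # equals int_0^T dh
midpoint_value = h(0.5)            # the path itself is not identically zero
print(total_increment, midpoint_value)
```

Since $\int_0^T f\,dh = f \cdot (h_T - h_0) = 0$ for constant $f$, this exhibits the failure of Condition 1 at time $t = T$.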

Condition 2 With probability one, sample paths of X are truly rough, at least in a
right-neighbourhood of 0.
These conditions obviously hold for d-dimensional Brownian motion: the first
condition is satisfied because 0 is the only (continuous) function orthogonal to all of
L2 ([0, T ], Rd ); the second condition was already verified in Section 6.3. More inter-
estingly, these conditions are very robust and also hold for the Ornstein–Uhlenbeck
process, a Brownian bridge which returns to the origin at a time strictly greater than
T , and some non-semimartingale examples such as fractional Brownian motion,
including the rough regime of Hurst parameter less than 1/2. We now show that
under these conditions the process admits a density at strictly positive times. Note
that the aforementioned situations are not at all covered by the “usual” Hörmander
theorem.

Theorem 11.20. With $\varrho \in [1, \frac{3}{2})$ and $\alpha \in (\frac{1}{3}, \frac{1}{2\varrho})$, let $\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}_g^\alpha$ be a
Gaussian rough path as constructed in Theorem 10.4. Assume that the Gaussian
process $X$ satisfies Conditions 1 and 2. Let $V = (V_1, \dots, V_d)$ be a collection of
$C^\infty$-bounded vector fields on $\mathbf{R}^e$, which satisfies Hörmander’s condition (H) at some
point $y_0 \in \mathbf{R}^e$. Then the law of the RDE solution

$$dY_t = V(Y_t)\, d\mathbf{X}_t, \qquad Y(0) = y_0,$$

admits a density with respect to Lebesgue measure on $\mathbf{R}^e$ for all $t \in (0, T]$.

Proof. Thanks to Proposition 11.19 and in view of the Bouleau–Hirsch criterion,
Theorem 11.16, we only need to show almost sure invertibility of the Malliavin matrix
associated to the solution map. As a consequence of (11.13) and (11.19), we have
for every $z \in \mathbf{R}^e$ the identity

$$z^\top M_t z = \sum_{j=1}^{d} \big\| z^\top J^{\mathbf{X}}_{t \leftarrow \cdot} V_j(Y_\cdot) \big\|_t^2,$$

where we wrote $\|\bullet\|_t$ for the norm given by

$$\|f\|_t = \sup_{h \in \mathcal{H} : \|h\| = 1} \int_0^t f(s)\, dh(s).$$

Before we proceed we note that, by the multiplicative property of $J^{\mathbf{X}}_{t \leftarrow s}$, see the
remark following (11.16), one has

$$M_t = J^{\mathbf{X}}_{t \leftarrow 0}\, \tilde{M}_t\, \big( J^{\mathbf{X}}_{t \leftarrow 0} \big)^\top,$$

where $\tilde{M}_t$ is given by

$$z^\top \tilde{M}_t z = \sum_{j=1}^{d} \big\| z^\top J^{\mathbf{X}}_{0 \leftarrow \cdot} V_j(Y_\cdot) \big\|_t^2.$$

Since we know that the Jacobian is invertible, invertibility of $M_t$ is equivalent to
that of $\tilde{M}_t$, and it is the invertibility of the latter that we are going to show.
Assume now by contradiction that $\tilde{M}_t$ is not almost surely invertible. This implies
that there exists a random unit vector $z \in \mathbf{R}^e$ such that $z^\top \tilde{M}_t z = 0$ with
non-zero probability. It follows immediately from Condition 1 that, with non-zero
probability, the functions $s \mapsto z^\top J^{\mathbf{X}(\omega)}_{0 \leftarrow s} V_j(Y_s)$ vanish identically on $[0, t]$ for every
$j \in \{1, \dots, d\}$. By (11.18), this is equivalent to

$$\sum_{i=1}^{d} \int_0^{\cdot} z^\top J^{\mathbf{X}}_{0 \leftarrow s} [V_i, V_j](Y_s)\, dX^i(s) \equiv 0$$

on $[0, t]$. Thanks to Condition 2, true roughness of $X$, we can apply Theorem 6.5 to
conclude that one has

$$z^\top J^{\mathbf{X}}_{0 \leftarrow \cdot} [V_i, V_j](Y_\cdot) \equiv 0,$$

for every $i, j \in \{1, \dots, d\}$. Iterating this argument shows that, with non-zero probability,
the processes $s \mapsto z^\top J^{\mathbf{X}}_{0 \leftarrow s} W(Y_s)$ vanish identically for every vector field
$W$ obtained as a Lie bracket of the vector fields $V_i$. In particular, this is the case for
$s = 0$, which implies that with positive probability, $z$ is orthogonal to $W(y_0)$ for
all such vector fields. Since Hörmander’s condition (H) asserts precisely that these
vector fields span the tangent space at the starting point $y_0$, we conclude that $z = 0$
with positive probability, which is in contradiction with the fact that $z$ is a random
unit vector and thus concludes the proof. ⊔⊓

11.4 Exercises

Exercise 11.1 (Improved Cameron–Martin regularity, [FGGR16]) A combination
of Theorem 10.9 with the Cameron–Martin embedding, Proposition 11.2, shows
that every Cameron–Martin path associated to a Gaussian process enjoys finite
$q$-variation regularity with $q = \varrho$. Show that, under the assumptions of Theorem 10.9,
this can be improved to

$$q = \frac{1}{\frac{1}{2} + \frac{1}{2\varrho}}. \tag{11.21}$$

As a consequence, “complementary Young regularity” now holds for all $\varrho < 2$. In
the fBm setting, this covers every Hurst parameter $H > 1/4$. (To exploit this in the
newly covered regime $H \in (1/4, 1/3]$, one would need to work in a “level-3” rough
path setting.)
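
The thresholds $\varrho < \frac{3}{2}$ (for $q = \varrho$) and $\varrho < 2$ (for $q$ as in (11.21)) can be confirmed with exact rational arithmetic. The following is a minimal Python sketch, assuming that $\alpha$ may be taken arbitrarily close to $\frac{1}{2\varrho}$, so that complementary Young regularity $\alpha + \frac{1}{q} > 1$ is achievable precisely when $\frac{1}{2\varrho} + \frac{1}{q} > 1$; the function names are hypothetical.

```python
from fractions import Fraction

def q_standard(rho):
    """Cameron-Martin variation exponent q = rho."""
    return rho

def q_improved(rho):
    """Improved exponent (11.21): q = 1 / (1/2 + 1/(2 rho))."""
    return 1 / (Fraction(1, 2) + 1 / (2 * rho))

def cyr_holds(rho, q):
    """True when some alpha < 1/(2 rho) satisfies alpha + 1/q > 1."""
    return 1 / (2 * rho) + 1 / q > 1

def rho_fbm(H):
    """For fBm with Hurst parameter H <= 1/2 one has rho = 1/(2H)."""
    return 1 / (2 * H)

checks = {
    "standard, rho = 7/5": cyr_holds(Fraction(7, 5), q_standard(Fraction(7, 5))),
    "standard, rho = 3/2": cyr_holds(Fraction(3, 2), q_standard(Fraction(3, 2))),
    "improved, rho = 19/10": cyr_holds(Fraction(19, 10), q_improved(Fraction(19, 10))),
    "improved, rho = 2": cyr_holds(Fraction(2), q_improved(Fraction(2))),
    "improved, fBm H = 3/10": cyr_holds(rho_fbm(Fraction(3, 10)),
                                        q_improved(rho_fbm(Fraction(3, 10)))),
}
for label, ok in checks.items():
    print(label, ok)
```

The boundary cases $\varrho = \frac{3}{2}$ and $\varrho = 2$ fail because the inequality is strict, matching the open ranges stated in the text.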

Exercise 11.2 Formulate a quantitative version of Corollary 11.14. Show in particular
that the Gaussian tail of $|Z|^{1/\varrho}$ is uniform over rough integrals against Gaussian
rough paths, provided that $\|F\|_{C_b^2}$ and the $\varrho$-variation of the covariance, say in the
form of the constant $M$ in Theorem 11.9, are uniformly bounded.
Exercise 11.3 (Noise doubling, from [Ina14, Sch18]) Let X be a d-dimensional
Gaussian process as considered in Theorem 10.4 and X = (X, X) the random α-
Hölder rough path over Rd constructed therein. Recall that any h ∈ H, with H the
associated Cameron–Martin space, is given by ht = E(ΞXt ) = Ē(Ξ̄ X̄t ) ∈ Rd
where X̄ = X̄(ω̄) is an IID copy of X = X(ω) and Ξ̄, Ξ are elements in their
respective first Wiener chaoses with L2 -norm equal to ∥h∥H .
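a) Apply Theorem 10.4 to construct the “doubled” rough path associated to the
$2d$-dimensional process $(X, \bar{X})$ and use this to show that $Z^h := (X, h)$ can be
extended canonically to a random rough path $\mathbf{Z}^h = (Z^h, \mathbb{Z}^h)$ over $\mathbf{R}^{2d}$.
Hint: Formally, in case $d = 1$ for notational simplicity,

$$\mathbb{Z}^h = \begin{pmatrix} \int X\,dX & \bar{E}\big( \bar{\Xi} \int X\,d\bar{X} \big) \\ \bar{E}\big( \bar{\Xi} \int \bar{X}\,dX \big) & \bar{E}\big( \bar{\Xi} \bar{\Xi} \int \bar{X}\,d\bar{X} \big) \end{pmatrix},$$

where $\bar{E} = E^{\bar{\omega}}$ only averages over $\bar{\omega}$.
b) Show further that

$$E \|\mathbb{Z}^h - \mathbb{Z}^k\|_{2\alpha}^2 \lesssim \|h - k\|_{\mathcal{H}}^2.$$

(Since $\|Z^h - Z^k\|_\alpha = \|h - k\|_\alpha \lesssim \|h - k\|_{\mathcal{H}}$ this shows that the construction
of the joint lift of $(X, h)$ as a random rough path is continuous in $h \in \mathcal{H}$.)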
Exercise 11.4 Finish the proof of part (b) of Proposition 11.19.

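Solution. In the notation of (the proof of) this proposition, we have to show that
$g \in \mathcal{H} \mapsto DU^{T_g \mathbf{X}(\omega)}_{t \leftarrow 0} \in L(\mathcal{H}, \mathbf{R}^e)$ is continuous. To this end, assume $g_n \to g$ in $\mathcal{H}$
(and hence in $C^{\varrho\text{-var}}$). Continuity properties of the Young integral imply continuity of
the translation operator viewed as map $h \in C^{\varrho\text{-var}} \mapsto T_h \mathbf{X}(\omega)$ and so

$$T_{g_n} \mathbf{X}(\omega) \to T_g \mathbf{X}(\omega)$$

in $p$-variation rough path metric. The point here is that

$$\mathbf{x} \mapsto J^{\mathbf{x}}_{t \leftarrow \cdot} \quad \text{and} \quad J^{\mathbf{x}}_{t \leftarrow \cdot} \big( V_i(U^{\mathbf{x}}_{\cdot \leftarrow 0}) \big) \in C^{p\text{-var}}$$

depends continuously on $\mathbf{x}$ with respect to $p$-variation rough path metric: using the
fact that $J^{\mathbf{x}}_{t \leftarrow \cdot}$ and $U^{\mathbf{x}}_{\cdot \leftarrow 0}$ both satisfy rough differential equations driven by $\mathbf{x}$ this is
just a consequence of Lyons’ limit theorem (the universal limit theorem of rough path
theory). We apply this with $\mathbf{x} = \mathbf{X}(\omega)$ where $\omega$ remains a fixed element in (11.20). It
follows that

$$\Big\| DU^{T_{g_n} \mathbf{X}(\omega)}_{t \leftarrow 0} - DU^{T_g \mathbf{X}(\omega)}_{t \leftarrow 0} \Big\|_{\mathrm{op}} = \sup_{h : \|h\|_{\mathcal{H}} = 1} \Big| D_h U^{T_{g_n} \mathbf{X}(\omega)}_{t \leftarrow 0} - D_h U^{T_g \mathbf{X}(\omega)}_{t \leftarrow 0} \Big|$$

and defining $Z_i^g(s) \equiv J^{T_g \mathbf{X}(\omega)}_{t \leftarrow s} \big( V_i(U^{T_g \mathbf{X}(\omega)}_{s \leftarrow 0}) \big)$, and similarly $Z_i^{g_n}(s)$, the same reasoning
as in part (a) leads to the estimate

$$\Big\| DU^{T_{g_n} \mathbf{X}(\omega)}_{t \leftarrow 0} - DU^{T_g \mathbf{X}(\omega)}_{t \leftarrow 0} \Big\|_{\mathrm{op}} \le c \Big( \|Z^{g_n} - Z^g\|_{p\text{-var}} + |Z^{g_n}(0) - Z^g(0)| \Big).$$

From the explanations just given this tends to zero as $n \to \infty$ which establishes
continuity of the Gâteaux differential, as required, and the proof is finished.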

Exercise 11.5 Prove Theorem 11.20 in the presence of a drift vector field $V_0$. In particular,
show that in this case condition (H) can be weakened to

$$\operatorname{Lie}\{V_1, \dots, V_d, [V_0, V_1], \dots, [V_0, V_d]\}\big|_{y_0} = \mathbf{R}^e. \tag{11.22}$$

11.5 Comments

Section 11.1: Regularity of Cameron–Martin paths (q-variation, with q = ϱ) under
the assumption of finite ϱ-variation of the covariance was established in Friz–Victoir,
[FV10a], see also [FV10b, Ch.15]. In the context of Gaussian rough paths, this leads
to complementary Young regularity (CYR) whenever ϱ < 3/2, which covers general
“level-2” Gaussian rough paths as discussed in Chapter 10. On the other hand, “level-3”
Gaussian rough paths can be constructed for any ϱ < 2 (which includes fBm
with H = 1/(2ϱ) > 1/4). A sharper Cameron–Martin regularity result specific to fBm follows
from a Besov–variation embedding theorem [FV06b], thereby leading to CYR for
all H > 1/4. The general case was understood in [FGGR16]: one can take q as in
(11.21), provided one makes the slightly stronger assumption of finite “mixed” (1, ϱ)-
variation of the covariance. The conclusion concerning ϱ-variation of Theorem 10.9
can in fact be strengthened to finite mixed (1, ϱ)-variation at no extra cost and indeed
this theorem is only a special case of a general criterion given in [FGGR16].
Section 11.2: Theorem 11.9 was originally obtained by careful tracking of constants
via the Garsia–Rodemich–Rumsey Lemma, see [FV10b]. The generalised Fernique
estimate is taken from Friz–Oberhauser and then Diehl, Oberhauser and Riedel
[FO10, DOR15]; Riedel [Rie17] establishes a further generalisation in form of a
transportation cost inequality in the spirit of Talagrand. This yields an elegant proof
of Theorem 11.13 with which Cass, Litterer, and Lyons [CLL13] have overcome the
longstanding problem of obtaining moment bounds for the Jacobian of the flow of a
rough differential equation driven by Gaussian rough paths, thereby paving the way
for the proof of the Hörmander-type results, see below. As was illustrated, the above
methodology can be adapted to many other situations of interest, a number of which
are discussed in [FR13]. See also [CO17] for Fernique-type estimates in a Markovian
context.
Section 11.3: Baudoin–Hairer [BH07] proved a Hörmander theorem for differen-
tial equations driven by fBm in the regular regime of Hurst parameter H > 1/2
in a framework of Young differential equations. The Brownian case $H = 1/2$ is
of course classical, see the monographs [Nua06, Mal97] or the original articles
[Mal78, KS84, KS85, KS87, Bis81b, Bis81a, Nor86]; a short self-contained proof
can be found in [Hai11a]. In the case of rough differential equations driven by less
regular Gaussian rough paths (including the case of fBm with $H > 1/4$), the relevance
of complementary Young regularity of Cameron–Martin paths to Malliavin regularity
of (Gaussian) RDE solutions was first recognised by Cass, Friz and Victoir [CFV09].
Existence of a density under Hörmander’s condition for such RDEs was obtained
by Cass–Friz [CF10], see also [FV10b, Ch.20], but with a Stroock-Varadhan sup-
port type argument instead of true roughness (already commented on at the end of
Chapter 6.) Smoothness of densities was subsequently established by Hairer–Pillai
[HP13] in the case of fBm and then Cass, Hairer, Litterer and Tindel [CHLT15] in
the general Gaussian setting of Chapter 10, making crucial use of the integrability
estimates discussed in Section 11.2. Indeed, combined with known estimates for
the Jacobian of RDE flows (Friz–Victoir, [FV10b, Thm 10.16]) one readily obtains
finite moments of the Jacobian of the inverse flow. This is a key ingredient in the
smoothness proof via Malliavin calculus, as is the higher-order Malliavin differentia-
bility of Gaussian RDE solutions established by Inahama [Ina14]. Several authors
have studied the resulting density, see e.g. [BNOT16, Ina16b, GOT19, IN19] and the
references therein.
We note that existence of densities via Malliavin calculus for singular SPDEs,
in the framework of regularity structures, has been studied by Cannizzaro, Friz and
Gassiat [CFG17], Gassiat–Labbé [GL20] and in great generality by Schönbauer
[Sch18].
Chapter 12
Stochastic partial differential equations

Second order stochastic partial differential equations are discussed from a rough path
point of view. In the linear and finite-dimensional noise case we follow a Feynman–
Kac approach which makes good use of concentration of measure results, as those
obtained in Section 11.2. Alternatively, one can proceed by flow decomposition
and this approach also works in a number of nonlinear situations. Secondly, now
motivated by some semilinear SPDEs of Burgers’ type with infinite-dimensional noise,
we study the stochastic heat equation (in space dimension 1) as evolution in Gaussian
rough path space relative to the spatial variable, in the sense of Chapter 10.

12.1 First order rough partial differential equations

12.1.1 Rough transport equation

As a prototypical linear first order PDE with noise we consider the transport equation,
posed (without loss of generality) as a terminal value problem. That is,

\[ -\partial_t u(t,x) = \sum_{i=1}^{d} f_i(x) \cdot D_x u(t,x)\, \dot W^i_t \equiv \Gamma u_t(x) \dot W_t , \qquad u(T, \cdot) = g , \tag{12.1} \]

where $u : [0,T] \times \mathbb{R}^n \to \mathbb{R}$, with vector fields $f = (f_1, \dots, f_d)$ driven by a $C^1$
driving signal $W = (W^1, \dots, W^d)$, and we write indifferently $u(t,x) = u_t(x)$. The
canonical pairing of $Du = D_x u = (\partial_{x^1} u, \dots, \partial_{x^n} u)$ with a vector field is indicated
by a dot, and we already used the operator / vector notation

\[ \Gamma_i = f_i(x) \cdot D_x , \qquad \Gamma = (\Gamma_1, \dots, \Gamma_d). \tag{12.2} \]

By the method of characteristics, the unique (classical) $C^{1,1}$-solution
$u : [0,T] \times \mathbb{R}^n \to \mathbb{R}$ is given explicitly by


\[ u(s,x) = u(s,x;W) := g(X^{s,x}_T) , \tag{12.3} \]

provided $g \in C^1$ and the vector fields $f_1, \dots, f_d$ are nice enough ($C_b^1$ will do) to
ensure a $C^1$ solution flow for the ODE $\dot X = \sum_{i=1}^{d} f_i(X) \dot W^i \equiv f(X) \dot W$; here $X^{s,x}$
denotes the unique solution started from $X_s = x$.
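
The representation (12.3) can be checked numerically. The following is a minimal sketch, not part of the original text: the scalar vector field `f = sin`, the driver `W`, the terminal datum `g = cos` and the RK4 integrator are all illustrative assumptions (with $d = n = 1$). It computes $u(s,x) = g(X_T^{s,x})$ along characteristics and evaluates the residual of (12.1) by central finite differences.

```python
import math

def f(x):
    # Hypothetical bounded vector field (d = n = 1), for illustration only.
    return math.sin(x)

def W(t):
    # Hypothetical smooth (C^1) driving signal.
    return math.sin(3.0 * t) + 0.5 * t

def W_dot(t):
    # Time derivative of W above.
    return 3.0 * math.cos(3.0 * t) + 0.5

def g(x):
    # Hypothetical terminal datum u(T, .) = g.
    return math.cos(x)

def flow(s, x, T, steps=400):
    """RK4 approximation of X_T^{s,x}, where dX/dt = f(X) W'(t) and X_s = x."""
    h = (T - s) / steps
    t = s
    for _ in range(steps):
        k1 = f(x) * W_dot(t)
        k2 = f(x + 0.5 * h * k1) * W_dot(t + 0.5 * h)
        k3 = f(x + 0.5 * h * k2) * W_dot(t + 0.5 * h)
        k4 = f(x + h * k3) * W_dot(t + h)
        x += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += h
    return x

def u(s, x, T=1.0):
    """Candidate solution u(s, x) = g(X_T^{s,x}) as in (12.3)."""
    return g(flow(s, x, T))

def pde_residual(s, x, T=1.0, eps=1e-4):
    """Finite-difference residual of -d_t u = f(x) d_x u W'(t) at (s, x)."""
    du_dt = (u(s + eps, x, T) - u(s - eps, x, T)) / (2.0 * eps)
    du_dx = (u(s, x + eps, T) - u(s, x - eps, T)) / (2.0 * eps)
    return -du_dt - f(x) * du_dx * W_dot(s)
```

For this particular choice of `f` the flow is also available in closed form, $X_T^{s,x} = 2\arctan(\tan(x/2)\,e^{W_T - W_s})$ for $x \in (-\pi,\pi)$, which gives an independent check of `flow`. The residual is small but not zero, since it carries the finite-difference and RK4 discretisation errors.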
We start with a rough path stability result for the transport equation, the proof of
which is an immediate consequence of our results on flow stability of RDEs.
Proposition 12.1. Let $g \in C(\mathbb{R}^n)$ and $W^\varepsilon \in C^1([0,T], \mathbb{R}^d)$, with geometric rough
path limit $\mathbf{W} \in \mathscr{C}_g^{0,\alpha}$, $\alpha > 1/3$. Write $u^\varepsilon(s,x) := u(s,x;W^\varepsilon)$, defined as in (12.3)
with $W$ replaced by $W^\varepsilon$. Let $f \in C_b^3$. Then $u^\varepsilon$ converges locally uniformly to

\[ u(s,x;\mathbf{W}) := g(X_T^{s,x}) \tag{12.4} \]

where $X^{s,x}$ denotes the (unique) RDE solution to $dX = f(X)\,d\mathbf{W}$, started from
$X_s = x$. (In particular, the limit depends on $\mathbf{W}$ but not on the approximating
sequence.)
It is instructive to consider the case of Brownian motion B = B(t, ω) with
Stratonovich lift as prototypical example of a (random) geometric rough path.
The RDE solution $X$ is then equivalently described by a Stratonovich SDE and
$u(t,x;\omega) = g(X_T^{t,x}(\omega))$ is $\mathcal{F}_t^T$-measurable. The so-defined random field should
then constitute a (backward adapted) solution to the Stratonovich backward stochastic
partial differential equation

\[ -du_t(x) = \Gamma u_t(x) \circ \overleftarrow{dB}_t , \qquad u(T, \cdot) = g , \tag{12.5} \]

where $\overleftarrow{dB}$ stands for backward Stratonovich integration (cf. Section 5.4) provided $g$
(and then $\Gamma u_t$) are sufficiently regular to make this Stratonovich integral meaningful.
If rewritten in Itô form, a matrix-valued second order operator $\Gamma^2 = (\Gamma_i \Gamma_j)_{1 \le i,j \le d}$ appears,
which of course must not change the hyperbolic nature of the stochastic transport
equation. (In classical SPDE theory one has the stochastic parabolicity condition,
which in the transport case is fully degenerate.)
All this strongly suggests that rough transport noise must be geometric (i.e.
$\mathbf{W} \in \mathscr{C}_g^{\alpha}$). We now prepare the definition of (regular, backward) solution to the rough
transport equation. Since we are in the fortunate position to have an explicit solution
(candidate) we derive a graded set of rough path estimates that provide a natural
generalisation of the classical transport differential equation. In what follows we
abbreviate estimates of the form $|(a) - (b)| \lesssim |t-s|^{\gamma}$ by writing $(a) \overset{\gamma}{=} (b)$. (Both
sides may depend on $s, t$ and the multiplicative constant hidden in $\lesssim$ is assumed
uniform over bounded intervals.)

Proposition 12.2. Consider vector fields $f = (f_1, \dots, f_d) \in C_b^5$, with associated
first order differential operators $\Gamma = (\Gamma_1, \dots, \Gamma_d)$. There is a unique $C^3$ solution
flow for the RDE $dX = f(X)\,d\mathbf{W}$ with $\mathbf{W} \in \mathscr{C}_g^{\alpha}$, $\alpha > 1/3$. Let $g \in C^3$ and define
$u(s,x;\mathbf{W}) := g(X_T^{s,x})$ as in (12.4). Then $u = u(s,x) \in C^{\alpha,3}$, $u_T = g$, and we
have the estimates, with Einstein summation,

\[ u_s(x) \overset{3\alpha}{=} u_t(x) + \Gamma_i u_t(x) W^{i}_{s,t} + \Gamma_i \Gamma_j u_t(x) \mathbb{W}^{i,j}_{s,t} , \]
\[ \Gamma_i u_s(x) \overset{2\alpha}{=} \Gamma_i u_t(x) + \Gamma_i \Gamma_j u_t(x) W^{j}_{s,t} , \]
\[ \Gamma_i \Gamma_j u_s(x) \overset{\alpha}{=} \Gamma_i \Gamma_j u_t(x) , \]

with $0 \le s < t \le T$, $i, j = 1, \dots, d$, locally uniformly in $x$, and, as a consequence,

\[ u_s(x) - g(x) = u_s(x) - u_T(x) = \int_s^T \Gamma u_t(x) \, d\mathbf{W}_t . \]
s

Remark 12.3. The first $3\alpha$ estimate is nothing but Davie’s definition of solution for
a linear RDE, here of the form $-du = \Gamma u \, d\mathbf{W}$. In finite dimensions, a linear map
$\Gamma$ is necessarily bounded (equivalently: continuous) as linear operator, so that the
cascade of lower order ($2\alpha$, $\alpha$) estimates is a trivial consequence of the first. This is
different in the present situation, where $u_t$ takes values in a function space where
each application of $\Gamma$ amounts to taking one derivative. These estimates then have the
interpretation that time regularity of $u$, in the stated (“$k\alpha$”) controlled sense, can be
traded against space regularity.
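
The first ($3\alpha$) estimate can be illustrated in a smooth scalar toy case. This is a minimal sketch under assumptions that are not in the text: $d = n = 1$, the linear vector field $f(x) = x$ (unbounded, hence outside the $C_b^5$ class, but with an explicit global flow $X_T^{t,x} = x\,e^{W_T - W_t}$), terminal datum $g = \sin$, a smooth driver `W`, and the canonical geometric lift $\mathbb{W}_{s,t} = \tfrac12 W_{s,t}^2$. With $y = x e^{W_T - W_t}$ one computes $\Gamma u_t(x) = y\cos y$ and $\Gamma^2 u_t(x) = y\cos y - y^2\sin y$, and the remainder of the expansion is cubic in $W_{s,t}$, which corresponds to $3\alpha$ with $\alpha = 1$ for a smooth path.

```python
import math

T = 1.0

def W(t):
    # Hypothetical smooth driving signal (assumption, not from the text).
    return math.sin(3.0 * t) + 0.5 * t

def u(t, x):
    """u_t(x) = g(X_T^{t,x}) with g = sin and flow X_T^{t,x} = x * exp(W_T - W_t)."""
    return math.sin(x * math.exp(W(T) - W(t)))

def gamma_u(t, x):
    """Gamma u_t(x) with Gamma = x d/dx, computed by hand: y cos(y)."""
    y = x * math.exp(W(T) - W(t))
    return y * math.cos(y)

def gamma2_u(t, x):
    """Gamma^2 u_t(x), computed by hand: y cos(y) - y^2 sin(y)."""
    y = x * math.exp(W(T) - W(t))
    return y * math.cos(y) - y * y * math.sin(y)

def remainder(s, t, x):
    """u_s - u_t - Gamma u_t * W_{s,t} - Gamma^2 u_t * WW_{s,t}, with WW = W_{s,t}^2 / 2."""
    w = W(t) - W(s)
    return u(s, x) - u(t, x) - gamma_u(t, x) * w - gamma2_u(t, x) * 0.5 * w * w
```

In this commutative toy case the expansion is just Taylor's formula in the variable $W_{s,t}$, so `remainder(s, t, x)` is bounded by a constant times $|W_{s,t}|^3$; no such shortcut is available for genuinely rough, multidimensional drivers.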

Remark 12.4. The rough integral formulation needs explanation. Indeed, while it is
clear from $\delta\Xi \overset{3\alpha}{=} \delta u(x) = 0$ that $\Xi_{s,t} = \Gamma_i u_t(x) W^{i}_{s,t} + \Gamma_i \Gamma_j u_t(x) \mathbb{W}^{i,j}_{s,t}$ has a sewing
limit, the right-point evaluation requires attention, cf. Proposition 5.12 and the subsequent
discussion about the subtleties of “right-point” rough integrals. Fortunately,
one checks that $(\Gamma u, -(\Gamma^2 u)^T) \in \mathscr{D}_X^{2\alpha}$ so that, thanks to (5.10), Remark 5.13, this
sewing limit, over all partitions of $[0,T]$ say, is exactly identified as

\[ \lim_{|\mathcal{P}| \downarrow 0} \sum_{[s,t] \in \mathcal{P}} \Big( \Gamma u_t X_{s,t} - (\Gamma^2 u_t)^T \mathbb{X}_{s,t} \Big) = \int_0^T (\Gamma u, -\Gamma^2 u^T) \, d\mathbf{X} , \]

where we omitted $x$ for better readability. (Since the matrix $\Gamma^2 u_t = (\Gamma_i \Gamma_j u_t)_{1 \le i,j \le d}$
is in general not symmetric, a careful check of the controlledness condition is best
spelled out in coordinates.)

Notwithstanding the elegance of the rough integral formulation, additional quantifiers,
such as local uniformity in $x$, are better formulated at the level of the detailed
estimates which brings us to

Definition 12.5. Any $C^{\alpha,3}$-function $u : [0,T] \times \mathbb{R}^n \to \mathbb{R}$, for which the (locally
uniform) estimates in Proposition 12.2 hold is called a regular solution to the rough
backward transport equation

\[ -du = \Gamma u \, d\mathbf{W} . \]

Proof (Proposition 12.2). Consider a solution $X = X^{s,x}$ to $dX = f(X)\,d\mathbf{W}$, started
from $X_s = x$ so that

\[ X_t \overset{3\alpha}{=} x + f(x) W_{s,t} + f'f(x) \mathbb{W}_{s,t} . \]

Fix times $s < t < T$. By uniqueness of the RDE flow, $X_T^{t,y} = X_T^{s,x}$ whenever $y = X_t^{s,x}$.
From $u(s,x) := g(X_T^{s,x})$ it is then clear that, for all such $t$,

\[ u(s,x) = u(t, X_t^{s,x}) . \]

Note that $u_t = u(t, \cdot) \in C^3$ follows from $g \in C^3$, $f \in C^5$; the claimed $C^{\alpha,3}$ regularity
is then easy to see. We can expand

\[ u_t(X_t^{s,x}) \overset{3\alpha}{=} u_t(x) + Du_t(x)\big( f(x) W_{s,t} + (Df)f(x) \mathbb{W}_{s,t} \big) + \tfrac12 D^2 u_t(x) \big( f(x) W_{s,t} \big)^2 \]

where the final term is really the contraction $\partial_{ij} u_t \, f^i_k f^j_l \, (\tfrac12 W_{s,t} \otimes W_{s,t})^{k,l}$ with
summation over all repeated indices. Using geometricity of $\mathbf{W}$ and symmetry of
$D^2 u_t(x)(f,f)$ the right-hand side becomes

\[ u_t(x) + Du_t(x) f(x) W_{s,t} + \big\{ Du_t(x)(Df)f(x) + D^2 u_t(x)(f,f)(x) \big\} \mathbb{W}_{s,t} . \]

(We essentially repeated the proof of Itô’s formula here, cf. Section 7.5.) In terms of
the first order differential operators $\Gamma_i$ associated to the vector fields $f_i$ this can be
written elegantly as

\[ u_s(x) \overset{3\alpha}{=} u(t,x) + \Gamma u_t(x) W_{s,t} + \Gamma^2 u_t(x) \mathbb{W}_{s,t} . \]

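This relation actually implies that with $\Xi_{s,t} := \Gamma_i u_t(x) W^{i}_{s,t} + \Gamma_i \Gamma_j u_t(x) \mathbb{W}^{i,j}_{s,t}$
we have $|(\delta\Xi)_{r,s,t}| = O(|t-r|^{3\alpha})$ and hence (after a line of algebra)
$(\Gamma_i u_{s,t} - \Gamma_i \Gamma_j u_t W^{j}_{s,t}) W^{i}_{r,s} \overset{3\alpha}{=} 0$ which strongly suggests validity of the desired $2\alpha$-estimate,
for all $i = 1, \dots, d$,

\[ \Gamma_i u_s(x) \overset{2\alpha}{=} \Gamma_i u_t(x) + \Gamma_i \Gamma_j u_t(x) W^{j}_{s,t} . \]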

Since no true roughness condition on $W$ is imposed ($W$ could be zero!), one has
to check this by hand from $u(s,x) = g(X_T^{s,x})$; this is left to the reader. Similarly, the
previous relation gives $(\Gamma^2 u_t - \Gamma^2 u_s) W_{s,t} \overset{2\alpha}{=} 0$ so that the same argument suggests
$\Gamma^2 u_s(x) - \Gamma^2 u_t(x) \overset{\alpha}{=} 0$. Here again, a direct verification is not hard (and amounts
to checking $\alpha$-Hölder regularity of $s \mapsto \Gamma^2 g(X_T^{s,x})$, with $g \in C^3$.) □

We can now show that solutions in the sense of Definition 12.5 are unique.

Theorem 12.6. Consider vector fields $f = (f_1, \dots, f_d) \in C_b^5$, with associated
first order differential operators $\Gamma = (\Gamma_1, \dots, \Gamma_d)$ and $\mathbf{W} \in \mathscr{C}_g^{\alpha}([0,T], \mathbb{R}^d)$ with
$\alpha > 1/3$. For $g \in C^3$, there exists a unique regular solution $u : [0,T] \times \mathbb{R}^n \to \mathbb{R}$ of
$C^{\alpha,3}$ regularity to the rough backward transport equation

\[ -du = \Gamma u \, d\mathbf{W} , \qquad u(T, \cdot) = g . \]

Proof. Existence is clear, since Proposition 12.2 exactly says that $(s,x) \mapsto g(X_T^{s,x})$
gives a regular solution. Let now $u$ be any solution with $u_T = g$. We show that,
whenever $X$ solves $dX = f(X)\,d\mathbf{W}$,

\[ u(t, X_t) - u(s, X_s) \overset{3\alpha}{=} 0 . \]

Since $3\alpha > 1$ this entails that $t \mapsto u(t, X_t)$ is constant, and so $u(s,x) =
u(T, X_T^{s,x}) = g(X_T^{s,x})$. In fact, we show for $k = 1, 2, 3$

\[ \Gamma^{3-k} u_t(X_t) \overset{k\alpha}{=} \Gamma^{3-k} u_s(X_s) . \]

(Case $k = 1$.) Write

\[ \Gamma^2 u_t(X_t) - \Gamma^2 u_s(X_s) = \Gamma^2 u_t(X_t) - \Gamma^2 u_s(X_t) + \Gamma^2 u(s, X_t) - \Gamma^2 u(s, X_s) . \]

From the (third) defining property of a solution, the first difference on the right-hand
side is of order $\alpha$. Since solutions are $C^3$ in space, hence $\Gamma^2 u(s, \cdot) \in C^1$, always
uniformly in $s \in [0,T]$, the final difference is also of order $\alpha$, as required.
(Case $k = 2$.) Write

\[ \Gamma u_t(X_t) - \Gamma u_s(X_s) = \Gamma u_t(X_t) - \Gamma u_s(X_t) + \Gamma u_s(X_t) - \Gamma u_s(X_s) . \]

By the second defining property of a solution, the first difference on the right-hand
side equals $-\Gamma^2 u_t(X_t) W_{s,t}$ (up to order $2\alpha$). On the other hand, $\Gamma u_s \in C^2$ so that
the final difference can be replaced by

\[ D\Gamma u_s(X_s)(X_t - X_s) \overset{2\alpha}{=} D\Gamma u_s(X_s) f(X_s) W_{s,t} = \Gamma^2 u_s(X_s) W_{s,t} . \]

Put together we have $\Gamma u_t(X_t) - \Gamma u_s(X_s) = (\Gamma^2 u_s(X_s) - \Gamma^2 u_t(X_t)) W_{s,t}$. We see
that this is of (desired) order $2\alpha$, thanks to the case $k = 1$ and $W_{s,t} \overset{\alpha}{=} 0$.
(Case $k = 3$.) We write

\[ u(t, X_t) - u(s, X_s) = u(t, X_t) - u(s, X_t) + u(s, X_t) - u(s, X_s) . \]

By the (first) defining property of a solution, the first difference on the right-hand
side equals $-\Gamma u_t(X_t) W_{s,t} - \Gamma^2 u_t(X_t) \mathbb{W}_{s,t}$ (up to order $3\alpha$). On the other hand,
$u(s, \cdot) \in C^3$ so that the final difference can be replaced, using a second order Taylor
expansion, exactly as in the proof of Proposition 7.8, by

\[ Du_s(X_s)\big( f(X_s) W_{s,t} + f'f(X_s) \mathbb{W}_{s,t} \big) + \tfrac12 D^2 u_s\big( f(X_s), f(X_s) \big) W_{s,t} \otimes W_{s,t} = \Gamma u_s(X_s) W_{s,t} + \Gamma^2 u_s(X_s) \mathbb{W}_{s,t} . \]

Putting these together (and using the cases $k = 1, 2$) gives the desired estimate. □

12.1.2 Continuity equation and analytically weak formulation

Given a finite measure $\varrho \in \mathcal{M}(\mathbb{R}^n)$ and a continuous bounded function $\varphi \in C_b(\mathbb{R}^n)$,
we write $\varrho(\varphi) = \int \varphi(x) \varrho(dx)$ for the natural pairing. We are interested in
measure-valued (forward) solutions to the continuity equation

\[ \partial_t \varrho = -\sum_{i=1}^{d} \operatorname{div}_x (f_i(x) \varrho_t) \dot W^i_t \equiv \Gamma^\star \varrho_t \dot W_t \]

when $W$ becomes a (geometric) rough path. As before, $\Gamma_i = f_i(x) \cdot D_x$, with formal
adjoint $\Gamma_i^\star = -\operatorname{div}_x (f_i \, \cdot)$.
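Definition 12.7. We say that $\varrho : [0,T] \to \mathcal{M}(\mathbb{R}^n)$ is a measure-valued forward
RPDE solution to the rough continuity equation

\[ d\varrho_t + \operatorname{div}_x (f(x) \varrho_t) \, d\mathbf{W}_t = 0 \tag{12.6} \]

if, uniformly over $\varphi$ bounded in $C_b^3$,

\[ \varrho_t(\varphi) \overset{3\alpha}{=} \varrho_s(\varphi) + \varrho_s(\Gamma\varphi) W_{s,t} + \varrho_s(\Gamma^2 \varphi) \mathbb{W}_{s,t} , \]
\[ \varrho_t(\Gamma\varphi) \overset{2\alpha}{=} \varrho_s(\Gamma\varphi) + \varrho_s(\Gamma^2 \varphi) W_{s,t} , \]
\[ \varrho_t(\Gamma^2 \varphi) \overset{\alpha}{=} \varrho_s(\Gamma^2 \varphi) . \]

(Note $\Gamma\varphi, \Gamma^2\varphi \in C_b$ so all pairings are well-defined. Formally, the second and third
estimates follow from the first with $\varphi$ replaced by $\Gamma\varphi$ and $\Gamma^2\varphi$; however, doing so
would require test functions up to $\Gamma^4 \varphi \notin C_b$. Itemizing the estimates allows us to
keep track of the correct regularity of $\varphi$.)
These estimates imply immediately the following (analytically) weak formulation

\[ \forall \varphi \in C_b^3 : \quad \varrho_t(\varphi) - \varrho_0(\varphi) = \int_0^t \big( \varrho_s(\Gamma\varphi), \varrho_s(\Gamma^2\varphi) \big) \, d\mathbf{W}_s , \]

but the finer information, as put forward in the definition, is crucial for uniqueness.
(Remark 12.9 below comments on time-dependent test functions.)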

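Theorem 12.8. Consider vector fields $f = (f_1, \dots, f_d) \in C_b^5$, with associated
first order differential operators $\Gamma = (\Gamma_1, \dots, \Gamma_d)$ and $\mathbf{W} \in \mathscr{C}_g^{\alpha}([0,T], \mathbb{R}^d)$ with
$\alpha > 1/3$. For every measure $\nu \in \mathcal{M}(\mathbb{R}^n)$, there exists a unique measure-valued
solution to the rough continuity equation

\[ d\varrho_t + \operatorname{div}_x (f(x) \varrho_t) \, d\mathbf{W}_t = 0 , \qquad \varrho_0 = \nu , \tag{12.7} \]

with explicit representation, for $\varphi \in C_b^3$, given by

\[ \varrho_t(\varphi) = \int \varphi(X_t^{0,x}) \, \nu(dx) . \]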

Proof. (Existence) Let $X = X^{0,x}$ be a solution to the RDE $dX = f(X)\,d\mathbf{W}$, started
at $X_0 = x$. By Proposition 7.8, a form of Itô’s formula for controlled rough paths,

\[ \varphi(X_t) \overset{3\alpha}{=} \varphi(X_s) + D\varphi(X_s) X_s' W_{s,t} + \big( D\varphi(X_s) X_s'' + D^2\varphi(X_s)(X_s', X_s') \big) \mathbb{W}_{s,t} , \]

uniformly in $\varphi \in C_b^3$. Taking into account $X' = f(X)$, $X'' = (Df)f$ gives

\[ \varphi(X_t) \overset{3\alpha}{=} \varphi(X_s) + \Gamma\varphi(X_s) W_{s,t} + (\Gamma^2\varphi)(X_s) \mathbb{W}_{s,t} . \]

Combining this with $\varrho_t(\varphi) := \varphi(X_t)$ yields the claimed $3\alpha$-estimate. Similarly, but
now using standard facts on composition of controlled rough paths with regular
functions, we obtain

\[ \varphi(X_t) \overset{2\alpha}{=} \varphi(X_s) + \Gamma\varphi(X_s) W_{s,t} , \]

uniformly over $\varphi$ bounded in $C_b^2$. At last, the third estimate comes from $\alpha$-Hölder
regularity of $t \mapsto \varrho_t(\Gamma^2\varphi) = \Gamma^2\varphi(X_t)$, itself a manifest consequence of $\Gamma^2\varphi \in C_b^1$
and $\alpha$-Hölder regularity of $X$.
We are not yet done, because until now, we have only handled the case of Dirac
initial data $\varrho_0 = \delta_x$. (Since $\varrho_0(\varphi) = \varphi(X_0^{0,x}) = \varphi(x)$.) Fortunately, we are in a
linear situation so that, given $\varrho_0 = \nu \in \mathcal{M}$, it suffices to generalise our construction
and define

\[ \varrho_t(\varphi) := \int \varphi(X_t^{0,x}) \, \nu(dx) . \]

It remains to see that such an integration in $x$ respects all graded $3\alpha$, $2\alpha$, $\alpha$ estimates.
This is indeed the case, because all required estimates are uniform in $X_0 = x$. (A
pleasant consequence of dealing with bounded vector fields so that all quantitative
bounds are invariant under shift.)
(Uniqueness) Given any $g = u_T \in C_b^3$, there exists a regular backward RPDE
solution, $u_t = u(t, \cdot) \in C_b^3$, with

\[ u_s - u_t \overset{3\alpha}{=} u_t' W_{s,t} + u_t'' \mathbb{W}_{s,t} \]

(and then $u' = \Gamma u \in C_b^2$ etc). Write $u_{s,t} = u_t - u_s$ and similarly for $\varrho$. Then

\[ \varrho_t(u_t) - \varrho_s(u_s) = \varrho_{s,t}(u_t) + \varrho_s(u_{s,t}) . \]

The first summand on the right-hand side expands, using the very definition of weak
solution (applied with $\varphi = u_t \in C_b^3$, uniformly in $t \in [0,T]$),

\[ \varrho_{s,t}(u_t) \overset{3\alpha}{=} \varrho_s(\Gamma u_t) W_{s,t} + \varrho_s(\Gamma^2 u_t) \mathbb{W}_{s,t} . \]

The second summand on the other hand expands, using the defining property of
regular backward equation,

\[ \varrho_s(u_{s,t}) = -\varrho_s(u_s - u_t) \overset{3\alpha}{=} -\varrho_s(\Gamma u_t) W_{s,t} - \varrho_s(\Gamma^2 u_t) \mathbb{W}_{s,t} . \]

(Here one needs to argue that the $3\alpha$-bound on $u_{s,t}(x) + \Gamma u_t(x) W_{s,t} + \Gamma^2 u_t(x) \mathbb{W}_{s,t}$
is uniform in $x$, for $u_T \in C_b^3$, and thus the same $3\alpha$-estimate holds after integrating
against $\varrho_s(dx)$.) Taken together we see a perfect cancellation so that
$\varrho_t(u_t) - \varrho_s(u_s) \overset{3\alpha}{=} 0$. By a familiar argument (using $3\alpha > 1$) this implies that $t \mapsto \varrho_t(u_t)$ is
constant and thus

\[ \varrho_T(g) = \varrho_T(u_T) = \varrho_0(u_0) = \nu(u_0) \]

where $u$ is a regular backward RPDE solution (with terminal data $g = u_T \in C_b^3$).
(Uniqueness of the regular backward RPDE solutions is not used here.) Hence, with
given initial data $\varrho_0 = \nu \in \mathcal{M}$ we see that $\varrho_T(g)$ is determined for all $g \in C_b^3$ and this
(uniquely) determines the measure $\varrho_T \in \mathcal{M}$. □
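
The forward-backward duality behind the uniqueness argument, namely that $t \mapsto \varrho_t(u_t)$ is constant, can be observed directly in a smooth scalar toy case. This is a minimal sketch under assumptions not made in the text: $d = n = 1$, the vector field $f = \sin$ with its explicit flow $X_t^{s,x} = 2\arctan(\tan(x/2)\,e^{W_t - W_s})$ for $x \in (-\pi,\pi)$, a smooth driver `W`, terminal datum $g = \cos$, and a hypothetical discrete initial measure $\nu = \sum_k w_k \delta_{x_k}$.

```python
import math

def W(t):
    # Hypothetical smooth driving signal (assumption, not from the text).
    return math.sin(3.0 * t) + 0.5 * t

def flow(s, t, x):
    """X_t^{s,x} for dX = sin(X) dW; closed form valid for x in (-pi, pi)."""
    return 2.0 * math.atan(math.tan(0.5 * x) * math.exp(W(t) - W(s)))

def g(x):
    # Hypothetical terminal datum for the backward equation.
    return math.cos(x)

def u(t, y, T=1.0):
    """Backward solution u_t(y) = g(X_T^{t,y})."""
    return g(flow(t, T, y))

# Hypothetical discrete initial measure nu = sum_k w_k * delta_{x_k}.
POINTS = [-2.0, -0.5, 0.3, 1.2]
WEIGHTS = [0.1, 0.4, 0.3, 0.2]

def rho(t, phi):
    """Forward solution rho_t(phi) = integral of phi(X_t^{0,x}) nu(dx)."""
    return sum(w * phi(flow(0.0, t, x)) for x, w in zip(POINTS, WEIGHTS))

def pairing(t, T=1.0):
    """The duality pairing rho_t(u_t), expected to be independent of t."""
    return rho(t, lambda y: u(t, y, T))
```

Evaluating `pairing` at several times in $[0,T]$ gives the same number up to rounding, so that `pairing(1.0)` $= \varrho_T(g)$ agrees with `pairing(0.0)` $= \nu(u_0)$. Here this is simply the flow property; in the rough setting the same conclusion is obtained from the $3\alpha$-cancellation above.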

Remark 12.9. The uniqueness part of the proof actually shows that analytically weak
solutions to the rough PDE (12.6) can be tested in space-time with test functions
$\varphi = \varphi(t,x)$ that have a precise controlled structure, starting with

\[ \varphi_s - \varphi_t \overset{3\alpha}{=} \varphi_t' W_{s,t} + \varphi_t'' \mathbb{W}_{s,t} \]

(and then $2\alpha$, resp. $\alpha$ expansions for $\varphi'$ and $\varphi''$). This space of test functions is
tailored to the realisation of the noise $\mathbf{W}$.

12.2 Second order rough partial differential equations

12.2.1 Linear theory: Feynman–Kac

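As motivation, consider the second order stochastic partial differential equation with
$d$-dimensional Brownian noise in (backward) Stratonovich form, posed as terminal
value problem,

\[ -du = L[u] \, dt + \Gamma[u] \circ \overleftarrow{dB} , \qquad u(T, \cdot) = g , \tag{12.8} \]

for $u = u(\omega) : [0,T] \times \mathbb{R}^n \to \mathbb{R}$, with differential operators $L$ and $\Gamma = (\Gamma_1, \dots, \Gamma_d)$
given by

\[ L[u] \overset{\mathrm{def}}{=} \tfrac12 \operatorname{Tr}\big( \sigma(x) \sigma^T(x) D^2 u \big) + b(x) \cdot Du + c(x) u , \tag{12.9} \]
\[ \Gamma_i[u] \overset{\mathrm{def}}{=} \beta_i(x) \cdot Du + \gamma_i(x) u . \]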

The coefficients $\sigma = (\sigma_1, \dots, \sigma_m)$, $b$ and $\beta = (\beta_1, \dots, \beta_d)$ are viewed as vector
fields on $\mathbb{R}^n$, while $c, \gamma_1, \dots, \gamma_d$ are scalar functions. For simplicity only, all
coefficients are assumed to be bounded with bounded derivatives of all orders (but see
Remark 12.12). We assume $g \in BC(\mathbb{R}^n)$, that is bounded and continuous.¹ As in the
previous section, we are interested in replacing $W$ by a genuine (geometric) rough
path $\mathbf{W}$, such as to solve the rough partial differential equation (RPDE)

\[ -du = L[u] \, dt + \Gamma[u] \, d\mathbf{W} , \qquad u(T, \cdot) = g . \tag{12.10} \]

¹ In contrast to the space $C_b$ we shall equip $BC$ with the topology of locally uniform convergence.

We have already treated the fully degenerate case $L = 0$, with pure transport noise,
$\Gamma_i = \beta_i(x) \cdot D_x$, in Section 12.1.1. Since geometric rough paths are limits of smooth
paths, we start with the case when $\mathbf{W}$ is replaced by $\dot W \, dt$, for $W \in C^1([0,T], \mathbb{R}^d)$.
It is a basic exercise in Itô calculus that any bounded $C^{1,2}$ solution to

\[ -\partial_t u = L[u] + \sum_{i=1}^{d} \Gamma_i[u] \dot W^i_t , \qquad u(T, \cdot) = g , \tag{12.11} \]

is given by the classical Feynman–Kac formula (and hence also unique),

\[ u(s,x) = \mathbf{E}^{s,x} \Big[ g(X_T) \exp\Big( \int_s^T c(X_t) \, dt + \int_s^T \gamma(X_t) \dot W_t \, dt \Big) \Big] \tag{12.12} \]
\[ \phantom{u(s,x)} =: S[W; g](s,x) , \tag{12.13} \]

where $X$ is the (unique) strong solution to

\[ dX_t = \sigma(X_t) \, dB(\omega) + b(X_t) \, dt + \beta(X_t) \dot W_t \, dt , \tag{12.14} \]

where $B$ is an $m$-dimensional standard Brownian motion. When $\sigma \equiv 0$, this is nothing
but the method of characteristics, previously encountered for the transport equation in
(12.3). (For the moment, we keep $W \in C^1$, but will soon encounter rough stochastic
characteristics.)
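
For a smooth driver, the Feynman–Kac formula (12.12) lends itself to a simple Monte Carlo approximation. The following is a minimal sketch in the scalar case $n = m = d = 1$, not a method taken from the text: the Euler–Maruyama discretisation of (12.14), the function name `feynman_kac` and all of its parameters are illustrative assumptions, and the integral $\int \gamma(X_t) \dot W_t \, dt$ is approximated by left-point sums against increments of $W$.

```python
import math
import random

def feynman_kac(s, x, T, g, sigma, b, beta, c, gamma, W,
                n_paths=20000, n_steps=20, seed=0):
    """Monte Carlo estimate of u(s, x) in (12.12), scalar case, smooth driver W.

    Euler-Maruyama scheme for dX = sigma(X) dB + b(X) dt + beta(X) dW_t,
    with the exponential weight accumulated along each path.
    """
    rng = random.Random(seed)
    h = (T - s) / n_steps
    total = 0.0
    for _ in range(n_paths):
        X, log_weight, t = x, 0.0, s
        for _ in range(n_steps):
            dW = W(t + h) - W(t)               # increment of the smooth driver
            dB = rng.gauss(0.0, math.sqrt(h))  # Brownian increment
            log_weight += c(X) * h + gamma(X) * dW
            X += sigma(X) * dB + b(X) * h + beta(X) * dW
            t += h
        total += g(X) * math.exp(log_weight)
    return total / n_paths
```

With constant coefficients $\sigma \equiv 1$, $b \equiv 0$, $\beta \equiv \beta_0$, $c \equiv c_0$, $\gamma \equiv \gamma_0$ and $g = \cos$, formula (12.12) can be evaluated by hand, $u(s,x) = \cos(x + \beta_0 W_{s,T})\, e^{-(T-s)/2}\, e^{c_0 (T-s) + \gamma_0 W_{s,T}}$, which provides a sanity check for the estimator up to Monte Carlo error.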
Remark 12.10. The natural form of the Feynman–Kac formula is the reason for
considering terminal value problems here, rather than Cauchy problems of the form
∂t u = L[u] + Γ [u]Ẇ with given initial data u(0, • ). Of course, a change of the time
variable $t \mapsto T - t$ allows one to switch between these problems.
Clearly, there are situations when solutions cannot be expected to be $C^{1,2}$, notably
when $g \notin C^2$ and $L$ fails to provide smoothing as is the case, for example, in
“transport” equations where $L$ is of first order. In such a case, formula (12.12) is
a perfectly good way to define a generalised solution to (12.11). Such a solution
need not be $C^{1,2}$ although it is bounded and continuous on $[0,T] \times \mathbb{R}^n$, as one can
see directly from (12.12). As a matter of fact, (12.12) yields an (analytically) weak
PDE solution (cf. Exercise 12.1). It is also a stochastic representation of the unique
(bounded) viscosity solution [CIL92, FS06] to (12.11) although this will play no
role for us in the present section. The main result here is the following rough path
stability for linear second order RPDEs.
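Theorem 12.11. Let $\alpha \in (\frac13, \frac12]$. Given a geometric rough path $\mathbf{W} = (W, \mathbb{W}) \in
\mathscr{C}_g^{0,\alpha}([0,T], \mathbb{R}^d)$, pick $W^\varepsilon \in C^1([0,T], \mathbb{R}^d)$ so that

\[ (W^\varepsilon, \mathbb{W}^\varepsilon) := \Big( W^\varepsilon, \int_0^{\cdot} W^\varepsilon_{0,t} \otimes dW^\varepsilon_t \Big) \to \mathbf{W} \]

in $\alpha$-Hölder rough path metric. Then there exists $u = u(t,x) \in BC([0,T] \times \mathbb{R}^n)$,
not dependent on the approximating $(W^\varepsilon)$ but only on $\mathbf{W} \in \mathscr{C}_g^{0,\alpha}([0,T], \mathbb{R}^d)$, so
that, for $g \in BC(\mathbb{R}^n)$,

\[ u^\varepsilon = S[W^\varepsilon; g] \to u =: S[\mathbf{W}; g] \]

as $\varepsilon \to 0$ in the sense of locally uniform convergence. Moreover, the resulting solution
map

\[ S : \mathscr{C}_g^{0,\alpha}([0,T], \mathbb{R}^d) \times BC(\mathbb{R}^n) \to BC([0,T] \times \mathbb{R}^n) \]

is continuous. We say that $u$ satisfies the RPDE (12.10).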

Proof. Step 1: Write $X = X^W$ for the solution to (12.14) whenever $W \in C^1$. The
first step is to make sense of the stochastic RDE

\[ dX_t = \sigma(X_t) \, dB_t + b(X_t) \, dt + \beta(X_t) \, d\mathbf{W}_t . \tag{12.15} \]

This is clearly not an equation that can be solved by Itô theory alone. But it is also
not immediately well-posed as a rough differential equation since for this we would
need to understand $B$ and $\mathbf{W} = (W, \mathbb{W})$ jointly as a rough path. In view of the
Itô differential $dB$ in (12.15), we take $(B, \mathbb{B}^{\text{Itô}})$, as constructed in Section 3.2,
and are basically short of the cross-integrals between $B$ and $W$. (For simplicity
of notation only, pretend over the next few lines $W, B$ to be scalar.) We can define
$\int W \, dB(\omega)$ as a Wiener integral (Itô with deterministic integrand), and then
$\int B \, dW = WB - \int W \, dB$ by imposing integration by parts. We then easily get the estimate

\[ \mathbf{E} \Big| \int_s^t W_{s,r} \, dB_r \Big|^2 \lesssim \|W\|_\alpha^2 \, |t-s|^{2\alpha+1} , \]

also when switching the roles of $W, B$, thanks to the integration by parts formula. It
follows from Kolmogorov’s criterion that $\mathbf{Z}^W(\omega) := \mathbf{Z} = (Z, \mathbb{Z}) \in \mathscr{C}^{\alpha'}$ a.s. for any
$\alpha' \in (1/3, \alpha)$ where

\[ Z_t = \begin{pmatrix} B_t(\omega) \\ W_t \end{pmatrix} , \qquad \mathbb{Z}_{s,t} = \begin{pmatrix} \mathbb{B}^{\text{Itô}}_{s,t}(\omega) & \int_s^t W_{s,r} \otimes dB_r(\omega) \\ \int_s^t B_{s,r} \otimes dW_r(\omega) & \mathbb{W}_{s,t} \end{pmatrix} , \]

where we reverted to tensor notation reflecting the multidimensional nature of $B, W$.
It is easy to deduce from Theorem 3.3 that, for any $q < \infty$,

\[ \Big| \varrho_{\alpha'}\big( \mathbf{Z}^{W}, \mathbf{Z}^{\tilde W} \big) \Big|_{L^q} \lesssim \varrho_\alpha\big( \mathbf{W}, \tilde{\mathbf{W}} \big) . \tag{12.16} \]

We are hence able to say that a solution $X = X(\omega)$ of (12.15) is, by definition, a
solution to the genuine (random) rough differential equation

\[ dX = (\sigma, \beta)(X) \, d\mathbf{Z}^W(\omega) + b(X) \, dt \tag{12.17} \]

driven by the random rough path $\mathbf{Z} = \mathbf{Z}^W(\omega)$. Moreover, as an immediate consequence
of (12.16) and continuity of the Itô–Lyons map, we see that $X$ is really the
limit, e.g. in probability and uniformly on $[0,T]$, of classical Itô SDE solutions $X^\varepsilon$,
obtained by replacing $d\mathbf{W}_t$ by $\dot W^\varepsilon_t \, dt$ in (12.15).
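
The second-moment estimate for the cross-integral in Step 1 follows from the Itô isometry: $\mathbf{E}|\int_s^t W_{s,r}\,dB_r|^2 = \int_s^t |W_{s,r}|^2\,dr \le \|W\|_\alpha^2 (t-s)^{2\alpha+1}/(2\alpha+1)$. A discrete version can be checked deterministically. This is a minimal sketch under stated assumptions: scalar $W$ sampled on a grid, left-point Riemann sums in place of the Wiener integral, the Hölder seminorm taken over grid points only, and a hypothetical sample path `example_path` chosen to be exactly $0.4$-Hölder.

```python
import math

def example_path(times):
    # Hypothetical deterministic path with a 0.4-Hoelder kink at r = 0.3.
    return [abs(r - 0.3) ** 0.4 + math.sin(5.0 * r) for r in times]

def holder_seminorm(path, times, alpha):
    """Discrete alpha-Hoelder seminorm, maximised over all pairs of grid points."""
    n = len(times)
    return max(abs(path[j] - path[i]) / (times[j] - times[i]) ** alpha
               for i in range(n) for j in range(i + 1, n))

def wiener_integral_variance(path, times, i0, i1):
    """Variance of sum_k W_{s,r_k} (B_{r_{k+1}} - B_{r_k}) with s = times[i0], t = times[i1].

    By independence of Brownian increments this equals sum_k |W_{s,r_k}|^2 (r_{k+1} - r_k).
    """
    return sum((path[k] - path[i0]) ** 2 * (times[k + 1] - times[k])
               for k in range(i0, i1))

def variance_bound(path, times, i0, i1, alpha):
    """Right-hand side ||W||_alpha^2 |t - s|^(2 alpha + 1) / (2 alpha + 1)."""
    norm = holder_seminorm(path, times, alpha)
    length = times[i1] - times[i0]
    return norm ** 2 * length ** (2.0 * alpha + 1.0) / (2.0 * alpha + 1.0)
```

The inequality `wiener_integral_variance(...) <= variance_bound(...)` holds on any grid, because the left-point sum of the increasing function $r \mapsto (r-s)^{2\alpha}$ is dominated by its integral. The factor $|t-s|^{2\alpha+1}$ is what Kolmogorov's criterion then converts into Hölder regularity of the cross-integrals.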
Step 2: Given $(s,x)$ we have a solution $(X_t : s \le t \le T)$ to the hybrid equation
(12.15), started at $X_s = x$. In fact $(X, X') \in \mathscr{D}_Z^{2\alpha'}$ with $X' = (\sigma, \beta)(X)$. In
particular, the rough integral

\[ \int \gamma(X) \, d\mathbf{W} := \int (0, \gamma(X)) \, d\mathbf{Z} \]

is well-defined, as is, with regard to the Feynman–Kac formula (12.12), the random
variable

\[ g(X_T) \exp\Big( \int_s^T c(X_t) \, dt + \int_s^T \gamma(X_t) \, d\mathbf{W}_t \Big)(\omega) . \tag{12.18} \]

One can see, similar to (11.10), but now also relying on RDE growth estimates as
established in Proposition 8.2, with $p = 1/\alpha'$,

\[ \Big| \int_s^t \gamma(X) \, d\mathbf{W} \Big| \lesssim |||\mathbf{Z}|||_{p\text{-var};[s,t]} \]

whenever $|||\mathbf{Z}|||_{p\text{-var};[s,t]}$ is of order one. An application of the generalised Fernique
Theorem 11.7, similar to the proof of Theorem 11.13 but with $\varrho = 1$ in the present
context, then shows that the number of consecutive intervals on which $\mathbf{Z}$ accumulates
unit $p$-variation has Gaussian tails. (In fact, this holds uniformly in $\varepsilon \in (0,1]$, if $\mathbf{W}$ is replaced by
$W^\varepsilon$ with limit $\mathbf{W}$.) This implies that (12.18) is integrable (and uniformly integrable
with respect to $\varepsilon$ when $\mathbf{W}$ is replaced by $W^\varepsilon$). It follows that

\[ u(s,x) := \mathbf{E}^{s,x} \Big[ g(X_T) \exp\Big( \int_s^T c(X_t) \, dt + \int_s^T \gamma(X_t) \, d\mathbf{W}_t \Big) \Big] \tag{12.19} \]

is indeed well-defined and the pointwise limit of $u^\varepsilon$ (defined in the same way, with
$\mathbf{W}$ replaced by $W^\varepsilon$). By an Arzelà–Ascoli argument, the limit is locally uniform. At
last, the claimed continuity of the solution map follows from the same arguments,
essentially by replacing $W^\varepsilon$ by $\mathbf{W}^\varepsilon$ everywhere in the above argument, and of course
using (12.19) with $g, \mathbf{W}$ replaced by $g^\varepsilon, \mathbf{W}^\varepsilon$, respectively. □

Remark 12.12. The proof actually shows that our solution $u = u(s,x;\mathbf{W})$ to the
linear RPDE (12.10) enjoys a Feynman–Kac type representation, namely (12.19),
in terms of the process constructed as solution to the hybrid Itô-rough differential
equation (12.15). Assume now $W$ is a Brownian motion, independent of
$B$, and $\mathbf{W}(\omega) = \mathbf{W}^{\text{Strat}} = (W, \mathbb{W}^{\text{Strat}}) \in \mathscr{C}_g^{0,\alpha}$ a.s. It is not difficult to show
that $u = u(\cdot, \cdot; \mathbf{W}^{\text{Strat}}(\omega))$ coincides with the Feynman–Kac SPDE solution derived
by Pardoux [Par79] or Kunita [Kun82], via conditional expectations given
$\sigma(\{W_{u,v} : s \le u \le v \le T\})$, and so provides an identification with classical SPDE
theory. In conjunction with continuity of the solution map $S = S[\mathbf{W}; g]$ one obtains,
along the lines of Section 9.2, SPDE limit theorems of Wong–Zakai type,
Stroock–Varadhan type support statements and Freidlin–Wentzell type small noise
large deviations.
Remark 12.13. It is easy to quantify the required regularity of the coefficients. The
argument essentially relies on solving (12.17) as a bona fide rough differential equation.
It is then clear that we need to impose $C_b^3$-regularity for the vector fields $\sigma$ and $\beta$.
The drift vector field $b$ may be taken to be Lipschitz and $c \in C_b$.
Remark 12.14. We have not given meaning to the actual equation (12.10) which we
here reproduce equivalently (cf. Remark 12.10) in the form

\[ du = L[u] \, dt + \Gamma[u] \, d\mathbf{W} , \qquad u(0, \cdot) = u_0 . \tag{12.20} \]

Indeed, in the absence of ellipticity or Hörmander type conditions on $L$, the solution
may not be any more regular than the initial data $g$ so that in general (for $g \in C_b$,
say) the action of the first order differential operator $\Gamma = (\Gamma_1, \dots, \Gamma_d)$ on $u$ has no
pointwise meaning, let alone its rough integral against $\mathbf{W}$. On the other hand, we
can (at least formally) test the equation against $\varphi \in \mathcal{D} = C_c^\infty(\mathbb{R}^n)$ and so arrive at the
following “analytically weak” formulation: call $u = u(s,x;\mathbf{X})$ a weak solution to
(12.20) if for every $\varphi \in \mathcal{D}$ and $0 \le t \le T$ the following integral formula holds:

\[ \langle u_t, \varphi \rangle = \langle u_0, \varphi \rangle + \int_0^t \langle u_s, L^* \varphi \rangle \, ds + \int_0^t \langle u_s, \Gamma^* \varphi \rangle \, d\mathbf{W}_s . \tag{12.21} \]
0 0

In Exercise 12.1 the reader is invited to check that our Feynman–Kac solution is
indeed a weak solution in this sense. In particular, the final integral term is a bona
fide rough integral of the controlled rough path $(Y, Y') \in \mathscr{D}_W^{2\alpha}$ against $\mathbf{W}$, where

\[ Y_t = \langle u_t, \Gamma^* \varphi \rangle , \qquad Y_t' = \langle u_t, \Gamma^* \Gamma^* \varphi \rangle . \tag{12.22} \]

It is seen in [DFS17] that a uniqueness result holds for such weak RPDE solutions,
provided in the definition a suitable uniformity over the test function $\varphi$ is
required. The strategy is very similar to what was seen in Section 12.1.2: arguing
(for convenience) on the terminal value formulation (12.10), we construct a regular
forward solution and then employ a forward-backward argument to obtain uniqueness.
This is essentially the uniqueness argument employed in Theorem 12.8, with switched
roles of forward and backward evolution. Alternatively, in [HH18] the unbounded
rough driver framework of [DGHT19b] has been adapted to linear second order
RPDEs with $L$ in divergence form.
Remark 12.15. Let $u = u(t,x;\mathbf{X})$ be a weak solution in the sense of (12.21), and
$W$ be a Brownian motion with Stratonovich rough path lift $\mathbf{W} = \mathbf{W}^{\text{Strat}}(\omega)$. Then,
thanks to Theorem 5.14, it follows that $u(t,x;\omega) := u(t,x;\mathbf{W}^{\text{Strat}}(\omega))$ yields an
analytically weak SPDE solution in the sense that for every $\varphi \in \mathcal{D}$ and $0 \le t \le T$
one has, with probability one,

\[ \langle u_t, \varphi \rangle = \langle u_0, \varphi \rangle + \int_0^t \langle u_s, L^* \varphi \rangle \, ds + \int_0^t \langle u_s, \Gamma^* \varphi \rangle \circ dW_s , \]

where the existence of the Stratonovich integral is implied by Corollary 5.2.

12.2.2 Mild solutions to semilinear RPDEs

We now turn to a class of “abstract” rough evolution problems introduced by
Gubinelli–Tindel [GT10], although our exposition is taken from [GH19]. Following
a familiar picture in PDE theory, we would like to view an RPDE solution as a
controlled path with values in a Hilbert space $H$ which solves an RDE of the form

\[ du_t = L u_t \, dt + F(u_t) \, d\mathbf{X}_t \quad \text{and} \quad u_0 = \xi \in H . \tag{12.23} \]

Here, $\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}^{\gamma}([0,T], \mathbb{R}^d)$, $\gamma \in (1/3, 1/2]$, not necessarily geometric. $L$ is
a negative definite self-adjoint operator, $F = (F_1, \dots, F_d)$ are suitable (essentially
0-order) operators. In particular, no transport noise is covered by our setup so that – in
contrast to previous sections – there is no restriction here to geometric rough paths.
Remark 12.16. Unlike Section 12.2.1 (Feynman–Kac) and Section 12.2.4 below
(maximum principle), the present section is not really restricted to second order
equations, even though these constitute the typical examples we have in mind.
To fix ideas, we give an example that will fit into the framework described below.
Example 12.17. Consider the rough reaction diffusion equation2

dut (x) = ∆u(x) dt + f (ut (x)) dt + p(ut (x)) dXt , (12.24)

with ut : Tn → Rl where Tn is the n-dimensional torus with Laplace operator


∆, and polynomial nonlinearities f and p = (p1 , . . . , pd ) on Rl . As as typical in
PDE theory, one looks for solutions ut ∈ H k (Tn , Rl ) =: H, where H k is the L2 -
based Sobolev space with k weak derivates in L2 . Of course, ∆ is negative definite
self-adjoint on H, with dense domain Dom(∆) = H1 , where we set (in agreement
with a later abstract interpolation space definition) Hα = H k+2α (Tn , Rl ), and also
note that the heat semigroup (e∆t )t≥0 acts naturally on this Sobolev scale. The
nonlinearity in this example is given by composition with a polynomial. Smoothness
of this operation requires H to be an algebra, which, by basic Sobolev theory, requires
k > n/2. The main theorem below requires each nonlinearity (as operator, here:
u 7→ pi ◦ u) to be C 3 in Fréchet sense as map from H−2γ = H k−4γ into itself.
Therefore we have the requirement on k to satisfy k > n/2 + 4γ. This means that
γ = 1/3+ is the optimal choice (in a level-2 rough path setting). Of course, this
covers the case of Brownian rough paths so that X above can be replaced by WItô (ω)
or WStrat (ω).
2
As in the case of RDEs with additional drift vector field, Exercise 8.5, the extra nonlinearity
(f ◦ ut ) dt can be absorbed in the X-term, by working with the space-time extensions of X. Less
trivially, a direct analysis allows for more general nonlinearities in (12.23) such as to handle 2D
Navier–Stokes with rough noise.

We want to give meaning to the rough partial differential equation (12.23). Similar
to (12.21), there is a natural, still formal, analytically weak formulation: for every
$h \in \mathrm{Dom}(L) \subset \mathcal H$ and $0 \le t \le T$ the following integral formula holds (angle
brackets denote the inner product in $\mathcal H$):

\[ \langle u_t, h\rangle = \langle \xi, h\rangle + \int_0^t \langle u_s, Lh\rangle\,ds + \int_0^t \langle F(u_s), h\rangle\,d\mathbf X_s . \tag{12.25} \]

On the other hand, if $(S_t)_{t\ge 0}$ denotes the associated semigroup $S_t = e^{Lt}$ (which is
analytic since $L$ is assumed to be self-adjoint) one expects a mild formulation of the
form, for all $0 \le t \le T$,

\[ u_t = S_t \xi + \int_0^t S_{t-s} F(u_s)\,d\mathbf X_s , \tag{12.26} \]

where the identity holds between elements in $\mathcal H$. The regularity of $F$ will be measured
in Fréchet sense, as map from $\mathcal H_\alpha$ to itself, for a range of $\alpha \in \mathbb R$ to be specified.\textsuperscript{3}
Here, for $\alpha \ge 0$, the interpolation space $\mathcal H_\alpha = \mathrm{Dom}((-L)^\alpha)$ is a Hilbert space
when endowed with the norm $\|\cdot\|_{\mathcal H_\alpha} = \|(-L)^\alpha \,\cdot\,\|_{\mathcal H}$. Similarly, $\mathcal H_{-\alpha}$ is defined as
the completion of $\mathcal H$ with respect to the norm $\|\cdot\|_{\mathcal H_{-\alpha}} = \|(-L)^{-\alpha} \,\cdot\,\|_{\mathcal H}$. Note that
this setting is compatible with that of Exercise 4.16.
The weak formulation requires of course that $s \mapsto \langle F(u_s), h\rangle$ has meaning as a
controlled rough path, so that (12.25) is well-defined. In the mild formulation (12.26)
on the other hand we recognise the rough convolution integral previously defined in
(4.47), provided that $s \mapsto F(u_s)$ is mildly controlled in the sense of (4.46). It can
be seen that weak and mild solutions coincide [GH19]. (The proof of this involves a
simple variant of the rough Fubini theorem from Exercise 4.11.) In what follows we
only consider the mild formulation.
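We introduce the following spaces which are a slight strengthening of the spaces
$\mathscr D^{2\gamma}_{S,X}$ introduced in Exercise 4.17:

\[ \mathcal D^{2\gamma}_X([0,T], \mathcal H_\alpha) = \mathscr D^{2\gamma}_{S,X}([0,T], \mathcal H_\alpha) \cap \Big( \hat{\mathcal C}^\gamma([0,T], \mathcal H_{\alpha+2\gamma}) \times L^\infty([0,T], \mathcal H_{\alpha+2\gamma}) \Big) . \]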

The basic ingredients, stability of mildly controlled rough paths under rough
convolution and composition with regular functions, were already established in
Exercises 4.17 and 7.3. Taken together, they show that the image of $(Y, Y') \in \mathcal D^{2\gamma}_X([0,T], \mathcal H)$ under the map

\[ \mathcal M_T(Y, Y')_t := \Big( S_t \xi + \int_0^t S_{t-u} F(Y_u)\,d\mathbf X_u ,\; F(Y_t) \Big) \tag{12.27} \]

yields again an element of $\mathcal D^{2\gamma}_X([0,T], \mathcal H)$. We now show that for small enough times
this map has a unique fixed point:

3
This rules out taking any derivatives in $F$. In particular, the previously considered transport noise,
which involved $D_x u$, is not accommodated in this setting.

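Theorem 12.18 (Rough Evolution Equation). Let $\xi \in \mathcal H$, $F_1, \dots, F_d : \mathcal H \to \mathcal H$,
bounded in $C^3(\mathcal H_\beta, \mathcal H_\beta)$ on bounded sets for every $\beta \ge -2\gamma$, for some $\gamma \in (1/3, 1/2]$,
and $\mathbf X = (X, \mathbb X) \in \mathscr C^\gamma(\mathbb R_+, \mathbb R^d)$. Then there exists $\tau > 0$ and a unique
element $(Y, Y') \in \mathcal D^{2\gamma}_X([0,\tau), \mathcal H)$ such that $Y' = F(Y)$ and

\[ Y_t = S_t \xi + \int_0^t S_{t-u} F(Y_u)\,d\mathbf X_u , \qquad t < \tau . \tag{12.28} \]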

Proof. First note $\mathbf X = (X, \mathbb X) \in \mathscr C^\gamma \subset \mathscr C^\eta$ for $1/3 < \eta < \gamma \le 1/2$. Fixing $T < 1$,
we will find a solution $(Y, Y') \in \mathcal D^{2\eta}_X([0,T], \mathcal H_{2\eta-2\gamma})$ as a fixed point of the map
$\mathcal M_T$ given by (12.27). In the end we will briefly describe how one can make an
improvement and show that one actually has $(Y, Y') \in \mathcal D^{2\gamma}_X([0,T], \mathcal H)$. The proof
is analogous to that of Theorem 8.3, the only difference being that we have two different
scales of space regularity for which we need to be able to obtain the bound (7.14), as
prepared in Exercise 4.17. We will therefore show only invariance of the solution
map (12.27), because proving it already contains all the techniques that are not
present in Theorem 8.3.
Note that if $(Y, Y')$ is such that $(Y_0, Y_0') = (\xi, F(\xi))$ then the same is true for
$\mathcal M_T(Y, Y')$, so we can view $\mathcal M_T$ as a map on the complete metric space

\[ B_T = \Big\{ (Y, Y') \in \mathcal D^{2\eta}_X([0,T], \mathcal H_{2\eta-2\gamma}) : Y_0 = \xi,\ Y_0' = F(\xi),\ \|(Y, Y')\|^\wedge_{X,2\eta;-2\gamma}
 + \|Y - S_\bullet F(\xi) X_{0,\bullet}\|_{\eta;2\eta-2\gamma} + \|Y' - S_\bullet F(\xi)\|_{\infty;2\eta-2\gamma} \le 1 \Big\} . \]

(We use the same notational convention as in Exercise 4.17, namely indices after
a semicolon indicate in which one of the $\mathcal H_\alpha$ the norms are taken.) Note that by the
triangle inequality for $(Y, Y') \in B_T$ we have

\[ \|(Y, Y')\|_{\mathcal D^{2\eta}_X} \lesssim (1 + \|\xi\| + \|F(\xi)\|)(1 + \|X\|_\gamma) \lesssim 1 . \]

Here and below we write $A \lesssim B$ as a shorthand for $A \le CB$ for a constant $C$ that
may depend on $\gamma$, $\eta$, $X$, $\mathbb X$, $F$ and $\xi$, but is uniform over $T \in (0, 1]$.
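It remains to show that for $T$ small enough $\mathcal M_T$ leaves $B_T$ invariant and is
contracting there, so that the claim follows from the Banach fixed point theorem. We
will consider the simpler case when $F$ is $C_b^3$. For $(Z_t, Z_t') = (F(Y_t), DF(Y_t) \circ Y_t')$
we have by Exercise 7.3

\[ \|(Z, Z')\|_{X,2\eta} \lesssim (1 + \|(Y, Y')\|_{\mathcal D^{2\eta}_X})^2 \lesssim (1 + \|\xi\| + \|F(\xi)\|)^2 \lesssim 1 , \]

and from Exercise 4.17

\begin{align*}
\|\mathcal M_T(Y, Y')\|_{X,2\eta} &= \Big\| \Big( \int_0^\bullet S_{\bullet-u} Z_u\,d\mathbf X_u ,\; Z \Big) \Big\|_{X,2\eta} \\
&\lesssim \|Z\|_{\eta,-2\gamma} + \big( \|Z_0'\|_{\mathcal H_{-2\gamma}} + \|(Z, Z')\|^\wedge_{X,2\eta;-2\gamma} \big)\, \varrho_\eta(0, \mathbf X) \\
&\lesssim \|Z\|_{\eta,-2\gamma} + \big( \|Z_0'\|_{\mathcal H_{-2\gamma}} + \|(Z, Z')\|^\wedge_{X,2\eta;-2\gamma} \big)\, T^{\gamma-\eta} .
\end{align*}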

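Since $(Y, Y') \in B_T$, we have the bound $\|Y\|_{\eta,-2\gamma} \le (\|X\|_\gamma + 1) T^{\gamma-\eta}$. One can
also show along the same lines as in Exercise 7.3 that

\begin{align*}
\|\hat\delta Z_{s,t}\|_{\mathcal H_{-2\gamma}} &\lesssim \|\hat\delta Y_{s,t}\|_{\mathcal H_{-2\gamma}} + \|S_{t-s} Y_s - Y_s\|_{\mathcal H_{-2\gamma}} + |t-s|^{2\eta} \|F(Y_s)\|_{\mathcal H_{2\eta-2\gamma}} \\
&\lesssim T^{\gamma-\eta} |t-s|^\eta + |t-s|^{2\eta} \|Y_s\|_{\mathcal H_{2\eta-2\gamma}} + T^\eta |t-s|^\eta \\
&\lesssim \big( T^{\gamma-\eta} + T^{\gamma+\eta} + T^\eta \big) |t-s|^\eta .
\end{align*}

Therefore since $T < 1$ we conclude that $\|Z\|_{\eta,-2\gamma} \lesssim T^{\gamma-\eta}$.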


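To estimate $\|\mathcal M_T(Y) - S_\bullet F(\xi) X_{0,\bullet}\|_{\eta,2\eta-2\gamma}$ we use the identity

\[ \hat\delta\big( S_\bullet F(\xi) X_{0,\bullet} \big)_{t,s} = S_t F(\xi) X_{s,t} \]

and since $2\eta < 1$ we can use a better bound from (4.48) to deduce:

\begin{align*}
\big\| \hat\delta\big( \mathcal M_T(Y) - S_\bullet F(\xi) X_{0,\bullet} \big)_{t,s} \big\|_{\mathcal H_{2\eta-2\gamma}}
&= \Big\| \int_s^t S_{t-u} F(Y_u)\,d\mathbf X_u - S_t F(\xi) X_{s,t} \Big\|_{\mathcal H_{2\eta-2\gamma}} \\
&\le \big( \|F(\xi)\|_{\mathcal H} + \|Z\|_{\infty;-2\gamma} \big) \|X\|_\eta |t-s|^\eta + \|Z'\|_{\infty;-2\gamma} \|\mathbb X\|_{2\eta} |t-s|^{2\eta} \\
&\quad + C \big( \|X\|_\eta |R^Z|_{2\eta} + \|\mathbb X\|_{2\eta} \|Z'\|_\eta \big) |t-s|^{3\eta-2\eta} \\
&\lesssim \big( \|F(\xi)\|_{\mathcal H} + \|Z_0'\|_{\mathcal H_{-2\gamma}} + \|(Z, Z')\|^\wedge_{X,2\eta;-2\gamma} \big) |t-s|^\eta \\
&\lesssim T^{\gamma-\eta} |t-s|^\eta .
\end{align*}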

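Finally we estimate the term $\|\mathcal M_T(Y)'_t - S_t F(\xi)\|_{\mathcal H_{2\eta-2\gamma}}$:

\begin{align*}
\|\mathcal M_T(Y)'_t - S_t F(\xi)\|_{\mathcal H_{2\eta-2\gamma}}
&= \|F(Y_t) - F(S_t \xi) + F(S_t \xi) - F(\xi) + F(\xi) - S_t F(\xi)\|_{\mathcal H_{2\eta-2\gamma}} \\
&\lesssim \|Y_t - S_t \xi\|_{\mathcal H_{2\eta-2\gamma}} + \|S_t \xi - \xi\|_{\mathcal H_{2\eta-2\gamma}} + \|F(\xi) - S_t F(\xi)\|_{\mathcal H_{2\eta-2\gamma}} \\
&\lesssim \|Y_t - S_t \xi - S_t F(\xi) X_{0,t}\|_{\mathcal H_{2\eta-2\gamma}} + \|F(\xi)\|_{\mathcal H} \|X\|_\gamma T^\gamma \\
&\quad + t^{2\gamma-2\eta} \|\xi\|_{\mathcal H} + t^{2\gamma-2\eta} \|F(\xi)\|_{\mathcal H} \\
&\lesssim \big( \|Y - S_\bullet F(\xi) X_{0,\bullet}\|_{\eta,2\eta-2\gamma} T^\eta + T^\gamma + T^{2\gamma-2\eta} \big) \lesssim T^{\gamma-\eta} .
\end{align*}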

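Putting it all together we obtain the bound

\[ \|\mathcal M_T(Y) - S_\bullet F(\xi) X_{0,\bullet}\|_{\eta;2\eta-2\gamma} + \|\mathcal M_T(Y)' - S_\bullet F(\xi)\|_{\infty;2\eta-2\gamma} + \|\mathcal M_T(Y, Y')\|^\wedge_{X,2\eta;-2\gamma} \lesssim T^{\gamma-\eta} . \]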

If $T$ is small enough we guarantee that the left-hand side of the above expression
is smaller than 1, thus proving that $B_T$ is invariant under $\mathcal M_T$. In order to show
contractivity of $\mathcal M_T$, one can use analogous steps to first show

\[ \|\mathcal M_T(Y, Y') - \mathcal M_T(V, V')\|_{\mathcal D^{2\eta}_X} \lesssim \|(Y - V, Y' - V')\|_{\mathcal D^{2\eta}_X}\, T^{\gamma-\eta} . \]

This guarantees contractivity for small enough $T$, completing the fixed point argument
and thus showing the existence of the unique maximal solution to (12.28).


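Let now $(Y, Y') \in \mathcal D^{2\eta}_X([0,T], \mathcal H_{2\eta-2\gamma})$ be the solution constructed above. We
sketch an argument showing that in fact it belongs to $\mathcal D^{2\gamma}_X([0,T], \mathcal H)$. We know that

\[ Y_t = S_t \xi + S_t F(\xi) X_{0,t} + S_t DF(\xi) F(\xi) \mathbb X_{0,t} + R_{0,t} , \tag{12.29} \]

\[ Y_t - S_{t-s} Y_s = S_{t-s} F(Y_s) X_{s,t} + S_{t-s} DF(Y_s) F(Y_s) \mathbb X_{s,t} + R_{s,t} . \tag{12.30} \]

Here $R_{s,t} = \int_s^t S_{t-r} F(Y_r)\,d\mathbf X_r - S_{t-s} F(Y_s) X_{s,t} - S_{t-s} DF(Y_s) F(Y_s) \mathbb X_{s,t}$. From
the estimate on $R_{0,t}$ using (4.48) and since $\xi \in \mathcal H$, we see that (12.29) implies $Y \in L^\infty([0,T], \mathcal H)$.
Moreover (12.30) implies $Y \in \hat{\mathcal C}^\gamma([0,T], \mathcal H_{-2\gamma})$ which, together
with $Y \in L^\infty([0,T], \mathcal H)$, implies $F(Y) \in \hat{\mathcal C}^\gamma([0,T], \mathcal H^d_{-2\gamma}) \cap L^\infty([0,T], \mathcal H^d_{2\eta-2\gamma})$.
This itself implies that $(Y, F(Y)) \in \mathscr D^{2\gamma}_{S,X}([0,T], \mathcal H_{-2\gamma})$ (using again (12.30)) and
$(F(Y), DF(Y)F(Y)) \in \mathscr D^{2\gamma}_{S,X}([0,T], \mathcal H_{-2\gamma})$ which enables us to get an estimate
for every $\beta < 3\gamma$:

\[ \|R_{s,t}\|_{\mathcal H_\beta} \lesssim \|F(Y), DF(Y)F(Y)\|^\wedge_{X,2\gamma;-2\gamma}\, |t-s|^{3\gamma-\beta} . \]

Taking $\beta = 2\gamma$ and using (12.30) again we show that $Y \in \hat{\mathcal C}^\gamma([0,T], \mathcal H)$, which
completes the proof that $(Y, Y') \in \mathcal D^{2\gamma}_X([0,T], \mathcal H)$. □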

12.2.3 Fully nonlinear equations with semilinear rough noise

We now consider nonlinear rough partial differential equations of the form

\[ du = F[u]\,dt + \sum_{i=1}^d H_i[u] \circ dW^i_t(\omega) , \qquad u(0, \cdot) = g , \tag{12.31} \]

with fully nonlinear, possibly degenerate, operator

\[ F[u] = F(x, u, Du, D^2 u), \]

and semilinear

\[ H_i[u] = H_i(x, u, Du) , \qquad i = 1, \dots, d . \]
We essentially rule out nonlinear dependence on $Du$, hence the terminology “semilinear
noise”, which makes a (global) flow transformation method work. In a stochastic
setting such transformations (at least in the linear case) are attributed to Kunita. As
already noted in the context of first order equations, the case $H_i = H_i(x, Du)$ requires
a subtle local version of such a transformation and is the topic of the pathwise
Lions–Souganidis theory of stochastic viscosity solutions for fully nonlinear SPDEs;
see [LS98a, LS98b, LS00b] and [Sou19] for a recent overview.
As in the previous section we aim to replace $\circ\,dW$ by a “rough” differential $d\mathbf W$,
for some geometric rough path $\mathbf W \in \mathscr C_g^{0,\alpha}([0,T], \mathbb R^d)$, and show that an RPDE
solution arises as the unique limit under approximations $(W^\varepsilon, \mathbb W^\varepsilon) \to \mathbf W$. Of course,
there is little one can say at this level of generality and we have not even clarified in
which sense we mean to solve (12.31) when $W \in C^1$! Let us postpone this discussion
and assume momentarily that $F$ and $H$ are sufficiently “nice” so that, for every
$W \in C^1$ and $g \in BC$, say, there is a classical solution $u = u(t, x)$ for $t > 0$.
With noise of the form $H[u]\dot W = \sum_i H_i(x, u, Du)\dot W^i$, we shall focus on the
following three cases.
a) Transport noise. For sufficiently nice vector fields $\beta_i$ on $\mathbb R^n$,

\[ H_i[u] = \beta_i(x) \cdot Du ; \]

b) Semilinear noise. For a sufficiently nice function $H_i$ on $\mathbb R^n \times \mathbb R$,

\[ H_i[u] = H_i(x, u); \]

c) Linear noise. With $\beta_i$ as above and sufficiently nice functions $\gamma_i$ on $\mathbb R^n$,

\[ H_i[u] = \Gamma_i[u] := \beta_i(x) \cdot Du + \gamma_i(x) u. \]

We now develop the “calculus” for the transformations associated to each of the
above cases. All proofs consist of elementary computations and are left to the reader.

Proposition 12.19 (Case a). Assume that $\psi = \psi^W$ is a $C^3$ solution flow of
diffeomorphisms associated to the ODE $\dot Y = -\beta(Y)\dot W$, where $W \in C^1$. (This is the case
if $\beta \in C_b^3$.) Then $u$ is a classical solution to

\[ \partial_t u = F(x, u, Du, D^2 u) + \langle \beta(x), Du \rangle \dot W \]

if and only if $v(t, x) = u(t, \psi_t(x))$ is a classical solution to

\[ \partial_t v - F^\psi(t, x, v, Dv, D^2 v) = 0 \]

where $F^\psi$ is determined from

\[ F^\psi(t, \psi_t(x), r, p, X) \overset{\text{def}}{=} F\big( x, r, \langle p, D\psi_t^{-1} \rangle, \langle X, D\psi_t^{-1} \otimes D\psi_t^{-1} \rangle + \langle p, D^2 \psi_t^{-1} \rangle \big) . \]
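To see the transformation of case a) at work, here is a minimal worked example. The choices $n = d = 1$, a constant vector field $\beta \equiv b$, $W_0 = 0$ and $F(x,u,Du,D^2u) = D^2 u$ are assumptions made purely for illustration and do not come from the text.

```latex
% Illustrative special case: n = d = 1, beta = b constant, W_0 = 0, F = D^2 u (all assumed).
\[
  \dot Y = -b\,\dot W
  \;\Longrightarrow\;
  \psi_t(x) = x - b W_t , \qquad
  \psi_t^{-1}(y) = y + b W_t , \qquad
  D\psi_t^{-1} = 1 , \quad D^2\psi_t^{-1} = 0 .
\]
% The formula for F^psi then reduces to F^psi(t, ., r, p, X) = X,
% so v solves the plain heat equation and u is recovered by a shift:
\[
  \partial_t v = \partial_x^2 v , \qquad
  u(t,x) = v(t, x + b W_t) , \qquad
  \partial_t u = \partial_x^2 u + b\,\partial_x u\,\dot W .
\]
```

The last identity follows from the chain rule, $\partial_t u = \partial_t v + b\,\partial_x v\,\dot W$ evaluated at $x + bW_t$; in this special case all dependence on $\dot W$ sits in the shift.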




Proposition 12.20 (Case b). For any fixed $x \in \mathbb R^n$, assume that the one-dimensional
ODE

\[ \dot\varphi = H(x, \varphi)\dot W , \qquad \varphi(0; x) = r , \]

has a unique solution flow $\varphi = \varphi^W = \varphi(t, r; x)$ which is of class $C^2$ as a function
of both $r$ and $x$. Then $u$ is a classical solution to

\[ \partial_t u = F(x, u, Du, D^2 u) + H(x, u)\dot W \]

if and only if $v(t, x) = \varphi^{-1}(t, u(t, x), x)$, or equivalently $\varphi(t, v(t, x), x) = u(t, x)$,
is a solution of

\[ \partial_t v - {}^\varphi F(t, x, v, Dv, D^2 v) = 0 , \]

with

\[ {}^\varphi F(t, x, r, p, X) \overset{\text{def}}{=} \frac{1}{\varphi'} F\big( t, x, \varphi, D\varphi + \varphi' p,\; \varphi'' p \otimes p + D\varphi' \otimes p + p \otimes D\varphi' + D^2 \varphi + \varphi' X \big) , \tag{12.32} \]

where $\varphi'$ denotes the derivative of $\varphi = \varphi(t, r, x)$ with respect to $r$.


Remark 12.21. It is worth noting that the “quadratic gradient” term $\varphi'' p \otimes p$ disappears
in (12.32) whenever $\varphi'' = 0$. This happens when $H(x, u)$ is linear in $u$, i.e.
when

\[ H_i[u] = \gamma_i(x) u , \qquad i = 1, \dots, d , \]

in which case we have

\[ \varphi(t, r, x) = r \exp\Big( \int_0^t \gamma(x)\,dW_s \Big) = r \exp\Big( \sum_{i=1}^d \gamma_i(x) W^i_{0,t} \Big) . \tag{12.33} \]
0 i=1

Remark 12.22. Note that all dependence on $\dot W$ has disappeared in (12.33), and
consequently (12.32). In the SPDE / filtering context this is known as robustification:
the transformed PDE $(\partial_t - {}^\varphi F)v = 0$ can be solved for any $W \in C([0,T], \mathbb R^d)$.
This provides a way to solve SPDEs of the form $du = F[u]\,dt + \sum_{i=1}^d \gamma_i(x) u \circ dW^i_t$
pathwise, so that $u$ depends continuously on $W$ in uniform topology.
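The closed form (12.33) and the robustness statement can be illustrated numerically. The sketch below assumes scalar noise ($d = 1$), a fixed point $x$ with a hypothetical value `gamma_x` for $\gamma(x)$, and a smooth driving path `path` chosen only for illustration; `phi` and `euler_flow` are names introduced here, not taken from the text.

```python
import math

def phi(r, gamma_x, w_increment):
    """Closed form (12.33) for d = 1: phi(t, r, x) = r * exp(gamma(x) * W_{0,t})."""
    return r * math.exp(gamma_x * w_increment)

def euler_flow(r, gamma_x, path, t, steps=20000):
    """Explicit Euler scheme for d(phi) = gamma(x) * phi * dW along a smooth path W."""
    value, dt = r, t / steps
    for k in range(steps):
        dw = path(dt * (k + 1)) - path(dt * k)  # increment of the driving path
        value += gamma_x * value * dw
    return value

# Hypothetical smooth driving path and coefficient value gamma(x) at a fixed x.
path = lambda s: math.sin(3.0 * s)
r, gamma_x, t = 1.5, 0.7, 2.0

closed_form = phi(r, gamma_x, path(t) - path(0.0))
numerical = euler_flow(r, gamma_x, path, t)

# The closed form only sees the increment W_{0,t}, so a small uniform
# perturbation of the path changes phi only slightly.
perturbed = phi(r, gamma_x, (path(t) + 1e-3) - path(0.0))

print(closed_form, numerical, perturbed)
```

Since `phi` depends on the path only through the increment $W_{0,t}$, no derivative of $W$ is needed to evaluate it, which is the point of the robustification remark.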
We now turn our attention to case c). The point here is that the “inner” and “outer”
transformations seen above, namely

\[ v(t, x) = u(t, \psi_t(x)) , \qquad v(t, x) = \varphi^{-1}(t, u(t, x), x) , \]

respectively, can be combined to handle noise coefficients obtained by adding those
from cases a) and b), i.e. noise coefficients of the type $\langle \beta_i(x), Du \rangle + H_i(x, u)$. We
content ourselves with the linear case

\[ H_i[u] = \langle \beta_i(x), Du \rangle + \gamma_i(x) u . \]

Proposition 12.23 (Case c). Let $\psi = \psi^W$ be as in case a) and set
$\varphi(t, r, x) = r \exp\big( \int_0^t \gamma(\psi_s(x))\,dW_s \big)$. Then $u$ is a (classical) solution to

\[ \partial_t u = F(x, u, Du, D^2 u) + \big( \langle \beta(x), Du \rangle + \gamma(x) u \big) \dot W , \]

if and only if $v(t, x) = u(t, \psi_t(x)) \exp\big( -\int_0^t \gamma(\psi_s(x))\,dW_s \big)$ is a (classical) solution to

\[ \partial_t v - {}^\varphi(F^\psi)(t, x, v, Dv, D^2 v) = 0 . \]


Remark 12.24. It is worth noting that the inner transformation $F \to F^\psi$ preserves
the class of linear operators. That is, if $F[u] = L[u]$ as given in (12.9), then $F^\psi$ is
again a linear operator. Because of the appearance of quadratic terms in $Du$, this
is not true for the outer transformation $F \to {}^\varphi F$ unless $\varphi'' = 0$. Fortunately, this
happens in the linear case and it follows that the transformation $F \to {}^\varphi(F^\psi)$ used in
case c) above does preserve the class of linear operators.

Let us reflect for a moment on what has been achieved. We started with a PDE
that involves Ẇ and in all cases we managed to transform the original problem to a
PDE where all dependence on Ẇ has been isolated in some auxiliary ODEs. In the
stochastic context (◦dW instead of dW = Ẇ dt) this is nothing but the reduction,
via stochastic flows, from a stochastic PDE to a random PDE, to be solved ω-wise.
In the same spirit, the rough case is now handled with the aid of flows for RDEs and
their stability properties.
Given $\mathbf W \in \mathscr C_g^{0,\alpha}$, we pick an approximating sequence $(W^\varepsilon)$, and transform

\[ \partial_t u^\varepsilon = F[u^\varepsilon] + H[u^\varepsilon]\dot W^\varepsilon \tag{12.34} \]

to a PDE of the form

\[ \partial_t v^\varepsilon = F^\varepsilon[v^\varepsilon], \tag{12.35} \]

e.g. with $F^\varepsilon = F^\psi$ and $\psi = \psi^{W^\varepsilon}$ in case a) and accordingly in the other cases. Then

\[ F^\varepsilon[w] = F^\varepsilon[t, x, w, Dw, D^2 w] \]

(in abusive notation) and the function $F^\varepsilon$ which appears on the right-hand side above
converges (e.g. locally uniformly) as $\varepsilon \to 0$, due to stability properties of flows
associated to RDEs as discussed in Section 8.10.
All one now needs is a (deterministic) PDE framework with a number of good
properties, along the following “wish list”.
1. All approximate problems, i.e. with $W^\varepsilon \in C^1([0,T], \mathbb R^d)$,

\[ \partial_t u^\varepsilon = F[u^\varepsilon] + \sum_{i=1}^d H_i[u^\varepsilon]\dot W^{\varepsilon,i}_t , \qquad u^\varepsilon(0, \cdot) = g^\varepsilon , \]

should admit a unique solution, in a suitable class $\mathcal U$ of functions on $[0,T] \times \mathbb R^n$,
for a suitable class of initial conditions in some space $\mathcal G$.
2. The change of variable calculus (Propositions 12.19–12.23) should remain valid,
so that $u^\varepsilon \in \mathcal U$ is a solution to (12.34) if and only if its transformation $v^\varepsilon \in \mathcal U$ is
a solution to (12.35).
3. There should be a good stability theory, so that $g^\varepsilon \to g^0$ in $\mathcal G$ and $F^\varepsilon \to F^0$ (in a
suitable sense) allows one to obtain convergence in $\mathcal U$ of solutions $v^\varepsilon$ to (12.35) with
initial data $g^\varepsilon$ to the (unique) solution of the limiting problem $\partial_t v^0 = F^0[v^0]$
with initial data $g^0$.
4. At last, the topology of $\mathcal U$ should be weak enough to make sure that $v^\varepsilon \to v^0$
implies that the “back-transformed” $u^\varepsilon$ converges in $\mathcal U$, with limit $u^0$ being $v^0$
back-transformed.\textsuperscript{4}
The final point suggests defining a solution to

\[ du = F[u]\,dt + H[u]\,d\mathbf W , \qquad u(0, \cdot) = g , \tag{12.36} \]

as an element in $\mathcal U$ which, under the correct flow transformation associated to $\mathbf W$ and
$H$, solves the transformed equation $\partial_t v = F^0[v]$, $v(0, \cdot) = g$. To make this more
concrete, consider the transport case a). As before, $\psi = \psi^{\mathbf W}$ is the flow associated to
the RDE $dY = -\beta(Y)\,d\mathbf W$ and $u$ solves the above RPDE (with $H[u] = \langle \beta(x), Du \rangle$)
if, by definition, $v(t, x) := u(t, \psi_t(x))$ solves $\partial_t v = F^\psi[v]$, with $v(0, \cdot) = g$. The
same logic applies to cases b) and c).
We then have the following (meta-)theorem, subject to a PDE framework with the
above properties.
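Theorem 12.25. Let $\alpha \in (\tfrac13, \tfrac12]$. Given a geometric rough path $\mathbf W = (W, \mathbb W) \in \mathscr C_g^{0,\alpha}([0,T], \mathbb R^d)$,
pick $W^\varepsilon \in C^1([0,T], \mathbb R^d)$ so that

\[ (W^\varepsilon, \mathbb W^\varepsilon) := \Big( W^\varepsilon, \int_0^\cdot W^\varepsilon_{0,t} \otimes dW^\varepsilon_t \Big) \to \mathbf W \]

in α-Hölder rough path metric. Consider unique solutions $u^\varepsilon \in \mathcal U$ to the PDEs

\[ \partial_t u^\varepsilon = F[u^\varepsilon] + H[u^\varepsilon]\dot W^\varepsilon , \qquad u^\varepsilon(0, \cdot) = g \in \mathcal G . \tag{12.37} \]

Then there exists $u = u(t, x) \in \mathcal U$, not dependent on the approximating $(W^\varepsilon)$ but
only on $\mathbf W \in \mathscr C_g^{0,\alpha}([0,T], \mathbb R^d)$, so that

\[ u^\varepsilon = S[W^\varepsilon; g] \to u =: S[\mathbf W; g] \]

as $\varepsilon \to 0$ in $\mathcal U$. This $u$ is the unique solution to the RPDE (12.36) in the sense of the
above definition. Moreover, the resulting solution map,

\[ S : \mathscr C_g^{0,\alpha}([0,T], \mathbb R^d) \times \mathcal G \to \mathcal U \]

is continuous.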
It remains to identify suitable PDE frameworks, depending on the nonlinearity F .
When ∂t u = F [u] is a scalar conservation law, entropy solutions actually provide
a suitable framework to handle additional rough noise, at least of (linear) type c),
[FG16b]. On the other hand, when $F = F[u]$ is a fully nonlinear second order operator,
say of Hamilton–Jacobi–Bellman (HJB) or Isaacs type, the natural framework
is viscosity theory [CIL92, FS06] and the problem of handling additional “rough”
noise, in the sense of $W \notin C^1$, also with nonlinear $H = H(Du)$, was first raised by
Lions–Souganidis [LS98a, LS98b, LS00a, LS00b].
4
Given the roughness in $t$ of our transformations, typically α-Hölder, it would not be wise to
incorporate temporal $C^1$-regularity in the definition of the space $\mathcal U$.

12.2.4 Rough viscosity solutions

Consider a real-valued function $u = u(x)$ with $x \in \mathbb R^m$ and assume $u \in C^2$ is a
classical supersolution,

\[ -G(x, u, Du, D^2 u) \ge 0 , \]

where $G$ is continuous and degenerate elliptic in the sense that $G(x, u, p, A) \le G(x, u, p, A + B)$
whenever $B \ge 0$ in the sense of symmetric matrices. The idea is
to consider a (smooth) test function $\varphi$ which touches $u$ from below at some interior
point $\bar x$. Basic calculus implies that $Du(\bar x) = D\varphi(\bar x)$, $D^2 u(\bar x) \ge D^2 \varphi(\bar x)$ and, from
degenerate ellipticity,

\[ -G(\bar x, \varphi, D\varphi, D^2 \varphi) \ge 0 . \tag{12.38} \]

This motivates the definition of a viscosity supersolution (at the point $\bar x$) to $-G = 0$
as a (lower semi-)continuous function $u$ with the property that (12.38) holds for
any test function which touches $u$ from below at $\bar x$. Similarly, viscosity subsolutions
are (upper semi-)continuous functions defined via test functions touching $u$ from
above and by reversing the inequality in (12.38); viscosity solutions are both super-
and subsolutions. Observe that this definition covers (completely degenerate) first
order equations as well as parabolic equations, e.g. by considering $\partial_t - F = 0$
on $[0,T] \times \mathbb R^n$ where $F$ is degenerate elliptic. Let us mention a few key results of
viscosity theory, with special regard to our “wish list”.
1. One has existence and uniqueness results in the class of BC solutions to the
initial value problem $(\partial_t - F)u = 0$, $u(0, \cdot) = g \in \mathrm{BUC}(\mathbb R^n)$\textsuperscript{5}, provided
$F = F(t, x, u, Du, D^2 u)$ is continuous, degenerate elliptic, there exists $\gamma \in \mathbb R$
such that, uniformly in $t, x, p, X$,

\[ \gamma(s - r) \le F(t, x, r, p, X) - F(t, x, s, p, X) \quad \text{whenever } r \le s , \tag{12.39} \]

and some technical conditions hold.\textsuperscript{6} Without going into technical details, the
conditions are met for $F = L$ as in (12.9) and are robust under taking inf
and sup (provided the regularity of the coefficients holds uniformly). As a
consequence, HJB and Isaacs type nonlinearities, where $F$ takes the form
$\inf_a L_a$, $\inf_a \sup_{a'} L_{a,a'}$, are also covered.
2. The change of variables “calculus” of Propositions 12.19–12.23 remains valid for
(continuous) viscosity solutions. This can be checked directly from the definition
of a viscosity solution.
5
the space of bounded uniformly continuous functions
6
. . .the most important of which is [CIL92, (3.14)]. Additional assumptions on F are necessary,
however, in particular due to the unboundedness of the domain Rn , and these are not easily found
in the literature; see [DFO14]. One can also obtain existence and uniqueness results in BUC.

3. In fact, the technical conditions mentioned in 1. imply a particularly strong
form of uniqueness, known as comparison: assume $u$ (resp. $v$) is a subsolution
(resp. supersolution) and $u_0 \le v_0$; then $u \le v$ on $[0,T] \times \mathbb R^n$. A key feature
of viscosity theory is what workers in the field simply call stability, a powerful
incarnation of which is known as the Barles–Perthame procedure [FS06, Section
VII.3] and relies on comparison for (semicontinuous) sub- and super-solutions.
In the form relevant for us, one assumes comparison for $\partial_t - F^0$ and considers
viscosity solutions to $(\partial_t - F^\varepsilon)v^\varepsilon = 0$, with $v^\varepsilon(0, \cdot) = g^\varepsilon$, assuming locally
uniform boundedness of $v^\varepsilon$ and $g^\varepsilon \to g^0$ locally uniformly. Then $v^\varepsilon \to v^0$
locally uniformly where $v^0$ is the (unique) solution to the limiting problem
$(\partial_t - F^0)v^0 = 0$, with $v^0(0, \cdot) = g^0$.

In the context of RPDEs above, again with focus on the transport case a) for
the sake of argument, $F^0 = F^\psi$ where $\psi = \psi^{\mathbf W}$ is a flow of $C^3$-diffeomorphisms
(associated to the RDE $dY = -\beta(Y)\,d\mathbf W$, thereby leading to
the assumption $\beta \in C_b^5$). As a structural condition on $F$, we may simply assume
“ψ-invariant comparison”, meaning that comparison holds for $\partial_t - F^\psi$, for any
$C^3$-diffeomorphism $\psi$ with bounded derivatives. Checking this condition turns out to be
easy. First, when $F = L$ is linear, we have $F^\psi = L^\psi$ also linear, with similar bounds
on the coefficients as $L$ due to the stringent assumptions on the derivatives of $\psi$.
From the above discussion, and in particular from what was said in 1., it is then
clear that $L$ satisfies ψ-invariant comparison. In fact, stability of the condition in 1.
under taking inf and sup also implies that HJB and Isaacs type nonlinearities satisfy
ψ-invariant comparison.
It is now possible to implement the arguments of the previous Theorem 12.25
in the viscosity framework [CFO11], see also [FO11] for applications to splitting
methods. We tacitly assume that all approximate problems of the form (12.40) below
have a viscosity solution, for all $W^\varepsilon \in C^1$ and $g \in \mathrm{BUC}$, but see Remark 12.27.

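Theorem 12.26. Let $\alpha \in (\tfrac13, \tfrac12]$. Given a geometric rough path $\mathbf W = (W, \mathbb W) \in \mathscr C_g^{0,\alpha}([0,T], \mathbb R^d)$,
pick $W^\varepsilon \in C^1([0,T], \mathbb R^d)$ so that $(W^\varepsilon, \mathbb W^\varepsilon) \to \mathbf W$ in α-Hölder
rough path metric. Consider unique BC viscosity solutions $u^\varepsilon$ to

\[ \partial_t u^\varepsilon = F[u^\varepsilon] + \langle \beta(x), Du^\varepsilon \rangle \dot W^\varepsilon , \qquad u^\varepsilon(0, \cdot) = g \in \mathrm{BUC}(\mathbb R^n) , \tag{12.40} \]

where $F$ satisfies ψ-invariant comparison. Then there exists $u = u(t, x) \in BC$, not
dependent on the approximating $(W^\varepsilon)$ but only on $\mathbf W \in \mathscr C_g^{0,\alpha}([0,T], \mathbb R^d)$, so that

\[ u^\varepsilon = S[W^\varepsilon; g] \to u =: S[\mathbf W; g] \]

as $\varepsilon \to 0$ in local uniform sense. This $u$ is the unique solution to the RPDE (12.36)
with transport noise $H[u] = \langle \beta(x), Du \rangle$ in the sense of the definition given before
Theorem 12.25. Moreover, we have continuity of the solution map,

\[ S : \mathscr C_g^{0,\alpha}([0,T], \mathbb R^d) \times \mathrm{BUC}(\mathbb R^n) \to BC([0,T] \times \mathbb R^n) . \]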



Remark 12.27. In the above theorem, existence of RPDE solutions actually relies on
existence of approximate solutions $u^\varepsilon$, which one of course expects from standard
viscosity theory. Mild structural conditions on $F$, satisfied by HJB and Isaacs examples,
which imply this existence are reviewed in [DFO14]. One can also establish a
modulus of continuity for RPDE solutions, so that $u \in \mathrm{BUC}$ after all.

Remark 12.28. Rough partial differential equations as considered here,
$du = F[u]\,dt + \langle \beta(x), Du \rangle\,d\mathbf W$, with $F = \inf_a L_a$ of HJB form, arise in pathwise stochastic
control [LS98b, BM07, DFG17], also in conjunction with filtering [AC19].

Unfortunately, in case b), it turns out that the structural assumptions one has to impose
on $F$ in order to have the necessary comparison for $\partial_t - F^0 = 0$ are rather restrictive,
although semilinear situations are certainly covered. Even in this case, due to the
appearance of a quadratic nonlinearity in $Du$, the argument is involved and requires
a careful analysis on consecutive small time intervals, rather than $[0,T]$; see [LS00a,
DF12]. A nonlinear Feynman–Kac representation, in terms of rough backward
stochastic differential equations, is given in [DF12].
At last, we return to the fully linear case of Section 12.2.3. That is, we consider
the (linear noise) case c) with linear $F = L$. With some care [FO14], the double
transformation leading to the transformed equation $\partial_t - {}^\varphi(F^\psi) = 0$ can be implemented
with the aid of coupled flows of rough differential equations. We can then
recover Theorem 12.11, but with somewhat different needs concerning the regularity
of the coefficients. (For instance, in the aforementioned theorem we really needed
$\sigma, \beta \in C_b^3$ whereas now, using flow decomposition, we need $\beta \in C_b^5$ but only $\sigma \in C_b^1$.)

Remark 12.29. By either approach, case c) with linear $F = L$ or Theorem 12.11,
we obtain a robust view on classes of SPDEs which contain the Zakai equation from
filtering theory, provided the initial law admits a BUC-density. Robustness is an
important issue in filtering theory, see also Exercise 12.3.

12.3 Stochastic heat equation as a rough path

Nonlinear stochastic partial differential equations driven by very singular noise, say
space-time white noise, may suffer from the fact that their nonlinearities are ill-posed.
For instance, even in space dimension one, there is no obvious way of giving “weak”
meaning to Burgers-like stochastic PDEs of the type

\[ \partial_t u^i = \partial_x^2 u^i + f(u) + \sum_{j=1}^n g^i_j(u)\,\partial_x u^j + \xi^i , \qquad i = 1, \dots, n , \tag{12.41} \]

where $\xi = (\xi^i)$ denotes space-time white noise (strictly speaking, $n$ independent
copies of scalar space-time white noise). Recall that, at least formally, space-time
white noise is a Gaussian generalised stochastic process such that

\[ \mathbf E\,\xi^i(t, x)\,\xi^j(s, y) = \delta_{ij}\,\delta(t - s)\,\delta(x - y) . \]

As a consequence of the lack of regularity of ξ, it turns out that the solution to the
stochastic heat equation (i.e. the case f = g = 0 in (12.41) above) is only α-Hölder
continuous in the spatial variable x for any α < 1/2. In other words, one would
not expect any solution u to (12.41) to exhibit spatial regularity better than that of a
Brownian motion.
As a consequence, even when aiming for a weak solution theory, it is not clear
how to define the integral of a spatial test function $\varphi$ against the nonlinearity. Indeed,
this would require us to make sense of expressions of the type

\[ \int \varphi(x)\, g^i_j(u)\, \partial_x u^j(t, x)\,dx , \]

for fixed $t$. When $g$ happens to be a gradient, such an integral can be defined by
postulating that the chain rule holds and integrating by parts. For a general $g$, as arising
in applications from path sampling [HSV07], this approach fails. This suggests
seeking an understanding of $u(t, \cdot)$ as a spatial rough path. Indeed, this would solve the
problem just explained by allowing us to define the nonlinearity in a weak sense as

\[ \int \varphi(x)\, g^i_j(u)\,d\mathbf u^j(t, x) , \]

where $\mathbf u$ is the rough path associated to $u$.


In the particular case of (12.41), it is actually sufficient to be able to associate a
rough path to the solution $\psi$ to the stochastic heat equation

\[ \partial_t \psi = \partial_x^2 \psi + \xi . \]

Indeed, writing $u = \psi + v$ and proceeding formally for the moment, we then see that
$v$ should solve

\[ \partial_t v^i = \partial_x^2 v^i + f(v + \psi) + \sum_{j=1}^n g^i_j(v + \psi) \big( \partial_x \psi^j + \partial_x v^j \big) . \]

If we were able to make sense of the term appearing in the right-hand side of this
equation, one would expect it to have the same regularity as $\partial_x \psi$ so that, since
$\psi(t, \cdot)$ turns out to belong to $C^\alpha$ for every $\alpha < 1/2$, one would expect $v(t, \cdot)$ to be
of regularity $C^{\alpha+1}$ for every $\alpha < 1/2$. In particular, we would not expect the term
involving $\partial_x v^j$ to cause any trouble, so that it only remains to provide a meaning for
the term $g^i_j(v + \psi)\,\partial_x \psi^j$. If we know that $v \in C^1$ and we have an interpretation of
$\psi(t, \cdot)$ as a rough path $\boldsymbol\psi$ (in space), then this can be interpreted as the distribution
whose action, when tested against a test function $\varphi$, is given by

\[ \int \varphi(x)\, g^i_j(\psi + v)\,d\boldsymbol\psi^j(t, x) . \]

This reasoning can actually be made precise, see the original article [Hai11b]. In this
section we limit ourselves to providing the construction of ψ and giving some of its
basic properties.

12.3.1 The linear stochastic heat equation

We now study the model problem in this context: the construction of a spatial rough
path associated, in essence, to the above SPDE in the case $f = g = 0$. More precisely,
we are considering the stationary (in time) solution to the stochastic heat equation\textsuperscript{7},

\[ d\psi_t = -A\psi_t\,dt + \sigma\,dW_t , \tag{12.42} \]

where, for fixed $\lambda > 0$,

\[ Au = -\partial_x^2 u + \lambda u ; \]

and $W$ is a cylindrical Wiener process over $L^2(\mathbb T)$, the $L^2$-space over the
one-dimensional torus $\mathbb T = [0, 2\pi]$, endowed with periodic boundary conditions. Let
$(e_k : k \in \mathbb Z)$ denote the standard Fourier-basis of $L^2(\mathbb T)$

\[ e_k(x) = \begin{cases} \frac{1}{\sqrt\pi} \sin(kx) & \text{for } k > 0 \\ \frac{1}{\sqrt{2\pi}} & \text{for } k = 0 \\ \frac{1}{\sqrt\pi} \cos(kx) & \text{for } k < 0 \end{cases} \]

which diagonalises the operator $A$ in the sense that

\[ A e_k = \mu_k e_k , \qquad \mu_k = k^2 + \lambda , \qquad k \in \mathbb Z . \]

Thanks to the fact that we chose $\lambda > 0$, the stochastic heat equation (12.42) has
indeed a stationary solution which, by taking Fourier transforms, may be decomposed
as $\psi(x, t; \omega) = \sum_k Y^k_t(\omega) e_k(x)$. The components $Y^k_t$ are then a family of
independent stationary one-dimensional Ornstein–Uhlenbeck processes given by

\[ dY^k_t = -\mu_k Y^k_t\,dt + \sigma\,dB^k_t , \]

where $(B^k : k \in \mathbb Z)$ is a family of i.i.d. standard Brownian motions. An explicit
calculation yields

\[ \mathbf E\big( Y^k_s Y^k_t \big) = \frac{\sigma^2}{2\mu_k} \exp(-\mu_k |t - s|) , \]

so that in particular, for any fixed time $t$,

\[ \mathbf E\big( Y^k_t \big)^2 = \frac{\sigma^2}{2\mu_k} . \]
7
With λ = 0, the 0th mode of ψ behaves like a Brownian motion and ψ cannot be stationary in
time, unless one identifies functions that only differ by a constant.
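The stationary variance $\sigma^2/(2\mu_k)$ of the Ornstein–Uhlenbeck modes can be checked with a few lines of code. The sketch below uses the exact one-step variance update of an Ornstein–Uhlenbeck process (a standard fact, not spelled out in the text) and iterates it from zero; the parameter values and the function names are illustrative assumptions.

```python
import math

def stationary_variance(mu, sigma):
    """Stationary variance sigma^2 / (2 mu) of dY = -mu Y dt + sigma dB."""
    return sigma**2 / (2.0 * mu)

def variance_step(v, mu, sigma, h):
    """Exact variance of Y_{t+h} given Var(Y_t) = v, for the same OU process."""
    decay = math.exp(-2.0 * mu * h)
    return decay * v + sigma**2 * (1.0 - decay) / (2.0 * mu)

def covariance(mu, sigma, s, t):
    """Stationary covariance E(Y_s Y_t) = sigma^2 / (2 mu) * exp(-mu |t - s|)."""
    return stationary_variance(mu, sigma) * math.exp(-mu * abs(t - s))

# Illustrative parameters (assumed): lambda = 1, sigma = 2, modes k = 0, 1, 5.
lam, sigma = 1.0, 2.0
for k in (0, 1, 5):
    mu_k = k**2 + lam
    v = 0.0  # start the mode at a deterministic value
    for _ in range(2000):
        v = variance_step(v, mu_k, sigma, h=0.01)
    print(k, v, stationary_variance(mu_k, sigma))
```

Starting from zero variance, each mode relaxes to $\sigma^2/(2\mu_k)$ at rate $2\mu_k$, so higher modes equilibrate faster; with $\lambda = 0$ the update for $k = 0$ would divide by zero, in line with the footnote on the zeroth mode.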

Lemma 12.30. For each fixed $t$, the spatial covariance of $\psi$ is given by

\[ \mathbf E(\psi(x, t)\psi(y, t)) = K(|x - y|) \]

where $K$ is given by

\[ K(u) := \frac{\sigma^2}{4\pi} \sum_{k \in \mathbb Z} \frac{\cos(ku)}{\mu_k} = \frac{\sigma^2}{4\sqrt\lambda \sinh(\sqrt\lambda\,\pi)} \cosh\big( \sqrt\lambda\,(u - \pi) \big) . \]

Here, the second equality holds for $u$ restricted to $[0, 2\pi]$. In fact, the cosine series is
the periodic continuation of the r.h.s. restricted to $[0, 2\pi]$.

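Proof. From the basic identity $\cos(\alpha - \beta) = \cos\alpha\cos\beta + \sin\alpha\sin\beta$,

\[ e_{-k}(x) e_{-k}(y) + e_k(x) e_k(y) = \frac{1}{\pi} \cos(k(x - y)) , \qquad k \in \mathbb Z . \]

Inserting the respective expansion in $R(x, y) := \mathbf E(\psi(x, t)\psi(y, t))$, and using the
independence of the $(Y^k : k \in \mathbb Z)$, gives

\begin{align*}
R(x, y) &= \sum_{k \in \mathbb Z} e_k(x) e_k(y)\, \mathbf E\big( Y^k_t \big)^2 = \frac{1}{2\pi} \mathbf E\big( Y^0_t \big)^2 + \frac{1}{\pi} \sum_{k=1}^\infty \cos(k(x - y))\, \mathbf E\big( Y^k_t \big)^2 \\
&= \frac{\sigma^2}{4\pi} \sum_{k \in \mathbb Z} \frac{\cos(k(x - y))}{\lambda + k^2} ,
\end{align*}

and then $R(x, y) = K(|x - y|)$ where

\[ K(x) = \frac{\sigma^2}{4\pi} \sum_{k \in \mathbb Z} \frac{\cos(kx)}{\lambda + k^2} . \]

At last, expand the (even) function $\cosh\big( \sqrt\lambda\,(|\cdot| - \pi) \big)$ in its (cosine) Fourier-series
to get the claimed equality. □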
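The two expressions for $K$ in Lemma 12.30 can be compared numerically. The following sketch truncates the cosine series at `n_terms` (an assumed cut-off, with tail of order $\sigma^2/(2\pi\,\texttt{n\_terms})$) and compares it with the closed form on a few points of $[0, 2\pi]$; the parameter values and function names are illustrative only.

```python
import math

def K_series(u, lam, sigma, n_terms=20000):
    """Truncated series (sigma^2 / 4 pi) * sum_{|k| <= n_terms} cos(k u) / (lam + k^2)."""
    total = 1.0 / lam  # the k = 0 term
    for k in range(1, n_terms + 1):
        # the terms for k and -k coincide, hence the factor 2
        total += 2.0 * math.cos(k * u) / (lam + k * k)
    return sigma**2 / (4.0 * math.pi) * total

def K_closed(u, lam, sigma):
    """Closed form of Lemma 12.30, valid for u in [0, 2 pi]."""
    root = math.sqrt(lam)
    return sigma**2 * math.cosh(root * (u - math.pi)) / (
        4.0 * root * math.sinh(root * math.pi)
    )

# Illustrative parameters (assumed, not from the text).
lam, sigma = 0.5, 1.3
for u in (0.0, 1.0, math.pi, 5.0):
    print(u, K_series(u, lam, sigma), K_closed(u, lam, sigma))
```

The slow $1/k^2$ decay of the coefficients reflects the kink of $u \mapsto K(|u|)$ at $u = 0$: $K$ is smooth on $[0, 2\pi]$, but its periodic extension is only Lipschitz, which is consistent with the Brownian-like spatial regularity of $\psi(t, \cdot)$.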

Proposition 12.31. Fix $t \ge 0$. Then $\psi_t(x; \omega) = \psi(t, x; \omega)$, indexed by $x \in [0, 2\pi]$,
is a centred Gaussian process with covariance of finite 1-variation. More precisely,

\[ \|R_{\psi(t, \cdot)}\|_{1;[x,y]^2} \le 2\pi \|K\|_{C^2;[0,2\pi]}\, |x - y| , \]

and so (cf. Theorem 10.4), for each fixed $t \ge 0$, the $\mathbb R^d$-valued process

\[ [0, 2\pi] \ni x \mapsto \big( \psi^1_t(x), \dots, \psi^d_t(x) \big) , \]

consisting of $d$ i.i.d. copies of $\psi_t$, lifts canonically to a Gaussian rough path
$\boldsymbol\psi_t(\cdot) \in \mathscr C_g^{0,\alpha}([0, 2\pi], \mathbb R^d)$.

Proof. This follows immediately from Exercise 10.4. □


Remark 12.32. There are ad-hoc ways to construct a (spatial) rough path lift associated
to the stochastic heat equation, for instance by writing $\psi(t, \cdot)$ as Brownian
bridge plus a random smooth function. In this way, however, one ignores the large
body of results available for general Gaussian rough paths: for instance, rough path
convergence of hyper-viscosity or Galerkin approximation, extensions to fractional
stochastic heat equations, concentration of measure can all be deduced from general
principles.
We now show that solutions to the stochastic heat equation induce a continuous
stochastic evolution in rough path space.
Theorem 12.33. There exists a continuous modification of the map $t \mapsto \boldsymbol\psi_t$ with
values in $\mathscr C_g^\alpha([0, 2\pi], \mathbb R^d)$.


Proof. Fix s and t. The proof then proceeds in two steps. First, we will verify the
assumptions of Corollary 10.6, namely we will show that
\[
\bigl|\varrho_\alpha(\boldsymbol{\psi}_s,\boldsymbol{\psi}_t)\bigr|_{L^q} \le C \sup_{x,y\in[0,2\pi]} \Bigl[\mathbb{E}\bigl(|\psi_s(x,y)-\psi_t(x,y)|^2\bigr)\Bigr]^{\theta} ,
\]
for some constant C that is independent of s and t. In the second step, we will show
that (here we may assume d = 1), with ψs (x, y) := ψs (y) − ψs (x), one has the
bound
\[
\sup_{x,y\in[0,2\pi]} \mathbb{E}\bigl[|\psi_s(x,y)-\psi_t(x,y)|^2\bigr] = O\bigl(|t-s|^{1/2}\bigr) .
\]
The existence of a continuous (and even Hölder) modification is then a consequence
of the classical Kolmogorov criterion.
For the first step, we write $X = \bigl(\psi^1_s(\cdot),\dots,\psi^d_s(\cdot)\bigr)$ and $Y = \bigl(\psi^1_t(\cdot),\dots,\psi^d_t(\cdot)\bigr)$.
Note that one has independence of $(X^i, Y^i)$ with $(X^j, Y^j)$ for i ≠ j. We have to
verify finite 1-variation (in the 2D sense) of the covariance of (X, Y ). In view of
Proposition 12.31, it remains to establish finite 1-variation of
\[
\begin{aligned}
(x,y) \mapsto R_{(X^1,Y^1)}(x,y) &= \mathbb{E}\bigl(\psi^1_s(x)\psi^1_t(y)\bigr) = \sum_{k\in\mathbb{Z}} e_k(x)e_k(y)\,\mathbb{E}\bigl(Y^k_s Y^k_t\bigr) \\
&= \frac{\sigma^2}{4\pi}\sum_{k\in\mathbb{Z}}\frac{\cos\bigl(k(x-y)\bigr)}{\lambda+k^2}\,e^{-(\lambda+k^2)|t-s|} =: R_\tau(x,y) ,
\end{aligned}
\]
where τ := |t − s|.

For every τ > 0, exponential decay of the Fourier modes implies smoothness of Rτ .
We claim
\[
\|R_\tau\|_{1\text{-var};[u,v]^2} \le C|v-u| < \infty ,
\]
uniformly in τ ∈ (0, 1] and u, v. To see this, write
\[
\begin{aligned}
\|R_\tau\|_{1\text{-var};[u,v]^2} &= \int_u^v\!\!\int_u^v |\partial_{xy}R_\tau|\,dx\,dy \\
&\sim \int_u^v\!\!\int_u^v \Bigl|\sum_k \frac{k^2 e^{ik(x-y)}}{\lambda+k^2}\,e^{-(\lambda+k^2)\tau}\Bigr|\,dx\,dy \\
&\sim \int_u^v\!\!\int_u^v \Bigl|\sum_k e^{ik(x-y)}e^{-k^2\tau}\Bigr|\,dx\,dy \\
&= \int_u^v\!\!\int_u^v p_\tau(x-y)\,dy\,dx \le |v-u| ,
\end{aligned}
\]
where we used the trivial estimate $\int_u^v p_\tau(x-y)\,dy \le \int_0^{2\pi} p_\tau(x-y)\,dy = 1$. In this
expression, p denotes the (positive) transition kernel of the heat semigroup on the
torus. The step above, between second and third line, where we effectively set λ = 0
is harmless. The factor $e^{-\lambda\tau}$ may simply be taken out, and
\[
\Bigl|\sum_k \Bigl(1-\frac{k^2}{\lambda+k^2}\Bigr)e^{ik(x-y)}e^{-k^2\tau}\Bigr| \le \sum_k \Bigl|1-\frac{k^2}{\lambda+k^2}\Bigr| = \sum_k \frac{\lambda}{\lambda+k^2} < \infty .
\]
After integrating over $[u,v]^2$, we see that the error made above is actually of order
$O\bigl(|v-u|^2\bigr)$. This is more than enough to conclude that
\[
\bigl\|R_{(X^1,Y^1)}\bigr\|_{1\text{-var};[u,v]^2} \le C|v-u| < \infty ,
\]
uniformly in τ ∈ (0, 1] and u, v.


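We now turn to the second step of our proof. We claim that
$\mathbb{E}|\psi^1_s(x,y)-\psi^1_t(x,y)|^2 = O\bigl(|t-s|^{1/2}\bigr)$, uniformly in x, y ∈ [0, 2π]. Since
\[
\bigl|\psi^1_s(x,y)-\psi^1_t(x,y)\bigr| \le \bigl|\psi^1_s(x)-\psi^1_t(x)\bigr| + \bigl|\psi^1_s(y)-\psi^1_t(y)\bigr| ,
\]
the question reduces to a similar bound on $\mathbb{E}|\psi^1_s(x)-\psi^1_t(x)|^2$, uniform in x ∈ [0, 2π].
This quantity is equal to
\[
\begin{aligned}
\mathbb{E}\bigl(\psi^1_s(x)\psi^1_s(x)\bigr) &- 2\,\mathbb{E}\bigl(\psi^1_s(x)\psi^1_t(x)\bigr) + \mathbb{E}\bigl(\psi^1_t(x)\psi^1_t(x)\bigr) \\
&= \frac{\sigma^2}{4\pi}\sum_{k\in\mathbb{Z}}\frac{2\bigl(1-e^{-(\lambda+k^2)|t-s|}\bigr)}{\lambda+k^2} \\
&\le \frac{\sigma^2}{4\pi}\sum_{|k|<N} 2|t-s| + 2\,\frac{\sigma^2}{4\pi}\sum_{k\ge N}\frac{2\bigl(1-e^{-(\lambda+k^2)|t-s|}\bigr)}{\lambda+k^2} ,
\end{aligned}
\]
where we used that $1-e^{-cx} \le cx$ for c, x > 0 in the first sum. We then take
$N \sim |t-s|^{-1/2}$, so that the first sum is of order $O\bigl(|t-s|^{1/2}\bigr)$. For the second sum,
we use the trivial bound $1-e^{-(\lambda+k^2)|t-s|} \le 1$. It then suffices to note that
\[
\sum_{k\ge N}\frac{1}{\lambda+k^2} \le \sum_{k\ge N}\frac{1}{k^2} = O(1/N) = O\bigl(|t-s|^{1/2}\bigr) ,
\]
which completes the proof. ⊔⊓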
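The square-root scaling in the second step can be observed numerically. The sketch below evaluates a truncated version of the series for E|ψ¹_s(x) − ψ¹_t(x)|² as a function of τ = |t − s| and checks that dividing τ by 4 roughly halves the value. The choices λ = σ = 1, the truncation level and the function name are illustrative assumptions.

```python
import math

def increment_variance(tau, lam=1.0, sigma=1.0, n_terms=50_000):
    """Truncated series sigma^2/(4 pi) * sum_{|k| <= n_terms} 2 (1 - exp(-(lam + k^2) tau)) / (lam + k^2)."""
    total = (1.0 - math.exp(-lam * tau)) / lam  # k = 0
    for k in range(1, n_terms + 1):
        mu = lam + k * k
        # terms for k and -k coincide, hence the factor 2
        total += 2.0 * (1.0 - math.exp(-mu * tau)) / mu
    return sigma ** 2 / (4.0 * math.pi) * 2.0 * total

tau = 1e-3
ratio = increment_variance(tau) / increment_variance(tau / 4.0)
print(ratio)  # close to 2 = sqrt(4), consistent with O(|t - s|^{1/2})
```

A ratio near 4 would instead indicate linear scaling in τ, which is what each individual Fourier mode exhibits; the square root only emerges after summing over all modes.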


Remark 12.34. The final estimate in the above proof, namely
\[
\mathbb{E}\bigl|\psi^1_s(x)-\psi^1_t(x)\bigr|^2 = O\bigl(|t-s|^{1/2}\bigr) ,
\]
also implies “almost 1/4-Hölder” temporal regularity of the stochastic heat equation.

12.4 Exercises

Exercise 12.1 (From [DFS17]) a) Assume $W \in C^1$. Show that the Feynman–Kac
(or equivalently viscosity) solution to (12.11) is an analytically weak solution in
the sense of (12.21) with dW replaced by Ẇ dt.
b) Assume now $\mathbf{W} = (W,\mathbb{W}) \in \mathscr{C}_g^{0,\alpha}$. Show that $(Y,Y') \in \mathscr{D}_W^{2\alpha}$.
c) Show that the Feynman–Kac solution constructed in Theorem 12.11 is an analytically
weak solution in the sense of (12.21).
Exercise 12.2 (From [CDFO13]) A crucial role in the proof of Theorem 12.11 was
played by a hybrid Itô-rough differential equation of the form

dXt = σ(Xt )dB + β(Xt )dW, (12.43)

ultimately solved as a (random) rough differential equation, subject to $\sigma, \beta \in C_b^3$. Give
an alternative construction to the hybrid equation based on flow decomposition. That
is, use the flow associated to the RDE dY = β(Y )dW and transform (12.43) into a
bona fide Itô differential equation.
Hint: When W is replaced by a $C^1$ path $W^\varepsilon$ this is a straightforward computation.
Use the stability of RDE flows, combined with stability results for Itô SDEs to
conclude. Specify the regularity requirements on σ, β.

Exercise 12.3 (Robust filtering, [CDFO13]) Consider a pair of processes (X, Y )
with dynamics
\[
dX_t = V_0(X_t,Y_t)\,dt + \sum_k Z_k(X_t,Y_t)\,dW^k_t + \sum_j V_j(X_t,Y_t)\,dB^j_t , \tag{12.44}
\]
\[
dY_t = h(X_t,Y_t)\,dt + dW_t , \tag{12.45}
\]
with $X_0 \in L^\infty$ and $Y_0 = 0$. For simplicity, assume coefficients
$V_0, V_1, \dots, V_{d_B} : \mathbb{R}^{d_X+d_Y} \to \mathbb{R}^{d_X}$, $Z_1, \dots, Z_{d_Y} : \mathbb{R}^{d_X+d_Y} \to \mathbb{R}^{d_X}$ and
$h = (h^1,\dots,h^{d_Y}) : \mathbb{R}^{d_X+d_Y} \to \mathbb{R}^{d_Y}$ to be bounded with bounded derivatives of all orders; W and
B are independent Brownian motions of the correct dimension. We now interpret
X as a signal and Y as noisy and incomplete observation. The filtering problem
consists in computing the conditional distribution of the unobserved component X,
given the observation Y . Equivalently, one is interested in computing
\[
\pi_t(g) = \mathbb{E}\bigl[g(X_t,Y_t)\,\big|\,\mathcal{Y}_t\bigr] ,
\]



where $\mathcal{Y}_t$ is the observation filtration and g is a suitably chosen test function. Measure
theory tells us that there exists a Borel-measurable map $\theta^g_t : C([0,t],\mathbb{R}^{d_Y}) \to \mathbb{R}$,
such that a.s. $\pi_t(g) = \theta^g_t(Y)$ where we consider Y = Y (ω) as a $C([0,t],\mathbb{R}^{d_Y})$-valued
random variable. Note that $\theta^g_t$ is not uniquely determined (after all, modifications
on null sets are always possible). On the other hand, there is obvious interest to
have a robust filter, in the sense of having a continuous version of $\theta^g_t$, so that close
observations lead to nearby conclusions about the signal.
a) Give an example showing that, in general, $\theta^g_t$ does not admit a continuous
version.
b) Let α ∈ (1/3, 1/2). Show that there exists a continuous map on rough path
space
\[
\Theta^g_t : \mathscr{C}_g^{0,\alpha}\bigl([0,t],\mathbb{R}^{d_Y}\bigr) \to \mathbb{R} ,
\]
such that a.s.
\[
\pi_t(g) = \Theta^g_t(\mathbf{Y}) , \tag{12.46}
\]
where $\mathbf{Y}$ is the random geometric rough path obtained from Y by iterated
Stratonovich integration.
Hint: You may use the “Kallianpur–Striebel formula”, a standard result in filtering
theory which asserts that
\[
\pi_t(g) = \frac{p_t(g)}{p_t(1)} , \qquad p_t(g) := \mathbb{E}_0\bigl[g(X_t,Y_t)\,v_t\,\big|\,\mathcal{Y}_t\bigr] ,
\]
where
\[
\frac{d\mathbb{P}_0}{d\mathbb{P}}\bigg|_{\mathcal{F}_t} = \exp\Bigl(-\sum_i \int_0^t h^i(X_s,Y_s)\,dW^i_s - \frac{1}{2}\int_0^t \|h(X_s,Y_s)\|^2\,ds\Bigr)
\]
and $v = \{v_t, t > 0\}$ is defined as the right-hand side above with −W replaced by
Y.
Exercise 12.4 Show almost sure “(1/4 − ε)-Hölder” temporal regularity of ψ =
ψt (x; ω), solution to the stochastic heat equation. Show that, for fixed x, ψt (x; ω) is
not a semimartingale.
Exercise 12.5 (Spatial Itô–Stratonovich correction [HM12]) Writing $\mathbb{T}$ for the
interval [0, 2π] with periodic boundary, let us say that
\[
u = u(t,x;\omega) : [0,T] \times \mathbb{T} \times \Omega \to \mathbb{R}
\]
is a (analytically) weak solution to
\[
\partial_t u = \partial_{xx}u - u + \frac{1}{2}\,\partial_x\bigl(u^2\bigr) + \xi , \tag{$\star$}
\]
if and only if u = v + ψ where ψ is the stationary solution to ∂t ψ = ∂xx ψ − ψ + ξ
and, for all test functions $\varphi \in C^\infty(\mathbb{T})$,
\[
\partial_t\langle v,\varphi\rangle = \langle v,\partial_{xx}\varphi\rangle - \langle v,\varphi\rangle - \Bigl\langle \frac{1}{2}u^2, \partial_x\varphi\Bigr\rangle .
\]
a) Replace $\frac{1}{2}\partial_x(u^2)$ in (⋆) by a (spatially right) finite-difference approximation,
\[
\frac{1}{2}\,\frac{u^2(\cdot+\varepsilon) - u^2}{\varepsilon} ;
\]
write $u^\varepsilon$ for a solution to the resulting equation. Assume $u^\varepsilon \to u$ locally uniformly
in probability. Show that u is a solution to (⋆).

b) At least formally, $\partial_x\bigl(\frac{1}{2}u^2\bigr) = u\,\partial_x u$ in (⋆), which suggests an alternative finite
difference approximation, namely,
\[
u\,\frac{u(\cdot+\varepsilon) - u}{\varepsilon} .
\]
Assume $v^\varepsilon = u^\varepsilon - \psi \to v := u - \psi$ and its first (spatial) derivatives converge
locally uniformly in probability. Show that u is an analytically weak solution to
the perturbed equation
\[
\partial_t u = \partial_{xx}u - u + \frac{1}{2}\,\partial_x\bigl(u^2\bigr) + C + \xi
\]
with C ≠ 0. Determine the value of C. Hint: Use Exercise 10.6.

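Solution. a) By switching to suitable subsequences, we may assume $u^\varepsilon \to u$
locally uniformly with probability one. Write $D_{\varepsilon,l}$, $D_{\varepsilon,r}$ for a discrete (left,
right) finite difference approximation. Note
\[
\Bigl\langle D_{\varepsilon,r}\Bigl(\frac{1}{2}u^2\Bigr), \varphi\Bigr\rangle = -\Bigl\langle \frac{1}{2}u^2, D_{\varepsilon,l}\varphi\Bigr\rangle \to -\Bigl\langle \frac{1}{2}u^2, \partial_x\varphi\Bigr\rangle .
\]
Given that $v^\varepsilon = u^\varepsilon - \psi \to v := u - \psi$ locally uniformly, it then suffices to pass
to the limit in the (integral formulation) of
\[
\partial_t\langle v^\varepsilon,\varphi\rangle = \langle v^\varepsilon,\partial_{xx}\varphi\rangle - \langle v^\varepsilon,\varphi\rangle - \Bigl\langle \frac{1}{2}(u^\varepsilon)^2, D_{\varepsilon,l}\varphi\Bigr\rangle .
\]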

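b) We note
\[
\begin{aligned}
D_{\varepsilon,r}\Bigl(\frac{1}{2}u^2\Bigr) &= \frac{1}{2}\,\frac{u^2(\cdot+\varepsilon) - u^2}{\varepsilon} = \frac{u(\cdot+\varepsilon) + u}{2}\,\frac{u(\cdot+\varepsilon) - u}{\varepsilon} \\
&= u\,\frac{u(\cdot+\varepsilon) - u}{\varepsilon} + \frac{1}{2\varepsilon}\bigl(u(\cdot+\varepsilon) - u\bigr)^2 .
\end{aligned}
\]
It follows that
\[
\begin{aligned}
\partial_t\langle v^\varepsilon,\varphi\rangle &= \langle v^\varepsilon,\partial_{xx}\varphi\rangle - \langle v^\varepsilon,\varphi\rangle + \Bigl\langle u^\varepsilon\,\frac{u^\varepsilon(\cdot+\varepsilon) - u^\varepsilon}{\varepsilon}, \varphi\Bigr\rangle \\
&= \langle v^\varepsilon,\partial_{xx}\varphi\rangle - \langle v^\varepsilon,\varphi\rangle - \Bigl\langle \frac{1}{2}(u^\varepsilon)^2, D_{\varepsilon,l}\varphi\Bigr\rangle - \Bigl\langle \frac{1}{2\varepsilon}\bigl(u^\varepsilon(\cdot+\varepsilon) - u^\varepsilon\bigr)^2, \varphi\Bigr\rangle .
\end{aligned}
\]
In order to pass to the ε → 0 limit, we must understand the final “quadratic
variation” term. By assumption $v^\varepsilon$ are of class $C^1$, uniformly in ε. Hence
\[
\begin{aligned}
u^\varepsilon(\cdot+\varepsilon) - u^\varepsilon &= \psi(\cdot+\varepsilon) - \psi + v^\varepsilon(\cdot+\varepsilon) - v^\varepsilon \\
&= \psi(\cdot+\varepsilon) - \psi + O(\varepsilon)
\end{aligned}
\]
and so, with osc (ψ; ε)O(1) + O(ε) = o(1) as ε → 0,
\[
\frac{1}{2\varepsilon}\bigl(u^\varepsilon(\cdot+\varepsilon) - u^\varepsilon\bigr)^2 = \frac{1}{2\varepsilon}\bigl(\psi(\cdot+\varepsilon) - \psi\bigr)^2 + o(1) ;
\]
we have
\[
\Bigl\langle \frac{1}{2\varepsilon}\bigl(u^\varepsilon(\cdot+\varepsilon) - u^\varepsilon\bigr)^2, \varphi\Bigr\rangle = \Bigl\langle \frac{1}{2\varepsilon}\bigl(\psi(\cdot+\varepsilon) - \psi\bigr)^2, \varphi\Bigr\rangle + o(1) .
\]
From Lemma 12.30 we know that
\[
\mathbb{E}\bigl[\psi^2_{x,x+\varepsilon}\bigr] = 2\bigl(K(0) - K(\varepsilon)\bigr) = -2K'(0)\,\varepsilon + o(\varepsilon) = C\varepsilon + o(\varepsilon) .
\]
Since $K(u) = \frac{\cosh(u-\pi)}{4\sinh(\pi)}$, we have $C = -2K'(0) = \frac{1}{2}$, and it follows from
Exercise 10.6 that
\[
\Bigl\langle \frac{1}{2\varepsilon}\bigl(\psi(\cdot+\varepsilon) - \psi\bigr)^2, \varphi\Bigr\rangle = \frac{1}{2}\int \varphi(x)\,\frac{\psi^2_{x,x+\varepsilon}}{\varepsilon}\,dx \to \frac{1}{2}\int \varphi(x)\,C\,dx = \Bigl\langle \frac{1}{4}, \varphi\Bigr\rangle ,
\]
where the convergence takes place in probability. It follows that u is a solution
(in the above analytically weak sense) of
\[
\partial_t u = \partial_{xx}u - u + \frac{1}{2}\,\partial_x\bigl(u^2\bigr) + \frac{1}{4} + \xi .
\]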

12.5 Comments

Section 12.1: The explicit solution of the rough transport equation in Section 12.1.1
is a (geometric) rough-pathification of the classical method of characteristics and Ku-
nita’s (Stratonovich) stochastic version thereof [Kun84], first pointed out in [CF09].

Our intrinsic definition of (regular vs. weak / measure-valued) RPDE solution is
essentially taken from Diehl et al. [DFS17] and Bellingeri et al. [BDFT20], which
also treats the low regularity case. Bailleul–Gubinelli [BG17] suggest an abstract
framework of (unbounded) rough drivers in which $(\Gamma[\,\bullet\,]W_{s,t}, \Gamma^2[\,\bullet\,]\mathbb{W}_{s,t})$, with Γ as
in (12.2), are viewed as an (s, t)-indexed family of unbounded operators
\[
\mathbf{A}_{s,t} = (A_{s,t}, \mathbb{A}_{s,t})
\]
on a suitable scale of Banach spaces, which satisfy an operator Chen relation and then
the (operator) geometricity condition $A_{s,t}^2/2 = \mathbb{A}_{s,t}$. The rough transport equation,
say $du_t = \Gamma u_t\,d\mathbf{W}$ if written as initial value problem, then fits into an abstract rough
linear equation of the form
\[
du_t = \mathbf{A}(dt)\,u_t .
\]
An analytically weak formulation (somewhat similar to our Section 12.1.2, but now
formulated via Banach duals) then allows them to obtain existence and uniqueness
under $C_b^3$ assumptions on the vector fields, at the price of a doubling of variables
argument related in spirit to Di Perna–Lions [DL89].
Entropy solutions to scalar conservation laws with rough forcing are studied by
Friz–Gess [FG16b]; in [HNS20] Hocquet et al. study a generalized Burgers equation
with rough transport noise. A different class of rough scalar conservation laws,
closely related to rough transport, is given by

du + divx (A(x, u))dW = 0 , u = u0 , (12.47)

where u : [0, T ] × Rn → R, with A = (Aij : 1 ≤ i ≤ n, 1 ≤ j ≤ d) sufficiently


smooth, matrix valued functions and W a geometric Hölder rough path over Rd . (The
case of linear A(x, u) = f (x)u is precisely the rough continuity equation treated in
Section 12.1.2.)
Such equations were studied from a “pathwise” point of view (essentially possible
when A = A(u) has no x-dependence or when d = 1) in Lions, Perthame and
Souganidis [LPS13] and [LPS14], followed by Gess–Souganidis [GS15] who treat
the general case (12.47) and then Hofmanová [Hof16]. When dW = Ẇ dt, this falls
into the well established theories of entropy solutions and kinetic solutions. The latter
formulation relates to rough transport as follows. With
\[
\chi(x,\xi,t) := \chi(u(x,t),\xi) :=
\begin{cases}
+1 & \text{if } 0 \le \xi \le u(x,t), \\
-1 & \text{if } u(x,t) \le \xi \le 0, \\
0 & \text{otherwise,}
\end{cases}
\tag{12.48}
\]

one can rewrite (12.47) in its (formal) kinetic form: for T > 0 fixed,

dt χ + ∂u A(x, ξ) · Dx χ − divx A(x, ξ)∂ξ χ dW = (∂ξ m)dt , (12.49)

on $\mathbb{R}^n \times \mathbb{R} \times (0,T]$ with initial data $\chi(\bullet,\ast,0) = \chi(u_0(\bullet),\ast)$, where
$\operatorname{div}_x A = (\operatorname{div}_x A_1,\dots,\operatorname{div}_x A_d)$ and m is a bounded nonnegative measure on $\mathbb{R}^n \times \mathbb{R} \times [0,T]$,
known as defect measure, which is part of the solution. The definition of rough
kinetic solution [GS15] is then given as analytically weak solution of (12.49), with
test functions obtained as (spatially) regular solutions to an auxiliary rough transport
equation, similar in spirit to Section 12.1.2. See also Gess et al. [GPS16] for a
semi-discretisation. The idea of test functions with (here: temporal) structure tailor-made
to a realisation of the noise (a.k.a. rough path) is central to RPDEs. A well-posedness
result for rough kinetic solutions was also obtained by Deya et al. [DGHT19b], in an
extended setting of RPDEs with (unbounded) rough drivers, of the form

dut = µ(dt) + A(dt)ut ,

where the abstract assumptions on the drift term µ are seen to accommodate the
defect measure. Rough Hamilton–Jacobi equations are of the form

du + H(Du, x)dW = 0 , u(0, • ) = u0 , (12.50)

on $(0,T] \times \mathbb{R}^n$, with Hamiltonians $H = (H_1,\dots,H_d)$. When dW = Ẇ dt, this
falls into the well established theory of viscosity solutions, with intrinsic notion
of sub (resp. super) solutions via “touching” test functions $\varphi = \varphi(t,x) \in C^{1,1}$.
Short-time regular solutions via the method of “rough” characteristics then supply
the correct class of test functions (depending on the noise realisation modelled by
W): when inserted in the equation, they at least formally “eliminate” the rough part;
this is basically a local change of the unknown. (A global change of coordinates is
sometimes possible, notably in the case of transport noise when H(p, x) is linear
in p, cf. Section 12.2.3 below.) These ideas form the basis of Lions–Souganidis’
stochastic viscosity theory [LS98a, LS98b, LS00b] which predates most works on
rough paths, the resulting “pathwise” theory essentially requires H = H(p) with no
x-dependence, or d = 1; see also [FGLS17] (x-dependent quadratic Hamiltonian)
and [GGLS20] (speed of propagation). In spatial dimension n = 1, there is a
noteworthy connection with rough conservation laws: if v solves the rough HJ
equation dv + A(∂x v, x)dW = 0, then, at least formally, u = ∂x v satisfies the rough
conservation law du + ∂x (A(u, x))dW = 0.
Section 12.2: Linear stochastic partial differential equations go back at least to
Krylov–Rozovskii [KR77] and play an important role in filtering theory (Zakai
equation). A Feynman–Kac representation appears in Pardoux [Par79] and Kunita
[Kun82]. Kunita also has flow decompositions of SPDE solutions. Caruana–Friz
[CF09] implement this in the rough path setting in a framework of classical PDE
solutions. The construction of hybrid stochastic / rough differential equations which
underlies the “rough” Feynman–Kac approach, Theorem 12.11, is taken from [DOR15]
(see also [FHL20]). Diehl et al. [DFS17] establish existence and uniqueness, based
on an intrinsic definition for (linear) RPDEs; numerical algorithms are given by
Bayer et al. [BBR+18]. Hofmanová–Hocquet [HH18] study (linear) RPDEs from a
variational and unbounded rough driver perspective, as do Hofmanová
et al. [HLN19] for the Navier–Stokes equation perturbed by rough transport noise.

An extension of Lions, Perthame and Souganidis [LPS13, LPS14] to rough, scalar,
degenerate parabolic-hyperbolic equations is given in [GS17].
In the context of the Crandall–Ishii–Lions viscosity setting, by nature a theory for
second order equations with a maximum principle, stochastic (pathwise) viscosity
solutions for fully non-linear equations were introduced by Lions–Souganidis
[LS98a, LS98b, LS00a, LS00b]. Caruana, Friz and Oberhauser [CFO11] introduce
rough viscosity solutions by a limiting procedure for classes of nonlinear SPDEs
with transport noise; an intrinsic definition (via global transformation) is given e.g.
in [DFO14]. An adaptation of the original intrinsic definition of (pathwise) viscosity
solutions to fully non-linear equations [LS98a] is given in Seeger [See18b].
Extensions to different noise situations are due to Diehl–Friz [DF12] and then [FO14].
Nonlinear noise, x-dependent and quadratic in Du, is considered by Friz, Gassiat,
Lions and Souganidis [FGLS17]. Approximation schemes for (pathwise) viscosity
solutions of fully nonlinear problems are studied in [See18a].
A nonlinear Feynman–Kac representation (with relations to “rough BSDEs”)
is given in [DF12]. In a filtering context, a (rough path) robustified Kallianpur–Striebel
formula (cf. Exercise 12.3) was given by Crisan, Diehl, Friz and Oberhauser
[CDFO13], which is also the first source of hybrid differential equations. At last,
we refer to Gubinelli–Tindel, Deya et al. and Teichmann [GT10, DGT12, Tei11] for
some other rough path approaches to SPDEs. Theorem 12.18 is essentially due to
[GH19], but very closely related to the earlier results of [GT10]. Compared to the
latter, we restrict ourselves to finite-dimensional drivers, but allow for a more natural
class of nonlinearities thanks to a slightly different use of the various interpolation
spaces.
Section 12.3: The construction of a spatial rough path associated to the stochastic
heat equation is due to Hairer [Hai11b] and allows one to deal with otherwise ill-posed
SPDEs of stochastic Burgers type, see also Hairer–Weber [HW13] and Friz, Gess,
Gulisashvili, Riedel [FGGR16] for various extensions (including multiplicative noise,
and fractional Laplacian / non-periodic boundary respectively). This construction
is also an ingredient in one construction for solutions to the KPZ equation, see
Hairer [Hai13] and Chapter 15. Exercise 12.5, in the spirit of Föllmer – rather than
rough path – integration, is taken from Hairer–Maas [HM12]. Similar results are avail-
able for rough SPDEs of type (12.41), see Hairer, Maas and Weber [HMW14], but
this is beyond the scope of these notes. Bellingeri [Bel20] uses regularity structures
to establish an Itô formula for the stochastic heat equation.
Chapter 13
Introduction to regularity structures

We give a short introduction to the main concepts of the general theory of regularity
structures. This theory unifies the theory of (controlled) rough paths with the usual
theory of Taylor expansions and allows to treat situations where the underlying space
is multidimensional.

13.1 Introduction

While a full exposition of the theory of regularity structures is well beyond the
scope of this book, we aim to give a concise overview to most of its concepts and
to show how the theory of controlled rough paths fits into it. In most cases, we will
only state results in a rather informal way and give some ideas as to how the proofs
work, focusing on conceptual rather than technical issues. The only exception is
the “reconstruction theorem”, Theorem 13.12 below, which is one of the linchpins
of the whole theory. Since its proof (or rather a slightly simplified version of it) is
relatively concise, we provide a fully self-contained version. For precise statements
and complete proofs of most of the results exposed here, we refer to the original
article [Hai14b]. See also the review articles [Hai15, Hai14a] for shorter expositions
that complement the one given here.
It should be clear by now that a controlled rough path $(Y,Y') \in \mathscr{D}_W^{2\alpha}$ bears a
strong resemblance to a differentiable function, with the Gubinelli derivative Y ′
describing the coefficient in front of a “first-order Taylor expansion” of the type
\[
Y_t = Y_s + Y'_s W_{s,t} + O\bigl(|t-s|^{2\alpha}\bigr) . \tag{13.1}
\]
Compare this to the fact that a function f : R → R is of class $C^\gamma$ with γ ∈ (k, k+1)
if for every s ∈ R there exist coefficients $f_s^{(1)},\dots,f_s^{(k)}$ such that
\[
f_t = f_s + \sum_{\ell=1}^{k} f_s^{(\ell)}(t-s)^\ell + O\bigl(|t-s|^{\gamma}\bigr) . \tag{13.2}
\]


Of course, $f_s^{(\ell)}$ is nothing but the ℓth derivative of f at the point s, divided by ℓ!.
In this sense, one should really think of a controlled rough path $(Y,Y') \in \mathscr{D}_W^{2\alpha}$
as a 2α-Hölder continuous function, but with respect to a “model” given by W ,
rather than the usual Taylor polynomials. This formal analogy between controlled
rough paths and Taylor expansions suggests that it might be fruitful to systematically
investigate what are the “right” objects that could possibly take the place of Taylor
polynomials, while still retaining many of their nice properties.

13.2 Definition of a regularity structure and first examples

The first step in such an endeavour is to set up an algebraic structure reflecting
the properties of Taylor expansions. First of all, such a structure should contain a
vector space T that will contain the coefficients of our expansion. It is natural to
assume that T has a graded structure: $T = \bigoplus_{\alpha\in A} T_\alpha$, for some set A of possible
“homogeneities”. For example, in the case of the usual Taylor expansion (13.2), it is
natural to take for A the set of natural numbers and to have $T_\ell$ contain the coefficients
corresponding to the derivatives of order ℓ. In the case of controlled rough paths
however, it is natural to take A = {0, α}, to have again $T_0$ contain the value of the
function Y at any time s, and to have $T_\alpha$ contain the Gubinelli derivative $Y'_s$. This
reflects the fact that the “monomial” $t \mapsto W_{s,t}$ only vanishes at order α near t = s,
while the usual monomials $t \mapsto (t-s)^\ell$ vanish at integer order ℓ.
This however isn’t the full algebraic structure describing Taylor-like expansions.
Indeed, one of the characteristics of Taylor expansions is that an expansion around
some point x0 can be re-expanded around any other point x1 by writing
\[
(x - x_0)^m = \sum_{k+\ell=m} \frac{m!}{k!\,\ell!}\,(x_1 - x_0)^k \cdot (x - x_1)^\ell . \tag{13.3}
\]
(In the case when $x \in \mathbb{R}^d$, k, ℓ and m denote multi-indices and $k! = k_1! \cdots k_d!$.)
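The re-expansion identity (13.3) is just the binomial theorem, and can be checked numerically in the scalar case d = 1. In the sketch below, the helper names `reexpand` and `evaluate` and the sample values are illustrative assumptions.

```python
from math import comb

def reexpand(m, x0, x1):
    """Coefficients c_l such that (x - x0)^m = sum_l c_l (x - x1)^l, following (13.3) with k = m - l."""
    return [comb(m, l) * (x1 - x0) ** (m - l) for l in range(m + 1)]

def evaluate(coeffs, x, base):
    """Evaluate sum_l coeffs[l] * (x - base)^l."""
    return sum(c * (x - base) ** l for l, c in enumerate(coeffs))

m, x0, x1, x = 5, 0.3, -1.2, 2.0
lhs = (x - x0) ** m                          # monomial centred at x0
rhs = evaluate(reexpand(m, x0, x1), x, x1)   # same function, re-expanded around x1
print(lhs, rhs)
```

Note that the coefficient of the top-degree monomial (x − x1)^m is always 1, while lower-order monomials pick up coefficients depending on x1 − x0; this is the pattern abstracted in the discussion that follows.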


Somewhat similarly, in the case of controlled rough paths, we have the (rather trivial)
identity
Ws0 ,t = Ws0 ,s1 · 1 + 1 · Ws1 ,t . (13.4)
What is a natural abstraction of this fact? In terms of the coefficients of a “Taylor
expansion”, the operation of reexpanding around a different point is ultimately just a
linear operation Γ : T → T , where the precise value of the map Γ depends on
the starting point x0 , the endpoint x1 , and possibly also on the details of the particular
“model” that we are considering. In view of the above examples, it is natural to impose
furthermore that Γ has the property that if $\tau \in T_\alpha$, then $\Gamma\tau - \tau \in \bigoplus_{\beta<\alpha} T_\beta$. In
other words, when reexpanding a homogeneous monomial around a different point,
the leading order coefficient remains the same, but lower order monomials may
appear.

These heuristic considerations can be summarised in the following definition of
an abstract object we call a regularity structure:

Definition 13.1. A regularity structure $\mathscr{T} = (T, G)$ consists of the following elements:
• A structure space given as graded vector space $T = \bigoplus_{\alpha\in A} T_\alpha$ where each $T_\alpha$
is a Banach space, with index set A ⊂ R bounded from below and locally finite.¹
Elements of $T_\alpha$ are said to have degree α and we write deg τ = α for $\tau \in T_\alpha$.
Given τ ∈ T , we will write $\|\tau\|_\alpha$ for the norm of its component in $T_\alpha$.
• A structure group G of continuous linear operators acting on T such that, for
every Γ ∈ G, every α ∈ A, and every $\tau_\alpha \in T_\alpha$, one has
\[
\Gamma\tau_\alpha - \tau_\alpha \in T_{<\alpha} \stackrel{\text{def}}{=} \bigoplus_{\beta<\alpha} T_\beta . \tag{13.5}
\]
A sector V of $\mathscr{T}$ is a linear subspace $V = \bigoplus_{\alpha\in A} V_\alpha \subset T$, with closed linear
subspaces $V_\alpha \subset T_\alpha$, invariant under G, such that $(V, G|_V)$ is a regularity structure
in its own right.

Remark 13.2. In principle, the index set A can be infinite. By analogy with the
polynomials, it is then natural to interpret T as the set of all formal series of the form
$\sum_{\alpha\in A} \tau_\alpha$, where only finitely many of the $\tau_\alpha$’s are non-zero. This also dovetails
nicely with the particular form of elements in G. In practice however we will only
ever work with finite subsets of A so that the precise topology on T does not matter as
long as each of the $T_\alpha$ is finite-dimensional, which is the case in all of the examples
we will consider here.

The space T should be thought of as consisting of “abstract” Taylor expansions (or
“jets”), where each element of $T_\alpha$ would correspond to a “homogeneous polynomial
of degree α” (this will be made precise in combination with the definition of a model
in Definition 13.5 below). To avoid confusion between “abstract” elements of T
and “concrete” associated functions (or distributions), we will use colour to denote
elements of T , e.g. τ . Typically, T will be generated (as a free vector space) by a
set of “basis symbols”, so that T consists of all formal (finite) linear combinations
obtained from regarding these symbols as basis vectors. Given basis symbols / vectors
$\tau_1, \tau_2, \dots$ we indicate this by
\[
T = \langle \tau_1, \tau_2, \dots \rangle . \tag{13.6}
\]
Important convention: basis symbols will always be listed in order of increasing
homogeneities. That is, $\tau_i \in T_{\alpha_i}$ with $\alpha_1 \le \alpha_2 \le \dots$ in (13.6). We now turn to
some first examples of regularity structures.

¹ In [Hai14b], T was called model space, somewhat in clash with the space of models.

13.2.1 The polynomial structure

We start with two simple special cases followed by the general polynomial structure.
Fix γ ∈ (0, 1) and consider a real-valued function belonging to the Hölder space
of exponent γ, say $f \in C^\gamma$. In other words, f : R → R, and $|f_x - f_y| \lesssim |y - x|^\gamma$
uniformly for x, y on compacts. The trivial regularity structure
\[
T = T_0 = \langle \mathbf{1} \rangle \cong \mathbb{R} , \qquad G = \{\mathrm{Id}\} ,
\]
allows us to interpret the function f as a T -valued map
\[
x \mapsto \boldsymbol{f}(x) := f_x\,\mathbf{1} .
\]

Consider next a real-valued function f : R → R of class $C^{2+\gamma}$, with γ ∈ (0, 1).
By this we mean that continuous derivatives Df and $D^2 f$ exist, with $D^2 f$ locally
γ-Hölder continuous. The minimal regularity structure allowing to capture the fact
that $f \in C^{2+\gamma}$ is
\[
T = T_0 \oplus T_1 \oplus T_2 = \langle \mathbf{1}, X, X^2 \rangle \cong \mathbb{R}^3 ,
\]
with structure group $G = \{\Gamma_h \in L(T,T) : h \in (\mathbb{R},+)\}$ where $\Gamma_h$ is given, with
respect to the ordered basis $\mathbf{1}, X, X^2$, by the matrix
\[
\Gamma_h \cong \begin{pmatrix} 1 & h & h^2 \\ 0 & 1 & 2h \\ 0 & 0 & 1 \end{pmatrix} .
\]
In other words,
\[
\Gamma_h \mathbf{1} = \mathbf{1} , \qquad \Gamma_h X = X + h\mathbf{1} , \qquad \Gamma_h X^2 = (X + h\mathbf{1})^2 ,
\]
with the obvious abuse of notation in the last expression.
Note that $\Gamma_g \circ \Gamma_h = \Gamma_{g+h}$, so that G inherits its group structure from (R, +).
Moreover, the triangular form, with ones on the diagonal, expresses exactly the
requirement (13.5). This structure allows to represent the function f and its first two
derivatives as a truncated Taylor series, namely as the T -valued map
\[
x \mapsto \boldsymbol{f}(x) := f_x\,\mathbf{1} + Df_x\,X + \frac{1}{2}D^2 f_x\,X^2 .
\]
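The matrix description of Γ_h lends itself to a direct check of the group law and of how jets transform. The sketch below represents elements of T by coefficient vectors with respect to the ordered basis 1, X, X². As an illustration only, it also verifies that for a quadratic polynomial f the jet at x is obtained from the jet at y by applying Γ_{x−y}; this convention for which increment to use is an assumption made here, since models are only introduced later. All helper names are hypothetical.

```python
def gamma(h):
    """Matrix of Gamma_h in the ordered basis (1, X, X^2)."""
    return [[1.0, h, h * h],
            [0.0, 1.0, 2.0 * h],
            [0.0, 0.0, 1.0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def apply(A, v):
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]

def jet(x):
    """Coefficients (f, Df, D^2 f / 2) at x of the sample quadratic f(t) = 2 - t + 3 t^2."""
    return [2.0 - x + 3.0 * x * x, -1.0 + 6.0 * x, 3.0]

g, h = 0.5, 0.25
group_law_holds = matmul(gamma(g), gamma(h)) == gamma(g + h)

x, y = 1.5, 0.25
reexpanded = apply(gamma(x - y), jet(y))  # should reproduce the jet at x
print(group_law_holds, reexpanded, jet(x))
```

The upper-triangular shape with unit diagonal is what guarantees that Γ_h τ − τ only has components of strictly lower degree.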
It is now an easy matter to generalise the above considerations to general Hölder
maps of several variables, say f : Rd → R in the Hölder space C n+γ , which is
defined by the obvious generalisation of (13.2) to functions on Rd . In this case, we
would take T to be the space of polynomials of degree at most n in d commuting
indeterminates X1 , . . . , Xd . This motivates the following definition.

Definition 13.3. The polynomial regularity structure on $\mathbb{R}^d$ is given by
• $T = \mathbb{R}[X_1,\dots,X_d]$ is the space of real polynomials in d commuting indeterminates
and $T_\alpha$ is given by the homogeneous polynomials of degree α ∈ N.
• The structure group $G \sim (\mathbb{R}^d, +)$ acts on T via
\[
\Gamma_h P(X) = P(X + h\mathbf{1}) , \qquad h \in \mathbb{R}^d ,
\]
for any polynomial P .

Given an arbitrary multi-index $k = (k_1,\dots,k_d)$, we write $X^k$ as a shorthand
for $X_1^{k_1} \cdots X_d^{k_d}$, and we write $|k| = k_1 + \cdots + k_d$. With this notation, for any
α ∈ A = N,
\[
T_\alpha = \langle X^k : |k| = \alpha \rangle . \tag{13.7}
\]
Note that $T_{\le\alpha} = T_0 \oplus T_1 \oplus \cdots \oplus T_\alpha$, i.e. the space of polynomials of degree at most
α, any α ∈ A = N, is a sector of the polynomial regularity structure.

13.2.2 The rough path structure

We start again from simple examples. What structure would be appropriate for Young
integration? Fix α ∈ (0, 1) and consider the problem of integrating a (continuous)
path Y against a scalar $W \in C^\alpha$. In the case of smooth W , the indefinite integral
$Z = \int Y\,dW$ exists in Riemann–Stieltjes’ sense and one has $\dot{Z} = Y\dot{W}$. In general,
Ẇ only exists as a distribution, more precisely an element of the negative Hölder
space $C^{\alpha-1}$. A regularity structure allowing to describe this situation is given by
\[
T = T_{\alpha-1} \oplus T_0 = \langle \dot{W} \rangle \oplus \langle \mathbf{1} \rangle \cong \mathbb{R}^2 , \qquad G = \{\mathrm{Id}\} . \tag{13.8}
\]
The potentially ill-defined product $\dot{Z} = Y\dot{W}$ can now be replaced by the perfectly
well-defined T -valued map
\[
s \mapsto \dot{\boldsymbol{Z}}(s) := Y_s\,\dot{W} .
\]
We shall see later how $\dot{\boldsymbol{Z}}$ gives rise to $\dot{Z}$, the distributional derivative of the indefinite
Young integral $\int Y\,dW$, provided that Y is sufficiently regular, namely $Y \in C^\beta$
with α + β > 1.
Let us next consider the “task” of representing a controlled rough path in a suitable
regularity structure. More precisely, consider α ∈ (1/3, 1/2], a path $W \in C^\alpha$ with
values in R, say, and $(Y,Y') \in \mathscr{D}_W^{2\alpha}$ so that
\[
Y_t \approx Y_s + Y'_s W_{s,t} . \tag{13.9}
\]
The right-hand side above is some sort of Taylor expansion, based on $W \in C^\alpha$, which
describes Y well near the (time) point s. We want to formalise this by attaching to
each time s the “jet”
\[
\boldsymbol{Y}(s) := Y_s\,\mathbf{1} + Y'_s\,W .
\]
Performing the substitution $\mathbf{1} \mapsto 1$, $W \mapsto W_{s,\cdot}$ gets us back to the right-hand side of
(13.9). This suggests to define the following regularity structure
\[
T = T_0 \oplus T_\alpha = \langle \mathbf{1} \rangle \oplus \langle W \rangle \cong \mathbb{R}^2 ,
\]
with structure group $G = \{\Gamma_h \in L(T,T) : h \in (\mathbb{R},+)\}$ where $\Gamma_h$ acts as
\[
\Gamma_h \mathbf{1} = \mathbf{1} , \qquad \Gamma_h W = W + h\mathbf{1} .
\]

The regularity structure relevant for rough integration is essentially a combination
of the two previous ones. Let $\mathbf{W} = (W,\mathbb{W}) \in \mathscr{C}^\alpha$ and $(Y,Y') \in \mathscr{D}_W^{2\alpha}$ and consider
the rough integral $Z := \int Y\,d\mathbf{W}$. Since, for s ≈ t, we have
\[
Z_{s,t} = \int_s^t Y\,d\mathbf{W} \approx Y_s W_{s,t} + Y'_s \mathbb{W}_{s,t} ,
\]
this suggests (rather informally at this stage), that in the vicinity of any fixed time s,
the distributional derivative of Z should have an expansion of the type
\[
\dot{Z} \approx Y_s\,\dot{W} + Y'_s\,\dot{\mathbb{W}}_s , \tag{13.10}
\]
where $\dot{W} := \partial_t W_t$ and $\dot{\mathbb{W}}_s := \partial_t \mathbb{W}_{s,t}$ are distributional derivatives. This suggests
to attach the following “jet” at each point s,
\[
\dot{\boldsymbol{Z}}(s) := Y_s\,\dot{W} + Y'_s\,\dot{\mathbb{W}} . \tag{13.11}
\]
The case of multi-component rough paths just needs more basis vectors $\dot{W}^i$, $\dot{\mathbb{W}}^{jk}$,
$W^l$ (with 1 ≤ i, j, k, l ≤ e). This suggests the following definition.

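Definition 13.4. Let α ∈ (1/3, 1/2]. The regularity structure for α-Hölder rough
paths (over $\mathbb{R}^e$) is given by $T = T_{\alpha-1} \oplus T_{2\alpha-1} \oplus T_0 \oplus T_\alpha \cong \mathbb{R}^{e+e^2+1+e}$ with
\[
\begin{aligned}
T_0 &= \langle \mathbf{1} \rangle , & T_\alpha &= \langle W^1,\dots,W^e \rangle , \\
T_{\alpha-1} &= \langle \dot{W}^1,\dots,\dot{W}^e \rangle , & T_{2\alpha-1} &= \langle \dot{\mathbb{W}}^{ij} : 1 \le i,j \le e \rangle ,
\end{aligned}
\]
and structure group $G \sim (\mathbb{R}^e, +)$ acting on T by
\[
\begin{aligned}
\Gamma_h \mathbf{1} &= \mathbf{1} , & \Gamma_h W^i &= W^i + h^i\,\mathbf{1} , \\
\Gamma_h \dot{W}^i &= \dot{W}^i , & \Gamma_h \dot{\mathbb{W}}^{ij} &= \dot{\mathbb{W}}^{ij} + h^i\,\dot{W}^j .
\end{aligned}
\tag{13.12}
\]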
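To see concretely that (13.12) defines a structure group in the sense of Definition 13.1, one can write Γ_h as a matrix in the scalar case e = 1 and check both the group law and requirement (13.5). The sketch below orders the basis by increasing degree, (Ẇ, 𝕎̇, 1, W) with degrees (α − 1, 2α − 1, 0, α); the value α = 0.4 and all helper names are illustrative assumptions.

```python
ALPHA = 0.4  # any value in (1/3, 1/2] would do
# Degrees of the basis symbols (W-dot, iterated W-dot, 1, W), in increasing order
DEGREES = [ALPHA - 1.0, 2.0 * ALPHA - 1.0, 0.0, ALPHA]

def gamma(h):
    """Matrix of Gamma_h from (13.12) for e = 1; column j holds the image of the j-th basis symbol."""
    return [[1.0, h, 0.0, 0.0],    # W-dot component: W-dot itself, plus h from the iterated symbol
            [0.0, 1.0, 0.0, 0.0],  # iterated W-dot component
            [0.0, 0.0, 1.0, h],    # 1 component: 1 itself, plus h from the symbol W
            [0.0, 0.0, 0.0, 1.0]]  # W component

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def respects_grading(G):
    """Check (13.5): Gamma tau - tau only has components of strictly lower degree."""
    n = len(G)
    for j in range(n):
        if G[j][j] != 1.0:
            return False
        for i in range(n):
            if i != j and G[i][j] != 0.0 and not DEGREES[i] < DEGREES[j]:
                return False
    return True

g, h = 0.5, -2.0
group_law_holds = matmul(gamma(g), gamma(h)) == gamma(g + h)
grading_ok = respects_grading(gamma(h))
print(group_law_holds, grading_ok)
```

The matrix splits into two unipotent 2×2 blocks, mirroring the fact that this structure combines the Young-type symbols of negative degree with the controlled rough path symbols 1 and W.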

It will be seen later in Proposition 13.21 that in this framework the function $\dot{\boldsymbol{Z}}$
defined in (13.11) does indeed give rise naturally to $\dot{Z}$, the distributional derivative
of the indefinite rough integral $\int Y\,d\mathbf{W}$.
In a Brownian (rough path) context, one has Hölder regularity with exponent
α = 1/2 − κ, for arbitrarily small κ > 0. The above index set A, relevant for a
“regularity structure view” on stochastic integration, then becomes
$A = \bigl\{-\tfrac{1}{2} - \kappa,\ -2\kappa,\ 0,\ \tfrac{1}{2} - \kappa\bigr\}$, which, in abusive but convenient notation, we write as
\[
A = \Bigl\{ -\tfrac{1}{2}^{-},\ 0^{-},\ 0,\ \tfrac{1}{2}^{-} \Bigr\} .
\]
Index sets of this form (“half-integers$^-$”) will also be typical in later SPDE situations
driven by spatial or space-time white noise.

13.3 Definition of a model and first examples

At this stage, a regularity structure is a completely abstract object. It only becomes


useful when endowed with a model, which is a concrete way of associating to
any τ ∈ T and x ∈ Rd , the actual “Taylor polynomial based at x” represented
by τ . Furthermore, we want elements τ ∈ Tα to represent functions (or possibly
distributions!) that “vanish at order α” around the given point x, thereby justifying
our terminology of calling α a degree.
Since we would like to allow A to contain negative values and therefore allow
elements in T to represent actual distributions, we need a suitable notion of “vanishing
at order α”. We achieve this by considering the size of our distributions, when tested
against test functions that are localised around the given point x0 . Given a test
function φ on Rd , we write φλx as a shorthand for

\[ \varphi^\lambda_x(y) = \lambda^{-d}\, \varphi\bigl(\lambda^{-1}(y - x)\bigr) . \]
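To see what this rescaling does in practice, here is a small numerical sketch in dimension d = 1. The bump function, the grid sizes and the helper names (`bump`, `rescaled`, `pairing`) are choices made for illustration only. Testing the function y ↦ |y − x|^α against the rescaled test function yields exactly λ^α times the pairing at scale 1, which is the prototype of “vanishing at order α”.

```python
import numpy as np


def bump(z):
    """Smooth bump supported in [-1, 1] (one-dimensional, not normalised)."""
    z = np.asarray(z, dtype=float)
    out = np.zeros_like(z)
    inside = np.abs(z) < 1
    out[inside] = np.exp(-1.0 / (1.0 - z[inside] ** 2))
    return out


def rescaled(phi, lam, x, d=1):
    """Return the function y -> lam^{-d} * phi((y - x) / lam)."""
    return lambda y: lam ** (-d) * phi((y - x) / lam)


def pairing(g, test_fn, grid):
    """Riemann-sum approximation of the integral of g * test_fn over the grid."""
    dy = grid[1] - grid[0]
    return np.sum(g(grid) * test_fn(grid)) * dy


x, alpha = 0.7, 0.4
g = lambda y: np.abs(y - x) ** alpha  # "vanishes at order alpha" at the point x

base_grid = np.linspace(-1.0, 1.0, 200001)
ref = pairing(lambda z: np.abs(z) ** alpha, bump, base_grid)

ratios = []
for lam in [1.0, 0.5, 0.1, 0.01]:
    grid = x + lam * base_grid  # covers the support of the rescaled test function
    val = pairing(g, rescaled(bump, lam, x), grid)
    ratios.append(val / (lam ** alpha * ref))
print(ratios)  # each ratio should be numerically close to 1
```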




Given r ∈ N, we also denote by Br the set of all smooth test functions φ : Rd → R
such that φ ∈ C r with ∥φ∥C r ≤ 1 that are furthermore supported in the unit ball
around the origin; clearly Br ⊂ D(Rd ), the test function space for D′ (Rd ), the space
of distributions on Rd . With these notations, our definition of a model for a given
regularity structure T is as follows.

Definition 13.5. Given a regularity structure T = (T, G) and an integer d ≥ 1, a
model M = (Π, Γ ) for T on Rd consists of maps

\[ \Pi : \mathbb R^d \to L\bigl(T, \mathcal D'(\mathbb R^d)\bigr) , \quad x \mapsto \Pi_x , \qquad\qquad
   \Gamma : \mathbb R^d \times \mathbb R^d \to G , \quad (x, y) \mapsto \Gamma_{xy} , \]

such that Γxy Γyz = Γxz and Πx Γxy = Πy . Write r for the smallest integer such that
r > |min A| ≥ 0 and impose that for every compact set K ⊂ Rd and every γ > 0,
there exists a constant C = C(K, γ) such that the bounds

\[ \bigl| (\Pi_x \tau)(\varphi^\lambda_x) \bigr| \le C \lambda^{\alpha} \|\tau\|_\alpha , \qquad
   \|\Gamma_{xy} \tau\|_\beta \le C |x - y|^{\alpha - \beta} \|\tau\|_\alpha , \tag{13.13} \]

hold uniformly over x, y ∈ K, λ ∈ (0, 1], φ ∈ Br , τ ∈ Tα with α ≤ γ and β < α.

We then call Π the realisation map, since Πx τ realises an element τ ∈ T as a
distribution, and Γ the reexpansion map.

One very important remark is that the space M of all models for a given regularity
structure is not a linear space. However, it can be viewed as a closed subset (deter-
mined by the nonlinear constraints Γxy ∈ G, Γxy Γyz = Γxz , and Πy = Πx Γxy )
of the linear space with seminorms (indexed by the compact set K and the upper
bound γ) given by the smallest constant C in (13.13). In particular, there is a natural
collection of “distances” between models (Π, Γ ) and (Π̄, Γ̄ ) given by the smallest
constant C in (13.13), when replacing Πx by Πx − Π̄x and Γxy by Γxy − Γ̄xy .
Since this collection is essentially countable (consider for example the sequence of
pseudometrics dn corresponding to the choices (Kn , γn ) with Kn the centred ball
of radius n and γn = n), it determines a metrisable topology (take for example
$d = \sum_{n \ge 1} 2^{-n} (d_n \wedge 1)$).

Remark 13.6. The precise choice of r in Definition 13.5 is not very important, as
one can see that any other choice r > |min A| ≥ 0 leads to the same definition. See
Lemma 14.13 for a similar statement in the context of Hölder spaces.

Remark 13.7. The test functions appearing in (13.13) are smooth. It turns out that if
these bounds hold for smooth elements of Br , then Πx τ can be extended canonically
to allow any C r test function with compact support.

Remark 13.8. The identity Πx Γxy = Πy reflects the fact that Γxy is the linear map
that takes an expansion around y and turns it into an expansion around x. The first
bound in (13.13) states what we mean precisely when we say that τ ∈ Tα represents
a term that vanishes at order α. (See Exercise 13.2; note that α can be negative, so
that this may actually not vanish at all!) The second bound in (13.13) is very natural
in view of both (13.3) and (13.4). It states that when expanding a monomial of order
α around a new point at distance h from the old one, the coefficient appearing in
front of lower-order monomials of order β is of order at most hα−β .

Remark 13.9. In many cases of interest, it is natural to scale the different directions of
Rd in a different way. This is the case for example when using the theory of regularity
structures to build solution theories for parabolic stochastic PDEs, in which case
the time direction “counts as” two space directions. This “parabolic scaling” can be
formalised by the integer vector (2, 1, . . . , 1). More generally, one can introduce a
scaling s of Rd , which is just a collection of d scalars $s_i \in [1, \infty)$, and define $\varphi^\lambda_x$ in
such a way that the ith direction is scaled by $\lambda^{s_i}$. The polynomial structure introduced
earlier, in particular (13.7), should be changed accordingly by postulating that the
degree of $X^k$ is given by $|k|_s = \sum_{i=1}^d s_i k_i$. In this case, the Euclidean distance
between two points x, y ∈ Rd should be replaced everywhere by the corresponding
scaled distance $|x - y|_s = \sum_i |x_i - y_i|^{1/s_i}$. See [Hai14b] for more details.
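As a quick illustration of these two formulas, the following sketch evaluates the scaled degree and the scaled distance for the parabolic scaling (2, 1, 1) in one time and two space dimensions. The function names `scaled_degree` and `scaled_distance` are hypothetical and only serve this example.

```python
from typing import Sequence


def scaled_degree(k: Sequence[int], s: Sequence[float]) -> float:
    """|k|_s = sum_i s_i * k_i for a multi-index k and a scaling s."""
    return sum(si * ki for si, ki in zip(s, k))


def scaled_distance(x: Sequence[float], y: Sequence[float], s: Sequence[float]) -> float:
    """|x - y|_s = sum_i |x_i - y_i|^(1 / s_i)."""
    return sum(abs(xi - yi) ** (1.0 / si) for xi, yi, si in zip(x, y, s))


s = (2, 1, 1)                           # parabolic scaling: (time, space, space)
deg_time = scaled_degree((1, 0, 0), s)  # one power of the time variable
deg_space = scaled_degree((0, 1, 1), s) # one power of each space variable

lam = 0.25
origin = (0.0, 0.0, 0.0)
point = (lam ** 2, lam, lam)            # i-th coordinate scaled by lam^{s_i}
dist = scaled_distance(origin, point, s)
print(deg_time, deg_space, dist)
```

With this convention one power of the time variable carries the same degree as two powers of space variables, and the scaled distance is homogeneous of degree one under the rescaling of the ith direction by λ^{s_i}: the point (1, 1, 1) is at scaled distance 3 from the origin, and its rescaled image is at scaled distance 3λ.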

With these definitions at hand, it is then natural to define an analogue in this
context of the space of γ-Hölder continuous functions in the following way.

Definition 13.10. Given a regularity structure T equipped with a model M = (Π, Γ )
over Rd , the space $\mathcal D^\gamma_M$ is given by the set of functions $f : \mathbb R^d \to T_{<\gamma}$ such that, for
every compact set K and every α < γ, there exists a constant C with

\[ \|f(x) - \Gamma_{xy} f(y)\|_\alpha \le C |x - y|^{\gamma - \alpha} \tag{13.14} \]

uniformly over x, y ∈ K. Such functions f are called modelled distributions. For
fixed K, a seminorm $\|f\|_{M,\gamma;K}$ is defined as the smallest constant C in the bound
(13.14). The space $\mathcal D^\gamma_M$ endowed with this family of seminorms is then a Fréchet
space.

It is furthermore convenient to be able to compare two modelled distributions
defined over two different models. In this case, a natural way of comparing them is
to take as a “metric” the smallest constant C in the bound

\[ \|f(x) - \Gamma_{xy} f(y) - \bar f(x) + \bar\Gamma_{xy} \bar f(y)\|_\alpha \le C |x - y|^{\gamma - \alpha} . \]

Remark 13.11. (Compare with Remark 4.8 in the rough path context.) It is important
to note that while the space of models $\mathscr M$ is not a linear space, the space $\mathcal D^\gamma_M$ is a
linear (in fact: Fréchet) space given a model $M \in \mathscr M$. The twist of course is that the
space in question depends in a crucial way on the choice of M. The total space then
is the disjoint union

\[ \mathscr M \ltimes \mathcal D^\gamma \overset{\text{def}}{=} \bigsqcup_{M \in \mathscr M} \{M\} \times \mathcal D^\gamma_M , \]

with base space $\mathscr M$ and “fibres” $\mathcal D^\gamma_M$.

The most fundamental result in the theory of regularity structures then states that
given f ∈ D γ with γ > 0, there exists a unique distribution Rf on Rd such that, for
every x ∈ Rd , Rf “looks like Πx f (x) near x”. More precisely, one has

Theorem 13.12 (Reconstruction). Let M = (Π, Γ ) be a model for a regularity
structure T on Rd . Assume $f \in \mathcal D^\gamma_M$ with γ > 0. Then, there exists a unique linear
map

\[ \mathcal R = \mathcal R_M : \mathcal D^\gamma_M \to \mathcal D'(\mathbb R^d) \]

such that

\[ \bigl| \bigl(\mathcal R f - \Pi_x f(x)\bigr)(\varphi^\lambda_x) \bigr| \lesssim \lambda^\gamma , \tag{13.15} \]

uniformly over φ ∈ Br and λ as before, and locally uniformly in x. For γ < 0,
everything remains valid but uniqueness of R.

Remark 13.13. With a look to Remark 13.11, and M = (Π, Γ ) ∈ M , one should
really view R = RM f as a map from M ⋉ D γ into D′ . Since the space M ⋉ D γ is
not a linear space, this shows that the map R isn’t actually linear, despite appearances.
However, the map (Π, Γ, f ) ↦ Rf turns out to be locally Lipschitz continuous
provided that the distance between (Π, Γ, f ) and $(\bar\Pi, \bar\Gamma, \bar f)$ is given by the smallest
constant C such that

\begin{align*}
\|f(x) - \bar f(x) - \Gamma_{xy} f(y) + \bar\Gamma_{xy} \bar f(y)\|_\alpha &\le C |x - y|^{\gamma - \alpha} , \\
\bigl| (\Pi_x \tau - \bar\Pi_x \tau)(\varphi^\lambda_x) \bigr| &\le C \lambda^\alpha \|\tau\| , \\
\|\Gamma_{xy} \tau - \bar\Gamma_{xy} \tau\|_\beta &\le C |x - y|^{\alpha - \beta} \|\tau\| .
\end{align*}

Here, in order to obtain bounds on $(\mathcal R f - \bar{\mathcal R} \bar f)(\psi)$ for some smooth compactly
supported test function ψ, the above bounds should hold uniformly for x and y in a
neighbourhood of the support of ψ. The proof that this stronger continuity property
also holds is actually crucial when showing that sequences of solutions to mollified
equations all converge to the same limiting object. However, its proof is somewhat
more involved which is why we chose not to give it here but refer instead to [Hai14b,
Thm 3.10].

Remark 13.14. There are obvious analogies between the construction of the recon-
struction operator R and that of the “rough integral” in Section 4. As a matter of fact,
there exists a slightly more abstract formulation of the reconstruction theorem which
can be interpreted as a multidimensional analogue to the sewing lemma, Lemma 4.2,
see [Hai14b, Prop. 3.25].

Remark 13.15. The reconstruction theorem with γ < 0 allows one to recover the
Lyons–Victoir extension theorem previously obtained in Exercise 2.14, see also
Exercise 13.6. Note that the reconstruction theorem does not hold for γ = 0 (even if
we forego uniqueness of R), for the same reason that the Lyons–Victoir extension
theorem fails for α = 1/2 (and more generally when 1/α ∈ N).

In the particular case where Πx τ happens to be a continuous function for every
τ ∈ T (and every x ∈ Rd ), we will see in Remark 13.27 that Rf is also a continuous
function and R is given by the somewhat trivial explicit formula

\[ (\mathcal R f)(x) = \bigl(\Pi_x f(x)\bigr)(x) . \]

We postpone the proof of the reconstruction theorem to Section 13.4 and turn instead
to our previous list of regularity structures, now adding the relevant models and
indicating the interest of the reconstruction map.

13.3.1 The polynomial model

Recall the polynomial regularity structure in d variables defined in Section 13.2.1. In
this context, the polynomial model P is given by

\[ (\Pi_x X^k)(y) = (y - x)^k , \qquad \Gamma_{xy} = \Gamma_h \big|_{h = x - y} . \]

We leave it as an exercise to the reader to verify that this does indeed satisfy the
bounds and relations of Definition 13.5.
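The algebraic part of this exercise can be checked numerically. The sketch below works in d = 1 and assumes that the structure group acts by the binomial formula Γ_h X^k = (X + h1)^k, as in the polynomial regularity structure recalled from Section 13.2.1; the truncation at degree N and the helper names `gamma` and `realise` are choices of the sketch.

```python
from math import comb

import numpy as np

N = 4  # keep the monomials 1, X, ..., X^N (truncation is a choice of this sketch)


def gamma(h):
    """Matrix of Gamma_h, assuming Gamma_h X^k = (X + h 1)^k; column k = image of X^k."""
    G = np.zeros((N + 1, N + 1))
    for k in range(N + 1):
        for j in range(k + 1):
            G[j, k] = comb(k, j) * h ** (k - j)
    return G


def realise(x, coeffs, z):
    """(Pi_x tau)(z) for tau = sum_k coeffs[k] X^k, that is sum_k coeffs[k] (z - x)^k."""
    return sum(c * (z - x) ** k for k, c in enumerate(coeffs))


x, y, w = 0.3, -1.1, 2.0
z = np.linspace(-2.0, 2.0, 9)
tau = np.array([1.0, -2.0, 0.5, 3.0, -0.25])

# Pi_x Gamma_xy = Pi_y, with Gamma_xy = Gamma_h at h = x - y
consistency_ok = np.allclose(realise(x, gamma(x - y) @ tau, z), realise(y, tau, z))
# Gamma_xy Gamma_yw = Gamma_xw
composition_ok = np.allclose(gamma(x - y) @ gamma(y - w), gamma(x - w))
print(consistency_ok, composition_ok)
```

Both identities reduce to the binomial theorem: re-expanding (z − y)^k around x produces exactly the coefficients of Γ_{x−y} X^k.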
In the sense of the following proposition, modelled distributions in the context of
the polynomial model are nothing but classical Hölder functions.

Proposition 13.16. Let β = n + γ with n ∈ N and γ ∈ (0, 1). If f belongs to the
Hölder space $C^\beta$, then $\boldsymbol f \in \mathcal D^\beta_P$ with

\[ \boldsymbol f(x) = f(x)\,\mathbf 1 + \sum_{1 \le |k| \le n} \frac{f^{(k)}(x)}{k!}\, X^k . \]

Conversely, if $\hat f \in \mathcal D^\beta_P$ then $f := \langle \hat f, \mathbf 1 \rangle$ is in $C^\beta$ and necessarily $\hat f = \boldsymbol f$.


This proposition is essentially a consequence of the (well-known) fact that f ∈ C β
if and only if for every x ∈ Rd , there exists a polynomial Px = Px (y) of degree n,
such that, locally uniformly in x, y, one has |f (y) − Px (y)| ≲ |y − x|β . Necessarily
then, such a function f is n times continuously differentiable, and Px is its Taylor
polynomial of degree n. This characterisation and the above proposition remain
valid for integer values of β with the caveat that in this context C β means β − 1
times continuously differentiable with the highest order derivatives locally Lipschitz
continuous.
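For a concrete instance of this characterisation, the following sketch takes f = sin in dimension one with n = 2 and β = 2.5, lifts f to its Taylor jet and measures the quantities appearing in the modelled distribution bound (13.14) for the polynomial model. The action of Γ_h by the binomial formula is again assumed from Section 13.2.1, and the sampling grid is an arbitrary choice.

```python
from math import comb, factorial

import numpy as np

n, beta = 2, 2.5  # beta = n + gamma with gamma = 0.5
derivs = [np.sin, np.cos, lambda t: -np.sin(t)]  # f, f', f'' for f = sin


def lift(x):
    """Coefficients of the jet f(x) 1 + sum_k f^(k)(x) / k! X^k."""
    return np.array([derivs[k](x) / factorial(k) for k in range(n + 1)])


def gamma(h):
    """Matrix of Gamma_h, assuming Gamma_h X^k = (X + h 1)^k."""
    G = np.zeros((n + 1, n + 1))
    for k in range(n + 1):
        for j in range(k + 1):
            G[j, k] = comb(k, j) * h ** (k - j)
    return G


# worst[k] approximates sup |f(x) - Gamma_xy f(y)|_k / |x - y|^(beta - k)
worst = np.zeros(n + 1)
exponents = beta - np.arange(n + 1)
for x in np.linspace(-1.0, 1.0, 21):
    for dist in np.logspace(-3, 0, 13):
        for y in (x - dist, x + dist):
            diff = np.abs(lift(x) - gamma(x - y) @ lift(y))
            worst = np.maximum(worst, diff / dist ** exponents)
print(worst)  # stays bounded as |x - y| -> 0
```

The component of degree k of the difference is the Taylor remainder of f^{(k)}/k! of order n − k, which explains why the ratios remain bounded uniformly in the distance.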
It will be convenient for the sequel to introduce a suitable notion of “negative”
Hölder spaces. In fact, the definition of a model (see also Exercise 13.2) suggests that
a very natural space of distributions is obtained in the following way. Given α > 0,
we denote by C −α the space of all distributions η such that, with r the smallest
integer such that r > α,
\[ |\eta(\varphi^\lambda_x)| \lesssim \lambda^{-\alpha} , \]

uniformly over all φ ∈ Br and λ ∈ (0, 1], and locally uniformly in x. Given any
compact set K, the best possible constant such that the above bound holds uniformly
over x ∈ K yields a seminorm. The collection of these seminorms endows C −α with
a Fréchet space structure.

Remark 13.17. In terms of the scale of classical Besov spaces, the space $C^{-\alpha}$ is a
local version of $B^{-\alpha}_{\infty,\infty}$. It is in some sense the largest space of distributions that is
invariant under the scaling $\varphi(\cdot) \mapsto \lambda^{-\alpha} \varphi(\lambda^{-1} \cdot)$, see for example [BP08].

Let us now give a very simple application of the reconstruction theorem. It is
a classical result in the “folklore” of harmonic analysis (see for example [BCD11,
Thm 2.52] for a very similar statement) that the product extends naturally to $C^\beta \times C^{-\alpha}$
into $\mathcal D'(\mathbb R^d)$ if and only if β > α, which can also be seen as a higher-dimensional
version of the Young integral, cf. Exercise 13.1. We illustrate how to use the
reconstruction theorem in order to obtain a straightforward proof of the “if” part of this
result:

Theorem 13.18. For β > α > 0, there is a continuous bilinear map

B : C β × C −α → D′ (Rd )

such that B(f, g) = f g for any two continuous functions f and g.

Proof. Assume from now on that g = ξ ∈ C −α for some α > 0 and that f ∈ C β
for some β > α. We then build a regularity structure T in the following way. For
the index set A, we take A = N ∪ (N − α) and for T , we set T = V ⊕ W , where
each one of the spaces V and W is a copy of the polynomial regularity structure (in
d commuting variables). We also choose Γ as in the polynomial case above, acting
simultaneously and identically on each of the two instances.
As before, we denote by X k the canonical basis vectors in V . We also use the
suggestive notation “ΞX k ” for the corresponding basis vector in W , but we postulate
that ΞX k ∈ T|k|−α rather than ΞX k ∈ T|k| . Given any distribution ξ ∈ C −α , we
then define a model $(\Pi^\xi, \Gamma)$, where Γ is as in the canonical model, while $\Pi^\xi$ acts as

\[ (\Pi^\xi_x X^k)(y) = (y - x)^k , \qquad (\Pi^\xi_x\, \Xi X^k)(y) = (y - x)^k\, \xi(y) , \]

with the obvious abuse of notation in the second expression. It is then straightforward
to verify that Πy = Πx ◦ Γxy and that the relevant analytical bounds are satisfied, so
that this is indeed a model.
Denote now by $\mathcal R^\xi$ the reconstruction map associated to the model $(\Pi^\xi, \Gamma)$ and,
for $f \in C^\beta$, denote by $\boldsymbol f$ the element in $\mathcal D^\beta$ given by the local Taylor expansion of
f of order β at each point. Note that even though the space $\mathcal D^\beta$ does in principle
depend on the choice of model, in our situation $\boldsymbol f \in \mathcal D^\beta$ for any choice of ξ. It
follows immediately from the definitions that the map $x \mapsto \Xi \boldsymbol f(x)$ belongs to $\mathcal D^{\beta-\alpha}$
so that, provided that β > α, one can apply the reconstruction operator to it. This
suggests that the multiplication operator we are looking for can be defined as

\[ B(f, \xi) = \mathcal R^\xi \bigl( \Xi \boldsymbol f \bigr) . \]

By Theorem 13.12, this is a jointly continuous map from $C^\beta \times C^{-\alpha}$ into $\mathcal D'(\mathbb R^d)$,
provided that β > α. If ξ happens to be a smooth function, then it follows
immediately from the remark after Theorem 13.12 that $B(f, \xi)(x) = f(x)\xi(x)$, so that B is
indeed the requested continuous extension of the usual product. □
Remark 13.19. In the context of this theorem, one can actually show that B(f, g) ∈
C −α . More generally, denoting by −α the smallest degree arising in a given regularity
structure T , i.e. α = − min A, it is possible to show that the reconstruction operator
R takes values in C −α .
The reader may notice that one can also work with a finite-dimensional regularity
structure, based on index set Ñ ∪ (Ñ − α), with Ñ = {0, 1, . . . , n} and β = n + γ.
In particular, if n = 0, the regularity structure used here is exactly the one already
encountered in (13.8).

13.3.2 The rough path model

Let us see now how some of the results of Section 4 can be reinterpreted in the light
of this theory. Fix α ∈ (1/3, 1/2] and let T be the rough path regularity structure
put forward in Definition 13.4. Recall that this means that T0 = ⟨1⟩, Tα and Tα−1
are copies of Re with respective basis vectors W j and Ẇ j , and T2α−1 is a copy of
Re×e with basis vectors $\dot{\mathbb W}^{ij}$. The structure group G is isomorphic to Re and, for
h ∈ Re , acts on T via

\[ \Gamma_h \mathbf 1 = \mathbf 1 , \qquad \Gamma_h \dot W^i = \dot W^i , \qquad \Gamma_h W^i = W^i + h^i \mathbf 1 , \qquad
   \Gamma_h \dot{\mathbb W}^{ij} = \dot{\mathbb W}^{ij} + h^i \dot W^j . \tag{13.16} \]
Let now $\mathbf W = (W, \mathbb W)$ be an α-Hölder continuous rough path over Re . It turns out
that this defines a model for T in the following way:

Lemma 13.20. Given an α-Hölder continuous rough path $\mathbf W$, one can define a model
$M = M_{\mathbf W}$ for T on R by setting $\Gamma_{t,s} = \Gamma_{W_{s,t}}$ and

\[ (\Pi_s \mathbf 1)(t) = 1 , \qquad (\Pi_s W^j)(t) = W^j_{s,t} , \]
\[ (\Pi_s \dot W^j)(\psi) = \int \psi(t)\, dW^j_t , \qquad (\Pi_s \dot{\mathbb W}^{ij})(\psi) = \int \psi(t)\, d\mathbb W^{ij}_{s,t} . \]

Here, both integrals are perfectly well-defined Riemann integrals, with the differential
in the second case taken with respect to the variable t. Given a controlled rough path
$(Y, Y') \in \mathscr D^{2\alpha}_W$, this then defines an element $\boldsymbol Y \in \mathcal D^{2\alpha}_M$ by

\[ \boldsymbol Y(s) = Y(s)\, \mathbf 1 + Y'_i(s)\, W^i , \]

with summation over i implied.

Proof. We first check that the algebraic properties of Definition 13.5 are satisfied.
It is clear that $\Gamma_{s,u} \Gamma_{u,t} = \Gamma_{s,t}$ and that $\Pi_s \Gamma_{s,u} \tau = \Pi_u \tau$ for $\tau \in \{\mathbf 1, W^j, \dot W^j\}$.
Regarding $\dot{\mathbb W}^{ij}$, we differentiate Chen’s relations (2.1) which yields the identity

\[ d\mathbb W^{i,j}_{s,t} = d\mathbb W^{i,j}_{u,t} + W^i_{s,u}\, dW^j_t . \]

The last missing algebraic relation then follows at once. The required analytic bounds
follow immediately (exercise!) from the definition of the rough path space $\mathscr C^\alpha$.
Regarding the function $\boldsymbol Y$ defined in the statement, we have

\begin{align*}
\|\boldsymbol Y(s) - \Gamma_{s,u} \boldsymbol Y(u)\|_0 &= |Y(s) - Y(u) + Y'_i(u)\, W^i_{s,u}| , \\
\|\boldsymbol Y(s) - \Gamma_{s,u} \boldsymbol Y(u)\|_\alpha &= |Y'(s) - Y'(u)| ,
\end{align*}

so that the condition (13.14) with γ = 2α does indeed coincide with the definition of
a controlled rough path. □
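The role played by Chen's relation in this proof can be illustrated on a discretised smooth path. The sketch below assumes the standard form of Chen's relation for the second level, namely that the second-level increment over [s, t] equals the sum of those over [s, u] and [u, t] plus the tensor product of the two first-level increments, and approximates the iterated integral by left-point Riemann sums, for which the relation holds exactly. The path, the grid and the helper names are illustrative choices.

```python
import numpy as np

n_steps = 1000
t = np.linspace(0.0, 1.0, n_steps + 1)
# A smooth two-component path sampled on the grid (illustrative choice).
W = np.stack([np.sin(2 * np.pi * t), t ** 2], axis=1)


def increment(a, b):
    """First level: W_{s,t} = W_t - W_s for grid indices a <= b."""
    return W[b] - W[a]


def second_level(a, b):
    """Left-point Riemann sum for the (i, j) entries of int_s^t W^i_{s,r} dW^j_r."""
    dW = W[a + 1:b + 1] - W[a:b]  # increments of W over the cells of [s, t]
    left = W[a:b] - W[a]          # W_{s,r} evaluated at the left endpoints
    return left.T @ dW            # matrix with entry (i, j)


a, u, b = 100, 450, 900
lhs = second_level(a, b)
rhs = second_level(a, u) + second_level(u, b) + np.outer(increment(a, u), increment(u, b))
chen_ok = np.allclose(lhs, rhs, atol=1e-12)
print(chen_ok)
```

Viewing both sides as functions of the endpoint t and taking increments in t reproduces the differentiated identity used in the proof: only the second-level term based at u and the term involving the fixed increment over [s, u] depend on t.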

Theorems 4.4 and 4.10 can then be recovered as a particular case of the recon-
struction theorem in the following way.

Proposition 13.21. In the same context as above, let α ∈ (1/3, 1/2], and consider
the modelled distribution $\boldsymbol Y \in \mathcal D^{2\alpha}_{M_{\mathbf W}}$ built as above from a controlled rough path
$(Y, Y') \in \mathscr D^{2\alpha}_W$. Then, the map $\boldsymbol Y \dot W^j$ given by

\[ \bigl(\boldsymbol Y \dot W^j\bigr)(s) := Y(s)\, \dot W^j + Y'_i(s)\, \dot{\mathbb W}^{ij} \]

belongs to $\mathcal D^{3\alpha-1}$. Furthermore, there exists a function Z, unique up to addition of
constants, such that

\[ \bigl(\mathcal R\, \boldsymbol Y \dot W^j\bigr)(\psi) = \int \psi(t)\, dZ(t) , \]

and such that $Z_{s,t} = Y(s)\, W^j_{s,t} + Y'_i(s)\, \mathbb W^{i,j}_{s,t} + O(|t - s|^{3\alpha})$.

Proof. The fact that $\boldsymbol Y \dot W^j \in \mathcal D^{3\alpha-1}$ is an immediate consequence of the definitions.
Since α > 1/3 by assumption, we can apply the reconstruction theorem to it, from
which it follows that there exists a unique distribution η such that, if ψ is a smooth
compactly supported test function, one has

\[ \eta(\psi^\lambda_s) = \int \psi^\lambda_s(t)\, Y(s)\, dW^j_t + \int \psi^\lambda_s(t)\, Y'_i(s)\, d\mathbb W^{i,j}_{s,t} + O(\lambda^{3\alpha - 1}) . \]

By a simple approximation argument, see Exercise 13.10, one can take for ψ the
indicator function of the interval [0, 1], so that

\[ \eta(\mathbf 1_{[s,t]}) = Y(s)\, W^j_{s,t} + Y'_i(s)\, \mathbb W^{i,j}_{s,t} + O(|t - s|^{3\alpha}) . \]

Here, the reason why one obtains an exponent 3α rather than 3α − 1 is that it is
really $|t - s|^{-1} \mathbf 1_{[s,t]}$ that scales like an approximate δ-distribution as t → s. □

Remark 13.22. Using the formula (13.26), it is straightforward to verify that if W
happens to be a smooth function and $\mathbb W$ is defined from W via (2.2), but this time
viewing it as a definition for the right-hand side, with the left-hand side given by
a usual Riemann integral, then the function Z constructed in Proposition 13.21
coincides with the usual Riemann integral of Y against $W^j$.
Remark 13.23. The theory of (controlled) rough paths of lower regularity already
hinted at in Section 2.4 can be recovered from the reconstruction operator and a
suitable choice of regularity structure (essentially two copies of the truncated tensor
algebra) in virtually the same way.

13.4 Proof of the reconstruction theorem

The proof of the reconstruction theorem originally given in [Hai14b] relied on
wavelet analysis, in particular on the existence of compactly supported wavelets of
arbitrary regularity [Dau88]. More recently, Otto and coauthors [OSSW18] and then
Moinat and Weber [MW18] obtained a version of the reconstruction theorem that
bypasses this theory and is completely self-contained. The version of the proof given
here is inspired by their work and has the advantage of being purely local: although
we state the result for models and modelled distributions that are assumed to be
defined on all of Rd , the proof generalises immediately to arbitrary domains. The
proof given here also generalises immediately to non-Euclidean scalings, even in
situations where the ratios between scaling exponents are irrational.
A crucial ingredient is the following remark. Fix α > 0 and let ϱ : Rd → R be
even, smooth, compactly supported in the ball of radius 1, such that

\[ \int x^k \varrho(x)\, dx = \delta_{k,0} , \qquad 0 \le |k| \le \alpha , \tag{13.17} \]

where k denotes a d-dimensional multi-index and δ denotes Kronecker’s delta. Note
that such a function necessarily exists, since otherwise one would be able to find a
polynomial P of degree at most α such that $\int P(x) \varphi(x)\, dx = 0$ for every smooth
and compactly supported φ, which is clearly absurd. (See also Exercise 13.8 for a
constructive proof.)
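In dimension d = 1 such a function can also be produced by hand. The following sketch, which is neither the argument above nor the construction of Exercise 13.8, combines an even bump ψ of unit mass with its rescaling ψ(2·) so that the second moment cancels; odd moments vanish by symmetry. The moment conditions read here as unit mass for k = 0 and vanishing moments for 0 < k ≤ α, so that the resulting ϱ satisfies (13.17) for every α < 4. The bump function, the grid and all names are assumptions of the sketch.

```python
import numpy as np


def bump(x):
    """Even smooth bump supported in [-1, 1] (not normalised)."""
    x = np.asarray(x, dtype=float)
    out = np.zeros_like(x)
    inside = np.abs(x) < 1
    out[inside] = np.exp(-1.0 / (1.0 - x[inside] ** 2))
    return out


x = np.linspace(-1.0, 1.0, 400001)
dx = x[1] - x[0]


def integrate(values):
    """Riemann sum over the grid x."""
    return np.sum(values) * dx


psi = bump(x) / integrate(bump(x))                      # unit mass, support [-1, 1]
psi_half = bump(2.0 * x) / integrate(bump(2.0 * x))     # unit mass, support [-1/2, 1/2]

# Solve a + b = 1 (unit mass) and a * m2(psi) + b * m2(psi_half) = 0 (no second moment).
second_moments = np.array([integrate(x ** 2 * psi), integrate(x ** 2 * psi_half)])
a, b = np.linalg.solve(np.vstack([[1.0, 1.0], second_moments]), np.array([1.0, 0.0]))
rho = a * psi + b * psi_half

moments = [integrate(x ** k * rho) for k in range(4)]
abs_mass = integrate(np.abs(rho))
print(a, b, moments, abs_mass)
```

The coefficients come out close to a = −1/3 and b = 4/3, so ϱ necessarily takes negative values and the integral of |ϱ| exceeds 1, in line with the observation made in Exercise 13.7.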
Given such a function ϱ, we define $\varrho^{(n)}(x) = 2^{nd} \varrho(2^n x)$, as well as

\[ \varrho^{(n,m)} = \varrho^{(n)} * \varrho^{(n+1)} * \cdots * \varrho^{(m)} , \tag{13.18} \]

where ∗ denotes convolution. We also set $\varphi^{(n)} = \lim_{m \to \infty} \varrho^{(n,m)}$, so that in
particular $\varphi^{(n)} = \varrho^{(n)} * \varphi^{(n+1)}$, and we write $\varrho^{(n)}_x(y) = \varrho^{(n)}(y - x)$ and similarly for $\varphi^{(n)}_x$;
see Exercise 13.7 to see that the limit $\varphi^{(n)}$ exists and belongs to $C_c^\infty$. We then have
the following preliminary lemma.

Lemma 13.24. Let α > 0, let ϱ be as above and let $\xi_n : \mathbb R^d \to \mathbb R$ be a sequence of
functions such that for every compact K there exists $C_K$ such that $\sup_{x \in K} |\xi_n(x)| \le C_K 2^{\alpha n}$,
and such that furthermore $\xi_n = \varrho^{(n)} * \xi_{n+1}$. Then, the sequence $\xi_n$ is
Cauchy in $C^{-\beta}$ for every β > α and its limit ξ satisfies $\xi_n = \varphi^{(n)} * \xi$.
If furthermore, for some x ∈ Rd and γ > −α one has the bound

\[ |\xi_n(y)| \le 2^{\alpha n} \bigl( |x - y|^{\gamma + \alpha} + 2^{-(\gamma + \alpha) n} \bigr) , \]

uniformly over n ≥ 0 and |y − x| ≤ 1, then $|\xi(\psi^\lambda_x)| \lesssim \lambda^\gamma$ for λ ≤ 1.

Proof. Let λ ∈ (0, 1] and let $\psi_\lambda$ be a test function that is supported in the ball of
radius λ and such that $|D^k \psi_\lambda| \le \lambda^{-d-|k|}$ for all |k| ≤ α + 1. In order to show that
$\xi_n$ is Cauchy in $C^{-\beta}$ it then suffices to exhibit a bound of the type

\[ |\psi_\lambda * (\xi_n - \xi_{n+1})| \lesssim \lambda^{-\beta}\, 2^{(\alpha - \beta) n} , \tag{13.19} \]

locally uniformly in x, for a proportionality constant independent of $\psi_\lambda$. Since there
exists $\bar C > 0$ such that $\int |\psi_\lambda(x)|\, dx \le \bar C$, uniformly over λ and $\psi_\lambda$, it follows from
the assumption $|\xi_n(x)| \le C 2^{\alpha n}$ that the left-hand side of (13.19) is bounded by
$(1 + 2^\alpha)\, C \bar C\, 2^{\alpha n}$, so that the bound (13.19) holds whenever $\lambda \le 2^{-n}$.
To deal with the converse case $2^{-n} \le \lambda$, we rewrite the left-hand side of (13.19)
as $|(\psi_\lambda * \varrho^{(n)} - \psi_\lambda) * \xi_{n+1}|$ and we note that, by Taylor’s remainder theorem,

\[ \bigl| \psi_\lambda(y) - T^{(\alpha)}_x(y) \bigr| \overset{\text{def}}{=}
   \Bigl| \psi_\lambda(y) - \sum_{|k| \le \alpha} \frac{D^k \psi_\lambda(x)}{k!} (y - x)^k \Bigr|
   \lesssim \lambda^{-N-d}\, |y - x|^N , \tag{13.20} \]

where N = ⌈α⌉. Since, by (13.17), one has $\varrho^{(n)} * T^{(\alpha)}_x = T^{(\alpha)}_x$ and since $T^{(\alpha)}_x(x) = \psi_\lambda(x)$, one has

\[ \bigl( \psi_\lambda * \varrho^{(n)} - \psi_\lambda \bigr)(x) = \bigl( \varrho^{(n)} * (\psi_\lambda - T^{(\alpha)}_x) \bigr)(x) , \]

which is bounded by $\lambda^{-N-d}\, 2^{-nN}$ as an immediate consequence of (13.20). Since
furthermore the support of this function has diameter at most 2λ, it follows that
its integral is at most $\lambda^{-N} 2^{-nN}$ so that, combining this with the a priori bound
$|\xi_{n+1}| \lesssim 2^{\alpha n}$, we conclude that

\[ |\psi_\lambda * (\xi_n - \xi_{n+1})| \lesssim \lambda^{-N}\, 2^{(\alpha - N) n} . \]

Since N ≥ α, the bound (13.19) then follows for $2^{-n} \le \lambda$ as required.


Since we have just shown that the sequence ξn is Cauchy, it has a limit ξ ∈ C −β .
Given a test function ψ, we have

ξn (ψ) = ξn+1 (ϱ(n) ∗ ψ) = ξm (ϱ(n,m) ∗ ψ) = ξ(φ(n) ∗ ψ) ,

showing that ξn = φ(n) ∗ ξ as required. (Here we use the fact that the convergence
ϱ(n,m) → φ(n) takes place in C r for r = rβ by Exercise 13.7.)
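The proof of the second claim follows the same lines. We write

\[ \xi(\psi^\lambda_x) = \xi_n(\psi^\lambda_x) + \sum_{k \ge n} (\xi_{k+1} - \xi_k)(\psi^\lambda_x) , \]

where n is chosen in such a way that $\lambda \in [2^{-(n+1)}, 2^{-n}]$. As a consequence of this
choice and of our assumption on $\xi_n$, one has the bound

\[ |\xi_n(\psi^\lambda_x)| \lesssim \lambda^{-d} \int_{B_x(\lambda)} 2^{\alpha n} \bigl( |x - y|^{\gamma + \alpha} + 2^{-(\gamma + \alpha) n} \bigr)\, dy
   \lesssim \lambda^{\gamma + \alpha}\, 2^{\alpha n} + 2^{-\gamma n} \lesssim \lambda^\gamma . \]

To bound $(\xi_{k+1} - \xi_k)(\psi^\lambda_x)$ we proceed as above, with k in place of n, so that

\[ \bigl| (\xi_{k+1} - \xi_k)(\psi^\lambda_x) \bigr| \lesssim \lambda^{-N-d}\, 2^{-kN} \int_{B_x(2\lambda)} |\xi_{k+1}(y)|\, dy
   \lesssim \lambda^{\gamma + \alpha - N}\, 2^{(\alpha - N) k} + \lambda^{-N}\, 2^{-(\gamma + N) k} . \]

Since N > α and N > −γ, this is summable over k ≥ n and its sum is again of order $\lambda^\gamma$, thus
concluding the proof. □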

Remark 13.25. Note the strong similarity of this setting with that of multiresolution
analysis [Mey92]: the image of the convolution operator with φ(n) plays the role of
Vn and convolution with ϱ(n) plays the role of the projection Vn+1 → Vn .

Let us now restate the reconstruction theorem for the reader’s convenience. (We
only consider the case γ > 0 here.)
Theorem 13.26. Let T be a regularity structure as above and let (Π, Γ ) be a model
for T on Rd . Then, for γ > 0, there exists a unique linear map $\mathcal R : \mathcal D^\gamma \to \mathcal D'(\mathbb R^d)$
such that

\[ \bigl| \bigl(\mathcal R f - \Pi_x f(x)\bigr)(\psi^\lambda_x) \bigr| \lesssim \lambda^\gamma , \]

uniformly over ψ ∈ Br and λ ∈ (0, 1], and locally uniformly in x. The statement still
holds for γ < 0, except that uniqueness fails.
Proof. We first define operators $\mathcal R^{(m,m)}$ by

\[ \bigl(\mathcal R^{(m,m)} f\bigr)(y) = \bigl(\varphi^{(m)} * \Pi_y f(y)\bigr)(y) = \bigl(\Pi_y f(y)\bigr)(\varphi^{(m)}_y) . \tag{13.21} \]

The idea then is to obtain $\mathcal R$ as the limit of $\mathcal R^{(m,m)}$ as m → ∞. This however turns
out not to be that easy to obtain directly. Instead, we try to make use of Lemma 13.24
and define, for m > n,

\[ \mathcal R^{(n,m)} f = \varrho^{(n,m-1)} * \mathcal R^{(m,m)} f , \]

so that, as a consequence of the identity $\Pi_z = \Pi_y \Gamma_{yz}$,

\[ \bigl(\mathcal R^{(n,m)} f - \mathcal R^{(n,m+1)} f\bigr)(x)
   = \int \varrho^{(n,m-1)}_x(y) \int \varrho^{(m)}_z(y)\, \bigl(\Pi_y \bigl(f(y) - \Gamma_{yz} f(z)\bigr)\bigr)(\varphi^{(m+1)}_z)\, dz\, dy . \]

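At this stage we note that, as a consequence of the analytical bounds (13.13)
imposed in the definition of a model, the quantity $(\Pi_y \tau)(\varphi^{(m+1)}_z)$ is bounded by
$C 2^{-\alpha m} \|\tau\|_\alpha$, uniformly over $|y - z| \lesssim 2^{-m}$ and $\tau \in T_\alpha$. On the other hand,
the definition of the spaces $\mathcal D^\gamma$ guarantees that the component of $f(y) - \Gamma_{yz} f(z)$
in $T_\alpha$ is bounded by $2^{(\alpha - \gamma) m}$, again uniformly over $|y - z| \lesssim 2^{-m}$. Since
$\int |\varrho^{(n,m-1)}_x(y)|\, dy \lesssim 1$, uniformly over m and n, we conclude that

\[ \bigl\| \mathcal R^{(n,m)} f - \mathcal R^{(n,m+1)} f \bigr\|_{L^\infty} \lesssim 2^{-\gamma m} , \tag{13.22} \]

uniformly over n ≥ 0 and m ≥ n. Furthermore, it is straightforward to check that

\[ \bigl\| \mathcal R^{(n,n)} f \bigr\|_{L^\infty} \lesssim 2^{-\underline\alpha n} , \tag{13.23} \]

where $\underline\alpha$ denotes the smallest degree in the ambient regularity structure. It follows
that $\mathcal R^{(n)} f = \lim_{m \to \infty} \mathcal R^{(n,m)} f$ is well-defined and also satisfies the bound (13.23).
Since the identity

\[ \mathcal R^{(n,m)} f = \varrho^{(n)} * \mathcal R^{(n+1,m)} f \]

holds for every m ≥ n + 1, it follows that $\mathcal R^{(n)} f = \varrho^{(n)} * \mathcal R^{(n+1)} f$, so that
$\mathcal R f = \lim_{n \to \infty} \mathcal R^{(n)} f$ exists in $C^\alpha$ for every $\alpha < \underline\alpha$ by Lemma 13.24.
It remains to show that one has the bound

\[ \bigl| \bigl(\mathcal R f - \Pi_x f(x)\bigr)(\psi^\lambda_x) \bigr| \lesssim \lambda^\gamma . \tag{13.24} \]

For this, we note first that if we define $f_x \in \mathcal D^\gamma$ by $f_x(y) = \Gamma_{yx} f(x)$, then one has
$\mathcal R^{(n)} f_x = \varphi^{(n)} * \Pi_x f(x)$, so that (13.24) can be written as

\[ \bigl| \bigl(\mathcal R (f - f_x)\bigr)(\psi^\lambda_x) \bigr| \lesssim \lambda^\gamma . \tag{13.25} \]

Since $\|(f - f_x)(y)\|_\alpha \lesssim |y - x|^{\gamma - \alpha}$, it follows from the definition (13.14) of $\mathcal D^\gamma$
that

\[ \bigl| \bigl(\mathcal R^{(n,n)} (f - f_x)\bigr)(y) \bigr| = \bigl| \bigl(\Pi_y (f - f_x)(y)\bigr)(\varphi^{(n)}_y) \bigr|
   \lesssim \sum_{\alpha < \gamma} 2^{-\alpha n}\, |y - x|^{\gamma - \alpha}
   \lesssim 2^{-\underline\alpha n} \bigl( |y - x|^{\gamma - \underline\alpha} + 2^{(\underline\alpha - \gamma) n} \bigr) . \]

By (13.22) the same bound also holds for $\mathcal R^{(n)}$, so that the claim follows from the
second part of Lemma 13.24.
The case γ < 0 works in a similar way, but this time we explicitly define

\[ \mathcal R f = \mathcal R^{(0,0)} f + \sum_n \bigl( \varrho^{(n)} - \delta \bigr) * \mathcal R^{(n,n)} f , \]

where δ denotes the Dirac delta-distribution. We leave it as an exercise for the reader
to verify that this sum does indeed converge in $C^\alpha$ for every $\alpha < \underline\alpha$ and that the limit
satisfies the required bound. □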

Remark 13.27. In the particular case where Πx τ happens to be a continuous function
for every τ ∈ T (and every x ∈ Rd ), Rf is also a continuous function and one has
the identity

\[ (\mathcal R f)(x) = \bigl(\Pi_x f(x)\bigr)(x) . \tag{13.26} \]
We leave it as an exercise to show that this is the case, taking (13.21) as a starting
point.

13.5 Exercises

Exercise 13.1 a) Relate Theorem 13.18, in case d = 1, with the Young integral.
b) Draw inspiration from Weierstrass’s construction of a continuous nowhere
differentiable function to construct examples demonstrating the “only if” part of
Theorem 13.18.
Exercise 13.2 (Hölder spaces) For k ∈ N and α ∈ (0, 1), it is customary to define
C k+α as the space of k times continuously differentiable functions f : Rd → R such
that their derivatives of order k are α-Hölder continuous. Show that this agrees with
the obvious extension to Rd of the definition given earlier in (13.2).
Exercise 13.3 Show that in general, the function Z from Proposition 13.21 coincides,
up to an additive constant, with the rough integral $\int_0^t Y(s)\, dX^j_s$, in the sense of
Remark 4.12.
♯ Exercise 13.4 Let γ̄ ≥ γ > 0 and let f ∈ C(Rd , T<γ̄ ) be such that the “modelled
distribution” bound (13.14) holds for every α < γ, that is,

\[ \|f\|_{\mathcal D^\gamma} < \infty . \]

Show that the projection of f on T<γ belongs to D γ .


Exercise 13.5 Let (Π, Γ ) be a model for the “rough path” regularity structure
given in Definition 13.4 with the additional property that Πs Ẇ i is the distributional
derivative of Πs W i for every s. Show that it is then necessarily of the form MW for
some α-Hölder rough path W as in Lemma 13.20.
Exercise 13.6 Using the regularity structure defined in Section 13.3.2, give a proof
of the Lyons–Victoir extension theorem using the case γ < 0 of the reconstruction
theorem. Hint: A useful fact is that, for any symbol τ of degree α and any model
(Π, Γ ), the function $y \mapsto f^\tau_x(y) = \Gamma_{yx} \tau - \tau$ belongs to $\mathcal D^\alpha$.
∗ Exercise 13.7 Show that the limit $\varphi^{(n)} = \lim_{m \to \infty} \varrho^{(n,m)}$ with $\varrho^{(n,m)}$ as in (13.18)
exists and belongs to $C_c^\infty$, with the limit being taken in $C^r$ for any r > 0. Show
furthermore that, despite the fact that one necessarily has $\int |\varrho(x)|\, dx > 1$ (why?),
there exists a constant C such that $\int |\varrho^{(n,m)}(x)|\, dx < C$, uniformly over n, m ∈ N.
Hint: Work in Fourier space to show existence and smoothness of the limit and in
direct space to show that it has compact support.
Exercise 13.8 Show that it is possible to find a smooth compactly supported function
ϱ such that (13.17) holds. Hint: Note first that for any ψ integrating to 1 one can find
a differential operator L of order α with constant coefficients and without constant
term such that $\int \psi(x) P(x)\, dx = \bigl((\mathrm{Id} - L) P\bigr)(0)$ for all polynomials P of degree
α. Show that then $\varrho = \sum_{k \le \alpha} (L^*)^k \psi$ does the trick, where $L^*$ denotes the formal
adjoint of L.
Exercise 13.9 Show that the construction of Section 2.4 determines a regularity
structure with T = T (p) (Rd ), structure group G(p) (Rd ), and such that deg ew =
α|w|. Show also that every rough path X determines a model for this regularity
structure and that the definition of a controlled path given in Definition 4.18 coincides
with the definition of the space D pα for the model associated to the rough path X.
Exercise 13.10 Show that one can indeed take φ = 1[0,1] in the last step of the proof
of Proposition 13.21. Hint: show first that one can write

\[ \mathbf 1_{[0,1]} = \sum_{n \ge 0} (\varphi_n + \psi_n) , \]

where $\varphi_n$ is supported on $[0, 2^{-n}]$, $\psi_n$ is supported on $[1 - 2^{-n}, 1]$, all of these
functions are smooth, and $\|D^k \varphi_n\|_\infty + \|D^k \psi_n\|_\infty \le C 2^{kn}$ for some C > 0,
uniformly over n ≥ 0 and k ∈ [0, r].
Exercise 13.11 Given a fixed regularity structure and model, given γ > 0, τ ∈ Tγ
and x ∈ Rd , define a function fx,τ : Rd → T<γ by

fx,τ (y) = Γyx τ − τ .

Show that fx,τ ∈ D γ and that one has Rfx,τ = Πx τ . Use this to give another proof
of Lyons’ extension theorem (Exercise 4.6).

13.6 Comments

All basic definitions (regularity structure, model, modelled distribution, . . .) are
taken from [Hai14b]. An alternative theory to the theory of regularity structures was
introduced more or less simultaneously in Gubinelli–Imkeller–Perkowski [GIP15].
Instead of the reconstruction theorem, that theory builds on properties of Bony’s
paraproduct [Bon81, BMN10, BCD11] and it introduces a notion of “paracontrolled
distribution” which replaces the notion of “modelled distribution”. This theory is
also able to deal with stochastic PDEs like the KPZ equation or the dynamical $\Phi^4_3$
equation, see Catellier–Chouk [CC18b], but its scope is not as wide as that of the
theory of regularity structures. For example, as it stands it does not appear to be
able to deal with classical one-dimensional parabolic SPDEs driven by space-time
white noise with a diffusion coefficient depending on the solution or the type of
equation arising as natural evolutions on the space of loops with values in a manifold
[Hai16, BGHZ19]. This is however evolving rapidly as a number of recent results
show that paracontrolled calculus can alternatively be used as the foundation for the
analytical aspects of the theory of regularity structures. We refer to [BB19, BH18,
MP18, BH19, BM19] for more details.
One advantage of the paraproduct-based theory is that one generally deals with
globally defined objects rather than the “jets” used in the theory of regularity struc-
tures. It also uses some already well-studied objects, so that it can rely on a substantial
body of existing literature. On the flip side, it usually achieves a less clean break
between the analytical and the algebraic aspects of a given problem. Furthermore,
while the probabilistic aspects of the theory are expected to be equivalent to some
extent, it is not completely clear how an analogue of the results [CH16] would even
be formulated in the paracontrolled setting, although the results mentioned above
may provide a hint. A third approach, closer in spirit to Wilson’s renormalisation
group ideas, was developed by Kupiainen [Kup16] who used it to give an alternative
construction of the solutions to the dynamical $\Phi^4_3$ equation.
The regularity structure view on rough paths, Sections 13.2.2 and 13.3.2, is
further explored in [BCFP19]; see also [Hai14b, Sec. 4.4]. As already mentioned,
the original proof of the reconstruction theorem given in [Hai14b] (also reproduced
in the first edition of this book) relies on wavelet analysis, in particular on the
existence of compactly supported wavelets of arbitrary regularity [Dau88]. The new
proof in Section 13.4 was inspired by [OSSW18, MW18] and has the advantage
of being entirely self-contained. One additional advantage is that the current proof
immediately generalises to scalings s that are not necessarily rational. (Rationality of
s was required in the original articles in order to be able to build a suitable wavelet
basis by tensorisation of one-dimensional wavelet bases.)
One advantage of the proof using wavelets is that it implies that a model is
uniquely determined by the actions of Πx and Γxy on countably many translates
and scalings of a finite number of functions and for a countable number of values of
x, y. It also makes it very easy to prove a Kolmogorov-type criterion for models, see
[Hai14b, Prop. 3.32 & Thm. 10.7].
Chapter 14
Operations on modelled distributions

The original motivation for the development of the theory of regularity structures
was to provide robust solution theories for singular stochastic PDEs like the KPZ
equation or the dynamical $\Phi^4_3$ model. The idea is to reformulate them as fixed point
problems in some space $\mathcal{D}^\gamma$ (or rather a slightly modified version that takes into
account possible singular behaviour near time 0) based on a suitable random model
in a regularity structure purpose-built for the problem at hand. In order to achieve
this, this chapter provides a systematic way of formulating the standard operations
arising in the construction of the corresponding fixed point problem (differentiation,
multiplication, composition by a regular function, convolution with the heat kernel)
as operations on the spaces D γ .

14.1 Differentiation

Being a local operation, differentiating a modelled distribution is straightforward,
provided that the model one works with is sufficiently rich. Denote by $\mathcal{L}$ some
(formal) differential operator with constant coefficients that is homogeneous of
degree m, i.e. it is of the form
\[ \mathcal{L} = \sum_{|k| = m} a_k D^k , \]
where k is a d-dimensional multi-index, $a_k \in \mathbf{R}$, and $D^k$ denotes the kth mixed
derivative in the distributional sense.
Given a regularity structure (T, G), it is convenient to define “abstract” differentiation
only on suitable substructures. The appropriate notion of sector was already
introduced in Definition 13.1. We have

Definition 14.1. Consider a sector V ⊂ T . A linear operator ∂ : V → T is said to
realise $\mathcal{L}$ (of degree m) for the model (Π, Γ ) if

• one has $\partial\tau \in T_{\alpha-m}$ for every $\tau \in V_\alpha$,
• one has Γ ∂τ = ∂Γ τ for every τ ∈ V and every Γ ∈ G,
• one has $\Pi_x \partial\tau = \mathcal{L}\Pi_x \tau$ for every τ ∈ V and every $x \in \mathbf{R}^d$.
Writing D γ (V ) for those elements in D γ taking values in the sector V , it then
turns out that one has the following fact:
Proposition 14.2. Assume that ∂ realises L for the model (Π, Γ ) and let f ∈ D γ (V )
for some γ > m. Then, ∂f ∈ D γ−m and the identity R∂f = LRf holds.
Proof. The fact that ∂f ∈ D γ−m is an immediate consequence of the definitions, so
we only need to show that R∂f = LRf .
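By the “uniqueness” part of the reconstruction theorem, this on the other hand
follows immediately if we can show that, for every fixed test function ψ and every
$x \in \mathbf{R}^d$, one has
\[ \bigl|\bigl(\Pi_x \partial f(x) - \mathcal{L}\mathcal{R}f\bigr)(\psi_x^\lambda)\bigr| \lesssim \lambda^\delta , \]
for some δ > 0. Here, we defined $\psi_x^\lambda$ as before. By the assumption on the model Π,
we have the identity
\[ \bigl(\Pi_x \partial f(x) - \mathcal{L}\mathcal{R}f\bigr)(\psi_x^\lambda) = \bigl(\mathcal{L}\Pi_x f(x) - \mathcal{L}\mathcal{R}f\bigr)(\psi_x^\lambda) = -\bigl(\Pi_x f(x) - \mathcal{R}f\bigr)(\mathcal{L}^* \psi_x^\lambda) , \]
where $\mathcal{L}^*$ is the formal adjoint of $\mathcal{L}$. Since, as a consequence of the homogeneity of
$\mathcal{L}$, one has the identity $\mathcal{L}^* \psi_x^\lambda = \lambda^{-m} (\mathcal{L}^*\psi)_x^\lambda$, it then follows immediately from the
reconstruction theorem that the right-hand side of this expression is of order $\lambda^{\gamma-m}$,
as required. ⊔⊓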
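
The scaling identity used in the last step can be checked directly. The following is a short sketch, assuming the convention $\psi_x^\lambda(y) = \lambda^{-d}\psi(\lambda^{-1}(y-x))$ for rescaled test functions (Euclidean scaling), which is how we read “as before” here.

```latex
% Assumption: \psi_x^\lambda(y) = \lambda^{-d} \psi(\lambda^{-1}(y - x)).
\begin{align*}
D^k \psi_x^\lambda(y)
  &= \lambda^{-d-|k|} (D^k \psi)\bigl(\lambda^{-1}(y-x)\bigr)
   = \lambda^{-|k|} (D^k \psi)_x^\lambda(y) , \\
\mathcal{L}^* \psi_x^\lambda
  &= \sum_{|k| = m} (-1)^{|k|} a_k D^k \psi_x^\lambda
   = \lambda^{-m} (\mathcal{L}^* \psi)_x^\lambda , \\
\bigl| \bigl(\Pi_x f(x) - \mathcal{R} f\bigr)(\mathcal{L}^* \psi_x^\lambda) \bigr|
  &= \lambda^{-m} \bigl| \bigl(\Pi_x f(x) - \mathcal{R} f\bigr)\bigl((\mathcal{L}^* \psi)_x^\lambda\bigr) \bigr|
   \lesssim \lambda^{\gamma - m} .
\end{align*}
```

The last bound is the reconstruction estimate applied to the fixed test function $\mathcal{L}^*\psi$, so that one can take δ = γ − m, which is positive since γ > m.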

14.2 Products and composition by regular functions

One of the main purposes of the theory presented here is to give a robust way to
multiply distributions (or functions with distributions) that goes beyond the barrier
illustrated by Theorem 13.18. Provided that our functions / distributions are represented
as elements in $\mathcal{D}^\gamma$ for some model and regularity structure, we can multiply
their “Taylor expansions” pointwise, provided that we give ourselves a table of
multiplication on T .
It is natural to consider products with the following properties.
Definition 14.3. Given a regularity structure (T, G) and two sectors V, V̄ ⊂ T , a
product on (V, V̄ ) is a bilinear map ⋆ : V × V̄ → T such that, for any τ ∈ Vα and
τ̄ ∈ V̄β , one has τ ⋆ τ̄ ∈ Tα+β and such that, for any element Γ ∈ G, one has
Γ (τ ⋆ τ̄ ) = Γ τ ⋆ Γ τ̄ .
Remark 14.4. The condition that degrees add up under multiplication is very natural,
bearing in mind the case of the polynomial regularity structure. The second condition
is also very natural since it merely states that if one reexpands the product of two
“polynomials” around a different point, one should obtain the same result as if one
reexpands each factor first and then multiplies them together.
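
In the polynomial regularity structure in dimension one, both conditions of Definition 14.3 can be checked by hand. The following minimal Python sketch is an illustration only: the helper names `multiply` and `shift` and the coefficient-list representation are choices made here, not notation from the text. It represents an abstract polynomial by its coefficients with respect to 1, X, X², . . ., takes ⋆ to be the usual product, and lets Γ act by X^k ↦ (X + h1)^k.

```python
from fractions import Fraction
from math import comb


def multiply(p, q):
    """Product of two abstract polynomials given by coefficient lists.

    p[i] is the coefficient of X^i, so X^i * X^j lands in degree i + j.
    """
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += Fraction(a) * Fraction(b)
    return out


def shift(p, h):
    """Action of Gamma_h: X^k -> (X + h 1)^k, extended linearly."""
    out = [Fraction(0)] * len(p)
    for k, a in enumerate(p):
        for j in range(k + 1):
            # Binomial expansion of (X + h)^k, collecting the X^j term.
            out[j] += Fraction(a) * comb(k, j) * Fraction(h) ** (k - j)
    return out


if __name__ == "__main__":
    p = [1, 2, 3]        # 1 + 2X + 3X^2
    q = [0, 1, 0, 4]     # X + 4X^3
    h = Fraction(3, 2)
    lhs = shift(multiply(p, q), h)              # Gamma(p * q)
    rhs = multiply(shift(p, h), shift(q, h))    # (Gamma p) * (Gamma q)
    print("Gamma(p * q) == Gamma p * Gamma q:", lhs == rhs)
```

Exact rational arithmetic is used so that the two sides can be compared with `==`. No truncation is applied here, since projecting onto low degrees does not commute with the re-expansion.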

Given such a product, we can ask ourselves when the pointwise product of an
element of $\mathcal{D}^{\gamma_1}$ with an element of $\mathcal{D}^{\gamma_2}$ again belongs to some $\mathcal{D}^\gamma$. In order to answer
this question, we introduce the notation $\mathcal{D}^\gamma_\alpha$ to denote those elements $f \in \mathcal{D}^\gamma$ such
that furthermore
\[ f(x) \in T_{\ge \alpha} = \bigoplus_{\beta \ge \alpha} T_\beta , \]
for every x. With this notation at hand, it is not hard to show:

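Theorem 14.5. Let $f_1 \in \mathcal{D}^{\gamma_1}_{\alpha_1}(V)$, $f_2 \in \mathcal{D}^{\gamma_2}_{\alpha_2}(\bar V)$, and let ⋆ be a product on (V, V̄ ).
Then, the function f given by $f(x) = f_1(x) \star f_2(x)$ belongs to $\mathcal{D}^\gamma_\alpha$ with
\[ \alpha = \alpha_1 + \alpha_2 , \qquad \gamma = (\gamma_1 + \alpha_2) \wedge (\gamma_2 + \alpha_1) . \tag{14.1} \]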

Proof. It is clear that $f(x) \in T_{\ge\alpha}$, so it remains to show that it belongs to $\mathcal{D}^\gamma$.
Furthermore, since we are only interested in showing that $f_1 \star f_2 \in \mathcal{D}^\gamma$, we discard
all of the components in $T_\beta$ for β ≥ γ.
By the properties of the product ⋆, it remains to obtain a bound of the type
\[ \|\Gamma_{xy} f_1(y) \star \Gamma_{xy} f_2(y) - f_1(x) \star f_2(x)\|_\beta \lesssim |x-y|^{\gamma-\beta} . \]
By adding and subtracting suitable terms, we obtain
\begin{align*}
\|\Gamma_{xy} f(y) - f(x)\|_\beta
  &\le \bigl\|\bigl(\Gamma_{xy} f_1(y) - f_1(x)\bigr) \star \bigl(\Gamma_{xy} f_2(y) - f_2(x)\bigr)\bigr\|_\beta \\
  &\quad + \bigl\|\bigl(\Gamma_{xy} f_1(y) - f_1(x)\bigr) \star f_2(x)\bigr\|_\beta \tag{14.2} \\
  &\quad + \bigl\|f_1(x) \star \bigl(\Gamma_{xy} f_2(y) - f_2(x)\bigr)\bigr\|_\beta .
\end{align*}
It follows from the properties of the product ⋆ that the first term in (14.2) is bounded
by a constant times
\begin{align*}
\sum_{\beta_1+\beta_2=\beta} \|\Gamma_{xy} f_1(y) - f_1(x)\|_{\beta_1} \|\Gamma_{xy} f_2(y) - f_2(x)\|_{\beta_2}
  &\lesssim \sum_{\beta_1+\beta_2=\beta} \|x-y\|^{\gamma_1-\beta_1} \|x-y\|^{\gamma_2-\beta_2} \\
  &\lesssim \|x-y\|^{\gamma_1+\gamma_2-\beta} .
\end{align*}
Since γ1 + γ2 ≥ γ, this bound is as required. The second term is bounded by a
constant times
\begin{align*}
\sum_{\beta_1+\beta_2=\beta} \|\Gamma_{xy} f_1(y) - f_1(x)\|_{\beta_1} \|f_2(x)\|_{\beta_2}
  &\lesssim \sum_{\beta_1+\beta_2=\beta} \|x-y\|^{\gamma_1-\beta_1} \mathbf{1}_{\beta_2 \ge \alpha_2} \\
  &\lesssim \|x-y\|^{\gamma_1+\alpha_2-\beta} ,
\end{align*}
where the second inequality uses the identity β1 + β2 = β. Since γ1 + α2 ≥ γ, this
bound is again of the required type. The last term is bounded similarly by reversing
the roles played by f1 and f2 . ⊔⊓

Remark 14.6. Strictly speaking, it is the projection of $f(x) = f_1(x) \star f_2(x)$ to $T_{<\gamma}$
that belongs to $\mathcal{D}^\gamma_\alpha$, see Exercise 13.4.

Remark 14.7. It is clear that the formula (14.1) for γ is optimal in general, as can
be seen from the following two “reality checks”. First, consider the case of the
polynomial model and take $f_i \in \mathcal{C}^{\gamma_i}$. In this case, the (abstract) truncated Taylor
series of $f_i$ belongs to $\mathcal{D}^{\gamma_i}_0$. It is clear that in this case, the product cannot be
expected to have better regularity than γ1 ∧ γ2 in general, which is indeed what (14.1)
states. The second reality check comes from (the proof of) Theorem 13.18. In this
case, with β > α ≥ 0, one has $f \in \mathcal{D}^\beta_0$, while the constant function $x \mapsto \Xi$ belongs
to $\mathcal{D}^\infty_{-\alpha}$ so that, according to (14.1), one expects their product to belong to $\mathcal{D}^{\beta-\alpha}_{-\alpha}$,
which is indeed the case.

It turns out that if we have a product on a regularity structure, then in many
cases this also naturally yields a notion of composition with regular functions. Of
course, one could in general not expect to be able to compose a regular function with a
distribution of negative order. As a matter of fact, we will only define the composition
of regular functions with elements in some $\mathcal{D}^\gamma$ for which it is guaranteed that the
reconstruction operator yields a continuous function. One might think at this stage
that this would yield a triviality, since we know of course how to compose arbitrary
continuous functions. The subtlety is that we would like to design our composition
operator in such a way that the result is again an element of $\mathcal{D}^\gamma$.
For this purpose, we say that a given sector V ⊂ T is function-like if
$\alpha < 0 \Rightarrow V_\alpha = 0$ and if $V_0$ is one-dimensional. (Denote the unit vector of $V_0$ by 1.)
We will furthermore always assume that our models are normal in the sense that
$(\Pi_x \mathbf{1})(y) = 1$. In this case, it turns out that if $f \in \mathcal{D}^\gamma(V)$ for a function-like sector
V , then $\mathcal{R}f$ is a continuous function and one has the identity $(\mathcal{R}f)(x) = \langle \mathbf{1}, f(x)\rangle$,
where we denote by ⟨1, • ⟩ the element in the dual of V which picks out the prefactor
of 1.
Assume now that we are given a regularity structure with a function-like sector
V and a product ⋆ : V × V → V . For any smooth function G : R → R and any
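$f \in \mathcal{D}^\gamma(V)$ with γ > 0, we can then define G ◦ f (also denoted G(f )) to be the
V -valued function given by
\[ (G \circ f)(x) = \sum_{k \ge 0} \frac{G^{(k)}(\bar f(x))}{k!}\, \mathcal{Q}_{<\gamma} \tilde f(x)^{\star k} , \]
where we have set
\[ \bar f(x) = \langle \mathbf{1}, f(x) \rangle , \qquad \tilde f(x) = f(x) - \bar f(x)\mathbf{1} , \]
and where $\mathcal{Q}_{<\gamma} : T \to T_{<\gamma}$ is the natural projection. Here, $G^{(k)}$ denotes the kth
derivative of G and $\tau^{\star k}$ denotes the k-fold product τ ⋆ · · · ⋆ τ . We also used the usual
conventions $G^{(0)} = G$ and $\tau^{\star 0} = \mathbf{1}$.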
Note that as long as G is $C^\infty$, this expression is well-defined. Indeed, by assumption,
there exists some α0 > 0 such that $\tilde f(x) \in T_{\ge \alpha_0}$. By the properties of
the product, this implies that one has $\tilde f(x)^{\star k} \in T_{\ge k\alpha_0}$. As a consequence, when
considering the component of G ◦ f in $T_\beta$ for β < γ, the only terms that give a
contribution are those with k < γ/α0 . Since we cannot possibly hope in general that
$G \circ f \in \mathcal{D}^{\gamma'}$ for some γ′ > γ, this is all we really need.
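
For the polynomial regularity structure in dimension one, where α0 = 1 and f(x) is a truncated Taylor polynomial, the formula for G ◦ f can be evaluated directly. The sketch below is illustrative only: the function names `truncated_product` and `compose`, the coefficient-list representation, and the convention that `n_terms` counts the integer degrees below γ are assumptions made here. It evaluates the sum over k, which stops after `n_terms` terms because the kth power of f̃ only has components of degree at least k.

```python
from math import exp, factorial


def truncated_product(p, q, n_terms):
    """Product of coefficient lists, keeping only degrees 0, ..., n_terms - 1."""
    out = [0.0] * n_terms
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if i + j < n_terms:
                out[i + j] += a * b
    return out


def compose(G_derivative, f, n_terms):
    """Coefficients of G o f in degrees 0, ..., n_terms - 1.

    G_derivative(k, u) must return the k-th derivative of G at u;
    f[j] is the coefficient of X^j in f(x).
    """
    f_bar = f[0]                                         # coefficient of 1
    f_tilde = [0.0] + [float(c) for c in f[1:n_terms]]   # f - f_bar * 1
    result = [0.0] * n_terms
    power = [1.0] + [0.0] * (n_terms - 1)                # f_tilde^{*0} = 1
    for k in range(n_terms):
        weight = G_derivative(k, f_bar) / factorial(k)
        for j in range(n_terms):
            result[j] += weight * power[j]
        power = truncated_product(power, f_tilde, n_terms)
    return result


if __name__ == "__main__":
    # Taylor coefficients of sin at 0 up to degree 4, composed with G = exp.
    sin_jet = [0.0, 1.0, 0.0, -1.0 / 6.0, 0.0]
    print(compose(lambda k, u: exp(u), sin_jet, 5))
```

The output can be compared with the Taylor coefficients of exp(sin x) at the origin, which is the consistency one expects in the polynomial case.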
It turns out that if G is sufficiently regular, then the map f ↦ G ◦ f enjoys
continuity properties similar to those we are used to from classical Hölder
spaces. The following result is the analogue in this context of Lemma 7.3:

Proposition 14.8. In the same setting as above, provided that G is of class $C^k$ with
k > γ/α0 , the map f ↦ G ◦ f is continuous from $\mathcal{D}^\gamma(V)$ into itself. If k > γ/α0 + 1,
then it is locally Lipschitz continuous.

The proof of the first statement can be found in [Hai14b], while the second
statement was shown in [HP15]. It is a somewhat lengthy, but ultimately rather
straightforward calculation.

14.3 Classical Schauder estimates

One of the reasons why the theory of regularity structures is very successful at
providing detailed descriptions of the small-scale features of solutions to semilinear
(S)PDEs is that it comes with very sharp Schauder estimates. A full proof of the
Schauder estimates for regularity structures is beyond the scope of this book, but we
want to convey the flavour of the proof. The aim of this section is therefore to give
a self-contained proof of the classical Schauder estimates which state that for any
(compactly supported) kernel K that is approximately homogeneous of degree β − d,
the convolution map ζ ↦ K ∗ ζ is continuous from $\mathcal{C}^\alpha$ to $\mathcal{C}^{\alpha+\beta}$, provided that α + β
is not a positive integer. We first make precise our assumptions on the kernel K.

Definition 14.9. Given β > 0, a kernel $K : \mathbf{R}^d \setminus \{0\} \to \mathbf{R}$, smooth except for a
singularity at the origin, is said to be β-regularising if it is supported in the unit
ball around the origin and, for every $k \in \mathbf{N}^d$, there exists a constant C such that
$|D^k K(x)| \le C|x|^{\beta-d-|k|}$.

Immediate examples are (smooth truncations of) the Newton potential in dimension
d ≥ 3, proportional to $1/|x|^{d-2}$ and hence 2-regularising, and the fractional Volterra
kernel $x^{H-1/2}\mathbf{1}_{x>0}$ with d = 1 and β = H + 1/2. The heat kernel on space-time
$\mathbf{R}^{n+1}$, proportional to $(t,x) \mapsto t^{-n/2}\exp\bigl(-|x|^2/(4t)\bigr)\mathbf{1}_{t>0}$, also fits in this setting (and
is 2-regularising), provided one works with “parabolic” scaling (cf. Remark 13.9).
As in Section 13.3, and for any r ∈ N, we work with $\mathcal{B}_r$ ⊂ D, the set of smooth
test functions with $C^r$-norm bounded by 1 and supported in the unit ball. It will be
convenient for the purpose of this section to write $\mathcal{B}^\lambda_{r,x}$ for the set of all test functions
of the form $\varphi^\lambda_x$ with $\varphi \in \mathcal{B}_r$. Such $\psi \in \mathcal{B}^\lambda_{r,x}$ are characterised by having support in
the ball of radius λ centred at x and derivative bounds $|D^k \psi| \le \lambda^{-d-|k|}$ for |k| ≤ r.
We also note that, for any real s ∈ [0, r], the estimate $\|\psi\|_{\mathcal{C}^s} \lesssim \lambda^{-d-s}$ holds true.

Lemma 14.10. Given a β-regularising kernel K and r ≥ 0, one can write
$K = \sum_{n \ge -1} K_n$ in such a way that $2^{\beta n} K_n \in C\mathcal{B}^{2^{-n}}_{r,0}$ for some C > 0.

Proof. As is common in the construction of Paley–Littlewood blocks, we work with
a dyadic partition of unity, based on a smooth “cutoff function” $\varphi : \mathbf{R}_+ \to [0,1]$,
supported in $[2^{-1}, 2^{1}]$, such that $\sum_{n \ge 0} \varphi_n \equiv 1$ on (0, 1], where $\varphi_n := \varphi(2^n\, \bullet)$ is
supported in $[2^{-n-1}, 2^{-n+1}]$. Since K is supported in {x : |x| ≤ 1}, the stated
decomposition clearly holds with (smooth) $K_n(x) := \varphi_{n+1}(|x|)K(x)$, supported in
the ball of radius $2^{-n}$ centred at the origin. To see that $2^{\beta n}K_n \in C\mathcal{B}^{2^{-n}}_{r,0}$, for given
r ≥ 0, it remains to see that $|D^j K_n| \lesssim (2^{-n})^{\beta-d-|j|}$ for |j| ≤ r. This estimate
holds, with $K_n$ replaced by K, by the defining property of a β-regularising kernel,
restricted to $|x| \asymp 2^{-n}$. On the other hand, $|D^i \varphi_n| = (2^n)^{|i|}\,|(D^i\varphi)(2^n\, \bullet)| \lesssim (2^n)^{|i|}$, and we
conclude with Leibniz’ product rule. ⊔⊓
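
The decomposition can be made concrete numerically. The sketch below is illustrative and rests on choices made here: a particular cutoff built from the standard function exp(−1/t), dimension d = 1, and the model kernel |x|^(β−1) without truncation (only its scaling near the origin matters for the blocks). It checks that the dyadic cutoffs sum to 1 on (0, 1] and that the rescaled blocks satisfy the order-zero bound defining the sets of rescaled test functions, uniformly in n.

```python
import math


def smooth_step(r):
    """Smooth function equal to 0 for r <= 1/2 and to 1 for r >= 1."""
    def g(t):
        return math.exp(-1.0 / t) if t > 0 else 0.0
    a, b = g(r - 0.5), g(1.0 - r)
    return a / (a + b)


def phi(r):
    """Cutoff supported in [1/2, 2]; the sum over n of phi(2^n r) telescopes."""
    return smooth_step(r) - smooth_step(r / 2.0)


def partition_sum(r, n_max=60):
    """Sum of phi_n(r) = phi(2^n r) over 0 <= n <= n_max."""
    return sum(phi(2.0 ** n * r) for n in range(n_max + 1))


def block_ratio(n, beta, samples=2000):
    """sup over 0 < x <= 2^{-n} of 2^{beta n} |K_n(x)| / lambda^{-d}, lambda = 2^{-n}, d = 1.

    Here K(x) = |x|^{beta - 1} and K_n(x) = phi_{n+1}(|x|) K(x).
    """
    lam = 2.0 ** (-n)
    best = 0.0
    for i in range(1, samples + 1):
        x = lam * i / samples
        k_n = phi(2.0 ** (n + 1) * x) * x ** (beta - 1.0)
        best = max(best, 2.0 ** (beta * n) * k_n * lam)
    return best


if __name__ == "__main__":
    print([round(partition_sum(r), 12) for r in (1.0, 0.3, 1e-3)])
    print([block_ratio(n, beta=0.5) for n in range(-1, 6)])
```

For this exactly homogeneous model kernel the ratios do not depend on n, which is the scaling behind the statement of the lemma. For a general β-regularising kernel one only expects them to stay bounded.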

The following simple proposition is the first crucial ingredient in our approach.
Loosely speaking, it states that the convolution of two test functions localised at two
distinct scales is localised at the sum (or equivalently maximum) of the two scales
and that one gains in amplitude if the tighter of the two test functions annihilates
polynomials of a certain degree.
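Proposition 14.11. There exists C > 0 such that, for all $\varphi \in \mathcal{B}^\lambda_{r,x}$ and $\psi \in \mathcal{B}^\mu_{r,y}$, one
has $\psi * \varphi \in C\mathcal{B}^{\lambda+\mu}_{r,x+y}$. If furthermore λ ≤ µ and $\int P(z)\varphi(z)\,dz = 0$ for every polynomial
P with deg P < γ ≤ r, some $\gamma \in \mathbf{R}_+$, then $\psi * \varphi \in C(\lambda/\mu)^\gamma \mathcal{B}^{\lambda+\mu}_{\lfloor r-\gamma\rfloor,x+y}$.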

Proof. Clearly, ψ ∗ φ is supported in the ball of radius λ + µ centred at x + y. For the
first claim, by swapping the roles of φ and ψ if necessary, we may assume λ ≤ µ. To
see that the convolution yields an element in $\mathcal{B}^{\lambda+\mu}_{r,x+y}$, in view of the characterisation
of such spaces, it suffices to estimate, for |k| ≤ r, $D^k(\psi * \varphi) = (D^k\psi) * \varphi$ using
$|D^k\psi| \lesssim \mu^{-d-|k|} \asymp (\lambda+\mu)^{-d-|k|}$ and $\int |\varphi(z)|\,dz \le C$ (independent of λ).
Regarding the second claim, we write
\begin{align*}
D^k(\psi * \varphi)(\bullet)
  &= \int \psi^{(k)}(\bullet - z)\, \varphi(z)\, dz \\
  &= \int \bigl( \psi^{(k)}(\bullet - z) - P^{\gamma;(k)}_{\bullet}(\bullet - z) \bigr)\, \varphi(z)\, dz ,
\end{align*}
for 0 ≤ |k| ≤ r − γ, where $P^{\gamma;(k)}_{\bullet}$ denotes the Taylor expansion (at the dotted
base-point) of $\psi^{(k)} \equiv D^k\psi$ of integer degree γ − {γ} < γ (annihilated by φ). It
remains to be seen that, for all such k,
\[ |D^k(\varphi * \psi)(\bullet)| \lesssim (\lambda/\mu)^\gamma \mu^{-d-|k|} . \]
To this end, using that γ + |k| ≤ r, one has the estimate
\[ |\psi^{(k)}(\bullet - z) - P^{\gamma;(k)}_{\bullet}(\bullet - z)| \lesssim \|\psi\|_{\mathcal{C}^{\gamma+|k|}} |z|^\gamma \lesssim \mu^{-d-\gamma-|k|} |z|^\gamma . \]
We only need to consider z in the support of φ, and in fact can assume without loss of
generality that x = 0 (otherwise subtract another annihilated Taylor polynomial. . .),
so that $\int |z|^\gamma |\varphi(z)|\,dz \le \lambda^\gamma \int |\varphi(z)|\,dz \lesssim \lambda^\gamma$. The desired estimate now follows.

Our second crucial ingredient is a characterisation of Hölder spaces that is well
adapted to our approach. For this, we define the following scale of spaces of
distributions.

Definition 14.12. For α ∈ R, write $r = r_o(\alpha)$ for the smallest non-negative integer
such that r + α > 0. We then define $\mathcal{Z}^\alpha$ as the space of distributions ζ on $\mathbf{R}^d$ such that
for every compact set K ⊂ $\mathbf{R}^d$ there exists a constant C such that the bound
\[ |\zeta(\varphi)| \le C\lambda^\alpha \]
holds uniformly over λ ∈ (0, 1], x ∈ K and all $\varphi \in \mathcal{B}^\lambda_{r,x}$ such that $\int \varphi(z)P(z)\,dz = 0$
for all polynomials P with deg P ≤ α. For any compact set K, the best possible
constant such that the above bound holds uniformly over x ∈ K yields a seminorm.
The collection of these seminorms endows $\mathcal{Z}^\alpha$ with a Fréchet space structure.

The precise choice of r in Definition 14.12 is not very important, as one could
have taken any other choice r ≥ ro (α). More precisely, one has the following result.

Lemma 14.13. For r ≥ ro (α), write Zrα for Z α as defined above, but with ro (α)
replaced by r. Then Zrα = Z α .

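Proof. We fix a partition of unity $\{\chi_y\}_{y \in \Lambda}$ for $\mathbf{R}^d$ such that all the $\chi_y$ are translates
of $\chi_0$ by $y \in \mathbf{R}^d$ and $\Lambda \subset \mathbf{R}^d$ is a lattice. In particular, we make sure that $\chi_y \in \mathcal{B}^\lambda_{r,y}$.
Given any λ > 0, we write $\chi_{y,\lambda}(x) = \chi_{y/\lambda}(x/\lambda)$ and we set $\Lambda_\lambda = \Lambda/\lambda$. We also
fix a function $\psi \in C^\infty$ with support in the centred unit ball and such that
\[ \int_{\mathbf{R}^d} x^k \psi(x)\, dx = \delta_{k,0} , \qquad \forall k : |k| \le r . \tag{14.3} \]
(Such functions exist by Exercise 13.8.) We then write $\tilde\psi(x) = 2^d\psi(2x) - \psi(x)$ and
note that by (14.3) one has $\int_{\mathbf{R}^d} x^k \tilde\psi(x)\, dx = 0$ for |k| ≤ r.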
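Let now α < 0 and take $\zeta \in \mathcal{Z}^\alpha_r$; we want to show that $\zeta \in \mathcal{Z}^\alpha$. Given $\varphi \in \mathcal{B}^\lambda_{r_o,x}$
and setting $\lambda_n = 2^{-n}\lambda$, we write
\[ \varphi = \varphi * \psi^\lambda + \sum_{n \ge 0} \sum_{y \in \Lambda_{\lambda_n}} \varphi_{n,y} , \qquad \varphi_{n,y} = \bigl(\varphi * \tilde\psi^{\lambda_n}\bigr) \cdot \chi_{y,\lambda_n} . \tag{14.4} \]
As a simple consequence of the Taylor remainder theorem, one has the bound
\[ \bigl\| \varphi * D^k \tilde\psi^{\lambda_n} \bigr\|_\infty \lesssim \lambda^{-d} 2^{-r_o n} \lambda_n^{-|k|} = 2^{-(d+r_o)n} \lambda_n^{-d-|k|} , \]
so that there exists a constant C independent of φ such that $\varphi_{n,y} \in C 2^{-(d+r_o)n} \mathcal{B}^{\lambda_n}_{r,y}$,
which in particular implies that
\[ |\zeta(\varphi_{n,y})| \lesssim \lambda^\alpha 2^{-(d+r_o+\alpha)n} . \tag{14.5} \]
Since the number of terms in $\Lambda_{\lambda_n}$ such that $\varphi_{n,y}$ is non-zero is of order $2^{nd}$, we
conclude that
\[ |\zeta(\varphi)| \lesssim \lambda^\alpha + \sum_{n \ge 0} \lambda^\alpha 2^{-(r_o+\alpha)n} \lesssim \lambda^\alpha , \]
where we used the fact that $r_o + \alpha > 0$ by definition.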


Note that the assumption α < 0 was used in order to obtain the bound (14.5)
since there is no reason for $\varphi_{n,y}$ to annihilate polynomials even if φ does. The case
α > 0 is easier, noting that the definition of $\mathcal{Z}^\alpha_r$ implies that $\zeta * \tilde\psi^{\lambda_n}$ is a continuous
function bounded by $O(\lambda_n^\alpha)$. We then use the fact that
\[ \zeta(\varphi) = \zeta(\varphi * \psi^\lambda) + \sum_{n \ge 0} \langle \zeta * \tilde\psi^{\lambda_n}, \varphi \rangle , \]
with ⟨·, ·⟩ denoting the $L^2$ scalar product, combined with the fact that φ integrates to
O(1), to conclude that $|\zeta(\varphi)| \lesssim \lambda^\alpha \bigl(1 + \sum_{n \ge 0} 2^{-\alpha n}\bigr) \lesssim \lambda^\alpha$ as required.
The case α = 0 is a bit more delicate and we leave it as Exercise 14.3. ⊔⊓

Remark 14.14. Validity of the stated bounds implies that distributions in $\mathcal{Z}^\alpha$ ⊂ D′
can be extended canonically to test functions in $C^r_c$ (elements in $C^r$ with compact
support). In this sense, $\mathcal{Z}^\alpha$ is contained in the topological dual of $C^r_c$. (The situation
is similar in the definition of models, cf. Remark 13.7.)

For α < 0, the polynomial-annihilation condition is void and there is no additional
condition on φ besides $\varphi \in \mathcal{B}^\lambda_{r,x}$. In this case $\mathcal{Z}^\alpha$ is precisely the negative Hölder
space $\mathcal{C}^\alpha$ introduced in Section 13.3.1. The following proposition shows that to some
extent this is also true in case of positive Hölder spaces, as previously encountered in
Section 13.3.1.

Proposition 14.15. For α ∉ N, one has $\mathcal{Z}^\alpha = \mathcal{C}^\alpha$.

Proof. There is nothing to prove for α < 0, so let α > 0. We first show that
$\mathcal{C}^\alpha \subset \mathcal{Z}^\alpha$, this inclusion also being valid for integer values of α. In fact, it suffices to
note that, given $f \in \mathcal{C}^\alpha$ and $\varphi \in \mathcal{B}^\lambda_{r,x}$ as in Definition 14.12, one has
\[ \int f(y)\varphi(y)\, dy = \int \bigl( f(y) - P^\alpha_x(y-x) \bigr) \varphi(y)\, dy \lesssim \lambda^\alpha , \]
where the identity follows from the fact that φ annihilates $P^\alpha_x$, the Taylor expansion
at order α of f , based at x, and the bound is as in the proof of Proposition 14.11.
For the converse inclusion, we first consider the case α ∈ (0, 1) and let $\zeta \in \mathcal{Z}^\alpha$.
Let $\varrho : \mathbf{R}^d \to \mathbf{R}$ be a smooth function that is compactly supported in the unit ball
around the origin and such that $\int \varrho(z)\, dz = 1$. Note first that, for any $x \in \mathbf{R}^d$ and
λ ∈ (0, 1], it follows from the definition of $\mathcal{Z}^\alpha$ that one has the bound
\[ \bigl| \zeta(\varrho_x^{2^{-n}\lambda}) - \zeta(\varrho_x^{2^{-n-1}\lambda}) \bigr| = \bigl| \zeta(\varrho_x^{2^{-n}\lambda} - \varrho_x^{2^{-n-1}\lambda}) \bigr| \le C\lambda^\alpha 2^{-\alpha n} . \]
It follows that $f(x) = \lim_{n\to\infty} \zeta(\varrho_x^{2^{-n}\lambda})$ is well-defined and that
\[ |f(x) - \zeta(\varrho_x^\lambda)| \lesssim \lambda^\alpha . \]
As a consequence, one has
\[ |f(x) - f(y)| \lesssim \lambda^\alpha + \bigl| \zeta(\varrho_x^\lambda - \varrho_y^\lambda) \bigr| . \]
Choosing λ = |x − y|, it follows that $f \in \mathcal{C}^\alpha$. The fact that f = ζ in the sense that
$\zeta(\varphi) = \int f(z)\varphi(z)\, dz$ follows immediately from the fact that
\[ \zeta(\varphi) = \lim_{\lambda \to 0} \zeta(\varphi * \varrho^\lambda) = \lim_{\lambda \to 0} \int \zeta(\varrho_x^\lambda)\, \varphi(x)\, dx . \]
The claim for general non-integer α can then be seen from the fact that $\zeta \in \mathcal{Z}^\alpha$
implies $D^k\zeta \in \mathcal{Z}^{\alpha-|k|}$ (interpreted as distributional derivatives) for every multi-index
k. Details are left to the reader. ⊔⊓

Remark 14.16. For n ∈ N, the spaces $\mathcal{Z}^n$ are usually called Hölder–Zygmund spaces
in the literature (thus our choice of symbol $\mathcal{Z}$). They are distinct from the usual
Hölder spaces since one can check that $x \mapsto x^n \log x$ belongs to $\mathcal{Z}^n$, but not to $\mathcal{C}^n$.
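
The case n = 1 of this remark can be explored numerically. The sketch below does not use the test-function definition given above. It assumes instead the classical second-difference description of the Zygmund class, namely that the quantity |f(x + h) − 2f(x) + f(x − h)| is bounded by a constant times h, and it uses the extension x log |x| to the whole line. Both are assumptions made here for illustration.

```python
import math


def f(x):
    """x log|x|, extended by 0 at the origin."""
    return x * math.log(abs(x)) if x != 0.0 else 0.0


def first_difference_quotient(h):
    """|f(h) - f(0)| / h, which equals |log h| and blows up as h -> 0."""
    return abs(f(h) - f(0.0)) / h


def second_difference_quotient(x, h):
    """|f(x + h) - 2 f(x) + f(x - h)| / h, the Zygmund-type quantity."""
    return abs(f(x + h) - 2.0 * f(x) + f(x - h)) / h


if __name__ == "__main__":
    for h in (1e-2, 1e-4, 1e-8):
        worst = max(second_difference_quotient(i * h / 7.0, h) for i in range(-40, 41))
        print(h, first_difference_quotient(h), worst)
```

The first quotient grows like |log h|, so the function is not Lipschitz near the origin, while the second quotient stays bounded over the sampled points.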

With all of these preliminaries in place, we can give a very simple proof of
Schauder’s theorem. (See for example [Sim97] for an alternative proof of a very
similar statement.)

Theorem 14.17. For any β-regularising kernel K, the map ζ ↦ K ∗ ζ is continuous
from $\mathcal{Z}^\alpha$ to $\mathcal{Z}^{\alpha+\beta}$ for every α ∈ R.

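Proof. Let $\zeta \in \mathcal{Z}^\alpha$ and let $\varphi \in \mathcal{B}^\lambda_{r,x}$, where we will (and can by Lemma 14.13) work
with suitable $r \ge r_o(\alpha+\beta)$, chosen below, such that $\int \varphi(z)P(z)\, dz = 0$ for every
P with deg P ≤ α + β. Lemma 14.10 yields a decomposition $(K_n : n \ge -1)$ for
$\check K(x) = K(-x)$, so that
\[ (K * \zeta)(\varphi) = \zeta(\check K * \varphi) = \sum_n \zeta(K_n * \varphi) = \sum_n 2^{-\beta n} \zeta(2^{\beta n} K_n * \varphi) , \tag{14.6} \]
with $2^{\beta n}K_n \in C\mathcal{B}^{2^{-n}}_{r,0}$ for some C > 0. It then follows from Proposition 14.11
(applied with $\mu = 2^{-n}$, noting that $K_n * \varphi$ also annihilates polynomials of degree
up to α + β) and the definition of $\mathcal{Z}^\alpha$ that
\[ |\zeta(2^{\beta n} K_n * \varphi)| \lesssim \begin{cases} \lambda^\alpha & \text{if } 2^{-n} \le \lambda, \\ (2^n\lambda)^\gamma\, 2^{-\alpha n} & \text{otherwise,} \end{cases} \]
provided $\lfloor r - \gamma \rfloor \ge r_o(\alpha+\beta)$. We will also need γ > α + β, so that for instance
r := 2(|α| + β) + 2 is a safe choice. Inserting this bound into (14.6), and using
β > 0, γ > α + β to estimate the geometric sums
\[ \sum_{\substack{n \ge 0 \\ 2^{-n} \le \lambda}} 2^{-\beta n} \lambda^\alpha \lesssim \lambda^{\alpha+\beta} , \qquad \sum_{\substack{n \ge 0 \\ 2^{-n} \ge \lambda}} 2^{(\gamma-\alpha-\beta)n} \lambda^\gamma \lesssim \lambda^{\alpha+\beta} , \]
it follows that $|(K * \zeta)(\varphi)| \lesssim \lambda^{\alpha+\beta}$, whence the claim follows. ⊔⊓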
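
For completeness, here is a sketch of the two geometric sums used at the end of the proof. We write $n_\lambda$ (a symbol introduced here for illustration) for the smallest integer n ≥ 0 with $2^{-n} \le \lambda$, so that $2^{-n_\lambda} \le \lambda < 2^{1-n_\lambda}$ for λ ∈ (0, 1).

```latex
% n_\lambda is notation introduced for this sketch only.
\begin{align*}
\sum_{n \ge n_\lambda} 2^{-\beta n} \lambda^\alpha
  &= \lambda^\alpha\, \frac{2^{-\beta n_\lambda}}{1 - 2^{-\beta}}
   \lesssim \lambda^{\alpha} \lambda^{\beta}
  && (\beta > 0) , \\
\sum_{0 \le n \le n_\lambda} 2^{(\gamma-\alpha-\beta) n} \lambda^\gamma
  &\le \lambda^\gamma\, \frac{2^{(\gamma-\alpha-\beta)(n_\lambda + 1)}}{2^{\gamma-\alpha-\beta} - 1}
   \lesssim \lambda^{\gamma} \lambda^{-(\gamma-\alpha-\beta)} = \lambda^{\alpha+\beta}
  && (\gamma > \alpha + \beta) .
\end{align*}
```

In both cases the sum is dominated by the terms with $2^{-n}$ comparable to λ, and the implied constants depend only on α, β and γ.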



Remark 14.18. The proof is (much) simpler in the “negative” case, with Hölder
exponents α < α + β < 0. In essence, this is due to the absence of polynomial
vanishing conditions. More specifically, one can take r = ro (α + β) in the above
proof, and then γ = 0 later on, so that only the easy (first) part of Proposition 14.15
is used. A reduction of the general to the negative case, in dimension d = 1, is
discussed in Exercise 14.2.
Remark 14.19. One can verify that the proof never made explicit use of the Euclidean
scaling and can be adapted mutatis mutandis to the case of arbitrary scalings as
mentioned in Remark 13.9, provided that the notion of “β-regularising kernel” is
adjusted accordingly (replace the exponent β − d − |k| by β − |s| − |k|s ).

14.4 Multilevel Schauder estimates and admissible models

As we saw in the previous section, the classical Schauder estimates state that if
K : Rd → R is a kernel that is smooth everywhere, except for a singularity at the
origin that is approximately homogeneous of degree β − d for some fixed β > 0 (i.e.
it is β-regularising in the sense of Definition 14.9), then the operator f ↦ K ∗ f
maps C α into C α+β for every α ∈ R, except for those values for which α + β ∈ N.
It turns out that similar Schauder estimates hold in the context of general regularity
structures in the sense that it is in general possible to build an operator K : D γ →
D γ+β with the property that RKf = K ∗Rf . We call such a statement a “multi-level
Schauder estimate” since it is a form of Schauder estimate for all the components of
f in Tα for all α < γ. Of course, such a statement can only be expected to hold if
our regularity structure contains not only the objects necessary to describe Rf up to
order γ, but also those required to describe K ∗ Rf up to order γ + β. What are these
objects? At this stage, it might be useful to reflect on the effect of the convolution of
a singular function (or distribution) with K.
Let us assume for a moment that a given real-valued function f is smooth everywhere,
except at some point x0 . It is then straightforward to convince ourselves
that K ∗ f is also smooth everywhere, except at x0 . Indeed, for any δ > 0, we can
write K = Kδ + Kδc , where Kδ is supported in a ball of radius δ around 0 and
Kδc is a smooth function. Similarly, we can decompose f as f = fδ + fδc , where
fδ is supported in a δ-ball around x0 and fδc is smooth. Since the convolution of
a smooth function with an arbitrary distribution is smooth, it follows that the only
non-smooth component of K ∗ f is given by Kδ ∗ fδ , which is supported in a ball of
radius 2δ around x0 . Since δ was arbitrary, the statement follows. By linearity, this
strongly suggests that the local structure of the singularities of K ∗ f can be described
completely by only using knowledge on the local structure of the singularities of f .
It also suggests that the “singular part” of the operator K should be local, with the
non-local parts of K only contributing to the “regular part”.
This discussion suggests that we need the following ingredients to build an
operator K with the desired properties:
• The polynomial structure should be part of our regularity structure in order to be
able to describe the “regular parts”.
• We should be given an “abstract integration operator” I (of order β) on T which
describes how the “singular parts” of Rf transform under convolution by K.
• We should restrict ourselves to models which are “compatible” with the action
of I in the sense that the behaviour of Πx Iτ should relate in a suitable way to
the behaviour of K ∗ Πx τ near x.
One way to implement these ingredients is to assume first that our regularity structure
contains abstract polynomials in the following sense.
Assumption 14.20 There exists a sector T̄ ⊂ T isomorphic to the polynomial
regularity structure. In other words, $\bar T_\alpha \neq 0$ if and only if α ∈ N, and one can
find basis vectors $X^k$ of $T_{|k|}$ such that every element Γ ∈ G acts on T̄ by
$\Gamma X^k = (X + h\mathbf{1})^k$ for some $h \in \mathbf{R}^d$.
Furthermore, we assume that there exists an abstract integration operator I, of
fixed order β > 0, with the following properties.
Assumption 14.21 There exists a linear map I : V → T for some sector V ⊂ T
such that IVα ⊂ Tα+β and, for every Γ ∈ G and τ ∈ T ,

Γ Iτ − IΓ τ ∈ T̄ . (14.7)

Remark 14.22. We do not want to assume Γ I = IΓ . This is already seen in the case
of the rough path structure given by Definition 13.4. The map $\mathcal{I} : \dot W^i \mapsto W^i$,
1 ≤ i ≤ e, constitutes an abstract integration operator (defined on the sector $T_{\alpha-1}$).
Since a generic $\Gamma_h \in G$ maps $W^i$ to $W^i + h^i\mathbf{1}$, we see that Γ I − IΓ ≠ 0 (for
h ≠ 0) and takes values in $T_0 = \langle \mathbf{1} \rangle$.
Finally, we want to restrict our attention to models that are compatible with this
structure for a given kernel K in the following sense.
Definition 14.23. Given a β-regularising kernel K and a regularity structure T
satisfying Assumptions 14.20 and 14.21, we say that a model (Π, Γ ) is admissible if
the identities
\begin{align*}
(\Pi_x X^k)(y) &= (y-x)^k , \\
\Pi_x \mathcal{I}\tau &= K * \Pi_x \tau - \Pi_x \mathcal{J}_x \tau , \tag{14.8}
\end{align*}
hold for every τ ∈ V . Here, $\mathcal{J}_x : V \to \bar T$ is the linear map given on homogeneous
elements by
\[ \mathcal{J}_x \tau = \sum_{|k| < \deg\tau + \beta} \frac{X^k}{k!} \int D^k K(x-y)\, (\Pi_x\tau)(dy) . \tag{14.9} \]

Remark 14.24. In some cases, it will be convenient to introduce a whole family $\mathcal{I}_k$
of integration operators of order β − |k|. The notion of admissibility is then defined
similarly, with $\mathcal{I}$ replaced by $\mathcal{I}_k$ and K replaced by $D^k K$, to the extent that these
symbols are included in the structure space.

Remark 14.25. If ξ is smooth and we furthermore impose that $\Pi_x$ is multiplicative
(which is not enforced in general!), this yields a recursion to define the canonical
model associated to ξ provided one manages to construct $\Gamma_{xy}$ at the same time. The
correct recursion to do this is
\[ \Gamma_{xy} (\mathcal{I} + \mathcal{J}_y)\tau = (\mathcal{I} + \mathcal{J}_x)\Gamma_{xy}\tau , \tag{14.10} \]
which is clearly consistent with the constraint (14.7) and which one can show guarantees
that $\Pi_x \Gamma_{xy} \mathcal{I}\tau = \Pi_y \mathcal{I}\tau$. See also Exercise 14.6.

Remark 14.26. Recall that if P is a polynomial and K is a compactly supported
function, then K ∗ P is again a polynomial of the same degree as P . Since, for
Πx τ smooth enough, the term Πx Jx τ appearing in (14.8) is nothing but the Taylor
expansion of K ∗ Πx τ around x, it follows that one has Πx IX k = 0 for any multi-index
k and any admissible model, which would suggest that one could have imposed
the identity IX k = 0 already at the algebraic level. This would however create
inconsistencies later on when incorporating renormalisation, unless we assume that
$\int K(x)P(x)\, dx = 0$ for every polynomial P of degree N , for some sufficiently
large value of N . Here, we chose to simply add instead IX k as separate symbols to
our regularity structure and to then set IX k = IX k .
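
The first claim of this remark can be checked by a binomial expansion. As a sketch, take a monomial $P(x) = x^k$ and write $m_j = \int K(y)\, y^j\, dy$ for the moments of K (notation introduced here), which are finite since K is integrable and compactly supported.

```latex
% m_j denotes the j-th moment of K; this notation is introduced for the sketch only.
\[
(K * P)(x) = \int K(y)\, (x - y)^k \, dy
           = \sum_{j \le k} \binom{k}{j} (-1)^{|j|}\, m_j \, x^{k-j} ,
\qquad m_j = \int K(y)\, y^j \, dy .
\]
```

So K ∗ P is a polynomial of degree at most |k| whose leading term is $m_0 x^k$, and the degree is exactly |k| whenever $\int K(y)\, dy \neq 0$.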

Remark 14.27. While K ∗ ξ is well-defined for any distribution ξ, it is not so clear a
priori whether the operator $\mathcal{J}_x$ given in (14.9) is also well-defined. It turns out that
the axioms of a model do ensure that this is the case. The correct way of interpreting
(14.9) is by
\[ \mathcal{J}_x \tau = \sum_{|k| < \deg\tau + \beta} \sum_{n \ge 0} \frac{X^k}{k!}\, (\Pi_x\tau)\bigl( D^k K_n(x - \bullet) \bigr) , \]
with $K_n$ as in Lemma 14.10. The scaling properties of the $K_n$ ensure that the function
$2^{(\beta-|k|)n} D^k K_n(x - \bullet)$ is a test function that is localised around x at scale $2^{-n}$. As
a consequence, one has
\[ \bigl| (\Pi_x\tau)\bigl( D^k K_n(x - \bullet) \bigr) \bigr| \lesssim 2^{(|k| - \beta - \deg\tau)n} , \]
so that this expression is indeed summable as long as |k| < deg τ + β.

Remark 14.28. As a matter of fact, it turns out that the above definition of an admissible
model dovetails very nicely with our axioms defining a general model.
Indeed, starting from any regularity structure T , any model (Π, Γ ) for T , and a
β-regularising kernel K, it is usually possible to build a larger regularity structure
T̂ containing T (in the “obvious” sense that T ⊂ T̂ and the action of Ĝ on T is
compatible with that of G) and endowed with an abstract integration map I, as well
as an admissible model (Π̂, Γ̂ ) on T̂ which reduces to (Π, Γ ) when restricted to T .
See [Hai14b] for more details.
The only exception to this rule arises when the original structure T contains some
homogeneous element τ which does not represent a polynomial and which is such
that deg τ + β ∈ N. Since the bounds appearing both in the definition of a model
and in that of a β-regularising kernel are only upper bounds, it is in practice easy to
exclude such a situation by slightly tweaking the definition of either the exponent β
or of the original regularity structure T .

With all of these definitions in place, we can finally build the operator $\mathcal{K} : \mathcal{D}^\gamma \to \mathcal{D}^{\gamma+\beta}$
announced at the beginning of this section. Recalling the definition of $\mathcal{J}$ from
(14.9), we set
\[ (\mathcal{K}f)(x) = \mathcal{I}f(x) + \mathcal{J}_x f(x) + (\mathcal{N}f)(x) , \tag{14.11} \]
where the operator $\mathcal{N}$ is given by
\[ (\mathcal{N}f)(x) = \sum_{|k| < \gamma + \beta} \frac{X^k}{k!} \int D^k K(x-y)\, \bigl( \mathcal{R}f - \Pi_x f(x) \bigr)(dy) . \tag{14.12} \]
Note first that thanks to the reconstruction theorem, it is possible to verify that the
right-hand side of (14.12) does indeed make sense for every $f \in \mathcal{D}^\gamma$ in virtually the
same way as in Remark 14.27. One has:

Theorem 14.29. Let K be a β-regularising kernel, let T = (T, G) be a regularity


structure satisfying Assumptions 14.20 and 14.21, and let (Π, Γ ) be an admissible
model for T . Then, for every f ∈ D γ with γ ∈ (0, N − β) and γ + β ̸∈ N, the
function Kf defined in (14.11) belongs to D γ+β and satisfies RKf = K ∗ Rf .

Proof. The complete proof of this result can be found in [Hai14b] and will not
be given here. Since it is rather straightforward, we will however give a proof
of Schauder’s estimate in the classical case (i.e. that of the polynomial regularity
structure) in Section 14.3 below.
Let us simply show that one has indeed RKf = K ∗ Rf in the particular case
when our model consists of continuous functions so that Remark 13.27 applies. In
this case, one has
\[
\bigl(\mathcal{R}\mathcal{K} f\bigr)(x) = \bigl(\Pi_x (\mathcal{I} f(x) + \mathcal{J}_x f(x))\bigr)(x) + \bigl(\Pi_x (\mathcal{N} f)(x)\bigr)(x) .
\]

As a consequence of (14.8), the first term appearing in the right-hand side of this
expression is given by
\[
\bigl(\Pi_x (\mathcal{I} f(x) + \mathcal{J}_x f(x))\bigr)(x) = \bigl(K * \Pi_x f(x)\bigr)(x) .
\]

On the other hand, the only term contributing to the second term is the one with
k = 0 (which is always present since γ > 0 by assumption) which then yields
\[
\bigl(\Pi_x (\mathcal{N} f)(x)\bigr)(x) = \int K(x - y) \bigl(\mathcal{R} f - \Pi_x f(x)\bigr)(dy) .
\]
Adding both of these terms, we see that the expression $\bigl(K * \Pi_x f(x)\bigr)(x)$ cancels,
leaving us with the desired result. □

We are now in principle in possession of all of the ingredients required to formulate
fixed point problems for a large number of semilinear stochastic PDEs: multiplication,
composition by regular functions, differentiation, and integration against the Green’s
function of the linearised equation. Before we show how this can be leveraged in
practice in order to build a robust solution theory for the KPZ equation, we briefly
explore some of the main concepts in the setting of (very) rough paths.

14.5 Rough volatility and robust Itô integration revisited

Recent applications from mathematical finance, where $\sigma(t, \omega) = \sigma(\widehat{W}_t)$ models
rough stochastic volatility, involve (standard) Itô integrals of the form
\[
\int_0^T \sigma(\widehat{W}_t)\, d(W_t, \bar{W}_t) \equiv \int_0^T f(\widehat{W}_t)\, dW_t + \int_0^T \bar{f}(\widehat{W}_t)\, d\bar{W}_t , \tag{14.13}
\]
where $\sigma = (f, \bar{f}) : \mathbf{R} \to \mathbf{R}^2$ is a sufficiently smooth map, $(W, \bar{W})$ is a 2-dimensional
standard Brownian motion, and $\widehat{W}_t$ is given by
\[
\int K^H(t - s)\, dW_s , \tag{14.14}
\]
with Riemann–Liouville kernel $K^H(x) = x^{H-1/2} \mathbf{1}_{x>0}$. Since $K^H \in L^2_{\mathrm{loc}}(\mathbf{R})$ but
not in $L^2(\mathbf{R})$, we replace it in the sequel by a compactly supported K, smooth away
from zero and equal to $K^H$ in some neighbourhood of zero. We then require W to
be a two-sided Brownian motion, so that ξ := Ẇ defines Gaussian white noise on R,
and
\[
\widehat{W} = K * \xi . \tag{14.15}
\]
Alternatively, as done in [BFG+19], see also [BFG20], one can restrict integration in
(14.14) to [0, t] with the benefit of exactly recovering Brownian motion $\widehat{W} = W$ for
H = 1/2, in which case the integral (14.13) fits squarely into rough integration theory
(namely Theorem 4.4, applied with the Itô Brownian rough path from Proposition 3.4).
However, for H ∈ (0, 1/2) rough integration must fail. Indeed, K is (1/2 + H)-regularising
so that it follows from Schauder’s Theorem 14.17 that $\widehat{W}$ and then
$\sigma(\widehat{W})$ have generically H-Hölder regularity and hence cannot be expected to be
controlled by $W \in C^{1/2}$. We can make (minor) progress by noting that $(\widehat{W}, \bar{W})$
is a 2-dimensional Gaussian process with independent components. At least for
H > 1/3, the results of Section 10.3 for Gaussian rough paths apply essentially
directly to the final integral $\int \bar{f}(\widehat{W})\, d\bar{W}$ above, and Exercise 14.8 allows one to deal with
arbitrary H > 0.
The remainder of this section will focus on the other, seemingly harmless, one-dimensional
Itô integral, with $\widehat{W}$ as given in (14.15),
\[
\int_0^T f(\widehat{W})\, dW . \tag{14.16}
\]

We are interested in a robust form of this Itô stochastic integral. In the case $\widehat{W} = W$
we can in fact express (14.16) via Itô’s formula, which immediately gives a version
of this integral which is continuous in W , even in uniform topology. Certainly, this
trick fails when $\widehat{W} \neq W$.
In this section we set up a regularity structure that provides a full solution to this
problem. Needless to say, this structure is much simpler than what is needed for the
KPZ equation in the next chapter. Yet, it showcases a number of features omnipresent
for singular SPDEs, but without some of the added complexity coming from PDE
theory.
Recall that the Hölder exponent of $\widehat{W}$ is H − κ for any κ > 0. As a result, we
have $|\widehat{W}_{s,t}^m| \lesssim |t - s|^{m(H-\kappa)}$ and the building blocks for a robust representation of
(14.16) are
\[
\mathbb{W}^m_{s,t} = \int_s^t (\widehat{W}_{s,r})^m\, dW_r , \tag{14.17}
\]

with m = 0, 1, 2, . . . , M where M is the smallest integer such that (M + 1)H +
1/2 > 1, which reflects the analytic redundancy of $\mathbb{W}^{M+1}$ in the sense of
\[
|\mathbb{W}^{M+1}(s, t)| \lesssim |t - s|^{(M+1)(H-\kappa)+1/2} = o(t - s) ,
\]

for small enough κ > 0. For definiteness, let us focus on the case
\[
H > \tfrac{1}{8} , \qquad M = 3 .
\]
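The choice of M can be checked mechanically. The following Python sketch (the function name `smallest_level` is ours, not the book's) returns the smallest integer M with (M + 1)H + 1/2 > 1, using exact rational arithmetic so that boundary values of H are handled without rounding issues.

```python
from fractions import Fraction


def smallest_level(H):
    """Smallest integer M >= 0 such that (M + 1) * H + 1/2 > 1."""
    H = Fraction(H)
    if H <= 0:
        raise ValueError("H must be positive")
    M = 0
    # Increase M until the strict inequality from the text holds.
    while (M + 1) * H + Fraction(1, 2) <= 1:
        M += 1
    return M
```

For example, any H in (1/8, 1/6] gives M = 3, while for larger H fewer levels are strictly needed; working with M = 3 for every H > 1/8 merely carries some analytically redundant terms.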
We first define symbols (these will be the basis vectors of our regularity structure) to
represent $(\widehat{W}_{s,t})^m$, 0 ≤ m ≤ 3. If Ξ is the symbol for white noise ξ ≡ Ẇ , we
can write the required symbols as
\[
\{1, \mathcal{I}(\Xi), \mathcal{I}(\Xi)^2, \mathcal{I}(\Xi)^3\} .
\]

The map $\mathcal{I} : \Xi \mapsto \mathcal{I}(\Xi)$ represents convolution with K and is graphically represented
by a downfacing plain line; multiplication (which we postulate to be commutative
and associative) is depicted by joining trees at their roots. For instance,
$\mathcal{I}(\Xi) \star \mathcal{I}(\Xi) = \mathcal{I}(\Xi)^2$ (we will omit ⋆ in the sequel). Similarly, the symbols denoting
$(\widehat{W}_{s,t})^m \dot{W}_t$, defined as the generalised derivative $\partial \mathbb{W}^m_{s,\bullet}$, are given by
$\{\Xi, \Xi\mathcal{I}(\Xi), \Xi\mathcal{I}(\Xi)^2, \Xi\mathcal{I}(\Xi)^3\}$. We then define the structure space of
our regularity structure as the free vector space generated by these symbols, namely
\[
T = \langle \Xi, \Xi\mathcal{I}(\Xi), \Xi\mathcal{I}(\Xi)^2, \Xi\mathcal{I}(\Xi)^3, 1, \mathcal{I}(\Xi), \mathcal{I}(\Xi)^2, \mathcal{I}(\Xi)^3 \rangle . \tag{14.18}
\]

The partial product defined on T (for example $\Xi \star \mathcal{I}(\Xi) = \Xi\mathcal{I}(\Xi)$) does not extend to all of T .¹ It
is natural to postulate that Ξ has degree $\deg \Xi = -\tfrac{1}{2}^{-}$ (the presence of the exponent
‘−’ reflects the fact that in order for the bound (13.13) to be satisfied when $\Pi_t \Xi$ is
given by white noise, we need to make sure that deg Ξ is strictly smaller than $-\tfrac{1}{2}$,
but by how much exactly is irrelevant as long as it is a small enough quantity), that $\mathcal{I}$
increases degree by $H + \tfrac{1}{2}$, and that the degree is additive under multiplication. Since
it is natural to take deg 1 = 0 to retain consistency with the polynomial regularity
structure, this uniquely determines the degree of each of the basis vectors of T , for
instance
\[
\deg \Xi\mathcal{I}(\Xi)^3 = \deg \Xi + 3 \deg \mathcal{I}(\Xi) = \bigl(3H - \tfrac{1}{2}\bigr)^{-} .
\]
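To understand the structure group, we shift from a base point s to a new base
point t. Basic additivity properties of the integral in (14.17) show that
\[
\mathbb{W}^3_{s,\bullet} = \mathbb{W}^3_{t,\bullet} + 3 \mathbb{W}^2_{t,\bullet} \widehat{W}_{s,t} + 3 \mathbb{W}^1_{t,\bullet} \widehat{W}_{s,t}^2 + \mathbb{W}^0_{t,\bullet} \widehat{W}_{s,t}^3 + \mathbb{W}^3_{s,t} .
\]

Considering the (generalised) derivative in the free variable, we have
\[
\partial \mathbb{W}^3_{s,\bullet} = \partial \mathbb{W}^3_{t,\bullet} + 3 (\partial \mathbb{W}^2_{t,\bullet}) \widehat{W}_{s,t} + 3 (\partial \mathbb{W}^1_{t,\bullet}) \widehat{W}_{s,t}^2 + (\partial \mathbb{W}^0_{t,\bullet}) \widehat{W}_{s,t}^3 . \tag{14.19}
\]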
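The re-expansion leading to (14.19) is purely algebraic (a binomial expansion combined with additivity of the integral), so it already holds exactly for left-point Riemann sums. The following Python sketch checks this on arbitrary integer sequences; the names `level` and `shifted` and the discrete setting are illustrative assumptions, not part of the text.

```python
from math import comb


def level(m, w_hat, w, s, u):
    """Discrete analogue of (14.17): sum over s <= r < u of
    (w_hat[r] - w_hat[s])**m * (w[r + 1] - w[r])."""
    return sum((w_hat[r] - w_hat[s]) ** m * (w[r + 1] - w[r])
               for r in range(s, u))


def shifted(m, w_hat, w, s, t, u):
    """Same quantity as level(m, ..., s, u), re-expanded around the
    intermediate base point t via the binomial theorem."""
    c = w_hat[t] - w_hat[s]  # plays the role of the increment of W-hat over [s, t]
    return level(m, w_hat, w, s, t) + sum(
        comb(m, k) * c ** k * level(m - k, w_hat, w, t, u)
        for k in range(m + 1)
    )
```

With m = 3 the binomial coefficients 1, 3, 3, 1 are exactly those appearing in the displayed identity.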

This suggests to “break up” the symbol $\Xi\mathcal{I}(\Xi)^3$ (for $\partial \mathbb{W}^3_{*,\bullet}$) in the form
\[
\Delta^+\bigl(\Xi\mathcal{I}(\Xi)^3\bigr) := \Xi\mathcal{I}(\Xi)^3 \otimes 1 + 3\, \Xi\mathcal{I}(\Xi)^2 \otimes \mathcal{J}(\Xi) + 3\, \Xi\mathcal{I}(\Xi) \otimes \mathcal{J}(\Xi)^2 + \Xi \otimes \mathcal{J}(\Xi)^3 \in T \otimes T^+ ,
\]

where the introduction of a new space $T^+$ is justified by the fact that elements in $T^+$
represent functions of two variables (s and t here), while elements of T represent
functions of one variable (the base point s resp. t) that are distributions in the
remaining free variable. In particular, it is rather natural that $T^+$ (unlike T ) contains
no symbols of negative degree and that elements of $T^+$ can be multiplied freely. In
other words, it is natural in this context to define $T^+$ as the free commutative algebra
generated by the single element $\mathcal{J}(\Xi)$. The difference between $T^+$ and T is
emphasised in our notation by drawing basis vectors of $T^+$ in black.
The action of the linear map $\Delta^+ : T \to T \otimes T^+$ has the appealing graphical
interpretation of cutting off positive branches: for instance, the summand
$3\, \Xi\mathcal{I}(\Xi)^2 \otimes \mathcal{J}(\Xi) = \Xi\mathcal{I}(\Xi)^2 \otimes 3 \mathcal{J}(\Xi)$ in $\Delta^+(\Xi\mathcal{I}(\Xi)^3)$ is explained as follows: there are
three ways to “cut off” a “lollipop” from $\Xi\mathcal{I}(\Xi)^3$, which are then painted black and put
as $3 \mathcal{J}(\Xi) \in T^+$ to the right-hand side; the remaining “pruned” tree $\Xi\mathcal{I}(\Xi)^2 \in T$ goes to
the left. Similarly, there are three ways to cut off two lollipops from $\Xi\mathcal{I}(\Xi)^3$, which then
appear as $3 \mathcal{J}(\Xi)^2 \in T^+$ on the right-hand side, while the pruned remainder
$\Xi\mathcal{I}(\Xi) \in T$ appears on the left.

¹ For instance, we do not want our regularity structure to contain a symbol $\Xi^2$ denoting the square
of white noise. We also have no need for trees with ≥ 4 branches, so that the corresponding products
remain deliberately undefined within T .

A concise recursive algebraic description of ∆+ starts with

∆+ 1 = 1 ⊗ 1 , ∆+ Ξ = Ξ ⊗ 1 ,

followed by an extension to all of T by imposing the identities²
\[
\Delta^+(\tau \bar{\tau}) = \Delta^+ \tau \cdot \Delta^+ \bar{\tau} , \qquad
\Delta^+ \mathcal{I}(\tau) = (\mathcal{I} \otimes \mathrm{Id}) \Delta^+ \tau + 1 \otimes \mathcal{J}(\tau) .
\]

Here, $\mathcal{J}(\tau)$ is the element in $T^+$ obtained from a (then painted black) symbol τ .
In our pictorial representation $\mathcal{J}$ is visualised by a (black) downfacing line. The
tree associated to $\mathcal{J}(\tau)$ has exactly one line emerging from the root (such trees are
called planted). In the present example, τ = Ξ is the only symbol in T , as given in
(14.18), with image under $\mathcal{I}$ in T , so that the second relation above can only produce
$\mathcal{J}(\Xi) \in T^+$; whereas the first relation leads to powers thereof (in $T^+$).
Let now $G^+$ denote the set of characters on $T^+$, i.e. all linear maps $g : T^+ \to \mathbf{R}$
with the property that g(σσ̄) = g(σ)g(σ̄) for any two elements σ and σ̄ in $T^+$. There
is not much choice here, since $c = g(\mathcal{J}(\Xi)) \in \mathbf{R}$ fully determines any such map. In order
to get back to (14.19), we introduce $\Gamma_g : T \to T$ by
\[
\Gamma_g \tau = (\mathrm{Id} \otimes g) \Delta^+ \tau , \tag{14.20}
\]

so that, for instance, $\Gamma_g(\Xi\mathcal{I}(\Xi)^3) = \Xi\mathcal{I}(\Xi)^3 + 3c\, \Xi\mathcal{I}(\Xi)^2 + 3c^2\, \Xi\mathcal{I}(\Xi) + c^3\, \Xi \in T$, and with
$c = g(\mathcal{J}(\Xi)) = \widehat{W}_{s,t}$ this precisely captures (14.19) as an abstract shift map
$\Gamma_{st} = \Gamma_{g_{s,t}}$ with $g_{s,t}(\mathcal{J}(\Xi)) = \widehat{W}_{s,t}$. In principle, (14.20) makes sense for every
$g \in (T^+)^*$, but it turns out that the set of those maps $\Gamma_g$ with $g \in G^+$ forms a group,
which is precisely our structure group:
\[
G := \{\Gamma_g : g \in G^+\} . \tag{14.21}
\]
Written in matrix form, with respect to the ordered basis of T consisting of 4 negative
and 4 non-negative symbols, each $\Gamma_g$ is block-diagonal with two (4 × 4)-blocks of
the form
\[
\begin{pmatrix}
1 & c & c^2 & c^3 \\
0 & 1 & 2c & 3c^2 \\
0 & 0 & 1 & 3c \\
0 & 0 & 0 & 1
\end{pmatrix} =: N_c .
\]
One can check that $N_c N_{\bar{c}} = N_{c+\bar{c}}$ with $c, \bar{c} \in \mathbf{R}$ so that, as a group, G is isomorphic
to (R, +). This completes the construction of the regularity structure (T, G). We
leave it to the reader to identify pairs of sectors on which (the usually omitted) ⋆
defines a product in the sense of Section 14.2 and to show that $\mathcal{I}$ is indeed an abstract
integration operator³ in the sense of Definition 14.21.
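The group property of the matrices $N_c$ is an instance of the binomial theorem, since the (i, j) entry of $N_c$ is $\binom{j}{i} c^{j-i}$. As a sanity check, here is a small Python sketch with exact rational arithmetic; the helper names `N` and `matmul` are ours.

```python
from fractions import Fraction
from math import comb


def N(c, size=4):
    """Upper-triangular matrix with entries binom(j, i) * c**(j - i)."""
    c = Fraction(c)
    return [[comb(j, i) * c ** (j - i) if j >= i else Fraction(0)
             for j in range(size)]
            for i in range(size)]


def matmul(A, B):
    """Plain matrix product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]
```

For instance, `matmul(N(2), N(Fraction(1, 3)))` agrees entry by entry with `N(Fraction(7, 3))`, and `N(0)` is the identity matrix.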

² The multiplicative property is understood for all symbols τ , τ̄ ∈ T which can be multiplied in T .
³ In the present setting there is no need to include higher order abstract polynomials $X, X^2, \ldots$ as
part of T .

As already hinted at, the natural Itô model $\mathcal{M}^{\text{Itô}} := (\Pi, \Gamma)$ in this context is
defined by setting
\[
\Pi_s 1 = 1 , \quad \Pi_s \Xi = \dot{W} , \quad \Pi_s(\mathcal{I}(\Xi)^m) = \widehat{W}^m_{s,\bullet} , \quad \Pi_s(\Xi\mathcal{I}(\Xi)^m) = \partial \mathbb{W}^m_{s,\bullet} ,
\]

as well as $\Gamma_{st} = \Gamma_{g_{s,t}}$ with $g_{s,t}(\mathcal{J}(\Xi)) = \widehat{W}_{s,t}$. We leave it to the reader to check that
$\mathcal{M}^{\text{Itô}}$ satisfies the required bounds (13.13) and therefore really defines a random
model for the regularity structure (T, G). We also note that the model is admissible
in the sense of Definition 14.23: in essence, this is seen from the identity
\[
\Pi_s \mathcal{I}\Xi = K * \Pi_s \Xi - \Pi_s \mathcal{J}(s)\Xi = K * \dot{W} - (K * \dot{W})(s) = \widehat{W}_{s,\bullet} , \tag{14.22}
\]

where we used that only k = 0 figures in the sum of (14.9), so that
\[
\mathcal{J}_s \Xi = 1 \int K(s - t) \bigl(\Pi_t \Xi\bigr)(dt) = (K * \dot{W})(s)\, 1 .
\]

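On the other hand, we can replace white noise Ẇ = Ẇ (ω) by a mollification
$\dot{W}^\varepsilon := \delta^\varepsilon * \dot{W}$ with $\delta^\varepsilon(t) = \varepsilon^{-1} \varrho(\varepsilon^{-1} t)$, for some $\varrho \in C_c^\infty$ with $\int \varrho = 1$, or indeed
any smooth function ξ, and define the associated canonical model $\mathcal{L}(\xi) = (\Pi, \Gamma)$
by prescribing
\[
\Pi_s \Xi = \xi , \quad \Pi_s(\mathcal{I}(\Xi)^m) = (K * \xi)^m_{s,\bullet} , \quad \Pi_s(\Xi\mathcal{I}(\Xi)^m) = \xi(\cdot)\, (K * \xi)^m_{s,\bullet} ,
\]

as well as $g_{s,t}(\mathcal{J}(\Xi)) = (K * \xi)_{s,t}$. We again leave it to the reader to check that $\mathcal{L}(\xi)$ is
indeed an admissible model for our regularity structure.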
It is interesting to consider the canonical model $\mathcal{L}(\dot{W}^\varepsilon)$ as ε → 0. Formally, one
would expect convergence to a “Stratonovich model”, but this does not exist because
of an infinite Itô–Stratonovich correction. To wit, assume the approximate bracket
\[
[W, \widehat{W}]^\pi := \sum_{[s,t] \in \pi} W_{s,t} \widehat{W}_{s,t}
\]

converges, say in $L^1$, upon refinement |π| → 0. Then the mean would have to
converge, which is contradicted by the computation, using Itô isometry,
\[
\mathbb{E}\, W_{s,t} \widehat{W}_{s,t} = \int_s^t K(t - r)\, dr = \int_0^{t-s} K(r)\, dr
\sim \int_0^{t-s} K^H(r)\, dr = c_H (t - s)^{H + \frac{1}{2}} ,
\]

and the standing assumption that H < 1/2. As a consequence, the canonical model
$\mathcal{L}(\dot{W}^\varepsilon)$ will not converge as ε → 0, although the previous discussion suggests to
“cure” this by subtracting a diverging term, namely to consider⁴
\[
\int \widehat{W}^\varepsilon\, dW^\varepsilon - \mathbb{E}\Bigl(\int \widehat{W}^\varepsilon\, dW^\varepsilon\Bigr) , \tag{14.23}
\]

with integration understood over [s, t] with re-centred integrand $\widehat{W}_{s,\bullet}$. However, such
Wick renormalisation at the level of generalised increments may destroy the algebraic
Chen relations. (Indeed, they only hold when the expectation is proportional to [s, t],
which has no reason to be the case in general.)

⁴ This is an instance of Wick renormalisation where one replaces the product of two scalar Gaussian
random variables X, Y by X ⋄ Y := XY − E[XY ].
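In fact, our admissible model (Π, Γ ) here can be described in terms of a single
“base-point free” realisation map Π : T → D′ which enjoys somewhat more natural
relations, such as
\[
\Pi \mathcal{I}\Xi = K * \Pi \Xi = K * \dot{W} = K * \xi
\]
instead of (14.22) in the Itô-model case, and similarly for $\Pi^\varepsilon$ with Ẇ replaced by
$\dot{W}^\varepsilon = \xi^\varepsilon$. The full specification reads⁵
\[
\begin{aligned}
\Pi^\varepsilon 1 &= 1 , & \Pi^\varepsilon \Xi &= \xi^\varepsilon , \\
\Pi^\varepsilon(\mathcal{I}(\Xi)^m) &= (K * \xi^\varepsilon)^m , & \Pi^\varepsilon(\Xi\mathcal{I}(\Xi)^m) &= \xi^\varepsilon (K * \xi^\varepsilon)^m .
\end{aligned} \tag{14.24}
\]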

Remark 14.30. Define a character $f_t$ on $T^+$ by specifying (in the Itô model⁶)
\[
f_t(\mathcal{J}(\Xi)) := \int K(t - s) \bigl(\Pi_t \Xi\bigr)(ds) = (K * \xi)_t , \tag{14.25}
\]

and also a linear map $F_t : T \to T$ by $F_t \tau = (\mathrm{Id} \otimes f_t) \Delta^+ \tau$. One checks without
difficulty that $F_t$ is an invertible map, $\Gamma_{ts} = F_t^{-1} \circ F_s$ and
\[
\Pi = \Pi_s F_s^{-1} = \Pi_t F_t^{-1} \implies \Pi_s = \Pi_t F_t^{-1} \circ F_s = \Pi \circ F_s .
\]

At the level of the canonical model $\Pi^\varepsilon$, switching to $\Pi^\varepsilon_t = \Pi^\varepsilon F_t$, this construction
merely replaces $K * \xi^\varepsilon$ with the “base-pointed” expression $(K * \xi^\varepsilon)_{t,\bullet}$ and tracks
the induced changes to the higher levels.


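The Wick renormalisation in (14.23) points us to the (divergent) quantity⁷
\[
\mathbb{E}\bigl(\Pi^\varepsilon(\Xi\mathcal{I}(\Xi))\bigr) = \mathbb{E}\bigl[(K * \delta^\varepsilon * \xi)(t)\, (\delta^\varepsilon * \xi)(t)\bigr]
= \int_{\mathbf{R}} (K * \delta^\varepsilon)(t - s)\, \delta^\varepsilon(t - s)\, ds = (K * \bar{\delta}^\varepsilon)(0) ,
\]

where we recall $\delta^\varepsilon = \varepsilon^{-1} \varrho(\varepsilon^{-1} \bullet)$; and similarly for $\bar{\delta}^\varepsilon$ with $\bar{\varrho} = \varrho(-(\bullet)) * \varrho$. Since
$K(x) = x^{H-1/2} \mathbf{1}_{x>0}$ in a neighbourhood of zero, there is no loss of generality in
assuming that this includes the support of $\bar{\varrho}$. For ε ∈ (0, 1], it follows that⁸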

\[
(K * \bar{\delta}^\varepsilon)(0) = \int_0^\infty K^H(s)\, \frac{1}{\varepsilon}\, \bar{\varrho}\Bigl(\frac{s}{\varepsilon}\Bigr)\, ds = \varepsilon^{H - 1/2} \int_0^\infty K^H(s)\, \bar{\varrho}(s)\, ds .
\]

⁵ One defines $\Pi(\Xi\mathcal{I}(\Xi)^m)$ as the distributional derivative of an Itô integral.
⁶ . . .and similarly in the canonical one, with $(K * \xi)_t$ replaced by $(K * \xi^\varepsilon)_t$ . . .
⁷ Thanks to stationarity, this quantity is independent of t. In particular, one could immediately take
t = 0.
⁸ In the case of H = 1/2, so that $K^H \equiv 1$, noting that $\varrho(\bullet)$, and hence $\bar{\varrho} = \varrho(-(\bullet)) * \varrho$, has unit
mass, the constant equals 1/2, which is the same 1/2 appearing in the Itô–Stratonovich correction.

We can now replace the informal (14.23) by defining a “renormalised” (admissible)
model
\[
\Pi^{\varepsilon;\mathrm{ren}}(\Xi\mathcal{I}(\Xi)) := \Pi^\varepsilon(\Xi\mathcal{I}(\Xi) + c_1^\varepsilon 1) ,
\]
with diverging constant
\[
c_1^\varepsilon = -\mathbb{E}\bigl(\Pi^\varepsilon(\Xi\mathcal{I}(\Xi))\bigr) = -(K * \bar{\delta}^\varepsilon)(0) .
\]
(In essence, we can leave it to the algebra to handle the correct shifting to different
base points, in other words to recover $(\Pi^{\varepsilon;\mathrm{ren}}, \Gamma^{\varepsilon;\mathrm{ren}})$ from knowledge of $\Pi^{\varepsilon;\mathrm{ren}}$,
in the same spirit as Chen’s relation allows one to work out increments $X_{s,t}$ of a given
rough path $t \mapsto X_t$.) On the analytic side, we note that the right-hand side still has
controlled blow-up of order $\deg \Xi\mathcal{I}(\Xi) = (-1/2 + H)^{-} < 0$. This further suggests
that the renormalisation procedure can be described by suitable (linear) maps, say
M : T → T , which are (only) allowed to produce additional terms (of higher
degrees) as, for instance, $M_{c_1} : \Xi\mathcal{I}(\Xi) \mapsto \Xi\mathcal{I}(\Xi) + c_1 1$ in our present example.
At this stage we could proceed “by hand” and try to work out the correct fixes for
all $\Pi^\varepsilon_s(\Xi\mathcal{I}(\Xi)^m)$, m = 1, 2, 3, but care is necessary since “curing” level m = 1, as
done above, will spill over to the higher levels. This is already seen in the instructive
case when m = 0, i.e. for $\Pi_s(\Xi) = \dot{W}$. Indeed, if one “renormalises” $\dot{W} \Longrightarrow \dot{W} + c_0$,
then writing V (t) := t, this leads to⁹
\[
\mathbb{W}^m_{s,t} = \int_s^t (\widehat{W}_{s,r})^m\, dW_r \mapsto \int_s^t (\widehat{W}_{s,r} + c_0 \widehat{V}_{s,r})^m (dW_r + c_0\, dV_r) ,
\]

and hence affects all higher levels (m = 1, 2, . . .). While V̇ = 1 naturally has 1
as associated symbol, $\widehat{V}$ leads to a new symbol, written as $\mathcal{I}1 \equiv \mathcal{I}(1)$,
in agreement with our earlier convention to represent the action of $\mathcal{I}$ as a single
downfacing line:
\[
\Xi(\mathcal{I}\Xi)^m \mapsto (\Xi + c_0 1)(\mathcal{I}\Xi + c_0 \mathcal{I}1)^m .
\]


Provided we manage to define all these “fixes” (for m = 0, 1, 2, 3) consistently,
we can expect a family of linear maps $M = M_c$ indexed by $c = (c_0, c_1, c_2, c_3) \in \mathbf{R}^4$
which furthermore constitutes a group in the sense of (the matrix identity)
$M_c M_{\bar{c}} = M_{c+\bar{c}}$ with $c, \bar{c} \in \mathbf{R}^4$. This is the renormalisation group, here isomorphic
to $(\mathbf{R}^4, +)$. There was a cheat here, in that our initial collection of symbols (with
linear span T ) was not rich enough to define $M_c$ as a linear map from T into itself. In
this sense T was incomplete, and one should work on a space T̃ ⊃ T which contains
the required additional symbols, such as $\mathcal{I}(1)$. (The notion of complete rule put forward in [BHZ19]
formalises this.) However, in the present example this was really a consequence of
the (analytically unnecessary!) level-0 renormalisation. In fact, $c_0 = 0$ is the only
possible choice that respects the symmetry of the noise, in the sense that Ẇ and −Ẇ
have identical law. This reduces the renormalisation group to $(\mathbf{R}^3, +)$ and reflects a
general principle: symmetries help to reduce the dimension of the renormalisation
group. See [BGHZ19] for an example where this principle takes centre stage in a
striking manner.

⁹ This is nothing but a variation of the concept of translation of rough paths.
In general one proceeds as follows. Define $T^-$ as the free commutative algebra
generated by all negative symbols in T ; that is,
\[
T^- := \mathrm{Alg}\bigl(\{\Xi, \Xi\mathcal{I}(\Xi), \Xi\mathcal{I}(\Xi)^2, \Xi\mathcal{I}(\Xi)^3\}\bigr) . \tag{14.26}
\]

(Similarly to before, we colour basis elements of $T^-$ differently to distinguish them
from those of T and / or $T^+$.) Elements in $T^-$ are naturally represented as linear
combinations of (unordered) forests, with 1 denoting the empty forest. As before, it is
useful to introduce a linear map
$\Delta^- : T \to T^- \otimes T$ which iterates over all possible ways of extracting possibly
empty collections of subtrees of negative degree, putting them as a forest on the
left-hand side, and leaving the remaining tree (where all “extracted” subtrees have
now been contracted to a point) on the T -valued right-hand side. For instance,
\[
\Delta^-\bigl(\Xi\mathcal{I}(\Xi)^3\bigr) = 1 \otimes \Xi\mathcal{I}(\Xi)^3 + \ldots + 3\, \Xi\mathcal{I}(\Xi) \otimes \mathcal{I}(\Xi)^2 + \ldots + 3\, \Xi\mathcal{I}(\Xi)^2 \otimes \mathcal{I}(\Xi) + \ldots + \Xi\mathcal{I}(\Xi)^3 \otimes 1 .
\]

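The resulting renormalisation maps M : T → T are then parametrised by characters
on $T^-$, similar to the construction of the structure group. Consider for instance the
case of a character $g = g^\varepsilon$ defined by $g(\Xi\mathcal{I}(\Xi)) = c_1^\varepsilon$, $g(\Xi\mathcal{I}(\Xi)^3) = c_3^\varepsilon$, and set to vanish on
the remaining two generators Ξ and $\Xi\mathcal{I}(\Xi)^2$. Then, the map $M_g$ given by
\[
M_g = (g \otimes \mathrm{Id}) \Delta^-
\]

acts as the identity on all symbols of T other than
\[
M_g\, \Xi\mathcal{I}(\Xi) = \Xi\mathcal{I}(\Xi) + c_1^\varepsilon 1 , \quad
M_g\, \Xi\mathcal{I}(\Xi)^2 = \Xi\mathcal{I}(\Xi)^2 + 2 c_1^\varepsilon\, \mathcal{I}(\Xi) , \quad
M_g\, \Xi\mathcal{I}(\Xi)^3 = \Xi\mathcal{I}(\Xi)^3 + 3 c_1^\varepsilon\, \mathcal{I}(\Xi)^2 + c_3^\varepsilon 1 . \tag{14.27}
\]

The resulting renormalised model $\Pi^{\varepsilon;\mathrm{ren}} \equiv \Pi^\varepsilon M_{g^\varepsilon}$ realises, for instance, the
symbol $\Xi\mathcal{I}(\Xi)^3$ as
\[
\Pi^{\varepsilon;\mathrm{ren}}\, \Xi\mathcal{I}(\Xi)^3 = \Pi^\varepsilon M_g\, \Xi\mathcal{I}(\Xi)^3 = \xi^\varepsilon (K * \xi^\varepsilon)^3 + 3 c_1^\varepsilon (K * \xi^\varepsilon)^2 + c_3^\varepsilon .
\]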

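It is a non-trivial but nevertheless fairly general fact that it is possible to choose
the character $g^\varepsilon$ in such a way that the model $\Pi^{\varepsilon;\mathrm{ren}}$ converges to a limiting model.
This is the case if we choose $g^\varepsilon$ as the BPHZ character (see [BHZ19, Thm 6.18])
associated to $\Pi^\varepsilon$. This is defined in general as the unique character $g^\varepsilon$ of $T^-$ such
that the renormalised model $\Pi^{\varepsilon;\mathrm{ren}}$ satisfies $\mathbb{E}\, \Pi^{\varepsilon;\mathrm{ren}} \tau = 0$ for every symbol τ of
strictly negative degree. With our earlier choice
\[
c_1^\varepsilon = -\mathbb{E}\bigl(\Pi^\varepsilon\, \Xi\mathcal{I}(\Xi)\bigr)(0) = -(K * \bar{\delta}^\varepsilon)(0)
\]

it is immediate from (14.27) that one has indeed $\mathbb{E}(\Pi^\varepsilon M_{g^\varepsilon}\, \Xi\mathcal{I}(\Xi)) = 0$. Furthermore,
since first and third moments of centred Gaussians vanish, we also have
$\mathbb{E}(\Pi^\varepsilon M_{g^\varepsilon}\, \Xi) = \mathbb{E}(\Pi^\varepsilon M_{g^\varepsilon}\, \Xi\mathcal{I}(\Xi)^2) = 0$ as a consequence of the fact that we set
$g(\Xi) = g(\Xi\mathcal{I}(\Xi)^2) = 0$. Finally, it follows from Wick’s formula that
\[
\begin{aligned}
\mathbb{E}\, \Pi^\varepsilon M_{g^\varepsilon}\, \Xi\mathcal{I}(\Xi)^3 &= \mathbb{E}[\xi^\varepsilon (K * \xi^\varepsilon)^3] + 3 c_1^\varepsilon\, \mathbb{E}(K * \xi^\varepsilon)^2 + c_3^\varepsilon \\
&= 3 \bigl(\mathbb{E}[\xi^\varepsilon (K * \xi^\varepsilon)] + c_1^\varepsilon\bigr)\, \mathbb{E}(K * \xi^\varepsilon)^2 + c_3^\varepsilon \\
&= c_3^\varepsilon ,
\end{aligned}
\]

so that $\Pi^\varepsilon M_{g^\varepsilon}\, \Xi\mathcal{I}(\Xi)^3$ has vanishing mean if and only if we also choose $c_3^\varepsilon = 0$.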
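The Wick-formula step can be made concrete with a short Python sketch. For centred jointly Gaussian variables Y (standing in for $\xi^\varepsilon$ at a point) and X (standing in for $K * \xi^\varepsilon$ at the same point), it sums over all pairings. The function names and the rational covariance values used below are illustrative assumptions only.

```python
from fractions import Fraction


def pairings(items):
    """Yield all perfect matchings of the list `items`."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def gaussian_moment(labels, cov):
    """E[product of centred jointly Gaussian variables] via Wick's formula.

    `labels` is a string such as "YXXX"; `cov` maps frozenset({a, b}) to the
    covariance E[ab] (and frozenset({a}) to the variance of a)."""
    if len(labels) % 2:
        return Fraction(0)  # odd moments of centred Gaussians vanish
    total = Fraction(0)
    for matching in pairings(list(labels)):
        term = Fraction(1)
        for a, b in matching:
            term *= cov[frozenset((a, b))]
        total += term
    return total


def renormalised_mean(cov, c3=Fraction(0)):
    """Mean of Y X^3 + 3 c1 X^2 + c3 with the choice c1 = -E[Y X]."""
    c1 = -gaussian_moment("YX", cov)
    return gaussian_moment("YXXX", cov) + 3 * c1 * gaussian_moment("XX", cov) + c3
```

With E[YX] = a and E[X²] = b, `gaussian_moment("YXXX", cov)` returns 3ab, so the renormalised mean reduces to c3, in line with the computation above.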
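We have made it plausible that
\[
\mathcal{M}^{\varepsilon;\mathrm{ren}} := (\Pi^{\varepsilon;\mathrm{ren}}, \Gamma^{\varepsilon;\mathrm{ren}}) \leftrightarrow \Pi^{\varepsilon;\mathrm{ren}}
\]

indeed gives rise to an (admissible) model, with all analytic bounds and algebraic
constraints intact, and such that in the sense of model convergence,
\[
\mathcal{M}^{\varepsilon;\mathrm{ren}} \to \mathcal{M}^{\mathrm{BPHZ}} = \mathcal{M}^{\text{Itô}} . \tag{14.28}
\]

The main result of [CH16] is that the convergence $\mathcal{M}^{\varepsilon;\mathrm{ren}} \to \mathcal{M}^{\mathrm{BPHZ}}$ remains true in
vastly greater generality and that the limiting model is independent of the specific
choice of $\mathcal{M}^\varepsilon$ for a large class of stationary approximations $\xi^\varepsilon$ to the noise ξ.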
At last, we leave it to the reader to adapt the material of Section 13.3.2 to define
the modelled distribution that allows one to reconstruct the Itô integral $\int_0^t f(\widehat{W}_s)\, dW_s$
and further deduce from (14.28) the following (renormalised) Wong–Zakai result,
\[
\int_0^t f(\widehat{W}^\varepsilon_s)\, dW^\varepsilon_s + c_1^\varepsilon \int_0^t f'(\widehat{W}^\varepsilon_s)\, ds \to \int_0^t f(\widehat{W}_s)\, dW_s , \tag{14.29}
\]
where we recall that $c_1^\varepsilon = -\varepsilon^{H-1/2} \int_0^\infty K^H(s)\, \bar{\varrho}(s)\, ds$. Noting that $\bar{\varrho} = \varrho(-(\bullet)) * \varrho$
is even and has unit mass, we see that $c_1^\varepsilon = -\tfrac{1}{2}$ when H = 1/2. We can then pass to
the limit for each term on the left-hand side of (14.29) separately. This allows us to
recover the identity
\[
\int_0^t f(W_s) \circ dW_s - \frac{1}{2} \int_0^t f'(W_s)\, ds = \int_0^t f(W_s)\, dW_s ,
\]

in agreement with the usual Itô–Stratonovich correction familiar from stochastic
calculus.
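The size of the integral $\varepsilon^{H-1/2} \int_0^\infty K^H(s)\, \bar{\varrho}(s)\, ds$ can be explored numerically. The Python sketch below makes a simplifying assumption that is not in the text: it takes ϱ to be the indicator function of [−1/2, 1/2] (unit mass, but not smooth), so that $\bar{\varrho}(x) = \max(0, 1 - |x|)$ is a tent function and the integral has the closed form $\varepsilon^{H-1/2} / ((H + \tfrac12)(H + \tfrac32))$. The function names are ours.

```python
def wick_constant(eps, H, n=20000):
    """Midpoint-rule value of int_0^inf K^H(s) * (1/eps) * rho_bar(s/eps) ds
    for K^H(s) = s**(H - 1/2) and the tent function rho_bar(x) = max(0, 1 - |x|).

    The substitution u = s**(H + 1/2) removes the integrable singularity at 0."""
    a = H + 0.5
    upper = eps ** a              # image of s = eps under u = s**a
    h = upper / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        s = u ** (1.0 / a)
        total += (1.0 - s / eps) / eps   # (1/eps) * rho_bar(s/eps) on [0, eps]
    return total * h / a


def closed_form(eps, H):
    """Closed form of the same integral for the tent function."""
    a = H + 0.5
    return eps ** (H - 0.5) / (a * (a + 1.0))
```

For H = 1/2 the value is 1/2 for every ε, matching the Itô–Stratonovich correction, while for H < 1/2 it grows like $\varepsilon^{H-1/2}$ as ε → 0.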
14.6 Exercises

Exercise 14.1 a) Construct an example of a regularity structure with trivial group


G, as well as a model and modelled distributions fi such that both Rf1 and
Rf2 are continuous functions but the identity

R(f1 ⋆ f2 )(x) = (Rf1 )(x) (Rf2 )(x)

fails.
b) Transfer Exercise 2.10 to the present context.

Solution. (We only address the first part.) Consider for instance the regularity structure
given by A = (−2κ, −κ, 0) for fixed κ > 0 with each $T_\alpha$ being a copy of R
given by $T_{-n\kappa} = \langle \Xi^n \rangle$. We furthermore take for G the trivial group. This regularity
structure comes with an obvious product by setting $\Xi^m \star \Xi^n = \Xi^{m+n}$ provided
that m + n ≤ 2.
Then, we could for example take as a model for T = (T, G):
\[
\bigl(\Pi_x \Xi^0\bigr)(y) = 1 , \quad \bigl(\Pi_x \Xi\bigr)(y) = 0 , \quad \bigl(\Pi_x \Xi^2\bigr)(y) = c , \tag{14.30}
\]

where c is an arbitrary constant. Let furthermore
\[
\mathbf{f}_1(x) = f_1(x) \Xi^0 + \tilde{f}_1(x) \Xi , \qquad \mathbf{f}_2(x) = f_2(x) \Xi^0 + \tilde{f}_2(x) \Xi .
\]

Since our group G is trivial, one has $\mathbf{f}_i \in \mathcal{D}^\gamma$ provided that each of the $f_i$ belongs to
$C^\gamma$ and each of the $\tilde{f}_i$ belongs to $C^{\gamma+\kappa}$. (And one has γ + κ < 1.) One furthermore
has the identity $\mathcal{R}\mathbf{f}_i(x) = f_i(x)$.
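However, the pointwise product is given by
\[
(\mathbf{f}_1 \star \mathbf{f}_2)(x) = f_1(x) f_2(x) \Xi^0 + \bigl(\tilde{f}_1(x) f_2(x) + \tilde{f}_2(x) f_1(x)\bigr) \Xi + \tilde{f}_1(x) \tilde{f}_2(x) \Xi^2 ,
\]

which by Theorem 14.5 belongs to $\mathcal{D}^{\gamma-\kappa}$. Provided that γ > κ, one can then apply
the reconstruction operator to this product and one obtains
\[
\mathcal{R}\bigl(\mathbf{f}_1 \star \mathbf{f}_2\bigr)(x) = f_1(x) f_2(x) + c \tilde{f}_1(x) \tilde{f}_2(x) ,
\]

which is obviously quite different from the pointwise product $(\mathcal{R}\mathbf{f}_1)(x) \cdot (\mathcal{R}\mathbf{f}_2)(x)$.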
How should this be interpreted? For n > 0, we could have defined a model $\Pi^{(n)}$
by
\[
\bigl(\Pi^{(n)}_x \Xi^0\bigr)(y) = 1 , \quad \bigl(\Pi^{(n)}_x \Xi\bigr)(y) = \sqrt{2c} \sin(ny) , \quad \bigl(\Pi^{(n)}_x \Xi^2\bigr)(y) = 2c \sin^2(ny) .
\]

Denoting by $\mathcal{R}^{(n)}$ the corresponding reconstruction operator, we have the identity
\[
\bigl(\mathcal{R}^{(n)} \mathbf{f}_i\bigr)(x) = f_i(x) + \sqrt{2c}\, \tilde{f}_i(x) \sin(nx) ,
\]

as well as $\mathcal{R}^{(n)}(\mathbf{f}_1 \star \mathbf{f}_2) = \mathcal{R}^{(n)} \mathbf{f}_1 \cdot \mathcal{R}^{(n)} \mathbf{f}_2$. As a model, the model $\Pi^{(n)}$ actually
converges to the limiting model Π defined in (14.30). As a consequence of the
continuity of the reconstruction operator, this implies that
\[
\mathcal{R}^{(n)} \mathbf{f}_1 \cdot \mathcal{R}^{(n)} \mathbf{f}_2 = \mathcal{R}^{(n)}(\mathbf{f}_1 \star \mathbf{f}_2) \to \mathcal{R}(\mathbf{f}_1 \star \mathbf{f}_2) \neq \mathcal{R}\mathbf{f}_1 \cdot \mathcal{R}\mathbf{f}_2 ,
\]

which is of course also easy to see “by hand”. This shows that in some cases, the
“non-canonical” models as in (14.30) can be interpreted as limits of “canonical”
models for which the usual rules of calculus hold. Even this is however not always
the case (think of the Itô Brownian rough path).
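The mechanism behind this example is that $\sqrt{2c} \sin(ny)$ averages to 0 while its square $2c \sin^2(ny)$ averages to c. A minimal Python sketch of this averaging over one period follows, assuming c ≥ 0; the helper names are ours, and the period average is only a stand-in for testing against a smooth test function.

```python
import math


def average(f, a, b, n_points=4096):
    """Midpoint-rule average of f over the interval [a, b]."""
    h = (b - a) / n_points
    return sum(f(a + (i + 0.5) * h) for i in range(n_points)) * h / (b - a)


def model_xi(n, c):
    """Oscillating realisation of the symbol Xi: y -> sqrt(2c) * sin(n y)."""
    return lambda y: math.sqrt(2 * c) * math.sin(n * y)


def model_xi_squared(n, c):
    """Its pointwise square, the realisation of Xi^2: y -> 2c * sin(n y)**2."""
    return lambda y: 2 * c * math.sin(n * y) ** 2
```

For instance, with c = 0.7 and n = 50, the first average over [0, 2π] is numerically zero while the second equals 0.7, which is the constant c that survives in the limiting model (14.30).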

Exercise 14.2 Consider $\mathcal{Z}^\alpha = \mathcal{Z}^\alpha(\mathbf{R}^d)$.
a) Show that distributional derivatives satisfy $D^k \mathcal{Z}^\alpha \subset \mathcal{Z}^{\alpha - |k|}$ for any multi-index
k. Show that for d = 1 equality holds. That is, any $g \in \mathcal{Z}^{\alpha - k}$, with k ∈ N, is
the kth distributional derivative of some $f \in \mathcal{Z}^\alpha$.
b) The proof of Schauder’s theorem in Section 14.3 was more involved in the
“positive” case, when 0 ≤ α + β ∈ [n − 1, n), some n ∈ N. Give an easier proof
in the case d = 1 by reducing the positive to the negative case.
∗∗ Exercise 14.3 Provide a proof of the case α = 0 in Lemma 14.13.

Solution. As in Lemma 14.13, we aim to bound |ζ(φ)| for $\varphi \in \mathcal{B}^\lambda_{r_o,x}$ and $\zeta \in \mathcal{Z}^\alpha_r$
for some $r \ge r_o$. One strategy is to consider a compactly supported wavelet basis of
regularity r and to separately bound the terms in the wavelet expansion of φ.
If we wish to rely purely on elementary arguments, one strategy goes as follows.
a) Show first that $\zeta \in \mathcal{Z}^\alpha_r$ if and only if $\zeta\chi \in \mathcal{Z}^\alpha_r$ for every smooth compactly
supported function χ. This allows us to reduce ourselves to the case when ζ
itself is compactly supported and we assume this from now on.
b) Show that if $\zeta \in \mathcal{Z}^0_r$ is supported in a ball of radius 1 and if ψ is such that
$\int \psi(x)\, dx = 0$ and such that $|D^k \psi(x)| \le (1 + |x|)^{-\beta - |k|}$ for |k| ≤ r and some
large enough exponent β, then $|\zeta(\psi^\lambda_x)| \lesssim 1$, uniformly over such ψ and over
$x \in \mathbf{R}^d$ and λ ∈ (0, 1].
c) Choose a function ψ with the property that its Fourier transform is smooth,
identically 1 in the ball of radius 1, and identically 0 outside of the ball of radius
2 and define ψ̃ as in the proof of Lemma 14.13. Write
\[
\varphi = \varphi * \psi^\lambda + \sum_{n \ge 0} \varphi * \tilde{\psi}^{\lambda_n}
\]
as in the proof of Lemma 14.13.
d) Choose χ such that its Fourier transform is smooth, identically equal to 1 on
the annulus of radii in [1, 4] and vanishes outside the annulus of radii in [1/2, 5].
Note that this implies that $\tilde{\psi}^{\lambda_n} = \tilde{\psi}^{\lambda_n} * \chi^{\lambda_n}$ and conclude that
\[
\zeta(\varphi * \tilde{\psi}^{\lambda_n}) = \langle \zeta * \tilde{\psi}^{\lambda_n}, \varphi * \chi^{\lambda_n} \rangle .
\]
e) Use the fact that $\varphi \in C^1$ and χ integrates to 0 to conclude that $|\varphi * \chi^{\lambda_n}| \lesssim 2^{-n} \lambda^{-d}$
and therefore that $|\zeta(\varphi * \tilde{\psi}^{\lambda_n})| \lesssim 2^{-n}$, which is summable as required.
∗ Exercise 14.4 Show that, for g smooth enough, one has $K * (g\eta) - g(K * \eta) \in C^{\alpha+\beta+1}$
for every β-regularising kernel K and $\eta \in C^\alpha$ with α < 0. How smooth is
smooth enough? Compare the following two strategies.
Strategy 1: Go through the proof of the Schauder estimate in Section 14.3 and
estimate the difference $\langle K_n * (g\eta) - g(K_n * \eta), \psi^\lambda \rangle$.
Strategy 2: Consider the regularity structure T spanned by the Taylor polynomials
and an additional symbol Ξ of degree α, with the structure group acting trivially on
Ξ. We extend this by adding an integration operator of order β and all products with
Taylor polynomials. We also consider on it the natural model mapping Ξ to η. Writing
$g \in \mathcal{D}^\gamma$ for the Taylor lift of g as in Proposition 13.16, verify that $g\Xi \in \mathcal{D}^{\gamma+\alpha}$. The
multilevel Schauder estimate then shows that, provided that γ + α > 0, one has
$\mathcal{K}(g\Xi) \in \mathcal{D}^{\gamma+\alpha+\beta}$ and $g\mathcal{K}(\Xi) \in \mathcal{D}^{\gamma+\min\{0,\alpha+\beta\}}$, so in particular
\[
F := \mathcal{K}(g\Xi) - g\mathcal{K}(\Xi) \in \mathcal{D}^{1+\alpha+\beta} ,
\]

provided that γ > max{1, −α, 1 + α + β}. Furthermore, the explicit expression for
$\mathcal{K}$ shows that
\[
\mathcal{K}(g\Xi) = g\mathcal{I}(\Xi) + g' \mathcal{I}(X\Xi) + (\ldots) , \qquad g\mathcal{K}(\Xi) = g\mathcal{I}(\Xi) + (\ldots) ,
\]

where (. . .) denotes terms that either belong to the polynomial part of the regularity
structure or are of degree strictly greater than α + β + 1 (which is the degree of
$\mathcal{I}(X\Xi)$). In particular, the truncation of F at level α + β + 1 belongs to $\mathcal{D}_P^{\alpha+\beta+1}$,
and we conclude by the second part of Proposition 13.16.
Exercise 14.5 Consider space-time $\mathbf{R}^d$ with one temporal and (d − 1) spatial dimensions,
under the parabolic scaling (2, 1, . . . , 1), as introduced in Remark 13.9.
Denote by G the heat kernel (i.e. the Green’s function of the operator $\partial_t - \partial_x^2$). Show
that one has the decomposition
\[
G = K + \hat{K} ,
\]

where the kernel K satisfies all of the assumptions of Section 14.4 (with β = 2) and
the remainder K̂ is smooth and bounded.
Exercise 14.6 (From [Bru18]) In the context of Remark 14.25, establish the recursion
\[
\Gamma_{xy} \mathcal{I}\tau = \mathcal{I}(\Gamma_{xy} \tau) - \Gamma_{xy} \mathcal{J}_{xy} \tau , \tag{14.31}
\]
with
\[
\mathcal{J}_{xy} \tau := \sum_{|k| < \deg \tau + \beta} \frac{X^k}{k!}\, \Pi_x\bigl(\mathcal{I}_k(\Gamma_{xy} \tau)\bigr)(y) .
\]

Exercise 14.7 Show that if one defines $\Gamma_{xy} \mathcal{I}\tau$ in such a way that (14.10) holds, then
it guarantees that $\Pi_x \Gamma_{xy} \mathcal{I}\tau = \Pi_y \mathcal{I}\tau$.

Exercise 14.8 Adapt the material in Section 14.5 and construct a suitable regularity
structure and model so that the two-dimensional Itô integral (14.13) is obtained as
reconstruction of a suitable modelled distribution.

14.7 Comments

The material on differentiation, products and admissible models follows essentially
[Hai14b], although the conditions on the kernel K – previously assumed to annihilate
certain polynomials – are now more flexible. In particular, we do not enforce the
identity $\mathcal{I}(X^k) = 0$ and instead allow for the possibility of simply including symbols
$\mathcal{I}(X^k)$ as basis vectors of our regularity structure. It is the case that any admissible
model will necessarily satisfy $\Pi_x \mathcal{I}(X^k) = 0$, but in general $\Gamma_{xy} \mathcal{I}(X^k) \neq 0$. The
material of Section 14.5 is essentially taken from [BFG+19], with a viewpoint similar
to [BCFP19].
Chapter 15
Application to the KPZ equation

We show how the theory of regularity structures can be used to build a robust
solution theory for the KPZ equation. We also give a very short survey of the original
approach to the same problem using controlled rough paths and we discuss how the
two approaches are linked.

15.1 Formulation of the main result

Let us now briefly explain how the theory of regularity structures can be used to
make sense of solutions to very singular semilinear stochastic PDEs. We will keep
the discussion in this chapter at a very informal level without attempting to make
mathematically precise statements. The interested reader may find more details in
[Hai13, Hai14b].
For definiteness, we focus on the case of the KPZ equation [KPZ86], which is
formally given by
\[
\partial_t h = \partial_x^2 h + (\partial_x h)^2 + \xi - C , \tag{15.1}
\]
where ξ denotes space-time white noise, the spatial variable takes values in the
one-dimensional torus T, i.e. in the interval [0, 2π] endowed with periodic boundary
conditions, and C is a fixed constant. The problem with such an equation is that even
the solution to the linear part of the equation, namely
\[
\partial_t \Psi = \partial_x^2 \Psi + \xi ,
\]
is not differentiable as a function of the spatial variable. As a matter of fact, as already
noted in Section 12.3, for any fixed time t, Ψ has the regularity of Brownian motion
as a function of the spatial variable x. As a consequence, the only way of possibly
giving meaning to (15.1) is to “renormalise” the equation by subtracting from its
right-hand side an “infinite constant”, which counteracts the divergence of the term
$(\partial_x h)^2$.


This has usually been interpreted in the following way. Assuming for a moment
that ξ is a smooth function, a simple consequence of the change of variables formula
shows that if we define h = log Z, then Z satisfies the PDE

∂t Z = ∂x2 Z + Z ξ .

The only ill-posed product appearing in this equation now is the product of the
solution Z with white noise ξ. As long as Z takes values in L2 , this product can
be given a meaning as a classical Itô integral, so that the equation for Z can be
interpreted as the Itô equation

$$dZ = \partial_x^2 Z\,dt + Z\,dW , \tag{15.2}$$

where W is an $L^2$-cylindrical Wiener process. It is well known [DPZ92] that this
equation has a unique (mild) solution and we can then go backwards and define the
solution to the KPZ equation as h = log Z. The expert reader will have noticed that
this argument appears to be flawed: since (15.2) is interpreted as an Itô equation,
we should really use Itô’s formula to find out what equation h satisfies. If one does
this a bit more carefully, one notices that the Itô correction term appearing in this
way is indeed an infinite constant! This is the case in the following sense. If $W_\varepsilon$
is a Wiener process with spatial covariance given by $x \mapsto \varepsilon^{-1}\varrho(\varepsilon^{-1}x)$ for some
smooth compactly supported function ϱ integrating to 1 and $Z_\varepsilon$ solves (15.2) with
W replaced by $W_\varepsilon$, then $h_\varepsilon = \log Z_\varepsilon$ solves

$$dh = \partial_x^2 h\,dt + (\partial_x h)^2\,dt + dW_\varepsilon - \varepsilon^{-1} C_\varrho\,dt , \tag{15.3}$$

for some constant $C_\varrho$ depending on ϱ. Since $Z_\varepsilon$ converges to a strictly positive limit
Z, this shows that the sequence of functions $h_\varepsilon$ solving (15.3) converges to a limit
h. This limit is called the Hopf–Cole solution to the KPZ equation [Hop50, Col51,
BG97, Qua11].
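
The two computations behind the Hopf–Cole transform can be spelled out in a few lines. What follows is only a formal sketch. The first part assumes that ξ is smooth and that Z is strictly positive. The second part assumes that, at a fixed point x, the quadratic variation of $W_\varepsilon(\cdot,x)$ grows like $\varepsilon^{-1}\varrho(0)\,t$, which is one reading of "spatial covariance $x \mapsto \varepsilon^{-1}\varrho(\varepsilon^{-1}x)$"; under that reading one would find $C_\varrho = \varrho(0)/2$, a value that the text itself does not state.

```latex
% Smooth case: h = \log Z, where \partial_t Z = \partial_x^2 Z + Z\xi and Z > 0.
\begin{align*}
\partial_x h &= \frac{\partial_x Z}{Z}, \qquad
\partial_x^2 h = \frac{\partial_x^2 Z}{Z} - \Bigl(\frac{\partial_x Z}{Z}\Bigr)^2
              = \frac{\partial_x^2 Z}{Z} - (\partial_x h)^2 , \\
\partial_t h &= \frac{\partial_t Z}{Z} = \frac{\partial_x^2 Z}{Z} + \xi
              = \partial_x^2 h + (\partial_x h)^2 + \xi .
\end{align*}

% Ito case (formal): dZ_\varepsilon = \partial_x^2 Z_\varepsilon\,dt + Z_\varepsilon\,dW_\varepsilon.
% Assumption: d\langle W_\varepsilon(\cdot,x)\rangle_t = \varepsilon^{-1}\varrho(0)\,dt.
\begin{align*}
d\log Z_\varepsilon
  &= \frac{dZ_\varepsilon}{Z_\varepsilon}
     - \frac{1}{2 Z_\varepsilon^2}\, d\langle Z_\varepsilon\rangle
   = \frac{\partial_x^2 Z_\varepsilon}{Z_\varepsilon}\,dt + dW_\varepsilon
     - \tfrac12\,\varepsilon^{-1}\varrho(0)\,dt \\
  &= \partial_x^2 h_\varepsilon\,dt + (\partial_x h_\varepsilon)^2\,dt + dW_\varepsilon
     - \varepsilon^{-1}\,\tfrac{\varrho(0)}{2}\,dt .
\end{align*}
```

The extra term is of order $\varepsilon^{-1}$, which is the sense in which the Itô correction is an "infinite constant" in the limit.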
This notion of solution is of course not very satisfactory since it relies on a nonlinear
transformation and provides no direct interpretation of the term $(\partial_x h)^2$ appearing
in the right-hand side of (15.1). Furthermore, many natural growth models lead to
equations that structurally “look like” (15.1), rather than (15.2). Since perturbations
are usually rather badly behaved under exponentiation and since there is no really
good approximation theory for (15.2) either (for example it had been an open problem
for some time whether space-time regularisations of the noise lead to the same notion
of solution), one would like to have a robust solution theory for (15.1) directly.
Such a robust solution theory is precisely what the theory of regularity structures
provides. More precisely, it provides spaces $\mathscr{M}$ (a suitable space of “admissible
models”) and $\mathcal{D}^\gamma$, maps $\mathcal{S}_a$ (an abstract “solution map”), $\mathcal{R}$ (the reconstruction
operator) and $\mathscr{L}$ (a “canonical lift map”), as well as a finite-dimensional group $\mathfrak{R}$
acting both on $\mathbb{R}$ and $\mathscr{M}$ such that the following diagram commutes:

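$$
\begin{array}{ccc}
\mathcal{F} \times \mathscr{M} \times \mathcal{C}^\alpha & \xrightarrow{\ \mathcal{S}_a\ } & \mathcal{D}^\gamma \\[2pt]
\Big\uparrow{\scriptstyle \mathscr{L}} & & \Big\downarrow{\scriptstyle \mathcal{R}} \\[2pt]
\mathcal{F} \times \mathcal{C} \times \mathcal{C}^\alpha & \xrightarrow{\ \mathcal{S}_c\ } & \mathcal{C}([0,T], \mathcal{C}^\alpha)
\end{array}
\tag{15.4}
$$
(In the diagram, the group $\mathfrak{R}$ acts on the factors $\mathcal{F}$ and $\mathscr{M}$ of the top-left space, the lift $\mathscr{L}$ acts on the noise component only, a generic element of the bottom-left space is written $(C, \xi, h_0)$ with $C \in \mathcal{F}$, and its image under $\mathcal{S}_c$ is $h$.)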

Here, Sc denotes the classical solution map Sc (C, ξ, h0 ) which provides the solution
(up to some fixed final time T ) to the equation

$$\partial_t h = \partial_x^2 h + (\partial_x h)^2 + \xi - C , \qquad h(0, x) = h_0(x) , \tag{15.5}$$

for regular instances of the noise ξ. The space F of “formal right-hand sides” is in
this case just a copy of R which holds the value of the constant C appearing in (15.5).
The diagram commutes in the sense that if $M \in \mathfrak{R}$, then

$$\mathcal{S}_c(M(C), \xi, h_0) = \mathcal{R}\,\mathcal{S}_a(C, M(\mathscr{L}(\xi)), h_0) ,$$

where we identify M with its respective actions on $\mathbb{R}$ and $\mathscr{M}$. A full justification of
these considerations for a very large class of systems of SPDEs is beyond the scope
of this text. The construction of R in full generality and its action on the space of
admissible models was obtained in [BHZ19]. Its adjoint action on a suitable space of
equations F as well as the commutativity of the above diagram were then obtained
in [BCCH17]. Important additional features of this picture are the following:
• If ξε denotes a “natural” regularisation of space-time white noise, then there ex-
ists a sequence Mε of elements in R such that Mε L (ξε ) converges to a limiting
random element (Π, Γ ) ∈ M . This element can also be characterised directly
without resorting to specific approximation procedures and RSa (0, (Π, Γ ), h0 )
coincides almost surely with the Hopf–Cole solution to the KPZ equation. The
fact that an analogous statement “always” holds for subcritical equations was
shown in the work [CH16].
• The maps Sa and R are both continuous, unlike the map Sc which is discon-
tinuous in its second argument for any topology for which ξε converges to
ξ.
• As an abstract group, the “renormalisation group” $\mathfrak{R}$ is simply equal to $(\mathbb{R}^3, +)$.
However, it is possible to extend the picture to deal with much larger classes of
approximations, which has the effect of increasing both R and the space F of
possible right-hand sides. See for example [HQ18] for a proof of convergence to
KPZ for a much larger class of interface growth models.
Remark 15.1. An important condition for the convergence result in [CH16] to hold is
that T does not contain any symbol τ with $\deg \tau \le -\tfrac{d}{2}$ and such that τ contains more
than one noise as a subsymbol. This in particular explains why fractional Brownian
motion $B^H$ with Hurst parameter H can only be lifted to a rough path when $H > \tfrac14$
even though SDEs driven by fractional Brownian motion are “subcritical” for every
H > 0. Indeed, for $H = \tfrac14$, the natural degree of the symbol $\dot{\mathbb{W}}$ of Section 13.2.2
(which would be represented by a tree in the graphical notation used earlier and contains
two instances of the noise) would be $(2H - 1)^- = -\tfrac12^{-} < -\tfrac{d}{2}$.

An example of a statement that can be proved from these considerations (see
[Hai13, Hai14b, HQ18]) is the following.
Theorem 15.2. Consider the sequence of equations

$$\partial_t h_\varepsilon = \partial_x^2 h_\varepsilon + (\partial_x h_\varepsilon)^2 + \xi_\varepsilon - C_\varepsilon , \tag{15.6}$$

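where $\xi_\varepsilon = \delta_\varepsilon * \xi$ with $\delta_\varepsilon(t, x) = \varepsilon^{-3}\varrho(\varepsilon^{-2}t, \varepsilon^{-1}x)$, for some smooth compactly
supported function ϱ with $\int \varrho = 1$, and ξ denotes space-time white noise. Then, there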
exists a (diverging) choice of constants Cε such that the sequence hε converges in
probability to a limiting process h.
Furthermore, one can ensure that the limiting process h does not depend on the
choice of mollifier ϱ and that it coincides with the Hopf–Cole solution to the KPZ
equation.

Remark 15.3. It is important to note that although the limiting process is independent
of the choice of mollifier ϱ, the constant Cε does very much depend on this choice,
as we already alluded to earlier.

Remark 15.4. Regarding the initial condition, one can take h0 ∈ C β for any fixed
β > 0. Unfortunately, this result does not cover the case of “infinite wedge” initial
conditions, see for example [Cor12].

The aim of this section is to sketch how the theory of regularity structures can be
used to obtain this kind of convergence results and how (15.4) is constructed. First of
all, we note that while our solution h will be a Hölder continuous space-time function
(or rather an element of $\mathcal{D}^\gamma$ for some regularity structure with a model over $\mathbb{R}^2$), the
“time” direction has a different scaling behaviour from the “space” direction.
As a consequence, it turns out to be effective to slightly change our definition of
“localised test functions” by setting

$$\varphi^\lambda_{(s,x)}(t, y) = \lambda^{-3}\,\varphi\bigl(\lambda^{-2}(t - s), \lambda^{-1}(y - x)\bigr) .$$




Accordingly, the “effective dimension” of our space-time is actually 3, rather than 2.
The theory presented in Chapter 13 extends mutatis mutandis to this setting. (Note
however that when considering the degree of a regular monomial, powers of the time
variable should now be counted double.) Note also that with this way of measuring
regularity, space-time white noise belongs to $\mathcal{C}^{-\alpha}$ for every $\alpha > \tfrac32$. This is because
of the bound
$$\bigl(\mathbf{E}\langle \xi, \varphi^\lambda_x\rangle^2\bigr)^{1/2} = \|\varphi^\lambda_x\|_{L^2} \approx \lambda^{-\frac32} ,$$
combined with an argument somewhat similar to the proof of Kolmogorov’s continuity
lemma.
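
The exponent 3/2 in this bound is just a change of variables under the parabolic scaling. The following sketch assumes that ξ is treated as white noise on $\mathbb{R}\times\mathbb{R}$ (periodicity plays no role for small λ) and that $\varphi^\lambda_{(s,x)}(t,y) = \lambda^{-3}\varphi(\lambda^{-2}(t-s), \lambda^{-1}(y-x))$; the substitution variables $t'$, $y'$ are introduced here for illustration only.

```latex
% White noise isometry: E<xi, psi>^2 = ||psi||_{L^2}^2 for deterministic test functions psi.
% Substitute t' = \lambda^{-2}(t-s), y' = \lambda^{-1}(y-x), so that dt\,dy = \lambda^{3}\,dt'\,dy'.
\begin{align*}
\mathbf{E}\langle \xi, \varphi^\lambda_{(s,x)}\rangle^2
  = \|\varphi^\lambda_{(s,x)}\|_{L^2}^2
 &= \lambda^{-6}\iint \varphi\bigl(\lambda^{-2}(t-s), \lambda^{-1}(y-x)\bigr)^2\,dt\,dy \\
 &= \lambda^{-6}\cdot\lambda^{3}\iint \varphi(t', y')^2\,dt'\,dy'
  = \lambda^{-3}\,\|\varphi\|_{L^2}^2 ,
\end{align*}
% hence
\[
\bigl(\mathbf{E}\langle \xi, \varphi^\lambda_{(s,x)}\rangle^2\bigr)^{1/2}
  = \lambda^{-3/2}\,\|\varphi\|_{L^2} .
\]
```

The power 3 in the Jacobian is the "effective dimension" mentioned above: two units for time and one for space.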

15.2 Construction of the associated regularity structure

Our first step is to build a regularity structure that is sufficiently large to allow us to
reformulate (15.1) as a fixed point problem in $\mathcal{D}^\gamma$ for some γ > 0. Denoting by G the heat
kernel (i.e. the Green’s function of the operator $\partial_t - \partial_x^2$), we can rewrite the solution
to (15.1) with initial condition $h_0$ as

$$h = G * \bigl((\partial_x h)^2 + \xi\bigr) + G h_0 , \tag{15.7}$$

where ∗ denotes space-time convolution and where we denote by $Gh_0$ the harmonic
extension of $h_0$. (That is, the solution to the heat equation with initial condition $h_0$.)

Remark 15.5. We view (15.7) as an equation on the whole space by considering its
periodic extension.

In order to have a chance of fitting this into the framework described above, we
first decompose the heat kernel G as in Exercise 14.5 as

G = K + K̂ ,

where the kernel K satisfies all of the assumptions of Section 14.4 (with β = 2) and
the remainder K̂ is smooth. If we consider any regularity structure containing the
usual Taylor polynomials and equipped with an admissible model, it is straightforward
to associate to K̂ an operator $\hat{\mathcal{K}} : \mathcal{D}^\gamma \to \mathcal{D}^\infty$ via
$$\bigl(\hat{\mathcal{K}} f\bigr)(z) = \sum_k \frac{X^k}{k!}\,\bigl(D^k \hat K * \mathcal{R} f\bigr)(z) ,$$

where z denotes a space-time point and k runs over all possible 2-dimensional
multiindices. Similarly, the harmonic extension of h0 can be lifted to an element
in D ∞ which we denote again by Gh0 by considering its Taylor expansion around
every space-time point. At this stage, we note that we actually cheated a little: while
Gh0 is smooth in {(t, x) : t > 0, x ∈ T} and vanishes when t < 0, it is of course
singular on the time-0 hyperplane {(0, x) : x ∈ T}. This problem can be cured
by introducing weighted versions of the spaces D γ allowing for singularities on
a given hyperplane. A precise definition of these singular model spaces and their
behaviour under multiplication and the action of the integral operator K can be found
in [Hai14b]; but see Exercise 4.12 for the (singular, controlled) rough path analogue.
For the purpose of the informal discussion given here, we will simply ignore this
problem.
This suggests that the “abstract” formulation of (15.1) should be given by

$$H = \mathcal{K}\bigl((\partial H)^2 + \Xi\bigr) + \hat{\mathcal{K}}\bigl((\partial H)^2 + \Xi\bigr) + G h_0 , \tag{15.8}$$

where it still remains to be seen how to define an “abstract differentiation operator” ∂
realising the spatial derivative ∂x as in Section 14.1. In view of (14.11), this equation
is of the type
$$H = \mathcal{I}\bigl((\partial H)^2 + \Xi\bigr) + (\ldots) , \tag{15.9}$$
where the terms (. . .) consist of functions that take values in the subspace T̄ of
T spanned by regular Taylor polynomials in the time variable X0 and the space
variable X1 . (As previously, X denotes the collection of both.) In order to build
a regularity structure in which (15.9) can be formulated, it is then natural to start
with the structure T̄ given by these abstract polynomials (again with the parabolic
scaling which causes the abstract “time” variable to have degree 2 rather than 1),

and to then add a symbol Ξ to it which we postulate to have degree $-\tfrac32^{-}$, where
we denote by $\alpha^-$ an exponent strictly smaller than, but arbitrarily close to, the value
α. As a consequence of our definitions, it will also turn out that the symbol ∂ is
always immediately followed by the symbol I, so that it makes sense to introduce the
shorthand I ′ = ∂I. This is also suggestive of the fact that I ′ can itself be considered
an abstract integration map, associated to the kernel K ′ = ∂x K. Comparing this to
Remark 14.24, we see that we could alternatively view I ′ as the operator $\mathcal{I}_{(0,1)}$.

Remark 15.6. In order to avoid a proliferation of inconsequential terms, we impose
from the start the identity I ′ (1) = 0 in T (we can do this by Remark 15.6). We could
also set I(1) = 0 by choosing K appropriately, but this is irrelevant anyway in view
of Remark 15.8 below.

We then simply add to T all of the formal expressions that an application of the
right-hand side of (15.9) can generate for the description of H, ∂H, and (∂H)2 .
The degree of a given expression is furthermore completely determined by the rules
deg Iτ = deg τ + 2, deg ∂τ = deg τ − 1 and deg τ τ̄ = deg τ + deg τ̄ . For example,
it follows from (15.9) that the symbol I(Ξ) is required for the description of H, so
that I ′ (Ξ) is required for the description of ∂H. This then implies that I ′ (Ξ)2 is
required for the description of the right-hand side of (15.9), which in turn implies
that I(I ′ (Ξ)2 ) is also required for the description of H, etc. This “Picard iteration”
yields the (formal) expansion, writing z for a generic space-time point,¹

$$H(z) = h(z)\,\mathbf{1} + \mathcal{I}(\Xi) + \mathcal{I}(\mathcal{I}'(\Xi)^2) + h'(z)\,X_1 + 2\,\mathcal{I}\bigl(\mathcal{I}'(\Xi)\,\mathcal{I}'(\mathcal{I}'(\Xi)^2)\bigr) + 2h'(z)\,\mathcal{I}(\mathcal{I}'(\Xi)) + \ldots$$

where h and h′ are to be considered as independent functions (similar to a controlled
rough path). In particular, h may not be differentiable at all.
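
The degree bookkeeping behind this Picard iteration can be tabulated explicitly. The table below is a sketch under one assumption that is not part of the text: we write $\deg \Xi = -\tfrac32 - \kappa$ for a small $\kappa > 0$ (the symbol κ is introduced here only to make "strictly smaller than, but arbitrarily close to" concrete). Everything else follows from the rules $\deg \mathcal{I}\tau = \deg\tau + 2$, $\deg \partial\tau = \deg\tau - 1$ and additivity under products.

```latex
% Assumption: \deg\Xi = -3/2 - \kappa with \kappa > 0 small (\kappa is hypothetical notation).
% Since I' = \partial I, one has \deg I'(\tau) = \deg\tau + 2 - 1 = \deg\tau + 1.
\[
\begin{array}{lll}
\text{symbol} & \text{degree} & \text{needed to describe} \\ \hline
\Xi & -\tfrac32 - \kappa & \text{right-hand side} \\
\mathcal{I}(\Xi) & \tfrac12 - \kappa & H \\
\mathcal{I}'(\Xi) & -\tfrac12 - \kappa & \partial H \\
\mathcal{I}'(\Xi)^2 & -1 - 2\kappa & (\partial H)^2 \\
\mathcal{I}(\mathcal{I}'(\Xi)^2) & 1 - 2\kappa & H \\
\mathcal{I}'(\mathcal{I}'(\Xi)^2) & -2\kappa & \partial H \\
\mathcal{I}'(\Xi)\,\mathcal{I}'(\mathcal{I}'(\Xi)^2) & -\tfrac12 - 3\kappa & (\partial H)^2 \\
\mathcal{I}\bigl(\mathcal{I}'(\Xi)\,\mathcal{I}'(\mathcal{I}'(\Xi)^2)\bigr) & \tfrac32 - 3\kappa & H
\end{array}
\]
```

Each pass through the right-hand side of (15.9) raises the degree of the newly created symbols, which is why only finitely many of them lie below any given level.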

¹ Note that h′ is treated as an independent function (similar to the Gubinelli derivative of a controlled
path); we do not even expect h to be differentiable!

Remark 15.7. Here we made a distinction between I(Ξ), interpreted as the linear
map I applied to the symbol Ξ, and the symbol I(Ξ). Since the map I is then
defined by I(Ξ) := I(Ξ), this distinction is somewhat moot and will be blurred in
the sequel. Similarly, the abstract (spatial) differentiation operator ∂ acts on suitable
symbols as ∂(I(. . .)) := I ′ (. . .), plus of course $\partial(X_0^{k_0} X_1^{k_1}) := k_1 X_0^{k_0} X_1^{k_1 - 1}$, for
every multi-index (k0 , k1 ).
More formally, denote by U the collection of those formal expressions that are
required to describe H. This is then defined as the smallest collection containing $X^k$
for all multiindices k ≥ 0, I(Ξ), and such that

τ1 , τ2 ∈ U =⇒ I(∂τ1 ∂τ2 ) ∈ U .

We then set
W = U ∪ {Ξ} ∪ {∂τ1 ∂τ2 : τi ∈ U} , (15.10)
and define T as the set of all linear combinations of elements in a finite subset
W0 ⊂ W, sufficiently large to allow us to close the fixed point problem (15.8). Note
that this defines (implicitly!) a multiplication between some (but not all) of the
symbols, notably ∂τ1 ⋆ ∂τ2 := ∂τ1 ∂τ2 so that we can safely omit ⋆ in the sequel.
Naturally, $T_\alpha$ consists of those linear combinations that only involve elements in W0
of degree α. (Already W contains only finitely many elements of degree less than α,
which reflects subcriticality of the problem.)
In order to simplify expressions later, we use again a shorthand graphical notation
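for elements of W as we already did in Section 14.5. Similarly to before, Ξ is
represented by a small circle, while the integration map I is represented by a downfacing
wavy line and I ′ = ∂I is represented by a downfacing plain line. For example, each of the symbols

$$\mathcal{I}'(\Xi)^2 , \qquad \bigl(\mathcal{I}'(\mathcal{I}'(\Xi)^2)\bigr)^2 , \qquad \mathcal{I}(\mathcal{I}'(\Xi)^2)$$

is drawn as a single tree, the first two being ⋆-products of two identical trees.
[Tree diagrams not reproduced; trees are written out in symbolic form in what follows.]

Symbols containing factors of X have no particular graphical representation, so we
will for example write $X_i\,\mathcal{I}'(\Xi)^2$ as $X_i$ followed by the tree for $\mathcal{I}'(\Xi)^2$. With this notation,

$$H = h\,\mathbf{1} + \mathcal{I}(\Xi) + \mathcal{I}(\mathcal{I}'(\Xi)^2) + h' X_1 + 2\,\mathcal{I}\bigl(\mathcal{I}'(\Xi)\,\mathcal{I}'(\mathcal{I}'(\Xi)^2)\bigr) + 2h'\,\mathcal{I}(\mathcal{I}'(\Xi)) + \ldots$$

described with symbols in

$$U = \bigl\{\mathbf{1},\ \mathcal{I}(\Xi),\ \mathcal{I}(\mathcal{I}'(\Xi)^2),\ X_1,\ \mathcal{I}\bigl(\mathcal{I}'(\Xi)\,\mathcal{I}'(\mathcal{I}'(\Xi)^2)\bigr),\ \mathcal{I}(\mathcal{I}'(\Xi)),\ \ldots\bigr\} ,$$

here spelled out up to degree $\tfrac32$ (which will turn out to be “enough”, cf. Remark 15.8 below). For the “right-hand
side” of the equation we need to include Ξ and, spelling out symbols up to degree 0
which is the minimum required to be able to apply the reconstruction operator to it,

$$\{\partial\tau_1\partial\tau_2 : \tau_i \in U\} = \bigl\{\mathcal{I}'(\Xi)^2,\ \mathcal{I}'(\Xi)\,\mathcal{I}'(\mathcal{I}'(\Xi)^2),\ \mathcal{I}'(\Xi),\ \mathcal{I}'(\mathcal{I}'(\Xi)^2)^2,\ \mathcal{I}'(\Xi)\,\mathcal{I}'\bigl(\mathcal{I}'(\Xi)\,\mathcal{I}'(\mathcal{I}'(\Xi)^2)\bigr),\ \mathcal{I}'(\mathcal{I}'(\Xi)^2),\ \mathcal{I}'(\Xi)\,\mathcal{I}'(\mathcal{I}'(\Xi)),\ \mathbf{1},\ \ldots\bigr\} .$$

As it turns out, provided that we also include the noise itself, the 14 symbols encountered
so far already generate a sufficiently large structure space, given by
$$T = T_{\mathrm{KPZ}} \stackrel{\mathrm{def}}{=} \langle W_0\rangle = \bigl\langle \Xi,\ \mathcal{I}'(\Xi)^2,\ \mathcal{I}'(\Xi)\,\mathcal{I}'(\mathcal{I}'(\Xi)^2),\ \mathcal{I}'(\Xi),\ \mathcal{I}'(\mathcal{I}'(\Xi)^2)^2,\ \mathcal{I}'(\Xi)\,\mathcal{I}'\bigl(\mathcal{I}'(\Xi)\,\mathcal{I}'(\mathcal{I}'(\Xi)^2)\bigr),\ \mathcal{I}'(\mathcal{I}'(\Xi)^2),\ \mathcal{I}'(\Xi)\,\mathcal{I}'(\mathcal{I}'(\Xi)),\ \mathbf{1},\ \mathcal{I}(\Xi),\ \mathcal{I}(\mathcal{I}'(\Xi)^2),\ X_1,\ \mathcal{I}\bigl(\mathcal{I}'(\Xi)\,\mathcal{I}'(\mathcal{I}'(\Xi)^2)\bigr),\ \mathcal{I}(\mathcal{I}'(\Xi)) \bigr\rangle . \tag{15.11}$$

Here we ordered symbols by increasing order of degree. In fact, if τ is a tree with
n circles, m plain lines and k wavy lines, then $\deg \tau = n \times \bigl(-\tfrac32\bigr)^{-} + m + 2k$. Note
that deg X1 = 1 for the abstract space variable, whereas due to parabolic scaling the
abstract time variable has deg X0 = 2 and does not show up here.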
Note that at this stage, we have not defined a regularity structure yet, as we have
not described a structure group G acting on T . However, similarly to what was done
in (14.24), it is already natural to consider “representations” of the existing structure,
which are linear maps Π from T into some suitable space of functions / distributions
respecting a form of admissibility condition. For the sake of the present discussion,
we assume that all objects are smooth. Given a (smooth) realisation of a “driving
noise” ξ, we can then define its canonical lift by setting

$$\bigl(\Pi X^k\bigr)(x) = x^k , \qquad \bigl(\Pi \Xi\bigr)(x) = \xi(x) , \tag{15.12}$$

and then recursively by

Πτ τ̄ = Πτ · Π τ̄ , ΠIτ = K ∗ Πτ . (15.13)

In general, we say that a linear map Π : T → C(Rd ) is admissible if one has the
relations

$$\Pi \mathcal{I}\tau = K * \Pi\tau , \qquad \Pi \mathbf{1} = 1 , \qquad \Pi X^k \tau = (\,\cdot\,)^k\, \Pi\tau . \tag{15.14}$$

(And similarly with I replaced by I ′ and K replaced by ∂x K in the case of KPZ. . .)


Such a map Π is clearly not a model since it is a single linear map rather than a
family of such maps and the admissibility condition (14.8) is replaced by the more
“natural” identity ΠIτ = K ∗ Πτ . We will see in the next section how to construct
the structure group G and how to use its construction to assign in a unique way a
model to the linear map Π.

Remark 15.8 (Where to truncate?). The (14-dimensional) space TKPZ is indeed suffi-
cient to treat the KPZ equation. Indeed, once in possession of an admissible model,
thanks to Theorem 14.5, the fixed point problem (15.8) can be solved in D γ as soon
as γ is a little bit greater than 3/2. This is why we only need to keep track of terms
describing the abstract KPZ solution up to degree 3/2. Regarding the terms required
to describe the right-hand side of the fixed point problem, we need to go up to degree
0, which guarantees that the reconstruction operator (and therefore also the integration
operator $\mathcal{K}$) is well-defined. This is similar to $T = T_{<1/2}$, as in Definition 13.4,
being sufficient to treat rough / stochastic integration (and then SDEs) in a Brownian
rough path / model context. Indeed, in that context (Proposition 13.21) consider
$Y \in \mathcal{D}_0^{2\alpha}$ (now for α to be determined!) and abstract Brownian noise $\dot W \in \mathcal{D}^\infty_{-1/2^-}$.
Then f (Y ), composition with a nice function f , is also in $\mathcal{D}_0^{2\alpha}$ and the product is
in $\mathcal{D}^{2\alpha - 1/2}$. We needed this exponent to be positive to have a well-defined rough
integration which in turn allows us to formulate a fixed point problem, so that we need
2α ≥ 1/2. By definition of $\mathcal{D}^{2\alpha}$, this means that we need Y to take values in $T_{<1/2}$
which is of course what we did by working in $\langle \dot W, \dot{\mathbb{W}}, \mathbf{1}, W\rangle$, ignoring all symbols
of higher degree.

15.3 The structure group and positive renormalisation

Recall that the purpose of the group G is to provide a class of linear maps Γ : T → T
arising as possible candidates for the action of “reexpanding” a “Taylor series” around
a different point. In our case, in view of (14.8) and Definition 14.3, the coefficients
of these reexpansions will naturally be some polynomials in x and in the expressions
appearing in (14.9). This suggests that we should define a space $T^+$ whose basis
vectors consist of formal expressions of the type
$$X^k \prod_{i=1}^{N} \mathcal{J}_{\ell_i}(\tau_i) , \tag{15.15}$$

where N is an arbitrary but finite number, the τi are canonical basis elements in
W defined in (15.10), and the ℓi are d-dimensional multiindices satisfying |ℓi | <
deg τi + 2. (The last bound is a reflection of the restriction of the summands in (14.9)
with β = 2.) The space $T^+$, which also contains the empty product 1, is endowed
with a natural commutative product, written as · or (usually) omitted. (In other words,
$(T^+, \cdot, \mathbf{1})$ is nothing but the free commutative algebra over the symbols $\{X_i, \mathcal{J}_\ell(\tau)\}$ with
i ∈ {1, . . . , d} and τ ∈ W with $\deg \mathcal{J}_\ell(\tau) := \deg \tau + 2 - |\ell| > 0$.)

Remark 15.9. While the canonical basis of $T^+$ is related to that of T , it should be
viewed as a completely disjoint space. We emphasise this by using the notation $\mathcal{J}_\ell$
rather than $\mathcal{I}_\ell$.

The space $T^+$ also has a natural graded structure $T^+ = \bigoplus_\alpha T^+_\alpha$ similarly to before
by setting
$$\deg \mathcal{J}_\ell(\tau) = \deg \tau + 2 - |\ell| , \qquad \deg X^k = |k| ,$$
and by postulating that the degree of a product is the sum of the degrees of its factors.
Unlike in the case of T however, elements of $T^+$ all have strictly positive degree,
except for the empty product 1 which we postulate to have degree 0.
Still inspired by (14.8), as well as by the multiplicativity constraint given by
Definition 14.3, we consider the following construction. We define a linear map,
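sometimes called coaction, $\Delta^+ : T \to T \otimes T^+$ in the following way. For the basic
elements Ξ, 1 and Xi (i ∈ {0, 1}), we set

$$\Delta^+ \mathbf{1} = \mathbf{1} \otimes \mathbf{1} , \qquad \Delta^+ \Xi = \Xi \otimes \mathbf{1} , \qquad \Delta^+ X_i = X_i \otimes \mathbf{1} + \mathbf{1} \otimes X_i .$$

We then extend this recursively to all of T by imposing the following identities

$$
\begin{aligned}
\Delta^+(\tau\bar\tau) &= \Delta^+\tau \cdot \Delta^+\bar\tau , \\
\Delta^+ \mathcal{I}(\tau) &= (\mathcal{I} \otimes \mathrm{Id})\Delta^+\tau + \sum_{\ell} \frac{X^\ell}{\ell!} \otimes \mathcal{J}_\ell(\tau) , \\
\Delta^+ \mathcal{I}'(\tau) &= (\mathcal{I}' \otimes \mathrm{Id})\Delta^+\tau + \sum_{\ell} \frac{X^\ell}{\ell!} \otimes \mathcal{J}_{\ell+(0,1)}(\tau) .
\end{aligned}
$$

Here, we extend $\tau \mapsto \mathcal{J}_k(\tau)$ to a linear map $\mathcal{J}_k : T \to T^+$ by setting $\mathcal{J}_k(\tau) = 0$ for
those basis vectors τ ∈ W for which deg τ ≤ |k| − 2. This in particular shows that
the sums appearing in the above expressions are actually finite.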
Let now G+ denote the set of characters on T + , i.e. all linear maps g : T + → R
with the property that g(σσ̄) = g(σ)g(σ̄) for any two elements σ and σ̄ in T + . Then,
to any such map, we can associate a linear map Γg : T → T by

Γg τ = (Id ⊗ g)∆+ τ . (15.16)

In principle, this definition makes sense for every $g \in (T^+)^*$. However, as already
seen in (14.21) it turns out that the set of such maps with $g \in G^+$ forms a group,
which we take as our structure group G by setting again
$$G \stackrel{\mathrm{def}}{=} \{\Gamma_g : g \in G^+\} . \tag{15.17}$$
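
To see the recursive definition of the coaction at work, here is a short worked example on three of the simplest symbols. It is a sketch under the following assumptions: multiindices are written $\ell = (\ell_0, \ell_1)$ with the parabolic length $|\ell| = 2\ell_0 + \ell_1$, the degree of Ξ is $-\tfrac32 - \kappa$ for a small $\kappa > 0$ (κ is notation introduced here), and $\mathcal{J} = \mathcal{J}_{(0,0)}$, $\mathcal{J}' = \mathcal{J}_{(0,1)}$. Recall that $\mathcal{J}_\ell(\tau) = 0$ unless $|\ell| < \deg\tau + 2$, and that a character g satisfies $g(\mathbf{1}) = 1$.

```latex
% deg \Xi + 2 = 1/2 - \kappa, so only \ell = (0,0) survives in the sum:
\[
\Delta^+ \mathcal{I}(\Xi) = \mathcal{I}(\Xi)\otimes\mathbf{1} + \mathbf{1}\otimes\mathcal{J}(\Xi) .
\]
% For I'(\Xi) every index \ell + (0,1) has length >= 1 > 1/2 - \kappa, so the sum is empty:
\[
\Delta^+ \mathcal{I}'(\Xi) = \mathcal{I}'(\Xi)\otimes\mathbf{1} .
\]
% deg I'(\Xi) + 2 = 3/2 - \kappa, so \ell = (0,0) and \ell = (0,1) both survive:
\[
\Delta^+ \mathcal{I}(\mathcal{I}'(\Xi))
  = \mathcal{I}(\mathcal{I}'(\Xi))\otimes\mathbf{1}
  + \mathbf{1}\otimes\mathcal{J}(\mathcal{I}'(\Xi))
  + X_1\otimes\mathcal{J}'(\mathcal{I}'(\Xi)) .
\]
% Applying \Gamma_g = (Id \otimes g)\Delta^+ for a character g:
\begin{align*}
\Gamma_g\,\mathcal{I}(\Xi) &= \mathcal{I}(\Xi) + g(\mathcal{J}(\Xi))\,\mathbf{1} , \\
\Gamma_g\,\mathcal{I}'(\Xi) &= \mathcal{I}'(\Xi) , \\
\Gamma_g\,\mathcal{I}(\mathcal{I}'(\Xi)) &= \mathcal{I}(\mathcal{I}'(\Xi))
   + g(\mathcal{J}(\mathcal{I}'(\Xi)))\,\mathbf{1}
   + g(\mathcal{J}'(\mathcal{I}'(\Xi)))\,X_1 .
\end{align*}
```

In each case $\Gamma_g\tau - \tau$ only involves symbols of strictly lower degree (here $\mathbf{1}$ and $X_1$), which is the triangular structure required of elements of the structure group.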

Remark 15.10. A less explicit way to define G is to simply take it as the set of
all linear maps that are ‘allowed’ in the sense that they are upper triangular with
the identity on the diagonal as imposed by (13.5), commute with derivatives as in
Definition 14.1, are multiplicative with respect to the product as in Definition 14.3,
and satisfy (14.7). See for example [Hai16].

Example 15.11 (KPZ structure group). Running through this procedure, and restricting
to T = TKPZ reveals G as a 7-dimensional (non-commutative) matrix group,
canonically realised as a subgroup of the invertible maps T → T , themselves representable
as 14 × 14 matrices (recall that TKPZ is 14-dimensional). Full details are left for Exercise 15.1.

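Example 15.12 (KPZ). Recall
$$T = \bigl\langle \Xi,\ \mathcal{I}'(\Xi)^2,\ \mathcal{I}'(\Xi)\,\mathcal{I}'(\mathcal{I}'(\Xi)^2),\ \mathcal{I}'(\Xi),\ \mathcal{I}'(\mathcal{I}'(\Xi)^2)^2,\ \mathcal{I}'(\Xi)\,\mathcal{I}'\bigl(\mathcal{I}'(\Xi)\,\mathcal{I}'(\mathcal{I}'(\Xi)^2)\bigr),\ \mathcal{I}'(\mathcal{I}'(\Xi)^2),\ \mathcal{I}'(\Xi)\,\mathcal{I}'(\mathcal{I}'(\Xi)),\ \mathbf{1},\ \ldots \bigr\rangle$$
in the case of KPZ. Then $T^+$ is linearly spanned by the symbol 1 and polynomials in the
commuting symbols as (partially!) listed in

$$\bigl\{\mathcal{J}'\bigl(\mathcal{I}'(\Xi)\,\mathcal{I}'(\mathcal{I}'(\Xi)^2)\bigr),\ \mathcal{J}'(\mathcal{I}'(\Xi)),\ \ldots,\ \mathcal{J}(\Xi),\ \mathcal{J}(\mathcal{I}'(\Xi)^2),\ X_1,\ \mathcal{J}\bigl(\mathcal{I}'(\Xi)\,\mathcal{I}'(\mathcal{I}'(\Xi)^2)\bigr),\ \mathcal{J}(\mathcal{I}'(\Xi)),\ \ldots\bigr\}$$
with (non-negative) degrees $\{\tfrac12^{\equiv}, \tfrac12^{-}, \ldots, 1^{=}, 1, \tfrac32^{\equiv}, \tfrac32^{-}, \ldots\}$ and shorthands $\mathcal{J} = \mathcal{J}_{(0,0)}$, $\mathcal{J}' = \mathcal{J}_{(0,1)}$.
We note that all symbols here can be represented by elementary
trees,² where $\mathcal{J}(\tau)$ (resp. $\mathcal{J}'(\tau)$) is represented by attaching a single downfacing
wavy (resp. plain) line to the root of τ . For instance

$$3 \cdot \mathbf{1} - \mathcal{J}(\Xi) + 2 \cdot \mathcal{J}'(\tau_1) \cdot \mathcal{J}'(\tau_2) \in T^+$$

for trees $\tau_1, \tau_2$ such as those appearing inside $\mathcal{J}'$ in the list above,
but the symbol $\mathcal{J}'(\Xi)$ (which would be of negative homogeneity) is not an element
of $T^+$.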

² With some goodwill this even includes X-factors, which then appear as polynomial decorations
of the trees.

Before we show that G does indeed form a group (actually a subgroup of the
invertible maps from T to T ), we show how to use it to turn an admissible linear
map $\Pi : T \to \mathcal{C}^\infty(\mathbb{R}^d)$ (in the sense of (15.14)) into a model (Π, Γ ). Consider the
recursion
$$
\begin{aligned}
f_x(\mathcal{J}_\ell(\tau)) &= -\sum_{|k+\ell| < \deg\tau + 2} \frac{(-x)^k}{k!} \int D^{\ell+k} K(x - y)\, \bigl(\Pi_x \tau\bigr)(dy) , \\
\Pi_x \tau &= (\Pi \otimes f_x)\Delta^+ \tau ,
\end{aligned}
\tag{15.18}
$$

where we furthermore impose that the fx are characters, namely that they extend
to all of (T + )∗ in a multiplicative fashion, fx (σσ̄) = fx (σ)fx (σ̄). We leave it as a
simple exercise to verify that these two identities are sufficient to define the fx and
the Πx uniquely.

Remark 15.13. The correspondence Π ⇔ (Π, Γ ) can also be inverted and the two
notions of admissibility are consistent, so that these are two completely equivalent
ways of looking at admissible models for our regularity structure. Indeed, it suffices
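to set $\Pi\tau = \mathcal{R}H_\tau$, where the elements $H_\tau \in \mathcal{D}^\infty$ (i.e. one can make sure that
$H_\tau \in \mathcal{D}^\gamma$ for any fixed γ) are given by $H_{X^k}(x) = (X + x)^k$, $H_\Xi(x) = \Xi$, and
then recursively by

$$H_{\mathcal{I}(\tau)} = \mathcal{K} H_\tau , \qquad H_{\tau\bar\tau} = H_\tau \cdot H_{\bar\tau} .$$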

In particular, this correspondence does not at all rely on the fact that the model
was built by lifting a smooth function. Note that this is strongly reminiscent of the
construction given in Exercise 13.11. See also Exercise 15.3.

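If we now define elements $F_x \in G$ by

$$F_x \tau \stackrel{\mathrm{def}}{=} \Gamma_{f_x}\tau = (\mathrm{Id} \otimes f_x)\Delta^+ \tau , \tag{15.19}$$

and then set (an expression for $F_x^{-1}$ is given below)

$$\Gamma_{xy} = F_x^{-1} F_y , \tag{15.20}$$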

it follows immediately from (15.18) that the Πx and the maps Γxy do indeed satisfy
the desired algebraic relation Πx Γxy = Πy . We also note that the coefficients of
the linear maps Γxy are expressed as polynomials of the numbers fx (Jℓi (τi )) and
fy (Jℓi (τi )) for suitable expressions τi and multiindices ℓi . Note that the linear maps
Fx : T → T perform a kind of “recentering” of Π around x in the sense that (15.18)
guarantees that, at least when Π is sufficiently smooth, Πx I(τ ) vanishes at the order
determined by the degree of τ . As a matter of fact, one could even have taken this as
the defining property of the maps Fx (together with the fact that they are of the form
(15.19) for some multiplicative functional fx ). We will see in Section 15.5 below
that the renormalisation procedure required to give a meaning to singular SPDEs
like the KPZ equation can equally be interpreted as a type of recentering procedure,
but this time in “probability space”. This also explains the terminology “positive
renormalisation” which is sometimes encountered for the maps Fx .
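
The algebraic relation $\Pi_x \Gamma_{xy} = \Pi_y$ claimed above can be checked in two lines. The sketch below only uses (15.18), (15.19) and (15.20), together with the invertibility of $F_x$, which is taken for granted at this point and justified by the group property of G.

```latex
% Step 1: (15.18) says that \Pi_x is \Pi composed with the recentering map F_x of (15.19).
\[
\Pi_x \tau = (\Pi \otimes f_x)\Delta^+\tau
           = \Pi\,(\mathrm{Id} \otimes f_x)\Delta^+\tau
           = \Pi F_x \tau .
\]
% Step 2: insert \Gamma_{xy} = F_x^{-1} F_y from (15.20).
\[
\Pi_x \Gamma_{xy} = \Pi F_x F_x^{-1} F_y = \Pi F_y = \Pi_y .
\]
```

The same computation gives $\Gamma_{xy}\Gamma_{yz} = F_x^{-1}F_yF_y^{-1}F_z = \Gamma_{xz}$, the other algebraic identity expected of a model.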

We now argue that G as defined above actually forms a group, so that in particular
the maps Fx are invertible. To this end, define a linear map ∆+ : T + → T + ⊗ T + ,
very similarly to the previously defined map ∆+ : T → T ⊗ T + , by

∆+ 1 = 1 ⊗ 1 , ∆+ X = X ⊗ 1 + 1 ⊗ X ,

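extended recursively to all of $T^+$ by imposing the identities, for all multiindices k,

$$
\begin{aligned}
\Delta^+(\sigma\bar\sigma) &= (\Delta^+\sigma)(\Delta^+\bar\sigma) , \\
\Delta^+ \mathcal{J}_k(\tau) &= (\mathcal{J}_k \otimes \mathrm{Id})\Delta^+\tau + \sum_{\ell \in \mathbb{N}^2} \frac{X^\ell}{\ell!} \otimes \mathcal{J}_{\ell+k}(\tau) .
\end{aligned}
\tag{15.21}
$$

It can be verified that $\Delta^+$ is coassociative in the sense

$$(\Delta^+ \otimes \mathrm{Id})\Delta^+ = (\mathrm{Id} \otimes \Delta^+)\Delta^+ . \tag{15.22}$$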

This and the multiplicative property make $\Delta^+$ a coproduct and $T^+$ a (connected,
graded) coalgebra. From general principles there exists a unique linear map $A^+$ :
T + → T + , called antipode, so that (T + , ·, ∆+ , A+ ) is a Hopf algebra. Moreover,
our notational overload is justified by the fact that (15.22) also holds when both sides
of the identity are interpreted as linear maps T → T ⊗ T + ⊗ T + .
We then define a product ◦ on the space of linear functionals f : T + → R by

(f ◦ g)(σ) = (f ⊗ g)∆+ σ , (15.23)

noting that coassociativity of $\Delta^+$ implies associativity of ◦. Restricted to multiplicative
elements, i.e. to $G^+$, the definition of the antipode implies that $G^+$ is indeed
a group with $f^{-1} = f A^+$, that is $f^{-1} \circ f = f \circ f^{-1} = e$, where $e : T^+ \to \mathbb{R}$
maps every basis vector of the form (15.15) to zero, except for e(1) = 1. This is a
general construction for Hopf algebras and G+ is known as the character group of
T + . The product ◦ in this context is usually called the convolution product. Indeed,
the first identity in (15.21), valid by definition for every coproduct in a Hopf algebra,
ensures that if f and g belong to G+ , then f ◦ g ∈ G+ . (Spelled out, this says if
f, g ∈ (T + )∗ are both multiplicative in the sense that f (σσ̄) = f (σ)f (σ̄), then f ◦ g
is again multiplicative.)
Since, by definition, $\Gamma_f = (\mathrm{Id} \otimes f)\Delta^+$, we can rewrite (15.19) as $F_x = \Gamma_{f_x}$, and
the intertwining identity (15.22) entails that

$$\Gamma_{f \circ g} = \Gamma_f \Gamma_g .$$

Also, the element e is neutral in the sense that $\Gamma_e$ is the identity operator, and as a
consequence $\Gamma_{f^{-1}} = \Gamma_f^{-1}$ whenever $f \in G^+$. In particular then,

$$F_x^{-1} = \Gamma_{f_x^{-1}} = \Gamma_{f_x A^+}$$

and we can fully spell out (15.20) as

$$\Gamma_{xy} = \Gamma_{f_x A^+ \circ f_y} = (\mathrm{Id} \otimes \gamma_{x,y})\Delta^+ , \qquad \gamma_{x,y} \stackrel{\mathrm{def}}{=} f_x A^+ \circ f_y = (f_x A^+ \otimes f_y)\Delta^+ .$$
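
The homomorphism property $\Gamma_{f\circ g} = \Gamma_f\Gamma_g$ used in this argument follows directly from coassociativity. The short derivation below is a sketch that only uses the definition $\Gamma_g\tau = (\mathrm{Id}\otimes g)\Delta^+\tau$ from (15.16), the convolution product (15.23), and the identity (15.22) read as an equality of maps $T \to T\otimes T^+\otimes T^+$.

```latex
\begin{align*}
\Gamma_f \Gamma_g \tau
  &= (\mathrm{Id}\otimes f)\,\Delta^+\bigl[(\mathrm{Id}\otimes g)\Delta^+\tau\bigr] \\
  &= (\mathrm{Id}\otimes f\otimes g)\,(\Delta^+\otimes\mathrm{Id})\,\Delta^+\tau
     && \text{(g acts on the last factor only)} \\
  &= (\mathrm{Id}\otimes f\otimes g)\,(\mathrm{Id}\otimes\Delta^+)\,\Delta^+\tau
     && \text{(coassociativity (15.22))} \\
  &= \bigl(\mathrm{Id}\otimes (f\otimes g)\Delta^+\bigr)\,\Delta^+\tau \\
  &= \bigl(\mathrm{Id}\otimes (f\circ g)\bigr)\,\Delta^+\tau
     && \text{(definition (15.23))} \\
  &= \Gamma_{f\circ g}\,\tau .
\end{align*}
```

Taking $g = f^{-1} = fA^+$ then gives $\Gamma_f\Gamma_{f^{-1}} = \Gamma_e = \mathrm{Id}$, which is the invertibility of the maps $F_x$.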

The fact that ∆+ preserves degree (as can be seen by induction from its definition)
and that elements of $T^+$ all have strictly positive degree, except for 1, leads to
the conclusion that, for every Γ ∈ G and every τ ∈ T , Γ τ is indeed of the form
(13.5). The multiplicativity property of ∆+ furthermore guarantees that the constraint
mentioned in Definition 14.3 does hold. This justifies our definition of structure group
G associated to T as the set of all multiplicative linear functionals on T + , acting on
T via (15.16), as given in (15.17), for G has group structure induced from G+ .
Returning to the relation between Πx and Π, we showed actually more, namely
that the knowledge of Π and the knowledge of (Π, Γ ) are equivalent. Indeed, on
the one hand one has $\Pi = \Pi_x F_x^{-1}$ and the map $F_x$ can be recovered from $\Pi_x$ by
(15.18) and (15.19). On the other hand however, one also has of course Πx = ΠFx
and, if we equip T with an adequate recursive structure, then we have already seen
that the coefficients fx are uniquely determined by Π.
Furthermore, the correspondence (Π, Γ ) ↔ Π outlined above works for any
admissible model and does not at all rely on the fact that it was built by lifting a
continuous function. In particular, it does not rely on the fact that Πx and Π are
multiplicative. In the general case, the first identity in (15.13) may then of course
fail to be true, even if Πτ happens to be a continuous function for every τ ∈ T . The
only reason why our definition of an admissible model does not simply consist of
the single map Π is that there seems to be no simple way of describing the topology
given by Definition 13.5 in terms of Π.

15.4 Reconstruction for canonical lifts

Recall that, given any sufficiently regular function ξ (say a continuous space-time
function), there is a canonical way of lifting ξ to an admissible model L ξ = (Π, Γ )
for T by imposing (15.12) and (15.13), and then turning Π into a model as described
in the previous paragraph. With such a model L ξ at hand, it follows from (15.13)
and (13.26) that the associated reconstruction operator satisfies the properties

RKf = K ∗ Rf , R(f g) = Rf · Rg ,

as long as all the functions to which R is applied belong to D γ for some γ > 0. As
a consequence, applying the reconstruction operator R to both sides of (15.8), we
see that if H solves (15.8) then, provided that the model (Π, Γ ) = L ξ was built as
above starting from any continuous realisation ξ of the driving noise, the function
h = RH solves the equation (15.1).
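
Spelled out, the step from (15.8) to the classical equation runs as follows. This is a sketch for the canonical lift of a continuous ξ only; beyond the two displayed properties of the reconstruction operator it assumes that $\mathcal{R}\Xi = \xi$, that $\mathcal{R}$ commutes with the abstract derivative ($\mathcal{R}\,\partial H = \partial_x \mathcal{R}H$), that $\mathcal{R}\hat{\mathcal{K}}f = \hat K * \mathcal{R}f$ (the coefficient of $\mathbf{1}$ in the definition of $\hat{\mathcal{K}}$), and that the lift of $Gh_0$ reconstructs to $Gh_0$.

```latex
% Set h = \mathcal{R}H and apply \mathcal{R} to both sides of (15.8).
\begin{align*}
\mathcal{R}\bigl((\partial H)^2 + \Xi\bigr)
  &= (\mathcal{R}\,\partial H)^2 + \mathcal{R}\Xi
   = (\partial_x h)^2 + \xi , \\
h = \mathcal{R}H
  &= K * \bigl((\partial_x h)^2 + \xi\bigr)
   + \hat K * \bigl((\partial_x h)^2 + \xi\bigr) + G h_0 \\
  &= G * \bigl((\partial_x h)^2 + \xi\bigr) + G h_0 ,
\end{align*}
% using G = K + \hat K in the last step; this is the mild formulation (15.7).
```

The multiplicativity $\mathcal{R}(fg) = \mathcal{R}f\cdot\mathcal{R}g$ is the only step that is specific to canonical lifts; it is exactly the property that is relaxed when passing to renormalised models.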
At this stage, the situation is as follows. For any continuous realisation ξ of the
driving noise, we have factorised the solution map (h0 , ξ) 7→ h associated to (15.1)
into maps
(h0 , ξ) 7→ (h0 , L ξ) 7→ H 7→ h = RH ,

where the middle arrow corresponds to the solution to (15.8) in some weighted
D γ -space. The advantage of such a factorisation is that the last two arrows yield
continuous maps, even in topologies sufficiently weak to be able to describe driving
noise having the lack of regularity of space-time white noise. The only arrow that
isn’t continuous in such a weak topology is the first one. At this stage, it should
be believable that a similar construction can be performed for a very large class of
semilinear stochastic PDEs, provided that certain scaling properties are satisfied.
This is indeed the case and large parts of this programme have been carried out in
[Hai14b].
Given this construction, one is led naturally to the following question: given a
sequence ξε of “natural” regularisations of space-time white noise, for example as
in (15.6), do the lifts L ξε converge in probability in a suitable space of admissible
models? Unfortunately, unlike in the theory of rough paths where this is very often
the case (see Section 10), the answer to this question in the context of SPDEs is often
an emphatic no. Indeed, if it were the case for the KPZ equation, then one would
have been able to choose the constant Cε to be independent of ε in (15.6), which is
certainly not the case.

15.5 Renormalisation of the KPZ equation

One way of circumventing the fact that L ξε does not converge to a limiting model
as ε → 0 is to consider instead a sequence of renormalised models. The main idea
is to exploit the fact that our definition of an admissible model does not impose the
multiplicative identity
Πτ τ̄ = Πτ · Π τ̄ ,
used in (15.13) for the canonical lift, even in situations where ξ itself happens to be a
continuous function. One question that then imposes itself is: what are the natural
ways of “deforming” the usual product which still lead to lifts to an admissible model?
It turns out that the regularity structure whose construction was sketched above comes
equipped with a natural finite-dimensional group of continuous transformations R
on its space of admissible models (henceforth called the “renormalisation group”),
which essentially amounts to the space of all natural deformations of the product. It
then turns out that even though the canonical lift L ξε does not converge, it is possible
to find a sequence Mε of elements in R such that the sequence Mε L ξε converges
to a limiting model (Π̂, Γ̂ ). Unfortunately, the elements Mε do not preserve the
image of L in the space of admissible models. As a consequence, when solving the
fixed point map (15.8) with respect to the model Mε L ξε and inserting the solution
into the reconstruction operator, it is not clear a priori that the resulting function
(or distribution) can again be interpreted as the solution to some modified PDE. It
turns out that in the present setting this is again the case and the modified equation
is precisely given by (15.6), where Cε is some linear combination of the constants
appearing in the description of Mε .

There are now three questions that remain to be answered:


1. How does one construct the renormalisation group R?
2. How does one derive the new equation obtained when renormalising a model?
3. What is the right choice of Mε ensuring that the renormalised models converge?
As already pointed out at the start of this chapter, these questions have now been
answered in full generality in the series of articles [Hai14b, BHZ19, CH16, BCCH17].
The aim of this section is to illustrate how the machinery developed there applies to
the particular case of the KPZ equation and to give a feeling for how the main steps
of the construction generalise to other settings.

15.5.1 The renormalisation group

How does all this help with the identification of a natural class of deformations for
the usual product? Throughout this section, we will only consider models constructed
from a single map Π by the recursive procedure given in (15.18), combined with
(15.20). At this point, we crucially note that if Π : T → C ∞ (Rd ) is an arbitrary
admissible linear map (in the sense that ΠIτ = K ∗ Πτ as before), then there is no
reason in general for (15.18) and (15.20) to define a model. The reason is that while
these definitions do guarantee that  Πx Iτ satisfies the first bound in (13.13), there
is no reason in general for Πx τ (y) to vanish at the right order as y → x for an
arbitrary symbol τ that is not obtained by applying the integration map to some other
symbol. It is however the case that these bounds hold whenever Π is obtained as the
canonical lift of a smooth function, as can easily be seen from the multiplicativity
property of the canonical lift.
This suggests defining a space M∞ consisting of those admissible maps Π : T →
C ∞ (Rd ) which do generate a model by the above procedure. By Remark 15.13, there
is a canonical bijection between M∞ and the set of all smooth admissible models,
so we henceforth also call an element Π ∈ M∞ simply a model (or an admissible
model). Note that even though the space of linear maps T → C ∞ (Rd ) is linear, the
space M∞ is far from being a linear space.
At this stage, we would like to introduce probability into the game. For this, note
first that we have a natural action S of the group of translations (Rd , +) onto T by
setting Sh X k = (X + h)k , Sh Ξ = Ξ, and then recursively by

Sh Iτ = ISh τ , Sh τ τ̄ = Sh τ Sh τ̄ .

We then note that if ξ happens to be a stationary stochastic process and Π = L ξ is
its canonical lift as a random model, then Π is a stationary stochastic process in the
generalised sense that

\[ (\Pi\tau)(\,\cdot + h) \overset{\mathrm{law}}{=} (\Pi S_h \tau)(\,\cdot\,) . \]

In order to define the renormalisation group R, it is then natural to consider only
transformations of the space of admissible models that preserve this property. Since
we are not in general allowed to multiply components of Π and we do not want
to “pull arbitrary functions out of a hat”, the only remaining operation is to form
linear combinations. It is therefore natural to look for linear maps M : T → T which
furthermore preserve M∞ in the sense that if, given Π ∈ M∞ , we define Π M by

Π M τ = ΠM τ , (15.24)

one would like to have again Π M ∈ M∞ . It is clear that in order to guarantee this,
M needs to commute with the integration operators I and I ′ , but this alone is by no
means sufficient.
It turns out that the construction of a natural family of operators with the required
properties goes in a way that is strongly reminiscent of the construction of the
structure group, but with many aspects of the construction “reversed”. A natural
starting point of the construction is given by the set W− ⊂ W consisting of the
canonical basis vectors of strictly negative degree of our regularity structure T which
furthermore have the property that they can be built from products and integrations
applied to Ξ, i.e. do not involve any X k for k > 0. We then define T − similarly to
T + as the free unital algebra generated by W− , i.e.3

T − := Alg{ , , , , , , , } ,

the algebra given by all polynomials with real coefficients and indeterminates in
W− ; the unit is denoted by 1 (or, equivalently, as the empty forest ∅). The reason
why W− is expected to play a major role is that, by combining Exercise 13.11 with
admissibility and multiplicativity of the action of Γ , Πτ for deg τ > 0 is uniquely
determined by the knowledge of Πτ for all symbols τ with deg τ ≤ 0.
By analogy with the BPHZ renormalisation procedure in quantum field theory
[BP57, Hep69, Zim69], it is natural to look for renormalisation maps that consist in
“contracting subtrees of negative degree”. In order to formalise such an operation, we
take more seriously the interpretation of the canonical basis elements of T as “trees”.
More precisely, we consider labelled trees τ = (V, E, ϱ, n, e), where V is a finite
vertex set, E ⊂ V × V is an edge set, ϱ ∈ V is a root, n : V → Nd is a “polynomial
label” and e : E → {Ξ, I, I ′ } is an “edge label”. As usual, we identify labelled
trees if they can be related by a tree isomorphism preserving the root and labels.
The way this correspondence works is as follows. The symbol X k is represented as
the (unique) tree with a sole vertex V = {ϱ} and polynomial label n(ϱ) = k. The
symbol Ξ is represented by the tree with two vertices V = {ϱ, •}, one (oriented)
edge E = {e} = {(•, ϱ)}, and labels n = 0, e(e) = Ξ. Integration is then performed
by adding an edge of the corresponding type to the root, i.e. we have for example

3
As in the case of rough volatility, cf. 14.26, we colour basis elements of T − differently to
distinguish them from those of T and / or T + . Elements in T − are naturally represented as
(unordered) forests.

I(V, E, ϱ, n, e) = (V ⊔ {ϱ̄}, E ⊔ {(ϱ, ϱ̄)}, ϱ̄, In, Ie) ,

where In(ϱ̄) = 0 and otherwise agrees with n, while Ie((ϱ, ϱ̄)) = I and again
otherwise agrees with e. Multiplication is obtained by joining roots:

(V, E, ϱ, n, e) · (V̄ , Ē, ϱ̄, n̄, ē) = ((V ⊔ V̄ )/{ϱ, ϱ̄}, E ⊔ Ē, {ϱ, ϱ̄}, n ⊔ n̄, e ⊔ ē) ,

where (n ⊔ n̄)({ϱ, ϱ̄}) = n(ϱ) + n̄(ϱ̄).
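
The correspondence between symbols and labelled trees can be made concrete in a few lines of code. The following Python sketch is only an illustration and not part of the construction: the class `Tree`, the helper names and the string edge labels are hypothetical, trees are stored recursively (a root label plus a sorted tuple of edge-labelled subtrees) rather than as a tuple (V, E, ϱ, n, e), and the convention I′(1) = 0 is not enforced. The degree assignment (deg Ξ = −3/2 − κ, an increase of 2 for I and of 1 for I′, parabolic weights for X^k) is assumed from the earlier construction of the KPZ regularity structure, with an arbitrarily chosen small κ.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import Tuple


@dataclass(frozen=True, order=True)
class Tree:
    """Rooted labelled tree, stored recursively (hypothetical encoding)."""
    n: Tuple[int, int] = (0, 0)  # polynomial label n(root) = (time, space)
    # (edge label, subtree) pairs; kept sorted so isomorphic trees compare equal
    children: Tuple[Tuple[str, "Tree"], ...] = ()


def X(k):
    """The symbol X^k: a single vertex with polynomial label k."""
    return Tree(n=tuple(k))


# The symbol Xi: a root joined to one extra vertex by an edge labelled "Xi".
XI = Tree(children=(("Xi", Tree()),))


def integrate(tau, label="I"):
    """I(tau) or I'(tau): attach the old root to a new root with label 0."""
    return Tree(children=((label, tau),))


def mul(a, b):
    """Product of symbols: join the roots and add their polynomial labels."""
    n = (a.n[0] + b.n[0], a.n[1] + b.n[1])
    return Tree(n=n, children=tuple(sorted(a.children + b.children)))


KAPPA = Fraction(1, 100)  # assumed small parameter kappa > 0
EDGE_DEGREE = {"Xi": Fraction(-3, 2) - KAPPA, "I": 2, "I'": 1}


def degree(tau):
    """Degree: parabolic weight of the root label plus edge contributions."""
    own = 2 * tau.n[0] + tau.n[1]
    return own + sum(EDGE_DEGREE[e] + degree(c) for e, c in tau.children)


dXi = integrate(XI, "I'")   # the symbol I'(Xi)
square = mul(dXi, dXi)      # the symbol I'(Xi)^2
print(degree(XI), degree(dXi), degree(square))
```

Keeping the children sorted is one simple way of identifying trees related by a label-preserving isomorphism; with it, the product is commutative by construction.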


Remark 15.14. This is nothing but a formalisation of the graphical notation already
used earlier. The notation used in (15.11) for example suggests that one could equiva-
lently have viewed the noise as part of a “vertex label” and this is the viewpoint taken
for example in [BCCH17]. It appears however that viewing noises as edges, as for
example in [BHZ19], usually yields a more consistent formalism. This is especially
the case in situations where one would like to “attach” additional information to
noises as done in [CCHS20, Sec. 5].
In a similar way, elements of T − can be interpreted as elements A = (V, E, ϱ, e)
as above, except that there is no “polynomial label” n and (V, E) is allowed to be a
forest, with ϱ denoting the set of its roots, one per connected component. In particular,
the empty forest V = ∅ is allowed, which wasn’t the case for T .
Given A = (V̄ , Ē, ϱ̄, ē) ∈ T − and τ = (V, E, ϱ, n, e) ∈ T , we say that A ⊂ τ
if one has an injective map ι : V̄ ⊔ Ē → V ⊔ E preserving connectivity and edge
labels. Note that the injectivity of ι implies in particular that the different connected
components of A are vertex-disjoint in τ . In such a situation, we then write RA τ for
the tree obtained by contracting the connected components of A in τ , i.e. the vertex
set of RA τ consists of V /∼ where v ∼ v̄ if v and v̄ are equal or belong to the image
of the same connected component of A, while the edge set of RA τ equals E \ ιĒ.
We then define an operator ∆− : T → T − ⊗ T by

\[ \Delta^- \tau = \sum_{A \subset \tau} Q^- A \otimes R_A \tau , \tag{15.25} \]

where Q− A = A if every connected component of A has negative degree and


Q− A = 0 otherwise. Note again the graphical interpretation of extracting possibly
empty collections of subtrees of negative degree.
Example 15.15. For the regularity structure associated to the KPZ equation, we have
for example4

∆− = ⊗1+1⊗ +2 ⊗ + ⊗
+ ⊗ +2 ⊗ + ⊗ + ⊗ (15.26)
+2 ⊗ +2 ⊗ +2 ⊗ ,

where we used red symbols to denote elements of T − just as in Section 14.5. In most
situations it is natural to only consider characters of T − that vanish on planted trees,
4
Mind that ≡ ⊂ in three distinct ways which explains the terms 2 ⊗ + ⊗ .
i.e. trees with only one edge incident to the root,5 in which case this simplifies to

∆− = ⊗1+1⊗ +2 ⊗ + ⊗ .

Note also that there is for example no term ⊗ appearing in (15.26); indeed
fails to have negative degree, hence is not an element of T − and killed by Q− .

Remark 15.16. Since I ′ (1) = 0 by Remark 15.6, there is no term such as ⊗


appearing in the right-hand side of (15.26).

Remark 15.17. While the present construction is sufficient for KPZ, in full generality,
one should also allow polynomial decorations for elements in T − in which case
the expression for ∆− involves additional combinatorial factors, similarly to the
definition of ∆+ .

Our motivation for the definition of ∆− is as follows. Assigning a number to each


τ ∈ W− is equivalent to choosing an algebra morphism g : T − → R. If we ignore
for a moment the labels n and e, an operation of the type Mg : T → T with

Mg τ = (g ⊗ Id)∆− τ , (15.27)

then corresponds to iterating over all ways of contracting subtrees of negative degree
contained in τ and replacing them by the corresponding constant assigned to it by g.
This corresponds to replacing a kernel of possibly several variables by a multiple of
a Dirac delta function forcing all arguments to collapse.
Similarly to before, one can also define an operator ∆− : T − → T − ⊗ T − by
setting

\[ \Delta^- B = \sum_{A \subset B} Q^- A \otimes Q^- R_A B , \]

where the notions of inclusion A ⊂ B and the contraction RA B are defined in


complete analogy to above.
This yields an algebraic structure very similar to the one given by T and T + . We
will however not describe it in any more detail here, but refer instead to [BHZ19] for
additional details. In particular, T − , with forest product and coproduct ∆− , admits
an antipode A− turning it into a commutative Hopf algebra. Its characters then form
a group with product analogous to (15.23) and inverse given by g 7→ gA− , acting on
T by (15.27).

Definition 15.18. The renormalisation group R for our regularity structure T is


defined as the character group of T − .

Remark 15.19. The original definition of the “renormalisation group” given in


[Hai14b] (and in the first edition of this book) is slightly more general. In the
situation of the regularity structure built for a two-component KPZ equation, i.e.
5
In essence, extracting negative trees will help to renormalise otherwise ill-posed products. A single
edge incident to the root corresponds to convolution with a (compactly supported) kernel, which is
always well-posed.
exactly the same as discussed here, except that there are two “noises” Ξ1 and Ξ2
and every occurrence of Ξ can be replaced by either of them, the old definition
would for example include the map M that swaps the two noises in a consistent
way. (Consistency is in the sense that M I ′ (Ξ2 )I ′ (I ′ (Ξ1 )2 ) = I ′ (Ξ1 )I ′ (I ′ (Ξ2 )2 )
for example.) This is not an operation that is described by a character of T − . The
advantage of the present definition is that it is much more explicit. Furthermore, it
follows from the analytical results of [CH16] that it is sufficiently large to serve the
purpose of renormalising divergent models.

Example 15.20. Continuing the above example, we have

∆− = ⊗1+1⊗ +2 ⊗ + ⊗ + ⊗ .

Note that we have not considered the simplification of removing planted trees. Instead,
the analogues of the remaining terms appearing in (15.26) are killed by the projection
Q− . We also note that this expression is symmetric in the two factors T − which
is the case for all the symbols appearing in the analysis of the KPZ equation. This
implies that the KPZ renormalisation group R is abelian. (In general though, the
presence of “overlapping divergencies” can cause R to be non-abelian.)

One of the main results of [BHZ19] is a generalisation of the following statement,


which shows that the action of the renormalisation group plays nice with our notion
of admissible model.

Theorem 15.21. Let g ∈ R and define Mg = (g ⊗ Id)∆− as in (15.27). Then, for
any Π ∈ M∞ , one has Π g := ΠMg ∈ M∞ . Furthermore, one has

\[ \Pi^g_x = \Pi_x M_g , \qquad \Gamma^g_{xy} = M_g^{-1}\, \Gamma_{xy}\, M_g . \tag{15.28} \]

Proof. We sketch the proof. Recall that ∆− has been defined (with notational
overload) as a map T → T − ⊗ T and as a map T − → T − ⊗ T − ; we now also define
∆− : T + → T − ⊗ T + as a multiplicative linear map, determined by

∆− Xi = 1 ⊗ Xi , ∆− Jℓ (τ ) = (Id ⊗ Jℓ (·))∆− τ .

In the special case of KPZ one can check by hand that, thanks in particular to the
fact that I ′ (1) = 0 by Remark 15.6 (which correctly suggests that we should also
impose J ′ (1) = 0),
(i) On T one has the cointeraction formula

M13 (∆− ⊗ ∆− )∆+ = (Id ⊗ ∆+ )∆− , (15.29)

where M13 : T − ⊗ T ⊗ T − ⊗ T + → T − ⊗ T ⊗ T + is the map that multiplies


the first and third factor (in T − ), and the same holds also on T + .
(ii) The actions of R onto T and T + given by Mg do not decrease the degree. (For
the relevant set of characters g, this is seen explicitly in Exercise 15.2.)

Recall the correspondence Π ⇔ (Π, Γ ) given in Remark 15.13. With the special
properties (i)-(ii) it is straightforward to verify that, for g ∈ R arbitrary, Π g = ΠMg
defines a model Π g ⇔ (Π g , Γ g ) with

\[ \Pi^g_x = \Pi_x M_g = (g \otimes \Pi_x)\Delta^- , \qquad \Gamma^g_{xy} \overset{\mathrm{def}}{=} (\mathrm{Id} \otimes \gamma^g_{x,y})\Delta^+ , \qquad \gamma^g_{xy} = (g \otimes \gamma_{xy})\Delta^- . \]

(The second identity in (15.28) then follows from the formula for $\gamma^g_{xy}$, combined
with the cointeraction formula.) To show all this, first write $f_x = f_x^{\Pi}$ for fx obtained
from Π as in (15.18). One shows recursively that

\[ f_x^{\Pi^g} = f_x^{\Pi} M_g . \]

One then uses (i), on T , to show that the required identity for $\Pi^g_z$ holds. Finally, one
uses (i), on T + , to show that if one views Mg = (g ⊗ Id)∆− as acting on T + , then
its action distributes over the product in the character group defined in (15.23) in the
sense that (Mg f ) ◦ (Mg f̄ ) = Mg (f ◦ f̄ ), which then implies the required identity
for $\gamma^g_{xy}$. The fact that the action of Mg does not decrease degrees guarantees that
(Π g , Γ g ) is again a model (since (Π, Γ ) is). □

Remark 15.22. In general (i.e. in the case of similar regularity structures set up
for different examples of subcritical semilinear SPDEs), the cointeraction property
(15.29) may fail. It turns out however that it can still be rescued by working in a
suitably extended regularity structure, see [Hai16, BHZ19].
One important feature of this theorem is that the last statement provides quantita-
tive bounds on the map Π 7→ Π g which show that it can be extended to a continuous
action of R onto the space M of all admissible models. A crucial property of R is
that it is sufficiently large to allow us to “recenter” models in a natural way.
Definition 15.23. Let ξ be a (smooth) stationary stochastic process and let Π be its
canonical lift. Then, there exists a unique character g BPHZ ∈ R such that Π BPHZ =
ΠMgBPHZ satisfies E(Π BPHZ τ )(0) = 0 for every canonical basis vector τ ∈ T with
deg τ < 0. We call Π BPHZ the BPHZ lift of ξ.
Remark 15.24. This is named after Bogoliubow, Parasiuk, Hepp and Zimmermann
[BP57, Hep69, Zim69] who introduced an analogous renormalisation procedure in
the context of perturbative quantum field theory in the sixties.
Remark 15.25. Note also that while the BPHZ lift of a noise ξ is “canonical”, it
does depend on the choice of kernel K for our notion of admissibility. In particular,
different truncations of the heat kernel will in general lead to different values for the
BPHZ renormalisation constants.
A beautiful property of the BPHZ lift is that it is much more stable than the
canonical lift. Indeed, it was shown in [CH16] that one can introduce a natural
measure of the “size” N (ξ) of a stationary noise ξ which is such that for any sequence
ξn with supn N (ξn ) < ∞ and ξn → ξ in probability as random distributions, the
corresponding BPHZ lifts $\Pi^{\mathrm{BPHZ}}_n$ converge to a limiting model Π BPHZ . This limiting
model is furthermore independent of the choice of approximating sequence.

15.5.2 The renormalised equations

As introduced, the renormalisation group R for KPZ is a Lie group of dimension


8, equal to the number of symbols ( , , , , , , , ) used to generate T − .
As already hinted in Example 15.15 above, we will not need to renormalise planted
trees, nor the noise symbol itself, nor symbols with three leaves (cubic in Gaussian
noise, hence of zero mean, so that the BPHZ condition is trivially satisfied). We thus
define a character g on T − by specifying

g( ) = C0 , g( ) = C1 , g( ) = C2 , g( ) = C3 , (15.30)

and setting g to vanish on the remaining symbols, which require no renormalisation. The
resulting renormalisation map M : T → T is then given by M := (g ⊗ Id)∆− .
(It turns out that we only need a three-parameter subgroup of R to renormalise the
equation, but in order to explain the procedure we prefer to work with the larger
4-dimensional subgroup of R.) It is now rather straightforward to show the following:

Proposition 15.26. Let M := (g ⊗ Id)∆− with g as specified in (15.30) and let


(Π M , Γ M ) = M L ξ, where L ξ is the canonical lift of some smooth function ξ. Let
furthermore H be the solution to (15.8) with respect to the model (Π M , Γ M ). Then,
writing RM for the reconstruction operator associated to this renormalised model,
the function h(t, x) = RM H (t, x) solves the equation


∂t h = ∂x2 h + (∂x h)2 − 4C0 ∂x h + ξ − (C1 + C2 + 4C3 ) .

Proof. By Theorem 14.5, it turns out that (15.8) can be solved in D γ as soon as γ is
a little bit greater than 3/2. Therefore, we only need to keep track of its solution H
up to terms of degree 3/2. By repeatedly applying the identity (15.9), we see that the
solution H ∈ D γ for γ close enough to 3/2 is necessarily of the form

H = h1 + + + h′ X1 + 2 + 2h′ ,

for some real-valued functions h and h′ . (Note that h′ is treated as an independent


function here, we certainly do not suggest that the function h is differentiable! Our
notation is only by analogy with the classical Taylor expansion.) As an immediate
consequence, ∂H is given by

∂H = + + h′ 1 + 2 + 2h′ , (15.31)

as an element of D γ for γ sufficiently close to 1/2. Similarly, the right-hand side of


the equation is given up to order 0 by

(∂H)2 +Ξ = Ξ + +2 +2h′ + +4 +2h′ +4h′ +(h′ )2 1 . (15.32)

It follows from the definition of M that one then has the identity

M ∂H = ∂H − 4C0 ,
so that, as an element of D γ with very small (but positive) γ, one has the identity

(M ∂H)2 = (∂H)2 − 8C0 .

As a consequence, after neglecting all terms of strictly positive order, one has the
identity (writing c instead of c1 for real constants c)

M((∂H)² + Ξ) = (∂H)² + Ξ − C0 (4 + 4 + 8 + 4h′ 1) − C1 − C2 − 4C3
             = (M ∂H)² + Ξ − 4C0 M ∂H − (C1 + C2 + 4C3) .

Combining this with the fact that M and ∂ commute, the claim now follows at once.


Remark 15.27. It turns out that, thanks to the symmetry x 7→ −x enjoyed by our
problem, the corresponding model can be renormalised by a map M as above, but
with C0 = 0. The reason why we considered the general case here is twofold. First, it
shows that it is possible to obtain renormalised equations that differ from the original
equation in a more complicated way than just by the addition of a large constant.
Second, if one tries to approximate the KPZ equation by a microscopic model which
is not symmetric under space inversion, then the constant C0 plays a non-trivial role,
see for example [HS17].

15.5.3 Convergence of the renormalised models

It remains to argue why one expects to be able to find constants Ciε such that the
sequence of renormalised models M ε L ξε with $M^\varepsilon = \exp\bigl(\sum_{i=1}^{3} C_i^\varepsilon L_i\bigr)$ converges
to a limiting model. Instead of considering the actual sequence of models, we only
consider the sequence of stationary processes Π̂ ε τ := Π ε M ε τ , where Π ε is
associated to (Π ε , Γ ε ) = L ξε as in Section 15.5.1.
Remark 15.28. It is important to note that we do not attempt here to give a full proof
that the renormalised model converges to a limit in the correct topology for the space
of admissible models. We only aim to argue that it is plausible that Π̂ ε converges
to a limit in some topology. A full proof of convergence (but in a slightly different
setting) can be found in [Hai13], see also [Hai14b, Section 10] and [CH16] for the most
general statements.
Since there are general arguments available to deal with all the expressions τ
of positive degree as well as expressions of the type I ′ (τ ) and Ξ itself, we restrict
ourselves to those that remain. Inspecting (15.11), we see that they are given by

, , , , .

For this part, some elementary notions from the theory of Wiener chaos expansions
are required, but we’ll try to hide this as much as possible. At a formal level, one has
the identity
Π ε = K ′ ∗ ξε = Kε′ ∗ ξ ,
where the kernel Kε′ is given by Kε′ = K ′ ∗ δε . This shows that, at least formally,
one has

\[ \bigl(\Pi^{\varepsilon}\ \ \bigr)(z) = \bigl(K' * \xi_\varepsilon\bigr)(z)^2 = \iint K'_\varepsilon(z - z_1)\, K'_\varepsilon(z - z_2)\, \xi(z_1)\, \xi(z_2)\, dz_1\, dz_2 . \]


Similar but more complicated expressions can be found for any formal expression τ .
This naturally leads to the study of random variables of the type

\[ I_k(f) = \int \cdots \int f(z_1, \ldots, z_k)\, \xi(z_1) \cdots \xi(z_k)\, dz_1 \cdots dz_k . \tag{15.33} \]

Ideally, one would hope to have an Itô isometry of the type EIk (f )Ik (g) =
⟨f sym , g sym ⟩, where ⟨·, ·⟩ denotes the L2 -scalar product and f sym denotes the sym-
metrisation of f . This is unfortunately not the case. Instead, one should replace the
products in (15.33) by Wick products, which are formally generated by all possible
contractions of the type

ξ(zi )ξ(zj ) 7→ ξ(zi ) ⋄ ξ(zj ) + δ(zi − zj ) .

If we then set

\[ \hat I_k(f) = \int \cdots \int f(z_1, \ldots, z_k)\, \xi(z_1) \diamond \cdots \diamond \xi(z_k)\, dz_1 \cdots dz_k , \]

one has indeed

\[ \mathbf{E}\, \hat I_k(f)\, \hat I_k(g) = \langle f^{\mathrm{sym}}, g^{\mathrm{sym}} \rangle . \]

Furthermore, one has equivalence of moments in the sense that, for every k > 0 and
p > 0 there exists a constant Ck,p such that

\[ \mathbf{E}\, |\hat I_k(f)|^p \le C_{k,p}\, \|f^{\mathrm{sym}}\|^p . \]

Finally, one has $\mathbf{E}\, \hat I_k(f)\, \hat I_\ell(g) = 0$ if k ≠ ℓ. Random variables of the form $\hat I_k(f)$ for
some k ≥ 0 and some square integrable function f are said to belong to the kth
homogeneous Wiener chaos.
Returning to our problem, we first argue that it should be possible to choose M ε
in such a way that Π̂ ε  converges to a limit as ε → 0. The above considerations
suggest that one should rewrite Π ε as

\[ \bigl(\Pi^{\varepsilon}\ \ \bigr)(z) = \bigl(K' * \xi_\varepsilon\bigr)(z)^2 = \iint K'_\varepsilon(z - z_1)\, K'_\varepsilon(z - z_2)\, \xi(z_1) \diamond \xi(z_2)\, dz_1\, dz_2 + C_\varepsilon^{(1)} , \tag{15.34} \]

where the constant $C_\varepsilon^{(1)}$ is given by the contraction

\[ C_\varepsilon^{(1)} \overset{\mathrm{def}}{=} \int \bigl(K'_\varepsilon(z)\bigr)^2\, dz . \]


Note now that Kε′ is an ε-approximation of the kernel K ′ which has the same singular
behaviour as the derivative of the heat kernel. In terms of the parabolic distance, the
singularity of the derivative of the heat kernel scales like $K'(z) \sim |z|^{-2}$ for z → 0.
(Recall that we consider the parabolic distance $|(t, x)| = \sqrt{|t|} + |x|$, so that this is
consistent with the fact that the derivative of the heat kernel is bounded by $t^{-1}$.) This
suggests that one has $\bigl(K'_\varepsilon(z)\bigr)^2 \sim |z|^{-4}$ for |z| ≫ ε. Since parabolic space-time has
scaling dimension 3 (time counts double!), this is a non-integrable singularity. As a
matter of fact, there is a whole power of z missing to make it borderline integrable,
which suggests that one has

\[ C_\varepsilon^{(1)} \sim \frac{1}{\varepsilon} . \]

This already shows that one should not expect Π ε  to converge to a limit as ε → 0.
However, it turns out that the first term in (15.34) converges to a distribution-valued
stationary space-time process, so that one would like to somehow get rid of this
diverging constant $C_\varepsilon^{(1)}$. This is exactly where the renormalisation map M ε (in
particular the factor exp(−C1 L1 )) enters into play. Following the above definitions,
we see that one has

\[ \bigl(\hat\Pi^{\varepsilon}\ \ \bigr)(z) = \bigl(\Pi^{\varepsilon} M\ \ \bigr)(z) = \bigl(\Pi^{\varepsilon}\ \ \bigr)(z) - C_1 . \]

This suggests that if we make the choice $C_1 = C_\varepsilon^{(1)}$, then Π̂ ε  does indeed converge
to a non-trivial limit as ε → 0. This limit is a distribution given, at least formally, by

\[ \bigl(\hat\Pi\ \ \bigr)(\psi) = \iint \Bigl( \int \psi(z)\, K'(z - z_1)\, K'(z - z_2)\, dz \Bigr)\, \xi(z_1) \diamond \xi(z_2)\, dz_1\, dz_2 . \]


Using again the scaling properties of the kernel K ′ , it is not too difficult to show that
this yields indeed a random variable belonging to the second homogeneous Wiener
chaos for every choice of smooth test function ψ.
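
The 1/ε blow-up of the contraction constant discussed above can be checked explicitly in a simplified setting. The sketch below is an illustration under assumptions that differ from the text: K is taken to be the full (untruncated) heat kernel of ∂t − ∂x², and the mollifier acts in space only and is a centred Gaussian of variance ε². Under these assumptions Kε′(t, ·) is the derivative of a Gaussian density of variance 2t + ε², whose squared L² norm in space equals 1/(4√π (2t + ε²)^{3/2}), and integrating over t > 0 gives exactly 1/(4√π ε). The function names are hypothetical.

```python
import math


def spatial_norm_sq(t, eps):
    """Integral over x of the squared derivative of a Gaussian density
    with variance 2t + eps^2 (closed form, see the lead-in)."""
    s = 2.0 * t + eps ** 2
    return 1.0 / (4.0 * math.sqrt(math.pi) * s ** 1.5)


def contraction_constant(eps, n=100_000):
    """Midpoint rule for the integral of spatial_norm_sq over t in (0, inf).

    The substitution t = eps^2 * u / (1 - u) maps (0, 1) onto (0, inf).
    """
    total = 0.0
    for i in range(n):
        u = (i + 0.5) / n
        t = eps ** 2 * u / (1.0 - u)
        jacobian = eps ** 2 / (1.0 - u) ** 2
        total += spatial_norm_sq(t, eps) * jacobian
    return total / n


for eps in (0.1, 0.01, 0.001):
    # eps * C stays close to 1 / (4 sqrt(pi)), i.e. C grows like 1 / eps
    print(eps, eps * contraction_constant(eps))
```

The exact scaling in this toy case reflects the parabolic scaling of the heat kernel; for a truncated kernel and a space-time mollifier one would only expect the same leading behaviour, not the same constant.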
The case τ = is treated in a somewhat similar way. This time one has

\[ \bigl(\Pi^{\varepsilon}\ \ \bigr)(z) = \bigl(K' * \xi_\varepsilon\bigr)(z)\, \bigl(K' * K' * \xi_\varepsilon\bigr)(z) = \iint K'_\varepsilon(z - z_1)\, (K' * K'_\varepsilon)(z - z_2)\, \xi(z_1) \diamond \xi(z_2)\, dz_1\, dz_2 + C_\varepsilon^{(0)} , \]

where the constant $C_\varepsilon^{(0)}$ is given by the contraction

\[ C_\varepsilon^{(0)} \overset{\mathrm{def}}{=} \int K'_\varepsilon(z)\, (K' * K'_\varepsilon)(z)\, dz . \]

This time however Kε′ is an odd function (in the spatial variable) and K ′ ∗ Kε′ is an
even function, so that $C_\varepsilon^{(0)}$ vanishes for every ε > 0. This is why we can set C0 = 0
and no renormalisation is required for .

Turning to our list of terms of negative degree, it remains to consider , , and


. It turns out that the latter two are the more difficult ones, so we only discuss
these. Let us first argue why we expect to be able to choose the constant C2 in such
a way that Π̂ ε  converges to a limit. In this case, the “bad” term comes from the
part of Π ε (z) belonging to the homogeneous chaos of order 0. This is simply
a constant, which is given by

\[ C_\varepsilon^{(2)} \overset{\mathrm{def}}{=} 2 \int K'(z)\, K'(\bar z)\, Q_\varepsilon^2(z - \bar z)\, dz\, d\bar z , \tag{15.35} \]

where the kernel Qε is given by

\[ Q_\varepsilon(z) = \int K'_\varepsilon(\bar z)\, K'_\varepsilon(\bar z - z)\, d\bar z . \]
Remark 15.29. The factor 2 comes from the fact that the contraction (15.35) appears
twice, since it is equal to the contraction . In principle, one would think that the
contraction  also contributes to $C_\varepsilon^{(2)}$. This term however vanishes due to the fact
that the integral of Kε′ vanishes.

Since Kε′ is an ε-mollification of a kernel with a singularity of order −2 and
the scaling dimension of the underlying space is 3, we see that Qε behaves like an
ε-mollification of a kernel with a singularity of order −2 − 2 + 3 = −1 at the origin.
As a consequence, the singularity of the integrand in (15.35) is of order −6, which
gives rise to a logarithmic divergence as ε → 0. This suggests that one should choose
$C_2 = C_\varepsilon^{(2)}$ in order to cancel out this diverging term and obtain a non-trivial limit
for Π̂ ε  as ε → 0. This is indeed the case.
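
The power counting used in the preceding paragraphs is mechanical enough to be written down as a small helper. The following Python sketch is a heuristic under stated assumptions: the function names are hypothetical, the rule for convolutions is only applied in the regime where both kernels are locally integrable and the result is still singular, and only the overall degree of divergence is computed (possible subdivergences are ignored).

```python
SCALING_DIM = 3  # parabolic space-time: time counts double, space counts once


def convolution_order(a, b, dim=SCALING_DIM):
    """Singularity order of the convolution of kernels of orders a and b."""
    if not (a > -dim and b > -dim and a + b + dim < 0):
        raise ValueError("rule only used for integrable kernels "
                         "whose convolution is still singular")
    return a + b + dim


def divergence(order, n_variables, dim=SCALING_DIM):
    """Behaviour as eps -> 0 of the integral of an eps-mollified singularity.

    `order` is the total singularity order of the integrand and
    `n_variables` the number of space-time variables integrated over.
    """
    exponent = order + n_variables * dim
    if exponent > 0:
        return "finite"
    if exponent == 0:
        return "log(1/eps)"
    return f"eps^{exponent}"


K_PRIME = -2                                 # singularity order of K'
Q = convolution_order(K_PRIME, K_PRIME)      # order of Q_eps: -2 - 2 + 3
print(divergence(2 * K_PRIME, 1))            # integrand (K'_eps)^2
print(divergence(2 * K_PRIME + 2 * Q, 2))    # integrand K' K' Q_eps^2
```

The two printed cases correspond to the integrals of (Kε′)² over one space-time variable and of the integrand in (15.35) over two.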
We finally turn to the case τ = . In this case, there are “bad” terms appearing in
the Wiener chaos decomposition of Π ε both in the second and the zeroth Wiener
chaos. This time, the constant appearing in the zeroth Wiener chaos is given by

\[ C_\varepsilon^{(3)} \overset{\mathrm{def}}{=} 2 \int K'(z)\, K'(\bar z)\, Q_\varepsilon(\bar z)\, Q_\varepsilon(z + \bar z)\, dz\, d\bar z , \]

which diverges logarithmically for exactly the same reason as $C_\varepsilon^{(2)}$. Setting
$C_3 = C_\varepsilon^{(3)}$, this diverging constant can again be cancelled out. The combinatorial factor 2
arises in essentially the same way as for and the contribution of the term where
the two top nodes are contracted vanishes for the same reason as previously.
It remains to consider the contribution of Π ε to the second Wiener chaos. This
contribution consists of three terms, which correspond to the contractions

It turns out that the first one of these terms does not give rise to any singularity. The
last two terms can be treated in essentially the same way, so we focus on the last one,
which we denote by η ε . For fixed ε, the distribution (actually smooth function) η ε is
given by

\[ \eta^\varepsilon(\psi) = \int \psi(z_0)\, K'(z_0 - z_1)\, Q_\varepsilon(z_0 - z_1)\, K'(z_2 - z_1)\, K'_\varepsilon(z_3 - z_2)\, K'_\varepsilon(z_4 - z_2)\, \xi(z_3) \diamond \xi(z_4)\, dz . \]

The problem with this is that as ε → 0, the product Q̂ε := K ′ Qε converges to a


kernel Q̂ = K ′ Q, which has a non-integrable singularity at the origin. In particular,
it is not clear a priori whether the action of integrating a test function against Q̂ε
converges to a limiting distribution as ε → 0. Our saving grace here is that since Qε
is even and K ′ is odd, the kernel Q̂ε integrates to 0 for every fixed ε.
This is akin to the problem of making sense of the “Cauchy principal value”
distribution, which formally corresponds to the integration against 1/x. For the sake
of the argument, let us consider a function W : R → R which is compactly supported
and smooth everywhere except at the origin, where it diverges like |W (x)| ∼ 1/|x|.
It is then natural to associate to W a “renormalised” distribution RW given by

\[ (\mathcal{R}W)(\varphi) = \int W(x)\, \bigl(\varphi(x) - \varphi(0)\bigr)\, dx . \]

Note that RW has the property that if φ(0) = 0, then it simply corresponds to
integration against W , which is the standard way of associating a distribution to
a function. Furthermore, the above expression is always well-defined, since φ is
smooth and therefore the factor (φ(x) − φ(0)) cancels out the singularity of W at
the origin. It is also straightforward to verify that if Wε is a sequence of smooth
approximations to W (say one has Wε (x) = W (x) for |x| > ε and |Wε | ≲ 1/ε
otherwise) which has the property that each Wε integrates to 0, then W ε → RW in
a distributional sense.
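
The convergence of mean-zero regularisations towards the renormalised distribution can be observed numerically in a toy case. The sketch below rests on choices that are not in the text: W(x) = 1/x for 0 < |x| ≤ 1 and W = 0 otherwise (so W is not smooth at |x| = 1, which is harmless here), the regularisation equals x/ε² for |x| ≤ ε (odd, hence integrating to 0, and bounded by 1/ε), and the test function is φ(x) = cos x + sin x. All names are hypothetical.

```python
import math


def midpoint(f, a, b, n=100_000):
    """Midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def phi(x):
    """Smooth test function with phi(0) = 1."""
    return math.cos(x) + math.sin(x)


def W(x):
    """Kernel with a 1/x singularity at the origin, supported in [-1, 1]."""
    return 1.0 / x if 0.0 < abs(x) <= 1.0 else 0.0


def W_eps(x, eps):
    """Regularisation: equals W for |x| > eps, linear (odd) for |x| <= eps."""
    return W(x) if abs(x) > eps else x / eps ** 2


def renormalised_pairing():
    """(RW)(phi): integrate W(x) * (phi(x) - phi(0)) over [-1, 1]."""
    return midpoint(lambda x: W(x) * (phi(x) - phi(0.0)), -1.0, 1.0)


def regularised_pairing(eps):
    """Integral of W_eps(x) * phi(x) over [-1, 1]."""
    return midpoint(lambda x: W_eps(x, eps) * phi(x), -1.0, 1.0)


limit = renormalised_pairing()
for eps in (0.1, 0.01, 0.001):
    # the difference to the renormalised pairing shrinks as eps decreases
    print(eps, regularised_pairing(eps) - limit)
```

An even number of cells is used so that the origin is never a midpoint; since each regularisation integrates to 0, subtracting φ(0) changes nothing at the level of the regularised pairings, which is the mechanism behind the convergence.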
In the same way, one can show that Q̂ε converges as ε → 0 to a limiting distribu-
tion R Q̂. As a consequence, one can show that η ε converges to a limiting (random)
distribution η given by
Z
η(ψ) = ψ(z0 ) R Q̂(z0 −z1 )K ′ (z2 −z1 )K ′ (z3 −z2 )K ′ (z4 −z2 ) ξ(z3 )⋄ξ(z4 ) dz .

It should be clear from this whole discussion that while the precise values of the
constants Ci depend on the details of the mollifier δε , the limiting (random) model
(Π̂, Γ̂ ) obtained in this way is independent of it. Combining this with the continuity
of the solution to the fixed point map (15.8) and of the reconstruction operator R
with respect to the underlying model, we see that the statement of Theorem 15.2
follows almost immediately.

15.6 The KPZ equation and rough paths

In the particular case of the KPZ equation, it turns out that it is possible to give a robust
solution theory by only using “classical” controlled rough path theory, as exposed in
the earlier part of this book. This is actually how it was originally treated in [Hai13].
To see how this can be the case, we make the following crucial remarks:
1. First, looking at the expression (15.31) for ∂H, we see that most symbols come
with constant coefficients. The only non-constant coefficients that appear are
in front of the term 1, which is some kind of renormalised value for ∂H, and
in front of the term . This suggests that the problem of finding a solution h to
the KPZ equation (or equivalently a solution h′ to the corresponding Burgers’
equation) can be simplified considerably by considering instead the function v
given by 
v = ∂x h − Π + + 2 , (15.36)
where Π is the operator given by (15.12–15.14).
2. The only symbol τ appearing in ∂H such that deg τ + deg < 0 is the symbol
. Furthermore, one has

∆1 = 1 ⊗ 1 , ∆ = ⊗ 1 + 1 ⊗ J ′( ) ,
∆ = ⊗1, ∆ = ⊗ 1 + ⊗ J ′( ) .

It then follows from this and the definition (15.16) of the structure group G that
the space ⟨ , , 1, ⟩ ⊂ T is invariant under the action of G. Furthermore, its ac-
tion on this subspace is completely described by one real number corresponding
to J ′ ( ). Finally, viewing this subspace as a regularity structure in its own right,
we see that it is nothing but the regularity structure of Section 13.3.2, provided
that we make the identifications ∼ Ẇ , ∼ W , and ∼ 𝕎̇.
3. One has the identities

∆ = ⊗ 1 + ⊗ J ′( ), ∆ = ⊗ 1 + ⊗ J ′( ),

so that the pair of symbols { , } could also have played the role of {W , 𝕎̇}
in the previous remark.
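The one-parameter action described in the second remark can be sketched as a family of 4 × 4 matrices. The sketch assumes, as the coaction identities above suggest, that the group element with parameter h sends W to W + h1 and 𝕎̇ to 𝕎̇ + hẆ while fixing Ẇ and 1. The ASCII labels and the helper names gamma and matmul are hypothetical.

```python
# Basis order with hypothetical ASCII labels for the symbols that the text
# identifies with Ẇ, 𝕎̇, 1 and W (degrees increase from left to right).
BASIS = ("Wdot", "WWdot", "one", "W")

def gamma(h):
    """Matrix of the group element with parameter h.

    Column j holds the image of the j-th basis symbol:
    WWdot -> WWdot + h * Wdot and W -> W + h * one.
    """
    return [
        [1, h, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 1, h],
        [0, 0, 0, 1],
    ]

def matmul(a, b):
    """Plain matrix product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

if __name__ == "__main__":
    # The parameters simply add under composition.
    print(matmul(gamma(2), gamma(3)) == gamma(5))
```

Under this assumption the matrices are upper triangular with unit diagonal, and composition corresponds to adding the parameters, so the action on the sector is that of the additive group of real numbers.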
Let now ξ be a smooth function and let h be given by the solution to the unrenor-
malised KPZ equation (15.1). Defining Π by ΠΞ = ξ and then recursively as in
(15.13), and defining v by (15.36), we then obtain for v the equation

∂t v = ∂x2 v + ∂x v Π + 4 Π

+R, (15.37)

where the “remainder” R belongs to C α for every α < −1. Similarly to before, it also
turns out that if we replace Π by Π̂ = Π M defined as in (15.24) (with C0 = 0) and
take h to be the solution to the renormalised KPZ equation (15.6) with Cε = C1 + C2 + 4C3 ,
then v also satisfies (15.37), but with Π replaced by the renormalised model Π̂.

We are now in the following situation. As a consequence of (15.31) we can guess
that for any fixed time t, the solution v should be controlled by the function Π̂ ,
which we can interpret as one component (say W 1 ) of some rough path (W, 𝕎).
Note that here the spatial variable plays the role of time! The time variable merely
plays the role of a parameter, so we really have a family of rough paths indexed
by time. Furthermore, Π̂ can be interpreted as the distributional derivative of
another component (say W 0 ) of the rough path W . Finally, the function Π̂ can be
interpreted as a third component W 2 of W .
As a consequence of the second and third remarks above, the two distributions
Π̂ and Π̂ can then be interpreted as the distributional derivatives of the “iterated
integrals” 𝕎^{1,0} and 𝕎^{2,1}. It follows automatically from these algebraic relations
combined with the analytic bounds (13.13) that 𝕎^{1,0} and 𝕎^{2,1} then satisfy the
required estimates (2.3). Our model does not provide any values for 𝕎^{1,2}, but these
turn out not to be required. Assuming that v is indeed controlled by X1 = Π̂ , it
is then possible to give meaning to the term v Π appearing in (15.37) by using
“classical” rough integration.
As a consequence, we then see that the right-hand side of (15.37) is of the form
∂x2 Y , for some function Y controlled by W 0 . One of the main technical results of
[Hai13] guarantees that if Z solves

∂t Z = ∂x2 Z + ∂x2 Y ,

and Y is controlled by W 0 , then Z is necessarily controlled by W 1 = Π̂ . This
“closes the loop” and allows one to set up a fixed point equation for v that is stable as
a function of the underlying model Π̂ and therefore also allows one to deal with the
limiting case of the KPZ equation driven by space-time white noise.

15.7 Exercises

Exercise 15.1 (KPZ Structure Group) Consider the 16-dimensional KPZ regular-
ity structure with T = TKPZ given by

T = ⟨ Ξ, , , , , , , , 1, , , , , X1 , , ⟩.

Show that the structure group G is a 7-dimensional (non-commutative) Lie group, an
element Γ ∈ G ⊂ L(T, T ) of which has the upper triangular matrix representation

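[Matrix display only partly recoverable. What survives: a 16 × 16 matrix with rows and columns indexed by the basis of T in the order listed above, with 1 in every diagonal entry. The fourth row carries c1 and c2 to the right of the diagonal; the row of 1 carries c1, . . . , c7 in the seven columns to the right of the diagonal; the row of X1 carries c1 and c2 in the last two columns.]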

where empty entries mean zeros. Note that the upper-triangular form reflects the fact
that Γ − Id is only allowed to produce lower order terms. (Remark: It is immediate
from this representation that ⟨ , , 1, ⟩ and ⟨ , , 1, ⟩ are indeed sectors, with
“rough path” index set {−1/2⁻, 0⁻, 0, 1/2⁻}, and action of the structure group exactly
as in the rough path case (13.12), with “h” replaced by c1 and c2 , respectively.)

Solution. We first derive the coaction on all the symbols, and here prefer to write ∆
for the coaction and keep ∆+ for the coproduct on T + . By definition of the coaction,
∆(Ξ) = Ξ ⊗ 1 and
∆(I′(Ξ)) = I′(Ξ) ⊗ 1 + ∑_{k ∈ ℕ²} (X^k / k!) ⊗ J′_k(Ξ) = I′(Ξ) ⊗ 1 ,

since deg J′_k(Ξ) = deg J_{k+(0,1)}(Ξ) = deg Ξ + 1 − |k| < 0 so that J′_k(Ξ) = 0.
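The degree bookkeeping behind this vanishing can be sketched in a few lines. The sketch assumes deg Ξ = −3/2 − κ for an arbitrarily small κ > 0, that abstract integration raises the degree by 2, and that each spatial derivative or unit of |k| lowers it by 1. The pair encoding (r, m) for r − mκ and all function names are hypothetical, and the composite symbol at the end is only an illustrative choice built from three copies of Ξ and four applications of I′.

```python
from fractions import Fraction

# A degree r - m*kappa (kappa > 0 arbitrarily small) is stored as (r, m).
DEG_XI = (Fraction(-3, 2), 1)  # assumed: deg Xi = -3/2 - kappa

def product(a, b):
    """Degrees add when symbols are multiplied."""
    return (a[0] + b[0], a[1] + b[1])

def integrate(a, spatial_derivatives=0, k_scaled=0):
    """Degree of J_k (or J'_k) of a symbol of degree a.

    Gains 2 from the kernel, loses 1 per spatial derivative and
    k_scaled = |k| from the polynomial index.
    """
    return (a[0] + 2 - spatial_derivatives - k_scaled, a[1])

def is_positive(a):
    """r - m*kappa > 0 for all small kappa > 0 exactly when r > 0 (m >= 0)."""
    return a[0] > 0

if __name__ == "__main__":
    d_prime = integrate(DEG_XI, spatial_derivatives=1)  # deg J'(Xi)
    d_plain = integrate(DEG_XI)                         # deg J(Xi)
    print(d_prime, is_positive(d_prime))
    print(d_plain, is_positive(d_plain))
```

For instance, with s = integrate(DEG_XI, spatial_derivatives=1), the composite degree product(s, integrate(product(s, s), spatial_derivatives=1)) is −1/2 − 3κ, and one further application of integrate with one spatial derivative gives 1/2 − 3κ, which is positive.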
Similarly, write ∆ instead of ∆+ for better readability,

∆( ) = ∆( )∆( ) = ( ⋆ ) ⊗ 1 = ⊗ 1,
= ∆I ′ ( ) = . . . = ⊗ 1,


 
∆ = ∆( )∆ = ... = ⊗ 1,
  
∆ =∆ ∆ = ⊗ 1,
 


∆ = ⊗1+1⊗J ,
   
⊗1+ ⊗J′

∆ = ∆( )∆ = .

Note the interpretation of cutting off positive branches: deg J′( ) = 1 + 3(−3/2) + 4 = 1/2⁻ > 0,
and also deg J′( ) = 1/2⁻ as seen in

⊗ 1 + 1 ⊗ J ′ ( ),

∆ =
∆( ) = ⊗ 1 + ⊗ J ′ ( ).
 
∆ =∆

To deal with = I(Ξ), note deg J (Ξ) > 0, deg J ′ (Ξ) < 0 so that the latter term
does not figure (same reasoning for = I( )), and obtain

∆( ) = ⊗ 1 + 1 ⊗ J (Ξ),
∆( ) = ⊗ 1 + 1 ⊗ J ( ).

By definition, ∆X1 = X1 ⊗ 1 + 1 ⊗ X1 . Next consider and . In view of


| and |J ′

|J | > 0 we have (same reasoning for ),
 
= ⊗ 1 + 1 ⊗ J ( ) + X1 ⊗ J ′

∆ ,
= ⊗ 1 + 1 ⊗ J ( ) + X1 ⊗ J ′ ( ),


Inspecting the above reveals that we need 1 and then the following 7 “positive”
symbols (also viewable as trees) in T + ,

J′( ), J′( ), J(Ξ), J( ), X1 , J( ), J( ), (15.38)

of respective homogeneities 1/2⁻, 1/2⁻, 1/2⁻, 1⁻, 1, 3/2⁻, 3/2⁻. On the other hand, T + was
introduced abstractly as the free commutative algebra generated by all of the above
symbols (with unit element 1). Even upon truncation, say T + = T^+_{<3/2} with abusive
notation, this leaves us with 10 + 4 + 1 = 15 generating symbols,

J ′( ), J ′ ( ), . . . , J ′ ( ); J (Ξ), . . . , J ( ); X1 (15.39)

(of which only 7 are needed). Of course, T + also contains (free) products such as
J ′ ( )J ′ ( ), X1 J ′ ( ), J ′ ( )J ( , ) (all of degree < 3/2), however by working
in T these did not appear as “right-hand side”-image of ∆ above.
Consider now a character of the algebra T + ; that is, an element g ∈ (T + )∗ ,
so that g(1) = 1 and g(σσ̄) = g(σ)g(σ̄). (Actually, in view of the truncation we
impose this only for σ, σ̄ with deg(σσ̄) = deg σ + deg σ̄ < 3/2.) Such g is obviously
determined by its value on each of the 15 basis symbols listed in (15.39). Now T +
can be given a Hopf structure, with coproduct ∆+ and antipode, so that the set of
characters forms the group G+ , with product given by
X
(f ◦ g)(σ) = (f ⊗ g)∆+ σ = ⟨f, σ ′ ⟩⟨g, σ ′′ ⟩;
(σ)

inverses are given in terms of the antipode. One thus sees that G+ is a 15-dimensional
(Lie) group. However, only a 7-dimensional subgroup is needed, for we only care
about the 7 values arising from (15.38), which we call

c1 = g(J′( )), . . . , c7 = g(J( )).

It then follows from Γg := (Id ⊗ g)∆ that Γg : T → T acts as identity on all
symbols other than
 
Γg = + g(J ′ ( ))1 ≡ + c1 1,
 
Γg = + g(J ′ ( )) = + c1 ,


Γg = + g(J ( ))1 = + c2 1,

Γg = + c2 ,
Γg ( ) = + g(J (Ξ))1 = + c3 1,
Γg ( ) = + g(J ( ))1 = + c4 1,
Γg (X1 ) = X1 + c5 1,
Γg ( ) = + c6 1 + c1 X1 ,
Γg ( ) = + c7 1 + c2 X1 .

The matrix representation of Γg is then immediate.
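As a partial check of the group structure, the following sketch builds the 8 × 8 block of Γg acting on the span of 1 and the seven basis symbols listed after it, using only the formulas above in which the image involves 1 and X1. The ASCII labels s1, ..., s4, I_Xi, I_s and the helpers gamma, matmul and compose are hypothetical; the tree symbols themselves are not reproduced.

```python
# Hypothetical labels for 1 and the seven basis symbols that follow it in T.
BASIS = ("one", "s1", "s2", "I_Xi", "I_s", "X1", "s3", "s4")
ONE, X1 = 0, 5

def gamma(c):
    """8 x 8 block of Gamma_g for c = (c1, ..., c7); column j is the image of BASIS[j]."""
    c1, c2, c3, c4, c5, c6, c7 = c
    n = len(BASIS)
    m = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    # Every symbol other than 1 picks up a multiple of 1.
    for col, val in zip(range(1, n), (c1, c2, c3, c4, c5, c6, c7)):
        m[ONE][col] = val
    # The last two symbols also pick up multiples of X1.
    m[X1][6] = c1
    m[X1][7] = c2
    return m

def matmul(a, b):
    """Plain matrix product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def compose(c, d):
    """Parameters of gamma(c) * gamma(d), i.e. gamma(d) applied first."""
    out = [ci + di for ci, di in zip(c, d)]
    out[5] += c[4] * d[0]  # c6-slot gains c5 * d1
    out[6] += c[4] * d[1]  # c7-slot gains c5 * d2
    return tuple(out)

if __name__ == "__main__":
    c, d = (1, 2, 3, 4, 5, 6, 7), (7, 6, 5, 4, 3, 2, 1)
    print(matmul(gamma(c), gamma(d)) == gamma(compose(c, d)))
    print(matmul(gamma(c), gamma(d)) == matmul(gamma(d), gamma(c)))
```

In this block the parameters add under composition except for the cross terms c5·d1 and c5·d2, which is where the non-commutativity of the group shows up.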

Exercise 15.2 (KPZ Renormalisation Group) Consider again the 16-dimensional
KPZ regularity structure with structure space T = TKPZ . The renormalisation group
was defined as a subgroup R ⊂ L(T, T ), given by Mg τ = (g ⊗ Id)∆− τ , where g ranges
over the characters of T − . Consider more specifically M = Mg with g as specified
in (15.30), i.e. g( ) = C0 , g( ) = C1 , g( ) = C2 , g( ) = C3 and set to vanish
on the remaining symbols.
Show that this gives a subgroup of R which is a 4-dimensional (commutative)
Lie group, an element M ∈ R ⊂ L(T, T ) of which has lower triangular matrix
representation

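[Matrix display only partly recoverable. What survives: a 16 × 16 matrix, indexed as in Exercise 15.1, with 1 in every diagonal entry. To the left of the diagonal, rows 4, 8, 11 and 16 each carry one entry 2C0 , row 7 carries one entry C0 , and the row of 1 carries C1 , C2 , C3 and C0 .]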

Exercise 15.3 Show that the two procedures for recovering Π from the knowledge
of (Π, Γ ) outlined in Remark 15.13 and on page 301 are equivalent.

15.8 Comments

The original proof [Hai13] of well-posedness of the KPZ equation without using the
Cole–Hopf transform did not use regularity structures but instead viewed the solution
at any fixed time as a spatial rough path controlled by the solution to the linearised
equation, in the spirit of Section 12.3. An alternative approach using paracontrolled
distributions as developed in [GIP15] was used in [GP17] to obtain a number of
additional properties of the solutions, including a clean variational formulation.
Given that the KPZ equation is expected to enjoy a form of “universality”, a very
natural question is that of showing that “most” classes of interface fluctuation models
converge to it in the weakly asymmetric regime. The first result in this direction was
obtained by Bertini–Giacomin [BG97], but this relied crucially on a microscopic
version of the Hopf–Cole transform to show that the transformed particle system
converges to the multiplicative stochastic heat equation. A first more general result
was obtained by Jara–Gonçalves [GJ14] who showed that the large scale fluctuations
of a large number of particle systems solve the KPZ equation in a relatively weak
sense. It has been an open problem for quite some time now whether such a weak
notion of solution characterises solutions to the KPZ equation uniquely. Major progress in
this direction was obtained by Gubinelli–Perkowski [GP18] who showed that this
is indeed the case at stationarity under an additional structural assumption on the
generator of the particle system that can be verified for a number of systems of
interest.
On the other hand, a large class of interface fluctuation models that fall outside of
this approach is given by solutions to an equation of the type

∂t hε = ∂x2 hε + εF (∂x hε ) + η(t, x) , (15.40)

where η is a (smooth) space-time random field with sufficiently good mixing prop-
erties, F : R → R is an even function growing at infinity, and ε > 0 is a parameter
controlling the asymmetry of the problem. Under rather weak assumptions on η and F
one then expects to be able to find constants Cε such that ε−1/2 hε (ε−2 t, ε−1 x)−Cε t
converges to solutions to the KPZ equation. This was shown to be indeed the case in
various special cases of increasing generality in [HS17, HQ18, HX19, FG19]. (The
last reference treats a different class of models but its proofs could be adapted to the
setting of (15.40).)
There is a natural generalisation of the KPZ equation going in a completely
different direction. Indeed, given a Riemannian manifold (M, g) (where g denotes
the metric tensor), we can ask ourselves what the natural “stochastic heat equation
with values in M” looks like. A moment’s thought suggests that it should be given,
in local coordinates, by an equation of the form

∂t u^α = ∂x² u^α + Γ^α_{βγ}(u) ∂x u^β ∂x u^γ + σ_i^α(u) ξ_i , (15.41)

where the ξ_i are i.i.d. space-time white noises, Γ^α_{βγ} are the Christoffel symbols for
M, the σ_i are any finite collection of vector fields such that

σ_i^α σ_i^β = g^{αβ} , (15.42)

and summation over repeated indices is implied. By combining the results of
[CH16, BHZ19, BCCH17], it is not difficult to see that there are natural notions
of solution to (15.41), but these are of course only well-defined modulo an element
of the renormalisation group R. It turns out that in this case, even after taking into
account simplifications due to the symmetry x ↔ −x and the fact that the noises are
i.i.d. Gaussian, the relevant subgroup of R is generically (namely for large enough
dimension of M) of dimension 54.
This is a good example illustrating the role played by symmetries. In this particular
case, there are two additional symmetries one would like to exploit. On the one hand,
one would like to enforce equivariance under the group of diffeomorphisms of M.
In other words, solutions to (15.41) should be independent of the specific coordinate
system used to write (15.41). This is akin to the property of solutions to regular
SDEs written in Stratonovich form (or indeed those of RDEs driven by a geometric
rough path). On the other hand, the derivation of (15.41) implicitly makes use of
Itô’s isometry to guarantee that, at least in law, its solutions do not depend on the
specific choice of the vector fields satisfying (15.42). This in turn is akin to the
property of solutions to SDEs written in Itô form. It turns out – and this is the main
result of [BGHZ19] – that in this context it is possible to find solution theories that
do satisfy both properties simultaneously! In fact there still exists a two-parameter
family of them, but if we restrict ourselves to (15.41) (i.e. with Γ and σ related to
the same metric g), then it reduces to a one-parameter family and the corresponding
correction term (analogous to the Itô-Stratonovich correction term allowing one to switch
between solution theories for SDEs) is given by a multiple of the gradient of the
scalar curvature of M. This sheds new light on observations that had previously
been made in a closely related context both in the physics [Che72, Um74] and in the
mathematics [Dar84, IM85, AD99] literatures.
References

[AC17] A. Ananova and R. Cont. Pathwise integration with respect to paths of finite
quadratic variation. J. Math. Pures Appl. (9) 107, no. 6, (2017), 737–757.
doi:10.1016/j.matpur.2016.10.004.
[AC19] A. L. Allan and S. N. Cohen. Pathwise stochastic control with applications
to robust filtering. arXiv e-prints (2019), 1–42. Ann. Appl. Probab., to appear.
arXiv:1902.05434.
[AD99] L. Andersson and B. K. Driver. Finite-dimensional approximations to Wiener
measure and path integral formulas on manifolds. J. Funct. Anal. 165, no. 2, (1999),
430–498. doi:10.1006/jfan.1999.3413.
[AFS19] C. Améndola, P. Friz, and B. Sturmfels. Varieties of signature tensors. Forum
Math. Sigma 7, (2019), e10, 54. doi:10.1017/fms.2019.3.
[Aid07] S. Aida. Semi-classical limit of the bottom of spectrum of a Schrödinger operator on
a path space over a compact Riemannian manifold. J. Funct. Anal. 251, no. 1, (2007),
59–121. doi:10.1016/j.jfa.2007.06.009.
[Aid15] S. Aida. Reflected rough differential equations. Stochastic Processes Appl. 125,
no. 9, (2015), 3570–3595. doi:10.1016/j.spa.2015.03.008.
[Alm66] F. J. Almgren, Jr. Plateau’s problem: An invitation to varifold geometry. W. A.
Benjamin, Inc., New York-Amsterdam, 1966, xii+74.
[App09] D. Applebaum. Lévy Processes and Stochastic Calculus. Cambridge Studies in
Advanced Mathematics. Cambridge University Press, 2 ed., 2009.
doi:10.1017/CBO9780511809781.
[AR91] S. Albeverio and M. Röckner. Stochastic differential equations in infinite
dimensions: solutions via Dirichlet forms. Probab. Theory Related Fields 89, no. 3,
(1991), 347–386. doi:10.1007/BF01198791.
[BA88] G. Ben Arous. Méthodes de Laplace et de la phase stationnaire sur
l’espace de Wiener. Stochastics 25, no. 3, (1988), 125–153.
doi:10.1080/17442508808833536.
[BA89] G. Ben Arous. Flots et séries de Taylor stochastiques. Probab. Theory Related
Fields 81, no. 1, (1989), 29–77. doi:10.1007/BF00343737.
[Bai14] I. Bailleul. Flows driven by Banach space-valued rough paths. In Séminaire de
Probabilités XLVI, vol. 2123 of Lecture Notes in Math., 195–205. Springer, Cham,
2014. doi:10.1007/978-3-319-11970-0_7.
[Bai15a] I. Bailleul. Flows driven by rough paths. Revista Matemática Iberoamericana 31,
no. 3, (2015), 901–934. doi:10.4171/rmi/858.
[Bai15b] I. Bailleul. Regularity of the Itô-Lyons map. Confluentes Math. 7, no. 1, (2015),
3–11. doi:10.5802/cml.15.
[Bai19] I. Bailleul. Rough integrators on Banach manifolds. Bull. Sci. Math. 151, (2019),
51–65. doi:10.1016/j.bulsci.2018.12.001.


[Bal00] E. J. Balder. Lectures on Young measure theory and its applications in economics.
Rend. Istit. Mat. Univ. Trieste 31, no. suppl. 1, (2000), 1–69. Workshop on Measure
Theory and Real Analysis (Italian) (Grado, 1997).
[Bau04] F. Baudoin. An introduction to the geometry of stochastic flows. Imperial College
Press, London, 2004, x+140. doi:10.1142/9781860947261.
[BB19] I. Bailleul and F. Bernicot. High order paracontrolled calculus. Forum Math.
Sigma 7, (2019), e44, 94. doi:10.1017/fms.2019.44.
[BBR+18] C. Bayer, D. Belomestny, M. Redmann, S. Riedel, and J. Schoenmakers.
Solving linear parabolic rough partial differential equations. arXiv e-prints (2018),
1–36. arXiv:1803.09488.
[BC17] I. Bailleul and R. Catellier. Rough flows and homogenization in stochastic
turbulence. J. Differential Equations 263, no. 8, (2017), 4894–4928.
doi:10.1016/j.jde.2017.06.006.
[BC19] H. Boedihardjo and I. Chevyrev. An isomorphism between branched and
geometric rough paths. Ann. Inst. Henri Poincaré Probab. Stat. 55, no. 2, (2019),
1131–1148. doi:10.1214/18-aihp912.
[BCCH17] Y. Bruned, A. Chandra, I. Chevyrev, and M. Hairer. Renormalising SPDEs
in regularity structures. arXiv e-prints (2017), 1–85. J. Eur. Math. Soc., to appear.
arXiv:1711.10239.
[BCD11] H. Bahouri, J.-Y. Chemin, and R. Danchin. Fourier analysis and nonlinear
partial differential equations, vol. 343 of Grundlehren der Mathematischen
Wissenschaften. Springer, Heidelberg, 2011, xvi+523.
doi:10.1007/978-3-642-16830-7.
[BCD19] I. Bailleul, R. Catellier, and F. Delarue. Propagation of chaos for mean field
rough differential equations. arXiv e-prints (2019), 1–61. arXiv:1907.00578.
[BCD20] I. Bailleul, R. Catellier, and F. Delarue. Solving mean field rough differential
equations. Electron. J. Probab. 25, (2020), 51 pp. doi:10.1214/19-EJP409.
[BCEF20] Y. Bruned, C. Curry, and K. Ebrahimi-Fard. Quasi-shuffle algebras and
renormalisation of rough differential equations. Bull. Lond. Math. Soc. 52, no. 1,
(2020), 43–63. doi:10.1112/blms.12305.
[BCF18] Y. Bruned, I. Chevyrev, and P. K. Friz. Examples of renormalized SDEs.
In Stochastic partial differential equations and related fields, in Honor of Michael
Röckner, Bielefeld 2016, vol. 229 of Springer Proc. Math. Stat., 303–317. Springer,
Cham, 2018. doi:10.1007/978-3-319-74929-7_19.
[BCFP19] Y. Bruned, I. Chevyrev, P. K. Friz, and R. Preiss. A rough path perspective
on renormalization. J. Funct. Anal. 277, no. 11, (2019), 108283, 60.
doi:10.1016/j.jfa.2019.108283.
[BD15] I. Bailleul and J. Diehl. The inverse problem for rough controlled differential
equations. SIAM J. Control Optim. 53, no. 5, (2015), 2762–2780.
doi:10.1137/140995982.
[BDFT20] C. Bellingeri, A. Djurdjevac, P. K. Friz, and N. Tapia. Transport and
continuity equations with (very) rough noise. arXiv e-prints (2020), 1–20.
arXiv:2002.10432.
[Bel20] C. Bellingeri. An Itô type formula for the additive stochastic heat equation.
Electron. J. Probab. 25, (2020), 52 pp. doi:10.1214/19-EJP404.
[BF13] C. Bayer and P. K. Friz. Cubature on Wiener space: pathwise convergence. Appl.
Math. Optim. 67, no. 2, (2013), 261–278. doi:10.1007/s00245-012-9187-8.
[BFG+19] C. Bayer, P. K. Friz, P. Gassiat, J. Martin, and B. Stemper. A regularity
structure for rough volatility. Math. Financ. (2019), 1–51.
doi:10.1111/mafi.12233.
[BFG20] C. Bellingeri, P. K. Friz, and M. Gerencsér. Singular paths spaces and
applications. arXiv e-prints (2020), 1–15. arXiv:2003.03352.
[BFH09] E. Breuillard, P. Friz, and M. Huesmann. From random walks to rough
paths. Proc. Amer. Math. Soc. 137, no. 10, (2009), 3487–3496.
doi:10.1090/S0002-9939-09-09930-4.

[BFRS16] C. Bayer, P. K. Friz, S. Riedel, and J. Schoenmakers. From rough path
estimates to multilevel Monte Carlo. SIAM J. Numer. Anal. 54, no. 3, (2016),
1449–1483. doi:10.1137/140995209.
[BG97] L. Bertini and G. Giacomin. Stochastic Burgers and KPZ equations from particle
systems. Comm. Math. Phys. 183, no. 3, (1997), 571–607.
doi:10.1007/s002200050044.
[BG17] I. Bailleul and M. Gubinelli. Unbounded rough drivers. Ann. Fac. Sci. Toulouse
Math. (6) 26, no. 4, (2017), 795–830. doi:10.5802/afst.1553.
[BGHZ19] Y. Bruned, F. Gabriel, M. Hairer, and L. Zambotti. Geometric stochastic
heat equations. arXiv e-prints (2019), 1–83. arXiv:1902.02884.
[BGLY14] Y. Boutaib, L. G. Gyurkó, T. Lyons, and D. Yang. Dimension-free Euler
estimates of rough differential equations. Rev. Roumaine Math. Pures Appl. 59, no. 1,
(2014), 25–53.
[BGLY15] H. Boedihardjo, X. Geng, T. Lyons, and D. Yang. Note on the signatures of
rough paths in a Banach space. arXiv e-prints (2015), 1–14. arXiv:1510.04172.
[BGLY16] H. Boedihardjo, X. Geng, T. Lyons, and D. Yang. The signature of a rough
path: Uniqueness. Advances in Mathematics 293, (2016), 720–737.
doi:10.1016/j.aim.2016.02.011.
[BH91] N. Bouleau and F. Hirsch. Dirichlet forms and analysis on Wiener space, vol. 14
of de Gruyter Studies in Mathematics. Walter de Gruyter & Co., Berlin, 1991, x+325.
doi:10.1515/9783110858389.
[BH07] F. Baudoin and M. Hairer. A version of Hörmander’s theorem for the fractional
Brownian motion. Probab. Theory Related Fields 139, no. 3-4, (2007), 373–395.
doi:10.1007/s00440-006-0035-0.
[BH18] I. Bailleul and M. Hoshino. Paracontrolled calculus and regularity structures.
arXiv e-prints (2018), 1–32. arXiv:1812.07919.
[BH19] I. Bailleul and M. Hoshino. Regularity structures and paracontrolled calculus.
arXiv e-prints (2019), 1–29. arXiv:1912.08438.
[BHZ19] Y. Bruned, M. Hairer, and L. Zambotti. Algebraic renormalisation of regularity
structures. Invent. Math. 215, no. 3, (2019), 1039–1156. arXiv:1610.08468.
doi:10.1007/s00222-018-0841-x.
[Bis81a] J.-M. Bismut. Martingales, the Malliavin calculus and Hörmander’s theorem. In
Stochastic integrals (Proc. Sympos., Univ. Durham, Durham, 1980), vol. 851 of
Lecture Notes in Math., 85–109. Springer, Berlin, 1981.
[Bis81b] J.-M. Bismut. Martingales, the Malliavin calculus and hypoellipticity under general
Hörmander’s conditions. Z. Wahrsch. Verw. Gebiete 56, no. 4, (1981), 469–505.
doi:10.1007/bf00531428.
[BL19] A. Brault and A. Lejay. The non-linear sewing lemma I: weak formulation.
Electron. J. Probab. 24, (2019), Paper No. 59, 24. doi:10.1214/19-EJP313.
[BLY15] H. Boedihardjo, T. Lyons, and D. Yang. Uniform factorial decay estimates for
controlled differential equations. Electron. Commun. Probab. 20, (2015), no. 94, 11.
doi:10.1214/ECP.v20-4124.
[BM07] R. Buckdahn and J. Ma. Pathwise stochastic control problems and stochastic HJB
equations. SIAM Journal on Control and Optimization 45, no. 6, (2007), 2224–2256.
doi:10.1137/S036301290444335X.
[BM19] I. Bailleul and A. Mouzard. Paracontrolled calculus for quasilinear singular
PDEs. arXiv e-prints (2019), 1–32. arXiv:1912.09073.
[BMN10] Á. Bényi, D. Maldonado, and V. Naibo. What is . . . a paraproduct? Notices
Amer. Math. Soc. 57, no. 7, (2010), 858–860.
[BMSS95] V. Bally, A. Millet, and M. Sanz-Solé. Approximation and support theorem in
Hölder norm for parabolic stochastic partial differential equations. Ann. Probab. 23,
no. 1, (1995), 178–222. doi:10.1214/aop/1176988383.
[BNOT16] F. Baudoin, E. Nualart, C. Ouyang, and S. Tindel. On probability laws of
solutions to differential systems driven by a fractional Brownian motion. Ann. Probab.
44, no. 4, (2016), 2554–2590. doi:10.1214/15-AOP1028.

[BNQ14] H. Boedihardjo, H. Ni, and Z. Qian. Uniqueness of signature for simple curves. J.
Funct. Anal. 267, no. 6, (2014), 1778–1806. doi:10.1016/j.jfa.2014.06.006.
[Boe18] H. Boedihardjo. Decay rate of iterated integrals of branched rough paths. Annales
de l’Institut Henri Poincaré C, Analyse non linéaire 35, no. 4, (2018), 945–969.
doi:10.1016/j.anihpc.2017.09.002.
[Bog98] V. I. Bogachev. Gaussian measures, vol. 62 of Mathematical Surveys and
Monographs. American Mathematical Society, Providence, RI, 1998, xii+433.
doi:10.1090/surv/062.
[Bon81] J.-M. Bony. Calcul symbolique et propagation des singularités pour les équations
aux dérivées partielles non linéaires. Ann. Sci. École Norm. Sup. (4) 14, no. 2, (1981),
209–246. doi:10.24033/asens.1404.
[Bor75] C. Borell. The Brunn-Minkowski inequality in Gauss space. Invent. Math. 30,
no. 2, (1975), 207–216. doi:10.1007/bf01425510.
[BP57] N. N. Bogoliubow and O. S. Parasiuk. Über die Multiplikation der
Kausalfunktionen in der Quantentheorie der Felder. Acta Math. 97, (1957), 227–266.
doi:10.1007/BF02392399.
[BP08] J. Bourgain and N. Pavlović. Ill-posedness of the Navier-Stokes equations
in a critical space in 3D. J. Funct. Anal. 255, no. 9, (2008), 2233–2247.
doi:10.1016/j.jfa.2008.07.008.
[BR19] I. Bailleul and S. Riedel. Rough flows. J. Math. Soc. Japan 71, no. 3, (2019),
915–978. doi:10.2969/jmsj/80108010.
[BRS17] I. Bailleul, S. Riedel, and M. Scheutzow. Random dynamical systems, rough
paths and rough flows. J. Differential Equations 262, no. 12, (2017), 5792–5823.
doi:10.1016/j.jde.2017.02.014.
[Bru18] Y. Bruned. Recursive formulae in regularity structures. Stoch. Partial Differ. Equ.
Anal. Comput. 6, no. 4, (2018), 525–564. doi:10.1007/s40072-018-0115-z.
[But72] J. C. Butcher. An algebraic theory of integration methods. Math. Comp. 26, (1972),
79–106. doi:10.1090/S0025-5718-1972-0305608-0.
[Car10] M. Caruana. Peano’s theorem for rough differential equations in
infinite-dimensional Banach spaces. Proc. Lond. Math. Soc. (3) 100, no. 1, (2010), 177–215.
doi:10.1112/plms/pdp028.
[CC18a] G. Cannizzaro and K. Chouk. Multidimensional SDEs with singular drift and
universal construction of the polymer measure with white noise potential. Ann. Probab.
46, no. 3, (2018), 1710–1763. doi:10.1214/17-AOP1213.
[CC18b] R. Catellier and K. Chouk. Paracontrolled distributions and the 3-dimensional
stochastic quantization equation. Ann. Probab. 46, no. 5, (2018), 2621–2679.
doi:10.1214/17-aop1235.
[CCHS20] A. Chandra, I. Chevyrev, M. Hairer, and H. Shen. Langevin dynamic for the
2D Yang–Mills measure, 2020. In preparation.
[CDFM18] M. Coghi, J.-D. Deuschel, P. Friz, and M. Maurelli. Pathwise McKean–
Vlasov theory with additive noise. arXiv e-prints (2018), 1–41. Ann. Appl. Probab.,
to appear. arXiv:1812.11773.
[CDFO13] D. Crisan, J. Diehl, P. K. Friz, and H. Oberhauser. Robust filtering: correlated
noise and multidimensional observation. Ann. Appl. Probab. 23, no. 5, (2013),
2139–2160. doi:10.1214/12-AAP896.
[CDL15] T. Cass, B. K. Driver, and C. Litterer. Constrained rough paths. Proc. Lond.
Math. Soc. (3) 111, no. 6, (2015), 1471–1518. doi:10.1112/plms/pdv060.
[CDLL16] T. Cass, B. K. Driver, N. Lim, and C. Litterer. On the integration of weakly
geometric rough paths. J. Math. Soc. Japan 68, no. 4, (2016), 1505–1524.
doi:10.2969/jmsj/06841505.
[CDM01] M. Capitaine and C. Donati-Martin. The Lévy area process for the free
Brownian motion. J. Funct. Anal. 179, no. 1, (2001), 153–169.
doi:10.1006/jfan.2000.3679.

[CF09] M. Caruana and P. Friz. Partial differential equations driven by rough paths. J.
Differential Equations 247, no. 1, (2009), 140–173.
doi:10.1016/j.jde.2009.01.026.
[CF10] T. Cass and P. Friz. Densities for rough differential equations under Hörmander’s
condition. Ann. of Math. (2) 171, no. 3, (2010), 2115–2141.
doi:10.4007/annals.2010.171.2115.
[CF18] K. Chouk and P. K. Friz. Support theorem for a singular SPDE: the case of
gPAM. Ann. Inst. Henri Poincaré Probab. Stat. 54, no. 1, (2018), 202–219.
doi:10.1214/16-AIHP800.
[CF19] I. Chevyrev and P. K. Friz. Canonical RDEs and general semimartingales as rough
paths. Ann. Probab. 47, no. 1, (2019), 420–463. doi:10.1214/18-AOP1264.
[CFG17] G. Cannizzaro, P. K. Friz, and P. Gassiat. Malliavin calculus for regularity
structures: the case of gPAM. J. Funct. Anal. 272, no. 1, (2017), 363–419.
doi:10.1016/j.jfa.2016.09.024.
[CFK+19a] I. Chevyrev, P. K. Friz, A. Korepanov, I. Melbourne, and H. Zhang.
Deterministic homogenization for discrete-time fast-slow systems under optimal
moment assumptions. arXiv e-prints (2019), 1–24. arXiv:1903.10418.
[CFK+19b] I. Chevyrev, P. K. Friz, A. Korepanov, I. Melbourne, and H. Zhang.
Multiscale systems, homogenization, and rough paths. In Probability and analysis
in interacting physical systems, In Honor of S.R.S. Varadhan, Berlin, August, 2016,
vol. 283 of Springer Proc. Math. Stat., 17–48. Springer, Cham, 2019.
doi:10.1007/978-3-030-15338-0.
[CFKM19] I. Chevyrev, P. K. Friz, A. Korepanov, and I. Melbourne. Superdiffusive
limits for deterministic fast-slow dynamical systems. arXiv e-prints (2019), 1–35.
arXiv:1907.04825.
[CFO11] M. Caruana, P. K. Friz, and H. Oberhauser. A (rough) pathwise approach to a
class of non-linear stochastic partial differential equations. Ann. Inst. H. Poincaré Anal.
Non Linéaire 28, no. 1, (2011), 27–46. doi:10.1016/j.anihpc.2010.11.002.
[CFV07] L. Coutin, P. Friz, and N. Victoir. Good rough path sequences and applications
to anticipating stochastic calculus. Ann. Probab. 35, no. 3, (2007), 1172–1193.
doi:10.1214/009117906000000827.
[CFV09] T. Cass, P. Friz, and N. Victoir. Non-degeneracy of Wiener functionals arising
from rough differential equations. Trans. Amer. Math. Soc. 361, no. 6, (2009),
3359–3371. doi:10.1090/S0002-9947-09-04677-7.
[CH16] A. Chandra and M. Hairer. An analytic BPHZ theorem for regularity structures.
arXiv e-prints (2016), 1–129. arXiv:1612.08138.
[Che54] K.-T. Chen. Iterated integrals and exponential homomorphisms. Proc. London Math.
Soc. (3) 4, (1954), 502–512. doi:10.1112/plms/s3-4.1.502.
[Che57] K.-T. Chen. Integration of paths, geometric invariants and a generalized
Baker-Hausdorff formula. Ann. of Math. (2) 65, no. 1, (1957), 163–178.
doi:10.2307/1969671.
[Che58] K.-T. Chen. Integration of paths – a faithful representation of paths by
non-commutative formal power series. Trans. Amer. Math. Soc. 89, (1958), 395–407.
doi:10.2307/1993193.
[Che71] K.-T. Chen. Algebras of iterated path integrals and fundamental groups. Trans.
Amer. Math. Soc. 156, (1971), 359–379. doi:10.2307/1995617.
[Che72] K. Cheng. Quantization of a general dynamical system by Feynman’s path
integration formulation. J. Math. Phys. 13, no. 11, (1972), 1723–1726.
doi:10.1063/1.1665897.
[Che18] I. Chevyrev. Random walks and Lévy processes as rough paths. Probab. Theory
Related Fields 170, no. 3-4, (2018), 891–932. doi:10.1007/s00440-017-0781-1.
[CHLT15] T. Cass, M. Hairer, C. Litterer, and S. Tindel. Smoothness of the density for
solutions to Gaussian rough differential equations. Ann. Probab. 43, no. 1, (2015),
188–239. doi:10.1214/13-AOP896.

[Cho39] W.-L. C HOW. Über Systeme von linearen partiellen Differentialgleichungen erster
Ordnung. Math. Ann. 117, (1939), 98–105. doi:10.1007/BF01450011.
[CIL92] M. G. C RANDALL, H. I SHII, and P.-L. L IONS. User’s guide to viscosity solutions of
second order partial differential equations. Bull. Amer. Math. Soc. (N.S.) 27, no. 1,
(1992), 1–67. doi:10.1090/s0273-0979-1992-00266-5.
[CK00] A. C ONNES and D. K REIMER. Renormalization in quantum field theory and
the Riemann-Hilbert problem. I. The Hopf algebra structure of graphs and the
main theorem. Comm. Math. Phys. 210, no. 1, (2000), 249–273. doi:10.1007/
s002200050779.
[CK16] I. C HEVYREV and A. KORMILITZIN. A primer on the signature method in machine
learning. arXiv e-prints (2016), 1–45. arXiv:1603.03788.
[CL05] L. Coutin and A. Lejay. Semi-martingales and rough paths theory. Electron. J.
Probab. 10, (2005), no. 23, 761–785. doi:10.1214/EJP.v10-162.
[CL14] L. Coutin and A. Lejay. Perturbed linear rough differential equations. Ann. Math.
Blaise Pascal 21, no. 1, (2014), 103–150. doi:10.5802/ambp.338.
[CL15] T. Cass and T. Lyons. Evolving communities with individual preferences. Proc.
Lond. Math. Soc. (3) 110, no. 1, (2015), 83–107. doi:10.1112/plms/pdu040.
[CL16] I. Chevyrev and T. Lyons. Characteristic functions of measures on geometric
rough paths. Ann. Probab. 44, no. 6, (2016), 4049–4082. doi:10.1214/15-AOP1068.
[CL18] L. Coutin and A. Lejay. Sensitivity of rough differential equations: an approach
through the omega lemma. J. Differential Equations 264, no. 6, (2018), 3899–3917.
doi:10.1016/j.jde.2017.11.031.
[Cla66] M. Clark. The representation of non-linear stochastic systems with applications to
filtering. Ph.D. thesis, Imperial College, 1966.
[CLL12] T. Cass, C. Litterer, and T. Lyons. Rough paths on manifolds. In New trends in
stochastic analysis and related topics, vol. 12 of Interdiscip. Math. Sci., 33–88. World
Sci. Publ., Hackensack, NJ, 2012. doi:10.1142/9789814360920_0002.
[CLL13] T. Cass, C. Litterer, and T. Lyons. Integrability and tail estimates for Gaussian
rough differential equations. Ann. Probab. 41, no. 4, (2013), 3026–3050. doi:
10.1214/12-AOP821.
[CN19] M. Coghi and T. Nilssen. Rough nonlocal diffusions. arXiv e-prints (2019), 1–54.
arXiv:1905.07270.
[CO17] T. Cass and M. Ogrodnik. Tail estimates for Markovian rough paths. Ann. Probab.
45, no. 4, (2017), 2477–2504. doi:10.1214/16-AOP1117.
[CO18] I. Chevyrev and M. Ogrodnik. A support and density theorem for Markovian
rough paths. Electron. J. Probab. 23, (2018), Paper No. 56, 16. doi:10.1214/
18-ejp184.
[Col51] J. D. Cole. On a quasi-linear parabolic equation occurring in aerodynamics. Quart.
Appl. Math. 9, (1951), 225–236. doi:10.1090/qam/42889.
[Com19] G. Comi. Semi-Linear Heat Equation and Singular Volterra Equation. Ph.D. thesis,
Università degli studi di Milano Bicocca, Università degli studi di Pavia, 2019.
[Cor12] I. Corwin. The Kardar-Parisi-Zhang equation and universality class. Random
Matrices Theory Appl. 1, no. 1, (2012), 1130001, 76. doi:10.1142/S2010326311300014.
[CP19] R. Cont and N. Perkowski. Pathwise integration and change of variable formulas
for continuous paths with arbitrary regularity. Trans. Am. Math. Soc., Ser. B 6, (2019),
161–186. doi:10.1090/btran/34.
[CQ02] L. Coutin and Z. Qian. Stochastic analysis, rough path analysis and fractional
Brownian motions. Probab. Theory Related Fields 122, no. 1, (2002), 108–140.
doi:10.1007/s004400100158.
[CW16] T. Cass and M. P. Weidner. Tree algebras over topological vector spaces in rough
path theory. arXiv e-prints (2016), 1–25. arXiv:1604.07352.
[Dar84] R. W. R. Darling. On the convergence of Gangolli processes to Brownian motion
on a manifold. Stochastics 12, no. 3-4, (1984), 277–301. doi:10.1080/17442508408833305.
[Dau88] I. Daubechies. Orthonormal bases of compactly supported wavelets. Comm. Pure
Appl. Math. 41, no. 7, (1988), 909–996. doi:10.1002/cpa.3160410705.
[Dav08] A. M. Davie. Differential equations driven by rough paths: an approach via discrete
approximation. Appl. Math. Res. Express. AMRX 2008, no. 2, (2008), 1–40. doi:
10.1093/amrx/abm009.
[DD16] F. Delarue and R. Diel. Rough paths and 1d SDE with a time dependent distributional
drift: application to polymers. Probab. Theory Related Fields 165, no. 1-2,
(2016), 1–63. doi:10.1007/s00440-015-0626-8.
[Der10] S. Dereich. Rough paths analysis of general Banach space-valued Wiener processes.
J. Funct. Anal. 258, no. 9, (2010), 2910–2936. doi:10.1016/j.jfa.2010.01.018.
[DF12] J. Diehl and P. Friz. Backward stochastic differential equations with rough drivers.
Ann. Probab. 40, no. 4, (2012), 1715–1758. doi:10.1214/11-AOP660.
[DFG17] J. Diehl, P. K. Friz, and P. Gassiat. Stochastic control with rough paths. Applied
Mathematics & Optimization 75, no. 2, (2017), 285–315. doi:10.1007/s00245-016-9333-9.
[DFM16] J. Diehl, P. Friz, and H. Mai. Pathwise stability of likelihood estimators for
diffusions via rough paths. Ann. Appl. Probab. 26, no. 4, (2016), 2169–2192. doi:
10.1214/15-AAP1143.
[DFMS18] J.-D. Deuschel, P. K. Friz, M. Maurelli, and M. Slowik. The enhanced
Sanov theorem and propagation of chaos. Stochastic Process. Appl. 128, no. 7, (2018),
2228–2269. doi:10.1016/j.spa.2017.09.010.
[DFO14] J. Diehl, P. K. Friz, and H. Oberhauser. Regularity theory for rough partial
differential equations and parabolic comparison revisited. In Stochastic Analysis
and Applications, In Honour of Terry Lyons, vol. 100 of Springer Proc. Math. Stat.,
203–238. Springer, Cham, 2014. doi:10.1007/978-3-319-11292-3_8.
[DFS17] J. Diehl, P. Friz, and W. Stannat. Stochastic partial differential equations: a
rough paths view on weak solutions via Feynman-Kac. Ann. Fac. Sci. Toulouse Math.
(6) 26, no. 4, (2017), 911–947. doi:10.5802/afst.1556.
[DGHT19a] A. Deya, M. Gubinelli, M. Hofmanová, and S. Tindel. One-dimensional
reflected rough differential equations. Stochastic Process. Appl. 129, no. 9, (2019),
3261–3281. doi:10.1016/j.spa.2018.09.007.
[DGHT19b] A. Deya, M. Gubinelli, M. Hofmanová, and S. Tindel. A priori estimates for
rough PDEs with application to rough conservation laws. J. Funct. Anal. 276, no. 12,
(2019), 3577–3645. doi:10.1016/j.jfa.2019.03.008.
[DGT12] A. Deya, M. Gubinelli, and S. Tindel. Non-linear rough heat equations.
Probab. Theory Related Fields 153, no. 1-2, (2012), 97–147. doi:10.1007/
s00440-011-0341-z.
[DL89] R. J. DiPerna and P.-L. Lions. Ordinary differential equations, transport theory
and Sobolev spaces. Invent. Math. 98, no. 3, (1989), 511–547. doi:10.1007/
BF01393835.
[DMT12] Y. Do, C. Muscalu, and C. Thiele. Variational estimates for paraproducts. Rev.
Mat. Iberoam. 28, no. 3, (2012), 857–878. doi:10.4171/RMI/694.
[DNT12] A. Deya, A. Neuenkirch, and S. Tindel. A Milstein-type scheme without Lévy
area terms for SDEs driven by fractional Brownian motion. Ann. Inst. Henri Poincaré
Probab. Stat. 48, no. 2, (2012), 518–550. doi:10.1214/10-AIHP392.
[DOP19] J.-D. Deuschel, T. Orenshtein, and N. Perkowski. Additive functionals as
rough paths. arXiv e-prints (2019), 1–30. arXiv:1912.09819.
[DOR15] J. Diehl, H. Oberhauser, and S. Riedel. A Lévy area between Brownian
motion and rough paths with applications to robust nonlinear filtering and rough
partial differential equations. Stochastic Process. Appl. 125, no. 1, (2015), 161–181.
doi:10.1016/j.spa.2014.08.005.
[Dos77] H. Doss. Liens entre équations différentielles stochastiques et ordinaires. Ann. Inst.
H. Poincaré Sect. B (N.S.) 13, no. 2, (1977), 99–125.
[DPD03] G. Da Prato and A. Debussche. Strong solutions to the stochastic quantization
equations. Ann. Probab. 31, no. 4, (2003), 1900–1916. doi:10.1214/aop/1068646370.
[DPZ92] G. Da Prato and J. Zabczyk. Stochastic equations in infinite dimensions, vol. 44
of Encyclopedia of Mathematics and its Applications. Cambridge University Press,
Cambridge, 1992, xviii+454. doi:10.1017/CBO9780511666223.
[DT09] A. Deya and S. Tindel. Rough Volterra equations. I. The algebraic integration
setting. Stoch. Dyn. 9, no. 3, (2009), 437–477. doi:10.1142/S0219493709002737.
[Faw04] T. Fawcett. Non-commutative harmonic analysis. Ph.D. thesis, University of
Oxford, 2004.
[FdLP06] D. Feyel and A. de La Pradelle. Curvilinear integrals along enriched paths.
Electron. J. Probab. 11, (2006), no. 34, 860–892. doi:10.1214/EJP.v11-356.
[FDM08] D. Feyel, A. de La Pradelle, and G. Mokobodzki. A non-commutative
sewing lemma. Electron. Commun. Probab. 13, (2008), 24–34. doi:10.1214/ECP.
v13-1345.
[FG16a] P. Friz and P. Gassiat. Geometric foundations of rough paths. In Geometry,
analysis and dynamics on sub-Riemannian manifolds. Vol. II, EMS Ser. Lect. Math.,
171–210. Eur. Math. Soc., Zürich, 2016. doi:10.4171/163-1/3.
[FG16b] P. K. Friz and B. Gess. Stochastic scalar conservation laws driven by rough
paths. Ann. Inst. H. Poincaré Anal. Non Linéaire 33, no. 4, (2016), 933–963. doi:
10.1016/j.anihpc.2015.01.009.
[FG19] M. Furlan and M. Gubinelli. Weak universality for a class of 3d stochastic
reaction-diffusion models. Probab. Theory Related Fields 173, no. 3-4, (2019),
1099–1164. doi:10.1007/s00440-018-0849-6.
[FGGR16] P. K. Friz, B. Gess, A. Gulisashvili, and S. Riedel. The Jain-Monrad criterion
for rough paths and applications to random Fourier series and non-Markovian
Hörmander theory. Ann. Probab. 44, no. 1, (2016), 684–738. arXiv:1307.3460.
doi:10.1214/14-AOP986.
[FGL15] P. Friz, P. Gassiat, and T. Lyons. Physical Brownian motion in a magnetic
field as a rough path. Trans. Amer. Math. Soc. 367, no. 11, (2015), 7939–7955.
arXiv:1302.2531. doi:10.1090/S0002-9947-2015-06272-2.
[FGLS17] P. K. Friz, P. Gassiat, P.-L. Lions, and P. E. Souganidis. Eikonal equations
and pathwise solutions to fully non-linear SPDEs. Stoch. Partial Differ. Equ.
Anal. Comput. 5, no. 2, (2017), 256–277. arXiv:1602.04746. doi:10.1007/
s40072-016-0087-9.
[FGP18] P. K. Friz, P. Gassiat, and P. Pigato. Precise asymptotics: robust stochastic
volatility models. arXiv e-prints (2018), 1–34. arXiv:1811.00267.
[FHL16] G. Flint, B. Hambly, and T. Lyons. Discretely sampled signals and the rough
Hoff process. Stochastic Process. Appl. 126, no. 9, (2016), 2593–2614. doi:10.
1016/j.spa.2016.02.011.
[FHL20] P. K. Friz, A. Hocquet, and K. Lê. Rough Markov diffusions and stochastic
differential equations, 2020. In preparation.
[FK20] P. K. Friz and T. Klose. Precise Laplace Asymptotics for Singular Stochastic
Partial Differential Equations: The case of the 2D generalised Parabolic Anderson
Model, 2020. In preparation.
[FLS06] P. Friz, T. Lyons, and D. Stroock. Lévy’s area under conditioning. Ann. Inst.
H. Poincaré Probab. Statist. 42, no. 1, (2006), 89–101. doi:10.1016/j.anihpb.
2005.02.003.
[FNC82] M. Fliess and D. Normand-Cyrot. Algèbres de Lie nilpotentes, formule de
Baker-Campbell-Hausdorff et intégrales itérées de K. T. Chen. In Seminar on Proba-
bility, XVI, vol. 920 of Lecture Notes in Math., 257–267. Springer, Berlin-New York,
1982.
[FO09] P. Friz and H. Oberhauser. Rough path limits of the Wong-Zakai type with a
modified drift term. J. Funct. Anal. 256, no. 10, (2009), 3236–3256. doi:10.1016/
j.jfa.2009.02.010.
[FO10] P. Friz and H. Oberhauser. A generalized Fernique theorem and applications.
Proc. Amer. Math. Soc. 138, (2010), 3679–3688. doi:10.1090/S0002-9939-2010-10528-2.
[FO11] P. Friz and H. Oberhauser. On the splitting-up method for rough (partial)
differential equations. J. Differential Equations 251, no. 2, (2011), 316–338. doi:
10.1016/j.jde.2011.02.009.
[FO14] P. Friz and H. Oberhauser. Rough path stability of (semi-)linear SPDEs.
Probab. Theory Related Fields 158, no. 1-2, (2014), 401–434. doi:10.1007/
s00440-013-0483-2.
[Föl81] H. Föllmer. Calcul d’Itô sans probabilités. In Seminar on Probability, XV (Univ.
Strasbourg, Strasbourg, 1979/1980) (French), vol. 850 of Lecture Notes in Math.,
143–150. Springer, Berlin, 1981. doi:10.1007/bfb0088364.
[FP18] P. K. Friz and D. J. Prömel. Rough path metrics on a Besov-Nikolskii-type scale.
Trans. Amer. Math. Soc. 370, no. 12, (2018), 8521–8550. doi:10.1090/tran/7264.
[FR11] P. Friz and S. Riedel. Convergence rates for the full Brownian rough paths with
applications to limit theorems for stochastic flows. Bull. Sci. Math. 135, no. 6-7,
(2011), 613–628. doi:10.1016/j.bulsci.2011.07.006.
[FR13] P. Friz and S. Riedel. Integrability of (non-)linear rough differential equations and
integrals. Stoch. Anal. Appl. 31, no. 2, (2013), 336–358. doi:10.1080/07362994.
2013.759758.
[FR14] P. Friz and S. Riedel. Convergence rates for the full Gaussian rough paths. Ann.
Inst. Henri Poincaré Probab. Stat. 50, no. 1, (2014), 154–194. doi:10.1214/
12-AIHP507.
[Fri05] P. K. Friz. Continuity of the Itô-map for Hölder rough paths with applications to the
support theorem in Hölder norm. In Probability and partial differential equations in
modern applied mathematics, vol. 140 of IMA Vol. Math. Appl., 117–135. Springer,
New York, 2005. doi:10.1007/978-0-387-29371-4_8.
[FS06] W. H. Fleming and H. M. Soner. Controlled Markov processes and viscosity
solutions, vol. 25 of Stochastic Modelling and Applied Probability. Springer, New
York, second ed., 2006, xviii+429. doi:10.1007/0-387-31071-1.
[FS13] P. Friz and A. Shekhar. Doob-Meyer for rough paths. Bull. Inst. Math. Acad. Sin.
(N.S.) 8, no. 1, (2013), 73–84. arXiv:1205.2505.
[FS17] P. K. Friz and A. Shekhar. General rough integration, Lévy rough paths and a
Lévy-Kintchine-type formula. Ann. Probab. 45, no. 4, (2017), 2707–2765. arXiv:
1212.5888. doi:10.1214/16-AOP1123.
[FT17] P. K. Friz and H. Tran. On the regularity of SLE trace. Forum Math. Sigma 5,
(2017), e19, 17. doi:10.1017/fms.2017.18.
[FV05] P. Friz and N. Victoir. Approximations of the Brownian rough path with applications
to stochastic analysis. Ann. Inst. H. Poincaré Probab. Statist. 41, no. 4, (2005),
703–724. doi:10.1016/j.anihpb.2004.05.003.
[FV06a] P. Friz and N. Victoir. A note on the notion of geometric rough paths.
Probab. Theory Related Fields 136, no. 3, (2006), 395–416. doi:10.1007/
s00440-005-0487-7.
[FV06b] P. Friz and N. Victoir. A variation embedding theorem and applications. J. Funct.
Anal. 239, no. 2, (2006), 631–637. doi:10.1016/j.jfa.2005.12.021.
[FV07] P. Friz and N. Victoir. Large deviation principle for enhanced Gaussian processes.
Ann. Inst. H. Poincaré Probab. Statist. 43, no. 6, (2007), 775–785. doi:10.1016/
j.anihpb.2006.11.002.
[FV08a] P. Friz and N. Victoir. The Burkholder-Davis-Gundy inequality for enhanced
martingales. In Séminaire de probabilités XLI, vol. 1934 of Lecture Notes in Math.,
421–438. Springer, Berlin, 2008. doi:10.1007/978-3-540-77913-1_20.
[FV08b] P. Friz and N. Victoir. Euler estimates for rough differential equations. J. Differential
Equations 244, no. 2, (2008), 388–412. doi:10.1016/j.jde.2007.10.008.
[FV08c] P. Friz and N. Victoir. On uniformly subelliptic operators and stochastic area.
Probab. Theory Related Fields 142, no. 3-4, (2008), 475–523. doi:10.1007/
s00440-007-0113-y.
[FV10a] P. Friz and N. Victoir. Differential equations driven by Gaussian signals. Ann. Inst.
H. Poincaré Probab. Statist. 46, no. 2, (2010), 369–413. doi:10.1214/09-AIHP202.
[FV10b] P. Friz and N. Victoir. Multidimensional Stochastic Processes as Rough Paths, vol.
120 of Cambridge Studies in Advanced Mathematics. Cambridge University Press,
Cambridge, 2010, xiv+670. doi:10.1017/CBO9780511845079.
[FV11] P. Friz and N. Victoir. A note on higher dimensional p-variation. Electron. J.
Probab. 16, (2011), 1880–1899. doi:10.1214/EJP.v16-951.
[FZ18] P. K. Friz and H. Zhang. Differential equations driven by rough paths with jumps.
J. Differential Equations 264, no. 10, (2018), 6226–6301. doi:10.1016/j.jde.
2018.01.031.
[FZK20] P. K. Friz and P. Zorin-Kranich. Rough semimartingales and p-variation estimates
for martingale transforms, 2020. In preparation.
[Gas20] P. Gassiat. Non-uniqueness for reflected rough differential equations. arXiv e-prints
(2020), 1–25. arXiv:2001.11914.
[GGLS20] P. Gassiat, B. Gess, P.-L. Lions, and P. E. Souganidis. Speed of propagation
for Hamilton–Jacobi equations with multiplicative rough time dependence and convex
Hamiltonians. Probab. Theory Related Fields 176, no. 1, (2020), 421–448. doi:
10.1007/s00440-019-00921-5.
[GH19] A. Gerasimovics and M. Hairer. Hörmander’s theorem for semilinear SPDEs.
Electron. J. Probab. 24, (2019), 56 pp. doi:10.1214/19-EJP387.
[GHN19] A. Gerasimovics, A. Hocquet, and T. Nilssen. Non-autonomous rough semilinear
PDEs and the multiplicative sewing lemma. arXiv e-prints (2019), 1–48.
arXiv:1907.13398.
[GIP15] M. Gubinelli, P. Imkeller, and N. Perkowski. Paracontrolled distributions
and singular PDEs. Forum Math. Pi 3, (2015), e6, 75. doi:10.1017/fmp.2015.2.
[GIP16] M. Gubinelli, P. Imkeller, and N. Perkowski. A Fourier analytic approach
to pathwise stochastic integration. Electron. J. Probab. 21, (2016), Paper No. 2, 37.
doi:10.1214/16-EJP3868.
[GJ14] P. Gonçalves and M. Jara. Nonlinear fluctuations of weakly asymmetric interacting
particle systems. Arch. Ration. Mech. Anal. 212, no. 2, (2014), 597–644.
doi:10.1007/s00205-013-0693-x.
[GL09] M. Gubinelli and J. Lörinczi. Gibbs measures on Brownian currents. Comm.
Pure Appl. Math. 62, no. 1, (2009), 1–56. doi:10.1002/cpa.20260.
[GL20] P. Gassiat and C. Labbé. Existence of densities for the dynamic $\Phi^4_3$ model.
Ann. Inst. Henri Poincaré Probab. Stat. 56, no. 1, (2020), 326–373. doi:10.1214/
19-AIHP963.
[GLP99] G. Giacomin, J. L. Lebowitz, and E. Presutti. Deterministic and stochastic
hydrodynamic equations arising from simple microscopic model systems. In Stochas-
tic partial differential equations: six perspectives, vol. 64 of Math. Surveys Monogr.,
107–152. Amer. Math. Soc., Providence, RI, 1999. doi:10.1090/surv/064/03.
[GOT19] B. Gess, C. Ouyang, and S. Tindel. Density bounds for solutions to differential
equations driven by Gaussian rough paths. J. Theoret. Probab. (2019). doi:10.1007/
s10959-019-00967-0.
[GP15] M. Gubinelli and N. Perkowski. Lectures on singular stochastic PDEs, vol. 29 of
Ensaios Matemáticos [Mathematical Surveys]. Sociedade Brasileira de Matemática,
Rio de Janeiro, 2015, 89.
[GP17] M. Gubinelli and N. Perkowski. KPZ reloaded. Comm. Math. Phys. 349, no. 1,
(2017), 165–269. arXiv:1508.03877. doi:10.1007/s00220-016-2788-3.
[GP18] M. Gubinelli and N. Perkowski. Energy solutions of KPZ are unique. J. Amer.
Math. Soc. 31, no. 2, (2018), 427–471. doi:10.1090/jams/889.
[GPS16] B. Gess, B. Perthame, and P. E. Souganidis. Semi-discretization for stochastic
scalar conservation laws with multiple rough fluxes. SIAM J. Numer. Anal. 54, no. 4,
(2016), 2187–2209. doi:10.1137/15M1053670.
[GS15] B. Gess and P. E. Souganidis. Scalar conservation laws with multiple rough fluxes.
Commun. Math. Sci. 13, no. 6, (2015), 1569–1597. doi:10.4310/CMS.2015.v13.
n6.a10.
[GS17] B. Gess and P. E. Souganidis. Stochastic non-isotropic degenerate parabolic-hyperbolic
equations. Stochastic Process. Appl. 127, no. 9, (2017), 2961–3004.
doi:10.1016/j.spa.2017.01.005.
[GT10] M. Gubinelli and S. Tindel. Rough evolution equations. Ann. Probab. 38, no. 1,
(2010), 1–75. doi:10.1214/08-AOP437.
[Gub04] M. Gubinelli. Controlling rough paths. J. Funct. Anal. 216, no. 1, (2004), 86–140.
doi:10.1016/j.jfa.2004.01.002.
[Gub10] M. Gubinelli. Ramification of rough paths. J. Differential Equations 248, no. 4,
(2010), 693–721. doi:10.1016/j.jde.2009.11.015.
[Gub12] M. Gubinelli. Rough solutions for the periodic Korteweg–de Vries equation.
Commun. Pure Appl. Anal. 11, no. 2, (2012), 709–733. doi:10.3934/cpaa.2012.
11.709.
[Hai11a] M. Hairer. On Malliavin’s proof of Hörmander’s theorem. Bull. Sci. Math. 135, no.
6-7, (2011), 650–666. doi:10.1016/j.bulsci.2011.07.007.
[Hai11b] M. Hairer. Rough stochastic PDEs. Comm. Pure Appl. Math. 64, no. 11, (2011),
1547–1585. doi:10.1002/cpa.20383.
[Hai13] M. Hairer. Solving the KPZ equation. Ann. of Math. (2) 178, no. 2, (2013), 559–664.
arXiv:1109.6811. doi:10.4007/annals.2013.178.2.4.
[Hai14a] M. Hairer. Singular stochastic PDEs. In Proceedings of the International Congress
of Mathematicians—Seoul 2014. Vol. IV, 49–73. Kyung Moon Sa, Seoul, 2014.
arXiv:1403.6353.
[Hai14b] M. Hairer. A theory of regularity structures. Invent. Math. 198, no. 2, (2014),
269–504. doi:10.1007/s00222-014-0505-4.
[Hai15] M. Hairer. Introduction to regularity structures. Braz. J. Probab. Stat. 29, no. 2,
(2015), 175–210. doi:10.1214/14-BJPS241.
[Hai16] M. Hairer. The motion of a random string. arXiv e-prints (2016), 1–20. arXiv:
1605.02192.
[Hep69] K. Hepp. On the equivalence of additive and analytic renormalization. Comm. Math.
Phys. 14, (1969), 67–69. doi:10.1007/BF01645456.
[HH10] K. Hara and M. Hino. Fractional order Taylor’s series and the neo-classical
inequality. Bull. Lond. Math. Soc. 42, no. 3, (2010), 467–477. doi:10.1112/blms/
bdq013.
[HH18] A. Hocquet and M. Hofmanová. An energy method for rough partial differential
equations. J. Differential Equations 265, no. 4, (2018), 1407–1466. doi:10.1016/
j.jde.2018.04.006.
[HK15] M. Hairer and D. Kelly. Geometric versus non-geometric rough paths. Ann.
Inst. Henri Poincaré Probab. Stat. 51, no. 1, (2015), 207–251. doi:10.1214/
13-AIHP564.
[HL07] K. Hara and T. Lyons. Smooth rough paths and applications for Fourier analysis.
Rev. Mat. Iberoam. 23, no. 3, (2007), 1125–1140. doi:10.4171/RMI/526.
[HL10] B. Hambly and T. Lyons. Uniqueness for the signature of a path of bounded
variation and the reduced path group. Ann. of Math. (2) 171, no. 1, (2010), 109–167.
doi:10.4007/annals.2010.171.109.
[HL19] M. Hairer and X.-M. Li. Averaging dynamics driven by fractional Brownian
motion. arXiv e-prints (2019), 1–42. Ann. Probab., to appear. arXiv:1902.11251.
[HLN19] M. Hofmanová, J.-M. Leahy, and T. Nilssen. On the Navier-Stokes equation
perturbed by rough transport noise. J. Evol. Equ. 19, no. 1, (2019), 203–247. doi:
10.1007/s00028-018-0473-z.
[HM11] M. Hairer and J. C. Mattingly. A theory of hypoellipticity and unique ergodicity
for semilinear stochastic PDEs. Electron. J. Probab. 16, (2011), no. 23, 658–738.
doi:10.1214/EJP.v16-875.
[HM12] M. Hairer and J. Maas. A spatial version of the Itô-Stratonovich correction. Ann.
Probab. 40, no. 4, (2012), 1675–1714. doi:10.1214/11-AOP662.
[HMW14] M. Hairer, J. Maas, and H. Weber. Approximating rough stochastic PDEs.
Comm. Pure Appl. Math. 67, no. 5, (2014), 776–870. doi:10.1002/cpa.21495.
[HN07] Y. Hu and D. Nualart. Differential equations driven by Hölder continuous functions
of order greater than 1/2. In Stochastic analysis and applications, vol. 2 of Abel
Symp., 399–413. Springer, Berlin, 2007. doi:10.1007/978-3-540-70847-6_17.
[HN09] Y. Hu and D. Nualart. Rough path analysis via fractional calculus.
Trans. Amer. Math. Soc. 361, no. 5, (2009), 2689–2718. doi:10.1090/
S0002-9947-08-04631-X.
[HNS20] A. Hocquet, T. Nilssen, and W. Stannat. Generalized Burgers equation with
rough transport noise. Stochastic Process. Appl. 130, no. 4, (2020), 2159–2184.
doi:10.1016/j.spa.2019.06.014.
[Hof06] B. Hoff. The Brownian Frame Process as a Rough Path. Ph.D. thesis, University of
Oxford, 2006. arXiv:math/0602008.
[Hof16] M. Hofmanová. Scalar conservation laws with rough flux and stochastic forcing.
Stoch. Partial Differ. Equ. Anal. Comput. 4, no. 3, (2016), 635–690. doi:10.1007/
s40072-016-0072-3.
[Hof18] M. Hofmanová. On the rough Gronwall lemma and its applications. In Stochastic
partial differential equations and related fields, in Honor of Michael Röckner, Bielefeld
2016, vol. 229 of Springer Proc. Math. Stat., 333–344. Springer, Cham, 2018. doi:
10.1007/978-3-319-74929-7_2.
[Hop50] E. Hopf. The partial differential equation $u_t + u u_x = \mu u_{xx}$. Comm. Pure Appl.
Math. 3, (1950), 201–230. doi:10.1002/cpa.3160030302.
[Hör67] L. Hörmander. Hypoelliptic second order differential equations. Acta Math. 119,
(1967), 147–171. doi:10.1007/bf02392081.
[HP13] M. Hairer and N. S. Pillai. Regularity of laws and ergodicity of hypoelliptic
SDEs driven by rough paths. Ann. Probab. 41, no. 4, (2013), 2544–2598. doi:
10.1214/12-AOP777.
[HP15] M. Hairer and E. Pardoux. A Wong-Zakai theorem for stochastic PDEs. J. Math.
Soc. Japan 67, no. 4, (2015), 1551–1604. doi:10.2969/jmsj/06741551.
[HQ18] M. Hairer and J. Quastel. A class of growth models rescaling to KPZ. Forum
Math. Pi 6, (2018), e3, 112. arXiv:1512.07845. doi:10.1017/fmp.2018.2.
[HS90] W. Hebisch and A. Sikora. A smooth subadditive homogeneous norm on a
homogeneous group. Studia Mathematica 96, no. 3, (1990), 231–236. doi:10.
4064/sm-96-3-231-236.
[HS17] M. Hairer and H. Shen. A central limit theorem for the KPZ equation. Ann.
Probab. 45, no. 6B, (2017), 4167–4221. doi:10.1214/16-AOP1162.
[HS19] M. Hairer and P. Schönbauer. The support of singular stochastic PDEs. arXiv
e-prints (2019), 1–147. arXiv:1909.05526.
[HSV07] M. Hairer, A. M. Stuart, and J. Voss. Analysis of SPDEs arising in path
sampling. II. The nonlinear case. Ann. Appl. Probab. 17, no. 5-6, (2007), 1657–1706.
doi:10.1214/07-AAP441.
[HT13] Y. Hu and S. Tindel. Smooth density for some nilpotent rough differential
equations. J. Theoret. Probab. 26, no. 3, (2013), 722–749. doi:10.1007/
s10959-011-0388-x.
[HT19] F. A. Harang and S. Tindel. Volterra equations driven by rough signals. arXiv
e-prints (2019), 1–51. arXiv:1912.02064.
[HW13] M. Hairer and H. Weber. Rough Burgers-like equations with multiplicative
noise. Probab. Theory Related Fields 155, no. 1-2, (2013), 71–126. doi:10.1007/
s00440-011-0392-1.
[HW15] M. Hairer and H. Weber. Large deviations for white-noise driven, nonlinear
stochastic PDEs in two and three dimensions. Ann. Fac. Sci. Toulouse Math. (6) 24,
no. 1, (2015), 55–92. doi:10.5802/afst.1442.
[HX19] M. Hairer and W. Xu. Large scale limit of interface fluctuation models. Ann.
Probab. 47, no. 6, (2019), 3478–3550. doi:10.1214/18-aop1317.
[IK06] Y. Inahama and H. Kawabi. Large deviations for heat kernel measures on loop
spaces via rough paths. J. London Math. Soc. (2) 73, no. 3, (2006), 797–816. doi:
10.1112/S0024610706022654.
[IK07] Y. Inahama and H. Kawabi. Asymptotic expansions for the Laplace approximations
for Itô functionals of Brownian rough paths. J. Funct. Anal. 243, no. 1, (2007),
270–322. doi:10.1016/j.jfa.2006.09.016.
[IKN18] S. Ishiwata, H. Kawabi, and R. Namba. Central limit theorems for non-symmetric
random walks on nilpotent covering graphs: Part II. arXiv e-prints (2018), 1–41.
arXiv:1808.08856.
[IM85] A. Inoue and Y. Maeda. On integral transformations associated with a certain
Lagrangian—as a prototype of quantization. J. Math. Soc. Japan 37, no. 2, (1985),
219–244. doi:10.2969/jmsj/03720219.
[IN19] Y. Inahama and N. Naganuma. Asymptotic expansion of the density for hypoelliptic
rough differential equation. arXiv e-prints (2019), 1–33. arXiv:1902.05219.
[Ina06] Y. Inahama. Laplace’s method for the laws of heat processes on loop spaces. J.
Funct. Anal. 232, no. 1, (2006), 148–194. doi:10.1016/j.jfa.2005.06.006.
[Ina10] Y. Inahama. A stochastic Taylor-like expansion in the rough path theory. J. Theor.
Probab. 23, (2010), 671–714. doi:10.1007/s10959-010-0287-6.
[Ina13] Y. Inahama. Laplace approximation for rough differential equation driven by
fractional Brownian motion. Ann. Probab. 41, no. 1, (2013), 170–205. doi:10.
1214/11-AOP733.
[Ina14] Y. Inahama. Malliavin differentiability of solutions of rough differential equations.
J. Funct. Anal. 267, no. 5, (2014), 1566–1584. doi:10.1016/j.jfa.2014.06.011.
[Ina15] Y. Inahama. Large deviation principle of Freidlin-Wentzell type for pinned diffusion
processes. Trans. Amer. Math. Soc. 367, no. 11, (2015), 8107–8137. doi:10.1090/
S0002-9947-2015-06290-4.
[Ina16a] Y. Inahama. Large deviations for rough path lifts of Watanabe’s pullbacks of
delta functions. Int. Math. Res. Not. IMRN 2016, no. 20, (2016), 6378–6414. doi:
10.1093/imrn/rnv349.
[Ina16b] Y. Inahama. Short time kernel asymptotics for rough differential equation driven
by fractional Brownian motion. Electron. J. Probab. 21, (2016), Paper No. 34, 29.
doi:10.1214/16-EJP4144.
[INY78] N. Ikeda, S. Nakao, and Y. Yamato. A class of approximations of Brownian
motion. Publ. Res. Inst. Math. Sci. 13, no. 1, (1977/78), 285–300. doi:10.2977/
prims/1195190109.
[IT17] Y. Inahama and S. Taniguchi. Short time full asymptotic expansion of hypoelliptic
heat kernel at the cut locus. Forum Math. Sigma 5, (2017), e16, 74. doi:10.1017/
fms.2017.14.
[IW89] N. Ikeda and S. Watanabe. Stochastic differential equations and diffusion processes.
North-Holland Publishing Co., Amsterdam, second ed., 1989, xvi+555.
[JLM85] G. Jona-Lasinio and P. K. Mitter. On the stochastic quantization of field theory.
Comm. Math. Phys. 101, no. 3, (1985), 409–436. doi:10.1007/bf01216097.
[JM83] N. C. Jain and D. Monrad. Gaussian measures in $B_p$. Ann. Probab. 11, no. 1,
(1983), 46–57. doi:10.1214/aop/1176993659.
[Kal02] O. Kallenberg. Foundations of modern probability. Probability and its Applications
(New York). Springer-Verlag, New York, second ed., 2002, xx+638.
doi:10.1007/978-1-4757-4015-8.
[Kel16] D. Kelly. Rough path recursions and diffusion approximations. Ann. Appl. Probab.
26, no. 1, (2016), 425–461. doi:10.1214/15-aap1096.
[KM16] D. Kelly and I. Melbourne. Smooth approximation of stochastic differential
equations. Ann. Probab. 44, no. 1, (2016), 479–520. doi:10.1214/14-AOP979.
[KM17] D. Kelly and I. Melbourne. Deterministic homogenization for fast–slow systems
with chaotic noise. J. Funct. Anal. 272, no. 10, (2017), 4063–4102. doi:10.1016/j.
jfa.2017.01.015.
[Koh78] J. J. Kohn. Lectures on degenerate elliptic problems. In Pseudodifferential operator
with applications (Bressanone, 1977), 89–151. Liguori, Naples, 1978.
[KPP95] T. G. Kurtz, E. Pardoux, and P. Protter. Stratonovich stochastic differential
equations driven by general semimartingales. Ann. Inst. H. Poincaré Probab. Statist.
31, no. 2, (1995), 351–377.
[KPZ86] M. Kardar, G. Parisi, and Y.-C. Zhang. Dynamic scaling of growing interfaces.
Phys. Rev. Lett. 56, no. 9, (1986), 889–892. doi:10.1103/PhysRevLett.56.889.
[KR77] N. V. Krylov and B. L. Rozovskii. The Cauchy problem for linear stochastic
partial differential equations. Izv. Akad. Nauk SSSR Ser. Mat. 41, no. 6, (1977),
1329–1347, 1448. doi:10.1070/im1977v011n06abeh001768.
[KRT07] I. Kruk, F. Russo, and C. A. Tudor. Wiener integrals, Malliavin calculus and
covariance measure structure. J. Funct. Anal. 249, no. 1, (2007), 92–142. doi:
10.1016/j.jfa.2007.03.031.
[KS84] S. Kusuoka and D. Stroock. Applications of the Malliavin calculus. I. In Stochastic
analysis (Katata/Kyoto, 1982), vol. 32 of North-Holland Math. Library, 271–306.
North-Holland, Amsterdam, 1984. doi:10.1016/S0924-6509(08)70397-0.
[KS85] S. Kusuoka and D. Stroock. Applications of the Malliavin calculus. II. J. Fac.
Sci. Univ. Tokyo Sect. IA Math. 32, no. 1, (1985), 1–76.
[KS87] S. Kusuoka and D. Stroock. Applications of the Malliavin calculus. III. J. Fac.
Sci. Univ. Tokyo Sect. IA Math. 34, no. 2, (1987), 391–442.
[Kun82] H. Kunita. Stochastic partial differential equations connected with nonlinear filtering.
In Nonlinear filtering and stochastic control (Cortona, 1981), vol. 972 of Lecture
Notes in Math., 100–169. Springer, Berlin, 1982. doi:10.1007/BFb0064861.
[Kun84] H. Kunita. First order stochastic partial differential equations. In K. Itô, ed.,
Stochastic Analysis, vol. 32 of North-Holland Mathematical Library, 249–269.
Elsevier, 1984. doi:10.1016/S0924-6509(08)70396-9.
[Kup16] A. Kupiainen. Renormalization group and stochastic PDEs. Ann. Henri Poincaré
17, no. 3, (2016), 497–535. doi:10.1007/s00023-015-0408-y.
[Kus01] S. Kusuoka. Approximation of expectation of diffusion process and mathematical
finance. In Taniguchi Conference on Mathematics Nara ’98, vol. 31 of Adv. Stud. Pure
Math., 147–165. Math. Soc. Japan, Tokyo, 2001. doi:10.2969/aspm/03110147.
[KZK19] V. Kovač and P. Zorin-Kranich. Variational estimates for martingale paraproducts.
Electron. Commun. Probab. 24, (2019), Paper No. 48, 14. doi:10.1214/19-ecp257.
[LCL07] T. J. Lyons, M. Caruana, and T. Lévy. Differential equations driven by rough
paths, vol. 1908 of Lecture Notes in Mathematics. Springer, Berlin, 2007, xviii+109.
Lectures from the 34th Summer School on Probability Theory held in Saint-Flour,
July 6–24, 2004, With an introduction concerning the Summer School by Jean Picard.
doi:10.1007/978-3-540-71285-5.
[Lê18] K. L Ê. A stochastic sewing lemma and applications. arXiv e-prints (2018). Electron.
J. Probab., to appear. arXiv:1810.10500.
[Led96] M. L EDOUX. Isoperimetry and Gaussian analysis. In Lectures on probability theory
and statistics (Saint-Flour, 1994), vol. 1648 of Lecture Notes in Math., 165–294.
Springer, Berlin, 1996. doi:10.1007/bfb0095676.
[Lej06] A. L EJAY. Stochastic differential equations driven by processes generated by diver-
gence form operators. I. A Wong-Zakai theorem. ESAIM Probab. Stat. 10, (2006),
356–379. doi:10.1051/ps:2006015.
[Lej12] A. L EJAY. Global solutions to rough differential equations with unbounded vector
fields. In Séminaire de Probabilités XLIV, vol. 2046 of Lecture Notes in Math.,
215–246. Springer, Heidelberg, 2012. doi:10.1007/978-3-642-27461-9_11.
[Lep76] D. Lepingle. La variation d’ordre p des semi-martingales. Z. Wahrscheinlichkeitstheorie
und Verw. Gebiete 36, no. 4, (1976), 295–316. doi:10.1007/BF00532696.
[LL03] A. Lejay and T. J. Lyons. On the Importance of the Levy Area for Studying the
Limits of Functions of Converging Stochastic Processes. Application to Homogenization.
In D. Bakry, L. Beznea, G. Bucur, and M. Röckner, eds., Current
Trends in Potential Theory, vol. 7 of Current Trends in Potential Theory Conference
Proceedings, Bucharest, September 2002 and 2003. The Theta foundation / American
Mathematical Society, Bucarest, 2003.
[LL06] X.-D. Li and T. J. Lyons. Smoothness of Itô maps and diffusion processes on
path spaces. I. Ann. Sci. École Norm. Sup. (4) 39, no. 4, (2006), 649–677.
doi:10.1016/j.ansens.2006.07.001.
[LLQ02] M. Ledoux, T. Lyons, and Z. Qian. Lévy area of Wiener processes in Banach
spaces. Ann. Probab. 30, no. 2, (2002), 546–578. doi:10.1214/aop/1023481002.
[LN15] T. Lyons and H. Ni. Expected signature of Brownian motion up to the first exit
time from a bounded domain. Ann. Probab. 43, no. 5, (2015), 2729–2762.
doi:10.1214/14-AOP949.
[LO18] O. Lopusanschi and T. Orenshtein. Ballistic random walks in random environment
as rough paths: convergence and area anomaly. arXiv e-prints (2018), 1–15.
arXiv:1812.01403.
[LP18] C. Liu and D. J. Prömel. Examples of Itô càdlàg rough paths. Proc. Amer. Math.
Soc. 146, no. 11, (2018), 4937–4950. doi:10.1090/proc/14142.
[LPS13] P.-L. Lions, B. Perthame, and P. E. Souganidis. Scalar conservation laws with
rough (stochastic) fluxes. Stoch. Partial Differ. Equ. Anal. Comput. 1, no. 4, (2013),
664–686. doi:10.1007/s40072-013-0021-3.
[LPS14] P.-L. Lions, B. Perthame, and P. E. Souganidis. Scalar conservation laws with
rough (stochastic) fluxes: the spatially dependent case. Stoch. Partial Differ. Equ.
Anal. Comput. 2, no. 4, (2014), 517–538. doi:10.1007/s40072-014-0038-2.
[LQ98] T. Lyons and Z. Qian. Flow of diffeomorphisms induced by a geometric
multiplicative functional. Probab. Theory Related Fields 112, no. 1, (1998), 91–119.
doi:10.1007/s004400050184.
[LQ02] T. Lyons and Z. Qian. System control and rough paths. Oxford Mathematical
Monographs. Oxford University Press, Oxford, 2002, x+216. Oxford Science Publications.
doi:10.1093/acprof:oso/9780198506485.001.0001.
[LQZ02] M. Ledoux, Z. Qian, and T. Zhang. Large deviations and support theorem for
diffusion processes via rough paths. Stochastic Process. Appl. 102, no. 2, (2002),
265–283. doi:10.1016/S0304-4149(02)00176-X.
[LS84] P.-L. Lions and A.-S. Sznitman. Stochastic differential equations with reflecting
boundary conditions. Comm. Pure Appl. Math. 37, no. 4, (1984), 511–537.
doi:10.1002/cpa.3160370408.
[LS98a] P.-L. Lions and P. E. Souganidis. Fully nonlinear stochastic partial differential
equations. C. R. Acad. Sci. Paris Sér. I Math. 326, no. 9, (1998), 1085–1092.
doi:10.1016/S0764-4442(98)80067-0.
[LS98b] P.-L. Lions and P. E. Souganidis. Fully nonlinear stochastic partial differential
equations: non-smooth equations and applications. C. R. Acad. Sci. Paris Sér. I Math.
327, no. 8, (1998), 735–741. doi:10.1016/S0764-4442(98)80161-4.
[LS00a] P.-L. Lions and P. E. Souganidis. Fully nonlinear stochastic PDE with semilinear
stochastic dependence. C. R. Acad. Sci. Paris Sér. I Math. 331, no. 8, (2000), 617–624.
doi:10.1016/S0764-4442(00)00583-8.
[LS00b] P.-L. Lions and P. E. Souganidis. Uniqueness of weak solutions of fully nonlinear
stochastic partial differential equations. C. R. Acad. Sci. Paris Sér. I Math. 331, no. 10,
(2000), 783–790. doi:10.1016/S0764-4442(00)01597-4.
[LS01] W. V. Li and Q.-M. Shao. Gaussian processes: inequalities, small ball
probabilities and applications. In Stochastic processes: theory and methods, vol. 19 of
Handbook of Statist., 533–597. North-Holland, Amsterdam, 2001.
doi:10.1016/s0169-7161(01)19019-x.
[LS17] O. Lopusanschi and D. Simon. Area anomaly in the rough path Brownian scaling
limit of hidden Markov walks. arXiv e-prints (2017), 1–27. arXiv:1709.04288.
[LS18] O. Lopusanschi and D. Simon. Lévy area with a drift as a renormalization limit
of Markov chains on periodic graphs. Stochastic Process. Appl. 128, no. 7, (2018),
2404–2426. doi:10.1016/j.spa.2017.09.004.
[LV04] T. Lyons and N. Victoir. Cubature on Wiener space. Proc. R. Soc. Lond. Ser.
A Math. Phys. Eng. Sci. 460, no. 2041, (2004), 169–198. Stochastic analysis with
applications to mathematical finance. doi:10.1098/rspa.2003.1239.
[LV06] A. Lejay and N. Victoir. On (p, q)-rough paths. J. Differential Equations 225,
no. 1, (2006), 103–133. doi:10.1016/j.jde.2006.01.018.
[LV07] T. Lyons and N. Victoir. An extension theorem to rough paths. Ann. Inst. H.
Poincaré Anal. Non Linéaire 24, no. 5, (2007), 835–847.
doi:10.1016/j.anihpc.2006.07.004.
[LX13] T. J. Lyons and W. Xu. A uniform estimate for rough paths. Bull. Sci. Math. 137,
no. 7, (2013), 867–879. doi:10.1016/j.bulsci.2013.04.004.
[LX17] T. J. Lyons and W. Xu. Hyperbolic development and inversion of signature. J.
Funct. Anal. 272, no. 7, (2017), 2933–2955. doi:10.1016/j.jfa.2016.12.024.
[LX18] T. J. Lyons and W. Xu. Inverting the signature of a path. J. Eur. Math. Soc. (JEMS)
20, no. 7, (2018), 1655–1687. doi:10.4171/JEMS/796.
[LY02] F. Lin and X. Yang. Geometric measure theory—an introduction, vol. 1 of Advanced
Mathematics (Beijing/Boston). Science Press Beijing, Beijing, 2002, x+237.
[LY13] T. J. Lyons and D. Yang. The partial sum process of orthogonal expansions as
geometric rough process with Fourier series as an example—an improvement of
Menshov-Rademacher theorem. J. Funct. Anal. 265, no. 12, (2013), 3067–3103.
doi:10.1016/j.jfa.2013.08.032.
[LY15] T. J. Lyons and D. Yang. The theory of rough paths via one-forms and the extension
of an argument of Schwartz to rough differential equations. J. Math. Soc. Japan 67,
no. 4, (2015), 1681–1703. doi:10.2969/jmsj/06741681.
[LY16] T. Lyons and D. Yang. Recovering the pathwise Itô solution from averaged
Stratonovich solutions. Electron. Commun. Probab. 21, (2016), Paper No. 7, 18.
doi:10.1214/16-ECP3795.
[Lyo91] T. Lyons. On the nonexistence of path integrals. Proc. Roy. Soc. London Ser. A 432,
no. 1885, (1991), 281–290. doi:10.1098/rspa.1991.0017.
[Lyo94] T. Lyons. Differential equations driven by rough signals. I. An extension of an
inequality of L. C. Young. Math. Res. Lett. 1, no. 4, (1994), 451–464.
doi:10.4310/MRL.1994.v1.n4.a5.
[Lyo95] T. J. Lyons. The interpretation and solution of ordinary differential equations driven
by rough signals. In Stochastic analysis (Ithaca, NY, 1993), vol. 57 of Proc. Sympos.
Pure Math., 115–128. Amer. Math. Soc., Providence, RI, 1995.
doi:10.1090/pspum/057/1335466.
[Lyo98] T. J. Lyons. Differential equations driven by rough signals. Rev. Mat. Iberoamericana
14, no. 2, (1998), 215–310. doi:10.4171/RMI/240.
[Lyo14] T. Lyons. Rough paths, signatures and the modelling of functions on streams. In
Proceedings of the International Congress of Mathematicians—Seoul 2014. Vol. IV,
163–184. Kyung Moon Sa, Seoul, 2014. arXiv:1405.4537.
[LZ99] T. Lyons and O. Zeitouni. Conditional exponential moments for iterated
Wiener integrals. Ann. Probab. 27, no. 4, (1999), 1738–1749.
doi:10.1214/aop/1022677546.
[Mal78] P. Malliavin. Stochastic calculus of variations and hypoelliptic operators.
Proc. Intern. Symp. SDE (1978), 195–263.
[Mal97] P. Malliavin. Stochastic analysis, vol. 313 of Grundlehren der Mathematischen
Wissenschaften [Fundamental Principles of Mathematical Sciences]. Springer-Verlag,
Berlin, 1997, xii+343. doi:10.1007/978-3-642-15074-6.
[McK69] H. P. McKean, Jr. Stochastic integrals. Probability and Mathematical Statistics,
No. 5. Academic Press, New York-London, 1969, xiii+140. doi:10.1090/chel/353.
[McS72] E. J. McShane. Stochastic differential equations and models of random processes.
In Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and
Probability (Univ. California, Berkeley, Calif., 1970/1971), Vol. III: Probability theory,
263–294. Univ. California Press, Berkeley, Calif., 1972.
[Mey92] Y. Meyer. Wavelets and operators, vol. 37 of Cambridge Studies in Advanced
Mathematics. Cambridge University Press, Cambridge, 1992, xvi+224. Translated from
the 1990 French original by D. H. Salinger. doi:10.1017/cbo9780511623820.
[Mon02] R. Montgomery. A tour of subriemannian geometries, their geodesics and
applications, vol. 91 of Mathematical Surveys and Monographs. American Mathematical
Society, Providence, RI, 2002, xx+259. doi:10.1090/surv/091.
[MP18] J. Martin and N. Perkowski. A Littlewood-Paley description of modelled
distributions. arXiv e-prints (2018), 1–25. arXiv:1808.00500.
[MR06] M. B. Marcus and J. Rosen. Markov processes, Gaussian processes, and local
times, vol. 100 of Cambridge Studies in Advanced Mathematics. Cambridge University
Press, Cambridge, 2006, x+620. doi:10.1017/CBO9780511617997.
[MSS06] A. Millet and M. Sanz-Solé. Large deviations for rough paths of the fractional
Brownian motion. Ann. Inst. H. Poincaré Probab. Statist. 42, no. 2, (2006), 245–271.
doi:10.1016/j.anihpb.2005.04.003.
[MST18] V. Magnani, E. Stepanov, and D. Trevisan. A rough calculus approach to level
sets in the Heisenberg group. J. Lond. Math. Soc., II. Ser. 97, no. 3, (2018), 495–522.
doi:10.1112/jlms.12115.
[MW18] A. Moinat and H. Weber. Space-time localisation for the dynamic $\Phi^4_3$ model.
arXiv e-prints (2018), 1–27. arXiv:1811.05764.
[Nor86] J. Norris. Simplified Malliavin calculus. In Séminaire de Probabilités, XX, 1984/85,
vol. 1204 of Lecture Notes in Math., 101–130. Springer, Berlin, 1986.
doi:10.1007/BFb0075716.
[NP88] D. Nualart and É. Pardoux. Stochastic calculus with anticipating integrands.
Probab. Theory Related Fields 78, no. 4, (1988), 535–581.
doi:10.1007/BF00353876.
[NT11] D. Nualart and S. Tindel. A construction of the rough path above fractional
Brownian motion using Volterra’s representation. Ann. Probab. 39, no. 3, (2011),
1061–1096. doi:10.1214/10-AOP578.
[Nua06] D. Nualart. The Malliavin calculus and related topics. Probability and its
Applications (New York). Springer-Verlag, Berlin, second ed., 2006, xiv+382.
doi:10.1007/3-540-28329-3.
[OSSW18] F. Otto, J. Sauer, S. Smith, and H. Weber. Parabolic equations with rough
coefficients and singular forcing. arXiv e-prints (2018), 1–93. arXiv:1803.07884.
[Par79] E. Pardoux. Stochastic partial differential equations and filtering of diffusion
processes. Stochastics 3, no. 2, (1979), 127–167. doi:10.1080/17442507908833142.
[Pic08] J. Picard. A tree approach to p-variation and to integration. Ann. Probab. 36, no. 6,
(2008), 2235–2279. doi:10.1214/07-AOP388.
[PL11] A. Papavasiliou and C. Ladroue. Parameter estimation for rough differential
equations. Ann. Statist. 39, no. 4, (2011), 2047–2073. doi:10.1214/11-AOS893.
[Pro05] P. E. Protter. Stochastic integration and differential equations, vol. 21 of Stochastic
Modelling and Applied Probability. Springer-Verlag, Berlin, 2005, xiv+419. Second
edition. Version 2.1, Corrected third printing. doi:10.1007/978-3-662-10061-5.
[PS08] G. A. Pavliotis and A. M. Stuart. Multiscale methods, vol. 53 of Texts in Applied
Mathematics. Springer, New York, 2008, xviii+307. Averaging and homogenization.
doi:10.1007/978-0-387-73829-1.
[PT16] D. J. Prömel and M. Trabs. Rough differential equations driven by signals
in Besov spaces. J. Differential Equations 260, no. 6, (2016), 5202–5249.
doi:10.1016/j.jde.2015.12.012.
[PT18] D. J. Prömel and M. Trabs. Paracontrolled distribution approach to stochastic
Volterra equations. arXiv e-prints (2018), 1–39. arXiv:1812.05456.
[PW81] G. Parisi and Y. S. Wu. Perturbation theory without gauge fixing. Sci. Sinica 24,
no. 4, (1981), 483–496. doi:10.1360/ya1981-24-4-483.
[Qua11] J. Quastel. Introduction to KPZ. Current Developments in Mathematics 2011,
(2011), 125–194. doi:10.4310/cdm.2011.v2011.n1.a3.
[Ras38] P. Rashevskii. About connecting two points of complete non-holonomic space by
admissible curve (in Russian). Uch. Zapiski ped. inst. Libknexta 2, (1938), 83–94.
[Ree58] R. Ree. Lie elements and an algebra associated with shuffles. Ann. of Math. (2) 68,
(1958), 210–220. doi:10.2307/1970243.
[Rie17] S. Riedel. Transportation–cost inequalities for diffusions driven by gaussian
processes. Electron. J. Probab. 22, (2017), 26 pp. doi:10.1214/17-EJP40.
[RS17] S. Riedel and M. Scheutzow. Rough differential equations with unbounded drift
term. J. Differential Equations 262, no. 1, (2017), 283–312.
doi:10.1016/j.jde.2016.09.021.
[RX13] S. Riedel and W. Xu. A simple proof of distance bounds for Gaussian rough paths.
Electron. J. Probab. 18, (2013), no. 108, 1–18. doi:10.1214/EJP.v18-2387.
[RY99] D. Revuz and M. Yor. Continuous martingales and Brownian motion, vol. 293
of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of
Mathematical Sciences]. Springer-Verlag, Berlin, third ed., 1999, xiv+602.
doi:10.1007/978-3-662-06400-9.
[Rya02] R. A. Ryan. Introduction to tensor products of Banach spaces. Springer Monographs
in Mathematics. Springer-Verlag London Ltd., London, 2002, xiv+225.
doi:10.1007/978-1-4471-3903-4_1.
[Sch18] P. Schönbauer. Malliavin calculus and density for singular stochastic partial
differential equations. arXiv e-prints (2018), 1–63. arXiv:1809.03570.
[See18a] B. Seeger. Approximation schemes for viscosity solutions of fully nonlinear
stochastic partial differential equations. arXiv e-prints (2018), 1–40.
arXiv:1802.04740.
[See18b] B. Seeger. Perron’s method for pathwise viscosity solutions. Comm. Partial
Differential Equations 43, no. 6, (2018), 998–1018.
doi:10.1080/03605302.2018.1488262.
[Sim97] L. Simon. Schauder estimates by scaling. Calc. Var. Partial Differential Equations 5,
no. 5, (1997), 391–407. doi:10.1007/s005260050072.
[Sip93] E.-M. Sipiläinen. A pathwise view of solutions of stochastic differential equations.
Ph.D. thesis, University of Edinburgh, 1993.
[Sou19] P. E. Souganidis. Pathwise solutions for fully nonlinear first- and second-order
partial differential equations with multiplicative rough time dependence. In
F. Flandoli, M. Gubinelli, and M. Hairer, eds., Singular Random Dynamics:
Cetraro, Italy 2016, 75–220. Springer International Publishing, Cham, 2019.
doi:10.1007/978-3-030-29545-5_3.
[ST78] V. N. Sudakov and B. S. Tsirel’son. Extremal properties of half-spaces for
spherically invariant measures. J. Sov. Math. 9, no. 1, (1978), 9–18.
doi:10.1007/BF01086099.
[ST18] H. Singh and J. Teichmann. An elementary proof of the reconstruction theorem.
arXiv e-prints (2018), 1–25. arXiv:1812.03082.
[Str11] D. W. Stroock. Probability theory. Cambridge University Press, Cambridge,
second ed., 2011, xxii+527. An analytic view. doi:10.1017/cbo9780511974243.
[Sus78] H. J. Sussmann. On the gap between deterministic and stochastic ordinary
differential equations. Ann. Probability 6, no. 1, (1978), 19–41.
doi:10.1214/aop/1176995608.
[Sus91] H. J. Sussmann. Limits of the Wong-Zakai type with a modified drift term. In
Stochastic analysis, Proc. Conf. Honor Moshe Zakai 65th Birthday, Haifa/Isr.,
475–493. Academic Press, Boston, MA, 1991.
[SV72] D. W. Stroock and S. R. S. Varadhan. On the support of diffusion processes with
applications to the strong maximum principle. In Proceedings of the Sixth Berkeley
Symposium on Mathematical Statistics and Probability, Volume 3: Probability Theory,
333–359. University of California Press, Berkeley, Calif., 1972.
[SV73] D. W. Stroock and S. R. S. Varadhan. Limit theorems for random walks on Lie
groups. Sankhyā Ser. A 35, no. 3, (1973), 277–294.
[Tan84] H. Tanaka. Limit theorems for certain diffusion processes with interaction. In
Stochastic analysis (Katata/Kyoto, 1982), vol. 32 of North-Holland Math. Library,
469–488. North-Holland, Amsterdam, 1984. doi:10.1016/S0924-6509(08)70405-7.
[TC15] S. Tindel and K. Chouk. Skorohod and Stratonovich integration in the plane.
Electron. J. Probab. 20, (2015), 39. Id/No 39. doi:10.1214/EJP.v20-3041.
[Tei11] J. Teichmann. Another approach to some rough and stochastic partial
differential equations. Stoch. Dyn. 11, no. 2-3, (2011), 535–550.
doi:10.1142/S0219493711003437.
[Tow02] N. Towghi. Multidimensional extension of L. C. Young’s inequality. JIPAM. J.
Inequal. Pure Appl. Math. 3, no. 2, (2002), Article 22, 13 pp. (electronic).
[TZ18] N. Tapia and L. Zambotti. The geometry of the space of branched Rough Paths.
arXiv e-prints (2018), 1–35. Proc. LMS, to appear. arXiv:1810.12179.
[Um74] G. S. Um. On normalization problems of the path integral method. J. Math. Phys. 15,
no. 2, (1974), 220–224. doi:10.1063/1.1666626.
[Unt10] J. Unterberger. A rough path over multidimensional fractional Brownian motion
with arbitrary Hurst index by Fourier normal ordering. Stochastic Process. Appl. 120,
no. 8, (2010), 1444–1472. doi:10.1016/j.spa.2010.04.001.
[Vic04] N. Victoir. Levy area for the free Brownian motion: existence and non-existence.
J. Funct. Anal. 208, no. 1, (2004), 107–121. doi:10.1016/S0022-1236(03)00063-6.
[Wer12] B. M. Werness. Regularity of Schramm-Loewner evolutions, annular crossings,
and rough path theory. Electron. J. Probab. 17, (2012), no. 81, 21.
doi:10.1214/EJP.v17-2331.
[Wil01] D. R. E. Williams. Path-wise solutions of stochastic differential equations driven
by Lévy processes. Rev. Mat. Iberoamericana 17, no. 2, (2001), 295–329.
doi:10.4171/RMI/296.
[WZ65] E. Wong and M. Zakai. On the convergence of ordinary integrals to stochastic
integrals. Ann. Math. Statist. 36, no. 5, (1965), 1560–1564.
doi:10.1214/aoms/1177699916.
[Yas18] P. Yaskov. Extensions of the sewing lemma with applications. Stochastic Processes
Appl. 128, no. 11, (2018), 3940–3965. doi:10.1016/j.spa.2017.09.023.
[You36] L. C. Young. An inequality of the Hölder type, connected with Stieltjes integration.
Acta Math. 67, no. 1, (1936), 251–282. doi:10.1007/BF02401743.
[Zim69] W. Zimmermann. Convergence of Bogoliubov’s method of renormalization in
momentum space. Comm. Math. Phys. 15, (1969), 208–234.
doi:10.1007/BF01645676.
Index

$|||\cdot|||_\alpha$, 68
BC, 214
BUC, 228
$C^\alpha$, 16
$C^\alpha_g$, 20
$C^{0,\alpha}_g$, 30
$C^{0,\alpha}_{g,0}$, 160
$C^\gamma$, 246
$C^{p\text{-var}}_g$, 187
$C^{p\text{-var}}$, 187
$D^{2\alpha}_X$, 70
$D^\gamma_\alpha$, 265
$D^\gamma(V)$, 264
$G((\mathbf{R}^d))$, 29
$M$, 250
$\varrho_\alpha$, 18
$T((V))$, 28
$T_{\ge\alpha}$, 265
$T_h$, 36
$W^1$, 186

admissible models, 272

Baker–Campbell–Hausdorff formula, 23
Borell’s inequality, 190
Bouleau–Hirsch criterion, 196
bracket of a rough path, 93
Brownian motion, 186
    Banach-valued, 54
    fractional, 175, 177
    Hölder roughness, 114
    Hilbert-valued, 54
    in magnetic field, as rough path, 46
    Itô, as rough path, 43
    physical, 46, 56
    Stratonovich, as rough path, 44
Brownian rough path, 40, 45
Burkholder–Davis–Gundy inequality, 58

Cameron–Martin
    embedding theorem, 186
    paths, 186
    space, 186
    theorem for Brownian rough path, 160
    variation embedding, 186
    variation embedding, improved, 202
Carnot–Carathéodory
    norm, 24
Cass–Litterer–Lyons estimates, 195
Chen’s relation, 15, 21, 25, 28
Chen–Strichartz formula, 149
complementary Young regularity, 185
concentration of measure, 190
continuity equation, 211
controlled rough paths, 70
    composition with regular functions, 121
    integration, 71
    of low regularity, 127
    operations on, 119
    relation to rough paths, 119
covariance function, 165
cubature formula, 52, 57
cubature on Wiener space, 50

Davie’s lemma, 64
differential equations
    Young, 132
dilation, 17
division property, 122
Doob–Meyer
    decomposition, 107
    for rough paths, 110

enhanced Brownian motion, 40

fast-slow system, 163
Fawcett’s formula, 51
Fernique theorem, 190
    for Gaussian rough paths, 191
    generalised, 190
Feynman–Kac formula, 215
filtering, 236
flow, 148
fractional Brownian motion, 116, 165, 177
    as rough path, 175
Freidlin–Wentzell large deviations, 156, 218

Gaussian rough paths, 165
group-like, 29
Gubinelli derivative, 63, 70
    uniqueness, 109

Hölder roughness, 112
    of Brownian motion, 114
    of fractional Brownian motion, 116
Hölder space, 246, 260
Hörmander’s theorem, 196
    rough path proof, 200
Hölder–Zygmund spaces, 271
Hamilton–Jacobi equation, 241
harmonic analysis, 38, 58, 88, 253
Heisenberg group, 24
homogenisation, 163
hybrid Itô-rough differential equation, 236

integrability of rough integrals, 192
integral
    of controlled rough paths, 71
    rough, 63, 67
    Skorokhod, 103
    stochastic rough, 86
    Stratonovich anticipating, 103
integration
    backward Itô, 97
    Itô, 89
    of controlled rough paths, 71
    rough, 63, 67, 86
    stochastic rough, 86
    Stratonovich, 91
interpolation, 30
Itô’s formula, 92, 94
    controlled rough path point of view, 125
Itô–Föllmer formula, 92
Itô–Lyons map, 141

Kallianpur–Striebel formula, 237
Kolmogorov type criteria, 40, 42, 52, 165
KPZ equation, 289
    Hopf–Cole solution, 291
    solution via regularity structures, 293
    solution via rough paths, 315

Lépingle’s BDG inequality, 58
Lévy’s stochastic area, 39
Laplace method, 156, 162
large deviations, 156
large deviations of Schilder type, 160
law of the iterated logarithm, 111
Lie algebra, 23
Lie group, 21, 26
    free nilpotent, 26
lift
    BPHZ lift, 308
    canonical lift, 296

Malliavin calculus, 196
Malliavin covariance matrix, 196
model, 249
modelled distribution, 250
    composition with regular function, 264
    differentiation, 263
    singular, 85
multiplicative functional, 64
    almost, 64

neo-classical inequality, 82
Norris’ lemma, 112

one-form, 62
Ornstein–Uhlenbeck process, 178, 179, 232

p-variation, 185
polymer measure, 163

quadratic variation
    in the sense of Föllmer, 95

random dynamical system, 183
random walk, 53
reconstruction theorem, 251, 256
regularity structure, 243, 245
    model for, 249
    polynomial structure, 246
    rough path structure, 247
renormalisation
    BPHZ renormalisation, 308
    renormalisation group, 307
Riemann–Stieltjes sum, 61
    compensated, 62
robustness
    of filtering, 236
    of maximum likelihood estimation, 102
    of rough integration, 74
rough
    continuity equation, 211
    convolution, 87
    Hamilton–Jacobi equation, 241
    scalar conservation law, 240
    transport equation, 207
    truly, 109
rough differential equation, 131, 134, 137
    calculus of variations, 197
    Davie’s definition, 143
    driven by Gaussian signal
        Hörmander theory, 196
        Malliavin calculus, 196
    Euler approximation, 143
    explicit solution, 149
    explosion, 149
    flows, 148
    Hörmander’s theorem, 200
    in the sense of Davie, 144
    linear, 145
    Lyons’ definition, 144
    Milstein approximation, 143
    partial, 207
    partial, Feynman–Kac formula, 214
    Peano existence, 151
    Picard iteration, 151
    stochastic, 216
    with drift, 149
rough Gronwall lemma, 145
rough integral, 63
    improper, 85
    integrability, 192
rough integration, 63
rough partial differential equations, 207
rough path, 16
    bracket, 93
    branched, 27, 38
    Brownian, large deviations, 160
    Brownian, support, 160
    càdlàg, 38
    Cameron–Martin theorem for, 160
    controlled, 63, 70
    controlled, of lower regularity, 76
    convergence, via interpolation, 30
    convergence, via Kolmogorov, 42
    discrete, 38
    Donsker theorem, 53
    Doob–Meyer for, 110
    extension theorem, 81, 150
    Fernique theorem, 190
    for Gaussian process, 175
    for Ornstein-Uhlenbeck process, 178
    for physical Brownian motion, 46
    for stochastic heat equation, 230
    from random walk, 53
    Gaussian, 165
        Fernique theorem for, 191
        Malliavin calculus for, 196
    geometric, 20
    integral, 67
    integral of convolution type, 86
    Kolmogorov criterion, 39
    Kolmogorov tightness criterion, 52
    Lévy–Khintchine formula, 56
    Lyons lift, 81, 150
    Lyons–Victoir extension, 34, 261
    metric, 18
    mildly controlled, 129
    norm
        homogeneous, 17
        homogeneous p-variation, 187
    Norris’ lemma for, 112
    pure area, 31, 193
    reduced, 93
    relation to controlled rough paths, 119
    space-time, 149
    spaces, separability, 30
    spatial, 230
    time reversal, 29
    translation, 33, 36
    translation operator, 36, 188
    weakly geometric, 20, 38
    with jumps, 38
rough path norm, 17
rough transport equation, 207
rough viscosity solutions, 228

scalar conservation law, 240
Schauder estimates, 272
sewing lemma, 64, 65
    stochastic, 78, 85, 86
    with semigroups, 86
shuffle algebra, 128
shuffle product, 28
stability
    flows of rough differential equations, 148
    functions of controlled rough paths, 122
    rough differential equations, 141
    rough integration, 74
    viscosity solutions, 229
statistics
    applications to, 102, 183
stochastic continuity equation, 211
stochastic differential equation, 153
    Freidlin–Wentzell large deviations, 156
    in Itô sense, 153
    in Stratonovich sense, 153
    Stroock–Varadhan support theorem, 155
    with jumps, 163
    with singular drift, 163
    Wong–Zakai approximations, 154
stochastic heat equation as rough path, 230
stochastic integration
    anticipating, 103
    backward Itô, 97
    Itô, 89
    Stratonovich, 91
stochastic partial differential equation, 207
    Burgers-like, 230
    Feynman–Kac formula, 214
    KPZ, 289
    linear stochastic heat equation, 232
    singular semilinear, 289
    spatial Itô–Stratonovich correction, 237
    stochastic HJB equation, 230
    Zakai equation, 230
stochastic transport equation, 207
Stroock–Varadhan support theorem, 155, 218

tensor algebra
    truncated, 21
tensor norm
    injective, 55
    projective, 10, 55
tensor series, 28
translation of a rough path, 36, 188
translation operator, 36
    higher order, 33
    second order, 33
transport equation, 207
true roughness
    as condition for Hörmander’s theorem, 201
    of Brownian motion, 111
truly rough, 109

universal limit theorem, 151

variation
    2D ϱ-variation, 168
    controlled ϱ-variation, 169
    regularity, 185

wavelets, 256
Wiener–Itô chaos, 186
Wong–Zakai theorem
    for Brownian rough path, 45
    for SDEs, 154, 161
    for singular SPDEs, 161
    for SPDEs, 218
word, 28

Young
    2D maximal inequality, 168
    differential equations, 133
    inequality, 62
    integral, 61
