
Generative modeling for time series via Schrödinger bridge

Mohamed HAMDOUCHE∗ Pierre HENRY-LABORDERE† Huyên PHAM‡

arXiv:2304.05093v1 [math.OC] 11 Apr 2023

Abstract

We propose a novel generative model for time series based on the Schrödinger bridge (SB) approach. This consists in the entropic interpolation via optimal transport between a reference probability measure on path space and a target measure consistent with the joint data distribution of the time series. The solution is characterized by a stochastic differential equation on finite horizon with a path-dependent drift function, hence respecting the temporal dynamics of the time series distribution. We can estimate the drift function from data samples either by kernel regression methods or with LSTM neural networks, and the simulation of the SB diffusion yields new synthetic data samples of the time series.
The performance of our generative model is evaluated through a series of numerical experiments. First, we test with a toy autoregressive model, a GARCH model, and the example of fractional Brownian motion, and measure the accuracy of our algorithm with marginal and temporal dependency metrics. Next, we use our SB-generated synthetic samples for the application to deep hedging on real data sets. Finally, we illustrate the SB approach for generating sequences of images.

Keywords: generative models, time series, Schrödinger bridge, kernel estimation, deep hedging.

1 Introduction
Sequential data appear widely in our society, for instance in video and audio data, and simulations of time series models are used in various industrial applications including clinical predictions [17] and weather forecasts [5]. In the financial industry, simulations of dynamical scenarios are considered in market stress tests, risk measurement, and risk management, e.g. in deep hedging [2]. The design of time series models is a delicate issue, requiring expensive calibration tasks, and subject to misspecification error and model risk. Therefore, the generation of synthetic samples of time series has gained increasing attention over the last years, and opens the door in the financial sector to a purely data-driven approach in risk management.

∗ LPSM, Université Paris Cité and Sorbonne Université, and Qube Research and Technologies, hamdouche at lpsm.paris
† Qube Research and Technologies, pierre.henrylabordere at qube-rt.com
‡ LPSM, Université Paris Cité and Sorbonne Université, pham at lpsm.paris. This author is supported by the BNP-PAR Chair “Futures of Quantitative Finance”, and by FiME, Laboratoire de Finance des Marchés de l’Energie, and the “Finance and Sustainable Development” EDF - CACIB Chair.

Generative modeling (GM) has become over the last years a successful machine learning task for data synthesis, notably in (static) image processing. Several competing methods have been developed, and the state of the art includes likelihood-based models such as energy-based models (EBM) [15] and variational auto-encoders (VAE) [14], implicit generative models with the prominent works on generative adversarial networks (GAN) [11] and their extensions [1], and recently the new generation of score-based models using Langevin dynamics [21], [22], [12], and diffusion models via Schrödinger bridge, see [13] for the application to a class of stochastic volatility models, and [23], [8]. Generative methods for time series raise challenging issues for learning the temporal dependencies efficiently. Indeed, in order to capture the potentially complex dynamics of variables across time, it is not sufficient to learn the time marginals or even the joint distribution without exploiting the sequential structure. Increasing attention has been paid to these methods in the literature, and state-of-the-art generative methods for time series include: Time series GAN [27], which combines an unsupervised adversarial loss on real/synthetic data and a supervised loss for generating sequential data; Quant GAN [25], with an adversarial generator using temporal convolutional networks; causal optimal transport COT-GAN [26], with an adversarial generator using the adapted Wasserstein distance for processes; the conditional loss Euler generator [20], starting from a diffusion representation of the time series and minimizing the conditional distance between transition probabilities of real/synthetic samples; signature embedding of time series [9], [18], [3]; and functional data analysis with neural SDEs [6].
In this paper, we develop a novel generative model based on the Schrödinger bridge approach that captures the temporal dynamics of the time series. This consists in the entropic interpolation via optimal transport between a reference probability measure on path space and a target measure consistent with the joint data distribution of the time series. The solution is characterized by a stochastic differential equation on finite horizon with a path-dependent drift function, called the Schrödinger bridge time series (SBTS) diffusion, and the simulation of the SBTS diffusion yields new synthetic data samples of the time series.
Our SB approach differs from related works that have been recently designed for learning marginal (or static) distributions. In [23], the authors perform generative modeling by solving two SB problems. The paper [8] formulates generative modeling by computing the SB problem between the data and a prior distribution. The very recent work [4] proposes momentum SB by considering an additional velocity variable for learning multi-marginal distributions. Let us also mention the recent paper [19] that combines SB with the h-transform in order to respect aligned data. Instead, our SBTS diffusion interpolates the joint time series distribution starting from an initial deterministic value. Moreover, we propose an alternate method for the estimation of the drift function, which is path-dependent in our case. While [23] uses a logistic regression for estimating the density ratio and then the drift function, which requires additional samples from Gaussian noises, and [8] performs an extension of the Sinkhorn algorithm, we propose a kernel regression method relying solely on data samples, which turns out to be quite simple, efficient, and computationally low-cost. Compared to GAN-type methods, the simulation of synthetic samples from SBTS is much faster as it does not require the training of neural networks.
We validate our methodology with several numerical experiments. We first test on some time series models such as autoregressive and GARCH models, and also on the fractional Brownian motion with rough paths. The accuracy is measured by metrics aiming to capture the temporal dynamics and the correlation structure. We also provide operational metrics of interest for the financial industry by implementing our results on real data sets, and applying them to the deep hedging of call options. Finally, we show some numerical illustrations of our SB method in high dimension for the generation of sequential images.

2 Problem formulation
Let µ be the distribution of a time series representing the evolution of some $\mathbb{R}^d$-valued process of interest (e.g. asset price, claim process, audio/video data, etc.), and suppose that one can observe samples of this process at given fixed times of a discrete time grid $\mathcal{T} = \{t_i, i = 1,\dots,N\}$ on $(0, \infty)$. We set $T = t_N$ as the terminal observation horizon. Our goal is to construct a model that generates time series samples according to the unknown target distribution $\mu \in \mathcal{P}((\mathbb{R}^d)^N)$, the set of probability measures on $(\mathbb{R}^d)^N$. For that objective, we propose a dynamic modification of the Schrödinger bridge as follows. Let $\Omega = C([0,T]; \mathbb{R}^d)$ be the space of $\mathbb{R}^d$-valued continuous functions on $[0,T]$, $X = (X_t)_t$ the canonical process with initial value $X_0 = 0$, and $\mathbb{F} = (\mathcal{F}_t)_t$ the canonical filtration. Denoting by $\mathcal{P}(\Omega)$ the space of probability measures on $\Omega$, we search for P (representing the theoretical generative model) in $\mathcal{P}(\Omega)$, close to the Wiener measure W in the sense of Kullback-Leibler divergence (or relative entropy), and consistent with the observation samples. In other words, we look for a probability measure $P^* \in \mathcal{P}(\Omega)$ solution to:

$$
P^* \in \operatorname*{arg\,min}_{P \in \mathcal{P}_T^{\mu}(\Omega)} H(P|W), \qquad (2.1)
$$

where $\mathcal{P}_T^{\mu}(\Omega)$ is the set of probability measures P on $\Omega$ with joint distribution µ at $(t_1,\dots,t_N)$, i.e., $P \circ (X_{t_1},\dots,X_{t_N})^{-1} = \mu$, and $H(\cdot|\cdot)$ is the relative entropy between two probability measures, defined by

$$
H(P|W) =
\begin{cases}
\int \ln \frac{dP}{dW}\, dP, & \text{if } P \ll W, \\
\infty, & \text{otherwise}.
\end{cases}
$$
Denoting by $E_P$ and $E_W$ the expectations under P and W, we see that $H(P|W) = E_P[\ln \frac{dP}{dW}] = E_W[\frac{dP}{dW} \ln \frac{dP}{dW}]$ when $P \ll W$. Compared to the classical Schrödinger bridge (SB) (see [16], and the application to generative modeling in [23]), which looks for a probability measure P that interpolates between an initial probability measure and a target probability distribution at terminal time T, here we take into account, via the constraint in $\mathcal{P}_T^{\mu}$, the temporal dependence of the process observed at sequential times $t_1 < \dots < t_N$, and look for an entropic interpolation of the time series distribution. We call (2.1) the Schrödinger bridge for time series (SBTS) problem.
Let us now formulate (SBTS) as a stochastic control problem, following the well-known connection established for the classical (SB) in [7]. Given $P \in \mathcal{P}(\Omega)$ with finite relative entropy $H(P|W) < \infty$, it is known by Girsanov's theorem that one can associate to P an $\mathbb{F}$-adapted $\mathbb{R}^d$-valued process $\alpha = (\alpha_t)$ with finite energy, $E_P[\int_0^T |\alpha_t|^2 dt] < \infty$, such that

$$
\ln \frac{dP}{dW} = \int_0^T \alpha_t \cdot dX_t - \frac{1}{2}\int_0^T |\alpha_t|^2 \, dt, \qquad (2.2)
$$

and $X_t - \int_0^t \alpha_s \, ds$, $0 \le t \le T$, is a Brownian motion under P. We then have

$$
H(P|W) = E_P\Big[\frac{1}{2}\int_0^T |\alpha_t|^2 \, dt\Big].
$$

Therefore, (SBTS) is reformulated equivalently in the language of stochastic control as:

$$
\begin{cases}
\text{Minimize over } \alpha \in \mathcal{A}: \quad J(\alpha) = E_P\Big[\frac{1}{2}\int_0^T |\alpha_t|^2 \, dt\Big] \\
\text{subject to } dX_t = \alpha_t \, dt + dW_t, \quad X_0 = 0, \quad (X_{t_1},\dots,X_{t_N}) \overset{P}{\sim} \mu,
\end{cases}
$$

where W is a Brownian motion under P, $\mathcal{A}$ is the set of $\mathbb{R}^d$-valued $\mathbb{F}$-adapted processes α such that $E_P[\int_0^T |\alpha_t|^2 dt] < \infty$, and $(X_{t_1},\dots,X_{t_N}) \overset{P}{\sim} \mu$ is the usual notation for $P \circ (X_{t_1},\dots,X_{t_N})^{-1} = \mu$. In the sequel, when there is no ambiguity, we omit the reference to P in $E = E_P$ and $\overset{P}{\sim}\, = \,\sim$. We denote by $V_{SBTS}$ the infimum of this stochastic control problem under the joint distribution constraint:

$$
V_{SBTS} := \inf_{\alpha \in \mathcal{A}_T^{\mu}} J(\alpha),
$$

where $\mathcal{A}_T^{\mu}$ is the set of controls α in $\mathcal{A}$ satisfying $(X_{t_1},\dots,X_{t_N}) \sim \mu$ with $X_t = \int_0^t \alpha_s \, ds + W_t$. Our goal is to prove the existence of an optimal control α* that can be explicitly derived, and then used to generate samples of the time series distribution µ via the probability measure P* on Ω, i.e., the simulation of the optimal diffusion process X controlled by the drift α*.

3 Solution to Schrödinger bridge for time series


Similarly to the classical Schrödinger bridge problem, we assume that the target distribution µ admits a density with respect to the Lebesgue measure on $(\mathbb{R}^d)^N$, and by misuse of notation, we denote by $\mu(x_1,\dots,x_N)$ this density function. Denote by $\mu_T^W$ the distribution of the Brownian motion on $\mathcal{T}$, i.e. of $(W_{t_1},\dots,W_{t_N})$, which admits a density given by (by abuse of language, we use the same notation for the measure and its density)

$$
\mu_T^W(x_1,\dots,x_N) = \prod_{i=0}^{N-1} \frac{1}{\sqrt{2\pi(t_{i+1}-t_i)}} \exp\Big(-\frac{|x_{i+1}-x_i|^2}{2(t_{i+1}-t_i)}\Big), \qquad (3.1)
$$

for $(x_1,\dots,x_N) \in (\mathbb{R}^d)^N$ (with the convention that $t_0 = 0$, $x_0 = 0$). The measure µ is absolutely continuous with respect to $\mu_T^W$, and we shall assume that its relative entropy is finite, i.e.,

$$
H(\mu|\mu_T^W) = \int \ln \frac{\mu}{\mu_T^W} \, d\mu < \infty.
$$

The solution to the (SBTS) problem is provided in the following theorem.

Theorem 3.1. The diffusion process $X_t = \int_0^t \alpha_s^* \, ds + W_t$, $0 \le t \le T$, with α* defined as

$$
\alpha_t^* = a^*(t, X_t; (X_{t_i})_{t_i \le t}), \qquad 0 \le t < T,
$$

with $a^*(t, x; \bar{x}_i)$, for $t \in [t_i, t_{i+1})$, $\bar{x}_i = (x_1,\dots,x_i) \in (\mathbb{R}^d)^i$, $x \in \mathbb{R}^d$, given by

$$
a^*(t, x; \bar{x}_i) = \nabla_x \ln E_W\Big[\frac{\mu}{\mu_T^W}(X_{t_1},\dots,X_{t_N}) \,\Big|\, \bar{X}_{t_i} = \bar{x}_i, \, X_t = x\Big],
$$

where we set $\bar{X}_{t_i} = (X_{t_1},\dots,X_{t_i})$, induces a probability measure $P^* = \frac{\mu}{\mu_T^W}(X_{t_1},\dots,X_{t_N}) \cdot W$, which solves the Schrödinger bridge time series problem. Moreover, we have

$$
V_{SBTS} = H(P^*|W) = H(\mu|\mu_T^W).
$$

Proof. First, observe that $E_W[\frac{\mu}{\mu_T^W}(X_{t_1},\dots,X_{t_N})] = 1$, and thus one can define a probability measure $P^* \ll W$ with density process

$$
Z_t = E_W\Big[\frac{dP^*}{dW} \,\Big|\, \mathcal{F}_t\Big] = E_W\Big[\frac{\mu}{\mu_T^W}(X_{t_1},\dots,X_{t_N}) \,\Big|\, \mathcal{F}_t\Big], \qquad 0 \le t \le T.
$$

Notice from the Markov and Gaussian properties of the Brownian motion that for $t \in [t_i, t_{i+1})$, $i = 0,\dots,N-1$, we have $Z_t = h_i(t, X_t; \bar{X}_{t_i})$, where for a path $\bar{x}_i = (x_1,\dots,x_i) \in (\mathbb{R}^d)^i$, $h_i(\cdot\,; \bar{x}_i)$ is defined on $[t_i, t_{i+1}) \times \mathbb{R}^d$ by

$$
h_i(t, x; \bar{x}_i) = E_{Y \sim \mathcal{N}(0, I_d)}\Big[\frac{\mu}{\mu_T^W}\big(\bar{x}_i, x + \sqrt{t_{i+1}-t}\,Y, \dots, x + \sqrt{t_N-t}\,Y\big)\Big]
$$

for $t \in [t_i, t_{i+1})$, $x \in \mathbb{R}^d$, where $E_{Y \sim \mathcal{N}(0, I_d)}[\cdot]$ is the expectation when Y is distributed according to the Gaussian law $\mathcal{N}(0, I_d)$. Moreover, by the tower property of conditional expectations, we have

$$
h_i(t, x; \bar{x}_i) = E_W\big[h_{i+1}(t_{i+1}, X_{t_{i+1}}; \bar{x}_i, X_{t_{i+1}}) \,\big|\, X_t = x\big],
$$

with the convention that $h_N(t_N, x; \bar{x}_{N-1}) = \frac{\mu}{\mu_T^W}(x_1,\dots,x_{N-1},x)$. Therefore, for $i = 0,\dots,N-1$, and $\bar{x}_i \in (\mathbb{R}^d)^i$, the function $(t,x) \mapsto h_i(t,x;\bar{x}_i)$ is a strictly positive $C^{1,2}([t_i,t_{i+1}) \times \mathbb{R}^d) \cap C^0([t_i,t_{i+1}] \times \mathbb{R}^d)$ classical solution to the heat equation

$$
\frac{\partial h_i(\cdot\,;\bar{x}_i)}{\partial t} + \frac{1}{2}\Delta_x h_i(\cdot\,;\bar{x}_i) = 0, \qquad \text{on } [t_i, t_{i+1}) \times \mathbb{R}^d,
$$

with the terminal condition $h_i(t_{i+1}, x; \bar{x}_i) = h_{i+1}(t_{i+1}, x; \bar{x}_i, x)$ (here $\Delta_x$ is the Laplacian operator). By applying Itô's formula to the martingale density process Z of P* under the Wiener measure W, we derive

$$
dZ_t = \nabla_x h_i(t, X_t; \bar{X}_{t_i}) \, dX_t = Z_t \nabla_x \ln h_i(t, X_t; \bar{X}_{t_i}) \, dX_t, \qquad t_i \le t < t_{i+1},
$$

for $i = 0,\dots,N-1$. Thus, by defining the process α* by $\alpha_t^* = \nabla_x \ln h_i(t, X_t; \bar{X}_{t_i})$, for $t \in [t_i, t_{i+1})$, $i = 0,\dots,N-1$, we have

$$
\frac{dP^*}{dW} = \exp\Big(\int_0^T \alpha_t^* \, dX_t - \frac{1}{2}\int_0^T |\alpha_t^*|^2 \, dt\Big),
$$

and by Girsanov's theorem, $X_t - \int_0^t \alpha_s^* \, ds$ is a Brownian motion under P*. On the other hand, by definition of P* and the Bayes formula, we have for any bounded measurable function ϕ on $(\mathbb{R}^d)^N$:

$$
\begin{aligned}
E_{P^*}\big[\varphi(X_{t_1},\dots,X_{t_N})\big] &= E_W\Big[\frac{\mu}{\mu_T^W}(X_{t_1},\dots,X_{t_N}) \, \varphi(X_{t_1},\dots,X_{t_N})\Big] \\
&= \int \frac{\mu}{\mu_T^W}(x_1,\dots,x_N) \, \varphi(x_1,\dots,x_N) \, \mu_T^W(x_1,\dots,x_N) \, dx_1 \dots dx_N \\
&= \int \varphi(x_1,\dots,x_N) \, \mu(x_1,\dots,x_N) \, dx_1 \dots dx_N,
\end{aligned}
$$

which shows that $(X_{t_1},\dots,X_{t_N}) \overset{P^*}{\sim} \mu$. Moreover, by noting that

$$
J(\alpha^*) = E_{P^*}\Big[\int_0^T \frac{1}{2}|\alpha_t^*|^2 \, dt\Big] = E_{P^*}\Big[\ln \frac{dP^*}{dW}\Big] = H(P^*|W) = H(\mu|\mu_T^W) < \infty,
$$

where we used in the last equality the fact that $(X_{t_1},\dots,X_{t_N}) \overset{P^*}{\sim} \mu$, this shows in particular that $\alpha^* \in \mathcal{A}_T^{\mu}$.
It remains to show that for any $\alpha \in \mathcal{A}_T^{\mu}$ associated to a probability measure $P \ll W$ with density given by (2.2), i.e. $J(\alpha) = H(P|W)$, we have

$$
J(\alpha) \ge H(\mu|\mu_T^W). \qquad (3.2)
$$

For this, we write, from the Bayes formula and since $W_t = X_t - \int_0^t \alpha_s \, ds$ is a Brownian motion under P by Girsanov's theorem:

$$
\begin{aligned}
1 &= E_W\Big[\frac{\mu}{\mu_T^W}(X_{t_1},\dots,X_{t_N})\Big] \\
&= E_P\Big[\exp\Big(\ln \frac{\mu}{\mu_T^W}(X_{t_1},\dots,X_{t_N}) - \int_0^T \alpha_t \, dW_t - \frac{1}{2}\int_0^T |\alpha_t|^2 \, dt\Big)\Big] \\
&\ge \exp\Big(E_P\Big[\ln \frac{\mu}{\mu_T^W}(X_{t_1},\dots,X_{t_N}) - \int_0^T \alpha_t \, dW_t - \frac{1}{2}\int_0^T |\alpha_t|^2 \, dt\Big]\Big) \\
&= \exp\big(H(\mu|\mu_T^W) - J(\alpha)\big),
\end{aligned}
$$

where we use Jensen's inequality, and the fact that $(X_{t_1},\dots,X_{t_N}) \overset{P}{\sim} \mu$ in the last equality. This proves the required inequality (3.2), and ends the proof.

Remark 3.2. The optimal drift of the Schrödinger bridge time series diffusion is in general path-dependent: at a given time t it depends not only on the current state $X_t$, but also on the past values $\bar{X}_{\eta(t)} = (X_{t_1},\dots,X_{\eta(t)})$, where $\eta(t) = \max\{t_i : t_i \le t\}$, and we have:

$$
dX_t = a^*(t, X_t; \bar{X}_{\eta(t)}) \, dt + dW_t, \qquad 0 \le t \le T, \quad X_0 = 0.
$$

Moreover, the proof of the above theorem shows that this drift function is explicitly given by

$$
a^*(t, x; \bar{x}_i) = \frac{\nabla_x h_i(t, x; \bar{x}_i)}{h_i(t, x; \bar{x}_i)}, \qquad t \in [t_i, t_{i+1}), \ \bar{x}_i \in (\mathbb{R}^d)^i, \ x \in \mathbb{R}^d, \qquad (3.3)
$$

for $i = 0,\dots,N-1$, where

$$
h_i(t, x; \bar{x}_i) = E_{Y \sim \mathcal{N}(0, I_d)}\big[\rho(\bar{x}_i, x + \sqrt{t_{i+1}-t}\,Y, \dots, x + \sqrt{t_N-t}\,Y)\big],
$$

with $\rho := \frac{\mu}{\mu_T^W}$ the density ratio.

The following result states an alternate representation of the drift function that will be useful in the next section for estimation.

Proposition 3.3. For $i = 0,\dots,N-1$, $t \in [t_i, t_{i+1})$, $\bar{x}_i = (x_1,\dots,x_i) \in (\mathbb{R}^d)^i$, $x \in \mathbb{R}^d$, we have

$$
a^*(t, x; \bar{x}_i) = \frac{1}{t_{i+1}-t} \, \frac{E_\mu\big[(X_{t_{i+1}} - x) \, F_i(t, \bar{x}_i, x, X_{t_{i+1}}) \,\big|\, \bar{X}_{t_i} = \bar{x}_i\big]}{E_\mu\big[F_i(t, \bar{x}_i, x, X_{t_{i+1}}) \,\big|\, \bar{X}_{t_i} = \bar{x}_i\big]}, \qquad (3.4)
$$

where

$$
F_i(t, \bar{x}_i, x, x_{i+1}) = \exp\Big(-\frac{|x_{i+1}-x|^2}{2(t_{i+1}-t)} + \frac{|x_{i+1}-x_i|^2}{2(t_{i+1}-t_i)}\Big),
$$

and $E_\mu[\cdot]$ denotes the expectation under µ.
Proof. Fix $i \in \{0,\dots,N-1\}$, and $t \in [t_i, t_{i+1})$. From the expression of $\mu_T^W$ in (3.1), we have

$$
\begin{aligned}
&E_W\Big[\frac{\mu}{\mu_T^W}(X_{t_1},\dots,X_{t_N}) \,\Big|\, (X_{t_1},\dots,X_{t_i}) = (x_1,\dots,x_i), \, X_t = x\Big] \qquad (3.5) \\
&= C \int \frac{\mu}{\mu_T^W}(x_1,\dots,x_N) \exp\Big(-\frac{|x_{i+1}-x|^2}{2(t_{i+1}-t)}\Big) \prod_{j=i+1}^{N-1} \exp\Big(-\frac{|x_{j+1}-x_j|^2}{2\Delta t_j}\Big) \, dx_{i+1} \dots dx_N \\
&= C \int \frac{\mu(x_1,\dots,x_N)}{\mu_i(x_1,\dots,x_i)} \, F_i(t, \bar{x}_i, x, x_{i+1}) \, dx_{i+1} \dots dx_N = C \, E_\mu\big[F_i(t, \bar{x}_i, x, X_{t_{i+1}}) \,\big|\, \bar{X}_{t_i} = \bar{x}_i\big],
\end{aligned}
$$

where C is a constant varying from line to line and depending only on t and $\bar{x}_i = (x_1,\dots,x_i)$, but not on x, and $\mu_i$ is the density of $(X_{t_1},\dots,X_{t_i})$ under µ, i.e.,

$$
\mu_i(x_1,\dots,x_i) = \int \mu(x_1,\dots,x_N) \, dx_{i+1} \dots dx_N.
$$

By plugging the new expression (3.5) into a* and differentiating with respect to x, we then get

$$
a^*(t, x; \bar{x}_i) = \frac{1}{t_{i+1}-t} \, \frac{E_\mu\big[(X_{t_{i+1}} - x) \, F_i(t, \bar{x}_i, x, X_{t_{i+1}}) \,\big|\, \bar{X}_{t_i} = \bar{x}_i\big]}{E_\mu\big[F_i(t, \bar{x}_i, x, X_{t_{i+1}}) \,\big|\, \bar{X}_{t_i} = \bar{x}_i\big]}.
$$

Remark 3.4. In the case where µ is the distribution arising from a Markov chain, i.e., of the form $\mu(dx_1,\dots,dx_N) = \prod_{i=1}^{N-1} \nu_i(x_i, dx_{i+1})$ for some transition kernels $\nu_i$ on $\mathbb{R}^d$, the conditional expectations in (3.4) will depend on the past values $\bar{X}_{t_i} = (X_{t_1},\dots,X_{t_i})$ only via the last value $X_{t_i}$.

4 Generative learning
From Theorem 3.1, we can run an Euler scheme for simulating the Schrödinger bridge diffusion, and thus generate samples of the target distribution µ. For that purpose, we need an accurate estimation of the drift terms, i.e., of the functions $a_i^*$, for $i = 0,\dots,N-1$. We propose several estimation methods. In the sequel, for a probability measure ν on $(\mathbb{R}^d)^N$, we denote by $E_\nu[\cdot]$ the expectation under the distribution ν.

4.1 Drift estimation


Estimation of the density ratio. This method follows the idea in [23]. Denote by $\rho = \frac{\mu}{\mu_T^W}$ the density ratio, and observe that the log-density ratio ln ρ minimizes over functions r on $(\mathbb{R}^d)^N$ the logistic regression loss

$$
L_{\mathrm{logistic}}(r) = E_\mu\big[\ln\big(1 + \exp(-r(X))\big)\big] + E_{\mu_T^W}\big[\ln\big(1 + \exp(r(X))\big)\big].
$$

Therefore, given data samples $X^{(m)} = (X_{t_1}^{(m)},\dots,X_{t_N}^{(m)})$, $m = 1,\dots,M$, from µ, and using samples $Y^{(m)}$ from $\mu_T^W$, we estimate the density ratio ρ by $\hat\rho = \exp(r_{\hat\theta})$, where $r_{\hat\theta}$ is the neural network that minimizes the empirical logistic loss function:

$$
\theta \mapsto \frac{1}{M} \sum_{m=1}^M \Big[\ln\big(1 + \exp(-r_\theta(X^{(m)}))\big) + \ln\big(1 + \exp(r_\theta(Y^{(m)}))\big)\Big].
$$

By writing from (3.3) the Schrödinger drift a* as

$$
a^*(t, x; \bar{x}_i) = \frac{E_{Y \sim \mathcal{N}(0, I_d)}\big[\nabla_x \rho(\bar{x}_i, x + \sqrt{t_{i+1}-t}\,Y, \dots, x + \sqrt{t_N-t}\,Y)\big]}{E_{Y \sim \mathcal{N}(0, I_d)}\big[\rho(\bar{x}_i, x + \sqrt{t_{i+1}-t}\,Y, \dots, x + \sqrt{t_N-t}\,Y)\big]}, \qquad (4.1)
$$

for $t \in [t_i, t_{i+1})$, we obtain an estimator of a* by plugging into (4.1) the estimates $\hat\rho$ and $r_{\hat\theta}$ of ρ and ln ρ, and then computing the expectations with Monte Carlo approximations from samples of $\mathcal{N}(0, I_d)$. Notice that this method is very costly, as it requires, in addition to the training of the neural network for estimating the density ratio, another Monte Carlo sampling for finally estimating the drift.
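For concreteness, here is a minimal PyTorch sketch, not the authors' code, of the logistic-regression step above; the network width, optimizer settings, and training schedule are illustrative assumptions.

```python
import torch
import torch.nn as nn

def fit_log_density_ratio(X_mu, Y_w, n_epochs=500, lr=1e-3):
    """X_mu: (M, N*d) samples from the data distribution mu.
    Y_w:  (M, N*d) samples from the Brownian law mu_T^W.
    Returns a network r_theta approximating ln(mu / mu_T^W)."""
    r = nn.Sequential(nn.Linear(X_mu.shape[1], 64), nn.ReLU(),
                      nn.Linear(64, 64), nn.ReLU(),
                      nn.Linear(64, 1))
    opt = torch.optim.Adam(r.parameters(), lr=lr)
    softplus = nn.Softplus()          # softplus(u) = ln(1 + e^u)
    for _ in range(n_epochs):
        opt.zero_grad()
        # empirical logistic loss: E_mu[ln(1+e^{-r})] + E_W[ln(1+e^{r})]
        loss = softplus(-r(X_mu)).mean() + softplus(r(Y_w)).mean()
        loss.backward()
        opt.step()
    return r
```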

Kernel estimation of the drift. In order to overcome the computational cost of the above estimation method, we propose an alternative approach that relies on the representation of the drift term in Proposition 3.3. Indeed, the key feature of formula (3.4) is that it involves (conditional) expectations under the target distribution µ, which are amenable to direct estimation using data samples. For the approximation of the conditional expectations, we can then use classical kernel methods, as sketched in the code below.
From data samples $X^{(m)} = (X_{t_1}^{(m)},\dots,X_{t_N}^{(m)})$, $m = 1,\dots,M$, from µ, the Nadaraya-Watson estimator of the drift function is given by

$$
\hat{a}(t, x; \bar{x}_i) = \frac{1}{t_{i+1}-t}\,
\frac{\sum_{m=1}^{M} (X_{t_{i+1}}^{(m)} - x)\, F_i(t, X_{t_i}^{(m)}, x, X_{t_{i+1}}^{(m)}) \prod_{j=1}^{i} K_h(x_j - X_{t_j}^{(m)})}
{\sum_{m=1}^{M} F_i(t, X_{t_i}^{(m)}, x, X_{t_{i+1}}^{(m)}) \prod_{j=1}^{i} K_h(x_j - X_{t_j}^{(m)})},
\qquad (4.2)
$$

for $t \in [t_i, t_{i+1})$, $\bar{x}_i \in (\mathbb{R}^d)^i$, $x \in \mathbb{R}^d$, $i = 0,\dots,N-1$, where $K_h$ is a kernel, i.e., a non-negative real-valued integrable and symmetric function on $\mathbb{R}^d$, with bandwidth $h > 0$. A common kernel function is the Gaussian density, but for lower time complexity we shall use here the quartic kernel $K_h(x) = \frac{1}{h} K(\frac{x}{h})$ with

$$
K(x) = (1 - |x|^2)^2 \, 1_{|x| \le 1}.
$$
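To make the estimator concrete, here is a minimal NumPy sketch of (4.2), not the authors' code, for a scalar time series (d = 1); the function names, the vectorization over samples, and the guard against an empty kernel neighborhood are illustrative assumptions.

```python
import numpy as np

def quartic(u, h):
    """Quartic kernel K_h(u) = (1/h) (1 - (u/h)^2)^2 for |u| <= h, else 0."""
    v = u / h
    return np.where(np.abs(v) <= 1.0, (1.0 - v**2) ** 2, 0.0) / h

def drift_hat(t, x, past, X, times, h=0.05):
    """Nadaraya-Watson estimate (4.2) of a^*(t, x; past), scalar case d = 1.
    X: (M, N) data paths observed at times (t_1,...,t_N); past: list of
    already generated values (x_1,...,x_i), so t lies in [t_i, t_{i+1})
    with i = len(past); convention t_0 = 0, x_0 = 0."""
    i = len(past)
    t_next, x_next = times[i], X[:, i]       # t_{i+1} and samples of X_{t_{i+1}}
    t_i = times[i - 1] if i > 0 else 0.0
    x_i = X[:, i - 1] if i > 0 else np.zeros(X.shape[0])
    # Gaussian reweighting factor F_i of Proposition 3.3
    F = np.exp(-(x_next - x) ** 2 / (2.0 * (t_next - t))
               + (x_next - x_i) ** 2 / (2.0 * (t_next - t_i)))
    # kernel weights matching the generated past (empty product = 1 if i = 0)
    w = np.ones(X.shape[0])
    for j, xj in enumerate(past):
        w *= quartic(xj - X[:, j], h)
    num = np.sum((x_next - x) * F * w)
    den = np.sum(F * w)
    return num / ((t_next - t) * den) if den > 0 else 0.0
```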

LSTM network approximation of the path-dependent drift. The conditional expectations in the numerator and denominator of the drift term can alternately be approximated by neural networks. In order to achieve this, we need a neural network architecture that fits well with the path-dependency of the drift term, i.e., the a priori non-Markov feature of the data time series distribution µ. We shall then consider a combination of feed-forward and LSTM (Long Short-Term Memory) neural networks.

Figure 1: Architecture of the neural network

For $i = 0,\dots,N-1$, the conditional expectation function in the numerator,

$$
(t, \bar{x}_i, x) \in [t_i, t_{i+1}) \times (\mathbb{R}^d)^i \times \mathbb{R}^d \longmapsto E_\mu\big[(X_{t_{i+1}} - X_t) \, F_i(t, X_{t_i}, X_t, X_{t_{i+1}}) \,\big|\, (\bar{X}_{t_i}, X_t) = (\bar{x}_i, x)\big],
$$

is approximated by $\Psi_{\theta_f}(t, p_{t_i}, x)$, where $\Psi_{\theta_f} \in \mathcal{NN}_{1+k+d,\,d}$ is a feed-forward neural network with input dimension $1 + k + d$ and output dimension d, and $p_{t_i}$ is an output vector of dimension k from an LSTM network, i.e., $p_{t_i} = \psi_{\theta_r}(\bar{x}_i)$ with $\psi_{\theta_r} \in \mathrm{LSTM}_{i,d,k}$ at time $t_i$. This neural network is trained by minimizing over the parameters $\theta = (\theta_f, \theta_r)$ the quadratic loss function

$$
L(\theta) = \sum_{i=0}^{N-1} \hat{E}\Big[\big|(X_{t_{i+1}} - X) \, F_i(\tau, X_{t_i}, X, X_{t_{i+1}}) - \Psi_{\theta_f}(\tau, \psi_{\theta_r}(\bar{X}_{t_i}), X)\big|^2\Big].
$$

Here $\hat{E}$ is the empirical loss expectation, where $(X_{t_1},\dots,X_{t_N})$ are sampled from the data distribution µ, τ is sampled according to a uniform law on $[t_i, t_{i+1})$, and X is sampled e.g. from a Gaussian law with mean $X_{t_i}$, for $i = 0,\dots,N-1$. The conditional expectation function in the denominator is similarly approximated.
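A minimal PyTorch sketch, not the authors' code, of this architecture: the hidden sizes and the choice of two output heads (one for the numerator, one for the denominator) are illustrative assumptions, and the case i = 0 (empty past) is left out.

```python
import torch
import torch.nn as nn

class DriftNet(nn.Module):
    def __init__(self, d=1, k=16, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(input_size=d, hidden_size=k, batch_first=True)
        # one head for the (vector) numerator, one for the (scalar) denominator
        self.num_head = nn.Sequential(nn.Linear(1 + k + d, hidden), nn.ReLU(),
                                      nn.Linear(hidden, d))
        self.den_head = nn.Sequential(nn.Linear(1 + k + d, hidden), nn.ReLU(),
                                      nn.Linear(hidden, 1))

    def forward(self, t, past, x):
        """t: (B, 1) time; past: (B, i, d) observed path so far (i >= 1);
        x: (B, d) current state. Returns numerator and denominator estimates."""
        _, (h, _) = self.lstm(past)           # h[-1]: (B, k), the summary p_{t_i}
        z = torch.cat([t, h[-1], x], dim=1)   # feed-forward input (t, p_{t_i}, x)
        return self.num_head(z), self.den_head(z)

# drift estimate: a_hat = num / ((t_{i+1} - t) * den), cf. Proposition 3.3
```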
The output of this neural network training yields an approximation

$$
(t, \bar{x}, x) \in [0, T) \times (\mathbb{R}^d)^{\eta(t)} \times \mathbb{R}^d \longmapsto \hat{a}(t, x; \bar{x})
$$

of the drift function, which is then used for generating samples $(X_{t_1},\dots,X_{t_N})$ of µ from the simulation of the diffusion

$$
dX_t = \hat{a}(t, X_t; \bar{X}_{\eta(t)}) \, dt + dW_t, \qquad X_0 = 0. \qquad (4.3)
$$

4.2 Schrödinger bridge time series algorithm

From the estimator â of the path-dependent drift function, we can now simulate the SB SDE (4.3) by an Euler scheme. Let $N^\pi$ be the number of uniform time steps between two consecutive observation dates $t_i$ and $t_{i+1}$, for $i = 0,\dots,N-1$, and $t_{k,i}^\pi = t_i + \frac{k}{N^\pi}$, $k = 0,\dots,N^\pi - 1$, the associated time grid. The pseudo-code of the Schrödinger bridge time series (SBTS) algorithm is described in Algorithm 1, and a runnable sketch follows below.

Algorithm 1: SBTS Simulation

Input: data samples of time series $(X_{t_1}^{(m)},\dots,X_{t_N}^{(m)})$, $m = 1,\dots,M$, and $N^\pi$.
Initialization: initial state $x_0 = 0$;
for $i = 0,\dots,N-1$ do
    Initialize state $y_0 = x_i$;
    for $k = 0,\dots,N^\pi - 1$ do
        Compute $\hat{a}(t_{k,i}^\pi, y_k; \bar{x}_i)$, e.g. by the kernel estimator (4.2);
        Sample $\varepsilon_k \sim \mathcal{N}(0, 1)$ and compute
        $$
        y_{k+1} = y_k + \frac{1}{N^\pi}\, \hat{a}(t_{k,i}^\pi, y_k; \bar{x}_i) + \frac{1}{\sqrt{N^\pi}}\, \varepsilon_k;
        $$
    end
    Set $x_{i+1} = y_{N^\pi}$.
end
Return: $x_1,\dots,x_N$
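Here is a minimal sketch of Algorithm 1, not the authors' code, reusing the drift_hat estimator from the sketch above; the unit spacing $t_{i+1} - t_i = 1$ implicit in the Euler step $1/N^\pi$ of the pseudo-code is kept as an assumption.

```python
import numpy as np

def sbts_sample(X, times, N_pi=100, h=0.05, rng=None):
    """Generate one synthetic path (x_1,...,x_N) from data paths X of shape (M, N)."""
    rng = rng or np.random.default_rng()
    past = []                                  # generated values (x_1,...,x_i)
    y = 0.0                                    # initial state X_0 = 0
    for i in range(X.shape[1]):
        t_i = times[i - 1] if i > 0 else 0.0
        for k in range(N_pi):                  # Euler steps on [t_i, t_{i+1})
            t = t_i + k / N_pi                 # t^pi_{k,i}
            a = drift_hat(t, y, past, X, times, h)
            y += a / N_pi + rng.standard_normal() / np.sqrt(N_pi)
        past.append(y)                         # x_{i+1} = y_{N^pi}
    return np.array(past)
```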

5 Numerical experiments
In this section, we demonstrate the effectiveness of our SBTS algorithm on several examples of time series models, as well as on real data sets for an application to deep hedging. The algorithms are performed on a computer with the following characteristics: Intel(R) i7-7500U CPU @ 2.7GHz, 2 Core(s).

5.1 Evaluation metrics

In addition to visual plots of data vs. generated sample paths, we use several metrics to evaluate the accuracy of our generators (a computation sketch follows the list):

• Marginal metrics, for quantifying how well the marginal distributions of the generated samples match those of the data. These include:
  – Classical statistics like the mean, and the 95% and 5% percentiles.
  – The Kolmogorov-Smirnov test: we compute the p-value, and when p > α (usually 5%), we do not reject the null hypothesis (the generated samples come from the reference data distribution).

• Temporal dynamics metrics, for quantifying the ability of the generator to capture the time structure of the time series data. We compute the empirical distribution of the quadratic variation $\sum_i |X_{t_{i+1}} - X_{t_i}|^2$.

• Correlation structure, for evaluating the ability of the generator to capture the multi-dimensional structure of the time series. We compare the empirical covariance or correlation matrix induced by the SBTS generator with the one from the data samples.
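As a complement, here is a minimal sketch, not the authors' code, of how these metrics can be computed with NumPy/SciPy for data paths X and generated paths Y, both arrays of shape (M, N).

```python
import numpy as np
from scipy.stats import ks_2samp

def evaluate(X, Y):
    # marginal Kolmogorov-Smirnov p-value at each observation date
    p_values = [ks_2samp(X[:, i], Y[:, i]).pvalue for i in range(X.shape[1])]
    # empirical quadratic variation of each path
    qv_data = np.sum(np.diff(X, axis=1) ** 2, axis=1)
    qv_gen = np.sum(np.diff(Y, axis=1) ** 2, axis=1)
    # difference of empirical correlation matrices (generated minus data)
    corr_err = np.corrcoef(Y, rowvar=False) - np.corrcoef(X, rowvar=False)
    return p_values, (qv_data, qv_gen), corr_err
```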

5.2 Toy autoregressive model of time series

We consider the following toy autoregressive (AR) model (a simulation sketch is given below):

$$
\begin{cases}
X_{t_1} = b + \varepsilon_1, \\
X_{t_2} = \beta_1 X_{t_1} + \varepsilon_2, \\
X_{t_3} = \beta_2 X_{t_2} + \sqrt{|X_{t_1}|} + \varepsilon_3,
\end{cases}
$$

where the noises $\varepsilon_i \sim \mathcal{N}(0, \sigma_i^2)$, $i = 1,\dots,3$, are mutually independent. The model parameters are $b = 0.7$, $\sigma_1 = 0.1$, $\sigma_2 = \sigma_3 = 0.05$ and $\beta_1 = \beta_2 = -1$.
We use samples of size M = 1000 for simulated data of the AR model. The drift of the SBTS diffusion is estimated with a kernel of bandwidth h = 0.05, and simulated from the Euler scheme with $N^\pi = 100$. The runtime for generating 500 paths of SBTS is 8 seconds.
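For reference, here is a minimal sketch, not from the paper, that simulates the toy AR model above to produce the M training paths:

```python
import numpy as np

def simulate_ar(M=1000, b=0.7, beta1=-1.0, beta2=-1.0, rng=None):
    rng = rng or np.random.default_rng()
    e1 = 0.1 * rng.standard_normal(M)     # sigma_1 = 0.1
    e2 = 0.05 * rng.standard_normal(M)    # sigma_2 = 0.05
    e3 = 0.05 * rng.standard_normal(M)    # sigma_3 = 0.05
    X1 = b + e1
    X2 = beta1 * X1 + e2
    X3 = beta2 * X2 + np.sqrt(np.abs(X1)) + e3
    return np.stack([X1, X2, X3], axis=1)  # shape (M, 3)
```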
In Figure 2, we plot the empirical distribution of each pair $(X_{t_i}, X_{t_j})$ from the AR model and from the generated SBTS. We also show the marginal empirical distributions. Table 1 presents the marginal metrics for the AR model and the generator (p-value and percentiles at levels 5% and 95%). In Table 2, we give the difference between the empirical correlations from generated samples and AR data samples.

Figure 2: Comparison between the true and generated distributions for each pair $(X_{t_i}, X_{t_j})$ with $i, j \in \{1, 2, 3\}$, $i \ne j$

           p-value    q5        q̃5        q95       q̃95
X_{t_1}    0.98       0.535     0.528     0.855     0.861
X_{t_2}    0.74      -0.873    -0.861    -0.516    -0.514
X_{t_3}    0.90       1.243     1.251     1.808     1.793

Table 1: Marginal metrics for the AR model and the generator (q̃ for the generator's percentile)

           X_{t_1}    X_{t_2}    X_{t_3}
X_{t_1}     0          0.014     -0.01
X_{t_2}     0.014      0          0.013
X_{t_3}    -0.01       0.013      0

Table 2: Difference between the empirical correlation from generated samples and reference samples

5.3 GARCH model

We consider a GARCH model:

$$
\begin{cases}
X_{t_{i+1}} = \sigma_{t_{i+1}} \varepsilon_{t_{i+1}}, \\
\sigma_{t_{i+1}}^2 = \alpha_0 + \alpha_1 X_{t_i}^2 + \alpha_2 X_{t_{i-1}}^2, \qquad i = 1,\dots,N,
\end{cases}
$$

with $\alpha_0 = 5$, $\alpha_1 = 0.4$, $\alpha_2 = 0.1$, and the noises $\varepsilon_{t_i} \sim \mathcal{N}(0, 0.1)$, $i = 1,\dots,N$, i.i.d. The size of the time series is N = 60.
The hyperparameters for the training and generation of SBTS are M = 1000, $N^\pi = 100$, and a bandwidth for the kernel estimation of h = 0.2 (larger than for the AR model since by nature the GARCH process is more “volatile”). The runtime for generating 1000 paths is 120 seconds.
In Figure 3, we plot four sample paths of the GARCH process, to be compared with four sample paths of the SBTS. Figure 4 represents a sample plot of the joint distribution between the initial value $X_{t_1}$ and the terminal value $X_{t_N}$ of the time series. Figure 5 provides some metrics to evaluate the performance of SBTS. On the left, we represent the p-value for each of the marginals of the generated SBTS. On the right, we compute for each marginal index $i = 1,\dots,N = 60$ the difference between $\rho_i$ and $\hat\rho_i$, where $\rho_i$ (resp. $\hat\rho_i$) is the sum over j of the empirical correlations between $X_{t_i}$ and $X_{t_j}$ from the GARCH model (resp. the generated SBTS), and plot its mean and standard deviation.

Figure 3: Sample paths of the reference GARCH (left) and the SBTS generator (right)

Figure 4: Sample plot of the joint distribution $(X_{t_1}, X_{t_N})$

Figure 5: Left: p-value for the marginals $X_{t_i}$. Right: difference between the term-by-term empirical correlation from generated samples and reference samples.

5.4 Fractional Brownian motion

We consider a fractional Brownian motion (FBM) with Hurst index H, which measures the roughness of this Gaussian process. We plot in Figure 6 four sample paths of FBM with H = 0.2, and sample paths generated by SBTS. The generator is trained with M = 1000 sample paths, and the hyperparameters used for the simulation are $N^\pi = 100$, with bandwidth h = 0.05 for the kernel estimation of the Schrödinger drift. The runtime for 1000 paths is 120 seconds.

Figure 6: Sample paths of the reference FBM (left) and the SBTS generator (right)

Figure 7 represents the covariance matrix of $(X_{t_1},\dots,X_{t_N})$ for N = 60, for the FBM and for the generated SBTS, while we plot in Figure 8 the empirical distribution of the quadratic variation $\sum_{i=0}^{N-1} |X_{t_{i+1}} - X_{t_i}|^2$ for the FBM and the SBTS.

Figure 7: Covariance matrix for the reference FBM and SBTS

Figure 8: Quadratic variation distribution for N = 30 (left), N = 60 (right) and $T = t_N = 1$

Finally, we provide an estimate of the Hurst index from our generated SBTS with the standard estimator (see e.g. [10]) given by:

$$
\hat{H} = \frac{1}{2}\Bigg[1 - \frac{\log\Big(\sum_{i=0}^{N-1} |X_{t_{i+1}} - X_{t_i}|^2\Big)}{\log N}\Bigg].
$$

For N = 60, we get: $\hat{H} = 0.2016$, Std = 0.004. A computation sketch is given below.
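A minimal sketch, not the authors' code, of this estimator applied to generated paths (values at $t_1,\dots,t_N$, with the convention $X_0 = 0$):

```python
import numpy as np

def hurst_estimate(paths):
    """paths: (M, N) array of values at t_1,...,t_N, with X_0 = 0 prepended."""
    inc = np.diff(np.concatenate([np.zeros((len(paths), 1)), paths], axis=1), axis=1)
    qv = np.sum(inc ** 2, axis=1)    # sum_{i=0}^{N-1} |X_{t_{i+1}} - X_{t_i}|^2
    H = 0.5 * (1.0 - np.log(qv) / np.log(paths.shape[1]))
    return H.mean(), H.std()
```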

5.5 Application to deep hedging on real data sets

In this paragraph, we use generated time series for applications to risk management, notably the pricing of derivatives and the computation of the associated hedging strategies via the deep hedging approach. We use samples of historical data for generating, by SBTS, new synthetic time series samples. We then compute deep hedging strategies that are trained on these synthetic samples, and we compare with the PnL and the replication error based on the historical dataset. The general backtest procedure is illustrated in Figure 9.

Figure 9: Backtest procedure for deep hedging

We consider the stock price S of the company Apple (ticker AAPL) with data from January 1, 2010 to January 30, 2020, and produce M = 2500 samples of N = 60 successive days with a sliding window. The hyperparameters for the generation of SBTS synthetic samples are $N^\pi = 100$ and bandwidth h = 0.05.
We plot in Figure 10 four sample paths of the SBTS diffusion, to be compared with the real ones from Apple. We illustrate the excess kurtosis of the real data by plotting in Figure 11 the tail distribution of the return $R_{t_i} = \frac{S_{t_{i+1}}}{S_{t_i}} - 1$, i.e. x in log-scale $\mapsto P[|R| \ge x]$; the excess kurtosis of the real data is 1.96, to be compared with the one from SBTS, equal to 2.34. Figure 12 represents the empirical distribution of the Apple time series data vs. SBTS, while Figure 13 shows their covariance matrices.

Figure 10: Four paths generated by the Schrödinger bridge (right) vs. real ones (left)

Figure 11: Plot of the tail distribution of the return: x in log-scale $\mapsto P[|R| \ge x]$. Excess kurtosis for real data = 1.96, for generated SBTS = 2.34

Figure 12: Comparison of quadratic variation distributions

Figure 13: Covariance matrix for real data and generated SBTS

The synthetic time series generated by SBTS is now used for the deep hedging of the ATM call option $g(S_T) = (S_T - S_0)_+$, i.e., by minimizing over the initial capital p (premium) and the parameters of the neural network ∆ the (empirical) loss function, called the replication error:

$$
E\big[|\mathrm{PnL}_{p,\Delta}|^2\big], \qquad \text{with} \quad \mathrm{PnL}_{p,\Delta} = p + \sum_{i=0}^{N-1} \Delta(t_i, S_{t_i})(S_{t_{i+1}} - S_{t_i}) - g(S_T).
$$

We then compare with the deep hedging on historical data by looking at the PnL and replication errors. The historical data set of Apple is split in chronological order, namely a training set from 01/01/2007 to 31/12/2017, a validation set from 01/01/2018 to 31/12/2018, and a test set from 01/01/2019 to 30/01/2020. As pointed out in [24], it is important not to break the time structure, as doing so may lead to an overestimation of the model performance. A sketch of the PnL computation is given below.
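A minimal PyTorch sketch, not the authors' code, of the deep-hedging objective: the hedging network delta_net, the inclusion of $S_0$ in the paths, and the handling of the premium as a fixed scalar are illustrative assumptions.

```python
import torch

def replication_error(premium, delta_net, S, times):
    """S: (B, N+1) price paths including S_0; times: hedging dates (t_0,...,t_{N-1});
    delta_net: network mapping (t, S_t) pairs of shape (B, 2) to positions (B, 1)."""
    payoff = torch.relu(S[:, -1] - S[:, 0])       # ATM call g(S_T) = (S_T - S_0)_+
    pnl = premium * torch.ones(S.shape[0])
    for i in range(S.shape[1] - 1):
        t_i = times[i] * torch.ones(S.shape[0], 1)
        delta = delta_net(torch.cat([t_i, S[:, i:i+1]], dim=1)).squeeze(1)
        pnl = pnl + delta * (S[:, i+1] - S[:, i])  # accumulated hedging gains
    pnl = pnl - payoff
    return (pnl ** 2).mean(), pnl                  # replication error and PnL
```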
In Figure 14, we plot the empirical distribution of the PnL with deep hedging obtained from real data vs. SBTS, backtested on the validation and test sets. It appears that the PnL from SBTS has a smaller variance (hence a smaller replication error), and yields fewer extreme values away from zero than the PnL from real data. This is also quantified in Table 3, where we note that the premium obtained from SBTS is higher than the one from real data, which means that one is more conservative with SBTS by charging a higher premium.
Figure 14: Deep hedging PnL distribution with backtest from the validation set (left) and test set (right)

                 Training Set         Validation Set        Test Set
      Premium    Mean       Std       Mean       Std        Mean       Std
Data   0.0415    0.0008     0.0098   -0.0154     0.0371     0.003      0.012
SBTS   0.0471    0.0004     0.0109   -0.0075     0.0164    -0.0024     0.0076

Table 3: Mean of the PnL and its Std (replication error)

6 Further tests in high dimension

In this section, we illustrate how our SB approach can be used for generating samples in very high dimension.
We use a data set of images from MNIST with training size M = 10000. We first start with handwritten digits, and plot in Figure 15 the static images from the real MNIST data set and the ones generated by SBTS. The number of pixels is 28 × 28, and the runtime for generating 16 images is 120 seconds.

Figure 15: Left: MNIST samples. Right: Generated SBTS samples.

Next, in Figures 16 and 17, we plot sequential images sampled from the real data set, and compare with the sequences generated by the SBTS algorithm. The number of pixels is 14 × 14, and the runtime for generating 100 paths of sequential images is 9 minutes.

Figure 16: Top: a time series sampled from the real distribution. Bottom: a time series generated via SBTS

Figure 17: Top: a time series sampled from the real distribution. Bottom: a time series generated via SBTS

References
[1] M. Arjovsky, S. Chintala, and L. Bottou. Wasserstein generative adversarial networks. In ICML, 2017.

[2] H. Buehler, L. Gonon, J. Teichmann, and B. Wood. Deep hedging. Quantitative Finance, 19(8):1271–1291, 2019.

[3] H. Buehler, B. Horvath, T. Lyons, I. Perez-Arribas, and B. Wood. A data-driven market simulator for small data environments. SSRN 3632431, 2020.

[4] T. Chen, G.-H. Liu, M. Tao, and E. Theodorou. Deep momentum multi-marginal Schrödinger bridge. arXiv:2303.0175, 2023.

[5] Y. Chen, Y. Wang, D. Kirschen, and B. Zhang. Model-free renewable scenario generation using generative adversarial networks. IEEE Transactions on Power Systems, 33(3):3265–3275, 2018.

[6] V. Choudhary, S. Jaimungal, and M. Bergeron. FuNVol: a multi-asset implied volatility market simulator using functional principal components and neural SDEs. arXiv:2303.00859, 2023.

[7] P. Dai Pra. A stochastic control approach to reciprocal diffusion processes. Applied Mathematics and Optimization, 23(1):313–329, 1991.

[8] V. De Bortoli, J. Thornton, J. Heng, and A. Doucet. Diffusion Schrödinger bridge with applications to score-based generative modeling. arXiv:2106.01357, 2021.

[9] A. Fermanian. Embedding and learning with signatures. arXiv:1911.13211, 2019.

[10] J. Gairing, P. Imkeller, R. Shevchenko, and C. Tudor. Hurst index estimation in stochastic differential equations driven by fractional Brownian motion. Journal of Theoretical Probability, 33:1691–1714, 2020.

[11] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio. Generative adversarial nets. Advances in Neural Information Processing Systems, pages 2672–2680, 2014.

[12] F. Guth, S. Coste, V. De Bortoli, and S. Mallat. Wavelet score-based generative modeling. arXiv:2208.05003, 2022.

[13] P. Henry-Labordere. From (martingale) Schrödinger bridges to a new class of stochastic volatility models. SSRN 3353270, 2019.

[14] D. Kingma and M. Welling. Auto-encoding variational Bayes. In ICLR, 2014.

[15] Y. LeCun, S. Chopra, R. Hadsell, M. Ranzato, and F. Huang. A tutorial on energy-based learning. Predicting Structured Data, 1, 2006.

[16] C. Léonard. A survey of the Schrödinger problem and some of its connections with optimal transport. Discrete and Continuous Dynamical Systems, 34(4):1533–1574, 2014.

[17] X. Lyu, S. Hueser, S. L. Hyland, G. Zerveas, and G. Raetsch. Improving clinical predictions through unsupervised time series representation learning. arXiv:1812.00490, 2018.

[18] H. Ni, L. Szpruch, M. Wiese, S. Liao, and B. Xiao. Conditional Sig-Wasserstein GANs for time series generation. arXiv:2006.05421, 2020.

[19] V. Ram Somnath, M. Pariset, Y.-P. Hsieh, M. R. Martinez, A. Krause, and C. Bunne. Aligned diffusion Schrödinger bridges. arXiv:2302.11419, 2023.

[20] C. Remlinger, J. Mikael, and R. Elie. Conditional versus adversarial Euler-based generators for time series. arXiv:2102.05313, 2021.

[21] Y. Song and S. Ermon. Generative modeling by estimating gradients of the data distribution. In NeurIPS, pages 11918–11930, 2019.

[22] Y. Song, J. Sohl-Dickstein, D. Kingma, A. Kumar, S. Ermon, and B. Poole. Score-based generative modeling through stochastic differential equations. In International Conference on Learning Representations, 2021.

[23] G. Wang, Y. Jiao, Q. Xu, Y. Wang, and C. Yang. Deep generative learning via Schrödinger bridge. arXiv:2106.10410, 2021.

[24] W. Wang and J. Ruf. A note on spurious model selection. Quantitative Finance, 22(10):1797–2000, 2022.

[25] M. Wiese, R. Knobloch, R. Korn, and P. Kretschmer. Quant GANs: deep generation of financial time series. Quantitative Finance, 20(9):1419–1440, 2020.

[26] T. Xu, W. Li, M. Munn, and B. Acciaio. COT-GAN: generating sequential data via causal optimal transport. In NeurIPS, 2020.

[27] J. Yoon, D. Jarrett, and M. van der Schaar. Time-series generative adversarial networks. In NeurIPS, 2019.
