Neural Ordinary Differential Equations: Ricky T. Q. Chen, Yulia Rubanova, Jesse Bettencourt, David Duvenaud

Neural ordinary differential equations (ODEs) provide a framework for modeling temporal data and continuous normalizing flows using ODEs whose dynamics are parameterized by neural networks. By interpreting deep models such as ResNets as discretizations of ODEs, the framework can adapt step sizes and leverage black-box ODE solvers during training, yielding more accurate and memory-efficient computation than fixed discretizations. It enables continuous-time modeling with constant memory cost and adapts computation to the complexity of each instance.


Neural Ordinary Differential Equations

Ricky T. Q. Chen*, Yulia Rubanova*, Jesse Bettencourt*, David Duvenaud


University of Toronto
Background: Ordinary Differential Equations (ODEs)
- Model the instantaneous change of a state:

      dz(t)/dt = f(z(t), t)        (explicit form)

- Solving an initial value problem (IVP) corresponds to integration:

      z(T) = z(0) + ∫_0^T f(z(t), t) dt        (solution is a trajectory)

- The Euler method approximates the solution with small steps h:

      z(t + h) = z(t) + h f(z(t), t)
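For concreteness, here is a minimal forward-Euler integrator in Python; euler_solve and the example dynamics are illustrative, not code from the paper.

```python
import numpy as np

def euler_solve(f, z0, t0, t1, n_steps=100):
    """Approximate z(t1) for dz/dt = f(z, t), z(t0) = z0, with forward Euler."""
    h = (t1 - t0) / n_steps              # fixed step size
    z, t = np.asarray(z0, dtype=float), t0
    for _ in range(n_steps):
        z = z + h * f(z, t)              # Euler update: z(t + h) ≈ z(t) + h f(z(t), t)
        t += h
    return z

# Example: dz/dt = -z has the exact solution z(T) = z(0) * exp(-T).
print(euler_solve(lambda z, t: -z, z0=[1.0], t0=0.0, t1=1.0))   # ≈ 0.366 (exact: 0.368)
```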


Residual Networks interpreted as an ODE Solver
- Hidden units look like:

      h_{t+1} = h_t + f(h_t, θ_t)

- Final output is the composition of these residual updates: h_T = h_0 + Σ_t f(h_t, θ_t).

- This can be interpreted as an Euler discretization of an ODE.

- In the limit of smaller steps:

      dh(t)/dt = f(h(t), t, θ)

Haber & Ruthotto (2017). Weinan E (2017).
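A small PyTorch sketch of this correspondence (the ResidualBlock architecture is an illustrative assumption): the residual update h + f(h) is exactly one forward-Euler step with step size 1.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """h_{t+1} = h_t + f(h_t, θ_t): one forward-Euler step of size 1."""
    def __init__(self, dim):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(dim, dim), nn.Tanh(), nn.Linear(dim, dim))

    def forward(self, h):
        return h + self.f(h)             # identity skip connection = Euler update

# Stacking T blocks composes T Euler steps along a discretized "depth/time" axis.
h = torch.randn(8, 32)
for block in [ResidualBlock(32) for _ in range(4)]:
    h = block(h)
```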


Deep Learning as Discretized Differential Equations
Many deep learning networks can be interpreted as ODE solvers.

Network                            Fixed-step Numerical Scheme
ResNet, RevNet, ResNeXt, etc.      Forward Euler (Lu et al. 2017; Chang et al. 2018)
PolyNet                            Approximation to Backward Euler (Zhu et al. 2018)
FractalNet                         Runge-Kutta
DenseNet                           Runge-Kutta

But:
(1) What are the underlying dynamics?
(2) Adaptive-step-size solvers provide better error handling.
“Neural” Ordinary Differential Equations

Instead of y = F(x), solve y = z(T)
given the initial condition z(0) = x.

Parameterize the dynamics with a neural network:

      dz(t)/dt = f(z(t), t, θ)

Solve the dynamics using any black-box ODE solver.
- Adaptive step size.
- Error estimate.
- O(1) memory learning.
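A minimal sketch using the authors' torchdiffeq package; the ODEFunc architecture and dimensions are illustrative assumptions, not the models from the paper.

```python
import torch
import torch.nn as nn
from torchdiffeq import odeint_adjoint as odeint   # adjoint method: O(1)-memory backward pass

class ODEFunc(nn.Module):
    """Parameterized dynamics dz(t)/dt = f(z(t), t, θ)."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 64), nn.Tanh(), nn.Linear(64, dim))

    def forward(self, t, z):
        return self.net(z)

func = ODEFunc(dim=2)
z0 = torch.randn(16, 2)                 # initial condition z(0) = x
t = torch.tensor([0.0, 1.0])            # integrate from t = 0 to T = 1
zT = odeint(func, z0, t)[-1]            # y = z(T), computed by an adaptive black-box solver
```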
Backprop without knowledge of the ODE Solver
Ultimately we want to optimize some loss L(z(T)).

Naive approach: know the solver, and backprop through the solver.
- Memory-intensive.
- The family of “implicit” solvers performs an inner optimization.

Our approach: adjoint sensitivity analysis (reverse-mode autodiff).
- Pontryagin (1962).
+ Automatic differentiation.
+ O(1) memory in the backward pass.
Continuous-time Backpropagation
Define the adjoint state a(t) = ∂L/∂z(t).

Residual network (discrete):
- Forward:  z_{t+1} = z_t + f(z_t, θ)
- Backward: a_t = a_{t+1} + a_{t+1} ∂f(z_t, θ)/∂z
- Params:   dL/dθ = Σ_t a_{t+1} ∂f(z_t, θ)/∂θ

Adjoint method (continuous):
- Forward:  z(t_1) = z(t_0) + ∫_{t_0}^{t_1} f(z(t), t, θ) dt
- Backward: the adjoint state follows the adjoint DiffEq
      da(t)/dt = -a(t)^T ∂f(z(t), t, θ)/∂z,
  solved backwards in time from a(t_1) = ∂L/∂z(t_1).
- Params:   dL/dθ = -∫_{t_1}^{t_0} a(t)^T ∂f(z(t), t, θ)/∂θ dt
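A conceptual sketch of the right-hand side of the augmented ODE that the adjoint method integrates backwards in time, assuming a PyTorch dynamics module func and its parameter tensors params (names are hypothetical; a production implementation such as torchdiffeq's also manages the solver, batching, and intermediate states).

```python
import torch

def adjoint_dynamics(func, t, z, a, params):
    """RHS of the augmented state (z(t), a(t) = dL/dz(t), running dL/dθ).

    Vector-Jacobian products give a^T ∂f/∂z and a^T ∂f/∂θ without forming Jacobians.
    """
    with torch.enable_grad():
        z = z.detach().requires_grad_(True)
        f = func(t, z)
        vjp_z, *vjp_params = torch.autograd.grad(
            f, (z, *params), grad_outputs=a, allow_unused=True
        )
    # dz/dt = f,  da/dt = -a^T ∂f/∂z,  d(dL/dθ)/dt = -a^T ∂f/∂θ
    return f, -vjp_z, [(-g if g is not None else None) for g in vjp_params]

# Example use (reusing func and z0 from the earlier sketch):
#   a = torch.ones_like(z0)                        # stand-in for dL/dz(T)
#   dz, da, dp = adjoint_dynamics(func, torch.tensor(1.0), z0, a, tuple(func.parameters()))
```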
A Differentiable Primitive for AutoDiff

Forward:  y = ODESolve(z(t_0), f, t_0, t_1, θ)

Backward: one more call to the ODE solver, run backwards in time on the augmented
(adjoint) system, yields ∂L/∂z(t_0) and ∂L/∂θ.

Don't need to store layer activations for the reverse pass - just follow the
dynamics in reverse!

Reversible networks (Gomez et al. 2018) also only require O(1) memory, but
require very specific neural network architectures with partitioned dimensions.
Reverse versus Forward Cost

- Empirically, the reverse pass is roughly half as expensive as the forward pass.

- Adapts to instance difficulty.

- The number of function evaluations can be viewed as the number of layers in a neural net.

NFE = Number of Function Evaluations.


Dynamics Become Increasingly Complex

- Dynamics become more demanding to compute during training.

- Adapts computation time according to the complexity of the diffeq.

In contrast, Chang et al. (ICLR 2018) explicitly add layers during training.
Continuous-time RNNs for Time Series Modeling
- We often want arbitrary measurement times, i.e., irregular time intervals.
- Can do VAE-style inference with a latent ODE.
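A rough sketch of the latent-ODE decoding step at irregular observation times, assuming torchdiffeq and illustrative module sizes; the full model in the paper adds an RNN encoder and a VAE objective.

```python
import torch
import torch.nn as nn
from torchdiffeq import odeint

class LatentODEFunc(nn.Module):
    """Dynamics of the latent state z(t)."""
    def __init__(self, latent_dim=4):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(latent_dim, 32), nn.Tanh(), nn.Linear(32, latent_dim))

    def forward(self, t, z):
        return self.net(z)

func = LatentODEFunc()
z0 = torch.randn(16, 4)                              # latent initial state (from the encoder)
t_obs = torch.tensor([0.0, 0.3, 0.35, 1.2, 2.0])     # arbitrary, irregular measurement times
z_t = odeint(func, z0, t_obs)                        # latent trajectory at exactly those times
x_pred = nn.Linear(4, 1)(z_t)                        # decode each latent state into an observation
```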
ODEs vs Recurrent Neural Networks (RNNs)

- RNNs learn very stiff dynamics and have exploding gradients.

- Whereas ODEs are guaranteed to be smooth.
Continuous Normalizing Flows
Instantaneous change of variables (iCOV):

- For a Lipschitz continuous function f:

      d log p(z(t))/dt = -tr(∂f/∂z(t))

- In other words,

      log p(z(t_1)) = log p(z(t_0)) - ∫_{t_0}^{t_1} tr(∂f/∂z(t)) dt

- Compare with the discrete change of variables for an invertible F:

      log p(y) = log p(x) - log |det ∂F/∂x|
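A sketch of the combined dynamics for z(t) and log p(z(t)), computing the trace of the Jacobian exactly with one autograd call per dimension (only feasible for small z; the function name and signature are hypothetical).

```python
import torch

def cnf_dynamics(func, t, z):
    """Returns dz/dt = f(z, t) and d log p(z(t))/dt = -tr(∂f/∂z), per the
    instantaneous change of variables."""
    with torch.enable_grad():
        z = z.detach().requires_grad_(True)
        f = func(t, z)
        trace = torch.zeros(z.shape[0])
        for i in range(z.shape[1]):                  # exact trace: one backward pass per dimension
            trace = trace + torch.autograd.grad(
                f[:, i].sum(), z, create_graph=True
            )[0][:, i]
    return f, -trace
```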
Continuous Normalizing Flows
[Figure: densities learned on 1D and 2D data - Data vs. Discrete-NF vs. CNF.]
Is the ODE being correctly solved?
Stochastic Unbiased Log Density

Can further reduce time complexity using stochastic estimators.

Grathwohl et al. (2019)
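A sketch of a Hutchinson-style stochastic trace estimator of the kind FFJORD uses: tr(A) = E[ε^T A ε] for zero-mean, unit-covariance noise ε (here Rademacher). Names and signatures are illustrative.

```python
import torch

def hutchinson_trace(func, t, z, n_samples=1):
    """Unbiased estimate of tr(∂f/∂z); each sample costs one vector-Jacobian product
    instead of one backward pass per dimension."""
    with torch.enable_grad():
        z = z.detach().requires_grad_(True)
        f = func(t, z)
        est = torch.zeros(z.shape[0])
        for _ in range(n_samples):
            eps = torch.empty_like(z).bernoulli_(0.5) * 2 - 1      # Rademacher ±1 noise
            vjp = torch.autograd.grad(f, z, grad_outputs=eps, retain_graph=True)[0]
            est = est + (vjp * eps).sum(dim=1)                     # ε^T (∂f/∂z) ε
    return est / n_samples
```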


FFJORD - Stochastic Continuous Flows
[Figure: model samples on MNIST and CIFAR10.]

Grathwohl et al. (2019)


Variational Autoencoders with FFJORD
ODE Solving as a Modeling Primitive
Adaptive-step solvers with O(1) memory backprop.

github.com/rtqichen/torchdiffeq

Future directions we’re currently working on:

- Latent Stochastic Differential Equations.


- Network architectures suited for ODEs.
- Regularization of dynamics to require fewer evaluations.
Co-authors:

Yulia Rubanova Jesse Bettencourt David Duvenaud

Thanks!
Extra Slides
Latent Space Visualizations
• Released an implementation of reverse-mode autodiff through black-box ODE solvers.

• Solves a system of size 2D + K + 1.

• In contrast, a forward-mode implementation solves a system of size D^2 + KD.

• TensorFlow has Dormand-Prince-Shampine Runge-Kutta 5(4) implemented, but uses naive autodiff for backpropagation.
How much precision is needed?
Explicit Error Control

- More fine-grained control than low-precision floats.

- Cost scales with instance difficulty.

NFE = Number of Function Evaluations.
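In torchdiffeq this control is exposed as solver tolerances; a sketch reusing func, z0, and t from the earlier example (tolerance values are arbitrary):

```python
from torchdiffeq import odeint

# Tighter tolerances mean more function evaluations (higher NFE) and lower numerical error.
z_coarse = odeint(func, z0, t, rtol=1e-3, atol=1e-4)
z_fine   = odeint(func, z0, t, rtol=1e-7, atol=1e-9)
```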


Computation Depends on Complexity of Dynamics

- Time cost is dominated by evaluation of the dynamics f.

NFE = Number of Function Evaluations.


Why not use an ODE solver as a modeling primitive?
- Solving an ODE is expensive.
Future Directions
- Stochastic differential equations and random ODEs; these can approximate stochastic gradient descent.
- Scaling up ODE solvers with machine learning.
- Partial differential equations.
- Graphics, physics, simulations.
