CVPR 2022 Tutorial: Diffusion Models


Denoising Diffusion-based Generative Modeling:

Foundations and Applications


Karsten Kreis Ruiqi Gao Arash Vahdat

1
Deep Generative Learning
Learning to generate data

Train: a neural network on samples from a data distribution.

Sample: generate new data from the trained network.

2
Application (1): Content Generation
StyleGAN3 example images

Karras et al. Alias-Free Generative Adversarial Networks, NeurIPS 2021 3


Application (2): Representation Learning
Learning from limited labels

Zhang et al., DatasetGAN: Efficient Labeled Data Factory with Minimal Human Effort, CVPR 2021
Li et al., Semantic Segmentation with Generative Models: Semi-Supervised Learning and Strong Out-of-Domain Generalization, CVPR 2021 4
Application (3): Artistic Tools
NVIDIA GauGAN

Park et al., Semantic Image Synthesis with Spatially-Adaptive Normalization, CVPR 2019
5
The Landscape of Deep Generative Learning

Autoregressive Models, Normalizing Flows, Variational Autoencoders,
Generative Adversarial Networks, Energy-based Models, Denoising Diffusion Models

6
Denoising Diffusion Models
Emerging as powerful generative models, outperforming GANs

“Diffusion Models Beat GANs on Image Synthesis”, Dhariwal & Nichol, OpenAI, 2021
“Cascaded Diffusion Models for High Fidelity Image Generation”, Ho et al., Google, 2021
8
Image Super-resolution
Successful applications

Saharia et al., Image Super-Resolution via Iterative Refinement, ICCV 2021


9
Text-to-Image Generation
DALL·E 2 Imagen
Example prompts: “a teddy bear on a skateboard in times square”; “A group of teddy bears in suit in a
corporate office celebrating the birthday of their friend. There is a pizza cake on the desk.”

“Hierarchical Text-Conditional Image Generation with CLIP Latents”, Ramesh et al., 2022
“Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding”, Saharia et al., 2022
10
Today’s Program

Title Speaker Time


Introduction Arash 10 min
Part (1): Denoising Diffusion Probabilistic Models Arash 35 min
Part (2): Score-based Generative Modeling with Differential Equations Karsten 45 min
Part (3): Advanced Techniques: Accelerated Sampling, Conditional Generation, and Beyond Ruiqi 45 min
Applications (1): Image Synthesis, Text-to-Image, Controllable Generation Ruiqi 15 min
Applications (2): Image Editing, Image-to-Image, Super-resolution, Segmentation Arash 15 min
Applications (3): Video Synthesis, Medical Imaging, 3D Generation, Discrete State Models Karsten 15 min
Conclusions, Open Problems and Final Remarks Arash 10 min

cvpr2022-tutorial-diffusion-models.github.io
13
Disclaimer
You didn’t include my arXiv submission that will come out next week?

15
Part (1):
Denoising Diffusion Probabilistic Models

17
Denoising Diffusion Models
Learning to generate by denoising

Denoising diffusion models consist of two processes:

• Forward diffusion process that gradually adds noise to input

• Reverse denoising process that learns to generate data by denoising

Forward diffusion process (fixed)

Data Noise

Reverse denoising process (generative)

Sohl-Dickstein et al., Deep Unsupervised Learning using Nonequilibrium Thermodynamics, ICML 2015
Ho et al., Denoising Diffusion Probabilistic Models, NeurIPS 2020
Song et al., Score-Based Generative Modeling through Stochastic Differential Equations, ICLR 2021 18
Forward Diffusion Process

The formal definition of the forward process in T steps:

Forward diffusion process (fixed): x_0 (data) → x_1 → x_2 → … → x_T (noise)

q(x_t | x_{t-1}) = N(x_t ; √(1 − β_t) x_{t-1}, β_t I)        (per-step kernel)

q(x_{1:T} | x_0) = ∏_{t=1}^{T} q(x_t | x_{t-1})               (joint)

19
Diffusion Kernel

Forward diffusion process (fixed): x_0 (data) → x_1 → x_2 → … → x_T (noise)

Define ᾱ_t = ∏_{s=1}^{t} (1 − β_s). Then

q(x_t | x_0) = N(x_t ; √ᾱ_t x_0, (1 − ᾱ_t) I)        (Diffusion Kernel)

For sampling:  x_t = √ᾱ_t x_0 + √(1 − ᾱ_t) ε,  where ε ~ N(0, I)

The β_t values schedule (i.e., the noise schedule) is designed such that ᾱ_T → 0 and q(x_T | x_0) ≈ N(x_T ; 0, I).

20
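
The closed-form kernel makes it cheap to diffuse any training example to an arbitrary step t in one shot. Below is a minimal PyTorch sketch (not from the slides); the linear β_t schedule and the tensor shapes are illustrative assumptions.

```python
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)           # assumed linear noise schedule beta_t
alphas_bar = torch.cumprod(1.0 - betas, dim=0)  # alpha_bar_t = prod_{s<=t} (1 - beta_s)

def diffuse(x0, t):
    """Sample x_t ~ q(x_t | x_0) = N(sqrt(alpha_bar_t) x_0, (1 - alpha_bar_t) I)."""
    a_bar = alphas_bar[t].view(-1, *([1] * (x0.dim() - 1)))  # broadcast over non-batch dims
    eps = torch.randn_like(x0)                               # eps ~ N(0, I)
    return torch.sqrt(a_bar) * x0 + torch.sqrt(1.0 - a_bar) * eps

# usage: diffuse a batch of 8 toy "images" to random timesteps
x0 = torch.randn(8, 3, 32, 32)
t = torch.randint(0, T, (8,))
xt = diffuse(x0, t)
```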
What happens to a distribution in the forward diffusion?

So far, we discussed the diffusion kernel q(x_t | x_0), but what about the diffused data distribution q(x_t)?

Diffused data distributions: q(x_0), q(x_1), q(x_2), q(x_3), …, q(x_T)   (data → noise)

q(x_t) = ∫ q(x_0, x_t) dx_0 = ∫ q(x_0) q(x_t | x_0) dx_0
  (diffused data dist. = marginal of the joint dist. = input data dist. × diffusion kernel)

The diffusion kernel is a Gaussian convolution.

We can sample x_t ~ q(x_t) by first sampling x_0 ~ q(x_0) and then sampling x_t ~ q(x_t | x_0) (i.e., ancestral sampling).

21
Generative Learning by Denoising

Recall that the diffusion parameters are designed such that q(x_T) ≈ N(x_T ; 0, I).

Diffused data distributions: q(x_0), q(x_1), q(x_2), q(x_3), …, q(x_T)

Generation:

• Sample x_T ~ N(x_T ; 0, I)

• Iteratively sample x_{t-1} ~ q(x_{t-1} | x_t)   (true denoising distributions q(x_{T-1} | x_T), …, q(x_0 | x_1))

In general, q(x_{t-1} | x_t) is intractable.

Can we approximate q(x_{t-1} | x_t)? Yes, we can use a Normal distribution if β_t is small in each forward diffusion step.
22
Reverse Denoising Process

Formal definition of forward and reverse processes in T steps:

Forward:  q(x_t | x_{t-1}) = N(x_t ; √(1 − β_t) x_{t-1}, β_t I)

Reverse denoising process (generative): x_T (noise) → … → x_0 (data)

p(x_T) = N(x_T ; 0, I),   p_θ(x_{t-1} | x_t) = N(x_{t-1} ; μ_θ(x_t, t), σ_t² I)

μ_θ(x_t, t): trainable network (U-Net, denoising autoencoder) 23
Learning Denoising Model
Variational upper bound

For training, we can form the variational upper bound that is commonly used for training variational autoencoders.

Sohl-Dickstein et al. (ICML 2015) and Ho et al. (NeurIPS 2020) show that the bound decomposes into a sum of KL divergence terms, where q(x_{t-1} | x_t, x_0) is the tractable posterior distribution (see below).

24
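
The slide's equations did not survive extraction; for reference, the standard form of this bound and of the tractable posterior from Ho et al. (NeurIPS 2020) is (notation: α_t = 1 − β_t, ᾱ_t = ∏_{s≤t} α_s):

```latex
% Variational upper bound (negative ELBO), as in Sohl-Dickstein et al. 2015 / Ho et al. 2020:
\mathbb{E}_q\!\left[-\log p_\theta(\mathbf{x}_0)\right] \le
\mathbb{E}_q\!\Big[
  \underbrace{D_{\mathrm{KL}}\!\big(q(\mathbf{x}_T|\mathbf{x}_0)\,\|\,p(\mathbf{x}_T)\big)}_{L_T}
  + \sum_{t>1}
  \underbrace{D_{\mathrm{KL}}\!\big(q(\mathbf{x}_{t-1}|\mathbf{x}_t,\mathbf{x}_0)\,\|\,p_\theta(\mathbf{x}_{t-1}|\mathbf{x}_t)\big)}_{L_{t-1}}
  - \underbrace{\log p_\theta(\mathbf{x}_0|\mathbf{x}_1)}_{L_0}
\Big]

% Tractable posterior:
q(\mathbf{x}_{t-1}|\mathbf{x}_t,\mathbf{x}_0)
  = \mathcal{N}\!\big(\mathbf{x}_{t-1};\,\tilde{\mu}_t(\mathbf{x}_t,\mathbf{x}_0),\,\tilde{\beta}_t\mathbf{I}\big),\qquad
\tilde{\mu}_t = \frac{\sqrt{\bar{\alpha}_{t-1}}\,\beta_t}{1-\bar{\alpha}_t}\,\mathbf{x}_0
  + \frac{\sqrt{\alpha_t}\,(1-\bar{\alpha}_{t-1})}{1-\bar{\alpha}_t}\,\mathbf{x}_t,\qquad
\tilde{\beta}_t = \frac{1-\bar{\alpha}_{t-1}}{1-\bar{\alpha}_t}\,\beta_t
```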
Parameterizing the Denoising Model

Since both q(x_{t-1} | x_t, x_0) and p_θ(x_{t-1} | x_t) are Normal distributions, the KL divergence has a simple closed form.

Recall that x_t = √ᾱ_t x_0 + √(1 − ᾱ_t) ε. Ho et al. (NeurIPS 2020) propose to represent the mean of the denoising model using a noise-prediction network ε_θ(x_t, t).

With this parameterization, each KL term reduces to a weighted noise-prediction loss (see below).

25
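
For reference, this is the standard noise-prediction parameterization and the resulting per-term loss from Ho et al. (NeurIPS 2020) that the slide describes:

```latex
% Noise-prediction parameterization of the denoising mean (Ho et al. 2020):
\mu_\theta(\mathbf{x}_t, t)
  = \frac{1}{\sqrt{\alpha_t}}\Big(\mathbf{x}_t
  - \frac{\beta_t}{\sqrt{1-\bar{\alpha}_t}}\,\epsilon_\theta(\mathbf{x}_t, t)\Big)

% With this parameterization, each KL term becomes a weighted noise-prediction loss:
L_{t-1} = \mathbb{E}_{\mathbf{x}_0,\,\epsilon\sim\mathcal{N}(0,\mathbf{I})}
  \Big[\lambda_t\,\big\|\epsilon - \epsilon_\theta\big(\underbrace{\sqrt{\bar{\alpha}_t}\,\mathbf{x}_0
  + \sqrt{1-\bar{\alpha}_t}\,\epsilon}_{\mathbf{x}_t},\, t\big)\big\|^2\Big],
\qquad
\lambda_t = \frac{\beta_t^2}{2\sigma_t^2\,\alpha_t\,(1-\bar{\alpha}_t)}
```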
Training Objective Weighting
Trading likelihood for perceptual quality

The time-dependent weight λ_t ensures that the training objective is weighted properly for maximum data likelihood training.

However, this weight is often very large for small t’s.

Ho et al. (NeurIPS 2020) observe that simply setting λ_t = 1 improves sample quality. So, they propose to use the simplified objective:

L_simple = E_{x_0, ε, t} [ ‖ε − ε_θ(x_t, t)‖² ]

For more advanced weighting see Choi et al., Perception Prioritized Training of Diffusion Models, CVPR 2022.

26
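
A minimal PyTorch sketch of one training step with the simplified objective; eps_model stands in for any noise-prediction network ε_θ(x_t, t) (e.g., a U-Net) and is an assumed interface, not the tutorial's code.

```python
import torch

def ddpm_training_step(eps_model, x0, alphas_bar, optimizer):
    """One gradient step on the simplified objective E || eps - eps_theta(x_t, t) ||^2."""
    T = alphas_bar.shape[0]
    alphas_bar = alphas_bar.to(x0.device)
    t = torch.randint(0, T, (x0.shape[0],), device=x0.device)     # t ~ Uniform over steps
    a_bar = alphas_bar[t].view(-1, *([1] * (x0.dim() - 1)))
    eps = torch.randn_like(x0)                                     # eps ~ N(0, I)
    xt = torch.sqrt(a_bar) * x0 + torch.sqrt(1.0 - a_bar) * eps    # forward-diffuse x0 to step t
    loss = torch.mean((eps - eps_model(xt, t)) ** 2)               # lambda_t = 1
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```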
Summary
Training and Sample Generation

27
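
This slide shows Ho et al.'s training and sampling algorithms as figures. The training step is sketched above; below is a matching sketch of ancestral sampling (Algorithm 2 of Ho et al.), again assuming a noise-prediction network eps_model and the choice σ_t² = β_t.

```python
import torch

@torch.no_grad()
def ddpm_sample(eps_model, shape, betas):
    """Ancestral sampling: start from x_T ~ N(0, I) and iteratively apply p_theta(x_{t-1} | x_t)."""
    alphas = 1.0 - betas
    alphas_bar = torch.cumprod(alphas, dim=0)
    x = torch.randn(shape)                                         # x_T ~ N(0, I)
    for t in reversed(range(betas.shape[0])):
        t_batch = torch.full((shape[0],), t, dtype=torch.long)
        eps = eps_model(x, t_batch)                                # predicted noise
        mean = (x - betas[t] / torch.sqrt(1.0 - alphas_bar[t]) * eps) / torch.sqrt(alphas[t])
        z = torch.randn_like(x) if t > 0 else torch.zeros_like(x)  # no noise at the final step
        x = mean + torch.sqrt(betas[t]) * z                        # sigma_t^2 = beta_t
    return x
```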
Implementation Considerations
Network Architectures

Diffusion models often use U-Net architectures with ResNet blocks and self-attention layers to represent ε_θ(x_t, t).

Time representation: sinusoidal positional embeddings or random Fourier features, processed by fully-connected layers.

Time features are fed to the residual blocks using either simple spatial addition or adaptive group normalization layers (see Dhariwal and Nichol, NeurIPS 2021).
28
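
A small sketch of the sinusoidal time embedding mentioned above; the embedding dimension and the frequency base 10000 follow the Transformer convention and are assumptions here.

```python
import math
import torch

def sinusoidal_time_embedding(t, dim=128):
    """Map integer timesteps t of shape [B] to sinusoidal features of shape [B, dim]."""
    half = dim // 2
    freqs = torch.exp(-math.log(10000.0) * torch.arange(half, dtype=torch.float32) / half)
    args = t.float()[:, None] * freqs[None, :]                     # [B, half]
    return torch.cat([torch.sin(args), torch.cos(args)], dim=-1)   # [B, dim]

# In a U-Net, this embedding is typically passed through a small MLP (fully-connected
# layers) and injected into each residual block (spatial addition or adaptive group norm).
```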
Diffusion Parameters
Noise Schedule

Above, β_t and σ_t² control the variance of the forward diffusion and reverse denoising processes, respectively.

Often a linear schedule is used for β_t, and σ_t² is set equal to β_t.

Kingma et al. (NeurIPS 2021) introduce a new parameterization of diffusion models using the signal-to-noise ratio (SNR), and show how to learn the noise schedule by minimizing the variance of the training objective.

We can also train σ_t² while training the diffusion model by minimizing the variational bound (Improved DDPM by Nichol and Dhariwal, ICML 2021) or after training the diffusion model (Analytic-DPM by Bao et al., ICLR 2022).
29
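
A quick numerical check of the property the noise schedule is designed for: with a linear β_t schedule (the endpoint values below follow Ho et al. but are assumptions here), ᾱ_T is driven close to zero, so q(x_T | x_0) ≈ N(0, I).

```python
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)            # linear schedule for beta_t (assumed endpoints)
alphas_bar = torch.cumprod(1.0 - betas, dim=0)

# alpha_bar_T should be ~0, so x_T retains (almost) no information about x_0
print(alphas_bar[-1].item())                     # ~4e-5 with these settings
```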
What happens to an image in the forward diffusion process?

Recall that sampling from q(x_t | x_0) is done using x_t = √ᾱ_t x_0 + √(1 − ᾱ_t) ε, where ε ~ N(0, I).

(Figure: Fourier transforms of the diffused image for small t vs. large t.)

In the forward diffusion, the high-frequency content is perturbed faster.

30
Content-Detail Tradeoff

Reverse denoising process (generative)

Data Noise

x0 x1 x2 x3 x4 … xT

At small t (near the data), the denoising model is specialized for generating the high-frequency content (i.e., low-level details); at large t (near the noise), it is specialized for generating the low-frequency content (i.e., coarse content).

The weighting of the training objective for different timesteps is important!


31
Connection to VAEs

Diffusion models can be considered as a special form of hierarchical VAEs.

However, in diffusion models:

• The encoder is fixed

• The latent variables have the same dimension as the data

• The denoising model is shared across different timesteps

• The model is trained with some reweighting of the variational bound.

Vahdat and Kautz, NVAE: A Deep Hierarchical Variational Autoencoder, NeurIPS 2020
Sønderby et al., Ladder Variational Autoencoders, NeurIPS 2016. 32
Summary
Denoising Diffusion Probabilistic Models

In this part, we reviewed denoising diffusion probabilistic models.

The model is trained by sampling from the forward diffusion process and training a denoising model to predict the noise.

We discussed how the forward process perturbs the data distribution or data samples.

The devil is in the details:

• Network architectures

• Objective weighting

• Diffusion parameters (i.e., noise schedule)

See “Elucidating the Design Space of Diffusion-Based Generative Models” by Karras et al. for important design decisions.

33
Part (2):
Score-based Generative Modeling
with Differential Equations

35
Forward Diffusion Process

Consider the forward diffusion process again:

Forward diffusion process (fixed)

Data Noise

x0 x1 x2 x3 x4 … xT

q(x_t | x_{t-1}) = N(x_t ; √(1 − β_t) x_{t-1}, β_t I)

36
Forward Diffusion Process

Consider the limit of many small steps:

Forward diffusion process (fixed)

Data Noise

x0 x1 … … xT
q(x_t | x_{t-1}) = N(x_t ; √(1 − β_t) x_{t-1}, β_t I)

x_t = √(1 − β_t) x_{t-1} + √β_t ε,   ε ~ N(0, I)

Song et al., “Score-Based Generative Modeling through Stochastic Differential Equations”, ICLR, 2021 37
Forward Diffusion Process

Consider the limit of many small steps:

Forward diffusion process (fixed)

Data Noise

x0 x1 … … xT
q(x_t | x_{t-1}) = N(x_t ; √(1 − β_t) x_{t-1}, β_t I)

x_t = √(1 − β_t) x_{t-1} + √β_t ε
    = √(1 − β(t)Δt) x_{t-1} + √(β(t)Δt) ε          (defining β_t := β(t)Δt),   ε ~ N(0, I)

Song et al., “Score-Based Generative Modeling through Stochastic Differential Equations”, ICLR, 2021 38
Forward Diffusion Process

Consider the limit of many small steps:

Forward diffusion process (fixed)

Data Noise

x0 x1 … … xT
q(x_t | x_{t-1}) = N(x_t ; √(1 − β_t) x_{t-1}, β_t I)

x_t = √(1 − β_t) x_{t-1} + √β_t ε
    = √(1 − β(t)Δt) x_{t-1} + √(β(t)Δt) ε                (β_t := β(t)Δt)
    ≈ x_{t-1} − (β(t)Δt / 2) x_{t-1} + √(β(t)Δt) ε       (Taylor expansion),   ε ~ N(0, I)

Song et al., “Score-Based Generative Modeling through Stochastic Differential Equations”, ICLR, 2021 39
Forward Diffusion Process as Stochastic Differential Equation

Consider the limit of many small steps:

Forward diffusion process (fixed)

Data Noise

x0 x1 … … xT
x_t ≈ x_{t-1} − (β(t)Δt / 2) x_{t-1} + √(β(t)Δt) ε,   ε ~ N(0, I)

dx_t = −½ β(t) x_t dt + √β(t) dω_t

Stochastic Differential Equation (SDE) describing the diffusion in the infinitesimal limit

Song et al., “Score-Based Generative Modeling through Stochastic Differential Equations”, ICLR, 2021 40
Crash Course in Differential Equations

Ordinary Differential Equation (ODE):

dx/dt = f(x, t)    or equivalently    dx = f(x, t) dt

Stochastic Differential Equation (SDE):

dx/dt = f(x, t) + σ(x, t) ω_t        (ω_t: Gaussian white noise)

dx = f(x, t) dt + σ(x, t) dω_t       (ω_t: Wiener process)

f(x, t): drift coefficient,   σ(x, t): diffusion coefficient

Analytical solution of the ODE:   x(t) = x(0) + ∫_0^t f(x, τ) dτ

Iterative numerical solution:

ODE:  x(t + Δt) ≈ x(t) + f(x(t), t) Δt

SDE:  x(t + Δt) ≈ x(t) + f(x(t), t) Δt + σ(x(t), t) √Δt ε,   ε ~ N(0, I)

41–43
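
The iterative SDE update above is the Euler–Maruyama method. As an illustration (not from the slides), here is a sketch that simulates the forward variance-preserving SDE dx = −½ β(t) x dt + √β(t) dω; the linear β(t) with endpoints 0.1 and 20 is an assumed, commonly used choice.

```python
import torch

def beta(t):
    # assumed linear beta(t) on t in [0, 1], from 0.1 to 20 (a common VP-SDE choice)
    return 0.1 + t * (20.0 - 0.1)

def simulate_forward_sde(x0, n_steps=1000):
    """Euler-Maruyama for dx = -1/2 beta(t) x dt + sqrt(beta(t)) dw on t in [0, 1]."""
    x = x0.clone()
    dt = 1.0 / n_steps
    for i in range(n_steps):
        t = i * dt
        drift = -0.5 * beta(t) * x                             # drift term: pulls towards the mode
        noise = (beta(t) * dt) ** 0.5 * torch.randn_like(x)    # diffusion term: injects noise
        x = x + drift * dt + noise
    return x

# starting from data x0, the simulated x ends up approximately N(0, I) distributed
```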
Forward Diffusion Process as Stochastic Differential Equation

Forward diffusion process (fixed): x_0 ~ q(x_0) (data)  →  x_t  →  x_T ~ q(x_T) (noise)

Forward Diffusion SDE:   dx_t = −½ β(t) x_t dt + √β(t) dω_t

drift term (pulls towards the mode)        diffusion term (injects noise)

Song et al., ICLR, 2021 45
Forward Diffusion Process as Stochastic Differential Equation

Song et al., ICLR, 2021 46


Forward Diffusion Process as Stochastic Differential Equation
Forward diffusion process (fixed): x_0 ~ q(x_0) (data)  →  x_t  →  x_T ~ q(x_T) (noise)

Forward Diffusion SDE:   dx_t = −½ β(t) x_t dt + √β(t) dω_t

drift term (pulls towards the mode)        diffusion term (injects noise)

Special case of more general SDEs used in generative diffusion models:

dx_t = f(t) x_t dt + g(t) dω_t

Song et al., ICLR, 2021 47


The Generative Reverse Stochastic Differential Equation
Forward diffusion process (fixed): x_0 ~ q(x_0) (data)  →  x_t  →  x_T ~ q(x_T) (noise)

Forward Diffusion SDE:   dx_t = −½ β(t) x_t dt + √β(t) dω_t

But what about the reverse direction, necessary for generation?

Song et al., ICLR, 2021 48


The Generative Reverse Stochastic Differential Equation
Forward diffusion process (fixed): x_0 ~ q(x_0) (data)  →  x_t  →  x_T ~ q(x_T) (noise)

Forward Diffusion SDE:   dx_t = −½ β(t) x_t dt + √β(t) dω_t
(drift term, diffusion term)

Reverse Generative Diffusion SDE:

dx_t = [ −½ β(t) x_t − β(t) ∇_{x_t} log q_t(x_t) ] dt + √β(t) dω̄_t

∇_{x_t} log q_t(x_t) is the “score function”.

Simulate the reverse diffusion process: data generation from random noise!

Song et al., ICLR, 2021
Anderson, in Stochastic Processes and their Applications, 1982 49
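
A sketch of generation by simulating the reverse SDE with Euler–Maruyama, assuming a learned score model score_model(x, t) ≈ ∇_x log q_t(x) is available (how to obtain it is the topic of the next slides); score_model and the β(t) endpoints are assumptions, not the tutorial's code.

```python
import torch

def beta(t):
    # same assumed linear beta(t) as in the forward-SDE sketch
    return 0.1 + t * (20.0 - 0.1)

@torch.no_grad()
def sample_reverse_sde(score_model, shape, n_steps=1000):
    """Euler-Maruyama for the reverse SDE, integrated backward from t = 1 (noise) to t = 0 (data):
    dx = [-1/2 beta(t) x - beta(t) * score(x, t)] dt + sqrt(beta(t)) dw_bar."""
    x = torch.randn(shape)                                     # x_T ~ N(0, I)
    dt = 1.0 / n_steps
    for i in reversed(range(n_steps)):
        t = (i + 1) * dt
        score = score_model(x, t)                              # approximates grad_x log q_t(x)
        drift = -0.5 * beta(t) * x - beta(t) * score
        noise = (beta(t) * dt) ** 0.5 * torch.randn_like(x)
        x = x - drift * dt + noise                             # minus sign: stepping backward in time
    return x
```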
The Generative Reverse Stochastic Differential Equation

Song et al., ICLR, 2021


50
Anderson, in Stochastic Processes and their Applications, 1982
The Generative Reverse Stochastic Differential Equation
Forward diffusion process (fixed)

[Figure: forward diffusion chain x_0 → … → x_t → … → x_T, from the data distribution q(x_0) to the noise distribution q(x_T)]

But how to get the score function \nabla_{x_t}\log q_t(x_t)?

Forward Diffusion SDE:   dx_t = -\tfrac{1}{2}\beta(t)\,x_t\,dt + \sqrt{\beta(t)}\,d\omega_t   (drift term + diffusion term)

Reverse Generative Diffusion SDE:   dx_t = \left[-\tfrac{1}{2}\beta(t)\,x_t - \beta(t)\,\nabla_{x_t}\log q_t(x_t)\right]dt + \sqrt{\beta(t)}\,d\bar{\omega}_t

\nabla_{x_t}\log q_t(x_t) is the "Score Function"

Simulate reverse diffusion process: Data generation from random noise!

Song et al., ICLR, 2021
Anderson, in Stochastic Processes and their Applications, 1982
54
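To make "simulate the reverse diffusion process" concrete, below is a minimal sketch of Euler–Maruyama integration of the reverse VP-SDE above. It assumes a trained score network `score_fn(x, t)` approximating \nabla_{x_t}\log q_t(x_t); the linear β(t) schedule, step count, and function names are illustrative assumptions, not part of the tutorial.

```python
import numpy as np

def beta(t, beta_min=0.1, beta_max=20.0):
    # Illustrative linear beta(t) schedule on t in [0, 1] (an assumption, not the tutorial's).
    return beta_min + t * (beta_max - beta_min)

def reverse_sde_sample(score_fn, shape, n_steps=1000, T=1.0, rng=None):
    """Euler-Maruyama simulation of the reverse VP-SDE
        dx_t = [-1/2 beta(t) x_t - beta(t) * score(x_t, t)] dt + sqrt(beta(t)) d(omega_bar),
    integrated backward from t = T to t ~ 0, starting from x_T ~ N(0, I)."""
    rng = np.random.default_rng() if rng is None else rng
    x = rng.standard_normal(shape)                      # x_T ~ N(0, I)
    dt = T / n_steps
    for i in range(n_steps, 0, -1):
        t = i * dt
        drift = -0.5 * beta(t) * x - beta(t) * score_fn(x, t)
        x = x - drift * dt                              # backward step: x_{t-dt} = x_t - drift * dt + noise
        if i > 1:                                       # conventionally, no noise on the final step
            x = x + np.sqrt(beta(t) * dt) * rng.standard_normal(shape)
    return x
```

With a perfect score function this recovers samples from q(x_0); in practice `score_fn` is the learned network discussed on the following slides.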
Score Matching
Forward diffusion process (fixed)

[Figure: forward diffusion chain x_0 → … → x_t → … → x_T, from q(x_0) to q(x_T)]

• Naïve idea: learn a model for the score function by direct regression?

  \min_\theta \; \mathbb{E}_{t\sim U(0,T)}\,\mathbb{E}_{x_t\sim q_t(x_t)}\,\|s_\theta(x_t,t) - \nabla_{x_t}\log q_t(x_t)\|_2^2

  (diffusion time t; diffused data x_t; neural network s_\theta; score of the diffused (marginal) data)

But \nabla_{x_t}\log q_t(x_t) (the score of the marginal diffused density q_t(x_t)) is not tractable!

Vincent, "A Connection Between Score Matching and Denoising Autoencoders", Neural Computation, 2011
Song and Ermon, "Generative Modeling by Estimating Gradients of the Data Distribution", NeurIPS, 2019
55
Denoising Score Matching
Forward diffusion process (fixed)

"Variance Preserving" SDE:   dx_t = -\tfrac{1}{2}\beta(t)\,x_t\,dt + \sqrt{\beta(t)}\,d\omega_t

[Figure: forward diffusion chain x_0 → … → x_t → … → x_T, from q(x_0) to q(x_T)]

• Instead, diffuse individual data points x_0. The diffused q_t(x_t|x_0) is tractable:

  q_t(x_t|x_0) = \mathcal{N}\!\left(x_t;\,\gamma_t x_0,\,\sigma_t^2 I\right), \quad \gamma_t = e^{-\tfrac{1}{2}\int_0^t\beta(s)\,ds}, \quad \sigma_t^2 = 1 - e^{-\int_0^t\beta(s)\,ds}

• Denoising Score Matching:

  \min_\theta \; \mathbb{E}_{t\sim U(0,T)}\,\mathbb{E}_{x_0\sim q_0(x_0)}\,\mathbb{E}_{x_t\sim q_t(x_t|x_0)}\,\|s_\theta(x_t,t) - \nabla_{x_t}\log q_t(x_t|x_0)\|_2^2

  (diffusion time t; data sample x_0; diffused data sample x_t; neural network s_\theta; score of the diffused data sample)

After expectations, s_\theta(x_t,t) \approx \nabla_{x_t}\log q_t(x_t)!

Vincent, in Neural Computation, 2011
Song and Ermon, NeurIPS, 2019
Song et al., ICLR, 2021
56
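The claim "after expectations" follows from a short identity due to Vincent (2011); a minimal sketch of the argument, written out here in standard notation for completeness:

```latex
% The marginal score is the posterior average of the conditional score:
\nabla_{x_t}\log q_t(x_t)
  = \frac{\nabla_{x_t} q_t(x_t)}{q_t(x_t)}
  = \frac{\int q_0(x_0)\,\nabla_{x_t} q_t(x_t \mid x_0)\, dx_0}{q_t(x_t)}
  = \int \frac{q_0(x_0)\, q_t(x_t \mid x_0)}{q_t(x_t)}\,\nabla_{x_t}\log q_t(x_t \mid x_0)\, dx_0
  = \mathbb{E}_{q(x_0 \mid x_t)}\!\left[\nabla_{x_t}\log q_t(x_t \mid x_0)\right].
```

So the conditional score used as the regression target averages, under q(x_0|x_t), to the intractable marginal score, which is why the denoising objective has the same minimizer as the naïve one.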
Denoising Score Matching
Implementation 1: Noise Prediction
Forward diffusion process (fixed)

"Variance Preserving" SDE:   dx_t = -\tfrac{1}{2}\beta(t)\,x_t\,dt + \sqrt{\beta(t)}\,d\omega_t

[Figure: forward diffusion chain x_0 → … → x_t → … → x_T, from q(x_0) to q(x_T)]

  q_t(x_t|x_0) = \mathcal{N}\!\left(x_t;\,\gamma_t x_0,\,\sigma_t^2 I\right), \quad \gamma_t = e^{-\tfrac{1}{2}\int_0^t\beta(s)\,ds}, \quad \sigma_t^2 = 1 - e^{-\int_0^t\beta(s)\,ds}

• Denoising Score Matching:

  \min_\theta \; \mathbb{E}_{t\sim U(0,T)}\,\mathbb{E}_{x_0\sim q_0(x_0)}\,\mathbb{E}_{x_t\sim q_t(x_t|x_0)}\,\|s_\theta(x_t,t) - \nabla_{x_t}\log q_t(x_t|x_0)\|_2^2

• Re-parametrized sampling:   x_t = \gamma_t x_0 + \sigma_t\epsilon, \quad \epsilon\sim\mathcal{N}(0,I)

• Score function:   \nabla_{x_t}\log q_t(x_t|x_0) = -\nabla_{x_t}\frac{\|x_t-\gamma_t x_0\|^2}{2\sigma_t^2} = -\frac{x_t-\gamma_t x_0}{\sigma_t^2} = -\frac{\gamma_t x_0 + \sigma_t\epsilon - \gamma_t x_0}{\sigma_t^2} = -\frac{\epsilon}{\sigma_t}

• Neural network model:   s_\theta(x_t,t) := -\frac{\epsilon_\theta(x_t,t)}{\sigma_t}

  \min_\theta \; \mathbb{E}_{t\sim U(0,T)}\,\mathbb{E}_{x_0\sim q_0(x_0)}\,\mathbb{E}_{\epsilon\sim\mathcal{N}(0,I)}\,\frac{1}{\sigma_t^2}\,\|\epsilon - \epsilon_\theta(x_t,t)\|_2^2

Vincent, in Neural Computation, 2011
Song and Ermon, NeurIPS, 2019
Song et al., ICLR, 2021
57
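A minimal training-loss sketch for this noise-prediction parametrization, assuming a PyTorch network `eps_model(x_t, t)` that predicts ε and an illustrative linear β(t) schedule (both are assumptions; the tutorial does not prescribe a specific implementation):

```python
import torch

def vp_sde_coeffs(t, beta_min=0.1, beta_max=20.0):
    """gamma_t = exp(-1/2 ∫_0^t beta(s) ds), sigma_t = sqrt(1 - exp(-∫_0^t beta(s) ds)),
    evaluated for an assumed linear schedule beta(s) = beta_min + s * (beta_max - beta_min)."""
    int_beta = beta_min * t + 0.5 * (beta_max - beta_min) * t**2
    gamma_t = torch.exp(-0.5 * int_beta)
    sigma_t = torch.sqrt(1.0 - torch.exp(-int_beta))
    return gamma_t, sigma_t

def dsm_noise_prediction_loss(eps_model, x0, t_eps=1e-5):
    """Simple noise-prediction objective E || eps - eps_theta(x_t, t) ||^2,
    i.e. denoising score matching with lambda(t) = sigma_t^2."""
    t = torch.rand(x0.shape[0], device=x0.device) * (1.0 - t_eps) + t_eps   # t ~ U(t_eps, 1)
    gamma_t, sigma_t = vp_sde_coeffs(t)
    view = (-1,) + (1,) * (x0.dim() - 1)
    noise = torch.randn_like(x0)
    x_t = gamma_t.view(view) * x0 + sigma_t.view(view) * noise              # re-parametrized sample
    return ((noise - eps_model(x_t, t)) ** 2).mean()
```

Averaging the squared error over the batch corresponds to the λ(t) = σ_t² weighting discussed on the next slides.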
Denoising Score Matching
Implementation 2: Loss Weightings
Forward diffusion process (fixed)

"Variance Preserving" SDE:   dx_t = -\tfrac{1}{2}\beta(t)\,x_t\,dt + \sqrt{\beta(t)}\,d\omega_t

[Figure: forward diffusion chain x_0 → … → x_t → … → x_T, from q(x_0) to q(x_T)]

  q_t(x_t|x_0) = \mathcal{N}\!\left(x_t;\,\gamma_t x_0,\,\sigma_t^2 I\right), \quad \gamma_t = e^{-\tfrac{1}{2}\int_0^t\beta(s)\,ds}, \quad \sigma_t^2 = 1 - e^{-\int_0^t\beta(s)\,ds}

• Denoising Score Matching objective with loss weighting \lambda(t):

  \min_\theta \; \mathbb{E}_{t\sim U(0,T)}\,\mathbb{E}_{x_0\sim q_0(x_0)}\,\mathbb{E}_{\epsilon\sim\mathcal{N}(0,I)}\,\frac{\lambda(t)}{\sigma_t^2}\,\|\epsilon - \epsilon_\theta(x_t,t)\|_2^2

Different loss weightings trade off between models with good perceptual quality and models with high log-likelihood:
• Perceptual quality:   \lambda(t) = \sigma_t^2
• Maximum log-likelihood:   \lambda(t) = \beta(t)   (negative ELBO)

Same objectives as derived with the variational approach in Part (1)!

Ho et al., NeurIPS, 2020
Song et al., NeurIPS, 2021
Kingma et al., NeurIPS, 2021
Vahdat et al., NeurIPS, 2021
Huang et al., NeurIPS, 2021
Karras et al., arXiv, 2022
58
Denoising Score Matching
Implementation 2: Loss Weightings
Forward diffusion process (fixed)

"Variance Preserving" SDE:   dx_t = -\tfrac{1}{2}\beta(t)\,x_t\,dt + \sqrt{\beta(t)}\,d\omega_t

[Figure: forward diffusion chain x_0 → … → x_t → … → x_T, from q(x_0) to q(x_T)]

  q_t(x_t|x_0) = \mathcal{N}\!\left(x_t;\,\gamma_t x_0,\,\sigma_t^2 I\right), \quad \gamma_t = e^{-\tfrac{1}{2}\int_0^t\beta(s)\,ds}, \quad \sigma_t^2 = 1 - e^{-\int_0^t\beta(s)\,ds}

• Denoising Score Matching objective with loss weighting \lambda(t):

  \min_\theta \; \mathbb{E}_{t\sim U(0,T)}\,\mathbb{E}_{x_0\sim q_0(x_0)}\,\mathbb{E}_{\epsilon\sim\mathcal{N}(0,I)}\,\frac{\lambda(t)}{\sigma_t^2}\,\|\epsilon - \epsilon_\theta(x_t,t)\|_2^2

Different loss weightings trade off between models with good perceptual quality and models with high log-likelihood:
• Perceptual quality:   \lambda(t) = \sigma_t^2
• Maximum log-likelihood:   \lambda(t) = \beta(t)   (negative ELBO)

More sophisticated model parametrizations and loss weightings are possible!
(Karras et al., "Elucidating the Design Space of Diffusion-Based Generative Models", arXiv, 2022)

Same objectives as derived with the variational approach in Part (1)!

Ho et al., NeurIPS, 2020
Song et al., NeurIPS, 2021
Kingma et al., NeurIPS, 2021
Vahdat et al., NeurIPS, 2021
Huang et al., NeurIPS, 2021
Karras et al., arXiv, 2022
59
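For illustration, the weighting folds directly into the same noise-prediction loss; a minimal sketch with the two choices from this slide, again assuming a hypothetical `eps_model(x_t, t)` and a linear β(t) schedule (not the tutorial's prescription):

```python
import torch

def weighted_dsm_loss(eps_model, x0, weighting="perceptual", t_eps=1e-5,
                      beta_min=0.1, beta_max=20.0):
    """Weighted DSM objective E[(lambda(t)/sigma_t^2) ||eps - eps_theta(x_t, t)||^2].
    weighting='perceptual' -> lambda(t) = sigma_t^2 (simple noise-prediction loss);
    weighting='likelihood' -> lambda(t) = beta(t)   (maximum-likelihood / negative-ELBO weighting)."""
    t = torch.rand(x0.shape[0], device=x0.device) * (1.0 - t_eps) + t_eps
    beta_t = beta_min + t * (beta_max - beta_min)                  # assumed linear schedule
    int_beta = beta_min * t + 0.5 * (beta_max - beta_min) * t**2   # ∫_0^t beta(s) ds
    gamma_t = torch.exp(-0.5 * int_beta)
    sigma2_t = 1.0 - torch.exp(-int_beta)
    view = (-1,) + (1,) * (x0.dim() - 1)
    noise = torch.randn_like(x0)
    x_t = gamma_t.view(view) * x0 + sigma2_t.sqrt().view(view) * noise
    err = ((noise - eps_model(x_t, t)) ** 2).flatten(1).sum(dim=1)
    w = torch.ones_like(t) if weighting == "perceptual" else beta_t / sigma2_t
    return (w * err).mean()
```

With `weighting="likelihood"` the factor β(t)/σ_t² grows large for small t, which is exactly the variance issue addressed on the next slide.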
Denoising Score Matching
Implementation 3: Variance Reduction and Numerical Stability

  \min_\theta \; \mathbb{E}_{t\sim U(0,T)}\,\mathbb{E}_{x_0\sim q_0(x_0)}\,\mathbb{E}_{\epsilon\sim\mathcal{N}(0,I)}\,\frac{\lambda(t)}{\sigma_t^2}\,\|\epsilon - \epsilon_\theta(x_t,t)\|_2^2

• Notice \sigma_t^2 \to 0 as t \to 0. The loss is heavily amplified when sampling t close to 0 (for \lambda(t) = \beta(t)). High variance!

[Image from: Song et al., "Maximum Likelihood Training of Score-Based Diffusion Models", NeurIPS, 2021]

• 1. Train with a small time cut-off \eta (\approx 10^{-5}):

  \min_\theta \; \mathbb{E}_{t\sim U(\eta,T)}\,\mathbb{E}_{x_0\sim q_0(x_0)}\,\mathbb{E}_{\epsilon\sim\mathcal{N}(0,I)}\,\frac{\lambda(t)}{\sigma_t^2}\,\|\epsilon - \epsilon_\theta(x_t,t)\|_2^2

• 2. Variance reduction by importance sampling, with importance sampling distribution r(t) \propto \frac{\beta(t)}{\sigma_t^2}:

  \min_\theta \; \mathbb{E}_{t\sim r(t)}\,\mathbb{E}_{x_0\sim q_0(x_0)}\,\mathbb{E}_{\epsilon\sim\mathcal{N}(0,I)}\,\frac{1}{r(t)}\frac{\lambda(t)}{\sigma_t^2}\,\|\epsilon - \epsilon_\theta(x_t,t)\|_2^2

Song et al., NeurIPS, 2021
Vahdat et al., NeurIPS, 2021
Huang et al., NeurIPS, 2021
60
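The proposal r(t) ∝ β(t)/σ_t² can be sampled for essentially any schedule by a numerical inverse-CDF on a dense time grid; below is a minimal sketch under the same assumed linear β(t) schedule (grid size and function names are illustrative):

```python
import torch

def sample_t_importance(batch, t_eps=1e-5, T=1.0, n_grid=10_000,
                        beta_min=0.1, beta_max=20.0, device="cpu"):
    """Draw diffusion times t ~ r(t) ∝ beta(t) / sigma_t^2 on [t_eps, T]
    via numerical inverse-CDF on a grid; returns the times and their densities r(t)."""
    t = torch.linspace(t_eps, T, n_grid, device=device)
    beta_t = beta_min + t * (beta_max - beta_min)                  # assumed linear schedule
    int_beta = beta_min * t + 0.5 * (beta_max - beta_min) * t**2   # ∫_0^t beta(s) ds
    sigma2_t = 1.0 - torch.exp(-int_beta)
    pdf = beta_t / sigma2_t                                        # unnormalized r(t)
    dt = (T - t_eps) / (n_grid - 1)
    Z = pdf.sum() * dt                                             # approximate normalizer
    cdf = torch.cumsum(pdf, dim=0)
    cdf = cdf / cdf[-1]
    u = torch.rand(batch, device=device)
    idx = torch.searchsorted(cdf, u).clamp(max=n_grid - 1)
    return t[idx], (pdf / Z)[idx]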
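```

For the maximum-likelihood weighting λ(t) = β(t), the importance-weighted factor (1/r(t))·β(t)/σ_t² becomes a constant in t, which is precisely why this proposal removes the variance caused by sampling t uniformly.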
Probability Flow ODE
Forward diffusion process (fixed)

[Diagram: forward diffusion process (fixed) from q(x0) over x0, …, xt, …, xT to q(xT); the reverse generative process runs in the opposite direction.]

• Consider reverse generative diffusion SDE:
  $dx_t = -\frac{1}{2}\beta(t)\left[x_t + 2\,\nabla_{x_t}\log q_t(x_t)\right]dt + \sqrt{\beta(t)}\, d\bar{\omega}_t$

• In distribution equivalent to the "Probability Flow ODE":
  $dx_t = -\frac{1}{2}\beta(t)\left[x_t + \nabla_{x_t}\log q_t(x_t)\right]dt$
  (solving this ODE results in the same $q_t(x_t)$ when initializing $q_T(x_T) \approx \mathcal{N}(x_T; 0, I)$)

Song et al., ICLR, 2021
61
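The only structural difference between the two dynamics on this slide is a factor of 2 on the score term (plus the noise in the SDE); a tiny sketch makes this explicit, with `score(x, t)` and `beta(t)` as assumed placeholder callables.

```python
def reverse_sde_drift(x, t, score, beta):
    """Drift of the reverse generative SDE: -1/2 * beta(t) * [x + 2 * score(x, t)].
    The SDE additionally adds sqrt(beta(t)) * dw_bar noise on top of this drift."""
    return -0.5 * beta(t) * (x + 2.0 * score(x, t))

def probability_flow_drift(x, t, score, beta):
    """Drift of the probability flow ODE: -1/2 * beta(t) * [x + score(x, t)]; no noise term,
    yet it produces the same marginals q_t(x_t)."""
    return -0.5 * beta(t) * (x + score(x, t))
```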
Probability Flow ODE
[Diagram: encoding with the Probability Flow ODE maps q(x0) over x0, …, xt, …, xT to q(xT); generation with the Probability Flow ODE runs the same chain in reverse.]

$$dx_t = -\frac{1}{2}\beta(t)\left[x_t + s_\theta(x_t, t)\right]dt$$

Song et al., ICLR, 2021
62
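A minimal Euler-discretized sketch of this deterministic encode/generate round trip; `score_model(x, t)` and `beta(t)` are assumed callables, and a real implementation would use a better ODE solver (see the following slides).

```python
import numpy as np

def ode_drift(x, t, score_model, beta):
    # Probability flow ODE drift: -1/2 * beta(t) * [x + s_theta(x, t)]
    return -0.5 * beta(t) * (x + score_model(x, t))

def encode(x0, score_model, beta, n_steps=500, T=1.0):
    """Integrate the ODE forward in time (data -> latent), deterministically."""
    x, dt = x0.copy(), T / n_steps
    for i in range(n_steps):
        x = x + ode_drift(x, i * dt, score_model, beta) * dt
    return x                                 # x_T, the "latent code" of x0

def generate(xT, score_model, beta, n_steps=500, T=1.0):
    """Integrate the same ODE backward in time (latent -> data); up to discretization
    error this inverts encode()."""
    x, dt = xT.copy(), T / n_steps
    for i in range(n_steps, 0, -1):
        x = x - ode_drift(x, i * dt, score_model, beta) * dt
    return x
```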
Synthesis with SDE vs. ODE

<latexit sha1_base64="pHOJjjRZ/AF3g89zy5Wdxt+YFhE=">AAAB+nicbVDLSsNAFL2pr1pfqS7dBItQNyWRoi6LblxWsA9oQ5hMJ+3QySTOTNQS+yluXCji1i9x5984abPQ1gMDh3Pu5Z45fsyoVLb9bRRWVtfWN4qbpa3tnd09s7zfllEiMGnhiEWi6yNJGOWkpahipBsLgkKfkY4/vsr8zj0Rkkb8Vk1i4oZoyGlAMVJa8szyXbUfIjXyg/Rx6tknpZJnVuyaPYO1TJycVCBH0zO/+oMIJyHhCjMkZc+xY+WmSCiKGZmW+okkMcJjNCQ9TTkKiXTTWfSpdayVgRVEQj+urJn6eyNFoZST0NeTWUy56GXif14vUcGFm1IeJ4pwPD8UJMxSkZX1YA2oIFixiSYIC6qzWniEBMJKt5WV4Cx+eZm0T2vOWa1+U680LvM6inAIR1AFB86hAdfQhBZgeIBneIU348l4Md6Nj/lowch3DuAPjM8fuVSTAQ==</latexit> <latexit sha1_base64="Frtl7jj79vvBpy7yUIC/bT97H/4=">AAAB+nicbVDLSsNAFL3xWesr1aWbwSLUTUmkqMuiG5cV+oI2hMl00g6dTOLMRC2xn+LGhSJu/RJ3/o1Jm4W2Hhg4nHMv98zxIs6UtqxvY2V1bX1js7BV3N7Z3ds3SwdtFcaS0BYJeSi7HlaUM0FbmmlOu5GkOPA47Xjj68zv3FOpWCiaehJRJ8BDwXxGsE4l1yzdVfoB1iPPTx6nbvO0WHTNslW1ZkDLxM5JGXI0XPOrPwhJHFChCcdK9Wwr0k6CpWaE02mxHysaYTLGQ9pLqcABVU4yiz5FJ6kyQH4o0yc0mqm/NxIcKDUJvHQyi6kWvUz8z+vF2r90EiaiWFNB5of8mCMdoqwHNGCSEs0nKcFEsjQrIiMsMdFpW1kJ9uKXl0n7rGqfV2u3tXL9Kq+jAEdwDBWw4QLqcAMNaAGBB3iGV3gznowX4934mI+uGPnOIfyB8fkD8FCTJQ==</latexit>

<latexit sha1_base64="pHOJjjRZ/AF3g89zy5Wdxt+YFhE=">AAAB+nicbVDLSsNAFL2pr1pfqS7dBItQNyWRoi6LblxWsA9oQ5hMJ+3QySTOTNQS+yluXCji1i9x5984abPQ1gMDh3Pu5Z45fsyoVLb9bRRWVtfWN4qbpa3tnd09s7zfllEiMGnhiEWi6yNJGOWkpahipBsLgkKfkY4/vsr8zj0Rkkb8Vk1i4oZoyGlAMVJa8szyXbUfIjXyg/Rx6tknpZJnVuyaPYO1TJycVCBH0zO/+oMIJyHhCjMkZc+xY+WmSCiKGZmW+okkMcJjNCQ9TTkKiXTTWfSpdayVgRVEQj+urJn6eyNFoZST0NeTWUy56GXif14vUcGFm1IeJ4pwPD8UJMxSkZX1YA2oIFixiSYIC6qzWniEBMJKt5WV4Cx+eZm0T2vOWa1+U680LvM6inAIR1AFB86hAdfQhBZgeIBneIU348l4Md6Nj/lowch3DuAPjM8fuVSTAQ==</latexit> <latexit sha1_base64="Frtl7jj79vvBpy7yUIC/bT97H/4=">AAAB+nicbVDLSsNAFL3xWesr1aWbwSLUTUmkqMuiG5cV+oI2hMl00g6dTOLMRC2xn+LGhSJu/RJ3/o1Jm4W2Hhg4nHMv98zxIs6UtqxvY2V1bX1js7BV3N7Z3ds3SwdtFcaS0BYJeSi7HlaUM0FbmmlOu5GkOPA47Xjj68zv3FOpWCiaehJRJ8BDwXxGsE4l1yzdVfoB1iPPTx6nbvO0WHTNslW1ZkDLxM5JGXI0XPOrPwhJHFChCcdK9Wwr0k6CpWaE02mxHysaYTLGQ9pLqcABVU4yiz5FJ6kyQH4o0yc0mqm/NxIcKDUJvHQyi6kWvUz8z+vF2r90EiaiWFNB5of8mCMdoqwHNGCSEs0nKcFEsjQrIiMsMdFpW1kJ9uKXl0n7rGqfV2u3tXL9Kq+jAEdwDBWw4QLqcAMNaAGBB3iGV3gznowX4934mI+uGPnOIfyB8fkD8FCTJQ==</latexit>

q(x0 ) Generation with Reverse Diffusion SDE q(xT ) q(x0 ) Generation with Probability Flow ODE q(xT )

<latexit sha1_base64="1FJ2Efhg5qcTrvU55qAcEPfzByE=">AAAB+XicbVDLSsNAFL2pr1pfUZduBqvgqiRSqsuCG5cV7APaECbTSTt0Mgkzk2IJ/RM3LhRx65+482+ctFlo64GBwzn3cs+cIOFMacf5tkobm1vbO+Xdyt7+weGRfXzSUXEqCW2TmMeyF2BFORO0rZnmtJdIiqOA024wucv97pRKxWLxqGcJ9SI8EixkBGsj+bY9iLAeB2H2NPedioFvV52aswBaJ25BqlCg5dtfg2FM0ogKTThWqu86ifYyLDUjnM4rg1TRBJMJHtG+oQJHVHnZIvkcXRpliMJYmic0Wqi/NzIcKTWLAjOZ51SrXi7+5/VTHd56GRNJqqkgy0NhypGOUV4DGjJJieYzQzCRzGRFZIwlJtqUlZfgrn55nXSua26jVn+oV5sXRR1lOINzuAIXbqAJ99CCNhCYwjO8wpuVWS/Wu/WxHC1Zxc4p/IH1+QN+q5Ir</latexit>

x0 … <latexit sha1_base64="AqTsPoJ8QRhCLsOZsLF1Tq44dYA=">AAAB+XicbVBNS8NAFHypX7V+RT16CVbBU0mkqMeCF48VbC20oWy2m3bpZhN2X4ol9J948aCIV/+JN/+NmzYHbR1YGGbe481OkAiu0XW/rdLa+sbmVnm7srO7t39gHx61dZwqylo0FrHqBEQzwSVrIUfBOoliJAoEewzGt7n/OGFK81g+4DRhfkSGkoecEjRS37Z7EcFREGZPsz5WDPp21a25czirxCtIFQo0+/ZXbxDTNGISqSBadz03QT8jCjkVbFbppZolhI7JkHUNlSRi2s/myWfOuVEGThgr8yQ6c/X3RkYiradRYCbznHrZy8X/vG6K4Y2fcZmkyCRdHApT4WDs5DU4A64YRTE1hFDFTVaHjogiFE1ZeQne8pdXSfuy5l3V6vf1auOsqKMMJ3AKF+DBNTTgDprQAgoTeIZXeLMy68V6tz4WoyWr2DmGP7A+fwDmy5Jv</latexit>

xt … <latexit sha1_base64="GS205AhwIbESFdeXgSRcbzQfuPg=">AAAB+XicbVDLSsNAFL2pr1pfUZduBqvgqiRS1GXBjcsKfUEbwmQ6aYdOJmFmUiyhf+LGhSJu/RN3/o2TNgttPTBwOOde7pkTJJwp7TjfVmljc2t7p7xb2ds/ODyyj086Kk4loW0S81j2AqwoZ4K2NdOc9hJJcRRw2g0m97nfnVKpWCxaepZQL8IjwUJGsDaSb9uDCOtxEGZPc79VMfDtqlNzFkDrxC1IFQo0fftrMIxJGlGhCcdK9V0n0V6GpWaE03llkCqaYDLBI9o3VOCIKi9bJJ+jS6MMURhL84RGC/X3RoYjpWZRYCbznGrVy8X/vH6qwzsvYyJJNRVkeShMOdIxymtAQyYp0XxmCCaSmayIjLHERJuy8hLc1S+vk851zb2p1R/r1cZFUUcZzuAcrsCFW2jAAzShDQSm8Ayv8GZl1ov1bn0sR0tWsXMKf2B9/gC1y5JP</latexit>

xT <latexit sha1_base64="1FJ2Efhg5qcTrvU55qAcEPfzByE=">AAAB+XicbVDLSsNAFL2pr1pfUZduBqvgqiRSqsuCG5cV7APaECbTSTt0Mgkzk2IJ/RM3LhRx65+482+ctFlo64GBwzn3cs+cIOFMacf5tkobm1vbO+Xdyt7+weGRfXzSUXEqCW2TmMeyF2BFORO0rZnmtJdIiqOA024wucv97pRKxWLxqGcJ9SI8EixkBGsj+bY9iLAeB2H2NPedioFvV52aswBaJ25BqlCg5dtfg2FM0ogKTThWqu86ifYyLDUjnM4rg1TRBJMJHtG+oQJHVHnZIvkcXRpliMJYmic0Wqi/NzIcKTWLAjOZ51SrXi7+5/VTHd56GRNJqqkgy0NhypGOUV4DGjJJieYzQzCRzGRFZIwlJtqUlZfgrn55nXSua26jVn+oV5sXRR1lOINzuAIXbqAJ99CCNhCYwjO8wpuVWS/Wu/WxHC1Zxc4p/IH1+QN+q5Ir</latexit>

x0 … <latexit sha1_base64="AqTsPoJ8QRhCLsOZsLF1Tq44dYA=">AAAB+XicbVBNS8NAFHypX7V+RT16CVbBU0mkqMeCF48VbC20oWy2m3bpZhN2X4ol9J948aCIV/+JN/+NmzYHbR1YGGbe481OkAiu0XW/rdLa+sbmVnm7srO7t39gHx61dZwqylo0FrHqBEQzwSVrIUfBOoliJAoEewzGt7n/OGFK81g+4DRhfkSGkoecEjRS37Z7EcFREGZPsz5WDPp21a25czirxCtIFQo0+/ZXbxDTNGISqSBadz03QT8jCjkVbFbppZolhI7JkHUNlSRi2s/myWfOuVEGThgr8yQ6c/X3RkYiradRYCbznHrZy8X/vG6K4Y2fcZmkyCRdHApT4WDs5DU4A64YRTE1hFDFTVaHjogiFE1ZeQne8pdXSfuy5l3V6vf1auOsqKMMJ3AKF+DBNTTgDprQAgoTeIZXeLMy68V6tz4WoyWr2DmGP7A+fwDmy5Jv</latexit>

xt … <latexit sha1_base64="GS205AhwIbESFdeXgSRcbzQfuPg=">AAAB+XicbVDLSsNAFL2pr1pfUZduBqvgqiRS1GXBjcsKfUEbwmQ6aYdOJmFmUiyhf+LGhSJu/RN3/o2TNgttPTBwOOde7pkTJJwp7TjfVmljc2t7p7xb2ds/ODyyj086Kk4loW0S81j2AqwoZ4K2NdOc9hJJcRRw2g0m97nfnVKpWCxaepZQL8IjwUJGsDaSb9uDCOtxEGZPc79VMfDtqlNzFkDrxC1IFQo0fftrMIxJGlGhCcdK9V0n0V6GpWaE03llkCqaYDLBI9o3VOCIKi9bJJ+jS6MMURhL84RGC/X3RoYjpWZRYCbznGrVy8X/vH6qwzsvYyJJNRVkeShMOdIxymtAQyYp0XxmCCaSmayIjLHERJuy8hLc1S+vk851zb2p1R/r1cZFUUcZzuAcrsCFW2jAAzShDQSm8Ayv8GZl1ov1bn0sR0tWsXMKf2B9/gC1y5JP</latexit>

xT

• Generative Reverse Diffusion SDE (stochastic): • Generative Probability Flow ODE (deterministic):
<latexit sha1_base64="tabimnFXPYf3xGHlr9sie3mTJ5Q=">AAACrHicbVFbb9MwFHbCbZTLCjzyYlEhdQKqpJoACSFN8MIbQ1q3ojiKHOekteZcsE/QKsu/jn/AG/8Gp+tEu/FJlj9/5+pz8lZJg1H0Jwhv3b5z997e/cGDh48e7w+fPD01TacFzESjGj3PuQEla5ihRAXzVgOvcgVn+fnn3n72E7SRTX2CqxbSii9qWUrB0UvZ8BdDuEBd2cKxiuMyL+2Fy5B+pG9YqbmwsbNTx3JAPsYDpqDEZMfxFZ1evY3LLMsbVZhV5S/LcOnDnBtvB7z2WbRcLDGl/0r7NMz80GivCjn2YauxnOvdxE0FC584w0GPbDiKJtEa9CaJN2RENjjOhr9Z0YiughqF4sYkcdRiarlGKRS4AesMtFyc8wUknta8ApPa9bAdfemVgpaN9qdGula3IyyvTN+n9+y/ba7bevF/tqTD8n1qZd12CLW4LFR2imJD+83RQmoQqFaecKGl75WKJfc7Qr/ffgjx9S/fJKfTSfx2cvjtcHT0aTOOPfKcvCBjEpN35Ih8IcdkRkRwEHwN5sH3cBKehEmYXrqGwSbmGdlBWP4Fr8DWdw==</latexit>

1 p <latexit sha1_base64="m2vD7V/JKTHgiP18veKsYqTt05s=">AAACcnicbVFdaxQxFM2MVev6tSq+WNDoImyxLjOlaF+Eoi99rOC2hc0wZDJ3dkMzHyR3pEvID/Dv+eav6Et/QDPrSL88EHJy7r25NydZo6TBKPoThHfW7t67v/5g8PDR4ydPh8+eH5q61QKmola1Ps64ASUrmKJEBceNBl5mCo6yk29d/OgnaCPr6gcuG0hKPq9kIQVHL6XDXwzhFHVpc8dKjoussKcuRfqFfmSF5sLGzm47lgHyMW4yBQXOriV+oP+OxqWWZbXKzbL0m2W48FXOja/mb/lLtJwvMKGXnXHgkQ5H0SRagd4mcU9GpMdBOvzN8lq0JVQoFDdmFkcNJpZrlEKBG7DWQMPFCZ/DzNOKl2ASu7LM0fdeyWlRa78qpCv1aoXlpene4TO76c3NWCf+LzZrsdhNrKyaFqESfxsVraJY085/mksNAtXSEy609LNSseDeafS/1JkQ33zybXK4PYk/TXa+74z2vvZ2rJMN8o6MSUw+kz2yTw7IlAhyFrwMXgdvgvPwVfg27L0Lg77mBbmGcOsCcQW/kA==</latexit>

1
dxt = (t) [xt + 2s✓ (xt , t)] dt + ¯t
(t) d! dxt = (t) [xt + s✓ (xt , t)] dt
2 2

63
Song et al., ICLR, 2021
Probability Flow ODE
Diffusion Models as Continuous Normalizing Flows
[Diagram: encoding and generation with the Probability Flow ODE between q(x0) and q(xT).]

• Probability Flow ODE as Neural ODE or Continuous Normalizing Flow (CNF):

$$\frac{dx_t}{dt} = -\frac{1}{2}\beta(t)\left[x_t + s_\theta(x_t, t)\right]$$

• Enables use of advanced ODE solvers
• Deterministic encoding and generation (semantic image interpolation, etc.)

Chen et al., NeurIPS, 2018
Grathwohl et al., ICLR, 2019
Song et al., ICLR, 2021
64
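Since the probability flow ODE is an ordinary ODE, off-the-shelf adaptive solvers can be plugged in directly. Below is a sketch using SciPy's RK45 with the same assumed `score_model`/`beta` interface; a neural-ODE library with adjoint support would be the more typical choice when gradients through the solve are needed.

```python
import numpy as np
from scipy.integrate import solve_ivp

def probability_flow_rhs(t, x_flat, score_model, beta, shape):
    """Right-hand side of the probability flow ODE, flattened for solve_ivp."""
    x = x_flat.reshape(shape)
    dx = -0.5 * beta(t) * (x + score_model(x, t))
    return dx.ravel()

def generate_adaptive(xT, score_model, beta, T=1.0, t_eps=1e-3):
    """Integrate the probability flow ODE from t=T down to t~0 with an adaptive RK45 solver."""
    sol = solve_ivp(
        probability_flow_rhs,
        t_span=(T, t_eps),                       # integrating backward in time: noise -> data
        y0=xT.ravel(),
        args=(score_model, beta, xT.shape),
        method="RK45", rtol=1e-5, atol=1e-5,
    )
    return sol.y[:, -1].reshape(xT.shape), sol.nfev   # sample and number of score evaluations
```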
Semantic Image Interpolation with Probability Flow ODE

Continuous changes in latent space (xT) result in continuous, semantically meaningful changes in data space (x0)!

[Figure: interpolation example generated with the Probability Flow ODE.]

Dockhorn et al., ICLR, 2022
65
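A sketch of the interpolation recipe this slide illustrates: encode two images to their ODE latents, interpolate between the latents, and decode each point with the same ODE. It reuses the hypothetical `encode`/`generate` helpers sketched earlier; spherical interpolation is a common choice for Gaussian latents, though not necessarily the one used for this figure.

```python
import numpy as np

def slerp(z0, z1, lam):
    """Spherical interpolation between two latents (assumes they are not (anti-)parallel);
    keeps interpolants at a norm typical of Gaussian samples."""
    cos_omega = np.dot(z0.ravel(), z1.ravel()) / (np.linalg.norm(z0) * np.linalg.norm(z1))
    omega = np.arccos(np.clip(cos_omega, -1.0, 1.0))
    return (np.sin((1.0 - lam) * omega) * z0 + np.sin(lam * omega) * z1) / np.sin(omega)

def interpolate_images(x_a, x_b, encode, generate, n_frames=8):
    """Encode two images with the PF ODE, interpolate the latents, decode each interpolant."""
    z_a, z_b = encode(x_a), encode(x_b)
    return [generate(slerp(z_a, z_b, lam)) for lam in np.linspace(0.0, 1.0, n_frames)]
```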
Probability Flow ODE
Diffusion Models as Continuous Normalizing Flows
[Diagram: encoding and generation with the Probability Flow ODE between q(x0) and q(xT).]

• Probability Flow ODE as Neural ODE or Continuous Normalizing Flow (CNF):

$$\frac{dx_t}{dt} = -\frac{1}{2}\beta(t)\left[x_t + s_\theta(x_t, t)\right]$$

• Enables use of advanced ODE solvers
• Deterministic encoding and generation (semantic image interpolation, etc.)
• Log-likelihood computation (instantaneous change of variables):

$$\log p_\theta(x_0) = \log p_T(x_T) - \int_0^T \mathrm{Tr}\!\left(\frac{\partial}{\partial x_t}\left[\frac{1}{2}\beta(t)\left[x_t + s_\theta(x_t, t)\right]\right]\right) dt$$

Diffusion models can be considered CNFs trained with score matching!

Chen et al., NeurIPS, 2018
Grathwohl et al., ICLR, 2019
Song et al., ICLR, 2021
67
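A rough sketch of how the likelihood formula above can be evaluated in practice: integrate the ODE from the data to the prior while accumulating the trace term, which is replaced by a Hutchinson-style estimate (here with a finite-difference stand-in for the usual vector-Jacobian product). The `score_model`, `beta`, and `log_pT` callables are assumptions.

```python
import numpy as np

def log_likelihood_estimate(x0, score_model, beta, log_pT, n_steps=500, T=1.0, fd_eps=1e-4):
    """Estimate log p_theta(x0) via the probability flow ODE and the instantaneous
    change-of-variables formula: log p(x0) = log p_T(x_T) + \int_0^T Tr(df/dx_t) dt,
    with drift f(x, t) = -1/2 * beta(t) * [x + s_theta(x, t)]."""
    def f(x, t):
        return -0.5 * beta(t) * (x + score_model(x, t))

    x, dt, delta_logp = x0.copy(), T / n_steps, 0.0
    for i in range(n_steps):
        t = i * dt
        v = np.random.randn(*x.shape)                    # Gaussian probe for the Hutchinson trace
        div_est = np.sum(v * (f(x + fd_eps * v, t) - f(x, t))) / fd_eps   # ~ v^T (df/dx) v
        delta_logp += div_est * dt                       # accumulate \int Tr(df/dx_t) dt
        x = x + f(x, t) * dt                             # Euler step of the PF ODE (data -> noise)
    return log_pT(x) + delta_logp
```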
Sampling from “Continuous-Time” Diffusion Models
How to solve the generative SDE or ODE in practice?

<latexit sha1_base64="Frtl7jj79vvBpy7yUIC/bT97H/4=">AAAB+nicbVDLSsNAFL3xWesr1aWbwSLUTUmkqMuiG5cV+oI2hMl00g6dTOLMRC2xn+LGhSJu/RJ3/o1Jm4W2Hhg4nHMv98zxIs6UtqxvY2V1bX1js7BV3N7Z3ds3SwdtFcaS0BYJeSi7HlaUM0FbmmlOu5GkOPA47Xjj68zv3FOpWCiaehJRJ8BDwXxGsE4l1yzdVfoB1iPPTx6nbvO0WHTNslW1ZkDLxM5JGXI0XPOrPwhJHFChCcdK9Wwr0k6CpWaE02mxHysaYTLGQ9pLqcABVU4yiz5FJ6kyQH4o0yc0mqm/NxIcKDUJvHQyi6kWvUz8z+vF2r90EiaiWFNB5of8mCMdoqwHNGCSEs0nKcFEsjQrIiMsMdFpW1kJ9uKXl0n7rGqfV2u3tXL9Kq+jAEdwDBWw4QLqcAMNaAGBB3iGV3gznowX4934mI+uGPnOIfyB8fkD8FCTJQ==</latexit>

<latexit sha1_base64="pHOJjjRZ/AF3g89zy5Wdxt+YFhE=">AAAB+nicbVDLSsNAFL2pr1pfqS7dBItQNyWRoi6LblxWsA9oQ5hMJ+3QySTOTNQS+yluXCji1i9x5984abPQ1gMDh3Pu5Z45fsyoVLb9bRRWVtfWN4qbpa3tnd09s7zfllEiMGnhiEWi6yNJGOWkpahipBsLgkKfkY4/vsr8zj0Rkkb8Vk1i4oZoyGlAMVJa8szyXbUfIjXyg/Rx6tknpZJnVuyaPYO1TJycVCBH0zO/+oMIJyHhCjMkZc+xY+WmSCiKGZmW+okkMcJjNCQ9TTkKiXTTWfSpdayVgRVEQj+urJn6eyNFoZST0NeTWUy56GXif14vUcGFm1IeJ4pwPD8UJMxSkZX1YA2oIFixiSYIC6qzWniEBMJKt5WV4Cx+eZm0T2vOWa1+U680LvM6inAIR1AFB86hAdfQhBZgeIBneIU348l4Md6Nj/lowch3DuAPjM8fuVSTAQ==</latexit>

q(x0 ) q(xT )
<latexit sha1_base64="Frtl7jj79vvBpy7yUIC/bT97H/4=">AAAB+nicbVDLSsNAFL3xWesr1aWbwSLUTUmkqMuiG5cV+oI2hMl00g6dTOLMRC2xn+LGhSJu/RJ3/o1Jm4W2Hhg4nHMv98zxIs6UtqxvY2V1bX1js7BV3N7Z3ds3SwdtFcaS0BYJeSi7HlaUM0FbmmlOu5GkOPA47Xjj68zv3FOpWCiaehJRJ8BDwXxGsE4l1yzdVfoB1iPPTx6nbvO0WHTNslW1ZkDLxM5JGXI0XPOrPwhJHFChCcdK9Wwr0k6CpWaE02mxHysaYTLGQ9pLqcABVU4yiz5FJ6kyQH4o0yc0mqm/NxIcKDUJvHQyi6kWvUz8z+vF2r90EiaiWFNB5of8mCMdoqwHNGCSEs0nKcFEsjQrIiMsMdFpW1kJ9uKXl0n7rGqfV2u3tXL9Kq+jAEdwDBWw4QLqcAMNaAGBB3iGV3gznowX4934mI+uGPnOIfyB8fkD8FCTJQ==</latexit>

<latexit sha1_base64="pHOJjjRZ/AF3g89zy5Wdxt+YFhE=">AAAB+nicbVDLSsNAFL2pr1pfqS7dBItQNyWRoi6LblxWsA9oQ5hMJ+3QySTOTNQS+yluXCji1i9x5984abPQ1gMDh3Pu5Z45fsyoVLb9bRRWVtfWN4qbpa3tnd09s7zfllEiMGnhiEWi6yNJGOWkpahipBsLgkKfkY4/vsr8zj0Rkkb8Vk1i4oZoyGlAMVJa8szyXbUfIjXyg/Rx6tknpZJnVuyaPYO1TJycVCBH0zO/+oMIJyHhCjMkZc+xY+WmSCiKGZmW+okkMcJjNCQ9TTkKiXTTWfSpdayVgRVEQj+urJn6eyNFoZST0NeTWUy56GXif14vUcGFm1IeJ4pwPD8UJMxSkZX1YA2oIFixiSYIC6qzWniEBMJKt5WV4Cx+eZm0T2vOWa1+U680LvM6inAIR1AFB86hAdfQhBZgeIBneIU348l4Md6Nj/lowch3DuAPjM8fuVSTAQ==</latexit>

q(x0 ) Generation with Reverse Diffusion SDE


q(xT ) Generation with Probability Flow ODE

<latexit sha1_base64="1FJ2Efhg5qcTrvU55qAcEPfzByE=">AAAB+XicbVDLSsNAFL2pr1pfUZduBqvgqiRSqsuCG5cV7APaECbTSTt0Mgkzk2IJ/RM3LhRx65+482+ctFlo64GBwzn3cs+cIOFMacf5tkobm1vbO+Xdyt7+weGRfXzSUXEqCW2TmMeyF2BFORO0rZnmtJdIiqOA024wucv97pRKxWLxqGcJ9SI8EixkBGsj+bY9iLAeB2H2NPedioFvV52aswBaJ25BqlCg5dtfg2FM0ogKTThWqu86ifYyLDUjnM4rg1TRBJMJHtG+oQJHVHnZIvkcXRpliMJYmic0Wqi/NzIcKTWLAjOZ51SrXi7+5/VTHd56GRNJqqkgy0NhypGOUV4DGjJJieYzQzCRzGRFZIwlJtqUlZfgrn55nXSua26jVn+oV5sXRR1lOINzuAIXbqAJ99CCNhCYwjO8wpuVWS/Wu/WxHC1Zxc4p/IH1+QN+q5Ir</latexit>

x0 … <latexit sha1_base64="AqTsPoJ8QRhCLsOZsLF1Tq44dYA=">AAAB+XicbVBNS8NAFHypX7V+RT16CVbBU0mkqMeCF48VbC20oWy2m3bpZhN2X4ol9J948aCIV/+JN/+NmzYHbR1YGGbe481OkAiu0XW/rdLa+sbmVnm7srO7t39gHx61dZwqylo0FrHqBEQzwSVrIUfBOoliJAoEewzGt7n/OGFK81g+4DRhfkSGkoecEjRS37Z7EcFREGZPsz5WDPp21a25czirxCtIFQo0+/ZXbxDTNGISqSBadz03QT8jCjkVbFbppZolhI7JkHUNlSRi2s/myWfOuVEGThgr8yQ6c/X3RkYiradRYCbznHrZy8X/vG6K4Y2fcZmkyCRdHApT4WDs5DU4A64YRTE1hFDFTVaHjogiFE1ZeQne8pdXSfuy5l3V6vf1auOsqKMMJ3AKF+DBNTTgDprQAgoTeIZXeLMy68V6tz4WoyWr2DmGP7A+fwDmy5Jv</latexit>

xt … <latexit sha1_base64="GS205AhwIbESFdeXgSRcbzQfuPg=">AAAB+XicbVDLSsNAFL2pr1pfUZduBqvgqiRS1GXBjcsKfUEbwmQ6aYdOJmFmUiyhf+LGhSJu/RN3/o2TNgttPTBwOOde7pkTJJwp7TjfVmljc2t7p7xb2ds/ODyyj086Kk4loW0S81j2AqwoZ4K2NdOc9hJJcRRw2g0m97nfnVKpWCxaepZQL8IjwUJGsDaSb9uDCOtxEGZPc79VMfDtqlNzFkDrxC1IFQo0fftrMIxJGlGhCcdK9V0n0V6GpWaE03llkCqaYDLBI9o3VOCIKi9bJJ+jS6MMURhL84RGC/X3RoYjpWZRYCbznGrVy8X/vH6qwzsvYyJJNRVkeShMOdIxymtAQyYp0XxmCCaSmayIjLHERJuy8hLc1S+vk851zb2p1R/r1cZFUUcZzuAcrsCFW2jAAzShDQSm8Ayv8GZl1ov1bn0sR0tWsXMKf2B9/gC1y5JP</latexit>

xT <latexit sha1_base64="1FJ2Efhg5qcTrvU55qAcEPfzByE=">AAAB+XicbVDLSsNAFL2pr1pfUZduBqvgqiRSqsuCG5cV7APaECbTSTt0Mgkzk2IJ/RM3LhRx65+482+ctFlo64GBwzn3cs+cIOFMacf5tkobm1vbO+Xdyt7+weGRfXzSUXEqCW2TmMeyF2BFORO0rZnmtJdIiqOA024wucv97pRKxWLxqGcJ9SI8EixkBGsj+bY9iLAeB2H2NPedioFvV52aswBaJ25BqlCg5dtfg2FM0ogKTThWqu86ifYyLDUjnM4rg1TRBJMJHtG+oQJHVHnZIvkcXRpliMJYmic0Wqi/NzIcKTWLAjOZ51SrXi7+5/VTHd56GRNJqqkgy0NhypGOUV4DGjJJieYzQzCRzGRFZIwlJtqUlZfgrn55nXSua26jVn+oV5sXRR1lOINzuAIXbqAJ99CCNhCYwjO8wpuVWS/Wu/WxHC1Zxc4p/IH1+QN+q5Ir</latexit>

x0 … <latexit sha1_base64="AqTsPoJ8QRhCLsOZsLF1Tq44dYA=">AAAB+XicbVBNS8NAFHypX7V+RT16CVbBU0mkqMeCF48VbC20oWy2m3bpZhN2X4ol9J948aCIV/+JN/+NmzYHbR1YGGbe481OkAiu0XW/rdLa+sbmVnm7srO7t39gHx61dZwqylo0FrHqBEQzwSVrIUfBOoliJAoEewzGt7n/OGFK81g+4DRhfkSGkoecEjRS37Z7EcFREGZPsz5WDPp21a25czirxCtIFQo0+/ZXbxDTNGISqSBadz03QT8jCjkVbFbppZolhI7JkHUNlSRi2s/myWfOuVEGThgr8yQ6c/X3RkYiradRYCbznHrZy8X/vG6K4Y2fcZmkyCRdHApT4WDs5DU4A64YRTE1hFDFTVaHjogiFE1ZeQne8pdXSfuy5l3V6vf1auOsqKMMJ3AKF+DBNTTgDprQAgoTeIZXeLMy68V6tz4WoyWr2DmGP7A+fwDmy5Jv</latexit>

xt … <latexit sha1_base64="GS205AhwIbESFdeXgSRcbzQfuPg=">AAAB+XicbVDLSsNAFL2pr1pfUZduBqvgqiRS1GXBjcsKfUEbwmQ6aYdOJmFmUiyhf+LGhSJu/RN3/o2TNgttPTBwOOde7pkTJJwp7TjfVmljc2t7p7xb2ds/ODyyj086Kk4loW0S81j2AqwoZ4K2NdOc9hJJcRRw2g0m97nfnVKpWCxaepZQL8IjwUJGsDaSb9uDCOtxEGZPc79VMfDtqlNzFkDrxC1IFQo0fftrMIxJGlGhCcdK9V0n0V6GpWaE03llkCqaYDLBI9o3VOCIKi9bJJ+jS6MMURhL84RGC/X3RoYjpWZRYCbznGrVy8X/vH6qwzsvYyJJNRVkeShMOdIxymtAQyYp0XxmCCaSmayIjLHERJuy8hLc1S+vk851zb2p1R/r1cZFUUcZzuAcrsCFW2jAAzShDQSm8Ayv8GZl1ov1bn0sR0tWsXMKf2B9/gC1y5JP</latexit>

xT
Generative Diffusion SDE: Probability Flow ODE:
<latexit sha1_base64="tabimnFXPYf3xGHlr9sie3mTJ5Q=">AAACrHicbVFbb9MwFHbCbZTLCjzyYlEhdQKqpJoACSFN8MIbQ1q3ojiKHOekteZcsE/QKsu/jn/AG/8Gp+tEu/FJlj9/5+pz8lZJg1H0Jwhv3b5z997e/cGDh48e7w+fPD01TacFzESjGj3PuQEla5ihRAXzVgOvcgVn+fnn3n72E7SRTX2CqxbSii9qWUrB0UvZ8BdDuEBd2cKxiuMyL+2Fy5B+pG9YqbmwsbNTx3JAPsYDpqDEZMfxFZ1evY3LLMsbVZhV5S/LcOnDnBtvB7z2WbRcLDGl/0r7NMz80GivCjn2YauxnOvdxE0FC584w0GPbDiKJtEa9CaJN2RENjjOhr9Z0YiughqF4sYkcdRiarlGKRS4AesMtFyc8wUknta8ApPa9bAdfemVgpaN9qdGula3IyyvTN+n9+y/ba7bevF/tqTD8n1qZd12CLW4LFR2imJD+83RQmoQqFaecKGl75WKJfc7Qr/ffgjx9S/fJKfTSfx2cvjtcHT0aTOOPfKcvCBjEpN35Ih8IcdkRkRwEHwN5sH3cBKehEmYXrqGwSbmGdlBWP4Fr8DWdw==</latexit>

1 p <latexit sha1_base64="m2vD7V/JKTHgiP18veKsYqTt05s=">AAACcnicbVFdaxQxFM2MVev6tSq+WNDoImyxLjOlaF+Eoi99rOC2hc0wZDJ3dkMzHyR3pEvID/Dv+eav6Et/QDPrSL88EHJy7r25NydZo6TBKPoThHfW7t67v/5g8PDR4ydPh8+eH5q61QKmola1Ps64ASUrmKJEBceNBl5mCo6yk29d/OgnaCPr6gcuG0hKPq9kIQVHL6XDXwzhFHVpc8dKjoussKcuRfqFfmSF5sLGzm47lgHyMW4yBQXOriV+oP+OxqWWZbXKzbL0m2W48FXOja/mb/lLtJwvMKGXnXHgkQ5H0SRagd4mcU9GpMdBOvzN8lq0JVQoFDdmFkcNJpZrlEKBG7DWQMPFCZ/DzNOKl2ASu7LM0fdeyWlRa78qpCv1aoXlpene4TO76c3NWCf+LzZrsdhNrKyaFqESfxsVraJY085/mksNAtXSEy609LNSseDeafS/1JkQ33zybXK4PYk/TXa+74z2vvZ2rJMN8o6MSUw+kz2yTw7IlAhyFrwMXgdvgvPwVfg27L0Lg77mBbmGcOsCcQW/kA==</latexit>

1
dxt = (t) [xt + 2s✓ (xt , t)] dt + ¯t
(t) d! dxt = (t) [xt + s✓ (xt , t)] dt
2 2

Euler-Maruyama: Euler’s Method:


p 1
<latexit sha1_base64="rZ8UE8u4gEuADDJ5LTu7PbKYwjU=">AAACeHicbVFNbxMxEPUu9INA2wDHXixCaapCtBtVhQtSRXvg2EpNWyleRV5nNrHq/ZA9ixpZ/g38N278EC6c8KZ7SFtGsvz85j17ZpxWShqMot9B+Oz52vrG5ovOy1db2zvd12+uTFlrASNRqlLfpNyAkgWMUKKCm0oDz1MF1+ntaZO//gHayLK4xEUFSc5nhcyk4OipSfcnyznO08zeuYnFT7GjH+hX+oB09JCyTHNhY2eHjqWAvI8HTEGG4xUlNrr2aLyRpaWamkXuN8tw7l3O9Vf1H/0lWs7mmFB2Bgo5xc6k24sG0TLoUxC3oEfaOJ90f7FpKeocChSKGzOOowoTyzVKocB1WG2g4uKWz2DsYcFzMIldDs7RPc9MaVZqvwqkS3bVYXlumg68sqnbPM415P9y4xqzL4mVRVUjFOL+oaxWFEva/AKdSg0C1cIDLrT0tVIx537G6P+qGUL8uOWn4Go4iI8HRxdHvZNv7Tg2yS55R/okJp/JCflOzsmICPIn2A3eB3vB35CG++HBvTQMWs9b8iDC4T869ME/</latexit>

1
<latexit sha1_base64="iAeFU6EK8Pd/NHhSQXgxQCWinGc=">AAACt3icbVFNb9QwEHVCgbJ8LXDkYnUF2oqySlYrWoSQqtJDe0FFYtuKdRQ5jrOx6nzUniBWlv8iB279NzjbtGxbRrL8/GbezHgmqaXQEAQXnn9v7f6Dh+uPeo+fPH32vP/i5bGuGsX4lFWyUqcJ1VyKkk9BgOSnteK0SCQ/Sc6+tP6Tn1xpUZXfYVHzqKDzUmSCUXBU3P9NCgp5kplfNjbwPrT4Lf6Mb5AWv8MkU5SZ0JqxJQkHOoRNInkGs5VIcHHjq7d2SpJUMtWLwl2GQO5k1g5XBVsuixLzHCJM9rkEil0Oos8VmOsqHW/Jp6WSUWm+/ssS2K0reGg3e724PwhGwdLwXRB2YIA6O4r7f0hasabgJTBJtZ6FQQ2RoQoEk9z2SKN5TdkZnfOZgyUtuI7Mcu4Wv3FMirNKuVMCXrKrCkML3f7fRbZN6tu+lvyfb9ZAthMZUdYN8JJdFsoaiaHC7RJxKhRnIBcOUKaE6xWznLoVgVt1O4Tw9pfvguPxKPwwmnybDHb3unGso9doAw1RiLbRLjpAR2iKmDfxfnjMS/2Pfuxnfn4Z6nud5hW6Yf75X/pt2Ig=</latexit>

xt 1 = xt + (t) [xt + 2s✓ (xt , t)] t+ (t) t N (0, I) xt 1 = xt + (t) [xt + s✓ (xt , t)] t
2 2

Ancestral Sampling (Part 1) is In practice: Higher-Order ODE solvers


also a generative SDE sampler! (Runge-Kutta, linear multistep methods,
exponential integrators, …) 68
Song et al., ICLR, 2021
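A sketch of the Euler–Maruyama sampler above with the assumed `score_model`/`beta` interface; dropping the noise term and the factor of 2 on the score turns the update into Euler's method for the probability flow ODE.

```python
import numpy as np

def sample_reverse_sde(xT, score_model, beta, n_steps=1000, T=1.0):
    """Euler-Maruyama discretization of the generative reverse-time SDE.
    Stepping backward in time flips the sign of the drift, hence the '+' in the update."""
    x, dt = xT.copy(), T / n_steps
    for i in range(n_steps, 0, -1):
        t = i * dt
        drift = 0.5 * beta(t) * (x + 2.0 * score_model(x, t))
        noise = np.sqrt(beta(t) * dt) * np.random.randn(*x.shape)
        x = x + drift * dt + noise
    return x
```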
Sampling from “Continuous-Time” Diffusion Models
How to solve the generative SDE or ODE in practice?

• Runge-Kutta adaptive step-size ODE solver [1]
• Higher-Order adaptive step-size SDE solver [2]
• Reparametrized, smoother ODE [3]
• Higher-Order ODE solver with linear multistepping [4]
• Exponential ODE Integrators [5,6]
• Higher-Order ODE solver with Heun's Method [7]
• …

[Diagram: the Generative Diffusion SDE with its Euler–Maruyama discretization and the Probability Flow ODE with Euler's method, repeated from the previous slide.]

[1] Song et al., "Score-Based Generative Modeling through Stochastic Differential Equations", ICLR, 2021
[2] Jolicoeur-Martineau et al., "Gotta Go Fast When Generating Data with Score-Based Models", arXiv, 2021
[3] Song et al., "Denoising Diffusion Implicit Models", ICLR, 2021
[4] Liu et al., "Pseudo Numerical Methods for Diffusion Models on Manifolds", ICLR, 2022
[5] Zhang and Chen, "Fast Sampling of Diffusion Models with Exponential Integrator", arXiv, 2022
[6] Lu et al., "DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps", arXiv, 2022
[7] Karras et al., "Elucidating the Design Space of Diffusion-Based Generative Models", arXiv, 2022

Song et al., ICLR, 2021
69
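As one concrete example from this list, here is a sketch of a second-order Heun step for the probability flow ODE in the spirit of [7]; it uses the same assumed `score_model`/`beta` interface and is only an illustration, not the exact sampler of that paper (which works in a reparametrized noise-level space).

```python
import numpy as np

def sample_ode_heun(xT, score_model, beta, n_steps=50, T=1.0, t_eps=1e-3):
    """Heun's (2nd-order) method for the probability flow ODE: one extra score evaluation
    per step buys much better accuracy, so far fewer steps are needed than with plain Euler."""
    def f(x, t):
        return -0.5 * beta(t) * (x + score_model(x, t))

    ts = np.linspace(T, t_eps, n_steps + 1)          # integrate backward from t=T to t~0
    x = xT.copy()
    for t_cur, t_next in zip(ts[:-1], ts[1:]):
        dt = t_next - t_cur                          # negative step
        d_cur = f(x, t_cur)
        x_euler = x + d_cur * dt                     # Euler predictor
        x = x + 0.5 * (d_cur + f(x_euler, t_next)) * dt   # trapezoidal (Heun) corrector
    return x
```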
Sampling from “Continuous-Time” Diffusion Models
SDE vs. ODE Sampling: Pros and Cons

<latexit sha1_base64="pHOJjjRZ/AF3g89zy5Wdxt+YFhE=">AAAB+nicbVDLSsNAFL2pr1pfqS7dBItQNyWRoi6LblxWsA9oQ5hMJ+3QySTOTNQS+yluXCji1i9x5984abPQ1gMDh3Pu5Z45fsyoVLb9bRRWVtfWN4qbpa3tnd09s7zfllEiMGnhiEWi6yNJGOWkpahipBsLgkKfkY4/vsr8zj0Rkkb8Vk1i4oZoyGlAMVJa8szyXbUfIjXyg/Rx6tknpZJnVuyaPYO1TJycVCBH0zO/+oMIJyHhCjMkZc+xY+WmSCiKGZmW+okkMcJjNCQ9TTkKiXTTWfSpdayVgRVEQj+urJn6eyNFoZST0NeTWUy56GXif14vUcGFm1IeJ4pwPD8UJMxSkZX1YA2oIFixiSYIC6qzWniEBMJKt5WV4Cx+eZm0T2vOWa1+U680LvM6inAIR1AFB86hAdfQhBZgeIBneIU348l4Md6Nj/lowch3DuAPjM8fuVSTAQ==</latexit>

q(x0 )
<latexit sha1_base64="Frtl7jj79vvBpy7yUIC/bT97H/4=">AAAB+nicbVDLSsNAFL3xWesr1aWbwSLUTUmkqMuiG5cV+oI2hMl00g6dTOLMRC2xn+LGhSJu/RJ3/o1Jm4W2Hhg4nHMv98zxIs6UtqxvY2V1bX1js7BV3N7Z3ds3SwdtFcaS0BYJeSi7HlaUM0FbmmlOu5GkOPA47Xjj68zv3FOpWCiaehJRJ8BDwXxGsE4l1yzdVfoB1iPPTx6nbvO0WHTNslW1ZkDLxM5JGXI0XPOrPwhJHFChCcdK9Wwr0k6CpWaE02mxHysaYTLGQ9pLqcABVU4yiz5FJ6kyQH4o0yc0mqm/NxIcKDUJvHQyi6kWvUz8z+vF2r90EiaiWFNB5of8mCMdoqwHNGCSEs0nKcFEsjQrIiMsMdFpW1kJ9uKXl0n7rGqfV2u3tXL9Kq+jAEdwDBWw4QLqcAMNaAGBB3iGV3gznowX4934mI+uGPnOIfyB8fkD8FCTJQ==</latexit>

Generation with Reverse Diffusion SDE q(xT ) Generation with Probability Flow ODE

<latexit sha1_base64="1FJ2Efhg5qcTrvU55qAcEPfzByE=">AAAB+XicbVDLSsNAFL2pr1pfUZduBqvgqiRSqsuCG5cV7APaECbTSTt0Mgkzk2IJ/RM3LhRx65+482+ctFlo64GBwzn3cs+cIOFMacf5tkobm1vbO+Xdyt7+weGRfXzSUXEqCW2TmMeyF2BFORO0rZnmtJdIiqOA024wucv97pRKxWLxqGcJ9SI8EixkBGsj+bY9iLAeB2H2NPedioFvV52aswBaJ25BqlCg5dtfg2FM0ogKTThWqu86ifYyLDUjnM4rg1TRBJMJHtG+oQJHVHnZIvkcXRpliMJYmic0Wqi/NzIcKTWLAjOZ51SrXi7+5/VTHd56GRNJqqkgy0NhypGOUV4DGjJJieYzQzCRzGRFZIwlJtqUlZfgrn55nXSua26jVn+oV5sXRR1lOINzuAIXbqAJ99CCNhCYwjO8wpuVWS/Wu/WxHC1Zxc4p/IH1+QN+q5Ir</latexit>

x0 … <latexit sha1_base64="AqTsPoJ8QRhCLsOZsLF1Tq44dYA=">AAAB+XicbVBNS8NAFHypX7V+RT16CVbBU0mkqMeCF48VbC20oWy2m3bpZhN2X4ol9J948aCIV/+JN/+NmzYHbR1YGGbe481OkAiu0XW/rdLa+sbmVnm7srO7t39gHx61dZwqylo0FrHqBEQzwSVrIUfBOoliJAoEewzGt7n/OGFK81g+4DRhfkSGkoecEjRS37Z7EcFREGZPsz5WDPp21a25czirxCtIFQo0+/ZXbxDTNGISqSBadz03QT8jCjkVbFbppZolhI7JkHUNlSRi2s/myWfOuVEGThgr8yQ6c/X3RkYiradRYCbznHrZy8X/vG6K4Y2fcZmkyCRdHApT4WDs5DU4A64YRTE1hFDFTVaHjogiFE1ZeQne8pdXSfuy5l3V6vf1auOsqKMMJ3AKF+DBNTTgDprQAgoTeIZXeLMy68V6tz4WoyWr2DmGP7A+fwDmy5Jv</latexit>

xt … <latexit sha1_base64="GS205AhwIbESFdeXgSRcbzQfuPg=">AAAB+XicbVDLSsNAFL2pr1pfUZduBqvgqiRS1GXBjcsKfUEbwmQ6aYdOJmFmUiyhf+LGhSJu/RN3/o2TNgttPTBwOOde7pkTJJwp7TjfVmljc2t7p7xb2ds/ODyyj086Kk4loW0S81j2AqwoZ4K2NdOc9hJJcRRw2g0m97nfnVKpWCxaepZQL8IjwUJGsDaSb9uDCOtxEGZPc79VMfDtqlNzFkDrxC1IFQo0fftrMIxJGlGhCcdK9V0n0V6GpWaE03llkCqaYDLBI9o3VOCIKi9bJJ+jS6MMURhL84RGC/X3RoYjpWZRYCbznGrVy8X/vH6qwzsvYyJJNRVkeShMOdIxymtAQyYp0XxmCCaSmayIjLHERJuy8hLc1S+vk851zb2p1R/r1cZFUUcZzuAcrsCFW2jAAzShDQSm8Ayv8GZl1ov1bn0sR0tWsXMKf2B9/gC1y5JP</latexit>

xT
<latexit sha1_base64="1FJ2Efhg5qcTrvU55qAcEPfzByE=">AAAB+XicbVDLSsNAFL2pr1pfUZduBqvgqiRSqsuCG5cV7APaECbTSTt0Mgkzk2IJ/RM3LhRx65+482+ctFlo64GBwzn3cs+cIOFMacf5tkobm1vbO+Xdyt7+weGRfXzSUXEqCW2TmMeyF2BFORO0rZnmtJdIiqOA024wucv97pRKxWLxqGcJ9SI8EixkBGsj+bY9iLAeB2H2NPedioFvV52aswBaJ25BqlCg5dtfg2FM0ogKTThWqu86ifYyLDUjnM4rg1TRBJMJHtG+oQJHVHnZIvkcXRpliMJYmic0Wqi/NzIcKTWLAjOZ51SrXi7+5/VTHd56GRNJqqkgy0NhypGOUV4DGjJJieYzQzCRzGRFZIwlJtqUlZfgrn55nXSua26jVn+oV5sXRR1lOINzuAIXbqAJ99CCNhCYwjO8wpuVWS/Wu/WxHC1Zxc4p/IH1+QN+q5Ir</latexit>

x0 … <latexit sha1_base64="AqTsPoJ8QRhCLsOZsLF1Tq44dYA=">AAAB+XicbVBNS8NAFHypX7V+RT16CVbBU0mkqMeCF48VbC20oWy2m3bpZhN2X4ol9J948aCIV/+JN/+NmzYHbR1YGGbe481OkAiu0XW/rdLa+sbmVnm7srO7t39gHx61dZwqylo0FrHqBEQzwSVrIUfBOoliJAoEewzGt7n/OGFK81g+4DRhfkSGkoecEjRS37Z7EcFREGZPsz5WDPp21a25czirxCtIFQo0+/ZXbxDTNGISqSBadz03QT8jCjkVbFbppZolhI7JkHUNlSRi2s/myWfOuVEGThgr8yQ6c/X3RkYiradRYCbznHrZy8X/vG6K4Y2fcZmkyCRdHApT4WDs5DU4A64YRTE1hFDFTVaHjogiFE1ZeQne8pdXSfuy5l3V6vf1auOsqKMMJ3AKF+DBNTTgDprQAgoTeIZXeLMy68V6tz4WoyWr2DmGP7A+fwDmy5Jv</latexit>

xt … <latexit sha1_base64="GS205AhwIbESFdeXgSRcbzQfuPg=">AAAB+XicbVDLSsNAFL2pr1pfUZduBqvgqiRS1GXBjcsKfUEbwmQ6aYdOJmFmUiyhf+LGhSJu/RN3/o2TNgttPTBwOOde7pkTJJwp7TjfVmljc2t7p7xb2ds/ODyyj086Kk4loW0S81j2AqwoZ4K2NdOc9hJJcRRw2g0m97nfnVKpWCxaepZQL8IjwUJGsDaSb9uDCOtxEGZPc79VMfDtqlNzFkDrxC1IFQo0fftrMIxJGlGhCcdK9V0n0V6GpWaE03llkCqaYDLBI9o3VOCIKi9bJJ+jS6MMURhL84RGC/X3RoYjpWZRYCbznGrVy8X/vH6qwzsvYyJJNRVkeShMOdIxymtAQyYp0XxmCCaSmayIjLHERJuy8hLc1S+vk851zb2p1R/r1cZFUUcZzuAcrsCFW2jAAzShDQSm8Ayv8GZl1ov1bn0sR0tWsXMKf2B9/gC1y5JP</latexit>

xT
Generative Diffusion SDE: Probability Flow ODE:
<latexit sha1_base64="tabimnFXPYf3xGHlr9sie3mTJ5Q=">AAACrHicbVFbb9MwFHbCbZTLCjzyYlEhdQKqpJoACSFN8MIbQ1q3ojiKHOekteZcsE/QKsu/jn/AG/8Gp+tEu/FJlj9/5+pz8lZJg1H0Jwhv3b5z997e/cGDh48e7w+fPD01TacFzESjGj3PuQEla5ihRAXzVgOvcgVn+fnn3n72E7SRTX2CqxbSii9qWUrB0UvZ8BdDuEBd2cKxiuMyL+2Fy5B+pG9YqbmwsbNTx3JAPsYDpqDEZMfxFZ1evY3LLMsbVZhV5S/LcOnDnBtvB7z2WbRcLDGl/0r7NMz80GivCjn2YauxnOvdxE0FC584w0GPbDiKJtEa9CaJN2RENjjOhr9Z0YiughqF4sYkcdRiarlGKRS4AesMtFyc8wUknta8ApPa9bAdfemVgpaN9qdGula3IyyvTN+n9+y/ba7bevF/tqTD8n1qZd12CLW4LFR2imJD+83RQmoQqFaecKGl75WKJfc7Qr/ffgjx9S/fJKfTSfx2cvjtcHT0aTOOPfKcvCBjEpN35Ih8IcdkRkRwEHwN5sH3cBKehEmYXrqGwSbmGdlBWP4Fr8DWdw==</latexit>

1 p <latexit sha1_base64="m2vD7V/JKTHgiP18veKsYqTt05s=">AAACcnicbVFdaxQxFM2MVev6tSq+WNDoImyxLjOlaF+Eoi99rOC2hc0wZDJ3dkMzHyR3pEvID/Dv+eav6Et/QDPrSL88EHJy7r25NydZo6TBKPoThHfW7t67v/5g8PDR4ydPh8+eH5q61QKmola1Ps64ASUrmKJEBceNBl5mCo6yk29d/OgnaCPr6gcuG0hKPq9kIQVHL6XDXwzhFHVpc8dKjoussKcuRfqFfmSF5sLGzm47lgHyMW4yBQXOriV+oP+OxqWWZbXKzbL0m2W48FXOja/mb/lLtJwvMKGXnXHgkQ5H0SRagd4mcU9GpMdBOvzN8lq0JVQoFDdmFkcNJpZrlEKBG7DWQMPFCZ/DzNOKl2ASu7LM0fdeyWlRa78qpCv1aoXlpene4TO76c3NWCf+LzZrsdhNrKyaFqESfxsVraJY085/mksNAtXSEy609LNSseDeafS/1JkQ33zybXK4PYk/TXa+74z2vvZ2rJMN8o6MSUw+kz2yTw7IlAhyFrwMXgdvgvPwVfg27L0Lg77mBbmGcOsCcQW/kA==</latexit>

1
dxt = (t) [xt + 2s✓ (xt , t)] dt + (t) d!¯t dxt = (t) [xt + s✓ (xt , t)] dt
2 2
<latexit sha1_base64="0EXUbzTrFaHPKLWFCLj833taXD0=">AAADA3iclVJLaxRBEO4ZX3F9ZKMn8dK4CCvRZSaEKIgQ9OIxgpsEtoehp7dmt0nPw+6akKVp8OJf8eJBEa/+CW/+G3s2G7Kb5GJB01/X46tHV1YraTCK/gbhtes3bt5au925c/fe/fXuxoN9UzVawFBUqtKHGTegZAlDlKjgsNbAi0zBQXb0rrUfHIM2sio/4qyGpOCTUuZScPSqdCN4xBBOUBd27FjBcZrl9sSlSN/QFyzXXNjY2S3HMkDex2dMQY6jFcdNevY0LrUsq9TYzAp/WYZTH+Vcf9n/uSfRcjLFhJ5nxquT/SfvCuEmM5802jMux14vNZpxvcpYFTDxjCl2ziXt9qJBNBd6GcQL0CML2Uu7f9i4Ek0BJQrFjRnFUY2J5RqlUOA6rDFQc3HEJzDysOQFmMTO/9DRp14zpnml/SmRzrXLEZYXpq3We7Zdm4u2VnmVbdRg/iqxsqwbhFKcJsobRbGi7ULQsdQgUM084EJLXysVU+4/A/3atEOIL7Z8GexvDeKdwfaH7d7u28U41shj8oT0SUxekl3ynuyRIRHB5+Br8D34EX4Jv4U/w1+nrmGwiHlIViT8/Q/nLPi8</latexit>

1 1 p
dxt = (t) [xt + s✓ (xt , t)] dt (t)s✓ (xt , t)dt + ¯t
(t) d!
2 2
Probability Flow ODE Langevin Diffusion SDE

Pro: Continuous noise injection can help to Pro: Can leverage fast ODE solvers. Best
compensate errors during diffusion process (Langevin when targeting very fast sampling.
sampling actively pushes towards correct distribution).
Con: No “stochastic” error correction, often slightly
Con: Often slower, because the stochastic terms themselves
lower performance than stochastic sampling.
require fine discretization during solve.
70
Karras et al., “Elucidating the Design Space of Diffusion-Based Generative Models”, arXiv, 2022
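The decomposition above can be read off directly from a single reverse-time update; a sketch with the assumed `score_model`/`beta` callables, where `dt` is the (positive) backward step size:

```python
import numpy as np

def sde_step_as_ode_plus_langevin(x, t, dt, score_model, beta):
    """One reverse-time Euler-Maruyama step of the generative SDE, written as the sum of the
    deterministic probability-flow-ODE part and a Langevin-like stochastic part."""
    s = score_model(x, t)
    ode_part      = 0.5 * beta(t) * (x + s) * dt                 # probability flow ODE step
    langevin_part = (0.5 * beta(t) * s * dt
                     + np.sqrt(beta(t) * dt) * np.random.randn(*x.shape))  # score push + noise
    return x + ode_part + langevin_part
```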
Diffusion Models as Energy-based Models
Forward diffusion process (fixed)

[Diagram: forward diffusion process (fixed) from q(x0) to q(xT); the reverse generative process runs in the opposite direction.]

• Assume an Energy-based Model (EBM): $p_\theta(x, t) = \dfrac{e^{-E_\theta(x, t)}}{Z_\theta(t)}$

• Sample the EBM via Langevin dynamics: $x_{i+1} = x_i - \eta\, \nabla_x E_\theta(x_i, t) + \sqrt{2\eta}\; \mathcal{N}(0, I)$

• Requires only the gradient of the energy $\nabla_x E_\theta(x, t)$, not $E_\theta(x, t)$ itself, nor $Z_\theta(t)$!

In diffusion models, we learn "energy gradients" for all diffused distributions directly:
$$\nabla_x \log q_t(x) \approx s_\theta(x, t) =: \nabla_x \log p_\theta(x, t) = -\nabla_x E_\theta(x, t) - \underbrace{\nabla_x \log Z_\theta(t)}_{=\,0}$$

Diffusion models model the energy gradient directly, along the entire diffusion process, and avoid modeling the partition function. Different noise levels along the diffusion are analogous to annealed sampling in EBMs.
71
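Because the learned score plays the role of $-\nabla_x E_\theta(x, t)$, Langevin dynamics can be run at any noise level t without ever evaluating an energy or a partition function; a minimal sketch with an assumed `score_model`:

```python
import numpy as np

def langevin_sample(x_init, t, score_model, n_steps=100, step_size=1e-4):
    """Langevin dynamics at a fixed noise level t, using the learned score s_theta(x, t)
    in place of -grad_x E_theta(x, t); Z_theta(t) never appears."""
    x = x_init.copy()
    for _ in range(n_steps):
        x = (x + step_size * score_model(x, t)
               + np.sqrt(2.0 * step_size) * np.random.randn(*x.shape))
    return x
```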
Unique Identifiability
[Figure: fixed forward diffusion process maps data q(x_0) through intermediate x_t to noise q(x_T); the reverse generative process maps noise back to data.]

Forward Diffusion SDE:   dx_t = -\tfrac{1}{2}\beta(t)\, x_t\, dt + \sqrt{\beta(t)}\, d\omega_t

Reverse Generative Diffusion SDE:   dx_t = -\tfrac{1}{2}\beta(t)\, \big[ x_t + 2\, \nabla_{x_t} \log q_t(x_t) \big]\, dt + \sqrt{\beta(t)}\, d\bar{\omega}_t,
where the score \nabla_{x_t} \log q_t(x_t) \approx s_\theta(x_t, t).

• The denoising model s_\theta(x_t, t) and the deterministic data encodings are uniquely determined by the data and the fixed forward diffusion!

• Even with different architectures and initializations, we recover identical model outputs and encodings (given sufficient training data, model capacity and optimization accuracy), in contrast to GANs, VAEs, etc.
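For intuition (not part of the original slides), here is a minimal NumPy sketch of simulating the forward SDE above with Euler-Maruyama; the linear `beta` schedule and all names are illustrative assumptions.

```python
import numpy as np

def beta(t, beta_min=0.1, beta_max=20.0):
    # Assumed linear schedule beta(t) for t in [0, 1].
    return beta_min + t * (beta_max - beta_min)

def forward_vp_sde(x0, n_steps=1000, seed=0):
    """Euler-Maruyama simulation of dx = -0.5 * beta(t) * x dt + sqrt(beta(t)) dw."""
    rng = np.random.default_rng(seed)
    x, dt = x0.astype(float).copy(), 1.0 / n_steps
    for i in range(n_steps):
        t = i * dt
        drift = -0.5 * beta(t) * x
        x = x + drift * dt + np.sqrt(beta(t)) * np.sqrt(dt) * rng.standard_normal(x.shape)
    return x  # for this schedule, approximately distributed as N(0, I)

x_T = forward_vp_sde(np.ones((4, 2)))  # diffuse a toy batch of 4 two-dimensional points
```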

72
Song et al., ICLR, 2021
Unique Identifiability

(image from: Song et al., “Score-Based Generative Modeling through Stochastic Differential Equations”, ICLR, 2021)

73
Song et al., ICLR, 2021
Why use Differential Equation Framework?
[Figure: fixed forward diffusion process from q(x_0) to q(x_T) and the reverse generative process, as on the previous slides.]

Advantages of the Differential Equation framework for Diffusion Models:

• Can leverage broad existing literature on advanced and fast SDE and ODE solvers
• Allows us to construct deterministic Probability Flow ODE
• Deterministic Data Encodings
• Log-likelihood Estimation
• Clean mathematical framework based on Diffusion Processes and Score Matching;
connections to Neural ODEs, Continuous Normalizing Flows and Energy-based Models
74
Today’s Program

Title Speaker Time


Introduction Arash 10 min
Part (1): Denoising Diffusion Probabilistic Models Arash 35 min
Part (2): Score-based Generative Modeling with Differential Equations Karsten 45 min
Part (3): Advanced Techniques: Accelerated Sampling, Conditional Generation, and Beyond Ruiqi 45 min
Applications (1): Image Synthesis, Text-to-Image, Controllable Generation Ruiqi 15 min
Applications (2): Image Editing, Image-to-Image, Super-resolution, Segmentation Arash 15 min
Applications (3): Video Synthesis, Medical Imaging, 3D Generation, Discrete State Models Karsten 15 min
Conclusions, Open Problems and Final Remarks Arash 10 min

cvpr2022-tutorial-diffusion-models.github.io
75
Part (3):
Advanced Techniques: Accelerated Sampling,
Conditional Generation, and Beyond

76
Outline
Questions to address with advanced techniques

• Q1: How to accelerate the sampling process?

• Advanced forward diffusion process

• Advanced reverse process

• Hybrid models & model distillation

• Q2: How to do high-resolution (conditional) generation?

• Conditional diffusion models

• Classifier(-free) guidance

• Cascaded generation

77
Part (3)-1:
Q: How to accelerate sampling process?

78
What makes a good generative model?
The generative learning trilemma
[Figure: the generative learning trilemma triangle with three corners: Fast Sampling, Mode Coverage/Diversity, and High Quality Samples. Likelihood-based models (variational autoencoders & normalizing flows), generative adversarial networks (GANs), and denoising diffusion models each cover only two of the three; diffusion models often require 1000s of network evaluations for sampling.]
Tackling the Generative Learning Trilemma with Denoising Diffusion GANs, ICLR 2022 79
What makes a good generative model?
The generative learning trilemma

Tackle the trilemma by accelerating diffusion models

[Figure: the same trilemma triangle with corners Fast Sampling, Mode Coverage/Diversity, and High Quality Samples.]

Tackling the Generative Learning Trilemma with Denoising Diffusion GANs, ICLR 2022 80
How to accelerate diffusion models?
[Image credit: Ben Poole, Mohammad Norouzi]
[Figure: the simple forward process slowly maps data to noise; the reverse process, where the diffusion model is trained, maps noise back to data.]

• Naïve acceleration methods, such as reducing the number of diffusion time steps in training or sampling only every k-th time step at inference, immediately degrade performance.

• We need something cleverer.

• Given a limited number of network function evaluations, usually far fewer than 1000s, how can we improve performance?
81
(1/3) Advanced forward process
The reverse process will be changed accordingly

[Figure: the simple forward process slowly maps data to noise; the reverse process, where the diffusion model is trained, maps noise back to data.]

• Does the noise schedule have to be predefined?

• Does it have to be a Markovian process?

• Is there any faster mixing diffusion process?


82
Variational diffusion models
Learnable diffusion process

• Given the forward process, directly parametrize the variance through a learned function: a monotonic MLP with strictly positive weights and monotonic activations (e.g. sigmoid), so that the resulting noise schedule is monotonic in time (see the sketch below).

• Analogous to hierarchical VAEs (part 1): unlike diffusion models that use a fixed encoder, we include learnable parameters in the encoder.
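A minimal PyTorch sketch (illustrative only, not the official VDM code) of a scalar noise-schedule network that is monotonic in t because its weights are constrained positive and its activations are monotonic:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MonotonicSchedule(nn.Module):
    """Scalar network gamma(t) that is monotonically increasing in t:
    positive weights (via softplus) composed with monotonic activations (sigmoid)."""
    def __init__(self, hidden=128):
        super().__init__()
        self.w1 = nn.Parameter(torch.randn(hidden, 1) * 0.1)
        self.b1 = nn.Parameter(torch.zeros(hidden))
        self.w2 = nn.Parameter(torch.randn(1, hidden) * 0.1)
        self.b2 = nn.Parameter(torch.zeros(1))

    def forward(self, t):  # t: (batch, 1), values in [0, 1]
        h = torch.sigmoid(F.linear(t, F.softplus(self.w1), self.b1))
        return F.linear(h, F.softplus(self.w2), self.b2)

gamma = MonotonicSchedule()
print(gamma(torch.linspace(0, 1, 5).unsqueeze(-1)))  # non-decreasing outputs
```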

Kingma et al., “Variational diffusion models”, NeurIPS 2021.


Vahdat and Kautz, NVAE: A Deep Hierarchical Variational Autoencoder, NeurIPS 2020 83
Variational diffusion models
New parametrization of training objectives

• Optimizing the variational upper bound of diffusion models can be simplified to the following training objective:

- Learning the noise schedule improves the likelihood estimation of diffusion models when fewer diffusion steps are used.

• Letting the number of diffusion steps go to infinity leads to a variational upper bound in continuous time:

- It is shown to depend only on the signal-to-noise ratio at the endpoints, and to be invariant to the noise schedule in between.

- The continuous-time noise schedule can still be learned to minimize the variance of the training objective for faster training.
Kingma et al., “Variational diffusion models”, NeurIPS 2021. 84
Variational diffusion models
SOTA likelihood estimation

• Key factor: appending Fourier features to the input of the U-Net

• Good likelihoods require modeling all bits, even the ones


corresponding to very small changes in input.

• But: neural nets are usually bad at modeling small changes to


inputs.

• Significant improvements in log-likelihoods.

Kingma et al., “Variational diffusion models”, NeurIPS 2021. 85


Denoising diffusion implicit models (DDIM)
Non-Markovian diffusion process

Main Idea

Design a family of non-Markovian diffusion processes and corresponding reverse processes.

The process is designed such that the model can be optimized by the same surrogate objective as
the original diffusion model.

Therefore, one can take a pretrained diffusion model and use it with a wider choice of sampling procedures.

Song et al., “Denoising Diffusion Implicit Models”, ICLR 2021. 86


Denoising diffusion implicit models (DDIM)
How to define the non-Markovian forward process?

Recall that the KL divergence in the variational upper bound can be written as:

If we assume the loss weighting can take arbitrary values, the above formulation holds as long as the forward process keeps the same marginals q(x_t | x_0) as the original diffusion.

Forward process:

Reverse process:

No need to specify the forward process to be Markovian!

Song et al., “Denoising Diffusion Implicit Models”, ICLR 2021. 87


Denoising diffusion implicit models (DDIM)
Non-Markovian diffusion process

For the forward process, we need to choose the per-step conditionals such that the marginals q(x_t | x_0) match those of the original diffusion.

Define a family of forward processes that meets the above requirement:

The corresponding reverse process is

Song et al., “Denoising Diffusion Implicit Models”, ICLR 2021. 88


DDIM sampler
Deterministic generative process

• DDIM sampler: a deterministic generative process, where the only randomness comes from the initial sample at t = T.

Song et al., “Denoising Diffusion Implicit Models”, ICLR 2021. 89
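To make the deterministic update concrete, here is a rough PyTorch sketch (assumptions: a trained noise-prediction network `eps_model(x, t)` and a discrete `alpha_bar` schedule; not the reference implementation):

```python
import torch

@torch.no_grad()
def ddim_sample(eps_model, x_T, alpha_bar, timesteps):
    """Deterministic DDIM sampling (eta = 0): the only randomness is in x_T.
    alpha_bar: 1-D tensor of cumulative products of (1 - beta_t), decreasing in t."""
    x = x_T
    for t, t_prev in zip(timesteps[:-1], timesteps[1:]):  # e.g. [999, 899, ..., 0]
        eps = eps_model(x, torch.full((x.shape[0],), t, device=x.device))
        a_t, a_prev = alpha_bar[t], alpha_bar[t_prev]
        x0_pred = (x - (1 - a_t).sqrt() * eps) / a_t.sqrt()      # predicted clean sample
        x = a_prev.sqrt() * x0_pred + (1 - a_prev).sqrt() * eps  # jump to the earlier step
    return x

# toy usage with a dummy noise model
dummy = lambda x, t: torch.zeros_like(x)
alpha_bar = torch.linspace(0.9999, 0.01, 1000)
sample = ddim_sample(dummy, torch.randn(2, 3), alpha_bar, list(range(999, -1, -100)) + [0])
```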


ODE interpretation
Deterministic generative process

• DDIM sampler can be considered as an integration rule of the following ODE:

• With the optimal model, the ODE is equivalent to a probability flow ODE of a “variance-exploding” SDE:

• The sampling procedure can differ from the standard Euler method, depending on which variable the update is taken with respect to.
Song et al., “Denoising Diffusion Implicit Models”, ICLR 2021.
Karras et al., “Elucidating the Design Space of Diffusion-Based Generative Models”, arXiv 2022. 90
Salimans & Ho, “Progressive distillation for fast sampling of diffusion models”, ICLR 2022.
DDIM sampler
Faster & low curvature

Karras et al. argue that the DDIM ODE is favorable, as the tangent of the solution trajectory always points towards the denoiser output.

This leads to largely linear solution trajectories with low curvature.

Low curvature means fewer truncation errors accumulate along the trajectories.

Song et al., “Denoising Diffusion Implicit Models”, ICLR 2021.


Karras et al., “Elucidating the Design Space of Diffusion-Based Generative Models”, arXiv 2022. 91
Salimans & Ho, “Progressive distillation for fast sampling of diffusion models”, ICLR 2022.
Critically-damped Langevin diffusion
“fast mixing” diffusion process

<latexit sha1_base64="myt1lYVmS8Y8ucJp7Jb8xJBn8mg=">AAAB+nicbVDLSsNAFL3xWesr1aWbYBHqpiRS1GXRjcsK9gFtCJPppB06mYSZiVpiPsWNC0Xc+iXu/BsnbRbaemDgcM693DPHjxmVyra/jZXVtfWNzdJWeXtnd2/frBx0ZJQITNo4YpHo+UgSRjlpK6oY6cWCoNBnpOtPrnO/e0+EpBG/U9OYuCEacRpQjJSWPLMS19JBiNTYD9LHLPPsU8+s2nV7BmuZOAWpQoGWZ34NhhFOQsIVZkjKvmPHyk2RUBQzkpUHiSQxwhM0In1NOQqJdNNZ9Mw60crQCiKhH1fWTP29kaJQymno68k8pVz0cvE/r5+o4NJNKY8TRTieHwoSZqnIynuwhlQQrNhUE4QF1VktPEYCYaXbKusSnMUvL5POWd05rzduG9XmVVFHCY7gGGrgwAU04QZa0AYMD/AMr/BmPBkvxrvxMR9dMYqdQ/gD4/MHE+eT5A==</latexit>

p(x0 ) Fixed Forward Diffusion Process <latexit sha1_base64="qIcSH8a7UWSZ6ytJ3hMSfx2zaw8=">AAAB+nicbVDLSsNAFL3xWesr1aWbYBHqpiRS1GXRjcsK9gFtCJPppB06mYSZiVpiPsWNC0Xc+iXu/BsnbRbaemDgcM693DPHjxmVyra/jZXVtfWNzdJWeXtnd2/frBx0ZJQITNo4YpHo+UgSRjlpK6oY6cWCoNBnpOtPrnO/e0+EpBG/U9OYuCEacRpQjJSWPLMS19JBiNTYD9LHLPOcU8+s2nV7BmuZOAWpQoGWZ34NhhFOQsIVZkjKvmPHyk2RUBQzkpUHiSQxwhM0In1NOQqJdNNZ9Mw60crQCiKhH1fWTP29kaJQymno68k8pVz0cvE/r5+o4NJNKY8TRTieHwoSZqnIynuwhlQQrNhUE4QF1VktPEYCYaXbKusSnMUvL5POWd05rzduG9XmVVFHCY7gGGrgwAU04QZa0AYMD/AMr/BmPBkvxrvxMR9dMYqdQ/gD4/MHFWyT5Q==</latexit>

p(x1 )

Generation with Parametrized Reverse Denoising Process

<latexit sha1_base64="jZlN6Ex2izsPKi/LzFjXfPXeK04=">AAACUXicbVFNSwMxEJ2u3/Wr6tHLYhEUseyKqBdB9OJRwarQLSWbztpg9sNkVi1h/6IHPfk/vHhQTGvFjzoQeLx5bzJ5CTMpNHnec8kZGR0bn5icKk/PzM7NVxYWz3WaK451nspUXYZMoxQJ1kmQxMtMIYtDiRfh9VGvf3GLSos0OaNuhs2YXSUiEpyRpVqVTkB4Tyo27SKIGXXCyNwXLdrfDCLFuPELs1UEIRJbo/Wfgm8bbQT6RpH5UhXDE++soVWpejWvX+4w8AegCoM6aVUeg3bK8xgT4pJp3fC9jJqGKRJcYlEOco0Z49fsChsWJixG3TT9RAp31TJtN0qVPQm5ffanw7BY624cWmVvR/231yP/6zVyivaaRiRZTpjwz4uiXLqUur143bZQyEl2LWBcCburyzvMRkn2E8o2BP/vk4fB+VbN36ltn25XDw4HcUzCMqzAGviwCwdwDCdQBw4P8AJv8F56Kr064DifUqc08CzBr3KmPwDZBLf5</latexit>

1 p
• Regular forward Diffusion Process: dxt = (t)xt dt + (t)dwt
2
• It is a special case of (overdamped) 1 p
<latexit sha1_base64="MpLj96bJozcthJaMl9k3yw7XRG4=">AAACfHicbVHLahsxFNVMX6n7iNMsu6ioW7AJNTMhJN0UQkuhywTqJGCZ4Y6scUQ0mql0J4kR+or8WXf9lG5KZWdK3bgXBIdzz33o3LxW0mKS/Ijie/cfPHy08bjz5Omz55vdrRcntmoMFyNeqcqc5WCFklqMUKISZ7URUOZKnOYXnxb500thrKz0V5zXYlLCTMtCcsBAZd0bhuIaTemmnpWA53nhrn2GH1hhgLvUu13PcoHQxwFlGnIFmVsVeqaqGa0D2fb5fOx9f1Ux+DsBd5j9ZtD96ejXh1+FiqzbS4bJMug6SFvQI20cZd3vbFrxphQauQJrx2lS48SBQcmV8B3WWFEDv4CZGAeooRR24pbmefo2MFNaVCY8jXTJrlY4KK2dl3lQLna0d3ML8n+5cYPF+4mTum5QaH47qGgUxYouLkGn0giOah4AcCPDrpSfQ7Adw706wYT07pfXwcnuMN0f7h3v9Q4/tnZskJfkNemTlByQQ/KFHJER4eRn9CrqR4PoV/wm3onf3UrjqK3ZJv9EvP8b3jnGxA==</latexit>

Langevin dynamics:
dxt = (t)rxt log pEQ (xt )dt + (t)dwt
2 <latexit sha1_base64="Wohe+9tRDQAg3zRIq/seteh/e3w=">AAACWnicbVFdSxwxFM2M3+vXavvWl9BFUNBlZhEtlIJUhPZFFFwVdtYhk72jwWRmTO4Ul5A/2Zci9K8Iza4r+NELgXPPuYfcnGSVFAaj6CEIp6ZnZufmFxqLS8srq8219XNT1ppDl5ey1JcZMyBFAV0UKOGy0sBUJuEiuz0c6Re/QBtRFmc4rKCv2HUhcsEZeipt3lWpTRDuUSt7dOrcZqIY3mS5vXcpbn0bd5xJe/xa+Uqfu8htP8OfbisxQlG4sjtJrhm3sbMd99J31XFpsxW1o3HR9yCegBaZ1Ena/J0MSl4rKJBLZkwvjirsW6ZRcAmukdQGKsZv2TX0PCyYAtO342gc3fDMgOal9qdAOmZfOixTxgxV5idHa5q32oj8n9arMf/St6KoaoSCP12U15JiSUc504HQwFEOPWBcC78r5TfMh4L+Nxo+hPjtk9+D80473mvvnu62Dr5P4pgnn8hnsklisk8OyA9yQrqEkz/kMZgN5oK/YRguhItPo2Ew8Xwgryr8+A+mS7cb</latexit>

1 2
pEQ (xt ) = N (xt ; 0, I) ⇠ e 2 xt

Dockhorn et al., “Score-Based Generative Modeling with Critically-Damped Langevin Diffusion”, ICLR 2022. 92
“Momentum-based” diffusion
Introduce a velocity variable and run diffusion in the extended space

Main idea: Inject noise only into the velocity v_t; the data variable x_t is coupled to v_t through the Hamiltonian component, which enables faster mixing!

Dockhorn et al., “Score-Based Generative Modeling with Critically-Damped Langevin Diffusion”, ICLR 2022. 93
Advanced reverse process
Approximate reverse process with more complicated distributions

[Figure: the simple forward process slowly maps data to noise; the reverse process, where the diffusion model is trained, maps noise back to data.]

• Q: Is the normal approximation of the reverse process still accurate when there are fewer diffusion time steps?

94
Advanced approximation of reverse process
Normal assumption in denoising distribution holds only for small step

[Figure: denoising between data and noise, modeled with a uni-modal normal distribution.]

With fewer, larger denoising steps, this requires more complicated functional approximators!


Xiao et al., “Tackling the Generative Learning Trilemma with Denoising Diffusion GANs”, ICLR 2022.
Gao et al., “Learning energy-based models by diffusion recovery likelihood”, ICLR 2021. 95
Denoising diffusion GANs
Approximating reverse process by conditional GANs

Compared to a one-shot GAN generator:

• Both generator and discriminator are


solving a much simpler problem.

• Stronger mode coverage

• Better training stability

Xiao et al., “Tackling the Generative Learning Trilemma with Denoising Diffusion GANs”, ICLR 2022. 96
Diffusion energy-based models
Approximating reverse process by conditional energy-based models

• An energy-based model (EBM) is in the form p_\theta(x) = \exp(f_\theta(x)) / Z_\theta, where the energy function f_\theta(x) is parametrized by a neural network and the partition function Z_\theta is analytically intractable.

[Figure: parametrization of the energy landscape f(x) over observations x; learning energy-based models.]

• Optimizing energy-based models requires MCMC from the current model

Gao et al., “Learning energy-based models by diffusion recovery likelihood”, ICLR 2021. 97
Diffusion energy-based models
Conditional energy-based models

• Assume an energy-based model for the marginal at each diffusion step, and condition on the data at a higher noise level.

• The conditional energy-based models can be derived by Bayes’ rule. The conditional density localizes the energy landscape and is less complicated than the marginal density of x.

Compared to a single EBM:

• Sampling is more friendly and easier to converge

• Training is more efficient

• Well-formed energy potential

Compared to diffusion models:

• Much fewer diffusion steps (6 steps)

• Learn the sequence of EBMs by maximizing conditional log-likelihoods.

• Get samples by progressive sampling from EBMs, from high noise levels to low noise levels.
Gao et al., “Learning energy-based models by diffusion recovery likelihood”, ICLR 2021. 98
Advanced modeling
Latent space modeling & model distillation

[Figure: the simple forward process slowly maps data to noise; the reverse process, where the diffusion model is trained, maps noise back to data.]
• Can we do model distillation for fast sampling?

• Can we lift the diffusion model to a latent space that is faster to diffuse?

99
Progressive distillation
• Distill a deterministic DDIM sampler to the same model architecture.
• At each stage, a “student” model is learned to distill two adjacent sampling steps of the “teacher” model to
one sampling step.
• At next stage, the “student” model from previous stage will serve as the new “teacher” model.

Distillation stage
Salimans & Ho, “Progressive distillation for fast sampling of diffusion models”, ICLR 2022. 100
Progressive distillation
Teacher-student learning

Distill the DDIM sampler:
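A schematic sketch of this idea (hypothetical helper names; the x-prediction parametrization and SNR weighting of the paper are simplified away). `ddim_step(model, x, t, t_next)` is assumed to return the deterministic DDIM update from time t to t_next:

```python
import torch

def distill_one_batch(teacher, student, ddim_step, alpha_bar, optimizer, x0, T):
    """The student is trained so that one of its DDIM steps matches two teacher DDIM steps."""
    t = int(torch.randint(2, T + 1, (1,)))            # step index on the student grid
    eps = torch.randn_like(x0)
    x_t = alpha_bar[t].sqrt() * x0 + (1 - alpha_bar[t]).sqrt() * eps   # diffuse the data
    with torch.no_grad():                             # two teacher steps define the target
        x_mid = ddim_step(teacher, x_t, t, t - 1)
        target = ddim_step(teacher, x_mid, t - 1, t - 2)
    pred = ddim_step(student, x_t, t, t - 2)          # a single student step
    loss = ((pred - target) ** 2).mean()
    optimizer.zero_grad(); loss.backward(); optimizer.step()
    return loss.item()
```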

Salimans & Ho, “Progressive distillation for fast sampling of diffusion models”, ICLR 2022. 101
Latent-space diffusion models
Variational autoencoder + score-based prior

[Figure: a variational autoencoder (encoder, data, latent space, decoder, reconstruction) combined with a denoising diffusion prior (forward diffusion and generative denoising in the latent space).]

Main Idea

Encoder maps the input data to an embedding space

Denoising diffusion models are applied in the latent space

Vahdat et al., “Score-based generative modeling in latent space”, NeurIPS 2021.


102
Rombach et al., “High-Resolution Image Synthesis with Latent Diffusion Models”, CVPR 2022.
Latent-space diffusion models
Variational autoencoder + score-based prior

[Figure: the same variational autoencoder + denoising diffusion prior architecture as on the previous slide.]


Advantages:

(1) The distribution of latent embeddings is close to a Normal distribution → simpler denoising, faster synthesis!

(2) Augmented latent space → more expressivity!

(3) Tailored autoencoders → more expressivity, application to any data type (graphs, text, 3D data, etc.)!
Vahdat et al., “Score-based generative modeling in latent space”, NeurIPS 2021.
103
Rombach et al., “High-Resolution Image Synthesis with Latent Diffusion Models”, CVPR 2022.
Latent-space diffusion models
Training objective: score-matching for cross entropy

L(x, \phi, \theta, \psi) = E_{q_\phi(z_0|x)}[-\log p_\psi(x|z_0)] + KL(q_\phi(z_0|x) \,\|\, p_\theta(z_0))

= \underbrace{E_{q_\phi(z_0|x)}[-\log p_\psi(x|z_0)]}_{\text{reconstruction term}} + \underbrace{E_{q_\phi(z_0|x)}[\log q_\phi(z_0|x)]}_{\text{negative encoder entropy}} + \underbrace{E_{q_\phi(z_0|x)}[-\log p_\theta(z_0)]}_{\text{cross entropy}}

The cross-entropy term is expressed with denoising score matching; it involves sampling a time step, the forward diffusion kernel, the trainable score function, and a constant.
Vahdat et al., “Score-based generative modeling in latent space”, NeurIPS 2021. 104
Part 3-2:
Q: How to do high-resolution conditional generation?

105
Impressive conditional diffusion models
Text-to-image generation
DALL·E 2 IMAGEN
“a propaganda poster depicting a cat dressed as french “A photo of a raccoon wearing an astronaut helmet,
emperor napoleon holding a piece of cheese” looking out of the window at night.”

Ramesh et al., “Hierarchical Text-Conditional Image Generation with CLIP Latents”, arXiv 2022.
106
Saharia et al., “Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding”, arXiv 2022.
Impressive conditional diffusion models
Super-resolution & colorization

Super-resolution Colorization

Saharia et al., “Palette: Image-to-Image Diffusion Models”, arXiv 2021. 107


Impressive conditional diffusion models
Panorama generation
← Generated Input Generated →

Saharia et al., “Palette: Image-to-Image Diffusion Models”, arXiv 2021. 108


Conditional diffusion models
Include condition as input to reverse process

Reverse process:

Variational upper bound:

Incorporate conditions into U-Net

• Scalar conditioning: encode the scalar as a vector embedding; inject it via simple spatial addition or adaptive group normalization layers (see the sketch below).

• Image conditioning: channel-wise concatenation of the conditioning image.

• Text conditioning: a single vector embedding (spatial addition or adaptive group norm), or a sequence of vector embeddings (cross-attention).
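A rough PyTorch sketch of the adaptive group normalization option (module and variable names are hypothetical, not taken from a specific codebase):

```python
import torch
import torch.nn as nn

class AdaGroupNorm(nn.Module):
    """Adaptive group norm: a conditioning embedding predicts a per-channel scale and shift."""
    def __init__(self, channels, emb_dim, groups=8):
        super().__init__()
        self.norm = nn.GroupNorm(groups, channels, affine=False)
        self.to_scale_shift = nn.Linear(emb_dim, 2 * channels)

    def forward(self, x, emb):                        # x: (B, C, H, W), emb: (B, emb_dim)
        scale, shift = self.to_scale_shift(emb).chunk(2, dim=-1)
        return self.norm(x) * (1 + scale[..., None, None]) + shift[..., None, None]

layer = AdaGroupNorm(channels=32, emb_dim=128)
out = layer(torch.randn(2, 32, 16, 16), torch.randn(2, 128))  # conditioned feature map
```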
109
Classifier guidance
Using the gradient of a trained classifier as guidance

[Equation: modified score = score model + classifier gradient.]

Main Idea

For class-conditional modeling, train an extra classifier p(c | x_t) on noisy inputs.

Mix its gradient with the diffusion/score model during sampling

Dhariwal and Nichol, “Diffusion models beat GANs on image synthesis”, NeurIPS 2021. 110
Classifier guidance
Using the gradient of a trained classifier as guidance

[Equation: modified score = score model + classifier gradient.]

Main Idea

Sample with a modified score, which approximately corresponds to sampling from a classifier-reweighted distribution (written out below):
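In standard notation (a reconstruction of the usual classifier-guidance expressions with guidance scale ω, not a verbatim copy of the slide):

\nabla_{x_t} \log \tilde{p}(x_t \mid c) = \underbrace{\nabla_{x_t} \log p(x_t)}_{\text{score model}} + \;\omega\, \underbrace{\nabla_{x_t} \log p(c \mid x_t)}_{\text{classifier gradient}},
\qquad
\tilde{p}(x_t \mid c) \;\propto\; p(x_t)\, p(c \mid x_t)^{\omega}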

Dhariwal and Nichol, “Diffusion models beat GANs on image synthesis”, NeurIPS 2021. 111
Classifier-free guidance
Get guidance by Bayes’ rule on conditional diffusion models

• Instead of training an additional classifier, get an “implicit classifier” by jointly training a conditional and unconditional
diffusion model:

[Equation: the implicit classifier is the ratio of the conditional diffusion model to the unconditional diffusion model.]

• In practice, both are realized by a single network, trained by randomly dropping the condition of the diffusion model with some probability.

• The modified score with this implicit classifier included is:
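In the usual notation (following the classifier-free guidance formulation; ω is the guidance weight):

\tilde{\epsilon}_\theta(x_t, c) = (1 + \omega)\, \epsilon_\theta(x_t, c) - \omega\, \epsilon_\theta(x_t),
\quad\text{equivalently}\quad
\nabla_{x_t} \log \tilde{p}(x_t \mid c) = (1 + \omega)\, \nabla_{x_t} \log p(x_t \mid c) - \omega\, \nabla_{x_t} \log p(x_t)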

Ho & Salimans, “Classifier-Free Diffusion Guidance”, 2021. 112


Classifier-free guidance
Trade-off for sample quality and sample diversity

Non-guidance 𝜔=1 𝜔=3

Large guidance weight (𝜔) usually leads to better individual sample quality but less sample diversity.

Ho & Salimans, “Classifier-Free Diffusion Guidance”, 2021. 113


Cascaded generation
Pipeline

Cascaded diffusion models outperform BigGAN in FID and IS, and VQ-VAE-2 in Classification Accuracy Score.

Ho et al., “Cascaded Diffusion Models for High Fidelity Image Generation”, 2021. 114
Noise conditioning augmentation
Reduce compounding error

• Need a robust super-resolution model:

• Training conditions on original low-res images from the dataset.

• Inference runs on low-res images generated by the low-res model → mismatch issue.

• Noise conditioning augmentation (a sketch follows below):

• During training, add varying amounts of Gaussian noise (or blurring by a Gaussian kernel) to the low-res images.

• During inference, sweep over the amount of noise added to the low-res images to find the optimal value.

• BSR-degradation process: applies JPEG compression noise, camera sensor noise, different image interpolations for downsampling, Gaussian blur kernels, and Gaussian noise in a random order to an image.
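A minimal sketch of the Gaussian variant (one plausible variance-preserving parametrization; the exact form used in the papers may differ):

```python
import torch

def noise_conditioning_augmentation(y_lowres, aug_level=None):
    """Corrupt the low-res conditioning image with Gaussian noise and return the noise
    level so it can be fed to the super-resolution model as an extra conditioning signal."""
    if aug_level is None:                                  # training: sample a random level
        aug_level = torch.rand(y_lowres.shape[0], device=y_lowres.device)
    s = aug_level.view(-1, 1, 1, 1)
    y_aug = (1 - s ** 2).sqrt() * y_lowres + s * torch.randn_like(y_lowres)
    return y_aug, aug_level                                # at inference, sweep aug_level
```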

Ho et al., “Cascaded Diffusion Models for High Fidelity Image Generation”, 2021.
115
Nichol et al., “GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models”, 2021.
Summary
Questions to address with advanced techniques

• Q1: How to accelerate the sampling process?

• Advanced forward diffusion process

• Advanced reverse process

• Hybrid models & model distillation

• Q2: How to do high-resolution (conditional) generation?

• Conditional diffusion models

• Classifier(-free) guidance

• Cascaded generation

116
Today’s Program

Title Speaker Time


Introduction Arash 10 min
Part (1): Denoising Diffusion Probabilistic Models Arash 35 min
Part (2): Score-based Generative Modeling with Differential Equations Karsten 45 min
Part (3): Advanced Techniques: Accelerated Sampling, Conditional Generation, and Beyond Ruiqi 45 min
Applications (1): Image Synthesis, Text-to-Image, Controllable Generation Ruiqi 15 min
Applications (2): Image Editing, Image-to-Image, Super-resolution, Segmentation Arash 15 min
Applications (3): Video Synthesis, Medical Imaging, 3D Generation, Discrete State Models Karsten 15 min
Conclusions, Open Problems and Final Remarks Arash 10 min

cvpr2022-tutorial-diffusion-models.github.io
117
Applications (1):
Image Synthesis, Controllable Generation,
Text-to-Image

118
Text-to-image generation
Inverse of image captioning
• Conditional generation: given a text prompt c, generate high-res images x.

Video source: Saharia et al., “Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding”, arXiv 2022. 119
GLIDE
OpenAI

• A 64x64 base model + a 64x64 → 256x256 super-resolution model.

• Tried classifier-free and CLIP guidance. Classifier-free guidance works better than CLIP guidance.

Samples generated with classifier-free guidance (256x256)

Nichol et al., “GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models”, 2021. 120
CLIP guidance
What is a CLIP model?

• Trained by contrastive cross-entropy loss:

• The optimal value of is

Radford et al., “Learning Transferable Visual Models From Natural Language Supervision”, 2021.
121
Nichol et al., “GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models”, 2021.
CLIP guidance
Replace the classifier in classifier guidance with a CLIP model

• Sample with a modified score, where the classifier gradient is replaced by the gradient of a CLIP image-text similarity:
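One common way to write this (f and g denote the CLIP image and caption encoders, ω a guidance scale; the exact form shown on the slide is an assumption here):

s_\theta(x_t, t) + \omega\, \nabla_{x_t} \big( f(x_t) \cdot g(c) \big)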

Radford et al., “Learning Transferable Visual Models From Natural Language Supervision”, 2021.
122
Nichol et al., “GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models”, 2021.
GLIDE
OpenAI

• Fine-tune the model especially for inpainting: feed randomly occluded images with an additional mask channel as
the input.

Text-conditional image inpainting examples

Nichol et al., “GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models”, 2021. 123
DALL·E 2
OpenAI

1kx1k Text-to-image generation.


Outperforms DALL·E (an autoregressive transformer).
Ramesh et al., “Hierarchical Text-Conditional Image Generation with CLIP Latents”, arXiv 2022. 124
DALL·E 2
Model components

Prior: produces CLIP image embeddings conditioned on the caption.

Decoder: produces images conditioned on CLIP image embeddings and text.

Ramesh et al., “Hierarchical Text-Conditional Image Generation with CLIP Latents”, arXiv 2022. 125
DALL·E 2
Model components

Why conditional on CLIP image embeddings?

CLIP image embeddings capture high-level semantic meaning; latents in the decoder model take care of the rest.

The bipartite latent representation enables several text-guided image manipulation tasks.

Ramesh et al., “Hierarchical Text-Conditional Image Generation with CLIP Latents”, arXiv 2022. 126
DALL·E 2
Model components (1/2): prior model

Prior: produces CLIP image embeddings conditioned on the caption.

• Option 1. autoregressive prior: quantize image embedding to a seq. of discrete codes and predict them
autoregressively.

• Option 2. diffusion prior: model the continuous image embedding by diffusion models conditioned on caption.

Ramesh et al., “Hierarchical Text-Conditional Image Generation with CLIP Latents”, arXiv 2022. 127
DALL·E 2
Model components (2/2): decoder model

Decoder: produces images conditioned on CLIP image embeddings (and text).

• Cascaded diffusion models: 1 base model (64x64), 2 super-resolution models (64x64 → 256x256, 256x256 → 1024x1024).

• Largest super-resolution model is trained on patches and takes full-res inputs at inference time.

• Classifier-free guidance & noise conditioning augmentation are important.

Ramesh et al., “Hierarchical Text-Conditional Image Generation with CLIP Latents”, arXiv 2022. 128
DALL·E 2
Bipartite latent representations

Bipartite latent representation: the CLIP image embedding, plus the inversion of the DDIM sampler (the latents in the decoder model). Together they allow a near-exact reconstruction of the image.

Ramesh et al., “Hierarchical Text-Conditional Image Generation with CLIP Latents”, arXiv 2022. 129
DALL·E 2
Image variations

Fix the CLIP embedding


Decode using different decoder latents
Ramesh et al., “Hierarchical Text-Conditional Image Generation with CLIP Latents”, arXiv 2022. 130
DALL·E 2
Image interpolation

Interpolate the image CLIP embeddings of two images.

Use different decoder latents to get different interpolation trajectories.

Ramesh et al., “Hierarchical Text-Conditional Image Generation with CLIP Latents”, arXiv 2022. 131
DALL·E 2
Text Diffs

Change the image CLIP embedding towards the difference of the text CLIP embeddings of two prompts.

The decoder latent is kept constant.


Ramesh et al., “Hierarchical Text-Conditional Image Generation with CLIP Latents”, arXiv 2022. 132
Imagen
Google Research, Brain team

Input: text; Output: 1kx1k images

• An unprecedented degree of photorealism

• SOTA automatic scores & human ratings

• A deep level of language understanding

• Extremely simple

• no latent space, no quantization

A brain riding a rocketship heading towards the moon.

Saharia et al., “Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding”, arXiv 2022. 133
Imagen
Google Research, Brain team

Saharia et al., “Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding”, arXiv 2022. 134
Imagen
Google Research, Brain team

Saharia et al., “Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding”, arXiv 2022. 135
Imagen
Google Research, Brain team

Saharia et al., “Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding”, arXiv 2022. 136
Imagen
Google Research, Brain team

A cute hand-knitted koala wearing a sweater with 'CVPR' written on it.

Saharia et al., “Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding”, arXiv 2022. 137
Imagen

Key modeling components:

• Cascaded diffusion models

• Classifier-free guidance and dynamic


thresholding.

• Frozen large pretrained language models as


text encoders. (T5-XXL)

Saharia et al., “Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding”, arXiv 2022. 138
Imagen

Key observations:

• Beneficial to use text conditioning for all super-res


models.

• Noise conditioning augmentation weakens the information from the low-res model, so text conditioning is needed as extra information input.

• Scaling text encoder is extremely efficient.

• More important than scaling diffusion model size.

• Human raters prefer T5-XXL as the text encoder over


CLIP encoder on DrawBench.

Saharia et al., “Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding”, arXiv 2022. 139
Imagen
Dynamic thresholding

• Large classifier-free guidance weights → better text alignment, worse image quality

[Figure: trade-off curves; the axes indicate better sample quality and better text alignment.]

Saharia et al., “Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding”, arXiv 2022. 140
Imagen
Dynamic thresholding

• Large classifier-free guidance weights → better text alignment, worse image quality

• Hypothesis: at large guidance weights, the generated images are saturated due to the very large gradient updates during sampling.

• Solution – dynamic thresholding: adjusts the pixel values of samples at each sampling step to be within a dynamic range
computed over the statistics of the current samples.
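A small PyTorch sketch of this idea as described in the paper (the percentile value and all names are illustrative):

```python
import torch

def dynamic_threshold(x0_pred, percentile=0.995):
    """Clip the predicted clean image to a per-sample threshold s (a high percentile of
    the absolute pixel values) and rescale by s when s > 1, pushing saturated pixels inwards."""
    b = x0_pred.shape[0]
    s = torch.quantile(x0_pred.abs().reshape(b, -1), percentile, dim=-1)
    s = torch.clamp(s, min=1.0).view(b, 1, 1, 1)
    return x0_pred.clamp(min=-s, max=s) / s
```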

Saharia et al., “Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding”, arXiv 2022. 141
Imagen
Dynamic thresholding

• Large classifier-free guidance weights → better text alignment, worse image quality

• Hypothesis: at high guidance weights, the generated images are saturated due to the very large gradient updates during sampling.

• Solution – dynamic thresholding: adjusts the pixel values of samples at each sampling step to be within a dynamic range computed over the statistics of the current samples.

[Figure: samples with static thresholding vs. dynamic thresholding.]

Saharia et al., “Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding”, arXiv 2022. 142
Imagen
DrawBench: new benchmark for text-to-image evaluations

• A set of 200 prompts to evaluate text-to-image models across multiple dimensions.

• E.g., the ability to faithfully render different colors, numbers of objects, spatial relations, text in the scene, unusual
interactions between objects.

• Contains complex prompts, e.g., long and intricate descriptions, rare words, misspelled prompts.

Saharia et al., “Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding”, arXiv 2022. 143
Imagen
DrawBench: new benchmark for text-to-image evaluations

• A set of 200 prompts to evaluate text-to-image models across multiple dimensions.

• E.g., the ability to faithfully render different colors, numbers of objects, spatial relations, text in the scene, unusual
interactions between objects.

• Contains complex prompts, e.g., long and intricate descriptions, rare words, misspelled prompts.

Saharia et al., “Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding”, arXiv 2022. 144
Imagen
Evaluations

Imagen got SOTA automatic evaluation scores Imagen is preferred over recent work by human raters in sample
on COCO dataset quality & image-text alignment on DrawBench.

Saharia et al., “Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding”, arXiv 2022. 145
Diffusion Autoencoders
Learning semantic meaningful latent representations in diffusion models

Preechakul et al., “Diffusion Autoencoders: Toward a Meaningful and Decodable Representation”, CVPR 2022. 146
Diffusion Autoencoders
Learning semantic meaningful latent representations in diffusion models

Changing the semantic latent.

Preechakul et al., “Diffusion Autoencoders: Toward a Meaningful and Decodable Representation”, CVPR 2022. 147
Diffusion Autoencoders
Learning semantic meaningful latent representations in diffusion models

Preechakul et al., “Diffusion Autoencoders: Toward a Meaningful and Decodable Representation”, CVPR 2022. 148
Today’s Program

Title Speaker Time


Introduction Arash 10 min
Part (1): Denoising Diffusion Probabilistic Models Arash 35 min
Part (2): Score-based Generative Modeling with Differential Equations Karsten 45 min
Part (3): Advanced Techniques: Accelerated Sampling, Conditional Generation, and Beyond Ruiqi 45 min
Applications (1): Image Synthesis, Text-to-Image, Controllable Generation Ruiqi 15 min
Applications (2): Image Editing, Image-to-Image, Super-resolution, Segmentation Arash 15 min
Applications (3): Video Synthesis, Medical Imaging, 3D Generation, Discrete State Models Karsten 15 min
Conclusions, Open Problems and Final Remarks Arash 10 min

cvpr2022-tutorial-diffusion-models.github.io
150
Applications (2):
Image Editing, Image-to-Image,
Super-resolution, Segmentation

151
Super-Resolution
Super-Resolution via Repeated Refinement (SR3)

Image super-resolution can be considered as training p(x|y), where y is a low-resolution image and x is the corresponding high-resolution image.

Train a score model for x conditioned on y using a conditional denoising objective (a sketch follows below).

The conditional score is simply a U-Net with x_t and y (the low-resolution image, upsampled to the target resolution) concatenated.
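A rough sketch of such a conditional denoising training step (assuming an `eps_model` that takes the concatenated input and a timestep; y is the low-res image already upsampled to the target size):

```python
import torch
import torch.nn.functional as F

def conditional_denoising_loss(eps_model, x_hr, y_upsampled, alpha_bar):
    """Train the score/noise model on x conditioned on y via channel-wise concatenation."""
    b = x_hr.shape[0]
    t = torch.randint(0, len(alpha_bar), (b,))
    a = alpha_bar[t].view(b, 1, 1, 1)
    eps = torch.randn_like(x_hr)
    x_t = a.sqrt() * x_hr + (1 - a).sqrt() * eps              # diffuse the high-res target
    eps_pred = eps_model(torch.cat([x_t, y_upsampled], dim=1), t)
    return F.mse_loss(eps_pred, eps)
```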

Saharia et al., Image Super-Resolution via Iterative Refinement, 2021 152


Super-Resolution
Super-Resolution via Repeated Refinement (SR3)

Saharia et al., Image Super-Resolution via Iterative Refinement, 2021 153


Image-to-Image Translation
Palette: Image-to-Image Diffusion Models

Many image-to-image translation applications can be considered as training p(x|y), where y is the input image.

For example, for colorization, x is a colored image and y is a gray-level image.

Train a score model for x conditioned on y using:

The conditional score is simply a U-Net with x_t and y concatenated.

Saharia et al., Palette: Image-to-Image Diffusion Models, 2022 154


Image-to-Image Translation
Palette: Image-to-Image Diffusion Models

Saharia et al., Palette: Image-to-Image Diffusion Models, 2022 155


Conditional Generation
Iterative Latent Variable Refinement (ILVR)

A simple technique to guide the generation process of an unconditional diffusion model using a reference image.

Given the conditioning (reference) image y, the generation process is modified to pull the samples towards the reference image (a sketch of one step follows below).
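A rough sketch of one refinement step (the low-pass filter and `q_sample` are simplified placeholders for the paper's φ_N and forward diffusion):

```python
import torch
import torch.nn.functional as F

def low_pass(x, factor=8):
    """Simple linear low-pass filter: downsample then upsample back to the original size."""
    small = F.interpolate(x, scale_factor=1.0 / factor, mode="bilinear", align_corners=False)
    return F.interpolate(small, size=x.shape[-2:], mode="bilinear", align_corners=False)

def ilvr_refine(x_proposal, y_ref, t, q_sample, factor=8):
    """Replace the low-frequency content of the unconditional proposal with the
    low-frequency content of the forward-diffused reference image y at the same step."""
    y_t = q_sample(y_ref, t)
    return x_proposal - low_pass(x_proposal, factor) + low_pass(y_t, factor)
```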

Choi et al., ILVR: Conditioning Method for Denoising Diffusion Probabilistic Models, ICCV 2021 156
Conditional Generation
Iterative Latent Variable Refinement (ILVR)

Choi et al., ILVR: Conditioning Method for Denoising Diffusion Probabilistic Models, ICCV 2021 157
Semantic Segmentation
Label-efficient semantic segmentation with diffusion models

Can we use representation learned from diffusion models for downstream applications such as semantic segmentation?

Baranchuk et al., Label-Efficient Semantic Segmentation with Diffusion Models, ICLR 2022 158
Semantic Segmentation
Label-efficient semantic segmentation with diffusion models
The experimental results show that the proposed method outperforms Masked Autoencoders, GAN and VAE-based models.

Baranchuk et al., Label-Efficient Semantic Segmentation with Diffusion Models, ICLR 2022 159
Image Editing
SDEdit

Forward diffusion brings two distributions close to each other

Meng et al., SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations, ICLR 2022 160
Image Editing
SDEdit

Meng et al., SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations, ICLR 2022 161
Adversarial Robustness
Diffusion Models for Adversarial Purification

Nie et al., Diffusion Models for Adversarial Purification, ICML 2022 162
Adversarial Robustness
Diffusion Models for Adversarial Purification

Comparison with state-of-the-art defense methods against


unseen threat models (including AutoAttack ℓ∞, AutoAttack ℓ2, and
StdAdv) on ResNet-50 for CIFAR-10.

Nie et al., Diffusion Models for Adversarial Purification, ICML 2022 163
Today’s Program

Title Speaker Time


Introduction Arash 10 min
Part (1): Denoising Diffusion Probabilistic Models Arash 35 min
Part (2): Score-based Generative Modeling with Differential Equations Karsten 45 min
Part (3): Advanced Techniques: Accelerated Sampling, Conditional Generation, and Beyond Ruiqi 45 min
Applications (1): Image Synthesis, Text-to-Image, Controllable Generation Ruiqi 15 min
Applications (2): Image Editing, Image-to-Image, Super-resolution, Segmentation Arash 15 min
Applications (3): Video Synthesis, Medical Imaging, 3D Generation, Discrete State Models Karsten 15 min
Conclusions, Open Problems and Final Remarks Arash 10 min

cvpr2022-tutorial-diffusion-models.github.io
165
Applications (3):
Video Synthesis, Medical Imaging,
3D Generation, Discrete State Models

166
Video Generation

Samples from a text-conditioned video diffusion model, conditioned on the string fireworks.

(video from: Ho et al., “Video Diffusion Models”, arXiv, 2022,


https://video-diffusion.github.io/)

Ho et al., “Video Diffusion Models”, arXiv, 2022


Harvey et al., “Flexible Diffusion Modeling of Long Videos”, arXiv, 2022
Yang et al., “Diffusion Probabilistic Modeling for Video Generation”, arXiv, 2022
Höppe et al., “Diffusion Models for Video Prediction and Infilling”, arXiv, 2022
167
Voleti et al., “MCVD: Masked Conditional Video Diffusion for Prediction, Generation, and Interpolation”, arXiv, 2022
Video Generation

Video Generation Tasks:

• Unconditional Generation (generate all frames)
• Future Prediction (generate future from past frames)
• Past Prediction (generate past from future frames)
• Interpolation (generate intermediate frames)

Learn one model for everything:

• Architecture as one diffusion model over all frames concatenated.
• Mask frames to be predicted; provide conditioning frames; vary the applied masking/conditioning for different tasks during training.
• Use time position encodings to encode the frame times.

Learn a model of the form:


p_\theta(x^{\tau_1}, \cdots, x^{\tau_M} \mid x^{t_1}, \cdots, x^{t_K})

Given frames: x^{t_1}, \cdots, x^{t_K};  frames to be predicted: x^{\tau_1}, \cdots, x^{\tau_M}

Ho et al., “Video Diffusion Models”, arXiv, 2022


Harvey et al., “Flexible Diffusion Modeling of Long Videos”, arXiv, 2022
(image from: Harvey et al., “Flexible Diffusion Modeling of Long Videos”, arXiv, 2022)
Yang et al., “Diffusion Probabilistic Modeling for Video Generation”, arXiv, 2022
Höppe et al., “Diffusion Models for Video Prediction and Infilling”, arXiv, 2022
168
Voleti et al., “MCVD: Masked Conditional Video Diffusion for Prediction, Generation, and Interpolation”, arXiv, 2022
Video Generation
Architecture Details

Architecture Details:

Data is 4D (image height, image width, #frames, channels).
• Option (1): 3D convolutions. Can be computationally expensive.
• Option (2): spatial 2D convolutions + attention layers along the frame axis.

Learn one model for everything (as on the previous slide):

• Architecture as one diffusion model over all frames concatenated.
• Mask frames to be predicted; provide conditioning frames; vary the applied masking/conditioning for different tasks during training.
• Use time position encodings to encode the frame times.

Additional Advantage:

Ignoring the attention layers, the model can be


trained additionally on pure image data!

Ho et al., “Video Diffusion Models”, arXiv, 2022


Harvey et al., “Flexible Diffusion Modeling of Long Videos”, arXiv, 2022
(image from: Harvey et al., “Flexible Diffusion Modeling of Long Videos”, arXiv, 2022)
Yang et al., “Diffusion Probabilistic Modeling for Video Generation”, arXiv, 2022
Höppe et al., “Diffusion Models for Video Prediction and Infilling”, arXiv, 2022
169
Voleti et al., “MCVD: Masked Conditional Video Diffusion for Prediction, Generation, and Interpolation”, arXiv, 2022
Video Generation
Results

Long-term video generation in a hierarchical manner:

• 1. Generate future frames in a sparse manner, conditioning on frames far back.
• 2. Interpolate the in-between frames.

1+ hour of coherent video generation is possible!

[Figure: test data vs. generated frames.]

(video from: Harvey et al., “Flexible Diffusion Modeling of Long Videos”, arXiv, 2022,
Ho et al., “Video Diffusion Models”, arXiv, 2022 https://plai.cs.ubc.ca/2022/05/20/flexible-diffusion-modeling-of-long-videos/)
Harvey et al., “Flexible Diffusion Modeling of Long Videos”, arXiv, 2022
Yang et al., “Diffusion Probabilistic Modeling for Video Generation”, arXiv, 2022
Höppe et al., “Diffusion Models for Video Prediction and Infilling”, arXiv, 2022
170
Voleti et al., “MCVD: Masked Conditional Video Diffusion for Prediction, Generation, and Interpolation”, arXiv, 2022
Solving Inverse Problems in Medical Imaging

Forward CT or MRI imaging process (simplified):

[Figure: sparse-view CT and undersampled MRI measurements.]

(image from: Song et al., “Solving Inverse Problems in Medical Imaging with Score-Based Generative Models”, ICLR, 2022)

Inverse Problem:
Reconstruct original image from sparse measurements.

Song et al., “Solving Inverse Problems in Medical Imaging with Score-Based Generative Models”, ICLR, 2022 171
Solving Inverse Problems in Medical Imaging

High-level idea: Learn Generative Diffusion Model as “prior”; then guide synthesis conditioned on sparse observations:

(image from: Song et al., “Solving Inverse Problems in Medical Imaging with Score-Based Generative Models”, ICLR, 2022)

Outperforms even fully-supervised methods.


Song et al., “Solving Inverse Problems in Medical Imaging with Score-Based Generative Models”, ICLR, 2022 172
Solving Inverse Problems in Medical Imaging
Lots of Literature

High-level idea: Learn Generative Diffusion Model as “prior”; then guide synthesis conditioned on sparse observations:

• Song et al., “Solving Inverse Problems in Medical Imaging with Score-Based Generative Models”, ICLR, 2022
• Chung and Ye, “Score-based diffusion models for accelerated MRI”, Medical Image Analysis, 2022
• Chung et al., “Come-Closer-Diffuse-Faster: Accelerating Conditional Diffusion Models for Inverse Problems through Stochastic Contraction”, CVPR, 2022
• Peng et al., “Towards performant and reliable undersampled MR reconstruction via diffusion model sampling”, arXiv, 2022
• Xie and Li, “Measurement-conditioned Denoising Diffusion Probabilistic Model for Under-sampled Medical Image Reconstruction”, arXiv, 2022
• Luo et al, “MRI Reconstruction via Data Driven Markov Chain with Joint Uncertainty Estimation”, arXiv, 2022
• …

(Song et al., “Solving Inverse Problems in Medical Imaging with Score-Based Generative Models”, ICLR, 2022)

Song et al., “Solving Inverse Problems in Medical Imaging with Score-Based Generative Models”, ICLR, 2022 173
3D Shape Generation

• Point clouds as 3D shape representation can be diffused easily and intuitively


• Denoiser implemented based on modern point cloud-processing networks (PointNets & Point-VoxelCNNs)

(image from: Zhou et al., “3D Shape Generation and Completion through Point-Voxel Diffusion”, ICCV, 2021)

Zhou et al., “3D Shape Generation and Completion through Point-Voxel Diffusion”, ICCV, 2021
Luo and Hu, “Diffusion Probabilistic Models for 3D Point Cloud Generation”, CVPR, 2021 174
3D Shape Generation

• Point clouds as 3D shape representation can be diffused easily and intuitively


• Denoiser implemented based on modern point cloud-processing networks (PointNets & Point-VoxelCNNs)

(video from: Zhou et al., “3D Shape Generation and Completion through Point-Voxel Diffusion”, ICCV, 2021,
https://alexzhou907.github.io/pvd)

Zhou et al., “3D Shape Generation and Completion through Point-Voxel Diffusion”, ICCV, 2021 175
3D Shape Generation
Shape Completion

• Can train conditional shape completion diffusion model (subset of points fixed to given conditioning points):

(video from: Zhou et al., “3D Shape Generation and Completion through Point-Voxel Diffusion”, ICCV, 2021,
https://alexzhou907.github.io/pvd)

Zhou et al., “3D Shape Generation and Completion through Point-Voxel Diffusion”, ICCV, 2021 176
3D Shape Generation
Shape Completion – Multimodality

(video from: Zhou et al., “3D Shape Generation and Completion through Point-Voxel Diffusion”, ICCV, 2021,
https://alexzhou907.github.io/pvd)

Zhou et al., “3D Shape Generation and Completion through Point-Voxel Diffusion”, ICCV, 2021 177
3D Shape Generation
Shape Completion – Multimodality – On Real Data

(video from: Zhou et al., “3D Shape Generation and Completion through Point-Voxel Diffusion”, ICCV, 2021,
https://alexzhou907.github.io/pvd)

Zhou et al., “3D Shape Generation and Completion through Point-Voxel Diffusion”, ICCV, 2021 178
Towards Discrete State Diffusion Models

• So far:
Continuous diffusion and denoising processes.

[Figure: data x_0 diffused through x_t to noise x_T.]

Fixed forward diffusion process:  q(x_t | x_{t-1}) = N(x_t; \sqrt{1 - \beta_t}\, x_{t-1}, \beta_t I)

Reverse generative process:  p_\theta(x_{t-1} | x_t) = N(x_{t-1}; \mu_\theta(x_t, t), \sigma_t^2 I)

But what if the data is discrete or categorical (text, pixel-wise segmentation labels, discrete image encodings, etc.)?

Continuous perturbations are not possible!

179
Discrete State Diffusion Models

• Categorical diffusion:   q(x_t | x_{t-1}) = Cat(x_t ; p = x_{t-1} Q_t)

  x_t : one-hot state vector;   Cat(x ; p) : categorical distribution with probabilities p

  Q_t : transition matrix with [Q_t]_{ij} = q(x_t = j | x_{t-1} = i)

• Reverse process can be parametrized by a neural network that outputs a categorical distribution over x_{t-1}
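A minimal sketch (my own illustration, not the slide's code) of one categorical forward step; `Q_t` is a K × K row-stochastic transition matrix and `x_prev_onehot` a one-hot state vector.

```python
import numpy as np

rng = np.random.default_rng(0)

def categorical_forward_step(x_prev_onehot, Q_t):
    # q(x_t | x_{t-1}) = Cat(x_t; p = x_{t-1} Q_t): the row of Q_t selected by the current state
    probs = x_prev_onehot @ Q_t
    k = rng.choice(len(probs), p=probs)   # sample the next categorical state
    x_t = np.zeros_like(x_prev_onehot)
    x_t[k] = 1.0
    return x_t

# tiny usage example with K = 4 classes and a simple row-stochastic matrix
K = 4
Q = 0.9 * np.eye(K) + 0.025 * np.ones((K, K))   # rows sum to 1
x0 = np.eye(K)[2]                               # one-hot state "2"
x1 = categorical_forward_step(x0, Q)
```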

(image from: Hoogeboom et al., “Argmax Flows and Multinomial Diffusion: Learning Categorical Distributions”, NeurIPS, 2022)

Austin et al., “Structured Denoising Diffusion Models in Discrete State-Spaces”, NeurIPS, 2021
Hoogeboom et al., “Argmax Flows and Multinomial Diffusion: Learning Categorical Distributions”, NeurIPS, 2022 180
Discrete State Diffusion Models

Options for the forward process:

• Uniform categorical diffusion:   Q_t = (1 - \beta_t) I + \frac{\beta_t}{K} \mathbf{1}\mathbf{1}^\top

• Progressive masking out of data (generation is "de-masking")

• Transitions tailored to ordinal data (e.g., a discretized Gaussian)
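For concreteness, hedged sketches of these transition-matrix families: the uniform and absorbing ("mask") constructions follow the standard forms, while `ordinal_gaussian_Q` only illustrates the idea of a distance-dependent kernel over ordinal states and is not the exact discretized-Gaussian formula of Austin et al.

```python
import numpy as np

def uniform_Q(beta_t, K):
    # Q_t = (1 - beta_t) I + (beta_t / K) 1 1^T : stay put, or resample uniformly with prob. beta_t
    return (1.0 - beta_t) * np.eye(K) + (beta_t / K) * np.ones((K, K))

def absorbing_Q(beta_t, K):
    # K regular states plus one [MASK] state (index K): each state moves to [MASK] with prob. beta_t
    Q = (1.0 - beta_t) * np.eye(K + 1)
    Q[:, K] += beta_t          # leak probability into the mask state
    Q[K, K] = 1.0              # [MASK] is absorbing
    return Q

def ordinal_gaussian_Q(beta_t, K):
    # Sketch only: transition mass decays with the ordinal distance |i - j|, rows normalized.
    idx = np.arange(K)
    logits = -((idx[None, :] - idx[:, None]) ** 2) / (beta_t * (K - 1) ** 2)
    Q = np.exp(logits)
    return Q / Q.sum(axis=1, keepdims=True)
```

Multi-step marginals then follow from matrix products: q(x_t | x_0) = Cat(x_0 \bar{Q}_t) with \bar{Q}_t = Q_1 Q_2 ⋯ Q_t.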

(image from: Austin et al., “Structured Denoising Diffusion Models in Discrete State-Spaces”, NeurIPS, 2021)

Austin et al., “Structured Denoising Diffusion Models in Discrete State-Spaces”, NeurIPS, 2021 181
Discrete State Diffusion Models

(image from: Austin et al., “Structured Denoising Diffusion Models in Discrete State-Spaces”, NeurIPS, 2021)

Austin et al., “Structured Denoising Diffusion Models in Discrete State-Spaces”, NeurIPS, 2021 182
Discrete State Diffusion Models
Modeling Categorical Image Pixel Values

Progressive denoising starting from an all-masked state.

Progressive denoising starting from a random uniform state
(with a discretized Gaussian denoising model).

(image from: Austin et al., “Structured Denoising Diffusion Models in Discrete State-Spaces”, NeurIPS, 2021)

Austin et al., “Structured Denoising Diffusion Models in Discrete State-Spaces”, NeurIPS, 2021 183
Discrete State Diffusion Models
Modeling Discrete Image Encodings

Encoding images into a latent space of discrete tokens, and modeling the discrete token distribution

Class-conditional model samples


Iterative generation

(images from: Chang et al., “MaskGIT: Masked Generative Image Transformer”, CVPR, 2022)

Chang et al., “MaskGIT: Masked Generative Image Transformer”, CVPR, 2022


184
Esser et al., “ImageBART: Bidirectional Context with Multinomial Diffusion for Autoregressive Image Synthesis”, NeurIPS, 2021
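A hedged sketch of MaskGIT-style iterative generation (confidence-based parallel de-masking with a cosine schedule); `token_model` is a hypothetical network returning per-position categorical probabilities, and this is a simplification of the published decoding procedure, not the authors' code.

```python
import numpy as np

def iterative_demask(token_model, seq_len, mask_id, num_steps=8):
    # Start from an all-masked token sequence and progressively commit the most confident predictions.
    tokens = np.full(seq_len, mask_id)
    for step in range(num_steps):
        probs = token_model(tokens)                  # (seq_len, vocab_size) per-position probabilities
        pred = probs.argmax(axis=-1)
        conf = probs.max(axis=-1)
        masked = tokens == mask_id
        pred = np.where(masked, pred, tokens)        # positions decoded in earlier steps stay fixed
        conf = np.where(masked, conf, np.inf)
        # cosine schedule: how many positions remain masked after this step
        n_mask = int(seq_len * np.cos(np.pi / 2 * (step + 1) / num_steps))
        tokens = pred
        if n_mask > 0:
            order = np.argsort(conf)                 # least confident first
            tokens[order[:n_mask]] = mask_id         # re-mask the least confident predictions
    return tokens
```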
Discrete State Diffusion Models
Modeling Pixel-wise Segmentations

(image from: Hoogeboom et al., “Argmax Flows and Multinomial Diffusion: Learning Categorical Distributions”, NeurIPS, 2022)

Hoogeboom et al., “Argmax Flows and Multinomial Diffusion: Learning Categorical Distributions”, NeurIPS, 2022 185
Today’s Program

Title Speaker Time


Introduction Arash 10 min
Part (1): Denoising Diffusion Probabilistic Models Arash 35 min
Part (2): Score-based Generative Modeling with Differential Equations Karsten 45 min
Part (3): Advanced Techniques: Accelerated Sampling, Conditional Generation, and Beyond Ruiqi 45 min
Applications (1): Image Synthesis, Text-to-Image, Controllable Generation Ruiqi 15 min
Applications (2): Image Editing, Image-to-Image, Super-resolution, Segmentation Arash 15 min
Applications (3): Video Synthesis, Medical Imaging, 3D Generation, Discrete State Models Karsten 15 min
Conclusions, Open Problems and Final Remarks Arash 10 min

cvpr2022-tutorial-diffusion-models.github.io
187
Conclusions, Open Problems and Final Remarks

188
Summary: Denoising Diffusion Probabilistic Models
“Discrete-time” Diffusion Models

We started with denoising diffusion probabilistic models:

Forward diffusion process (fixed)

Data Noise

Reverse denoising process (generative)

We showed how the denoising model can be trained by predicting noise injected in each diffused image:
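For reference, a minimal sketch of the noise-prediction ("epsilon-prediction") objective written in plain Python; `eps_model` is a placeholder for the denoising network and `alpha_bar` the cumulative product of (1 − β_t).

```python
import numpy as np

rng = np.random.default_rng(0)

def simple_loss(eps_model, x0, alpha_bar):
    # L_simple = E_{t, x0, eps} || eps - eps_theta( sqrt(abar_t) x0 + sqrt(1 - abar_t) eps, t ) ||^2
    T = len(alpha_bar)
    t = rng.integers(T)                               # random diffusion timestep
    eps = rng.standard_normal(x0.shape)               # noise injected into the clean image
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return np.mean((eps - eps_model(x_t, t)) ** 2)
```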

189
Summary: Score-based Generative Models with Differential Eqn.
“Continuous-time” Diffusion Models

In the second part, we considered the limit of an infinite number of steps with infinitesimal noise.

Generative Reverse Diffusion SDE (stochastic)
Generative Probability Flow ODE (deterministic)

These continuous-time diffusion models allow us to choose discretization and ODE/SDE solvers at test time.
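For reference, the standard forms of these two generative processes (following Song et al., "Score-Based Generative Modeling through Stochastic Differential Equations"), where the score ∇_x log p_t(x) is approximated by the learned score model:

```latex
% Generative reverse-time SDE (stochastic sampling):
\mathrm{d}\mathbf{x} = \bigl[\mathbf{f}(\mathbf{x}, t) - g(t)^2 \nabla_{\mathbf{x}} \log p_t(\mathbf{x})\bigr]\,\mathrm{d}t + g(t)\,\mathrm{d}\bar{\mathbf{w}}

% Probability flow ODE (deterministic sampling):
\mathrm{d}\mathbf{x} = \bigl[\mathbf{f}(\mathbf{x}, t) - \tfrac{1}{2}\, g(t)^2 \nabla_{\mathbf{x}} \log p_t(\mathbf{x})\bigr]\,\mathrm{d}t
```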
190
Summary: Advanced Techniques
Acceleration, Guidance and beyond

In the third part, we discussed several advanced topics in diffusion models.

How can we accelerate sample generation?


[Image credit: Ben Poole, Mohammad Norouzi]

Simple forward process slowly maps data to noise

Reverse process maps noise back to data with a denoising model

How to scale up diffusion models to high-resolution (conditional) generation?

• Cascaded models

• Guided diffusion models
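As one concrete example of guidance, a sketch of the classifier-free guidance combination (one common convention for the guidance weight w; not the only guidance mechanism covered in the tutorial):

```python
def classifier_free_guidance(eps_model, x_t, t, cond, w):
    # eps_hat = (1 + w) * eps_theta(x_t, t, cond) - w * eps_theta(x_t, t, None)
    eps_cond = eps_model(x_t, t, cond)     # conditional noise prediction
    eps_uncond = eps_model(x_t, t, None)   # unconditional prediction (null conditioning)
    return eps_uncond + (1.0 + w) * (eps_cond - eps_uncond)
```

With w = 0 this reduces to the conditional model; larger w typically trades sample diversity for fidelity to the conditioning.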


191
Summary: Applications

We covered many successful applications of diffusion models:

• Image generation, text-to-image generation, controllable generation

• Image editing, image-to-image translation, super-resolution, segmentation, adversarial robustness

• Discrete models, 3D generation, medical imaging, video synthesis

192
Open Problems (1)
• Diffusion models are a special form of VAEs and continuous normalizing flows

• Why do diffusion models perform so much better than these models?

• How can we improve VAEs and normalizing flows with lessons learned from diffusion models?

• Sampling from diffusion models is still slow, especially for interactive applications

• The best we can currently reach is about 4–10 steps. How can we build one-step samplers?

• Do we need new diffusion processes?

• Diffusion models can be considered as latent variable models, but their latent space lacks semantics

• How can we do latent-space semantic manipulations in diffusion models?

193
Open Problems (2)
• How can diffusion models help with discriminative applications?

• Representation learning (high-level vs low-level)

• Uncertainty estimation

• Joint discriminator-generator training

• What are the best network architectures for diffusion models?

• Can we go beyond existing U-Nets?

• How can we feed the time input and other conditioning?

• How can we improve the sampling efficiency using better network designs?

194
Open Problems (3)
• How can we apply diffusion models to other data types?

• 3D data (e.g., distance functions, meshes, voxels, volumetric representations), video, text, graphs, etc.

• How should we change diffusion models for these modalities?

• Compositional and controllable generation

• How can we go beyond images and generate scenes?

• How can we have more fine-grained control in generation?

• Diffusion models for X

• Can we better solve applications that were previously addressed by GANs and other generative models?

• Which applications will benefit most from diffusion models?


195
Thanks!

https://cvpr2022-tutorial-diffusion-models.github.io/

@karsten_kreis @RuiqiGao @ArashVahdat

196
