
Diffusion Probabilistic Models:

Theory and Applications


Fan Bao
Tsinghua University

By Fan Bao, Tsinghua University 1


Diffusion Probabilistic Models (DPMs)
Ho et al. Denoising diffusion probabilistic models (DDPM), NeurIPS 2020.
Song et al. Score-based generative modeling through stochastic differential equations, ICLR 2021.
Bao et al. Analytic-DPM: an Analytic Estimate of the Optimal Reverse Variance in Diffusion Probabilistic Models, ICLR 2022.
Bao et al. Estimating the Optimal Covariance with Imperfect Mean in Diffusion Probabilistic Models, ICML 2022.


• Diffusion process gradually injects noise into data
• Described by a Markov chain: q(x_0, …, x_N) = q(x_0) q(x_1|x_0) ⋯ q(x_N|x_{N−1})

Transition of diffusion: q(x_n|x_{n−1}) = N(√α_n x_{n−1}, β_n I), with α_n = 1 − β_n

x_0 → x_1 → x_2 → ⋯ → x_N ≈ N(0, I)

Demo images from Song et al. Score-based generative modeling through stochastic differential equations, ICLR 2021.
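Concretely, the forward chain can be simulated in a few lines of numpy (a toy sketch; the schedule and all names are illustrative, not from the slides):

```python
import numpy as np

def forward_diffuse(x0, betas, rng):
    """Run the diffusion Markov chain x_0 -> x_1 -> ... -> x_N,
    where q(x_n | x_{n-1}) = N(sqrt(alpha_n) x_{n-1}, beta_n I)."""
    xs = [x0]
    x = x0
    for beta in betas:
        alpha = 1.0 - beta
        x = np.sqrt(alpha) * x + np.sqrt(beta) * rng.standard_normal(x.shape)
        xs.append(x)
    return xs

rng = np.random.default_rng(0)
x0 = 5.0 + rng.standard_normal(1000)      # "data" far from N(0, I)
betas = np.linspace(1e-4, 0.05, 1000)     # illustrative noise schedule
traj = forward_diffuse(x0, betas, rng)
xN = traj[-1]
# The signal is destroyed: prod(sqrt(alpha_n)) is tiny, so x_N looks like N(0, I).
```

Running this, `xN` has mean near 0 and standard deviation near 1 even though the data was centered at 5, which is exactly the x_N ≈ N(0, I) claim above.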


• The diffusion process in the reverse direction ⇔ a denoising process
• Reverse factorization: q(x_0, …, x_N) = q(x_0|x_1) ⋯ q(x_{N−1}|x_N) q(x_N)

Transition of denoising: q(x_{n−1}|x_n) = ?

x_0 ← x_1 ← x_2 ← ⋯ ← x_N ≈ N(0, I)

Diffusion process: q(x_0, …, x_N) = q(x_0) q(x_1|x_0) ⋯ q(x_N|x_{N−1}) = q(x_0|x_1) ⋯ q(x_{N−1}|x_N) q(x_N)


• Approximate the diffusion process in the reverse direction

Model transition: p(x_{n−1}|x_n) = N(μ_n(x_n), Σ_n(x_n)), which approximates the denoising transition q(x_{n−1}|x_n)

x_0 ← x_1 ← x_2 ← ⋯ ← x_N ≈ N(0, I)

The model: p(x_0, …, x_N) = p(x_0|x_1) ⋯ p(x_{N−1}|x_N) p(x_N)


• We hope q(x_0, …, x_N) ≈ p(x_0, …, x_N), where p(x_{n−1}|x_n) = N(μ_n(x_n), Σ_n(x_n))
• Achieved by minimizing their KL divergence (i.e., maximizing the ELBO):

min_{μ_n, Σ_n} KL(q(x_0:N) ‖ p(x_0:N))  ⇔  max_{μ_n, Σ_n} E_q[log (p(x_0:N) / q(x_1:N|x_0))]

What is the optimal solution?


Bao et al. Analytic-DPM: an Analytic Estimate of the Optimal Reverse Variance in Diffusion Probabilistic Models, ICLR 2022.

Theorem (the optimal solution under scalar variance, i.e., Σ_n(x_n) = σ_n² I)
The optimal solution to min_{μ_n(⋅), σ_n²} KL(q(x_0:N) ‖ p(x_0:N)) is

  μ_n*(x_n) = (1/√α_n) (x_n + β_n ∇log q_n(x_n)),
  σ_n*² = (β_n/α_n) (1 − β_n E_{q_n(x_n)}[‖∇log q_n(x_n)‖² / d]).

Three key steps in the proof:
➢ Moment matching
➢ Law of total variance
➢ Score representation of the moments of q(x_0|x_n)

Noise prediction form (with ᾱ_n = ∏_{i≤n} α_i and β̄_n = 1 − ᾱ_n):
  ∇log q_n(x_n) = −(1/√β̄_n) E_{q(x_0|x_n)}[ε_n], estimated by predicting the noise.
Parameterization of μ_n(⋅): μ_n(x_n) = (1/√α_n) (x_n − (β_n/√β̄_n) ε̂_n(x_n))
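As a hedged sketch (function and argument names are mine), the theorem reduces to a few lines once the score ∇log q_n and its expected squared norm are available; the usage below checks it against the exactly solvable case q_n = N(0, I):

```python
import numpy as np

def analytic_dpm_step(x_n, score_n, beta_n, mean_sq_score_per_dim):
    """Optimal reverse mean and scalar variance from the Analytic-DPM theorem:
    mu* = (x_n + beta_n * score) / sqrt(alpha_n),
    sigma*^2 = (beta_n / alpha_n) * (1 - beta_n * E[||score||^2] / d)."""
    alpha_n = 1.0 - beta_n
    mu = (x_n + beta_n * score_n) / np.sqrt(alpha_n)
    var = (beta_n / alpha_n) * (1.0 - beta_n * mean_sq_score_per_dim)
    return mu, var

# Sanity check: if the data is N(0, I) then q_n = N(0, I) for every n, the
# score is -x, and E[||score||^2]/d = 1.  The formula then gives the exact
# posterior of x_{n-1} given x_n: mean sqrt(alpha_n) * x_n, variance beta_n.
x = np.array([0.5, -1.0])
mu, var = analytic_dpm_step(x, -x, beta_n=0.1, mean_sq_score_per_dim=1.0)
```

In the check, `var` collapses to β_n = 0.1 and `mu` to √α_n · x_n, which agrees with a direct Bayes computation for the stationary Gaussian case.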


Bao et al. Estimating the Optimal Covariance with Imperfect Mean in Diffusion Probabilistic Models, ICML 2022.

Theorem (the optimal solution for diagonal covariance, i.e., Σ_n(x_n) = diag(σ_n(x_n)²))
The optimal solution to min_{μ_n(⋅), σ_n(⋅)²} KL(q(x_0:N) ‖ p(x_0:N)) is

  μ_n*(x_n) = (1/√α_n) (x_n + β_n ∇log q_n(x_n)),   (predict the noise)
  σ_n*(x_n)² = (β̄_{n−1}/β̄_n) β_n + (β_n²/(β̄_n α_n)) (E_{q(x_0|x_n)}[ε_n²] − E_{q(x_0|x_n)}[ε_n]²).   (predict the squared noise)
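A minimal numpy sketch of the variance formula (names are illustrative); when x_n pins down ε_n exactly, the conditional-variance term vanishes and only the constant first term survives:

```python
import numpy as np

def optimal_diag_var(beta_n, beta_bar_nm1, beta_bar_n, e_eps_sq, e_eps):
    """Optimal diagonal reverse variance (sketch of the ICML 2022 theorem):
    a schedule-dependent constant plus a term proportional to the
    per-coordinate conditional variance of the noise eps_n given x_n."""
    alpha_n = 1.0 - beta_n
    const = (beta_bar_nm1 / beta_bar_n) * beta_n
    cond_var = e_eps_sq - e_eps ** 2        # E[eps^2|x_n] - E[eps|x_n]^2 >= 0
    return const + (beta_n ** 2 / (beta_bar_n * alpha_n)) * cond_var

# Delta posterior: E[eps^2] = E[eps]^2, so only the constant term remains.
v = optimal_diag_var(0.1, 0.8, 0.82,
                     e_eps_sq=np.array([0.25]), e_eps=np.array([0.5]))
```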


 Implementation framework for predicting the squared noise

Optimal covariance expression: σ_n*(x_n)² = (β̄_{n−1}/β̄_n) β_n + (β_n²/(β̄_n α_n)) (E[ε_n²] − E[ε_n]²), where the first term is a constant

Data x_0 + Gaussian noise ε_n → noisy data x_n

Prediction network ε̂_n(x_n), trained to minimize the mean squared error: min_{ε̂_n} E‖ε̂_n(x_n) − ε_n‖₂²
Prediction network h_n(x_n), trained to minimize the mean squared error against the squared noise ε_n²: min_{h_n} E‖h_n(x_n) − ε_n²‖₂²

Optimal covariance estimate based on the predicted squared noise: plug h_n(x_n) in for E[ε_n²] and ε̂_n(x_n) in for E[ε_n].


Bao et al. Estimating the Optimal Covariance with Imperfect Mean in Diffusion Probabilistic Models, ICML 2022.

Theorem (the optimal solution for diagonal covariance, i.e., Σ_n(x_n) = diag(σ_n(x_n)²))
The optimal solution to min_{σ_n(⋅)²} KL(q(x_0:N) ‖ p(x_0:N)) with an imperfect mean is

  σ̃_n*(x_n)² = (β̄_{n−1}/β̄_n) β_n + (β_n²/(β̄_n α_n)) E_{q(x_0|x_n)}[(ε_n − ε̂_n(x_n))²].

The expectation in the second term is the noise prediction residual (NPR).

In general, the mean μ_n(x_n) = (1/√α_n)(x_n − (β_n/√β̄_n) ε̂_n(x_n)) is not optimal, due to approximation or optimization error in ε̂_n(x_n).
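The NPR term can be estimated by Monte Carlo in a toy sketch (all names are mine; in practice a trained network g_n(x_n) predicts it, rather than averaging over samples):

```python
import numpy as np

def npr_var(beta_n, beta_bar_nm1, beta_bar_n, eps_samples, eps_hat):
    """NPR-based variance (sketch): a Monte Carlo average of
    (eps_n - eps_hat(x_n))^2 over draws of eps_n given x_n, plugged into
    the imperfect-mean covariance formula."""
    alpha_n = 1.0 - beta_n
    npr = np.mean((eps_samples - eps_hat) ** 2, axis=0)  # E[(eps - eps_hat)^2]
    return (beta_bar_nm1 / beta_bar_n) * beta_n \
        + (beta_n ** 2 / (beta_bar_n * alpha_n)) * npr

rng = np.random.default_rng(1)
eps = rng.standard_normal((10000, 3))     # stand-in draws of eps_n given x_n
v_perfect = npr_var(0.1, 0.8, 0.82, eps, eps.mean(axis=0))
v_biased = npr_var(0.1, 0.8, 0.82, eps, eps.mean(axis=0) + 1.0)
# A biased noise predictor inflates the NPR term, hence the optimal variance:
# exactly the imperfect-mean correction the theorem is about.
```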


 Implementation framework for predicting the NPR

Optimal covariance expression: σ̃_n*(x_n)² = (β̄_{n−1}/β̄_n) β_n + (β_n²/(β̄_n α_n)) E[(ε_n − ε̂_n(x_n))²]

Data x_0 + Gaussian noise ε_n → noisy data x_n

Prediction network ε̂_n(x_n), trained to minimize the mean squared error: min_{ε̂_n} E‖ε̂_n(x_n) − ε_n‖₂²
Prediction network g_n(x_n), trained to minimize the mean squared error against the noise residual (ε̂_n(x_n) − ε_n)²: min_{g_n} E‖g_n(x_n) − (ε̂_n(x_n) − ε_n)²‖₂²

Optimal covariance estimate based on the predicted noise residual: plug g_n(x_n) in for E[(ε_n − ε̂_n(x_n))²].
Song et al. Score-based generative modeling through stochastic differential equations, ICLR 2021.

• The continuous-timestep version (SDE)

• q(x_0, …, x_N) becomes a forward SDE with a corresponding reverse-time SDE:
  dx = f(t) x dt + g(t) dw  ↔  dx = [f(t) x − g(t)² ∇log q_t(x)] dt + g(t) dw̄

• p(x_0, …, x_N) becomes the model SDE:
  dx = [f(t) x − g(t)² s_t(x)] dt + g(t) dw̄
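A toy Euler-Maruyama sampler for the model SDE (a sketch under the assumption that f, g, and the score s_t are given as plain functions):

```python
import numpy as np

def reverse_sde_sample(score, f, g, dim, T=1.0, steps=200, seed=0):
    """Euler-Maruyama discretization of
    dx = [f(t) x - g(t)^2 s_t(x)] dt + g(t) dw_bar, run from t = T down to 0."""
    rng = np.random.default_rng(seed)
    dt = T / steps
    x = rng.standard_normal(dim)                     # x_T ~ N(0, I)
    for i in range(steps):
        t = T - i * dt
        drift = f(t) * x - g(t) ** 2 * score(x, t)
        x = x - drift * dt + g(t) * np.sqrt(dt) * rng.standard_normal(dim)
    return x

# Toy check: with f(t) = -b/2, g(t) = sqrt(b), and data ~ N(0, I), every
# marginal q_t is N(0, I) and the true score is s_t(x) = -x, so the samples
# should also look standard normal.
b = 1.0
samples = np.array([
    reverse_sde_sample(lambda x, t: -x, lambda t: -0.5 * b,
                       lambda t: np.sqrt(b), dim=1, seed=s)[0]
    for s in range(500)
])
```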


Conditional DPMs: Paired Data
We have pairs of (𝑥0 , 𝑐), where 𝑥0 is the data and 𝑐 is the condition.
The goal is to learn the unknown conditional data distribution 𝑞(𝑥0 |𝑐).



Conditional Model
 Original model s_n(x_n) → conditional model s_n(x_n|c)

 Training: min_{s_n} E_c E_n β̄_n E_{q_n(x_n|c)} ‖s_n(x_n|c) − ∇log q_n(x_n|c)‖²

 Conditional DPM:
  Discrete time: p(x_{n−1}|x_n, c) = N(μ_n(x_n|c), Σ_n(x_n)), with μ_n(x_n|c) = (1/√α_n)(x_n + β_n s_n(x_n|c))
  Continuous time: dx = [f(t) x − g(t)² s_t(x|c)] dt + g(t) dw̄

 Challenge: design the model architecture s_n(x_n|c)
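The training objective is intractable as written, but the standard denoising-score-matching trick replaces the unknown marginal score ∇log q_n(x_n|c) with the known conditional score ∇log q(x_n|x_0) = −ε/√(1−ᾱ_n). A sketch for one timestep (all names are illustrative):

```python
import numpy as np

def dsm_loss(score_model, x0, c, alpha_bar, rng):
    """Denoising score matching for one timestep: sample
    x_n = sqrt(abar) x_0 + sqrt(1-abar) eps and regress the model onto
    grad log q(x_n | x_0) = -eps / sqrt(1 - abar), a tractable surrogate
    for the marginal score grad log q_n(x_n | c)."""
    eps = rng.standard_normal(x0.shape)
    xn = np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps
    target = -eps / np.sqrt(1.0 - alpha_bar)
    return float(np.mean((score_model(xn, c) - target) ** 2))

# With a one-point data distribution per class, q_n(x_n|c) is Gaussian and
# the exact conditional score attains (numerically) zero loss.
rng = np.random.default_rng(0)
means, c, abar = {0: -2.0, 1: 3.0}, 1, 0.7
x0 = np.full(8, means[c])
oracle = lambda xn, cc: -(xn - np.sqrt(abar) * means[cc]) / (1.0 - abar)
loss = dsm_loss(oracle, x0, c, abar, rng)
```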


Discriminative Guidance
 Exact reverse SDE: dx = [f(t) x − g(t)² ∇log q_t(x|c)] dt + g(t) dw̄

 By Bayes' rule: ∇log q_t(x|c) = ∇log q_t(x) + ∇log q_t(c|x)
   The first term is approximated by the original DPM; the second by a discriminative model, whose training uses the paired data.

 Conditional score-based SDE:
  dx = [f(t) x − g(t)² (s_t(x) + ∇log p_t(c|x))] dt + g(t) dw̄

 Benefit: many discriminative models have well-studied architectures
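The Bayes-rule decomposition can be verified numerically on a 1-D toy (everything below is illustrative): for a two-component Gaussian mixture, the mixture score plus the classifier gradient recovers the class-conditional score.

```python
import numpy as np

M = 2.0  # component means at -M and +M, equal priors, unit variance

def log_q(x):       # log q_t(x) for the mixture, up to an additive constant
    return np.logaddexp(-0.5 * (x + M) ** 2, -0.5 * (x - M) ** 2)

def log_post(x):    # log q_t(c=1 | x) via Bayes: log q(x|c=1) - log q(x) + const
    return -0.5 * (x - M) ** 2 - log_q(x)

def grad(fn, x, h=1e-5):    # central finite difference
    return (fn(x + h) - fn(x - h)) / (2.0 * h)

x = 0.7
guided = grad(log_q, x) + grad(log_post, x)   # grad log q(x) + grad log q(c|x)
conditional = -(x - M)                        # grad log q(x|c=1) for N(M, 1)
# guided and conditional agree up to finite-difference error.
```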


Scaled Discriminative Guidance
 Exact reverse SDE: dx = [f(t) x − g(t)² (∇log q_t(x) + ∇log q_t(c|x))] dt + g(t) dw̄

 Scaled discriminative guidance:
  dx = [f(t) x − g(t)² (∇log q_t(x) + λ ∇log q_t(c|x))] dt + g(t) dw̄

 Conditional score-based SDE:
  dx = [f(t) x − g(t)² (s_t(x) + λ ∇log p_t(c|x))] dt + g(t) dw̄
  dx = [f(t) x − g(t)² (s_t(x|c) + λ ∇log p_t(c|x))] dt + g(t) dw̄


Conditioned on label

Dhariwal et al. Diffusion Models Beat GANs on Image Synthesis



Self Guidance
Ho et al. Unconditional Diffusion Guidance

 Scaled discriminative guidance: dx = [f(t) x − g(t)² (∇log q_t(x) + λ ∇log q_t(c|x))] dt + g(t) dw̄
   (the guidance term ∇log q_t(c|x) would require an extra discriminative model)

 ∇log q_t(c|x) = ∇log q_t(x|c) − ∇log q_t(x)
 Learn the conditional & unconditional model together:
  Introduce a token ∅ and use s_t(x|∅) to represent the unconditional case
 Conditional score-based SDE:
  dx = [f(t) x − g(t)² (s_t(x|∅) + λ (s_t(x|c) − s_t(x|∅)))] dt + g(t) dw̄

 Training:
  min_{s_n(⋅)} E_c E_n β̄_n E_{q_n(x_n|c)} ‖s_n(x_n|c) − ∇log q_n(x_n|c)‖²  (conditional loss)
       + λ E_n β̄_n E_{q_n(x_n)} ‖s_n(x_n|∅) − ∇log q_n(x_n)‖²  (unconditional loss)
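Combining the two scores is one line (a sketch; the arrays stand in for network outputs):

```python
import numpy as np

def cfg_score(s_cond, s_uncond, lam):
    """Self (classifier-free) guidance: s(x|null) + lam * (s(x|c) - s(x|null)).
    lam = 1 recovers the plain conditional score; lam > 1 over-emphasizes c,
    and lam = 0 falls back to the unconditional model."""
    return s_uncond + lam * (s_cond - s_uncond)

s_c = np.array([2.0, -1.0])   # stand-in for s_t(x|c)
s_u = np.array([1.0, 0.0])    # stand-in for s_t(x|null)
guided = cfg_score(s_c, s_u, lam=3.0)
```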
Saharia et al. Image Super-Resolution via Iterative Refinement

Application: Image Super-Resolution

 Paired data (x_0, c): x_0 is the high-resolution image, c is the low-resolution image

 Learn a conditional model s_n(x_n|c)

 Architecture: s_n(x_n|c) = UNet(cat(x_n, c′), n), where c′ is the bicubic interpolation of c
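A sketch of the conditioning plumbing only (the UNet itself is out of scope; nearest-neighbor upsampling stands in for the paper's bicubic interpolation):

```python
import numpy as np

def sr3_input(x_n, c, scale):
    """SR3-style conditioning: upsample the low-res image c to the target
    resolution and concatenate it channel-wise with the noisy image x_n.
    Arrays are (channels, height, width); nearest-neighbor upsampling is a
    stand-in for the bicubic interpolation used in the paper."""
    c_up = np.repeat(np.repeat(c, scale, axis=1), scale, axis=2)
    assert c_up.shape[1:] == x_n.shape[1:]
    return np.concatenate([x_n, c_up], axis=0)

x_n = np.zeros((3, 8, 8))          # noisy high-res image
c = np.ones((3, 2, 2))             # low-res condition
inp = sr3_input(x_n, c, scale=4)   # 6-channel UNet input
```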


Saharia et al. Image Super-Resolution via Iterative Refinement

Application: Image Super-Resolution



Nichol et al. GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models

Application: Text to Image


 The dataset contains pairs (x_0, c), where x_0 is an image and c is text
 Techniques: conditional model with self-guidance
 Challenge: design s_t(x|c)


Nichol et al. GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models

Application: Text to Image


 Architecture of s_t(x|c): UNet + Transformer
   Other details — dataset: the same as DALL-E; #parameters: 2.3 billion at 64×64 resolution

 The UNet encodes the image x

 The Transformer encodes the text c, and the embedding is injected into the UNet:

  The token embedding is injected after group normalization in each Res Block
  The token embedding is concatenated to the attention context in the UNet


Amit et al. SegDiff: Image Segmentation with Diffusion Probabilistic Models

Application: Segmentation
 Paired data (x_0, c): x_0 is the segmentation map, c is the image
 s_t(x|c) = UNet(F(x) + G(c), t)


Conditional DPMs: Unpaired Data
We only have a set of 𝑥0 (data).
The goal is to construct a conditional distribution 𝑝(𝑥0 |𝑐).



Energy Guidance
 Unconditional DPM trained from a set of x_0 (data):
  dx = [f(t) x − g(t)² s_t(x)] dt + g(t) dw̄

 A strategy to construct p(x_0|c) is to insert an energy function:
  dx = [f(t) x − g(t)² (s_t(x) − ∇E_t(x, c))] dt + g(t) dw̄,   x_T ∼ p(x_T|c)

 The generated data tends to have low energy E_t(x, c)

 The energy depends on the specific application


Energy Guidance
 Pros:
  Provides a framework for incorporating domain knowledge into DPMs

 Cons:
  p(x_0|c) is a black box
  The energy design is based on intuition


Application: Text to Image
 High-level idea: define the energy as a negative similarity between image and text

 CLIP provides a model to measure the similarity between images and texts:
  Similarity: sim(x, c) = f(x) · g(c)
  Energy: E_t(x, c) = −sim(x, c)

Nichol et al. GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models


Application: Text to Image

(Figure: generated samples comparing energy guidance with self guidance.)


Vikash et al. Generating High Fidelity Data from Low-density Regions using Diffusion Models

Application: Generate Low-Density Images

Samples from the SDE are more similar to the high-density part of the dataset.

(Figure: dataset vs. samples from the SDE of s_t(x|c).)


Vikash et al. Generating High Fidelity Data from Low-density Regions using Diffusion Models

Application: Generate Low-Density Images

 Original SDE: dx = [f(t) x − g(t)² s_t(x|c)] dt + g(t) dw̄

 New SDE: dx = [f(t) x − g(t)² (s_t(x|c) − ∇E_t(x, c))] dt + g(t) dw̄

 High-level intuition: small energy ⇔ x is far from the dense region of class c

  E_t(x, c) = sim(x, c) = f(x) · μ_c
  f is an image encoder and μ_c is the averaged embedding of class c
  Empirically, a contrastive version of this loss is used


Vikash et al. Generating High Fidelity Data from Low-density Regions using Diffusion Models

Application: Generate Low-Density Images

(Figure: dataset; samples from the SDE of s_t(x|c); samples from s_t(x|c) − ∇E_t(x, c).)


Meng et al. Image Synthesis and Editing with Stochastic Differential Equations

Application: Image2Image Translation

 c is the reference image
 s_t(x) is a DPM on the target domain

 dx = [f(t) x − g(t)² s_t(x)] dt + g(t) dw̄,   x_{t₀} ∼ p(x_{t₀}|c)

 No energy guidance; c only influences the start distribution:
  Choose an early start time t₀ < T
  p(x_{t₀}|c) is a Gaussian perturbation of c
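The start distribution can be sketched directly from the forward marginal (the function name and toy schedule are mine):

```python
import numpy as np

def sdedit_start(c, t0, alpha_bar, rng):
    """SDEdit-style initialization (sketch): instead of starting from pure
    noise at time T, perturb the reference image c with the forward marginal
    q(x_t0 | c) = N(sqrt(abar(t0)) c, (1 - abar(t0)) I), then run the usual
    reverse process from t0 < T so the result stays faithful to c."""
    abar = alpha_bar(t0)
    return np.sqrt(abar) * c + np.sqrt(1.0 - abar) * rng.standard_normal(c.shape)

rng = np.random.default_rng(0)
c = np.linspace(-1, 1, 16)                # reference "image" (illustrative)
abar = lambda t: np.exp(-10.0 * t)        # toy VP-style schedule, abar(0) = 1
x_mid = sdedit_start(c, 0.3, abar, rng)   # partial noise: c still visible
x_zero = sdedit_start(c, 0.0, abar, rng)  # t0 = 0 returns c exactly
```

Picking t₀ trades off faithfulness (small t₀ keeps the strokes of c) against realism (large t₀ lets the target-domain DPM repaint more).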


Meng et al. Image Synthesis and Editing with Stochastic Differential Equations

Application: Image2Image Translation


𝑝(𝑥𝑡0 |𝑐) is a Gaussian perturbation of 𝑐

Stroke to painting



DPMs for Downstream Tasks
Regard DPMs as pretrained models (feature extractors)



Dmitry et al. Label-Efficient Semantic Segmentation with Diffusion Models

DPMs for Downstream Segmentation

DPM features already yield an unsupervised segmentation.


Dmitry et al. Label-Efficient Semantic Segmentation with Diffusion Models

DPMs for Downstream Segmentation

 Use features from DPMs at different layers and timesteps.
 Finetune an MLP on top of these features.
 Only a small amount of segmented data is required.


DPMs for Other Domains



DPMs for Other Domains
 Text to speech
  Vadim et al. Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech

 Video generation
  Ho et al. Video Diffusion Models

 Industrial anomaly detection
  Yana et al. TFDPM: Attack detection for cyber-physical systems with diffusion probabilistic models


DPMs for Other Domains
 Point cloud
  Lyu et al. A Conditional Point Diffusion-Refinement Paradigm for 3D Point Cloud Completion


DPMs for Science
 Molecular dynamics
  Wang et al. From data to noise to data: mixing physics across temperatures with generative artificial intelligence
  Hoogeboom et al. Equivariant Diffusion for Molecule Generation in 3D


DPMs for Science



DPMs for Science
 Medical
  Aviles-Rivero et al. Multi-Modal Hypergraph Diffusion Network with Dual Prior for Alzheimer Classification


Thanks!
