Variational Autoencoder
Table of Contents
Introduction
  What is our Goal?
  Latent Variable Models
  Brief discussion on Variational Autoencoders
Variational Autoencoders
  The Model and The Objective
  Evidence Lower Bound
  Optimization
  Reparameterization
Introduction
What is our Goal?
Suppose we have observed a random sample $x$ drawn from an unknown probability process with distribution $p(x)$. We want to find a distribution $\hat{p}_\theta(x)$ such that $\hat{p}_\theta(x) \approx p(x)$, where $\theta$ is the parameter to be estimated.
* From now on, I will write $p(x)$ for the distribution to be estimated, $\hat{p}_\theta(x)$, for notational convenience.
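As a toy illustration of this goal (not from the slides): if we choose a Gaussian family for $\hat{p}_\theta(x)$, estimating $\theta = (\mu, \sigma^2)$ by maximum likelihood reduces to computing the sample mean and variance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Pretend these are draws from the unknown process p(x).
x = rng.normal(loc=3.0, scale=2.0, size=10_000)

# Gaussian family for p_hat_theta(x) with theta = (mu, sigma^2);
# the maximum-likelihood estimates are the sample mean and variance.
theta = (x.mean(), x.var())
print(theta)  # approximately (3.0, 4.0)
```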
Latent Variable Models
We have:
1. A high-dimensional object of interest $x \in \mathcal{X}^D$.
2. Low-dimensional latent variables $z \in \mathcal{Z}^M$, often called the hidden factors in the data.
We can define the generative process as:
1. $z \sim p(z)$
2. $x \sim p(x \mid z)$
The joint probability distribution of $x$ and $z$ is then given by $p(x, z) = p(z)\, p(x \mid z)$.
To get the likelihood function $p(x)$, we marginalize $p(x, z)$ over $z$:
$$p(x) = \int p(x, z)\, dz = \int p(z)\, p(x \mid z)\, dz$$
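A minimal sketch of this two-step (ancestral) sampling procedure in PyTorch, assuming a standard Gaussian prior $p(z)$ and a hypothetical decoder network that parameterizes $p(x \mid z)$ as a Gaussian; the architecture and dimensions are illustrative, not taken from the slides.

```python
import torch
import torch.nn as nn

# Illustrative dimensions (not from the slides): M latent, D observed.
M, D = 2, 784

# Hypothetical decoder mapping z to the mean of a Gaussian p(x|z).
decoder = nn.Sequential(nn.Linear(M, 128), nn.ReLU(), nn.Linear(128, D))

# 1. z ~ p(z): sample the latent from the standard Gaussian prior.
z = torch.randn(1, M)

# 2. x ~ p(x|z): sample the observation given z (unit observation noise here).
x_mean = decoder(z)
x = x_mean + torch.randn_like(x_mean)
```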
Variational Autoencoders
The Model and The Objective
The integral $\int p(z)\, p(x \mid z)\, dz$ does not have an analytical solution that we can optimize.
To solve this, we introduce a parametric inference model $q_\phi(z)$, called the encoder or recognition model. The parameters $\phi$ are called variational parameters, and we will optimize $\phi$ such that $q_\phi(z) \approx p(z \mid x)$.
We can assume $q_\phi(z)$ to be Gaussian with mean $\mu$ and variance $\sigma^2$, that is, $\phi = \{\mu, \sigma^2\}$.
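A minimal sketch of such a Gaussian encoder, assuming a small feed-forward network whose layer sizes are illustrative: it maps $x$ to the mean $\mu$ and log-variance $\ln \sigma^2$ that define $q_\phi(z) = \mathcal{N}(\mu, \sigma^2)$.

```python
import torch
import torch.nn as nn

class GaussianEncoder(nn.Module):
    """Maps x to the parameters (mu, log sigma^2) of q_phi(z) = N(mu, sigma^2)."""
    def __init__(self, D=784, M=2, H=128):
        super().__init__()
        self.net = nn.Linear(D, H)
        self.mu = nn.Linear(H, M)
        self.logvar = nn.Linear(H, M)  # predict log sigma^2 so sigma > 0 by construction

    def forward(self, x):
        h = torch.relu(self.net(x))
        return self.mu(h), self.logvar(h)
```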
The Model and The Objective
Then,
$$
\begin{aligned}
\ln p(x) &= \ln \int p(x \mid z)\, p(z)\, dz \\
&= \ln \int \frac{q_\phi(z)}{q_\phi(z)}\, p(x \mid z)\, p(z)\, dz \\
&= \ln \mathbb{E}_{q_\phi(z)} \left[ \frac{p(x \mid z)\, p(z)}{q_\phi(z)} \right] \\
&\geq \mathbb{E}_{q_\phi(z)} \left[ \ln \frac{p(x \mid z)\, p(z)}{q_\phi(z)} \right] \qquad \text{(by Jensen's inequality)} \\
&= \mathbb{E}_{q_\phi(z)} \left[ \ln p(x \mid z) + \ln p(z) - \ln q_\phi(z) \right] \\
&= \mathbb{E}_{q_\phi(z)} \left[ \ln p(x \mid z) \right] - \mathbb{E}_{q_\phi(z)} \left[ \ln q_\phi(z) - \ln p(z) \right]
\end{aligned}
$$
The second term is exactly $\mathrm{KL}(q_\phi(z) \,\|\, p(z))$, so we obtain the lower bound $\mathbb{E}_{q_\phi(z)}[\ln p(x \mid z)] - \mathrm{KL}(q_\phi(z) \,\|\, p(z))$ on $\ln p(x)$: the evidence lower bound (ELBO).
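A numerical sanity check of this bound on a toy model (all choices illustrative, not from the slides): with $p(z) = \mathcal{N}(0, 1)$ and $p(x \mid z) = \mathcal{N}(z, 1)$, the marginal is $p(x) = \mathcal{N}(0, 2)$ in closed form, and a Monte Carlo estimate of the ELBO under any Gaussian $q$ should fall below $\ln p(x)$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model (illustrative): p(z) = N(0, 1), p(x|z) = N(z, 1), so p(x) = N(0, 2).
x = 0.5
log_px = -0.5 * (np.log(2 * np.pi * 2.0) + x ** 2 / 2.0)  # exact ln p(x)

def log_normal(v, mean, var):
    """Log-density of N(mean, var) evaluated at v."""
    return -0.5 * (np.log(2 * np.pi * var) + (v - mean) ** 2 / var)

# Variational distribution q(z) = N(mu, var) with arbitrary parameters.
mu, var = 0.2, 0.8
z = rng.normal(mu, np.sqrt(var), size=100_000)

# ELBO = E_q[ln p(x|z) + ln p(z) - ln q(z)], estimated by Monte Carlo.
elbo = (log_normal(x, z, 1.0) + log_normal(z, 0.0, 1.0)
        - log_normal(z, mu, var)).mean()

print(f"ln p(x) = {log_px:.4f}, ELBO estimate = {elbo:.4f}")
```

With $\mu = 0.25$ and $\sigma^2 = 0.5$, which is the exact posterior $p(z \mid x)$ for this toy model at $x = 0.5$, the gap closes and the ELBO estimate matches $\ln p(x)$.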
Evidence Lower Bound
$$
\begin{aligned}
\ln p(x) &= \mathbb{E}_{q_\phi(z \mid x)} \left[ \ln p(x) \right] \\
&= \mathbb{E}_{q_\phi(z \mid x)} \left[ \ln \frac{p(x, z)}{p(z \mid x)} \right] \\
&= \mathbb{E}_{q_\phi(z \mid x)} \left[ \ln \frac{p(x \mid z)\, p(z)}{p(z \mid x)} \right] \\
&= \mathbb{E}_{q_\phi(z \mid x)} \left[ \ln \left( p(x \mid z)\, \frac{p(z)}{q_\phi(z \mid x)}\, \frac{q_\phi(z \mid x)}{p(z \mid x)} \right) \right] \\
&= \mathbb{E}_{q_\phi(z \mid x)} \left[ \ln p(x \mid z) \right] - \mathrm{KL}(q_\phi(z \mid x) \,\|\, p(z)) + \mathrm{KL}(q_\phi(z \mid x) \,\|\, p(z \mid x))
\end{aligned}
$$
Since $\mathrm{KL}(q_\phi(z \mid x) \,\|\, p(z \mid x)) \geq 0$, the first two terms are again the ELBO, and the gap between $\ln p(x)$ and the ELBO is exactly the KL divergence between $q_\phi(z \mid x)$ and the true posterior $p(z \mid x)$. Maximizing the ELBO with respect to $\phi$ therefore pushes $q_\phi(z \mid x)$ toward $p(z \mid x)$.
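For the common choice of a Gaussian encoder $q_\phi(z \mid x) = \mathcal{N}(\mu, \operatorname{diag}(\sigma^2))$ and a standard normal prior $p(z) = \mathcal{N}(0, I)$ (the standard VAE setup, assumed here rather than stated on the slides), the KL term in the ELBO is available in closed form:
$$
\mathrm{KL}\left( \mathcal{N}(\mu, \operatorname{diag}(\sigma^2)) \,\|\, \mathcal{N}(0, I) \right) = \frac{1}{2} \sum_{j=1}^{M} \left( \mu_j^2 + \sigma_j^2 - \ln \sigma_j^2 - 1 \right)
$$
so only the reconstruction term $\mathbb{E}_{q_\phi(z \mid x)}[\ln p(x \mid z)]$ needs to be estimated by sampling.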
Optimization
Our goal is to maximize the ELBO with respect to $\theta$ and $\phi$ using stochastic gradient ascent. Alternatively, we can minimize the negative ELBO. It is easy to take gradients of the ELBO with respect to $\theta$ using automatic differentiation. Unfortunately, taking gradients with respect to $\phi$ is harder, since we need to take into account that the sampling process itself depends on $\phi$.
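A minimal PyTorch sketch of the difficulty (the setup is illustrative): drawing $z$ from $\mathcal{N}(\mu, \sigma^2)$ with a plain sampling call severs the computation graph, so no gradient reaches the variational parameters.

```python
import torch

# Variational parameters of q(z) = N(mu, sigma^2) (illustrative values).
mu = torch.tensor([0.0], requires_grad=True)
sigma = torch.tensor([1.0], requires_grad=True)

# Naive sampling: .sample() is a non-differentiable operation,
# so z carries no gradient information back to mu or sigma.
z = torch.distributions.Normal(mu, sigma).sample()
print(z.requires_grad)  # False -> gradients w.r.t. phi are lost
```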
Reparameterization
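The standard resolution is the reparameterization trick: write the sample as $z = \mu + \sigma \odot \epsilon$ with $\epsilon \sim \mathcal{N}(0, I)$, so that $z$ becomes a deterministic, differentiable function of $\phi = \{\mu, \sigma\}$ and the randomness is pushed into the parameter-free noise $\epsilon$. A minimal sketch, continuing the setup above:

```python
import torch

mu = torch.tensor([0.0], requires_grad=True)
sigma = torch.tensor([1.0], requires_grad=True)

# Reparameterization: z = mu + sigma * eps, with eps ~ N(0, I).
# The randomness lives in eps; z is now differentiable in mu and sigma.
eps = torch.randn_like(sigma)
z = mu + sigma * eps
print(z.requires_grad)  # True -> gradients w.r.t. phi flow through the sample

# torch.distributions exposes the same idea via .rsample().
z2 = torch.distributions.Normal(mu, sigma).rsample()
print(z2.requires_grad)  # True
```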
Thank You!