Lecture # 6 Latent Variable Models

The document presents a lecture on Latent Variable Models, focusing on Autoencoders and Variational Inference. It explains the concepts of observed and latent variables, the architecture and applications of autoencoders, and the training methods for latent variable models. Additionally, it discusses the importance of variational autoencoders in generating new data instances while maintaining the properties of the latent space.

National University of Computer and Emerging Sciences

Latent Variable Models

AI-4009 Generative AI

Dr. Akhtar Jamil


Department of Computer Science



Goals
• Today’s Lecture
– Latent Variable Models
– Autoencoders
– Variational Inference with autoencoders



Probabilistic models
Why would we want to do this?



Generative models
Today: can we go from “language models” to “everything models”?

This is called unsupervised learning

Just different ways to solve the same problem!

(Figure: an autoregressive language model predicting “I” “think” “therefore” “I” “am” <EOS> one token at a time, conditioned on the previous tokens and a <START> token, which is not actually a variable.)

Why would we want to do this? Same reasons as language modeling!
 Unsupervised pretraining on lots of data
 Representation learning
 Pretraining for later finetuning
 Actually generating things!
Can we “language model” images?

Before: time step = word. Now: time step = pixel, under some (arbitrary) ordering on the pixels.

(Figure: asking the “image language model” to fill in the blanks, generating an image one pixel at a time starting from a <START> token.)

This is basically the main idea, but there are some details we need to figure out!
 How to order the pixels?
 What kind of model to use?

Van den Oord et al. Pixel Recurrent Neural Networks. 2016.
Types of Variables
• Observed vs. Latent:
• Observed: something that we can see from our data, e.g. X or Y
• Latent: a variable that we assume exists, but we aren’t given the
value
• Deterministic vs. Random:
• Deterministic: variables that are calculated directly according to
some deterministic function
• Random (stochastic): variables that obey a probability
distribution, and may take any of several (or infinite) values



Latent Variable Models
• A latent variable model (LVM) is a probability distribution over two sets of variables, p(x, z; θ),
• where the x variables are observed at learning time in a dataset and z are latent variables.
  – The prior p(z) should be a simple distribution
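In symbols (standard notation; the prior p(z) is the “simple distribution” referred to above):

p_\theta(x, z) = p(z)\, p_\theta(x \mid z),
\qquad
p_\theta(x) = \int p_\theta(x \mid z)\, p(z)\, dz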


Latent variable models

(Figure: a mixture model, where the latent variable z selects the mixture element, e.g. which component of a Gaussian mixture generates x.)


Latent variable models in general

(Figure: the prior p(z) is an “easy” distribution, e.g. a Gaussian; the conditional p(x|z) is also an “easy” distribution, e.g. a conditional Gaussian.)


An Example (Doersch 2016)

(Figure: a deterministic function f maps each latent z to a point x in data space.)
Latent variable models in deep learning

A latent variable deep generative model is (usually) just a model that turns random numbers into valid samples (e.g., images).

There are many types of such models: VAEs, GANs, normalizing flows, etc.

Using the model for generation:
1. “generate a vector of random numbers”
2. “turn that vector of random numbers into an image”
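A minimal sketch of that two-step recipe in PyTorch; the decoder here is a stand-in (a couple of linear layers), not the lecture's actual model:

import torch
import torch.nn as nn

latent_dim = 256  # size of the random-number vector (an assumption)

# Stand-in decoder producing a 3x64x64 image; the lecture's decoder
# would use transpose convolutions instead of linear layers.
decoder = nn.Sequential(
    nn.Linear(latent_dim, 1024), nn.ReLU(),
    nn.Linear(1024, 3 * 64 * 64), nn.Sigmoid(),
)

z = torch.randn(16, latent_dim)              # step 1: a batch of random latent vectors
with torch.no_grad():
    images = decoder(z).view(16, 3, 64, 64)  # step 2: decode them into images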


Representing latent variable models

what architecture should we use?

Easy choice: just a big fully connected network (linear layers + ReLU)
works well for tiny images (e.g., MNIST) or non-image data

Better choice: transpose convolutions



Representing latent variable models
• Transposed convolutions (also called fractionally-strided convolutions or, loosely, deconvolutions)
• Transposed convolutional layers increase the spatial dimensions of their input
  – The opposite of conventional convolutions
• Equivalent to inserting zeros between the input values (up-sampling or dilating) and then convolving
• Used in VAEs, GANs, etc.
  – Upsampling, image generation, and semantic segmentation (see the sketch below)
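A minimal PyTorch sketch of a stride-2 transposed convolution doubling the spatial dimensions (the layer sizes here are illustrative, not the lecture's architecture):

import torch
import torch.nn as nn

# A transposed convolution with kernel 4, stride 2, padding 1 exactly doubles H and W.
up = nn.ConvTranspose2d(in_channels=64, out_channels=32, kernel_size=4, stride=2, padding=1)

x = torch.randn(1, 64, 8, 8)   # a small 8x8 feature map with 64 channels
y = up(x)
print(y.shape)                 # torch.Size([1, 32, 16, 16]) -- spatial size doubled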


Training latent variable models

• Variational autoencoders (VAEs)
• Normalizing flows
• Generative adversarial networks (GANs)
How do we train latent variable models?



Estimating the log-likelihood

this is called probabilistic inference

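In symbols, maximum-likelihood training of the latent variable model maximizes the marginal log-likelihood, and evaluating it involves the posterior over z, which is what probabilistic inference computes:

\theta \leftarrow \arg\max_\theta \frac{1}{N} \sum_{i=1}^{N} \log p_\theta(x_i),
\qquad
\log p_\theta(x_i) = \log \int p_\theta(x_i \mid z)\, p(z)\, dz,
\qquad
p_\theta(z \mid x_i) = \frac{p_\theta(x_i \mid z)\, p(z)}{p_\theta(x_i)}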


The variational approximation



The variational approximation

Jensen’s inequality
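The bound being built here is the standard variational bound: introduce an approximate distribution q(z) and apply Jensen's inequality (the log of an expectation is at least the expectation of the log, since log is concave):

\log p(x) = \log \int p(x \mid z)\, p(z)\, dz
          = \log \mathbb{E}_{z \sim q(z)}\!\left[\frac{p(x \mid z)\, p(z)}{q(z)}\right]
          \ge \mathbb{E}_{z \sim q(z)}\!\left[\log \frac{p(x \mid z)\, p(z)}{q(z)}\right]
          = \mathbb{E}_{z \sim q(z)}\big[\log p(x \mid z) + \log p(z)\big] + \mathcal{H}(q)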


Entropy

(Figure: two example distributions, one with low entropy and one with high entropy.)

Intuition 1: how random is the random variable?
Intuition 2: how large is the negative log probability in expectation under its own distribution?

A high-entropy distribution is as wide (spread out) as possible.
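The definition behind both intuitions:

\mathcal{H}(p) = -\,\mathbb{E}_{x \sim p(x)}[\log p(x)] = -\int p(x)\,\log p(x)\,dx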


KL-Divergence

Intuition 1: how different are two distributions?

Intuition 2: how small is the expected log probability of one distribution under the other, minus the entropy?

Why the entropy term? Maximizing the expected log probability alone would concentrate all of the mass on the most likely point; the entropy term rewards keeping the distribution as wide as possible.
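The definition, in the same notation:

D_{KL}(q \,\|\, p) = \mathbb{E}_{x \sim q(x)}\!\left[\log \frac{q(x)}{p(x)}\right]
                   = -\,\mathbb{E}_{x \sim q(x)}[\log p(x)] - \mathcal{H}(q) \;\ge\; 0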


The variational approximation

Evidence Lower Bound (ELBO)


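Written out (standard form; q(z|x) is the approximate posterior):

\log p(x) \;\ge\; \mathcal{L}(x)
  = \mathbb{E}_{z \sim q(z \mid x)}[\log p(x \mid z)] - D_{KL}\big(q(z \mid x)\,\|\,p(z)\big)
  = \mathbb{E}_{z \sim q(z \mid x)}[\log p(x, z)] + \mathcal{H}\big(q(z \mid x)\big)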


The reparameterization trick

Is there a better way?

Most autodiff software (e.g., TensorFlow) will compute this for you!
The variational approximation

This often has a convenient analytical form (e.g., the KL-divergence between Gaussians).
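For example, with a Gaussian approximate posterior q(z|x) = N(μ, diag(σ²)) and a standard normal prior p(z) = N(0, I), the KL term is available in closed form:

D_{KL}\big(\mathcal{N}(\mu, \operatorname{diag}(\sigma^2)) \,\big\|\, \mathcal{N}(0, I)\big)
= \tfrac{1}{2} \sum_{j=1}^{d} \left( \sigma_j^2 + \mu_j^2 - 1 - \log \sigma_j^2 \right)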


Reparameterization trick
• Only works for continuous latent variables
• Very simple to implement
• Low variance
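A minimal PyTorch sketch of the trick, assuming the encoder outputs a mean and a log-variance for each latent dimension (the function and variable names are illustrative, not from the lecture):

import torch

def reparameterize(mu, log_var):
    """Sample z ~ N(mu, sigma^2) as a deterministic function of (mu, log_var) and noise."""
    std = torch.exp(0.5 * log_var)   # sigma = exp(log_var / 2)
    eps = torch.randn_like(std)      # noise from N(0, I), independent of the parameters
    return mu + eps * std            # gradients flow through mu and std, not through eps

mu = torch.zeros(4, 256, requires_grad=True)
log_var = torch.zeros(4, 256, requires_grad=True)
z = reparameterize(mu, log_var)      # differentiable w.r.t. mu and log_var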


Autoencoders
• Autoencoders are simple neural network architectures.
• They compress the data and then reconstruct the original data from the compressed representation.
  – Think of MP3 compression for audio, or JPEG compression for images.




Autoencoders
• An autoencoder is a type of artificial neural network used to learn efficient data codings in an unsupervised manner.
• The aim of an autoencoder is to learn a representation (encoding) of the data.
  – Representation learning
• The network can learn the most important features and ignore noise.
  – Dimensionality reduction


Autoencoders: Applications
• 1. Dimensionality Reduction: Reduce the dimensionality of the data for
visualization or to improve computational efficiency for other tasks.
• 2. Feature Learning: Autoencoders can learn useful features automatically
from the input data, which can then be used for tasks like classification,
regression, or further unsupervised learning.
• 3. Anomaly Detection: By learning to reproduce normal data, autoencoders
can be used to detect anomalies or outliers by examining the reconstruction
error.
• 4. Denoising: Denoising autoencoders are trained to remove noise from data.
• 5. Data Generation: Variational autoencoders (VAEs) generate new data that is
similar to the training data. This is useful for generating synthetic data for
training machine learning models when actual data is scarce.
Autoencoders: Applications
• 6. Image Processing: such as image colorization, image resolution
enhancement (super-resolution), and image inpainting.
• 7. Sequence-to-sequence Modeling:
– Neural machine translation, text summarization, and dialogue systems.
• 8. Feature Transfer: Trained autoencoder networks can be used for transferring
features learned on one dataset to another dataset, which is beneficial when the
second dataset is too small to train a deep network effectively.
• 9. Unsupervised Pre-training: pre-train the weights of neural networks in an
unsupervised manner to improve the performance of subsequent supervised
learning tasks.
• 10. Compression: Autoencoders can learn a compressed representation of
data, which could potentially be used for efficient storage and transmission.



Deeper Autoencoders



Variational Autoencoders



Variational Autoencoders (VAE)
• A variational autoencoder is an autoencoder whose training is
regularized to avoid overfitting and ensure that the latent space
has good properties that enable the generative process.
• Just like a standard autoencoder, a variational autoencoder is an
architecture composed of both an encoder and a decoder
• It is designed to handle the issue of generating new instances that
are varied yet still resemble the input data.
• Instead of encoding an input as a single point, we encode it as a
distribution over the latent space.



Common VAE Architecture



VAEs with convolutions

(Figure: the encoder takes a 64x64x3 image through three stride-2 convolutions
(5x5x32, 3x3x64, 3x3x128 filters, giving 30x30x32, 14x14x64, and 6x6x128 feature
maps), then fully connected layers (1024 -> 256) producing a 256-dimensional
latent code. The decoder mirrors this with fully connected layers and transpose
convolutions, outputting an (independent) mean and variance for each pixel.)

Question: can we design a fully convolutional VAE?
Yes, but be careful with the latent codes!


Some background first: Autoencoders

Unsupervised approach for learning a lower-dimensional feature representation from unlabeled training data.

(Figure: input data x -> Encoder -> features z.)

z is usually smaller than x (dimensionality reduction).
Q: Why dimensionality reduction?
A: We want the features to capture meaningful factors of variation in the data.

Encoder: originally linear + nonlinearity (sigmoid); later deep, fully-connected; later ReLU CNN.
Some background first: Autoencoders

How to learn this feature representation?
Train such that the features can be used to reconstruct the original data.
“Autoencoding”: encoding itself.

(Figure: input data -> Encoder -> features -> Decoder -> reconstructed input data.)

Encoder/decoder: originally linear + nonlinearity (sigmoid); later deep, fully-connected; later ReLU CNN (upconv).

Some background first: Autoencoders

How to learn this feature representation?
Train such that the features can be used to reconstruct the original data.
“Autoencoding”: encoding itself.

Encoder: 4-layer conv. Decoder: 4-layer upconv.

(Figure: input data -> Encoder -> features -> Decoder -> reconstructed input data.)

Some background first: Autoencoders

Train such that the features can be used to reconstruct the original data.

L2 loss function: || x - x̂ ||²

Encoder: 4-layer conv. Decoder: 4-layer upconv.

(Figure: input data x -> Encoder -> features -> Decoder -> reconstructed input data x̂.)
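A minimal PyTorch sketch of this training setup with the L2 reconstruction loss; the small fully connected encoder/decoder below is illustrative, not the slide's 4-layer conv/upconv architecture:

import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 32))
decoder = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 784))
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)

x = torch.rand(64, 784)            # a stand-in minibatch (e.g., flattened 28x28 images)
z = encoder(x)                     # features
x_hat = decoder(z)                 # reconstructed input data
loss = ((x - x_hat) ** 2).mean()   # L2 reconstruction loss
opt.zero_grad()
loss.backward()
opt.step()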
Some background first: Autoencoders

After training, the encoder can be used to initialize a supervised model: attach a classifier on top of the learned features and train for the final task (sometimes with small data), fine-tuning the encoder jointly with the classifier.

(Figure: input data -> Encoder -> features -> Classifier -> predicted label, e.g. bird, plane, dog, deer, truck; loss function: softmax, etc.)
Some background first: Autoencoders

Autoencoders can reconstruct data, and can learn features to initialize a supervised model. The features capture factors of variation in the training data. Can we generate new images from an autoencoder?

(Figure: input data -> Encoder -> features -> Decoder -> reconstructed input data.)
Variational Autoencoders

Probabilistic spin on autoencoders - will let us sample from the model to generate data!

Assume the training data is generated from an underlying unobserved (latent) representation z.

(Figure: sample z from the true prior, then sample x from the true conditional p(x|z).)

Intuition (remember from autoencoders!): x is an image, z is the latent factors used to generate x: attributes, orientation, etc.

Kingma and Welling, “Auto-Encoding Variational Bayes”, ICLR 2014
Variational Autoencoders

We want to estimate the true parameters of this generative model.
How should we represent this model?

Choose the prior p(z) to be simple, e.g. Gaussian.
The conditional p(x|z) is complex (it generates an image) => represent it with a neural network (the decoder network).

(Figure: decoder network mapping z, sampled from the true prior, to x, sampled from the true conditional.)

Kingma and Welling, “Auto-Encoding Variational Bayes”, ICLR 2014
Variational Autoencoders

We want to estimate the true parameters of this generative model.
How to train the model?

Learn the model parameters to maximize the likelihood of the training data.

Kingma and Welling, “Auto-Encoding Variational Bayes”, ICLR 2014
Variational Autoencoders: Intractability

• Data likelihood: intractable.
• Posterior density: also intractable.

• Solution: In addition to the decoder network modeling pθ(x|z), define an additional encoder network qɸ(z|x) that approximates pθ(z|x).

• We will see that this allows us to derive a lower bound on the data likelihood that is tractable, which we can optimize.

Kingma and Welling, “Auto-Encoding Variational Bayes”, ICLR 2014
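In the notation of Kingma and Welling (2014), the two intractable quantities are:

p_\theta(x) = \int p_\theta(z)\, p_\theta(x \mid z)\, dz,
\qquad
p_\theta(z \mid x) = \frac{p_\theta(x \mid z)\, p_\theta(z)}{p_\theta(x)}

Both require integrating over all z, either directly or through the p_\theta(x) in the denominator.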
Variational Autoencoders

Putting it all together: maximizing the likelihood lower bound.

(Figure, forward pass: input data x -> encoder network qɸ(z|x) -> sample z from qɸ(z|x) -> decoder network pθ(x|z) -> sample x|z from pθ(x|z). The reconstruction term maximizes the likelihood of the original input being reconstructed; the Kullback-Leibler divergence term makes the approximate posterior distribution close to the prior.)

For every minibatch of input data: compute this forward pass, and then backprop!
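A minimal PyTorch sketch of that per-minibatch forward pass and loss, assuming a Gaussian qɸ(z|x), the closed-form KL to a standard normal prior, and a squared-error reconstruction term (module names and sizes are illustrative):

import torch
import torch.nn as nn

latent_dim = 32
enc = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 2 * latent_dim))  # outputs [mu, log_var]
dec = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, 784), nn.Sigmoid())
opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)

x = torch.rand(64, 784)                                   # one minibatch of input data
mu, log_var = enc(x).chunk(2, dim=1)                      # encoder network q(z|x)
z = mu + torch.randn_like(mu) * torch.exp(0.5 * log_var)  # sample z (reparameterization trick)
x_hat = dec(z)                                            # decoder network p(x|z)

recon = ((x - x_hat) ** 2).sum(dim=1).mean()              # reconstruction term
kl = 0.5 * (log_var.exp() + mu**2 - 1 - log_var).sum(dim=1).mean()  # KL(q(z|x) || N(0, I))
loss = recon + kl                                         # negative ELBO
opt.zero_grad()
loss.backward()
opt.step()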
Variational Autoencoders



Variational Autoencoder Likelihood

Labeled Faces in the Wild



The variational autoencoder



VAEs in practice

Common issue: it is very tempting for VAEs (especially conditional VAEs) to ignore the latent codes, or to generate poor samples. Why?

Problem 1: the latent code is ignored.
What does this look like? Blurry “average” images when reconstructing.

Problem 2: the latent code is not compressed.
What does this look like? Garbage images when sampling.

The regularizer (KL) term is too low in Problem 1 and too high in Problem 2; this quantity needs to be controlled carefully to get good results!


VAEs in practice

Problem 1: the latent code is ignored (regularizer too low).
Problem 2: the latent code is not compressed (regularizer too high).

This quantity needs to be controlled carefully to get good results: add a multiplier to adjust the regularizer strength.
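One common way to write this is a weighted ELBO in the style of the β-VAE, where β is a label for the slide's regularizer-strength multiplier (larger β compresses the code more):

\mathcal{L}_\beta(x) = \mathbb{E}_{z \sim q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big]
\;-\; \beta\, D_{KL}\big(q_\phi(z \mid x)\,\|\,p(z)\big)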


Thank You 

