Sequence 12
“Variational Autoencoders (VAE),
or how to play with latent spaces”
v2.07
Course materials (pdf), corrected notebooks, videos (YouTube) : https://fidle.cnrs.fr
Practical work environment v3.0.9 (*)
(*) Procedure via Docker or pip ; remember to get the latest version !
Powered by CNRS CRIC and UGA DGDSI of Grenoble, thanks !
License : Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)
https://creativecommons.org/licenses/by-nc-nd/4.0/
Questions and answers :
https://fidle.cnrs.fr/q2a
Accompanied by :
the AI Support (dream) Team of IDRIS
Directed by :
Agathe, Baptiste and Yanis - UGA/DAPI
Thibaut, Kamel - IDRIS
https://fidle.cnrs.fr/listeinfo
Fidle information list
New !
http://fidle.cnrs.fr/agoria
AI exchange list
agoria@grenoble.cnrs.fr
https://listes.services.cnrs.fr/wws/info/devlog
List of ESR* « Software developers » group
https://listes.math.cnrs.fr/wws/info/calcul
List of ESR* « Calcul » group
(*) ESR stands for Enseignement Supérieur et Recherche : French universities and public academic research organizations
Previously on Fidle !
– Unsupervised learning is great !
– The world of convolutions is vast
– Don't forget 1D convolutions
– Keras 3 is good for you !
– PyTorch is good for you too ! :-)
– Multi-input models
– Multi-output models
– Inception models
– GPUs are good for you
– Think clean, structured programming !
– (and remember to document it !)
VAE network
➔ Concepts and architecture
➔ Gaussian / probabilistic projections
➔ Kullback-Leibler divergence
➔ Morphing in latent space
Applied example
➔ Deeper in programming !
➔ VAE using Functional API (MNIST)
➔ VAE using Model subclass (MNIST)
Concepts & architecture
We have seen that an autoencoder network is trained to minimize a reconstruction error :
[Figure : Input → Encoder → Latent space → Decoder → Output]
Autoencoder
An autoencoder network has 2 parts :
– an encoder, which projects the inputs into the latent space : z = encoder(inputs)
– a decoder, which reconstructs the outputs from the latent space : outputs = decoder(z)
The full model is assembled with the Keras functional API :
ae = keras.Model(inputs, outputs)
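As a minimal sketch (not the notebook's exact architecture : the layer sizes and the 28×28 MNIST input shape are only illustrative), the two parts could be assembled like this with Keras 3 :

import keras
from keras import layers

latent_dim = 2

# Encoder : inputs -> z
inputs = keras.Input(shape=(28, 28, 1))
x = layers.Flatten()(inputs)
x = layers.Dense(128, activation="relu")(x)
z = layers.Dense(latent_dim)(x)
encoder = keras.Model(inputs, z, name="encoder")

# Decoder : z -> outputs
latent_inputs = keras.Input(shape=(latent_dim,))
x = layers.Dense(128, activation="relu")(latent_inputs)
x = layers.Dense(28 * 28, activation="sigmoid")(x)
outputs = layers.Reshape((28, 28, 1))(x)
decoder = keras.Model(latent_inputs, outputs, name="decoder")

# Full autoencoder, trained to minimize the reconstruction error
ae = keras.Model(inputs, decoder(encoder(inputs)), name="ae")
ae.compile(optimizer="adam", loss="binary_crossentropy")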
Autoencoder
Example of the MNIST dataset distribution in its latent space, with z = encoder(inputs). Only two dimensions are represented, in abscissa and ordinate.
[Figure : 2D latent space showing the regions of the « 0 », « 1 » and « 6 » digits]
Clusters appear, but many of them are nested or very spread out.
How can we make our network better separate the different clusters ?
See : notebook [AE3]
Autoencoder (AE)
An autoencoder performs a direct projection into the latent space and an upsampling from this latent space ; z is a vector in the latent space.
Loss : binary cross-entropy, which measures the difference between input and output.
Variational Autoencoder (VAE)
μ is a mean, σ is a variance, z is a vector in the latent space.
The stochastic approach encourages the network to generate relevant output throughout the distribution space. This implies that the input data are statistically distributed.
Diederik P. Kingma; Welling, Max (2013). "Auto-Encoding Variational Bayes". https://arxiv.org/abs/1312.6114
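In practice (following Kingma & Welling), the encoder typically outputs μ and log σ², and z is drawn with the reparameterization trick, which keeps the sampling step differentiable :

% Reparameterization trick : sample z from N(mu, sigma^2) in a differentiable way
z = \mu + \sigma \odot \varepsilon , \qquad \varepsilon \sim \mathcal{N}(0, I) , \qquad \sigma = \exp\big(\tfrac{1}{2}\log\sigma^{2}\big)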
Variational Autoencoder (VAE)
Kullback-Leibler divergence* : the KL divergence between two (probability) distributions measures how much they diverge from each other. The VAE adds a KL loss term based on it.
Reconstruction loss : binary cross-entropy, measures the difference between input and output.
Total loss : totalloss = k1.rloss + k2.klloss
The trick is to find a nice compromise between these two components of the loss function.
(*) Special case for a standard normal distribution
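For this special case, i.e. the latent distribution \mathcal{N}(\mu, \sigma^{2}) measured against a standard normal \mathcal{N}(0, I), the KL term has a well-known closed form (d is the latent dimension) :

% Closed-form KL divergence to a standard normal, summed over the latent dimensions
D_{KL}\big(\mathcal{N}(\mu,\sigma^{2}) \,\|\, \mathcal{N}(0,I)\big) = -\tfrac{1}{2}\sum_{j=1}^{d}\big(1 + \log\sigma_{j}^{2} - \mu_{j}^{2} - \sigma_{j}^{2}\big)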
Variational Autoencoder (VAE)
totalloss = k1.rloss + k2.klloss
Variational Autoencoder (VAE)
Easy ! With totalloss = k1.rloss + k2.klloss : if k1 is too high, reconstruction is privileged ; if k2 is too high, agglomeration is privileged.
Variational Autoencoder (VAE)
Nice ! totalloss = k1.rloss + k2.klloss
[Figure : VAE2 latent space with k1, k2 = [1, 5e-4] versus k1, k2 = [1, 1e-3]]
Variational Autoencoder (VAE)
That's great ! We can generate data in profusion !!!
VAE network
➔ Concepts and architecture
➔ Gaussian / probabilistic projections
➔ Kullback-Leibler divergence
➔ Morphing in latent space
Applied example
➔ Deeper in programming !
➔ VAE using Functional API (MNIST)
➔ VAE using Model subclass (MNIST)
Our first VAE
Notebook : [VAE1/3]
Objectives :
implementing a VAE, using the Keras 3 functional API and model subclass, using real PyTorch !
Dataset :
MNIST
VAE1 | VAE2 | VAE3
[Notebook #1 workflow diagram : START → import and init → retrieve the MNIST dataset → build models → train → training review → END]
Objectives : a VAE using the Keras functional API and custom layers.
Modules used : MNIST, ImagesCallBack, BestModelCallBack, SamplingLayer, VariationalLossLayer.
Parameters : latent_dim = 2, loss_weights = [1, .001], scale = .1, seed = 123
About 20' on a CPU with scale = 1.
VAE1 | VAE2 | VAE3
The classical loss functions use only the input and output tensors.
Here, the loss calculation is more complex and will be done via a custom layer.
So we have two custom layers in this example :
SamplingLayer
VariationalLossLayer
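The SamplingLayer and VariationalLossLayer names come from the notebook's modules ; the code below is only a plausible Keras 3 sketch of what they do, not the Fidle implementation itself :

import keras
from keras import layers, ops, random

class SamplingLayer(layers.Layer):
    """Draw z = mu + sigma * epsilon (reparameterization trick)."""
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.seed_generator = random.SeedGenerator(123)

    def call(self, inputs):
        z_mean, z_log_var = inputs
        batch = ops.shape(z_mean)[0]
        dim   = ops.shape(z_mean)[1]
        epsilon = random.normal(shape=(batch, dim), seed=self.seed_generator)
        return z_mean + ops.exp(0.5 * z_log_var) * epsilon

class VariationalLossLayer(layers.Layer):
    """Add k1*reconstruction + k2*KL to the model losses, pass the outputs through."""
    def __init__(self, loss_weights=(1, 1e-3), **kwargs):
        super().__init__(**kwargs)
        self.k1, self.k2 = loss_weights

    def call(self, inputs):
        x, z_mean, z_log_var, y = inputs
        r_loss  = ops.mean(keras.losses.binary_crossentropy(x, y))
        kl_loss = -0.5 * ops.mean(1 + z_log_var - ops.square(z_mean) - ops.exp(z_log_var))
        self.add_loss(self.k1 * r_loss + self.k2 * kl_loss)
        return y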
VAE1 | VAE2 | VAE3
[Notebook #2 workflow diagram : START → import and init → retrieve the MNIST dataset → build model → train → model evaluation → training review → END]
Objectives : a VAE using the Keras subclass API and the MNIST dataset.
Modules used : MNIST, ImagesCallBack, BestModelCallBack, SamplingLayer, VAE.
Parameters : latent_dim = 2, loss_weights = [1, .001], scale = .1, seed = 123
About 20' on a CPU with scale = 1.
VAE1 | VAE2 | VAE3
The encoder and decoder are still built via the functional API and a custom layer.
The model is defined by a custom class, in real Python :-)
The gradient descent and the loss function are defined via a method.
We have one custom layer and one custom model class :
SamplingLayer
VAE
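The notebook overrides the training step itself ; as a lighter, backend-agnostic sketch of the same idea (assuming an encoder that returns z_mean, z_log_var and z), the loss can be defined in a method of a keras.Model subclass :

import keras
from keras import ops

class VAE(keras.Model):
    """VAE as a custom model : encoder + decoder, loss defined in a method."""
    def __init__(self, encoder, decoder, loss_weights=(1, 1e-3), **kwargs):
        super().__init__(**kwargs)
        self.encoder = encoder          # assumed to return z_mean, z_log_var, z
        self.decoder = decoder
        self.k1, self.k2 = loss_weights

    def call(self, inputs):
        z_mean, z_log_var, z = self.encoder(inputs)
        return self.decoder(z)

    def compute_loss(self, x=None, y=None, y_pred=None, sample_weight=None, **kwargs):
        # totalloss = k1.rloss + k2.klloss (encoder recomputed here for clarity)
        z_mean, z_log_var, z = self.encoder(x)
        r_loss  = ops.mean(keras.losses.binary_crossentropy(x, y_pred))
        kl_loss = -0.5 * ops.mean(1 + z_log_var - ops.square(z_mean) - ops.exp(z_log_var))
        return self.k1 * r_loss + self.k2 * kl_loss

# Usage sketch : vae = VAE(encoder, decoder) ; vae.compile(optimizer="adam") ; vae.fit(x_train, x_train, ...)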
VAE1 | VAE2 | VAE3
[Notebook #3 workflow diagram : START → import and init → reload model → retrieve the MNIST dataset → images reconstruction → generate from latent space → visualize latent space → END]
Objectives : reload a saved model and visualize the latent space.
Modules used : MNIST, SamplingLayer, VAE.
Parameters : scale = .1, seed = 123
Runs in a few seconds !
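A minimal sketch of the « generate from latent space » step, assuming a 2-dimensional latent space and a decoder saved by a previous notebook (the file path is illustrative ; if the saved decoder contains custom layers, pass them via custom_objects) :

import numpy as np
import keras

# Reload a previously saved decoder (illustrative path)
decoder = keras.models.load_model("./run/VAE2/models/decoder.keras")

# Walk a regular grid over the 2D latent space and decode each point
n, span = 8, 3.0
grid = np.linspace(-span, span, n)
z = np.array([[zx, zy] for zy in grid for zx in grid])   # shape (n*n, 2)
images = decoder.predict(z)                              # (n*n, 28, 28, 1) for MNIST-shaped outputs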
Little things and concepts to keep in mind
– Unsupervised learning is great!
– The use cases of the VAE are very wide
– Anomaly detection
– Clustering
– Semi-supervised classification
– ...
– VAEs are often coupled with other tools
– A good balance of the loss function is not always easy to find
– GPUs are great !
– Other (and better?) generative networks exist
– ...
Next, on Fidle :
Thursday, March 21 at 2:00 pm
"L'IA comme un outil" (AI as a tool)
Production, engineering and staging :
Agathe, Baptiste and Yanis - UGA/DAPI
Thibaut, Kamel, Léo - IDRIS