09_Autoencoder
Unsupervised Learning
• Definition
– Unsupervised learning refers to most attempts to extract information from a distribution that do not require
human labor to annotate examples
– The main task is to find the ‘best’ representation of the data
• Dimension Reduction
– Attempts to compress as much information as possible into a smaller representation
– Preserves as much information as possible while obeying some constraint aimed at keeping the representation
simpler
– This modeling consists of finding “meaningful degrees of freedom” that describe the signal and are of lower
dimension
Autoencoders
• It can be seen as the ‘deep learning version’ of dimension reduction
• Definition
– An autoencoder is a neural network that is trained to attempt to copy its input to its output
– The network consists of two parts: an encoder that maps the input to a code, and a decoder that produces a
reconstruction from that code
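As a concrete (though not authoritative) sketch of this two-part structure, a minimal TensorFlow encoder/decoder pair might look as follows; the 784 → 2 → 784 layer sizes are assumptions chosen to match the 2-D MNIST visualization later in the slides:

```python
import tensorflow as tf

latent_dim = 2  # assumed code size, chosen so the latent space can be plotted in 2-D

# Encoder f: 784-dim input -> 2-dim code
encoder = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu", input_shape=(784,)),
    tf.keras.layers.Dense(latent_dim),
])

# Decoder g: 2-dim code -> 784-dim reconstruction
decoder = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu", input_shape=(latent_dim,)),
    tf.keras.layers.Dense(784, activation="sigmoid"),
])

# The autoencoder is trained to copy its input to its output
autoencoder = tf.keras.Sequential([encoder, decoder])
autoencoder.compile(optimizer="adam", loss="mse")
```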
Autoencoder
• Dimension reduction
• Recover the input data
– Learns an encoding of the inputs so that the original input can be recovered from the encoding as well as possible
• A proper autoencoder has to capture a "good" parametrization of the signal, and in particular the
statistical dependencies between the signal components.
Autoencoder with TensorFlow
• MNIST example
• Use only the digits (1, 5, 6) so that the learned codes can be visualized in 2-D
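A possible data-preparation and training sketch for this setup, reusing the `autoencoder`, `encoder`, and `decoder` objects assumed above (the epoch count and batch size are illustrative choices, not the slides' exact settings):

```python
import numpy as np
import tensorflow as tf

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

def keep_digits(x, y, digits=(1, 5, 6)):
    mask = np.isin(y, digits)
    # Flatten 28x28 images to 784-dim vectors scaled to [0, 1]
    return x[mask].reshape(-1, 784).astype("float32") / 255.0, y[mask]

x_train_f, y_train_f = keep_digits(x_train, y_train)
x_test_f, y_test_f = keep_digits(x_test, y_test)

# Train to copy the input to the output: the targets are the inputs themselves
autoencoder.fit(x_train_f, x_train_f, epochs=20, batch_size=256)
```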
Test or Evaluation
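One simple way to test the trained model is to reconstruct held-out digits and compare them to the originals side by side; a sketch, continuing with the objects assumed in the previous snippets:

```python
import matplotlib.pyplot as plt

x_hat = autoencoder.predict(x_test_f[:8])

fig, axes = plt.subplots(2, 8, figsize=(12, 3))
for i in range(8):
    axes[0, i].imshow(x_test_f[i].reshape(28, 28), cmap="gray")  # original
    axes[1, i].imshow(x_hat[i].reshape(28, 28), cmap="gray")     # reconstruction
    axes[0, i].axis("off")
    axes[1, i].axis("off")
plt.show()
```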
Distribution in Latent Space
• Project the 784-dimensional images onto the 2-dimensional latent space
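A sketch of such a projection, encoding the held-out images and coloring the resulting 2-D codes by digit class (again reusing the assumed `encoder` and filtered test set):

```python
import matplotlib.pyplot as plt

# Each row of `codes` is the 2-D latent point of one test image
codes = encoder.predict(x_test_f)

plt.figure(figsize=(6, 6))
for d in (1, 5, 6):
    pts = codes[y_test_f == d]
    plt.scatter(pts[:, 0], pts[:, 1], s=4, label=str(d))
plt.legend(title="digit")
plt.show()
```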
Autoencoder as Generative Model
Generative Capabilities
• We can assess the generative capabilities of the decoder 𝑔 by introducing a [simple] density model 𝑞_𝑍
over the latent space ℱ, sampling from it, and mapping the samples into the image space 𝒳 with 𝑔.
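A minimal sketch of this procedure, taking 𝑞_𝑍 to be a single Gaussian fitted to the training codes (the choice of a Gaussian and the sample count are assumptions):

```python
import numpy as np

# Fit the [simple] density model q_Z: one Gaussian on the training codes
codes = encoder.predict(x_train_f)
mu = codes.mean(axis=0)
cov = np.cov(codes, rowvar=False)

# Sample in the latent space F, then map into the image space X with the decoder g
z = np.random.multivariate_normal(mu, cov, size=16)
samples = decoder.predict(z.astype("float32"))  # 16 generated 784-dim images
```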
Latent Representation
• To get an intuition of the latent representation, we can pick two samples 𝑥 and 𝑥′ at random and
decode points interpolated along the line between their codes in the latent space
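A sketch of this interpolation, assuming the encoder/decoder pair trained above:

```python
import numpy as np

# Pick two test images at random and encode them
i, j = np.random.choice(len(x_test_f), size=2, replace=False)
z0, z1 = encoder.predict(x_test_f[[i, j]])

# Decode points along the straight line between the two codes
alphas = np.linspace(0.0, 1.0, 8)
z_line = np.stack([(1 - a) * z0 + a * z1 for a in alphas])
frames = decoder.predict(z_line.astype("float32"))  # 8 interpolated images
```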
Interpolation in Manifold
MNIST Example: Walk in the Latent Space
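One way to realize such a walk is to decode a regular grid of latent points and tile the results; a sketch, where the [-3, 3] grid bounds are an assumption best chosen from the spread of the encoded data:

```python
import numpy as np
import matplotlib.pyplot as plt

n = 15
grid = np.linspace(-3.0, 3.0, n)
canvas = np.zeros((28 * n, 28 * n))

# Decode each grid point (gx, gy) and place it on the canvas
for r, gy in enumerate(grid):
    for c, gx in enumerate(grid):
        img = decoder.predict(np.array([[gx, gy]], dtype="float32"), verbose=0)
        canvas[r * 28:(r + 1) * 28, c * 28:(c + 1) * 28] = img.reshape(28, 28)

plt.imshow(canvas, cmap="gray")
plt.axis("off")
plt.show()
```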
Generative Models
• The decoder generates samples that make some sense.
• These results are unsatisfying, because the density model used on the latent space ℱ is too simple.
• Building a “good” model amounts to our original problem of modeling an empirical distribution,
although it may now be in a lower-dimensional space.
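As one illustration of a slightly richer density model (not the slides' own method), 𝑞_𝑍 could be a Gaussian mixture fitted on the latent codes, with scikit-learn as an assumed extra dependency:

```python
from sklearn.mixture import GaussianMixture

# Fit a 3-component mixture on the training codes instead of a single Gaussian
gmm = GaussianMixture(n_components=3).fit(encoder.predict(x_train_f))

z, _ = gmm.sample(16)                           # richer q_Z over F
samples = decoder.predict(z.astype("float32"))  # map into X with g
```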