Lecture 26: Autoencoders, Part B
[Figure: fully connected autoencoder with layer widths 784 → 512 → 256 → 20 → 256 → 512 → 784 neurons]
Is this the best architecture here? No: since we are dealing with images, it is best to use CNNs.
CS109B, PROTOPAPAS, GLICKMAN, TANNER 3
Convolutional Autoencoders
Encoder:
• Conv: 16 filters (3x3x1), same padding → output 28x28x16
• Max pooling 2x2 → output 14x14x16
• Conv: 8 filters (3x3x16), same padding → output 14x14x8
• Max pooling 2x2 → output 7x7x8
• Conv: 3 filters (3x3x8), same padding → output 7x7x3
But how can we increase the dimensions of the conv layers back up in the decoder?
Upsampling!
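A minimal NumPy sketch (hypothetical helper names, not the lecture's own code) of how 2x2 max pooling halves the spatial dimensions and nearest-neighbor upsampling, like Keras's UpSampling2D, doubles them back:

```python
import numpy as np

def max_pool_2x2(x):
    """2x2 max pooling on an (H, W, C) feature map; halves spatial dims."""
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).max(axis=(1, 3))

def upsample_2x2(x):
    """Nearest-neighbor upsampling; doubles spatial dims by repeating pixels."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

x = np.random.default_rng(0).random((28, 28, 16))  # e.g. output of the first conv
p = max_pool_2x2(x)    # shape (14, 14, 16)
u = upsample_2x2(p)    # shape (28, 28, 16)
```

In the decoder, each upsampling layer undoes the spatial shrinkage of the corresponding pooling layer, which is why the architecture above mirrors the encoder.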
Convolutional Autoencoders
Convolutional Autoencoders
[Figure: original images vs. images reconstructed with a deep FCN autoencoder]
Convolutional Autoencoders
[Figure: original images vs. images reconstructed with a convolutional autoencoder]
Regularization of Autoencoders
• Sparse autoencoders
• Contractive autoencoders
• Denoising autoencoders
Sparse Autoencoders
This trade-off forces the model to learn only the variations in the data that are required to
reconstruct the input, and to avoid holding on to redundancies within the input.
Question: How to achieve this?
Add a second loss term, a sparsity penalty, that encourages an effectively low-dimensional latent space.
ℒ(x, x̂) + Ω(z)
The regularization is applied to the output of the encoder (the latent space), not to the network parameters.
The first term encourages the model to be sensitive to the inputs (reconstruction loss),
and the second term discourages memorization/overfitting (regularization).
Sparse Autoencoders
ℒ(x, x̂) + λ Σᵢ |zᵢ|
• We ask the AE to have the lowest possible dimensional latent space that is
sufficient to reconstruct the input data.
• Limit the network's capacity to memorize the input data without limiting the
network's capability to extract features from the data.
• Individual regions of the AE are selectively activated depending on the
input. Each region takes care of a specific attribute of the input data.
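The sparse loss above can be sketched in a few lines of NumPy (illustrative names; lam is the sparsity weight λ):

```python
import numpy as np

def sparse_ae_loss(x, x_hat, z, lam=1e-3):
    """Reconstruction loss plus an L1 sparsity penalty on the latent code z.
    Illustrative sketch, not the lecture's own implementation."""
    reconstruction = np.mean((x - x_hat) ** 2)   # L(x, x_hat): mean squared error
    sparsity = lam * np.sum(np.abs(z))           # lam * sum_i |z_i|
    return reconstruction + sparsity
```

The L1 penalty drives many latent units toward exactly zero for a given input, which is what produces the selectively activated regions described above.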
Contractive Autoencoders
Intuitively, we would expect that for very similar inputs, the learned
encoding would also be very similar.
In other words, we want the latent space to not change a lot when the
input data slightly changes.
How can we assist the AE to do that?
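The standard answer (the contractive penalty of Rifai et al.) is to add a term penalizing the squared Frobenius norm of the encoder's Jacobian ‖∂z/∂x‖², so small input changes can only produce small latent changes. A NumPy sketch for a one-layer sigmoid encoder, as an illustration rather than the lecture's own code:

```python
import numpy as np

def contractive_penalty(x, W, b):
    """Squared Frobenius norm of dz/dx for a one-layer sigmoid encoder
    z = sigmoid(W @ x + b). Illustrative sketch with toy parameter names."""
    z = 1.0 / (1.0 + np.exp(-(W @ x + b)))
    # For a sigmoid encoder the Jacobian is J_ij = z_i * (1 - z_i) * W_ij
    J = (z * (1.0 - z))[:, None] * W
    return np.sum(J ** 2)

# Toy example: a 2-unit sigmoid encoder on a 2-dimensional input
W = np.array([[0.5, -0.2], [0.1, 0.3]])
b = np.zeros(2)
x = np.array([0.3, -0.7])
penalty = contractive_penalty(x, W, b)
```

Adding this penalty to the reconstruction loss makes the encoding "contract" around the training data, which gives exactly the insensitivity to small input perturbations described above.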
Denoising Autoencoders
A denoising autoencoder is trained to reconstruct the clean, original input from a corrupted version of it.
Applications of Autoencoders
• Denoising images
• Blending
Denoising images
A popular use of autoencoders is to remove noise from samples.
Start with a pristine image, corrupt it (e.g., with noise), and feed the corrupted
input into the autoencoder; the training target is the original, clean image.
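The corruption step can be sketched as follows (NumPy; noise_factor is an assumed hyperparameter, not a value from the lecture):

```python
import numpy as np

rng = np.random.default_rng(0)

def corrupt(x, noise_factor=0.3):
    """Add Gaussian noise to images with pixel values in [0, 1],
    then clip back to the valid range. Illustrative sketch."""
    noisy = x + noise_factor * rng.standard_normal(x.shape)
    return np.clip(noisy, 0.0, 1.0)

# The denoising AE is then trained with corrupted inputs and clean targets,
# e.g. model.fit(corrupt(x_train), x_train, ...)
```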
Blending
We blend inputs to create new data that is similar to the input data,
but not exactly the same.
One example is content blending, where the content of two pieces of data is
blended directly, for instance by overlaying images of a cow and a zebra.
Image taken from A. Glassner, Deep Learning, Vol. 2: From Basics to Practice
Content Blending
Image taken from A. Glassner, Deep Learning, Vol. 2: From Basics to Practice
Representation blending
Representation blending (cont)
Image taken from A. Glassner, Deep Learning, Vol. 2: From Basics to Practice
Representation blending
Image taken from A. Glassner, Deep Learning, Vol. 2: From Basics to Practice
Blending Latent Variables
Back to the example of MNIST.
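Blending latent variables amounts to linear interpolation between two latent codes before decoding; a minimal sketch:

```python
import numpy as np

def blend_latents(z1, z2, t):
    """Linear interpolation between two latent codes: t=0 gives z1, t=1 gives z2.
    Decoding the blended code yields an image 'between' the two originals."""
    return (1.0 - t) * np.asarray(z1) + t * np.asarray(z2)
```

For MNIST, encoding a "3" and an "8", blending their codes at several values of t, and decoding each blend produces a gradual morph from one digit to the other.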
Image taken from A. Glassner, Deep Learning, Vol. 2: From Basics to Practice
Problems with Autoencoders
The latent space of a plain autoencoder is not continuous or well structured: points between or away from the training encodings may decode to unrealistic outputs.
The cure:
Variational autoencoders (VAE)
Exercise 2:
Recreating an image of Pavlos