CSC321 Lecture 20: Autoencoders
Roger Grosse
Why autoencoders?
Map high-dimensional data to two dimensions for visualization
Compression (i.e. reducing the file size)
Note: autoencoders don’t do this for free — it requires other ideas as
well.
Learn abstract features in an unsupervised way so you can apply them
to a supervised task
Unlabeled data can be much more plentiful than labeled data
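A minimal sketch of what an autoencoder computes, assuming NumPy, a single sigmoid hidden layer, and a squared-error reconstruction loss; the layer sizes, learning rate, and synthetic data are illustrative choices, not values from the lecture.

```python
# Minimal autoencoder sketch: encode to a low-dimensional code, decode back,
# and train by gradient descent on the squared reconstruction error.
import numpy as np

rng = np.random.default_rng(0)

D, K = 20, 2          # input dimension, code (bottleneck) dimension
N = 500               # number of training examples
X = rng.normal(size=(N, D))   # toy data

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Encoder: h = sigmoid(X W1 + b1); decoder: X_hat = h W2 + b2
W1 = 0.1 * rng.normal(size=(D, K)); b1 = np.zeros(K)
W2 = 0.1 * rng.normal(size=(K, D)); b2 = np.zeros(D)

lr = 0.01
for epoch in range(200):
    # Forward pass
    h = sigmoid(X @ W1 + b1)          # code (low-dimensional representation)
    X_hat = h @ W2 + b2               # reconstruction
    err = X_hat - X                   # gradient of the loss w.r.t. X_hat
    loss = 0.5 * np.mean(np.sum(err ** 2, axis=1))
    if epoch % 50 == 0:
        print(f"epoch {epoch}: loss {loss:.4f}")

    # Backward pass (manual backprop for the squared-error loss)
    dW2 = h.T @ err / N
    db2 = err.mean(axis=0)
    dh = err @ W2.T
    dz = dh * h * (1 - h)             # derivative of the sigmoid
    dW1 = X.T @ dz / N
    db1 = dz.mean(axis=0)

    # Gradient descent update
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

# The 2-D code h can then be used for visualization or as features.
```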
You wouldn’t actually solve this problem by training a neural net. There’s a
closed-form solution, which you learn about in CSC 411.
The algorithm is called principal component analysis (PCA).
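A minimal sketch of that closed-form solution, assuming NumPy: PCA computed via the SVD of the centered data. The function name and the toy data are illustrative.

```python
# PCA via the SVD: project the data onto its top-k principal components,
# which is the closed-form solution for the linear autoencoder with
# squared-error reconstruction loss.
import numpy as np

def pca(X, k):
    """Return the k-dimensional codes, the reconstruction, and the components."""
    mu = X.mean(axis=0)
    Xc = X - mu                       # center the data
    # Rows of Vt are the principal directions (eigenvectors of the covariance
    # matrix), ordered by decreasing singular value.
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]               # top-k principal directions, shape (k, D)
    codes = Xc @ components.T         # low-dimensional representation
    reconstruction = codes @ components + mu
    return codes, reconstruction, components

# Example: reduce 20-dimensional toy data to 2 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))
codes, X_hat, components = pca(X, k=2)
```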
Principal Component Analysis
PCA for faces (“Eigenfaces”)
Note that now W^(1) is held fixed, but W^(2) is being trained using
contrastive divergence.
This gives a good initialization for the deep autoencoder. You can
then fine-tune the autoencoder weights using backprop.
This strategy is known as layerwise pre-training.
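A minimal sketch of layerwise pre-training, assuming NumPy, binary units, and RBMs trained with one step of contrastive divergence (CD-1); the layer sizes, learning rate, and toy data are illustrative, and the final unrolling into a deep autoencoder is described in comments rather than implemented.

```python
# Greedy layerwise pre-training: train an RBM on the data, freeze its weights,
# then train the next RBM on the first layer's hidden activations.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_rbm(V, n_hidden, lr=0.05, epochs=20):
    """Train one RBM on data V (rows in [0, 1]) using CD-1; return its parameters."""
    n_visible = V.shape[1]
    W = 0.01 * rng.normal(size=(n_visible, n_hidden))
    b = np.zeros(n_visible)           # visible biases
    c = np.zeros(n_hidden)            # hidden biases
    for _ in range(epochs):
        # Positive phase: hidden probabilities given the data
        ph = sigmoid(V @ W + c)
        h = (rng.random(ph.shape) < ph).astype(float)
        # Negative phase: one step of Gibbs sampling (the CD-1 reconstruction)
        pv = sigmoid(h @ W.T + b)
        ph_neg = sigmoid(pv @ W + c)
        # Approximate gradient: data statistics minus reconstruction statistics
        W += lr * (V.T @ ph - pv.T @ ph_neg) / V.shape[0]
        b += lr * (V - pv).mean(axis=0)
        c += lr * (ph - ph_neg).mean(axis=0)
    return W, b, c

# Stack greedily: the first layer's weights are held fixed while the second
# RBM is trained on its hidden activations.
X = (rng.random((500, 100)) < 0.3).astype(float)   # toy binary data
W1, b1, c1 = train_rbm(X, n_hidden=50)
H1 = sigmoid(X @ W1 + c1)                          # features from layer 1
W2, b2, c2 = train_rbm(H1, n_hidden=10)

# The stacked weights (W1, W2 for the encoder, their transposes for the
# decoder) give an initialization for a deep autoencoder, which is then
# fine-tuned end-to-end with backprop.
```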
Autoencoders are not probabilistic models.
However, there is an autoencoder-like probabilistic model called a
variational autoencoder (VAE). These are beyond the scope of the
course, and require some more advanced math.
Check out David Duvenaud’s excellent course “Differentiable
Inference and Generative Models”:
https://www.cs.toronto.edu/~duvenaud/courses/csc2541/index.html