Autoencoders - Buffalo University
Autoencoders
Sargur Srihari
[email protected]
Topics in Autoencoders
• What is an autoencoder?
1. Undercomplete Autoencoders
2. Regularized Autoencoders
3. Representational Power, Layer Size and Depth
4. Stochastic Encoders and Decoders
5. Denoising Autoencoders
6. Learning Manifolds and Autoencoders
7. Contractive Autoencoders
8. Predictive Sparse Decomposition
9. Applications of Autoencoders
What is an Autoencoder?
An autoencoder is a neural network trained to copy its input to its output: an encoder h = f(x) maps the input to an internal code, and a decoder x' = g(h) maps that code back to a reconstruction of the input.
Autoencoder History
An autoencoder architecture
(Figure: the input passes through Encoder f to the code, and through Decoder g back to the reconstruction, as discussed next.)
1. Undercomplete Autoencoder
• One hidden layer
• Non-linear encoder
• Takes input x ∈ R^d
• Maps it into code h ∈ R^p

Encoder f: h = σ1(Wx + b)
Decoder g: x' = σ2(W'h + b')

where σ is an element-wise activation function such as a sigmoid or ReLU
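A minimal sketch of this one-hidden-layer autoencoder in PyTorch; the dimensions d = 8 and p = 3 are illustrative assumptions, and sigmoids stand in for σ1 and σ2:

import torch
import torch.nn as nn

d, p = 8, 3  # illustrative: input dimension d, code dimension p (p < d: undercomplete)

encoder = nn.Sequential(nn.Linear(d, p), nn.Sigmoid())  # h  = sigma1(W x + b)
decoder = nn.Sequential(nn.Linear(p, d), nn.Sigmoid())  # x' = sigma2(W' h + b')

x = torch.rand(1, d)              # one input sample
h = encoder(x)                    # code h in R^p
x_rec = decoder(h)                # reconstruction x'
loss = ((x - x_rec) ** 2).mean()  # reconstruction error L(x, g(f(x)))

Minimizing this reconstruction error with p < d forces the network to learn a compressed representation rather than the identity map.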
Encoder/Decoder Capacity
Source: https://www.jeremyjordan.me/variational-autoencoders/
Variational Autoencoder
Sparse Autoencoder
Only a few nodes are encouraged to activate when a single sample is fed into the network.
If fewer nodes activate while reconstruction performance is maintained, the autoencoder is learning latent representations of the input data rather than redundant information.
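A small sketch of what "few nodes activate" means in practice, assuming sigmoid codes and an arbitrary 0.5 activation threshold (both illustrative choices, not from the slides):

import torch
import torch.nn as nn

d, p = 784, 64  # illustrative dimensions

encoder = nn.Sequential(nn.Linear(d, p), nn.Sigmoid())

x = torch.rand(32, d)  # a batch of 32 samples
h = encoder(x)         # codes, shape (32, p)
# fraction of code units whose activation exceeds the (arbitrary) 0.5 threshold;
# training with a sparsity penalty should drive this fraction down
active = (h > 0.5).float().mean().item()
print(f"fraction of active code units: {active:.2f}")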
Sparsity-inducing Priors
• The log pmodel(h) term can be sparsity-inducing. For example, the Laplace prior

    pmodel(hi) = (λ/2) exp(−λ|hi|)

  corresponds to an absolute value sparsity penalty
• Expressing the log-prior as an absolute value penalty:

    −log pmodel(h) = Σi ( λ|hi| − log(λ/2) ) = Ω(h) + const,  where Ω(h) = λ Σi |hi|
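A sketch of how this penalty enters the training objective, continuing the PyTorch setup above; λ = 1e-3 is an arbitrary illustrative value:

import torch
import torch.nn as nn

d, p, lam = 784, 64, 1e-3  # lam plays the role of lambda; all values illustrative

encoder = nn.Sequential(nn.Linear(d, p), nn.Sigmoid())
decoder = nn.Sequential(nn.Linear(p, d), nn.Sigmoid())

x = torch.rand(32, d)
h = encoder(x)
x_rec = decoder(h)

recon = ((x - x_rec) ** 2).mean()        # reconstruction term L(x, g(f(x)))
omega = lam * h.abs().sum(dim=1).mean()  # Omega(h) = lambda * sum_i |h_i|, averaged over the batch
loss = recon + omega                     # minimizing this encourages sparse codes h
loss.backward()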
Stochastic encoder
• The encoder and decoder are not simple deterministic functions; each involves a distribution
• The output is sampled from the distribution pencoder(h|x) for the encoder and from pdecoder(x|h) for the decoder
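A minimal sketch of such a stochastic pair, assuming (purely for illustration) that pencoder(h|x) is a diagonal Gaussian whose mean and log-variance come from linear layers, and that pdecoder(x|h) is a Bernoulli whose mean the decoder outputs:

import torch
import torch.nn as nn

d, p = 8, 3  # illustrative dimensions

enc_mu = nn.Linear(d, p)      # mean of p_encoder(h|x)
enc_logvar = nn.Linear(d, p)  # log-variance of p_encoder(h|x)
decoder = nn.Sequential(nn.Linear(p, d), nn.Sigmoid())  # mean of p_decoder(x|h)

x = torch.rand(1, d)
mu, logvar = enc_mu(x), enc_logvar(x)
h = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # sample h ~ p_encoder(h|x)
x_mean = decoder(h)                 # parameters of p_decoder(x|h)
x_sample = torch.bernoulli(x_mean)  # sample x' ~ p_decoder(x|h)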
Sampling pmodel(h|x)
(Figure: a sample is drawn from pencoder(h|x), then passed through pdecoder(x|h).)