Autoencoders
Architecture
Let’s explore the details of the encoder, code and decoder. Both the encoder
and decoder are fully-connected feedforward neural networks, essentially
the ANNs we covered in Part 1. The code is a single layer of an ANN with the
dimensionality of our choice. The number of nodes in the code layer (the code
size) is a hyperparameter that we set before training the autoencoder.
This is very similar to the ANNs we worked on, but now we’re using the
Keras functional API. Refer to this guide for details, but here’s a quick
comparison. Previously, we added layers using the sequential API as
follows:
model.add(Dense(16, activation='relu'))
model.add(Dense(8, activation='relu'))
The functional API is more verbose, but it’s a more flexible way to define
complex models. We can easily grab parts of our model, for example only the
decoder, and work with that. The Dense method returns a callable layer;
using the functional API, we provide it with the input and store the output.
The output of a layer becomes the input of the next layer. With the
sequential API, the add method implicitly handled this for us.
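As a rough sketch of what this looks like, assuming 784-dimensional flattened MNIST inputs, an illustrative hidden layer of 128 nodes and a hypothetical code size of 32 (these sizes are assumptions, not prescribed here):
from keras.layers import Input, Dense
from keras.models import Model

input_size = 784   # flattened 28x28 MNIST images (assumed)
hidden_size = 128  # illustrative hidden layer size
code_size = 32     # the code size, a hyperparameter we choose

# Each Dense(...) call returns a callable layer; we pass it an input
# tensor and keep the output tensor, chaining the layers explicitly.
input_img = Input(shape=(input_size,))
hidden_1 = Dense(hidden_size, activation='relu')(input_img)
code = Dense(code_size, activation='relu')(hidden_1)
hidden_2 = Dense(hidden_size, activation='relu')(code)
output_img = Dense(input_size, activation='sigmoid')(hidden_2)

autoencoder = Model(input_img, output_img)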
Note that all the layers use the relu activation function, as is standard
with deep neural networks. The last layer uses the sigmoid activation
because we need the outputs to be in the [0, 1] range. The input is also in
the same range.
Also note the call to the fit function. Previously, with ANNs we used to do:
model.fit(x_train, y_train)
Now with the autoencoder we do:
model.fit(x_train, x_train)
Remember that the targets of the autoencoder are the same as the input.
That’s why we supply the training data as the target.
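A rough sketch of how compiling and training the model defined above might look (the optimizer, loss, and epoch count are illustrative choices, not prescribed here):
# x_train is assumed to be flattened to 784 dimensions and scaled to [0, 1].
autoencoder.compile(optimizer='adam', loss='binary_crossentropy')
autoencoder.fit(x_train, x_train, epochs=5)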
Visualization
Now let’s visualize how well our autoencoder reconstructs its input.
We run the autoencoder on the test set simply by using the predict function
of Keras. For every image in the test set, we get the output of the
autoencoder. We expect the output to be very similar to the input.
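A minimal sketch of this step, assuming x_test holds the flattened test images and matplotlib is available for plotting (the number of digits shown is an arbitrary choice):
import matplotlib.pyplot as plt

# Reconstruct every test image with the trained autoencoder.
reconstructed = autoencoder.predict(x_test)

# Plot a few originals (top row) against their reconstructions (bottom row).
n = 5
plt.figure(figsize=(10, 4))
for i in range(n):
    plt.subplot(2, n, i + 1)
    plt.imshow(x_test[i].reshape(28, 28), cmap='gray')
    plt.axis('off')
    plt.subplot(2, n, n + i + 1)
    plt.imshow(reconstructed[i].reshape(28, 28), cmap='gray')
    plt.axis('off')
plt.show()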
They are indeed pretty similar, but not exactly the same. We can see this
most clearly in the last digit, “4”. Since this was a simple task, our
autoencoder performed pretty well.
Advice
For a denoising autoencoder, the call to fit also changes slightly. Previously
we trained on the clean images, with the input serving as its own target:
autoencoder.fit(x_train, x_train)
Now we feed the corrupted images as input while keeping the clean images as
the target, so the network learns to remove the noise:
autoencoder.fit(x_train_noisy, x_train)
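A sketch of one common way the noisy training set might be created, assuming additive Gaussian noise with an illustrative noise factor (the exact corruption scheme is an assumption, not something specified here):
import numpy as np

noise_factor = 0.4  # illustrative; controls how heavily the images are corrupted
x_train_noisy = x_train + noise_factor * np.random.normal(size=x_train.shape)
# Clip back to the valid [0, 1] pixel range.
x_train_noisy = np.clip(x_train_noisy, 0.0, 1.0)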
Visualization
Sparse Autoencoders
We introduced two ways to force the autoencoder to learn useful features:
keeping the code size small and denoising autoencoders. A third method is
regularization. We can regularize the autoencoder with a sparsity constraint,
such that only a fraction of the nodes, the so-called active nodes, have
nonzero values.
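One common way to impose this in Keras is an L1 activity regularizer on the code layer. A minimal sketch, assuming the same layer sizes as before and an illustrative regularization strength of 1e-5:
from keras import regularizers
from keras.layers import Input, Dense
from keras.models import Model

input_img = Input(shape=(784,))
hidden_1 = Dense(128, activation='relu')(input_img)
# The L1 penalty on the code activations pushes most of them toward zero,
# so only a few nodes stay active for any given input.
code = Dense(32, activation='relu',
             activity_regularizer=regularizers.l1(1e-5))(hidden_1)
hidden_2 = Dense(128, activation='relu')(code)
output_img = Dense(784, activation='sigmoid')(hidden_2)

sparse_autoencoder = Model(input_img, output_img)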
The final loss of the sparse model is 0.01 higher than the
standard one, due to the added regularization term.
Use Cases
Now we might ask the following questions. How good are
autoencoders at compressing the input? And are they a
commonly used deep learning technique?
Conclusion
Autoencoders are a very useful dimensionality reduction technique. They are
very popular as teaching material in introductory deep learning courses,
most likely due to their simplicity.