
DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING

(ARTIFICIAL INTELLIGENCE & MACHINE LEARNING)

Autoencoder architecture

o Both the encoder and decoder are fully-connected feedforward neural networks.
o Autoencoders are trained the same way as other ANNs, via backpropagation.
o The code is a single layer of the ANN, with a dimensionality of our choice.
o The number of nodes in the code layer (the code size) is a hyperparameter that we set before training the autoencoder.

o First the input passes through the encoder, which is a fully-connected ANN, to produce the code.
o The decoder, which has a similar ANN structure, then produces the output using only the code.
o The goal is to get an output identical to the input.
o Note that the decoder architecture is the mirror image of the encoder. This is not a requirement, but it is typically the case.
o The only requirement is that the input and output have the same dimensionality.
o Anything in the middle can be played with (see the sketch below).
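A minimal sketch of this encoder/code/decoder structure in Python, assuming TensorFlow/Keras is available; the input size of 784, code size of 32, hidden width of 128, and activation choices are illustrative, not prescribed by the notes above:

    from tensorflow import keras
    from tensorflow.keras import layers

    input_size = 784   # e.g. a flattened 28x28 image
    code_size = 32     # hyperparameter: number of nodes in the code layer

    # Encoder: fully-connected feedforward layers that compress the input.
    inputs = keras.Input(shape=(input_size,))
    hidden = layers.Dense(128, activation="relu")(inputs)
    code = layers.Dense(code_size, activation="relu")(hidden)

    # Decoder: mirror image of the encoder, producing an output
    # with the same dimensionality as the input.
    hidden = layers.Dense(128, activation="relu")(code)
    outputs = layers.Dense(input_size, activation="sigmoid")(hidden)

    autoencoder = keras.Model(inputs, outputs)
    autoencoder.summary()

The decoder here mirrors the encoder (128 -> 32 -> 128), but as noted above only the input and output dimensionalities actually have to match.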

Hyperparameters:
• Code size: the number of nodes in the middle (code) layer. A smaller code size results in more compression.


• Number of layers: the autoencoder can be as deep as we like. For example, we might use 2 layers in both the encoder and decoder, not counting the input and output layers.

• Number of nodes per layer: the architecture described here is called a stacked autoencoder since the layers are stacked one after another. Typically, the number of nodes per layer decreases with each subsequent layer of the encoder and increases back in the decoder.

• Loss function: we use either mean squared error (MSE) or binary cross-entropy.
o If the input values are in the range [0, 1], we typically use cross-entropy;
o otherwise, we use mean squared error (see the training sketch below).
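A short sketch of how this loss choice plays out when compiling and training the model defined above, again assuming Keras; the optimizer, epoch count, and batch size are illustrative:

    # Inputs scaled to [0, 1] with a sigmoid output layer -> binary cross-entropy.
    # For unbounded input values, use loss="mse" instead.
    autoencoder.compile(optimizer="adam", loss="binary_crossentropy")

    (x_train, _), (x_test, _) = keras.datasets.mnist.load_data()
    x_train = x_train.reshape(-1, 784).astype("float32") / 255.0
    x_test = x_test.reshape(-1, 784).astype("float32") / 255.0

    # The autoencoder is trained to reproduce its input, so the
    # input also serves as the target.
    autoencoder.fit(x_train, x_train,
                    epochs=10, batch_size=256,
                    validation_data=(x_test, x_test))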

Assumptions:
• We can make the autoencoder more powerful by increasing the number of layers, the number of nodes per layer, and, most importantly, the code size.
• Increasing these hyperparameters lets the autoencoder learn more complex codings.
• Be careful not to make it too powerful. Otherwise it will:
o simply learn to copy its inputs to the output, without learning any meaningful representation;
o just mimic the identity function;
o reconstruct the training data perfectly, but overfit and fail to generalize to new instances.
• This is why we prefer a “sandwich” architecture, and deliberately keep the code size small.
• Since the coding layer has a lower dimensionality than the input data, the autoencoder is said to
be undercomplete.
• It won’t be able to directly copy its inputs to the output, and will be forced to learn intelligent
features.
• For example: the digit “1” is roughly a straight line and the digit “0” is circular; the autoencoder will learn this fact and encode it in a compact form.
• If the input data were completely random, without any internal correlation or dependency, then an undercomplete autoencoder would not be able to recover it perfectly (see the comparison sketch below).
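One way to see this effect is to train the same undercomplete architecture on structured data (digits) and on uniform random noise, then compare reconstruction errors. A rough sketch, reusing the MNIST data loaded above and wrapping the earlier model-building code in a hypothetical build_autoencoder() helper:

    import numpy as np

    def build_autoencoder(input_size=784, code_size=32):
        # Same undercomplete architecture as above.
        inputs = keras.Input(shape=(input_size,))
        code = layers.Dense(code_size, activation="relu")(
            layers.Dense(128, activation="relu")(inputs))
        outputs = layers.Dense(input_size, activation="sigmoid")(
            layers.Dense(128, activation="relu")(code))
        model = keras.Model(inputs, outputs)
        model.compile(optimizer="adam", loss="mse")
        return model

    noise = np.random.uniform(size=x_train.shape).astype("float32")

    digits_ae = build_autoencoder()
    digits_ae.fit(x_train, x_train, epochs=10, batch_size=256, verbose=0)

    noise_ae = build_autoencoder()
    noise_ae.fit(noise, noise, epochs=10, batch_size=256, verbose=0)

    # Expect a much lower reconstruction error on digits: they have
    # structure the 32-dimensional code can exploit, while random
    # noise cannot be squeezed through the bottleneck and recovered.
    print("digits MSE:", digits_ae.evaluate(x_train, x_train, verbose=0))
    print("noise  MSE:", noise_ae.evaluate(noise, noise, verbose=0))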
