AIDS2 Assignment 2
Structure of an RBM (Restricted Boltzmann Machine)
1. Visible Layer:
○ Represents the observed data. Each neuron in this layer corresponds
to an element of the input vector (e.g., pixel values in an image,
user preferences in a collaborative filtering scenario).
2. Hidden Layer:
○ Captures the underlying features of the data. Each neuron in this
layer is a latent variable that encodes the dependencies between the
visible units.
Key Characteristics
● Bipartite Graph: RBMs have a bipartite structure, meaning that there are
no connections between the neurons in the same layer. Only connections
between the visible and hidden layers exist.
● Energy-Based Model: RBMs are energy-based models. Each joint configuration
of visible and hidden states has an associated energy, and training drives the
network toward low-energy (high-probability) configurations (a short sketch of
the energy function follows this list).
● Binary and Real-Valued Units: While RBMs can use binary units for both
layers, they can also incorporate Gaussian visible units for real-valued data.
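To make the energy-based view concrete, the following is a minimal NumPy sketch of the usual binary-RBM energy function E(v, h) = -a·v - b·h - v·W·h. The layer sizes and variable names (W, a, b) are illustrative assumptions, not values from the assignment.

import numpy as np

rng = np.random.default_rng(0)
n_visible, n_hidden = 6, 3                              # illustrative layer sizes
W = rng.normal(scale=0.1, size=(n_visible, n_hidden))   # visible-hidden weights
a = np.zeros(n_visible)                                 # visible biases
b = np.zeros(n_hidden)                                  # hidden biases

def energy(v, h):
    # E(v, h) = -a.v - b.h - v.W.h for a binary RBM
    return -(a @ v) - (b @ h) - (v @ W @ h)

v = rng.integers(0, 2, size=n_visible).astype(float)    # a binary visible configuration
h = rng.integers(0, 2, size=n_hidden).astype(float)     # a binary hidden configuration
print(energy(v, h))                                     # lower energy = more probable configuration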
Working of an RBM
1. Data Representation:
○ The visible layer is activated by input data. Each neuron in the
hidden layer computes the probability of being activated based on
the states of the visible layer.
2. Forward Pass:
○ Given a visible vector v, the state of each hidden unit h_j is determined by
the activation probability
P(h_j = 1 | v) = σ(b_j + Σ_i v_i w_ij),
where b_j is the bias of hidden unit j, w_ij is the weight between visible
unit i and hidden unit j, and σ is the logistic sigmoid function.
3. Reconstruction:
○ The hidden layer can then generate a reconstruction of the visible layer.
The visible units are activated again based on the hidden states:
P(v_i = 1 | h) = σ(a_i + Σ_j w_ij h_j),
○ where a_i is the bias for visible unit i.
4. Training:
○ RBMs are trained using a method called Contrastive Divergence
(CD). This method involves:
■ Performing Gibbs sampling to approximate the distribution of
the visible units given the hidden units.
■ Updating the weights and biases to reduce the difference
between the original and reconstructed visible states (a CD-1
sketch follows this list).
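The forward pass, reconstruction, and update steps above can be combined into a single CD-1 step. Below is a rough NumPy sketch under the same binary-unit assumptions; the learning rate lr and the sampling details are illustrative choices, not a reference implementation.

import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0, W, a, b, lr=0.1):
    # Forward pass: P(h = 1 | v0), then sample a binary hidden state
    ph0 = sigmoid(b + v0 @ W)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)

    # Reconstruction: P(v = 1 | h0), then re-compute the hidden probabilities
    pv1 = sigmoid(a + h0 @ W.T)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(b + v1 @ W)

    # Move weights and biases toward the data statistics and away from the reconstruction
    W += lr * (np.outer(v0, ph0) - np.outer(v1, ph1))
    a += lr * (v0 - v1)
    b += lr * (ph0 - ph1)
    return W, a, b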
Applications of RBM
1. Dimensionality Reduction:
○ RBMs can be used to reduce the dimensionality of data while
preserving essential features, making it easier to visualize or
analyze data.
2. Feature Learning:
○ By learning a set of features that represent the input data, RBMs can
improve the performance of supervised learning tasks.
3. Collaborative Filtering:
○ In recommendation systems, RBMs can learn user preferences by
modeling interactions between users and items, helping to predict
ratings for unseen items.
4. Image Recognition:
○ RBMs can extract features from images, making them useful in
computer vision tasks, especially as building blocks in deep learning
architectures.
Learning Rate Strategies
1. Constant Learning Rate: The learning rate remains fixed throughout the
training process.
2. Exponential Decay: The learning rate decreases exponentially after each
epoch, allowing for larger updates in the beginning and smaller updates as
training progresses.
3. Step Decay: The learning rate is reduced by a factor (e.g., halved) at
specific intervals (epochs), helping the model refine its weights as it
approaches convergence (the first three schedules are sketched after this list).
4. Adaptive Learning Rates: Techniques like Adam, RMSprop, and
Adagrad automatically adjust the learning rate based on the gradients of
the loss function, adapting to the landscape of the loss surface and
promoting stable training.
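The first three strategies can be written directly as functions of the epoch number. Here is a short Python sketch; the initial rate lr0, decay constant k, and drop interval are illustrative values.

import math

lr0 = 0.1  # illustrative initial learning rate

def constant_lr(epoch):
    return lr0                               # fixed for the whole run

def exponential_decay(epoch, k=0.05):
    return lr0 * math.exp(-k * epoch)        # large steps early, smaller steps later

def step_decay(epoch, drop=0.5, every=10):
    return lr0 * (drop ** (epoch // every))  # halve the rate every 10 epochs

for epoch in (0, 10, 20, 30):
    print(epoch, constant_lr(epoch), exponential_decay(epoch), step_decay(epoch))

Adaptive methods such as Adam, RMSprop, and Adagrad are normally used through a deep learning library's optimizer rather than implemented by hand.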
Structure of an Autoencoder
1. Input Layer (X): This layer receives the original input data. The input can
be an image, a time series, or any other type of data.
2. Encoder: The encoder processes the input data and reduces its
dimensionality by mapping it to a lower-dimensional space (the latent space
or bottleneck layer). This part typically consists of several layers of neurons
with activation functions (often ReLU or sigmoid) that compress the data.
3. Bottleneck Layer (Latent Space): This layer contains the compressed
representation of the input data. The number of neurons in this layer is less
than that in the input layer, capturing the most important features of the
data.
4. Decoder: The decoder reconstructs the input data from the compressed
representation in the bottleneck layer. It usually mirrors the encoder's
structure, expanding the data back to its original dimensions.
5. Output Layer (Reconstructed X): This layer produces the final
output, which is an attempt to recreate the original input data. The
reconstruction error (the difference between the input and the output)
is typically used to train the autoencoder (a minimal sketch follows this list).
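As a concrete illustration of this encoder-bottleneck-decoder layout, here is a minimal Keras sketch. It assumes flattened 784-dimensional inputs (e.g., 28x28 images) and a 32-unit latent space; the layer sizes, activations, and optimizer are illustrative choices rather than a prescribed architecture.

from tensorflow.keras import layers, models

input_dim = 784   # illustrative: a flattened 28x28 image
latent_dim = 32   # bottleneck size, smaller than the input

encoder = models.Sequential([
    layers.Dense(128, activation="relu", input_shape=(input_dim,)),
    layers.Dense(latent_dim, activation="relu"),            # latent space
])
decoder = models.Sequential([
    layers.Dense(128, activation="relu", input_shape=(latent_dim,)),
    layers.Dense(input_dim, activation="sigmoid"),           # reconstructed X
])
autoencoder = models.Sequential([encoder, decoder])

# The model is trained to reproduce its own input; the loss is the reconstruction error
autoencoder.compile(optimizer="adam", loss="mse")
# autoencoder.fit(X, X, epochs=..., batch_size=...)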