Unit III
1.Input Layer: This is the layer through which we give input to our model. The number of neurons in this layer equals the total number of features in the data (the number of pixels in the case of an image).
2.Hidden Layers: The input from the input layer is then fed into the hidden layers. There can be many hidden layers, depending on the model and the size of the data. Each hidden layer can have a different number of neurons, generally greater than the number of features. The output of each layer is computed by matrix multiplication of the previous layer's output with that layer's learnable weights, followed by the addition of learnable biases and then an activation function, which makes the network nonlinear.
3.Output Layer: The output of the last hidden layer is then fed into a logistic function such as sigmoid or softmax, which converts the score for each class into a probability.
• Feeding the data into the model and computing the output of each layer as described above is called the feedforward pass. We then calculate the error using an error function; some common error functions are cross-entropy, squared loss, etc.
• The error function measures how well the network is performing. After that, we backpropagate through the model by calculating derivatives of the error with respect to the parameters. This step, called backpropagation, is used to minimize the loss.
• Cross-entropy measures the difference between the probability distribution predicted by a classification model and the true distribution of the labels.
• The cross-entropy loss function is used to find the
optimal solution by adjusting the weights of a machine
learning model during training. The objective is to
minimize the error between the actual and predicted
outcomes. A lower cross-entropy value indicates better
performance.
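A minimal sketch of the cross-entropy computation, assuming a hypothetical 3-class problem with a one-hot true label (the probability values are placeholders):

```python
import numpy as np

y_true = np.array([0.0, 1.0, 0.0])   # one-hot true label: class 1 is correct
y_pred = np.array([0.2, 0.7, 0.1])   # model's softmax output (probabilities)

# Cross-entropy: -sum over classes of true_prob * log(predicted_prob).
loss = -np.sum(y_true * np.log(y_pred))

# A prediction that puts more mass on the true class gives a lower loss.
better_loss = -np.sum(y_true * np.log(np.array([0.05, 0.9, 0.05])))
```

Here `loss` is -ln(0.7) and `better_loss` is -ln(0.9), illustrating that a lower cross-entropy value indicates better performance.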
Convolution Neural Network
• A Convolutional Neural Network (CNN) is an extended version of the artificial neural network (ANN), used predominantly to extract features from grid-like data, for example visual datasets such as images or videos, where spatial patterns play an extensive role.
CNN architecture
• A Convolutional Neural Network consists of multiple layers: the input layer, convolutional layers, pooling layers, and fully connected layers.
How Convolutional Layers Work
• A complete Convolutional Neural Network architecture is also known as a ConvNet. A ConvNet is a sequence of layers, and every layer transforms one volume of activations into another through a differentiable function.
Types of layers:
Let's take an example by running a ConvNet on an image of dimension 32 x 32 x 3.
• Input Layer: This is the layer through which we give input to our model. In a CNN, the input is generally an image or a sequence of images. This layer holds the raw input image with width 32, height 32, and depth 3.
• Convolutional Layers: This layer extracts features from the input. It applies a set of learnable filters, known as kernels, to the input images. The filters/kernels are small matrices, usually of shape 2x2, 3x3, or 5x5. Each kernel slides over the input image and computes the dot product between the kernel weights and the corresponding input patch. The output of this layer is referred to as feature maps. If we use a total of 12 filters for this layer, we get an output volume of dimension 32 x 32 x 12.
• Activation Layer: Activation layers add nonlinearity to the network by applying an element-wise activation function to the output of the preceding convolution layer. Some common activation functions are ReLU (max(0, x)), Tanh, Leaky ReLU, etc. The dimensions of the volume remain unchanged, so the output volume is still 32 x 32 x 12.
• Pooling Layer: This layer is inserted periodically in the ConvNet. Its main function is to reduce the size of the volume, which speeds up computation, reduces memory use, and also helps prevent overfitting. Two common types of pooling layers are max pooling and average pooling. If we use max pooling with 2 x 2 filters and stride 2, the resulting volume has dimension 16 x 16 x 12.
• Flattening: After the convolution and pooling layers, the resulting feature maps are flattened into a one-dimensional vector so they can be passed into a fully connected layer for classification or regression.
• Fully Connected Layers: These take the input from the previous layer and compute the final classification or regression output.
• Output Layer: For classification tasks, the output of the fully connected layers is fed into a logistic function such as sigmoid or softmax, which converts the score for each class into a probability.
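The convolution and activation steps described above can be sketched directly in NumPy. This is a naive, illustrative implementation reproducing the 32 x 32 x 3 input and 12-filter example from the slides; the random image and kernel values are placeholders:

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))               # input volume: 32 x 32, depth 3
kernels = rng.standard_normal((12, 3, 3, 3))  # 12 learnable 3x3 filters, depth 3

# 'Same' padding of 1 pixel keeps the 32 x 32 spatial size under a 3x3 kernel.
padded = np.pad(image, ((1, 1), (1, 1), (0, 0)))
feature_maps = np.zeros((32, 32, 12))
for k in range(12):
    for i in range(32):
        for j in range(32):
            patch = padded[i:i + 3, j:j + 3, :]                 # patch under kernel
            feature_maps[i, j, k] = np.sum(patch * kernels[k])  # dot product

relu_out = np.maximum(0.0, feature_maps)  # element-wise activation, shape unchanged
```

With 12 filters the output volume is 32 x 32 x 12, and the activation layer leaves those dimensions unchanged, matching the walkthrough above.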
• Advantages of Convolutional Neural Networks (CNNs):
1.Good at detecting patterns and features in images, videos, and audio signals.
2.Robust, to a degree, to translation, rotation, and scaling of the input.
3.End-to-end training, no need for manual feature extraction.
4.Can handle large amounts of data and achieve high accuracy.
• Disadvantages of Convolutional Neural Networks (CNNs):
1.Computationally expensive to train and require a lot of memory.
2.Can be prone to overfitting if there is not enough data or proper regularization is not used.
3.Requires large amounts of labeled data.
4.Interpretability is limited; it's hard to understand what the network has learned.
Normalization
Need for Batch Normalization in CNN models
• Batch normalization in CNNs addresses several challenges encountered during training. The following reasons highlight the need for batch normalization in CNNs:
1.Addressing Internal Covariate Shift: Internal covariate shift occurs when the
distribution of network activations changes as parameters are updated during training.
Batch normalization addresses this by normalizing the activations in each layer,
maintaining consistent mean and variance across inputs throughout training. This
stabilizes training and speeds up convergence.
2.Improving Gradient Flow: Batch normalization contributes to stabilizing the gradient
flow during backpropagation by reducing the reliance of gradients on parameter scales.
As a result, training becomes faster and more stable, enabling effective training of
deeper networks without facing issues like vanishing or exploding gradients.
3.Regularization Effect: During training, batch normalization introduces noise to the
network activations, serving as a regularization technique. This noise aids in averting
overfitting by injecting randomness and decreasing the network’s sensitivity to minor
fluctuations in the input data.
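The normalization described above can be sketched as follows; the batch size, feature count, and the initialization of the learnable scale (gamma) and shift (beta) parameters are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical mini-batch of activations: 16 samples, 8 features (channels),
# deliberately off-center (mean ~5, std ~3) to show the effect of normalizing.
activations = 5.0 + 3.0 * rng.standard_normal((16, 8))

eps = 1e-5
mean = activations.mean(axis=0)          # per-feature mean over the batch
var = activations.var(axis=0)            # per-feature variance over the batch
normalized = (activations - mean) / np.sqrt(var + eps)

# Learnable scale (gamma) and shift (beta) restore representational power.
gamma, beta = np.ones(8), np.zeros(8)
out = gamma * normalized + beta
```

After this step each feature has approximately zero mean and unit variance across the batch, which is what keeps the activation distribution consistent during training.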
Max pooling
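The max pooling operation named above can be sketched in a few lines of NumPy; the 32 x 32 x 12 input volume matches the pooling-layer example from the earlier slide, and the random values are placeholders:

```python
import numpy as np

rng = np.random.default_rng(0)
volume = rng.random((32, 32, 12))   # e.g. the activation output from earlier

# 2x2 max pooling with stride 2: keep the largest value in each 2x2 window,
# halving the spatial dimensions while leaving the depth unchanged.
pooled = volume.reshape(16, 2, 16, 2, 12).max(axis=(1, 3))
```

The result has dimension 16 x 16 x 12, as stated in the pooling-layer slide; average pooling would replace `.max` with `.mean` over the same windows.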
Applications
• Decoding Facial Recognition
• Facial recognition is broken down by a convolutional neural network into the following major components:
• Identifying every face in the picture
• Focusing on each face despite external factors, such as
light, angle, pose, etc.
• Identifying unique features
• Comparing all the collected data with already existing
data in the database to match a face with a name.
• A similar process is followed for scene labeling as well.
Analyzing Documents
• Convolutional neural networks can also be used for
document analysis. This is not just useful for
handwriting analysis, but also has a major stake in
recognizers. For a machine to be able to scan an
individual's writing, and then compare that to the wide
database it has, it must execute almost a million
commands a minute. It is said with the use of CNNs and
newer models and algorithms, the error rate has been
brought down to a minimum of 0.4% at a character
level, though it's complete testing is yet to be widely
seen.
Collecting Historic and
Environmental Elements
• CNNs are also used for more complex purposes such as
natural history collections. These collections act as key
players in documenting major parts of history such as
biodiversity, evolution, habitat loss, biological invasion,
and climate change.
Understanding Climate
• CNNs can play a major role in the fight against climate change, especially in understanding the reasons why we see such drastic changes and how we could experiment with curbing their effects. It is said that the data in such natural history collections can also provide greater social and scientific insights, but this would require skilled human resources, such as researchers who can physically visit these repositories. More manpower is needed to carry out deeper experiments in this field.
Understanding Gray Areas
• The introduction of gray areas into CNNs is poised to provide a much more realistic picture of the real world. Currently, CNNs largely function like a machine, seeing a true or false value for every question. However, as humans, we understand that the real world plays out in a thousand shades of gray. Allowing the machine to understand and process fuzzier logic will help it grasp the gray area we humans live in, giving CNNs a more holistic view of what humans see.
Example for CNN
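As a worked example, here is a minimal NumPy sketch of a full CNN forward pass, stringing together the layers from the earlier slides. All sizes (a small 8 x 8 x 3 image, 4 filters, 10 output classes) and the random values are illustrative assumptions, not from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random((8, 8, 3))                        # small hypothetical input image

# Convolutional layer: 4 filters of shape 3x3x3, 'same' padding -> 8 x 8 x 4.
filters = rng.standard_normal((4, 3, 3, 3))
padded = np.pad(x, ((1, 1), (1, 1), (0, 0)))
conv = np.zeros((8, 8, 4))
for k in range(4):
    for i in range(8):
        for j in range(8):
            conv[i, j, k] = np.sum(padded[i:i + 3, j:j + 3, :] * filters[k])

relu = np.maximum(0.0, conv)                     # activation layer, shape unchanged

pooled = relu.reshape(4, 2, 4, 2, 4).max(axis=(1, 3))   # 2x2 max pool -> 4 x 4 x 4

flat = pooled.reshape(-1)                        # flatten -> vector of 64 values

W, b = rng.standard_normal((10, 64)), np.zeros(10)      # fully connected layer
logits = W @ flat + b
probs = np.exp(logits - logits.max()) / np.exp(logits - logits.max()).sum()
```

The final softmax output is a probability score for each of the 10 hypothetical classes.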
Recurrent Neural Network(RNN)
• A Recurrent Neural Network (RNN) is a type of neural network where the output from the previous step is fed as input to the current step.
• In traditional neural networks, all the inputs and outputs are independent of each other.
• However, in cases where we must predict the next word of a sentence, the previous words are required, and hence there is a need to remember them.
• Thus RNN came into existence, which solved this issue with the help of a Hidden Layer.
The main and most important feature of RNN is its Hidden state, which remembers
some information about a sequence. The state is also referred to as Memory
State since it remembers the previous input to the network.
• It uses the same parameters for each input as it performs the same task on all the
inputs or hidden layers to produce the output. This reduces the complexity of
parameters, unlike other neural networks.
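The hidden-state update described above can be sketched as follows; the input and hidden sizes are hypothetical, and the tanh activation is one common choice:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical sizes: 3-dimensional inputs, 5-dimensional hidden (memory) state.
W_xh = rng.standard_normal((5, 3))   # input-to-hidden weights, shared across steps
W_hh = rng.standard_normal((5, 5))   # hidden-to-hidden weights, shared across steps
b_h = np.zeros(5)

inputs = [rng.random(3) for _ in range(4)]   # a sequence of 4 input vectors
h = np.zeros(5)                              # initial hidden (memory) state
for x_t in inputs:
    # The same parameters are applied at every step; h carries information
    # about all previous inputs forward through the sequence.
    h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)
```

Because the same `W_xh`, `W_hh`, and `b_h` are reused at every step, the parameter count does not grow with the sequence length.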
• Types Of RNN
• There are four types of RNNs based on the number of
inputs and outputs in the network.
1.One to One
2.One to Many
3.Many to One
4.Many to Many
• One to One
• This type of RNN behaves the same as any simple neural network; it is also known as a Vanilla Neural Network. In this network, there is only one input and one output.
How does RNN work?