
Convolutional Neural Networks

Let's understand the notion of a convolutional neural network (CNN) and how it is
different from the kind of neural network we learned earlier.

Let's start by understanding the first layer. Say our input is a grayscale image of size 256 x 256. A grayscale image has only one channel, so the image size is 256 x 256 x 1.


We can break this image up into 8 x 8 image patches. The patch is nothing but a small
portion of an image.

In total, we have a 32 x 32 grid of these patches that comprise the image.

Now let's consider a linear function on just an 8 x 8 image patch. This function is sometimes called a filter.

In our case, a filter is an 8 x 8 grid of weights, learned during backpropagation, that captures patterns in the image. When we apply a filter to an image patch, we take the inner product between them as vectors.

So if we apply a filter to each one of the patches, we get a new 32 x 32 grid of numbers. What good is this? Remember, the intuition is that the first few layers detect simple features like edges.

We can think of a filter as a simple feature detector. Instead of learning different parameters for each image patch, why not share the same parameters across all of them? This drastically reduces the number of parameters, so there is far less to learn. If we take this idea to its natural conclusion, we get convolutional neural networks.
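The patch-and-shared-filter idea above can be sketched in a few lines of NumPy. This is a minimal illustration with non-overlapping patches and a single filter; the function name and the random inputs are made up for the example.

```python
import numpy as np

def apply_shared_filter(image, filt):
    """Apply one shared filter to every non-overlapping patch of the image.

    image: (H, W) grayscale array; filt: (p, p) grid of weights.
    Returns an (H // p, W // p) grid of inner products.
    """
    p = filt.shape[0]
    gh, gw = image.shape[0] // p, image.shape[1] // p
    out = np.empty((gh, gw))
    for i in range(gh):
        for j in range(gw):
            patch = image[i * p:(i + 1) * p, j * p:(j + 1) * p]
            out[i, j] = np.sum(patch * filt)  # inner product as vectors
    return out

image = np.random.rand(256, 256)   # a 256 x 256 grayscale image
filt = np.random.randn(8, 8)       # one 8 x 8 filter, shared by all patches
grid = apply_shared_filter(image, filt)
print(grid.shape)  # (32, 32)
```

Note that the shared filter contributes only 64 weights in total, instead of a separate set of 64 weights for each of the 1024 patches.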

In the first layer, we have a collection of filters, each one is applied to each of the image
patches,
and together they give the output of the first layer, after applying a
non-linearity at the end. This is already a major innovation because it means we can
work with much larger neural networks in practice. Just the first few layers are
convolutional and the others are general and fully connected.

Another important idea is the notion of dropout.

Here when we compute how well a neural network classifies some image, say through
the quadratic cost function, we instead randomly delete some fraction of the network,

and then compute the new function from the inputs to the outputs. The idea is that if a neural network continues to work even when we drop perceptrons from the intermediate layers, it must be spreading information out so that no node is a single point of failure. Training a neural network with dropout makes the function that we learn more robust.
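The random deletion described above can be sketched as follows. This version is the common "inverted dropout" variant, which is an assumption on my part; the text itself only describes deleting a fraction of the units.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(activations, p_drop=0.5):
    """Randomly zero a fraction p_drop of the activations (inverted dropout).

    Scaling the survivors by 1 / (1 - p_drop) keeps the expected activation
    unchanged, so nothing needs to be rescaled at test time.
    """
    mask = rng.random(activations.shape) >= p_drop
    return activations * mask / (1.0 - p_drop)

a = np.ones(10_000)
dropped = dropout(a, p_drop=0.5)
print((dropped == 0).mean())  # roughly half the units are deleted
```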

Additional Content:

We have understood neural networks and how they can be used to build robust
models using numerical data. Let's assume we have an image with height = 6, width =
6, and the number of channels = 3 (colored image).

So 6 x 6 x 3 = 108 numbers are required to fully describe the image. Suppose the first hidden layer of a neural network has 10 units. The total number of parameters (weights) is then 108 x 10 = 1080. So we need 1080 weights for just one layer, and in practice images are much larger, typically around 224 x 224. In such cases we get a huge number of parameters to train, which is computationally expensive and does not necessarily improve performance. To deal with this kind of problem, we have special neural networks called Convolutional Neural Networks (CNNs).

A convolutional neural network is a type of neural network used in image processing and image classification. It takes the pixels of an image as input and generates the desired output.

Let’s understand the various building blocks of CNN:


1. Convolution
2. Pooling
3. Padding
4. Stride
5. Fully Connected Layer

Convolution:
The first step of a CNN is to detect features like edges and shapes. This is done by applying a convolution to the image using filters (a filter is responsible for detecting some kind of shape).

Let’s understand this using an example:


We take an input image of 6 X 6 and convolve this 6 X 6 matrix with a 3 X 3 filter:

Note: The blue matrix on the right above represents just the first application of the filter to the first 3 X 3 portion of the 6 X 6 image. It does not represent the final output of the convolution. The blue matrix will actually produce just one final number: the sum of the element-wise products (the product of the big number and the small number inside each square).

So, for example, the blue matrix gives:

3*1 + 0*0 + 1*(-1) + 1*1 + 5*0 + 8*(-1) + 2*1 + 7*0 + 2*(-1) = -5

Similar numbers result from a moving application of the 3 X 3 filter to corresponding 3 X 3 regions in the 6 X 6 image, first horizontally across each row and then down the rows of the whole image. This eventually gives a 4 X 4 final output after the convolution, where each number of the 4 X 4 output is computed from a sum of products, like the -5 computed above.

So after the convolution, we finally get a 4 X 4 image. The first element of the 4 X 4 matrix is calculated by taking the first 3 X 3 matrix from the 6 X 6 image, multiplying it element-wise with the filter, and summing:

3*1 + 0*0 + 1*(-1) + 1*1 + 5*0 + 8*(-1) + 2*1 + 7*0 + 2*(-1) = -5

Similarly, we convolve over the entire image and get a 4 X 4 matrix in the end.
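The sliding computation above can be written out directly. Only the top-left 3 X 3 patch and the filter's products are given in the text, so the remaining entries of the 6 X 6 image below are made up to complete the example; the filter is the vertical edge detector implied by the products shown.

```python
import numpy as np

def convolve2d(image, filt):
    """'Valid' convolution: slide filt over image, summing element-wise products."""
    n, f = image.shape[0], filt.shape[0]
    out = np.empty((n - f + 1, n - f + 1))
    for i in range(n - f + 1):
        for j in range(n - f + 1):
            out[i, j] = np.sum(image[i:i + f, j:j + f] * filt)
    return out

# Top-left 3 x 3 patch matches the text; the rest is illustrative filler.
image = np.array([
    [3, 0, 1, 2, 7, 4],
    [1, 5, 8, 9, 3, 1],
    [2, 7, 2, 5, 1, 3],
    [0, 1, 3, 1, 7, 8],
    [4, 2, 1, 6, 2, 8],
    [2, 4, 5, 2, 3, 9],
])
filt = np.array([   # vertical edge detector: columns of 1, 0, -1
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1],
])
out = convolve2d(image, filt)
print(out.shape)   # (4, 4)
print(out[0, 0])   # -5.0, the sum computed above
```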

In this convolution the total number of parameters = [3 x 3 x 3 (numbers in each filter) x 10 (number of filters)] + 10 (biases) = 280.

280 is much smaller than the corresponding number of parameters we would need in a fully connected neural network, which demonstrates the computational efficiency of the convolution operation.
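The two parameter counts from this section can be checked side by side. The dense layer here is the 10-unit fully connected alternative from the earlier 6 x 6 x 3 example:

```python
# Convolutional layer: 10 filters of size 3 x 3 x 3, each with one bias term.
conv_params = 3 * 3 * 3 * 10 + 10
print(conv_params)   # 280

# Fully connected alternative: flatten the 6 x 6 x 3 input (108 numbers)
# and connect it to a layer of 10 units.
dense_params = 6 * 6 * 3 * 10
print(dense_params)  # 1080

# For a realistic 224 x 224 x 3 image the gap explodes,
# while the convolutional layer still needs only 280 parameters.
print(224 * 224 * 3 * 10)  # 1505280
```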

Filters:
Filters are responsible for locating objects in an image by detecting changes in the intensity values of the image. A classic example is an edge detector, a filter whose weights are chosen so that it responds to edges in an image.
For example:


In images, we have a lot of complex features that need to be detected other than
edges. For that purpose, we randomly initialize filter values, and the model itself will
learn the best filter values for feature detection during the backpropagation phase.

Pooling:
Pooling is another technique used to reduce the spatial size of the representation, in order to reduce the number of parameters and the computational cost of the network. For example, max pooling keeps only the largest value in each window of the input.
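A minimal sketch of max pooling, assuming non-overlapping windows (i.e. a stride equal to the window size); the function name and the sample matrix are made up for the example:

```python
import numpy as np

def max_pool(image, size=2):
    """Max pooling: keep the largest value in each size x size window."""
    n = image.shape[0] // size
    out = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            out[i, j] = image[i * size:(i + 1) * size,
                              j * size:(j + 1) * size].max()
    return out

x = np.array([
    [1, 3, 2, 1],
    [2, 9, 1, 1],
    [1, 3, 2, 3],
    [5, 6, 1, 2],
])
print(max_pool(x))
# [[9. 2.]
#  [6. 3.]]
```

Note that the 4 x 4 input shrinks to 2 x 2 with no learned parameters at all, which is exactly why pooling is cheap.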

Strides:
When performing convolution, we slide the filter from the top-left corner to the bottom-right corner of the image; the size of each shift is called the stride. The stride also helps with dimensionality reduction: as the stride increases, the output shrinks and the required computation decreases.
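The effect of the stride on the output size follows the standard formula floor((n - f) / s) + 1 for an n x n input, f x f filter, and stride s (this formula is standard but not stated in the text, so treat it as an added assumption):

```python
def conv_output_size(n, f, s=1):
    """Output side length for an n x n input, f x f filter, stride s, no padding."""
    return (n - f) // s + 1

print(conv_output_size(6, 3, s=1))  # 4, matching the 4 x 4 output above
print(conv_output_size(6, 3, s=2))  # 2: a larger stride shrinks the output faster
```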

Padding:
Convolutional layers reduce the size of the output. In cases where we want to preserve the size of the output and retain the information at the corners of the image, we can use padding, which adds extra rows and columns around the outer edges of the image. The size of the output then stays the same as the input. We usually fill the extra rows and columns with zeros (zero padding).

Fully Connected Layers:

The result of applying the different filters is a matrix, so we have to flatten that matrix into a vector to feed it into the fully connected layer. In the picture shown below, the first matrix is the result we get after the image goes through the convolutional layers, and the second is the flattened layer that acts as the input to the fully connected layers.

After flattening, we pass this vector as input to the fully connected part of the network in order to get the results.

We now have an understanding of the building blocks of a CNN. CNNs do nothing more than arrange these building blocks in the right order. Usually, that order is a convolution layer followed by a pooling layer (repeated multiple times), and finally one or more fully connected layers.
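A shape-level walk through such a stack makes the arrangement concrete. The input size, filter sizes, and filter count below are hypothetical, chosen only to show how each block transforms the spatial dimensions:

```python
def conv(n, f):       # 'valid' convolution, stride 1: output side = n - f + 1
    return n - f + 1

def pool(n, size=2):  # non-overlapping pooling halves the side length
    return n // size

# Hypothetical stack on a 28 x 28 input: conv -> pool -> conv -> pool -> flatten.
n = 28
n = conv(n, 3)   # 26
n = pool(n)      # 13
n = conv(n, 4)   # 10
n = pool(n)      # 5
channels = 16    # assume the last convolution layer has 16 filters
flattened = n * n * channels
print(flattened)  # 400: length of the vector fed to the fully connected layer
```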

Let’s look at one of the historically famous CNN architectures in Deep Learning.

AlexNet :
● AlexNet is a masterpiece created by the SuperVision group, which included Alex
Krizhevsky, Geoffrey Hinton, and Ilya Sutskever from the University of Toronto.
● The winner of ImageNet 2012, AlexNet showed that deep learning was the way forward towards achieving the lowest error rates in computer vision tasks.


What is the architectural structure of AlexNet?

● A distinctive feature of AlexNet is its use of overlapping pooling to reduce the size of the network.
● With five convolutional layers and three fully connected layers, and the ReLU activation applied after every convolutional and fully connected layer, AlexNet showed the way towards state-of-the-art image classification results.
● ReLU speeds up training and improves accuracy; the regularization technique AlexNet uses is Dropout.

There are several other architectures that can be explored to get a better
understanding of which building blocks are to be used in the implementation of CNNs.
