
Convolutional Neural Networks

Neural Networks / Multi-layer Perceptron

Perceptron / Artificial Neuron

Regular Neural Networks

Fig: https://cs231n.github.io/convolutional-networks/

Neural Networks: Layers and Functionality

In a regular neural network there are three types of layers:


Input Layer: The layer through which we give input to the model. The number of neurons in this layer equals the number of features in the data (the number of pixels in the case of an image).


Hidden Layers: The input layer feeds into the hidden layers. There can be many hidden layers, depending on the model and the data size, and each hidden layer can have a different number of neurons, generally greater than the number of features. The output of each layer is computed by matrix multiplication of the previous layer's output with that layer's learnable weights, followed by the addition of learnable biases and an activation function, which makes the network nonlinear.


Output Layer: The output of the last hidden layer is fed into a logistic function such as sigmoid or softmax, which converts the score for each class into a probability.

Fig: https://cs231n.github.io/convolutional-networks/
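The three layer types above can be sketched as a forward pass in NumPy. This is a minimal illustration with made-up layer sizes (4 inputs, 8 hidden neurons, 3 classes), not an architecture from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

x = rng.random(4)                 # input layer: 4 features

W1 = rng.standard_normal((8, 4))  # hidden layer: learnable weights (8 neurons)
b1 = np.zeros(8)                  # hidden layer: learnable biases
h = np.maximum(0, W1 @ x + b1)    # matrix multiply + bias + ReLU nonlinearity

W2 = rng.standard_normal((3, 8))  # output layer weights (3 classes)
b2 = np.zeros(3)
scores = W2 @ h + b2

# Softmax converts the class scores into probabilities that sum to 1.
probs = np.exp(scores - scores.max())
probs /= probs.sum()
```

Each hidden layer repeats the same pattern: matrix multiplication, bias addition, then a nonlinearity.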

Convolutional Neural Networks

Neural networks that use convolution in place of general matrix
multiplication in at least one of their layers.

Commonly used in Computer Vision.

Fig: https://www.geeksforgeeks.org/introduction-convolution-neural-network/

Convolutional Neural Networks

The Convolutional layer applies
filters to the input image to
extract features.

The Pooling layer downsamples
the image to reduce
computation.

The fully connected layer makes
the final prediction.

The network learns the optimal
filters through backpropagation
and gradient descent.

Fig: https://www.geeksforgeeks.org/introduction-convolution-neural-network/

Convolution

Convolution is a mathematical operation that combines two
functions, say f(t) and g(t), and is denoted by *:

f(t) * g(t)

(Two functions can also be combined by adding or multiplying them; convolution is a third way.)

The convolution operation is commutative and associative.
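As a quick sketch, discrete 1-D convolution computes (f * g)[n] = Σₖ f[k]·g[n−k]; the small arrays below are illustrative values, not from the slides:

```python
import numpy as np

f = np.array([1.0, 2.0, 3.0])
g = np.array([0.5, 1.0])

# "Full" discrete convolution of the two sequences.
conv = np.convolve(f, g)
print(conv)  # [0.5 2.  3.5 3. ]

# Commutativity: f * g gives the same result as g * f.
assert np.allclose(conv, np.convolve(g, f))
```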


Example: yelling in an echoing room. Let f(t) be how loudly I yell at minute t, and g(t) be how much of a yell's sound remains in the room t minutes later (the sound impulse response).

At minute 0, I am yelling, so f(0) = 1, and the sound remaining in
the room = f(0).g(0).

At minute 1, I yell again.

So the total sound at minute 1 = f(0).g(1) + f(1).g(0)

At minute 2, I yell again.

Total sound at minute 2 = f(0).g(2) + f(1).g(1) + f(2).g(0)
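The running sums above are exactly a discrete convolution. A small sketch with illustrative decay values (yells at minutes 0, 1, 2; each yell's echo halves every minute):

```python
import numpy as np

f = np.array([1.0, 1.0, 1.0])   # yell loudness at minutes 0, 1, 2
g = np.array([1.0, 0.5, 0.25])  # sound remaining t minutes after a yell

total = np.convolve(f, g)
# total[2] = f(0)g(2) + f(1)g(1) + f(2)g(0), the same sum as in the slide.
print(total[:3])  # [1.   1.5  1.75]
```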

Convolution in Image Processing

Convolution is used to process an input image and transform it into a
form that is more useful for downstream processing.

A kernel (filter) is used to transform the input image. Kernels are
smaller than the input; for example, a kernel can be a 3x3 or 5x5 matrix.

Example: a 6x6 input matrix convolved with a 3x3 kernel produces a 4x4 output.
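A minimal sketch of this operation, with no padding and stride 1 (so a 6x6 input and a 3x3 kernel give a 4x4 output); the averaging kernel is an illustrative choice:

```python
import numpy as np

def conv2d(image, kernel):
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # Element-wise multiply the kernel with the patch it covers, then sum.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(36, dtype=float).reshape(6, 6)
kernel = np.ones((3, 3)) / 9.0      # averaging (box blur) kernel
print(conv2d(image, kernel).shape)  # (4, 4)
```

(Strictly, deep learning libraries implement cross-correlation, i.e. they slide the kernel without flipping it; for learned kernels the distinction does not matter.)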

Convolutional Networks


In convolutional network terminology, the first argument to the
convolution is often referred to as the input.

The second argument as the kernel.

The output is sometimes referred to as the feature map.

In machine learning applications, the input is usually a multidimensional
array of data, and the kernel is usually a multidimensional array of
parameters that are adapted by the learning algorithm.
Motivation

Sparse interactions / sparse connectivity / sparse weights

In a normal neural network, every output unit interacts with every input unit through the weights on the interconnections.

With convolution, small features can be detected with kernels that are much smaller than the input, so each output value depends on only a small group of input values.
Sparse Connectivity

Fig: receptive field of S3 with convolution (sparse) vs. with matrix multiplication (dense). With convolution, the receptive fields of units in deeper layers are larger than those in shallow layers.
Parameter Sharing

Using the same parameter for more than one function in a model.

A typical neural network has a separate weight for each
connection; each of these weights needs to be stored and learned.

In convolution, we reuse the parameters in the kernel, and the
kernel is much smaller than the input and output.
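The savings can be made concrete with a back-of-the-envelope count; the 32x32 input size here is an illustrative assumption, not from the slides:

```python
# Parameter counts: dense layer vs. one shared 3x3 conv kernel,
# on a 32x32 single-channel image (valid convolution -> 30x30 output).
in_units = 32 * 32
out_units = 30 * 30

dense_params = in_units * out_units  # one weight per connection
conv_params = 3 * 3                  # one shared kernel, reused at every position

print(dense_params)  # 921600
print(conv_params)   # 9
```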

Example CNN Architecture

Why Pooling?

We want the CNN to have a property called spatial invariance:
it should identify a feature irrespective of where in the image it
appears, and whether it is tilted, squished, elongated, etc.

Pooling also reduces the number of parameters and so helps prevent
overfitting; irrelevant features are removed.

Too much pooling, however, can cause underfitting.
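A minimal sketch of the most common variant, 2x2 max pooling with stride 2, which halves the height and width (the input values are illustrative):

```python
import numpy as np

def max_pool_2x2(x):
    h, w = x.shape
    # Group the array into 2x2 blocks and take the maximum of each block.
    return x[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

x = np.array([[1, 2, 0, 1],
              [3, 4, 1, 0],
              [0, 1, 5, 6],
              [2, 0, 7, 8]], dtype=float)
print(max_pool_2x2(x))  # [[4. 1.]
                        #  [2. 8.]]
```

Only the strongest response in each neighborhood survives, which is what makes the output tolerant to small shifts of the feature.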

Types of Convolution

Traditional convolution – the kernels are shared.

Multiple-kernel convolution – multiple filters are used to extract different
features.

Unshared convolution – a separate kernel is used at each stride; this results
in locally connected layers (DeepFace uses this).

Locally connected layers are useful when we know that each feature
should be a function of a small part of space, but there is no reason to
think that the same feature should occur across all of space.

Tiled convolution – a compromise between convolution and
unshared convolution: cycle through a set of filters for each stride.
https://www.youtube.com/watch?v=FYlqTp2IoCY
Separable Convolution

There are two kinds: spatial separable convolution and depthwise
separable convolution.

Spatial separable convolution

Deals with the spatial dimensions of an image and kernel (height and width).

A spatial separable convolution divides a kernel into two smaller
kernels. The most common case is to divide a 3x3 kernel into a 3x1 and
a 1x3 kernel.

Instead of doing one convolution with 9 multiplications per output position,
we do two convolutions with 3 multiplications each (6 in total) to achieve
the same effect.

With fewer multiplications, computational complexity goes down, and the
network is able to run faster.

https://medium.com/towards-data-science/a-basic-introduction-to-separable-convolutions-b99ec3102728
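The split can be seen directly with the Sobel edge-detection kernel mentioned on the next slide: it is the outer product of a 3x1 and a 1x3 kernel, so convolving with the two small kernels in sequence matches one convolution with the full kernel:

```python
import numpy as np

col = np.array([[1.0], [2.0], [1.0]])   # 3x1 kernel
row = np.array([[1.0, 0.0, -1.0]])      # 1x3 kernel
sobel = col @ row                       # outer product rebuilds the full 3x3 Sobel kernel

print(sobel)
# [[ 1.  0. -1.]
#  [ 2.  0. -2.]
#  [ 1.  0. -1.]]
```

Applying `col` then `row` costs 3 + 3 = 6 multiplications per output position instead of 9.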
Fig: separating the Sobel kernel, which is used to identify edges.

But not all kernels can be separated into smaller kernels.
Depthwise Separable Convolution

If we want to increase the number of channels in the output
image, say to 8x8x256, we need 256 kernels:

Create 256 kernels that each produce an 8x8x1 image, then stack them up
together to create an 8x8x256 output.

In summary, a normal convolution here is 12x12x3 → (5x5x3x256) → 8x8x256,
where 5x5x3x256 is the height, width, number of input channels, and
number of kernels (output channels).

Depthwise – Step 1

Convolve the input image without changing the depth:
use 3 kernels of shape 5x5x1, one per input channel, and stack the
results into an 8x8x3 image.
Pointwise Separation – Step 2

The pointwise convolution is so named because it uses a 1x1
kernel, or a kernel that iterates through every single point.

This kernel has a depth of however many channels the input
image has; in our case, 3.

Therefore, we iterate a 1x1x3 kernel through our 8x8x3 image, to
get a 8x8x1 image.


We can create 256 1x1x3 kernels that each output an 8x8x1 image,
then stack them to get a final image of shape 8x8x256.
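Counting multiplications shows why the two-step version is cheaper. This sketch uses the numbers from the example above (12x12x3 input, 5x5 kernels, 256 output channels, valid convolution with an 8x8 output):

```python
out_positions = 8 * 8

# Normal convolution: 256 kernels of shape 5x5x3, each applied at every position.
normal = 256 * (5 * 5 * 3) * out_positions

# Depthwise separable convolution:
depthwise = 3 * (5 * 5) * out_positions   # step 1: three 5x5x1 kernels
pointwise = 256 * 3 * out_positions       # step 2: 256 pointwise 1x1x3 kernels
separable = depthwise + pointwise

print(normal)     # 1228800
print(separable)  # 53952
```

The separable version needs roughly 4% of the multiplications for the same output shape.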


Separable Convolution

Convolution with a smaller number of parameters and multiplications.


https://www.youtube.com/watch?v=vCJ4magCPts
