16-Optimization and Loss Functions in Classifiers, Convolution Layers, Max Pool Layers-24!08!2024

The document provides an overview of Convolutional Neural Networks (CNN), detailing their architecture, including convolution layers, pooling layers, and fully connected layers. It explains key concepts such as padding, stride, dropout, and data augmentation, as well as the calculation of parameters and output shapes for different scenarios. Additionally, it discusses various CNN architectures and techniques like transfer learning and normalization.

Module -3

CNN
• Convolution layers,
• max pool layers,
• ELU, Gradient Descent,
• training CNN - initialization,
• CNN architectures: VGG, GoogLeNet,
• ResNet,
• dropout, normalization,
• update rules,
• data augmentation,
• transfer learning
CNN
• Convolutional Neural Networks (CNN, or ConvNets) are a type of
multi-layer neural network designed to recognize visual
patterns directly from pixel images.
• A convolutional neural network is made up of several kinds of layers,
such as convolution layers, pooling layers, and fully connected
layers, and it uses the backpropagation algorithm to learn spatial
hierarchies of features automatically and adaptively.
• In a convolutional neural network, the kernel is a filter
that is used to extract features from the images.
• Output size = (i - k) + 1
• i -> size of input, k -> size of kernel
• Stride
• Stride is a parameter of the filter that controls how far the filter
moves across the image or video at each step. With stride 1 the filter
moves one pixel at a time; with stride 2 it moves two pixels at a time,
skipping every other position.
• Output size = floor((i - k)/s) + 1
• i -> size of input, k -> size of kernel, s -> stride
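The two output-size formulas above can be sketched as a small helper function (the function name is my own, not from the slides):

```python
# Sketch: output size of a convolution along one dimension.
def conv_output_size(i, k, s=1, p=0):
    """i: input size, k: kernel size, s: stride, p: padding per side."""
    return (i - k + 2 * p) // s + 1

print(conv_output_size(7, 3))        # stride 1: (7 - 3) + 1 = 5
print(conv_output_size(7, 3, s=2))   # stride 2: (7 - 3)//2 + 1 = 3
```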
Stride = 1 and stride = 2
• Pooling
• Pooling in convolutional neural networks is a technique for
generalizing features extracted by convolutional filters and
helping the network recognize features independent of their
location in the image.
• Flatten
• Flattening converts the 2-dimensional pooled feature maps
into a single long linear vector. The flattened vector is fed
as input to the fully connected layer to classify the image.
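The two operations above (max pooling, then flattening) can be sketched in plain numpy; the helper name is my own, not from the slides:

```python
import numpy as np

# Sketch of 2x2 max pooling (stride 2) followed by flattening.
def max_pool2d(x, size=2):
    h, w = x.shape
    # Group the array into size x size blocks and keep the max of each block.
    return x[:h - h % size, :w - w % size].reshape(
        h // size, size, w // size, size).max(axis=(1, 3))

fmap = np.array([[1, 3, 2, 4],
                 [5, 6, 7, 8],
                 [9, 2, 1, 0],
                 [3, 4, 5, 6]])
pooled = max_pool2d(fmap)   # 4x4 feature map -> 2x2
flat = pooled.flatten()     # 1-D vector fed to the fully connected layer
print(pooled)
print(flat)
```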
• Convolutional layer
• This is the first layer, used to extract the various features
from the input images. In this layer, a filter (kernel) slides over
the input image to extract features.

• Pooling layer
• The primary aim of this layer is to decrease the size of the convolved
feature map to reduce computational costs.
• This is performed by decreasing the connections between layers and
independently operating on each feature map.
• Depending upon the method used, there are several types of pooling
operations; the most common are max pooling and average pooling.
• Fully-connected layer
• The Fully Connected (FC) layer consists of the weights and biases along with the
neurons and is used to connect the neurons between two different layers.
• These layers are usually placed before the output layer and form the last few
layers of a CNN Architecture.
• Dropout
• The Dropout layer acts as a mask that nullifies the contribution of some
randomly chosen neurons towards the next layer and leaves all the others unmodified.
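A minimal illustration of the dropout mask idea (this is a sketch of "inverted dropout" in numpy, not the internals of any particular library):

```python
import numpy as np

# Each neuron is zeroed with probability p; survivors are scaled by
# 1/(1-p) so the expected activation is unchanged (inverted dropout).
rng = np.random.default_rng(0)
p = 0.5
activations = np.ones(8)
mask = rng.random(8) >= p          # boolean mask: True = neuron kept
dropped = activations * mask / (1 - p)
print(dropped)                     # a mix of 0.0 (dropped) and 2.0 (kept, rescaled)
```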
Padding
• There are two approaches to resize smaller images up to the
fixed size: zero-padding and scaling them up (zooming in) using
interpolation.
• Advantages of padding: no loss of information at the image borders, and easier control of the output size.
Padding size
• The possible values for the padding size, P, depends on the
input size, the filter size F, and the stride S.
• Assume width and height are the same.
• Ensure the output size, (W−F+2P)/S+1, is an integer.
• When S=1, requiring the output size to equal the input size gives
P=(F−1)/2 as a necessary condition.
• Usually we need to consider the three parameters, namely W, F,
and S to determine valid values of P.
Padding
• Same padding
• Causal padding
• Valid padding
Valid padding (or no padding):
• Valid padding is simply no padding.
• This is the default Keras chooses if padding is not specified.
• When an (n x n) image is convolved with an (f x f) filter using valid
padding, the output image size is (n-f+1) x (n-f+1).
Same padding
• Same padding is used when we need an output of the same shape as the input.
• It calculates and adds the padding required to the input image so that the
shape is the same before and after the convolution.
• If the padding values are zeroes, it is called zero padding.
• When the zero padding is set to 1, a 1-pixel border of zeros is added around
the image.
• When we use an (n x n) image and an (f x f) filter and add padding (p) to the
image, the output image size is (n x n).
• That means it restores the size of the image.
• The following equation represents the sizes of input and output with same
padding:
• [(n + 2p) x (n + 2p) image] * [(f x f) filter] —> [(n x n) image]
• The value of p = (f-1)/2, since n + 2p - f + 1 = n.
Python code to check why padding is required
Same padding
Valid Padding

• In valid padding we do not apply any padding; every pixel of the image that
the filter can fully cover is treated as valid, and positions where the filter
would run past the border (for example, the corners) are simply not included
in the coverage area.
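The slides show code screenshots for this comparison; a self-contained numpy sketch (helper names are my own) makes the same point, that valid padding shrinks the output while same padding restores the input size:

```python
import numpy as np

# Minimal 2-D convolution with optional zero padding.
def conv2d(img, kern, padding=0):
    if padding:
        img = np.pad(img, padding)   # zero border on every side
    n, f = img.shape[0], kern.shape[0]
    out = np.empty((n - f + 1, n - f + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = (img[i:i + f, j:j + f] * kern).sum()
    return out

img = np.arange(36, dtype=float).reshape(6, 6)
kern = np.ones((3, 3))
valid = conv2d(img, kern)             # no padding: 6x6 -> 4x4, i.e. (n-f+1)
same = conv2d(img, kern, padding=1)   # p = (f-1)/2 = 1: size restored to 6x6
print(valid.shape, same.shape)
```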
Same padding vs valid padding
• The same padding modes apply to max-pooling layers in the same way.
• With same padding, every point/pixel of the input contributes to the output
while learning, because the border is padded so the filter can be centred on
every pixel. With valid padding, only positions where the filter fits entirely
inside the input are used; it does not change the size of the input, it only
restricts which positions are treated as valid.
• In VALID (i.e., no padding) mode, TensorFlow will drop the right
and/or bottom cells if your filter and stride don't fully cover the
input image, whereas same padding tries to spread the padding evenly
across the frame of the image.
Causal Padding
• This is a special type of padding that works with
one-dimensional convolutional layers.
• It is used mainly in time-series analysis.
• Since a time series is sequential data, causal padding adds zeros
at the start of the data, so each output depends only on the current
and earlier time steps.
• This also helps in predicting the values of the early time steps.
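A sketch of causal padding on a 1-D series (the filter and variable names are my own): k-1 zeros are prepended, so the output at step t uses only inputs at t and earlier, and the output keeps the input's length.

```python
import numpy as np

series = np.array([1., 2., 3., 4., 5.])
kern = np.array([0.5, 0.5])   # simple 2-tap averaging filter
k = len(kern)

# Causal padding: k-1 zeros at the START of the series only.
padded = np.concatenate([np.zeros(k - 1), series])

# out[t] = 0.5 * x[t-1] + 0.5 * x[t]; x[-1] is the padded zero.
out = np.array([padded[t:t + k] @ kern for t in range(len(series))])
print(out)   # [0.5, 1.5, 2.5, 3.5, 4.5]
```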
Learning parameters rules
• If the previous layer was a dense layer, the input to the conv
layer is just the number of nodes in the previous dense layer.
• If the previous layer was a convolutional layer, the input will be
the number of filters from that previous convolutional layer.
Now, what's the output of a convolutional layer?
• With a dense layer, it was just the number of nodes.
• With a convolutional layer, the output will be the number of
filters times the size of the filters.
How to calculate number of parameters and shape of output in convolution layer

• Scenario 1:
• Input:
• filters = 1
• kernel_size = (3,3)
• input_shape = (10,10,1)
• Parameters:
• Weights in one filter of size (3,3) = 3*3 = 9
• Bias = 1
[One bias will be added to each filter. Since only one filter kernel is used,
bias = 1]
• Total parameters for one filter kernel of size (3,3) = 9 + 1 = 10
Output shape
• s → stride, p → padding, n → input size, f → filter size
• Stride by default = 1, padding is not mentioned (so p = 0)
• Output size = (n - f + 2p)/s + 1 = (10 - 3 + 0)/1 + 1 = 8
• Output shape of the feature map = (8, 8, 1)
• Guess what are the parameters missing here?
• 3, 10, 10, 1 (hidden)
Scenario 2:
• Input:
• filters = 5
• kernel_size=(3,3)
• input_shape=(10,10,1)
• Parameters:
• Weights in one filter of size(3,3)= 3*3 =9
• Bias =1
[One bias will be added to each filter.]
• Total parameters for one filter kernel of size (3,3) = 9 + 1 = 10
• The total number of filters = 5.
• Total parameters for all 5 filters = 10 * 5 = 50
• Output shape: (10 - 3)/1 + 1 = 8 per dimension.
• So, output shape of the feature map = (8, 8, 5)
Scenario 3:
• Input:
• filters = 5
• kernel_size=(3,3)
• input_shape=(100,100,3)
• Parameters:
• Weights in one filter of size (3,3) = 3*3 = 9. Three channels, hence 9*3
= 27.
• Bias = 1, so 27 + 1 = 28 parameters per filter.
[One bias will be added to each filter.]
• The total number of filters = 5.
• Total parameters for all 5 filters = 28 * 5 = 140
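The counting rule used in Scenarios 1-3 can be sketched as one function (the function name is my own): each filter has kernel_h * kernel_w * input_channels weights plus one bias, multiplied by the number of filters.

```python
# Sketch of the parameter count for a convolutional layer.
def conv_param_count(filters, kernel_size, in_channels):
    kh, kw = kernel_size
    return (kh * kw * in_channels + 1) * filters

print(conv_param_count(1, (3, 3), 1))   # Scenario 1: 10
print(conv_param_count(5, (3, 3), 1))   # Scenario 2: 50
print(conv_param_count(5, (3, 3), 3))   # Scenario 3: 140
```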
Scenario 4:

• input layer - images of size 20x20x3
• hidden convolutional layer 1 - 2 filters of size 3x3
• hidden convolutional layer 2 - 3 filters of size 3x3
• dense output layer - 2 nodes

• Hidden conv layer 1 (with a convolutional layer, the parameter count is
the filter size times the number of input channels times the number of
filters, plus one bias per filter)
• = 3 * 3 * 3 (input channels) * 2 (filters) = 54, + 2 (bias, since no. of
biases = no. of filters) = 56

• Hidden conv layer 2
• How many input channels are coming in to this layer? Two, from the
number of filters in the previous layer.
• = 3 * 3 * 2 (input channels) * 3 (filters) = 54, + 3 (bias) = 57
Output layer
• How many inputs are coming into this layer?
• Before passing the output from a convolutional layer to a dense layer,
the output must be flattened.
• Flattening is done by multiplying the spatial dimensions of the data
from the conv layer by the number of filters in that layer.
• In our case, the data is image data. Assuming same padding, the spatial
size stays 20x20, so the flattened input is 20 * 20 * 3 (filters) = 1200.
• The output layer has 2 nodes: 1200 * 2 = 2400, + 2 (bias) = 2402
parameters.
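Scenario 4's three counts can be checked with plain arithmetic (this sketch assumes 'same' padding, so the 20x20 spatial size survives into the dense layer, which is what the 1200 figure above implies):

```python
# Parameter counts for Scenario 4, layer by layer.
conv1 = (3 * 3 * 3 + 1) * 2     # 3x3 kernels over 3 input channels, 2 filters
conv2 = (3 * 3 * 2 + 1) * 3     # 3x3 kernels over 2 channels, 3 filters
dense = (20 * 20 * 3 + 1) * 2   # flattened 20x20x3 feature map into 2 nodes
print(conv1, conv2, dense)      # 56 57 2402
```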
• The model can be sequential or functional (graph-based).
• 4 parameters are passed:
• number of filters, kernel size, input shape, and activation function.
• Convolutional_1 : ((kernel_h * kernel_w * input_channels) + 1) * filters
= (3*3*1 + 1) * 32 = 320 parameters. In the first layer, the convolutional
layer has 32 filters.
• Convolutional_2 : number of trainable parameters in this layer is
(3 * 3 * 32 + 1) * 32 = 9248, and so on.
• Convolutional_3 : (3 * 3 * 32 + 1) * 64 = 18496, and so on.
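The arithmetic for these three layers, written out with the per-filter formula (kernel_h * kernel_w * input_channels + 1) * filters:

```python
# Parameter counts for the three convolutional layers above.
print((3 * 3 * 1 + 1) * 32)    # Convolutional_1: 1 input channel, 32 filters -> 320
print((3 * 3 * 32 + 1) * 32)   # Convolutional_2: 32 input channels, 32 filters -> 9248
print((3 * 3 * 32 + 1) * 64)   # Convolutional_3: 32 input channels, 64 filters -> 18496
```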
