16-Optimization and Loss Functions in Classifiers, Convolution Layers, Max Pool Layers-24!08!2024
CNN
• Convolution layers,
• max pool layers,
• ELU, Gradient Descent,
• training CNN-initialization,
• CNN architectures: VGG, GoogLeNet,
• ResNet,
• dropout, normalization,
• update rules,
• data augmentation,
• transfer learning,
CNN
• Convolutional Neural Networks (CNN, or ConvNet) are a type of
multi-layer neural network that is meant to recognize visual
patterns from pixel images.
• A convolutional neural network is made up of numerous layers,
such as convolution layers, pooling layers, and fully connected
layers, and it uses a backpropagation algorithm to learn spatial
hierarchies of data automatically and adaptively.
• In a convolutional neural network, the kernel is nothing but a filter
that is used to extract the features from the images.
• Formula: output size = (i - k) + 1
• i -> size of input, k -> size of kernel
• Stride
• Stride is a parameter of the neural network’s filter that controls
how far the filter moves across the image or video at each step.
With stride 1 the filter moves one pixel at a time; with stride 2 it
moves two pixels at a time, skipping every other position.
• Formula: output size = ⌊(i - k)/s⌋ + 1
• i -> size of input, k -> size of kernel, s -> stride
Stride = 1 vs. stride = 2
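To make the formulas concrete, here is a small Python helper (a sketch of the output-size formula above, not code from the original slides):

```python
import math

def conv_output_size(i, k, s=1):
    """Output size of a convolution: floor((i - k) / s) + 1."""
    return math.floor((i - k) / s) + 1

# Stride 1: a 3x3 kernel over a 5x5 input gives a 3x3 output.
print(conv_output_size(5, 3, s=1))  # 3
# Stride 2: the kernel moves two pixels at a time, so fewer positions fit.
print(conv_output_size(5, 3, s=2))  # 2
```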
• Pooling
• Pooling in convolutional neural networks is a technique for
generalizing features extracted by convolutional filters and
helping the network recognize features independent of their
location in the image.
• Flatten
• Flattening is used to convert all the resultant 2-Dimensional arrays
from pooled feature maps into a single long continuous linear
vector. The flattened matrix is fed as input to the fully connected
layer to classify the image.
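For instance (a minimal sketch, not from the slides), a 2x2 pooled feature map flattens row by row into a length-4 vector:

```python
# A pooled feature map of shape (2, 2) flattened row by row
# into a single continuous vector for the fully connected layer.
pooled = [[6, 4],
          [8, 9]]
flat = [value for row in pooled for value in row]
print(flat)  # [6, 4, 8, 9]
```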
• Convolutional layer
• This is typically the first layer and is used to extract various features
from the input image. In this layer, a filter (kernel) is convolved with
the input image to extract those features.
• Pooling layer
• The primary aim of this layer is to decrease the size of the convolved
feature map to reduce computational costs.
• This is performed by decreasing the connections between layers and
independently operating on each feature map.
• Depending on the method used, there are several types of pooling
operations; the most common are max pooling and average pooling.
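As an illustration of the two operations (a plain-Python sketch, not from the slides), the helper below applies max or average pooling with a 2x2 window and stride 2:

```python
def pool2d(fmap, size=2, stride=2, mode="max"):
    """Apply max or average pooling to a 2-D feature map (list of lists)."""
    h, w = len(fmap), len(fmap[0])
    out = []
    for r in range(0, h - size + 1, stride):
        row = []
        for c in range(0, w - size + 1, stride):
            window = [fmap[r + i][c + j]
                      for i in range(size) for j in range(size)]
            row.append(max(window) if mode == "max"
                       else sum(window) / len(window))
        out.append(row)
    return out

fmap = [[1, 3, 2, 4],
        [5, 6, 1, 2],
        [7, 2, 9, 0],
        [4, 8, 3, 5]]
print(pool2d(fmap, mode="max"))      # [[6, 4], [8, 9]]
print(pool2d(fmap, mode="average"))  # [[3.75, 2.25], [5.25, 4.25]]
```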
• Fully-connected layer
• The Fully Connected (FC) layer consists of the weights and biases along with the
neurons and is used to connect the neurons between two different layers.
• These layers are usually placed before the output layer and form the last few
layers of a CNN Architecture.
• Dropout
• The Dropout layer is a mask that nullifies the contribution of some
randomly selected neurons towards the next layer and leaves all others
unmodified.
Padding
• There are two approaches to resizing smaller images up to a
fixed size: zero-padding and scaling them up (zooming in) using
interpolation.
• Advantages of padding: speed and no loss of information at the image borders.
Padding size
• The possible values for the padding size, P, depends on the
input size, the filter size F, and the stride S.
• Assume width and height are the same.
• Ensure the output size, (W−F+2P)/S+1, is an integer.
• When S=1, requiring the output size to equal W gives P=(F−1)/2 as a
necessary condition.
• Usually we need to consider the three parameters, namely W, F,
and S to determine valid values of P.
Padding
• Same padding
• Causal padding
• Valid padding
Valid padding (or no padding):
• Valid padding is simply no padding.
• This is the default that Keras chooses if padding is not specified.
• When an (n x n) image is convolved with an (f x f) filter using valid
padding, the output image size is (n-f+1) x (n-f+1).
Same padding
• This is used when we need an output of the same shape as the input.
• It computes and adds the padding required to ensure the input and output
shapes match.
• If the padded values are zeroes, this is called zero padding.
• With zero padding, every pixel in the padded border has a value of zero.
• When the zero padding size is set to 1, a 1-pixel border of zeros is added
around the image.
• When we use an (n x n) image, an (f x f) filter, and add padding p to the
image, the output image size is (n x n).
• That means it restores the size of the image.
• The following equation represents the sizes of input and output with the same
padding.
• [(n + 2p) x (n + 2p) image] * [(f x f) filter] —> [(n x n) image].
• The value of p = (f-1)/2 since (n+2p-f+1) = n
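The relation p = (f-1)/2 can be verified numerically (a sketch, not from the slides; note f must be odd for p to be an integer):

```python
def same_padding(f):
    """Padding that keeps the output size equal to the input (stride 1)."""
    assert f % 2 == 1, "same padding with equal borders needs an odd filter size"
    return (f - 1) // 2

n, f = 6, 3
p = same_padding(f)
print(p)                  # 1
print(n + 2 * p - f + 1)  # 6, the input size is restored
```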
Python code to check why padding is required
Same padding
Valid Padding
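The original notebook code is not included in these notes; the following stand-in sketch shows why padding matters: stacking valid convolutions shrinks the feature map layer by layer, while same padding preserves its size.

```python
def out_size(n, f, p, s=1):
    """Convolution output size along one dimension: (n + 2p - f) // s + 1."""
    return (n + 2 * p - f) // s + 1

n_valid = n_same = 28  # e.g. a 28x28 input image
f = 3
for _ in range(5):  # five stacked 3x3 convolutions
    n_valid = out_size(n_valid, f, p=0)           # valid: shrinks by 2 per layer
    n_same = out_size(n_same, f, p=(f - 1) // 2)  # same: size preserved
print(n_valid, n_same)  # 18 28
```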
• With valid padding, no padding is applied: the filter is placed only
where it fully overlaps the input, so every position it covers is valid.
Border pixels that the filter cannot fully cover are simply not included
in the output.
Same padding vs valid padding
• Valid padding is used with max-pooling layers in the same way.
• With same padding, every point/pixel of the input contributes while
the model is learning; with valid padding, only the positions the filter
fully covers are treated as valid, so the result is governed by which
pixel positions are valid rather than by the size of the input.
• In VALID (i.e. no padding) mode, TensorFlow drops the right
and/or bottom cells if the filter and stride do not fully cover the
input image, whereas same padding spreads the padding evenly
around the frame of the image.
Causal Padding
• Causal padding pads only the start of a 1-D (temporal) input, so that
the output at time t depends only on inputs at times up to t.
• Scenario 1:
• Input:
• filters = 1
• kernel_size=(3,3)
• input_shape=(10,10,1)
• Parameters:
• Weights in one filter of size (3,3) = 3*3 = 9
• Bias =1
[One bias will be added to each filter. Since only one filter kernel is used,
bias =1]
• Total parameters for one filter kernel of size (3,3) = 9+1 =10
Output shape
• s →stride, p →padding, n →input size, f →filter size
• Stride by default =1 , padding is not mentioned (so,p=0)
• Output size = n - f + 1 = 10 - 3 + 1 = 8, so the output feature map is 8 x 8.
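Scenario 1 can be checked with two small helper functions (a sketch, not from the slides):

```python
def conv_params(kernel_h, kernel_w, in_channels, filters):
    """Trainable parameters of a conv layer: weights plus one bias per filter."""
    return (kernel_h * kernel_w * in_channels + 1) * filters

def conv_out(n, f, p=0, s=1):
    """Output size along one dimension: (n + 2p - f) // s + 1."""
    return (n + 2 * p - f) // s + 1

# Scenario 1: filters=1, kernel_size=(3,3), input_shape=(10,10,1)
print(conv_params(3, 3, 1, 1))  # 10 (9 weights + 1 bias)
print(conv_out(10, 3))          # 8  (output feature map is 8x8)
```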