Convolutional Neural Network - 5

This document provides an overview of convolutional neural networks including their basic structure, common layer types such as convolutional, ReLU, pooling and fully connected layers, and techniques like padding, strides, data augmentation and training. CNNs are biologically inspired networks used in applications like image classification and pattern recognition.

Uploaded by

Basem Mostafa


Convolutional Neural Network (CNN)

Basic structure of CNN
• In a CNN, the nodes in each layer are arranged according to the spatial
grid structure of the input.
• It is important to maintain these spatial relationships of the input
regions through the network layers.
• A CNN functions much like a traditional feed-forward neural
network, except that the operations in its layers are organized with sparse
connections.
• CNNs are biologically inspired networks used for:
✓ Image classification
✓ Pattern recognition

Cont.
• Each layer in a CNN is a 3-dimensional grid structure (height, width and
depth).
• Depth means the number of channels in a layer (e.g., a color image has 3
channels).
• The depth may increase in later layers, corresponding to the number of
newly extracted features.
• Features in lower-level layers capture lines or simple shapes, whereas
features in higher-level layers capture complex shapes like loops.
• Four types of layers in CNN architecture:
“convolutional - ReLU - pooling - fully connected “

Convolution layer
• A convolutional layer is the main building block of a CNN. It contains a
set of filters (or kernels) organized as 3-dimensional
structural units.
• A filter is usually square and much smaller than the layer it
is applied to.
• A filter has the same depth as the layer it is applied to.
• The depth of the next layer equals the number of filters used.
• The filter is placed at each position in the image where it fully
overlaps with the image.
• The dot product is computed between the corresponding elements of the
filter and the local region of the image.

Cont.
• If the size of the image (input layer) is 32 x 32 and the filter size is 5 x 5, then the
size of the next layer is 28 x 28:
32 - 5 + 1 = 28
32 - 5 + 1 = 28
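The size rule above can be sketched as a one-line helper (the function name is illustrative, not from the slides):

```python
# Output spatial size of a convolution without padding:
# an N x N input and an F x F filter give (N - F + 1) positions per axis.
def conv_output_size(input_size: int, filter_size: int) -> int:
    return input_size - filter_size + 1

print(conv_output_size(32, 5))  # 28, matching the slide's 32 - 5 + 1
```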

Cont.
• How does a filter work in a convolutional layer?
The filter tries to identify a particular type of pattern in a small
rectangular region. Thus, a large number of filters is required to
capture all possible shapes.
• The next figure represents a horizontal edge detector on a gray image with
one channel (a vertical edge gives zero activation, whereas a horizontal
edge gives a high activation value).
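The horizontal edge detector can be sketched in numpy; the kernel values here are one common choice, shown for illustration. One convolution step is the elementwise product plus sum described above:

```python
import numpy as np

# Horizontal edge detector on a single-channel (gray) image: a vertical
# edge gives zero activation, a horizontal edge a high activation.
kernel = np.array([[ 1,  1,  1],
                   [ 0,  0,  0],
                   [-1, -1, -1]])

horizontal_edge = np.array([[1, 1, 1],
                            [1, 1, 1],
                            [0, 0, 0]])   # bright above, dark below

vertical_edge = np.array([[1, 1, 0],
                          [1, 1, 0],
                          [1, 1, 0]])     # bright left, dark right

# One convolution position = elementwise product + sum (a dot product)
print(np.sum(kernel * horizontal_edge))  # 3 -> strong activation
print(np.sum(kernel * vertical_edge))    # 0 -> no activation
```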

Hierarchical Feature Engineering

Filters detect low-level features like edges in early layers and combine
them to create high-level features like rectangles in later layers.

Padding
• One observation is that convolution operations reduce the original size.
• The operation tends to lose some information along the border of the
image. This problem can be resolved by padding.
• Padding: "adding pixels (set to zero) around the border of a feature map
in order to maintain the spatial footprint".
• Padding is performed in all layers, not just in the first layer.
• Padding gives the filter more space to cover in the image.
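Zero padding can be sketched with numpy's `np.pad` (the sizes below are illustrative): padding by (F - 1) / 2 pixels on each border keeps the spatial size unchanged for an odd F x F filter.

```python
import numpy as np

# Zero padding with np.pad: p = (F - 1) // 2 pixels of zeros per border
# preserve the spatial size under an F x F filter.
feature_map = np.ones((28, 28))
F = 5
p = (F - 1) // 2                  # 2 pixels of zeros on each side
padded = np.pad(feature_map, p)   # zero padding is np.pad's default mode

print(padded.shape)               # (32, 32)
print(padded.shape[0] - F + 1)    # 28 -> output size preserved
```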

Strides
• It is not necessary to perform the convolution operation at every spatial
position in the layer.
• A stride is, literally, "a long step, or the distance covered by such a step".
• It is most common to use a stride of 1 (sometimes 2).
• Larger strides reduce the spatial size of the layers and therefore reduce
the storage required.
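With a stride, the output size rule generalizes to (N - F) / S + 1. A minimal sketch, assuming the stride divides the span evenly (function name is illustrative):

```python
# Output spatial size with stride S: (N - F) // S + 1 positions per axis,
# assuming an even fit without padding.
def conv_output_size(input_size: int, filter_size: int, stride: int = 1) -> int:
    return (input_size - filter_size) // stride + 1

print(conv_output_size(32, 5, 1))  # 28 -> stride 1, as before
print(conv_output_size(32, 4, 2))  # 15 -> a larger stride shrinks the layer
```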

ReLU layers
• A ReLU layer typically follows a convolution layer.
• It has the same form (discussed earlier) as in traditional neural networks.
• It doesn't reduce the size of the layers, because it is a one-to-one mapping
of activation values.
• In earlier years, sigmoid and tanh were used.
• More recently, ReLU has been used to improve speed and accuracy.
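The one-to-one mapping is simply max(0, x) applied elementwise, which is why the layer size never changes:

```python
import numpy as np

# ReLU maps each activation to max(0, x): negatives become zero,
# the layer's shape is unchanged.
def relu(x):
    return np.maximum(0, x)

a = np.array([[-2.0, 3.0],
              [ 0.5, -1.0]])
print(relu(a))        # [[0, 3], [0.5, 0]]
print(relu(a).shape)  # (2, 2) -> same size as the input
```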

Pooling
• The pooling operation works on small regions in each layer and produces
another layer with the same depth.
• Two types of pooling: max-pooling and average pooling.
• The most common type of pooling is max-pooling.
• Max-pooling returns the maximum value in the local region.
• It is more common to use a stride greater than 1 in pooling (e.g., 2 x 2
pooling with a stride of 2).
• Pooling operates independently on each feature map to produce
another feature map, whereas convolution works on all feature maps
simultaneously to produce a single one.
• Advantage: pooling layers downsample (reduce) feature maps by
summarizing the presence of features in patches of the feature map.
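A 2 x 2 max-pooling with stride 2 can be sketched on one feature map with a reshape trick (assuming even height and width; the function name is illustrative):

```python
import numpy as np

# 2 x 2 max-pooling with stride 2, applied to one feature map:
# group pixels into non-overlapping 2 x 2 blocks and take each block's max.
def max_pool_2x2(fmap):
    h, w = fmap.shape
    return fmap.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

fmap = np.array([[1, 3, 2, 0],
                 [4, 2, 1, 1],
                 [0, 1, 5, 6],
                 [2, 2, 7, 8]])
print(max_pool_2x2(fmap))
# [[4 2]
#  [2 8]]  -> 4 x 4 downsampled to 2 x 2
```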

Fully connected layer
• A fully connected layer is simply a feed-forward neural network. It
forms the last few layers in the network.
• The input to the fully connected layer is the output from the final
pooling or convolutional layer.
• All neurons in a fully connected layer connect to all neurons in the
previous layer.
• In most cases, more than one fully connected layer is used to
increase the computational power towards the end of the network.
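A minimal shape sketch of the flatten-then-dense step, with illustrative sizes (a 5 x 5 x 16 pooled output and 120 units, which happen to match LeNet-5's first fully connected layer); the weights are random, for shapes only:

```python
import numpy as np

# Flatten the final pooled feature maps into a vector, then apply one
# fully connected layer y = W x + b.
pooled = np.random.rand(5, 5, 16)   # e.g. the last pooling layer's output
x = pooled.reshape(-1)              # flatten -> 400 values
W = np.random.rand(120, x.size)     # every unit connects to every input
b = np.random.rand(120)
y = W @ x + b

print(x.shape, y.shape)             # (400,) (120,)
```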

Interleaving between layers
• Convolution, pooling and ReLU layers are interleaved to increase the
power of the network.
• In general, ReLU follows convolution.
• Convolution and ReLU layers are stacked one after the other.
• After two or three Convolution-ReLU combinations, a max-pooling layer
follows, as in the example:

Here 'C' refers to convolution, 'R' to ReLU and 'P' to max-pooling.
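One such block can be traced as a shape walk-through (a C-R-C-R-P pattern; filter and input sizes are illustrative, convolutions use no padding and stride 1):

```python
# Spatial size through one Convolution-ReLU-Convolution-ReLU-Pool block.
def conv(n, f):  # C: an f x f filter without padding
    return n - f + 1

def pool(n):     # P: 2 x 2 max-pooling with stride 2
    return n // 2

n = 32
n = conv(n, 3)   # C -> 30  (R: ReLU leaves the size unchanged)
n = conv(n, 3)   # C -> 28  (R)
n = pool(n)      # P -> 14
print(n)         # 14
```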

LeNet-5

Training of CNN
• A CNN is trained with backpropagation (BP), as in traditional feed-forward
neural networks.
• BP is one of the most mature and widely used algorithms for training
multilayer feed-forward neural networks; it is based on propagating the
error in reverse.
• According to statistics, up to 80% of neural network models apply
backpropagation or one of its variants.

Data augmentation
• Data augmentation:
Using new training examples generated by applying transformations to the
original examples.
• Data augmentation can reduce overfitting in CNNs, especially in the image
processing domain (because it doesn't change the properties of the objects in the image).
• Popular augmentation techniques:
Some basic but powerful augmentation techniques that are popularly used:
✓ Rotation:
Rotating the image by an angle, clockwise or anticlockwise. One key thing to note
about this operation is that the image dimensions may not be preserved after
rotation.

Cont.
✓ Scale:
The image can be scaled outward or inward. When scaling outward,
the final image size will be larger than the original image size.
✓ Crop:
Unlike scaling, we just randomly cut/sample a section from the original
image. We then resize this section to the original image size.
✓ Translation:
Translation just involves moving the image along the X or Y direction
(or both).
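Two of these techniques can be sketched with plain numpy (a cheap stand-in; real pipelines use an image library with interpolation for scaling and cropping, and pad rather than wrap when translating):

```python
import numpy as np

# Rotation and translation as simple array operations on a tiny image.
img = np.arange(9).reshape(3, 3)

rotated = np.rot90(img)                  # 90-degree anticlockwise rotation
shifted = np.roll(img, shift=1, axis=1)  # translate 1 pixel along X
                                         # (np.roll wraps around the border)

print(rotated.shape == img.shape)  # True: 90-degree turns of a square
                                   # image preserve its dimensions
```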

Successful Variants of CNN
✓ AlexNet
✓ ZFNet
✓ VGG
✓ GoogLeNet
✓ ResNet
