Unit 4 Deep Learning Model: Introduction to CNNs
Introduction to CNNs:
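1. What is a CNN?
A convolutional neural network (CNN) is a neural network designed for
grid-like data such as images. CNNs have four main types of layers that
work together to recognize features in images: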
Convolutional Layer: This layer applies filters (small grids of weights)
that slide across the image to detect features, like edges or colors. The
output of each filter is called a feature map. This is how CNNs start to
"see" parts of the image.
Activation Layer: After the convolutional layer, an activation function
(like ReLU, which replaces negative values with zero) is applied. This
adds non-linearity, allowing the CNN to learn more complex patterns.
Pooling Layer: This layer reduces the size of the feature maps, making
the network faster and helping it focus on the most important parts.
Pooling keeps the main details but reduces the amount of data to
process. It often keeps only the largest value in each small section of
the feature map (called max pooling). This helps CNNs focus on the
strongest patterns, like edges or textures, that are important for
recognizing objects.
Fully Connected Layer: At the end, this layer takes everything the
network has learned about the image and uses it to make a final
decision, like identifying an object or classifying an image.
A typical CNN stacks several layers in sequence, from the input layer
through convolutional, activation, pooling, and fully connected layers to
the output layer:
1. Input Layer
This is where the image data is fed into the network.
The input is usually a 3D matrix with dimensions representing the
height, width, and color channels of the image (e.g., for a 32x32
RGB image, the input dimensions would be 32x32x3).
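As a small illustration of the input shape described above, here is a sketch using NumPy. The image is random data standing in for a real 32x32 RGB picture, and scaling pixels to the range 0 to 1 is a common preprocessing choice assumed here, not something the notes prescribe.

```python
import numpy as np

# Hypothetical 32x32 RGB image: random integers stand in for real pixels.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)

# Assumed preprocessing: scale pixel values from [0, 255] to [0, 1].
x = image.astype(np.float32) / 255.0

print(x.shape)  # (32, 32, 3): height, width, color channels
```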
2. Convolutional Layer (Conv Layer)
The first convolutional layer applies filters (small grids of numbers)
that slide across the input image.
Each filter detects certain features, like edges or textures,
producing an output known as a feature map.
The number of filters determines the depth of this layer's output.
For example, if there are 32 filters, the output will be a 32-channel
feature map.
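The sliding-filter idea above can be sketched in plain NumPy. This is a deliberately slow, minimal version that assumes no padding and a stride of 1; the function name conv2d, the 3x3 filter size, and the random weights are illustrative choices. Like most deep learning libraries, it slides the filter without flipping it (strictly speaking, cross-correlation).

```python
import numpy as np

def conv2d(image, filters):
    """Naive convolution with no padding and stride 1.

    image:   array of shape (H, W, C_in)
    filters: array of shape (k, k, C_in, C_out)
    returns: array of shape (H - k + 1, W - k + 1, C_out)
    """
    H, W, _ = image.shape
    k, _, _, c_out = filters.shape
    out = np.zeros((H - k + 1, W - k + 1, c_out), dtype=np.float32)
    for i in range(H - k + 1):
        for j in range(W - k + 1):
            patch = image[i:i + k, j:j + k, :]  # shape (k, k, C_in)
            # Multiply the patch by every filter and sum over k, k, C_in.
            out[i, j, :] = np.tensordot(patch, filters, axes=3)
    return out

rng = np.random.default_rng(0)
image = rng.random((32, 32, 3)).astype(np.float32)              # stand-in input
filters = rng.standard_normal((3, 3, 3, 32)).astype(np.float32)  # 32 filters

feature_maps = conv2d(image, filters)
print(feature_maps.shape)  # (30, 30, 32): one channel per filter
```

With 32 filters the output has 32 channels, matching the example in the text. The spatial size shrinks from 32 to 30 only because this sketch uses no padding.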
3. Activation Layer
After each convolution, an activation function (usually ReLU) is
applied to introduce non-linearity, allowing the network to learn
complex patterns.
The ReLU (Rectified Linear Unit) function replaces all negative
values in the feature map with zero. It is cheap to compute and
generally makes deep networks easier to train.
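ReLU as described above is a one-line operation. A minimal sketch, with a made-up 2x2 feature map:

```python
import numpy as np

def relu(x):
    # Keep positive values, replace negative values with zero.
    return np.maximum(x, 0)

feature_map = np.array([[-2.0, 1.5],
                        [0.0, -0.5]])
activated = relu(feature_map)
print(activated)  # negatives become 0; 1.5 is unchanged
```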
4. Pooling Layer
After some convolutional and activation layers, a pooling layer is
added to downsample the feature maps.
Max pooling is common, where the pooling window keeps the maximum
value in a small area, typically 2x2 or 3x3. This reduces the size of
the feature map and helps the network focus on the most
important features.
Because later layers process fewer values, pooling makes the model
faster and can reduce the chances of overfitting.
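Max pooling over 2x2 regions can be sketched as follows. This version assumes non-overlapping windows (stride equal to the window size) and a height and width divisible by that size; max_pool2d is an illustrative name.

```python
import numpy as np

def max_pool2d(x, size=2):
    """Non-overlapping max pooling on an (H, W, C) array.

    Assumes H and W are divisible by `size`.
    """
    H, W, C = x.shape
    # Split each spatial axis into (number of windows, window size).
    windows = x.reshape(H // size, size, W // size, size, C)
    # Take the maximum inside every window.
    return windows.max(axis=(1, 3))

x = np.arange(16, dtype=np.float32).reshape(4, 4, 1)  # values 0..15
pooled = max_pool2d(x)
print(pooled[:, :, 0])  # the maxima of the four 2x2 blocks: 5, 7, 13, 15
```

A 4x4 map becomes 2x2, so the next layer sees a quarter of the values.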
5. Flattening Layer
After the final pooling layer, the feature maps are flattened into a
1D vector.
This "flattened" vector is used as input for the fully connected
layers, allowing the network to make final predictions.
6. Fully Connected Layer (FC Layer)
The fully connected layer takes the flattened vector and connects
it to a series of neurons (like a standard neural network layer).
This layer combines the learned features to make the final
decision, like recognizing an object in the image.
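The flattening and fully connected steps above amount to a reshape followed by a matrix multiplication. In this sketch the 15x15x32 pooled output, the 10 classes, and the random weights are all assumptions for illustration; in a real network the weights are learned.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical output of the last pooling layer: 15x15 maps, 32 channels.
pooled = rng.standard_normal((15, 15, 32)).astype(np.float32)

# Flattening: turn the 3D feature maps into one 1D vector.
flat = pooled.reshape(-1)
print(flat.shape)  # (7200,) because 15 * 15 * 32 = 7200

# Fully connected layer: every input value connects to every neuron.
num_classes = 10  # assumed number of output neurons
W = rng.standard_normal((flat.size, num_classes)).astype(np.float32) * 0.01
b = np.zeros(num_classes, dtype=np.float32)
logits = flat @ W + b
print(logits.shape)  # (10,): one raw score per class
```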
7. Output Layer
The output layer provides the final classification result.
For multi-class classification, a softmax activation function is
typically used, which converts the outputs into probabilities that
sum to 1 across the classes.
For binary classification, a sigmoid activation is often used to
get a probability score between 0 and 1.
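Both output activations mentioned above are short to write out. The sketch below uses made-up scores; subtracting the maximum inside softmax is a standard trick for numerical stability and does not change the result.

```python
import numpy as np

def softmax(logits):
    # Subtract the max for numerical stability; the output is unchanged.
    shifted = logits - np.max(logits)
    exp = np.exp(shifted)
    return exp / exp.sum()

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Multi-class case: three made-up raw scores, one per class.
probs = softmax(np.array([2.0, 1.0, 0.1]))
print(probs, probs.sum())  # the probabilities sum to 1

# Binary case: a single raw score mapped to a value between 0 and 1.
print(sigmoid(0.0))  # 0.5
```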
Types of Pooling
There are different methods for summarizing information within each region,
and each type of pooling has a specific role in CNNs.
1. Max Pooling
How it Works: For each small region, max pooling selects the largest
value (or most prominent feature).
Purpose: It captures the strongest feature in each area, making the CNN
focus on the most critical parts, like edges and corners.
When to Use: It’s especially helpful in tasks like object recognition,
where we want to capture the main details of an object.
2. Average Pooling
How it Works: This pooling takes the average of all values in a region.
Purpose: Instead of focusing only on the strongest feature, it considers
the overall information in the area, which can keep more context about
the image.
When to Use: Average pooling can be preferable when the overall
response of a region matters more than its single strongest value. It is
sometimes used in tasks like image segmentation (separating objects in
an image) or object detection.
3. Global Pooling
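The pooling types listed above can be compared on a small example. The notes do not define global pooling, so this sketch assumes its common meaning: pooling over the entire height and width of each channel, which leaves one value per channel. All numbers are made up.

```python
import numpy as np

# One 2x2 region of a feature map.
region = np.array([[1.0, 3.0],
                   [2.0, 6.0]])
max_val = region.max()   # max pooling keeps the strongest value: 6.0
avg_val = region.mean()  # average pooling keeps the overall level: 3.0

# Assumed meaning of global pooling: reduce each whole channel to one value.
feature_maps = np.arange(24, dtype=np.float32).reshape(2, 3, 4)  # (H, W, C)
global_avg = feature_maps.mean(axis=(0, 1))  # shape (4,)
global_max = feature_maps.max(axis=(0, 1))   # shape (4,)

print(max_val, avg_val)
print(global_avg, global_max)
```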
CNN Applications:
Convolutional Neural Networks (CNNs) are widely used in various fields due to
their powerful feature extraction capabilities. Here are some of the primary
applications of CNNs:
1. Image Classification
2. Object Detection
3. Image Segmentation
6. Self-Driving Cars