ASSIGNMENT ON
Convolutional Neural Network (CNN)
July 2024
Background of Convolutional Neural Networks (CNNs)
CNNs were first developed and used around the 1980s. At that time, the most a Convolutional Neural Network (CNN) could do was recognize handwritten digits, and it was mostly used in the postal sector to read zip codes, pin codes, and the like. The important thing to remember about any deep learning model is that it requires a large amount of data to train as well as substantial computing resources. This was a major drawback for CNNs at the time, so they remained limited to the postal sector and failed to make a broader impact on machine learning. Backpropagation, the algorithm used to train neural networks, was also computationally expensive at the time.
In 2012, Alex Krizhevsky demonstrated that it was time to bring back the branch of deep learning that uses multi-layered neural networks. The availability of large datasets, most notably ImageNet with its millions of labeled images, and an abundance of computing resources enabled researchers to revive CNNs.
Method of CNN
1. Convolutional Layer:
The convolutional layer is the fundamental building block of a CNN and is where the
majority of computations occur. This layer uses a filter or kernel -- a small matrix of
weights -- to move across the receptive field of an input image to detect the presence
of specific features.
The process begins by sliding the kernel over the image's width and height, eventually
sweeping across the entire image over multiple iterations. At each position, a dot
product is calculated between the kernel's weights and the pixel values of the image
under the kernel. This transforms the input image into a set of feature maps or
convolved features, each of which represents the presence and intensity of a certain
feature at various points in the image.
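The sliding dot product described above can be seen in the minimal sketch below. It uses PyTorch's nn.Conv2d purely for illustration; the choice of library and the filter sizes are assumptions, not part of the assignment.

```python
import torch
import torch.nn as nn

# A single 1-channel (grayscale) 28x28 image, batch size 1.
image = torch.rand(1, 1, 28, 28)

# Eight 3x3 filters; each filter is a small matrix of learnable weights
# that slides over the image and computes a dot product at every position.
conv = nn.Conv2d(in_channels=1, out_channels=8, kernel_size=3, padding=1)

feature_maps = conv(image)
print(feature_maps.shape)  # torch.Size([1, 8, 28, 28]) -> 8 feature maps
```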
Activation Function:
After each convolution, a non-linear activation function, most commonly ReLU (Rectified Linear Unit), is applied to the feature maps. ReLU replaces negative values with zero, allowing the network to model non-linear relationships between features.
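A brief illustration of ReLU on a small feature map (continuing the PyTorch sketch above; the framework choice is an assumption):

```python
import torch

feature_map = torch.tensor([[-1.5, 0.3], [2.0, -0.7]])

# ReLU keeps positive activations and sets negative ones to zero.
activated = torch.relu(feature_map)
print(activated)  # tensor([[0.0000, 0.3000], [2.0000, 0.0000]])
```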
2. Pooling Layer:
The pooling layer of a CNN is a critical component that follows the convolutional
layer. Similar to the convolutional layer, the pooling layer's operations involve a
sweeping process across the input image, but its function is otherwise different.
The pooling layer aims to reduce the dimensionality of the input data while
retaining critical information, thus improving the network's overall efficiency.
This is typically achieved through downsampling: decreasing the number of data
points in the input.
o Reduces the spatial dimensions of the feature maps, retaining the most important features.
o Common methods are Max Pooling and Average Pooling (max pooling is sketched below).
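Downsampling with max pooling can be sketched as follows, again assuming PyTorch; any framework offers an equivalent operation:

```python
import torch
import torch.nn as nn

feature_maps = torch.rand(1, 8, 28, 28)  # output of a convolutional layer

# A 2x2 max pooling window with stride 2 halves the spatial dimensions,
# keeping only the strongest activation in each window.
pool = nn.MaxPool2d(kernel_size=2, stride=2)
pooled = pool(feature_maps)
print(pooled.shape)  # torch.Size([1, 8, 14, 14])
```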
3. Fully Connected Layer:
The fully connected layer plays a critical role in the final stages of a CNN, where it is responsible for classifying images based on the features extracted in the previous layers. The term fully connected means that each neuron in one layer is connected to each neuron in the subsequent layer.
The fully connected layer integrates the various features extracted in the previous
convolutional and pooling layers and maps them to specific classes or outcomes. Each
input from the previous layer connects to each activation unit in the fully connected
layer, enabling the CNN to simultaneously consider all features when making a final
classification decision.
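Putting the three layer types together, a minimal CNN for 10-class classification might look like the sketch below. The layer sizes, input resolution, and class count are illustrative assumptions, not values from this assignment.

```python
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        # Feature extraction: convolution + ReLU + pooling, twice.
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),                       # 28x28 -> 14x14
            nn.Conv2d(8, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),                       # 14x14 -> 7x7
        )
        # Classification: every extracted feature connects to every output unit.
        self.classifier = nn.Linear(16 * 7 * 7, num_classes)

    def forward(self, x):
        x = self.features(x)
        x = x.flatten(1)           # flatten feature maps into one vector
        return self.classifier(x)  # raw class scores (logits)

model = SmallCNN()
logits = model(torch.rand(4, 1, 28, 28))  # batch of 4 grayscale images
print(logits.shape)  # torch.Size([4, 10])
```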
Preprocessing
Preprocessing in the context of Convolutional Neural Networks (CNNs) refers to a series of steps that transform raw input data (typically images) into a format that is more suitable for the model to process. The goal of preprocessing is to enhance the quality of the data, reduce complexity, and improve the performance and accuracy of the CNN.
Resizing:
Ensuring that all images have the same dimensions (e.g., 224x224 pixels) so
that they can be fed into the CNN. This uniformity is crucial because CNNs
require consistent input sizes.
Normalization:
Scaling pixel values to a standard range, typically [0, 1] or [-1, 1]. This step
helps in accelerating the training process and achieving faster convergence by
ensuring that the input values are within a similar scale.
Standardization:
Adjusting the pixel values so that they have a mean of zero and a standard
deviation of one. This is another technique to normalize the input data, making
the optimization process more efficient.
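The resizing, normalization, and standardization steps above are often chained into one pipeline. A minimal sketch using torchvision transforms follows; the library choice and the mean/std values (ImageNet statistics) are assumptions for illustration.

```python
from PIL import Image
from torchvision import transforms

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),      # resizing: every image gets the same dimensions
    transforms.ToTensor(),              # normalization: pixel values scaled to [0, 1]
    transforms.Normalize(               # standardization: roughly zero mean, unit variance
        mean=[0.485, 0.456, 0.406],     # assumed ImageNet statistics
        std=[0.229, 0.224, 0.225]),
])

image = Image.new("RGB", (640, 480))    # stand-in for a real photograph
tensor = preprocess(image)
print(tensor.shape)                     # torch.Size([3, 224, 224])
```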
Grayscale Conversion:
Converting color images to a single grayscale channel when color is not essential for the task, which reduces the amount of data the network has to process.
Data Augmentation:
Though typically considered a separate step from basic preprocessing, data
augmentation can also be seen as part of preprocessing. Techniques such as
rotating, flipping, and scaling images can be applied to artificially expand the
training dataset and make the model more robust to variations.
Noise Reduction:
Removing or reducing noise in images (e.g., using filters to smooth out the
image) to improve the quality of the input data.
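For example, a simple Gaussian blur smooths out high-frequency noise; the sketch below uses Pillow, and both the library and the blur radius are assumptions.

```python
from PIL import Image, ImageFilter

image = Image.new("RGB", (224, 224))    # stand-in for a noisy input image
denoised = image.filter(ImageFilter.GaussianBlur(radius=2))
```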
Color Space Conversion:
Converting images from one color space to another, such as from RGB to HSV or LAB, depending on what may be more suitable for the specific task at hand.
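Grayscale and color-space conversion can each be done in one line with Pillow (an assumed choice of library):

```python
from PIL import Image

image = Image.new("RGB", (224, 224))   # stand-in for an RGB input image
gray = image.convert("L")              # grayscale conversion: one channel instead of three
hsv = image.convert("HSV")             # color space conversion: RGB -> HSV
```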
Preprocessing is a crucial step because it directly impacts the performance of the CNN. Well-preprocessed data can lead to better feature extraction by the network and, ultimately, more accurate predictions and classifications.
Augmentation
In the context of Convolutional Neural Networks (CNNs), augmentation refers to the
process of creating new training examples from the existing dataset by applying
various transformations. The goal of data augmentation is to increase the diversity of
the training data, which helps improve the robustness and generalization capabilities
of the model, reducing the risk of overfitting.
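A common way to apply such transformations on the fly is a torchvision transform pipeline. The sketch below assumes torchvision is used; the specific transforms and parameters are illustrative.

```python
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),         # flipping
    transforms.RandomRotation(degrees=15),          # rotating
    transforms.RandomResizedCrop(size=224,          # scaling / cropping
                                 scale=(0.8, 1.0)),
    transforms.ToTensor(),
])
# Each epoch the training images pass through this pipeline, so the
# network effectively sees a slightly different version of every image.
```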
Splitting
In CNNs, "splitting" refers to the practice of dividing the available dataset into separate subsets for training, validating, and testing the model. This process is crucial for building and evaluating a reliable and robust model.
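A typical split is sketched below using scikit-learn's train_test_split; the library and the 70/15/15 ratios are assumptions for illustration.

```python
from sklearn.model_selection import train_test_split

images = list(range(1000))     # stand-ins for image samples
labels = [i % 10 for i in images]

# First split off the test set, then carve a validation set out of the rest.
train_imgs, test_imgs, train_labels, test_labels = train_test_split(
    images, labels, test_size=0.15, random_state=42)
train_imgs, val_imgs, train_labels, val_labels = train_test_split(
    train_imgs, train_labels, test_size=0.15 / 0.85, random_state=42)

print(len(train_imgs), len(val_imgs), len(test_imgs))  # roughly 70% / 15% / 15%
```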
Splitting the dataset into training, validation, and test sets is crucial for evaluating the performance of a model; the role of each subset is outlined under Training, Validation, and Testing below.
I. Feature Extraction:
o The convolutional and pooling layers act as feature extractors, learning increasingly abstract representations of the input image.
II. Classification:
o The fully connected layers act as the classifier, taking the extracted
features and making the final prediction.
o The output layer typically uses a softmax activation function for multi-
class classification.
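The softmax step can be sketched as follows (continuing the PyTorch assumption): it turns the raw scores from the fully connected layer into class probabilities that sum to one.

```python
import torch

logits = torch.tensor([2.0, 0.5, -1.0])   # raw scores from the fully connected layer
probs = torch.softmax(logits, dim=0)      # probability for each class
print(probs, probs.sum())                 # highest score -> highest probability, sum = 1
```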
Hyperparameters
Hyperparameters are configuration settings used to control the training process and architecture of the model. Examples include the learning rate, batch size, number of training epochs, the number and size of convolutional filters, and the dropout rate.
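A minimal sketch of how such hyperparameters might be collected and used is shown below; the specific names and values are illustrative assumptions, not recommendations from this assignment.

```python
import torch

# Illustrative hyperparameter choices.
hyperparams = {
    "learning_rate": 1e-3,
    "batch_size": 32,
    "epochs": 10,
    "kernel_size": 3,
    "num_filters": 16,
}

model = torch.nn.Conv2d(1, hyperparams["num_filters"],
                        kernel_size=hyperparams["kernel_size"])
optimizer = torch.optim.Adam(model.parameters(), lr=hyperparams["learning_rate"])
```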
Training:
o Used to fit the model's weights; the network learns its parameters from this subset.
Validation:
o Used to evaluate the model during training and tune hyperparameters.
o Helps prevent overfitting by providing an unbiased evaluation of the
model's performance.
Testing:
o Used to assess the final performance of the trained model on completely unseen data after training is complete.
Feature Extraction:
o The process of transforming raw data into a set of features that can be
used for training the model.
o In CNNs, this is done automatically by convolutional and pooling
layers.
o Focuses on identifying the most relevant patterns in the data.
Augmentation:
o Artificially expands the training set by applying transformations such as flipping, rotation, and scaling to existing images.
o Improves robustness and generalization and helps reduce overfitting.
In summary, while feature extraction and augmentation both play crucial roles in
improving the performance of CNNs, they serve different purposes. Feature extraction
is about finding and using the most informative aspects of the data, whereas
augmentation is about artificially expanding the dataset to improve model robustness
and prevent overfitting.
Summary
From this article, I have learned that Convolutional Neural Networks are powerful
tools for analyzing visual data due to their ability to capture spatial hierarchies
through convolutional layers. Effective preprocessing and augmentation are crucial
for improving model performance and generalization. Proper dataset splitting into
training, validation, and test sets is essential for unbiased model evaluation. Feature
extraction focuses on identifying key patterns in data, while augmentation enhances
the dataset diversity. Understanding and tuning hyperparameters are vital for
optimizing model training and achieving better performance.