NNDL Assignment-2 Report
AN ASSIGNMENT-PROJECT REPORT
ON
“Forward Propagation for Convolutional Neural Network”
Submitted in partial fulfilment of the requirements for the award of
the Degree of Bachelor of Technology in Artificial Intelligence and Data Science
Submitted by
REVA University
Rukmini Knowledge Park, Kattigenahalli, Yelahanka, Bengaluru - 560064
www.reva.edu.in
DECLARATION
We, students of B.Tech., VI Semester, School of Computer Science and Engineering, REVA
University, declare that the Assignment-Project Report entitled “Forward Propagation for
Convolutional Neural Network” has been done by us at the School of Computer Science and
Engineering, REVA University.
We are submitting the Assignment-Project Report in partial fulfilment of the requirements for the
award of the degree of Bachelor of Technology in Artificial Intelligence and Data Science by
REVA University, Bengaluru.
Abstract
This report outlines a systematic approach to training and evaluating a Convolutional Neural
Network (CNN) model using TensorFlow and Keras on the MNIST dataset. It covers importing
libraries, defining the CNN model architecture, loading and preprocessing the dataset, and
evaluating the model's accuracy on test data. Key steps include exploratory analysis, visualization
of sample images, data normalization, reshaping for CNN compatibility, and assessing class
distribution for balanced training. This concise description underscores the structured workflow for
developing and assessing CNN models for image classification tasks.
Problem Statement
Develop a CNN-based model that can accurately classify handwritten digits (0-9) from input
images by efficiently learning relevant features as the input data is propagated forward through the
network's layers.
Implement the forward_propagation function below to build the following model:
CONV2D -> RELU -> MAXPOOL -> CONV2D -> RELU -> MAXPOOL -> FLATTEN -> FULLYCONNECTED
Problem Description:
This forward propagation function is the core of the CNN-based handwritten digit recognition
model, where the input image is transformed through a series of convolutional, activation, pooling,
and fully connected layers to produce the final digit classification.
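The layer sequence above can be expressed with the Keras functional API. The following is a minimal sketch; the filter counts, kernel sizes, and pool sizes are illustrative assumptions rather than values fixed by the assignment.

import tensorflow as tf

def forward_propagation(input_img):
    # CONV2D -> RELU: first convolution block (8 filters of size 4x4 assumed)
    Z1 = tf.keras.layers.Conv2D(8, (4, 4), padding='same')(input_img)
    A1 = tf.keras.layers.ReLU()(Z1)
    # MAXPOOL: downsample the feature maps
    P1 = tf.keras.layers.MaxPooling2D(pool_size=(2, 2))(A1)
    # CONV2D -> RELU: second convolution block (16 filters of size 2x2 assumed)
    Z2 = tf.keras.layers.Conv2D(16, (2, 2), padding='same')(P1)
    A2 = tf.keras.layers.ReLU()(Z2)
    # MAXPOOL: downsample again
    P2 = tf.keras.layers.MaxPooling2D(pool_size=(2, 2))(A2)
    # FLATTEN -> FULLYCONNECTED: map the extracted features to the 10 digit classes
    F = tf.keras.layers.Flatten()(P2)
    outputs = tf.keras.layers.Dense(10, activation='softmax')(F)
    return outputs

# Example: wrap the forward pass in a Keras model for 28x28 grayscale inputs
inputs = tf.keras.Input(shape=(28, 28, 1))
model = tf.keras.Model(inputs=inputs, outputs=forward_propagation(inputs))

Wrapping the function in a tf.keras.Model lets the same forward pass be reused for both training and evaluation.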
Possible Applications:
CNNs of this kind are widely used in the following areas: Image Classification, Object Detection,
Image Segmentation, Video Analysis, Natural Language Processing (NLP), Medical Imaging, and
Autonomous Vehicles.
Motivation for the Project
Social Impact:
CNNs are vital in healthcare for medical image analysis and diagnosis, as well as in autonomous
vehicles for object detection, enhancing healthcare outcomes and road safety.
Tech-Innovations:
Mastering forward propagation in CNNs enables the creation of more sophisticated architectures
and optimization techniques, fostering innovation in AI systems.
Significance:
Implementing forward propagation for CNNs provides practical insights into feature extraction,
downsampling, and classification, which are crucial for effective CNN design and training.
Literature Survey
The research paper "Latent Log-Linear Models for Handwritten Digit Classification" uses the
USPS and MNIST datasets for handwritten digit classification, evaluating Gaussian mixture
models (GMMs) and latent log-linear models (LLMMs) [1]. The USPS dataset contains 7,291
training and 2,007 test observations with 16x16 pixel images, while the MNIST dataset has 60,000
training and 10,000 test observations with 28x28 pixel images. The methodology trains and tunes
models on the USPS dataset, investigates settings, transfers tuned settings to the MNIST dataset,
and experiments with deformation-aware log-linear models on the USPS dataset. Limitations
include model complexity, computational requirements, and generalization to other datasets. To
address these issues, a CNN architecture is proposed with improved classification accuracy,
handling of spatial variability, exceptional generalization across datasets, and scalability through
hardware and software optimization techniques.
Another study utilizes the MNIST dataset of 60,000 handwritten digit images for digit
classification using a Convolutional Neural Network (CNN) model. The methodology involves
image processing, CNN modelling with convolution, pooling, and dense layers, and Softmax for
classification. The study's limitations include reliance on a single dataset, limited data diversity,
limited evaluation metrics, and restricted model architecture, validation, explainability, real-world
application, and comparison with other models. Overcoming these limitations involves diversifying
datasets, improving evaluation metrics, exploring different CNN architectures, using multiple
validation sets, improving explainability, expanding real-world applications, and comparing with
other deep learning models.
A third study uses the CIFAR-10 dataset, comprising 50,000 32x32x3 images across 10 object
classes, to train and test various classification algorithms, including kNN, SVM, Softmax, a fully
connected neural network, and a CNN. The CNN architecture, inspired by AlexNet, achieves the
highest accuracy of 85.97%. Limitations identified include the inability of SVM and Softmax to recognize
different viewpoints and the challenge of achieving human-level accuracy above 90%. To
overcome these, the researchers suggest implementing fully connected layers with multiple hidden
layers, advanced deep learning techniques, data augmentation, and model optimization. The study
highlights the broad applications of Deep Learning and the vast potential of CNNs in technological
advancements.
A further methodology uses the USPS and MNIST handwritten digit datasets, applying latent
log-linear models, including Gaussian mixture models (GMMs) and latent log-linear models
(LLMMs), to digit classification. Limitations include limited model complexity,
computational requirements, and generalization challenges. To overcome these, the study proposes
a CNN architecture with convolutional, max-pooling, and fully connected layers. This approach
autonomously learns hierarchical features, handles spatial variability, generalizes across datasets,
and achieves scalability and efficiency through hardware and software optimization techniques.
The CNN implementation demonstrates improved classification accuracy and robustness
compared to the models outlined in the research paper.
Dataset
The dataset used is MNIST, a large database of handwritten digits that is commonly used for
training various image processing systems.
The MNIST database contains 60,000 training images and 10,000 testing images. Half of the
training set and half of the test set were taken from NIST's training dataset, while the other half of
the training set and the other half of the test set were taken from NIST's testing dataset.
Methodology
The methodology trains and evaluates a Convolutional Neural Network (CNN) model using
TensorFlow and Keras on the MNIST dataset. It starts with importing the necessary libraries and
defining a CNNModel class that encapsulates the model architecture and training process. The
MNIST dataset is loaded and preprocessed, then used to train the CNN model. Finally, the trained
model is evaluated on the test data, and its accuracy is reported. This streamlined approach ensures
a systematic workflow for developing and assessing CNN models for image classification tasks.
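The CNNModel class itself is not reproduced in this report, so the sketch below only suggests what such a wrapper might look like; the layer sizes, the optimizer, and the train/evaluate method names are assumptions.

import tensorflow as tf

class CNNModel:
    # Hypothetical wrapper bundling the architecture, training, and evaluation steps
    def __init__(self):
        self.model = tf.keras.Sequential([
            tf.keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
            tf.keras.layers.MaxPooling2D((2, 2)),
            tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
            tf.keras.layers.MaxPooling2D((2, 2)),
            tf.keras.layers.Flatten(),
            tf.keras.layers.Dense(10, activation='softmax'),
        ])
        self.model.compile(optimizer='adam',
                           loss='sparse_categorical_crossentropy',
                           metrics=['accuracy'])

    def train(self, x_train, y_train, epochs=5):
        # Fit on the preprocessed MNIST training images and integer labels
        return self.model.fit(x_train, y_train, epochs=epochs)

    def evaluate(self, x_test, y_test):
        # Return the test loss and accuracy
        return self.model.evaluate(x_test, y_test)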
Exploratory Analysis
1. To start the exploratory analysis, we first load the MNIST dataset using the
tensorflow.keras.datasets.mnist.load_data() function, which provides the training and test sets.
2. Next, we visualize a few sample images from the dataset to get a sense of the data. This is done
by displaying 9 random images from the training set, along with their corresponding labels.
3. During the data preprocessing step, we normalize the pixel values to the range [0, 1] to improve
the model's performance. Additionally, we reshape the input data to fit the Convolutional Neural
Network (CNN) architecture (see the code sketch after this list).
4. To further understand the dataset, we examine the class distribution by counting the number of
samples for each digit (0-9) in both the training and test sets. This helps us assess the balance of
the classes, which is an important consideration for the model's training.
5. The MNIST dataset consists of 28x28 pixel grayscale images of handwritten digits (0-9).
6. The training set has 60,000 images, and the test set has 10,000 images.
7. The pixel values are normalized to the range [0, 1] for better model performance.
8. The input data is reshaped to fit the CNN architecture.
9. The class distribution is relatively balanced, with each digit having a similar number of
samples in both the training and test sets.
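A minimal sketch of steps 1-3 above, assuming matplotlib is available for the visualization; the 3x3 grid layout and random sampling are illustrative choices.

import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf

# Load the MNIST training and test sets
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

# Display 9 random training images with their labels
indices = np.random.choice(len(x_train), 9, replace=False)
for i, idx in enumerate(indices):
    plt.subplot(3, 3, i + 1)
    plt.imshow(x_train[idx], cmap='gray')
    plt.title(str(y_train[idx]))
    plt.axis('off')
plt.show()

# Normalize pixel values to [0, 1] and reshape to (num_samples, 28, 28, 1) for the CNN
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0
x_train = x_train.reshape(-1, 28, 28, 1)
x_test = x_test.reshape(-1, 28, 28, 1)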
Expected Outcome
1. Visualization of Sample Images:
1. The analysis should display 9 random handwritten digit images from the training set,
along with their corresponding labels.
2. This will give you a visual understanding of the data and the challenges involved in
recognizing handwritten digits.
2. Normalization of Pixel Values:
1. The pixel values of the input images should be normalized to the range [0, 1].
2. This is a common preprocessing step that helps improve the model's performance by
ensuring that the input features are on a similar scale.
3. Reshaping of Input Data:
1. The input data should be reshaped to fit the Convolutional Neural Network (CNN)
architecture.
2. For the given problem, the input data should have a shape of (batch_size, 28, 28, 1),
where the last dimension represents the single channel (grayscale) of the images.
4. Class Distribution Analysis (see the counting sketch after this list):
1. The analysis should provide the number of samples for each digit (0-9) in both the
training and test sets.
2. This information is crucial for understanding the balance of the classes, which can
impact the model's performance and help you determine if any data augmentation or
class weighting is necessary.
5. Insights for Model Development:
1. The exploratory analysis should give you a solid understanding of the MNIST dataset,
including the data characteristics, preprocessing requirements, and the class
distribution.
2. This information will be valuable when designing and training the CNN-based
handwritten digit recognition model, as it will help you make informed decisions about
the model architecture, hyperparameters, and training strategies.
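As referenced in item 4 above, the class distribution can be checked with a few lines of NumPy; this is a sketch and only needs the label arrays.

import numpy as np
import tensorflow as tf

(_, y_train), (_, y_test) = tf.keras.datasets.mnist.load_data()

# Count the number of samples per digit (0-9) in the training and test sets
train_digits, train_counts = np.unique(y_train, return_counts=True)
_, test_counts = np.unique(y_test, return_counts=True)

for digit, n_train, n_test in zip(train_digits, train_counts, test_counts):
    print(f"Digit {digit}: {n_train} training samples, {n_test} test samples")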
Output
REFERENCES
[1] https://ppl-ai-file-upload.s3.amazonaws.com/web/directfiles/6018399/664b55c4-3070-4a4a-8482-55ea5f01ea13/digit-classification-using-convolutional-neural-network-IJERTV8IS110324.pdf
[2] Viswanatha, V., Ramachandra, A. C., Nalluri, S. D., Thota, S. M., & Thota, A.
(2023). Handwritten Digit Recognition Using CNN. International Journal of
Innovative Research in Computer and Communication Engineering, 11(1), 8849.
https://doi.org/10.15680/IJIRCCE.2023.11010012
[3] T. Deselaers, T. Gass, G. Heigold, and H. Ney, "Latent Log-Linear Models for
Handwritten Digit Classification," IEEE Transactions on Pattern Analysis and
Machine Intelligence, vol. 34, no. 6, pp. 1105-1117, June 2012.
[4] M. Jogin, D. G. D, Mohana, M. R. K, M. M. S, and A. S, "Feature Extraction
using Convolution Neural Networks (CNN) and Deep Learning," 2018 3rd IEEE
International Conference on Recent Trends in Electronics, Information &
Communication Technology (RTEICT-2018), Bengaluru, India, 2018, pp. 2319-
2319, doi: 10.1109/RTEICT42901.2018.9012350.