ML project

This project report details the development of a Handwritten Digit Recognition system using a Convolutional Neural Network (CNN) to classify digits from the MNIST dataset. The report outlines the system's architecture, data preprocessing techniques, and training methodologies, demonstrating the effectiveness of neural networks in image classification tasks. Additionally, it discusses the limitations of existing systems and the advantages of the proposed approach, aiming to improve accuracy and robustness in digit recognition.

Uploaded by

penmethsajahnavi
Copyright
© All Rights Reserved

A

Project Report on
HANDWRITTEN DIGIT RECOGNITION
Submitted for partial fulfilment of the requirements for the award of the degree of

BACHELOR OF TECHNOLOGY
IN
COMPUTER SCIENCE AND ENGINEERING(AI&ML)

By
P.JAHNAVI – 22K81A6649
Under the Guidance of
Mr. K. NAVEEN CHAKRAVARTHI
Assistant Professor

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING(AI&ML)
St. MARTIN'S ENGINEERING COLLEGE
UGC Autonomous
Affiliated to JNTUH, Approved by AICTE
Accredited by NBA & NAAC A+ , ISO 9001-2008 Certified
Dhulapally, Secunderabad – 500 100
www.smec.ac.in

CERTIFICATE

This is to certify that the project entitled “HANDWRITTEN DIGIT
RECOGNITION” is being submitted by P.Jahnavi (22K81A6649) in
fulfilment of the requirement for the award of the degree of BACHELOR OF
TECHNOLOGY IN COMPUTER SCIENCE AND
ENGINEERING(AI&ML) and is a record of bonafide work carried out by them.
The results embodied in this report have been verified and found satisfactory.

Project Internal Examiner
Mr. K. Naveen Chakravarthi
Assistant Professor
Department of CSE(AI&ML)

Signature of HOD
Dr. B. Venkateswara Rao
Head of Department
Department of CSE(AI&ML)

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING(AI&ML)

DECLARATION

We, the students of “Bachelor of Technology in Department of
COMPUTER SCIENCE AND ENGINEERING(AI&ML)”, session: 2022 -
2026, St. Martin’s Engineering College, Dhulapally, Kompally,
Secunderabad, hereby declare that the work presented in this project work
entitled HANDWRITTEN DIGIT RECOGNITION is the outcome of our
own bonafide work and is correct to the best of our knowledge and this work
has been undertaken taking care of Engineering Ethics. The results embodied in
this project report have not been submitted to any university for award of any
degree.

P.Jahnavi – 22K81A6649
ACKNOWLEDGEMENT

The satisfaction and euphoria that accompany the successful completion of
any task would be incomplete without the mention of the people who made it
possible and whose encouragement and guidance have crowned our efforts with
success. First and foremost, we would like to express our deep sense of
gratitude and indebtedness to our College Management for their kind support
and permission to use the facilities available in the Institute.

We especially would like to express our deep sense of gratitude and
indebtedness to Dr. P. SANTOSH KUMAR PATRA, Group Director, St.
Martin’s Engineering College, Dhulapally, for permitting us to undertake this
project.

We wish to record our profound gratitude to Dr. M. SREENIVAS RAO,
Principal, St. Martin’s Engineering College, for his motivation and
encouragement.

We are also thankful to Dr. B. VENKATESWARA RAO, Head of the
Department, Computer Science and Engineering (AI & ML), St. Martin’s
Engineering College, Dhulapally, Secunderabad, for his support and guidance
throughout our project.

We would like to express our sincere gratitude and indebtedness to our project
supervisor Mr. K. NAVEEN CHAKRAVARTHI, Assistant Professor,
Department of Computer Science and Engineering(AI&ML), St. Martin’s
Engineering College, Dhulapally, for his support and guidance throughout our
project.

Finally, we express thanks to all those who have helped us in successfully
completing this project. Furthermore, we would like to thank our family and
friends for their moral support and encouragement.
P.Jahnavi – 22K81A6649
CONTENTS

CHAPTER 1 - ABSTRACT
CHAPTER 2 - INTRODUCTION
CHAPTER 3 - SYSTEM ANALYSIS
  3.1 Existing System
  3.2 Proposed System
CHAPTER 4 - SYSTEM REQUIREMENTS
  4.1 Hardware Requirements
  4.2 Software Requirements
CHAPTER 5 - ALGORITHM
CHAPTER 6 - SYSTEM IMPLEMENTATION
CHAPTER 7 - SYSTEM TESTING
CHAPTER 8 - OUTPUT SCREENS
CHAPTER 9 - CONCLUSION
CHAPTER 10 - FUTURE ENHANCEMENTS
CHAPTER 11 - REFERENCES

1. ABSTRACT

Handwritten digit recognition is a fundamental problem in image classification
and machine learning. This project develops a system that automatically
recognizes and classifies handwritten digits (0-9) from images. Using the MNIST
dataset, which contains 70,000 28x28 grayscale images of handwritten digits
(60,000 for training and 10,000 for testing), we train a Convolutional Neural
Network (CNN) to perform the classification task. The dataset is preprocessed by
normalizing pixel values to the range [0, 1] and adding a channel dimension to
each image; the labels are kept as integer class indices and used with a sparse
categorical cross-entropy loss. The model consists of two convolutional layers,
each followed by max pooling, one fully connected hidden layer, and an output
layer with a softmax activation function for classification. Evaluated on the
separate test set, the model reaches a test accuracy of about 99%, demonstrating
the effectiveness of neural networks for image recognition tasks. The project also
provides insights into model training, overfitting, and the importance of
validation, offering a solid foundation for further research in optical character
recognition (OCR) and deep learning for image classification.

2. INTRODUCTION

Handwritten digit recognition is a classic problem in the field of machine learning
and computer vision. It involves identifying and classifying digits (0-9) written by
humans, typically in images captured by scanners, cameras, or touchscreen
devices. This task is challenging because of the variability in handwriting
styles, stroke thickness, orientation, and noise in the images. Over the years,
advances in machine learning, particularly deep learning, have enabled systems
to classify handwritten digits with high accuracy, which matters in areas like
optical character recognition (OCR), postal code reading, and bank check
processing.

In this project, we develop a handwritten digit recognition system using a
Convolutional Neural Network (CNN), an architecture specifically designed for
image-related tasks. The approach involves loading the MNIST dataset,
normalizing pixel values, and training the network to classify the digits. This
demonstrates the effectiveness of machine learning in image classification
tasks and serves as a foundational exercise for more complex models and
datasets.

The primary objective of this project is to showcase how machine learning
techniques, particularly neural networks, can be applied to real-world problems,
while also understanding the key concepts of data preprocessing, model training,
evaluation, and performance optimization. The results from this system provide
insights into how deep learning models can effectively address challenges in
handwritten digit recognition.

3. SYSTEM ANALYSIS

3.1 EXISTING SYSTEM

1. Traditional Machine Learning Approaches:
o K-Nearest Neighbors (KNN): KNN is one of the simplest machine
learning algorithms. It classifies a digit by its similarity to the
nearest training examples. Prediction is computationally expensive
because every query is compared against the training set, and the
cost grows with the dataset.
o Support Vector Machines (SVM): SVMs have been used for digit
classification by finding optimal hyperplanes in high-dimensional
space. Although SVMs are effective, they require careful tuning and
can be slow with large datasets.
o Random Forests: An ensemble method that aggregates multiple
decision trees to classify digits. While effective, they don’t capture
spatial relationships in images as well as deep learning methods.

2. Neural Networks:
o Multilayer Perceptrons (MLPs): Early neural networks, such as
MLPs, used backpropagation to learn pixel-based features. However,
they typically trail newer architectures like CNNs on image
classification tasks because they treat each image as a flat vector
and do not exploit its spatial structure.

3. Transfer Learning and Pretrained Models:
o Leveraging pretrained models like VGG16, ResNet, or Inception,
fine-tuned on the MNIST dataset, allows faster training and can yield
higher accuracy, even with limited data.

4. Generative Models:
o Generative Adversarial Networks (GANs): GANs and other
generative models have been used to augment training data, helping
to improve model robustness, especially in cases where the dataset is
small or noisy.
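
To make the KNN baseline from item 1 concrete, the following is a minimal sketch using only NumPy. It is not part of the project code: the function name `knn_predict` and the synthetic two-class data are assumptions made for illustration, standing in for flattened 784-pixel digit images.

```python
import numpy as np

def knn_predict(train_x, train_y, query, k=3):
    """Classify one flattened image by majority vote among its k nearest training images."""
    # Euclidean distance from the query to every training vector.
    distances = np.linalg.norm(train_x - query, axis=1)
    # Indices of the k closest training examples.
    nearest = np.argsort(distances)[:k]
    # Majority vote over the neighbours' labels.
    votes = np.bincount(train_y[nearest])
    return int(np.argmax(votes))

# Synthetic stand-in for flattened digit images: two well-separated clusters.
rng = np.random.default_rng(0)
class0 = rng.normal(loc=0.0, scale=0.1, size=(20, 784))
class1 = rng.normal(loc=1.0, scale=0.1, size=(20, 784))
train_x = np.vstack([class0, class1])
train_y = np.array([0] * 20 + [1] * 20)

query = np.full(784, 0.9)  # lies close to the class-1 cluster
print(knn_predict(train_x, train_y, query))
```

Every prediction measures the distance to all training vectors, which is the source of the computational cost noted above.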

Limitations of Existing Systems:

1. Accuracy Issues with Simple Models
2. Data Requirements and Overfitting
3. Computational Complexity
4. Noise and Variability
5. Scalability
6. Real-Time Processing Constraints

3.2 PROPOSED SYSTEM

The proposed system aims to develop a more efficient and accurate handwritten
digit recognition model using Convolutional Neural Networks (CNNs), which
are particularly effective for image-based tasks. Unlike traditional machine
learning methods, CNNs can automatically learn hierarchical features from raw
pixel data, reducing the need for manual feature extraction and improving model
accuracy.

Key Features of the Proposed System:

1. Model Architecture:
o The system will use a CNN architecture designed for image classification.
The model will consist of multiple convolutional layers followed by
pooling layers and fully connected layers. This structure will allow the
model to learn both low-level features (edges, textures) and high-level
patterns (shapes, structures) specific to handwritten digits.

2. Data Preprocessing:
o The MNIST dataset will be preprocessed by normalizing the pixel values
to a range of [0, 1] to improve training efficiency.
o Data augmentation techniques such as rotation, scaling, and translation will
be applied to artificially increase the diversity of the dataset, helping the
model generalize better and handle variations in handwriting styles.

3. Model Training:
o The model will be trained using backpropagation with a gradient-based
optimizer (e.g., Adam or stochastic gradient descent) and a suitable loss
function (e.g., categorical cross-entropy).
o The training process will involve monitoring both training and validation
accuracy to detect overfitting, and techniques like dropout and early
stopping will be used to mitigate it.

4. Evaluation and Optimization:
o The system's performance will be evaluated on a test set of handwritten
digits to measure accuracy, precision, recall, and F1-score.
o Hyperparameter tuning will be performed to optimize the model’s
performance, such as adjusting the learning rate, number of layers, and
filter sizes.
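
The evaluation metrics named in item 4 can all be derived from a confusion matrix. The sketch below is one possible way to do that with NumPy; the function name `per_class_metrics` and the six toy labels are illustrative assumptions, not results from this project.

```python
import numpy as np

def per_class_metrics(y_true, y_pred, num_classes=10):
    """Return the confusion matrix plus per-class precision, recall and F1."""
    cm = np.zeros((num_classes, num_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1                      # rows: true class, columns: predicted class
    tp = np.diag(cm).astype(float)         # correct predictions per class
    predicted = cm.sum(axis=0)             # how often each class was predicted
    actual = cm.sum(axis=1)                # how often each class really occurs
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    recall = np.divide(tp, actual, out=np.zeros_like(tp), where=actual > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom, out=np.zeros_like(tp), where=denom > 0)
    return cm, precision, recall, f1

# Toy labels for a 3-class problem (hypothetical, for illustration only).
y_true = [0, 0, 1, 1, 1, 2]
y_pred = [0, 1, 1, 1, 2, 2]
cm, precision, recall, f1 = per_class_metrics(y_true, y_pred, num_classes=3)
print(cm)
print(precision, recall, f1)
```

With real predictions, `y_pred` would be the arg-max of the model's softmax outputs on the test set.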

Expected Advantages:

• Improved Accuracy
• Robustness
• Real-Time Processing
• Scalability

4. SYSTEM REQUIREMENTS

4.1 HARDWARE REQUIREMENTS

1. Development and Training Phase (Model Building)

Training deep learning models, especially convolutional neural networks (CNNs),
can require significant computational power because of the complexity of the
models and the amount of data processed. The small CNN used in this project is
light by comparison: the training log in Chapter 8 shows each epoch completing
in about 4 to 5 seconds. The specifications below are therefore recommendations
for comfortable development rather than strict minimums.

a. CPU (Central Processing Unit): High-performance multi-core processors
(e.g., Intel Core i7 or i9, AMD Ryzen 7 or 9).
b. GPU (Graphics Processing Unit): NVIDIA GPUs with CUDA support (e.g.,
NVIDIA RTX 3080, RTX 4090, or Tesla V100).
c. RAM (Random Access Memory): 16GB or more.
d. Storage: SSD (Solid State Drive) with 500GB or more of available space.
e. Network Connectivity: Stable internet connection for downloading datasets,
libraries, and frameworks (e.g., TensorFlow, Keras, PyTorch).

2. Inference/Deployment Phase (Model Prediction)

Once the model is trained, deploying it for real-time inference (digit
classification) requires different hardware considerations, especially when
deploying it to user devices or edge devices.

a. CPU (for Edge or Low-Power Devices): ARM-based processors (e.g.,
Raspberry Pi 4, Qualcomm Snapdragon).
b. GPU (for Real-Time Inference on High-End Devices): NVIDIA Jetson
Xavier, Jetson Nano (for edge devices) or NVIDIA Tegra X2 for mobile
applications.
c. RAM: 4GB or more for mobile devices or edge computing units.
d. Storage: 32GB or more for mobile devices or edge systems.
e. Power Supply: If deployed on mobile or embedded devices, an efficient and
portable power supply (e.g., battery or USB power).

4.2 SOFTWARE REQUIREMENTS

1. Development and Training Phase (Model Building)

For model development and training, several software tools and libraries are
necessary to support machine learning and deep learning tasks.

a. Operating System:

o Windows 10/11 (for general development and software compatibility)
o Ubuntu 20.04 or later (preferred for data science and machine learning due
to better support for deep learning frameworks)
o macOS (if working on a Mac, though GPU support may be limited)

b. Python Programming Language: Python 3.7 or later.

c. Machine Learning Frameworks:

• TensorFlow
• Keras
• PyTorch
• Scikit-learn

d. Libraries for Data Handling and Preprocessing:

• NumPy
• Pandas
• OpenCV
• Matplotlib and Seaborn

e. Data Augmentation and Preprocessing Libraries:

• Albumentations: A fast and flexible library for augmenting image data
(e.g., rotation, scaling, flipping), which is important for improving model
generalization.
• ImageDataGenerator (Keras): A built-in Keras tool for real-time image
augmentation during training. It is deprecated in recent TensorFlow releases
in favor of Keras preprocessing layers.
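
As an illustration of the kind of transformation these libraries apply, the sketch below implements a single augmentation, translation, in plain NumPy. The helper `shift_image` is hypothetical and is not an Albumentations or Keras function; it only shows the idea of producing a shifted copy of a digit image.

```python
import numpy as np

def shift_image(image, dy, dx):
    """Translate a 2-D image by (dy, dx) pixels, filling uncovered pixels with zeros."""
    h, w = image.shape
    if abs(dy) >= h or abs(dx) >= w:
        raise ValueError("shift must be smaller than the image size")
    shifted = np.zeros_like(image)
    # Source and destination windows that stay inside the image bounds.
    src_rows = slice(max(0, -dy), min(h, h - dy))
    dst_rows = slice(max(0, dy), min(h, h + dy))
    src_cols = slice(max(0, -dx), min(w, w - dx))
    dst_cols = slice(max(0, dx), min(w, w + dx))
    shifted[dst_rows, dst_cols] = image[src_rows, src_cols]
    return shifted

# A blank 28x28 image with one bright pixel, moved 2 rows down and 3 columns left.
image = np.zeros((28, 28))
image[10, 10] = 1.0
moved = shift_image(image, dy=2, dx=-3)
print(np.argwhere(moved == 1.0))
```

Applying small random shifts like this to training images (labels unchanged) is one simple way to expose the model to digits that are not perfectly centered.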

2. Inference/Deployment Phase (Real-Time Prediction)

Once the model is trained, software tools are needed for deploying and running
predictions, whether in a cloud environment or on local edge devices.

a. Frameworks for Model Deployment:

• TensorFlow Lite: A lightweight version of TensorFlow designed for
mobile and embedded devices. It’s optimized for running models on
resource-constrained devices.
• ONNX (Open Neural Network Exchange): An open format that allows
for the interchange of models between different deep learning frameworks
(e.g., PyTorch to TensorFlow). It supports running models on various
platforms.
• CoreML (for macOS/iOS): Apple's framework for running machine
learning models on iOS and macOS devices. Useful if deploying the model
on Apple hardware.
• OpenVINO: An Intel toolkit for optimizing deep learning models to run
efficiently on Intel hardware, including CPUs, GPUs, and VPUs.

b. Mobile or Edge Device Deployment (for embedded systems):

• Android Studio (for Android deployment)
• Xcode (for iOS deployment)
• Raspberry Pi OS (for Edge Devices)

3. Version Control and Collaboration Tools:

• Git: Version control system for tracking code changes, collaborating with
teams, and managing different versions of the model and datasets.
• GitHub/GitLab/Bitbucket: Platforms for hosting and sharing code
repositories, enabling collaboration and version control.

5. ALGORITHM

1. Collect and Prepare Data:

• Load the dataset (e.g., MNIST) containing images of handwritten digits (0-9).
• Preprocess the data:
o Normalize the pixel values (scale images from [0, 255] to [0, 1]).
o Reshape images to the format required for the model (e.g., 28x28x1 for a
CNN that expects a single grayscale channel).

2. Split the Data:

• Divide the dataset into two parts:
o Training set: Used to train the model.
o Test set: Used to evaluate the model after training.
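
MNIST as loaded through Keras already arrives divided into training and test sets. To monitor overfitting without touching the test set, a validation subset can additionally be held out of the training data. The sketch below shows one way to do this with NumPy; the function name `train_val_split`, the 10% fraction, and the toy arrays are assumptions for illustration.

```python
import numpy as np

def train_val_split(x, y, val_fraction=0.1, seed=0):
    """Shuffle the sample indices and hold out a fraction of them for validation."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(x))
    n_val = int(len(x) * val_fraction)
    val_idx, train_idx = order[:n_val], order[n_val:]
    return (x[train_idx], y[train_idx]), (x[val_idx], y[val_idx])

# Toy stand-in data: 100 samples with labels 0-9.
x = np.arange(100).reshape(100, 1)
y = np.arange(100) % 10
(train_x, train_y), (val_x, val_y) = train_val_split(x, y)
print(len(train_x), len(val_x))
```

Keras offers a comparable shortcut through the `validation_split` argument of `model.fit`, which holds out the last fraction of the training arrays.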

3. Build the Model:

• Choose a model architecture (e.g., neural network, CNN).
o For CNN: Add layers like:
  ▪ Convolutional layers (for feature extraction).
  ▪ Pooling layers (to reduce dimensionality).
  ▪ Fully connected layers (for final classification).
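
To show what the convolutional and pooling layers in step 3 compute, here is a minimal NumPy sketch of an unpadded convolution followed by 2x2 max pooling. The hand-written vertical-edge kernel and the half-dark, half-bright test image are assumptions chosen for illustration; in a trained CNN the kernel values are learned.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide a kernel over a 2-D image without padding (cross-correlation)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool_2x2(feature_map):
    """Keep the maximum of each non-overlapping 2x2 block (odd trailing rows/columns are dropped)."""
    h = feature_map.shape[0] // 2 * 2
    w = feature_map.shape[1] // 2 * 2
    blocks = feature_map[:h, :w].reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

image = np.zeros((28, 28))
image[:, 14:] = 1.0                                   # dark left half, bright right half
vertical_edge = np.array([[-1.0, 0.0, 1.0]] * 3)      # responds to dark-to-bright transitions

features = np.maximum(conv2d_valid(image, vertical_edge), 0.0)  # ReLU activation
pooled = max_pool_2x2(features)
print(features.shape, pooled.shape)
```

The feature map is non-zero only around the column where the image changes from dark to bright, and pooling halves the resolution while keeping that response.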

4. Compile the Model:

• Set the optimizer: Choose an algorithm like Adam or SGD.
• Set the loss function: For multi-class classification, use categorical
cross-entropy (or sparse categorical cross-entropy when the labels are
integer class indices rather than one-hot vectors).
• Choose metrics: Accuracy is a common choice to track model performance.
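
The relationship between the softmax output and the cross-entropy loss in step 4 can be checked numerically. The sketch below is a NumPy illustration, not the Keras implementation: the function names and the two example logit rows are assumptions. It shows that the sparse form (integer labels) and the one-hot form give the same loss value.

```python
import numpy as np

def softmax(logits):
    """Convert each row of raw scores into probabilities that sum to 1."""
    shifted = logits - logits.max(axis=1, keepdims=True)   # for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def sparse_categorical_crossentropy(probs, labels):
    """Mean negative log-probability of the true class; labels are integers."""
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(picked)))

def categorical_crossentropy(probs, one_hot):
    """Same loss, but with the labels given as one-hot vectors."""
    return float(-np.mean(np.sum(one_hot * np.log(probs), axis=1)))

logits = np.array([[2.0, 1.0, 0.1],
                   [0.5, 2.5, 0.3]])
labels = np.array([0, 1])
probs = softmax(logits)
one_hot = np.eye(3)[labels]

print(sparse_categorical_crossentropy(probs, labels))
print(categorical_crossentropy(probs, one_hot))
```

Which variant to choose depends only on how the labels are stored; the quantity being minimized is the same.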

5. Train the Model:

• Fit the model on the training data.
• Define epochs: Number of times the entire dataset is passed through the
model.
• Define batch size: Number of samples processed before the model is updated.
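
Epochs and batch size together determine how many weight updates the model receives. The short calculation below uses the settings from Chapter 6 (60,000 MNIST training images, batch size 64, 5 epochs); the helper name `steps_per_epoch` is an assumption for illustration.

```python
import math

def steps_per_epoch(num_samples, batch_size):
    """Number of weight updates in one pass over the data (the last batch may be smaller)."""
    return math.ceil(num_samples / batch_size)

steps = steps_per_epoch(60_000, 64)
total_updates = 5 * steps
print(steps)          # 938
print(total_updates)  # 4690
```

The value 938 matches the step count shown in each epoch of the training log in Chapter 8.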

6. Evaluate the Model:

• Test the model on the test set.
• Check the accuracy: How well does the model perform on unseen data?

7. Make Predictions:

• Input a new image of a handwritten digit into the model.
• Output the predicted digit (the class with the highest probability).
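
Step 7 can be sketched as a small helper that prepares one image exactly as the training data was prepared and then takes the arg-max of the predicted probabilities. The function `predict_digit` and the `StubModel` class are hypothetical: the stub stands in for a trained Keras model so that the sketch runs on its own, and it always favours class 7.

```python
import numpy as np

def predict_digit(model, image):
    """image: 2-D uint8 array (28x28). Returns (predicted digit, its probability)."""
    # Scale to [0, 1] and add batch and channel axes: (28, 28) -> (1, 28, 28, 1).
    batch = image.astype("float32")[np.newaxis, ..., np.newaxis] / 255.0
    probs = np.asarray(model.predict(batch))[0]
    digit = int(np.argmax(probs))
    return digit, float(probs[digit])

class StubModel:
    """Hypothetical stand-in for a trained model; returns fixed probabilities."""
    def predict(self, batch):
        probs = np.full((len(batch), 10), 0.01)
        probs[:, 7] = 0.91
        return probs

blank = np.zeros((28, 28), dtype=np.uint8)
digit, confidence = predict_digit(StubModel(), blank)
print(digit, confidence)
```

With the real network, `StubModel()` would be replaced by the trained `model` object, whose `predict` method accepts the same batch shape.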

8. Fine-tune (optional):

• Adjust hyperparameters like learning rate, number of layers, or filters.
• Retrain the model if necessary to improve performance.

6. SYSTEM IMPLEMENTATION

SOURCE CODE

# Import necessary libraries
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.datasets import mnist
import numpy as np
import matplotlib.pyplot as plt

# Step 1: Load the MNIST dataset (already split into training and test sets)
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Step 2: Preprocess the data
# Normalize the image pixel values to be between 0 and 1
x_train, x_test = x_train / 255.0, x_test / 255.0

# Add the channel dimension: (N, 28, 28) -> (N, 28, 28, 1)
x_train = np.expand_dims(x_train, axis=-1)
x_test = np.expand_dims(x_test, axis=-1)

# Step 3: Build the Convolutional Neural Network (CNN)
model = models.Sequential()

# Add a convolutional layer (32 filters, 3x3 kernel)
model.add(layers.Conv2D(32, (3, 3), activation='relu',
                        input_shape=(28, 28, 1)))

# Add a max-pooling layer
model.add(layers.MaxPooling2D((2, 2)))

# Add another convolutional layer (64 filters, 3x3 kernel)
model.add(layers.Conv2D(64, (3, 3), activation='relu'))

# Add another max-pooling layer
model.add(layers.MaxPooling2D((2, 2)))

# Flatten the 2D feature maps into a 1D vector
model.add(layers.Flatten())

# Add a fully connected (dense) layer
model.add(layers.Dense(64, activation='relu'))

# Add the output layer with 10 units (one for each digit) and softmax activation
model.add(layers.Dense(10, activation='softmax'))

# Step 4: Compile the model
# The labels are integer class indices, so the sparse variant of the loss is used
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Step 5: Train the model
model.fit(x_train, y_train, epochs=5, batch_size=64)

# Step 6: Evaluate the model on the held-out test set
test_loss, test_acc = model.evaluate(x_test, y_test)
print(f'Test accuracy: {test_acc}')

# Step 7: Make a prediction on a test image
predictions = model.predict(x_test)
predicted_label = np.argmax(predictions[0])
print(f'Predicted label for the first test image: {predicted_label}')

# Step 8: Plot the first test image with its predicted and true labels
plt.imshow(x_test[0].reshape(28, 28), cmap='gray')
plt.title(f"Predicted: {predicted_label}, True: {y_test[0]}")
plt.show()
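
The layer sizes in the source code above can be followed with simple arithmetic. The sketch below traces the feature-map size through the network and counts the trainable parameters, assuming the Keras defaults that the code relies on (unpadded convolutions with stride 1, and 2x2 pooling that rounds down). The helper names `conv_out` and `pool_out` are illustrative.

```python
def conv_out(size, kernel=3):
    """Output width/height of an unpadded ('valid') convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output width/height of non-overlapping pooling (rounds down)."""
    return size // pool

size = 28
size = conv_out(size)     # 26 after the first 3x3 convolution
size = pool_out(size)     # 13 after the first 2x2 max pooling
size = conv_out(size)     # 11 after the second 3x3 convolution
size = pool_out(size)     # 5 after the second 2x2 max pooling
flat = size * size * 64   # 1600 values enter the dense layer

# Weights plus biases for each trainable layer.
params = {
    "conv1": 3 * 3 * 1 * 32 + 32,
    "conv2": 3 * 3 * 32 * 64 + 64,
    "dense": flat * 64 + 64,
    "output": 64 * 10 + 10,
}
total = sum(params.values())
print(size, flat, params, total)
```

Most of the parameters sit in the first dense layer, which is typical for small CNNs of this shape; calling `model.summary()` on the model prints the same per-layer breakdown.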
7. SYSTEM TESTING

1. Functional Testing:

• Test Case 1: Verify if the system correctly loads and preprocesses the
MNIST dataset (images and labels).
• Test Case 2: Ensure the model can be trained without errors (check for
correct implementation of layers, loss function, and optimizer).
• Test Case 3: Validate that the model produces predictions after training,
and the predictions match expected digit outputs.
• Test Case 4: Test if the system performs real-time digit recognition when
fed with new, unseen images.

2. Performance Testing:

• Test Case 1: Evaluate the training time for the model over a specified
number of epochs and check for acceptable training duration.
• Test Case 2: Assess the inference time for making predictions on a test
image to ensure it meets the real-time requirements (if applicable).

3. Accuracy Testing:

• Test Case 1: Measure the model's accuracy on the test set using metrics
such as accuracy, precision, recall, and F1-score.
• Test Case 2: Perform cross-validation to ensure consistent model
performance across different subsets of the data.

4. Stress Testing:

• Test Case 1: Test how the system handles large datasets or edge cases, such
as corrupted images or unexpected input formats.

5. Usability Testing (for deployed systems):

• Test Case 1: Ensure that the user interface (if any) is intuitive and allows
easy interaction for real-time digit recognition.
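
The stress-testing case on corrupted images and unexpected input formats could be supported by an input check placed in front of the model. The sketch below is one possible form of such a check; `validate_image`, `is_rejected`, and the specific rejection rules are assumptions, not part of the project code.

```python
import numpy as np

def validate_image(image):
    """Return the image scaled to [0, 1], or raise ValueError if it is unusable."""
    arr = np.asarray(image)
    if arr.shape != (28, 28):
        raise ValueError(f"expected shape (28, 28), got {arr.shape}")
    if not np.issubdtype(arr.dtype, np.number):
        raise ValueError(f"expected numeric pixels, got dtype {arr.dtype}")
    if not np.all(np.isfinite(arr)):
        raise ValueError("image contains NaN or infinite values")
    if arr.min() < 0 or arr.max() > 255:
        raise ValueError("pixel values must lie in [0, 255]")
    return arr.astype("float32") / 255.0

def is_rejected(image):
    """Helper for tests: True if validate_image refuses the input."""
    try:
        validate_image(image)
    except ValueError:
        return True
    return False

good = np.full((28, 28), 128, dtype=np.uint8)
wrong_shape = np.zeros((32, 32), dtype=np.uint8)
corrupted = np.full((28, 28), np.nan)

print(is_rejected(good), is_rejected(wrong_shape), is_rejected(corrupted))
```

Each rejected case corresponds to one stress-test input, so the same helper can drive automated test cases.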
8. OUTPUT SCREENS

Epoch 1/5
938/938 [==============================] - 5s 4ms/step - loss: 0.2130 - accuracy: 0.9372
Epoch 2/5
938/938 [==============================] - 4s 5ms/step - loss: 0.0536 - accuracy: 0.9834
Epoch 3/5
938/938 [==============================] - 4s 5ms/step - loss: 0.0395 - accuracy: 0.9875
Epoch 4/5
938/938 [==============================] - 4s 5ms/step - loss: 0.0297 - accuracy: 0.9901
Epoch 5/5
938/938 [==============================] - 4s 5ms/step - loss: 0.0241 - accuracy: 0.9920

Test accuracy: 0.9905
Predicted label for the first test image: 7

[Figure: the first test image, with plot title "Predicted: 7, True: 7"]

9. CONCLUSION

The convolutional neural network (CNN) model was trained successfully on the
MNIST dataset and reached a test accuracy of 99.05% after five epochs. This
performance demonstrates the effectiveness of CNNs for image recognition tasks,
particularly for recognizing handwritten digits. The model's architecture, which
includes convolutional layers for feature extraction, max-pooling layers for
downsampling, and fully connected layers for classification, enabled it to
efficiently learn the patterns and variations in handwritten digits.

With a test accuracy above 99%, the model accurately predicts the digits in
MNIST images it did not see during training. The final training accuracy
(99.20%) is close to the test accuracy, which indicates little overfitting. This
highlights the ability of deep learning models, especially CNNs, to generalize
on image recognition tasks.

This approach can be extended to more complex datasets and other image
classification tasks, showcasing the power and versatility of CNNs in computer
vision. Techniques that the current model does not use, such as dropout and
data augmentation, could improve the performance further.


10. FUTURE ENHANCEMENTS

1. Model Architecture Improvements:
o Deeper Networks: Introducing more layers or more complex architectures
could improve performance.
o Transfer Learning: Using pre-trained models (e.g., models trained on
larger datasets like ImageNet) and fine-tuning them on the MNIST dataset
may improve accuracy, especially when only a small amount of training data
is available.
2. Hyperparameter Tuning:
o Batch Size: Experimenting with different batch sizes could impact training
time and model performance.
o Number of Epochs: Increasing the number of epochs, along with early
stopping techniques, could enhance model performance by ensuring it learns
fully without overfitting.
3. Advanced Preprocessing:
o Implementing more sophisticated image preprocessing methods (e.g.,
histogram equalization, edge detection) may help the model focus on
important features, potentially improving accuracy for more complex or
noisy data.
4. Integration with Real-World Applications:
o Deployment: The model can be integrated into real-world applications like
document scanning, postal code recognition, and OCR systems.
o Interactive Interfaces: Building user-friendly interfaces for handwritten
digit recognition systems, such as apps that allow users to draw digits and
get immediate predictions, would enhance accessibility and usefulness.
5. Model Interpretability:
o Exploring techniques like Grad-CAM or SHAP for visualizing which parts
of the image the model is focusing on can help in understanding and
improving model decisions, especially in critical applications where
interpretability is key.
6. Transfer to More Complex Datasets:
o Expanding the scope to more complex datasets, such as the SVHN (Street
View House Numbers) dataset, which includes numbers in more varied and
challenging conditions, or CIFAR-10, which includes a wider variety of
images, could offer the opportunity to apply and refine the model for
real-world scenarios.
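
The early-stopping idea mentioned under hyperparameter tuning can be sketched in a few lines of plain Python. The function `early_stopping_epoch` and the list of validation losses are hypothetical and only illustrate the stopping rule: training halts once the validation loss has failed to improve for a set number of consecutive epochs (the patience).

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the 1-based epoch at which training would stop, or None if it never stops."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None

# Hypothetical validation losses: improvement stops after epoch 3.
losses = [0.30, 0.12, 0.09, 0.10, 0.11, 0.13]
stop_epoch = early_stopping_epoch(losses)
print(stop_epoch)  # 5
```

In Keras the same behaviour is available through the `tf.keras.callbacks.EarlyStopping` callback, which monitors a chosen metric and can restore the best weights when training stops.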

11. REFERENCES

https://www.kaggle.com/code/arunrk7/digit-recognition-using-cnn-99-accuracy

https://www.geeksforgeeks.org/python-classifying-handwritten-digits-with-tensorflow/

https://github.com/arpita739/MNIST-Handwritten-Digit-Recognition-using-CNN

https://www.geeksforgeeks.org/handwritten-digit-recognition-using-neural-network/

https://machinelearningmastery.com/how-to-develop-a-convolutional-neural-network-from-scratch-for-mnist-handwritten-digit-classification/
