ML project
Project Report on
HANDWRITTEN DIGIT RECOGNITION
Submitted for partial fulfilment of the requirements for the award of the degree of
BACHELOR OF TECHNOLOGY
IN
COMPUTER SCIENCE AND ENGINEERING (AI&ML)
By
P.JAHNAVI – 22K81A6649
Under the Guidance of
Mr. K. NAVEEN CHAKRAVARTHI
Assistant Professor
CERTIFICATE
DECLARATION
P.Jahnavi – 22K81A6649
ACKNOWLEDGEMENT
We would like to express our sincere gratitude and indebtedness to our project
supervisor, Mr. K. NAVEEN CHAKRAVARTHI, Assistant Professor,
Department of Computer Science and Engineering (AI&ML), St. Martins
Engineering College, Dhulapally, for his support and guidance throughout our
project.
CHAPTER 1 - ABSTRACT
CHAPTER 2 - INTRODUCTION
CHAPTER 5 - ALGORITHM
CHAPTER 9 - CONCLUSION
CHAPTER 11 - REFERENCES
1. ABSTRACT
2. INTRODUCTION
3. SYSTEM ANALYSIS
3.1 EXISTING SYSTEM
2. Neural Networks:
o Multilayer Perceptrons (MLPs): Early neural networks, such as
MLPs, used backpropagation to learn pixel-based features. However,
they generally underperform newer architectures like CNNs on image
classification tasks because they treat the image as a flat vector of
pixels and do not exploit its spatial structure.
4. Generative Models:
o Generative Adversarial Networks (GANs): GANs and other
generative models have been used to augment training data, helping
to improve model robustness, especially in cases where the dataset is
small or noisy.
3.2 PROPOSED SYSTEM
The proposed system aims to develop a more efficient and accurate handwritten
digit recognition model using Convolutional Neural Networks (CNNs), which
are particularly effective for image-based tasks. Unlike traditional machine
learning methods, CNNs can automatically learn hierarchical features from raw
pixel data, reducing the need for manual feature extraction and improving model
accuracy.
1. Model Architecture:
o The system will use a CNN architecture designed for image classification.
The model will consist of multiple convolutional layers followed by
pooling layers and fully connected layers. This structure will allow the
model to learn both low-level features (edges, textures) and high-level
patterns (shapes, structures) specific to handwritten digits.
2. Data Preprocessing:
o The MNIST dataset will be preprocessed by normalizing the pixel values
to a range of [0, 1] to improve training efficiency.
o Data augmentation techniques such as rotation, scaling, and translation will
be applied to artificially increase the diversity of the dataset, helping the
model generalize better and handle variations in handwriting styles.
3. Model Training:
o The model will be trained using backpropagation with a gradient-based
optimizer (e.g., stochastic gradient descent (SGD) or Adam) and a
suitable loss function (e.g., categorical cross-entropy).
o The training process will involve monitoring both training and validation
accuracy to detect overfitting, and techniques like dropout and early
stopping will be used to mitigate it.
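To make the convolution and pooling stages of the model architecture more concrete, the following is a minimal NumPy sketch of one convolutional layer followed by ReLU and 2x2 max pooling. It is an illustration only, not the project's Keras model: the 3x3 vertical-edge kernel is hand-picked (a trained CNN would learn its own kernels), the input image is synthetic, and the function names `conv2d_valid` and `max_pool2d` are hypothetical.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over a 2-D `image` with stride 1 and no padding.

    This is cross-correlation, which is what deep learning libraries
    usually call "convolution".
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    h = feature_map.shape[0] // size
    w = feature_map.shape[1] // size
    trimmed = feature_map[:h * size, :w * size]
    return trimmed.reshape(h, size, w, size).max(axis=(1, 3))

# Synthetic 28x28 input (assumption): dark left half, bright right half.
image = np.zeros((28, 28))
image[:, 14:] = 1.0

# Hand-picked kernel that responds to dark-to-bright vertical edges.
kernel = np.array([[-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0]])

features = np.maximum(conv2d_valid(image, kernel), 0.0)  # ReLU activation
pooled = max_pool2d(features)

print(features.shape)  # (26, 26)
print(pooled.shape)    # (13, 13)
```

The feature map is non-zero only where the kernel overlaps the edge, which is the kind of low-level feature the first layers are expected to pick up; pooling then halves the spatial resolution while keeping the strongest response in each window.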
Expected Advantages:
Improved Accuracy
Robustness
Real-Time Processing
Scalability
4. SYSTEM REQUIREMENTS
4.1 HARDWARE REQUIREMENTS
4.2 SOFTWARE REQUIREMENTS
For model development and training, several software tools and libraries are
necessary to support machine learning and deep learning tasks.
a. Operating System:
TensorFlow
Keras
PyTorch
Scikit-learn
NumPy
Pandas
OpenCV
Matplotlib and Seaborn
Albumentations: A fast and flexible library for augmenting image data
(e.g., rotation, scaling, flipping), which is important for improving model
generalization.
ImageDataGenerator (Keras): A built-in Keras tool for real-time image
augmentation during training.
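As a rough illustration of what such augmentation does, the sketch below applies a random translation to a single image using NumPy only. It does not use the Albumentations or Keras APIs; the helper name `random_shift` and the `max_shift` parameter are hypothetical, and the input is a synthetic square rather than a real digit.

```python
import numpy as np

def random_shift(image, max_shift, rng):
    """Translate a 2-D image by up to `max_shift` pixels per axis, zero-filling the gap."""
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    h, w = image.shape
    shifted = np.zeros_like(image)
    # Matching source and destination windows that stay inside the image bounds.
    src_y = slice(max(0, -dy), min(h, h - dy))
    src_x = slice(max(0, -dx), min(w, w - dx))
    dst_y = slice(max(0, dy), min(h, h + dy))
    dst_x = slice(max(0, dx), min(w, w + dx))
    shifted[dst_y, dst_x] = image[src_y, src_x]
    return shifted

rng = np.random.default_rng(0)

# Synthetic 28x28 "digit" (assumption): a bright square away from the border.
image = np.zeros((28, 28), dtype=np.float32)
image[10:18, 10:18] = 1.0

augmented = random_shift(image, max_shift=2, rng=rng)
print(augmented.shape)  # (28, 28)
```

Because the square sits well inside the frame, a shift of at most two pixels moves it without cutting any of it off, so the label of the image is unchanged while its pixel layout differs.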
Once the model is trained, software tools are needed for deploying and running
predictions, whether in a cloud environment or on local edge devices.
Git: Version control system for tracking code changes, collaborating with
teams, and managing different versions of the model and datasets.
GitHub/GitLab/Bitbucket: Platforms for hosting and sharing code
repositories, enabling collaboration and version control.
5. ALGORITHM
Load the dataset (e.g., MNIST) containing images of handwritten digits (0-9).
Preprocess the data:
o Normalize the pixel values (scale images from [0, 255] to [0, 1]).
o Reshape images to the format required for the model (e.g., 28x28
grayscale images).
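The normalization and reshaping steps above can be sketched as follows. To keep the example self-contained, random arrays with MNIST's shape and dtype stand in for the real dataset; the function name `preprocess` is hypothetical, and the trailing channel axis assumes the channels-last layout that Keras convolutional layers use by default.

```python
import numpy as np

def preprocess(images):
    """Scale uint8 pixel values from [0, 255] to [0, 1] and add a channel axis."""
    x = images.astype("float32") / 255.0
    return x.reshape(-1, 28, 28, 1)  # (samples, height, width, channels)

# Stand-in for a few MNIST images (assumption): 28x28 uint8 arrays.
rng = np.random.default_rng(0)
raw_images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

x = preprocess(raw_images)
print(x.shape)  # (5, 28, 28, 1)
print(x.dtype)  # float32
```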
5. Train the Model:
7. Make Predictions:
8. Fine-tune (optional):
6. SYSTEM IMPLEMENTATION
SOURCE CODE
# Add the output layer with 10 units (one for each digit) and softmax activation
model.add(layers.Dense(10, activation='softmax'))
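To illustrate what the softmax output layer computes, the sketch below implements softmax and categorical cross-entropy in NumPy. It is an illustration of the underlying formulas, not the Keras implementation; the logit values are made up, and the function names are hypothetical.

```python
import numpy as np

def softmax(logits):
    """Convert raw scores to probabilities that sum to 1 along the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def categorical_cross_entropy(probs, one_hot, eps=1e-12):
    """Mean negative log-probability assigned to the true class."""
    return float(-np.mean(np.sum(one_hot * np.log(probs + eps), axis=-1)))

# Made-up scores for one image (assumption): the unit for digit 7 is the largest.
logits = np.zeros((1, 10))
logits[0, 7] = 5.0

probs = softmax(logits)
predicted_digit = int(np.argmax(probs, axis=-1)[0])
print(predicted_digit)  # 7

# Loss when the true label is 7, using a one-hot target vector.
target = np.eye(10)[[7]]
loss = categorical_cross_entropy(probs, target)
```

The predicted digit is the index of the largest probability, and the loss shrinks toward zero as the probability assigned to the true digit approaches 1.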
1. Functional Testing:
Test Case 1: Verify if the system correctly loads and preprocesses the
MNIST dataset (images and labels).
Test Case 2: Ensure the model can be trained without errors (check for
correct implementation of layers, loss function, and optimizer).
Test Case 3: Validate that the model produces predictions after training,
and the predictions match expected digit outputs.
Test Case 4: Test if the system performs real-time digit recognition when
fed with new, unseen images.
2. Performance Testing:
Test Case 1: Evaluate the training time for the model over a specified
number of epochs and check for acceptable training duration.
Test Case 2: Assess the inference time for making predictions on a test
image to ensure it meets the real-time requirements (if applicable).
3. Accuracy Testing:
Test Case 1: Measure the model's accuracy on the test set using metrics
such as accuracy, precision, recall, and F1-score.
Test Case 2: Perform cross-validation to ensure consistent model
performance across different subsets of the data.
4. Stress Testing:
Test Case 1: Test how the system handles large datasets or edge cases, such
as corrupted images or unexpected input formats.
Test Case 2: Ensure that the user interface (if any) is intuitive and allows
easy interaction for real-time digit recognition.
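The metrics named under Accuracy Testing (accuracy, precision, recall, F1-score) can all be derived from a confusion matrix. The following is a minimal NumPy sketch under stated assumptions: the labels are a small made-up three-class example rather than real model output, and the helper names are hypothetical. Scikit-learn, listed in the software requirements, provides equivalent ready-made functions.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """cm[i, j] counts samples whose true class is i and predicted class is j."""
    cm = np.zeros((num_classes, num_classes), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)
    return cm

def per_class_metrics(cm):
    """Return per-class precision, recall, and F1 (0 where a ratio is undefined)."""
    tp = np.diag(cm).astype(float)
    predicted = cm.sum(axis=0)  # how often each class was predicted
    actual = cm.sum(axis=1)     # how often each class actually occurred
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    recall = np.divide(tp, actual, out=np.zeros_like(tp), where=actual > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom, out=np.zeros_like(tp), where=denom > 0)
    return precision, recall, f1

# Made-up labels for illustration only (not results from the project).
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0])

cm = confusion_matrix(y_true, y_pred, num_classes=3)
accuracy = np.trace(cm) / cm.sum()
precision, recall, f1 = per_class_metrics(cm)
print(cm)
```

For the MNIST model, `num_classes` would be 10 and `y_pred` would come from taking the argmax of the model's softmax output on the test set.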
8. OUTPUT SCREENS
Epoch 1/5
938/938 [==============================] - 5s 4ms/step - loss: 0.2130 - accuracy: 0.9372
Epoch 2/5
938/938 [==============================] - 4s 5ms/step - loss: 0.0536 - accuracy: 0.9834
Epoch 3/5
938/938 [==============================] - 4s 5ms/step - loss: 0.0395 - accuracy: 0.9875
Epoch 4/5
938/938 [==============================] - 4s 5ms/step - loss: 0.0297 - accuracy: 0.9901
Epoch 5/5
938/938 [==============================] - 4s 5ms/step - loss: 0.0241 - accuracy: 0.9920
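The "938/938" in each log line is the number of batches processed per epoch. As a small worked check, and assuming the model was trained on the full 60,000-image MNIST training split (the log itself does not state this), the only batch size consistent with 938 steps can be recovered as follows.

```python
import math

train_size = 60_000     # standard MNIST training split (assumed to be used in full)
steps_per_epoch = 938   # taken from the training log

# Find every batch size b for which ceil(train_size / b) equals the logged step count.
candidates = [
    b for b in range(1, train_size + 1)
    if math.ceil(train_size / b) == steps_per_epoch
]
print(candidates)  # [64]

# With 64 images per batch, the final batch of each epoch is only partly filled.
last_batch = train_size - (steps_per_epoch - 1) * candidates[0]
print(last_batch)  # 32
```

Under that assumption, the log corresponds to a batch size of 64, with 937 full batches and one final batch of 32 images per epoch.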
9. CONCLUSION
With a test accuracy of over 99%, the model is capable of accurately predicting
the digits in new, unseen images. This highlights the ability of deep learning
models, especially CNNs, to generalize well to real-world image recognition
tasks.
This approach can be extended to more complex datasets and other image
classification tasks, showcasing the power and versatility of CNNs in computer
vision. Further tuning with regularization techniques (e.g., dropout, data
augmentation) could improve the performance even further.
10. FUTURE ENHANCEMENTS
images, could offer the opportunity to apply and refine the model for real-
world scenarios.
11. REFERENCES
https://www.kaggle.com/code/arunrk7/digit-recognition-using-cnn-99-accuracy
https://www.geeksforgeeks.org/python-classifying-handwritten-digits-with-tensorflow/
https://github.com/arpita739/MNIST-Handwritten-Digit-Recognition-using-CNN
https://www.geeksforgeeks.org/handwritten-digit-recognition-using-neural-network/
https://machinelearningmastery.com/how-to-develop-a-convolutional-neural-network-from-scratch-for-mnist-handwritten-digit-classification/