MNIST Classification Report
Harshyara Bukkapatnam
ENG21CS0085
7th Semester B
November 6, 2024
Sequence Networks and GAN
Prof. Arjun KrishnaMurthy
CNN for MNIST Handwritten Digit Classification
Dataset
The dataset used here is the MNIST digits classification dataset. Keras is
a deep learning API written in Python, and MNIST is one of the datasets it
provides. It consists of 60,000 training images and 10,000 testing images
of handwritten digits.
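A minimal sketch of loading the dataset through Keras follows; the printed shapes are the standard MNIST dimensions (28×28 grayscale images):

```python
from tensorflow.keras.datasets import mnist

# Load MNIST: 60,000 training and 10,000 test images of 28x28 grayscale digits
(trainX, trainY), (testX, testY) = mnist.load_data()

print(trainX.shape, trainY.shape)  # (60000, 28, 28) (60000,)
print(testX.shape, testY.shape)    # (10000, 28, 28) (10000,)
```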
Model Methodology
Data Preparation: The MNIST images and labels are loaded and pre-processed
before being fed to the network.
Model Architecture: A CNN built from convolutional, max pooling, and
dropout layers, detailed in the Model Architecture section below.
Training and Testing: Within each fold, the model trains for 10
epochs with a batch size of 32. Accuracy is recorded for both training
and validation data, enabling performance comparison across folds.
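The data-preparation step can be sketched as below. Reshaping to a single-channel tensor, scaling pixels to [0, 1], and one-hot encoding the labels are assumptions based on standard MNIST pipelines, as the original code is not reproduced in this report:

```python
import numpy as np
from tensorflow.keras.utils import to_categorical

def prep_pixels(x):
    # Add a channel dimension and scale pixel values from [0, 255] to [0, 1]
    x = x.reshape((x.shape[0], 28, 28, 1)).astype('float32')
    return x / 255.0

def prep_labels(y):
    # One-hot encode the digit labels into 10 classes
    return to_categorical(y, num_classes=10)
```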
Development Environment
Google Colab was used as the development environment for this project,
providing a cloud-based Jupyter notebook interface with pre-installed
libraries for deep learning, such as TensorFlow and Keras. It enables
seamless access to GPU acceleration, enhancing model training efficiency
on the MNIST dataset. Additionally, Colab's collaborative features
facilitate code sharing and documentation, streamlining the development
and testing process.
Principle
The principle behind this CNN model is to classify handwritten digits by
progressively learning spatial hierarchies of features through
convolutional layers. The model leverages convolution to capture local
patterns, like edges and textures, which are essential for recognizing digit
shapes. Max pooling layers down-sample these features, reducing
computational complexity while preserving key information.
Developing a Model
Importing Libraries
To develop the convolutional neural network (CNN) model for digit
classification, we first import essential libraries:
These libraries collectively provide the tools to preprocess data, build the
CNN model, train with cross-validation, and evaluate performance.
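The import block itself is not reproduced in this report; a plausible set, based on the APIs referenced throughout (Keras for the model, scikit-learn's KFold for cross-validation, matplotlib for the boxplot), would be:

```python
import numpy as np
from matplotlib import pyplot as plt
from sklearn.model_selection import KFold
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.models import Sequential, load_model
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense
```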
Model Architecture
The model is a convolutional neural network (CNN) designed to classify
images of handwritten digits from the MNIST dataset. The architecture is
structured to progressively capture features at multiple levels of
abstraction through several key layers:
Dropout Layers: Dropout layers are included after each max pooling
layer to reduce overfitting. A dropout rate of 0.2 is applied after the
first convolutional layer, and 0.3 after the second.
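A sketch of the architecture described above follows. The dropout rates (0.2 and 0.3) come from the report; the filter counts (32 and 64), kernel sizes, and the width of the hidden dense layer are assumptions, since the original layer definitions are not reproduced here:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense

def define_model():
    model = Sequential([
        # First convolutional block: captures local edge/texture features
        Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
        MaxPooling2D((2, 2)),
        Dropout(0.2),  # dropout rate after the first block (from the report)
        # Second convolutional block: higher-level shape features
        Conv2D(64, (3, 3), activation='relu'),
        MaxPooling2D((2, 2)),
        Dropout(0.3),  # dropout rate after the second block (from the report)
        Flatten(),
        Dense(100, activation='relu'),   # hidden width is an assumption
        Dense(10, activation='softmax')  # one output per digit class
    ])
    return model
```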
Model Compilation
The model is compiled with the following configurations:
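The compilation settings themselves are not reproduced in this report. A common configuration for this kind of model is sketched below; the optimizer choice is an assumption, while categorical cross-entropy and accuracy follow from the one-hot labels and the accuracy scores reported later (the stand-in model here only makes the snippet self-contained):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Flatten, Dense

# Stand-in for the CNN defined in the previous section
model = Sequential([Flatten(input_shape=(28, 28, 1)),
                    Dense(10, activation='softmax')])

# Assumed settings: Adam optimizer, categorical cross-entropy loss
# (labels are one-hot encoded), accuracy as the tracked metric
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
```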
Evaluation Function
The evaluate_model function was implemented to automate this
cross-validation process. It initializes a KFold object, shuffling the
dataset with a
fixed random state for reproducibility. Within each fold, the model is
defined, trained for 10 epochs with a batch size of 32, and evaluated on the
validation fold. The function then appends the accuracy score and training
history of each fold to respective lists, scores and histories, for
performance analysis.
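The function described above can be sketched as follows. The epoch count (10) and batch size (32) come from the report; the number of folds, the seed value, and passing the model factory in as a parameter are assumptions for illustration:

```python
from sklearn.model_selection import KFold

def evaluate_model(dataX, dataY, define_model, n_folds=5, epochs=10, batch_size=32):
    # k-fold cross-validation with shuffling and a fixed seed for reproducibility
    scores, histories = [], []
    kfold = KFold(n_folds, shuffle=True, random_state=1)
    for train_ix, test_ix in kfold.split(dataX):
        model = define_model()  # a fresh model for every fold
        trainX, trainY = dataX[train_ix], dataY[train_ix]
        valX, valY = dataX[test_ix], dataY[test_ix]
        history = model.fit(trainX, trainY, epochs=epochs, batch_size=batch_size,
                            validation_data=(valX, valY), verbose=0)
        _, acc = model.evaluate(valX, valY, verbose=0)
        # Collect per-fold accuracy and training history for later analysis
        scores.append(acc)
        histories.append(history)
    return scores, histories
```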
Performance Summary
After training the model across multiple folds, we compute an overall
performance summary by calculating the mean and standard deviation of
accuracy scores. This evaluation offers a comprehensive view of the
model’s effectiveness and consistency. A boxplot visualizes the distribution
of accuracy scores across folds, emphasizing the model's generalization
capability and performance stability. This summarization provides a
reliable measure of the model's robustness on unseen data.
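The summary step described above can be sketched as follows (the exact print format and plot styling are assumptions):

```python
import numpy as np
from matplotlib import pyplot as plt

def summarize_performance(scores):
    # Mean and standard deviation of per-fold accuracy
    print('Accuracy: mean=%.3f std=%.3f, n=%d'
          % (np.mean(scores) * 100, np.std(scores) * 100, len(scores)))
    # Boxplot of the fold accuracies to show spread and stability
    plt.boxplot(scores)
    plt.show()
```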
Final Model Training
In this step, the model is trained on the entire training dataset using the
previously defined CNN architecture. The training process consists of
fitting the model to the trainX and trainY data for 10 epochs, with a batch
size of 32. During training, the model adjusts its weights to minimize the
loss function and improve its ability to classify handwritten digits from the
MNIST dataset.
Model Saving
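The body of this section is not reproduced in the report; saving the trained model to the 'final_model.h5' file referenced in the next section can be sketched as below (the stand-in model only makes the snippet self-contained; in the report the fully trained CNN is saved):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Flatten, Dense

# Stand-in for the trained CNN
model = Sequential([Flatten(input_shape=(28, 28, 1)),
                    Dense(10, activation='softmax')])
model.compile(optimizer='adam', loss='categorical_crossentropy')

# Persist the architecture and weights to a single HDF5 file
# (filename taken from the report)
model.save('final_model.h5')
```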
Execution
Loading and Evaluating the Final Model
To assess the performance of the trained model, the final version is loaded
using TensorFlow's load_model function. The model, saved as
'final_model.h5', is then evaluated on the test dataset (testX, testY) to
gauge its accuracy. This step provides a final validation of the model's
ability to generalize on unseen data, with the test accuracy printed as the
output.
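The loading and evaluation step can be sketched as follows; wrapping it in a helper function that takes the file path and test arrays as parameters is an assumption for illustration:

```python
from tensorflow.keras.models import load_model

def evaluate_final(path, testX, testY):
    # Reload the saved model and score it on the held-out test set
    model = load_model(path)
    _, acc = model.evaluate(testX, testY, verbose=0)
    print('Test accuracy: %.3f' % (acc * 100.0))
    return acc
```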
Digit Prediction
The predict_digit function loads the pre-processed image and passes it
through the trained CNN model. The model predicts the class (digit) by
outputting a probability distribution over all 10 classes. The predicted digit
is identified by selecting the class with the highest probability using
np.argmax(). The predicted digit along with its confidence score
(probability distribution) is printed to the console for evaluation.
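The prediction step described above can be sketched as follows. In the report the function also loads the image and model from disk; passing them in as parameters here is a simplification for illustration:

```python
import numpy as np

def predict_digit(model, image):
    # image: pre-processed array of shape (28, 28, 1), pixels scaled to [0, 1]
    probs = model.predict(image.reshape(1, 28, 28, 1), verbose=0)[0]
    digit = int(np.argmax(probs))  # class with the highest probability
    print('Predicted digit:', digit)
    print('Confidence:', float(probs[digit]))
    return digit, probs
```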
Reference:
Google Colab Link:
https://colab.research.google.com/drive/
1tp8z4wC8olSFZHYByPkjune6FRbw21Iv?usp=sharing