Project Report
Abstract:
This report presents a deep learning method for detecting driver drowsiness by classifying eye
images as open or closed. I use a convolutional neural network (CNN) for this task. The model
achieved an overall accuracy of 99.75% on the test set, which indicates its effectiveness.
1. Introduction:
Identifying and classifying images of the human eye is a key step in assessing a driver's
drowsiness level and preventing serious accidents. Methods based on deep learning offer a
promising solution to this challenge: they provide fast and accurate detection and can alert
drivers so that they either stay attentive or take a rest, whichever they feel is appropriate.
2. Methodology
2.1 Dataset
The dataset I used for the demonstrations is the MRL Eye Dataset. It contains eye images of
men and women, with and without eyeglasses, in both open and closed states. The dataset was
first pre-processed so that all images have the same size and pixel intensity range.
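As an illustration of this kind of pre-processing, the sketch below resizes an image to a fixed
square size and rescales pixel intensities to [0, 1]. The function names, the nearest-neighbour
resizing, and the target size of 299 pixels are assumptions made for the example; they are not
taken from the project code.

```python
import numpy as np


def resize_nearest(image: np.ndarray, size: int) -> np.ndarray:
    """Resize an H x W x C image to size x size by nearest-neighbour sampling."""
    height, width = image.shape[:2]
    # Map every output row/column back to a source row/column.
    rows = np.arange(size) * height // size
    cols = np.arange(size) * width // size
    return image[rows[:, None], cols[None, :]]


def preprocess(image: np.ndarray, size: int = 299) -> np.ndarray:
    """Bring an 8-bit image to a common size and a [0, 1] intensity range.

    The default size is a hypothetical choice for this sketch.
    """
    resized = resize_nearest(image, size)
    return resized.astype(np.float32) / 255.0


# Synthetic stand-in for one eye image (random 8-bit RGB pixels).
rng = np.random.default_rng(0)
raw_image = rng.integers(0, 256, size=(86, 86, 3), dtype=np.uint8)

processed = preprocess(raw_image, size=64)
print(processed.shape, processed.dtype)
```

In practice a library resize with interpolation would usually be preferred; nearest-neighbour
sampling is used here only to keep the example short.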
The dataset is publicly available and can be accessed from websites such as Kaggle. You may
find the dataset at the end of this report.
The preprocessing steps comprise the following augmentations for the training and test
datasets, applied to achieve better model performance:
Training Augmentations
These augmentations increase the diversity of the dataset and hence the robustness of the
model.
Test Augmentations
These augmentations maintain consistency during testing and help the model generalize
better.
I used the Xception model, a deep convolutional neural network architecture that has shown
strong performance on image classification tasks, as reported in many papers. The model was
initialized with pretrained weights to take advantage of transfer learning, which can improve
performance and reduce training time.
I used the timm library to load the Xception model with pretrained weights. The
pretrained=True parameter ensures that the model is initialized with weights pre-trained
on a large dataset (typically ImageNet).
The Xception model is designed to accept RGB (3-channel) images, and the dataset also
contains RGB images.
Modifying the Output Layer
The original Xception model is designed for classification tasks with a large number of
classes, so I changed the final fully connected layer to adapt it to the specific task of
classifying eye images into two classes.
• Checked the number of input features to the final fully connected layer (in_features =
model.fc.in_features).
• Replaced the final fully connected layer with a new one (nn.Linear(in_features, 2)) to
classify the input images into two classes: 'OPEN' and 'CLOSED'.
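The two steps above can be sketched as follows. To keep the example runnable without
downloading weights, it uses a small stand-in network (TinyBackbone, a hypothetical class) that
exposes an fc layer in the same way as the Xception model; the helper name replace_classifier is
also an assumption for illustration.

```python
import torch
from torch import nn


def replace_classifier(model: nn.Module, num_classes: int = 2) -> nn.Module:
    """Swap the final fully connected layer (model.fc) for a new head."""
    in_features = model.fc.in_features  # size of the features feeding the head
    model.fc = nn.Linear(in_features, num_classes)
    return model


class TinyBackbone(nn.Module):
    """Hypothetical stand-in that exposes an `fc` layer like the Xception model."""

    def __init__(self, num_classes: int = 1000):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        self.fc = nn.Linear(8, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.fc(self.features(x))


# In the report's setting the backbone would instead be loaded through timm, e.g.
#   model = timm.create_model("xception", pretrained=True)
# The model name string is an assumption and may differ between timm versions.
model = replace_classifier(TinyBackbone(), num_classes=2)

# A batch of four RGB images yields one row of two class scores per image.
logits = model(torch.randn(4, 3, 64, 64))
print(logits.shape)
```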
The training and optimization strategy used for the modified Xception model is described
below:
2.4 Training
The model was trained on Google Colab with a GPU enabled, using the Adam optimizer with a
learning rate of 0.0001. The model was trained for 5 epochs.
Within only 10 epochs, the training accuracy increased to 99.677% and the loss dropped
to 0.0088.
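A minimal sketch of a training loop with these settings (Adam, learning rate 0.0001, and the
batch size of 32 reported in this section) is given below. The cross-entropy loss, the tiny
linear model, and the random data are assumptions used only to make the example self-contained;
they stand in for the modified Xception model and the eye-image dataset.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def train_one_epoch(model, loader, optimizer, loss_fn, device="cpu"):
    """Run one pass over the loader and return (mean loss, accuracy)."""
    model.train()
    total_loss, correct, seen = 0.0, 0, 0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        logits = model(images)
        loss = loss_fn(logits, labels)
        loss.backward()
        optimizer.step()
        # Accumulate per-sample statistics so the last batch is weighted correctly.
        total_loss += loss.item() * labels.size(0)
        correct += (logits.argmax(dim=1) == labels).sum().item()
        seen += labels.size(0)
    return total_loss / seen, correct / seen


torch.manual_seed(0)

# Synthetic stand-in data: 64 random RGB "images" with binary labels.
images = torch.randn(64, 3, 32, 32)
labels = torch.randint(0, 2, (64,))
# The report also sets num_workers=2; it is left at the default here.
loader = DataLoader(TensorDataset(images, labels), batch_size=32, shuffle=True)

# Hypothetical stand-in for the modified Xception model.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 2))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()  # assumed loss; not stated in the report

num_epochs = 5
for epoch in range(num_epochs):
    epoch_loss, epoch_acc = train_one_epoch(model, loader, optimizer, loss_fn)
    print(f"epoch {epoch + 1}: loss={epoch_loss:.4f} accuracy={epoch_acc:.3f}")
```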
The data were loaded in batches of 32 using 2 worker processes (num_workers). Here is a
complete analysis of the training process:
Below are the metrics I used to evaluate the performance of the model:
3. Results
3.1 Classification Performance
My proposed model achieves a testing accuracy of 99.75% on the test dataset. The model
appears to fit well: the classification report shows high precision, recall, and F1-score for
both classes, suggesting that the model effectively distinguishes between the two eye states
(open and closed).
The confusion matrix provides a visual representation of the model's performance. Each cell
in the matrix represents the number of instances of a predicted class compared to the actual
class.
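To make the link between the confusion matrix and the precision, recall, and F1-score concrete,
the sketch below computes all of them in plain Python. The label lists are invented toy data,
not results from this project, and the function names are assumptions for illustration. Rows
are actual classes and columns are predicted classes.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Count predictions; rows are actual classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for actual, predicted in zip(y_true, y_pred):
        matrix[index[actual]][index[predicted]] += 1
    return matrix


def per_class_metrics(matrix, labels):
    """Derive precision, recall and F1-score for each class from the matrix."""
    metrics = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]
        fp = sum(matrix[r][i] for r in range(len(labels))) - tp  # column minus diagonal
        fn = sum(matrix[i]) - tp  # row minus diagonal
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


# Toy labels for illustration only.
labels = ["OPEN", "CLOSED"]
y_true = ["OPEN", "OPEN", "OPEN", "CLOSED", "CLOSED", "CLOSED", "CLOSED", "OPEN"]
y_pred = ["OPEN", "OPEN", "CLOSED", "CLOSED", "CLOSED", "OPEN", "CLOSED", "OPEN"]

matrix = confusion_matrix(y_true, y_pred, labels)
metrics = per_class_metrics(matrix, labels)
accuracy = sum(matrix[i][i] for i in range(len(labels))) / len(y_true)

print(matrix)    # [[3, 1], [1, 3]]
print(accuracy)  # 0.75
print(metrics["OPEN"])
```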
An ROC curve illustrates the trade-off between the true positive rate and the false positive
rate for each class. In addition, the Area Under the Curve (AUC) was used to evaluate the
model's capability to discriminate between classes.
Figure 2: ROC Curves
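For reference, the sketch below shows one way an ROC curve and its AUC can be computed from
predicted scores in plain Python. The scores and labels are invented toy data, treating one
class (for example 'CLOSED') as the positive class with label 1 is an assumption, and the
function names are hypothetical.

```python
def roc_curve(y_true, scores):
    """Return (fpr, tpr) lists obtained by sweeping a threshold over the scores.

    y_true holds 1 for the positive class and 0 for the negative class.
    """
    positives = sum(y_true)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present to draw an ROC curve")

    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    previous = None
    for i in order:
        # Emit a point only when the score changes, so tied scores share a threshold.
        if previous is not None and scores[i] != previous:
            fpr.append(fp / negatives)
            tpr.append(tp / positives)
        if y_true[i] == 1:
            tp += 1
        else:
            fp += 1
        previous = scores[i]
    fpr.append(fp / negatives)
    tpr.append(tp / positives)
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the curve by the trapezoidal rule."""
    return sum((fpr[k] - fpr[k - 1]) * (tpr[k] + tpr[k - 1]) / 2
               for k in range(1, len(fpr)))


# Toy example: 1 = positive class, scores are predicted probabilities of that class.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

fpr, tpr = roc_curve(y_true, scores)
area = auc(fpr, tpr)
print(round(area, 4))  # 0.8889
```

An AUC of 1.0 corresponds to perfect separation of the two classes, while 0.5 corresponds to
random guessing.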
4. Discussion
The high accuracy and robust performance of the proposed model indicate that it is suitable for
practical application. Automated drowsiness detection can assist drivers and help prevent many
accidents.
5. Future Work
Future research could explore the following directions:
6. References
• Coursera course: Neural Networks and Deep Learning by Andrew Ng:
https://www.coursera.org/learn/neural-networks-deep-learning?specialization=deep-learning
• CS231n: Convolutional Neural Networks for Visual Recognition course by Stanford:
http://cs231n.stanford.edu/2017/index.html
• CDEEP Course:
https://www.cse.iitb.ac.in/~shivaram/teaching/old/cs337+335-s2019/index.html
• CS229: Course offered by Stanford: https://cs229.stanford.edu/
• Course on Computer Vision by Stanford:
http://vision.stanford.edu/teaching/cs131_fall2021/
• OpenCV Tutorial: https://docs.opencv.org/4.x/d6/d00/tutorial_py_root.html
• PyTorch Tutorial:
https://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html
• Book by Michael Nielsen: http://neuralnetworksanddeeplearning.com/
• Book by Aurélien Géron: Hands-On Machine Learning with Scikit-Learn, Keras, and
TensorFlow.
Code repository: https://github.com/Rakshit-Sawarn-iitb/Drowsiness-Detection/tree/main
The repository contains two versions of the code: a well-commented notebook named
drowsiness detection(1), and another version with testing metrics added.