
Design of Real-time Drowsiness Detection System using Dlib

Shruti Mohanty¹, Shruti V Hegde¹, Supriya Prasad¹, J. Manikandan²
¹Dept of ECE, PES University, 100-Feet Ring Road, BSK Stage III, Bengaluru - 85, Karnataka, India
²Dept of ECE, Crucible of Research and Innovation (CORI), PES University, 100-Feet Ring Road, BSK Stage III, Bengaluru - 85, Karnataka, India
{email: [email protected], [email protected], [email protected], [email protected]}

Abstract—Drowsiness while driving is a highly prevalent problem that leads to thousands of fatal accidents every year. A solution to prevent such accidents and fatalities is the need of the hour, and while complex systems have been developed to detect drowsiness in drivers, this paper explores a simpler yet highly effective method of doing the same. In this paper, a drowsy driver detection system is designed using Python and the Dlib model. Dlib's shape predictor is used to map the coordinates of the facial landmarks in the input video, and drowsiness is detected by monitoring the aspect ratios of the eyes and mouth. Performance evaluation of the proposed system is carried out by testing videos from a standard public dataset as well as real-time video captured in our lab. The proposed system gave a maximum recognition accuracy of 96.71% for dataset video input.

Keywords—Python, face detection, drowsiness detection, computer vision, Dlib, OpenCV, HOG, facial landmark estimation

I. INTRODUCTION
Facial expressions have the ability to offer deep insights into many physiological conditions of the body. The display of a myriad of emotions and reactions to stimuli has been a constant area of study and research, and has also been used in the development of intelligent systems such as facial emotion detection in [1], automatic pain detection in [2] and prediction of personality in [3], to name a few. There are numerous algorithms and methodologies available for face detection, which is the fundamental first step in the process.

There has been extensive research, and a number of papers have put forth possible methodologies to detect inattentiveness and drowsiness in a driver over the last two decades. In [4], traditional techniques are elaborated which are based on physiological measurements including brain waves, heart rate, pulse rate and respiration. However, these techniques are intrusive in nature. Reference [5] is based on Rowley's eye detection code from the STASM library; however, the presence of glasses adversely affects the performance of the system. Reference [6] monitors only the yawning patterns of the driver, using two separate cameras to acquire information about the upper part of the body in order to track the driver's mouth; however, the hardware dependency is higher.

Drowsiness in humans is characterized by a few very specific movements and facial expressions: the eyes begin to close, the mouth opens in a yawn, the jaw goes slack and the neck tilts. This paper focuses on tracking the eyes and mouth to detect drowsiness and classify a driver as drowsy. For real-time application of the model, the input video can be acquired by mounting a camera on the dashboard of the car, and can accommodate the driver's face, hands, upper body and occlusions such as non-tinted spectacles.

The Dlib model is trained to identify 68 facial landmarks. As shown in Fig. 1, the drowsiness features are extracted and the driver is alerted in case drowsiness is detected. The model does not require prior information on the individual who is testing it. The main software requirements are Python and OpenCV.

[Pipeline: real-time video acquired → Dlib face detector and facial landmark predictor initialized → frame pre-processing → facial landmarks mapped → signs of drowsiness monitored → alert if drowsiness detected]

Fig. 1: Block diagram of proposed real-time drowsiness detection system
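The frame pre-processing stage in the block diagram (resizing and grayscale conversion) would normally be done with OpenCV's cv2.resize and cv2.cvtColor; the NumPy-only sketch below shows the same two operations in a self-contained form. The 450 px target width and the function name are illustrative assumptions, not values from the paper.

```python
import numpy as np

def preprocess(frame, width=450):
    """Resize an H x W x 3 frame to a fixed width (nearest neighbour),
    then convert it to grayscale with BT.601 luma weights."""
    h, w = frame.shape[:2]
    new_h = h * width // w
    rows = np.arange(new_h) * h // new_h   # nearest-neighbour row map
    cols = np.arange(width) * w // width   # nearest-neighbour column map
    resized = frame[rows][:, cols]
    # weighted sum over the colour channels gives the grayscale frame
    gray = resized @ np.array([0.299, 0.587, 0.114])
    return gray.astype(np.uint8)
```

Working on a smaller grayscale frame keeps the per-frame HOG face detection fast enough for real-time use.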


II. IMPLEMENTATION
For this approach, we implement the drowsiness detector using OpenCV and Python. The Dlib library is used to detect and localize facial landmarks using Dlib's pre-trained facial landmark detector. It provides two shape predictor models [7] trained on the iBUG 300-W dataset, which localize 68 and 5 landmark points respectively within a face image. In this approach, the 68 facial landmarks have been used (as shown in Fig. 2 below).

Fig. 2: Manner in which 68 facial landmarks are mapped on a detected face [8]

A Histogram of Oriented Gradients (HOG) based face detector is used in Dlib. In this method, the frequencies of the gradient directions in localized regions of an image are used to form histograms. In many cases, it is more accurate than Haar cascades, as the false positive ratio is small and tuning at test time requires fewer parameters. It is especially suitable for face detection: firstly, it can describe contour and edge features exceptionally well in various objects; secondly, it operates on regional cells, which allows motion of the subject to be overlooked. Moreover, Dalal and Triggs [9] discovered that the HOG descriptor works well for human detection in images, which makes it appropriate for drowsiness detection.

In our model, a HOG based detector is first instantiated to find the location of the face in each individual frame of the input video stream.

Fig. 3: HOG face features for four different dataset input cases (eyes closed/open, mouth open/closed)

Fig. 4: HOG face features for four different real-time input cases (eyes closed/open, mouth open/closed)

The outline of the facial features made by the oriented gradients makes it easy to discern the location and even the state of the facial features. For example, in Fig. 3 and Fig. 4 we can see the difference between the HOG of an open mouth and that of a closed mouth.

Upon finding the location of the face, the facial landmark predictor is called to map the points of interest (eyes and mouth) and extract their coordinates.

Fig. 5: Facial landmark mapping for four different dataset cases (eyes closed/open, mouth open/closed)

Fig. 5 shows the facial landmark mapping on particular frames extracted from the video input for different cases: eyes closed with mouth open, eyes open with mouth closed, eyes open with mouth open, and eyes closed with mouth closed.

The coordinates of the right eye, left eye and mouth extracted at this stage are used to compute the aspect ratios of the right eye, left eye and mouth based on Euclidean distance (as shown in Fig. 6 below).

Fig. 6: Eye coordinates

From the formula in Fig. 6, the eye coordinates are obtained and the eye aspect ratio (EAR) is calculated. The aspect ratios of both eyes are averaged, as blinking is performed by both simultaneously. Similarly, the mouth aspect ratio is determined from the coordinates of the mouth, using the formula shown in Fig. 7 below, to detect yawning.
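The aspect-ratio computations described above can be sketched as pure functions over landmark coordinates. The six-point eye formula below, EAR = (|p2−p6| + |p3−p5|) / (2·|p1−p4|), is the widely used one and matches the role of Fig. 6; since the paper does not reproduce the Fig. 7 formula in the text, the eight-point inner-lip MAR here is one common definition and should be read as an assumption, as should the landmark groupings named in the comments.

```python
import math

def eye_aspect_ratio(eye):
    """eye: six (x, y) landmark points p1..p6 around one eye
    (dlib 68-point landmarks 37-42 or 43-48)."""
    a = math.dist(eye[1], eye[5])  # vertical distance |p2 - p6|
    b = math.dist(eye[2], eye[4])  # vertical distance |p3 - p5|
    c = math.dist(eye[0], eye[3])  # horizontal distance |p1 - p4|
    return (a + b) / (2.0 * c)

def mouth_aspect_ratio(mouth):
    """mouth: eight (x, y) points of the inner lip contour
    (dlib landmarks 61-68); one common MAR definition, assumed here."""
    a = math.dist(mouth[1], mouth[7])  # vertical distances across the lips
    b = math.dist(mouth[2], mouth[6])
    c = math.dist(mouth[3], mouth[5])
    d = math.dist(mouth[0], mouth[4])  # mouth-corner to mouth-corner
    return (a + b + c) / (2.0 * d)

def average_ear(left_eye, right_eye):
    """Average both eyes, as the paper does, since blinks are bilateral."""
    return (eye_aspect_ratio(left_eye) + eye_aspect_ratio(right_eye)) / 2.0
```

In the full pipeline, these coordinate lists would be sliced out of the 68-point shape returned by Dlib's predictor for each detected face.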

Fig. 7: Mouth coordinates

The final display of the drowsiness detection system shows the feed of the video input (from a video dataset or real-time capture), along with the computed aspect ratio values and drowsiness detection alerts. If the aspect ratio of the eyes falls below the stipulated threshold, the message "Eye Drowsiness detected" flashes on the screen along with a count of how many times the eyes were observed to be closed. If the aspect ratio of the mouth exceeds the stipulated threshold, the message "Yawning Drowsiness detected" is displayed on the screen along with a count of how many times yawning was detected.

Fig. 8: Drowsiness detection on a dataset video for 4 different cases (inset text: aspect ratios of both eyes and mouth, and drowsiness count)

Fig. 9: Drowsiness detection on a real-time video for 4 different cases (inset text: aspect ratios of both eyes and mouth, and drowsiness count)

The following steps are followed for testing the model:
Step 1: The input video (pre-recorded or real-time) is fed into the model. Individual frames are resized and converted to grayscale.
Step 2: Dlib's HOG based face detector is initialised and the location of the face is pinpointed.
Step 3: The facial landmarks for the face region are determined by the predictor and mapped onto the face.
Step 4: The left eye, right eye and mouth coordinates are extracted and used to compute the aspect ratios of both eyes and of the mouth based on Euclidean distance.
Step 5: The calculated aspect ratios are compared with fixed threshold values of 0.15 and 0.83 for the eyes and mouth respectively to determine signs of drowsiness. If the average aspect ratio of the left and right eyes falls below the threshold, it is recognized as a sign of drowsiness. Similarly, if the mouth aspect ratio exceeds the set threshold, there is a possibility of a yawn.
Step 6: When continuous signs of drowsiness are detected over a longer duration, the driver is alerted.

The real-time video is processed at 20 frames per second (fps), so each frame lasts 0.05 seconds. Drowsy blinks typically last for 20 frames, i.e., 1 second. Thus, a normal blink will not be identified as drowsy. Continuous eye blinks also last for a smaller number of frames and are hence distinguishable from drowsy blinks.
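The threshold comparison in Step 5 and the persistence requirement in Step 6 can be sketched as a small pure function. The 0.15/0.83 thresholds and the 20-frame window (1 s at 20 fps) come from the paper; the function name and the list-of-alert-frames return value are illustrative assumptions.

```python
# Thresholds and timing taken from the paper; names are illustrative.
EYE_AR_THRESH = 0.15    # the eye is treated as closed below this
MOUTH_AR_THRESH = 0.83  # the mouth is treated as yawning above this
DROWSY_FRAMES = 20      # 20 consecutive frames = 1 s at 20 fps

def detect_drowsiness(frames):
    """frames: iterable of (avg_ear, mar) pairs, one per video frame.
    Returns the frame indices at which a drowsiness alert fires."""
    alerts = []
    counter = 0
    for i, (ear, mar) in enumerate(frames):
        if ear < EYE_AR_THRESH or mar > MOUTH_AR_THRESH:
            counter += 1
            if counter == DROWSY_FRAMES:  # sign sustained for one second
                alerts.append(i)
        else:
            counter = 0  # a normal (short) blink resets the count
    return alerts
```

A normal blink spanning only a few frames never reaches the 20-frame count, which is exactly the argument made in the timing paragraph above.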
III. EXPERIMENTAL RESULTS

A. Dataset Description
Table I provides a description of the two datasets used for testing purposes.

TABLE I: DATASET DESCRIPTION
Feature                 | YawDD Dataset [10]                            | MRL Eye Dataset [11]
Number of videos/images | 29 videos                                     | 84,898 images
Number of males         | 16                                            | 33
Number of females       | 13                                            | 4
Actions performed       | Without talking, talking/singing, or yawning  | Closed or open eyes
Resolution              | 640x480 RGB (24-bit true colour)              | 640x480, 1280x1024, 752x480

B. Results Obtained from Comparison
Table II below gives the recognition accuracy obtained from the two approaches (eye closure and yawn detection), on standard datasets and in real time.

Real-time computational results were calculated by averaging 5 trials each for 12 subjects (5 males and 7 females) recorded at different locations. The average includes cases with and without glasses, and the video frames contained instances of both states (sleepy and non-sleepy) in every trial. The highest accuracy obtained is 96.71% for yawn detection, followed by 93.25% for drowsy blink detection.

TABLE II: RESULTS OBTAINED
        |  Recognition Accuracy
Feature | Real Time | Dataset
Eye     | 82.02 %   | 93.25 %
Yawn    | 85.44 %   | 96.71 %

The real-time detection results are lower because the model currently works exceedingly well under good to perfect lighting conditions like those found in the dataset videos, whereas the real-time testing was performed under a variety of lighting conditions.

IV. CONCLUSIONS
In the Dlib approach, the library's pre-trained 68-point facial landmark detector is used, together with a face detector based on the Histogram of Oriented Gradients (HOG). The quantitative metrics used in the proposed algorithm are the Eye Aspect Ratio (EAR), to monitor the driver's blinking pattern, and the Mouth Aspect Ratio (MAR), to determine whether the driver yawned in the frames of the continuous video stream. The average real-time test accuracies obtained using Dlib for eyes and yawn were 82.02% and 85.44% respectively, and 93.25% and 96.71% respectively for pre-recorded videos.

Future work will focus on enhancing the model to work under poor to mediocre lighting conditions, and on including more drowsiness signs, such as head nodding, in the drowsiness detection model.

REFERENCES
[1] M. Matsugu, K. Mori, Y. Mitari, and Y. Kaneda, "Subject independent facial expression recognition with robust face detection using a convolutional neural network," Neural Networks: The Official Journal of the Intern. Neural Network Society, vol. 16, no. 5-6, pp. 555-559, June 2003.
[2] P. Lucey, J. Cohn, S. Lucey, I. Matthews, S. Sridharan and K. M. Prkachin, "Automatically detecting pain using facial actions," in 3rd Intern. Conf. on Affective Computing and Intelligent Interaction and Workshops, Amsterdam, 2009, pp. 1-8.
[3] L. Liu, D. Preotiuc-Pietro, Z. R. Saman, M. E. Moghaddam, and L. H. Ungar, "Analyzing Personality through Social Media Profile Picture Choice," Intern. AAAI Conf. on Web and Social Media, Cologne, Germany, May 2016, p. 214.
[4] P. K. Stanley, T. Jaya Prakash, S. Sibin Lal and P. V. Daniel, "Embedded based drowsiness detection using EEG signals," IEEE Intern. Conf. on Power, Control, Signals and Instrumentation Engineering, Chennai, India, Sept. 2017, pp. 2596-2600.
[5] T. Danisman, I. M. Bilasco, C. Djeraba and N. Ihaddadene, "Drowsy driver detection system using eye blink patterns," Intern. Conf. on Machine and Web Intelligence, Algiers, Algeria, Oct. 2010, pp. 230-233.
[6] L. Li, Y. Chen and Z. Li, "Yawning detection for monitoring driver fatigue based on two cameras," 12th Intern. IEEE Conf. on Intelligent Transportation Systems, St. Louis, USA, Oct. 2009, pp. 1-6.
[7] D. E. King, dlib-models [Online]. Available: https://github.com/davisking/dlib-models
[8] L. Anzalone, "Training Alternative Dlib Shape Predictor models using Python," Oct. 2018. Accessed: July 16, 2019. [Online]. Available: https://medium.com/datadriveninvestor/training-alternative-dlib-shape-predictor-models-using-python-d1d8f8bd9f5c
[9] N. Dalal and B. Triggs, "Histograms of oriented gradients for human detection," IEEE Computer Society Conf. on Computer Vision and Pattern Recognition, San Diego, USA, June 2005, pp. 886-893.
[10] S. Abtahi, M. Omidyeganeh, S. Shirmohammadi, and B. Hariri, "YawDD: a yawning detection dataset," 5th ACM Multimedia Systems Conf., New York, USA, Mar. 2014, pp. 24-28.
[11] R. Fusek, MRL Eye Dataset. Accessed: May 28, 2019. [Online]. Available: http://mrl.cs.vsb.cz/eyedataset
