Kumar 2018
Abstract—Drowsy driving is one of the major causes of road accidents and death. Hence, detection of driver’s fatigue and its indication is an active research area. Most of the conventional methods are either vehicle based, behavioural based or physiological based. A few methods are intrusive and distract the driver, and some require expensive sensors and data handling. Therefore, in this study, a low cost, real time driver’s drowsiness detection system is developed with acceptable accuracy. In the developed system, a webcam records the video and the driver’s face is detected in each frame employing image processing techniques. Facial landmarks on the detected face are located, and subsequently the eye aspect ratio, mouth opening ratio and nose length ratio are computed; depending on their values, drowsiness is detected based on the developed adaptive thresholding. Machine learning algorithms have been implemented as well in an offline manner. A sensitivity of 95.58% and a specificity of 100% have been achieved in Support Vector Machine based classification.
Keywords—drowsiness detection, visual behaviour, eye aspect ratio, mouth opening ratio, nose length ratio.
I. INTRODUCTION
Drowsy driving is one of the major causes of deaths occurring in road accidents. Truck drivers who drive for continuous long hours (especially at night) and bus drivers on long distance routes or overnight buses are more susceptible to this problem. Driver drowsiness is an overcast nightmare to passengers in every country. Every year, a large number of injuries and deaths occur due to fatigue related road accidents. Hence, detection of driver’s fatigue and its indication is an active area of research due to its immense practical applicability. The basic drowsiness detection system has three blocks/modules: acquisition system, processing system and warning system. Here, the video of the driver’s frontal face is captured in the acquisition system and transferred to the processing block, where it is processed online to detect drowsiness. If drowsiness is detected, a warning or alarm is sent to the driver from the warning system.
Generally, the methods to detect drowsy drivers are classified into three types: vehicle based, behavioural based and physiological based. In the vehicle based method, a number of metrics like steering wheel movement, accelerator or brake pattern, vehicle speed, lateral acceleration, deviations from lane position etc. are monitored continuously. Detection of any abnormal change in these values is considered as driver drowsiness. This is a nonintrusive measurement as the sensors are not attached to the driver. In the behavioural based method [1-7], the visual behavior of the driver, i.e., eye blinking, eye closing, yawning, head bending etc., is analyzed to detect drowsiness. This is also a nonintrusive measurement as a simple camera is used to detect these features. In the physiological based method [8,9], physiological signals like Electrocardiogram (ECG), Electrooculogram (EOG), Electroencephalogram (EEG), heartbeat, pulse rate etc. are monitored and from these metrics, the drowsiness or fatigue level is detected. This is an intrusive measurement as the sensors are attached to the driver, which will distract the driver. Depending on the sensors used in the system, system cost as well as size will increase. However, inclusion of more parameters/features will increase the accuracy of the system to a certain extent. These factors motivate us to develop a low-cost, real time driver’s drowsiness detection system with acceptable accuracy. Hence, we have proposed a webcam based system to detect driver’s fatigue from the face image only, using image processing and machine learning techniques, to make the system low-cost as well as portable.
II. THE PROPOSED SYSTEM AND COMPUTATION OF PARAMETERS
A block diagram of the proposed driver drowsiness monitoring system has been depicted in Fig. 1. At first, the video is recorded using a webcam. The camera will be positioned in front of the driver to capture the front face image. From the video, the frames are extracted to obtain 2-D images. The face is detected in the frames using histogram of oriented gradients (HOG) and linear support vector machine (SVM) for object detection [10]. After detecting the face, facial landmarks [11] like positions of eye, nose, and mouth are marked on the images. From the facial landmarks, eye aspect ratio, mouth opening ratio and position of the head are quantified and, using these features and a machine learning approach, a decision is obtained about the drowsiness of the driver. If drowsiness is detected, an alarm will be sent to the driver to alert him/her. The details of each block are discussed below.
A. Data Acquisition
The video is recorded using a webcam (Sony CMU-BR300)
and the frames are extracted and processed on a laptop. After
extracting the frames, image processing techniques are applied
to these 2-D images. Presently, synthetic driver data has been
generated: volunteers are asked to look at the webcam with
intermittent eye blinking, eye closing, yawning and head
bending. The video is captured for a duration of 30 minutes.
B. Face Detection
After extracting the frames, the human faces are detected
first. Numerous online face detection algorithms exist.
In this study, the histogram of oriented gradients (HOG) and linear
SVM method [10] is used. In this method, positive samples of
Fig. 2 The facial landmark points
D. Feature Extraction
After detecting the facial landmarks, the features are computed as described below.
Eye aspect ratio (EAR): From the eye corner points, the eye aspect ratio is calculated as the ratio of height and width of the eye as given by
$$\mathrm{EAR} = \frac{(p_2 - p_6) + (p_3 - p_5)}{2\,(p_1 - p_4)}$$
where $p_i$ represents the point marked as i in the facial landmarks and $(p_i - p_j)$ is the distance between the points marked as i and j. Therefore, when the eyes are fully open, EAR has a high value, and as the eyes close, the EAR value goes towards zero. Thus, monotonically decreasing EAR values indicate gradually closing eyes, and the value is almost zero for completely closed eyes (eye blink). Consequently, EAR values indicate the drowsiness of the driver, as eye blinks occur due to drowsiness.
Mouth opening ratio (MOR): Mouth opening ratio is a parameter to detect yawning during drowsiness. Similar to EAR, it is calculated as
$$\mathrm{MOR} = \frac{(p_{15} - p_{23}) + (p_{16} - p_{22}) + (p_{17} - p_{21})}{3\,(p_{19} - p_{13})}$$
As defined, it increases rapidly when the mouth opens due to yawning, remains at that high value for a while (indicating that the mouth is open) and again decreases rapidly towards zero. As yawning is one of the characteristics of drowsiness, MOR gives a measure of driver drowsiness.
Head Bending: Due to drowsiness, the driver’s head usually tilts (forward or backward) with respect to the vertical axis. So, from the head bending angle, driver drowsiness can be detected. As the projected length of the nose on the camera focal plane is proportional to this bending, it can be used as a measure of head bending. In normal condition, the nose makes an acute angle with respect to the focal plane of the camera. This angle increases as the head moves vertically up and decreases on moving down. Therefore, the ratio of the nose length to an average nose length while awake is a measure of head bending, and if the value is greater or less than a particular range, it indicates head bending as well as drowsiness. From the facial landmarks, the nose length is calculated and the nose length ratio (NLR) is defined as
$$\mathrm{NLR} = \frac{\text{nose length}\,(p_{28} - p_{25})}{\text{average nose length}}$$
The average nose length is computed during the setup phase of the experiment as described in the next sub-section.
E. Classification
After computing all three features, the next task is to detect drowsiness in the extracted frames. In the beginning, adaptive thresholding is considered for classification. Later, machine learning algorithms are used to classify the data.
For computing the threshold values for each feature, it is assumed that initially the driver is in a completely awake state. This is called the setup phase. In the setup phase, the EAR values for the first three hundred frames (10 s at 30 fps) are recorded. Out of these three hundred initial frames containing a face, the average of the 150 maximum values is considered as the hard threshold for EAR. The higher values are considered so that no eye closing instances will be present. If the test value is less than this threshold, then eye closing (i.e., drowsiness) is detected. As the size of the eye can vary from person to person, this initial setup for each person will reduce this effect. Similarly, for the threshold of MOR, since the mouth may not be open to its maximum in the initial frames (setup phase), the threshold is taken experimentally from the observations. If the test value is greater than this threshold, then yawn (i.e., drowsiness) is detected. The head bending feature is used to find the angle made by the head with respect to the vertical axis in terms of the ratio of projected nose lengths. Normally, NLR has values from 0.9 to 1.1 for the normal upright position of the head, and it increases or decreases when the head bends down or up in the state of drowsiness. The average nose length is computed as the average of the nose lengths in the setup phase, assuming that no head bending is present. After computing the threshold values, the system is used for testing. The system detects drowsiness if, in a test frame, drowsiness is detected for at least one feature. To make this thresholding more realistic, the decision for each frame depends on the last 75 frames. If at least 70 frames (out of those 75) satisfy the drowsiness conditions for at least one feature, then the system gives the drowsiness detection indication and the alarm.
To make this thresholding adaptive, another single threshold value is computed which initially depends on the EAR threshold value. The average of EAR values is computed as the average of the 150 maximum values out of 300 frames in the setup phase. Then an offset is determined heuristically and the threshold is obtained as the offset subtracted from the average value. Driver safety is at risk when EAR is below this threshold. This EAR threshold value increases slightly with each yawning and head bending, up to a certain limit. As each yawning and head bending is distributed over multiple frames, yawning and head bending in consecutive frames are considered as a single yawn and head bending and added once to the adaptive threshold. In a test frame, if the EAR value is less than this adaptive threshold value, then drowsiness is detected and an alarm is given to the driver. Sometimes it may happen that when the head is too low due to bending, the system is unable to detect the face. In such a situation, the previous three frames are considered and if head bending was detected in those three frames, the drowsiness alarm will be shown. Table II illustrates this calculation for determining the adaptive threshold.
Fig. 3 Normal or awake state with facial landmarks
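
The three features and the frame-level decision rule described in this subsection can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the landmark container (a dict mapping the Fig. 2 point number to an (x, y) pair), all function and class names, and the default MOR and NLR limits (borrowed from Table II) are hypothetical, and the EAR expression assumes the usual six-point form that mirrors the MOR definition.

```python
import math
from collections import deque


def eye_aspect_ratio(p, idx=(1, 2, 3, 4, 5, 6)):
    """EAR for one eye. idx lists the six Fig. 2 point numbers of that eye
    (assumed order: corner, two upper-lid points, other corner, two lower-lid points)."""
    c1, u1, u2, c2, l2, l1 = (p[i] for i in idx)
    return (math.dist(u1, l1) + math.dist(u2, l2)) / (2.0 * math.dist(c1, c2))


def mouth_opening_ratio(p):
    """MOR: mean of three vertical mouth distances over the mouth width."""
    vertical = (math.dist(p[15], p[23]) + math.dist(p[16], p[22])
                + math.dist(p[17], p[21]))
    return vertical / (3.0 * math.dist(p[19], p[13]))


def nose_length_ratio(p, average_nose_length):
    """NLR: current projected nose length over the setup-phase average."""
    return math.dist(p[28], p[25]) / average_nose_length


def ear_hard_threshold(setup_ears, keep=150):
    """Average of the `keep` largest EAR values recorded in the setup phase."""
    top = sorted(setup_ears, reverse=True)[:keep]
    return sum(top) / len(top)


def frame_is_drowsy(ear, mor, nlr, ear_thr, mor_thr=0.6, nlr_range=(0.7, 1.2)):
    """True if at least one feature crosses its threshold in this frame.
    The MOR and NLR defaults are assumptions taken from Table II."""
    low, high = nlr_range
    return ear < ear_thr or mor > mor_thr or nlr < low or nlr > high


class FrameVoter:
    """Signal an alarm when at least `min_hits` of the last `window` frames are drowsy."""

    def __init__(self, window=75, min_hits=70):
        self.flags = deque(maxlen=window)  # oldest frame drops out automatically
        self.min_hits = min_hits

    def update(self, drowsy):
        self.flags.append(bool(drowsy))
        return sum(self.flags) >= self.min_hits
```

In use, `ear_hard_threshold` would be called once on the 300 setup-phase EAR values, and `FrameVoter.update(frame_is_drowsy(...))` once per test frame; for the second eye, the same EAR function can be called with `idx=(7, 8, 9, 10, 11, 12)`.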
Table II: Threshold for the computed parameters
Parameter / condition | Value / update
EAR from setup phase (average of 150 maximum values out of 300 frames) | 0.34
Threshold = EAR - offset | 0.34 - 0.045 = 0.295
At yawning (MOR > 0.6) | Threshold = Threshold + 0.002 (*max bound exists)
At head bending (NLR < 0.7 OR NLR > 1.2) | Threshold = Threshold + 0.001 (*max bound exists)
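
The update rule in Table II can be written as a small state machine. The sketch below is illustrative only: the class name, the use of edge detection so that a run of consecutive yawning or head-bending frames counts as one event, the choice to raise the threshold before testing the current frame, and the default upper bound (the setup-phase EAR itself) are all assumptions, since the text states that a maximum bound exists without giving its value.

```python
class AdaptiveEarThreshold:
    """Adaptive EAR threshold following the Table II update rule (assumed details noted)."""

    def __init__(self, setup_ear, offset=0.045, yawn_step=0.002,
                 bend_step=0.001, max_threshold=None):
        self.threshold = setup_ear - offset
        # Assumed cap: Table II says a maximum bound exists but not its value.
        self.max_threshold = setup_ear if max_threshold is None else max_threshold
        self.yawn_step = yawn_step
        self.bend_step = bend_step
        self._yawning = False   # state of the previous frame
        self._bending = False

    def _raise(self, step):
        self.threshold = min(self.threshold + step, self.max_threshold)

    def update(self, ear, mor, nlr):
        """Process one frame; return True if EAR is below the adaptive threshold."""
        yawning = mor > 0.6
        bending = nlr < 0.7 or nlr > 1.2
        # Only the first frame of a run raises the threshold (one event per run).
        if yawning and not self._yawning:
            self._raise(self.yawn_step)
        if bending and not self._bending:
            self._raise(self.bend_step)
        self._yawning, self._bending = yawning, bending
        return ear < self.threshold
```

With the Table II values, `AdaptiveEarThreshold(0.34)` starts at 0.295 and moves to 0.297 after one yawn and to 0.298 after one subsequent head-bending event. The fallback for frames where the face is lost is not covered here.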
Different drowsy conditions are displayed in Fig. 4. Figure 4(a) illustrates an example of a drowsiness alert due to yawning and Fig. 4(b) illustrates an example of the same due to eye closing. An example of head bending and its detection as a drowsiness alert is shown in Fig. 4(c). Figure 4(d) depicts the condition when the head is too low due to bending and drowsiness is detected as described in the previous section. Table III illustrates sample values of the parameters for different states.
Table III: Sample values of different parameters for different states
State | EAR | MOR | NLR
Normal | 0.35 | 0.34 | 1.003
Yawning | 0.22 | 0.77 | 0.76
Eye Closed | 0.15 | 0.419 | 0.876
Head Bending | 0.15 | 0.577 | 0.66
The developed system can also detect drowsiness for persons wearing spectacles, as depicted in Fig. 5.
Fig. 5 Detection of eyes in presence of spectacles
The developed algorithm has been tested on the INVEDRIFAC dataset [13], a video and image database of faces of in-vehicle automotive drivers. The algorithm has been tested on 6 different driver videos, and its performance on this data is of acceptable accuracy. Also, the videos have different illumination conditions, which indicates that the algorithm can perform well even under low illumination.
Subsequently, statistical analysis and classification of the features into two classes have been explored as well. As the features are correlated, principal component analysis has been used to transform the feature space into an uncorrelated one. The transformed features are statistically significant at the 5% level of significance. Bayesian classifier, Fisher’s linear discriminant analysis (FLDA) and Support Vector Machine (SVM) with linear kernel have been used for classification. Presently, this has been done in an offline manner on the stored data. Two-thirds of the data is used for training and one-third is used for testing the algorithms. The classifier results are given in Table IV. Sensitivity is calculated as the ratio of correctly classified drowsy states to all actual drowsy states, and specificity is computed as the ratio of correctly classified awake states to all actual awake states. Overall accuracy is computed as the ratio of correctly classified states to all the frames. It is evident that the overall accuracy of FLDA and SVM is better than that of the Bayesian classifier, whereas the Bayesian classifier gives the best sensitivity of 97%. However, the specificity of the Bayesian classifier is quite low (56%). This may be due to the error in approximating the probability distributions. Due to low specificity, the alarm may ring when drowsiness is actually not present. This can be disturbing to the driver.
Table IV: Accuracy of different classifiers
Method | Sensitivity | Specificity | Overall Accuracy
Bayesian Classifier | 0.973 | 0.561 | 0.854
FLDA | 0.896 | 1 | 0.926
SVM | 0.956 | 1 | 0.958
IV. CONCLUSION
In this paper, a low cost, real time driver drowsiness monitoring system has been proposed based on visual behavior and machine learning. Here, visual behavior features like eye aspect ratio, mouth opening ratio and nose length ratio are computed from the streaming video captured by a webcam. An adaptive thresholding technique has been developed to detect driver drowsiness in real time. The developed system works accurately with the generated synthetic data. Subsequently, the feature values are stored and machine learning algorithms have been used for classification. Bayesian classifier, FLDA and SVM have been explored here. It has been observed that FLDA and SVM outperform the Bayesian classifier. The sensitivity of FLDA and SVM is 0.896 and 0.956 respectively, whereas the specificity is 1 for both. As FLDA and SVM give better accuracy, work will be carried out to implement them in the developed system to do the classification (i.e., drowsiness detection) online. Also, the system will be implemented in hardware to make it portable for car systems, and a pilot study on drivers will be carried out to validate the developed system.
REFERENCES
[1] W. L. Ou, M. H. Shih, C. W. Chang, X. H. Yu, C. P. Fan, "Intelligent Video-Based Drowsy Driver Detection System under Various Illuminations and Embedded Software Implementation", 2015 International Conf. on Consumer Electronics - Taiwan, 2015.
[2] W. B. Horng, C. Y. Chen, Y. Chang, C. H. Fan, "Driver Fatigue Detection based on Eye Tracking and Dynamic Template Matching", IEEE International Conference on Networking, Sensing and Control, Taipei, Taiwan, March 21-23, 2004.
[3] S. Singh, N. P. Papanikolopoulos, "Monitoring Driver Fatigue using Facial Analysis Techniques", IEEE Conference on Intelligent Transportation System, pp 314-318.
[4] B. Alshaqaqi, A. S. Baquhaizel, M. E. A. Ouis, M. Bouumehed, A. Ouamri, M. Keche, "Driver Drowsiness Detection System", IEEE International Workshop on Systems, Signal Processing and their Applications, 2013.
[5] M. Karchani, A. Mazloumi, G. N. Saraji, A. Nahvi, K. S. Haghighi, B. M. Abadi, A. R. Foroshani, A. Niknezhad, "The Steps of Proposed Drowsiness Detection System Design based on Image Processing in Simulator Driving", International Research Journal of Applied and Basic Sciences, vol. 9(6), pp 878-887, 2015.
[6] R. Ahmad and J. N. Borole, "Drowsy Driver Identification Using Eye Blink Detection", IJISET - International Journal of Computer Science and Information Technologies, vol. 6, no. 1, pp. 270-274, Jan. 2015.
[7] A. Abas, J. Mellor, and X. Chen, "Non-intrusive drowsiness detection by employing Support Vector Machine", 2014 20th International Conference on Automation and Computing (ICAC), Bedfordshire, UK, 2014, pp. 188-193.
[8] A. Sengupta, A. Dasgupta, A. Chaudhuri, A. George, A. Routray, R. Guha, "A Multimodal System for Assessing Alertness Levels Due to Cognitive Loading", IEEE Trans. on Neural Systems and Rehabilitation Engg., vol. 25 (7), pp 1037-1046, 2017.
[9] K. T. Chui, K. F. Tsang, H. R. Chi, B. W. K. Ling, and C. K. Wu, "An accurate ECG based transportation safety drowsiness detection scheme", IEEE Transactions on Industrial Informatics, vol. 12, no. 4, pp. 1438-1452, Aug. 2016.
[10] N. Dalal and B. Triggs, "Histograms of Oriented Gradients for Human Detection", IEEE Conf. on CVPR, 2005.
[11] V. Kazemi and J. Sullivan, "One millisecond face alignment with an ensemble of regression trees", IEEE Conf. on Computer Vision and Pattern Recognition, 23-28 June, 2014, Columbus, OH, USA.
[12] Richard O. Duda, Peter E. Hart, David G. Stork, "Pattern Classification", Wiley student edition.
[13] Dataset: https://sites.google.com/site/invedrifac/