
SPECIAL SECTION ON ARTIFICIAL INTELLIGENCE (AI)-EMPOWERED INTELLIGENT TRANSPORTATION SYSTEMS

Received May 16, 2020, accepted May 26, 2020, date of publication June 1, 2020, date of current version June 10, 2020.
Digital Object Identifier 10.1109/ACCESS.2020.2998363

A Fatigue Driving Detection Algorithm Based on Facial Multi-Feature Fusion

KENING LI 1,3, YUNBO GONG 2, AND ZILIANG REN 4,5, (Member, IEEE)
1 School of Traffic and Environment, Shenzhen Institute of Information Technology, Shenzhen 518172, China
2 School of Civil Engineering and Transportation, South China University of Technology, Guangzhou 510640, China
3 Guangdong Key Laboratory of Intelligent Transportation System, School of Intelligent Systems Engineering, Sun Yat-sen University, Guangzhou 510275, China
4 CAS Key Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
5 Department of Mechanical and Automation Engineering, The Chinese University of Hong Kong, Hong Kong

Corresponding author: Ziliang Ren ([email protected])


This work was supported in part by the National Natural Science Foundation of China under Grant 51808151, Grant U1913202, and Grant
U1813205, in part by the Opening Fund of Guangdong Key Laboratory of Intelligent Transportation System under Grant 202001002, and
in part by the Shenzhen Technology Project under Grant JCYJ20180507182610734.

ABSTRACT Research on machine vision-based driver fatigue detection algorithms has improved traffic safety significantly. However, many algorithms do not analyze the driving state in light of individual driver characteristics, which leads to inaccuracy. This paper proposes a fatigue driving detection algorithm based on facial multi-feature fusion that incorporates driver characteristics. First, we introduce an improved YOLOv3-tiny convolutional neural network to capture facial regions under complex driving conditions, eliminating the inaccuracy caused by hand-crafted feature extraction. Second, on the basis of the Dlib toolkit, we introduce the Eye Feature Vector (EFV) and Mouth Feature Vector (MFV) as evaluation parameters of the driver's eye state and mouth state, respectively. Then, a driver identity information library is constructed by offline training, including a driver eye state classifier library, a driver mouth state classifier library, and a driver biometric library. Finally, we construct the driver identity verification model and the driver fatigue assessment model for online assessment. After identity verification is passed, the driver's closed-eye time, blink frequency, and yawn frequency are calculated to evaluate the fatigue state. In simulated driving applications, our algorithm detects the fatigue state at a speed of over 20 fps with an accuracy of 95.10%.

INDEX TERMS Traffic safety and environment, fatigue driving detection, machine vision, convolutional
neural network.

I. INTRODUCTION
With the rapid growth in the number of cars, traffic accidents are increasing, which brings huge potential safety hazards to travel. In order to minimize the occurrence of traffic accidents, the government has recently introduced multiple related policies and achieved significant results. However, at this stage, traffic accidents are still one of the main threats to life safety. For example, lack of road safety driving awareness, drunk driving, and fatigue driving are the main factors that cause traffic accidents. Among them, fatigue driving accounts for 14%-20% of the causes of traffic accidents, about 43% in heavy traffic accidents, and about 37% in traffic accidents involving large trucks and highways [1].

According to the NHTSA (National Highway Traffic Safety Administration) survey of driving vehicles, more than 70% of the surveyed drivers experienced fatigue driving [2], [3]. The NTSB (National Transportation Safety Board) of the United States found that, after investigating 120 accidents related to drivers, nearly 60% of them were related to driver fatigue. In France, casualties caused by driver fatigue account for 23% of the total. The Australian NRSA (National Road Safety Administration) released a report on the causes of major traffic accidents, showing that fatigue driving accounted for more than 15%. After investigating the causes of traffic accidents, Flatley [4] found that more than 21.6% of traffic accidents are related to fatigue driving. In addition, the investigation also found that other traffic accidents caused by improper operation and carelessness are also related to fatigue driving to some extent.

The associate editor coordinating the review of this manuscript and approving it for publication was Amr Tolba.


The Road Traffic Safety Law of China clearly states that a driver who drives for more than 4 hours without rest is considered to be driving fatigued [5], [6]. If the driver exhibits excessive fatigue driving behavior, the traffic control department can impose penalties and deduct points from the driving license. Although this regulation can reduce excessive fatigue driving to a certain extent, giving fatigue warnings at critical times can greatly reduce the occurrence of traffic accidents caused by fatigue driving. This is especially true for drivers engaged in long-distance passenger and freight transportation, who need to drive motor vehicles continuously for long periods due to work requirements. However, it is difficult to maintain a high alert state all the time while driving a vehicle. Therefore, real-time detection of and alarms for fatigue status are all the more important.

At present, the detection of fatigue driving is mainly divided into subjective methods and objective methods [7], [8]. The subjective method is based on a questionnaire survey. Well-known questionnaires include the Stanford Sleep Scale, Pearson Fatigue Scale, Driver Record Form, and Cooper-Harper Evaluation Questionnaire, which include subjective load assessments, sleep habit tables, and so on. The questionnaire survey relies on the driver's subjective thinking to answer the questions in the questionnaire. It is highly subjective, so it cannot be used as a standard method for detecting fatigued driving.

The objective method uses auxiliary tools to detect the driver's physiological characteristics or monitor vehicle information, and judges fatigue driving accordingly [9], [10]. It is mainly divided into three categories:

• Fatigue detection based on physiological characteristics
Studies have shown that as the degree of fatigue increases, the physiological indicators of the human body gradually deviate from their normal values [11]. Therefore, fatigue can be judged according to changes in the driver's physiological characteristics. Common features include EEG, ECG, and EMG. Among them, EEG is regarded as the ''gold standard'' for detecting fatigue [12].

• Fatigue detection based on vehicle behavior characteristics
This type of detection method mainly collects and analyzes information about the vehicle itself during the driving process to determine whether the driver is fatigued [13], [14].

• Fatigue detection based on facial features
When the human body is in a fatigue state versus a non-fatigue state, some body parts show very different performances [15], [16]. For example, in the fatigue state, the human body may exhibit head droop, body tilt, increased blink frequency, and yawning.

1. Using the position of the head to determine fatigue: when the driver is tired, the head may tilt or sway, so the driver's head movement can be detected to determine whether he is fatigued.

2. Using the state of the eyes to determine fatigue: fatigue is detected through eye characteristics such as blinking frequency and eye closing time. After a large number of experiments and verifications, the Carnegie Mellon Institute in the United States proposed the PERCLOS method for measuring fatigue. The method is considered to be the most reliable and effective fatigue determination method at present. The driver's eye closure can be obtained based on this parameter to judge the fatigue state [17].

3. Judging fatigue by the degree of opening and closing of the mouth: fatigue is judged according to the different performance of the driver's mouth state when speaking normally and when yawning. First, an image of the mouth is obtained through a video capture device, and then the opening and closing features of the mouth are imported into a neural network system to judge fatigue based on the duration of mouth opening and closing [18].

Determining fatigue based on the driver's facial features is a non-contact method [19], [20]. It does not interfere with or impact the driver while driving the vehicle, and it has the advantages of fast speed and strong operability. Therefore, compared with the other two types of fatigue driving detection methods, this method currently receives the most attention and is the most widely used.

Universities, research institutes, and enterprises have conducted long-term and in-depth research on fatigue driving testing. Abe et al. [21] obtained the best results of fatigue detection by studying the relationship between eye state and fatigue. The eye characteristics they studied mainly include eye opening and closing, eye movement, pupil, etc. Devi and Bajaj [22] proposed a fatigue detection model in which the system first locates the face in the driver image and then extracts the mouth and eye features in the localized face area; finally, through the fatigue detection system, the features are comprehensively processed to determine whether the driver is fatigued. By studying the positioning and tracking of the eyes under infrared light at night, Singh et al. [23] analyzed the relationship between the eye's state and fatigue. Haro et al. [24] proposed the use of infrared light for eye positioning and Kalman filtering for eye tracking, which is highly robust. Coetzer and Hancke [25] found that the Adaboost algorithm has certain advantages in some aspects of face detection, after comparing the Adaboost algorithm with ANN and SVM.

Although the technology of fatigue detection has made good progress and achieved results, it still needs to be improved. Detection methods based on physiology and behavior usually require the driver to wear or additionally install physiological information monitoring devices, which affects the comfort of the driver's normal driving. Moreover, the equipment that collects physiological information is often expensive and vulnerable, which is not conducive to the popularization of fatigue driving detection systems.

The detection method based on vision usually uses the Adaboost classifier algorithm for face localization [26], [27]. However, when the driver wears glasses or sunglasses, lighting changes, or the face is partially occluded, Adaboost cannot accurately locate the face position and promptly warn of fatigue driving.


At present, common algorithms judge fatigue by the state of the driver's eyes and mouth. However, these algorithms do not take the driver's individual characteristics into account: using a fixed threshold to determine the state of the eyes and mouth, they have a high misjudgment rate. As discussed in the literature above, existing driving fatigue detection results suffer from high intrusion, low robustness, and low reliability. Therefore, we propose a new algorithm. The innovations are as follows.

We design a driver face detection architecture based on the improved YOLOv3-tiny convolutional neural network and train the network with the open-source dataset WIDER FACE [28]. Compared with other deep learning algorithms, such as YOLOv3 [29] and MTCNN [30], the algorithm based on the improved YOLOv3-tiny network improves face recognition accuracy, simplifies the network structure, and reduces the amount of calculation. It is therefore more convenient to port to mobile devices.

Most existing algorithms are based on PERCLOS, which uses the driver's eye state as a feature to judge fatigue. In fact, when the driver's eyes are small, such algorithms have a high misjudgment rate. Similarly, algorithms based on yawn frequency are also sensitive to the size of the driver's mouth [31], [32]. Therefore, we design eye and mouth SVM classifiers that take driver characteristics into account and judge fatigue based on the actual size of the driver's eyes and mouth, achieving high accuracy.

Existing machine learning algorithms that consider individual characteristics often train classifiers by initialization before the system starts, which requires re-initialization every time the driver is changed. Not only is this a waste of time, it also does not ensure that every initialization works well. Therefore, we construct a driver identity information library. There are three types of driver identity information in the system: driver biometrics, driver eye state classifiers, and driver mouth state classifiers. We train classifiers in advance and store them in the driver identity information library. Then, through identity verification, the driver's classifiers are called before system startup. This not only simplifies initialization but also avoids inaccuracies due to entering the identity manually.

This paper is divided into the following four parts.

The first chapter is the introduction, which mainly introduces the background and research significance of our fatigue driving detection system and briefly expounds the research status of domestic and foreign fatigue driving detection. Moreover, according to the shortcomings of current research, we propose a new algorithm. Finally, we introduce the innovations of our algorithm.

The second chapter introduces the algorithm. Firstly, we use the improved YOLOv3-tiny network for face detection. Secondly, we introduce how to combine the Dlib toolkit to extract facial feature parameters. Then, we describe how to establish the driver identity information library. Finally, we introduce how to construct the driver identity verification model and the fatigue assessment model, and how to use these models to judge the fatigue state.

The third chapter is the experimental analysis. Firstly, the experimental environment and data set are introduced. Then we use qualitative description and quantitative evaluation to measure face detection and feature point location. Finally, we evaluate our fatigue driving detection algorithm in two directions: accuracy and real-time performance.

The fourth chapter is the conclusion, which mainly summarizes the main work of this paper and analyzes the shortcomings of the system and the aspects that need to be improved. We then propose future optimization directions and prospects for the algorithm.

II. METHODOLOGY
As shown in Figure 1, our algorithm includes the following three modules.

Identity Entry: Firstly, we use the camera to collect driver biometric images, eye classification images, and mouth classification images. Based on deep learning theory, we apply the improved YOLOv3-tiny network to locate suspected face regions in complex backgrounds. Secondly, according to the coordinates of the driver's face regions, the Dlib toolkit is used to extract facial feature point coordinates, from which we calculate the 128-dimensional feature vector, Eye Feature Vector (EFV), and Mouth Feature Vector (MFV) of the driver's face in the image. Then, to obtain an eye state classifier that takes driver characteristics into account, support vector machines (SVM) are trained with the eye feature vectors of open-eye and closed-eye images. The same goes for the mouth. Finally, the driver's biometric, eye state classifier, and mouth state classifier are stored in the driver identity information library.

Identity Verification: Firstly, we use the camera to collect images including the driver biometric and, based on deep learning theory, apply the improved YOLOv3-tiny network to locate suspected face regions in complex backgrounds. Secondly, according to the coordinates of the driver's face regions, the Dlib toolkit is used to extract facial feature point coordinates, from which we calculate the 128-dimensional feature vector of the driver's face in the image. Then, it is compared with all the stored driver biometrics in the driver identity information library. Finally, according to the comparison result, the eye classifier and mouth classifier of the corresponding driver are called for online recognition.

Online Recognition: The original data source is the real-time camera video. Firstly, based on deep learning theory, we apply the improved YOLOv3-tiny network to extract suspected face regions from complex backgrounds. Secondly, according to the coordinates of the driver's face regions, we use the Dlib toolkit to extract the driver's eye and mouth coordinates, from which we calculate the driver's EFV and MFV in real time during driving. Then, according to the EFV, we use the eye state classifier obtained by identity verification to judge the driver's eye state. The same goes for the mouth. Finally, based on the eye state and mouth state of each picture detected over a period of time, we calculate the driver's PERCLOS, blink frequency, and yawn frequency to judge the driver's fatigue state.

FIGURE 1. Algorithm structure diagram.

A. FACE DETECTION BASED ON THE IMPROVED YOLOv3-TINY NETWORK
The correctness of face detection directly affects the performance of the driving fatigue detection algorithm, so accurate and rapid face detection is its fundamental task. Among traditional algorithms, Viola and Jones [33] proposed that Haar-like [34] features of the input image can be extracted; based on these extracted features, the AdaBoost algorithm is used to train multiple weak classifiers, and finally the weak classifiers are cascaded to obtain a strong classifier, which is the final face detector. This method effectively improved the performance of face detection, and it is still used and improved to this day.

Recently, the continuous development and application of deep learning has provided new methods for face detection and segmentation [35]. These can be divided into two categories: one is the multi-level detection algorithm based on proposal regions; the second is the target detection algorithm based on the anchor box.


Representative algorithms of the former are Faster-RCNN [36] and MTCNN [30]; representative algorithms of the latter are S3FD [37] and SSH [38]. Compared with traditional methods [39], face detection based on convolutional neural networks (CNN) avoids the artificial extraction of features. With the support of data sets, face detection performance has been greatly improved.

YOLO [40] stands for You Only Look Once, which means you only need to look at the picture once to obtain the target information. YOLO treats target detection as a regression problem and uses an end-to-end convolutional neural network to extract the characteristics of the input image. It can obtain the position, size, and category information of the target in the image.

YOLOv3 is an improved version of the YOLO algorithm and one of the best algorithms in the field of target detection. Based on YOLO, YOLOv3 draws on many excellent research results. When using YOLOv3 to detect 320 × 320 images, the detection accuracy is consistent with the SSD algorithm, but it is three times faster.

YOLOv3-tiny is a lightweight target detection model based on YOLOv3. When detecting images on a Pascal Titan X, its detection speed can reach 220 FPS, far higher than that of the general network. The YOLOv3-tiny algorithm has the following advantages: 1) fast detection speed — the detection result can be obtained by running the neural network once for each test image, so it can be used for real-time detection; 2) global understanding of the image — information around the target can be learned during training, and the background error rate is less than half that of the Fast R-CNN algorithm. YOLOv3-tiny can be used on devices with low computing power such as embedded devices, but its detection accuracy is greatly reduced compared to YOLOv3. The network structure of YOLOv3-tiny is obtained by simplifying the network structure of YOLOv3.

The YOLO [40] (You Only Look Once) model is a fast target detection model based on deep learning [41], [42]. It is a single end-to-end network that turns target detection into a regression problem. To be more specific, the regression method and the CNN [43], [44] are used to replace the sliding window of traditional target detection to realize the feature extraction of the driver's face. This method of feature extraction is less affected by the external environment and has the advantage of extracting target features quickly.

YOLOv3-tiny has a 23-layer network, including 13 convolution layers, 6 max pooling layers, 1 upsampling layer, 1 fully connected layer, and 2 output layers. To simplify the network and reduce the computation, we transform the regression of multiple targets into a single target according to the regression idea of the YOLO model, and improve the YOLOv3-tiny network to locate suspected face regions. The improved network structure is shown in Figure 2.

In the YOLOv3-tiny network training phase, we use the WIDER FACE (Face Detection Data Set and Benchmark, http://wider-challenge.org/2019.html) [28] data set as the training data. The WIDER FACE dataset includes 32,203 images and 393,703 marked faces and is one of the most common face databases. The data set includes different scales, poses, occlusions, expressions, makeup, and lighting, as shown in Figure 3.

The WIDER FACE data set has the following features:
• The image resolution is generally high, and all images are color images.
• Each image has a large number of faces; each image contains an average of 12.2 faces, with many dense small faces.
• The data set is divided into three parts: training set, test set, and verification set, which respectively account for 40%, 50%, and 10% of the data set.

Firstly, based on the YOLOv3-tiny network, each picture of the WIDER FACE data set is adjusted to 10 different sizes, and grid cells are arranged on the adjusted pictures as 13 × 13 and 26 × 26. Then, we find the location of the driver's face on the non-overlapping grid cells and classify it. For each grid cell, the network outputs B bounding boxes, the corresponding confidence, and the conditional probability of the driver's face. Finally, non-maximum suppression is used to remove redundant bounding boxes. The confidence formula is given as Equation (1):

score = Pr(Object) × IOU_pred^truth    (1)

where Pr(Object) is the probability of the driver's face: if a face is included, Pr(Object) = 1; otherwise Pr(Object) = 0. IOU_pred^truth is the intersection over union (IOU) of the bounding box with the real box.

The YOLOv3-tiny network loss function consists of the central error term of the bounding box, the width and height error term of the bounding box, the error term of the prediction confidence, and the error term of the prediction category. Based on the YOLOv3-tiny network completed by offline training, we realize the location of the driver's suspected face area and provide an accurate driver face image for the following algorithm.
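To make the box post-processing step concrete, the snippet below is a minimal, generic sketch of non-maximum suppression over candidate face boxes, not the authors' exact implementation; the (x1, y1, x2, y2) box format is an assumption, and each candidate's score would be the confidence of Equation (1).

```python
import numpy as np

def nms(boxes, scores, iou_threshold=0.5):
    """Generic non-maximum suppression: repeatedly keep the highest-scoring
    box and drop candidates that overlap it too heavily.
    boxes: (N, 4) array-like of (x1, y1, x2, y2); scores: (N,) confidences."""
    boxes, scores = np.asarray(boxes, float), np.asarray(scores, float)
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    order = scores.argsort()[::-1]   # indices sorted by descending confidence
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        # Intersection of the best box with all remaining boxes.
        x1 = np.maximum(boxes[best, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[best, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[best, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[best, 3], boxes[rest, 3])
        inter = np.maximum(0, x2 - x1) * np.maximum(0, y2 - y1)
        iou = inter / (areas[best] + areas[rest] - inter)
        order = rest[iou <= iou_threshold]   # discard redundant boxes
    return keep
```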


FIGURE 2. Improved YOLOv3-tiny network structure diagram.

B. DRIVER'S FACIAL MOTION FEATURE EXTRACTION
1) FACE FEATURE LOCATION AND 128-DIMENSIONAL FEATURE VECTOR EXTRACTION BASED ON THE DLIB TOOLKIT
On the driver's face area located by the improved YOLOv3-tiny network, the face keypoint detection model based on the Dlib [45] library (as shown in Figure 4(a)) is used to extract the fine-grained features of the driver's face. The Dlib library defines 68 face key points and uses the method of cascaded shape regression to query the key points of each face component. Dlib is a modern C++ toolbox that contains machine learning algorithms and tools designed in C++ and used to solve practical problems.

In face key point detection, Dlib adopts the method in [46]–[48] and provides a model trained on millions of faces. This method uses an ensemble of regression trees [49]–[51] to estimate the positions of facial key points directly from a sparse subset of pixel intensities, with high detection accuracy and very little time consumption. This method is used in the face feature extraction proposed in this paper.


FIGURE 3. Sample images from the WIDER FACE data set.

FIGURE 4. Driver's face feature point acquisition based on Dlib. (a) Dlib face features; (b) Face feature point positioning effect.

When the driver's face is detected, the feature points of the face are obtained in real time by the above algorithm, as shown in Figure 4(b). After extracting the 68 feature points with the Dlib toolkit, they can be used to encode the face information as a 128-dimensional feature vector [52]–[54]. In this vector space, the Euclidean distance between images of the same face is smaller than that between different faces. Therefore, the 128-dimensional feature vectors extracted based on the Dlib toolkit can be used as driver biometrics for identity verification.
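As a concrete illustration (a minimal sketch, not the paper's exact implementation), the snippet below uses Dlib's publicly available models to extract the 68 landmarks and the 128-dimensional face descriptor. The model file names and the use of Dlib's own face detector in place of the improved YOLOv3-tiny network are assumptions made for this example.

```python
import dlib
import numpy as np

# Publicly available Dlib model files (names are assumptions for this sketch;
# the paper locates the face with its improved YOLOv3-tiny network instead).
detector = dlib.get_frontal_face_detector()
shape_predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")
face_encoder = dlib.face_recognition_model_v1("dlib_face_recognition_resnet_model_v1.dat")

def extract_landmarks_and_descriptor(rgb_image):
    """Return (68x2 landmark array, 128-D descriptor) for the first detected face."""
    faces = detector(rgb_image, 1)   # stand-in for the YOLOv3-tiny face region
    if not faces:
        return None, None
    shape = shape_predictor(rgb_image, faces[0])            # 68 facial key points
    landmarks = np.array([(p.x, p.y) for p in shape.parts()])
    descriptor = np.array(face_encoder.compute_face_descriptor(rgb_image, shape))
    return landmarks, descriptor
```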
2) EYE STATE PARAMETERS EXTRACTION BASED ON EFV
As discussed above, fatigue detection algorithms, whether based on the traditional PERCLOS or on blink frequency, depend on the judgment of eye state. Such methods mostly use the P80 standard to extract the parameters of the eye state. Firstly, the image of the eye is pre-processed. Secondly, the contour of the driver's eyes is fitted by ellipse fitting. Finally, the ratio of the major axis to the minor axis of the ellipse is used as a parameter to characterize the state of the eye. This method relies on the effect of eye image preprocessing.


In a real scenario, this method may have low accuracy due to constant changes in lighting conditions and in the driver's head posture during driving.

To this end, based on Dlib facial feature point localization, this paper proposes a new parameter, the Eye Feature Vector (EFV), which can be used to evaluate the driver's eye state. The EFV is the output of the driver eye extraction module and yields a parameter that can indicate the driver's fatigue status. In this module, the ellipse fitting method is applied to obtain the shape of the driver's pupils, and the eye state (open or closed) can be decided according to the relationship between the long and short axes of the ellipse. Furthermore, the fatigue status of the driver is evaluated by PERCLOS. According to the Dlib eye feature points, the EFV is defined as:

EFV = ( ||p_2 − p_6|| / ||p_1 − p_4||, ||p_3 − p_5|| / ||p_1 − p_4|| )    (2)

where p_i, i = 1, 2, ..., 6 are the eye feature point coordinates.

As shown in Figure 5, when the driver's eyes are in different states, the eye feature points show significant differences. As seen in the plane scatter plot (where blue is the EFV of open-eye pictures and orange is the EFV of closed-eye pictures), when the driver's eyes are in different states, there are also significant differences in EFV, in line with the eye feature points. Therefore, the EFV can be used as a parameter to characterize the state of the eyes for driver fatigue detection algorithms.

FIGURE 5. Eye state and EFV difference (orange means closed, and blue means open).

3) MOUTH STATE PARAMETERS EXTRACTION BASED ON MFV
Similar to PERCLOS and blink frequency, yawn frequency is also an important index for evaluating fatigue. It is inaccurate to judge fatigue based only on the degree of mouth opening: during normal driving, speaking also appears as a change in the degree of mouth opening, which greatly interferes with fatigue judgment based on the degree of mouth opening. After analyzing the yawning process, we find that when speaking, the mouth opens to a small extent and the opening duration is short, while in the yawning state, the mouth opens to a greater extent and the opening duration is longer.

In order to distinguish between yawning and speaking, this paper divides the mouth state into three types: closed mouth, small mouth, and big mouth. Similar to the eye state parameters, based on Dlib facial feature point localization, this paper proposes a new parameter, the Mouth Feature Vector (MFV), which can be used to evaluate the driver's mouth state. According to the Dlib mouth feature points, the MFV is defined as:

MFV = ( ||M_2 − M_8|| / ||M_1 − M_5||, ||M_3 − M_7|| / ||M_1 − M_5||, ||M_4 − M_6|| / ||M_1 − M_5|| )    (3)

where M_i, i = 1, 2, ..., 8 are the mouth feature point coordinates.

As shown in Figure 6, when the driver's mouth is in different states, the mouth feature points show significant differences. As seen in the plane scatter plot, when the driver's mouth is in different states, there are also significant differences in MFV, in line with the mouth feature points. Therefore, the MFV can be used as a parameter to characterize the state of the mouth for driver fatigue detection algorithms.

FIGURE 6. Mouth state and MFV difference.
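Both vectors reduce to simple ratios of landmark distances. Below is a minimal sketch of Equations (2) and (3) in Python; the mapping from Dlib's 68-point model to p_1..p_6 (here the left eye, points 36-41) and to M_1..M_8 (here a subset of the outer lip contour) is an assumption made for illustration.

```python
import numpy as np

def efv(landmarks):
    """Eye Feature Vector, Equation (2). p1..p6 = Dlib points 36..41
    (left eye) -- an assumed index mapping for illustration."""
    p = landmarks[36:42]
    width = np.linalg.norm(p[0] - p[3])            # ||p1 - p4||
    return (np.linalg.norm(p[1] - p[5]) / width,   # ||p2 - p6|| / ||p1 - p4||
            np.linalg.norm(p[2] - p[4]) / width)   # ||p3 - p5|| / ||p1 - p4||

def mfv(landmarks):
    """Mouth Feature Vector, Equation (3). M1..M8 taken from the outer lip
    contour (Dlib points 48, 50, 51, 52, 54, 56, 57, 58) -- an assumed mapping."""
    idx = [48, 50, 51, 52, 54, 56, 57, 58]         # M1, M2, M3, M4, M5, M6, M7, M8
    M = landmarks[idx]
    width = np.linalg.norm(M[0] - M[4])            # ||M1 - M5||
    return (np.linalg.norm(M[1] - M[7]) / width,   # ||M2 - M8|| / ||M1 - M5||
            np.linalg.norm(M[2] - M[6]) / width,   # ||M3 - M7|| / ||M1 - M5||
            np.linalg.norm(M[3] - M[5]) / width)   # ||M4 - M6|| / ||M1 - M5||
```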


C. DRIVER IDENTITY INFORMATION LIBRARY
1) DRIVER EYE STATE CLASSIFIER LIBRARY
As mentioned above, traditional driver fatigue detection algorithms are mostly based on the P80 criterion, which uses a fixed threshold to judge the driver's eye state without considering driver characteristics. To this end, we establish the driver eye state classifier library. According to different driver characteristics, we collect images of the driver's different eye states within a specific time, calculate the EFV in the different states, and train an SVM [55] classifier to judge the driver's eye state.

For a binary classification problem, suppose there are classified data samples {(x_1, y_1), ..., (x_i, y_i), ..., (x_m, y_m)}, where m is the number of samples, x_i ∈ R^n is n-dimensional data, and the corresponding classification label is y_i ∈ {−1, +1}. In the linearly separable case, based on the margin maximization criterion, the SVM algorithm seeks an optimal hyperplane that separates the two classes of data samples such that the distance between the hyperplane and the sample points closest to it is the largest.

Figure 7 shows a schematic diagram of finding the optimal hyperplane in a two-dimensional space. The triangle and circle points represent the two classes of data; H1 and H2 are the class boundaries parallel to the optimal hyperplane. They are determined by the sample points closest to the optimal hyperplane in each class, and the distance from a boundary to the optimal hyperplane is called the classification margin, Margin = 1/||w||. As can be seen from the figure, the classification interval between the two classes of samples is 2 × Margin = 2/||w||. The optimal classification hyperplane can be expressed as Equation (4):

w*^T · x + b* = 0    (4)

FIGURE 7. The optimal hyperplane of SVM.

The normal vector w and the intercept b determine the hyperplane function. The constrained optimization problem can be defined as:

min_{w,b} J(w) = (1/2)||w||^2
s.t. y_i(w^T · x_i + b) ≥ 1, i = 1, 2, ..., N    (5)

In the offline training phase for the driver's eyes, the improved YOLOv3-tiny is used to detect the face. Based on the face feature points, the driver's EFV is calculated to form the training set. Here, when y_i = +1, x_i is a positive sample, indicating that the driver's eye is open; when y_i = −1, x_i is a negative sample, indicating that the driver's eye is closed. Combined with the constraints of Equation (5), the hyperplane parameters w and b can be solved to construct the driver eye state classifier library.

2) DRIVER MOUTH STATE CLASSIFIER LIBRARY
Similar to the eye, traditional algorithms use a fixed threshold to judge the driver's mouth state without considering driver characteristics. To this end, we establish the driver mouth state classifier library. According to different driver characteristics, we collect images of the driver's different mouth states within a specific time, calculate the MFV in the different states, and train SVM classifiers to judge the driver's mouth state.

The mouth state classification is not a binary classification problem, whereas SVM is a binary classifier. SVM multi-class classifiers are mainly constructed through the indirect method, that is, combining multiple binary classifiers into a multi-class classifier.

So, in the offline training phase for the driver's mouth, the improved YOLOv3-tiny is used to detect the face. Based on the face feature points, the driver's MFV is calculated to form the training set. Then, two SVM classifiers are constructed: the first classifier is trained to judge the open-closed mouth state, and the second classifier is trained to judge the small-big mouth state, as shown in Figure 8, to build the driver mouth state classifier library.

FIGURE 8. Mouth state classifier model.
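To make the offline training step concrete, here is a minimal sketch of building one driver's entry in the classifier library; the use of scikit-learn's LinearSVC rather than the authors' own SVM implementation, and all function names, are assumptions for illustration (efv/mfv are the feature extractors sketched above).

```python
from sklearn.svm import LinearSVC

def train_eye_classifier(open_efvs, closed_efvs):
    """Binary SVM over EFV samples: +1 = open eye, -1 = closed eye (Eq. (5))."""
    X = open_efvs + closed_efvs
    y = [+1] * len(open_efvs) + [-1] * len(closed_efvs)
    return LinearSVC(C=1.0).fit(X, y)

def train_mouth_classifiers(closed_mfvs, small_mfvs, big_mfvs):
    """Two cascaded binary SVMs realize the 3-class mouth model (Figure 8):
    stage 1 separates closed vs. open, stage 2 separates small vs. big."""
    X1 = closed_mfvs + small_mfvs + big_mfvs
    y1 = [0] * len(closed_mfvs) + [1] * (len(small_mfvs) + len(big_mfvs))
    open_closed = LinearSVC(C=1.0).fit(X1, y1)
    X2 = small_mfvs + big_mfvs
    y2 = [0] * len(small_mfvs) + [1] * len(big_mfvs)
    small_big = LinearSVC(C=1.0).fit(X2, y2)
    return open_closed, small_big

def classify_mouth(open_closed, small_big, sample):
    """Return 'closed', 'small', or 'big' for one MFV sample."""
    if open_closed.predict([sample])[0] == 0:
        return "closed"
    return "small" if small_big.predict([sample])[0] == 0 else "big"
```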


3) DRIVER BIOMETRIC LIBRARY
As mentioned above, considering driver characteristics can improve detection accuracy. However, considering that offline training would otherwise need to be performed again every time the driver changes, we have established the driver biometric library based on Section 2.2.1. Every driver only needs to input his eye state classifier, mouth state classifier, and biometric into the driver identity information library once. Before driving, the system extracts the biometric of the current driver and compares it with the driver biometric library. If the comparison succeeds, the driver's eye state classifier and mouth state classifier are called for online recognition; if it fails, the driver is reminded to input his own identity information into the driver identity information library.

D. DRIVER'S IDENTITY VERIFICATION MODEL
According to the driver identity information library, before assessing fatigue, the driver needs to complete identity verification. First, we use the camera to collect biometric information of the current driver, that is, the 128-dimensional feature vector of the driver's face. Then, we calculate the Euclidean distance between it and all 128-dimensional feature vectors in the driver biometric library. The calculation formula is:

d = sqrt( Σ_{i=1}^{128} (x_i − y_i)^2 )    (6)

where x_i is the i-th dimension of the currently collected 128-dimensional feature vector, and y_i is the i-th dimension of a 128-dimensional feature vector in the driver biometric library.

We use 0.6 as the system's decision threshold. When all d values are greater than 0.6, it is determined that the current driver is not in the driver biometric library, that is, the verification fails. When there is a d value less than 0.6, it is determined that the current driver is in the driver biometric library, that is, the verification is passed. The identity corresponding to the minimum d value is then taken as the result of the identity verification, and that driver's eye and mouth state classifiers are called for detection before the system starts.

E. DRIVER FATIGUE ASSESSMENT MODEL
1) FATIGUE JUDGMENT BASED ON PERCLOS
Driver fatigue is a description of a state, and the corresponding fatigue level is a dynamically changing process. Wierwille of the Carnegie Mellon Research Center proposed the Percentage of Eye Closure (PERCLOS). It has been widely accepted and adopted by many researchers as an effective indicator of fatigue driving. PERCLOS [56] is a physical quantity that measures the state of human fatigue (drowsiness), defined as the proportion of time the eyes are closed per unit time. The U.S. Federal Highway Administration and the National Highway Traffic Safety Administration ran simulated driving in a laboratory, which verified the effectiveness of PERCLOS in characterizing driver fatigue. PERCLOS is defined as:

PERCLOS = (N_close / N_total) × 100%    (7)

where N_close is the number of closed-eye images in a specific time, and N_total is the total number of images in that time.

In order to obtain the eye classifier that takes the driver characteristics into account, the driver's identity must first be verified to obtain the eye classifier that corresponds to the driver's identity. After the identity verification is completed, online recognition of the driver fatigue state based on PERCLOS is performed. First, during the driving process, we use an ordinary car camera to obtain the driver's face image in real time. Next, the improved YOLOv3-tiny network is used to detect the driver's face. If the driver's face is detected, the facial area is used as the input image and the facial feature points are located using the Dlib toolkit. To reduce the false detection rate, this paper supplements the driver's head posture information as an auxiliary discrimination parameter: when the improved YOLOv3-tiny network fails to detect a face or locate the facial feature points, it is determined that the driver is in an abnormal head posture during driving, and this frame is treated as a closed-eye frame. After completing the face feature point positioning, the EFV is calculated based on the coordinates of the eye feature points, and then the driver's eye state classifier obtained by identity verification is used to determine the driver's eye state in the image. Finally, the number of closed-eye images of the driver is counted over a specific number of frames (1000 frames in this article), and then we calculate the PERCLOS value. If PERCLOS > Th_PERCLOS (Th_PERCLOS is the driver's fatigue state determination threshold, taken as 0.4 in this article), it is determined that the driver is fatigued; otherwise, the driver is not fatigued.

2) FATIGUE JUDGMENT BASED ON BLINK FREQUENCY
Under normal circumstances, during the driving process, the driver blinks relatively quickly each time, with a duration between 100-400 ms. However, in the fatigue state, the duration of a blink is longer, exceeding 1 second, and the blinking frequency increases. Therefore, the blink frequency can also intuitively reflect the driver's fatigue level. This article stipulates that when the system detects the eye state change open-closed-open in turn, it counts as one blink, so the formula for blink frequency is:

F_Blink = (N_Blink / T) × 100%    (8)

where T is time in minutes, and N_Blink is the number of blinks in T minutes.

Normal people blink about 10-20 times per minute when they are awake. When they are fatigued, the blink frequency increases by 64%. Based on related research, similar to Section 2.5.1, after the identity verification is completed, online recognition of the driver fatigue state based on blink frequency is performed. Different from Section 2.5.1, the number of blinks of the driver is counted over a specific number of frames (1000 frames in this article), and then we calculate the blink frequency. If F_Blink > Th_Blink (Th_Blink is the driver's fatigue state determination threshold, taken as 20 in this article), it is determined that the driver is fatigued; otherwise, the driver is not fatigued.
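A minimal sketch of the verification step in Section D follows (Equation (6) with the 0.6 threshold); representing the biometric library as a Python dict is an assumption for illustration.

```python
import numpy as np

THRESHOLD = 0.6  # decision threshold from Section D

def verify_identity(query_descriptor, biometric_library):
    """biometric_library: dict mapping driver id -> stored 128-D feature vector.
    Returns the matched driver id, or None if verification fails."""
    best_id, best_d = None, float("inf")
    for driver_id, stored in biometric_library.items():
        # Euclidean distance of Eq. (6) between query and stored biometric.
        d = np.linalg.norm(np.asarray(query_descriptor) - np.asarray(stored))
        if d < best_d:
            best_id, best_d = driver_id, d
    return best_id if best_d < THRESHOLD else None
```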


TABLE 1. Hardware configuration table.

3) FATIGUE JUDGMENT BASED ON YAWN FREQUENCY
Similar to PERCLOS and blink frequency, yawn frequency is also an important indicator for evaluating fatigue. A yawn is a deep breathing activity that often occurs during laziness, tiredness, and lack of rest, inhaling more oxygen by enlarging the lungs. It stimulates the central nervous system to boost the spirit and is a conditioned reflex under fatigue. This conditioned reflex activity can therefore provide an intuitive evaluation index of the fatigue level. In order to exclude non-yawning mouth activities such as speaking, this article judges whether a yawn occurs from the degree of mouth opening and the opening time. The article stipulates that when the system detects the state sequence Close-Small-Big-Small-Close and the duration of the opening is more than 2 seconds, it counts as a yawn. The formula for yawn frequency is:

F_Yawn = (N_Yawn / T) × 100%    (9)

where T is time in minutes, and N_Yawn is the number of yawns in T minutes.

According to related research, yawn frequency increases significantly when a human is fatigued. Based on this, the paper studies driver fatigue based on yawn frequency. In order to obtain the mouth classifier that takes the driver characteristics into account, the driver's identity must first be verified to obtain the mouth classifier that corresponds to the driver's identity. After the identity verification is completed, online recognition of the driver fatigue state based on yawn frequency is performed. First, during the driving process, we use an ordinary car camera to obtain the driver's face image in real time. Next, the improved YOLOv3-tiny network is used to detect the driver's face. After completing the face feature point positioning, the MFV is calculated based on the coordinates of the mouth feature points, and then the driver's mouth state classifier obtained by identity verification is used to determine the driver's mouth state in the image. Finally, the number of yawns of the driver is counted over a specific number of frames (1000 frames in this article), and then we calculate the yawn frequency. If F_Yawn > Th_Yawn (Th_Yawn is the driver's fatigue state determination threshold, taken as 3 in this article), it is determined that the driver is fatigued; otherwise, the driver is not fatigued.

In summary, the flowchart of the driver fatigue assessment model based on multi-feature fusion is shown in Figure 9.

III. EXPERIMENTS
To verify the validity of the algorithm, the paper evaluates the performance of the improved YOLOv3-tiny network on the self-built data set DSD and the public data set WIDER FACE. On this basis, a comparison experiment is designed and carried out to verify whether the fatigue driving detection algorithm based on facial multi-feature fusion is correct.

A. EXPERIMENTAL ENVIRONMENT AND DATA SET
The experimental platform is an Intel Core i5-8400 with x86 architecture and a CPU clock speed of 2.80 GHz. The graphics card is a GTX1060 with Pascal architecture (CUDA 9.2; CUDNN 7.2), the RAM is 8 GB DDR4, and the OpenCV 3.4.6 image library is used. The deep learning computing framework is PaddlePaddle 1.5. The program runs in Python 3.6. The hardware configuration is shown in Table 1.

The data sets used in the experiment include the self-built data set DSD and the public data set WIDER FACE. The public data set WIDER FACE includes 32,203 pictures and 393,703 marked faces and is used to train the YOLOv3-tiny face network. However, the WIDER FACE dataset only contains marked face images and does not provide any information about the driver's fatigue status. Therefore, the WIDER FACE data set cannot be used to analyze driver fatigue status. To this end, the driving state dataset (DSD) is established in this paper, which contains data collected from 50 test drivers sitting on a driving simulator (as shown in Figure 10). The data set of each test driver is shown in Table 2.

B. FACE DETECTION
The improved YOLOv3-tiny network provides face landmarks for fatigue driving detection, and its performance directly determines the quality of the fatigue driving detection algorithm. Therefore, we quantitatively evaluate the performance of the improved YOLOv3-tiny network on the WIDER FACE data set.


FIGURE 9. Flow chart of driver fatigue assessment model.
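As a concrete illustration of the assessment flow summarized in Figure 9, the sketch below fuses the three indicators over a detection window using the thresholds given in Section II.E (0.4 for PERCLOS, 20 blinks per minute, 3 yawns per minute). The per-frame state inputs are assumed to come from the eye and mouth classifiers described above, and the yawn counting is simplified: the paper additionally checks the Close-Small-Big-Small-Close sequence and a >2 s opening duration.

```python
def assess_fatigue(eye_states, mouth_states, minutes):
    """eye_states: per-frame 'open'/'closed'; mouth_states: per-frame
    'closed'/'small'/'big'. Returns True if any indicator signals fatigue."""
    # PERCLOS, Eq. (7): fraction of closed-eye frames in the window.
    perclos = eye_states.count("closed") / len(eye_states)

    # Blink count, Eq. (8): each open -> closed transition starts one blink.
    blinks = sum(1 for prev, cur in zip(eye_states, eye_states[1:])
                 if prev == "open" and cur == "closed")

    # Yawn count, Eq. (9) (simplified): each entry into the 'big' mouth state
    # is taken as one yawn.
    yawns = sum(1 for prev, cur in zip(mouth_states, mouth_states[1:])
                if prev != "big" and cur == "big")

    return (perclos > 0.4 or
            blinks / minutes > 20 or
            yawns / minutes > 3)
```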

In this paper, accuracy is selected as the evaluation indicator. It is an intuitive evaluation index of model performance, as shown in Equation (10):

accuracy = N_d / N_t    (10)

where N_d is the number of correctly detected images, and N_t is the total number of images.

In the process of training and verifying the improved YOLOv3-tiny network, the intersection over union parameter (IOU) [57] is introduced to measure the similarity between the detected face area and the marked real area. In Figure 11, face_d is the face area detected by the model and face is the marked real area; the calculation formula is Equation (11):

IoU = S(face_d ∩ face) / S(face_d ∪ face)    (11)

where S(face_d ∩ face) is the area of face_d ∩ face, and S(face_d ∪ face) is the area of face_d ∪ face.

The intersection over union indicates the degree of overlap between the model prediction area and the real area. As can be seen from Figure 11, the higher the value, the higher the detection accuracy. In the case IOU = 1, the prediction box exactly overlaps the real box.
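A minimal sketch of Equations (10) and (11) follows, assuming axis-aligned boxes given as (x1, y1, x2, y2) corner coordinates:

```python
def iou(box_a, box_b):
    """Intersection over union, Eq. (11), for boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda b: (b[2] - b[0]) * (b[3] - b[1])
    union = area(box_a) + area(box_b) - inter
    return inter / union if union > 0 else 0.0

def detection_accuracy(pred_boxes, true_boxes, threshold=0.75):
    """Accuracy, Eq. (10): share of images whose detected face overlaps the
    ground truth with IOU above the paper's face-detection threshold of 0.75."""
    correct = sum(1 for p, t in zip(pred_boxes, true_boxes) if iou(p, t) > threshold)
    return correct / len(true_boxes)
```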


TABLE 2. DSD data set.

FIGURE 10. Driving simulator for experiment.

FIGURE 12. Driver face detection accuracy.

FIGURE 11. Intersection over union.

Generally speaking, in the target detection task, an object is considered correctly detected when the IOU > 0.5. In the face detection task of this paper, considering that the face detection result directly affects the accuracy of the subsequent algorithm, we set a higher threshold: when the IOU > 0.75, the face is considered to be correctly detected. Figure 12 shows the accuracy curve of driver face detection during the training of the improved YOLOv3-tiny network. Clearly, as the number of training rounds increases, the accuracy of face detection gradually increases. The improved YOLOv3-tiny network reaches an accuracy rate of 97.9%.

C. FATIGUE STATE EVALUATION
1) ACCURACY
We use the DSD dataset to test the performance of fatigue detection. The DSD data set is shown in Table 2. Before the experiment, each subject needs to complete the identity entry. After this, the state classifiers required for online detection can be obtained. Then, according to the state classifier of the corresponding subject, the eye state and mouth state of each frame in the video can be discriminated. Finally, the PERCLOS, blink frequency, and yawn frequency of the corresponding subject in the video are calculated. If any of the indicators exceeds its set threshold, fatigue is determined.

In this paper, we randomly select ten videos from the DSD test set, including both non-fatigued and fatigued driving status; the experimental results are shown in Table 3. Fatigue is judged based on the PERCLOS exceeding 0.4, the blink frequency exceeding 20, or the yawn frequency exceeding 3. As seen from the table, the accuracy of fatigue driving detection in the 10 videos is 90%. On the entire DSD data set, the accuracy rate of the system is 95.10%. The system was tested to meet the expected design goals and the needs of practical applications.

2) SPEED
Based on the hardware configuration shown in Table 1, a comparison test is performed on the image source to verify the real-time performance of the system. The results are shown in Table 4.

From Table 4, the time taken with images from the camera is slightly longer than with video. After analysis, this is because the image acquisition module based on OpenCV uses different reading methods for the camera and the file video stream, so the times differ.


TABLE 3. Sample test table.

TABLE 4. Time spent on fatigue status judgment.

TABLE 5. Comparison with other fatigue detection algorithms.

Our algorithm shows that the system has good accuracy and high-speed performance under various conditions and can accurately judge the fatigue state of the driver. Compared with the Adaboost+CNN and MTCNN+LRCN algorithms [58], [59], our method improves the accuracy of the fatigue driving detection algorithm. It also has better real-time performance, which meets the requirements of a fatigue driving detection system. The comparative results are shown in Table 5.

IV. CONCLUSION
Fatigue driving can seriously affect driving skills and seriously threatens drivers and other traffic participants. At present, fatigue driving detection has achieved good research results, but it still needs improvement in areas such as high intrusiveness, poor detection performance in complex environments, and overly simple evaluation indicators. Therefore, we propose a new detection algorithm for fatigue driving based on facial multi-feature fusion. The main contributions are as follows.

• We designed a driver face detection architecture based on the improved YOLOv3-tiny convolutional neural network and trained the network with the open-source dataset WIDER FACE [28].

• We designed eye and mouth SVM classifiers that take driver characteristics into account, which judge fatigue based on the actual size of the driver's eyes and mouth, achieving high accuracy.

• We constructed the driver identity information library. There are three types of driver identity information in the system: driver biometrics, driver eye state classifiers, and driver mouth state classifiers. We train the classifiers in advance and store them in the driver identity information library. Then, through identity verification, the driver's classifiers are called before system startup. This not only simplifies initialization but also avoids inaccuracies due to entering the identity manually.

ACKNOWLEDGMENT
Thanks to the open-source dataset WIDER_FACE for providing data. WIDER_FACE: http://shuoyang1213.me/WIDERFACE/


REFERENCES
[1] A. Amodio, M. Ermidoro, D. Maggi, S. Formentin, and S. M. Savaresi, "Automatic detection of driver impairment based on pupillary light reflex," IEEE Trans. Intell. Transp. Syst., vol. 20, no. 8, pp. 3038-3048, Aug. 2018, doi: 10.1109/TITS.2018.2871262.
[2] Z. Zhou, Y. Zhou, Z. Pu, and Y. Xu, "Simulation of pedestrian behavior during the flashing green signal using a modified social force model," Transportmetrica A, Transp. Sci., vol. 15, no. 2, pp. 1019-1040, Nov. 2019.
[3] Z. Zhou, Y. Cai, R. Ke, and J. Yang, "A collision avoidance model for two-pedestrian groups: Considering random avoidance patterns," Phys. A, Stat. Mech. Appl., vol. 475, pp. 142-154, Jun. 2017.
[4] T. Toroyan, "Global status report on road safety 2013: Supporting a decade of action," Injury Prevention, vol. 15, no. 4, p. 286, 2013.
[5] D. Ma, X. Luo, S. Jin, D. Wang, W. Guo, and F. Wang, "Lane-based saturation degree estimation for signalized intersections using travel time data," IEEE Intell. Transp. Syst. Mag., vol. 9, no. 3, pp. 136-148, Jul. 2017.
[6] D. Ma, X. Luo, W. Li, S. Jin, W. Guo, and D. Wang, "Traffic demand estimation for lane groups at signal-controlled intersections using travel times from video-imaging detectors," IET Intell. Transp. Syst., vol. 11, no. 4, pp. 222-229, May 2017.
[7] G. Zhang, K. K. W. Yau, X. Zhang, and Y. Li, "Traffic accidents involving fatigue driving and their extent of casualties," Accident Anal. Prevention, vol. 87, pp. 34-42, Feb. 2016, doi: 10.1016/j.aap.2015.10.033.
[8] D. Mollicone, K. Kan, C. Mott, R. Bartels, S. Bruneau, M. van Wollen, A. R. Sparrow, and H. P. A. Van Dongen, "Predicting performance and safety based on driver fatigue," Accident Anal. Prevention, vol. 126, pp. 142-145, May 2019, doi: 10.1016/j.aap.2018.03.004.
[9] X. Qu, Y. Yu, M. Zhou, C.-T. Lin, and X. Wang, "Jointly dampening traffic oscillations and improving energy consumption with electric, connected and automated vehicles: A reinforcement learning based approach," Appl. Energy, vol. 257, Jan. 2020, Art. no. 114030.
[10] M. Zhou, Y. Yu, and X. Qu, "Development of an efficient driving strategy for connected and automated vehicles at signalized intersections: A reinforcement learning approach," IEEE Trans. Intell. Transp. Syst., vol. 21, no. 1, pp. 433-443, Jan. 2020.
[11] Y. Feng and L. Zhaode, "Research on driver fatigue monitoring algorithm based on deep learning," J. Wuhan Univ. Technol., Transp. Sci. Eng., vol. 42, no. 3, pp. 417-421, 2018.
[12] G. Sikander and S. Anwar, "Driver fatigue detection systems: A review," IEEE Trans. Intell. Transp. Syst., vol. 20, no. 6, pp. 2339-2352, Jun. 2019, doi: 10.1109/TITS.2018.2868499.
[13] D. Ma, X. Luo, S. Jin, W. Guo, and D. Wang, "Estimating maximum queue length for traffic lane groups using travel times from video-imaging data," IEEE Intell. Transp. Syst. Mag., vol. 10, no. 3, pp. 123-134, Jun. 2018.
[14] C. Xu, Y. Yang, S. Jin, Z. Qu, and L. Hou, "Potential risk and its influencing factors for separated bicycle paths," Accident Anal. Prevention, vol. 87, pp. 59-67, Feb. 2016, doi: 10.1016/j.aap.2015.11.014.
[15] X. Sun, H. Zhang, W. Meng, R. Zhang, K. Li, and T. Peng, "Primary resonance analysis and vibration suppression for the harmonically excited nonlinear suspension system using a pair of symmetric viscoelastic buffers," Nonlinear Dyn., vol. 94, no. 2, pp. 1243-1265, Oct. 2018, doi: 10.1007/s11071-018-4421-9.
[16] H. Xiong, X. Zhu, and R. Zhang, "Energy recovery strategy numerical simulation for dual axle drive pure electric vehicle based on motor loss model and big data calculation," Complexity, vol. 2018, Aug. 2018, Art. no. 4071743, doi: 10.1155/2018/4071743.
[21] T. Abe, T. Nonomura, Y. Komada, S. Asaoka, T. Sasai, A. Ueno, and Y. Inoue, "Detecting deteriorated vigilance using percentage of eyelid closure time during behavioral maintenance of wakefulness tests," Int. J. Psychophysiol., vol. 82, no. 3, pp. 269-274, 2011.
[22] M. S. Devi and P. R. Bajaj, "Fuzzy based driver fatigue detection," in Proc. IEEE Int. Conf. Syst., Man Cybern., Oct. 2010, pp. 3139-3144.
[23] H. Singh, J. S. Bhatia, and J. Kaur, "Eye tracking based driver fatigue monitoring and warning system," in Proc. India Int. Conf. Power Electron. (IICPE), Jan. 2011, pp. 1-6.
[24] A. Haro, M. Flickner, and I. Essa, "Detecting and tracking eyes by using their physiological properties, dynamics, and appearance," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2000, pp. 163-168.
[25] R. C. Coetzer and G. P. Hancke, "Eye detection for a real-time vehicle driver fatigue monitoring system," in Proc. IEEE Intell. Vehicles Symp. (IV), Jun. 2011, pp. 66-71.
[26] Z. Ning, Y. Li, P. Dong, X. Wang, M. S. Obaidat, X. Hu, L. Guo, Y. Guo, J. Huang, and B. Hu, "When deep reinforcement learning meets 5G-enabled vehicular networks: A distributed offloading framework for traffic big data," IEEE Trans. Ind. Informat., vol. 16, no. 2, pp. 1352-1361, Feb. 2020.
[27] Z. Ning, Y. Feng, M. Collotta, X. Kong, X. Wang, L. Guo, X. Hu, and B. Hu, "Deep learning in edge of vehicles: Exploring trirelationship for data transmission," IEEE Trans. Ind. Informat., vol. 15, no. 10, pp. 5737-5746, Oct. 2019.
[28] S. Yang, P. Luo, C.-C. Loy, and X. Tang, "WIDER FACE: A face detection benchmark," in Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., Las Vegas, NV, USA, Dec. 2016, pp. 5525-5533.
[29] J.-H. Won, D.-H. Lee, K.-M. Lee, and C.-H. Lin, "An improved YOLOv3-based neural network for de-identification technology," in Proc. 34th Int. Tech. Conf. Circuits/Syst., Comput. Commun. (ITC-CSCC), JeJu, South Korea, Jun. 2019, pp. 1-2.
[30] X. Chen, X. Luo, X. Liu, and J. Fang, "Eyes localization algorithm based on prior MTCNN face detection," in Proc. IEEE 8th Joint Int. Inf. Technol. Artif. Intell. Conf. (ITAIC), Chongqing, China, May 2019, pp. 1763-1767.
[31] Z. Ning, J. Huang, X. Wang, J. J. P. C. Rodrigues, and L. Guo, "Mobile edge computing-enabled Internet of vehicles: Toward energy-efficient scheduling," IEEE Netw., vol. 33, no. 5, pp. 198-205, Sep. 2019.
[32] X. Kong, F. Xia, J. Li, M. Hou, M. Li, and Y. Xiang, "A shared bus profiling scheme for smart cities based on heterogeneous mobile crowdsourced data," IEEE Trans. Ind. Informat., vol. 16, no. 2, pp. 1436-1444, Feb. 2020.
[33] P. Viola and M. Jones, "Robust real-time face detection," in Proc. 8th IEEE Int. Conf. Comput. Vis. (ICCV), vol. 2. Vancouver, BC, Canada: Institute of Electrical and Electronics Engineers, 2001, p. 747.
[34] S. Saha and V. Demoulin, "ALOHA: An efficient binary descriptor based on Haar features," in Proc. 19th IEEE Int. Conf. Image Process., Orlando, FL, USA, Sep. 2012, pp. 2345-2348.
[35] S. Jida, B. Aksasse, and M. Ouanan, "Face segmentation and detection using Voronoi diagram and 2D histogram," in Proc. Intell. Syst. Comput. Vis. (ISCV), Fez, Morocco, Apr. 2017, pp. 1-5.
[36] J. Zou and R. Song, "Microarray camera image segmentation with faster-RCNN," in Proc. IEEE Int. Conf. Appl. Syst. Invention (ICASI), Chiba, Japan, Apr. 2018, pp. 86-89.
[37] N. L. Arifin, H. Widiastuti, and A. Wibowo, "Study on effect of source to film distance (SFD) on the radiographic images," in Proc. Int. Conf. Appl.
[17] Z. Ning, R. Y. K. Kwok, K. Zhang, X. Wang, M. S. Obaidat, L. Guo, Eng. (ICAE), Batam, Indonesia, Oct. 2018, pp. 1–4.
X. Hu, B. Hu, Y. Guo, and B. Sadoun, ‘‘Joint computing and caching in [38] P. Samangouei, R. Chellappa, M. Najibi, and L. S. Davis, ‘‘Face-MagNet:
5G-envisioned Internet of vehicles: A deep reinforcement learning-based Magnifying feature maps to detect small faces,’’ in Proc. IEEE Winter
traffic control system,’’ IEEE Trans. Intell. Transp. Syst., early access, Conf. Appl. Comput. Vis. (WACV), Lake Tahoe, NV, USA, Mar. 2018,
Feb. 5, 2020, doi: 10.1109/TITS.2020.2970276. pp. 122–130.
[18] A. Anund, C. Fors, and C. Ahlstrom, ‘‘The severity of driver fatigue in [39] M. El-Arabawy, S. Zaki, and F. Harby, ‘‘Improved AdaBoost algo-
terms of line crossing: A pilot study comparing day- and night time driving rithm for face detection,’’ in Proc. Int. Conf. Image Process., Com-
in simulator,’’ Eur. Transp. Res. Rev., vol. 9, no. 2, pp. 1–7, Jun. 2017, doi: put. Vis., Pattern Recognit., Las Vegas, NV, USA, vol. 1, 2010,
10.1007/s12544-017-0248-6. pp. 353–358.
[19] R.-H. Zhang, Z.-C. He, H.-W. Wang, F. You, and K.-N. Li, ‘‘Study [40] D. P. Lestari, R. Kosasih, T. Handhika, Murni, I. Sari, and A. Fahrurozi,
on self-tuning tyre friction control for developing main-servo loop inte- ‘‘Fire hotspots detection system on CCTV videos using you only look once
grated chassis control system,’’ IEEE Access, vol. 5, pp. 6649–6660, (YOLO) method and tiny YOLO model for high buildings evacuation,’’
2017. in Proc. 2nd Int. Conf. Comput. Informat. Eng. (ICIE), Banyuwangi,
[20] Z. Ning, P. Dong, X. Wang, X. Hu, L. Guo, B. Hu, Y. Guo, T. Qiu, and Indonesia, Sep. 2019, pp. 87–92.
R. Y. Kwok, ‘‘Mobile edge computing enabled 5G health monitoring for [41] D. Goularas and S. Kamis, ‘‘Evaluation of deep learning techniques in
Internet of medical things: A decentralized game theoretic approach,’’ sentiment analysis from Twitter data,’’ in Proc. Int. Conf. Deep Learn.
IEEE J. Sel. Areas Commun., pp. 1–16, Mar. 2020. [Online]. Available: Mach. Learn. Emerg. Appl. (Deep-ML), İstanbul, Turkey, Aug. 2019,
https://www.researchgate.net/publication/339874488 pp. 12–17.

[42] S. Kido, Y. Hirano, and N. Hashimoto, "Detection and classification of lung abnormalities by use of convolutional neural network (CNN) and regions with CNN features (R-CNN)," in Proc. Int. Workshop Adv. Image Technol. (IWAIT), Chiang Mai, Thailand, Jan. 2018, pp. 1–4.
[43] Y. Tian, Y. Du, Q. Zhang, J. Cheng, and Z. Yang, "Depth estimation for advancing intelligent transport systems based on self-improving pyramid stereo network," IET Intell. Transp. Syst., vol. 14, no. 5, pp. 338–345, May 2020.
[44] Y. Tian, Q. Zhang, Z. Ren, F. Wu, P. Hao, and J. Hu, "Multi-scale dilated convolution network based depth estimation in intelligent transportation systems," IEEE Access, vol. 7, pp. 185179–185188, 2019.
[45] D. E. King, "Dlib-ml: A machine learning toolkit," J. Mach. Learn. Res., vol. 10, pp. 1755–1758, Jul. 2009.
[46] J. H. Friedman, "Greedy function approximation: A gradient boosting machine," Ann. Statist., vol. 29, pp. 1189–1232, Oct. 2001.
[47] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," Commun. ACM, vol. 60, no. 6, pp. 84–90, May 2017.
[48] Q. Zhang and S.-I. Kamata, "Improved color barycenter model and its separation for road sign detection," IEICE Trans. Inf. Syst., vol. E96.D, no. 12, pp. 2839–2849, Dec. 2013.
[49] X. Cao, Y. Wei, F. Wen, and J. Sun, "Face alignment by explicit shape regression," Int. J. Comput. Vis., vol. 107, no. 2, pp. 177–190, Apr. 2014.
[50] P. Dollár, P. Welinder, and P. Perona, "Cascaded pose regression," in Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., San Francisco, CA, USA, Jun. 2010, pp. 1078–1085.
[51] Q. Zhang and S.-I. Kamata, "A novel color descriptor for road-sign detection," IEICE Trans. Fundam. Electron., Commun. Comput. Sci., vol. E96-A, no. 5, pp. 971–979, May 2013.
[52] X. Kong, X. Liu, B. Jedari, M. Li, L. Wan, and F. Xia, "Mobile crowdsourcing in smart cities: Technologies, applications, and future challenges," IEEE Internet Things J., vol. 6, no. 5, pp. 8095–8113, Oct. 2019.
[53] D. Ma, X. Song, and P. Li, "Daily traffic flow forecasting through a contextual convolutional recurrent neural network modeling inter- and intra-day traffic patterns," IEEE Trans. Intell. Transp. Syst., early access, Feb. 24, 2020, doi: 10.1109/TITS.2020.2973279.
[54] D. Ma, J. Xiao, X. Song, X. Ma, and S. Jin, "A back-pressure-based model with fixed phase sequences for traffic signal optimization under oversaturated networks," IEEE Trans. Intell. Transp. Syst., early access, Apr. 29, 2020, doi: 10.1109/TITS.2020.2987917.
[55] Z. You, Y. Gao, J. Zhang, H. Zhang, M. Zhou, and C. Wu, "A study on driver fatigue recognition based on SVM method," in Proc. 4th Int. Conf. Transp. Inf. Saf. (ICTIS), Banff, AB, Canada, Aug. 2017, pp. 693–697.
[56] J.-J. Yan, H.-H. Kuo, Y.-F. Lin, and T.-L. Liao, "Real-time driver drowsiness detection system based on PERCLOS and grayscale image processing," in Proc. Int. Symp. Comput., Consum. Control (IS3C), Xi'an, China, Jul. 2016, pp. 243–246.
[57] L. Tychsen-Smith and L. Petersson, "Improving object localization with fitness NMS and bounded IoU loss," in Proc. 31st Meeting IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), Salt Lake City, UT, USA, Jun. 2018, pp. 6877–6885.
[58] G. Lei, X. Liang, Z. Xiao, and Y. Li, "Real-time driver fatigue detection based on morphology infrared features and deep learning," Infr. Laser Eng., vol. 47, no. 2, 2018, Art. no. 203009.
[59] L. Tychsen-Smith and L. Petersson, "Improving object localization with fitness NMS and bounded IoU loss," 2017, arXiv:1711.00164. [Online]. Available: https://arxiv.org/abs/1711.00164

KENING LI received the B.E. degree in communications and transportation from Henan Agricultural University, Henan, China, in 2004, the M.S. degree in vehicle operation engineering from Jilin University, Jilin, in 2007, and the Ph.D. degree in vehicle engineering from Shanghai Jiao Tong University, Shanghai, in 2014. He is currently a Lecturer with the School of Traffic and Environment, Shenzhen Institute of Information Technology, Shenzhen, China. His research interests include vehicle system dynamics and control, intelligent vehicle systems, and artificial intelligence.

YUNBO GONG was born in Tieling, Liaoning, in 1995. He received the B.E. degree from the South China University of Technology, Guangzhou, China, in 2018, where he is currently pursuing the master's degree in traffic information engineering and control. His research interests include intelligent vehicles, computer vision, and 3D laser radar.

ZILIANG REN (Member, IEEE) received the Ph.D. degree from the South China University of Technology, Guangzhou, China, in 2017. He is currently with the Guangdong Provincial Key Laboratory of Robotics and Intelligent System, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China, as a Postdoctoral Researcher and an Assistant Researcher. His current research interests include computer vision and machine learning.