
Received October 8, 2021, accepted October 22, 2021, date of publication October 27, 2021, date of current version November 4, 2021.
Digital Object Identifier 10.1109/ACCESS.2021.3123388

Fatigue Driving Detection Based on Deep Learning and Multi-Index Fusion

HUIJIE JIA, ZHONGJUN XIAO, AND PENG JI
School of Electrical Engineering and Automation, Qilu University of Technology, Shandong Academy of Sciences, Jinan 250353, China
Corresponding author: Zhongjun Xiao ([email protected])

This work was supported in part by the National Natural Science Foundation of China Youth Science Foundation Project under Grant 61903207.

ABSTRACT In order to reduce traffic accidents caused by fatigue driving, a fatigue driving detection algorithm based on deep learning and facial multi-index fusion is proposed, working from the driver's facial features. Because the scene during actual driving is complex and changeable, the algorithm first improves the multi-task cascaded convolutional neural network (MTCNN) so that it can quickly and accurately locate the face and detect the facial key points. From the facial key points, the driver's eye and mouth regions are determined. Second, these regions are input into the eye and mouth state recognition network (E-MSR Net) for state recognition. E-MSR Net is a depthwise separable convolutional neural network improved and optimized on the basis of MobileNetV2. Finally, three facial indexes, the eye closure rate (ECR), the mouth opening rate (MOR), and the head non-positive face rate (HNFR), are fused to judge the driver's fatigue state. The algorithm makes quick and accurate judgments in complex and changeable scenes, and it avoids failure when the eyes or mouth are occluded because the driver wears sunglasses or a mask. The proposed algorithm achieved an accuracy of 97.5% on a self-made data set, which demonstrates its feasibility.

INDEX TERMS Fatigue driving detection, improved MTCNN, E-MSR Net, facial multi-index fusion.

I. INTRODUCTION
Fatigue driving endangers road traffic safety and has become an important cause of traffic accidents. A 2011 World Health Organization survey found that 1.3 million people died each year from road traffic accidents and about 50 million were disabled. By 2020, traffic accidents were the fifth leading cause of death worldwide, killing about 2.4 million people each year. The American Automobile Association (AAA) Traffic Safety Foundation showed that nearly one-fifth of the deaths in traffic accidents in the United States each year, about 4.55 million people, were caused by fatigue driving.

How to effectively monitor and determine the fatigue driving state, so as to issue a fatigue warning, has become a hot topic in scientific research. At present, there are three main driving fatigue detection methods. The first is driver fatigue detection technology based on driving data [1], [2]. This technology detects the driver's fatigue state by monitoring the vehicle's driving trajectory in real time, the rotational speed of the steering wheel under a certain hand-held pressure [3], and driving monitoring devices that track lane-line deviation and other indirect driving data in real time. Fatigue detection based on driving data is non-invasive and does not affect normal driving behavior during monitoring. When monitoring shows that the driving trajectory of the vehicle has deviated, or that the pressure of the hands on the steering wheel has decreased, the system determines that the driver is currently fatigued, and the driver is awakened by vibration and voice. However, the indirect driving data of the vehicle are also affected by the driver's personal driving habits, driving skills, and other non-fatigue factors.

Wang et al. [4] combined the steering wheel angle information at different time points with the longitudinal and lateral acceleration of the vehicle, and used the random forest algorithm to predict the fatigue state of the driver. After comparing and analyzing the above indicators, they found that the lateral acceleration of the vehicle is the best indicator for detecting the driver's fatigue state. Mao and Du [5] used a driving simulator to obtain a variety of physical characteristics of the vehicle during driving, analyzed the

The associate editor coordinating the review of this manuscript and approving it for publication was Wei Liu.

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
147054 VOLUME 9, 2021
H. Jia et al.: Fatigue Driving Detection Based on Deep Learning

FIGURE 1. Fatigue detection algorithm framework.

normal state and abnormal state of the vehicle, and constructed a classifier according to the different characteristics.

The second is the driving fatigue detection method based on the driver's physiological parameters [6]–[9]. Signals from electrode patches placed at the forehead, heart, and muscles are received through special medical equipment and transmitted to the system for real-time analysis [10], [11]. A clinically defined threshold on the physiological parameter signal is used as the basis for fatigue identification [12]. Physiological parameters are mostly collected by special signal acquisition instruments. To guarantee the precision of the collected data and the correctness of the analysis, the medical equipment is relatively bulky and occupies driving space, so the driver cannot fully concentrate on driving the vehicle, which invisibly increases the risk of accidents. In addition, because one end of the data acquisition equipment must be attached to the skin, placed on the head, or placed on the trunk, it restricts the driver's behavior to some degree and increases the uncertainty of the driving process. Most importantly, owing to individual differences among drivers, the parameter thresholds for fatigue identification differ from driver to driver, which hinders fatigue detection: the system design must be targeted at the individual and cannot be universal.

Chui et al. [13] proposed an EEG signal processing method based on the SVM algorithm. This method reduces the amount of calculation and at the same time speeds up the operation, which benefits the speed of fatigue driving detection. Piazzi et al. found that when the driver's heartbeat frequency during fatigue drops below 20% of its normal-state value, the driver exhibits fatigue phenomena, so fatigue can be judged from the driver's heartbeat frequency.

The third is the fatigue detection method based on computer vision [14]–[16]. For fatigued drivers, some visual changes can easily be observed from their facial features, such as changes of the eyes, head, and face, including longer blinking time, slow eyelid movement, a smaller degree of eye opening (or even closed eyes), frequent nodding, yawning, a fixed gaze (narrowed line of sight), sluggish facial expression, and a drooping posture [17]. Computer vision is a natural and non-invasive method: the facial visual features that characterize the driver's fatigue level are extracted from images captured by a camera in front of the driver. Combined with machine vision, image processing, pattern recognition, and other related technologies, the driver's eye state, mouth state, and head motion state are analyzed to judge the driver's fatigue state [18], [19]. Compared with many traditional methods, fatigue detection based on computer vision has the advantages of being non-contact and non-interfering while achieving high detection accuracy.

Zhang and Su [20] designed a new yawning detection method for fatigue detection. They used a convolutional neural network to extract spatial image features and a long short-term memory network to analyze temporal features. This method can solve the problem of eye occlusion caused by sunglasses; however, under insufficient light at night, the algorithm misses detections and makes errors. Zhuang et al. [21] proposed an effective fatigue detection method based on eye status with pupil and iris segmentation. The method separates the pupil and iris of the human eye through a streamlined network, and then determines the eye state according to the characteristics of the pupil and iris. This method is less affected by changes in sunlight, but it is not very effective when sunglasses are worn. Zhang and Wang [22] proposed an algorithm that processes the images with image processing technology and then evaluates them with an SVM model; finally, the sequential floating forward selection algorithm is used to select the optimal parameters and establish the fatigue detection model. Li et al. [34] used the improved Yolov3-micro to extract facial features, and then


FIGURE 2. P-Net structure.

FIGURE 3. R-Net structure.

FIGURE 4. O-Net structure.

FIGURE 5. The SPP layer specific workflow.


judged the driver's eyes and mouth. This method combines a variety of deep learning algorithms to improve the accuracy of fatigue detection; however, the deep learning network model used is large, which is not conducive to porting to mobile terminals.

The existing detection methods suffer from low comfort, vulnerability to external factors such as sunlight, and low accuracy caused by relying on a single index. To solve these problems, this paper studies a driving fatigue detection algorithm based on deep learning and facial multi-index fusion, which mainly includes a face location and facial key points detection module, a state judgment module, and a fatigue judgment module.

FIGURE 6. The SE Net composition.

II. FATIGUE DETECTION ALGORITHM
This algorithm first collects images of the driver through a camera in front of the driver; the images are then passed to the improved MTCNN (multi-task cascaded convolutional neural network) to locate the face and detect the facial key points. Based on the detected key points, the regions of the eyes and mouth are determined, and they are input into the eye and mouth state recognition network (E-MSR Net) for state judgment. Finally, the fatigue state of the driver is judged by the facial multi-index fusion strategy. The overall framework of the algorithm in this paper is shown in Figure 1.

A. FACE LOCATION AND FACIAL KEY POINTS DETECTION BASED ON IMPROVED MTCNN
MTCNN contains three cascaded multi-task convolutional neural networks, namely the Proposal Network (P-Net), the Refine Network (R-Net), and the Output Network (O-Net) [23]. Each network has three learning tasks, namely face classification, bounding-box regression, and key point positioning; however, the focus of the three stages differs.

The network model in the first stage is called P-Net; its structure is shown in Figure 2. Its main function is to obtain candidate windows of the face region together with the bounding-box regression results. The obtained face windows are corrected by the bounding-box regression results, and non-maximum suppression (NMS) is then used to merge the overlapping windows.

The network model in the second stage is called R-Net. It filters out most of the non-face candidate windows through a more capable CNN, continues to correct the bounding-box regression results, and again uses NMS to merge overlapping windows. The network structure is shown in Figure 3.

The network model in the third stage is called O-Net. It takes the output of the second stage for further refinement and finds the five marker points on the face through a still more capable network. The network structure is shown in Figure 4.
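The overlapping-window merging that P-Net and R-Net both rely on is standard greedy non-maximum suppression. The following is a minimal generic sketch, not the paper's implementation; the corner-format boxes and the 0.5 IoU threshold are assumptions:

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy NMS over [x1, y1, x2, y2] boxes; returns kept indices."""
    order = np.argsort(scores)[::-1]  # process highest-confidence first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        # Intersection of the kept box with all remaining candidates.
        xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        areas = ((boxes[order[1:], 2] - boxes[order[1:], 0])
                 * (boxes[order[1:], 3] - boxes[order[1:], 1]))
        iou = inter / (area_i + areas - inter)
        # Keep only candidates that do not overlap the kept box too much.
        order = order[1:][iou <= iou_thresh]
    return keep
```

The cascade applies such a step after P-Net and again after R-Net, so only well-separated, high-confidence face windows reach O-Net.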


FIGURE 7. The bottleneck of the E-MSR Net.

In the process of fatigue detection, changes in human posture and scene light intensity often cause MTCNN detection to fail. In response to this phenomenon, MTCNN is improved in two ways: adding an SPP layer to the last layer of the O-Net improves the accuracy of the network in complex scenes, and adding the BN algorithm to MTCNN improves network performance.

1) ADDING THE SPATIAL PYRAMID POOLING (SPP) LAYER TO THE O-NET NETWORK STRUCTURE
Aiming at the problem of complex scene changes during driving, an SPP layer [24] is added after the last convolution layer of the O-Net; it outputs a fixed-length vector for feature maps of different sizes. As shown in Figure 5, the SPP layer of this algorithm adopts 1 × 1, 2 × 2, and 4 × 4 pyramid pooling at three scales. In the SPP layer, the feature map is divided into 21 parts, and a maximum pooling operation is performed on each part. Through the SPP layer, the feature map is transformed into a 21 × 128 matrix, which is sent to the fully connected layer and flattened into a one-dimensional vector. By adding the SPP layer, feature maps of any size can be converted into feature vectors of fixed size, which improves the recognition accuracy of the network model for features of different scenes and sizes, and reduces the amount of calculation.

2) ADDING THE BATCH NORMALIZATION (BN) ALGORITHM BEFORE THE ACTIVATION FUNCTION OF EACH MTCNN LAYER
Differences in the distribution of input values affect the training of the MTCNN. When the input feature values differ greatly, these differences propagate to the later layers as training proceeds and can also cause gradient explosion during back propagation.

The function of the BN algorithm [25] is to standardize the input values and reduce their differences to the same range. On the one hand, this improves the convergence of the gradient and accelerates model training. On the other hand, each layer faces inputs with as similar a feature distribution as possible, which reduces the uncertainty caused by distribution changes, reduces the impact on the later layers, and makes each layer relatively independent, alleviating the vanishing-gradient problem in training and thereby improving the performance of the network.

B. EYE AND MOUTH STATE RECOGNITION NETWORK (E-MSR NET)
After the images pass through the face location and facial key points detection module, the eye and mouth areas of the face are identified from the detected feature points and sent to the depthwise separable convolutional neural network E-MSR Net to judge the state of the driver's eyes and mouth. E-MSR Net is a lightweight network that adds the SE network to the inverted residual structure of the MobileNetV2 network [26] to improve accuracy, and replaces the sigmoid activation function of MobileNetV2 with the h-swish activation function to reduce the amount of calculation.

1) ADDING THE SQUEEZE-AND-EXCITATION (SE) NETWORKS
The SE network [27] mainly consists of squeeze and excitation operations, as shown in Figure 6. First, a global average pooling operation is performed on the input feature map. Next, two fully connected layers are attached to increase the nonlinear processing of the feature map and fit the complex correlation between channels.

The SE network is added into the bottleneck of the MobileNetV2 network, as shown in Figure 7. E-MSR Net can then automatically learn the importance of each feature channel and, according to this result, enhance useful features and suppress features that are not useful for the current task. The features are recalibrated so that effective weights are significant and ineffective weights are small, improving the accuracy of the network.

2) CHANGING THE ACTIVATION FUNCTION
As can be seen from its formula, the sigmoid activation function [28] used in the MobileNetV2 network is complex to compute and differentiate and consumes considerable computing resources. This algorithm replaces it with the h-swish function. The formulas are shown below.

sigmoid(x) = 1 / (1 + e^(-x))    (1)

h-swish(x) = x · ReLU6(x + 3) / 6    (2)
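Equations (1) and (2) can be checked numerically. The sketch below is a generic NumPy illustration, not code from the paper; it implements both activations, including the ReLU6 clipping that h-swish is built on:

```python
import numpy as np

def sigmoid(x):
    # Eq. (1): smooth, but requires an exponential per element.
    return 1.0 / (1.0 + np.exp(-x))

def relu6(x):
    # ReLU capped at 6: cheap and quantization-friendly.
    return np.minimum(np.maximum(0.0, x), 6.0)

def h_swish(x):
    # Eq. (2): x * ReLU6(x + 3) / 6, a piecewise approximation of swish.
    return x * relu6(x + 3.0) / 6.0
```

Note that for x ≥ 3 h-swish passes its input through unchanged, and for x ≤ −3 it outputs exactly 0, so most of the input range needs no transcendental evaluation.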

FIGURE 8. H-swish activation function diagram.

ReLU6(x) = min(max(0, x), 6)    (3)

The h-swish function has many advantages while maintaining accuracy. First, ReLU6 can be implemented in many hardware and software frameworks. Second, it avoids the loss of numerical accuracy during quantization. Finally, because the amount of calculation decreases, the running speed is accelerated. The function image is shown in Figure 8.

III. FATIGUE STATE JUDGMENT ALGORITHM BASED ON MULTI-INDEX FUSION
To solve the problem that a single evaluation index makes the algorithm inaccurate in the process of fatigue detection, this algorithm defines three evaluation indexes: the eye closure rate (ECR), the mouth opening rate (MOR), and the head non-positive face rate (HNFR). The three evaluation indexes are fused to determine whether the driver is tired.

A. EYE STATE EVALUATION INDEX
PERCLOS [29] is an internationally recognized fatigue judgment criterion, which refers to the proportion of time the eyes are closed within a certain period. The calculation formula is as follows.

PERCLOS = (number of closed-eye frames / total number of frames in the detection period) × 100%    (4)

The PERCLOS judgment criterion includes the P70, P80, and EM standards, which count a frame as eye closure when the eyelid covers more than 70%, 80%, and 50% of the pupil, respectively; the proportion of eye-closure time within a certain period is then counted. Among them, P80 is considered the standard most responsive to fatigue. The principle is shown in Figure 9.

FIGURE 9. PERCLOS schematic diagram under the P80 standard.

In this paper, the eye state evaluation index follows the P80 criterion of PERCLOS. Define fe as the ECR used to judge the driver's eye fatigue state. The calculation formula is as follows:

fe = te / Te × 100%    (5)

Among them, te represents the number of closed-eye frames during the detection time, and Te represents the total number of frames during the detection time.

B. MOUTH STATE EVALUATION INDEX
The mouth state evaluation index is similar to the eye state evaluation index. Define fm as the MOR used to judge the driver's mouth fatigue state. The calculation formula is as follows:

fm = tm / Tm × 100%    (6)

Among them, tm represents the number of open-mouth frames during the detection time, and Tm represents the total number of frames in the detection time.

C. HEAD POSTURE EVALUATION INDEX
The improved MTCNN detects the five 2D key points of the human face: the left eye, the right eye, the nose tip, and the left and right mouth corners. The POSIT algorithm converts the five 2D feature points into five 3D feature points in the world coordinate system by rotation, translation, and other methods. Next, the transformation parameters are estimated. Finally, the head posture parameters in the 2D plane are obtained: the yaw angle (Yaw), the pitch angle (Pitch), and the roll angle (Roll). According to the actual head posture when fatigued during driving, Pitch and Roll are selected as the evaluation indexes.

The head posture evaluation index is similar to the eye and mouth state evaluation indexes. Define fh as the HNFR used to judge the driver's head posture state. The calculation formula is as follows:

fh = th / Th × 100%    (7)

Among them, th represents the number of head-posture fatigue frames during the detection time, and Th represents the total number of frames in the detection time.

D. FACIAL MULTI-INDEX FUSION
In the actual driving environment, fatigued driving is a complex psychological and physiological state, and the detection results are easily interfered with by various environments. If only eye features or only mouth features are extracted to determine the driver's fatigue, the detection accuracy drops greatly when the driver wears sunglasses, a mask, or other facial occlusions.
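Each of fe (Eq. (5)), fm (Eq. (6)), and fh (Eq. (7)) is the same frame-count ratio applied to a different per-frame flag. A minimal sketch follows; the flag sequence below is hypothetical, not data from the paper:

```python
def index_ratio(flags):
    """Fraction of frames in the detection window whose flag is set.
    With eyes-closed flags this gives f_e, with mouth-open flags f_m,
    and with head-non-positive flags f_h."""
    return sum(flags) / len(flags)

# Hypothetical 10-frame window: 1 = eyes judged closed in that frame.
eye_closed = [1, 1, 0, 1, 1, 1, 0, 1, 0, 0]
f_e = index_ratio(eye_closed)  # 6 closed frames out of 10 -> 0.6
```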


TABLE 1. Hardware configuration table.

FIGURE 10. Face location and facial key points detection module dataset. (a) WIDER FACE dataset; (b) MTFL dataset.

FIGURE 11. Test results of face location and facial key points detection module. (a) Face location, and eyes and mouth area detection; (b) Left eye area extraction; (c) Right eye area extraction; (d) Mouth area extraction.

FIGURE 12. Eye and mouth state judgment module dataset.

Therefore, a fatigue detection method based on a single feature must be carried out under ideal conditions. This algorithm fuses the three evaluation indexes ECR, MOR, and HNFR to determine the driver's fatigue, in order to improve system performance and enhance system robustness.

According to many tests of single eye-closure time under different eye states, the eye state parameters were calculated [30]. In the normal state, a single eye closure usually lasts 0.12~0.15 seconds. In the fatigue state, a single eye closure lasts 0.5 seconds or longer, which is significantly longer than in the awake state. Therefore, the parameter threshold of ECR is set to 0.5; that is, when the eye-closure rate within the unit time reaches 0.5, the eyes are judged to be in a state of fatigue. A yawn usually lasts 3~5 seconds [31]. In this algorithm, the unit detection time is set to 20 seconds, and the yawning frequency within 20 seconds should not exceed twice. Therefore, this paper sets the parameter threshold of MOR to 0.3; that is, when the mouth-opening rate within the unit time reaches 0.3, the mouth is judged to be in a state of fatigue.

According to the PERCLOS judgment criterion, when the detected change of the Pitch and Roll angles of the head posture exceeds 20%, the head is judged to be in an incorrect state. Based on multiple tests of how the head posture changes when the driver is fatigued in actual driving, the parameter threshold of HNFR is set to 0.5; that is, when the proportion of improper head posture within the unit time reaches 0.5, the head posture is judged to indicate fatigue.

The parameter thresholds of the three evaluation indexes ECR, MOR, and HNFR are fused to determine whether the driver is tired. The specific decision rule is as follows:

normal: fe < 0.5 and fh < 0.5 and fm < 0.3
danger: fe ≥ 0.5 or fh ≥ 0.5 or fm ≥ 0.3    (8)

IV. EXPERIMENT AND RESULT ANALYSIS

A. FACE LOCATION AND KEY POINTS DETECTION MODEL TRAINING AND RESULT ANALYSIS
In this algorithm, the improved MTCNN uses the WIDER FACE database [32] as the training data set for face location. The WIDER FACE database is the mainstream face detection database, with 32203 face images and 393703 labeled faces; its scenes are complex and varied, so it can evaluate the performance of the algorithm from all directions and angles. The MTFL database is used as the training data for the facial feature points. The MTFL database contains 12995 face images, each annotated with 5 key points, and also provides information on gender, smiling, glasses, and head posture. Sample images from these data sets are shown in Figure 10.
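With the thresholds above, the fusion rule of Eq. (8) in Section III-D reduces to three comparisons. A minimal sketch follows; the function name and the sample values are illustrative, not the authors' implementation:

```python
def fatigue_decision(f_e, f_h, f_m,
                     ecr_thresh=0.5, hnfr_thresh=0.5, mor_thresh=0.3):
    """Eq. (8): report 'danger' as soon as any single index crosses
    its threshold, so one occluded index (e.g. sunglasses hiding the
    eyes) cannot mask fatigue visible in the other two."""
    if f_e >= ecr_thresh or f_h >= hnfr_thresh or f_m >= mor_thresh:
        return "danger"
    return "normal"
```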


TABLE 2. Algorithm accuracy comparison.

FIGURE 13. E-MSR Net model training curve. (a) MobileNetV2 training curve; (b) E-MSR Net training curve.

TABLE 3. Comparison of single evaluation index and multi-feature fusion evaluation index.

TABLE 4. Comparison of different fatigue detection algorithms.

The hardware configuration of the experimental computer is shown in Table 1. The experiments in this paper are based on Python 3.7 and TensorFlow 2.2 in the Windows 10 environment. The CPU is an Intel(R) Core(TM) i7-10870H, and an NVIDIA GeForce RTX 2060 GPU is used for model training. The experimental results are shown in Figure 11.

B. E-MSR NET MODEL TRAINING AND RESULT ANALYSIS
The self-built data set is used for E-MSR Net training. The data set contains a total of 21821 pictures of four types: open eyes, closed eyes, open mouth, and closed mouth, including 1500 pictures of open eyes, 1374 pictures of closed eyes, 9246 pictures of open mouth, and 9701 pictures of closed mouth. The data set is divided into a training set, a validation set, and a test set at a ratio of 7:1:2. Sample images from the data set are shown in Figure 12. The experimental environment is the same as that of the previous experiment.

The training process runs for 50 iterations; the learning rate is set to 0.01 for the first 20 iterations and to 0.001 for the last 20. The training process curves are shown in Figure 13. The algorithm in this paper is compared with the MobileNetV2 algorithm.
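The 7:1:2 train/validation/test split described above can be reproduced with a simple shuffled partition. This is a sketch under assumptions (the sample list and fixed seed are mine, not the authors' code):

```python
import random

def split_dataset(samples, ratios=(0.7, 0.1, 0.2), seed=0):
    """Shuffle the samples and partition them into train/val/test subsets."""
    items = list(samples)
    random.Random(seed).shuffle(items)  # deterministic shuffle for repeatability
    n = len(items)
    n_train = int(n * ratios[0])
    n_val = int(n * ratios[1])
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])
```

Fixing the seed keeps the three subsets disjoint and stable across runs, so reported accuracies are comparable between training configurations.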


FIGURE 14. Test results of fatigue detection algorithm in this paper. (a) In a state of turning left and eyes closed; (b) In a state of closed eyes; (c) In a state
of yawning; (d) In a state of bowed head and closed eyes.

It can be seen from the graph that the convergence speed of this algorithm is faster than that of MobileNetV2.

The test set is used to test the model trained by this algorithm and the model trained by MobileNetV2, respectively. The test results are shown in Table 2: the accuracy of the trained E-MSR Net model is higher than that of the MobileNetV2 model.

C. FATIGUE DETECTION EXPERIMENT SIMULATION
The fatigue detection experiment uses self-made fatigue detection videos as test data. The test data comprise fifty videos, each of which includes fatigue states such as closed eyes, yawning, and nodding. The detection effect is shown in Figure 14. The single evaluation index and the multi-feature fusion evaluation index are used to judge the fatigue state respectively, and the comparison results are shown in Table 3.

It can be seen that the algorithm in this paper can accurately and promptly judge the driver's fatigue state. Accuracy is used to measure the evaluation index of the algorithm; the formula is as follows:

accuracy = Nt / N    (9)

where Nt represents the number of fatigue driving states detected by the algorithm in this paper, and N represents the number of fatigue driving states that actually exist in the test videos. This algorithm is compared with other algorithms, and the results are shown in Table 4.

V. SUMMARY
The algorithm in this paper can handle driver fatigue detection in various complex environments. The improved MTCNN can accurately and quickly detect the face, eye, and mouth regions; E-MSR Net can accurately and quickly judge the state of the mouth and eyes; and the facial multi-index fusion algorithm can accurately judge the fatigue state of drivers. The experimental results show that this algorithm detects different fatigue states well.

REFERENCES
[1] M. H. Alkinani, W. Z. Khan, and Q. Arshad, "Detecting human driver inattentive and aggressive driving behavior using deep learning: Recent advances, requirements and open challenges," IEEE Access, vol. 8, pp. 105008–105030, 2020.
[2] J. Won Jeong, H. Kuk Kim, Y. Jin Lee, W. Wook Jung, F. Harashima, and M. Hyung Lee, "The implementation of the autonomous guided vehicle driving system for durability test," in Proc. IEEE Intell. Transp. Syst. (ITSC), 2000, pp. 101–106.
[3] R. Li, Y. V. Chen, and L. Zhang, "A method for fatigue detection based on driver's steering wheel grip," Int. J. Ind. Ergonom., vol. 82, Mar. 2021, Art. no. 103083.
[4] M. S. Wang, N. T. Jeong, K. S. Kim, S. B. Choi, S. M. Yang, S. H. You, J. H. Lee, and M. W. Suh, "Drowsy behavior detection based on driving information," Int. J. Automot. Technol., vol. 17, no. 1, pp. 165–173, Feb. 2016.


[5] M. Mao and L. Du, ‘‘Research on drive fatigue detection using wavelet transform,’’ in Proc. IEEE Int. Conf. Veh. Electron. Saf., Dec. 2007, pp. 1–4.
[6] X.-Q. Huo, W.-L. Zheng, and B.-L. Lu, ‘‘Driving fatigue detection with fusion of EEG and forehead EOG,’’ in Proc. Int. Joint Conf. Neural Netw. (IJCNN), Jul. 2016, pp. 897–904.
[7] R. P. Balandong, R. F. Ahmad, M. N. M. Saad, and A. S. Malik, ‘‘A review on EEG-based automatic sleepiness detection systems for driver,’’ IEEE Access, vol. 6, pp. 22908–22919, 2018.
[8] L.-W. Ko, W.-K. Lai, W.-G. Liang, C.-H. Chuang, S.-W. Lu, Y.-C. Lu, T.-Y. Hsiung, H.-H. Wu, and C.-T. Lin, ‘‘Single channel wireless EEG device for real-time fatigue level detection,’’ in Proc. Int. Joint Conf. Neural Netw. (IJCNN), Jul. 2015, pp. 1–5.
[9] M. S. Hossain, K. Huda, S. M. S. Rahman, and M. Ahmad, ‘‘Implementation of an EOG based security system by analyzing eye movement patterns,’’ in Proc. Int. Conf. Adv. Electr. Eng. (ICAEE), Dec. 2015, pp. 149–152.
[10] T. Kobayshi, S. Okada, M. Makikawa, N. Shiozawa, and M. Kosaka, ‘‘Development of wearable muscle fatigue detection system using capacitance coupling electrodes,’’ in Proc. 39th Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. (EMBC), Jul. 2017, pp. 833–836.
[11] W. He, ‘‘Application heart rate variability to driver fatigue detection of dangerous chemicals vehicles,’’ in Proc. 5th Int. Conf. Intell. Syst. Design Eng. Appl., Jun. 2014, pp. 218–221.
[12] R. Bhardwaj, S. Parameswaran, and V. Balasubramanian, ‘‘Comparison of driver fatigue trend on simulator and on-road driving based on EMG correlation,’’ in Proc. IEEE 13th Int. Conf. Ind. Inf. Syst. (ICIIS), Dec. 2018, pp. 94–97.
[13] K. T. Chui, K. F. Tsang, H. R. Chi, B. W. K. Ling, and C. K. Wu, ‘‘An accurate ECG-based transportation safety drowsiness detection scheme,’’ IEEE Trans. Ind. Informat., vol. 12, no. 4, pp. 1438–1452, Aug. 2016.
[14] Y.-C. Tsai, P.-W. Lai, P.-W. Huang, T.-M. Lin, and B.-F. Wu, ‘‘Vision-based instant measurement system for driver fatigue monitoring,’’ IEEE Access, vol. 8, pp. 67342–67353, 2020.
[15] B. K. Savas and Y. Becerikli, ‘‘Real time driver fatigue detection system based on multi-task ConNN,’’ IEEE Access, vol. 8, pp. 12491–12498, 2020.
[16] G. Soares, D. de Lima, and A. Miranda Neto, ‘‘A mobile application for driver’s drowsiness monitoring based on PERCLOS estimation,’’ IEEE Latin Amer. Trans., vol. 17, no. 2, pp. 193–202, Feb. 2019.
[17] D. Liu, C. Zhang, Q. Zhang, and Q. Kong, ‘‘Design and implementation of multimodal fatigue detection system combining eye and yawn information,’’ in Proc. IEEE 5th Int. Conf. Signal Image Process. (ICSIP), Oct. 2020, pp. 65–69.
[18] S. Ansari, F. Naghdy, H. Du, and Y. N. Pahnwar, ‘‘Driver mental fatigue detection based on head posture using new modified reLU-BiLSTM deep neural network,’’ IEEE Trans. Intell. Transp. Syst., early access, Aug. 4, 2021, doi: 10.1109/TITS.2021.3098309.
[19] E. Price, G. Moore, L. Galway, and M. Linden, ‘‘Towards mobile cognitive fatigue assessment as indicated by physical, social, environmental, and emotional factors,’’ IEEE Access, vol. 7, pp. 116465–116479, 2019.
[20] W. Zhang and J. Su, ‘‘Driver yawning detection based on long short term memory networks,’’ in Proc. IEEE Symp. Comput. Intell. (SSCI), Nov./Dec. 2017, pp. 1–5.
[21] Q. Zhuang, Z. Kehua, J. Wang, and Q. Chen, ‘‘Driver fatigue detection method based on eye states with pupil and iris segmentation,’’ IEEE Access, vol. 8, pp. 173440–173449, 2020.
[22] F. Zhang and F. Wang, ‘‘Exercise fatigue detection algorithm based on video image information extraction,’’ IEEE Access, vol. 8, pp. 199696–199709, 2020.
[23] K. Zhang, Z. Zhang, Z. Li, and Y. Qiao, ‘‘Joint face detection and alignment using multitask cascaded convolutional networks,’’ IEEE Signal Process. Lett., vol. 23, no. 10, pp. 1499–1503, Oct. 2016.
[24] K. He, X. Zhang, S. Ren, and J. Sun, ‘‘Spatial pyramid pooling in deep convolutional networks for visual recognition,’’ IEEE Trans. Pattern Anal. Mach. Intell., vol. 37, no. 9, pp. 1904–1916, Sep. 2015.
[25] S. Ioffe and C. Szegedy, ‘‘Batch normalization: Accelerating deep network training by reducing internal covariate shift,’’ in Proc. 32nd Int. Conf. Mach. Learn. (Proceedings of Machine Learning Research), vol. 37, F. Bach and D. Blei, Eds., Lille, France: PMLR, Jul. 2015, pp. 448–456.
[26] A. Howard, A. Zhmoginov, L.-C. Chen, M. Sandler, and M. Zhu, ‘‘Inverted residuals and linear bottlenecks: Mobile networks for classification, detection and segmentation,’’ in Proc. CVPR, 2018, pp. 4510–4520.
[27] J. Hu, L. Shen, and G. Sun, ‘‘Squeeze-and-excitation networks,’’ in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., Jun. 2018, pp. 7132–7141.
[28] C.-H. Tsai, Y.-T. Chih, W. H. Wong, and C.-Y. Lee, ‘‘A hardware-efficient sigmoid function with adjustable precision for a neural network system,’’ IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 62, no. 11, pp. 1073–1077, Nov. 2015.
[29] A. Acioglu and E. Ercelebi, ‘‘Real time eye detection algorithm for PERCLOS calculation,’’ in Proc. 24th Signal Process. Commun. Appl. Conf. (SIU), May 2016, pp. 1641–1644.
[30] T. Abe, T. Nonomura, Y. Komada, S. Asaoka, T. Sasai, A. Ueno, and Y. Inoue, ‘‘Detecting deteriorated vigilance using percentage of eyelid closure time during behavioral maintenance of wakefulness tests,’’ Int. J. Psychophysiol., vol. 82, no. 3, pp. 269–274, 2011.
[31] H. Yang, L. Liu, W. Min, X. Yang, and X. Xiong, ‘‘Driver yawning detection based on subtle facial action recognition,’’ IEEE Trans. Multimedia, vol. 23, pp. 572–583, 2021.
[32] S. Yang, P. Luo, C. C. Loy, and X. Tang, ‘‘WIDER FACE: A face detection benchmark,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2016, pp. 5525–5533.
[33] S. Dey, S. A. Chowdhury, S. Sultana, M. A. Hossain, M. Dey, and S. K. Das, ‘‘Real time driver fatigue detection based on facial behaviour along with machine learning approaches,’’ in Proc. IEEE Int. Conf. Signal Process., Inf., Commun. Syst. (SPICSCON), Nov. 2019, pp. 28–30.
[34] K. Li, Y. Gong, and Z. Ren, ‘‘A fatigue driving detection algorithm based on facial multi-feature fusion,’’ IEEE Access, vol. 8, pp. 101244–101259, 2020.

HUIJIE JIA is currently pursuing the master's degree with the Qilu University of Technology. His research interests include computer vision and deep learning.

ZHONGJUN XIAO received the Ph.D. degree from the Shaanxi University of Science and Technology, in 2011. In 2011, he worked with the School of Electrical Engineering and Automation, Qilu University of Technology. He is currently the Director of the Department of Automation, School of Electrical Engineering. His research interests include process industry automation, predictive control, and embedded system technology.

PENG JI received the Ph.D. degree in instrumental science and technology from Southeast University, Nanjing, China, in 2017. He is currently a Lecturer with the School of Electrical Engineering and Automation, Qilu University of Technology (Shandong Academy of Sciences), Jinan, China. His research interests include robotic vision, machine learning, and intelligent robot.

147062 VOLUME 9, 2021