Research On Driver Fatigue State Detection Method Based On Deep Learning
Abstract. Fatigue driving detection is essential for the safety of drivers and of society. At present, most fatigue detection methods are relatively traditional, rely on a single cue, and suffer from complex algorithms, low accuracy, and poor fault tolerance. This paper uses an improved Multi-task Cascaded Convolutional Network (MTCNN) to locate facial feature points precisely and combines it with a Res-SE-net model to extract the eye and mouth regions and classify their states. After the model is trained, driver fatigue is judged according to the PERCLOS rule combined with the OMR rule based on mouth opening and closing frequency. Experimental results show that this method can effectively extract fatigue features, achieves high detection accuracy, meets real-time requirements, and is highly robust in complex environments.
1. Introduction
According to statistics, 48% of traffic accidents in China are caused by driver fatigue, with direct economic losses of hundreds of thousands of dollars. With the development of expressways and the increase of vehicle speeds, fatigue detection for automobile drivers has become an important part of traffic safety research. The rise of deep neural networks has driven a leap in machine vision algorithms, providing a large number of excellent solutions for target detection and state classification problems. With the success of Convolutional Neural Networks (CNN) [1] in 2012, deep learning has become the most common approach in the imaging field. Taigman et al. [2] proposed the DeepFace model in 2014, improving the accuracy of face recognition. In 2016, Zhang et al. [3] proposed the Multi-task Cascaded Convolutional Network (MTCNN), which can detect multiple faces quickly and accurately. Zhang F et al. [4] proposed an eye-state recognition method based on blinking frequency for driver fatigue detection, but it relies on a single fatigue parameter and its accuracy is low. Methods that judge fatigue from facial features mainly rely on eye closure and yawning. Wierwille et al. [5] established the theoretical model of PERCLOS, the proportion of time that the eyes are closed over a unit period, which is now a very effective indicator of fatigue. Dai Shiqi et al. [6], based on HOG feature extraction and the ERT algorithm, first realize face detection and feature point
location and then use a convolutional neural network to identify the states of the eyes and mouth. Gu Wanghuan et al. [7] proposed a multi-scale pooling neural network model, MSP-Net. Geng Lei et al. [8] first used AdaBoost and KCF (kernelized correlation filter) for face detection and tracking, and then used a classic network structure to classify the states of the eyes and mouth. Zhao Xuepeng et al. [9] used a cascaded network to locate and detect the eye region to determine fatigue.
This paper uses an improved MTCNN detection model to detect the driver's face and locate its feature points, extracts the eye and mouth regions, feeds the eye and mouth data into Res-SE-net to judge the states of the eyes and mouth, and finally combines the PERCLOS rule and the Open Mouth Rate (OMR) rule to judge fatigue. The algorithm flow is shown in Figure 1.
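As a rough illustration of this flow (not the authors' implementation), the Python sketch below chains the stages for a single video frame. The mtcnn and res_se_net objects stand for the trained detection and classification models, and the cropping helper and patch sizes are assumptions made purely for this example.

```python
import numpy as np

def crop_around(frame, center, half_size=24):
    """Crop a square patch of the frame centred on a landmark point."""
    x, y = int(center[0]), int(center[1])
    h, w = frame.shape[:2]
    return frame[max(0, y - half_size):min(h, y + half_size),
                 max(0, x - half_size):min(w, x + half_size)]

def process_frame(frame, mtcnn, res_se_net):
    """Return (eyes_closed, mouth_open) for one frame, or None if no face is found."""
    # Step 1: MTCNN face detection and the five facial landmarks
    # (left eye, right eye, nose tip, left/right mouth corners).
    box, pts = mtcnn.detect(frame)
    if box is None:
        return None
    pts = np.asarray(pts, dtype=float)

    # Step 2: crop the eye and mouth regions around the landmarks.
    left_eye = crop_around(frame, pts[0])
    right_eye = crop_around(frame, pts[1])
    mouth = crop_around(frame, (pts[3] + pts[4]) / 2.0, half_size=32)

    # Step 3: Res-SE-net classifies the state of each region.
    eyes_closed = (res_se_net(left_eye) == "closed") and (res_se_net(right_eye) == "closed")
    mouth_open = res_se_net(mouth) == "open"
    return eyes_closed, mouth_open
```

Per-frame results of this kind are then accumulated over time for the PERCLOS and OMR rules described in Section 5.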
the convolution kernel continues to be updated, and its feature extraction capability continues to be strengthened.
5. Fatigue Judgment
When the driver becomes tired, a series of physiological reactions appear, such as prolonged eye closure and yawning. Based on these reactions and the detected eye and mouth states, the degree of driver fatigue is determined by calculating PERCLOS and the mouth opening rate (OMR).
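A minimal sketch of this judgment, assuming per-frame eye and mouth states are accumulated over a sliding window, is shown below. The window length and the two thresholds are illustrative placeholders rather than the values used in the paper.

```python
from collections import deque

class FatigueJudge:
    """Sliding-window fatigue judgment based on PERCLOS and OMR.

    Window size and thresholds are illustrative, not the paper's values.
    """

    def __init__(self, window=150, perclos_thresh=0.4, omr_thresh=0.5):
        self.eye_states = deque(maxlen=window)    # True = eyes closed in a frame
        self.mouth_states = deque(maxlen=window)  # True = mouth open in a frame
        self.perclos_thresh = perclos_thresh
        self.omr_thresh = omr_thresh

    def update(self, eyes_closed, mouth_open):
        """Record the classification result of the latest frame."""
        self.eye_states.append(eyes_closed)
        self.mouth_states.append(mouth_open)

    def is_fatigued(self):
        """Judge fatigue over the current window."""
        if not self.eye_states:
            return False
        # PERCLOS: fraction of frames in the window with the eyes closed.
        perclos = sum(self.eye_states) / len(self.eye_states)
        # OMR: fraction of frames in the window with the mouth open.
        omr = sum(self.mouth_states) / len(self.mouth_states)
        return perclos > self.perclos_thresh or omr > self.omr_thresh
```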
6.1. Data sets. The face detection model is trained and evaluated on the benchmark face data set WIDER FACE, which contains 32203 pictures and 159424 faces. When training the model, the data set is divided into three subsets according to the difficulty of image recognition, of which 60% is used as the test set, 30% for model training, and 10% as the validation set. The eye and mouth data are collected from the ZJU blinking video data set and the YawDD fatigue driving video data set, which provide a large number of closed-eye and open-mouth samples. For further verification experiments, data from 15 volunteers were collected: a total of 17532 eye samples, including 8562 open-eye samples and 8970 closed-eye samples, and 19431 mouth samples, including 10356 open-mouth samples and 9075 closed-mouth samples.
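Purely for illustration, a split along the stated proportions could look like the sketch below; it ignores the grouping by recognition difficulty that the original split is based on.

```python
import random

def split_dataset(items, seed=0):
    """Split samples into test/train/validation subsets in the stated
    60% / 30% / 10% proportions (illustrative only; the original split
    also groups images by recognition difficulty)."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n = len(items)
    n_test, n_train = int(0.6 * n), int(0.3 * n)
    test = items[:n_test]
    train = items[n_test:n_test + n_train]
    val = items[n_test + n_train:]
    return train, val, test
```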
6.2.1. Eye and mouth status detection. In order to test the detection and recognition performance of the proposed method under actual conditions, the continuous video image stream captured by a camera with a resolution of 640×480 pixels is used as the test object. First, MTCNN is used for face detection and feature point extraction on each frame of the video; then the Res-SE-net model is used to judge the states of the eyes and mouth; finally, the PERCLOS algorithm and the OMR algorithm are combined to detect the fatigue state. This paper compares Res-net with an embedded SE-net against the Alex-Net proposed in [11]. Res-SE-net and Alex-Net are trained separately; after training the eye and mouth state classification models, the classification accuracy and classification time overhead of the two network structures are tested. The results are shown in Tables 1 and 2.
Table 1. Test results of the eye and mouth state classification models
Category | Number of samples | Res-SE-net average accuracy of positive and negative samples (%) | Alex-Net average accuracy of positive and negative samples (%)
Left eye | 1500 | 95.1 | 93.8
Right eye | 1500 | 95.7 | 93.6
Mouth | 1500 | 96.8 | 94.1
(Table continued from the previous page) AdaBoost+Cascade regression+SR-Net [8]; MTCNN+Res-SE-net: 96.5, 8.743
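The Res-SE-net compared in these tables embeds a squeeze-and-excitation (SE) module [13] into the residual blocks of a Res-net [12]. A minimal PyTorch-style sketch of one such block is given below; the channel count and reduction ratio are assumptions for illustration, not the paper's exact configuration.

```python
import torch.nn as nn

class SEResidualBlock(nn.Module):
    """Residual block with an embedded squeeze-and-excitation (SE) module.

    Illustrative sketch only; channel counts and reduction ratio are assumed.
    """

    def __init__(self, channels, reduction=16):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)
        # SE branch: squeeze (global average pooling) then excitation (two 1x1 convs).
        self.se = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        out = out * self.se(out)   # channel-wise reweighting by the SE branch
        return self.relu(out + x)  # residual (shortcut) connection
```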
6.2.2. Fatigue status detection. In order to verify the feasibility of the algorithm, this paper randomly selects video data of 5 female and 5 male drivers from the YawDD data set, plus 1 self-made video from the research group, counts the actual number of fatigue events in these 11 videos (752 s in total), and counts the number of fatigue events recognized by the algorithm. Through comparison experiments with traditional fatigue detection methods, the recall rate and precision rate of the algorithm are examined. The experimental data are shown in Table 4. The precision rate is p = (N − Nt) / (N − Nt + Ne) and the recall rate is q = (N − Nt) / N, where N represents the number of real fatigue events, Nt the number of missed detections, and Ne the number of false detections. The experimental results show that the precision and recall of the algorithm in this paper are better than those of the traditional algorithms. False and missed detections mainly occur in low light; under normal light, the recall rate and precision rate are higher.
Table 4. Experimental results of each algorithm
Algorithm | Fatigue events /times | False detections /times | Missed detections /times | Precision (%) | Recall (%)
HOG+ERT+CNN [6] | 48 | 7 | 5 | 86 | 89.583
MTCNN+MSP-Net [7] | 48 | 4 | 7 | 91.111 | 85.417
AdaBoost+Cascade regression+SR-Net [8] | 48 | 6 | 4 | 88 | 91.667
MTCNN+Res-SE-net | 48 | 2 | 1 | 95.918 | 97.917
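As a quick numerical check (not part of the original paper), the precision and recall figures in Table 4 can be reproduced directly from the formulas above and the counted detections:

```python
def precision_recall(n_real, n_missed, n_false):
    """p = (N - Nt) / (N - Nt + Ne), q = (N - Nt) / N, as defined above."""
    true_detections = n_real - n_missed
    p = true_detections / (true_detections + n_false)
    q = true_detections / n_real
    return p, q

# Last row of Table 4: N = 48, Ne = 2 false detections, Nt = 1 missed detection.
p, q = precision_recall(48, n_missed=1, n_false=2)
print(f"precision = {p:.3%}, recall = {q:.3%}")  # about 95.918% and 97.917%
```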
7. Conclusion
This paper first detects the driver's face through MTCNN and extracts the key regions of the eyes and mouth; secondly, the eye and mouth images are fed into a Res-Net fused with SE-net for state classification, and the algorithm is accelerated and optimized so that training speed improves while accuracy is maintained. Finally, the fatigue state is jointly judged by PERCLOS and OMR. Experiments show that the algorithm proposed in this paper has a high detection accuracy, achieves real-time detection, and has good robustness.
References
[1] Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural
networks. Communications of the ACM, 2017, 60(6): 84-90.
[2] Taigman Y, Yang Ming, Ranzato M, et al. DeepFace: closing the gap to human-level
performance in face verification //Proc of IEEE Conference on Computer Vision and Pattern
Recognition. Washington DC:IEEE Computer Society, 2014: 1701-1708.
[3] Zhang Kaipeng, Zhang Zhanpeng, Li Zhifeng, et al. Joint face detection and alignment using
multitask cascaded convolutional networks. IEEE Signal Processing Letters, 2016, 23(10):
1499-1503.
[4] Zhang F, Su JJ, Geng L, et al. Driver fatigue detection based on eye state recognition.
International Conference on Machine Vision and Information Technology. Singapore. 2017.
105-110.
[5] Wierwille W W, Ellsworth L A. Evaluation of driver drowsiness by trained raters. Accident
Analysis & Prevention, 1994, 26(5): 571-581.
[6] Dai Shiqi, Zeng Zhiyong. Fatigue driving monitoring based on BP neural network. Computer
Systems & Applications, 2018, 27 (7): 113-120.
[7] Gu Wanghuan, Zhu Yu, Chen Xudong, et al. Driver’s fatigue detection system based on
multi-scale pooling convolutional neural networks. Application Research of Computers,
2019, 36 (11): 3471-3475
[8] Geng Lei, Yuan Fei, Xiao Zhitao, et al. Driver fatigue detection method based on facial
behavior analysis. Computer Engineering, 2018, 44(1): 274-279.
[9] Zhao Xuepeng, Meng Chunning, Feng Mingkui, et al. Fatigue detection based on cascade
convolutional neural network. Journal of Optoelectronics Laser, 2017, 28(5): 497-502.
[10] Shi Ruipeng, Qian Yi, Jiang Danni. Fatigue driving detection method based on convolutional
neural network. Application Research of Computers. https://doi.org/10.19734/j.issn.1001-3695.2019.07.0313
[11] Krizhevsky A, Sutskever I, Hinton G E. Imagenet classification with deep convolutional neural
networks // Advances in Neural Information Processing Systems. New York: Curran
Associates, 2012: 1097-1105.
[12] He Kaiming, Zhang Xiangyu, Ren Shaoqing, et al. Deep residual learning for image
Recognition // Proc of the IEEE conference on Computer Vision and Pattern Recognition.
Piscataway: IEEE Press, 2016: 770-778.
[13] Hu Jie, Shen Li, Sun Gang. Squeeze-and-excitation networks // Proc of the IEEE conference on
Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 7132-7141.