Neuromorphic Driver Monitoring Systems: A Proof-of-Concept for Yawn Detection and Seatbelt State Detection Using an Event Camera
Received 18 July 2023, accepted 23 August 2023, date of publication 5 September 2023, date of current version 11 September 2023.
Digital Object Identifier 10.1109/ACCESS.2023.3312190
ABSTRACT Driver monitoring systems (DMS) are a key component of vehicular safety and essential for
the transition from semi-autonomous to fully autonomous driving. Neuromorphic vision systems, based on
event camera technology, provide advanced sensing in motion analysis tasks. In particular, the behaviours
of drivers’ eyes have been studied for the detection of drowsiness and distraction. This research explores the
potential to extend neuromorphic sensing techniques to analyse the entire facial region, detecting yawning
behaviours that give a complementary indicator of drowsiness. A second proof of concept for the use of event
cameras to detect the fastening or unfastening of a seatbelt is also developed. Synthetic training datasets are
derived from RGB and Near-Infrared (NIR) video from both private and public datasets using a video-to-
event converter and used to train, validate, and test a convolutional neural network (CNN) with a self-attention
module and a recurrent head for both yawning and seatbelt tasks. For yawn detection, respective F1-scores
of 95.3% and 90.4% were achieved on synthetic events from our test set and the ‘‘YawDD’’ dataset. For
seatbelt fastness detection, 100% accuracy was achieved on unseen test sets of both synthetic and real events.
These results demonstrate the feasibility of adding yawn detection and seatbelt fastness detection components
to neuromorphic DMS.
INDEX TERMS Driver monitoring, drowsiness detection, event camera, computer vision, CNN, LSTM,
neuromorphic sensing, seatbelt, yawn.
beyond the capabilities of the 30-60 FPS cameras typically found in current DMS.
The introduction of neuromorphic vision sensors promises a new era for DMS by addressing a number of the hardware limitations of conventional RGB and near-infrared (NIR) systems, including low frame rate, power consumption, and low-light performance. The neuromorphic vision sensors used in event cameras are designed to mimic the visual-processing abilities of living organisms by gathering only the relevant data from an observed scene. Instead of using a conventional shutter-based technique to capture an image, they report an event any time a pixel in the sensor detects a change in brightness above a certain threshold. Each event is defined by four parameters: the timestamp, the x and y coordinates of the pixel that reported the event, and the polarity, which indicates whether an increase or decrease in brightness caused the event. As events are typically only generated by motion or changes in lighting, event cameras are extremely useful in motion analysis tasks.
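To make this format concrete, the following sketch shows one plausible in-memory representation of an event stream; the field names and NumPy layout are our own illustrative choices, not the native format of any particular sensor SDK.

    import numpy as np

    # One record per event: microsecond timestamp, pixel coordinates,
    # and polarity (+1 for a brightness increase, -1 for a decrease).
    event_dtype = np.dtype([
        ("t", np.int64),   # timestamp in microseconds
        ("x", np.uint16),  # pixel column
        ("y", np.uint16),  # pixel row
        ("p", np.int8),    # polarity: +1 or -1
    ])

    # A recording is then simply a time-ordered array of such events.
    events = np.array([(0, 120, 64, 1), (17, 121, 64, -1)], dtype=event_dtype)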
These modifications also enable event cameras to offer a wider dynamic range, higher temporal resolution, and lower power consumption than conventional cameras. Events are recorded with an accuracy of one microsecond and can provide equivalent frame rates exceeding 10,000 FPS [10]. These properties, and the parameters that can be modified to control the output event streams [11] for operation in various lighting conditions, make event cameras highly suited to the various requirements of DMS. This has already been demonstrated by Ryan et al. [12] with an event-based DMS capable of real-time face and eye tracking, and blink detection as an indicator of drowsiness. This could be combined with other symptoms of tiredness, such as yawns, for more accurate predictions of driver exhaustion levels.
Drowsiness detection is critical when considering driver safety; however, there are few measures as simple and effective as the seatbelt. The risk of injury to a belted passenger is 65% lower than that of an unbelted passenger [13], and in the United States, seatbelt use was shown to reduce mortality by 72% [14]. Existing seatbelt alert systems that rely on under-seat pressure sensors are easily spoofed and provide no assurance that the seatbelt is correctly fastened. This makes seatbelt fastness detection another desirable feature of camera-based DMS. Systems that can recognise seatbelt use, even from surveillance footage outside the car, have been made possible with deep learning approaches [15]. These techniques typically use RGB or NIR frames, often with some form of edge detection pre-processing [16]; however, a correctly calibrated event camera can similarly isolate edges and other scene elements without additional processing [11]. DMS that already utilise event cameras could therefore incorporate seatbelt fastness detection with no added hardware costs.
This research expands on our previous work developing a proof-of-concept event-based yawn detection system [17] and combines it with a seatbelt fastness detection algorithm. Large datasets of synthetic events were simulated for developing these algorithms, and a set of real events was collected for testing in addition to publicly available data. The network architecture designed for both tasks combines a CNN backbone with a self-attention module and a recurrent head. Highly accurate models with very low inference times were achieved, allowing real-time operation of both yawn detection and seatbelt fastness detection.
The remainder of this paper is organised as follows: Section II examines related research in the spaces of yawn detection, seatbelt detection, and event-based DMS. In Section III we outline our network architecture, followed by the datasets, event processing, and training details for both the yawn detection and seatbelt state detection tasks. In Section IV we present our results and compare them to others in the literature. Section V contains our final conclusions of the work and its implications.

II. RELATED WORK
In this section we discuss the current literature related to the two safety features developed in this work. Yawn detection is a key indicator of driver drowsiness, and the detection of the seatbelt is an important component of passenger safety. By implementing and validating these two safety features, this work demonstrates the general feasibility of replacing a conventional RGB or NIR based DMS that employs conventional computer vision algorithms with a fully neuromorphic DMS.

A. YAWN DETECTION
Driver drowsiness is a critical factor in road accidents, and various studies have explored yawning as a key indicator for detecting drowsiness. Abtahi et al. propose a real-time system using face and mouth detection for accurate yawning measurement and drowsiness detection [18]. Omidyeganeh et al. present a computer vision-based system that significantly improves yawning detection rates by using a modified implementation of the Viola-Jones algorithm and backprojection theory [19]. Knapik and Cyganek introduce a novel approach utilising thermal imaging for driver fatigue recognition based on yawning, demonstrating high efficacy in both laboratory and real car environments [20]. Yang et al. propose a subtle facial action recognition method for yawning detection, utilising a 3D deep learning network and a keyframe selection algorithm to distinguish yawning from similar facial actions [21]. Liu et al. design a multimodal fatigue detection system that combines eye and yawn information, achieving a high accuracy rate of up to 95% in detecting drowsiness [22]. Kumari et al. develop a real-time drowsiness and yawn detection system using Python and the Dlib model, based on eye closure and yawn frequency, to minimise fatigue-related vehicle accidents [1]. Dehankar et al. propose a non-invasive driver drowsiness and yawning detection system using computer vision techniques and a Raspberry Pi microcontroller, achieving rapid fatigue detection within a few seconds [23].
Alshaqaqi et al. introduce a driver drowsiness detection system that computes the eye aspect ratio and lip distance to determine drowsiness and yawning, aiming to reduce accidents caused by driver fatigue [24]. Melvin et al. propose a novel approach based on facial motion identification using convolutional neural networks, addressing challenges in accurate yawning recognition in real-world driving conditions [25]. These studies collectively contribute to the development of effective driver drowsiness detection systems by leveraging yawning as a prominent indicator, aiming to enhance transportation safety and mitigate accidents caused by drowsy driving.
To the best of the authors' knowledge, there is only one prior study that focuses directly on event-based driver yawn detection [17]. This work, in part, serves as an extension of that study, aiming to further explore the potential of neuromorphic sensing techniques for yawn detection in driver drowsiness monitoring. Our research utilises a neuromorphic vision system, leveraging event camera technology, to analyse the entire facial region and capture yawning behaviours. This provides a complementary indicator of tiredness that can enhance drowsiness detection. A dataset comprising 952 video clips and corresponding neuromorphic image frames is constructed and used for training and testing a CNN with self-attention and a recurrent head.
Event-based yawn detection offers several advantages, including the ability to capture the micro-facial movements that indicate the onset of yawning. By focusing on specific yawn events, rather than continuous monitoring, this approach reduces computational requirements and enhances the accuracy of detection. Additionally, event-based yawn detection enables the identification of subtle variations in yawning patterns, allowing for a more refined analysis of driver drowsiness levels. This approach holds great potential for improving the effectiveness and efficiency of driver drowsiness monitoring systems, ultimately contributing to enhanced road safety.

B. SEATBELT DETECTION
Seatbelt detection is a crucial task in the automotive industry to ensure driver and occupant safety. Current technology primarily focuses on buckling detection, but proper seatbelt routing detection, to ensure the seatbelt is safely routed across the body to protect the wearer, remains a challenge. Baltaxe et al. [26] addressed the problem of marker-less vision-based detection of improper seatbelt routing. They trained deep neural networks using a large database of images and achieved high accuracy in classifying seatbelt routing scenarios. This work contributes to improving automotive safety by reducing injuries caused by improperly routed seatbelts. Chun et al. [27] proposed NADS-Net, a light architecture for driver and seatbelt detection using convolutional neural networks. Their architecture, based on the feature pyramid network backbone, showed optimal performance for driver/passenger state detection tasks.
The authors in [16] presented a classification model for driver seatbelt status detection based on image analysis from a vehicle's in-cabin camera. They utilised a YOLO neural network and a two-step approach to detect the main part of the belt and its corner. The model achieved accurate classification of belt fastness, including cases where the belt is fastened behind the human body. Naik et al. [28] proposed a technique using convolutional neural networks (CNN) to detect the driver's seatbelt usage. Their ConvNet achieved higher accuracy compared to other classification algorithms and demonstrated the potential for reducing accidents caused by non-compliance with seatbelt usage.
The authors in [29] focused on the automatic vertical height adjustment of incorrectly fastened seatbelts using deep learning. They evaluated three CNN architectures and found that DenseNet121 achieved the highest classification accuracy. Their proposed system provides a solution for ensuring correct seatbelt positioning, thereby enhancing driver and passenger safety in fleet vehicles. Hosseini and Fathi [15] proposed a deep learning-based system for detecting vehicle occupancy and the driver's seatbelt status. Their method employed a combination of pre-trained ResNet34 and power mean transformation layers, achieving high accuracy in detecting occupants and seatbelt violations. The proposed system demonstrates promising performance compared to state-of-the-art methods.
Madake et al. [30] addressed seatbelt detection for assisted driving scenarios. They proposed a real-time system using a combination of FAST keypoint detection, the BRIEF method, and decision trees. Their algorithm showed high classification accuracy while considering practical constraints such as dynamic environments, illumination variations, and low-quality images. The authors in [31] presented an efficient and lightweight model for seatbelt detection on mobile devices. They pruned the SSD MobileNet V2 model and utilised the LSD linear segment detection multipoint fitting algorithm to enhance detection performance. Their model outperformed existing methods, demonstrating its practicality for mobile-based seatbelt detection. Upadhyay et al. [32] proposed a real-time seatbelt detection system using the YOLO deep learning model. They emphasised the importance of monitoring seatbelt fastening in automobiles and addressed the limitations of existing algorithms. Their YOLO-based model achieved accurate seatbelt detection, contributing to automotive safety by ensuring proper seatbelt usage.
Although there are many prior research works relating to seatbelt detection, we believe that this work is the first to explore the potential of event cameras to monitor and verify seatbelt state and fastening activity. More specifically, we are interested in the potential for neuromorphic sensing to better evaluate the correct completion of the fastening/unfastening process. Due to potential differences in how event camera features can be leveraged for the prediction of fastening/unfastening actions versus a stationary fastened/unfastened seatbelt, this paper investigates them separately.
III. METHODOLOGY
In this section we present the details of our network design,
our collection of video datasets and the subsequent generation
of synthetic events, followed by the tailored preprocessing of
our event data for our different tasks, and finally the training
details of our various models. Fig. 1 gives an overview of
this section’s structure. All of the data used in this paper
was collected with informed consent and in compliance with
ethical guidelines.
A. NETWORK ARCHITECTURE
The possible manifestations of yawns are frequently oversim-
plified in yawn-detection literature, where mouth openness
is often assumed to be the only relevant feature. This is
unreliable when assessed over individual frames or short time
windows, as there is a risk of false positive predictions when
the mouth is open for speech or laughter. An additional flaw,
which is extremely challenging to solve in these systems,
is not handling the common case where a person reflexively
covers their mouth with their hand when yawning. Some
approaches also monitor the openness of the eyes, but
there is little consideration of other possible cues that often
accompany a yawn, such as the hand over the mouth or large stretches of the upper body and arms. For this reason, our proposed yawn detector does not use facial landmarks or
other deliberately programmed features to make a prediction. Instead, we rely on CNN components to learn the relevant features from the full input images, with a recurrent structure that can track how these features change over time in a yawn.
Similar principles can be applied when designing a network to predict seatbelt state. When viewing an individual frame from a video of someone buckling their seatbelt, there is no information on the direction of motion or previous states, and so it can easily be confused with an unbuckling action, whereas a sequence of frames makes it much easier to identify. Additionally, a fastened seatbelt does not typically undergo a lot of motion when the wearer is sitting still. For event cameras this can result in moments with very little information on the seatbelt. By extending the input sequence, we provide more time to gather information on the seatbelt for a more reliable prediction.

FIGURE 2. Our proposed network for yawn and seatbelt detection.

Fig. 2 gives a high-level overview of the model architecture designed for this paper. The MobileNetV2 network is used for feature extraction on the input frames. In their paper, Sandler et al. [33] demonstrate the impressive performance of MobileNetV2 as a feature extractor with an efficient, lightweight architecture. The model we used was pretrained on the ImageNet dataset [34]. After this initial feature extraction, batch normalisation and channel reduction by 2D convolution are applied to prepare the features for a self-attention module. Recent years have seen self-attention introduced to many CNN tasks for its ability to contextualise and apply a weighting to input features, at only a small computational cost. The self-attention module in our proposed network is implemented according to [35]. Fig. 3 gives the expanded diagram of this module.
When the attended feature maps are generated for every frame of the input sequence, they are stacked and passed to the recurrent head.
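As a concrete illustration of the pipeline described above, the following PyTorch sketch combines a pretrained MobileNetV2 backbone [33], [34], batch normalisation with a channel-reducing 1 x 1 convolution, a SAGAN-style self-attention module [35], and an LSTM head [36]. The layer widths, the pooling step, and the single-layer LSTM are illustrative assumptions, not the exact configuration trained in this work.

    import torch
    import torch.nn as nn
    import torchvision

    class SelfAttention2d(nn.Module):
        # Self-attention over spatial positions, following SAGAN [35].
        def __init__(self, ch):
            super().__init__()
            self.q = nn.Conv2d(ch, ch // 8, 1)
            self.k = nn.Conv2d(ch, ch // 8, 1)
            self.v = nn.Conv2d(ch, ch, 1)
            self.gamma = nn.Parameter(torch.zeros(1))  # learned residual weight

        def forward(self, x):
            b, c, h, w = x.shape
            q = self.q(x).flatten(2).transpose(1, 2)   # B x HW x C//8
            k = self.k(x).flatten(2)                   # B x C//8 x HW
            attn = torch.softmax(q @ k, dim=-1)        # B x HW x HW
            v = self.v(x).flatten(2)                   # B x C x HW
            out = (v @ attn.transpose(1, 2)).view(b, c, h, w)
            return self.gamma * out + x

    class EventSequenceClassifier(nn.Module):
        def __init__(self, n_classes=2, feat_ch=128, hidden=128):
            super().__init__()
            backbone = torchvision.models.mobilenet_v2(weights="IMAGENET1K_V1")
            self.features = backbone.features          # 1280-channel output
            self.reduce = nn.Sequential(
                nn.BatchNorm2d(1280),
                nn.Conv2d(1280, feat_ch, 1),           # channel reduction
            )
            self.attn = SelfAttention2d(feat_ch)
            self.pool = nn.AdaptiveAvgPool2d(1)
            self.rnn = nn.LSTM(feat_ch, hidden, batch_first=True)
            self.head = nn.Linear(hidden, n_classes)

        def forward(self, x):                          # x: B x T x 3 x 256 x 256
            b, t = x.shape[:2]
            f = self.features(x.flatten(0, 1))         # fold time into batch
            f = self.pool(self.attn(self.reduce(f))).flatten(1)
            f = f.view(b, t, -1)                       # stack per-frame features
            out, _ = self.rnn(f)                       # recurrent head
            return self.head(out[:, -1])               # classify the sequence

A 100-frame yawn sequence at 256 x 256 would then enter as a tensor of shape (batch, 100, 3, 256, 256); replicating single-channel event frames across the three input channels to reuse the pretrained RGB stem is likewise an assumption on our part.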
FIGURE 4. Sample event frames from a yawn sequence where the mouth is always visible.
FIGURE 5. Sample event frames from yawn sequences where the mouth is covered by a hand while yawning.
of spatial information in each frame, at the loss of much of this temporal information.
We hypothesise that the fixed duration method is more applicable for yawn detection. This yields frames at a fixed rate, much like a conventional camera's output, and the temporal information carried in a sequence of these frames can be useful when distinguishing yawns from other actions such as speech, due to differences in the rate of mouth motion. The choice of the event frame duration should be informed by the requirements of the underlying task. Accumulating events over a long period risks an aliasing effect, where speech frames could appear as one long mouth-open sequence if insufficiently sampled. On the other hand, using too short a period can yield many frames with low spatial information. For our final yawn dataset, each frame is generated by accumulating events over a duration of 0.1s, resulting in frame sequences of 100 frames at 10FPS. This reduction from the 30FPS of the source data has three primary justifications: (1) A higher frame frequency is unnecessary to distinguish a yawn from speech. (2) With fewer than 300 frames in many RGB sequences, accumulating an equal or greater number of event frames would require an additional interpolation step; otherwise a freezing effect occurs in the event videos due to several frames showing the same motion. (3) A reduction from 300 to 100 frames for each sample carries a significant speedup in network training. The event frames' pixel values are clipped to ±10 and then normalised between 0 and 255.

TABLE 1. Distribution of our event yawn dataset partitions.

The 37 subjects in our simulated event yawn dataset were split into three sets for training, validation, and testing. The breakdown of each set is shown in Table 1. There is no overlap of subjects between the three sets. Sample event frames from two yawn sequences are shown in Fig. 4 and Fig. 5. The former has the mouth fully visible throughout, but the latter shows the mouth covered by the subject's hand.
The YawDD dash videos were converted from 30FPS RGB video to 10FPS event video following the same process as our custom yawn dataset. The start and stop frames of the yawns were annotated and 100-frame sequences were extracted with the yawn frames centered. Non-yawn sequences were also saved from the frames between yawns. This totaled 12,300 synthetic event frames, containing 78 yawn sequences and 45 non-yawn sequences.
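The fixed-duration accumulation and normalisation described above can be sketched as follows, assuming the structured event array from the earlier sketch; this is an illustration, not a reproduction of our exact conversion pipeline.

    import numpy as np

    def events_to_frames(events, width, height, dt_us=100_000, clip=10):
        # Bin events into fixed 0.1 s windows (10 FPS), summing polarities
        # per pixel, then clip to +/-10 and rescale to the range [0, 255].
        t0 = events["t"][0]
        n_frames = int((events["t"][-1] - t0) // dt_us) + 1
        frames = np.zeros((n_frames, height, width), dtype=np.float32)
        idx = ((events["t"] - t0) // dt_us).astype(np.int64)
        np.add.at(frames, (idx, events["y"], events["x"]), events["p"])
        frames = np.clip(frames, -clip, clip)
        return ((frames + clip) * (255.0 / (2 * clip))).astype(np.uint8)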
D. YAWN DETECTION—TRAINING DETAILS
The yawn training sequences were augmented to achieve better generalisation. This includes rotating 50% of sequences within ±10°, mirroring about a vertical axis, and cropping to squares of randomised size and position (within some limits to ensure the full face is still visible). The augmentations were only randomised between sequences, so each frame in a sequence had identical transformations applied. All frames were downsampled to 256 × 256 using pixel area relation before input to the network. The network was trained for 100 epochs with a batch size of 5. The initial learning rate of 10^-4 was halved every 10 epochs. Binary cross entropy loss was calculated between the predicted and actual labels of each sequence in the validation set. The dropout probability was set to 0.1.
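A minimal sketch of this per-sequence augmentation policy is given below. The crop-size limits and helper structure are our own assumptions; the key property, drawing the random parameters once per sequence and applying them identically to every frame, matches the description above.

    import random
    import cv2
    import numpy as np

    def augment_sequence(frames, max_rot=10.0):
        # Draw one set of random parameters, apply it to every frame.
        h, w = frames[0].shape[:2]
        rotate = random.random() < 0.5              # rotate 50% of sequences
        angle = random.uniform(-max_rot, max_rot)   # within +/-10 degrees
        mirror = random.random() < 0.5              # mirror about vertical axis
        side = random.randint(int(0.8 * min(h, w)), min(h, w))  # limits: our assumption
        x0, y0 = random.randint(0, w - side), random.randint(0, h - side)
        rot = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
        out = []
        for f in frames:
            if rotate:
                f = cv2.warpAffine(f, rot, (w, h))
            if mirror:
                f = cv2.flip(f, 1)                  # horizontal mirror
            f = f[y0:y0 + side, x0:x0 + side]       # random square crop
            # cv2.INTER_AREA resampling is OpenCV's "pixel area relation"
            out.append(cv2.resize(f, (256, 256), interpolation=cv2.INTER_AREA))
        return np.stack(out)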
E. SEATBELT STATE DETECTION—DATASETS
Another non-public in-cabin industry dataset was used for our seatbelt detection algorithm. Using a near-infrared (NIR) camera in the rear-view mirror position of a car, various subjects were recorded fastening and unfastening their seatbelts repeatedly. The video frames were labelled with the following classes:
0) The subject's seatbelt is fastened.
1) The subject's seatbelt is unfastened.
2) The subject is fastening their seatbelt.
3) The subject is unfastening their seatbelt.
The wide field of view lens of the camera captured both the driver and passenger seat. Both seats were given distinct labels of the seatbelt state. These videos were split into crops of the driver's seat and crops of the passenger seat, and the passenger seat crops were mirrored horizontally to have a similar perspective and seatbelt direction to the driver's seat crops. The network then has a simpler task, predicting on the cropped images rather than requiring both seats to be considered as separate features. Since the camera position is fixed, the same cropping and mirroring can be carried out as required at inference. In this paper, the term ''static classes'' refers to classes 0 and 1, and ''transition classes'' refers to classes 2 and 3.
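A minimal sketch of this crop-and-mirror normalisation is shown below; the crop boxes are hypothetical placeholders standing in for the fixed camera geometry used in this work.

    import cv2

    # Fixed crop boxes (x, y, w, h) for the known camera position; the
    # values here are placeholders, not the calibration used in this work.
    DRIVER_BOX = (0, 60, 320, 360)
    PASSENGER_BOX = (320, 60, 320, 360)

    def seat_crops(frame):
        x, y, w, h = DRIVER_BOX
        driver = frame[y:y + h, x:x + w]
        x, y, w, h = PASSENGER_BOX
        # Mirror the passenger crop so belt direction matches the driver's.
        passenger = cv2.flip(frame[y:y + h, x:x + w], 1)
        return driver, passenger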
A binary classifier trained on just the static classes to directly
F. SEATBELT STATE DETECTION—EVENT SIMULATION AND assess the potential to predict on event data with little seatbelt
PRE-PROCESSING motion.
The seatbelt state classification task poses unique challenges
in choosing an approach to accumulate frames, as the 2) FASTENING VS. UNFASTENING CLIPS (TRANSITION
seatbelt is relatively stationary once fastened/unfastened and CLASSES)
generates few events, but the fastening/unfastening actions A binary classifier trained on just the transition classes to
generate a comparatively huge number of events. Both assess if predicting the changing state of the seatbelt is more
previously described methods (fixed duration and fixed reliable than using the static seatbelt in event data.
event count) were tested, but neither were fully suitable.
It proved too difficult to find a fixed duration large enough 3) COMBINED STATIC CLASSES VS. COMBINED TRANSITION
to keep a stationary seatbelt sufficiently visible without CLASSES
significantly reducing the number of frames for capturiung In a real-world deployment of a seatbelt state detector, all
the fastening/unfastening actions. Alternatively, using a fixed 4 classes must be handled. This necessitates another binary
event count was also unreliable in keeping the seatbelt visible model for a preliminary filtering to determine if an input
as there is no guarantee that the events contain relevant sequence of frames should be passed to the static model (1)
information. The event count was often saturated by unrelated or transition model (2) to refine the prediction.
movements such as head motion or the background changing
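The hybrid rule can be sketched as follows. The event-count threshold and the mask-based region test are illustrative assumptions around the two stated conditions: a minimum event count inside the torso bounding box, and a minimum 200ms span that caps the rate at 5FPS.

    import numpy as np

    def hybrid_frame_boundaries(events, torso_mask, min_events=5000,
                                min_dur_us=200_000):
        # Close a frame only when BOTH conditions hold: enough events inside
        # the torso bounding region AND at least 200 ms elapsed (<= 5 FPS).
        # min_events is an illustrative value, not the tuned threshold.
        boundaries = []
        start_t, roi_count = int(events["t"][0]), 0
        for e in events:
            roi_count += int(torso_mask[e["y"], e["x"]])  # torso events only
            if roi_count >= min_events and e["t"] - start_t >= min_dur_us:
                boundaries.append(int(e["t"]))            # end of this frame
                start_t, roi_count = int(e["t"]), 0
        return boundaries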
G. SEATBELT STATE DETECTION—TRAINING DETAILS
For seatbelt state detection, four distinct models were developed with our same network structure:

1) SEATBELT ON VS. SEATBELT OFF (STATIC CLASSES)
A binary classifier trained on just the static classes, to directly assess the potential to predict on event data with little seatbelt motion.

2) FASTENING VS. UNFASTENING CLIPS (TRANSITION CLASSES)
A binary classifier trained on just the transition classes, to assess whether predicting the changing state of the seatbelt is more reliable than using the static seatbelt in event data.

3) COMBINED STATIC CLASSES VS. COMBINED TRANSITION CLASSES
In a real-world deployment of a seatbelt state detector, all 4 classes must be handled. This necessitates another binary model for a preliminary filtering step to determine if an input sequence of frames should be passed to the static model (1) or the transition model (2) to refine the prediction, as sketched after this list.

4) 4-CLASS MODEL
Trained with all 4 classes of our synthetic seatbelt dataset to handle all states in a single model. The classes are all considered independently by this network, so each frame sequence is predicted as containing only one class, and the previous state does not inform new predictions.
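At inference time, the cascade described for model (3) could be wired as in the sketch below; the single-logit binary outputs, the 0.5 threshold, and the label strings are our assumptions for illustration.

    import torch

    @torch.no_grad()
    def predict_seatbelt_state(frames, gate, static_model, transition_model):
        # gate: model (3), routes to static model (1) or transition model (2).
        # Each binary model is assumed to emit one logit for its positive class.
        if gate(frames).sigmoid().item() > 0.5:           # transition sequence?
            p = transition_model(frames).sigmoid().item()
            return "fastening" if p > 0.5 else "unfastening"
        p = static_model(frames).sigmoid().item()
        return "fastened" if p > 0.5 else "unfastened"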
TABLE 3. Results of our best yawn detection model tested on all of our synthetic event sets.

TABLE 5. Results of our model tested on all of our event seatbelt test datasets.
FIGURE 7. Visualised attention maps of yawn frames generated from (a) our test set and (b) the simulated YawDD dataset.

FIGURE 8. Visualised attention maps of seatbelt frames generated from real events in the test set.

test sequences, both real and synthetic. The binary transition model (2) also achieved perfect accuracy on the simulated test set, but the noticeable difference in performance on the real data indicates overfitting on the synthetic events. These results indicate that the resting state of the seatbelt is more reliable for prediction than tracking the transition states. To select which of these two models to use for an input frame sequence in a practical DMS, a third binary model (3) is needed. This was surprisingly the lowest performing of all models, despite having an objectively easier task than the 4-class model (4), which uses exactly the same data but categorizes it more precisely. This result, combined with the high training accuracy of model (3), reveals overfitting to be the cause of the lower performance. Both models (3) and (4) are fed full videos and

V. CONCLUSION
In this article we provide proof-of-concept methods for both yawn detection and seatbelt state detection with event cameras using lightweight deep learning models. This includes further evidence of the efficacy of synthetic event data in developing neuromorphic algorithms that can generalize to real data. Recent months have seen neuromorphic research trend away from frame-based approaches in favour of sparse representations, but this paper demonstrates how frames can efficiently compress event data for tasks with less demanding timing requirements. Our yawn detection algorithm offers superior performance to typical keypoint-based methods by accounting for associated motions of the upper body and handling the frequent cases where the mouth is occluded by a hand. Event cameras are typically employed for their fast response times and motion analysis qualities, but with the models developed for continuous monitoring of the seatbelt - even while stationary for long periods - we demonstrate how event data can be manipulated to satisfy a diverse set of requirements for assorted tasks. The proposed neuromorphic event-based algorithms for detecting yawns and seatbelt state fill a research gap and offer promising potential for advanced driver-assistance systems and intelligent safety features.

A. FUTURE WORK
Future work will seek to improve the seatbelt algorithms by considering the fixed order of states. In particular, the 4-class model should weight future predictions based on the current predicted state. For example, given that the current state is ''seatbelt fastened'', the network should have the knowledge that ''unfastening'' must follow. Further collection of real event data and the sourcing of more public datasets of seatbelt states and yawns are planned to greatly expand our research. Additionally, the deployment of these models will be investigated within the limitations of the embedded hardware typically found in DMS. All models in this article use the same architecture, granting scope for extremely efficient deployment.

ACKNOWLEDGMENT
The authors recognize that the data, equipment, and expertise provided by Xperi Inc. were vital in enabling this research. For the purpose of Open Access, the author has applied a CC BY public copyright licence to any author accepted manuscript version arising from this submission.
REFERENCES
[1] S. Kumari, K. Akanksha, S. Pahadsingh, and S. Singh, ''Drowsiness and yawn detection system using Python,'' in Proc. Int. Conf. Commun., Circuits, Syst. Singapore: Springer, 2021, pp. 225–232.
[2] P. Ghorai, A. Eskandarian, Y.-K. Kim, and G. Mehr, ''State estimation and motion prediction of vehicles and vulnerable road users for cooperative autonomous driving: A survey,'' IEEE Trans. Intell. Transp. Syst., vol. 23, no. 10, pp. 16983–17002, Oct. 2022.
[3] K. Kuru and W. Khan, ''A framework for the synergistic integration of fully autonomous ground vehicles with smart city,'' IEEE Access, vol. 9, pp. 923–948, 2021.
[4] D. Miculescu and S. Karaman, ''Polling-systems-based autonomous vehicle coordination in traffic intersections with no traffic signals,'' IEEE Trans. Autom. Control, vol. 65, no. 2, pp. 680–694, Feb. 2020.
[5] K. Kuru, ''Conceptualisation of human-on-the-loop haptic teleoperation with fully autonomous self-driving vehicles in the urban environment,'' IEEE Open J. Intell. Transp. Syst., vol. 2, pp. 448–469, 2021.
[6] M. I. Pereira, R. M. Claro, P. N. Leite, and A. M. Pinto, ''Advancing autonomous surface vehicles: A 3D perception system for the recognition and assessment of docking-based structures,'' IEEE Access, vol. 9, pp. 53030–53045, 2021.
[7] M. A. Khan, T. Nawaz, U. S. Khan, A. Hamza, and N. Rashid, ''IoT-based non-intrusive automated driver drowsiness monitoring framework for logistics and public transport applications to enhance road safety,'' IEEE Access, vol. 11, pp. 14385–14397, 2023.
[8] E. Perkins, C. Sitaula, M. Burke, and F. Marzbanrad, ''Challenges of driver drowsiness prediction: The remaining steps to implementation,'' IEEE Trans. Intell. Vehicles, vol. 8, no. 2, pp. 1319–1338, Feb. 2023.
[9] A. Picot, A. Caplier, and S. Charbonnier, ''Comparison between EOG and high frame rate camera for drowsiness detection,'' in Proc. Workshop Appl. Comput. Vis. (WACV), Dec. 2009, pp. 1–6.
[10] G. Gallego, T. Delbrück, G. Orchard, C. Bartolozzi, B. Taba, A. Censi, S. Leutenegger, A. J. Davison, J. Conradt, K. Daniilidis, and D. Scaramuzza, ''Event-based vision: A survey,'' IEEE Trans. Pattern Anal. Mach. Intell., vol. 44, no. 1, pp. 154–180, Jan. 2022.
[11] M. S. Dilmaghani, W. Shariff, C. Ryan, J. Lemley, and P. Corcoran, ''Control and evaluation of event cameras output sharpness via bias,'' Proc. SPIE, vol. 12701, pp. 455–462, Jun. 2022.
[12] C. Ryan, B. O'Sullivan, A. Elrasad, A. Cahill, J. Lemley, P. Kielty, C. Posch, and E. Perot, ''Real-time face & eye tracking and blink detection using event cameras,'' Neural Netw., vol. 141, pp. 87–97, Sep. 2021.
[13] N. Fouda Mbarga, A.-R. Abubakari, L. N. Aminde, and A. R. Morgan, ''Seatbelt use and risk of major injuries sustained by vehicle occupants during motor-vehicle crashes: A systematic review and meta-analysis of cohort studies,'' BMC Public Health, vol. 18, no. 1, p. 1413, Dec. 2018.
[14] C. S. Crandall, L. M. Olson, and D. P. Sklar, ''Mortality reduction with air bag and seat belt use in head-on passenger car collisions,'' Amer. J. Epidemiol., vol. 153, no. 3, pp. 219–224, Feb. 2001.
[15] S. Hosseini and A. Fathi, ''Automatic detection of vehicle occupancy and driver's seat belt status using deep learning,'' Signal, Image Video Process., vol. 17, no. 2, pp. 491–499, Mar. 2023.
[16] A. Kashevnik, A. Ali, I. Lashkov, and N. Shilov, ''Seat belt fastness detection based on image analysis from vehicle in-cabin camera,'' in Proc. 26th Conf. Open Innov. Assoc. (FRUCT), Apr. 2020, pp. 143–150.
[17] P. Kielty, M. S. Dilmaghani, C. Ryan, J. Lemley, and P. Corcoran, ''Neuromorphic sensing for yawn detection in driver drowsiness,'' in Proc. 15th Int. Conf. Mach. Vis. (ICMV), Jun. 2023, pp. 1–12.
[18] S. Abtahi, B. Hariri, and S. Shirmohammadi, ''Driver drowsiness monitoring based on yawning detection,'' in Proc. IEEE Int. Instrum. Meas. Technol. Conf., May 2011, pp. 1–4.
[19] M. Omidyeganeh, S. Shirmohammadi, S. Abtahi, A. Khurshid, M. Farhan, J. Scharcanski, B. Hariri, D. Laroche, and L. Martel, ''Yawning detection using embedded smart cameras,'' IEEE Trans. Instrum. Meas., vol. 65, no. 3, pp. 570–582, Mar. 2016.
[20] M. Knapik and B. Cyganek, ''Driver's fatigue recognition based on yawn detection in thermal images,'' Neurocomputing, vol. 338, pp. 274–292, 2019.
[21] H. Yang, L. Liu, W. Min, X. Yang, and X. Xiong, ''Driver yawning detection based on subtle facial action recognition,'' IEEE Trans. Multimedia, vol. 23, pp. 572–583, 2021.
[22] D. Liu, C. Zhang, Q. Zhang, and Q. Kong, ''Design and implementation of multimodal fatigue detection system combining eye and yawn information,'' in Proc. IEEE 5th Int. Conf. Signal Image Process. (ICSIP), Oct. 2020, pp. 65–69.
[23] V. Dehankar, P. Jumle, and S. Tadse, ''Design of drowsiness and yawning detection system,'' in Proc. 2nd Int. Conf. Electron. Renew. Syst. (ICEARS), Mar. 2023, pp. 1585–1589.
[24] B. Alshaqaqi, A. S. Baquhaizel, M. E. A. Ouis, M. Boumehed, A. Ouamri, and M. Keche, ''Driver drowsiness detection system,'' in Proc. 8th Int. Workshop Syst., Signal Process. their Appl. (WoSSPA), May 2013, pp. 151–155.
[25] J. S. R. Melvin, B. Rokesh, S. Dheepajyothieshwar, and K. Akila, ''Driver yawn prediction using convolutional neural network,'' in Proc. IoT, Cloud Data Sci., Feb. 2023, pp. 268–276.
[26] M. Baltaxe, R. Mergui, K. Nistel, and G. Kamhi, ''Marker-less vision-based detection of improper seat belt routing,'' in Proc. IEEE Intell. Vehicles Symp. (IV), Jun. 2019, pp. 783–789.
[27] S. Chun, N. H. Ghalehjegh, J. Choi, C. Schwarz, J. Gaspar, D. McGehee, and S. Baek, ''NADS-Net: A nimble architecture for driver and seat belt detection via convolutional neural networks,'' in Proc. IEEE/CVF Int. Conf. Comput. Vis. Workshop (ICCVW), Oct. 2019, pp. 2413–2421.
[28] D. Rao, ''Driver's seat belt detection using CNN,'' Turkish J. Comput. Math. Educ., vol. 12, pp. 776–785, Apr. 2021.
[29] A. Ş. Şener, I. F. Ince, H. B. Baydargil, I. Garip, and O. Ozturk, ''Deep learning based automatic vertical height adjustment of incorrectly fastened seat belts for driver and passenger safety in fleet vehicles,'' Proc. Inst. Mech. Eng., D, J. Automobile Eng., vol. 236, no. 4, pp. 639–654, Mar. 2022, doi: 10.1177/09544070211025338.
[30] J. Madake, S. Yadav, S. Singh, S. Bhatlawande, and S. Shilaskar, ''Vision-based driver's seat belt detection,'' in Proc. Int. Conf. Advancement Technol. (ICONAT), Jan. 2023, pp. 1–5.
[31] Y. Zang, B. Yu, and S. Zhao, ''Lightweight seatbelt detection algorithm for mobile device,'' Multimedia Tools Appl., vol. 2023, pp. 1–15, Mar. 2023.
[32] A. Upadhyay, B. Sutrave, and A. Singh, ''Real time seatbelt detection using YOLO deep learning model,'' in Proc. IEEE Int. Students' Conf. Electr., Electron. Comput. Sci. (SCEECS), Feb. 2023, pp. 1–6.
[33] M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L.-C. Chen, ''MobileNetV2: Inverted residuals and linear bottlenecks,'' in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., Jun. 2018, pp. 4510–4520.
[34] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg, and L. Fei-Fei, ''ImageNet large scale visual recognition challenge,'' Int. J. Comput. Vis., vol. 115, no. 3, pp. 211–252, Dec. 2015.
[35] H. Zhang, I. Goodfellow, D. Metaxas, and A. Odena, ''Self-attention generative adversarial networks,'' in Proc. Int. Conf. Mach. Learn., 2018, pp. 7354–7363.
[36] S. Hochreiter and J. Schmidhuber, ''Long short-term memory,'' Neural Comput., vol. 9, no. 8, pp. 1735–1780, Nov. 1997.
[37] S. Abtahi, M. Omidyeganeh, S. Shirmohammadi, and B. Hariri, ''YawDD: A yawning detection dataset,'' in Proc. 5th ACM Multimedia Syst. Conf., Mar. 2014, pp. 24–28.
[38] Y. Hu, S.-C. Liu, and T. Delbruck, ''v2e: From video frames to realistic DVS events,'' in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. Workshops (CVPRW), Jun. 2021, pp. 1312–1321.
[39] B.-T. Dong, H.-Y. Lin, and C.-C. Chang, ''Driver fatigue and distracted driving detection using random forest and convolutional neural network,'' Appl. Sci., vol. 12, no. 17, p. 8674, Aug. 2022.
[40] W. Zhang and J. Su, ''Driver yawning detection based on long short term memory networks,'' in Proc. IEEE Symp. Ser. Comput. Intell. (SSCI), Nov. 2017, pp. 1–5.
[41] B. Akrout and W. Mahdi, ''Yawning detection by the analysis of variational descriptor for monitoring driver drowsiness,'' in Proc. Int. Image Process., Appl. Syst. (IPAS), Nov. 2016, pp. 1–5.

PAUL KIELTY received the B.E. degree in electronic and computer engineering from the University of Galway, in 2021. He is currently pursuing the joint Ph.D. degree with the University of Galway and the ADAPT SFI Research Centre. His research interests include deep learning methods with neuromorphic vision, with a particular interest in driver monitoring tasks.

MEHDI SEFIDGAR DILMAGHANI received the B.Sc. degree in electronics engineering from the University of Tabriz, in 2012, and the M.Sc. degree in electronics engineering from KNTU, in 2016. He is currently pursuing the Ph.D. degree with the Department of Electrical and Electronics Engineering, University of Galway, under the Hardiman Scholarship. During his M.Sc. studies, he focused on electronic implementations of signal processing algorithms and wavelets. He is also a Research and Development Intern with Xperi Inc. His research interests include deep learning, computer vision, and neuromorphic sensors (event cameras).

JOE LEMLEY received the B.S. degree in computer science and the master's degree in computational science from Central Washington University, in 2006 and 2016, respectively, and the Ph.D. degree from the National University of Ireland Galway. He is currently a Principal Research and Development Engineer and the Manager of Xperi Inc., Galway. His field of work is machine learning using deep neural networks for tasks related to computer vision. His current research interests include computer vision and signal processing for driver monitoring systems.