Hardware Accelerators for Autonomous Cars: A Review
Ruba Islayem, Fatima Alhosani, Raghad Hashem, Afra Alzaabi, Mahmoud Meribout
Abstract— Autonomous Vehicles (AVs) redefine transportation with sophisticated technology, integrating sensors, cameras, and intricate algorithms. Implementing machine learning in AV perception demands robust hardware accelerators to achieve real-time performance at reasonable power consumption and footprint. Considerable research and development effort using different technologies is still being conducted to achieve the goal of a fully autonomous vehicle, and some manufacturers offer commercially available systems. Unfortunately, these systems still lack reliability because of the repeated accidents they have encountered, such as the recent one in California, for which the Cruise company had its license suspended by the state for an undetermined period [1]. This paper critically reviews the most recent findings on machine vision systems used in AVs from both hardware and algorithmic points of view. It discusses the technologies used in commercial cars with their pros and cons and suggests possible ways forward. Thus, the paper can serve as a tangible reference for researchers who have the opportunity to get involved in designing machine vision systems targeting AVs.

Index Terms— ADAS, ASIC, CNNs, CPU, Datasets, FPGA, GPU, Hardware Accelerators, SSD, Object Detection, YOLO

I. INTRODUCTION

AVs represent a groundbreaking technological innovation with profound implications for the field of transportation and beyond. Using a combination of sensors, cameras, LiDAR (Light Detection and Ranging), radar, and complex software algorithms, autonomous cars can observe their surroundings, make quick decisions in real time, and travel safely without the need for a driver. AVs have garnered significant interest recently, and they hold a crucial place in transportation not just for the convenience they offer in relieving drivers but also for their capacity to revolutionize the entire transportation ecosystem. As per the WHO, approximately 1.3 million lives are lost each year due to road traffic accidents [2], and 94% of these accidents are caused by human error and distracted driving [3]. Therefore, the significance of AVs is underscored by their role in enhancing road safety by eliminating human errors, optimizing traffic flow, reducing congestion, and minimizing environmental impact [4]. Additionally, they offer increased mobility for individuals who cannot drive due to age or disabilities, promising a future that is safer, more efficient, and more accessible.

In recent decades, Machine Learning (ML) algorithms have played a pivotal role in advancing AV technology, particularly in the perception system. These algorithms facilitate the assessment of the vehicle's surroundings and the identification of objects such as pedestrians, vehicles, and traffic signals. The control system module utilizes this information to implement essential measures, covering actions related to braking, speed, lane changes, or steering adjustments [5]. The integration of artificial intelligence (AI) and ML is widespread in AV development, led by companies such as Waymo, Uber, and Tesla. This shift replaces conventional systems, reducing reliance on costly equipment like LRF (Laser Range Finder), LiDAR, and GPS [5]. Ongoing research aims to ensure AV safety by addressing challenges in modelling human-like driving behaviour for passenger comfort. ML, especially through the application of Convolutional Neural Networks (CNNs), assumes a central role in performing vital computer vision tasks essential for AV autonomy [5].

AVs leverage not just machine vision algorithms but also depend on hardware accelerators to furnish robust parallel computing frameworks, essential for managing the intricate responsibilities of perception, decision-making, and control [6]. These hardware accelerators encompass Graphics Processing Units (GPUs), Central Processing Units (CPUs), Field-Programmable Gate Arrays (FPGAs), and Application-Specific Integrated Circuits (ASICs). The selection of hardware accelerators for AVs can vary based on several factors, including the AV's autonomy level, sensor configuration, computational demands, and safety prerequisites.

This review paper makes a valuable contribution to the field of AVs in several ways. Firstly, it addresses the absence of comprehensive review papers that discuss commercially available machine vision systems for autonomous vehicles. This serves as a valuable resource for researchers and industry professionals seeking insights into the practical implementation and industry relevance of these systems. Furthermore, this paper covers all aspects of hardware accelerators and machine vision systems for AVs in one comprehensive document. Additionally, it tackles the issue of fragmented information by consolidating and presenting it in one accessible resource, making research and knowledge exchange more efficient. These contributions aim to advance research and drive progress in the development of AV technologies.

The remaining sections of this paper are organized as follows:
• Section II provides a background overview of the hardware accelerators, sensors, and machine vision algorithms commonly employed in AVs.
• Section III is dedicated to providing an overview of the machine vision algorithms used in AVs, offering insights into the deep learning and machine learning algorithms used to detect relevant objects on the road.
• In Section IV, we conduct an in-depth exploration of some of the state-of-the-art processors and other potential hardware accelerators utilized in AVs to enhance machine vision algorithms.
• In Section V, we offer our conclusions, summarizing the main findings and implications presented throughout this paper.
II. BACKGROUND

A. Levels of ADAS

In 2014, SAE International introduced the J3016 standard, known as "Levels of Driving Automation" [7]. This standard classifies Advanced Driver-Assistance Systems (ADAS) into six distinct levels of driving automation, as depicted in Fig 1 [7]. It commences at SAE level 0, where the driver maintains full control, and advances to SAE level 5, where vehicles achieve complete autonomy and handle all dynamic driving tasks without human intervention. In a level 5 system, the vehicle assumes full responsibility, even in the event of faults, errors, or accidents [8]. To reach higher autonomy levels, AVs depend on a combination of sensors and software to perceive their environment and navigate autonomously [9]. Currently, automotive manufacturers like Audi (Volkswagen) and Tesla have adopted SAE level 2 automation standards in the development of automation features like Tesla's Autopilot and the Audi A8's Traffic Jam Pilot. In contrast, Alphabet's Waymo has been exploring a business model centered on SAE level 4 self-driving taxi services since 2016, offering rides within specific areas in Arizona, USA [7].

Figure 1. An overview of the levels of driving automation [7]

B. General Structure of ADAS Systems

ADAS is a system that helps automobile drivers navigate and park without automating the whole process, by employing camera-based sensors. It aims to minimize human accidents by processing important data about traffic, congestion levels, and road closures, among other things.

The brain of most ADAS systems is a hardware accelerator that perceives the car's surroundings to avert danger. Such systems typically comprise four perception sensors: LIDAR, RADAR, cameras, and ultrasonic sensors [10]. The data from these sensors is processed by a dedicated hardware accelerator and fused to identify nearby objects such as pedestrians, vehicles, lanes, and traffic signs [11]. Finally, the pre-processed data is exploited by other components such as the brake, steering, and throttle control to react accordingly based on the obstacles faced. The entire process is depicted in Fig 2.

Figure 2. ADAS General Processing Structure
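The stages in Fig 2 can be read as a simple data-flow: capture, per-sensor inference on the accelerator, fusion, and actuation. The sketch below illustrates that flow only; every function and class name in it (read_sensors, fuse, actuate, Detection) is illustrative and does not correspond to any specific ADAS stack.

```python
# Minimal sketch of the ADAS flow in Fig 2: sensor capture -> accelerator
# inference -> fusion -> actuation. All names are illustrative placeholders.
from dataclasses import dataclass
from typing import List

@dataclass
class Detection:
    label: str          # e.g. "pedestrian", "vehicle", "traffic_sign"
    confidence: float   # detector score in [0, 1]
    distance_m: float   # estimated range to the object

def read_sensors() -> dict:
    """Grab one synchronized frame from camera, radar, lidar, ultrasonic."""
    return {"camera": None, "radar": None, "lidar": None, "ultrasonic": None}

def run_detectors(frame: dict) -> List[Detection]:
    """Run the per-sensor models on the hardware accelerator (placeholder)."""
    return [Detection("pedestrian", 0.91, 12.4)]

def fuse(detections: List[Detection]) -> List[Detection]:
    """Merge overlapping detections coming from different sensors (placeholder)."""
    return detections

def actuate(objects: List[Detection]) -> None:
    """Forward the fused object list to brake/steering/throttle control."""
    for obj in objects:
        if obj.label == "pedestrian" and obj.distance_m < 15:
            print("request braking")

if __name__ == "__main__":
    actuate(fuse(run_detectors(read_sensors())))
```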
C. Perception Sensors Used by Manufacturers

AVs utilize four main types of perception sensors: cameras, RADARs, LIDARs, and ultrasonic sensors. The cameras, which arguably yield the most useful and richest information, may be of different types: fish-eye cameras for wide-angle coverage, monocular cameras for basic visual data, stereo cameras for depth perception, and 360-degree cameras for panoramic views [7]. Also, depending on their focal length and orientation, they can be used to cover different views surrounding the car: near/far front-view, side-view, rear-view, surround-view, and built-in cameras, based on the different applications and scenarios [12].

RADAR (Radio Detection and Ranging) sensors detect and locate objects within a specific range from the car. Most AVs employ three variants of RADAR: long-range, medium-range, and short-range [13].

LIDAR sensors use laser beams and measure the time it takes for the beams to reflect back from objects, thereby allowing the system to create a 3D map of the environment [14]. Their high accuracy, along with their effectiveness in low-light conditions, makes them an important component of autonomous vehicles.

Ultrasonic sensors provide short-distance data and are typically used for parking assistance and backup warning systems, provided there is no rain [9]. However, unlike cameras, they can operate in foggy and dusty weather conditions.

Among all sensors, the camera is the main visual sensor of the ADAS system due to its ability to perform high-resolution tasks, including classification and scene understanding that require color perception. There has been a growing belief among researchers and even companies that autonomous driving will be possible with cameras only. Tesla is one such company, as it uses AI and dedicated hardware accelerators to process video data in order to simultaneously estimate depth, velocity, and acceleration using camera input [15]. However, this approach has yet to demonstrate its reliability, as some fatal accidents have occurred since its adoption [16]. In this paper, the focus is on such camera-based systems, as they have great potential to achieve SAE level 5 in the near future. The continuous advance of AI and associated hardware accelerators is the main catalyst for this optimism. Fig 3 shows the overall sensor placement in AVs.

Figure 3. Typical Placement of Sensors Around an Autonomous Vehicle [18]
D. Need for Hardware Accelerators

In AVs, traditional computer processors, such as Intel Core i7 CPUs, lack the power to host computationally intensive machine vision algorithms; hence, special-purpose coprocessors or AI accelerators such as GPUs, FPGAs, and ASICs have been widely used in automobiles, as shown in Fig 4 [17-19].

On one end, while multicore CPUs have general flexibility and can execute AI and machine learning workloads, they offer no specific support for them and are not energy efficient. On the other hand, GPUs, which are also versatile, feature higher levels of parallelism using both single and double precision arithmetic. Thus, they are more adequate for handling the memory-intensive tasks required by machine vision algorithms and yield higher throughput than multicore CPUs [6]. FPGAs also offer adaptability for customizing parallelism, data types, and hardware architecture to suit specific applications. They are useful for accommodating lighter versions of modern DNN models featuring quantized weights and a reduced number of layers [6]. Furthermore, ASICs, which are customized hardware chips designed for specific applications, offer high performance and efficiency tailored to their designated functions in terms of execution time and power consumption [6]. However, they lack flexibility and are designed for specific purposes. Therefore, system designers must consider a blend of processor resources to meet their application needs. Tesla, NVIDIA, Qualcomm, and Mobileye, among other companies, have been working on developing their own AI accelerators targeting AV applications.

Figure 4. Spectrum of Hardware Accelerators [17]

E. Machine Vision and the Emergence of CNNs

One of the most significant advancements in machine vision technology is the integration of CNNs. They are among the best machine learning algorithms for recognizing image content and have demonstrated good performance in image segmentation, classification, detection, and retrieval tasks [20]. Some of the most widely used CNN models in machine vision targeting AVs are YOLO, Faster R-CNN, SSD, and MobileNet [21]. These models are well-known for their exceptional performance in various image-related tasks, making them essential tools in the field of AVs.

F. State-of-the-Art AVs and Their Level of Autonomy

Car manufacturers have been researching AVs since the 1920s [22]. The first modern AV, in 1984, had level 1 autonomy, followed by a level 2 AV from Mercedes-Benz in 1987 that could control steering and acceleration with limited human supervision [23]. In 2014, Tesla became the pioneer in bringing AVs to the commercial market with their Autopilot system, offering level 2 autonomy [23]. Tesla's AVs heavily rely on sensors for self-navigation and decision-making, including a suite of six forward-facing cameras and ultrasonic sensors [24]. Volvo, in 2017, introduced their Drive Me feature, providing level 2 autonomy and allowing their vehicles to travel autonomously in specific weather conditions [25]. Furthermore, Waymo launched a driverless taxi service with level 4 autonomy in 2018 in the Phoenix area, USA, serving 1,000 to 2,000 riders weekly, with 5-10% of these rides being entirely autonomous [26]. Cruise Automation, in 2017, began testing a fleet of 30 vehicles with level 4 autonomy and introduced their self-driving Robotaxi service in 2021 [23]. Cruise AVs utilize a sensor cluster, featuring a front radar and cameras along with lidar sensors mounted on top to offer a comprehensive 360-degree view of their surroundings [26]. However, it is worth noting that the California Department of Motor Vehicles (DMV) has recently revoked Cruise's permits for testing and operating fully autonomous vehicles on the state's roads for several reasons, including their failure to disclose information about a pedestrian accident in which a Cruise vehicle struck a pedestrian and dragged them along the road [27]. Although both Waymo and Cruise aspire to achieve level 5 autonomy, their AVs are presently classified as level 4 due to the absence of a guarantee of safe operation in all weather and environmental conditions, as well as the road traffic accidents they have caused.

G. Challenges in AVs

Recent research in machine vision for AVs has achieved significant progress but faces various challenges that warrant further investigation. Firstly, real-time object detection is complex due to the need to simultaneously process several video streams in real time (more than 10 video streams corresponding to different orientations and zoom levels of the cameras in most cases) [23]. A limited number of studies, such as [28] and [29], considered multi-frame perception, which uses data from previous and current time instances. Moreover, semi-supervised object detection, involving annotated data for model training, faces challenges in annotating diverse scenarios, which are essential for the models' adaptability to real-world AV driving scenarios [23]. Recent research [30-32] recommends semi-supervised transformer models for improved accuracy, but deploying them on embedded onboard computers poses memory challenges requiring further investigation. Finally, object detector performance varies with changing environmental conditions like light and weather. Addressing this issue involves collecting diverse weather data, crucial for training reliable object detectors. The Waymo Open Dataset offers such diversity to improve detector performance [33]. Such open datasets are vital for ensuring consistent performance under various environmental conditions in AVs. Another challenge is the increasing complexity of the hardware accelerators, which requires in-depth hardware skills as well as mastery of the AI algorithms, the associated firmware, and the real-time operating system structure. It is rare for a single researcher to combine all of these skills, which makes multidisciplinary teamwork necessary.

III. MACHINE VISION ALGORITHMS FOR AVS

In the past, the computational capabilities of hardware accelerators were not powerful enough to support the integration of CNN models. Most traditional vision algorithms did not use CNN models, primarily because CNNs are computationally intensive. Nevertheless, with the advancement of hardware accelerators, their implementation
at reasonable power consumption and in real time is becoming possible. As a result, CNN models have replaced most traditional image processing methods. Moreover, these traditional image processing methods are not reliable because they rely on manual feature engineering, making them less adaptive and time-consuming, especially for complex object detection tasks. They often struggle to recognize objects in diverse driving scenarios, requiring frequent adjustments for changes in object scale, rotation, and varying environmental conditions, which can limit their reliability and effectiveness. This is, in fact, a big drawback of traditional legacy image-processing algorithms dedicated to autonomous cars. The challenge lies in their ability to accurately detect vehicles, where even minor alterations in a vehicle's appearance can lead to detection failures. An illustrative example is the disruption caused by an arm extended from a car's window, resulting in a system malfunction. In contrast, CNN models prove advantageous as they exhibit robust performance, making them the preferred choice.

The illustration in Fig 5 outlines the autonomous vehicle processing pipeline employed in today's machine vision systems. The pipeline, structured in discrete stages, facilitates the seamless flow of information from sensor data to high-level decision-making. Specialized CNN models tailored for distinct object detection tasks enhance vehicle safety and overall performance. Beginning with the camera capturing images, the pipeline includes video decoding for bandwidth optimization, image preprocessing for tasks like resizing and noise reduction, and specialized models for detecting vehicles, pedestrians, lanes, and traffic signs. The high-level processing stage integrates these outputs to make informed decisions, addressing tasks such as safe-distance calculation and responding to traffic signs. Finally, the decoder translates the processed data for visualization, control, and output, including displaying object detections on a user interface and transmitting commands to vehicle actuators.

Figure 5. AVs Processing Pipeline
A. Object Detection Algorithms

Object detection comprises two key tasks: localization, determining the precise object position in an image or video frame, and classification, assigning a specific class to the object. This classification can include identifying objects like pedestrians, vehicles, or traffic lights [23]. Detection and classification can be done in a single stage (e.g., YOLO) or in two independent stages (e.g., R-CNN) [34]. Unlike two-stage detectors, which rely on a separate region proposal step for bounding-box prediction, one-stage detectors perform this directly from input images, resulting in faster performance [34].
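Localization quality is conventionally scored with intersection-over-union (IoU) between predicted and ground-truth boxes; the mAP figures reported later in this section count a prediction as correct above an IoU threshold. A minimal sketch, with boxes given as (x1, y1, x2, y2):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter) if inter > 0 else 0.0

# Pascal VOC, for example, counts a detection as correct when IoU >= 0.5
print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # ~0.143
```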
a) R-CNN
In R-CNN, a convolutional neural network backbone is used for feature extraction, providing features of various sizes [35]. R-CNN has been demonstrated to yield high performance for AVs, specifically for detecting various objects, including pedestrians, cars, and traffic signs [36]. Even though R-CNN achieves cutting-edge results, it is very slow to train and test due to the need to process thousands of region proposals for each image [36].

In the initial phase of the R-CNN methodology, as shown in Fig 6, approximately 2,000 region proposals are generated to encompass potential objects [34]. Then, each region goes through a backbone network, such as AlexNet, to extract feature representations consisting of 4,096 dimensions. To enhance the accuracy of object classification, the system uses a Support Vector Machine (SVM) for making predictions. Furthermore, the system utilizes Fully Connected Layers (FCLs) to refine these predictions. Adjustments to the bounding boxes are made more precise with a bounding-box regression technique and a method called greedy non-maximum suppression (NMS). By following this process, R-CNN achieved a mean average precision (mAP) of 58.5% on the Pascal VOC dataset.

Figure 6. R-CNN Process Pipeline
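Greedy NMS, mentioned above, keeps the highest-scoring box and discards lower-scoring boxes that overlap it too strongly. A minimal, self-contained sketch (the 0.5 IoU threshold and the toy boxes are illustrative values):

```python
def nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-maximum suppression; returns indices of the kept boxes."""
    def iou(a, b):
        x1, y1 = max(a[0], b[0]), max(a[1], b[1])
        x2, y2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
        union = ((a[2] - a[0]) * (a[3] - a[1]) +
                 (b[2] - b[0]) * (b[3] - b[1]) - inter)
        return inter / union if union > 0 else 0.0

    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Discard remaining proposals that overlap the kept box too strongly
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_threshold]
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
print(nms(boxes, scores=[0.9, 0.8, 0.7]))  # [0, 2]: the near-duplicate box is dropped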
b) Fast R-CNN
The Fast R-CNN model enhances object detection by analyzing the entire image in a single pass, making it faster and more accurate than the earlier R-CNN model. As shown in Fig 7, it begins by processing the image through a CNN to create a feature map. Regions of interest (ROIs) are then identified on this map, and through ROI pooling, fixed-size feature vectors are generated. These vectors are fed into FCLs for predictions, using softmax and bounding-box regression for categorization and precise location determination, respectively. It achieved mAPs of 70.0%, 68.8%, and 68.4% on the Pascal VOC 2007, 2010, and 2012 datasets when trained with VGG-16 [34]. However, it relies on external region proposals, which are computationally expensive [6]; therefore, Faster R-CNN was introduced.

Figure 7. Fast R-CNN Process Pipeline
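The ROI pooling step described above is available as an off-the-shelf operator; the sketch below uses torchvision's roi_pool on random data purely to show the shapes involved (it is not the implementation evaluated in [34], and the feature-map size and ROIs are made up):

```python
# Sketch of ROI pooling with torchvision: variable-sized regions of a feature
# map are converted into fixed-size feature vectors for the FC layers.
import torch
from torchvision.ops import roi_pool

feature_map = torch.randn(1, 256, 50, 50)   # (batch, channels, H, W) from the backbone
# One region of interest per row: (batch_index, x1, y1, x2, y2) in feature-map coords
rois = torch.tensor([[0, 4.0, 4.0, 24.0, 24.0],
                     [0, 10.0, 12.0, 40.0, 30.0]])
pooled = roi_pool(feature_map, rois, output_size=(7, 7), spatial_scale=1.0)
print(pooled.shape)  # torch.Size([2, 256, 7, 7]) -> one fixed-size tensor per ROI
```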
c) Faster R-CNN
Faster R-CNN integrates the region proposal step into the network itself, removing the external proposal bottleneck and significantly enhancing training and computing speed. Moreover, Faster R-CNN employs a separate network to feed the ROIs to the ROI pooling layer along with the feature map [34]. These inputs are subsequently reshaped and used for prediction. In Faster R-CNN, the number of ROIs is not a constant value and is defined by the size of the feature map. Thus, the region proposals are computed on the GPU at nearly free computational cost compared to previous baselines [34]. This optimized architecture allows Faster R-CNN to achieve a rapid 6 frames per second (FPS) inference speed on a GPU while maintaining state-of-the-art detection accuracy on Pascal VOC 2007 [6]. Despite these speed and accuracy improvements, the two-stage approach still falls short of real-time performance requirements.

d) YOLO
YOLOv8, the most recent member of the family discussed here, supports several vision tasks, including detection, segmentation, pose estimation, tracking, and classification. It utilizes a modified backbone called the C2f module, which combines high-level features with contextual information, and employs an anchor-free model with a decoupled head to enhance overall accuracy [39]. In the output layer, the model employs the sigmoid activation function to determine the objectness score, indicating the probability that the bounding box contains an object. Additionally, the softmax function is utilized for class probabilities, indicating the probability that the object belongs to each possible class. When evaluated on the MS COCO dataset test-dev 2017, YOLOv8x achieved an Average Precision (AP) of 53.9% with a 640-pixel image size and a speed of 280 FPS on an NVIDIA A100 with TensorRT [39].
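For reference, the ultralytics Python package distributes pretrained YOLOv8 weights and a high-level inference API; a minimal sketch is shown below (the image path "street.jpg" and the 0.25 confidence threshold are placeholders, and this is not tied to any particular result quoted above):

```python
# Quick YOLOv8 inference sketch with the ultralytics package (pip install ultralytics).
from ultralytics import YOLO

model = YOLO("yolov8n.pt")             # nano variant; larger variants: s/m/l/x
results = model("street.jpg", conf=0.25)
for r in results:
    for box, cls, score in zip(r.boxes.xyxy, r.boxes.cls, r.boxes.conf):
        print(model.names[int(cls)], float(score), box.tolist())
```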
e) SSD
The results in Table I reflect its proficiency in accurately identifying objects. Although its specific model size is not disclosed, SSD exhibits a remarkable FPS rate of 105.14, indicating rapid real-time detection capabilities.

TABLE I
PERFORMANCE METRICS OF OBJECT DETECTION MODELS

Model | Dataset | Hardware Platform | Model Size (MB) | FPS | mAP
Dynamic R-CNN [34] | MS COCO | GeForce RTX 2080TI | 550 | 13.9 | 49.2
YOLOv5x [40] | VOC2007+2012, COCO | GeForce GTX 1650 | 87.37 | 10.09 | 81.18
MobileNet-YOLO [40] | VOC2007+2012, COCO | GeForce GTX 1650 | 3.23 | 73.39 | 73.17
YOLOv7-tiny [41] | TIB-Net | GeForce RTX 3070 | 12.2 | 227 | 85
YOLOv8 [41] | TIB-Net | GeForce RTX 3070 | 9.2 | 221 | 95.1
SSD [42] | PASCAL VOC 2007+2012, COCO | GeForce RTX 2080TI | - | 105.14 | 90.56
Faster R-CNN (VGG16) [43] | PASCAL VOC 2007+2012, COCO | CPU | - | 7 | 73.2
Fast R-CNN [43] | PASCAL VOC 2007+2012, COCO | CPU | - | 0.5 | 70.0
B. Algorithms for Detected Objects in AVs

1) Lane Detection
Lane detection algorithms rely on line detection and edge detection [44]. Initially, traditional image processing algorithms were used, such as the Hough transform, which is one of the most widely used algorithms as it features a high level of parallelism and accurate detection [45]. Other image-processing-based algorithms, including LaneATT [46], RANSAC, control point detection, lane marking clustering, and fan-scanning line detection, were also employed [44], [46]. With the development of deep learning techniques, algorithms such as CNNs, RNNs, R-CNN, and the YOLO family have been used for lane detection [44], [46]. According to [47], CNN models reported 90% accuracy for lane detection, compared with roughly 80% for traditional image processing algorithms [44]. Caltech Lanes, KITTI, TuSimple, and CULane are the most used datasets to train algorithms for lane detection [44], [46].
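A classical Hough-transform lane detector of the kind mentioned above can be put together with OpenCV in a few lines; the sketch below is illustrative only, and the file name, region-of-interest mask, and all thresholds are assumptions rather than tuned values:

```python
# Classical lane-marking sketch: Canny edges + probabilistic Hough transform.
import cv2
import numpy as np

frame = cv2.imread("road.jpg")                      # placeholder input image
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(cv2.GaussianBlur(gray, (5, 5), 0), 50, 150)

# Keep only the lower half of the image, where lane markings normally appear
mask = np.zeros_like(edges)
mask[edges.shape[0] // 2:, :] = 255
edges = cv2.bitwise_and(edges, mask)

lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=50,
                        minLineLength=40, maxLineGap=20)
for x1, y1, x2, y2 in (lines.reshape(-1, 4) if lines is not None else []):
    cv2.line(frame, (x1, y1), (x2, y2), (0, 255, 0), 3)
cv2.imwrite("lanes.jpg", frame)
```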
for traffic light detection in [72], achieving a high mean
2) Pedestrian Detection average precision of 98.5% as compared to the
In the past, traditional object detection algorithms such as implementation of Faster R-CNN in [74], which achieves a
VJ detector and Histogram of Oriented Gradients (HOG) maximum mean average precision of 86.4%. In order to train
have been used for pedestrian detection, all of which provided the algorithms, LISA, Bosch, and DriveU are some of the
high accuracy rates [48]. In 2008, the Deformable Parts main datasets created specifically for traffic light color
detection [75-77].
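The classical HOG approach mentioned above is available out of the box in OpenCV, which ships a pretrained HOG plus linear-SVM people detector; the sketch below is illustrative, with the image path, window stride, and score threshold chosen arbitrarily:

```python
# HOG + linear-SVM pedestrian detector sketch using OpenCV's built-in model.
import cv2

hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

image = cv2.imread("street.jpg")                    # placeholder input image
rects, weights = hog.detectMultiScale(image, winStride=(8, 8),
                                      padding=(8, 8), scale=1.05)
for (x, y, w, h), score in zip(rects, weights):
    if float(score) > 0.5:                          # illustrative threshold
        cv2.rectangle(image, (x, y), (x + w, y + h), (0, 0, 255), 2)
cv2.imwrite("pedestrians.jpg", image)
```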
3) Traffic Sign Detection
Traffic sign detection algorithms are essential for analyzing, detecting, and categorizing traffic signs based on their shape, color, and the drawings on them [58]. Traffic sign algorithms are classified into two types: machine learning based and deep learning based. Machine learning based algorithms include Support Vector Machines (SVM) and AdaBoost, which detect traffic signs accurately using handcrafted features [58]. On the other hand, deep learning algorithms such as CNNs and RNNs have been more commonly used recently due to their ability to automatically learn complex features from raw data, reducing the need for manual feature extraction [58]. For instance, enhanced algorithms based on ResNet and CNN, as introduced by [59], demonstrate effective capture of intricate features in traffic signs. Utilizing the Kaggle traffic sign dataset, the ResNet-based model achieved an impressive recognition accuracy of 99%, while the CNN-based model attained a recognition accuracy of 98%. GTSRB, COCO, and TT100K are some of the most used datasets to train traffic sign detection algorithms [58], [60], [61].
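A CNN-based sign classifier of the general kind referenced above can be sketched in a few lines of PyTorch; the architecture below is not the model of [59], just an illustrative skeleton with 43 output classes as in the GTSRB benchmark and an assumed 32x32 input resolution:

```python
# Illustrative traffic-sign classifier skeleton (not the architecture of [59]).
import torch
import torch.nn as nn

class SignNet(nn.Module):
    def __init__(self, num_classes: int = 43):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(64 * 8 * 8, num_classes)

    def forward(self, x):                 # x: (batch, 3, 32, 32)
        return self.classifier(self.features(x).flatten(1))

logits = SignNet()(torch.randn(4, 3, 32, 32))
print(logits.shape)                       # torch.Size([4, 43])
```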
4) Traffic Light Detection
Traditional image-processing traffic light algorithms proceed in two steps: feature extraction and template matching [62]. Feature-extraction algorithms are used to extract the characteristic features of the traffic light signal, and commonly used algorithms are SIFT, PCA-SIFT, and SURF [63-65]. On the other hand, template-matching algorithms, or classifiers, are used to match and classify features. AdaBoost, SVM, and LDA are some of the algorithms used for template matching [68-70]. While these algorithms are still being used, they lack generality: even a marginal change in object appearance can cause false negatives. With the development of deep learning algorithms, the YOLO family and the R-CNN series have been widely used for traffic light detection [62]. Most of the recent research has focused on optimizing YOLO algorithms [69-73]. Most notably, the most recent version of YOLO, YOLOv8, has been optimized for traffic light detection in [72], achieving a high mean average precision of 98.5%, compared with the implementation of Faster R-CNN in [74], which achieves a maximum mean average precision of 86.4%. To train these algorithms, LISA, Bosch, and DriveU are some of the main datasets created specifically for traffic light color detection [75-77].
IV. HARDWARE ACCELERATORS

Recent advancements in computer vision algorithms have been primarily driven by deep learning and the availability of extensive datasets. Hardware acceleration has played a significant role in this progress, providing parallel computing architectures that enable the efficient training and execution of complex neural networks. State-of-the-art processors such as those manufactured by Tesla, NVIDIA, Mobileye, and Qualcomm have been among the most widely used accelerators in the industry to power autonomous vehicles. However, FPGAs and TPUs are other hardware accelerators that hold great potential for use in autonomous vehicles. When hardware accelerators are combined properly and optimized, they can make up for each other's drawbacks, paving the path for attractive heterogeneous hardware solutions. In this section, an overview is first given of the different state-of-the-art processors used in AVs, concluding with a comparison between them.

A. State-of-the-Art Processors Targeting AVs

In the fast-changing world of AVs, the core of advanced technology lies in state-of-the-art processors. Companies like NVIDIA, Tesla, Qualcomm, and Mobileye lead the way in shaping the intelligence and effectiveness of self-driving systems using their own hardware accelerators. Unlike NVIDIA, the other manufacturers do not commercialize their respective processors, which may hinder their progress in both the software and hardware areas. This has led NVIDIA to lead the race by offering cutting-edge processors effectively used not only in AVs but also in other related areas such as generative AI, the metaverse, and robotics. Indeed, most of the algorithms dedicated to AVs were developed on NVIDIA platforms. This section explores the details of these powerful processors, their unique features, innovations, and contributions to improving self-driving technology.

1) TESLA
In 2019, Tesla introduced Hardware 3.0 (HW3), its dedicated AI self-driving hardware supporting Full Self-Driving (FSD) technology [78]. This custom-designed chip is built on Samsung's 14 nm process [79]. As shown in Fig 12, it integrates 3 quad-core Cortex-A72 clusters, totalling 12 ARM Cortex-A72 CPUs operating at 2.2 GHz, 2 neural processing units (NPUs) operating at 2 GHz and achieving a peak performance of 36.86 TOPS, and a GPU operating at 1 GHz with a capacity of 600 GFLOPS [79]. The FSD chip also features an image signal processor (ISP) for handling the eight High Dynamic Range (HDR) sensors, an H.265 video encoder, and a camera serial interface (CSI) for managing the sensors, along with a conventional memory subsystem supporting 128-bit LPDDR4 memory at 2133 MHz [80]. The system features two independent FSD chips, each with its dedicated storage and operating system [78]. In case of a primary chip failure, the backup unit seamlessly takes over. Notably, the HW3 outperforms the previous NVIDIA DRIVE PX 2 AI platform, delivering 36.86 TOPS compared to the previous 21 TOPS [78]. The FSD computer consumes 72 Watts, with 15 Watts attributed to the NPUs [80]. Various object detection algorithms are employed in Tesla cars to recognize and monitor objects within the visual scope of the vehicle; these include conventional computer vision methods like HOG as well as more sophisticated deep learning methodologies such as YOLO and R-CNN [81].

Figure 12. FSD Block Diagram [79]

2) NVIDIA
NVIDIA Jetson is a series of low-power computing boards integrating an ARM-architecture CPU and a GPU whose tensor cores accelerate machine learning applications [82]. Most notably, the Jetson Xavier, Jetson Nano, and Jetson Orin have been used for autonomous vehicle applications.

The NVIDIA Jetson AGX Orin, released in 2023, is programmable using CUDA and Tensor APIs and libraries, offering 275 TOPS with power configurable between 15 W and 60 W [83]. Jetson AGX Orin modules feature the NVIDIA Orin SoC, which is built on an 8 nm process and combines an NVIDIA Ampere architecture GPU, an Arm Cortex-A78AE CPU, next-generation deep learning and vision accelerators, and an H.264/H.265 video encoder and decoder. Furthermore, it supports LPDDR5 memory with a DRAM capacity of 32 GB or 64 GB.

Figure 13. NVIDIA Jetson Orin AGX Block Diagram [122]

In addition to being very powerful, the other main advantages of this processor are its wide availability to researchers and its powerful software development kit. Thus, Kortli et al. [84] proposed a lane detection algorithm based on a CNN encoder-decoder and Long Short-Term Memory (LSTM) networks, implemented on the NVIDIA Jetson Xavier. Notably, it achieves a frame rate of 6.78 FPS and takes 147 ms to process a 1280x720 input image, compared with an Intel Core i7-2630QM CPU, which achieves a frame rate of only 3.62 FPS and an execution time of 276 ms. In [85], LW-YOLOv4-tiny is implemented on the NVIDIA Jetson Nano for rapid object detection, achieving an execution speed of 56.1 FPS.
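Frame-rate comparisons of this kind can be reproduced with a generic timing loop; the sketch below is a hedged example (model choice, input resolution, and iteration counts are arbitrary, and it assumes a recent PyTorch/torchvision install), not the benchmarking procedure used in [84] or [85]:

```python
# Generic throughput measurement sketch: the same loop can be run on a desktop
# GPU, a Jetson board, or a CPU to compare FPS for a given model.
import time
import torch
import torchvision

device = "cuda" if torch.cuda.is_available() else "cpu"
model = torchvision.models.mobilenet_v2(weights=None).eval().to(device)
frame = torch.randn(1, 3, 720, 1280, device=device)   # one 1280x720 RGB frame

with torch.no_grad():
    for _ in range(5):                                  # warm-up iterations
        model(frame)
    if device == "cuda":
        torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(50):
        model(frame)
    if device == "cuda":
        torch.cuda.synchronize()
    elapsed = time.perf_counter() - start

print(f"{50 / elapsed:.1f} FPS, {1000 * elapsed / 50:.1f} ms per frame")
```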
Automotive manufacturers like Audi, Mercedes-Benz, and Volvo have partnered with NVIDIA to incorporate NVIDIA Jetson into their autonomous vehicles, aiming to achieve advanced self-driving capabilities [86-88].
3) Qualcomm Snapdragon
In January 2022, Qualcomm launched the Snapdragon Ride Vision System, employing cutting-edge 4-nanometer processing technology in a flexible and scalable vision software stack [89]. Integrated with the proven Vision Stack, it enhances front and surround-view cameras for ADAS and automated driving [89]. The Snapdragon Ride SoC, a key element of the hardware platform, is tailored for ADAS needs, featuring machine learning processors, image signal processors, vision and graphics acceleration, dedicated DSPs, GPU technology, a multi-core ARM-based CPU, and safety and security systems [90]. With excellent thermal efficiency, it delivers 30 TOPS for L1/L2 applications and over 700 TOPS at 130 W for L4/L5 autonomous driving [91].

4) Mobileye
Compared with its predecessors, it features two important GPUs: a small-scale ARM Mali GPU for AR image overlay, and a second, unidentified GPU dedicated to handling OpenCL for stereo matching [94]. Furthermore, Mobileye launched the EyeQ Ultra, shown in Fig 14 [95], which is expected to power autonomous vehicles from 2025. The EyeQ Ultra is built on a 5 nm process and has 12 CPU cores with 24 threads based on the RISC-V architecture, a GPU, a vision processor, an image signal processing core, and 16 convolutional neural network clusters. Furthermore, it can encode H.264/H.265 video and supports LPDDR5X memory. Compared with NVIDIA, which focuses on deep learning algorithms, Mobileye solutions still utilize conventional computer vision algorithms aided by deep learning algorithms [96]. Some of those solutions include True Redundancy for Sensor Fusion, Road Experience Management, and Intelligent Speed Assist [96], [97]. However, while the solutions are known to the public, most of the underlying CNN algorithms used by Mobileye remain undisclosed. Automotive manufacturers such as Ford, NIO, Volkswagen, BMW, and Nissan have collaborated with Mobileye to incorporate its EyeQ solution [98], [99].

TABLE III
SPECIFICATIONS OF FPGA BOARDS USED IN CNN ALGORITHMS
These specialized devices offer a tailored solution for complex AI and deep learning tasks within AV systems, granting high flexibility, high performance, and low power in hardware implementations [111]. Developed as a stand-alone device, the TPU is finely tuned for neural networks and is designed to work seamlessly with the Google TensorFlow framework [6]. This ASIC targets high volumes of low-precision arithmetic, particularly 8-bit calculations, and has already been leveraged across various applications at Google, including the search engine and AlphaGo [6].

The TPU v4 model comprises four chips, each with two cores, as shown in Fig 15, and can compute more than 275 teraflops (BF16 or INT8) [112]. These cores incorporate scalar units, vector units, and four 128x128 matrix units, all interconnected with on-chip 32 GB high bandwidth memory (HBM) to facilitate systolic matrix calculations. Notably, the TPU's performance is heightened by its ability to execute 16K multiply-accumulate operations per cycle through one matrix unit per core employing BF16 precision. Moreover, other ASIC solutions, such as the Edge TPU AI accelerator, can achieve a remarkable 4 TOPS while consuming just 2 watts of power [113]. For instance, it can efficiently run cutting-edge mobile vision models like MobileNet V2 at nearly 400 FPS while conserving power [113].

Figure 15. TPU v4 Chip [112]

A study conducted by [119] showed that Google's TPU v4 outperforms the NVIDIA A100 GPU, demonstrating 1.2 to 1.7 times faster speed while simultaneously consuming 1.3 to 1.9 times less power. In another study [111], a comparative analysis of ASICs with other hardware accelerators, including CPUs and GPUs, in the context of autonomous driving tasks unveiled several significant insights. Firstly, ASICs exhibit a substantial reduction in power consumption, with almost a seven-fold improvement in energy efficiency for tasks like object detection. Additionally, when compared with power-hungry accelerators like GPUs, ASICs have the potential to significantly mitigate thermal constraints, limiting the reduction in the vehicle's driving range to under 5%. Furthermore, ASIC-accelerated systems can markedly enhance system performance, reducing tail latency by a substantial factor, up to 93 times. This underscores their crucial role in maintaining consistent and responsive operation in AV systems when compared with GPUs, thereby ensuring reliability and safety in real-time applications. Notably, specialized ASICs like Google's TPU excel in lower-precision calculations, providing high throughput for training and inference in neural networks [120].

3) Heterogeneous Hardware Platforms
Table IV offers insight into the performance of different hardware implementations running algorithms like SSD, CNN, YOLO, MobileNet, and others for object detection and classification. Each device is evaluated in terms of latency, accuracy, execution time, and power consumption. Notable findings include the diverse performance characteristics, with GPUs generally providing fast execution times but higher power consumption, while FPGAs and ASICs offer impressive accuracy with low power usage. These insights can be valuable for selecting the right hardware for specific algorithmic applications targeting AVs. Additionally, Table IV underscores the significance of heterogeneous hardware accelerators in modern computing. As the demands of various algorithms and datasets vary significantly, the availability of diverse hardware options is critical. Heterogeneous hardware accelerators enable organizations and researchers to tailor their hardware choices to align with their algorithmic and computational goals, ultimately leading to more efficient and effective implementations across a broad spectrum of applications.
TABLE IV
COMPARISON OF DIFFERENT HARDWARE IMPLEMENTATIONS ACROSS VARIOUS OBJECT DETECTION ALGORITHMS USED IN AVS

Type | Source | Device | Algorithm | Dataset | Latency | Execution Time | Power Consumption
CPU | [114] | Intel Core i7-4770 | CBFF-SSD | NWPU VHR-10 dataset | 382.15 | - | 65 W
CPU | [115] | Intel i7-7700 | CNN for traffic sign detection | LISA dataset | - | 136.2 ms | 76 W
GPU | [116] | NVIDIA GTX1060 | YOLOv2 | COCO dataset | - | 30.3 ms | 19 W
GPU | [117] | NVIDIA Jetson Xavier | MobileNet V2 | COCO dataset | 2.57 | 24039 ms | 10.47 W
GPU | [117] | NVIDIA Jetson Xavier | Inception V3 | COCO dataset | 14.51 | 42808 ms | 21.84 W
FPGA | [118] | Xilinx ZYNQ ZC706 | Speed-sign recognition algorithm | Real-time video input | 5.376 | 0.244 s | 5.376 W
FPGA | [115] | Xilinx ZCU102 | CNN for stop-sign detection | Real-time video input | - | 7.9 ms | 5.2 W
FPGA | [116] | Intel Arria 10 GX | Speed-sign recognition algorithm | LISA dataset | - | 33.3 ms | 12.5 W
ASIC | [117] | Google Edge TPU | MobileNet V2 | COCO dataset | 3.5 | 6051 ms | 4.89 W
ASIC | [117] | Google Edge TPU | Inception V3 | COCO dataset | 52.77 | 17456 ms | 4.68 W
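One way to read Table IV is in terms of energy (power multiplied by execution time). Taking the Execution Time column at face value, a small back-of-the-envelope calculation for two of the MobileNet V2 rows is shown below:

```python
# Energy = average power (W) x execution time (s), for two rows of Table IV.
entries = {
    "Jetson Xavier / MobileNet V2": {"power_w": 10.47, "time_s": 24.039},
    "Edge TPU / MobileNet V2":      {"power_w": 4.89,  "time_s": 6.051},
}
for name, e in entries.items():
    print(f"{name}: {e['power_w'] * e['time_s']:.1f} J")
# Jetson Xavier / MobileNet V2: 251.7 J
# Edge TPU / MobileNet V2: 29.6 J
```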
V. CONCLUSION https://semiengineering.com/enabling-integrated-adas-domain-
controllers-with-automotive-ip/
In conclusion, the evolving landscape of AVs demands a [11] F. ’Narisawa et al., “Vehicle Electronic Control Units for
meticulous integration of hardware accelerators and Autonomous Driving in Safety and Comfort,” Hitachi Review.
sophisticated machine vision algorithms. This review paper Accessed: Oct. 23, 2023. [Online]. Available:
https://www.hitachi.com/rev/archive/2022/r2022_01/01c01/index
has presented a comprehensive examination of different types .html
of hardware accelerators and their features and sophisticated [12] C. Wang, X. Wang, H. Hu, Y. Liang, and G. Shen, “On the
machine vision algorithms generally used for AVs, shedding Application of Cameras Used in Autonomous Vehicles,” Archives
light on the advancements in the field. The evolution of GPU- of Computational Methods in Engineering, vol. 29, no. 6, pp.
4319–4339, Oct. 2022, doi: 10.1007/s11831-022-09741-8.
based hardware accelerators has been fundamental in [13] J. ’Shepard, “How many types of radar are there?,” Sensor Tips.
addressing the computational demands of real-time Accessed: Oct. 23, 2023. [Online]. Available:
processing for commercial autonomous vehicles. https://www.sensortips.com/featured/how-many-types-of-radar-
Simultaneously, the development of machine vision are-there-faq/
[14] P. Wei, L. Cagle, T. Reza, J. Ball, and J. Gafford, “LiDAR and
algorithms has been instrumental in enhancing the perception Camera Detection Fusion in a Real-Time Industrial Multi-Sensor
of autonomous vehicles. However, to meet the increasing Collision Avoidance System,” Electronics (Basel), vol. 7, no. 6, p.
computational demands of machine vision algorithms for 84, May 2018, doi: 10.3390/electronics7060084.
autonomous vehicles, it is vital to consider other potential [15] B. ’Dickson, “Is camera-only the future of self-driving cars?,”
ADAS & Autonomous Vehicle International. Accessed: Oct. 23,
solutions such as FPGAs and TPUs and how can they be 2023. [Online]. Available:
integrated into autonomous vehicles to offload some tasks https://www.autonomousvehicleinternational.com/features/is-
from commercial hardware accelerators, paving the path for camera-only-the-future-of-self-driving-cars.html
new heterogenous hardware solutions in autonomous [16] T. ’Thadani, R. ’Lerman, I. ’Piper, F. ’Siddiqui, and I. ’Uraizee,
“The final 11 seconds of a fatal Tesla Autopilot crash,” The
vehicles. The synergy between hardware accelerators and Washington Post. Accessed: Dec. 04, 2023. [Online]. Available:
machine vision algorithms has paved the way for https://www.washingtonpost.com/technology/interactive/2023/tes
advancements in autonomous vehicle technology. Looking la-autopilot-crash-analysis/
into the future, the ongoing collaboration between [17] K. ’Power, S. ’Deva, T. ’Wang, J. ’Li, and C. ’Eising, “Hardware
Accelerators in Autonomous Driving,” in Proceedings of the Irish
researchers, engineers, and the industry will yield more Machine Vision and Image Processing Conference 2023, USA,
robust hardware accelerators to meet the ever-increasing Aug. 2023.
computational demands of machine vision algorithms to [18] G. ’Sanders, “Autonomous Vehicle Sensors - Making Sense of The
tackle the challenges faced in the field. World,” Wards Intelligence. Accessed: Oct. 23, 2023. [Online].
Available:
https://wardsintelligence.informa.com/WI965823/Autonomous-
Vehicle-Sensors---Making-Sense-of-The-World
VI. REFERENCE [19] M. ’Frąckiewicz, “The Role of AI Hardware Accelerators in
Autonomous Vehicles,” TS2 SPACE. Accessed: Oct. 23, 2023.
[Online]. Available: https://ts2.space/en/the-role-of-ai-hardware-
[1] H. ’Field, “California DMV suspends Cruise’s self-driving car
accelerators-in-autonomous-vehicles/
permits, effective immediately,” CNBC. Accessed: Nov. 16, 2023.
[20] V. Alonso, A. Dacal-Nieto, L. Barreto, A. Amaral, and E. Rivero,
[Online]. Available:
“Industry 4.0 implications in machine vision metrology: an
https://www.cnbc.com/2023/10/24/california-dmv-suspends-
overview,” Procedia Manuf, vol. 41, pp. 359–366, 2019, doi:
cruises-self-driving-car-permits.html
10.1016/j.promfg.2019.09.020.
[2] “Road traffic injuries,” World Health Organization. Accessed:
[21] I. Sonata, Y. Heryadi, L. Lukas, and A. Wibowo, “Autonomous
Oct. 15, 2023. [Online]. Available: https://www.who.int/news-
car using CNN deep learning algorithm,” J Phys Conf Ser, vol.
room/fact-sheets/detail/road-traffic-injuries
1869, no. 1, p. 012071, Apr. 2021, doi: 10.1088/1742-
[3] Wilson Kehoe Winingham staff, “Common Causes of Car
6596/1869/1/012071.
Accidents,” Wilson Kehoe Winingham. Accessed: Oct. 15, 2023.
[22] K. Othman, “Exploring the implications of autonomous vehicles:
[Online]. Available: https://www.wkw.com/auto-
a comprehensive review,” Innovative Infrastructure Solutions, vol.
accidents/blog/10-common-causes-traffic-accidents/
7, no. 2, p. 165, Apr. 2022, doi: 10.1007/s41062-022-00763-6.
[4] C. Badue et al., “Self-driving cars: A survey,” Expert Syst Appl,
[23] A. Balasubramaniam and S. Pasricha, “Object Detection in
vol. 165, p. 113816, Mar. 2021, doi: 10.1016/j.eswa.2020.113816.
Autonomous Vehicles: Status and Open Challenges,” Jan. 2022.
[5] S. Devi, P. Malarvezhi, R. Dayana, and K. Vadivukkarasi, “A
[24] “Model 3 Owner’s Manual,” Tesla. Accessed: Sep. 20, 2023.
Comprehensive Survey on Autonomous Driving Cars: A
[Online]. Available:
Perspective View,” Wirel Pers Commun, vol. 114, no. 3, pp. 2121–
https://www.tesla.com/ownersmanual/model3/en_jo/GUID-
2133, Oct. 2020, doi: 10.1007/s11277-020-07468-y.
682FF4A7-D083-4C95-925A-5EE3752F4865.html
[6] X. Feng, Y. Jiang, X. Yang, M. Du, and X. Li, “Computer vision
[25] “Volvo Cars to launch UK’s largest and most ambitious
algorithms and hardware implementations: A survey,” Integration,
autonomous driving trial,” Volvo Cars Global Media Newsroom.
vol. 69, pp. 309–320, Nov. 2019, doi: 10.1016/j.vlsi.2019.07.005.
Accessed: Oct. 15, 2023. [Online]. Available:
[7] D. J. Yeong, G. Velasco-Hernandez, J. Barry, and J. Walsh,
https://www.media.volvocars.com/global/en-
“Sensor and Sensor Fusion Technology in Autonomous Vehicles:
gb/media/pressreleases/189969/volvo-cars-to-launch-uks-largest-
A Review,” Sensors, vol. 21, no. 6, p. 2140, Mar. 2021, doi:
and-most-ambitious-autonomous-driving-trial
10.3390/s21062140.
[26] S. Singh and B. S. Saini, “Autonomous cars: Recent developments,
[8] M. Hasanujjaman, M. Z. Chowdhury, and Y. M. Jang, “Sensor
challenges, and possible solutions,” IOP Conf Ser Mater Sci Eng,
Fusion in Autonomous Vehicle with Traffic Surveillance Camera
vol. 1022, no. 1, p. 012028, Jan. 2021, doi: 10.1088/1757-
System: Detection, Localization, and AI Networking,” Sensors,
899X/1022/1/012028.
vol. 23, no. 6, p. 3335, Mar. 2023, doi: 10.3390/s23063335.
[27] P. ’Valdes, “GM self-driving car subsidiary withheld video of a
[9] J. Vargas, S. Alsweiss, O. Toker, R. Razdan, and J. Santos, “An
crash, California DMV says,” CNN Business . Accessed: Oct. 28,
Overview of Autonomous Vehicles Sensors and Their
2023. [Online]. Available:
Vulnerability to Weather Conditions,” Sensors, vol. 21, no. 16, p.
https://edition.cnn.com/2023/10/24/business/california-dmv-
5397, Aug. 2021, doi: 10.3390/s21165397.
cruise-permit-revoke/index.html
[10] R. ’Digiuseppe, “Enabling Integrated ADAS Domain Controllers
[28] S. Casas, W. Luo, and R. Urtasun, “IntentNet: Learning to Predict
With Automotive IP,” Semiconductor Engineering. Accessed:
Intention from Raw Sensor Data,” 2018.
Oct. 23, 2023. [Online]. Available:
11
[29] W. Luo, B. Yang, and R. Urtasun, “Fast and Furious: Real Time Intelligent Transportation Systems (ITSC), IEEE, Nov. 2016, pp.
End-to-End 3D Detection, Tracking and Motion Forecasting with 2475–2480. doi: 10.1109/ITSC.2016.7795954.
a Single Convolutional Net,” Dec. 2020, [Online]. Available: [48] D. Tian, Y. Han, B. Wang, T. Guan, and W. Wei, “A Review of
http://arxiv.org/abs/2012.12395 Intelligent Driving Pedestrian Detection Based on Deep
[30] E. Xie et al., “DetCo: Unsupervised Contrastive Learning for Learning,” Comput Intell Neurosci, vol. 2021, pp. 1–16, Jul. 2021,
Object Detection,” in 2021 IEEE/CVF International Conference doi: 10.1155/2021/5410049.
on Computer Vision (ICCV), IEEE, Oct. 2021, pp. 8372–8381. doi: [49] P. Felzenszwalb, D. McAllester, and D. Ramanan, “A
10.1109/ICCV48922.2021.00828. discriminatively trained, multiscale, deformable part model,” in
[31] Z. Dai, B. Cai, Y. Lin, and J. Chen, “UP-DETR: Unsupervised Pre- 2008 IEEE Conference on Computer Vision and Pattern
training for Object Detection with Transformers,” in 2021 Recognition, IEEE, Jun. 2008, pp. 1–8. doi:
IEEE/CVF Conference on Computer Vision and Pattern 10.1109/CVPR.2008.4587597.
Recognition (CVPR), IEEE, Jun. 2021, pp. 1601–1610. doi: [50] Y. Xiao et al., “Deep learning for occluded and multi‐scale
10.1109/CVPR46437.2021.00165. pedestrian detection: A review,” IET Image Process, vol. 15, no.
[32] A. Bar et al., “DETReg: Unsupervised Pretraining with Region 2, pp. 286–301, Feb. 2021, doi: 10.1049/ipr2.12042.
Priors for Object Detection,” in 2022 IEEE/CVF Conference on [51] W. Liu, S. Liao, W. Hu, X. Liang, and X. Chen, “Learning
Computer Vision and Pattern Recognition (CVPR), IEEE, Jun. Efficient Single-Stage Pedestrian Detectors by Asymptotic
2022, pp. 14585–14595. doi: 10.1109/CVPR52688.2022.01420. Localization Fitting,” 2018, pp. 643–659. doi: 10.1007/978-3-030-
[33] P. Sun et al., “Scalability in Perception for Autonomous Driving: 01264-9_38.
Waymo Open Dataset,” in 2020 IEEE/CVF Conference on [52] S. K. Divvala, D. Hoiem, J. H. Hays, A. A. Efros, and M. Hebert,
Computer Vision and Pattern Recognition (CVPR), IEEE, Jun. “An empirical study of context in object detection,” in 2009 IEEE
2020, pp. 2443–2451. doi: 10.1109/CVPR42600.2020.00252. Conference on Computer Vision and Pattern Recognition, IEEE,
[34] T. Turay and T. Vladimirova, “Toward Performing Image Jun. 2009, pp. 1271–1278. doi: 10.1109/CVPR.2009.5206532.
Classification and Object Detection With Convolutional Neural [53] Z. Cai, M. Saberian, and N. Vasconcelos, “Learning Complexity-
Networks in Autonomous Driving Systems: A Survey,” IEEE Aware Cascades for Deep Pedestrian Detection,” in 2015 IEEE
Access, vol. 10, pp. 14076–14119, 2022, doi: International Conference on Computer Vision (ICCV), IEEE, Dec.
10.1109/ACCESS.2022.3147495. 2015, pp. 3361–3369. doi: 10.1109/ICCV.2015.384.
[35] Y. Cao, C. Li, Y. Peng, and H. Ru, “MCS-YOLO: A Multiscale [54] W. Lan, J. Dang, Y. Wang, and S. Wang, “Pedestrian Detection
Object Detection Method for Autonomous Driving Road Based on YOLO Network Model,” in 2018 IEEE International
Environment Recognition,” IEEE Access, vol. 11, pp. 22342– Conference on Mechatronics and Automation (ICMA), IEEE, Aug.
22354, 2023, doi: 10.1109/ACCESS.2023.3252021. 2018, pp. 1547–1551. doi: 10.1109/ICMA.2018.8484698.
[36] D. Parekh et al., “A Review on Autonomous Vehicles: Progress, [55] J. Wang, H. Li, S. Yin, and Y. Sun, “Research on Improved
Methods and Challenges,” Electronics (Basel), vol. 11, no. 14, p. Pedestrian Detection Algorithm Based on Convolutional Neural
2162, Jul. 2022, doi: 10.3390/electronics11142162. Network,” in 2019 International Conference on Internet of Things
[37] Y. Wang, H. Wang, and Z. Xin, “Efficient Detection Model of (iThings) and IEEE Green Computing and Communications
Steel Strip Surface Defects Based on YOLO-V7,” IEEE Access, (GreenCom) and IEEE Cyber, Physical and Social Computing
vol. 10, pp. 133936–133944, 2022, doi: (CPSCom) and IEEE Smart Data (SmartData), IEEE, Jul. 2019,
10.1109/ACCESS.2022.3230894. pp. 254–258. doi:
[38] H. Slimani, J. El Mhamdi, and A. Jilbab, “Artificial Intelligence- 10.1109/iThings/GreenCom/CPSCom/SmartData.2019.00063.
based Detection of Fava Bean Rust Disease in Agricultural [56] R. Ayachi, M. Afif, Y. Said, and A. Ben Abdelaali, “pedestrian
Settings: An Innovative Approach,” International Journal of detection for advanced driving assisting system: a transfer learning
Advanced Computer Science and Applications, vol. 14, no. 6, approach,” in 2020 5th International Conference on Advanced
2023, doi: 10.14569/IJACSA.2023.0140614. Technologies for Signal and Image Processing (ATSIP), IEEE,
[39] J. Terven and D. Cordova-Esparza, “A Comprehensive Review of YOLO: From YOLOv1 and Beyond,” Apr. 2023. [Online]. Available: http://arxiv.org/abs/2304.00501
[40] L. Liu, C. Ke, H. Lin, and H. Xu, “Research on Pedestrian Detection Algorithm Based on MobileNet-YoLo,” Comput Intell Neurosci, vol. 2022, pp. 1–12, Oct. 2022, doi: 10.1155/2022/8924027.
[41] X. Zhai, Z. Huang, T. Li, H. Liu, and S. Wang, “YOLO-Drone: An Optimized YOLOv8 Network for Tiny UAV Object Detection,” Electronics (Basel), vol. 12, no. 17, p. 3664, Aug. 2023, doi: 10.3390/electronics12173664.
[42] J. Kim, J.-Y. Sung, and S. Park, “Comparison of Faster-RCNN, YOLO, and SSD for Real-Time Vehicle Type Recognition,” in 2020 IEEE International Conference on Consumer Electronics - Asia (ICCE-Asia), IEEE, Nov. 2020, pp. 1–4. doi: 10.1109/ICCE-Asia49877.2020.9277040.
[43] S. A. Sanchez, H. J. Romero, and A. D. Morales, “A review: Comparison of performance metrics of pretrained models for object detection using the TensorFlow framework,” IOP Conf Ser Mater Sci Eng, vol. 844, p. 012024, Jun. 2020, doi: 10.1088/1757-899X/844/1/012024.
[44] Y. Xing et al., “Advances in Vision-Based Lane Detection: Algorithms, Integration, Assessment, and Perspectives on ACP-Based Parallel Vision,” IEEE/CAA Journal of Automatica Sinica, vol. 5, no. 3, pp. 645–661, May 2018, doi: 10.1109/JAS.2018.7511063.
[45] M. Saranya, N. Archana, M. Janani, and R. Keerthishree, “Lane Detection in Autonomous Vehicles Using AI,” 2023, pp. 15–30. doi: 10.1007/978-3-031-38669-5_2.
[46] W. Hao, “Review on lane detection and related methods,” Cognitive Robotics, vol. 3, pp. 135–141, 2023, doi: 10.1016/j.cogr.2023.05.004.
[47] Bei He, Rui Ai, Yang Yan, and Xianpeng Lang, “Lane marking detection based on Convolution Neural Network from point clouds,” in 2016 IEEE 19th International Conference on …
[56] … Sep. 2020, pp. 1–5. doi: 10.1109/ATSIP49331.2020.9231559.
[57] D. Tian, Y. Han, B. Wang, T. Guan, and W. Wei, “A Review of Intelligent Driving Pedestrian Detection Based on Deep Learning,” Comput Intell Neurosci, vol. 2021, pp. 1–16, Jul. 2021, doi: 10.1155/2021/5410049.
[58] N. Triki, M. Karray, and M. Ksantini, “A Real-Time Traffic Sign Recognition Method Using a New Attention-Based Deep Convolutional Neural Network for Smart Vehicles,” Applied Sciences, vol. 13, no. 8, p. 4793, Apr. 2023, doi: 10.3390/app13084793.
[59] Y. Wei, M. Gao, J. Xiao, C. Liu, Y. Tian, and Y. He, “Research and Implementation of Traffic Sign Recognition Algorithm Model Based on Machine Learning,” Journal of Software Engineering and Applications, vol. 16, no. 06, pp. 193–210, 2023, doi: 10.4236/jsea.2023.166011.
[60] Y. Li, J. Li, and P. Meng, “Attention-YOLOV4: a real-time and high-accurate traffic sign detection algorithm,” Multimed Tools Appl, vol. 82, no. 5, pp. 7567–7582, Feb. 2023, doi: 10.1007/s11042-022-13251-x.
[61] T. P. Dang, N. T. Tran, V. H. To, and M. K. Tran Thi, “Improved YOLOv5 for real-time traffic signs recognition in bad weather conditions,” J Supercomput, vol. 79, no. 10, pp. 10706–10724, Jul. 2023, doi: 10.1007/s11227-023-05097-3.
[62] R. Kulkarni, S. Dhavalikar, and S. Bangar, “Traffic Light Detection and Recognition for Self Driving Cars Using Deep Learning,” in 2018 Fourth International Conference on Computing Communication Control and Automation (ICCUBEA), IEEE, Aug. 2018, pp. 1–4. doi: 10.1109/ICCUBEA.2018.8697819.
[63] M. Takaki and H. Fujiyoshi, “Traffic Sign Recognition Using SIFT Features,” IEEJ Transactions on Electronics, Information and Systems, vol. 129, no. 5, pp. 824–831, 2009, doi: 10.1541/ieejeiss.129.824.
[64] Md. Z. Abedin, P. Dhar, and K. Deb, “Traffic Sign Recognition using SURF: Speeded up robust feature descriptor and artificial
neural network classifier,” in 2016 9th International Conference on Electrical and Computer Engineering (ICECE), IEEE, Dec. 2016, pp. 198–201. doi: 10.1109/ICECE.2016.7853890.
[65] H. Gao, C. Liu, Y. Yu, and B. Li, “Traffic signs recognition based on PCA-SIFT,” in Proceeding of the 11th World Congress on Intelligent Control and Automation, IEEE, Jun. 2014, pp. 5070–5076. doi: 10.1109/WCICA.2014.7053576.
[66] X.-H. Wu, R. Hu, and Y.-Q. Bao, “Pedestrian traffic light detection in complex scene using AdaBoost with multi-layer features,” 2018.
[67] Z. Ozcelik, C. Tastimur, M. Karakose, and E. Akin, “A vision based traffic light detection and recognition approach for intelligent vehicles,” in 2017 International Conference on Computer Science and Engineering (UBMK), IEEE, Oct. 2017, pp. 424–429. doi: 10.1109/UBMK.2017.8093430.
[68] A. V. Deshpande, “Design Approach for a Novel Traffic Sign Recognition System by Using LDA and Image Segmentation by Exploring the Color and Shape Features of an Image,” 2014. [Online]. Available: https://www.researchgate.net/publication/342674166
[69] M. DeRong and T. ZhongMei, “Remote Traffic Light Detection and Recognition Based on Deep Learning,” in 2023 6th World Conference on Computing and Communication Technologies (WCCCT), IEEE, Jan. 2023, pp. 194–198. doi: 10.1109/WCCCT56755.2023.10052610.
[70] P. Liu and T. Li, “Traffic light detection based on depth improved YOLOV5,” in 2023 3rd International Conference on Neural Networks, Information and Communication Engineering (NNICE), IEEE, Feb. 2023, pp. 395–399. doi: 10.1109/NNICE58320.2023.10105786.
[71] N. H. Sarhan and A. Y. Al-Omary, “Traffic light Detection using OpenCV and YOLO,” in 2022 International Conference on Innovation and Intelligence for Informatics, Computing, and Technologies (3ICT), IEEE, Nov. 2022, pp. 604–608. doi: 10.1109/3ICT56508.2022.9990723.
[72] H. T. Ngoc, K. H. Nguyen, H. K. Hua, H. V. N. Nguyen, and L.-D. Quach, “Optimizing YOLO Performance for Traffic Light Detection and End-to-End Steering Control for Autonomous Vehicles in Gazebo-ROS2,” International Journal of Advanced Computer Science and Applications, vol. 14, no. 7, 2023, doi: 10.14569/IJACSA.2023.0140752.
[73] S. Pavlitska, N. Lambing, A. K. Bangaru, and J. M. Zöllner, “Traffic Light Recognition using Convolutional Neural Networks: A Survey,” Sep. 2023. [Online]. Available: http://arxiv.org/abs/2309.02158
[74] S.-Y. Lin and H.-Y. Lin, “A Two-Stage Framework for Diverse Traffic Light Recognition Based on Individual Signal Detection,” 2022, pp. 265–278. doi: 10.1007/978-3-031-04112-9_20.
[75] “LISA Traffic Light Dataset,” Kaggle. Accessed: Dec. 06, 2023. [Online]. Available: https://www.kaggle.com/datasets/mbornoe/lisa-traffic-light-dataset
[76] “Bosch Small Traffic Lights Dataset,” Heidelberg Collaboratory for Image Processing (HCI). Accessed: Dec. 06, 2023. [Online]. Available: https://hci.iwr.uni-heidelberg.de/content/bosch-small-traffic-lights-dataset
[77] “DriveU Traffic Light Dataset (DTLD),” UULM. Accessed: Dec. 06, 2023. [Online]. Available: https://www.uni-ulm.de/in/iui-drive-u/projekte/driveu-traffic-light-dataset/
[78] Steve, “Tesla Hardware 3 (Full Self-Driving Computer) Detailed,” AutoPilot Review. Accessed: Sep. 22, 2023. [Online]. Available: https://www.autopilotreview.com/tesla-custom-ai-chips-hardware-3/
[79] “FSD chip - tesla,” WikiChip. Accessed: Sep. 22, 2023. [Online]. Available: https://en.wikichip.org/wiki/tesla_(car_company)/fsd_chip
[80] C. Bos, “Tesla’s New HW3 Self-Driving Computer - It’s A Beast (CleanTechnica Deep Dive),” CleanTechnica. Accessed: Sep. 22, 2023. [Online]. Available: https://cleantechnica.com/2019/06/15/teslas-new-hw3-self-driving-computer-its-a-beast-cleantechnica-deep-dive/
[81] A. Mishra, “Decoding the Technology Behind Tesla Autopilot: How it Works,” Medium. Accessed: Nov. 26, 2023. [Online]. Available: https://ai.plainenglish.io/decoding-the-technology-behind-tesla-autopilot-how-it-works-af92cdd5605f
[82] “Jetson - Embedded AI Computing Platform,” NVIDIA Developer. Accessed: Nov. 24, 2023. [Online]. Available: https://developer.nvidia.com/embedded-computing
[83] “NVIDIA Jetson Orin,” NVIDIA. Accessed: Nov. 25, 2023. [Online]. Available: https://www.nvidia.com/en-us/autonomous-machines/embedded-systems/jetson-orin/
[84] Y. Kortli, S. Gabsi, L. F. C. L. Y. Voon, M. Jridi, M. Merzougui, and M. Atri, “Deep embedded hybrid CNN–LSTM network for lane detection on NVIDIA Jetson Xavier NX,” Knowl Based Syst, vol. 240, p. 107941, Mar. 2022, doi: 10.1016/j.knosys.2021.107941.
[85] B. R. Chang, H.-F. Tsai, and C.-W. Hsieh, “Accelerating the Response of Self-Driving Control by Using Rapid Object Detection and Steering Angle Prediction,” Electronics (Basel), vol. 12, no. 10, p. 2161, May 2023, doi: 10.3390/electronics12102161.
[86] “Mercedes-Benz and Nvidia to build autonomous driving software,” Autovista24. Accessed: Sep. 23, 2023. [Online]. Available: https://autovista24.autovistagroup.com/news/mercedes-benz-and-nvidia-build-autonomous-driving-software/
[87] “Volvo Cars deepens collaboration with NVIDIA; next-generation self-driving Volvos powered by NVIDIA DRIVE Orin,” Volvo Cars Global Media Newsroom. Accessed: Sep. 23, 2023. [Online]. Available: https://www.media.volvocars.com/global/en-gb/media/pressreleases/280495/volvo-cars-deepens-collaboration-with-nvidia-next-generation-self-driving-volvos-powered-by-nvidia-d
[88] M. Hibben, “Nvidia: Technology Leadership In Advanced Driver Assistance (NASDAQ:NVDA),” Seeking Alpha. Accessed: Sep. 23, 2023. [Online]. Available: https://seekingalpha.com/article/4530275-nvidia-technology-leadership-in-advanced-driver-assistance
[89] “Autonomous Driving: The future of ADAS and automated driving has arrived,” Qualcomm. Accessed: Sep. 20, 2023. [Online]. Available: https://www.qualcomm.com/products/automotive/autonomous-driving#
[90] P. Moorhead, “Qualcomm Officially Enters Self-Driving Market With Snapdragon Ride Platform And Extends Partnership With GM To Include ADAS,” Moor Insights & Strategy. Accessed: Sep. 19, 2023. [Online]. Available: https://moorinsightsstrategy.com/qualcomm-officially-enters-self-driving-market-with-snapdragon-ride-platform-and-extends-partnership-with-gm-to-include-adas/
[91] “Snapdragon Ride Platform,” Qualcomm Developer Network. Accessed: Sep. 19, 2023. [Online]. Available: https://developer.qualcomm.com/software/digital-chassis/snapdragon-ride/snapdragon-ride-platform#
[92] C. Hammerschmidt, “BMW picks Qualcomm for Automated Driving systems,” eeNews Europe. Accessed: Sep. 19, 2023. [Online]. Available: https://www.eenewseurope.com/en/bmw-picks-qualcomm-for-automated-driving-systems/
[93] S. Abuelsamid, “Mobileye Announces EyeQ6 And EyeQ Ultra Chips For Assisted And Automated Driving,” Forbes. Accessed: Nov. 28, 2023. [Online]. Available: https://www.forbes.com/sites/samabuelsamid/2022/01/04/mobileye-announces-eyeq6-and-eyeq-ultra-chips-for-assisted-and-automated-driving/?sh=7730f20b79e6
[94] “Mobileye EyeQ6 series in-depth analysis,” inf.news. Accessed: Dec. 05, 2023. [Online]. Available: https://inf.news/en/tech/6176daaf60e89f0febe1fdec75e2ea47.html
[95] “New Mobileye EyeQ Ultra will Enable Consumer AVs,” Mobileye. Accessed: Dec. 01, 2023. [Online]. Available: https://www.mobileye.com/news/mobileye-ces-2022-tech-news/
[96] “Intelligent Speed Assist Shows the Power of Mobileye’s Vision,” Mobileye. Accessed: Dec. 01, 2023. [Online]. Available: https://www.mobileye.com/blog/intelligent-speed-assist-isa-computer-vision-adas-solution/
[97] “Rethinking technology for the autonomous future,” Mobileye. Accessed: Nov. 29, 2023. [Online]. Available: https://www.mobileye.com/technology/
[98] K. Korosec, “With Intel Mobileye’s newest chip, automakers can bring automated driving to cars,” TechCrunch. Accessed: Dec. 06, 2023. [Online]. Available: https://techcrunch.com/2022/01/04/intels-mobileye-autonomous-driving-chip-for-consumer-vehicles/
[99] Z. Shahan, “Mobileye’s Partnerships With BMW, Ford, NIO, Nissan, Volkswagen, & WILLER,” CleanTechnica. Accessed: Dec. 06, 2023. [Online]. Available: https://cleantechnica.com/2020/08/10/mobileyes-partnerships-with-bmw-ford-nio-nissan-volkswagen-willer/
[100] R. Niranjana, “FPGAs in Self-Driving Cars: Accelerating Perception and Decision-Making,” FPGA Insights. Accessed: Sep. 25, 2023. [Online]. Available: https://fpgainsights.com/fpga/fpgas-in-self-driving-cars-accelerating-perception-and-decision-making/
[101] R. Raj and R. Prakash, “FPGA Based Lane Tracking system for Autonomous Vehicles,” 2019.
[102] “Zynq 7000 SoC,” AMD. Accessed: Nov. 05, 2023. [Online]. Available: https://www.xilinx.com/products/silicon-devices/soc/zynq-7000.html
[103] “Kria KV260 Vision AI Starter Kit,” AMD. Accessed: Nov. 05, 2023. [Online]. Available: https://www.xilinx.com/products/som/kria/kv260-vision-starter-kit.html
[104] “ALINX SoM AC7020C: SoC Industrial Grade Module,” AMD. Accessed: Dec. 07, 2023. [Online]. Available: https://www.xilinx.com/products/boards-and-kits/1-1bkpidx.html
[105] “Ultra96-V2 | Avnet Boards,” AVNET. Accessed: Nov. 05, 2023. [Online]. Available: https://www.avnet.com/wps/portal/us/products/avnet-boards/avnet-board-families/ultra96-v2/
[106] “AMD Kintex7 FPGA KC705 Evaluation Kit,” AMD. Accessed: Nov. 05, 2023. [Online]. Available: https://www.xilinx.com/products/boards-and-kits/ek-k7-kc705-g.html
[107] “AMD Virtex 7 FPGA VC709 Connectivity Kit,” AMD. Accessed: Nov. 05, 2023. [Online]. Available: https://www.xilinx.com/products/boards-and-kits/dk-v7-vc709-g.html
[108] “All FPGA Boards - Cyclone V - DE10-Standard,” Terasic. Accessed: Nov. 05, 2023. [Online]. Available: https://www.terasic.com.tw/cgi-bin/page/archive.pl?Language=English&CategoryNo=167&No=1081
[109] “Intel® Cyclone® 10 GX Development Kit,” Intel. Accessed: Nov. 05, 2023. [Online]. Available: https://www.intel.com/content/www/us/en/products/details/fpga/development-kits/cyclone/10-gx.html
[110] K. Shi, M. Wang, X. Tan, Q. Li, and T. Lei, “Efficient Dynamic Reconfigurable CNN Accelerator for Edge Intelligence Computing on FPGA,” Information, vol. 14, no. 3, p. 194, Mar. 2023, doi: 10.3390/info14030194.
[111] S.-C. Lin et al., “The Architectural Implications of Autonomous Driving: Constraints and Acceleration,” in Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems, New York, NY, USA: ACM, Mar. 2018, pp. 751–766. doi: 10.1145/3173162.3173191.
[112] “System Architecture | Cloud TPU | Google Cloud,” Google Cloud. Accessed: Sep. 26, 2023. [Online]. Available: https://cloud.google.com/tpu/docs/system-architecture-tpu-vm
[113] “Introduction to Cloud TPU | Google Cloud,” Google Cloud. Accessed: Sep. 26, 2023. [Online]. Available: https://cloud.google.com/tpu/docs/intro-to-tpu/
[114] Li, Zhang, and Wu, “Efficient Object Detection Framework and Hardware Architecture for Remote Sensing Images,” Remote Sens (Basel), vol. 11, no. 20, p. 2376, Oct. 2019, doi: 10.3390/rs11202376.
[115] W. Shi, X. Li, Z. Yu, and G. Overett, “An FPGA-Based Hardware Accelerator for Traffic Sign Detection,” IEEE Trans Very Large Scale Integr VLSI Syst, vol. 25, no. 4, pp. 1362–1372, Apr. 2017, doi: 10.1109/TVLSI.2016.2631428.
[116] M. Yih, J. M. Ota, J. D. Owens, and P. Muyan-Ozcelik, “FPGA versus GPU for Speed-Limit-Sign Recognition,” in 2018 21st International Conference on Intelligent Transportation Systems (ITSC), IEEE, Nov. 2018, pp. 843–850. doi: 10.1109/ITSC.2018.8569462.
[117] H. Kong et al., “EDLAB: A Benchmark for Edge Deep Learning Accelerators,” IEEE Des Test, vol. 39, no. 3, pp. 8–17, Jun. 2022, doi: 10.1109/MDAT.2021.3095215.
[118] S. N. Tesema and E.-B. Bourennane, “Resource- and Power-Efficient High-Performance Object Detection Inference Acceleration Using FPGA,” Electronics (Basel), vol. 11, no. 12, p. 1827, Jun. 2022, doi: 10.3390/electronics11121827.
[119] N. Jouppi et al., “TPU v4: An Optically Reconfigurable Supercomputer for Machine Learning with Hardware Support for Embeddings,” in Proceedings of the 50th Annual International Symposium on Computer Architecture, New York, NY, USA: ACM, Jun. 2023, pp. 1–14. doi: 10.1145/3579371.3589350.
[120] J. Tovar, “GPUs vs TPUs: A Comprehensive Comparison for Neural Network Workloads,” LinkedIn. Accessed: Sep. 22, 2023. [Online]. Available: https://www.linkedin.com/pulse/gpus-vs-tpus-comprehensive-comparison-neural-network-workloads-joel/
[121] I. Elmanaa, M. A. Sabri, Y. Abouch, and A. Aarab, “Efficient Roundabout Supervision: Real-Time Vehicle Detection and Tracking on Nvidia Jetson Nano,” Applied Sciences, vol. 13, no. 13, p. 7416, Jun. 2023, doi: 10.3390/app13137416.
[122] L. S. Karumbunathan, “NVIDIA Jetson AGX Orin Series: A Giant Leap Forward for Robotics and Edge AI Applications,” 2022.
[123] “The Snapdragon Ride Platform continues to push ADAS/AD forward,” Qualcomm. Accessed: Sep. 19, 2023. [Online]. Available: https://www.qualcomm.com/news/onq/2023/01/snapdragon-ride-platform-continues-to-push-adas-ad-forward