
IEEE TRANSACTIONS ON INTELLIGENT VEHICLES, VOL. 1, NO. 4, DECEMBER 2016

Vision-Based Front and Rear Surround Understanding Using Embedded Processors

Ravi Kumar Satzoda, Member, IEEE, Sean Lee, Frankie Lu, and Mohan Manubhai Trivedi, Fellow, IEEE

Abstract—Vision-based driver assistance systems involve a range of data-intensive operations, which pose challenges in implementing them as robust and real-time systems on resource-constrained embedded computing platforms. In order to achieve both high accuracy and real-time performance, the constituent algorithms need to be designed and optimized such that they lend themselves well to embedded realization. In this paper, we present a novel two-camera-based embedded driver assistance framework that analyzes the dynamics of vehicles in the front and rear surround views of the host vehicle (ego-vehicle). In order to do this, we propose a set of integrated techniques that combine contextual cues and lane information to detect vehicles that pose high threat to the ego-vehicle. The threat analysis is then used for generating a safe maneuver zone by the proposed system, which is implemented using two Snapdragon 810 embedded CPUs. A detailed performance evaluation and tradeoff analysis is presented using a novel multiperspective dataset (DualCam Dataset) that is released as part of this paper. In terms of accuracy, the detailed evaluations show high robustness with true positive rates greater than 95% and a false alarm rate of less than 6%. The proposed embedded system operates at real-time frame rates in our testbed under real-world highway driving conditions. The proposed framework was presented as a live demonstration at the 2016 Consumer Electronics Show.

Index Terms—Embedded system, integrated vehicle detection, threat estimation.

Fig. 1. Front and rear surround understanding using proposed embedded system.

Manuscript received March 27, 2016; revised July 19, 2016 and November 13, 2016; accepted January 9, 2017. Date of publication April 24, 2017; date of current version June 16, 2017. (Corresponding author: Ravi Kumar Satzoda.)
R. K. Satzoda and S. Lee are with the Department of Electrical and Computer Engineering, University of California San Diego, La Jolla, CA 92093 USA (e-mail: [email protected]; [email protected]).
F. Lu is with the Department of Electrical and Electronic Engineering, University of California San Diego, La Jolla, CA 92093 USA (e-mail: [email protected]).
M. M. Trivedi is with the Computer Vision and Robotics Research Laboratory-Laboratory for Intelligent and Safe Automobiles, University of California San Diego, La Jolla, CA 92093 USA (e-mail: [email protected]).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TIV.2017.2686084

I. INTRODUCTION: MOTIVATION & CONTRIBUTIONS

WITH the advent of autonomous and semi-autonomous vehicles, embedded electronic systems form a significant component of modern vehicles [1]–[3]. Among the increasing number of such embedded electronics, active safety and intelligent perception systems play a vital role in the safety of the driver and passengers in the vehicle [4], [5]. Among these different electronic subsystems, vision-based advanced driver assistance systems (ADAS) form a significant percentage [6] because of the increasing miniaturization and decreasing costs of vision sensors or cameras. Cameras are being used to monitor the 360° surround of the vehicle to perceive the surrounding environment of the vehicle [7], [8]. Although visual data provides a rich set of information for inferring the states of the surroundings of the vehicle, performing such tasks with a high level of robustness under varying road and environmental conditions is still a challenging task [8]. This requirement for high accuracy also implies the use of data-intensive and computationally intensive computer vision algorithms, which pose a challenge in implementing them as resource-constrained embedded electronic systems in cars. Therefore, there is a classic tradeoff between computational efficiency (or real-time performance) and robustness of such systems.

In this paper, we introduce integrated and context-aware techniques that address challenges related to embedded realization of vision-based operations, resulting in real-time, yet robust, embedded driver assistance systems. The main contributions of this paper can be itemized as follows: (a) vision-based techniques that synergistically combine application context and surround cues from a two-camera setup to generate a safe maneuver zone visualization in order to aid the driver, (b) an embedded framework for driver assistance that takes advantage of the integrated techniques in (a) so that the embedded system operates in real-time with high levels of robustness, and (c) a novel DualCam Dataset comprising visual data that is collected by cameras capturing the front and rear views from the ego-vehicle (host vehicle) for detailed evaluations and benchmarking. Fig. 1 illustrates the proposed system, which comprises forward and rear facing cameras that are connected to two Snapdragon 810 embedded CPUs. The two embedded processors collaborate with each other to generate a safe maneuver zone by performing threat analysis using the surround cues. The open access DualCam Dataset is a significant contribution to the intelligent vehicles community because it is the first of its kind that involves multiple perspectives and high threat maneuvers.

II. RELATED RECENT STUDIES

There is a rich body of literature which addresses various aspects of vision-based driver assistance systems. Lane detection [9], vehicle detection [10] and vehicle activity detection [7], in particular, have been studied extensively in various works. Lane detection has been explored in detail in [9], [11]–[14]. While [9], [11], [13] involve filter-based approaches to detect lane features, [14] employs learned classifiers on groups of pixels to detect lane features. Also, [9], [13], [14] operate on the entire image, while [11] reduces the computational complexity by analyzing bands of the input image so that the method can be implemented on embedded computing platforms. [12] provides a detailed survey of existing lane detection techniques. Similarly, rear-view vehicle detection has also been studied in detail in works such as [10], [15]–[17]. A detailed survey of vehicle detection techniques is presented in [8]. Most vehicle detection approaches involve multi-scale and sliding window approaches, where appearance based classifiers such as support vector machines (SVMs), Haar-cascades, etc. are used. These operations are further used to estimate the threat posed by surrounding vehicles on the ego-vehicle [18]–[20].

Although robust vision algorithms have been proposed for these systems, there are fewer works in the context of embedded electronics that can be deployed in vehicles using resource-constrained embedded computing platforms [21]–[23]. Embedded processors offer lesser computing resources as compared to the conventional processors on which vision algorithms are usually evaluated. Therefore, conventional detection algorithms, when implemented on embedded CPUs, are often challenged with real-time and power consumption constraints. Therefore, the algorithms need to be re-designed in order to realize them on embedded hardware [22]. Some works have addressed such embedded constraints for deploying vision algorithms in embedded hardware. In [24], multi-perspective camera based ADASs are discussed for implementation on Texas Instruments processors. Another similar work is presented in [25], wherein multi-modal and multi-camera based ADASs are surveyed for embedded implementation. Stereo-vision systems have also been implemented on embedded platforms in works such as [26], [27]. It is to be noted that consumer vehicle manufacturers have deployed some of these algorithms as part of their vehicles using electronic control units (ECUs) for applications such as lane departure warning (LDW). Additionally, there are plug-and-play vision-based electronic assistance systems such as MobilEye [28] which can be installed in vehicles to assist the driver. Dedicated automotive grade embedded vision processors such as Texas Instruments' EVE have also been developed for deployment in vehicles [29].

The above list is only a brief review of related works. A more detailed study can be found in [23], [25]. A study of these works shows that the challenges with regard to translating computer vision algorithms into robust and real-time embedded electronics are yet to be fully addressed. This is because embedded computing platforms pose a number of challenges in implementing vision algorithms in cars [22]. These include the speed and computing power of the embedded processor, the communication bandwidth and the battery power.

Fig. 2. Overview of the proposed embedded system: Two embedded CPUs, one each for processing forward and rear visual views from the ego-vehicle, perform a series of operations and collaborate with each other to generate a safe maneuver zone.

III. OVERVIEW & SCOPE OF PROPOSED SYSTEM

Before going into the details of the proposed techniques in this paper, we define the objective and scope of the proposed system. This is important because the proposed techniques use the application context in order to operate at real-time frame rates on embedded CPUs.

The objective of the proposed system is to provide assistance to the driver by performing threat analysis using the visual data that is captured by on-board cameras in the ego-vehicle. Threat analysis is performed by analyzing the dynamics of the vehicles that are seen by the forward and rear facing cameras. The proposed system comprises two computing nodes - a forward node and a rear node. Each node estimates lanes, and the lane positions are then used to detect vehicles that pose high threat to the ego-vehicle. The positions of the high threat vehicles from the two nodes are then used to perform threat analysis. Fig. 2 lists the operations that are performed in each processing node.

Therefore, the application context is defined as analyzing the dynamics of high threat vehicles in the forward and rear surround views, and performing threat analysis in order to assist the driver by providing a safe maneuver zone. In this context, we define high threat vehicles as follows.

Definition 1: High Threat Vehicles - The following vehicles are considered to pose threat to the ego-vehicle in the proposed system:
1) A vehicle in the ego-lane that is either in front of the ego-vehicle or approaching from behind the ego-vehicle.
2) Vehicles in front and rear of the ego-vehicle, and in the adjacent left and right lanes of the ego-vehicle.
3) If there are multiple vehicles in a particular lane, then the vehicle that is nearest to the ego-vehicle is the high threat vehicle that must be detected.

The scope of this work is limited to two cameras that capture the forward and rear views from the ego-vehicle. Therefore, the vehicles in the blind spots are not considered in this paper, although they pose high threat to the ego-vehicle. We discuss how the proposed embedded framework is scalable to address other scenarios in Section VIII. Furthermore, the main contribution of the paper is the overall system that assists the driver with the safe maneuver zone. In order to achieve this real-time system on an embedded CPU without compromising on the robustness of the system, we propose a set of techniques for optimizing the vehicle detection process, which forms the core operation of the proposed ADAS. It is to be noted that lane detection is not in the scope of this paper. We use existing methods for lane detection to meet the application requirements of the proposed system. Also, the proposed techniques are particularly designed for highway driving conditions, wherein surrounding vehicles are the obstacles that pose threat to the ego-vehicle.

Fig. 3. Probability density functions for road surface and under-vehicle region pixels. Sample annotations are shown for the road surface and under-vehicle region pixels.
IV. HIGH THREAT VEHICLE DETECTION FOR EMBEDDED IMPLEMENTATION

In this section, we present a technique for detecting high-threat vehicles on an embedded CPU. In order to achieve real-time frame rates on resource-constrained embedded computing platforms, we approach the vehicle detection problem in an informed and context based manner. The context is defined by the application requirement of detecting high-threat vehicles, which are defined previously in Definition 1. Therefore, the detection of vehicles is first restricted to three regions of interest (RoIs) - the ego-lane and the two adjacent lanes. Next, the vehicle detection method does not apply the conventional multi-scale and sliding window approach on the entire RoIs. Instead, a two-step method is used to detect vehicles in the RoIs. In the first step, possible hypothesis windows are generated in each RoI using a simplified approach. The hypothesis windows therefore limit the total amount of image data wherein robust but computationally more complex classifiers are applied. The application of such classifiers on limited regions of interest increases the frame rate and maintains high levels of robustness. The techniques are described for the image frames captured by the forward facing camera. The same techniques are also applied to the image frame from the rear camera. Therefore, the actual accuracy of the method is still dependent on the robust appearance based classifiers, but it is ensured that the complexity of multi-scale and sliding window classifier based detection is limited to specific windows that are hypothesized by a computationally less complex method.

Fig. 4. Illustration of the scan line segments I^L, I^E and I^R at y = y_j, under-vehicle regions and hypothesis windows H_L and H_R.

A. Region of Interest (RoI) Generation

According to Definition 1, there are three regions of interest (RoIs): the ego-lane, the adjacent left lane and the adjacent right lane. A lane estimation algorithm such as [9], [11] is employed to determine positions of the ego-lane denoted by P_L and P_R, where

$$P_L = \left[\, P(x^L_{y_{\min}}, y_{\min}), \cdots, P(x^L_{y_k}, y_k), \cdots, P(x^L_{y_{\max}}, y_{\max}) \,\right]^T$$
$$P_R = \left[\, P(x^R_{y_{\min}}, y_{\min}), \cdots, P(x^R_{y_k}, y_k), \cdots, P(x^R_{y_{\max}}, y_{\max}) \,\right]^T \quad (1)$$

P_L and P_R give the x-coordinates of the left and right lanes of the ego-lane within the image region defined by y, where y represents the vertical axis along the road such that y_min ≤ y ≤ y_max. The coordinates in P_L and P_R of the ego-lane also form the right and left boundaries of the adjacent left and right lanes respectively. This is illustrated in Fig. 4. The lane detection algorithm in [11] is evaluated in great detail in [11], [30]. The accuracy of ego-lane detection is shown to be less than 7 cm in real-world coordinates.

The adjacent lanes are required for generating RoIs in the proposed system, but extracting the lane features of the adjacent lanes (denoted by P_LL and P_RR in Fig. 4) will increase the computational complexity. In order to determine P_LL and P_RR, a look-up table (LUT) based approach is used. An LUT is generated offline using the camera calibration information. The LUT stores offsets that must be either subtracted from or added to P_L and P_R in order to get P_LL and P_RR respectively, i.e.,

$$P_{LL} = P_L - \Delta_x \quad \text{and} \quad P_{RR} = P_R + \Delta_x \quad (2)$$

where Δ_x represents the LUT vector with the offsets. Δ_x is computed using the camera calibration parameters. Let H ∈ R^{3×3} represent the calibration homography matrix that maps the image view to the inverse perspective mapping (IPM) view [31]. The computation of H is detailed in [31], where H transforms an image (which has a perspective deformation) into a top-view or the IPM view. In the IPM view, the lanes are supposed to be parallel and of specific width w_L.
Let y_min and y_max (refer to (1)) in the image domain correspond to y^W = 1 and y^W = y^W_max in the IPM domain. This represents the depth along the road from the ego-vehicle for which the lane analysis and the subsequent threat analysis is valid. Therefore, for every y^W_i ∈ [1, y^W_max], we apply the following equation:

$$\left[\, \Delta_i \;\; y_i \;\; 1 \,\right]^T = H^{-1} \left[\, w_L \;\; y^W_i \;\; 1 \,\right]^T \quad (3)$$

where Δ_i is the x-offset which is stored in the LUT Δ_x for the y_i-th coordinate in the image domain. The above equation is applied for all values of y^W_i to create the LUT. This entire process is a one-time offline process. The LUT is then applied according to (2) to generate the boundaries for the left and right adjacent lanes during runtime, resulting in the three RoIs: adjacent left lane, ego-lane and adjacent right lane (denoted by R^F_L, R^F_E and R^F_R in Fig. 4). This method to detect the adjacent RoIs is particularly effective when the lane widths are fixed, as seen in the case of highways.

It is to be noted that the proposed LUT based approach is also valid during lane change maneuvers made by the ego-vehicle. This is because the relationship between the IPM image and the perspective (or image) domain remains the same during a lane change maneuver. During a lane change the lanes are laterally shifted in the IPM domain but the width remains the same. Therefore, (3) results in the same offsets during lane change. The only assumption for this is that the lens and camera distortions need to be corrected beforehand.
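The offline LUT construction in (3) and its runtime use in (2) can be sketched in C++/OpenCV as follows. This is a minimal sketch only: the image-to-IPM homography `H_img2ipm`, the nominal IPM lane width `laneWidthIPM`, the per-row ego-lane boundary maps and all function names are illustrative assumptions and are not taken from the authors' implementation.

```cpp
#include <opencv2/core.hpp>
#include <cmath>
#include <map>

// Offline step: build the offset LUT (Delta_x) using Eq. (3).
// H_img2ipm maps image coordinates to the IPM (top-view) domain as in [31];
// laneWidthIPM is the nominal lane width w_L in the IPM domain.
std::map<int, double> buildAdjacentLaneLUT(const cv::Matx33d& H_img2ipm,
                                           double laneWidthIPM, int yWmax) {
    cv::Matx33d H_inv = H_img2ipm.inv();                 // IPM -> image mapping
    std::map<int, double> lut;                           // image row y_i -> offset Delta_i
    for (int yW = 1; yW <= yWmax; ++yW) {
        cv::Vec3d p = H_inv * cv::Vec3d(laneWidthIPM, (double)yW, 1.0);  // Eq. (3)
        double deltaI = p[0] / p[2];                     // x-offset after homogeneous normalization
        int    yImg   = (int)std::lround(p[1] / p[2]);   // image row this offset belongs to
        lut[yImg] = deltaI;
    }
    return lut;
}

// Runtime step: Eq. (2), P_LL = P_L - Delta_x and P_RR = P_R + Delta_x.
// egoLeftX/egoRightX hold the ego-lane boundary x-coordinates per image row.
void adjacentLaneBoundaries(const std::map<int, double>& lut,
                            const std::map<int, double>& egoLeftX,
                            const std::map<int, double>& egoRightX,
                            std::map<int, double>& adjLeftX,
                            std::map<int, double>& adjRightX) {
    for (const auto& kv : lut) {
        int y = kv.first;
        auto l = egoLeftX.find(y), r = egoRightX.find(y);
        if (l == egoLeftX.end() || r == egoRightX.end()) continue;
        adjLeftX[y]  = l->second - kv.second;            // P_LL(y)
        adjRightX[y] = r->second + kv.second;            // P_RR(y)
    }
}
```

Because the LUT is built once offline, the runtime cost of generating the two adjacent-lane boundaries per frame reduces to a table lookup and an addition/subtraction per row, which is what makes the RoI generation inexpensive on the embedded CPU.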
B. Region of Interest (RoI) Based Vehicle Detection

Given the three RoIs, high-threat vehicles are detected in each region. As listed in Definition 1, the system will detect three vehicles, one from each RoI, which are nearest to the ego-vehicle, because the presence of these vehicles poses varying levels of threat to the ego-vehicle.

Instead of applying the vehicle classifier directly in each of the regions, hypothesis candidates are first generated. In order to do this, each RoI is analyzed for the presence of dark under-vehicle regions (shown in the Fig. 3 inset annotations). The lower parts of the leading vehicles in front of the ego-vehicle are usually distinctive with dark pixels as compared to the surrounding road surface. The dark pixels are contributed by either the bumper of the vehicle or the shadow cast by the vehicle or both. It is to be noted that the under-vehicle region is not dependent on the shadow cast by the vehicle alone. The darker under-vehicle regions are a result of the proximity of the vehicle chassis to the road surface. Therefore, the under-vehicle regions also apply to vehicles being detected in cloudy or rainy weather conditions. In order to detect the presence of this under-vehicle region, the annotated dataset for training the classifiers (which will be used later in the proposed method) is used. The under-vehicle regions of the training samples were annotated. Given that the cameras capture the video in YUV color format, the mean Y component of the under-vehicle regions is collected from the training data, resulting in the probability density functions P_UV(μ_UV, σ_UV) and P_RS(μ_RS, σ_RS) for the under-vehicle region and the road surface respectively, as shown in Fig. 3.

Given a test image frame I with the boundaries of the RoIs computed as described above, I is scanned along the y-direction starting from the bottom of the image, i.e., the point that is nearest to the ego-vehicle. For a particular y-position y_j, the image pixels in that scan line are divided into the following three segments corresponding to the three RoIs:

$$I^L(y_j) = I(x, y_j) \;\text{ where }\; P_{LL}(x, y_j) \le x \le P_L(x, y_j)$$
$$I^E(y_j) = I(x, y_j) \;\text{ where }\; P_L(x, y_j) \le x \le P_R(x, y_j)$$
$$I^R(y_j) = I(x, y_j) \;\text{ where }\; P_R(x, y_j) \le x \le P_{RR}(x, y_j) \quad (4)$$

The image pixels in each of the three scan line segments I^L(y_j), I^E(y_j) and I^R(y_j) are then thresholded using the PDFs for the under-vehicle region and the road surface, P_UV(μ_UV, σ_UV) and P_RS(μ_RS, σ_RS), using the following equation:

$$I(x_k, y_j) = \begin{cases} 1 & \text{if } P_{UV}(I(x_k, y_j)) \ge P_{RS}(I(x_k, y_j)) \\ 0 & \text{otherwise} \end{cases} \quad (5)$$

(the superscripts L, R and E are omitted in the above equation for simplicity). After thresholding the scan line segments, the number of pixels that are classified as under-vehicle region is determined for each segment. If the under-vehicle region pixels form at least β% of the scan line segment, then that scan line segment is marked as an under-vehicle region segment. Therefore, when the scan line at y is scanned completely from x = 0 to x = x_max, possible vehicle regions in the three RoIs are determined. This process is repeated with the next scan line at y = y_j − 2. In each RoI, the number of consecutive under-vehicle region segments is counted, denoted by n_L, n_E and n_R corresponding to the three RoIs. If n_L is greater than κ, then the n_L scan line segments are considered as possible constituents of an under-vehicle region in the adjacent left lane RoI. The same is repeated with the ego-lane and the adjacent right lane RoIs. If an RoI is found to have a possible under-vehicle region, then a hypothesis window H is drawn in which the vehicle classifiers are applied. The dimensions of this window are computed in the following way:

$$w_h = (\text{width of lane at } y = y_{j_{\max}}) + \text{padding}$$
$$h_h = (n + \alpha \times \text{width of lane at } y = y_{j_{\min}}) + \text{padding} \quad (6)$$

where w_h and h_h are the width and height of the hypothesis window for the under-vehicle region that starts at y = y_{j_max} and ends at y = y_{j_min}; α is the aspect ratio factor that is dependent on the vehicle dimensions; and n is the height of the under-vehicle region that is detected in the RoI, i.e., n is either n_L or n_E or n_R depending on the RoI. All the different parameters are illustrated in Fig. 4.

Note on setting the parameters: α is learned from the training data. Fig. 5 shows the distribution of the relationship between the width and height of 20,000 annotated vehicles in the training data. α is set to 1/1.25 based on this data. Varying κ affects the accuracy and computation time of the algorithm. A detailed analysis on κ is presented in Section VII. The value β is set to 40% so that it can handle vehicles that are changing lanes. padding is heuristically set to 25 pixels.
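A minimal sketch of the scan-line test in (4)–(5) and the hypothesis window sizing in (6) is given below. It assumes Gaussian PDFs on the Y (luma) channel with the learned (μ, σ) pairs for the under-vehicle and road-surface classes; the function names, the anchoring of the window to the segment run and the 8-bit single-channel Y plane are illustrative assumptions rather than the authors' exact code.

```cpp
#include <opencv2/core.hpp>
#include <cmath>

// Gaussian likelihood on the Y (luma) channel; (mu, sigma) are learned offline
// from the annotated under-vehicle and road-surface pixels (see Fig. 3).
static inline double gaussPdf(double v, double mu, double sigma) {
    double z = (v - mu) / sigma;
    return std::exp(-0.5 * z * z) / (sigma * std::sqrt(2.0 * CV_PI));
}

// Eq. (4)-(5) for one RoI on one scan line: returns true if at least
// betaPercent of the pixels in [xStart, xEnd] are classified as under-vehicle.
bool isUnderVehicleSegment(const cv::Mat& yPlane, int yj, int xStart, int xEnd,
                           double muUV, double sigmaUV,
                           double muRS, double sigmaRS, double betaPercent) {
    int count = 0, total = 0;
    for (int x = xStart; x <= xEnd; ++x, ++total) {
        double v = yPlane.at<uchar>(yj, x);              // 8-bit Y plane assumed
        if (gaussPdf(v, muUV, sigmaUV) >= gaussPdf(v, muRS, sigmaRS))   // Eq. (5)
            ++count;
    }
    return total > 0 && 100.0 * count / total >= betaPercent;
}

// Hypothesis window dimensions following Eq. (6). laneWidthBottom is the lane
// width (pixels) at y_jmax (bottom of the segment run), laneWidthTop at y_jmin.
// The placement of the window above the bottom row is an illustrative choice.
cv::Rect hypothesisWindow(int xLeft, int yBottom, int n,
                          double laneWidthBottom, double laneWidthTop,
                          double alpha, int padding) {
    int wh = (int)laneWidthBottom + padding;
    int hh = (int)(n + alpha * laneWidthTop) + padding;
    return cv::Rect(xLeft - padding / 2, yBottom - hh, wh, hh);
}
```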
Fig. 5. The width versus height of over 20,000 vehicle annotations is plotted to get the factor α.

Therefore, instead of applying the computationally more complex vehicle detection classifiers on the entire image, they are applied in these limited image patches H. In case incorrect hypothesis windows are generated due to the presence of artifacts such as shadows of trees and other road-side structures, such false positive windows are eliminated by the classifiers that are applied in the hypothesis windows. In this work, Haar-Adaboost classifiers are applied to detect the vehicles in H. In order to use the classifiers, over 11,000 positive annotations of vehicles were used to train Adaboost cascade classifiers with Haar-like features using the active learning methodology described in [17]. The active learning process uses a two-step cascade classifier learning. In the first step, the cascade is trained with features that are extracted from positive annotations and random negative annotations. The trained classifier is then used to mine for hard negatives, which are used to train another classifier, which is called the actively learned classifier.

The actively learned classifiers are applied within H (if found in each RoI) to detect vehicles. If a vehicle is detected within H, the remaining scan lines in the RoI above y = y_{j_min} are not processed anymore. The above approach is repeated for the three different RoIs, resulting in the detection of a maximum of three vehicles corresponding to the three nearest vehicles in the three RoIs.
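Since the Haar-cascade stage was implemented with OpenCV (see Section VI), a sketch of applying the cascade only inside a hypothesis window could look as follows. The cascade file name, the detection parameters and the coordinate bookkeeping are illustrative assumptions, not the authors' configuration.

```cpp
#include <opencv2/core.hpp>
#include <opencv2/objdetect.hpp>
#include <vector>

// Verify a hypothesis window H by running the (actively learned) Haar-Adaboost
// cascade only inside that patch, instead of over the whole frame.
std::vector<cv::Rect> verifyHypothesis(const cv::Mat& gray,
                                       const cv::Rect& hypothesis,
                                       cv::CascadeClassifier& cascade) {
    cv::Rect roi = hypothesis & cv::Rect(0, 0, gray.cols, gray.rows);  // clip to frame
    std::vector<cv::Rect> hits;
    if (roi.area() == 0) return hits;

    cascade.detectMultiScale(gray(roi), hits, 1.1 /*scale step*/, 3 /*min neighbors*/);

    // Map detections from patch coordinates back to full-frame coordinates.
    for (auto& r : hits) { r.x += roi.x; r.y += roi.y; }
    return hits;
}

// Usage sketch ("vehicle_cascade.xml" is a placeholder for the trained cascade):
//   cv::CascadeClassifier cascade("vehicle_cascade.xml");
//   auto dets = verifyHypothesis(grayFrame, hypothesisRect, cascade);
//   if (!dets.empty()) { /* nearest detection becomes the high-threat vehicle */ }
```

Restricting detectMultiScale to the hypothesis patch is what removes the cost of sliding the cascade over the full frame, which Section VII shows is the difference between roughly 1 fps and real-time operation.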
A Kalman tracker [32] is then applied to the detected windows in order to track the vehicles continuously in the RoIs. The state space equations of the Kalman filter for tracking the vehicles in the RoIs are listed below:

$$X_{t_{i+1}|t_i} = A X_{t_i|t_i} \quad \text{and} \quad Y_{t_i} = M X_{t_i}$$
$$\text{where } X = [x \;\; y \;\; w \;\; h]^T, \quad A = M = I_{4\times4} \quad (7)$$

X in the above equations represents the position and the size of the bounding box of the vehicle. Therefore, in this work, the Kalman tracker is based on the bounding boxes that are being detected. However, a more complex model could be used that includes the speeds and the heading angles of the surrounding vehicles and the ego-vehicle.
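A bounding-box tracker following (7) can be realized with OpenCV's cv::KalmanFilter as sketched below; the noise covariance values are illustrative assumptions, since the paper does not specify them.

```cpp
#include <opencv2/video/tracking.hpp>

// Bounding-box Kalman tracker following Eq. (7): state X = [x y w h]^T with
// identity transition and measurement matrices (A = M = I_4x4).
class BoxTracker {
public:
    BoxTracker() : kf_(4, 4, 0, CV_32F) {
        cv::setIdentity(kf_.transitionMatrix);                       // A = I
        cv::setIdentity(kf_.measurementMatrix);                      // M = I
        cv::setIdentity(kf_.processNoiseCov, cv::Scalar::all(1e-2)); // illustrative values
        cv::setIdentity(kf_.measurementNoiseCov, cv::Scalar::all(1e-1));
        cv::setIdentity(kf_.errorCovPost, cv::Scalar::all(1.0));
    }
    void init(const cv::Rect& box) {
        kf_.statePost = (cv::Mat_<float>(4, 1) << box.x, box.y, box.width, box.height);
    }
    // Predict the box for the current frame, then correct with the new detection.
    cv::Rect update(const cv::Rect& detection) {
        kf_.predict();
        cv::Mat meas = (cv::Mat_<float>(4, 1) << detection.x, detection.y,
                        detection.width, detection.height);
        cv::Mat est = kf_.correct(meas);
        return cv::Rect((int)est.at<float>(0), (int)est.at<float>(1),
                        (int)est.at<float>(2), (int)est.at<float>(3));
    }
private:
    cv::KalmanFilter kf_;
};
```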
 
V. SAFE ZONE ESTIMATION USING DUAL CAMERA SETUP

As discussed in Section III and Fig. 2, there are two Snapdragon 810 processors, with a camera connected to each processor such that they capture the forward and rear views from the ego-vehicle. Each board performs RoI estimation followed by the detection of high-threat vehicles using the proposed techniques described in the previous sections. After detecting the vehicles, two sets of detection windows are generated, Ψ^F = {W^F_L, W^F_E, W^F_R} and Ψ^R = {W^R_L, W^R_E, W^R_R}, where the superscripts F and R refer to the forward and rear RoIs respectively, and the subscripts L, E and R refer to the adjacent left, ego and adjacent right lanes respectively. A detection window W is a set of four parameters, i.e. W = [x, y, w, h], where (x, y) denotes the coordinates of the top left corner of the window, and w and h denote the width and height of the detection window. If the vehicle detector does not detect a vehicle in a lane, that window is set to a null value.

Given the detection windows in Ψ^F and Ψ^R, the relative distances of the vehicles are estimated using the inverse perspective transformation matrix H that was previously described in Section IV-A. The mid point of the bottom edge of the detection window is used to determine the position of the vehicle along the road from the ego-vehicle in the following way. Let us consider a detection window W = [x_W, y_W, w_W, h_W] (we remove the superscripts and subscripts for clarity). The following equations are applied to get the position of the vehicle in the IPM domain:

$$\left[\, x_I \;\; y_I \;\; 1 \,\right]^T = H \left[\, x_W \;\; y_W + h_W \;\; 1 \,\right]^T$$
$$d_V = \lambda y_I \quad (8)$$

where d_V is the relative distance of the detected vehicle from the ego-vehicle in ground plane coordinates, λ is the calibration coefficient and H is the homography matrix for IPM. The calibration coefficient λ converts the coordinates (x_I, y_I) in the IPM domain into real-world coordinates in meters. λ is computed using the camera calibration and it is a one-time setup computation. Therefore, the forward and rear processors produce two vectors d^F_V ∈ R^3 and d^R_V ∈ R^3 corresponding to the distances of the vehicles in front and rear of the ego-vehicle. If any window is null (no vehicle in the RoI), then that particular distance d is set to d_max, which is the maximum distance from the ego-vehicle which is being processed for threat estimation.

The rear Snapdragon now acts as a client to the forward facing Snapdragon processor and sends d^R_V to the host processor, i.e. the forward Snapdragon processor. This communication occurs via a Bluetooth link. The host combines d^F_V and d^R_V to generate a six-valued vector

$$d_V = \left[\, d^F_L, \; d^F_E, \; d^F_R, \; d^R_L, \; d^R_E, \; d^R_R \,\right]^T.$$

The first three values in d_V denote the distances from the forward view and the last three values denote the distances of the vehicles from the rear view.

The host processor uses the distance vector d_V to compute the risk vector Γ using a risk function f_r(·). The risk function can be defined in multiple ways depending on the factors that are considered for risk estimation. In this work, we consider a distance based risk function which computes the risk vector Γ using the following equation:

$$\Gamma = f(d_V, d_{\max}) = 1 - \frac{d_V}{d_{\max}} \quad (9)$$

Γ is a six-valued vector with six risk values γ_i corresponding to the six RoIs, where γ_i ∈ [0, 1]. γ_i tends to 0 if the vehicle is far away from the ego-vehicle. If a vehicle is close to the ego-vehicle, γ_i approaches 1.
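The distance estimation in (8) and the risk vector in (9) reduce to a few lines of code. The sketch below assumes the image-to-IPM homography, the calibration coefficient λ and d_max are available, and it represents a null detection window by an empty rectangle; these conventions and the function names are illustrative assumptions.

```cpp
#include <opencv2/core.hpp>
#include <array>

// Relative distance of a detected vehicle using Eq. (8): project the midpoint
// of the detection box's bottom edge into the IPM domain, then scale by the
// calibration coefficient lambda (meters per IPM unit). A null detection is
// signalled here by an empty cv::Rect and is assigned dMax.
double vehicleDistance(const cv::Rect& box, const cv::Matx33d& H_img2ipm,
                       double lambda, double dMax) {
    if (box.area() == 0) return dMax;                        // no vehicle in this RoI
    double xW = box.x + 0.5 * box.width;                     // midpoint of bottom edge
    double yW = box.y + box.height;
    cv::Vec3d p = H_img2ipm * cv::Vec3d(xW, yW, 1.0);        // Eq. (8)
    double yI = p[1] / p[2];                                 // homogeneous normalization
    return lambda * yI;
}

// Risk vector, Eq. (9): gamma_i = 1 - d_i / d_max for the six RoIs
// (forward L/E/R followed by rear L/E/R). Assumes 0 <= d_i <= d_max.
std::array<double, 6> riskVector(const std::array<double, 6>& dV, double dMax) {
    std::array<double, 6> gamma{};
    for (size_t i = 0; i < dV.size(); ++i)
        gamma[i] = 1.0 - dV[i] / dMax;
    return gamma;
}
```

On the client node the three rear distances would be computed with vehicleDistance() and transmitted over the Bluetooth link, while the host concatenates them with its own three forward distances before calling riskVector().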
The threat values in Γ are used to visualize a safe maneuver zone for the ego-vehicle. Some sample visualizations and an analysis of a drive using the two-camera based setup on Snapdragon embedded processors are presented in the next section. This risk metric is based on the relative distance only, and we would like to highlight that this is one of the possible definitions for the risk function f(·). We discuss other possible factors influencing the risk function in Section VIII.

TABLE I
DUALCAM-VEHICLE DATASET DETAILS

Fig. 6. Embedded system setup in our testbed with two Snapdragon 810 development boards that are connected to the cameras (rear camera is not seen in the image).

VI. EMBEDDED SYSTEM SETUP

The proposed techniques are implemented on two embedded Qualcomm Snapdragon 810 CPUs on two development kits called Dragonboards. The Snapdragon 810 processor is a 64-bit octa-core CPU running at 2 GHz, along with the latest Android 5.0 Lollipop operating system. In addition, it has 4 GB of on-board DDR4 memory and supports a rich set of I/O interfaces and connectivity including Wifi and Bluetooth 4.1. It is to be noted that Snapdragon processors are widely used in commercially available Android smart phones and tablets. Therefore, the proposed system that is currently implemented on Dragonboards is readily transferable to any Snapdragon based smart phone or tablet, as long as the device is running an Android operating system. Additionally, Logitech C920 USB webcams are used to capture the input video streams with a resolution of 640 × 480.

Fig. 6 shows the system setup in our testbed. The two Dragonboards are named host and client, where the front and rear facing cameras are connected to the host and client CPUs respectively (the rear camera is not shown in Fig. 6). The cameras require a one-time calibration for lane estimation, so that the homography is generated for the inverse perspective mapping step in lane estimation. Both Dragonboards perform high threat vehicle detection followed by threat assessment. After the client Dragonboard determines the threat posed by the rear vehicles, the threat values are transferred to the host board via a Bluetooth connection to determine the safe maneuver zone as described in the previous section.

In terms of software, the proposed algorithms are developed in C/C++ with a mixture of OpenCV and FastCV functions. While OpenCV is the open source computer vision library, FastCV is a mobile optimized computer vision library developed by Qualcomm. It can be used on any ARM-based processor, but it is best tuned for Snapdragon processors. Most of the proposed techniques were implemented using FastCV libraries, but if they did not have the necessary functions (such as the Haar-Cascade classifiers), then such functions were implemented using OpenCV.

VII. DATASETS AND PERFORMANCE EVALUATION

In this section, we present a detailed evaluation of the proposed techniques and the system. In addition to accuracy related metrics, an analysis of the computational speed is also presented. In order to maintain consistency, all the evaluations are performed with the algorithms running on one CPU core only of the Snapdragon 810 processor.

A. Datasets

A set of datasets called Dualcam-Vehicle is released as part of this paper. This includes 5 video segments which are selected from naturalistic drives. These segments include driving conditions that pose different levels of threat to the ego-vehicle from surrounding vehicles in the ego-lane and the adjacent lanes. Moreover, each of the 5 datasets in the Dualcam-Vehicle dataset comprises two videos - the first capturing the forward view and the second capturing the rear view from the ego-vehicle. It is to be highlighted that this is the first public dataset that has two camera perspectives in this manner. Additionally, the segments are particularly chosen to include different levels of threat from the two perspectives. The five data sequences S1, S2, S3, S4 and S5 in the Dualcam-Vehicle dataset have a total of over 5000 image frames comprising over 4100 vehicles in the three lanes and two camera perspectives. The details of the datasets and sample images are listed in Table I. Table I also lists the different traffic and lighting conditions that are captured in the dataset.

Considering the anonymity condition for review, we are not releasing more details and sample clips of the dataset in the review manuscript. Once the paper is published, the dataset will be made public for research studies.
TABLE II
ACCURACY RESULTS AND TIMING FOR ALL SEGMENTS IN DUALCAM-VEHICLE DATASET

| Segment | Left Lane TP/FP/FN | Ego Lane TP/FP/FN | Right Lane TP/FP/FN | TPR | FDR | Timing ms (fps) |
| S1-F | 128/5/1 | 0/1/0 | 154/5/2 | 0.99 | 0.04 | 62 (16) |
| S1-R | 284/35/13 | 0/0/0 | 422/0/7 | 0.97 | 0.05 | – |
| S2-F | 301/19/1 | 0/0/0 | 57/0/14 | 0.96 | 0.05 | 66 (15) |
| S2-R | 62/0/6 | 116/0/0 | 143/35/4 | 0.97 | 0.10 | – |
| S3-F | 152/1/2 | 159/4/2 | 0/0/0 | 0.99 | 0.02 | 66 (15) |
| S3-R | 0/0/1 | 0/0/0 | 276/1/3 | 0.99 | 0.00 | – |
| S4-F | 285/3/18 | 30/1/29 | 99/32/4 | 0.89 | 0.08 | 71 (14) |
| S4-R | 189/2/28 | 133/0/12 | 321/10/27 | 0.91 | 0.02 | – |
| S5-F | 53/0/3 | 20/0/0 | 4/3/3 | 0.93 | 0.04 | 76 (13) |
| S5-R | 90/16/1 | 88/27/0 | 216/24/16 | 0.96 | 0.15 | – |
| Total | 1544/81/74 | 546/33/43 | 1692/110/80 | 0.95 | 0.06 | 66 (15) |

Fig. 7. Detection results of high threat vehicles within the distance indicated by the estimated lanes. Results are taken from two segments. Top row: results on image frames from the forward looking camera. Bottom row: results on rear camera images. (a) drive segment 1. (b) drive segment 2.

B. Accuracy Evaluation

The Dualcam-Vehicle datasets are used to evaluate the high-threat vehicle detection method. The accuracy measures are computed for the high-threat vehicles only. In order to perform this evaluation, the five Dualcam-Vehicle datasets were annotated for the three nearest vehicles in the three RoIs - the adjacent left and right lanes, and the ego-lane. This annotation is performed for both the forward view and the rear view video segments. The maximum distance d_max within which the vehicles are considered to be posing high threat is set to 40 meters from the ego-vehicle in both perspectives (forward and rear).

Table II lists the accuracy measures for the five video segments in the Dualcam-Vehicle dataset. The results are listed for forward and rear views separately. The number of true positives (TP), false positives (FP) and false negatives (FN, or missed detections) are listed for each lane (left, ego and right) separately. The numbers are then used to compute two metrics, True Positive Rate (TPR) and False Detection Rate (FDR), using the following equations: TPR = TP/(TP + FN) and FDR = FP/(FP + TP). TPR and FDR can also be used to derive precision and recall in the following manner: Recall = TPR and FDR = 1 − precision. We present all results in TPR and FDR because we use the classifiers from [17] and [15] for the hypothesis verification step. The use of similar metrics will help to show in subsequent sections that the proposed two-step method does not compromise robustness as compared to the original methods, and yet operates at real-time frame rates on embedded CPUs.

It can be seen that the TPR for the entire dataset (all five sequences) is over 95% with a false alarm rate or FDR of less than 6%. These numbers are computed for more than 4100 instances of cars in the whole dataset. Some detection results are shown in Fig. 7. Optimizing the conventional multi-scale, sliding window approach using the proposed techniques has not reduced the detection accuracy. Applying the actively learned classifiers using the conventional method results in an accuracy of 96% with a false alarm rate of 8% on the same datasets. The higher false alarms are because of features on the sides of the road that give false positives.
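As a quick check of the reported totals, the short sketch below computes TPR and FDR from the TP/FP/FN counts in the last row of Table II; the struct and function names are only illustrative helpers.

```cpp
#include <cstdio>

// TPR = TP / (TP + FN), FDR = FP / (FP + TP); Recall = TPR, Precision = 1 - FDR.
struct DetectionCounts { double tp, fp, fn; };

double truePositiveRate(const DetectionCounts& c)  { return c.tp / (c.tp + c.fn); }
double falseDetectionRate(const DetectionCounts& c) { return c.fp / (c.fp + c.tp); }

int main() {
    // Overall counts from the "Total" row of Table II (all three lanes combined).
    DetectionCounts total{1544 + 546 + 1692, 81 + 33 + 110, 74 + 43 + 80};
    std::printf("TPR = %.3f, FDR = %.3f\n",
                truePositiveRate(total), falseDetectionRate(total));
    // Prints roughly TPR = 0.95 and FDR = 0.06, matching the reported totals.
    return 0;
}
```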


C. Computation Time Analysis

Although high accuracy is achieved, the total computation time is also important for embedded realization. The above accuracy evaluations are determined for the entire system, including image acquisition, lane estimation using [11], the proposed high threat vehicle detection method, threat assessment and safe zone visualization on the Snapdragon 810 development platform. The average computation time per frame is computed for each dataset by finding the mean of the computation times of all the frames in the dataset. The last column in Table II lists the timing per frame in milliseconds and the resulting frame rate in frames per second (fps).

Table II shows that the average frame rate for the entire dataset is around 15 fps, which is considered as real-time in ADAS applications [33], with the lowest frame rate being 13 fps for the S5 segment in the DualCam dataset. S5 is a more complex segment as compared to the rest of the segments in the dataset, which is also reflected in lower accuracy rates in Table II. We conducted a detailed timing evaluation of the conventional multi-scale sliding window Haar-Adaboost cascade classifier by varying the different parameters of the cascade classifier, such as the maximum hit rate, the maximum false alarm rate and the number of cascade stages. The evaluation showed that the multi-scale sliding window approach of applying the cascade classifier to the test sequences gave a best case timing of 0.98 seconds, resulting in 1 frame per second, which violates the real-time requirement of the active safety system.

D. Computation Time and Accuracy Tradeoff

We now present an analysis of the tradeoff between computation time and accuracy of the proposed method. This analysis will also enable identifying appropriate parameters for the algorithm. Fig. 8 plots the FDR and TPR values (accuracy metrics) along with the computation time of the vehicle detection module on the Snapdragon 810 processor.
Fig. 8. Tradeoff analysis results showing computation speed of vehicle detection on Snapdragon 810 embedded CPU along with the TPR-FDR values for individual sequences in (a)–(c) and the overall dataset in (d)–(e).

One of the key parameters that affects the accuracy of the algorithm is the number of consecutive under-vehicle region segments that must be detected in each RoI, which is denoted by n (thresholded by κ) in Section IV. This parameter is varied to evaluate the performance of the algorithm in terms of both accuracy and computation time. It can be seen from Fig. 8 that the TPR is as high as 97% for the lowest setting of n = 2, with false detections as low as 1%. A lower value of n also implies more processing, resulting in lower frame rates. However, the lowest frame rate among all the sequences is still close to 15 fps. TPR, FDR and the computation times are averaged over all sequences in the dataset to show the overall measures in Fig. 8(d)–(f).

An important observation from Fig. 8 is that if n is increased to 4, the frame rates increase from 25 fps to nearly 28 fps, but the TPR is still over 90% with FDR less than 6%. When n is increased further to 6, it can be seen that although the TPR drops to 75%, false alarms are still at 6%. These measures and accuracy-timing tradeoff plots indicate that setting n to the lowest value of 2 gives a real-time system with robust detections.

1) Comparison with Other Methods: In order to compare the performance of the proposed methods on embedded CPUs, the multi-scale and sliding window methods for vehicle detection [17] and [15] are implemented on the same embedded CPUs (Snapdragon 810). The frame rates of these methods for detecting vehicles at the same image resolutions were found to be not more than 1.5 frames per second. In contrast, the proposed framework, which includes image acquisition, lane detection, high-threat vehicle detection and safe maneuver zone estimation, operates at real-time frame rates on Snapdragon 810 embedded CPUs.

Fig. 9. Threat vector Γ is plotted for Sequence 1 with over 500 frames.

E. Safe Zone Estimation

In this sub-section, we present the threat/risk estimation results that are computed using one of the sequences (S1) in the Dualcam-Vehicle dataset. Fig. 9 plots the risk vector Γ for the 500-frame sequence S1. It can be seen that the risk values change in different ways during the drive sequence. The risk values approach 1 in certain segments of the drive in Fig. 9, which implies that there are multiple instances when there is a high level of threat to the ego-vehicle from the surrounding vehicles.

An analysis of this plot can reveal a variety of semantics that are related to the dynamics of the surrounding vehicles with respect to the ego-vehicle. For example, there is possibly a tailing vehicle behind the ego-vehicle in the ego-lane for nearly two-thirds of the time in this sequence. This is indicated by the magenta plot in Fig. 9. Similar high values of threat are also seen from the vehicles approaching the ego-vehicle from the rear right lane (cyan plot in Fig. 9). Additionally, the vehicle in front of the ego-vehicle in the adjacent left lane also posed high threat for about 150 frames or 10 seconds. The threat values in Fig. 9 are verified by visually going through the segments.
Fig. 10. High threat vehicles detected in (a) front and (b) rear of the ego-vehicle for a particular time instance from the S1 segment. (c) Safe zone visualization implemented on two Snapdragon 810 embedded CPUs using the threat analysis.

An example is shown in Fig. 10(a) and (b). The risk vector at t = 260 is given by Γ = {0.87, 0, 0, 0.64, 0, 0.85}, which shows three high values in Fig. 9 corresponding to the forward left, rear left and rear right lanes. This can be seen from the vehicles detected in Fig. 10(a) and (b) (it is to be noted that the rear-view image needs to be flipped along the vertical axis so that the rear left lane is aligned with the forward left lane).

The risk values are also used to visualize the safe maneuver zone by generating a polygon as shown in Fig. 10(c) for the same pair of images at t = 260. The vehicle icon in the center denotes the ego-vehicle, whereas the filled bubbles refer to the presence of vehicles in the RoIs. The colors of the bubbles represent the level of risk to the ego-vehicle due to the surrounding vehicles, where green represents a safe vehicle and red represents a high threat vehicle. The visualization in Fig. 10 is the representative output to the driver in the proposed embedded system operating on Snapdragon CPUs.
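One simple way to turn a risk value γ_i into the green-to-red bubble color of the safe-zone visualization is a linear blend, as sketched below. The paper does not specify the exact color mapping used in the demonstrator, so this is only an illustrative choice.

```cpp
#include <opencv2/core.hpp>
#include <algorithm>

// Map a risk value gamma in [0, 1] to a BGR color for a safe-zone bubble:
// gamma = 0 (far / safe) -> green, gamma = 1 (close / high threat) -> red.
// The linear interpolation is an assumed, illustrative mapping.
cv::Scalar riskToColor(double gamma) {
    gamma = std::min(1.0, std::max(0.0, gamma));      // clamp to [0, 1]
    double red   = 255.0 * gamma;
    double green = 255.0 * (1.0 - gamma);
    return cv::Scalar(0.0, green, red);               // OpenCV uses BGR order
}
```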

F. Live Demonstration of Embedded Framework

We would like to highlight that the entire embedded system is currently running at real-time frame rates in our testbed under real-world driving conditions. After a few hundred hours of testing using our testbeds, the embedded system has been demonstrated live in various forums. Most notable is the demonstration of this system at the 2016 Consumer Electronics Show (CES) in Las Vegas in January 2016. This demonstration was one of six demonstrations presented in a Maserati vehicle at the Qualcomm Automotive Booth. Fig. 11 shows the display of the visualization that was generated for the demonstration at CES.

Fig. 11. (a) Maserati vehicle with the demonstration of the proposed embedded framework. (b) Visualization of the safe zone that is being demonstrated inside the vehicle.

VIII. DISCUSSIONS

In this section, we discuss the salient features of the proposed system, followed by a discussion on its limitations and how they will be addressed.

A. Differences Compared to Existing Solutions

It is to be highlighted that we proposed an embedded electronic system that enables a different set of operations as compared to existing ADAS. While most commercially available ADAS such as Mobileye use one forward looking camera to aid the driver in operations such as lane departure warning, pedestrian detection, headway monitoring, etc., the proposed system is a multi-camera setup in which multiple processing engines function collaboratively to monitor the forward and rear views from the ego-vehicle. Based on the available literature, this is the first embedded solution that employs a two-camera system to give safe zone maneuvers in real-time during drives. Although there is academic literature on multi-camera based driver assistance, an embedded electronic system with the kind of functionality described in this paper has not been presented before. Additionally, the proposed system can also be used as a complementary technology to existing ADAS. For example, any commercial lane departure warning can be integrated with the proposed system to provide additional functionality.

B. Limitations & Opportunities

We limited the scope of this paper to specific conditions as presented in Section III so as to present the system in the limited number of pages. We present some of the limitations and opportunities that need to be addressed and explored as part of future initiatives.

(i) In terms of the constituent vision algorithms, the vehicle detection classifiers are currently trained for detecting complete rear or front views of vehicles. Therefore, when a vehicle is passing or receding from the ego-vehicle in the adjacent lanes, the classifiers do not detect it. This could be resolved by applying overtaking vehicle detection methods such as [34] in specific regions of the input frames without incurring additional time.

(ii) The proposed system is implemented on one CPU core in each Snapdragon 810 processor. Although the proposed system does give a real-time frame rate of 15 fps, the speed could be increased further by parallelizing the proposed system using the multiple cores on the Snapdragon 810 processor.
(iii) The current two camera setup does not capture the vehicles in the blind spots of the ego-vehicle, which pose high threat to the ego-vehicle. This can be addressed by implementing Item (ii) above, i.e., using the multi-core architecture will enable capturing additional visual data using another two cameras that monitor the blind spots. Monitoring the blind spots along with the current two views will make the proposed system a complete 360° surround view monitoring system.

(iv) The distance-based risk metric that is used in this paper is one of many possibilities. Additional parameters such as velocities, yaw-rates and projected trajectories of the ego-vehicle and surrounding vehicles can help to make the risk metric more complete.

There are more items that can be addressed, but the above four are immediate items that can be taken advantage of.

C. Scalable Embedded Framework

We would like to highlight that the proposed embedded system is also a scalable embedded framework. This is because more cameras can be added to the system, where each camera can be connected to one CPU. Each CPU processes the visual data from its visual perspective, and the semantic information of the dynamics of the vehicles in its perspective can be fused with the semantics from other CPUs to generate a complete 360° surround scene analysis. The proposed embedded system can scale to include more cameras without overloading one single CPU, consequently retaining real-time operation. Therefore, blind spots and all other missing perspectives can also be addressed by scaling the system with more cameras and processors.

IX. CONCLUSIONS & FUTURE DIRECTIONS

In this paper, we presented a detailed analysis, evaluation and discussion of an embedded driver assistance system that includes a two-camera setup to generate a safe maneuver zone around the ego-vehicle. In order to design this system using embedded CPUs on the Snapdragon 810 processor, the application requirement of detecting high-threat vehicles is combined with a variety of optimization techniques so that computationally complex Haar-Adaboost classifiers are applied to selective image patches. It was shown that the resulting high-threat vehicle detection method enables the entire system to operate at a minimum of 15 fps and an average of 25 fps, unlike the multi-scale sliding window approach which operates at 1 fps. More importantly, it is ensured that the high rates of accuracy in detecting high-threat vehicles are maintained. A detailed discussion about the proposed system is also presented to draw possible future directions in order to improve the efficiency and robustness of the system. Future work will also include incorporating such a system as part of naturalistic driving studies [35].

ACKNOWLEDGMENT

The authors would like to thank our sponsors for their support during the project. In particular, they thank Qualcomm for presenting a variant of this work at the 2016 Consumer Electronics Show. The authors thank all their colleagues at the Laboratory for Intelligent and Safe Automobiles (LISA) for their constant support.

REFERENCES

[1] K. Jo, J. Kim, D. Kim, C. Jang, and M. Sunwoo, "Development of autonomous car—Part I: Distributed system architecture and development process," IEEE Trans. Ind. Electron., vol. 61, no. 12, pp. 7131–7140, Dec. 2014.
[2] K. Jo, J. Kim, D. Kim, C. Jang, and M. Sunwoo, "Development of autonomous car—Part II: A case study on the implementation of an autonomous driving system based on distributed architecture," IEEE Trans. Ind. Electron., vol. 62, no. 8, pp. 5119–5132, Aug. 2015.
[3] R. K. Satzoda and M. M. Trivedi, "Efficient lane and vehicle detection with integrated synergies (ELVIS)," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. Workshops Embedded Vis., 2014, pp. 708–713.
[4] S. Chakraborty, M. Lukasiewycz, C. Buckl, S. Fahmy, P. Leteinturier, and H. Adlkofer, "Embedded systems and software challenges in electric vehicles," in Proc. Design, Autom. Test Eur. Conf. Exhib., Mar. 2012, pp. 424–429.
[5] E. Ohn-Bar and M. M. Trivedi, "Looking at humans in the age of self-driving and highly automated vehicles," IEEE Trans. Intell. Veh., vol. 1, no. 1, pp. 90–104, Mar. 2016.
[6] G. Liu, F. Worgotter, and I. Markelic, "Stochastic lane shape estimation using local image descriptors," IEEE Trans. Intell. Transp. Syst., vol. 14, no. 1, pp. 13–21, Mar. 2013.
[7] A. Almagambetov, S. Velipasalar, and M. Casares, "Robust and computationally lightweight autonomous tracking of vehicle taillights and signal detection by embedded smart cameras," IEEE Trans. Ind. Electron., vol. 62, no. 6, pp. 3732–3741, Jun. 2015.
[8] S. Sivaraman and M. M. Trivedi, "Looking at vehicles on the road: A survey of vision-based vehicle detection, tracking, and behavior analysis," IEEE Trans. Intell. Transp. Syst., vol. 14, no. 4, pp. 1773–1795, Dec. 2013.
[9] A. Borkar, M. Hayes, and M. T. Smith, "A novel lane detection system with efficient ground truth generation," IEEE Trans. Intell. Transp. Syst., vol. 13, no. 1, pp. 365–374, Mar. 2012.
[10] B.-F. Wu, C.-C. Kao, C.-L. Jen, Y.-F. Li, Y.-H. Chen, and J.-H. Juang, "A relative-discriminative-histogram-of-oriented-gradients-based particle filter approach to vehicle occlusion handling and tracking," IEEE Trans. Ind. Electron., vol. 61, no. 8, pp. 4228–4237, Aug. 2014.
[11] R. K. Satzoda and M. M. Trivedi, "Selective salient feature based lane analysis," in Proc. IEEE Intell. Transp. Syst. Conf., 2013, pp. 1906–1911.
[12] A. B. Hillel, R. Lerner, D. Levi, and G. Raz, "Recent progress in road and lane detection: A survey," Mach. Vis. Appl., vol. 25, no. 3, pp. 727–745, Feb. 2012.
[13] J. McCall and M. Trivedi, "Video-based lane estimation and tracking for driver assistance: Survey, system, and evaluation," IEEE Trans. Intell. Transp. Syst., vol. 7, no. 1, pp. 20–37, Mar. 2006.
[14] R. Gopalan, T. Hong, M. Shneier, and R. Chellappa, "A learning approach towards detection and tracking of lane markings," IEEE Trans. Intell. Transp. Syst., vol. 13, no. 3, pp. 1088–1098, Sep. 2012.
[15] R. K. Satzoda and M. M. Trivedi, "Multi-part vehicle detection using symmetry derived analysis and active learning," IEEE Trans. Intell. Transp. Syst., vol. 17, no. 4, pp. 926–937, Apr. 2016.
[16] B. Pepikj, M. Stark, P. Gehler, and B. Schiele, "Occlusion patterns for object class detection," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2013, pp. 3286–3293.
[17] S. Sivaraman and M. M. Trivedi, "A general active-learning framework for on-road vehicle recognition and tracking," IEEE Trans. Intell. Transp. Syst., vol. 11, no. 2, pp. 267–276, Jun. 2010.
[18] H. Takahashi, D. Ukishima, K. Kawamoto, and K. Hirota, "A study on predicting hazard factors for safe driving," IEEE Trans. Ind. Electron., vol. 54, no. 2, pp. 781–789, Apr. 2007.
[19] G. Ogawa, K. Kise, T. Torii, and T. Nagao, "Onboard evolutionary risk recognition system for automobiles toward the risk map system," IEEE Trans. Ind. Electron., vol. 54, no. 2, pp. 878–886, Apr. 2007.
[20] S. Sivaraman and M. M. Trivedi, "Dynamic probabilistic drivability maps for lane change and merge driver assistance," IEEE Trans. Intell. Transp. Syst., vol. 15, no. 5, pp. 2063–2073, Oct. 2014.
[21] P. Jeong and S. Nedevschi, "Efficient and robust classification method using combined feature vector for lane detection," IEEE Trans. Circuits Syst. Video Technol., vol. 15, no. 4, pp. 528–537, Apr. 2005.
[22] F. Stein, "The challenge of putting vision algorithms into a car," in Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit. Workshops, Jun. 2012, pp. 89–94.
[23] R. K. Satzoda and M. M. Trivedi, "On enhancing lane estimation using contextual cues," IEEE Trans. Circuits Syst. Video Technol., vol. 25, no. 11, pp. 1870–1881, 2015.
[24] S. Dabral, "Trends in camera based automotive driver assistance systems (ADAS)," in Proc. IEEE 57th Int. Midwest Symp. Circuits Syst., 2014, pp. 1110–1115.
[25] J. Horgan, C. Hughes, J. McDonald, and S. Yogamani, "Vision-based driver assistance systems: Survey, taxonomy and advances," in Proc. IEEE 18th Int. Conf. Intell. Transp. Syst., Sep. 2015, pp. 2032–2039.
[26] G. P. Stein, Y. Gdalyahu, and A. Shashua, "Stereo-assist: Top-down stereo for driver assistance systems," in Proc. IEEE Intell. Veh. Symp., 2010, pp. 723–730.
[27] A. D. Costea, A. V. Vesa, and S. Nedevschi, "Fast pedestrian detection for mobile devices," in Proc. IEEE 18th Int. Conf. Intell. Transp. Syst., 2015, pp. 2364–2369.
[28] Mobileye, 2015. [Online]. Available: mobileye.com
[29] S. Kamath and B. Valentine, "Implementation details of mid-level vision on the embedded vision engine (EVE)," in Proc. IEEE Int. Symp. Circuits Syst., 2014, pp. 1283–1287.
[30] R. K. Satzoda and M. M. Trivedi, "On performance evaluation metrics for lane estimation," in Proc. 22nd Int. Conf. Pattern Recognit., 2014, pp. 2625–2630.
[31] A. Broggi, "Parallel and local feature extraction: A real-time approach to road boundary detection," IEEE Trans. Image Process., vol. 4, no. 2, pp. 217–223, Feb. 1995.
[32] S. Y. Chen, "Kalman filter for robot vision: A survey," IEEE Trans. Ind. Electron., vol. 59, no. 11, pp. 4409–4420, Nov. 2012.
[33] C. Caraffi, T. Vojir, J. Trefny, J. Sochman, and J. Matas, "A system for real-time detection and tracking of vehicles from a single car-mounted camera," in Proc. 15th Int. IEEE Conf. Intell. Transp. Syst., Sep. 2012, pp. 975–982.
[34] X. Zhang, P. Jiang, and F. Wang, "Overtaking vehicle detection using a spatio-temporal CRF," in Proc. IEEE Intell. Veh. Symp., 2014, pp. 338–342.
[35] R. K. Satzoda and M. M. Trivedi, "Drive analysis using vehicle dynamics and vision-based lane semantics," IEEE Trans. Intell. Transp. Syst., vol. 16, no. 1, pp. 9–18, Feb. 2015.

Ravi Kumar Satzoda (M'13) received the B.Eng. (with First Class Hons.), M.Eng. (by research), and Ph.D. degrees from Nanyang Technological University, Singapore, in 2004, 2007, and 2013, respectively. He is currently a Postdoctoral Fellow at the University of California, San Diego, CA, USA, where he is associated with the Laboratory for Intelligent and Safe Automobiles. His research interests include computer vision, embedded vision systems, and intelligent vehicles.

Sean Lee is currently working toward the integrated M.S. degree at the University of California, San Diego, CA, USA. He conducts research in the field of computer vision and machine learning in the Laboratory for Intelligent and Safe Automobiles. His research interests include computer vision, machine learning, embedded programming, and intelligent vehicles.

Frankie Lu is currently working toward the integrated M.S. degree at the University of California, San Diego, CA, USA. He is currently applying computer vision and machine learning techniques for embedded computing platforms in the domain of intelligent vehicles. His research interests include computer vision, machine learning, embedded programming, and intelligent vehicles.

Mohan Manubhai Trivedi (F'08) received the B.E. (with Hons.) degree in electronics from Birla Institute of Technology and Science, Pilani, India, in 1974, and the M.S. and Ph.D. degrees in electrical engineering from Utah State University, Logan, UT, USA, in 1976 and 1979, respectively. He is currently a Professor of electrical and computer engineering. He has also established the Laboratory for Intelligent and Safe Automobiles, and the Computer Vision and Robotics Research Laboratory, University of California, San Diego, CA, USA, where he and his team are currently pursuing research in machine and human perception, machine learning, intelligent transportation, driver assistance, active safety systems, and Naturalistic Driving Study analytics. He received the IEEE Intelligent Transportation Systems Society's highest honor and Outstanding Research Award in 2013.

You might also like