parallel and of specific width w_L. Let y_min and y_max (refer to (1)) in the image domain correspond to y^W = 1 and y^W = y^W_max in the IPM domain. This represents the depth along the road from the ego-vehicle for which the lane analysis and the subsequent threat analysis are valid. Therefore, for every y_i^W ∈ [1, y^W_max], we apply the following equation:

$[\Delta_i \;\; y_i \;\; 1]^T = H^{-1}\,[w_L \;\; y_i^W \;\; 1]^T$   (3)

where Δ_i is the x-offset that is stored in the LUT Δ_x for the y_i-th coordinate in the image domain. The above equation is applied for all values of y_i^W to create the LUT. This entire process is a one-time offline step. The LUT is then applied according to (2) at runtime to generate the boundaries of the left and right adjacent lanes, resulting in the three RoIs: the adjacent left lane, the ego-lane and the adjacent right lane (denoted by R_L^F, R_E^F and R_R^F in Fig. 4). This method of detecting the adjacent RoIs is particularly effective when the lane widths are fixed, as is the case on highways.

It is to be noted that the proposed LUT-based approach is also valid during lane change maneuvers made by the ego-vehicle, because the relationship between the IPM image and the perspective (or image) domain remains the same during a lane change maneuver. During a lane change the lanes are laterally shifted in the IPM domain but their width remains the same; therefore, (3) results in the same offsets during the lane change. The only assumption is that lens and camera distortions are corrected beforehand.

B. Region of Interest (RoI) Based Vehicle Detection

Given the three RoIs, high-threat vehicles are detected in each region. As listed in Definition 1, the system detects three vehicles, one from each RoI, that are nearest to the ego-vehicle, because the presence of these vehicles poses varying levels of threat to the ego-vehicle.

Instead of applying the vehicle classifier directly in each of the regions, hypothesis candidates are first generated. To do this, each RoI is analyzed for the presence of dark under-vehicle regions (shown in the Fig. 3 inset annotations). The lower parts of the leading vehicles in front of the ego-vehicle are usually distinctly darker than the surrounding road surface. The dark pixels are contributed by the bumper of the vehicle, the shadow cast by the vehicle, or both. It is to be noted that the under-vehicle region does not depend on the cast shadow alone: the darker under-vehicle regions are also a result of the proximity of the vehicle chassis to the road surface. Therefore, the under-vehicle cue also applies to vehicles detected in cloudy or rainy weather conditions. In order to detect the presence of this under-vehicle region, the annotated dataset used for training the classifiers (which are applied later in the proposed method) is used. The under-vehicle regions of the training samples were annotated. Given that the cameras capture video in the YUV color format, the mean Y component of the under-vehicle regions is collected from the training data, resulting in the probability density functions P_UV(μ_UV, σ_UV) and P_RS(μ_RS, σ_RS) for the under-vehicle region and the road surface, respectively, as shown in Fig. 3.

Given a test image frame I with the boundaries of the RoIs computed as described above, I is scanned along the y-direction starting from the bottom of the image, i.e., the point that is nearest to the ego-vehicle. For a particular y-position y_j, the image pixels in that scan line are divided into the following three segments corresponding to the three RoIs:

$I^L(y_j) = I(x, y_j)$ where $P_{LL}(x, y_j) \le x \le P_{L}(x, y_j)$
$I^E(y_j) = I(x, y_j)$ where $P_{L}(x, y_j) \le x \le P_{R}(x, y_j)$
$I^R(y_j) = I(x, y_j)$ where $P_{R}(x, y_j) \le x \le P_{RR}(x, y_j)$   (4)

The image pixels in each of the three scan line segments I^L(y_j), I^E(y_j) and I^R(y_j) are then thresholded using the PDFs for the under-vehicle region and the road surface, P_UV(μ_UV, σ_UV) and P_RS(μ_RS, σ_RS), using the following equation:

$I(x_k, y_j) = \begin{cases} 1 & \text{if } P_{UV}(I(x_k, y_j)) \ge P_{RS}(I(x_k, y_j)) \\ 0 & \text{otherwise} \end{cases}$   (5)

(the superscripts L, R and E are omitted in the above equation for simplicity). After thresholding the scan line segments, the number of pixels classified as under-vehicle region is determined for each segment. If the under-vehicle pixels form β% of a scan line segment, then that scan line segment is marked as an under-vehicle region segment. Therefore, when the scan line at y is scanned completely from x = 0 to x = x_max, possible vehicle regions in the three RoIs are determined. This process is repeated with the next scan line at y = y_j − 2. In each RoI, the number of consecutive under-vehicle region segments is counted, denoted by n_L, n_E and n_R for the three RoIs. If n_L is greater than κ, then the n_L scan line segments are considered possible constituents of an under-vehicle region in the adjacent left lane RoI. The same is repeated for the ego-lane and the adjacent right lane RoIs. If an RoI is found to contain a possible under-vehicle region, a hypothesis window H is drawn in which the vehicle classifiers are applied. The dimensions of this window are computed in the following way:

$w_h = (\text{width of lane at } y = y_{j_{max}}) + padding$
$h_h = (n + \alpha \times \text{width of lane at } y = y_{j_{min}}) + padding$   (6)

where w_h and h_h are the width and height of the hypothesis window for the under-vehicle region that starts at y = y_{j_max} and ends at y = y_{j_min}; α is an aspect ratio factor that depends on the vehicle dimensions; and n is the height of the under-vehicle region detected in the RoI, i.e., n is either n_L, n_E or n_R depending on the RoI. All of these parameters are illustrated in Fig. 4.

Note on setting the parameters: α is learned from the training data. Fig. 5 shows the distribution of the relationship between the width and height of 20,000 annotated vehicles in the training data; α is set to 1/1.25 based on this data. Varying κ affects the accuracy and the computation time of the algorithm; a detailed analysis of κ is presented in Section VII. The value of β is set to 40% so that vehicles that are changing lanes can be handled. padding is heuristically set to 25 pixels.
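To make the offline LUT construction of (3) above concrete, the following is a minimal Python sketch rather than the authors' implementation. It assumes that H is the 3x3 homography from the image domain to the IPM domain (so H^-1 maps IPM points back to the image), that w_L is the lane width in IPM pixels, and that homogeneous normalization is applied explicitly; the function and argument names are illustrative.

```python
import numpy as np

def build_lane_offset_lut(H, w_L, yW_max, image_height):
    """Offline construction of the x-offset LUT Delta_x described by (3).

    Assumes H maps image coordinates to the IPM (bird's-eye) domain,
    so its inverse maps IPM points back into the perspective image.
    """
    H_inv = np.linalg.inv(H)
    lut = {}
    for yW in range(1, yW_max + 1):
        # Back-project the IPM point (w_L, yW) into the perspective image domain.
        p = H_inv @ np.array([w_L, yW, 1.0])
        delta_i = p[0] / p[2]            # homogeneous normalization
        y_i = int(round(p[1] / p[2]))
        if 0 <= y_i < image_height:
            lut[y_i] = delta_i           # entry of Delta_x for image row y_i
    return lut
```

At runtime the stored offset for row y_i is applied to the detected ego-lane boundary at that row to obtain the adjacent-lane boundaries, per (2).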
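The scan-line hypothesis generation of (4)-(6) can be sketched as follows. This is a simplified illustration, not the authors' code: pdf_uv and pdf_rs are assumed to be frozen scipy.stats.norm distributions built from (μ_UV, σ_UV) and (μ_RS, σ_RS); roi_bounds and lane_width are assumed helpers derived from the lane and RoI boundaries; alpha, beta and padding follow the parameter notes above, while the default for kappa is arbitrary since its value is analyzed in Section VII.

```python
import numpy as np
from scipy.stats import norm

def is_under_vehicle_segment(segment, pdf_uv, pdf_rs, beta=0.40):
    # Eq. (5): a pixel votes "under-vehicle" when P_UV(I) >= P_RS(I); the
    # segment is marked when at least beta of its pixels vote that way.
    votes = pdf_uv.pdf(segment) >= pdf_rs.pdf(segment)
    return votes.mean() >= beta

def generate_hypotheses(Y, roi_bounds, pdf_uv, pdf_rs, lane_width,
                        kappa=3, alpha=1 / 1.25, padding=25, step=2):
    """Bottom-up scan of the luma channel Y; returns at most one hypothesis
    window (y_jmax, y_jmin, w_h, h_h) per RoI, following (4)-(6)."""
    height = Y.shape[0]
    hypotheses = {}
    run = {roi: [] for roi in ('L', 'E', 'R')}     # consecutive firing scan lines
    for yj in range(height - 1, -1, -step):        # next scan line: y = yj - 2
        for roi in ('L', 'E', 'R'):
            if roi in hypotheses:
                continue                           # keep only the nearest vehicle per RoI
            x0, x1 = roi_bounds(roi, yj)           # segment limits from (4)
            if is_under_vehicle_segment(Y[yj, x0:x1], pdf_uv, pdf_rs):
                run[roi].append(yj)
                continue
            n = len(run[roi])                      # n_L / n_E / n_R for this run
            if n > kappa:
                y_jmax, y_jmin = run[roi][0], run[roi][-1]
                w_h = lane_width(y_jmax) + padding             # Eq. (6)
                h_h = n + alpha * lane_width(y_jmin) + padding
                hypotheses[roi] = (y_jmax, y_jmin, int(w_h), int(h_h))
            run[roi] = []
    return hypotheses

# Example PDFs from the training statistics (mean Y of annotated regions):
# pdf_uv, pdf_rs = norm(mu_uv, sigma_uv), norm(mu_rs, sigma_rs)
```

Each returned window is then passed to the vehicle classifier, so the classifier runs only on these candidate patches instead of the full frame.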
TABLE I: DUALCAM-VEHICLE DATASET DETAILS.

TABLE II: ACCURACY RESULTS AND TIMING FOR ALL SEGMENTS IN THE DUALCAM-VEHICLE DATASET.

Fig. 6. Embedded system setup in our testbed with two Snapdragon 810 development boards that are connected to the cameras (rear camera is not seen in the image).

Fig. 8. Tradeoff analysis results showing the computation speed of vehicle detection on the Snapdragon 810 embedded CPU along with the TPR-FDR values for individual sequences in (a)-(c) and the overall dataset in (d)-(e).

Fig. 10. High-threat vehicles detected in (a) front and (b) rear of the ego-vehicle for a particular time instance from the S1 segment. (c) Safe zone visualization implemented on two Snapdragon 810 embedded CPUs using the threat analysis.

Fig. 11. (a) Maserati vehicle with the demonstration of the proposed embedded framework. (b) Visualization of the safe zone that is being demonstrated inside the vehicle.

An example is shown in Fig. 10(a) and (b). The risk vector at t = 260 is given by Ψ = {0.87, 0, 0, 0.64, 0, 0.85}, which shows three high values in Fig. 9 corresponding to the forward left, rear left and rear right lanes. This can be seen from the vehicles detected in Fig. 10(a) and (b) (note that the rear-view image needs to be flipped along the vertical axis so that the rear left lane is aligned with the forward left lane).

The risk values are also used to visualize the safe maneuver zone by generating a polygon, as shown in Fig. 10(c) for the same pair of images at t = 260. The vehicle icon in the center denotes the ego-vehicle, whereas the filled bubbles refer to the presence of vehicles in the RoIs. The colors of the bubbles represent the level of risk to the ego-vehicle due to the surrounding vehicles, where green represents a safe vehicle and red represents a high-threat vehicle. The visualization in Fig. 10 is the representative output to the driver in the proposed embedded system operating on the Snapdragon CPUs.

A. Differences Compared to Existing Solutions

It is to be highlighted that we propose an embedded electronic system that enables a different set of operations compared to existing ADAS. While most commercially available ADAS such as Mobileye use one forward-looking camera to aid the driver in operations such as lane departure warning, pedestrian detection, headway monitoring, etc., the proposed system is a multi-camera setup in which multiple processing engines collaboratively monitor the forward and rear views from the ego-vehicle. Based on the available literature, this is the first embedded solution that employs a two-camera system to give safe zone maneuvers in real time during drives. Although there is academic literature on multi-camera driver assistance, an embedded electronic system with the kind of functionality described in this paper has not been presented before. Additionally, the proposed system can be used as a complementary technology to existing ADAS; for example, any commercial lane departure warning system can be integrated with the proposed system to provide additional functionality.
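For illustration, the mapping from the six-element risk vector Ψ to the colored bubbles of Fig. 10(c), described two paragraphs above, could look like the sketch below. The zone ordering, the linear green-to-red interpolation and the printed output are assumptions made for this example; the paper only specifies that green denotes a safe vehicle and red a high-threat vehicle.

```python
def risk_to_color(risk):
    """Map a risk value in [0, 1] to an RGB color from green (safe) to red
    (high threat). Linear interpolation is an assumption for this sketch."""
    return (int(255 * risk), int(255 * (1.0 - risk)), 0)

# Risk vector at t = 260 from the example above, assuming the ordering
# (front-left, front-ego, front-right, rear-left, rear-ego, rear-right).
psi = [0.87, 0.0, 0.0, 0.64, 0.0, 0.85]
zones = ['front-left', 'front-ego', 'front-right',
         'rear-left', 'rear-ego', 'rear-right']

for zone, risk in zip(zones, psi):
    print(f"{zone:12s} risk={risk:.2f} color={risk_to_color(risk)}")
```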
(iii) The current two-camera setup does not capture the vehicles in the blind spots of the ego-vehicle, which pose a high threat to the ego-vehicle. This can be addressed by implementing Item (ii) above, i.e., using the multi-core architecture to capture additional visual data from another two cameras that monitor the blind spots. Monitoring the blind spots along with the current two views will make the proposed system a complete 360° surround view monitoring system.

(iv) The distance-based risk metric used in this paper is one of many possibilities. Additional parameters such as velocities, yaw rates and projected trajectories of the ego-vehicle and the surrounding vehicles can help to make the risk metric more complete. There are more items that could be addressed, but the above four are the immediate ones that can be taken advantage of.

C. Scalable Embedded Framework

We would like to highlight that the proposed embedded system is also a scalable embedded framework, because more cameras can be added to the system, with each camera connected to one CPU. Each CPU processes the visual data from its own perspective, and the semantic information about the dynamics of the vehicles in its perspective can be fused with the semantics from the other CPUs to generate a complete 360° surround scene analysis. The proposed embedded system can thus scale to include more cameras without overloading a single CPU, and consequently retains real-time operation. Therefore, blind spots and all other missing perspectives can also be addressed by scaling the system with more cameras and processors.

IX. CONCLUSIONS & FUTURE DIRECTIONS

In this paper, we presented a detailed analysis, evaluation and discussion of an embedded driver assistance system that includes a two-camera setup to generate a safe maneuver zone around the ego-vehicle. In order to design this system using the embedded CPUs of the Snapdragon 810 processor, the application requirement of detecting high-threat vehicles is combined with a variety of optimization techniques so that the computationally complex Haar-Adaboost classifiers are applied only to selective image patches. It was shown that the resulting high-threat vehicle detection method enables the entire system to operate at a minimum of 15 fps and an average of 25 fps, unlike the multi-scale sliding window approach which operates at 1 fps. More importantly, the high rates of accuracy in detecting high-threat vehicles are maintained. A detailed discussion of the proposed system is also presented to draw possible future directions for improving the efficiency and robustness of the system. Future work will also include incorporating such a system as part of naturalistic driving studies [35].

ACKNOWLEDGMENT

The authors would like to thank their sponsors for their support during the project. In particular, they thank Qualcomm for presenting a variant of this work at the 2016 Consumer Electronics Show. The authors thank all their colleagues at the Laboratory for Intelligent and Safe Automobiles (LISA) for their constant support.

REFERENCES

[1] K. Jo, J. Kim, D. Kim, C. Jang, and M. Sunwoo, "Development of autonomous car—Part I: Distributed system architecture and development process," IEEE Trans. Ind. Electron., vol. 61, no. 12, pp. 7131–7140, Dec. 2014.
[2] K. Jo, J. Kim, D. Kim, C. Jang, and M. Sunwoo, "Development of autonomous car—Part II: A case study on the implementation of an autonomous driving system based on distributed architecture," IEEE Trans. Ind. Electron., vol. 62, no. 8, pp. 5119–5132, Aug. 2015.
[3] R. K. Satzoda and M. M. Trivedi, "Efficient lane and vehicle detection with integrated synergies (ELVIS)," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. Workshops Embedded Vis., 2014, pp. 708–713.
[4] S. Chakraborty, M. Lukasiewycz, C. Buckl, S. Fahmy, P. Leteinturier, and H. Adlkofer, "Embedded systems and software challenges in electric vehicles," in Proc. Design, Autom. Test Eur. Conf. Exhib., Mar. 2012, pp. 424–429.
[5] E. Ohn-Bar and M. M. Trivedi, "Looking at humans in the age of self-driving and highly automated vehicles," IEEE Trans. Intell. Veh., vol. 1, no. 1, pp. 90–104, Mar. 2016.
[6] G. Liu, F. Worgotter, and I. Markelic, "Stochastic lane shape estimation using local image descriptors," IEEE Trans. Intell. Transp. Syst., vol. 14, no. 1, pp. 13–21, Mar. 2013.
[7] A. Almagambetov, S. Velipasalar, and M. Casares, "Robust and computationally lightweight autonomous tracking of vehicle taillights and signal detection by embedded smart cameras," IEEE Trans. Ind. Electron., vol. 62, no. 6, pp. 3732–3741, Jun. 2015.
[8] S. Sivaraman and M. M. Trivedi, "Looking at vehicles on the road: A survey of vision-based vehicle detection, tracking, and behavior analysis," IEEE Trans. Intell. Transp. Syst., vol. 14, no. 4, pp. 1773–1795, Dec. 2013.
[9] A. Borkar, M. Hayes, and M. T. Smith, "A novel lane detection system with efficient ground truth generation," IEEE Trans. Intell. Transp. Syst., vol. 13, no. 1, pp. 365–374, Mar. 2012.
[10] B.-F. Wu, C.-C. Kao, C.-L. Jen, Y.-F. Li, Y.-H. Chen, and J.-H. Juang, "A relative-discriminative-histogram-of-oriented-gradients-based particle filter approach to vehicle occlusion handling and tracking," IEEE Trans. Ind. Electron., vol. 61, no. 8, pp. 4228–4237, Aug. 2014.
[11] R. K. Satzoda and M. M. Trivedi, "Selective salient feature based lane analysis," in Proc. IEEE Intell. Transp. Syst. Conf., 2013, pp. 1906–1911.
[12] A. B. Hillel, R. Lerner, D. Levi, and G. Raz, "Recent progress in road and lane detection: A survey," Mach. Vis. Appl., vol. 25, no. 3, pp. 727–745, Feb. 2012.
[13] J. McCall and M. Trivedi, "Video-based lane estimation and tracking for driver assistance: Survey, system, and evaluation," IEEE Trans. Intell. Transp. Syst., vol. 7, no. 1, pp. 20–37, Mar. 2006.
[14] R. Gopalan, T. Hong, M. Shneier, and R. Chellappa, "A learning approach towards detection and tracking of lane markings," IEEE Trans. Intell. Transp. Syst., vol. 13, no. 3, pp. 1088–1098, Sep. 2012.
[15] R. K. Satzoda and M. M. Trivedi, "Multi-part vehicle detection using symmetry derived analysis and active learning," IEEE Trans. Intell. Transp. Syst., vol. 17, no. 4, pp. 926–937, Apr. 2016.
[16] B. Pepikj, M. Stark, P. Gehler, and B. Schiele, "Occlusion patterns for object class detection," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2013, pp. 3286–3293.
[17] S. Sivaraman and M. M. Trivedi, "A general active-learning framework for on-road vehicle recognition and tracking," IEEE Trans. Intell. Transp. Syst., vol. 11, no. 2, pp. 267–276, Jun. 2010.
[18] H. Takahashi, D. Ukishima, K. Kawamoto, and K. Hirota, "A study on predicting hazard factors for safe driving," IEEE Trans. Ind. Electron., vol. 54, no. 2, pp. 781–789, Apr. 2007.
[19] G. Ogawa, K. Kise, T. Torii, and T. Nagao, "Onboard evolutionary risk recognition system for automobiles toward the risk map system," IEEE Trans. Ind. Electron., vol. 54, no. 2, pp. 878–886, Apr. 2007.
[20] S. Sivaraman and M. M. Trivedi, "Dynamic probabilistic drivability maps for lane change and merge driver assistance," IEEE Trans. Intell. Transp. Syst., vol. 15, no. 5, pp. 2063–2073, Oct. 2014.
[21] P. Jeong and S. Nedevschi, "Efficient and robust classification method using combined feature vector for lane detection," IEEE Trans. Circuits Syst. Video Technol., vol. 15, no. 4, pp. 528–537, Apr. 2005.
[22] F. Stein, "The challenge of putting vision algorithms into a car," in Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit. Workshops, Jun. 2012, pp. 89–94.
[23] R. K. Satzoda and M. M. Trivedi, "On enhancing lane estimation using contextual cues," IEEE Trans. Circuits Syst. Video Technol., vol. 25, no. 11, pp. 1870–1881, 2015.
[24] S. Dabral, "Trends in camera based automotive driver assistance systems (ADAS)," in Proc. IEEE 57th Int. Midwest Symp. Circuits Syst., 2014, pp. 1110–1115.
[25] J. Horgan, C. Hughes, J. McDonald, and S. Yogamani, "Vision-based driver assistance systems: Survey, taxonomy and advances," in Proc. IEEE 18th Int. Conf. Intell. Transp. Syst., Sep. 2015, pp. 2032–2039.
[26] G. P. Stein, Y. Gdalyahu, and A. Shashua, "Stereo-assist: Top-down stereo for driver assistance systems," in Proc. IEEE Intell. Veh. Symp., 2010, pp. 723–730.
[27] A. D. Costea, A. V. Vesa, and S. Nedevschi, "Fast pedestrian detection for mobile devices," in Proc. IEEE 18th Int. Conf. Intell. Transp. Syst., 2015, pp. 2364–2369.
[28] Mobileye, 2015. [Online]. Available: mobileye.com
[29] S. Kamath and B. Valentine, "Implementation details of mid-level vision on the embedded vision engine (EVE)," in Proc. IEEE Int. Symp. Circuits Syst., 2014, pp. 1283–1287.
[30] R. K. Satzoda and M. M. Trivedi, "On performance evaluation metrics for lane estimation," in Proc. 22nd Int. Conf. Pattern Recognit., 2014, pp. 2625–2630.
[31] A. Broggi, "Parallel and local feature extraction: A real-time approach to road boundary detection," IEEE Trans. Image Process., vol. 4, no. 2, pp. 217–223, Feb. 1995.
[32] S. Y. Chen, "Kalman filter for robot vision: A survey," IEEE Trans. Ind. Electron., vol. 59, no. 11, pp. 4409–4420, Nov. 2012.
[33] C. Caraffi, T. Vojir, J. Trefny, J. Sochman, and J. Matas, "A system for real-time detection and tracking of vehicles from a single car-mounted camera," in Proc. 15th Int. IEEE Conf. Intell. Transp. Syst., Sep. 2012, pp. 975–982.
[34] X. Zhang, P. Jiang, and F. Wang, "Overtaking vehicle detection using a spatio-temporal CRF," in Proc. IEEE Intell. Veh. Symp., 2014, pp. 338–342.
[35] R. K. Satzoda and M. M. Trivedi, "Drive analysis using vehicle dynamics and vision-based lane semantics," IEEE Trans. Intell. Transp. Syst., vol. 16, no. 1, pp. 9–18, Feb. 2015.

Ravi Kumar Satzoda (M'13) received the B.Eng. (with First Class Hons.), M.Eng. (by research), and Ph.D. degrees from Nanyang Technological University, Singapore, in 2004, 2007, and 2013, respectively. He is currently a Postdoctoral Fellow at the University of California, San Diego, CA, USA, where he is associated with the Laboratory for Intelligent and Safe Automobiles. His research interests include computer vision, embedded vision systems, and intelligent vehicles.

Sean Lee is currently working toward the integrated M.S. degree at the University of California, San Diego, CA, USA. He conducts research in the field of computer vision and machine learning in the Laboratory for Intelligent and Safe Automobiles. His research interests include computer vision, machine learning, embedded programming, and intelligent vehicles.

Frankie Lu is currently working toward the integrated M.S. degree at the University of California, San Diego, CA, USA. He is currently applying computer vision and machine learning techniques for embedded computing platforms in the domain of intelligent vehicles. His research interests include computer vision, machine learning, embedded programming, and intelligent vehicles.

Mohan Manubhai Trivedi (F'08) received the B.E. (with Hons.) degree in electronics from Birla Institute of Technology and Science, Pilani, India, in 1974, and the M.S. and Ph.D. degrees in electrical engineering from Utah State University, Logan, UT, USA, in 1976 and 1979, respectively. He is currently a Professor of electrical and computer engineering. He has also established the Laboratory for Intelligent and Safe Automobiles and the Computer Vision and Robotics Research Laboratory, University of California, San Diego, CA, USA, where he and his team are currently pursuing research in machine and human perception, machine learning, intelligent transportation, driver assistance, active safety systems, and Naturalistic Driving Study analytics. He received the IEEE Intelligent Transportation Systems Society's highest honor and Outstanding Research Award in 2013.