
A System-On-Chip FPGA Design for Real-Time Traffic Signal Recognition System


Yuteng Zhou, Zhilu Chen, and Xinming Huang
Department of Electrical and Computer Engineering, Worcester Polytechnic Institute, MA 01609, USA

Abstract—Traffic signal detection has long been an important function in advanced driver assistance systems (ADAS). This paper presents a complete system design based on the techniques of blob detection, histogram of oriented gradients (HOG), and support vector machine (SVM). Blob detection is applied to find potential candidates, and then HOG and SVM are used for feature classification. A novel hardware/software co-design architecture is developed for traffic light recognition in real time. With a well-balanced workload between the FPGA fabric and the on-chip ARM processor, the entire system-on-chip achieves a processing rate of 60 fps for XGA 1024-by-768 video, with an accuracy rate of over 90% on both red and green lights. The proposed system can be further improved by replacing HOG with a more advanced feature algorithm to obtain higher accuracy.

Index Terms—ADAS, traffic signal, real-time, blob detection, HOG, SVM, FPGA, system-on-chip

Figure 1. Overall system diagram with detection and classification

I. INTRODUCTION

Traffic signal recognition is an important feature in advanced driver assistance systems (ADAS) and self-driving vehicles [1]. Similar to traffic sign detection methods [2], vision-based solutions are popular for traffic light detection. The key challenge is to implement the algorithms for real-time processing, which is what makes them meaningful to drivers, while achieving a high detection rate at the same time [3]. A typical approach is to first detect potential blobs and then differentiate true traffic lights from false positives [4] [5]. Previously, FPGA-based platforms have been widely used for implementations of real-time image processing and computer vision. Similar tasks such as traffic sign recognition have already been implemented on FPGAs, achieving a tremendous speedup and significantly lower power consumption compared with software solutions on a general-purpose CPU [6].

In this paper, we propose a complete system-on-chip (SOC) FPGA design with a balanced workload between hardware and software. We adopt the traditional detection-and-classification approach [7] [2]. The detection part requires pre-filtering, mainly to eliminate pixels unlikely to be part of a traffic light. The system is targeted to detect both green and red lights, so a color filter is employed to separate them into two processing branches. Then we apply blob detection to locate possible traffic light objects. For object recognition, we use histogram of oriented gradients (HOG) to extract shape features and then apply a support vector machine (SVM) as the classifier.

In this work, detection is implemented on the FPGA fabric and classification is implemented on the on-chip ARM processor. The system implementation supports a frame rate of 60 frames per second (fps), while still attaining a high detection rate of over 90% on both green and red traffic lights. The rest of this paper is organized as follows. Section II presents methodologies for traffic light detection and classification. Section III presents the detailed implementation on the SOC FPGA. Section IV presents the experimental results, and finally Section V concludes the paper.

II. METHODS FOR TRAFFIC LIGHT DETECTION AND CLASSIFICATION

The input to the FPGA system is a real-time video stream at 60 fps captured by an RGB camera mounted on a car. We first divide the system into two parts, detection and classification, as shown in Fig. 1. The detection part consists of pre-filtering, which extracts green pixels and red pixels. After that, blob detection is applied to estimate the position of each blob. After detection, all the potential green blobs for green traffic lights and red blobs for red traffic lights are obtained, including many false positives. Since a typical traffic light's length-width ratio is between 1/4 and 4, blobs with an odd length-width ratio are eliminated. Then each potential blob is resized to 32-by-32 pixels. Furthermore, the HOG algorithm extracts 324 features from each blob, which are then fed to the SVM classifier. The result of the SVM decides whether the current blob is a traffic light or not. By rapidly scanning through every blob in an image, we are able to detect all red and green traffic lights in the current scene.

Another requirement we have to consider is to make the entire system work in real time. Since our system is used to warn drivers when red traffic lights are on, and the typical human response time is about 0.26 second [8], our system has to process images no slower than 4 fps to be meaningful. So a hardware/software co-design architecture becomes necessary, which implements computationally heavy tasks on the FPGA fabric while maintaining high-speed data exchange with the embedded processor. The hardware/software partition strategy and FPGA implementation are explained in detail in Section III.

978-1-4799-5341-7/16/$31.00 ©2016 IEEE 1778
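The candidate-screening rule from Section II (keep only blobs whose length-width ratio falls between 1/4 and 4) can be sketched in software as follows. This is an illustrative model, not the paper's hardware; the `(x0, y0, x1, y1)` bounding-box representation is an assumption for illustration.

```python
def screen_blobs(blobs):
    """Keep only blobs whose length-width ratio is plausible for a
    traffic light (between 1/4 and 4), per the rule in Section II.

    Each blob is a hypothetical (x0, y0, x1, y1) bounding box with
    inclusive corners.
    """
    kept = []
    for (x0, y0, x1, y1) in blobs:
        w = x1 - x0 + 1
        h = y1 - y0 + 1
        ratio = h / w
        if 0.25 <= ratio <= 4.0:  # eliminate odd aspect ratios
            kept.append((x0, y0, x1, y1))
    return kept

# A 1x20 sliver is rejected; a roughly rectangular blob is kept.
print(screen_blobs([(0, 0, 19, 0), (10, 10, 17, 21)]))
```

Surviving blobs would then be resized to 32-by-32 pixels before HOG feature extraction.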


Figure 2. Apply HSV threshold to obtain binary images (hue, saturation, and value comparators each produce a 1/0 output, which are combined with an AND gate)

Figure 3. Examples of some typical traffic lights
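The comparator-and-AND structure of Fig. 2 can be sketched in software as below, using the simplified hardware-oriented RGB-to-HSV conversion given in Section II-A (S = MAX − MIN, V = MAX). The threshold values here are illustrative placeholders, not the paper's trained thresholds, and wrapping negative hue into [0, 360) is a common adjustment assumed here.

```python
def rgb_to_hsv_simple(r, g, b):
    """Simplified RGB->HSV as in the paper's pre-filter:
    H in degrees, S = MAX - MIN, V = MAX (8-bit channels)."""
    mx, mn = max(r, g, b), min(r, g, b)
    if mx == mn:
        h = 0.0
    elif mx == r:
        h = (60 * (g - b) / (mx - mn)) % 360  # wrapped to [0, 360)
    elif mx == g:
        h = 60 * (b - r) / (mx - mn) + 120
    else:
        h = 60 * (r - g) / (mx - mn) + 240
    return h, mx - mn, mx

def is_green(r, g, b, h_lo=90, h_hi=150, s_min=60, v_min=60):
    """Binarize one pixel: 1 only if all three comparators pass
    (ANDed), as in Fig. 2. Thresholds are illustrative placeholders."""
    h, s, v = rgb_to_hsv_simple(r, g, b)
    return int(h_lo <= h <= h_hi and s >= s_min and v >= v_min)

print(is_green(20, 200, 40))   # bright green pixel -> 1
print(is_green(200, 20, 40))   # red pixel -> 0
```

The same comparator chain would run in parallel with red-specific thresholds to produce the second binary image.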

A. Pre-filtering

The input to our system is in RGB format. Each pixel is represented by 3 bytes, with each byte representing one color channel. Due to changing luminance and varied weather conditions on the road, a pixel appearing green to human eyes does not necessarily have a large absolute value in the green channel, because perceived color also relies on the values of the other two channels. The disadvantage of the RGB color space is that it cannot reflect the relations among red, green, and blue. As the first step in pre-filtering, we therefore convert RGB to the HSV color space. HSV is a cylindrical-coordinate representation of color pixels that captures the relationships between the color channels [9]. HSV stands for hue, saturation, and value. In the HSV color domain, green and red colors can easily be picked out by setting proper thresholds.

The equations below show the pixel format conversion from RGB to HSV [10]:

    H = 60 × (G − B)/(MAX − MIN) + 0      if MAX = R
    H = 60 × (B − R)/(MAX − MIN) + 120    if MAX = G    (1)
    H = 60 × (R − G)/(MAX − MIN) + 240    if MAX = B

    S = MAX − MIN    (2)

    V = MAX    (3)

As the last step in pre-filtering, each pixel is binarized. For instance, the value is 1 if the pixel is considered green and 0 otherwise, as indicated in Fig. 2. The same process is performed for red pixels in parallel.

B. One-Pass Blob Detection

Blob detection collects connected pixels from the pre-filtering step. The principal idea is to assign different labels to different clusters of pixels across the entire image. Here we use 4-connectivity to determine whether pixels are connected: for a center pixel, only the 4 pixels (N, E, W, S) are considered its neighbors. For high efficiency, one-pass labeling is utilized, so we are able to output all the potential blobs by scanning through the entire image only once. More details on the one-pass implementation are given in Section III-B.

C. HOG Algorithm

Standard traffic lights have a fixed length-width ratio between 1/4 and 4, as indicated in Fig. 3, so blobs with a too-large or too-small length-width ratio can be eliminated. Prior to computing HOG, each blob is resized to 32-by-32 pixels. Then each input image is first divided into blocks. A block is 16 × 16 pixels, containing 4 cells of 8 × 8 pixels each. The block then slides horizontally and vertically with a step size of 8 pixels, which results in a total of 9 blocks on a 32-by-32 image.

Figure 4. Diagram of HOG computation procedure

The HOG computation typically consists of three steps: weighted magnitude and bin class calculation, block histogram generation, and normalization, as illustrated in Fig. 4. As the first step, the gradients of each pixel in both the x and y directions are computed:

    G_x(x, y) = |M(x + 1, y) − M(x − 1, y)|    (4)

    G_y(x, y) = |M(x, y + 1) − M(x, y − 1)|    (5)

Then the gradient magnitude and the gradient angle can be calculated:

    G(x, y) = sqrt(G_x(x, y)^2 + G_y(x, y)^2)    (6)

    θ = arctan(G_y(x, y) / G_x(x, y))    (7)

The gradient magnitude is further divided into 9 different bin classes: with the angle ranging over 0-180 degrees, every 20 degrees represents one bin class. For each cell, a histogram is generated by summing up the weighted magnitudes for each bin class, resulting in 9 feature descriptors per cell. For the whole image, there are 324 feature descriptors in total.

The last step, normalization, makes the algorithm more robust to varied illumination:

    b_norm = b / sqrt(sum(b))    (8)
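With the parameters above (32×32 input, 8×8 cells, 16×16 blocks sliding by 8 pixels, 9 bins over 0-180 degrees), the descriptor length works out to 9 blocks × 4 cells × 9 bins = 324. A minimal NumPy sketch of this layout is given below; it assumes a grayscale 32×32 blob, zero-pads the image border, and uses simple hard binning of the gradient magnitude, so it illustrates the structure rather than reproducing the paper's exact weighting.

```python
import numpy as np

def hog_324(img):
    """Sketch of the HOG layout described in the paper: 32x32 input,
    8x8 cells, 16x16 blocks sliding by 8 pixels, 9 bins over 0-180
    degrees. Voting and normalization details are simplified."""
    assert img.shape == (32, 32)
    img = img.astype(np.float64)
    # Gradients per Eqs. (4)-(5); border pixels are left at zero.
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    gx[:, 1:-1] = np.abs(img[:, 2:] - img[:, :-2])
    gy[1:-1, :] = np.abs(img[2:, :] - img[:-2, :])
    mag = np.sqrt(gx**2 + gy**2)                   # Eq. (6)
    ang = np.degrees(np.arctan2(gy, gx)) % 180.0   # Eq. (7), unsigned
    bins = np.minimum((ang // 20).astype(int), 8)  # 9 bins of 20 deg

    def cell_hist(r, c):
        # 9-bin histogram of one 8x8 cell: sum magnitudes per bin.
        h = np.zeros(9)
        for i in range(r, r + 8):
            for j in range(c, c + 8):
                h[bins[i, j]] += mag[i, j]
        return h

    feats = []
    for br in range(0, 24, 8):       # 3 block rows
        for bc in range(0, 24, 8):   # 3 block cols -> 9 blocks total
            block = np.concatenate([cell_hist(br + dr, bc + dc)
                                    for dr in (0, 8) for dc in (0, 8)])
            norm = np.sqrt(block.sum())  # Eq. (8)-style normalization
            feats.append(block / norm if norm > 0 else block)
    return np.concatenate(feats)

features = hog_324(np.random.default_rng(0).integers(0, 256, (32, 32)))
print(features.shape)  # (324,)
```

Each 324-element vector of this form is what the SVM classifier consumes per candidate blob.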

D. Linear SVM

A linear SVM maps the input nonlinear descriptors to a higher-dimensional feature space, in which a linear decision surface can be constructed [11]. The linear SVM is expressed in (9):

    Y = α y^T + γ    (9)

where α is the support vector, y is the HOG feature descriptor vector, and γ is the SVM offset. In our work, the support vector α and the SVM offset γ are pre-trained using labeled traffic light samples. The result of (9) indicates whether a target blob contains a traffic light or not.

III. SOC FPGA IMPLEMENTATION

A. Software/Hardware Co-Design

To implement the whole system on the SOC FPGA, a careful division between software and hardware is required. In this application, scanning through the whole image is computationally heavy, so the detection part must be implemented on the FPGA fabric. Images after blob detection are no longer the original images, and only the blob images are needed for the HOG computation. As shown in Fig. 5, we divide the input data stream into two paths: one goes through blob detection and the other retains the original image. As for HOG and SVM, the experiments in Section IV show that they take less than 10 ms to process a single image, proving the feasibility of our partition between hardware and software.

Figure 5. Hardware/software partition on the SOC: the FPGA fabric performs detection and passes red/green blob positions to the processing system, which runs HOG and SVM to report red/green traffic lights; the original image is retained for the video overlay output.

B. Pipeline Structure for Detection

As described in Section II-A, the detection part consists of color conversion and blob detection. Fig. 6 shows the hardware architecture of the detection part on the FPGA fabric. Since two types of traffic lights are to be recognized simultaneously, two blob detection blocks are used. Also, we implement one-pass blob detection in order to achieve a frame rate of 60 fps.

Figure 6. Hardware architecture of the detection part using the FPGA fabric

For the implementation of the blob detection algorithm on the FPGA, a blob position table is required, which records the position of each detected blob. As shown in Fig. 7, a label counter keeps track of the current label number; each time a new blob is detected, the label counter increments its value by 1. The blob position table is made up of 4 memory blocks, recording the 4 vertices of every blob. For a specific blob with label number n, its position information is stored at the nth slot in each of these 4 memory blocks.

Figure 7. Store positions for different labels in the blob position table

The main difference between one-pass blob detection and multi-pass detection is that one-pass keeps an extra connection table, which records whether two different labels actually indicate a common blob. In this way, all connected-label information is stored in the connection table as shown in Fig. 8, and we do not need to scan through the entire image again. As an example in Fig. 8, when labeling the center pixel as 5, the connection label logic knows that label 7 and label 5 are in the same blob, so the value 5 is written to the 7th memory slot in the connection table.

Figure 8. Connection table keeps a record of connected labels

After scanning through the whole image, all the information is stored in the connection table and the blob position table. We then merge the position information of the same blobs, as shown in Fig. 9: the connection table indicates which labels are to be merged, and the position information in the blob position table is updated accordingly.
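The one-pass scheme above (label counter, connection table, per-label position table) can be modeled in software as follows. This is an illustrative behavioral sketch, not the RTL: bounding boxes stand in for the 4-vertex memory blocks, and label merging is simplified to a single resolution pass over the connection table.

```python
def one_pass_blobs(binary):
    """One-pass connected-component labeling with 4-connectivity,
    modeling the paper's label counter, connection table, and blob
    position table (min/max x,y per label). `binary` is a list of
    rows of 0/1 pixels; only the W and N neighbors matter in a
    raster scan."""
    h, w = len(binary), len(binary[0])
    labels = [[0] * w for _ in range(h)]
    connect = {}   # connection table: larger label -> smaller label
    boxes = {}     # blob position table: label -> [x0, y0, x1, y1]
    counter = 0    # label counter

    for y in range(h):
        for x in range(w):
            if not binary[y][x]:
                continue
            west = labels[y][x - 1] if x > 0 else 0
            north = labels[y - 1][x] if y > 0 else 0
            if west == 0 and north == 0:
                counter += 1              # new blob: bump the counter
                lab = counter
                boxes[lab] = [x, y, x, y]
            else:
                neigh = [l for l in (west, north) if l]
                lab = min(neigh)
                if len(neigh) == 2 and west != north:
                    connect[max(neigh)] = lab  # record connected labels
                b = boxes[lab]
                b[0], b[1] = min(b[0], x), min(b[1], y)
                b[2], b[3] = max(b[2], x), max(b[3], y)
            labels[y][x] = lab

    # Merge position information of connected labels (Fig. 9 step).
    for lab in sorted(connect, reverse=True):
        root = connect[lab]
        while root in connect:
            root = connect[root]
        a, b = boxes[root], boxes.pop(lab)
        a[0], a[1] = min(a[0], b[0]), min(a[1], b[1])
        a[2], a[3] = max(a[2], b[2]), max(a[3], b[3])
    return boxes

img = [[1, 1, 0, 1],
       [0, 1, 0, 1],
       [0, 1, 1, 1]]
print(one_pass_blobs(img))   # the U-shaped region resolves to one blob
```

Note that the image is scanned exactly once; the connection table lets the final merge run over labels rather than pixels, which is what makes the 60 fps rate attainable in hardware.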

Figure 9. Merging information in the blob position table

Figure 10. Green traffic lights detected by the proposed system in real time

C. Blob to AXI4-Stream Interface

This interface is employed to transfer the blob position information onto the AXI4-Stream bus, along with a video DMA, realizing high-speed transfers from the FPGA to frame buffers in DDR memory. Subsequently, the on-chip embedded ARM processor can access image frames from DDR at a very high pixel rate.

IV. ONBOARD IMPLEMENTATION RESULTS

We implement the entire system on a Xilinx Zynq ZC-702 board. The input video resolution is 1024×768 XGA. The highest frequency of the FPGA implementation reaches 147.34 MHz, so higher resolutions can also be supported. The overall FPGA utilization is shown in Table I. Since the input data streaming rate is 60 fps, our one-pass blob detection is fast enough to sustain this rate for object detection. We also measure the time taken by the processor to classify all the traffic light candidates in a single frame. The time varies from 1.96 ms to 9.66 ms, i.e., less than 10 ms in all cases. For the embedded ARM processor, we are thus able to achieve over 100 fps performance.

Table I
FPGA RESOURCE UTILIZATION

                   Used     Available   Utilization
Slice Registers    47656    106400      44.78%
Slice LUTs         49183    53200       92.44%
DSP48E1s           4        220         1.81%
Block RAM          25       140         17.85%

We also test our system on 10 video clips recorded in different road and weather conditions. A sample image is shown in Fig. 10. Table II shows that we achieve high recall and precision rates, defined as:

    recall = true positives / (true positives + false negatives)    (10)

    precision = true positives / (true positives + false positives)    (11)

Table II
DETECTION ACCURACY

                       Recall    Precision
Red traffic lights     92.11%    99.29%
Green traffic lights   94.44%    98.27%

V. CONCLUSION

In this paper, we present an FPGA-based SOC design for real-time traffic light recognition. We successfully implement the entire system on a Xilinx Zynq board, achieving a real-time processing rate of 60 fps and beyond. With the advent of deep learning networks, it is likely that a higher detection rate can be obtained by replacing the HOG algorithm with stronger feature extractors.

REFERENCES

[1] R. Okuda, Y. Kajiwara, and K. Terashima, "A survey of technical trend of ADAS and autonomous driving," in VLSI Technology, Systems and Application (VLSI-TSA), Proceedings of Technical Program - 2014 International Symposium on. IEEE, 2014, pp. 1–4.
[2] A. Møgelmose, M. M. Trivedi, and T. B. Moeslund, "Vision-based traffic sign detection and analysis for intelligent driver assistance systems: Perspectives and survey," IEEE Transactions on Intelligent Transportation Systems, vol. 13, no. 4, pp. 1484–1497, 2012.
[3] Y.-C. Chung, J.-M. Wang, and S.-W. Chen, "A vision-based traffic light detection system at intersections," Journal of Taiwan Normal University: Mathematics, Science and Technology, vol. 47, no. 1, pp. 67–86, 2002.
[4] M. Omachi and S. Omachi, "Traffic light detection with color and edge information," in Computer Science and Information Technology, 2009. ICCSIT 2009. 2nd IEEE International Conference on. IEEE, 2009, pp. 284–287.
[5] Y. Shen, U. Ozguner, K. Redmill, and J. Liu, "A robust video based traffic light detection algorithm for intelligent vehicles," in Intelligent Vehicles Symposium, 2009 IEEE. IEEE, 2009, pp. 521–526.
[6] Y. Zhou, Z. Chen, and X. Huang, "A pipeline architecture for traffic sign classification on an FPGA," in Circuits and Systems (ISCAS), 2015 IEEE International Symposium on, May 2015, pp. 950–953.
[7] J. V. Gomes, P. R. Inácio, M. Pereira, M. M. Freire, and P. P. Monteiro, "Detection and classification of peer-to-peer traffic: A survey," ACM Computing Surveys (CSUR), vol. 45, no. 3, p. 30, 2013.
[8] "Reaction time statistics." [Online]. Available: http://www.humanbenchmark.com/tests/reactiontime/statistics
[9] "HSL and HSV." [Online]. Available: https://en.wikipedia.org/wiki/HSL_and_HSV
[10] T. Hamachi, H. Tanabe, and A. Yamawaki, "Development of a generic RGB to HSV hardware," in The 1st International Conference on Industrial Application Engineering 2013 (ICIAE2013), 2013.
[11] C. Cortes and V. Vapnik, "Support-vector networks," Machine Learning, vol. 20, no. 3, pp. 273–297, 1995.
