
Color exploitation in HOG-based traffic sign detection

Citation for published version (APA):

Creusen, I. M., Wijnhoven, R. G. J., Herbschleb, E., & de With, P. H. N. (2010). Color exploitation in HOG-based traffic sign detection. In Proceedings of the 17th IEEE International Conference on Image Processing (ICIP 2010), 26-29 September 2010, Hong Kong (pp. 2669-2672). Institute of Electrical and Electronics Engineers. https://doi.org/10.1109/ICIP.2010.5651637

DOI:
10.1109/ICIP.2010.5651637

Document status and date:
Published: 01/01/2010

Document Version:
Publisher’s PDF, also known as Version of Record (includes final page, issue and volume numbers)



Proceedings of 2010 IEEE 17th International Conference on Image Processing, September 26-29, 2010, Hong Kong

COLOR EXPLOITATION IN HOG-BASED TRAFFIC SIGN DETECTION

I.M. Creusen1,3 , R.G.J. Wijnhoven2,3 , E. Herbschleb3 , P.H.N. de With1,3


1 CycloMedia BV, P.O. Box 68, 4180 BB Waardenburg, NL
2 ViNotion BV, P.O. Box 2346, 5600 CH Eindhoven, NL
3 Eindhoven University of Technology, P.O. Box 513, 5600 MB Eindhoven, NL

ABSTRACT

We study traffic sign detection on a challenging large-scale real-world dataset of panoramic images. The core processing is based on the Histogram of Oriented Gradients (HOG) algorithm, which is extended by incorporating color information in the feature vector. The choice of the color space has a large influence on the performance; we have found that the CIELab and YCbCr color spaces give the best results. The use of color significantly improves the detection performance. We compare the performance of a specific and a HOG-based algorithm, and show that HOG outperforms the specific algorithm by up to tens of percents in most cases. In addition, we propose a new iterative SVM training paradigm to deal with the large variation in background appearance. This reduces memory consumption and increases utilization of background information.

Index Terms— Object detection, Object recognition

1. INTRODUCTION

In this paper we consider traffic sign detection on images obtained from a driving vehicle in large-scale environments. Panoramic images are captured using two cameras with fish-eye lenses, thereby creating lens distortion. As a pre-processing step, image enhancement algorithms are used to improve the image quality. We study the automated detection of a subset of the traffic signs in The Netherlands.

An object detection system can be realized in two ways. A first approach is to manually create a specific object model by using prior knowledge regarding the objects and the scene. A second option is to learn a generic object model automatically from manually annotated training data, also called supervised learning. An advantage of a specific object model is that the prior knowledge is explicitly modeled and no annotated training samples are required. Optimizing such a model requires much effort, and the resulting model cannot be reused for other types of objects. However, because all prior knowledge is used in the model, an effective and efficient detection algorithm can be obtained. On the other hand, a generic object model is learned automatically from training data with little manual interaction. However, the model uses no prior information and extracts its knowledge from the training data only. Therefore, good quality and completeness of the training data are key requirements. A clear advantage of such a model is that it can be generated for different object classes without manual tuning. In this paper we compare these two approaches for a large-scale traffic sign detection application.

Many algorithms for traffic sign detection focus primarily on color only. A survey on traffic sign detection was published by Fleyeh and Dougherty [1].

The work of Viola and Jones [2] describes one of the first successful applications of the generic approach. They designed a face detection algorithm for gray-scale images, using a 24 × 24 pixel detection window. From this window they extract a number of Haar-like features. These features are a local approximation of the image gradient and can be computed efficiently using integral images. The face detector is trained using many example face images, but can easily be trained to detect other objects.

Another highly successful sliding-window object detector is the Histogram-of-Gradients (HOG) detector proposed by Dalal and Triggs [3], which outperforms the Viola-Jones algorithm. The authors propose to divide the detection window into cells, and for each cell a histogram of the image gradient (over orientation) is computed. This type of feature is a dense version of the popular SIFT [4] technique. However, HOG features are not rotation and scale invariant. After the feature generation stage, a Support Vector Machine (SVM) is used to classify the high-dimensional features. In a recent evaluation for pedestrian detection, the HOG algorithm gives competitive performance [5].

Another sliding-window detector was recently proposed by Baró and Vitrià [6], and has also been applied to traffic sign detection. They use a more general version of the Haar-like features, called Dissociated Dipoles. During the training process, a genetic algorithm is used to iteratively add new features to the system, in contrast to the exhaustive search done by Viola and Jones. This approach leads to strongly improved performance compared to the standard Viola-Jones approach: the AUC (Area Under Curve) for a traffic sign detection problem is increased from 60% to 90%.

The Implicit Shape Model is proposed by Leibe and Schiele in [7]. Their idea is to locate small parts of the object in the image, which vote for a consistent center location of the total object. The maxima in this voting space define the locations of the objects. This technique gives competitive results for generic object detection with relatively large objects.

In this paper our aim is to study a more generic algorithm that can handle the large variety of traffic signs. We have adopted the HOG algorithm because of its high performance, parallel implementation possibilities, and fast training compared to the proposal by Baró and Vitrià [6]. A closer look at the various algorithms reveals that the HOG algorithm implicitly exploits features such as the gradient patterns and the shape of traffic signs, instead of explicitly building models of those features as is done in the above proposals from literature. This is why we expect at least similar results.

Furthermore, we propose to extend the standard HOG algorithm by utilizing color information as a concatenation of per-channel HOG features. We show that the choice of the color space significantly influences the performance, and that the optimal choice depends on the type of traffic sign.

In this paper, we compare a state-of-the-art specific algorithm and our generic HOG-based algorithm. The specific detection algorithm by Herbschleb and De With [8] uses a fast three-stage approach that combines color and shape features. Firstly, a fast algorithm discards uninteresting image regions by distinguishing specific color information. Secondly, traffic signs are detected by locating several consistent parts. Finally, a comprehensive validation step ensures the validity of the detected signs. In contrast with this, our algorithm models the background by a new iterative learning procedure. Since our algorithm has a generic nature, it can be reused for objects other than traffic signs.
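The integral-image trick mentioned above in the discussion of Viola and Jones [2] can be sketched as follows. This is our own illustrative example, not code from any of the cited papers: after one pass to build the summed-area table, any rectangle sum costs four lookups, so a Haar-like feature is evaluated in constant time.

```python
# Sketch: summed-area table (integral image) and a two-rectangle
# Haar-like feature, as used by Viola-Jones style detectors.
# Illustrative example only, not code from the paper.

def integral_image(img):
    """Return ii with ii[y][x] = sum of img[0..y][0..x] (inclusive)."""
    h, w = len(img), len(img[0])
    ii = [[0] * w for _ in range(h)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y][x] = row_sum + (ii[y - 1][x] if y > 0 else 0)
    return ii

def rect_sum(ii, x0, y0, x1, y1):
    """Sum of pixels in the rectangle [x0..x1] x [y0..y1], in O(1)."""
    total = ii[y1][x1]
    if x0 > 0: total -= ii[y1][x0 - 1]
    if y0 > 0: total -= ii[y0 - 1][x1]
    if x0 > 0 and y0 > 0: total += ii[y0 - 1][x0 - 1]
    return total

def haar_two_rect(ii, x, y, w, h):
    """Left-minus-right two-rectangle feature: a crude local
    approximation of the horizontal image gradient."""
    half = w // 2
    left = rect_sum(ii, x, y, x + half - 1, y + h - 1)
    right = rect_sum(ii, x + half, y, x + w - 1, y + h - 1)
    return left - right

# Example: a 4x4 image, dark left half (0) and bright right half (9).
img = [[0, 0, 9, 9]] * 4
ii = integral_image(img)
print(haar_two_rect(ii, 0, 0, 4, 4))  # left(0) - right(72) = -72
```

The same table is reused for every feature in the window, which is what makes the exhaustive feature search of Viola and Jones tractable.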

978-1-4244-7994-8/10/$26.00 ©2010 IEEE 2669 ICIP 2010


[Figure 1 (omitted): Examples of traffic sign training samples: (a) circular blue signs, (b) circular red signs, (c) triangular signs. The numbers represent the number of samples in the training set.]

2. ALGORITHMS

The original HOG algorithm by Dalal and Triggs [3] applies a dense description strongly based on the popular SIFT algorithm from Lowe [4]. In a preprocessing step, the input pixels are converted into HOG features, and object detection is performed by sliding a detection window over the image. To obtain scale-invariant detection, the preprocessing and detection process is repeated for downscaled versions of the input image. After applying the detector, detections at various scales are merged using a mean-shift mode finding algorithm [9].

Let us now discuss the HOG feature generation step in more detail. The processing steps are shown in Figure 2. For each input pixel, the gradient magnitude and orientation are calculated, and the gradient magnitude of the pixel is added to the corresponding orientation bin. Input pixels are spatially quantized in cells of n × n pixels, where n is the cell size. Each cell contains one orientation histogram. To avoid quantization effects, linear interpolation is applied, both in orientation and in the two spatial dimensions. Illumination invariance is obtained by using the gradient operator. Contrast normalization is applied by normalizing the orientation histograms. Dalal and Triggs propose to normalize each cell several times, once for each b × b local neighborhood, where typically b = 2. The total feature vector of a detection window is the concatenation of the normalized orientation histograms of all the cells in the window.

[Figure 2 (omitted): Overview of the HOG algorithm.]

For learning the actual detector, we use a linear Support Vector Machine (SVM). Although kernel SVMs increase performance (as shown e.g. in [3]), a linear SVM is used for computational efficiency, as we execute our algorithm on a large-scale database. We use the same implementation as Dalal and Triggs. Unlike their proposal, the SVM classifier is trained in an iterative process. In the first iteration, all positive images are processed and a set of randomly chosen background regions is used as negative samples. In each additional iteration, the current detector is applied to a new image without traffic signs, and the resulting false detections are added to the training set for the next iteration. After each iteration, the classifier is retrained and all negative training samples that are not support vectors are discarded. The consequence of this technique is that the set of negative features remains small. This has two advantages: (1) it avoids the memory limitations observed by Dalal and Triggs, and (2) it allows the utilization of more background samples, leading to a more accurate description of the background training set.
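The iterative training scheme can be sketched as follows. Note that `train` is a stand-in nearest-centroid scorer rather than a real linear SVM, and the margin filter merely imitates discarding negatives that are not support vectors; all names and thresholds here are illustrative, not taken from the paper.

```python
# Sketch of the iterative negative-mining loop described above.
# NOTE: `train` is a stand-in (nearest-centroid linear scorer); the
# paper retrains a linear SVM and keeps only support vectors. The
# margin-based filter below merely imitates that pruning step.

def train(pos, neg):
    """Return a linear scorer f(x) = w.x + b separating the class
    centroids (stand-in for fitting a linear SVM)."""
    mp = [sum(c) / len(pos) for c in zip(*pos)]
    mn = [sum(c) / len(neg) for c in zip(*neg)]
    w = [p - n for p, n in zip(mp, mn)]
    b = -sum(wi * (p + n) / 2.0 for wi, p, n in zip(w, mp, mn))
    return lambda x: sum(wi * xi for wi, xi in zip(w, x)) + b

def iterative_training(pos, initial_neg, background_stream, margin=2.0):
    """Iteratively add false detections as hard negatives and drop
    easy negatives after each retraining round."""
    neg = list(initial_neg)
    clf = train(pos, neg)
    for windows in background_stream:        # images without traffic signs
        neg += [x for x in windows if clf(x) > 0]   # false detections
        clf = train(pos, neg)                        # retrain
        neg = [x for x in neg if clf(x) > -margin]   # keep hard negatives
    return clf, neg

# Tiny 2-D illustration: positives in the upper right, negatives mined
# from two "background images".
pos = [(2.0, 2.0), (3.0, 3.0)]
stream = [[(1.0, 1.0), (2.0, 1.0)], [(0.5, 0.5)]]
clf, neg = iterative_training(pos, [(0.0, 0.0)], stream)
print(len(neg), clf((3.0, 3.0)) > 0, clf((0.0, 0.0)) < 0)  # 1 True True
```

The pruning step is what keeps the negative set small: only samples near the decision boundary survive a round, so memory use stays bounded no matter how many background images are streamed through.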

As previously mentioned, we extend the standard algorithm by including color information in the HOG descriptor. This is done by concatenating HOG descriptors for each color channel of the used color space. Note that the dimensionality of the resulting features and the computational complexity of detection increase linearly with the number of color channels.

The specific detection algorithm by Herbschleb and De With [8] uses a three-stage approach. Firstly, a fast algorithm discards uninteresting image regions by using color information. This stage applies color quantization and classifies each pixel of the image into the color classes red, blue, yellow, white and black, as these are the most frequently appearing colors in traffic signs, using a fixed mapping that was empirically estimated. Secondly, SIFT features [4] are extracted at Hessian interest points, and these are matched to a dictionary of traffic-sign features. The dictionary is constructed from synthetic traffic-sign images as specified by the local authorities. The spatial consistency of neighboring features is checked to improve robustness. If three or more matches indicate the same traffic sign, it is added as a valid detection. The final stage performs a validation of the generated detections by checking color consistency and by template matching with several distorted templates.

There is a fundamental difference between the two approaches. The generic HOG detector detects traffic signs of a specific category only, whereas the specific algorithm is designed to detect all variations of several traffic signs in a single algorithmic pass.
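The per-channel concatenation described earlier in this section can be sketched as follows. This is a greatly simplified illustration, not the implementation used in the paper: `orientation_histogram` collapses a whole patch into a single 9-bin histogram, whereas real HOG uses spatial cells, interpolation and block normalization.

```python
# Sketch of the color extension: compute a (greatly simplified)
# HOG-style orientation histogram per color channel and concatenate.
# Spatial cells, interpolation and block normalization are omitted;
# illustration only, not the implementation used in the paper.
import math

def orientation_histogram(channel, n_bins=9):
    """Accumulate gradient magnitude into unsigned orientation bins."""
    h, w = len(channel), len(channel[0])
    hist = [0.0] * n_bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = channel[y][x + 1] - channel[y][x - 1]   # central diffs
            gy = channel[y + 1][x] - channel[y - 1][x]
            ang = math.degrees(math.atan2(gy, gx)) % 180.0
            b = min(int(ang / (180.0 / n_bins)), n_bins - 1)
            hist[b] += math.hypot(gx, gy)
    return hist

def color_hog(channels, n_bins=9):
    """Concatenate per-channel histograms: dimensionality (and
    detection cost) grows linearly with the number of channels."""
    feat = []
    for ch in channels:
        feat.extend(orientation_histogram(ch, n_bins))
    return feat

# Three channels of a 6x6 patch (e.g. L, a, b after conversion).
ch_l = [[0, 0, 0, 9, 9, 9]] * 6          # vertical edge
ch_a = [[x for x in range(6)]] * 6       # horizontal ramp
ch_b = [[5] * 6] * 6                     # flat: no gradient energy
feat = color_hog([ch_l, ch_a, ch_b])
print(len(feat))  # 27 = 3 channels x 9 bins
```

With the settings used in the paper, each real channel descriptor has 2,304 dimensions instead of 9, but the concatenation principle is the same.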

3. EXPERIMENTAL RESULTS

Targeting the application of traffic sign detection on a country-wide scale, we evaluate our algorithms on a set of about 5,000 annotated high-resolution images taken from a driving vehicle using two fish-eye lens cameras. For the experiments, we extract traffic signs from a subset of the images to obtain a training set, and use the other images (approx. 3,000) for testing.

[Figure 3 (omitted): Example image showing several correctly detected traffic signs, indicated by cyan rectangles, and one undetected sign, indicated by a red rectangle.]

We evaluate the detection of three different classes of traffic signs: blue-circular, red-circular and triangular signs, for which we train the system using 170, 74 and 53 samples, respectively. Each class contains intra-class variation due to the various signs in the class and due to the different imaging conditions, as shown in Figure 1. The distribution of the different types of signs in the training sets is representative for the total dataset. The resolution of the images in our dataset is 4800 × 2400 pixels.

Using the generic HOG detection algorithm, we train a different detector for each class from the positive object samples and a common set of negative samples in the form of images containing no traffic signs. Additionally, for each class, the positive samples of the other classes are added as negative examples. The positive samples are traffic signs having a resolution of 24 × 24 pixels in a 48 × 48 pixel region, since Dalal and Triggs [3] have found that the use of contextual information is beneficial. For training the SVM, the proposed iterative approach is used with an initial set of 200 randomly selected background samples.

We consider different versions of the HOG algorithm. Whereas Dalal and Triggs propose to use the gradient in the color channel with maximum gradient magnitude, traditional HOG only uses a single color channel. The green channel of the RGB color space is often employed for traffic sign detection, but this causes many misdetections between red and blue signs. We have found that the H-channel of the HSV color space gives better results. In our experiments we therefore use the H-channel detector as the single-channel detector; results for the G-channel detector are omitted because its performance is significantly lower. To incorporate more color information, we concatenate HOG descriptors for each color channel. We compare results for the following color spaces: RGB, HSV, CIELab and YCbCr.

In our experiments, we have used the following settings for our HOG detector: a cell size of 4 × 4 pixels, 9 orientation bins and 4 block normalizations (b = 2). For each color channel, the dimensionality of the feature vector is 2,304. Applying the single-channel HOG detector to a 4800 × 700 image takes about 23 seconds using a single CPU core at 2.7 GHz. Each image is downscaled in 35 steps using a scaling factor of 1.05. This leads to the detection of traffic signs ranging from 24 × 24 pixels to 132 × 132 pixels. Because of the preprocessing steps in the specific algorithm, its execution time varies significantly over the total set of images; typical execution times vary between 30 seconds and a few minutes. Note that in a single pass of the specific algorithm, all traffic sign classes are detected simultaneously, whereas the generic detector locates only a single class of signs.

We have applied both the specific algorithm and the HOG detectors to the dataset (see Figure 1) and the results are shown in Figure 4. The AUC scores are summarized in Table 1. Figure 3 shows an example output image of the CIELab detector. In general, we observe that the HOG detector outperforms the dedicated algorithm in most cases. We have found that the choice of color space has a significant impact on the detection performance. For blue circular traffic signs, the performance in the CIELab color space is superior to the other color spaces. For red circular traffic signs, the CIELab and YCbCr color spaces give similar performance, while for red triangular signs the performance in the YCbCr space is the highest. Detection in the RGB and HSV color spaces is suboptimal in these experiments. It is interesting to note that the performance of the H-channel alone is almost identical to the performance in the full HSV space. This indicates that saturation and intensity information is largely irrelevant for the considered traffic sign detection application.

    Name        Red circ. AUC   Blue circ. AUC   Triangular AUC
    Dedicated       41.6%           56.2%            45.5%
    H(HSV)          32.0%           70.3%            50.0%
    HSV             32.0%           70.4%            50.0%
    CIELab          56.0%*          85.0%*           65.7%
    RGB             46.4%           56.9%            52.8%
    YCbCr           55.7%           69.2%            74.6%*

Table 1. Detection performance of the dedicated algorithm and the HOG detector in several color spaces, for three different classes of traffic signs. The highest score per class is marked with an asterisk.
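The multi-scale settings quoted above can be checked with a few lines of arithmetic (our own sketch; the function name and structure are illustrative, not from the paper): a 24 × 24 window applied to 35 downscaling steps of factor 1.05 covers sign sizes up to roughly 24 * 1.05^35 ≈ 132 pixels in the original image.

```python
# Sanity check of the image-pyramid settings: window size 24 px,
# 35 downscaling steps, scale factor 1.05. Illustrative sketch only.

def pyramid_sizes(window=24, steps=35, factor=1.05):
    """Object size (in original-image pixels) matched at each level:
    a fixed window on a downscaled image matches a larger object."""
    return [window * factor ** i for i in range(steps + 1)]

sizes = pyramid_sizes()
print(round(sizes[0]), round(sizes[-1]))  # 24 132
```

The geometric step of 1.05 keeps the relative size error between adjacent pyramid levels below about 2.5%, which the mean-shift mode finding then merges into single detections.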

4. CONCLUSIONS

We have evaluated two different algorithms for traffic sign detection on a large-scale dataset. A dedicated algorithm uses a manually tuned processing chain of three stages to detect traffic signs. We compare this to the generic Histogram of Oriented Gradients (HOG) algorithm, which automatically learns its detector from a set of training images. In addition to the standard HOG algorithm, we propose an extension that simultaneously uses information from multiple color channels, and show that it outperforms the single-channel algorithm. Furthermore, we have employed an iterative technique for SVM training, which is novel in this context, to deal with the large variation in background appearance. This significantly lowers memory consumption and therefore allows the utilization of more background images in the training process.

Experimental results show that for the considered task, the generic HOG algorithm significantly outperforms the dedicated algorithm in most cases, by a range of 10-30%. The choice of the color space has a profound effect on the performance. We have found that the CIELab and YCbCr spaces provide the best performance, probably due to the availability of two dedicated color channels fitting the traffic signs. The HSV and RGB spaces are less suitable for traffic sign detection. Furthermore, we have shown that the performance of the single-channel H-detector is nearly identical to the performance of the HSV-detector. This indicates that saturation and intensity information is largely irrelevant for the considered traffic sign detection application, and thus that color is the dominant feature.

[Figure 4 (omitted): Resulting recall-precision curves for the evaluated traffic sign classes: (a) circular blue, (b) circular red, (c) triangular. The H(HSV) results show significant overlap with the HSV results. Best viewed in color in the original.]

5. REFERENCES

[1] H. Fleyeh and M. Dougherty, "Road and traffic sign detection and recognition," in Proc. 16th Mini EURO Conf. and 10th Meeting of EWGT, September 2005, pp. 644-653.

[2] P. Viola and M. Jones, "Rapid object detection using a boosted cascade of simple features," in Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2001, vol. 1, pp. 511-518.

[3] N. Dalal and B. Triggs, "Histograms of oriented gradients for human detection," in Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), June 2005, vol. 1, pp. 886-893.

[4] D.G. Lowe, "Distinctive image features from scale-invariant keypoints," Int. Journal of Computer Vision (IJCV), vol. 60, no. 2, pp. 91-110, January 2004.

[5] P. Dollár, C. Wojek, B. Schiele, and P. Perona, "Pedestrian detection: A benchmark," in Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), June 2009, pp. 304-311.

[6] X. Baró and J. Vitrià, "Probabilistic Darwin machines for object detection," in Proc. Int. Conf. on Image Processing, Computer Vision, and Pattern Recognition (IPCV), July 2009, vol. 2, pp. 839-845.

[7] B. Leibe and B. Schiele, "Interleaved object categorization and segmentation," in Proc. British Machine Vision Conference (BMVC), September 2003, pp. 759-768.

[8] E. Herbschleb and P.H.N. de With, "Real-time traffic-sign detection and recognition," in Visual Communications and Image Processing, SPIE-IS&T Electronic Imaging, January 2009, vol. 7257, pp. 0A1-0A12.

[9] D. Comaniciu and P. Meer, "Mean shift: a robust approach toward feature space analysis," IEEE Trans. on Pattern Analysis and Machine Intelligence (PAMI), vol. 24, no. 5, pp. 603-619, May 2002.
