
Pattern Recognition 32 (1999) 453—464

Color-based object recognition


Theo Gevers*, Arnold W.M. Smeulders
ISIS, Faculty of WINS, University of Amsterdam, Kruislaan 403, 1098 SJ, Amsterdam, The Netherlands
Received 22 December 1997; received for publication 4 February 1998

Abstract

The purpose is to arrive at recognition of multicolored objects invariant to a substantial change in viewpoint, object geometry and illumination. Assuming dichromatic reflectance and white illumination, it is shown that normalized color rgb, saturation S and hue H, and the newly proposed color models c1c2c3 and l1l2l3 are all invariant to a change in viewing direction, object geometry and illumination. Further, it is shown that hue H and l1l2l3 are also invariant to highlights. Finally, a change in spectral power distribution of the illumination is considered to propose a new color constant color model m1m2m3. To evaluate the recognition accuracy differentiated for the various color models, experiments have been carried out on a database consisting of 500 images taken from 3-D multicolored man-made objects. The experimental results show that highest object recognition accuracy is achieved by l1l2l3 and hue H followed by c1c2c3, normalized color rgb and m1m2m3 under the constraint of white illumination. Also, it is demonstrated that recognition accuracy degrades substantially for all color features other than m1m2m3 with a change in illumination color. The recognition scheme and images are available within the PicToSeek and Pic2Seek systems on-line at: http://www.wins.uva.nl/research/isis/zomax/. © 1999 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved.

Keywords: Object recognition; Multicolored objects; Color models; Dichromatic reflection; Reflectance properties;
Photometric color invariants; Color constancy

1. Introduction

Color provides powerful information for object recognition. A simple and effective recognition scheme is to represent and match images on the basis of color histograms as proposed by Swain and Ballard [1]. The work makes a significant contribution in introducing color for object recognition. However, it has the drawback that when the illumination circumstances are not equal, the object recognition accuracy degrades significantly. This method is extended by Funt and Finlayson [2], based on the retinex theory of Land [3], to make the method illumination independent by indexing on illumination-invariant surface descriptors (color ratios) computed from neighboring points. However, it is assumed that neighboring points have the same surface normal. Therefore, the derived illumination-invariant surface descriptors are negatively affected by rapid changes in surface orientation of the object (i.e. the geometry of the object). Healey and Slater [4] and Finlayson et al. [5] use illumination-invariant moments of color distributions for object recognition. These methods are sensitive to object occlusion and cluttering as the moments are defined as an integral property over the object as a whole. In global methods, in general, occluded parts will disturb recognition. Slater and Healey [6] circumvent this problem by computing the color features from small object regions instead of the entire object.

*Corresponding author.

0031-3203/99/$ — See front matter © 1999 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved.
PII: S0031-3203(98)00036-3
454 T. Gevers, A.W.M. Smeulders / Pattern Recognition 32 (1999) 453–464

From the above observations, the choice of which color models to use does not only depend on their robustness against varying illumination across the scene (e.g. multiple light sources with different spectral power distributions), but also on their robustness against changes in surface orientation of the object (i.e. the geometry of the object), and on their robustness against object occlusion and cluttering. Furthermore, the color models should be concise, discriminatory and robust to noise. Therefore, in this paper, our aim is to analyze and evaluate various color models to be used for the purpose of recognition of multicolored objects according to the following criteria:

- Robustness to a change in viewing direction;
- robustness to a change in object geometry;
- robustness to a change in the direction of the illumination;
- robustness to a change in the intensity of the illumination;
- robustness to a change in the spectral power distribution (SPD) of the illumination.

Next to defining color models which have:

- High discriminative power;
- robustness to object occlusion and cluttering;
- robustness to noise in the images.

It can be expected that two or more of the above criteria are interrelated. For example, Funt and Finlayson [2] show that when illumination is controlled Swain's color-based recognition method performs better than object recognition based on illumination-independent image descriptors. However, Swain's method is outperformed when illumination varies across the scene. Supposedly, there is a tradeoff between the amount of invariance and expressiveness of the color models. To that end, our goal is to get more insight to decide which color models to use under which imaging parameters. This is useful for object recognition applications where no constraints on the imaging process can be imposed, as well as for applications where one or more parameters of the imaging process can be controlled, such as for robots and industrial inspection (e.g. controlled object positioning and lighting conditions). For such a case, color models can be used for object recognition which are less invariant (at least under the given imaging conditions), but which have higher discriminative power.

The paper is organized as follows. In Section 2, basic color models are defined for completeness. In Section 3, assuming white illumination and dichromatic reflectance, we examine the effect of a change in viewpoint, surface orientation, and illumination for the various color models. From the analysis, two new invariant color models are proposed. Further, in Section 4, a change in spectral power distribution (SPD) of the illumination is considered to propose a new color constant color model. A summary of the theoretical results is given in Section 5. In Section 6, experiments are carried out on an image database of 500 images taken from 3-D multicolored man-made objects. In Section 7, we conclude with a guideline which color models to use under which imaging conditions for both invariant and discriminatory object recognition.

2. Basic color definitions

Commonly used well-known color spaces include: (for display and printing processes) RGB, CMY; (for television and video) YIQ, YUV; (standard set of primary colors) XYZ; (uncorrelated features) I1I2I3; (normalized color) rgb, xyz; (perceptual uniform spaces) U*V*W*, L*a*b*, Luv; and (for humans) HSI. Although the number of existing color spaces is large, a number of these color models are correlated to intensity I: Y, L* and W*; are linear combinations of RGB: CMY, XYZ and I1I2I3; or normalized with respect to intensity rgb: IQ, xyz, UV, U*V*, a*b*, uv. Therefore, in this paper, we concentrate on the following standard, essentially different, color features: intensity I, RGB, normalized color rgb, hue H and saturation S.

In the sequel, we need to be precise on the definitions of intensity I, RGB, normalized color rgb, saturation S, and hue H. To that end, in this section, we offer a quick overview of well-known facts from color theory.

Let R, G and B, obtained by a color camera, represent the 3-D sensor space

$$C = \int_\lambda p(\lambda) f_C(\lambda)\, d\lambda \quad (1)$$

for C ∈ {R, G, B}, where p(λ) is the radiance spectrum and f_C(λ) are the three color filter transmission functions.

To represent the RGB-sensor space, a cube can be defined on the R, G, and B axes. White is produced when all three primary colors are at M, where M is the maximum light intensity, say M = 255. The main diagonal axis connecting the black and white corners defines the intensity

$$I(R, G, B) = R + G + B. \quad (2)$$

All points in a plane perpendicular to the grey axis of the color cube have the same intensity. The plane through the color cube at points R = G = B = M is one such plane. This plane cuts out an equilateral triangle which is the standard rgb chromaticity triangle

$$r(R, G, B) = \frac{R}{R + G + B}, \quad (3)$$

$$g(R, G, B) = \frac{G}{R + G + B}, \quad (4)$$

$$b(R, G, B) = \frac{B}{R + G + B}. \quad (5)$$

The transformation from RGB used here to describe the color impression hue H is given by

$$H(R, G, B) = \arctan\left(\frac{\sqrt{3}\,(G - B)}{(R - G) + (R - B)}\right) \quad (6)$$

and saturation S, measuring the relative white content of a color as having a particular hue, by

$$S(R, G, B) = 1 - \frac{\min(R, G, B)}{R + G + B}. \quad (7)$$

In this way, all color features can be calculated from the original R, G, B values from the corresponding red, green, and blue images provided by the color camera.

3. Reflectance with white illumination

3.1. The reflection model

Consider an image of an infinitesimal surface patch of an inhomogeneous dielectric object. Using the red, green and blue sensors with spectral sensitivities given by f_R(λ), f_G(λ) and f_B(λ), respectively, to obtain an image of the surface patch illuminated by a SPD of the incident light denoted by e(λ), the measured sensor values are given by Shafer [7]:

$$C = m_b(n, s)\int_\lambda f_C(\lambda)\, e(\lambda)\, c_b(\lambda)\, d\lambda + m_s(n, s, v)\int_\lambda f_C(\lambda)\, e(\lambda)\, c_s(\lambda)\, d\lambda \quad (8)$$

for C = {R, G, B}, giving the Cth sensor response. Further, c_b(λ) and c_s(λ) are the surface albedo and Fresnel reflectance respectively. λ denotes the wavelength, n is the surface patch normal, s is the direction of the illumination source, and v is the direction of the viewer. Geometric terms m_b and m_s denote the geometric dependencies on the body and surface reflection component, respectively.

Considering the neutral interface reflection (NIR) model (assuming that c_s(λ) has a constant value independent of the wavelength) and white illumination (equal energy density for all wavelengths within the visible spectrum), then e(λ) = e and c_s(λ) = c_s, and hence both are constants. Then, we put forward that the measured sensor values are given by

$$C_w = e\, m_b(n, s)\, k_C + e\, m_s(n, s, v)\, c_s \int_\lambda f_C(\lambda)\, d\lambda \quad (9)$$

for C_w ∈ {R_w, G_w, B_w}, giving the red, green and blue sensor response under the assumption of a white light source. Further,

$$k_C = \int_\lambda f_C(\lambda)\, c_b(\lambda)\, d\lambda \quad (10)$$

is the compact formulation depending on the sensors and the surface albedo only.

If the integrated white condition holds (as we assume throughout the paper)

$$\int_\lambda f_R(\lambda)\, d\lambda = \int_\lambda f_G(\lambda)\, d\lambda = \int_\lambda f_B(\lambda)\, d\lambda = f, \quad (11)$$

we propose that the reflection from inhomogeneous dielectric materials under white illumination is given by

$$C_w = e\, m_b(n, s)\, k_C + e\, m_s(n, s, v)\, c_s\, f. \quad (12)$$

In the next section, this reflection model is used to study and analyze the RGB-subspace on which colors will be projected coming from the same uniformly colored surface.

3.2. Photometric color invariant features for matte, dull surfaces

Consider the body reflection term of Eq. (12)

$$C_b = e\, m_b(n, s)\, k_C \quad (13)$$

for C_b ∈ {R_b, G_b, B_b}, giving the red, green and blue sensor response of an infinitesimal matte surface patch under the assumption of a white light source.

According to the body reflection term, the color depends on k_C (i.e. sensors and surface albedo) and the brightness on illumination intensity e and object geometry m_b(n, s). If a matte surface region, which is homogeneously colored (i.e. with fixed albedo), contains a variety of surface normals, then the set of measured colors will generate an elongated color cluster in RGB-sensor space, where the direction of the streak is determined by k_C and its extent by the variations of surface normals n with respect to the illumination direction s. As a consequence, a uniformly colored surface which is curved (i.e. varying surface orientation) gives rise to a broad variance of RGB values. The same argument holds for intensity I.

In contrast, rgb is insensitive to surface orientation, illumination direction and illumination intensity, as specified mathematically by substituting Eq. (13) in Eqs. (3)-(5):

$$r(R_b, G_b, B_b) = \frac{e\, m_b(n, s)\, k_R}{e\, m_b(n, s)(k_R + k_G + k_B)} = \frac{k_R}{k_R + k_G + k_B}, \quad (14)$$

$$g(R_b, G_b, B_b) = \frac{e\, m_b(n, s)\, k_G}{e\, m_b(n, s)(k_R + k_G + k_B)} = \frac{k_G}{k_R + k_G + k_B}, \quad (15)$$

$$b(R_b, G_b, B_b) = \frac{e\, m_b(n, s)\, k_B}{e\, m_b(n, s)(k_R + k_G + k_B)} = \frac{k_B}{k_R + k_G + k_B}, \quad (16)$$

factoring out dependencies on illumination e and object geometry m_b(n, s) and hence only dependent on the sensors and the surface albedo.

Because S corresponds to the radial distance from the color to the main diagonal in the RGB-color space, S is an invariant for matte, dull surfaces illuminated by white light, cf. Eqs. (13) and (7):

$$S(R_b, G_b, B_b) = 1 - \frac{\min(e\, m_b(n, s)\, k_R,\; e\, m_b(n, s)\, k_G,\; e\, m_b(n, s)\, k_B)}{e\, m_b(n, s)(k_R + k_G + k_B)} = 1 - \frac{\min(k_R, k_G, k_B)}{k_R + k_G + k_B}, \quad (17)$$

only dependent on the sensors and the surface albedo.

Similarly, H is an invariant for matte, dull surfaces illuminated by white light, cf. Eqs. (13) and (6):

$$H(R_b, G_b, B_b) = \arctan\left(\frac{\sqrt{3}\, e\, m_b(n, s)(k_G - k_B)}{e\, m_b(n, s)((k_R - k_G) + (k_R - k_B))}\right) = \arctan\left(\frac{\sqrt{3}\,(k_G - k_B)}{(k_R - k_G) + (k_R - k_B)}\right). \quad (18)$$

In fact, any expression defining colors on the same linear color cluster spanned by the body reflection vector in RGB-space is an invariant for the dichromatic reflection model with white illumination. To that end, we put forward the following invariant color model:

$$c_1 = \arctan\left(\frac{R}{\max\{G, B\}}\right), \quad (19)$$

$$c_2 = \arctan\left(\frac{G}{\max\{R, B\}}\right), \quad (20)$$

$$c_3 = \arctan\left(\frac{B}{\max\{R, G\}}\right), \quad (21)$$

denoting the angles of the body reflection vector and consequently being invariants for matte, dull objects, cf. Eqs. (13) and (19)-(21):

$$c_1(R_b, G_b, B_b) = \arctan\left(\frac{e\, m_b(n, s)\, k_R}{\max\{e\, m_b(n, s)\, k_G,\; e\, m_b(n, s)\, k_B\}}\right) = \arctan\left(\frac{k_R}{\max\{k_G, k_B\}}\right), \quad (22)$$

$$c_2(R_b, G_b, B_b) = \arctan\left(\frac{e\, m_b(n, s)\, k_G}{\max\{e\, m_b(n, s)\, k_R,\; e\, m_b(n, s)\, k_B\}}\right) = \arctan\left(\frac{k_G}{\max\{k_R, k_B\}}\right), \quad (23)$$

$$c_3(R_b, G_b, B_b) = \arctan\left(\frac{e\, m_b(n, s)\, k_B}{\max\{e\, m_b(n, s)\, k_R,\; e\, m_b(n, s)\, k_G\}}\right) = \arctan\left(\frac{k_B}{\max\{k_R, k_G\}}\right), \quad (24)$$

only dependent on the sensors and the surface albedo.

Obviously, in practice, the assumption of objects composed of matte, dull surfaces is not always realistic. To that end, the effect of surface reflection (highlights) is discussed in the following section.

3.3. Photometric color invariant features for both matte and shiny surfaces

Consider the surface reflection term of Eq. (12)

$$C_s = e\, m_s(n, s, v)\, c_s\, f \quad (25)$$

for C_s ∈ {R_s, G_s, B_s}, giving the red, green and blue sensor response for a highlighted infinitesimal surface patch with white illumination.

Note that under the given conditions, the color of highlights is not related to the color of the surface on which they appear, but only to the color of the light source. Thus, for the white light source, the set of measured colors from a highlighted surface region is on the grey axis of the RGB-color space. The extent of the streak depends on the roughness of the object surface. Very shiny object regions generate color clusters which are spread out over the entire grey axis. For rough surfaces, the extent will be small.

For a given point on a surface, the contributions of the body reflection component C_b and surface reflection component C_s are added, cf. Eq. (12). Hence, the measured colors of a uniformly colored region must be on the triangular color plane in the RGB-space spanned by the two reflection components.

Because H is a function of the angle between the main diagonal and the color point in RGB-sensor space, all possible colors of the same (shiny) surface region (i.e. with

fixed albedo) have to be of the same hue, as follows from substituting Eq. (12) into Eq. (6):

$$H(R_w, G_w, B_w) = \arctan\left(\frac{\sqrt{3}\,(G_w - B_w)}{(R_w - G_w) + (R_w - B_w)}\right) = \arctan\left(\frac{\sqrt{3}\, e\, m_b(n, s)(k_G - k_B)}{e\, m_b(n, s)((k_R - k_G) + (k_R - k_B))}\right) = \arctan\left(\frac{\sqrt{3}\,(k_G - k_B)}{(k_R - k_G) + (k_R - k_B)}\right), \quad (26)$$

factoring out dependencies on illumination e, object geometry m_b(n, s), viewpoint m_s(n, s, v), and specular reflection coefficient c_s, and hence only dependent on the sensors and the surface albedo. Note that R_w = e m_b(n, s) k_R + e m_s(n, s, v) c_s f, G_w = e m_b(n, s) k_G + e m_s(n, s, v) c_s f, and B_w = e m_b(n, s) k_B + e m_s(n, s, v) c_s f.

Obviously, other color features depend on the contribution of the surface reflection component and hence are sensitive to highlights.

In fact, any expression defining colors on the same linear triangular color plane, spanned by the two reflection components in RGB-color space, is an invariant for the dichromatic reflection model with white illumination. To that end, a new color model l1l2l3 is proposed, uniquely determining the direction of the triangular color plane in RGB-space:

$$l_1 = \frac{(R - G)^2}{(R - G)^2 + (R - B)^2 + (G - B)^2}, \quad (27)$$

$$l_2 = \frac{(R - B)^2}{(R - G)^2 + (R - B)^2 + (G - B)^2}, \quad (28)$$

$$l_3 = \frac{(G - B)^2}{(R - G)^2 + (R - B)^2 + (G - B)^2}, \quad (29)$$

the set of normalized color differences which is, similar to H, a photometric color invariant for matte as well as for shiny surfaces, which follows from substituting Eq. (12) into Eqs. (27)-(29), and which for l_1 results in

$$l_1(R_w, G_w, B_w) = \frac{(R_w - G_w)^2}{(R_w - G_w)^2 + (R_w - B_w)^2 + (G_w - B_w)^2} = \frac{(e\, m_b(n, s)(k_R - k_G))^2}{(e\, m_b(n, s)(k_R - k_G))^2 + (e\, m_b(n, s)(k_R - k_B))^2 + (e\, m_b(n, s)(k_G - k_B))^2} = \frac{(k_R - k_G)^2}{(k_R - k_G)^2 + (k_R - k_B)^2 + (k_G - k_B)^2}, \quad (30)$$

only dependent on the sensors and the surface albedo. Equal arguments hold for l_2 and l_3.

4. Reflectance with colored illumination

4.1. The reflection model

We consider the body reflection term of the dichromatic reflection model

$$C_c = m_b(n, s)\int_\lambda f_C(\lambda)\, e(\lambda)\, c_b(\lambda)\, d\lambda \quad (31)$$

for C = {R, G, B}, where C_c = {R_c, G_c, B_c} gives the red, green and blue sensor response of a matte infinitesimal surface patch of an inhomogeneous dielectric object under unknown spectral power distribution of the illumination.

Suppose that the sensor sensitivities of the color camera are narrow band with spectral responses approximated by delta functions f_C(λ) = δ(λ − λ_C); then the measured sensor values are

$$C_c = m_b(n, s)\, e(\lambda_C)\, c_b(\lambda_C). \quad (32)$$

By simply filling in C_c in the color model equations given in Section 2, it can easily be seen that all color model values change with a change in illumination color. To that end, a new color constant color model is proposed in the next section.

4.2. Color constant color feature for matte, dull surfaces

Existing color constancy methods require specific a priori information about the observed scene (e.g. the placement of calibration patches of known spectral reflectance in the scene), which will not be feasible in practical situations [8,9,3], for example. To circumvent these problems, Funt and Finlayson [2] propose simple and effective illumination-independent color ratios for the purpose of object recognition. However, it is assumed that the neighboring points, from which the color ratios are computed, have the same surface normal. Therefore, the method depends on varying surface orientation of the object (i.e. the geometry of the objects), affecting negatively the recognition performance. To this end, we propose a new color constant color ratio, not only independent of the illumination color but also discounting the object's

geometry:

$$m(C_1^{x_1}, C_1^{x_2}, C_2^{x_1}, C_2^{x_2}) = \frac{C_1^{x_1}\, C_2^{x_2}}{C_1^{x_2}\, C_2^{x_1}}, \quad C_1 \neq C_2, \quad (33)$$

expressing the color ratio between two neighboring image locations, for C_1, C_2 ∈ {R, G, B}, where x_1 and x_2 denote the image locations of the two neighboring pixels. Note that the set {R, G, B} must be colors from narrow-band sensor filters and that they are used in defining the color ratio because they are immediately available from a color camera, but any other set of narrow-band colors derived from the visible spectrum will do as well.

If we assume that the color of the illumination is locally constant (at least over the two neighboring locations from which the ratio is computed, i.e. e^{x_1}(λ) = e^{x_2}(λ)), the color ratio is independent of the illumination intensity and color, and also of a change in viewpoint, object geometry, and illumination direction, as shown by substituting Eq. (32) into Eq. (33):

$$m(C_1^{x_1}, C_1^{x_2}, C_2^{x_1}, C_2^{x_2}) = \frac{(m_b^{x_1}(n, s)\, e^{x_1}(\lambda_{C_1})\, c_b^{x_1}(\lambda_{C_1}))(m_b^{x_2}(n, s)\, e^{x_2}(\lambda_{C_2})\, c_b^{x_2}(\lambda_{C_2}))}{(m_b^{x_2}(n, s)\, e^{x_2}(\lambda_{C_1})\, c_b^{x_2}(\lambda_{C_1}))(m_b^{x_1}(n, s)\, e^{x_1}(\lambda_{C_2})\, c_b^{x_1}(\lambda_{C_2}))} = \frac{c_b^{x_1}(\lambda_{C_1})\, c_b^{x_2}(\lambda_{C_2})}{c_b^{x_2}(\lambda_{C_1})\, c_b^{x_1}(\lambda_{C_2})}, \quad (34)$$

factoring out dependencies on object geometry and illumination direction m_b^{x_1}(n, s) and m_b^{x_2}(n, s), and illumination e^{x_1} and e^{x_2} as e^{x_1}(λ_{C_1}) = e^{x_2}(λ_{C_1}) and e^{x_1}(λ_{C_2}) = e^{x_2}(λ_{C_2}), and hence only dependent on the ratio of surface albedos, where x_1 and x_2 are two neighboring locations on the object's surface, not necessarily of the same orientation.

Note that the color ratio does not require any specific a priori information about the observed scene, as the color model is an illumination-invariant surface descriptor based on the ratio of surface albedos rather than the recovering of the actual surface albedo itself. Also, the intensity and spectral power distribution of the illumination is allowed to vary across the scene (e.g. multiple light sources with different SPDs), and a certain amount of object occlusion and cluttering is tolerated due to the local computation of the color ratio. The color model is not restricted to Mondrian worlds where the scenes are flat, but any 3-D real-world scene is suited as the color model can cope with varying surface orientations of objects. Further note that the color ratio is insensitive to a change in surface orientation, illumination direction and intensity for matte objects under white light, even without the constraint of narrow-band filters, as follows from substituting Eq. (13) into Eq. (33):

$$\frac{(e^{x_1}\, m_b^{x_1}(n, s)\, k_{C_1}^{x_1})(e^{x_2}\, m_b^{x_2}(n, s)\, k_{C_2}^{x_2})}{(e^{x_2}\, m_b^{x_2}(n, s)\, k_{C_1}^{x_2})(e^{x_1}\, m_b^{x_1}(n, s)\, k_{C_2}^{x_1})} = \frac{k_{C_1}^{x_1}\, k_{C_2}^{x_2}}{k_{C_1}^{x_2}\, k_{C_2}^{x_1}}, \quad (35)$$

only dependent on the sensors and the surface albedo.

Having three color components at two locations, the color ratios obtained from an RGB-color image are

$$m_1 = \frac{R^{x_1}\, G^{x_2}}{R^{x_2}\, G^{x_1}}, \quad (36)$$

$$m_2 = \frac{R^{x_1}\, B^{x_2}}{R^{x_2}\, B^{x_1}}, \quad (37)$$

$$m_3 = \frac{G^{x_1}\, B^{x_2}}{G^{x_2}\, B^{x_1}}. \quad (38)$$

For the ease of exposition, we concentrate on m_1, based on the RG-color bands, in the following discussion. Without loss of generality, all results derived for m_1 will also hold for m_2 and m_3.

Taking the natural logarithm of both sides of Eq. (33) results for m_1 in

$$\ln m_1(R^{x_1}, R^{x_2}, G^{x_1}, G^{x_2}) = \ln\frac{R^{x_1}\, G^{x_2}}{R^{x_2}\, G^{x_1}} = \ln R^{x_1} + \ln G^{x_2} - \ln R^{x_2} - \ln G^{x_1} = \ln\left(\frac{R^{x_1}}{G^{x_1}}\right) - \ln\left(\frac{R^{x_2}}{G^{x_2}}\right). \quad (39)$$

Hence, the color ratios can be seen as differences at two neighboring locations x_1 and x_2 in the image domain of the logarithm of R/G:

$$d_{m_1}(x_1, x_2) = \ln\left(\frac{R}{G}\right)^{x_1} - \ln\left(\frac{R}{G}\right)^{x_2}. \quad (40)$$

By taking these differences in a particular direction between neighboring pixels, the finite-difference differentiation is obtained of the logarithm of image R/G, which is independent of the illumination color, and also of a change in viewpoint, the object geometry, and illumination intensity. We have taken the gradient magnitude by applying Canny's edge detector (derivative of the Gaussian with σ = 1.0) on image ln(R/G) with non-maximum suppression in a standard way to obtain gradient magnitudes at local edge maxima denoted by G_{m_1}(x), where the Gaussian smoothing suppresses the sensitivity of the color ratios to noise. The results obtained so far for m_1 hold also for m_2 and m_3, yielding a 3-tuple (G_{m_1}(x), G_{m_2}(x), G_{m_3}(x)) denoting the gradient magnitudes at local edge maxima in images ln(R/G), ln(R/B) and ln(G/B), respectively. For pixels on a uniformly colored region (i.e. with fixed surface albedo), in theory, all three components will be zero, whereas at least one of the three components will be non-zero for pixels at locations where two regions of distinct surface albedo meet.
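The log-ratio gradient computation described above can be sketched as follows (our illustration, not the authors' code; plain finite differences stand in for the Gaussian-derivative Canny operator with non-maximum suppression used in the paper):

```python
import numpy as np

def log_ratio_gradients(rgb, eps=1e-6):
    """Gradient magnitudes of ln(R/G), ln(R/B), ln(G/B), cf. Eqs. (36)-(40).

    rgb: float array of shape (H, W, 3). The small eps guards against
    division by zero and log of zero in dark pixels.
    """
    r = rgb[..., 0] + eps
    g = rgb[..., 1] + eps
    b = rgb[..., 2] + eps
    grads = []
    for chan in (np.log(r / g), np.log(r / b), np.log(g / b)):
        gy, gx = np.gradient(chan)        # finite-difference derivatives
        grads.append(np.hypot(gx, gy))    # gradient magnitude
    return np.stack(grads, axis=-1)       # (H, W, 3): (G_m1, G_m2, G_m3)
```

On a uniformly colored patch all three components are (near) zero, and scaling the whole image by a common intensity factor leaves them unchanged; they respond only where the surface albedo changes, matching the invariance argument above.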

Fig. 1. Overview of the various color models and their invariance to various imaging conditions. + denotes invariance and − denotes sensitivity of the color model to the imaging condition.

Fig. 2. Left: 16 images which are included in the image database of 500 images. The images are representative for the images in the
database. Right: Corresponding images from the query set.
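As a companion to the overview in Fig. 1, the per-pixel color models of Sections 2 and 3 can be computed from RGB as in the following sketch (function and dictionary names are ours, not from the paper; arctan2 is used for hue to keep the full angular range, and a small eps guards the divisions):

```python
import numpy as np

def color_models(R, G, B, eps=1e-9):
    """Per-pixel color models of Sections 2-3 (scalars or NumPy arrays)."""
    s = R + G + B + eps
    rgb = (R / s, G / s, B / s)                               # Eqs. (3)-(5)
    H = np.arctan2(np.sqrt(3) * (G - B), (R - G) + (R - B))   # Eq. (6)
    S = 1 - np.minimum(np.minimum(R, G), B) / s               # Eq. (7)
    c1 = np.arctan(R / (np.maximum(G, B) + eps))              # Eqs. (19)-(21)
    c2 = np.arctan(G / (np.maximum(R, B) + eps))
    c3 = np.arctan(B / (np.maximum(R, G) + eps))
    d = (R - G) ** 2 + (R - B) ** 2 + (G - B) ** 2 + eps      # Eqs. (27)-(29)
    l1, l2, l3 = (R - G) ** 2 / d, (R - B) ** 2 / d, (G - B) ** 2 / d
    return {"rgb": rgb, "H": H, "S": S, "c": (c1, c2, c3), "l": (l1, l2, l3)}
```

Scaling (R, G, B) by a common factor, i.e. the term e·m_b(n, s) of Eq. (13), leaves every returned model unchanged, which is exactly the invariance claimed in Fig. 1.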

5. Summary of the theoretical results

In conclusion, assuming dichromatic reflection and white illumination, normalized color rgb, saturation S and hue H, and the newly proposed color models c1c2c3, l1l2l3 and m1m2m3 are all invariant to the viewing direction, object geometry and illumination. Further, hue H and l1l2l3 are also invariant to highlights. m1m2m3 is independent of the illumination color and inter-reflections (i.e. objects receiving reflected light from other objects) under the assumption of narrow-band filters. These results are summarized in Fig. 1.

To evaluate photometric color invariant object recognition in practice, in the next section the various color models are evaluated and compared on an image database of 500 images taken from 3-D multicolored man-made objects.

6. Color-based object recognition: experiments

In the experiments, we focus on object recognition by histogram matching for comparison reasons in the literature. Obviously, transforming RGB to one of the

invariant color models can be performed as a preprocessing step by other matching techniques.

This section is organized as follows. First, in Section 6.1, the experimental setup is given. The experimental results are given in Section 6.2.

6.1. Experimental setup

The following section is outlined as follows. First, the data sets on which the experiments will be conducted are described in Section 6.1.1. Error measures are given in Section 6.1.2. Histogram formation and similarity measure are given in Section 6.1.3.

6.1.1. Datasets

The database consists of N_1 = 500 reference images of multicolored 3-D domestic objects, tools, toys, etc. Objects were recorded in isolation (one per image) with the aid of the SONY XC-003P CCD color camera (3 chips) and the Matrox Magic Color frame grabber. Objects were recorded against a white cardboard background. Two light sources of average day-light color are used to illuminate the objects in the scene. A second, independent set (the test set) of recordings was made of randomly chosen objects already in the database. These objects, N_2 = 70 in number, were recorded again one per image with a new, arbitrary position and orientation with respect to the camera, some recorded upside down, some rotated, some at different distances.

In Fig. 2, 16 images from the image database of 500 images are shown on the left. Corresponding images coming from the query set are shown on the right.

More information about color-based object recognition can be found in [10]. The image database and the performance of the recognition scheme can be experienced within the PicToSeek and Pic2Seek systems on-line at http://www.wins.uva.nl/research/isis/zomax/.

6.1.2. Error measures

For a measure of match quality, let rank r^{Q_i} denote the position of the correct match for test image Q_i, i = 1, 2, …, N_2, in the ordered list of N_1 match values. The rank r^{Q_i} ranges from r = 1 for a perfect match to r = N_1 for the worst possible match.

Then, for one experiment, the average ranking percentile is defined by

$$\bar{r} = \left(\frac{1}{N_2} \sum_{i=1}^{N_2} \frac{N_1 - r^{Q_i}}{N_1 - 1}\right) 100\%. \quad (41)$$

The cumulative percentile of test images producing a rank smaller than or equal to j is defined as

$$X(j) = \left(\frac{1}{N_2} \sum_{k=1}^{j} g(r^{Q_i} = k)\right) 100\%, \quad (42)$$

where g reads as the number of test images having rank k.

6.1.3. Similarity measure and histogram formation

Histograms are constructed on the basis of different color features representing the distribution of discrete color feature values in an n-dimensional color feature space, where n = 3 for RGB, rgb, l1l2l3, c1c2c3 and m1m2m3, and n = 1 for I, S and H. During histogram construction, all pixels in a color image are discarded with a local saturation and intensity smaller than 5% of the total range. Consequently, the white cardboard background as well as the grey, white, dark or nearly colorless parts of objects as recorded in the color image will not be

Fig. 3. The discriminative power of the histogram matching process differentiated for the various color features plotted against the ranking j. The cumulative percentile X for H_{l1l2l3}, H_H, H_{c1c2c3}, H_{rgb}, H_{m1m2m3}, H_S and H_{RGB} is given by X_{l1l2l3}, X_H, X_{c1c2c3}, X_{rgb}, X_{m1m2m3}, X_S and X_{RGB}, respectively.
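The error measures of Eqs. (41) and (42) amount to the following (a sketch with our own variable names; ranks holds the ranks r^{Q_i} of the test images and n_db is the database size N_1):

```python
def average_ranking_percentile(ranks, n_db):
    """Average ranking percentile, Eq. (41): 100% means every query ranked first."""
    n_test = len(ranks)
    return sum((n_db - r) / (n_db - 1) for r in ranks) * 100.0 / n_test

def cumulative_percentile(ranks, j):
    """Cumulative percentile X(j), Eq. (42): share of queries with rank <= j."""
    return 100.0 * sum(1 for r in ranks if r <= j) / len(ranks)
```

For example, with n_db = 500, three queries all ranked first give an average ranking percentile of 100%, and X(1) counts the fraction of perfect matches.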

Fig. 4. The discriminative power of the histogram matching process differentiated for the various color features plotted against the illumination intensity variation as expressed by the factor a. The average percentile r̄ for H_{l1l2l3}, H_H, H_{c1c2c3}, H_{rgb}, H_{m1m2m3}, H_S, H_{RGB} and H_I is given by r̄_{l1l2l3}, r̄_H, r̄_{c1c2c3}, r̄_{rgb}, r̄_{m1m2m3}, r̄_S, r̄_{RGB} and r̄_I, respectively.

Fig. 5. Four of the 10 objects with spatially varying illumination.

Fig. 6. Ranking statistics of matching the 10 images with spatially varying illumination against the database of 500 images.

considered in the matching process. For comparison reasons in the literature, in this paper, the histogram similarity function is expressed by histogram intersection [1]. Histogram axes are partitioned uniformly with fixed intervals. The resolution on the axes follows from the amount of noise and computational efficiency considerations. We determined the appropriate bin size for our application empirically. This has been achieved by varying the same number of bins on the axes over q ∈ {2, 4, 8, 16, 32, 64, 128, 256}, and choosing the smallest q for which the number of bins is kept small for computational efficiency and large for recognition accuracy. The results show (not presented here) that the number of bins was of little influence on the recognition accuracy when the number of bins ranges from q = 32 to 256 for all color spaces. Therefore, the histogram bin size used during histogram formation is q = 32 in the following. For each test and reference image, 3-D histograms are created for the RGB, l1l2l3, rgb and c1c2c3 color spaces, denoted by H_{RGB}, H_{l1l2l3}, H_{rgb} and H_{c1c2c3}, respectively. Furthermore, 1-D histograms are created for I, S and H, denoted by H_I, H_S and H_H.

Assuming a uniform distribution of the RGB colors implies, however, a non-uniform distribution of the color ratios m_1, m_2 and m_3 and the corresponding G_{m_1}(x), G_{m_2}(x) and G_{m_3}(x), denoting the gradient magnitude at local edge maxima in images ln(R/G), ln(R/B) and ln(G/B), respectively. Unfortunately, we observed from the reference images in the datasets that RGB colors are non-uniformly distributed and hence a theoretical model of the probability distribution of ratios is not feasible. To that end, an experimental probability distribution is generated by computing G_{m_1}(x), G_{m_2}(x) and G_{m_3}(x) for the 500 images in the image database. According to the experimentally determined probability distribution (not shown here), we partition the gradient magnitude axes finely near 0 and sparsely when reaching the maximum by their projection onto the log axis. In this way, a 3-dimensional histogram is created for G_{m_1}(x), G_{m_2}(x) and G_{m_3}(x), denoted by H_{m1m2m3}.
462 T. Gevers, A.W.M. Smeulders / Pattern Recognition 32 (1999) 453–464
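The histogram construction and matching just described can be sketched in a few lines. This is our own minimal illustration (q = 32 bins per axis, normalized histograms, intersection as the similarity measure of [1]), not the authors' original implementation; all variable names are ours.

```python
import numpy as np

def color_histogram(image, q=32):
    """3-D histogram of an RGB image (values in 0..255) with q bins per axis."""
    pixels = image.reshape(-1, 3)
    hist, _ = np.histogramdd(pixels, bins=(q, q, q), range=((0, 256),) * 3)
    # Normalize so that histograms of images of different sizes are comparable.
    return hist / hist.sum()

def histogram_intersection(h_test, h_ref):
    """Histogram intersection of two normalized histograms; 1.0 is a perfect match."""
    return np.minimum(h_test, h_ref).sum()

# Toy usage: an image matched against itself scores (numerically) 1.0.
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64, 3))
h = color_histogram(img)
print(round(histogram_intersection(h, h), 6))  # → 1.0
```

With normalized histograms, intersection reduces to the sum of bin-wise minima, so a reference image is ranked by how much of the test histogram it covers.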

Fig. 7. The discriminative power of the histogram matching process differentiated for the various color features plotted against the change b in the color composition of the illumination spectrum. The average percentile r̄ for H_{l₁l₂l₃}, H_H, H_{c₁c₂c₃}, H_{m₁m₂m₃}, H_S and H_{RGB} is given by r̄_{l₁l₂l₃}, r̄_H, r̄_{c₁c₂c₃}, r̄_{m₁m₂m₃}, r̄_S and r̄_{RGB}, respectively.
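The channel-wise illumination change behind this experiment (R, G and B multiplied by b, 1 and 2 − b, respectively) only adds a constant to a log color-ratio image such as ln(R/G), so the gradients of that image are untouched. A toy check of this invariance, with our own variable names and single-band arrays standing in for images:

```python
import numpy as np

def log_ratio_gradient(R, G):
    """Horizontal finite-difference gradient of the ln(R/G) image."""
    return np.diff(np.log(R / G), axis=1)

rng = np.random.default_rng(1)
R = rng.uniform(10, 200, size=(8, 8))
G = rng.uniform(10, 200, size=(8, 8))

b = 0.7  # bluish shift: R -> b*R, G -> 1*G (and B would become (2 - b)*B)
g_before = log_ratio_gradient(R, G)
g_after = log_ratio_gradient(b * R, 1.0 * G)

# The scaling adds the constant ln(b) to ln(R/G), which the gradient removes:
print(np.allclose(g_before, g_after))  # → True
```

The same cancellation holds for ln(R/B) and ln(G/B), which is why the histogram H_{m₁m₂m₃} built from these gradients is unaffected by the factor b.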

Fig. 8. Overview of which color models to use under which imaging conditions; + denotes a controlled and − denotes an uncontrolled imaging condition.
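One of the invariances summarized in this overview can be verified directly: a uniform intensity change multiplies every RGB band by the same factor a, which cancels in normalized color rgb. A toy check (the function and variable names are ours):

```python
import numpy as np

def normalized_rgb(rgb):
    """Normalized color rgb: each pixel divided by its R + G + B sum."""
    return rgb / rgb.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(2)
img = rng.uniform(1, 255, size=(16, 16, 3))

for a in (0.5, 0.9, 1.2, 1.5):  # a few of the intensity factors a used in the experiment
    scaled = a * img            # an intensity change scales all bands uniformly
    assert np.allclose(normalized_rgb(scaled), normalized_rgb(img))

# RGB itself, by contrast, is clearly not invariant:
print(np.allclose(0.5 * img, img))  # → False
```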

6.2. Experimental results

6.2.1. Results with white illumination
In this section, we report on the recognition accuracy of the matching process for the 70 test images and the 500 reference images for the various color features. As stated, white lighting was used during the recording of the reference images in the image database and of the independent test set. However, the objects were recorded in a new, arbitrary position and orientation with respect to the camera. In Fig. 3 the accumulated ranking percentile is shown for the various color features.

From the results of Fig. 3 we can observe that the discriminative power of l₁l₂l₃ and H, followed by c₁c₂c₃, rgb and m₁m₂m₃, is higher than that of the other color models, achieving, respectively, 97, 96, 94, 92 and 89 perfect matches out of 100. Saturation S provides significantly worse recognition accuracy. As expected, the discriminative power of RGB is worst, due to its sensitivity to varying viewing directions and object positionings.

6.2.2. The effect of a change in the illumination intensity
The effect of a change in the illumination intensity is equal to the multiplication of each RGB-color by a uniform scalar factor a. In theory, we have shown that only the RGB and I color features are sensitive to changes in the illumination intensity. To measure the sensitivity of the different color features in practice, the RGB-images of the test set are multiplied by a constant factor varying over a ∈ {0.5, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.5}. The discriminative power of the histogram matching process differentiated for the various color features, plotted against the illumination intensity, is shown in Fig. 4.

As expected, the RGB and I color features depend on the illumination intensity. The further the illumination intensity deviates from the original value (i.e. a = 1), the worse the discriminative power that is achieved. Note that objects are recognized randomly for r̄ = 50. Furthermore, all other color features are fairly independent of a varying intensity of the illumination.

To test the recognition accuracy for real images under varying illumination intensity, an independent test set of
recordings was made of 10 objects chosen at random from those already in the database of 500 images. These objects were recorded again with the same pose but with spatially varying illumination intensity, see Fig. 5. Then these 10 images were matched against the database of 500 images. From Fig. 6 it can be observed that the discriminative power of c₁c₂c₃ and rgb (with 9 perfect matches out of 10) with respect to l₁l₂l₃ and H is similar or even better, due to the minor amount of highlights in the test set. Further, m₁m₂m₃ shows very high matching accuracy, whereas S, RGB and I provide very poor matching accuracy under spatially varying illumination.

6.2.3. The effect of a change in the illumination color
Based on the coefficient rule or von Kries model, a change in the illumination color is approximated by a 3×3 diagonal matrix among the sensor bands and is equal to the multiplication of each RGB-color band by an independent scalar factor [3,11]. Note that the diagonal model of illumination change holds exactly in the case of narrow-band sensors. In theory, all color features except the color ratios m₁m₂m₃ are sensitive to changes in the illumination color. To measure the sensitivity of the various color features in practice with respect to a change in the color of the illumination, the R-, G- and B-images of the test set are multiplied by the factors b₁ = b, b₂ = 1 and b₃ = 2 − b, respectively (i.e. b₁R, b₂G and b₃B), varying b over {0.5, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.5}. The discriminative power of the histogram matching process differentiated for the various color features, plotted against the illumination color, is shown in Fig. 7. For b < 1 the color is bluish, whereas it is reddish for b > 1.

As expected, only the color ratios m₁m₂m₃ are insensitive to a change in illumination color. From Fig. 7 we can observe that the color features l₁l₂l₃, H, c₁c₂c₃ and rgb, which achieved the highest recognition accuracy under white illumination, see Figs. 3, 4 and 6, are highly sensitive to a change in illumination color, followed by S and RGB. Even for a slight change in the illumination color, their recognition potential degrades drastically.

7. Discussion

From the experimental results it is concluded that, under the assumption of a white light source, the discriminative power of l₁l₂l₃ and H, followed by c₁c₂c₃, rgb and m₁m₂m₃, is approximately the same. Saturation S provides significantly worse recognition accuracy. The discriminative power of RGB is worst, due to its sensitivity to varying imaging conditions. When no constraints are imposed on the illumination, the proposed color ratios m₁m₂m₃ are most appropriate.

Based on both the reported theory and the experimental results, we now present a schema of which color models to use under which imaging conditions to achieve both invariant and discriminatory object recognition, see Fig. 8.

The schema is useful for object recognition applications where no constraints on the imaging process can be imposed, as well as for applications where one or more parameters of the imaging process can be controlled, such as robots and industrial inspection (e.g. controlled object positioning and lighting). For such a case, color models can be used for object recognition which are less invariant (at least under the given imaging parameters) but have higher discriminative power. For example, for an inspection task in which the lighting is controlled, but not the exact position of the object (on the conveyor belt), the color model l₁l₂l₃ is most appropriate for the task at hand. In addition, when the object does not produce a significant amount of highlights, then c₁c₂c₃ or rgb should be taken.

8. Conclusion

In this paper, new color models have been proposed which are analyzed in theory and evaluated in practice for the purpose of recognition of multicolored objects invariant to a substantial change in viewpoint, object geometry and illumination.

In conclusion, RGB is most appropriate for multicolored object recognition when all imaging conditions are controlled. Without the presence of highlights and under the constraint of white illumination, c₁c₂c₃ and normalized color rgb are most appropriate. When images are also contaminated by highlights, l₁l₂l₃ or H should be taken for the job at hand. When no constraints are imposed on the SPD of the illumination, m₁m₂m₃ is most appropriate.

We concluded by presenting a schema of which color models to use under which imaging conditions to achieve both invariant and discriminatory recognition of multicolored objects.

References

[1] M.J. Swain, D.H. Ballard, Color indexing, Int. J. Computer Vision 7(1) (1991) 11–32.
[2] B.V. Funt, G.D. Finlayson, Color constant color indexing, IEEE Trans. PAMI 17(5) (1995) 522–529.
[3] E.H. Land, J.J. McCann, Lightness and retinex theory, J. Opt. Soc. Am. 61 (1971) 1–11.
[4] G. Healey, D. Slater, Global color constancy: recognition of objects by use of illumination invariant properties of color distributions, J. Opt. Soc. Am. 11(11) (1995) 3003–3010.
[5] G.D. Finlayson, S.S. Chatterjee, B.V. Funt, Color angular indexing, ECCV96, Vol. II, 1996, pp. 16–27.

[6] D. Slater, G. Healey, The illumination-invariant recognition of 3-D objects using local color invariants, IEEE Trans. PAMI 18(2) (1996) 206–211.
[7] S.A. Shafer, Using color to separate reflection components, COLOR Res. Appl. 10(4) (1985) 210–218.
[8] D. Forsyth, A novel algorithm for color constancy, Int. J. Comput. Vision 5 (1990) 5–36.
[9] B.V. Funt, M.S. Drew, Color constancy computation in near-mondrian scenes, CVPR, IEEE Computer Society Press, Silver Spring, MD, 1988, pp. 544–549.
[10] T. Gevers, Color image invariant segmentation and retrieval, Ph.D. Thesis, ISBN 90-74795-51-X, University of Amsterdam, The Netherlands, 1996.
[11] G.D. Finlayson, M.S. Drew, B.V. Funt, Spectral sharpening: sensor transformations for improved color constancy, J. Opt. Soc. Am. 11(5) (1994) 1553–1563.

About the Author—THEO GEVERS received his Ph.D. degree in Computer Science from the University of Amsterdam in 1996 for
a thesis on color image segmentation and retrieval. His main research interests are in the fundamentals of image database system design,
image retrieval by content, theoretical foundation of geometric and photometric invariants and color image processing.

About the Author—ARNOLD W.M. SMEULDERS is professor of Computer Science on Multi Media Information Systems. He has
been in image processing since 1977 when he completed his M.Sc. in physics from Delft University of Technology. Initially, he was
interested in accurate and precise measurement from digital images. His current interest is in image databases and intelligent interactive
image analysis systems, as well as method- and system engineering aspects of image processing and image processing for documents and
geographical information.
