Color Features
Chrominance
• Composition of wavelengths gives chrominance
– It is specified by hue (the dominant wavelengths) and
saturation (the purity) of a colour
– A pure colour has 100% saturation
– All shades of colourless (grey) light, e.g. white light, have
0% saturation.
• The hue (H) of a color refers to which pure color it resembles.
All tints, tones and shades of red have the same hue.
• Hues are described by a number that specifies the position of
the corresponding pure color on the color wheel, as a fraction
between 0 and 1. Value 0 refers to red; 1/6 is yellow; 1/3 is
green; and so forth around the color wheel.
• The saturation (S) of a color describes how far the color is
from white. A pure red is fully saturated, with a saturation of 1;
tints of red have saturations less than 1; and white has a
saturation of 0.
• The value (V) of a color, also called its lightness, describes
how bright the color is. A value of 0 is black, with lightness
increasing as the color moves away from black.
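The H, S and V conventions described above appear to match those of Python's standard colorsys module, so the quoted values can be checked with a minimal sketch. It assumes RGB components scaled to [0, 1]; the helper name describe is only illustrative.

```python
import colorsys

def describe(r, g, b):
    """Return (hue, saturation, value), each as a fraction in [0, 1]."""
    return colorsys.rgb_to_hsv(r, g, b)

red = describe(1.0, 0.0, 0.0)      # pure red: hue 0, saturation 1
yellow = describe(1.0, 1.0, 0.0)   # hue 1/6 of the way around the wheel
green = describe(0.0, 1.0, 0.0)    # hue 1/3
pink = describe(1.0, 0.5, 0.5)     # a tint of red: same hue, saturation below 1
white = describe(1.0, 1.0, 1.0)    # saturation 0
black = describe(0.0, 0.0, 0.0)    # value 0
```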
Single Hexcone Model of Colour Space
• The outer edge of the top of the cone is the
color wheel, with all the pure colors. The H
parameter describes the angle around the
wheel.
• The S (saturation) is zero for any color on the
axis of the cone; the center of the top circle is
white. An increase in the value of S corresponds
to a movement away from the axis.
• The V (value or lightness) is zero for black. An
increase in the value of V corresponds to a
movement away from black and toward the top
of the cone.
• The Ostwald diagram corresponds to a slice of
this cone. For example, the triangle between
red, white, and black is the Ostwald diagram
for the varieties of red.
Challenges in Using Color Features for CBIR
• The sensed colour varies considerably with
– 3D surface orientation
– camera viewpoint
– illumination of the scene, e.g., positions and spectra of illuminating sources.
• Also, human judgement of the perceptual similarity of colours
is quite subjective.
• To design colour descriptors, one should
i. specify colour space,
ii. its partitioning,
iii. how to measure similarity between colours
Concerns
• An absolute colour space defines unambiguous colours
that are independent of external factors
• Most of the popular colour spaces (e.g. RGB or HSI) are
not absolute
• They can be made absolute by more precise definitions
of or standards for their elements (e.g. sRGB).
• Absolute colour spaces such as L*a*b* define an exact
abstract colour that can be precisely reproduced when
an accurate device is viewed under the right conditions.
RGB colour space
• A colour space is a multidimensional space of colour
components.
• Human colour perception combines the three primary
colours:
– red (R) with the wavelength λ = 700 nm
– green (G) with the wavelength λ = 546.1 nm
– blue (B) with the wavelength λ = 435.8 nm.
• Any visible wavelength λ is sensed as a colour obtained
by a linear combination of the three primary colours
(R, G, B) with the particular weights cR(λ), cG(λ), cB(λ):
C(λ) = cR(λ) R + cG(λ) G + cB(λ) B
XYZ Primary Colours
• The imaginary (physically unrealisable) primary colours XYZ
are chosen so that only non-negative weights cX(λ),
cY(λ), cZ(λ) appear in the colour representation:
C(λ) = cX(λ) X + cY(λ) Y + cZ(λ) Z
RGB Colour Space
• The RGB representation is most popular:
– It closely relates to human colour perception
– A majority of imaging devices produce RGB
images
• Gamma correction compensates for the non-linear
relationship S = L^γ between the signal S and the
light intensity L in imaging devices; it is applied before
storing, transmitting, or processing the images
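The power-law relationship S = L^γ can be illustrated with a minimal sketch. It assumes intensities normalised to [0, 1] and uses γ = 1/2.2 purely as an illustrative value; the slides do not state a particular γ, and the function names are hypothetical.

```python
def gamma_encode(intensity, gamma=1 / 2.2):
    """Map linear light intensity L in [0, 1] to the signal S = L ** gamma."""
    return intensity ** gamma

def gamma_decode(signal, gamma=1 / 2.2):
    """Invert the encoding: L = S ** (1 / gamma)."""
    return signal ** (1 / gamma)

mid_grey = gamma_encode(0.5)          # encoded signal is larger than 0.5
round_trip = gamma_decode(mid_grey)   # recovers approximately 0.5
```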
Variants of RGB Spaces
• RGB spaces in different application domains:
– Linear w.r.t. XYZ, not CIE-based (scanners, cameras)
– Non-linear CIE-based RGB spaces (displays, TV)
– Colorimetric sRGB standard (the Internet)
• The RGB space is not perceptually uniform: distances
do not reflect perceptual dissimilarity
• A large number of spaces derived from the RGB have
been used in practice for query-by-colour
applications
RGB and Query-by-Colour
• The initial RGB representation of an image is
of retrieval value only if recording was
performed in stable conditions
– Only in rare cases, e.g. for art paintings
• RGB coordinates are strongly interdependent
– RGB coordinates describe not only inherent colour
properties of objects but also variations of
illumination and other external factors
Independent Chrominance
• Luminance (e.g., R+B+G) is separated from the
two orthogonal chrominance components
that form independent (or opponent) axes:
R + G + B, R − G, −R − G + 2B
• Luminance and relative 2D colour coordinates:
(R, G, B) ⇒ (r, g, b) with r + g + b = 1:
r = R / (R + G + B); g = G / (R + G + B); b = B / (R + G + B)
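As a rough sketch of the two transformations on this slide, the functions below compute the opponent axes and the normalised (r, g, b) coordinates. The function names and the convention for black (all zeros) are assumptions made for illustration.

```python
def opponent_axes(r, g, b):
    """Luminance plus the two opponent chrominance axes listed above."""
    return (r + g + b, r - g, -r - g + 2 * b)

def normalised_rgb(r, g, b):
    """Relative colour coordinates (r, g, b) with r + g + b = 1."""
    total = r + g + b
    if total == 0:
        # Black has no defined chromaticity; returning zeros is an assumed convention.
        return (0.0, 0.0, 0.0)
    return (r / total, g / total, b / total)

bright = normalised_rgb(200, 100, 50)
dim = normalised_rgb(100, 50, 25)   # same colour at half the illumination intensity
```

Both calls give the same relative coordinates, which is the sense in which the normalised components do not depend on illumination intensity.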
Independent Chrominance
• Chrominance can be down-sampled
– human vision is more sensitive to brightness
than to chrominance
• Chrominance components: invariant to
changes in illumination intensity and shadows
– RGB-to-"luminance-chrominance"
transformations are computationally simple
– But: the resulting colour spaces are neither
uniform, nor natural
HSI (HSV) Colour Space
• HSI (hue-saturation-intensity) or HSV
(hue-saturation-value) is a non-linearly
transformed RGB space:
– The brightness (value, intensity) I = (R + G + B) / 3
axis is orthogonal to the chrominance plane
– The saturation S and the hue H are the radius and
angle, respectively, of the polar coordinates in the
chrominance plane
– This space is approximately perceptually uniform
HSI (HSV) Colour Space
• Conversion from RGB to HSI
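The conversion formulas themselves are not reproduced in the text of this slide. The sketch below uses one widely taught geometric formulation that is consistent with I = (R + G + B) / 3; it is an assumption that this matches the formulas originally shown. Inputs are assumed to be RGB in [0, 1], and hue is returned in degrees.

```python
import math

def rgb_to_hsi(r, g, b):
    """Convert RGB in [0, 1] to (H in degrees, S in [0, 1], I in [0, 1])."""
    intensity = (r + g + b) / 3
    if intensity == 0:
        return (0.0, 0.0, 0.0)            # black: hue and saturation set to 0
    numerator = 0.5 * ((r - g) + (r - b))
    # The radicand is non-negative in exact arithmetic; clamp against rounding.
    denominator = math.sqrt(max(0.0, (r - g) ** 2 + (r - b) * (g - b)))
    if denominator == 0:
        return (0.0, 0.0, intensity)      # grey: hue undefined, set to 0
    saturation = 1 - min(r, g, b) / intensity
    ratio = max(-1.0, min(1.0, numerator / denominator))
    hue = math.degrees(math.acos(ratio))
    if b > g:
        hue = 360 - hue                   # lower half of the colour wheel
    return (hue, saturation, intensity)
```

With this formulation, pure red, green and blue land at roughly 0, 120 and 240 degrees, and greys have zero saturation.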
HSI/HSV in MPEG-7
• In MPEG-7 the HSI / HSV colour space is defined in a
different way involving both the maximum and the
minimum RGB components
Other Opponent Colour Spaces
• YUV: PAL (Phase Alternating Line) TV (most
European countries, some Asian countries,
Australia, and New Zealand)
• Luminance (Y) and chrominance (U,V):
– Y = 0.299R + 0.587G + 0.114B
– U = 0.492(B−Y) = −0.147R − 0.289G + 0.436B
– V = 0.877(R−Y) = 0.615R − 0.515G − 0.100B
Other Opponent Colour Spaces
• YIQ: NTSC (National Television System Committee)
TV (USA, Canada, and Japan) with the luminance Y
and chrominance IQ (equal to UV rotated by 33°):
– Y = 0.299R + 0.587G + 0.114B
– I = −0.545U + 0.839V = 0.596R − 0.275G − 0.321B
– Q = 0.839U + 0.545V = 0.212R − 0.523G + 0.311B
• YDbDr: SECAM (Séquentiel Couleur à Mémoire) TV
(France, Russia, Eastern Europe) is the scaled YUV:
– Db = 3.059U and Dr = −2.169V
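The statement that IQ is UV rotated by 33° can be checked numerically. The sketch below builds YUV from the coefficients given for PAL, applies the rotation, and compares the result with the direct YIQ matrix quoted above. Function names are illustrative, and RGB values are assumed to lie in [0, 1].

```python
import math

def rgb_to_yuv(r, g, b):
    """YUV from RGB using the luminance weights and scale factors quoted above."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    return (y, 0.492 * (b - y), 0.877 * (r - y))

def yuv_to_yiq(y, u, v, angle_degrees=33.0):
    """Rotate the (U, V) chrominance plane to obtain (I, Q)."""
    a = math.radians(angle_degrees)
    i = -math.sin(a) * u + math.cos(a) * v
    q = math.cos(a) * u + math.sin(a) * v
    return (y, i, q)

def rgb_to_yiq_direct(r, g, b):
    """YIQ from RGB using the rounded matrix coefficients quoted above."""
    return (0.299 * r + 0.587 * g + 0.114 * b,
            0.596 * r - 0.275 * g - 0.321 * b,
            0.212 * r - 0.523 * g + 0.311 * b)

rotated = yuv_to_yiq(*rgb_to_yuv(1.0, 0.0, 0.0))
direct = rgb_to_yiq_direct(1.0, 0.0, 0.0)
```

The two routes agree to about three decimal places, which is the precision of the rounded coefficients.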
Other Opponent Colour Spaces
• YCbCr: JPEG and MPEG coding standards;
non-negative chrominance components are obtained by
scaling and shifting the YUV co-ordinates:
– Y = 0.257R + 0.504G + 0.098B + 16
– Cb = −0.148R − 0.291G + 0.439B + 128
– Cr = 0.439R − 0.368G − 0.071B + 128
• CIE uniform Lab (colour fax) and Luv
luminance-chrominance colour spaces
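The YCbCr formulas on this slide translate directly into code. The sketch below assumes 8-bit R, G, B inputs in 0..255 (the slide does not state the input range explicitly) and leaves the result unrounded.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit RGB (0..255) to YCbCr using the coefficients quoted above."""
    y = 0.257 * r + 0.504 * g + 0.098 * b + 16
    cb = -0.148 * r - 0.291 * g + 0.439 * b + 128
    cr = 0.439 * r - 0.368 * g - 0.071 * b + 128
    return (y, cb, cr)

black = rgb_to_ycbcr(0, 0, 0)         # offsets only: (16, 128, 128)
white = rgb_to_ycbcr(255, 255, 255)   # chrominance stays at the 128 midpoint
blue = rgb_to_ycbcr(0, 0, 255)        # Cb rises well above 128
```

The +128 shift is what keeps both chrominance components non-negative for every 8-bit input.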
HMMD colour space
• Hue-Max-Min-Difference: a new space in MPEG-7
– Used in the colour structure descriptor (CSD)
– MPEG-7 also supports the greyscale (intensity only) space,
RGB, HSV, and YCbCr colour spaces
– Hue: as in HSV space for MPEG-7
– Max / Min: max{R, G, B} and min{R, G, B}
– Difference: max{R, G, B} − min{R, G, B}
• Only 3 of the 4 components (hue, max, min, difference) are
needed to define the HMMD colour space, since the difference
follows from max and min
• Intensity: (min{R,G,B} + max{R,G,B})/2
• Chromaticity relates to the difference component
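The HMMD components listed above can be sketched as follows. Hue is taken from Python's colorsys HSV conversion, which is also based on the maximum and minimum RGB components; whether it matches the normative MPEG-7 hue exactly is an assumption. Inputs are assumed to be RGB in [0, 1], and the function name is illustrative.

```python
import colorsys

def rgb_to_hmmd(r, g, b):
    """Return (hue in degrees, max, min, difference, intensity) for RGB in [0, 1]."""
    hue = colorsys.rgb_to_hsv(r, g, b)[0] * 360.0   # assumed stand-in for MPEG-7 hue
    maximum = max(r, g, b)
    minimum = min(r, g, b)
    difference = maximum - minimum        # relates to chromaticity
    intensity = (maximum + minimum) / 2
    return (hue, maximum, minimum, difference, intensity)

pure_red = rgb_to_hmmd(1.0, 0.0, 0.0)
mid_grey = rgb_to_hmmd(0.5, 0.5, 0.5)    # zero difference: no chromaticity
```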
Vector Quantisation
• The whole 3D colour space is partitioned into K
disjoint subsets, one per code word c_k; k = 1, …, K
• All the colours of one subset are represented by,
or quantised to, the same code word c_k
• In a perceptually good palette C = {c_1, c_2, ..., c_K},
each subset contains visually similar colours
• Multidimensional clustering of colours: K-means,
fuzzy K-means, EM (Expectation-Maximisation)
Vector Quantisation Algorithm
• Double the number of cluster centres until a prescribed limit is reached
• At each iteration t there are K_t = 2^t centres C_t = {c_{k,t}: k = 1, ..., K_t}
– t = 0: a single centre c_{1,0}, the average colour vector over the image
– At each subsequent iteration t, each previous centre c_{k,t−1}; k = 1, ..., K_{t−1},
splits into two new centres:
• Provisional splitting into c^pr_{k,t} and c^pr_{K_{t−1}+k,t} by multiplying c_{k,t−1} by the
fixed factors (1+w) and (1−w), respectively (w is a fixed value), or by
shifting it to and from the most distant vector in the cluster, etc.
• Assigning each colour vector in the image to the closest provisional
cluster centre
• Forming the new centres c_{k,t} and c_{K_{t−1}+k,t} by averaging the colour
vectors assigned to each such cluster
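The splitting procedure above can be sketched in plain Python. This is a minimal illustration, not a tuned implementation. It uses the multiplicative (1 ± w) split, performs a single assignment pass per doubling as described on the slide, and keeps the provisional centre when a cluster receives no vectors. That last rule, the function names, and the default w are assumptions.

```python
import math

def mean(vectors):
    """Component-wise average of a non-empty list of equal-length tuples."""
    n = len(vectors)
    return tuple(sum(v[i] for v in vectors) / n for i in range(len(vectors[0])))

def nearest(vector, centres):
    """Index of the centre closest to vector in Euclidean distance."""
    return min(range(len(centres)), key=lambda k: math.dist(vector, centres[k]))

def split_vq(colours, max_centres, w=0.05):
    """Double the number of centres until doubling would exceed max_centres."""
    centres = [mean(colours)]                         # t = 0: one global centre
    while 2 * len(centres) <= max_centres:
        # Provisional split: centre k becomes k (factor 1+w) and K+k (factor 1-w).
        provisional = ([tuple(x * (1 + w) for x in c) for c in centres] +
                       [tuple(x * (1 - w) for x in c) for c in centres])
        # Assign every colour vector to its closest provisional centre.
        clusters = [[] for _ in provisional]
        for v in colours:
            clusters[nearest(v, provisional)].append(v)
        # New centres are cluster averages; an empty cluster keeps its provisional centre.
        centres = [mean(cl) if cl else p for cl, p in zip(clusters, provisional)]
    return centres

dark = [(20, 20, 20)] * 5
light = [(200, 200, 200)] * 5
palette = split_vq(dark + light, max_centres=2)
```

On this toy input the two resulting code words are the dark and the light grey. In practice the assignment and averaging steps are often repeated until the centres stop moving before the next split.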
Histogram
• A histogram represents the distribution of
colors in an image.
• It can be visualized as a graph (or plot) that
gives a high-level intuition of the intensity
(pixel value) distribution.
• We are going to assume an RGB color space in
this example, so these pixel values will be in
the range of 0 to 255.
How….?
• When plotting the histogram, the X-axis serves as our
“bins”.
• If we construct a histogram with 256 bins, then we
are effectively counting the number of times each
pixel value occurs.
• In contrast, if we use only 2 (equally spaced) bins,
then we are counting the number of times a pixel is
in the range [0, 128) or [128, 255].
• The number of pixels binned to the X-axis value is
then plotted on the Y-axis.
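The effect of the bin count can be seen in a small pure-Python sketch. It assumes 8-bit pixel values (0 to 255) and equally spaced bins; the function name and sample values are made up for illustration.

```python
def histogram(values, bins, levels=256):
    """Count how many values fall into each of `bins` equally spaced bins."""
    width = levels / bins
    counts = [0] * bins
    for v in values:
        index = min(int(v / width), bins - 1)   # guard the top edge
        counts[index] += 1
    return counts

pixels = [0, 10, 127, 128, 200, 255]
two_bins = histogram(pixels, 2)      # counts for [0, 128) and [128, 255]
full = histogram(pixels, 256)        # one bin per possible pixel value
```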
Why…?
• By simply examining the histogram of an
image, you get a general understanding
of its
– contrast
– brightness
– intensity distribution.
Application to Image Search Engines
• In the context of image search engines,
histograms can serve as feature vectors,
i.e. lists of numbers used to quantify an
image and compare it to other images
• In order to use color histograms in image
search engines, we make the assumption that
images with similar color distributions are
semantically similar.
Usage….
• Comparing the “similarity” of color histograms can be
done using a distance metric or a similarity measure.
• Common choices include:
– Euclidean
– Correlation
– Chi-squared
– Intersection
– Bhattacharyya.
• The choice usually depends on the image dataset
being analyzed.
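Several of the measures listed above can be sketched in a few lines. Each has more than one definition in the literature, so the forms below are common variants chosen for illustration rather than the definitive ones. Histograms are assumed to be equal-length lists normalised to sum to 1.

```python
import math

def euclidean(h1, h2):
    """Straight-line distance between the two histograms."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(h1, h2)))

def chi_squared(h1, h2):
    """Symmetric chi-squared distance; bin pairs that are both empty are skipped."""
    return 0.5 * sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)

def intersection(h1, h2):
    """A similarity, not a distance: 1 for identical normalised histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def bhattacharyya(h1, h2):
    """Hellinger-style form, assuming both histograms sum to 1."""
    coefficient = sum(math.sqrt(a * b) for a, b in zip(h1, h2))
    return math.sqrt(max(0.0, 1.0 - coefficient))

reddish = [0.5, 0.5, 0.0]
bluish = [0.0, 0.0, 1.0]
```

Note that intersection grows with similarity while the other three shrink, so a ranking must sort in the opposite direction for it.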
Implementation Details
• cv2.calcHist(images, channels, mask, histSize, ranges)
• images: This is the image that we want to compute a
histogram for. Wrap it as a list: [myImage].
• channels: A list of indexes, where we specify the
index of the channel we want to compute a
histogram for.
• To compute a histogram of a grayscale image, the list
would be [0].
• To compute a histogram for all three red, green, and
blue channels, the channels list would be [0, 1, 2].
• mask: a mask is a uint8 image with the same height and
width as our original image, where pixels with a value of
zero are ignored and pixels with a value greater than
zero are included in the histogram computation.
Using a mask allows us to compute a histogram
only for a particular region of an image.
• histSize: This is the number of bins we want to use
when computing a histogram. Again, this is a list, with one
entry for each channel we are computing a histogram for.
The numbers of bins do not all have to be the same. Here is
an example of 32 bins for each channel: [32, 32, 32].
• ranges: The range of possible pixel values. Normally,
this is [0, 256] for each channel (the upper bound is
exclusive), but if you are using a color space other than
RGB (such as HSV), the ranges might be different.
Multi-dimensional Histograms
• “How many pixels have a Red value of 10 AND a Blue value of
30?”
• “How many pixels have a Green value of 200 AND a Red value of
130?”
• By using the conjunctive AND we are able to construct
multi-dimensional histograms.
• If we used 256 bins for each dimension in a 2D histogram, our
resulting histogram would have 256 × 256 = 65,536 separate pixel counts.
• Not only is this wasteful of resources, it’s not practical.
• Most applications use somewhere between 8 and 64 bins per dimension
when computing multi-dimensional histograms
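A two-dimensional histogram answers the "AND" questions above by counting pixels per pair of bins. The sketch below is a plain-Python illustration; it assumes 8-bit (R, G, B) tuples, and the function name and sample pixels are invented.

```python
def joint_histogram(pixels, channel_a, channel_b, bins=8, levels=256):
    """2D histogram: hist[i][j] counts pixels whose channel_a value falls in
    bin i AND whose channel_b value falls in bin j."""
    width = levels / bins
    hist = [[0] * bins for _ in range(bins)]
    for pixel in pixels:
        i = min(int(pixel[channel_a] / width), bins - 1)
        j = min(int(pixel[channel_b] / width), bins - 1)
        hist[i][j] += 1
    return hist

pixels = [(10, 0, 30), (12, 5, 25), (130, 200, 0)]   # (R, G, B) tuples
red_blue = joint_histogram(pixels, 0, 2)             # 8 x 8 = 64 cells
cells_at_full_resolution = 256 * 256                 # 65,536 cells with 256 bins
```

With 8 bins per dimension the first two pixels share a cell, which is the kind of grouping that makes coarse histograms robust to small colour differences.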
Drawbacks
• We made the assumption that images with
similar color distributions are semantically
similar.
• For small, simple datasets, this may in fact be
true.
• However, in practice, this assumption does
not always hold.
• For one, color histograms by definition ignore both
the shape and the texture of the object(s) in the image;
they have no concept of either.
• Furthermore, histograms also disregard any spatial
information (i.e. where in the image the pixel value
came from).
• An extension to the histogram, the color
correlogram, can be used to encode a spatial
relationship amongst pixels.
• Example: Chic Engine, a visual fashion search engine iPhone app,
with different categories for different types of clothes,
such as shoes and shirts.
• If I were using color histograms to describe a red
shoe and a red shirt, the histograms would suggest
they were the same object.
• Clearly they are both red, but the similarity ends
there: they are simply not the same.
• Color histograms simply have no way to “model”
what a shoe or a shirt is.
• Finally, color histograms are sensitive to “noise”, such as
changes in lighting in the environment the image was
captured under and quantization errors (selecting which bin
to increment).
• Some of these limitations can potentially be mitigated by
using a different color space than RGB (such as HSV or
L*a*b*).
• However, all that said, histograms are still widely used as
image descriptors. They are dead simple to implement and
very fast to compute. And while they have their limitations,
they are very powerful when used correctly and in the right
context.