Color-Based Face Detection in The "15 Seconds of Fame"
Franc Solina, Peter Peer, Borut Batagelj, Samo Juvan, Jure Kovač
[email protected]
University of Ljubljana, Faculty of Computer and Information Science
Tržaška 25, SI-1000 Ljubljana, Slovenia
1. INTRODUCTION
Proceedings of Mirage 2003, INRIA Rocquencourt, France, March, 10-11 2003
Figure 3: Stages in the process of finding faces in an image: a) downsize the resolution 2048×1536 of the original image to 160×120 pixels, b) eliminate all pixels that can not represent a face, c) segment all the regions containing face-like pixels using region growing, d) eliminate regions which can not represent a face using heuristic rules. Some color transformations of the fourth face from the left are shown in Fig. 4.
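Stages c) and d) of Figure 3 can be illustrated with a minimal sketch. It assumes the skin-like pixels are already available as a 2D boolean mask, uses 4-connected breadth-first region growing, and replaces the heuristic rules (which the caption does not spell out) with a single hypothetical size threshold `min_pixels`. The function names are illustrative and not taken from the installation's code.

```python
from collections import deque


def grow_regions(mask):
    """Group 4-connected True cells of a 2D boolean mask into regions.

    Each region is returned as a list of (row, column) coordinates.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    regions = []
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue
            # Start a new region from this unvisited seed pixel.
            seen[y][x] = True
            queue = deque([(y, x)])
            region = []
            while queue:
                cy, cx = queue.popleft()
                region.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(region)
    return regions


def plausible_face_regions(regions, min_pixels=4):
    """Keep regions with at least min_pixels pixels.

    min_pixels is a hypothetical stand-in for the heuristic rules of
    stage d); the actual rules are not given in this text.
    """
    return [region for region in regions if len(region) >= min_pixels]
```

For a small mask with one 2×2 block and one isolated pixel, `grow_regions` yields two regions and `plausible_face_regions` keeps only the block.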
2. FINDING FACES IN COLOR IMAGES
Automatic face detection is, like most other automatic object-detection methods, difficult, especially if sample variations are significant. Large sample variations in face detection arise from the large variety of individual face appearances and from differences in illumination.
There are a few distinct approaches to face detection (for a detailed survey see [13]). For the installation we decided to use a color-based approach to face detection that we developed earlier [17]. In Subsection 2.1 we describe our original method, while in Subsection 2.2 we describe the simplifications of this method that we had to make for the installation.
2.1. Our original face detection method
We developed a face detection method which consists of two distinct parts: making face hypotheses and verifying these face hypotheses [17]. The method combines two approaches: a color-based and a feature-based approach.
R > 95 & G > 40 & B > 20 &
max{R, G, B} − min{R, G, B} > 15 &
|R − G| > 15 & R > G & R > B.
We also tested other color models [15], but the best results were obtained with this one.
The goal of the method was to reach maximum classification accuracy over the learning and testing sets of images under the following constraints: near real-time operation on a standard personal computer, plain background, uniform ambient natural or studio illumination, faces of fair complexion, which must be entirely present in the image, and faces turned away from the frontal view by at most 30 degrees.
During processing the method requires some thresholds, which are set quite tolerantly but become effective when applied in sequence. All thresholds were defined experimentally. The method was developed using different training sets of pictures [3] and tested on an independent testing set consisting of two public image databases [16, 18] with good results [17].
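The RGB rule given in Subsection 2.1 can be written directly as a per-pixel predicate. The following is a minimal sketch, assuming 8-bit channel values and an image stored as rows of (R, G, B) tuples; the function names `is_skin_rgb` and `skin_mask` are illustrative only.

```python
def is_skin_rgb(r, g, b):
    """Return True if an 8-bit RGB pixel satisfies the rule of Subsection 2.1."""
    return (
        r > 95 and g > 40 and b > 20
        and max(r, g, b) - min(r, g, b) > 15
        and abs(r - g) > 15
        and r > g and r > b
    )


def skin_mask(image):
    """Apply the rule to every pixel of an image given as rows of (R, G, B) tuples."""
    return [[is_skin_rgb(*pixel) for pixel in row] for row in image]
```

A reddish pixel such as (200, 120, 90) passes all conditions, while a neutral grey such as (100, 100, 100) fails because its channels are too close together.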
Since the original face detection algorithm is computationally demanding, we decided to develop a simpler version for integration in the “15 seconds of fame” installation.
... conditions. Once we realized that we would not be able to completely control the illumination in different exhibition spaces, we decided to improve our face detection results by eliminating the influence of non-standard illumination.
... maximal value of each channel in the whole image. The Retinex method is above all suitable for dark images.
3.2. Color constancy methods
Methods belonging to this group differ from the color compensation methods above all in the need to integrate a preliminary learning step. They require knowledge about the illumination properties and the properties of the capturing devices, e.g. digital cameras. The input image is then transformed in such a way that it reflects a state which is independent of the illumination. Thus a stable representation of colors under different illuminations is achieved. Generally speaking, the methods consist of two distinct steps: scene illumination detection and standard illumination reconstruction. In the first step, the algorithm determines, with the help of the preliminary knowledge, which illumination out of the set of known illuminations is present in the image. In the second step, it applies the necessary transformations to reconstruct the standard (or other wanted) illumination.
In the Color by Correlation method [9] the preliminary knowledge about the illumination properties is represented by a set of colors which can appear under a specific illumination, i.e. colors that are visible under a specific illuminant.
3.3. Comparison of methods
3.3.1. Color compensation methods
In order to determine the influence of these algorithms on our face detection results, some experiments were performed on a set of images gathered in our lab and at the first public showing of the installation. The testing set is composed of 160 images taken under four different types of illumination conditions. One subset (40 images) was taken under standard daylight, in the second subset (40 images) objects were illuminated by incandescent lamps assembled into a chandelier, the third subset (40 images) was taken with camera flash light, and the last subset (40 images) was taken under neon light. One of the color compensation methods was then applied and, finally, the face detection algorithm was applied to both the original and the preprocessed images.
The results gathered in Tab. 1 show a perceivable improvement in face detection on images taken under illumination conditions other than standard when one of the compensation algorithms was applied beforehand (see Figs. 7 and 8). The Grey World algorithm performed especially well, since a considerable increase in the TP/Det percentage can be noticed for the flashlight, incandescent and neon light conditions.
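Two of the compensation methods compared here can be sketched in a few lines. The sketch below uses the common textbook formulations, which are an assumption, since this text only names the methods: Grey World scales each channel so that its mean matches the mean of the three channel means, and White Patch Retinex scales each channel so that its maximal value in the whole image becomes 255. The Modified Grey World variant is not sketched because its definition is not given here. Images are assumed to be rows of 8-bit (R, G, B) tuples, and all function names are illustrative.

```python
def _channel_stats(image):
    """Return per-channel means and maxima of an image of (R, G, B) tuples."""
    pixels = [pixel for row in image for pixel in row]
    count = len(pixels)
    means = [sum(pixel[c] for pixel in pixels) / count for c in range(3)]
    maxima = [max(pixel[c] for pixel in pixels) for c in range(3)]
    return means, maxima


def _apply_gains(image, gains):
    """Multiply each channel by its gain, round, and clip to the 8-bit range."""
    return [
        [tuple(min(255, round(pixel[c] * gains[c])) for c in range(3)) for pixel in row]
        for row in image
    ]


def grey_world(image):
    """Scale channels so that all three channel means become equal (assumed formulation)."""
    means, _ = _channel_stats(image)
    grey = sum(means) / 3
    gains = [grey / m if m > 0 else 1.0 for m in means]
    return _apply_gains(image, gains)


def white_patch_retinex(image):
    """Scale channels so that each channel maximum maps to 255 (assumed formulation)."""
    _, maxima = _channel_stats(image)
    gains = [255 / m if m > 0 else 1.0 for m in maxima]
    return _apply_gains(image, gains)
```

Either function could be applied to the image before the skin-color classification, which corresponds to the preprocessing order used in the experiments described above.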
Table 1: Color compensation results show the number of all detections (Detected), the number of detected faces as true positives (TP), the number of false detections as false positives (FP) and the number of faces missed as false negatives (FN) on four subsets of images which represent different illumination conditions (standard, incandescent, flashlight and neon), preprocessed by Grey World (GW), Modified GW (MGW), White Patch Retinex (RET) or with no preprocessing at all (None). Row All Faces shows the number of faces in a particular subset of images. TP/Det shows the percentage of true positives out of all detections and FN/All shows the percentage of false negatives out of all faces in the subset. For the installation the first percentage is extremely important, while the second one is merely informative, since we consciously eliminated faces that were too small for further processing, although they are included in the number of all faces. Note that TP + FN ≤ All, since a region characterized as TP can contain one or more faces but counts as only one TP.
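The two percentages defined in the caption can be computed as follows. This is a small sketch that assumes Detected = TP + FP, which the caption suggests but does not state explicitly. The counts in the usage note are hypothetical and are not taken from the table.

```python
def detection_rates(tp, fp, fn, all_faces):
    """Return (Detected, TP/Det in percent, FN/All in percent).

    Assumes Detected = TP + FP. Zero denominators yield 0.0.
    """
    detected = tp + fp
    tp_det = 100.0 * tp / detected if detected else 0.0
    fn_all = 100.0 * fn / all_faces if all_faces else 0.0
    return detected, tp_det, fn_all
```

With hypothetical counts TP = 30, FP = 10, FN = 12 and 45 faces in the subset, the function gives 40 detections, a TP/Det of 75% and an FN/All of about 26.7%. Here TP + FN = 42 ≤ 45, consistent with the note that one TP region may cover several faces.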
Table 2: Correlation results show the number of all detections (Detected), the number of correct face detections as true positives (TP), the number of detections that turned out not to be faces as false positives (FP) and the number of faces missed by the detection algorithm as false negatives (FN) for different subsets of images (white, yellow, green, blue and red), previously processed by the correlation algorithm (C) or with no preprocessing at all (None). Row All Faces shows the number of faces in a particular subset of images. TP/Det shows the percentage of true positives out of all detections and FN/All shows the percentage of false negatives out of all faces in the subset. For the installation TP/Det is extremely important, while FN/All is merely informative about the performance of our face detector. Small faces are deliberately eliminated from further processing already by the face detection algorithm. Note that TP + FN ≤ All, since a region characterized as TP can contain one or more faces but counts as only one TP.
... towards the color of the illumination. Consequently, a lot of skin-like pixels are recognized as non-skin-like pixels and the number of correctly detected faces (true positives) decreases, since we can not reliably find skin-like pixels. When deviations from standard illumination are much more noticeable, we must choose a correlation technique with a proper illumination reconstruction. When illumination conditions are constant over a long period of time, no illumination detection is necessary. By manually selecting the illumination we eliminate all the risks linked with false illumination detection and assure the best illumination reconstruction.
Eliminating the influence of non-standard illumination before face detection ensures much better results. The whole system is much more flexible and the installation can be exhibited almost anywhere.
4. POP-ART COLOR TRANSFORMATIONS
As mentioned in the introduction, Andy Warhol took photographs of people and performed some kind of color surgery on them. In this process Warhol sometimes segmented the face from the background, delineated the contours, highlighted some facial features (the mouth or the eyes), started the process with the negative instead of the positive photograph, overlaid the photo with geometric color screens etc. [7]. His techniques of transforming a photograph into a painting could be described with a set of formal construction rules, like the ones used in shape grammars [12, 14]. The application of such rules for the generation of new pop-art portraits, which our installation tries to perform, would require automatic segmentation of input images into their constituent perceptual parts: face/background, eyes, mouth, hair etc. These tasks are still too complex to be routinely solved in a few seconds on a large variety of input images. We therefore decided to try to achieve somewhat similar effects with much simpler means. Our system does not search for any features in the face but just processes the input image with selected filters.
We selected filters that achieve effects similar to segmentation. They drastically reduce the number of colors by joining similar looking pixels into uniform looking regions. The system randomly selects one out of 34 predetermined filter combinations and applies it to the portrait. 17 filters consist of different combinations of three well known filters: posterize, color balance and hue-saturation balance. The other half of the filters are based on the first 17 filters, but with an important additional property, which we call random coloring. Random coloring works in the following way: the system selects a pixel from the already filtered image and then finds all other pixels in the image of the same color. Then the system randomly selects a new color from the RGB color space for these pixels. In this way, we achieve millions of different coloring effects and our pop-art portraits almost never look completely the same. Eight different randomly obtained filtering effects on the same input portrait can be seen in Fig. 4.
5. DISPLAY OF PORTRAITS
The installation displays on the framed monitor the selected portraits in 15 second intervals. For the display of the final result the system also selects randomly among five possible configurations: in 75% of cases it shows
7. CONCLUSIONS