
Face Detection Techniques:

Face detection has been researched for years and much progress has been reported in the literature. Most face detection methods focus on detecting frontal faces under good lighting conditions. These methods can be categorized into four types: knowledge-based, feature-invariant, template-matching and appearance-based.
 Knowledge-based methods use human-coded rules to model facial features, such as
two symmetric eyes, a nose in the middle and a mouth underneath the nose.

 Feature-invariant methods try to find facial features that are invariant to pose, lighting conditions or rotation. Skin color, edges and shapes fall into this category.

 Template matching methods calculate the correlation between a test image and pre-
selected facial templates.

 Appearance-based methods adopt machine learning techniques to extract discriminative features from a pre-labeled training set.
An approach for detecting objects in general, applicable to human faces as well, was presented by Viola and Jones. This method detects objects extremely rapidly and is comparable to the best real-time face detection systems. Viola and Jones (2004) [2] introduced in their research a new image representation, called the integral image, which allows fast calculation of the image features used by their detection algorithm. The second step is an AdaBoost-based algorithm that is trained on the relevant object class to select a minimal set of features to represent the object. Viola and Jones used features extracted from the training set and the AdaBoost algorithm to select the best feature set and to construct the final classifier, which comprises several stages. Each stage consists of a few simple weak classifiers that work together to form a stronger classifier, filtering out the majority of false detections in the early stages and producing an adequate final face detector.

2.3 Face Recognition Techniques:

Several algorithms are used for face recognition; some of the popular methods are discussed here. Face recognition by feature matching is one such method. We locate points in the face image with high information content. We do not consider the face contour or the hair; we concentrate on the center of the face area, as the most stable and informative features are found there. The most informative points of the face lie around the eyes, nose and mouth. To enforce this, we apply Gaussian weighting centered on the face.
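A minimal sketch of such a weighting step, assuming the face has already been detected and cropped to a grayscale NumPy array; the function name, the sigma_frac parameter and the axis-aligned Gaussian are illustrative assumptions rather than the exact scheme used in the cited work:

import numpy as np

def gaussian_weight(face, sigma_frac=0.35):
    """Multiply a grayscale face crop by a 2-D Gaussian centered on the
    face so that the eye/nose/mouth region dominates and the contour
    and hair regions are attenuated."""
    h, w = face.shape
    y, x = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    sy, sx = sigma_frac * h, sigma_frac * w
    weights = np.exp(-0.5 * (((y - cy) / sy) ** 2 + ((x - cx) / sx) ** 2))
    return face.astype(np.float64) * weights

# Usage (hypothetical): weighted = gaussian_weight(face_crop)
# where face_crop is a 2-D uint8 array containing a cropped face.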

The simplest template-matching approaches represent a whole face using a single template, i.e. a 2-D array of intensities, which is usually an edge map of the original face image. In a more complex form of template matching, multiple templates may be used for each face to account for recognition from different viewpoints. Another important variation is to employ a set of smaller facial feature templates that correspond to the eyes, nose and mouth for a single viewpoint. The most attractive advantage of template matching is its simplicity; however, it suffers from large memory requirements and inefficient matching. In feature-based approaches, geometric features, such as the position and width of the eyes, nose and mouth, eyebrow thickness and arches, face breadth, or invariant moments, are extracted to represent a face. Feature-based approaches have smaller memory requirements and a higher recognition speed than template-based ones. They are particularly useful for face scale normalization and 3D head model-based pose estimation. However, perfect extraction of features has been shown to be difficult in practice [5]. The idea of appearance-based approaches is to project face images onto a low-dimensional linear subspace. Such a subspace is first constructed by principal component analysis on a set of training images, with eigenfaces as its eigenvectors. Later, the concept of eigenfaces was extended to eigenfeatures, such as eigeneyes, eigenmouths, etc., for the detection of facial features [6]. More recently, the fisherface space [7] and illumination subspaces [1] have been proposed to deal with recognition under varying illumination.
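As a concrete illustration of the appearance-based idea, here is a minimal eigenface sketch in NumPy; the function names, the choice of k = 20 components and the use of an SVD instead of an explicit covariance matrix are assumptions made for illustration:

import numpy as np

def build_eigenfaces(train_images, k=20):
    """PCA on a set of equally sized, flattened face images.
    Returns the mean face and the top-k eigenfaces (principal directions)."""
    X = np.stack([img.ravel().astype(np.float64) for img in train_images])
    mean_face = X.mean(axis=0)
    # Rows of Vt are orthonormal eigenvectors of the covariance matrix,
    # ordered by decreasing variance: these are the eigenfaces.
    _, _, Vt = np.linalg.svd(X - mean_face, full_matrices=False)
    return mean_face, Vt[:k]

def project(img, mean_face, eigenfaces):
    """Project one face image onto the low-dimensional eigenface subspace."""
    return eigenfaces @ (img.ravel().astype(np.float64) - mean_face)

# Recognition (sketch): project a probe face and compare its coefficient
# vector with those of the gallery faces, e.g. by Euclidean distance.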

4.6 Viola-Jones object detection framework:

Paul Viola and Michael Jones presented a fast and robust method for face detection, reported to be about 15 times faster than any technique available at the time of its release, achieving roughly 95% accuracy at around 17 frames per second. The work has three key contributions:

1. "Haar-Like" feature representation


2. It introduced a new, efficient, method for features calculation, based
on an "Integral image"
3. It introduced a method for aggressive features selection, based on
AdaBoost learning algorithm

Face Detection Flow chart

[Flow chart: the input video is segmented into frames; for each frame the integral image is formed, candidate windows are projected onto the face model, and their Haar features are compared with the learned thresholds. Offline, a face training database is cropped into positive (face) and negative (non-face) images, and AdaBoost training produces the face model (an XML resource) used by the real-time face detector, which outputs windows with the detected faces.]
4.6.1. "Haar-Like" feature representation:

For their face detection framework, Viola and Jones decided to use simple features based on pixel intensities rather than using pixels directly. They motivated this choice by two main factors:
 Features can encode ad hoc domain knowledge, which would otherwise be difficult to learn from limited training data.
 A feature-based system operates much faster than a pixel-based system.
They defined three kinds of Haar-like rectangle features (see Figure 4.6.1 and the sketch that follows it):
 A two-rectangle feature is the difference between the sums of the pixels within two adjacent regions (vertical or horizontal).
 A three-rectangle feature is the difference between two outside rectangles and an inner rectangle between them.
 A four-rectangle feature is the difference between diagonal pairs of rectangles.

Figure 4.6.1: Rectangle features example: (A) and (B) show two-rectangle features, (C) shows a three-rectangle feature, and (D) shows a four-rectangle feature.
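Such features can be computed directly from pixel sums. Below is a minimal sketch of a horizontal two-rectangle feature on a sub-window stored as a 2-D NumPy array; the coordinate convention and the sign of the difference are assumptions made for illustration, and the three- and four-rectangle features combine pixel sums in the same way:

import numpy as np

def two_rect_feature(window, x, y, w, h):
    """Horizontal two-rectangle feature: difference between the pixel sums
    of two adjacent w x h rectangles with top-left corners (x, y) and
    (x + w, y).  Which side is subtracted is just a sign convention."""
    left = window[y:y + h, x:x + w].sum(dtype=np.int64)
    right = window[y:y + h, x + w:x + 2 * w].sum(dtype=np.int64)
    return int(right - left)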

4.6.2. Integral Image:

To enable efficient computation of rectangle features, the authors presented an intermediate representation of the image, called the integral image. The value of the integral image at location (x, y) is the sum of the pixels of the original image above and to the left of (x, y), inclusive (see Figure 4.6.2.1). Formally,

ii(x, y) = Σ_{x' ≤ x, y' ≤ y} i(x', y'),

where ii(x, y) is the integral image and i(x', y') is the original image.

Figure 4.6.2.1: Integral Image

By maintaining a cumulative row sum at each location (x, y), the integral image can be computed in a single pass over the original image. Once it has been computed, any rectangle feature can be calculated with only a few accesses to it (see Figure 4.6.2.2):
i. Two-rectangle features require 6 array references,
ii. Three-rectangle features require 8 array references, and
iii. Four-rectangle features require 9 array references.

Figure 4.6.2.2: Calculation example. The sum of the pixels within rectangle D can be
computed as 4 + 1 - (2 + 3), where 1-4 are values of the integral image.
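A minimal NumPy sketch of the integral image and of a rectangle sum obtained with four array references, matching the 4 + 1 - (2 + 3) example above; the extra leading row and column of zeros are an implementation convenience, not part of the original definition:

import numpy as np

def integral_image(img):
    """ii(x, y) = sum of all pixels of img above and to the left of (x, y),
    inclusive.  A leading row/column of zeros simplifies the sum formula."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of the pixels inside the w x h rectangle whose top-left corner is
    (x, y), using only four accesses to the integral image."""
    return int(ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x])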

The authors defined the base resolution of the detector to be 24x24. In other words, every image frame is divided into 24x24 sub-windows, and features are extracted at all possible locations and scales within each such sub-window. This results in an exhaustive set of more than 160,000 rectangle features for a single sub-window.
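The 160,000 figure can be checked by brute-force enumeration. The sketch below assumes the five base shapes commonly used in this framework (horizontal and vertical two-rectangle, horizontal and vertical three-rectangle, and the four-rectangle feature) and counts every scale and position inside a 24x24 sub-window:

def count_haar_features(W=24, H=24):
    """Count all scales and positions of the rectangle features inside a
    W x H detection sub-window, assuming five base shapes."""
    base_shapes = [(2, 1), (1, 2), (3, 1), (1, 3), (2, 2)]  # (width, height)
    total = 0
    for bw, bh in base_shapes:
        for w in range(bw, W + 1, bw):            # widths scaled in steps of bw
            for h in range(bh, H + 1, bh):        # heights scaled in steps of bh
                total += (W - w + 1) * (H - h + 1)  # valid top-left corners
    return total

print(count_haar_features())  # 162336, i.e. "more than 160,000"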
4.6.3. AdaBoost learning algorithm:

The AdaBoost algorithm was introduced in 1995 by Freund and Schapire. The complete set of features is quite large: more than 160,000 features for a single 24x24 sub-window. Although a single feature can be computed with only a few simple operations, evaluating the entire set of features is still extremely expensive and cannot be done in a real-time application.
In its original form, AdaBoost is used to improve the classification results of a learning algorithm by combining a collection of weak classifiers into a strong classifier. The algorithm starts with equal weights for all examples. In each round, the weights are updated so that misclassified examples receive more weight.
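As an illustration of this re-weighting idea, here is a minimal sketch of one generic AdaBoost round in the ±1-label formulation; the exact update used by Viola and Jones differs slightly in form, so this should be read only as a sketch of "misclassified examples get more weight":

import numpy as np

def adaboost_round(weights, predictions, labels):
    """One boosting round.  predictions and labels are arrays in {-1, +1};
    weights are the current (positive) example weights."""
    weights = weights / weights.sum()
    err = weights[predictions != labels].sum()            # weighted error
    alpha = 0.5 * np.log((1.0 - err) / max(err, 1e-12))   # vote of this weak classifier
    # Misclassified examples (labels * predictions == -1) are multiplied by
    # exp(+alpha) > 1, correctly classified ones by exp(-alpha) < 1.
    weights = weights * np.exp(-alpha * labels * predictions)
    return weights / weights.sum(), alpha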
By drawing an analogy between weak classifiers and features, Viola and Jones decided to use the AdaBoost algorithm for the aggressive selection of a small number of good features which nevertheless have significant variety.
Practically, the weak learning algorithm was restricted to a set of classification functions, each of which depends on a single feature. A weak classifier h(x, f, p, θ) is then defined for a sample x (i.e. a 24x24 sub-window) by a feature f, a threshold θ, and a polarity p indicating the direction of the inequality:

h(x, f, p, θ) = 1 if p·f(x) < p·θ,
              = 0 otherwise.                                    (2)
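Equation (2) transcribed directly into code; the feature value f(x) is assumed to have been computed already for the sub-window (e.g. with the rectangle sums sketched earlier):

def weak_classify(feature_value, threshold, polarity):
    """h(x, f, p, theta) = 1 if p * f(x) < p * theta, else 0   (equation (2))."""
    return 1 if polarity * feature_value < polarity * threshold else 0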

The key advantage of AdaBoost over its competitors is the speed of learning. For each feature, the examples are sorted by their feature value. The optimal threshold for that feature can then be computed in a single pass over this sorted list.
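A sketch of that single pass, assuming labels of 1 (face) and 0 (non-face) stored in NumPy arrays together with the current AdaBoost example weights; running totals of the positive and negative weight seen so far make the weighted error of every candidate threshold available in constant time per example:

import numpy as np

def best_threshold(feature_values, labels, weights):
    """Select the threshold and polarity with the lowest weighted error for
    one feature, in a single pass over the examples sorted by feature value."""
    order = np.argsort(feature_values)
    fv, lab, w = feature_values[order], labels[order], weights[order]
    t_pos = w[lab == 1].sum()              # total weight of faces
    t_neg = w[lab == 0].sum()              # total weight of non-faces
    s_pos = s_neg = 0.0                    # face / non-face weight seen so far
    best_err, best_thr, best_pol = np.inf, None, None
    for i in range(len(fv)):
        # Candidate threshold just below fv[i]: examples seen so far fall below it.
        err_pos = s_neg + (t_pos - s_pos)  # polarity +1: "below threshold" = face
        err_neg = s_pos + (t_neg - s_neg)  # polarity -1: "above threshold" = face
        err, pol = (err_pos, 1) if err_pos < err_neg else (err_neg, -1)
        if err < best_err:
            best_err, best_thr, best_pol = err, fv[i], pol
        if lab[i] == 1:
            s_pos += w[i]
        else:
            s_neg += w[i]
    return best_thr, best_pol, best_err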
In their paper [8], Viola and Jones show that a strong classifier constructed from 200 features yields reasonable results: at a detection rate of 95%, a false positive rate of 1 in 14,084 was achieved on a test dataset. These results are promising. However, the authors realized that for a face detector to be practical in real applications, the false positive rate must be closer to 1 in 1,000,000. The straightforward way to improve detection performance would be to add features to the classifier. Unfortunately, this would increase computation time and make the classifier unsuitable for real-time applications.
4.6.4. Detectors Cascade:
There is a natural trade-off between classifier performance, in terms of detection rates, and classifier complexity, i.e. the amount of time required to compute the classification result.
Viola and Jones [2], however, were looking for a way to speed up processing without compromising quality. As a result, they came up with the idea of a detector cascade (see Figure 4.6.4.1). Each sub-window is processed by a series of detectors, called a cascade, in the following way. Classifiers are combined sequentially in order of their complexity, from the simplest to the most complex. The processing of a sub-window therefore starts with a simple classifier, which is trained to reject most of the negative (non-face) frames while keeping almost all positive (face) frames. A sub-window proceeds to the next, more complex classifier only if it was classified as positive at the preceding stage. If any one of the classifiers in the cascade rejects a frame, it is thrown away and the system proceeds to the next sub-window. If a sub-window is classified as positive by all the classifiers in the cascade, it is declared to contain a face.

Figure 4.6.4.1: AdaBoost Classifier
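A minimal sketch of the cascade evaluation loop; each stage is assumed to be a callable returning True (keep the window) or False (reject), ordered from cheapest to most complex, so that most non-face windows are discarded after very little work:

def cascade_classify(sub_window, stages):
    """Run one sub-window through the cascade of stage classifiers.
    A single rejection stops all further work on this window."""
    for stage in stages:              # simplest stage first
        if not stage(sub_window):
            return False              # rejected early: not a face
    return True                       # accepted by every stage: declared a face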

References:
[1] S. A. Sirohey, “Human face segmentation and identification”, Technical Report CAR-TR-695, Center for Automation Research, University of Maryland, College Park, MD, 1993.
[2] P. Viola and M. J. Jones, “Robust Real-Time Face Detection”, International Journal of Computer Vision, 57(2), pp. 137–154, 2004.
[3] R. Lienhart and J. Maydt, “An extended set of Haar-like features for rapid object detection”, in Proc. IEEE ICIP 2002, Vol. 1, pp. 900–903, 2002.
[4] R. Chellappa, C. L. Wilson, and S. Sirohey, “Human and machine recognition of faces: A survey”, Proceedings of the IEEE, Vol. 83, pp. 705–740, 1995.
