Face detection and recognition using image processing
Group 11
Group Members
Jam Zia-ul-Haq Hafiz Nisar Ahmad Muhammad Yasir
Contents
Introduction Face detection Face recognition Implementation
Introduction
Face interface Face detection Face recognition
Face detection Face recognition Face database
Output: Mr. Chan, Prof. Cheng
What is Face Detection?
Given an image, tell whether there is any human face; if there is, where is it (or where they are).
Importance of Face Detection
The first step for any automatic face recognition system First step in many Human Computer Interaction systems
Expression Recognition Cognitive State/Emotional State Recognition
First step in many surveillance systems Tracking: the face is a highly non-rigid object A step towards Automatic Target Recognition (ATR) or generic object detection/recognition Video coding
Face Detection: current state
State-of-the-art: Front-view face detection can be done at >15 frames per second on 320x240 black-and-white images on a 700MHz PC with ~95% accuracy. Detection of faces is faster than detection of edges! Side-view face detection remains difficult.
Face Detection: challenges
Out-of-Plane Rotation: frontal, 45 degree, profile, upside down Presence of beard, mustache, glasses, etc. Facial Expressions Occlusions by long hair, hand In-Plane Rotation Image conditions:
Size Lighting condition Distortion Noise Compression
Different Approaches
Knowledge-based methods:
Encode what constitutes a typical face, e.g., the relationship between facial features
Feature invariant approaches:
Aim to find structure features of a face that exist even when pose, viewpoint or lighting conditions vary
Template matching:
Several standard patterns stored to describe the face as a whole or the facial features separately
Appearance-based methods:
The models are learned from a set of training images that capture the representative variability of faces.
Knowledge-Based Methods
Top-down approach: Represent a face using a set of human-coded rules Example:
The center part of face has uniform intensity values The difference between the average intensity values of the center part and the upper part is significant A face often appears with two eyes that are symmetric to each other, a nose and a mouth
Use these rules to guide the search process
Knowledge-Based Method: [Yang and Huang 94]
Level 1 (lowest resolution):
apply the rule "the center part of the face has 4 cells with basically uniform intensity" to search for candidates
Level 2: local histogram equalization followed by edge detection Level 3: search for eye and mouth features for validation
Knowledge-based Methods: Summary
Pros:
Easy to come up with simple rules Based on the coded rules, facial features in an input image are extracted first, and face candidates are identified Work well for face localization in uncluttered background
Cons:
Difficult to translate human knowledge into rules precisely: detailed rules fail to detect faces and general rules may find many false positives Difficult to extend this approach to detect faces in different poses: implausible to enumerate all the possible cases
Feature-Based Methods
Bottom-up approach: Detect facial features (eyes, nose, mouth, etc) first Facial features: edge, intensity, shape, texture, color, etc Aim to detect invariant features Group features into candidates and verify them
Feature-Based Methods: Summary
Pros: Features are invariant to pose and orientation change Cons: Difficult to locate facial features due to corruption (illumination, noise, occlusion) Difficult to detect features against a complex background
Template Matching Methods
Store a template
Predefined: based on edges or regions
Deformable: based on facial contours (e.g., Snakes) Templates are hand-coded (not learned) Use correlation to locate faces
Template-Based Methods: Summary
Pros: Simple Cons: Templates need to be initialized near the face images Difficult to enumerate templates for different poses (similar to knowledge-based methods)
Image Features
Rectangle filters
Rectangle_Feature_value f = (pixels in white area) − (pixels in shaded area)
Example

Find the Rectangle_Feature_value (f) of the box enclosed by the dotted line.

[Figure: a 4x4 grid of pixel values; the dotted box encloses a shaded column containing 0 and 1 and a white column containing 8 and 7.]

Rectangle_Feature_value f = (pixels in white area) − (pixels in shaded area)
f = (8 + 7) − (0 + 1) = 15 − 1 = 14
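The rectangle-feature computation above can be sketched in a few lines of Python (a minimal illustration; the function name is mine and the pixel values come from the example):

```python
def rectangle_feature(white_pixels, shaded_pixels):
    """f = (sum of pixels in the white area) - (sum of pixels in the shaded area)."""
    return sum(white_pixels) - sum(shaded_pixels)

# Values from the example: the white column of the dotted box holds 8 and 7,
# the shaded column holds 0 and 1.
f = rectangle_feature([8, 7], [0, 1])
print(f)  # 14
```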
Example: a simple face detection method using one Rectangle_Feature_value

f = (pixels in white area) − (pixels in shaded area)

If f is large it is a face, i.e. if f > threshold, then face; else non-face.
Result
This is a face: the eye part is dark and the nose part is bright, so f is large; hence it is a face. This is not a face, because f is small.
Why do we need to find pixel sum of rectangles? Answer: We want to get face features
You may consider these features as face features
Two eyes=
(Area_A-Area_B)
A B
Nose = (Area_C + Area_E − Area_D) Mouth = (Area_F + Area_H − Area_G)
C D E F G H
They can have different sizes, polarities and aspect ratios
Face feature and example
[Figure: an integral-image example; the white area contains the pixel values 216, 102, 78, 129, 210, 111 and the shaded area contains 10, 20, 4, 7, 45, 7.]

F = Feat_val = (pixel sum in white area) − (pixel sum in shaded area)

Example:
Pixel sum in white area = 216 + 102 + 78 + 129 + 210 + 111 = 846
Pixel sum in shaded area = 10 + 20 + 4 + 7 + 45 + 7 = 93

A face

Feat_val = F = 846 − 93 = 753
If F > threshold, feature = +1; else feature = −1.
We can choose threshold = 700, so here the feature is +1.
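These rectangle pixel sums are what the integral image (mentioned in the figure) makes cheap: after one pass over the image, any rectangle's sum costs only four array lookups, as in [Viola 2004]. A minimal Python sketch (function names are mine, not from the slides):

```python
def integral_image(img):
    """ii[y][x] holds the sum of img over rows 0..y-1 and columns 0..x-1
    (padded with a zero row and column so lookups need no special cases)."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def rect_sum(ii, top, left, bottom, right):
    """Sum of img[top:bottom][left:right] from four integral-image lookups."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]

img = [[1, 2],
       [3, 4]]
ii = integral_image(img)
print(rect_sum(ii, 0, 0, 2, 2))  # 10, the sum of the whole image
```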
4 basic types of features for white_area-gray_area
Type) Rows x columns Type 1) 1x2 Type 2) 2x1 Type 3) 1x3 Type 4) 3x1
Each basic type can have different sizes and aspect ratios.
Feature selection
For a 24x24 detection region, the number of possible rectangle features is ~160,000!
[Figure: some example features and their types; exercise — fill in the types for the 2nd and 3rd rows.]
The detection challenge
Use a 24x24 base window.

For y = 1 to 1024:
  For x = 1 to 1280:
    Set (x, y) as the top-left corner of the 24x24 sub-window.
    For that sub-window, extract 162,336 features and see whether they combine to form a face or not.
[Figure: a 24x24 sub-window at (x, y) scanned across a 1280x1024 image.]
Conclusion : too slow
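The exhaustive scan can be sketched as follows; classify_window is a stand-in for whatever decides face vs. non-face on one 24x24 sub-window, and the slow part is evaluating all 162,336 features inside it:

```python
def scan(classify_window, height=1024, width=1280, win=24):
    """Slide a win x win sub-window over every position of a height x width
    image and record the positions the classifier labels as faces."""
    detections = []
    for y in range(height - win + 1):
        for x in range(width - win + 1):
            # (x, y) is the top-left corner of the sub-window; inside
            # classify_window all rectangle features would be evaluated.
            if classify_window(x, y):
                detections.append((x, y))
    return detections

# Toy usage with a stand-in classifier that "detects" only at the origin:
print(scan(lambda x, y: (x, y) == (0, 0), height=30, width=30))  # [(0, 0)]
```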
Solution to make it efficient
The whole 162,336-feature set is too large Solution: select good features to make detection more efficient Use: Boosting Boosting combines many small weak classifiers into a strong classifier. Training is needed
Boosting for face detection
Define weak learners based on rectangle features:

h_t(window) = +1 if p_t f_t(window) > p_t θ_t, and −1 otherwise

where f_t is the value of the rectangle feature on the window, θ_t is a threshold, and p_t is the polarity {+1, −1}.
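A weak learner of this form is a one-line rule in Python (a sketch; the function name is mine):

```python
def weak_classifier(feature_value, threshold, polarity):
    """h = +1 (face-like) if polarity * f > polarity * threshold, else -1.
    The polarity (+1 or -1) lets the same rule fire on either side of the
    threshold, depending on which side the face-like windows fall."""
    return 1 if polarity * feature_value > polarity * threshold else -1

print(weak_classifier(800, 768, +1))  # 1: feature value above the threshold
print(weak_classifier(800, 768, -1))  # -1: polarity flips the decision
```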
Face detection using Adaboost
AdaBoost training:
E.g., collect 5000 faces and 9400 non-faces at different scales. Use AdaBoost for training to build a strong classifier: pick suitable features of different scales and positions, and keep the best few. (This takes months to do; details are in the [Viola 2004] paper.)

Testing:
Scan through the image, pick a window and rescale it to 24x24, then pass it to the strong classifier for detection. Report a face if the output is positive.
Boosting for face detection [Viola 2004]
The paper shows that the following two features (obtained after training), picked by AdaBoost and cascaded, have a 100% detection rate and a 50% false positive rate. But a 50% false positive rate is not good enough.

H(image) = sign{α1·h1(image) + α2·h2(image)}
H = +1 → face; H = −1 → non-face

Pick a window in the image and rescale it to 24x24 as the input image. h1 is a type-2 feature; h2 is a type-3 feature.

Approach of [Viola 2004]: attentional cascade.
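The two-feature strong classifier has this shape (a sketch; the weights α1, α2 come out of AdaBoost training and the values below are placeholders):

```python
def strong_classifier(weak_outputs, alphas):
    """H = sign(sum_t alpha_t * h_t(image)); +1 means face, -1 means non-face."""
    score = sum(a * h for a, h in zip(alphas, weak_outputs))
    return 1 if score >= 0 else -1

# With placeholder weights: h1 says face (+1), h2 says non-face (-1),
# but h1 carries more weight, so the combined decision is face.
print(strong_classifier([1, -1], [2.0, 1.0]))  # 1
```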
Boosting for face detection
A 200-feature classifier can yield a 95% detection rate with a false positive rate of 1 in 14,084 (still not good enough) Recall: False positive rate
The detector output is positive but it is false (there is actually no face). Definition of False positive: A result that is erroneously positive when a situation is normal. An example of a false positive: a particular test designed to detect cancer of the toenail is positive but the person does not have toenail cancer. (http://www.medterms.com/script/main/art.asp?articlekey=3377)
[Figure: ROC curve — correct detection rate vs. false positive rate (×10⁻³). Still not good enough!]
To improve false positive rate: Attentional cascade
Cascade many AdaBoost strong classifiers. Begin with simple classifiers to reject many negative sub-windows: many non-faces are rejected at the first few stages. Hence the system is efficient enough for real-time processing.

[Diagram: input image → AdaBoost classifier 1 → (True) AdaBoost classifier 2 → (True) AdaBoost classifier 3 → (True) … face found; a False output at any stage → non-face.]
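The early-rejection logic of the cascade is simple to sketch: each stage is a strong classifier returning +1/−1, and a window must pass every stage to be reported as a face.

```python
def cascade_classify(window, stages):
    """Run the window through the stages in order; reject on the first -1.
    Most non-face windows exit after the first cheap stage."""
    for stage in stages:
        if stage(window) < 0:
            return -1  # non-face: rejected without running later stages
    return 1  # survived every stage: face found

# Toy stages: the first accepts everything, the second rejects everything.
print(cascade_classify("window", [lambda w: 1, lambda w: -1]))  # -1
print(cascade_classify("window", [lambda w: 1, lambda w: 1]))   # 1
```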
An example
More features for later stages in the cascade
Later stages in the cascade use more features, e.g. 2 features, then 10 features, then 25 features, then 50 features (the features shown are of type 2 and type 3).

[Diagram: input image → AdaBoost classifier 1 → (True) AdaBoost classifier 2 → (True) AdaBoost classifier 3 → (True) … face found; a False output at any stage → non-face.]
Attentional cascade
Chain classifiers that are progressively more complex and have lower false positive rates:
[Figure: receiver operating characteristic — % detection vs. % false positives; each stage's threshold sets the trade-off between false positives and false negatives. Input image passes through the chain of AdaBoost classifiers; a True output at every stage means a face is found, a False output at any stage means non-face.]
Attentional cascade
If the detection rate of each stage is 0.99, then for 10 stages the overall detection rate is 0.99^10 ≈ 0.9.
If the false positive rate of each stage is 0.3, then for 10 stages the overall false positive rate is 0.3^10 ≈ 6 × 10⁻⁶.

[Diagram: input image → AdaBoost classifier 1 → (True) AdaBoost classifier 2 → (True) AdaBoost classifier 3 → (True) … face found; a False output at any stage → non-face.]
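The per-stage rates compound as plain powers, which is easy to verify:

```python
stages = 10
overall_detection = 0.99 ** stages  # ten 99%-stages keep about 90% of faces
overall_false_pos = 0.3 ** stages   # false positives shrink geometrically to ~6e-6
print(round(overall_detection, 3), overall_false_pos)
```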
Detection process in practice
Use a 24x24 sub-window Scaling: scale the detector (not the input image); features are evaluated at scales spaced by a factor of 1.25 per level Location: move the detector around the image in 1-pixel increments Final detections: a real face may produce multiple nearby detections; merge them to form the final result
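The 1.25-per-level scaling amounts to a short list of detector sizes for a given image (the helper name is mine, a sketch rather than the paper's code):

```python
def detector_scales(base=24, max_size=1024, factor=1.25):
    """Detector side lengths: start at the 24x24 base window and grow by
    a factor of 1.25 per level until the detector no longer fits."""
    size, sizes = float(base), []
    while size <= max_size:
        sizes.append(int(round(size)))
        size *= factor
    return sizes

print(detector_scales()[:4])  # [24, 30, 38, 47]
```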
Face Recognition
Face Recognition by Humans
Performed routinely and effortlessly by humans Enormous interest in automatic processing of digital images and videos due to wide availability of powerful, low-cost desktop and embedded computing
Applications: biometric authentication, surveillance, human-computer interaction, multimedia management
Face recognition
Advantages over other biometric technologies: Natural Nonintrusive Easy to use
Among the six biometric attributes considered by Hietmeyer, facial features scored the highest compatibility in a Machine Readable Travel Documents (MRTD) system, based on enrollment and renewal.
Classification
A face recognition system is expected to identify faces present in images and videos automatically. It can operate in either or both of two modes: Face verification (or authentication): involves a one-to-one match that compares a query face image against a template face image whose identity is being claimed. Face identification (or recognition): involves one-to-many matches that compare a query face image against all the template images in the database to determine the identity of the query face.
The first automatic face recognition system was developed by Kanade in 1973.
Face recognition processing
Face recognition is a visual pattern recognition problem. A face is a three-dimensional object subject to varying illumination, pose, and expression, and is to be identified based on its two-dimensional image (or three-dimensional images obtained by laser scan). A face recognition system generally consists of four modules: detection, alignment, feature extraction, and matching. Localization and normalization (face detection and alignment) are processing steps performed before face recognition (facial feature extraction and matching).
Face recognition processing
Face detection segments the face areas from the background. In the case of video, the detected faces may need to be tracked using a face tracking component. Face alignment is aimed at achieving more accurate localization and at normalizing faces, whereas face detection provides coarse estimates of the location and scale of each face.
Face recognition processing
Facial components and the facial outline are located; based on these location points, the input face image is normalized with respect to geometrical properties, such as size and pose, using geometrical transforms or morphing. The face is further normalized with respect to photometrical properties such as illumination and gray scale.
Face recognition processing
After a face is normalized, feature extraction is performed to provide effective information that is useful for distinguishing between faces of different persons and stable with respect to the geometrical and photometrical variations. For face matching, the extracted feature vector of the input face is matched against those of enrolled faces in the database; it outputs the identity of the face when a match is found with sufficient confidence or indicates an unknown face otherwise.
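The matching step can be sketched as nearest-neighbour search over the enrolled feature vectors, with a distance threshold deciding between a confident match and "unknown". Euclidean distance and all names here are illustrative assumptions, not a prescribed method:

```python
def match_face(query_vec, enrolled, threshold):
    """Return the enrolled identity whose feature vector is closest to the
    query, or 'unknown' when even the best match is above the threshold."""
    best_id, best_dist = None, float("inf")
    for identity, vec in enrolled.items():
        dist = sum((q - v) ** 2 for q, v in zip(query_vec, vec)) ** 0.5
        if dist < best_dist:
            best_id, best_dist = identity, dist
    return best_id if best_dist <= threshold else "unknown"

# Toy gallery with made-up 2-D feature vectors:
enrolled = {"Mr. Chan": [0.0, 0.0], "Prof. Cheng": [10.0, 10.0]}
print(match_face([1.0, 0.0], enrolled, threshold=2.0))  # Mr. Chan
print(match_face([5.0, 5.0], enrolled, threshold=1.0))  # unknown
```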
Face recognition processing
Face recognition processing flow.