Eigenfaces appear as light and dark areas arranged in a specific pattern. In regions where the difference among the training images is large, the corresponding regions of the eigenfaces have large magnitude.
Figure 1.2: Eigenfaces
directly on gray quantized images
In a quantized gray-scale image, only the n (n ≤ 8) most significant bits of the gray value are considered.
In Chapter 5, conclusions are drawn and future work is presented.
Chapter 2
Literature Review
2.1 HD and PHD
The conventional Hausdorff distance is a dissimilarity measure between two sets of points. It can be applied to edge maps to compare shapes. It measures proximity rather than exact superposition; hence it can be computed without explicitly pairing up the points of the two sets.
Let A = {a_1, a_2, a_3, ..., a_m} and B = {b_1, b_2, b_3, ..., b_n} be two sets of points. Then the undirected Hausdorff distance [8] between A and B is defined as:

HD(A, B) = HD(B, A) = max(hd(A, B), hd(B, A))   (2.1)

where hd(A, B) is the directed Hausdorff distance defined by:

hd(A, B) = max_{a∈A} min_{b∈B} |a − b|   (2.2)

and |·| is the norm of the vector.
Figure 2.1: Example of hd(A, B): for each point of set A the best correspondence in set B is found, and the largest of these minimum distances (the most dissimilar pair) gives hd(A, B).
Basically, it is the maximum distance that one has to travel from any point of set A to reach some point of set B. It is a max-min distance in which the min estimates the best correspondence for each point, and the max extracts the worst of those. Hence, hd(A, B) is the distance between the worst corresponding pair (as shown in Figure 2.1).
The HD measure does not work well when some part of the object is occluded or missing. This led to the introduction of the partial Hausdorff distance (PHD), which is used for partial matching and is defined as:

phd(A, B) = K^th max_{a∈A} min_{b∈B} |a − b|   (2.3)
HD and PHD do not solve the point-to-point correspondence at all, and they work on edge maps. Both of them can tolerate a small amount of local and non-rigid distortion as well as illumination variations. But the non-linear max and min functions make HD and PHD very sensitive to noise.
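For concreteness, a small Python sketch of Equations 2.1-2.3 on two toy sets of 2-D points; the point sets, the Euclidean norm and the value of K are illustrative choices only:

import numpy as np

def hd(A, B):
    """Directed Hausdorff distance hd(A, B) = max_a min_b |a - b| (Eq. 2.2)."""
    return max(min(np.linalg.norm(a - b) for b in B) for a in A)

def HD(A, B):
    """Undirected Hausdorff distance (Eq. 2.1)."""
    return max(hd(A, B), hd(B, A))

def phd(A, B, K):
    """Partial directed Hausdorff distance: K-th largest of the min distances (Eq. 2.3)."""
    dists = sorted((min(np.linalg.norm(a - b) for b in B) for a in A), reverse=True)
    return dists[K - 1]                      # K = 1 reduces to hd(A, B)

A = np.array([[0, 0], [1, 2], [3, 1]])       # example edge points of image A
B = np.array([[0, 1], [2, 2]])               # example edge points of image B
print(HD(A, B), phd(A, B, K=2))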
2.2 MHD and M2HD
The Modified Hausdorff Distance (MHD) [15] was introduced; it uses averaging, which is a linear function and therefore makes the measure less sensitive to noise. MHD is defined as:

mhd(A, B) = (1/N_a) Σ_{a∈A} min_{b∈B} |a − b|   (2.4)

where N_a is the number of points in set A.
Further, MHD was improved to the Doubly Modified Hausdorff Distance (M2HD) [10] by adding 3 more parameters:

- Neighborhood function (N^B_a): the neighborhood of the point a in set B
- Indicator variable (I): I = 1 if a's corresponding point lies in N^B_a, else I = 0
- Associated penalty (P): if I = 0, penalize with this penalty

M2HD is defined as:

m2hd(A, B) = (1/N_a) Σ_{a∈A} d(a, B)   (2.5)

where d(a, B) is defined as:

d(a, B) = max[ I · min_{b∈N^B_a} |a − b|, (1 − I) · P ]   (2.6)
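A minimal sketch of Equations 2.4-2.6, assuming Euclidean distances, a square window of half-width d as the neighborhood N^B_a, and an illustrative penalty P (these concrete choices are assumptions, not taken from the text):

import numpy as np

def mhd(A, B):
    """Modified Hausdorff distance (Eq. 2.4): average of the min distances."""
    return sum(min(np.linalg.norm(a - b) for b in B) for a in A) / len(A)

def m2hd(A, B, d=3, P=10.0):
    """Doubly modified Hausdorff distance (Eqs. 2.5-2.6)."""
    total = 0.0
    for a in A:
        b_star = min(B, key=lambda b: np.linalg.norm(a - b))        # a's best correspondence in B
        inside = abs(a[0] - b_star[0]) <= d and abs(a[1] - b_star[1]) <= d  # lies in N^B_a?
        I = 1 if inside else 0                                      # indicator variable
        total += max(I * np.linalg.norm(a - b_star), (1 - I) * P)   # Eq. 2.6
    return total / len(A)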
2.3 SWHD and SW2HD
To achieve better discriminative power, the HD and MHD measures were further improved by assigning a weight to every point according to its spatial information. Crucial facial feature points such as the eyes and mouth are approximated by rectangular windows (as shown in Figure 2.2) and are given more importance than other points. The proposed Spatially Weighted Hausdorff Distance (SWHD) and Doubly Spatially Weighted Hausdorff Distance (SW2HD) [11] are defined as:

swhd(A, B) = max_{a∈A} [ w(b) min_{b∈B} |a − b| ]   (2.7)

sw2hd(A, B) = (1/N_a) Σ_{a∈A} [ w(b) min_{b∈B} |a − b| ]   (2.8)
where w(x) is the spatial weighting function, which assigns a larger weight to points lying inside the rectangular feature windows (eyes, mouth) than to points outside them (2.9).

Figure 2.2: Spatial Weighting Function

2.4 SEWHD and SEW2HD

The eigenfaces are used as the weighting function because they represent the most significant variations in the set of training face images. The proposed Spatially Eigen Weighted Hausdorff Distance (SEWHD) and Doubly Spatially Eigen Weighted Hausdorff Distance (SEW2HD) [12] are defined as:
sewhd(A, B) = max_{a∈A} [ w_e(b) min_{b∈B} |a − b| ]   (2.10)

sew2hd(A, B) = (1/N_a) Σ_{a∈A} [ w_e(b) min_{b∈B} |a − b| ]   (2.11)

where w_e(x) is defined as:

w_e(x) = the eigen weight function generated by the first eigenvector   (2.12)
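Since the spatially weighted (Eqs. 2.7-2.8) and eigen-weighted (Eqs. 2.10-2.11) variants share the same structure and differ only in the weight function, one Python sketch parameterized by w covers both; the example weight map below is a placeholder, not the actual feature windows or eigenface weights:

import numpy as np

def nearest(a, B):
    """Best correspondence of point a in set B."""
    return min(B, key=lambda b: np.linalg.norm(a - b))

def weighted_hd(A, B, w, average=False):
    """swhd/sewhd (max form, Eqs. 2.7 and 2.10) or sw2hd/sew2hd (average form,
    Eqs. 2.8 and 2.11), depending on the 'average' flag; w weights the matched point b."""
    vals = []
    for a in A:
        b = nearest(a, B)
        vals.append(w(b) * np.linalg.norm(a - b))
    return sum(vals) / len(A) if average else max(vals)

# purely illustrative weight: double weight inside an assumed eye/mouth window, 1 elsewhere
w_spatial = lambda b: 2.0 if (20 <= b[0] <= 40 and 15 <= b[1] <= 75) else 1.0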
2.5 H_g and H_pg

Until 2006, the Hausdorff distance measure had been explored only on edge maps, but unfortunately on edge images most of the important facial features, which are very useful for facial discrimination, are lost. The Gray Hausdorff Distance (H_g) and Partial Gray Hausdorff Distance (H_pg) [13] measures work on quantized images and are found to be robust to slight variations in pose, expression and illumination. It has been observed that a quantized image with n ≥ 5 retains the perceptual appearance and the intrinsic facial feature information that resides in the gray values (as shown in Figure 2.3).

H_g and H_pg are defined as:
Figure 2.3: Quantized images, (a) top 1 bit, (b) top 2 bits, (c) top 3 bits, (d) top 4 bits, (e) top 5 bits, (f) top 6 bits, (g) top 7 bits, (h) top 8 bits.
h_g(A, B) = max_{i=0..2^n−1, a∈A_i} d(a, B_i)   (2.13)

h_pg(A, B) = K^th max_{i=0..2^n−1, a∈A_i} d(a, B_i)   (2.14)

where d(a, B_i) is defined as:

d(a, B_i) = min_{b∈B_i} |a − b|   if B_i is non-empty   (2.15a)
d(a, B_i) = L                     otherwise              (2.15b)
Here, A_i and B_i are the sets of pixels in images A and B having quantized gray value i. L is a large value, which can be r^2 + c^2 + 1 for r × c images. Both H_g and H_pg search for a correspondence between sets of pixels having the same quantized value in the two images, the distance measure itself being the distance between the worst correspondence.
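A compact sketch of Equations 2.13-2.15: each image is split into per-level pixel sets A_i and B_i and the worst (or K-th worst) correspondence distance is returned. The quantization depth n and the choice of L follow the text; the naive search and Euclidean distance are illustrative simplifications:

import numpy as np

def level_sets(img, n):
    """Group pixel coordinates by their quantized value (top n bits of the gray value)."""
    q = img >> (8 - n)
    sets = {}
    for (y, x), v in np.ndenumerate(q):
        sets.setdefault(int(v), []).append(np.array([y, x]))
    return sets

def gray_hausdorff(imgA, imgB, n=5, K=None):
    """h_g (Eq. 2.13) when K is None, h_pg (Eq. 2.14) otherwise."""
    r, c = imgA.shape
    L = r * r + c * c + 1                          # large value when B_i is empty (Eq. 2.15b)
    A_sets, B_sets = level_sets(imgA, n), level_sets(imgB, n)
    dists = []
    for i, A_i in A_sets.items():
        B_i = B_sets.get(i, [])
        for a in A_i:
            d = min(np.linalg.norm(a - b) for b in B_i) if B_i else L   # Eq. 2.15a
            dists.append(d)
    dists.sort(reverse=True)
    return dists[0] if K is None else dists[K - 1]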
2.6 Application in Face Recognition
All of the measures discussed before H_g and H_pg work on edge images. They treat an image as a set of edge points and then calculate the value of the measure using its mathematical formula as defined above (in the respective section).

H_g and H_pg work on quantized images. They treat a gray-scale image A as 256 sets of points (say A_0, A_1, A_2, ..., A_255), where A_i is the set of pixels in image A having quantized gray value i. Then H_g and H_pg are calculated using their mathematical formulas as defined above.
Chapter 3
The Proposed Approach and Implementation Details
We define a new measure, Normalized Unmatched Points (NUP), that can be applied to gray-scale facial images. It is similar to the Hausdorff distance based measures but is computationally less expensive and more accurate. NUP also shows robustness against slight variations in pose, expression and illumination.

In a gray-scale image, each pixel has an 8-bit gray value lying between 0 and 255, which is very sensitive to the environmental conditions. In a varying, uncontrolled environment, it becomes very difficult for a measure to capture any useful information about the image. Hence face recognition using gray-scale images is very challenging.
3.1 Transformation
Sudha and Wong [14] describe a transformation (referred to hereafter as the SK-transformation) which provides some robustness against illumination variation and local non-rigid distortions by converting gray-scale images into transformed images that preserve the intensity distribution.

A pixel's relative gray value within its neighborhood can be more stable than its own gray value. Hence, in an SK-transformed image every pixel is represented by an 8-element vector which stores the sign of the first-order derivative with respect to each of its 8 neighbors. SK-transformed images have the property that even if the gray values of the pixels change across different poses of the same subject, their corresponding vectors (i.e. their contributions to the transformed image) do not change to a great extent.

The above property holds when the gray values of neighboring pixels are not too close to each other. But usually we have small variations in gray values (e.g. in the background, facial features etc.), where the above property fails to hold.
The problem is caused by our comparator function, which assumes for any gray level X that:

X
 = X           (3.1a)
 < (X, 255]    (3.1b)
 > [0, X)      (3.1c)

where X is a gray level, not merely a number.
Practically, a gray level X is neither greater than gray level (X − 1) nor less than gray level (X + 1); ideally they should be considered equal. Gray levels are hardly distinguishable within a range of 5 units (as shown in Figure 3.1). Quantization [13] can also be thought of as a solution to this problem; however, it too behaves similarly at the boundaries.

Figure 3.1: Gray-value spectrum.
3.1.1 gt-Transformation
In order to solve this problem a new gt-transformation is introduced, which uses a gt-comparator function.
gt-Comparator
The gt-comparator function depends on the parameter gt (gray value tolerance). It assumes for any gray level X that:
X
 = [X − gt, X + gt]    (3.2a)
 < (X + gt, 255]       (3.2b)
 > [0, X − gt)         (3.2c)

The neighborhood N^B_a of a pixel a in image B, parameterized by the neighborhood parameter d, is defined as the set of pixels of B lying within distance d of the position of a.   (3.3)
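The sketch below illustrates one plausible reading of the gt-comparator and of the resulting 8-element vector, encoded as a single base-3 number so that it indexes the 3^8-entry BLIST used later; the particular codes chosen for "less", "equal" and "greater" (0, 1, 2) and the neighbor ordering are assumptions:

import numpy as np

def gt_compare(x, y, gt):
    """gt-comparator: y is 'equal' to x within tolerance gt, else 'less'/'greater' (Eq. 3.2)."""
    if abs(int(x) - int(y)) <= gt:
        return 1                         # equal (assumed code)
    return 2 if y > x else 0             # greater / less (assumed codes)

def gt_transform(img, gt):
    """gt-transformed image: every interior pixel gets a base-3 code built from the
    comparison of its gray value with its 8 neighbors (the 1-pixel border is lost)."""
    r, c = img.shape
    out = np.zeros((r - 2, c - 2), dtype=np.int32)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]
    for y in range(1, r - 1):
        for x in range(1, c - 1):
            code = 0
            for dy, dx in offsets:
                code = 3 * code + gt_compare(img[y, x], img[y + dy, x + dx], gt)
            out[y - 1, x - 1] = code     # value in [0, 3^8 - 1]
    return out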
Compare(A, B) compares image A to image B and returns N_U^AB (i.e. the number of unmatched points), which can be defined as:

N_U^AB = Σ_{a∈A} (1 − Match(a, B))   (3.4)
where Match(a, B) can be defined as:

Match(a, B) = 1   if ∃ b ∈ N^B_a such that V(a) = V(b) [i.e. Matched]   (3.5a)
Match(a, B) = 0   otherwise                                             (3.5b)

Match(a, B) matches a pixel a against the gt-transformed image B. It returns 1 if there is a pixel within the neighborhood of a in image B having the same gt-transformed value (i.e. Matched); otherwise it returns 0 (i.e. Unmatched).
Now NUP(A, B) is defined as:

NUP(A, B) = |(nup(A, B), nup(B, A))|_p   (3.6)

where nup(A, B) is defined as:

nup(A, B) = Σ_{a∈A} (1 − Match(a, B)) / N_a = N_U^AB / N_a   (3.7)

and |·|_p is the p-th norm.
Some Properties of NUP and nup

1. NUP(A, B) = NUP(B, A).
2. If nup(A, B) = K, then K · N_a pixels of A do not have any pixel with the same transformed value within their neighborhood in B.
3. NUP(A, B) and nup(A, B) are always positive and normalized between 0 and 1.
4. NUP(A, B) and nup(A, B) are parameterized by gt, d and p.
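Putting Equations 3.4-3.7 together, a direct (naive) Python sketch of nup and NUP over two gt-transformed images; the square neighborhood of half-width d and the default parameter values are illustrative:

import numpy as np

def match(a, ta_val, T_B, d):
    """Match(a, B): 1 if some pixel of B within the d-neighborhood of a has the
    same gt-transformed value (Eq. 3.5)."""
    y, x = a
    r, c = T_B.shape
    window = T_B[max(0, y - d):min(r, y + d + 1), max(0, x - d):min(c, x + d + 1)]
    return 1 if np.any(window == ta_val) else 0

def nup(T_A, T_B, d):
    """nup(A, B) = (number of unmatched pixels of A) / N_a  (Eqs. 3.4 and 3.7)."""
    unmatched = sum(1 - match((y, x), T_A[y, x], T_B, d)
                    for y in range(T_A.shape[0]) for x in range(T_A.shape[1]))
    return unmatched / T_A.size

def NUP(T_A, T_B, d=7, p=20):
    """NUP(A, B) = p-norm of (nup(A, B), nup(B, A))  (Eq. 3.6)."""
    v = np.array([nup(T_A, T_B, d), nup(T_B, T_A, d)])
    return np.linalg.norm(v, ord=p)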
3.3 Efficient computation of NUP

The Compare(A, B) and Match(a, B) operations are required to compute NUP(A, B). Both of these operations take O(rc) time for r × c sized images. Hence, computing NUP(A, B) using the naive method requires O(r^2 c^2) time, which is prohibitively computationally intensive. Hence an efficient algorithm is required to compute the NUP measure.
3.3.1 Algorithm
Flow Control of the Algorithm
The algorithm to compute NUP(A, B) (Algorithm 1) computes the Normalized Unmatched Points measure between two gt-transformed images. It calls the function Compare(A, B) (Algorithm 2), which computes the directional unmatched points and in turn calls Match(a, B) (Algorithm 3), which simply checks whether a pixel a gets a match in image B or not.
Discussion of the Algorithms
In Algorithm 1, two gt-transformed images are passed. The Compare(A, B) function is called to calculate the directional unmatched points, which are further normalized by the total number of pixels in the image.

To perform the Match(a, B) operation efficiently, an array of pointers to linked lists, BLIST (as shown in Figure 3.5), is created. BLIST has 3^8 elements such that for every i ∈ [0, 3^8 − 1] the i-th element points to a linked list of the pixels having the transformed value i [14].
Algorithm 1 NUP(A, B)
Require: gt-transformed images A and B.
Ensure: Return NUP(A, B).
1: Load gt-transformed images A and B from the disk;
2: nup(A, B) ← Compare(A, B) / N_a;
3: nup(B, A) ← Compare(B, A) / N_b;
4: NUP(A, B) ← |(nup(A, B), nup(B, A))|_p;
5: RETURN NUP(A, B);
Figure 3.5: Data structure BLIST, an array of 3^8 pointers indexed by the transformed value i (0 to 6560, shown in base 3), where the i-th entry points to a linked list of the pixels having T-value i.
Computing the BLIST data structure is a costly operation, and hence it is done only once in Algorithm 2, and Match(a, B), computed using Algorithm 3, makes use of it. In Algorithm 2, every pixel of the gt-transformed image A is checked, using Algorithm 3, for whether it gets a match within its neighborhood or not. Finally, the number of unmatched pixels (i.e. N_U^AB) is returned when image A is compared with image B.
Algorithm 2 Compare(A, B)
Require: gt-transformed images A and B.
Ensure: Return N_U^AB.
1: Construct BLIST (array of pointers to linked lists) for B;
2: unmatched ← 0;
3: for i = 0 to (r − 3) do
4:   for j = 0 to (c − 3) do
5:     if Match(A_ij, B) is 0 then
6:       unmatched ← unmatched + 1;
7:     end if
8:   end for
9: end for
10: RETURN unmatched;
After the aforementioned data structure BLIST has been created for B in Algorithm 2, the Match(a, B) operation can be performed efficiently using Algorithm 3. First, calculate the transformed value tval_a of pixel a. BLIST[tval_a] will point to the linked list of pixels having the transformed value tval_a in image B (as shown in Figure 3.5). Then search the list BLIST[tval_a] linearly until a pixel is found which lies in N^B_a. If such a pixel is found, return 1, else return 0.
Algorithm 3 Match(a, B)
Require: A pixel a and gt-transformed image B.
Ensure: If pixel a is matched then return 1, else return 0.
1: tval_a ← gt-transformed value of pixel a;
2: Search linked list BLIST[tval_a] for a point P ∈ N^B_a;
3: if no point is found in step 2 then
4:   RETURN 0;
5: else
6:   RETURN 1;
7: end if
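A Python sketch of Algorithms 1-3 in which BLIST is realized as a dictionary mapping each transformed value to the list of coordinates of B's pixels having that value, playing the role of the array of linked lists; boundary handling and the default parameter values are simplified assumptions:

def build_blist(T_B):
    """BLIST: transformed value -> list of pixel coordinates of B having that value."""
    blist = {}
    for y in range(T_B.shape[0]):
        for x in range(T_B.shape[1]):
            blist.setdefault(int(T_B[y, x]), []).append((y, x))
    return blist

def match_fast(a, tval_a, blist, d):
    """Algorithm 3: linearly search BLIST[tval_a] for a point inside N^B_a."""
    for (y, x) in blist.get(tval_a, []):
        if abs(y - a[0]) <= d and abs(x - a[1]) <= d:
            return 1
    return 0

def compare(T_A, T_B, d):
    """Algorithm 2: number of unmatched pixels of A with respect to B."""
    blist = build_blist(T_B)
    unmatched = 0
    for y in range(T_A.shape[0]):
        for x in range(T_A.shape[1]):
            unmatched += 1 - match_fast((y, x), int(T_A[y, x]), blist, d)
    return unmatched

def nup_fast(T_A, T_B, d=7, p=20):
    """Algorithm 1: NUP(A, B) as the p-norm of the two directed nup values."""
    v = (compare(T_A, T_B, d) / T_A.size, compare(T_B, T_A, d) / T_B.size)
    return (v[0] ** p + v[1] ** p) ** (1.0 / p)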
3.3.2 Running Time Analysis
Preprocessing
Conversion of gray-scale images of size r × c into gt-transformed images is done once, for which a single scan of the whole image is sufficient. Hence the time complexity is O(rc).
Processing
The Match function involves a linear search of a linked list of pixels, therefore the time taken by this function depends on the length of the list. Let us assume that k is the length of the largest linked list. To compute NUP between two images, the Match function has to be called 2rc times, therefore the time required to compute NUP will be O(krc).

The worst case occurs when all the pixels in an image have the same transformed value. Then k = rc, which leads to the trivial O(r^2 c^2) time complexity. But for face images in varying environments, this condition will practically never occur.
3.3.3 Space Analysis
The space requirement of a gray image is O(rc). The same space can be utilized for storing the gt-transformed images, as the original images are not used for further computation.

The array of pointers to linked lists of pixels (BLIST) is of size 3^8. This is a constant, independent of the image size. As all the pixels in both images are added once to the lists of pixels, the total memory used in constructing the data structure for the two images is 2 × (3^8 + rc) units.
Chapter 4
Experimental Results and Analysis
4.1 Setup for Face Recognition System
Our face recognition system consists of 3 phases. In the first phase face detection is done, in the second phase some preprocessing is performed (i.e. the gt-transformation), and finally in the third phase face comparison using the NUP measure is performed.

In this system, face normalization is optional, because for big databases a lot of manual work is required to gather the ground-truth information; the neighborhood function takes care of this normalization. Also, facial feature extraction and matching are not required, as suggested in Figure 1.1, because our approach relies on globally analysing a face as a whole for the recognition purpose.
(a) Input face images
(b) Cropped face images
(c) Normalized face images
(d) gt-Transformed images (gt = 0)
(e) gt-Transformed images (gt = 5)
Figure 4.1: Images produced after the various phases
4.1.1 Face Detection
Faces are extracted using Haar cascades. The trained Haar cascades are used directly as available in OpenCV [25]. Cropped faces are resized to the ORL standard size (i.e. 92 × 112 pixels) (as shown in Figure 4.1(b)).
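For reference, a minimal OpenCV (Python) sketch of this detection and resizing step; the cascade file, the file names and the detection parameters are typical defaults and may differ from the ones actually used:

import cv2

# pretrained frontal-face Haar cascade shipped with OpenCV
cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

img = cv2.imread("subject_pose.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

for (x, y, w, h) in faces:
    crop = gray[y:y + h, x:x + w]
    face = cv2.resize(crop, (92, 112))      # ORL standard size: 92 x 112 pixels
    cv2.imwrite("face_92x112.png", face)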
4.1.2 Preprocessing and Testing Strategy
After preprocessing, the gt-transformed images are saved as color images (in TIFF format), sized 90 × 110 (as shown in Figures 4.1(d) and 4.1(e)). For testing any database, we consider the whole database as the testing set, and then each image of the testing set is matched against all other images excluding itself. Finally, the top n best matches are considered. The value of n can range from 1 to (total number of poses per subject − 1).
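The all-against-all testing strategy can be sketched as follows; the gallery layout (a list of (gt-transformed image, subject id) pairs) and the rank-n scoring rule are assumptions consistent with the description above:

def recognition_rate(gallery, n, nup_measure):
    """gallery: list of (gt_transformed_image, subject_id). An image counts as
    correctly recognized if any of its top-n matches has the same subject id."""
    hits = 0
    for i, (img_i, sid_i) in enumerate(gallery):
        scores = [(nup_measure(img_i, img_j), sid_j)
                  for j, (img_j, sid_j) in enumerate(gallery) if j != i]
        scores.sort(key=lambda t: t[0])              # smaller NUP means a better match
        if any(sid == sid_i for _, sid in scores[:n]):
            hits += 1
    return hits / len(gallery)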
Database Subjects Poses Total Images Varying
ORL 40 10 400 Poses and Expressions
YALE 15 11 165 Illumination and Expressions
BERN 30 10 300 Poses and Expressions
CALTECH 17 20 340 Poses and Illumination
IITK 149 10 1490 Poses and Scale
Table 4.1: Databases Information
4.2 Experimental Results
The performance evaluation of the NUP measure was done on standard benchmark facial image databases such as ORL [17], YALE [18], BERN [19], CALTECH [20], and IITK (as shown in Table 4.1). Under varying lighting conditions, poses and expressions, the NUP measure has demonstrated very good recognition rates.
4.2.1 Parameterized results of NUP based recognition on different facial databases

The NUP measure is parameterized primarily by two parameters, gt and d; the third parameter p (the order of the norm) is set to 20 for this work. The gray value tolerance gt can vary within the range [0, 5] and the neighborhood parameter d can vary within the range [1, 15].
Discussion for gt (Gray Value Tolerance)
From the definition of gt (as shown in Equation 3.2a) it is clear that more and more elements of V(a) start acquiring the value 1 with higher gt values. This boosts the blue value of the pixels in the gt-transformed images. In the presence of directional lights and heavy variations of the illumination conditions, some of the facial regions become significantly dark. High gt values in these conditions may further lift the blue value up to an extent that the blue color starts dominating the gt-transformed image (as shown in Figure 4.2). This results in a deterioration of the performance.

Figure 4.2: Effect of high gt values under heavy illumination variation: (a) original, (b) gt = 0, (c) gt = 5.

The NUP measure performs well on illumination-varying databases such as YALE and CALTECH (as shown in Table 4.1) with lower gt values (as shown in Figures 4.4 and 4.6). On databases like ORL, BERN and IITK, where the illumination does not vary too much and directional lighting is also absent, higher gt values yield better discrimination (as shown in Figures 4.3, 4.5 and 4.7).
Discussion for d (Neighborhood parameter)
From the definition of d (as shown in Equation 3.3), on the unnormalized, pose- and expression-varying images of the ORL, BERN and IITK databases, a bigger neighborhood yields good performance, as also suggested by the plots (as shown in Figures 4.3, 4.5 and 4.7).

On databases like YALE and CALTECH a smaller neighborhood is expected to suffice because they contain fairly normalized images without too much pose and expression variation (as shown in Table 4.1), as also suggested by the plots (as shown in Figures 4.4 and 4.6).
Bibliography
[1] A. Samal and P. A. Iyengar, Automatic recognition and analysis of human faces and facial expressions: a survey, Pattern Recognition, vol. 25, no. 1, 1992, pp. 65-77.
[2] R. Chellappa, C. L. Wilson and S. Sirohey, Human and machine recognition of faces: a survey, Proc. IEEE, vol. 83, no. 5, 1995, pp. 705-740.
[3] M. Turk and A. Pentland, Eigenfaces for recognition, Journal of Cognitive Neuroscience, March 1991.
[4] L. Wiskott, J.-M. Fellous, N. Kruger and C. von der Malsburg, Face recognition by elastic bunch graph matching, IEEE Trans. Pattern Anal. Mach. Intell., vol. 19, pp. 775-779.
[5] S. Lawrence, C. L. Giles, A. C. Tsoi, and A. D. Back, Face recognition: a convolutional neural network approach, IEEE Trans. Neural Networks, vol. 8, 1997, pp. 98-113.
[6] Guodong Guo, Stan Z. Li, and Kapluk Chan, Face recognition by support vector machines, Proc. Fourth IEEE Int. Conf. on Automatic Face and Gesture Recognition, 2000, pp. 196-201.
[7] F. S. Samaria, Face recognition using Hidden Markov Models, PhD thesis, Trinity College, University of Cambridge, Cambridge, 1994.
[8] D. P. Huttenlocher, G. A. Klanderman and W. A. Rucklidge, Comparing images using the Hausdorff distance, IEEE Trans. Pattern Anal. Mach. Intell., vol. 15, no. 9, Sep. 1993, pp. 850-863.
[9] W. J. Rucklidge, Locating objects using the Hausdorff distance, ICCV '95: Proc. 5th Int. Conf. Computer Vision, Washington, D.C., June 1995, pp. 457-464.
[10] B. Takacs, Comparing face images using the modified Hausdorff distance, Pattern Recognition, vol. 31, no. 12, 1998, pp. 1873-1881.
[11] B. Guo, K.-M. Lam, K.-H. Lin and W.-C. Siu, Human face recognition based on spatially weighted Hausdorff distance, Pattern Recognition Letters, vol. 24, Jan. 2003, pp. 499-507.
[12] K.-H. Lin, K.-M. Lam and W.-C. Siu, Spatially eigen-weighted Hausdorff distances for human face recognition, Pattern Recognition, vol. 36, Aug. 2003, pp. 1827-1834.
[13] E. P. Vivek and N. Sudha, Gray Hausdorff distance measure for comparing face images, IEEE Trans. Inf. Forensics and Security, vol. 1, no. 3, Sep. 2006.
[14] N. Sudha and Y. Wong, Hausdorff distance for iris recognition, Proc. 22nd IEEE Int. Symp. on Intelligent Control (ISIC 2007), Singapore, October 2007, pp. 614-619.
[15] M. Dubuisson and A. K. Jain, A modified Hausdorff distance for object matching, Proc. 12th Int. Conf. on Pattern Recognition (ICPR), Jerusalem, Israel, 1994.
[16] Y. Gao and M. K. Leung, Face recognition using line edge map, IEEE Trans. Pattern Anal. Machine Intell., vol. 24, Jun. 2002, pp. 764-779.
[17] The ORL Database of Faces [Online], Available: http://www.uk.research.att.com/facedatabase.html.
[18] The Yale University Face Database [Online], Available: http://cvc.yale.edu/projects/yalefaces/yalefaces.html.
[19] The Bern University Face Database [Online], Available: ftp://ftp.iam.unibe.ch/pub/images/faceimages/.
[20] The Caltech University Face Database [Online], Available: http://www.vision.caltech.edu/html-files/archive.html.
[21] David A. Forsyth and Jean Ponce, Computer Vision: A Modern Approach, Pearson Education, 2003.
[22] M.-H. Yang, D. J. Kriegman and N. Ahuja, Detecting faces in images: a survey, IEEE Trans. Pattern Anal. Mach. Intell., vol. 24, no. 1, 2002, pp. 34-58.
[23] S. Z. Li and A. K. Jain, Handbook of Face Recognition, Springer-Verlag, 2005.
[24] Yuankui Hu and Zengfu Wang, A similarity measure based on Hausdorff distance for human face recognition, 18th International Conference on Pattern Recognition (ICPR '06), IEEE, 2006.
[25] Gary Bradski and Adrian Kaehler, Learning OpenCV: Computer Vision with the OpenCV Library, [Online], Available: http://www.amazon.com/Learning-OpenCV-Computer-Vision-Library/dp/0596516134.