
A Model-Based Facial Expression Recognition

Algorithm using Principal Components Analysis


N. Vretos, N. Nikolaidis and I. Pitas
Informatics and Telematics Institute
Centre for Research and Technology Hellas, Greece
Department of Informatics, Aristotle University of Thessaloniki
Thessaloniki 54124, Greece, Tel./Fax: +30-2310996304
e-mail: vretos,nikolaid,[email protected]

Abstract—In this paper, we propose a new method for facial expression recognition. We utilize the Candide facial grid and apply Principal Components Analysis (PCA) to find the two eigenvectors of the model vertices. These eigenvectors, along with the barycenter of the vertices, are used to define a new coordinate system where the vertices are mapped. Support Vector Machines (SVMs) are then used for the facial expression classification task. The method is invariant to in-plane translation and rotation as well as scaling of the face and achieves very satisfactory results.

I. INTRODUCTION

Facial expression recognition in video sequences and still images is a very important research topic with applications in human-centered interfaces, ambient intelligence, behavior analysis etc. In [1], Ekman established, based on an anthropological investigation, six main facial expressions (anger, surprise, happiness, disgust, fear and sadness) which are used to communicate human emotions. In many cases the neutral state is included along with the six expressions.

Although humans can easily recognize these facial expressions, this is not the case for algorithms that try to imitate this skill. During the last decade, many attempts involving a wide range of approaches have been undertaken to resolve this problem. Despite their diversity, most algorithms utilize information coming from the eyes, the mouth and the forehead region, since these areas are considered the ones with the richest information for facial expression recognition.

One consideration that has to be taken into account when designing facial expression recognition algorithms is the fact that a facial expression is a dynamic process that evolves over time and includes three stages [2]: an onset (attack), an apex (sustain) and an offset (relaxation). Many facial expression recognition algorithms operate on the video frame (or still image) that corresponds to the expression apex. Based on the input data type used, facial expression recognition algorithms can be classified into two main categories: image (or image-feature)-based and model-based ones. Each approach has its own merits. For instance, image-based algorithms are faster, as no complex image preprocessing is usually involved. On the other hand, model-based approaches employ a 2-D or 3-D face model, whose fitting on the facial image [3] implies significant computational cost. Despite this computational cost, model-based approaches are popular because they can capture essential geometrical information for facial expression recognition [3]. An overview of the state of the art can be found in [2].

In this paper, we propose a novel model-based facial expression recognition technique which utilizes the Candide facial grid (Figure 1), deformed so as to match the apex of an expression. So far, most model-based techniques use Candide vertex displacements, as in [3], or the Euclidean distances of pairs of Candide vertices [4]. In our case, we use the locations of the Candide vertices themselves. Such information is sufficient to describe the geometry of the face and the facial features and thus can lead to facial expression recognition. However, this information is also extremely vulnerable to affine transformations (e.g. translation) of the face. To remedy this problem, we utilize principal components analysis in order to establish a new coordinate system based on the evaluated eigenvectors and the vertex barycenter. As will be proven later on, mapping the vertices to the new coordinate system ensures the method's robustness to translation, rotation and isotropic scaling of the face.

The paper is organized as follows: in Section II, we present the PCA-based feature selection procedure and prove the robustness of the method towards translation, scaling and rotation. In Section III, we describe the application of Support Vector Machines (SVMs) to the selected features for the recognition of 6 or 6+1 different facial expressions and show experimental results on the Cohn-Kanade [5] database. Finally, conclusions are drawn in Section IV.

II. PRINCIPAL COMPONENTS ANALYSIS AND FEATURE SELECTION

The first step in the proposed approach is to track the Candide facial grid from the onset to the apex of the facial expression in a video sequence. In order to do so, we perform a manual localization of 7 vertices of the grid on the video frame corresponding to the onset of the facial expression, as shown in Figure 2a. The rest of the Candide vertices are arranged in this frame through the application of a spring mesh-model. Then, we use the Kanade-Lucas-Tomasi (KLT) algorithm to track the grid vertices to the facial expression apex, as shown in Figure 2b.

978-1-4244-5654-3/09/$26.00 ©2009 IEEE 3301 ICIP 2009


It has been proven that only 67 out of the 104 Candide vertices carry important information for facial expression recognition [3]. Therefore, only these vertices are retained. In Figure 3a the deformed facial grid corresponding to the surprise apex is shown. The 67 information-carrying vertices are the ones at the mouth and eye regions, as shown in Figure 3b.

Fig. 1: The Candide face model.

Fig. 2: a) Initialization of the Candide face model. b) Deformed Candide model.

Fig. 3: a) Deformed Candide face model corresponding to surprise. b) Retained deformed vertices from the model.

The N = 67 retained vertices form a point cloud in R^2. We calculate the barycenter P_m = (x_m, y_m) of these points by averaging the x and y coordinates over all points. Then, we subtract P_m from all points P_i^orig = (x_i, y_i), i = 1, ..., 67, obtaining points P_i = (x_i − x_m, y_i − y_m) that have zero-mean coordinates. Finally, we calculate the 2 × 2 covariance matrix of these points:

    Σ = (1/(N−1)) · [ ∑_{i=1}^{N} (x_i − x_m)^2             ∑_{i=1}^{N} (x_i − x_m)(y_i − y_m) ]
                    [ ∑_{i=1}^{N} (y_i − y_m)(x_i − x_m)     ∑_{i=1}^{N} (y_i − y_m)^2          ]    (1)

To achieve invariance to in-plane rotation, translation and scaling, as will become apparent below, we adopt a new coordinate system whose origin is P_m and whose axes are defined by the eigenvectors v1 and v2 of Σ, and we evaluate the coordinates of all points with respect to this system. In addition, we normalize the coordinates of each point by the inverse of the square root of the respective eigenvalue. The reason for this normalization will become apparent below. Thus:

    X_new = (1/√λ2) · P^T · v2    (2)
    Y_new = (1/√λ1) · P^T · v1    (3)

where P is a 2 × 67 matrix containing all P_i, X_new and Y_new are the 67 × 1 vectors that contain the normalized coordinates in the new coordinate system, λ1 and λ2 are the eigenvalues of the covariance matrix Σ with λ1 > λ2, and v1, v2 the respective 2 × 1 eigenvectors. Note that we assume that the variance along the y (vertical) axis of the original point cloud is larger than that along the x axis, an assumption that is valid due to facial proportions. The set of normalized points in the new coordinate system is used as the feature set for the expression recognition task.

In the following, we prove the robustness of the proposed features (namely, the Candide vertices expressed in the new coordinate system) with respect to translation, in-plane (2D) rotation and scaling of the face and, subsequently, of the Candide grid. First, since the origin of the new coordinate system is the barycenter of the vertices, a rigid translation of all vertices will translate the origin by the same amount. Thus, the vertex coordinates will remain unaltered.

In addition, 2D rotation invariance is provided by the fact that the two axes of maximum variance (eigenvectors) evaluated through PCA rotate rigidly along with the points. To prove this, suppose we multiply our set of points by an arbitrary rotation matrix R. The matrix containing the rotated points will be P' = R·P. The covariance matrix of the rotated points will be:

    Σ' = (1/(N−1)) · RP·(RP)^T = R · [(1/(N−1)) · P·P^T] · R^T = R·Σ·R^T    (4)

The eigenanalysis of Σ' provides:

    R·Σ·R^T · v'_i = λ'_i · v'_i    =⇒    (5)
    R^T·R·Σ·R^T · v'_i = R^T · λ'_i · v'_i    =⇒    (6)
    Σ · (R^T·v'_i) = λ'_i · (R^T·v'_i)    (7)

where Σ' is the covariance matrix of the rotated points, λ'_i, v'_i are its eigenvalues and eigenvectors, and (·)^T denotes matrix transposition. Equation (7) implies that R^T·v'_i is an eigenvector of Σ; thus, if v_i is an eigenvector of Σ, it holds that v'_i = R·v_i.

In other words, the eigenvectors of the rotated points are those of the original points rotated by the same rotation matrix. Furthermore, the eigenvalues of the set of rotated points are the same as those of the original matrix, i.e. λ'_i = λ_i.

For the vector X'_new containing the X coordinates (in the new coordinate system) of the rotated points, one can easily see that:

    X'_new = (1/√λ'2) · P'^T · v'2    =⇒    (8)
    X'_new = (1/√λ2) · (R·P)^T · R·v2    =⇒    (9)
    X'_new = (1/√λ2) · P^T · R^T · R·v2    =⇒    (10)
    X'_new = (1/√λ2) · P^T · v2    =⇒    (11)
    X'_new = X_new    (12)

The same obviously holds for Y'_new and Y_new. Thus, the point coordinates in the defined coordinate system do not change with rotation, and our features are invariant to in-plane rotation of the face.

Finally, invariance with respect to isotropic scaling can be proven as follows: by scaling both dimensions with the same factor s, it is easy to see that the new set of points will be P'' = s·P, with covariance matrix Σ'' = s²·Σ. The eigenanalysis of Σ'' provides:

    Σ · v_i = λ_i · v_i    =⇒    (13)
    s²·Σ · v_i = (s²·λ_i) · v_i    =⇒    (14)
    Σ'' · v_i = λ''_i · v_i    (15)

where λ''_i = s²·λ_i are the eigenvalues of the scaled points. The above equations also imply that Σ'' has (as expected) the same eigenvectors v1, v2 as Σ. From (2) and (3) we have:

    X''_new = (1/√λ''2) · (s·P)^T · v2    (16)
            = (1/√(s²·λ2)) · (s·P)^T · v2    (17)
            = (1/s) · (1/√λ2) · (s·P)^T · v2    (18)
            = (1/√λ2) · P^T · v2    (19)
            = X_new    (20)

Thus, the x coordinates of the scaled points in the new coordinate system remain unaltered. The same obviously holds for Y_new.

III. FACIAL EXPRESSION CLASSIFICATION EXPERIMENTS

We use Support Vector Machine (SVM) classifiers for recognizing the facial expression classes. SVMs were chosen due to their good performance in various practical pattern recognition applications [6]-[9] and their solid theoretical foundations. In the training phase, SVMs minimize an objective function under certain constraints so as to find the support vectors, which are subsequently used to assign labels to the test set. Many SVM variants exist, including both linear and non-linear forms, with different kernels being used in the latter. Six- and seven-class multiclass SVMs were used in our case.

We use the Cohn-Kanade [5] facial expression database in order to evaluate our method. First, we initialized the Candide grid on the onset frame of the database videos and tracked it until the facial expression apex frame. In our experiments, we used only the apex phase of the expression. The method was applied to 440 video frames (resulting from an equal number of videos): 35 for anger, 35 for disgust, 55 for fear, 90 for happiness, 65 for sadness, 70 for surprise and, finally, 90 for the neutral state. We conducted experiments for the recognition of either 6 or 6+1 (neutral) facial expressions.

We shall first present the experiments for the 6 facial expressions. In order to establish a test and a training set, we used a modified version of the leave-one-out cross-validation procedure where, in each run, we exclude 20% of the grids of each facial expression from the training set and use them to form the test set. Thus, in order to process all data, five runs were conducted and the average classification accuracy was calculated. In Table I, results are drawn from classifiers involving different parameters and kernels.

    Kernel       Degree   Recognition Rate
    RBF          3        88.69%
    RBF          5        88.18%
    RBF          4        88.15%
    RBF          7        87.79%
    RBF          8        87.79%
    RBF          6        87.59%
    RBF          2        87.26%
    Polynomial   3        88.71%
    Polynomial   2        88.17%
    Polynomial   4        87.70%

TABLE I: Results for radial basis function (RBF) and polynomial kernels with different degrees.

It can be seen that the results are practically the same for all tested SVM configurations. The confusion matrices for the RBF and polynomial kernel parameters that achieved the best performance are depicted in Tables II and III. The fact that we obtain practically the same results for different kernels and different parameters is an advantage of the proposed algorithm, since it potentially indicates good generalization properties. This desirable behavior can be attributed to the utilized features and is particularly important, since many SVM-based classification algorithms suffer in generalization due to the known sensitivity of SVMs with respect to their parameters [10].

We performed experiments for the 6+1 expressions as well (Tables IV, V and VI). For most algorithms, the classification accuracy when recognizing 6+1 expressions is worse than in the 6-class case. In our case, though, there is a slight improvement in the overall accuracy, which can perhaps be attributed to the nature of the feature space. Most probably, in this space the neutral class is far from all other classes and thus does not alter the 6-class results significantly.
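The classification stage described above can be sketched with scikit-learn's SVC, which handles the multiclass case via one-vs-one voting. The data below are synthetic stand-ins (one Gaussian blob of 134-dimensional vectors per class, i.e. 67 vertices × 2 normalized coordinates), since the Cohn-Kanade recordings cannot be bundled, and the single stratified 20% hold-out is a simplification of the paper's five-run protocol; the kernels and degrees mirror those in Table I.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in features: one Gaussian blob per expression class.
n_per_class, n_classes, dim = 40, 6, 134
X = np.concatenate([rng.normal(loc=3.0 * c, scale=1.0, size=(n_per_class, dim))
                    for c in range(n_classes)])
y = np.repeat(np.arange(n_classes), n_per_class)

# Hold out 20% of each class, echoing the paper's modified leave-one-out scheme.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# RBF and polynomial kernels, as in Table I (degree is ignored by the RBF kernel).
for kernel, degree in [("rbf", 3), ("poly", 3)]:
    clf = SVC(kernel=kernel, degree=degree, gamma="scale").fit(X_tr, y_tr)
    print(kernel, degree, clf.score(X_te, y_te))
```

On well-separated blobs both kernels reach similar accuracies, loosely mirroring the paper's observation that the choice of kernel and degree has little effect on this feature set.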

This fact can be noticed in the confusion matrices, where the rates for the 6 classes are practically the same as in the previous experiments. On the other hand, as the neutral class exhibits a 100% recognition rate, the overall accuracy of the algorithm increases. The overall classification accuracy for the cases of Tables IV-VI is 90.22%, 90.02% and 89.49%, respectively.

          Ang    Dis    Fea    Hap    Sad    Sur
    Ang   84.8   0      0      0      8.9    0
    Dis   6.1    93.3   2.2    2.2    2.5    1.4
    Fea   0      0      89.1   11.0   3.8    0
    Hap   3.0    6.7    6.5    86.8   3.8    1.4
    Sad   6.1    0      0      0      81.0   0
    Sur   0      0      2.2    0      0      97.1

TABLE II: Confusion matrix for polynomial kernel of 3rd degree.

          Ang    Dis    Fea    Hap    Sad    Sur
    Ang   81.1   0      0      0      6.7    0
    Dis   8.1    93.3   0      2.2    2.7    1.5
    Fea   0      0      89.4   10.9   2.7    0
    Hap   5.4    6.7    6.4    85.9   4.0    0
    Sad   5.4    0      0      1.1    84.0   0
    Sur   0      0      4.3    0      0      98.5

TABLE III: Confusion matrix for radial basis function (RBF) kernel of 3rd degree.

          Ang    Dis    Fea    Hap    Sad    Sur    Neu
    Ang   82.4   3.3    0      0      9.3    0      0
    Dis   5.9    93.3   2.3    1.0    3.5    1.4    0
    Fea   0      0      90.9   10.7   3.5    0      0
    Hap   5.9    3.3    4.5    87.4   3.5    0      0
    Sad   5.9    0      0      1.0    79.1   0      0
    Sur   0      0      2.3    0      0      98.6   0
    Neu   0      0      0      0      1.2    0      100.0

TABLE IV: Confusion matrix for radial basis function (RBF) kernel of 5th degree (6+1 expressions).

          Ang    Dis    Fea    Hap    Sad    Sur    Neu
    Ang   81.8   3.3    0      0      10.3   0      0
    Dis   6.1    93.3   2.3    1.0    3.4    1.4    0
    Fea   0      0      90.9   10.7   3.4    0      0
    Hap   6.1    3.3    4.5    87.4   3.4    0      0
    Sad   6.1    0      0      1.0    78.2   0      0
    Sur   0      0      2.3    0      0      98.6   0
    Neu   0      0      0      0      1.1    0      100.0

TABLE V: Confusion matrix for radial basis function (RBF) kernel of 2nd degree (6+1 expressions).

          Ang    Dis    Fea    Hap    Sad    Sur    Neu
    Ang   81.2   3.3    0      0      11.2   0      0
    Dis   6.2    93.3   2.2    1.0    3.4    1.4    0
    Fea   0      0      88.9   11.0   3.4    0      0
    Hap   6.2    3.3    6.7    88.0   3.4    1.4    0
    Sad   6.2    0      0      0      77.5   0      0
    Sur   0      0      2.2    0      0      97.1   0
    Neu   0      0      0      0      1.1    0      100.0

TABLE VI: Confusion matrix for polynomial kernel of 1st degree (6+1 expressions).

IV. CONCLUSIONS AND FUTURE WORK

In this paper, we have introduced a method for automatic facial expression recognition using principal components analysis on the vertices of the Candide model. SVMs are used for the classification task. The results clearly show that the proposed method is a good framework towards model-based facial expression recognition. The achieved facial expression classification accuracy is approximately 90%. The main advantage of the employed feature space is its robustness towards scaling, in-plane rotation and translation of the face.

It is also worth noting that the method is not sensitive to the SVM configuration with respect to the kernel and kernel parameters, which is an indication that the method has good generalization properties. Future work will aim at the experimental evaluation of this claim.

ACKNOWLEDGMENT

The research leading to these results has received funding from the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 211471 (i3DPost).

REFERENCES

[1] P. Ekman, "Facial expression and emotion," Personality: Critical Concepts in Psychology, vol. 48, pp. 384–392, 1998.
[2] M. Pantic and L.J.M. Rothkrantz, "Automatic analysis of facial expressions: the state of the art," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 12, pp. 1424–1445, 2000.
[3] I. Kotsia and I. Pitas, "Facial expression recognition in image sequences using geometric deformation features and Support Vector Machines," IEEE Transactions on Image Processing, vol. 16, no. 1, pp. 172–187, 2007.
[4] K. Kähler, J. Haber, and H.-P. Seidel, "Geometry-based muscle modeling for facial animation," Proc. Graphics Interface 2001, pp. 37–46, 2001.
[5] T. Kanade, J.F. Cohn, and Y. Tian, "Comprehensive database for facial expression analysis," Proceedings of the Fourth IEEE International Conference on Automatic Face and Gesture Recognition (FG'00), pp. 46–53, 2000.
[6] A. Tefas, C. Kotropoulos, and I. Pitas, "Using support vector machines to enhance the performance of elastic graph matching for frontal face authentication," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 7, pp. 735–746, 2001.
[7] H. Drucker, D. Wu, and V.N. Vapnik, "Support vector machines for spam categorization," IEEE Transactions on Neural Networks, vol. 10, no. 5, 1999.
[8] A. Ganapathiraju, J. Hamaker, and J. Picone, "Support vector machines for speech recognition," Fifth International Conference on Spoken Language Processing, 1998.
[9] M. Pontil and A. Verri, "Support vector machines for 3-D object recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20, no. 6, pp. 637–646, 1998.
[10] A. Zien, G. Rätsch, S. Mika, B. Schölkopf, T. Lengauer, and K.-R. Müller, "Engineering support vector machine kernels that recognize translation initiation sites," Bioinformatics, vol. 16, no. 9, pp. 799–807, 2000.
