A New Approach For Mood Detection Via Using Principal Component Analysis and Fisherface Algorithm
Journal of Global Research in Computer Science
Volume 2, No. 7, July 2011
RESEARCH PAPER
Available Online at www.jgrcs.info
Abstract-- Natural facial expressions commonly occur in social interactions between people, and are useful for providing an emotional context for the interaction and for communicating social intentions. This paper depicts an idea for detecting an unknown human face from input imagery and recognizing his/her current mood. The objective of this paper is that the detected psychological state gives information about some disorders, which is helpful in the diagnosis of depression, mania or schizophrenia. The elimination of errors due to reflections in the image has not been implemented, but the algorithms used in this paper are computationally efficient enough to resolve such errors. In this paper we have accepted six different moods to be recognized: Joy, Fear, Contempt, Sad, Disgust and Astonished. Principal Component Analysis (PCA) is implemented with the Fisherface algorithm to recognize the different moods. The main part of this paper is an emotional database which contains images of faces, their corresponding Action Units and their labels. The contribution of this database to the problem stated above is that it can be used by systems in order to recognize emotional facial expressions given one of the database data, i.e. an action units' combination.
Keywords-- Feature Extraction, Facial Expression Detection, Principal Component Analysis (PCA), Fisherface Algorithm.
subtle change of facial expressions, and emotion-specified expressions. The optimum facial feature extraction algorithm, the Canny Edge Detector, is applied to localize face images, and a hierarchical clustering-based scheme reinforces the search region of extracted highly textured facial clusters. Peter et al. [8] proposed a method based on Fisher's Linear Discriminant that produces well separated classes in a low-dimensional subspace, even under severe variation in lighting and facial expressions. The Eigenface technique, another method based on linearly projecting the image space to a low dimensional subspace, has similar computational requirements. Yet, extensive experimental results demonstrate that the proposed "Fisherface" method has error rates that are lower than those of the Eigenface technique for tests on the Harvard and Yale Face Databases. Bartlett et al. [5] explore and compare techniques for automatically recognizing facial actions in sequences of images. These techniques include analysis of facial motion through estimation of optical flow; holistic spatial analysis, such as independent component analysis, local feature analysis, and linear discriminant analysis; and methods based on the outputs of local filters, such as Gabor wavelet representations and local principal components.

FACIAL EXPRESSION DATABASE

The database used in this research paper for facial mood detection is a real-time database. This database contains 48 images covering 7 facial expressions, including neutral images. The database contains different images of an individual which represent different moods according to different situations. For the implementation, the database contains 48 coloured face images, with 4 images per subject, each under a different facial expression or configuration. In this implementation, all images are resized to a uniform dimension of 256 x 256. The following figure shows the database images that could be used for facial expression recognition.

Figure 3.1 Samples of the database used for training the recognition system.

DIMENSION REDUCTION TECHNIQUES

In statistics, dimension reduction is the process of reducing the number of random variables under consideration, and can be divided into feature selection and feature extraction.

Feature Selection

Feature selection approaches try to find a subset of the original variables (also called features or attributes). Two strategies are filter (e.g. information gain) and wrapper (e.g. search guided by the accuracy) approaches. See also combinatorial optimization problems. In some cases, data analysis such as regression or classification can be done in the reduced space more accurately than in the original space. It is mainly considered as a first step in the face recognition process.

Feature Extraction

Feature extraction transforms the data in the high-dimensional space to a space of fewer dimensions. The data transformation may be linear, as in principal component analysis (PCA), but many nonlinear dimensionality reduction techniques also exist.

The main linear technique for dimensionality reduction, principal component analysis, performs a linear mapping of the data to a lower dimensional space in such a way that the variance of the data in the low-dimensional representation is maximized. In practice, the correlation matrix of the data is constructed and the eigenvectors of this matrix are computed. The eigenvectors that correspond to the largest eigenvalues (the principal components) can then be used to reconstruct a large fraction of the variance of the original data. Moreover, the first few eigenvectors can often be interpreted in terms of the large-scale physical behavior of the system. The original space (with dimension equal to the number of points) has been reduced (with data loss, but hopefully retaining the most important variance) to the space spanned by a few eigenvectors.

Principal component analysis can be employed in a nonlinear way by means of the kernel trick. The resulting technique is capable of constructing nonlinear mappings that maximize the variance in the data, and is called Kernel PCA. Other prominent nonlinear techniques include manifold learning techniques such as locally linear embedding (LLE), Hessian LLE, Laplacian eigenmaps, and LTSA. These techniques construct a low-dimensional data representation using a cost function that retains local properties of the data, and can be viewed as defining a graph-based kernel for Kernel PCA. More recently, techniques have been proposed that, instead of defining a fixed kernel, try to learn the kernel using semidefinite programming. The most prominent example of such a technique is maximum variance unfolding (MVU). The central idea of MVU is to exactly preserve all pairwise distances between nearest neighbors (in the inner product space), while maximizing the distances between points that are not nearest neighbors.

An alternative approach to neighborhood preservation is through the minimization of a cost function that measures differences between distances in the input and output spaces. Important examples of such techniques include classical multidimensional scaling (which is identical to PCA), Isomap (which uses geodesic distances in the data space), diffusion maps (which uses diffusion distances in the data space), t-SNE (which minimizes the divergence between distributions over pairs of points), and curvilinear component analysis.

A different approach to nonlinear dimensionality reduction is through the use of autoencoders, a special kind of feed-forward neural network with a bottleneck hidden layer. The training of deep encoders is typically performed using greedy layer-wise pre-training (e.g., using a stack of Restricted Boltzmann machines), followed by a fine-tuning stage based on back propagation.
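The linear PCA mapping described above can be sketched in a few lines of NumPy. This is an illustrative implementation using the mean-centered covariance-matrix variant; the names pca_reduce, X and k are our own, not from the paper:

```python
import numpy as np

def pca_reduce(X, k):
    """Project n samples (rows of X) onto the top-k principal components."""
    X = X - X.mean(axis=0)                   # mean-center each attribute
    cov = np.cov(X, rowvar=False)            # covariance matrix of the features
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh: for symmetric matrices
    order = np.argsort(eigvals)[::-1]        # sort eigenvalues descending
    W = eigvecs[:, order[:k]]                # top-k eigenvectors
    return X @ W                             # low-dimensional representation

# Toy usage: 48 flattened face images would be reduced the same way.
X = np.random.rand(48, 20)
Z = pca_reduce(X, 5)
print(Z.shape)  # (48, 5)
```

Using the correlation matrix, as the text mentions, amounts to standardizing each variable before forming the covariance matrix.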
© JGRCS 2010, All Rights Reserved 89
Rajneesh Singla, Journal of Global Research in Computer Science Volume 2 No.(7), July 2011, 88-92
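The Kernel PCA variant mentioned in the Feature Extraction discussion, applying the kernel trick to obtain nonlinear mappings, can be sketched as follows. This is an illustrative RBF-kernel version; the function name and parameters are assumptions for the example, not code from the paper:

```python
import numpy as np

def kernel_pca(X, k, gamma=1.0):
    """Kernel PCA with an RBF kernel: nonlinear mapping maximizing variance."""
    # Pairwise squared Euclidean distances between samples
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * X @ X.T
    K = np.exp(-gamma * d2)                       # RBF kernel matrix
    n = K.shape[0]
    one = np.full((n, n), 1.0 / n)
    Kc = K - one @ K - K @ one + one @ K @ one    # center in feature space
    eigvals, eigvecs = np.linalg.eigh(Kc)
    order = np.argsort(eigvals)[::-1][:k]         # top-k kernel components
    alphas = eigvecs[:, order] / np.sqrt(np.maximum(eigvals[order], 1e-12))
    return Kc @ alphas                            # projected training points

X = np.vstack([np.random.rand(10, 4), np.random.rand(10, 4) + 2.0])
Z = kernel_pca(X, 2)
```

The centering step is what makes this equivalent to ordinary PCA performed in the implicit kernel feature space.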
Principal Component Analysis: Principal component analysis (PCA) is a mathematical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of uncorrelated variables called principal components. The number of principal components is less than or equal to the number of original variables. This transformation is defined in such a way that the first principal component has as high a variance as possible (that is, accounts for as much of the variability in the data as possible), and each succeeding component in turn has the highest variance possible under the constraint that it be orthogonal to (uncorrelated with) the preceding components. Principal components are guaranteed to be independent only if the data set is jointly normally distributed. PCA is sensitive to the relative scaling of the original variables. Depending on the field of application, it is also named the discrete Karhunen–Loève transform (KLT), the Hotelling transform or proper orthogonal decomposition (POD).

PCA was invented in 1901 by Karl Pearson. Now it is mostly used as a tool in exploratory data analysis and for making predictive models. PCA can be done by eigenvalue decomposition of a data covariance matrix or singular value decomposition of a data matrix, usually after mean centering the data for each attribute. The results of a PCA are usually discussed in terms of component scores (the transformed variable values corresponding to a particular case in the data) and loadings (the weight by which each standardized original variable should be multiplied to get the component score).

PCA is the simplest of the true eigenvector-based multivariate analyses. Often, its operation can be thought of as revealing the internal structure of the data in a way which best explains the variance in the data. If a multivariate dataset is visualized as a set of coordinates in a high-dimensional data space (one axis per variable), PCA can supply the user with a lower-dimensional picture, a "shadow" of this object when viewed from its (in some sense) most informative viewpoint. This is done by using only the first few principal components so that the dimensionality of the transformed data is reduced.

PCA is closely related to factor analysis; indeed, some statistical packages (such as Stata) deliberately conflate the two techniques. True factor analysis makes different assumptions about the underlying structure and solves eigenvectors of a slightly different matrix.

Fisherface Algorithm: Fisher's linear discriminant is a method used in statistics, pattern recognition and machine learning to find a linear combination of features which characterizes or separates two or more classes of objects or events. The resulting combination may be used as a linear classifier, or, more commonly, for dimensionality reduction before later classification. LDA is related to ANOVA and regression analysis, which also attempt to express one dependent variable as a linear combination of other features; in those two methods, however, the dependent variable is a numerical quantity, while for LDA it is a categorical variable (i.e. the class label). Logistic regression and probit regression are more similar to LDA, as they also explain a categorical variable. These other methods are preferable in applications where it is not reasonable to assume that the independent variables are normally distributed, which is a fundamental assumption of the LDA method.

LDA is also closely related to principal component analysis (PCA) and factor analysis in that both look for linear combinations of variables which best explain the data. LDA explicitly attempts to model the difference between the classes of data. PCA, on the other hand, does not take into account any difference in class, and factor analysis builds the feature combinations based on differences rather than similarities. Discriminant analysis is also different from factor analysis in that it is not an interdependence technique: a distinction between independent variables and dependent variables (also called criterion variables) must be made. LDA works when the measurements made on independent variables for each observation are continuous quantities. When dealing with categorical independent variables, the equivalent technique is discriminant correspondence analysis.

EXPERIMENT

We have experimented on the real-time database described above: it contains 48 coloured face images covering 7 facial expressions, including neutral images, representing the different moods of an individual according to different situations, with 4 images per subject, each under a different facial expression or configuration. In this implementation, all images are resized to a uniform dimension of 256 x 256. Vigorous experimentation is done by selecting a proper number of epochs, number of runs, and step size on a randomized data set to generalize the problem. The input image forms the first stage of the face recognition module: a face image is passed as input to the system. The input image samples considered include non-uniform illumination effects, variable facial expressions, and face images with glasses. In the second phase of operation the face image is transformed to an operationally compatible format: the face image is resized to a uniform dimension, and the data type of the image sample is transformed to double precision and passed for feature extraction. The feature extraction unit runs both the Fisherface and PCA algorithms for the computation of face features. These features are passed to a classifier which calculates the minimum Euclidean distance from the neutral image, and the image having minimum distance is selected for output. For the implementation of the proposed recognition architecture, the database samples are trained for the knowledge creation for classification. During the training phase, when a new facial image is added to the system, its features are calculated and aligned for the dataset formation. Comparing the weights of the test face with the known weights of the database is done by calculating the norm of the differences between the test and known sets of weights, such that a minimum difference between any pair symbolizes the closest match.

Methodology

The methodology used to detect different facial moods is described as follows:
Step 1: Two folders were created:
(1) Training images
(2) Input image
Step 2: Created a loop through the training images for reading image data into the T-matrix. (Preprocessing)
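As an illustration of the recognition pipeline described in the Experiment and Methodology sections (PCA followed by a Fisher discriminant projection, then minimum-Euclidean-distance matching), the following NumPy sketch is a minimal, hypothetical implementation; the function names, array shapes and synthetic usage are our own assumptions, not the paper's code:

```python
import numpy as np

def fisherface_train(X, y, n_pca):
    """X: (n_samples, n_pixels) training images as rows; y: mood labels.
    Returns the global mean, the combined PCA+LDA projection W,
    and the projected class means used for matching."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # PCA step: reduce dimension so the within-class scatter is well behaved
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    W_pca = Vt[:n_pca].T
    P = Xc @ W_pca
    # LDA step: between- and within-class scatter in the PCA subspace
    classes = np.unique(y)
    Sw = np.zeros((n_pca, n_pca))
    Sb = np.zeros((n_pca, n_pca))
    mu = P.mean(axis=0)
    for c in classes:
        Pc = P[y == c]
        mc = Pc.mean(axis=0)
        Sw += (Pc - mc).T @ (Pc - mc)
        d = (mc - mu)[:, None]
        Sb += len(Pc) * (d @ d.T)
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(eigvals.real)[::-1][:len(classes) - 1]
    W_lda = eigvecs[:, order].real
    W = W_pca @ W_lda                      # combined Fisherface projection
    means = {c: ((X[y == c] - mean) @ W).mean(axis=0) for c in classes}
    return mean, W, means

def classify(x, mean, W, means):
    """Assign the mood whose projected mean is nearest in Euclidean distance."""
    f = (x - mean) @ W
    return min(means, key=lambda c: np.linalg.norm(f - means[c]))
```

In the paper's setting, X would hold the 48 resized 256 x 256 images flattened to vectors and y the mood labels; the minimum-norm comparison in classify corresponds to the weight-difference matching described above.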
Disgust 90
Astonished 94
Fear 86
Contempt 93