
Project Report

On
Facial Emotion Detection

Submitted in partial fulfilment of the requirements for the award of the degree of

Bachelor of Computer Application

Under The Supervision of


Mr. Rajakumar P
Department of Computing Science and Engineering

SUBMITTED BY
Ujjawal Gupta (22SCSE1040014)
Yogendra Singh (22SCSE1040078)
Priyanshi Raj (22SCSE1040035)
Roshan Kumar (22SCSE1040054)

SCHOOL OF COMPUTING SCIENCE AND ENGINEERING


GALGOTIAS UNIVERSITY, GREATER NOIDA
Abstract
Face detection has been around for ages. Taking a step forward, human emotion, displayed by the face and felt by the brain, can be approximated from video, electrical signals (EEG) or images. Human emotion detection is the need of the hour so that modern artificial intelligence systems can emulate and gauge reactions from the face. This can help in making informed decisions, be it regarding identification of intent, promotion of offers or security-related threats. Recognizing emotions from images or video is a trivial task for the human eye, but proves to be very challenging for machines and requires many image processing techniques for feature extraction. Several machine learning algorithms are suitable for this job. Any detection or recognition by machine learning requires training the algorithms and then testing them on a suitable dataset. This paper explores a couple of machine learning algorithms as well as feature extraction techniques which help in accurate identification of human emotion.


Acknowledgments

I would like to express my gratitude to my project supervisor, Mr. Rajakumar P., for his guidance and support. I also appreciate the resources provided by Galgotias University, Greater Noida, and the efforts of my colleagues and team members. Finally, thanks to my family and friends for their unwavering support.

Student Names:

Ujjawal Gupta (22SCSE1040014)


Yogendra Singh (22SCSE1040078)
Priyanshi Raj (22SCSE1040035)
Roshan Kumar (22SCSE1040054)

Supervisor's Signature:
TABLE OF CONTENTS

1. Introduction ........................................................................................................................................ 1

2. Image Features.................................................................................................................................... 3

2.1 FACS .............................................................................................................................................. 3

2.2 Landmarks ...................................................................................................................................... 4

2.3 Feature Descriptors ........................................................................................................................ 5

3. Related Work ...................................................................................................................................... 6

3.1 Feature Extraction Techniques ....................................................................................................... 6

3.1.1 Ensemble of regression trees ....................................................................................................... 6

3.1.1.1 Displacement ratios .................................................................................................................. 7

3.1.2 Eulerian Motion Magnification (EMM) ...................................................................................... 8

3.1.2.1 Amplitude- Based Eulerian Motion Magnification .................................................................. 8

3.1.2.2 Phase – Based Eulerian Motion Magnification ........................................................................ 9

3.1.2.3 LBP-TOP feature extraction from EMM ............................................................................... 10

3.1.3 Face detection using Viola-Jones face detector......................................................................... 10

3.1.3.1 LBP technique for feature extraction ..................................................................................... 10

3.1.4 Using dynamic grid – based HoG features ................................................................................ 12

3.1.5 Geometrical facial features extraction ....................................................................................... 13

3.1.5.1 Eccentricity features ............................................................................................................... 13

3.1.5.1.1 Eccentricity extraction algorithm ........................................................................... 15

3.1.5.2 Linear features ........................................................................................................ 15

3.2 Machine Learning Algorithms ..................................................................................................... 16

3.2.1 Support Vector Machines (SVM).............................................................................................. 16

3.2.2 Hidden Markov Models (HMM) ............................................................................................... 17

3.2.3 Other Algorithms....................................................................................................................... 18

4. Tools and Libraries Used ................................................................................... 20

4.1 OpenCV........................................................................................................................................ 20

4.2 Dlib............................................................................................................................................... 20

4.3 Python .......................................................................................................................................... 20

4.4 Scikit-learn ................................................................................................................................... 21

4.5 Jupyter Notebook ......................................................................................................................... 21

4.6 Database ....................................................................................................................................... 21

5. Implementation ................................................................................................................................. 23

5.1 Setting up the database ................................................................................................................. 23

5.2 Image processing pipeline ............................................................................................................ 26

5.2.1 Face detection............................................................................................................................ 26

5.2.2 Facial feature extraction ............................................................................................................ 27

5.2.3 Python pipeline.......................................................................................................................... 28

5.2.4 Machine learning ....................................................................................................................... 28

6. Results ................................................................................................................................................ 30

7. Conclusion and Future Work .......................................................................................................... 46

References ............................................................................................................................................. 49

1. Introduction
Human emotion detection is implemented in many areas requiring additional security or

information about the person. It can be seen as a second step to face detection where we may be

required to set up a second layer of security, where along with the face, the emotion is also

detected. This can be useful to verify that the person standing in front of the camera is not just a

2-dimensional representation [1].

Another important domain where we see the importance of emotion detection is for business

promotions. Most of the businesses thrive on customer responses to all their products and offers.

If an artificially intelligent system can capture and identify real-time emotions based on user images or video, it can decide whether the customer liked or disliked the product or offer.

We have seen that security is the main reason for identifying any person. It can be based on fingerprint matching, voice recognition, passwords, retina detection etc. Identifying the intent of the person can also be important to avert threats. This can be helpful in vulnerable areas like airports, concerts and major public gatherings, which have seen many breaches in recent years.

Human emotions can be classified as fear, contempt, disgust, anger, surprise, sadness, happiness and neutral. These emotions are very subtle. Facial muscle contortions are very minimal, and detecting these differences can be very challenging, as even a small difference results in a different expression [4]. Also, expressions of different or even the same people might vary for the same emotion, as emotions are hugely context dependent [7]. While we can focus only on those areas of the face which display a maximum of emotion, such as around the mouth and eyes [3], how we extract these gestures and categorize them is still an important question. Neural networks and machine learning have been used for these tasks and have obtained good results.

Machine learning algorithms have proven to be very useful in pattern recognition and

classification. The most important aspects for any machine learning algorithm are the features. In

this paper we will see how the features are extracted and modified for algorithms like Support

Vector Machines [1]. We will compare algorithms and the feature extraction techniques from

different papers. The human emotion dataset can be a very good example to study the robustness

and nature of classification algorithms and how they perform for different types of datasets.

Usually before extraction of features for emotion detection, face detection algorithms are applied

on the image or the captured frame. We can generalize the emotion detection steps as follows:

1) Dataset preprocessing

2) Face detection

3) Feature extraction

4) Classification based on the features

In this work, we focus on the feature extraction technique and emotion detection based on

the extracted features. Section 2 focuses on some important features related to the face. Section 3

gives information on the related work done in this field. Related work covers many of the feature

extraction techniques used until now. It also covers some important algorithms which can be used

for emotion detection in human faces. Section 4 details the tools and libraries used in the

implementation. Section 5 explains the implementation of the proposed feature extraction and

emotion detection framework. Section 6 highlights the result of the experiment. Section 7 covers

the conclusion and future work.

2. Image Features

We can derive different types of features from the image and normalize them in vector form. We can employ various techniques to identify the emotion, such as calculating the ellipses formed on the face or the angles between different parts like the eyes and mouth. The following are some of the prominent features which can be used for training machine learning algorithms:

2.1 FACS

The Facial Action Coding System (FACS) assigns a number to each facial movement. Each such number is called an action unit (AU). A combination of action units results in a facial expression. The micro changes in the muscles of the face can be defined by an action unit. For example, a smiling face can be defined in terms of action units as 6 + 12, which simply means that movement of the AU6 muscle and the AU12 muscle results in a happy face. Here, Action Unit 6 is the cheek raiser and Action Unit 12 is the lip corner puller. A coding system based on action units is a good way to determine which facial muscles are involved in which expression. Real-time face models can be generated based on them.
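As a toy illustration of this action-unit notation (a minimal sketch; only the 6 + 12 = happy example comes from the text above, and the AU labels used here are the ones quoted there):

# Minimal sketch of the action-unit notation described above.
# Only AU6 + AU12 -> happy is taken from the example in the text;
# treating an expression as a combination of AUs is the general FACS idea.
AU_NAMES = {6: "cheek raiser", 12: "lip corner puller"}

def describe(action_units):
    """Spell out which muscles a combination of action units involves."""
    return " + ".join(AU_NAMES.get(au, f"AU{au}") for au in sorted(action_units))

print(describe({6, 12}))  # -> "cheek raiser + lip corner puller", i.e. a happy face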

Figure 1: Action Units corresponding to different movements in face [15]


2.2 Landmarks

Landmarks on the face are very crucial and can be used for face detection and recognition. The same landmarks can also be used in the case of expressions. The Dlib library has a 68-point facial landmark detector which gives the position of 68 landmarks on the face.

Figure 2: Landmarks on face [18]

Figure 2 shows all 68 landmarks on the face. Using the dlib library we can extract the coordinates (x, y) of each of the facial points. These 68 points can be divided into specific areas like the left eye, right eye, left eyebrow, right eyebrow, mouth, nose and jaw.
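A minimal sketch of extracting these 68 points with dlib (the shape predictor file is the standard pre-trained model distributed with dlib and must be downloaded separately; the image path is a placeholder):

import cv2
import dlib
import numpy as np

# Standard dlib components: HOG-based face detector and the pre-trained
# 68-point shape predictor.
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

image = cv2.imread("face.png")                       # placeholder path
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

for rect in detector(gray, 1):                       # each detected face
    shape = predictor(gray, rect)
    # Collect the 68 (x, y) coordinates into a numpy array.
    points = np.array([(shape.part(i).x, shape.part(i).y)
                       for i in range(shape.num_parts)])
    print(points.shape)                              # (68, 2)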

2.3 Feature Descriptors

Good features are those which help in identifying the object properly. Usually, images are identified on the basis of corners and edges. For finding corners and edges in images, we have many feature detector algorithms in the OpenCV library, such as the Harris corner detector. These feature detectors take into account many more factors, such as contours and convex hulls. Key-points are the corner points or edges detected by the feature detector algorithm. The feature descriptor describes the area surrounding the key-point. The description can be anything, including raw pixel intensities or coordinates of the surrounding area. A key-point and its descriptor together form a local feature. One example of a feature descriptor is the histogram of oriented gradients (HoG). ORB (based on BRIEF), SURF and SIFT are some of the feature descriptor algorithms [25].
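A minimal sketch of detecting key-points and computing descriptors with OpenCV's ORB (the image path is a placeholder; SIFT and SURF follow the same detect-and-compute pattern but may require a contrib build of OpenCV):

import cv2

gray = cv2.imread("face.png", cv2.IMREAD_GRAYSCALE)   # placeholder path

# ORB: detect key-points (corners/edges) and compute a binary descriptor
# for the neighbourhood of each key-point.
orb = cv2.ORB_create(nfeatures=500)
keypoints, descriptors = orb.detectAndCompute(gray, None)

print(len(keypoints))        # number of detected key-points
print(descriptors.shape)     # (num_keypoints, 32) binary descriptors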

3. Related Work

3.1 Feature Extraction Techniques


3.1.1 Ensemble of regression trees

This method uses cascaded regression trees to find the important positions on the face directly from images. Pixel intensities are used to distinguish between different parts of the face, identifying 68 facial landmarks [1]. Based on a current estimate of the shape, parameter estimation is done by transforming the image into a normalized coordinate system instead of a global one. Extracted features are used to re-estimate the shape parameter vector, and this is repeated until convergence [5].

Figure 3: Image with 19 feature points [1]

The author [1] uses only 19 features, as shown in Figure 3, from the 68 extracted features, focusing only around the eyes, mouth and eyebrows.

The ensemble of regression trees is very fast and robust, giving the 68 features in around 3 milliseconds.

3.1.1.1 Displacement ratios

Once the features are in place, the displacement ratios of these 19 feature points are

calculated using pixel coordinates. Displacement ratios are nothing but the difference in

pixel position in the image from initial expression to final expression.

Instead of using these distances directly, displacement ratios are used as these pixel

distances may vary depending on the distance between the camera and the person.

The dataset used for this experiment was the iBug-300W dataset, which has more than 7000 images, along with the CK+ dataset, which has 593 sequences of facial expressions from 123 different subjects.

Table 2: Distances calculated to determine displacement ratios between different parts of face [1]

Distance Description of the distances

D1 and D2 Distance between the upper and lower eyelid of the right and left eyes

D3 Distance between the inner points of the left and right eyebrow

D4 and D5 Distance between the nose point and the inner point of the left and right eyebrow

D6 and D8 Distance between the nose point and the right and left mouth corner

D7 and D9 Distance between the nose point and the midpoint of the upper and lower lip

D10 Distance between the right and left mouth corner

D11 Distance between the midpoint of the upper and lower lip

D12 Mouth circumference
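A minimal sketch of computing a few of these distances from the 68 landmark coordinates (the landmark indices used here are assumptions based on the standard dlib 68-point layout, 0-indexed; normalizing by a neutral-frame distance is the general idea behind using ratios rather than raw pixel distances):

import numpy as np

def dist(points, i, j):
    """Euclidean distance between landmark i and landmark j."""
    return float(np.linalg.norm(points[i] - points[j]))

def displacement_ratios(neutral, expressive):
    """Ratio of selected distances between an expressive and a neutral frame.

    `neutral` and `expressive` are (68, 2) arrays of landmark coordinates.
    The indices below are assumptions from the standard dlib layout:
    48/54 = mouth corners, 51/57 = upper/lower lip midpoints.
    """
    pairs = {
        "D10_mouth_width": (48, 54),
        "D11_mouth_opening": (51, 57),
    }
    return {name: dist(expressive, i, j) / dist(neutral, i, j)
            for name, (i, j) in pairs.items()}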

3.1.2 Eulerian Motion Magnification (EMM)

Subtle emotions are hard to detect. If we magnify the emotions, there is a possibility of increasing

the accuracy of detection. Motion properties such as velocity and acceleration can be used for

magnification. The image as a whole is transformed by magnifying changes in amplitude and phase. Based on the property used, there are A-EMM (amplitude-based) and P-EMM (phase-based) motion magnification [4].

4. Tools and Libraries Used

4.1 OpenCV

OpenCV is the library we will be using for image transformation functions such as converting the

image to grayscale. It is an open source library and can be used for many image functions and has

a wide variety of algorithm implementations. C++ and Python are the languages supported by

OpenCV. It is a complete package which can be used with other libraries to form a pipeline for any image extraction or detection framework. The range of functions it supports is enormous, and it also includes algorithms to extract feature descriptors.
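For instance, a typical transformation step of this kind (a minimal sketch; the file name and the target size are placeholders, not values taken from the experiment):

import cv2

# Read an image, convert it to grayscale and resize it to a uniform size,
# the kind of pre-processing used before face detection in this work.
image = cv2.imread("S137_001_00000014_7.png")          # placeholder file name
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = cv2.resize(gray, (350, 350))                     # assumed target size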

4.2 Dlib

Dlib is another powerful image-processing library which can be used in conjunction with Python, C++ and other tools. The main functions this library provides are detecting faces, extracting features, matching features, etc. It also has support for other domains like machine learning, threading, GUI and networking.

4.3 Python

Python is a powerful scripting language and is very useful for solving statistical problems involving machine learning algorithms. It has various utility functions which help in pre-processing. Processing is fast and it is supported on almost all platforms. Integration with C++ and other image libraries is very easy, and it has in-built functions and libraries to store and manipulate data of all types. It provides the pandas and numpy frameworks, which help in manipulating data as per our needs. A good feature set can be created using numpy arrays, which can hold n-dimensional data.

4.4 Scikit-learn

Scikit-learn is the machine learning library in Python. It builds on numpy, integrates well with matplotlib, and provides a wide array of machine learning algorithms. The API is very easy to use and understand. It has many functions to analyze and plot the data. A good feature set can be formed using many of its feature reduction, feature importance and feature selection functions. The algorithms it provides can be used for classification and regression problems and their sub-types.

4.5 Jupyter Notebook

Jupyter Notebook is the IDE used to combine Python with all the libraries we will be using in our implementation. It is interactive, although some complex computations require time to complete. Plots and images are displayed instantly. It can be used as a one-stop environment for all our requirements, and most of the libraries like Dlib, OpenCV and Scikit-learn can be integrated easily.

4.6 Database

We have used the extended Cohn-Kanade database (CK+) and the Radboud Faces Database (RaFD). CK+ has 593 image sequences from 123 subjects. Only 327 files have labeled/identified emotions. It covers all the basic human emotions displayed by the face. The emotions and their codes are as follows: 1 – Anger, 2 – Contempt, 3 – Disgust, 4 – Fear, 5 – Happy, 6 – Sadness, 7 – Surprise. The database is widely used for emotion detection research and analysis. There are three more folders along with the images: FACS contains the action units for each image, Landmark contains the AAM-tracked facial features for the 68 facial points, and Emotion contains the emotion label for the 327 labeled files.

The Radboud Faces Database [19] is a standard database having an equal number of files for all emotions. It has images of 67 subjects displaying 8 emotions, neutral included. The pictures are taken in 5 different camera poses, and the gaze is in 3 directions. We are using only front-facing images for our experiment. We have a total of 536 files, with 67 models displaying 8 different emotions.

Figure 12: Various emotions from the kaggle database

5. Implementation

A static approach using extracted features and emotion recognition using machine learning is used in this work. The focus is on extracting features using Python and image processing libraries and using machine learning algorithms for prediction. Our implementation is divided into three parts. The first part is image pre-processing and face detection. For face detection, inbuilt methods available in the dlib library are used. Once the face is detected, the region of interest and important facial features are extracted from it. There are various features which can be used for emotion detection. In this work, the focus is on facial points around the eyes, mouth, eyebrows etc.

We have a multi-class classification problem and not a multi-label one. There is a subtle difference, as a set of features can belong to many labels but only one unique class. The extracted facial features along with SVM are used to detect the multi-class emotions. The papers we have studied focus on SVM as one of the widely used and accepted algorithms for emotion classification. Our database has a total of 7 classes to classify. We have also used logistic regression and random forest to compare the results of different algorithms. The processing pipeline can be visualized in Figure 13.

5.1 Setting up the database

The image files for the CK+ database are in different directories and sub-directories based on the person and session number. Also, not all the images depict an emotion; only 327 files have one of the emotions from 1-7 depicted. All the files were of type portable network graphics (.png). The emotion labels are in a different directory but with the same names as the image files. We wrote a small utility function in Java which used the emotion file name to pick up the correct image from the directory and copy it into our final dataset folder. We also appended the name of the emotion file to the image file name. Thus, while parsing the file in our program we will have the emotion label for that file.

Figure 13: Implementation Pipeline

For example, in the filename S137_001_00000014_7, S137 represents the subject number, 001 the session number, 00000014 the image number in the session, and finally 7 represents the emotion the subject is posing.
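A minimal sketch of parsing such a filename into its parts (the emotion-code mapping follows the CK+ codes listed in Section 4.6):

EMOTIONS = {1: "anger", 2: "contempt", 3: "disgust", 4: "fear",
            5: "happy", 6: "sadness", 7: "surprise"}

def parse_ck_filename(name):
    """Split a name like 'S137_001_00000014_7' into subject, session, frame, emotion."""
    subject, session, frame, emotion_code = name.split("_")
    return subject, session, frame, EMOTIONS[int(emotion_code)]

print(parse_ck_filename("S137_001_00000014_7"))
# ('S137', '001', '00000014', 'surprise')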

The dataset we created consisted of only frontal face images, and there was no file with no emotion (no neutral emotion). The lighting and illumination conditions for some images were different, and some images were colored. The processing pipeline for all the images was the same, in spite of the illumination conditions.

For the RaFD database we simply extracted the name of the emotion from the image filename, which was in jpg format. As this database is standard, we had a balanced number of classes for each emotion. Table 5 shows the distribution of emotion classes for the CK+ database.

Table 5: Number of images per class for the CK+ database

Emotion        Number of images depicting the emotion

1: Anger       45

2: Contempt    18

3: Disgust     59

4: Fear        25

5: Happy       69

6: Sadness     28

7: Surprise    83

Figure 14: Bar plot of number of samples for each class

From Figure 14 we see that the number of samples per class is not equal, and this might result in some classes being misclassified. For example, contempt has very few samples (18); hence, if none of its samples are present in training, it will be difficult to classify contempt in the testing data set. Moreover, due to fewer training samples, the class can also be treated as an outlier. The algorithm can become biased towards the emotion surprise and classify most images as surprise. For the RaFD database there was no bias towards any single class.

5.2 Image processing pipeline

5.2.1 Face detection

Face detection was the first and a very important part of the processing pipeline. Before further processing, we had to detect the face, even though our images contained only frontal facial expression data. Once the face was detected, it was easier to determine the region of interest and extract features from it.

Figure 15: Original image from the database and detected face from the image

For face detection, we tried several algorithms, such as Haar cascades from OpenCV. Finally, we settled on the face detector based on the histogram of oriented gradients (HoG) from the Dlib library. HoG descriptors along with an SVM are used to identify the face in the image. Images are converted to grayscale and resized for uniformity.
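A minimal sketch of this detection step with dlib's HoG-based detector (the image path is a placeholder):

import cv2
import dlib

detector = dlib.get_frontal_face_detector()   # HoG + linear SVM based detector

image = cv2.imread("face.png")                 # placeholder path
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

faces = detector(gray, 1)                      # upsample once to catch smaller faces
for rect in faces:
    # rect gives the detected face window as left/top/right/bottom coordinates
    face = gray[rect.top():rect.bottom(), rect.left():rect.right()]
    print(rect.left(), rect.top(), rect.width(), rect.height())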

5.2.2 Facial feature extraction

For facial feature extraction, we used the 68-landmark facial feature predictor from dlib. The face detector algorithm returns a window (x, y, width, height) which is the detected face. The detected face is passed to the feature predictor algorithm. Figure 16 shows the detected 68 landmarks for a particular face. The predictor function returns the 68 points at the eyes (left and right), mouth, eyebrows (left and right), nose and jaw. We used a numpy array to convert the 68 points into an array of 68 x and y coordinates representing their locations. These are the facial features we have used to predict emotion.

Figure 16: Detected landmarks from the face

The landmarks are easier to access in numpy array form. Also, from Figure 16 we know the indices of each feature, hence we can focus on a particular feature instead of the entire set. The feature points are divided as 1-17 for the jaw, 49-68 for the mouth, and so on. So, for instance, if we want to ignore the jaw, we can simply set the x and y coordinates for the jaw to 0 while converting the features into the numpy array. We also calculated distances and polygon areas for some of the facial landmarks.
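A minimal sketch of these steps, assuming `shape` is the dlib prediction from the previous step (the index ranges follow the 1-based point numbering quoted above, converted to 0-based numpy indexing; the area computation uses the standard shoelace formula):

import numpy as np

def shape_to_array(shape):
    """Convert a dlib 68-point prediction into a (68, 2) numpy array."""
    return np.array([(shape.part(i).x, shape.part(i).y)
                     for i in range(shape.num_parts)], dtype=float)

def zero_out_jaw(points):
    """Ignore the jaw by zeroing its coordinates (points 1-17, i.e. indices 0-16)."""
    points = points.copy()
    points[0:17] = 0
    return points

def polygon_area(points):
    """Shoelace formula for the area of a polygon given its (x, y) vertices."""
    x, y = points[:, 0], points[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, 1)) - np.dot(y, np.roll(x, 1)))

# Example: area of the mouth region (points 49-68, i.e. indices 48-67).
# mouth_area = polygon_area(shape_to_array(shape)[48:68])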

5.2.3 Python pipeline

The dataset of 327 files was stored in a directory and each file was processed to create the feature set. As soon as a file was picked up, its name was parsed to extract the emotion label. The emotion label was appended to a list of labels which forms our multi-class target variable. The image was then processed for face detection and feature prediction. The features derived from each file were appended to a list which was later converted to a numpy array of dimension 327 x 68 x 2. We also had the target classes in the form of a numpy array. The same process was followed for the RaFD database.
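A minimal sketch of this loop (the directory name and model file path are placeholders; skipping images where no face is detected is an assumption about how such cases were handled):

import os
import cv2
import dlib
import numpy as np

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

features, labels = [], []
for filename in sorted(os.listdir("ck_dataset")):          # placeholder directory
    emotion_code = int(os.path.splitext(filename)[0].split("_")[-1])
    gray = cv2.cvtColor(cv2.imread(os.path.join("ck_dataset", filename)),
                        cv2.COLOR_BGR2GRAY)
    rects = detector(gray, 1)
    if not rects:                                           # no face found: skip
        continue
    shape = predictor(gray, rects[0])
    points = np.array([(shape.part(i).x, shape.part(i).y)
                       for i in range(shape.num_parts)])
    features.append(points)
    labels.append(emotion_code)

X = np.array(features)        # shape (num_images, 68, 2)
y = np.array(labels)          # multi-class target variable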

5.2.4 Machine learning

Once we had created the feature set and the target variable, we used Support Vector Machines to predict the emotions. The scikit-learn library was used to implement the Support Vector Machine (SVM) and Logistic Regression algorithms. The multi-class strategy used was "One-Vs-Rest" for all the algorithms. The logistic regression algorithm was fine-tuned for the penalties "l1" and "l2". We also changed the SVM kernel from linear to rbf and poly to see the variation in results. Cross-validation was used along with SVM to remove any biases in the databases. Initially the dataset was divided as 70% for training and 30% for testing. We also tried other splits such as 80:20; the 70:30 split seemed more appealing, as our assumption was that all classes would be equally represented in the test set. For the cross-validation score we initially tested with 4 splits. To improve the results we chose the values 5 and 10, which are standard values for cross-validation. A Random Forest Classifier and Decision Trees were also run on our dataset, but resulted in lower accuracy compared to the other algorithms in our experiment; hence we decided to continue with SVM and Logistic Regression.
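A minimal sketch of this training and evaluation step with scikit-learn, assuming `X` and `y` from the pipeline sketch above (flattening the 68 x 2 landmark array into one feature vector per image is an assumption about the exact feature layout):

from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.metrics import accuracy_score, confusion_matrix

X_flat = X.reshape(len(X), -1)        # flatten (68, 2) landmarks into one vector per image

# 70:30 train/test split as described above.
X_train, X_test, y_train, y_test = train_test_split(
    X_flat, y, test_size=0.30, random_state=42)

# One-vs-rest SVM with a linear kernel; rbf/poly can be swapped in via `kernel=`.
svm = OneVsRestClassifier(SVC(kernel="linear"))
svm.fit(X_train, y_train)
print("SVM accuracy:", accuracy_score(y_test, svm.predict(X_test)))
print(confusion_matrix(y_test, svm.predict(X_test)))

# Logistic regression baseline and cross-validation (cv=5).
logreg = OneVsRestClassifier(LogisticRegression(max_iter=1000))
print("LogReg CV score:", cross_val_score(logreg, X_flat, y, cv=5).mean())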

6. Results
We applied support vector machines to our dataset and predicted the results. The results were interpreted using a confusion matrix and the accuracy metric. The train:test split was 75:25. We also did cross-validation on the dataset to remove any biases. The number of cross-validation splits was chosen as 4 because the resulting splits have the same number of images as our 25% test set. The results are as follows:

Table 6: Accuracy for 75:25 split and cross-validation

SVM kernel    Accuracy (%)    Cross-validation accuracy score (cv=4)

linear        78.05           0.78 (+/- 0.07)

rbf           21.95           0.25 (+/- 0.01)

poly          75.61           0.76 (+/- 0.06)

In our experiment the SVM with the linear kernel performed better than the other kernels. The rbf kernel gave the worst performance, whereas poly was nearly as good as the linear kernel. We tried to keep the test set percentage the same for both the split and the cross-validated data so as to have uniformity in the results. The mean cross-validation score was also approximately equal to the accuracy score achieved by the split. Figure 17 shows the heat-map of the confusion matrix from our multi-class classification results. On further analysis of the confusion matrix, we see that the diagonals have higher weights; there are a few misclassifications in every class except class 4: Fear.

the predicted values and actual values in report format. From this, we can infer the correct number of emotions predicted for each class.

1: Anger – 9/11 – 82%

2: Contempt – 3/4 - 75%

3: Disgust – 12/17 - 70%

4: Fear – 5/5 - 100%

5: Happy – 15/16 - 93%

6: Sadness – 8/11 - 72%

7: Surprise – 12/18 – 66%
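A minimal sketch of how such per-class figures can be read off a confusion matrix (illustrative only; the label arrays below are placeholders, not the actual experimental results):

import numpy as np
from sklearn.metrics import confusion_matrix

# Placeholder predictions; in the experiment these come from the test set.
y_true = [1, 1, 2, 3, 3, 3, 5, 7, 7]
y_pred = [1, 5, 2, 3, 3, 7, 5, 7, 1]

labels = [1, 2, 3, 4, 5, 6, 7]
cm = confusion_matrix(y_true, y_pred, labels=labels)

# Per-class accuracy = correctly classified (diagonal) / total samples of that class.
for i, label in enumerate(labels):
    total = cm[i].sum()
    if total:
        print(f"class {label}: {cm[i, i]}/{total} = {cm[i, i] / total:.0%}")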

7. Conclusion and Future Work

Our implementation can roughly be divided into 3 parts:

1. Face detection

2. Feature extraction

3. Classification using machine learning algorithms

Feature extraction was a very important part of the experiment. The added distance and area features provided good accuracy for the CK+ database (89%). But for the cross-database experiment we observed that raw features worked best with Logistic Regression when testing on the RaFD database and the mobile images dataset. The accuracy was 66% and 36% respectively, using the CK+ dataset as the training set. The additional features (distance and area) reduced the accuracy of the experiment for SVM, as seen in Table 13 and Table 15. Logistic Regression generalized the results from the training set to the testing set better than SVM and the other algorithms. The emotion detection algorithm gave average accuracy of up to 86% for the RaFD database and 87% for the CK+ database with cross-validation = 5. The RaFD dataset had an equal number of samples per class; hence, cross-validation did not help in improving the accuracy of the model.

Table 22 shows our performance as compared to different papers. When compared to Paper [1], which used ORB feature descriptors, our method performed better, using only the 68 facial landmark points and the distance and area features [14]. Paper [1] achieved an accuracy of 69.9% without the neutral emotion, whereas we achieved an average accuracy of 89%. Paper [14] had a feature extraction technique similar to ours, and their accuracy was slightly (by 0.78%) better than ours. Paper [20] used a large number of iterations to train the layers and achieved an accuracy of 98.15% with 35000 iterations. Paper [21] also used a similar concept of angles and areas and achieved accuracies of 82.2% and 86.7% for k-NN and CRF respectively.

Table 22: Comparison with different papers

Algorithm and Features | Number of Emotions | Accuracy

CNN (FENet/FeNet with and without BN layer), 5000->35000 iterations [20] | 7 (neutral considered instead of contempt) | Without BN layer: 85.25->87.35; With BN layer: 97.41->98.15

k-NN and CRF (angles and areas) [21] | All 7 | k-NN: 82.2; CRF: 86.7

InceptionResNet with and without CRF (softmax layer) [22] | 7 | With CRF: 93.04; Without CRF: 85.77

M-CRT [23] | 7 (neutral considered instead of contempt) | 90.72

SVM, multiclass + binary (displacement ratios) [14] | 7 | 89.78

SVM (ORB features) [1] | 7 | 69.9 (79.1 with neutral)

We did not focus on face detection in this paper. Our main focus was on feature extraction and the analysis of the machine learning algorithms on the dataset. But an accurate face detection algorithm becomes very important if there are multiple people in the image. If we are determining the emotion of a particular person from a webcam, the system should be able to detect all the faces accurately.

For future work, a more robust face detection algorithm coupled with some good features can be researched to improve the results. We focused on only some distances and areas; there can be many more such interesting features on the face which can be statistically calculated and used for training the algorithm. Also, not all features help to improve the accuracy; some may not be helpful in combination with other features. Feature selection and reduction techniques can be applied to the created feature set to improve the accuracy. We can experiment with the facial action coding system or feature descriptors as features, or a combination of both. Also, we can experiment with different datasets across different races. This will give us an idea of whether the approach is similar for all kinds of faces or whether some other features should be extracted to identify the emotion. Applications such as drowsiness detection amongst drivers [1] can be developed using feature selection and by cascading different algorithms together. Algorithms like logistic regression, linear discriminant analysis and random forest classifiers can be fine-tuned to achieve good accuracy and results. Also, metrics such as cross-validation score, recall and F1 score can be used to define the correctness of the model, and the model can be improved based on these metrics.

References

[1] W. Swinkels, L. Claesen, F. Xiao and H. Shen, "SVM point-based real-time emotion

detection," 2017 IEEE Conference on Dependable and Secure Computing, Taipei, 2017.

[2] Neerja and E. Walia, "Face Recognition Using Improved Fast PCA Algorithm," 2008

Congress on Image and Signal Processing, Sanya, Hainan, 2008

[3] H. Ebine, Y. Shiga, M. Ikeda and O. Nakamura, "The recognition of facial expressions with

automatic detection of the reference face," 2000 Canadian Conference on Electrical and

Computer Engineering. Conference Proceedings. Navigating to a New Era (Cat.

No.00TH8492), Halifax, NS, 2000, pp. 1091-1099 vol.2.

[4] A. C. Le Ngo, Y. H. Oh, R. C. W. Phan and J. See, "Eulerian emotion magnification for

subtle expression recognition," 2016 IEEE International Conference on Acoustics, Speech

and Signal Processing (ICASSP), Shanghai, 2016

[5] V. Kazemi and J. Sullivan, "One millisecond face alignment with an ensemble of regression

trees," 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH,

2014

[6] M. Dahmane and J. Meunier, "Emotion recognition using dynamic grid-based HoG

features," Face and Gesture 2011, Santa Barbara, CA, 2011

[7] K. M. Rajesh and M. Naveenkumar, "A robust method for face recognition and face emotion

detection system using support vector machines," 2016 International Conference on

Electrical, Electronics, Communication, Computer and Optimization Techniques

(ICEECCOT), Mysuru, 2016

[8] C. Loconsole, C. R. Miranda, G. Augusto, A. Frisoli and V. Orvalho, "Real-time emotion

recognition novel method for geometrical facial features extraction," 2014 International

Conference on Computer Vision Theory and Applications (VISAPP), Lisbon, Portugal, 2014

[9] J. M. Saragih, S. Lucey and J. F. Cohn, "Real-time avatar animation from a single

image," Face and Gesture 2011, Santa Barbara, CA, USA, 2011

[10] G. T. Kaya, "A Hybrid Model for Classification of Remote Sensing Images With Linear

SVM and Support Vector Selection and Adaptation," in IEEE Journal of Selected Topics in

Applied Earth Observations and Remote Sensing, vol. 6, no. 4, pp. 1988-1997, Aug. 2013

[11] X. Jiang, "A facial expression recognition model based on HMM," Proceedings of 2011

International Conference on Electronic & Mechanical Engineering and Information

Technology, Harbin, Heilongjiang, China, 2011

[12] J. J. Lee, M. Zia Uddin and T. S. Kim, "Spatiotemporal human facial expression

recognition using Fisher independent component analysis and Hidden Markov Model," 2008

30th Annual International Conference of the IEEE Engineering in Medicine and Biology

Society, Vancouver, BC, 2008

[13] Xiaoxu Zhou, Xiangsheng Huang, Bin Xu and Yangsheng Wang, "Real-time facial

expression recognition based on boosted embedded hidden Markov model," Image and

Graphics (ICIG'04), Third International Conference on, Hong Kong, China, 2004

[14] T. Kundu and C. Saravanan, "Advancements and recent trends in emotion recognition

using facial image analysis and machine learning models," 2017 International Conference on

Electrical, Electronics, Communication, Computer, and Optimization Techniques

(ICEECCOT), Mysuru, 2017, pp.
