
International Research Journal of Engineering and Technology (IRJET)
Volume: 11 Issue: 04 | Apr 2024 | www.irjet.net | e-ISSN: 2395-0056 | p-ISSN: 2395-0072

Face Emotion Recognition System Using Deep Learning


ASST. PROF. M. KAVITHA1, V. HARSHAN2, M. PRAVEEN KUMAR3, A. RAHUL4

1,2,3,4 Dept. of Computer Science and Engineering, Government College of Engineering Srirangam, Tamilnadu, India
Abstract - This paper presents an enhanced system for real-time facial emotion detection, aiming to improve efficiency and accuracy through deep learning. The proposed approach utilizes VGG-19 transfer learning, a pre-trained convolutional neural network (CNN) architecture known for its depth and strong performance in image classification. VGG-19's pre-trained weights contribute to improved efficiency compared to simpler CNNs, allowing for effective feature extraction and classification of emotional expressions in real time. This approach has the potential to benefit various applications in human-computer interaction and psychology by enabling accurate and timely emotion recognition.

Key Words: Facial Emotion Detection, Deep Learning, VGG-19 Transfer Learning, Real-Time Emotion Recognition, Efficiency, Human-Computer Interaction, Psychology

1. INTRODUCTION

Human communication encompasses speech, gestures, and emotions, all vital for interpersonal interactions. AI systems capable of understanding human emotions are crucial, especially in healthcare and e-learning, where emotional understanding is paramount. Traditional emotion detection methods often fall short in real-time scenarios, necessitating models that can continuously interpret facial expressions for dynamic emotional assessment.

This paper proposes a real-time facial emotion recognition model leveraging advancements in AI and computer vision. The model aims to enhance human-computer interactions across diverse applications by dynamically detecting and responding to emotions. Automatic Facial Expression Recognition (FER) has gained traction, driven by its potential in human-computer interaction and healthcare. While Ekman's discrete categorization model is widely used, its limitation in handling spontaneous expressions prompts the need for more comprehensive approaches.

Our focus is on categorical facial expression classification using the VGG-19 model, known for its depth and performance in image tasks. By employing pre-trained weights, our system achieves efficiency and accuracy for real-time emotion recognition. This work explores the potential of VGG-19 transfer learning for facial emotion recognition while remaining adaptable to other models given suitable data.

1.1 RELATED WORK

Sharmeen M. Saleem Abdullah and Adnan Mohsin Abdulazeez [1] reviewed the latest FER research, identifying the numerous CNN architectures that have recently been proposed and surveying databases of face photographs, collected both in the real world and in laboratories, used to detect human emotions.

Hussein, E. S., Qidwai, U. and Al-Meer, M. [2] recommended a CNN model for understanding facial emotions along three continuum emotions. Inspired by Xception, the model uses residual blocks and depthwise-separable convolutions to reduce the parameter count to 33k, and serves as a convolutional FER network for emotional stability identification. The CNN applies convolution operations to learn to extract features from the input images, which reduces the need for manual feature extraction. The proposed model offers 81 percent overall precision on unseen data, sensing negative and positive emotions with precisions of 87% and 85%, respectively; however, the accuracy of neutral emotion detection is just 51%.

Jiang, P., Liu, G., Wang, Q., and Wu, J. [3] introduced a new loss function, called the advanced softmax loss (ASL), to counteract imbalanced training expressions. The proposed loss guarantees that every class has a level playing field and equal potential by using fixed (unlearnable) weight parameters of the same magnitude, equally allocated in angular space. Their research shows that the proposed FER methods outperform specific state-of-the-art FER methods, and the loss can be used as an isolated supervision signal or combined with other loss functions. In summary, detailed studies on the FER2013 and Real-world Affective Faces (RAF) databases have shown that ASL is considerably more precise and effective than many state-of-the-art approaches.

2. METHODOLOGIES

This approach utilizes the VGG19 architecture for facial emotion recognition by preprocessing the dataset, training the model, and validating it for real-time deployment. It includes implementing a user interface for interaction and feedback loops for continual improvement.

2.1. Convolutional Neural Networks:

CNNs, or Convolutional Neural Networks, are crucial in deep learning and particularly effective in computer vision tasks. They automatically learn relevant features from raw input data, making them ideal for image and video recognition. Structured to mimic human visual processing, CNNs consist of layers such as input, convolutional, activation, pooling, fully connected, and output layers, each extracting increasingly abstract representations from the input data.
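The layer progression described above can be made concrete with a minimal Keras sketch (illustrative only; the layer sizes are assumptions, not the paper's custom CNN):

```python
from tensorflow.keras import layers, models

# A small CNN for 48x48 grayscale emotion images: conv -> pool blocks,
# then fully connected layers ending in a 7-way softmax output.
model = models.Sequential([
    layers.Input(shape=(48, 48, 1)),           # input layer
    layers.Conv2D(32, 3, activation="relu"),   # convolution + activation
    layers.MaxPooling2D(),                     # pooling
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),      # fully connected
    layers.Dense(7, activation="softmax"),     # output: 7 emotion classes
])
```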

2.2. Preprocessing Images:

The pixel values of the grayscale images are converted from string format to numpy arrays and reshaped to a consistent size of 48x48 pixels. These processed image arrays are then stacked along the 0th axis to create a 4D numpy array representing the entire dataset, where each image is in a standardized format suitable for further processing. Additionally, the grayscale images are converted to RGB format using OpenCV's `cv2.cvtColor` function to meet the input requirements of the VGG19 model. Following this conversion, the pixel values of the RGB images are normalized to the range [0, 1] by dividing each pixel value by 255.0. This normalization step ensures uniformity and provides optimal training conditions for the deep learning model, enhancing its ability to learn meaningful patterns and features from the input images.
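A compact sketch of this pipeline, assuming FER2013's CSV layout where each row stores the 48x48 image as a space-separated pixel string:

```python
import cv2
import numpy as np

def preprocess(pixel_strings):
    """Turn FER2013-style pixel strings into a normalized 4D RGB array."""
    images = []
    for s in pixel_strings:
        # Parse the space-separated string and reshape to a 48x48 grayscale image.
        gray = np.array(s.split(), dtype=np.uint8).reshape(48, 48)
        # Convert grayscale to 3-channel RGB, as VGG19 expects 3 channels.
        images.append(cv2.cvtColor(gray, cv2.COLOR_GRAY2RGB))
    # Stack along axis 0 -> shape (N, 48, 48, 3), then scale pixels to [0, 1].
    return np.stack(images, axis=0).astype(np.float32) / 255.0
```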
2.3. Cross-Validation:

To assess the performance and generalization capabilities of the CNN models, cross-validation is employed. The dataset is divided into multiple folds or subsets, and the training and evaluation procedure is repeated numerous times, with every fold utilized as the validation set once. This approach ensures a comprehensive evaluation of the CNN models' performance across different subsets of the data. Cross-validation helps validate the robustness of the model's performance and ensures that it can generalize well to unseen data, reducing the risk of overfitting and providing more reliable performance metrics.
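The fold loop might be sketched as follows (the fold count, `build_model` factory, epoch budget, and arrays `X`, `y` are assumptions; the paper does not state them):

```python
import numpy as np
from sklearn.model_selection import KFold

kfold = KFold(n_splits=5, shuffle=True, random_state=42)  # assumed 5 folds
scores = []
for train_idx, val_idx in kfold.split(X):
    model = build_model()  # hypothetical factory returning a fresh compiled model
    model.fit(X[train_idx], y[train_idx],
              validation_data=(X[val_idx], y[val_idx]), epochs=10)
    # Record each fold's validation accuracy for an averaged estimate.
    scores.append(model.evaluate(X[val_idx], y[val_idx], verbose=0)[1])
print("mean validation accuracy:", np.mean(scores))
```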
2.4. Load VGG-19 Model:

The pre-trained VGG-19 model is loaded using TensorFlow's Keras API, initialized with weights pre-trained on the ImageNet dataset. Figure 1 illustrates the system architecture. The model is loaded without the fully connected layers typically used for ImageNet classification; these layers are replaced with custom ones for the emotion recognition task. After loading, `model.summary()` can be used to examine the model's architecture, layer types, output shapes, and trainable parameters. Understanding these details is crucial for customizing the model and preparing it for training on a specific dataset.

Fig. 1. System Architecture
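A sketch of this loading step (the input shape is assumed to match the 48x48 RGB arrays produced during preprocessing):

```python
from tensorflow.keras.applications import VGG19

# Load ImageNet weights without the top fully connected classifier layers.
base = VGG19(weights="imagenet", include_top=False, input_shape=(48, 48, 3))
base.summary()  # inspect layer types, output shapes, and parameter counts
```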
2.5. Feature Extraction:

The module accesses the intermediate features learned by the VGG19 model, specifically targeting the second-to-last layer before the classification layers. These intermediate features capture rich representations of facial expressions and attributes crucial for emotion recognition. After obtaining the intermediate features, global average pooling is applied to summarize and condense the feature maps into a fixed-length feature vector for each image. A custom output layer, typically a Dense layer with softmax activation, is added to map the extracted features to the different emotion classes, facilitating emotion recognition based on the VGG19-derived features. This approach leverages the power of transfer learning, using pre-trained deep learning models to extract meaningful features and then customizing the output layers for specific tasks such as emotion recognition.
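Attaching the pooling and softmax head to the loaded base might look like this sketch (freezing the base is an assumption consistent with transfer learning):

```python
from tensorflow.keras import layers, models

base.trainable = False  # assumed: keep the pre-trained VGG19 weights fixed
x = layers.GlobalAveragePooling2D()(base.output)    # feature maps -> fixed-length vector
outputs = layers.Dense(7, activation="softmax")(x)  # map features to 7 emotion classes
model = models.Model(inputs=base.input, outputs=outputs)
```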
2.6. Emotion Recognition:

The process of emotion prediction using the trained model involves several steps. Initially, the preprocessed sample image data is fed into the trained model using `model.predict(input_data)`, where the model calculates the probabilities associated with each emotion class based on the image features learned during training. Subsequently, the predicted probability array is examined to determine the emotion with the highest probability, which serves as the predicted emotion label. This label is then mapped to a human-readable emotion category using a dictionary, providing the final prediction of the emotional state depicted in the sample image. Finally, the predicted emotion label is displayed alongside the original image, offering a clear representation of the model's prediction. This process allows for accurate and interpretable emotion recognition outcomes based on deep learning techniques and model predictions.
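The prediction step in code form (a sketch; the class-to-name dictionary and its index order are assumptions based on the seven FER2013 classes):

```python
import numpy as np

EMOTIONS = {0: "anger", 1: "disgust", 2: "fear", 3: "happiness",
            4: "neutral", 5: "sad", 6: "surprise"}  # assumed index order

probs = model.predict(input_data)            # shape (1, 7): per-class probabilities
label = EMOTIONS[int(np.argmax(probs[0]))]   # pick the highest-probability class
print("predicted emotion:", label)
```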


3. EXPERIMENTAL EVALUATION

3.1 Environment Specifications:

Table 1 outlines the experimental configuration employed in the study. The research work takes place on an AMD Ryzen 5 3500U CPU with 12 GB of RAM and an integrated graphics card. The models are constructed using Python and executed using deep learning frameworks such as Keras and TensorFlow.
Process name              | S.N | Action
--------------------------|-----|------------------------------------------------
Input                     | 1   | Collected images of 10 classes of facial expressions, including 7 classes representing different emotions
Environment Configuration | 2   | Anaconda, Jupyter Notebook
                          | 3   | Import all necessary libraries and packages
Directories Configuration | 4   | Load the images
                          | 5   | Load the directories for training and testing, and create validation data from 20% of the training data
Training and Testing      | 6   | Developed transfer learning models trained on the ImageNet dataset
CNN models                | 7   | Fine-tuned by adding the fully connected (FC) layer with SoftMax activation for face emotion detection using VGG19
Model Compilation         | 8   | The model is compiled with the Adam optimizer and a learning rate of 0.001
                          | 9   | Set 70 epochs for model training
                          | 10  | As model checkpoint, use the validation loss to monitor performance
                          | 11  | Save the model
Performance Report        | 12  | Generate classification report
                          | 13  | Generate model accuracy and loss reports
Prediction                | 14  | Load the best model
                          | 15  | Predict the type of emotions and generate the solution class

TABLE 1. Experimental Setup
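Steps 8-11 of Table 1 translate into Keras roughly as follows (a sketch; `train_data` and `val_data` stand in for the prepared datasets):

```python
from tensorflow.keras.callbacks import ModelCheckpoint
from tensorflow.keras.optimizers import Adam

# Step 8: compile with the Adam optimizer at learning rate 0.001.
model.compile(optimizer=Adam(learning_rate=0.001),
              loss="categorical_crossentropy", metrics=["accuracy"])

# Steps 9-11: train for 70 epochs, saving the best model by validation loss.
checkpoint = ModelCheckpoint("best_model.h5", monitor="val_loss",
                             save_best_only=True)
history = model.fit(train_data, validation_data=val_data,
                    epochs=70, callbacks=[checkpoint])
```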
3.2 Dataset:

A carefully curated dataset of human facial expression images is collected. The dataset comprises labeled images depicting a range of facial expressions, including happiness, sadness, anger, and others, as shown in Fig. 2. These images are utilized as input data for training and evaluating the convolutional neural network (CNN) models. The FER2013 dataset stands as a widely employed resource for facial expression recognition tasks. It contains 35,887 grayscale facial images covering 7 different emotions (anger, disgust, fear, happiness, neutral, sad, and surprise).

Fig. 2. FER-2013 Samples
3.3 Result Analysis:

Upon thorough analysis of the results, it is apparent that the VGG19 model achieved the highest accuracy of 88%, surpassing the custom CNN model's 76% accuracy, as shown in Fig. 6. All models were trained using data augmentation techniques to enhance their performance. Despite the greater computational resources required by VGG19 compared to the custom CNN, its deeper architecture and feature extraction capabilities contribute significantly to its performance. Based on these findings, the proposed system opts for VGG19 as the preferred model, prioritizing accuracy over computational efficiency in this scenario.

The accuracy and loss graphs for the VGG19 model, along with confusion matrices that aid in identifying correct predictions for each emotion class, are provided herein. Refer to Fig. 3 for the accuracy and loss graphs of VGG19 and to Fig. 4 for its confusion matrix. Additionally, the classification report for each model is included in Fig. 5, facilitating a detailed assessment of the model's performance across different emotion classes. These visual representations serve as valuable tools for comprehensively evaluating and comparing the efficacy of each model in the context of facial emotion recognition.

Fig. 3. Accuracy and Loss Graph
Fig. 4. Confusion Matrix
Fig. 5. Classification Report
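The classification report and confusion matrix behind Figs. 4 and 5 can be generated with scikit-learn, sketched here under the assumption of one-hot test labels `X_test`/`y_test`:

```python
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix

CLASS_NAMES = ["anger", "disgust", "fear", "happiness",
               "neutral", "sad", "surprise"]  # assumed label order

y_pred = np.argmax(model.predict(X_test), axis=1)  # predicted class indices
y_true = np.argmax(y_test, axis=1)                 # one-hot labels -> indices
print(confusion_matrix(y_true, y_pred))
print(classification_report(y_true, y_pred, target_names=CLASS_NAMES))
```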

3.4 Deployment:

The facial emotion recognition project is deployed via a web-based platform built on the Flask framework and integrated with React and OpenCV. Users access the application through a web browser, where they can upload images for emotion analysis. The Flask backend handles image processing tasks using the trained VGG19 model, while the frontend displays results such as predicted emotions and confidence scores. Utilizing Flask, React, and OpenCV enables seamless interaction between users and the application, providing a user-friendly interface for emotion recognition tasks.
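A minimal Flask endpoint consistent with this description might look like the following sketch (the route, file field name, and checkpoint filename are assumptions; the paper does not publish its API):

```python
import cv2
import numpy as np
from flask import Flask, jsonify, request
from tensorflow.keras.models import load_model

app = Flask(__name__)
model = load_model("best_model.h5")  # assumed checkpoint from training
EMOTIONS = ["anger", "disgust", "fear", "happiness", "neutral", "sad", "surprise"]

@app.route("/predict", methods=["POST"])  # hypothetical route
def predict():
    # Decode the uploaded image, then resize and normalize as in training.
    raw = np.frombuffer(request.files["image"].read(), np.uint8)
    img = cv2.cvtColor(cv2.imdecode(raw, cv2.IMREAD_COLOR), cv2.COLOR_BGR2RGB)
    img = cv2.resize(img, (48, 48)).astype(np.float32) / 255.0
    probs = model.predict(img[np.newaxis])[0]
    i = int(np.argmax(probs))
    return jsonify(emotion=EMOTIONS[i], confidence=float(probs[i]))
```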

4. CONCLUSIONS

In this system, implementing face emotion recognition using deep learning with the VGG19 architecture presents a promising approach for accurately detecting and classifying emotions from facial images. By following a systematic approach that involves data collection, preprocessing, model architecture selection, transfer learning, training, evaluation, and deployment, it is possible to develop a robust and effective emotion recognition system. Transfer learning with pre-trained VGG19 models enables leveraging knowledge learned from large-scale image classification tasks, which can significantly enhance the performance of the emotion recognition model, especially when training data is limited. Throughout the development process, careful attention should be paid to data preprocessing, augmentation, hyperparameter tuning, and model evaluation to ensure the model generalizes well to unseen data and accurately predicts emotions across various facial expressions and environmental conditions. Ultimately, the successful deployment of a face emotion recognition system opens up possibilities for applications in diverse fields, including human-computer interaction, healthcare, entertainment, and security, contributing to advancements in technology and enhanced user experiences.

REFERENCES

[1] Sharmeen M. Saleem Abdullah and Adnan Mohsin Abdulazeez, "Facial Expression Recognition Based on Deep Learning Convolution Neural Network: A Review," Journal of Soft Computing and Data Mining, vol. 2, no. 1, pp. 53-65, 2021.

[2] E. S. Hussein, U. Qidwai, and M. Al-Meer, "Emotional Stability Detection Using Convolutional Neural Networks," 2020 IEEE International Conference on Informatics, IoT, and Enabling Technologies (ICIoT), pp. 136-140, 2020.

[3] P. Jiang, G. Liu, Q. Wang, and J. Wu, "Accurate and Reliable Facial Expression Recognition Using Advanced Softmax Loss with Fixed Weights," IEEE Signal Processing Letters, vol. 27, pp. 725-729, 2020.
