Image Processing Based Facial Emotion Recognition: A Project Report On
BACHELOR OF TECHNOLOGY IN
ELECTRONICS AND COMMUNICATION ENGINEERING
Submitted by
2017-2021
VIKAS COLLEGE OF ENGINEERING AND TECHNOLOGY
CERTIFICATE
This is to place on record our appreciation and deep gratitude to the persons
without whose support this project would never have seen the light of day.
Place: Nunna
Date:
INDEX
Certificate
Acknowledgement
Declaration
Contents
List of figures
Abstract
CONTENTS
CHAPTER-1
INTRODUCTION
CHAPTER-2
SYSTEM ANALYSIS
2.1 EXISTING SYSTEM
2.2 PROPOSED SYSTEM
2.3 FUNCTIONALITIES
2.3.1 ACCESS WEBCAM
2.3.2 DETECT FACE
2.3.3 DETECT EMOTION
2.3.4 DISPLAY EMOTION
CHAPTER-3
REQUIREMENT SPECIFICATIONS
3.1 HARDWARE REQUIREMENTS
3.2 SOFTWARE REQUIREMENTS
CHAPTER-4
SYSTEM DESIGN
4.1 BLOCK DIAGRAM
4.2 DATA FLOW DIAGRAMS
4.2.1 CONTEXT LEVEL DFD
4.2.2 TOP LEVEL DFD
4.2.3 DETAILED LEVEL DIAGRAM
4.3 UNIFIED MODELLING LANGUAGE DIAGRAMS
4.3.1 USE CASE DIAGRAM
4.3.2 CLASS DIAGRAM
4.3.3 SEQUENCE DIAGRAM
4.3.4 COLLABORATION DIAGRAM
4.3.5 ACTIVITY DIAGRAM
CHAPTER-5
CODING AND IMPLEMENTATION
CHAPTER-6
TESTING
6.1 BLACK BOX TESTING
6.2 WHITE BOX TESTING
CHAPTER-7
RESULT
CHAPTER-8
CONCLUSION
CHAPTER-9
REFERENCES
CHAPTER-10
APPENDIX
List of figures
4.1 Block Diagram
4.2 Data Flow Diagrams
Image Processing based facial emotion recognition
CHAPTER-2
SYSTEM ANALYSIS
2.1 Existing System:
A human can recognize the facial emotions of other people manually, but this requires good eyesight.
Several approaches have also been proposed to recognize facial emotions automatically, using
systems written in programming languages such as C++.
Disadvantages:
Continuous detection is not possible.
The accuracy rate cannot be measured.
The code base is large.
2.2 Proposed System:
The proposed system detects facial emotion continuously in real-time video. It uses deep learning based
image processing, namely a Convolutional Neural Network (CNN), implemented in Python.
Advantages:
pip install opencv-python
Once OpenCV is installed, it is imported into the Python program with: import cv2
2.3.2 Detect face:
Face-detection algorithms focus on locating human faces in an image.
This is analogous to image detection, in which the image of a person is matched bit by bit
against the images stored in a dataset.
The Haar Cascade algorithm is an object detection algorithm used to identify faces in an image or a
real-time video.
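Haar cascade detectors evaluate rectangular Haar-like features, and they do so quickly by first building an integral image (summed-area table). The sketch below is not the OpenCV implementation; it only illustrates that underlying idea in plain Python, and the function names are hypothetical.

```python
def integral_image(img):
    """Summed-area table with one extra row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0
        for c in range(w):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels inside a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_feature(ii, top, left, height, width):
    """Horizontal two-rectangle Haar-like feature: left half minus right half."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return left_sum - right_sum


if __name__ == "__main__":
    image = [[1, 2],
             [3, 4]]
    table = integral_image(image)
    print(rect_sum(table, 0, 0, 2, 2))          # sum of all pixels
    print(two_rect_feature(table, 0, 0, 2, 2))  # (1 + 3) - (2 + 4)
```

A cascade combines many such features, each compared against a learned threshold, and rejects a window as soon as one stage fails.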
2.3.3 Detect emotion:
1. Convolutional layer
2. ReLU layer
3. Pooling layer
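The three layer types listed above can be sketched in plain Python on a single-channel image. This is a minimal illustration under simplifying assumptions (one kernel, no padding, stride 1, no bias), not the Keras implementation; the function names are hypothetical. As in most deep learning libraries, the "convolution" here is computed without flipping the kernel.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


def relu(fmap):
    """Replace every negative activation with zero."""
    return [[max(0, v) for v in row] for row in fmap]


def max_pool(fmap, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    return [[max(fmap[r + i][c + j] for i in range(size) for j in range(size))
             for c in range(0, len(fmap[0]) - size + 1, size)]
            for r in range(0, len(fmap) - size + 1, size)]


if __name__ == "__main__":
    image = [[1, 2, 3, 4, 5],
             [5, 4, 3, 2, 1],
             [1, 2, 3, 4, 5],
             [5, 4, 3, 2, 1],
             [1, 2, 3, 4, 5]]
    kernel = [[1, -1],
              [-1, 1]]
    print(max_pool(relu(conv2d(image, kernel))))
```

The 5 x 5 input becomes a 4 x 4 feature map after the 2 x 2 kernel and a 2 x 2 map after pooling.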
CHAPTER-3
REQUIREMENT SPECIFICATIONS
3.1 Hardware Requirements
• RAM : 4 GB or above
• Processor : Intel i3 CPU or above
• Graphics card : 500 MB or above
• Hard disk : 400 GB or 500 GB
CHAPTER-4
SYSTEM DESIGN
A DFD shows what kinds of information will be input to and output from the system, where the data will
come from and go to, and where the data will be stored. It does not show information about the timing of
processes, or about whether processes will operate in sequence or in parallel. A DFD is also called
a “bubble chart”.
DFD Symbols:
In the DFD, there are four symbols:
• A square defines a source or destination of system data.
• An arrow indicates data flow. It is the pipeline through which the information flows.
• A circle or a bubble represents a process that transforms incoming data flow into outgoing data flow.
• An open rectangle is a data store: data at rest, or a temporary repository of data.
Process: People, procedures or devices that use or produce (transform) data. The physical component is
not identified.
Sources: External sources or destinations of data, which may be programs, organizations or other entities.
In our project, we built the data flow diagrams at the very beginning of business process modelling,
in order to model the functions that the project has to carry out and the interactions between those functions,
focusing on the data exchanged between processes.
4.2.1. Context level DFD:
A context-level data flow diagram is created using the Structured Systems Analysis and Design
Method (SSADM). This level shows the overall context of the system and its operating environment,
and shows the whole system as just one process. It does not usually show data stores, unless they are
“owned” by external systems, i.e. are accessed by but not maintained by this system; however, these
are often shown as external entities. The context-level DFD is shown in Fig. 3.2.1.
After the application is started, the model can be trained and tested on the dataset, as shown in
the above figure.
After the application is started, the dataset is prepared for training by reshaping each image into a 2D
array and scaling it with a normalization step; the model is then trained and tested.
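The reshaping and scaling step can be sketched as follows. The sketch assumes the usual FER2013 layout, in which each sample is a string of 48 x 48 = 2304 space-separated grey values between 0 and 255; the function name is hypothetical, and the project's own script performs the same two operations with NumPy.

```python
def pixels_to_image(pixel_string, side=48):
    """Turn a space-separated pixel string into a side x side grid scaled to [0, 1]."""
    values = [int(v) for v in pixel_string.split()]
    if len(values) != side * side:
        raise ValueError(f"expected {side * side} pixels, got {len(values)}")
    return [[values[r * side + c] / 255.0 for c in range(side)]
            for r in range(side)]


if __name__ == "__main__":
    # A tiny 2 x 2 example instead of a full 48 x 48 face
    print(pixels_to_image("0 255 51 102", side=2))
```

Dividing by 255 keeps every input in the range 0 to 1, which generally makes gradient-based training better behaved than feeding raw 0 to 255 values.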
After the application is started, the model is trained on the dataset using a convolutional neural network and
then tested.
4.3. UNIFIED MODELLING LANGUAGE DIAGRAMS:
The Unified Modelling Language (UML) is a standard language for specifying, visualizing, constructing and
documenting a software system and its components. The UML focuses on the conceptual and physical
representation of the system. It captures the decisions and understandings about systems that must be
constructed. A UML system is represented using five different views that describe the system from distinctly
different perspectives. Each view is defined by a set of diagrams, which are as follows.
4.3.1. Use Case Diagram:
Use case diagrams are one of the five diagrams in the UML for modelling the dynamic aspects of
systems (activity diagrams, sequence diagrams, state chart diagrams and collaboration diagrams are the four
other kinds of diagrams in the UML for modelling the dynamic aspects of systems). Use case
diagrams are central to modelling the behavior of a system, a sub-system, or a class. Each one
shows a set of use cases, actors and their relationships.
Figure 4.3.1 USECASE DIAGRAM
4.3.2. Class Diagram:
The class diagram is the main building block of object-oriented modeling. It is used for general
conceptual modeling of the structure of the application, and for detailed modeling, translating the models
into programming code. Class diagrams can also be used for data modeling.[1] The classes in a class
diagram represent both the main elements and interactions in the application, and the classes to be
programmed.
4.3.3. Sequence Diagram:
A sequence diagram is an interaction diagram that focuses on the time ordering of messages. It
shows a set of objects and the messages exchanged between these objects. This diagram illustrates the
dynamic view of a system.
4.3.4. Collaboration Diagram:
A collaboration diagram is an interaction diagram that emphasizes the structural organization of the
objects that send and receive messages. Collaboration diagrams and sequence diagrams are isomorphic.
4.3.5. Activity Diagram:
An activity diagram shows the flow from activity to activity within a system. It emphasizes the flow
of control among objects.
CHAPTER-5
CODING & IMPLEMENTATION
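Cv_cam_facial_expression.py:
# Import Python modules
import numpy as np
import cv2
from tensorflow import keras

# Load the trained CNN and the Haar cascade face detector
model = keras.models.load_model("model_35_91_61.h5")
font = cv2.FONT_HERSHEY_SIMPLEX
cam = cv2.VideoCapture(0)
face_cas = cv2.CascadeClassifier('./cascades/haarcascade_frontalface_default.xml')
# FER2013 class order (list index = class id)
emotions = ['Angry', 'Disgust', 'Fear', 'Happy', 'Sad', 'Surprise', 'Neutral']

while True:
    ret, frame = cam.read()
    if not ret:
        # No frame could be read from the webcam
        print('Error')
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    #gray = cv2.flip(gray,1)
    faces = face_cas.detectMultiScale(gray, 1.3, 5)
    # The per-face prediction and display steps are missing from the listing as
    # printed. They are reconstructed here as an assumption, mirroring the
    # 48x48, 0-1 scaled input used by the training script.
    for (x, y, w, h) in faces:
        face = cv2.resize(gray[y:y + h, x:x + w], (48, 48)).astype(np.float32) / 255.
        pred = model.predict(face.reshape(1, 48, 48, 1))
        cv2.rectangle(frame, (x, y), (x + w, y + h), (255, 0, 0), 2)
        cv2.putText(frame, emotions[int(np.argmax(pred))], (x, y - 10), font, 0.8, (0, 255, 0), 2)
    cv2.imshow('Facial emotion recognition', frame)
    # Stop when the Esc key (code 27) is pressed
    if cv2.waitKey(1) == 27:
        break

cam.release()
cv2.destroyAllWindows()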
facial_expression.py:
import tensorflow as tf
from tensorflow import keras
#from tensorflow.keras.models import Sequential
#from tensorflow.keras.layers import Dense, Conv2D, MaxPooling2D, Dropout, Flatten
#import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
#---------------------------------------------------------------------------------------------------------------------------------
def generate_dataset():
    """Load FER2013 and split it by the 'Usage' column."""
    df = pd.read_csv("./fer2013/fer2013.csv")
    train_samples = df[df['Usage']=="Training"]
    validation_samples = df[df["Usage"]=="PublicTest"]
    test_samples = df[df["Usage"]=="PrivateTest"]
    y_train = train_samples.emotion.astype(np.int32).values
    y_valid = validation_samples.emotion.astype(np.int32).values
    y_test = test_samples.emotion.astype(np.int32).values
    # The listing as printed never builds the X arrays or returns anything,
    # although the main block unpacks six values. The lines below are a
    # reconstruction that assumes the standard FER2013 'pixels' column
    # (2304 space-separated grey values per row).
    def to_array(samples):
        return np.array([np.array(p.split(), dtype=np.float32) for p in samples.pixels])
    X_train, X_valid, X_test = map(to_array, (train_samples, validation_samples, test_samples))
    return X_train, y_train, X_valid, y_valid, X_test, y_test
#---------------------------------------------------------------------------------------------------------------------------------
def generate_model(lr=0.001):
    """training model"""
    with tf.device('/gpu:0'):
        model = keras.models.Sequential()
        # Convolutional block 1
        model.add(keras.layers.Conv2D(128,(5,5), padding='same'))
        model.add(keras.layers.BatchNormalization())
        model.add(keras.layers.Activation('relu'))
        model.add(keras.layers.MaxPooling2D())
        model.add(keras.layers.Dropout(0.20))
        # Convolutional block 2
        model.add(keras.layers.Conv2D(512,(3,3), padding="same"))
        model.add(keras.layers.BatchNormalization())
        model.add(keras.layers.Activation('relu'))
        model.add(keras.layers.MaxPooling2D())
        model.add(keras.layers.Dropout(0.20))
        # Convolutional block 3
        model.add(keras.layers.Conv2D(512,(3,3)))
        model.add(keras.layers.BatchNormalization())
        model.add(keras.layers.Activation('relu'))
        model.add(keras.layers.MaxPooling2D())
        model.add(keras.layers.Dropout(0.25))
        # Convolutional block 4
        model.add(keras.layers.Conv2D(256,(3,3), activation='relu'))
        model.add(keras.layers.Conv2D(128,(3,3), padding='same', activation='relu'))
        model.add(keras.layers.MaxPooling2D())
        model.add(keras.layers.Dropout(0.25))
        #model.add(keras.layers.GlobalAveragePooling2D())
        # Fully connected classifier
        model.add(keras.layers.Flatten())
        model.add(keras.layers.Dense(256))
        model.add(keras.layers.BatchNormalization())
        model.add(keras.layers.Activation('relu'))
        model.add(keras.layers.Dropout(0.5))
        model.add(keras.layers.Dense(512, activation='relu'))
        model.add(keras.layers.BatchNormalization())
        model.add(keras.layers.Activation('relu'))
        model.add(keras.layers.Dropout(0.5))
        # One output per emotion class
        model.add(keras.layers.Dense(7,activation='softmax'))
        # 'learning_rate' replaces the deprecated 'lr' argument of Adam
        model.compile(loss="sparse_categorical_crossentropy",
                      optimizer=keras.optimizers.Adam(learning_rate=lr),
                      metrics=['accuracy'])
        return model
#---------------------------------------------------------------------------------------------------------------------------------
if __name__=="__main__":
    #df = pd.read_csv("./fer2013/fer2013.csv")
    X_train, y_train, X_valid, y_valid, X_test, y_test = generate_dataset()
    # Reshape the flat pixel rows into 48x48 single-channel images
    X_train = X_train.reshape((-1,48,48,1)).astype(np.float32)
    X_valid = X_valid.reshape((-1,48,48,1)).astype(np.float32)
    X_test = X_test.reshape((-1,48,48,1)).astype(np.float32)
    # Scale pixel values from [0, 255] to [0, 1]
    X_train_std = X_train/255.
    X_valid_std = X_valid/255.
    X_test_std = X_test/255.
    model = generate_model(0.01)
    with tf.device("/gpu:0"):
        history = model.fit(X_train_std, y_train, batch_size=128, epochs=35,
                            validation_data=(X_valid_std, y_valid), shuffle=True)
        model.save("my_model.h5")
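The last layer of the network is a 7-way softmax trained with sparse categorical cross-entropy. The plain-Python sketch below illustrates what those two pieces compute for a single sample. It is an illustration only, not the Keras code path; the function names are hypothetical, and the label order assumes the standard FER2013 class ids.

```python
import math

# FER2013 class order (list index = class id)
EMOTIONS = ["Angry", "Disgust", "Fear", "Happy", "Sad", "Surprise", "Neutral"]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtracting the maximum avoids overflow in exp()
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_label(logits):
    """Return the emotion whose probability is highest."""
    probs = softmax(logits)
    return EMOTIONS[probs.index(max(probs))]


def sparse_crossentropy(probs, true_class):
    """Loss for one sample: negative log-probability of the true class id."""
    return -math.log(probs[true_class])


if __name__ == "__main__":
    scores = [0.0, 0.0, 0.0, 5.0, 0.0, 0.0, 0.0]
    print(predict_label(scores))
    print(sparse_crossentropy(softmax(scores), 3))
```

"Sparse" means the target is given as an integer class id, as in the y arrays of the listing, rather than as a one-hot vector.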
Implementation:
Implementation is the stage of the project in which the theoretical design is turned into a working system. Thus it
can be considered the most critical stage in achieving a successful new system and in giving the user
confidence that the new system will work and be effective.
The implementation stage involves careful planning, investigation of the existing system and its constraints on
implementation, design of methods to achieve changeover, and evaluation of changeover methods. The project
is implemented by accessing it simultaneously from more than one system and from more than one window on one
system.
CHAPTER-6
TESTING
Testing is the process of checking functionality: executing a program with the intent of finding
an error. A good test case is one that has a high probability of finding an as-yet-undiscovered error. A successful
test is one that uncovers an as-yet-undiscovered error. Software testing is usually performed for one of two reasons:
• Defect detection
• Reliability estimation
In order to achieve consistency in testing style, it is imperative to have and follow a set of testing
principles. This enhances the efficiency of testing among SQA team members and thus contributes to
increased productivity. The purpose of this chapter is to provide an overview of the testing and the
techniques used. Here, after the model is trained on the training dataset, it is tested.
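One simple way to quantify the outcome of testing a trained classifier is to compare its predicted class ids with the true ones. The sketch below assumes the seven integer emotion classes used in the code listing; the function names and the sample labels are hypothetical, and no results from the project are implied.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def confusion_matrix(y_true, y_pred, num_classes=7):
    """matrix[t][p] counts samples of true class t that were predicted as p."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


if __name__ == "__main__":
    # Made-up labels for illustration only
    true_ids = [3, 3, 4, 6, 0]
    pred_ids = [3, 4, 4, 6, 0]
    print(accuracy(true_ids, pred_ids))
    print(confusion_matrix(true_ids, pred_ids)[3])
```

The confusion matrix shows which emotions are mistaken for which, something a single accuracy figure hides.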
White box testing requires access to the source code. Although white box testing can be performed at any time in the
life cycle after the code is developed, it is good practice to perform it during the unit testing
phase.
In designing the database, the flow of specific inputs through the code, the expected output and the functionality
of conditional loops are tested.
At SDEI, three levels of software testing are carried out at various SDLC phases:
• UNIT TESTING: each unit (basic component) of the software is tested to verify that the detailed
design for the unit has been correctly implemented.
• INTEGRATION TESTING: progressively larger groups of tested software components
corresponding to elements of the architectural design are integrated and tested until the software works as a
whole.
• SYSTEM TESTING: the software is integrated into the overall product and tested to show that all
requirements are met.
A further level of testing is also done, in accordance with requirements:
• REGRESSION TESTING: earlier successful tests are repeated to ensure that
changes made to the software have not introduced new bugs or side effects.
• ACCEPTANCE TESTING: testing to verify that a product meets customer-specified requirements. The
acceptance test suite is run against supplied input data, and the results obtained are compared with the
results expected by the client. A correct match was obtained.
CHAPTER-7
RESULT
CHAPTER-8
CONCLUSION
In this work we found that, on small datasets and in some other cases, decision trees most of the time direct us to a
solution that is not accurate, whereas Logistic Regression gives more accurate
results along with the probabilities of all other possibilities. Because they guide us to only one solution, decision trees may
mislead. From this experiment we can say that Logistic Regression is more accurate if the input data
is cleaned and well maintained. Even though ID3 can clean the data itself, it cannot give accurate results every time,
and in the same way Logistic Regression will not give accurate results every time. We need to consider the
results of different algorithms; a prediction made from all of their results will be more accurate. Since
Logistic Regression considers variables individually, we can use a combination of algorithms such as Logistic
Regression and K-means to improve accuracy.
CHAPTER-9
REFERENCES
1. Sonam Nikhar, A.M. Karandikar "Prediction of Heart Disease Using Machine Learning Algorithms" in
International Journal of Advanced Engineering, Management and Science (IJAEMS), June 2016, vol. 2
2. Deeanna Kelley "Heart Disease: Causes, Prevention, and Current Research" in JCCC Honors Journal
3. Nabil Alshurafa, Costas Sideris, Mohammad Pourhomayoun, Haik Kalantarian, Majid Sarrafzadeh
"Remote Health Monitoring Outcome Success Prediction using Baseline and First Month Intervention
Data" in IEEE Journal of Biomedical and Health Informatics
4. Ponrathi Athilingam, Bradlee Jenkins, Marcia Johansson, Miguel Labrador "A Mobile Health
Intervention to Improve Self-Care in Patients With Heart Failure: Pilot Randomized Control Trial" in
JMIR Cardio 2017, vol. 1, issue 2, p. 1
5. Dhafar Hamed, Jwan K. Alwan, Mohamed Ibrahim, Mohammad B. Naeem "The Utilisation of Machine
Learning Approaches for Medical Data Classification" in Annual Conference on New Trends in
Information & Communications Technology Applications, March 2017
6. Mai Shouman, Tim Turner, Rob Stocker "Applying k-Nearest Neighbour in Diagnosing Heart Disease
Patients" in International Journal of Information and Education Technology, vol. 2, no. 3, June
2012
7. Amudhavel, J., Padmapriya, S., Nandhini, R., Kavipriya, G., Dhavachelvan, P., Venkatachalapathy,
V.S.K., "Recursive ant colony optimization routing in wireless mesh network", (2016) Advances in
Intelligent Systems and Computing, 381, pp. 341-351.
8. Alapatt, B.P., Kavitha, A., Amudhavel, J., "A novel encryption algorithm for end to end secured fiber
optic communication", (2017) International Journal of Pure and Applied Mathematics, 117 (19
Special Issue), pp. 269-275.
9. Amudhavel, J., Inbavalli, P., Bhuvaneswari, B., Anandaraj, B., Vengattaraman, T., Premkumar, K.,
"An effective analysis on harmony search optimization approaches", (2015) International Journal of
Applied Engineering Research, 10 (3), pp. 2035-2038.
10. Amudhavel, J., Kathavate, P., Reddy, L.S.S., Bhuvaneswari Aadharshini, A., "Assessment on
authentication mechanisms in distributed system: A case study", (2017) Journal of Advanced
Research in Dynamical and Control Systems, 9 (Special Issue 12), pp. 1437-1448.
11. Amudhavel, J., Kodeeshwari, C., Premkumar, K., Jaiganesh, S., Rajaguru, D., Vengattatraman, T.,
Haripriya, R., "Comprehensive analysis on information dissemination protocols in vehicular ad hoc
networks", (2015) International Journal of Applied Engineering Research, 10 (3), pp. 2058-2061.
12. Amudhavel, J., Kathavate, P., Reddy, L.S.S., Satyanarayana, K.V.V., "Effects, challenges,
opportunities and analysis on security based cloud resource virtualization", (2017) Journal of Advanced
Research in Dynamical and Control Systems, 9 (Special Issue 12), pp. 1458-1463.
13. Amudhavel, J., Ilamathi, R., Moganarangan, N., Ravishankar, V., Baskaran, R., Premkumar, K.,
"Performance analysis in cloud auditing: An analysis of the state-of-the-art", (2015) International
Journal of Applied Engineering Research, 10 (3), pp. 2043-2046.
CHAPTER-10
Appendix
PYTHON:
Python is a high-level, interpreted, interactive and object-oriented scripting language. Python is designed to be
highly readable. It uses English keywords frequently, whereas other languages use punctuation, and it has fewer
syntactical constructions than other languages.
• Python is Interpreted − Python is processed at runtime by the interpreter. We do not need to compile
our program before executing it. This is similar to PERL and PHP.
• Python is Interactive − We can sit at a Python prompt and interact with the interpreter directly to
write our programs.
• Python is a Beginner's Language − Python is a great language for beginner-level programmers and
supports the development of a wide range of applications, from simple text processing to WWW browsers
to games.
DEEP LEARNING:
Deep learning (also known as deep structured learning) is part of a broader family of machine learning
methods based on artificial neural networks with representation learning. Learning can be supervised, semi-
supervised or unsupervised.[1][2][3]
Deep-learning architectures such as deep neural networks, deep belief networks, graph neural networks, recurrent
neural networks and convolutional neural networks have been applied to fields including computer vision, speech
recognition, natural language processing, machine translation, bioinformatics, drug design, medical image
analysis, material inspection and board game programs, where they have produced results comparable to and in
some cases surpassing human expert performance.[4][5][6][7]
Artificial neural networks (ANNs) were inspired by information processing and distributed communication nodes
in biological systems. ANNs have various differences from biological brains. Specifically, neural networks tend
to be static and symbolic, while the biological brain of most living organisms is dynamic (plastic) and analogue.[8][9][10]
The adjective "deep" in deep learning refers to the use of multiple layers in the network. Early work showed that
a linear perceptron cannot be a universal classifier, but that a network with a nonpolynomial activation function
with one hidden layer of unbounded width can. Deep learning is a modern variation which is concerned with an
unbounded number of layers of bounded size, which permits practical application and optimized implementation,
while retaining theoretical universality under mild conditions. In deep learning the layers are also permitted to be
heterogeneous and to deviate widely from biologically informed connectionist models, for the sake of efficiency,
trainability and understandability, whence the "structured" part.
CONVOLUTIONAL NEURAL NETWORK:
A convolutional neural network consists of an input layer, hidden layers and an output layer. In any feed-
forward neural network, any middle layers are called hidden because their inputs and outputs are masked by the
activation function and final convolution. In a convolutional neural network, the hidden layers include layers that
perform convolutions. Typically this includes a layer that performs a dot product of the convolution kernel with
the layer's input matrix. This product is usually the Frobenius inner product, and its activation function is
commonly ReLU. As the convolution kernel slides along the input matrix for the layer, the convolution operation
generates a feature map, which in turn contributes to the input of the next layer. This is followed by other layers
such as pooling layers, fully connected layers, and normalization layers.
Convolutional layers
In a CNN, the input is a tensor with a shape: (number of inputs) x (input height) x (input width) x (input
channels). After passing through a convolutional layer, the image becomes abstracted to a feature map, also
called an activation map, with shape: (number of inputs) x (feature map height) x (feature map width) x (feature
map channels). A convolutional layer within a CNN generally has the following attributes:
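As a rough illustration of how the feature-map height and width change from layer to layer, the sketch below applies the usual output-size arithmetic to the convolution and pooling sequence of the facial_expression.py listing, starting from a 48 x 48 input. It assumes stride-1 convolutions and 2 x 2 pooling with stride 2 (the Keras defaults); the function names are hypothetical.

```python
def conv_out(size, kernel, stride=1, padding="valid"):
    """Output height or width of a convolution along one dimension."""
    if padding == "same":
        return (size + stride - 1) // stride  # ceil(size / stride)
    return (size - kernel) // stride + 1


def pool_out(size, pool=2):
    """Output size of non-overlapping pooling (stride equal to pool size)."""
    return (size - pool) // pool + 1


def trace_sizes(size, layers):
    """Return the spatial size after every layer, starting with the input."""
    sizes = [size]
    for layer in layers:
        if layer[0] == "conv":
            _, kernel, padding = layer
            size = conv_out(size, kernel, padding=padding)
        else:
            size = pool_out(size)
        sizes.append(size)
    return sizes


# Convolution and pooling layers in the order they appear in the listing
LISTING_LAYERS = [
    ("conv", 5, "same"), ("pool",),
    ("conv", 3, "same"), ("pool",),
    ("conv", 3, "valid"), ("pool",),
    ("conv", 3, "valid"), ("conv", 3, "same"), ("pool",),
]

if __name__ == "__main__":
    print(trace_sizes(48, LISTING_LAYERS))
```

Under these assumptions the last pooling layer leaves a 1 x 1 map with 128 channels, so the Flatten layer passes 128 values to the dense layers.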
Pooling layers
Convolutional networks may include local and/or global pooling layers along with traditional convolutional
layers. Pooling layers reduce the dimensions of data by combining the outputs of neuron clusters at one layer into
a single neuron in the next layer. Local pooling combines small clusters, tiling sizes such as 2 x 2 are commonly
used. Global pooling acts on all the neurons of the feature map.[18][19] There are two common types of pooling
in popular use: max and average. Max pooling uses the maximum value of each local cluster of neurons in the
feature map,[20][21] while average pooling takes the average value.
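The difference between max and average pooling, and between local and global pooling, can be seen on a small feature map. This is a minimal sketch with hypothetical function names; it assumes non-overlapping windows.

```python
def pool2d(fmap, size=2, mode="max"):
    """Local pooling over non-overlapping size x size windows."""
    if mode not in ("max", "average"):
        raise ValueError("mode must be 'max' or 'average'")
    out = []
    for r in range(0, len(fmap) - size + 1, size):
        row = []
        for c in range(0, len(fmap[0]) - size + 1, size):
            window = [fmap[r + i][c + j] for i in range(size) for j in range(size)]
            row.append(max(window) if mode == "max" else sum(window) / len(window))
        out.append(row)
    return out


def global_pool(fmap, mode="max"):
    """Global pooling reduces the whole feature map to a single value."""
    values = [v for row in fmap for v in row]
    return max(values) if mode == "max" else sum(values) / len(values)


if __name__ == "__main__":
    fmap = [[1, 3, 2, 0],
            [5, 7, 4, 6],
            [9, 8, 1, 1],
            [2, 4, 3, 5]]
    print(pool2d(fmap, mode="max"))      # [[7, 6], [9, 5]]
    print(pool2d(fmap, mode="average"))  # [[4.0, 3.0], [5.75, 2.5]]
    print(global_pool(fmap, mode="max"))
```

Max pooling keeps the strongest response in each window, while average pooling smooths the responses.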
Receptive field
In neural networks, each neuron receives input from some number of locations in the previous layer. In a
convolutional layer, each neuron receives input from only a restricted area of the previous layer called the
neuron's receptive field. Typically the area is a square (e.g. 5 by 5 neurons). Whereas, in a fully connected layer,
the receptive field is the entire previous layer. Thus, in each convolutional layer, each neuron takes input from a
larger area in the input than previous layers. This is due to applying the convolution over and over, which takes
into account the value of a pixel, as well as its surrounding pixels. When using dilated layers, the number of
pixels in the receptive field remains constant, but the field is more sparsely populated as its dimensions grow
when combining the effect of several layers.
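The growth of the receptive field through stacked layers follows a simple recurrence: each layer adds (kernel - 1) x dilation x jump input pixels, where jump is the product of the strides of all earlier layers. The sketch below implements that recurrence; the function name and the layer tuples are illustrative assumptions.

```python
def receptive_field(layers):
    """Receptive field (in input pixels, one dimension) of a layer stack.

    Each layer is a (kernel, stride, dilation) tuple.
    """
    field, jump = 1, 1
    for kernel, stride, dilation in layers:
        field += (kernel - 1) * dilation * jump
        jump *= stride
    return field


if __name__ == "__main__":
    # Three stacked 3x3 convolutions see a 7x7 patch of the input
    print(receptive_field([(3, 1, 1)] * 3))
    # Dilations 1, 2, 4 widen the field to 15 with the same number of weights
    print(receptive_field([(3, 1, 1), (3, 1, 2), (3, 1, 4)]))
```

The second call shows the effect described above for dilated layers: the number of sampled pixels per layer is unchanged, but they are spread over a larger area.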
Weights
Each neuron in a neural network computes an output value by applying a specific function to the input values
received from the receptive field in the previous layer. The function that is applied to the input values is
determined by a vector of weights and a bias (typically real numbers). Learning consists of iteratively adjusting
these biases and weights.
The vector of weights and the bias are called filters and represent particular features of the input (e.g., a particular
shape). A distinguishing feature of CNNs is that many neurons can share the same filter. This reduces the
memory footprint because a single bias and a single vector of weights are used across all receptive fields that
share that filter, as opposed to each receptive field having its own bias and vector weighting.[22]
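The saving from weight sharing can be made concrete by counting parameters. The sketch below uses the first layer of the code listing as an example (128 filters of size 5 x 5 on a single-channel 48 x 48 input with 'same' padding) and compares it with a hypothetical layer in which every output position has its own filter; the function names are assumptions.

```python
def shared_params(kernel_h, kernel_w, in_channels, filters):
    """Convolutional layer: one weight vector plus one bias per filter."""
    return (kernel_h * kernel_w * in_channels + 1) * filters


def unshared_params(kernel_h, kernel_w, in_channels, filters, out_h, out_w):
    """Hypothetical layer with a separate filter at every output position."""
    return shared_params(kernel_h, kernel_w, in_channels, filters) * out_h * out_w


if __name__ == "__main__":
    print(shared_params(5, 5, 1, 128))                 # 3328
    print(unshared_params(5, 5, 1, 128, 48, 48))       # 7667712
```

With sharing, the parameter count does not depend on the image size at all, which is the memory saving the paragraph above refers to.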
NumPy
NumPy, short for Numerical Python, is a widely used and versatile library for both professionals and
beginners. With it we can work with multi-dimensional arrays and matrices with ease.
Functions for linear algebra operations and numerical conversions are also available.
Pandas
Pandas is a well-known, high-performance tool for working with data frames. Using it we can load data from
almost any source, compute various functions and create new parameters, and build queries on data using aggregate
functions akin to SQL. In addition, there are various matrix transformation functions, a sliding window method
and other methods for obtaining information from data, which makes it an indispensable tool for a
data specialist.