Deep Learning in Data Analytics: Recent Techniques, Practices and Applications 1st Edition Debi Prasanna Acharjya
Studies in Big Data 91
Deep Learning in Data Analytics
Recent Techniques, Practices and Applications
Studies in Big Data
Volume 91
Series Editor
Janusz Kacprzyk, Polish Academy of Sciences, Warsaw, Poland
The series “Studies in Big Data” (SBD) publishes new developments and advances in
the various areas of Big Data, quickly and with high quality. The intent is to cover the
theory, research, development, and applications of Big Data, as embedded in the fields
of engineering, computer science, physics, economics and life sciences. The books of
the series refer to the analysis and understanding of large, complex, and/or distributed
data sets generated from recent digital sources coming from sensors or other physical
instruments as well as simulations, crowd sourcing, social networks or other internet
transactions, such as emails or video click streams, and others. The series contains
monographs, lecture notes and edited volumes in Big Data spanning the areas of
computational intelligence including neural networks, evolutionary computation,
soft computing, fuzzy systems, as well as artificial intelligence, data mining, modern
statistics and Operations research, as well as self-organizing systems. Of particular
value to both the contributors and the readership are the short publication timeframe
and the world-wide distribution, which enable both wide and rapid dissemination of
research output.
The books of this series are reviewed in a single blind peer review process.
Indexed by SCOPUS, SCIMAGO and zbMATH.
All books published in the series are submitted for consideration in Web of Science.
Noor Zaman
School of Computer Science
and Engineering
Taylor’s University
Subang Jaya, Malaysia
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Dedicated to
My loving children Aditi and Aditya
D. P. Acharjya
My caring parents Amitava, Anushila and
wife Sudeepta
Anirban Mitra
All my family members
Noor Zaman
Preface
The amount of data collected across a wide variety of fields today far exceeds our
ability to reduce and analyze it without automated analysis techniques. Much
information lies hidden in this accumulated, voluminous data, and it is difficult to
obtain. A new generation of computational theories and tools is therefore essential to
help humans with knowledge extraction, self-learning, and rule generation from huge
data. Deep learning has evolved into an important and active research area because of
the theoretical challenges associated with discovering intelligent solutions for the
smart analysis of huge data.
Deep learning is a form of machine learning that enables computers to learn from
experience and understand the world in terms of a hierarchy of concepts. Because the
computer gathers knowledge from experience, there is no need for a human computer
operator to formally specify all the knowledge that the computer needs. The hierarchy
of concepts allows the computer to learn complicated concepts by building
them out of simpler ones; a graph of these hierarchies would be many layers deep.
The amount of data collected from various applications worldwide across various
fields is expected to double every two years. These data have no utility unless they
are analyzed to obtain useful information. This necessitates the development of
techniques that facilitate machine-based rule generation and data analysis. The
development of powerful computers is a boon for implementing these techniques and
leads to automated systems. Transforming data into knowledge is by no means an easy
task for high-performance large-scale data processing, which includes exploiting the
parallelism of current and upcoming computer architectures for data mining. Moreover,
these data may pose many challenges in terms of consistency, size, or dimension.
Advanced computing and learning techniques that include deep learning concepts
are useful for representing data and generating rules and knowledge, and these
models are equally useful for analysis. The new challenges usually compromise,
and sometimes even deteriorate, the performance, efficiency, and scalability of
dedicated data-intensive computing systems. Deep learning approaches prevent loss of
information and hence enhance the performance of data analysis and learning
techniques. This raises many research issues for industry and the research community
in capturing and accessing data effectively. Fast processing that achieves
high performance and high throughput, together with efficient storage for future use,
is another issue. Further, programming for data analysis using deep learning concepts
is an important and challenging issue. Expressing the data access requirements of
applications and designing programming language abstractions to exploit parallelism is
an immediate need.
This book offers concepts and techniques of deep learning in a precise and clear
manner to the research community. In editing the book, we have tried our best
to provide frontier advances and applications in deep learning for data analysis,
along with the conceptual basis required to achieve in-depth knowledge in computer
science and information technology. This volume will help researchers interested in
this field to gain insight into different concepts and their importance for
applications in industry, research, and management. Further, it aims to provide
computer science and information technology researchers with recent advances in
deep learning. We believe our effort will make this edited volume interesting and
attractive to the student and research communities. The book comprises four parts.
Part I is on the theoretical foundation of deep learning. Part II is about computing
systems and machine learning associated with deep learning techniques. Some
deep learning algorithms are discussed in Part III, and Part IV focuses
on applications of deep learning techniques. This arrangement is intended to make the
edited book more flexible and to stimulate further research interest in these topics.
The theoretical foundation of deep learning consists of four chapters. The chapter
“A Study on Discrete Action Sequences Using Deep Emotional Intelligence” is on
discrete action sequences using deep emotional intelligence. Recognition of human
emotions plays a vital role in our day-to-day life and is essential for social
communication. Automatic emotion recognition is becoming a research focus in
artificial intelligence. This chapter proposes a machine learning approach and
discusses it alongside a deep learning model to achieve emotional intelligence.
Further, it provides a brief study on achieving emotional intelligence with a DCNN
model. A novel noise removal technique influenced by deep convolutional autoencoders
on mammograms is discussed in the chapter “A Novel Noise Removal Technique Influenced
by Deep Convolutional Autoencoders on Mammograms”. This chapter explains the
concepts and implementation of a deep learning algorithm for data analytics on
structured and unstructured data concerning biomedical images such as CT, MRI,
and X-ray images to detect some deadly diseases.
The chapter “A High Security Framework Through Human Brain Using Algo
Mixture Model Deep Learning Algorithm” focuses on the security framework
through human brain using deep learning algorithm. This chapter aims to analyze
human brain patterns’ uniqueness and provide high-level security based on various
factors such as size, state of mind, and rate of waves. Chapter “Knowledge Framework
for Deep Learning: Congenital Heart Disease” is on knowledge framework for deep
learning with implementation on congenital heart disease. This chapter discusses
an automated strategy for analyzing, detecting, and diagnosing heart diseases with
initial valid parameters.
Part II of this edited volume is on computing systems and machine learning associated
with deep learning techniques. This part consists of the next four chapters.
Learning” introduces a deep learning model for anomaly detection for credit card
fraud in financial transactions, and further, it compares the deep learning model with
some of the existing machine learning algorithms. Validation of the proposed model
is carried out on a dataset gathered in Europe, for two days, in September 2013. The
accuracy evaluation metric is used to evaluate the proposed model’s ability to detect
credit card fraud.
Part IV of this edited volume is on applications of deep learning techniques.
This part consists of the next three chapters. The chapter “Application of Deep
Learning for Energy Management in Smart Grid” is based on the application of deep
learning for energy management in the smart grid. The chapter focuses on the concepts
of load forecasting and energy management in the smart grid. Different deep learning
techniques such as deep neural network, RBM, DBF, etc., and their applications
related to the smart grid and smart vehicles are elaborated.
Cost optimization of software quality assurance is the focus of the chapter
“Cost Optimization of Software Quality Assurance”. The chapter explains that software
quality assurance constitutes an important share of an organization’s total development
overheads. The cited concepts are utilized in delivering an obliging response
on the prototypical and too funded to the frame of an acquaintance on the methods’
efficiency. An analytical approach for the security of the sensitive business cloud is
discussed in the chapter “Analytical Approach for Security of Sensitive Business
Cloud”. This chapter elaborates on the concept of security in cloud computing. It
also discusses security for sensitive data, four-level authentication authorization, data
security, network security, and cloud security. Since security threats are becoming
one of the major issues as the demand for clouds grows, this chapter
outlines a path to analyze and restrict dangerous security threats for cloud computing.
Numerous researchers worldwide are actively working with deep learning
concepts and integrating these concepts with computational intelligence, data
analysis, cloud computing, and many other areas. The fusion of deep learning techniques
with recent advances in these areas will take the field to a new dimension. To keep
abreast of this development in a cohesive manner, we strove to keep the book
reader-friendly. The main objective is to set down the major developments in deep
learning precisely, so that the book can serve as a handbook for many researchers. We
believe this edited volume will help researchers working in deep learning to gain
insight into recent advances and to understand the importance and usage
of deep learning techniques for handling real-life applications efficiently.
It is with a great sense of satisfaction that we present our edited book entitled “Deep
Learning in Data Analytics: Recent Techniques, Practices and Applications” and
wish to express our gratitude to all those who helped us, directly or indirectly,
to complete this project. First and foremost, we praise and heartily thank
almighty God, who has been an unfailing source of strength, comfort, and inspiration
in the completion of this project.
While writing, contributors have referred to several books and journals, and we
take this opportunity to thank all those authors and publishers. We are extremely
thankful to the reviewers for their constant support during the process of evaluation.
Special mention should be made of the timely help given by different persons during
the project work whose names are not mentioned here. Last but not least, we
thank the series editor, Janusz Kacprzyk, and the production team of Springer-Verlag,
USA, for encouraging us and extending their cooperation and help for the
timely completion of this edited book. We trust and hope that it will be appreciated
by many readers.
Abstract Recognition of human emotions plays a vital role in our day-to-day
life and is essential for social communication. Automatic emotion recognition is
becoming a recent research focus in artificial intelligence. One facet of human
intelligence is the ability to recognize emotion, which is regarded as one of the
attributes of emotional intelligence. Although research based on facial expressions or
speech is thriving, recognizing emotions from body gestures remains a less explored
topic. This chapter proposes a machine learning approach and discusses it alongside a
deep learning model to achieve emotional intelligence. The block based intensity value
(BBIV) feature and the different bin level HoG feature (DBLHoG) are extracted
from human body movements and are fed to supervised learning algorithms. Support
vector machine (SVM), k-nearest neighbor (KNN) and random forest classifiers
are the supervised learning algorithms used in this chapter. Finally, a pre-trained
deep convolutional neural network (DCNN) model is used. The experiment is conducted
using the Geneva multimodal emotion portrayals (GEMEP) corpus dataset. In
this dataset, human body movements express five archetypical emotions
(anger, fear, joy, pride and sadness). In this emotion recognition system, the random
forest classifier outperformed the SVM and KNN classifiers. Finally, the DCNN
model achieves better recognition than the random forest classifier. This chapter
gives a brief study on achieving emotional intelligence with a DCNN model.
R. Santhoshkumar (B)
Associate Professor, Department of Computer Science and Engineering, St. Martin’s Engineering
College, Secunderabad 500100, Telangana, India
e-mail: [email protected]
M. K. Geetha
Department of Computer Science and Engineering, Annamalai University, Tamilnadu, India
e-mail: [email protected]
1 Introduction
The ability to understand human emotions plays a central role in human social
behaviour and interaction. Emotions are important in decision making and rational
thinking, and are a recent research area in experimental psychology. Over the years
research in emotion recognition has mainly concentrated on facial expression, voice
analysis, full body movements and gestures. Human beings express different types of
emotions in different ways in day-to-day communication. Understanding human
emotions is a key area of research, since recognizing emotions may provide a plethora
of opportunities and applications, for instance friendlier human-computer interaction
and enhanced communication among humans through refined emotional intelligence.
Human communication includes verbal and non-verbal communication.
Non-verbal communication is the sharing of wordless clues or information using the
eyes, hands, head, fingers and all body movements. This includes visual cues such
as body language and physical appearance. Human emotion can be recognized using
body language and posture. Posture gives information which is not present in speech
and facial expression. For example, the emotional state of a person at a long
distance can be identified from posture. Hence human emotion recognition
through non-verbal communication can be achieved by capturing body movement.
This chapter aims to examine the motion cues that indicate differences between
emotions. The possibility of using motion and gestures as indicators of the state of
individuals provides a novel approach to quantitatively monitor and estimate the users
in an ecological environment and to react adaptively to them. This approach recognizes
emotional states such as anger, joy, sadness and fear based on body movement analysis.
From the recent literature it is evident that different emotions are often associated
with various body movements. The speed, amplitude and fluidity of movement express
specific emotions. The analysis of emotional behavior
provides indicators describing the dynamics of expressive motion cues using
static actions. Communication greatly improves by understanding and knowing
how to respond to people's expressions. Human communication includes not only
the spoken language but also non-verbal cues such as movements of the hands and head
and body gestures. An important role of psychological research is to develop
concepts that may support HCI (Human Computer Interaction) technologies and the
understanding of human emotions [1, 2]. Fundamental emotions such as anger, neutral,
happiness, fear, disgust, sadness and surprise are related to
certain human body movements. For example, joy brings upward acceleration of the
forearms and opens the body, fear contracts the body, and
fear and sadness turn the body away. Turning the body towards indicates happiness,
anger or surprise. A recent survey reviews the literature on emotion recognition
from body posture and movement, and discusses the main challenges in collecting
appropriate data sets and ground truth labelling. Both automatic recognition of
emotion and generation of affect-expressive movements are developed with computational
models. Emotion recognition systems have applications in several motivating
areas: surveillance, estimating the emotional state of students in intelligent
tutoring systems, social robotics for social interaction, monitoring players' motivation
and interest in games, and, in the medical field, monitoring depression levels of
patients and providing applications for autism and dementia patients. Other application
domains are educational software, telecommunications, automobile safety, video
games, animation, psychiatry, robotics and affect-sensitive HCI [1, 3].
A Study on Discrete Action Sequences Using Deep Emotional Intelligence 5
Ashwini et al. describe different types of approaches and developed a real-time
emotion recognition system for recognizing human emotions [1]. Stefano et al.
proposed automatic emotion recognition in real time from body movements. The real-time
videos are captured and converted into 3D skeletal frames using an advanced video
capturing system. From the sequences of 3D skeletons, kinematic, geometrical
and postural features are extracted and given to a multi-class SVM classifier to
categorise the human emotion [3]. A survey on the generation of such body movements
and the state of the art in automatic recognition of emotion is also presented in the
literature [2]. Important characteristics such as the representation of the affective
state, the body movements analyzed and the use of information systems are discussed.
A framework for behaviour recognition from human upper body movements
is presented in the literature [4]. Reduced amounts of visual information are used
to analyse the affective behaviour of body movements. The aim of the work is to
individuate a representation of emotional displays depending on nonverbal gesture
features. Further, an advanced real-time system that recognizes emotions continuously
from human body movements is presented [5]. High-level kinematic features,
geometrical features and the united 3D postural features are given as input to a
random forests classifier. Furthermore, a system is presented in which recognition of
emotion depends on different actions, different expressions of emotions and low-level
body cues from human body movement. To recognize the emotion from these aspects,
features are extracted from the various parts and are fed to a random forest
classifier [6]. Besides, a body posture and body action coding system that describes
body movement on an anatomical level (articulations of body parts, direction and
orientation of movement) is developed [7]. Similarly, a survey on recognizing human
emotion from hand, arm, gesture and body movements is conducted [8]. Further, a robust
technique for assessing human body expression based on movement characteristics
with positive and negative emotions is developed [9]. The study and
analysis of the spatial and temporal structure of motion capture data, and the
extraction of features related to affective state descriptors, is also discussed [10].
A survey on recent advances in developing robust techniques and modalities
for automatic human emotion recognition from body movements is also
presented [11]; the importance of body movement segmentation is discussed
and advanced application areas are described. Likewise, an analysis of emotional
behaviour based on classification of time series and dynamics of expressive
motion cues is proposed [12]. Similarly, a video retrieval application is proposed
for shot detection and video classification using block intensity comparison code
(BICC) and unsupervised shot detection [13]. It uses a novel ANN misclustering rate
(AMR) algorithm to detect the shot transitions. Similarly, an automatic depression
analysis system based on human gestures and upper body expressions is also developed
[14]. Bag of words and space-time interest points are used for the analysis
of facial and upper body movements.
Furthermore, a robust gesture recognition system using learning local motion signatures
(LMS) is proposed [15]. Similarly, human detection and robust visual
object recognition using a linear SVM are discussed in the literature [16]. It
showed the performance of human detection using feature sets of histograms of ori-
6 R. Santhoshkumar and M. K. Geetha
converted into binary frames and a bounding box is then put around the human body. These
steps are called pre-processing. From the sequences of gray frames, the different
bin level HoG (DBLHoG) features and block based intensity value (BBIV) features
are extracted and stored separately. The k-nearest neighbor and random forest
classifiers are employed to classify the training and test videos to recognize the
human emotion. The pre-processing block converts all input videos into RGB frames and
the RGB frames into gray frames. Human detection from a gray frame using a bounding
box is shown in Fig. 2 for the angry emotion.
3 Feature Descriptions
This section describes the features associated with human emotion. The basic human
emotions angry, joy, fear, sad and pride from the GEMEP corpus dataset are considered
for this experiment. The overall structure of the proposed work is shown in Fig. 3.
Initially, in BBIV feature extraction, the input videos are preprocessed and converted
into frames. Frame subtraction is the subtraction of successive frames in a video
sequence and is used for change detection. The difference image is produced from two
input frames at times t and (t + 1). The high-amplitude regions in the difference
image are considered motion regions; the difference image is defined in Eq. 1.
In Eq. 1, $D_k(i, j)$ refers to the difference image, $Int_k(i, j)$ is the intensity of the
pixel $(i, j)$ in the $k$th frame, and $w$ and $h$ are the width and height of the image
respectively. The motion region is considered as the region of interest (ROI). Figure 4
shows two successive frames of the GEMEP dataset and the frame differencing
image. The motion information $MInfo_k$ is calculated using Eq. 2.
$$MInfo_k(i, j) = \begin{cases} 1 & \text{if } D_k(i, j) > t \\ 0 & \text{otherwise} \end{cases} \tag{2}$$
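As a minimal sketch of Eqs. 1 and 2, the Python snippet below computes a difference image and the binary motion information for two tiny gray frames. It assumes that Eq. 1 is the absolute intensity difference between successive frames, which is the usual form of frame differencing, and that t in Eq. 2 is an intensity threshold. The function names, the sample frames and the threshold value 20 are illustrative and are not taken from the chapter.

```python
def frame_difference(frame_k, frame_next):
    """Assumed form of Eq. 1: absolute per-pixel intensity difference
    between two equally sized gray frames (lists of rows)."""
    return [
        [abs(p - q) for p, q in zip(row_k, row_next)]
        for row_k, row_next in zip(frame_k, frame_next)
    ]


def motion_info(diff, t):
    """Eq. 2: 1 where the difference exceeds the threshold t, else 0."""
    return [[1 if d > t else 0 for d in row] for row in diff]


# Two hypothetical 2 x 3 gray frames; only one pixel changes strongly.
frame_a = [[10, 10, 10], [10, 200, 10]]
frame_b = [[10, 12, 10], [10, 90, 10]]

diff = frame_difference(frame_a, frame_b)
mask = motion_info(diff, t=20)  # threshold chosen for illustration only
print(diff)
print(mask)
```

Pixels set to 1 in the mask form the motion region that the text treats as the ROI.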
The input videos are preprocessed and converted into n frames. The motion information
is calculated from the difference image and is considered as the ROI. The difference
image of size (720 × 360) is divided into three blocks B1, B2, and B3, each of
pixel size (240 × 360). Then the block having the maximum intensity value is further
divided into (6 × 6) blocks, each of pixel size (40 × 60). A 36-dimensional feature
vector is extracted from the (6 × 6) blocks. Figure 5 demonstrates the block
division and the 36-dimensional feature extraction.
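The block division described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the difference image is stored as a list of rows, the "block having maximum intensity value" is taken to be the vertical strip with the largest summed intensity, and each of the 36 feature values is taken to be the mean intensity of its sub-block (the text does not say which statistic is used). The names `block_sum` and `bbiv_feature` are hypothetical.

```python
def block_sum(image, top, left, height, width):
    """Sum of intensities inside a rectangular block of the image."""
    return sum(sum(row[left:left + width]) for row in image[top:top + height])


def bbiv_feature(diff, n_strips=3, grid=6):
    """Split the image into n_strips vertical blocks, keep the one with the
    largest total intensity, and describe it by a grid x grid set of values."""
    height, width = len(diff), len(diff[0])
    strip_w = width // n_strips
    sums = [block_sum(diff, 0, s * strip_w, height, strip_w)
            for s in range(n_strips)]
    best = max(range(n_strips), key=lambda s: sums[s])
    left0 = best * strip_w
    cell_h, cell_w = height // grid, strip_w // grid
    feature = []
    for r in range(grid):
        for c in range(grid):
            total = block_sum(diff, r * cell_h, left0 + c * cell_w,
                              cell_h, cell_w)
            # Assumption: one mean-intensity value per sub-block.
            feature.append(total / (cell_h * cell_w))
    return feature


# Small stand-in for a 720 x 360 difference image: 36 columns x 12 rows,
# with a single bright pixel in the middle strip.
demo = [[0] * 36 for _ in range(12)]
demo[0][13] = 8
feature = bbiv_feature(demo)
print(len(feature), feature[0])
```

With a 720 × 360 image the same function yields strips 240 pixels wide and sub-blocks of 40 × 60 pixels, matching the sizes given in the text.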
The HoG feature is based on the observation that local object appearance and shape can
often be characterized rather well by the distribution of local intensity gradients,
even without precise knowledge of the corresponding gradient positions [16, 22]. The
HOG description is said to be used in a more developed form in the scale invariant
feature transform (SIFT), and it has been widely exploited in human detection. The
proposed architecture of DBLHOG feature extraction is depicted in Fig. 6.
A 1D histogram of the gradient directions is then built over the pixels in each cell,
and the concatenation of these histograms supplies the feature vector of the HOG
descriptor. The image to be analyzed is treated as an intensity function L. The image
is divided into cells of size (3 × 3) pixels with different numbers of histogram bins
(9, 15, 20, 25, 30). Eqs. 3 and 4 define the gradient magnitude g and the gradient
orientation θ.
These equations are used to compute the magnitude g and orientation θ for all the pixels
in the block from the image gradients.
$$g(a, b) = \sqrt{g_x(a, b)^2 + g_y(a, b)^2} \tag{3}$$
$$\theta(a, b) = \arctan\frac{g_y(a, b)}{g_x(a, b)} \tag{4}$$
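A short Python sketch of Eqs. 3 and 4 follows. The chapter does not state how the image gradients g_x and g_y are obtained, so central differences with clamped borders are assumed here. The orientation is computed with `math.atan2` and folded into [0°, 180°), which agrees with the arctangent of Eq. 4 up to a multiple of 180° and gives the unsigned orientation used for binning. The function name and the test image are illustrative.

```python
import math


def gradients(L):
    """Per-pixel gradient magnitude (Eq. 3) and unsigned orientation in
    degrees (Eq. 4) of a gray image L given as a list of rows."""
    h, w = len(L), len(L[0])
    mag = [[0.0] * w for _ in range(h)]
    ang = [[0.0] * w for _ in range(h)]
    for a in range(h):          # a: row index
        for b in range(w):      # b: column index
            # Assumption: central differences, clamped at the image border.
            gx = L[a][min(b + 1, w - 1)] - L[a][max(b - 1, 0)]
            gy = L[min(a + 1, h - 1)][b] - L[max(a - 1, 0)][b]
            mag[a][b] = math.hypot(gx, gy)
            ang[a][b] = math.degrees(math.atan2(gy, gx)) % 180.0
    return mag, ang


# Hypothetical 3 x 3 image chosen so that gx = 3 and gy = 4 at the centre.
image = [[0, 0, 0],
         [0, 0, 3],
         [0, 4, 0]]
mag, ang = gradients(image)
print(mag[1][1], ang[1][1])
```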
Further, a feature vector $V_{ij}$ is computed for each cell $c_{ij}$ in the block. Eqs. 5
and 6 define the weighted gradient magnitude obtained by quantizing the unsigned
orientation into $K$ orientation bins.
$$V_{ij} = [V_{ij}(\beta)]^{T}, \quad \beta = 1, \ldots, K \tag{5}$$
$$V_{ij}(\beta) = \sum_{(a,b) \in c_{ij}} g(a, b)\, \delta[\mathrm{bin}(a, b) - \beta] \tag{6}$$
The function bin(a, b) returns the index of the orientation bin associated with the
pixel (a, b), and δ[·] is the Kronecker delta. The coefficient ρ normalizes the feature
vectors of all cells in the 2D descriptor of the block. The extraction of the HoG
feature from a gray image is depicted in Fig. 7.
$$\rho = \sum_{i=1}^{3} \sum_{j=1}^{3} \sum_{\beta=1}^{K} V_{ij}(\beta) \tag{7}$$
In order to obtain an n-dimensional feature vector, the feature values of the different
bin levels are concatenated into a single vector. Using this approach the features
are extracted for the different bin levels (9, 15, 20, 25, 30) of the training and
testing datasets. The extracted features are modeled by the KNN and random forest
classifiers for emotion detection.
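The cell histograms of Eq. 6, the normalization by ρ of Eq. 7 and the concatenation over bin levels can be sketched as below. Several points are assumptions rather than details from the chapter: each cell is passed in as a list of (magnitude, orientation in degrees) pairs, bins are 0-based and of equal width over [0°, 180°), and normalization is taken to mean dividing every histogram value in the block by ρ. All function names are hypothetical.

```python
def cell_histogram(pixels, K):
    """Eq. 6: add each pixel's gradient magnitude to the bin of its
    unsigned orientation (K equal-width bins over 0..180 degrees)."""
    hist = [0.0] * K
    bin_width = 180.0 / K
    for g, theta in pixels:
        beta = min(int((theta % 180.0) / bin_width), K - 1)
        hist[beta] += g
    return hist


def block_descriptor(cells, K):
    """Histograms of all cells in a block, scaled by rho (Eq. 7)."""
    hists = [cell_histogram(cell, K) for cell in cells]
    rho = sum(sum(h) for h in hists)
    scale = 1.0 / rho if rho > 0 else 0.0  # assumed use of rho: division
    return [v * scale for h in hists for v in h]


def dblhog(cells, bin_levels=(9, 15, 20, 25, 30)):
    """Concatenate the block descriptors of every bin level."""
    feature = []
    for K in bin_levels:
        feature.extend(block_descriptor(cells, K))
    return feature


# A hypothetical block of 3 x 3 cells; each cell lists two pixels only,
# to keep the example short.
cells = [[(1.0, 10.0), (2.0, 100.0)] for _ in range(9)]
print(cell_histogram(cells[0], 9))
print(len(dblhog(cells)))
```

For one block of nine cells the concatenated vector has 9 × (9 + 15 + 20 + 25 + 30) entries; the dimension used in the chapter depends on how many blocks are processed, which this sketch does not model.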
$$R(S, A_q) = b(S, b_p)\, y(b_p, A_q)$$
The random forest (RF) is another type of supervised machine learning algorithm.
The results obtained depend on the number of trees in the forest.
The two steps in this algorithm are random forest creation and random forest
prediction [23]. The procedures for RF creation and prediction are given below.
Algorithm 1 Random Forest Creation
1. Select “k” features randomly from the total “m” features, where k < m
2. Among the “k” features, calculate the node “d” using the best split point
3. Split the node into daughter nodes using the best split
4. Repeat steps 1 to 3 until the required number of nodes has been reached
5. Repeat steps 1 to 4 n times to create n trees
Algorithm 2 Random Forest Prediction
1. Calculate the votes for each prediction
2. Take the prediction with the most votes as the final prediction
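The two algorithms above can be illustrated with a deliberately simplified Python sketch. To stay short, each tree is reduced to a single split (a decision stump), so step 4 of Algorithm 1 is not modelled, and the best split is chosen by misclassification count, which is an assumption since the chapter does not name the split criterion. The toy data, labels and function names are hypothetical.

```python
import random
from collections import Counter


def best_stump(X, y, feature_ids):
    """Steps 2-3: best (feature, threshold) split among the given features,
    scored by the number of misclassified training samples."""
    best = None
    for f in feature_ids:
        for thr in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= thr]
            right = [lab for row, lab in zip(X, y) if row[f] > thr]
            if not left or not right:
                continue
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            errors = (sum(lab != l_lab for lab in left)
                      + sum(lab != r_lab for lab in right))
            if best is None or errors < best[0]:
                best = (errors, f, thr, l_lab, r_lab)
    if best is None:  # no usable split: always predict the majority class
        majority = Counter(y).most_common(1)[0][0]
        return (feature_ids[0], float("inf"), majority, majority)
    return best[1:]


def create_forest(X, y, n_trees, k, seed=0):
    """Algorithm 1 with one-split trees: k random features per tree."""
    rng = random.Random(seed)
    m = len(X[0])
    return [best_stump(X, y, rng.sample(range(m), k)) for _ in range(n_trees)]


def predict(forest, row):
    """Algorithm 2: every tree votes; the most voted class wins."""
    votes = Counter(l_lab if row[f] <= thr else r_lab
                    for f, thr, l_lab, r_lab in forest)
    return votes.most_common(1)[0][0]


# Toy data: features 0 and 1 separate the classes, feature 2 is noise.
X = [[1, 10, 5], [2, 11, 3], [8, 20, 4], [9, 21, 5]]
y = ["sad", "sad", "joy", "joy"]
forest = create_forest(X, y, n_trees=5, k=2)
print(predict(forest, [1.5, 10.5, 4]), predict(forest, [8.5, 20.5, 4]))
```

Because k = 2 of the 3 features are drawn for every tree, each stump in this toy example sees at least one informative feature, so all trees agree on both test points.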
The support vector machine (SVM) is an important and efficient technique for
classification in visual pattern recognition [20] and is the most extensively used
kernel learning algorithm. Its elegant theory separates two classes by large-margin
hyperplanes, and it cannot be extended easily to separate N mutually exclusive
classes. The popular “one-vs-others” approach is used for the multi-class
problem, where one class is separated from the remaining classes. The classification
task typically involves training and testing data. The training data
$(s_1, b_1), (s_2, b_2), \ldots, (s_n, b_n)$ are separated into two classes, where
$b_j \in \{1, -1\}$ are the class labels and each $s_j$ is an n-dimensional feature vector.
The goal of the SVM is to develop a model which predicts the target values of the
testing set. The hyperplane for binary classification is $v \cdot s + y = 0$, where $v$
is the weight vector and $y$ the bias, and the margin is $M = 2/\lVert v \rVert$. The
Lagrange multipliers $\alpha_j$ are used to solve the minimization problem, and the
optimal values of $v$ and $y$ give the decision function of Eq. 9.
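$$h(s) = \operatorname{sgn}\left( \sum_{j=1}^{n} \alpha_j b_j L(s_j, s) + y \right) \tag{9}$$
$$\min_{v, y, \xi} \; \frac{1}{2} v^{T} v + D \sum_{j=1}^{k} \xi_j \tag{10}$$
$$b_j \left( v^{T} \phi(s_j) + y \right) \ge 1 - \xi_j, \quad \xi_j \ge 0 \tag{11}$$
Here $L(\cdot, \cdot)$ is the kernel function, $\phi$ the feature mapping, $\xi_j$ the slack
variables and $D$ the penalty parameter.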
Eqs. 10 and 11 give the soft margin classifier. When the training samples
are not linearly separable, the input space is mapped into a high-dimensional space
using a kernel function. The multiclass SVM is constructed from N binary classifiers,
each separating one class from the rest of the classes; this is the “one-vs-others”
approach used here. Five classes of emotions are used in this work. For the jth
classifier, the jth class of the training set has positive labels and all others have
negative labels. Finally, the feature vectors from the body movement features are given
to the multiclass SVM for classification of human emotion.
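The kernel decision function and the “one-vs-others” scheme can be sketched in Python as follows. No training is performed: the support vectors, multipliers, labels and bias values are hand-picked hypothetical numbers, a linear kernel is assumed, and the multiclass rule of choosing the class whose binary machine gives the largest decision value is a common convention that the chapter does not spell out.

```python
def linear_kernel(u, v):
    """Assumed kernel: plain dot product."""
    return sum(ui * vi for ui, vi in zip(u, v))


def decision_value(s, support, multipliers, labels, bias,
                   kernel=linear_kernel):
    """Argument of sgn in the decision function: weighted kernel sum + bias."""
    return sum(a * b * kernel(sv, s)
               for sv, a, b in zip(support, multipliers, labels)) + bias


def classify_binary(s, support, multipliers, labels, bias):
    """Binary decision: +1 or -1 according to the sign of the value."""
    value = decision_value(s, support, multipliers, labels, bias)
    return 1 if value >= 0 else -1


def one_vs_others(s, machines):
    """Pick the class whose binary machine is most confident."""
    scores = {name: decision_value(s, *m) for name, m in machines.items()}
    return max(scores, key=scores.get)


# Hypothetical, hand-set machines for two of the emotion classes.
machines = {
    "joy": ([(1, 0), (-1, 0)], [0.5, 0.5], [1, -1], 0.0),
    "sad": ([(-1, 0), (1, 0)], [0.5, 0.5], [1, -1], 0.0),
}
print(classify_binary((2, 0), *machines["joy"]))
print(one_vs_others((2, 0), machines), one_vs_others((-3, 1), machines))
```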
The GEMEP corpus is a set of audio and video recordings in which 10 actors portray
18 affective states with various types of expression and verbal content. From these,
five basic emotions (angry, joy, fear, sad and pride) have been chosen for this work.
The 10 actors, 5 male and 5 female, acted in the videos of each emotion. The resolution
of the recorded videos is (720 × 576) and each video has 25 frames per second
(fps) [24]. The data set is depicted in Fig. 8.
Fig. 8 Example frames of five basic emotions (Angry, Fear, Joy, Pride and Sad)
This experiment is conducted in MATLAB R2015a on a computer with the Windows 7
operating system and an Intel Xeon X3430 processor at 2.40 GHz with 8 GB of RAM. Using
the feature extraction techniques explained in the above section, the n-dimensional
DBLHoG features and BBIV features are extracted. The proposed
features are tested on the SVM, KNN and RF classifiers.
4 Performance Evaluations
Depending on the histogram bins, different sets of features with different
dimensionality were extracted. Features at five different dimensional levels were
used. The extracted features are given to the SVM, KNN and RF classifiers one by
one. Accuracy, recall, F-score, specificity and precision are the evaluation measures
for this experiment. These measures are defined below.
The measures derived from the confusion matrix are defined in terms of true positives
($tp$), true negatives ($tn$), false positives ($fp$), and false negatives ($fn$). The
ratio between the number of correct classifications and the total number of
classifications is called accuracy. It is given as:

$$\text{Accuracy} = \frac{tp + tn}{tn + fp + tp + fn}$$
The ratio between the correctly labeled instances and the total instances in the class
is called recall. Similarly, the ratio between the correctly labeled instances and the
total labeled instances is called precision; it is the percentage of positive
predictions for a specific class that are correct. A good classifier provides high
values for both recall and precision. The harmonic mean of precision and recall is
called the F-score (F-measure). These measures are defined below.

$$\text{Precision} = \frac{tp}{tp + fp}$$

$$\text{Recall} = \frac{tp}{tp + fn}$$

$$\text{F-Score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$
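
The measures above can be computed directly from the four counts. The following Python sketch is illustrative only; the function name and the example counts are made up, and the specificity formula, tn / (tn + fp), is the standard definition, which the text names but does not write out.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute the evaluation measures from confusion-matrix counts.

    Ratios with a zero denominator are reported as 0.0 (a convention
    chosen for this sketch, not taken from the source).
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Standard definition of specificity (true negative rate).
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f_score": f_score,
    }


if __name__ == "__main__":
    # Hypothetical counts for one emotion class.
    for name, value in classification_metrics(tp=45, tn=40, fp=5, fn=10).items():
        print(f"{name}: {value:.4f}")
```

For a multiclass problem such as the five emotions used here, the counts are taken per class, treating that class as positive and all others as negative.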
The confusion matrices of the KNN, RF and SVM classifiers are shown in Tables 1, 2,
and 3, respectively. The diagonal of each confusion matrix gives the percentage of
instances that are classified correctly. The rows represent the actual emotion classes
and the columns represent the emotion classes predicted by the classifier. The
emotions sad, fear and pride are classified well, with precision greater than 90% in
KNN, 92% in RF and 88% in SVM. The angry and joy emotions, in contrast, are
confused with each other, as these two emotions appear to be inherently difficult to
recognize.
The confusion matrices of the KNN, RF and SVM classifiers with respect to DBLHoG
are shown in Tables 4, 5, and 6, respectively. The diagonal of each confusion matrix
gives the percentage of instances that are classified correctly. The rows represent
the actual emotion classes and the columns represent the emotion classes predicted
by the classifier. The emotions sad, fear and pride are classified well, with
precision greater than 80% in KNN, 93% in RF, and 85% in SVM. The angry and joy
emotions, in contrast, are confused with each other, as these two emotions appear to
be inherently difficult to recognize.
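
A small Python sketch may help to read such tables. It builds a confusion matrix with actual classes as rows and predicted classes as columns, converts each row to percentages, and derives the per-class counts used by the evaluation measures. The labels and predictions are invented for illustration and are not the chapter's results.

```python
def confusion_matrix(actual, predicted, classes):
    """Count matrix: entry [i][j] is the number of class-i samples predicted as class j."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def row_percentages(matrix):
    """Convert each row to percentages, so the diagonal gives per-class correct rates."""
    result = []
    for row in matrix:
        total = sum(row)
        result.append([100.0 * v / total if total else 0.0 for v in row])
    return result


def per_class_counts(matrix, k):
    """Return (tp, tn, fp, fn) for class index k, treating all other classes as negative."""
    tp = matrix[k][k]
    fn = sum(matrix[k]) - tp
    fp = sum(row[k] for row in matrix) - tp
    tn = sum(sum(row) for row in matrix) - tp - fn - fp
    return tp, tn, fp, fn


if __name__ == "__main__":
    classes = ["angry", "joy", "sad"]
    # Invented labels: angry and joy are confused with each other, sad is not.
    actual = ["angry", "angry", "joy", "joy", "sad", "sad"]
    predicted = ["angry", "joy", "joy", "angry", "sad", "sad"]
    counts = confusion_matrix(actual, predicted, classes)
    for name, row in zip(classes, row_percentages(counts)):
        print(name, row)
    print(per_class_counts(counts, classes.index("angry")))
```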
Humans use their intelligence to solve problems, gain knowledge, create ideas,
recognize patterns, make decisions, reason, plan and think. Intelligence is the
intellectual skill of humans, which is marked by high levels of motivation and
self-awareness. Artificial intelligence is the artificial construction of these human
abilities. An artificially intelligent system emulates, to some extent, the creative
thinking, learning and decision making of humans. Current research focuses on
enabling machines to model the world effectively so that they can demonstrate
intelligence. Realizing such a system requires storing a large amount of data,
explicitly or implicitly. Researchers are therefore trying to create learning
algorithms that capture knowledge from that data. Many challenges of artificial
intelligence must be considered when developing these learning algorithms.
Learning is considered the most important aspect of intelligent machines. Artificial
intelligence is realized through machine learning techniques, in which a machine is
taught to detect various patterns. Conventional machine learning techniques require
expertise to design a feature extractor that transforms the raw data into a feature
vector from which patterns in the input data can be detected or classified. Deep
learning is a specialized form of machine learning. Machine learning instructs
computers to perform operations and to behave like humans. In machine learning
methods, feature extraction starts from the input data, and the extracted features are
fed to a model that classifies the objects in the image. Feature hierarchies are
learned by composing lower-level features into higher-level features [25]. In a deep
learning model, features are automatically extracted from the input data at several
levels of abstraction. An overview of learning-based methods is depicted in Fig. 9.
The variety of machine learning methods is growing at an enormous rate. Deep
learning has emerged as a popular approach within machine learning. The two major
families of methods for the human emotion recognition (HER) problem are machine
learning and deep learning methods [26]. Conventional machine learning methods have
several limitations, including the inability to learn features automatically from the
input data and the lack of a deep representation of data in the classifiers. In
contrast, a deep learning-based approach applies the concept of end-to-end learning
by using a trainable feature extractor followed by a trainable classifier. Multiple
layers of features are automatically extracted from the raw data. A deep learning
algorithm develops a multi-layer representation of different patterns in the input
data, where each successive layer is responsible for learning increasingly complex
features [27]. The lower layers extract low-level features from the input data and the
higher layers build on them, so the level of abstraction of the representation
increases at each consecutive layer. The automatic learning ability of deep learning
methods removes the need for handcrafted feature detectors and descriptors. In many
visual categorization tasks, deep learning models have shown higher performance than
traditional handcrafted feature-based techniques. Convolutional neural networks
(CNN), deep belief networks (DBN), deep recurrent neural networks (RNN), and deep
Boltzmann machines (DBM) are the deep learning models employed for many visual
categorization tasks. DBN is an unsupervised probabilistic graphical model capable of
learning from the input data without any prior knowledge. This model can also be
trained in a semi-supervised or unsupervised fashion, which is quite helpful when
labeled data are scarce or when dealing with unlabeled data. Learning-based methods
fall into two groups: non-deep learning approaches and deep learning approaches [28].
The non-deep learning based methods include the dictionary learning method and the
genetic programming method. These methods are briefly discussed below.
Dictionary Learning Method Emotion recognition, action recognition, and image
and video classification are computer vision applications that can be addressed
using dictionary learning based approaches [29]. In this concept, representative
vectors are learned from a large number of samples and then used for representation.
A framework has been developed for human action recognition using dictionary
learning methods [30]. A method based on a hierarchical descriptor [31] for human
activity recognition outperforms the state-of-the-art methods. For visual
recognition, a cross-domain dictionary learning based method was developed. An
unsupervised model was further developed for cross-view human action recognition
[32] without any label information. The coding descriptors of locality-constrained
linear coding (LLC) [33] are generated from a set of low-level trajectory features
for each action.
Genetic Programming Approach Genetic programming can improve the accuracy of
the human emotion recognition task by discovering primitive operations that are not
known in advance. This type of approach has recently been introduced for emotion
recognition [34]. In this approach, spatio-temporal motion features are automatically
learned for action recognition. 3D Gabor filters and wavelets are evolved to produce
this motion feature. Similarly, a valuable set of features was learned for emotion
recognition.
Probabilistic Graphical Model A probabilistic graphical model represents random
variables and their dependencies in the form of a directed acyclic graph. Conditional
Bayesian networks, temporal Bayesian networks, and multi-entity Bayesian networks
(MEBN) are different types of Bayesian network. An interval temporal Bayesian
network (ITBN) was introduced for the recognition of complex human activities [35].
To evaluate the performance of that method, a cargo loading dataset was used for
experiments. Similarly, another method was proposed for action detection using a
dynamic conditional Bayesian network, which also achieved state-of-the-art results.
MEBN is used for predictive situation awareness (PSAW) with multiple sensors [36].
These types of networks are robust for reasoning under uncertainty in complex
domains when predicting and estimating temporally evolving situations.
No single handcrafted feature descriptor is best for all types of dataset. To handle
this problem, features are learned directly from the raw data. Deep learning is
advantageous for learning multiple levels of representation in data such as speech,
images, videos, and text. These models automate feature extraction and classification
and process the images as raw data. In this work, the models have multiple processing
layers. There are three types of approaches in deep learning models [37]: the
generative or unsupervised approach, the discriminative approach, and hybrid models
(which combine characteristics of both approaches).
Generative or Unsupervised Approach Unsupervised deep learning methods do not
require class labels for the learning process. These methods are especially useful
when labeled data are scarce. A remarkable surge in the history of deep models was
triggered by the work of Hinton [38]. After an unsupervised pre-training stage, the
back-propagation method is used for fine-tuning. These types of deep learning
approaches are used for many applications such as object identification, image
classification, speech classification, and activity and emotion recognition. An
unsupervised feature learning model for video data was proposed for human action
recognition [39]. The authors used an independent subspace analysis algorithm to
learn spatio-temporal features, combining it with deep learning techniques such as
convolutional neural networks and stacking for action representation and
recognition. Deep belief networks trained with restricted Boltzmann machines (RBMs)
were used for human emotion recognition. Learning features continuously without any
labels from streaming video is a challenging task. Hasan and Roy-Chowdhury [40]
proposed an unsupervised deep learning model to address this type of problem. Action
recognition from unconstrained videos is also challenging, since most action datasets
have been recorded under a controlled environment. A human action recognition method
for unconstrained video sequences was also proposed [41] using DBNs. Unsupervised
learning played a pivotal role in reviving researchers' interest in deep learning.
Discriminative or Supervised Models The CNN is the most frequently used
model in the supervised category. The CNN is a type of deep learning model that has
shown strong performance at tasks such as image classification, pattern recognition,
human emotion recognition, human action recognition and hand-written digit
classification. In this hierarchical learning model, multiple hidden layers transform
the input data into output categories. Its architecture consists of three main types
of layers: convolutional layers, pooling layers, and fully connected layers [42].
Objects in images are represented and recognized by a deep CNN model. Mapping the
activations of different CNN layers back to the input is done with deconvolutional
networks. A CNN approach with a spatial stream and a temporal stream has been
proposed for action and emotion recognition. The combination of the two streams
achieved better results than other methods. Among supervised methods, the RNN is
the other popular model for all the above-mentioned applications. Skeleton-based
action and emotion recognition using RNNs has been developed, in which five parts of
the human skeleton are fed separately into five subnets [43]. The outputs of the
subnets are combined and fed into a single layer for the final representation. Deep
learning based models can handle a large volume of video data in the training
process. An outstanding level of accuracy has been achieved in many application
fields. The architecture of a deep convolutional neural network model is depicted in
Fig. 10.
The input videos are converted into frames and saved in separate folders as the
training set and the validation set. The raw images are the input to the first layer.
The feature difference CNN (FDCNN) consists of multiple convolutional layers, each
of which performs the function discussed above. The input image is of size (150 ×
150 × 3), where 3 is the number of colour channels. In this network the filter size is
(5 × 5) for all layers, and the filter values are called weights. Moving the filter
across the image is called sliding or convolving. At each position, the region of the
image covered by the filter is called the receptive field; the pixel values in it are
multiplied by the weight values and the products are summed to produce a single
number. Each receptive field thus produces one number. Finally, the feature map is
obtained with size (150 × 150 × 3).
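
The sliding operation can be illustrated with a single-channel sketch in plain Python. Zero padding is assumed here so that the output keeps the spatial size of the input, which matches the unchanged 150 × 150 size reported above; the source does not state the padding scheme. A 3 × 3 filter on a 3 × 3 image is used only to keep the example small, whereas the network itself uses 5 × 5 filters.

```python
def convolve2d_same(image, kernel):
    """Slide an odd-sized square kernel over a 2D image with zero padding.

    Each output value is the sum of element-wise products between the kernel
    weights and the receptive field centred on that pixel. As in most CNN
    libraries, the kernel is not flipped (cross-correlation).
    """
    height, width = len(image), len(image[0])
    k = len(kernel)
    pad = k // 2
    output = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            total = 0.0
            for i in range(k):
                for j in range(k):
                    rr, cc = r + i - pad, c + j - pad
                    # Positions outside the image contribute zero (zero padding).
                    if 0 <= rr < height and 0 <= cc < width:
                        total += image[rr][cc] * kernel[i][j]
            output[r][c] = total
    return output


if __name__ == "__main__":
    image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    ones = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]  # toy weights, not learned
    for row in convolve2d_same(image, ones):
        print(row)
```

With several input channels, the same sum also runs over the channels, and each filter yields one feature map.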
Fig. 11 F-Score of BBIV feature with kNN, SVM, Random Forest and DCNN model
In the first layer, 32 filters are applied, giving 32 stacked feature maps at this
stage. The subsampling layer then reduces the representation to size (75 × 75 × 32).
In the second layer, 64 filters are applied, giving 64 stacked feature maps, and the
max pooling layer reduces the feature dimensions to (37 × 37 × 64). In the third
convolutional layer, 128 filters are applied, giving 128 stacked feature maps, and the
max pooling layer reduces the feature dimensions to (18 × 18 × 128). All max pooling
layers have size (3 × 3). Finally, a fully connected layer with 512 hidden units is
placed, and the output layer has 5 neurons, one per class, which give the predicted
emotion. The confusion matrix is presented in Table 7, and Fig. 11 shows the
precision, recall and F-score values for BBIV with the KNN, RF, SVM and DCNN models.
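
The layer sizes quoted above can be checked with simple shape arithmetic. The sketch below assumes that each convolution preserves the spatial size and that each pooling layer halves height and width with rounding down; these assumptions reproduce the reported sizes (75, 37, 18), but the stride and padding are not stated in the source. The function names are illustrative.

```python
def conv_same_shape(shape, num_filters):
    """Convolution assumed to keep height and width; channels become the filter count."""
    height, width, _ = shape
    return (height, width, num_filters)


def pool_halve_shape(shape):
    """Pooling assumed to halve height and width, rounding down."""
    height, width, channels = shape
    return (height // 2, width // 2, channels)


def fdcnn_shapes(input_shape=(150, 150, 3), filters=(32, 64, 128)):
    """List the tensor shape after every convolution and pooling stage."""
    shapes = [("input", input_shape)]
    shape = input_shape
    for n, num_filters in enumerate(filters, start=1):
        shape = conv_same_shape(shape, num_filters)
        shapes.append((f"conv{n}", shape))
        shape = pool_halve_shape(shape)
        shapes.append((f"pool{n}", shape))
    return shapes


def flattened_size(shape):
    """Number of values passed to the fully connected layer."""
    height, width, channels = shape
    return height * width * channels


if __name__ == "__main__":
    stages = fdcnn_shapes()
    for name, shape in stages:
        print(name, shape)
    print("flattened:", flattened_size(stages[-1][1]))
```

Under these assumptions the fully connected layer with 512 units receives 18 × 18 × 128 = 41472 input values.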
Figure 11 shows the F-score performance: the KNN, SVM and RF classifiers with the
BBIV and DBLHoG features give F-scores of 89.7% and 95.3%, respectively, and the
DCNN model gives an F-score of 98.7%. These results clearly indicate that the DCNN
model performs better than the kNN, SVM and random forest classifiers.
6 Conclusion
This chapter introduced a novel approach for human emotion recognition from body
movements. First, the extraction of block based intensity value (BBIV) features and
DBLHoG features from pre-processed bounding box frames was discussed for emotion
recognition. The motivation of this work is to recognize the activity and to enable
immediate action. The experimental results demonstrate that the DCNN approach
outperforms the kNN, SVM and random forest classifiers. A few promising future
research directions for achieving emotional intelligence with the deep learning
paradigm are discussed towards the end. The authors trust that this work offers
useful insights and significant support to researchers exploring this topic. Future
research will address emotion recognition for children with autism from their
movements, since they can express their emotions through head movements and
repeated hand flapping.
References
1. Varghese, A.A., Cherian, J.P., Kizhakkethottam, J.J.: Overview on emotion recognition
system. In: Proceedings of International Conference on Soft Computing and Networks
Security, IEEE Xplore, pp. 1–5 (2015)
2. Karg, M., Samadani, A.A., Gorbet, R., Kühnlenz, K., Hoey, J., Kulic, D.: Body movements for
affective expression: a survey of automatic recognition and generation. IEEE Trans. Affect.
Comput. 4(4), 341–359 (2013)
3. Piana, S., Staglianò, A., Odone, F., Camurri, A.: Adaptive body gesture representation for
automatic emotion recognition. ACM Trans. Interact. Intell. Syst. 6(1), 1–31 (2016)
4. Glowinski, D., Dael, N., Camurri, A., Volpe, G., Mortillaro, M., Scherer, K.: Toward a minimal
representation of affective gestures. IEEE Trans. Affect. Comput. 2(2), 106–118 (2011)
5. Wang, W., Enescu, V., Sahli, H.: Adaptive real-time emotion recognition from body movements.
ACM Trans. Interact. Intell. Syst. 5(4), 1–21 (2015)
6. Fourati, N., Pelachaud, C.: Multi-level classification of emotional body expression. In:
Proceedings of 11th IEEE International Conference and Workshops on Automatic Face and
Gesture Recognition. IEEE Xplore, vol. 1, pp. 1–8 (2015)
7. Dael, N., Mortillaro, M., Scherer, K.R.: The body action and posture coding system (BAP):
development and reliability. J. Nonverbal Behav. 36(2), 97–121 (2012)
8. Stathopoulou, I.O., Tsihrintzis, G.A.: Emotion recognition from body movements and gestures.
In: Intelligent Interactive Multimedia Systems and Services, pp. 295–303. Springer, Berlin,
Heidelberg (2011)
9. Gross, M.M., Crane, E.A., Fredrickson, B.L.: Methodology for assessing bodily expression of
emotion. J. Nonverbal Behav. 34(4), 223–248 (2010)
10. Cimen, G., Ilhan, H., Capin, T., Gurcay, H.: Classification of human motion based on affective
state descriptors. Comput. Animat. Virtual Worlds 24(3–4), 355–363 (2013)
11. Zacharatos, H., Gatzoulis, C., Chrysanthou, Y.L.: Automatic emotion recognition based on
body movement analysis: a survey. IEEE Comput. Graph. Appl. 34(6), 35–45 (2014)
12. Castellano, G., Villalba, S.D., Camurri, A.: Recognising human emotions from body movement
and gesture dynamics. In: Proceedings of International Conference on Affective Computing
and Intelligent Interaction, pp. 71–82. Springer, Berlin, Heidelberg (2007)
13. Kalaiselvi Geetha, M., Palanivel, S.: Video classification and shot detection for video retrieval
applications. Int. J. Comput. Intell. Syst. 2(1), 39–50 (2009)