
Volume 8, Issue 5, May 2023 | International Journal of Innovative Science and Research Technology
ISSN No: 2456-2165

Speech Emotion Recognition for Enhanced User Experience: A Comparative Analysis of Classification Methods

Samjhana Pokharel
Department of Computer Science and Engineering
Kathmandu University
Dhulikhel, Nepal

Ujwal Basnet
Department of Computer Science and Engineering
Kathmandu University
Dhulikhel, Nepal

Abstract:- Speech recognition has gained significant importance in facilitating user interactions with various technologies. Recognizing human emotions and affective states from speech, known as Speech Emotion Recognition (SER), has emerged as a rapidly growing research subject. Unlike humans, machines lack the innate ability to perceive and express emotions. Therefore, leveraging speech signals for emotion detection has become an adaptable and accessible approach. This paper presents a project aimed at classifying emotional states in speech for applications such as call centers, measuring emotional attachment in phone calls, and real-time emotion recognition in online learning. The classification methods employed in this study include Support Vector Machines (SVM), Logistic Regression (LR), and Multi-Layer Perceptron (MLP). The project utilizes features such as Mel-frequency cepstrum coefficients (MFCC), chroma, and mel to extract relevant information from speech signals and train the classifiers. Through a comparative analysis of these classification methods, this research aims to enhance the understanding of speech emotion recognition and contribute to the development of more effective and accurate emotion recognition systems.

Keywords:- Speech Emotion Recognition (SER), Emotion Classification, Support Vector Machines (SVM), Logistic Regression (LR), Multi-Layer Perceptron (MLP), Mel-frequency Cepstrum Coefficients (MFCC), Chroma, Mel Features.

I. INTRODUCTION

Speech recognition has become increasingly important in recent years as a means of assisting users with ease of use. Several well-known technology companies, including Google, Samsung, and Apple, have used speech recognition to convert human speech into text so that their customers can quickly navigate their products.

Speech Emotion Recognition (SER) is the task of recognizing human emotion and the associated affective states from speech. It exploits the fact that tone and pitch in the voice often indicate the underlying emotion. In recent years, emotion recognition has become a rapidly growing research subject. Machines, unlike humans, lack the innate ability to perceive and express emotions. Speech, physiological signals, facial expressions, and other modalities can all be used to detect emotions; among these, speech signals are far more adaptable and simple to acquire. Mel-frequency cepstrum coefficients (MFCC), chroma, and mel features are extracted from the speech signals and used to train the classifiers.

Our project aims to classify the emotional state of speech, which can be used in a number of applications such as call centers, measuring the degree of emotional attachment in phone calls, and real-time emotion recognition in online learning. Three classification methods are used in this project for analyzing emotions (calm, happy, fearful, angry, disgust, surprised): SVM, Logistic Regression (LR), and Multi-Layer Perceptron (MLP).

 Motivations for Doing the Project
In today's world, identifying the emotion exhibited in a spoken utterance has various applications. Human-Computer Interaction (HCI) is a branch of study that looks into how humans and computers interact with each other. A computer system that understands more than simply words is required for an efficient HCI application. Voice-based inputs are used by several real-world IoT applications, including Amazon Alexa, Google Home, and Mycroft; in IoT applications, voice plays a critical role. According to a recent survey, about 12% of all IoT applications will be completely functional by 2022. Self-driving automobiles are one example of an emerging field that uses voice commands to operate several of its tasks. In emergency scenarios where the user may be unable to offer a clear spoken command, the emotion communicated through the user's tone of voice can be used to activate specific car emergency functions.

 Objectives
The primary objective of speech emotion recognition is to improve the human-machine interaction interface by detecting the emotional state of a person from speech.

II. RELATED WORKS

There are a number of studies on speech emotion recognition, and different companies are doing research and work related to speech emotion recognition, either directly or as an application within other parts of their work.

audEERING, an audio analysis company based in Germany, specialises in emotional artificial intelligence. Their team consists of experts in voice emotion analytics, machine learning, and signal processing.

Alexa, a virtual assistant AI technology developed by Amazon and first used in the Amazon Echo smart speaker and in the Echo Dot, Echo Studio, and Amazon Tap speakers developed by Amazon Lab, is working on detecting emotions like sadness, happiness, and anger in order to understand the mental state of a speaker from the sound of their voice.

III. DATASETS

In this project we have used the RAVDESS (Ryerson Audio-Visual Database of Emotional Speech and Song) dataset. It contains 7356 files rated by 246 persons 10 times on emotional validity. The full dataset is 24.8 GB and covers 24 different actors; because it is so large, we used the lower-sample-rate version, which is around 171 MB. The dataset includes the following emotions: neutral, calm, happy, sad, angry, fearful, disgust, and surprised.

 File Naming Convention
Each of the 1440 files has a unique filename. The filename consists of a 7-part numerical identifier (e.g., 03-01-06-01-02-01-12.wav). These identifiers define the stimulus characteristics:

 Filename Identifiers

 Modality (01 = full-AV, 02 = video-only, 03 = audio-only).
 Vocal channel (01 = speech, 02 = song).
 Emotion (01 = neutral, 02 = calm, 03 = happy, 04 = sad, 05 = angry, 06 = fearful, 07 = disgust, 08 = surprised).
 Emotional intensity (01 = normal, 02 = strong). NOTE: There is no strong intensity for the 'neutral' emotion.
 Statement (01 = "Kids are talking by the door", 02 = "Dogs are sitting by the door").
 Repetition (01 = 1st repetition, 02 = 2nd repetition).
 Actor (01 to 24. Odd-numbered actors are male, even-numbered actors are female).

 Filename Example: 03-01-06-01-02-01-12.wav

 Audio-only (03)
 Speech (01)
 Fearful (06)
 Normal intensity (01)
 Statement "dogs" (02)
 1st repetition (01)
 12th actor (12): female, as the actor ID number is even.


IV. METHODS AND ALGORITHMS USED

Fig 1 Methodology

The figure above shows the general flow-chart of our project, which consists of 5 different phases. The data is stored in files in the project directory. The files are loaded using different Python libraries, and unnecessary files are removed. We then extract different features of the sound files, such as MFCC, mel, and chroma, which are used as the input features for the classifier function. The dataset is then divided into two sets: a training set and a testing set. We build different classifier models and train each model using the training set. Finally, we use the testing set for evaluation and accuracy calculation of each model. The whole process is summarized in the diagram above.

 Phase 1: Data Collection
The RAVDESS dataset is used in the project. The dataset is downloaded into our system.

Audio files in the directory are loaded using libraries such as os, glob, and soundfile.

We use the glob module, which finds all the path names matching a specified pattern, because the dataset consists of audio files named in a specific pattern that encodes the emotion label in the file name itself. The os module is used to get the base name of each file. Then, using the soundfile library, we read each sound file along with the sample rate of the audio.
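As a concrete illustration, here is a minimal sketch of this loading step. The directory layout passed to the hypothetical load_data helper is an assumption; the emotion codes follow the RAVDESS naming convention described above.

```python
import glob
import os

import soundfile  # reads audio data together with its sample rate

# Emotion codes from the third field of the RAVDESS filename convention.
EMOTIONS = {
    "01": "neutral", "02": "calm", "03": "happy", "04": "sad",
    "05": "angry", "06": "fearful", "07": "disgust", "08": "surprised",
}

def load_data(data_dir="ravdess"):  # hypothetical directory layout
    """Yield (audio, sample_rate, emotion) for every .wav file found."""
    for path in glob.glob(os.path.join(data_dir, "Actor_*", "*.wav")):
        # e.g. 03-01-06-01-02-01-12.wav -> the emotion code is field 3
        emotion_code = os.path.basename(path).split("-")[2]
        audio, sample_rate = soundfile.read(path)
        yield audio, sample_rate, EMOTIONS[emotion_code]
```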

 Phase 2: Extracting Features
Different features of the sound are extracted: Mel Frequency Cepstral Coefficients (MFCCs), Chroma, and Mel-Spectrogram.

 MFCCs
A frequency-domain feature of the sound that captures the timbral (textural) and phonetically crucial characteristics of speech. It is widely used in speech, music genre, and musical instrument classification.

 Chroma
Chroma captures the harmonic and melodic characteristics of music while being robust to changes in timbre and instrumentation. It is also referred to as pitch class profiles.

 Mel-Spectrogram
A spectrogram where the frequencies are converted to the mel scale. Samples of the sound file are taken over time to represent the audio signal; the signal is mapped from the time domain into the frequency domain using the fast Fourier transform, and the resulting frequencies and amplitudes form the spectrogram.

The Librosa library is used to extract features from the audio files. Librosa is a Python library for music and audio analysis; it provides the building blocks necessary to create music information retrieval systems.
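The paper does not list the extraction code itself, so the following is a hedged sketch of how the three features could be computed and stacked with librosa; the choice of 40 MFCCs and the mean-over-time pooling are assumptions for illustration, not values reported above.

```python
import librosa
import numpy as np

def extract_features(audio, sample_rate, n_mfcc=40):  # n_mfcc is assumed
    """Return one 1-D feature vector: MFCC + chroma + mel-spectrogram."""
    stft = np.abs(librosa.stft(audio))  # chroma is computed from the STFT
    mfccs = np.mean(librosa.feature.mfcc(y=audio, sr=sample_rate,
                                         n_mfcc=n_mfcc).T, axis=0)
    chroma = np.mean(librosa.feature.chroma_stft(S=stft,
                                                 sr=sample_rate).T, axis=0)
    mel = np.mean(librosa.feature.melspectrogram(y=audio,
                                                 sr=sample_rate).T, axis=0)
    return np.hstack([mfccs, chroma, mel])  # concatenate into one vector
```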

 Phase 3: Classification

In this phase different algorithm models are used for classifying the emotions:

 MLP Classifier
 Logistic Regression
 SVM

The dataset was divided into two sets:

 Training set (80%)
 Testing set (20%)

 MLP Classifier
Multi-Layer Perceptron (MLP) is a feedforward artificial neural network consisting of an input layer, multiple hidden layers, and an output layer, all of which are connected. By adjusting the model's parameters, biases, and weights, the network approximates the target function. The activation function used during the experiments was ReLU, which makes the model easier to train and often achieves better performance.

Fig 2 MLP Classifier

The MLP (Multi-Layer Perceptron) Classifier is used to categorize the given data into respective groups. It is capable of approximating boolean and nonlinear functions and is frequently used in supervised learning problems. The network works on real values, so categorical values must be converted into a real-valued representation.

The following parameter values were used in our model:

 alpha=0.01,
 batch_size=256,
 epsilon=1e-08,
 hidden_layer_sizes=(300,),
 learning_rate='adaptive',
 max_iter=500

 Alpha: the parameter for the regularization (penalty) term, which combats overfitting by constraining the size of the weights.
 Batch Size: the number of samples that will be propagated through the network.
 Epsilon: a value for numerical stability.
 Hidden Layer Sizes: 1 hidden layer with 300 hidden units.
 Learning Rate: the schedule for weight updates. 'adaptive' keeps the learning rate constant at 'learning_rate_init' as long as the training loss keeps decreasing. Each time two consecutive epochs fail to decrease the training loss by at least tol, or fail to increase the validation score by at least tol if 'early_stopping' is on, the current learning rate is divided by 5.
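Combined with the 80/20 split from Phase 3, the configuration above corresponds to roughly the following scikit-learn code. This is a reconstruction from the listed parameters rather than the authors' verbatim script; X and y are assumed to hold the stacked feature vectors and emotion labels from the Phase 2 sketch, and random_state is an arbitrary choice for reproducibility.

```python
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# 80% training set / 20% testing set, as described in Phase 3.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)  # random_state is assumed

mlp_model = MLPClassifier(alpha=0.01,
                          batch_size=256,
                          epsilon=1e-08,
                          hidden_layer_sizes=(300,),
                          learning_rate='adaptive',
                          max_iter=500)
mlp_model.fit(X_train, y_train)
print("MLP test accuracy:", mlp_model.score(X_test, y_test))
```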

 Logistic Regression

The logistic model is used to model the probability of a certain class or event existing, such as pass/fail, win/lose, alive/dead, or healthy/sick. This can be extended to model several classes of events, such as determining whether an image contains a cat, dog, lion, etc. Each object detected in the image would be assigned a probability between 0 and 1, with the probabilities summing to one. This model is preferable for categorical dependent-variable data, since the data used have a small set of output classes (e.g., happy and sad).

This linear relationship can be written in the following mathematical form (where ℓ is the log-odds, b is the base of the logarithm, and βi are parameters of the model):

ℓ = log_b(p / (1 − p)) = β0 + β1x1 + β2x2 + ... + βnxn

Following are the parameters used in building the model:

 multi_class='multinomial',
 solver='lbfgs'

 Multi_class: 'multinomial', an extension of logistic regression that adds support for multi-class classification problems.
 Solver: the algorithm used for the optimization problem. In our case, lbfgs is used: it approximates second-derivative (Hessian) updates with gradient evaluations and stores only the last few updates, so it saves memory, although it is not especially fast on large datasets.
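A hedged sketch of the corresponding scikit-learn call, reusing the split from the MLP sketch above; raising max_iter beyond the default is an added assumption so that lbfgs can converge on this feature set.

```python
from sklearn.linear_model import LogisticRegression

lr_model = LogisticRegression(multi_class='multinomial',
                              solver='lbfgs',
                              max_iter=1000)  # max_iter=1000 is assumed
lr_model.fit(X_train, y_train)
print("LR test accuracy:", lr_model.score(X_test, y_test))
```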

 SVM
SVMs (Support Vector Machines) are supervised learning models with associated learning algorithms that analyze data for classification and regression analysis. Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that assigns new examples to one category or the other, making it a non-probabilistic binary linear classifier. It is a supervised machine learning model that linearly separates binary sets. The goal of this model is to compute a hyperplane that correctly classifies all training vectors. After creating a hyperplane, the next step is to maximize the margin between the data points and the hyperplane; the data points closest to the hyperplane are called the support vectors.


Fig 3 SVM

Following are the parameters used in building the model:

 kernel="linear",
 C=1

 kernel="linear": specifies the kernel type of the algorithm.
 C=1: the regularization parameter. The strength of the regularization is inversely proportional to C, and it must be strictly positive.
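These parameters map onto scikit-learn's SVC as sketched below (Experiment I); the split variables are reused from the earlier sketches.

```python
from sklearn.svm import SVC

svm_model = SVC(kernel="linear", C=1)  # linear kernel, C=1 as listed
svm_model.fit(X_train, y_train)
print("SVM test accuracy:", svm_model.score(X_test, y_test))
```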

 Phase 4: Evaluation

Evaluation of the experiments involves comparing the classification report and accuracy of each model. This includes comparison of accuracy between the multiple experiments of each algorithm and between the different algorithms.
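A minimal sketch of this evaluation step, assuming scikit-learn's metrics utilities and any of the fitted models from the sketches above:

```python
from sklearn.metrics import accuracy_score, classification_report

# Per-class precision, recall, and F1 score, plus overall accuracy.
y_pred = mlp_model.predict(X_test)
print(classification_report(y_test, y_pred))
print("Accuracy:", accuracy_score(y_test, y_pred))
```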

V. EXPERIMENTS AND EVALUATIONS

In these experiments, the three models (MLP Classifier, Logistic Regression, and SVM) are used for classifying the emotions.

The dataset consisted of the following data:

Fig 4 Dataset Composition

 MLP Classifier

Table 1 Experiments Conducted with Respective Evaluations

SN | Experiments | Accuracy | F1 Score | Precision | Recall
I | alpha=0.01, batch_size=256, epsilon=1e-08, hidden_layer_sizes=(300,), learning_rate='adaptive', max_iter=500 | 0.63 | 0.63 | 0.67 | 0.64
II | hidden_layer_sizes=(300,150,) | 0.61 | 0.59 | 0.63 | 0.60
III | hidden_layer_sizes=(600,) | 0.70 | 0.70 | 0.71 | 0.71

Fig 5 MLP, Experiment I

Fig 6 MLP, Experiment II


Fig 7 MLP, Experiment III


 Logistic Regression

Table 2 Experiments Conducted with Respective Evaluations

SN | Experiments | Accuracy | F1 Score | Precision | Recall
I | solver='lbfgs' | 0.53 | 0.52 | 0.53 | 0.53
II | solver='saga' | 0.50 | 0.50 | 0.53 | 0.50
III | solver='newton-cg' | 0.58 | 0.58 | 0.58 | 0.58

Fig 8 LR, Experiment I


Fig 9 LR, Experiment II

Fig 10 LR, Experiment III

 SVM

Table 3 Experiments Conducted with Respective Evaluations

SN | Experiments | Accuracy | F1 Score | Precision | Recall
I | kernel="linear", C=1 | 0.43 | 0.56 | 0.56 | 0.56
II | kernel="poly", C=1 | 0.33 | 0.26 | 0.26 | 0.32
III | kernel="linear", C=2 | 0.42 | 0.57 | 0.57 | 0.57


Fig 11 SVM, Experiment I

Fig 12 SVM, Experiment II

Fig 13 SVM, Experiment III

VI. DISCUSSION ON RESULTS

Evaluation of the experiments includes comparison of accuracy:

 between the multiple experiments of each algorithm, and
 between the different algorithms.

The table below compares the different algorithms; it contains the best result of each algorithm after performing multiple experiments:

Model | Best experiment condition | Accuracy | F1 Score | Precision | Recall
MLP | alpha=0.01, batch_size=256, epsilon=1e-08, hidden_layer_sizes=(600,), learning_rate='adaptive', max_iter=500 | 0.70 | 0.70 | 0.71 | 0.71
SVM | kernel="linear", C=1 | 0.43 | 0.56 | 0.56 | 0.56
LR | solver='newton-cg' | 0.58 | 0.58 | 0.58 | 0.58

MLP achieves the best accuracy, F1 score, precision, and recall when it has a single hidden layer with 600 hidden units. SVM performs best with a linear kernel and a regularization parameter of 1, and LR performs best with newton-cg as the optimization solver.

Comparing the models, we get the best results with the MLP Classifier, while the other two, SVM and LR, have similar results.

VII. CONTRIBUTIONS OF EACH GROUP MEMBER

1. Samjhana Pokharel
 Research on Speech Emotion Recognition
 Study of sound files and their features for speech emotion recognition
 Visualization of speech features
 Visualization of datasets
 Study of ML models for speech emotion recognition
 Multiple experiments with Support Vector Machine and Logistic Regression
 Performance and accuracy measurement of SVM and LR
 Comparison of each experiment and model
 Drawing conclusions

2. Ujwal Basnet
 Research on Speech Emotion Recognition
 Study of sound files and their features for speech emotion recognition
 Visualization of speech features
 Visualization of datasets
 Study of ML models for speech emotion recognition
 Multiple experiments with Multi-Layer Perceptron classifiers
 Performance and accuracy measurement of MLP
 Comparison of each experiment and model
 Drawing conclusions

VIII. CODE

Code snippets for the implementation of the different methods have been presented and discussed above.

The complete code can be accessed via the public GitHub repository:

https://github.com/SamjhanaP/speechemotionrecognition

IX. CONCLUSION AND FUTURE EXTENSIONS TO THE PROJECT

The new era of automation has begun as a result of the increasing growth and development in the fields of AI and machine learning. The majority of these automated gadgets are controlled by the user's vocal commands. Many advantages can be gained over present systems if, in addition to identifying words, the machines can interpret the speaker's emotion.

The processes for creating a voice emotion recognition system were covered in detail in this project, and several experiments were conducted to determine the influence of each step. Three different learning models were used: MLP, LR, and SVM. First, speech features like MFCC, chroma, and mel were extracted from the audio files. Then each model was trained in multiple experiments with variations in the parameters, and using the test dataset, the accuracy of each model and each experiment was studied. At the end, we conclude that the MLP Classifier performs better when the number of hidden units in a hidden layer is increased.

We therefore conclude that the following are the advantages of using the MLP classifier in Speech Emotion Recognition:

 It allows you to work with nonlinear values with ease.
 It gives higher performance compared to the other models.
 Missing values can be handled.
 Complicated relationships can be modelled.
 Many inputs can be supported.

For future enhancements, the proposed project can be further improved in terms of efficiency, accuracy, and usability. The model may be extended to recognize more emotional states and sensations, such as sarcasm. A number of interactive systems can be developed using the trained models in the underlying system, providing a system where users can interact with, or command, the machine using voice. Also, the communication can be made bi-directional instead of uni-directional.

