Article in International Journal on Recent and Innovation Trends in Computing and Communication · July 2023
DOI: 10.17762/ijritcc.v11i7s.7001

International Journal on Recent and Innovation Trends in Computing and Communication
ISSN: 2321-8169 Volume: 11 Issue: 7s
DOI: https://doi.org/10.17762/ijritcc.v11i7s.7001
Article Received: 02 April 2023 Revised: 22 May 2023 Accepted: 01 June 2023
___________________________________________________________________________________________________________________

An Automated System for Depression Detection Based on Facial and Vocal Features
Mohit Patil1, Vijayshri Khedkar2
1Department of Artificial Intelligence and Machine Learning
Symbiosis Institute of Technology
Pune, India
e-mail: [email protected]
2Department of Artificial Intelligence and Machine Learning
Symbiosis Institute of Technology
Pune, India
e-mail: [email protected]

Abstract— Diagnosing depression is a challenge due to the subjective nature of traditional tools like questionnaires and interviews.
Researchers are exploring alternative methods for detecting depression, such as using facial and vocal features. This study investigated the
potential of facial and vocal features for depression detection using two datasets: images of facial expressions with emotion labels, and a vocal
expression dataset with positive and negative words. Four deep-learning models were evaluated for depression detection from facial
expressions, and two traditional machine-learning models were trained for sentiment analysis on the vocal expression dataset. The CNN model
performed best for facial expression analysis, while the Naive Bayes model performed best for vocal expression analysis. The models were
integrated into a web application for depression analysis, allowing users to upload a video and receive an analysis of their facial and vocal
expressions for signs of depression. This study demonstrates the potential of using facial and vocal features for depression detection and
provides insight into the performance of different machine learning algorithms for this task. The web application has the potential to be a useful
tool for individuals monitoring their mental health and may support mental health professionals in their clinical assessments of depression.
Keywords- Machine Learning, Classification Rule, Convolutional Neural Networks, NLP

I. INTRODUCTION
Depression is a major mental health issue that affects millions of people throughout the world. Early identification of depression is critical for timely treatment and better outcomes. In recent years there has been growing interest in applying machine learning and computer vision techniques to diagnose depression using facial and vocal features [1]. The goal of this research is to create an automated depression detection system utilizing facial and vocal features. We use two datasets in particular: the Facial Expression Recognition 2013 (FER2013) dataset and the Positive and Negative Word dataset. We develop four models (VGG16, ResNet50, MobileNet, and a CNN) for the facial component of our approach and assess their performance on the FER2013 dataset. We found that the CNN model had the highest accuracy and went on to deploy it on a web page for depression diagnosis using facial expressions. For the voice component, we initially tested Support Vector Machines and Naive Bayes on raw data and found that Naive Bayes performed better. Next, we used the Google API for speech-to-text to extract vocal features and performed sentiment analysis using Naive Bayes to detect depression [2]. Our approach combines both facial and vocal features to provide a more accurate and comprehensive automated depression detection system. To the best of our knowledge, this study is one of the few to investigate the use of both facial and vocal features for depression diagnosis. Our findings show that the method is effective, reaching 83% accuracy on facial features and 79% accuracy on vocal features. Our method has the potential to be a significant tool for the early diagnosis and intervention of depression. The remainder of the paper is organized as follows: Section 2 presents the Literature Survey, Section 3 the Methodology, Section 4 the Algorithms employed, Section 5 the Result Analysis, and Section 6 the Conclusion.

II. LITERATURE SURVEY
One study proposes a facial expression recognition system composed of three main components: facial landmark detection, analysis of textual information within facial images utilizing convolutional neural networks, and improvement of system performance using transfer learning, progressive image resizing, data augmentation, and parameter fine-tuning. The system's effectiveness is evaluated on three benchmark databases, and the results demonstrate its superiority over existing approaches [3]. Another work proposes a pipeline for analyzing student behavior in an e-learning environment using facial processing techniques. In the suggested approach, recognition of facial features, surveillance, and clustering algorithms are utilized to gather a series of faces from

each student, and a single efficient neural network is employed to extract emotional features from every image. The model could be used for real-time processing of videos on each learner's mobile device, removing the need to upload each student's face footage to a remote server or the teacher's PC. In an Emote task, the suggested system greatly outperforms previous single models [4]. Another paper discusses the importance of facial expression recognition in the development of highly intelligent systems, especially in the interaction between robots and humans. It presents the use of deep learning algorithms, specifically a DCNN combined with multiple models, for classifying human facial expressions. The suggested technique is tested on two reference datasets, FER2013 and JAFFE, using a hybrid EfficientNetB0+VGG16 model [5]. A further study presents an efficient deep learning technique that uses a convolutional neural network model for emotion recognition, age detection, and gender detection from facial images [6]. Neuroscience, mental health, behavioral science, and artificial intelligence are just a few of the application areas of machine learning. Machine learning algorithms can aid in the detection of emotions, which is an important research topic. Emotion is a state that represents human feelings, ideas, and behavior and may be found in all aspects of daily life. In [7], the authors sought to characterize respondents' emotions using the EEG recordings from the DEAP dataset. They applied PCA to reduce the dimensionality of the preprocessed EEG data before testing the accuracy of the CNN's classification of both training and validation samples. They found that the network can be used as a robust classifier for brain signals, outperforming typical machine-learning approaches. In [8], the authors proposed an advanced convolutional neural network model capable of recognizing five distinct human facial emotions. They used a manually gathered image dataset to train and validate the model. Similarly, [9] investigated the optimization of a deep convolutional network's topology and loss function for facial emotion recognition. They trained the convolutional network on the FER2013 dataset and found that their algorithm recognizes facial emotions well. The authors of [10] explored the difficulties encountered in sentiment analysis and assessment and reviewed several approaches to gain a complete understanding of their benefits and limitations. The authors of [11] described a hybrid rule-based technique for producing a fully annotated dataset for five emotions, together with machine learning classifiers to categorize sentiments and emotions. Finally, the authors of [12] presented a data-analytic algorithm to identify depression in people using data gathered from their posts on social media websites such as Twitter and Facebook. They found that machine learning can be used to analyze scraped social media data in order to detect the emergence of depressive illness symptoms.

III. METHODOLOGY
Fig. 1. Proposed Methodology

A. Dataset
FER2013 is a facial expression recognition dataset. It comprises 35,887 grayscale images of 48x48 pixels, each with a label indicating the facial emotion shown in that image. Seven facial expressions are included in the dataset: anger, disgust, fear, happiness, sadness, surprise, and neutral. The images are labeled by crowd-sourced workers after being gathered from various sources such as web searches, social media, and open datasets. We also used a second dataset, which contains positive and negative words, to analyze vocal features.

B. Data Pre-Processing
Data preprocessing played a crucial role in preparing the datasets for analysis and modeling. For the facial expression dataset, we performed preprocessing steps such as image resizing, normalization, and data augmentation to ensure that the images were in a consistent format and that the dataset was large enough to train the models. For the vocal feature dataset, we cleaned the text data, removed stop words, and converted the text to a numerical representation using techniques such as term frequency-inverse document frequency (TF-IDF). The data preprocessing helped to improve the quality and consistency of

the datasets and played a key role in achieving accurate depression detection results.

C. Feature Selection and Extraction
For the facial expression dataset, the features were extracted from the images using deep convolutional neural networks such as VGG16, ResNet50, and MobileNet. These models were pretrained on large datasets such as ImageNet and fine-tuned on the facial expression dataset to extract high-level features from the images. The output of the CNN models was a set of feature vectors, which were used as input to the classification models for depression detection. For the vocal feature dataset, the features were extracted from the text data using natural language processing techniques such as TF-IDF. This technique converted the text data into a numerical representation based on the frequency of each word within a text sample and the inverse of its frequency across the entire corpus. This resulted in a feature vector for each text sample, which was used as input to the classification models for depression detection.

D. Data Splitting
For the facial expression dataset, the data was split into training and validation sets in an 80:20 ratio, with 80% of the data used for training and 20% for validation. The data was also augmented by performing random transformations such as rotation, zooming, and flipping to expand the size of the training set and improve model generalization. For the vocal feature dataset, the data was split into training and testing sets using the same 80:20 ratio. However, since the dataset consisted only of text data, no data augmentation was performed. Data splitting played a crucial role in evaluating the performance of the classification models on unseen data and ensured that the models were able to generalize well to new data.

E. Model Training
Model training involves using the training data to train the classification models to accurately classify depression based on the features extracted from the facial and vocal datasets. For the facial expression dataset, we trained four different classification models: VGG16, ResNet50, MobileNet, and CNN. For the vocal feature dataset, we trained two classification models: SVM and Naive Bayes. During the training process, the classification models were exposed to the training data multiple times, and the model weights were updated each time based on the difference between the predicted and true output. The models' performance was assessed using measures such as accuracy, and the model with the best performance was chosen as the final model. Techniques such as regularization and early stopping were employed to avoid overfitting, which happens when the model is excessively complex and performs well on training data but poorly on testing data. Regularization is the addition of a penalty term to the loss function to prevent the weights from growing too large, whereas early stopping is the termination of the training process when the model's performance on held-out validation data begins to deteriorate. Model training is an important phase in the development of the automated depression detection system since it allows us to create reliable and robust classification models that can identify depression based on facial and voice data.

IV. ALGORITHMS USED

A. Visual Geometry Group 16 (VGG16)
The VGG16 design is made up of thirteen convolutional layers and three fully connected layers, sixteen weight layers in total. Convolutional layers are intended to extract information from an input image by convolving a series of learned filters over it. These features are then used to classify the image using the fully connected layers. One of the key strengths of the VGG16 architecture is its simplicity and uniformity [13]. All of the convolutional layers have the same filter size and stride, and all of the pooling layers have the same size and stride. This uniformity makes it easy to understand and modify the architecture and has also been shown to improve performance on image recognition tasks. During training, the VGG16 architecture is typically initialized with weights learned from the ImageNet dataset, a large-scale image recognition dataset with over a million images and a thousand different object categories. The network is then fine-tuned on the specific dataset and task at hand, in this case, facial expression recognition for depression detection. In our project, we trained the VGG16 model on a facial expression recognition dataset to recognize seven facial emotions: anger, disgust, fear, happiness, sadness, surprise, and neutral. The accuracy achieved by the VGG16 model on this task was 62%. While 62% accuracy may seem low, it is important to note that facial expression recognition is a challenging task, and there are many factors that can affect the accuracy of the model, such as lighting conditions, pose, and occlusions. Additionally, accuracy can be affected by the size and quality of the dataset. Overall, VGG16 is a powerful deep-learning model that has been widely used for image classification tasks and has shown impressive performance in various benchmarks. In our project, it was a good choice for facial expression recognition, and while the accuracy achieved was not perfect, it still demonstrated the potential for this approach to be used in real-world applications.

B. Residual Network (ResNet50)
ResNet50 is a convolutional neural network architecture that was proposed by Microsoft Research in 2015. It is a deep neural network with 50 layers, hence the name ResNet50. The main advantage of ResNet50 is that it can effectively train very deep neural networks by introducing residual connections between layers. The residual connection allows the gradient to flow

directly through the network without encountering the vanishing gradient problem. The vanishing gradient problem is a common issue with deep neural networks, where the gradient becomes extremely small as it propagates through the network, making it difficult to update the weights and train the network. ResNet50 consists of a series of convolutional layers followed by batch normalization, activation functions, and pooling layers. It also includes skip connections that connect certain layers to later layers in the network. The skip connections allow information to bypass one or more layers: the input to a block is added to the output of a later layer in the network. ResNet50 is typically trained using backpropagation and stochastic gradient descent with momentum. The model is trained on a large dataset of labeled images, and the weights are adjusted iteratively based on the error between the predicted outputs and the actual outputs. Once the model is trained, it can be used for a variety of tasks, such as image classification, object detection, and image segmentation, among others. ResNet50 performs strongly on standard image classification benchmarks; on our facial expression task it reached an accuracy of 61.32% [14].

C. MobileNet
MobileNet is a deep convolutional neural network architecture intended for mobile and embedded vision applications. It was created in 2017 by Google and has a lower memory footprint and is faster than previous deep neural networks. The depth-wise separable convolutions in the MobileNet architecture separate the spatial and channel-wise convolutions. This decreases the number of parameters and the network's computational complexity. The network additionally employs 1x1 convolutions to expand the number of channels and incorporates global average pooling at the end of the network. In terms of model training, the MobileNet architecture can be trained using the same procedures as other deep neural networks. The model may be trained on large-scale image datasets such as ImageNet and then fine-tuned on a smaller dataset for the specific application [15]. The network's weights are modified during training using backpropagation and gradient descent to minimize the loss function, which evaluates the difference between predicted and actual outputs. Techniques such as batch normalization, data augmentation, and transfer learning can help to speed up the training process. Once trained, the model can be used to make inferences on new images. The network processes the input image, and the final layer outputs the predicted class probabilities. Typically, the predicted class is the class with the highest probability. After training our model using MobileNet, we obtained an accuracy of 45%. This is lower than the accuracies obtained using VGG16 and ResNet50. However, MobileNet can be a good choice when we need to deploy our model on mobile devices or other embedded systems with limited computational resources.

D. Convolutional Neural Network (CNN)
A CNN is a type of deep neural network that is widely used for image classification. We employed a CNN model in our study to detect depression based on facial expressions. The model was fed a facial image that had been preprocessed to remove noise and normalize brightness and contrast. The algorithm produced a probability score reflecting the likelihood of the input image falling into one of two categories: depressed or not depressed. Our CNN model consisted of several layers, including convolutional layers, pooling layers, and fully connected layers. The convolutional layers learned features from the input image by applying a set of filters to detect patterns at different spatial scales. The pooling layers reduced the spatial dimensions of the feature maps by selecting the maximum value in each pooling region. The fully connected layers performed the final classification by mapping the learned features to the output classes. During training, we employed a binary cross-entropy loss function to optimize the model's parameters. To enlarge our training dataset and prevent overfitting, we applied data augmentation techniques. Our model was trained using the FER2013 dataset, which contains 35,887 face images labeled with seven distinct emotions, including sadness. Our results indicated that the CNN model had a high accuracy of 83% in diagnosing depression based on facial expressions. We also compared the CNN model's performance to that of other models, such as VGG16, ResNet50, and MobileNet, and found that the CNN model outperformed these models in terms of accuracy. Our CNN model demonstrated the efficacy of deep learning approaches in identifying depression based on facial expressions, and it has the potential to be employed in real-world applications such as mental health screening and diagnosis [16].

E. Support Vector Machine (SVM)
SVM is a binary classification technique that seeks the optimum hyperplane with the greatest margin of separation between positive and negative data points. Positive and negative words can be thought of as data points in a sentiment analysis study. We may utilize SVM for this job by first creating a lexicon of positive and negative phrases and then representing each document as a feature vector. The bag-of-words model, in which each feature indicates the occurrence of a word in the document, is a typical technique. We can then use these feature vectors to train an SVM model to categorize the texts as positive or negative. During training, the SVM algorithm seeks the hyperplane that separates the positive and negative feature vectors with the largest margin. The best hyperplane is obtained by minimizing classification error and maximizing the

margin. Once found, the hyperplane may be used to categorize new documents as positive or negative depending on their feature vectors. The trained model reached an accuracy of 53.91%, meaning that it correctly classified 53.91% of the terms in the dataset. In addition, we evaluated other measures such as precision, recall, and F1 score to properly evaluate the model's performance [17].

F. Naïve Bayes Classifier
This classifier is a machine-learning algorithm that uses Bayes' theorem to classify data based on input features. In the case of text classification, the input features are typically words in a document and the class labels are the possible categories. It assumes that the features are conditionally independent given the class label, which allows the likelihood of the evidence given the hypothesis to be calculated as the product of the probabilities of each individual feature given the hypothesis. To train a Naive Bayes classifier, the prior probabilities of every class label are estimated from the training data, and the conditional probabilities of each feature given each class label are estimated using a simple counting method such as maximum likelihood estimation, optionally with smoothing. During classification, the posterior probabilities of each class label given the input features are calculated using Bayes' theorem, and the class label with the highest probability is chosen as the predicted class. The Naive Bayes model showed an accuracy of 79% in detecting positive and negative sentiments from vocal features. The precision for predicting sentiments was 84% while the recall was 79%. The F1 score was 73.81%. Overall, the model performed well in detecting negative sentiments but had some difficulty in detecting positive sentiments [18].

V. RESULT ANALYSIS

A. Results Analysis For Facial Features Model
We compared the performance of four different models for detecting depression based on facial expressions: ResNet50, VGG16, MobileNet, and CNN. Table 1 shows the accuracy of each model.

Table 1. Accuracy Table of VGG16, ResNet50, MobileNet, CNN
Models       Accuracy
ResNet50     61.32%
VGG16        62%
MobileNet    45%
CNN          83%

Our findings show that the CNN model outperformed the other models in diagnosing depression based on facial expressions, with an accuracy of 83%. The VGG16 model came second with an accuracy of 62%. The accuracies of the ResNet50 and MobileNet models, on the other hand, were 61.32% and 45%, respectively. These findings suggest that deeper and more complex models, such as ResNet50 and VGG16, may not always outperform simpler models, such as CNN, especially when the dataset is small. Our findings indicate that the CNN model is a promising approach for detecting depression based on facial expressions and could be used in practical fields such as psychological assessment and diagnosis. However, more research is required to assess the model's efficacy on a wider and more diverse dataset, as well as to investigate its generalizability to different people and cultures.

B. Result Analysis of Vocal Features
We also compared the effectiveness of two distinct models for diagnosing depression based on vocal features, SVM and Naive Bayes. Each model's precision, recall, F1 score, and accuracy are provided in Table 2.

Table 2. Accuracy Table of SVM, Naïve Bayes
Models       SVM       Naïve Bayes
Precision    70%       84%
Recall       54.2%     79%
F1-Score     42.7%     73.81%
Accuracy     53.91%    79%

Our findings reveal that the Naive Bayes model performed much better than the SVM model in diagnosing depression based on vocal data. The Naive Bayes model also outperformed the SVM model in terms of precision (84% versus 70%), indicating that a larger share of its positive predictions were correct. The recall of the Naive Bayes model was higher than that of the SVM model, suggesting that it may have a better ability to identify depressed individuals. The F1 score, which is the harmonic mean of precision and recall, was also greater for the Naive Bayes model than for the SVM model, showing a better balance of precision and recall. Our findings indicate that the Naive Bayes model is a promising method for diagnosing depression based on vocal features and that it might be applied in real-world applications such as mental health screening and diagnosis. However, more research is required to assess the model's performance on larger and more diverse datasets, as well as to investigate its generalizability to different populations and cultures. Our findings emphasize the utility of employing both facial and vocal features for automated depression identification, and they suggest that combining these features may boost the detection system's effectiveness even more.

VI. CONCLUSION
Our project shows promise in using machine learning techniques to identify and diagnose depression. By combining facial and vocal features, we can gain a more complete understanding of a person's emotional state. Our use of CNN for facial expression recognition and Naive Bayes for vocal feature analysis allows for a more holistic approach to depression detection. This has the potential to make a significant

contribution to the field of mental health by providing a more accurate and reliable method for detecting and diagnosing depression. Future research can further optimize the models and explore the use of other machine-learning techniques. Our project demonstrates the potential of machine learning to improve mental health outcomes and provides a framework for the development of more accurate and reliable depression detection systems.

REFERENCES
[1] Anastasia and P. Simos, "Automatic Assessment of Depression Based on Visual Cues: A Systematic Review," IEEE Transactions on Affective Computing, vol. 10, no. 4, pp. 445-470, 2019.
[2] M. L. Joshi and N. Kanoongo, "Depression detection using emotional artificial intelligence and machine learning: A closer review," Materials Today: Proceedings, vol. 58, pp. 217-226, 2022.
[3] A. Castiglione and S. Hossain, "Impact of Deep Learning Approaches on Facial Expression Recognition in Healthcare Industries," IEEE Transactions on Industrial Informatics, vol. 18, no. 8, pp. 5619-5627, 2022.
[4] A. V. Savchenko and Savchenko, "Classifying Emotions and Engagement in Online Learning Based on a Single Facial Expression Recognition Neural Network," IEEE Transactions on Affective Computing, vol. 13, no. 4, pp. 2132-2143, 2022.
[5] Dr. Govind Shah. (2017). An Efficient Traffic Control System and License Plate Detection Using Image Processing. International Journal of New Practices in Management and Engineering, 6(01), 20 - 25. Retrieved from http://ijnpme.org/index.php/IJNPME/article/view/52
[6] A. F. Yaseen and Shaukat, "Emotion Recognition from Facial Images using Hybrid Deep Learning Models," in 2022 2nd International Conference on Digital Futures and Transformative Technologies (ICoDT2), 2022.
[7] M. Z. Asghar, "An efficient deep learning technique for facial emotion recognition," Multimedia Tools and Applications, vol. 81, no. 2, pp. 1573-7721, 2022.
[8] Yuliang, X. Meng and Gao, "Emotion Recognition Based On CNN," in 2019 Chinese Control Conference (CCC), 2019.
[11] Wankhade and Mayur, "A survey on sentiment analysis methods, applications, and challenges," Artificial Intelligence Review, vol. 55, no. 7, p. 5731–5780, 2022.
[12] "Sentiment analysis and Automatic Emotion Detection Analysis of Twitter using Machine Learning Classifiers," International Journal of Mechanical Engineering, vol. 7, no. 2, p. 11, 2022.
[13] Kanna, D. R. K., Muda, I., & Ramachandran, D. S. (2022). Handwritten Tamil Word Pre-Processing and Segmentation Based on NLP Using Deep Learning Techniques. Research Journal of Computer Systems and Engineering, 3(1), 35–42. Retrieved from https://technicaljournals.org/RJCSE/index.php/journal/article/view/39
[14] T. Vaidya and R. Yeole, "Deep Learning-Based Early Depression Detection using Social Media," International Journal of Innovative Research in Technology, vol. 9, no. 10, pp. 974-977, 2023.
[15] Zhong and Shan, "Expression Recognition Method Using Improved VGG16 Network Model in Robot Interaction," Journal of Robotics, vol. 2021, p. 9, 2021.
[16] H. Abdullahi and Sharif, "Facial expression recognition using deep learning," in AIP Conference Proceedings, 2021.
[17] J. Ju and Q. Hua, "A-MobileNet: An approach of facial expression recognition," Alexandria Engineering Journal, vol. 61, no. 6, pp. 4435-4444, 2022.
[18] F. Arefin, S. R. Das and Shanto, "Depression Detection Using Convolutional Neural Networks," 2021 IEEE International Conference on Signal Processing, Information, Communication & Systems (SPICSCON), pp. 9-13, 2021.
[19] García, A., Petrović, M., Ivanov, G., Smith, J., & Cohen, D. Enhancing Medical Diagnosis with Machine Learning and Image Processing. Kuwait Journal of Machine Learning, 1(4). Retrieved from http://kuwaitjournals.com/index.php/kjml/article/view/143
[20] S. Gide and S. Ghatte, "Depression Prediction using BERT and SVM," International Research Journal of Engineering and Technology, vol. 9, no. 3, pp. 2013-2016, 2022.
[21] P. P. Surya and B. Subbulakshmi, "Sentimental Analysis using Naive Bayes Classifier," 2019 International Conference on Vision Towards Emerging Trends in Communication and Networking (ViTECoN), Vols. 9-13, 2021.
[9] E. Pranav and Kamal, "Facial Emotion Recognition Using Deep
[22] Shalini, A. K. ., Saxena, S. ., & Kumar, B. S. . (2023). Designing
Convolutional Neural Network," in 2020 6th International
A Model for Fake News Detection in Social Media Using
Conference on Advanced Computing and Communication
Machine Learning Techniques. International Journal of
Systems (ICACCS), 2020, pp. 317-320.
Intelligent Systems and Applications in Engineering, 11(2s),
[10] Liu and Lingling, "Human Face Expression Recognition Based 218 –. Retrieved from
on Deep Learning-Deep Convolutional Neural Network," in https://ijisae.org/index.php/IJISAE/article/view/2620
2019 International Conference on Smart Grid and Electrical
Automation (ICSGEA), 2019, pp. 221-224.

