
2023 International Russian Automation Conference (RusAutoCon)

Enhancing Depression Detection: Employing Autoencoders and Linguistic Feature Analysis with BERT and LSTM Model
2023 International Russian Automation Conference (RusAutoCon) | 979-8-3503-4555-1/23/$31.00 ©2023 IEEE | DOI: 10.1109/RusAutoCon58002.2023.10272795

Neda Firoz
Institute of Applied Mathematics and Computer Sciences
Tomsk State University
Tomsk, Russia
[email protected]

Olga G. Beresteneva
Department of I.T. (Institute of Applied Mathematics and Computer Sciences)
Tomsk Polytechnic University
Tomsk, Russia
[email protected]

Sergey V. Aksyonov
Department of I.T. (Institute of Applied Mathematics and Computer Sciences)
Tomsk Polytechnic University
Tomsk, Russia
[email protected]

Abstract—Depression is a persistent health condition that affects many people globally, causing a sustained decline in mood with a non-trivial impact on their emotions. This article centers on the use of BERT-based features and Autoencoders to identify depression from textual input, while also exploring gender-related variations. The study emphasizes the importance of feature engineering for text data obtained from the DAIC_WOZ benchmark dataset. Combining BERT embeddings, which capture the semantic meaning of text, with Autoencoders and additional quantitative characteristics such as PHQ-8 participant responses, absolutist word usage, and gender details significantly improves the model's performance. We run experiments in which BERT embeddings that encode textual meaning are used to extract text features, and these features are subsequently fused using Autoencoders. Our method surpasses the baseline models, achieving an accuracy of 95.5%. Additionally, the mean absolute error (MAE) is 3.75 and the root mean squared error (RMSE) is 5.02, demonstrating strong performance for binary depression classification. These findings show the potential of machine learning in mental health research, especially when gender disparities are considered.

Keywords—depression, AI, deep learning, autoencoders, gender feature, BERT, linguistic feature

I. INTRODUCTION
Depression is a prevalent mental health condition that affects numerous individuals across the world. Detecting and treating depression early is crucial to mitigate its adverse effects on both mental and physical well-being [1]. Depression is described as “the presence of sad, empty, or irritable mood, accompanied by somatic and cognitive changes that significantly affect the individual’s capacity to function” [2]. Given that depression can affect individuals of many age groups and extends beyond any specific demographic, there is a critical need to automate decision-making processes in public health [3, 4, 5].

Lately, there has been growing interest in leveraging artificial intelligence (AI) to predict depression from a combination of textual data, audio data, and Patient Health Questionnaire-8 (PHQ-8) responses. The increasing potential of analyzing depression and emotions has sparked a greater drive to establish databases such as those of the Audio/Visual Emotion Challenge (AVEC) in recent years (2016, 2017, 2019). Consequently, we integrated the DAIC-WOZ dataset into our study [6].

A noteworthy study [7] employed a multi-input integrated deep learning model that incorporated audio, text, and PHQ-8 features, achieving a high level of precision in identifying depression. Furthermore, there is substantial support for the efficacy of text-based approaches, as demonstrated by previous studies [8], [9], which have achieved remarkable performance in depression detection.

The medical field has embraced machine learning (ML) to create diagnostic resources that offer improved accuracy and precision while minimizing the need for human involvement. Various findings have demonstrated the potential of ML-driven technology in detecting and improving the treatment of complex behavioural or emotional disorders like depression [10]. Furthermore, linguistic markers such as the use of absolutist words can serve as an indicator of depression [11]. By analysing linguistic patterns and their use among individuals with depression, it becomes possible to identify psychological patterns [12]. Moreover, the inclusion of gender as a feature reveals distinct and noteworthy differences between male and female individuals with depression. Leveraging this feature, alongside text and audio data, for depression pattern detection has shown promising results in accurately identifying depression [13], [14]. Consequently, these findings suggest using the gender factor to enhance depression diagnosis [14].

Although there remain constraints and obstacles to overcome in using AI for the detection of depression, the potential advantages of promptly and precisely identifying depression are noteworthy. By facilitating timely interventions

and medication, this advancement has the potential to enhance the quality of life for many people globally. In this study, our aim is to use textual features [14] along with deep learning models to detect depression.

A. Objectives of the Study

• Previous research investigating the gender-related facets of depression and its variations among men and women remains limited [14]. In this study, we aim to investigate the detection of depression by leveraging linguistic features such as absolutist words and gender details, with the intention of achieving more accurate predictions.
• In addition, we perform feature fusion using autoencoders and denoise the fully fused features. This approach not only helps reduce dimensionality but also eliminates the need for another algorithm in the subsequent steps.
• Subsequently, we employ an LSTM model for depression prediction based on textual data, as LSTM models have demonstrated strong performance in handling text-based information [15].

II. LITERATURE REVIEW
Depression manifests as a persistent deterioration in mood and noticeably affects cognitive processes. Prior research has indicated that gender can serve as a valuable indicator in identifying depression. Numerous studies have provided evidence supporting the prediction of depression from text conversations. These studies emphasize the substantial differences in language between individuals who are experiencing depression and those who are not, which enables the assessment of a person's mental state. Many of these studies have used text data to achieve accurate predictions of depression. Some have employed dedicated datasets like DAIC-WOZ [16], while others have focused on analysing text from social media platforms.

The article [10] presents a framework called AiME, which enables depression diagnosis with minimal human involvement. Another article [11] discusses three studies conducted across 63 online forums, with the objective of examining the presence of rigid thinking related to anxiety, depression, and thoughts of self-harm. Linguistic analysis showed that forums associated with anxiety, depression, and self-harm or suicidal tendencies exhibited a higher frequency of absolutist words than control forums. These findings highlight the prevalence of absolutist thinking in individuals with depression and suggest that it may serve as a vulnerability factor.

The article [14] studied the significance of gender data for depression prediction. Its results demonstrate two key findings. Firstly, the inclusion of gender data significantly improves the prediction accuracy of depression severity. Secondly, applying adversarial data augmentation that takes gender into account further improves the accuracy of estimating depression severity.

Feature augmentation of textual data using BERT (Bidirectional Encoder Representations from Transformers) encompasses multiple phases. By assessing the significance of each word within the sequence, the attention mechanism enables BERT to capture wide-ranging relationships and associations in the text. Accordingly, the authors of [17], [18] employed BERT-extracted vectors to detect instances of depression. The attention mechanism incorporated in BERT has demonstrated remarkable efficacy in capturing long-range dependencies within textual data, resulting in notable advances in various natural language processing tasks. In our study, we use the attention mechanism in BERT to extract essential features from the text corpus [19]. Through this process, we identified 767 BERT embeddings that play a crucial role in binary depression classification and contribute significantly to the overall performance of the model. BERT is a language model that combines the transformer architecture with the attention mechanism, which helps it capture the contextual associations among words within a sequence [17].

A technique for fusing multisensor data was proposed in [20] to strengthen the reliability of fault diagnosis. We employed Autoencoders for feature fusion at the next level, incorporating counts of absolutist word usage, PHQ-8 responses, and the gender data from the dataset. The fusion of these features is performed using a deep learning model known as an Autoencoder [21].

Autoencoders have gained recognition, particularly in the context of unlabelled datasets, as they offer valuable insights into unsupervised learning. They have proven successful in extracting features and capturing hierarchical relationships within a range of input domains, including handwritten digits, newsgroup posts, and image data. Additionally, autoencoders have demonstrated their ability to learn intricate relationships from multimodal data that combines image and text inputs [22].

Deep autoencoders have been used effectively for feature learning in recommender systems. By leveraging input embeddings based on autoencoders, deep learning models have shown improved performance in prediction tasks. The encoded features, derived from the bottleneck activations of autoencoders, have been instrumental in improving the accuracy of classification tasks across diverse datasets. In fact, a recent study integrated additional linear or non-linear hidden features from autoencoders to further enhance classification accuracy [23].

In the study [15], which employed a standardized dataset, Long Short-Term Memory (LSTM) networks exhibited superior precision in identifying depression. LSTM, a variant of the recurrent neural network (RNN) [24], is widely employed in deep learning applications. Its suitability for processing sequential data is well known, owing to its ability to capture long-range associations and alleviate the problem of vanishing gradients [15], [24].

Despite the impressive outcomes achieved by the above methodologies, they did not explore the potential of using BERT and Autoencoders for text feature extraction. They also did not use absolutist words as a potential feature in

combination with the gender feature, which is a powerful combination for depression detection. Considering this gap, our proposed method extracts text features from a pretrained BERT model and employs Autoencoders for feature fusion, encoding the BERT features together with absolutist word usage and gender features. The use of autoencoders helps to reduce the feature space. We then use these encoded features as embeddings at the input layer of our Long Short-Term Memory (LSTM) model to attain superior accuracy.

III. METHODOLOGY
This section describes the proposed methodology. Because we intended to use gender features, we experimented with the model both with and without gender as a feature input. This work aims to address the following research questions and present novel contributions:

• Examining the comparative effectiveness of linguistic features versus text-based features in depression detection, evaluated by Mean Absolute Error (MAE) and Root Mean Square Error (RMSE).
• Investigating the value of incorporating linguistic features such as absolutist word count into a depression prediction pipeline.
• Analyzing the effectiveness of gender-based features in comparison to linguistic features based on absolutist words, measured by MAE/RMSE.
• Identifying the most effective combination of features for capturing differences in depression levels.

A. Data Used
The DAIC_WOZ dataset comprises audio and video recordings of clinical interviews with individuals diagnosed with depression. This dataset contains self-reported assessments of depression severity, individual attributes, and verbatim interviews. Its purpose is to facilitate the development and evaluation of AI-powered systems targeting the detection of depression in clinical contexts.

Figure 1 illustrates the proposed architecture of this work.

Fig. 1. Proposed Methodology.

B. Feature Augmentation
a) Pre-processing Text
Preparing the text transcripts for analysis requires applying natural language processing (NLP) techniques in an ordered series of steps. This pre-processing stage is essential for refining the text data, reducing noise, and enhancing its overall quality, which in turn enables more accurate and meaningful analysis of the text data.

i) Tokenization
Tokenization segments a text into smaller units called tokens, which may be phrases, words, or sentences [25].

ii) Removing stop words
To minimize noise in text data, commonly used words such as "and," "the," and "is" that lack significant meaning are usually eliminated [26].

iii) Stemming and lemmatization
Two common techniques, stemming and lemmatization, are employed to transform words into their base forms [27]. Stemming removes word affixes, while lemmatization relies on a wordlist to transform the word into its root form.

iv) Part-of-speech tagging
This process determines the syntactic structure of sentences by assigning the appropriate part of speech to each word, as discussed in the book "Speech and Language Processing" [26].

The proportion of absolutist terms in forum groups discussing anxiety, depression, and suicidal ideation is notably higher than that observed in healthy groups [11]. We therefore generate a list of absolutist words and create a frequency feature by iterating through the transcripts. We employ both Pearson's and Spearman's correlation coefficients to assess the relationship between the identified features and the PHQ-8 score. This correlation analysis was conducted on the training and development sets of the AVEC 2019 dataset.

Figure 2 shows the distribution of absolutist word usage, gender, and depressed subjects in the dataset.

b) BERT based feature extraction
For binary classification of depression in the DAIC_WOZ corpus, BERT [17] was used to extract 767 embeddings deemed crucial. Introduced by Devlin et al. in 2018 [19], BERT captures important information from both preceding and succeeding context at each layer, enabling a comprehensive understanding of the input text. BERT offers a versatile framework that can be adapted to a wide range of natural language processing (NLP) tasks with minimal modifications to its architecture or hyperparameter tuning. Bidirectional Encoder Representations from Transformers (BERT) is a model designed to pre-train bidirectional representations from raw text without the need for annotations.
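
The absolutist-word frequency feature and its correlation with PHQ-8 scores, described in the pre-processing subsection above, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the word list is a small hypothetical subset (the paper does not enumerate its list), normalizing the count by transcript length is an assumption, and the function names and toy data are invented for the example.

```python
import math
import re

# Hypothetical, illustrative subset; the paper does not publish its word list.
ABSOLUTIST_WORDS = {"always", "never", "nothing", "everything", "completely", "totally"}


def absolutist_frequency(transcript: str) -> float:
    """Share of tokens in a transcript that are absolutist words (assumed normalization)."""
    tokens = re.findall(r"[a-z']+", transcript.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for tok in tokens if tok in ABSOLUTIST_WORDS)
    return hits / len(tokens)


def pearson(xs, ys):
    """Pearson correlation coefficient; assumes neither input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Toy transcripts and PHQ-8 scores, not taken from DAIC-WOZ.
transcripts = [
    "i always feel tired and nothing helps",
    "work was fine and we went hiking",
    "everything is completely pointless",
]
phq8_scores = [14, 3, 19]

features = [absolutist_frequency(t) for t in transcripts]
print("Pearson:", pearson(features, phq8_scores))
print("Spearman:", spearman(features, phq8_scores))
```

Reporting both coefficients, as the text does, covers linear association (Pearson) as well as monotonic but non-linear association (Spearman) between word usage and questionnaire score.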

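
The research questions above are scored with MAE and RMSE, and the results tables additionally report an R2 score. A minimal sketch of these three standard regression metrics is given here, using toy severity scores that are not from the paper; the function names are chosen for illustration only.

```python
import math


def mae(y_true, y_pred):
    """Mean Absolute Error: average magnitude of the prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Square Error: penalizes large errors more strongly than MAE."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy PHQ-8 style scores (illustrative only).
y_true = [2, 10, 16, 4]
y_pred = [4, 8, 13, 5]

print("MAE :", mae(y_true, y_pred))
print("RMSE:", rmse(y_true, y_pred))
print("R2  :", r2_score(y_true, y_pred))
```

Because RMSE squares each error before averaging, it is never smaller than MAE on the same predictions, which is a quick sanity check when reading reported results.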
Fig. 2. Visualizing the dataset by use of absolutist words, gender, and depressive subjects.

C. Feature coalescence using Autoencoders
Autoencoders are neural networks used here for feature integration and noise attenuation of both text features and gender features. The autoencoder network receives the text features and gender features as input, then compresses and reconstructs the data to generate an output. The compressed representation can then be employed for feature fusion [20], [21], combining the text features and gender features into a unified set of features for subsequent analysis. Additionally, autoencoders can be used for denoising: the network is trained to eliminate noise in the input data, which yields cleaner and more precise features.

D. LSTM
In this research, we employed the LSTM model [24] for text mining in depression analysis. The model comprised two bi-directional LSTM layers with 4 hidden nodes in each layer. To boost the model's performance, we used the concatenated merge mode and applied input and time-step dropout rates of 0.1 and 0.8, respectively. During training, a batch size of 32 and a momentum of 0.89 were used.

E. Evaluation Metrics
The evaluation metrics MAE and RMSE were applied to evaluate the model's performance. To train our LSTM model, we employed a dataset consisting of 148 samples and 575 BERT features that were dimensionally reduced during the fusion process. We then assessed the performance of the proposed model on a separate test set of 41 samples. The total number of epochs was 200, and early stopping was used as a regularization technique.

IV. RESULTS
Depression severity is assessed on a scale, so its prediction can be treated as a regression problem. To facilitate comparison with other research conducted on this dataset, we use the same scoring metrics as the AVEC 2017-2019 Depression Sub-challenge, namely the Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) [6], [16]. Various statistics can be employed to assess the performance of regression models, and RMSE and MAE are commonly used for this purpose. We provide both because the AVEC 2014 Depression Sub-challenge uses both measures, and because there is no agreement on the most appropriate metric for assessing model errors [27].

Table I presents the results of our model. The accuracy values during training and testing are 89.5% and 89.6% respectively for our model. We compare the proposed model with prior work in Table II to place our findings in the context of previous studies. The results of the other models are taken from their respective original studies.

TABLE I. MODEL EVALUATION

Train_Loss  Test_Loss  Train_Accuracy  Test_Accuracy  MAE   RMSE  R2
0.092       0.264      95.5            93.6           3.75  5.02  0.646

TABLE II. COMPARISON TABLE FOR SOTA MODELS

LSTM Model                  MAE    RMSE  R2
Williamson et al. [2]       3.34   4.46  ----
Jan et al. [3]              6.68   8.01  ----
Zhang et al. [4]            6.43   8.18  ----
Qureshi et al. [14]         24.02  4.09  ----
Al Hanai et al. [15]        7.32   8.85  ----
Lin et al. [28]             3.88   5.44  ----
Proposed method LSTM model  3.75   5.02  0.76

TABLE III. PERFORMANCE BY FEATURES

LSTM Model                                                         MAE   RMSE   R2
BERT only                                                          7.57  15.72  0.056
Simple fusion BERT + Absolutist word count                         8.83  12.90  0.055
Simple concat BERT + Gender                                        9.43  11.49  0.076
Autoencoder based fusion BERT + Absolutist word count              5.65  9.45   0.038
Autoencoder based fusion BERT + Gender                             4.65  7.45   0.438
Autoencoder based fused BERT features + Absolutist words + Gender  3.75  5.02   0.76

V. DISCUSSION
In Table II, the MAE (3.75) and RMSE (5.02) of our model are lower than those of Al Hanai

et al. [15] and Lin et al. [28]. The accuracy values during training and testing are 89.5 and 89.6 respectively, which suggests that the predicted depression label agrees with the true value.

Comparing our LSTM model with the results from Williamson et al. [2], Jan et al. [3], Zhang et al. [4], Qureshi et al. [14], Al Hanai et al. [15], and Lin et al. [28] in Table II, our model achieved lower MAE and RMSE values than most of them. This suggests that our LSTM model predicts depression severity more accurately than most of the compared approaches. The R-squared value of 0.76 indicates that approximately 76% of the variation in the dependent variable can be explained by the independent variables incorporated in the model.

Moreover, across the different feature combinations built on BERT features, our autoencoder-based fusion approach showed consistent improvements. When BERT features were fused with Absolutist Word Count, Gender, or both, the MAE and RMSE were lower than with the simple fusion methods, and the R2 score was highest when both features were fused (Table III). This suggests that incorporating additional features, especially those related to the usage of absolutist words and gender, can improve the predictive effectiveness of the model.

The use of Autoencoders [21] for feature fusion has demonstrated its effectiveness in capturing vital features while mitigating noise. According to the findings in Table III, our proposed LSTM model surpassed the other feature configurations in terms of Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and R2 score.

VI. CONCLUSION AND FUTURE WORKS
Given the rapid rise of depression as a medical condition, this study placed emphasis on predicting depression severity through the analysis of textual data. By employing an array of advanced approaches, including the incorporation of gender-related details, evaluation of absolutist language frequency, use of BERT embeddings, and integration of features through autoencoder-based fusion, we improved the precision with which the true labels of depressive subjects are predicted.

Our results emphasize the significance of feature engineering in text data and the relevance of incorporating gender disparities in the detection of depression. By integrating these factors into our predictive model, we surpassed the baseline models and achieved a prediction accuracy of 89.5%.

These findings showcase the capability of text-based methodologies in recognizing and forecasting the severity of depression. The integration of advanced techniques and careful evaluation of pertinent features can substantially boost the accuracy and effectiveness of predictive models in mental health. Our study contributes to the growing body of research on using textual data for mental health assessment. The insights derived from this study can inform the development of more accurate and reliable tools for early detection and intervention in cases of depression.

Subsequent investigations can explore the incorporation of supplementary features and data sources, along with the application of our suggested approach to broader and more heterogeneous datasets. By continuing to advance the field of text-based depression prediction, we can ultimately contribute to improved mental health support and care.

To evaluate the robustness and applicability of this approach, it is advisable to extend it to more extensive datasets. Furthermore, future research could explore its potential in medical applications, offering the possibility of enhancing the early identification and alleviation of depression.

ACKNOWLEDGMENT
We would like to express our gratitude for the DAIC_WOZ dataset license, which has greatly facilitated our research endeavours. Additionally, we would like to thank the library of Tomsk State University for providing us with access to a diverse range of valuable resources.

REFERENCES
[1] World Health Organization. [Online]. Available: https://www.who.int/news-room/fact-sheets/detail/depression.
[2] J. R. Williamson, E. Godoy, M. Cha, A. Schwarzentruber, P. Khorrami, Y. Gwon, H. T. Kung, C. Dagli, and T. F. Quatieri, “Detecting depression using vocal, facial and semantic communication cues,” in Proceedings of the 6th International Workshop on Audio/Visual Emotion Challenge, vol. 6, pp. 11–18, October 2016.
[3] M. L. Joshi and N. Kanoongo, “Depression detection using emotional artificial intelligence and machine learning: A closer review,” Materials Today: Proceedings, vol. 58, pp. 217–226, January 2022.
[4] D. G. Blazer, “Psychiatry and the oldest old,” American Journal of Psychiatry, vol. 157, iss. 12, pp. 1915–1924, December 2000.
[5] S. Chattopadhyay, “A neuro-fuzzy approach for the diagnosis of depression,” Applied Computing and Informatics, vol. 13, iss. 1, pp. 10–18, January 2017.
[6] F. Ringeval, B. Schuller, M. Valstar, N. Cummins, R. Cowie, and M. Pantic, “AVEC'19: Audio/visual emotion challenge and workshop,” Proceedings of the 27th ACM International Conference on Multimedia, vol. 27, pp. 2718–2719, October 2019.
[7] P. Wu, R. Wang, H. Lin, F. Zhang, J. Tu, and M. Sun, “Automatic depression recognition by intelligent speech signal processing: A systematic survey,” CAAI Transactions on Intelligence Technology, pp. 1–11, June 2022.
[8] T. Deng, X. Shu, and J. Shu, “A depression tendency detection model fusing weibo content and user behavior,” 5th International Conference on Artificial Intelligence and Big Data (ICAIBD) IEEE, vol. 5, pp. 304–309, May 2022.
[9] M. R. Morales and R. Levitan, “Speech vs. text: A comparative analysis of features for depression detection systems,” IEEE Spoken Language Technology Workshop (SLT), pp. 136–143, December 2016.
[10] E. Victor, Z. M. Aghajan, A. R. Sewart, and R. Christian, “Detecting depression using a framework combining deep multimodal neural networks with a purpose-built automated evaluation,” Psychological Assessment, vol. 31, no. 8, p. 1019, August 2019.
[11] M. Al-Mosaiwi and T. Johnstone, “In an absolute state: Elevated use of absolutist words is a marker specific to anxiety, depression, and suicidal ideation,” Clinical Psychological Science, vol. 6, no. 4, pp. 529–542, July 2018.
[12] A. Trifan, R. Antunes, S. Matos, and J. L. Oliveira, “Understanding depression from psycholinguistic patterns in social media texts,” in Advances in Information Retrieval. ECIR 2020. Lecture Notes in

Computer Science, vol. 12036, J. Jose, et al. Springer, Cham, pp. 402–409, April 2020.
[13] N. Cummins, B. Vlasenko, H. Sagha, and B. Schuller, “Enhancing speech-based depression detection through gender dependent vowel-level formant features,” Artificial Intelligence in Medicine: 16th Conference on Artificial Intelligence in Medicine, AIME Vienna, Austria, Springer International Publishing, vol. 16, pp. 209–214, June 2017.
[14] S. A. Qureshi, G. Dias, S. Saha, and M. Hasanuzzaman, “Gender-aware estimation of depression severity level in a multimodal setting,” International Joint Conference on Neural Networks (IJCNN) IEEE, pp. 1–8, July 2021.
[15] T. Al Hanai, M. M. Ghassemi, and J. R. Glass, “Detecting depression with audio/text sequence modeling of interviews,” Interspeech, pp. 1716–1720, September 2018.
[16] F. Ringeval, B. Schuller, M. Valstar, N. Cummins, R. Cowie, L. Tavabi, M. Schmitt et al., “AVEC 2019 workshop and challenge: state-of-mind, detecting depression with AI, and cross-cultural affect recognition,” Proceedings of the 9th International on Audio/Visual Emotion Challenge and Workshop, vol. 9, pp. 3–12, October 2019.
[17] B. Cui, Y. Li, M. Chen, and Z. Zhang, “Fine-tune BERT with sparse self-attention mechanism,” in Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), vol. 9, pp. 3548–3553, November 2019.
[18] M. M. Rodrigues, T. Warnita, K. Uto, and K. Shinoda, “Multimodal fusion of bert-cnn and gated cnn representations for depression detection,” Proceedings of the 9th International on Audio/Visual Emotion Challenge and Workshop, vol. 9, pp. 55–63, October 2019.
[19] J. Devlin, M. W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of deep bidirectional transformers for language understanding,” arXiv preprint, arXiv:1810.04805, October 2018.
[20] Z. Chen and W. Li, “Multisensor feature fusion for bearing fault diagnosis using sparse autoencoder and deep belief network,” IEEE Transactions on Instrumentation and Measurement, vol. 66, no. 7, pp. 1693–1702, March 2017.
[21] D. Charte, F. Charte, S. García, M. J. del Jesus, and F. Herrera, “A practical tutorial on autoencoders for nonlinear feature fusion: Taxonomy, models, software and guidelines,” Information Fusion, vol. 44, pp. 78–96, November 2018.
[22] O. Irsoy and E. Alpaydın, “Unsupervised feature extraction with autoencoder trees,” Neurocomputing, vol. 258, pp. 63–73, October 2017.
[23] K. Rama, P. Kumar, and B. Bhasker, “Deep autoencoders for feature learning with embeddings for recommendations: a novel recommender system solution,” Neural Computing and Applications, vol. 33, pp. 14167–14177, November 2021.
[24] C. Zhou, C. Sun, Z. Liu, and F. Lau, “A C-LSTM neural network for text classification,” arXiv preprint, arXiv:1511.08630, November 2015.
[25] W. Wagner, S. Bird, E. Klein, and E. Loper, Natural Language Processing with Python, Analyzing Text with the Natural Language Toolkit. O’Reilly Media, Beijing, 2009, pp. 421–424.
[26] V. Keselj, “Book Review: Speech and language processing by Daniel Jurafsky and James H. Martin,” Computational Linguistics, vol. 35, no. 3, Sept. 2009.
[27] T. Chai and R. R. Draxler, “Root mean square error (RMSE) or mean absolute error (MAE)?–Arguments against avoiding RMSE in the literature,” Geoscientific Model Development, vol. 7, no. 3, pp. 1247–1250, June 2014.
[28] L. Lin, X. Chen, Y. Shen, and L. Zhang, “Towards automatic depression detection: A BiLSTM/1D CNN-based model,” Applied Sciences, vol. 10, no. 23, pp. 1–20, Dec. 2020.
