
A Hybrid Transformer-Based Framework for Early Detection of Mental Health Risks from Multi-Platform Text Data with Personalized Resource Suggestions

1st Debojeet Paul, Dept. of CSE, Don Bosco Institute of Technology, Bengaluru, India, [email protected]
2nd Disha Jaipal, Dept. of CSE, Don Bosco Institute of Technology, Bengaluru, India, [email protected]
3rd Goutham R, Dept. of CSE, Don Bosco Institute of Technology, Bengaluru, India, [email protected]
4th Ganavi R, Dept. of CSE, Don Bosco Institute of Technology, Bengaluru, India, [email protected]
5th Dr. Madhu C S, Dept. of CSE, Don Bosco Institute of Technology, Bengaluru, India, [email protected]

Abstract— This paper proposes a hybrid transformer-based framework for the early detection of mental health risks from social media text. The system analyzes content from platforms such as Twitter and Reddit and identifies emotional states such as depression, anxiety, and anger by combining transformer-based embeddings with traditional classifiers to improve detection accuracy. The model also offers personalized resource suggestions based on the severity of the detected emotions. This approach aims to support timely, scalable, and proactive mental health intervention for individuals in distress.

Keywords—Mental Health Detection, BERT, Transformer Models, Social Media Analysis, Contextual Embeddings, Emotional State Classification, Severity Assessment, Personalized Recommendations

I. INTRODUCTION

Mental health challenges are increasingly recognized as a global concern, yet stigma, misinformation, and limited access to professional support continue to prevent many individuals from seeking timely help. As psychological distress, anxiety, and mood disorders become more prevalent, there is a growing need for systems that offer proactive, compassionate, and context-aware intervention.

Social media platforms like Twitter and Reddit have evolved into digital spaces where people regularly share their thoughts, emotions, and daily experiences. This constant flow of user-generated content provides valuable insight into individual mental states and broader emotional trends. However, while some platforms use automated tools to detect potential distress or flag harmful content, their interventions are often limited to basic censorship or generic support messages. These measures, though well-intentioned, lack the contextual awareness and personalization needed to truly assist individuals in crisis.

Advancements in artificial intelligence—particularly in natural language processing—have enabled systems to recognize emotional cues in text with increasing precision. Transformer-based models like BERT and its mental health–focused variants have demonstrated significant promise in identifying early signs of mental health risks from online text. Yet most current systems stop at detection and fail to close the loop between identification and real-world assistance.

This paper surveys recent advancements in transformer-based frameworks for detecting mental health risks from social media text. It examines key methodologies, compares state-of-the-art hybrid models, and emphasizes the potential of systems that not only identify emotional distress but also deliver context-aware, personalized support—drawing on user behaviour and preferences to bridge critical gaps in current digital mental health interventions.

II. RELATED WORK

1. BERT-RF for Depression Detection

This research aims to detect depression early by analyzing social media user content with machine learning techniques, a capability that is increasingly important for clinical research on depression [6]. The study introduces a novel BERT-RF feature engineering approach that combines contextualized embeddings from BERT with probabilistic features from a Random Forest model to enhance detection accuracy. By leveraging these features, the proposed framework aims to improve the precision and reliability of depression classification in social media posts, enabling timely identification of individuals at risk.
A. Methodology

This study used machine learning techniques focused on detecting depression in Twitter users by extracting tweet features [6]. It utilized a benchmark depression dataset containing 20,000 tagged English tweet user profiles categorized as depressed or non-depressed [6]. The preprocessing phase involved removing custom text formatting, filtering stopwords using the NLTK library, and applying stemming to reduce words to their root form [6]. Words in tweets were transformed into sequences of digits based on a dictionary index [6]. The pre-processed text was passed through a pre-trained BERT model to generate contextual token embeddings while preserving semantic relationships [6]. Rather than using these embeddings directly for classification, the authors introduced an intermediate step: a Random Forest model was applied to the BERT embeddings to generate probabilistic features, effectively transforming high-dimensional embeddings into more discriminative meta-features [6]. These features were then used as input for training five classifiers: Random Forest (as a baseline), Multilayer Perceptron (MLP), K-Neighbors Classifier (KNC), Logistic Regression (LR), and Long Short-Term Memory (LSTM) [6]. Among them, Logistic Regression performed best, achieving 99% accuracy, precision, recall, and F1-score [6]. Training used an 80:20 data split, with 80% of the dataset used to train each model and 20% to evaluate its performance; the split was produced with the train_test_split() method from the scikit-learn module [6]. The study also applied k-fold cross-validation to validate the results obtained from the profile segmentation process [6].

The proposed BERT-RF feature engineering approach significantly improved depression detection accuracy across models. The Logistic Regression (LR) classifier achieved the best results, with 99% accuracy, precision, recall, and F1-score. Compared to baseline BERT-only features, which had much lower performance (e.g., LR F1-score = 0.56), the BERT-RF method demonstrated substantial performance gains. This indicates that combining contextual embeddings from BERT with probabilistic features from Random Forest provides more discriminative and informative features for classification.
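The two-stage feature engineering described above is straightforward to prototype. The sketch below is an illustrative reconstruction, not the authors' code: it assumes bert-base-uncased with [CLS] pooling, default scikit-learn hyperparameters, and in-memory lists `tweets` and `labels` that stand in for the benchmark dataset.

```python
# Illustrative sketch of the BERT-RF idea in [6]: BERT embeddings are
# turned into Random Forest class probabilities, which then train a
# Logistic Regression. Pooling choice, hyperparameters, and the
# `tweets`/`labels` variables are assumptions, not the authors' setup.
import numpy as np
import torch
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased").eval()

@torch.no_grad()
def embed(texts, batch_size=32):
    """Encode texts as fixed-size vectors using the [CLS] embedding."""
    chunks = []
    for i in range(0, len(texts), batch_size):
        enc = tokenizer(texts[i:i + batch_size], padding=True,
                        truncation=True, max_length=128,
                        return_tensors="pt")
        chunks.append(encoder(**enc).last_hidden_state[:, 0, :].numpy())
    return np.vstack(chunks)

X_tr, X_te, y_tr, y_te = train_test_split(embed(tweets), labels,
                                          test_size=0.2, random_state=42)

# Stage 1: Random Forest converts 768-dim embeddings into probabilistic
# meta-features (one probability per class).
rf = RandomForestClassifier(n_estimators=100).fit(X_tr, y_tr)
meta_tr, meta_te = rf.predict_proba(X_tr), rf.predict_proba(X_te)

# Stage 2: Logistic Regression is trained on the meta-features.
lr = LogisticRegression().fit(meta_tr, y_tr)
print("accuracy:", lr.score(meta_te, y_te))
```

One caveat worth noting: producing the Random Forest's probabilities on the same data it was fit on risks leakage, so out-of-fold probabilities (e.g., via cross_val_predict) are the safer choice in practice; the paper does not state how this was handled.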
B. Challenges

The model is keyed to specific language patterns that express melancholy, which raises a possible limitation: individuals may express melancholic behaviour differently in less formal settings [6].

Ethical considerations regarding mental health profiling and the responsible use of information on social media remain to be validated, including issues of user consent and privacy [6].

Given the use of BERT-based embeddings, the interpretability of the prediction model also needs further evaluation for clarity and reliability [6].

2. Hugging Face BERT Models for Depression Detection

This paper aims to propose a framework for addressing the limitations of the current diagnostic process and harness the power of transformers and social media to advance our understanding of mental health and develop more effective approaches to supporting individuals experiencing mental illness [4]. Specifically, the study focuses on leveraging public Twitter data—namely users' tweets and bios—to train and fine-tune pre-trained BERT models for depression detection. By comparing multiple transformer variants from Hugging Face with traditional machine learning methods, the research aims to identify the most accurate and computationally efficient model for early mental health risk prediction.

A. Methodology

This study employed four pre-trained transformer models from the Hugging Face library: distilbert-base-uncased-finetuned-sst-2-english (DBUFS2E), bert-base-uncased (BBU), mental-bert-base-uncased (MBBU), and distilroberta-base (DRB). These models had been previously fine-tuned on large corpora including reviews, tweets, and various other text sources [4]. The dataset used was Autodep, automatically collected via the Twitter API. It included over 11.8 million tweets and 553 bio-descriptions, representing users who had disclosed mental health-related information [4]. Data preprocessing involved multiple cleaning and normalization steps. The tweets and bios were first filtered for English text and stripped of retweets, mentions, URLs, emojis, and special characters. All text was converted to lowercase and extra spaces were removed. Tokenization was performed using the NLTK library, which segmented the input into word-level tokens. Common stop words were removed using NLTK's built-in list, and lemmatization was applied via the WordNetLemmatizer to standardize inflected words into their base forms. Stemming was also tested, although lemmatization produced slightly better outcomes [4]. Following preprocessing, each dataset (tweets and bios) was divided into training and testing subsets using 10-fold cross-validation. Each fold was trained for five epochs using Hugging Face's Trainer method. Model-specific tokenizers were used to convert cleaned text into token sequences for model input. The training process optimized the models for binary classification, distinguishing between depressed and non-depressed classes. Performance evaluation was carried out using metrics including accuracy, F1 score, and AUC, computed during training using an evaluation function [4]. These metrics helped assess the precision, recall, and overall classification effectiveness of the models.

The final depression detection models were developed by fine-tuning these BERT variants on the cleaned Twitter data. Among all models, DBUFS2E yielded the highest performance—achieving an AUC of 0.98 on tweets and 0.96 on bios—demonstrating its strong capability to capture depression-related signals [4].
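As a concrete illustration of this setup, the hedged sketch below runs one fold of fine-tuning with Hugging Face's Trainer. The dataset construction, batch size, and metric choices are assumptions rather than the paper's exact configuration, and the `texts`/`labels` variables stand in for one fold of the cleaned Autodep data.

```python
# Hedged sketch of one cross-validation fold of the fine-tuning setup in
# [4], using Hugging Face's Trainer. Dataset construction, batch size,
# and metric choices are illustrative assumptions.
import numpy as np
from datasets import Dataset
from sklearn.metrics import accuracy_score, f1_score
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

name = "distilbert-base-uncased-finetuned-sst-2-english"  # "DBUFS2E" in [4]
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=2)

ds = Dataset.from_dict({"text": texts, "label": labels})  # your cleaned data
ds = ds.map(lambda b: tokenizer(b["text"], truncation=True, max_length=128),
            batched=True)
fold = ds.train_test_split(test_size=0.1, seed=0)  # stand-in for one of 10 folds

def compute_metrics(p):
    preds = np.argmax(p.predictions, axis=1)
    return {"accuracy": accuracy_score(p.label_ids, preds),
            "f1": f1_score(p.label_ids, preds)}

trainer = Trainer(
    model=model,
    args=TrainingArguments("out", num_train_epochs=5,  # five epochs per fold [4]
                           per_device_train_batch_size=16),
    train_dataset=fold["train"],
    eval_dataset=fold["test"],
    tokenizer=tokenizer,          # enables dynamic padding of batches
    compute_metrics=compute_metrics,
)
trainer.train()
print(trainer.evaluate())         # accuracy / F1 on the held-out fold
```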

B. Challenges

It should be noted that additional user data, including images, emojis, and hashtags, can enhance predictions. Future research can explore combining data types to improve the performance of the models.
3. Emotional Mental Intelligence – Digital Twin Framework

In this paper, the authors aim to address the barriers of stigma, accessibility, and affordability in mental healthcare by designing and developing a dialogue system that analyses individuals' mental status [3]. They built a conversational AI chatbot that can detect the severity of depression and mental health symptoms using a digital twin–inspired framework, improving early detection, access, and personalization in mental healthcare.

A. Methodology

The study utilized the Extended Distress Analysis Interview Corpus (E-DAIC), a semi-clinical dataset comprising 219 transcripts of interviews conducted by a virtual agent ("Ellie") with participants diagnosed across various levels of depression, labelled using the PHQ-8 scale [3]. The dataset included transcriptions of spoken dialogue and metadata such as PHQ scores, gender, and mental health history.

Preprocessing involved selecting interview utterances longer than 50 characters and categorizing them into five depression severity levels based on PHQ-8 thresholds. For classification efficiency, these were later merged into three categories: no symptoms, mild, and moderate/severe depression [3]. To address data imbalance, the authors applied contextual data augmentation techniques using bi-directional language models and BERT-style embeddings to generate diverse training examples, especially for underrepresented classes [3].

The cleaned and augmented text data was converted into token embeddings using a pre-trained BERT model. These contextual embeddings were fed into a custom classification layer, producing probability distributions over the three depression severity classes [3]. The model was implemented within the Rasa framework, which handled both intent classification and dialogue flow using BERT-based NLU components and fallback mechanisms [3].

Model training used an 85:15 train-validation split. BERT was fine-tuned using the AdamW optimizer (learning rate = 1e−5), batch sizes of 4 and 32 (training/validation), and a softmax activation to generate class probabilities [3]. For ambiguous user inputs, a fallback classifier was employed to trigger default responses, ensuring robustness in uncertain cases [3].

Evaluation was performed using standard metrics: accuracy, precision, recall, and F1-score. The authors also designed custom rules and dialogue "stories" to test response generation and intent routing under multiple conversation flows [3].

On the E-DAIC dataset, the model achieved an overall classification accuracy of 69%, with moderate F1-scores across the three severity levels [3]. This outperformed comparable tools such as Ada Health, which had a top-5 diagnosis accuracy of 51% [3]. In real-world testing with 20 university students, the EMI (Emotional Mental Intelligence) chatbot achieved 40% top-1 accuracy and 65% top-2 accuracy in predicting depression severity when validated against PHQ-9 scores [3]. Participants rated the chatbot's usability favourably, with a mean System Usability Scale (SUS) score of 84.75%—significantly above the industry standard of 68% [3].
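The PHQ-8-based severity bucketing used in the preprocessing step can be made concrete in a few lines. The sketch below assumes the standard PHQ-8 cutoffs; whether [3] used exactly these boundaries for its five-level scheme before merging is an assumption.

```python
# A minimal sketch of PHQ-8-based severity bucketing as described in [3].
# The thresholds are the standard PHQ-8 cutoffs; the paper's exact
# boundaries for its five-level scheme are an assumption here.
PHQ8_LEVELS = [            # (upper bound inclusive, label)
    (4, "none"),
    (9, "mild"),
    (14, "moderate"),
    (19, "moderately severe"),
    (24, "severe"),
]

# The three merged categories used for classification efficiency in [3].
MERGE = {"none": "no symptoms", "mild": "mild", "moderate": "moderate/severe",
         "moderately severe": "moderate/severe", "severe": "moderate/severe"}

def severity(phq8_score: int, merged: bool = True) -> str:
    """Map a PHQ-8 total score (0-24) to a severity label."""
    for upper, label in PHQ8_LEVELS:
        if phq8_score <= upper:
            return MERGE[label] if merged else label
    raise ValueError("PHQ-8 scores range from 0 to 24")

assert severity(3) == "no symptoms"
assert severity(12) == "moderate/severe"
```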
B. Challenges

The model's classification accuracy was constrained by the limited and imbalanced dataset, as well as the lack of multimodal emotional cues such as facial expression and voice tone, which the chatbot could not interpret [3]. The sample size in real-user testing was also small (n = 20), limiting statistical generalizability [3].

4. Transformer Calibration for Stress and Depression

The primary aim of this study is to improve the reliability and performance of transformer-based models for identifying stress and depression in social media posts by injecting linguistic features into BERT/MentalBERT models and applying label smoothing to calibrate the models' prediction confidence. This is the first study to jointly address both classification performance and calibration for mental health detection from text, while also incorporating novel linguistic cues like LIWC, Top2Vec, and GOSS features derived from LDA topic modelling [2].

A. Methodology

The researchers employed three benchmark datasets: Dreaddit (stressful vs. non-stressful posts), Depression-Mixed (depressive vs. non-depressive), and Depression-Severity (minimal, mild, moderate, and severe depression levels), all collected from Reddit and related forums [2].

Preprocessing involved tokenizing input posts and extracting various extra-linguistic features. These included: (i) emotion-based scores using the NRC lexicon (10-dim), (ii) LIWC 2022 features capturing linguistic and psychological word categories (117-dim), (iii) 25-dimensional LDA topic probabilities plus GOSS scores indicating topic deviation, and (iv) 512-dimensional Top2Vec semantic embeddings [2].

The cleaned texts were first passed through either BERT or MentalBERT to obtain contextual token embeddings. These were concatenated with the linguistic feature vectors using a multimodal adaptation gate followed by a shifting mechanism. This generated enhanced word embeddings, which were normalized and passed into the transformer models for final classification [2].

For model calibration, the authors applied label smoothing—a technique that modifies the hard target labels by distributing a small probability mass across all classes—helping reduce overconfidence and improve uncertainty estimation [2]. The final architecture, denoted M-BERT or M-MentalBERT (e.g., M-BERT(LIWC)), consisted of the transformer encoder plus a dense ReLU layer and softmax output [2].

Training was conducted using the Adam optimizer (learning rate = 0.001), a StepLR scheduler, and a batch size of 8. Early stopping and fivefold cross-validation were applied on Depression-Severity, while an 80:20 split was used on the other datasets. Models were trained for up to 30 epochs on a Tesla P100 GPU using PyTorch and the Transformers library [2].
Performance was assessed using standard metrics: accuracy, precision, recall, and F1-score. Calibration was evaluated using Expected Calibration Error (ECE) and Adaptive Calibration Error (ACE), which compare prediction confidence with accuracy across bins [2].
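Both calibration ingredients are easy to express in PyTorch. The sketch below is illustrative rather than the paper's implementation: label smoothing via the built-in cross-entropy option, and a standard equal-width-bin ECE; the smoothing factor and bin count are assumptions.

```python
# Sketch of the two calibration ingredients in [2]: label smoothing during
# training and Expected Calibration Error (ECE) at evaluation time. The
# smoothing factor and bin count are illustrative assumptions.
import torch
import torch.nn.functional as F

# PyTorch supports label smoothing directly in the cross-entropy loss;
# during training: loss = loss_fn(model_logits, target_labels)
loss_fn = torch.nn.CrossEntropyLoss(label_smoothing=0.1)

def expected_calibration_error(logits, labels, n_bins=10):
    """Bin predictions by confidence; ECE is the size-weighted average
    gap between each bin's accuracy and its mean confidence."""
    probs = F.softmax(logits, dim=1)
    conf, preds = probs.max(dim=1)
    correct = preds.eq(labels).float()
    ece, edges = 0.0, torch.linspace(0, 1, n_bins + 1)
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (conf > lo) & (conf <= hi)
        if in_bin.any():
            gap = (correct[in_bin].mean() - conf[in_bin].mean()).abs()
            ece += in_bin.float().mean().item() * gap.item()
    return ece
```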
B. Challenges

While the proposed M-BERT model achieved strong performance, certain limitations make it less suitable for real-world deployment. The lack of interpretability features (e.g., SHAP, attention visualization) limits trust and usability in clinical settings, which our project addresses by integrating explainable AI components and clearer user feedback mechanisms [2].

Additionally, the model was trained with limited hyperparameter tuning and without repeated statistical testing on two datasets, which may affect generalizability. Our project mitigates this by incorporating systematic tuning and multiple evaluation cycles [2].

Finally, while linguistic features such as NRC and LIWC were used, their inconsistent impact suggests the need for more targeted, user-specific features — something our framework improves by integrating personalized input cues and adaptive feedback [2].

5. MIRoBERTa: Mental Illness Text Classification

This study focuses on enhancing the performance and contextual understanding of transformer-based models in the domain of mental health detection on social media. To do this, the authors propose domain-adaptive variants of two pretrained models—BERT and RoBERTa—by continuing their training on Reddit posts related to mental illness. These adapted models, termed MIBERT and MIRoBERTa, aim to provide more accurate classification in a multi-class setting involving different types of mental illnesses such as depression, anxiety, OCD, PTSD, and others [5].

A. Methodology

The dataset used in this work was derived from 11 mental health-focused subreddits including r/depression, r/anxiety, r/OCD, and r/ADHD, using the Reddit Pull-Push API. Only posts with at least 10 upvotes were included to ensure the relevance and engagement of the content. The dataset ultimately consisted of over 54,000 labelled posts, with an additional 8,207 posts categorized as "none" (non-mental health related) [5].

During preprocessing, the authors applied standard cleaning steps: lowercasing, URL and punctuation removal, and lemmatization. Notably, they retained stop words, since transformer models can derive meaning from these tokens based on their contextual embeddings [5].

The study benchmarked both traditional and deep learning methods (e.g., TF-IDF with SVM, BiLSTM with GloVe) but focused primarily on transformer models. Alongside the standard pretrained BERT and RoBERTa, the authors trained their domain-specific variants—MIBERT and MIRoBERTa—using Masked Language Modelling (MLM) on Reddit mental health text to capture domain-relevant patterns. This pretraining was followed by fine-tuning for the final classification task [5].

Fine-tuning was conducted using the AdamW optimizer with learning rates between 1e-5 and 3e-5, a batch size of 16, and training over three epochs. RoBERTa and MIRoBERTa pretraining took approximately 9 hours, while BERT variants trained in about 8.5 hours. All experiments were implemented using the Hugging Face Transformers library [5].

Model performance was evaluated using standard metrics: accuracy, precision, recall, and F1-score (all weighted to account for class imbalance). To enhance interpretability, SHAP (SHapley Additive exPlanations) was used to determine which words contributed most to each model's predictions [5].
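The domain-adaptive step described above can be sketched as continued MLM training before classification fine-tuning. The example below is a hedged reconstruction using the Hugging Face stack; the corpus file name, sequence length, and training budget are assumptions.

```python
# Hedged sketch of the domain-adaptive MLM pretraining behind
# MIBERT/MIRoBERTa in [5]: continue masked-language-model training of
# RoBERTa on raw Reddit mental health text before fine-tuning. File
# names and hyperparameters are assumptions.
from datasets import load_dataset
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForMaskedLM.from_pretrained("roberta-base")

corpus = load_dataset("text", data_files={"train": "reddit_posts.txt"})
tokenized = corpus["train"].map(
    lambda b: tokenizer(b["text"], truncation=True, max_length=256),
    batched=True, remove_columns=["text"])

# Randomly mask 15% of tokens; the model learns to reconstruct them.
collator = DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments("miroberta-mlm", num_train_epochs=1,
                           per_device_train_batch_size=16),
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()
model.save_pretrained("miroberta-base")  # later fine-tuned for classification
```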
The domain-adaptive models outperformed all baselines. MIRoBERTa achieved the highest performance among individual models, with an accuracy and F1-score of 0.847 at a learning rate of 1e-5. The best performance overall came from an ensemble of MIRoBERTa and RoBERTa, which reached accuracy, precision, recall, and F1-score values of 0.851 each [5].

In per-class evaluation, MIRoBERTa achieved the highest recall (0.841) for the anxiety class, while the RoBERTa ensemble attained an F1-score of 0.804 for depression classification [5].

B. Challenges

Although the MIRoBERTa model showed strong quantitative results, the authors note several limitations. The complexity of differentiating overlapping mental illnesses (e.g., depression and anxiety) and the lack of clinical validation for subreddit-based labelling affect the model's robustness [5]. Moreover, while SHAP was used to provide some explainability, the model does not incorporate adaptive user feedback or personalized learning components—areas where our project seeks to innovate. Additionally, no multilingual or multilabel extensions were implemented, which limits its scalability across diverse user groups [5].

6. Depression Detection on Social Media Posts Using BERT and Support Vector Machine

The objective of this study is to explore the feasibility and performance of combining BERT embeddings with Support Vector Machine (SVM) classifiers for detecting depression in social media posts. By comparing the effect of different SVM kernel functions on Reddit and Twitter datasets, the study aims to identify the most effective configuration for depression detection using textual cues. The authors emphasize that this approach provides a computationally efficient alternative to deep learning while leveraging BERT's powerful contextual embeddings [7].

A. Methodology

Two publicly available datasets were used: one sourced from Reddit and the other from Twitter, both obtained via Kaggle. The Reddit dataset includes about 7,000 English-language posts from depression-focused subreddits, whereas the Twitter dataset contains labelled tweets (depressed or non-depressed), with a natural class imbalance favouring non-depressed tweets [7].

Data preprocessing was conducted using NLTK, involving standard cleaning steps: removal of URLs, mentions, and hashtags, lowercasing, and stop word removal. These steps were implemented to ensure textual uniformity before tokenization and embedding [7].

The cleaned text was tokenized using BERT's tokenizer, with a maximum sequence length of 128 tokens. The bert-base-uncased model was used to generate contextual embeddings by averaging the hidden state outputs of each token, resulting in a 768-dimensional vector for each post. These vectors were then used as features for SVM classification [7].

Four different SVM kernel functions were explored:

• Linear: a baseline for linearly separable data

• Polynomial: to capture complex, non-linear dependencies

• Radial Basis Function (RBF): for mapping to higher-dimensional spaces

• Sigmoid: mimicking neural network behaviour, though more sensitive to hyperparameters

The study tuned key hyperparameters, such as the regularization parameter and kernel-specific gamma values, to optimize performance [7].

The models were evaluated using accuracy, precision, recall, and F1-score, which provided a balanced view of classification effectiveness across imbalanced classes [7].
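A minimal version of this pipeline, assuming mean pooling over the final hidden states and default SVC settings (the paper tuned C and gamma per kernel), might look like the following; `posts` and `labels` stand in for either dataset.

```python
# Sketch of the BERT + SVM pipeline in [7]: mean-pooled bert-base-uncased
# embeddings (768-dim) fed to SVMs with different kernels. Hyperparameter
# values and the `posts`/`labels` variables are illustrative assumptions.
import torch
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased").eval()

@torch.no_grad()
def embed(texts):
    """Average the final hidden states over tokens -> one 768-dim vector."""
    enc = tokenizer(texts, padding=True, truncation=True, max_length=128,
                    return_tensors="pt")
    hidden = encoder(**enc).last_hidden_state        # (batch, seq, 768)
    mask = enc["attention_mask"].unsqueeze(-1)       # ignore padding tokens
    return ((hidden * mask).sum(1) / mask.sum(1)).numpy()

X_tr, X_te, y_tr, y_te = train_test_split(embed(posts), labels,
                                          test_size=0.2, random_state=42)

for kernel in ["linear", "poly", "rbf", "sigmoid"]:  # the four kernels in [7]
    svm = SVC(kernel=kernel, C=1.0, gamma="scale").fit(X_tr, y_tr)
    print(kernel)
    print(classification_report(y_te, svm.predict(X_te)))
```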
The evaluation revealed that the Polynomial and RBF kernels consistently outperformed the others on both platforms. On the Twitter dataset, both kernels achieved 87.5% accuracy, outperforming the linear (85.5%) and sigmoid (76.4%) configurations [7]. On the Reddit dataset, the Polynomial kernel achieved the highest accuracy of 94.4%, followed closely by RBF at 94.2%, linear at 93.5%, and sigmoid at 93.3% [7].

These results indicate that depression-related language on Reddit is more explicit and lends itself to better classification. In contrast, Twitter posts are shorter and noisier, making detection slightly more challenging [7].

B. Challenges

Although the BERT + SVM framework produced strong results, the study acknowledged several limitations. First, it relied solely on English-language text, limiting the model's cross-linguistic applicability [7]. Moreover, the datasets were noisy, especially the Twitter data, which was not manually verified and included imbalanced samples—conditions that could affect generalizability [7].

Importantly, the model lacks any explainability mechanisms such as SHAP or attention-based visualization, which are critical in mental health applications for ensuring interpretability and ethical use. Our project seeks to address this by integrating explainable AI techniques and exploring multilingual and multimodal data (e.g., audio, images) to enhance robustness [7].

7. Enhanced Word Embeddings

The primary goal of this study is to enhance the detection of mental health and substance abuse issues, including depression, suicidal ideation, eating disorders, and alcoholism, using domain-specific word embeddings tailored for social media text. The authors aim to optimize the semantic space such that class-specific predictive terms are closely aligned with condition-specific anchor words (pivot terms) while remaining distant from unrelated terms. This approach is designed to improve classification accuracy in small, specialized corpora by generating embeddings that better capture condition-related language [1].

A. Methodology

The study utilized Reddit data from five condition-related subreddits—r/depression, r/SuicideWatch, r/alcoholism, r/EatingDisorders, and r/bulimia—alongside 18 general-interest subreddits like r/books, r/science, and r/law to serve as control data [1]. Posts were selected only if they included self-referential statements and were filtered to eliminate those with overlapping conditions to maintain label purity. Additionally, to prevent keyword leakage, key class-specific terms were removed from 70% of posts during preprocessing [1].

Text preprocessing included stop word removal and merging predictive bigrams into single tokens to preserve semantic context (e.g., "mental health" as one token). The researchers identified predictive terms for each class using statistical tests—Chi-squared and Mann-Whitney U—to form term pairs and define pivot terms representative of each condition [1]. These terms guided the creation of enhanced word embeddings by modifying the training objective so that semantically related words were pulled toward the pivot vector and semantically unrelated words were pushed away.

Several variations of embeddings were evaluated, including basic GloVe, enhanced GloVe, GloVe + retrofitting, and concatenated models. For classification, the authors employed four models: logistic regression, random forest, a CNN-based deep learning model (DL1), and DistilBERT. Among these, the CNN with enhanced GloVe + retrofitting (Embedding Model 4) produced the best results [1].

Training was conducted using a 70:30 train-test split, with hyperparameter tuning applied via grid search and fivefold cross-validation for the traditional models. The CNN architecture included multiple filter sizes (2, 3, and 5), max pooling layers, and softmax output. Evaluation metrics included accuracy, precision, recall, and F1-score. Additionally, the researchers used cosine similarity and PCA-based visualization to assess the semantic clustering behaviour of the embeddings in vector space [1].

The enhanced embeddings significantly outperformed baseline methods. Specifically, Embedding Model 4 (GloVe-initialized, retrofitted, and enhanced) achieved an F1-score of 86% in multi-class classification using the CNN model, outperforming standard Word2Vec, GloVe, and even DistilBERT by up to 13.97% in F1 score [1]. The cosine similarity analysis further confirmed that the improved embeddings offered tighter intra-class grouping and clearer inter-class separation [1].

B. Challenges

Despite its strong performance, the study presents several limitations that affect its applicability to real-world mental health detection. First, although the enhanced embeddings improved semantic representation for class-specific language, the model lacks interpretability mechanisms such as SHAP values or attention-based visualizations, which are essential in clinical and sensitive decision-making environments [1]. Second, the dataset was explicitly filtered to exclude comorbid posts—i.e., instances where users exhibited multiple mental health conditions—to maintain label purity. This reduces the model's generalizability, especially given that comorbidity is common in real mental health scenarios [1]. Third, the embedding-based approach used in the study is static; it does not utilize contextualized word representations from transformer models like BERT or RoBERTa, which limits its capacity to capture nuanced meanings based on context. Finally, the models operate on fixed embeddings and lack mechanisms for user-specific adaptation or continuous learning, which may hinder their effectiveness in evolving or personalized mental health applications.

III. METHODOLOGY

The proposed system is a modular pipeline designed for the early detection of emotional distress and mental health risks through the analysis of user-generated social media content. The architecture integrates data collection, preprocessing, feature extraction, emotional classification, severity assessment, and resource recommendation, along with optional feedback mechanisms for continuous learning.

A. Data Collection Module

This module is responsible for acquiring raw data from social media platforms such as Twitter and Reddit, either through public APIs or curated datasets. The collected content primarily includes textual posts, tweets, and comments that are likely to reflect users' emotional states.

Preprocessing pipeline (a code sketch follows the list):

• Noise Removal: Elimination of URLs, emojis, hashtags, and special characters.

• Normalization: Conversion to lowercase, with lemmatization or stemming for lexical uniformity.

• Language Filtering: Retention of only English-language content to ensure model compatibility.

• Stopword Removal: Removal of high-frequency, low-information words to reduce noise.
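A minimal NLTK-based sketch of these steps, under the assumption of simple regex rules (language filtering, which typically needs a detector such as langdetect, is omitted for brevity):

```python
# Minimal sketch of the preprocessing pipeline above, using NLTK. Regex
# patterns are assumptions; language filtering is omitted for brevity.
import re
import nltk
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer

nltk.download("stopwords"); nltk.download("wordnet")
STOP = set(stopwords.words("english"))
LEMMA = WordNetLemmatizer()

def preprocess(post: str) -> list[str]:
    # Noise removal: URLs, mentions, hashtags, then emojis/special chars
    post = re.sub(r"https?://\S+|@\w+|#\w+", " ", post)
    post = re.sub(r"[^a-zA-Z\s]", " ", post)
    # Normalization: lowercase + lemmatize
    tokens = [LEMMA.lemmatize(t) for t in post.lower().split()]
    # Stopword removal: drop high-frequency, low-information words
    return [t for t in tokens if t not in STOP and len(t) > 1]

print(preprocess("Feeling so empty lately... https://t.co/x #cantsleep"))
# -> ['feeling', 'empty', 'lately']
```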
B. Feature Extraction Module

This module generates semantically rich representations of textual data using transformer-based architectures.

• Contextual Embeddings: Utilizes pretrained models such as BERT, RoBERTa, or DistilBERT to derive token-level and sentence-level embeddings.

• Augmented Features: Incorporates additional linguistic or psycholinguistic features (e.g., LIWC categories, sentiment scores, word counts) to enhance model interpretability and performance.

C. Classification Module

A hybrid classification framework is deployed to detect and categorize user emotions.

• Classifier Architectures: Combines deep contextual embeddings with traditional features using classifiers such as Support Vector Machines (SVM), Random Forests (RF), or Multilayer Perceptrons (MLP).

• Emotion Categories: Trained to detect emotional states including depression, anxiety, anger, and general distress.

• Confidence Scoring: Each prediction is accompanied by a probabilistic confidence score, enabling downstream modules to interpret prediction certainty.

D. Severity Assessment Module

To quantify the intensity of the detected emotional state, a severity grading system is employed.

• Severity Levels: Emotional states are categorized as Low, Moderate, or High.

Assessment criteria (operationalized in the sketch after the list):

• Model confidence in negative classifications.

• Frequency and persistence of distress signals over time.

• Co-occurrence of multiple severe emotion labels (e.g., depression + suicidal ideation).
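One hedged way to operationalize these criteria is a small rule-based scorer; the thresholds and weights below are illustrative assumptions, not tuned values.

```python
# Hedged sketch of the rule-based severity grading described above. The
# thresholds and weighting are illustrative assumptions, not tuned values.
def grade_severity(confidence: float, distress_posts_7d: int,
                   severe_labels: set[str]) -> str:
    """Combine model confidence, recent distress frequency, and
    co-occurrence of severe labels into Low / Moderate / High."""
    score = 0
    if confidence > 0.9:          # highly confident negative classification
        score += 2
    elif confidence > 0.7:
        score += 1
    if distress_posts_7d >= 5:    # persistent distress signals over time
        score += 2
    elif distress_posts_7d >= 2:
        score += 1
    if {"depression", "suicidal_ideation"} <= severe_labels:
        score += 2                # co-occurring severe emotion labels
    return "High" if score >= 4 else "Moderate" if score >= 2 else "Low"

assert grade_severity(0.95, 6, {"depression"}) == "High"
assert grade_severity(0.60, 0, set()) == "Low"
```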
E. Personalized Resource Recommendation System

Based on the detected emotion and its severity, the system suggests relevant mental health resources.

• Low/Moderate Severity: Directs users to forums, online support communities, and self-guided therapy content.

• High Severity: Recommends immediate professional help, including crisis helplines and licensed therapy platforms.

• Recommendation Engine: Can be implemented using a rules-based decision tree or an AI-driven recommender trained on user response patterns (a rules-based sketch follows this list).
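A minimal rules-based variant of this engine can be as simple as a severity-to-resources lookup; the resource names below are placeholders, not vetted services.

```python
# Minimal rules-based sketch of the recommendation engine described
# above; resource names are placeholders, not vetted services.
RESOURCES = {
    "Low": ["peer-support forums", "self-guided CBT exercises"],
    "Moderate": ["online support communities", "guided therapy content"],
    "High": ["crisis helpline contact", "licensed therapy platforms"],
}

def recommend(emotion: str, severity: str) -> dict:
    """Map a detected emotion + severity grade to suggested resources."""
    suggestions = list(RESOURCES[severity])
    if severity == "High":
        # High severity always escalates to immediate professional help.
        suggestions.insert(0, "immediate professional help")
    return {"emotion": emotion, "severity": severity,
            "resources": suggestions}

print(recommend("depression", "High"))
```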
F. Feedback and Adaptation Loop

An optional feedback mechanism enables the system to evolve over time.

• User Feedback Integration: In chatbot-based deployments, user feedback is collected to evaluate recommendation effectiveness.

• Model Update: The pipeline supports periodic retraining to adapt to linguistic drift, evolving slang, or shifting psychological discourse patterns in social media.

Figure 1: Proposed Architecture of our solution

IV. CONCLUSION

This study presents a robust and modular framework for the early detection of mental health conditions, with a focus on depression and emotional distress, using user-generated content from social media platforms. By combining advanced transformer-based models such as BERT and RoBERTa with traditional machine learning classifiers, the system effectively captures contextual emotional cues and classifies the severity of mental health risks. The integration of a severity assessment module and a personalized recommendation engine ensures users receive appropriate support based on their emotional state. Furthermore, the system's design supports adaptability and scalability, making it suitable for real-world applications such as chatbot-based screening tools or digital health platforms. The optional feedback and retraining loop provides a mechanism for continuous learning, enabling the model to stay updated with evolving language patterns and user behaviour. Overall, this work contributes to the growing field of AI-assisted mental healthcare and lays the foundation for accessible, proactive, and personalized support systems that can help bridge existing gaps in traditional mental health services.

REFERENCES

[1] D. Ramírez-Cifuentes, C. Largeron, J. Tissier, R. Baeza-Yates, and A. Freire, "Enhanced word embedding variations for the detection of substance abuse and mental health issues on social media writings," IEEE Access, vol. 9, pp. 130449–130470, Sep. 2021.

[2] L. Ilias, S. Mouzakitis, and D. Askounis, "Calibration of transformer-based models for identifying stress and depression in social media," IEEE Trans. Comput. Social Syst., vol. 11, no. 2, pp. 1979–1991, Apr. 2024.

[3] A. Abilkaiyrkyzy, F. Laamarti, M. Hamdi, and A. E. Saddik, "Dialogue system for early mental illness detection: Toward a digital twin solution," IEEE Access, vol. 12, pp. 2007–2024, Jan. 2024.

[4] A. Pourkeyvan, R. Safa, and A. Sorourkhah, "Harnessing the power of Hugging Face transformers for predicting mental health disorders in social networks," IEEE Access, vol. 12, pp. 28025–28038, Feb. 2024.

[5] M. Sao and H.-J. Lim, "MIRoBERTa: Mental illness text classification with transfer learning on subreddits," IEEE Access, vol. 12, pp. 197454–197470, Dec. 2024.

[6] M. A. Abbas, K. Munir, A. Raza, N. A. Samee, M. M. Jamjoom, and Z. Ullah, "Novel transformer based contextualized embedding and probabilistic features for depression detection from social media," IEEE Access, vol. 12, pp. 54087–54100, Apr. 2024.

[7] H. Shoaib, R. Ali, S. M. S. Khan, and M. Adil, "Depression detection on social media posts using BERT and support vector machine," in Proc. 14th IEEE Int. Conf. Commun. Syst. Netw. Technol. (CSNT), Apr. 2025, pp. 708–713.

[8] Y. Tay, M. Dehghani, D. Bahri, and D. Metzler, "A survey of text classification with transformers: How wide? how large? how long? how accurate? how expensive?," in Proc. 58th Annu. Meeting Assoc. Comput. Linguistics (ACL), Dec. 2020, pp. 4795–4810.

[9] S. M. Dhamdhere and A. P. Patil, "Mental health safety and depression detection in social media text data: A classification approach based on a deep learning model," in Proc. IEEE Pune Sect. Int. Conf. Electr., Comput. Commun. (PuneCon), Dec. 2023, pp. 1–6.

[10] A. Al-Mutairi and D. AlShamrani, "Text mining and emotion classification on monkeypox Twitter dataset: A deep learning–natural language processing approach," in Proc. 16th Int. Conf. Develop. e-Syst. Eng. (DeSE), Oct. 2023, pp. 151–156.

[11] S. Aina and S. O. Fatumo, "A hybrid learning-architecture for mental disorder detection using emotion recognition," in Proc. IEEE 5th Int. Conf. Artif. Intell. Comput. Appl. (ICAICA), Oct. 2023, pp. 457–462.

[12] H. S. Naik, M. B. Dudhamal, and H. D. Tanna, "Mental disorder indication detection with transformer model architecture," in Proc. 8th Int. Conf. Inventive Comput. Technol. (ICICT), Mar. 2024, pp. 1560–1565.
