Literature Paper

4th Ganavi R
Dept. of CSE, Don Bosco Institute of Technology
Bengaluru, India
[email protected]

5th Dr Madhu C S
Dept. of CSE, Don Bosco Institute of Technology
Bengaluru, India
[email protected]

Abstract— This paper proposes a hybrid transformer-based framework for the early detection of mental health risks from social media text. The system analyzes content from platforms such as Twitter and Reddit and identifies emotional states such as depression, anxiety, and anger by combining transformer-based embeddings with traditional classifiers to improve detection accuracy. The model also offers personalized resource suggestions based on the severity of the detected emotions. This approach aims to support timely, scalable, and proactive mental health intervention for at-risk individuals.

Keywords— Mental Health Detection, BERT, Transformer Models, Social Media Analysis, Contextual Embeddings, Emotional State Classification, Severity Assessment, Personalized Recommendations
I. INTRODUCTION

Mental health challenges are increasingly recognized as a global concern, yet stigma, misinformation, and limited access to professional support continue to prevent many individuals from seeking timely help. As psychological distress, anxiety, and mood disorders become more prevalent, there is a growing need for systems that offer proactive, compassionate, and context-aware intervention.

Social media platforms like Twitter and Reddit have evolved into digital spaces where people regularly share their thoughts, emotions, and daily experiences. This constant flow of user-generated content provides valuable insight into individual mental states and broader emotional trends. However, while some platforms utilize automated tools to detect potential distress or flag harmful content, their interventions are often limited to basic censorship or generic support messages. These measures, though well-intentioned, lack the contextual awareness and personalization needed to truly assist individuals in crisis.

Advancements in artificial intelligence, particularly in natural language processing, have enabled systems to recognize emotional cues in text with increasing precision. Transformer-based models like BERT and its mental health–focused variants have demonstrated significant promise in identifying early signs of mental health risks from online text. Yet most current systems stop at detection and fail to close the loop between identification and real-world assistance.

This paper surveys recent advancements in transformer-based frameworks for detecting mental health risks from social media text. It examines key methodologies, compares state-of-the-art hybrid models, and emphasizes the potential of systems that not only identify emotional distress but also deliver context-aware, personalized support, drawing on user behaviour and preferences to bridge critical gaps in current digital mental health interventions.

II. RELATED WORK

1. BERT-RF for Depression Detection

This research aims to detect depression early by analyzing social media user content with machine learning techniques [6]. Advanced machine learning-based early detection of depression from social media is essential for studying depression in medicine [6]. The study introduces a novel BERT-RF feature engineering approach that combines contextualized embeddings from BERT with probabilistic features from a Random Forest model to enhance detection accuracy. By leveraging these features, the proposed framework aims to improve the precision and reliability of depression classification in social media posts, enabling timely identification of individuals at risk.

A. Methodology

This study used machine learning techniques focused on detecting depression in Twitter users by extracting tweet features [6]. It utilized a benchmark depression dataset containing 20,000 tagged English tweet user profiles categorized as depressed or non-depressed [6]. The preprocessing phase involved removing custom text
formatting, filtering stopwords using the NLTK library, and applying stemming to reduce words to their root form [6]. Words in tweets were transformed into sequences of digits based on a dictionary index [6]. The pre-processed text was passed through a pre-trained BERT model to generate contextual token embeddings while preserving semantic relationships [6]. Rather than using these embeddings directly for classification, the authors introduced an intermediate step: a Random Forest model was applied to the BERT embeddings to generate probabilistic features, effectively transforming high-dimensional embeddings into more discriminative meta-features [6]. These features were then used as input for training five classifiers: Random Forest (as a baseline), Multilayer Perceptron (MLP), K-Neighbors Classifier (KNC), Logistic Regression (LR), and Long Short-Term Memory (LSTM) [6]. Training used an 80:20 data split, with 80% of the dataset used to train the machine learning models and 20% to evaluate their performance; the split was performed with the train_test_split() method from the scikit-learn module [6]. The study also applies k-fold cross-validation to validate the results obtained from the profile segmentation process [6].

The proposed BERT-RF feature engineering approach significantly improved depression detection accuracy across models. The Logistic Regression (LR) classifier achieved the best results, with 99% accuracy, precision, recall, and F1-score [6]. Compared to baseline BERT-only features, which yielded much lower performance (e.g., an LR F1-score of 0.56), the BERT-RF method demonstrated substantial gains. This indicates that combining contextual embeddings from BERT with probabilistic features from Random Forest provides more discriminative and informative features for classification.
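To make the BERT-RF feature engineering concrete, the sketch below passes [CLS] embeddings from a pre-trained BERT model through a Random Forest, uses the forest's class probabilities as meta-features, and trains a Logistic Regression classifier on an 80:20 split. The toy texts, model choice (bert-base-uncased), and hyperparameters are illustrative assumptions, not the exact configuration reported in [6].

# Sketch of a BERT-RF style hybrid pipeline (illustrative, not the exact
# setup of [6]). Requires: torch, transformers, scikit-learn, numpy.
import numpy as np
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")

def embed(texts, batch_size=32):
    """Encode texts into BERT [CLS] embeddings."""
    vecs = []
    bert.eval()
    with torch.no_grad():
        for i in range(0, len(texts), batch_size):
            batch = tokenizer(texts[i:i + batch_size], padding=True,
                              truncation=True, max_length=128,
                              return_tensors="pt")
            vecs.append(bert(**batch).last_hidden_state[:, 0, :].numpy())
    return np.vstack(vecs)

# Toy stand-in for the labelled tweet profiles (1 = depressed, 0 = non-depressed).
texts = ["i feel so empty and alone lately", "nothing matters anymore, i am exhausted",
         "i cannot stop crying and i do not know why", "everything feels hopeless these days",
         "had an amazing weekend hiking with friends", "so proud of finishing my first marathon",
         "loving the sunny weather and good coffee", "great news at work today, feeling grateful"]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

X = embed(texts)
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.2, random_state=42)  # 80:20 split, as in [6]

# Step 1: Random Forest converts high-dimensional embeddings into
# probabilistic meta-features (out-of-fold probabilities would reduce leakage).
rf = RandomForestClassifier(n_estimators=100, random_state=42).fit(X_train, y_train)
meta_train, meta_test = rf.predict_proba(X_train), rf.predict_proba(X_test)

# Step 2: a lightweight classifier is trained on the meta-features
# (Logistic Regression performed best in [6]).
clf = LogisticRegression(max_iter=1000).fit(meta_train, y_train)
print(classification_report(y_test, clf.predict(meta_test)))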
B. Challenges

The model relies on specific language patterns that express melancholy, which raises a possible limitation: individuals may display melancholic behaviour differently in less formal settings [6].

Ethical considerations regarding mental health profiling and the responsible use of information shared on social media, including issues of user consent and privacy, also need to be addressed [6].

Finally, given the use of BERT-based embeddings, the interpretability of the prediction model needs to be further evaluated for clarity and reliability.

2. Fine-Tuned BERT Variants for Depression Detection on Twitter

This work explores approaches to supporting individuals experiencing mental illness [4]. Specifically, the study focuses on leveraging public Twitter data, namely users' tweets and bios, to train and fine-tune pre-trained BERT models for depression detection. By comparing multiple transformer variants from Hugging Face with traditional machine learning methods, the research aims to identify the most accurate and computationally efficient model for early mental health risk prediction.

A. Methodology

This study employed four pre-trained transformer models from the Hugging Face library: distilbert-base-uncased-finetuned-sst-2-english (DBUFS2E), bert-base-uncased (BBU), mental-bert-base-uncased (MBBU), and distilroberta-base (DRB). These models had previously been fine-tuned on large corpora including reviews, tweets, and various other text sources [4]. The dataset used was Autodep, automatically collected via the Twitter API; it included over 11.8 million tweets and 553 bio-descriptions, representing users who had disclosed mental health-related information [4].

Data preprocessing involved multiple cleaning and normalization steps. The tweets and bios were first filtered for English text and stripped of retweets, mentions, URLs, emojis, and special characters. All text was converted to lowercase and extra spaces were removed. Tokenization was performed using the NLTK library, which segmented the input into word-level tokens. Common stop words were removed using NLTK's built-in list, and lemmatization was applied via the WordNetLemmatizer to standardize inflected words into their base forms. Stemming was also tested, although lemmatization produced slightly better outcomes [4].

Following preprocessing, each dataset (tweets and bios) was divided into training and testing subsets using 10-fold cross-validation. Each fold was trained for five epochs using Hugging Face's Trainer method, and model-specific tokenizers were used to convert the cleaned text into token sequences for model input. The training process optimized the models for binary classification, distinguishing between depressed and non-depressed classes. Performance evaluation was carried out using metrics including accuracy, F1 score, and AUC, computed during training through an evaluation function [4]. These metrics helped assess the precision, recall, and overall classification effectiveness of the models.

The final depression detection models were developed by fine-tuning these BERT variants on the cleaned Twitter data. Among all models, DBUFS2E yielded the highest performance, achieving an AUC of 0.98 on tweets and 0.96 on bios, demonstrating its strong capability to capture depression-related signals [4].
Preprocessing pipeline (a minimal sketch follows the list):

• Noise Removal: Elimination of URLs, emojis, hashtags, and special characters.

• Normalization: Conversion to lowercase, lemmatization or stemming for lexical uniformity.

• Language Filtering: Retention of only English-language content to ensure model compatibility.
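A minimal sketch of such a cleaning pipeline is given below. The regular expressions and the use of the langdetect package for language filtering are illustrative choices, not steps prescribed by the surveyed papers.

# Sketch of the noise-removal, normalization, and language-filtering steps.
# Requires: nltk, langdetect (both choices are illustrative).
import re
import nltk
from langdetect import detect
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer

nltk.download("stopwords", quiet=True)
nltk.download("wordnet", quiet=True)

STOPWORDS = set(stopwords.words("english"))
lemmatizer = WordNetLemmatizer()

def preprocess(post):
    """Return a cleaned token list, or None if the post is not English."""
    # Noise removal: URLs, mentions, hashtags, emojis and other special characters.
    text = re.sub(r"http\S+|www\.\S+", " ", post)
    text = re.sub(r"[@#]\w+", " ", text)
    text = re.sub(r"[^A-Za-z\s]", " ", text)
    # Normalization: lowercase and collapse whitespace.
    text = re.sub(r"\s+", " ", text).strip().lower()
    # Language filtering: keep English content only.
    try:
        if not text or detect(text) != "en":
            return None
    except Exception:
        return None
    # Stopword removal and lemmatization for lexical uniformity.
    return [lemmatizer.lemmatize(tok) for tok in text.split() if tok not in STOPWORDS]

print(preprocess("Feeling so empty today... #sad https://example.com"))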
Based on the detected emotion and its severity, the system suggests relevant mental health resources.

• Low/Moderate Severity: Directs users to forums, online support communities, and self-guided therapy content.

• High Severity: Recommends immediate professional help, including crisis helplines and licensed therapy platforms.

• Recommendation Engine: Can be implemented using a rules-based decision tree or an AI-driven recommender trained on user response patterns (a minimal rule-based sketch is shown below).
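In the rules-based variant, the mapping can be as simple as thresholding a severity score, as in the sketch below; the threshold value, emotion labels, and resource lists are hypothetical placeholders rather than values specified by the framework.

# Minimal rule-based sketch of severity-to-resource mapping.
# Thresholds and resource lists are hypothetical placeholders.
from dataclasses import dataclass

@dataclass
class Assessment:
    emotion: str      # e.g. "depression", "anxiety", "anger"
    severity: float   # model-estimated severity score in [0, 1]

LOW_MODERATE_RESOURCES = ["Peer-support forums and online communities",
                          "Self-guided therapy and psychoeducation content"]
HIGH_SEVERITY_RESOURCES = ["Crisis helpline contact information",
                           "Licensed online therapy platforms"]

def recommend(assessment, high_threshold=0.7):
    """Map a detected emotion and its severity to suggested resources."""
    if assessment.severity >= high_threshold:
        return {"urgency": "high", "resources": HIGH_SEVERITY_RESOURCES}
    return {"urgency": "low/moderate", "resources": LOW_MODERATE_RESOURCES}

print(recommend(Assessment(emotion="depression", severity=0.82)))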
F. Feedback and Adaptation Loop

An optional feedback mechanism enables the system to evolve over time; a simple sketch of such a loop follows the list.

• User Feedback Integration: In chatbot-based deployments, user feedback is collected to evaluate recommendation effectiveness.

• Model Update: The pipeline supports periodic retraining to adapt to linguistic drift, evolving slang, or shifting psychological discourse patterns in social media.
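One possible shape for such a loop is sketched below: recent chatbot feedback is buffered, and retraining is triggered when the observed helpfulness rate drops. The feedback schema, window size, threshold, and retrain hook are all hypothetical.

# Sketch of a feedback-and-retraining loop; names and thresholds are
# hypothetical illustrations of periodic adaptation.
from collections import deque

class FeedbackLoop:
    def __init__(self, retrain_fn, window=500, min_helpful_rate=0.6):
        self.retrain_fn = retrain_fn        # e.g. fine-tune on freshly labelled posts
        self.window = deque(maxlen=window)  # recent (post, recommendation, helpful) records
        self.min_helpful_rate = min_helpful_rate

    def record(self, post, recommendation, helpful):
        """Store chatbot feedback on whether a suggested resource helped."""
        self.window.append((post, recommendation, bool(helpful)))
        if len(self.window) == self.window.maxlen:
            rate = sum(h for _, _, h in self.window) / len(self.window)
            if rate < self.min_helpful_rate:
                # Recommendation quality has drifted: retrain on recent data.
                self.retrain_fn(list(self.window))
                self.window.clear()

loop = FeedbackLoop(retrain_fn=lambda batch: print(f"retraining on {len(batch)} samples"),
                    window=2, min_helpful_rate=0.9)
loop.record("i feel hopeless", "crisis helpline", helpful=True)
loop.record("nothing helps", "self-guided content", helpful=False)  # triggers retraining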
Figure 1: Proposed Architecture of our solution

IV. CONCLUSION

This study presents a robust and modular framework for the early detection of mental health conditions, with a focus on depression and emotional distress, using user-generated content from social media platforms. By combining advanced transformer-based models such as BERT and RoBERTa with traditional machine learning classifiers, the system effectively captures contextual emotional cues and classifies the severity of mental health risks. The integration of a severity assessment module and a personalized recommendation engine ensures users receive appropriate support based on their emotional state. Furthermore, the system's design supports adaptability and scalability, making it suitable for real-world applications such as chatbot-based screening tools or digital health platforms. The optional feedback and retraining loop provides a mechanism for continuous learning, enabling the model to stay updated with evolving language patterns and user behaviour. Overall, this work contributes to the growing field of AI-assisted mental healthcare and lays the foundation for accessible, proactive, and personalized support systems that can help bridge existing gaps in traditional mental health services.

REFERENCES

[5] M. Sao and H.-J. Lim, “MIRoBERTa: Mental illness text classification with transfer learning on subreddits,” IEEE Access, vol. 12, pp. 197454–197470, Dec. 2024.

[8] Y. Tay, M. Dehghani, D. Bahri, and D. Metzler, “A survey of text classification with transformers: How wide? How large? How long? How accurate? How expensive?,” in Proc. 58th Annu. Meeting Assoc. Comput. Linguistics (ACL), Dec. 2020, pp. 4795–4810.

[9] S. M. Dhamdhere and A. P. Patil, “Mental health safety and depression detection in social media text data: A classification approach based on a deep learning model,” in Proc. IEEE Pune Sect. Int. Conf. Electr., Comput. Commun. (PuneCon), Dec. 2023, pp. 1–6.

[10] A. Al-Mutairi and D. AlShamrani, “Text mining and emotion classification on monkeypox Twitter dataset: A deep learning–natural language processing approach,” in Proc. 16th Int. Conf. Develop. e-Syst. Eng. (DeSE), Oct. 2023, pp. 151–156.

[…] “… transformer model architecture,” in Proc. 8th Int. Conf. Inventive Comput. Technol. (ICICT), Mar. 2024, pp. 1560–1565.