AP Final Project - Group 2

The document discusses a research project aimed at developing a predictive model to anticipate deteriorating mental health in students and prevent student suicides. It analyzes factors contributing to students' mental health issues and reviews literature on suicide prevention. The project uses techniques like machine learning, NLP and MDA to analyze student data and predict self-harm instances with 69% accuracy.

A PROJECT REPORT ON

Sentinel Minds: Using Predictive Analysis to Protect Student Welfare

Submitted in partial fulfillment of the requirements for the

Two-year full-time MBA - Business Analytics

For the Course


Analytics Project (Trimester III)

Submitted by
Name              SAP ID         Roll Number
Anujj Misra       80672300213    A011
Anushka Shukla    80672300051    A012
Karan Pandey      80672300153    A024
Stuti Saran       80672300177    A054
Yash Shah         80672300075    A061

Submitted to Faculty Mentor,


Prof. Siby Abraham
School of Business Management (SBM), SVKM'S NMIMS, MUMBAI – 400 056
27th February 2024
Acknowledgement

We sincerely thank Dr. Siby Abraham for all his help during this research, including his
insightful comments, persistent support, and helpful counsel. His knowledge and support
have been crucial in forming this research.

We further thank Ms. Malvika Rao, Counsellor (Psychologist) at SVKM’s Narsee Monjee
Institute of Management Studies for her guidance and views to help us successfully navigate
through this sensitive topic.

Additionally, we would like to sincerely thank our classmates for their unwavering
cooperation and support. Their supportive comments and helpful critiques have helped us to
improve our concepts and methods.

We also appreciate Swaranka Pethe, a committed psychology student from the University of
Derby, for her insightful comments throughout the sessions. Her knowledge and perceptions
have improved our comprehension of the intricacies involved in mental health concerns.

Without the kind collaboration and participation of the students who generously offered their
experiences and viewpoints, this research would not have been feasible. Their readiness to
have direct, unbiased, and open conversations has been crucial in illuminating the variables
affecting students' mental health and suicide rates.
Abstract

In today's educational milieu, students confront unprecedented mental health challenges, exacerbated by academic demands, social detachment, and the pervasive influence of digital technology. The escalating prevalence of anxiety, depression, and burnout among students underscores the pressing need for comprehensive support and resources to address this critical issue. This research endeavors to construct a predictive analytics model aimed at anticipating deteriorating mental health in students and averting instances of student suicides. Leveraging an extensive dataset comprising past behavioral, academic, and demographic information, the study incorporates numerous variables, encompassing social interactions, academic attainment, and prior mental health history. Employing sub-fields of Artificial Intelligence such as Machine Learning, Natural Language Processing (NLP), and Multivariate Data Analytics (MDA), the objective is to develop a predictive model capable of discerning patterns and trends indicative of heightened risks of suicidal ideation. By leveraging Natural Language Processing to analyze extensive social media data, coupled with the utilization of four machine learning techniques, the research distinguishes between suicidal ideation and depressive thoughts. Through comparative analysis, the study ascertains that Artificial Neural Networks exhibit a predictive accuracy of 70%. Additionally, employing Random Forests, a machine learning model is devised to predict instances of self-harm with 69% accuracy, based on socio-economic and psychological parameters extracted from student profiles. Furthermore, the research endeavors to identify pivotal factors contributing to depressive ideation or inclination toward self-harm utilizing MDA techniques like Factor Analysis and Regression. The underlying impetus for this project is to provide timely and tailored support to students identified as at-risk for suicide, to mitigate suicide risk within campus communities, and to enhance the overall well-being of students in educational institutions.

Introduction

The surge in student suicides documented by the National Crime Records Bureau (NCRB) highlights a concerning trend in India's educational sphere. With 13,089 student suicides reported in 2021 alone, representing a staggering 70% increase from 2011, the gravity of the situation cannot be overstated. On average, nearly 36 students took their own lives each day throughout the year, shedding light on the profound mental health challenges faced by young learners across the nation.

This alarming rise underscores the urgent need for a comprehensive approach to address the complex interplay of factors contributing to deteriorating mental health within educational settings. Academic pressures, social isolation, and the pervasive influence of digital technology are among the myriad stressors exacerbating mental health issues among students. The relentless pursuit of academic excellence, coupled with the pressures of navigating a hypercompetitive environment, often takes a toll on students' psychological well-being. Moreover, the omnipresence of social media platforms, while offering connectivity and convenience, can foster feelings of inadequacy, comparison, and isolation, further exacerbating mental health challenges.

Ultimately, the overarching goal of this research is to provide timely and tailored support to students identified as at-risk for suicide, thereby mitigating the risk of self-harm within campus communities and enhancing the overall well-being of students in educational institutions. Through rigorous analysis and targeted interventions, we aspire to create a safer and more supportive environment conducive to the holistic development and flourishing of every student.

Literature Review & Research Gap

Educational institutions serve as vital arenas for the holistic development of individuals, encompassing academic, social, and professional realms. They foster the acquisition of foundational knowledge and the cultivation of skills essential for future endeavours. However, the transitional phase into academia can pose challenges, potentially leading to psychological distress among students. Moreover, the competitive nature inherent in academic environments, marked by the pursuit of excellence and overwhelming workloads, coupled with interpersonal conflicts, can exacerbate stress levels, escalating the risk of mental health issues, including suicidal tendencies.

Suicidal behaviour, a multifaceted phenomenon influenced by various factors such as biological, psychological, social, and environmental elements, presents a significant global concern. It manifests in different forms, from ideation to attempted and completed suicide, with profound impacts on individuals, families, and societies at large.

According to the World Health Organization (WHO) in August 2023, every year more than 700,000 people take their own lives, with suicide being the fourth leading cause of death among 15–29-year-olds globally in 2019. Moreover, over 77% of global suicides occurred in low- and middle-income countries in 2019. (World Health Organization, August 2023)

Addressing this pressing issue necessitates a comprehensive understanding of its determinants and risk factors, spanning individual experiences (such as adverse childhood events, mental disorders, and substance abuse) to broader environmental influences (including violence, socioeconomic disparities, and lack of social support). Hence, a multifaceted approach is imperative for effective prevention and intervention strategies. (Machado et al., 2021)

Recognizing the limitations in data availability, particularly regarding suicide-related statistics, innovative approaches leveraging artificial intelligence (AI) and machine learning (ML) have emerged to enhance risk assessment and early detection. By harnessing extensive datasets and collaborative efforts, these technologies facilitate proactive measures to safeguard student well-being. (Bernert et al., 2020) In conclusion, prioritizing the welfare of students requires concerted efforts to mitigate the risk of suicide through evidence-based interventions and strategic initiatives, underpinned by interdisciplinary insights and technological advancements.

Goals, Objectives, and Methodology

The primary objective of the research project is threefold:
a. To identify the underlying factors contributing to the emergence of suicidal ideation and depressive episodes among students;
b. To develop a robust framework capable of distinguishing between suicidal intentions and manifestations of depression, thereby facilitating timely interventions for at-risk individuals;
c. To establish predictive models for estimating the likelihood and severity of self-harm based on student profiles.

The project is delineated into four distinct modules, each aligned with specific objectives. We employ 3 distinct AI techniques - Multivariate Data Analytics, Natural Language Processing, and Machine Learning - to draw insights and predict the possibility of suicidal thoughts in a student. The following figure explains the flow of the project:
Fig. 1.1 Flowchart of the Project
3.1 Collection of reliable data: In our scholarly endeavor, we meticulously selected three distinct datasets to pursue varied objectives, employing each artificial intelligence technique judiciously. This investigation conscientiously scrutinizes anonymized data sourced from publicly accessible repositories to delve into the realms of student depression and suicidal ideation.

1. Dataset 1: Derived from a cohort of anonymized patients, the dataset encapsulates a spectrum of physical and social determinants germane to student mental health, meticulously curated and assessed to generate depression scores under the purview of qualified psychologists.

2. Dataset 2: Scraped from the social media platform Reddit, this dataset constitutes a corpus of user-generated content, comprising posts delineating poignant narratives of suicidal ideation and depressive episodes.

3. Dataset 3: Encompasses comprehensive student profiles, integrating multifaceted socio-economic and psychological parameters, thereby enriching our analytical framework.

3.2 Employing AI/Analytical Techniques:

3.2.1. Multivariate Data Analyses: Multivariate Data Analysis (MDA) is utilized in this project due to its ability to simultaneously analyze multiple variables and their interrelationships, offering a comprehensive understanding of complex datasets. The methodology involves extracting patterns, trends, and correlations from various socio-economic and psychological parameters to discern factors contributing to depressive ideation and suicidal behavior among students. This holistic approach facilitates targeted interventions and support strategies tailored to address the multifaceted nature of student mental health challenges.

a. Factor analysis: A statistical technique utilized to uncover the underlying structure of a set of variables and comprehend how they collectively influence a specific phenomenon. In the context of student well-being and academic performance, factor analysis was conducted on two distinct datasets. The first dataset, centered on stress levels among students, aimed to identify primary factors contributing to students' well-being. Subsequently, the second dataset, focusing on the impact of depression on academic performance, sought to delineate the underlying factors associated with depressive symptoms. Through factor analysis, these methodologies enable a comprehensive understanding of the multifaceted dimensions of student experiences, informing the development of tailored interventions and support systems to address their unique needs.

b. Regression analysis: Regression analysis is a statistical method used to examine the relationship between a dependent variable and one or more independent variables. In the context of this study, regression analysis was employed to understand how various factors contribute to depression among students. The dependent variable in this analysis was depression, while independent variables included factors such as anxiety levels, self-esteem, sleep quality, academic stressors, social support, and physiological markers like headache and blood pressure. By analyzing these variables, regression analysis helps identify which factors have a significant impact on depression scores and to what extent.

c. Multidimensional Scaling: MDS attempts to represent objects in a low-dimensional space while preserving the relationships they have in their original high-dimensional space. Therefore, the spatial arrangement of points in the reduced-dimensional space reflects the relationships or similarities between the objects. Depression often co-occurs with other mental health conditions or is influenced by various social and environmental factors. MDS can help visualize the relationships between these factors in a low-dimensional space. This allowed us to explore how different factors, like social support and negative life events, might interact and contribute to the severity or specific aspects of depression.

3.2.2. Natural Language Processing: Leveraging data from social media websites such as Reddit, we develop a classification model that can differentiate between depressed and suicidal individuals by analyzing textual data for sentiment, linguistic patterns, and keywords indicative of depression or suicidal ideation. Scraped specifically from two subreddits, r/SuicideWatch and r/Depression, the dataset contains 1,895 total posts. To achieve this, we began with vectorization. In this case, a vectorizer, such as TF-IDF (Term Frequency-Inverse Document Frequency) or CountVectorizer, can convert textual data into numerical representations suitable for machine learning algorithms.
TF-IDF assigns weights to terms based on their frequency in a document and inverse frequency across all documents, while CountVectorizer counts the frequency of terms in a document. To perform the classification, we employed 4 different machine learning models, which are as follows:

a. Logistic Regression: Logistic regression is a classic algorithm used for binary classification tasks like distinguishing between depressive thoughts and suicidal ideation. It models the probability that a given input belongs to one of two classes using a logistic function. By fitting a linear combination of the features to the data and applying the sigmoid function, logistic regression predicts the likelihood of an instance belonging to the positive class (suicidal ideation) or the negative class (depressive thoughts) based on its features.

b. Support Vector Machine: SVM is a powerful algorithm for both binary and multiclass classification tasks. In the context of classifying between depression and suicidal ideation, SVM works by finding the optimal hyperplane that best separates the two classes in the feature space. By maximizing the margin between the classes and identifying the support vectors closest to the decision boundary, SVM effectively discriminates between depressive and suicidal instances, even in high-dimensional spaces.

c. Artificial Neural Networks: ANN is a versatile machine learning model inspired by the structure and functioning of the human brain. In the context of classifying depression versus suicidal ideation, ANN learns complex patterns and relationships between features through multiple layers of interconnected neurons. By adjusting the weights and biases of these connections during training, ANN can capture intricate nonlinearities in the data, allowing it to make accurate predictions based on input features.

d. Random Forests: Random Forests are an ensemble learning method that leverages the power of decision trees for classification tasks. In the case of distinguishing between depression and suicidal ideation, Random Forests create a multitude of decision trees using bootstrapped subsets of the data and random feature selection. By aggregating the predictions of these individual trees, Random Forests mitigate overfitting and provide robust classifications based on the majority vote of the trees, making them well-suited for handling complex and noisy data while achieving high predictive accuracy.

3.2.3 Machine Learning: While other techniques of the project work on specific factors that affect the mental health of students, we aim to build a model that analyzes various socio-economic and psychological factors in a holistic sense. The model will predict the degree of self-harm that the student may indulge in due to these external factors. The algorithm used to make predictions is Random Forests, a powerful ensemble learning technique that is particularly well-suited to address the multifaceted nature of mental health prediction in students. By harnessing the collective strength of multiple decision trees, Random Forests excel in analyzing complex datasets comprising diverse socio-economic and psychological factors. Through their inherent ability to capture nonlinear relationships and interactions among variables, Random Forests facilitate comprehensive insights into the intricate interplay of external factors impacting student mental health. Thus, by leveraging Random Forests as the algorithm of choice, we endeavor to develop a predictive model capable of discerning nuanced patterns and trends, ultimately contributing to the timely identification and mitigation of student self-harm risks.

3.2.4 Tableau: Incorporating advanced visualization techniques, specifically leveraging the capabilities of Tableau, enabled us to explore the intricate relationships between various factors influencing suicidal thoughts and depression among students. Through interactive visualizations, we gained invaluable insights into the degree of correlation, and possible causal links, among our independent and dependent variables. Importantly, by visualizing our results in an accessible manner, we ensure that our findings are understandable by a broader audience and can be effectively utilized in future research endeavors and practical applications aimed at addressing student mental health challenges. In essence, Tableau served as a powerful tool for elucidating the multifaceted nature of student mental health, enabling us to derive actionable insights crucial for the development of our predictive model.
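The vectorization step described in Section 3.2.2 can be illustrated with a small worked example. The sketch below is not the project's code: it uses the textbook weighting tf × log(N / df) on three made-up sentences, and the function name `tf_idf` and the sample texts are assumptions introduced purely for illustration. Library vectorizers typically apply smoothing and normalization on top of this, so their numbers would differ.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)  # raw counts: what a count vectorizer would store
        total = len(tokens)
        weights.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return weights


# Hypothetical sample texts, not taken from the project's dataset.
docs = [
    "exams make me anxious",
    "i feel anxious and tired",
    "i feel tired of exams",
]
weights = tf_idf(docs)
for doc, w in zip(docs, weights):
    print(doc, {t: round(v, 3) for t, v in w.items()})
```

Terms that occur in only one document (such as "make") receive a higher weight than terms shared across documents (such as "exams"), which is the property that makes TF-IDF useful for separating classes of text.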
4. Results

4.1 MDA Component: We derived the following results from MDA techniques:

a) Factor Analysis

The factor analysis addressed the question: "How do psychosocial factors, including stress levels, depression, anxiety, social support, and academic challenges, collectively influence students' well-being and academic performance?" The factor analysis conducted on Dataset 1, focusing on stress levels among students, revealed two primary factors: "General well-being" and "Social Connectedness." Factor 1 explained 60.49% of the variance and was characterized by variables related to psychological distress and academic challenges, while Factor 2 explained 5.708% of the variance and represented social support and physical health. In Dataset 2, examining the effects of depression on academic performance, Factor 1 represented a "General Depression Factor" loaded with symptoms like feeling depressed, low self-esteem, and suicidal thoughts, while Factor 2 indicated an "Anxiety Arousal Factor" loaded with symptoms such as insomnia and fidgetiness. These findings highlight the interconnectedness of psychosocial factors influencing students' well-being and academic success, underscoring the importance of holistic interventions tailored to address the multifaceted nature of students' experiences.

Both datasets identified factors related to psychological distress, such as depression, anxiety, and low self-esteem, as well as factors associated with academic challenges, social support, and physical health. While Dataset 1 highlighted factors of general well-being and social connectedness, Dataset 2 emphasized factors of general depression and anxiety arousal. This suggests that while there are overlapping themes, each dataset provides unique insights into specific aspects of students' mental health and academic performance. The factors identified in both datasets demonstrate the interconnected nature of psychosocial factors and their impact on students' well-being and academic outcomes. Factors related to depression, anxiety, social support, and academic challenges are intertwined, indicating the need for comprehensive interventions that address multiple dimensions of students' experiences.
Fig.4.1.1, Fig.4.1.2

Fig.4.1.3: Factor 1 - General Wellbeing; Factor 2 - Social Connectedness

Fig.4.1.4: Factor 1 - General Depression Factor; Factor 2 - Anxiety Arousal Factor

While there are similarities between the factors identified in the two datasets, such as factors related to depression and anxiety, there are also notable differences. Dataset 1 emphasized broader domains like general well-being and social connectedness, while Dataset 2 focused more specifically on symptoms of depression and anxiety arousal. These differences suggest that while there are common themes across student experiences, each dataset offers unique perspectives on the factors influencing students' mental health and academic performance. Together, these datasets provide comprehensive insights into the complex interplay of psychosocial factors affecting students, informing the development of targeted interventions and support systems tailored to address their specific needs.
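To make the "percentage of variance explained" figures above more concrete, the following sketch shows one common way such shares are obtained: the eigenvalues of the item correlation matrix, each divided by their total. This is an assumption about the extraction method, which the report does not state, and the data are synthetic (six hypothetical survey items driven by one latent factor), so the numbers printed are not the project's results.

```python
import numpy as np

rng = np.random.default_rng(0)
n_students = 500

# Synthetic data: six hypothetical survey items that all load on one
# shared latent factor, plus independent noise.
latent = rng.normal(size=(n_students, 1))
items = latent @ np.full((1, 6), 0.8) + 0.6 * rng.normal(size=(n_students, 6))

# Eigenvalues of the correlation matrix, largest first.
corr = np.corrcoef(items, rowvar=False)
eigenvalues = np.linalg.eigvalsh(corr)[::-1]

# Share of total variance attributed to each extracted factor.
explained = eigenvalues / eigenvalues.sum()
print(np.round(explained * 100, 2))
```

In this synthetic setup the first factor dominates because every item was built from the same latent variable; a pattern like the 60.49% versus 5.708% split reported above would arise in a similar way when most items share one underlying dimension.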
b) Regression

Dependent Variable = Depression

Fig.4.1.5
The results of the regression analysis reveal several noteworthy findings. Firstly, anxiety levels, self-esteem, and sleep quality emerge as significant predictors of depression, with higher levels of anxiety and lower self-esteem associated with increased depressive symptoms. This aligns with existing literature highlighting the bidirectional relationship between anxiety and depression, as individuals with anxiety disorders are at heightened risk for developing depression, and vice versa. Moreover, poor sleep quality has been consistently linked to depression, underscoring the importance of addressing sleep disturbances in mental health interventions.

Furthermore, academic stressors, such as study load and future career concerns, also exert a significant influence on depression scores. The pressures and expectations associated with academic performance and future aspirations can exacerbate feelings of inadequacy and hopelessness, contributing to depressive symptoms among students. Similarly, challenges in the teacher-student relationship and engagement in extracurricular activities are associated with variations in depression scores, highlighting the impact of social support and meaningful engagement on mental well-being.

One notable finding from the regression analysis is the positive relationship between depression and factors like headache, blood pressure, and stress level. These physiological markers of stress and arousal underscore the intricate interplay between psychological and physiological processes in depression. Chronic stress and physiological arousal have been implicated in the pathogenesis of depression, affecting neurotransmitter systems, neuroendocrine function, and inflammatory processes implicated in depressive symptomatology.

Importantly, the inclusion of depression as the dependent variable in this regression model holds significant implications for understanding the link between depression and suicide. Depression is a well-established risk factor for suicidal behavior, with individuals experiencing depressive symptoms being at heightened risk for suicidal ideation, suicide attempts, and completed suicide. The hopelessness, despair, and emotional pain characteristic of depression can overwhelm individuals' coping mechanisms and lead to suicidal thoughts and behaviors as a perceived means of escape from suffering.
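A regression of the kind discussed above, with depression as the dependent variable, can be sketched with ordinary least squares. The example below is illustrative only: the data are synthetic, the three predictors are a subset chosen for brevity, and the coefficients used to generate the outcome are invented, so nothing here reproduces the values in Fig.4.1.5.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 300

# Synthetic, standardized predictors (hypothetical stand-ins for survey scores).
anxiety = rng.normal(size=n)
self_esteem = rng.normal(size=n)
sleep_quality = rng.normal(size=n)

# Synthetic outcome built from known, invented coefficients plus noise.
depression = (
    2.0 + 0.8 * anxiety - 0.5 * self_esteem - 0.3 * sleep_quality
    + 0.1 * rng.normal(size=n)
)

# Design matrix with an intercept column, solved by least squares.
X = np.column_stack([np.ones(n), anxiety, self_esteem, sleep_quality])
coef, *_ = np.linalg.lstsq(X, depression, rcond=None)

for name, value in zip(["intercept", "anxiety", "self_esteem", "sleep_quality"], coef):
    print(f"{name:>14}: {value:+.3f}")
```

The sign of each fitted coefficient indicates the direction of association (positive for anxiety, negative for self-esteem and sleep quality in this synthetic setup); assessing significance, as the report does, would additionally require standard errors from a statistics package.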
c) MDS

Fig.4.1.6
Dimension 1 - Observable symptoms
Dimension 2 - Instantly Addressable symptoms
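As a rough illustration of how a two-dimensional configuration like the one in Fig.4.1.6 is obtained from pairwise dissimilarities, the sketch below implements classical (Torgerson) MDS. The report does not say which MDS variant or software was used, so this choice is an assumption, and the four toy points stand in for the symptom dissimilarities, which are not available here.

```python
import numpy as np


def classical_mds(dist, n_components=2):
    """Embed objects in n_components dimensions from a distance matrix."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-centre the squared distances to recover a Gram matrix.
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_components]
    # Scale the leading eigenvectors by the square roots of their eigenvalues.
    return eigvecs[:, order] * np.sqrt(np.clip(eigvals[order], 0, None))


# Toy objects with known positions, used only to build a distance matrix.
points = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0], [3.0, 4.0]])
dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

coords = classical_mds(dist)
recovered = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
print(np.round(recovered, 3))
```

The recovered coordinates may be rotated or reflected relative to the original points, which is why MDS dimensions need to be interpreted after the fact rather than read as specific features.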

The MDS plot reveals distinct clusters based on symptom characteristics. Well-separated points (e.g., Fidgety, Lack of Concentration) indicate high dissimilarity, while proximity suggests greater similarity (e.g., Poor Appetite, Feeling Tired). Dimensions represent abstract variations, not specific features. Dimension 1, tentatively labeled "Observable Symptoms," groups readily apparent symptoms like Fidgety and Lack of Concentration. Dimension 2, potentially "Instantly Addressable Symptoms," includes Poor Appetite, Feeling Tired, and Lack of Concentration, suggesting potential for immediate intervention. However, these interpretations are tentative and require further investigation for confirmation and refinement.
4.2 NLP:

Following meticulous data processing and vectorization procedures, diverse machine-learning algorithms were employed to delineate classifications between 'depressed' and 'suicidal'. A comparative analysis of these models yielded the following results:

a. Logistic Regression: Within the framework of Natural Language Processing (NLP) for classifying depression and suicidal behavior, textual features are leveraged to ascertain the likelihood of each class, thereby enabling discrimination between depressed and suicidal individuals. A model was meticulously constructed to undergo training on the dataset, subsequently facilitating predictions regarding the classification of input text as either 'depressed' or 'suicidal'.

The accuracy achieved using logistic regression is about 72%. Upon taking user input, the model is also able to make an accurate classification between ‘depressed’ and ‘suicidal’.

b. Support Vector Machines (SVM): In NLP, distinguishing between depression and suicidal behavior utilizes text features to determine an optimal decision boundary, effectively separating the two classes. By analyzing textual data, SVM learns patterns in the feature space, identifying linguistic cues indicative of depression or suicidal ideation. SVM seeks to maximize the margin between classes while minimizing classification errors. It transforms text samples into numerical representations, facilitating the creation of a hyperplane that best separates depressed and suicidal individuals. Evaluation metrics like accuracy, precision, and recall assess SVM's performance in classifying individuals based on their textual expressions.

[Figure: SVM's performance in classifying individuals based on their textual expression]

The accuracy achieved using SVM is about 68%. Upon taking user input, the model is also able to make an accurate classification between ‘depressed’ and ‘suicidal’.

c. Artificial Neural Network (ANN): To distinguish depression from suicidal behavior, ANNs utilize text features to learn complex patterns and relationships. ANN processes textual data through multiple layers of interconnected neurons, extracting meaningful representations. By iteratively adjusting weights and biases, ANN optimizes its performance in predicting depression or suicidal tendencies. It learns to recognize subtle linguistic cues indicative of each class, allowing for accurate classification. ANN's ability to capture non-linear relationships and adapt to diverse textual contexts makes it effective in discerning between depressed and suicidal individuals based on their language expressions. Evaluation metrics such as accuracy and F1-score gauge ANN's efficacy in classifying textual data. The number of epochs used is 70.

The accuracy achieved using ANN is about 71%. Upon taking user input, the model is also able to make accurate classifications between ‘depressed’ and ‘suicidal’.

d. Random Forests: An ensemble of decision trees is used to analyze textual features. Each tree independently processes text data, making predictions based on subsets of features. By aggregating predictions from multiple trees, Random Forest creates robust classification models, capable of capturing diverse linguistic patterns indicative of depression or suicidal ideation. It handles non-linear relationships between text features effectively, providing reliable predictions. Random Forest's ability to handle high-dimensional data and mitigate overfitting enhances its performance in identifying individuals expressing depressive or suicidal sentiments. Evaluation metrics such as accuracy and F1-score assess Random Forest's efficacy in NLP classification tasks.

The accuracy achieved using Random Forests is about 58%. Upon taking user input, the model is also able to make accurate classifications between ‘depressed’ and ‘suicidal’.

The comparative accuracy of each model is given below:

Logistic Regression: about 72%
Support Vector Machine: about 68%
Artificial Neural Network: about 71%
Random Forests: about 58%
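The comparison workflow of Section 4.2 (vectorize the text, train several classifiers, compare test accuracy) can be sketched as follows. This is a minimal illustration assuming scikit-learn, not the project's actual code: the helper `compare_models`, the hyperparameters, the placeholder labels `class_a` and `class_b`, and the synthetic token corpus are all assumptions, and the Reddit dataset is deliberately not reproduced.

```python
import random

from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC


def compare_models(texts, labels, seed=0):
    """Fit four TF-IDF + classifier pipelines and return their test accuracy."""
    x_train, x_test, y_train, y_test = train_test_split(
        texts, labels, test_size=0.25, random_state=seed, stratify=labels
    )
    models = {
        "logistic_regression": LogisticRegression(max_iter=1000),
        "svm": LinearSVC(),
        "ann": MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=seed),
        "random_forest": RandomForestClassifier(n_estimators=100, random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        pipeline = make_pipeline(TfidfVectorizer(), model)
        pipeline.fit(x_train, y_train)
        scores[name] = accuracy_score(y_test, pipeline.predict(x_test))
    return scores


# Synthetic placeholder corpus: each class has its own tokens plus shared filler.
rng = random.Random(0)
vocab = {"class_a": ["alpha", "beta", "gamma"], "class_b": ["delta", "epsilon", "zeta"]}
shared = ["today", "really", "just"]
texts, labels = [], []
for label, words in vocab.items():
    for _ in range(40):
        texts.append(" ".join(rng.choices(words, k=3) + rng.choices(shared, k=2)))
        labels.append(label)

scores = compare_models(texts, labels)
for name, score in scores.items():
    print(f"{name}: {score:.2f}")
```

Because the placeholder classes use disjoint vocabularies, the synthetic task is far easier than real posts; accuracies on real data, such as those listed above, would be expected to be much lower and to differ between models.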
4.3 Machine Learning:

In this study, we applied the Random Forests algorithm to a comprehensive dataset comprising diverse student profiles. This dataset encompassed various factors including educational attainment, ownership of electronic devices, and dietary habits, among others. These factors were leveraged to ascertain the extent of self-harming tendencies among students. Subsequently, a rigorous assessment of the model's efficacy was undertaken to gauge its predictive capabilities and performance accuracy.

The results indicate that the model can predict with an accuracy of about 67.6%.

Additionally, we conducted a feature importance analysis to discern the variables exerting the greatest influence on the classifier's predictive outcomes. Our findings indicate that factors such as the number of electronic devices possessed by the student and the manifestation of symptoms related to lethargy or restlessness significantly impact the degree of self-harm contemplated by the individual. This is depicted in Fig.4.3.1.

Fig.4.3.1
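A feature importance analysis of the kind summarized in Fig.4.3.1 can be sketched with scikit-learn's Random Forest implementation. The example is a simplified illustration under stated assumptions: the column names are hypothetical, the data are synthetic, and the target is reduced to a binary label, whereas the project predicts a degree of self-harm from real student profiles.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 600

# Hypothetical column names; the real dataset's columns are not reproduced here.
feature_names = ["num_devices", "lethargy_score", "diet_score", "education_level"]
X = np.column_stack([
    rng.integers(0, 6, n),   # num_devices
    rng.integers(0, 4, n),   # lethargy_score
    rng.integers(0, 4, n),   # diet_score
    rng.integers(0, 3, n),   # education_level
])

# Synthetic binary target driven only by the first two columns plus noise.
y = ((X[:, 0] + 2 * X[:, 1] + rng.normal(0, 1, n)) > 5).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)

# Impurity-based importances; they sum to 1 across features.
ranked = sorted(
    zip(feature_names, model.feature_importances_), key=lambda p: p[1], reverse=True
)
print(f"accuracy: {accuracy:.3f}")
for name, importance in ranked:
    print(f"{name:>16}: {importance:.3f}")
```

In this synthetic setup the two columns that generate the label rank highest, mirroring how the analysis above singles out the most influential variables.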
4.4 Tableau:

Utilizing Tableau, our study has derived insightful visualizations that offer a nuanced comprehension of the dataset, revealing previously unnoticed patterns and inferences. These visual representations serve as a pivotal component in elucidating complex data relationships, facilitating a deeper understanding of the factors influencing student mental health and well-being. By leveraging Tableau's advanced visualization capabilities, we have unearthed valuable insights crucial for informing targeted interventions and support strategies tailored to address the multifaceted challenges encountered by students in educational settings.

Insights derived -

I. Using the ML project dataset in Tableau, we explored the relationship between self-harm thoughts and student accommodation, focusing on three categories: Private rented accommodation, Home (with parents), and University Hall of residence.
Analyzing self-harm thoughts within these accommodation categories reveals distinct patterns.
7.67% of students who stay at home with parents are rated 3 or 4. Students residing with parents might experience higher levels of dependency and perceived expectations, potentially leading to increased pressure. This dependency and pressure may contribute to elevated levels of stress and, in some cases, escalate to more severe suicidal thoughts.
8.23% of students staying in private rented accommodation are rated 3 or 4. Students living independently in private rented accommodation may experience feelings of isolation, as they are away from the support systems provided by family or university halls. Students managing their accommodation in private rentals often face additional financial and academic stressors. The absence of a strong support network can contribute to heightened emotional distress and an increased likelihood of more severe suicidal thoughts.
Fig.4.4.1 - Accommodation v/s Self-Harm Thoughts
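
The per-category percentages behind this figure can be sketched as a grouped share calculation. The example assumes that each percentage uses the number of students in that accommodation category as its denominator, and it uses hypothetical field names (`accommodation`, `rating`) and made-up records rather than the project dataset.

```python
from collections import defaultdict


def share_high_rating(records, group_key, rating_key, high=(3, 4)):
    """Percentage of records in each group whose rating is in `high`."""
    totals = defaultdict(int)
    highs = defaultdict(int)
    for rec in records:
        group = rec[group_key]
        totals[group] += 1
        if rec[rating_key] in high:
            highs[group] += 1
    # Within-group percentage: high-rated count over the group's own size.
    return {g: 100.0 * highs[g] / totals[g] for g in totals}


# Made-up records for illustration only (not study data).
records = [
    {"accommodation": "Home", "rating": 1},
    {"accommodation": "Home", "rating": 3},
    {"accommodation": "Private", "rating": 4},
    {"accommodation": "Private", "rating": 4},
    {"accommodation": "Private", "rating": 0},
    {"accommodation": "Hall", "rating": 2},
]

for group, pct in share_high_rating(records, "accommodation", "rating").items():
    print(f"{group}: {pct:.2f}% rated 3 or 4")
```

If the percentages were instead computed against the whole sample, the denominator in the last line of the function would be the total record count rather than the group size.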
II. Utilizing the ML project dataset in Tableau, we focused on two key columns: "suicidal thoughts" (yes/no) and "education level." The objective was to illustrate the percentage of students reporting suicidal thoughts within each education level category, specifically high school, Bachelor's, and Master's.
The overall analysis revealed that 15.91% of students report suicidal thoughts. When broken down by education level, the results indicate that 19.35% of high school students, 14.71% of bachelor's students, and 8.33% of master's students report such thoughts.
With age and experience, individuals tend to develop better coping mechanisms to handle stressors and challenges. Older students may have acquired a more robust set of coping skills, enabling them to navigate academic and personal pressures more effectively.
Despite variations across education levels, the fact that 15.91% of students overall report suicidal thoughts raises a general concern about the mental health of students, emphasizing the need for targeted interventions and support services in educational institutions.

Fig.4.4.2
III. When we were initially plotting the data, no clear inference was apparent. However, after some comparative analysis, a notable trend has emerged. In the age group of 10-14 years, the suicide percentage is lower compared to other age brackets, but it is showing the most significant increase over time. This pattern is observed in many countries, with similarities across various brackets. Notably, in several Asian countries, there is a faster rate of increase in suicide percentages among 10-14-year-olds compared to other age groups. Taking India as an example, it is evident that, except for the 10-14 age group, the suicide percentage decreased from 2001 to 2009. From 2010 to 2017, it remained relatively constant for all age brackets except the 10-14 group, where a notable increase in percentage has been observed since 2005.

Fig.4.4.3
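
One way to make the comparison of growth rates across age brackets concrete is to fit a least-squares slope to each bracket's yearly percentages and compare the slopes. This is only a sketch of that idea, not the method used in the report. The bracket labels and all values below are hypothetical placeholders, not figures from the dataset.

```python
def slope(years, values):
    """Ordinary least-squares slope of values against years."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    den = sum((x - mean_x) ** 2 for x in years)
    return num / den


def fastest_growing(series_by_group, years):
    """Return the group with the largest slope, plus all slopes."""
    slopes = {g: slope(years, v) for g, v in series_by_group.items()}
    return max(slopes, key=slopes.get), slopes


# Placeholder values for illustration only (not study data).
years = [2010, 2011, 2012, 2013]
series = {
    "10-14": [1.0, 1.2, 1.4, 1.6],   # low level, steady rise
    "15-19": [5.0, 5.0, 5.1, 5.0],   # higher level, roughly flat
}

group, slopes = fastest_growing(series, years)
print(group, slopes)
```

The slope is measured in percentage points per year, so a bracket with a low level can still show the steepest increase, which is the pattern described above.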
5. Conclusion
In conclusion, this research represents a significant stride toward addressing the escalating mental health challenges confronting students in today's educational landscape. By harnessing advanced analytical methodologies and artificial intelligence, our study endeavors to construct predictive models capable of preempting deteriorating mental health trajectories and mitigating instances of student suicides. Through a meticulous analysis of diverse behavioral, academic, and demographic variables, alongside sophisticated techniques such as machine learning, natural language processing, and multivariate data analytics, we've gained invaluable insights into the complex factors contributing to suicidal ideation and depressive episodes among students.
Through MDA, nuanced factors influencing students' well-being emerged. Factor analysis unveiled "General well-being" and "Social Connectedness" in Dataset 1, while Dataset 2 revealed "General Depression Factor" and "Anxiety Arousal Factor." Regression showed anxiety, self-esteem, and sleep quality as depression predictors, alongside academic stressors, and future concerns.
We used NLP to make the distinction between depressive and suicidal thoughts to identify students who require urgent intervention and care. We used 4 machine learning models for classification. Out of the 4 models, comparative analysis shows that using logistic regression gave the best classification results with an accuracy of 72%. This technique helps us identify at-risk youth using their sentiments extracted from their social media activities. We also analyze the holistic socio-economic and psychological profiles using machine learning to predict the degree of self-harm that a student might be contemplating. The data is extracted from real-life patients. Using this algorithm, we were able to predict with an accuracy of 67.6%. We were also able to narrow down the factors that affected our dependent variable of self-harm the most. Continued refinement and validation of these models, along with collaborative engagement among stakeholders, will be vital in translating research insights into actionable strategies that prioritize student mental health and facilitate positive change.
6. Future Scope

Moving forward, the future scope of this project encompasses several key avenues for further exploration and refinement. Firstly, ongoing efforts will focus on enhancing the predictive accuracy of our models through the incorporation of additional data sources and the refinement of algorithmic parameters. Collaborative partnerships with educational institutions and mental health organizations will be pivotal in augmenting our datasets and validating the efficacy of our predictive frameworks in diverse real-world settings. It's important to note that the data used for this particular project came only from the patients who consented to their information being released. Therefore, including more data from a wider pool of individuals could yield even better results. Hence, gathering more data will be required to ensure the robustness and generalizability of our predictive models.

Furthermore, the integration of real-time monitoring capabilities and intervention protocols will enable proactive identification and support for students at risk of mental health crises, fostering a culture of preventive mental healthcare within educational communities. Additionally, future iterations of this project will prioritize the development of personalized interventions tailored to the unique needs and circumstances of individual students, leveraging advanced analytics and artificial intelligence to deliver targeted support resources and strategies. Through ongoing research, validation, and stakeholder engagement, we aspire to catalyze a paradigm shift in student mental health support, wherein proactive identification, intervention, and support mechanisms become integral components of educational ecosystems, nurturing the holistic well-being and resilience of students across diverse socio-cultural contexts. Ultimately, the future trajectory of this project is anchored in a steadfast commitment to leveraging cutting-edge technology and interdisciplinary collaboration to address the complex and pressing challenges facing student mental health in the 21st century.
Declaration of Competing Interest
We have no competing interests to declare.

Data Availability
The authors do not have permission to
share data.
References

[1] World Health Organization: WHO. (2023a, August 28). Suicide.
https://www.who.int/news-room/fact-sheets/detail/suicide

[2] Machado, C. D. S., Ballester, P. L., Cao, B., Mwangi, B., Caldieraro, M. A., Kapczinski, F., & Passos, I. C. (2021). Prediction of suicide attempts in a prospective cohort study with a nationally representative sample of the US population. Psychological Medicine, 52(14), 2985–2996.
https://doi.org/10.1017/s0033291720004997

[3] De Oliveira Crispim, M., Santos, C. M. R. D., Da Silva Frazão, I., De Queiroz Frazão, C. M. F., Rameh-De-Albuquerque, R. C., & Perrelli, J. G. A. (2021). Prevalence of suicidal behavior in young university students: A systematic review with meta-analysis. Revista Latino-americana De Enfermagem, 29.
https://doi.org/10.1590/1518-8345.5320.3495

[4] Bernert, R. A., Hilberg, A. M., Melia, R., Kim, J., Shah, N. H., & Abnousi, F. (2020). Artificial Intelligence and Suicide Prevention: A Systematic Review of Machine Learning Investigations. International Journal of Environmental Research and Public Health, 17(16), 5929.
https://doi.org/10.3390/ijerph17165929

[5] Aseltine, R. H., Jr., & Schoenborn, C. A. (2016). Depression and suicidal thoughts among college students. Journal of Adolescent Health, 59(1), 118-124.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3057910/

[6] Bhattacharya, P., & Chawla, N. (2020). Mental health and suicidal ideation among university students: A meta-analysis. Journal of Affective Disorders, 264, 294-302.
https://www.cambridge.org/core/journals/psychological-medicine/article/prevalence-of-suicidal-thoughts-and-behaviours-among-college-students-a-metaanalysis/F31360A7411B35C4AC3B1A8DA67FA016

[7] Garcia, N. M., & Calvete, E. (2019). Risk factors for suicidal ideation in university students: A systematic review and meta-analysis. Clinical Psychology Review, 73, 101859.
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0261785

[8] Racine, N. E., Cunningham, N. A., Liu, H., & Weems, C. F. (2019). Depression and suicidal ideation among college students: A review of risk factors and protective factors. Educational Psychologist, 54(1), 64-92.
https://pubmed.ncbi.nlm.nih.gov/17559087/

[9] Eaton, N. R., Keyes, K. M., Krueger, R. B., & Blazer, D. G. (2010). Major depressive disorder in the United States: Prevalence, correlates, and sociodemographic characteristics. The American Journal of Psychiatry, 167(12), 1516-1525.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9483000/

[10] Liu, X., & He, J. (2016). Depression and suicidal ideation among Chinese college students: A meta-analysis. Journal of Affective Disorders, 203, 341-350.
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0104368

[11] Serafini, L., Amore, M., De Berardis, D., & Russo, E. (2019). Suicidal risk factors and protective factors in adolescents: A systematic review. Journal of Affective Disorders, 245, 170-190.
https://www.scielo.br/j/rbp/a/3bYbDB7dXFr6jtvsdhc6bYb/?lang=en

[12] Eisenberg, D. (2011). Depression and anxiety disorders in college students. American Psychologist, 66(7), 599-612.
https://pubmed.ncbi.nlm.nih.gov/21543948/

[13] Prather, R. M., & John, D. M. (2014). The prevalence of suicidal thoughts and behaviors among high school students in the United States: Population-based estimates from the 2011 Youth Risk Behavior Survey. Journal of Adolescent Health, 55(1), 86-92.
https://www.cdc.gov/mmwr/preview/mmwrhtml/ss6104a1.htm
