Revise 10


AGUSAN DEL SUR STATE COLLEGE OF AGRICULTURE

AND TECHNOLOGY

BACHELOR OF SCIENCE IN INFORMATION SYSTEM

JERA MAE C. BAGUIOS

ALLIAH JANE B. RUFANO

Predictive Analysis of Facebook for Mental Health Crises among

ASSCAT Students using Random Forest Algorithm

Anthonette Camosa-Azares

Capstone Adviser
AUGUST 2023
Introduction

The creation of social media platforms in recent years has changed the

way individuals communicate and engage online. These platforms, especially

Facebook, have evolved into potent instruments for self-expression,

knowledge sharing, and social networking. However, in addition to their

benefits, social media platforms provide unique concerns, particularly in

terms of student mental health. Concerns have been expressed among

educators, parents, and mental health experts about the rising occurrence

of mental health crises among students. Predictive analysis employing

machine learning algorithms such as Random Forest has attracted considerable

interest as a way to detect these mental health crises early and help mitigate them.

Random Forest is a versatile and robust algorithm that has been shown to

perform well on complex data and to deliver accurate

predictions. Random Forest can provide useful insights in the context of

social media and mental health by examining multiple data points and

patterns linked with social media usage.

The potential of social media data in understanding mental health

issues among students has been highlighted in the existing research. Early

detection of mental health issues allows specialists to treat them more

effectively and improves patients’ quality of life. Mental health concerns

one’s psychological, emotional, and social well-being. It affects how

one thinks, feels, and acts. Mental health is important at every stage of

life, from childhood and adolescence through adulthood. For instance, a

study by Vaishnavi, K., Kamath, U. N., et al. (2022) implemented five machine

learning techniques, one of which was the Random Forest Algorithm, and

compared their accuracy in identifying mental health issues using several

accuracy criteria. The most accurate was the Stacking technique, with a

prediction accuracy of 81.75%. Also, Sun, Q., & Ding, H. (2023, July)

investigated suicide risk assessment methods based on the random forest

algorithm. Random matching and traversal matching algorithms were

used to optimize the combination of hyperparameters, which mitigated the

model's fitting problem and improved its prediction accuracy. The

results of their study can be applied to early intervention in suicide

prevention. The authors also reflect on the model and propose

improvements to it.

The current gap in knowledge lies in the limited application of

Random Forest predictive analysis to social media data for mental health

crisis prediction among students. While traditional statistical methods have

provided valuable insights, the utilization of Random Forest offers the

opportunity to improve prediction accuracy and identify critical risk factors

associated with mental health crises.

The main purpose of this study is to identify ASSCAT students who

may be experiencing mental health crises, based on their Facebook posts,

using the Random Forest Algorithm. By leveraging this methodology, the

researchers will build a predictive model that can detect early signs of mental

health issues based on social media usage patterns. This model can support

early intervention and the provision of appropriate support to students who

may be at risk.

Statement of the Problem

• Student behavior on social media may indicate a mental health crisis

before it occurs.

• Percentage of students experiencing a mental health crisis at each

college.

• Early intervention strategy and proper assistance for students with a

Mental Health Crisis.

Objectives of the Study

This study aims to identify ASSCAT students who may be experiencing

mental health crises, based on their Facebook posts, using the Random

Forest Algorithm.

1. To detect early signs of a Mental Health Crisis based on Social Media

usage patterns.

2. To determine the percentage of students per college who suffer a

Mental Health Crisis.

3. To provide an early intervention plan and appropriate

support to students (Intervention and Support Plan for Students).


Scope and Limitations of the Study

Scope

• The study will focus on ASSCAT students as the target population.

• The study will use Facebook posts and interviews as the primary data

sources for the predictive model.

• The study will use the Random Forest machine learning algorithm.

• The study will only focus on detecting early signs of Mental Health

Crisis on social media, specifically Facebook.

• The study will provide early intervention and appropriate

support to students.

Delimitation

• The study will not include data from other sources such as medical

records.

• The study will not diagnose mental health conditions or provide

individualized treatment recommendations.

• The study will not guarantee the accuracy or reliability of the

predictive model, as it is based on the data available at the time of

analysis.

• The study will not include other social media platforms.


Conceptual Framework

Input

• Qualitative Data:

- Google Forms

• Quantitative Data:

- Facebook data, including post content, comments, and likes

Process

• Quantitative Analysis: Feature extraction from Facebook data for analysis.

• Integration of Insights: Qualitative insights integrated with quantitative

features to form a comprehensive understanding.

• Random Forest Algorithm: Application of Random Forest for prediction

using integrated data.

Output

• Predictive analysis of early signs of mental health crisis

• Prevention plan and Treatment plan

Figure 1. Overall Conceptual Framework

In this process, qualitative data from interviews and quantitative data

from Facebook are combined to gain a comprehensive understanding of the

relationship between mental health and Facebook use. A machine learning

technique will be used to predict potential outcomes and to inform

mitigation strategies.


Review of Related Literature

An Improved Bagging Ensemble in Predicting Mental Disorder Using


Hybridized Random Forest - Artificial Neural Network Model

Machine Learning primarily offers the process of data collection,

identification, pre-processing, training, validation, and visualization. The

problem of late identification of mental problems among IT professionals is

addressed in this study. Many cases of mental illness go unnoticed

or undiagnosed until they become catastrophic.

Adeniji, O. D., Adeyemi, S. O., & Ajagbe, S. A. (2022) experimented with a

hybrid Random Forest and Artificial Neural Network (RF-ANN) model

to predict the likelihood of IT professionals developing mental problems.

The hybrid model demonstrated a considerable improvement

in performance over the separate performances of the RF and ANN

models. Compared with the parameter-tuned RF,

the hybrid model showed a modest improvement in performance. This

implies that an enhanced dataset might be explored with the RF-ANN model

and the results compared with the findings of that work.

A Machine Learning Approach for Predicting Suicidal Thoughts and


Behaviours Among College Students

In the study of Macalli, M., Navarro, et al. (2021), suicidal thoughts and

behaviors are described as common among college students. However, little is

known about screening strategies for identifying high-risk students. The

authors sought to construct a risk algorithm to

identify the key determinants of suicidal thoughts and behaviors among

college students within one year of baseline evaluation. They employed

random forest models with 70 possible

variables evaluated at baseline, including sociodemographic and family

traits, mental health, and drug use, to predict suicidal thoughts and

behaviors at follow-up. The area under the receiver operating

characteristic curve (AUC),

sensitivity, and positive predictive value were used to assess model

performance. The authors identified a concise set of mental health

characteristics that accurately predicted one-year suicidal thoughts and

behaviors in a community sample of college students.
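The three evaluation measures named above can be computed with scikit-learn. The following is a minimal sketch only, not the cited authors' code: the function name `screening_metrics`, the 0.5 decision threshold, and the labels and scores are illustrative assumptions.

```python
from sklearn.metrics import confusion_matrix, roc_auc_score


def screening_metrics(y_true, y_score, threshold=0.5):
    """Return AUC, sensitivity, and positive predictive value (PPV).

    y_true holds 0/1 labels; y_score holds predicted probabilities.
    The threshold is an assumed cut-off, not a value from the literature.
    """
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    return {
        "auc": roc_auc_score(y_true, y_score),
        # Sensitivity: share of true crisis cases that were flagged.
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        # PPV: share of flagged cases that were true crisis cases.
        "ppv": tp / (tp + fp) if (tp + fp) else 0.0,
    }


# Made-up labels and scores, for illustration only.
y_true = [0, 0, 1, 1, 0, 1]
y_score = [0.1, 0.4, 0.35, 0.8, 0.7, 0.9]
metrics = screening_metrics(y_true, y_score)
print(metrics)
```

Sensitivity and PPV depend on the chosen threshold, while AUC summarizes ranking quality across all thresholds, which may be why studies of this kind report all three.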

The psychological effects of the COVID-19 worldwide pandemic are

still poorly understood from a scientific perspective. According to earlier

studies, demographic characteristics including gender and age are linked to

higher levels of distress during a global health crisis. The goal of the

study by Prout et al. (2020) was to find factors that might predict psychological suffering

during the COVID-19 pandemic. Clinical levels of anxiety, sadness, and

post-traumatic stress disorder were more common overall than they would

have been in the absence of a pandemic and at greater rates than those

observed among healthcare personnel and those who had survived severe

acute respiratory syndrome. The most effective indicators of distress were

found using a random forest machine learning method. To pinpoint those

who are more susceptible to anxiety, depression, and post-traumatic stress

disorder, regression trees were created. Greater distress was linked to

somatization and less reliance on adaptive defense mechanisms.

Their findings emphasize the value of evaluating people's bodily

manifestations of psychological distress and their ability to regulate their

emotions in order to assist mental health professionals in adjusting

evaluations and treatments during a major public health emergency (Prout,

T. A., Zilcha-Mano, S., Aafjes-van Doorn, et al., 2020).

According to Chancellor, S., & De Choudhury, M. (2020), social media

is currently used to assess health outcomes and model mental well-being.

To predict the presence of certain mental disorders and

symptomatology, such as depression, suicidality, and anxiety, computer

scientists are applying quantitative methodologies. According to a study by

Huljanah, M., Rustam, Z., Utama, S., & Siswantining, T. (2019, June), data

may be classified using the Random Forest Classifier by building

decision trees. Precision generally improves as more trees are

used. The Random Forest Classifier can handle large sample sets and can

classify data with missing features. The feature selection procedure

is crucial because it can affect classification precision.

Research on good mental health and improvements in health,

development, and lifespan spans more than 40 years (Van Agteren, J., et al.,

2021). A random forest classifier is a randomized ensemble of decision trees

that recursively partition the dataset into homogeneous or close to

homogeneous terminal nodes. It may contain hundreds to thousands of

trees that are grown on bootstrap samples of the original data. The final

decision is obtained when the tree branching process terminates and

provides the expected forecasting results given the series of events in the

tree (Dinov, I. D., 2018).


METHODOLOGY

This study aims to apply Random Forest Algorithm

predictive analysis to Facebook data and survey data to identify patterns and

indicators associated with Mental Health Crisis among ASSCAT students.

DATA IDENTIFICATION

DATA COLLECTION (scikit-learn, Python): INTERVIEWS AND FACEBOOK POSTS; SURVEY

DATA TRANSLATION (QUILLBOT)

STATISTICAL ANALYSIS

DATA CLEANING (Pandas, Python)

FEATURE EXTRACTION (TF-IDF)

INTEGRATION EXTRACTION (CountVectorizer & TF-IDF)

RANDOM FOREST ALGORITHM

PREDICTIVE ANALYSIS

PREVENTION PLAN
Figure 2. Overall Methodology

Data Identification

In this study, the researchers will identify the relevant data sources,

namely interviews and Facebook posts, that provide the necessary

information. This could involve collecting social media data, student

demographic information, and student rating data.

Data Collection

Google Sheets CSV file


Facebook Posts

Figure 3. Data Collection through Statistical Analysis and Pandas

Figure 3 above shows how the researchers will obtain the data from

ASSCAT students: through interviews and through their Facebook accounts

and posts.

Data Translation

Collected Data -> Google Translator -> Translated Data

Figure 4. Data Translation

As shown in Figure 4, in this stage the researchers will use QuillBot

to translate the collected data. The researchers will open QuillBot, select

the text they want to translate, and then, on the toolbar, select

Translator > Translate. Once the data is translated, the researchers will

download the translated documents.

Data Cleaning

The collected dataset will undergo preprocessing to handle missing

values, remove duplicates, and normalize the data. This step ensures the

dataset's quality and prepares it for further analysis.

Translated Data -> Handle missing values, remove duplicates, and normalize data -> Cleaned Data

Figure 5. Data cleaning using Pandas (Python)

As shown in Figure 5, the researchers will clean the data using

Pandas (Python). This step removes noise,

irrelevant information, duplicates, and any sensitive content, which improves

the quality of the data before further processing.
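The cleaning step described above could be sketched with Pandas as follows. This is an illustration under assumptions, not the study's actual script: the column name `post_text`, the function name `clean_posts`, and the sample rows are hypothetical, and the removal of sensitive content is not covered here because it depends on rules the researchers would need to define.

```python
import pandas as pd


def clean_posts(df: pd.DataFrame, text_col: str = "post_text") -> pd.DataFrame:
    """Handle missing values, normalize text, and remove duplicates."""
    out = df.copy()
    # Missing values: drop rows that have no text at all.
    out = out.dropna(subset=[text_col])
    # Normalization: lowercase, strip URLs, collapse repeated whitespace.
    out[text_col] = (
        out[text_col]
        .astype(str)
        .str.lower()
        .str.replace(r"https?://\S+", " ", regex=True)
        .str.replace(r"\s+", " ", regex=True)
        .str.strip()
    )
    # Drop rows that became empty, then drop exact duplicate posts.
    out = out[out[text_col] != ""]
    return out.drop_duplicates(subset=[text_col]).reset_index(drop=True)


# Made-up rows, for illustration only.
raw = pd.DataFrame(
    {
        "post_text": [
            "I feel  so tired",
            "i feel so tired",
            None,
            "See https://example.com now",
            "   ",
        ]
    }
)
cleaned = clean_posts(raw)
print(cleaned)
```

Normalizing before de-duplicating means that posts differing only in case or spacing are treated as duplicates, which is a design choice the researchers may or may not want.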

Feature Extraction
Feature extraction using TF-IDF involves transforming Facebook

posts into a TF-IDF feature matrix, which represents the importance of

terms within each post. These features will be used to train the predictive

model and identify potential mental health crises among ASSCAT students.

Integration Extraction

After extracting features from both sources, the researchers will

integrate them by concatenating the document-term matrices horizontally.

This will create a combined matrix where each row corresponds to a data

point (interview or post) and each column corresponds to a unique word or

term.
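Horizontal concatenation only works when both matrices have the same number of rows, so the sketch below assumes one row per student, with that student's Facebook text and interview text aligned by position. That alignment, the use of `CountVectorizer` for the interview text, and the example strings are all assumptions made for illustration.

```python
from scipy.sparse import hstack
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

# Made-up texts: index i in both lists refers to the same (hypothetical) student.
facebook_texts = ["feel hopeless tonight", "fun trip with classmates"]
interview_texts = ["trouble sleeping lately", "doing fine this semester"]

tfidf = TfidfVectorizer()
counts = CountVectorizer()

X_posts = tfidf.fit_transform(facebook_texts)        # TF-IDF weights
X_interviews = counts.fit_transform(interview_texts)  # raw term counts

# Stack the two document-term matrices side by side (same rows, more columns).
X_combined = hstack([X_posts, X_interviews]).tocsr()
print(X_combined.shape)
```

If interviews and posts were instead kept as separate rows, the matrices would need a shared vocabulary and vertical stacking, so the choice of row unit should be settled before this step.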

Random Forest Algorithm

After the combined feature matrix is built, the researchers will train a

Random Forest classifier on mental health categories, evaluate the model's

performance, and extract feature importances to understand which terms from

the text contribute to different aspects of mental health crises.

Prevention Plan

After training the algorithm and evaluating the model’s performance,

the researchers will ensure that the prevention plan respects students' privacy, provides

accurate information, and avoids causing harm. The researchers will consult

the guidance counselors about the best methods to align

the preventive strategy with the school’s prevention plan as part of the

methodology.
