JOURNAL OF SOFT COMPUTING AND DATA MINING VOL. 2 NO. 1 (2021) 39-48

© Universiti Tun Hussein Onn Malaysia Publisher’s Office

Journal of Soft Computing and Data Mining (JSCDM)
Journal homepage: http://penerbit.uthm.edu.my/ojs/index.php/jscdm
e-ISSN : 2716-621X

Predicting Depression Using Social Media Posts

Fahem Abu Bakar1, Nazri Mohd Nawi1*, Abdulkareem A. Hezam1
1 Faculty of Computer Science and Information Technology,
Universiti Tun Hussein Onn Malaysia (UTHM), Parit Raja, 86400 Batu Pahat, Johor, MALAYSIA

*Corresponding Author

DOI: https://doi.org/10.30880/jscdm.2021.02.02.004
Received 15 June 2021; Accepted 01 October 2021; Available online 15 October 2021

Abstract: The use of Social Network Sites (SNS) is on the rise, particularly among the younger
generations. Social media sites let users communicate their interests, feelings, and everyday routines.
Many studies show that properly utilizing user-generated content (UGC) can aid in determining
people's mental health status. UGC could aid in the prediction of mental health conditions, particularly
depression, a significant medical condition that impairs one's ability to work, learn, eat, sleep, and enjoy
life. Information about a person's mood and negativism can be gathered from their SNS user profile.
Therefore, this study uses SNS as a data source and applies machine learning models to screen users
and categorize them based on their mental health. The performance of three machine learning models in
classifying the UGC is evaluated: Decision Forest, Neural Network, and Support Vector Machine (SVM).
The results show that the Neural Network model matches the Support Vector Machine (SVM) model in
average accuracy (78.27%) and average recall (0.042), but performs better in average precision. This
indicates that the Neural Network model is the best model for making predictions to determine the level
of depression by using social media posts.

Keywords: Social media, machine learning, decision forest, neural network, Support Vector Machine (SVM)

1. Introduction
The vast volume of information available on social media reflects users' behavioral characteristics and can be
put to use [1]. Using that data to anticipate a social media user's mental health status can assist a psychiatrist,
family member, or friend in providing timely medical advice and therapy to a depressed user [2]. According to the World
Health Organization (WHO) [3], depression affects roughly 350 million people worldwide today. Depression is one
of the world's most deadly disorders [4]. Furthermore, almost two-thirds of depressed persons do not seek adequate
treatment, resulting in serious repercussions [5]. Medical science relies on asking patients questions about their
circumstances, which does not allow for a precise diagnosis of depression [6]. During two weeks, the patient must
attend more than one session. A False Positive problem exists when a non-depressed condition is classified as
depressed [7]. Researchers discovered, however, that electronic health record (EHR) systems are not well-suited to
merging behavioral health and basic care. Documenting and tracking data for behavioral health problems such as
depression is not supported by EHRs [8].
According to eMarketer [9], the number of social media users in 2015 was about 2 billion, and it is growing every
day. The majority of people use social media to convey their feelings, emotions, and day-to-day activities. Social
media has been proven to be a safe space for many people to vent their negative feelings by sharing material that
reflects those emotions [10]. Many studies have demonstrated that data from social media can be utilized to help
people maintain their mental health. Researchers can gain a thorough picture of a user's natural behavior by mining
users' social media posts [11]. By using machine learning models, researchers can gather information about a
person's mood, activities, sleep hours, thinking style, interactions, guilt feelings, worthlessness, loneliness, and
helplessness from a user's social media profile. Retrieving such behavioral attributes reveals depression symptoms in
social media users, which could be utilized to determine whether or not the user is depressed [12]. Machine
learning models can thus help psychiatrists, parents, and friends detect a user's early symptoms and intervene
before the depressed person enters a serious stage of depression.

2. Project Background
Depression is a type of mood illness characterized by a persistent sense of melancholy and loss of interest. There
are several different types of depressive disorders, each with its own set of symptoms. Major Depressive Disorder
(MDD) is the most frequent type of depressive disorder, and it affects people's ability to work, study, eat, sleep, and
have fun. To diagnose a major depressive episode, the patient must exhibit five or more of the following nine
symptoms for at least two weeks and virtually every day [13], [14]. The first symptom is having a sad mood for the
majority of the day. The second symptom is a general lack of interest in practically all activities. The third symptom
is weight loss or increase, as well as excessive sleeping. Body agitation or retardation is the fourth symptom [15].
The fifth symptom is exhaustion or a lack of energy. The sixth is a feeling of remorse or worthlessness. Difficulty
concentrating, thinking, or making a decision is the seventh symptom. The eighth symptom is difficulty sleeping or
sleeping too much. The ninth and last symptom is the only one that does not have to occur daily. It is the thought of
death, a suicide attempt, or a plan to commit suicide.
Asking patients questions about their symptoms cannot precisely diagnose depression, indicating that medical
science does not have complete confidence in the approaches employed to diagnose depression [2]. Nowadays, the
usage of social media is on the rise, particularly among the younger population. Users can access their SNS at any
time and from any location using their smartphones, PCs, or laptops [16]. Social media sites let users communicate
their interests, feelings, and everyday routines. User-generated content (UGC) from any social media platform could
be used to study human health behaviors [17]. Many studies have shown that using UGC correctly can help people
maintain their mental health or identify problems at an early stage. Mining users' social media posts may give a
complete picture of their natural behavior, which could aid in the prediction of depression. Approaches that identify
social media users' depression phases based on their individually authored content might yield a precise assessment
of a user's mental health. What is needed, therefore, is a method to predict a patient's level of depression through
social media. The objective of the study is to analyze depression on a dataset collected from an online public source.
The data collected will be analyzed, and the performance of machine learning models in predicting depression levels
will be compared. Two scopes have been set to guide this project toward its objectives. First, user data is extracted
from social media (Twitter); user posts can help to classify users according to mental health, and a picture of the
patient's natural behavior is built from their written posts. Second, the user data is collected from the Kaggle
website.

3. Related Studies
Many researchers have used machine learning techniques such as the Random Forest Tree (RFT), the Support
Vector Machine (SVM), and the Convolutional Neural Network (CNN) to collect and classify data from websites to
predict depression. Table 1 compares the results of relevant studies from other articles.

Table 1 - Comparison result between various articles

Method                               Total accuracy   Total precision

[18] Depression Detection by Analyzing Social Media Posts of User (SNS: Twitter & Facebook)
  SVM                                78%              1.000
  Naïve Bayes                        74%              1.000

[19] Social media as a measurement tool of depression in populations (SNS: Twitter)
  SVM                                73%              0.820

[20] Predicting Depression via Social Media (SNS: Twitter)
  SVM                                70%              0.705

[21] Detecting Arabic Depressed Users from Twitter Data (SNS: Twitter)
  Random Forest                      83%              85.7
  Naïve Bayes                        75%              75.8
  AdaBoostM1                         55%              56.4
  Liblinear                          87.5%            87.6

[22] Predicting Anxiety, Depression, and Stress in Modern Life using Machine Learning Algorithms (SNS: -)
  Decision Tree, Anxiety             73%              0.458
  Decision Tree, Depression          78%              0.731
  Decision Tree, Stress              63%              0.599
  Random Forest, Anxiety             71%              0.431
  Random Forest, Depression          79%              0.881
  Random Forest, Stress              72%              0.731
  Naïve Bayes, Anxiety               73%              0.459
  Naïve Bayes, Depression            85%              0.822
  Naïve Bayes, Stress                74%              0.548
  Support Vector Machine, Anxiety    67%              0.403
  Support Vector Machine, Depression 80%              0.820
  Support Vector Machine, Stress     66%              0.672
  K Nearest Neighbour, Anxiety       69%              0.449
  K Nearest Neighbour, Depression    72%              0.750
  K Nearest Neighbour, Stress        71%              0.719

Different researchers have used different machine learning algorithms to predict psychiatric diseases, and the
results of the algorithms have varied depending on the context. No single algorithm has been determined
to be the best in all circumstances. Therefore, several machine learning algorithms were used in the current study to
identify the symptoms of anxiety, depression, sadness, and stress. The performance of those machine learning
algorithms in earlier work motivates this research to select some of the popular algorithms for the selected datasets.

4. Methodology/ Framework
This project follows the standard process of conducting experiments and simulations by selecting the CRISP-DM
model as the methodology for this research study. The CRISP-DM [23] (Cross-Industry Standard Process for Data
Mining) is the standard method used by most researchers in the data mining area to guide data mining activities. It
comprises descriptions of common project phases, tasks associated with each phase, and an explanation of the
interconnections between these activities as a methodology [24]. CRISP-DM presents an overview of the data mining
life cycle as a process model.

4.1 Data Collection

The dataset for this study was obtained from its author, Narendra Sahu. It was collected from Twitter and includes
posts from several random users. The posts were scraped with the Twint tool to find all tweets related to depression,
using the keywords hopeless, depressed, and suicide. The dataset consists of tweets from more than 10,314 users and
contains six fields, including "ID", "Username", "Tweet Text", "Prediction Score" and "Sentiment". The contents of
the dataset are described in more detail in Table 2 below.
The author used a dictionary that contains words with their polarity. Each word taken from a tweet is compared
with the dictionary and given a score, and the polarity scores are summed for each tweet. A tweet whose result is 0 is
a negative tweet, which stands for a non-depressed post. A tweet whose result is 1 is a positive tweet, which marks
the post as a depressing tweet. In this way, tweets are classified as negative and positive. The evaluated tweets were
then fed into the machine learning models.
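The dictionary-based labelling described above can be sketched roughly as follows. This is a minimal illustration, not the dataset author's actual procedure: the lexicon entries, their weights, the sign convention (depression-related words carry positive weight), and the threshold are all assumptions made for the example.

```python
# Hypothetical polarity lexicon: words and weights are illustrative only.
LEXICON = {"hopeless": 1, "depressed": 1, "suicide": 1, "happy": -1, "great": -1}

def polarity_sum(tweet, lexicon=LEXICON):
    """Sum the polarity of every word in the tweet that appears in the lexicon."""
    return sum(lexicon.get(word, 0) for word in tweet.lower().split())

def label_tweet(tweet, threshold=0):
    """Return 1 (positive, depressed) when the summed polarity exceeds the
    assumed threshold, otherwise 0 (negative, non-depressed)."""
    return 1 if polarity_sum(tweet) > threshold else 0

tweets = ["I feel hopeless and depressed", "What a great and happy day"]
labels = [label_tweet(t) for t in tweets]
print(labels)  # [1, 0]
```

The labels 0 and 1 follow the encoding given in Table 2; everything else in the sketch is a stand-in.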

Table 2 - Description of each column field contained in the dataset

Field        Description
Id           An index value of a tweet
Username     Handle of account or person that their text had been identified on Twitter
Tweet Text   The message on which the sentimental analysis needs to be performed
Prediction   0 stands for NO = "non-depressed" and 1 stands for YES = "depressed"
Sentiment    0 = negative, 1 = positive


4.2 Creating Prediction Model

In this part, Microsoft Azure Machine Learning Studio (classic) was used to create the prediction models with the
two-class Decision Forest, Neural Network, and Support Vector Machine (SVM) classifiers, which are suitable for
classification with discrete features. Three metrics are used to compare the classifiers: accuracy, precision, and
recall. Each classifier was fitted to the training dataset. The experimental design for the models is shown in
Figs. 1 and 2.

Fig. 1 - Model building using Decision Forest classifier.


Fig. 2 - Model building using Neural Network classifier.

4.3 Data Cleaning

Data cleaning is the process of identifying parts of data that are incorrect, incomplete, irrelevant, or
missing, and then altering, replacing, or deleting them as needed. For analysis and machine learning, data is the most
valuable resource. Several missing values were identified in the data set used in this study. Therefore,
a data cleaning process was applied to this data set to avoid errors during training and testing.
Data cleaning removes every row that contains missing values, with a minimum missing value ratio of 0 and a
maximum missing value ratio of 1.
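The row-removal step can be illustrated in plain Python. This is only a sketch of the idea, not the Azure module used in the study; the function name and the sample records are hypothetical, and the field names follow Table 2.

```python
def drop_incomplete_rows(rows, required_fields):
    """Keep only rows in which every required field is present and non-empty."""
    cleaned = []
    for row in rows:
        values = [row.get(field) for field in required_fields]
        if all(v is not None and str(v).strip() != "" for v in values):
            cleaned.append(row)
    return cleaned

# Made-up records: row 2 has an empty tweet, row 3 has a missing label.
rows = [
    {"Id": 1, "Username": "user_a", "Tweet Text": "feeling hopeless", "Sentiment": 1},
    {"Id": 2, "Username": "user_b", "Tweet Text": "", "Sentiment": 0},
    {"Id": 3, "Username": "user_c", "Tweet Text": "nice weather", "Sentiment": None},
]
cleaned = drop_incomplete_rows(rows, ["Id", "Username", "Tweet Text", "Sentiment"])
print([row["Id"] for row in cleaned])  # [1]
```

Note that a label of 0 is a valid value and is kept; only empty or absent fields cause a row to be dropped.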

4.4 Pre-processing Text

A dataset usually requires some pre-processing before it can be analyzed. This also applies to this dataset,
which was pre-processed using text pre-processing in Microsoft Azure for each machine learning model. Text
pre-processing helps in analyzing text by removing Twitter handles (@user), links, punctuation, numbers, and special
characters, removing stop words, and converting words to lowercase.
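The cleaning steps named in this section can be approximated with regular expressions. The sketch below is a plain-Python stand-in for the Azure pre-processing step, not a reproduction of it; the stop-word list is a tiny illustrative sample and the function name is hypothetical.

```python
import re

# A small illustrative stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"a", "an", "the", "is", "am", "are", "i", "to", "and", "of", "so"}

def preprocess_tweet(text):
    """Apply the cleaning steps named in this section to one tweet."""
    text = re.sub(r"@\w+", " ", text)                   # remove Twitter handles
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # remove links
    text = re.sub(r"[^A-Za-z\s]", " ", text)            # remove punctuation, numbers, special characters
    words = text.lower().split()                        # lowercase, then split on whitespace
    return " ".join(w for w in words if w not in STOP_WORDS)

print(preprocess_tweet("@user I am SO depressed today!!! 24/7 https://example.com/x"))
# depressed today
```

Handles and links are removed before the punctuation pass so that their fragments do not survive as stray words.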

4.5 Split Data

This section discusses the process used to construct a data set with ground-truth label information on whether
a tweet post indicates depression. The data set is divided into training and testing sets. The classifier is changed
from Decision Forest to Neural Network and then to Support Vector Machine (SVM) each time a test is recorded.
In the testing stage, the trained model is applied to the test data set. The models are evaluated on the training and
test data sets to provide predictive results. The three main metrics used to evaluate a classification model are
accuracy, precision, and recall. The differences between the algorithms are shown in the results section.
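The result tables report 19 training/testing splits, from 5/95 to 95/5 in steps of 5. A rough sketch of how such splits could be generated is shown below; the shuffling, the fixed seed, and the placeholder records are assumptions, since the source does not say how rows were assigned to each part.

```python
import random

def split_dataset(rows, train_percent, seed=0):
    """Shuffle the rows and split them into training and testing parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_percent / 100)
    return shuffled[:cut], shuffled[cut:]

# The 19 training percentages used in the result tables: 5, 10, ..., 95.
TRAIN_PERCENTS = list(range(5, 100, 5))

rows = list(range(200))  # placeholder records standing in for labelled tweets
for pct in TRAIN_PERCENTS:
    train, test = split_dataset(rows, pct)
    print(f"{pct:>2}% train -> {len(train)} training rows, {len(test)} testing rows")
```

Each classifier would then be trained on the training part and scored on the testing part for every percentage in the list.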

4.6 Evaluate the Model

This section uses the classification method to perform predictions in Azure Machine Learning Studio
(classic). Four types of outcomes are evaluated:
• True positives occur when the model predicts that an observation belongs to a class and it does belong to that class.
• True negatives occur when the model predicts that an observation does not belong to a class and it does not belong to that class.
• False positives occur when the model predicts that an observation belongs to a class when in reality it does not.
• False negatives occur when the model predicts that an observation does not belong to a class when in fact it does.
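The four outcomes above determine the three metrics used in this study. The following sketch counts the outcomes for a small made-up set of labels and derives accuracy, precision, and recall from them. The sample labels are hypothetical, and returning 0.0 when a denominator is zero is a convention chosen for the example.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a two-class problem."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    tn = sum(1 for a, p in pairs if a != positive and p != positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    return tp, tn, fp, fn

def metrics(tp, tn, fp, fn):
    """Accuracy, precision, and recall from the four outcome counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall

# Made-up labels: 1 = depressed, 0 = non-depressed.
actual    = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
tp, tn, fp, fn = confusion_counts(actual, predicted)
accuracy, precision, recall = metrics(tp, tn, fp, fn)
print(tp, tn, fp, fn)               # 1 6 0 3
print(accuracy, precision, recall)  # 0.7 1.0 0.25
```

The example also shows that a model which rarely predicts the positive class can reach a precision of 1.0 while its recall stays low.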

5. Analysis and Results

This section explains the results of the simulations carried out for this study. The performance of the
three selected machine learning algorithms is compared on the same dataset to determine the best model
for making predictions. The accuracy, precision, and recall of each machine learning model are reported for the
different training and testing splits. These results are also used to compare which model is better at making
predictions. All results are recorded in Table 3, Table 4, and Table 5.

Table 3 - Accuracy results for Decision Forest, Neural Network, and SVM
Split Data (Training %, Testing %) and Accuracy (%)
Training   Testing   Decision Forest   Neural Network   SVM
5 95 77.8 78.1 78.1
10 90 77.9 78.0 77.9
15 85 77.4 78.0 78.0
20 80 77.6 78.0 78.0
25 75 77.6 78.0 78.0
30 70 77.7 77.9 77.9
35 65 77.5 77.9 77.9
40 60 77.4 78.1 78.1
45 55 77.8 78.2 78.2
50 50 77.6 78.5 78.5
55 45 78.1 78.7 78.7
60 40 77.7 78.5 78.5
65 35 77.7 78.6 78.5
70 30 77.7 78.5 78.5
75 25 77.2 78.2 78.2
80 20 77.3 77.9 77.9
85 15 76.7 78.0 78.0
90 10 77.4 78.7 78.7
95 5 78.1 79.5 79.5
Average 77.58 78.27 78.27
Standard Deviation 0.316 0.396 0.387
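The summary rows of Table 3 can be recomputed from the per-split values. The sketch below does this for the Neural Network column using Python's standard library. Using the population standard deviation is an assumption: the source does not say which variant was used.

```python
from statistics import mean, pstdev

# Neural Network accuracy (%) for the 19 splits, copied from Table 3.
nn_accuracy = [78.1, 78.0, 78.0, 78.0, 78.0, 77.9, 77.9, 78.1, 78.2, 78.5,
               78.7, 78.5, 78.6, 78.5, 78.2, 77.9, 78.0, 78.7, 79.5]

average = mean(nn_accuracy)
spread = pstdev(nn_accuracy)  # population standard deviation (an assumption)
print(f"average = {average:.4f}, standard deviation = {spread:.4f}")
```

The computed values should agree with the reported 78.27 and 0.396 up to rounding in the last digit.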

Accuracy measures the goodness of a classification model as the proportion of true results to total cases.
Table 3 shows the accuracy of each machine learning model used in this study. The Neural Network and SVM
models have the same average accuracy of 78.27%, while the Decision Forest model recorded a lower average
accuracy of 77.58%. The difference of 0.69 percentage points shows that the Neural Network and SVM models are,
on average, more accurate than the Decision Forest model. The standard deviations of the three models differ.
The Decision Forest model has a standard deviation of 0.316, the Neural Network model 0.396, and the SVM model
0.387. The Decision Forest model therefore recorded the lowest standard deviation and the Neural Network model
the highest. The accuracy of each model is also shown as a graph in Fig. 3.

[Figure: line graph of accuracy (%) on the y-axis (75 to 80) against the 19 training/testing splits on the x-axis, with one series each for Decision Forest, Neural Network, and SVM; the plotted values are those listed in Table 3.]

Fig. 3 - Graph of accuracy result for Decision Forest, Neural Network, and SVM

Precision is the proportion of true positive results among all positive results. Table 4 shows the precision of
each machine learning model used in this study. The average precision differs between the models. The Decision
Forest model produces the highest average precision, 1.000, compared with the Neural Network and SVM models.
The average precision of the Neural Network model is 0.981, which is 0.007 higher than the 0.974 of the Support
Vector Machine (SVM) model. This makes the Neural Network model the second-highest in average precision. The
standard deviation is 0 for the Decision Forest model, while the Neural Network and Support Vector Machine (SVM)
models record the same standard deviation of 0.026.

Table 4 - Precision results for Decision Forest, Neural Network, and SVM
Split Data (Training %, Testing %) and Precision
Training   Testing   Decision Forest   Neural Network   SVM
5 95 1.000 1.000 1.000
10 90 1.000 1.000 1.000
15 85 1.000 0.894 0.894
20 80 1.000 1.000 1.000
25 75 1.000 1.000 1.000
30 70 1.000 0.978 0.978
35 65 1.000 0.980 0.980
40 60 1.000 0.980 0.980
45 55 1.000 0.981 0.981
50 50 1.000 0.981 0.981
55 45 1.000 0.979 0.979
60 40 1.000 0.977 0.977
65 35 1.000 0.936 0.976
70 30 1.000 1.000 0.951
75 25 1.000 1.000 0.949
80 20 1.000 1.000 0.935
85 15 1.000 0.955 0.955
90 10 1.000 1.000 1.000
95 5 1.000 1.000 1.000
Average 1 0.981 0.974
Standard Deviation 0 0.026 0.026

Recall is the fraction of all actual positive cases that the model correctly identifies. Table 5 shows the recall of
each machine learning model used in this study. The average recall of the Decision Forest model is 0.011, while the
Neural Network and Support Vector Machine (SVM) models share an average recall of 0.042. The latter two models
differ slightly in standard deviation: 0.014 for the Neural Network model and 0.015 for the SVM model, a gap of
only 0.001. The standard deviation of the Decision Forest model is 0.007, noticeably lower than that of the Neural
Network and Support Vector Machine (SVM) models.

Table 5 - Recall results for Decision Forest, Neural Network, and SVM
Split Data (Training %, Testing %) and Recall
Training   Testing   Decision Forest   Neural Network   SVM
5 95 0.004 0.019 0.019
10 90 0.016 0.021 0.020
15 85 0.000 0.030 0.030
20 80 0.012 0.027 0.027
25 75 0.012 0.028 0.028
30 70 0.015 0.027 0.028
35 65 0.012 0.032 0.032
40 60 0.004 0.036 0.036
45 55 0.019 0.041 0.041
50 50 0.000 0.044 0.044
55 45 0.019 0.045 0.045
60 40 0.011 0.045 0.045
65 35 0.012 0.054 0.049
70 30 0.017 0.053 0.056
75 25 0.015 0.059 0.062
80 20 0.029 0.058 0.060
85 15 0.003 0.058 0.058
90 10 0.009 0.064 0.064
95 5 0.009 0.070 0.070
Average 0.011 0.042 0.042
Standard Deviation 0.007 0.014 0.015

Based on the accuracy, precision, and recall recorded for each training and testing split of each model
developed, the best model for making predictions in this study is the Neural Network model. Its average accuracy
and recall are the same as those of the Support Vector Machine (SVM) model, 78.27% and 0.042. However, the two
models differ in average precision by 0.007 in favor of the Neural Network model. This indicates that the Neural
Network model is the best at making predictions to determine the level of depression by using social media posts.

6. Conclusion
The study shows that this approach can help determine whether there is a link between SNS users' activity and
mental illness. It also shows that social media activity can disclose mental illness in its early stages. A psychiatrist
cannot obtain complete information from a depressed patient using typical questioning tactics, and an SNS-based
system has the potential to solve these self-reporting issues. From the user's social activities, machine learning can
help researchers get closer to the natural behavior of the depressed patient and his/her way of thinking and better
classify the mental levels. The predicted level of depression could help psychiatrists, family, and friends of the
depressed patient identify early symptoms of depression and help them prevent harmful incidents.

Acknowledgment
We would like to thank the Universiti Tun Hussein Onn Malaysia's Faculty of Computer Science and Information
Technology for their assistance in completing this work.

References
[1] Jamali, A. F., Mustapha, A., & Mostafa, S. A. (2021). Prediction of Sea Level Oscillations: Comparison of
Regression-based Approach. Engineering Letters, 29(3)
[2] Aguilar-Savén, R. S. (2004). Business process modelling: Review and framework. International Journal of
Production Economics, 90(2), 129–149
[3] Aldarwish, M. M., & Ahmad, H. F. (2017). Predicting Depression Levels Using Social Media Posts. Proceedings
- 2017 IEEE 13th International Symposium on Autonomous Decentralized Systems, ISADS 2017, 277–280
[4] Almaatouq, A., Shmueli, E., Nouh, M., Alabdulkareem, A., Singh, V. K., Alsaleh, M., Alarifi, A., Alfaris, A.,
& Pentland, A. ‘Sandy.’ (2016). If it looks like a spammer and behaves like a spammer, it must be a spammer:
analysis and detection of microblogging spam accounts. International Journal of Information Security, 15(5),
475–491
[5] Almouzini, S., Khemakhem, M., & Alageel, A. (2019). Detecting Arabic Depressed Users from Twitter Data.
Procedia Computer Science, 163, 257–265
[6] Asad, N. Al, Mahmud Pranto, M. A., Afreen, S., & Islam, M. M. (2019). Depression Detection by Analyzing
Social Media Posts of User. 2019 IEEE International Conference on Signal Processing, Information,
Communication and Systems, SPICSCON 2019, 13–17
[7] Cifuentes, M., Davis, M., Fernald, D., Gunn, R., Dickinson, P., & Cohen, D. J. (2015). Electronic Health Record
Challenges, Workarounds, and Solutions Observed in Practices Integrating Behavioral Health and Primary Care.
Journal of the American Board of Family Medicine : JABFM, 28(July), S63–S72
[8] De Choudhury, M., Gamon, M., Counts, S., & Horvitz, E. (2013). Predicting Depression via Social Media.
In Seventh International AAAI Conference on Weblogs and Social Media
[9] De Choudhury, Munmun, Scott Counts, and E. H. (2013). Social Media as a Measurement Tool of Depression
in Populations. National Conference Publication - Institution of Engineers, Australia, 92 pt 9, 47–56
[10] Greene, H. L., Graham, E. L., Werner, J. A., Sears, G. K., Gross, B. W., Gorham, J. P., Kudenchuk, P. J., &
Trobaugh, G. B. (1983). Toxic and therapeutic effects of amiodarone in the treatment of cardiac arrhythmias.
Journal of the American College of Cardiology, 2(6), 1114–1128
[11] Hosseinpoor, A. R., Bergen, N., Mendis, S., Harper, S., Verdes, E., Kunst, A., & Chatterji, S. (2012).
Socioeconomic inequality in the prevalence of noncommunicable diseases in low- and middle-income countries:
Results from the World Health Survey. BMC Public Health, 12(1), 1
[12] Hussain, J., Ali, M., Bilal, H. S. M., Afzal, M., Ahmad, H. F., Banos, O., & Lee, S. (2015). SNS based predictive
model for depression. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial
Intelligence and Lecture Notes in Bioinformatics), 9102, 349–354
[13] Karlsson, J., Taft, C., Rydén, A., Sjöström, L., & Sullivan, M. (2007). Ten-year trends in health-related quality
of life after surgical and conventional treatment for severe obesity: The SOS intervention study. International
Journal of Obesity, 31(8), 1248–1261
[14] Luca, M. (2015). User-Generated Content and Social Media. In Handbook of media Economics (Vol. 1, pp.
563–592). Elsevier B.V
[15] Marcus, M., Yasamy, M. T., Van Ommeren, M. V., Chisholm, D., & Saxena, S. (2012). Depression: A global
public health concern, WHO Dataset
[16] Razak, N. H., Mustapha, A., Nanthaamomphong, A., Abd Wahab, M. H., & Mostafa, S. A. (2021, May).
Prediction of Secondary Students Performance: A Case Study. In 2021 18th International Conference on
Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology (ECTI-
CON) (pp. 908-912). IEEE
[17] Mohr, D. C., Burns, M. N., Schueller, S. M., Clarke, G., & Klinkman, M. (2013). Behavioral Intervention
Technologies: Evidence review and recommendations for future research in mental health. General Hospital
Psychiatry, 35(4), 332–338
[18] Nambisan, P., Luo, Z., Kapoor, A., Patrick, T. B., & Cisler, R. A. (2015). Social media, big data, and public
health informatics: Ruminating behavior of depression revealed through twitter. Proceedings of the Annual
Hawaii International Conference on System Sciences, 2015-March(December 2017), 2906–2913
[19] Priya, A., Garg, S., & Tigga, N. P. (2020). Predicting Anxiety, Depression and Stress in Modern Life using
Machine Learning Algorithms. Procedia Computer Science, 167(2019), 1258–1267
[20] Scroll, P., & For, D. (2016). Clinical Use of an Alpha Asymmetry Neurofeedback Protocol in the Treatment of
Mood Disorders: Follow-Up Study One to Five Years Post Therapy
[21] Statista. (2021). SOCIAL NETWORK USERS IN LEADING MARKETS 2026 | STATISTA. Statista
[22] Technologies, M. (2014). Social Media – An Arena for Venting Negative Emotions Harri Jalonen, Turku
University of Applied Sciences, Finland. October, 53–70
[23] Wakefield, J. C., Schmitz, M. F., & Baer, J. C. (2010). Does the DSM-IV clinical significance criterion for major
depression reduce false positives? Evidence. American Journal of Psychiatry, 167(1), 298–304
William. (2016). CRISP-DM – a Standard Methodology to Ensure a Good Outcome
[24] Gupta, R., Sharma, A. K., Garg, O., Modi, K., Kassim, S., Baharum, Z., ... & Mostafa, S. A. (2021). WB-CPI:
Weather Based Crop Prediction in India using Big Data Analytics. IEEE Access

