
JOURNAL OF SOFT COMPUTING AND DATA MINING VOL. 2 NO. 1 (2021) 39-48

© Universiti Tun Hussein Onn Malaysia Publisher’s Office


Journal of Soft Computing and Data Mining (JSCDM)
Journal homepage: http://penerbit.uthm.edu.my/ojs/index.php/jscdm
e-ISSN : 2716-621X

Predicting Depression Using Social Media Posts


Fahem Abu Bakar¹, Nazri Mohd Nawi¹*, Abdulkareem A. Hezam¹
¹ Faculty of Computer Science and Information Technology,
Universiti Tun Hussein Onn Malaysia (UTHM), Parit Raja, 86400 Batu Pahat, Johor, MALAYSIA

*Corresponding Author

DOI: https://doi.org/10.30880/jscdm.2021.02.02.004
Received 15 June 2021; Accepted 01 October 2021; Available online 15 October 2021

Abstract: The use of Social Network Sites (SNS) is on the rise, particularly among the younger
generations. Social media sites let users communicate their interests, feelings, and everyday routines.
Many studies show that properly utilizing user-generated content (UGC) can aid in determining
people's mental health status. UGC could aid in the prediction of mental health conditions, particularly
depression, a significant medical condition that impairs one's ability to work, learn, eat, sleep, and enjoy
life. Information about a person's mood and negativism can be gathered from their SNS user profile.
Therefore, this study utilizes SNS as a data source and applies machine learning models to screen users
and categorize them based on their mental health. The performance of three machine learning models is evaluated
to classify the UGC: Decision Forest, Neural Network, and Support Vector Machine (SVM). The results show that
the average accuracy and recall of the Neural Network model are the same as those of the Support Vector Machine
(SVM) model, 78.27% and 0.042 respectively, but the Neural Network performs better in average precision. This
indicates that the Neural Network model is the best of the three models for predicting the level of depression
from social media posts.

Keywords: Social media, machine learning, decision forest, neural network, Support Vector Machine (SVM)

1. Introduction
The vast volume of information available on social media, which corresponds to user behavioral characteristics,
can be put to use [1]. Using that data to anticipate a social media user's mental health status can assist a
psychiatrist, family member, or friend in providing timely medical advice and therapy to a depressed user [2].
According to the World Health Organization (WHO) [3], depression affects roughly 350 million people worldwide
today. Depression is one of the world's most deadly disorders [4]. Furthermore, almost two-thirds of depressed
persons do not seek adequate treatment, resulting in serious repercussions [5]. Medical science relies on asking
patients questions about their circumstances, which does not allow for a precise diagnosis of depression [6].
Over a two-week period, the patient must attend more than one session. A false positive occurs when a
non-depressed condition is classified as depressed [7]. Researchers have also found that electronic health
record (EHR) systems are not well suited to merging behavioral health and basic care: EHRs do not support
documenting and tracking data for behavioral health problems such as depression [8].
According to eMarketer [9], the number of social media users in 2015 was about 2 billion, and it is growing every
day. The majority of people use social media to convey their feelings, emotions, and day-to-day activities. Social
media has been shown to be a safe space for many people to vent their negative feelings by sharing material that
reflects those emotions [10]. Many studies have demonstrated that data from social media can be utilized to help
people maintain their mental health. Researchers can gain a thorough picture of a user's natural behavior by mining
users' social media posts [11]. By using machine learning models, researchers can gather information about a
person's mood, activities, sleep hours, thinking style, interactions, guilt feelings, worthlessness, loneliness, and
helplessness from a user's social media profile. Retrieving such behavioral attributes reveals depression symptoms in
social media users, which could be utilized to determine whether or not the user is depressed [12]. Machine
learning models can therefore help psychiatrists, parents, and friends detect early symptoms and intervene
before the depressed person enters the serious stage of depression.

2. Project Background
Depression is a type of mood disorder characterized by a persistent sense of melancholy and loss of interest. There
are several different types of depressive disorders, each with its own set of symptoms. Major Depressive Disorder
(MDD) is the most frequent type of depressive disorder, and it affects people's ability to work, study, eat, sleep, and
have fun. To diagnose a major depressive episode, the patient must exhibit five or more of the following nine
symptoms for at least two weeks and virtually every day [13], [14]. The first symptom is a sad mood for the majority
of the day. The second symptom is a general lack of interest in practically all activities. The third symptom is weight
loss or gain, as well as excessive sleeping. The fourth symptom is body agitation or retardation [15]. The fifth
symptom is exhaustion or a lack of energy. The sixth symptom is a feeling of remorse or worthlessness. The seventh
symptom is difficulty concentrating, thinking, or making decisions. The eighth symptom is difficulty sleeping or
sleeping too much. The ninth and last symptom is the only one that does not have to occur daily: thoughts of death,
a suicide attempt, or a plan to commit suicide.
Asking patients questions about their symptoms cannot precisely diagnose depression, indicating that medical
science does not have complete confidence in the approaches employed to diagnose depression [2]. Nowadays, the
usage of social media is on the rise, particularly among the younger population. Users can access their SNS at any
time and from any location using their smartphones, PCs, or laptops [16]. Social media sites let users communicate
their interests, feelings, and everyday routines. User-generated content (UGC) from any social media platform could
be used to study human health behaviors [17]. Many studies have shown that UGC, used correctly, can help people
maintain their mental health or identify problems at an early stage. Mining a user's social media posts may give a
complete picture of the user's natural behavior, which could aid in the prediction of depression. This study might
be able to get a precise result of the user's mental health by using approaches that identify social media users'
depression phases based on their individually authored content. To overcome the diagnostic problem described
above, there must be a method to predict the level of patient depression through social media. The objective of the
study is therefore to analyze depression on a dataset collected from an online public source. The collected data will
then be analyzed, and the performance of machine learning models in predicting depression levels will be compared.
Two scopes have been set up to guide this project toward its objectives. First, user data is extracted from social
media (Twitter), since user posts can help to classify users according to mental health and a picture of the user's
natural behavior can be built from their written posts. Second, the user data is collected from the Kaggle
website.

3. Related Studies
Many researchers have used machine learning techniques such as the Random Forest Tree (RFT), the Support
Vector Machine (SVM), and the Convolutional Neural Network (CNN) to collect and classify data from websites to
predict depression. Table 1 compares the results reported in related studies from other articles.

Table 1 - Comparison of results between various articles

Ref.  Related study                  SNS        Method                  Total accuracy (%)  Total precision
[18]  Depression Detection by        Twitter &  SVM                     78%                 1.000
      Analyzing Social Media         Facebook   Naïve Bayes             74%                 1.000
      Posts of User
[19]  Social media as a              Twitter    SVM                     73%                 0.820
      measurement tool of
      depression in populations
[20]  Predicting Depression via      Twitter    SVM                     70%                 0.705
      Social Media
[21]  Detecting Arabic Depressed     Twitter    Random Forest           83%                 85.7
      Users from Twitter Data                   Naïve Bayes             75%                 75.8
                                                AdaBoostM1              55%                 56.4
                                                Liblinear               87.5%               87.6
[22]  Predicting Anxiety,            -          Decision Tree           Anxiety: 73%        0.458
      Depression, and Stress in                                         Depression: 78%     0.731
      Modern Life using Machine                                         Stress: 63%         0.599
      Learning Algorithms                       Random Forest           Anxiety: 71%        0.431
                                                                        Depression: 79%     0.881
                                                                        Stress: 72%         0.731
                                                Naïve Bayes             Anxiety: 73%        0.459
                                                                        Depression: 85%     0.822
                                                                        Stress: 74%         0.548
                                                Support Vector Machine  Anxiety: 67%        0.403
                                                                        Depression: 80%     0.820
                                                                        Stress: 66%         0.672
                                                K Nearest Neighbour     Anxiety: 69%        0.449
                                                                        Depression: 72%     0.750
                                                                        Stress: 71%         0.719

Different researchers have used different machine learning algorithms to predict psychiatric diseases, and the
results of the algorithms have varied depending on the context. No single algorithm has been determined
to be the best in all circumstances. Therefore, several machine learning algorithms were used to identify the
symptoms of anxiety, depression, sadness, and stress in the current study. The reported performance of those
algorithms motivated this research to select some of the popular algorithms for the selected datasets.

4. Methodology/ Framework
This project follows the standard process of conducting experiments and simulations and adopts the CRISP-DM
model as its methodology. CRISP-DM [23] (Cross-Industry Standard Process for Data Mining) is the standard
method used by most researchers in the data mining area to guide data mining activities. As a methodology, it
comprises descriptions of common project phases, the tasks associated with each phase, and an explanation of the
interconnections between these activities [24]. As a process model, CRISP-DM presents an overview of the data
mining life cycle.

4.1 Data Collection


The dataset for this study was obtained from its author, Narendra Sahu, who collected it from Twitter; it includes
posts from several random users. The posts were scraped with the Twint tool to find tweets related to depression,
using the keywords hopeless, depressed, and suicide. The dataset consists of tweets from more than 10,314 users.
It contains six fields, including "ID", "Username", "Tweet Text", "Prediction Score", and "Sentiment". A more
detailed description of the contents of the data set is given in Table 2 below.
The dataset author used a dictionary that maps words to their polarity. Each word taken from a tweet is compared
with the dictionary and given a score, and the polarity scores are summed for each tweet. The tweet is then labeled
from that sum using the coding of Table 2: a label of 0 marks a negative tweet, which stands for a non-depressed
post, and a label of 1 marks a positive tweet, which declares the post a depressed tweet. In this way, tweets are
classified as negative or positive. The labeled tweets were then fed into the machine learning models.
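
The dictionary-based labeling described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the dataset author's code: the lexicon entries, their signs, the `threshold` parameter, and the function names are hypothetical, since the paper does not publish the dictionary or the exact cut-off.

```python
import re

# Hypothetical polarity lexicon (assumption): positive scores mark words
# associated with depression, negative scores mark words that are not.
LEXICON = {"hopeless": 1, "depressed": 1, "suicide": 1, "happy": -1, "great": -1}


def polarity_sum(tweet, lexicon=LEXICON):
    """Sum the polarity of every word in the tweet; unknown words score 0."""
    words = re.findall(r"[a-z]+", tweet.lower())
    return sum(lexicon.get(word, 0) for word in words)


def label_tweet(tweet, threshold=1):
    """Return 1 ("depressed") when the summed polarity reaches the assumed
    threshold, otherwise 0 ("non-depressed")."""
    return 1 if polarity_sum(tweet) >= threshold else 0


if __name__ == "__main__":
    for text in ["I feel hopeless and depressed", "What a great, happy day"]:
        print(text, "->", label_tweet(text))
```

The threshold is kept as a parameter because the cut-off actually used for the dataset is not stated clearly in the text.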

Table 2 - Description of each column field contained in the dataset

Fields        Description
Id            An index value of a tweet
Username      Handle of the account or person whose text was identified on Twitter
Tweet Text    The message on which the sentiment analysis needs to be performed
Prediction    0 stands for NO = “non-depressed” and 1 stands for YES = “depressed”
Sentiment     0 = negative, 1 = positive


4.2 Creating Prediction Model


In this part, Microsoft Azure Machine Learning Studio (classic) was used to create the prediction models,
selecting the two-class Decision Forest, Neural Network, and Support Vector Machine (SVM) classifiers, which are
suitable for classification with discrete features. Three metrics are used to compare the classifiers:
accuracy, precision, and recall. Each classifier was fitted to the training dataset. The
experimental design for each model is shown in Figs. 1 and 2.

Fig. 1 - Model building using Decision Forest classifier.


Fig. 2 - Model building using Neural Network classifier.

4.3 Data Cleaning


Data cleaning is the process of identifying parts of data that are incorrect, incomplete, irrelevant, or
missing, and then altering, replacing, or deleting them as needed. For analysis and machine learning, data is the most
valuable resource. Several missing values were identified in the data set used in this study. Therefore,
a data cleaning process was applied to the data set to avoid errors during training and testing.
The cleaning step removes every row that contains a missing value; the minimum and maximum missing-value
ratios are set to 0 and 1, so the rule is applied regardless of what share of values is missing.
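
As a rough stand-in for the row-removal step, the sketch below drops every record that has a missing field. It is plain Python rather than the Azure module used in the study; the function name, the sample rows, and the rule that empty strings count as missing are assumptions made for illustration.

```python
# Column names follow Table 2; the sample rows are invented for illustration.
FIELDS = ["Id", "Username", "Tweet Text", "Prediction", "Sentiment"]

SAMPLE_ROWS = [
    {"Id": 1, "Username": "user_a", "Tweet Text": "feeling hopeless", "Prediction": 1, "Sentiment": 1},
    {"Id": 2, "Username": "user_b", "Tweet Text": "", "Prediction": 0, "Sentiment": 0},
    {"Id": 3, "Username": "user_c", "Tweet Text": "nice day", "Prediction": None, "Sentiment": 0},
    {"Id": 4, "Username": "user_d", "Tweet Text": "lunch time", "Prediction": 0, "Sentiment": 0},
]


def is_missing(value):
    """Treat None and blank strings as missing (assumption); 0 is a valid label."""
    return value is None or str(value).strip() == ""


def drop_rows_with_missing(rows, fields=FIELDS):
    """Keep only the rows in which every listed field has a value."""
    return [row for row in rows
            if not any(is_missing(row.get(field)) for field in fields)]


if __name__ == "__main__":
    cleaned = drop_rows_with_missing(SAMPLE_ROWS)
    print([row["Id"] for row in cleaned])
```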

4.4 Pre-processing Text


A dataset usually requires some pre-processing before it can be analyzed. This also applies to this dataset, which
was processed with the text pre-processing step in Microsoft Azure for each machine learning model. The step
prepares the text for analysis by removing Twitter handles (@user), links, punctuation, numbers, and special
characters, removing stop words, and converting words to lowercase.
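
The cleaning operations listed above can be approximated with regular expressions. The sketch below is not the Azure pre-processing module; the stop-word list is a small hypothetical sample and the function name is an assumption.

```python
import re

# Small illustrative stop-word list (assumption); a real run would use a fuller one.
STOP_WORDS = {"a", "an", "the", "is", "am", "are", "i", "to", "and", "of", "so"}


def preprocess(tweet):
    """Apply the cleaning steps in order and return the cleaned text."""
    text = re.sub(r"@\w+", " ", tweet)                   # remove Twitter handles
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)   # remove links
    text = text.lower()                                  # lowercase words
    text = re.sub(r"[^a-z\s]", " ", text)                # drop punctuation, numbers, special characters
    tokens = [tok for tok in text.split() if tok not in STOP_WORDS]  # remove stop words
    return " ".join(tokens)


if __name__ == "__main__":
    print(preprocess("@user I am SO hopeless!!! 24/7 https://t.co/abc"))
```

Handles and links are removed before punctuation so that the `@` and `://` markers are still present when those patterns run.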

4.5 Split Data


This section discusses the process used to construct a data set with label information on whether
a tweet post indicates depression. The data set is divided into a training set and a testing set. For each recorded
test, the classifier is changed in turn from Decision Forest Classification to Neural Network Classification and
Support Vector Machine (SVM) classification. In the testing stage, the trained model is applied to the test data set.
Each model, fitted on the training data set, is scored on the test data set to provide predictive results.
The three main metrics used to evaluate a classification model are accuracy, precision, and recall. The
differences between the algorithms are shown in the results section.
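
The result tables report 19 training/testing splits, from 5%/95% to 95%/5% in steps of 5. A minimal sketch of generating such splits follows; the random shuffling, the fixed seed, and the function name are assumptions, since the paper does not say how rows were assigned to each side.

```python
import random

# Training percentages used in the result tables: 5, 10, ..., 95 (19 splits).
TRAIN_PERCENTAGES = list(range(5, 100, 5))


def split_rows(rows, train_percent, seed=0):
    """Shuffle the rows reproducibly and return (training, testing) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * train_percent // 100
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    data = list(range(200))  # stand-in for the labeled tweets
    for percent in TRAIN_PERCENTAGES:
        train, test = split_rows(data, percent)
        print(percent, 100 - percent, len(train), len(test))
```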

4.6 Evaluate the Model


This section uses the classification method to perform predictions in Azure Machine Learning Studio
(classic). Four types of outcomes are evaluated:
• True positives occur when the model predicts that an observation belongs to a class and it does belong to that class.
• True negatives occur when the model predicts that an observation does not belong to a class and it indeed does not.
• False positives occur when the model predicts that an observation belongs to a class when in reality it does not.
• False negatives occur when the model predicts that an observation does not belong to a class when in fact it does.
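
The four outcomes above determine the three metrics used in this study. The sketch below counts the outcomes and derives accuracy, precision, and recall with their standard definitions; the function names and the toy labels are illustrative assumptions, and the code is not taken from Azure Machine Learning Studio.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true positives, false positives, true negatives, false negatives."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1
            else:
                fp += 1
        else:
            if actual == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def accuracy_precision_recall(y_true, y_pred):
    """accuracy = (TP+TN)/all, precision = TP/(TP+FP), recall = TP/(TP+FN)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


if __name__ == "__main__":
    y_true = [1, 1, 1, 0, 0, 0, 0, 0]  # toy labels: 1 = depressed, 0 = non-depressed
    y_pred = [1, 0, 0, 0, 0, 0, 0, 1]
    print(accuracy_precision_recall(y_true, y_pred))
```

A model that rarely predicts the positive class can score high precision with low recall, which is why the three metrics are reported side by side.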

5. Analysis and Results


This section explains the results of the simulations carried out for this study. The performance of
the three selected machine learning algorithms is compared on the same dataset to determine which model
makes the best predictions. The accuracy, precision, and recall of each machine learning model are reported
for every training/testing split, and these results are used to compare the models.
The results for all algorithms are recorded in Table 3, Table 4, and Table 5.

Table 3 - Accuracy results for Decision Forest, Neural Network, and SVM

Split Data (%)           Accuracy (%)
Training    Testing      Decision Forest    Neural Network    SVM
5           95           77.8               78.1              78.1
10          90           77.9               78.0              77.9
15          85           77.4               78.0              78.0
20          80           77.6               78.0              78.0
25          75           77.6               78.0              78.0
30          70           77.7               77.9              77.9
35          65           77.5               77.9              77.9
40          60           77.4               78.1              78.1
45          55           77.8               78.2              78.2
50          50           77.6               78.5              78.5
55          45           78.1               78.7              78.7
60          40           77.7               78.5              78.5
65          35           77.7               78.6              78.5
70          30           77.7               78.5              78.5
75          25           77.2               78.2              78.2
80          20           77.3               77.9              77.9
85          15           76.7               78.0              78.0
90          10           77.4               78.7              78.7
95          5            78.1               79.5              79.5
Average                  77.58              78.27             78.27
Standard Deviation       0.316              0.396             0.387

Accuracy measures the goodness of a classification model as the proportion of true results to total cases. Table 3
above gives the accuracy of each machine learning model used in this study. The Neural Network and SVM
models have the same average accuracy of 78.27%, while the Decision Forest model
recorded a lower average accuracy of 77.58%. This difference of 0.69 percentage points shows that the
Neural Network and SVM models are more accurate than the Decision Forest model. The standard deviations of the
three models also differ. The standard deviation for the Decision Forest model is 0.316%.
Meanwhile, the Neural Network model recorded a standard deviation of 0.396%, which is higher than the
standard deviation for the SVM model, 0.387%. The Decision Forest model therefore has the lowest
standard deviation, while the Neural Network model has the highest. The accuracy of each model across
the splits is also shown in graph form in Fig. 3.
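
The averages and standard deviations in Table 3 can be rechecked from the per-split values. The sketch below uses Python's `statistics` module; the dictionary layout and function name are assumptions. The population form (`pstdev`) is used because it appears to match the 0.316 reported for Decision Forest, but the paper does not state which form it used, and recomputed averages may differ from the reported ones in the last digit because of rounding.

```python
from statistics import mean, pstdev

# Per-split accuracy (%) copied from Table 3, ordered from 5/95 to 95/5.
ACCURACY = {
    "Decision Forest": [77.8, 77.9, 77.4, 77.6, 77.6, 77.7, 77.5, 77.4, 77.8, 77.6,
                        78.1, 77.7, 77.7, 77.7, 77.2, 77.3, 76.7, 77.4, 78.1],
    "Neural Network": [78.1, 78.0, 78.0, 78.0, 78.0, 77.9, 77.9, 78.1, 78.2, 78.5,
                       78.7, 78.5, 78.6, 78.5, 78.2, 77.9, 78.0, 78.7, 79.5],
    "SVM": [78.1, 77.9, 78.0, 78.0, 78.0, 77.9, 77.9, 78.1, 78.2, 78.5,
            78.7, 78.5, 78.5, 78.5, 78.2, 77.9, 78.0, 78.7, 79.5],
}


def summarize(values):
    """Return (average, population standard deviation) of a list of scores."""
    return mean(values), pstdev(values)


if __name__ == "__main__":
    for model, scores in ACCURACY.items():
        average, spread = summarize(scores)
        print(f"{model}: average={average:.2f}, std={spread:.3f}")
```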

[Figure: line chart of accuracy (%) on the vertical axis (75 to 80) against the 19 training/testing splits on the
horizontal axis, with one series each for Decision Forest, Neural Network, and SVM; the plotted values are those of
Table 3.]

Fig. 3 - Graph of accuracy result for Decision Forest, Neural Network, and SVM

Precision is the proportion of true positive results among all positive results. Table 4 below gives
the precision of each machine learning model used in this study. The average
precision differs between the models. The Decision Forest model produces the highest average precision,
1, compared with the Neural Network and SVM models. The average precision of the
Neural Network model is 0.981, which is 0.007 higher than the average precision of
the Support Vector Machine (SVM) model, 0.974. The Neural Network model therefore
ranks second in average precision. Meanwhile, the standard
deviation is 0 for the Decision Forest model, while the Neural Network and Support Vector Machine (SVM)
models record the same standard deviation of 0.026.

Table 4 - Precision results for Decision Forest, Neural Network, and SVM

Split Data (%)           Precision
Training    Testing      Decision Forest    Neural Network    SVM
5           95           1.000              1.000             1.000
10          90           1.000              1.000             1.000
15          85           1.000              0.894             0.894
20          80           1.000              1.000             1.000
25          75           1.000              1.000             1.000
30          70           1.000              0.978             0.978
35          65           1.000              0.980             0.980
40          60           1.000              0.980             0.980
45          55           1.000              0.981             0.981
50          50           1.000              0.981             0.981
55          45           1.000              0.979             0.979
60          40           1.000              0.977             0.977
65          35           1.000              0.936             0.976
70          30           1.000              1.000             0.951
75          25           1.000              1.000             0.949
80          20           1.000              1.000             0.935
85          15           1.000              0.955             0.955
90          10           1.000              1.000             1.000
95          5            1.000              1.000             1.000
Average                  1                  0.981             0.974
Standard Deviation       0                  0.026             0.026

Recall is the fraction of all actual positive cases that the model correctly identifies. Table 5 shows
the recall of each machine learning model used in this study. The average recall for the Decision Forest model is
0.011. Meanwhile, the average recall for the Neural Network and Support Vector Machine (SVM) models is
the same, 0.042. These two models differ only in their standard
deviation, which is 0.014 for the Neural Network model and 0.015 for the Support Vector Machine (SVM)
model, a gap of just 0.001. Finally, the standard deviation for the
Decision Forest model is 0.007, markedly lower than the values for the Neural Network and Support Vector
Machine (SVM) models.

Table 5 - Recall results for Decision Forest, Neural Network, and SVM

Split Data (%)           Recall
Training    Testing      Decision Forest    Neural Network    SVM
5           95           0.004              0.019             0.019
10          90           0.016              0.021             0.020
15          85           0.000              0.030             0.030
20          80           0.012              0.027             0.027
25          75           0.012              0.028             0.028
30          70           0.015              0.027             0.028
35          65           0.012              0.032             0.032
40          60           0.004              0.036             0.036
45          55           0.019              0.041             0.041
50          50           0.000              0.044             0.044
55          45           0.019              0.045             0.045
60          40           0.011              0.045             0.045
65          35           0.012              0.054             0.049
70          30           0.017              0.053             0.056
75          25           0.015              0.059             0.062
80          20           0.029              0.058             0.060
85          15           0.003              0.058             0.058
90          10           0.009              0.064             0.064
95          5            0.009              0.070             0.070
Average                  0.011              0.042             0.042
Standard Deviation       0.007              0.014             0.015

Based on the accuracy, precision, and recall recorded for each training and testing split of each
model, the model judged best at making predictions in this study
is the Neural Network model. The average accuracy and recall of the Neural Network model are
the same as those of the Support Vector Machine (SVM) model, 78.27% and 0.042. However, the two models
differ in average precision, by 0.007 (0.981 versus 0.974). This indicates that the Neural
Network model is the best in making predictions to determine the level of depression by using social media posts.

6. Conclusion
The study shows that this approach can determine whether there is a link between SNS users' activity and mental
illness. It also shows that social media activity can disclose mental disease in its early stages. The psychiatrist
cannot obtain complete information from the depressed patient using typical questioning tactics. The SNS-based
system has the potential to solve these self-reporting issues. From the user's social activities, machine learning can
help researchers get closer to the natural behavior of the depressed patient and his/her way of thinking, and better
classify the mental levels. The predicted level of depression could help psychiatrists, family, and friends of the
depressed patient identify early symptoms of depression and help them prevent some accidents from happening.

Acknowledgment
We would like to thank the Universiti Tun Hussein Onn Malaysia's Faculty of Computer Science and Information
Technology for their assistance in completing this work.

References
[1] Jamali, A. F., Mustapha, A., & Mostafa, S. A. (2021). Prediction of Sea Level Oscillations: Comparison of
Regression-based Approach. Engineering Letters, 29(3)
[2] Aguilar-Savén, R. S. (2004). Business process modelling: Review and framework. International Journal of
Production Economics, 90(2), 129–149
[3] Aldarwish, M. M., & Ahmad, H. F. (2017). Predicting Depression Levels Using Social Media Posts. Proceedings
- 2017 IEEE 13th International Symposium on Autonomous Decentralized Systems, ISADS 2017, 277–280
[4] Almaatouq, A., Shmueli, E., Nouh, M., Alabdulkareem, A., Singh, V. K., Alsaleh, M., Alarifi, A., Alfaris, A.,
& Pentland, A. ‘Sandy.’ (2016). If it looks like a spammer and behaves like a spammer, it must be a spammer:
analysis and detection of microblogging spam accounts. International Journal of Information Security, 15(5),
475–491
[5] Almouzini, S., Khemakhem, M., & Alageel, A. (2019). Detecting Arabic Depressed Users from Twitter Data.
Procedia Computer Science, 163, 257–265
[6] Asad, N. Al, Mahmud Pranto, M. A., Afreen, S., & Islam, M. M. (2019). Depression Detection by Analyzing
Social Media Posts of User. 2019 IEEE International Conference on Signal Processing, Information,
Communication and Systems, SPICSCON 2019, 13–17
[7] Cifuentes, M., Davis, M., Fernald, D., Gunn, R., Dickinson, P., & Cohen, D. J. (2015). Electronic Health Record
Challenges, Workarounds, and Solutions Observed in Practices Integrating Behavioral Health and Primary Care.
Journal of the American Board of Family Medicine : JABFM, 28(July), S63–S72
[8] De Choudhury, M., Gamon, M., Counts, S., & Horvitz, E. (2013). Predicting Depression via Social Media.
In Seventh International AAAI Conference on Weblogs and Social Media
[9] De Choudhury, M., Counts, S., & Horvitz, E. (2013). Social Media as a Measurement Tool of Depression
in Populations. National Conference Publication - Institution of Engineers, Australia, 92 pt 9, 47–56
[10] Greene, H. L., Graham, E. L., Werner, J. A., Sears, G. K., Gross, B. W., Gorham, J. P., Kudenchuk, P. J., &
Trobaugh, G. B. (1983). Toxic and therapeutic effects of amiodarone in the treatment of cardiac arrhythmias.
Journal of the American College of Cardiology, 2(6), 1114–1128
[11] Hosseinpoor, A. R., Bergen, N., Mendis, S., Harper, S., Verdes, E., Kunst, A., & Chatterji, S. (2012).
Socioeconomic inequality in the prevalence of noncommunicable diseases in low- and middle-income countries:
Results from the World Health Survey. BMC Public Health, 12(1), 1
[12] Hussain, J., Ali, M., Bilal, H. S. M., Afzal, M., Ahmad, H. F., Banos, O., & Lee, S. (2015). SNS based predictive
model for depression. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial
Intelligence and Lecture Notes in Bioinformatics), 9102, 349–354
[13] Karlsson, J., Taft, C., Rydén, A., Sjöström, L., & Sullivan, M. (2007). Ten-year trends in health-related quality
of life after surgical and conventional treatment for severe obesity: The SOS intervention study. International
Journal of Obesity, 31(8), 1248–1261
[14] Luca, M. (2015). User-Generated Content and Social Media. In Handbook of media Economics (Vol. 1, pp.
563–592). Elsevier B.V
[15] Marcus, M., Yasamy, M. T., Van Ommeren, M. V., Chisholm, D., & Saxena, S. (2012). Depression: A global
public health concern, WHO Dataset
[16] Razak, N. H., Mustapha, A., Nanthaamomphong, A., Abd Wahab, M. H., & Mostafa, S. A. (2021, May).
Prediction of Secondary Students Performance: A Case Study. In 2021 18th International Conference on
Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology (ECTI-CON)
(pp. 908-912). IEEE
[17] Mohr, D. C., Burns, M. N., Schueller, S. M., Clarke, G., & Klinkman, M. (2013). Behavioral Intervention
Technologies: Evidence review and recommendations for future research in mental health. General Hospital
Psychiatry, 35(4), 332–338
[18] Nambisan, P., Luo, Z., Kapoor, A., Patrick, T. B., & Cisler, R. A. (2015). Social media, big data, and public
health informatics: Ruminating behavior of depression revealed through twitter. Proceedings of the Annual
Hawaii International Conference on System Sciences, 2015-March(December 2017), 2906–2913
[19] Priya, A., Garg, S., & Tigga, N. P. (2020). Predicting Anxiety, Depression and Stress in Modern Life using
Machine Learning Algorithms. Procedia Computer Science, 167(2019), 1258–1267


[20] Scroll, P., & For, D. (2016). Clinical Use of an Alpha Asymmetry Neurofeedback Protocol in the Treatment of
Mood Disorders: Follow-Up Study One to Five Years Post Therapy
[21] Statista. (2021). SOCIAL NETWORK USERS IN LEADING MARKETS 2026 | STATISTA. Statista
[22] Technologies, M. (2014). Social Media – An Arena for Venting Negative Emotions Harri Jalonen, Turku
University of Applied Sciences, Finland. October, 53–70
[23] Wakefield, J. C., Schmitz, M. F., & Baer, J. C. (2010). Does the DSM-IV clinical significance criterion for major
depression reduce false positives? Evidence. American Journal of Psychiatry, 167(1), 298–304
William. (2016). CRISP-DM – a Standard Methodology to Ensure a Good Outcome
[24] Gupta, R., Sharma, A. K., Garg, O., Modi, K., Kassim, S., Baharum, Z., ... & Mostafa, S. A. (2021). WB-CPI:
Weather Based Crop Prediction in India using Big Data Analytics. IEEE Access
