Predicting Psycho-Somatic Disorders in Online Activity Using Multi-Layer Perceptron
Corresponding Author:
Manjunath Gadiparthi
Department of Computer Science and Engineering, Acharya Nagarjuna University
Guntur, Andhra Pradesh, India
Email: [email protected]
1. INTRODUCTION
The way people communicate has changed dramatically with the rise of social networking websites such as
LinkedIn, Facebook, and Twitter [1]. This change has made it far more challenging to predict precisely the
problems that a sizable proportion of users may encounter. Studies indicate that people who spend a lot of time
on social media are more likely to experience mental health issues, including social depression and anxiety,
and to be exposed to inappropriate information. Predicting user behaviour from data on the amount of time spent
on social networks is becoming an increasingly popular way of anticipating user activities [2]. Data obtained
from social networks are often imbalanced in comparison with traditional data [3]. As a result, it can be
difficult to predict user problems accurately from such data.
Machine learning provides algorithms that can learn from large data sets, generalize, and make predictions
[4]. Understanding these algorithms is helpful when deciding how to put them to use. Because machine learning
plays a role in both computational statistics and decision-making, the two fields are closely related [5].
Machine learning paradigms are used in a wide range of applications, such as estimating how many units of a
product will be sold or calculating the chance of rain in a particular area. Combining systematic analysis with
machine learning algorithms facilitates the building of prediction models for specific problems related to the
amount of time spent on social networks [6]. This involves monitoring adverse events in users during a trial
run and determining the best forecast for each user. In this research, machine learning methods were used to
predict such problems, and we propose an integration framework for a decision support system. To build the
decision support system, the support vector machine kernel technique was used, and the performance of the
system was assessed [7].
Machine learning algorithms learn from massive data sets, generalise their results, and make predictions
based on what they have learned. This knowledge is helpful when deciding precisely how to implement these
methods. Computational statistics and decision-making research both involve machine learning, so the two areas
of study are closely connected [8]. Machine learning, like robotics-related methods, is used in many different
applications, such as predicting product sales and estimating how likely it is to rain in a given place [9].
In our setting, it involves monitoring negative user events during a trial run and choosing the most accurate
forecast for each user. Systems analysis combined with machine learning techniques supports the creation of
prediction models for specific scenarios, of which the amount of time spent on social networks is one example.
To foresee such problems, we used machine learning techniques and developed an integration framework for a
decision support system as part of our study. In our earlier work, we built a decision support system for
predicting the problem using models such as logistic regression, support vector machines, random forests, and
neural networks [10], and then evaluated the system's performance. The article is organised as follows:
Section 1 provides an introduction, Section 2 discusses related work on problem prediction, Section 3 describes
the data and methods for social network problem prediction, Section 4 presents the implementation of the
prediction model and the results, and Section 5 provides the conclusion.
2. RELATED WORK
Although conventional statistical modelling can produce reliable models, artificial intelligence (AI)
techniques can facilitate the development of high-quality prediction models [11], [12]. One study proposes a
machine learning solution in the form of a multi-layer perceptron (MLP) artificial neural network (ANN) [13] to
model the progression of a disease. The model forecasts the maximum number of cases of illness per location and
time unit, as well as the maximum number of recoveries and the maximum number of deaths per location and time
unit. The MLP was chosen over other AI techniques because it is simpler to understand how it works. The authors
wanted to test the viability of modelling with relatively simple methods for three reasons: such methods have
shorter training times; disease modelling needs models with adequate regression performance as fast as
possible; and the MLP is the most straightforward AI algorithm. Statistical analysis makes it possible to build
models from previously obtained data. However, statistical analysis may fail to capture the complexities of the
data when the underlying model is particularly complex [14]. Complex algorithms, especially AI and machine
learning algorithms, can "learn" not only the general trend of the data but also its complexities, which leads
to models of higher quality [15]. AI algorithms are now used in a wide range of scientific and commercial
domains, including medicine, where they are used to classify a wide range of disorders and to build regression
models for estimation and forecasting [16]. These models adjust their parameters to fit their predictions to the
available data, even though the data may or may not include the information being predicted. In doing so, the
models take into account interactions between a wide range of input variables, interactions that would not have
been considered with conventional modelling methods [17].
Yu et al. [18] used support vector regression (SVR) to demonstrate the effectiveness of the maximal
information coefficient technique for feature selection when determining dissolved oxygen (DO) concentration.
In terms of root mean square error (RMSE), the results obtained with the optimised dataset were considerably
better (by 28.5%) than those obtained with the initial input configuration. For the same task, Csábrági et al.
[19] showed that three standard ANN types, namely the multi-layer perceptron (MLP), the radial basis function
(RBF) network, and the general regression neural network (GRNN), are effective. Heddam [20] proposed an
innovative ANN-based model, referred to as an evolving fuzzy neural network, for modelling. Fuzzy-based models
have also been examined widely: a total of fifty-one studies have investigated their applicability. The
adaptive neuro-fuzzy inference system (ANFIS) is an efficient data mining approach that has been investigated
in a variety of studies. Further work on the application of machine learning can be found in [21]. Ouma et al.
[22] studied the ability of a feed-forward ANN and a multiple linear regression (MLR) model to reproduce the DO
levels found in the Nyando River in Kenya. The correlation obtained with the ANN was much higher than that
obtained with the MLR (0.8546 versus 0.6199). The accuracy of the proposed model was reported to be roughly 8%,
17%, and 12% higher than that of typical data mining approaches such as feed-forward ANN, SVR, and GRNN,
respectively. DO prediction for the following hour also achieved the highest accuracy (coefficient of
determination R2 = 0.908). Tiyasha et al. [23] analysed a variety of water quality parameters by combining two
well-known machine learning models, random forest (RF) and extreme gradient boosting, with a denoising method
called "complete ensemble empirical mode decomposition with adaptive noise". Comparable work can be found in
[24], [25]. They showed that the RF-based ensemble can accurately simulate DO, temperature, and specific
conductance, and they illustrated the applicability of the proposed strategies by comparing them with several
commonly used tools. Similarly, Heino et al. [14] showed that RF is superior to MLR for DO modelling and added
that water temperature and pH are the two most critical factors in this procedure. Pflüger and Glorius [15]
compared the MLP, RBF, ANFIS (sub-clustering), and ANFIS (grid partitioning). For a single station (number:
02156500), the outputs of the MLP were the most closely related to the measured DO, as shown by R2 values of
0.98, 0.96, 0.95, and 0.86, respectively.
A multi-layer perceptron (MLP) is a type of neural network that uses the back-propagation method for
supervised learning. The MLP is typically organised in three kinds of layers, an input layer, one or more
hidden layers, and an output layer, as shown in Figure 3. In this configuration, each neuron is connected to
all of the neurons in the following layer. Several reports have shown that the MLP plays a significant role in
solving non-linear problems.
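To make the layered structure and back-propagation concrete, the following is a minimal NumPy sketch of a one-hidden-layer MLP trained on the XOR problem, a standard non-linear toy task. It is not the network used in this study: the sigmoid activation, mean squared error loss, layer size, learning rate, and the function names `forward` and `train_mlp` are all assumptions chosen for illustration.

```python
import numpy as np


def sigmoid(z):
    """Logistic activation, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))


def forward(params, X):
    """Forward pass: input layer -> hidden layer -> output layer."""
    W1, b1, W2, b2 = params
    hidden = sigmoid(X @ W1 + b1)        # every input feeds every hidden neuron
    output = sigmoid(hidden @ W2 + b2)   # every hidden neuron feeds the output
    return hidden, output


def train_mlp(X, y, n_hidden=4, lr=0.5, epochs=5000, seed=0):
    """Batch gradient descent with back-propagation (illustrative settings)."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(size=(X.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(size=(n_hidden, 1))
    b2 = np.zeros(1)
    losses = []
    for _ in range(epochs):
        hidden, output = forward((W1, b1, W2, b2), X)
        error = output - y
        losses.append(float(np.mean(error ** 2)))  # mean squared error
        # Back-propagate: chain rule through the output and hidden sigmoids.
        grad_out = (2.0 / len(X)) * error * output * (1.0 - output)
        grad_hidden = (grad_out @ W2.T) * hidden * (1.0 - hidden)
        W2 -= lr * (hidden.T @ grad_out)
        b2 -= lr * grad_out.sum(axis=0)
        W1 -= lr * (X.T @ grad_hidden)
        b1 -= lr * grad_hidden.sum(axis=0)
    return (W1, b1, W2, b2), losses


# XOR: a small non-linear problem that a single linear unit cannot separate.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([[0.0], [1.0], [1.0], [0.0]])

params, losses = train_mlp(X, y)
_, predictions = forward(params, X)
print("first loss:", losses[0], "last loss:", losses[-1])
print("predictions:", predictions.ravel())
```

Whether this small network separates XOR cleanly depends on the random initialisation; the sketch is only meant to show how the error is propagated backwards from the output layer to the hidden layer.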
Breeding: the remaining 12 places in the population are now filled so that the next generation has a
full set of 16 networks. This is done as part of the process known as "propagation". The population of the next
generation is produced from the population of the current one through a process known as crossover. Producing
one or more children for the next generation requires at least two individuals from the current population,
referred to as parents. The parents are chosen according to their scores, and their network parameters are then
mixed to create a new child that is a hybrid of its parents. In this study, each child is a network with a
unique combination of randomly selected parameters inherited from its parents.
Transformation: at this point, we have the population that will be used for the next generation.
Throughout this process, some of the properties of the selected networks that make up the population are
reassigned at random. The purpose of this step is to produce individuals that are even better than the
existing ones.
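The breeding and transformation steps described above can be sketched as follows. This is an illustrative outline using only the Python standard library, not the authors' implementation: the search space, the mutation rate, the number of generations, and the stand-in `toy_fitness` function are hypothetical. In the study, the score of a network would be its measured prediction accuracy after training. The sketch keeps the 4 best of 16 networks and breeds the remaining 12, which matches the population sizes given in the text.

```python
import random

# Hypothetical search space; the paper does not list its exact hyperparameters.
SEARCH_SPACE = {
    "n_layers": [1, 2, 3, 4],
    "n_neurons": [16, 32, 64, 128],
    "activation": ["relu", "tanh", "sigmoid"],
    "optimizer": ["adam", "sgd", "rmsprop"],
}


def random_network(rng):
    """Draw one network configuration at random."""
    return {key: rng.choice(values) for key, values in SEARCH_SPACE.items()}


def crossover(mother, father, rng):
    """Child takes each parameter from one of its two parents at random."""
    return {key: rng.choice([mother[key], father[key]]) for key in SEARCH_SPACE}


def mutate(network, rng, rate=0.2):
    """With probability `rate`, reassign one randomly chosen parameter."""
    child = dict(network)
    if rng.random() < rate:
        key = rng.choice(list(SEARCH_SPACE))
        child[key] = rng.choice(SEARCH_SPACE[key])
    return child


def next_generation(population, fitness, rng, n_keep=4):
    """Keep the best `n_keep` networks and breed children to refill the rest."""
    ranked = sorted(population, key=fitness, reverse=True)
    survivors = ranked[:n_keep]
    children = []
    while len(survivors) + len(children) < len(population):
        mother, father = rng.sample(survivors, 2)  # two distinct parents
        children.append(mutate(crossover(mother, father, rng), rng))
    return survivors + children


def toy_fitness(network):
    """Stand-in score; in practice this would be the trained MLP's accuracy."""
    return network["n_layers"] * network["n_neurons"]


rng = random.Random(0)
population = [random_network(rng) for _ in range(16)]
initial_best = max(toy_fitness(net) for net in population)

for _ in range(10):  # 10 generations is an assumed value
    population = next_generation(population, toy_fitness, rng)

best = max(population, key=toy_fitness)
print("best configuration:", best)
```

Because the top-scoring networks are carried over unchanged, the best score in the population can never decrease from one generation to the next, which is consistent with the observation that accuracy improves as the population ages.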
In this work, we trained about 160 MLPs in total, and the population became more robust as the MLPs
proliferated. The following part evaluates the predictive accuracy of the final population and ranks the top
five neural networks. The predictive accuracy of the networks was found to increase with the age of the
population. The models were trained and run using the scikit-learn and Keras libraries in Python. To assess
performance, we employed cross-validation, a resampling strategy that is elaborated upon in the next section.
The results of the proposed model were compared with those of traditional machine learning algorithms.
According to the simulation results, the proposed model outperforms the support vector machine, logistic
regression, and random forest classifiers in terms of prediction.
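A comparison of this kind could be set up in scikit-learn roughly as follows. This is a hedged sketch rather than the authors' code: the synthetic data from `make_classification` stands in for the social network data set, and the model settings (hidden layer size, number of trees, scaling) are assumptions. The value k = 8 follows the number of folds stated in the text.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic, mildly imbalanced binary data used only as a placeholder.
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=5,
    weights=[0.7, 0.3], random_state=0,
)

# Candidate models; all hyperparameters here are illustrative assumptions.
models = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)),
    "svm": make_pipeline(StandardScaler(), SVC()),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "mlp": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0)),
}

# Stratified folds keep the class proportions similar in every split.
cv = StratifiedKFold(n_splits=8, shuffle=True, random_state=0)
scoring = ["precision", "recall", "f1"]

summary = {}
for name, model in models.items():
    scores = cross_validate(model, X, y, cv=cv, scoring=scoring)
    summary[name] = {
        metric: (scores[f"test_{metric}"].mean(), scores[f"test_{metric}"].std())
        for metric in scoring
    }

for name, stats in summary.items():
    print(name, {m: f"{mean:.3f} +/- {std:.3f}" for m, (mean, std) in stats.items()})
```

Reporting the mean together with the standard deviation across folds shows not only which model scores highest on average but also how stable that score is from split to split.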
In k-fold cross-validation, the data set is divided into k parts (k − 1 parts are used for model training
and one part is retained for model testing), where k is the number of parts. The technique is referred to as
stratified because it attempts to keep the number of samples from each class balanced across the k splits. In
this study, the value of k was set to 8, and the overall accuracy of each model was calculated as the mean of
its results across the folds. We decided to assess the performance of the classifiers with three metrics that
are common in machine learning: precision, recall, and F1-score. When analysing performance, it is important to
consider the reliability of the data being examined. Because we used 10-fold cross-validation, we were able to
calculate the mean and standard deviation of each model's accuracy, recall, and F1-score, which allowed us to
compare the performance of the different models. Each prediction falls into one of four possible outcomes,
which serve as the framework for the discussion that follows.
− Tp - True positive: the prediction is positive and the individual is facing trouble
− Tn - True negative: the prediction is negative and the individual is well
− Fp - False positive: the prediction is positive but the individual is well (a false alarm)
− Fn - False negative: the prediction is negative but the individual is facing trouble (the most serious error)
Accuracy = (Tp+Tn)/(Tp+Fp+Fn+Tn)
Precision = Tp/(Tp+Fp)
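The following short sketch computes these metrics from the four outcome counts. Accuracy and precision follow the formulas above; recall and F1-score, which the text names but does not write out, use their standard definitions, Recall = Tp/(Tp+Fn) and F1 = 2 × Precision × Recall/(Precision + Recall). The function name and the example counts are hypothetical and are not results from this study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall, and F1 from confusion counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts for 100 users, for illustration only.
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

In this made-up example, precision is higher than recall because there are fewer false alarms (Fp) than missed cases (Fn); F1 balances the two.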
Figure 4 compares the predictive capabilities of the logistic regression, random forest, and MLP
models for the anxiety problem. Of the three models, the MLP gives the best forecasts for identifying anxiety
problems.
Figure 5 compares the predictive capabilities of the logistic regression, random forest, and MLP
models for the fear of missing out (FOMO) problem, evaluating the accuracy of FOMO prediction across these
machine learning methods. Compared with the other models, the MLP gives the best predictions for identifying
the FOMO problem.
Figure 6 compares the logistic regression, random forest, and MLP models for obesity prediction.
Compared with the other models, the MLP provides the most accurate predictions for recognising the obesity
problem.
These results can be compared with previous work carried out on different data sets and in different
domains. One study found that career self-efficacy mediates the effect of instructional quality and social
support on the career construction of civil engineering vocational high school students, a finding with major
implications for vocational educators who design career development or strengthening programmes [26]. In
another study, a correlation-based filter helped classifiers choose the most important features to improve
classification accuracy; sensitivity, specificity, accuracy, and precision were analysed using Weka and the
Statistical Package for the Social Sciences (SPSS), and a decision tree (J48) classified cardiovascular disease
(CVD) patients with 95.76% accuracy [27]. Our MLP achieves 94-98% accuracy, which compares favourably with
these earlier works in different data domains.
Figure 5. Visualising FOMO forecasting with different models
Figure 6. Visualising obesity forecasting with different models
5. CONCLUSION
In this research, we present a powerful MLP-based machine learning prediction model for spotting
potential problems associated with excessive use of social networking sites (social network time). To analyse
and evaluate our data set, we also selected models such as logistic regression, support vector machine (SVM),
and random forest. We ran several iterations, varying both the training set and the test set, to arrive at the
most accurate results possible. After extensive experimentation to validate our hypothesis, we analysed the
results using three separate performance metrics. Based on the outcome, we conclude that the MLP is the best
model for predicting problems related to time spent on social networks. The findings of this study could be
applied to social networking applications in the future, so that users are made aware of any risks that may
arise from excessive use of an app.
REFERENCES
[1] K. Y. Sin, A. A. Mohamad, and M. C. Lo, “A critical review of literature in the rising tide of social media towards promoting
tourism,” Enlightening Tourism, vol. 10, no. 2, pp. 270–305, Dec. 2020, doi: 10.33776/et.v10i2.4887.
[2] H. Christian, D. Suhartono, A. Chowanda, and K. Z. Zamli, “Text based personality prediction from multiple social media data
sources using pre-trained language model and model averaging,” Journal of Big Data, vol. 8, no. 1, May 2021,
doi: 10.1186/s40537-021-00459-1.
[3] M. M. Ahsan and Z. Siddique, “Machine learning-based heart disease diagnosis: A systematic literature review,” Artificial
Intelligence in Medicine, vol. 128, p. 102289, Jun. 2022, doi: 10.1016/j.artmed.2022.102289.
[4] J. A. M. Sidey-Gibbons and C. J. Sidey-Gibbons, “Machine learning in medicine: a practical introduction,” BMC Medical Research
Methodology, vol. 19, no. 1, Mar. 2019, doi: 10.1186/s12874-019-0681-4.
[5] A. Campolo and K. Crawford, “Enchanted determinism: Power without responsibility in artificial intelligence,” Engaging Science,
Technology, and Society, vol. 6, pp. 1–19, Jan. 2020, doi: 10.17351/ests2020.277.
[6] S. Garg, S. Sinha, A. K. Kar, and M. Mani, “A review of machine learning applications in human resource management,”
International Journal of Productivity and Performance Management, vol. 71, no. 5, pp. 1590–1610, Feb. 2022,
doi: 10.1108/IJPPM-08-2020-0427.
[7] A. Yahyaoui, A. Jamil, J. Rasheed, and M. Yesiltepe, “A decision support system for diabetes prediction using machine learning
and deep learning techniques,” 1st International Informatics and Software Engineering Conference: Innovative Technologies for
Digital Transformation, IISEC 2019 - Proceedings, 2019, doi: 10.1109/UBMYK48245.2019.8965556.
[8] Q. B. Pham et al., “A comparison among fuzzy multi-criteria decision making, bivariate, multivariate and machine learning models
in landslide susceptibility mapping,” Geomatics, Natural Hazards and Risk, vol. 12, no. 1, pp. 1741–1777, Jan. 2021, doi:
10.1080/19475705.2021.1944330.
[9] Y. K. Dwivedi et al., “Artificial Intelligence (AI): Multidisciplinary perspectives on emerging challenges, opportunities, and agenda
for research, practice and policy,” International Journal of Information Management, vol. 57, p. 101994, Apr. 2021, doi:
10.1016/j.ijinfomgt.2019.08.002.
[10] V. Rodriguez-Galiano, M. Sanchez-Castillo, M. Chica-Olmo, and M. Chica-Rivas, “Machine learning predictive models for mineral
prospectivity: An evaluation of neural networks, random forest, regression trees and support vector machines,” Ore Geology
Reviews, vol. 71, pp. 804–818, Dec. 2015, doi: 10.1016/j.oregeorev.2015.01.001.
[11] N. Le Chau et al., “An effective approach of adaptive neuro-fuzzy inference system-integrated teaching learning-based optimization
for use in machining optimization of S45C CNC turning,” Optimization and Engineering, vol. 20, no. 3, pp. 811–832, Dec. 2019,
doi: 10.1007/s11081-018-09418-x.
ISSN: 2252-8938
[12] A. A. Masrur Ahmed, E. Sharma, S. Janifer Jabin Jui, R. C. Deo, T. Nguyen-Huy, and M. Ali, “Kernel ridge regression hybrid
method for wheat yield prediction with satellite-derived predictors,” Remote Sensing, vol. 14, no. 5, p. 1136, Feb. 2022, doi:
10.3390/rs14051136.
[13] M. Taki, A. Rohani, and H. Yildizhan, “Application of machine learning for solar radiation modeling,” Theoretical and Applied
Climatology, vol. 143, no. 3–4, pp. 1599–1613, Jan. 2021, doi: 10.1007/s00704-020-03484-x.
[14] M. T. J. Heino, K. Knittle, C. Noone, F. Hasselman, and N. Hankonen, “Studying behaviour change mechanisms under complexity,”
Behavioral Sciences, vol. 11, no. 5, p. 77, May 2021, doi: 10.3390/bs11050077.
[15] P. M. Pflüger and F. Glorius, “Molecular machine learning: The future of synthetic chemistry?,” Angewandte Chemie - International
Edition, vol. 59, no. 43, pp. 18860–18865, Sep. 2020, doi: 10.1002/anie.202008366.
[16] I. H. Sarker, “Machine learning: Algorithms, real-world applications and research directions,” SN Computer Science, vol. 2, no. 3,
2021, doi: 10.1007/s42979-021-00592-x.
[17] J. Wang, “Fast identification of possible drug treatment of coronavirus disease-19 (Covid-19) through computational drug
repurposing study,” Journal of Chemical Information and Modeling, vol. 60, no. 6, pp. 3277–3286, Feb. 2020, doi:
10.1021/acs.jcim.0c00179.
[18] W. Yu, T. Liu, R. Valdez, M. Gwinn, and M. J. Khoury, “Application of support vector machine modeling for prediction of common
diseases: The case of diabetes and pre-diabetes,” BMC Medical Informatics and Decision Making, vol. 10, no. 1, 2010, doi:
10.1186/1472-6947-10-16.
[19] A. Csábrági et al., “Estimation of dissolved oxygen in riverine ecosystems: Comparison of differently optimized neural networks,”
Ecological Engineering, vol. 138, pp. 298–309, Nov. 2019, doi: 10.1016/j.ecoleng.2019.07.023.
[20] A. Afram, F. Janabi-Sharifi, A. S. Fung, and K. Raahemifar, “Artificial neural network (ANN) based model predictive control
(MPC) and optimization of HVAC systems: A state of the art review and case study of a residential HVAC system,” Energy and
Buildings, vol. 141, pp. 96–113, Apr. 2017, doi: 10.1016/j.enbuild.2017.02.012.
[21] L. Fan, Y. Wang, X. Fang, and J. Jiang, “To predict the power generation based on machine learning method,” Journal of Physics:
Conference Series, vol. 2310, no. 1, p. 12084, Oct. 2022, doi: 10.1088/1742-6596/2310/1/012084.
[22] Y. O. Ouma, C. O. Okuku, and E. N. Njau, “Use of artificial neural networks and multiple linear regression model for the prediction
of dissolved oxygen in rivers: Case study of hydrographic basin of River Nyando, Kenya,” Complexity, vol. 2020, pp. 1–23, May
2020, doi: 10.1155/2020/9570789.
[23] T. Tiyasha et al., “Functionalization of remote sensing and on-site data for simulating surface water dissolved oxygen: Development
of hybrid tree-based artificial intelligence models,” Marine Pollution Bulletin, vol. 170, p. 112639, Sep. 2021, doi:
10.1016/j.marpolbul.2021.112639.
[24] B. Mohammadi, Y. Guan, R. Moazenzadeh, and M. J. S. Safari, “Implementation of hybrid particle swarm optimization-differential
evolution algorithms coupled with multi-layer perceptron for suspended sediment load estimation,” Catena, vol. 198, p. 105024,
Mar. 2021, doi: 10.1016/j.catena.2020.105024.
[25] R. Honysz, “Modeling the chemical composition of ferritic stainless steels with the use of artificial neural networks,” Metals, vol.
11, no. 5, p. 724, Apr. 2021, doi: 10.3390/met11050724.
[26] I. N. Saputro, Soenarto, and H. Sofyan, “How to improve career construction for civil engineering students?,” International Journal
of Evaluation and Research in Education, vol. 12, no. 2, pp. 1007–1015, Jun. 2023, doi: 10.11591/ijere.v12i2.24323.
[27] B. Abuhaija et al., “A comprehensive study of machine learning for predicting cardiovascular disease using Weka and SPSS tools,”
International Journal of Electrical and Computer Engineering, vol. 13, no. 2, pp. 1891–1902, Apr. 2023, doi:
10.11591/ijece.v13i2.pp1891-1902.
BIOGRAPHIES OF AUTHORS