Identification of Suicidal Intent Using Machine Learning Techniques Over Twitter Data
https://doi.org/10.22214/ijraset.2023.51000
International Journal for Research in Applied Science & Engineering Technology (IJRASET)
ISSN: 2321-9653; IC Value: 45.98; SJ Impact Factor: 7.538
Volume 11 Issue IV Apr 2023- Available at www.ijraset.com
Abstract: Machine learning based on categorical classification is used in a variety of fields such as prediction, finance,
supply chain management, sales and operations, and product analytics. This study shows how a Support Vector Machine (SVM)
model, a Supervised Learning technique, predicts the suicidal intent of a person's "tweet" on the social media platform
Twitter. The model indicates whether the text tweeted or posted by a person may be suicidal or not. The regularised data set
used for modelling is divided into a test set and a training set in a 3:7 ratio. The study uses the pandas, NumPy, Pillow,
scikit-learn, TextBlob and NLTK libraries. A large amount of actual Twitter data is used as the dataset for training and
testing so that the model can analyse the text as accurately as possible.
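The 3:7 test-to-training split mentioned above can be sketched as follows. This is a minimal illustration that uses only the Python standard library rather than the libraries named in the abstract; the function name `split_train_test`, the seed and the placeholder records are assumptions made for the example, not details from the study.

```python
import random

def split_train_test(rows, test_fraction=0.3, seed=42):
    """Shuffle a copy of rows and return (train, test) with the given test share."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical placeholder records: (tweet_text, label), 1 = suicidal, 0 = not suicidal.
rows = [(f"tweet {i}", i % 2) for i in range(10)]
train, test = split_train_test(rows)
print(len(train), len(test))
```

Shuffling before the split avoids an ordering bias when the source file is sorted by label or by date.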
I. INTRODUCTION
1) In today's world, the burden of work has become a fixture of people's daily routine and lifestyle, and it is driving some
people to extreme steps such as suicide.
2) Technology has positive effects on some aspects of human life, but it also has negative effects on human psychology,
contributing to depression, suicidal behaviour and other life-changing decisions.
3) Developers are now working on various approaches aimed at remedying problems such as suicide and depression.
4) These problems have a serious impact on the lives of people in the modern era. However, pioneering methods in computing
such as Machine Learning and Deep Learning, combined with Sentiment Analysis, have made it easier to address this issue with
Machine Learning techniques and algorithms.
5) The problems were studied and addressed using Deep Learning concepts and Data Science frameworks, supported by
well-established libraries and tools.
6) The interface was designed with interactive KPIs that make it easy for end users to gather data posted by users on the
Twitter platform that contains keywords associated with negative thoughts, such as "die", "unhappy" and "sad".
7) After extensive discussion within the team, the study arrived at a working solution: a Machine Learning model for
identifying suicidal intent, trained and tested on a large dataset with high accuracy.
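The keyword-based collection described in item 6 can be illustrated with a short sketch. It assumes whole-word matching against the three keywords named in the text; the function name `matched_keywords` and the sample tweets are hypothetical, and a keyword match only selects candidate tweets for the dataset, it is not a prediction of suicidal intent.

```python
import re

# Keywords named in the text; a real list would be longer (assumption).
NEGATIVE_KEYWORDS = {"die", "unhappy", "sad"}

def matched_keywords(tweet, keywords=NEGATIVE_KEYWORDS):
    """Return the keywords that occur as whole words in the tweet."""
    tokens = set(re.findall(r"[a-z']+", tweet.lower()))
    return tokens & set(keywords)

# Hypothetical sample tweets.
tweets = [
    "I feel so sad and unhappy today",
    "Great match last night!",
    "I just want to die",
]
flagged = [t for t in tweets if matched_keywords(t)]
print(flagged)
```

Matching on whole tokens rather than substrings keeps words such as "diet" or "sadly" from being flagged by accident.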
A. Background
Machine Learning algorithms are used to classify large volumes of data into smaller groups. We aim to classify the polarity
of each tweet as either suicidal or not suicidal. If a tweet is sad but not suicidal, the model should identify it as not
suicidal; in other words, the predominant sentiment is the one selected. Various machine learning algorithms can be used to
extract the features from the data.
B. Statement
1) The solution was implemented as an interactive Machine Learning model, using recent research publications in the relevant
field as references; their findings informed the working of the model.
2) The implementation used building-block algorithms that were developed from scratch. Concepts from Deep Learning served as
a prerequisite for the working model, with the aim of giving end users a better interface and more accurate results.
©IJRASET: All Rights are Reserved | SJ Impact Factor 7.538 | ISRA Journal Impact Factor 7.894 | 3487
C. Motivation
The main objective is a Machine Learning model that can accurately identify a person's suicidal intent using only the tweets
he/she has posted on Twitter as input. Therapists can use this application to track a patient's online behaviour and treat
him/her accordingly. Many Sentiment Analysis and Text Analysis models are readily available, but a model tailored
specifically to suicidal patients is not yet practically or commercially available in a ready-to-use state for therapists
and psychiatrists.
D. Challenge
The following are the challenges of this project:
1) Developing a scientifically sound, time- and cost-effective Machine Learning model for predicting a person's suicidal
intent from his/her Twitter data.
2) Accurately differentiating tweets that are merely sad from those that are suicidal.
3) Raising awareness of how social media acts as a medium through which people facing depression or anxiety share suicidal
thoughts.
4) Collecting a large Twitter dataset with which to train the model accurately.
5) Continually increasing the dataset size to improve accuracy while preventing overfitting.
6) Getting feedback from therapists on whether the model works as expected in practice.
IV. REQUIREMENTS
A. User Requirements
1) After signing up and logging in to the software, the user can paste the tweets to be tested into the main user interface
of the app.
2) The project uses a set of available libraries. These libraries also supply stop-word lists; stop words are removed from
the tweet content so that prediction focuses only on the core target words.
3) The user just has to click a button, and the software pops up a message saying whether the tweet is suicidal or not.
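The stop-word removal described in item 2 can be sketched as below. The stop-word set here is a small hand-written stand-in, chosen only for illustration; the libraries the project names (for example NLTK's English stop-word corpus) provide a much longer list. The function name `core_words` and the sample sentence are likewise hypothetical.

```python
import re

# Small illustrative stop-word set (assumption); a library list would be far longer.
STOPWORDS = {"i", "me", "my", "the", "a", "an", "to", "is", "am", "and", "of", "it"}

def core_words(tweet, stopwords=STOPWORDS):
    """Lower-case the tweet, split it into word tokens and drop stop words."""
    tokens = re.findall(r"[a-z']+", tweet.lower())
    return [t for t in tokens if t not in stopwords]

print(core_words("I am tired of the pain and I want it to end"))
# → ['tired', 'pain', 'want', 'end']
```

Only the remaining core words would then be passed on to feature extraction and prediction.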
B. Non-functional Requirements
In computer engineering and requirements engineering, a non-functional requirement specifies criteria that can be used to
judge the operation of a system, rather than specific behaviours. Non-functional requirements therefore contrast with
functional requirements, which define specific behaviour or functions. They add considerable value to business development,
yet they are commonly misunderstood. Business stakeholders and clients should state their requirements and expectations
clearly and in measurable terms. If a non-functional requirement is not measurable, it should be revised or rewritten for
better clarity. For example, in Agile methodology, user stories help to close the gap between developers and user
stakeholders.
C. Usability
The system prioritises the end user's tweets according to the sentiment analysis and then segments them by level of
importance while working with the dataset.
D. Reliability
Reliability refers to the level of confidence in a system that is established over time through its use. It indicates the
extent to which software can perform its intended functions without encountering errors or issues within a specific
timeframe. The issues and bugs encountered while running the working model were fixed through reliability and behavioural
testing and through dedicated discussion sessions held throughout the development of the model. The goal is to create
concise, well-targeted algorithms for the machine learning model, so that the model is easy to implement and familiar to
its users.
E. Performance
Performance requirements should state under what circumstances system response times are to be measured, in particular at
peak times such as stress periods at the end of the month or during payroll disbursement, and whether there are times when
the load on the system will be abnormally high.
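One way to collect response-time measurements is sketched below, using only the Python standard library. The `predict` function is a hypothetical placeholder that performs no real classification; in practice it would be replaced by the call into the trained model, and the batch size of 1000 is an arbitrary choice for the example.

```python
import statistics
import time

def predict(tweet):
    """Hypothetical placeholder for the trained model's prediction call."""
    return len(tweet.split()) > 0

def measure_latency(fn, inputs):
    """Return the wall-clock duration of each call to fn, in seconds."""
    durations = []
    for item in inputs:
        start = time.perf_counter()
        fn(item)
        durations.append(time.perf_counter() - start)
    return durations

samples = measure_latency(predict, ["example tweet"] * 1000)
median_s = statistics.median(samples)
worst_s = max(samples)
print(f"median {median_s * 1e6:.1f} us, worst {worst_s * 1e6:.1f} us")
```

Reporting the worst case alongside the median shows how response times behave under load, not just in the typical case.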
V. IMPLEMENTATION OF SYSTEM
A. Existing System
The increasing number of suicides is not a new issue. Many past projects have addressed it and tried to find a practical
solution that would eventually help reduce the number of suicides. Most of these projects involve models that use Sentiment
Analysis or Text Analysis to separate texts into negative and positive behaviours.
B. Disadvantages
The major disadvantage of these models is that they fail to distinguish tweets that are merely sad from those that are
actually suicidal and may indicate that the person has suicidal intent.
C. Proposed System
This research proposes a Machine Learning model that identifies, from the tweets a person posts on Twitter, whether he/she
may be having thoughts of suicide. Therapists can use the model in practice to analyse the behavioural patterns of their
patients and treat them with appropriate medication. A Support Vector Machine (SVM) is used for classification. An SVM maps
the input data points into a higher-dimensional feature space, where a hyperplane is used to separate the classes or predict
the target values. It aims to find the optimal hyperplane, the one that maximizes the margin, which is the distance between
the hyperplane and the closest data points from each class.
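The notion of margin can be made concrete with a small sketch. It covers only the linear case in two dimensions and does not train an SVM: it computes the margin of two hand-picked candidate hyperplanes over toy points, to show which one a maximum-margin classifier would prefer. The function names, points and hyperplanes are all assumptions chosen for illustration.

```python
import math

def signed_distance(w, b, x):
    """Signed distance of point x from the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm

def geometric_margin(w, b, points, labels):
    """Smallest label-adjusted distance; positive only if every point is on its correct side."""
    return min(y * signed_distance(w, b, x) for x, y in zip(points, labels))

# Toy 2-D feature vectors (hypothetical), with labels +1 and -1.
points = [(2.0, 2.0), (3.0, 3.0), (0.0, 0.0), (-1.0, 0.0)]
labels = [1, 1, -1, -1]

# Two candidate separating hyperplanes; both classify every point correctly.
margin_a = geometric_margin((1.0, 1.0), -2.0, points, labels)  # line x + y = 2
margin_b = geometric_margin((1.0, 0.0), -1.5, points, labels)  # line x = 1.5
print(margin_a, margin_b)
```

Both lines separate the two classes, but the first leaves more room between the boundary and the nearest points, which is the property the SVM optimises for.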
VII. CONCLUSION
Machine Learning approaches such as Supervised Learning are highly valuable for solving practical problems. The primary aim
of this study is to establish how an SVM model can detect an individual's suicidal intent by analysing the tweets he/she
has posted. A corpus of authentic tweets is used to achieve this objective.
RESULTS