Depression Detection in Tweets Using Logistic Regression Model
Volume 5 Issue 4, May-June 2021 Available Online: www.ijtsrd.com e-ISSN: 2456 – 6470
@ IJTSRD | Unique Paper ID – IJTSRD41284 | Volume – 5 | Issue – 4 | May-June 2021 Page 724
International Journal of Trend in Scientific Research and Development (IJTSRD) @ www.ijtsrd.com eISSN: 2456-6470
capacity for each test node in a decision tree. Splits are evaluated by calculating entropy. Podgorelec, V., & Zorman, M. (2014) gave an overview of decision tree learning in Encyclopedia of Complexity and Systems Science 1-28 [3].

Fig. 2.1 Decision Tree

b. Random Forest
The random forest classifier builds multiple decision trees from randomly selected subsets of the training dataset. Rodriguez-Galiano, V. F., Ghimire, B., Rogan, J., Chica-Olmo, M., & Rigol Sanchez, J. P. (2012) assessed the effectiveness of a random forest classifier for land-cover classification in ISPRS Journal of Photogrammetry and Remote Sensing 67: 93-104 [4]. The forest then tallies the votes of the individual decision trees to decide the final class of each test object. Paul, A., Mukherjee, D. P., Das, P., Gangopadhyay, A., Chintha, A. R., & Kundu, S. (2018) proposed an improved random forest for classification in IEEE Transactions on Image Processing 27 (8): 4012-4024 [5], in which a random forest classifier with a reduced number of trees was presented.

c. Support Vector Machine
The Support Vector Machine offers outstanding classification capability and model quality, separating the data into two classes with the maximum possible margin between them [6][7].

d. K-Nearest Neighbour
K-Nearest Neighbour (K-NN) is one of the simplest algorithms used in machine learning for classification and regression problems. Based on the nearest training examples, KNN classifies incoming data points: each point is assigned to the class of its closest neighbours. Taneja, S., Gupta, C., Goyal, K., & Gureja, D. (2014) proposed an enhanced k-nearest-neighbour algorithm using information gain and clustering in the Fourth International Conference on Advanced Computing & Communication Technologies, IEEE: 325-329 [8]. KNN is often used to classify new data because of its ease of implementation and effectiveness.
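The entropy-based split evaluation mentioned for decision trees can be sketched in a few lines of Python. This is an illustrative implementation only, not the paper's code:

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(parent, left, right):
    """Reduction in entropy achieved by splitting `parent` into `left` and `right`."""
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted

# A pure node has entropy 0; an evenly mixed binary node has entropy 1.
assert entropy([1, 1, 1, 1]) == 0.0
assert entropy([0, 0, 1, 1]) == 1.0
# A split that perfectly separates the classes recovers the full entropy.
assert information_gain([0, 0, 1, 1], [0, 0], [1, 1]) == 1.0
```

A decision tree learner would evaluate every candidate split this way and pick the one with the highest information gain.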
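The random forest's vote aggregation described above can be illustrated with toy stand-in "trees" (simple threshold rules here, hypothetical stubs rather than real trained trees):

```python
from collections import Counter

def forest_predict(trees, x):
    """Aggregate the votes of the individual trees: the majority class wins."""
    votes = Counter(tree(x) for tree in trees)
    return votes.most_common(1)[0][0]

# Three toy "trees": stubs that each apply one threshold rule to a 2-D point.
trees = [
    lambda x: 1 if x[0] > 0.5 else 0,
    lambda x: 1 if x[1] > 0.5 else 0,
    lambda x: 1 if x[0] + x[1] > 1.2 else 0,
]
print(forest_predict(trees, (0.9, 0.8)))  # 1 -> all three trees vote 1
print(forest_predict(trees, (0.1, 0.6)))  # 0 -> two of the three vote 0
```

In a real random forest each tree is trained on a bootstrap sample of the data, but the final decision is made by exactly this kind of majority vote.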
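The nearest-neighbour rule described for K-NN can be sketched directly; the toy coordinates below are invented for illustration:

```python
from collections import Counter
from math import dist  # Euclidean distance, Python 3.8+

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    neighbours = sorted(zip(train_X, train_y), key=lambda p: dist(p[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Toy data: class 0 clustered near the origin, class 1 near (5, 5).
X = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5), (5, 6)]
y = [0, 0, 0, 1, 1, 1]
print(knn_predict(X, y, (0.5, 0.5)))  # 0
print(knn_predict(X, y, (5.5, 5.5)))  # 1
```

Because there is no training step beyond storing the data, K-NN is as easy to implement as the text suggests, at the cost of computing distances to every stored point at prediction time.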
IV. Proposed System
The goal of this project is to build a model that can help us detect whether a person is suffering from depression based on their tweets. A user can compose a tweet in the text box, the model we have built will analyse it, and the outcome will be returned. I will be using the Python web framework Flask to integrate the model and make it more accessible to the common user.

The proposed system can help make people aware of their mental health so that they can take the necessary measures to help themselves. We take the dataset and clean it with a justifiable goal. The dataset contains tweets and labels (0 and 1): if a tweet is depressive the label is 1, and if it is not the label is 0. I will be using the Logistic Regression model and checking its accuracy. Later, the model will be saved using the pickle library.

A document-term matrix (DTM) records the frequency of words in each tweet, where each row represents a tweet document and each column represents a word used across all documents. TF-IDF is used to measure the words' weight. Features applied to the DTM are then merged with account measures extracted from the social network and user activities. The result of the merge is then treated as the independent variables in the classification algorithms to predict the dependent variable of the outcome of interest. Ultimately, we decide upon the Logistic Regression algorithm.

V. Results and Discussion
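The pipeline described in the proposed system (TF-IDF weighting of tweet text, logistic regression classification, accuracy check, and pickling the trained model) can be sketched with scikit-learn. The tweets below are invented stand-ins for the real labelled dataset, and the file name `model.pkl` is an assumption:

```python
import pickle
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

# Toy stand-in for the real tweet dataset (label 1 = depressive, 0 = not).
tweets = ["I feel hopeless and alone", "what a great sunny day",
          "I can't get out of bed anymore", "excited for the weekend trip",
          "everything feels pointless", "loving this new song"]
labels = [1, 0, 1, 0, 1, 0]

X_train, X_test, y_train, y_test = train_test_split(
    tweets, labels, test_size=0.33, stratify=labels, random_state=42)

model = Pipeline([
    ("tfidf", TfidfVectorizer()),   # weigh words by TF-IDF
    ("clf", LogisticRegression()),  # binary classifier on the weighted DTM
])
model.fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, model.predict(X_test)))

# Persist the trained model for later use in the Flask app.
with open("model.pkl", "wb") as f:
    pickle.dump(model, f)
```

On the tiny toy sample the accuracy figure is meaningless; with the real dataset the same three steps (vectorise, fit, score) apply unchanged.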
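The Flask integration mentioned above might look like the following minimal sketch. The `/predict` route name is an assumption; in the real app the pickled model trained earlier would be loaded from disk, whereas here a tiny stand-in model is fitted inline so the sketch runs on its own:

```python
from flask import Flask, request
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

app = Flask(__name__)

# Real app would do:
#   with open("model.pkl", "rb") as f:
#       model = pickle.load(f)
# Stand-in model so this sketch is self-contained:
model = Pipeline([("tfidf", TfidfVectorizer()), ("clf", LogisticRegression())])
model.fit(["I feel hopeless", "lovely day outside"], [1, 0])

@app.route("/predict", methods=["POST"])
def predict():
    # Read the tweet typed into the page's text box and score it.
    tweet = request.form.get("tweet", "")
    return {"depressed": bool(int(model.predict([tweet])[0]))}

if __name__ == "__main__":
    app.run(debug=True)
```

Posting a form field named `tweet` to `/predict` returns a small JSON object with the model's verdict, which the page can display to the user.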
References
[1] Jamil, Z. Monitoring tweets for depression to detect at-risk users. Université d'Ottawa/University of Ottawa, 2017.
[2] Han, J., Pei, J., & Kamber, M. Data mining: concepts and techniques (Elsevier, 2011).
[3] Podgorelec, V., & Zorman, M. Decision tree learning in Encyclopedia of Complexity and Systems Science 1-28 (2014).
[4] Rodriguez-Galiano, V. F., Ghimire, B., Rogan, J., Chica-Olmo, M., Rigol Sanchez, J. P. (2012) "An assessment of the effectiveness of a random forest classifier for landcover classification." ISPRS Journal of Photogrammetry and Remote Sensing 67: 93-104.
[5] Paul, A., Mukherjee, D. P., Das, P., Gangopadhyay, A., Chintha, A. R., Kundu, S. (2018) "Improved random forest for classification." IEEE Transactions on Image Processing 27 (8): 4012-4024.
[6] Saitta, L. (2000) "Support Vector Networks." Kluwer Acad. Publ. Bost.: 273–297.
[7] Hamed, T., Dara, R., Kremer, S. C. (2014) "An accurate, fast embedded feature selection for SVMs." In 2014 13th International Conference on Machine Learning and Applications, IEEE: 135-140.
[8] Taneja, S., Gupta, C., Goyal, K., Gureja, D. (2014) "An enhanced k nearest neighbour algorithm using information gain and clustering." Fourth International Conference on Advanced Computing & Communication Technologies, IEEE: 325-329.
[9] Hornik, K., & Grün, B. Topicmodels: An R package for fitting topic models. J. of Stat. Softw. 40 (13), 1-30 (2011).