
Social Network Analysis and Mining (2022) 12:149

https://doi.org/10.1007/s13278-022-00977-7

ORIGINAL ARTICLE

Predictive modeling for suspicious content identification on Twitter


Surendra Singh Gangwar1 · Santosh Singh Rathore1 · Satyendra Singh Chouhan2 · Sanskar Soni2

Received: 9 February 2022 / Revised: 24 August 2022 / Accepted: 17 September 2022 / Published online: 5 October 2022
© The Author(s), under exclusive licence to Springer-Verlag GmbH Austria, part of Springer Nature 2022

Abstract
The wide popularity of Twitter as a medium for exchanging activities, entertainment, and information has attracted spammers, who exploit it as a platform to spam users and spread misinformation. This poses the challenge for researchers of identifying malicious content and user profiles on Twitter so that timely action can be taken. Many previous works have used different strategies to overcome this challenge and combat spammer activities on Twitter. In this work, we develop various models that utilize different features, such as profile-based features, content-based features, and hybrid features, to identify malicious content and classify it as spam or not-spam. In the first step, we collect and label a large dataset from Twitter to create a spam detection corpus. Then, we create a set of rich features by extracting various features from the collected dataset. Further, we apply different machine learning, ensemble, and deep learning techniques to build the prediction models. We performed a comprehensive evaluation of the different techniques over the collected dataset and assessed their performance using accuracy, precision, recall, and f1-score measures. The results showed that the used sets of learning techniques achieved a high performance for tweet spam classification; in most cases, the values are above 90% for the different performance measures. These results show that using profile, content, user, and hybrid features for suspicious tweet detection helps build better prediction models.

Keywords Suspicious content detection · User-content features · Natural language processing · Machine learning
techniques · Social network

1 Introduction

With the availability of the Internet and web-based information, platforms such as Twitter are widely used to support the distribution of information. Twitter allows users to create a network of people to disseminate information and allows mass communication of the information to a widespread audience (Boukes 2019). For example, Twitter can serve as a platform to help with the crisis management process by looking for specific hashtags. Individuals can also narrate about crises or outbreaks, which can be useful in providing assistance and humanitarian support. Currently, government and private organizations, as well as individuals, are using Twitter to share information (Edo-Osagie et al. 2020). In this way, Twitter plays a vital role in the rapid distribution of information (Martinez-Rojas et al. 2018). As the data show, Twitter has recently updated its active user count to 328 million¹.

* Santosh Singh Rathore, [email protected]
Surendra Singh Gangwar, [email protected]
Satyendra Singh Chouhan, [email protected]
Sanskar Soni, [email protected]

1 ABV-IIITM, Gwalior, India
2 MNIT, Jaipur, India

¹ https://www.statista.com/statistics/242606/number-of-active-twitter-users-in-selected-countries/
² Italy's Bruno Kessler Foundation's Center for Information and Communication Technology, https://ict.fbk.eu/

While Twitter has established itself as a way to distribute information successfully and rapidly, it has also become a platform for spreading misinformation and panic phenomena fueled by incomplete and inaccurate information (Wang and Zhuang 2017). The scientists at Italy's Bruno Kessler Foundation's Center for Information and Communication Technology² have scrutinized around 121,407,000 tweets and reported that more than half of the tweets are rumors and that bots spread false news. Other studies have also reported similar findings (Hennig-Thurau et al. 2015). The magnitude

of this false information is so huge that even the World Health Organization has declared the onslaught of messages an "infodemic"³. The overabundance of information, some of it correct and some not, makes it difficult for end-users to find virtuous sources and reliable information when they need it. This can have a negative impact on the psychology of users, driving them to anxiety and depression. It is highly essential to curb the pitfalls of Twitter to make it a more reliable and trustworthy place (Lingam et al. 2019).

³ https://www.washingtonpost.com/science/2020/03/17/analysis-millions-coronavirus-tweets-shows-whole-world-is-sad/

Therefore, the boon of Twitter is now turning into a curse, as spammers use the platform to spread malicious or irritating information to achieve their malevolent intents. In a tweet, the user can add text, URLs, videos, and images. Further, Twitter allows various functionalities, such as following a user, mentioning a topic or user, hashtags, replies, and retweets (Lee and Kim 2013). A hashtag is used to categorize a tweet into a particular category, and all related tweets can be read by clicking that tag. When any remarkable event happens, a large number of users tweet about it and quickly make it a trending topic. These trending topics become the target of spammers, who post tweets consisting of characteristic phrases of the trending topic together with URL links that lead users to unrelated sites. As tweets usually incorporate shortened URL links, it becomes difficult for users to recognize the content of a URL without loading the site. Spammers can have several motives behind spamming, for example, advertising a product to produce an exceptional return on sales, or compromising a user's account (Lingam et al. 2019; Barushka and Hajek 2018; Dokuz 2021). Spammers not only contaminate the real-time search environment; they also distort tweet statistics. Filtering malicious content becomes a challenging problem because of URL shorteners, modern and informal languages, and abbreviations used on social networking sites. Spammers influence users to click a particular URL or read content with specific phrases or words (Tingmin et al. 2018; Madisetty and Desarkar 2018).

In their study, Kaur et al. (2016) surveyed research papers published between 2010 and 2015 on malicious tweet and content identification. The authors reported that most of the techniques used for malicious tweet detection can be categorized into four categories. (1) User features-based techniques: these techniques classify a user as spammer or non-spammer by analyzing the user's account information, such as the number of followers, number of following, number of mentions, and tweet creation time. (2) Content features-based techniques: these techniques analyze the text properties and decide whether tweets are spam or non-spam. The tweet content, such as the number of hashtags in comparison to total word count, users mentioned in a tweet, number of URLs, and count of numerals, is used. (3) Relation features-based techniques: these techniques use connection degree measures, such as whether a person mentioned a direct friend or a mutual friend in a tweet, to identify malicious content. (4) Hybrid features-based techniques: these techniques derive new features from the user features, such as reputation (ratio of followers to following), frequency of tweets, and the rate at which a user follows other users. In 2020, Abkenar et al. (2020) performed an SLR on Twitter spam detection and reported that spam detection approaches had used content analysis approaches (15%), user analysis approaches (9%), tweet analysis approaches (9%), network analysis approaches (11%), and hybrid analysis approaches (56%). Furthermore, the authors stated that collecting real-time Twitter data, labeling datasets, spam drifting, and class imbalance problems are open challenges in Twitter spam detection approaches.

In this paper, first, we collect the spam dataset from Twitter by utilizing the Twitter developer API. We fetch the 4000 latest tweets, consisting of information like timestamp, tweet text, username, hashtags, followers count, following count, the number of mentions, word count, retweets, etc. Further, we perform feature engineering and extract different feature sets, such as content-based and user-based features. Additionally, we create hybrid features such as the user's reputation, frequency of tweets of a user, and following frequency. Further, we label the dataset as spam or non-spam using hybrid features, blacklisted URLs, and some predefined words in the text. Afterward, we apply different machine learning and deep learning techniques to predict suspicious or malicious tweets. Further, we perform an analysis to assess how different techniques perform in predicting suspicious content on Twitter. Specifically, we make the following contributions in the presented work.

1. We create a spam dataset to detect suspicious content on Twitter.
2. We extract different features from the collected Twitter dataset. These features are language based, content based, and user based. Further, we create hybrid features to enrich the feature set for building effective prediction models.
3. We apply two different natural language processing (NLP) techniques, bag of words and TF-IDF, to extract different language features.
4. We apply different machine learning and state-of-the-art deep learning techniques and evaluate their performance for suspicious content detection on Twitter.

The rest of the paper is organized as follows. Section 2 discusses works related to the techniques used for the tweet
spam classification. The Twitter spam data collection and feature extraction procedure are presented in Sect. 3. The experimental analysis and results are provided in Sect. 4. Section 5 concludes the paper.

2 Related studies

In this section, we discuss some of the state of the art related to the proposed work. Kaur et al. (2016) reported a review of various research papers published between 2010 and 2015 and discussed the techniques used in these papers. The authors stated that researchers had utilized numerous methods for spam detection. Most of the works have been done by considering tweets' content and profile-based features. Dangkesee and Puntheeranurak (2017) performed an adaptive classification for spam detection. The authors used a spam word filter and a URL filter based on blacklisted URLs. After labeling and preprocessing the dataset, the Naive Bayes classifier was applied to 50,000 and 10,000 tweets. The results showed that the proposed model outperformed spam word filters when comparing accuracy, precision, recall, and f1-score. In the end, the authors suggested the use of safe browsing instead of URL blacklisting for filtering URLs.

Raj et al. (2020) applied multiple machine learning algorithms to classify tweet content. The experimental results showed that, of the used techniques, KNN (92%), decision tree classifier (90%), random forest classifier (93%), and naive Bayes classifier (69%) outperformed the other techniques. The authors suggested that a tweet be deleted after detecting it as spam. Song et al. (2011) presented Bagging, SVM, J48, and BayesNet with relation-based features by creating graphs between users. The authors used measures such as distance and connectivity between users. The results showed that Bagging outperformed the other techniques, with a 94.6% true positive rate and a 6.5% false positive rate. The authors also highlighted that if any user created a new account and generated a tweet, it would be added to the spammer category even if it is not spam, due to the earlier classification of the user as malicious.

Alom et al. (2020) applied a CNN with tweet text alone and with both tweet text and meta-data features for spam classification. The presented approach utilized NLP methods such as word embeddings and n-grams. The approach converts the text into a matrix before sending it to the CNN. The method that combined both feature types produced a better accuracy of around 93.38%. The presented approach outperformed the other used deep learning methods.

Mateen et al. (2017) proposed a hybrid solution for spam detection that used different combinations of features, such as content-based, graph-based, and user-based features. The authors applied J48, Decorate, and naive Bayes classifiers on the dataset having these features. The results showed that content and graph-based feature models achieved an accuracy of 90%, and user and graph-based feature models achieved an accuracy of 92%. The presented work also performed correlation analysis between features and removed features with higher correlations.

Sagar and Manik (2017) applied different machine learning algorithms for Twitter spam detection. The presented work used SVM as the principal classifier. The authors introduced a new feature that matches the tweet content with the URL destination content. The experimental dataset consists of a random set of 1000 tweets; out of those 1000 tweets, 95–97% were classified correctly. Arushi and Rishabh (2015) proposed an integrated algorithm that combines the benefits of three distinct learning algorithms (namely naive Bayes, clustering, and decision trees). This integrated algorithm classifies a record as spammer/non-spammer with an overall accuracy of 87.9%. Lin and Huang (2013) analyzed the importance of existing features for recognizing spammers on Twitter and utilized two simple yet compelling features (i.e., the URL rate and the interaction rate) to characterize the Twitter accounts. This study, based on 26,758 Twitter accounts with 508,403 tweets, shows that the classification has precision up to around 0.99 and 0.86 and a higher recall. Willian and Yanqing (2013) proposed a versatile strategy to detect spam on Twitter using content-, social-, and graph-based data; after various experiments, a threshold-based model was built. This new model was compared with SVM and two other existing algorithms using accuracy, precision, and recall. The new classifier, with an accuracy of 79.26%, is superior to SVM, with a precision of 69.32%. Wu et al. (2017) used various deep learning techniques, training through word vectors and creating various classifiers through ML algorithms. Doc2Vec was used as the word vector training model, and the machine learning algorithms included random forest, naive Bayes, and decision tree. The authors collected 10 days of ground-truth data from Twitter, consisting of 1,376,206 spam tweets and 673,836 non-spam messages, and created four different datasets with varying spam to non-spam ratios. MLP proved to perform the best on all four datasets. Tang et al. (2014) tried a unique approach of extracting features from tweets using deep learning networks in order to capture the syntax of embedded words and labels. However, the machine learning algorithms using these features did not perform that well, as the best f1-score was reported to be 87.61% (<90%).

The previous work done in spam detection on Twitter predominantly centers around profile and content-based features. Better utilization of other features in Twitter spam detection is still a major concern (Tingmin et al. 2018). Additionally, there is a need for adding hybrid features to the training set for tweet classification. The proposed work uses two different datasets with different feature combinations

[Figure: pipeline diagram — tweet collection using Tweepy → spam labeling → Dataset 1 (DS1: profile and text features, encoded using TF-IDF and BOW) → predictive model building using ML techniques; Dataset 2 (DS2: profile, text, and hybrid features, encoded using sentence transformers) → predictive model building using deep learning techniques → evaluate and compare the performance of the different models]

Fig. 1  Twitter data collection, feature extraction procedure and ML/DL model evaluation

to analyze different machine learning, ensemble, and deep learning techniques.

3 Twitter spam dataset collection

The overview of the dataset collection, feature extraction, and model evaluation procedure is depicted in Fig. 1.

The proposed work utilizes tweets fetched using the Twitter developer API⁴. Twitter allows its users to fetch Twitter data using the Tweepy library⁵. The Tweepy library requires four user credentials, consumer_key, consumer_secret, access_key, and access_secret, to send requests over the API. We fetched the 4000 latest tweets, consisting of many features like timestamp, tweet text, username, hashtags, followers count, following count, number of mentions, word count, retweets, etc. All of these features are categorized into content-based features and user-based features. Further, we create various hybrid features such as the user's reputation, frequency of tweets of a user, and following frequency. For labeling the dataset as spam or non-spam, we use hybrid features, blacklisted URLs, and some predefined words in the text (Gupta et al. 2018). Finally, the dataset is prepared for analyzing the performances of different machine learning models. Two different datasets are created by combining user-content features, user-relation features, and user-content-relation features.

⁴ https://developer.twitter.com/en
⁵ https://www.tweepy.org/

We collect the features of three different categories, as described below.

1. Profile-based features These features concern the profile properties of the users. A user's account includes important information such as the number of followers, the number of following, the number of mentions, and tweet creation time.
2. Content-based features These features concern the text properties of the tweets (Chen et al. 2017). A tweet's content has some crucial information, such as the number of hashtags, total word count, users mentioned in the tweet, the number of URLs, and count of numerals.
3. Hybrid features These features are derived from the user-based features. Some new features that can be derived are reputation (ratio of followers to following), frequency of tweets, the rate at which a user follows other users, account age, metric entropy for all textual features, the proportion of similarity between username and screen name, etc.

Table 1 describes the important features that we have extracted from the collected dataset.

Table 1  Description of the features collected for the Twitter spam dataset

Feature name        Feature type   Feature description
AccountAge          Profile        Days since account creation to date of collection
FollowersCount      Profile        In user profile meta-data
FriendsCount        Profile        In user profile meta-data
StatusesCount       Profile        In user profile meta-data
DigitsCountInName   Content        Number of digits in screen name
TweetLen            Content        Number of characters in tweet
UserNameLen         Content        Number of characters in user name
ScreenNameLen       Content        Number of characters in screen name
Metric entropy (for tweet, user profile description, user name, and screen name, respectively)   Hybrid   H(X)/|X|, where |X| is the length of a string X and H(X) is the Shannon entropy of the text; measures randomness in text
URLsRatio           Hybrid         |characters in URLs| / |tweet length|
MentionsRatio       Hybrid         |characters in user mentions| / |tweet length|
NameSim             Hybrid         Proportion of similarity between user name and screen name
Friendship          Hybrid         FriendsCount / FollowersCount
Followership        Hybrid         FollowersCount / FriendsCount
Interestingness     Hybrid         FavouritesCount / StatusesCount
Activeness          Hybrid         StatusesCount / AccountAge
VerifiedAccount     Profile        In tweet meta-data
FavouritesCount     Profile        In user profile meta-data
NamesRatio          Hybrid         |ScreenName length| / |UserName length|

Specifically, we focus on the following properties to extract the different feature sets.

1. Count of the number of followers and followees Followers are the users who follow a specific user, while followees are the users whom a specific user follows. In general, spammers have a limited number of followers but many followees. Therefore, users with many followees and a limited number of followers can be considered spam accounts.
2. URLs URLs are links that direct to some other page in the browser. With the improvement of URL shorteners, it has become simple to post irrelevant links on any OSN. This is because URL shorteners hide the original content of the URL, making it hard for detection algorithms to detect malicious URLs. An excessive number of URLs in the tweets of a user is an expected indicator of the user being a spammer.
3. Spam words An account with spam words in almost every tweet can be viewed as a spam account. Subsequently, text including spam words can be considered a significant factor for identifying spammers.
4. Replies Since the data or messages sent by a spammer are pointless, individuals rarely reply to their posts. On the other hand, a spammer replies to an enormous number of posts in order to get seen by many individuals. This pattern can be utilized in the recognition of spammers.
5. Hashtags Hashtags are unique identifiers ("#" followed by the identifier name) used to group similar tweets together under the same name. Spammers use numerous #hashtags in their posts so that their posts appear under all the hashtag categories and consequently get high viewership and are read by others.

The hybrid features are included in the dataset to understand the dynamism of features such as "statuses count, friends count, followers count, favorites count, naming conventions and tweeting patterns." Account age shows the frequency of user activity. Accounts with a very high value of statuses and friends count but a low value of favorites count and followers count are prone to be spam accounts. The username and screen name of a legitimate user are usually similar, and the username is not very lengthy and does not begin with a digit. If these naming conventions are not followed, such users are usually spam accounts. The NameSim and NamesRatio features capture this aspect of the accounts. A suspicious spam account usually posts 12 or more tweets per day, whereas a legitimate account posts on average 4 tweets per day. We have considered these characteristics of the user accounts and calculated the hybrid features. The details of these features are given in Inuwa-Dutse et al. (2018).

3.1 Labeling of spam dataset

Initially, all of the tweets are unlabeled. We perform a data labeling process and assign a spam or non-spam label to each tweet. Concone et al. (2019) have presented a labeling technique for Twitter spam accounts. The authors used malicious URLs and recurrent content information to decide whether a tweet is spam or not. In our work, we use the same technique to label the tweets. The labeling technique's first step is defining some criteria that help decide between spam and trustworthy content. The first criteria to consider

are the publication of URLs of malicious sites in the tweet; such malicious content is simple to detect. Another criterion is the publication of duplicate content or messages to spread some information; this strategy is often used to disseminate misinformation. The use of vocabulary and other meta-information is also used as a criterion. Based on these characteristics, we design and use the labeling technique. To label a tweet as spam or non-spam, we used a combination of a word category filter, a URL filter, and some hybrid and profile-based meta-features. They are described as follows.

1. Word category filter In this filter, we create some rules combining different words, as given in Table 2. For example, words such as free available, dear friend, new offer, click here, unlimited offers, and register here are considered. Furthermore, some suspicious words used for marketing purposes, offer, register, extra, guarantee, discount, deal, collect, buy, apply now, bonus, free, sales, unlimited, win, purchase, order now, lowest, are also considered (Martinez-Romo and Araujo 2013). If a combination of these words occurs in the tweet, we put it into the spam category. Tweets that contain at least two of the keywords are marked as spam.
2. URL filter In this filter, we check whether a URL is shortened and, if so, resolve it to the original URL. After finding the original URL, we match it against blacklisted URLs. Additionally, we check whether it is a secured or non-secured URL. If a tweet consists of any blacklisted URL, we label that tweet as spam. We have considered three factors when analyzing the URLs in a tweet: (1) whether it is a safe URL according to Google Safe Browsing (GSB), (2) the total number of URLs posted in the tweet, and (3) the ratio of the total number of URLs to the unique URLs in the tweet. A tweet is labeled as spam if at least one URL is malicious or the ratio of unique URLs is <= 0.25.
3. Based on hybrid and profile features There are some important hybrid and profile features on the basis of which we can label a tweet. These features include the ratio of friends count to followers count, the ratio of statuses count, and account age. Some profile features are also used for labeling, such as Is_verified and Listed. Is_verified represents whether the user has been verified by the Twitter security bot. Listed represents how many times the user has been reported. Table 1 lists all hybrid and profile features used in the paper.

Table 2  Word categories

Category      Words
Ads           Ads, images, banners, Hedberg, RealMedia, img, announcer, popup, offer, adserver, sales, gifs, media, exit, out, adv, splash, pub, pop, graphics
Books         Catalog, book, patterns, weaving, product, sniacademic, news, ebook, educator, library, store, wilecyda
E-commerce    Shop, store, catalog, tickets, art, users, business
Games         Juegos, Jeux, category, game, Xbox, jeunesse, pc, online, Comunidad, consoles, flash, PSP, arcade, Wii, emulator, gratis, Nintendo, PlayStation
Medical       Health, conditions, article, content, diseases, meds, group
News          News, newspapers, media, publications, section, feed, opinion, business, community, archive, papers, profile
Sport         Sport, athletics, team, basketball, football, college, women, track, tennis, soccer, baseball, golf, mens

We produced a labeled dataset after completing the labeling procedure, which will be used for model building and evaluation. For the experimentation, we create two different datasets, DS1 and DS2. These datasets can be found here⁶.

– Dataset-1 (DS1) It consists of profile-based features and content-based features. The DS1 dataset has a total of 3650 instances, out of which 1897 are normal and 1753 are spam.
– Dataset-2 (DS2) It consists of profile-based features, content-based features, and hybrid features. The DS2 dataset has a total of 9678 instances, out of which 5398 are normal and 4280 are spam.

⁶ https://github.com/ssrathore/Suspicious_Tweets-datasset

3.2 Extraction of NLP features from tweet's text

The used spam datasets consist of the text of the tweets. This textual information can be used to classify the tweets into spam and non-spam categories. However, the used machine learning techniques cannot work with raw text directly (Kim and Gil 2019). Therefore, the text must be converted into numbers. We have used the bag of words and TF-IDF vectorizer NLP techniques to extract features from the tweets' text.
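As a minimal sketch of these two encodings, the following is an illustrative from-scratch implementation (not the code used in the paper; the toy tweets are made up for demonstration):

```python
import math
from collections import Counter

def bow_vectors(docs):
    """Bag of words: (1) preprocess, (2) build vocabulary, (3) vectorize by word counts."""
    tokenized = [d.lower().split() for d in docs]          # preprocessing
    vocab = sorted({w for doc in tokenized for w in doc})  # vocabulary
    return vocab, [[doc.count(w) for w in vocab] for doc in tokenized]

def tfidf_vectors(docs):
    """TF-IDF: term frequency in a document scaled by log(N / document frequency)."""
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({w for doc in tokenized for w in doc})
    n = len(tokenized)
    df = {w: sum(1 for doc in tokenized if w in doc) for w in vocab}
    vectors = []
    for doc in tokenized:
        tf = Counter(doc)
        vectors.append([(tf[w] / len(doc)) * math.log(n / df[w]) for w in vocab])
    return vocab, vectors

docs = ["win free offer now", "match highlights now"]
vocab, bow = bow_vectors(docs)
_, tfidf = tfidf_vectors(docs)
```

Note that a word appearing in every document ("now" above) has IDF log(2/2) = 0, so its TF-IDF weight is 0, matching the description of TF-IDF in the following subsection.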

3.2.1 Bag of words (BOW)

Bag of words is a popular and simple feature extraction method for text data. This technique changes tokens of words into a series of features to utilize the information within the words. Each word is utilized to train the classifier in the BoW model. There are mainly three steps used to create the BOW model (Qader et al. 2019). (1) The preprocessing step converts text into lower case and removes all unnecessary information. (2) The vocabulary building step counts the occurrences of the words, checks whether words from the sentences exist in the vocabulary or not, and prepares a final dictionary of the words. (3) The text vectorization step constructs a matrix of features by analyzing the presence or absence of words in sentences.

3.2.2 Term frequency-inverse document frequency (TF-IDF)

The TF-IDF technique is used to weight the words in a set of documents. It assigns each word a score to indicate its prominence in the text and document. Term frequency (TF) determines how often a term shows up in the whole document; it can be considered the likelihood of discovering a word inside the document. Inverse document frequency (IDF) is a metric that determines whether a word is uncommon or common among all documents in a corpus: the closer a term's IDF is to zero, the more common it is. IDF is calculated by taking the total number of documents, dividing it by the number of documents that contain the word, and then taking the logarithm (Aizawa 2003). Term frequency-inverse document frequency (TF-IDF) is the multiplication of TF and IDF. A word with a high frequency in a document and a low document frequency in the corpus has a high TF-IDF score. The IDF value reaches 0 for a term that appears in almost all texts, bringing the TF-IDF closer to 0. When both IDF and TF have higher values, the TF-IDF value is high, indicating that the word is uncommon in the corpus but common within the document.

A further description of the BOW and TF-IDF methods can be found in Appendix B.

4 Experimental analysis and results

We have used various machine learning techniques, ensemble techniques, and deep learning techniques for building the prediction models to classify tweets into spam and non-spam categories. We have reported and compared the performance of the different techniques for dataset-1 (DS1) and dataset-2 (DS2).

4.1 Used machine learning and ensemble techniques

We have used five different machine learning techniques: K-nearest neighbors, logistic regression, naive Bayes, decision tree, and random forest. Further, we have used three different ensemble techniques: bagging, boosting, and stacking. These are widely used techniques for the tweet spam classification task; therefore, we selected them for the presented work.

4.2 Used deep learning techniques

We have used seven different deep learning techniques: ANN (64 and 32 neurons), long short-term memory (LSTM), GRU, single convolution layer, two convolution layers, very deep convolution neural network (VDCNN), and convolution + LSTM. A brief description of these techniques is given below.

4.2.1 Artificial neural networks (ANN)

An ANN can be imagined as a single neuron or a group of neurons and is also referred to as a feed-forward neural network. It consists of 3 layers: the input, hidden, and output layers. It is very well capable of handling non-linearity in data. For the implementation, we used a basic ANN with two hidden layers consisting of 64 and 32 neurons, respectively. We also used a Dropout layer with a dropout rate of 0.2 between the two hidden layers.

4.2.2 Long short-term memory (LSTM) (Hochreiter and Schmidhuber 1997; Adhikari et al. 2019)

Long short-term memory networks (LSTMs) are a unique form of recurrent neural network (RNN) capable of handling long-term dependencies. Instead of having a single neural network layer, there are four, interacting in a unique way. Several works have used the LSTM model for different text classification tasks (Adhikari et al. 2019; Yang et al. 2018; Zhou et al. 2016; Yang et al. 2016). We used a single LSTM layer with 32 memory cells for the experimental purpose.

4.2.3 Gated recurrent units (GRU) (Cho et al. 2014)

GRUs are more or less similar to LSTMs but have only two gates, namely the reset gate and the update gate. A GRU has a much lower training time than an LSTM due to fewer parameters. It was initially proposed to capture features from different time scales adaptively. We used a GRU with 128 units for the

experimental purpose, followed by a fully connected layer of 32 neurons and a classification layer.

4.2.4 Convolution layer (Conv1D)-based networks

Employing convolution layers, we implemented four different models. Using convolution in NLP-related tasks is a recent development. All the CNN-based networks extract n-gram-based features using kernels/filters of varied sizes.

– Single and multilayer convolution Two separate models: the one with only a single convolution layer had 100 filters with a kernel size of five. The other model had two consecutive convolution layers with 100 filters each and kernel sizes of 3 and 4, respectively.
– Very deep convolution neural network (VDCNN) (Simonyan and Zisserman 2014) VDCNN uses multiple stacked convolution and max-pooling operations. The model makes use of four pooling operations, each of which reduces the resolution by half, resulting in four different feature map tiers: 64, 128, 256, and 512. At the end of the four convolution-pair operations, the resulting feature vector of size 512 × k (k = 3) is flattened into a single vector. This is fed into a three-layer fully connected classifier (4096, 2048, 2048) with ReLU hidden units.
– Convolution + LSTM A mixed model that captures both short- and long-range dependencies. It consists of a convolution layer followed by a pooling layer, an LSTM layer, and the classification layer. The convolution layer with 100 filters uses a kernel of size 5. The max-pooling layer has a pool size of 2. The LSTM layer has 32 memory cells, followed by a fully connected layer of 32 neurons and the classification layer.

4.3 Performance evaluation measures

We have used four different performance evaluation measures to assess the performance of the different used techniques for spam tweet detection. They are: accuracy, precision, recall, and f1-score (Gorunescu 2011). The description of the performance measures is given in Appendix A, Table 8.

4.4 Implementation details

We have used different Python libraries to implement the different machine learning and deep learning techniques. All the experiments were carried out on a 64-bit system with a Dual-Core Intel Core i5 processor and 8 GB RAM, running macOS Big Sur, with access to an NVIDIA K80 GPU kernel. To implement the machine

Table 3  Different ML models with bag of words (BOW) on DS1 and DS2

Classifier            Accuracy   Precision   Recall    F1-score

ML models with BOW on DS1
Logistic Regression   0.8763     0.87351     0.86263   0.87543
Naive Bayes           0.68367    0.67542     0.67324   0.68453
KNN                   0.83106    0.83225     0.84751   0.82257
Decision Tree         0.90127    0.90543     0.91248   0.89112
Random Forest         0.91602    0.90251     0.92152   0.91358

ML models with BOW on DS2
Logistic Regression   0.916      0.953       0.855     0.901
Naive Bayes           0.544      0.495       0.932     0.647
KNN                   0.839      0.894       0.726     0.802
Decision Tree         0.99       0.99        0.98      0.991
Random Forest         0.992      0.99        0.985     0.992

Table 4  Different ML models with TF-IDF on DS1 and DS2

Classifier            Accuracy   Precision   Recall    F1-score

ML models with TF-IDF on DS1
Logistic Regression   0.91375    0.91256     0.90145   0.90628
Naive Bayes           0.6912     0.68845     0.69014   0.68158
KNN                   0.83219    0.82156     0.84751   0.83348
Decision Tree         0.9241     0.92147     0.91254   0.91469
Random Forest         0.94666    0.93458     0.94375   0.94112

ML models with TF-IDF on DS2
Logistic Regression   0.753      0.83        0.567     0.671
Naive Bayes           0.544      0.495       0.932     0.647
KNN                   0.839      0.895       0.726     0.801
Decision Tree         0.99       0.988       0.989     0.989
Random Forest         0.99       0.99        0.98      0.99

Table 5  Ensemble techniques with BOW on DS1 and DS2

Classifier   Accuracy   Precision   Recall   F1-score

Ensemble techniques with BOW on DS1
Bagging      0.997      0.99        0.99     0.99
Boosting     0.986      0.986       0.987    0.986
Stacking     0.92       0.869       0.993    0.927

Ensemble techniques with BOW on DS2
Bagging      0.783      0.878       0.598    0.712
Boosting     0.823      0.79        0.824    0.807
Stacking     0.932      0.876       0.98     0.929

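The four measures listed in Sect. 4.3 reduce to simple ratios over the confusion-matrix counts. As a minimal illustration (the counts below are invented for the example, not taken from the paper's experiments):

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall, and f1-score from
    confusion-matrix counts, as defined in Sect. 4.3 / Appendix A."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Hypothetical counts for a spam/non-spam split (illustration only).
acc, prec, rec, f1 = classification_metrics(tp=80, tn=90, fp=10, fn=20)
print(round(acc, 3), round(prec, 3), round(rec, 3), round(f1, 3))
```

Note how a high recall with a mediocre precision (or vice versa) is pulled down in the f1-score, which is why the tables report all four measures side by side.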

learning models, we used TF-IDF and bag of words as the text embeddings, whereas to test the deep learning models, we used the pre-trained Paraphrase-distilroberta-base-v1 embedding (Reimers et al. 2019). The experimental package with the Twitter spam dataset and source code can be found at https://github.com/ssrathore/Suspicious_Tweets-datasset.

4.5 Experiment results

We have used five different machine learning techniques and three different ensemble techniques to build and evaluate the prediction models. These techniques have been applied to both the DS1 and DS2 datasets with the bag of words (BOW) and TF-IDF feature extraction methods. The results of the experimental analysis are reported in Tables 3, 4, 5, 6, and 7.

4.5.1 Results of machine learning techniques for the tweet spam classification

Tables 3 and 4 show the results of the ML techniques with BOW and TF-IDF on the DS1 and DS2 datasets in terms of the accuracy, precision, recall, and f1-score measures. From Table 3, it can be seen that among the used ML techniques, random forest and decision tree have produced the best prediction performance across all the measures. The highest achieved values for all the measures are above 90%. The naive Bayes technique produced the lowest performance for all the measures. The performance of the ML techniques improved for DS2, which consists of the profile, user, and hybrid features. Similarly, from Table 4, it can be observed that again the decision tree and random forest techniques are the top performers for all the performance measures. The values of all the measures are greater than 90% for the decision tree, random forest, and logistic regression techniques. Again, the performance of the ML techniques improved for DS2.

4.5.2 Results of ensemble techniques for the tweet spam classification

Tables 5 and 6 show the results of the ensemble techniques with BOW and TF-IDF on the DS1 and DS2 datasets in terms of the accuracy, precision, recall, and f1-score measures. From Table 5, it can be observed that the bagging technique produced the best prediction performance for all the measures, followed by the boosting technique. The three ensemble techniques achieved values above 90% for all the measures. The stacking technique produced the lowest performance for all the measures. However, the performance of the ensemble techniques decreased for DS2, which consists of the profile, user, and hybrid features. Similarly, from Table 6, it can be seen that again the bagging technique is the best performer, followed by the boosting technique. This holds for both the DS1 and DS2 datasets. The values of all the measures are again greater than 90% for the bagging and boosting techniques. Again, the performance of the ensemble techniques decreased for DS2.

4.5.3 Results of deep learning techniques for the tweet spam classification

Table 7 shows the results of the different deep learning techniques on the DS1 and DS2 datasets in terms of the accuracy, precision, recall, and f1-score measures. The table shows that except for the LSTM technique, all other used deep learning

Table 6  Ensemble techniques with TF-IDF on DS1 and DS2

Classifier   Accuracy   Precision   Recall    F1-score

Ensemble techniques with TF-IDF on DS1
Bagging      0.94368    0.94283     0.93451   0.93457
Boosting     0.90354    0.90228     0.9134    0.91586
Stacking     0.93765    0.92506     0.92355   0.93679

Ensemble techniques with TF-IDF on DS2
Bagging      0.95242    0.94525     0.95221   0.94625
Boosting     0.93691    0.93542     0.92231   0.92042
Stacking     0.91969    0.90125     0.90589   0.91287

Table 7  Deep learning technique-based models on DS1 and DS2

Models                       Accuracy   Precision   Recall   F1-score

Deep learning techniques on DS1
Basic ANN (64 and 32)        0.979      0.969       0.988    0.978
LSTM                         0.673      0.652       0.637    0.646
Single convolution layer     0.979      0.974       0.982    0.978
Two convolution layers       0.986      0.997       0.972    0.985
GRU                          0.983      0.978       0.986    0.982
VDCNN                        0.938      0.99        0.868    0.929
Convolution + LSTM           0.923      0.88        0.954    0.92

Deep learning techniques on DS2
Basic ANN (64 and 32)        0.928      0.948       0.868    0.916
LSTM                         0.612      0.606       0.378    0.464
Single convolution layer     0.906      0.951       0.832    0.88
Two convolution layers       0.87       0.896       0.801    0.846
GRU                          0.558      0.559       0.047    0.086
VDCNN                        0.829      0.977       0.631    0.767
Convolution + LSTM           0.558      0.682       0.021    0.041

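A minimal bag-of-words vectorizer of the kind used as input to the ML models (Sects. 3.2 and 4.4) can be sketched in plain Python; this is an illustrative sketch, not the authors' pipeline, and the toy tweets are invented:

```python
def bow_vectorize(docs):
    """Build a vocabulary over all documents and return per-document
    word-count vectors (the feature matrix of the BOW method)."""
    vocab = sorted({word for doc in docs for word in doc.lower().split()})
    index = {word: i for i, word in enumerate(vocab)}
    matrix = []
    for doc in docs:
        row = [0] * len(vocab)
        for word in doc.lower().split():
            row[index[word]] += 1
        matrix.append(row)
    return vocab, matrix

tweets = ["win free money now", "free money win", "meeting at noon"]
vocab, X = bow_vectorize(tweets)
# vocab -> ['at', 'free', 'meeting', 'money', 'noon', 'now', 'win']
```

In practice a library vectorizer (e.g., scikit-learn's CountVectorizer/TfidfVectorizer) would also handle tokenization edge cases, stop words, and sparse storage.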

Fig. 2  Comparison of the different used ML, ensemble, and deep learning techniques with bag of words (BOW) on the DS1 and DS2 datasets; panels show accuracy, precision, recall, and f1-score (*LR = Logistic Regression, NB = Naive Bayes, KNN = K-nearest neighbors, DT = Decision Tree, RF = Random Forest, SLC = Single layer convolution, TLC = Two layer convolution, GRU = Gated recurrent unit, Cov_LSTM = Convolution + LSTM, VDCNN = Very deep convolution neural network)

Fig. 3  Comparison of the different used ML, ensemble, and deep learning techniques with TF-IDF on DS1 and DS2; panels show accuracy, precision, recall, and f1-score


techniques have produced a higher performance for spam classification on the DS1 dataset. The values are above 90% for all the measures in most cases. For the DS2 dataset, the performance of the deep learning techniques decreased. Here, the basic ANN and single-convolution-layer techniques produced a performance greater than 90%. The GRU and the Convolution + LSTM performed relatively poorly on DS2.

4.5.4 Performance comparison of the used machine learning, ensemble, and deep learning techniques for the tweet spam classification

Figures 2 and 3 show the performance comparison of the different sets of used techniques with BOW and TF-IDF on the DS1 and DS2 datasets. The X-axis represents the set of techniques, and the Y-axis shows the achieved performance values. From Fig. 2, it is observed that overall, the ensemble techniques and the deep learning techniques (except the LSTM technique) performed better than the machine learning techniques on the DS1 dataset. However, the performance of the decision tree and random forest techniques is comparable to or better than the ensemble and deep learning techniques for the DS2 dataset. For the recall and f1-score measures, the LSTM, GRU, and convolution + LSTM techniques performed relatively poorly compared to the other used techniques in the case of DS2. Overall, the techniques performed better on DS1 and relatively poorly on DS2. Similarly, from Fig. 3, it is seen that again the ensemble learning techniques and deep learning techniques produced a better performance compared to the machine learning techniques on DS1. The performance of the machine learning and ensemble techniques improved for DS2. In comparison, the performance of the deep learning techniques decreased for DS2.

Overall, from the presented experimental analysis, we found that the used sets of learning techniques achieved a high performance for the tweet spam classification. In most cases, the values are above 90% for the different performance measures. These results show that using profile, content, user, and hybrid features for suspicious tweet detection helps build better prediction models.

4.6 Discussion of results

This paper aims to develop models for suspicious tweet identification using different features such as profile-based, content-based, and hybrid features. Different machine learning and deep learning techniques have been applied to build the prediction models. We tried two different combinations of features and thus created two datasets of 3650 and 9778 tweets, respectively. Dataset-1 (DS1) includes only the profile-based and content-based features. Dataset-2 (DS2) includes profile-based, content-based, and hybrid features. The results showed that the ensemble learning technique-based models produced equal or better performance than the deep learning technique-based models. The possible reason behind this is that the DS1 and DS2 datasets are not large enough to optimally train the deep learning-based models. Moreover, no improvement in the performance of the deep learning models was recorded when DS2 was used. Therefore, it can be inferred that adding hybrid features does not help the deep learning models improve their performance. One exception has been reported for the Convolution + LSTM model, where the recall value was very low. This issue can be further investigated by optimally tuning the hyperparameters of the technique. Furthermore, it can also be inferred that time-series models such as LSTM are not an ideal choice for suspicious tweet identification.

5 Conclusions and future work

This paper focused on detecting suspicious tweets in trending Twitter topics by analyzing the profile, user, and content features and their combinations. First, we crawled and extracted the data of Twitter trending topics by using the tweepy library. Then, we extracted different sets of features from the collected Twitter data. Additionally, we labeled the dataset with spam and non-spam labels. Then, we applied and assessed the performance of different machine learning, ensemble, and deep learning techniques for tweet spam classification. The results showed that the dataset with the combination of profile, content, and hybrid features improved the performance of the machine learning and ensemble techniques but did not improve the deep learning techniques' performance. The used learning techniques performed almost equally for both NLP feature extraction methods, BOW and TF-IDF. In most of the cases, the used machine learning techniques produced a performance of 90% or above for the different performance measures. The presented work showed that the hybrid features are the most important for tweet spam classification. In this paper, we used some common behaviors of the users and content to label the tweets as spam or not and built several models for the identification of spam tweets. The idea was to recognize patterns in order to design a method capable of automatically annotating a large-scale dataset. However, there is further scope for improving the filters used in the spam labeling of tweets. The experimental analysis presented in this work showed that factors such as feature selection and the use of filters for spam labeling greatly influence the performance of the learning techniques. A stable and better-annotated dataset could result in improved performance of the models. Future research work would present an approach to classify a user as a malicious or a valid user. Further, we would like to investigate the dependence among the features and their significance in malicious bot detection.

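The TF-IDF weighting used for the feature extraction (Sect. 3.2.2; Appendix B, Eqs. 1-3) can be checked with a short plain-Python computation; the two-document toy corpus below is invented for illustration:

```python
import math

def tf(word, doc):
    # Eq. (1): occurrences of the word relative to the document length.
    words = doc.lower().split()
    return words.count(word) / len(words)

def idf(word, corpus):
    # Eq. (2): log of (total documents / documents containing the word).
    containing = sum(1 for doc in corpus if word in doc.lower().split())
    return math.log(len(corpus) / containing)

def tf_idf(word, doc, corpus):
    # Eq. (3): product of TF and IDF.
    return tf(word, doc) * idf(word, corpus)

corpus = ["free prize click here", "project meeting at noon here"]
# "free" appears in only one document, so it receives a non-zero weight;
# "here" appears in every document, so its IDF (and hence TF-IDF) is 0.
```

This illustrates the behavior described in Sect. 3.2.2: corpus-wide words are suppressed, while words concentrated in few documents are emphasized.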

Appendix A

Table 8  Description of the performance measures

Measure     Description

Accuracy    The ratio of correctly predicted examples to the total examples:
            Accuracy = (TP + TN) / (TP + TN + FP + FN)
Precision   The proportion of correctly predicted positive examples to all examples predicted as positive:
            Precision = TP / (TP + FP)
Recall      The proportion of correctly predicted positive examples to all positive examples in the actual class:
            Recall = TP / (TP + FN)
F1-score    The harmonic mean of precision and recall; it considers both the false positives and false negatives:
            F1-score = (2 × Precision × Recall) / (Precision + Recall)

*TP = True Positive, TN = True Negative, FP = False Positive, FN = False Negative

Appendix B

Term frequency (TF)  It counts the number of times a word w_i occurs in a document r_j, relative to the total number of words in the document. It is defined by Eq. 1.

    tf(w_i, r_j) = (Number of times w_i occurs in r_j) / (Total number of words in r_j)    (1)

Inverse document frequency (IDF)  It highlights terms that appear in a small number of documents throughout the corpus; such words receive a high IDF score. It is defined by Eq. 2.

    idf(t, D) = log( |D| / |{d ∈ D : t ∈ d}| )    (2)

where |D| is the total number of documents in the corpus, and |{d ∈ D : t ∈ d}| is the count of documents in the corpus that contain the term t.

Term frequency-inverse document frequency (TF-IDF)  TF-IDF is the multiplication of TF and IDF. It is defined by Eq. 3.

    tf-idf(t, d, D) = tf(t, d) × idf(t, D)    (3)

Acknowledgments  This work is partially supported by a Research Grant under National Supercomputing Mission (India), Grant number: DST/NSM/R&D_HPC_Applications/2021/24.

Declarations

Conflict of interest  The authors declare no potential conflict of interest with respect to the research, authorship, and/or publication of this article.

Ethical approval  This article does not contain any studies with human participants or animals performed by any of the authors.

Informed consent  Not applicable.

References

Abkenar SB, Kashani MH, Akbari M, Mahdipour E (2020) Twitter spam detection: a systematic review. arXiv preprint arXiv:2011.14754
Adhikari A, Ram A, Tang R, Lin J (2019) Rethinking complex neural network architectures for document classification. In: Proceedings of the 2019 conference of the North American chapter of the Association for Computational Linguistics: human language technologies, vol 1 (long and short papers), pp 4046–4051
Aizawa A (2003) An information-theoretic perspective of TF-IDF measures. Inf Process Manag 39(1):45–65
Alom Z, Carminati B, Ferrari E (2020) A deep learning model for Twitter spam detection. Online Soc Netw Media 18:100079
Barushka A, Hajek P (2018) Spam filtering using integrated distribution-based balancing approach and regularized deep neural networks. Appl Intell 48(10):3538–3556
Boukes M (2019) Social network sites and acquiring current affairs knowledge: the impact of Twitter and Facebook usage on learning about the news. J Inf Technol Politics 16(1):36–51
Chen W, Yeo CK, Lau CT, Lee BS (2017) A study on real-time low-quality content detection on Twitter from the users' perspective. PLoS ONE 12(8):e0182487
Cho K, van Merrienboer B, Bahdanau D, Bengio Y (2014) On the properties of neural machine translation: encoder-decoder approaches. CoRR arXiv:1409.1259
Concone F, Re GL, Morana M, Ruocco C (2019) Twitter spam account detection by effective labeling. In: ITASEC
Dangkesee T, Puntheeranurak S (2017) Adaptive classification for spam detection on Twitter with specific data. In: 2017 21st international computer science and engineering conference (ICSEC), pp 1–4. IEEE
Dokuz AS (2021) Social velocity based spatio-temporal anomalous daily activity discovery of social media users. Appl Intell 52:2745–2762
Edo-Osagie O, De La Iglesia B, Lake I, Edeghere O (2020) A scoping review of the use of Twitter for public health research. Comput Biol Med 122:103770
Gharge S, Chavan M (2017) An integrated approach for malicious tweets detection using NLP. In: 2017 international conference


on inventive communication and computational technologies (ICICCT), pp 435–438. IEEE
Gorunescu F (2011) Classification performance evaluation. In: Data mining. Intelligent systems reference library, vol 12. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19721-5_6
Gupta A, Kaushal R (2015) Improving spam detection in online social networks. In: 2015 international conference on cognitive computing and information processing (CCIP), pp 1–6. IEEE
Gupta H, Jamal MS, Madisetty S, Desarkar MS (2018) A framework for real-time spam detection in Twitter. In: 2018 10th international conference on communication systems & networks (COMSNETS), pp 380–383. IEEE
Hennig-Thurau T, Wiertz C, Feldhaus F (2015) Does Twitter matter? The impact of microblogging word of mouth on consumers' adoption of new movies. J Acad Mark Sci 43(3):375–394
Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780
Hua W, Zhang Y (2013) Threshold and associative based classification for social spam profile detection on Twitter. In: 2013 ninth international conference on semantics, knowledge and grids, pp 113–120. IEEE
Inuwa-Dutse I, Liptrott M, Korkontzelos I (2018) Detection of spam-posting accounts on Twitter. Neurocomputing 315:496–511
Kim S-W, Gil J-M (2019) Research paper classification systems based on TF-IDF and LDA schemes. Hum-centric Comput Inf Sci 9(1):1–21
Lee S, Kim J (2013) Fluxing botnet command and control channels with URL shortening services. Comput Commun 36(3):320–332
Lin P-C, Huang P-M (2013) A study of effective features for detecting long-surviving Twitter spam accounts. In: 2013 15th international conference on advanced communications technology (ICACT), pp 841–846. IEEE
Lingam G, Rout RR, Somayajulu DVLN (2019) Adaptive deep Q-learning model for detecting social bots and influential users in online social networks. Appl Intell 49(11):3947–3964
Martinez-Rojas M, del Carmen Pardo-Ferreira M, Rubio-Romero JC (2018) Twitter as a tool for the management and analysis of emergency situations: a systematic literature review. Int J Inf Manag 43:196–208
Martinez-Romo J, Araujo L (2013) Detecting malicious tweets in trending topics using a statistical analysis of language. Expert Syst Appl 40(8):2992–3000
Mateen M, Iqbal MA, Aleem M, Islam MA (2017) A hybrid approach for spam detection for Twitter. In: 2017 14th international Bhurban conference on applied sciences and technology (IBCAST), pp 466–471. IEEE
Pengcheng Y, Sun X, Li W, Ma S, Wu W, Wang H (2018) SGM: sequence generation model for multi-label classification. arXiv preprint arXiv:1806.04822
Prabhjot K, Anubha S, Jasleen K (2016) Spam detection on Twitter: a survey. In: 2016 3rd international conference on computing for sustainable global development (INDIACom), pp 2570–2573. IEEE
Qader WA, Ameen MM, Ahmed BI (2019) An overview of bag of words; importance, implementation, applications, and challenges. In: 2019 international engineering conference (IEC), pp 200–204. IEEE
Raj RJR, Srinivasulu S, Ashutosh A (2020) A multi-classifier framework for detecting spam and fake spam messages in Twitter. In: 2020 IEEE 9th international conference on communication systems and network technologies (CSNT), pp 266–270. IEEE
Reimers N, Gurevych I, Thakur N (2019) Sentence-BERT: sentence embeddings using Siamese BERT-networks. In: Proceedings of the 2019 conference on empirical methods in natural language processing. Association for Computational Linguistics
Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556
Song J, Lee S, Kim J (2011) Spam filtering in Twitter using sender-receiver relationship. In: International workshop on recent advances in intrusion detection, pp 301–317. Springer, Cham
Sreekanth M, Sankar DM (2018) A neural network-based ensemble approach for spam detection in Twitter. IEEE Trans Comput Soc Syst 5(4):973–984
Tang D, Wei F, Qin B, Liu T, Zhou M (2014) Coooolll: a deep learning system for Twitter sentiment classification. In: Proceedings of the 8th international workshop on semantic evaluation (SemEval 2014), pp 208–212. Association for Computational Linguistics, Dublin
Tingmin W, Wen S, Xiang Y, Zhou W (2018) Twitter spam detection: survey of new approaches and comparative study. Comput Secur 76:265–284
Wang B, Zhuang J (2017) Crisis information distribution on Twitter: a content analysis of tweets during Hurricane Sandy. Nat Hazards 89(1):161–181
Wu T, Liu S, Zhang J, Xiang Y (2017) Twitter spam detection based on deep learning. In: Proceedings of the Australasian computer science week multiconference, ACSW '17. Association for Computing Machinery, New York
Yang Z, Yang D, Dyer C, He X, Smola A, Hovy E (2016) Hierarchical attention networks for document classification. In: Proceedings of the 2016 conference of the North American chapter of the Association for Computational Linguistics: human language technologies, pp 1480–1489
Zhou P, Shi W, Tian J, Qi Z, Li B, Hao H, Xu B (2016) Attention-based bidirectional long short-term memory networks for relation classification. In: Proceedings of the 54th annual meeting of the Association for Computational Linguistics (vol 2: short papers), pp 207–212

Publisher's Note  Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
