
UMT Artificial Intelligence Review (UMT-AIR)
Volume 1 Issue 1, Spring 2021
ISSN(P): 2791-1276, ISSN(E): 2791-1268
Journal DOI: https://doi.org/10.32350/UMT-AIR
Issue DOI: https://doi.org/10.32350/UMT-AIR/0101
Homepage: https://journals.umt.edu.pk/index.php/UMT-AIR

Article: Rumor Identification on Twitter Data for 2020 US Presidential Elections with BERT Model
Author(s): Abdul Rahim
Affiliation: Addo.ai
Citation: R. Abdul, "Rumor identification on twitter data for 2020 US presidential elections with BERT model," UMT Artificial Intelligence Review, vol. 1, pp. 44–54, 2021. https://doi.org/10.32350/UMT-AIR/0101/03
Copyright Information: This article is open access and is distributed under the terms of the Creative Commons Attribution 4.0 International License.

A publication of the Dr Hasan Murad School of Management, University of Management and Technology, Lahore, Pakistan
Rumor Identification in Twitter Data for 2020 US Presidential Election using BERT Model

Abdul Rahim*
*Corresponding Author: [email protected]
ABSTRACT: Social media platforms provide rich resources to their users to connect, share and search for the information of their interest. These platforms are even more significant for governmental issues and political campaigns. As information spreads within seconds, it is incredibly challenging to control and monitor the authenticity of the information. Many attempts have been made in this regard. This paper briefly overviews some significant efforts and discusses the patterns of rumors and fake news using the latest machine learning techniques. For this purpose, we extracted the tweets, specifically with the hashtag Donald Trump, during the high time of the 2020 US presidential election in order to test their authenticity. Similar data was extracted from the FactCheck websites Snopes.com, factcheck.org, and politifact.org. We applied the already established BERT model to train the data and tested one million tweets. We found the model to be reliably accurate and proposed that once all the truthful information is saved and pretrained in the model, it can auto-identify the validation of the information shared. Also, once established, such models help find the behavior of rumors and patterns in American politics.

KEYWORDS: BERT model, rumor detection, social media, US elections

I. INTRODUCTION

Social media and microblogging platforms are great examples of the latest technologies becoming part of our daily lives. These communication advancements are used in personal and professional domains, although they are used primarily in journalism and for extending political influences. Nevertheless, such communication platforms allow their users to experience more information flow quickly, easily, and cost-free. These are undeniable advantages, although they have given rise to unnecessary competitiveness, which, in turn, has resulted in inappropriate use of these exciting new technologies. Indeed, their dark side is becoming more prominent
with time due to the spread of false propaganda, fake news, and tampered information, creating a battleground for hate speech [1], [2]. All of this has resulted in irreversible damage in various forms, that is, damage to mental health and repute and social shaming [3], [4]. Such negative and unethical use of social platforms has remained in the limelight for quite some time, and this fact challenges the reliability and survival of these modern communication platforms. It also creates prospects for machine learning techniques to provide solutions.

With the rise of online fact-checking platforms and machine learning techniques, control over the spread of incorrect information is being improved, though the need for a reliable autodetection system is still there [5], [6]. This study focuses on rumors and disinformation propagated during elections, suggesting how these can be tackled using machine learning techniques. Specifically, we were inspired by the studies on rumors regarding the 2016 US presidential election [7], [8], [9]. In this work, we mainly focus on the 2020 presidential election.

The current research aims to review the previously utilized techniques and then evaluate the BERT model to classify the rumors based on a small set of information. Efforts are being made to control incorrect information flow; indeed, remarkable efforts have been made by Twitter in this regard [10]. However, fake news is still a norm in the glamour world when it comes to journalism and politics. Keeping in view the said consideration, this study investigates the circulation of rumors related to politicians during elections as it provides us with selected patterns for rumor propagation.

The rest of the paper is organized as follows. In Section 2, we briefly review the existing studies related to rumors in general by applying machine learning models, particularly in US elections. We proceed with the paper by presenting Section 3, including data preparation, the modeling aspect, and the results derived. Later, in Section 4, we discuss the results and
the limitations and prospects of the current work.

II. LITERATURE REVIEW

Many studies have been carried out to identify and classify rumors and fake news by employing machine learning methods; however, the recent trend has shifted towards advanced deep learning and hybrid approaches [10], [11], [12]. New tools and technologies are emerging with advancements in the machine learning domain. We utilized complex techniques by eschewing the details to achieve the goals more meaningfully. This section discusses some prominent and latest studies explicitly aimed at rumor detection using advanced machine learning approaches. Extensive work has been carried out by leveraging BERT and its variants as the foundation model. For instance, in the study of [5], the authors proposed a combination of a Convolutional Neural Network (CNN) with BERT. The CNN layer was added to enhance the word's semantic representation with varying sentence lengths. In doing so, the said authors achieved results with 98.9% accuracy.

Similarly, Harrag et al. [14] carried out another study that employed the BERT model with GPT-2 to recognize and classify the information (in this case, tweets) as either human-generated or machine-generated. They specifically targeted Arabic language tweets and compared their predictions with hybrid models, such as RNN, LSTM, GRU, and their variants. They reported 98% accuracy for their data.

Indeed, much work has been carried out with considerable accuracy by employing the BERT model, though lack of resources, absence of context, and unavailability of standard corpora for fake or propaganda news are the challenges faced in this domain. Da San Martino et al. [12] proposed a dataset of news articles annotated with 14 propaganda techniques to address this issue. Building on this, a BERT-based model was adopted by Patil, Singh, and Agarwal [13] for SemEval-2020 Task 11. The proposed approach consists of two aspects: identification of propaganda and classification of the techniques used to disseminate it (among 14 classes), such as
exaggeration, minimization, name-calling, bandwagon, and others.

For the 2016 US election, extensive studies were carried out on the role of social media, since these platforms are exploited at their best by politicians and influencers to promote their respective election campaigns. Notable work was conducted by Jin et al. [7] for the 2016 US presidential election. In their study, Jin et al. analyzed the false and fabricated information dispensed via eight million tweets posted by campaigners of vital presidential candidates, including Hillary Clinton and Donald Trump. Conclusions were drawn by matching them with confirmed news articles and classifying them with TF-IDF and BM25, Word2Vec and Doc2Vec, and lexicon matching approaches.

Moreover, Boynton, Shafique, and Srinivasan [10] carried out the analysis of suspended accounts related to the 2016 US presidential election by dividing the accounts into those that belonged to suspended and regular communities, respectively (Trump-IRA, Gay rights, BlackLivesMatter, and others) and strived to find their behavioral characteristics. They found no similarities among them. Their work highlighted the heterogeneous behavior exhibited by suspended accounts.

As there exist quite exciting and distinctive perspectives about the use of social media to disseminate fake news, rumors, disinformation, and trolls, they have given rise to a variety of machine learning models. No doubt, all the existing attempts are remarkable. They provide an opportunity for the research community to review the literature on misinformation from a much broader perspective and develop a sophisticated corpus and model to better understand the nature of the problem and its challenges. In the same vein, in this work, we use the BERT model to analyze the data retrieved from social media and related to the 2020 US presidential election.

III. DATA AND METHODS

In this section, we discuss the proposed model. Fig. 1 depicts the layout of the methodology used for this experimental study. We applied the Bidirectional Encoder Representations from Transformers
(BERT) technique to classify election rumors intelligently. We divided the methodology into two components:

1. Data Preparation
2. Model Preparation

a. Data Preparation

We collected the data from Twitter (via Kaggle) and from FactCheck websites. We extracted the data generated from 1st October 2020 to 25th December 2020. This period contained the peak time for the US 2020 election campaign, and leaders and their followers circulated a flood of information and disinformation to attract voters.

Figure 1. Proposed Methodology

The collected data was intended to be used for testing and predicting the efficacy of the proposed model. Initially, we intentionally eschewed the followers' data and quotes of the same tweet to focus on the uniqueness of false particulars, facts, and figures communicated via these tweets.

We randomly collected around 1394 rumors and non-rumors from FactCheck websites, namely snopes.com, factcheck.org, and politifact.org. We ensured that the data was focused on the news circulating during the said time period and specifically targeting the 2020 US election. Also, it is imperative to note that these FactCheck websites use their own standards to label the data: partial truth, entirely false, undetermined, and others. However, for this study, we specifically targeted rumors and non-rumors and assumed that partial truth also falls in the rumor category.

We updated the annotation of the extracted data to normalize it, using the Prodigy annotation tool (as shown in Fig. 2), which made it easier for us to know if the data required modification in its annotations.

Figure 2. Sample Annotation of Data using Prodigy

Moreover, we manually cross-validated the results based on the information provided by FactCheck and people voter websites. This practice enabled us to prepare reliable data corresponding to the desired objectives and the identification and reliable classification of rumors.

b. Model Preparation

We used a pre-trained BERT model as a sentence encoder. A significant benefit of using this model is that it accurately extracts the context of the sentence. It also removes directional constraints by applying the Masked Language Model (MLM). This feature has made BERT an outstanding model compared to other embedding models, such as TF-IDF, Word2Vec, lexicon matching, and Doc2Vec [11].

After annotating the rumor data, the next step was fine-tuning the model with the training set. This process allowed the implementation of the desired strategy on the pre-trained BERT model. We tokenized the dataset of 1394 rumors to make word embeddings using a pre-trained BERT model. The maximum length of a sentence in the dataset was 55 (as shown in Fig. 3), and we set it to 128 for training purposes. Consequently, it required significantly less effort to develop the calibrated model. We fine-tuned the top fully connected layer with the word embedding vector. Also, we opted for a broad-match strategy that used a minimum number of keywords to predict as many tweets as possible that could be classified as rumors.

Figure 3. Length of Sentence

Conventionally, two approaches are used to set the hyperparameter values: default optimal and manual. In this study, we opted for the default option as we neither changed the in-built functionality nor created a hybrid of models. For model training, we used Google Colab, an open-source platform for data science training.
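The following minimal sketch illustrates this fine-tuning setup. It assumes the Hugging Face transformers implementation of BERT (the specific library is not named in the paper), uses the maximum sequence length of 128 mentioned above together with the batch size of 64 and 100 epochs reported in the conclusion, freezes the encoder so that only the top classification layer is updated, and relies on hypothetical file and column names.

```python
import pandas as pd
import torch
from torch.utils.data import DataLoader, Dataset
from transformers import BertForSequenceClassification, BertTokenizerFast

class RumorDataset(Dataset):
    """Wraps the annotated rumor/non-rumor texts for BERT fine-tuning."""
    def __init__(self, texts, labels, tokenizer, max_len=128):
        self.enc = tokenizer(list(texts), truncation=True, padding="max_length",
                             max_length=max_len, return_tensors="pt")
        self.labels = torch.tensor(list(labels), dtype=torch.long)

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, i):
        item = {k: v[i] for k, v in self.enc.items()}
        item["labels"] = self.labels[i]
        return item

df = pd.read_csv("rumor_dataset.csv")                     # assumed columns: text, label
tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

# Freeze the encoder; only the top classification layer is fine-tuned, as described above.
for p in model.bert.parameters():
    p.requires_grad = False

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

loader = DataLoader(RumorDataset(df["text"], df["label"], tokenizer),
                    batch_size=64, shuffle=True)
optimizer = torch.optim.AdamW((p for p in model.parameters() if p.requires_grad), lr=2e-5)

model.train()
for epoch in range(100):                                  # 100 epochs, as in the paper
    for batch in loader:
        batch = {k: v.to(device) for k, v in batch.items()}
        loss = model(**batch).loss                        # cross-entropy loss over the 2 classes
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```

The learning rate shown is an assumed value, since the paper keeps the default hyperparameters and does not report one.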
IV. RESULTS AND DISCUSSION

After training the model, we presented it with test data containing the "hashtag_donaldtrump" data of 971,073 tweets. The main limitation of the study was inadequate computing resources. The majority of open-source platforms provide a maximum of 16 GB of RAM for computation. However, nearly 1 million tweets comprised the test data set. To tackle this issue, we made a virtual machine with the following specifications:

1. Operating System: Windows 10 Pro
2. Processor: Intel Xeon X5670 2.9 GHz (8 vCPU cores)
3. RAM: 128 GB
4. Hard Drive: 100 GB
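A minimal sketch of how the roughly 971,073 test tweets can then be scored in batches is shown below. It continues from the fine-tuning sketch above (reusing its model, tokenizer, and device), and the file and column names are again hypothetical. The predictions are collected into a pandas data frame and written to CSV, as described later in this section.

```python
test_df = pd.read_csv("hashtag_donaldtrump.csv")          # ~971,073 test tweets (column name assumed)
texts = test_df["tweet"].astype(str).tolist()

model.eval()
predictions = []
with torch.no_grad():
    for start in range(0, len(texts), 256):               # score the tweets in batches of 256
        batch = tokenizer(texts[start:start + 256], truncation=True, padding=True,
                          max_length=128, return_tensors="pt").to(device)
        logits = model(**batch).logits
        predictions.extend(logits.argmax(dim=-1).tolist())

test_df["predicted_rumor"] = predictions
test_df.to_csv("predictions.csv", index=False)            # predictions saved as CSV
```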

Table I. Model Scores

               Precision   Recall   F1-score   Support
0              0.88        0.62     0.73       106
1              0.70        0.91     0.79       104
Accuracy                   0.77     0.76       210
Macro avg.     0.79        0.77     0.76       210
Weighted avg.  0.79        0.77     0.76       210

Table II. Confusion Matrix

             Predicted 0   Predicted 1
Actual 0     66            40
Actual 1     9             95
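Tables I and II follow the format of a standard classification report and confusion matrix. As a minimal sketch (assuming scikit-learn; the label arrays below are placeholders, not the actual 210-example evaluation split), such scores can be produced as follows.

```python
from sklearn.metrics import classification_report, confusion_matrix

# Placeholder gold labels and model predictions for a held-out labelled split;
# in the paper, this split contains 210 examples.
y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]

print(classification_report(y_true, y_pred, digits=2))    # Table I format
print(confusion_matrix(y_true, y_pred))                    # Table II format (rows: actual, columns: predicted)
```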

With these specs, we could run the model; whenever we required more RAM for computing the data, we used local hardware and saved the file. Moreover, if we needed a GPU for computing, we used Kaggle hardware resources. It took around 3 hours to predict the results. We made pandas data frames for these predictions and saved them into CSV.

The model displayed a precision score of 77% (as given in Table I). The respective confusion matrix is also shown in Table II. The training loss of the model decayed comparatively fast over 100 epochs, with few inconsistencies, and settled at around 0.4. Fig. 4 shows the cross-entropy loss, which reduced rapidly as the model learned the data.

Figure 4. Model Training and Testing

Therefore, the proposed model achieved significantly accurate results with minimal data loss and minor information loss. Further, we compared the train and test data. This comparison contains a smaller set of train data and extensively tested data with the same context. It can be established that the bidirectional, pre-trained word embeddings of BERT lead to faster training of the model and lower cross-entropy loss.

V. CONCLUSION

This work presented a BERT-based model for identifying rumors, specifically those spread by politicians, regarding the 2020 US presidential election. The current study intended to utilize the recent advancements in deep learning models by training the proposed model with data from authentic resources and excluding rumors. The tuned model was set to a batch size of 64 and 100 epochs. The cross-entropy technique was used as a cost function for optimizing the model. The model was able to achieve a precision score of 77%. Later, we used the trained model to detect rumors in almost one million tweets worldwide. Although the initial objective was testing the tweets related to the recent US presidential election, the model can
rumors on the go, once it is trained with sufficient data from validated sources. In the future, we intend to extend this work by incorporating more diverse tweets in the context of political campaigns.

References

1. T. Davidson, D. Warmsley, M. Macy, and I. Weber, "Automated hate speech detection and the problem of offensive language," in Proc. Int. AAAI Conf. Web Soc. Media, vol. 11, no. 1, Mar. 2017, pp. 512-515.
2. A. Seyam, A. Bou Nassif, M. Abu Talib, Q. Nasir, and B. Al Blooshi, "Deep learning models to detect online false information: A systematic literature review," in 7th Ann. Int. Conf. Arab Women Comput. Conj. 2nd Forum Women Res., Sharjah, UAE, Aug. 2021, pp. 1-5.
3. A. Matamoros-Fernández and J. Farkas, "Racism, hate speech, and social media: A systematic review and critique," Telev. New Media, vol. 22, no. 2, pp. 205-224, Jan. 2021. https://doi.org/10.1177/1527476420982230
4. T. Enarsson and S. Lindgren, "Free speech or hate speech? A legal analysis of the discourse about Roma on Twitter," Inf. Commun. Technol. Law, vol. 28, no. 1, pp. 1-18, Jul. 2019. https://doi.org/10.1080/13600834.2018.1494415
5. R. K. Kaliyar, A. Goswami, and P. Narang, "FakeBERT: Fake news detection in social media with a BERT-based deep learning approach," Multimed. Tools Appl., vol. 80, no. 8, pp. 11765-11788, Jan. 2021. https://doi.org/10.1007/s11042-020-10183-2
6. M. Mozafari, R. Farahbakhsh, and N. Crespi, "Hate speech detection and racial bias mitigation in social media based on BERT model," PLoS ONE, vol. 15, no. 8, p. e0237861, Aug. 2020. https://doi.org/10.1371/journal.pone.0237861
7. Z. Jin, J. Cao, H. Guo, Y. Zhang, Y. Wang, and J. Luo, "Detection and analysis of 2016 US presidential election related rumors on Twitter," in Proc. Int. Conf. Soc. Comput., Behav.-Cult. Model. Predict. Behav. Represent. Model. Simul., Jun. 2017, pp. 14-24. https://doi.org/10.1007/978-3-319-60240-0_2

8. H. T. Le, G. Boynton, Y. Mejova, Z. Shafiq, and P. Srinivasan, "Revisiting the American voter on Twitter," in Proc. 2017 CHI Conf. Hum. Factors Comput. Syst., May 2017, pp. 4507-4519. https://doi.org/10.1145/3025453.3025543
9. A. Khatua, A. Khatua, and E. Cambria, "Predicting political sentiments of voters from Twitter in multi-party contexts," Appl. Soft Comput., vol. 97, Art. no. 106743, Dec. 2020. https://doi.org/10.1016/j.asoc.2020.106743
10. N. Aslam, I. Ullah Khan, F. S. Alotaibi, L. A. Aldaej, and A. K. Aldubaikil, "Fake detect: A deep learning ensemble model for fake news detection," Complexity, vol. 2021, pp. 1-8, Apr. 2021. https://doi.org/10.1155/2021/5557784
11. Z. Khanam, B. Alwasel, H. Sirafi, and M. Rashid, "Fake news detection using machine learning approaches," in IOP Conf. Ser.: Mater. Sci. Eng., Jaipur, India, 2021, pp. 12-40.
12. G. Da San Martino, A. Barrón-Cedeño, H. Wachsmuth, R. Petrov, and P. Nakov, "SemEval-2020 task 11: Detection of propaganda techniques in news articles," in Proc. 14th Workshop Semant. Eval., Dec. 2020, pp. 1377-1414.
13. R. Patil, S. Singh, and S. Agarwal, "BPGC at SemEval-2020 task 11: Propaganda detection in news articles with multi-granularity knowledge sharing and linguistic features based ensemble learning," arXiv preprint arXiv:2006.00593, 2020. https://doi.org/10.48550/arXiv.2006.00593
14. A. Harrag and H. Rezk, "Indirect P&O type-2 fuzzy-based adaptive step MPPT for proton exchange membrane fuel cell," Neural Comput. Appl., vol. 33, no. 15, pp. 9649-9662, Feb. 2021. https://doi.org/10.1007/s00521-021-05729-w
