Fake News Detection Based on Deep Learning

1st Ashfia Jannat Keya, Department of Computer Science and Engineering, Bangladesh University of Business and Technology, Dhaka-1216, Bangladesh, [email protected]
2nd Shahid Afridi, Department of Computer Science and Engineering, Bangladesh University of Business and Technology, Dhaka-1216, Bangladesh, [email protected]
3rd Afroza Siddique Maria, Department of Computer Science and Engineering, Bangladesh University of Business and Technology, Dhaka-1216, Bangladesh, [email protected]
4th Snaha Sadhu Pinki, Department of Computer Science and Engineering, Bangladesh University of Business and Technology, Dhaka-1216, Bangladesh, [email protected]
5th Joy Ghosh, Department of Computer Science and Engineering, Bangladesh University of Business and Technology, Dhaka-1216, Bangladesh, [email protected]
6th M. F. Mridha, Department of Computer Science and Engineering, Bangladesh University of Business and Technology, Dhaka-1216, Bangladesh, [email protected]

2021 International Conference on Science & Contemporary Technologies (ICSCT) | 978-1-6654-2132-4/21/$31.00 ©2021 IEEE | DOI: 10.1109/ICSCT53883.2021.9642565

Abstract—Fake news is invalid and misleading information that is conveyed as accurate news. Fake news detection has become indispensable in modern society because of the extreme propagation of false news on social platforms and news portals. Several studies have been released that use fake news on social platforms instead of news content for decision-making. Therefore, this paper introduces an automated model for detecting fake news relying on Deep Learning (DL) and Natural Language Processing (NLP) for a low-resource language like Bangla, utilizing news content and headline features. We propose an ensemble approach of Convolutional Neural Network (CNN) and Gated Recurrent Unit (GRU) with a pre-trained GloVe embedding method that achieved an accuracy of 98.71% on the test data. For comparison, the combination of Long Short-Term Memory (LSTM) and CNN with GloVe is trained using the same dataset and parameters. We also experimented on a benchmark dataset containing English news with our suggested model and achieved an accuracy of 98.94%. Our model's performance is evaluated using diverse evaluation metrics, including accuracy, recall, precision, F1-score, etc.

Index Terms—Fake News, Natural Language Processing, Deep Learning, Ensemble Approach, CNN+GRU.

I. INTRODUCTION

Fake news is a serious social issue. A few organizations and systems work together to distribute fake news around the world. People sometimes spread fake news intentionally to defame someone else, and sometimes spread it without any intention or without knowing the actual story. Believing others' opinions without judging them, together with a lack of knowledge, causes the dissemination of misleading information in society. According to research, the spread of fake news on social media has a long-lasting effect on less discerning readers and holds them back from making proper judgments [1]. With the expansion of the internet and social media, the rate at which fake news spreads is increasing drastically.

Many people already have access to social media, but they have no idea what they are reading or posting. People usually follow celebrities and religious or political figures, and believe or support them. Therefore, if those leaders post any news without first verifying the truth, false information spreads quickly. Fake news also has significant regional impacts. The disruption of the American presidential election is a prominent example of fake news distorting people's views [2]. Several scientific communities have attempted to discover how to detect fake news from different perspectives (propagation patterns, source authenticity, writing style) [3]. Researchers have suggested various strategies to prevent false information. Diverse Machine Learning (ML) algorithms and neural networks are applied for the detection of fake news [4]. Recently, the ensemble approach has greatly increased model performance [5]. So far, this sort of work has mostly been performed on English news. Around 265 million people currently speak Bengali, making it the 7th most spoken language in the world. But few works have tackled the risk of fake news written in Bengali [6]. The overall contribution of the paper includes:

• We introduce an ensemble approach with a pre-trained word embedding method, GloVe, for Bengali fake news detection. To the best of our knowledge, this is the first research in fake news detection that proposes a combination of CNN and GRU.
• We investigate the model's performance with an established Bengali dataset, "BanFakeNews," and an English dataset named "Fake News," utilizing NLP applications.
• We perform several experiments with the Bengali dataset and train a CNN and LSTM ensemble model following the same process to compare with our proposed model.

The paper is structured as follows: Section II highlights the related work in this domain. Section III illustrates the proposed methodology. Section IV depicts the evaluation metrics. We discuss the results of our experiments in Section V. Finally, Section VI brings the paper to a conclusion.

Authorized licensed use limited to: Sri Venkateshwara College of Engineering. Downloaded on March 03,2025 at 06:37:17 UTC from IEEE Xplore. Restrictions apply.

II. RELATED WORK

Fake news and rumours are constantly growing through the internet. Several researchers are working to solve this problem. We discuss the related works based on three approaches: 1) Machine Learning Approach, 2) Deep Learning Approach, and 3) Ensemble Approach.

A. Machine Learning Approach

Ibrishimova and Li [7] proposed a hybrid framework that combines five NLP features with three features of knowledge verification. The authors used logistic regression as their classifier and trained it using the Kaggle Fake News dataset. Granik and Mesyura [8] utilized a dataset of Facebook news contents and proposed a system for detecting fake news based on the Naive Bayes classifier. However, the proposed model's performance was lacking, as it only obtained 74% accuracy. Hussain et al. [6] used Multinomial Naive Bayes (MNB) and Support Vector Machine (SVM) classifiers for identifying Bangla fake news. The research concludes that SVM, combined with the linear kernel, has an accuracy of 96.64%, which is marginally better than MNB's 93.32% accuracy on the proposed dataset.

B. Deep Learning Approach

Bahad et al. [9] proposed a bi-directional LSTM approach to detect fake news. They compared the bi-directional LSTM model's accuracy to that of CNN, vanilla RNN, and unidirectional LSTM. It was found that the bi-directional LSTM model outperforms all other models. Girgis et al. [10] used vanilla RNN, GRU, and LSTM on the LIAR dataset. They observed that the GRU reached 21.7% test accuracy, which is the best of their results. The authors plan to use a hybrid model that incorporates CNN and GRU techniques on the LIAR dataset. Another study by Kaliyar et al. [11] proposed a deep CNN (FNDNet) that can automatically learn the biased features for fake news classification through multiple hidden layers. Their model obtained an accuracy of 98.36%. The authors plan to utilize a multi-model-based approach combining the pre-trained word embeddings to improve their work.

C. Ensemble Approach

A study by Sangamnerkar et al. [12] performed various ensemble techniques with ML classifiers. An ensemble of Logistic Regression (LR), Decision Tree (DT), and Bagging Classifiers combined with a hard-voting ensemble technique produces the best results, with over 88% accuracy. Hakak et al. [13] recommended an ensemble model for fake news detection comprising three ML classifiers: Random Forest, Decision Tree, and Extra Tree Classifier. The proposed model performed better on the ISOT dataset with an accuracy of 100%, but on the LIAR dataset, it obtained only 44.15% accuracy. Agarwal and Dixit [14] introduced a method that includes an ensemble network for determining how news stories, writers, and titles are represented simultaneously. They tested various machine learning algorithms for higher accuracy, including CNN, KNN, SVM, LSTM, and Naive Bayes, and noticed that LSTM has the best accuracy with 97%. Asghar et al. [15] proposed a system for fake news detection, combining bidirectional LSTM with CNN for classifying tweets into rumours and non-rumours. Though their model achieved an accuracy of 86.12%, they believe that by combining diverse features, they can get more reliable results. A study by K. Shu et al. [16] wants to improve their work by including features such as the article's source or author, as well as user feedback.

Previous researchers focused on combining CNN and LSTM, whereas GRU is structurally and functionally similar to LSTM. Still, GRU is simpler, easier to train, and uses less memory since it has only two gates, namely reset and update. The strength of LSTM in learning long-term dependencies is retained in GRU. Thus, we propose an ensemble approach of CNN and GRU utilizing news content and headline features.

III. PROPOSED METHODOLOGY

The architecture of our proposed methodology is shown in Fig. 1. A further detailed description of our methodology is given in the following sections.

Fig. 1: Architecture of our methodology. (Pipeline: Bangla News Dataset → Data Pre-Processing [Stop Words Removal, Tokenization, Stemming] → Feature Extraction [News Content, Headline; GloVe Word Embedding] → Models [CNN+LSTM, CNN+GRU] → Prediction on test dataset → Predicted Label.)

Fig. 2: Proposed model. (Layer sequence: Embedding Layer → Dropout Layer → Conv1D Layer → Max Pooling Layer → Conv1D Layer → Max Pooling Layer → GRU Layer → Normalization Layer → Dense Layer → Output Layer.)
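To make the layer sequence in Fig. 2 more concrete, the following sketch traces how the sequence length and the number of convolution parameters change through the two Conv1D and Max Pooling blocks, using the hyperparameters reported in Section III (input length 1000, 100-dimensional embeddings, 32 filters of kernel size 5, 64 filters of kernel size 3, pool size 2). The padding mode and stride are not stated in the paper; the sketch assumes 'valid' padding, stride 1, and non-overlapping pooling windows, and the function names are hypothetical.

```python
# Sketch: trace tensor shapes through the two Conv1D + MaxPooling blocks.
# Assumptions (not stated in the paper): 'valid' padding, stride 1 for the
# convolutions, and non-overlapping pooling windows.


def conv1d_out_len(length, kernel_size, stride=1):
    """Output length of a 1-D convolution with 'valid' padding."""
    return (length - kernel_size) // stride + 1


def maxpool1d_out_len(length, pool_size):
    """Output length of non-overlapping 1-D max pooling."""
    return length // pool_size


def conv1d_params(in_channels, filters, kernel_size):
    """Trainable parameters: kernel weights plus one bias per filter."""
    return (kernel_size * in_channels + 1) * filters


def trace_cnn_blocks(seq_len=1000, embed_dim=100):
    """Return (output length, channels, conv parameters) for each block."""
    blocks = [(32, 5), (64, 3)]  # (filters, kernel size) as reported
    pool_size = 2
    channels = embed_dim
    report = []
    for filters, kernel in blocks:
        params = conv1d_params(channels, filters, kernel)
        seq_len = conv1d_out_len(seq_len, kernel)
        seq_len = maxpool1d_out_len(seq_len, pool_size)
        channels = filters
        report.append((seq_len, channels, params))
    return report


if __name__ == "__main__":
    for i, (length, channels, params) in enumerate(trace_cnn_blocks(), 1):
        print(f"block {i}: output {length} x {channels}, conv params {params}")
```

Under these assumptions the GRU layer would receive a sequence of 248 steps with 64 features each, which is a quarter of the original 1000 tokens; a different padding mode would change these numbers slightly.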

A. Dataset Description

We have used the "BanFakeNews" dataset¹ and the "Fake News" dataset², which are publicly available and were collected from Kaggle. The "BanFakeNews" dataset [17] consists of approximately 50K news items in Bengali, and the "Fake News" dataset consists of 20,800 news items in English. The datasets are provided as CSV files. There are seven attributes in "BanFakeNews": articleID, domain, date, category, headline, content, and label. The "Fake News" dataset has five attributes: id, title, author, text, and label. Table I displays the details of the datasets.

¹ https://www.kaggle.com/cryptexcode/banfakenews
² https://www.kaggle.com/c/fake-news/data

TABLE I: Description of the Datasets

Dataset        True News   Fake News   Total
BanFakeNews    48,678      1,299       49,977
Fake News      10,413      10,387      20,800

B. Data Pre-processing

Data pre-processing is essential to reduce noisy and unnecessary data. For Bengali news, we merged the two files of true news and fake news from the "BanFakeNews" dataset. For English news, we took the pre-merged "Fake News" dataset. We considered the news content and headline features for the "BanFakeNews" dataset. Author, text, and title features are considered for the "Fake News" dataset. NLP methods, including stemming, tokenization, and stop words removal, are used to convert the raw data with the help of the Keras and TensorFlow libraries. Stopwords are words that often appear in our collected data but carry no meaning with respect to the features. Thus, we identified all stopwords in the "BanFakeNews" dataset, e.g. 'ই', '|', 'আমরা', 'কই', 'তখন', and shortened execution time and saved memory space by eliminating them. For the "Fake News" dataset, stopwords such as 'again', 'do', 'off', and 'with' are removed. Next, we identified words with similar meanings and applied stemming with NLTK's Porter stemmer. Finally, tokenization is applied to break the merged texts into sequences of words. After tokenizing, 72,355 and 247,340 unique tokens are found for the Bengali and English datasets, respectively. Textual sequences are replaced with numerical sequences and padded to the highest sequence length of 1000. For training, testing, and validation, we divided the dataset in a ratio of 80:10:10.

C. Feature Extraction

We removed some trivial features to construct a new collection of features and reduce the dimension of our dataset. The final sample has fewer features than the original dataset. The benefit of eliminating features is that the computational time is decreased, since most algorithms run much faster when there are fewer dimensions to consider. Following the data pre-processing steps, we removed the articleID, date, category, and domain features from the "BanFakeNews" dataset because these features are in English while the others are in Bengali. News content and headline features are considered in the "BanFakeNews" dataset. Author, text, and title features are considered in the "Fake News" dataset for further processing.

We have applied pre-trained word embedding for our model. It is a form of transfer learning, as the embedding is trained on large datasets, and we combined it with our ensemble model. After feature extraction, Global Vectors for Word Representation (GloVe) is used to map each word to a vector. GloVe is an unsupervised learning algorithm for creating word embeddings. It is easier to train over larger data because GloVe allows a parallel implementation. In this work, we have used "bn_glove.39M.zip" and "glove.6B.zip" for the Bengali and English datasets, respectively. We prepared the word embedding on our Bengali and English datasets, which consist of 178,153 and 400,000 words, respectively. From several embedding vector sizes, we selected the 100-dimensional version. GloVe aims to emphasize the vectors of a word in the vector space for achieving sub-linear relationships.

GloVe gives lower weight to ubiquitous word sets, which limits the influence of unnecessary stop words that do not contribute to the training progress [11].
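The pre-processing and embedding steps described in Sections III-B and III-C can be illustrated with a small, library-free sketch: tokens are mapped to integer ids, sequences are padded to a fixed length, a per-word embedding matrix is filled from a pre-trained table, and the data is split 80:10:10. The paper used Keras utilities for these steps; the function names below are hypothetical, and several details are assumptions rather than the authors' settings: index 0 is reserved for padding, unknown words are dropped, padding is appended at the end, words missing from the pre-trained table keep an all-zero row, and shuffling before the split is omitted.

```python
# Sketch of integer encoding, padding, embedding-matrix construction, and
# an 80:10:10 split. All names are hypothetical; see the stated assumptions.


def build_vocab(token_lists):
    """Assign each distinct token an integer id, starting at 1."""
    vocab = {}
    for tokens in token_lists:
        for tok in tokens:
            if tok not in vocab:
                vocab[tok] = len(vocab) + 1  # 0 is reserved for padding
    return vocab


def encode_and_pad(tokens, vocab, max_len=1000):
    """Replace tokens by ids, truncate to max_len, and pad with zeros."""
    ids = [vocab[t] for t in tokens if t in vocab][:max_len]
    return ids + [0] * (max_len - len(ids))


def build_embedding_matrix(vocab, pretrained, dim=100):
    """Row i holds the pre-trained vector of the word with id i."""
    matrix = [[0.0] * dim for _ in range(len(vocab) + 1)]
    for word, idx in vocab.items():
        vec = pretrained.get(word)
        if vec is not None:
            matrix[idx] = list(vec)
    return matrix


def split_80_10_10(items):
    """Split a list into training, validation, and test parts."""
    n_train = len(items) * 8 // 10
    n_val = len(items) // 10
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])


if __name__ == "__main__":
    vocab = build_vocab([["fake", "news"], ["news", "today"]])
    print(encode_and_pad(["news", "unseen", "fake"], vocab, max_len=5))
    train, val, test = split_80_10_10(list(range(20800)))
    print(len(train), len(val), len(test))
```

With 20,800 items, this split leaves 2,080 items in the test part, which agrees with the size of the "Fake News" test set mentioned in Section IV.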

D. Model

1) Embedding and Dropout Layer: The embedding layer is the model's first layer, which takes input features and transforms every word into a 100-dimensional vector. If any text contains fewer than the maximum number of tokens, it is padded to equal length. The dropout layer receives these word vectors. We have added a dropout layer with a 0.2 dropout rate as our regularization technique, which means each input value is randomly set to zero with probability 0.2 during training.

2) Convolution and Max-Pooling Layer: We have applied a simple CNN with two convolution blocks, each consisting of a single Conv-1D and Max Pooling layer. We have used 32 filters with kernel size 5 in the first layer, and for the second layer, we have used 64 filters with kernel size 3. Each filter detects features in the text, and the ReLU activation function is applied to the output of each CNN neuron. This activation function converts any negative value to zero. The values are then fed to the 1-D Max-Pooling layer. The Max-Pooling layer reduces dimensionality while conserving the learned patterns. The pool size is set to 2.

3) GRU Layer: After the convolution blocks, a GRU layer is used. We have used GRU as it helps to mitigate the vanishing and exploding gradient problems. It also helps to handle long sequential textual data through its recurrent architecture. Since GRU has fewer tensor operations, it is easier to train than LSTM. We trained our proposed model for 100 epochs, with 20% dropout and 20% recurrent dropout at the GRU layer. All other parameters were kept at their default values.

4) Batch Normalization and Dense Layer: We applied Batch Normalization (BN) to standardize the inputs to the dense layer, so that they have approximately zero mean and unit variance. The fully connected dense layer is the final layer of our proposed model. It generates a single output. A Softmax activation function follows this layer. Small batches of 128 have been used to train and evaluate the proposed model depicted in Fig. 2.

IV. PERFORMANCE EVALUATION METRICS

Several performance measures are taken into consideration to evaluate our model. The Confusion Matrix, Accuracy (A), Precision (P), Recall (R), F1-score (F1), and ROC curve are used as evaluation metrics for the proposed model.

1) Confusion Matrix: The confusion matrix gives an overview of model performance on the testing dataset against the known true values. It summarizes the model's success in terms of true positives, true negatives, false positives, and false negatives. Our model's confusion matrix with 2,080 instances of the test set of the "Fake News" dataset is given in Fig. 3.

Fig. 3: The confusion matrix of the CNN+GRU model.

2) Accuracy: The accuracy score, also known as classification accuracy rating, is the ratio of correct predictions to total predictions made by the model. The accuracy (A) is given by equation (1).

$$ A = \frac{\text{TruePositive} + \text{TrueNegative}}{\text{TotalNumberOfPredictions}} \tag{1} $$

3) Precision: Precision (P) is the number of true positive results divided by the total number of positive results, including those that were incorrectly identified. Precision is computed using equation (2).

$$ P = \frac{\text{TruePositive}}{\text{TruePositive} + \text{FalsePositive}} \tag{2} $$

4) Recall: Recall (R) is the number of true positive results divided by the total number of samples that should have been identified as positive. The recall is computed using equation (3).

$$ R = \frac{\text{TruePositive}}{\text{TruePositive} + \text{FalseNegative}} \tag{3} $$

5) F1-score: The F1-score (F1) summarizes the accuracy of the model for each class as the harmonic mean of precision and recall. It is usually used when the dataset is not balanced. To show the proposed model's performance, we have used the F1-score as an evaluation metric. It is computed using equation (4).

$$ F1 = 2 \times \frac{\text{precision} \times \text{recall}}{\text{precision} + \text{recall}} \tag{4} $$

6) ROC curve and AUC: The Receiver Operating Characteristics (ROC) curve shows the performance of a classification model across several classification thresholds. The True Positive Rate (Recall) and the False Positive Rate (FPR) are used in this curve. AUC is an abbreviation for "Area Under the ROC curve." In other words, AUC measures the entire two-dimensional area under the ROC curve. The FPR is defined in equation (5).

$$ FPR = \frac{\text{FalsePositive}}{\text{FalsePositive} + \text{TrueNegative}} \tag{5} $$
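Equations (1) to (5) can be computed directly from the four confusion-matrix counts. The following is a minimal sketch; the function name and the example counts are hypothetical and are not taken from the paper's experiments, and returning 0.0 for an undefined ratio is a convention chosen here.

```python
# Sketch: accuracy, precision, recall, F1-score, and FPR from the four
# confusion-matrix counts, following equations (1) to (5).


def classification_metrics(tp, tn, fp, fn):
    """Return a dict of metrics; undefined ratios are reported as 0.0."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "fpr": fpr,
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    for name, value in classification_metrics(tp=90, tn=80, fp=20, fn=10).items():
        print(f"{name}: {value:.4f}")
```

For the hypothetical counts above, accuracy is 0.85 and recall is 0.90, while precision is lower (90/110), so the F1-score falls between the two.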

TABLE II: Evaluation Metrics

Dataset        Model                                Accuracy   Precision   Recall   F1-Score
BanFakeNews    CNN+GRU (with 2,500 instances)       0.82       0.83        0.83     0.83
BanFakeNews    CNN+GRU (with 13,000 instances)      0.95       0.94        0.82     0.87
BanFakeNews    CNN+GRU (with 49,977 instances)      0.98       0.94        0.79     0.85
BanFakeNews    CNN+LSTM (with 49,977 instances)     0.98       0.89        0.87     0.88
Fake News      CNN+GRU (with 20,800 instances)      0.98       0.99        0.99     0.99
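When reading the accuracy column of Table II, it may help to keep the class balance from Table I in mind. The sketch below computes the accuracy that a trivial classifier would reach by always predicting the majority class; this baseline is an illustration added here, not an experiment from the paper, and the function name is hypothetical.

```python
# Sketch: accuracy of a classifier that always predicts the majority
# class, using the class counts reported in Table I.


def majority_baseline(true_count, fake_count):
    """Share of the larger class, i.e. accuracy of always predicting it."""
    total = true_count + fake_count
    return max(true_count, fake_count) / total


if __name__ == "__main__":
    ban = majority_baseline(48678, 1299)      # BanFakeNews
    fake = majority_baseline(10413, 10387)    # Fake News
    print(f"BanFakeNews majority baseline: {ban:.3f}")
    print(f"Fake News majority baseline:   {fake:.4f}")
```

On "BanFakeNews" roughly 97.4% of the items are true news, so accuracy alone says little about how well fake items are found, which is one reason precision, recall, and F1-score are reported alongside it. On the nearly balanced "Fake News" dataset the same baseline is close to 50%.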

Fig. 4: Accuracy and Loss (During Training and Validation) of CNN+GRU Model on "BanFakeNews" and "Fake News" datasets. Panels: (a) Accuracy on "BanFakeNews", (b) Loss on "BanFakeNews", (c) Accuracy on "Fake News", (d) Loss on "Fake News".

Fig. 5: This figure illustrates the ROC curve and AUC score of the CNN+GRU model with a different range of instances of the "BanFakeNews" dataset. Panels: (a) with 2,500 instances, (b) with 13,000 instances, (c) with 49,977 instances.

Fig. 6: This figure illustrates the ROC curve and AUC score of the CNN+LSTM model with a different range of instances of the "BanFakeNews" dataset. Panels: (a) with 2,500 instances, (b) with 13,000 instances, (c) with 49,977 instances.
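ROC curves and AUC scores such as those in Fig. 5 and Fig. 6 are obtained by sweeping a decision threshold over the model's predicted scores. The sketch below shows one simple way to do this with the trapezoidal rule; it is not the authors' plotting code, the function names are hypothetical, and the labels and scores in the example are made up for illustration. It assumes that both classes are present in the labels.

```python
# Sketch: ROC points from predicted scores and AUC via the trapezoidal
# rule. Labels are 1 (positive) or 0 (negative); both must occur.


def roc_points(labels, scores):
    """Return (FPR, TPR) pairs, one per distinct threshold, from (0, 0)."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= thr and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= thr and y == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under a piecewise-linear curve given as sorted (x, y) pairs."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


if __name__ == "__main__":
    # Made-up labels and scores, for illustration only.
    labels = [1, 1, 0, 1, 0, 0]
    scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
    pts = roc_points(labels, scores)
    print(pts)
    print(f"AUC = {auc(pts):.4f}")
```

In this toy example eight of the nine positive/negative pairs are ranked correctly, so the AUC is 8/9.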

V. EXPERIMENTS AND RESULTS

All code is executed in Python 3.8 using TensorFlow 2.3.0. All experiments are performed on an Intel Core i5-7300H CPU at 2.50 GHz with 8 GB RAM. The CNN+GRU model's training time is 63 minutes with 10 epochs on the "BanFakeNews" dataset and 29 minutes on the "Fake News" dataset. The Adam optimizer is used to adapt the learning rate and update the weights to reduce the loss. We have performed several experiments with the combination of CNN and GRU. Our models' average values of the performance evaluation metrics are given in Table II. In the first experiment, extracted features are fed to the CNN+GRU architecture with 2,500 instances of the "BanFakeNews" dataset. In the second experiment, we have trained the CNN+GRU architecture with 13,000 instances. Lastly, the proposed ensemble model of CNN+GRU is trained with 49,977 instances. We have also implemented a CNN+LSTM ensemble model for comparison with our proposed model following the same process. While training with a different range of instances, it is observed that the model performed best with 49,977 instances, reaching 98.71% accuracy. The result demonstrates that when the dataset is imbalanced, or the gap between the amounts of false news and true news increases, the recall decreases. The loss and accuracy graphs of our best-performing model (CNN+GRU) on both datasets are given in Fig. 4. Finally, the CNN+GRU architecture is trained with the balanced "Fake News" dataset and achieved an accuracy of 98.94%, which outperformed existing models that used the same dataset. The CNN+GRU model achieved the highest F1-score of 0.99 on the "Fake News" dataset. The ROC and AUC score graphs of our models with different instances are given in Fig. 5 and 6. Besides, the accuracy of existing models compared with our model (CNN+GRU) on the "Fake News" dataset is given in Table III.

TABLE III: This table evaluates the performance comparison with existing models on the "Fake News" dataset.

Dataset      Model                         Accuracy
Fake News    DT+LR+BGC [12]                88.08%
Fake News    LSTM+CNN [16]                 94.71%
Fake News    Merged CNNs [18]              96%
Fake News    LSTM [14]                     97%
Fake News    FNDNet [11]                   98.36%
Fake News    Proposed Model (CNN+GRU)      98.94%

VI. CONCLUSION

Fake news detection is critical for determining whether or not a piece of news is genuine. We explored different neural networks, CNN, GRU, and LSTM, and built an ensemble model using CNN and GRU to detect fake news. We have used pre-trained GloVe embedding as our word embedding because it complements the training process significantly better than the traditional bag-of-words method. It provides each word with a vector projection and captures its relationships, similarities, and differences with other words in the vocabulary. We have trained our model using the "BanFakeNews" and "Fake News" datasets. The results illustrate that the proposed method achieved an accuracy of 98.71% using "BanFakeNews" and 98.94% using "Fake News." We have utilized various performance evaluation parameters like F1-score, precision, recall, AUC, ROC, etc., for validating the results. Notwithstanding the excellent performance of our model, there is scope for improvement. The Bengali dataset that we have used is imbalanced. We plan to collect and include more Bengali fake news. We will also incorporate lexical features in future work.

REFERENCES

[1] Arne Roets et al. "Fake news": Incorrect, but hard to correct. The role of cognitive ability on the impact of false information on social impressions. Intelligence, 65:107–110, 2017.
[2] Nir Grinberg, Kenneth Joseph, Lisa Friedland, Briony Swire-Thompson, and David Lazer. Fake news on Twitter during the 2016 US presidential election. Science, 363(6425):374–378, 2019.
[3] Xinyi Zhou and Reza Zafarani. A survey of fake news: Fundamental theories, detection methods, and opportunities. ACM Computing Surveys (CSUR), 53(5):1–40, 2020.
[4] Supanya Aphiwongsophon and Prabhas Chongstitvatana. Detecting fake news with machine learning method. In 2018 15th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology (ECTI-CON), pages 528–531. IEEE, 2018.
[5] Muhammad Umer, Zainab Imtiaz, Saleem Ullah, Arif Mehmood, Gyu Sang Choi, and Byung-Won On. Fake news stance detection using deep learning architecture (CNN-LSTM). IEEE Access, 8:156695–156706, 2020.
[6] M. G. Hussain, M. Rashidul Hasan, M. Rahman, J. Protim, and S. Al Hasan. Detection of Bangla fake news using MNB and SVM classifier. In 2020 International Conference on Computing, Electronics Communications Engineering (iCCECE), 2020.
[7] Marina Danchovsky Ibrishimova and Kin Fun Li. A machine learning approach to fake news detection using knowledge verification and natural language processing. In International Conference on Intelligent Networking and Collaborative Systems, pages 223–234. Springer, 2019.
[8] Mykhailo Granik and Volodymyr Mesyura. Fake news detection using naive Bayes classifier. In 2017 IEEE First Ukraine Conference on Electrical and Computer Engineering (UKRCON), pages 900–903. IEEE, 2017.
[9] Pritika Bahad, Preeti Saxena, and Raj Kamal. Fake news detection using bi-directional LSTM-recurrent neural network. Procedia Computer Science, 165:74–82, 2019.
[10] Sherry Girgis, Eslam Amer, and Mahmoud Gadallah. Deep learning algorithms for detecting fake news in online text. In 2018 13th International Conference on Computer Engineering and Systems (ICCES), pages 93–97. IEEE, 2018.
[11] Rohit Kumar Kaliyar, Anurag Goswami, Pratik Narang, and Soumendu Sinha. FNDNet–a deep convolutional neural network for fake news detection. Cognitive Systems Research, 61:32–44, 2020.
[12] S. Sangamnerkar, R. Srinivasan, M. R. Christhuraj, and R. Sukumaran. An ensemble technique to detect fabricated news article using machine learning and natural language processing techniques. In 2020 International Conference for Emerging Technology (INCET), pages 1–7, June 2020.
[13] Saqib Hakak, Mamoun Alazab, Suleman Khan, Thippa Reddy Gadekallu, Praveen Kumar Reddy Maddikunta, and Wazir Zada Khan. An ensemble machine learning approach through effective feature extraction to classify fake news. Future Generation Computer Systems, 117:47–58, 2021.
[14] Arush Agarwal and Akhil Dixit. Fake news detection: An ensemble learning approach. In 2020 4th International Conference on Intelligent Computing and Control Systems (ICICCS), pages 1178–1183. IEEE, 2020.
[15] Muhammad Zubair Asghar, Ammara Habib, Anam Habib, Adil Khan, Rehman Ali, and Asad Khattak. Exploring deep neural networks for rumor detection. Journal of Ambient Intelligence and Humanized Computing, pages 1–19, 2019.
[16] Aman Agarwal, Mamta Mittal, Akshat Pathak, and Lalit Mohan Goyal. Fake news detection using a blend of neural networks: an application of deep learning. SN Computer Science, 1(3):1–9, 2020.
[17] Md Zobaer Hossain, Md Ashraful Rahman, Md Saiful Islam, and Sudipta Kar. BanFakeNews: A dataset for detecting fake news in Bangla. arXiv preprint arXiv:2004.08789, 2020.
[18] Belhakimi Mohamed Amine, Ahlem Drif, and Silvia Giordano. Merging deep learning model for fake news detection. In 2019 International Conference on Advanced Electrical Engineering (ICAEE), pages 1–4. IEEE, 2019.
