Machine Learning: Fake News Blocking

Harshvardhan Singh
Department of Engineering and Technology,
SRM Institute of Science and Technology, Kattankulathur,
Kancheepuram Dist., India, 603203
e-mail: [email protected]

Abstract In this paper, applications of Natural Language Processing techniques are explored to identify when a news source may be producing fake news. To this end, a corpus of labelled real and fake news articles is used to build a classifier that can make decisions about information based on the content of the corpus. I use a text classification approach with four different classification models and analyse the results. The best performing model is the LSTM implementation. The model focuses on identifying fake news sources, based on multiple articles originating from a source. Once a source is labelled as a producer of fake news, it can be predicted with high confidence that any future articles from that source will also be fake news. Focusing on sources widens our article misclassification tolerance, because we then have multiple data points coming from each source.

Keywords Feature extraction, Pre-processing, Naive Bayes, Support Vector Machine, Feed-forward Neural Network, Long Short-Term Memory, Confusion matrix

Introduction Fake news, defined as a made-up story with an intention to deceive, has been widely cited as a contributing factor to the outcome of the 2016 United States presidential election. While Mark Zuckerberg, Facebook’s CEO, made a public statement denying that Facebook had an effect on the outcome of the election, Facebook and other online media outlets have begun to develop strategies for identifying fake news and mitigating its spread. Zuckerberg admitted that identifying fake news is difficult, writing, “This is an area where I believe we must proceed very carefully though. Identifying the truth is complicated.”

Fake news is increasingly becoming a menace to our society. It is typically generated for commercial interests, to attract viewers and collect advertising revenue. However, people and groups with potentially malicious agendas have been known to initiate fake news in order to influence events and policies around the world. It is also believed that the circulation of fake news had a material impact on the outcome of the 2016 US Presidential Election.


Data The datasets used for this project were drawn from Kaggle. The training dataset has about 16,600 rows of data from various articles on the internet. A considerable amount of pre-processing, as is evident from our source code, was needed before the models could be trained.

A full training dataset has the following attributes:

1. id: unique id for a news article
2. title: the title of a news article
3. author: author of the news article
4. text: the text of the article; incomplete in some cases
5. label: a label that marks the article as potentially unreliable
• 1: unreliable
• 0: reliable

Dataset Description In a dataset, a training set is used to build a model, while a test (or validation) set is used to validate the model built. Data points in the training set are excluded from the test (validation) set. Usually, a dataset is divided into a training set and a validation set (some people use ‘test set’ instead) in each iteration, or into a training set, a validation set and a test set in each iteration. The research dataset includes the following.

• data.csv: A full training dataset with the following attributes:
o id
o title
o author
o text
o label
• test.csv: A test dataset with all the same attributes as data.csv, without the label.

Fig. 1 First 5 records from the data frame
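As a rough illustration of the record layout and of the basic cleaning this paper applies before training (lowercasing, deleting special characters and punctuation, removing stop words), the sketch below reads rows carrying the five attributes and normalises the text field. The stop-word list, the two sample rows and the function names are made up for the example; the project's actual source code is not reproduced here.

```python
import csv
import io
import re

# Hypothetical, deliberately tiny stop-word list; a real run would use a fuller one.
STOP_WORDS = {"a", "an", "and", "in", "is", "of", "the", "to"}


def clean_text(text):
    """Lowercase, replace special characters and punctuation, drop stop words."""
    text = text.lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    return [word for word in text.split() if word not in STOP_WORDS]


def load_articles(handle):
    """Read rows that carry the id, title, author, text and label attributes."""
    articles = []
    for row in csv.DictReader(handle):
        articles.append({
            "id": int(row["id"]),
            "tokens": clean_text(row["text"]),
            "label": int(row["label"]),  # 1 = unreliable, 0 = reliable
        })
    return articles


# Two made-up rows standing in for data.csv.
SAMPLE = io.StringIO(
    "id,title,author,text,label\n"
    "0,Example A,Jane Doe,\"The Mayor opened a new bridge!\",0\n"
    "1,Example B,John Roe,\"SHOCKING: aliens in the city...\",1\n"
)
articles = load_articles(SAMPLE)
```

Passing an open file object for data.csv instead of `SAMPLE` would process the full training set in the same way.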


Feature extraction and Pre-processing The embeddings used for the majority of the modeling are generated using the Doc2Vec model. The goal is to produce a vector representation of each article. Before applying Doc2Vec, we perform some basic pre-processing of the data. This includes removing stop words, deleting special characters and punctuation, and converting all text to lowercase. This produces a comma-separated list of words, which can be input into the Doc2Vec algorithm to produce a 300-length embedding vector for each article.

Doc2Vec is a model developed in 2014 based on the existing Word2Vec model, which generates vector representations for words. Word2Vec represents documents by combining the vectors of the individual words, but in doing so it loses all word order information. Doc2Vec expands on Word2Vec by adding a “document vector” to the output representation, which contains some information about the document as a whole and allows the model to learn some information about word order. Preservation of word order information makes Doc2Vec useful for our application, as we are aiming to detect subtle differences between text documents.

Models The following learning algorithms are used in conjunction with the proposed methodology to evaluate the performance of fake news detection classifiers.

Naive Bayes In order to get a baseline accuracy rate for our data, I implemented a Naive Bayes classifier. Specifically, I used the scikit-learn implementation of Gaussian Naive Bayes. This is one of the simplest approaches to classification, in which a probabilistic approach is used, with the assumption that all features are conditionally independent given the class label. As with the other models, I used the Doc2Vec embeddings described above. The Naive Bayes rule is based on Bayes’ theorem:

$$P(c \mid x) = \frac{P(x \mid c)\, P(c)}{P(x)} \tag{1}$$

Above,

• P(c|x) is the posterior probability of class (c, target) given predictor (x, attributes).
• P(c) is the prior probability of class.
• P(x|c) is the likelihood, which is the probability of predictor given class.
• P(x) is the prior probability of predictor.

Parameter estimation for naive Bayes models uses the method of maximum likelihood. The advantage here is that it requires only a small amount of training data to estimate the parameters.

Let’s understand it using an example. Below I have a training data set of weather and the corresponding target variable ‘Play’ (suggesting possibilities of playing). Now, we need to classify whether players will play or not based on the weather condition. Let’s follow the steps below.

Step 1: Convert the data set into a frequency table.

Step 2: Create a likelihood table by finding the probabilities, e.g. Overcast probability = 0.29 and probability of playing = 0.64.
which a probabilistic approach is used,

Fig 2. Training data

Fig 3. Frequency Table

Fig 4. Likelihood Table

Step 3: Now, use the Naive Bayes equation to calculate the posterior probability for each class. The class with the highest posterior probability is the outcome of the prediction.

Naive Bayes uses this method to predict the probability of different classes based on various attributes. This algorithm is mostly used in text classification and in problems having multiple classes.
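The three steps can be traced in a few lines of Python. The training table from the figures is not reproduced in the text, so the 14 records below are an assumption, chosen only so that they agree with the two probabilities quoted in Step 2 (Overcast = 0.29, playing = 0.64). The function names are illustrative.

```python
from collections import Counter

# ASSUMED training data: 14 (weather, play) records consistent with the
# probabilities quoted in the text; the original table is not available.
RECORDS = (
    [("Sunny", "Yes")] * 3 + [("Sunny", "No")] * 2
    + [("Overcast", "Yes")] * 4
    + [("Rainy", "Yes")] * 2 + [("Rainy", "No")] * 3
)

# Step 1: frequency table of (weather, play) pairs and the marginal counts.
frequency = Counter(RECORDS)
weather_totals = Counter(weather for weather, _ in RECORDS)
play_totals = Counter(play for _, play in RECORDS)
n = len(RECORDS)


# Step 2: entries of the likelihood table.
def prior_weather(weather):
    return weather_totals[weather] / n


def prior_play(play):
    return play_totals[play] / n


def likelihood(weather, play):
    """P(weather | play)."""
    return frequency[(weather, play)] / play_totals[play]


# Step 3: posterior via Bayes' theorem, then pick the most probable class.
def posterior(play, weather):
    """P(play | weather) = P(weather | play) * P(play) / P(weather)."""
    return likelihood(weather, play) * prior_play(play) / prior_weather(weather)


def predict(weather):
    return max(play_totals, key=lambda play: posterior(play, weather))
```

With these assumed counts, `posterior("Yes", "Sunny")` evaluates to (3/9)(9/14)/(5/14) = 0.6, so the predicted class for a sunny day is 'Yes'.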
Support Vector Machine The original Support Vector Machine (SVM) was proposed by Vladimir N. Vapnik and Alexey Ya. Chervonenkis in 1963. That model can only perform linear classification, so it is unsuitable for most practical problems. Later, in 1992, Bernhard E. Boser, Isabelle M. Guyon and Vladimir N. Vapnik introduced the kernel trick, which enables the SVM to perform non-linear classification and makes it much more powerful. The objective of the support vector machine algorithm is to find a hyperplane in an N-dimensional space (N being the number of features) that distinctly classifies the data points.

Fig 5. Naive-bayes.py
Accuracy: 72.94%

Fig 6. Possible hyperplanes


To separate the two classes of data points, there are many possible hyperplanes that could be chosen. Our objective is to find the plane that has the maximum margin, i.e. the maximum distance between data points of both classes. Maximizing the margin distance provides some reinforcement so that future data points can be classified with more confidence.

Support vectors are the data points that lie closest to the hyperplane and influence its position and orientation. Using these support vectors, we maximize the margin of the classifier. Deleting the support vectors will change the position of the hyperplane. These are the points that help us build our SVM.

Fig 7. Support Vectors

We use the Radial Basis Function kernel in our project. The reason we use this kernel is that two Doc2Vec feature vectors will be close to each other if their corresponding documents are similar, so the distance computed by the kernel function should still represent the original distance. Since the Radial Basis Function is

$$K(x, x') = \exp\left(-\frac{\lVert x - x' \rVert^{2}}{2\sigma^{2}}\right) \tag{2}$$

it correctly represents the relationship we desire, and it is a common kernel for SVM.

We use the theory introduced in [ ] to implement the SVM. The main idea of the SVM is to separate different classes of data by the widest “street”. This goal can be represented as the optimization problem

[Eq. (3)]

Then we use the Lagrangian function to get rid of the constraints.

[Eq. (4)]

Finally, we solve this optimization problem using the convex optimization tools provided by the Python package CVXOPT.

[Eqs. (5), (6), (7)]
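The reasoning behind the kernel choice, that similar documents should map to a kernel value close to 1 and dissimilar ones to a value close to 0, can be checked numerically. The sketch below uses the standard Gaussian form of the Radial Basis Function kernel. The bandwidth `sigma`, the three toy vectors and the function names are assumptions for illustration and are not taken from SVM.py.

```python
import math


def rbf_kernel(x, y, sigma=1.0):
    """Radial Basis Function kernel: exp(-||x - y||^2 / (2 * sigma^2))."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


def gram_matrix(vectors, sigma=1.0):
    """Pairwise kernel values, the matrix a kernel SVM solver works with."""
    return [[rbf_kernel(u, v, sigma) for v in vectors] for u in vectors]


# Made-up 3-dimensional stand-ins for 300-dimensional Doc2Vec vectors.
doc_a = [0.10, 0.20, 0.30]
doc_b = [0.12, 0.18, 0.31]   # close to doc_a
doc_c = [0.90, -0.70, 0.05]  # far from doc_a

K = gram_matrix([doc_a, doc_b, doc_c])
```

Here `K[0][1]` (the two nearby documents) is larger than `K[0][2]`, and every diagonal entry is exactly 1, which is the distance-preserving behaviour the text appeals to.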
Fig 8. SVM.py
Accuracy: 88.42%

Fig 9 (i)(ii). neural-net-keras.py

Fig 10(i). neural-net-tf.py
Feed-forward Neural Network Here I implemented two feed-forward neural network models, one using TensorFlow and one using Keras. Neural networks are commonly used in modern NLP applications, in contrast to older approaches which primarily focused on linear models such as SVMs and logistic regression. Both neural network implementations use three hidden layers. In the TensorFlow implementation, all layers had 300 neurons each, and in the Keras implementation we used layers of size 256, 256, and 80, interspersed with dropout layers to avoid overfitting. For the activation function, we chose the Rectified Linear Unit (ReLU), which has been found to perform well in NLP applications.
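To make the layer structure concrete, the following dependency-free sketch runs one forward pass through a network with hidden layers of size 256, 256 and 80, ReLU activations and dropout between layers. It is not the Keras or TensorFlow code from the figures. The 300-dimensional input, the single sigmoid output unit, the dropout rate of 0.5 and the random initialisation are all assumptions made for the example.

```python
import math
import random

# Hidden sizes follow the text; input size 300 and output size 1 are assumptions.
LAYER_SIZES = [300, 256, 256, 80, 1]


def relu(values):
    return [max(0.0, v) for v in values]


def dropout(values, rate, rng, training):
    """Inverted dropout: zero units with probability `rate`, in training only."""
    if not training or rate == 0.0:
        return list(values)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]


def init_layers(sizes, rng):
    """One (weights, biases) pair per layer, with small random weights."""
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        scale = math.sqrt(2.0 / n_in)
        weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


def dense(inputs, weights, biases):
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def forward(embedding, layers, rng, rate=0.5, training=False):
    """Hidden layers apply ReLU then dropout; the output is a sigmoid probability."""
    activations = embedding
    for weights, biases in layers[:-1]:
        hidden = relu(dense(activations, weights, biases))
        activations = dropout(hidden, rate, rng, training)
    weights, biases = layers[-1]
    logit = dense(activations, weights, biases)[0]
    return 1.0 / (1.0 + math.exp(-logit))


rng = random.Random(0)
layers = init_layers(LAYER_SIZES, rng)
embedding = [rng.uniform(-1.0, 1.0) for _ in range(300)]  # stand-in for a Doc2Vec vector
probability = forward(embedding, layers, rng)
```

With `training=False` dropout is a no-op, which mirrors how dropout layers behave at prediction time. The weights here are untrained, so `probability` carries no meaning beyond showing the data flow.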


Fig 10(ii)(iii)(iv). neural-net-tf.py

Long Short-Term Memory The Long Short-Term Memory (LSTM) unit was proposed by Hochreiter and Schmidhuber. It is good at classifying serialized objects because it selectively memorizes previous inputs and uses them, together with the current input, to make a prediction. The news content (text) in our problem is inherently serialized: the order of the words carries important information about the sentence. The LSTM model is therefore well suited to our problem.

Since the order of the words is important for the LSTM unit, we cannot use Doc2Vec for pre-processing, because it converts the entire document into one vector and loses the order information. To prevent that, we use a word embedding instead. We first clean the text data by removing all characters that are neither letters nor numbers. Then we count the frequency of each word that appears in our training dataset to find the 5000 most common words and give each one a unique integer ID. For example, the most common word will have ID 0, the second most common one will have ID 1, and so on. After that we replace each common word with its assigned ID and delete all uncommon words.

Fig 11. Frequency of top common words

Fig 12. Length of the news

Notice that the 5000 most common words cover most of the text, as shown in Figure 11, so we lose only a little information while converting the string to a list of integers. Since the LSTM unit requires a fixed input vector length, we truncate lists longer than 500 numbers, because more than half of the news is longer than 500 words, as shown in Figure 12. Lists shorter than 500 numbers are padded with 0’s at the beginning. We also delete the data with only a few words, since they don’t carry enough information for training. By doing this, we convert the original text string to a fixed-length integer vector while preserving the word order information. Finally, we use a word embedding to map each word ID to a 32-dimensional vector.

The word embedding trains each word vector based on word similarity. If two words frequently appear together in the text, they are considered more similar and the distance between their corresponding vectors is small.

The pre-processing converts each news item from raw text into a fixed-size matrix. We then feed the processed training data into the LSTM unit to train the model. The LSTM is still a neural network but, unlike the fully connected neural network, it has cycles in the neuron connections. So the previous state (or memory) of the LSTM unit c_t will play a role in the new prediction h_t.
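The text-to-integer conversion described above (cleaning, ranking the most common words, mapping words to IDs, truncating and left-padding) can be sketched as follows. This is a minimal illustration, not the code from LSTM.py. The function names, the `first_id` option and the choice to keep the first 500 IDs when truncating are assumptions, since the text does not say which end of a long article is cut.

```python
import re
from collections import Counter


def tokenize(text):
    """Replace every character that is not a letter or digit, then split."""
    return re.sub(r"[^A-Za-z0-9]+", " ", text).split()


def build_vocabulary(documents, vocab_size=5000, first_id=0):
    """Map the `vocab_size` most frequent words to consecutive integer IDs.

    With first_id=0 the most common word gets ID 0, as in the text. That ID
    then coincides with the padding value 0; first_id=1 (a hypothetical
    option) keeps the two distinct.
    """
    counts = Counter(word for doc in documents for word in tokenize(doc))
    ranked = counts.most_common(vocab_size)
    return {word: first_id + rank for rank, (word, _) in enumerate(ranked)}


def encode(text, vocabulary, length=500, pad_id=0):
    """Drop uncommon words, truncate to `length`, left-pad with `pad_id`."""
    ids = [vocabulary[word] for word in tokenize(text) if word in vocabulary]
    ids = ids[:length]  # assumption: the beginning of the article is kept
    return [pad_id] * (length - len(ids)) + ids


# Tiny made-up corpus; the real setting uses vocab_size=5000 and length=500.
corpus = ["the cat sat on the mat", "the dog sat"]
vocab = build_vocabulary(corpus, vocab_size=2)
encoded = encode("the dog sat on the mat", vocab, length=5)
```

Each encoded list has the same length and keeps the surviving words in their original order, which is what the embedding layer and the LSTM unit need as input.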

Fig 13(i)(ii)(iii)(iv) LSTM.py

Accuracy- 94.53%
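The remark that the previous memory c_t feeds into the new prediction h_t can be made concrete with the update equations of a standard LSTM cell. The notation below (gates f_t, i_t, o_t, input x_t, weight matrices W and U, biases b) follows the common textbook formulation and is an assumption; the paper does not state which LSTM variant LSTM.py uses.

```latex
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) \\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```

The forget gate f_t decides how much of the old memory c_{t-1} is carried forward, and the output h_t is computed from the updated memory, which is the cycle in the connections that distinguishes the LSTM from a fully connected network.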

Confusion matrix A confusion matrix is a table that is often used to describe the performance of a classification model (or "classifier") on a set of test data for which the true values are known. The main diagonal of the matrix holds the correctly predicted results from the dataset, and the off-diagonal entries hold the incorrectly predicted results. For example,

Fig 14. Confusion matrix for a binary classifier

The four values in the confusion matrix are the following:

• true positives (TP): These are cases which were predicted as positive (and were actually positive).
• true negatives (TN): These are cases which were predicted as negative (and were actually negative).
• false positives (FP): These are cases which were predicted as positive (but were actually negative). (Also known as a "Type I error.")
• false negatives (FN): These are cases which were predicted as negative (but were actually positive). (Also known as a "Type II error.")

The following rates are often computed from a confusion matrix:

• Accuracy: Overall, how often is the classifier correct?
o (TP+TN)/total = (100+50)/165 = 0.91
• Misclassification Rate: Overall, how often is it wrong?
o (FP+FN)/total = (10+5)/165 = 0.09
o equivalent to 1 minus Accuracy
o also known as "Error Rate"
• True Positive Rate: When it's actually yes, how often does it predict yes?
o TP/actual yes = 100/105 = 0.95
o also known as "Sensitivity" or "Recall"
• False Positive Rate: When it's actually no, how often does it predict yes?
o FP/actual no = 10/60 = 0.17
• True Negative Rate: When it's actually no, how often does it predict no?
o TN/actual no = 50/60 = 0.83
o equivalent to 1 minus False Positive Rate
o also known as "Specificity"
• Precision: When it predicts yes, how often is it correct?
o TP/predicted yes = 100/110 = 0.91
• Prevalence: How often does the yes condition actually occur in our sample?
o actual yes/total = 105/165 = 0.64

Confusion matrices for our research models are as follows:

Fig 15. Confusion matrix for Naive Bayes
Accuracy- (1188+1803)/4153 = 72.94%
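The rates listed for the Fig 14 example follow directly from its four cell counts (TP = 100, TN = 50, FP = 10, FN = 5). The short sketch below recomputes them. The function name is illustrative, and the F1 score is an addition that is not in the list, included because it is derived from precision and recall.

```python
def confusion_rates(tp, tn, fp, fn):
    """Derive the usual summary rates from the four confusion-matrix cells."""
    total = tp + tn + fp + fn
    actual_yes = tp + fn
    actual_no = tn + fp
    predicted_yes = tp + fp
    precision = tp / predicted_yes
    recall = tp / actual_yes
    return {
        "accuracy": (tp + tn) / total,
        "misclassification_rate": (fp + fn) / total,
        "true_positive_rate": recall,
        "false_positive_rate": fp / actual_no,
        "true_negative_rate": tn / actual_no,
        "precision": precision,
        "prevalence": actual_yes / total,
        # Not in the list of rates: harmonic mean of precision and recall.
        "f1": 2 * precision * recall / (precision + recall),
    }


rates = confusion_rates(tp=100, tn=50, fp=10, fn=5)
for name, value in rates.items():
    print(f"{name}: {value:.2f}")
```

Rounded to two decimals, the printed values reproduce the figures quoted in the list, and the relations "1 minus Accuracy" and "1 minus False Positive Rate" hold by construction.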
Fig 16. Confusion matrix for SVM
Accuracy- (1693+1969)/4153 = 88.42%

Fig 17. Confusion matrix for Neural Network using TensorFlow
Accuracy- (1452+1947)/4153 = 81.42%

Fig 18. Confusion matrix for Neural Network using Keras
Accuracy- (1529+1540)/4153 = 92.62%

Fig 19. Confusion matrix for LSTM
Accuracy- (1982+1920)/4153 = 94.53%

Conclusion In this paper, various Natural Language Processing techniques used to detect whether a news item is fake or genuine are compared. The following results can be drawn from the models and conclude the research.

(i). A comparison of the models using their confusion matrices to calculate the Precision, Recall and F1 scores.

Fig 20. Model performance on the test set

(ii). Comparison of the accuracies of the models.

Fig 21. Accuracy table for the models, shows highest for LSTM

Acknowledgement This work benefitted from the invaluable guidance of Ms. C Fancy, who provided valuable feedback during the final drafting of the paper; her support is gratefully acknowledged.

Copyright protected @ ENGPAPER.COM and AUTHORS
https://www.engpaper.com
