Fake News Detection Using Python and Machine Learning

This document summarizes a paper on detecting fake news using machine learning techniques. It discusses how social media is increasingly being used to spread misinformation. The paper proposes using an ensemble machine learning method to automatically classify news articles as real or fake. It aims to help users verify the reliability of news sources. Keywords mentioned are internet, social media, and fake news. The document discusses using a naive Bayes classifier and analyzing word counts and frequencies to detect fake news. It presents a three-part methodology using a machine learning classifier, checking external sources to verify claims, and authenticating the source of shared URLs.


FAKE NEWS DETECTION USING PYTHON AND MACHINE LEARNING

HARINI. T, ARUNA. S, LAVANYA. K

(Third-year B.Tech (IT) students, PANIMALAR ENGINEERING COLLEGE, CHENNAI)

ABSTRACT

Fake news detection on social media is a new and rapidly developing field. Society is now
significantly influenced by news on social media, as evidenced by the number of people using
Facebook, Twitter and other social media platforms. People use apps like WhatsApp to share the most
recent news, whether it is true or false. Thanks to the widespread use of social media platforms,
consumers produce and share more information than ever, much of which is false and has no bearing
on reality. This paper proposes classifying news articles automatically using an ensemble machine
learning method. It aims to give users the ability to judge whether a news item is accurate and to
verify the reliability of the website that posts it.

KEYWORDS: Internet, Social Media, Fake news.

I. INTRODUCTION

More and more people are choosing to search for and consume news from social media rather than conventional news sources, as we spend an increasing amount of our lives interacting online through social media platforms. Social media websites have made it simpler for consumers to get the most recent news at their fingertips. These platforms are, however, also used negatively to slant perceptions and manipulate attitudes. This phenomenon is frequently referred to as fake news. It is simple to share and debate news with friends on social media, and it is frequently less costly and more timely to consume news there.

For instance, 62% of American adults over the age of 18 accessed news online in 2016, while in 2012 only 49% said they viewed news on the internet. However, because online news is cheap to publish and spreads through social media much more quickly and easily, the most commonly disseminated false information during the 2016 United States presidential election was even more prevalent on Facebook than the most widely circulated genuine conventional news. Fake news purposefully leads consumers to believe inaccurate or biased information, and it changes how people view and react to real news. This research aims to compare the effectiveness of various algorithms at identifying fake news.

There are two different types of algorithms: the first type uses a manually labelled news dataset, and the second type uses AI in conjunction with a manually labelled dataset to identify fake news. The two articles on related subjects differ in that the latter used Logistic Regression specifically for the purpose of identifying fake news, and in that a current data collection was used to evaluate the resulting algorithm, which makes it possible to evaluate its effectiveness on recent data.

II. LITERATURE SURVEY

This study examines the various methods and systems that have been employed in the past to identify fake news. This paper's main goal is to observe and identify the most effective and objective solutions to the given issue. Additionally, the survey below examines each methodology used in the literature discussed [3]. Fake news has complex root causes and is widely spread. Numerous strategies are available and have been adopted by both people and organisations.

Regardless of the approaches, tools and resources used, this process is more or less followed in the other surveyed literature. As a result, it can be seen that machine learning is a popular field for text analysis. A fake news detector is, informally, a data science model that can identify and categorise fake and real news based on the provided data [8]. Since the news detection problem is a binary classification problem, machine learning methods like logistic regression, Support Vector Machines (SVM), and Naive Bayes are used most frequently.

III. EXISTING SYSTEM

The classification of online reviews and publicly accessible social media posts has been the focus of the majority of the research on machine learning algorithms for fraud detection. In the literature, the problem of spotting "fake news" has drawn a lot of attention, especially since late 2016, during the American presidential election.

Conroy et al. [1] describe a number of strategies for accurately classifying deceptive articles. They point out that shallow part-of-speech (POS) tagging and simple content-related n-grams have typically been insufficient for the classification task because they fail to take important context information into account.

Feng, Banerjee et al. [2] succeed in deception-related classification tasks on online review corpora, achieving 85%-91% accuracy. These methods have only proven successful when combined with more advanced analytical methods that use deep syntax analysis [5].

Top 5 Untrustworthy News Sources        Top 5 Trustworthy News Sources
Before It's News        2068            Wall Street Journal     3898
Zero Hedge              146             New York Times          836
Guardian                90              USA Today               824
Washington Examiner     79              Washington Post         823

IV. PROPOSED SYSTEM

It may be useful to use a tf-idf matrix, that is, word counts weighted by how frequently the words appear in other articles in the given dataset. This work develops a model using the count vectorizer. A Naive Bayes classifier is a suitable choice because it is common in text-based processing and this problem involves text classification. The real decision is which type of text transformation to use (count vectorizer vs tf-idf vectorizer) and which text to use (headlines vs full text). The next step is to extract the best features for the count vectorizer or tf-idf vectorizer. To do this, a large number of the most widely used words and/or phrases are used, regardless of capitalisation, and most stop words are removed. In addition, Power BI is used to visualise the dataset graphically.

V. NAIVE BAYES CLASSIFIER AND ITS USES

Naive Bayes classifiers are a family of simple machine learning models used in artificial intelligence. The well-known Naive Bayes approach employs multinomial NB and pipelining concepts to assess the accuracy and veracity of news. Naive Bayes is not a single algorithm: there are several algorithms for training these classifiers, all based on a common principle. Naive Bayes can be used to determine whether a news item is authentic or fake.
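
The choice between raw counts and tf-idf weights discussed in Section IV can be illustrated with a small sketch. This is not the authors' code: the corpus, the stop-word list and the function names are hypothetical, and only the Python standard library is used so that the arithmetic is visible. The smoothed idf shown here, ln((1 + n) / (1 + df)) + 1, is one common variant; the final vector normalisation that library implementations often apply is left out.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus and stop-word list, chosen only for illustration.
DOCS = [
    "Scientists confirm the new vaccine is safe",
    "Shocking secret the government hides about the vaccine",
    "The government confirms new budget",
]
STOP_WORDS = {"the", "is", "about", "a", "an", "of"}


def tokenize(text):
    """Lowercase the text (so capitalisation is ignored) and drop stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def count_vectors(docs):
    """Raw term counts per document (the count-vectorizer view)."""
    return [Counter(tokenize(d)) for d in docs]


def tfidf_vectors(docs):
    """Term counts scaled by a smoothed inverse document frequency."""
    counts = count_vectors(docs)
    n_docs = len(docs)
    # Number of documents in which each term appears.
    doc_freq = Counter(term for c in counts for term in c)
    # Smoothed idf, so that no term receives a weight of zero.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}
    return [{t: tf * idf[t] for t, tf in c.items()} for c in counts]


counts = count_vectors(DOCS)
tfidf = tfidf_vectors(DOCS)

# In the second headline both words occur once, so their counts are equal,
# but "shocking" is rarer across the corpus and gets the larger tf-idf weight.
print(counts[1]["vaccine"], counts[1]["shocking"])
print(round(tfidf[1]["vaccine"], 3), round(tfidf[1]["shocking"], 3))
```

With raw counts, every word that occurs once looks equally important. With tf-idf, words that occur in many articles are down-weighted, which is the trade-off behind the choice of transformation.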

VI. NAIVE BAYESIAN FORMULA DETAILS

The Naive Bayes formula is as follows. Bayes classification uses the prior probability of an event and compares it with the current event. After the probability of each individual event has been calculated, the overall likelihood of the news given the dataset is determined. By calculating the overall likelihood, we can therefore obtain an approximate value and determine whether the news is accurate or not.

$$P(C \mid D) = \frac{P(D \mid C)\,P(C)}{P(D)}$$

This gives the probability of event C when event D is TRUE, where

$P(C)$ = PRIOR PROBABILITY

$P(C \mid D)$ = POSTERIOR PROBABILITY

FINDING PROBABILITY:

$$P(C \mid D_1) = P(C_1 \mid D_1)\,P(C_2 \mid D_1)\,P(C_3 \mid D_1)$$

$$P(C \mid D_2) = P(C_1 \mid D_2)\,P(C_2 \mid D_2)\,P(C_3 \mid D_2)$$

If a probability is 0, the word probability is instead computed as

$$P(\text{Word}) = \frac{\text{Word count} + 1}{\text{total number of words} + \text{No. of unique words}}$$

Consequently, one can determine the news accuracy by applying this method.

VII. METHODOLOGY

This paper explains a three-part method. The first component is static and uses a machine learning classifier. We studied the model and trained it with four different classifiers before selecting the best one for final use. The second component is dynamic and uses the user's keyword or text to search online for information about the likelihood that the news is true. The third component verifies the authenticity of the URL that the user enters.
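
The classifier used by the static component, together with the smoothed word probability from Section VI, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the training headlines and all function names are hypothetical, and reading "total number of words" as the total within each class is an assumption. Log probabilities are summed instead of multiplying raw probabilities, which gives the same ranking and avoids numerical underflow; P(D) is omitted because it is the same for every class.

```python
import math
from collections import Counter

# Hypothetical labelled headlines, used only to illustrate the calculation.
TRAIN = [
    ("aliens endorse candidate in secret meeting", "fake"),
    ("miracle pill cures everything overnight", "fake"),
    ("parliament passes annual budget", "real"),
    ("central bank holds interest rates", "real"),
]


def train(samples):
    """Collect per-class word counts, class counts and the vocabulary."""
    word_counts = {}
    class_counts = Counter()
    vocab = set()
    for text, label in samples:
        words = text.lower().split()
        word_counts.setdefault(label, Counter()).update(words)
        class_counts[label] += 1
        vocab.update(words)
    return word_counts, class_counts, vocab


def word_prob(word, label, word_counts, vocab):
    """Smoothed estimate: (count + 1) / (total words in class + unique words)."""
    counts = word_counts[label]
    return (counts[word] + 1) / (sum(counts.values()) + len(vocab))


def predict(text, word_counts, class_counts, vocab):
    """Return the class with the highest log posterior, ignoring the constant P(D)."""
    total = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        # log P(C) plus the sum of log P(word | C) over the words of the text.
        score = math.log(class_counts[label] / total)
        for word in text.lower().split():
            score += math.log(word_prob(word, label, word_counts, vocab))
        scores[label] = score
    return max(scores, key=scores.get)


model = train(TRAIN)
print(predict("secret miracle pill", *model))
print(predict("bank passes budget", *model))
```

Because of the added 1 in the numerator, a word that never appeared in a class still receives a small non-zero probability, so a single unseen word cannot force the whole product to zero.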

A. System Design

B. System Architecture

i) Static search

The design of the static part of the fake news detection system is rather simple, and it follows the basic machine learning process flow. The system's configuration is self-explanatory and is given below. Most of the steps in the design are ...

ii) Dynamic search

The website's second search box asks for particular keywords to be entered for web searches and displays the probability that those phrases actually occur in an article, or in a similar article that makes use of those keywords.

iii) URL Search

The implementation takes a particular site name entered in the third search field of the website and looks it up in our database of real sites and our database of blacklisted sites. The domain names that routinely provide accurate and reliable news are kept in the real sites database, and vice versa. If the site isn't included in either database, then instead of categorising the domain the implementation merely states that the news aggregator doesn't exist.

VIII. RESULT:

A Python programming tool was used to interpret the results for specific data sets. Results are presented in various tables and histograms.

Table 1 Dataset evaluation result

Outcomes      Estimate
Accuracy      95.26814
Precision     95.79288
Recall        94.56869
F-measure     95.17685

The evaluation findings for a particular data set are displayed in Table 1. The accuracy of the model on the dataset was 95.26 percent, the precision was 95.79 percent, and the recall and F-score were, respectively, 94.56% and 95.17%.

Table 2 Predicted class results

Model                Prediction Yes    Prediction No
Real Class    Yes    296               17
              No     13                308

With this model, we have 296 true positives, 308 true negatives, 13 false positives, and 17 false negatives, as shown in Table 2 of the predicted class results [6].

Table 3 Actual news and Fake news

Table 3 shows the layout of the predictions for real and fake news in terms of True Positives (TP), False Negatives (FN), False Positives (FP) and True Negatives (TN) [6].

Model                Predicted Class
                     Yes     No
Real Class    Yes    TP      FN
              No     FP      TN

This diagram depicts the range of fake and real news detected in the given dataset.

IX. Conclusion:

The fake news detection system takes the user's input and classifies it as true or false. Various NLP and machine learning techniques should be applied for this purpose. A suitable dataset should be used to train the model, and several performance measures should be used for the performance evaluation. The best model, that is, the most accurate one, was used to classify headlines or news articles. From the static search, the best model obtained was Logistic regression, with 65% accuracy. With search parameter optimisation, Logistic regression performs better, achieving 75% accuracy. Thus, there is a 75% chance that a news article or headline that a user enters into our model will be classified according to its true nature. Users can look up news articles or keywords online and also verify the authenticity of websites. The accuracy of the dynamic system is 93% and gets better with each iteration.

References:

[1] N. J. Conroy, V. L. Rubin, and Y. Chen, “Automatic deception detection: Methods for finding fake news,” Proceedings of the Association for Information Science and Technology, vol. 52, no. 1, pp. 1–4, 2015.

[2] S. Feng, R. Banerjee, and Y. Choi, “Syntactic stylometry for deception detection,” in Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Short Papers - Volume 2, Association for Computational Linguistics, 2012, pp. 171–175.

[3] International Journal of Computer Science & Communication (ISSN: 0973-7391), Volume 12, Issue 2, pp. 38-44, April 2021 - Sept 2021. www.csjournals.com

[4] International Journal of Recent Technology and Engineering (IJRTE), ISSN: 2277-3878, Volume-8, Issue-IC2, May 2019.

[5] “Fake news detection using machine learning,” Pantech Solutions.

[6] “Fake and Real News detection Using Python,” International Journal of Scientific Research in Science and Technology, June 2020. DOI: 10.32628/IJSRST207376

[7] https://medium.com/swlh/fake-news-detection-using-machine-learning-69ff9050351f

[8] “Detecting Fake News using Machine Learning: A Systematic Literature Review.”

[9] P. Yogendra Prasad (Assistant Professor, Dept. of CSSE, Sree Vidyanikethan Engineering College, Tirupati), G. Nagalakshmi (Assistant Professor, Dept. of Computer Science, National Sanskrit University, Tirupati), and P. Siva Kumar (Applications Lead, Oracle Corporation, Bangalore), “Fake News Detection Using Machine Learning.”

[10] I. Ahmad, M. Yousaf, S. Yousaf, and M. O. Ahmad, “Fake News Detection Using Machine Learning Ensemble Methods,” Complexity, vol. 2020, Article ID 8885861, 11 pages, 2020. https://doi.org/10.1155/2020/8885861

[11] N. Smitha and R. Bharath, “Performance comparison of machine learning classifiers for fake news detection,” in 2020 Second International Conference on Inventive Research in Computing Applications (ICIRCA), pp. 696–700, IEEE, Coimbatore, India, 2020. https://doi.org/10.1109/ICIRCA48905.2020.9183072

[12] M. Granik and V. Mesyura, “Fake news detection using naive Bayes classifier,” in 2017 IEEE 1st Ukraine Conference on Electrical and Computer Engineering (UKRCON), pp. 900–903, 2017. https://doi.org/10.1109/UKRCON.2017.8100379
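
As a consistency check on Section VIII, the four values reported in Table 1 can be recomputed from the counts in Table 2. The sketch below assumes that the rows of Table 2 are the real classes and the columns the predicted classes, so that 296 is the number of true positives, 17 false negatives, 13 false positives and 308 true negatives; the function name is hypothetical. The standard definitions of accuracy, precision, recall and F-measure are used.

```python
def metrics(tp, fn, fp, tn):
    """Accuracy, precision, recall and F-measure as percentages."""
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # The F-measure is the harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall)
    values = {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }
    return {name: round(100 * value, 5) for name, value in values.items()}


# Counts read from Table 2 under the row/column assumption stated above.
result = metrics(tp=296, fn=17, fp=13, tn=308)
for name, value in result.items():
    print(f"{name}: {value}")
```

Under this reading, the recomputed values agree with the four entries of Table 1 to five decimal places, with the F-measure coming out at 95.17685.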
