Project Presentation

The group presented their work on detecting fake news using machine learning techniques. They used the LIAR dataset, which contains over 12,000 labeled news statements. After preprocessing, including punctuation removal and tokenization, they used TF-IDF and BOW to vectorize the text. Various models were tested, including logistic regression, decision trees, random forest, AdaBoost, SVM and MLP. The best accuracy was 62.93%, obtained with random forest. The main limitation was the difficulty of the problem; future work could incorporate other media and expand to other languages.

Uploaded by

Robera Endale

ML PROJECT PRESENTATION
FAKE NEWS DETECTION USING ML TECHNIQUES
GROUP NO. 9

1
MOTIVATION
Fake news is a challenging problem today.
Social media websites are flooded with misinformation, which can prove fatal.
Twitter in particular struggles with the fake news problem.
However, fake news follows certain regular patterns.
Some individuals are more likely than others to spread fake news.
We can use machine learning to identify such patterns and try to predict fake news.

2
LITERATURE REVIEW
Researchers used ML models such as logistic regression, as well as deep learning models.
Researchers describe the uses of fake news detection.
They discuss vectorization techniques like TF-IDF and BOW to convert text to numeric values.
Researchers discussed the importance of addressing bias using lexical and sentiment analysis.
They experimented with several models such as SVM, Random Forest, etc.
3
DATASET USED
We used the LIAR dataset.
It contains statements with their speakers and the speakers' affiliations, along with labels indicating whether each statement is fake news or not.
The dataset contains 16 columns and 12788 rows.
Columns include the labels, statements, speakers, etc.

4
EDA SOME SCATTERPLOTS !!!

5
DATA PREPROCESSING
We clean the data by removing punctuation marks, extra white space, etc.
We tokenize the data using NLP techniques.
We use TF-IDF and BOW (bag of words) to vectorize the tokenized data into numeric form.
We use a word cloud to visualize the word frequencies.
We use a label encoder on the Political Party and Speaker columns.
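The cleaning, tokenization and label-encoding steps above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the group's actual code: the function names, the lowercasing step and the example rows are all assumptions made for the sketch.

```python
import re
import string

def clean_text(text):
    """Lowercase (an assumption), strip punctuation, collapse white space."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return re.sub(r"\s+", " ", text).strip()

def tokenize(text):
    """Split the cleaned text into word tokens on white space."""
    return clean_text(text).split()

def label_encode(values):
    """Map each distinct category to an integer, in sorted order."""
    mapping = {v: i for i, v in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping

# Made-up example rows, not taken from the dataset.
statements = ["Taxes went UP,  again!", "The budget is balanced."]
parties = ["republican", "democrat"]

tokens = [tokenize(s) for s in statements]
party_ids, party_mapping = label_encode(parties)
print(tokens)
print(party_ids, party_mapping)
```

The integer codes carry no ordering information; they only give each party or speaker a distinct numeric identifier that a model can consume.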

6
DATA PREPROCESSING - 2
We dropped 12 of the 16 columns.
The decision about which columns to keep was based on the correlation heatmap.
Columns with high correlation were dropped.
We take 4 partitions of the data: with and without party and speaker, each using TF-IDF and BOW.
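One way to picture the four partitions is as the cross product of two choices: the vectorizer, and whether the encoded party and speaker are appended to the text vector. The sketch below is only an illustration under that reading; the names `vectorizers`, `metadata_options` and `build_features` are hypothetical and do not come from the slides.

```python
from itertools import product

vectorizers = ["tfidf", "bow"]
metadata_options = [True, False]  # include encoded party and speaker, or not

def build_features(text_vector, party_id, speaker_id, with_metadata):
    """Append the encoded party and speaker to the text vector if requested."""
    if with_metadata:
        return list(text_vector) + [party_id, speaker_id]
    return list(text_vector)

# Every (vectorizer, with_metadata) combination is one partition.
partitions = list(product(vectorizers, metadata_options))
for vectorizer, with_metadata in partitions:
    print(vectorizer, "with party/speaker" if with_metadata else "text only")
```

Each model can then be trained once per partition, so that the effect of the vectorizer and of the metadata columns can be compared separately.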

7
MORE ON NLP !!!
We use TF-IDF and BOW in NLP
vectorisation.
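To make the difference between the two vectorizers concrete, here is a small standard-library sketch. BOW stores raw term counts; TF-IDF multiplies each count by an inverse document frequency, here the plain textbook form idf(t) = log(N / df(t)). Library implementations often use smoothed or normalized variants, and the slides do not say which one the group used, so the exact weighting below is an assumption.

```python
import math
from collections import Counter

def bag_of_words(docs):
    """Return the sorted vocabulary and one raw count vector per document."""
    vocab = sorted({tok for doc in docs for tok in doc})
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors

def tf_idf(docs):
    """Weight each count by idf(t) = log(N / df(t)), where df(t) is the
    number of documents that contain term t."""
    vocab, counts = bag_of_words(docs)
    n_docs = len(docs)
    df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) for d in df]
    return vocab, [[c * w for c, w in zip(row, idf)] for row in counts]

# Made-up tokenized statements.
docs = [["taxes", "went", "up"], ["taxes", "went", "down"]]
vocab, bow = bag_of_words(docs)
_, weighted = tf_idf(docs)
print(vocab)
print(bow)
print(weighted)
```

In this toy example the words shared by both statements get weight zero under TF-IDF, while the words that distinguish them ("up", "down") keep a positive weight.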

8
PERFORMING TSNE !!
Following is the result of TSNE on TF-IDF Vector

9
ML MODELS
1. We use Grid Search to find the best hyperparameters.
2. We report the accuracies and loss curves for various ML models like SVM, neural networks, etc.
3. The results are for TF-IDF in NLP without considering the party and speaker name.
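Grid search simply evaluates every combination of candidate hyperparameter values and keeps the best-scoring one. The sketch below shows the idea with the standard library only; the toy validation data, the threshold rule and the names `grid_search` and `validation_accuracy` are assumptions made for illustration. The slides do not say which implementation the group used (scikit-learn's `GridSearchCV`, for example, performs the same search with cross-validation).

```python
from itertools import product

def grid_search(param_grid, score_fn):
    """Try every combination in param_grid and return the best one.

    param_grid maps each hyperparameter name to a list of candidate values;
    score_fn takes a dict of hyperparameters and returns a validation score.
    """
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Toy validation data (made up): two features per row and a binary label.
X_val = [(0.2, 5.0), (0.4, 1.0), (0.6, 4.0), (0.9, 2.0)]
y_val = [0, 0, 1, 1]

def validation_accuracy(params):
    """Accuracy of a rule predicting 1 when the chosen feature >= threshold."""
    preds = [int(x[params["feature"]] >= params["threshold"]) for x in X_val]
    return sum(p == y for p, y in zip(preds, y_val)) / len(y_val)

grid = {"feature": [0, 1], "threshold": [0.5, 3.0]}
best_params, best_score = grid_search(grid, validation_accuracy)
print(best_params, best_score)
```

The cost grows with the product of the list lengths, so the grid is usually kept small for slow models such as SVM or MLP.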

10
GAUSSIAN NAIVE BAYES
The accuracy using Gaussian Naive Bayes is 58.09%

11
LOGISTIC REGRESSION
The accuracy using Logistic Regression is 61.74%

12
DECISION TREE
The accuracy using the Decision Tree is 56.91%

13
RANDOM FOREST
The accuracy using Random Forest is 59.57%

14
ADABOOST
The accuracy using Adaboost is 58.93%

15
SVM
The accuracy using SVM is 59.34%

16
MLP
The accuracy using MLP is 58.95%

17
MLP WITH PCA
The accuracy using MLP along with PCA is 57.70%

18
MLP WITH TSNE
The accuracy using MLP along with TSNE is 55.27%

19
RESULTS SUMMED UP !!
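The per-model accuracies reported on the preceding slides (TF-IDF, without party and speaker) can be collected and ranked with a few lines of Python. The numbers are copied from the slides; the variable names are only for this sketch.

```python
# Accuracies (%) as reported on the slides for TF-IDF features
# without the party and speaker columns.
reported = {
    "Gaussian Naive Bayes": 58.09,
    "Logistic Regression": 61.74,
    "Decision Tree": 56.91,
    "Random Forest": 59.57,
    "AdaBoost": 58.93,
    "SVM": 59.34,
    "MLP": 58.95,
    "MLP with PCA": 57.70,
    "MLP with TSNE": 55.27,
}

# Rank the models from highest to lowest accuracy.
ranked = sorted(reported.items(), key=lambda kv: kv[1], reverse=True)
for name, acc in ranked:
    print(f"{name:<22}{acc:6.2f}")

best_model, best_acc = ranked[0]
spread = round(best_acc - ranked[-1][1], 2)  # gap between best and worst
```

On this configuration logistic regression ranks first and all models fall within a few percentage points of each other; the overall best figure quoted in the conclusion comes from a different configuration (BOW with speaker and party).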

20
LIMITATIONS
The accuracies are close to 60%, which is not very good.
This is because it is impossible to solve this problem using standard ML and NLP techniques.
It is impossible to tell whether a piece of news is fake without knowing the ground truth at that time.

21
CONCLUSION
We can predict whether a given news statement is fake or not with an accuracy better than a random guess, i.e. 50%.
The best accuracy was 62.93%, obtained using Random Forest with BOW vectorisation, with speaker and party included, and Gini gain as the feature selection (split) criterion.
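For reference, the Gini criterion used by tree-based models can be written in a few lines. The Gini impurity of a set of labels is 1 minus the sum of squared class proportions, and the Gini gain of a split is the parent impurity minus the size-weighted impurity of the children. The sketch below uses made-up labels and hypothetical function names; it illustrates the standard definition, not the group's code.

```python
from collections import Counter

def gini_impurity(labels):
    """1 - sum of squared class proportions; 0 means a pure node."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def gini_gain(parent, left, right):
    """Impurity of the parent minus the weighted impurity of its children."""
    n = len(parent)
    weighted = (len(left) / n) * gini_impurity(left) \
             + (len(right) / n) * gini_impurity(right)
    return gini_impurity(parent) - weighted

# Made-up labels for four statements.
parent = ["fake", "fake", "real", "real"]
perfect_gain = gini_gain(parent, ["fake", "fake"], ["real", "real"])
useless_gain = gini_gain(parent, ["fake", "real"], ["fake", "real"])
print(perfect_gain, useless_gain)
```

A split that separates the classes completely gets the largest gain, while a split that leaves both children as mixed as the parent gets a gain of zero, so the tree prefers features that produce the former.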

22
FUTURE WORK
We can extend the scope by also incorporating visual and audio content in news articles.
We will try to incorporate languages other than English.
We can develop an interactive system where users give a news article as input and receive a credibility score for that article.

23
Timeline

Week 7: Learned about SVM. Implemented the SVM model in the code.
Week 8: Learned about MLP. Implemented MLP in code. Observed the accuracy of the model.
Week 9: Learned about TSNE and PCA. Observed the accuracy of the MLP with and without TSNE.
Week 10: Documented the complete project. Did a complete analysis of the performance of all models.
Week 11: Identified the future work. Identified the limitations.
24
Work Division

Anshak Goel: Handled the documentation part. Also did some data preprocessing.
Deeptorshi Mondal: Handled the NLP part. Also helped in NLP part.
Sahil Goyal: Handled the data preprocessing. Helped in implementing the ML models.
Vibhor Agarwal: Decided which ML models to use. Analyzed the accuracies of the models.
25
ANY
QUESTIONS ?

26
REFERENCES
https://arxiv.org/pdf/1705.00648.pdf
https://www.researchgate.net/publication/336436870_Fake_News_Detection_Using_Machine_Learning_approaches_A_systematic_Review
https://paperswithcode.com/paper/liar-liar-pants-on-fire-a-new-benchmark
https://github.com/manideep2510/siamese-BERT-fake-news-detection-LIAR
