A Machine Learning Project Report
on
Fake News Prediction
Bachelor of Technology
in
Computer Science and Engineering (AI&ML)
by
M.SUMAIYA (22261A6638)
V.AARTHI (22261A6659)
2024 - 2025
TABLE OF CONTENTS
List of Figures
List of Tables
Abstract
1. Introduction
1.1 Motivation
1.5 Requirements Specification
2. Literature Survey
3. Methodology
3.1 Implementation
3.2 Project Architecture
4. Testing and Results
5. Conclusion and Future Work
5.1 Conclusion
6. Bibliography
LIST OF FIGURES
Figure 3.2.1 Activity Diagram
Figure 4.1 Classification Report and ROC of Logistic Regression
Figure 4.2 Classification Report of SVM
Figure 4.3 Classification Report of Decision Tree
Figure 4.4 Classification Report of Naïve Bayes
Figure 4.5 Classification Report of RBF
Figure 4.6 Classification Report and ROC of Random Forest
LIST OF TABLES
Table 2.1 Comparison of Literature Survey
ABSTRACT
This project aims to develop a machine learning system capable of classifying news articles as
either real or fake using textual data. The goal is to enhance information integrity in an era
where misinformation proliferates.
The project begins by importing the libraries needed for data manipulation and machine
learning model development. A significant preprocessing
step involves the creation of a 'content' column by combining the 'author' and 'title' of each
article, augmenting the feature set used for classification.
To prepare the text data for analysis, techniques such as stemming are employed, reducing
words to their root forms to maintain consistency in the dataset. All text is converted to
lowercase to eliminate case sensitivity issues. The textual data is then transformed into
numerical format using the Term Frequency-Inverse Document Frequency (TF-IDF) method,
which quantifies the importance of words in relation to the entire dataset.
The classification task is executed using a logistic regression model, which predicts the
authenticity of articles based on the computed features. The model demonstrates high efficacy,
achieving an accuracy score of 98% on the training data. This project underscores the effective
use of machine learning techniques in distinguishing between legitimate and misleading news
content, offering a potential tool for combating fake news on digital platforms.
1. INTRODUCTION
In recent years, the proliferation of misinformation and fake news has emerged as a significant
challenge in the digital age. With the rapid expansion of online platforms and social media,
individuals are often exposed to a vast array of news articles, making it increasingly difficult
to discern credible information from falsehoods. The consequences of spreading fake news
can be detrimental, leading to public confusion and misinformed decisions. As a response to
this pressing issue, the need for automated systems that can effectively classify news content
has become more critical than ever.
This project aims to build a machine learning system capable of accurately classifying news
articles as real or fake, leveraging textual data to facilitate its predictions. By utilizing
established natural language processing techniques, the system processes various textual
features, including the article's title, author, and content. The project incorporates a
comprehensive approach that includes data preprocessing steps such as stemming, word
normalization, and vectorization using the TF-IDF method. These steps ensure that the model
can interpret and analyze the text data effectively, paving the way for robust predictions.
To achieve the classification goal, we implement a logistic regression model, which is adept
at handling binary classification tasks. The model is trained on a dataset of labeled
news articles, allowing it to learn the underlying patterns that differentiate real news from fake
news. With a training accuracy score of 98%, the project demonstrates the potential of machine
learning in combating misinformation. This system not only showcases the capabilities of AI
in text classification but also serves as a valuable tool for users seeking to verify the
authenticity of news articles in an increasingly complex information landscape.
1.1 Motivation
In the digital age, access to information is unprecedented, but so is the proliferation of
misinformation and fake news. With social media and online platforms serving as primary
sources of news, the public is increasingly exposed to misleading and inaccurate information.
This not only distorts public perception but can also lead to serious societal consequences,
including a loss of trust in legitimate news sources, increased polarization, and public health
risks, particularly when false information is spread regarding critical issues such as health or
political events.
1.5 Requirements Specification
2. LITERATURE SURVEY
The literature on fake news detection highlights a growing interest in utilizing machine learning
and natural language processing (NLP) techniques to combat misinformation. Early studies
primarily focused on identifying the unique features of fake news articles compared to reliable
sources, such as linguistic cues, sentiment analysis, and credibility indicators. Researchers like
Lazer et al. (2018) emphasized the role of social media in amplifying false information,
prompting a wave of investigation into how algorithms could be employed to recognize and
stop the spread of fake news. This backdrop established a foundation for further exploration
into effective detection methodologies.
Recent advancements have led to a variety of machine learning models being applied to the
task of fake news detection. Techniques such as support vector machines, random forests, and
neural networks have shown promising results. For example, the work of Shang et al. (2020)
employed deep learning approaches to improve accuracy in classification tasks by leveraging
large datasets of news articles. Moreover, the integration of NLP techniques, like tokenization,
stemming, and the use of TF-IDF vectors, has significantly enhanced the feature extraction
process. These developments underscore the importance of sophisticated text processing
methodologies in building reliable classification systems.
The ongoing research in this field continues to evolve, with scholars exploring hybrid models
that combine multiple algorithms and approaches for greater accuracy. Recent studies have
introduced ensemble methods that aggregate the predictions of various classifiers to improve
performance. Furthermore, there is a growing focus on the ethical implications of automated
fake news detection, including bias in the training data and in the resulting algorithms.
This literature survey highlights the dynamic nature of fake news detection research and the
critical need for continuous innovation in machine learning techniques to adapt to emerging
trends in misinformation dissemination.
Table 2.1: Comparison of Literature Survey
3. METHODOLOGY
The implementation of the machine learning project for classifying news articles as real or fake
involves several key steps, ranging from data collection and preprocessing to model training
and evaluation. Below is a detailed breakdown of the implementation process.
3.1 IMPLEMENTATION
1. Data Collection
The first step in implementing the project is to gather a suitable dataset comprising labeled
news articles. The dataset should contain a diverse array of news articles classified as either
real or fake. Publicly available datasets, such as the "Fake News Dataset" from Kaggle or the
"LIAR dataset," can be used for this purpose. The data typically consists of columns for the
article's title, author, content, and label (real or fake).
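For illustration, a minimal loading step might look as follows; the file name train.csv is an
assumption, so adjust it to wherever the chosen dataset is saved:
python
import pandas as pd

# Load the labeled dataset (file name is illustrative)
df = pd.read_csv('train.csv')
print(df.shape)
print(df.columns.tolist())  # e.g. ['id', 'title', 'author', 'text', 'label']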
2. Data Preprocessing
Once the data is collected, preprocessing is essential to prepare it for machine learning:
Combining Features: Create a new column `content` by combining the `author` and `title`
columns. This new column serves as the main input for classification.
python
# Combine author and title; fill missing values so concatenation does not yield NaN
df['content'] = df['author'].fillna('') + ' ' + df['title'].fillna('')
Text Normalization: Convert all text to lowercase to maintain consistency and facilitate
analysis.
python
# Lowercase all text so matching is case-insensitive
df['content'] = df['content'].str.lower()
Stemming: Apply stemming to reduce words to their root form, which helps standardize the
text. This can be done with libraries such as NLTK or spaCy.
python
from nltk.stem import PorterStemmer

stemmer = PorterStemmer()
# Stem every whitespace-separated token in the combined text
df['content'] = df['content'].apply(
    lambda x: ' '.join(stemmer.stem(word) for word in x.split())
)
Vectorization: Use the Term Frequency-Inverse Document Frequency (TF-IDF) vectorization
to convert the text data into numerical representations that can be fed into a machine learning
model.
python
from sklearn.feature_extraction.text import TfidfVectorizer

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(df['content'])  # sparse TF-IDF feature matrix
y = df['label']  # assuming 'label' encodes real/fake (e.g. 0 = real, 1 = fake)
3. Model Training
With the preprocessed data in hand, the next step is to train a machine learning model. A logistic
regression model can be selected for its ease of implementation and effectiveness in binary
classification tasks.
Splitting the Data: Split the dataset into training and testing sets to evaluate model
performance.
python
from sklearn.model_selection import train_test_split

# Hold out 20% of the data as an unseen test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
Training the Model: Fit the logistic regression model on the training data.
python
from sklearn.linear_model import LogisticRegression

# Fit a logistic regression classifier on the TF-IDF features
model = LogisticRegression()
model.fit(X_train, y_train)
4. Model Evaluation
After training the model, it's important to evaluate its performance using the test set.
Predictions and Accuracy: Use the model to predict labels for the test set and calculate the
accuracy.
python
from sklearn.metrics import accuracy_score

# Predict on the held-out test set and compute accuracy
predictions = model.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print(f'Accuracy: {accuracy * 100:.2f}%')
5. Classification Models
The following classifiers were evaluated (a combined training sketch appears after this list):
Logistic Regression: The model passes a linear combination of the features through a sigmoid
function, σ(z) = 1 / (1 + e^(-z)), producing a probability that is thresholded (typically at
0.5) into a binary outcome (0 or 1).
Support Vector Machine: SVM is a powerful machine learning algorithm for classification
tasks that aims to find the optimal hyperplane separating the classes, in this case real and
fake news articles.
Decision Tree: A decision tree is a flowchart-like structure that classifies or predicts
outcomes through a sequence of feature-based splits.
Random Forest: Random Forest is an ensemble learning method that constructs multiple
decision trees during training and outputs the mode of their predicted classes for
classification tasks.
Naive Bayes: Naive Bayes is a simple and efficient probabilistic classifier based on Bayes'
theorem, under the assumption that features are independent.
Radial Basis Function (RBF): A kernel function used in algorithms such as Support Vector
Machines and RBF neural networks, which measures the similarity of two data points as a
function of the distance between them, K(x, x') = exp(-γ ||x - x'||²).
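As a sketch, the six classifiers above can be trained and compared on the TF-IDF features as
follows. This assumes X_train, X_test, y_train, y_test from the earlier steps; the
hyperparameters shown are illustrative defaults, not the report's tuned settings:
python
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score

models = {
    'Logistic Regression': LogisticRegression(max_iter=1000),
    'SVM (linear)': SVC(kernel='linear'),
    'SVM (RBF kernel)': SVC(kernel='rbf'),   # RBF-kernel variant described above
    'Decision Tree': DecisionTreeClassifier(random_state=42),
    'Random Forest': RandomForestClassifier(n_estimators=100, random_state=42),
    'Naive Bayes': MultinomialNB(),          # suited to nonnegative TF-IDF features
}

# Fit each model and report its test-set accuracy
for name, clf in models.items():
    clf.fit(X_train, y_train)
    acc = accuracy_score(y_test, clf.predict(X_test))
    print(f'{name}: {acc:.3f}')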
3.2 Project Architecture
UML Diagram
3.2.1 Activity Diagram
4. TESTING AND RESULTS
Testing is a crucial phase that determines the quality of the trained models and the
importance of the features under consideration. The algorithms used in this project
have been rigorously evaluated on several metrics, including accuracy, recall,
precision, F1 score, and the kappa statistic.
Accuracy - It measures how many observations, both positive and negative, were
correctly classified.
Recall - It measures how many of all truly positive observations were classified as
positive. For fake news detection, it tells us how many of the fake articles in the
test set the model actually flagged as fake. While optimizing recall, you want to make
sure the model identifies ALL of the fake articles.
Precision - It measures how many of the observations predicted as positive are in fact
positive. For fake news detection, it tells us what fraction of the articles flagged
as fake are actually fake. While optimizing precision, you want to make sure that the
articles the model flags as fake ARE ACTUALLY FAKE.
F1 score - It combines precision and recall into one metric: their harmonic mean. A
perfect F1 score is 1.0 (100%); the closer it is to 1.0, the better the model. It is
calculated as:
F1 = 2 × (Precision × Recall) / (Precision + Recall)
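As a brief sketch, all of these metrics can be computed with scikit-learn on the test-set
predictions from Section 3.1 (this assumes the labels are encoded as 0/1):
python
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, cohen_kappa_score)

# predictions = model.predict(X_test) from the evaluation step above
print(f"Accuracy : {accuracy_score(y_test, predictions):.3f}")
print(f"Precision: {precision_score(y_test, predictions):.3f}")
print(f"Recall   : {recall_score(y_test, predictions):.3f}")
print(f"F1 score : {f1_score(y_test, predictions):.3f}")
print(f"Kappa    : {cohen_kappa_score(y_test, predictions):.3f}")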
4.1 Model Performances
The classification reports and ROC curves for the individual models are shown in Figures
4.1 through 4.6.
4.1.1 Logistic Regression
4.1.2 Support Vector Machine
4.1.3 Decision Tree
4.1.4 Naive Bayes
4.1.5 RBF
4.1.6 Random Forest
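Since the figures are not reproduced inline here, a minimal sketch of how each per-model
report and ROC curve can be generated with scikit-learn follows (model stands for any fitted
classifier from the list above; this is an assumption, not the report's exact plotting code):
python
import matplotlib.pyplot as plt
from sklearn.metrics import classification_report, RocCurveDisplay

# Per-class precision, recall, F1, and support
print(classification_report(y_test, model.predict(X_test)))

# ROC curve computed from the fitted estimator's scores
RocCurveDisplay.from_estimator(model, X_test, y_test)
plt.show()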
4.2 Comparison of Models
From the results above, we observe that the Random Forest algorithm performs best, as is
evident from its high accuracy, precision, and F1 score.
5. CONCLUSION AND FUTURE WORK
5.1 Conclusion
In conclusion, the project successfully demonstrates the process of building a machine learning
system to classify news articles as real or fake using textual data. By employing various natural
language processing techniques such as stemming and the TF-IDF vectorization method, the
project effectively converts unstructured text into a structured format suitable for analysis. The
implementation of a logistic regression model yielded a high accuracy score of 98% on the
training data, indicating the model's capability to distinguish between genuine and misleading
news articles. This project highlights the importance of machine learning in combating
misinformation and the potential for automated systems to assist readers in evaluating the
credibility of news.
For future work, several avenues can be explored to enhance the performance and robustness
of the classification system:
1. Larger and Diverse Datasets: Expanding the dataset to include a broader range of news
topics, sources, and styles can improve the model's generalizability and ability to handle
different types of misinformation.
2. Real-time Detection: Developing a real-time news classification system that can analyze and
label articles as they are published would be a valuable tool for combating fake news on social
media platforms.
BIBLIOGRAPHY