
Volume 9, Issue 8, August – 2024 International Journal of Innovative Science and Research Technology

ISSN No:-2456-2165 https://doi.org/10.38124/ijisrt/IJISRT24AUG302

Counterfeit News Detection Using Machine Learning

SHANI P.R
Lecturer Department of Computer Science
FMKMCC Madikeri

Abstract:- The world is advancing rapidly. The digital world undoubtedly brings many advantages, but it also has its drawbacks. One of them is fake news, which anyone can easily spread. Fake news is false information written and published by dishonest people, usually to damage the reputation of an individual or an organization. Users are often unaware that the information they receive is misleading. Machine learning can classify whether a news item is genuine or misleading once a model has been trained. Fake news can be spread on many online platforms, including Twitter, Facebook, Instagram, and WhatsApp.

Machine learning (ML) is the branch of artificial intelligence that helps build systems that can learn and perform various tasks. Once trained, machine learning algorithms can detect fake news automatically. A variety of supervised machine learning algorithms are available, such as Decision Tree, Random Forest, Stochastic Gradient Descent, and K Nearest Neighbor. Machine learning algorithms are generally used for prediction or to detect something hidden.

General Terms:- Counterfeit, Algorithms, Datasets, Patterns, Graphs, Fake news, Real News

Keywords:- Machine Learning, Sentimental Analysis, Social Media, Decision Tree, Random Forest, Stochastic Gradient Descent, K Nearest Neighbor, Cross Validation.

I. INTRODUCTION

Modern life has become very convenient, and people have the enormous contribution of internet technology to communication and information sharing to thank for it. This is an advance in human history, but at the same time it has blurred the line between true media and maliciously fabricated media. Today anyone can publish content, credible or not, that can be consumed over the Internet. Unfortunately, fake news attracts a great deal of attention on the web, particularly on social media.

The world is changing quickly. The digital world has many benefits, but it also has disadvantages. One of the problems of this modern world is fake news. Anyone can easily spread fake news, typically to harm the reputation of an individual or an organization. Fake news is false information written and published by untrustworthy people, and users are often unaware that the information they received is misleading. With machine learning, a model can be trained to classify whether a news item is true or misleading. Fake news can be spread on many online platforms, including Twitter, Facebook, Instagram, and WhatsApp.

Machine learning is the branch of artificial intelligence that helps create systems that can learn and perform various tasks. Machine learning algorithms detect fake news automatically once they have been trained. A variety of supervised machine learning algorithms are available, such as Decision Tree, Random Forest, Stochastic Gradient Descent, and K Nearest Neighbor. Most often, machine learning algorithms are used for prediction or to detect something hidden.

Online platforms are helpful to users because they make news easy to access. The problem is that this also gives cyber criminals the opportunity to spread fake news through these platforms. Such news can prove harmful to an individual or to society. Detecting fake news is a major challenge because it is not an easy task. If fake news is not detected early, people can pass it on to others, and everyone will start to believe it. Individuals, organizations, entertainers, or political parties can be affected by fake news. People's opinions and decisions are influenced by fake news.

IJISRT24AUG302 www.ijisrt.com 524


II. SYSTEM ARCHITECTURE

Fig 1: System Architecture

➢ Data Set:

The dataset used for this project, news.csv, is an extensive collection of news articles obtained from Kaggle. It combines two separate datasets: one containing real news and the other fake news. The composition of the dataset is as follows:

▪ Dataset Composition
➢ Fake News Dataset:
• Entries: 23,481
• Label: 1 (indicating fake news)
➢ True News Dataset:
• Entries: 21,417
• Label: 0 (indicating true news)

The two datasets are merged using pandas' built-in functionality. The dataset used for this machine learning project is news.csv. This dataset has a shape of 7796 × 4. The first column identifies the news item, the second and third are the title and text, and the fourth column holds labels denoting whether the news is real or fake. The dataset occupies 29.2 MB.
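The merge described above could be sketched with pandas as follows. This is a minimal illustration, not the author's code: the tiny in-memory frames stand in for the Kaggle files, and the column names `title`, `text`, and `label` are assumptions. Only the label convention (1 = fake, 0 = true) comes from the text.

```python
import pandas as pd

# Tiny stand-ins for the two source datasets (assumed columns: title, text).
fake = pd.DataFrame({
    "title": ["Aliens endorse candidate", "Miracle cure found"],
    "text": ["Sources say aliens voted.", "Doctors hate this trick."],
})
true = pd.DataFrame({
    "title": ["Parliament passes budget"],
    "text": ["The budget was approved on Tuesday."],
})

# Label convention from the paper: 1 = fake news, 0 = true news.
fake["label"] = 1
true["label"] = 0

# Concatenate the two frames, then shuffle so the classes are interleaved.
news = pd.concat([fake, true], ignore_index=True)
news = news.sample(frac=1, random_state=42).reset_index(drop=True)

print(news.shape)                    # rows x columns
print(news["label"].value_counts())  # class balance
```

Shuffling after concatenation is a design choice in this sketch: without it, a later train/test split that does not shuffle would see all fake rows first.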


Fig 2 Number of fake and real articles

➢ Word Cloud
A word cloud is a data visualization technique for representing text data in which the size of each word indicates its frequency or importance. Significant textual data points can be highlighted using a word cloud. Word clouds are widely used for analyzing data from social network websites.

Fig 3 Word Cloud for Fake News


Fig 4 Word Cloud for Real News
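Since word size in a word cloud is driven by term frequency, the underlying computation can be sketched with the standard library alone. The stop-word list and the two sample sentences below are hypothetical; the resulting counts are what a rendering tool (for example the third-party `wordcloud` package) would turn into an image.

```python
import re
from collections import Counter

# Small hypothetical stop-word list; real projects use a fuller one.
STOP_WORDS = {"the", "a", "an", "is", "of", "to", "and", "in", "that"}

def word_frequencies(texts):
    """Count non-stop-word tokens across a list of documents."""
    counts = Counter()
    for text in texts:
        # Lowercase and keep alphabetic tokens only.
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts

docs = [
    "The president said the election is rigged",
    "Election officials said the count is accurate",
]
freqs = word_frequencies(docs)
print(freqs.most_common(3))  # the most frequent words would be drawn largest
```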

➢ Confusion Matrix
A confusion matrix is a table used to describe the performance of a classification algorithm. It visualizes and summarizes how the classifier's predictions compare with the actual labels. It consists of four basic values that are used to define the evaluation metrics of the classifier:
▪ TP (True Positive): The model predicted yes, and the actual value was also yes.
▪ TN (True Negative): The model predicted no, and the actual value was also no.
▪ FP (False Positive): The model predicted yes, but the actual value was no. It is also called a Type I error.
▪ FN (False Negative): The model predicted no, but the actual value was yes. It is also called a Type II error.

Fig 5 Confusion Matrix
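The four counts and the metrics derived from them can be illustrated with a short sketch. The label vectors are made up for illustration, and treating label 1 (fake) as the positive class is an assumption that follows the dataset's label convention.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, TN, FP, FN) for a binary classification task."""
    tp = tn = fp = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1
        elif predicted != positive and actual != positive:
            tn += 1
        elif predicted == positive and actual != positive:
            fp += 1  # Type I error
        else:
            fn += 1  # Type II error
    return tp, tn, fp, fn

# Hypothetical labels: 1 = fake, 0 = true.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(tp, tn, fp, fn)
print(accuracy, precision, recall, f1)
```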

➢ Algorithm
▪ Step 1: Extract data from the source.
▪ Step 2: Pre-process the text data.
• Testing data.
• Remove stop words.
• Normalize the text data to form a vector matrix using a TF-IDF or count vectorizer.
• Create the feature matrix.


▪ Step 3: Train/test split.

▪ Step 4: Train the classifier.
• Random Forest.
• Decision Tree.
• Stochastic Gradient Descent.
• K Nearest Neighbour.

▪ Step 5: Test the algorithms.

▪ Step 6: Evaluate the algorithms.
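The six steps can be sketched end to end with scikit-learn. This is an illustrative sketch under stated assumptions, not the paper's implementation: the eight-sentence corpus is invented, the split ratio and seeds are arbitrary, and a default Decision Tree stands in for whichever classifier is being trained.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Step 1: toy corpus standing in for the extracted data (1 = fake, 0 = true).
texts = [
    "Shocking secret cure doctors hide from public",
    "Celebrity clone spotted voting twice in election",
    "Aliens secretly control national weather service",
    "Miracle pill melts fat overnight say insiders",
    "Parliament approves annual budget after debate",
    "Central bank holds interest rates steady",
    "City council opens new public library branch",
    "Researchers publish study on regional rainfall",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Step 3: split the raw text (assumed 75/25, stratified by label).
X_train_txt, X_test_txt, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, random_state=42, stratify=labels
)

# Step 2: drop English stop words and build the TF-IDF feature matrix.
vectorizer = TfidfVectorizer(stop_words="english", lowercase=True)
X_train = vectorizer.fit_transform(X_train_txt)  # learn vocabulary on train only
X_test = vectorizer.transform(X_test_txt)        # reuse the same vocabulary

# Step 4: train a classifier on the feature matrix.
clf = DecisionTreeClassifier(random_state=0)
clf.fit(X_train, y_train)

# Steps 5 and 6: predict on held-out data and evaluate.
y_pred = clf.predict(X_test)
print("accuracy:", accuracy_score(y_test, y_pred))
```

The sketch splits before vectorizing, which differs from the step order listed above; fitting the vectorizer on the training portion only keeps test-set vocabulary from leaking into the features.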

➢ Decision Tree Classifier

This class is initialized with parameters such as criterion='entropy', max_depth=20, splitter='best', and random_state=101.

The results show that the DecisionTreeClassifier was able to classify the test set with 99.62% accuracy on this dataset.

Fig 6 Decision Tree Classifier
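A configuration with these parameters might look like the following in scikit-learn. The synthetic features from `make_classification` are an assumption standing in for the TF-IDF matrix, so the printed score says nothing about the accuracy reported above.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the vectorized news data (assumption).
X, y = make_classification(n_samples=200, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Parameters as listed in the text.
tree = DecisionTreeClassifier(
    criterion="entropy",  # split on information gain
    max_depth=20,         # cap tree depth to limit overfitting
    splitter="best",      # evaluate all candidate splits at each node
    random_state=101,
)
tree.fit(X_train, y_train)

tree_accuracy = tree.score(X_test, y_test)
print("test accuracy:", tree_accuracy)
```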

➢ Random Forest Classifier

This class is initialized with n_estimators=100, max_depth=8, random_state=42, verbose=1, and class_weight="balanced".

The results show that the Random Forest classifier was able to classify the test set with 0.98 accuracy on this dataset.

Fig 7 Random Forest Classifier
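The same settings can be sketched in scikit-learn as below. The imbalanced synthetic data is an assumption chosen to show why `class_weight="balanced"` can matter, and `verbose` is set to 0 here (the text uses 1) only to keep the output quiet; it affects logging, not the fitted model.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic, deliberately imbalanced stand-in data (assumption): ~80% class 0.
X, y = make_classification(
    n_samples=300, n_features=20, weights=[0.8, 0.2], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

forest = RandomForestClassifier(
    n_estimators=100,         # number of trees in the ensemble
    max_depth=8,              # depth limit for every tree
    random_state=42,
    verbose=0,                # the text uses verbose=1 (progress logging only)
    class_weight="balanced",  # reweight classes inversely to their frequency
)
forest.fit(X_train, y_train)

forest_accuracy = forest.score(X_test, y_test)
print("test accuracy:", forest_accuracy)
```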


➢ Stochastic Gradient Descent

This class is initialized with parameters such as loss='modified_huber', shuffle=True, and random_state=101. The results show that the SGDClassifier was able to classify the test set with 0.74 accuracy on this dataset.

Fig 8 Stochastic Gradient Descent
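A hedged sketch of this configuration with scikit-learn's `SGDClassifier` follows. The synthetic data is again an assumption in place of the TF-IDF features, so the score is illustrative only.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the vectorized news data (assumption).
X, y = make_classification(n_samples=300, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

sgd = SGDClassifier(
    loss="modified_huber",  # smoothed hinge loss, tolerant of outliers
    shuffle=True,           # reshuffle the training data after each epoch
    random_state=101,
)
sgd.fit(X_train, y_train)

sgd_accuracy = sgd.score(X_test, y_test)
# With modified_huber loss the classifier also exposes probability estimates.
proba = sgd.predict_proba(X_test)
print("test accuracy:", sgd_accuracy)
```

SGD-trained linear models are sensitive to feature scale; TF-IDF vectors are already normalized, but other feature types would usually be standardized first.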

➢ K-Nearest Neighbors Classifier

This class is initialized with one parameter, n_neighbors, which is the value of k.

The results show that the K-Neighbors classifier was able to classify the test set with 0.98 accuracy on this dataset.

Fig 9 K-Nearest Neighbors Classifier
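The text does not state which value of k was used, so the sketch below does not assume one. Instead it shows one common way to pick k, by comparing a few hypothetical candidates with 5-fold cross-validation on synthetic stand-in data.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the vectorized news data (assumption).
X, y = make_classification(n_samples=200, n_features=20, random_state=0)

# Hypothetical candidate values for k; the paper does not list any.
candidate_ks = [1, 3, 5, 7]
mean_scores = {}
for k in candidate_ks:
    knn = KNeighborsClassifier(n_neighbors=k)
    # Mean accuracy over 5 cross-validation folds.
    mean_scores[k] = cross_val_score(knn, X, y, cv=5).mean()

best_k = max(mean_scores, key=mean_scores.get)
print("mean CV accuracy per k:", mean_scores)
print("selected k:", best_k)
```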


➢ Machine Learning Model Comparison

Fig 10 Machine Learning Model Comparison

The classification report visualization displays a comparison of classification accuracy across the models.
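A comparison of this kind could be produced along the following lines. The data is synthetic, the K-Neighbors setting is left at the library default because the text gives no value, and the class names passed to the report are assumptions based on the dataset's label convention (0 = true, 1 = fake).

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the vectorized news data (assumption).
X, y = make_classification(n_samples=300, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

models = {
    "Decision Tree": DecisionTreeClassifier(
        criterion="entropy", max_depth=20, random_state=101),
    "Random Forest": RandomForestClassifier(
        n_estimators=100, max_depth=8, random_state=42,
        class_weight="balanced"),
    "SGD": SGDClassifier(loss="modified_huber", random_state=101),
    "K-Nearest Neighbors": KNeighborsClassifier(),  # default k (assumption)
}

accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    accuracies[name] = accuracy_score(y_test, y_pred)
    # Per-class precision, recall, F1, and support for this model.
    print(name)
    print(classification_report(y_test, y_pred, target_names=["true", "fake"]))

# Rank the models by test accuracy, best first.
for name, acc in sorted(accuracies.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.3f}")
```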

III. CONCLUSION

In this research, machine learning techniques were evaluated for their effectiveness in detecting fake news. Models were trained on news data to verify the system's effectiveness across different platforms, with the aim of reducing the prevalence of fake news. These models help distinguish real news from fake news. Twitter, Facebook, and Snapchat, for example, are popular platforms that feature news pages, and applying these models there could help users separate real news from fake news. The classification report visualization presents a comparison of classification accuracy along with precision, recall, F1, and support scores, together with a combined graphical representation of the confusion matrix for the model, in order to support easier interpretation and problem detection.