Government Engineering College Ramanagara
Doddamannina Gudde, Near Janapadaloka, B.M. Road, Ramanagara, Karnataka 562159
Final Year Internship Presentation
on
Fake News Detection Using Machine Learning
Under the Guidance of
Dr. CHETHAN K C
Assistant Professor, Dept. of CSE,
GEC, Ramanagara
Presented by
Geetha C
CONTENTS
INTRODUCTION ABOUT COMPANY
TECHNOLOGIES LEARNT
ABSTRACT
INTRODUCTION TO PROJECT
SYSTEM ARCHITECTURE
METHODOLOGY
CONCLUSION
COMPANY OVERVIEW
Company name – SYSLOG TECHNOLOGIES
Syslog Technologies is a fast-growing technology solutions and services provider.
Founded in 2005 by a team of technology professionals with venture capital backing,
Syslog Technologies has built a successful track record of delivering end-to-end
solutions to its customers across various industrial sectors.
Syslog Technologies has highly skilled and dedicated IT professionals who
provide customized IT solutions for several industries using their technical
expertise and experience.
Our vision is to provide quality services that exceed the expectations of our
esteemed customers.
Our mission is to build long-term relationships with our customers and clients
and provide exceptional customer service by pursuing business through
innovation and advanced technology.
TECHNOLOGIES LEARNT
PYTHON
Python is an easy to learn, powerful programming language. It has efficient high-level data
structures and a simple but effective approach to object-oriented programming. Python’s elegant
syntax and dynamic typing, together with its interpreted nature, make it an ideal language for
scripting and rapid application development in many areas on most platforms.
OPENCV
OpenCV (Open Source Computer Vision Library) is a common platform and set of programming
functions for real-time computer vision applications. The library contains more than
500 optimized algorithms and is used around the world, with about forty thousand people in its user
group. It is written mainly in C and C++, which makes it portable to certain
platforms such as digital signal processors. More recently, Python bindings
have been developed to encourage adoption by a wider audience.
ANACONDA
Anaconda is a distribution of the Python and R programming
languages for scientific computing (data science, machine learning applications, large-scale data
processing, predictive analytics, etc.), that aims to simplify package management and deployment. The
distribution includes data-science packages suitable for Windows, Linux, and macOS.
ANACONDA NAVIGATOR
Anaconda Navigator is a desktop graphical user interface (GUI) included in the Anaconda
distribution that allows users to launch applications and manage conda packages, environments and
channels without using command-line commands.
The following applications are available by default in Navigator:
JupyterLab
Jupyter Notebook
Spyder
Glue
Orange
RStudio
Visual Studio Code
JUPYTER NOTEBOOK
Project Jupyter is a project to develop open-source software, open standards, and services for interactive
computing across multiple programming languages.
Jupyter has developed and supported the interactive computing products Jupyter Notebook, JupyterHub, and
JupyterLab. Jupyter is financially sponsored by NumFOCUS.
SPYDER
Spyder is an open-source cross-platform integrated development environment (IDE) for scientific
programming in the Python language. Spyder integrates with a number of prominent packages in the scientific
Python stack, including NumPy, Matplotlib, pandas, IPython, SymPy and Cython, as well as other open-source
software.
ABSTRACT
This project applies NLP (Natural Language Processing) techniques to detect
‘fake news’, that is, misleading news stories that come from non-reputable
sources.
Is it possible to build a model that can differentiate between “Real” news and
“Fake” news? The proposed work assembles a dataset of both fake and
real news in order to create a model that classifies an article as fake or real
based on its words and phrases.
INTRODUCTION
Fake news spreads like wildfire, and this is a big issue in this era.
These days fake news takes many forms, from sarcastic articles to
fabricated news and planned government propaganda in some outlets.
Fake news and lack of trust in the media are growing problems with huge
ramifications in our society.
This project seeks to produce a model that can accurately predict the likelihood that a
given article is fake news.
The model is trained and tested on labeled data: supervised learning means that
each article in the dataset carries a label (fake or real).
PROPOSED SYSTEM
1) The model is built on either a count vectorizer or a TF-IDF matrix (i.e., word tallies
weighted relative to how often the words are used in other articles in the dataset).
2) The main decisions in developing the model were the text transformation (count
vectorizer vs. TF-IDF vectorizer) and which type of text to use (headlines vs.
full text).
3) The output is clear and understandable to the user. The system aims to give
accurate predictions, to be user friendly, and to respond quickly.
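The difference between the two text transformations can be sketched in plain Python. This is a minimal illustration, not the project's code: the three-sentence corpus and the helper names count_vector and tfidf_vector are made up, and the IDF formula is the plain textbook one (library vectorizers usually add smoothing and normalization).

```python
import math
from collections import Counter

# Toy corpus, made up for illustration (not the project's dataset).
docs = [
    "the president signed the bill",
    "aliens signed the secret bill",
    "the market closed higher",
]

def count_vector(doc):
    """Raw word tallies for one document (what a count vectorizer stores)."""
    return Counter(doc.split())

def tfidf_vector(doc, corpus):
    """Weight each tally by how rare the word is across the corpus."""
    n_docs = len(corpus)
    weights = {}
    for word, tf in count_vector(doc).items():
        # Number of documents that contain the word at least once.
        df = sum(1 for d in corpus if word in d.split())
        # Textbook IDF: a word present in every document gets weight 0.
        idf = math.log(n_docs / df)
        weights[word] = tf * idf
    return weights

counts = count_vector(docs[0])
weights = tfidf_vector(docs[0], docs)
for word in ("the", "signed", "president"):
    print(word, counts[word], round(weights[word], 3))
```

In this toy corpus "the" has the highest raw count but a TF-IDF weight of zero because it occurs in every document, while "president" occurs once and only in one document, so it receives the largest weight.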
SYSTEM ARCHITECTURE
Step 1: Read the dataset.
Step 2: Random sampling is done on the dataset to make it balanced.
Step 3: Divide the dataset into two parts, i.e., a train dataset and a test dataset.
Step 4: Feature selection is applied for the proposed models.
Step 5: Accuracy and other performance metrics are calculated to compare the
efficiency of the different algorithms.
Step 6: The best algorithm for the given dataset is then selected based on efficiency.
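Steps 2, 3, 5 and 6 can be sketched with the Python standard library as follows. Everything here is an assumption for illustration: the rows are synthetic placeholders for the dataset read in Step 1, balancing is done by random undersampling, the 25% test fraction is arbitrary, and the two "models" are trivial stand-ins for trained classifiers. Step 4 (feature selection) is omitted.

```python
import random
from collections import Counter

# Synthetic labeled rows (text, label); placeholders for the real dataset.
rows = [(f"real article {i}", "REAL") for i in range(8)] + \
       [(f"fake article {i}", "FAKE") for i in range(4)]

def balance(rows, rng):
    """Step 2: randomly undersample every class to the smallest class size."""
    by_label = {}
    for row in rows:
        by_label.setdefault(row[1], []).append(row)
    n = min(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, n))
    return balanced

def split_train_test(rows, test_fraction, rng):
    """Step 3: shuffle a copy, then cut it into train and test parts."""
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def accuracy(y_true, y_pred):
    """Step 5: fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

rng = random.Random(42)  # fixed seed so the run is repeatable
balanced = balance(rows, rng)
train, test = split_train_test(balanced, test_fraction=0.25, rng=rng)

# Hypothetical stand-in "models"; real candidates would be fitted on `train`.
models = {
    "always_fake": lambda text: "FAKE",
    "keyword_rule": lambda text: "FAKE" if "fake" in text else "REAL",
}
y_true = [label for _, label in test]
scores = {name: accuracy(y_true, [predict(text) for text, _ in test])
          for name, predict in models.items()}
best = max(scores, key=scores.get)  # Step 6: keep the most accurate candidate
print(Counter(label for _, label in balanced), len(train), len(test))
print(scores, best)
```

Undersampling discards rows from the larger class; oversampling the smaller class is the usual alternative when the dataset is small.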
METHODOLOGY
The proposed approach uses machine learning algorithms to classify news as
fake or real.
Each model is trained multiple times with a set of different parameters using a grid
search to optimize the model for the best outcome.
The fake news detection problem can thus be addressed with machine learning methods.
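The grid search idea can be sketched in plain Python: every combination of parameter values is tried and the best-scoring one is kept. The classifier below is a hypothetical keyword rule, and the word list, validation rows and parameter grid are made up; the slides do not state which models or parameters were actually searched. In practice a library routine such as scikit-learn's GridSearchCV does the same enumeration and adds cross-validation.

```python
from itertools import product

# Hypothetical word list and validation rows, made up for illustration.
SENSATIONAL = {"shocking", "secret", "miracle", "exposed"}
validation = [
    ("SHOCKING secret cure exposed", "FAKE"),
    ("miracle diet secret doctors hate", "FAKE"),
    ("council approves road budget", "REAL"),
    ("secret ballot held in parliament", "REAL"),
]

def evaluate(min_hits, lowercase):
    """Accuracy of a toy rule: FAKE if at least min_hits sensational words."""
    correct = 0
    for text, label in validation:
        words = text.lower().split() if lowercase else text.split()
        hits = sum(1 for w in words if w in SENSATIONAL)
        predicted = "FAKE" if hits >= min_hits else "REAL"
        if predicted == label:
            correct += 1
    return correct / len(validation)

def grid_search(param_grid, evaluate):
    """Try every combination in param_grid; return the best (params, score)."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(**params)
        if score > best_score:  # ties keep the first combination found
            best_params, best_score = params, score
    return best_params, best_score

param_grid = {"min_hits": [1, 2, 3], "lowercase": [False, True]}
best_params, best_score = grid_search(param_grid, evaluate)
print(best_params, best_score)
```

The cost grows with the product of the grid sizes (here 3 x 2 = 6 evaluations), so grids are usually kept small or searched on a subset of the data first.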
REQUIREMENTS
SOFTWARE
Anaconda Navigator (as the hub for launching applications)
Microsoft Excel
HARDWARE
Processor: i3 or above
RAM: 4 GB or above
RESULTS AND DISCUSSION
CONCLUSION
The feasibility of the project is analyzed in this phase, and a business proposal is put
forth with a very general plan for the project and some cost estimates.
An innovative model for fake news detection using machine learning algorithms
has been presented.
The model takes news events as input and, based on Twitter reviews and
classification algorithms, predicts the percentage likelihood of the news being fake or real.
The feasibility analysis is to ensure that the proposed system is not a burden to the company.
THANK YOU