Satya Final Minor Report

The document is a minor project report on building a fake news classifier using natural language processing, implemented with React JS. It discusses using the Python libraries Pandas, NumPy, Matplotlib, Scikit-learn and Keras for natural language processing and for building the classifier model, as well as using Google Colab and GPUs for model training. The methodology involves data preprocessing, generating a similarity matrix, and developing a text summarization method. The objective is to build a prototype fake news classifier model and develop a React app for its implementation.


MINOR PROJECT REPORT

Topic: Fake News Classifier using NLP and its implementation in React JS
Supervised By: Dr. Kirmender Singh

Department of Electronics and Communication


JAYPEE INSTITUTE OF INFORMATION
TECHNOLOGY, NOIDA

Examiner: Dr. Shamim Akhter

Submitted By: Satya Jeet Rout (17102003)

Table of Contents

Chapter no. Topics Page No.


Certificate …4

Acknowledgement …5

Abstract …6

Chapter-1 Introduction …7

1.1 Importance of NLP …7

1.2 Problem Statement …7

1.3 Problem Objective …7

Chapter-2 Overview …8

2.1 Methodology Used …9

Chapter-3 Introduction to Python libraries …10

3.1 Python libraries used …11

3.2 Briefing on the libraries used …11

3.2.1 Introduction to google colab and GPU …12

3.2.2 NLP Models Encoder Decoder …13


Chapter-4 Introduction to React …14

4.1 React libraries and features …14

4.2 Our approach for the App …15

4.3 Nodejs …15


Chapter-5 Snapshots of various project components …16

5.1 App screenshots …16

5.2 NLP Model screenshots …18

Chapter-6 Conclusion …21

6.1 Scope Of Future Work …22

References …23
CERTIFICATE

This is to certify that the work titled “Fake News Classifier” submitted by Satya Jeet Rout in
partial fulfilment for the award of degree of B. Tech of Jaypee Institute of Information Technology,
Noida has been carried out under my supervision. This work has not been submitted partially or
wholly to any other University or Institute for the award of this or any other degree or diploma.

Name of Supervisor - Dr. Kirmender Singh

Signature of Supervisor -
ACKNOWLEDGEMENT

I would like to express my deepest appreciation to all those who provided me the opportunity to complete this report. Special gratitude goes to our third-year project supervisors, Dr. Kirmender Singh and Dr. Shamim Akhter, whose stimulating suggestions and encouragement helped me to coordinate my project, especially in writing this report. Furthermore, I would also like to acknowledge with much appreciation the crucial role of the seniors and mentors. Last but not least, many thanks go to the head of the project, who has invested his full effort in guiding me towards achieving the goal.

Satya Jeet Rout(17102004)


ABSTRACT

The web and social media have made access to news far easier and more convenient. Mass media strongly influences the life of the general public, and, as frequently happens, a few individuals exploit this reach. This leads to the creation of news articles that are not entirely true, or even completely false. People intentionally spread these counterfeit articles with the help of social networking sites. The main goal of fake news websites is to influence public opinion on specific issues. Our aim is to find a reliable and accurate model that classifies a given news article as either fake or true.

Index Terms: Classification algorithm, Fake news detection, Machine learning, Natural language processing.
CHAPTER 1

INTRODUCTION

Natural language processing (NLP) is an area of computer science and artificial intelligence concerned with the interaction between computers and humans in natural language. The ultimate goal of NLP is to help computers understand language as well as we do. It is the driving force behind things like virtual assistants, speech recognition, sentiment analysis, fake news classification, machine translation and much more. In this report, we cover the basics of natural language processing, dive into some of its techniques and also see how NLP has benefited from recent advances in deep learning.

1.1 Need for NLP Fake News Classification

Media monitoring — The problem of information overload and “content shock” has been widely discussed. Automatic summarization presents an opportunity to condense the continuous torrent of information into smaller pieces.

Newsletters — Many newsletters take the form of an introduction followed by a curated selection of relevant topics. Summarization would allow organizations to further enhance newsletters with a stream of summaries.

Search marketing and SEO — When evaluating search queries for SEO, it is critical to have a well-rounded understanding of what your competitors are talking about in their content. This has become even more important as Google has updated its algorithm and shifted its focus towards topical authority.

1.2 PROBLEM STATEMENT


Build a prototype text summariser using NLP and deep learning techniques and implement it with a React Native app.

1.3 PROJECT OBJECTIVE


The objective of this project is to prepare a prototype text summariser model.
CHAPTER 2

Overview

Text summarization refers to the technique of shortening long pieces of text. The
intention is to create a coherent and fluent summary having only the main points
outlined in the document.

Automatic text summarization is a common problem in machine learning and natural language processing (NLP).

Machine learning models are usually trained to understand documents and distill the
useful information before outputting the required summarized texts.
2.1 Methodology Used

1. Importing all Libraries

2. Generate Clean Sentences

3. Similarity matrix

4. Generate summary method

Expected Result:-

To build a prototype Fake News Classifier and its React app based implementation
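The four methodology steps above (import libraries, generate clean sentences, build a similarity matrix, generate the summary) can be sketched as a simple extractive summariser. This is an illustrative sketch using only the standard library, not the project's actual code: sentences are tokenized, compared pairwise with cosine similarity over bag-of-words vectors, and the highest-scoring sentences are kept.

```python
import re
from collections import Counter
from math import sqrt

def clean_sentences(text):
    # Step 2: split into sentences and tokenize each one into lowercase words.
    sentences = [s for s in re.split(r'(?<=[.!?])\s+', text.strip()) if s]
    tokens = [re.findall(r'[a-z]+', s.lower()) for s in sentences]
    return tokens, sentences

def cosine(a, b):
    # Cosine similarity between two bag-of-words Counters.
    num = sum(a[w] * b[w] for w in set(a) & set(b))
    den = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

def summarize(text, top_n=2):
    tokens, sentences = clean_sentences(text)
    vecs = [Counter(t) for t in tokens]
    # Step 3: pairwise similarity matrix between all sentences.
    sim = [[cosine(a, b) for b in vecs] for a in vecs]
    # Step 4: score each sentence by its total similarity to the others,
    # then keep the top_n sentences in their original order.
    scores = [sum(row) - row[i] for i, row in enumerate(sim)]
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:top_n]
    return ' '.join(sentences[i] for i in sorted(ranked))
```

Calling `summarize` on a paragraph returns its most central sentences; a graph-based ranking such as TextRank would refine the scoring step, but the data flow is the same.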
CHAPTER 3

Introduction to Python libraries:-

We know that a module is a file with some Python code, and a package is a directory of sub-packages and modules. But the line between a package and a Python library is quite blurred.

A Python library is a reusable chunk of code that you may want to include in your programs or projects. Unlike in languages such as C or C++, a Python library is not tied to any specific context. Here, a ‘library’ loosely describes a collection of core modules; essentially, a library is a collection of modules. A package is a library that can be installed using a package manager such as pip.

The Python Standard Library is a collection of the exact syntax, tokens, and semantics of Python. It comes bundled with the core Python distribution.

It is written in C and handles functionality like I/O and other core modules. All this functionality together makes Python the language it is. More than 200 core modules sit at the heart of the standard library, which ships with Python. In addition to this library, you can also access a growing collection of several thousand components from the Python Package Index (PyPI).
3.1 Python libraries used

• Pandas library

• Matplotlib library

• Scikit-learn package

• Keras package

• NumPy package

3.2 Briefing on the libraries used

1. Pandas — pandas is a fast, powerful, flexible and easy-to-use open-source data analysis and manipulation tool, built on top of the Python programming language.

2. NumPy — NumPy is a Python package whose name stands for 'Numerical Python'. It is the core library for scientific computing; it contains a powerful n-dimensional array object and provides tools for integrating C, C++ and other languages.

3. Matplotlib — matplotlib.pyplot is a plotting library used for 2D graphics in the Python programming language. It can be used in Python scripts, the shell, web application servers and other graphical user interface toolkits.

4. Scikit-learn — Scikit-learn is a free machine learning library for Python. It features various algorithms like support vector machines, random forests, and k-nearest neighbours, and it also supports Python numerical and scientific libraries like NumPy and SciPy.
3.2.1 Introduction to google colab and GPU

Colaboratory, or “Colab” for short, is a product from Google Research. Colab allows anybody to write and execute arbitrary Python code through the browser, and is especially well suited to machine learning, data analysis and education. More technically, Colab is a hosted Jupyter notebook service that requires no setup to use, while providing free access to computing resources including GPUs.

Colab notebooks are stored in Google Drive, or can be loaded from GitHub. Colab notebooks can be shared just as you would share Google Docs or Sheets: simply click the Share button at the top right of any Colab notebook, or follow the Google Drive file sharing instructions.

Code is executed in a virtual machine private to your account. Virtual machines are deleted when idle for a while, and have a maximum lifetime enforced by the Colab service.
3.2.2 NLP Models Encoder Decoder

The encoder-decoder model is a way of using recurrent neural networks for sequence-to-sequence prediction problems. It was initially developed for machine translation, although it has proven successful at related sequence-to-sequence prediction problems such as text summarization and question answering.

The approach involves two recurrent neural networks: one to encode the input sequence, called the encoder, and a second to decode the encoded input sequence into the target sequence, called the decoder.

Following are some of the applications of sequence-to-sequence models:

• Chatbots

• Machine translation

• Text summarization

• Image captioning
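The encoder-decoder shape described above can be illustrated with a tiny untrained toy in NumPy, not the project's Keras model: the encoder folds a token sequence into one hidden state vector, and the decoder unrolls from that state, emitting one output token per step. All dimensions and weights here are arbitrary, for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions: vocabulary of 10 tokens, hidden state of size 8.
V, H = 10, 8
E = rng.normal(size=(V, H))      # token embedding matrix
W_enc = rng.normal(size=(H, H))  # encoder recurrence weights
W_dec = rng.normal(size=(H, H))  # decoder recurrence weights
W_out = rng.normal(size=(H, V))  # decoder output projection

def encode(tokens):
    # The encoder compresses the whole input sequence into one state vector.
    h = np.zeros(H)
    for t in tokens:
        h = np.tanh(E[t] + W_enc @ h)
    return h

def decode(h, steps):
    # The decoder unrolls from the encoded state, emitting one token per step.
    out = []
    for _ in range(steps):
        h = np.tanh(W_dec @ h)
        out.append(int(np.argmax(h @ W_out)))
    return out

# An input "sentence" of 4 token ids becomes an output of 3 token ids.
summary = decode(encode([1, 4, 2, 7]), steps=3)
```

A trained model (e.g. LSTM encoder and decoder in Keras) learns these weight matrices from data and adds attention, but the encode-then-unroll structure is the same.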
CHAPTER 4
Introduction to React

React Native is an open-source mobile application framework created by Facebook. It is used to develop applications for Android, iOS, the web and UWP by enabling developers to use React along with native platform capabilities.

4.1 React Libraries and Features

React components implement a render() method that takes input data and returns what to display, typically using an XML-like syntax called JSX. Input data that is passed into the component can be accessed by render() via this.props.

Using props and state, we can put together a small Todo application: state tracks the current list of items as well as the text that the user has entered. Although event handlers appear to be rendered inline, they will be collected and implemented using event delegation.
4.2 Our approach for the App

We implemented a React Native app based on a clean template, with features that direct the user to the required tasks, files and notebooks.

4.3 Node.js

Scalability, latency, and throughput are key performance indicators for web servers.
Keeping the latency low and the throughput high while scaling up and out is not easy.
Node.js is a JavaScript runtime environment that achieves low latency and high
throughput by taking a “non-blocking” approach to serving requests. In other words,
Node.js wastes no time or resources on waiting for I/O requests to return.
CHAPTER 5

Snapshots of various project components

App Screenshots:-
NLP Model Screenshots:-
CHAPTER 6

Conclusions

In this project we implemented a prototype that clearly shows how effectively abstractive summarization can be achieved using different approaches and different models. This project can serve as a reference tool for people who are new to NLP, especially within abstractive text summarization.

In the future, the proposed architecture can be developed and evaluated with more effective metrics for better results in abstractive text summarization.
6.1 Scope of future work

Summarization is very useful in today’s world. The main aim of abstractive text summarization is to produce a shortened version of the input text that preserves the relevant meaning. The adjective 'abstractive' denotes that the generated summary is not a combination or selection of repeated sentences, but a paraphrasing of the core contents of the input document. Abstractive summarization is a very difficult problem, much like machine translation. The main challenge in ATS is to compress the content of the input document in an optimized way so that the main concepts of the document are not missed. In today's technologically advancing world, volumes of data are increasing and it is very difficult to read the required data in a short time; collecting the required information and converting it into summarized form is a demanding task. Therefore, text summarization came into demand. Summarized text saves time and helps avoid retrieving massive amounts of text. Abstractive text summarization can be combined with numerous intelligent systems based on NLP technologies, such as information retrieval, question answering, and text classification, to find particular information. If latent structure information of the summaries can be incorporated into an abstractive summarization model, the quality of the generated summaries can be improved. In some research works, topic models are used to capture the latent information from the input paragraphs or documents.

References

[1] https://pandas.pydata.org

[2] https://scikit-learn.org/stable/

[3] https://en.wikipedia.org/wiki/Natural_language_processing

[4] https://en.wikipedia.org/wiki/React_Native

[5] https://reactnative.dev/docs/getting-started

[6] “Survey on abstract Text Summarization using various approaches” by Mr. Arun Krishna
Chittori
