
Main Project Seminar

On
Fake News Detection using different
Machine Learning models
Gayatri Vidya Parishad College of Engineering
(Autonomous)
Madhurawada, Visakhapatnam - 530 048
Under the esteemed guidance of
S. Kanthi Kiran
Associate Professor
Department of Information Technology

Project Team Members

K. Anusha        17131A1251
B. Kamal         17131A1217
A. Amruth        17131A1210
K. Adithya Pavan 17131A1255
INTRODUCTION

Many news sources publish false information and are therefore "fake news." Because the web contains many fabricated and misleading articles, we would like to determine which texts are legitimate (real) and which are illegitimate (fake). We treat this as a binary classification problem and investigate the effectiveness of different Natural Language Processing models that convert character-based texts into numeric representations, namely TF-IDF, CountVectorizer, and Word2Vec. We aim to find out which model preserves the most contextual information about the texts in a fake news data set, and how helpful and effective each representation is in detecting whether a piece of text is fake news or not.
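As an illustration of this pipeline, the following is a minimal sketch (not the project's exact code) that vectorizes article texts with scikit-learn's TfidfVectorizer and CountVectorizer and trains a logistic regression classifier; the file name "news.csv" and the column names "text" and "label" are assumptions for the example.

```python
# Minimal sketch of the text-to-vector + classifier pipeline described above.
# Assumes a CSV file "news.csv" with columns "text" and "label" (0 = real, 1 = fake).
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer, CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

df = pd.read_csv("news.csv")
X_train, X_test, y_train, y_test = train_test_split(
    df["text"], df["label"], test_size=0.2, random_state=42)

for name, vectorizer in [("TF-IDF", TfidfVectorizer(max_features=50000)),
                         ("CountVectorizer", CountVectorizer(max_features=50000))]:
    # Fit the vectorizer on the training texts only, then transform both splits.
    X_tr = vectorizer.fit_transform(X_train)
    X_te = vectorizer.transform(X_test)

    clf = LogisticRegression(max_iter=1000)
    clf.fit(X_tr, y_train)
    print(name, "accuracy:", accuracy_score(y_test, clf.predict(X_te)))
```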
TECHNOLOGIES USED:
• CountVectorizer
• TF-IDF
• Word2Vec
• ANNs
• LSTMs
• Logistic Regression
• Support Vector Machine
• Random Forest Classifier

PLATFORMS USED:
• PyCharm
• Python 3.6
CONVOLUTIONAL NEURAL NETWORK (CNN):

A Convolutional Neural Network (CNN) is a neural network that has one or more convolutional layers and is used mainly for image processing, classification, and segmentation. Each convolutional layer contains a series of filters known as convolutional kernels. A filter is a small matrix of weights that is applied to a subset of the input pixel values of the same size as the kernel. Each pixel is multiplied by the corresponding value in the kernel, and the results are summed to produce a single value representing one grid cell (like a pixel) in the output channel/feature map.
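The following is a small NumPy sketch of the multiply-and-sum operation described above; the 5×5 toy image and the edge-detecting kernel are arbitrary values chosen only for illustration.

```python
# Illustration of the convolution step described above: a 3x3 kernel slides over a
# 5x5 single-channel "image", and each output cell is the sum of the element-wise
# products of the kernel and the image patch it currently covers.
import numpy as np

image = np.arange(25, dtype=float).reshape(5, 5)   # toy 5x5 input
kernel = np.array([[1, 0, -1],
                   [1, 0, -1],
                   [1, 0, -1]], dtype=float)       # simple edge-detecting filter

out_h, out_w = image.shape[0] - 2, image.shape[1] - 2
feature_map = np.zeros((out_h, out_w))
for i in range(out_h):
    for j in range(out_w):
        patch = image[i:i + 3, j:j + 3]
        feature_map[i, j] = np.sum(patch * kernel)  # one cell of the output feature map

print(feature_map)
```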
Unsupervised Pre-training to encode our texts into numeric representations
[Figure panels (1)-(3) from the slide]
RECURRENT NEURAL NETWORK (RNN):

A Recurrent Neural Network is a generalization of a feed-forward neural network that has an internal memory. An RNN is recurrent in nature because it performs the same task for every element of the input sequence, with the output depending on the previous computations. Once an output is produced, it is copied and sent back into the recurrent network together with the next input.
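The sketch below illustrates this recurrence with a plain NumPy loop (weight shapes and the tanh activation are illustrative assumptions, not the project's exact cell): the hidden state produced at each step is fed back into the same computation together with the next input.

```python
# Minimal sketch of the recurrence described above: the internal memory (hidden
# state) from the previous step is reused at every new step.
import numpy as np

input_dim, hidden_dim, timesteps = 4, 8, 10
rng = np.random.default_rng(0)
W_x = rng.normal(size=(hidden_dim, input_dim))   # input-to-hidden weights
W_h = rng.normal(size=(hidden_dim, hidden_dim))  # hidden-to-hidden (recurrent) weights
b = np.zeros(hidden_dim)

inputs = rng.normal(size=(timesteps, input_dim))
h = np.zeros(hidden_dim)                         # internal memory, initially empty
for x_t in inputs:
    # The same task is performed for every input; the previous state h is fed back in.
    h = np.tanh(W_x @ x_t + W_h @ h + b)

print(h.shape)  # final hidden state summarizing the whole sequence
```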
CONNECTIONIST TEMPORAL CLASSIFICATION (CTC):

If you want a computer to recognize text, neural networks (NN) are a good choice, as they currently outperform all other approaches. The NN for such use cases usually consists of convolutional layers (CNN) to extract a sequence of features and recurrent layers (RNN) to propagate information through this sequence. It outputs character scores for each sequence element, which are represented by a matrix. There are two things we want to do with this matrix:
• Train: calculate the loss value to train the NN.
• Infer: decode the matrix to get the text contained in the input image.
Both tasks are handled by the CTC operation.
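The following is a hedged sketch of these two uses of the score matrix with TensorFlow's built-in CTC helpers; the batch size, the 32 time steps, the 80 character classes, and the label values are illustrative assumptions.

```python
# Sketch of the two uses of the character-score matrix described above.
import tensorflow as tf

batch, timesteps, num_classes = 2, 32, 80
y_pred = tf.nn.softmax(tf.random.normal((batch, timesteps, num_classes)))  # stand-in RNN output matrix

# --- training: compute the CTC loss against the ground-truth label sequences ---
labels = tf.constant([[5, 12, 7, 0, 0], [3, 3, 9, 1, 0]], dtype=tf.int32)  # padded label indices
label_length = tf.constant([[3], [4]], dtype=tf.int32)
input_length = tf.constant([[timesteps], [timesteps]], dtype=tf.int32)
loss = tf.keras.backend.ctc_batch_cost(labels, y_pred, input_length, label_length)
print("CTC loss per sample:", loss.numpy().ravel())

# --- inference: decode the matrix into the most likely character sequence ---
decoded, _ = tf.keras.backend.ctc_decode(y_pred,
                                         input_length=tf.fill([batch], timesteps),
                                         greedy=True)
print("decoded label indices:", decoded[0].numpy())
```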
EXISTING SYSTEM:

• Optical Character Recognition (OCR) is the existing system for character recognition.
• It is an electronic translation of images of hand-written, type-written or printed text into machine-editable text.

Drawbacks:
• It does not include noise reduction.
• Direct use of OCR remains a difficult problem, as it leads to low reading accuracy.
PROPOSED SYSTEM:

• We use a neural network (NN) for our task. It consists of convolutional NN (CNN) layers, recurrent NN (RNN) layers and a final Connectionist Temporal Classification (CTC) layer.
• The input image is fed into the CNN layers. These layers are trained to extract relevant features from the image.
• The RNN output sequence is mapped to a matrix of size 32×80.
• While training the NN, the CTC layer is given the RNN output matrix and the ground-truth text, and it computes the loss value. A minimal Keras sketch of this architecture is shown after the figure caption below.

Overview of the NN operations (green) and the data flow through the NN (pink).
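The sketch below is a minimal Keras version of the CNN → RNN → dense stack described above (the specific layer sizes are assumptions, not the exact project configuration): convolutional layers extract features from the 128×32 input image, a bidirectional LSTM propagates them along the sequence, and a dense layer produces the 32×80 character-score matrix that the CTC loss from the previous sketch consumes.

```python
# Assumed-architecture sketch: CNN feature extraction, RNN sequence modelling,
# and a dense layer producing the 32x80 score matrix for the CTC layer.
import tensorflow as tf
from tensorflow.keras import layers

inputs = tf.keras.Input(shape=(128, 32, 1), name="image")            # gray-value image
x = layers.Conv2D(32, 3, padding="same", activation="relu")(inputs)
x = layers.MaxPooling2D(pool_size=(2, 2))(x)                          # 64 x 16
x = layers.Conv2D(64, 3, padding="same", activation="relu")(x)
x = layers.MaxPooling2D(pool_size=(2, 2))(x)                          # 32 x 8
x = layers.Reshape((32, 8 * 64))(x)                                   # sequence of 32 feature vectors
x = layers.Bidirectional(layers.LSTM(128, return_sequences=True))(x)
outputs = layers.Dense(80, activation="softmax")(x)                   # 32 x 80 character-score matrix

model = tf.keras.Model(inputs, outputs)
model.summary()  # the CTC loss from the earlier sketch is applied to `outputs` during training
```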
REQUIREMENTS SPECIFICATION:
HARDWARE REQUIREMENTS:
• RAM: 4 GB or higher
• Disk Space: 1 TB
• Processor: Intel i5 or higher

SOFTWARE REQUIREMENTS:
• Operating System: Windows
• Python 3
• Packages: TensorFlow, NumPy, OpenCV, Keras
• PyCharm
FUNCTIONAL REQUIREMENTS:

• The system should process the input given by the user only if it is an image file.
• The system will show an error message to the user when the input is not in the required format.
• The system should detect the characters present in the image.
• The system should retrieve the characters present in the image and display them to the user.
NON-FUNCTIONAL REQUIREMENTS:

• Performance: Handwritten characters in the input image will be recognized with high accuracy.
• Functionality: This software will deliver on the functional requirements mentioned in this document.
• Availability: The system will retrieve the handwritten character regions only if the image contains written characters.
• Recognition Ability: The software is very easy to use and recognizes the characters from the image.
• Reliability: This software will work reliably for any type of character image.
SYSTEM ARCHITECTURE/FLOW CHART:

Start → Real Image → Noise Removal → Classification of image → Extraction of text from image → Text contained in the image will be displayed → Stop

Process Flow: Image Acquisition → Preprocessing → Segmentation → Classification and Recognition → Post-processing
Image Acquisition:
• In image acquisition, the recognition system acquires a scanned image as the input image.
• The image should be in PNG format.

Pre-processing:
• The input is a gray-value image of size 128×32.
• Usually, the images from the dataset do not have exactly this size, so we resize each one (without distortion) until it either has a width of 128 or a height of 32.
• Then, we copy the resized image into a (white) target image of size 128×32.
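A minimal OpenCV/NumPy sketch of this pre-processing step is shown below; the file name "word.png" is a placeholder, not a file from the project.

```python
# Sketch of the pre-processing described above: resize without distortion so the
# image fits into 128x32, then paste it onto a white 128x32 target image.
import cv2
import numpy as np

def preprocess(path, target_w=128, target_h=32):
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)           # gray-value input image
    h, w = img.shape
    scale = min(target_w / w, target_h / h)                # keep aspect ratio (no distortion)
    new_w, new_h = max(1, int(w * scale)), max(1, int(h * scale))
    img = cv2.resize(img, (new_w, new_h))

    target = np.full((target_h, target_w), 255, dtype=np.uint8)  # white target image
    target[:new_h, :new_w] = img                           # copy the resized image in
    return target

canvas = preprocess("word.png")   # "word.png" is a placeholder file name
print(canvas.shape)               # (32, 128)
```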
Segmentation:

• In this stage, an image of a sequence of characters is decomposed into sub-images of individual characters.
• The pre-processed input image is segmented into isolated characters by assigning a number to each character using a labelling process (a sketch of this step follows the list below).
• The labelling process provides information about the number of characters in the image.
• Each individual character is uniformly resized to a fixed pixel size.
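The following is a hedged sketch of such a labelling process using OpenCV's connected-component analysis (the binarization method and the 32×32 output size are assumptions): each character blob receives a label number, and the label count gives the number of characters in the image.

```python
# Sketch of character segmentation via connected-component labelling.
import cv2

def segment_characters(gray):
    # Binarize (characters become white on black) before labelling.
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    num_labels, labels, stats, _ = cv2.connectedComponentsWithStats(binary)

    chars = []
    for label in range(1, num_labels):                 # label 0 is the background
        x, y, w, h, _ = stats[label]
        char_img = gray[y:y + h, x:x + w]              # sub-image of one character
        chars.append(cv2.resize(char_img, (32, 32)))   # uniform pixel size per character
    return chars

# Example usage with the preprocessed 128x32 canvas from the earlier sketch:
# characters = segment_characters(canvas); print(len(characters), "characters found")
```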


Classification and Recognition:

• The classification stage is the decision-making part of the recognition system.
• A feed-forward back-propagation neural network is used in this work for classifying and recognizing the handwritten characters.
• The total number of neurons in the output layer is 79, as the proposed system is designed to recognize English alphabets and digits.

Post-Processing:

• The post-processing stage is the final stage of the proposed recognition system.
• It prints the corresponding recognized characters in structured text form (a sketch of this step is shown below).
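The sketch below illustrates how the classification output could be turned into structured text: the network's 79-way scores for each segmented character are reduced to the most probable class, and the characters are joined into a string. The character ordering used here is a placeholder assumption, not the project's actual label mapping.

```python
# Sketch of classification + post-processing: pick the most probable of the 79
# output classes for each character image and join the results into text.
import string
import numpy as np

# Placeholder ordering for the 79 output neurons (digits, letters, punctuation).
CHARSET = (string.digits + string.ascii_uppercase + string.ascii_lowercase +
           string.punctuation)[:79]

def postprocess(char_probabilities):
    """char_probabilities: array of shape (num_characters, 79), one row per segment."""
    indices = np.argmax(char_probabilities, axis=1)        # classification decision
    return "".join(CHARSET[i] for i in indices)            # structured text output

# Example with random scores for 5 segmented characters:
print(postprocess(np.random.rand(5, 79)))
```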
UML DIAGRAM:

USE CASE DIAGRAM:

[Use case diagram elements: User — Upload Image, Cancel; Pre-Process Image <<include>> Convert Image to Gray, Initialize Scale, Gray Scale to Binary format, Normalization; System — Recognize, Generate Output]
Output Screens:
Conclusions:

• Handwritten Text Recognition is a complex problem that is not easily solvable; much depends on the dataset and database used.
• This model is built to analyze handwritten text and convert it into computer text.
• This application is applicable in many sectors, such as health care and the consumer sector.
• When used in health-care applications, this type of model can help capture people's handwritten notes and store each and every record digitally.
• Recognition of text depends on the writing style.
• Salt-and-pepper noise can throw off the results.

Future Scope:

• This work can further be extended to convert a handwritten paragraph in English into a structured format.
• It can also be extended to recognize other languages.

Status:

• Data Collection - 100%

• Model Building - 100%

• User Interface - 100%


Thank You
