OCR on a Grid Infrastructure: Project Synopsis

The document provides an overview of a project that aims to develop an optical character recognition (OCR) system based on a grid infrastructure. The system would allow for faster and more accurate recognition of characters during document processing compared to existing methods. It would process documents in multiple languages more effectively by recognizing heterogeneous characters. The system is intended to help organizations more efficiently digitize and analyze large volumes of paper documents.

Uploaded by

Abhishek Verma


PROJECT SYNOPSIS

ON

OCR ON A GRID INFRASTRUCTURE

SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR
THE AWARD OF DEGREE OF

Bachelor of Technology
In
Information Technology
Axis Institute of Technology & Management, Kanpur

Submitted to:
Mr. Adesh Chandra (Assistant Professor)

Submitted by:
Brajesh Kumar (1171913009)
Aman Bhathiya (11719130)
Om Prakash Bharti (11719130)
Pooja Yaday (11719130)
Aditi Sharma (11719130)

INTRODUCTION

Demand is growing for software that can recognize the characters in scanned paper documents.
A large body of newspapers and books on many subjects exists only in print, and there is
heavy demand for storing that information on computer storage and later reusing it by
searching. The simplest approach is to scan the documents and store them as IMAGES. Stored
this way, however, the content is hard to reuse: a scanned page cannot readily be read or
searched line by line and word by word. The reason is that the font characteristics of the
characters in paper documents differ from the fonts the computer uses, so the computer cannot
recognize the characters while reading them. Storing the contents of paper documents on a
computer and then reading and searching that content is called DOCUMENT PROCESSING.
Document processing sometimes involves information in languages other than English. It
requires a software system called a CHARACTER RECOGNITION SYSTEM, and the process is also
called DOCUMENT IMAGE ANALYSIS (DIA).
Our aim is therefore to develop a character recognition system that performs Document
Image Analysis, transforming documents from paper format to electronic format. Of the various
techniques available for this, we have chosen Optical Character Recognition (OCR) as the
fundamental technique for recognizing characters. Converting paper documents to electronic
format is an ongoing task in many organizations, particularly in Research and Development
(R&D), large business enterprises, and government institutions. The problem statement also
points to a need for OCR in mobile electronic devices such as cell phones and digital cameras,
which acquire images and recognize them as a part of face recognition and validation.
To use OCR effectively for Document Image Analysis (DIA), we handle the information in
Grid format. The system is therefore effective and useful in the design and construction of
a Virtual Digital Library.

OBJECTIVE
The main purpose of the Optical Character Recognition (OCR) system based on a grid
infrastructure is to perform Document Image Analysis, that is, to process electronic
documents converted from paper, more effectively and efficiently. It aims to recognize
characters more accurately during document processing than existing character recognition
methods. The OCR technique derives the meaning of characters and their font properties from
their bit-mapped images.
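
The step from a bit-mapped image to a character can be illustrated with a small sketch. The synopsis does not name a recognition algorithm, so the nearest-template matching below is an assumption chosen for brevity, and the class name, the 3x5 bitmaps, and the three-letter alphabet are all hypothetical. It covers character identity only, not font properties.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration: nearest-template matching is a stand-in technique,
// not the method described in the synopsis.
class BitmapOcrSketch {

    // Reference bitmaps ('#' = ink, '.' = background), one per known character.
    static final Map<Character, String[]> TEMPLATES = new LinkedHashMap<>();
    static {
        TEMPLATES.put('I', new String[] {"###", ".#.", ".#.", ".#.", "###"});
        TEMPLATES.put('L', new String[] {"#..", "#..", "#..", "#..", "###"});
        TEMPLATES.put('T', new String[] {"###", ".#.", ".#.", ".#.", ".#."});
    }

    // Counts the pixels at which two equally sized bitmaps differ (Hamming distance).
    static int distance(String[] a, String[] b) {
        int diff = 0;
        for (int row = 0; row < a.length; row++) {
            for (int col = 0; col < a[row].length(); col++) {
                if (a[row].charAt(col) != b[row].charAt(col)) {
                    diff++;
                }
            }
        }
        return diff;
    }

    // Returns the template character whose bitmap is closest to the scanned glyph.
    static char recognize(String[] glyph) {
        char best = '?';
        int bestDistance = Integer.MAX_VALUE;
        for (Map.Entry<Character, String[]> entry : TEMPLATES.entrySet()) {
            int d = distance(glyph, entry.getValue());
            if (d < bestDistance) {
                bestDistance = d;
                best = entry.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // A noisy 'L': one stray ink pixel in the top row.
        String[] scanned = {"#.#", "#..", "#..", "#..", "###"};
        System.out.println(recognize(scanned)); // prints L
    }
}
```

The noisy glyph differs from the 'L' template in one pixel and from the other templates in six or more, so it is still read as 'L'. A practical system would first segment the page into glyphs and normalize their size before any comparison of this kind.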
The primary objective is to speed up character recognition in document processing, so that
the system can process a large number of documents in less time.
Because the character recognition is based on a grid infrastructure, it aims to recognize
heterogeneous characters belonging to different languages, with different font properties
and alignments.
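
One way to picture how a grid could both speed up recognition and handle pages in several scripts is to treat each page as an independent job handed to a pool of workers. The sketch below is only an approximation under stated assumptions: a local thread pool stands in for grid nodes, the recognizer is a stub that returns a label instead of running OCR, and the names GridOcrSketch, Page, and processOnGrid are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration: a local thread pool stands in for grid nodes.
class GridOcrSketch {

    // A scanned page plus the script it is believed to contain.
    static final class Page {
        final String script;
        final String imageName;

        Page(String script, String imageName) {
            this.script = script;
            this.imageName = imageName;
        }
    }

    // Stub recognizer: a real node would run OCR tuned to the page's script.
    static String recognize(Page page) {
        return "[" + page.script + "] text of " + page.imageName;
    }

    // Submits one job per page to a fixed pool of workers and
    // collects the results in the original page order.
    static List<String> processOnGrid(List<Page> pages, int nodes) {
        ExecutorService grid = Executors.newFixedThreadPool(nodes);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (final Page page : pages) {
                Callable<String> job = () -> recognize(page);
                pending.add(grid.submit(job));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : pending) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("page recognition failed", e);
        } finally {
            grid.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Page> pages = new ArrayList<>();
        pages.add(new Page("Latin", "page1.png"));
        pages.add(new Page("Devanagari", "page2.png"));
        for (String line : processOnGrid(pages, 2)) {
            System.out.println(line);
        }
    }
}
```

Because pages do not depend on one another, the jobs can run in any order on any worker. Collecting the futures in submission order keeps the output aligned with the input document.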

ABSTRACT

Demand is growing for software that can recognize the characters in scanned paper documents.
A large body of newspapers and books on many subjects exists only in print, and there is
heavy demand for storing that information on computer storage and later reusing it by
searching. The simplest approach is to scan the documents and store them as IMAGES. Stored
this way, however, the content is hard to reuse: a scanned page cannot readily be read or
searched line by line and word by word. The reason is that the font characteristics of the
characters in paper documents differ from the fonts the computer uses, so the computer cannot
recognize the characters while reading them.
Our aim is therefore to develop a character recognition system that performs Document Image
Analysis, transforming documents from paper format to electronic format. Of the various
techniques available for this, we have chosen Optical Character Recognition (OCR) as the
fundamental technique for recognizing characters. OCR derives the meaning of characters and
their font properties from their bit-mapped images.
To use OCR effectively for Document Image Analysis (DIA), we handle the information in Grid
format and therefore apply Grid technologies to character recognition. The system is thus
effective and useful in the design and construction of a Virtual Digital Library.

SCOPE OF PROJECT

The scope of our product, Optical Character Recognition on a grid infrastructure, is to
provide an efficient and enhanced software tool with which users can perform Document Image
Analysis and document processing, by reading and recognizing characters, in research,
academic, governmental and business organizations that hold large pools of scanned document
images. Irrespective of the size of the documents and the type of characters they contain,
the product recognizes, searches and processes them quickly, according to the needs of the
environment.

EXISTING SYSTEM
Users increasingly need to convert printed documents into electronic documents in order to
keep their data secure. The basic OCR system was invented to convert data on paper into
computer-processable documents, so that the documents can be edited and reused. The existing
system, the predecessor of OCR on a grid infrastructure, is plain OCR without grid
functionality. That is, it handles only homogeneous character recognition, or the recognition
of characters of a single language.

TECHNICAL REQUIREMENTS

2.1 SOFTWARE REQUIREMENTS SPECIFICATION

Operating System : Windows XP

Programming Language : Core Java

User Interface : Swing

2.2 HARDWARE REQUIREMENTS SPECIFICATION

Processor : Pentium IV processor or higher

RAM : Minimum of 512 MB RAM

Memory : 500 MB or higher

PROPOSED METHODOLOGY

The architecture of the optical character recognition system on a grid infrastructure consists
of three main components:
1. Scanner
2. OCR hardware or software
3. Output interface
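
The three components listed above can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not the project's design: the interface names DocumentScanner, OcrEngine, and OutputInterface, the process method, and the stub implementations are all hypothetical, and a page image is simplified to an array of strings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the scanner -> OCR -> output flow.
class OcrPipelineSketch {

    // Component 1: produces a page image (simplified here to rows of text).
    interface DocumentScanner {
        String[] scan(String paperId);
    }

    // Component 2: turns a page image into recognized text.
    interface OcrEngine {
        String recognize(String[] image);
    }

    // Component 3: hands recognized text to the user (file, editor, search index).
    interface OutputInterface {
        void publish(String paperId, String text);
    }

    // Wires the three components in the order the synopsis lists them.
    static void process(String paperId, DocumentScanner scanner,
                        OcrEngine engine, OutputInterface output) {
        String[] image = scanner.scan(paperId);
        String text = engine.recognize(image);
        output.publish(paperId, text);
    }

    public static void main(String[] args) {
        List<String> store = new ArrayList<>();
        DocumentScanner scanner = id -> new String[] {"HELLO", "GRID"}; // stub scanner
        OcrEngine engine = image -> String.join(" ", image);            // stub engine
        OutputInterface output = (id, text) -> store.add(id + ": " + text);
        process("doc-1", scanner, engine, output);
        System.out.println(store); // prints [doc-1: HELLO GRID]
    }
}
```

Keeping each component behind its own interface means the OCR stage could be a hardware device or a software library, matching the "hardware or software" wording, without changing the scanner or the output side.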

BENEFIT OF PROPOSED SYSTEM
The proposed system overcomes the drawback of the existing system by supporting multiple
functions such as editing and searching. It also adds the benefit of heterogeneous character
recognition.

TIME FRAME REQUIRED FOR VARIOUS STAGES OF PROJECT IMPLEMENTATION

Sr. No.   PHASES            TIME DURATION
1.        Synopsis          ---- week
2.        System Design     ---- week
3.        Coding            ---- week
4.        Implementation    ---- week
5.        Testing           ---- week
