Optical Character Recognition (OCR)

By
Maryam Nabeel
Mohammed Ali
Ali Mohammed
Mustafa Dilshad

Supervisor
Mr. Farooq Safaaldin

A thesis submitted in partial fulfilment of the requirements for the degree of

Computer Engineering Techniques

Computer Engineering Techniques Department

Technical Engineering College

Northern Technical University

BSc

April 2024

Kirkuk, Iraq
Acknowledgements

We would like to express our deepest appreciation to all those who made it
possible to complete this research. We owe special gratitude to our project
supervisor, Mr. Farooq Safaaldin, whose stimulating suggestions and
encouragement helped us to coordinate our project, especially in writing
this research.

Furthermore, we acknowledge with much appreciation the crucial role of the
staff of the Computer Engineering Department, who gave us permission to use
all the required equipment and the necessary materials to conduct the
research.

Special thanks go to our team members, who worked together, helped each
other to assemble the parts, and offered suggestions about the procedures of
this project.

We are also grateful to Kirkuk Technical College for providing the laboratory
facilities.

Finally, we wish to thank our parents for their support and encouragement
throughout our study.

Abstract

Optical Character Recognition (OCR) technology has revolutionized the way we
interact with textual data, enabling the digitization of documents from
various media such as scanned paper, PDFs, or images captured by digital
cameras into editable and searchable formats [14].

The recent surge in OCR accuracy can be attributed to the advent of
sophisticated deep learning models, which have been trained on expansive and
diverse datasets to perform well even in challenging conditions with complex
layouts and background noise [15].

These state-of-the-art models have undergone rigorous benchmarking through a
series of tests, demonstrating their robustness and versatility in
recognizing an array of text styles across different backgrounds [16].

By incorporating OCR into computational workflows, we unlock new horizons
for data analysis and breathe new life into historical documents,
transforming them from static images into dynamic, analyzable datasets that
can be processed and examined computationally [17].

Table of Contents

Acknowledgements
Abstract
Table of Contents
Chapter 1 Introduction
1.1 Problem Statement
1.2 Thesis Organisation
Chapter 2 Literature Review
Chapter 3 Programming languages
Chapter 4 Experimental Work
4.1
4.2 front end
Chapter 5 back end
5.1 back end
Database
MySQL
5.2 database and XAMPP
Chapter 6 Conclusions and Recommendations
6.1 Recommendations
6.2 Conclusions
References

Chapter 1 Introduction

1. History and Evolution of OCR

Optical Character Recognition (OCR) technology has a long and interesting
history. The first OCR systems were developed in the early 1900s, but they
were limited in their ability to recognize text accurately. In the 1960s, OCR
technology began to evolve rapidly with the advent of computers, leading to
the development of more advanced OCR systems. During the 1970s, OCR
systems became more sophisticated, incorporating features such as document
layout analysis and font recognition. In the 1980s, OCR technology continued
to advance, with the development of more advanced algorithms for character
recognition, as well as the introduction of OCR software for personal
computers. The 1990s saw the rise of digital imaging and the widespread
adoption of OCR technology in a variety of industries, including government,
finance, and healthcare [1-2].
In recent years, OCR technology has advanced significantly with the advent
of deep learning and machine learning algorithms. Today, OCR systems can
recognize text in a variety of languages and scripts and are used in a wide
range of applications, from document scanning and archiving to data
extraction and information retrieval. Continued development is expected to
further improve the accuracy and efficiency of OCR, making it an
increasingly important tool for businesses and individuals alike [2-6].
Some recent developments in the field of OCR include improved text
recognition in complex scenes, handwritten text recognition, integration
with augmented reality, and integration with the Internet of Things (IoT).
The rest of this paper is organized as follows. Section 2 covers related
work on the history of optical character recognition and its presence in the
Python programming language. Section 3 presents the methodology adopted,
together with the driving code for the OCR process. Section 4 reviews the
mechanism of the OCR engine and the future scope of OCR development using
Python. Section 5 concludes the paper with directions for future research.

1.1 Problem Statement

OCR (Optical Character Recognition) is a technology that allows the
recognition of text within digital images or scanned documents, and the
conversion of that text into machine-readable characters. While OCR
technology has made significant advancements in recent years, there are still
several common problems associated with its use, including:

- Quality of the source image: The quality of the source image is critical
for accurate OCR. Poor-quality images with low resolution, blurred text, or
skew can make it difficult for OCR software to recognize characters
correctly.

- Character recognition errors: OCR technology can sometimes misinterpret
characters, leading to errors in the output. For example, similar-looking
characters such as 'l' and '1' or 'O' and '0' are easily confused [1,8,10].

- Formatting issues: OCR software can struggle with formatting such as
columns, tables, and font styles, which makes it difficult to recognize and
convert text accurately.

- Language and character set support: OCR software may not support all
languages and character sets, which can result in inaccurate recognition of
characters or an inability to recognize them at all.

- Handwriting recognition: OCR technology struggles with handwritten text.
Even with the best OCR software, it is challenging to recognize handwriting
with high accuracy.

- Noise and distortion: Noise and distortion in the source image, such as
smudges, stains, and creases, can cause OCR technology to misinterpret
characters or fail to recognize them at all.

Overall, OCR technology has made significant advancements in recent years,
but there are still several challenges that need to be addressed to improve
its accuracy and reliability [4].
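
As one illustration of the character-confusion problem, the sketch below
repairs the 'l'/'1' and 'O'/'0' mix-ups only inside tokens that already
consist mostly of digits. The function name fix_numeric_tokens, the mapping
table, and the "at least half digits" rule are assumptions made for this
example; they are not part of any OCR library.

```python
import re

# Hypothetical mapping for the two confusions named in the text:
# lowercase 'l' read instead of '1', and capital 'O' read instead of '0'.
LETTER_TO_DIGIT = str.maketrans({"l": "1", "O": "0"})


def fix_numeric_tokens(text):
    """Replace look-alike letters with digits inside mostly numeric tokens."""

    def repair(match):
        token = match.group(0)
        digit_count = sum(ch.isdigit() for ch in token)
        # Only touch tokens in which at least half the characters are digits,
        # so ordinary words such as "Order" or "total" are left alone.
        if digit_count * 2 >= len(token):
            return token.translate(LETTER_TO_DIGIT)
        return token

    return re.sub(r"[0-9A-Za-z]+", repair, text)


if __name__ == "__main__":
    print(fix_numeric_tokens("Invoice 2O24 total 1O5l"))
```

A rule this simple will miss tokens where most digits were misread, so it is
best seen as a first filter rather than a complete correction step.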
1.2 Thesis Organisation
This survey of Optical Character Recognition (OCR) using Python gives an
overview of the Python libraries and packages available for OCR, as well as
the current state of the art. One of the most widely used OCR engines in
Python is Tesseract, an open-source engine originally developed at HP and
later maintained with Google's sponsorship. Tesseract provides a high level
of accuracy and supports a variety of languages and scripts, making it a
popular choice for OCR applications. The Python binding for Tesseract,
pytesseract, provides a simple interface for integrating Tesseract into
Python applications. Another library widely used in OCR pipelines is OpenCV,
an open-source computer vision library. OpenCV provides a range of image
processing and computer vision algorithms, including object detection and
segmentation, which can be used to improve the accuracy of OCR. The
integration of OpenCV with Tesseract or pytesseract provides a powerful tool
for OCR applications. Other OCR libraries in Python include OCRopus, a
Python-based OCR system whose development was sponsored by Google, and
pyOCR, a Python wrapper for the Tesseract OCR engine. These libraries
provide alternative options for OCR implementation in Python and offer
different levels of functionality and accuracy. In recent years, there has
been growing interest in the use of deep learning algorithms for OCR.
Python provides several deep learning libraries, such as TensorFlow and
PyTorch, which can be used to build OCR systems. These libraries allow for
the training of deep learning models for OCR and can be used to improve the
accuracy and efficiency of OCR systems. Overall, the literature survey
highlights the versatility of Python for OCR implementation, with a range of
libraries and packages available, including Tesseract and OpenCV, as well as
deep learning libraries such as TensorFlow and PyTorch. Python provides a
simple and flexible platform for OCR implementation, making it an attractive
option for OCR applications in a variety of domains.

Overall, the main aim of OCR technology is to automate the recognition and
conversion of text-based data, improving the efficiency, accuracy, and
accessibility of text-based information. OCR technology has made significant
advancements in recent years, but there are still several challenges that
need to be addressed to improve its accuracy and reliability.

Chapter 2 Literature Review

This literature review of Optical Character Recognition (OCR) using Python
gives an overview of the Python libraries and packages available for OCR, as
well as the current state of the art. One of the most widely used OCR
engines in Python is Tesseract, an open-source engine originally developed
at HP and later maintained with Google's sponsorship. Tesseract provides a
high level of accuracy and supports a variety of languages and scripts,
making it a popular choice for OCR applications. The Python binding for
Tesseract, pytesseract, provides a simple interface for integrating
Tesseract into Python applications. Another library widely used in OCR
pipelines is OpenCV, an open-source computer vision library. OpenCV provides
a range of image processing and computer vision algorithms, including object
detection and segmentation, which can be used to improve the accuracy of
OCR.

The integration of OpenCV with Tesseract or pytesseract provides a powerful
tool for OCR applications. Other OCR libraries in Python include OCRopus, a
Python-based OCR system whose development was sponsored by Google, and
pyOCR, a Python wrapper for the Tesseract OCR engine [9-14]. These libraries
provide alternative options for OCR implementation in Python and offer
different levels of functionality and accuracy. In recent years, there has
been growing interest in the use of deep learning algorithms for OCR.

Python provides a number of deep learning libraries, such as TensorFlow and
PyTorch, which can be used to build OCR systems. These libraries allow for
the training of deep learning models for OCR and can be used to improve the
accuracy and efficiency of OCR systems. Overall, the literature survey
highlights the versatility of Python for OCR implementation, with a range of
libraries and packages available, including Tesseract and OpenCV, as well as
deep learning libraries such as TensorFlow and PyTorch. Python provides a
simple and flexible platform for OCR implementation, making it an attractive
option for OCR applications in a variety of domains.

[1] Optical Character Recognition: Tesseract is an open-source OCR engine.
It was initially developed between 1984 and 1994 at HP. In 1995, it was sent
to UNLV for the Annual Test of Optical Character Recognition Accuracy, after
the joint project between HP Labs Bristol and HP's Scanner Division in
Colorado. Finally, in 2005, Tesseract was released as open source by HP and
made available at the Tesseract OCR website.

[2] Natural Language Processing with Python: Analyzing text with the Natural
Language Toolkit.

[3] Information extraction and text summarization using linguistic knowledge
acquisition: The lack of extensive linguistic coverage is the major barrier
to extracting useful information from large bodies of text. Current natural
language processing (NLP) systems do not have rich enough lexicons to cover
all the important words and phrases in extended texts, that is, essentially
all of the spoken language.

Chapter 3 Programming languages

Chapter 4 Experimental Work

Objective

The objective of this experiment was to develop and test a software system
capable of converting handwritten text and audio inputs into digital text
format. The system integrates Optical Character Recognition (OCR) and voice
recognition technologies to process images and sound files, respectively.

Methodology

OCR Component

- Pytesseract: An OCR engine was utilized to interpret text from images. The
pytesseract library, which is a Python wrapper for Google's Tesseract-OCR
Engine, was employed.

- Images were preprocessed using OpenCV to enhance text recognition
accuracy. This involved converting images to grayscale, applying dilation
and erosion, and saving the preprocessed images.

- The preprocessed images were then fed into the OCR engine to extract text.

Voice Recognition Component

- The system was designed to accept voice inputs in various forms, including
human speech, robot-generated voices, and other sounds.

- A voice recognition module was implemented to convert the audio input into
text. The specifics of the voice recognition technology used (e.g., a
specific API or custom model) would be detailed here.

Experiment Execution

- A diverse dataset of images and audio files was collected to test the
system's capabilities.

- The OCR component was tested with the image dataset, while the voice
recognition component was tested with the audio files.

- A sequence-comparison utility from Python's difflib library was used to
measure the accuracy of the text conversion by comparing the system's output
with a predefined ground truth.

Results

- The system achieved an accuracy of X% on the image dataset, successfully
recognizing handwritten text across various styles and quality levels.

- The voice recognition component accurately converted Y% of the audio
inputs into text, demonstrating its effectiveness on a range of sound types.

- The combined OCR and voice recognition system showed robust performance,
with an overall accuracy of Z% in converting both images and audio into
digital text.

Discussion

This section would analyze the results, discussing the success rate of the
system, its limitations, and potential areas for improvement.

This research delves into the advancements of Optical Character Recognition
(OCR) technology and its transformative impact on digital information
processing. OCR has emerged as a pivotal tool in converting printed and
handwritten texts into machine-encoded text, enabling efficient data
retrieval and analysis. The study explores the evolution of OCR from basic
character recognition to its current state, where it incorporates artificial
intelligence to interpret text across diverse languages and formats. The
research highlights the challenges faced in recognizing cursive handwriting
and non-standard fonts, and how machine learning algorithms have
significantly improved accuracy rates. Furthermore, the paper discusses the
integration of OCR with other technologies such as Natural Language
Processing (NLP) to enhance text comprehension and contextual analysis. The
findings underscore the potential of OCR in various sectors, suggesting that
future developments focus on increasing recognition precision, expanding
language support, and ensuring accessibility. The research concludes with
recommendations for future work, emphasizing the need for continuous
innovation to meet the growing demands of digitization in an increasingly
data-driven world.

Chapter 5
5.1 back end
The main idea of this project came from a problem faced by many users when
copying the text content of published images. Without OCR (optical character
reader) software, which reads characters from images, users must copy the
text manually to obtain an accurate result. With OCR, the part of the image
that contains text can be captured as a screenshot, and its characters can
be converted into editable text. This can be implemented as an upgrade to
existing media players.

Optical character recognition (OCR) is a technology that recognizes text in
images, such as scanned documents and photos. Perhaps you have taken a photo
of a text because you did not want to take notes, or because taking a photo
is faster than typing. Thanks to today's smartphones, we can apply OCR to
copy the text from a picture we took earlier without having to retype it.

Python OCR (Python optical character recognition) is the use of Python to
recognize and pull out text from images such as scanned documents and
photos. It can be carried out using the open-source OCR engine Tesseract.
[Note: list all the Python libraries used.]

Optical Character Recognition is an old but still challenging problem that
involves the detection and recognition of text in unstructured data,
including images and PDF documents. It has useful applications in banking,
e-commerce, and content moderation in social media. But as with every topic
in data science, the sheer amount of available resources can be overwhelming
when trying to learn how to solve the OCR task. This is why I am writing
this tutorial, which can help you get started.

In this project, I am going to show some Python libraries that allow you to
quickly extract text from images without too much effort. The explanation of
the libraries is followed by a practical example. The dataset used is taken
from Kaggle. To simplify the concepts, I am just using an image of the film
Rush. The most important Python libraries used are:

1. pytesseract: A Python wrapper for Google's Tesseract-OCR Engine. It
allows for the extraction of text from images.

2. tkinter: The standard Python interface to the Tk GUI toolkit. It is used
for creating graphical user interfaces.

3. filedialog: A module in tkinter that provides classes and factory
functions for creating file/directory selection windows.

4. fpdf: A library that allows for the creation of PDF files with Python.

5. cv2 (OpenCV): An open-source computer vision and machine learning
software library. It provides a common infrastructure for computer vision
applications and accelerates the use of machine perception in commercial
products.

6. numpy: A fundamental package for scientific computing with Python. It
provides support for large, multi-dimensional arrays and matrices, along
with a collection of mathematical functions to operate on these arrays.

7. PIL (Pillow): The Python Imaging Library adds image processing
capabilities to the Python interpreter. Pillow is the friendly PIL fork, an
easy-to-use library for opening, manipulating, and saving many different
image file formats.

8. difflib: A module that provides classes and functions for comparing
sequences, including HTML, context, and unified diffs.

5.2 How to Convert Image to Text Using Python

Leveraging artificial intelligence (AI) and optical character recognition
(OCR), it is possible to draw out text from an array of file formats. This
extraction can be further simplified with code. Here, we delve into the
method of converting images to textual data using the Python programming
language.

Organizations in the modern era are bombarded with a significant amount of
unstructured data in a myriad of formats: PDFs, scanned files, images, and
the like. Manual extraction of crucial textual information from these heaps
of data is a taxing task bound to result in errors and inefficiencies.

However, thanks to strides made in AI, it is now possible to streamline this
task with code. With AI-driven OCR algorithms, one can efficiently and
accurately convert image-based text into accessible, actionable, and
searchable data.

This piece focuses on various types of images and the corresponding methods
required to extract text from them. We highlight the limitations of some
common approaches and offer practical solutions to enhance output. So, why
is it necessary to convert images to text?

5.3 Why is Text Extraction Important?

Numerous organizations produce image data from operational documentation.
Unfortunately, text locked inside images is hard to view, edit, or analyse
because it is not searchable. Hence, it becomes imperative to extract it or
convert it into string data in order to capture and use it.

For example, by extracting invoice details, dates, supplier information,
amounts, and other textual information from invoice images, one can store
such data for auditing, for tax purposes, or to assess supplier performance.

The need to extract text is not restricted to invoices. Other important use
cases include the digital conversion of recruitment forms, resumes,
healthcare records, food labels, ID document scans, and location-specific
images such as store names and street signs.
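
To make the invoice scenario concrete, the sketch below pulls dates, amounts,
and a supplier name out of text that an OCR engine has already produced. The
field label "Supplier:", the date and currency formats, and the function name
extract_invoice_fields are all hypothetical; real invoices vary widely and
would need patterns adapted to their layout.

```python
import re

# ISO dates (2024-04-15) or day/month/year style dates (15/04/2024).
DATE_PATTERN = re.compile(
    r"\b(\d{4}-\d{2}-\d{2}|\d{1,2}[/-]\d{1,2}[/-]\d{2,4})\b"
)
# Amounts introduced by "$" or "USD", with optional thousands separators.
AMOUNT_PATTERN = re.compile(
    r"(?:USD|\$)\s?(\d{1,3}(?:,\d{3})*(?:\.\d{2})?)"
)
# Assumed label; the rest of the line is taken as the supplier name.
SUPPLIER_PATTERN = re.compile(r"Supplier:\s*(.+)")


def extract_invoice_fields(ocr_text):
    """Collect dates, amounts, and the supplier from OCR output text."""
    supplier = SUPPLIER_PATTERN.search(ocr_text)
    return {
        "dates": DATE_PATTERN.findall(ocr_text),
        "amounts": AMOUNT_PATTERN.findall(ocr_text),
        "supplier": supplier.group(1).strip() if supplier else None,
    }


if __name__ == "__main__":
    sample = "Invoice date: 2024-04-15\nSupplier: Acme Ltd\nTotal: $1,250.00"
    print(extract_invoice_fields(sample))
```

Once the fields are in a dictionary, they can be written to a database or
spreadsheet for the auditing and supplier-assessment uses mentioned above.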

5.4 What Kinds of Images are Suitable for Text Extraction?

In Python, text extraction can in theory be applied to all types of images.
However, depending on the expected output, the complexity of the code and
its accuracy may differ greatly. [Note: state the type specifically.]

Images with a simple setup, with large text, few words, simple fonts, and
clear contrast between text and background, may require only a few lines of
code. More complex images with different fonts, noisy backgrounds, shadowed
or skewed text, or handwritten text will likely prove more challenging. Such
images call for extra coding effort in a do-it-yourself program. They demand
preprocessing before extraction and further editing afterwards to correct
the extracted text.

5.5 Translating Simple Images to Textual Data in Python

For straightforward images, the following methods are ideal.

Tesseract and OpenCV: Tesseract is a well-regarded open-source OCR engine
for extracting text from images. Its counterpart, the Open Source Computer
Vision Library (OpenCV), is a computer vision and machine learning software
library that offers a variety of options and algorithms for working with
videos and images. By pairing Tesseract and OpenCV, users can extract data
from images using Python. After Tesseract is installed on the system, the
pytesseract library, a Python wrapper, should be installed along with
OpenCV. A few simple steps then convert the text image into a string using
Tesseract. Another alternative for converting images to text is the online
service OnlineOCR.

easyOCR: Fairly efficient and user-friendly, easyOCR is a Python library
with a simple interface for extracting text from basic images. A brief
command initializes the reader. The readtext method then returns a list of
text detection results, each containing the extracted text, bounding box
coordinates, and a confidence score. These results can then be printed or
processed further.
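
The two approaches described above can be sketched as follows. This is a
minimal example, assuming the pytesseract, Pillow, and easyocr packages (and
the Tesseract engine) are installed; the wrapper function names, the default
language codes, and the 0.5 confidence threshold are illustrative choices,
not values taken from this project.

```python
def read_with_tesseract(image_path, lang="eng"):
    """Return the text Tesseract finds in an image file."""
    # Imports are kept inside the functions so that each engine is only
    # required when it is actually used.
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as image:
        return pytesseract.image_to_string(image, lang=lang)


def read_with_easyocr(image_path, languages=("en",)):
    """Return easyOCR detections as (bounding_box, text, confidence) items."""
    import easyocr

    reader = easyocr.Reader(list(languages))  # loads the detection models
    return reader.readtext(image_path)


def confident_text(results, min_confidence=0.5):
    """Keep only the text of detections at or above a confidence threshold."""
    return [
        text
        for _bounding_box, text, confidence in results
        if confidence >= min_confidence
    ]
```

A typical use would be confident_text(read_with_easyocr("poster.jpg")), where
"poster.jpg" stands for any image file; filtering by confidence is a cheap
way to discard detections that are likely to be noise.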

Besides pytesseract and easyOCR, there are other Python libraries with OCR
capabilities for mining text from images. They provide a cohesive interface
to the underlying engines. Libraries such as PyOCR and OCRopus offer
additional choices and flexibility for OCR in Python. Some libraries can
even be used for both single-page and multi-page document OCR.

5.6 Limitations of Python Libraries

While they work well on basic images, open-source Python libraries may fall
short when complex images come into play. They produce inaccurate results if
the background is pixelated, blurry, or matches the text color, or if the
image is a handwritten or poorly scanned copy. They perform poorly if the
image contains multiple columns or irregular text placement. Also, they are
not equipped with natural language processing (NLP) features to check and
improve the output. If the input deviates from the standard, these libraries
output incorrect results.
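
One simple way to compensate for the missing language check is to compare
each recognized word against a known vocabulary after extraction. The sketch
below uses difflib.get_close_matches from the standard library as a stand-in
for a real NLP component; the vocabulary, the 0.75 cutoff, and the function
name correct_tokens are assumptions made for illustration.

```python
import difflib


def correct_tokens(text, vocabulary, cutoff=0.75):
    """Replace out-of-vocabulary words with their closest vocabulary entry."""
    corrected = []
    for token in text.split():
        word = token.lower()
        if word in vocabulary:
            corrected.append(token)
            continue
        # Ask for the single best candidate above the similarity cutoff.
        matches = difflib.get_close_matches(word, vocabulary, n=1, cutoff=cutoff)
        corrected.append(matches[0] if matches else token)
    return " ".join(corrected)


if __name__ == "__main__":
    words = ["invoice", "total", "amount"]
    print(correct_tokens("lnvoice tota1 amount", words))
```

This only works for domains with a small, known vocabulary (form labels,
product names); free text would need a proper spell checker or language
model, and replaced words lose their original capitalization.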

5.7 Improving Python Libraries' Efficiency

The effectiveness of Python OCR libraries can be improved by preprocessing
the images. Before text extraction, the image should be converted to
grayscale. The grayscale image can then be converted into a binary format in
which text appears as black pixels against a white background. To improve
results further, additional code for image preprocessing can be written.
Common preprocessing tasks include applying filters to enhance clarity,
adjusting the contrast between text and background, correcting image skew or
rotation, normalizing varying text sizes, and more.
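
The grayscale and binary conversions can be illustrated with a small
NumPy-only sketch. It uses Otsu's method to choose the threshold, which is
one common option and an assumption here rather than something the text
specifies; in an OpenCV pipeline the same step is usually done with
cv2.threshold and the cv2.THRESH_OTSU flag. The function names are
illustrative.

```python
import numpy as np


def to_grayscale(rgb):
    """Convert an H x W x 3 RGB array to 8-bit grayscale (BT.601 weights)."""
    weights = np.array([0.299, 0.587, 0.114])
    return np.rint(rgb[..., :3] @ weights).astype(np.uint8)


def otsu_threshold(gray):
    """Return the gray level that maximizes the between-class variance."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    total = hist.sum()
    cum_count = np.cumsum(hist)                  # pixels with value <= t
    cum_sum = np.cumsum(hist * np.arange(256))   # intensity mass <= t
    with np.errstate(divide="ignore", invalid="ignore"):
        mean_low = cum_sum / cum_count
        mean_high = (cum_sum[-1] - cum_sum) / (total - cum_count)
        weight_low = cum_count / total
        between = weight_low * (1.0 - weight_low) * (mean_low - mean_high) ** 2
    # Thresholds that leave one class empty produce NaN; treat them as 0.
    return int(np.argmax(np.nan_to_num(between)))


def binarize(gray):
    """Map pixels above the Otsu threshold to white and the rest to black."""
    threshold = otsu_threshold(gray)
    return np.where(gray > threshold, 255, 0).astype(np.uint8)
```

For a light page with dark text, binarize(to_grayscale(image)) yields black
text on a white background. A single global threshold can fail on unevenly
lit photos, where an adaptive (local) threshold is usually a better fit.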

In essence, converting images to text considerably enhances the
accessibility and productivity of any data-heavy business operation. Using
Python libraries to streamline this process further improves the overall
efficiency and accuracy of text extraction. With continued advances in AI
and OCR technologies, the process is poised to become even more streamlined
and refined in the future.

Chapter 6 Conclusions and Recommendations
6.1 Recommendations:

Looking ahead, the future of OCR is poised for further innovation. The
integration of OCR with technologies like Natural Language Processing
(NLP) and Machine Learning (ML) will enhance its accuracy and efficiency.
Additionally, the development of OCR for less commonly used scripts and
languages will open new avenues for global information exchange. It is also
recommended that future research focuses on improving OCR's capability to
interpret handwritten texts and complex layouts, making it more versatile and
user-friendly.

Investing in the continuous improvement of OCR technology will undoubtedly
yield significant benefits across numerous sectors, including education,
healthcare, finance, and legal industries, where data extraction and
analysis are crucial.

6.2 Conclusions

Optical Character Recognition has been around for many years and has become
increasingly important as the amount of digital information has grown. The
future of OCR development using Python looks promising, as Python is a
popular and widely used programming language for various applications,
including OCR. Here are a few areas where OCR development using Python is
expected to grow in the future:

- Improved accuracy: With advancements in deep learning and computer vision,
OCR algorithms will continue to improve their accuracy in recognizing text
in images and PDFs, leading to even better performance.

- Real-time OCR: As the demand for real-time processing of images and videos
increases, OCR systems will need to adapt to real-time processing. Python's
rapid development cycle and its ability to handle real-time data processing
make it a suitable choice for developing real-time OCR systems.

- Multilingual OCR: As the world becomes more connected and globalized, the
demand for OCR systems that can handle multiple languages will continue to
grow. Python has strong support for processing multiple languages, which
makes it a good platform for multilingual OCR development.

- Handwriting recognition: With the increasing use of digital devices for
note-taking, the demand for OCR systems that can recognize handwritten text
will continue to grow. Python's ability to integrate with machine learning
libraries like TensorFlow and PyTorch makes it a good choice for developing
handwriting recognition systems.

- Integration with other technologies: OCR technology will continue to be
integrated with other technologies like augmented reality, virtual reality,
and the Internet of Things (IoT) to create new and innovative applications.
Python's ability to integrate with various technologies and its popularity
as a programming language make it a good choice for developing these kinds
of applications.

Optical Character Recognition (OCR) is a fast-developing field, and numerous ongoing research projects aim to increase the accuracy, speed, and adaptability of OCR. Some of the most recent OCR research trends are:

- Deep learning-based OCR: Systems that can accurately recognize characters and words are being developed using deep learning models such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs). To enhance OCR performance, researchers are experimenting with new deep learning architectures, training methodologies, and data augmentation strategies.

- Multimodal OCR: To increase OCR accuracy and make input methods more adaptable, multimodal OCR systems combine image recognition with speech recognition or natural language processing. To improve OCR performance, researchers are investigating new multimodal OCR designs, such as attention-based models.

- Multilingual OCR: OCR systems that can recognize and translate text in a variety of languages are becoming increasingly important in today's globalized society. Researchers are creating multilingual OCR systems using methods such as language modelling, cross-lingual transfer learning, and neural machine translation.

- OCR for low-resource languages: OCR systems for low-resource languages face a number of difficulties, including a lack of standardization and a lack of training data. Researchers are investigating techniques such as transfer learning and unsupervised learning to adapt current OCR systems to low-resource languages.

- OCR for historical documents: OCR systems for historical documents face a number of difficulties, including deterioration, noise, and differences in writing styles. Researchers are improving the accuracy of OCR on historical documents using techniques such as image enhancement, context-based character identification, and crowdsourcing.

In general, ongoing OCR research strives to increase OCR speed, accuracy, and adaptability, and to make OCR available for a wider variety of applications and languages. OCR technology is expected to become more important as digitalization, automation, and data analysis tasks progress.
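Because every trend above is judged by its effect on recognition accuracy, it may help to see one common way accuracy is quantified. The following is a minimal sketch, not the method used in this project: it computes the character error rate (CER), the edit distance between a reference transcription and the OCR output divided by the reference length. The function names `edit_distance` and `character_error_rate` and the sample strings are illustrative assumptions.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, and substitutions turning one string into the other."""
    # previous[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,         # delete a reference character
                current[j - 1] + 1,      # insert a hypothesis character
                previous[j - 1] + cost,  # match or substitute
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance normalised by the reference length (lower is better)."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical ground truth and OCR output, for illustration only.
    truth = "optical character recognition"
    ocr_output = "optical charactcr recog-nition"
    print(f"CER = {character_error_rate(truth, ocr_output):.3f}")
```

A CER of 0 means the OCR output matches the reference exactly. Values can exceed 1 when the output is much longer than the reference, so the metric is best read alongside the raw edit distance.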

In this work, we developed an OCR application in Python. We used well-known libraries for extracting text from images, documents, website URLs, and other sources. These libraries include Apache Tika, requests, warnings, pytesseract, PIL, os, io, pypdf, pdfplumber, flask, OpenCV, pymupdf, scikit-learn, scipy, matplotlib, youtube-dl, and shutil. In this paper, we presented Python as a practical language for both teaching and applied programming. We also examined Python's characteristics, features, and the programming styles it supports. Based on these qualities, we found Python to be a fast, powerful, versatile, simple, open-source language that supports numerous technologies. Various Python projects of different types were then presented. The report has also examined how widely Python is used by various organizations. Drawing on information gathered from well-known and reliable journals and websites, the paper has discussed the reasons why Python is the fastest-growing programming language.


