
TEXT EXTRACTION FROM IMAGES USING COMPUTER VISION
PROJECT DEVELOPMENT LAB REPORT
Submitted by

M.BHUVANAESHWARI
(201061004)
&
J.SANGEERTHANA
(201061021)

in partial fulfillment for the award of the degree

of

BACHELOR OF TECHNOLOGY

in

INFORMATION TECHNOLOGY

IFET COLLEGE OF ENGINEERING

(AN AUTONOMOUS INSTITUTION)

VILLUPURAM 605108

NOVEMBER 2024
IFET COLLEGE OF ENGINEERING
(An Autonomous Institution)

BONAFIDE CERTIFICATE

Certified that this report titled “TEXT EXTRACTION FROM IMAGES USING COMPUTER VISION” is the bonafide work of M.BHUVANAESHWARI (201061004) & J.SANGEERTHANA (201061021), who carried out the work under my supervision. Certified further that, to the best of my knowledge, the work reported herein does not form part of any other thesis or dissertation on the basis of which a degree or award was conferred on an earlier occasion on this or any other candidate.

SIGNATURE
Dr. R. THENDRAL
HEAD OF THE DEPARTMENT
Associate Professor,
Department of IT,
IFET College of Engineering,
Villupuram - 605108

SIGNATURE
Mr. M. ARUNKUMAR, M.E.
SUPERVISOR
Assistant Professor,
Department of IT,
IFET College of Engineering,
Villupuram - 605108

CERTIFICATE OF EVALUATION
College name : IFET College of Engineering, Villupuram.

Branch : B.Tech - IT

Month & Year : November 2024

Sub. Code & Name : 19UITMP501 / PROJECT DEVELOPMENT LAB

Name of the Student : M.BHUVANAESHWARI & J.SANGEERTHANA

Register Number : 201061004 & 201061021

Title of the Project : TEXT EXTRACTION FROM IMAGES USING COMPUTER VISION

Name of the Supervisor with Designation : Mr. M. ARUNKUMAR, Assistant Professor

The report for the Mini Project-II, submitted in fulfillment of the award of the degree of Bachelor of Technology in Information Technology of IFET College of Engineering (Autonomous), permanently affiliated to Anna University, was evaluated and confirmed to be the work done by the above students.

SUPERVISOR HEAD OF THE DEPARTMENT

Submitted for the End Semester examination held on _____________________

INTERNAL EXAMINER EXTERNAL EXAMINER


ACKNOWLEDGEMENT
I thank the Almighty for the blessings showered upon me to bring forth the success of this project. I would like to express my sincere gratitude to our Chairman Mr. K.V. Raja, our Secretary Mr. K. Shivram Alva and our Treasurer Mr. R. Vimal for providing us with an excellent infrastructure and the necessary resources to carry out this project, and I extend my gratitude to our Principal Dr. G. Mahendran for his constant support of our work.

I also take this opportunity to express my sincere thanks to our Vice Principal and Dean Academics Dr. P. Kanimozhi, who provided all the needful help in executing the project successfully.

I wish to express my thanks to our Head of the Department, Dr. R. Thendral, for her persistent encouragement and support in completing this project. I express my heartfelt gratitude to my guide Mr. M. Arunkumar, Assistant Professor, Department of Information Technology, for his priceless guidance and motivation, which helped us bring this project to a perfect shape.

I also thank our faculty advisor Mr. M. Arunkumar, Assistant Professor, Department of Information Technology, who encouraged us in each and every step of this project to complete it successfully.

I also thank our lab technicians and all the staff members of our department for their support and assistance.

Last but not least, I wholeheartedly thank my family and friends for their moral support in tough times and their constructive criticism, which helped me succeed in this work.


ABSTRACT

Optical Character Recognition (OCR) has been a major application of computer vision for the past decade. OCR means converting handwritten, typed, or printed text into machine-readable text. This report describes how OCR systems are currently used, along with their benefits and limitations. The various applications of OCR in data collection, management, and manipulation, such as document scanners, are also described. Tesseract-OCR is an optical character recognition engine available under the Apache 2.0 license. It is compatible with several programming languages and frameworks through wrappers; one such wrapper is Pytesseract. I have created a simple text-recognizing model using Pytesseract and OpenCV that can perform several functions such as detecting characters, detecting words, detecting just digits, converting handwritten text to computer-readable text, and detecting text in multiple languages. The features of this model are also described in this report. OpenCV (Open Source Computer Vision Library) is an open-source computer vision and machine learning software library. OpenCV was built to provide a common infrastructure for computer vision applications and to accelerate the use of machine perception in commercial products. OpenCV makes it easy for businesses to utilize and modify the code.

TABLE OF CONTENTS
CHAPTER NO    TITLE    PAGE NO
ABSTRACT iv
LIST OF FIGURES vi
LIST OF ABBREVIATION vii
1 INTRODUCTION 1
1.1. Introduction 1
1.2. Domain Overview 1
2 LITERATURE REVIEW 3
2.1 Text extraction from images 3
2.2 Text Conversion Topic 4
2.3 Text Detection from Image 4
2.4 Live Text Extraction 4
2.5 Text Localization and Recognition 5
2.6 Existing System 5
2.6.1 Disadvantages of Existing System 5
3 PROPOSED SYSTEM 6
3.1. Proposed System 6
3.2 System Architecture 7
3.3 Advantages of Proposed System 8
3.4 Modules 8
3.4.1 Detection 8
3.4.2 Enhancement 9
3.4.3 Extraction 10
4 SYSTEM REQUIREMENTS 11
4.1 Software Requirements 11
4.2 Hardware Requirements 11
5 CONCLUSION AND FUTURE WORK 12

APPENDIX-I 13
APPENDIX-II 20
REFERENCES 24

LIST OF FIGURES
FIGURE NO TITLE PAGE NO
1 System Architecture 7
2 Example for text extraction 20
3 Reading image from URL 20
4 Extracting text and removing irrelevant symbols 21
5 Greyscale image 21
6 Rectangle around the text 22
7 Highlighting a specific pattern or word 22
8 Final output 23

LIST OF ABBREVIATIONS

OCR - Optical Character Recognition


CNN - Convolutional Neural Network
API - Application Programming Interface
OPENCV - Open Source Computer Vision Library
LSTM - Long Short-Term Memory

1.INTRODUCTION
1.1 INTRODUCTION

Text data present in images contains useful information for automatic annotation, indexing, and structuring of images. Extraction of this information involves detection, localization, tracking, extraction, enhancement, and recognition of the text from a given image. However, variations of text due to differences in size, style, orientation, and alignment, as well as low image contrast and complex backgrounds, make the problem of automatic text extraction extremely challenging. While comprehensive surveys of related problems such as face detection, document analysis, and image indexing can be found, the problem of text information extraction is not well surveyed. A large number of techniques have been proposed to address this problem, and the purpose here is to classify and review these algorithms, discuss benchmark data and performance evaluation, and point out promising directions for future research.

Content-based image indexing refers to the process of attaching labels to images based on their content, which can be divided into two main categories: perceptual content and semantic content. Perceptual content includes attributes such as color, intensity, shape, texture, and their temporal changes, whereas semantic content refers to objects and their relations. A number of studies on the use of relatively low-level perceptual content for image and video indexing have already been reported. Studies on semantic image content in the form of text, faces, vehicles, and human actions have also attracted recent interest. Among them, text within an image is of particular interest: it can be easily extracted compared to other semantic content, and it enables applications such as keyword-based image search, automatic video logging, and text-based image indexing.

1.2 DOMAIN OVERVIEW

Extracting text from images of documents is a computer vision task that enables computers and systems to derive meaningful information. OpenCV (Open Source Computer Vision Library) is a library of programming functions mainly aimed at real-time computer vision. OpenCV in Python helps to process an image and apply various functions such as resizing the image, pixel manipulation, object detection, etc. In this report, we will learn how to use contours to detect the text in an image and save it to a text file, as sketched below.
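As a brief illustration, the following is a minimal sketch of this contour-based approach, assuming OpenCV and pytesseract are installed; the file names and kernel size are placeholder assumptions, not fixed parts of the method.

# A minimal sketch: isolate text blocks with contours, OCR each block,
# and save the result to a text file. File names are placeholders.
import cv2
import pytesseract

img = cv2.imread('sample.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
_, thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

# Dilate so nearby characters merge into word/line blobs
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (18, 18))
dilated = cv2.dilate(thresh, kernel, iterations=1)

contours, _ = cv2.findContours(dilated, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)

with open('recognized.txt', 'w') as f:
    for cnt in contours:
        x, y, w, h = cv2.boundingRect(cnt)        # box around each text block
        roi = img[y:y + h, x:x + w]
        f.write(pytesseract.image_to_string(roi)) # OCR each block separately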

Computer vision tasks include methods for acquiring, processing, analyzing and understanding digital images, and extracting high-dimensional data from the real world in order to produce numerical or symbolic information, e.g. in the form of decisions. Understanding in this context means the transformation of visual images into editable and reusable text.

The image data can take many forms, such as video sequences, views from multiple cameras, multi-dimensional data from a 3D scanner, or medical scanning devices. Text extraction is very useful for users, saving the time and effort of typing from an image.

There is a variety of text present on boards, newspapers, books, websites, etc. Printed texts comprise various fonts such as Latin fonts, cursive fonts, old English fonts, etc., and come in various styles such as bold and italics. Handwritten texts are also distinctive, as every person has a unique writing style. So, expecting OCR to recognize all characters is difficult. In general, there are two different ways to solve this problem: either completely recognize the characters (pattern recognition), or detect the individual lines and dashes from which the characters are made (feature detection) and identify them in this way.

2.LITERATURE REVIEW

Long Short-Term Memory (LSTM) networks have shown exceptional handwriting recognition results. The paper “High-Performance OCR for Printed English and Fraktur using LSTM Networks” explains how bidirectional LSTM networks were used to solve the challenge of machine-printed Latin and Fraktur recognition. Latin and Fraktur recognition differ substantially from handwriting recognition in terms of both the statistical characteristics of the data and the far greater degrees of accuracy required. Because the precise location and baseline of handwritten letters vary, LSTM network applications for handwriting recognition employ two-dimensional recurrent networks. In contrast, for printed OCR, the authors employed a one-dimensional recurrent network in conjunction with a new baseline and x-height normalization technique. For training and testing, a variety of datasets were employed, including the UW3 database, artificially produced and degraded Fraktur text, and scanned pages from a book digitization effort.

2.1 Text extraction from images

AUTHOR: Sahana K Adanthaya

YEAR: 2020

This section provides a brief overview of the existing work carried out in the field of text recognition, which has been studied for a very long time. Here, images with colorful backgrounds are considered, and a preprocessing method is described which improves the performance of the Tesseract Optical Character Recognition (OCR) engine. First, text segmentation is performed to separate the text from the colorful background by dividing the original image into sub-images. Then a classifier recognizes the images containing text. Employing this preprocessing yielded an improvement of about 20% over baseline Tesseract OCR performance.

2.2 Text Conversion Topic

AUTHOR: Teena Varma, Stephen S Madari, Lenita L Montheiro, Rachna S Pooojary

YEAR: 2020

The recent technological advancements in the fields of Image Processing and Natural Language Processing are focused on developing smart systems to improve the quality of life. In this work, an effective approach is suggested for text recognition and extraction from images and for text-to-speech conversion. The text regions of the enhanced image are detected by employing the Maximally Stable Extremal Regions (MSER) feature detector, as sketched below.
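For illustration only, the following is a hedged sketch of MSER-based text-region detection with OpenCV; it is not the cited authors' code, and the file names are assumptions.

# A sketch of MSER candidate-region detection; boxes mark likely characters.
import cv2

img = cv2.imread('sample.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

mser = cv2.MSER_create()                   # Maximally Stable Extremal Regions
regions, boxes = mser.detectRegions(gray)  # candidate character regions

for (x, y, w, h) in boxes:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 1)

cv2.imwrite('mser_regions.png', img)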

2.3 Text Detection from Image

AUTHOR: Gourav Chakraborty

YEAR:2020

Text detection from images and recognition as a whole have enormous relevance from the point of view of applications such as content-based image retrieval, document indexing and identification.

2.4 Live Text Extraction

AUTHOR: Sangitha Roy

YEAR: 2021

Apple's ‘Live Text’ feature is similar to how Google Lens works on Android phones, and on the Google Search and Photos apps on iOS. With Live Text, iOS will now recognise any text in a photo, screenshot, or camera preview thanks to optical character recognition (OCR), allowing Apple to extract text from any image.

2.5 Text Localization and Recognition

AUTHOR: Saradindu Panda

YEAR: 2020

Published in the International Journal for Research in Applied Science and Engineering Technology (IJRASET), this work traces how Optical Character Recognition emanates from technologies involving telegraphy and reading devices for the blind. OCR is defined as the electronic or mechanical conversion of handwritten script or printed text images into machine-encoded text.

2.6 EXISTING SYSTEM

In the current world there is a growing demand from users to convert printed documents, images and videos into text; the basic text extraction system was invented to convert such data into text. The existing text extraction system in computer vision is just a text extraction system without any OpenCV functionality. That is, the existing system deals with homogeneous character recognition, i.e. character recognition of a single language. It only has the capability to recognize, extract and convert images of English or of one other single language. In other words, the older text extraction system is monolingual.

2.6.1 DISADVANTAGES OF EXISTING SYSTEM

 OCR works efficiently with printed text only and not with handwritten text; handwriting must first be learned by the system.
 The image produced requires a lot of storage space.
 The quality of the image can be lost during this process.
 It is not 100% accurate; some mistakes are likely to be made during the process, and it is not worth doing for small amounts of text.
 It extracts text only from images of English or another single language, not multiple languages.

3.PROPOSED SYSTEM
3.1 PROPOSED SYSTEM

The text extractor will use the Tesseract OCR engine to extract the image text. The web application will contain a section to submit/upload an image, which will then go through our text extraction program; after the program outputs the result, Flask will serve an API request to get the output text and display it on the web page (a minimal sketch follows). Processing of OCR information is fast because large amounts of text can be entered quickly. Often, a paper form is turned into an electronic form that is easy to distribute, and advanced versions can even recreate tables, columns, and flowcharts. It is cheaper than paying someone to manually enter a large amount of text data, and the output is editable and reusable for users. A great deal of study has previously been done on several essential elements of handwritten character recognition systems. The text conversion approach incorporates picture improvement by realignment, rescaling, and resizing of the provided image, as well as noise removal from the image using sophisticated algorithms. Furthermore, Tesseract's training with a self-adapted dataset for handwritten characters has a substantial impact on the final output. Using OpenCV functions gives substantially more accurate results. OCR engines only understand information that is organized, and this is exactly where OpenCV comes into the picture: Tesseract requires a clean image to detect the text, so OpenCV plays an important role by performing operations on the image such as converting a colored image to a binary image, adjusting the contrast, edge detection, and many more.
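The following is a minimal sketch of the described upload-and-extract flow, assuming Flask and pytesseract are installed; the route name and form field are illustrative assumptions rather than the project's exact interface.

# A sketch of the Flask flow: receive an uploaded image, OCR it, return text.
from flask import Flask, request
from PIL import Image
import pytesseract

app = Flask(__name__)

@app.route('/extract', methods=['POST'])
def extract_text():
    # The uploaded file arrives in the multipart form field 'image' (assumed name)
    uploaded = request.files['image']
    image = Image.open(uploaded.stream)
    text = pytesseract.image_to_string(image)  # Tesseract OCR on the upload
    return {'text': text}                      # Flask serializes dicts to JSON

if __name__ == '__main__':
    app.run(debug=True)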

3.2 SYSTEM ARCHITECTURE

Input (images/video) → Text Detection → Text Localization → Text Tracking → Text Extraction and Enhancement → Recognition (OCR) → Output (text)

Fig.1. System Architecture

The three basic steps involved in this process are detection, enhancement and extraction. This diagram defines the structure of the system; a compact sketch of the pipeline is given below.
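The following few lines are a compact, hedged sketch of the Fig.1 pipeline with illustrative parameters, not the complete system.

# Pipeline sketch: input image -> enhancement -> recognition -> output text.
import cv2
import pytesseract

img = cv2.imread('sample.png')                 # input: image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)   # enhancement: grayscale
gray = cv2.medianBlur(gray, 3)                 # enhancement: denoise
_, binary = cv2.threshold(gray, 0, 255,
                          cv2.THRESH_BINARY + cv2.THRESH_OTSU)  # binarize

text = pytesseract.image_to_string(binary)     # recognition (OCR)
print(text)                                    # output: text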

3.3 ADVANTAGES OF PROPOSED SYSTEM:
 Processing of OCR information is fast; large quantities of text can be input quickly.
 The process is much faster than manually typing the information into the system. Advanced versions can even recreate tables, columns and flowcharts.
 OCR output can be read with a high degree of accuracy. Flatbed scanners are very accurate and can produce reasonably high-quality images.
 It is cheaper than paying someone to manually enter a great deal of text data. Moreover, it takes less time to convert into electronic form.
3.4 MODULES
• Detection
• Enhancement
• Extraction
3.4.1 DETECTION

The process of detecting and extracting text from an image is known as Optical Character Recognition (OCR). The Analyze Image step makes use of OCR to allow users to extract and post-process textual data detected within images. The paper document is generally scanned by an optical scanner and converted into the form of a picture. At this stage we have the data in the form of an image, and this image can be further analysed so that the important information can be retrieved. The image resulting from the scanning process may contain a certain amount of noise. Depending on the resolution of the scanner and the success of the applied thresholding technique, the characters may be smeared or broken. Some of these defects, which may later cause poor recognition rates, can be eliminated by using a pre-processor to smooth the digitized characters. Smoothing implies both filling and thinning: filling eliminates small breaks, gaps and holes in the digitized characters, while thinning reduces the width of the line. The most common smoothing techniques move a window across the binary image of the character, applying certain rules to the contents of the window. So, to improve the quality of the input image, a few operations are performed for image enhancement, such as noise removal, normalization, binarization, etc., as sketched below.
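A hedged sketch of this smoothing step follows, using morphological closing for "filling" and erosion for "thinning"; the kernel size and file names are illustrative assumptions.

# Smoothing a binarized character image: fill gaps, then thin strokes.
import cv2
import numpy as np

binary = cv2.imread('scanned_char.png', cv2.IMREAD_GRAYSCALE)
_, binary = cv2.threshold(binary, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

kernel = np.ones((3, 3), np.uint8)

# Filling: morphological closing bridges small breaks and fills holes
filled = cv2.morphologyEx(binary, cv2.MORPH_CLOSE, kernel)

# Thinning: erosion reduces the width of the character strokes
thinned = cv2.erode(filled, kernel, iterations=1)

cv2.imwrite('smoothed.png', thinned)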

3.4.2 ENHANCEMENT

Image enhancement works either by direct manipulation of pixels in an image (on the image plane) or by modifying the Fourier transform of the image. Its goals include highlighting interesting detail in images, removing noise from images, and making images more visually appealing.

All enhancement classes implement a typical interface containing one method, named enhance(factor). The enhancement can be brightness, color, contrast, or sharpness. Parameter: the enhance() method takes just one parameter, factor, i.e. a floating-point number. Return type: this method returns an enhanced image (a sketch follows).

When you scan images, you can sharpen the text and increase accuracy by using the Text Enhancement feature in Epson Scan; you can enhance text only when you scan at a suitable resolution. While primary images are grayscale or color, black-and-white (B/W) OCR images are generated for OCR purposes, and you can view and modify these. Although the OCR process tolerates low-quality images, they should preferably contain well-formed characters without “noise” (e.g. spots, smudges or marginal shadow lines). You can use the following three tools on the SET toolbar to enhance an image for OCR purposes: Despeckle, OCR Brightness and Dropout Color. Changes are applied to the whole image unless areas are selected. The brightness tool has an effect on the image, but is useful only when the primary image is color or grayscale, because the program generates a new OCR image using the changed setting. It cannot improve quality when the primary image is already black-and-white; in those cases, you should rescan the document.
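As an illustration of the enhance(factor) interface described above, the following short sketch uses Pillow's ImageEnhance classes (Brightness, Color, Contrast, Sharpness); the factor values are arbitrary examples.

# enhance(factor): > 1.0 strengthens the property, < 1.0 weakens it,
# and 1.0 returns the original image unchanged.
from PIL import Image, ImageEnhance

img = Image.open('sample.png')

brighter = ImageEnhance.Brightness(img).enhance(1.3)   # lighten slightly
sharper = ImageEnhance.Sharpness(brighter).enhance(2.0)  # sharpen text edges

sharper.save('enhanced.png')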

Brightness plays an important role in OCR accuracy. After loading an image, check its appearance. If characters are thick and touching, lighten the brightness; if characters are thin and broken, darken it. Use the OCR Brightness tool to optimize the image. The OCR Brightness tool also works on selected image areas, so brightness can be adjusted in different ways on different parts of an image. Adjusting brightness relates to both characters and background; generally, image margins are darker.

Dropout color is used for preprinted colored forms where a different color is used for the fixed texts. This allows just the respondent data to be recognized, without the form instructions, item names, boxes and other shapes.

You can select a predefined color (red, green or blue), or a colored area in the image. Use the Select Area tool to draw a rectangle including the page background color and the color to be dropped. The selected color will become invisible in the image, as in the sketch below.
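A hedged OpenCV sketch of the dropout-color idea follows; the red HSV range and file names are illustrative assumptions, and real forms would need tuned bounds (red hue also wraps around 180 in OpenCV).

# Drop a chosen form color: pixels near the color are replaced with white,
# so only the respondent data survives for OCR.
import cv2
import numpy as np

img = cv2.imread('form.png')
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

# Illustrative HSV range for red form lines (assumed values)
lower = np.array([0, 70, 50])
upper = np.array([10, 255, 255])
mask = cv2.inRange(hsv, lower, upper)  # 255 where the dropout color appears

img[mask > 0] = (255, 255, 255)        # make the selected color "invisible"
cv2.imwrite('dropped.png', img)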
3.4.3 EXTRACTION

Feature extraction is a process of dimensionality reduction by which an initial set of raw data is reduced to more manageable groups for processing. A characteristic of these large data sets is a large number of variables that require a lot of computing resources to process. Feature extraction is the name for methods that select and/or combine variables into features, effectively reducing the amount of data that must be processed while still accurately and completely describing the original data set. The objective of feature extraction is to capture the essential characteristics of the symbols, and it is generally accepted that this is one of the most difficult problems in pattern recognition. The most straightforward way of describing a character is by the actual raster image. Another approach is to extract certain features that still characterize the symbols but leave out the unimportant attributes. The techniques for extracting such features are often divided into three main groups, based on where the features are found: the distribution of points, transformations and series expansions, and structural analysis. The different groups of features may be evaluated according to their sensitivity to noise and deformation, and the ease of implementation and use.

Extraction and recognition of text from images is an important step in building efficient indexing and retrieval systems for multimedia databases. Our primary objective is to make an unconstrained image indexing and retrieval system using a neural network. We adopt HSV-based approaches for color reduction (a sketch follows); this approach shows impressive results.
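A minimal sketch of HSV-based color reduction is given below; the bin widths are illustrative assumptions, not the exact values used in any cited system.

# Map the image to HSV and quantize each channel into coarse bins,
# reducing the number of distinct colors before further processing.
import cv2
import numpy as np

img = cv2.imread('sample.png')
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV).astype(np.int32)

h = (hsv[:, :, 0] // 15) * 15   # hue is 0..179 in OpenCV -> 12 bins
s = (hsv[:, :, 1] // 64) * 64   # saturation -> 4 bins
v = (hsv[:, :, 2] // 64) * 64   # value -> 4 bins

reduced = cv2.cvtColor(np.dstack([h, s, v]).astype(np.uint8), cv2.COLOR_HSV2BGR)
cv2.imwrite('reduced.png', reduced)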

4.SYSTEM REQUIREMENTS
4.1 SOFTWARE REQUIREMENTS
 Jupyter Notebook (Anaconda3)
 Tesseract
 OpenCV

4.2 HARDWARE REQUIREMENTS
• RAM : 512 MB and above
• Hard disk : 80GB and above
• Monitor : CRT or LCD monitor
• Keyboard : Normal or Multimedia
• Mouse : Compatible mouse

5.CONCLUSION

5.1 CONCLUSION

Nowadays, applications need several kinds of images as sources of information for elucidation and analysis. When an image is transformed from one form to another, such as by digitizing, scanning, communicating or storing, degradation occurs. Therefore, the output image has to undergo a process called image enhancement, which consists of a group of methods that seek to improve the visual appearance of an image. Image enhancement fundamentally improves the interpretability or awareness of information in images for human viewers and provides better input for other automatic image processing systems. OpenCV image processing is a tool for the preparation of expert knowledge and the combination of imprecise information from different sources. The intended OpenCV functions are an attractive way to improve the quality of edges as much as possible.

5.2 FUTURE WORK

The proposed OpenCV functions used for text extraction can be extended further to the recognition of other Indian scripts. The approach can be modified further to improve the accuracy of segmentation, and new features can be added to improve the accuracy of recognition. These algorithms can be tried on large databases of handwritten text; there is a need to develop a standard database for text recognition. The proposed work can be extended to work on degraded text or broken characters. Recognition of digits in the text, half characters and compound characters can be done to improve the word recognition rate. The extracted text can be further converted to audio so as to help visually challenged (blind) people easily understand the text that has been extracted from the image, as in the sketch below.
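As a pointer to this extension, the following is a hedged sketch using the pyttsx3 offline text-to-speech library; pyttsx3 is an assumed choice for illustration, not part of the current system.

# Speak the text extracted from an image (requires: pip install pyttsx3).
import pytesseract
import pyttsx3
from PIL import Image

text = pytesseract.image_to_string(Image.open('sample.png'))  # extracted text

engine = pyttsx3.init()   # offline text-to-speech engine
engine.say(text)          # queue the extracted text for speech
engine.runAndWait()       # speak and block until done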

APPENDIX-I

(SOURCE CODE)

# In[1]:

# Import requests to download the Tesseract language data
import requests

# In[2]:

# Downloading the tesseract-ocr trained data file
r = requests.get("https://raw.githubusercontent.com/tesseract-ocr/tessdata/4.00/ind.traineddata", stream=True)

# Writing data to a file to avoid path issues
with open("E:/optical/ind.traineddata", "wb") as file:
    for block in r.iter_content(chunk_size=1024):
        if block:
            file.write(block)

# In[3]:

# Installing libraries required for optical character recognition
get_ipython().system('apt install tesseract-ocr libtesseract-dev libmagickwand-dev')

# Importing IPython to clear output which is not important
from IPython.display import HTML, clear_output
clear_output()

# In[4]:

# Installing pytesseract and opencv
get_ipython().system('pip install pytesseract wand opencv-python')
clear_output()

# In[5]:

# Import libraries
from PIL import Image
import pytesseract
import cv2
import numpy as np
from pytesseract import Output
import re

# In[6]:

# Reading an image from a URL
image = Image.open(requests.get('https://i.stack.imgur.com/pbIdS.png', stream=True).raw)
image = image.resize((300, 150))
image.save('sample.png')
image

# In[7]:

# Simply extracting text from the image
custom_config = r'-l eng --oem 3 --psm 6'
pytesseract.pytesseract.tesseract_cmd = 'C:/Program Files/Tesseract-OCR/tesseract.exe'
text = pytesseract.image_to_string(image, config=custom_config)
print(text)

# In[8]:

# Extracting text from the image and removing irrelevant symbols
try:
    text = pytesseract.image_to_string(image, lang="eng")
    characters_to_remove = "!()@—*“>+-/,'|£#%$&^_~"
    new_string = text
    for character in characters_to_remove:
        new_string = new_string.replace(character, "")
    print(new_string)
except IOError as e:
    print("Error (%s)." % e)

# In[9]:

# Now we will perform OpenCV operations to get text from complex images
image = cv2.imread('sample.png')

# In[10]:

# Get grayscale image
def get_grayscale(image):
    return cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

gray = get_grayscale(image)
Image.fromarray(gray)

# In[11]:

# Noise removal with a median filter
def remove_noise(image):
    return cv2.medianBlur(image, 5)

noise = remove_noise(gray)
Image.fromarray(noise)

# In[12]:

# Thresholding (Otsu binarization)
def thresholding(image):
    return cv2.threshold(image, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]

thresh = thresholding(gray)
Image.fromarray(thresh)

# In[13]:

# Erosion
def erode(image):
    kernel = np.ones((5, 5), np.uint8)
    return cv2.erode(image, kernel, iterations=1)

eroded = erode(gray)
Image.fromarray(eroded)

# In[14]:

# Morphological opening (erosion followed by dilation)
def opening(image):
    kernel = np.ones((5, 5), np.uint8)
    return cv2.morphologyEx(image, cv2.MORPH_OPEN, kernel)

opened = opening(gray)
Image.fromarray(opened)

# In[15]:

# Canny edge detection
def canny(image):
    return cv2.Canny(image, 100, 200)

edges = canny(gray)
Image.fromarray(edges)

# In[16]:

# Skew correction: rotate the image so the text baseline becomes horizontal
def deskew(image):
    coords = np.column_stack(np.where(image > 0))
    angle = cv2.minAreaRect(coords)[-1]
    if angle < -45:
        angle = -(90 + angle)
    else:
        angle = -angle
    (h, w) = image.shape[:2]
    center = (w // 2, h // 2)
    M = cv2.getRotationMatrix2D(center, angle, 1.0)
    rotated = cv2.warpAffine(image, M, (w, h), flags=cv2.INTER_CUBIC,
                             borderMode=cv2.BORDER_REPLICATE)
    return rotated

rotated = deskew(gray)
Image.fromarray(rotated)

# In[17]:

# Template matching
def match_template(image, template):
    return cv2.matchTemplate(image, template, cv2.TM_CCOEFF_NORMED)

match = match_template(gray, gray)
match

# In[18]:

# Drawing rectangles around detected characters
img = cv2.imread('sample.png')
h, w, c = img.shape
boxes = pytesseract.image_to_boxes(img)
for b in boxes.splitlines():
    b = b.split(' ')
    # image_to_boxes reports coordinates from the bottom-left corner,
    # so the y values are flipped against the image height
    img = cv2.rectangle(img, (int(b[1]), h - int(b[2])),
                        (int(b[3]), h - int(b[4])), (0, 255, 0), 2)
Image.fromarray(img)

# In[19]:

# Highlighting a specific pattern or word
img = cv2.imread('sample.png')
d = pytesseract.image_to_data(img, output_type=Output.DICT)
keys = list(d.keys())
date_pattern = 'artificially'
n_boxes = len(d['text'])
for i in range(n_boxes):
    if int(d['conf'][i]) > 60:
        if re.match(date_pattern, d['text'][i]):
            (x, y, w, h) = (d['left'][i], d['top'][i], d['width'][i], d['height'][i])
            img = cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
Image.fromarray(img)

APPENDIX-II

(SNAPSHOTS)

Fig.2: Example for text extraction

Fig.3: Reading image from URL

Fig.4: Extracting text and removing irrelevant symbols

Fig.5: Greyscale image

Fig.6: Rectangle around the text

Fig.7: Highlighting a specific pattern or word

Fig.8: Final output
REFERENCES

[1] Rajesh Shreedhar Bhat, “Text Extraction from Product Images Using AI”, 2020.
[2] T. Som, Sumit Saha, “Handwritten Character Recognition Using Fuzzy Membership Function”, International Journal of Emerging Technologies in Sciences and Engineering, Volume 5, December 2011.
[3] L. A. Zadeh, “Fuzzy Sets”, Information and Control 8 (1965) 338–353; Eran Gur and Zeev Zelavsky, “Retrieval of Rashi Semi-Cursive Handwriting via Fuzzy Logic”, IEEE International Conference on Frontiers in Handwriting Recognition (ICFHR), 2012.
[4] Thomas Natschlager, “Optical Character Recognition”, a tutorial for the course Computational Intelligence.
[5] Andrei Polzounov, Artsiom Ablavatski, Sergio Escalera, Shijian Lu, Jianfei Cai, “WordFence: Text Detection in Natural Images with Border Awareness”.
[6] D. Trier, A. K. Jain, T. Taxt, “Feature Extraction Methods for Character Recognition – A Survey”, Pattern Recognition.
[7] C.-A. Boiangiu, R. Ioanitescu and R.-C. Dragomir, “Voting-Based OCR System”, Journal of Information Systems & Operations Management, 2016.
[8] Teena Varma, Stephen S Madari, Lenita L Montheiro, Rachna S Pooojary, “Extraction from Image and Detection”, 2021.
[9] Sahana K Adanthaya, “Text Extraction from Images”, 2020.
[10] S. Shams, “Machine Learning Medium”, 15 January 2021: an investigation on feature and text extraction from images using image recognition in Android.
[11] H. Li, P. Wang and C. Shen, “Towards End-to-End Car License Plates Detection and Recognition with Deep Neural Networks”, arXiv:1709.08828 [cs.CV], p. 9, 2017.
[12] F. Yin, Y.-C. Wu, X.-Y. Zhang and C.-L. Liu, “Scene Text Recognition with Sliding Convolutional Character Models”, arXiv:1709.01727v1, p. 10, 2017.
[13] K. Chaudhary and R. Bali, “EASTER: Efficient and Scalable Text Recognizer”, arXiv:2008.07839v2 [cs.CV], p. 9, 2020.
