OpenCV OCR and Text Recognition With Tesseract - PyImageSearch
A few weeks ago I showed you how to perform text detection using OpenCV’s EAST deep learning model. Using this model we were able to detect and localize the bounding box coordinates of text contained in an image.
https://www.pyimagesearch.com/2018/09/17/opencv-ocr-and-text-recognition-with-tesseract/ 1/65
8/27/2019 OpenCV OCR and text recognition with Tesseract - PyImageSearch
The next step is to take each of these areas containing text and actually recognize and OCR the text using OpenCV and Tesseract.
To learn how to build your own OpenCV OCR and text recognition system, just keep reading!
From there, I’ll show you how to write a Python script that:
1. Performs text detection using OpenCV’s EAST text detector, a highly accurate deep learning text
detector used to detect text in natural scene images.
2. Once we have detected the text regions with OpenCV, we’ll then extract each of the text ROIs and
pass them into Tesseract, enabling us to build an entire OpenCV OCR pipeline!
Finally, I’ll wrap up today’s tutorial by showing you some sample results of applying text recognition with
OpenCV, as well as discussing some of the limitations and drawbacks of the method.
Figure 1: The Tesseract OCR engine has been around since the 1980s. As of 2018, it now includes built-in deep learning capability making it a robust OCR tool (just keep in mind that no OCR system is perfect). Using Tesseract with OpenCV’s EAST detector makes for a great
Tesseract, a highly popular OCR engine, was originally developed by Hewlett Packard in the 1980s and
was then open-sourced in 2005. Google adopted the project in 2006 and has been sponsoring it ever
since.
If you’ve read my previous post on Using Tesseract OCR with Python, you know that Tesseract can work
very well under controlled conditions…
…but will perform quite poorly if there is a significant amount of noise or your image is not properly
preprocessed and cleaned before applying Tesseract.
Just as deep learning has impacted nearly every facet of computer vision, the same is true for character
recognition and handwriting recognition.
Deep learning-based models have managed to obtain unprecedented text recognition accuracy, far
beyond traditional feature extraction and machine learning approaches.
It was only a matter of time until Tesseract incorporated a deep learning model to further boost OCR
accuracy — and in fact, that time has come.
The latest release of Tesseract (v4) supports deep learning-based OCR that is significantly more
accurate.
The underlying OCR engine itself utilizes a Long Short-Term Memory (LSTM) network, a kind of
Recurrent Neural Network (RNN).
In the remainder of this section, you will learn how to install Tesseract v4 on your machine.
Later in this blog post, you’ll learn how to combine OpenCV’s EAST text detection algorithm with Tesseract v4 in a single Python script to automatically perform OpenCV OCR.
Let’s get started configuring your machine!
Install OpenCV
To run today’s script you’ll need OpenCV installed. Version 3.4.2 or better is required.
To install OpenCV on your system, just follow one of my OpenCV installation guides, ensuring that you download the correct/desired version of OpenCV and OpenCV-contrib in the process.
The exact commands used to install Tesseract 4 on Ubuntu will be different depending on whether you are using Ubuntu 18.04 or Ubuntu 17.04 and earlier.
To check your Ubuntu version you can use the lsb_release command:
As you can see, I am running Ubuntu 18.04 but you should check your Ubuntu version before
continuing.
For Ubuntu 18.04 users, Tesseract 4 is part of the main apt-get repository, making it super easy to install
Tesseract via the following command:
If you’re using Ubuntu 14, 16, or 17 though, you’ll need a few extra commands due to dependency
requirements.
The good news is that Alexander Pozdnyakov has created an Ubuntu PPA (Personal Package Archive)
for Tesseract, which makes it super easy to install Tesseract 4 on older versions of Ubuntu.
Just add the alex-p/tesseract-ocr PPA repository to your system, update your package definitions,
and then install Tesseract:
Assuming there are no errors, you should now have Tesseract 4 installed on your machine.
Install Tesseract 4 on macOS
Installing Tesseract on macOS is straightforward provided you have Homebrew, macOS’ “unofficial” package manager, installed on your system.
Just run the following command, making sure to specify the --HEAD switch, and Tesseract v4 will be installed on your Mac:

    $ brew install tesseract --HEAD

If you already have Tesseract installed on your Mac (if you followed my previous Tesseract install tutorial, for example), you’ll first want to unlink the original install:

    $ brew unlink tesseract

And from there you can run the install command.
Verify your Tesseract version
Figure 2: Screenshot of my system terminal where I have entered the tesseract -v command to query for the
version. I have verified that I have Tesseract 4 installed.
Once you have Tesseract installed on your machine you should execute the following command to verify
your Tesseract version:
As long as you see tesseract 4 somewhere in the output you know that you have the latest version of Tesseract installed on your system.
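If you would rather perform this check from Python, here is a minimal sketch that runs the same tesseract -v command and pulls out the major version number. The function names are my own for illustration, and the regular expression assumes the version line starts with the word "tesseract" followed by the version, so adjust it if your build prints something different:

```python
import re
import subprocess


def parse_tesseract_major(version_output):
    # look for "tesseract 4.x" (or "tesseract v4.x") anywhere in the output
    match = re.search(r"tesseract\s+v?(\d+)", version_output)
    return int(match.group(1)) if match else None


def installed_tesseract_major(binary="tesseract"):
    # some builds print the version to stderr, so merge both streams;
    # FileNotFoundError is raised if the binary is not on your PATH
    proc = subprocess.run([binary, "-v"], stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT, universal_newlines=True)
    return parse_tesseract_major(proc.stdout)
```

Calling installed_tesseract_major() on a correctly configured machine should return 4 (or higher if you have a newer release installed).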
Install your Tesseract + Python bindings
Now that we have the Tesseract binary installed, we now need to install the Tesseract + Python bindings so our Python scripts can communicate with Tesseract and perform OCR on images processed by OpenCV.
If you are using a Python virtual environment (which I highly recommend so you can have separate, independent Python environments) use the workon command to access your virtual environment:

    $ workon cv

In this case, I am accessing a Python virtual environment named cv (short for “computer vision”); you can replace cv with whatever you have named your virtual environment.
From there, we’ll use pip to install Pillow, a more Python-friendly version of PIL, followed by pytesseract and imutils:
Now open up a Python shell and confirm that you can import both OpenCV and pytesseract :
Congratulations!
If you don’t see any import errors, your machine is now configured to perform OCR and text recognition with OpenCV.
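As a quick sanity check outside of the interactive shell, the sketch below reports which of the required packages cannot be found in the current environment. The helper name missing_modules is hypothetical, and note that Pillow is imported under the name PIL:

```python
import importlib.util


def missing_modules(names):
    # find_spec returns None when a top-level module cannot be located
    return [name for name in names if importlib.util.find_spec(name) is None]


# module names used by today's script (Pillow is imported as "PIL")
REQUIRED = ["cv2", "pytesseract", "imutils", "PIL", "numpy"]
```

Running print(missing_modules(REQUIRED)) inside your virtual environment should print an empty list if everything is installed.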
Let’s move on to the next section (skipping the Pi instructions) where we’ll learn how to actually
implement a Python script to perform OpenCV OCR.
The following instructions aren’t for the faint of heart — you may run into problems. They are tested, but
mileage may vary on your own Raspberry Pi.
First, uninstall your OpenCV bindings from system site packages:

    $ sudo rm /usr/local/lib/python3.5/site-packages/cv2.so

Here I used the rm command since my cv2.so file in site-packages is just a sym-link. If the cv2.so bindings are your real OpenCV bindings then you may want to move the file out of site-packages for safe keeping.
Now install two QT packages on your system:

    $ sudo apt-get install libqtgui4 libqt4-test

Then, install tesseract via Thortex’s GitHub:

    $ cd ~
    $ git clone https://github.com/thortex/rpi3-tesseract
    $ cd rpi3-tesseract/release
    $ ./install_requires_related2leptonica.sh
    $ ./install_requires_related2tesseract.sh
    $ ./install_tesseract.sh

For whatever reason, the trained English language data file was missing from the install so I needed to
download and move it into the proper directory:
You’re done! Just keep in mind that your experience may vary.
We’ll extract each of these ROIs and then pass them into Tesseract v4’s LSTM deep learning text recognition algorithm.
The output of the LSTM will give us our actual OCR results.
Finally, we’ll draw the OpenCV OCR results on our output image.
But before we actually get to our project, let’s briefly review the Tesseract command (which will be called
under the hood by the pytesseract library).
When calling the tesseract binary we need to supply a number of flags. The three most important ones
are -l , --oem , and --psm .
The -l flag controls the language of the input text. We’ll be using eng (English) for this example but
you can see all the languages Tesseract supports here.
The --oem argument, or OCR Engine Mode, controls the type of algorithm used by Tesseract.
You can see the available OCR Engine Modes by executing the following command:
We’ll be using --oem 1 to indicate that we wish to use the deep learning LSTM engine only.
The final important flag, --psm controls the automatic Page Segmentation Mode used by Tesseract:
Whenever you find yourself obtaining incorrect OCR results I highly recommend adjusting the --psm as it can have dramatic influences on your output OCR results.
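To make the three flags concrete, here is a small sketch that assembles them into the single config string pytesseract expects. The helper name build_tesseract_config is my own, and the default of --psm 7 (treat the image as a single line of text) is an assumption that suits cropped word/line ROIs; the allowed ranges reflect the modes listed by Tesseract 4:

```python
def build_tesseract_config(lang="eng", oem=1, psm=7):
    # Tesseract 4 exposes OCR Engine Modes 0-3 and Page Segmentation
    # Modes 0-13; reject anything outside of those ranges early
    if oem not in range(0, 4):
        raise ValueError("--oem must be between 0 and 3")
    if psm not in range(0, 14):
        raise ValueError("--psm must be between 0 and 13")

    # language, engine mode (1 = LSTM only), and page segmentation mode
    return "-l {} --oem {} --psm {}".format(lang, oem, psm)
```

If your results look wrong, try a different psm value (for example, build_tesseract_config(psm=6) for a uniform block of text) before changing anything else.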
Project structure
Be sure to grab the zip from the “Downloads” section of the blog post.
From there unzip the file and navigate into the directory. The tree command allows us to see the
directory structure in our terminal:
images/ : A directory containing six test images containing scene text. We will attempt OpenCV
OCR with each of these images.
frozen_east_text_detection.pb : The EAST text detector. This CNN is pre-trained for text
detection and ready to go. I did not train this model — it is provided with OpenCV; I’ve also included
it in the “Downloads” for your convenience.
text_recognition.py : Our script for OCR — we’ll review this script line by line. The script utilizes
the EAST text detector to find regions of text in the image and then takes advantage of Tesseract v4
for recognition.
Implementing our OpenCV OCR algorithm
We are now ready to perform text recognition with OpenCV!
Open up the text_recognition.py file and insert the following code:

    1 # import the necessary packages
    2 from imutils.object_detection import non_max_suppression
    3 import numpy as np
    4 import pytesseract
    5 import argparse
    6 import cv2

Today’s OCR script requires five imports, one of which is built into OpenCV.
Most notably, we’ll be using pytesseract and OpenCV. My imutils package will be used for non-
maxima suppression as OpenCV’s NMSBoxes function doesn’t seem to be working with the Python API.
I’ll also note that NumPy is a dependency for OpenCV.
The argparse package is included with Python and handles command line arguments; there is nothing to install.
Now that our imports are taken care of, let’s implement the decode_predictions function:
The decode_predictions function begins on Line 8 and is explained in detail inside the EAST text detection post. The function:
1. Uses a deep learning-based text detector to detect (not recognize) regions of text in an image.
2. The text detector produces two arrays, one containing the probability of a given area containing text,
and another that maps the score to a bounding box location in the input image.
As we’ll see in our OpenCV OCR pipeline, the EAST text detector model will produce two variables:
The function processes this input data, resulting in a tuple containing (1) the bounding box locations of
the text and (2) the corresponding probability of that region containing text:
rects : This value is based on geometry and is in a more compact form so we can later apply
NMS.
confidences : The confidence values in this list correspond to each rectangle in rects .
Note: Ideally, a rotated bounding box would be included in rects , but it isn’t exactly straightforward to
extract a rotated bounding box for today’s proof of concept. Instead, I’ve computed the horizontal
bounding rectangle which does take angle into account. The angle is made available on Line 41 if
you would like to extract a rotated bounding box of a word to pass into Tesseract.
For further details on the code block above, please see this blog post.
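The listing for this function did not survive in this copy of the post, so here is a hedged sketch of how EAST output is commonly decoded into horizontal bounding rectangles. It is a reconstruction under assumptions, not necessarily the exact code from the download: the parameter names, the min_confidence argument (the script reads its threshold from the command line instead), and the 4x offset between the output feature map and the input image are all stated here as assumptions:

```python
import numpy as np


def decode_predictions(scores, geometry, min_confidence=0.5):
    # scores has shape (1, 1, rows, cols); geometry has shape
    # (1, 5, rows, cols): four edge distances plus a rotation angle
    (num_rows, num_cols) = scores.shape[2:4]
    rects = []
    confidences = []

    for y in range(num_rows):
        for x in range(num_cols):
            score = float(scores[0, 0, y, x])

            # ignore cells that are unlikely to contain text
            if score < min_confidence:
                continue

            # assumed: each output cell covers a 4x4 patch of the input
            (offset_x, offset_y) = (x * 4.0, y * 4.0)

            # distances from the cell to the top, right, bottom, and
            # left edges of the box, followed by the rotation angle
            (top, right, bottom, left) = geometry[0, 0:4, y, x]
            angle = geometry[0, 4, y, x]
            (cos, sin) = (np.cos(angle), np.sin(angle))

            # derive the horizontal bounding rectangle
            h = top + bottom
            w = right + left
            end_x = int(offset_x + (cos * right) + (sin * bottom))
            end_y = int(offset_y - (sin * right) + (cos * bottom))
            start_x = int(end_x - w)
            start_y = int(end_y - h)

            rects.append((start_x, start_y, end_x, end_y))
            confidences.append(score)

    return (rects, confidences)
```

The angle is computed but only used to place the box corner, which matches the note above: if you need rotated boxes you would have to carry the angle through to the caller yourself.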
From there let’s parse our command line arguments: course on Computer

    65 # construct the argument parser and parse the arguments
    66 ap = argparse.ArgumentParser()

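The rest of the argument definitions are cut off in the listing above, so the following is only a sketch of what such a parser could look like. The --image, --east, and --padding switches appear in the example commands later in this post; the short flags, the --min-confidence, --width, and --height switches, and every default value are my assumptions. EAST does require the resized width and height to be multiples of 32:

```python
import argparse


def build_arg_parser():
    ap = argparse.ArgumentParser()
    ap.add_argument("-i", "--image", type=str, required=True,
        help="path to input image")
    ap.add_argument("--east", type=str, required=True,
        help="path to the frozen EAST text detector model")
    # assumed switch and default: probability threshold for a text region
    ap.add_argument("-c", "--min-confidence", type=float, default=0.5,
        help="minimum probability required to keep a region")
    # assumed defaults: EAST needs dimensions that are multiples of 32
    ap.add_argument("-w", "--width", type=int, default=320,
        help="resized image width (multiple of 32)")
    ap.add_argument("-e", "--height", type=int, default=320,
        help="resized image height (multiple of 32)")
    ap.add_argument("-p", "--padding", type=float, default=0.0,
        help="fraction of padding to add to each border of the ROI")
    return ap
```

In a script you would then call args = vars(build_arg_parser().parse_args()) to obtain the args dictionary referenced throughout the rest of this walkthrough.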
From there, we will load + preprocess our image and initialize key variables:
Our image is loaded into memory and copied (so we can later draw our output results on it) on Lines
82 and 83.
We grab the original width and height (Line 84) and then extract the new width and height from the args dictionary (Line 88).
Using both the original and new dimensions, we calculate ratios used to scale our bounding box coordinates later in the script (Lines 89 and 90).
Our image is then resized, ignoring aspect ratio (Line 93).
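Since the listing for this step is missing here, the sketch below shows the arithmetic involved. The function names are hypothetical, and cv2 is imported lazily inside the loader so the ratio helper can be used on its own:

```python
def scale_ratios(orig_w, orig_h, new_w, new_h):
    # ratios that map coordinates in the resized image back to the original
    return (orig_w / float(new_w), orig_h / float(new_h))


def load_and_resize(image_path, new_w, new_h):
    import cv2  # imported here so scale_ratios works without OpenCV

    image = cv2.imread(image_path)
    orig = image.copy()
    (orig_h, orig_w) = image.shape[:2]
    (r_w, r_h) = scale_ratios(orig_w, orig_h, new_w, new_h)

    # resize without preserving the aspect ratio
    image = cv2.resize(image, (new_w, new_h))
    return (orig, image, r_w, r_h)
```

For example, a 640x480 input resized to 320x320 yields ratios of 2.0 (width) and 1.5 (height), which is what we multiply the detected box coordinates by later on.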
Next, let’s work with the EAST text detector:

    96 # define the two output layer names for the EAST detector model that
    97 # we are interested in -- the first is the output probabilities and the
    98 # second can be used to derive the bounding box coordinates of text
    99 layerNames = [
    100     "feature_fusion/Conv_7/Sigmoid",
    101     "feature_fusion/concat_3"]
    102
    103 # load the pre-trained EAST text detector
    104 print("[INFO] loading EAST text detector...")
    105 net = cv2.dnn.readNet(args["east"])

Our two output layer names are put into list form on Lines 99-101. To learn why these two output names are important, you’ll want to refer to my original EAST text detection tutorial.
Then, our pre-trained EAST neural network is loaded into memory (Line 105).
I cannot emphasize this enough: you need OpenCV 3.4.2 at a minimum to have the
cv2.dnn.readNet implementation.
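If you want your script to fail early with a clear message on an older install, a small version guard is one option. This is a sketch with hypothetical helper names; it only compares the first three numeric components of the version string:

```python
import re


def version_tuple(version):
    # "3.4.2-dev" -> (3, 4, 2); only the first three numbers are used
    return tuple(int(part) for part in re.findall(r"\d+", version)[:3])


def meets_minimum(version, minimum="3.4.2"):
    return version_tuple(version) >= version_tuple(minimum)
```

With OpenCV imported you could then write: if not meets_minimum(cv2.__version__): raise RuntimeError("OpenCV 3.4.2 or newer is required").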
Construct a blob on Lines 109 and 110. Read more about the process here.
Pass the blob through the neural network, obtaining scores and geometry (Lines 111 and
112).
Decode the predictions with the previously defined decode_predictions function (Line 116).
Apply non-maxima suppression via my imutils method (Line 117). NMS effectively takes the most
likely text regions, eliminating other overlapping regions.
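To give you a feel for what non-maxima suppression does, here is a plain NumPy sketch of a greedy, score-ordered variant. To be clear, this is not the imutils implementation used in the script; the function name nms_sketch and the 0.3 overlap threshold are assumptions made for illustration:

```python
import numpy as np


def nms_sketch(boxes, scores, overlap_thresh=0.3):
    # boxes are (startX, startY, endX, endY) tuples
    if len(boxes) == 0:
        return []

    boxes = np.asarray(boxes, dtype="float")
    scores = np.asarray(scores, dtype="float")
    (x1, y1, x2, y2) = (boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3])
    areas = (x2 - x1 + 1) * (y2 - y1 + 1)

    # visit boxes from most to least confident
    order = np.argsort(scores)[::-1]
    keep = []

    while order.size > 0:
        i = order[0]
        keep.append(i)
        rest = order[1:]

        # intersection of the current box with every remaining box
        xx1 = np.maximum(x1[i], x1[rest])
        yy1 = np.maximum(y1[i], y1[rest])
        xx2 = np.minimum(x2[i], x2[rest])
        yy2 = np.minimum(y2[i], y2[rest])
        inter = np.maximum(0, xx2 - xx1 + 1) * np.maximum(0, yy2 - yy1 + 1)

        # drop remaining boxes that overlap the current one too heavily
        iou = inter / (areas[i] + areas[rest] - inter)
        order = rest[iou <= overlap_thresh]

    return [tuple(int(v) for v in boxes[i]) for i in keep]
```

Given two heavily overlapping boxes and one box elsewhere in the image, only the more confident of the overlapping pair survives alongside the separate box.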
Now that we know where the text regions are, we need to take steps to recognize the text! We begin to loop over the bounding boxes and process the results, preparing the stage for actual text recognition:
We initialize the results list to contain our OCR bounding boxes and text on Line 120.
Then we begin looping over the boxes (Line 123) where we:
Scale the bounding boxes based on the previously computed ratios (Lines 126-129).
Pad the bounding boxes (Lines 134-141).
And finally, extract the padded roi (Line 144).
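The three steps above can be sketched as a single helper. This is an illustration under assumptions rather than the script’s exact arithmetic: the function name is hypothetical, and I apply the same padding amount to every side of the box, clamped to the image bounds:

```python
def scale_and_pad_box(box, r_w, r_h, padding, orig_w, orig_h):
    (start_x, start_y, end_x, end_y) = box

    # scale the box from the resized image back to the original image
    start_x = int(start_x * r_w)
    start_y = int(start_y * r_h)
    end_x = int(end_x * r_w)
    end_y = int(end_y * r_h)

    # padding is a fraction of the box width/height (e.g., 0.05 for 5%)
    d_x = int((end_x - start_x) * padding)
    d_y = int((end_y - start_y) * padding)

    # apply the padding while staying inside the image
    start_x = max(0, start_x - d_x)
    start_y = max(0, start_y - d_y)
    end_x = min(orig_w, end_x + d_x)
    end_y = min(orig_h, end_y + d_y)

    return (start_x, start_y, end_x, end_y)
```

With the padded coordinates in hand, the ROI is just a NumPy slice of the original image: roi = orig[start_y:end_y, start_x:end_x].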
Our OpenCV OCR pipeline can be completed by using a bit of Tesseract v4 “magic”:
Taking note of the comment in the code block, we set our Tesseract config parameters on Line 151
(English language, LSTM neural network, and single-line of text).
Note: You may need to configure the --psm value using my instructions at the top of this tutorial if you find yourself obtaining incorrect OCR results.
The pytesseract library takes care of the rest on Line 152 where we call pytesseract.image_to_string , passing our roi and config string .
Boom! In two lines of code, you have used Tesseract v4 to recognize a text ROI in an image. Just remember, there is a lot happening under the hood.
Our result (the bounding box values and actual text string) are appended to the results list (Line 156).
Then we continue this process for other ROIs at the top of the loop.
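Here is a minimal sketch of that recognition step wrapped in a function. The wrapper name recognize_roi and its ocr_fn parameter are hypothetical; I added the latter so the call can be exercised without Tesseract installed. By default it falls back to pytesseract.image_to_string, and the config string assumes English, the LSTM engine, and a single line of text:

```python
def recognize_roi(roi, config="-l eng --oem 1 --psm 7", ocr_fn=None):
    # fall back to pytesseract only when no OCR callable is supplied
    if ocr_fn is None:
        import pytesseract
        ocr_fn = pytesseract.image_to_string

    # the config string carries the -l, --oem, and --psm flags
    return ocr_fn(roi, config=config)
```

Inside the loop you might then write results.append(((startX, startY, endX, endY), recognize_roi(roi))) to collect each box together with its text.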
Now let’s display/print the results to see if it actually worked:

    158 # sort the results bounding box coordinates from top to bottom
    159 results = sorted(results, key=lambda r:r[0][1])
    160
    161 # loop over the results

Our results are sorted from top to bottom on Line 159 based on the y-coordinate of the bounding
box (though you may wish to sort them differently).
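Since the remainder of the listing is cut off here, the sketch below shows the sorting step plus one cleanup step I would suggest before drawing. Both helper names are hypothetical, and the non-ASCII stripping is my assumption, motivated by the fact that cv2.putText can only render ASCII characters:

```python
def sort_top_to_bottom(results):
    # each result is ((startX, startY, endX, endY), text); sort on startY
    return sorted(results, key=lambda r: r[0][1])


def strip_non_ascii(text):
    # drop characters OpenCV's built-in fonts cannot draw
    return "".join(c for c in text if ord(c) < 128).strip()
```

If your text runs left to right on a single line, sorting on r[0][0] (the x-coordinate) instead may give a more natural reading order.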
Notice how our OpenCV OCR system was able to correctly (1) detect the text in the image and then (2)
recognize the text as well.
The next example is more representative of text we would see in a real-world image:

    $ python text_recognition.py --east frozen_east_text_detection.pb \
        --image images/example_02.jpg
    [INFO] loading EAST text detector...
    OCR TEXT
    ========
    ® MIDDLEBOROUGH

Figure 5: A more complicated picture of a sign with white background is OCR’d with OpenCV and Tesseract 4.
Again, notice how our OpenCV OCR pipeline was able to correctly localize and recognize the text; however, in our terminal output we see a registered trademark Unicode symbol. Tesseract was likely confused here as the bounding box reported by OpenCV’s EAST text detector bled into the grassy shrubs/plants behind the sign.
Let’s look at another OpenCV OCR and text recognition example:

    $ python text_recognition.py --east frozen_east_text_detection.pb \
        --image images/example_03.jpg
    [INFO] loading EAST text detector...
    OCR TEXT
    ========
    ESTATE

    OCR TEXT
    ========
    AGENTS

    OCR TEXT
    ========
    SAXONS

Figure 6: A large sign containing three words is properly OCR’d using OpenCV, Python, and
Tesseract.
OpenCV’s text detector is able to localize each of them — we then apply OCR to correctly recognize
each text region as well.
Our next example shows the importance of adding padding in certain circumstances:

    $ python text_recognition.py --east frozen_east_text_detection.pb \
        --image images/example_04.jpg
    [INFO] loading EAST text detector...
    OCR TEXT
    ========
    CAPTITO

    OCR TEXT
    ========
    SHOP

    OCR TEXT
    ========
    |.

Figure 7: Our OpenCV OCR pipeline has trouble with the text regions identified by OpenCV’s
EAST detector in this scene of a bake shop. Keep in mind that no OCR system is perfect in all
cases. Can we do better by changing some parameters, though?
In the first attempt of OCR’ing this bake shop storefront, we see that “SHOP” is correctly OCR’d, but:
By adding a bit of padding we can expand the bounding box coordinates of the ROI and correctly recognize the text:

    $ python text_recognition.py --east frozen_east_text_detection.pb \
        --image images/example_04.jpg --padding 0.05
    [INFO] loading EAST text detector...
    OCR TEXT
    ========
    CAPUTO'S

    OCR TEXT
    ========
    SHOP

    OCR TEXT
    ========
    BAKE

Figure 8: By adding additional padding around the text regions identified by EAST text detector,
we are able to properly OCR the three words in this bake shop sign with OpenCV and
Tesseract. See the previous figure for the first, failed attempt.
Just by adding 5% of padding surrounding each corner of the bounding box we’re not only able to
correctly OCR the “BAKE” text but we’re also able to recognize the “U” and “’S” in “CAPUTO’S”.
Figure 9: With a padding of 25%, we are able to recognize “Designer” in this sign, but our
OpenCV OCR system fails for the smaller words due to the color being similar to the
background. We aren’t even able to detect the word “SUIT” and while “FACTORY” is detected,
we are unable to recognize the text with Tesseract. Our OCR system is far from perfect.
I increased the padding to 25% to accommodate the angle/perspective of the words in this sign. This
allowed for “Designer” to be properly OCR’d with EAST and Tesseract v4. But the smaller words are a
lost cause likely due to the similar color of the letters to the background.
In these situations there’s not much we can do, but I would suggest referring to the limitations and drawbacks section below for suggestions on how to improve your OpenCV text recognition pipeline when confronted with incorrect OCR results.
Limitations and Drawbacks
It’s important to understand that no OCR system is perfect!
There is no such thing as a perfect OCR engine, especially in real-world conditions.
And furthermore, expecting 100% accurate Optical Character Recognition is simply unrealistic.
As we found out, our OpenCV OCR system worked well in some images but failed in others.
There are two primary reasons we will see our text recognition pipeline fail:
1. The text is skewed/rotated.
2. The font of the text itself is not similar to what the Tesseract model was trained on.
Even though Tesseract v4 is significantly more powerful and accurate than Tesseract v3, the deep learning model is still limited by the data it was trained on. If your text contains embellished fonts or fonts that Tesseract was not trained on, it’s unlikely that Tesseract will be able to OCR the text.
Secondly, keep in mind that Tesseract still assumes that your input image/ROI has been relatively
cleaned.
Since we are performing text detection in natural scene images, this assumption does not always hold.
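One common way to clean an ROI before OCR is to convert it to grayscale and binarize it with Otsu's method. The sketch below is not part of the script in this post; it is a NumPy-only illustration, assuming the ROI is a BGR `uint8` array (OpenCV's default channel order), and the function names are hypothetical. In practice OpenCV's `cv2.threshold` with the `cv2.THRESH_OTSU` flag does the same job.

```python
import numpy as np


def to_grayscale(bgr):
    """Convert an HxWx3 BGR uint8 image to an HxW grayscale image."""
    # luma weights listed in B, G, R order
    weights = np.array([0.114, 0.587, 0.299])
    return np.rint(bgr[..., :3] @ weights).astype(np.uint8)


def otsu_threshold(gray):
    """Return the gray level that maximizes the between-class variance."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    levels = np.arange(256)

    w0 = np.cumsum(hist)                  # pixels at or below each level
    w1 = hist.sum() - w0                  # pixels above each level
    cum_sum = np.cumsum(hist * levels)
    mu0 = cum_sum / np.maximum(w0, 1)     # mean of the darker class
    mu1 = (cum_sum[-1] - cum_sum) / np.maximum(w1, 1)  # mean of the brighter class

    between = w0 * w1 * (mu0 - mu1) ** 2
    return int(np.argmax(between))


def binarize(bgr):
    """Return a 0/255 image: pixels brighter than the Otsu level become 255."""
    gray = to_grayscale(bgr)
    return np.where(gray > otsu_threshold(gray), 255, 0).astype(np.uint8)
```

Whether binarization helps depends on the image; on low-contrast signs it can just as easily erase the text, so treat it as something to try rather than a guaranteed fix.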
In general, you will find that our OpenCV OCR pipeline works best on text that is (1) captured at a 90-degree angle (i.e., a top-down, bird’s-eye view of the image) and (2) relatively easy to segment from the background.
If this is not the case, you may be able to apply a perspective transform to correct the view, but keep in
mind that the Python + EAST text detector reviewed today does not provide rotated bounding boxes (as
discussed in my previous post), so you will still likely be a bit limited.
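To make the perspective transform idea concrete, here is a minimal NumPy sketch that solves for the 3x3 matrix mapping four corner points of a skewed region onto an upright rectangle. It assumes you already know the four corners (which, as noted, the EAST boxes used here do not give you); the function names are hypothetical. OpenCV provides `cv2.getPerspectiveTransform` and `cv2.warpPerspective` for the same purpose.

```python
import numpy as np


def perspective_matrix(src, dst):
    """Solve for the 3x3 matrix that maps four src points onto four dst points."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # two linear equations per point pair, with the last matrix entry fixed to 1
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.extend([u, v])
    h = np.linalg.solve(np.array(A, dtype=float), np.array(b, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)


def warp_point(H, point):
    """Apply the matrix to one (x, y) point and return the mapped (x, y)."""
    x, y, w = H @ np.array([point[0], point[1], 1.0])
    return (x / w, y / w)


# corners of a skewed sign (assumed values) mapped to a 200x100 rectangle
src = [(10, 20), (200, 40), (220, 180), (5, 150)]
dst = [(0, 0), (199, 0), (199, 99), (0, 99)]
H = perspective_matrix(src, dst)
```

This only computes the mapping for points; resampling the actual pixels into the rectified view is what `cv2.warpPerspective` is for.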
Tesseract will always work best with clean, preprocessed images, so keep that in mind whenever you
are building an OpenCV OCR pipeline.
If you have a need for higher accuracy and your system will have an internet connection, I suggest you
try one of the “big 3” computer vision API services:
…each of which uses even more advanced OCR approaches running on powerful machines in the cloud.
Summary
In today’s tutorial you learned how to apply OpenCV OCR to perform both:
Tesseract v4.
We also looked at Python code to perform both text detection and text recognition in a single script.
Our OpenCV OCR pipeline worked well in some cases but also failed in others. For the best OpenCV text recognition results I would suggest you ensure:
1. Your input ROIs are cleaned and preprocessed as much as possible. In an ideal world your text
would be perfectly segmented from the rest of the image, but in reality, that won’t always be
possible.
2. Your text has been captured at a 90-degree angle from the camera, similar to a top-down, bird’s-eye view. If this is not the case, a perspective transform may help you obtain better results.
I hope you enjoyed today’s blog post on OpenCV OCR and text recognition!
To be notified when future blog posts are published here on PyImageSearch (including text
recognition tutorials), be sure to enter your email address in the form below!
Downloads:
If you would like to download the code and images used in this post, please enter
your email address in the form below. Not only will you get a .zip of the code, I’ll also
send you a FREE 17-page Resource Guide on Computer Vision, OpenCV, and
Deep Learning. Inside you'll find my hand-picked tutorials, books, courses, and
libraries to help you master CV and DL! Sound good? If so, enter your email address
and I’ll send you the code immediately!
No, you still need to run the forward pass of the network which is still a
computationally expensive operation. It is certainly faster than trying to train the network from
scratch but it will still be slow. I would suggest you give it a try yourself 🙂
REPLY
Shreyans Sharma November 12, 2018 at 4:59 am #
Hi Adrian, I would really appreciate if you could suggest some way to distinguish
handwritten text from printed text in a scanned document.
I have tried using MXNet paragraph and line segmentation but that does not distinguish both the
classes.
Your help would be really appreciated.
Thanks
REPLY
Adrian Rosebrock November 13, 2018 at 4:44 pm #
Hi Adrian, this is a great post! Thanks for sharing!
Training your own NN for OCR can be a huge pain. Most of the time I
recommend against it. Have you tried Google’s Vision API yet? It works really well as an
off-the-shelf OCR system.
REPLY
Sara January 28, 2019 at 12:38 pm #
Thanks for such a great post , i needed to ask one thing that how to find the stable frame
in a live video ?
REPLY
Adrian Rosebrock January 28, 2019 at 5:46 pm #
Have you tried using a video stabilization algorithm? That would be my primary
suggestion.
REPLY
david zhang September 17, 2018 at 11:14 am #
Your blog is great!
REPLY
Adrian Rosebrock September 17, 2018 at 2:04 pm #
“Inevitably, I’ll be asked how to install Tesseract 4 on the Rasberry Pi…”
😉
Thanks!!
Thanks Jorge 🙂
REPLY
Abdulmalik Mustapha September 17, 2018 at 11:29 am #
Nice post. I really could use this for my project really thanks for posting this article. But could you
please do tutorial post on how to do handwritten recognition with OpenCV and Deep Learning using the
MNIST Dataset. That could help alot!
REPLY
Adrian Rosebrock September 17, 2018 at 2:03 pm #
Hey Abdulmalik — I actually cover that exact topic inside Deep Learning for Computer Vision
with Python.
REPLY
ygreq September 17, 2018 at 11:41 am #
Man oh man! I gotta start learning this. You have so many gems here.
May I ask if you also did a tutorial on correcting perspective, skewing and so on of a document? In the end the script would take many pics made with the phone for example and correct them accordingly. Something similar on how the mobile app Office Lens works.
REPLY
Adrian Rosebrock September 17, 2018 at 2:02 pm #
The primary perspective transform tutorial I refer readers to is this one. I’m not sure if that will help you, but wanted to link you to it just in case.
REPLY
ygreq September 17, 2018 at 3:46 pm #
My, my! this could be it. Let’s see if my zero knowledge takes me anywhere. ;))
REPLY
Anthony The Koala September 17, 2018 at 12:34 pm #
Dear Dr Adrian,
The above examples work for fonts with serifs, e.g. Times Roman, and without serifs, e.g. Arial.
Can OCR software be applied to detecting characters of more elaborate fonts, such as Old English fonts used for example in the masthead for the Washington Post, https://www.washingtonpost.com/ ? There are other examples of Old English fonts at https://www.creativebloq.com/features/old-english-fonts-10-of-the-best .
To put it another way, do you need to train or have a dataset for fancy fonts such as Old English in order
to have recognition of fonts of that type?
Thank you,
Anthony of Sydney :
Hey Walid — you need at least OpenCV 3.4.2 for this blog post. OpenCV 4-pre will also
work.
REPLY
Walid September 17, 2018 at 3:00 pm #
REPLY
Adrian Rosebrock September 17, 2018 at 3:04 pm #
Hi Adrian, I have the same error because I run in 3.4.1 OpenCV. I follow step by
step your guide to install on Ubuntu 18.04. It’s possible to upgrade or I need to
recompile?
Adrian Rosebrock October 8, 2018 at 1:37 pm #
Yes. Create a new Python virtual environment and then follow one of my OpenCV install guides.
REPLY
Anand May 30, 2019 at 11:34 pm #
HI Adrian, i’m using opencv version 4.1.0 and encountered this trouble
REPLY
Fred September 17, 2018 at 3:02 pm #
Hey Adrian,
Great post!! Have you ever attempted to train Tesseract v4 with a custom font? I’ve had poor results with
my dataset..
Cheers
Fred
REPLY
Adrian Rosebrock September 17, 2018 at 3:04 pm #
Hey Fred — sorry, I have not trained Tesseract v4 with a custom font.
REPLY
Walid September 17, 2018 at 3:12 pm #
REPLY
Adrian Rosebrock September 17, 2018 at 4:06 pm #
Thanks Mohamed 🙂
REPLY
DanB September 17, 2018 at 6:45 pm #
I ran into an issue were tesseract 4.0.0 does not support digits only white listing. Is there a separate
trained network for numerical digits only?
REPLY
Adrian Rosebrock September 17, 2018 at 7:24 pm #
Hey Dan — where did you run into the “no digits only” issue?
A follow up to this with a github issue ticket on the tesseract repo explaining more…
https://github.com/tesseract-ocr/tesseract/issues/751
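If the engine cannot be restricted to digits, one possible workaround is to filter the recognized string after the fact. This is only a sketch of that idea, not something from the post or from the ticket above, and the function name is hypothetical.

```python
import re


def digits_only(text):
    """Drop every character that is not 0-9 from an OCR result."""
    return re.sub(r"[^0-9]", "", text)


print(digits_only("IMEI: 3569 3808-1234 567\n"))
```

Keep in mind this only discards non-digits. A digit that was misread as a letter is lost rather than corrected.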
REPLY
papy September 17, 2018 at 6:52 pm #
Good work Adrian, Am currently working of the recognition of license plates using Python +
Tesseract OCR. but am having issues training the .trandata file to correctly recognize my countries
license plate. Any advice, links or video to help me train this dataset will be of great help.
Thanks
REPLY
Adrian Rosebrock September 17, 2018 at 7:22 pm #
I wouldn’t recommend using Tesseract for Automatic License Plate Recognition. It would be
better to build your own custom pipeline. In fact, I demonstrate how to build such an ANPR system
inside the PyImageSearch Gurus course.
REPLY
Nigel January 21, 2019 at 11:15 pm #
Can I see where you demonstrated it? Can I work with your tutorials in making my own model (model or plate in our country)?
REPLY
Adrian Rosebrock January 22, 2019 at 9:09 am #
Hey Nigel, as I mention, I cover ANPR inside the PyImageSearch Gurus course. The course will teach you how to create ANPR systems for your own country as well.
REPLY
Jari September 17, 2018 at 7:34 pm #
Hi Adrian,
Thank you for this. I’ve messed with tesseract in the past but have struggled to get good results out of it (and I _think_ I was using the LSTM version but I’m unsure) on data for work. Our data is under varying lighting conditions and can have significant blur. We use GCP’s OCR solution at the moment which works really really well on this data but of course can get costly.
One thing I’ve repeatedly tried to do and failed is figure out how to train tesseract on my own data (both
real and synthetic). So much so that I gave up and (for the one part of our pipeline that Google doesn’t
work well on) built my own deep learning based OCR system which works quite well (but incurs
significant RnD overhead). If you know how to train tesseract and would be willing to write that down, I
would deeply appreciate that.
REPLY
Adrian Rosebrock September 18, 2018 at 5:56 am #
Tesseract does assume reasonable lighting conditions, and if your images are blurry it can get much worse for sure. I’m glad to hear GCP’s solution is working for you though! I personally have never trained a Tesseract model from scratch so I unfortunately do not have any guidance there.
REPLY
Andrews September 17, 2018 at 7:47 pm #
Hi Adrian, thanks for your tutorials, they are helping me a lot. I work in a project that i don’t know where to start, if have any tip, I will appreciate a lot. Here is the stackOverflow link:
https://stackoverflow.com/questions/52377025/how-can-i-use-opencv-to-process-a-market-leaflet-to-extract-product-and-promotio
REPLY
Adrian Rosebrock September 18, 2018 at 5:58 am #
Recognizing water meters is an entirely different beast since the numbers may be partially obscured, dirt/dust on the meter itself, and any number of possible lighting problems. You could try using Tesseract here but I wouldn’t expect too high of accuracy. I’ll try to do a water meter recognition post in the future or include it in a new book.
REPLY
Trami September 18, 2018 at 9:31 pm #
Thank for so much. could you give me some advice about the the problems on
recognizing the meter ?
REPLY
Vikas December 29, 2018 at 5:56 am #
Hi Adrian, Thanks a lot for the post. Could you please let me know if you have already
worked on the OCR code for meter reading ? I am looking for a solution for gas meter reading.
REPLY
Adrian Rosebrock January 2, 2019 at 9:34 am #
Sorry, I do not. Jeff Bass, a PyImageConf speaker, may be able to help though. Be
sure to see his GitHub repo.
tesseract_cmd = 'tesseract'
to
tesseract_cmd = 'C:\\Program Files (x86)\\Tesseract-OCR\\tesseract'
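Rather than editing the path by hand, the lookup can be scripted. The sketch below is an assumption-laden illustration: `resolve_tesseract_cmd` is a hypothetical helper, and the candidate paths are whatever install locations you expect on your own machine.

```python
import os
import shutil


def resolve_tesseract_cmd(name="tesseract", candidates=()):
    """Return a path to the Tesseract binary, or None if it cannot be found."""
    # first choice: whatever `name` resolves to on the PATH
    found = shutil.which(name)
    if found:
        return found

    # otherwise try the explicit install locations supplied by the caller
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


# Example usage (requires pytesseract; the Windows path is an assumed location):
# import pytesseract
# cmd = resolve_tesseract_cmd(
#     candidates=[r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe"])
# if cmd is not None:
#     pytesseract.pytesseract.tesseract_cmd = cmd
```

Checking the PATH first means the same script keeps working on systems where `tesseract` is already available from the command line.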
REPLY
Chen September 18, 2018 at 1:26 am #
Hi Adrian,
I have download the source code in my window computer. also install some relevant library.
i try to execute your source code.
REPLY
Chen September 18, 2018 at 1:29 am #
REPLY
Aveshin Naidoo September 18, 2018 at 2:50 pm #
I forgot what to add what I want the second virtual environment for. The new one will hold the
EAST text detector and a new version of OpenCV, plus python and Tesseract 4
REPLY
Adrian Rosebrock September 18, 2018 at 4:05 pm #
Keep in mind that Tesseract is a binary, it’s not a Python package. I think you’re confusing the tesseract command with the pytesseract Python package. You can create two Python virtual environments if you want but you’ll only have one version of the actual Tesseract binary itself, which shouldn’t be an issue since Tesseract v4 also includes the v3 engine.
REPLY
Alex September 18, 2018 at 3:54 pm #
Hello Adrian, another very good tutorial thanks! Would you recommend it for a license plate
reader or in this case is it better to stick with normal segmentation and a KNN?
REPLY
Adrian Rosebrock September 18, 2018 at 4:03 pm #
Hey Alex, I wouldn’t recommend using Tesseract for Automatic License Plate Recognition. It
would be better to build your own custom pipeline. In fact, I demonstrate how to build such an ANPR
system inside the PyImageSearch Gurus course.
REPLY
Niklas Wilke September 19, 2018 at 5:58 pm #
Hi Adrian, even though not related to this post i had thought about NN/AI security.
I’m not currently working on CV myself so im unsure if im up to date but you would probably know.
There were methods (like pixel attacks) that allowed someone who was familiar with the architecture of a CNN to create images or modify images to get a desired output.
=> change x , let the model classify an airplane as a fish.
The big “let down” here is that i could only do that with my own NN so its pretty pointless and the security risk pretty low. But now that i think about how CV is implemented by semi-experts and without clear rules and standards i would imagine a lot of CV software solutions out there and those that are about to be build will make use of the state of the art nets of the big researchers and will base their nets on that. They probably tweak and modify it but the core structure might remain the same.
Now my question:
Would those slightly modified implementations still be a valid target for pixel attacks or other manipulation attack forms, given i base them on the 5-6 biggest nets out there or will the net as soon as any modification (for example add a label class to the main pool) has been made , be safe of those attacks ?
Im not concerned about the “sure but you can easily avoid this by … ” solution, im concerned about semi-expert who implement stuff in small businesses or in areas where nobody can really judge their work as long as it seems to be working in my desired business case.
REPLY
Daniel September 20, 2018 at 5:23 am #
REPLY
Adrian Rosebrock October 8, 2018 at 1:16 pm #
REPLY
loch September 22, 2018 at 9:35 pm #
HI adrian
your code work perfectly , earlier i had opencv 3.2.0 where camera release function perfectly but after upgrading to opencv 3.4.2 to run the programme the camera release( capture.release() ) function not working can u give me a solution to release the camera thank you
REPLY
Adrian Rosebrock October 8, 2018 at 1:00 pm #
I’m not sure why your camera may have stopped working in between OpenCV 3.2 and OpenCV 3.4.2. That is likely a great question for the OpenCV GitHub Issues page.
REPLY
Tran September 22, 2018 at 11:53 pm #
Hi, just an idea. We can next use a translator to translate the text and print it to the image in place of the OCR text.
Adrian Rosebrock October 8, 2018 at 1:00 pm #
REPLY
REPLY
seventheefs September 24, 2018 at 11:30 am #
REPLY
Adrian Rosebrock October 8, 2018 at 12:50 pm #
REPLY
taysir February 15, 2019 at 6:05 am #
I am also looking for a powerful Python library for the detection of Arabic characters
REPLY
vinay September 24, 2018 at 11:32 am #
how to install tesseract + python bindings and iam getting workon command not found .please help me out.
REPLY
Adrian Rosebrock October 8, 2018 at 12:50 pm #
Hey Vinay, do you have virtualenv and virtualenvwrapper installed on your system? Did you install OpenCV using Python virtual environments? If not, you can skip the “workon” command.
REPLY
liu September 28, 2018 at 12:14 am #
Hi, I got a problem. The code can detect some texts like “AB” or “CD”, etc. but it can’t recognize a single character like ‘A’, ’B’, etc. Does anyone know how to recognize a single character or provide another model _detection.pb like east? Great thanks.
keertika September 28, 2018 at 2:28 am #
REPLY
Hey Adrian, I am running this code on Jupyter notebook (python 3.6 + conda 4.5.11 + opencv 3.4).
I get an unrecognised error.
REPLY
keertika September 28, 2018 at 2:32 am #
I got it fixed !!
REPLY
Adrian Rosebrock October 8, 2018 at 12:24 pm #
REPLY
K September 28, 2018 at 3:04 am #
REPLY
K September 28, 2018 at 3:19 am #
hey,Adrian
I get the following error
AttributeError: module ‘cv2.dnn’ has no attribute ‘readNet’
REPLY
Adrian Rosebrock October 8, 2018 at 12:24 pm #
Make sure you’re using OpenCV 3.4.2 or greater.
REPLY
Oyekanmi Oyetunji September 30, 2018 at 9:58 am #
Hi Adrian
Thanks for the tutorial..
I really like what you’re doing up here…
I need your help
I have raspbian with opencv pre-compiled.. Which I got when I bought a bundle from you…
Can I install tesaract straight up on it… Or do I have to uninstall opencv..
Thanks..
REPLY
Adrian Rosebrock October 8, 2018 at 10:54 am #
No need to uninstall OpenCV! You can simply install Tesseract as I recommend in this guide.
REPLY
Vittorio October 10, 2018 at 12:25 pm #
Hi Adrian!
In my project, I would need to recognize single RANDOMIC characters from a car chassis.
Do you think I should try a different solution or it should be good the one explained by this post?
Thx
REPLY
Adrian Rosebrock October 12, 2018 at 9:13 am #
Hey Royce, I would actually recommend working through the PyImageSearch Gurus course where I cover automatic license plate recognition in detail (including code).
REPLY
Steven October 15, 2018 at 2:44 pm #
Hi Adrian,
Great post. I do have to ask: How did you decide on the “Saxon’s Estate Agents” image? Of the many
billions of images to choose from online, this is a rather peculiar one. This image was shot in the same
town where I am doing my PhD. 🙂
REPLY
Adrian Rosebrock October 16, 2018 at 8:25 am #
Hah! That’s so cool! I found the image when I searched for storefronts — that was one of the
images that popped up!
REPLY
ranjeet singh October 21, 2018 at 11:25 am #
Its not working on this image where I want to detect IMEI number
Pic – https://starofmysore.com/wp-content/uploads/2017/07/news-9-imei.jpg
Even when I align image correctly, it detects word ‘imei’ but does not capture IMEI number.
What should I do?
REPLY
Adrian Rosebrock October 22, 2018 at 7:59 am #
Hey Ranjeet, make sure you read the “Limitations and Drawbacks” section of this tutorial. OCR systems will fail in certain situations. You may want to try creating your own custom digit detector for the actual number.
REPLY
jim421616 October 25, 2018 at 7:42 pm #
Hi, Adrian. I got the installation on my RPi first time (!) but when I issue tesseract --help-oem or -psm or -l, I get the following error:
tesseract: error while loading shared libraries: libtesseract.so.4: cannot open shared object file: No such file or directory.
I’m in the virtual env cv_tesseract when I issue the command, but I get the same error message when I’m not in it too.
Any suggestions?
REPLY
Adrian Rosebrock October 29, 2018 at 1:48 pm #
Hey Jim — have you tried posting on the official Tesseract GitHub Issues page? They would
be able to provide more targeted advice to your specific system.
REPLY
[email protected] October 30, 2018 at 6:35 pm #
Hi Jim
try
$ sudo ldconfig
REPLY
Gary Chris November 14, 2018 at 1:46 am #
…
AttributeError: module ‘cv2.dnn’ has no attribute ‘readNet’
REPLY
Adrian Rosebrock November 15, 2018 at 11:52 am #
I would suggest trying with OpenCV 3.4.2 and see if that resolves the issue.
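A quick way to rule out the version problem before running the script is to compare the installed version string against the 3.4.2 minimum. This is a minimal sketch; the helper names are hypothetical and the parsing assumes a dotted version string such as the one in `cv2.__version__`.

```python
import re


def version_tuple(version):
    """Turn '3.4.2' or '4.0.0-pre' into a comparable tuple of three ints."""
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)  # ignore suffixes such as '-pre'
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def supports_east(version, minimum=(3, 4, 2)):
    """True if `version` is at least the minimum this post requires."""
    return version_tuple(version) >= minimum


# Example usage (requires OpenCV to be installed):
# import cv2
# if not supports_east(cv2.__version__):
#     raise RuntimeError("OpenCV 3.4.2 or greater is required")
```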
REPLY
Vagner December 9, 2018 at 8:58 pm #
Is there anything about comparing signatures, to find possible scams, using opencv and algorithms like
gsurf, harrison or something?
REPLY
Adrian Rosebrock December 11, 2018 at 12:48 pm #
REPLY
Dorra December 13, 2018 at 8:36 am #
Hi Doctor Adrian
Both scripts of “OpenCV Text Detection” and “OpenCV OCR and text recognition with Tesseract” make
use of the serialized EAST model ( frozen_east_text_detection.pb ) can you send me the source code of
(frozen_east_text_detection.py) I want undrestand how it work.
Thanks for your help
REPLY
bahman December 16, 2018 at 8:37 am #
REPLY
KISHORE K December 26, 2018 at 8:14 am #
hi Adrian, i am getting only the first word of the image ,for example in image3 i am getting only estate and its not reading agents and saxons . can you please help me?..
REPLY
Adrian Rosebrock December 27, 2018 at 10:11 am #
Click on the window opened by OpenCV and press any key on your keyboard to
advance execution of the script.
REPLY
Charley December 22, 2018 at 11:53 am #
Hi Adrian, great tutorial! I was wondering if it was possible to use this model to search for a
particular word? Or should I train a new model to look for the work specifically?
REPLY
Adrian Rosebrock December 27, 2018 at 10:51 am #
I would suggest you use the approach used in this post. Apply the text detector, OCR it, and
then see if the OCR’d text is the word you are looking for.
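That last step might look something like the sketch below. It assumes the OCR results are collected as a list of `((startX, startY, endX, endY), text)` tuples; the function name `find_word` and the normalization rule are my own assumptions for illustration.

```python
def find_word(results, target):
    """Return the boxes whose OCR'd text contains `target`.

    Matching ignores case, whitespace, and punctuation so that small
    OCR artifacts do not hide an otherwise correct hit.
    """
    def normalize(text):
        # keep only letters and digits, lowercased
        return "".join(c for c in text.lower() if c.isalnum())

    wanted = normalize(target)
    return [box for box, text in results
            if wanted and wanted in normalize(text)]


# assumed example results for a storefront image
results = [
    ((10, 10, 50, 30), "SAXONS\n"),
    ((60, 10, 120, 30), "Estate "),
    ((130, 10, 200, 30), "Agents!"),
]
print(find_word(results, "estate"))
```

No new model is needed for this; the search happens entirely on the text that Tesseract returns.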
REPLY
Polefish January 2, 2019 at 11:03 am #
Hi Adrian
Would it be possible to detect and read the electricity meter with this approach? If not, what else can be done?
Thanks
Ferry
REPLY
Adrian Rosebrock January 29, 2019 at 6:58 am #
Hey Ferry — have you tried with your electricity meter images? Give it a try first and see how
it performs. I can’t really provide any guidance without first seeing your images.
REPLY
Aliff Mustaqim January 26, 2019 at 6:18 am #
It shows:
orig = image.copy()
REPLY
Adrian Rosebrock January 29, 2019 at 6:57 am #
Double-check your path to the input image. The image path is likely invalid (the image does not exist). You can read more about NoneType errors in OpenCV, including how to solve them, here.
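Because `cv2.imread` returns `None` for a path it cannot read instead of raising an error, it can help to validate the path up front. A minimal sketch, with a hypothetical helper name:

```python
import os


def require_image_path(path):
    """Return `path` unchanged, or fail early if no such file exists."""
    if not os.path.isfile(path):
        raise FileNotFoundError(f"input image not found: {path!r}")
    return path


# Example usage (requires OpenCV; the file name is an assumed example):
# import cv2
# image = cv2.imread(require_image_path("images/example_01.jpg"))
```

Failing here gives a clear message about the path rather than an AttributeError later at `image.copy()`.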
REPLY
Bhavya February 5, 2019 at 11:10 am #
Hi Adrian,
Can you please suggest how to print the text from video. I am very new to openCV. It would be very helpful.
Thank you,
Bhavya
REPLY
bharath February 6, 2019 at 12:20 am #
can we use raspberrypi camera to get the images and process it ?
You can use the Raspberry Pi camera to capture frames and OCR them; however, it will take
at least 15-20 seconds to process each frame (depending on the frame dimensions). The Pi is too
underpowered.
REPLY
Mohamed Akrem April 3, 2019 at 7:14 am #
REPLY
Adrian Rosebrock April 4, 2019 at 1:19 pm #
To what process?
All I want is to change the code you wrote there so that the Pi camera will
capture every 30 seconds, for example, and after that I want to do it with a pushbutton. This
is because I have an OCR project for visually impaired persons: when they click on the
button, the camera should detect the text and read it aloud.
But right now I just did what you did, and this happens even when I capture an image
with the Pi camera, but the process must happen only when I run the command that you
did there. I want the camera to capture and then send the photo to the Pi and give the
text. Can you help me with that? I'm so lost.
Adrian Rosebrock April 12, 2019 at 12:11 pm #
REPLY
amal February 7, 2019 at 5:23 pm #
REPLY
Sajjad Manal February 8, 2019 at 12:03 am #
Hi Adrian, thanks for this wonderful tutorial. Can you also tell me how to get all detections in one image
(I am getting 10 images for 10 words detected separately) to save the final result? Also, can you
suggest how to save the position (x, y coordinates) of each final detection (bounding box) along with the text
detected?
REPLY
Adrian Rosebrock February 14, 2019 at 2:57 pm #
You can move the cv2.imshow and cv2.waitKey call and put it at the end of the loop. I get
the impression that you may be new to the world of OpenCV and image processing — that’s okay, but
I would encourage you to read through Practical Python and OpenCV first to help get you up to
speed.
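For the second part of the question (saving each bounding box position along with its text), a minimal sketch follows. It assumes the script has collected its output in a list named `results`, where each entry is a `((startX, startY, endX, endY), text)` tuple. The function name `save_results` and the CSV layout are illustrative choices, not part of the tutorial's code.

```python
import csv


def save_results(results, csv_path):
    """Write one CSV row per detection: box corners followed by the OCR text.

    `results` is assumed to be a list of ((startX, startY, endX, endY), text)
    tuples; this layout is an assumption made for the sketch.
    """
    # Sort by the top edge (startY) so rows run from the top of the image down.
    ordered = sorted(results, key=lambda r: r[0][1])
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["startX", "startY", "endX", "endY", "text"])
        for (startX, startY, endX, endY), text in ordered:
            # OCR output often carries trailing whitespace or newlines.
            writer.writerow([startX, startY, endX, endY, text.strip()])
    return len(ordered)
```

Calling `save_results(results, "detections.csv")` once after the OCR loop would then produce a single file per input image instead of one window per word.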
REPLY
Sajjad Manal February 10, 2019 at 11:03 pm #
Hello Adrian,
Curious to know how to run this script for a large number of images in one go, say 100 images? Also, is it
possible to have all the text detected for a single image in one final single output? Similarly, for each of
the 100 input images.
REPLY
Adrian Rosebrock February 14, 2019 at 1:40 pm #
I don’t know what I am doing wrong but I’ve tried this about 100 times now and keep getting the
‘NoneType’ error where the image.copy() is used [line 83]. Do I need to add the location to the image on
the preceding line [line 82]? Coz I’ve done that now in at least 8 different ways and still keep getting that
error.
Also, where does the code actually refer to the image location and also the location for the EAST code? If
I’ve followed the code correctly, then this should be line 88 for the image location and line 111 for the EAST file. So,
do I change the string value to the locations for the respective file?
Any help on this matter will be highly appreciated. Thanks for sharing the code though. Coming from a
different coding language, this page has been a lot of help to translate the image processing principles.
REPLY
Adrian Rosebrock February 20, 2019 at 12:32 pm #
Double-check your path to the input image. 99.9% likely that your input image is incorrect
causing “cv2.imread” to return “None”, hence the error. You should also read this tutorial on
NoneType errors and how to resolve them.
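A minimal sketch of that check, so a bad path fails with a clear message instead of a `NoneType` error later at `image.copy()`. The helper name `load_image` is hypothetical and not part of the tutorial's script.

```python
import os


def load_image(path):
    """Load an image with OpenCV, failing loudly instead of returning None."""
    # Catch the most common cause first: the path does not point to a file.
    if not os.path.isfile(path):
        raise FileNotFoundError(f"No image file at: {os.path.abspath(path)}")

    # Imported here so the path check above runs even without OpenCV installed.
    import cv2

    image = cv2.imread(path)
    # cv2.imread returns None (it does not raise) when it cannot decode the file.
    if image is None:
        raise ValueError(f"OpenCV could not decode: {path}")
    return image
```

Printing the absolute path in the error message helps when the script is launched from a different working directory than expected.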
REPLY
Akhilesh February 19, 2019 at 3:34 am #
Hi Adrian, I installed Tesseract 4.0 on my Windows machine. The execution time is too slow,
around 1.5 sec per image for pytesseract. Can you suggest how to improve the speed of Tesseract?
REPLY
Adrian Rosebrock February 20, 2019 at 12:20 pm #
It’s not the speed of Tesseract, it’s the speed of the EAST text detector. You should look into
running the EAST text detector on your GPU.
REPLY
Gary Zheng February 21, 2019 at 3:12 pm #
REPLY
Adrian Rosebrock February 22, 2019 at 6:25 am #
REPLY
Kim February 22, 2019 at 11:43 pm #
REPLY
Abed Eljalil Berjawi February 24, 2019 at 1:34 pm #
I have a question: How can I apply this on the camera directly (continuous recording)? Is there any
tutorial?
Regards,
Abed Eljalil.
Take a look at “text to speech” libraries. Google’s gTTS would be a good one to start with. I’ll
also be covering a similar topic in my upcoming Computer Vision + Raspberry Pi book, stay tuned!
Hi Adrian,
I am working on a BeagleBone Black, which runs Debian Linux. Can you share the steps to install Tesseract
OCR and OpenCV?
Thank you.
REPLY
Adrian Rosebrock March 5, 2019 at 9:05 am #
Ubuntu is Debian based. You can use the Ubuntu install instructions to install Tesseract +
OpenCV on your system.
REPLY
Khaerul Umam August 3, 2019 at 9:22 am #
Did you get an error on add-apt-repository? If yes, you can install them first by
Hope it helps
REPLY
Abobakr March 6, 2019 at 6:52 pm #
Hello Adrian,
Thank you for your help and support. I am really impressed with this post, but I need your help on
something: I need to detect the text from receipts. When I used your script it didn’t work well on my image. It
detects the words from right to left and it doesn’t detect every word, sometimes only half of the word. Could you
give me some guidelines to work on?
REPLY
Ted March 6, 2019 at 7:22 pm #
I’m using a stylized font with exaggerated serifs (not as exaggerated as the Old English typeface typical
of newspaper brands). The Tesseract text detection bounding boxes are cutting off significant parts of
some letters rendering the text recognition inaccurate. Even when embedding the very font by using a
trainingdata file trained by ocr7.com and using perfect text examples created using the very same font,
this problem occurs. Is it possible to tweak tesseract’s bounding box parameters?
Shouldn’t Tesseract produce excellent results when exclusively using training data created with the one
font it is asked to detect/recognize?
Your text detection tutorial describes how to do so, but I don’t believe that part of the text recognition
process is exposed when using tesseract to do all processing. Thanks.
REPLY
Adrian Rosebrock March 8, 2019 at 5:25 am #
That might not be an issue with Tesseract itself, but rather the arguments you’re passing into
the Tesseract binary. See the “--oem” and “--psm” arguments; you may need to change those.
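As a rough illustration of where those two arguments go when Tesseract is called through pytesseract, here is a small sketch. The helper `build_config` and the default values are assumptions made for this example; which modes work best depends on the images.

```python
def build_config(oem=1, psm=7, lang="eng"):
    """Build the option string handed to the Tesseract binary.

    --oem selects the OCR engine mode (1 = LSTM neural network only).
    --psm selects the page segmentation mode (7 = treat the image as a
    single text line; 6 = a single uniform block of text).
    """
    return f"-l {lang} --oem {oem} --psm {psm}"


# Hypothetical usage, assuming pytesseract is installed and `roi` is an image
# region cropped from the input:
#   import pytesseract
#   text = pytesseract.image_to_string(roi, config=build_config(psm=6))
```

Trying a few `psm` values on the same cropped region is usually the quickest way to see whether segmentation, rather than recognition, is what clips the letters.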
REPLY
Adrian Rosebrock March 22, 2019 at 8:34 am #
Can you share more details on your system? What OS are you using? What Python,
Tesseract, etc. versions?
REPLY
Alex April 1, 2019 at 3:06 pm #
Hello, I have the same problem. My OS is Windows 10, the version of Python is 3.6 and
the version of Tesseract is 4.1.0.
I also put this line in my code
pytesseract.pytesseract.tesseract_cmd = r'C:\Users\Alex\Tesseract-OCR\tesseract.exe'
but it still doesn’t work.
REPLY
Adrian Rosebrock April 2, 2019 at 5:47 am #
Sorry, I’m not a Windows user and do not officially support Windows here on the
PyImageSearch blog.
REPLY
Phil March 21, 2019 at 10:16 pm #
Hello Adrian, your tutorial is helpful and amazing. I began to learn ML and CV recently , and I am
unfamiliar with Linux too. When I came to the last step, I got ” ImportError: No module named
imutils.object_detection”. I have searched this error on google, but I still don’t know how to fix it. Can you
help me ?
REPLY
Adrian Rosebrock March 22, 2019 at 8:26 am #
REPLY
Arjun Pal March 23, 2019 at 2:09 pm #
I’m trying to do something like this, except get a bounding box around every single text
character, rather than full words. How would I be able to do this?
REPLY
Adrian Rosebrock March 27, 2019 at 9:16 am #
Sorry, I don’t have any tutorials for extracting just a single text character.
Thanks
Manish
REPLY
Adrian Rosebrock April 2, 2019 at 6:19 am #
REPLY
Scott March 29, 2019 at 9:16 am #
Hello Adrian, thanks for sharing. It’s really nice work! I have a question, could you please
help me answer it?
You said that “The underlying OCR engine itself utilizes a Long Short-Term Memory (LSTM) network, a
kind of Recurrent Neural Network (RNN).”, but we use the EAST text detector to find text frames in
pictures, which is based on a CNN, right? So, what do you mean by “the underlying OCR engine”?
Thanks for your time 😀
REPLY
Scott March 29, 2019 at 9:28 am #
thanks
REPLY
Adrian Rosebrock April 4, 2019 at 1:34 pm #
I would suggest you start by learning how to access the Raspberry Pi camera module.
REPLY
Haruo April 4, 2019 at 2:23 pm #
Hi, Adrian.
Great tutorial and many thanks.
I am a novice in the image processing field. After carefully following all the installation steps and
compiling the code, I was able to run the code successfully.
One can simply use your tutorial and start working out of the box with minimal time.
REPLY
Haruo April 5, 2019 at 12:28 am #
Hi Adrian, right now I am working on another area, I just need a small test on an image as of now.
However, once I complete my current pending works, I would be coming back to the image
processing area to explore more. Will see you at that time. Thank you for your response.
REPLY
Haruo April 6, 2019 at 7:42 am #
Hi Adrian, right now I am working on other area, I just need a small test on image as of now.
However, once I complete my current pending works, I would be coming back to image processing area
to explore more. Will see you at that time. Thank you for your response.
REPLY
Gordon April 13, 2019 at 2:57 am #
Hello Adrian,
Currently i am facing some issue whereby my scripts will run tesseract (with thread) on the video frame
every 6 secs to extract the information on the video frame.
But, everytime when the video almost ends, the process will slow down significantly and all the cpu cores
usage will suddenly spike to 100%. Then, there will be processes
produced (which ends up in zombies processes) and a lot of xxx.png and xxx_out.txt produced in the /tmp
directory. Do you or anyone else ever face this issue? Hope to hear from you guys soon.
Regards,
Gordon
That is odd but unfortunately I’m not sure what the problem is there. I wish I could be of
more help but unfortunately without having physical access to the Pi or the code I can’t really
diagnose.
REPLY
Gabriel April 16, 2019 at 3:31 pm #
Hello Adrian! The project works fine and thank you for sharing this project with us! Now I have a
question, can you do this project but via streaming video using the camera? I mean, when I focus on a
word, letter or number, it prints it on the terminal? Thanks
Yes, that’s absolutely possible. Have you accessed your webcam before using OpenCV? What is
your experience level with OpenCV?
REPLY
John Henderson April 16, 2019 at 8:08 pm #
Hi Adrian, I think this blog post is awesome and I was wondering if it is possible to take the ROIs
(each word) and the x,y coordinates of each ROI and import them to a new white image that has the
same dimensions as the original scanned image? I’m trying to build a document scanner and I’m having
issues preserving the placement of each word. Thanks!
REPLY
Adrian Rosebrock April 18, 2019 at 6:45 am #
Yes, that’s absolutely possible. You would use NumPy to create an empty array the same
size as your input image. You already have the (x, y)-coordinates of each ROI so you would use
NumPy array slicing to take the ROI from the original image and place it into the output image. If
you’re new to Python/OpenCV and would like to learn how to perform such slicing operations
definitely refer to Practical Python and OpenCV where I teach the basics. After going through the text
you will be able to solve the problem.
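The slicing approach described above can be sketched as follows. It assumes `boxes` holds `(startX, startY, endX, endY)` tuples in the coordinates of the original image; the function name `paste_rois` and the white (255) background are choices made for this sketch.

```python
import numpy as np


def paste_rois(image, boxes, background=255):
    """Copy each ROI from `image` onto a blank canvas at the same position."""
    # Canvas with the same shape and dtype as the input, filled with white.
    canvas = np.full_like(image, background)
    h, w = image.shape[:2]
    for (startX, startY, endX, endY) in boxes:
        # Clip to the image bounds so padded boxes cannot index out of range.
        startX, startY = max(0, startX), max(0, startY)
        endX, endY = min(w, endX), min(h, endY)
        # NumPy images are indexed [row, column], i.e. [y, x].
        canvas[startY:endY, startX:endX] = image[startY:endY, startX:endX]
    return canvas
```

Because every ROI is written back at its original coordinates, the word placement of the scanned page is preserved on the output canvas.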
REPLY
Azat April 30, 2019 at 1:53 pm #
Hi Adrian, how did you find RCNN for recognizing text? Have you tried it before, and does it work well?
REPLY
Mohamed Akrem May 11, 2019 at 10:40 pm #
Hi Adrian, this code works very well for me on my Raspberry Pi, thank you very much. In
addition, I want this whole process to start after I click on a pushbutton that I inserted in the RPi. Is that possible?
If yes, please tell me how.
REPLY
Gary Zheng May 17, 2019 at 10:28 am #
Hey Adrian, I ran this code on some pictures and it shows the red box but not any text. What
could be causing that?
REPLY
Kalaiselvan Panneerselvam May 21, 2019 at 5:25 am #
I am trying to retrieve text from noisy and rusted iron plates. Tesseract v4 fails to read the text
most of the time. What is the best way to perform OCR? I tried a cloud API like Amazon
Rekognition, but I am trying to build it as a mobile app where OCR is performed on a mobile phone with low
bandwidth or no internet connection.
REPLY
Kotesh May 30, 2019 at 12:41 am #
Hey Adrian, I ran this code for text recognition, but here the text is a number and it is not
recognizing the numbers. I tried making changes to oem and psm but there was no change.
Can you please help me detect numbers with this code?
REPLY
guruprasaad June 2, 2019 at 4:35 am #
I have a question: can I use Tesseract to detect and extract special characters like
(!@#$%^&*()_+)?
Thanks in advance
REPLY
Adrian Rosebrock June 6, 2019 at 8:30 am #
REPLY
Aish June 13, 2019 at 7:41 am #
REPLY
Adrian Rosebrock June 13, 2019 at 9:30 am #
What is the error you received? Without knowing the error I cannot provide any suggestions.
REPLY
Amar June 17, 2019 at 1:40 am #
Dear sir, thanks for the article. I have been working on extracting text from scanned PDF files
and I have used other python based libraries and tools to achieve the same. I will definitely give this one a
try also.
As a next step in my project I would like to overlay the text to the scanned PDF so that the PDF itself
becomes searchable. Would you be kind enough to guide me on how to do that programmatically on
windows.
Regards
Amar
REPLY
Adrian Rosebrock June 19, 2019 at 2:06 pm #
Sorry, I don’t know how to programmatically overlay a PDF with text. There may be Python
libraries for that, but you’ll need to do your own research.
Click on the window opened by OpenCV and press any key on your keyboard to advance
execution (the “cv2.waitKey(0)” call prevents execution from continuing until a key is
pressed).
REPLY
Madan June 21, 2019 at 3:01 am #
REPLY
Adrian Rosebrock June 26, 2019 at 1:47 pm #
ANPR systems are more advanced than just OCR. They also include localization
components as well. Refer to the PyImageSearch Gurus course for more details.
REPLY
Dinusha June 28, 2019 at 5:00 am #
Hi
I have tested this and it works fine without any problem for letters. But my problem is that when it recognizes
numbers, the OCR gives some wrong values compared with letters. What kind of configuration should I change
to improve the accuracy of recognizing numbers?
REPLY
Kiran July 19, 2019 at 8:35 am #
After detecting the text using east algorithm can we use this post (ocr, tesseract) to recognise
the text.
REPLY
Adrian Rosebrock July 25, 2019 at 9:40 am #
© 2019 PyImageSearch. All Rights Reserved.