gImageReader is a simple Gtk/Qt front-end to the Tesseract OCR engine. Features include:
- Import PDF documents and images from disk, scanning devices, clipboard and screenshots
- Process multiple images and documents in one go
- Manual or automatic recognition area definition
- Recognize to plain text or to hOCR documents
- Recognized text displayed directly next to the image
- Post-process the recognized text, including spellchecking
- Generate PDF documents from hOCR documents
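
The hOCR output mentioned in the feature list is plain HTML in which each recognized word carries its bounding box in a `title` attribute, so it can be post-processed with ordinary tools. As a rough illustration, the sketch below pulls words and their boxes out of an hOCR string using only the Python standard library. The sample markup and the names `HocrWordParser`, `parse_bbox`, and `extract_words` are made up for this example and are not part of gImageReader; real hOCR files may carry additional properties.

```python
from html.parser import HTMLParser


def parse_bbox(title):
    """Return (x0, y0, x1, y1) from an hOCR title string, or None if absent."""
    for prop in title.split(";"):
        fields = prop.split()
        if len(fields) == 5 and fields[0] == "bbox":
            return tuple(int(v) for v in fields[1:])
    return None


class HocrWordParser(HTMLParser):
    """Collect (text, bbox) pairs from spans with class 'ocrx_word'."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._in_word = False
        self._bbox = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "span" and "ocrx_word" in classes:
            self._in_word = True
            self._bbox = parse_bbox(attrs.get("title") or "")
            self._text = []

    def handle_data(self, data):
        if self._in_word:
            self._text.append(data)

    def handle_endtag(self, tag):
        # Word spans do not contain other spans, so the first closing
        # span seen while inside a word ends that word.
        if tag == "span" and self._in_word:
            self.words.append(("".join(self._text).strip(), self._bbox))
            self._in_word = False


def extract_words(hocr):
    """Parse an hOCR string and return a list of (text, bbox) tuples."""
    parser = HocrWordParser()
    parser.feed(hocr)
    parser.close()
    return parser.words


# Hypothetical sample input, written by hand for this illustration.
SAMPLE = """<div class='ocr_page' title='bbox 0 0 200 100'>
<span class='ocr_line' title='bbox 10 20 120 40'>
<span class='ocrx_word' title='bbox 10 20 50 40; x_wconf 95'>Hello</span>
<span class='ocrx_word' title='bbox 60 20 120 40; x_wconf 91'>world</span>
</span></div>"""

for text, bbox in extract_words(SAMPLE):
    print(text, bbox)
```

Word positions recovered this way are what make it possible to place recognized text at the right spot on a page when generating a PDF from hOCR.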
**Note**: This page is only a mirror for the downloads. Development happens on GitHub at https://github.com/manisandro/gImageReader, where release binaries are also posted.
License
GNU General Public License version 3.0 (GPLv3)
User Reviews
- Thanks for the program under a free open source license!
- This software is very helpful for me. It works correctly in English and Vietnamese. It helps me so much in my studies. Thanks!!!
- Stable, and a nice touch is the OCR editing facility, which enables manual correction of automated OCR errors. The program would be further enhanced by an option to output the input PDF image file as a PDF image file with the OCR as a searchable text layer under the page image, instead of or in addition to the existing PDF output of OCR-only text without the image.
- I find gImageReader a practical and stable frontend to Tesseract OCR. Spell checking and editing text work fine for normal purposes, where Tesseract OCR nowadays is very accurate. Unusual fonts such as Fraktur (in the heads of many longstanding newspapers) are supported, and the Tesseract engine can (with effort) be trained on any uncommon typography. The gImageReader frontend is stable and uncomplicated to work with once set up, and the only improvement I can think of is a way to easily close the sources pane to maximize the head pane when reviewing and editing recognized text. A very useful program, thank you. Edit: glad to find that a toggle option for the suggested improvement already exists (I'm slightly visually impaired). The Tesseract Fraktur "language" I work with is fine but does not trigger the spell checker correctly, so the output is interpreted as having spelling errors and is underlined. This small problem (unrelated to gImageReader) could be solved by a spell checker on/off switch. Thank you again.
- This is a great tool for doing a proof of concept using Tesseract.