Text Extraction From Image: Team Members CH - Suneetha (19mcmb22) Mohit Sharma (19mcmb13)

The document discusses text extraction from images using optical character recognition (OCR). It begins by defining image processing and text extraction. Next, it describes the technologies used, including Python, Django, HTML, CSS, SQLite, and the OCR library Pytesseract. The implementation section explains that Pytesseract takes an input image and outputs the text. It then outlines the process: users input images, images are stored in a SQLite database using Django as the backend, and Pytesseract extracts text. Future work involves extracting text from videos. Applications are listed as document analysis, license plate recognition, paper analysis, and video subtitles. The document concludes by citing references used.

Text Extraction From Image

Team Members
CH. Suneetha (19mcmb22)
Mohit Sharma (19mcmb13)
Agenda
● What is image processing
● What is text extraction
● Technologies
● Implementation
● Future Implementations
● Applications
What is Image Processing
Image processing is a method of performing operations on an image in order
to obtain an enhanced image or to extract useful information from it.

Input: Image

Output: An image, or characteristics/features associated with that image

Steps in image processing:

1. Importing the image via image acquisition tools;

2. Analysing and manipulating the image;

3. Producing the output, which can be an altered image or a report based on
image analysis.
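
The three steps above can be sketched in plain Python. This is only an illustration: the tiny hard-coded image, the function names, and the choice of thresholding (binarization) as the "manipulation" step are our assumptions, not part of the project.

```python
from typing import Dict, List

# A grayscale image as rows of pixel values, 0 (black) to 255 (white).
Image = List[List[int]]


def acquire() -> Image:
    """Step 1: stand-in for image acquisition (returns a hard-coded 3x3 image)."""
    return [
        [12, 200, 34],
        [220, 40, 210],
        [15, 230, 25],
    ]


def binarize(image: Image, threshold: int = 128) -> Image:
    """Step 2: manipulate the image by mapping each pixel to pure black or white."""
    return [[255 if pixel >= threshold else 0 for pixel in row] for row in image]


def report(image: Image) -> Dict[str, float]:
    """Step 3: produce a small report based on analysis of the image."""
    pixels = [pixel for row in image for pixel in row]
    dark = sum(1 for pixel in pixels if pixel == 0)
    return {
        "width": len(image[0]),
        "height": len(image),
        "dark_fraction": dark / len(pixels),
    }


if __name__ == "__main__":
    altered = binarize(acquire())  # output as an altered image
    print(altered)
    print(report(altered))         # output as a report
```

The same output stage can therefore return either the altered image or a report, matching step 3.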
What Is Text Extraction
Text extraction, also known as information extraction, is the process of
extracting information from text. Here it means recovering the text that is
present in an image or video.

Input: Digital image or video

Output: The text present in the image or video

Technologies and Software
Technologies:

● Python
● Django
● HTML
● CSS
● SQLite
● OCR (Optical Character Recognition)

Software:

● Atom as the text editor

● Operating system: Ubuntu; processor: Intel Core i5 (4th gen); 4 GB RAM
Implementation
To extract text from an image, we use the optical character recognition (OCR)
image processing technique.

Optical character recognition (OCR) refers to both the technology and the
process of reading and converting typed, printed or handwritten
characters into machine-encoded text or something that the computer
can manipulate.
Implementation of OCR
We implement OCR in Python using the pytesseract library, a Python wrapper
for the Tesseract OCR engine.

Pytesseract takes the image path as input and returns the text that is
present in the image to the user.
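
A minimal sketch of this call is shown below. `pytesseract.image_to_string` is the library function that accepts an image path and returns text; the wrapper name `extract_text` and its optional `ocr` parameter are our own assumptions, added so the function can be tried without Tesseract installed. Running the default path requires both the pytesseract package and the Tesseract engine on the machine.

```python
from pathlib import Path
from typing import Callable, Optional


def extract_text(image_path: str, ocr: Optional[Callable[[str], str]] = None) -> str:
    """Return the text found in the image at image_path.

    `ocr` is a hypothetical hook: any callable mapping an image path to text.
    When omitted, pytesseract.image_to_string is used.
    """
    if not Path(image_path).is_file():
        raise FileNotFoundError(image_path)
    if ocr is None:
        # Imported lazily so this module still loads where pytesseract is absent.
        import pytesseract
        ocr = pytesseract.image_to_string
    # OCR output often carries leading or trailing whitespace; trim it.
    return ocr(image_path).strip()
```

Usage would look like `print(extract_text("sample.png"))`, where `sample.png` is a placeholder file name.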
Process
● First, we design a page where the user can input an image, using HTML and
CSS.
● The image given by the user is taken as a URL and stored in a SQLite
database. To connect the front end with the database, we use Django as the
backend technology.
● Using the pytesseract package in Python, we apply OCR to the image the
user provided, which returns the text found in it.
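
The storage and extraction steps above can be sketched as follows. Note the substitution: the project uses Django as the backend, whereas this sketch talks to SQLite directly through Python's standard `sqlite3` module so that it runs on its own. The table name `uploads`, its columns, and the function names are hypothetical, and the OCR step is passed in as a callable (for example `pytesseract.image_to_string`).

```python
import sqlite3
from typing import Callable, Optional


def create_store(db_path: str = ":memory:") -> sqlite3.Connection:
    """Open the SQLite database and create the (hypothetical) uploads table."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS uploads ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "image_url TEXT NOT NULL, "
        "extracted_text TEXT)"
    )
    return conn


def process_upload(
    conn: sqlite3.Connection, image_url: str, ocr: Callable[[str], str]
) -> int:
    """Run OCR on the uploaded image, store URL and text, return the row id."""
    text = ocr(image_url)
    cursor = conn.execute(
        "INSERT INTO uploads (image_url, extracted_text) VALUES (?, ?)",
        (image_url, text),
    )
    conn.commit()
    return cursor.lastrowid


def get_text(conn: sqlite3.Connection, upload_id: int) -> Optional[str]:
    """Fetch the stored text for one upload, or None if the id is unknown."""
    row = conn.execute(
        "SELECT extracted_text FROM uploads WHERE id = ?", (upload_id,)
    ).fetchone()
    return row[0] if row else None
```

In a Django project the table would normally be declared as a model and the two functions would live in a view, but the flow (receive image, run OCR, store, return text) stays the same.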
Future Implementations

In the future, we plan to extend this technique to text extraction from
video streams using image processing techniques.
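
One possible shape for this extension, offered only as a sketch since the video work is not yet built: sample every N-th frame, run OCR on it, and keep the text only when it differs from the previous sampled frame. Decoding the video into frames would need a separate library and is not shown; the function name, the `every_nth` parameter, and the de-duplication rule are all our assumptions.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

Frame = TypeVar("Frame")  # whatever object the video decoder yields per frame


def extract_text_from_frames(
    frames: Iterable[Frame],
    ocr: Callable[[Frame], str],
    every_nth: int = 10,
) -> List[Tuple[int, str]]:
    """Return (frame_index, text) pairs for sampled frames whose text changed."""
    if every_nth < 1:
        raise ValueError("every_nth must be at least 1")
    results: List[Tuple[int, str]] = []
    last_text = None
    for index, frame in enumerate(frames):
        if index % every_nth != 0:
            continue  # skip frames between samples to limit OCR calls
        text = ocr(frame).strip()
        # Record only non-empty text that differs from the previous sample.
        if text and text != last_text:
            results.append((index, text))
        last_text = text
    return results
```

Sampling keeps the number of OCR calls manageable, since neighbouring frames usually show the same text.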
Applications
● Document Analysis
● Licence Plate Extraction from Vehicles
● Technical Paper Analysis
● Video Subtitle Extraction
Research Paper
References

1. OCR for Devnagari Script by Mahesh Goyani

2. Edge Based Text Extraction From Complex Images by Xiaoqing Liu
and Jagath Samarabandu

3. Automatic Text Detection using Morphological Operations and
Inpainting by Khyati Vaghela

4. Font and Background Color Independent Text Binarization by T. Kasar,
J. Kumar, A. G. Ramakrishnan

Research paper link: http://www.ijera.com/papers/Vol8_issue5/Part-5/D0805052733.pdf
Thank You
