Presentation 4

The document outlines the development of an Optical Text-to-Speech (TTS) converter that utilizes Optical Character Recognition (OCR) to transform printed or handwritten text into audible speech, aimed at assisting individuals with reading disabilities and language barriers. The proposed system integrates a Raspberry Pi, camera, and speaker, allowing users to capture images and convert the extracted text into speech with a user-friendly interface. The project demonstrates high accuracy in text recognition and translation across multiple languages, while also suggesting future enhancements for improved functionality and accessibility.

Uploaded by

chethanm9945

B.M.S. COLLEGE OF ENGINEERING
Bull Temple Road, Basavanagudi, Bangalore - 560 019

OPTICAL TEXT TO
SPEECH CONVERTER
USING OCR AND TTS

Guided By :-
DR. LALAITHA.S
INTRODUCTION

• The optical text-to-speech (TTS) converter is an assistive technology that lets users access text contained in images and have it translated into different languages.

• Language, whether spoken or written, is the oldest means of communication between human beings.

• The system therefore digitizes images of text, extracts and interprets the text, and then performs text-to-speech (TTS) synthesis.

• The extracted information is read aloud for the benefit and ease of the user. Used together, text extraction and TTS can help people with reading disabilities.

• The project presents a low-cost technique for hearing the contents of a text image without having to read them.
LITERATURE SURVEY

[Table: S. No. | Title | IEEE | Limitation. The table rows were not preserved in the extracted slides.]
PROBLEM DEFINITION

• People who face a language barrier need text translated into a language they understand.

• Travelers and tourists who visit other states for vacation or business can also face language barriers.

• Some individuals have reading disabilities.

• The lack of accessible, efficient, and natural-sounding TTS solutions limits the usability of such systems for visually impaired individuals, language learners, and smart device users.
PROPOSED SOLUTION

• An Optical Image Text-to-Speech (OITS) translator is a system that automatically converts printed or handwritten text into speech.

• The system integrates Optical Character Recognition (OCR) to extract text from images (such as photos of documents or books) and Text-to-Speech (TTS) synthesis to vocalize the extracted text.

• Text-to-Speech Conversion: The extracted text is converted into speech using the Festival Text-to-Speech (TTS) engine. The system can read the text aloud in real time with a clear and natural voice.

• User Interaction: The system is designed for ease of use. The user presses a single button to start the entire process, and the button triggers image capture, OCR, and text-to-speech conversion automatically.
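The single-button flow described above (capture, then OCR, then speech) can be sketched as one function that chains three stages. This is a minimal sketch, not the project's actual code: the function names `run_once`, `clean_ocr_text`, and `speak_with_festival` are hypothetical, and the capture and OCR stages are passed in as callables because the slides do not say which libraries implement them. The only external tool assumed is the Festival command-line program, which reads text from standard input when started with `--tts`.

```python
import subprocess
from typing import Callable


def festival_command() -> list[str]:
    """Command line that makes Festival speak text read from standard input."""
    return ["festival", "--tts"]


def speak_with_festival(text: str) -> None:
    """Pipe text to the Festival CLI (assumes Festival is installed)."""
    subprocess.run(festival_command(), input=text, text=True, check=True)


def clean_ocr_text(raw: str) -> str:
    """Collapse the stray whitespace and line breaks that OCR output often has."""
    return " ".join(raw.split())


def run_once(capture: Callable[[], object],
             ocr: Callable[[object], str],
             speak: Callable[[str], None]) -> str:
    """One button press: capture an image, extract its text, speak it."""
    image = capture()
    text = clean_ocr_text(ocr(image))
    if text:  # stay silent when OCR finds no text
        speak(text)
    return text
```

On the device, `speak` would be `speak_with_festival`, and `ocr` could wrap a call such as `pytesseract.image_to_string` if Tesseract is the OCR engine; the slides do not name the engine, so that choice is an assumption. Passing the stages in as arguments also lets the flow be tried on a desktop with stand-in functions before any hardware is connected.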
HARDWARE & SOFTWARE REQUIREMENTS AND ESTIMATED COSTS

HARDWARE

• Raspberry Pi 3: ₹3645
• Speaker: ₹249
• Camera (OV5647): ₹200
• Cables and connectors: ₹100
• Adapter: ₹199
• PVC box: ₹350
• Battery: ₹300
• Switch: ₹50
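
The slides list per-item prices but no total. As a quick check, the prices above can be summed in a few lines; the dictionary name `HARDWARE_COSTS` and the function `total_cost` are only illustrative. The listed items add up to ₹5093.

```python
# Prices in rupees, copied from the hardware list above.
HARDWARE_COSTS = {
    "Raspberry Pi 3": 3645,
    "Speaker": 249,
    "Camera (OV5647)": 200,
    "Cables and connectors": 100,
    "Adapter": 199,
    "PVC box": 350,
    "Battery": 300,
    "Switch": 50,
}


def total_cost(costs: dict[str, int]) -> int:
    """Sum the per-item prices to get the estimated hardware budget."""
    return sum(costs.values())


if __name__ == "__main__":
    print("Estimated hardware cost (INR):", total_cost(HARDWARE_COSTS))
```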

SOFTWARE

• Python 3 interpreter
• Programming language: Python
METHODOLOGY & IMPLEMENTATION
Block Diagram
1. Raspberry Pi 3B+
A single-board computer used as the core processing unit.
Features:
• Quad-core 64-bit processor.
• 1 GB RAM.
• Multiple GPIO pins for interfacing with other devices.
• 3.5mm audio jack for sound output.

2. Push Button
• Acts as an input device to trigger image capture or processing.
• Connected to the GPIO pins of the Raspberry Pi.
• Enables user interaction with the system.

3. Webcam
• Captures images as input for the project.
• Connected to the Raspberry Pi via a USB port.
• The images are processed by software running on the Raspberry Pi.

4. Speakers
• Output the speech generated from the image processing result.
• Connected to the Raspberry Pi through the 3.5mm audio jack or USB (if using a USB speaker).

5. Power Bank
• Serves as the power source for the Raspberry Pi and its peripherals.
• Connected via the Raspberry Pi's power supply port (micro-USB or USB-C depending on the model).

Connections in the Diagram:

• Power Supply: Power bank to Raspberry Pi for continuous operation.
• USB: For connecting the webcam to capture images.
• 3.5mm Audio Jack: For connecting the speakers to output audio.
• GPIO: Push button connected for user input.
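
A push button on a GPIO pin tends to produce several rapid signals per physical press, so the trigger usually needs some form of debouncing. The slides do not describe how the button is read; the sketch below is one possible approach, with the class name `DebouncedTrigger`, the 0.3 second interval, and pin number 17 all chosen as assumptions. The GPIO part uses the `gpiozero` library's `Button` class and is imported lazily so that the debouncing logic can be tried without a Raspberry Pi.

```python
import time
from typing import Callable, Optional


class DebouncedTrigger:
    """Run an action on a button press, ignoring presses that follow too closely."""

    def __init__(self, action: Callable[[], None],
                 min_interval_s: float = 0.3,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._action = action
        self._min_interval_s = min_interval_s
        self._clock = clock
        self._last_fired: Optional[float] = None

    def press(self) -> bool:
        """Handle one press event; return True if the action was run."""
        now = self._clock()
        too_soon = (self._last_fired is not None
                    and now - self._last_fired < self._min_interval_s)
        if too_soon:
            return False
        self._last_fired = now
        self._action()
        return True


def attach_to_gpio(trigger: DebouncedTrigger, pin: int = 17):
    """Connect the trigger to a physical button (pin 17 is an assumed wiring)."""
    from gpiozero import Button  # only available on the Raspberry Pi
    button = Button(pin)
    button.when_pressed = trigger.press
    return button  # keep a reference so the button is not garbage-collected
```

On the device, the script also has to stay alive after `attach_to_gpio` returns (for example with `signal.pause()`), since the press callback runs in the background.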
FLOW CHART
RESULTS & DISCUSSIONS
Experimental Setup: The system was tested using text in Kannada, Hindi, and
English under various conditions to evaluate the effect of:
1. Text clarity and size.
2. Background complexity.
3. Lighting conditions.

Test Environment:
Lighting: Well-lit indoor settings with occasional tests in low-light conditions.
Text Source: Printed documents and posters in supported languages.
Camera: USB Webcam (720p resolution).

Parameters Evaluated:
OCR Accuracy: Ability to correctly extract text.
Language Detection: Ability to identify the dominant language.
TTS Clarity: Naturalness and intelligibility of spoken output.
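
The slides do not state how OCR accuracy was measured. One common way to score it, shown here purely as an assumed metric, is character-level accuracy: one minus the edit distance between the recognized text and a hand-typed reference, divided by the length of the reference. The function names below are illustrative.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_accuracy(reference: str, recognized: str) -> float:
    """Share of reference characters recovered correctly, clamped to [0, 1]."""
    if not reference:
        return 1.0 if not recognized else 0.0
    errors = edit_distance(reference, recognized)
    return max(0.0, 1.0 - errors / len(reference))
```

For example, recognizing "hallo" for the reference "hello" is one substitution in five characters, so the score is about 0.8. Because the metric works on Unicode characters, the same function applies to Kannada and Hindi text.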
RESULT ANALYSIS
Thresholds and Text Characteristics:

Text Size: Text between 12pt and 72pt (font sizes commonly used in printed
documents) was extracted with 95% accuracy.

Small Text Issues: Fonts smaller than 10pt caused OCR inaccuracies, with
recognition rates dropping to between 60% and 75%. Decorative or cursive fonts
saw accuracy drop to 70% due to OCR limitations.

Language Detection: Correctly identified the dominant language in 93% of cases
when single-language text was used. For mixed-language documents, results were
inconsistent, with ~75% accuracy in detecting the dominant language.
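
The detection method is not described in the slides. Since the three tested languages use different scripts, one simple way to pick a dominant language, sketched here as an assumption, is to count characters per Unicode block: Devanagari (U+0900 to U+097F) for Hindi, Kannada (U+0C80 to U+0CFF) for Kannada, and ASCII letters for English. Mapping a script directly to one language is itself a simplification, because Devanagari is also used for other languages such as Marathi.

```python
from collections import Counter
from typing import Optional

# Unicode block ranges for the non-Latin scripts used in the tests.
SCRIPT_RANGES = {
    "Hindi": (0x0900, 0x097F),    # Devanagari block
    "Kannada": (0x0C80, 0x0CFF),  # Kannada block
}


def language_of_char(ch: str) -> Optional[str]:
    """Map one character to a language label, or None for digits, spaces, etc."""
    code = ord(ch)
    for language, (low, high) in SCRIPT_RANGES.items():
        if low <= code <= high:
            return language
    if ch.isascii() and ch.isalpha():
        return "English"
    return None


def dominant_language(text: str) -> Optional[str]:
    """Return the language whose script covers the most characters in text."""
    counts = Counter(label for label in map(language_of_char, text) if label)
    if not counts:
        return None
    return counts.most_common(1)[0][0]
```

A character count like this shows why mixed-language documents are harder: when two scripts appear in similar amounts, a few misrecognized characters are enough to change the winner.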

Translation: Kannada and Hindi translations to English showed ~90% accuracy for
simple sentences. Complex sentences with idiomatic expressions or ambiguous
contexts occasionally resulted in incorrect translations.
FUTURE TRENDS

• Additional Language Support: Extend capabilities to support more regional and global languages.

• Improved OCR: Integrate AI-powered OCR for better recognition of cursive or decorative text and handwritten scripts.

• Compact Design: Develop an all-in-one device with integrated camera, speaker, and tactile feedback for greater portability.

• Cloud Integration: Leverage cloud-based OCR and translation for faster processing and real-time updates.

• Accessibility Features: Add voice commands, braille displays, or haptic feedback for broader usability.
CONCLUSION

The Optical Image-to-Speech Converter Using OCR and TTS effectively transforms
printed text into audible speech, providing a valuable assistive tool for visually
impaired individuals.

The system demonstrated:

• Multilingual Support: Accurate recognition and translation of Kannada, Hindi, and
English text.
• High Performance: Reliable OCR and TTS outputs, with an average accuracy of
~90% for clear and well-lit images.
• User-Friendly Design: Interactive GPIO buttons and LEDs ensure ease of use and
portability.

By combining image preprocessing, language translation, and text-to-speech synthesis,
the project successfully created a portable, cost-effective solution for bridging
accessibility gaps in textual content.
THANK YOU
