Presentation 4
S COLLEGE OF ENGINEERING
Bull Temple Road, Basavanagudi, Bangalore - 560 019
OPTICAL TEXT TO SPEECH CONVERTER USING OCR AND TTS
Guided By :-
DR. LALAITHA.S
INTRODUCTION
• Language, whether spoken or written, is the oldest means of communication between
human beings.
• We therefore digitize images of written text, extract and interpret the data using
specific techniques, and then perform text-to-speech (TTS) synthesis.
• This is done to read the information aloud for the benefit and ease of the user.
• Text extraction and TTS can be used together to help people with reading
disabilities.
• This project presents an innovative, low-cost technique for hearing the contents of
a text image without having to read them.
LITERATURE SURVEY
PROBLEM DEFINITION
• The problem is to translate text for individuals who face a language barrier.
• Travelers and tourists who visit other states for vacation or business can also
face language barriers.
• The lack of accessible, efficient, and natural-sounding TTS solutions limits the
usability of systems for visually impaired individuals, language learners, and smart
device users.
PROPOSED SOLUTION
• User Interaction: The system is designed for ease of use, requiring the user to
simply press a single button to initiate the entire process. The button triggers
the image capture, OCR, and text-to-speech conversion automatically.
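The single-press flow described above can be sketched as three chained stages. This is a minimal sketch, not the project's actual code: the class name `ReaderPipeline` and the stage names `capture_image`, `extract_text`, and `speak` are hypothetical, and the stages are passed in as plain functions so the example runs without a camera, OCR engine, or speaker attached.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ReaderPipeline:
    """Chains the three stages that one button press triggers."""
    capture_image: Callable[[], bytes]     # hypothetical: returns raw image data
    extract_text: Callable[[bytes], str]   # hypothetical: OCR stage
    speak: Callable[[str], None]           # hypothetical: TTS stage

    def run_once(self) -> Optional[str]:
        """Run capture -> OCR -> TTS once; return the spoken text, or None."""
        image = self.capture_image()
        text = self.extract_text(image).strip()
        if not text:
            # Nothing readable was found, so tell the user instead of staying silent.
            self.speak("No text detected")
            return None
        self.speak(text)
        return text


# Stand-in stages so the sketch runs without a camera or speaker.
spoken = []
pipeline = ReaderPipeline(
    capture_image=lambda: b"fake-image-bytes",
    extract_text=lambda image: "  Hello world \n",
    speak=spoken.append,
)
result = pipeline.run_once()
print(result)  # Hello world
```

On the device, each stand-in would be replaced by a function wrapping the real camera, OCR, and TTS calls. Keeping the stages separate also lets each one be tested on its own.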
HARDWARE & SOFTWARE REQUIREMENTS AND ESTIMATED COSTS
HARDWARE
Raspberry Pi 3: ₹3645
Speaker: ₹249
Camera (OV5647): ₹200
Cables and Connectors: ₹100
Adapter: ₹199
PVC box: ₹350
Battery: ₹300
Switch: ₹50
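The slide lists item prices but no total. As a quick check, the listed prices can be summed as below. The figures are copied from the list, while the dictionary name `hardware_costs` is only illustrative.

```python
# Prices in rupees, copied from the hardware list above.
hardware_costs = {
    "Raspberry Pi 3": 3645,
    "Speaker": 249,
    "Camera (OV5647)": 200,
    "Cables and connectors": 100,
    "Adapter": 199,
    "PVC box": 350,
    "Battery": 300,
    "Switch": 50,
}

total = sum(hardware_costs.values())
print(f"Estimated hardware total: INR {total}")  # Estimated hardware total: INR 5093
```

The listed hardware therefore comes to an estimated ₹5093.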
SOFTWARE
Python 3 interpreter
Programming Language: Python
METHODOLOGY & IMPLEMENTATION
Block Diagram
1. Raspberry Pi 3B+
• A single-board computer used as the core processing unit.
• Features: quad-core 64-bit processor, 1 GB RAM, multiple GPIO pins for interfacing
with other devices, and a 3.5mm audio jack for sound output.
2. Push Button
• Acts as an input device to trigger image capture or processing.
• Connected to the GPIO pins of the Raspberry Pi.
• Enables user interaction with the system.
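One possible way to connect the push button to the processing job is sketched below. It assumes the gpiozero library and a button wired between GPIO17 and ground. Neither the library nor the pin number is stated in the slides, and `PressHandler` and `attach_to_gpio` are hypothetical names. The handler drops any press that arrives while a previous run is still in progress, so repeated presses do not start overlapping captures.

```python
import threading


class PressHandler:
    """Runs a job on each button press, ignoring presses while one is running."""

    def __init__(self, job):
        self._job = job
        self._busy = threading.Lock()

    def on_press(self):
        # Non-blocking acquire: a press during an ongoing run is dropped.
        if not self._busy.acquire(blocking=False):
            return False
        try:
            self._job()
            return True
        finally:
            self._busy.release()


def attach_to_gpio(handler, pin=17):
    """Wire the handler to a physical button (Raspberry Pi only).

    Assumes the gpiozero library; the pin number is an illustrative
    choice, not taken from the project.
    """
    from gpiozero import Button  # imported lazily so the sketch runs off-device
    button = Button(pin, bounce_time=0.1)  # 0.1 s debounce, an assumed value
    button.when_pressed = handler.on_press
    return button


# Off-device demonstration with a stand-in job.
runs = []
handler = PressHandler(lambda: runs.append("capture + OCR + TTS"))
handler.on_press()
print(runs)  # ['capture + OCR + TTS']
```

On the Raspberry Pi, calling `attach_to_gpio(handler)` once at startup would be enough. The returned `Button` object must be kept alive for the callback to keep firing.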
3. Webcam
• Captures images as input for the project.
• Connected to the Raspberry Pi via a USB port.
• The images are processed by software running on the Raspberry Pi.
4. Speakers
• Output the speech generated from the image processing result.
• Connected to the Raspberry Pi through the 3.5mm audio jack or USB (if using a USB
speaker).
5. Power Bank
• Serves as the power source for the Raspberry Pi and its peripherals.
• Connected via the Raspberry Pi's power supply port (micro-USB or USB-C
depending on the model).
Test Environment:
Lighting: Well-lit indoor settings with occasional tests in low-light conditions.
Text Source: Printed documents and posters in supported languages.
Camera: USB Webcam (720p resolution).
Parameters Evaluated:
OCR Accuracy: Ability to correctly extract text.
Language Detection: Ability to identify the dominant language.
TTS Clarity: Naturalness and intelligibility of spoken output.
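The slides do not say how the dominant language is identified. One simple approach, sketched here as an assumption, counts how many characters of the extracted text fall into the Unicode block of each script: Kannada (U+0C80 to U+0CFF), Devanagari for Hindi (U+0900 to U+097F), and ASCII letters for English. The function names are hypothetical.

```python
from collections import Counter


def script_of(ch):
    """Map one character to a language label by its Unicode block, or None."""
    code = ord(ch)
    if 0x0C80 <= code <= 0x0CFF:
        return "Kannada"
    if 0x0900 <= code <= 0x097F:
        return "Hindi"  # Devanagari block; other languages also use this script
    if ch.isascii() and ch.isalpha():
        return "English"
    return None  # digits, spaces, punctuation, and other scripts are ignored


def dominant_language(text):
    """Return the label covering the most characters, or None if none match."""
    counts = Counter(label for label in map(script_of, text) if label is not None)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


print(dominant_language("ನಮಸ್ಕಾರ world"))  # Kannada
print(dominant_language("1234 !!"))        # None
```

A script count cannot tell apart languages that share a script, such as Hindi and Marathi, so a real system may need a statistical language detector on top of it.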
RESULT ANALYSIS
Thresholds and Text Characteristics :-
Text Size: Text between 12pt and 72pt (font sizes commonly used in printed
documents) was extracted with 95% accuracy.
Small Text Issues: Fonts smaller than 10pt resulted in OCR inaccuracies, with
recognition rates dropping to between 60% and 75%. Decorative or cursive fonts saw
accuracy drop to 70% due to OCR limitations.
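The slides do not state how these accuracy percentages were measured. A common way to score OCR output, shown here only as an assumed metric, is character-level accuracy: one minus the edit distance between the ground-truth text and the OCR output, divided by the length of the ground truth. The sample strings below are invented for illustration and are not project data.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance (insertions, deletions, substitutions cost 1 each)."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_ch in enumerate(reference, start=1):
        current = [i]
        for j, hyp_ch in enumerate(hypothesis, start=1):
            cost = 0 if ref_ch == hyp_ch else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def character_accuracy(reference, hypothesis):
    """Fraction of reference characters recovered, clamped to 0.0 .. 1.0."""
    if not reference:
        return 1.0 if not hypothesis else 0.0
    errors = edit_distance(reference, hypothesis)
    return max(0.0, 1.0 - errors / len(reference))


# Invented sample: two wrong characters in a 22-character reference.
truth = "Optical text to speech"
ocr_output = "Optical test to speach"
print(f"{character_accuracy(truth, ocr_output):.1%}")  # 90.9%
```

Averaging this score over a set of test images for each font size would give figures comparable to those reported above.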
Translation: Translations from Kannada and Hindi to English showed ~90% accuracy for
simple sentences. Complex sentences with idiomatic expressions or ambiguous
contexts occasionally resulted in incorrect translations.
FUTURE TRENDS
• Compact Design: Develop an all-in-one device with integrated camera, speaker, and
tactile feedback for greater portability.
• Cloud Integration: Leverage cloud-based OCR and translation for faster processing
and real-time updates.
CONCLUSION
The Optical Text-to-Speech Converter Using OCR and TTS effectively transforms
printed text into audible speech, providing a valuable assistive tool for visually
impaired individuals.
THANK YOU