Text_Recognition_in_Images_and_Converting_Recognized_Text_to_Speech__Image_Processing

This document presents a review of recent advancements in image-based text recognition and the conversion of recognized text to speech using image processing techniques. It discusses various methods such as Optical Character Recognition (OCR), Convolutional Neural Networks (CNN), and Text-to-Speech (TTS) synthesis, highlighting their applications, advantages, and challenges. The study aims to provide a comprehensive overview of current research while identifying gaps that require further investigation in the field.

Text Recognition in Images and Converting Recognized Text to Speech – Image Processing

2023 10th IEEE Uttar Pradesh Section International Conference on Electrical, Electronics and Computer Engineering (UPCON) | 979-8-3503-8247-1/23/$31.00 ©2023 IEEE | DOI: 10.1109/UPCON59197.2023.10434298

Yadwinder Singh, Department of Computer Science and Engineering, Lovely Professional University, Phagwara, India
Vikrant Sharma, Department of Computer Science & Engineering, Graphic Era Hill University, Dehradun
Arun Singh, Student, IEEE member, Department of Computer Science and Engineering, Lovely Professional University, Phagwara, Punjab, India
Shruti, Department of Computer Science and Engineering, Lovely Professional University, Phagwara, India
Manik Rakhra, School of Computer Science and Engineering, Lovely Professional University, Phagwara, India
Prerna Singh, Department of Computer Science and Engineering, Lovely Professional University, Phagwara, India
Desu Yadidya, Department of Computer Science and Engineering, Lovely Professional University, Phagwara, India
Dalwinder Singh, Assistant Professor, Head of Department, Computer Science and Engineering, Lovely Professional University, Phagwara, India
Vikas Verma, Department of Computer Science and Engineering, Lovely Professional University, Phagwara, India
Adwit Singh, Department of Computer Science and Engineering, University of Spain

Abstract: This review article provides an overview of recent advancements in image-based text recognition and in converting recognized text to speech using image processing techniques. The article covers various techniques for text recognition in images, including Optical Character Recognition (OCR), Convolutional Neural Networks (CNNs), and Recurrent Neural Networks (RNNs). Additionally, the process of converting recognized text to speech is discussed, including several Text-to-Speech (TTS) techniques such as concatenative TTS, formant synthesis, and parametric synthesis.

Keywords: Text recognition, image processing, speech synthesis, machine learning, deep learning, convolutional neural networks, natural language processing.

I. INTRODUCTION

Text recognition in images and the conversion of recognized text to speech are important areas of study in image processing. This paper reviews the most recent studies on both topics. It covers the various methods for text recognition and speech synthesis, including machine learning, deep learning, convolutional neural networks, and natural language processing. Its goal is to summarize the current state-of-the-art methods for speech synthesis and text recognition while also identifying open questions that need further investigation.

Efficient methods for extracting information from the vast amount of text data, which is available in a variety of formats, including images, are essential. There has been significant development in this field recently with the advent of machine learning, deep learning, and other cutting-edge techniques. The objective of this literature survey is to provide a comprehensive overview of the most recent research on text recognition in images and on converting recognized text into speech, highlighting the most significant advancements and identifying open research questions. The importance of text recognition in photographs is briefly discussed in this introduction.

A discussion of different text recognition techniques follows, including standard Optical Character Recognition (OCR) techniques as well as machine learning and deep learning techniques. In addition, the article covers the advantages and disadvantages of using convolutional neural networks (CNNs) and recurrent neural networks (RNNs) for text recognition in images. This is followed by a review of well-known text-to-speech conversion techniques, including concatenative synthesis, formant synthesis, and statistical parametric synthesis. The advantages and disadvantages of each method are also discussed.

Before summarizing the most recent approaches for text recognition in images and speech generation, the study also highlights the research gaps that need to be filled. In particular, it points to the need for more research into recognizing text in low-resolution images and producing natural-sounding speech. There is a lot of potential in this area, and more research will be required to develop methods for speech synthesis and text recognition that are more accurate and efficient.


Text recognition in images is also known as optical character recognition (OCR). With this technology, a computer can recognize and extract text from images or scanned documents. OCR digitizes printed or handwritten text and turns it into a format that other software applications can use to search, edit, and display the content.

OCR involves several steps. The image is first preprocessed to highlight the text and eliminate background noise. The text is then segmented and detected using pattern recognition algorithms. The final output is generated once the detected text has been validated and corrected [19].

Fig. 1. OCR

Once the text has been identified, text-to-speech (TTS) technology can convert it into speech. TTS is a technology that transforms written text into spoken audio. Convolutional Neural Networks (CNNs) have achieved outstanding results in a variety of computer vision applications, such as semantic segmentation, object detection, and image classification [17]. CNNs have also been used in recent years to recognize text in photos. Traditional CNN models, however, need a lot of memory and processing power, which makes them unsuitable for resource-limited devices such as smartphones and embedded systems [16].

The novelty lies in creating a system that can recognize handwritten or cursive text and convert it to speech in the same language, as well as a system that can recognize text in various languages. To address the resource problem, researchers have proposed a method dubbed CNN with Tensor Train decomposition (CNN TT). This method represents the CNN model more compactly, resulting in a smaller memory footprint and lower computational complexity without sacrificing speed [15].

Fig. 2. Process of OCR

This review provides an overview of the most current techniques for text detection in photos using CNNs and their variations. We focus on the CNN technique and its many applications, including document analysis, handwritten text analysis, and text recognition in scene photos. We discuss the field's challenges and likely future directions while highlighting the possible impact of CNNs on real-world applications. Text recognition in photographs requires finding and extracting text from the images. This can be difficult because of differences in typeface, size, orientation, lighting, and other factors that affect image quality. Text recognition experiments have demonstrated the effectiveness of CNNs [10-13], particularly when combined with additional methods like optical character recognition (OCR). The automatic learning of features from images made possible by CNNs can increase the accuracy of text recognition [14].

II. LITERATURE REVIEW

A. "Text Recognition in Pictures," a deep learning study by F. A. Rodriguez-Saona et al., was published in 2018. In this work, deep learning methods are used to detect text in photographs. To increase the accuracy of text recognition, the authors propose a new model that combines long short-term memory (LSTM) networks and convolutional neural networks (CNNs). After being trained on a sizable dataset of photos, the proposed model achieves state-of-the-art performance on a number of benchmarks.

B. "A Study of Text Detection and Recognition in Pictures" by S. R. Singh and S. Ghosh was published in 2019. This survey summarizes the many methods for text detection and recognition in photographs. The authors discuss the difficulties in text recognition and give a thorough breakdown of current developments. They also compare the results of various algorithms on common benchmarks and propose areas for further research.

III. METHODOLOGY

A. Working Principle

A system for text recognition in images and conversion of the detected text to voice takes a picture as its input and creates an audio file that reads the text found in the image. Such a converter works in the following stages [18]:

1) Image recognition:
The process starts by using image recognition technology to identify the text present in the picture. This is achieved with techniques like Optical Character Recognition (OCR), which extracts the text from the image and converts it to a machine-readable format.

2) Text processing:
After being extracted from the picture, the text is cleaned up to remove any extraneous characters, punctuation, or formatting. This helps to ensure that the final audio file is easy to listen to and accurately conveys the meaning of the text in the original image.

3) Text-to-speech synthesis:
After it has been cleaned up, the text is converted into an audio file using text-to-speech synthesis technology. Typically, a clear and intelligible computer-generated voice reads the material aloud.

4) Output:


The completed audio clip is then played back so that the user can hear the text being read aloud. This may be particularly useful for people who have trouble reading or who have vision problems, because it enables them to access and understand the content of the image without relying on their vision. An image-to-text-to-speech converter thus turns photos accurately and quickly into audio files by combining image recognition, text processing, and text-to-speech synthesis.

Fig. 3. Working of model

B. Algorithms

1) Image Processing:
Image processing can be used to improve the accuracy and efficiency of an image-to-text-to-speech converter. The following are some examples of its uses in this context:

a) Image improvement:
Before OCR, the image can be enhanced to improve contrast, sharpness, and general visual quality. This reduces errors in the text output and lets OCR systems work more effectively.

b) Noise reduction:
Scratches and other types of image noise can make OCR less accurate. Image processing methods such as filtering and denoising can reduce this noise and improve OCR accuracy.

c) Text detection:
Image processing can be used to identify and locate text in a photograph. This helps OCR algorithms by letting them focus only on the text and disregard non-text areas such as graphics or images.

d) Segmentation:
Image processing can be used to separate the text into individual characters or words, improving OCR accuracy by reducing the likelihood that one character will be mistaken for another.

2) Optical Character Recognition (OCR):
Optical character recognition (OCR) lets computers read printed or handwritten text from images such as scanned documents, photos, or screenshots. An OCR algorithm typically includes several stages:

a) Pre-processing:
The picture is enhanced and cleaned up to increase the OCR's accuracy. This might involve adjusting the contrast, aligning the text, or removing noise.

b) Binarization:
Converting the picture to black and white makes it easier for the OCR system to distinguish the text from the background.

c) Segmentation:
The OCR algorithm divides the picture into regions that correspond to individual characters or words.

d) Recognition:
The segmented words or characters are compared to a database of known words or characters, and the most probable match is chosen.

e) Post-processing:
Any errors or inconsistencies in the recognized text are corrected to boost accuracy. OCR algorithms differ in speed and quality, and some work better with particular types of text or languages than others. Thanks to advancements in machine learning and artificial intelligence, OCR accuracy has improved, and it is now more robust to changes in font, size, and image quality.

3) Text Processing:
An image-to-text-to-speech converter is not complete without text processing, which converts the detected text into a format that a text-to-speech engine can read aloud. Here are a few uses of text processing in this context:

a) Text normalization:
The recognized text may contain grammatical or spelling irregularities, such as typos, abbreviations, or incorrect capitalization. Text normalization standardizes the text and makes it easier to understand.

b) Text segmentation:
The recognized text may already be broken up into separate words or sentences, or it may simply be a continuous string of characters. Text segmentation breaks the text into meaningful units, such as words or sentences.

c) Text markup:
Text processing can also involve adding markup or tags to the text, which the text-to-speech engine can use to control how the material is spoken. For example, markup can specify how certain words or phrases should be pronounced or show where pauses or emphasis should be placed.

d) Language and voice selection:
Text processing may also entail selecting the appropriate language and voice for the text-to-speech output, depending on the user's preferences and the language of the recognized text.

4) Text-to-Speech (TTS):
The text-to-speech (TTS) component of an image-to-text-to-speech converter converts recognized text into spoken words that the user can hear. TTS is used in this context in the following ways:

a) Speech synthesis:
TTS employs a speech synthesis engine to create human-like speech from recognized text. The engine may generate speech using a pre-recorded voice or a text-to-speech synthesis algorithm.

b) Voice selection:
TTS gives users the option to pick the voice and language they prefer, or to match the language of the recognized text [20].
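As an illustration only, the binarization, segmentation, and recognition stages listed under Optical Character Recognition can be sketched on a toy image. This is not the method of any surveyed system: the sketch assumes dark text on a light background, uses Otsu's threshold for binarization, splits glyphs at ink-free columns, and matches each glyph against a three-letter template table. The names `TEMPLATES`, `render`, and `ocr` are hypothetical.

```python
"""Toy OCR stages on a tiny grayscale image (pure Python, illustrative only)."""

# Hypothetical "database of known characters": 5-row binary templates.
TEMPLATES = {
    "L": [[1, 0, 0], [1, 0, 0], [1, 0, 0], [1, 0, 0], [1, 1, 1]],
    "T": [[1, 1, 1], [0, 1, 0], [0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "I": [[1], [1], [1], [1], [1]],
}


def otsu_threshold(pixels):
    """Return the gray level (0-255) that maximizes between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w_dark, sum_dark = 0, 0
    for t in range(256):
        w_dark += hist[t]
        if w_dark == 0:
            continue
        w_light = total - w_dark
        if w_light == 0:
            break
        sum_dark += t * hist[t]
        mean_dark = sum_dark / w_dark
        mean_light = (sum_all - sum_dark) / w_light
        var_between = w_dark * w_light * (mean_dark - mean_light) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def segment_columns(binary):
    """Split a binary image into glyphs wherever a column has no ink."""
    width = len(binary[0])
    has_ink = [any(row[x] for row in binary) for x in range(width)]
    glyphs, start = [], None
    for x, ink in enumerate(has_ink + [False]):  # sentinel closes the last glyph
        if ink and start is None:
            start = x
        elif not ink and start is not None:
            glyphs.append([row[start:x] for row in binary])
            start = None
    return glyphs


def recognize(glyph, templates):
    """Return the template character with the fewest differing pixels."""
    def distance(a, b):
        if len(a) != len(b) or len(a[0]) != len(b[0]):
            return float("inf")  # different shape: cannot be this character
        return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return min(templates, key=lambda ch: distance(glyph, templates[ch]))


def render(text, ink=30, paper=220):
    """Draw `text` as a grayscale image with one blank column between glyphs."""
    rows = [[paper] for _ in range(5)]
    for ch in text:
        for y in range(5):
            rows[y] += [ink if bit else paper for bit in TEMPLATES[ch][y]] + [paper]
    return rows


def ocr(image):
    """Binarize, segment, and recognize a grayscale image given as rows of ints."""
    threshold = otsu_threshold([p for row in image for p in row])
    binary = [[1 if p <= threshold else 0 for p in row] for row in image]
    return "".join(recognize(g, TEMPLATES) for g in segment_columns(binary))


print(ocr(render("TILT")))  # prints TILT
```

In practice the template lookup would typically be replaced by a trained classifier such as a CNN, as discussed in the literature review, while the stage boundaries stay the same.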

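The OCR post-processing stage, in which errors in the recognized text are corrected, can be sketched as a dictionary lookup. The example below is a minimal sketch that assumes a small known vocabulary and uses `difflib.get_close_matches` from the Python standard library; the function `correct_words`, the list `VOCAB`, and the 0.8 similarity cutoff are illustrative choices, not values taken from the surveyed work.

```python
import difflib

# Hypothetical lexicon; a real system would load a full word list.
VOCAB = ["text", "recognition", "in", "images"]


def correct_words(text, vocabulary, cutoff=0.8):
    """Replace each word with its closest vocabulary entry, if similar enough.

    Words are lowercased before matching, so original capitalization is lost.
    Words with no sufficiently similar entry are kept unchanged.
    """
    corrected = []
    for word in text.split():
        match = difflib.get_close_matches(word.lower(), vocabulary, n=1, cutoff=cutoff)
        corrected.append(match[0] if match else word)
    return " ".join(corrected)


print(correct_words("text recogniton in imaqes", VOCAB))
```

A high cutoff keeps the correction conservative: a word is only changed when a vocabulary entry is a near match, which limits the risk of replacing a correctly recognized but unlisted word.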
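The text-processing steps described under Text Processing (normalization, segmentation, and markup) can be sketched in a few lines. The example below is a sketch under stated assumptions rather than a reference implementation: the abbreviation table and the function names are hypothetical, sentence splitting uses a naive punctuation rule, and the markup is SSML (the W3C Speech Synthesis Markup Language), which many but not all TTS engines accept.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical abbreviation table; a real system would use a larger one.
ABBREVIATIONS = {"dr.": "doctor", "fig.": "figure", "no.": "number"}


def normalize(text):
    """Collapse whitespace and expand a few known abbreviations."""
    words = re.sub(r"\s+", " ", text).strip().split(" ")
    return " ".join(ABBREVIATIONS.get(w.lower(), w) for w in words)


def split_sentences(text):
    """Split after '.', '!' or '?' followed by whitespace (a naive rule)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text) if s]


def to_ssml(sentences, lang="en-US", rate="medium", pause_ms=300):
    """Wrap sentences in SSML, with a pause after each and one speaking rate."""
    body = "".join(
        f'<s>{escape(s)}</s><break time="{pause_ms}ms"/>' for s in sentences
    )
    return (
        '<speak version="1.1" xmlns="http://www.w3.org/2001/10/synthesis" '
        f'xml:lang="{lang}"><prosody rate="{rate}">{body}</prosody></speak>'
    )


raw = "See   Fig. 2 for the\nOCR process. Text & images are handled!"
print(to_ssml(split_sentences(normalize(raw))))
```

Expanding abbreviations before splitting avoids a false sentence break after "Fig.", and escaping each sentence keeps characters such as "&" from breaking the markup. The `prosody` element is also where rate, pitch, and volume adjustments would go.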

c) Speech customization:
TTS can also tailor the speech output to the user's requirements. For instance, the tempo, loudness, and pitch of the voice can be changed to enhance its naturalness and intelligibility.

d) Sound output:
TTS generates audio that can be heard through speakers or headphones. The audio output can be recorded or streamed in real time depending on the user's needs.

By integrating TTS into an image-to-text-to-speech converter, the recognized text can be converted into spoken words that the user can hear. This can make the converter more accessible and usable for people with vision problems who might have difficulty reading text on a screen.

IV. RESULTS

Recent works in the field of text recognition in images and converting recognized text to speech have shown promising results. Deep learning-based methods like CNNs and RNNs have outperformed traditional methods. Concatenative TTS techniques have been found to produce more natural-sounding speech than parametric TTS techniques.

Text recognition in images and text-to-speech conversion are difficult image processing problems with many real-world uses. OCR, deep learning-based methods, rule-based methods, and other approaches have all been suggested as solutions. Despite the encouraging findings of recent studies, further work is required to increase the accuracy and efficiency of these methods. OCR is a powerful tool in image processing that enables the automated extraction of text from images, opening up a wide range of applications in various fields.

V. CONCLUSION

Text recognition in images and text-to-speech conversion are two important fields of study in image processing. The accuracy of text recognition has improved considerably as a result of recent advancements in deep learning techniques, and high-quality text-to-speech synthesis has been achieved using neural networks. These fields still require research, particularly in developing algorithms that can handle complex images and generate speech that sounds natural. Text recognition in images and text-to-speech conversion are essential tools in the contemporary world. Numerous methods have been developed for text recognition in images, and deep learning-based methods have shown encouraging results. As TTS synthesis techniques have developed over time, concatenative TTS techniques have been found to create more natural-sounding speech than parametric TTS techniques. Future work can concentrate on improving the accuracy of text detection in images and developing more efficient TTS synthesis techniques.

REFERENCES
[1] S. Arik et al., "Deep voice: Real-time neural text-to-speech," 34th International Conference on Machine Learning, 2017.
[2] Y. Wang et al., "Tacotron: Towards end-to-end speech synthesis," Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2017.
[3] A. Stepikhov, Text, Speech, and Conversation, 2013.
[4] "Voice to Text Conversion Using Android Platform," International Journal of Engineering Research Applications, 2013.
[5] B. V. P. and P. Khilari, "A Review on Voice To Text Conversion Techniques," International Journal of Advanced Research in Computer Engineering Technology, 2015.
[6] A. Graves and N. Jaitly, "Towards end-to-end speech recognition with recurrent neural networks," 31st International Conference on Machine Learning (ICML), 2014.
[7] C. Herff et al., "Brain-to-text: Decoding spoken phrases from phone representations in the brain," Front. Neurosci., 2015.
[8] J. Trmal, L. Ondel, S. Kesiraju, and L. Burget, "Bayesian joint-sequence models for grapheme-to-phoneme conversion," ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, 2017.
[9] S. O. Arik et al., "Deep voice 2: Multi-speaker neural text-to-speech," Advances in Neural Information Processing Systems, 2017.
[10] M. M. War, M. Rakhra and D. Singh, "Review On Application Based Bus Tracking System," 2022 5th International Conference on Contemporary Computing and Informatics (IC3I), Uttar Pradesh, India, 2022, pp. 876-880, doi: 10.1109/IC3I56241.2022.10072449.
[11] S. M. Makhdoomi, M. Rakhra, D. Singh and A. Singh, "Artificial-Intelligence based Prediction of Post-Traumatic Stress Disorder (PTSD) using EEG reports," 2022 5th International Conference on Contemporary Computing and Informatics (IC3I), Uttar Pradesh, India, 2022, pp. 1073-1077, doi: 10.1109/IC3I56241.2022.10072671.
[12] R. S. Kushwaha, M. Rakhra, D. Singh and A. Singh, "An Overview: Super-Image Resolution using Generative Adversarial Network for Image Enhancement," 2022 5th International Conference on Contemporary Computing and Informatics (IC3I), Uttar Pradesh, India, 2022, pp. 1243-1246, doi: 10.1109/IC3I56241.2022.10072862.
[13] T. Soewu, S. V. Uday Kalyan, M. Rakhra and D. Singh, "Lung Cancer Detection using Image Processing," 2022 5th International Conference on Contemporary Computing and Informatics (IC3I), Uttar Pradesh, India, 2022, pp. 1206-1211, doi: 10.1109/IC3I56241.2022.10072589.
[14] C. Harika, D. Singh, A. Singh and M. Rakhra, "IoT Solution for Automatic Watering System," 2022 5th International Conference on Contemporary Computing and Informatics (IC3I), Uttar Pradesh, India, 2022, pp. 1068-1072, doi: 10.1109/IC3I56241.2022.10073082.
[15] T. Soewu, Hemant, M. Rakhra and D. Singh, "Analysis of Data Mining-Based Approach for Intrusion Detection System," 2022 5th International Conference on Contemporary Computing and Informatics (IC3I), Uttar Pradesh, India, 2022, pp. 908-912, doi: 10.1109/IC3I56241.2022.10072828.
[16] A. Ansari, B. Kaur, M. Rakhra, A. Singh and D. Singh, "Handwritten Text Recognition using Deep Learning Algorithms," 2022 4th International Conference on Artificial Intelligence and Speech Technology (AIST), Delhi, India, 2022, pp. 1-6, doi: 10.1109/AIST55798.2022.10065348.
[17] R. Kumar Shukla, M. Rakhra, D. Singh and A. Singh, "The Role of Machine Learning in Health Care Diagnosis," 2022 4th International Conference on Artificial Intelligence and Speech Technology (AIST), Delhi, India, 2022, pp. 1-6, doi: 10.1109/AIST55798.2022.10064906.
[18] A. Singh and M. Rakhra, "A Review For Different Sign Language Recognition Systems," 2022 4th International Conference on Artificial Intelligence and Speech Technology (AIST), Delhi, India, 2022, pp. 1-6, doi: 10.1109/AIST55798.2022.10065037.
[19] M. K. Dath, M. Rakhra, D. Singh, A. Singh and R. Banala, "Basic design for the implementation of automatic surveillance system on helmet detection," 2022 4th International Conference on Artificial Intelligence and Speech Technology (AIST), Delhi, India, 2022, pp. 1-5, doi: 10.1109/AIST55798.2022.10065367.
[20] A. Sharma and D. Singh, "A Statistical Review on Machine Learning Based Medical Diagnostic Systems for Chronic Kidney Disease," 2022 3rd International Conference on Computation, Automation and Knowledge Management (ICCAKM), Dubai, United Arab Emirates, 2022, pp. 1-5, doi: 10.1109/ICCAKM54721.2022.9990508.
