Dsaa Project Initial Proposal GROUP-11 Members: Vaishnavi NV, Syed Jahangir Peeran, V Tejkiran, V Sai Rathan, Srilekha N
INITIAL PROPOSAL
GROUP-11
Members: Vaishnavi NV, Syed Jahangir Peeran, V Tejkiran, V Sai Rathan, Srilekha N.
Aim
To extract meaningful text from a given image using optical character recognition
(OCR) combined with machine learning, and then convert the extracted text into speech.
Applications
(i) Software that reads bedtime stories aloud or serves as a textbook reader for students.
(ii) An OCR-based app with access to the camera and speech output that reads out the
street name or door number to a blind person.
Challenges
(i) We will try to recognise text even when it appears broken in the image.
(ii) We will add a machine learning algorithm to improve the accuracy of the OCR.
(iii) For better accuracy, we will run the recognised text through a grammar/spelling
checker to better predict the intended word or sentence.
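The spelling check in (iii) could be sketched as follows. This is only a minimal illustration, assuming a small known vocabulary and using the closest-match search from Python's standard `difflib` module; the function names `correct_word` and `correct_text`, the sample vocabulary, and the cutoff of 0.6 are assumptions made for the example, not part of the proposal.

```python
import difflib
from typing import Iterable


def correct_word(word: str, vocabulary: Iterable[str], cutoff: float = 0.6) -> str:
    """Return the closest vocabulary word, or the word unchanged if none is close.

    The cutoff (0..1) is an assumed similarity threshold; higher is stricter.
    """
    matches = difflib.get_close_matches(
        word.lower(), list(vocabulary), n=1, cutoff=cutoff
    )
    return matches[0] if matches else word


def correct_text(text: str, vocabulary: Iterable[str]) -> str:
    """Correct each whitespace-separated word of the OCR output independently."""
    vocab = list(vocabulary)
    return " ".join(correct_word(w, vocab) for w in text.split())


if __name__ == "__main__":
    # Hypothetical vocabulary and OCR output with digits misread for letters.
    vocab = ["main", "street", "door", "number"]
    print(correct_text("ma1n stre3t", vocab))
```

A word-by-word lookup like this ignores context; a grammar checker or language model would be needed to choose between several plausible words.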
Input
A photo containing text is supplied as the input for processing.
Processing
(i) Detecting only the text regions in the image using the sliding-window technique.
(ii) Segmenting characters.
(iii) Classifying characters.
(iv) Combining the letters and merging words to form sentences.
(v) Checking the correctness of the resulting text.
(vi) Converting the obtained text to speech.
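Steps (i) and (ii) above can be illustrated with a small sketch. It assumes the image has already been binarised into a grid of 0s and 1s (1 = ink). The window score used here, the fraction of ink pixels, is only a placeholder for the trained text/non-text classifier the project would use, and the segmentation simply splits at empty columns. All function names, the window size, the stride and the threshold are assumptions made for the example.

```python
from typing import Iterator, List, Tuple

Image = List[List[int]]  # binary image: 1 = ink (dark pixel), 0 = background


def sliding_windows(height: int, width: int, win_h: int, win_w: int,
                    stride: int) -> Iterator[Tuple[int, int]]:
    """Yield the (row, col) top-left corner of every window position."""
    for r in range(0, height - win_h + 1, stride):
        for c in range(0, width - win_w + 1, stride):
            yield r, c


def ink_fraction(img: Image, r: int, c: int, win_h: int, win_w: int) -> float:
    """Placeholder score: share of ink pixels inside the window.

    A trained text/non-text classifier would replace this function.
    """
    ink = sum(sum(row[c:c + win_w]) for row in img[r:r + win_h])
    return ink / (win_h * win_w)


def detect_text_windows(img: Image, win_h: int, win_w: int, stride: int,
                        threshold: float) -> List[Tuple[int, int]]:
    """Return the corners of windows whose score reaches the threshold."""
    height, width = len(img), len(img[0])
    return [(r, c)
            for r, c in sliding_windows(height, width, win_h, win_w, stride)
            if ink_fraction(img, r, c, win_h, win_w) >= threshold]


def segment_characters(img: Image) -> List[Tuple[int, int]]:
    """Split a text line into characters at empty columns.

    Returns half-open (start, end) column ranges, one per character.
    """
    width = len(img[0])
    column_has_ink = [any(row[c] for row in img) for c in range(width)]
    spans: List[Tuple[int, int]] = []
    start = None
    for c, has_ink in enumerate(column_has_ink):
        if has_ink and start is None:
            start = c                      # a character begins here
        elif not has_ink and start is not None:
            spans.append((start, c))       # a character ended at the gap
            start = None
    if start is not None:                  # character touching the right edge
        spans.append((start, width))
    return spans


if __name__ == "__main__":
    # Toy 4x8 image with a 2x2 blob and a thin vertical stroke.
    toy = [
        [0, 0, 0, 0, 0, 0, 0, 0],
        [0, 1, 1, 0, 0, 1, 0, 0],
        [0, 1, 1, 0, 0, 1, 0, 0],
        [0, 0, 0, 0, 0, 0, 0, 0],
    ]
    print(detect_text_windows(toy, win_h=2, win_w=2, stride=1, threshold=0.75))
    print(segment_characters(toy))
```

Splitting at empty columns fails when characters touch or when a single character is broken into pieces, which is one reason the proposal pairs segmentation with a learned classifier and a later correctness check.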
Output
A speech signal that reads out the text obtained in the processing step above.
References
(i) Chen, Huizhong, et al. "Robust Text Detection in Natural Images with
Edge-Enhanced Maximally Stable Extremal Regions." 2011 18th IEEE International
Conference on Image Processing (ICIP). IEEE, 2011.
(ii) https://www.coursera.org/learn/machine-learning