OCR of Kannada Characters Using Deep Learning
Abstract—Kannada is a Dravidian language of south India whose script consists of the Kannada numerals 0 to 9 and 49 letters, which are further classified into swara, vyanjana and yogavahagalu. The task of Optical Character Recognition (OCR) is to transform printed or handwritten text into digital form. This technique can be used to extract Kannada numerals and letters from images of handwritten documents that are processed with OpenCV using image processing techniques such as segmentation, skew correction and slant correction. Deep learning is a subset of machine learning in which artificial neural networks, algorithms inspired by the human brain, learn from large amounts of data. A convolutional neural network (CNN) is a deep learning technique that can be used to train a model and classify Kannada characters using TensorFlow and Keras. Our study showed that our model outperformed present methods for classifying Kannada numerals and characters, reaching 100% accuracy on the tested input images.
Keywords—Kannada OCR, Optical Character Recognition, Deep Learning, CNN.
I. INTRODUCTION
Kannada, a scheduled classical language under the Indian Constitution, is mainly spoken in the south Indian state of Karnataka. It is the second-oldest Dravidian language and has a non-Latin script with forty-nine phonemic letters, classified into three groups: swaras (13 vowels), vyanjanas (34 consonants) and yogavahagalu (2 letters which are neither vowels nor consonants). Kannada optical character recognition is complex compared to present Latin character recognition systems [1]. At present, most research papers use only ideal characters for Kannada optical character recognition systems, whereas real-time handwritten document recognition faces harder problems due to constraints such as inadequate spacing between words and the presence of vattu and dheergas, which amplify the segmentation issues [2].
OCR is a methodology for transforming handwritten or printed text into a digitized format. The system can be either online or offline. Online OCR, as used in Google's translation tools, works on script written using an optical pen and an electronic pad, whereas offline OCR works on handwritten or printed text [2]. OCR in non-Latin languages is very challenging due to the huge character set and the complexity attached to it. The performance of OCR can be improved using image processing steps such as segmentation and pre-processing [3].
This study aims to perform optical character recognition on handwritten Kannada documents using OpenCV, Keras and a convolutional neural network.
This paper proposes a model to perform line, word and character segmentation of handwritten documents for character recognition. Pre-processing techniques that remove unwanted information from the image are also discussed. Finally, the model is trained on the Chars74K dataset and tested on the input images, and its performance is presented as results.
Fig. 1. Kannada Characters
This paper is organized as follows. Section 2 gives a brief idea of existing systems. Section 3 describes our dataset. Section 4 reviews previous and associated work. Section 5 presents the proposed model. Section 6 discusses the results. Finally, Section 7 concludes the paper, followed by the references.
II. BACKGROUND
A. Optical Character Recognition
The roots of OCR can be traced back to the telegraph. The scientific work of E. Goldberg, a physicist, during the First World War led to the invention of a device that could decode text and encode it into telegraph code [8]. By 1920, his work had produced the first electronic document retrieval system, a milestone in the history of optical character recognition [4].
Comparison of Accuracies
Author                      Accuracy
J. S. Pradeep et al.        98%
K. K. Dutta et al.          89.39%
Anushua Banerjee et al.     95.69%
Prasad et al.               90%
K. Indira et al.            97.14%
Authorized licensed use limited to: M S RAMAIAH INSTITUTE OF TECHNOLOGY. Downloaded on November 15,2024 at 08:48:30 UTC from IEEE Xplore. Restrictions apply.
A. Segmentation
a) Line Segmentation: In the first step, individual lines in the input image are identified using bounding boxes and separated using a contour detection technique; with this technique, lines, words and characters are segmented. The next step is image enhancement, which removes unwanted information that could otherwise cause the actual information to be lost. At this stage the external contours are detected over the full image [2]. Compression techniques are used to discard other detections, such as diagonal contours, and to keep only the rectangle around each line. For these rectangular contours, a bounding box is obtained around each line. Finally, feature extraction is performed over the bounded region and each region is stored as a separate line [10].
b) Word Segmentation: At this stage each line is stored as a separate image, and these images are used for word segmentation. The same image enhancement is applied to remove unwanted information from the image, and text blobs are detected [2]. In our study we found that blob detection is very precise but consumes more time than image enhancement techniques. Therefore, considering the time complexity, dilation is used to achieve the required results [11].
c) Character Segmentation: Having validated the bounding box technique in line and word segmentation, we applied it to the word-segmented images. Bounding boxes for single characters were obtained by vertical projection, which was the most feasible way to achieve the results [12].
B. Pre-processing
For any image, the most important step is pre-processing, which includes taking the input, processing the image, extracting essential features while discarding redundant information, and producing an output [13]. Pre-processing saves most of the memory that would otherwise be wasted on storing redundant data, and it also saves time [5].
Image pre-processing includes the following procedures:
• Grey scale conversion – images are converted to grey scale
• Sampling – extra padding around the main frame is removed from the input image
• Contrast Normalization – contrast enhancement is performed on the input image
• Thinning – an important step through which the skeletal form of the image is obtained
• Slanting – a slant removal technique is used to remove inclination errors
• Skewing – the image is corrected and the image baseline is created using this technique
• Noise Removal – noise is removed from the input image using denoising algorithms
• Binarization – standard binarization techniques are used to perform this step
After pre-processing, the image is stored for further processing, which includes augmentation and feature extraction.
C. Augmentation
Image processing consumes a huge amount of memory and therefore requires more training time, which is undesirable. Hence, the data in hand is augmented and used for training the model [14]. This important procedure helps the algorithm overcome the problem of overfitting, thus making the model efficient [8].
Augmentation includes the following procedures on input images:
• Resizing – analyzing images with a fixed resolution generally makes the model robust, so the image resolution is fixed to 48 x 48
• Smoothening – as binarization has already been performed, blurring is applied in this step to suppress the noise in images with less clarity
• Padding – to make the model robust, the image is centered by adding some pixels as padding
• Aspect ratio – when the image is resized, its proportions must not change. Therefore, it is stretched both vertically and horizontally.
• Rotation – similarly, if slant detection is positive, the image is rotated by the required angle to make it horizontal
D. Convolutional Neural Network
A convolutional neural network (CNN), as already discussed, is a type of neural network model that uses convolutional layers and pooling techniques to process data [15]. Keras is a Python library for deep learning that simplifies the implementation of CNNs by using Google's TensorFlow, a library for machine learning and its applications, as its backend [16].
Fig. 3. CNN model
Pradeep J.S. et al. used a CNN in their work, and we used a similar approach to build our CNN model. As shown in Fig. 3, we used 2 convolutional layers along with max pooling. In the first convolutional layer, a normalization layer along with a dropout layer with 0.5 probability is added to avoid overfitting. A flattening layer is appended after this to obtain a one-dimensional vector, which can then be fed to a densely connected layer. The first convolutional layer uses 32 filters with a 5 x 5 window; the next layer uses 64 filters with no
variation in window size. As in Pradeep J.S. et al., we use a softmax function at the output to obtain the probabilities, and the activation function in both layers is sigmoid. We used the Adagrad stochastic gradient descent algorithm to optimize the output. We observed that the model started to give its best results from the initial epochs and reached its global maximum at the 80th epoch [14].
VI. RESULTS & DISCUSSION
The project uses a Windows environment and Python. Fig. 4 shows a handwritten image consisting of Kannada numbers, and Fig. 5 shows the output image for the Kannada numerals; all numbers from 0 to 9 were correctly detected by our model, with 100% accuracy. Fig. 6 is another input showing a handwritten Kannada word, and Fig. 7 shows that the Kannada word is detected with 100% accuracy. We have identified a few challenging scenarios in which improvement is needed, such as handwritten images with overlapping vattu and dheergas, i.e. ottaksharas. This challenge needs to be overcome with more robust learning methods such as RNN or LSTM.
Fig. 4. Kannada Numbers
Fig. 5. Output of Kannada Numbers
Fig. 6. Kannada Word
VII. CONCLUSION
The task of detecting Kannada numbers and characters is complex due to the vast character set, and there is a huge demand for more accurate systems. In this scenario, this paper presented an OCR model to detect Kannada characters using deep learning. With this work the task of reading Kannada becomes easier, and the proposed techniques to segment handwritten Kannada characters using a CNN have produced promising results. As a future improvement to the existing work, we want to investigate RNN and LSTM models to overcome the challenges identified in this work.
REFERENCES
[1] Thippeswamy, Dr. K. "A Comprehensive Survey on OCR Techniques for Kannada Script", 2016.
[2] Kusumika Krori Dutta, Sunny A.S, Ashita Victor, Archana G Nathu, "Kannada Alphabets Recognition using Decision Tree and Random Forest Models", 2021.
[3] Subhrajyoti Sen, Shreya V Prabhu, Steve Jerold, Pradeep J S, "Comparative Study and Implementation of Supervised and Unsupervised Models for Recognizing Handwritten Kannada Characters", 2021, in IEEE Access, vol. 9, pp. 33203-33223.
[4] Kusumika Krori Dutta, Sunny Arokia Swamy, Anushua Banerjee, Divya Rashi B, "Kannada Character Recognition Using Multi-Class SVM Method", J Med Internet Res 2021;23(4):e26627.
[5] Roshan Fernandes, Anisha P Rodrigues, "Kannada Handwritten Script Recognition using Machine Learning Techniques", Information 12, no. 5: 204, 2021.
[6] Villavicencio, Charlyn; Macrohon, Julio J.; Inbaraj, X. A.; Jeng, Jyh-Horng; Hsieh, Jer-Guang, "Kannada Confusing Character Recognition and Classification Using Random Forest and SVM", 2020, pp. 3220-3227, doi: 10.1109/BigData50022.2020.9377888.
[7] Kevin George Joe, Meghna Savit, K. Chandrasekaran, "Offline Character recognition on Segmented Handwritten Kannada Characters," 2019 IEEE International Conference on Big Data (Big Data), 2019, pp. 2763-2772, doi: 10.1109/BigData47090.2019.9006136.
[8] T. E. de Campos, B. R. Babu and M. Varma, "Character recognition in natural images", 2021, Diabetics and Metabolic Syndrome, ELSEVIER Journal.
[9] Z. Razak, K. Zulkiflee, "Off-line Handwriting Text recognition". In Proc. of 2nd Intl. Conf. on Information and Computer Technologies, pages 46–51, 2019.
[10] S. R. Kunte and R. D. Sudhaker Samuel. "Online character recognition for handwritten Kannada Characters using wavelet features and neural classifier". In Proc. of Intl. Symposium on Signal Processing and Information Technology, pages 558–563, December 2018.
[11] Prasad, M. Mahadeva, M. Sukumar, and A. G. Ramakrishnan. "Divide and conquer technique in online handwritten Kannada character recognition". https://www.cnn.com/2020/06/28/health/, June 2020. Accessed: 2020-06-28.
[12] Kumara, B.A., Kodabagi, M.M., Choudhury, T. et al. Improved email classification through enhanced data preprocessing approach. Spat. Inf. Res. 29, 247–255 (2021). https://doi.org/10.1007/s41324-020-00378-y
[13] Ido Kissos, Nachum Dershowitz. "OCR Error Correction Using Character Correction and Feature-Based Word Classification". IEEE Access, 8:91886–91893, May 2020.
[14] A. Kashyap, S. Naveen and A. U. M, "A Comparative Study on Prediction of PM2.5 Level Using Optimization Techniques," 2021 IEEE CONECCT, 2021.
[15] A. K. B and M. M. Kodabagi, "Efficient Data Preprocessing approach for Imbalanced Data in Email Classification System," 2020 International Conference on Smart Technologies in Computing, Electrical and Electronics (ICSTCEE), 2020, pp. 338-341, doi: 10.1109/ICSTCEE49637.2020.9277221.
[16] Rajni Kumari Sah and K Indira. "Online Kannada Character Recognition Using SVM Classifier", 2222284, June 2020. Accessed: 2020-06-28.
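Illustrative sketch. The character segmentation step in Section A obtains bounding boxes for single characters by vertical projection. The following is a minimal pure-Python sketch of that idea, not the implementation used in this work, which relies on OpenCV. It assumes the word image is already binarized, with 1 for ink and 0 for background. The function names `vertical_projection` and `segment_characters` and the `min_width` parameter are hypothetical and chosen only for this example.

```python
from typing import List, Tuple


def vertical_projection(binary: List[List[int]]) -> List[int]:
    """Count ink pixels (value 1) in every column of a binarized word image."""
    if not binary:
        return []
    width = len(binary[0])
    return [sum(row[x] for row in binary) for x in range(width)]


def segment_characters(binary: List[List[int]],
                       min_width: int = 1) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges, end exclusive, one per run of inked columns.

    Assumption: characters are separated by at least one blank column.
    """
    profile = vertical_projection(binary)
    boxes: List[Tuple[int, int]] = []
    start = None
    for x, count in enumerate(profile):
        if count > 0 and start is None:
            start = x                      # a new run of inked columns begins
        elif count == 0 and start is not None:
            if x - start >= min_width:     # ignore runs narrower than min_width
                boxes.append((start, x))
            start = None
    # Close a run that reaches the right edge of the image.
    if start is not None and len(profile) - start >= min_width:
        boxes.append((start, len(profile)))
    return boxes


# Toy 4 x 9 "word" with two glyphs separated by blank columns.
word = [
    [0, 1, 1, 0, 0, 0, 1, 1, 1],
    [0, 1, 0, 0, 0, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 0, 1, 0, 1],
    [0, 0, 0, 0, 0, 0, 1, 1, 1],
]
print(segment_characters(word))  # [(1, 3), (6, 9)]
```

Because the split relies on blank columns between glyphs, a sketch like this cannot separate overlapping vattu and dheergas, which is consistent with the challenging scenarios noted in Section VI.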