Automated Digitization of Student's Marks From The Answer Book

This research presents a lightweight convolutional neural network (CNN) model for automated digitization of student marks from answer-book images, addressing challenges in handwritten digit recognition (HDR) due to varying writing styles and image quality. The proposed methodology includes a contour-based segmentation process and pre-processing techniques that enhance recognition accuracy, achieving state-of-the-art performance on real-time data. The experimental setup demonstrates the system's capability to generate digital records of student marks efficiently using a high-resolution camera and a Jetson Nano development board.


SN Computer Science (2024) 5:350

https://doi.org/10.1007/s42979-024-02693-9

ORIGINAL RESEARCH

Automated Digitization of Student’s Marks from the Answer-Book Images Using a Lightweight CNN Model

Rutul Patel¹ · Neel Patel¹ · Bhupendra Fataniya¹ · Dhaval Shah¹

Received: 10 May 2023 / Accepted: 8 February 2024


© The Author(s), under exclusive licence to Springer Nature Singapore Pte Ltd 2024

Abstract
Preparing a student’s digital marksheet from images of student answer-books is a potential application in academic institutions. Automatically segmenting the assigned marks from answer-book images is extremely challenging, and the segmented digits also require pre-processing before the recognition stage. In addition, recognizing handwritten digits is difficult because writing styles differ. Existing research confirms the superior performance of deep learning-based models in handwritten digit recognition (HDR) applications on popular datasets. However, their application to real-time data in an experimental setup needs much more attention. This paper presents an experimental setup that uses student answer-book images to record students’ marks digitally. We propose a lightweight convolutional neural network (CNN) model for HDR. We also introduce a contour-based segmentation process for automatically extracting student details from answer-book images. The obtained results show the state-of-the-art performance of our proposed CNN model for real-time images. Further, introducing additional pre-processing before recognition significantly enhances the accuracy of the HDR experimental setup.

Keywords Convolutional neural network (CNN) · Handwritten digit recognition (HDR) · Handwritten digit segmentation ·
Image classification

Introduction

Handwritten digit recognition is vital for a variety of document analysis applications [1, 2], such as automated bank cheque processing [3], handwriting recognition for text-based input [4–8], and recognizing numbers for data entry. It is also used for security purposes, such as for biometric identification [9]. Handwriting recognition is also used to help people with disabilities, such as those with dyslexia [10], to access written information better.

One of the significant HDR applications lies in the field of academic transcript analysis [11]. Handwritten digit recognition is necessary for digitizing marksheets, which makes the process of collecting, storing, and analyzing marksheets easier and more efficient. Additionally, handwritten character recognition can help reduce errors, eliminating the need for manual data entry. Classical HDR algorithms [12] include four stages: pre-processing, segmentation, feature extraction, and classification. Popular pre-processing techniques include image super-resolution [13, 14], thresholding [15], and denoising. Such processed images improve the segmentation of digit images. Meaningful features from the segmented digit images are extracted to identify the digit. Common feature extraction techniques include edge, contour, and line detection. These extracted features enable the classifier to assign the digit image to a class. K-nearest neighbors (kNN), deep neural networks (DNN), and support vector machines (SVM) are some of the popular classifier algorithms. A similar feature-based segmentation is implemented in [16].

This article is part of the topical collection “Image Processing and Vision Engineering” guest edited by Sebastiano Battiato, Francisco Imai and Cosimo Distante.

* Dhaval Shah · Rutul Patel · Neel Patel · Bhupendra Fataniya

¹ Electronics and Communication Engineering, Institute of Technology, Nirma University, Gota, Ahmedabad, Gujarat 382481, India

Existing work in the literature assumes pre-processed digit images that are ready to feed into a CNN model. However, real-time images suffer from blur, shadows, and illumination issues, and segmenting digit images in such cases is challenging. Further, handwriting can vary in size, shape, background, and slant. Even individual letters can vary from person to person, depending on their writing styles. Additionally, the presence of noise, such as smudging or non-standardized paper, can complicate the task of character recognition. To address these challenges, we propose an experimental setup that generates a digital record of student marks through automatic segmentation and recognition.

The key contributions of this paper are listed as follows:

1. A lightweight and accurate CNN model is proposed for handwritten digit recognition. With few trainable parameters, the proposed model achieves state-of-the-art accuracy.
2. The inclusion of a pre-processing stage enhances the accuracy of handwritten digit recognition. The pre-processing stage ensures that real-time images are consistent with the MNIST dataset [17].
3. A contour-based segmentation technique automatically detects the desired region of interest from the student’s answer-book.
4. An experimental setup automatically captures student information and generates a digital record.

The organization of this paper is as follows: “Related Work” discusses various deep learning models for HDR applications. “Methodology” elaborates the proposed handwritten digit segmentation and recognition method and illustrates the experimental setup. “Results and Discussion” demonstrates the performance of the proposed model over state-of-the-art methods. “Conclusion” summarizes the findings and highlights inferences for the proposed work.

Related Work

The handwritten digit recognition problem is challenging in machine learning (ML) and computer vision (CV) applications. In the past few decades, authors have proposed various ML and deep learning (DL) models for HDR applications. These models are evaluated on the MNIST dataset to determine their performance.

LeCun et al. [17] introduced the MNIST dataset to evaluate the performance of ML- and DL-based models. They compared kNN, SVM, NN, and CNN for MNIST dataset classification. Belongie et al. [18] and Keysers et al. [19] used kNN for HDR applications. Kégl and Busa-Fekete [20] used boosted stumps on Haar features. Decoste and Schölkopf [21] used the SVM classifier for the MNIST dataset. Simard et al. [1] proposed a two-layer NN and compared the performance for two widely used loss functions, mean square error (MSE) loss and cross-entropy loss.

Guo et al. [22] integrated a convolutional neural network with a hidden Markov model (CNN-HMM) for recognizing house numbers from street view images. They showed that CNN features are more powerful than hand-crafted features. Ghosh et al. [23] compared three neural network approaches, deep neural network (DNN), deep belief network (DBN), and convolutional neural network (CNN), for handwritten digit recognition on the MNIST dataset. Shima et al. [24] used the AlexNet-CNN model to extract the feature maps and then classified them through the SVM model. James et al. [25] proposed to accompany CNN with an XGBoost predictor to improve accuracy over the NIST dataset, which consists of 810,000 isolated characters in lowercase, uppercase, and digits in English. Son et al. [26] used the VGG-CNN model for number recognition while reading the numbers from a gas meter.

Neto et al. [27] presented a new Gated-CNN-BGRU architecture for handwritten digit string recognition (HDSR) systems. They demonstrated the model's robustness by achieving an average precision of 96.50%. Ahlawat et al. [28] improved the handwritten digit recognition accuracy using a pure CNN architecture without ensemble architecture. Mukhoti et al. [29] used the LeNet-CNN architecture for handwritten digit classification in Bangla and Hindi.

Methodology

This section describes the proposed method to segment and recognize handwritten digits from student answer-books.

Handwritten Digit Segmentation

Our segmentation process uses the contour of binary images proposed by Chang et al. [30]. Therefore, an input RGB image is binarized via the widely used Otsu thresholding technique [15]. Further, this resultant image (Thres_Image) is inverted for representation. Next, we retrieve extreme outer contours from the inverted image (Inv_Image). An approximate bounding rectangle is represented outside the given contour using the start coordinate of the contour $(x, y)$, height $h$, and width $w$. Now, we select extracted contours containing only digit images using parameters $w_l$, $w_u$, $h_l$, $h_u$, $y_l$, and $y_h$. Here, $w_l$, $w_u$, $h_l$, $h_u$, $y_l$, and $y_h$ are the width lower bound, width upper bound, height lower bound, height upper bound, y-coordinate lower bound, and y-coordinate upper bound, respectively. These bounds are selected as per the image template of the student’s answer-book. First, to segment the desired region of interest, we select contours whose width is within $w_l$ and $w_u$. This is because each digit is written within the answer-book’s
bounding box. Next, we extract the roll-number contours whose y-coordinate is within $y_l$ and whose height $h$ is within the range of $h_l$ and $h_u$. Similarly, we extract the marks contours whose y-coordinate is higher than $y_h$ and whose height $h$ is within the range of $h_l$ and $h_u$. As a result of the segmentation process, we obtain the region of interest (ROI) as sets of individual digit images of the roll number and marks, denoted Roll_No and Marks. A summary of our proposed segmentation process is given in Algorithm 1.

Algorithm 1 Proposed Segmentation Algorithm

Pre-processing

The segmentation process results in sets of digit images, Roll_No and Marks, of students’ roll numbers and marks, respectively. To enhance accuracy, we employ a pre-processing stage that converts handwritten digit images so that they closely match the MNIST dataset. Since the collected images have a white background and black-colored digits, we invert these images to match the MNIST dataset. The generated image is transformed to binary by Otsu's threshold. Next, we use the erosion operator to thin digit image boundaries. Let an inverted and thresholded image from the sets Roll_No and Marks be denoted as $I$. Using structuring element $S$, we get the eroded image $I_E$ as

$$I_E = I \ominus S = \{\, z \mid (S)_z \subseteq I \,\} \tag{1}$$

This gives all digit images consistent with the MNIST dataset. The aforesaid pre-processing steps are illustrated in Fig. 1.

Fig. 1 Pre-processing phase before digit recognition. From left to right: actual image, binarized image, and eroded image

Handwritten Digit Recognition

In this phase, the individual digits segmented from students’ answer-books, covering their roll numbers and marks, pass through a pre-trained CNN model for recognition. The proposed CNN model includes three convolutional layers and three fully connected layers. Each
convolutional layer consists of convolutional units followed by batch normalization and a max-pooling operation.

At any instance, each CNN layer accepts an output feature $x_i^{(L-1)}$ from the previous layer $(L-1)$ and computes $N^{(L)}$ feature maps $z_j^{(L)}$ using kernel $w_j^{(L)}$ as

$$z_j^{(L)} = x_i^{(L-1)} * w_j^{(L)}, \quad 1 \le j \le N^{(L)} \tag{2}$$

Here, $*$ refers to the convolution operation between $x_i^{(L-1)}$ and $w_j^{(L)}$. These feature maps are further passed through ReLU activation to discard negative values from the features,

$$x_j^{(L)} = \max\left(0,\, z_j^{(L)}\right) \tag{3}$$

As a result, for each previous-layer feature $x_i^{(L-1)}$, the CNN layer computes $N^{(L)}$ features, each denoted $x_j^{(L)}$. Thus, we receive $N^{(L-1)} \times N^{(L)}$ feature maps at the execution of the $L$-th CNN layer. Next, batch normalization [31] normalizes each training mini-batch. Toward the end, a max-pooling layer abstracts the feature maps and reduces the feature dimensions for the next stage. There is a dropout layer and a flatten layer between the last convolutional layer and the fully connected layers. The proposed lightweight CNN architecture is shown in Fig. 2.

Fig. 2 Proposed CNN architecture

As shown in Fig. 2, the first convolutional layer has 64 kernels, each of size 5 × 5 with ReLU activation. The second convolutional layer has 32 kernels, each of size 5 × 5 with ReLU activation. The third convolutional layer has 16 kernels, each of size 3 × 3 with ReLU activation. Before flattening, a dropout layer is introduced to prevent overfitting of the CNN model. This improves the generalization ability of the CNN for predicting unseen data. After flattening, we deploy three fully connected layers (FCNs) to map convolutional layer features to the final output layer for predicting class probabilities. These FCNs have 128, 50, and 10 neurons, respectively. Further, the first two FCNs have ReLU activation functions, and the final FCN has a softmax activation function. An Adam optimizer is utilized during the training phase to minimize the cross-entropy loss ($L_{CE}$),

$$L_{CE} = -\frac{1}{N} \sum_{i=1}^{N} y_i \log(\hat{y}_i) \tag{4}$$

Here, $y_i$ and $\hat{y}_i$ are the true labels and predicted probabilities, respectively, for each of the $N$ classes.

Experimental Setup

The proposed HDR system includes three key stages: segmentation, pre-processing, and recognition. This three-stage process, including sub-processes, is shown in Fig. 3.

As shown in Fig. 3, individual digit images are segmented through selection of extracted contours that map to roll number and marks. These segmented digit images are pre-processed before CNN-based HDR recognition for better
accuracy. First, segmented digit images are binarized in the pre-processing stage through Otsu's thresholding method. Then, an erosion operation is applied to keep the digit images consistent with the MNIST dataset before recognition. In the final stage, these pre-processed images are recognized through the pre-trained lightweight CNN model, and a digital record of that student is kept in an Excel sheet. To generate an automated digital record of a student’s marks, we have set up a system comprising a high-resolution auto-focus camera (Logitech® C920 HD Pro), a camera stand, a Jetson Nano development board, and an LED light. This setup captures an image of a student’s answer-book and generates a digital record in real time. The experimental setup for preparing a student’s digital record is shown in Fig. 4.

Fig. 3 Experimental setup for preparing student’s digital record of marks

Fig. 4 Automated HDR system through Jetson Nano for marksheet preparation

Fig. 5 Sample images of MNIST dataset

Fig. 6 Sample images of custom dataset

Results and Discussion

This section illustrates the performance of the proposed work in recognizing handwritten digits from a student’s answer-book and preparing the digital record.
Dataset

The proposed model utilizes the MNIST dataset, which contains 60,000 training and 10,000 testing images. These images are for digits 0–9 with variations such as rotation, illumination, and scale to have diversity. A few sample digit images from the dataset are shown in Fig. 5.

Our experimental setup captures real-time images of the student answer-book for preparing the digital record. For that purpose, we created a custom dataset of 441 digit images extracted from real-time images for tuning our model. A few samples of such answer-book images are shown in Fig. 6.

Evaluation of Proposed CNN Model

Compared to state-of-the-art models, we designed a simple and effective CNN model for HDR. Our CNN model was trained on the Jetson Nano device using the 60,000 training images of the MNIST dataset. The model was trained for 100 epochs with a learning rate of 0.001. The network parameters are optimized with Adam [32], and the loss function is cross-entropy. With these settings, the proposed CNN model has 67,104 parameters in total, of which 66,880 are trainable. The model accuracy and MSE over 100 epochs are shown in Fig. 7a and b, respectively.

Fig. 7 Performance of the proposed model. (a) Test accuracy of proposed CNN model. (b) MSE of the proposed CNN model

As shown in Fig. 7a and b, the proposed lightweight CNN model achieves an accuracy of 0.9937 and an MSE of 1.417 × 10⁻⁴.

Model Performance over Real-Time Images

Existing HDR methods report performance on the MNIST dataset. However, their performance under real-time conditions has yet to be evaluated. We evaluated our model on real-time images of the custom dataset with four different approaches. Approach 1 is trained and validated on the MNIST dataset. Approach 2 is trained and validated on our own custom dataset, which is smaller. Approach 3 is trained and validated on the merged MNIST and custom datasets. Approach 4 is the same as Approach 3, but custom dataset images are eroded before training and validation. Our dataset includes 441 images of handwritten digits segmented from students’ answer-books. To analyze the performance of the proposed model over real-time images, we computed the training and validation cross-entropy loss for all approaches, as shown in Fig. 8.

Fig. 8 Learning curves for a custom dataset. Left to right, top to bottom: Approaches 1, 2, 3, and 4

As shown in Fig. 8, Approach 1 results in a stable model with few variations. Approach 2 has instability in training that results in lower accuracy. Approaches 3 and 4 show stable model behavior with higher accuracy than the previous approaches. The learning curves in Fig. 8 indicate that with the smaller dataset in Approach 2, the cross-entropy loss is relatively higher than in Approaches 3 and 4. Also, the cross-entropy loss is more stable and has fewer variations in Approaches 3 and 4 than in Approach 2. The validation and testing performance for all these approaches on real-time data is shown in Table 1.

Table 1 Validation and test performance of the proposed approaches

Approach      Validation accuracy (%)   Test accuracy (%)
Approach 1    99.28                     82.76
Approach 2    95.45                     94.4
Approach 3    99.29                     95.13
Approach 4    99.15                     96.26

It is evident from Table 1 that Approach 4 achieves the highest test accuracy of 96.26% due to an additional pre-processing step before recognition. The test accuracy obtained with Approach 4 is about 13.5 percentage points higher than the 82.76% achieved using Approach 1.

Performance Comparison with State-of-the-Art HDR Methods

To validate the performance of our proposed model, we compared our model with state-of-the-art HDR methods. For comparison, we considered total parameters, trainable parameters, and accuracy. The comparison results are shown in Table 2.

As shown in Table 2, Zhao et al. [36] proposed an ensemble learning approach that achieves 98.10% accuracy with the fewest total and trainable parameters, 12,857. In contrast, Albahli et al. [37] proposed a faster regional convolutional neural network (FRCNN) that achieves the maximum accuracy of 99.70%. However, this approach uses a dense CNN with total and trainable parameters of 6,031,422 and 5,921,356, respectively. Similar accuracy has been observed for other models with total and trainable parameters above one lakh (100,000). Such dense networks make the training process complex and time intensive. Our CNN-based model achieves an accuracy of 99.37% with total and trainable parameters of 67,104 and 66,880, respectively. Given the accuracy obtained with fewer total and trainable parameters, our proposed lightweight CNN model achieves state-of-the-art performance.
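The reported totals of 67,104 parameters, of which 66,880 are trainable, can be cross-checked with a short calculation. The sketch below is a reconstruction under assumptions, not the authors' code: it assumes a 28 × 28 single-channel input, unpadded convolutions with stride 1, 2 × 2 max pooling after every convolutional layer, and batch normalization with four parameters per channel (two trainable, two running statistics). None of these details is stated in the text, but under them the layer sizes described in the Methodology reproduce the reported counts.

```python
def conv_params(k, c_in, c_out):
    """Weights plus biases of a k x k convolution."""
    return k * k * c_in * c_out + c_out


def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


# (kernel size, number of kernels) for the three convolutional layers.
conv_layers = [(5, 64), (5, 32), (3, 16)]
dense_units = [128, 50, 10]

size, channels = 28, 1          # assumed MNIST-style input: 28 x 28 x 1
trainable = non_trainable = 0

for k, c_out in conv_layers:
    trainable += conv_params(k, channels, c_out)
    # Batch norm (assumed): scale and shift are trainable,
    # running mean and variance are not.
    trainable += 2 * c_out
    non_trainable += 2 * c_out
    size = (size - k + 1) // 2  # assumed unpadded convolution, then 2 x 2 pooling
    channels = c_out

features = size * size * channels  # length of the flattened vector
for units in dense_units:
    trainable += dense_params(features, units)
    features = units

total = trainable + non_trainable
print(total, trainable)  # → 67104 66880
```

Under these assumptions the feature map shrinks from 28 to 12, 4, and finally 1 pixel per side, so the flattened vector has only 16 entries, which is what keeps the first fully connected layer small.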

Table 2 Model performance comparison with state-of-the-art HDR methods

Method                 Model architecture   Total parameters   Trainable parameters   Accuracy (%)
Yang et al. [33]       DFC                  430,500            430,500                99.13
Enriquez et al. [34]   CNN+ML               210,740            210,740                98.00
Saqib et al. [35]      CNN+DL4J             1,111,946          1,111,946              99.21
Zhao et al. [36]       CNN+KNN+RF           12,857             12,857                 98.10
Albahli et al. [37]    FRCNN                6,031,422          5,921,356              99.70
Proposed               Lightweight CNN      67,104             66,880                 99.37
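To put the parameter counts of Table 2 on a common scale, the sketch below divides each model's trainable-parameter count by that of the proposed model and reports the accuracy difference. It uses only the figures listed in the table; the dictionary layout and variable names are illustrative.

```python
# Trainable parameters and accuracy (%) as listed in Table 2.
table2 = {
    "Yang et al. [33]": (430_500, 99.13),
    "Enriquez et al. [34]": (210_740, 98.00),
    "Saqib et al. [35]": (1_111_946, 99.21),
    "Zhao et al. [36]": (12_857, 98.10),
    "Albahli et al. [37]": (5_921_356, 99.70),
    "Proposed": (66_880, 99.37),
}

proposed_params, proposed_acc = table2["Proposed"]

for name, (params, acc) in table2.items():
    ratio = params / proposed_params  # above 1 means larger than the proposed model
    gap = acc - proposed_acc          # accuracy difference in percentage points
    print(f"{name:22s} {ratio:7.2f}x params  {gap:+.2f} pts")

frcnn_ratio = table2["Albahli et al. [37]"][0] / proposed_params
zhao_ratio = table2["Zhao et al. [36]"][0] / proposed_params
```

On these figures the FRCNN model has roughly 88.5 times as many trainable parameters as the proposed model for a 0.33-point accuracy advantage, while the ensemble of Zhao et al. is about five times smaller but 1.27 points less accurate.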

Conclusion

This paper presents an experimental setup to prepare a digital record of a student’s marks through handwritten digit recognition. We have proposed a contour-based segmentation process for automatically extracting digit images from the real-time image of the student’s answer-book. Our experimental setup uses the pre-trained lightweight CNN, which achieves 99.37% accuracy on the MNIST dataset. The results show that the proposed model achieves similar accuracy with roughly 88 times fewer trainable parameters than the FRCNN model [37]. Further, we have evaluated the same model on our custom dataset comprising 441 images, where it achieves 82.76% accuracy without pre-processing. Adding the pre-processing stage before recognition raises the accuracy to 96.26%, a gain of about 13.5 percentage points.

Acknowledgements The authors are thankful to Nirma University, Ahmedabad, India, for providing financial support to conduct this research.

Data Availability Statement The dataset used during the current study was introduced by LeCun et al. [17] and is available at https://yann.lecun.com/exdb/mnist/.

Declarations

Conflict of Interests The authors have no competing interests to declare that are relevant to the content of this article.

References

1. Simard PY, Steinkraus D, Platt JC. Best practices for convolutional neural networks applied to visual document analysis. 2003;2003–January:958–63. https://doi.org/10.1109/ICDAR.2003.1227801.
2. Roy RK, Pal U, Roy K, Kimura F. A system for recognition of destination address in postal documents of India. Malays J Comput Sci. 2020;33(3):202–16. https://doi.org/10.22452/mjcs.vol33no3.3.
3. Agrawal P, Chaudhary D, Madaan V, Zabrovskiy A, Prodan R, Kimovski D, Timmerer C. Automated bank cheque verification using image processing and deep learning methods. Multimed Tools Appl. 2021;80(4):5319–50. https://doi.org/10.1007/s11042-020-09818-1.
4. Guo H, Wan J, Wang H, Wu H, Xu C, Miao L, Han M, Zhang H. Self-powered intelligent human-machine interaction for handwriting recognition. Research. 2021;2021. https://doi.org/10.34133/2021/4689869.
5. Semma A, Hannad Y, Siddiqi I, Lazrak S, Kettani MEYE. Feature learning and encoding for multi-script writer identification. Int J Doc Anal Recogn. 2022;25(2):79–93. https://doi.org/10.1007/s10032-022-00394-8.
6. Ott F, Wehbi M, Hamann T, Barth J, Eskofier B, Mutschler C. The OnHW dataset: Online handwriting recognition from IMU-enhanced ballpoint pens with machine learning. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies. 2020;4(3). https://doi.org/10.1145/3411842.
7. Kumar P, Sharma A. Segmentation-free writer identification based on convolutional neural network. Comput Electr Eng. 2020;85. https://doi.org/10.1016/j.compeleceng.2020.106707.
8. Zouari R, Boubaker H, Kherallah M. Multi-language online handwriting recognition based on beta-elliptic model and hybrid tdnn-svm classifier. Multimed Tools Appl. 2019;78(9):12103–23. https://doi.org/10.1007/s11042-018-6764-0.
9. Dargan S, Kumar M. A comprehensive survey on the biometric recognition systems based on physiological and behavioral modalities. Exp Syst Appl. 2020;143. https://doi.org/10.1016/j.eswa.2019.113114.
10. Kamal N, Sharma P, Das R, Goyal V, Gupta R. Virtual Technical Aids to Help People with Dysgraphia. 2022;222–35. https://doi.org/10.4018/978-1-7998-8929-8.ch009.
11. Nouri HE. Handwritten digit recognition by deep learning for automatic entering of academic transcripts. Adv Intell Syst Comput. 2020;1295:575–84. https://doi.org/10.1007/978-3-030-63319-6_53.
12. Yang J-B, Shen K-Q, Ong C-J, Li X-P. Feature selection for mlp neural network: The use of random permutation of probabilistic outputs. IEEE Trans Neural Networks. 2009;20(12):1911–22. https://doi.org/10.1109/TNN.2009.2032543.
13. Patel R, Thakar V, Joshi R. Single image super-resolution through sparse representation via coupled dictionary learning. Int J Electron Telecommun. 2020;66(2):347–53. https://doi.org/10.24425/ijet.2020.131884.
14. Patel R, Thakar V, Joshi R. Dictionary learning-based image super-resolution for multimedia devices. Multimed Tools Appl. 2023;82(11):17243–62. https://doi.org/10.1007/s11042-022-14076-4.
15. Otsu N. Threshold selection method from gray-level histograms. IEEE Trans Syst Man Cybern SMC. 1979;9(1):62–6. https://doi.org/10.1109/tsmc.1979.4310076.
16. Vyas S, Fataniya B, Zaveri T, Acharya S. Automatic Image Segmentation Algorithm for Microscopic Images of Liquorice and Rhubarb. 2016;21–24–September–2016:66–70. https://doi.org/10.1145/2983402.2983422.
17. LeCun Y, Bottou L, Bengio Y, Haffner P. Gradient-based learning applied to document recognition. Proc IEEE. 1998;86(11):2278–323. https://doi.org/10.1109/5.726791.
18. Belongie S, Malik J, Puzicha J. Shape matching and object recognition using shape contexts. IEEE Trans Pattern Anal Mach Intell. 2002;24(4):509–22. https://doi.org/10.1109/34.993558.
19. Keysers D, Deselaers T, Gollan C, Ney H. Deformation models for image recognition. IEEE Trans Pattern Anal Mach Intell. 2007;29(8):1422–35. https://doi.org/10.1109/TPAMI.2007.1153.
20. Kégl B, Busa-Fekete R. Boosting Products of Base Classifiers. 2009;497–504.
21. Decoste D, Schölkopf B. Training invariant support vector machines. Mach Learn. 2002;46(1–3):161–90. https://doi.org/10.1023/A:1012454411458.
22. Guo Q, Wang F, Lei J, Tu D, Li G. Convolutional feature learning and hybrid cnn-hmm for scene number recognition. Neurocomputing. 2016;184:78–90. https://doi.org/10.1016/j.neucom.2015.07.135.
23. Ghosh MMA, Maghari AY. A Comparative Study on Handwriting Digit Recognition Using Neural Networks. 2017;77–81. https://doi.org/10.1109/ICPET.2017.20.
24. Shima Y, Nakashima Y, Yasuda M. Pattern Augmentation for Handwritten Digit Classification Based on Combination of Pre-trained CNN and SVM. 2018;2018–January:1–6. https://doi.org/10.1109/ICIEV.2017.8338575.
25. Joseph James S, Lakshmi C, UdayKiran P. Parthiban: An efficient offline hand written character recognition using cnn and xgboost. Int J Innov Technol Explor Eng. 2019;8(6):115–8.
26. Son C, Park S, Lee J, Paik J. Deep learning-based number detection and recognition for gas meter reading. IEIE Trans Smart Process Comput. 2019;8(5):367–72. https://doi.org/10.5573/IEIESPC.2019.8.5.367.
27. De Sousa Neto AF, Bezerra BLD, Lima EB, Toselli AH. Hdsr-flor: A robust end-to-end system to solve the handwritten digit string recognition problem in real complex scenarios. IEEE Access. 2020;8:208543–53. https://doi.org/10.1109/ACCESS.2020.3039003.
28. Ahlawat S, Choudhary A, Nayyar A, Singh S, Yoon B. Improved handwritten digit recognition using convolutional neural networks (cnn). Sensors (Switzerland). 2020;20(12):1–18. https://doi.org/10.3390/s20123344.
29. Mukhoti J, Dutta S, Sarkar R. Handwritten digit classification in bangla and hindi using deep learning. Appl Artif Intell. 2020;34(14):1074–99. https://doi.org/10.1080/08839514.2020.1804228.
30. Chang F, Chen C-J, Lu C-J. A linear-time component-labeling algorithm using contour tracing technique. Comput Vis Image Underst. 2004;93(2):206–20. https://doi.org/10.1016/j.cviu.2003.09.002.
31. Ioffe S, Szegedy C. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. 2015;1:448–56. https://doi.org/10.5555/3045118.3045167.
32. Kingma DP, Ba JL. Adam: A Method for Stochastic Optimization. 2015; https://doi.org/10.48550/arXiv.1412.6980.
33. Yang Z, Moczulski M, Denil M, Freitas ND, Smola A, Song L, Wang Z. Deep Fried Convnets. Int Conf Comput Vis, ICCV. 2015;2015:1476–83. https://doi.org/10.1109/ICCV.2015.173.
34. Enriquez EA, Gordillo N, Bergasa LM, Romera E, Huélamo CG. Convolutional neural network vs traditional methods for offline recognition of handwritten digits. Adv Intell Syst Comput. 2019;855:87–99. https://doi.org/10.1007/978-3-319-99885-5_7.
35. Ali S, Shaukat Z, Azeem M, Sakhawat Z, Mahmood T, ur Rehman K. An efficient and improved scheme for handwritten digit recognition based on convolutional neural network. SN Applied Sciences. 2019;1(9). https://doi.org/10.1007/s42452-019-1161-5.
36. Zhao H-H, Liu H. Multiple classifiers fusion and cnn feature extraction for handwritten digits recognition. Granular Comput. 2020;5(3):411–8. https://doi.org/10.1007/s41066-019-00158-6.
37. Albahli S, Nawaz M, Javed A, Irtaza A. An improved faster-rcnn model for handwritten character recognition. Arab J Sci Eng. 2021;46(9):8509–23. https://doi.org/10.1007/s13369-021-05471-4.

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.