2021 - Dysgraphia Classification Based On The Non-Discrimination Regularization in Rotational Region Convolutional Neural Network
1
Department of Computer Science and Engineering, Adhiyamaan College of Engineering, Hosur, Tamilnadu, India
2
Department of Computer Science and Engineering, CMR Institute of Technology, Bengaluru, India
3
Department of Computer Science and Engineering, Dayananda Sagar University, Bengaluru, India
* Corresponding author’s Email: [email protected]
Abstract: Dysgraphia is a handwriting disorder. Classifying dysgraphia from children's handwritten images helps to identify affected children effectively and to prevent low self-esteem. Traditional dysgraphia classification relies on experts who classify the handwriting manually, which is costly and time-consuming. Few studies apply machine learning and deep learning techniques to dysgraphia classification, and the existing models suffer from overfitting during training. In this research, the Non-Discrimination Regularization in Rotational Region Convolutional Neural Network (NDR-R2CNN) is proposed to improve the efficiency of dysgraphia classification. Balancing parameters are introduced in the loss function to balance the classes during training and to eliminate features, which reduces overfitting. Collected children's handwriting data were used to evaluate the performance of the proposed NDR-R2CNN model. The proposed model has the advantages of effective feature analysis and non-discrimination word analysis. The NDR-R2CNN model achieves an accuracy of 98.2 % in dysgraphia classification, compared with 90.2 % for the SMOTE+SVM method and 94.2 % for the existing CNN.
Keywords: Dysgraphia, Handwriting data, Non-discrimination regularization, Overfitting, Rotational region convolutional neural network.
common approach for image and signal classification due to its efficiency. Machine learning is applied for the classification of handwriting text line images and speech signals [8]. The existing methods, TestGraphia and CNN, have lower efficiency in feature analysis, and non-discrimination is not analyzed in the existing models [9, 10]. In this research, the NDR-R2CNN model is proposed to improve the efficiency of dysgraphia classification. The collected children's data were used to evaluate the performance of the proposed model. The existing R2CNN model treats incomplete data as one class, and this affects its performance. The NDR is applied to complete the input, which helps to compare it with a target for a better understanding of the features. The balancing parameters in NDR are applied in the loss function to balance the classes during training and to eliminate features, which reduces overfitting. The proposed method provides effective feature analysis and non-discriminative word analysis to improve efficiency.
This paper is organized as follows: a review of recent methods in dysgraphia classification is given in section 2 and the proposed method is presented in section 3. The simulation setup is explained in section 4 and the results of the proposed model are given in section 5. The conclusion of this research is given in section 6.

2. Literature review

Dysgraphia is a handwriting disorder, and early detection of dysgraphia helps to treat the patient effectively. Few studies address the classification of dysgraphia using data-driven methods and machine learning techniques.
Dimauro [11] developed the TestGraphia software to support doctors in diagnosing dysgraphia among patients. Various document analysis algorithms and feature selection algorithms were developed to detect dysgraphia. Handwriting of children from 2nd grade to 5th grade was collected to evaluate the TestGraphia software. Nine algorithms were used to analyze the features of the text in the input data. The result shows that the developed software has higher performance in detecting dysgraphia and that the computational time of the model is low. The developed model has ambiguity in borderline removal due to the presence of variance, and non-discriminatory letters were not analyzed in the software.
Gazda [12] applied a convolutional neural network (CNN) for the diagnosis of Parkinson's disease from handwriting images. The pretrained CNN model was applied with the idea of multiple fine-tuning to bridge the gap between the source and target datasets. A more efficient transfer learning method was applied in the CNN model to improve the performance of the detection. The result shows that the developed model has a higher detection performance than an existing model. The model overfits in training due to the low dropout rate, and fine-tuning of the model is required to overcome this limitation.
Mucha [13] applied the fractional-order derivative method for the detection of dysgraphia among Parkinson's patients. Kinematic partial correlation analysis with Pearson's and Spearman's measures was used to analyze the relationship between the designed features and the patients' clinical data. Handwriting from 33 persons with Parkinson's disease and 36 healthy people was collected to evaluate the performance of the proposed model. A regression model was applied to train on the features and detect dysgraphia among the patients. The result shows that the proposed model has higher performance than the existing methods. The kinematic measures are not sufficient for analyzing the discrimination of fractional-order derivative features.
Asselborn [14] analyzed a scale of handwriting difficulties from the lightest case to the most severe case and compared the scores of children of the same age and gender. The principal component analysis (PCA) method was applied to reduce the set of 53 handwriting features to three dimensions. A clustering method was applied to the data set along these three axes for accurate detection of dysgraphia. The shape features were effectively analyzed in the method, and the clustering method effectively classifies the data. The result shows that the developed model has a higher performance in dysgraphia detection. The clustering method has an overfitting problem due to training on many features.
Lamba [15] proposed a kinematic feature extraction method from handwritten documents for the detection of Parkinson's disease. The synthetic minority oversampling technique (SMOTE) was applied to handle the imbalanced dataset. A total of 29 kinematic features were extracted from the dataset for detection. The genetic algorithm and mutual information gain were applied for the feature selection process. Classifiers such as XGBoost, AdaBoost, random forest, and support vector machine (SVM) were applied for the detection of Parkinson's disease. 10-fold cross-validation was applied to evaluate the performance of the model. The SMOTE method does not consider the neighborhood value, and this leads to overlapping features. The SVM has lower performance on overlapping and imbalanced datasets.
International Journal of Intelligent Engineering and Systems, Vol.15, No.1, 2022 DOI: 10.22266/ijies2022.0228.06
Received: June 30, 2021. Revised: October 11, 2021. 57
horizontal features are analyzed with a pooled size of 3 × 11, which helps to detect horizontal text whose width is larger than its height. More vertical features are analyzed with a pooled size of 11 × 3, which is applied to detect vertical text whose height is larger than its width.

3.4 Boundary box

After the RPN produces axis-aligned bounding boxes around arbitrary-oriented text, each proposal region generated by the RPN is classified as text or non-text and the inclined bounding boxes are refined. Each inclined box is associated with an axis-aligned box, and the inclined bounding boxes are the detection targets. Adding an axis-aligned bounding box improves the performance.

3.5 Non-maximum suppression

As in current object detection methods, the detection candidates are post-processed with non-maximum suppression (NMS). Normal NMS on axis-aligned bounding boxes or inclined NMS on inclined bounding boxes is performed in this step. In inclined NMS, the traditional intersection-over-union (IoU) is modified to be the IoU between two inclined bounding boxes. The IoU calculation method of [19] is used.

3.6 Non-discriminatory regularization

Applying the NDR model in R2CNN helps to complete the information based on the probability of previous data. Once suppression has detected the handwritten letters, the NDR method is applied to complete the information. The existing R2CNN model treats incomplete data as another class, and this affects its performance. The NDR method is applied to complete the input data, which is then used for classification. All off-diagonal elements are pushed to zero to obtain $G_l^c \rightarrow I \;\forall c$ in the following regularization method, as shown in Eq. (1). Here $I$ is the identity matrix and $G_l^c$ is a full-rank matrix for the $l$-th layer and the $c$-th class.

$$R_l^{strong} = \frac{1}{C} \sum_c \frac{1}{2M_c} \sum_{i \neq j} \left| g_{l,i,j}^c \right| \tag{1}$$

where $M_c$ is the cardinality of the data belonging to the same discrimination class $c$. The input and dictionary data patterns are $i$ and $j$, respectively. A hard constraint is $R_l^{strong} \rightarrow 0$. A weaker condition, based on the average correlations, is imposed as shown in Eq. (2).

$$R_l = \frac{1}{C} \sum_c \frac{1}{2M_c} \left| \sum_{i \neq j} g_{l,i,j}^c \right| \tag{2}$$

For ReLU-activated networks the weaker and strong conditions coincide, since $g_{l,i,j}^c \geq 0 \;\forall i,j$. This regularization completes the words, which helps to compare them with the target data and analyze the difference.

3.7 Training objectives (multi-task loss)

The RPN training loss is similar to that of faster R-CNN [20]. The R2CNN loss function is defined on each axis-aligned box proposal generated by the RPN. The loss function of each region is the sum of the text/non-text classification loss and the box regression loss. The box regression loss consists of two parts: the loss of the axis-aligned boxes that enclose the arbitrary-oriented texts and the loss of the inclined minimum-area boxes. The multi-task loss function of each proposal is given in Eq. (3).

$$L(p, t, v, v^*, u, u^*) = L_{cls}(p, t) + \lambda_1 t \sum_{i \in \{x, y, w, h\}} L_{reg}(v_i, v_i^*) + \lambda_2 t \sum_{i \in \{x_1, y_1, x_2, y_2, h\}} L_{reg}(u_i, u_i^*) \tag{3}$$

The trade-off among the three terms is controlled by the balancing parameters $\lambda_1$ and $\lambda_2$. These balancing parameters help to maintain the class balance in the dataset and also reduce overfitting.
The box regression is performed only on text, and the class label indicator is denoted as $t$. The background is labeled as 0 ($t = 0$) and the text is labeled as 1 ($t = 1$). The softmax function computes the probability over the text and background classes, $p = (p_0, p_1)$. The log loss for the true class $t$ is $L_{cls}(p, t) = -\log p_t$.
The bounding box regression targets of the true axis-aligned box are denoted as the tuple $v = (v_x, v_y, v_w, v_h)$, comprising the center point coordinates, width and height, and the predicted tuple for the text label is $v^* = (v_x^*, v_y^*, v_w^*, v_h^*)$. The bounding box regression targets of the true inclined box are denoted as the tuple $u = (u_{x1}, u_{y1}, u_{x2}, u_{y2}, u_h)$, comprising the coordinates of the first two points of the inclined box and its height, and the predicted tuple for the text label is $u^* = (u_{x1}^*, u_{y1}^*, u_{x2}^*, u_{y2}^*, u_h^*)$. The parameterization of $v$ and $v^*$ specifies a scale-invariant translation and a log-space height-width shift relative to the object proposal. For inclined bounding boxes, $(u_{x1}, u_{y1})$, $(u_{x2}, u_{y2})$, $(u_{x1}^*, u_{y1}^*)$ and $(u_{x2}^*, u_{y2}^*)$ are parameterized in the same way as $(v_x, v_y)$. The parameterization of $u_h$ and $u_h^*$ is the same as the parameterization of $v_h$ and $v_h^*$.
Let $(w, w^*)$ denote either $(v_i, v_i^*)$ or $(u_i, u_i^*)$; the regression loss $L_{reg}(w, w^*)$ is defined in Eq. (4) and Eq. (5).
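The per-proposal loss of Eq. (3) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the plain-tuple inputs and the default values `lambda1 = lambda2 = 1.0` are assumptions made here for illustration (the paper does not report the values of the balancing parameters), and the smooth L1 penalty is the standard piecewise form stated in Eq. (5).

```python
import math


def smooth_l1(x):
    """Smooth L1 penalty: quadratic near zero, linear elsewhere (Eq. 5)."""
    return 0.5 * x * x if abs(x) < 1 else abs(x) - 0.5


def multitask_loss(p, t, v, v_star, u, u_star, lambda1=1.0, lambda2=1.0):
    """Per-proposal multi-task loss following Eq. (3).

    p        -- (p0, p1) softmax probabilities for background and text
    t        -- class label indicator: 0 for background, 1 for text
    v, v_star -- axis-aligned tuples (x, y, w, h)
    u, u_star -- inclined tuples (x1, y1, x2, y2, h)
    lambda1, lambda2 -- balancing parameters (values assumed, not from the paper)
    """
    cls_loss = -math.log(p[t])  # log loss of the true class
    axis_loss = sum(smooth_l1(a - b) for a, b in zip(v, v_star))
    inclined_loss = sum(smooth_l1(a - b) for a, b in zip(u, u_star))
    # Multiplying by t switches the regression terms off for background.
    return cls_loss + lambda1 * t * axis_loss + lambda2 * t * inclined_loss


if __name__ == "__main__":
    # Hypothetical text proposal with small box offsets.
    loss = multitask_loss(
        p=(0.2, 0.8), t=1,
        v=(0.5, 0.0, 0.0, 0.0), v_star=(0.0, 0.0, 0.0, 0.0),
        u=(2.0, 0.0, 0.0, 0.0, 0.0), u_star=(0.0, 0.0, 0.0, 0.0, 0.0),
    )
    print(round(loss, 4))
```

Because both regression sums are multiplied by $t$, a background proposal ($t = 0$) contributes only the classification term, which is the behaviour the text describes when it says box regression is performed on text.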
$$L_{reg}(w, w^*) = \mathrm{smooth}_{L1}(w - w^*) \tag{4}$$

$$\mathrm{smooth}_{L1}(x) = \begin{cases} 0.5x^2 & \text{if } |x| < 1 \\ |x| - 0.5 & \text{otherwise} \end{cases} \tag{5}$$

The classification is performed on the input dataset based on this training and testing process.

4. Simulation setup

This section describes the dataset, system configuration, parameter settings and metrics.
Dataset: The dataset was collected from primary schools in top cities in Tamil Nadu and Karnataka with the help of occupational therapists. A total of 150 samples were taken from primary school children aged 6 to 8 to build the model. Sample images of normal and dysgraphia handwriting from the collected and standard datasets are shown in Fig. 2 (a - c). In Fig. 2 (b), the word 'The' in the sentence is not written properly and the shape of the word 'was' is not proper. In Fig. 2 (c), the letters are not properly shaped. The standard dysgraphia dataset [9] was used to evaluate the performance of the proposed and existing models. The standard dysgraphia dataset is available at https://github.com/peet292929/Dysgraphia-detection-through-machine-learning.

Figure. 2 Sample images from the dataset: normal handwriting (top one) and dysgraphia samples (bottom two)

Metrics: Four metrics, namely accuracy, precision, recall and f-measure, were evaluated for the proposed model. The formulas for accuracy, precision, recall and f-measure are shown in Eq. (6 - 9), respectively.

$$Accuracy = \frac{TP + TN}{TP + TN + FP + FN} \times 100 \tag{6}$$

$$Precision = \frac{TP}{TP + FP} \times 100 \tag{7}$$

$$Recall = \frac{TP}{TP + FN} \times 100 \tag{8}$$

$$F\text{-}measure = 2 \times \frac{Precision \times Recall}{Precision + Recall} \tag{9}$$

where TP represents the true positives, TN represents the true negatives, FP represents the false positives, and FN represents the false negatives.
Parameter settings: The aspect ratio is set to 3 in the model, the learning rate is set to $10^{-3}$, and the weight decay rate is set to 0.0005.
System Requirement: The proposed and existing models were implemented on a system consisting of an Intel i9 processor, 128 GB of RAM, a graphics card with 22 GB of memory and a 3 TB hard disk. The proposed and existing models were implemented in the same environment and on the same data. Python 3.7 was used to implement the proposed and existing models.

5. Results

Dysgraphia classification in childhood helps to treat the child effectively and improves the child's self-confidence. Few existing methods address dysgraphia classification with machine learning techniques, and they are limited by overfitting. The existing methods also have lower classification efficiency, and non-discrimination words are not considered in these models.
This section provides a detailed description of the dataset, metrics, system requirements and results.

5.1 Quantitative analysis

In the quantitative analysis, standard machine learning models and deep learning models were compared with the proposed NDR-R2CNN model in dysgraphia classification.
The proposed NDR-R2CNN model is evaluated for dysgraphia classification and compared with the existing methods, as shown in Table 1.

Table 1. Performance comparison with standard machine learning models
Methods          Accuracy (%)   Precision (%)   Recall (%)   F-measure (%)
Decision Tree    82.5           82.3            82.2         82.24
Random Forest    84.1           84.3            84.2         84.24
SVM              85.2           86.1            86           86.04
NDR-R2CNN        98.2           96.4            100          98.16

The proposed NDR-R2CNN model is compared with standard classifiers such as decision trees, random forests, and SVM. The four metrics of accuracy, precision, recall and f-measure were evaluated in this analysis. The result shows that the proposed NDR-R2CNN model has higher performance than the existing standard classifiers. The proposed NDR-R2CNN model has the advantage of effectively analyzing the features in the data. It also has the advantage of considering the non-discrimination of the input data.
The decision tree has lower efficiency in feature analysis, and the random forest has an overfitting problem when it uses few trees.
The SVM has lower performance in handling imbalanced data for classification. The proposed NDR-R2CNN method has an accuracy of 98.2 % and the standard SVM classifier has 85.2 % accuracy.
The proposed NDR-R2CNN model and the standard classifiers were measured in terms of accuracy, precision, recall and f-measure, as shown in Fig. 3. The result shows that the proposed NDR-R2CNN model has significantly higher performance than the standard classifiers. The proposed NDR-R2CNN model has the advantages of effective feature analysis and non-discriminative word analysis. The decision tree model has lower efficiency in feature analysis and is not able to handle more data. The random forest method suffers from overfitting and lower efficiency when the number of trees is low. The SVM model has lower efficiency on imbalanced data. The proposed NDR-R2CNN model has a precision of 96.4 % and the SVM classifier has 86.1 % precision.
The proposed NDR-R2CNN model and deep learning models such as LSTM, RPN, fast RCNN, faster RCNN, and R2CNN were compared in Table 2. The result shows that the proposed NDR-R2CNN model has a higher performance in classification. The proposed NDR-R2CNN model has the advantages of non-discrimination word analysis and effective feature analysis. The existing models are limited by overfitting, which affects their efficiency. The proposed NDR-R2CNN model has an accuracy of 98.2 % and the existing R2CNN model has 94.2 % accuracy.

Table 2. Deep learning model comparison
Methods        Accuracy (%)   Precision (%)   Recall (%)   F-measure (%)
LSTM           90.1           91.4            91.3         91.34
RPN            88.2           86.5            86.3         86.39
Fast RCNN      91.8           91.2            91.1         91.14
Faster RCNN    92.1           91.7            91.5         91.59
R2CNN          94.2           92.1            91.5         91.79
NDR-R2CNN      98.2           96.4            100          98.16
The proposed NDR-R2CNN model and other deep learning models such as LSTM, RPN, fast RCNN, faster RCNN, and R2CNN were compared in Fig. 4. The result shows that the proposed NDR-R2CNN model has a higher efficiency compared to other deep learning models. The proposed NDR-R2CNN model has the advantages of effective feature analysis and non-discrimination feature analysis in the model. The proposed NDR-R2CNN model has a precision of 96.4 % and the R2CNN model has 92.1 % precision.
The proposed NDR-R2CNN model is evaluated in the dysgraphia classification and compared with existing methods [11-15]. Various existing methods were applied for the classification of dysgraphia and achieved considerable performance.
The proposed NDR-R2CNN model is compared with existing methods such as TestGraphia [11], CNN [12], Partial Correlation Regression Model [13], PCA + Clustering [14], and SMOTE + SVM [15], as shown in Table 3. The result shows that the proposed NDR-R2CNN has higher performance compared to existing methods. The proposed NDR-R2CNN model has the advantage of effective feature analysis and non-discrimination feature analysis.

Table 3. Comparative analysis on collected dataset
Methods                                      Accuracy (%)   Precision (%)   Recall (%)   F-measure (%)
TestGraphia [11]                             94.5           94.2            94.1         94.14
CNN [12]                                     94.2           93.1            93.2         93.14
Partial Correlation Regression Model [13]    86.4           85.7            86.2         85.94
PCA + Clustering [14]                        88.6           88.2            88.3         88.24
SMOTE + SVM [15]                             90.2           91.5            91.6         91.54
NDR-R2CNN                                    98.2           96.4            100          98.16
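The four metrics reported in the tables follow Eq. (6) to Eq. (9). As a minimal sketch, they can be computed from confusion-matrix counts as below. The function names and the example counts are hypothetical and chosen only for illustration; they are not taken from the paper's experiments.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (Eq. 9); inputs in percent."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def classification_metrics(tp, tn, fp, fn):
    """Return (accuracy, precision, recall, f-measure) in percent (Eq. 6-9)."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total * 100 if total else 0.0
    precision = tp / (tp + fp) * 100 if (tp + fp) else 0.0
    recall = tp / (tp + fn) * 100 if (tp + fn) else 0.0
    return accuracy, precision, recall, f_measure(precision, recall)


if __name__ == "__main__":
    # Hypothetical confusion-matrix counts, for illustration only.
    acc, prec, rec, f1 = classification_metrics(tp=40, tn=45, fp=5, fn=10)
    print(f"accuracy={acc:.2f} precision={prec:.2f} recall={rec:.2f} f-measure={f1:.2f}")
```

Applying `f_measure` to the precision and recall reported for NDR-R2CNN (96.4 % and 100 %) gives about 98.17 %, which agrees with the 98.16 % listed in the tables up to rounding.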