
This article has been accepted for publication in IEEE Access. This is the author's version, which has not been fully edited; content may change prior to final publication. Citation information: DOI 10.1109/ACCESS.2023.3302903

Date of publication xxxx 00, 0000, date of current version xxxx 00, 0000.
Digital Object Identifier 10.1109/ACCESS.2022.0092316

Handwriting-based ADHD Detection for Children Having ASD using Machine Learning Approaches

JUNGPIL SHIN 1 (Senior Member, IEEE), MD. MANIRUZZAMAN 1, YUTA UCHIDA 1, MD. AL MEHEDI HASAN 2, AKIKO MEGUMI 3, AKIRA YASUMURA 3
1 School of Computer Science and Engineering, The University of Aizu, Aizuwakamatsu 965-8580, Japan
2 Department of Computer Science & Engineering, Rajshahi University of Engineering & Technology, Rajshahi-6204, Bangladesh
3 Graduate School of Social and Cultural Sciences, Kumamoto University, Chuo-ku, Kumamoto, Japan
Corresponding author: Jungpil Shin ([email protected]).
This work was supported by JSPS KAKENHI Grant Numbers JP21H00891 and JP20K11892.

ABSTRACT Attention deficit hyperactivity disorder (ADHD) is a behavioral disorder in children that affects the brain's ability to control attention, impulsivity, and hyperactivity, and its prevalence has increased over time. There is still no cure for ADHD, and only early detection can improve the quality of life of children with ADHD. At the same time, children with ADHD often suffer from various comorbidities such as autism spectrum disorder (ASD) and major depressive disorder (MDD). Various researchers have developed computational tools to detect ADHD in children based on handwritten text. However, handwriting text-based systems depend on a specific language, which causes problems for non-native speakers of that language. Moreover, very few researchers have considered other comorbidities such as ASD or MDD in their studies on detecting ADHD in children. In this study, handwriting patterns (drawings) are used to identify children with ADHD who also have ASD using machine learning (ML)-based approaches. We collected handwriting samples from 29 Japanese children (14 children with ADHD and coexisting ASD and 15 healthy children) using a pen tablet. We asked each child to draw two patterns, zigzag lines and periodic lines (PL), on the pen tablet, repeating each three times. We extracted 30 statistical features from the raw dataset, analyzed them using sequential forward floating search (SFFS), and selected the best subsets of features. Finally, the selected features were fed into seven ML-based algorithms for detecting ADHD in children with coexisting ASD. The classifiers were trained with leave-one-out cross-validation, and their performance was evaluated using accuracy, recall, precision, f1-score, and area under the curve (AUC). The experimental results show that the highest performance scores (accuracy: 93.10%; recall: 90.48%; precision: 95.00%; f1-score: 92.68%; and AUC: 0.930) were achieved by the random forest (RF)-based classifier on the PL predict task. This study provides evidence of the possibility of classifying ADHD children having ASD and healthy children based on their handwriting patterns.

INDEX TERMS ADHD, ASD, Detection, Handwriting Patterns, and Machine Learning.

I. INTRODUCTION

Attention deficit hyperactivity disorder (ADHD) is one of the behavioral disorders that affect the brain's ability to regulate attention, impulsivity, and hyperactivity [1]. It mainly develops in childhood or the preschool years (ages 3-5), becomes a more acute and severe problem in school-aged children, and can persist into adulthood [2], [3]. Children with ADHD suffer from various complications such as poor academic performance and employment attainment [4], poor physical and mental health [4], and suicide attempts [4]–[7]. Moreover, males are more affected by ADHD than females [8]. According to the CDC, the number of children with ADHD has fluctuated over time: 4.4 million children with ADHD aged 3-17 years were diagnosed in 2003 [9], 5.4 million in 2007, and 6.1 million in 2016 [8]. This rate has gradually

VOLUME 10, 2022

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see https://creativecommons.org/licenses/by-nc-nd/4

increased globally. Moreover, children and adults with ADHD also suffer from other psychiatric disorders such as autism spectrum disorder (ASD) and major depressive disorder (MDD) [10]–[12]. Indeed, existing studies have found that adults with ADHD had at least one coexisting psychiatric disorder [10]–[12].

Various diagnostic tools, such as MRI [13], [14], fNIRS [15], EEG [16]–[18], questionnaires [19], [20], and performance tests [21], [22], have been widely used for detecting ADHD. Moreover, many existing works utilized handwritten text to detect children with ADHD [23]–[25]. Handwriting analysis can be performed in two ways, on handwriting text or on handwriting patterns, and either can be done offline or online. Handwriting text-based systems depend on a specific language, which causes problems for non-native speakers of that language. Since non-native speakers do not know the language properly, they face difficulties writing text in it, which prevents them from generating the signals needed to detect ADHD. Patterns, in contrast, are common to all people and create an equal opportunity to draw the pattern for detecting ADHD.

Recently, handwriting patterns have also been used to classify age groups [26], [27] and to detect ADHD [28]. However, very few researchers have considered other coexisting comorbidities in their studies on detecting ADHD in children [29]. In this work, we propose handwriting patterns for detecting ADHD in children with coexisting ASD. Nowadays, digital devices such as pen tablets allow us to record sequences of measurements from handwriting tasks. These recorded data can be analyzed using statistical analysis and multiple machine learning (ML)-based algorithms.

In this work, we used handwriting patterns to detect children who have both ADHD and ASD. Two types of handwriting patterns, zigzag and periodic lines (PL), were each drawn under two conditions: trace and prediction. We extracted statistical features and implemented sequential forward floating selection (SFFS) with seven ML-based approaches in order to select the best combination of effective and efficient features based on classification accuracy. The seven ML-based algorithms were support vector machine (SVM), random forest (RF), decision tree (DT), Gaussian naïve Bayes (GNB), k-nearest neighbors (k-NN), logistic regression (LR), and extra tree (ET). An important finding of this study is that the handwritten PL predict pattern had discriminative power and was more capable of distinguishing ADHD children with coexisting ASD from healthy children than the other patterns. Moreover, our proposed RF-based system produced the highest classification accuracy for detecting ADHD in children having ASD. In summary, the contributions of this study are as follows:
• We proposed a novel handwriting pattern, instead of handwriting text, for detecting ADHD in children who have ASD.
• We extracted statistical features from the raw features.
• More effective and efficient features were selected using the SFFS method.
• Finally, seven machine learning algorithms were adopted for ADHD detection in children having ASD.

The remainder of this study is organized as follows: Section II introduces the materials and methods, including the proposed methodology, the subject recruitment process, the data collection device and procedure, feature extraction, normalization, feature selection, and the classification algorithms. The experimental design and performance metrics are presented in Section III. The experimental results, along with a discussion, are presented in Section IV. Finally, we summarize the conclusions and future directions of this study in Section VII.

II. MATERIALS AND METHODS

A. PROPOSED METHODOLOGY

In this study, we designed an ML-based framework for detecting ADHD in children having ASD. Fig. 1 illustrates the overall framework of this work.

FIGURE 1. ML-based framework for predicting children with ADHD who had ASD.

To conduct this study, we performed the following steps. Firstly, the handwritten dataset was taken as input data and divided into two phases: training and testing. In the testing phase, one subject was taken, and the remaining (n-1) subjects were used for the training phase. Training data was used to train the ML-based models, and test data was utilized to predict children with ADHD having ASD. The second step was feature extraction from the raw features, followed by feature normalization, which was designed to keep the extracted features on a similar scale. The next step was feature selection, which involved selecting the dominant features by removing irrelevant ones. In this work, SFFS was employed as the feature selection method to select the dominant features. These features were used to train the ML-based algorithms, and their hyperparameters were tuned using the grid search method. The tuned parameters were then used to retrain the ML-based algorithms. The same dominant features were also extracted in the test phase and fed into the trained ML-based algorithms to classify children into two classes: children with ADHD coexisting with ASD, and healthy children.
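As a concrete illustration, the leave-one-out loop described above (per-subject split, normalization fitted on the training fold only, then classification) might be sketched as follows with scikit-learn. This is a minimal sketch, not the authors' code: `loocv_predict`, `X`, and `y` are illustrative names, and grid search inside the loop is omitted for brevity.

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier

def loocv_predict(X, y):
    """Leave-one-out: train on (n-1) subjects, predict the held-out subject."""
    preds = np.empty_like(y)
    for train_idx, test_idx in LeaveOneOut().split(X):
        # z-score normalization is fitted on the training fold only,
        # so the held-out subject never leaks into the scaling statistics
        scaler = StandardScaler().fit(X[train_idx])
        clf = RandomForestClassifier(random_state=0)
        clf.fit(scaler.transform(X[train_idx]), y[train_idx])
        preds[test_idx] = clf.predict(scaler.transform(X[test_idx]))
    return preds
```

In the paper's pipeline, the grid search for hyperparameters would be nested inside the loop, before the final fit on each training fold.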


TABLE 1. Summary of the handwritten pattern datasets

Class labels               Age (years)   Subjects   Tasks   Repeats   Total samples
ADHD with coexisting ASD   5-14          14         4       3         168 (14×4×3)
Healthy children           8-12          15         4       3         180 (15×4×3)

B. SUBJECTS AND RECRUITMENT PROCESS

In this work, we included 29 Japanese school students aged 5-14 years. All subjects were assessed for psychiatric disorders (ADHD or ASD) by a medical practitioner using the Autism-Spectrum Quotient (AQ) [30], [31] and the ADHD Rating Scale-IV (ADHD-RS-IV) [32]. Based on these rating scales, subjects were classified as ADHD children with coexisting ASD or healthy children. We found that 14 subjects had ADHD with coexisting ASD (age: 5-14 years) and the remaining 15 subjects had no disorder (age: 8-12 years). A summary of the dataset is presented in Table 1. All subjects were confirmed to be right-handed. Prior to participation in this study, we obtained written or oral consent from all subjects or their guardians (parents/grandparents/elder sisters or brothers).

C. DATA COLLECTION DEVICE

In this paper, we collected a handwriting pattern dataset using a pen tablet (Cintiq Pro 16, Wacom Co. Ltd., Japan), which has a 15.6-inch screen with a resolution of 2560 × 1440. We used a stylus pen to draw on the pen tablet and recorded the drawings at a sampling rate of 200 Hz. When subjects drew the handwriting patterns with the stylus, the pen tablet provided six raw features: the x and y coordinates (in pixels), drawing speed, pen pressure, pen tilt (horizontal/vertical angle), and time, as shown in Fig. 2.

FIGURE 2. Pen tablet device.

D. DATA COLLECTION PROCEDURE

We asked each subject to sit at a desk in such a way that their feet touched the floor, and we kept the pen tablet near the center of their body. At the same time, we set the distance between the pen tablet and the subject's eyes at about 40 cm. Two handwriting patterns were used to identify ADHD with coexisting ASD subjects and healthy subjects: a continuous zigzag and a periodic line (PL). Each pattern had two conditions, trace and predict. In the trace condition, subjects had to use the pen to trace the complete pattern, which was visible on the screen of the pen tablet. In the predict condition, subjects continued drawing the pattern, of which only the beginning was visible on the screen. The pattern started at a distance of 2.1 cm from the left edge and 3.3 cm from the top of the pen tablet screen.

We set the following requirements for drawing a zigzag line on the screen: a 70-degree apex angle, an 80-degree bottom angle, and 2.5 cm per side, repeated 5 times. The space between the five lines was 3.5 cm, and the trace line was presented in gray. In the predict condition, only the first segment was visible. We asked subjects to draw the predict or trace patterns without lifting the pen tip from the tablet and to continue to the next line after finishing one. To make a stimulus diagram of the PL, we alternately arranged a baseless quadrangle (height: 2 cm; width: 1.3 cm) and an isosceles triangle (vertex: 70 degrees). About 7.5 cycles of triangles and squares were arranged in a row, and five PL lines were presented in light gray on one screen.

To form the dataset, three blocks were executed in succession. Each block consisted of a 20-second rest, 30 seconds for the trace drawing task, another 20-second rest, and 30 seconds for the predict drawing task. Each subject performed these tasks 3 times. The handwriting data collection procedure is illustrated in Fig. 3.

FIGURE 3. Handwritten data collection procedure for four tasks.

E. FEATURE EXTRACTION

In this work, a set of 30 statistical features was extracted for each task to discriminate handwriting patterns by ADHD

with coexisting ASD children from healthy children. These 30 features were computed from the six raw features: the x and y coordinates (in pixels), drawing speed, pen pressure, pen tilt (horizontal/vertical angle), and time. These statistical features are well known and widely used in other domains [26], [27]. The extracted feature names, their explanations, and their computational formulas are listed in Table 2.

F. FEATURE NORMALIZATION

Feature normalization is known as feature scaling or z-score normalization in the field of statistics and machine learning. In this work, we used z-score normalization to apply a standardizing transformation, which is computed using the following formula:

z = (X − µ) / σ    (1)

where X is the input feature, µ is the mean of the feature, and σ is its standard deviation (SD). The resulting z values have zero mean and unit SD.

G. FEATURE SELECTION TECHNIQUE

Feature selection (FS) is a process that reduces the dimension of the training set by identifying only the biomarkers that are associated with the class or study variable (here, ADHD with coexisting ASD). This study excludes the biomarkers or features that (i) have low discriminative power or (ii) are redundant or irrelevant [34], because selecting effective and efficient biomarkers can improve learning algorithm efficiency and predictive accuracy and reduce computational time and cost. Moreover, the features that are fed into the learning algorithms are hypothetically assumed to be associated with the underlying class labels or diseases (here, ADHD+ASD). This study used the SFFS-based FS algorithm to determine efficient biomarkers, which is explained in the following subsection.

1) SFFS-based Algorithms

Sequential feature selection (SFS)-based methods are a set of greedy algorithms that are utilized to reduce the feature dimension space [35]. In this study, SFFS was used to determine a proper subset of biomarkers for ADHD with coexisting ASD. The pseudo-code of the SFFS-based algorithm is summarized in Fig. 4.

FIGURE 4. Pseudo code of SFFS-based algorithm.

H. CLASSIFICATION MODEL

Seven ML-based algorithms, namely SVM, RF, DT, GNB, k-NN, LR, and ET, were employed to distinguish ADHD with coexisting ASD children from healthy children. They are briefly discussed in the following subsections.

1) Support Vector Machine

SVM is a powerful supervised learning method that can be used to solve classification and regression problems [36], [37]. The main purpose of SVM is to find boundaries (hyperplanes) that separate the class labels (yes/no), obtained by solving the following optimization problem:

max_α Σ_{i=1}^n α_i − (1/2) Σ_{i=1}^n Σ_{j=1}^n α_i α_j y_i y_j K(z_i, z_j)    (2)

subject to

Σ_{i=1}^n y_i α_i = 0,  0 ≤ α_i ≤ C,  i = 1, 2, ..., n    (3)

The final discriminant function is written as:

f(z) = Σ_{i=1}^n α_i y_i K(z_i, z) + b    (4)

where b is the bias term and K(z_i, z_j) is the Gram (kernel) matrix, which needs to be chosen when applying SVM. In this work, we used a radial basis function (RBF) kernel, which is mathematically defined as:

K(z_i, z_j) = exp(−γ ||z_i − z_j||²)    (5)

The values of the cost (C) and gamma (γ) were optimized using a grid search method; we chose the optimal values of C and γ as the points at which SVM provided the highest classification accuracy. In this study, we used the following steps to perform SVM for predicting ADHD with coexisting ASD children:

Step 1: Split the dataset into a training set and a test set, where 1 subject was taken as the test set and the remaining (n-1) subjects were used as the training set.
Step 2: Select a suitable kernel on the basis of the characteristics of the training set. Here, we chose the radial basis function.
Step 3: Select the hyperparameters, the regularization parameter C and the gamma (γ) of the kernel function, using the grid search method.
Step 4: After optimizing the hyperparameters (C and γ), train the SVM with the RBF kernel on the training set.
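The SFFS procedure summarized in Fig. 4 can be sketched in a few lines of Python. This is an illustrative re-implementation wrapped around a cross-validated classifier, not the authors' exact code: the function name `sffs`, the cycle guard, and the stopping details are our simplifications.

```python
import numpy as np
from sklearn.model_selection import cross_val_score

def sffs(X, y, estimator, k, cv=3):
    """Sequential forward floating selection: greedily add the feature that
    helps cross-validated accuracy most, then conditionally drop any earlier
    feature whose removal improves the score (the 'floating' step)."""
    def cv_acc(feats):
        return cross_val_score(estimator, X[:, feats], y, cv=cv).mean()

    selected, visited = [], set()
    while len(selected) < k:
        key = tuple(sorted(selected))
        if key in visited:          # guard against cycling between subsets
            break
        visited.add(key)
        # inclusion step: best single feature to add
        remaining = [f for f in range(X.shape[1]) if f not in selected]
        best_f = max(remaining, key=lambda f: cv_acc(selected + [f]))
        selected.append(best_f)
        # conditional exclusion step (never drops the feature just added)
        while len(selected) > 2:
            current = cv_acc(selected)
            drops = [f for f in selected if f != best_f]
            worst = max(drops, key=lambda f: cv_acc([g for g in selected if g != f]))
            if cv_acc([g for g in selected if g != worst]) > current:
                selected.remove(worst)
            else:
                break
    return selected
```

The floating (conditional exclusion) step is what distinguishes SFFS from plain forward selection: a feature chosen early can be discarded later once better companions are in the subset.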

TABLE 2. List of 30 extracted statistical features from the raw dataset and their computational formulas

SN  Feature           Description                                                  Formula
1   Width             Range of X coordinate                                        Max(X) − Min(X)
2   Height            Range of Y coordinate                                        Max(Y) − Min(Y)
3   Length (L)        Total length of the drawing                                  Σ_{k=1}^n a_k, where a = {x ∈ Distance}
4   Velocity (V)      Total velocity of the drawing                                L / total drawing time
5   Max_V_P           Maximum velocity of the peak                                 Max{x ∈ V}
6   Min_V_P           Minimum velocity of the peak                                 Min{x ∈ V}
7   Max_A_P           Maximum acceleration of the peak                             Max{x ∈ A}
8   Min_A_P           Minimum acceleration of the peak                             Min{x ∈ A}
9   Mean_GA_H         Mean of grip angle (horizontal)                              ā_h = (Σ_{k=1}^n a_k)/n, where a = {x ∈ Angles}
10  Mean_GA_V         Mean of grip angle (vertical)                                ā_v = (Σ_{k=1}^n a_k)/n, where a = {x ∈ Angles}
11  SD_GA_H           SD of grip angle (horizontal)                                SD(ah) = sqrt(Σ_{i=1}^n (a_i − ā_h)² / n), where a = {x ∈ Angles}
12  SD_GA_V           SD of grip angle (vertical)                                  SD(av) = sqrt(Σ_{i=1}^n (a_i − ā_v)² / n), where a = {x ∈ Angles}
13  Mean_Press        Mean of recorded pressure                                    ā_p = (Σ_{k=1}^n a_k)/n, where a = {x ∈ Pressure}
14  SD_Press          SD of recorded pressure                                      SD(p) = sqrt(Σ_{i=1}^n (a_i − ā_p)² / n), where a = {x ∈ Pressure}
15  Mean_Pos_C_Pres   Mean of positive change in pressure                          ā_ppc = (Σ_{k=1}^n a_k)/n, where a = {x ∈ Pressure} > 0
16  SD_Pos_C_Pres     SD of positive change in pressure                            SD(ppc) = sqrt(Σ_{i=1}^n (a_i − ā_ppc)² / n), where a = {x ∈ Pressure} > 0
17  Max_Pos_C_Pres    Maximum of positive change in pressure                       Max(k), where k = {x ∈ Pressure} > 0
18  Mean_Neg_C_Pres   Mean of negative change in pressure                          ā_npc = (Σ_{k=1}^n a_k)/n, where a = {x ∈ Pressure} < 0
19  SD_Neg_C_Pres     SD of negative change in pressure                            SD(npc) = sqrt(Σ_{i=1}^n (a_i − ā_npc)² / n), where a = {x ∈ Pressure} < 0
20  Max_Neg_C_Pres    Maximum of negative change in pressure                       Max(k), where k = {x ∈ Pressure} < 0
21  Error             No. of outliers and triangle/square errors based on angles   (Square angle − Triangle angle) < 0
22  Mean_Peak_Pres    Mean of peak pressure at minima                              ā_pp = (Σ_{k=1}^n a_k)/n, where a = {x ∈ Pressure}
23  ErrorStopTime     Mean of starting time at minima point before error           (Σ_{k=1}^n a_k)/n, where a = {x ∈ Time}
24  Mean_Angle_Mean   Mean of angles at maxima and minima                          ā_A = (Σ_{k=1}^n a_k)/n, where a = {x ∈ Angles}
25  Angle_Var         Angle variance                                               Var(A) = Σ_{i=1}^n (a_i − ā_A)² / n, where a = {x ∈ Angles}
26  RegL_Slope        Slope of the regression model                                [33]
27  RegL_Inter        Intercept of the regression model                            [33]
28  LoopCount         Writing time divided by the number of peaks                  Σ_{k=1}^n a_k, where a = {x ∈ Peaks}
29  Angle_Velocity    Mean of velocities at the edges of peaks and valleys         Distance / Time
30  Error_Rate        Rate of error                                                Error / Peaks
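Several of the Table 2 features follow directly from the raw coordinate stream. As an illustration (the function name is ours, and `x`, `y` are assumed to be NumPy arrays of pen coordinates for one drawing), features 1-4 might be computed as:

```python
import numpy as np

def basic_stroke_features(x, y, total_time):
    """Width, height, path length, and mean velocity (features 1-4 in Table 2)."""
    width = x.max() - x.min()                 # range of the X coordinate
    height = y.max() - y.min()                # range of the Y coordinate
    steps = np.hypot(np.diff(x), np.diff(y))  # distance between consecutive samples
    length = steps.sum()                      # total length of the drawing
    velocity = length / total_time            # length divided by drawing time
    return width, height, length, velocity
```

With the 200 Hz sampling described in Section II-C, `total_time` would be the number of samples divided by 200.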

Step 5: Use the trained SVM with the RBF kernel to predict the class label (ADHD with coexisting ASD vs. HC) of the test set.
Step 6: Repeat Step 1 to Step 5 n times.
Step 7: Compute performance metrics such as accuracy, recall, precision, f1-score, and AUC.

2) Random Forest

Random forest (RF) is an ensemble learning method that integrates multiple weak learners based on decision trees to improve generalization ability. It can be used for both regression and classification problems. RF generates multiple decision (classification) trees during the training phase. Each tree is created using bootstrap sampling from the original data and the classification tree method. After forming a forest, a new object is passed down each tree for classification, and the class that receives the maximum number of votes for the object is selected. RF proceeds as follows:

Step 1: Draw n_tree bootstrap samples from the n training samples.
Step 2: Construct a classification tree for each bootstrap sample by taking m_try of the predictors and choosing the best split from among these variables.
Step 3: Predict the new class by combining the predictions of the n_tree trees.

The error rate or classification accuracy can be assessed on two sets: the training set and the test set. In this paper, we trained the RF-based model on the training set and evaluated its performance on the test set. The hyperparameters were optimized using a grid search method, and the values of n_tree and m_try were selected at the points that provided the highest classification accuracy (lowest error rate).

3) Decision Tree

Decision tree (DT) is a tree-structure-based technique that is widely used in data mining for solving regression and classification tasks [38]. The objective of DT is to build a model that can predict the study variable by learning simple decision rules from the input features. A DT has three kinds of nodes: the internal node, the decision node,

and the leaf node. Internal nodes are the set of input features, decision nodes are used to make decisions learned during training, and leaf nodes are the outputs of these decisions. Nowadays, DT is widely used in different fields such as healthcare and medical imaging.

4) Gaussian Naïve Bayes

Gaussian Naïve Bayes (GNB) is a classification method that is also widely used in ML. GNB assumes that the distribution of each input feature follows a normal (Gaussian) distribution. Assume that we have a set of input features x_i (i = 1, 2, ..., n) and let y_k be the outcome or class label (k = 0, 1), where "0" stands for healthy children and "1" stands for ADHD with coexisting ASD children. First, the input training set is segmented by class label, and the mean and standard deviation (SD) of each input feature are estimated for each class using the following formulas:

Mean: µ_k = (1/n) Σ_{i=1}^n x_i    (6)

SD: σ_k = sqrt((1/n) Σ_{i=1}^n (x_i − µ_k)²)    (7)

Suppose that we observe some value v. The probability density function (pdf) of v given y_k is computed as follows:

p(x = v | y_k) = (1 / (σ_k √(2π))) exp(−(1/2) ((v − µ_k)/σ_k)²),  σ_k > 0, −∞ < v, µ_k < ∞    (8)

For a given test data point, we compute the likelihood or posterior probability based on the estimated values of µ_k and σ_k for each class (1/0). The predicted class label is the class with the highest posterior probability.

5) k-Nearest Neighbors

k-nearest neighbors (k-NN) is a popular ML-based technique developed by Fix and Hodges in 1951 and later expanded by Cover and Hart [39]; it can be utilized to solve regression and classification problems. It is a distance-based learning algorithm that measures feature vector similarity. This paper used the Euclidean distance to compute the distance to all training data points for a chosen value of k. The majority class among the k neighbors is treated as the predicted class. In this paper, we tuned the value of k using a grid search technique to achieve better performance.

6) Logistic Regression

Logistic regression (LR) is a statistical method that is used for binary classification tasks. It is usually used to estimate the probability of a binary class label based on the input feature vectors, where the input features can be continuous or categorical and the class label is binary, either "1" or "0". Here, "1" stands for ADHD with coexisting ASD children and "0" stands for healthy children. During the training phase, LR uses the logit or sigmoid function, which is defined as:

σ(w) = 1 / (1 + exp(−w))    (9)

where w = Σ_{i=0}^k α_i X_i with X_0 = 1. Here, X_i is the ith input feature and α_i is the ith unknown coefficient, which needs to be estimated. In this paper, we estimated these coefficients by maximum likelihood estimation (MLE) during the training phase, and these coefficients were used to predict the class label: ADHD with coexisting ASD children or healthy children.

7) Extra Tree

An extra tree (ET) is an ensemble of DTs that performs classification or regression using a tree-based algorithm. Like the RF-based algorithm, the ET-based algorithm constructs multiple DTs based on random subsets of the training set and features. In addition, it randomly chooses the split thresholds for each feature. The ET-based algorithm maintains its optimization ability while adding an additional layer of randomization [40].

III. EXPERIMENTAL SETUP AND PERFORMANCE METRICS

A. EXPERIMENTAL SETUP

In this paper, we used Python version 3.10 to perform all experiments. The operating system was Windows 10 version 21H1 (build 19043.1151) 64-bit, and the hardware was an Intel(R) Core(TM) i5-10400 with 16 GB RAM. In this work, we used the leave-one-out cross-validation (LOOCV) protocol, where the dataset was divided into a training set and a test set: one subject was used for the test set and the remaining (n-1) subjects were used for the training set. During training of the predictive models, we set an initial interval or range for each hyperparameter of each model, as illustrated in Table 3. For example, SVM used the "RBF" kernel with cost (C) in {0.0001, 0.001, 0.01, 0.1, 1, 10, 100, 1000} and gamma (γ) in {0.00001, 0.0001, 0.001, 0.01, 0.1, 1}. RF: max_depth in {5, None}; n_estimators in {50, 100, 200, 300}; min_samples_split in {2, 3}; min_samples_leaf in {1, 3}; bootstrap in {True, False}; and criterion in {"gini", "entropy"}. DT: max_depth in {1, 2, 3, 4, 5}; min_samples_leaf in {1 to 10}; min_samples_split in {2, 3, 4, 5}. k-NN: n_neighbors from 1 to 20; weights in {'uniform', 'distance'}; 'p' in {1, 2}. LR: cost (C) in {10**i for i in range(-4, 4)}; penalty in {"l1", "l2"}; and the 'liblinear' solver. ET: max_depth in {3, 4, 5}; min_samples_leaf in {1, 4, 7}; min_samples_split of 2. In the training phase, we used the initial parameters of each classifier and tuned them using the grid search method. After optimizing the parameters of each classifier, we retrained all classifiers, which were then used to predict
6 VOLUME 10, 2022

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see https://creativecommons.org/licenses/by-nc-nd/4
This article has been accepted for publication in IEEE Access. This is the author's version which has not been fully edited and
content may change prior to final publication. Citation information: DOI 10.1109/ACCESS.2023.3302903

the class label on the test set. This entire process was repeated n times. Moreover, we also computed the predicted class of each trial and its probability for all subjects over the four tasks.

TABLE 3. Search range of hyperparameters of the classifiers

CT     Search Range of Parameters
SVM    C: {0.0001, 0.001, 0.01, 0.1, 1, 10, 100, 1000}; γ: {0.00001, 0.0001, 0.001, 0.01, 0.1, 1}
RF     max_depth: {5, None}; n_estimators: {50, 100, 200, 300}; min_samples_split: {2, 3}; min_samples_leaf: {1, 3}; bootstrap: {True, False}; criterion: {"gini", "entropy"}
DT     max_depth: {1, 2, 3, 4, 5}; min_samples_leaf: {1 to 10}; min_samples_split: {2, 3, 4, 5}
GNB    None
KNN    n_neighbors: np.arange(1, 20); weights: {'uniform', 'distance'}; p: {1, 2}
LR     C: {10**i for i in range(-4, 4)}; penalty: {"l1", "l2"}; random_state: [1]; solver: ['liblinear']
ET     max_depth: [3, 4, 5]; min_samples_leaf: [1, 4, 7]; min_samples_split: [2]
CT: Classifier Types

B. PERFORMANCE METRICS
Performance metrics are used to assess the effectiveness of predictive models in making predictions or classifications. In this work, different performance metrics, such as accuracy (ACC), precision (Prec), recall (Rec), F1-score (FS), and the ROC-AUC score, were used to evaluate the performance of the seven predictive models. These metrics provide insight into how well the predictive models are able to accurately identify ADHD in children with ASD based on the input data. They were computed from the true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN), which are presented in Table 4. ACC, Prec, Rec, and FS are computed using the following formulas:

Accuracy
Accuracy measures the proportion of correct predictions, i.e., the ratio of correctly predicted classes to the total number of predicted classes. It is mathematically expressed as:

    ACC (%) = (TP + TN) / N × 100    (10)

Recall
Recall measures the proportion of correctly predicted positive classes out of all actual positive classes (R1 = TP + FN in Table 4):

    Rec (%) = TP / R1 × 100    (11)

Precision
Precision measures the proportion of true positive predictions (i.e., correctly identified ADHD) out of all positive predictions (C1 = TP + FP in Table 4):

    Prec (%) = TP / C1 × 100    (12)

F1-score
FS is computed from recall and precision using the following formula:

    FS (%) = 2TP / (2TP + FP + FN) × 100    (13)

IV. EXPERIMENTAL RESULTS
We organized the experimental results section into two subsections. First, we describe the statistical baseline characteristics of ADHD children with coexisting ASD and healthy children. Subsequently, we discuss the experimental results of the classification models, including a comprehensive performance analysis of the classifiers using all features and using selected optimal features.

A. STATISTICAL BASELINE CHARACTERISTICS OF ADHD WITH COEXISTING ASD AND HC CHILDREN
The statistical baseline characteristics of ADHD with coexisting ASD and healthy children are illustrated in Table 5. As shown in Table 5, the overall prevalence of ADHD children with coexisting ASD and healthy children was 48.28% and 51.72%, and their average ages were 8.57 ± 2.24 and 9.92 ± 1.23 years, respectively. Moreover, we found that male children were more likely to have ADHD with coexisting ASD than female children: about 57.89% of male children had ADHD with coexisting ASD, and the remaining male children were healthy. It was observed that age and gender showed no significant difference between ADHD children with coexisting ASD and healthy children.

B. EXPERIMENTAL RESULTS OF CLASSIFICATION MODELS
In this study, we performed two experiments for predicting ADHD children with coexisting ASD: (i) all-feature-based performance analysis and (ii) significant-feature-based performance analysis. The results of these two experiments are explained in the following subsections.

1) Experiment-I: All Feature-based Performance Analysis
The purpose of this experiment was to examine the performance of the seven classifiers for identifying ADHD children with coexisting ASD by considering all 30 features, which were extracted from six raw features (see Table 2). The seven ML-based classifiers (SVM, RF, DT, GNB, KNN, LR, and ET) were trained with LOOCV, and the hyperparameters of each classifier were optimized to detect ADHD children with coexisting ASD. At the same time, we computed the classification accuracy of each classifier over the four tasks. The classification accuracy of the seven classifiers over the four tasks is shown in Table 6, and the corresponding results are also shown in Fig. 5. As shown in Table 6 and Fig. 5, we observed that the RF-based model achieved the highest classification accuracy for

TABLE 4. Confusion Matrix

                                            Predicted Class
Actual Class                 ADHD with coexisting ASD   Healthy children   Total
ADHD with coexisting ASD     TP                         FN                 R1 = TP + FN
Healthy children             FP                         TN                 R2 = FP + TN
Total                        C1 = TP + FP               C2 = FN + TN       N = R1 + R2 = C1 + C2

TABLE 5. Baseline characteristics of ADHD with coexisting ASD and healthy children

Variables          Overall       ADHD with coexisting ASD   Healthy children   Statistics          p-value¹
Total, n (%)       29            14 (48.28)                 15 (51.72)
Age, Mean ± SD     9.27 ± 1.90   8.57 ± 2.24                9.92 ± 1.23        t = 2.02, df = 27   0.053
Gender, Male (%)   19 (70.4)     11 (57.89)                 8 (42.11)          χ² = 2.04, df = 1   0.121
df: degrees of freedom; ¹ p-value obtained from an independent t-test for the age variable and a chi-square test for the gender variable
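The group comparisons in Table 5 can be reproduced from the summary values alone. A sketch assuming SciPy is available; note that `chi2_contingency` needs `correction=False` to match the uncorrected chi-square statistic:

```python
from scipy import stats

# Age: independent two-sample t-test from the summary statistics in Table 5.
t_res = stats.ttest_ind_from_stats(
    mean1=8.57, std1=2.24, nobs1=14,   # ADHD with coexisting ASD
    mean2=9.92, std2=1.23, nobs2=15)   # healthy children
# |t| is about 2.03 with df = 14 + 15 - 2 = 27, matching t = 2.02 in
# Table 5 up to rounding of the reported means and SDs.

# Gender: chi-square test on the 2x2 group-by-sex contingency table;
# correction=False disables the Yates continuity correction.
chi2, p, dof, expected = stats.chi2_contingency(
    [[11, 3],    # ADHD with coexisting ASD: 11 male, 3 female
     [8, 7]],    # healthy children: 8 male, 7 female
    correction=False)
```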

TABLE 6. Classification accuracy (in %) of seven classifiers for all features over four tasks

CT     Zigzag Trace   Zigzag Predict   PL Trace   PL Predict
SVM    68.97          70.11            79.31      82.76
RF     73.56          75.86            86.21      88.51
DT     68.97          71.26            75.86      82.76
GNB    60.92          58.62            56.32      72.41
KNN    68.97          68.97            71.26      77.01
LR     70.11          62.07            75.86      80.46
ET     70.11          62.07            73.56      79.31
CT: Classifier Types; PL: Plain Line

TABLE 7. Classification accuracy (in %) of seven classifiers of individual task for optimal features

       Zigzag Trace   Zigzag Predict   PL Trace    PL Predict
CT     NSF   ACC      NSF   ACC        NSF   ACC   NSF   ACC
SVM    10    77.01    6     80.46      16    87.36   5   86.21
RF     5     87.36    5     85.06      7     88.51   9   93.10
DT     2     82.76    16    82.76      14    86.21   3   91.95
GNB    8     78.16    2     66.67      10    79.31  16   82.76
KNN    6     82.76    8     82.76      3     81.61   5   90.80
LR     13    75.86    5     62.07      17    81.61   5   88.51
ET     2     77.01    5     77.01      5     80.46  10   88.51
NSF: Number of Selected Features; CT: Classifier Types; PL: Periodic Line
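The SFFS procedure behind the optimal-feature results [35] alternates a forward inclusion step with conditional backward exclusions. The following is a compact, hand-rolled sketch on synthetic data (not this study's dataset or exact implementation), scored with an RF classifier under LOOCV; a library such as mlxtend provides the same floating search off the shelf:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import LeaveOneOut, cross_val_score

def sffs(X, y, estimator, k_max, cv):
    """Sequential forward floating search: each forward inclusion is
    followed by conditional exclusions, kept only when they beat the
    best subset already recorded for the smaller size."""
    def score(feats):
        return cross_val_score(estimator, X[:, feats], y, cv=cv).mean()

    selected, best = [], {}  # best[k] = (score, subset of size k)
    while len(selected) < k_max:
        # Forward step: include the single most helpful feature.
        remaining = [f for f in range(X.shape[1]) if f not in selected]
        selected = selected + [max(remaining,
                                   key=lambda f: score(selected + [f]))]
        s = score(selected)
        if len(selected) not in best or s > best[len(selected)][0]:
            best[len(selected)] = (s, selected)
        # Floating step: conditionally exclude features again.
        while len(selected) > 2:
            cand = max(([g for g in selected if g != f] for f in selected),
                       key=score)
            s = score(cand)
            if len(cand) in best and s <= best[len(cand)][0]:
                break
            best[len(cand)] = (s, cand)
            selected = cand
    return max(best.values())  # overall (best score, best subset)

X, y = make_classification(n_samples=30, n_features=6, n_informative=3,
                           random_state=0)
clf = RandomForestClassifier(n_estimators=10, random_state=0)
acc, subset = sffs(X, y, clf, k_max=3, cv=LeaveOneOut())
```

The backward "floating" step is what distinguishes SFFS from plain forward selection: a previously added feature can be discarded later once a better companion set is found.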

all tasks compared to the other classifiers. More specifically, RF obtained 73.56% accuracy for zigzag trace, 75.86% for zigzag predict, 86.21% for PL trace, and 88.51% for PL predict. We also observed that the RF-based model produced higher accuracy (88.51%) for the PL predict task than the rest of the tasks and classifiers.

FIGURE 5. Classification accuracy of seven classifiers for all features over four tasks.

2) Experiment-II: Significant Selected Feature-based Performance Analysis
This section presents the performance analysis of the seven classifiers for identifying ADHD children with coexisting ASD using optimal or significant features. The optimal features were selected by SFFS. In this work, we employed SFFS with the seven classifiers (SVM, RF, DT, GNB, KNN, LR, and ET), trained them separately with LOOCV, and optimized their hyperparameters. At the same time, we computed the classification accuracy of each classifier and determined the most effective combination of features that yielded the highest classification accuracy. The classification accuracy results of SFFS with the seven classifiers for the four tasks are illustrated in Table 7; the corresponding results are also illustrated in Fig. 6.

It was noticed that SFFS with the RF-based classifier produced better classification accuracy for all tasks than the other classifiers. In particular, SFFS with the RF-based classifier selected 9 features and produced the highest classification accuracy, 93.10%, for the PL predict task compared to the other tasks and classifiers. For the PL predict task, classification accuracies of 86.21%, 91.95%, 82.76%, 90.80%, 88.51%, and 88.51% were obtained by SVM, DT, GNB, k-NN, LR, and ET, respectively. On the other hand, SFFS with the LR-based classifier selected a combination of 5 features and obtained the lowest classification accuracy, 62.07%, for the zigzag predict task.

Moreover, the confusion matrix of the RF-based classifier for the four tasks is presented in Fig. 7. Four performance evaluation parameters, namely recall, precision, FS, and AUC, of SFFS with the RF-based classifier for the four tasks are presented in Table 8. We observed that the RF-based model

obtained the highest performance scores (precision: 95.00%; FS: 92.68%; and AUC: 0.930), except for recall, for the PL predict task compared to the other tasks. We observed that the PL predict task with the RF-based model has more discriminative power to distinguish children having ADHD with coexisting ASD from healthy children.

TABLE 8. Four performance parameters of the proposed RF-based model

Task Type        NSF   Precision (%)   Recall (%)   F1-Score (%)   AUC
Zigzag Trace     5     89.74           83.33        86.42          0.872
Zigzag Predict   5     85.37           83.33        84.34          0.850
PL Trace         7     84.78           92.86        88.64          0.887
PL Predict       9     95.00           90.48        92.68          0.930

FIGURE 6. Classification accuracy of seven classifiers for optimal features over four tasks.

FIGURE 7. Confusion matrix of RF for: (a) Zigzag Trace; (b) Zigzag Predict; (c) PL Trace; and (d) PL Predict.

V. DISCUSSION
ADHD is one of the most common psychiatric and behavioral disorders in children. Children with ADHD also suffer from various comorbidities, and its prevalence has increased globally. Moreover, a cure for ADHD with other comorbidities is still unknown, and only early detection can improve the quality of life. Existing studies were designed only for ADHD detection based on handwritten text [24], [25] and other diagnostic tools [41]–[43]. These handwritten-text-based systems were developed as offline systems that required a specific language; as a result, non-native speakers had some difficulty writing text in those languages. To solve these problems, we designed a handwriting-pattern-based system with ML algorithms to discriminate children with ADHD having ASD from healthy children.

To design this system, we performed several steps. First, we asked children to draw four handwriting patterns (zigzag trace, zigzag predict, PL trace, and PL predict) on a pen tablet device using a stylus pen and to repeat each pattern three times. The pen tablet device generated six raw features, which were discussed in the data collection procedure section. From these six raw features, we extracted thirty statistical features. Subsequently, we performed two experiments. The first experiment examined the performance of the classifiers using all features: we adopted seven ML-based algorithms and trained these models with the LOOCV protocol on all features. Our experimental results showed that the PL predict pattern with the RF-based algorithm achieved outstanding performance compared with the other classifiers and tasks. The second experiment examined the discriminative power of the classifiers using significant features. We employed the SFFS algorithm to select the subset of relevant features, which were then used by the seven classifiers to discriminate ADHD children with coexisting ASD from healthy children. We again trained these classifiers with the LOOCV protocol for the four tasks; their results are shown in Table 7. Our experimental results confirmed that the PL predict task with the RF-based algorithm also obtained outstanding performance compared with the other algorithms and tasks. Finally, we concluded that our proposed system has high discriminative power to detect ADHD children with coexisting ASD.

VI. FUTURE WORK DIRECTION
Despite obtaining promising results, this study still had limitations. For example, it used a relatively small number of subjects, all of whom were confirmed to be right-handed. In future work, we will extend this study by including more subjects, including left-handed subjects. Moreover, we will also implement deep learning-based algorithms to detect ADHD with coexisting ASD in children.

VII. CONCLUSIONS
This paper proposed an ML-based system to detect ADHD with coexisting ASD from handwriting patterns. To build this system, we performed the following experiments. First, we extracted 30 statistical features from the raw data and normalized them by Z-score. Second, we employed SFFS to determine relevant features and implemented a grid

search technique to select the optimal parameters of the classification algorithms. This study used seven classification algorithms to discriminate ADHD children with ASD from healthy children for each task. The experimental results illustrated that the RF-based algorithm achieved 93.10% accuracy for the PL predict task, which was comparatively higher than the other classifiers and tasks. This study suggests that the PL predict pattern with an RF-based classifier has high discriminative power to detect ADHD children with coexisting ASD. This study will be helpful for medical practitioners/physicians to detect children with ADHD having ASD at an early stage.

INSTITUTIONAL REVIEW BOARD STATEMENT
All procedures followed were in accordance with the ethical standards of the responsible committee on human experimentation (institutional and national) and with the Helsinki Declaration of 1964 and later versions. Ethical approval for this dataset was granted by the Institutional Review Board (IRB) of Kumamoto University, Japan (Approval Number: 45, Approval Date: 25 May 2021).

ACKNOWLEDGMENTS
This work was supported by JSPS KAKENHI Grant Numbers JP21H00891 and JP20K11892.

REFERENCES
[1] American Psychiatric Association, Diagnostic and Statistical Manual of Mental Disorders: DSM-5. Washington, DC: American Psychiatric Association, 2013, vol. 5, no. 5.
[2] I. Lazzaro, E. Gordon, W. Li, C. Lim, M. Plahn, S. Whitmont, S. Clarke, R. Barry, A. Dosen, and R. Meares, "Simultaneous EEG and EDA measures in adolescent attention deficit hyperactivity disorder," Int. J. Psychophysiol., vol. 34, no. 2, pp. 123–134, 1999.
[3] M. Altınkaynak, N. Dolu, A. Güven, F. Pektaş, S. Özmen, E. Demirci, and M. İzzetoğlu, "Diagnosis of attention deficit hyperactivity disorder with combined time and frequency features," Biocybern. Biomed. Eng., vol. 40, no. 3, pp. 927–937, 2020.
[4] V. A. Harpin, "The effect of ADHD on the life of an individual, their family, and community from preschool to adult life," Arch. Dis. Child., vol. 90, no. suppl 1, pp. i2–i7, 2005.
[5] R. Barkley, K. Murphy, and M. Fischer, "ADHD in adults: What the science says. New York, NY, US," 2008.
[6] A. Stickley, A. Koyanagi, V. Ruchkin, and Y. Kamio, "Attention-deficit/hyperactivity disorder symptoms and suicide ideation and attempts: Findings from the Adult Psychiatric Morbidity Survey 2007," J. Affect. Disord., vol. 189, pp. 321–328, 2016.
[7] M. Impey and R. Heun, "Completed suicide, ideation and attempt in attention deficit hyperactivity disorder," Acta Psychiatr. Scand., vol. 125, no. 2, pp. 93–102, 2012.
[8] M. L. Danielson, R. H. Bitsko, R. M. Ghandour, J. R. Holbrook, M. D. Kogan, and S. J. Blumberg, "Prevalence of parent-reported ADHD diagnosis and associated treatment among US children and adolescents, 2016," J. Clin. Child Adolesc. Psychol., vol. 47, no. 2, pp. 199–212, 2018.
[9] Centers for Disease Control and Prevention (CDC), "Mental health in the United States. Prevalence of diagnosis and medication treatment for attention-deficit/hyperactivity disorder—United States, 2003," MMWR Morb. Mortal. Wkly. Rep., vol. 54, no. 34, pp. 842–847, 2005.
[10] T. Torgersen, B. Gjervan, and K. Rasmussen, "ADHD in adults: a study of clinical characteristics, impairment and comorbidity," Nord. J. Psychiatry, vol. 60, no. 1, pp. 38–43, 2006.
[11] E. Sobanski, D. Brüggemann, B. Alm, S. Kern, M. Deschner, T. Schubert, A. Philipsen, and M. Rietschel, "Psychiatric comorbidity and functional impairment in a clinically referred sample of adults with attention-deficit/hyperactivity disorder (ADHD)," Eur. Arch. Psychiatry Clin. Neurosci., vol. 257, pp. 371–377, 2007.
[12] L. Reale, B. Bartoli, M. Cartabia, M. Zanetti, M. A. Costantino, M. P. Canevini, C. Termine, and M. Bonati, "Comorbidity prevalence and treatment outcome in children and adolescents with ADHD," Eur. Child Adolesc. Psychiatry, vol. 26, pp. 1443–1457, 2017.
[13] C. J. Vaidya, X. You, S. Mostofsky, F. Pereira, M. M. Berl, and L. Kenworthy, "Data-driven identification of subtypes of executive function across typical development, attention deficit hyperactivity disorder, and autism spectrum disorders," J. Child Psychol. Psychiatry, vol. 61, no. 1, pp. 51–61, 2020.
[14] X. Zhou, Q. Lin, Y. Gui, Z. Wang, M. Liu, and H. Lu, "Multimodal MR images-based diagnosis of early adolescent attention-deficit/hyperactivity disorder using multiple kernel learning," Front. Neurosci., vol. 15, pp. 710133–710147, 2021.
[15] A. Yasumura, M. Omori, A. Fukuda, J. Takahashi, Y. Yasumura, E. Nakagawa, T. Koike, Y. Yamashita, T. Miyajima, T. Koeda et al., "Applied machine learning method to predict children with ADHD using prefrontal cortex activity: a multicenter study in Japan," J. Atten. Disord., vol. 24, no. 14, pp. 2012–2020, 2020.
[16] M. Maniruzzaman, M. A. M. Hasan, N. Asai, and J. Shin, "Optimal channels and features selection based ADHD detection from EEG signal using statistical and machine learning techniques," IEEE Access, vol. 11, pp. 33570–33583, 2023.
[17] M. Maniruzzaman, J. Shin, M. A. M. Hasan, and A. Yasumura, "Efficient feature selection and machine learning-based ADHD detection using EEG signal," Comput. Mater. Contin., vol. 72, no. 3, pp. 5179–5195, 2022.
[18] A. Müller, S. Vetsch, I. Pershin, G. Candrian, G.-M. Baschera, J. D. Kropotov, J. Kasper, H. A. Rehim, and D. Eich, "EEG/ERP-based biomarker/neuroalgorithms in adults with ADHD: Development, reliability, and application in clinical practice," World J. Biol. Psychiatry, vol. 21, no. 3, pp. 172–182, 2020.
[19] M. Maniruzzaman, J. Shin, and M. A. M. Hasan, "Predicting children with ADHD using behavioral activity: A machine learning analysis," Appl. Sci., vol. 12, no. 5, pp. 2737–2749, 2022.
[20] S. Itani, M. Rossignol, F. Lecron, and P. Fortemps, "Towards interpretable machine learning models for diagnosis aid: a case study on attention deficit/hyperactivity disorder," PLoS ONE, vol. 14, no. 4, p. e0215720, 2019.
[21] M. Adamou, S. L. Jones, L. Marks, and D. Lowe, "Efficacy of continuous performance testing in adult ADHD in a clinical sample using QbTest+," J. Atten. Disord., vol. 26, no. 11, pp. 1483–1491, 2022.
[22] M. O. Ogundele, H. F. Ayyash, and S. Banerjee, "Role of computerised continuous performance task tests in ADHD," Prog. Neurol. Psychiatry, vol. 15, no. 3, pp. 8–13, 2011.
[23] M. B. Racine, A. Majnemer, M. Shevell, and L. Snider, "Handwriting performance in children with attention deficit hyperactivity disorder (ADHD)," J. Child Neurol., vol. 23, no. 4, pp. 399–406, 2008.
[24] R. A. Langmaid, N. Papadopoulos, B. P. Johnson, J. G. Phillips, and N. J. Rinehart, "Handwriting in children with ADHD," J. Atten. Disord., vol. 18, no. 6, pp. 504–510, 2014.
[25] R. Cohen, B. Cohen-Kroitoru, A. Halevy, S. Aharoni, I. Aizenberg, and A. Shuper, "Handwriting in children with attention deficient hyperactive disorder: role of graphology," BMC Pediatr., vol. 19, no. 1, pp. 1–6, 2019.
[26] J. Shin, M. Maniruzzaman, Y. Uchida, M. A. M. Hasan, A. Megumi, A. Suzuki, and A. Yasumura, "Important features selection and classification of adult and child from handwriting using machine learning methods," Appl. Sci., vol. 12, no. 10, pp. 5256–5270, 2022.
[27] J. Shin, M. A. M. Hasan, M. Maniruzzaman, A. Megumi, A. Suzuki, and A. Yasumura, "Online handwriting based adult and child classification using machine learning techniques," in 2022 IEEE 5th Eurasian Conference on Educational Innovation (ECEI). IEEE, 2022, pp. 201–204.
[28] A. Megumi, A. Suzuki, J. Shin, and A. Yasumura, "Developmental changes in writing dynamics and its relationship with ADHD and ASD tendencies: A preliminary study," https://doi.org/10.21203/rs.3.rs-1616383/v1, 2022.
[29] V. Johansson, S. Sandin, Z. Chang, M. J. Taylor, P. Lichtenstein, B. M. D'Onofrio, H. Larsson, C. Hellner, and L. Halldner, "Medications for attention-deficit/hyperactivity disorder in individuals with or without coexisting autism spectrum disorder: analysis of data from the Swedish prescribed drug register," J. Neurodev. Disord., vol. 12, no. 1, pp. 1–12, 2020.
[30] A. Wakabayashi, Y. Tojo, S. Baron-Cohen, and S. Wheelwright, "The autism-spectrum quotient (AQ) Japanese version: evidence from high-functioning clinical group and normal adults," Shinrigaku Kenkyu, vol. 75, no. 1, pp. 78–84, 2004.


[31] A. Wakabayashi, S. Baron-Cohen, T. Uchiyama, Y. Yoshida, Y. Tojo, M. Kuroda, and S. Wheelwright, "The autism-spectrum quotient (AQ) children's version in Japan: a cross-cultural comparison," J. Autism Dev. Disord., vol. 37, pp. 491–500, 2007.
[32] G. J. DuPaul, T. J. Power, A. D. Anastopoulos, and R. Reid, ADHD Rating Scale—IV: Checklists, Norms, and Clinical Interpretation. Guilford Press, 1998.
[33] "Linear Regression," https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html.
[34] S. Okser, T. Pahikkala, A. Airola, T. Salakoski, S. Ripatti, and T. Aittokallio, "Regularized machine learning in the genetic prediction of complex traits," PLoS Genet., vol. 10, no. 11, p. e1004754, 2014.
[35] P. Pudil, J. Novovičová, and J. Kittler, "Floating search methods in feature selection," Pattern Recognit. Lett., vol. 15, no. 11, pp. 1119–1125, 1994.
[36] M. A. M. Hasan, M. Nasser, B. Pal, and S. Ahmad, "Support vector machine and random forest modeling for intrusion detection system (IDS)," J. Intell. Learn. Syst. Appl., vol. 6, no. 1, pp. 45–52, 2014.
[37] S. U. Jan, Y.-D. Lee, J. Shin, and I. Koo, "Sensor fault classification based on support vector machine and statistical time-domain features," IEEE Access, vol. 5, pp. 8682–8690, 2017.
[38] O. Z. Maimon and L. Rokach, Data Mining with Decision Trees: Theory and Applications. World Scientific, 2014, vol. 81.
[39] T. Cover and P. Hart, "Nearest neighbor pattern classification," IEEE Trans. Inf. Theory, vol. 13, no. 1, pp. 21–27, 1967.
[40] P. Geurts, D. Ernst, and L. Wehenkel, "Extremely randomized trees," Mach. Learn., vol. 63, pp. 3–42, 2006.
[41] H. Chen, W. Chen, Y. Song, L. Sun, and X. Li, "EEG characteristics of children with attention-deficit/hyperactivity disorder," Neuroscience, vol. 406, pp. 444–456, 2019.
[42] G. Güney, E. Kisacik, C. Kalaycioğlu, and G. Saygili, "Exploring the attention process differentiation of attention deficit hyperactivity disorder (ADHD) symptomatic adults using artificial intelligence on electroencephalography (EEG) signals," Turkish J. Elect. Eng. Comput. Sci., vol. 29, no. 5, pp. 2312–2325, 2021.
[43] Y. Chen, Y. Tang, C. Wang, X. Liu, L. Zhao, and Z. Wang, "ADHD classification by dual subspace learning using resting-state functional connectivity," Artif. Intell. Med., vol. 103, p. 101786, 2020.

JUNGPIL SHIN (Senior Member, IEEE) received a B.Sc. in Computer Science and Statistics and an M.Sc. in Computer Science from Pusan National University, Korea, in 1990 and 1994, respectively. He received his Ph.D. in computer science and communication engineering from Kyushu University, Japan, in 1999, under a scholarship from the Japanese government (MEXT). He was an Associate Professor, a Senior Associate Professor, and a Full Professor at the School of Computer Science and Engineering, The University of Aizu, Japan, in 1999, 2004, and 2019, respectively. He has co-authored more than 320 published papers for widely cited journals and conferences. His research interests include pattern recognition, image processing, computer vision, machine learning, human-computer interaction, non-touch interfaces, human gesture recognition, automatic control, Parkinson's disease diagnosis, ADHD diagnosis, user authentication, machine intelligence, as well as handwriting analysis, recognition, and synthesis. He is a member of ACM, IEICE, IPSJ, KISS, and KIPS. He served as program chair and as a program committee member for numerous international conferences. He serves as an Editor of IEEE journals, MDPI Sensors and Electronics, and Tech Science. He serves as a reviewer for several major IEEE and SCI journals.

MD. MANIRUZZAMAN received the B.Sc., M.Sc., and M.Phil. degrees in statistics from the Department of Statistics, University of Rajshahi, Rajshahi-6205, Bangladesh, in 2013, 2014, and 2021, respectively. He became a Lecturer and an Assistant Professor with the Statistics Discipline, Khulna University, Khulna-9205, Bangladesh, on September 4, 2018, and November 30, 2020, respectively. Currently, he is working as a Ph.D. fellow at the School of Computer Science and Engineering, Pattern Processing Laboratory, The University of Aizu, Japan, under the direct supervision of Prof. Dr. Jungpil Shin. His research interests include bioinformatics, artificial intelligence, pattern recognition, medical image, signal processing, machine learning, data mining, and big data analysis. He has co-authored more than 30 publications published in widely cited journals and conferences.

YUTA UCHIDA received the bachelor's degree in computer science and engineering from The University of Aizu (UoA), Japan, in March 2022. He is currently pursuing the master's degree. He joined the Pattern Processing Laboratory, UoA, in April 2020, under the direct supervision of Prof. Dr. Jungpil Shin. His research interests include computer vision, pattern recognition, and deep learning. He is currently working on the analysis and recognition of ADHD.

MD. AL MEHEDI HASAN received the B.Sc., M.Sc., and Ph.D. degrees in computer science and engineering from the Department of Computer Science and Engineering, University of Rajshahi, Rajshahi-6205, Bangladesh, in 2005, 2007, and 2017, respectively. He became a Lecturer, an Assistant Professor, an Associate Professor, and a Professor at the Department of Computer Science and Engineering, Rajshahi University of Engineering and Technology (RUET), Rajshahi-6204, Bangladesh, in 2007, 2010, 2018, and 2019, respectively. Recently, he completed his postdoctoral research at the School of Computer Science and Engineering, The University of Aizu, Aizuwakamatsu 965-8580, Japan. His research interests include bioinformatics, artificial intelligence, pattern recognition, medical image, signal processing, machine learning, computer vision, data mining, big data analysis, probabilistic and statistical inference, operating systems, computer networks, and security. He has co-authored more than 100 publications published in widely cited journals and conferences.

AKIKO MEGUMI received a B.Sc. in Japanese Literature from the Doshisha Women's College of Liberal Arts, Japan, in 2004. She received an M.Sc. and Ph.D. in Social and Cultural Sciences from Kumamoto University, Japan, in 2019 and 2023, respectively. She has co-authored 9 published papers for widely cited journals and conferences. She has also received several awards. Her research interests include brain function and developmental disorders. She has clinical experience as a Licensed Clinical Psychologist and Speech-Language-Hearing Therapist.


AKIRA YASUMURA received a B.Sc. in Technical Education and an M.Sc. in Technical Education from the University of Teacher Education Fukuoka, Japan, in 2001 and 2003, respectively. He received his Ph.D. in Arts and Sciences from the University of Tokyo, Japan, in 2013, under a scholarship from the Japanese government (MEXT). He has been an Associate Professor of Psychology at Kumamoto University, Japan, since 2018. He has co-authored more than 40 published papers for widely cited journals and conferences. He has also received numerous awards and holds patents. His research interests include brain function, developmental disorders, and machine learning. He is a Licensed Clinical Psychologist.
