Handwriting-Based ADHD Detection For Children Having ASD Using Machine Learning Approaches
This article has been accepted for publication in IEEE Access. This is the author's version, which has not been fully edited; content may change prior to final publication. Citation information: DOI 10.1109/ACCESS.2023.3302903
Date of publication xxxx 00, 0000, date of current version xxxx 00, 0000.
Digital Object Identifier 10.1109/ACCESS.2022.0092316
ABSTRACT
Attention deficit hyperactivity disorder (ADHD) is a behavioral disorder in children that affects the brain's ability to control attention, impulsivity, and hyperactivity, and its prevalence has increased over time. A cure for ADHD is still unknown, and only early detection can improve the quality of life of children with ADHD. At the same time, children with ADHD often suffer from various comorbidities such as autism spectrum disorder (ASD) and major depressive disorder (MDD). Various researchers have developed computational tools to detect ADHD in children based on handwritten text. However, handwriting text-based systems depend on a specific language, which causes problems for non-native speakers of that language. Moreover, very few studies have considered comorbidities such as ASD or MDD when detecting ADHD in children. In this study, handwriting patterns (drawings) are used to detect ADHD in children with coexisting ASD using machine learning (ML)-based approaches. We collected handwriting samples from 29 Japanese children (14 children with ADHD and coexisting ASD and 15 healthy children) using a pen tablet. We asked each child to draw two patterns, zigzag lines and periodic lines (PL), on the pen tablet, and each pattern was repeated three times. We extracted 30 statistical features from the raw dataset, analyzed them using sequential forward floating search (SFFS), and selected the best subsets of features. Finally, the selected features were fed into seven ML-based algorithms to detect ADHD with coexisting ASD. These classifiers were trained with leave-one-out cross-validation, and their performance was evaluated using accuracy, recall, precision, F1-score, and area under the curve (AUC). The experimental results show that the highest performance scores (accuracy: 93.10%; recall: 90.48%; precision: 95.00%; F1-score: 92.68%; and AUC: 0.930) were achieved by the random forest (RF)-based classifier on the PL predict task. This study provides evidence of the possibility of classifying children with ADHD and coexisting ASD versus healthy children based on their handwriting patterns.
INDEX TERMS ADHD, ASD, Detection, Handwriting Patterns, and Machine Learning.
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see https://creativecommons.org/licenses/by-nc-nd/4
F. FEATURE NORMALIZATION
Feature normalization is also known as feature scaling or z-score normalization in the fields of statistics and machine learning. In this work, we used z-score normalization to apply a standardization transformation for feature normalization, which is computed using the following formula:

    z = (X − µ) / σ    (1)

where X is the input feature, µ is the mean of the feature, and σ is its standard deviation (SD). After this transformation, each feature has zero mean and unit variance.

G. FEATURE SELECTION TECHNIQUE
Feature selection (FS) is a process that reduces the dimension of the training set by selecting only the biomarkers that are associated with or relevant to the class or study variable (here, ADHD with coexisting ASD). This study excludes the biomarkers or features that: (i) have low discriminative power and (ii) are redundant or irrelevant to each other [34], because the selection of effective and efficient biomarkers or features may improve learning algorithm efficiency and predictive accuracy, and reduce computational time and cost. Moreover, the biomarkers or features that are fed into predictive or learning algorithms are hypothetically assumed to be associated with the underlying class labels or diseases (here, ADHD+ASD). This study used the SFFS-based FS algorithm to determine potential biomarkers. The details of the SFFS-based FS algorithm are explained in the following subsection:

1) SFFS-based Algorithm
Sequential feature selection (SFS)-based methods are a set of greedy algorithms that are utilized to reduce the feature dimension space [35]. In this study, SFFS was used to determine a proper subset or combination of biomarkers or features for ADHD with coexisting ASD. The pseudo-code of the SFFS-based algorithm is summarized in Fig. 4.

FIGURE 4. Pseudo code of SFFS-based algorithm.

H. CLASSIFICATION MODEL
Seven ML-based algorithms, namely SVM, RF, DT, GNB, k-NN, LR, and ET, were employed to distinguish children with ADHD and coexisting ASD from healthy children. They are briefly discussed in the following subsections:

1) Support Vector Machine
SVM is a powerful supervised learning method that can be used to solve classification and regression problems [36], [37]. The main purpose of SVM is to find boundaries (hyperplanes) that separate the class labels (yes/no) by solving the following optimization problem:

    max_α [ Σ_{i=1}^{n} α_i − (1/2) Σ_{i=1}^{n} Σ_{j=1}^{n} α_i α_j y_i y_j K(z_i, z_j) ]    (2)

subject to

    Σ_{i=1}^{n} y_i α_i = 0,  0 ≤ α_i ≤ C,  ∀ i = 1, 2, ..., n    (3)

The final discriminant function is written as:

    f(z) = Σ_{i=1}^{n} α_i y_i K(z_i, z) + b    (4)

where b is the bias term and K(z_i, z_j) is the Gram or kernel matrix, which needs to be chosen when performing SVM. In this work, we used a radial basis function (RBF) kernel, which is mathematically defined as:

    K(z_i, z_j) = exp(−γ ‖z_i − z_j‖²)    (5)

In this work, the values of the cost (C) and gamma (γ) were optimized using a grid search method, and we chose the optimum values of C and γ at which the SVM provides the highest classification accuracy. In this study, we used the following steps to perform SVM for predicting children with ADHD and coexisting ASD:

Step 1: Split the dataset into a training set and a test set, where 1 subject was taken as the test set and the remaining (n-1) subjects were used as the training set.
Step 2: Select a suitable kernel on the basis of the characteristics of the training set. Here, we chose the radial basis function.
Step 3: Select the hyperparameters, the regularization parameter C and the gamma (γ) of the kernel function, using the grid search method.
Step 4: After optimizing the hyperparameters (C and γ), train the SVM with the RBF kernel on the training set.
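The steps above can be sketched with scikit-learn. This is a minimal illustration under stated assumptions, not the authors' code: the data below are synthetic stand-ins for the 29-subject, 30-feature handwriting dataset, and the inner 3-fold grid search is an assumption (the paper does not state how the grid was validated within each leave-one-out split).

```python
# Sketch of Steps 1-7: LOOCV evaluation of an RBF-kernel SVM whose C and
# gamma are chosen by grid search. Data are synthetic stand-ins.
import numpy as np
from sklearn.model_selection import GridSearchCV, LeaveOneOut, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(29, 30))     # 29 subjects, 30 statistical features
y = rng.integers(0, 2, size=29)   # 1 = ADHD with coexisting ASD, 0 = healthy

param_grid = {
    "svc__C": [0.0001, 0.001, 0.01, 0.1, 1, 10, 100, 1000],
    "svc__gamma": [0.00001, 0.0001, 0.001, 0.01, 0.1, 1],
}
# z-score normalization + SVM in one pipeline, so scaling is fit on the
# training fold only and never sees the held-out subject
model = GridSearchCV(make_pipeline(StandardScaler(), SVC(kernel="rbf")),
                     param_grid, cv=3)

# Steps 1 and 6: LOOCV, i.e., each subject is the test set exactly once
y_pred = cross_val_predict(model, X, y, cv=LeaveOneOut())
accuracy = (y_pred == y).mean()   # Step 7: one of the performance metrics
```

On the real data, recall, precision, F1-score, and AUC would be computed from `y_pred` in the same way.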
4 VOLUME 10, 2022
TABLE 2. List of 30 extracted statistical features from the raw dataset and their computational formula
Step 5: Use the trained SVM with the RBF kernel to predict the class label (ADHD with coexisting ASD vs. HC) of the test set.
Step 6: Repeat Step 1 to Step 5 n times.
Step 7: Compute performance metrics such as accuracy, recall, precision, F1-score, and AUC.

2) Random Forest
Random forest (RF) is an ensemble learning method that integrates multiple weak learners based on decision trees to improve generalization ability. It is used for both regression and classification problems. RF generates multiple decision or classification trees during the training phase. Each tree is created using bootstrap sampling from the original data and the classification tree method. After forming a forest, a new object is passed down each tree for classification, and the forest selects the class that receives the maximum votes for that object. RF is performed as follows:

Step 1: Draw ntree bootstrap samples from the n training samples.
Step 2: Construct a classification tree for each bootstrap sample by taking mtry of the predictors and choosing the best split from among these variables.
Step 3: Predict the new class by combining the predictions of the ntree trees.

The error rate or classification accuracy can be assessed in two ways: on the training set and on the test set. In this paper, we trained the RF-based model on the training set and evaluated its performance on the test set. The hyperparameters were optimized using a grid search method, and the values of ntree and mtry were selected at the points that provided the highest classification accuracy or, equivalently, the lowest error rate.

3) Decision Tree
Decision tree (DT) is a tree-structure-based technique that is widely used in data mining for solving regression and classification tasks [38]. The objective of DT is to build a model that can predict the study variable by learning simple decision rules from the input features. In DT, there are three types of nodes: the internal node, the decision node,
and the leaf node. Here, the internal nodes are the set of input features, the decision nodes are utilized to make decisions learned during training, and the leaf nodes are the outputs of these decisions. Nowadays, DT is widely used in different fields such as healthcare, medical imaging, and so on.

4) Gaussian Naïve Bayes
Gaussian Naïve Bayes (GNB) is a classification method that is also widely used in ML. GNB assumes that the distribution of each input feature follows a normal (Gaussian) distribution. Assume that we have a set of input features x_i (i = 1, 2, ..., k) and let y_k be the outcome or class label (k = 0, 1). Here, "0" stands for healthy children and "1" stands for children with ADHD and coexisting ASD. First, the input training set is segmented by class label, and the mean and standard deviation (SD) of each input feature are estimated for each class using the following formulas:

    Mean: µ_k = (1/n) Σ_{i=1}^{n} x_i    (6)

    SD: σ_k = √( (1/n) Σ_{i=1}^{n} (x_i − µ_k)² )    (7)

Suppose that we have taken some observed value v. The probability density function (pdf) of v given y_k is computed as follows:

    p(x = v | y_k) = (1 / (σ_k √(2π))) exp( −(1/2) ((v − µ_k) / σ_k)² ),  σ_k > 0 and −∞ < v, µ_k < ∞    (8)

For a given test data point, we compute the likelihood or posterior probability based on the estimated values of µ_k and σ_k for each class (1/0). The predicted class label is the class with the highest posterior probability.

5) k-Nearest Neighbors
k-nearest neighbors (k-NN) is a popular ML-based technique, developed by Fix and Hodges in 1951 and later expanded by Cover and Hart [39], that can be utilized to solve regression and classification problems. It is a distance-based learning algorithm that measures feature-vector similarity. This paper used the Euclidean distance to compute the distance to all training data points for a chosen value of k. The majority class among the k nearest neighbors is treated as the predicted class. In this paper, we tuned the value of k using a grid search technique to achieve better performance.

6) Logistic Regression
Logistic regression (LR) is a statistical method that is used for binary classification tasks. It is usually used to estimate or predict the probability of a binary class label based on the input feature vectors, where the input features can be continuous or categorical and the class label is binary, either "1" or "0". Here, "1" stands for children with ADHD and coexisting ASD, and "0" stands for healthy children. During the training phase, LR uses the logit or sigmoid function, which is defined as:

    σ(w) = 1 / (1 + exp(−w))    (9)

where w = Σ_{i=0}^{k} α_i X_i with X_0 = 1. Here, X_i is the ith input feature, and α_i is the ith unknown coefficient, which needs to be estimated. In this paper, we estimated these coefficients by maximum likelihood estimation (MLE) during the training phase, and these coefficients were used to predict the outcome or class label, i.e., children with ADHD and coexisting ASD versus healthy children.

7) Extra Trees
Extra trees (ET) is an ensemble of DTs that performs classification or regression based on a tree-based algorithm. Like the RF-based algorithm, the ET-based algorithm constructs multiple DTs based on random subsets of the training set and features; unlike RF, it randomly chooses the split thresholds for each feature. The ET-based algorithm maintains its optimization ability while adding an additional layer of randomization [40].

III. EXPERIMENTAL SETUP AND PERFORMANCE METRICS
A. EXPERIMENTAL SETUP
In this paper, we used Python version 3.10 to perform all experiments. As the operating system, Windows 10 version 21H1 (build 19043.1151, 64-bit) was configured, and an Intel(R) Core(TM) i5-10400 with 16 GB RAM was used in terms of hardware. In this work, we used the leave-one-out cross-validation (LOOCV) protocol, where the dataset was divided into two sets, a training set and a test set: one subject was used as the test set and the remaining (n-1) subjects were used as the training set. During training of the predictive models, we set the initial interval or range for each hyperparameter of each predictive model, as illustrated in Table 3. For example, SVM used the "RBF" kernel with cost (C) of {0.0001, 0.001, 0.01, 0.1, 1, 10, 100, 1000} and gamma (γ) of {0.00001, 0.0001, 0.001, 0.01, 0.1, 1}. RF: max_depth of {5, None}; n_estimators of {50, 100, 200, 300}; min_samples_split of {2, 3}; min_samples_leaf of {1, 3}; bootstrap of {True, False}; and criterion of {"gini", "entropy"}. DT: max_depth of {1, 2, 3, 4, 5}; min_samples_leaf of {1 to 10}; min_samples_split of {2, 3, 4, 5}. k-NN: the value of n_neighbors from 1 to 20; weights of {'uniform', 'distance'}; p of {1, 2}. LR: cost (C) of {10**i for i in range(-4, 4)}; penalty of {"l1", "l2"}; and solver of 'liblinear'. ET: max_depth of {3, 4, 5}; min_samples_leaf of {1, 4, 7}; min_samples_split of 2. In the training phase, we used the initial parameters of each classifier and tuned these parameters using the grid search method. After optimizing the parameters of each classifier, we trained all classifiers again and used them to predict the class labels of the test set.
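Assuming scikit-learn implementations, the search ranges listed above can be encoded as estimator/grid pairs ready for a grid search. Parameter names follow scikit-learn's conventions; GNB is omitted because no grid is stated for it, and this is a sketch of the stated ranges rather than the authors' exact configuration.

```python
# Hypothetical encoding of the Table 3 hyperparameter ranges as
# scikit-learn (estimator, param_grid) pairs.
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

grids = {
    "SVM": (SVC(kernel="rbf"), {
        "C": [0.0001, 0.001, 0.01, 0.1, 1, 10, 100, 1000],
        "gamma": [0.00001, 0.0001, 0.001, 0.01, 0.1, 1]}),
    "RF": (RandomForestClassifier(), {
        "max_depth": [5, None], "n_estimators": [50, 100, 200, 300],
        "min_samples_split": [2, 3], "min_samples_leaf": [1, 3],
        "bootstrap": [True, False], "criterion": ["gini", "entropy"]}),
    "DT": (DecisionTreeClassifier(), {
        "max_depth": [1, 2, 3, 4, 5],
        "min_samples_leaf": list(range(1, 11)),
        "min_samples_split": [2, 3, 4, 5]}),
    "k-NN": (KNeighborsClassifier(), {
        "n_neighbors": list(range(1, 21)),
        "weights": ["uniform", "distance"], "p": [1, 2]}),
    "LR": (LogisticRegression(solver="liblinear"), {
        "C": [10**i for i in range(-4, 4)], "penalty": ["l1", "l2"]}),
    "ET": (ExtraTreesClassifier(), {
        "max_depth": [3, 4, 5], "min_samples_leaf": [1, 4, 7],
        "min_samples_split": [2]}),
}
```

Each pair can then be passed to `GridSearchCV(estimator, param_grid)` inside the LOOCV loop described above.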
                               Predicted Class
Actual Class                   ADHD with coexisting ASD   Healthy children   Total
ADHD with coexisting ASD       TP                         FN                 R1 = TP+FN
Healthy children               FP                         TN                 R2 = FP+TN
Total                          C1 = TP+FP                 C2 = FN+TN         N = R1+R2 = C1+C2
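The reported performance metrics follow directly from the four cells of this confusion matrix. A minimal sketch with illustrative counts (not the study's actual confusion matrix):

```python
# Deriving accuracy, recall, precision, and F1-score from the
# confusion-matrix cells (TP, FN, FP, TN) defined in the table above.
def metrics(tp, fn, fp, tn):
    n = tp + fn + fp + tn
    accuracy = (tp + tn) / n
    recall = tp / (tp + fn)        # sensitivity for the ADHD+ASD class
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, recall, precision, f1

# Illustrative counts only; they are not the counts behind Table 7.
acc, rec, prec, f1 = metrics(tp=9, fn=1, fp=2, tn=8)
```

AUC is the one reported metric that cannot be recovered from these counts alone, since it requires the classifiers' ranking scores rather than hard labels.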
TABLE 5. Baseline characteristics of ADHD with coexisting ASD and healthy children

Variables          Overall       ADHD with coexisting ASD   Healthy children   Statistics       p-value¹
Total, n (%)       29            14 (48.28)                 15 (51.72)
Age, Mean ± SD     9.27 ± 1.90   8.57 ± 2.24                9.92 ± 1.23        t=2.02, df=27    0.053
Gender, Male (%)   19 (70.4)     11 (57.89)                 8 (42.11)          χ²=2.04, df=1    0.121

df: degrees of freedom; ¹p-value is obtained from an independent t-test for the age variable and a chi-square test for the gender variable.
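As a sanity check, the age comparison in Table 5 can be approximately reproduced from the summary statistics alone. This sketch assumes a pooled-variance (Student) independent t-test, which the reported df of 27 suggests; small deviations from the reported t = 2.02 are expected because the table's means and SDs are rounded.

```python
# Two-sample pooled-variance t statistic recomputed from the rounded
# group summaries in Table 5 (n, mean, SD per group).
import math

n1, m1, s1 = 14, 8.57, 2.24   # ADHD with coexisting ASD
n2, m2, s2 = 15, 9.92, 1.23   # healthy children

df = n1 + n2 - 2                                      # 27 degrees of freedom
sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df      # pooled variance
t = (m2 - m1) / math.sqrt(sp2 * (1 / n1 + 1 / n2))    # ~2.03 vs. reported 2.02
```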
TABLE 6. Classification accuracy (in %) of seven classifiers for all features over four tasks

CT     Zigzag Trace   Zigzag Predict   PL Trace   PL Predict
SVM    68.97          70.11            79.31      82.76
RF     73.56          75.86            86.21      88.51
DT     68.97          71.26            75.86      82.76
GNB    60.92          58.62            56.32      72.41
KNN    68.97          68.97            71.26      77.01
LR     70.11          62.07            75.86      80.46
ET     70.11          62.07            73.56      79.31

CT: Classifier Types; PL: Periodic Line

TABLE 7. Classification accuracy (in %) of seven classifiers of individual tasks for optimal features

       Zigzag Trace    Zigzag Predict   PL Trace        PL Predict
CT     NSF    ACC      NSF    ACC       NSF    ACC      NSF    ACC
SVM    10     77.01    6      80.46     16     87.36    5      86.21
RF     5      87.36    5      85.06     7      88.51    9      93.10
DT     2      82.76    16     82.76     14     86.21    3      91.95
GNB    8      78.16    2      66.67     10     79.31    16     82.76
KNN    6      82.76    8      82.76     3      81.61    5      90.80
LR     13     75.86    5      62.07     17     81.61    5      88.51
ET     2      77.01    5      77.01     5      80.46    10     88.51

NSF: Number of Selected Features; CT: Classifier Types; PL: Periodic Line
V. DISCUSSION
ADHD in children is one of the most common psychiatric and behavioral disorders. Children with ADHD also suffer from various comorbidities, and its prevalence has increased globally. Moreover, the cure for ADHD with other comorbidities is still unknown, and only early detection can improve the quality of life of these children.

VII. CONCLUSIONS
This paper proposed an ML-based system for detecting ADHD with coexisting ASD from handwriting patterns. In order to build this system, we performed the following experiments. First, we extracted 30 statistical features from the raw data and normalized them by z-score. Second, we employed SFFS to determine relevant features and implemented a grid search technique to select the optimal parameters of the classification algorithms. This study used seven classification algorithms to discriminate children with ADHD and coexisting ASD from healthy children for each task. The experimental results illustrated that the RF-based algorithm achieved 93.10% accuracy for the PL predict task, which was comparatively higher than the other classifiers and the other tasks. This study suggests that the PL predict pattern with an RF-based classifier has high discriminative power to detect children with ADHD and coexisting ASD. This study will be helpful for medical practitioners/physicians to detect children with ADHD having ASD at an early stage.

INSTITUTIONAL REVIEW BOARD STATEMENT
All procedures followed were in accordance with the ethical standards of the responsible committee on human experimentation (institutional and national) and with the Helsinki Declaration of 1964 and later versions. Ethical approval for this dataset was granted by the Institutional Review Board (IRB) of Kumamoto University, Japan (Approval Number: 45, Approval Date: 25 May 2021).

ACKNOWLEDGMENTS
This work was supported by JSPS KAKENHI Grant Numbers JP21H00891 and JP20K11892.

REFERENCES
[1] American Psychiatric Association, Diagnostic and statistical manual of mental disorders: DSM-5. Washington, DC: American Psychiatric Association, 2013, vol. 5, no. 5.
[2] I. Lazzaro, E. Gordon, W. Li, C. Lim, M. Plahn, S. Whitmont, S. Clarke, R. Barry, A. Dosen, and R. Meares, "Simultaneous EEG and EDA measures in adolescent attention deficit hyperactivity disorder," Int. J. Psychophysiol., vol. 34, no. 2, pp. 123–134, 1999.
[3] M. Altınkaynak, N. Dolu, A. Güven, F. Pektaş, S. Özmen, E. Demirci, and M. İzzetoğlu, "Diagnosis of attention deficit hyperactivity disorder with combined time and frequency features," Biocybern. Biomed. Eng., vol. 40, no. 3, pp. 927–937, 2020.
[4] V. A. Harpin, "The effect of ADHD on the life of an individual, their family, and community from preschool to adult life," Arch. Dis. Child., vol. 90, no. suppl 1, pp. i2–i7, 2005.
[5] R. Barkley, K. Murphy, and M. Fischer, "ADHD in adults: What the science says," New York, NY, US, 2008.
[6] A. Stickley, A. Koyanagi, V. Ruchkin, and Y. Kamio, "Attention-deficit/hyperactivity disorder symptoms and suicide ideation and attempts: Findings from the adult psychiatric morbidity survey 2007," J. Affect. Disord., vol. 189, pp. 321–328, 2016.
[7] M. Impey and R. Heun, "Completed suicide, ideation and attempt in attention deficit hyperactivity disorder," Acta Psychiatr. Scand., vol. 125, no. 2, pp. 93–102, 2012.
[8] M. L. Danielson, R. H. Bitsko, R. M. Ghandour, J. R. Holbrook, M. D. Kogan, and S. J. Blumberg, "Prevalence of parent-reported ADHD diagnosis and associated treatment among US children and adolescents, 2016," J. Clin. Child Adolesc. Psychol., vol. 47, no. 2, pp. 199–212, 2018.
[9] Centers for Disease Control and Prevention (CDC), "Mental health in the United States. Prevalence of diagnosis and medication treatment for attention-deficit/hyperactivity disorder–United States, 2003," MMWR Morb. Mortal. Wkly. Rep., vol. 54, no. 34, pp. 842–847, 2005.
[10] T. Torgersen, B. Gjervan, and K. Rasmussen, "ADHD in adults: a study of clinical characteristics, impairment and comorbidity," Nord. J. Psychiatry, vol. 60, no. 1, pp. 38–43, 2006.
[11] E. Sobanski, D. Brüggemann, B. Alm, S. Kern, M. Deschner, T. Schubert, A. Philipsen, and M. Rietschel, "Psychiatric comorbidity and functional impairment in a clinically referred sample of adults with attention-deficit/hyperactivity disorder (ADHD)," Eur. Arch. Psychiatry Clin. Neurosci., vol. 257, pp. 371–377, 2007.
[12] L. Reale, B. Bartoli, M. Cartabia, M. Zanetti, M. A. Costantino, M. P. Canevini, C. Termine, and M. Bonati, "Comorbidity prevalence and treatment outcome in children and adolescents with ADHD," Eur. Child Adolesc. Psychiatry, vol. 26, pp. 1443–1457, 2017.
[13] C. J. Vaidya, X. You, S. Mostofsky, F. Pereira, M. M. Berl, and L. Kenworthy, "Data-driven identification of subtypes of executive function across typical development, attention deficit hyperactivity disorder, and autism spectrum disorders," J. Child Psychol. Psychiatry, vol. 61, no. 1, pp. 51–61, 2020.
[14] X. Zhou, Q. Lin, Y. Gui, Z. Wang, M. Liu, and H. Lu, "Multimodal MR images-based diagnosis of early adolescent attention-deficit/hyperactivity disorder using multiple kernel learning," Front. Neurosci., vol. 15, pp. 710133–710147, 2021.
[15] A. Yasumura, M. Omori, A. Fukuda, J. Takahashi, Y. Yasumura, E. Nakagawa, T. Koike, Y. Yamashita, T. Miyajima, T. Koeda et al., "Applied machine learning method to predict children with ADHD using prefrontal cortex activity: a multicenter study in Japan," J. Atten. Disord., vol. 24, no. 14, pp. 2012–2020, 2020.
[16] M. Maniruzzaman, M. A. M. Hasan, N. Asai, and J. Shin, "Optimal channels and features selection based ADHD detection from EEG signal using statistical and machine learning techniques," IEEE Access, vol. 11, pp. 33570–33583, 2023.
[17] M. Maniruzzaman, J. Shin, M. A. M. Hasan, and A. Yasumura, "Efficient feature selection and machine learning-based ADHD detection using EEG signal," Comput. Mater. Contin., vol. 72, no. 3, pp. 5179–5195, 2022.
[18] A. Müller, S. Vetsch, I. Pershin, G. Candrian, G.-M. Baschera, J. D. Kropotov, J. Kasper, H. A. Rehim, and D. Eich, "EEG/ERP-based biomarker/neuroalgorithms in adults with ADHD: Development, reliability, and application in clinical practice," World J. Biol. Psychiatry, vol. 21, no. 3, pp. 172–182, 2020.
[19] M. Maniruzzaman, J. Shin, and M. A. M. Hasan, "Predicting children with ADHD using behavioral activity: A machine learning analysis," Appl. Sci., vol. 12, no. 5, pp. 2737–2749, 2022.
[20] S. Itani, M. Rossignol, F. Lecron, and P. Fortemps, "Towards interpretable machine learning models for diagnosis aid: a case study on attention deficit/hyperactivity disorder," PLoS One, vol. 14, no. 4, p. e0215720, 2019.
[21] M. Adamou, S. L. Jones, L. Marks, and D. Lowe, "Efficacy of continuous performance testing in adult ADHD in a clinical sample using QbTest+," J. Atten. Disord., vol. 26, no. 11, pp. 1483–1491, 2022.
[22] M. O. Ogundele, H. F. Ayyash, and S. Banerjee, "Role of computerised continuous performance task tests in ADHD," Prog. Neurol. Psychiatry, vol. 15, no. 3, pp. 8–13, 2011.
[23] M. B. Racine, A. Majnemer, M. Shevell, and L. Snider, "Handwriting performance in children with attention deficit hyperactivity disorder (ADHD)," J. Child Neurol., vol. 23, no. 4, pp. 399–406, 2008.
[24] R. A. Langmaid, N. Papadopoulos, B. P. Johnson, J. G. Phillips, and N. J. Rinehart, "Handwriting in children with ADHD," J. Atten. Disord., vol. 18, no. 6, pp. 504–510, 2014.
[25] R. Cohen, B. Cohen-Kroitoru, A. Halevy, S. Aharoni, I. Aizenberg, and A. Shuper, "Handwriting in children with attention deficient hyperactive disorder: role of graphology," BMC Pediatr., vol. 19, no. 1, pp. 1–6, 2019.
[26] J. Shin, M. Maniruzzaman, Y. Uchida, M. A. M. Hasan, A. Megumi, A. Suzuki, and A. Yasumura, "Important features selection and classification of adult and child from handwriting using machine learning methods," Appl. Sci., vol. 12, no. 10, pp. 5256–5270, 2022.
[27] J. Shin, M. A. M. Hasan, M. Maniruzzaman, A. Megumi, A. Suzuki, and A. Yasumura, "Online handwriting based adult and child classification using machine learning techniques," in 2022 IEEE 5th Eurasian Conference on Educational Innovation (ECEI). IEEE, 2022, pp. 201–204.
[28] A. Megumi, A. Suzuki, J. Shin, and A. Yasumura, "Developmental changes in writing dynamics and its relationship with ADHD and ASD tendencies: A preliminary study," https://doi.org/10.21203/rs.3.rs-1616383/v1, 2022.
[29] V. Johansson, S. Sandin, Z. Chang, M. J. Taylor, P. Lichtenstein, B. M. D'Onofrio, H. Larsson, C. Hellner, and L. Halldner, "Medications for attention-deficit/hyperactivity disorder in individuals with or without coexisting autism spectrum disorder: analysis of data from the Swedish prescribed drug register," J. Neurodev. Disord., vol. 12, no. 1, pp. 1–12, 2020.
[30] A. Wakabayashi, Y. Tojo, S. Baron-Cohen, and S. Wheelwright, "The autism-spectrum quotient (AQ) Japanese version: evidence from high-functioning clinical group and normal adults," Shinrigaku Kenkyu, vol. 75, no. 1, pp. 78–84, 2004.
[31] A. Wakabayashi, S. Baron-Cohen, T. Uchiyama, Y. Yoshida, Y. Tojo, M. Kuroda, and S. Wheelwright, "The autism-spectrum quotient (AQ) children's version in Japan: a cross-cultural comparison," J. Autism Dev. Disord., vol. 37, pp. 491–500, 2007.
[32] G. J. DuPaul, T. J. Power, A. D. Anastopoulos, and R. Reid, ADHD Rating Scale—IV: Checklists, norms, and clinical interpretation. Guilford Press, 1998.
[33] "Linear Regression," https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html.
[34] S. Okser, T. Pahikkala, A. Airola, T. Salakoski, S. Ripatti, and T. Aittokallio, "Regularized machine learning in the genetic prediction of complex traits," PLoS Genet., vol. 10, no. 11, p. e1004754, 2014.
[35] P. Pudil, J. Novovičová, and J. Kittler, "Floating search methods in feature selection," Pattern Recognit. Lett., vol. 15, no. 11, pp. 1119–1125, 1994.
[36] M. A. M. Hasan, M. Nasser, B. Pal, and S. Ahmad, "Support vector machine and random forest modeling for intrusion detection system (IDS)," J. Intell. Learn. Syst. Appl., vol. 6, no. 1, pp. 45–52, 2014.
[37] S. U. Jan, Y.-D. Lee, J. Shin, and I. Koo, "Sensor fault classification based on support vector machine and statistical time-domain features," IEEE Access, vol. 5, pp. 8682–8690, 2017.
[38] O. Z. Maimon and L. Rokach, Data mining with decision trees: theory and applications. World Scientific, 2014, vol. 81.
[39] T. Cover and P. Hart, "Nearest neighbor pattern classification," IEEE Trans. Inf. Theory, vol. 13, no. 1, pp. 21–27, 1967.
[40] P. Geurts, D. Ernst, and L. Wehenkel, "Extremely randomized trees," Mach. Learn., vol. 63, pp. 3–42, 2006.
[41] H. Chen, W. Chen, Y. Song, L. Sun, and X. Li, "EEG characteristics of children with attention-deficit/hyperactivity disorder," Neuroscience, vol. 406, pp. 444–456, 2019.
[42] G. Güney, E. Kisacik, C. Kalaycioğlu, and G. Saygili, "Exploring the attention process differentiation of attention deficit hyperactivity disorder (ADHD) symptomatic adults using artificial intelligence on electroencephalography (EEG) signals," Turkish J. Elect. Eng. Comput. Sci., vol. 29, no. 5, pp. 2312–2325, 2021.
[43] Y. Chen, Y. Tang, C. Wang, X. Liu, L. Zhao, and Z. Wang, "ADHD classification by dual subspace learning using resting-state functional connectivity," Artif. Intell. Med., vol. 103, p. 101786, 2020.

JUNGPIL SHIN (Senior Member, IEEE) received a B.Sc. in Computer Science and Statistics and an M.Sc. in Computer Science from Pusan National University, Korea, in 1990 and 1994, respectively. He received his Ph.D. in computer science and communication engineering from Kyushu University, Japan, in 1999, under a scholarship from the Japanese government (MEXT). He was an Associate Professor, a Senior Associate Professor, and a Full Professor at the School of Computer Science and Engineering, The University of Aizu, Japan, in 1999, 2004, and 2019, respectively. He has co-authored more than 320 published papers for widely cited journals and conferences. His research interests include pattern recognition, image processing, computer vision, machine learning, human-computer interaction, non-touch interfaces, human gesture recognition, automatic control, Parkinson's disease diagnosis, ADHD diagnosis, user authentication, machine intelligence, as well as handwriting analysis, recognition, and synthesis. He is a member of ACM, IEICE, IPSJ, KISS, and KIPS. He served as program chair and as a program committee member for numerous international conferences. He serves as an Editor of IEEE journals, MDPI Sensors and Electronics, and Tech Science. He serves as a reviewer for several major IEEE and SCI journals.

MD. MANIRUZZAMAN received the B.Sc., M.Sc., and M.Phil. degrees in statistics from the Department of Statistics, University of Rajshahi, Rajshahi-6205, Bangladesh, in 2013, 2014, and 2021, respectively. He became a Lecturer and an Assistant Professor with the Statistics Discipline, Khulna University, Khulna-9205, Bangladesh, on September 4, 2018, and November 30, 2020, respectively. Currently, he is working as a Ph.D. fellow at the School of Computer Science and Engineering, Pattern Processing Laboratory, The University of Aizu, Japan, under the direct supervision of Prof. Dr. Jungpil Shin. His research interests include bioinformatics, artificial intelligence, pattern recognition, medical imaging, signal processing, machine learning, data mining, and big data analysis. He has co-authored more than 30 publications published in widely cited journals and conferences.

YUTA UCHIDA received the bachelor's degree in computer science and engineering from The University of Aizu (UoA), Japan, in March 2022, where he is currently pursuing the master's degree. He joined the Pattern Processing Laboratory, UoA, in April 2020, under the direct supervision of Prof. Dr. Jungpil Shin. His research interests include computer vision, pattern recognition, and deep learning. He is currently working on the analysis and recognition of ADHD.

MD. AL MEHEDI HASAN received the B.Sc., M.Sc., and Ph.D. degrees in computer science and engineering from the Department of Computer Science and Engineering, University of Rajshahi, Rajshahi-6205, Bangladesh, in 2005, 2007, and 2017, respectively. He became a Lecturer, an Assistant Professor, an Associate Professor, and a Professor at the Department of Computer Science and Engineering, Rajshahi University of Engineering and Technology (RUET), Rajshahi-6204, Bangladesh, in 2007, 2010, 2018, and 2019, respectively. Recently, he completed his postdoctoral research at the School of Computer Science and Engineering, The University of Aizu, Aizuwakamatsu 965-8580, Japan. His research interests include bioinformatics, artificial intelligence, pattern recognition, medical imaging, signal processing, machine learning, computer vision, data mining, big data analysis, probabilistic and statistical inference, operating systems, computer networks, and security. He has co-authored more than 100 publications published in widely cited journals and conferences.

AKIKO MEGUMI received a B.Sc. in Japanese Literature from the Doshisha Women's College of Liberal Arts, Japan, in 2004. She received an M.Sc. and Ph.D. in Social and Cultural Sciences from Kumamoto University, Japan, in 2019 and 2023, respectively. She has co-authored 9 published papers for widely cited journals and conferences. She has also received several awards. Her research interests include brain function and developmental disorders. She has clinical experience as a Licensed Clinical Psychologist and Speech-Language-Hearing Therapist.