Multi-Modal Data Fusion For Classification of Autism Spectrum Disorder Using Phenotypic and Neuroimaging Data
History: Received: November 3, 2022, Revised: December 8, 2022, Accepted: March 28, 2023,
Published: June 2, 2023
Citation: A. Younas, M. Y. Kamal, S. Kausar, and S. Tehsin, “Multi-Modal data fusion for
classification of autism spectrum disorder using phenotypic and neuroimaging
data,” UMT Artif. Intell. Rev., vol. 3, no. 1, pp. 01–16, June 2023, doi:
https://doi.org/10.32350.umt-air.31.01
A publication of
Department of Information System, Dr. Hasan Murad School of Management
University of Management and Technology, Lahore, Pakistan
Multi-Modal Data Fusion for Classification of Autism Spectrum
Disorder Using Phenotypic and Neuroimaging Data
Adnan Younas, Muhammad Yousuf Kamal, Sumaira Kausar *, and Samabia
Tehsin
Center of Excellence in Artificial Intelligence (COE-AI), Department of
Computer Science, Bahria University, Islamabad, Pakistan
ABSTRACT Autism Spectrum Disorder (ASD) is a neurodevelopmental disorder that
disrupts the social behavior and interactions of affected individuals and can therefore
adversely affect their social functioning. Each autistic individual is said to have a
unique behavioral pattern. ASD has three major sub-categories, namely autism, Asperger,
and pervasive developmental disorder not otherwise specified. The term spectrum
indicates that ASD covers a wide range of symptoms and severity levels. Practitioners need
extensive experience and expertise to analyze the symptoms of ASD accurately.
These symptoms need to be acquired from a range of modalities: an accurate diagnosis
requires the analysis of both brain scans and phenotypic data. These aspects present a multifold
challenge for computer-aided ASD diagnosis. Most existing computer-aided ASD
diagnosis systems can only determine whether or not an individual is affected by
ASD; a detailed categorization into the subcategories of ASD is missing. Existing
techniques also observe symptoms from a single modality, which can adversely affect
the accuracy of diagnosis, since different modalities capture different aspects of the
symptoms. These challenges and gaps motivated a method that covers the variety
exhibited in ASD while acquiring symptoms from multiple data sources. The
proposed method showed encouraging results, which provide evidence of its
efficacy.
INDEX TERMS Asperger, Autism Spectrum Disorder (ASD), diagnosis, feature
fusion, machine learning, psychiatry
JEL CODES H51, H52, and H53
I. INTRODUCTION
A person with autism lacks social communication and is involved in repetitive behaviors [1]. The term ‘spectrum’ describes the range, which may vary from mild to severe levels of disabilities in skills and behaviors [2]. Regardless of race, ethnicity, culture, and socioeconomic background, the basic signs of Autism Spectrum Disorder (ASD) are lack of social interaction and repetitive behavior [2]. ASD is a neurodevelopmental disorder that affects the verbal and social skills of an individual with no discrimination of age, gender, race, or any other social background. The growth of autism around the globe has made researchers work on early diagnosis and treatment to make the affected individual an active part of society again. Automated diagnosis of ASD is a dire need of the time. Early and speedy diagnosis can help patients in their early treatment. With the emergence of AI technology, the treatment of ASD has
* Corresponding Author: [email protected]
UMT Artificial Intelligence Review
2
Volume 3 Issue 1, Spring 2023
Younas et al.
become possible, which would be beneficial not only for patients but for all other stakeholders as well.
According to the diagnostic and statistical manual of mental disorder 4th edition (DSM-IV), autism has different types including autistic disorder, Asperger disorder, childhood disintegrative disorder, and pervasive developmental disorder not otherwise specified [3, 4]. This was the actual diagnostic classification of DSM-IV published by the American Psychiatric Association in 1994. In 2013, DSM-IV was upgraded to the 5th edition as DSM-5, in which all these types except Rett syndrome were combined into a “spectrum” with a range from mild to severe [1]. Different studies have discovered that both genetic and environmental factors are responsible for restriction in the development of the brain and may cause autism due to changes in cerebellar architecture and abnormalities in the limbic system [1]. No clear causes have been found for autism; however, there are several misconceptions about its causes [5]. There are some common parental beliefs regarding the causes of autism. Parents believe that ASD in their child is due to their child’s brain structure, environmental pollution, genetic problems, or maybe the will of God. Some parents associate it with generalized stress, bad luck, poor diet, and tobacco or alcohol consumption [6]. There is no connection between vaccination and autism [5]; however, many parents feel that vaccines contain toxins that cause autism. Diagnosis of autism is not easy since clinicians have to depend on personal observation and information provided by parents [5, 6].
Several screening tools including “STAT” (screening tool for autism in toddlers), “ADOS” (autism diagnostic observation schedule), “ADI-R” (autism diagnostic interview revised), and “DISCO” (a diagnostic instrument for social communication disorders, UK) are available [2]. Some other tools are the “modified checklist for autism in toddlers, revised, with follow-up” (M-CHAT-R/F) and the “survey of the well-being of young children”. Some other tools are also available to measure social deficiencies, such as the “Social Communication Questionnaire (SCQ)”, the “Social Responsiveness Scale (SRS)”, and the “Autism Spectrum Screening” questionnaire. All such tools are based on personal observations, interviews, and questionnaires, which are fairly questionable in terms of reliability. Several societies prefer genetic testing in which different laboratory tests are performed, such as CBC (complete blood count), urine examination, and stool analysis. Brain imaging is not common practice; however, autism’s relationship with the brain urges clinicians to acquire neuroimages through MRI for detailed analysis [1].
With the advancement in the field of neuroscience and psychopathology, early detection of ASD is possible [7]. ASD shares some symptoms with other disorders, such as ADHD, which makes screening difficult. Early detection of autism can help clinicians treat the patient at an early stage [8]. ASD is a result of genetic mutation [9]. Most studies suggest that very few patients are diagnosed at an early age, although studies have shown that diagnosis under the age of three has a stability rate of 100% [10]. Several studies have used automated methods based on computer vision techniques and data analysis for diagnosis of ASD [7–9].
ASD is diagnosed clinically with different tools including STAT, ADOS, ADI-R, CARS, SRS, and SCQ [2]. Iidaka [12]
applied neural networks to classify teenage ASD patients and obtained 90% accuracy. Chen et al. [13] used SVM and classified 79.17% of cases accurately. [14] obtained 73.4% accuracy with a CNN, proposing a novel approach that uses the full-resolution 3D spatial structure of rs-MRI data. Moreover, the ABIDE dataset was also used for binary classification and an accuracy of 73.3% was obtained. Li et al. [15] applied deep neural networks in a 2-stage method for the classification of ASD in which fMRI images were used through a 3D CNN sigmoid classifier with an accuracy of 85.3%. Moreover, the problem of interpreting reliable biomarkers related to ASD classification was also addressed; however, rs-fMRI images were used for binary classification only.
Heinsfeld et al. [16] identified the most influential brain areas for ASD with an accuracy of 70% by using a DNN. Autoencoders improved model performance, and the RF, SVM, and DNN classifiers showed accuracies of 63%, 65%, and 70%, respectively. Brain images from the ABIDE dataset were used to differentiate ASD patients from non-autistic people. Yang et al. [17] classified ASD and TD with the help of rs-fMRI by using TensorFlow-based DNN models and obtained an accuracy of 75.27%. Multi-site resting-state fMRI acquired through the ABIDE repository was used for conducting the study. Moreover, image data was used for binary classification of ASD and no ASD. Behavioral features were not used. Yin et al. [18] reviewed fMRI- and sMRI-based diagnosis of ASD.
Arya et al. [19] fused behavioral and brain image data with the help of a GCN framework. They fused brain summaries obtained from a 3D CNN with phenotypic data to make the model more effective. A mean accuracy of 64.23% was achieved. The study focused on binary classification. [20] applied a deep neural network for the diagnosis of ASD patients by using a brain image dataset. Their study employed a hybrid model of unsupervised autoencoders and a supervised CNN and obtained an accuracy of 84.05%. Pominova et al. [21] performed domain adaptation on brain image data for the classification of ASD patients based on brain pathology. Their approach outperformed other existing approaches with the use of 3D convolutional autoencoders. Lu et al. [22] proposed a fuzzy multi-kernel clustering approach based on autoencoders and an accuracy of 61% was obtained by combining the fMRI and phenotypic data. Their clustering approach performed better than others for the diagnosis of ASD.
Huang et al. [23] used ABIDE-I multi-site and multi-template data, classified ASD patients by using brain image features, and achieved an accuracy of 89.13%. In 2018, Khosla et al. [9] used functional MRI and obtained an accuracy of 73.3% for binary classification of ASD patients. A large number of studies have also been conducted to boost the performance of autism classification by using different data processing techniques [24].
It is evident from the literature on automated ASD diagnosis that some areas require more focus in research. One of the key observations in this regard is that the majority of the work in the literature is focused on binary classification, that is, ASD-affected and control classes. Little to no work has been done on the classification of subcategories of ASD. Another related research dimension that requires more attention is multi-modal data analysis for diagnosis. The current study focused on these two aspects. The proposed method focuses on classification into
subcategories of ASD and also considers multiple modalities from which to take symptoms. Moreover, the achieved results were promising.
II. METHODS
A. DATASET
Many datasets are available for ASD patients. Kaggle, UCI machine learning, and different self-collected datasets are available on different sites. However, an initiative by the “National Institute of Mental Health America” has made a repository of autism patients’ data with controls. This data is publicly available on the NIMH website for further research. It is multi-modal data with a considerable number of records. It is multi-site global ASD data collected from different states of America and other sites around the world. This data features brain images, which play an important role in the diagnosis of ASD. Some other sites have image data sets but they lack behavioral and functional data. This data set comprises clinical as well as behavioral data. In many other data sets, data is available just for infants, adolescents, or adults separately; in this initiative, however, all age ranges are covered in a single data set. Moreover, most datasets are related to men, while ABIDE deals with both genders. Hence, global participation has made it more diverse and effective in generalizing the diagnosis process around the globe.
This data repository is presented by the International Neuroimaging Data-sharing Initiative (INDI). It has two data sets named ABIDE-I and ABIDE-II. Each data set comprises brain images taken at different laboratories around the world. Hence, it should be taken into consideration that these images are taken with different MRI machines and in different settings. ABIDE-I was released in 2012 with 1112 records of
17.
FIGURE 1. Proposed model
that is, entropy and Gini index. The formulas for both are given below:
$\mathrm{Gini} = 1 - \sum_{i=1}^{n} p^{2}(c_i)$
$\mathrm{Entropy} = \sum_{i=1}^{n} -p(c_i)\,\log_2 p(c_i)$
Where $p(c_i)$ is the probability or percentage of class $c_i$ in a node. In DT, “Gini” and “Entropy” were used as criteria with a max depth of 5. The model was fine-tuned with different parameters and, as a result, performed well with 70% training samples and “Gini” as the criterion with a depth of 5.
2) SUPPORT VECTOR MACHINES (SVM)
Support Vector Machine (SVM) is a machine learning model used for classification. It works with mathematical functions called “kernels”. It separates the groups along a decision boundary, and each sample is assigned to the group on its side of that boundary. SVM is one of the leading classifiers. Four different kernels were used: Linear, Polynomial, Sigmoid, and Radial Basis Function (RBF). Different hyper-parameters were also used. The “C” error term was set to 50 for the linear kernel and to 100 for the RBF kernel in order to avoid over-fitting. For the RBF kernel, “gamma” was set to 1, a low value, so that the influence of each training example reaches as far as possible. For the polynomial kernel, the degree was set to 8, since higher values take considerable time to run a single instance. The data split was 70-10-20; a 60-10-30 ratio was also tried.
Some of the basic kernels used in SVM are given below with their mathematical equations.
Linear: $K(x,y) = x \cdot y$
Polynomial: $K(x,y) = (1 + x \cdot y)^{d}$
RBF: $K(x,y) = \exp(-a\,\lVert x - y \rVert^{2})$
Sigmoid: $K(x,y) = \tanh(a\,x \cdot y + b)$
3) NEURAL NETWORKS (NN)
A neural network (NN) is one of the most popular classifiers. It takes inspiration from the structure and working of the human brain. The idea behind the neural network is the “neuron”, which is the basic unit of the brain. An activation function processes the data, the error is checked, and the weights are updated. Gradually, the machine learns and the error is minimized. In NN, an activation function refers to a mathematical function that maps the inputs of a neuron to the output of that neuron. Table II shows different activation functions with equations and derivatives. Backpropagation is used to learn the parameters of the network. Backpropagation uses gradients to optimize the parameters.
For the NN, a simple model with input, hidden, and output layers was used. For binary classification, sigmoid in the output layer and the rectified linear unit (ReLU) function in the hidden layers were used. The hyperbolic tangent function in the input and hidden layers was also tested. To compute the loss, binary cross-entropy was used with “Adam” as the optimizer. Stochastic gradient descent (SGD) was also tested; however, it did not perform well. The batch size was set to 200 with 1000 epochs in order to run the model for sufficient training and prediction. 10-fold cross-validation was used to enhance model efficiency. In multi-class classification, the same model was used with the same parameters; however, the output activation function was changed to “Softmax”.
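TABLE II
DERIVATIVES OF ACTIVATION FUNCTIONS
No | Function | Equation | Derivative
1 | Sigmoid | $f(x) = \frac{1}{1 + e^{-x}}$ | $f'(x) = f(x)\,(1 - f(x))$
2 | TanH | $f(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$ | $f'(x) = 1 - f(x)^{2}$
3 | ReLU | $f(x) = \begin{cases} 0 & \text{if } x < 0 \\ x & \text{if } x \geq 0 \end{cases}$ | $f'(x) = \begin{cases} 0 & \text{if } x < 0 \\ 1 & \text{if } x \geq 0 \end{cases}$
4 | Softmax | $f(x_i) = \frac{e^{x_i}}{\sum_{j} e^{x_j}}$ | $f'(x_i) = \frac{e^{x_i}}{\sum_{j} e^{x_j}} - \frac{(e^{x_i})^{2}}{\left(\sum_{j} e^{x_j}\right)^{2}}$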
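The derivative column of Table II can be sanity-checked against a finite-difference approximation. The sketch below is illustrative only and assumes the standard logistic form 1/(1 + e^(-x)) for the sigmoid; the helper names, the step size `h`, and the test point are hypothetical choices, and the softmax derivative shown is only the diagonal term (the derivative of output i with respect to its own input).

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_prime(x):
    s = sigmoid(x)
    return s * (1.0 - s)


def tanh_prime(x):
    return 1.0 - math.tanh(x) ** 2


def relu(x):
    return x if x >= 0 else 0.0


def relu_prime(x):
    return 1.0 if x >= 0 else 0.0


def softmax(xs):
    # Subtracting the maximum keeps exp() from overflowing; the result is unchanged.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_diag_prime(xs, i):
    """Diagonal term only: d softmax_i / d x_i = f_i - f_i^2."""
    f = softmax(xs)[i]
    return f - f * f


def numeric_derivative(fn, x, h=1e-6):
    """Central finite difference, used only to cross-check the closed forms."""
    return (fn(x + h) - fn(x - h)) / (2.0 * h)


x = 0.7  # arbitrary test point
print(abs(numeric_derivative(sigmoid, x) - sigmoid_prime(x)) < 1e-6)
print(abs(numeric_derivative(math.tanh, x) - tanh_prime(x)) < 1e-6)
```

The sigmoid and softmax derivatives share the form f(1 - f), which is one reason these output activations pair cleanly with cross-entropy losses during backpropagation.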
The first step of the current study was to classify ASD and controls, which remained the focus of previous studies. Targeting Dx group with two labels, samples were classified in the individual modalities and in the proposed feature fusion method. The current study showed that the proposed approach outperformed the individual modalities in all classifiers; feature fusion of multi-modality data gave the best results. The next step was to classify ASD and controls into the broad multi-classes of no autism, autistic, Asperger, and PDDs-NOS, which remained a gap in previous studies. Targeting “PDD DSM IV” with four labels, samples were classified in the individual modalities as well as in the proposed feature fusion approach. Results showed that the fusion approach outperformed the individual modalities in DT and SVM. In NN, behavioral data alone matched the fused features in precision but remained lower in accuracy and recall. Overall, feature fusion gave better results in all classifiers.
Finally, the main focus of the current study was to classify ASD patients into subcategories, which were mostly not discussed in previous studies. “PDD DSM IV” was targeted with three labels. Samples were classified in the individual modalities and in the proposed feature fusion approach. The achieved results showed that feature fusion gave the better results.
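To make the idea of feature-level fusion concrete, the following minimal sketch standardises each modality and concatenates the two feature vectors per subject before classification. It is not the authors' pipeline: the concatenation strategy, the nearest-centroid classifier (a stand-in for DT, SVM, or NN), the function names, and all toy values are assumptions made purely for illustration.

```python
import math


def zscore_columns(rows):
    """Standardise each feature column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def fuse(phenotypic, imaging):
    """Feature-level fusion: scale each modality, then concatenate per subject."""
    p, q = zscore_columns(phenotypic), zscore_columns(imaging)
    return [a + b for a, b in zip(p, q)]


def nearest_centroid_fit(X, y):
    """Compute one mean feature vector (centroid) per class label."""
    centroids = {}
    for label in set(y):
        members = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def nearest_centroid_predict(centroids, x):
    """Assign x to the class whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(centroids[label], x))


# Hypothetical toy subjects: two phenotypic and three imaging features each.
phenotypic = [[8.0, 95.0], [9.0, 90.0], [10.0, 110.0], [11.0, 115.0]]
imaging = [[0.20, 0.10, 0.40], [0.25, 0.15, 0.35],
           [0.60, 0.50, 0.10], [0.65, 0.55, 0.05]]
labels = ["ASD", "ASD", "control", "control"]

fused = fuse(phenotypic, imaging)
centroids = nearest_centroid_fit(fused, labels)
predictions = [nearest_centroid_predict(centroids, x) for x in fused]
print(len(fused[0]), predictions)
```

Scaling each modality before concatenation keeps features with large numeric ranges from dominating the distance computation; a comparison like the one reported above would then train the same classifier once per single modality and once on the fused vectors.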
biomarker interpretation in ASD using deep learning and fMRI,” in Int. Conf. Med. Image Comput. Comput-assist. Interven., Granada, Spain, Sept. 16–20, 2018, pp. 206–214.
Conf. Mach. Vision, Rome, Italy, Nov. 2–6, 2021, pp. 570–577, doi: https://doi.org/10.1117/12.2587348
[22] H. Lu, S. Liu, H. Wei, and J. Tu,