Advancing Early Autism Detection in Children Through Machine Learning-Assisted Spectrogram Analysis
2024 IEEE International Conference on Signal Processing, Informatics, Communication and Energy Systems (SPICES) | 979-8-3503-7613-5/24/$31.00 ©2024 IEEE | DOI: 10.1109/SPICES62143.2024.10779925
Abstract—Autism Spectrum Disorder (ASD) is characterized by ongoing difficulties with verbal and non-verbal communication, repetitive habits, and social interaction. Electroencephalography (EEG) is one of the most popular techniques for investigating neurological disorders such as autism because of its low cost, excellent temporal resolution, and general availability. However, EEG recordings are extensive and dynamic, and ASD detection has traditionally required visual inspection by trained clinicians. This approach is labor-intensive, costly, subjective, error-prone, and inconsistent. The aim of this research is to create an effective diagnostic method for identifying ASD using time-frequency spectrogram images obtained from EEG signals. The raw EEG signals are preprocessed by filtering and then converted into spectrogram images with the Short-Time Fourier Transform (STFT). Features extracted from the spectrogram images are fed into five distinct machine learning (ML) classifiers, namely k-Nearest Neighbor (kNN), Support Vector Machine (SVM), Random Forest (RF), Logistic Regression (LR), and Naive Bayes (NB), to perform classification. The system achieved an accuracy of 92.7% when employing kNN. Its simplicity and high performance suggest it could serve as a valuable decision support tool for healthcare practitioners in diagnosing ASD.

Index Terms—ASD, EEG, STFT, Machine learning

I. INTRODUCTION

ASD is a neurodevelopmental condition distinguished by enduring difficulties in social interaction, communication, and behavior. Individuals with ASD may exhibit a wide range of symptoms and abilities, hence the term "spectrum," which encompasses the variability in presentation and severity. Symptoms typically emerge in early childhood and can vary from mild to severe, impacting various aspects of daily life [1].

ASD affects people of all racial, ethnic, and socio-economic backgrounds, and its reported prevalence has been increasing over the years. Although ASD is quite common, its precise etiology is still unknown. However, a number of neurological, environmental, and genetic factors have been linked to ASD [2], [3]. Because early diagnosis and management are essential for improving outcomes and quality of life for people with ASD, understanding and managing this complex condition is of great importance.

EEG studies in ASD show distinctive brain activity patterns that indicate differences in the brain's information processing. These differences often affect brain signal synchronization and connectivity, especially in areas linked to understanding social cues and processing sensory information. EEG helps to identify specific markers of ASD, which aids early diagnosis and the tracking of treatment effectiveness.

Several investigations have explored the integration of EEG data with ML techniques to classify and diagnose ASD. The study reported in [4] proposes a methodology for ASD classification that combines ML techniques with phase-based functional brain connectivity data from EEG. This approach revealed notable alterations in functional brain connectivity in children with ASD, particularly within the theta frequency band. Another study [5] focused on early ASD detection using EEG analytics and ML algorithms with Scikit-learn. The use of ML methods on EEG and Magnetoencephalography (MEG) data for ASD classification is discussed in [6]. A systematic review of 39 studies highlighted the effectiveness of these methods,
Authorized licensed use limited to: JSS Science & Technology University. Downloaded on January 22,2025 at 05:32:59 UTC from IEEE Xplore. Restrictions apply.
and all other points in the training set. The class of the new point is then determined by the majority class among its k nearest neighbors, with k being a user-defined parameter. In this work, n_neighbors is set to 3, weights to 'uniform', algorithm to its default value 'auto', leaf size to 30, and p to its default value of 2. kNN can be computationally expensive for large datasets and requires careful preprocessing to handle features effectively. Predicting disease risk is an increasingly complex problem in the medical field, which has prompted extensive use of ML techniques; the kNN algorithm is one of the most commonly employed methods, as discussed in [16].

2) Support Vector Machine: SVM is a supervised ML technique used for regression and classification problems [17]. Its objective is to determine the optimal hyperplane that divides the classes within the feature space with the greatest margin, thereby reducing classification errors. SVM employs a kernel trick to implicitly map data into a higher-dimensional space, enabling non-linear classification boundaries. The algorithm identifies support vectors, which are the data points situated nearest to the decision boundary, to determine the optimal position of the hyperplane. The parameter values used in this work are as follows: C is 1.0, kernel is 'linear', degree is left at its default value of 3, gamma is 'scale', and coef0 is 0.

3) Random Forest: RF is an ensemble learning technique that builds numerous decision trees during the training process [18]. Each tree in the forest is constructed from a randomized subset of the features and training data. Predictions are obtained by combining the outputs of all trees, typically by averaging for regression and by voting for classification. This ensemble approach helps to reduce overfitting and improves generalization performance compared to individual decision trees. The RF parameter values used are: n_estimators is 100, criterion is 'gini', minimum samples split is 2, and minimum samples leaf is 1.

4) Logistic Regression: LR is a statistical procedure used to classify data when the target variable has two possible outcomes. A logistic function, which converts a linear combination of the input features into a probability score, represents the likelihood that a sample belongs to a specific class. The model is trained by minimizing a cost function, adjusting the coefficients to best fit the observed data and make accurate predictions [19]. The parameter values are: penalty is 'l2', C is 1.0, solver is 'lbfgs', and the maximum number of iterations is 100.

5) Naive Bayes: NB is a probabilistic classification technique founded on Bayes' theorem, operating under the assumption of feature independence. It calculates the probability that a given instance belongs to a particular class by multiplying the conditional probabilities of each feature given that class [20]. Despite its simplistic assumption of feature independence, NB often performs well in text classification and other tasks, especially when the data is high-dimensional. The parameter values used are: variable smoothing is 1e-9, alpha is 1.0, and fit_prior is True.

III. RESULTS AND DISCUSSION

The initial step is preprocessing the raw EEG signals, which includes applying a notch filter and a band-pass filter (BPF) to remove undesired signal components. The filtered signals are then divided into segments, each lasting 10 seconds. Using the STFT, a spectrogram image with dimensions of 1025 × 7 is generated for each segment. The dataset comprises 3584 spectrogram images, of which 75% are assigned to training the various machine learning models and the remaining 25% to testing their performance. Figures 2 and 3 show spectrogram images of an ASD subject and a normal subject, respectively.

Fig. 2. Spectrogram image of ASD

Fig. 3. Spectrogram image of Normal

$$\text{Sensitivity} = \frac{TP}{TP + FN} \tag{2}$$
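As a worked illustration of Eq. (2) and the companion measures in Eqs. (3)-(5), the sketch below computes sensitivity, specificity, F1 score, and accuracy from the four confusion-matrix counts. The function name `classification_metrics` and the example counts are hypothetical and chosen only for illustration; they are not the confusion matrix behind the reported results.

```python
def classification_metrics(tp, tn, fp, fn):
    """Confusion-matrix measures following Eqs. (2)-(5).

    tp, tn, fp, fn are the counts of true positives, true negatives,
    false positives and false negatives on the test set.
    """
    total = tp + tn + fp + fn
    return {
        "sensitivity": tp / (tp + fn),            # Eq. (2)
        "specificity": tn / (tn + fp),            # Eq. (3)
        "f1_score": 2 * tp / (2 * tp + fp + fn),  # Eq. (4)
        "accuracy": 100.0 * (tp + tn) / total,    # Eq. (5), in percent
    }


# Hypothetical counts for illustration only (not taken from the paper).
example = classification_metrics(tp=40, tn=50, fp=5, fn=5)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Accuracy is expressed in percent, while the other three measures stay in the range 0 to 1.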
$$\text{Specificity} = \frac{TN}{TN + FP} \tag{3}$$

$$\text{F1 Score} = \frac{2TP}{2TP + FP + FN} \tag{4}$$

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \times 100 \tag{5}$$

TABLE I
PERFORMANCE METRICS DURING TESTING

ML Model   Accuracy (%)   Sensitivity   Specificity   F1 Score   AUC
kNN        92.7           0.856         0.956         0.926      0.908
RF         89.1           0.734         0.961         0.887      0.847
SVM        83.4           0.741         0.875         0.834      0.808
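For reference, the decision rule of the kNN classifier listed in Table I can be sketched in a few lines. The sketch assumes the settings reported for this work (three neighbors, uniform weights, and Euclidean distance, i.e. p = 2). The function name `knn_predict` and the two-dimensional toy feature vectors are hypothetical stand-ins for the features extracted from the spectrogram images.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance (Minkowski distance with p = 2) to every training point.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    distances.sort(key=lambda pair: pair[0])
    # Uniform weights: each of the k nearest neighbors contributes one vote.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data for illustration only (not EEG features).
train_points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3),
                (1.0, 1.1), (0.9, 1.0), (1.2, 0.8)]
train_labels = ["normal", "normal", "normal", "ASD", "ASD", "ASD"]

pred_a = knn_predict(train_points, train_labels, (0.1, 0.1))
pred_b = knn_predict(train_points, train_labels, (1.0, 0.9))
print(pred_a, pred_b)
```

Because every query is compared with all training points, the cost of one prediction grows linearly with the size of the training set, which is the computational expense noted in the description of kNN.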
Fig. 5. ROC of kNN

IV. CONCLUSION

The primary aim of this work was to evaluate the effectiveness of state-of-the-art ML methods in detecting ASD from EEG signals. A comprehensive analysis was conducted using kNN, RF, SVM, LR, and NB. The kNN algorithm achieved the highest accuracy (92.7%) and was the strongest classifier overall in this study for separating EEG signals into ASD and non-ASD categories, although RF reported a marginally higher specificity (0.961 versus 0.956 in Table I). The Naive Bayes algorithm showed the lowest performance among the evaluated techniques. The findings of this study demonstrate the potential of ML techniques to enhance ASD diagnosis, offering valuable insights for both clinical practitioners and researchers. Further research and refinement of these methods hold promise for advancing the accuracy and accessibility of ASD assessment in clinical settings.
REFERENCES

[1] S. Baron-Cohen, T. Jolliffe, C. Mortimore, M. Robertson, “Another Advanced Test of Theory of Mind: Evidence from Very High Functioning Adults with Autism or Asperger Syndrome,” Journal of Child Psychology and Psychiatry, vol. 38, pp. 813-822, 1997.
[2] Julie A. Osterling, Geraldine Dawson, Jeffrey A. Munson, “Early Recognition of 1-year-old Infants with Autism Spectrum Disorder Versus Mental Retardation,” Development and Psychopathology, vol. 14, pp. 239-251, 2002.
[3] Senju A., Southgate V., White S., Frith U., “Mindblind eyes: an absence of spontaneous theory of mind in Asperger syndrome,” Science, vol. 325, pp. 883-885, 2009.
[4] Alotaibi N., Maharatna K., “Classification of Autism Spectrum Disorder From EEG-Based Functional Brain Connectivity Analysis,” Neural Computation, vol. 33, pp. 1914-1941, 2021.
[5] Bosl W.J., Tager-Flusberg H., Nelson C.A., “EEG Analytics for Early Detection of Autism Spectrum Disorder: A data-driven approach,” Scientific Reports, vol. 8, 2018.
[6] Das S., Zomorrodi R., Mirjalili M., Kirkovski M., Blumberger D.M., Rajji T.K., Desarkar P., “Machine learning approaches for electroencephalography and magnetoencephalography analyses in autism spectrum disorder: A systematic review,” Progress in Neuropsychopharmacology and Biological Psychiatry, vol. 123, 2023.
[7] Jacek Rogala, Jarosław Zygierewicz, Urszula Malinowska, Hanna Cygan, Elżbieta Stawicka, Adam Kobus, Bart Vanrumste, “Enhancing autism spectrum disorder classification in children through the integration of traditional statistics and classical machine learning techniques in EEG analysis,” Scientific Reports, vol. 13, 2023.
[8] Chawla P., Rana S. B., Kaur H., Singh K., “Computer-aided diagnosis of autism spectrum disorder from EEG signals using deep learning with FAWT and multiscale permutation entropy features,” Proceedings of the Institution of Mechanical Engineers, Part H, Journal of Engineering in Medicine, vol. 237, pp. 282-294, 2023.
[9] Mohi-ud-Din, A. K. Jayanthy, “Detection of Autism Spectrum Disorder from EEG signals using pre-trained deep convolution neural networks,” Seventh International Conference on Bio Signals, Images, and Instrumentation (ICBSII), Chennai, India, pp. 1-5, 2021.
[10] Mohi ud Din, Qaysar and Jayanthy, A. K., “Detection of Autism Spectrum Disorder by feature extraction of EEG signals and machine learning classifiers,” Biomedical Engineering: Applications, Basis and Communications, vol. 35, 2023.
[11] Anamika Ranaut, Padmavati Khandnor, Trilok Chand, “Identifying autism using EEG: unleashing the power of feature selection and machine learning,” Biomedical Physics and Engineering Express, vol. 10, 2024.
[12] Alves C. L., Toutain, De Carvalho Aguiar P., Pineda A. M., Roster K., Thielemann C., Porto J., Rodrigues F. A., “Diagnosis of autism spectrum disorder based on functional brain networks and machine learning,” Scientific Reports, vol. 13, 2023.
[13] Tawhid MNA, Siuly S., Wang H., Whittaker F., Wang K., Zhang Y., “A spectrogram image based intelligent technique for automatic detection of autism spectrum disorder from EEG,” PLoS ONE, vol. 16, 2021.
[14] Baygin, M., Dogan, S., Tuncer, T., Datta Barua, P., Faust, O., Arunkumar, N., Abdulhay, E. W., Emma Palmer, E., Rajendra Acharya U., “Automated ASD detection using hybrid deep lightweight features extracted from EEG signals,” Computers in Biology and Medicine, vol. 134, 2021.
[15] Richard A. Altes, “Detection, estimation, and classification with spectrograms,” The Journal of the Acoustical Society of America, vol. 67, pp. 1232-1246, 1980.
[16] Shahadat Uddin, Ibtisham Haque, Haohui Lu, Mohammad Ali Moni, Ergun Gide, “Comparative performance analysis of k-nearest neighbour (kNN) algorithm and its different variants for disease prediction,” Scientific Reports, vol. 12, 2022.
[17] M. A. Hearst, S. T. Dumais, E. Osuna, J. Platt and B. Schölkopf, “Support vector machines,” IEEE Intelligent Systems and their Applications, vol. 13, pp. 18-28, 1998.
[18] Vladimir Svetnik, Andy Liaw, Christopher Tong, J. Christopher Culberson, Robert P. Sheridan, and Bradley P. Feuston, “Random Forest: A Classification and Regression Tool for Compound Classification and QSAR Modeling,” Journal of Chemical Information and Computer Sciences, vol. 43, pp. 1947-1958, 2003.
[19] Joanne Peng, Kuk Lida Lee, Gary M. Ingersoll, “An Introduction to Logistic Regression Analysis and Reporting,” The Journal of Educational Research, vol. 96, pp. 3-14, 2002.
[20] Feng-Jen Yang, “An Implementation of Naive Bayes Classifier,” International Conference on Computational Science and Computational Intelligence (CSCI), Las Vegas, NV, USA, pp. 301-306, 2018.