Advanced Hybrid Transfer Learning Approaches For Autism Spectrum Disorder Detection Using Facial Features
Abstract— Autism Spectrum Disorder (ASD) is a complex neurodevelopmental condition characterized by deficits in social interactions, communication, and emotional regulation. Early identification is crucial for improving outcomes. This work presents a DL approach for ASD detection using facial features extracted from the Autistic Children Facial Dataset. Initially, transfer learning models including VGG16, EfficientNet, DenseNet, and MobileNet were applied individually to extract and analyze critical facial characteristics. VGG16 and DenseNet provided detailed high-level features, EfficientNet offered efficient feature extraction, and MobileNetV3 enabled rapid processing. Later, a hybrid model was developed by merging the features extracted from the above models for emotion-based ASD identification. The hybrid approach was trained and evaluated using SVM and random forest ML models. Experimental results demonstrated that the hybrid approach enhanced detection accuracy for early ASD detection.

Keywords— ASD Detection, Transfer Learning, VGG, EfficientNet, DenseNet, MobileNet.

I. INTRODUCTION
ASD is marked by enduring difficulties with social interaction, communication, and repetitive activities. It is a spectrum condition because the symptoms may appear in a wide range of ways, from mild to severe. People with ASD often struggle to read and react to social signals, which may make it difficult for them to build connections and communicate successfully. The developmental outcomes for those with ASD may be greatly improved by prompt intervention, which is why early diagnosis is so important. The WHO estimates that 1 in 150 children has ASD, and the count is growing. Although ASD is common, it is typically misdiagnosed or found later in life, particularly in areas with limited access to specialists. Traditional diagnostic procedures rely on subjective behavioural observations and parental reports, which might delay diagnosis. These constraints call for more objective, accessible, and scalable diagnostic methods.

Recent advances in AI and computer vision have enabled early ASD identification. Researchers have started to employ face recognition technology to evaluate subtle facial expressions that may indicate ASD. This method uses the distinctive facial traits and emotional responses of ASD patients. ML and DL are attractive technologies for constructing automated diagnostic systems due to the availability of vast datasets and sophisticated computational models.

This research explores the detection of ASD through a hybrid approach that combines advanced transfer learning techniques with ML models for facial feature analysis. Facial traits are extracted using a blend of pre-trained DL methods, including VGG16, EfficientNet, DenseNet, and MobileNet.

The algorithms used in this work are described below.

A. CNN
CNNs are widely used in image analysis to automatically learn spatial hierarchies of features. In this work, a CNN is employed to extract and identify critical facial features associated with ASD.

B. VGG
This deep CNN is simple and efficient in feature extraction. Its 16 layers, mostly convolutional, capture fine-grained image features. Its weights, pre-trained on large datasets, support effective transfer learning, which makes it suitable for image recognition applications such as facial feature analysis for ASD identification.

C. EfficientNet
The EfficientNet family of deep learning models balances accuracy with computational efficiency. Scaling the network dimensions (depth, width, and resolution) uniformly gives EfficientNet high accuracy with fewer parameters for facial image classification and ASD diagnosis.
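The uniform scaling of depth, width, and resolution used by EfficientNet can be illustrated with a short sketch. The coefficient values below are those reported for the original EfficientNet baseline, not values given in this paper, and the function names are hypothetical.

```python
# Compound scaling sketch. ALPHA, BETA, GAMMA are the depth, width, and
# resolution coefficients reported for the original EfficientNet baseline;
# they are assumptions here and are not taken from this paper.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15


def compound_scale(phi):
    """Return (depth, width, resolution) multipliers for coefficient phi."""
    return ALPHA ** phi, BETA ** phi, GAMMA ** phi


def flops_factor(phi):
    """Approximate growth in FLOPs: depth * width^2 * resolution^2."""
    depth, width, resolution = compound_scale(phi)
    return depth * width ** 2 * resolution ** 2


for phi in range(4):
    depth, width, resolution = compound_scale(phi)
    print(
        f"phi={phi}: depth x{depth:.2f}, width x{width:.2f}, "
        f"resolution x{resolution:.2f}, FLOPs x{flops_factor(phi):.2f}"
    )
```

With these coefficients, each unit increase of phi roughly doubles the computational cost, since ALPHA * BETA^2 * GAMMA^2 is close to 2.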
used for its lightweight design, allowing for fast and efficient
feature extraction and processing.
Next, features from EfficientNet, DenseNet, MobileNetV3, and VGG16 are combined to form a hybrid model. The features are concatenated and normalized into a single, cohesive feature vector. This method represents facial traits more fully and richly by using the complementary qualities of each model. Afterwards, classifiers, RF and SVM in particular, are trained using the hybrid feature vector to evaluate how well the model detects ASD using facial expression analysis. The performance of the model is assessed using various metrics.
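The fusion step can be sketched as follows, as a minimal illustration rather than the exact procedure used in this work. The feature dimensions are illustrative placeholders, random arrays stand in for the features extracted by each network, and row-wise L2 normalization is an assumed choice, since the normalization method is not specified.

```python
import numpy as np


def fuse_features(blocks, eps=1e-12):
    """Concatenate per-model feature matrices and L2-normalize each row.

    blocks: list of arrays, each of shape (n_images, dim_i).
    L2 normalization is an assumption; the paper does not name the method.
    """
    fused = np.concatenate(blocks, axis=1)
    norms = np.linalg.norm(fused, axis=1, keepdims=True)
    return fused / np.maximum(norms, eps)


# Hypothetical feature sizes per backbone (placeholders, not from the paper).
dims = {"vgg16": 512, "efficientnet": 1280, "densenet": 1024, "mobilenetv3": 960}

rng = np.random.default_rng(0)
n_images = 8
# Random stand-ins for the features each pre-trained model would produce.
blocks = [rng.random((n_images, dim)) for dim in dims.values()]

hybrid = fuse_features(blocks)
print(hybrid.shape)  # (8, 3776)
```

The resulting matrix has one fused feature vector per image and could then be passed to an SVM or RF classifier.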
in a ratio of 80% to 20%. All the experiments are conducted in the Google Colab environment.

B. Data Preprocessing
The facial images are resized to a uniform dimension and converted to grayscale if necessary. Image normalization is performed to adjust pixel values to a range between 0 and 1. Data augmentation through rotation and flipping is applied to enlarge the dataset.

C. Applying CNN
A CNN is applied to classify facial images. The architecture includes three convolutional layers with 32, 64, and 128 filters, each followed by ReLU activation and 2x2 max-pooling. The final layers are a dense layer with 128 neurons and a softmax output layer for classification.

D. Applying VGG
Next, the VGG16 model is applied to classify facial images. This pre-trained deep learning model consists of 13 convolutional layers. The final fully connected layers perform classification, with the last layer using a softmax to output probabilities for the two categories: autistic and non-autistic.

E. Applying EfficientNet
Next, EfficientNet is applied to classify facial images using its specific architecture. The model's base includes 7 convolutional layers followed by a global average pooling layer. EfficientNet-B0, the version used, has a total of 5.3 million parameters and a depth of 24 layers. The final dense layer is configured with 256 neurons and a softmax activation function to categorize the images into autistic and non-autistic classes.

F. Applying DenseNet
DenseNet is applied for facial image classification using its distinctive architecture. Specifically, DenseNet-121 is utilized, featuring 121 layers with a total of approximately 8 million parameters. The network includes dense blocks with 6, 12, 24, and 16 layers, each connected to the next through dense connections that enhance feature propagation and gradient flow. The final classification layer consists of 256

pooling layer and a final layer of 128 neurons used for classification.

I. Results comparison with applied models
Table I and Fig. 2 show the performance of the DL models in ASD detection, evaluated through accuracy and recall. The CNN model achieved 91% accuracy with 90% recall, while VGG16 showed a slight improvement at 91.5% accuracy and 90% recall. EfficientNet and DenseNet both reached 92% accuracy, with EfficientNet achieving 91% recall and DenseNet 90%. MobileNetV3 outperformed all other models, achieving an accuracy of 94% and a recall of 93%, making it the most effective single model for ASD detection. These results indicate that MobileNetV3 is particularly suited for accurate and reliable ASD detection through facial features.

[Figure: bar chart "Results with DL models" showing accuracy and recall for CNN, VGG, EfficientNet, DenseNet, and MobileNet]
Fig. 2. Results with TL models

TABLE I. RESULTS WITH TRANSFER LEARNING
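The accuracy and recall values reported in this section can be computed from predicted and true labels as in the following sketch. The label encoding (1 for autistic, 0 for non-autistic), the function name, and the sample labels are hypothetical and serve only as an illustration.

```python
def accuracy_and_recall(y_true, y_pred, positive=1):
    """Compute accuracy and recall for a binary classification task.

    Assumes `positive` marks the autistic class (a hypothetical encoding).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = (tp + tn) / len(pairs)
    # Recall: share of truly positive images that the model detected.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, recall


# Made-up labels for ten test images (not data from the paper).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]

acc, rec = accuracy_and_recall(y_true, y_pred)
print(f"accuracy={acc:.2f}, recall={rec:.2f}")  # accuracy=0.80, recall=0.80
```

Recall is reported alongside accuracy because a missed ASD case (a false negative) is the costlier error in a screening setting.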
complementary information captured by each deep learning model.

K. Applying ML model with hybrid features
In the hybrid model development phase, the feature vectors obtained from the fusion of the transfer learning approaches are used to train two machine learning classifiers, SVM and RF. Table II and Fig. 3 present the results obtained using hybrid features with machine learning models for ASD detection. The SVM model achieved an accuracy of 95% with a recall of 94%, demonstrating strong performance in classifying ASD-related facial features. The Random Forest (RF) model surpassed SVM, delivering an accuracy of 96.3% with a recall of 95%. These results highlight the effectiveness of combining deep learning feature extraction with traditional machine learning models to enhance ASD detection accuracy and reliability.

TABLE II. RESULTS WITH HYBRID FEATURES
Model                       Accuracy   Recall
SVM with hybrid features    95%        94%
RF with hybrid features     96.3%      95%

[Figure: bar chart "Results with hybrid features" showing accuracy and recall for SVM and RF]

V. CONCLUSION
In conclusion, this paper presented a hybrid DL approach for ASD detection using facial features from the ASD dataset. Multiple DL models, including CNN, VGG16, EfficientNet, DenseNet, and MobileNetV3, were applied to capture critical facial characteristics. Individually, these models yielded accuracies of 91% for CNN, 91.5% for VGG16, 92% for EfficientNet, 92% for DenseNet, and 94% for MobileNetV3. To further enhance the detection performance, a hybrid model was developed by combining the features extracted from the transfer learning models. These fused features are classified using the ML classifiers SVM and RF. The hybrid approach improved accuracy over the best single model, with SVM achieving 95% and Random Forest achieving 96.3%.

REFERENCES
[1] C. K. Themistocleous et al., "Autism Detection in Children: Integrating Machine Learning and Natural Language Processing in Narrative Analysis," Behavioral Sciences, vol. 14, no. 6, MDPI AG, p. 459, May 29, 2024.
[2] J. C. Mathew et al., "Autism Spectrum Disorder Using Convolutional Neural Networks," International Conference on Integrated Circuits and Communication Systems, Raichur, India, 2024.
[3] M. D. Karthik, S. Jeba Priya and T. Mathu, "Autism Detection for Toddlers using Facial Features with Deep Learning," International Conference on Applied Artificial Intelligence and Computing, Salem, India, 2024.
[4] J. Li, "Artificial Intelligence-Based Detection of Autism Spectrum Disorder Using Linguistic Features," International Conference on Computing and Machine Intelligence, Mt Pleasant, MI, USA, 2024.
[5] Supritha R et al., "Deep Learning for Autism Detection Using Eye Tracking Scanpaths," International Conference on Interdisciplinary Approaches in Technology and Management for Social Innovation, Gwalior, India, 2024.
[6] K. Pai et al., "Multimodal Integration, Fine Tuning of Large Language Model for Autism Support," 2024 5th International Conference on Mobile Computing and Sustainable Informatics, Lalitpur, Nepal, 2024.
[7] M. S. V. Sai Krishna Narala et al., "Prediction of Autism Spectrum Disorder Using Efficient Net," International Conference on Advanced Computing and Communication Systems, Coimbatore, India, 2023.
[8] R. Chandra et al., "Autism Spectrum Disorder Detection using Autistic Image Dataset," International Conference on Electrical Engineering, Computer Science and Informatics, Palembang, Indonesia, 2023.
[9] K. Patel et al., "Transfer Learning Approach for Detection of Autism Spectrum Disorder using Facial Images," International Conference on Electrical, Electronics and Computer Engineering, Gautam Buddha Nagar, India.
[10] Ramanjot et al., "Autism Spectrum Disorder Detection using the Deep Learning Approaches," 2022 2nd International Conference on Technological Advancements in Computational Sciences, Tashkent, Uzbekistan, 2022.
[11] https://www.kaggle.com/datasets/imrankhan77/autistic-children-facial-data-set.