
Proceedings of the 5th International Conference on Data Intelligence and Cognitive Informatics (ICDICI-2024)

IEEE Xplore Part Number: CFP24VL6-ART; ISBN: 979-8-3503-8960-9

Advanced Hybrid Transfer Learning Approaches for Autism Spectrum Disorder Detection Using Facial Features
2024 5th International Conference on Data Intelligence and Cognitive Informatics (ICDICI) | 979-8-3503-8960-9/24/$31.00 ©2024 IEEE | DOI: 10.1109/ICDICI62993.2024.10810869

Surya Lakshmi Kantham Vinti
Assistant Professor, Department of CSE
Aditya College of Engineering and Technology, Surampalem, India

Bharathi Panduri
Assistant Professor
Gokaraju Rangaraju Institute of Engineering and Technology

Vijayshri Khedkar
Department of CSE
Symbiosis Institute of Technology, Symbiosis International (Deemed University), Pune, India

Mahesh Babu Ketha
Assistant Professor, Department of ECE
Aditya University, Surampalem, India

Dr. A. Lakshmanarao
Associate Professor, Department of IT
Aditya University, Surampalem, India

Thokala Srivalli
Assistant Professor, Department of CSE
Koneru Lakshmaiah Education Foundation (Deemed to be University), Vaddeswaram, Guntur District, Andhra Pradesh, India

Abstract: Autism Spectrum Disorder (ASD) is a complex neurodevelopmental condition characterized by deficits in social interaction, communication, and emotional regulation. Early identification is crucial for improving outcomes. This work presents a deep learning (DL) approach for ASD detection using facial features extracted from the Autistic Children Facial Dataset. Initially, transfer learning models including VGG16, EfficientNet, DenseNet, and MobileNet were applied individually to extract and analyze critical facial characteristics. VGG16 and DenseNet provided detailed high-level features, EfficientNet offered efficient feature extraction, and MobileNetV3 enabled rapid processing. Later, a hybrid model was developed by merging the features extracted by these models for emotion-based ASD identification. The hybrid approach was trained and evaluated using support vector machine (SVM) and random forest (RF) machine learning (ML) models. Experimental results demonstrated that the hybrid approach improved accuracy for early ASD detection.

Keywords: ASD Detection, Transfer Learning, VGG, EfficientNet, DenseNet, MobileNet.

I. INTRODUCTION
ASD is marked by enduring difficulties with social interaction and communication, and by repetitive activities. It is a spectrum condition because the symptoms may appear in a wide range of ways, from mild to severe. People with ASD often struggle to read and react to social signals, which may make it difficult for them to build connections and communicate successfully. The developmental outcomes for those with ASD may be greatly improved by prompt intervention, which is why early diagnosis is so important. The WHO estimates that 1 in 150 children has ASD, and the count is growing. Although ASD is common, it is often misdiagnosed or identified late in life, particularly in areas with limited access to specialists. Traditional diagnostic procedures rely on subjective behavioural observations and parental reports, which can delay diagnosis. Because of these constraints, more objective, accessible, and scalable diagnostic methods are needed.
Recent advances in AI and computer vision have enabled early ASD identification. Researchers have started to employ face recognition technology to evaluate subtle facial expressions that may indicate ASD. This method uses the distinctive facial traits and emotional responses of people with ASD. ML and DL are attractive technologies for constructing automated diagnostic systems because of the availability of large datasets and sophisticated computational models.
This research explores the detection of ASD through a hybrid approach that combines advanced transfer learning techniques with ML models for facial feature analysis. Facial traits are extracted using a blend of pre-trained DL models, including VGG16, EfficientNet, DenseNet, and MobileNet. The algorithms used in this work are described below.

A. CNN
CNNs are widely used in image analysis to automatically learn spatial hierarchies of features. In this work, a CNN is employed to extract and identify critical facial features associated with ASD.

B. VGG
VGG16 is a deep CNN with a simple, uniform design that is effective for feature extraction. Its 16 weight layers, mostly convolutional, capture fine-grained image features. Its weights, pre-trained on large datasets, transfer well, which makes it suitable for image recognition tasks such as facial feature analysis for ASD identification.

C. EfficientNet
The EfficientNet family of deep learning models balances accuracy with computational cost. By scaling the network dimensions (depth, width, and resolution) uniformly, EfficientNet achieves high accuracy with fewer parameters for facial image classification and ASD diagnosis.
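The uniform scaling of depth, width, and resolution mentioned for EfficientNet can be written out explicitly. The rule below is the compound scaling formulation from the original EfficientNet paper, not something stated in this work; the symbols $\phi$ (compound coefficient) and $\alpha, \beta, \gamma$ (per-dimension base factors) follow that paper's notation.

```latex
\begin{aligned}
\text{depth: }      & d = \alpha^{\phi}, \\
\text{width: }      & w = \beta^{\phi}, \\
\text{resolution: } & r = \gamma^{\phi}, \\
\text{subject to }  & \alpha \cdot \beta^{2} \cdot \gamma^{2} \approx 2,
                      \quad \alpha \ge 1,\ \beta \ge 1,\ \gamma \ge 1 .
\end{aligned}
```

Under this constraint, increasing $\phi$ by one roughly doubles the computational cost (FLOPs). The EfficientNet paper reports $\alpha = 1.2$, $\beta = 1.1$, $\gamma = 1.15$ for the B0 baseline, which gives $1.2 \cdot 1.1^{2} \cdot 1.15^{2} \approx 1.92$.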


D. DenseNet
In DenseNet, each layer within a dense block receives inputs from all preceding layers. This dense connectivity improves feature reuse and mitigates the vanishing gradient problem, allowing the model to capture complex facial feature patterns for better ASD classification.

E. MobileNet
This efficient CNN architecture is suited to mobile and resource-constrained applications. Depthwise separable convolutions provide high accuracy at low computational cost. Its light weight and strong performance make it well suited to real-time ASD detection from facial images.
These models were selected for their specific strengths in feature extraction. The CNN serves as the baseline because it captures local image features well. VGG16 and DenseNet were chosen for their ability to extract highly detailed features, EfficientNet for its balance between accuracy and computational efficiency, and MobileNetV3 for its speed. The hybrid approach combines the features from these models.

II. LITERATURE SURVEY
C. K. Themistocleous et al. [1] examined ASD diagnostic delays, especially in lower-income or minority populations, noting an average diagnosis age of six years in Greece. The authors used NLP to extract narrative and vocabulary skills from 66 children with ASD and 52 typically developing peers to create an AI-based model for early identification. On linguistic data, histogram-based gradient boosting and XGBoost ML models distinguished children with ASD from typically developing children with 96% accuracy. The authors of [2] explored detecting ASD by analyzing facial features using CNNs. They used a dataset of 2,940 images from Kaggle and developed a web application integrating DenseNet121 and EfficientNetB0. EfficientNetB0 achieved 90% accuracy, while DenseNet121 reached 54%. The approach aimed to provide an early detection tool for ASD, facilitating timely intervention. M. D. Karthik et al. [3] improved ASD identification by applying deep learning algorithms to facial images. Their technique used the Vision Transformer (ViT) with PCA, SVM, CatBoost, SHAP, XGBoost, and VGG16 to increase detection accuracy and reduce overfitting. The hybrid models performed significantly better than state-of-the-art methods at diagnosing ASD.
The authors of [3] conducted a study using machine learning to improve early diagnosis of ASD in children. The research applied SVM, RF, and Gradient Boosting (GB) to a dataset from Kaggle, which included factors such as demographic information, medical history, and screening test results. SVM was particularly effective with high-dimensional data, RF performed well by handling varied data characteristics, and GB boosted prediction accuracy through its ensemble learning approach. The aim of the study was to identify key indicators of ASD and enhance the ability to detect it early. SVM emerged as the most effective model, highlighting the potential of ML in supporting the early diagnosis and treatment of ASD. The authors of [4] evaluated how effectively several machine learning methods detected ASD, with the aim of enhancing early detection. The analysis revealed that SVM, KNN, DT, RF, and DNN were the models with the ...
Supritha R et al. [5] analyzed eye-tracking data for ASD using the deep learning models DenseNet-201, EfficientNet B7, ResNet-50, and MobileNetV2. DenseNet-201 was the most accurate at 94.97%, followed by EfficientNet B7 at 94.74%. These models and eye-tracking technologies show promise for early ASD screening.
K. Pai et al. [6] developed an app featuring real-time object recognition, speech-to-text, and Low-Rank Adaptation to help people with special needs feel included. The research generated guardian and therapist interaction timestamps using YOLOv5 and machine learning to evaluate therapy recordings. YOLO and SSD MobileNet were used for object identification, demonstrating the potential of Large Language Models in dynamic, real-world applications. The authors of [7] examined children's faces to diagnose ASD. Their program distinguished autistic from typically developing children using 2,500 training and 300 testing face images. Facial features predicted ASD with 88% accuracy using the EfficientNet convolutional neural network.
The authors of [8] employed deep learning algorithms and facial images to improve early ASD identification. An attention mechanism and transfer learning were added to the VGG16 and VGG19 models to improve their ASD facial feature detection. These modifications decreased overfitting and enhanced performance, attaining 82.55% accuracy for VGG19 and 80% for VGG16. The updated models outperformed the original ones, suggesting a route to non-invasive early ASD identification. The authors of [9] used ML and DL to identify ASD in children by examining facial traits. They extracted eye, nose, and lip distances using the VGG16 and VGG19 deep CNNs. The extracted features were input into logistic regression, SVM, naive Bayes, and ANN models. The approach's accuracy, sensitivity, and specificity showed promise for early ASD diagnosis. Ramanjot et al. [10] used DL to identify ASD in children using facial traits. To extract distances between the eyes, nose, and lips, deep convolutional neural networks such as VGG16 and VGG19 were used. The features obtained from these networks were then passed to ML models such as SVM, NB, and ANN. This approach showed promising results in terms of accuracy, sensitivity, and specificity for the early diagnosis of ASD.

III. RESEARCH METHODOLOGY
Figure 1 illustrates the proposed methodology. The process starts with data collection and preprocessing, using facial images from the Autistic Children Facial Dataset. Preprocessing steps such as resizing the images to a standard dimension, normalization, and data augmentation are employed to improve model robustness. Afterward, various transfer learning models are applied to extract essential features from the facial images for further analysis and classification. The proposed method first uses a CNN to capture basic features from the facial images, such as patterns and textures. Next, VGG16 is employed to extract more detailed, high-level features, thanks to its deep convolutional architecture that provides rich representations of facial characteristics. EfficientNet follows, chosen for its ability to balance accuracy with computational efficiency, ensuring effective feature extraction without excessive resource use. DenseNet is then applied to capture more complex feature interactions by using dense connections between layers, leading to a robust set of features. Lastly, MobileNetV3 is used for its lightweight design, allowing for fast and efficient feature extraction and processing.
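The preprocessing steps named in this section (resizing, normalization, and augmentation) can be sketched with NumPy alone. This is a minimal illustration, not the authors' pipeline: the 224 x 224 target size, nearest-neighbour resizing, the horizontal flip, and the 90-degree rotation are all assumptions, since the paper does not state the image size, interpolation, or augmentation parameters. The function names are hypothetical.

```python
import numpy as np


def resize_nearest(image: np.ndarray, size: tuple[int, int]) -> np.ndarray:
    """Resize an H x W x C image to `size` using nearest-neighbour sampling."""
    out_h, out_w = size
    in_h, in_w = image.shape[:2]
    # Map every output row/column back to a source row/column index.
    rows = np.arange(out_h) * in_h // out_h
    cols = np.arange(out_w) * in_w // out_w
    return image[rows[:, None], cols]


def normalize(image: np.ndarray) -> np.ndarray:
    """Scale 8-bit pixel values to the range [0, 1]."""
    return image.astype(np.float32) / 255.0


def augment(image: np.ndarray) -> list[np.ndarray]:
    """Return the image, a horizontally flipped copy, and a 90-degree rotated copy."""
    return [image, np.fliplr(image), np.rot90(image)]


def preprocess(image: np.ndarray, size: tuple[int, int] = (224, 224)) -> list[np.ndarray]:
    """Resize, augment, then normalize one image (target size is an assumption)."""
    resized = resize_nearest(image, size)
    return [normalize(view) for view in augment(resized)]


# Usage with a synthetic stand-in for one colour face image.
rng = np.random.default_rng(0)
face = rng.integers(0, 256, size=(300, 200, 3), dtype=np.uint8)
views = preprocess(face)
print(len(views), views[0].shape)  # 3 (224, 224, 3)
```

In practice a deep learning framework's own image utilities would normally replace `resize_nearest`; the sketch only makes the order of operations concrete.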
Next, features from EfficientNet, DenseNet, MobileNetV3, and VGG16 are combined to form a hybrid model. These features are concatenated and normalized into a single, cohesive feature vector. The resulting representation describes facial traits more fully by exploiting the complementary strengths of each model. Afterwards, two classifiers, RF and SVM, are trained on the hybrid feature vector to evaluate how well the model detects ASD using facial expression analysis. The performance of the model is assessed using accuracy and recall.
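The fusion step described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the paper does not say how features are normalized or in which order, so the sketch L2-normalizes each model's features per sample and then concatenates them. The feature widths (512, 1280, 1024, 960) are illustrative values for globally pooled backbone outputs, and random arrays stand in for real extracted features. The function names are hypothetical.

```python
import numpy as np


def l2_normalize(features: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Scale each row (one sample) to unit Euclidean length."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    return features / np.maximum(norms, eps)


def fuse_features(feature_blocks: list[np.ndarray]) -> np.ndarray:
    """Normalize each (n_samples x dim) block, then concatenate along the feature axis."""
    sample_counts = {block.shape[0] for block in feature_blocks}
    if len(sample_counts) != 1:
        raise ValueError("all feature blocks must describe the same samples")
    return np.concatenate([l2_normalize(b) for b in feature_blocks], axis=1)


# Usage: random stand-ins for features of 8 images from four backbones.
rng = np.random.default_rng(0)
dims = {"vgg16": 512, "efficientnet": 1280, "densenet": 1024, "mobilenet_v3": 960}
blocks = [rng.normal(size=(8, d)) for d in dims.values()]
fused = fuse_features(blocks)
print(fused.shape)  # (8, 3776)
```

Normalizing each block before concatenation keeps a wide backbone output from dominating a narrow one purely because of scale. The fused matrix is what a classifier such as an SVM or a random forest would then be trained on.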

Fig. 1. Proposed Method

IV. EXPERIMENTS AND RESULTS


A. Facial Image Data Collection
This work used the Kaggle Autistic Children Facial Dataset [11]. The collection contains 2,936 color images, split equally between autistic and non-autistic children (1,468 images per class). Each image shows only the face, which makes it possible to analyze and diagnose ASD from facial traits alone. The images are divided into training and test sets in a ratio of 80% to 20%. All the experiments are conducted in the Google Colab environment.

B. Data Preprocessing
The facial images are resized to a uniform dimension and converted to grayscale if necessary. Image normalization scales pixel values to the range 0 to 1. Data augmentation by rotation and flipping is applied to enhance the dataset.

C. Applying CNN
A CNN is applied to classify facial images. The architecture includes three convolutional layers with 32, 64, and 128 filters, each followed by ReLU activation and 2x2 max-pooling. The final layers are a dense layer with 128 neurons and a softmax output layer for classification.

D. Applying VGG
Next, the VGG16 model is applied to classify facial images. This pre-trained deep learning model consists of 13 convolutional layers. The final fully connected layers perform classification, with the last layer using a softmax to output probabilities for the two categories: autistic and non-autistic.

E. Applying EfficientNet
Next, EfficientNet is applied to classify facial images using its specific architecture. The model's base includes 7 convolutional layers followed by a global average pooling layer. EfficientNet-B0, the version used, has a total of 5.3 million parameters and a depth of 24 layers. The final dense layer is configured with 256 neurons and a softmax activation function to categorize the images into autistic and non-autistic classes.

F. Applying DenseNet
DenseNet is applied for facial image classification using its distinctive architecture. Specifically, DenseNet-121 is used, featuring 121 layers with a total of approximately 8 million parameters. The network includes four dense blocks with 6, 12, 24, and 16 layers, whose dense connections enhance feature propagation and gradient flow. The final classification layer consists of 256 neurons with a softmax activation function, distinguishing between autistic and non-autistic images.

H. Applying MobileNet
Later, MobileNetV3 is employed with its lightweight architecture, which includes 16 depthwise separable convolution layers. The model features an initial convolution layer with 32 filters of size 3x3, followed by several inverted residual blocks with ReLU6 activations. A global average pooling layer and a final layer of 128 neurons are used for classification.

I. Results Comparison with Applied Models
Table I and Fig. 2 show the performance of the DL models in ASD detection, evaluated through accuracy and recall. The CNN model achieved 91% accuracy with 90% recall, while VGG16 showed a slight improvement at 91.5% accuracy and 90% recall. EfficientNet and DenseNet both reached 92% accuracy, with EfficientNet achieving 91% recall and DenseNet 90%. MobileNetV3 outperformed all other models, with an accuracy of 94% and a recall of 93%, making it the most effective individual model for ASD detection. These results indicate that MobileNetV3 is particularly suited to accurate and reliable ASD detection through facial features.

Fig. 2. Results with TL models (bar chart of accuracy and recall for CNN, VGG, EfficientNet, DenseNet, and MobileNet)

TABLE I. RESULTS WITH TRANSFER LEARNING

Model          Accuracy   Recall
CNN            91%        90%
VGG16          91.5%      90%
EfficientNet   92%        91%
DenseNet       92%        90%
MobileNetV3    94%        93%

J. Feature Fusion
In the feature fusion step, the features extracted by each transfer learning model (VGG16, EfficientNet, DenseNet, and MobileNet) are combined to create a comprehensive feature vector. This process involves concatenating the output features from the last layer of each model, resulting in a unified representation that integrates the strengths of each model. This combined feature vector is then used for further analysis and classification, leveraging the diverse and


complementary information captured by each deep learning model.

K. Applying ML Model with Hybrid Features
In the hybrid model development phase, the feature vectors obtained from the fusion of the transfer learning models are used to train two machine learning classifiers, SVM and RF. Table II and Fig. 3 present the results obtained using hybrid features with machine learning models for ASD detection. The SVM model achieved an accuracy of 95% with a recall of 94%, demonstrating strong performance in classifying ASD-related facial features. The Random Forest (RF) model surpassed SVM, delivering an accuracy of 96.3% with a recall of 95%. These results highlight the effectiveness of combining deep learning feature extraction with traditional machine learning models to enhance ASD detection accuracy and reliability.

TABLE II. RESULTS WITH HYBRID FEATURES

Model                      Accuracy   Recall
SVM with hybrid features   95%        94%
RF with hybrid features    96.3%      95%

Fig. 3. Results with hybrid features and ML models (bar chart of accuracy and recall for SVM and RF with hybrid features)

V. CONCLUSION
In conclusion, this paper presented a hybrid DL approach for ASD detection using facial features from the Autistic Children Facial Dataset. Multiple DL models, including CNN, VGG16, EfficientNet, DenseNet, and MobileNetV3, were applied to capture critical facial characteristics. The individual models yielded accuracies of 91% for CNN, 91.5% for VGG16, 92% for EfficientNet, 92% for DenseNet, and 94% for MobileNetV3. To further enhance detection performance, a hybrid model was developed by combining the features extracted by the transfer learning models. These fused features were classified using the ML classifiers SVM and RF. The hybrid approach improved accuracy over the best individual model (MobileNetV3, 94%), with SVM achieving 95% and Random Forest achieving 96.3%.

REFERENCES
[1] C. K. Themistocleous et al., "Autism Detection in Children: Integrating Machine Learning and Natural Language Processing in Narrative Analysis," Behavioral Sciences, vol. 14, no. 6, MDPI AG, p. 459, May 29, 2024.
[2] J. C. Mathew et al., "Autism Spectrum Disorder Using Convolutional Neural Networks," International Conference on Integrated Circuits and Communication Systems, Raichur, India, 2024.
[3] M. D. Karthik, S. Jeba Priya and T. Mathu, "Autism Detection for Toddlers using Facial Features with Deep Learning," International Conference on Applied Artificial Intelligence and Computing, Salem, India, 2024.
[4] J. Li, "Artificial Intelligence-Based Detection of Autism Spectrum Disorder Using Linguistic Features," International Conference on Computing and Machine Intelligence, Mt Pleasant, MI, USA, 2024.
[5] Supritha R et al., "Deep Learning for Autism Detection Using Eye Tracking Scanpaths," International Conference on Interdisciplinary Approaches in Technology and Management for Social Innovation, Gwalior, India, 2024.
[6] K. Pai et al., "Multimodal Integration, Fine Tuning of Large Language Model for Autism Support," 2024 5th International Conference on Mobile Computing and Sustainable Informatics, Lalitpur, Nepal, 2024.
[7] M. S. V. Sai Krishna Narala et al., "Prediction of Autism Spectrum Disorder Using Efficient Net," International Conference on Advanced Computing and Communication Systems, Coimbatore, India, 2023.
[8] R. Chandra et al., "Autism Spectrum Disorder Detection using Autistic Image Dataset," International Conference on Electrical Engineering, Computer Science and Informatics, Palembang, Indonesia, 2023.
[9] K. Patel et al., "Transfer Learning Approach for Detection of Autism Spectrum Disorder using Facial Images," International Conference on Electrical, Electronics and Computer Engineering, Gautam Buddha Nagar, India.
[10] Ramanjot et al., "Autism Spectrum Disorder Detection using the Deep Learning Approaches," 2022 2nd International Conference on Technological Advancements in Computational Sciences, Tashkent, Uzbekistan, 2022.
[11] https://www.kaggle.com/datasets/imrankhan77/autistic-children-facial-data-set
