
Advanced Techniques For Biometric

This research paper discusses advanced techniques for liveness face detection in biometric authentication systems, emphasizing the integration of deep learning models with Explainable AI (XAI) methods like LIME to enhance security and interpretability. The study achieves a significant accuracy of 97.3% on the Spoof in Wild with Multiple Attacks Version 2 (SiWMv2) dataset, outperforming baseline models. The findings highlight the importance of robust liveness detection to prevent spoofing attacks and improve the reliability of biometric systems.

Received 8 September 2024, accepted 25 September 2024, date of publication 21 October 2024, date of current version 29 October 2024.

Digital Object Identifier 10.1109/ACCESS.2024.3474690

Advanced Techniques for Biometric Authentication: Leveraging Deep Learning and Explainable AI
SMITA KHAIRNAR 1,2, SHILPA GITE 3,4 (Member, IEEE), KASHISH MAHAJAN 2, BISWAJEET PRADHAN 5 (Senior Member, IEEE), ABDULLAH ALAMRI 6, AND SUDEEP D. THEPADE 2,7
1 Computer Science and Engineering Department, Symbiosis Institute of Technology, Symbiosis International (Deemed) University, Pune 412115, India
2 Pimpri Chinchwad College of Engineering, Pune 411044, India
3 AIML Department, Symbiosis Institute of Technology, Pune 412115, India
4 Symbiosis Centre for Applied AI, Symbiosis International (Deemed University), Pune 412115, India
5 Centre for Advanced Modelling and Geospatial Information Systems (CAMGIS), School of Civil and Environmental Engineering, Faculty of Engineering & Information Technology, University of Technology Sydney, Sydney, NSW 2007, Australia
6 Department of Geology and Geophysics, College of Science, King Saud University, Riyadh 11543, Saudi Arabia
7 PCET, Pimpri Chinchwad University, Pune 412106, India

Corresponding authors: Shilpa Gite ([email protected]) and Biswajeet Pradhan ([email protected])


This work was supported in part by the Centre for Advanced Modelling and Geospatial Information Systems (CAMGIS), Faculty of
Engineering and IT, University of Technology Sydney; and in part by the Researchers Supporting Project, King Saud University, Riyadh,
Saudi Arabia, under Project RSP 2024 R14.

The associate editor coordinating the review of this manuscript and approving it for publication was Zahid Akhtar.

2024 The Authors. This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/ (VOLUME 12, 2024)

ABSTRACT Liveness face detection is essential for modern biometric systems: it ensures that the input comes from a real, live person rather than from a manipulated image or video. The novelty of this study lies in combining deep learning models with Local Interpretable Model-agnostic Explanations (LIME) to enhance the interpretability and transparency of facial liveness detection systems. Liveness detection is necessary for preventing spoofing attacks, in which an attacker tries to defeat a biometric system by presenting falsified material such as photographs, videos, or masks in order to gain access. If not dealt with, such fraud can compromise the security of biometric systems. Liveness detection relies on several strategies, from basic facial actions like blinking and head turns to more advanced algorithms that identify natural skin texture and warmth or detect pixel-level differences between live and static images. Robust liveness detection significantly enhances the security and reliability of biometric authentication. The objective of this research is to test different pre-trained models for detecting spoofing attacks and to use LIME to explain the models' predictions. This paper focuses on the Spoof in Wild with Multiple Attacks Version 2 (SiWMv2) dataset, which comprises 14 different spoofing techniques, ranging from replay attacks and makeup disguises with paper glasses to more complex ones. Seven pre-trained architectures, VGG16, DenseNet201, InceptionV3, VGG19, ResNet50, MobileNetV2, and Xception, are fine-tuned for automatic liveness identification in facial images. Deep learning approaches achieve superior detection performance against contemporary spoofing techniques. Building on these approaches, LIME is incorporated to improve transparency further. LIME provides visual explanations of a prediction that show which features support the model's decisions. Our work demonstrates how LIME can give insights that help in understanding model decisions in security and authentication systems. The key findings of this research show that our method achieves 97.3% accuracy on the SiWMv2 dataset, a significant improvement over baseline models, which average 91.5%. This enhancement shows the effectiveness of our approach in increasing the robustness and reliability of liveness detection.

INDEX TERMS Explainable artificial intelligence (XAI), liveness detection, LIME, pre-trained models, spoof
attacks.

I. INTRODUCTION
Biometric authentication systems [1] are essential in this digital era of interactions for guaranteeing secure access to sensitive information and services. Among such systems, liveness detection [2] stands out as a fundamental mechanism for verifying the presence of a live human while authentication is performed.

Increasingly sophisticated spoofing attacks, however, are testing the reliability of such systems to a great extent. In a spoofing attack, someone tries to trick the system into granting access by impersonating a legitimate user. According to the literature, such attacks undermine a basic guarantee of liveness detection mechanisms: ensuring the authenticity of a subject during image acquisition [3]. Understanding the intentions behind such attacks is relevant and can help improve security measures and reduce their large economic impact [4], [5]. It has been estimated that a breached biometric system can cost an economy millions, particularly in high-security industries such as finance, healthcare, and government services, all of which rely on secure authentication. Bypassing liveness detection enables unauthorized access and fraud, undermining trust in digital transactions and compromising user privacy. This paper considers the performance of seven recognized architectures, VGG16, DenseNet201, InceptionV3, VGG19, ResNet50, MobileNetV2, and Xception [6], on the task of liveness detection. Testing is done with the Spoof in Wild with Multiple Attacks Version 2 (SiWMv2) dataset, which includes 14 spoofing techniques, from the simplest ones, like print and replay, to more elaborate methods using disguises. We apply transfer learning, a technique that "borrows" knowledge from pre-trained models and applies it to new tasks, to investigate how such architectures can be adapted and fine-tuned to better detect and mitigate spoofing attempts [6], [7].

Transfer learning benefits autonomous driving systems by transferring knowledge from models trained on extensive road imagery to improve the accuracy of object detection and lane-keeping algorithms in varied environments [8]. In financial forecasting, transfer learning allows models to be tuned for forecasting market trends by leveraging patterns learned from large amounts of historical data from different financial instruments. It is expected that the robustness and generalizability of models against spoofing attacks would be significantly enhanced by knowledge transfer from large-scale datasets in the specific task of liveness detection [9]. This approach accelerates the development of effective liveness detection systems and reduces the computational burden associated with training deep neural networks from scratch. Integrating transfer learning with Explainable AI (XAI) is of particular interest, as it can enhance the interpretability and transparency of predictions, making Artificial Intelligence (AI) systems more reliable and trustworthy in critical applications [10].

Several tools improve the interpretability of machine learning algorithms. Local Interpretable Model-agnostic Explanations (LIME) interprets individual predictions by using simple models to approximate the model's behaviour locally. SHapley Additive exPlanations (SHAP) [11] uses a game-theoretic approach to provide consistent feature importance measures. Gradient-weighted Class Activation Mapping (Grad-CAM) [12] identifies the regions of an image that are most influential in a model's prediction. LIME is commonly used in experimentation due to its versatility and model-agnostic nature, which allows it to explain predictions from any machine learning model without requiring modifications. This study incorporates an explainable AI method to clarify how our liveness detection models make decisions. Explainable AI methods are vital because they provide clear and understandable insights into the factors that influence the predictions made by the model [13].

These methods explain the predictions of complex machine learning algorithms locally and interpretably [14]. Their goal is to offer an understanding of how a model generates its predictions for specific cases or portions of data. This is important because many complex machine learning models are considered black boxes, making their decision-making procedures unclear and challenging to understand. In the context of liveness detection, where distinguishing between live subjects and spoofing attempts is crucial, understanding these factors is paramount for improving detection accuracy and robustness. We aim to enhance the resilience and transparency of liveness detection systems, thereby mitigating the economic impact of spoofing attacks and bolstering the security of biometric authentication in digital environments. The structure of this paper is as follows: Section II reviews existing research. Section III describes the proposed methodology. The following section gives an overview of testing and information on how the procedure was carried out. Section V depicts the experiment's results using LIME Explainer and prediction


scores. Finally, Section VI discusses the results and possibilities for future work.

II. RELATED WORK
Liveness detection has evolved from its initial blink detection and motion analysis techniques into machine learning and deep learning methods. Initial techniques focusing on eye movement and motion analysis [15] struggled against advanced spoofing methods like high-resolution printed images and 3D masks [16], exposing the vulnerabilities of early biometric systems. The integration of advanced learning techniques in the liveness detection domain is expected to enhance the analysis of dynamic facial features, texture variations, and depth information, which can then be exploited by models such as Xception, ResNet50, VGG16, VGG19, and InceptionV3. In particular, deep learning models using Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN) have significantly improved liveness detection by learning complex patterns that effectively separate live subjects from spoof attempts with high accuracy [17], [18].

Despite these advancements, recent studies have shown that deep learning methods still struggle with advanced spoofing attacks. Steiner et al. showed that CNNs are vulnerable to adversarial attacks, where even small changes in the input can lead to misclassifications. George and Marcel [19] also pointed out that while pixel-wise binary supervision improves performance against simple attacks, it fails to counter video and 3D mask attacks. Therefore, there is a need to enhance the robustness and interpretability of our models.

Pre-trained models, like those evaluated by Jain et al. [20], appear promising. Their fine-tuning of VGG16 and ResNet for liveness detection showed that these models are accurate and efficient against simple spoofing attacks. However, they also highlighted challenges in generalizing these models to more advanced spoofing methods, such as video and 3D mask attacks. The explainability of these models is crucial for building trust in biometric systems.

Despite this progress, deep learning models are not immune to adversarial attacks: they can be manipulated so that they misclassify their inputs [21], [22]. To understand and validate the decision-making process, LIME, SHAP, and Grad-CAM have been developed as critical tools under explainable AI. For example, LIME explains how models such as VGG or ResNet distinguish subtle facial cues and motion dynamics, which helps to identify vulnerabilities. Techniques like this can increase transparency and trust in biometric system development, but their robustness against sophisticated spoofing and adversarial attacks remains a concern.

LIME, introduced in the evaluation of XAI [23], gives a practical solution by showing visual explanations of CNN model predictions. It demonstrates which parts of the face contributed to the model's decision, making the results more interpretable. However, it also highlights that explainable models are more computationally expensive and may not be suitable for real-time systems. Grad-CAM and SHAP also enhance interpretability but are not robust against adversarial inputs.

SHAP, LIME, Integrated Gradients, and Counterfactual Explanations are Explainable AI (XAI) techniques that bring transparency to various domains, including biometric systems. In [24], for example, XAI was used in smart farming to make complex models more understandable. These advanced methods can also make biometric authentication more explainable, e.g., when distinguishing between live faces and spoofs. Additionally, they can identify vulnerabilities, thereby enhancing robustness and security. The cross-domain applicability of XAI makes it essential for ensuring trust and reliability across all applications.

AI goes beyond biometrics, proving effective in various fields such as smart farming, where techniques like deep learning and computer vision are used to optimize crop yields, detect pests, and manage resources [25]. This shows the versatility of AI: techniques like CNNs and RNNs, used in biometric liveness detection, are also applied to large datasets in agriculture, resulting in better performance and efficiency. The success of these techniques across different domains shows that they can be used in security, authentication, agriculture, and healthcare. This broader perspective helps put the importance of explainability and robustness in AI models into context, which is critical to building trust in any AI system, whether for biometrics or other practical applications.

Table 1 summarizes the related work, highlighting the advantages and limitations of face liveness detection and XAI. As can be seen from the table, some models are more accurate, while others offer faster detection with less computational overhead; both requirements need to be balanced when selecting models for deployment.

The main research gap is the need for liveness detection systems that are both more reliable and more interpretable. While the existing models show high accuracy, most defend poorly against advanced spoofing techniques and adversarial attacks. Therefore, when building a system, transparency of the decision-making process should be pursued alongside high accuracy in order to expose possible vulnerabilities. This problem is addressed here because secure and trustworthy biometric authentication systems are essential.

Recent works, such as those by Dwivedi et al. [26], have also examined the combination of interpretability and liveness detection. They found that deep learning models are necessary for high performance, but adding explainability techniques like LIME can make the system more trustworthy. However, the trade-off between computational complexity and real-time performance is still an open problem.

The key to gaining users' trust and progress in biometric authentication lies in strong defence mechanisms against spoofing and adversarial attacks and transparency


in decision-making processes. This paper proposes a new approach to liveness detection by combining state-of-the-art deep learning techniques with methods from interpretable AI to improve robustness and interpretability significantly. Unlike previous works, which primarily focused on achieving accuracy, our approach emphasizes identifying and mitigating vulnerabilities through explainable AI techniques. Our approach provides detailed insight into how the decisions of deep learning models are made to ensure that these models achieve high accuracy, transparency, and trustworthiness.

The key contributions made by this paper are as follows: (1) Incorporation of Explainable AI: We used LIME and other approaches to provide insights into the reasoning process of deep learning architectures, ensuring transparency and trustworthiness [27]. (2) Enhanced Robustness: We conducted comprehensive experiments using adversarial training techniques, strengthening defense mechanisms against sophisticated spoofing attacks. (3) Experiments: We performed extensive experiments to demonstrate how our method surpasses prior work in accuracy and robustness.

TABLE 1. Summary of related works in explainable AI and liveness detection: Highlighting advantages and limitations.

III. DATA AND METHODOLOGY

A. DATASET DESCRIPTION
The SiW-Mv2 (Spoof in Wild with Multiple Attacks Version 2) dataset is a key asset in face anti-spoofing (FAS), created explicitly for enhancing multi-domain face anti-spoofing algorithms. Meticulously curated, this dataset includes a diverse array of 14 spoof attack types, all identified and verified by the IARPA ODIN program. It comprises 915 spoof videos from 590+ subjects and 785 genuine videos from 400+ subjects, offering an extensive real-world data collection. This rich dataset facilitates the comprehensive evaluation and development of advanced anti-spoofing


algorithms. Figure 1 shows live samples and spoofs from the SiWMv2 dataset.

FIGURE 1. Samples of real and spoofed images from the SiW-Mv2 dataset.

B. METHOD
1) TRANSFER LEARNING
This work used the transfer learning method, which adapts a model designed for one task to form a new model for another task [27]. This approach is beneficial when we have limited data. Starting from a model pre-trained on a large dataset such as ImageNet, we can reuse its learned features and knowledge to improve performance on the specific face anti-spoofing task [28].

2) INITIAL TRAINING DATA
The models are initialized through pre-training on ImageNet, an extensive image database used in object recognition research. Weights pre-trained on ImageNet have already learned a rich set of features from millions of images, including edges, textures, shapes, and other visual attributes. This initial set of weights serves as a robust starting point, allowing the model to recognize basic features that are common across many visual tasks, thereby providing a significant head start when fine-tuning for the liveness task.

3) FINE-TUNING PROCESS
Fine-tuning involves training the model further on the specific face anti-spoofing dataset. During this process, the model weights are adjusted over multiple iterations to focus on the critical characteristics of face anti-spoofing. The goal is to shift the model's attention from the generalized features learned from ImageNet to the specific nuances required to differentiate between real and spoofed faces. Fine-tuning adapts the models to the new domain, improving their accuracy and performance on the target task.

4) MODELS TRAINED
Various convolutional neural networks (CNNs) were trained, including ResNet50, VGG16, Xception, VGG19, MobileNetV2, DenseNet201, and InceptionV3 [29]. Each of these models has its own architecture and strengths:

a: VGG16 AND VGG19
VGG16 [30] and VGG19 [31] are known for their simplicity and depth. They consist of multiple layers of convolutional filters and max-pooling operations arranged sequentially. VGG16 has 16 layers, and VGG19 has 19 layers, with the additional layers in VGG19 allowing for more complex hierarchical feature extraction. The architecture of the VGG16 model is shown in Figure 2.

b: ResNet50
ResNet50 [32] employs residual connections (or skip connections) to facilitate the training of very deep networks. These connections allow the model to learn residuals, that is, differences between the input and output, addressing the vanishing gradient problem and improving performance on deep learning tasks. This architecture enables the model to go deeper (50 layers) without encountering significant degradation in performance.

c: DenseNet201
DenseNet201 [33] introduces dense connections between layers, with each layer receiving input from all the preceding ones. This pattern improves gradient flow and feature reuse throughout the network. With its 201 layers, DenseNet201 enhances performance by allowing for a more efficient flow of information and gradients, leading to improved classification and object detection results.

d: MobileNetV2
MobileNetV2 was designed with efficiency in mind, making it particularly suitable for mobile applications [34]. It reduces the number of computations and parameters by using depthwise separable convolutions, all while maintaining the required accuracy. This design allows for faster inference times and lower computational resource usage, making it ideal for real-time applications on mobile devices.

e: InceptionV3
InceptionV3 features a complex architecture with inception modules that capture multi-scale features by applying multiple convolutions with different kernel sizes in parallel [35]. This allows the network to learn various levels of abstraction and combine them, making it adept at handling diverse visual features. The InceptionV3 architecture balances accuracy and computational efficiency, making it suitable for a wide range of tasks.

The proposed architecture for face liveness detection using a pre-trained model and LIME is shown in Figure 3.

f: Xception
Xception [36] builds upon the Inception architecture by incorporating depth-wise separable convolutions instead of
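The skip connection that ResNet50 relies on can be illustrated with a minimal sketch. The function names (`residual_block`, `zero_residual`, `small_residual`) and the toy transformations are assumptions made for illustration; this is not the ResNet50 implementation used in the study. The block returns the input plus a learned residual, so the identity mapping is recovered whenever the residual is zero.

```python
from typing import Callable, List

Vector = List[float]


def residual_block(x: Vector, transform: Callable[[Vector], Vector]) -> Vector:
    """Return F(x) + x, the skip-connection form used by residual networks."""
    fx = transform(x)
    if len(fx) != len(x):
        raise ValueError("F(x) must have the same shape as x for an identity skip")
    return [f + xi for f, xi in zip(fx, x)]


def zero_residual(x: Vector) -> Vector:
    """Hypothetical layer that has learned nothing: F(x) = 0."""
    return [0.0 for _ in x]


def small_residual(x: Vector) -> Vector:
    """Hypothetical layer that has learned a small correction: F(x) = 0.1 * x."""
    return [0.1 * v for v in x]


# With a zero residual the block is an identity mapping.
identity_out = residual_block([1.0, -2.0, 3.0], zero_residual)
# With a non-zero residual the block only adjusts the input.
adjusted_out = residual_block([1.0, -2.0, 3.0], small_residual)
```

Because the block only has to learn the correction F(x), stacking many such blocks does not force each one to relearn the identity, which is one common explanation for why very deep residual networks remain trainable.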


FIGURE 2. Detailed representation of the VGG16 CNN architecture.

FIGURE 3. Proposed architecture for face liveness detection using pre-trained models and local interpretable model-agnostic explanations.

standard convolutions. This modification further enhances the model's accuracy by separating the filtering step from the step that combines feature maps. The use of depth-wise separable convolutions reduces computational complexity while improving model performance, making Xception a powerful tool for image classification and other vision tasks.

We further evaluate the reliability of each model by examining its performance across various types of spoofing attacks and its robustness under different lighting and environmental conditions. This ensures the models are accurate and reliable in real-world applications where conditions vary significantly.

The advantages and limitations of the deep learning models experimented with in this study are summarized in Table 2. InceptionV3 and Xception performed better on complex patterns, but ResNet50 and DenseNet201 incurred higher computational cost and longer training times. This table shows the tradeoff between accuracy, speed, and computational complexity, which are the key factors to consider when choosing models for real-world biometric systems. We have focused more on InceptionV3 and Xception because of their balanced performance in both accuracy and feasibility.

5) TRAINING ENVIRONMENT
Models are trained using TensorFlow on Google Colab, a free cloud environment that provides access to powerful GPUs and TPUs. This environment allows effective training even with limited local resources. TensorFlow is a popular deep learning framework that offers flexible and extensible tools for building and deploying machine learning models. Google Colab encourages collaboration, lowers the cost of experimentation, and integrates with Google Drive to store and manage files.

6) EVALUATION METRICS
The model is evaluated using the categorical cross-entropy loss function, a non-negative value that approaches 0 as the predicted class probabilities (each between 0 and 1) approach the true labels, which is used to evaluate the effectiveness of the


classification model. This function compares the predicted class probabilities with the true class label, encoded as a one-hot vector [37]. It is particularly useful for multi-class classification because it penalizes the model according to the divergence between the predicted and actual distributions, which drives the model toward better predictions.

TABLE 2. Advantages and limitations of deep learning models used in experimentation.

7) OPTIMIZER USED
To reduce the loss, an Adam optimizer with a learning rate of 0.001 is used [38]. The adaptive gradient algorithm (AdaGrad) and root mean square propagation (RMSProp) are two extensions of stochastic gradient descent whose ideas Adam combines to provide adaptive moment estimation. A learning rate of 0.001 is a common default that balances training robustness and convergence stability.

8) REGULARIZATION TECHNIQUE
Dropout regularization is used to prevent overfitting. Dropout is a process in which randomly selected neurons are ignored during training, i.e., their contribution to the activation of downstream neurons is temporarily removed on the forward pass, and no weight updates are applied to them on the backward pass [39]. This reduces co-adaptation among neurons, making the network less sensitive to the weights of any single neuron. As a result, the model becomes more robust and less likely to overfit the training data, thus generalizing better to unseen data [40].

9) INITIAL RANKING SYSTEM
An initial ranking system was implemented to select the optimal model configuration. This system evaluated different model configurations based on their validation performance. By leveraging cross-validation techniques, the system identified the best-performing models and hyperparameters, ensuring that the chosen model is accurate and efficient. This approach helps systematically compare various models and select the one that strikes the best balance between complexity and performance, thus enhancing the overall effectiveness of the face anti-spoofing system.

10) OVERFITTING AND UNDERFITTING
In machine learning, both overfitting and underfitting [41] are problems. A model overfits when it performs well on training data but poorly on new, unseen data. This is because the model absorbs the noise and incidental detail contained in the training data, which harms its ability to generalize. Conversely, underfitting occurs when a model performs poorly on both training and new data because it is too simple to capture the underlying patterns in the data. These issues can be partially addressed through a combination of the initial ranking and regularization. The initial ranking ensures the model is well suited to the task by selecting the best configuration, while regularization prevents the model from overfitting the data. After training is complete, the model, including its weights, is stored in an H5 file.

11) EXPLAINABILITY WITH LIME
Predictions from complex learning models can be explained using a method called LIME (Local Interpretable Model-agnostic Explanations) [14]. The model's behaviour around a prediction is approximated by a linear model or another simple model that describes the prediction. Such local explanations help build confidence in the model's results by showing why the model made a prediction. The key features of LIME are as follows. Local: It focuses on explaining individual predictions
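As a rough illustration of the optimizer discussed in this section, the sketch below applies the standard Adam update rule to a single scalar parameter with the 0.001 learning rate quoted above. The quadratic objective, the function name `adam_minimize`, and the remaining hyperparameters (beta1 = 0.9, beta2 = 0.999, eps = 1e-8, which are the commonly used defaults) are assumptions for illustration and are not taken from the paper.

```python
import math
from typing import Callable


def adam_minimize(
    grad: Callable[[float], float],
    theta: float,
    lr: float = 0.001,
    beta1: float = 0.9,
    beta2: float = 0.999,
    eps: float = 1e-8,
    steps: int = 2000,
) -> float:
    """Run `steps` Adam updates on a scalar parameter and return the result."""
    m = 0.0  # running mean of gradients (momentum-like first moment)
    v = 0.0  # running mean of squared gradients (RMSProp-like second moment)
    for t in range(1, steps + 1):
        g = grad(theta)
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        m_hat = m / (1.0 - beta1 ** t)  # bias correction for the first moment
        v_hat = v / (1.0 - beta2 ** t)  # bias correction for the second moment
        theta -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta


def toy_gradient(theta: float) -> float:
    """Gradient of the assumed toy objective f(theta) = (theta - 3)^2."""
    return 2.0 * (theta - 3.0)


after_one_step = adam_minimize(toy_gradient, theta=0.0, steps=1)
after_many_steps = adam_minimize(toy_gradient, theta=0.0, steps=2000)
```

On the first step the bias-corrected ratio has magnitude close to 1, so the parameter moves by roughly one learning rate regardless of the gradient's scale. This per-parameter normalization is the part Adam shares with RMSProp.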
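The dropout mechanism can be sketched in a few lines. This is a minimal illustration of the widely used "inverted dropout" formulation, in which surviving activations are rescaled during training so that nothing needs to change at inference time. The function name `dropout`, the rate of 0.5, and the fixed random seed are assumptions; the paper does not state the dropout rate it used.

```python
import random
from typing import List, Optional


def dropout(
    activations: List[float],
    rate: float,
    training: bool = True,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Zero each activation with probability `rate` during training.

    Kept activations are divided by (1 - rate) so the expected value of each
    output matches its input; at inference time the input passes through unchanged.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        return list(activations)
    rng = rng or random.Random()
    keep_prob = 1.0 - rate
    return [a / keep_prob if rng.random() < keep_prob else 0.0 for a in activations]


# Assumed rate of 0.5 and a seeded generator so the example is repeatable.
train_out = dropout([1.0] * 1000, rate=0.5, rng=random.Random(0))
eval_out = dropout([1.0] * 1000, rate=0.5, training=False)
```

Because a different random subset of units is silenced on every training step, no single unit can be relied on exclusively, which is the effect the section describes.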
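One way such a ranking step could look is sketched below: the data indices are split into k folds, each configuration receives one validation score per fold, and configurations are ordered by their mean score. The helper names (`k_fold_indices`, `rank_configurations`), the configuration labels, and the scores are hypothetical placeholders, not results or code from the paper.

```python
from typing import Dict, List, Tuple


def k_fold_indices(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k contiguous (train, validation) index pairs."""
    if not 1 < k <= n_samples:
        raise ValueError("k must satisfy 1 < k <= n_samples")
    # Spread the remainder over the first folds so sizes differ by at most one.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in sizes:
        stop = start + size
        validation = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        folds.append((train, validation))
        start = stop
    return folds


def rank_configurations(fold_scores: Dict[str, List[float]]) -> List[Tuple[str, float]]:
    """Order configurations by mean validation score, best first."""
    means = {name: sum(scores) / len(scores) for name, scores in fold_scores.items()}
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


folds = k_fold_indices(n_samples=10, k=3)

# Hypothetical per-fold validation accuracies, for illustration only.
ranking = rank_configurations(
    {
        "config_a": [0.90, 0.92, 0.91],
        "config_b": [0.95, 0.94, 0.96],
        "config_c": [0.88, 0.93, 0.90],
    }
)
best_name = ranking[0][0]
```

In practice each score would come from training a configuration on the `train` indices of a fold and measuring it on the `validation` indices. For video data, splitting by subject rather than by frame is usually advisable so that the same person does not appear on both sides of a fold.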


rather than the entire model. Interpretable: Its purpose is to make the model's decision-making understandable to humans. Model-agnostic: It can be applied to any machine learning model regardless of the underlying algorithm. LIME provides a locally interpretable explanation by using a simple interpretable model as a description of the predictive model. It takes predictions from the complex model, creates perturbed samples from the original instance, and assigns weights to them based on their similarity to the original example. This helps identify and visualize the most important features in the decision-making process. The face mask allows LIME to identify the key facial features that the model uses to distinguish real faces from fake faces, providing valuable information about the model's predictions. The color in the LIME plot indicates the importance of each region to the model's prediction.

Green: Highlights areas the model identifies as indicative of a real (genuine) face. These regions contribute positively to the prediction that the face is real. Red: Indicates areas the model considers indicative of a fake (spoofed) face. These regions contribute negatively to the prediction, suggesting features associated with spoofing attempts.

IV. RESULTS
Based on the evaluation of pre-trained models using test data accuracy over 15 epochs (see Table 3), several observations can be made about their performance:

VGG-16: Shows consistently high accuracy across epochs, with minor fluctuations. Achieves high accuracy even after five epochs, indicating robust learning capability.

VGG-19: Starts with lower accuracy but shows significant improvement with more epochs. Achieves the highest accuracy after 15 epochs, suggesting it benefits from longer training periods.

Xception: Starts strong and maintains high accuracy across epochs, indicating stability and effectiveness in learning from the dataset.

ResNet50: Shows moderate accuracy compared to other

reliability, which is crucial for practical liveness detection systems.

TABLE 3. Evaluation of pre-trained model efficacy with test data accuracy over 15 epochs.

In the accuracy and loss graphs for each model, the accuracy increases while the loss decreases with each epoch. After ten epochs, each model's accuracy and loss stabilize for the remaining epochs.

In Figure 4, the training accuracy for InceptionV3 is 86.77% in the first epoch and then increases with each epoch. The model's accuracy is 89.65% at the beginning and continues until the end. After ten epochs, the training loss is 0.0231, and the accuracy is 99.45%. In the final epoch, the accuracy was 99.45%, and the training loss was almost zero.

In Figure 5, the training accuracy for VGG19 is 87.86% in the first epoch and then increases with each epoch. The model's accuracy is 89.65% at the beginning and continues until the end. After ten epochs, the training loss becomes 0.0107.

In Figure 6, the training accuracy for ResNet50 is 85.68% in the first epoch and then increases with each epoch. The model's accuracy is 83.00% initially and continues until the end. After ten epochs, the training loss is 0.3319, and the accuracy is 89.00%. In the final epoch, the accuracy was 89.00%, and the training loss was almost zero.

In Figure 7, for VGG16, the model training started with a loss of 0.5059 and gradually decreased to 0.0083 by the tenth epoch. During this time, training accuracy improved
models, with slight improvements over epochs but relatively from 85.21% to 99.92%. Meanwhile, validation performance
stable performance. started with a loss of 0.6001 and steadily decreased to
InceptionV3: Starts with high accuracy and maintains near- 0.0080 by the final epoch. Validation accuracy increased from
perfect scores, showing robust learning and generalization 84.00% to 99.70% over the same period.
capabilities even with fewer epochs. In Figure 8, for Xception, the model training started with
MobileNetV2: Starts with high accuracy and maintains a loss of 0.5059 and gradually decreased to 0.0083 by the
strong performance across epochs, with minor fluctuations. tenth epoch. During this time, training accuracy improved
Demonstrates efficiency in learning complex patterns. from 85.21% to 99.92%. Meanwhile, validation performance
DenseNet201: Starts with high accuracy, fluctuates started with a loss of 0.2412, decreased sharply to 0.0077 by
slightly, and achieves strong performance overall, though not the fifth epoch, and stabilized around 0.0148 by the final
consistently the highest. epoch. Validation accuracy also increased from 89.98% to
We tried multiple models (VGG16, DenseNet201, 99.84% over the same period, with slight variations.
MobileNetV2), but our analysis shows that InceptionV3 In Figure 9, for Mobilenetv2, the model training started
outperforms others in accuracy and reliability. So, we focused with a loss of 1.4409 and rapidly decreased to 0.0061 by
on these models in our detailed analysis to keep them in the tenth epoch. Concurrently, training accuracy climbed
the mainstream. The other models, while providing some from 86.54% to 99.77%. Meanwhile, validation performance
insights, didn’t meet the same level of performance and began with a loss of 1.2474 and fluctuated slightly around


FIGURE 4. InceptionV3. (a) accuracy and (b) loss graph.

FIGURE 5. VGG19 (a) accuracy and (b) loss graph.

1.156 by the final epoch. Validation accuracy initially rose to 99.15% by the fifth epoch and stabilized around 89.7%.

A. PERFORMANCE VALIDATION WITH XAI
1) MODEL EXPLAINABILITY
LIME is used in face recognition to identify the areas in images with the greatest predictive power for a model. This helps in understanding which facial regions the model relies on. We use a predefined face detection model to detect faces in images. The model outputs a confidence score for each prediction, indicating the likelihood of the image containing a face. LIME generates several perturbed image versions by randomly turning off some superpixels. The face detection model is run on these perturbed images to get predictions. LIME uses the perturbed images and their predictions to fit an interpretable model. This interpretable model identifies the superpixels most essential to the original model’s prediction. The critical regions in the LIME interpretation indicate the superpixels that impact the model prediction most. A face with a predicted score of 0.98 was obtained through LIME, indicating high confidence in face detection.
LIME Explanation: Highlights the region around the face (especially the upper part), showing these areas are crucial for the model’s prediction.
Green (Real): Indicates regions the model associates with real faces.
Red (Fake): Indicates regions the model associates with fake faces.
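The perturb, predict, and fit procedure described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors’ implementation and not the API of the `lime` library: `toy_liveness_score` is a hypothetical stand-in for the face model’s confidence score, `lime_superpixel_weights` and `solve_linear_system` are illustrative helper names, and the proximity kernel (an exponential kernel on the fraction of superpixels switched off) is a simplification chosen for brevity.

```python
import math
import random


def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def lime_superpixel_weights(predict, n_superpixels, n_samples=300,
                            kernel_width=0.5, seed=0):
    """Fit a proximity-weighted linear surrogate over on/off superpixel masks."""
    rng = random.Random(seed)
    # The first sample is the unperturbed image (every superpixel switched on).
    masks = [[1] * n_superpixels]
    for _ in range(n_samples - 1):
        masks.append([rng.randint(0, 1) for _ in range(n_superpixels)])
    # Query the black-box model on every perturbed sample.
    scores = [predict(mask) for mask in masks]
    # Samples closer to the original image receive a larger weight.
    proximity = []
    for mask in masks:
        distance = 1.0 - sum(mask) / n_superpixels
        proximity.append(math.exp(-(distance ** 2) / (kernel_width ** 2)))
    # Weighted least squares via the normal equations; the last column is the intercept.
    dim = n_superpixels + 1
    xtwx = [[0.0] * dim for _ in range(dim)]
    xtwy = [0.0] * dim
    for mask, score, weight in zip(masks, scores, proximity):
        row = [float(v) for v in mask] + [1.0]
        for i in range(dim):
            xtwy[i] += weight * row[i] * score
            for j in range(dim):
                xtwx[i][j] += weight * row[i] * row[j]
    solution = solve_linear_system(xtwx, xtwy)
    return solution[:-1], solution[-1]


def toy_liveness_score(mask):
    """Hypothetical black box: superpixels 0 and 1 raise the 'real' score, 3 lowers it."""
    return 0.2 + 0.5 * mask[0] + 0.3 * mask[1] - 0.1 * mask[3]


coefficients, intercept = lime_superpixel_weights(toy_liveness_score, n_superpixels=4)
# Positive weights correspond to green (real) regions, negative weights to red (fake).
colors = ["green" if c > 1e-9 else "red" if c < -1e-9 else "neutral"
          for c in coefficients]
```

Because the toy score is itself linear in the mask, the surrogate recovers its weights up to rounding, which makes the sketch easy to check. A real face model is not linear, so the fitted weights would only be a local approximation around the image being explained.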


FIGURE 6. Resnet50 (a) accuracy and (b) loss graph.

FIGURE 7. VGG16 (a) accuracy and (b) loss graph.

Although all models predict the correct label, they focus on different features to make their predictions. This shows in the different regions highlighted in red and green across images, which gives a good understanding of each model’s behavior and helps verify whether it bases its decisions on the features actually present in the image.
Figure 11 illustrates the application of LIME to explain the predictions of the VGG19 face detection model. Here, the predicted scores are 0.86 for the real image and 0.55 for the fake image. This highlights that the results obtained from the InceptionV3 model are of higher accuracy than those obtained by VGG19. Thus, LIME is a powerful tool for interpreting and understanding the predictions of face detection models. By providing local explanations, LIME helps identify essential features and potential biases, enhancing model transparency and trustworthiness.

V. DISCUSSIONS
Results obtained in this study are very encouraging in face detection, mainly assessed by several pre-trained models. InceptionV3 has an accuracy of 99.80%, indicating its effectiveness and strong generalization ability across face images. VGG-16 also performed very well, with an accuracy of 99.61%, reflecting its reliability and stability during training. Though a little behind, VGG-19 has an accuracy of 97.76% with significant improvements over time, thus showing its strong learning abilities. Moreover, high performance is further exemplified by Xception and MobileNetV2, with accuracies of 98.6% and 98.8%, respectively, thus showing efficiency in recognizing complex patterns. DenseNet201


FIGURE 8. Xception (a) accuracy and (b) loss graph.

FIGURE 9. Mobilenetv2 (a) accuracy and (b) loss graph.

performs very well, with minimal variation in accuracy at 98.57%. It does, however, leave room for minor improvement. ResNet50 also shows a decent accuracy of 91.24% and is very stable. These results provide a complete picture of the strengths of the models and a well-rounded view of how these models work for face detection. The implications of these results are significant for applications requiring exact face detection. If the highest level of precision is needed, InceptionV3 stands apart due to its exceptional accuracy. VGG-16 and VGG-19 suit situations requiring stability and long-term reliability. Xception and MobileNetV2 offer efficient performance, making them well suited to real-time applications where speed and accuracy are essential. DenseNet201 and ResNet50 perform very well but point to areas that would benefit from further refinement to improve their effectiveness.
InceptionV3 and VGG-16 had high accuracy throughout all epochs, proving their robustness for face liveness detection. ResNet50 and DenseNet201 had competitive performance, but their accuracy fluctuated, and training time was higher, so they were unsuitable for real-time applications. So, we recommend InceptionV3 and VGG-16 for their optimal balance of performance and reliability.
This provides a strong basis for choosing the most appropriate model for specific needs, underscoring the critical role of model performance in ensuring that face detection systems are correct and reliable. Including LIME in the evaluation procedure further improved the interpretability of these results. LIME gave local explanations of how different models focused on crucial facial features (for example, the eyes and the mouth), which greatly influenced prediction accuracy. For InceptionV3 and VGG-16, the relevant facial regions were focused on with remarkable consistency, firmly


FIGURE 10. Illustration of LIME application to explain the prediction of the Inception V3 face detection model.

FIGURE 11. Illustration of LIME application to explain the prediction of the VGG19 face detection model.

establishing their high accuracy and reliability. LIME gave more insight into the decision-making process of these models, and hence a greater understanding of why they behaved as they did and a clearer view of their strengths and areas to improve on.

A. IMPLICATIONS
The study provides critical theoretical implications about how model architectures impact face detection performance. Performance differences across models underline the importance of choosing appropriate architectures for application requirements. Results offer insight into how algorithms can be optimized for face detection and point out that the choice of architecture is an important factor in high accuracy and robustness [42]. These theoretical insights can further be used to build new model architectures or improve existing ones in future studies to enhance the efficacy of face detection technologies. The present contribution opens up avenues to develop more sophisticated models or approaches for handling different applications. Results support implementing advanced models in real-world systems, improving accuracy and user trust in biometric solutions. The practical application domains are vast, including security, access control, and personal identification, all of which require reliable and effective face detection.

B. METHODOLOGICAL CONSIDERATIONS
The process specification can significantly impact the results of face recognition. This paper demonstrates why representative data should be used to enable performance assessment with less bias. Besides diversity in datasets, another critical factor leading to high model performance is effective preprocessing. Data augmentation methods extend the size of the training dataset and enable models to handle variations in the appearance and orientation of faces. Normalization techniques ensure that the data fed to models is homogeneous and help stabilize the training process. This is an essential preliminary step in organizing the data so that the model learns relevant features accurately. The choice of training techniques and hyperparameters affects the results, underscoring the need for careful tuning and validation. Interpretability methods, such as Local Interpretable Model-agnostic Explanations (LIME), add another methodological layer to the evaluation process. LIME gives local explanations of model predictions by pointing out which critical facial regions drive the models’ decisions. LIME helps in identifying the most influential features, and thus possible biases and areas for improvement, making the models more accurate, fairer, and more trustworthy. Overall, the contributions of this study to face detection systems increase general accuracy and robustness, leading to practical solutions. These will contribute to more robust and effective face detection solutions of significant practical relevance for security, biometric authentication, and many other uses requiring facial identification.

C. BIAS DETECTION AND MITIGATION
LIME is critical not only in detecting but also in correcting biases within face detection models. This is crucial in ensuring that biometric authentication systems are fair


and reliable. Local interpretable model-agnostic explanations improve models’ credibility and effectiveness by identifying biased areas. LIME visualizes the biased areas of real and fake pictures, guiding the model away from biased regions that lead to incorrect predictions [23]. Colors play a central role in interpreting the model’s prediction. According to the model, green areas show those parts that relate to real faces, while red areas are those related to fake faces.
This visual feedback helps in understanding which part of the image the model is focusing on and helps guide model decisions toward what is most important and accurate. By highlighting these areas, LIME provides essential information to standardize and eliminate bias [43].

D. ENHANCING MODEL TRANSPARENCY AND TRUSTWORTHINESS
The above performance results were complemented by pairing LIME with the pretrained models, resulting in a deeper understanding of how the models make decisions. Such local explanations by LIME give transparency to the models so that one can see which features are more influential on the predictions. In a critical application where accurate face detection is crucial, such transparency is necessary to ensure trust. LIME allows users to understand what the models are doing so they can confirm that models are learning to make decisions based on proper, relevant features [44].

E. FUTURE RESEARCH DIRECTIONS
Future research should focus on exploring new architectures, fine-tuning models, real-time implementation, expanding datasets, exploring multi-modal approaches, advancing interpretability methods, addressing privacy and ethical considerations, and improving robustness to adversarial attacks. Refining existing models will also be crucial. These directions will enhance the security, efficiency, and transparency of face anti-spoofing systems, building on the foundation laid by this study. Integrating these models into real-world applications and continuously evaluating their effectiveness will be essential for advancing face detection technologies. Future research should investigate the effects of various methods on model performance and explore new ways to improve accuracy and reliability. Results based on this research will help develop better face detection solutions. Measuring the impact of face detection models in the real world is essential to understanding their performance.
This further implies performance assessment in different lighting conditions, demographic groups, and usage scenarios. Testing in real-world scenarios would shed considerable light on the generalization capability of these models outside controlled experimental settings. Thorough evaluations and user feedback can then be used to refine the models to ensure conformance with practical needs and expectations. Filling the gap between theoretical improvement and real implementation will ensure that face detection technologies work reliably and effectively in various contexts.
We used LIME as the primary explainable AI technique to explain the decision-making of deep learning models for face liveness detection. In future work, it will be interesting to see how other XAI techniques like IG, LRP, and X-CNN perform in face liveness detection. These techniques may give different insights into model interpretability and more comprehensive explanations for complex decisions. By combining multiple XAI methods, future research can improve the transparency and robustness of biometric systems, as well as trust and security in real-world applications.

VI. CONCLUSION
This research used pre-trained deep models to detect face spoofing attacks and transfer learning techniques to adapt these models to this particular problem. The approach was tested using ResNet50, VGG16, DenseNet201, VGG19, MobileNetV2, InceptionV3, and Xception, observing how they performed over multiple epochs. Results showed that most pre-trained models gave high accuracy, with InceptionV3 and DenseNet201 being particularly strong. This validates the effectiveness of transfer learning in face anti-spoofing. Besides, LIME gave essential insights into how these models make decisions internally. LIME helped identify the most influential features and parts of the face that contribute to the prediction, thus enhancing the interpretability and reliability of the anti-spoofing system. This integration of LIME into pre-trained models could provide high accuracy and transparency, which is critical to applications in the real world where understanding model decisions is as important as performance itself.

AUTHOR CONTRIBUTIONS
Conceptualization, Smita Khairnar, Shilpa Gite, Sudeep D. Thepade, and Biswajeet Pradhan; methodology, Smita Khairnar, Kashish Mahajan; software, Shilpa Gite; validation, Smita Khairnar, Shilpa Gite, Sudeep D. Thepade, S.M, Abdullah Alamri, and Biswajeet Pradhan; formal analysis, Smita Khairnar, S.M; investigation, Smita Khairnar, Shilpa Gite, Sudeep D. Thepade, S.M, Abdullah Alamri, and Biswajeet Pradhan; resources, Biswajeet Pradhan; data curation, Smita Khairnar; writing (original draft preparation), Smita Khairnar, and S.M.; writing (review and editing), Shilpa Gite, Biswajeet Pradhan, Abdullah Alamri and Sudeep D. Thepade; visualization, Shilpa Gite, Biswajeet Pradhan, Abdullah Alamri and Sudeep D. Thepade; supervision, Shilpa Gite and Sudeep D. Thepade; project administration, Biswajeet Pradhan; funding acquisition, Biswajeet Pradhan and Abdullah Alamri.

DATA AVAILABILITY STATEMENT
The data will be made accessible upon request.


CONFLICTS OF INTEREST
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

REFERENCES
[1] D. Garud and S. S. Agrwal, “Face liveness detection,” in Proc. Int. Conf. Autom. Control Dyn. Optim. Techn. (ICACDOT), Sep. 2016, pp. 789–792, doi: 10.1109/ICACDOT.2016.7877695.
[2] S. Chakraborty and D. Das, “An overview of face liveness detection,” Int. J. Inf. Theory, vol. 3, no. 2, pp. 11–25, Apr. 2014, doi: 10.5121/ijit.2014.3202.
[3] M. Fang, N. Damer, F. Kirchbuchner, and A. Kuijper, “Real masks and spoof faces: On the masked face presentation attack detection,” Pattern Recognit., vol. 123, Mar. 2022, Art. no. 108398, doi: 10.1016/j.patcog.2021.108398.
[4] J. Määttä, A. Hadid, and M. Pietikäinen, “Face spoofing detection from single images using micro-texture analysis,” in Proc. Int. Joint Conf. Biometrics (IJCB), Oct. 2011, pp. 1–7, doi: 10.1109/IJCB.2011.6117510.
[5] D. Wen, H. Han, and A. K. Jain, “Face spoof detection with image distortion analysis,” IEEE Trans. Inf. Forensics Security, vol. 10, no. 4, pp. 746–761, Apr. 2015, doi: 10.1109/TIFS.2015.2400395.
[6] S. Khairnar, S. Gite, and S. D. Thepade, “Empirical performance analysis of deep convolutional neural networks architectures for face liveness detection,” Tech. Rep., doi: 10.21203/rs.3.rs-3824202/v1.
[7] S. Khairnar, S. Gite, K. Kotecha, and S. D. Thepade, “Face liveness detection using artificial intelligence techniques: A systematic literature review and future directions,” Big Data Cognit. Comput., vol. 7, no. 1, p. 37, Feb. 2023, doi: 10.3390/bdcc7010037.
[8] S. Khade, S. Gite, and B. Pradhan, “Iris liveness detection using multiple deep convolution networks,” Big Data Cogn. Comput., vol. 6, no. 2, p. 67, Jun. 2022, doi: 10.3390/bdcc6020067.
[9] M. Omara, M. Fayez, H. Khalid, and S. Ghoniemy, “A transfer learning approach for face liveness detection,” in Proc. 11th Int. Conf. Intell. Comput. Inf. Syst. (ICICIS), Nov. 2023, pp. 122–127, doi: 10.1109/icicis58388.2023.10391203.
[10] D. Garreau and D. Mardaoui, “What does LIME see in images?” in Proc. Int. Conf. Mach. Learn., 2021, pp. 1–10.
[11] S. Rao, S. Mehta, S. Kulkarni, H. Dalvi, N. Katre, and M. Narvekar, “A study of LIME and SHAP model explainers for autonomous disease predictions,” in Proc. IEEE Bombay Sect. Signature Conf. (IBSSC), Dec. 2022, pp. 1–6, doi: 10.1109/IBSSC56953.2022.10037324.
[12] R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, and D. Batra, “Grad-CAM: Visual explanations from deep networks via gradient-based localization,” Int. J. Comput. Vis., vol. 128, no. 2, pp. 336–359, Feb. 2020, doi: 10.1007/s11263-019-01228-7.
[13] Y. Hailemariam, A. Yazdinejad, R. M. Parizi, G. Srivastava, and A. Dehghantanha, “An empirical evaluation of AI deep explainable tools,” in Proc. IEEE Globecom Workshops (GC Wkshps), Dec. 2020, pp. 1–6, doi: 10.1109/GCWkshps50303.2020.9367541.
[14] C. H. Ng, H. S. Abuwala, and C. H. Lim, “Towards more stable LIME for explainable AI,” in Proc. Int. Symp. Intell. Signal Process. Commun. Syst. (ISPACS), Nov. 2022, pp. 1–4, doi: 10.1109/ISPACS57703.2022.10082810.
[15] Md. M. Hasan, Md. S. U. Yusuf, T. I. Rohan, and S. Roy, “Efficient two stage approach to detect face liveness: Motion based and deep learning based,” in Proc. 4th Int. Conf. Electr. Inf. Commun. Technol. (EICT), Dec. 2019, pp. 1–6, doi: 10.1109/EICT48899.2019.9068813.
[16] S. Kumar, S. Singh, and J. Kumar, “A comparative study on face spoofing attacks,” in Proc. Int. Conf. Comput., Commun. Autom. (ICCCA), May 2017, pp. 1104–1108, doi: 10.1109/CCAA.2017.8229961.
[17] D. Mery, “True black-box explanation in facial analysis,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. Workshops (CVPRW), Jun. 2022, pp. 1595–1604, doi: 10.1109/CVPRW56347.2022.00166.
[18] R. Koshy and A. Mahmood, “Optimizing deep CNN architectures for face liveness detection,” Entropy, vol. 21, no. 4, p. 423, Apr. 2019, doi: 10.3390/e21040423.
[19] A. George and S. Marcel, “Deep pixel-wise binary supervision for face presentation attack detection,” 2019, arXiv:1907.04047.
[20] D. M. Jain, M. S. Bora, S. Chandnani, S. Grover, and S. Sadwal, “Comparison of VGG-16, VGG-19, and ResNet-101 CNN models for the purpose of suspicious activity detection,” Int. J. Sci. Res. Comput. Sci., Eng. Inf. Technol., vol. 9, pp. 121–130, Jan. 2023, doi: 10.32628/cseit2390124.
[21] U. Muhammad and M. Oussalah, “Self-supervised face presentation attack detection with dynamic grayscale snippets,” in Proc. IEEE 17th Int. Conf. Autom. Face Gesture Recognit. (FG), Jan. 2023, pp. 1–6, doi: 10.1109/FG57933.2023.10042547.
[22] A. Potdar, P. Barbhaya, and S. Nagpure, “Face recognition for attendance system using CNN based liveliness detection,” in Proc. Int. Conf. Adv. Comput., Commun. Mater. (ICACCM), Nov. 2022, pp. 1–6, doi: 10.1109/ICACCM56405.2022.10009024.
[23] H. T. T. Nguyen, H. Cao, K. V. T. Nguyen, and N. D. K. Pham. (2021). Evaluation of Explainable Artificial Intelligence: SHAP, LIME, and CAM. [Online]. Available: https://www.researchgate.net/publication/362165633
[24] Y. Akkem, S. K. Biswas, and A. Varanasi, “Streamlit-based enhancing crop recommendation systems with advanced explainable artificial intelligence for smart farming,” Neural Comput. Appl., vol. 36, no. 32, pp. 20011–20025, Nov. 2024, doi: 10.1007/s00521-024-10208-z.
[25] Y. Akkem, S. K. Biswas, and A. Varanasi, “Smart farming using artificial intelligence: A review,” Eng. Appl. Artif. Intell., vol. 120, Apr. 2023, Art. no. 105899, doi: 10.1016/j.engappai.2023.105899.
[26] R. Dwivedi, P. Kothari, D. Chopra, M. Singh, and R. Kumar, “An efficient ensemble explainable AI (XAI) approach for morphed face detection,” Pattern Recognit. Lett., vol. 184, pp. 197–204, Aug. 2024, doi: 10.1016/j.patrec.2024.06.014.
[27] D. Petkovic, “It is not ‘accuracy vs. explainability’ – We need both for trustworthy AI systems,” IEEE Trans. Technol. Soc., vol. 4, no. 1, pp. 46–53, Mar. 2023, doi: 10.1109/TTS.2023.3239921.
[28] H.-C. Shin, H. R. Roth, M. Gao, L. Lu, Z. Xu, I. Nogues, J. Yao, D. Mollura, and R. M. Summers, “Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning,” IEEE Trans. Med. Imag., vol. 35, no. 5, pp. 1285–1298, May 2016, doi: 10.1109/TMI.2016.2528162.
[29] M. Masood, M. Nawaz, A. Javed, T. Nazir, A. Mehmood, and R. Mahum, “Classification of deepfake videos using pre-trained convolutional neural networks,” in Proc. Int. Conf. Digit. Futures Transformative Technol. (ICoDT2), May 2021, pp. 1–6, doi: 10.1109/ICoDT252288.2021.9441519.
[30] H. Qassim, A. Verma, and D. Feinzimer, “Compressed residual-VGG16 CNN model for big data places image recognition,” in Proc. IEEE 8th Annu. Comput. Commun. Workshop Conf. (CCWC), Jan. 2018, pp. 169–175, doi: 10.1109/CCWC.2018.8301729.
[31] L. Wen, X. Li, X. Li, and L. Gao, “A new transfer learning based on VGG-19 network for fault diagnosis,” in Proc. IEEE 23rd Int. Conf. Comput. Supported Cooperat. Work Design (CSCWD), May 2019, pp. 205–209, doi: 10.1109/CSCWD.2019.8791884.
[32] Z. Zahisham, C. P. Lee, and K. M. Lim, “Food recognition with ResNet-50,” in Proc. IEEE 2nd Int. Conf. Artif. Intell. Eng. Technol. (IICAIET), Sep. 2020, pp. 1–5, doi: 10.1109/IICAIET49801.2020.9257825.
[33] Y. Fu, J. Wu, Y. Hu, M. Xing, and L. Xie, “DESNet: A multi-channel network for simultaneous speech dereverberation, enhancement and separation,” in Proc. IEEE Spoken Lang. Technol. Workshop (SLT), Jan. 2021, pp. 857–864, doi: 10.1109/SLT48900.2021.9383604.
[34] D. Sinha and M. El-Sharkawy, “Thin MobileNet: An enhanced MobileNet architecture,” in Proc. IEEE 10th Annu. Ubiquitous Comput., Electron. Mobile Commun. Conf. (UEMCON), Oct. 2019, pp. 0280–0285, doi: 10.1109/UEMCON47517.2019.8993089.
[35] X. Xia, C. Xu, and B. Nan, “Inception-v3 for flower classification,” in Proc. 2nd Int. Conf. Image, Vis. Comput. (ICIVC), Jun. 2017, pp. 783–787, doi: 10.1109/ICIVC.2017.7984661.
[36] S. Roopashree and J. Anitha, “DeepHerb: A vision based system for medicinal plants using xception features,” IEEE Access, vol. 9, pp. 135927–135941, 2021, doi: 10.1109/ACCESS.2021.3116207.
[37] S. Mandol, S. Mia, and Sk. Md. M. Ahsan, “Real time liveness detection and face recognition with OpenCV and deep learning,” in Proc. 5th Int. Conf. Electr. Inf. Commun. Technol. (EICT), Dec. 2021, pp. 1–6, doi: 10.1109/EICT54103.2021.9733685.
[38] Z. Zhang, “Improved Adam optimizer for deep neural networks,” in Proc. IEEE/ACM 26th Int. Symp. Quality Service (IWQoS), Jun. 2018, pp. 1–2, doi: 10.1109/IWQOS.2018.8624183.


[39] I. Salehin and D.-K. Kang, “A review on dropout regularization approaches for deep neural networks within the scholarly domain,” Electronics, vol. 12, no. 14, p. 3106, Jul. 2023, doi: 10.3390/electronics12143106.
[40] Y. Zhong, W. Deng, H. Fang, J. Hu, D. Zhao, X. Li, and D. Wen, “Dynamic training data dropout for robust deep face recognition,” IEEE Trans. Multimedia, vol. 24, pp. 1186–1197, 2022, doi: 10.1109/TMM.2021.3123478.
[41] Y. Pu, Y. Han, Y. Wang, J. Feng, C. Deng, and G. Huang, “Fine-grained recognition with learnable semantic data augmentation,” IEEE Trans. Image Process., vol. 33, pp. 3130–3144, 2024, doi: 10.1109/TIP.2024.3364500.
[42] S. Purnapatra et al., “Face liveness detection competition (LivDet-Face) – 2021,” in Proc. IEEE Int. Joint Conf. Biometrics (IJCB), Aug. 2021, pp. 1–10, doi: 10.1109/IJCB52358.2021.9484359.
[43] E. Tjoa and C. Guan, “A survey on explainable artificial intelligence (XAI): Toward medical XAI,” IEEE Trans. Neural Netw. Learn. Syst., vol. 32, no. 11, pp. 4793–4813, Nov. 2021, doi: 10.1109/TNNLS.2020.3027314.
[44] N. Barr Kumarakulasinghe, T. Blomberg, J. Liu, A. Saraiva Leao, and P. Papapetrou, “Evaluating local interpretable model-agnostic explanations on clinical machine learning classification models,” in Proc. IEEE 33rd Int. Symp. Comput.-Based Med. Syst. (CBMS), Jul. 2020, pp. 7–12, doi: 10.1109/CBMS49503.2020.00009.

SMITA KHAIRNAR received the M.E. degree in computer engineering from PCCOE, Pune. She is currently pursuing the Ph.D. degree in computer engineering with the Symbiosis Institute of Technology (SIT), Symbiosis International Deemed University, Pune. She is currently an Assistant Professor with the Department of Computer Engineering, Pimpri Chinchwad College of Engineering, Savitribai Phule Pune University, Pune, Maharashtra, India. She has published more than 15 papers in international journals and conferences. Her research interests include computer vision and machine learning, the Internet of Things, and domain adaptation.

SHILPA GITE (Member, IEEE) is currently a Passionate Educationist and a Researcher with the Symbiosis Institute of Technology. Her research interests include deep learning, computer vision, multisensor data fusion, and assistive driving. She is also working in medical imaging, explainable AI, and QuantumAI. She has published impactful manuscripts in reputed international conferences and Scopus/Web of Science-indexed journals and books. She also received Best Paper Awards at the IEMERA Conference, U.K., in October 2020, and at the RAiSE Conference, Dubai, in October 2023. She is a guide to several national/international undergraduate, postgraduate, and Ph.D. students from computer engineering and other disciplines. She has also served as a guest editor for two SCIE-indexed journals and a Scopus-indexed book on computer vision. In addition to academics and research, she is also a Reviewer for reputed journals, such as IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS, Neurocomputing, and PeerJ Computer Science.

KASHISH MAHAJAN is currently pursuing the Bachelor of Engineering degree in computer science with the Pimpri Chinchwad College of Engineering and Research, Pune. She is currently a Research Intern with the Symbiosis Centre for Applied AI, specializing in deep learning. Her research interests include full-stack development, Java, Python, and advanced machine and deep learning techniques, reflecting her dedication to pushing the boundaries of technology and innovation. She holds copyright for a project related to branch prediction and has been awarded the Eaton Scholarship for her academic excellence. She has also achieved notable success in a variety of projects and hackathons. Her work demonstrates a solid commitment to contributing to cutting-edge research and technological advancements.

BISWAJEET PRADHAN (Senior Member, IEEE) received the Habilitation degree in remote sensing from Dresden University of Technology, Germany, in 2011. He is currently the Director of the Centre for Advanced Modeling and Geospatial Information Systems (CAMGIS), Faculty of Engineering and IT. He is also a Distinguished Professor with the University of Technology Sydney. He is also an internationally established Scientist in geospatial information systems (GIS), remote sensing and image processing, complex modeling/geo-computing, machine learning and soft-computing applications, natural hazards, and environmental modeling. He has widely traveled abroad, visiting more than 55 countries to present his research findings. Since 2015, he has been the Ambassador Scientist of the Alexander Humboldt Foundation, Germany. He has published over 850 articles, of which more than 475 have been published in science citation index (SCI/SCIE) technical journals. He has authored eight books and 13 book chapters. He was a recipient of the Alexander von Humboldt Fellowship from Germany. He has received 55 awards for teaching, service, and research excellence, since 2006. From 2016 to 2018, he was listed as the World’s Most Highly Cited Researcher by Clarivate Analytics Report and one of the world’s most influential minds. In 2018, 2019, and 2020, he was awarded the World Class Professor position by the Ministry of Research, Technology and Higher Education, Indonesia. He is also an associate editor and an editorial member of over eight ISI journals.


ABDULLAH ALAMRI received the B.S. degree in geology from King Saud University, in 1981, the M.Sc. degree in applied geophysics from the University of South Florida, Tampa, FL, USA, in 1985, and the Ph.D. degree in earthquake seismology from the University of Minnesota, Minneapolis, MN, USA, in 1990. He is currently an Earthquake Seismology Professor and the Seismic Studies Center Director of King Saud University (KSU). He is also the President of the Saudi Society of Geosciences and the Editor-in-Chief of the Arabian Journal of Geosciences (AJGS). He is a member of the Seismological Society of America, American Geophysical Union, and European Association for Environmental and Engineering Geophysics, Earthquakes Mitigation in the Eastern Mediterranean Region, National Commission for Assessment and Mitigation of Earthquake Hazards in Saudi Arabia, and Mitigation of Natural Hazards Com at Civil Defense. His research interests include crustal structures and seismic micro zoning of the Arabian Peninsula. His recent projects also involve applications of EM and MT in deep groundwater exploration of empty quarter and geothermal prospecting of volcanic Harrats in the Arabian Shield. He has published over 150 research articles, achieved more than 45 research projects, and authored several books and technical reports. He is a principal and co-investigator in several national and international projects (KSU, KACST, NPST, IRIS, CTBTO, U.S. Air Force, NSF, UCSD, LLNL, OSU, PSU, and Max Planck). He has co-chaired several SSG, GSF, and RELEMR workshops and forums in the Middle East. He obtained several worldwide prizes and awards for his scientific excellence and innovation.

SUDEEP D. THEPADE received the Ph.D. degree, in 2011. He is currently the Pro-Vice Chancellor of Pimpri Chinchwad University and a Professor with the Computer Engineering Department, Pimpri Chinchwad College of Engineering, Savitribai Phule Pune University, Pune, Maharashtra, India. He has more than 350 research papers published in international/national conferences and journals to his credit. His research interests include image processing, image retrieval, video analysis, video visual data summarization, biometrics, and biometric liveness detection. He is a member of the International Association of Engineers (IAENG) and the International Association of Computer Science and Information Technology (IACSIT). He has served as a technical program committee member and a reviewer for several international conferences and journals.
