Advanced Techniques for Biometric Authentication: Leveraging Deep Learning and XAI
ABSTRACT Liveness face detection is essential for modern biometric systems, ensuring that input comes from a real, live person rather than a manipulated image or video. The novelty of this study lies in combining deep learning models with local interpretable model-agnostic explanations (LIME) to enhance the interpretability and transparency of facial liveness detection systems. This technology is necessary for preventing spoofing attacks, in which attackers attempt to defeat the security mechanism using pictures, videos, masks, and similar artefacts. Spoofing refers to the compromise of a biometric system by presenting it with untruthful material, such as photographs, videos, or masks, to gain access; if not dealt with, such fraud undermines the security of biometrics. Liveness detection relies on several strategies, from basic facial actions like blinking and head turns to more advanced algorithms that identify natural skin texture and warmth or detect pixel-level differences between live and static images. Robust liveness detection significantly enhances the security and reliability of biometric authentication. The objective of this research is to test different pre-trained models for detecting spoofing attacks and to use LIME to explain the models' predictions. This paper focuses on the Spoof in Wild with Multiple Attacks Version 2 (SiWMv2) dataset, comprising 14 different spoofing techniques, ranging from replay attacks and makeup disguises with paper glasses to more complex ones. Seven pre-trained architectures, VGG16, DenseNet201, InceptionV3, VGG19, ResNet50, MobileNetV2, and Xception, are fine-tuned for automatic liveness identification in facial images. Deep learning approaches achieve superior detection performance against contemporary spoofing techniques. Building on these deep learning approaches, LIME is incorporated to further improve transparency. LIME provides visual explanations of the prediction, showing which features support the
model's decisions. Our work demonstrates how LIME can effectively provide insights into face liveness findings and critically aid in understanding model decisions in security and authentication systems. The key findings of this research show that our method achieves 97.3% accuracy on the SiWMv2 dataset, a significant improvement over baseline models, which average 91.5%. This enhancement demonstrates the effectiveness of our approach in boosting the robustness and reliability of liveness detection.
INDEX TERMS Explainable artificial intelligence (XAI), liveness detection, LIME, pre-trained models, spoof
attacks.
scores. Finally, Section VI concludes with a discussion of the results and directions for future work.

II. RELATED WORK
Liveness detection has evolved from its initial blink-detection and motion-analysis techniques into machine learning and deep learning methods. Early techniques focusing on eye movement and motion analysis [15] struggled against advanced spoofing methods such as high-resolution printed images and 3D masks [16], exposing the vulnerabilities of early biometric systems. The integration of advanced learning techniques into the liveness detection domain is expected to enhance the analysis of dynamic facial features, texture variations, and depth information, which is transferred into models such as Xception, ResNet50, VGG16, VGG19, and InceptionV3. More precisely, deep learning models using Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) have significantly improved liveness detection by learning complex patterns that effectively separate live subjects from spoof attempts with high accuracy [17], [18].

Despite these advancements, recent studies have shown that deep learning methods still struggle with advanced spoofing attacks. Steiner et al. showed that CNNs are vulnerable to adversarial attacks, where even small changes in the input can lead to misclassifications. George and Marcel [19] also pointed out that while pixel-wise binary supervision improves performance against simple attacks, it fails to counter video and 3D mask attacks. There is therefore a need to enhance the robustness and interpretability of such models.

Pre-trained models, such as those evaluated by Jain et al. [20], appear promising. Their fine-tuning of VGG16 and ResNet for liveness detection showed that these models are accurate and efficient against simple spoofing attacks. However, they also highlighted challenges in generalizing these models to more advanced spoofing methods, such as video and 3D mask attacks. The explainability of these models is crucial for building trust in biometric systems.

Despite this progress, deep learning models are not immune to adversarial attacks, which implies that they can be manipulated into misclassification [21], [22]. To understand and validate the decision-making process, techniques such as LIME, SHAP, and Grad-CAM have been developed as critical tools under explainable AI. For example, LIME explains how models such as VGG or ResNet distinguish subtle facial cues and motion dynamics, helping to identify vulnerabilities. Such techniques can increase transparency and trust in biometric system development, but their robustness against sophisticated spoofing and adversarial attacks remains a concern.

LIME, introduced in the evaluation of XAI [23], offers a practical solution by providing visual explanations of CNN model predictions. It demonstrates which parts of the face contributed to the model's decision, making the results more interpretable. However, explainable models are more computationally expensive and may not be suitable for real-time systems. Grad-CAM and SHAP also enhance interpretability but are not robust against adversarial inputs.

SHAP, LIME, Integrated Gradients, and Counterfactual Explanations are explainable AI (XAI) techniques that bring transparency to various domains, including biometric systems. In [24], for example, XAI was used in smart farming to make complex models more understandable. These advanced methods can also make biometric authentication more explainable, e.g., by distinguishing between live faces and spoofs. Additionally, they can identify vulnerabilities, thereby enhancing robustness and security. The cross-domain applicability of XAI makes it essential for ensuring trust and reliability across applications.

AI goes beyond biometrics, proving effective in fields such as smart farming, where AI techniques like deep learning and computer vision are used to optimize crop yields, detect pests, and manage resources [25]. This shows the versatility of AI: techniques like CNNs and RNNs, used in biometric liveness detection, are also applied to large datasets in agriculture, resulting in better performance and efficiency. The success of these techniques across different domains shows that they can be used in security, authentication, agriculture, and healthcare. This broader perspective puts the importance of explainability and robustness in AI models into context, which is critical to building trust in any AI system, whether for biometrics or other practical applications.

Table 1 summarizes the related work, highlighting the advantages and limitations of face liveness detection and XAI methods. As the table shows, some models achieve better accuracy, while others offer faster detection with less computational overhead; both requirements therefore need to be balanced when selecting models for deployment.

The main research gap is the need for liveness detection systems that are both more reliable and more interpretable. While existing models show high accuracy, most lack effective defence mechanisms against advanced spoofing techniques and adversarial attacks. Therefore, while building a system, transparency of the decision-making process should be pursued alongside high accuracy in order to expose possible vulnerabilities. This problem is addressed here because secure and trustworthy biometric authentication systems are essential.

Recent works, such as those by Dwivedi et al. [26], have also examined the combination of interpretability and liveness detection. They found that deep learning models are necessary for high performance, but adding explainability techniques like LIME can make the system more trustworthy. However, the trade-off between computational complexity and real-time operation is still an open problem.

The key to gaining users' trust and to progress in biometric authentication lies in strong defence mechanisms against spoofing and adversarial attacks and in the transparency of the algorithms.
TABLE 1. Summary of related works in explainable AI and liveness detection: Highlighting advantages and limitations.

Figure 1 shows live samples and spoofs from the SiWMv2 dataset.

FIGURE 1. Samples of real and spoofed images from the SiWMv2 dataset.

B. METHOD
1) TRANSFER LEARNING
This work used the transfer learning method, which adapts a model designed for one task to form a new model for another task [27]. This approach is beneficial when limited data are available. Starting from a model pre-trained on a large dataset such as ImageNet, we can reuse the learned features and knowledge to improve performance on the specific face anti-spoofing task [28].

2) INITIAL TRAINING DATA
The initial training data comes from ImageNet pre-training. ImageNet is an extensive image database used to study object recognition software. Weights pre-trained on ImageNet have already learned a rich set of features from millions of images, including edges, textures, shapes, and other visual attributes. This initial set of weights serves as a robust starting point, allowing the model to recognize basic features that are common across many visual tasks and thereby providing a significant head start when fine-tuning for the liveness task.

3) FINE-TUNING PROCESS
Fine-tuning involves training the model further on the specific face anti-spoofing dataset. During this process, the model weights are adjusted over multiple iterations to draw attention to the critical characteristics of face anti-spoofing. The goal is to shift the model's attention from the generalized features learned from ImageNet to the specific nuances required to differentiate between real and spoofed faces. Fine-tuning thus adapts the models to the new task, improving their accuracy and performance on the target task.

The pre-trained models considered are VGG16, VGG19, ResNet50, MobileNetV2, DenseNet201, InceptionV3, and Xception [29]. Each of these models has its own architecture and strengths:

b: ResNet50
ResNet50 [32] employs residual connections (or skip connections) to facilitate the training of very deep networks. These connections allow the model to learn residuals, or differences between the input and output, addressing the vanishing gradient problem and improving performance on deep learning tasks. This architecture enables the model to go deeper (50 layers) without encountering significant degradation in performance.

c: DenseNet201
DenseNet201 introduces dense connections between layers, with each layer receiving input from all the preceding ones. This pattern improves gradient flow and feature reuse throughout the network. DenseNet201 [33], with its 201 layers, enhances performance by allowing a more efficient flow of information and gradients, leading to improved classification and object detection results.

d: MobileNetV2
MobileNetV2 was designed with efficiency in mind, making it particularly suitable for mobile and embedded applications [34]. It reduces the number of computations and parameters by using depthwise separable convolutions, all while maintaining the required accuracy. This design allows for faster inference times and lower computational resource usage, making it ideal for real-time applications on mobile devices.

e: InceptionV3
InceptionV3 features a complex architecture with inception modules that capture multi-scale features by applying multiple convolutions with different kernel sizes in parallel [35]. This allows the network to learn various levels of abstraction and combine them, making it adept at handling diverse visual features. The InceptionV3 architecture balances accuracy and computational efficiency, making it suitable for a wide range of tasks.

The proposed architecture for face liveness detection using a pretrained model and LIME is shown in Figure 3.
FIGURE 3. Proposed architecture for face liveness detection using pre-trained models and local interpretable model-agnostic explanations.
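To make the transfer-learning pipeline described above concrete, the following minimal TensorFlow/Keras sketch loads an ImageNet-pretrained backbone, freezes it, and attaches a small two-class (real vs. spoof) head. The choice of InceptionV3, the 299x299 input size, and the head layout are illustrative assumptions rather than the exact configuration used in this work.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Illustrative sketch only: reuse an ImageNet-pretrained InceptionV3 as a frozen
# feature extractor and add a small head for the two liveness classes.
base = tf.keras.applications.InceptionV3(
    weights="imagenet",          # start from ImageNet pre-trained weights
    include_top=False,           # drop the original 1000-class classifier
    input_shape=(299, 299, 3),
)
base.trainable = False           # freeze the backbone for the first training stage

inputs = layers.Input(shape=(299, 299, 3))
x = tf.keras.applications.inception_v3.preprocess_input(inputs)
x = base(x, training=False)      # keep batch-norm statistics frozen
x = layers.GlobalAveragePooling2D()(x)
outputs = layers.Dense(2, activation="softmax")(x)  # real vs. spoof
model = models.Model(inputs, outputs)

# Fine-tuning: once the head has converged, a few top blocks of the backbone can
# be unfrozen and trained at a low learning rate to adapt the generic ImageNet
# features to the nuances of face anti-spoofing.
```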
Xception modifies the Inception design by using depthwise separable convolutions instead of standard convolutions. This modification further enhances the model's accuracy by separating the filtering of features from the combining of feature maps. The use of depthwise separable convolutions reduces computational complexity while improving model performance, making Xception a powerful tool for image classification and other vision tasks.

We further evaluate the reliability of each model by examining its performance across various types of spoofing attacks and its robustness under different lighting and environmental conditions. This ensures the models are accurate and reliable in real-world applications where conditions vary significantly.

The advantages and limitations of the deep learning models experimented with in this study are summarized in Table 2. InceptionV3 and Xception performed better on complex patterns, but ResNet50 and DenseNet201 incurred higher computational cost and longer training times. The table shows the trade-off between accuracy, speed, and computational complexity, which are the key factors to consider when choosing models for real-world biometric systems. We have focused more on InceptionV3 and Xception because of their balanced performance in both accuracy and feasibility.

5) TRAINING ENVIRONMENT
Models are trained using TensorFlow on Google Colab, a free cloud environment that provides access to powerful GPUs and TPUs. This setup allows high-quality experimentation even with limited local resources. TensorFlow is a popular deep learning framework that provides a flexible and extensible platform for building and deploying machine learning models. Google Colab encourages collaboration, keeps costs manageable, and integrates with Google Drive to store and manage files.

6) EVALUATION METRICS
The model is evaluated using the categorical cross-entropy loss function, computed from softmax outputs that lie between 0 and 1, which is used to evaluate the effectiveness of the model's predictions.
TABLE 2. Advantages and limitations of deep learning models used in experimentation.

7) OPTIMIZER
Training uses the Adam optimizer, which combines two extensions of stochastic gradient descent to provide adaptive moment estimation. A learning rate of 0.001, the de facto standard for robust training and stable convergence, is used.
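A hedged sketch of the training configuration described above (categorical cross-entropy with Adam at a learning rate of 0.001) is shown below; the stand-in classifier and the dummy data are placeholders, and the 15-epoch horizon mirrors the evaluation window mentioned in the Table 3 caption.

```python
import numpy as np
import tensorflow as tf

# Stand-in classifier; in practice this would be the fine-tuned backbone sketched earlier.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(2048,)),             # pooled backbone features (illustrative size)
    tf.keras.layers.Dense(2, activation="softmax"),    # real vs. spoof
])

# Categorical cross-entropy on one-hot labels, optimized with Adam at the
# learning rate of 0.001 described in the text.
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
    loss="categorical_crossentropy",
    metrics=["accuracy"],
)

# Dummy data standing in for the SiWMv2 features and labels.
x = np.random.rand(64, 2048).astype("float32")
y = tf.keras.utils.to_categorical(np.random.randint(0, 2, size=64), num_classes=2)
model.fit(x, y, epochs=15, batch_size=16, verbose=0)
```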
8) REGULARIZATION TECHNIQUE
Dropout regularization is used to prevent overfitting. Dropout is a process in which randomly selected neurons are ignored during training: their contribution to the activation of downstream neurons is temporarily removed on the forward pass, and weight updates are not applied to those neurons on the backward pass [39]. This reduces over-reliance on individual neurons, making the network less sensitive to the weights of any single neuron. As a result, the model becomes more robust and less likely to overfit the training data, thus generalizing better to unseen data [40].
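As a small, hedged illustration of where dropout sits in such a network, the head below inserts a Dropout layer between dense layers; the 0.5 rate and the layer sizes are assumed typical values, not settings reported in this work.

```python
from tensorflow.keras import layers, models

# Classification head with dropout regularization: during training, each unit in
# the 256-wide layer is dropped with probability 0.5, so no single neuron can be
# relied on too heavily; at inference time dropout is disabled automatically.
head = models.Sequential([
    layers.Input(shape=(2048,)),             # pooled backbone features (illustrative)
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(2, activation="softmax"),   # real vs. spoof
])
```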
rather than the entire model. Interpretable: its purpose is to make the model's decision-making understandable to humans. Model-agnostic: it can be applied to any machine learning model regardless of the underlying algorithm. Local interpretable model-agnostic explanations therefore provide a locally interpretable view of individual predictions.

1.156 by the final epoch. Validation accuracy initially rose to 99.15% by the fifth epoch and stabilized around 89.7%, demonstrating a reliability which is crucial for practical liveness detection systems.

TABLE 3. Evaluation of pre-trained model efficacy with test data accuracy over 15 epochs.

LIME uses perturbed images and their predictions to fit an interpretable surrogate model. The fitted surrogate identifies the superpixels that are most essential to the original model's prediction.
A. PERFORMANCE VALIDATION WITH XAI
1) MODEL EXPLAINABILITY
LIME is used in face recognition to identify the areas in an image with the greatest predictive power for a model. This helps in understanding which parts of the face the model relies on to recognize it. We use a predefined face detection model to detect faces in images. The model outputs a confidence score for each prediction, indicating the likelihood of the image containing a face. LIME generates several perturbed versions of the image by randomly turning off some superpixels, and the face detection model is run on these perturbed images to get predictions. The critical regions in the LIME interpretation indicate the superpixels that impact the model prediction most. A face with a predicted score of 0.98 was obtained through LIME, indicating high confidence in face detection.

LIME Explanation: highlights the region around the face (especially the upper part), showing these areas are crucial for the model's prediction.
Green (Real): indicates regions the model associates with real faces.
Red (Fake): indicates regions the model associates with fake faces.
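The perturb-and-predict procedure just described matches the standard lime image explainer. The sketch below is a hypothetical, self-contained usage in which the trained liveness model and the test face image are replaced by stand-ins, and values such as num_samples are illustrative rather than the settings used in this study.

```python
import numpy as np
import tensorflow as tf
from lime import lime_image
from skimage.segmentation import mark_boundaries

# Stand-in two-class (real / spoof) classifier and face image; in practice these
# would be the fine-tuned backbone and a test image from the dataset.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(224, 224, 3)),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(2, activation="softmax"),
])
face_image = np.random.rand(224, 224, 3)

def classifier_fn(images):
    # LIME passes a batch of perturbed copies; return softmax scores of shape (n, 2).
    return model.predict(np.asarray(images), verbose=0)

explainer = lime_image.LimeImageExplainer()
explanation = explainer.explain_instance(
    face_image,
    classifier_fn,
    top_labels=2,       # explain both classes
    hide_color=0,       # "switched off" superpixels are filled with black
    num_samples=1000,   # number of perturbed images (illustrative value)
)

# Superpixels supporting the predicted class (rendered green) vs. contradicting it (red).
image, mask = explanation.get_image_and_mask(
    explanation.top_labels[0], positive_only=False, num_features=10, hide_rest=False
)
overlay = mark_boundaries(image, mask)
```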
Although all models predict the correct label, they focus on different features to make their predictions. This is apparent from the different regions highlighted in red and green across the images, which provides a good understanding of each model's behaviour and verifies whether it makes its decisions based on the relevant features present in the image.

Figure 11 illustrates the application of LIME to explain the predictions of the VGG19 face detection model. Here we can observe the predicted scores for real and fake images, which are 0.86 for real and 0.55 for fake. This highlights that the results obtained from the InceptionV3 model are of higher confidence than those obtained by VGG19. Thus, LIME is a powerful tool for interpreting and understanding the predictions of face detection models. By providing local explanations, LIME helps identify essential features and potential biases, enhancing model transparency and trustworthiness.

V. DISCUSSIONS
Results obtained in this study are very encouraging for face detection, as assessed across several pre-trained models. InceptionV3 has an accuracy of 99.80%, indicating its effectiveness and strong generalization ability across face images. VGG-16 also performed very well, with an accuracy of 99.61%, reflecting its reliability and stability during training. Though a little behind, VGG-19 has an accuracy of 97.76% with significant improvements over time, showing its strong learning abilities. Moreover, high performance is further exemplified by Xception and MobileNetV2, with accuracies of 98.6% and 98.8%, respectively, showing efficiency in recognizing complex patterns. DenseNet201 performs very well, with minimal variation in accuracy at 98.57%; it does, however, leave room for minor improvement. ResNet50 also achieves a reasonable accuracy of 91.24% and is very stable. These results provide a complete picture of the strengths of the models and a well-rounded view of how these models work for face detection. The implications of these results are significant for applications requiring exact face detection. If the highest level of precision is needed, InceptionV3 stands apart due to its exceptional accuracy. VGG-16 and VGG-19 are well suited to situations requiring stability and long-term reliability. Xception and MobileNetV2 offer efficient performance, making them very applicable in real-time applications where speed and accuracy are essential. DenseNet201 and ResNet50 perform well but point out areas that would benefit from further refinement to improve their effectiveness.

FIGURE 10. Illustration of LIME application to explain the prediction of the InceptionV3 face detection model.
FIGURE 11. Illustration of LIME application to explain the prediction of the VGG19 face detection model.

InceptionV3 and VGG-16 had high accuracy throughout all epochs, proving their robustness for face liveness detection. ResNet50 and DenseNet201 had competitive performance, but their accuracy fluctuated and training time was higher, so they are less suitable for real-time applications. We therefore recommend InceptionV3 and VGG-16 for their optimal balance of performance and reliability.

This supported a strong base for choosing the most appropriate model for specific needs, underscoring the critical role of model performance in ensuring that face detection systems are correct and reliable. Including LIME in the evaluation procedure further improved the interpretability of these results. LIME gave local explanations of how different models focused on crucial facial features, for example the eyes and the mouth, which greatly influenced prediction accuracy. For InceptionV3 and VGG-16, the relevant facial regions were focused on with remarkable consistency, firmly
establishing their high accuracy and reliability. LIME gave more insight into the decision-making process of these models, hence a greater understanding of why they behaved as they did and a much clearer view of their strengths and areas to improve.

A. IMPLICATIONS
The study provides critical theoretical implications about how model architectures impact face detection performance. Performance differences across models underline the importance of choosing appropriate architectures for application requirements. Results offer insight into how algorithms can be optimized for face detection and point out that the choice of architecture is an important factor in achieving high accuracy and robustness [42]. These theoretical insights can further be used to build new model architectures or improve existing ones in future studies to enhance the efficacy of face detection technologies. The present contribution opens up avenues to develop more sophisticated models or approaches for handling different applications. Results support implementing advanced models in real-world systems, improving accuracy and user trust in biometric solutions. The practical application domains are vast, ranging from security and access control to personal identification, all of which require reliable and effective face detection.

with less bias. Besides diversity in datasets, another critical factor leading to high model performance is effective preprocessing. Data augmentation methods extend the size of the training dataset and enable models to handle variations in the appearance and orientation of faces (a minimal preprocessing sketch is given at the end of this subsection). Normalization techniques ensure that the data fed to the models is homogeneous and help stabilize the training process; this is an essential preliminary step towards organizing the data so that the model learns relevant features accurately. The choice of training techniques and hyperparameters also impinges on the results, underscoring the need for careful tuning and validation. Interpretability methods, such as local interpretable model-agnostic explanations, add another methodological layer to the evaluation process. LIME is able to give local explanations of model predictions by pointing out which critical facial regions drive the models' decisions. LIME helps identify the most influential features, thus revealing possible biases and room for improvement, and making the models more accurate, fairer, and more trustworthy. Overall, the contributions of this study to face detection systems increase general accuracy and robustness, leading to practical solutions. These will contribute to more robust and effective face detection solutions of significant practical relevance for security, biometric authentication, and many other uses requiring facial identification.
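To make the preprocessing described above concrete, here is a minimal sketch using Keras preprocessing layers; the specific transforms and ranges are assumptions for illustration, since the exact augmentation settings are not listed here.

```python
import tensorflow as tf
from tensorflow.keras import layers

# Illustrative augmentation + normalization pipeline: small geometric perturbations
# extend the effective training set, and rescaling makes all inputs homogeneous.
augment_and_normalize = tf.keras.Sequential([
    layers.RandomFlip("horizontal"),
    layers.RandomRotation(0.05),      # mild rotations to mimic head-pose variation
    layers.RandomZoom(0.1),
    layers.Rescaling(1.0 / 255.0),    # normalize pixel values to [0, 1]
])

images = tf.random.uniform((4, 224, 224, 3), maxval=255.0)  # stand-in batch of face crops
augmented = augment_and_normalize(images, training=True)    # training=True enables the random ops
```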
and reliable. Local interpretable model-agnostic explanations improve models' credibility and effectiveness by identifying biased areas. LIME visualizes the biased areas of real and fake pictures, guiding the model away from biased regions that lead to incorrect predictions [23]. Colors play a prominent role in the model's prediction: green areas show the parts the model associates with real faces, while red areas are those it associates with fake faces. This visual feedback helps users understand which part of the image the model is focusing on and helps guide model decisions based on what is most important and accurate. By highlighting these areas, LIME provides essential information to standardize and eliminate bias [43].
D. ENHANCING MODEL TRANSPARENCY AND TRUSTWORTHINESS
The above performance was complemented by combining LIME with the pretrained models, resulting in a deeper understanding of how the models make decisions. Such local explanations by LIME give transparency to the models so that one can see which features are most influential on the predictions. In critical applications where accurate face detection is crucial, such transparency is necessary to ensure trust. LIME allows users to understand what the models are doing so they can confirm that the models are learning to make decisions based on proper, relevant features [44].

E. FUTURE RESEARCH DIRECTIONS
Future research should focus on exploring new architectures, fine-tuning models, real-time implementation, expanding datasets, exploring multi-modal approaches, advancing interpretability methods, addressing privacy and ethical considerations, and improving robustness to adversarial attacks; refining existing models will also be crucial. These directions will enhance the security, efficiency, and transparency of face anti-spoofing systems, building on the foundation laid by this study. Integrating these models into real-world applications and continuously evaluating their effectiveness will be essential for advancing face detection technologies. Future research should investigate the effects of various methods on model performance and explore new ways to improve accuracy and reliability. Results based on this research will help develop better face detection solutions. Measuring the impact of face detection models in the real world is essential to understanding their performance.

This further implies performance assessment across different lighting conditions, demographic groups, and usage scenarios. Testing in real-world scenarios would shed considerable light on the generalization capability of these models outside controlled experimental settings. Thorough evaluations and user feedback will refine the models to ensure conformance with practical needs and expectations. Filling the gap between theoretical improvement and real implementation will ensure that face detection technologies work reliably and effectively in various contexts.

We used LIME as the primary explainable AI technique to explain the decision-making of deep learning models for face liveness detection. In future work, it will be interesting to see how other XAI techniques, such as IG, LRP, and X-CNN, perform in face liveness detection. These techniques will give different insights into model interpretability and more comprehensive explanations for complex decisions. By combining multiple XAI methods, future research can improve the transparency and robustness of biometric systems and strengthen trust and security in real-world applications.

VI. CONCLUSION
This research used pre-trained deep models to detect face spoofing attacks and transfer learning techniques to adapt these models to a particular problem. The idea was tested using ResNet50, VGG16, DenseNet201, VGG19, MobileNetV2, InceptionV3, and Xception, observing how they performed over multiple epochs. Results showed that most pre-trained models gave high accuracy, with InceptionV3 and DenseNet201 being particularly strong. This validates the effectiveness of transfer learning in face anti-spoofing. Besides, LIME gave essential insights into how these models make decisions. LIME helped identify the most influential features and parts of the face that contribute to the prediction, thus enhancing the interpretability and reliability of the anti-spoofing system. This integration of LIME into pre-trained models can provide both high accuracy and transparency, which is critical for real-world applications where understanding model decisions is as important as performance itself.

AUTHOR CONTRIBUTIONS
Conceptualization, Smita Khairnar, Shilpa Gite, Sudeep D. Thepade, and Biswajeet Pradhan; methodology, Smita Khairnar, Kashish Mahajan; software, Shilpa Gite; validation, Smita Khairnar, Shilpa Gite, Sudeep D. Thepade, S.M, Abdullah Alamri, and Biswajeet Pradhan; formal analysis, Smita Khairnar, S.M; investigation, Smita Khairnar, Shilpa Gite, Sudeep D. Thepade, S.M, Abdullah Alamri, and Biswajeet Pradhan; resources, Biswajeet Pradhan; data curation, Smita Khairnar; writing—original draft preparation, Smita Khairnar, and S.M.; writing—review and editing, Shilpa Gite, Biswajeet Pradhan, Abdullah Alamri and Sudeep D. Thepade; visualization, Shilpa Gite, Biswajeet Pradhan, Abdullah Alamri and Sudeep D. Thepade; supervision, Shilpa Gite and Sudeep D. Thepade; project administration, Biswajeet Pradhan; funding acquisition, Biswajeet Pradhan and Abdullah Alamri.

DATA AVAILABILITY STATEMENT
The data will be made accessible upon request.
CONFLICTS OF INTEREST
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

REFERENCES
[1] D. Garud and S. S. Agrwal, "Face liveness detection," in Proc. Int. Conf. Autom. Control Dyn. Optim. Techn. (ICACDOT), Sep. 2016, pp. 789–792, doi: 10.1109/ICACDOT.2016.7877695.
[2] S. Chakraborty and D. Das, "An overview of face liveness detection," Int. J. Inf. Theory, vol. 3, no. 2, pp. 11–25, Apr. 2014, doi: 10.5121/ijit.2014.3202.
[3] M. Fang, N. Damer, F. Kirchbuchner, and A. Kuijper, "Real masks and spoof faces: On the masked face presentation attack detection," Pattern Recognit., vol. 123, Mar. 2022, Art. no. 108398, doi: 10.1016/j.patcog.2021.108398.
[4] J. Määttä, A. Hadid, and M. Pietikäinen, "Face spoofing detection from single images using micro-texture analysis," in Proc. Int. Joint Conf. Biometrics (IJCB), Oct. 2011, pp. 1–7, doi: 10.1109/IJCB.2011.6117510.
[5] D. Wen, H. Han, and A. K. Jain, "Face spoof detection with image distortion analysis," IEEE Trans. Inf. Forensics Security, vol. 10, no. 4, pp. 746–761, Apr. 2015, doi: 10.1109/TIFS.2015.2400395.
[6] S. Khairnar, S. Gite, and S. D. Thepade, "Empirical performance analysis of deep convolutional neural networks architectures for face liveness detection," Tech. Rep., doi: 10.21203/rs.3.rs-3824202/v1.
[7] S. Khairnar, S. Gite, K. Kotecha, and S. D. Thepade, "Face liveness detection using artificial intelligence techniques: A systematic literature review and future directions," Big Data Cognit. Comput., vol. 7, no. 1, p. 37, Feb. 2023, doi: 10.3390/bdcc7010037.
[8] S. Khade, S. Gite, and B. Pradhan, "Iris liveness detection using multiple deep convolution networks," Big Data Cogn. Comput., vol. 6, no. 2, p. 67, Jun. 2022, doi: 10.3390/bdcc6020067.
[9] M. Omara, M. Fayez, H. Khalid, and S. Ghoniemy, "A transfer learning approach for face liveness detection," in Proc. 11th Int. Conf. Intell. Comput. Inf. Syst. (ICICIS), Nov. 2023, pp. 122–127, doi: 10.1109/icicis58388.2023.10391203.
[10] D. Garreau and D. Mardaoui, "What does LIME see in images?" in Proc. Int. Conf. Mach. Learn., 2021, pp. 1–10.
[11] S. Rao, S. Mehta, S. Kulkarni, H. Dalvi, N. Katre, and M. Narvekar, "A study of LIME and SHAP model explainers for autonomous disease predictions," in Proc. IEEE Bombay Sect. Signature Conf. (IBSSC), Dec. 2022, pp. 1–6, doi: 10.1109/IBSSC56953.2022.10037324.
[12] R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, and D. Batra, "Grad-CAM: Visual explanations from deep networks via gradient-based localization," Int. J. Comput. Vis., vol. 128, no. 2, pp. 336–359, Feb. 2020, doi: 10.1007/s11263-019-01228-7.
[13] Y. Hailemariam, A. Yazdinejad, R. M. Parizi, G. Srivastava, and A. Dehghantanha, "An empirical evaluation of AI deep explainable tools," in Proc. IEEE Globecom Workshops (GC Wkshps), Dec. 2020, pp. 1–6, doi: 10.1109/GCWkshps50303.2020.9367541.
[14] C. H. Ng, H. S. Abuwala, and C. H. Lim, "Towards more stable LIME for explainable AI," in Proc. Int. Symp. Intell. Signal Process. Commun. Syst. (ISPACS), Nov. 2022, pp. 1–4, doi: 10.1109/ISPACS57703.2022.10082810.
[15] Md. M. Hasan, Md. S. U. Yusuf, T. I. Rohan, and S. Roy, "Efficient two stage approach to detect face liveness: Motion based and deep learning based," in Proc. 4th Int. Conf. Electr. Inf. Commun. Technol. (EICT), Dec. 2019, pp. 1–6, doi: 10.1109/EICT48899.2019.9068813.
[16] S. Kumar, S. Singh, and J. Kumar, "A comparative study on face spoofing attacks," in Proc. Int. Conf. Comput., Commun. Autom. (ICCCA), May 2017, pp. 1104–1108, doi: 10.1109/CCAA.2017.8229961.
[17] D. Mery, "True black-box explanation in facial analysis," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. Workshops (CVPRW), Jun. 2022, pp. 1595–1604, doi: 10.1109/CVPRW56347.2022.00166.
[18] R. Koshy and A. Mahmood, "Optimizing deep CNN architectures for face liveness detection," Entropy, vol. 21, no. 4, p. 423, Apr. 2019, doi: 10.3390/e21040423.
[19] A. George and S. Marcel, "Deep pixel-wise binary supervision for face presentation attack detection," 2019, arXiv:1907.04047.
[20] D. M. Jain, M. S. Bora, S. Chandnani, S. Grover, and S. Sadwal, "Comparison of VGG-16, VGG-19, and ResNet-101 CNN models for the purpose of suspicious activity detection," Int. J. Sci. Res. Comput. Sci., Eng. Inf. Technol., vol. 9, pp. 121–130, Jan. 2023, doi: 10.32628/cseit2390124.
[21] U. Muhammad and M. Oussalah, "Self-supervised face presentation attack detection with dynamic grayscale snippets," in Proc. IEEE 17th Int. Conf. Autom. Face Gesture Recognit. (FG), Jan. 2023, pp. 1–6, doi: 10.1109/FG57933.2023.10042547.
[22] A. Potdar, P. Barbhaya, and S. Nagpure, "Face recognition for attendance system using CNN based liveliness detection," in Proc. Int. Conf. Adv. Comput., Commun. Mater. (ICACCM), Nov. 2022, pp. 1–6, doi: 10.1109/ICACCM56405.2022.10009024.
[23] H. T. T. Nguyen, H. Cao, K. V. T. Nguyen, and N. D. K. Pham, Evaluation of Explainable Artificial Intelligence: SHAP, LIME, and CAM, 2021. [Online]. Available: https://www.researchgate.net/publication/362165633
[24] Y. Akkem, S. K. Biswas, and A. Varanasi, "Streamlit-based enhancing crop recommendation systems with advanced explainable artificial intelligence for smart farming," Neural Comput. Appl., vol. 36, no. 32, pp. 20011–20025, Nov. 2024, doi: 10.1007/s00521-024-10208-z.
[25] Y. Akkem, S. K. Biswas, and A. Varanasi, "Smart farming using artificial intelligence: A review," Eng. Appl. Artif. Intell., vol. 120, Apr. 2023, Art. no. 105899, doi: 10.1016/j.engappai.2023.105899.
[26] R. Dwivedi, P. Kothari, D. Chopra, M. Singh, and R. Kumar, "An efficient ensemble explainable AI (XAI) approach for morphed face detection," Pattern Recognit. Lett., vol. 184, pp. 197–204, Aug. 2024, doi: 10.1016/j.patrec.2024.06.014.
[27] D. Petkovic, "It is not 'accuracy vs. explainability'—We need both for trustworthy AI systems," IEEE Trans. Technol. Soc., vol. 4, no. 1, pp. 46–53, Mar. 2023, doi: 10.1109/TTS.2023.3239921.
[28] H.-C. Shin, H. R. Roth, M. Gao, L. Lu, Z. Xu, I. Nogues, J. Yao, D. Mollura, and R. M. Summers, "Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning," IEEE Trans. Med. Imag., vol. 35, no. 5, pp. 1285–1298, May 2016, doi: 10.1109/TMI.2016.2528162.
[29] M. Masood, M. Nawaz, A. Javed, T. Nazir, A. Mehmood, and R. Mahum, "Classification of deepfake videos using pre-trained convolutional neural networks," in Proc. Int. Conf. Digit. Futures Transformative Technol. (ICoDT2), May 2021, pp. 1–6, doi: 10.1109/ICoDT252288.2021.9441519.
[30] H. Qassim, A. Verma, and D. Feinzimer, "Compressed residual-VGG16 CNN model for big data places image recognition," in Proc. IEEE 8th Annu. Comput. Commun. Workshop Conf. (CCWC), Jan. 2018, pp. 169–175, doi: 10.1109/CCWC.2018.8301729.
[31] L. Wen, X. Li, X. Li, and L. Gao, "A new transfer learning based on VGG-19 network for fault diagnosis," in Proc. IEEE 23rd Int. Conf. Comput. Supported Cooperat. Work Design (CSCWD), May 2019, pp. 205–209, doi: 10.1109/CSCWD.2019.8791884.
[32] Z. Zahisham, C. P. Lee, and K. M. Lim, "Food recognition with ResNet-50," in Proc. IEEE 2nd Int. Conf. Artif. Intell. Eng. Technol. (IICAIET), Sep. 2020, pp. 1–5, doi: 10.1109/IICAIET49801.2020.9257825.
[33] Y. Fu, J. Wu, Y. Hu, M. Xing, and L. Xie, "DESNet: A multi-channel network for simultaneous speech dereverberation, enhancement and separation," in Proc. IEEE Spoken Lang. Technol. Workshop (SLT), Jan. 2021, pp. 857–864, doi: 10.1109/SLT48900.2021.9383604.
[34] D. Sinha and M. El-Sharkawy, "Thin MobileNet: An enhanced MobileNet architecture," in Proc. IEEE 10th Annu. Ubiquitous Comput., Electron. Mobile Commun. Conf. (UEMCON), Oct. 2019, pp. 0280–0285, doi: 10.1109/UEMCON47517.2019.8993089.
[35] X. Xia, C. Xu, and B. Nan, "Inception-v3 for flower classification," in Proc. 2nd Int. Conf. Image, Vis. Comput. (ICIVC), Jun. 2017, pp. 783–787, doi: 10.1109/ICIVC.2017.7984661.
[36] S. Roopashree and J. Anitha, "DeepHerb: A vision based system for medicinal plants using xception features," IEEE Access, vol. 9, pp. 135927–135941, 2021, doi: 10.1109/ACCESS.2021.3116207.
[37] S. Mandol, S. Mia, and Sk. Md. M. Ahsan, "Real time liveness detection and face recognition with OpenCV and deep learning," in Proc. 5th Int. Conf. Electr. Inf. Commun. Technol. (EICT), Dec. 2021, pp. 1–6, doi: 10.1109/EICT54103.2021.9733685.
[38] Z. Zhang, "Improved Adam optimizer for deep neural networks," in Proc. IEEE/ACM 26th Int. Symp. Quality Service (IWQoS), Jun. 2018, pp. 1–2, doi: 10.1109/IWQOS.2018.8624183.
[39] I. Salehin and D.-K. Kang, "A review on dropout regularization approaches for deep neural networks within the scholarly domain," Electronics, vol. 12, no. 14, p. 3106, Jul. 2023, doi: 10.3390/electronics12143106.
[40] Y. Zhong, W. Deng, H. Fang, J. Hu, D. Zhao, X. Li, and D. Wen, "Dynamic training data dropout for robust deep face recognition," IEEE Trans. Multimedia, vol. 24, pp. 1186–1197, 2022, doi: 10.1109/TMM.2021.3123478.
[41] Y. Pu, Y. Han, Y. Wang, J. Feng, C. Deng, and G. Huang, "Fine-grained recognition with learnable semantic data augmentation," IEEE Trans. Image Process., vol. 33, pp. 3130–3144, 2024, doi: 10.1109/TIP.2024.3364500.
[42] S. Purnapatra et al., "Face liveness detection competition (LivDet-Face)—2021," in Proc. IEEE Int. Joint Conf. Biometrics (IJCB), Aug. 2021, pp. 1–10, doi: 10.1109/IJCB52358.2021.9484359.
[43] E. Tjoa and C. Guan, "A survey on explainable artificial intelligence (XAI): Toward medical XAI," IEEE Trans. Neural Netw. Learn. Syst., vol. 32, no. 11, pp. 4793–4813, Nov. 2021, doi: 10.1109/TNNLS.2020.3027314.
[44] N. Barr Kumarakulasinghe, T. Blomberg, J. Liu, A. Saraiva Leao, and P. Papapetrou, "Evaluating local interpretable model-agnostic explanations on clinical machine learning classification models," in Proc. IEEE 33rd Int. Symp. Comput.-Based Med. Syst. (CBMS), Jul. 2020, pp. 7–12, doi: 10.1109/CBMS49503.2020.00009.

KASHISH MAHAJAN is currently pursuing the Bachelor of Engineering degree in computer science with the Pimpri Chinchwad College of Engineering and Research, Pune. She is currently a Research Intern with the Symbiosis Centre for Applied AI, specializing in deep learning. Her research interests include full-stack development, Java, Python, and advanced machine and deep learning techniques, reflecting her dedication to pushing the boundaries of technology and innovation. She holds copyright for a project related to branch prediction and has been awarded the Eaton Scholarship for her academic excellence. She has also achieved notable success in a variety of projects and hackathons. Her work demonstrates a solid commitment to contributing to cutting-edge research and technological advancements.
ABDULLAH ALAMRI received the B.S. degree in geology from King Saud University, in 1981, the M.Sc. degree in applied geophysics from the University of South Florida, Tampa, FL, USA, in 1985, and the Ph.D. degree in earthquake seismology from the University of Minnesota, Minneapolis, MN, USA, in 1990. He is currently an Earthquake Seismology Professor and the Seismic Studies Center Director of King Saud University (KSU). He is also the President of the Saudi Society of Geosciences and the Editor-in-Chief of the Arabian Journal of Geosciences (AJGS). He is a member of the Seismological Society of America, American Geophysical Union, and European Association for Environmental and Engineering Geophysics, Earthquakes Mitigation in the Eastern Mediterranean Region, National Commission for Assessment and Mitigation of Earthquake Hazards in Saudi Arabia, and Mitigation of Natural Hazards Com at Civil Defense. His research interests include crustal structures and seismic micro zoning of the Arabian Peninsula. His recent projects also involve applications of EM and MT in deep groundwater exploration of empty quarter and geothermal prospecting of volcanic Harrats in the Arabian Shield. He has published over 150 research articles, achieved more than 45 research projects, and authored several books and technical reports. He is a principal and co-investigator in several national and international projects (KSU, KACST, NPST, IRIS, CTBTO, U.S. Air Force, NSF, UCSD, LLNL, OSU, PSU, and Max Planck). He has co-chaired several SSG, GSF, and RELEMR workshops and forums in the Middle East. He obtained several worldwide prizes and awards for his scientific excellence and innovation.

SUDEEP D. THEPADE received the Ph.D. degree, in 2011. He is currently the Pro-Vice Chancellor of Pimpri Chinchwad University and a Professor with the Computer Engineering Department, Pimpri Chinchwad College of Engineering, Savitribai Phule Pune University, Pune, Maharashtra, India. He has more than 350 research papers published in international/national conferences and journals to his credit. His research interests include image processing, image retrieval, video analysis, video visual data summarization, biometrics, and biometric liveness detection. He is a member of the International Association of Engineers (IAENG) and the International Association of Computer Science and Information Technology (IACSIT). He has served as a technical program committee member and a reviewer for several international conferences and journals.