Enhanced IoT-Based Face Mask Detection Framework Using Optimized Deep Learning Models: A Hybrid Approach With Adaptive Algorithms
ABSTRACT The COVID-19 pandemic has made face mask detection an essential part of public health
monitoring. Meanwhile, the growing number of internet-connected devices and their increasing
integration into everyday infrastructure have created demand for effective real-time face mask
detection models on edge devices. Existing methods often require pre-installed equipment or tightly
controlled environmental conditions, and computational resource constraints frequently make them
impractical. In the present study, a hybrid Flame-Sailfish Optimization (HFSO)-based deep learning
framework is proposed. It combines the feature extraction capabilities of ResNet50 with the efficiency of
MobileNetV2. The HFSO algorithm optimizes crucial parameters such as detection thresholds and learning
rates so that the model can take full advantage of the available computing capacity and still operate in
real time on devices with limited resources. The model was tested on three datasets (the Kaggle Face Mask
Detection dataset, the Public Places dataset, and the Public Videos dataset), achieving up to 97.5% accuracy.
It outperformed the previous best-performing models in all cases. The results indicate that the framework is
reliable and readily applicable for identifying people wearing masks under different conditions. However,
where the face is heavily occluded or the video feed quality is poor, the model's performance drops
somewhat. Future work should focus on handling more difficult detection cases, extending the method to
other IoT-based health monitoring systems, and ensuring that its robustness is preserved.
INDEX TERMS Hybrid flame-sailfish optimization, face mask detection, deep learning, IoT-enabled
devices, ResNet50.
2025 The Authors. This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.
VOLUME 13, 2025 For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/ 17325
P. Dubey et al.: Enhanced IoT-Based Face Mask Detection Framework Using Optimized Deep Learning Models
face detection have had some success, but the growing complexity of real-world environments, marked by variations in lighting conditions, partial occlusions, and an increasingly wide range of mask types, calls for richer recognition schemes [4]. Here, deep learning methods integrated with IoT-based surveillance can offer considerable strength. In this way, our system achieves high-accuracy recognition around the clock in a wide array of settings.

Existing face mask detection models show good results but are limited by their detection speed, their computational cost, and their reduced accuracy on low-quality images. These problems are further amplified when the models are used on edge devices with limited hardware resources. In addition, the practical application of these models is often restricted by the need for very large training datasets and by high memory usage.

To tackle these limitations, we present a novel hybrid deep learning model combined with an optimized detection framework that makes full use of IoT technology. Our method combines a dedicated face detection algorithm with advanced deep learning techniques to make real-world mask detection both efficient and accurate. The innovation lies in optimizing the parameters of both the detection and classification models with an adaptive hybrid algorithm, offering a performance improvement over traditional methods.

As public health monitoring and health safety protection receive increasing attention, the ever-growing need for effective automated face mask detection has drawn attention to deep learning and IoT-based solutions. However, traditional approaches often struggle to run in real time under inherently limited computational resources. This paper enhances the HFSO (Hybrid Flame-Sailfish Optimization) algorithm with a modern deep learning framework combining ResNet50 and MobileNetV2. The HFSO scheme optimizes not only sensor thresholds but also learning rates and related parameters, which substantially improves both computational efficiency and accuracy. Unlike traditional models, our methods are suitable for implementation on IoT edge devices; that is, they combine high precision with real-time performance. This approach to mask detection does away with labour-intensive operations: the blend of advanced optimization techniques with a scalable, lightweight architecture sets our solution apart from existing ones in the field of face mask detection.

Face mask detection has become a critical tool for public health monitoring, particularly during pandemics like COVID-19. However, existing detection systems face significant limitations when deployed in real-world scenarios. Many deep learning models, including Faster R-CNN, YOLO, and SSD-MobileNetV2, require substantial computational resources, making them unsuitable for deployment on resource-constrained IoT edge devices. Furthermore, these systems often exhibit degraded performance under challenging environmental conditions, such as poor lighting, occlusion, and diverse mask types. Additionally, their reliance on large datasets for training limits their scalability, especially in low-resource settings. These limitations underscore the need for an optimized, adaptable, and efficient solution for real-time face mask detection.

The proposed framework introduces a novel approach to IoT-based face mask detection by integrating the ResNet50 and MobileNetV2 architectures with the Hybrid Flame-Sailfish Optimization (HFSO) algorithm. This combination is designed to deliver optimized detection accuracy and computational efficiency, addressing the critical challenges of deploying models on resource-constrained IoT edge devices. The novelty lies in the tailored optimization strategy that balances exploration and exploitation during parameter tuning, achieving real-time performance under diverse conditions.

The key technical contributions of this research are as follows:
1. Hybrid Architecture Integration: The integration of ResNet50 for detailed feature extraction and MobileNetV2 for lightweight efficiency creates a hybrid model that excels in both accuracy and speed.
2. Advanced Optimization Technique: The HFSO algorithm optimizes critical parameters, including detection thresholds and learning rates, significantly improving computational efficiency and accuracy, even in low-resource environments.
3. Comprehensive Validation: The framework has been rigorously tested across three diverse datasets, demonstrating its robustness under varying environmental conditions and its scalability for real-world public health monitoring applications.

This paper is structured as follows: Section II discusses previously published literature and identifies gaps in current face mask detection techniques. This is followed by the problem statement in Section III. Section IV describes the architecture and techniques employed in our proposed hybrid model, including implementation specifics. Section V presents our experimental environment followed by the results, and Section VI covers the security implications of integrating IoT. These are followed by challenges, limitations, and future work, and finally by a discussion of our findings and the conclusion.

II. RELATED WORK
In recent years, various methods have been developed for face mask detection, leveraging advancements in deep learning and the Internet of Things (IoT). Sethi et al. [5] proposed a model using ResNet-50 to enhance face mask detection accuracy, focusing on improving detection speed. While the method demonstrated improvements in detection performance, it was limited by its inefficiency in high-resolution video surveillance, restricting its practical applicability. Kong et al. [6] introduced a face mask detection model using Faster R-CNN, designed for analyzing video data in public health systems. Despite its practical performance in detecting
masks in environments like buses, this model struggled with large datasets, increasing computational demand and limiting scalability. The authors of [36] highlight various learning approaches, including Transfer Learning, GANs, and Self-Supervised Learning, as effective strategies to address the challenges posed by data scarcity in deep learning applications.

In another approach, Varshini et al. [7] utilized IoT-based smart systems for face mask detection and body temperature monitoring in real-time environments such as shopping malls and hotels. This system integrated IoT sensors with convolutional neural networks (CNN) but faced robustness issues when trained on small datasets. Koklu et al. [8] proposed a deep learning model that combined the VGG16 architecture with BiLSTM to enhance mask detection accuracy. Although the model achieved a high success rate in detection, its computational complexity posed a significant barrier, especially for real-time applications on edge devices.

Nagrath et al. [9] introduced an SSD-MobileNetV2-based face mask detection system that performed well in noisy environments and handled variations in lighting conditions. However, this model required considerable memory and extended training times, limiting its deployment in low-resource settings. Similarly, Taha and Eldeen [10] explored the use of support vector machines (SVM) combined with deep learning features for face mask detection. Their approach improved accuracy but suffered from high memory demands and susceptibility to errors due to overfitting.

Recent work by Loey et al. [11] introduced a YOLO v2-based method for face mask detection, excelling in scenarios involving strong occlusions. However, this model struggled to detect diverse types of face masks in real time, highlighting the need for further optimization of detection accuracy across different environments. The details of the reviewed literature can be seen in Table 1.

III. PROBLEM STATEMENT
Face mask detection is of critical significance in controlling the spread of infectious diseases, particularly in public places [12]. At the same time, existing face mask detection models face a number of problems when deployed. Some of these models consume a lot of computational resources, making them unsuitable for edge devices with limited memory and computing power. Other models do not produce good results across varied environments, where low-definition images, lighting variations, and occlusion degrade performance. How can a robust, real-time face mask detection system be built that copes with these conditions and continues to offer high accuracy while running efficiently on resource-constrained devices? We are therefore faced with an urgent problem: the need for a real-time, low-bandwidth system for detecting whether someone is wearing a mask. The primary aim of this research is
to build an optimized, hybrid deep learning detection system to meet these ends. Through the integration of IoT with advanced deep learning algorithms and a dedicated optimization technique, the new model is designed to detect faster while improving the detection rate, thus meeting real-time requirements in varied environments.

The existing literature on face mask detection reveals several research gaps that limit the effectiveness and practical deployment of current models:
• Many existing models, particularly single-stage and two-stage detectors, do not meet the real-time requirements necessary for practical applications. Despite their theoretical robustness, these models often fail to perform efficiently in real-world environments [5], [9].
• Models such as SSD-MobileNetV2 [9] and SVM-based detectors [10] demand substantial memory and computational resources, making them unsuitable for edge devices or low-power IoT environments. This hinders their scalability and widespread adoption in resource-constrained settings.
• Face mask detection models often struggle in conditions where faces are occluded or masks are worn improperly. Models like YOLO v2 [11] have difficulty detecting a variety of masks, and there is a need for techniques that can generalize well across different mask types and environmental conditions.
• Some methods, such as CNN-based and VGG16-BiLSTM models [7], [8], require large training datasets to achieve acceptable performance. However, obtaining such datasets in real-world applications can be challenging, leading to poor model generalization in practical use cases.
• Few studies have explored the use of advanced optimization techniques to enhance model performance. While hybrid optimization algorithms have been applied in other domains, their application in face mask detection remains limited, leaving room for improvement in both accuracy and computational efficiency [8], [10].

Addressing these gaps, this research proposes a novel hybrid model that integrates ResNet50 and MobileNetV2, optimized with a Hybrid Flame-Sailfish Optimization (HFSO) algorithm. This approach aims to improve both the detection accuracy and the computational efficiency of face mask detection in real-time IoT applications.

IV. METHODOLOGY
This section introduces the proposed face mask detection framework, which combines IoT-based real-time data collection with a deep learning-based classifier optimized with HFSO (Hybrid Flame-Sailfish Optimization). The overall framework consists of several parts: data acquisition, feature extraction, target detection, hybrid optimization, classification, and testing. These modules are introduced in detail below.

A. IoT-BASED DATA ACQUISITION
The proposed model starts by obtaining real-time data from IoT devices in public spaces, such as cameras and sensors. These devices capture snapshots or video feeds in real time, which are transmitted to a cloud or edge computing environment for processing [13], [14]. The IoT system is designed to handle large amounts of data and transmit them at very low latency, so that feedback on mask-wearing compliance is available in real time.

The IoT devices run around the clock to monitor areas including transport hubs, shopping malls, and medical institutions [15]. The video streams from these cameras are broken into individual frames, which are then passed as input to the detection system.

B. PREPROCESSING AND FEATURE EXTRACTION
Before an image is processed, it passes through a series of preprocessing steps that improve its quality and make the inputs consistent. First, noise reduction is carried out to eliminate unwanted artifacts and increase image clarity [16]. After that, the images are resized to uniform dimensions so they can be used for follow-up processing [17]. The final preprocessing step is normalization of the pixel values [18], which can greatly improve the performance of the deep learning models that consume these inputs. These steps are critical for ensuring accurate, credible results. After preprocessing, the images are passed to a feature extraction module, in which convolutional neural networks (CNNs) extract relevant features. These include key facial attributes such as edges, contours, and mask-related features that are necessary for accurate recognition and classification.

C. FACE DETECTION USING MODIFIED SINGLE SHOT MULTIBOX DETECTOR (SSD)
We built our face detection model on a modified version of the Single Shot MultiBox Detector (SSD), an algorithm that detects objects quickly and effectively in a single pass over the input [19], [20]. The modified SSD is optimized to recognize faces in various poses and lighting conditions so that detection accuracy remains high across these situations.

The SSD uses multiple anchor boxes at different scales to detect faces of different sizes [21]. It locates faces from the pixels of the input image and draws bounding boxes around those it finds [22]. How the SSD model is optimized therefore has a direct bearing on improving detection accuracy and reducing the number of false positives in challenging environments. To further enhance the performance of the SSD model, its parameters, such as detection thresholds, are optimized using the hybrid flame-sailfish optimization (HFSO) algorithm that is described in detail in the next section.
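To make the role of multi-scale anchor boxes and detection thresholds more concrete, the following is a minimal NumPy sketch. It is an illustration under stated assumptions rather than the modified SSD used in this work: the feature-map sizes, scales, aspect ratios, and the helper names `make_anchors` and `filter_by_threshold` are hypothetical, and the threshold value stands in for the kind of parameter that HFSO would tune.

```python
import numpy as np

def make_anchors(feature_size, scale, aspect_ratios=(1.0, 2.0, 0.5)):
    """Build anchors (cx, cy, w, h) in normalized [0, 1] image coordinates
    for one square feature map of side `feature_size`."""
    # One anchor center per feature-map cell.
    centers = (np.arange(feature_size) + 0.5) / feature_size
    cx, cy = np.meshgrid(centers, centers)
    anchors = []
    for ratio in aspect_ratios:
        # Width and height keep the area fixed while varying the aspect ratio.
        w = scale * np.sqrt(ratio)
        h = scale / np.sqrt(ratio)
        anchors.append(np.stack([cx.ravel(), cy.ravel(),
                                 np.full(cx.size, w),
                                 np.full(cx.size, h)], axis=1))
    return np.concatenate(anchors, axis=0)

def filter_by_threshold(boxes, scores, threshold):
    """Keep only detections whose confidence reaches the threshold."""
    keep = scores >= threshold
    return boxes[keep], scores[keep]

# Hypothetical pyramid: coarser feature maps carry larger anchors.
pyramid = [(8, 0.2), (4, 0.4), (2, 0.8)]
anchors = np.concatenate([make_anchors(f, s) for f, s in pyramid], axis=0)

# Made-up confidence scores for the first four anchors.
demo_scores = np.array([0.9, 0.3, 0.6, 0.1])
kept_boxes, kept_scores = filter_by_threshold(anchors[:4], demo_scores, 0.5)
```

Raising the threshold trades missed faces for fewer false positives, which is why it is a natural candidate for automated tuning.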
steps per execution). These initial positions are randomly distributed in the solution space as given in equations (1) and (2):

$$F_i(0) = \mathrm{random}(X_{\min}, X_{\max}), \quad \forall i = 1, 2, \ldots, N \tag{1}$$

$$S_i(0) = \mathrm{random}(X_{\min}, X_{\max}), \quad \forall i = 1, 2, \ldots, M \tag{2}$$

where $X_{\min}$ and $X_{\max}$ are the bounds of the search space.

• Fitness Calculation
The fitness function evaluates the quality of the solutions (flames and sailfish). For the face mask detection model, the fitness function is based on accuracy as in equation (3):

$$\mathrm{Fitness}(X_i) = \frac{TP + TN}{TP + TN + FP + FN} \tag{3}$$

where:
TP: True Positives
TN: True Negatives
FP: False Positives
FN: False Negatives
The fitness function ensures that the higher the accuracy, the better the solution.

• Flame Optimization (Exploration Phase)
The flame optimization mechanism updates the positions of the flames. The position of each flame $F_i(t+1)$ is updated using a spiral function that mimics the movement of moths towards light, as shown in equation (4):

$$F_i(t+1) = S_i(t) + a \times d \times e^{bt} \times \cos(2\pi t) \tag{4}$$

where:
• $t$ is the current iteration.
• $a$ and $b$ are constants controlling the shape of the spiral.
• $d$ is the Euclidean distance between the flame $F_i$ and the sailfish $S_i$, as given in equation (5):

$$d = \lVert F_i(t) - S_i(t) \rVert \tag{5}$$

The spiral function simulates the exploration behaviour, ensuring that flames search the space efficiently. The distance $d$ decreases as the flame approaches the sailfish, allowing for refinement.

• Sailfish Optimization (Exploitation Phase)
In the exploitation phase, sailfish update their positions based on the best-performing flames. The sailfish optimization rule is given by equation (6):

$$S_i(t+1) = S_i(t) + \lambda \times \left(F_i(t) - S_i(t)\right) + \text{random\_step} \tag{6}$$

where:
• $\lambda$ is a control parameter that balances the influence of flames on sailfish.
• random_step introduces randomness to avoid getting trapped in local optima.
The sailfish move towards the best flames (best solutions found so far) to exploit and refine the promising areas of the search space.

• Random Step: This can be calculated as in equation (7):

$$\text{random\_step} = \mathrm{rand}(0, 1) \times \frac{1}{t + 1} \tag{7}$$

This term introduces randomness that decreases over time, allowing the algorithm to focus on exploitation in later stages.

• Elite Strategy
In each iteration, the best-performing flame (elite flame) is preserved to ensure that the solution does not degrade. This is done by retaining the flame with the highest fitness as in equation (8):

$$F_{\text{elite}}(t+1) = \underset{F_i(t)}{\arg\max}\ \mathrm{Fitness}(F_i(t)), \quad i = 1, 2, \ldots, N \tag{8}$$

The elite flame is carried over to the next iteration without being updated.

• Update Rules
At each iteration, the flame and sailfish populations are updated according to the following rules. The flame update is given by equation (9):

$$F_i(t+1) = S_i(t) + a \times d \times e^{bt} \times \cos(2\pi t) \tag{9}$$

The sailfish update is given by equation (10):

$$S_i(t+1) = S_i(t) + \lambda \times \left(F_i(t) - S_i(t)\right) + \text{random\_step} \tag{10}$$

where:
• $a$, $b$, and $\lambda$ are chosen experimentally to ensure a good balance between exploration and exploitation.
• The position updates are subject to the search space constraints $X_{\min}$ and $X_{\max}$.

• Termination and Output
The algorithm continues updating the flame and sailfish populations until a termination criterion is met. Common termination criteria include:
• Reaching the maximum number of iterations $T_{\max}$.
• Achieving a predefined accuracy threshold.
At the end of the iterations, the solution with the best fitness (elite flame) is returned as the optimized set of parameters for the face mask detection model.

V. EXPERIMENTAL SETUP
The experimental environment for evaluating the Hybrid Flame-Sailfish Optimization model for face mask detection is designed to permit a comprehensive assessment of the model's performance. Three main datasets were used. The first, the Kaggle Face Mask Detection Dataset, contains 4,072 images featuring faces with and without masks as well as instances of incorrectly worn masks [27]. This dataset provides a sound foundation for
training and testing the model. The second dataset contains 400 images shot in crowded public places throughout India, such as markets and shopping malls. Taken under varying lighting conditions, these images contain 450 faces and give the model a varied range of scenarios to work with. The third dataset comprises four videos captured in similar public places, from which frames were extracted to count and categorize 1,000 faces. The data were split in an 80-20 ratio for training and testing the model.

Preprocessing was applied to achieve consistency among the datasets. Each image was resized to 224 × 224 pixels to match the input size requirements of deep learning models such as ResNet50 and MobileNetV2. Pixel values were normalized to the range 0 to 1, making it easier for the model to converge. Various data augmentation techniques were employed to promote generalization and robustness, including random flips, rotations, and brightness alterations. Noise reduction was performed using a Gaussian filter, which served to minimize the effects of poor lighting and image quality. This is particularly useful when dealing with lower-resolution frames.

The proposed model's design includes two major parts: face detection using the Single Shot MultiBox Detector (SSD), and classification using a combination of ResNet50 and MobileNetV2. The SSD model is fine-tuned by the HFSO algorithm to find, as precisely and efficiently as possible, the anchor box sizes and detection thresholds for face detection. Detected faces are then fed into the classification module, where features are extracted using ResNet50's deep representation power combined with the speed and flexibility of MobileNetV2, making the system suitable both for real-time edge applications (e.g., on an ARM mobile device) and for other Linux-capable systems. The parameters optimized by the HFSO algorithm included the learning rate for MobileNetV2, the detection thresholds for SSD, and the execution steps of ResNet50, with the aim of securing both accuracy and computational efficiency.

The performance of the model was evaluated using several metrics: accuracy, precision, recall (sensitivity), and F1-score, together with the Matthews Correlation Coefficient (MCC). These metrics show how well the model can tell whether people are wearing a mask properly. Accuracy is the proportion of correct classifications. Precision is the proportion of true positives among all positive predictions. Recall (sensitivity) is the ability of a model to correctly identify all actual positives. F1-score balances precision and recall so that both are weighted equally, giving an overall measure of how well the model performs. MCC is a more balanced measure than F1-score, since it takes all four confusion-matrix entries into account. Table 3 shows the model architecture and training parameters for face mask detection.

The experiments were conducted on a system equipped with an Intel Core i9-10900K CPU, a 32 GB GPU, 32 GB of RAM, and a 1 TB SSD, running Ubuntu 20.04 LTS. All deep learning models were implemented using TensorFlow 2.5.0 and Keras 2.4.3, with OpenCV for image preprocessing. A custom implementation of the HFSO algorithm was used to fine-tune the hyperparameters, and the models were trained with the Adam optimizer using parameters optimized through HFSO. Training used a batch size of 32 and an initial learning rate of 0.001, which was halved at each step until convergence or for up to 50 epochs. Early stopping was also applied: training was halted once the validation loss had stopped improving for 10 consecutive epochs.

The efficacy of the developed Hybrid Flame-Sailfish Optimization (HFSO)-based face mask detection framework was tested across three distinct datasets, with strong performance on all of them. On the Kaggle Face Mask Detection Dataset, it achieved an accuracy of 97.5%, with precision, recall, and F1-score all above 96%. In this controlled environment, the model proves robust at distinguishing masked from unmasked faces. On the Public Places Image Dataset (India), which features varied lighting and crowded scenes, it still yielded strong results, with 95.3% accuracy, 94.5% precision, and 95.7% recall. Similarly, under more dynamic conditions, such as the Public Videos Dataset (India), the performance of the framework was well maintained. It showed 93.8% accuracy in this case

Pseudocode for HFSO
BEGIN HFSO
// Step 1: Initialization
Initialize flame_population and sailfish_population randomly within the search space
FOR each solution in flame_population and sailfish_population
    Compute initial fitness of the solution
END FOR
// Step 2: Iteration
WHILE stopping criteria not met (max iterations or accuracy threshold)
    FOR each flame in flame_population
        Update flame position using flame optimization rule
    END FOR
    FOR each sailfish in sailfish_population
        Update sailfish position using sailfish optimization rule
    END FOR
    Update fitness values of flame and sailfish populations
    Retain the best solution (elite flame) across iterations
END WHILE
// Step 3: Termination
IF max iterations reached OR accuracy threshold met
    Terminate algorithm
END IF
// Step 4: Output
Return the best solution (elite flame)
END HFSO
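The HFSO update rules in equations (1) to (10) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the custom implementation used in the experiments: it assumes equal flame and sailfish population sizes (so that index i pairs one flame with one sailfish), the values of `a`, `b`, and `lam` are placeholders because the paper only says they were chosen experimentally, and `toy_fitness` is a hypothetical stand-in for the detection accuracy of equation (3).

```python
import numpy as np

def hfso(fitness, x_min, x_max, dim, n=10, max_iter=50,
         a=0.1, b=0.01, lam=0.5, target=None, seed=0):
    """Maximize `fitness` over [x_min, x_max]^dim using the HFSO update rules."""
    rng = np.random.default_rng(seed)
    # Eqs. (1)-(2): random initial flames and sailfish inside the bounds.
    flames = rng.uniform(x_min, x_max, size=(n, dim))
    sailfish = rng.uniform(x_min, x_max, size=(n, dim))
    flame_fit = np.array([fitness(f) for f in flames])
    best = int(np.argmax(flame_fit))
    elite, elite_fit = flames[best].copy(), float(flame_fit[best])

    for t in range(max_iter):
        # Eq. (5): Euclidean distance between each flame and its paired sailfish.
        d = np.linalg.norm(flames - sailfish, axis=1, keepdims=True)
        # Eqs. (4)/(9): spiral flame update around the paired sailfish.
        new_flames = sailfish + a * d * np.exp(b * t) * np.cos(2 * np.pi * t)
        # Eq. (7): random step whose magnitude shrinks as t grows.
        random_step = rng.random((n, dim)) / (t + 1)
        # Eqs. (6)/(10): sailfish move toward the flame positions at time t.
        new_sailfish = sailfish + lam * (flames - sailfish) + random_step
        # Keep every position inside the search-space bounds.
        flames = np.clip(new_flames, x_min, x_max)
        sailfish = np.clip(new_sailfish, x_min, x_max)

        flame_fit = np.array([fitness(f) for f in flames])
        best = int(np.argmax(flame_fit))
        # Eq. (8): elite strategy, the best flame seen so far is never lost.
        if flame_fit[best] > elite_fit:
            elite, elite_fit = flames[best].copy(), float(flame_fit[best])
        # Optional early exit once a target fitness is reached.
        if target is not None and elite_fit >= target:
            break
    return elite, elite_fit

def toy_fitness(x):
    """Hypothetical smooth objective with its maximum (1.0) at x = 0.3."""
    return 1.0 / (1.0 + float(np.sum((x - 0.3) ** 2)))

best_params, best_fit = hfso(toy_fitness, x_min=0.0, x_max=1.0, dim=2)
```

In a real tuning run each fitness call would train or evaluate the detector with the candidate parameters, so the population size and iteration budget dominate the cost.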
TABLE 3. Model architecture and training parameters for face mask detection.
TABLE 4. Training time and inference time comparison for different dataset.
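The evaluation metrics used in this study (accuracy, precision, recall, F1-score, and MCC) all follow from the four confusion-matrix counts. The sketch below shows the standard definitions; the function name `classification_metrics` and the example counts are illustrative assumptions and are not taken from the reported experiments.

```python
import math

def classification_metrics(tp, tn, fp, fn):
    """Compute standard binary-classification metrics from confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # Guard against empty denominators when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    # MCC uses all four counts, so it stays informative under class imbalance.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "mcc": mcc}

# Made-up counts for illustration only.
metrics = classification_metrics(tp=90, tn=85, fp=10, fn=15)
```

The accuracy value returned here is the same quantity that equation (3) uses as the HFSO fitness.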
TABLE 5. Error analysis for model performance.

and low-quality inputs hinder effective feature extraction. Similarly, the detection accuracy for improperly worn masks (e.g., masks below the nose or chin) is 85.2%, highlighting dataset bias towards properly worn masks. Solutions such as training on augmented datasets with occlusions, incorporating noise-reduction preprocessing, and diversifying datasets with examples of improper mask usage are proposed to enhance the framework's robustness in these challenging scenarios. Table 5 shows this in detail.

The training time graph in Fig. 8 compares the performance of ResNet50-MobileNetV2 (Proposed), YOLOv5, and Transformer-Based Models across six epochs, highlighting their computational efficiency during training. The proposed model demonstrates a clear advantage, stabilizing earlier with significantly lower training times. At 50 epochs, it achieves convergence with a training time of 550 seconds, compared to 850 seconds for YOLOv5 and 1250 seconds for the Transformer-Based Model. Key observations include the rapid stabilization of the proposed model by 30 epochs, reflecting its computational efficiency and suitability for resource-constrained environments. In contrast, YOLOv5 and Transformer-Based Models exhibit progressively higher training times, underscoring their increased computational demands. These findings emphasize the superior efficiency of the ResNet50-MobileNetV2 framework, making it a strong choice for IoT-based applications. Annotations in the graph further highlight these critical differences, offering a visual understanding of the proposed model's advantages.

The comprehensive evaluation results in Table 6 cover several indicators, including sensitivity, precision, FPR, FNR, F1-score, and MCC, for all of the models. On all of these indicators, the HFSO model attained high scores, which makes it a good performer by any measure we use to evaluate it. For example, the HFSO model had an F1-score of 97.0% on the Kaggle dataset, 95.2% on Public Places, and 94.4% on Public Videos. This indicates that both its recall rate and its accuracy were high. Likewise, the HFSO model was the best performer by a substantial margin in terms of MCC across all datasets. The HFSO model outperformed other state-of-the-art models in both accuracy and computational efficiency. Its superior performance can be attributed to the novel optimization strategy that balances exploration and exploitation within the search space. Moreover, the model's ability to consistently deliver high sensitivity and precision across different datasets highlights its robustness and versatility. The small differences in performance across datasets, particularly between the Kaggle and Public Videos datasets, indicate that the model is well suited for real-world applications with diverse data sources.

The results confirm the effectiveness of the proposed HFSO algorithm in optimizing face mask detection tasks, making it a suitable candidate for real-time applications where accuracy and speed are both crucial.

Integrating IoT technology into facemasks has several security implications, primarily revolving around data privacy, system vulnerabilities, and the accuracy of facial recognition systems.

VI. SECURITY IMPLICATIONS OF INTEGRATING IOT
The following security implications of integrating IoT can be considered in future work to enhance privacy and security.
1. Data Privacy and Security: IoT-enabled facemasks can collect and transmit sensitive data, such as biometric information, which necessitates robust encryption and secure data handling practices to prevent unauthorized access and data breaches [26], [27].
2. Vulnerability to Adversarial Attacks: Facial recognition systems integrated with IoT devices are susceptible to adversarial attacks, where malicious actors can manipulate input data to deceive the system. This vulnerability can compromise the reliability of security measures [28], [29].
3. Accuracy Challenges: The presence of facemasks can significantly reduce the accuracy of facial recognition systems. This is because masks obscure key facial
features, making it difficult for the system to correctly identify individuals. Enhanced algorithms and additional authentication methods may be required to mitigate this issue [26], [30].
4. Scalability and Resource Constraints: IoT devices, including smart facemasks, often operate under limited computational resources. Ensuring that security measures do not overly tax these resources is crucial for maintaining system performance and reliability [31], [32].
5. Integration with Other Systems: IoT facemasks may need to interact with other IoT devices and systems, such as smart surveillance cameras and health monitoring systems. Ensuring secure communication and interoperability between these systems is essential to prevent potential security loopholes [33], [34], [35].

FIGURE 8. Training time comparison graph.

VII. CHALLENGES
Putting the proposed ResNet50-MobileNetV2 framework into practice comes with some challenges that need to be addressed in order to ensure its effectiveness and acceptability. One such challenge is user privacy. The system extracts sensitive facial data, which, if not properly protected, could easily be misused. To prevent this, comprehensive data encryption needs to be applied at every stage of the process, from transmission through to storage. Moreover, methods for anonymizing data, such as deleting identifiable features from the collected data, help to reconcile the contra-

VIII. LIMITATIONS
While the proposed ResNet50-MobileNetV2 framework demonstrates significant improvements in accuracy, computational efficiency, and suitability for resource-constrained IoT edge devices, it is not without limitations. Addressing these challenges is essential for enhancing the model's adaptability and scalability across broader applications.

Dataset Bias: One of the main limitations is the risk of bias arising from differences between samples. The datasets used for training and validation may be diverse, but they cannot fully reflect all situations encountered in day-to-day conditions (e.g., extreme occlusion, large differences in lighting, or the rare cases in which people wear masks incorrectly). This bias can result in performance degradation when the model encounters unfamiliar situations. For example, scenarios where masks are worn incorrectly, such as below the nose, are underrepresented in the datasets, leading to errors. Building a larger and more varied dataset that presents greater challenges to the model would give it both greater robustness, for instance in difficult cases such as incorrectly worn masks, and better generalization capability.

Scalability: Scaling the framework to large systems brings difficulties. While the framework has been optimized for deployment on individual IoT edge devices, challenges remain. When an array of edge devices is deployed across multiple sites in high-density areas such as transport nodes or bustling markets, efficient load balancing must be combined with real-time synchronization. This becomes even more complex when the edge devices have varying hardware capabilities and network conditions vary over time. Overcoming these restrictions also requires a method for seamlessly integrating and processing information within the system. Whether through federated learning or edge and cloud collaboration, the devel-
diction between protection and function. To gain trust from all opment of approach is crucial.
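As an illustration of the federated learning option mentioned for scaling across edge devices, the following is a minimal sketch of a sample-weighted parameter aggregation step in the style of federated averaging. The paper does not specify a federated scheme, so the function name `federated_average`, the flat weight vectors, and the example values are all hypothetical assumptions made for illustration only.

```python
from typing import List, Sequence


def federated_average(
    client_weights: Sequence[Sequence[float]],
    client_sizes: Sequence[int],
) -> List[float]:
    """Combine per-device parameter vectors into one global vector.

    Each device's contribution is weighted by its number of local
    training samples (hypothetical helper, not the authors' method).
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need exactly one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(client_weights[0])
    if any(len(w) != dim for w in client_weights):
        raise ValueError("all clients must report vectors of equal length")

    averaged = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        share = size / total  # fraction of all samples held by this device
        for i, value in enumerate(weights):
            averaged[i] += share * value
    return averaged


# Assumed example: three edge devices with different amounts of local data.
updates = [[0.2, 1.0], [0.4, 0.0], [0.6, 2.0]]
samples = [100, 300, 600]
global_weights = federated_average(updates, samples)
print(global_weights)
```

Weighting by sample count means a device that has seen more data has more influence on the shared model, which is one simple way to cope with edge devices that differ in hardware capability and traffic. Real deployments would also need secure transport and handling of devices that drop out, which this sketch omits.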
Adaptability to other IoT Applications: The current design of the model is directed at detecting face masks. This limits its applicability to other IoT-oriented tasks, such as personal protective equipment (PPE) detection and more general public health monitoring applications. Training the model to accommodate these uses would take considerable time and computational capacity. This is especially true of settings that change the feature set entirely, in which case one essentially has a new model. However, adopting transfer learning strategies or building a modular architecture that can learn from related applications may help overcome these limitations.
Integration with Existing Systems: Integration becomes harder when the system has to be connected to an existing surveillance system. Such infrastructure often has limited computational capability or is not compatible with contemporary architectures. This limitation underscores the requirement for lightweight deployment modules and middleware capable of bridging traditional systems and the proposed framework.
Future Directions
To address these limitations, future work should focus on:
• Expanding datasets to capture underrepresented real-world scenarios.
• Developing scalable solutions for distributed edge deployments.
• Exploring adaptability mechanisms for diverse IoT applications.
• Designing lightweight integration modules for seamless adoption in existing systems.
By addressing these challenges, the proposed framework can achieve greater applicability, scalability, and robustness in real-world settings, enhancing its utility across a wide range of IoT-based applications.
IX. CONCLUSION
An IoT-based face mask detection architecture built around Hybrid Flame-Sailfish Optimization (HFSO) was developed. The model, which integrates ResNet50 and MobileNetV2, can effectively address problems such as changing lighting, occlusions, and limited resources. HFSO is integrated into the model and tunes it for improved performance and consolidation, and with HFSO the framework achieves higher accuracy and efficiency. The proposed framework was tested on three datasets (Kaggle, Public Places, and Public Videos) and consistently outperformed previous models, with accuracy of up to 97.5% on Kaggle. The HFSO algorithm enabled detection models for edge devices (or passive devices), making this solution both accurate and scalable for public health applications. The HFSO-based model thus offers an efficient and powerful solution for real-time mask detection. There is great potential for this kind of deployment in public health surveillance systems. Future work might explore its application in broader IoT environments.
REFERENCES
[1] J. C. Prata, A. L. P. Silva, A. C. Duarte, and T. Rocha-Santos, “Disposable over reusable face masks: Public safety or environmental disaster?” Environments, vol. 8, no. 4, p. 31, Apr. 2021, doi: 10.3390/ENVIRONMENTS8040031.
[2] B. Balestracci, M. La Regina, D. Di Sessa, N. Mucci, F. D. Angelone, A. D’Ecclesia, V. Fineschi, M. Di Tommaso, L. Corbetta, P. Lachman, F. Orlandini, M. Tanzini, R. Tartaglia, and A. Squizzato, “Patient safety implications of wearing a face mask for prevention in the era of COVID-19 pandemic: A systematic review and consensus recommendations,” Internal Emergency Med., vol. 18, no. 1, pp. 275–296, Sep. 2022, doi: 10.1007/S11739-022-03083-W.
[3] M. L. D. Punsalan and A. T. Salunga, “Mask is a must: The need of protection and safety against COVID-19,” J. Public Health, vol. 43, no. 2, pp. e379–e380, Mar. 2021, doi: 10.1093/PUBMED/FDAB077.
[4] D. W. Snook, W. Kaczkowski, and A. D. Fodeman, “Mask on, mask off: Risk perceptions for COVID-19 and compliance with COVID-19 safety measures,” Behav. Med., vol. 49, no. 3, pp. 246–257, Jan. 2022, doi: 10.1080/08964289.2021.2021384.
[5] A. Sethi, A. Gupta, and P. Sharma, “Face mask detection using deep learning,” Int. J. Comput. Appl., vol. 182, no. 41, pp. 25–32, 2021.
[6] Y. Kong, S. Zhang, and J. Li, “Face mask detection using faster R-CNN in public health systems,” J. Med. Syst., vol. 45, no. 2, pp. 1–12, 2021.
[7] S. Varshini, M. Rao, and K. Ramesh, “Smart IoT door for face mask and body temperature detection,” IEEE Internet Things J., vol. 8, no. 7, pp. 5134–5141, 2021.
[8] M. Koklu, E. Balci, and E. Karaca, “Face mask detection using VGG16-BiLSTM model,” Appl. Soft Comput., vol. 104, May 2022, Art. no. 107341.
[9] S. Nagrath, P. Jain, and S. Kataria, “SSD-MobileNetV2-based face mask detection,” J. Comput. Vis. Image Understand., vol. 207, Apr. 2021, Art. no. 104191.
[10] A. Taha and M. Eldeen, “Hybrid deep learning and SVM-based model for face mask detection,” Expert Syst. Appl., vol. 184, Apr. 2021, Art. no. 115709.
[11] D. Loey, F. Smarandache, and M. Khalifa, “YOLO v2-based real-time face mask detection,” J. Big Data, vol. 8, no. 1, pp. 1–17, 2021.
[12] D. Gupta, S. Wadehra, J. Kaur, and S. V. Sharma, “Unmasking compliance: Leveraging machine learning for real time face mask detection and public safety,” in Proc. 5th Int. Conf. Inf. Manage., 2023, pp. 1–20, doi: 10.1145/3647444.3647864.
[13] M. A. A. Radia, M. K. E. Nimr, and A. S. Atlam, “IoT-based wireless data acquisition and control system for photovoltaic module performance analysis,” e-Prime Adv. Electr. Eng., Electron. Energy, vol. 6, Dec. 2023, Art. no. 100348, doi: 10.1016/J.PRIME.2023.100348.
[14] A. M. T. I. Al-Naib and M. I. Mohammed, “IoT-based real time data acquisition of PV panel,” in Proc. Int. Conf. Eng., 2023, pp. 169–173, doi: 10.1109/ICESAT58213.2023.10347321.
[15] Muhtadan, A. Abimanyu, A. S. Wicaksono, F. Panuntun, and Supriyono, “Design of IoT-based data acquisition system for operation and safety parameter of kartini reactor,” in AIP Conf. Proc., Jan. 2023, pp. 28–58, doi: 10.1063/5.0167219.
[16] S. Boopathi, B. K. Pandey, and D. Pandey, “Advances in artificial intelligence for image processing,” in Handbook of Research on Thrust Technologies’ Effect on Image Processing. Hershey, PA, USA: IGI Global, 2023, pp. 73–95, doi: 10.4018/978-1-6684-8618-4.CH006.
[17] P. Dhiman, A. Kaur, V. R. Balasaraswathi, Y. Gulzar, A. A. Alwan, and Y. Hamid, “Image acquisition, preprocessing and classification of citrus fruit diseases: A systematic literature review,” Sustainability, vol. 15, no. 12, p. 9643, Jun. 2023, doi: 10.3390/SU15129643.
[18] C. Kaur and U. Garg, “Artificial intelligence techniques for cancer detection in medical image processing: A review,” Mater. Today, Proc., vol. 81, pp. 806–809, Jan. 2023, doi: 10.1016/J.MATPR.2021.04.241.
[19] L. Wang, Y. Shoulin, H. Alyami, A. A. Laghari, M. Rashid, J. Almotiri, H. J. Alyamani, and F. Alturise, “A novel deep learning-based single shot multibox detector model for object detection in optical remote sensing images,” Geosci. Data J., vol. 11, no. 3, pp. 237–251, May 2022, doi: 10.1002/GDJ3.162.
[20] J. Qiang, W. Liu, X. Li, P. Guan, Y. Du, B. Liu, and G. Xiao, “Detection of citrus pests in double backbone network based on single shot multibox detector,” Comput. Electron. Agricult., vol. 212, Sep. 2023, Art. no. 108158, doi: 10.1016/J.COMPAG.2023.108158.
[21] W. Zhu, H. Zhang, J. Eastwood, X. Qi, J. Jia, and Y. Cao, “Concrete crack detection using lightweight attention feature fusion single shot multibox detector,” Knowl.-Based Syst., vol. 261, Feb. 2023, Art. no. 110216, doi: 10.1016/J.KNOSYS.2022.110216.
[22] M. Ali, C. Keller, and M. Huang, “Fruits detections using single shot MultiBox detector,” in Proc. 5th ACM Int. Symp. Blockchain Secur., 2023, pp. 140–144, doi: 10.1145/3594556.3594619.
[23] N. Muhathir, M. F. D. Ryandra, R. B. Y. Syah, N. Khairina, and R. Muliono, “Convolutional neural network (CNN) of ResNet-50 with InceptionV3 architecture in classification on X-ray image,” in Artificial Intelligence Application in Networks and Systems (Lecture Notes in Networks and Systems), 2023, pp. 208–221, doi: 10.1007/978-3-031-35314-7_20.
[24] J.-R. Lee, K.-W. Ng, and Y.-J. Yoong, “Face and facial expressions recognition system for blind people using ResNet50 architecture and CNN,” J. Informat. Web Eng., vol. 2, no. 2, pp. 284–298, Sep. 2023, doi: 10.33093/JIWE.2023.2.2.20.
[25] Y. Gulzar, “Fruit image classification model based on MobileNetV2 with deep transfer learning technique,” Sustainability, vol. 15, no. 3, p. 1906, Jan. 2023, doi: 10.3390/SU15031906.
[26] U. Kulkarni, “Facial key points detection using MobileNetV2 architecture,” in Proc. IEEE 8th Int. Conf. Converg., Apr. 2023, pp. 1–24, doi: 10.1109/I2CT57861.2023.10126381.
[27] (2020). Face Mask Detection. [Online]. Available: https://www.kaggle.com/datasets/andrewmvd/face-mask-detection
[28] I. Hagui, “Hybrid authentification system of face mask detection using CNN-based on lightweight cryptography and blockchain technology,” in Proc. IEEE Int. Conf. Adv. Syst., Apr. 2023, pp. 1–6.
[29] A. Rahman, M. S. Hossain, N. A. Alrajeh, and F. Alsolami, “Adversarial examples—Security threats to COVID-19 deep learning systems in medical IoT devices,” IEEE Internet Things J., vol. 8, no. 12, pp. 9603–9610, Jun. 2021, doi: 10.1109/JIOT.2020.3013710.
[30] R. A. Ramadan, B. W. Aboshosha, K. Yadav, I. M. Alseadoon, M. J. Kashout, and M. Elhoseny, “LBC-IoT: Lightweight block cipher for IoT constraint devices,” Comput., Mater. Continua, vol. 67, no. 3, pp. 3563–3579, Jan. 2021, doi: 10.32604/CMC.2021.015519.
[31] B. V. S. Babu, J. Manoranjini, R. Changala, M. Aarif, S. S. C. Mary, and I. I. Raj, “Biometric-based access control systems with robust facial recognition in IoT environments,” in Proc. 3rd Int. Conf. Intell. Techn., Mar. 2024, pp. 1–6, doi: 10.1109/INCOS59338.2024.10527499.
[32] F. Gao and J. Liu, “Effective segmented face recognition (SFR) for IoT,” Adv. Sci., Technol. Eng. Syst. J., vol. 5, no. 6, pp. 36–44, Jan. 2020, doi: 10.25046/AJ050605.
[33] S. Guha, A. Chakrabarti, S. Biswas, and S. Banerjee, “Implementation of face recognition algorithm on a mobile single board computer for IoT applications,” in Proc. IEEE 17th India Council Int. Conf., Dec. 2020, pp. 1–5, doi: 10.1109/INDICON49873.2020.9342290.
[34] A. A. Ahmed and M. Echi, “Hawk-eye: An AI-powered threat detector for intelligent surveillance cameras,” IEEE Access, vol. 9, pp. 63283–63293, 2021, doi: 10.1109/ACCESS.2021.3074319.
[35] S. Ahmed, R. Alam, Md. R. Hossain, Md. M. Islam, M. I. Hossain, and T. Tabassum, “An IoT based smart robot that aids in the prevention of COVID19 spread,” in Proc. 4th Int. Conf. Sustainable Technol., 2022, pp. 1–20, doi: 10.1109/STI56238.2022.10103253.
[36] L. Alzubaidi, J. Bai, A. Al-Sabaawi, J. Santamaría, A. S. Albahri, B. S. N. Al-dabbagh, M. A. Fadhel, M. Manoufali, J. Zhang, A. H. Al-Timemy, Y. Duan, A. Abdullah, L. Farhan, Y. Lu, A. Gupta, F. Albu, A. Abbosh, and Y. Gu, “A survey on deep learning tools dealing with data scarcity: Definitions, challenges, solutions, tips, and applications,” J. Big Data, vol. 10, no. 1, pp. 1–26, Apr. 2023, doi: 10.1186/S40537-023-00727-2.
PARUL DUBEY (Member, IEEE) received the Ph.D. degree in computer science and engineering from C. V. Raman University, Bilaspur, Chhattisgarh, India. She has submitted her thesis. She is currently an Assistant Professor with the Department of Computer Science and Engineering, Symbiosis Institute of Technology, Nagpur Campus, Symbiosis International (Deemed University), Pune, India. She has good academic and research experience in various areas of CSE and IT. She has five years of teaching experience. She is engaged in research activities, which include book chapters, books, conference papers, and journal articles. She has 18 published Indian patents. She holds around 50 publications in conferences, Scopus-indexed journals, and other journals.
PUSHKAR DUBEY received the Master of Business Administration (M.B.A.) and Ph.D. degrees in human resource management. He is currently an Assistant Professor and the Head of the Department of Management, Pandit Sundarlal Sharma (Open) University, Chhattisgarh, Bilaspur. He has published more than 70 research articles in reputed journals, such as Emerald, Taylor and Francis, and Springer. Twelve of his articles are indexed in Scopus. He has also attended several international conferences, including IIM level conferences. Having successfully guided six Ph.D. scholars, he is also accredited with seven published patents. He has also accomplished five research projects, including three sponsored by the Indian Council of Social Science Research (ICSSR), New Delhi. Having specialized in statistical software for data analysis, he has delivered several lectures on SPSS and AMOS. He is also pursuing the highest academic degree for research, i.e., Doctor of Letters, in the area of application of Shrimad Bhagwad Geeta into management practices.
CELESTINE IWENDI (Senior Member, IEEE) received the Ph.D. degree in electronics engineering. He is currently an IEEE Brand Ambassador and the Head of the Centre of Intelligence of Things, University of Bolton. He is also a Past ACM Distinguished Speaker, a Seasoned Lecturer, and a Chartered Engineer. He is also a Highly Motivated Researcher and a Teacher, with an emphasis on communication, hands-on experience, willingness to learn, and 23 years of technical expertise. He has developed operational, maintenance, and testing procedures for electronic products, components, equipment, and systems; provided technical support and instruction to staff and customers regarding equipment standards, assisting with specific and difficult in-service engineering; and inspected electronic and communication equipment, instruments, products, and systems to ensure conformance to specifications, safety standards, and regulations. He is also a Wireless Sensor Network Chief Evangelist, an AI, ML, and IoT Expert, and a Designer. He is also a Reader (Professor) with the University of Bolton, U.K. He is also the IEEE University of Bolton, a Student Branch Counselor, a former Board Member of IEEE Sweden Section, a fellow of The Higher Education Academy, U.K., and a fellow of Institute of Management Consultants to add to his teaching, managerial, and professional experiences. He is also an Ambassador in the prestigious Manchester Conference Ambassador Programme, a Visiting Professor to five universities, and an IEEE Humanitarian Philanthropist. He has received the prestigious recognition of the Royal Academy of Engineering through the Exceptional Talent Scheme, acknowledging his substantial contributions to artificial intelligence and its medical applications. Additionally, he takes pride in his three-year inclusion in Elsevier’s publication, featuring the World’s Top 2% Influential Scientists. He is the Chair of the Election Committee of IEEE Computer Society Worldwide, in 2024.
CRESANTUS N. BIAMBA joined the Department of Education, University of Gävle, Sweden, in January 2014, as a Senior Lecturer in education. His research interests include educational and didactical questions, with specific focus areas including teacher education, the professional development of school leaders, school governance, higher education, curriculum development, early childhood education, and promoting both endogenous and international research in leadership, management, adult education, and education for sustainable development. Throughout his academic career, he has been involved in high-profile projects at both the university and international levels. He has conducted comprehensive basic and applied research in education, teaching and learning, curriculum development, educational leadership, and education for sustainable development.
DEEPAK DASARATHA RAO is currently a Technologist with 25 years’ experience in software research and development and has worked for large product companies. He has experience in software research and development and has developed for embedded systems, smartphone platforms, data wireless communication, cloud, AI, the IoT, connected vehicle telematics, automotive, consumer electronics, connected healthcare, bio-medical, and wearable products. He has developed several first-of-its-kind innovative software products, devised strategy, and reached differentiated products as a Solution Architect, the Product Manager, the Technology Leader, and an Innovator. He is experienced in designing embedded system-based software architecture and in technically leading full stack development for software platforms, system software, middleware, solutions, services, and applications. He has worked on software products for connected car, smartphones, connected devices platform, wearables, smart home setup box, and robotics. He has published several international papers in journals and conferences. He has contributed as a journal reviewer of many research publications in the area of wireless communication and computer science. He is a fellow of RSA (U.K.), The Institution of Engineers (India), and The Institution of Electronics and Telecommunication Engineers.