Enhancing Laboratory Safety With AI: PPE Detection and Non-Compliant Activity Monitoring Using Object Detection and Pose Estimation
Abstract: Ensuring workplace safety and adhering to regulatory standards in pharmaceutical manufacturing is vital.
However, traditional manual monitoring methods are inefficient, prone to errors, and labor-intensive, resulting in potential
safety risks and non-compliance penalties. This research introduces an automated deep learning framework that employs
video analytics for real-time compliance monitoring, providing a scalable alternative to manual inspection processes.
The system integrates YOLOv11n for detecting Personal Protective Equipment (PPE), such as gloves, masks, and
goggles, identifying violations where PPE is either missing or improperly worn. Additionally, YOLOv8n-Pose is utilized to
assess non-compliant postures, including actions like bending, hand-raising, and face-touching. A logging system tracks
violations with precise timestamps, enabling efficient documentation for audits and regulatory purposes.
A curated video dataset was developed and annotated using Roboflow, featuring both compliant and non-compliant
actions. To enhance the model's robustness, preprocessing techniques such as resizing, contrast enhancement, and data
augmentation were applied. The system's performance, evaluated using metrics such as mean Average Precision (mAP), F1-score, and precision, demonstrated 90% overall accuracy, with a mAP@50 of 92.1% and a processing speed of 25 frames per second (FPS), fulfilling the real-time monitoring criteria.
This solution offers a scalable, real-time alternative to manual inspections, reducing human intervention, improving
workplace safety, ensuring compliance with regulations, and automating the documentation process. Future developments
aim to integrate IoT devices, employ edge computing, and incorporate cloud-based analytics to further enhance safety
monitoring and compliance.
How to Cite: Aro Praveen; Nahin Shaikh; Mohammad Annus; Gayathri; Bharani Kumar Depuru (2025). Enhancing Laboratory
Safety with AI: PPE Detection and Non-Compliant Activity Monitoring Using Object Detection and Pose Estimation.
International Journal of Innovative Science and Research Technology, 10(3), 1895-1904.
https://doi.org/10.38124/ijisrt/25mar1274
Fig 1 The CRISP-ML(Q) Architecture Followed in this Research Study
(Source: Mind Map - 360DigiTMG)
A key feature of this system is its automated logging mechanism, which records violations with timestamps, providing a structured method for compliance tracking and audits. The combination of PPE detection and behavioural analysis enhances workplace safety by identifying risk-prone actions that may go unnoticed in manual inspections.

The CRISP-ML(Q) methodology emphasizes data understanding, preprocessing, model development, evaluation, and deployment with quality assurance. The CRISP-ML(Q) process adopted in this study is illustrated in [Fig.1], demonstrating the systematic approach taken for data collection, annotation, model training, and validation [5].
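The timestamped logging mechanism described above can be sketched in a few lines. This is a minimal illustration only: the CSV schema, file name, and violation labels are assumptions for the sketch, not the authors' exact implementation.

```python
# Sketch of a timestamped violation logger (illustrative schema, not the
# authors' exact implementation).
import csv
from datetime import datetime, timezone
from pathlib import Path

LOG_PATH = Path("violations_log.csv")
FIELDS = ["timestamp", "frame_index", "violation_type", "confidence"]

def log_violation(frame_index: int, violation_type: str, confidence: float,
                  log_path: Path = LOG_PATH) -> dict:
    """Append one violation record with a UTC timestamp; return the record."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "frame_index": frame_index,
        "violation_type": violation_type,
        "confidence": round(confidence, 3),
    }
    new_file = not log_path.exists()
    with log_path.open("a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if new_file:
            writer.writeheader()  # write the header once, on first use
        writer.writerow(record)
    return record

record = log_violation(frame_index=120, violation_type="no_gloves", confidence=0.87)
```

Each record carries an ISO-8601 UTC timestamp, which keeps the log unambiguous when footage from multiple sites is audited together.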
Fig 2 High Level Architecture Diagram Representing PPE Detection and Compliance Monitoring System Incorporating Object
Detection and Pose Estimation Models
System Workflow:

As depicted in [Fig.2], the system begins with video data collection from open-source platforms, capturing real-world scenarios where compliance needs to be monitored. Frames are extracted and annotated using Roboflow, where Personal Protective Equipment (PPE) components such as hair cover, no hair cover, goggles, no goggles, face masks, gloves, shoes, and lab coats are labelled. To enhance model performance, preprocessing techniques such as image augmentation and resizing are applied, ensuring robustness across varied environments.

For PPE detection, a YOLOv11n model is trained to identify missing protective equipment in real time. In parallel, pose estimation using YOLOv8 extracts key points corresponding to human body joints, which are further analysed through a rule-based approach to identify non-compliant activities, such as bending, raising hands, or touching the face. The integration of these two models enables a comprehensive compliance assessment, capturing both equipment violations and unsafe human actions within the laboratory environment [7].

In the model integration phase, the outputs from the object detection and pose estimation models are merged into a unified framework. Fine-tuning is conducted to optimize detection accuracy and reduce false positives, ensuring high precision in compliance monitoring.

The deployment phase uses the Streamlit framework and can run on both a local machine and the cloud, enabling real-time video processing for compliance verification. The system generates log files that record detected violations, facilitating auditability and further analysis. The deployed system operates in a continuous monitoring mode, with regular performance evaluations to ensure accuracy and adaptability to dynamic laboratory environments.

By leveraging deep learning-based object detection, pose estimation, and rule-based compliance verification, this system provides an automated, scalable, and efficient solution for laboratory safety enforcement. The architecture minimizes human intervention, enhances compliance monitoring, and enables real-time enforcement of safety protocols, ensuring a safer working environment in laboratory settings [8].
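The model integration step, where detector output and pose-derived activity flags are merged into one compliance verdict, can be sketched as pure rule logic. The class names and rule sets below are illustrative assumptions; in the real pipeline the inputs would come from the YOLOv11n and YOLOv8n-Pose models described above.

```python
# Sketch of the rule-based merge of PPE detections and pose-derived action
# flags into a single compliance report. Class and action names are
# illustrative assumptions, not the paper's exact label set.

REQUIRED_PPE = {"gloves", "mask", "goggles", "lab_coat", "hair_cover"}
NON_COMPLIANT_ACTIONS = {"bending", "raising_hands", "touching_face"}

def assess_compliance(detected_ppe: set, detected_actions: set) -> dict:
    """Merge PPE detections and action flags into one compliance report."""
    missing = sorted(REQUIRED_PPE - detected_ppe)          # equipment violations
    unsafe = sorted(detected_actions & NON_COMPLIANT_ACTIONS)  # behavioural violations
    return {
        "missing_ppe": missing,
        "unsafe_actions": unsafe,
        "compliant": not missing and not unsafe,
    }

report = assess_compliance(
    detected_ppe={"gloves", "mask", "goggles", "lab_coat"},
    detected_actions={"touching_face"},
)
# report flags the missing hair cover and the face-touching action
```

Keeping this merge as explicit set logic, separate from the neural models, makes the compliance rules auditable and easy to adjust per laboratory without retraining.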
Fig 3 Low Level Architecture Diagram Representing PPE Detection and Compliance Monitoring System Incorporating Object
Detection and Pose Estimation Models
For a more detailed breakdown of system components, data flow, and processing stages, a Low-Level Architecture (LLA) is provided [Fig.3]. The LLA delves deeper into module-specific interactions, highlighting key functionalities such as data preprocessing, model inference, decision-making logic, and deployment structure. This detailed architectural view further enhances understanding of the system's real-time processing pipeline [9].

III. DATA COLLECTION AND PREPROCESSING

The success of an AI-driven PPE Detection and Compliance Monitoring System heavily depends on the quality, diversity, and balance of the dataset used for training. A well-structured dataset ensures that the model can generalize effectively, reducing false positives and negatives in real-world laboratory environments. The raw footage, however, requires meticulous annotation to make it meaningful. For this, Roboflow was used to manually label 12 PPE-related classes.

This manual annotation step was critical in ensuring accuracy, as precise bounding boxes allow the detection model to differentiate between compliant and non-compliant scenarios effectively [10].

During initial analysis, an imbalance was observed: certain PPE classes had significantly fewer samples. This posed a risk of biased detection, where underrepresented classes might be overlooked by the model. To address this, additional frames were extracted, and targeted augmentation techniques were applied, ensuring each class had sufficient representation.

Preprocessing for Object Detection: Making Data Model-Ready
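The class-balancing step described above can be sketched with simple array operations. The specific augmentations (horizontal flip, brightness shift), the target count, and the class names are illustrative assumptions; the paper does not specify which transforms were used.

```python
# Sketch of targeted augmentation for underrepresented PPE classes.
# Frames are HxWx3 uint8 arrays; transforms and class names are
# illustrative assumptions.
import numpy as np

def augment_frame(frame: np.ndarray, brightness_shift: int = 30) -> list:
    """Return two simple variants: a horizontal flip and a brighter copy."""
    flipped = frame[:, ::-1, :].copy()                      # mirror left-right
    brighter = np.clip(frame.astype(np.int16) + brightness_shift,
                       0, 255).astype(np.uint8)             # avoid uint8 wrap-around
    return [flipped, brighter]

def balance_classes(frames_by_class: dict, target: int) -> dict:
    """Augment minority classes until each has at least `target` samples."""
    balanced = {}
    for cls, frames in frames_by_class.items():
        out = list(frames)
        i = 0
        while len(out) < target and frames:
            out.extend(augment_frame(frames[i % len(frames)]))
            i += 1
        balanced[cls] = out[:target]
    return balanced

rng = np.random.default_rng(0)
data = {
    "gloves": [rng.integers(0, 255, (64, 64, 3), dtype=np.uint8) for _ in range(8)],
    "no_goggles": [rng.integers(0, 255, (64, 64, 3), dtype=np.uint8) for _ in range(2)],
}
balanced = balance_classes(data, target=8)
```

Applying augmentation only to minority classes, rather than uniformly, is what counters the bias risk noted above without inflating the already well-represented classes.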
Fig 4 Training Graphs for the YOLO Model, Presenting its Learning Progress and Performance.
YOLOv8-Pose extends object detection by predicting key points along with bounding boxes. Each detected human is represented by a bounding box (x, y, width, height, confidence score) and key points {(x_kp, y_kp, conf_kp)} for each joint, where x_kp, y_kp are the pixel coordinates and conf_kp represents the confidence score of the key-point prediction. The model follows a single-stage detection approach, directly predicting key points from an input image without requiring a separate detection step. It detects 18 key points [Fig.6] for each person, covering crucial anatomical landmarks.

Angle Calculations for Movement Analysis:

The function calculate_angle(a, b, c) computes the angle between three key points to analyze joint movements. Angles are calculated for hips, knees, and elbows, which are crucial for identifying postures like bending and arm movements.

Action Recognition:

Jump Detection: Uses ankle height relative to a baseline (calculate_jump) to determine if a person is jumping.
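The two geometric checks above can be sketched as follows. Key points are (x, y) pixel pairs; the jump threshold value is an illustrative assumption, and note that image y-coordinates grow downward, so a jump lowers the ankle's y value.

```python
# Sketch of the pose geometry: the joint angle at b formed by rays b->a
# and b->c, plus a baseline-relative jump check on ankle height.
# The 15-pixel threshold is an illustrative assumption.
import math

def calculate_angle(a, b, c):
    """Angle at joint b (degrees, 0-180) between rays b->a and b->c."""
    ang = math.degrees(
        math.atan2(c[1] - b[1], c[0] - b[0]) - math.atan2(a[1] - b[1], a[0] - b[0])
    )
    ang = abs(ang)
    return 360.0 - ang if ang > 180.0 else ang

def calculate_jump(ankle_y: float, baseline_y: float, threshold: float = 15.0) -> bool:
    """True if the ankle rose more than `threshold` pixels above its baseline."""
    return (baseline_y - ankle_y) > threshold  # smaller y means higher in the image

straight = calculate_angle((0, 0), (1, 0), (2, 0))  # collinear hip-knee-ankle -> 180
right = calculate_angle((0, 1), (0, 0), (1, 0))     # perpendicular segments -> 90
```

A knee or hip angle well below 180 degrees then signals bending, and the same angle helper serves elbows for raised-arm detection.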
Each detected individual is enclosed in a bounding box, with PPE components labeled near the corresponding body parts. Additionally, pose estimation highlights non-compliant actions such as bending or touching the face. The processed video output is displayed through the interface, allowing users to monitor violations and maintain safety standards effectively [Fig.7 illustrates the output].

VI. RESULT

Object Detection Performance (YOLOv11n):

Mean Average Precision (mAP@50): Achieved 92.1%, indicating that the model reliably detects Personal Protective Equipment (PPE) items.

Overall Detection Accuracy: Registered at 90%, demonstrating robust performance across varied laboratory environments.

Precision and Recall:

Precision: ~99% [Fig.8], confirming that the vast majority of identified objects were indeed PPE.

Recall: ~88% [Fig.9], illustrating the model's ability to capture most instances of PPE.

Inference Speed: The model processes video streams at 25 frames per second (FPS), ensuring real-time detection, which is critical for dynamic environments.
The results show that the system strikes a good balance between precision and computational efficiency for real-time application. However, there are limitations to our research. Improving the model's performance would involve including a larger and more diverse dataset. One main challenge for future research is to enable efficient real-time inference suitable for low-power edge devices. For future work, transformer-based models and other advanced deep learning techniques will be explored for further improvement of detection accuracy.

In addition, integration of the real-time deployment strategy and enhanced interpretability of the model will be the main tasks towards making this approach more universal and scalable for different real-world applications.
This research has developed and evaluated object detection and pose estimation models based on YOLO architectures. With high accuracy and computational efficiency, our models can find applications in work-safety monitoring and action recognition. Careful preprocessing and augmentation have enhanced the model's ability to generalize across various situations, bolstering its reliability in real-world scenarios. The main contribution of this paper lies in combining object detection and pose estimation, which brings gains in action recognition and benefits sectors like healthcare, security, and industrial automation.

We acknowledge that, with consent from 360DigiTMG, we have used the CRISP-ML(Q) Methodology (ak.1) and the ML Workflow, which are available as open source on the official website of 360DigiTMG (ak.2).

Funding and Financial Declarations:

The authors affirm that no financial support, grants, or funding were obtained during the research or the manuscript preparation. The authors confirm that they have no financial or non-financial conflicts of interest to disclose.