Proceedings of the Sixth International Conference on Inventive Computation Technologies [ICICT 2021]

IEEE Xplore Part Number: CFP21F70-ART; ISBN: 978-1-7281-8501-9

Face Mask Detection by using Optimistic Convolutional Neural Network

Suresh K, Palangappa MB, Bhuvan S
Department of Computer Science, Amrita School of Arts and Sciences, Mysore, Amrita Vishwa Vidyapeetham, India

2021 6th International Conference on Inventive Computation Technologies (ICICT) | 978-1-7281-8501-9/21/$31.00 ©2021 IEEE | DOI: 10.1109/ICICT50816.2021.9358653

Abstract—The COVID-19 pandemic has rapidly deepened health crises worldwide and is affecting day-to-day life. A key survival recommendation is to wear a safe facemask to stay protected against transmission of the coronavirus; wearing a facemask is among the most effective preventive measures against COVID-19. Manually monitoring whether individuals in public and crowded areas are wearing facemasks correctly, and notifying those who are not, is a difficult task. This paper presents a simplified approach to facemask detection and to notifying individuals who are not wearing a facemask. The proposed system/model is trained and evaluated on Kaggle datasets. The system runs in real time, detects whether an individual's face has a facemask and, if not, notifies the individual personally through a text message. The mask region is extracted from faces captured in real time in public and is fed as input into a convolutional neural network (CNN).

Keywords—Face Mask Detection, Convolutional Neural Networks (CNNs), Kaggle Datasets, Public Safety, COVID-19, OpenCV.

I. INTRODUCTION

In 2020, the COVID-19 pandemic spread rapidly across the world, raising a red alert for global health and deeply affecting humanity and everyday life. Respiratory illnesses such as Severe Acute Respiratory Syndrome (SARS) and Middle East Respiratory Syndrome (MERS) have spread around the globe, and in December 2019 a new critical respiratory illness arose in Wuhan, China. According to the World Health Organization (WHO), which declared it a global pandemic, it has infected millions of people and claimed millions of lives in more than 200 countries. The virus spreads through close contact between people and in overcrowded public areas [1]. According to the WHO guidelines, the primary precautions for preventing the spread of the virus are wearing a facemask and maintaining social distance. An efficient computer-vision-based approach targets a real-time application that monitors whether individuals in public are wearing facemasks and keeping a safe social distance, deploying the model on a Raspberry Pi 4 to monitor public cameras and spot violations.

Modern deep learning algorithms are combined with geometric techniques to build robust models that cover detection, tracking, and validation [2]. Another work builds a hybrid deep and classical machine learning model to detect facemasks. The model consists of two elements: the first uses ResNet50 for feature extraction, and the second uses a Support Vector Machine (SVM), a decision tree, and an ensemble algorithm for facemask classification [3]. An architecture is trained efficiently using deep learning on a dataset of images of individuals' faces with and without facemasks. The system aims to limit the spread of the virus by identifying individuals who are not wearing facemasks in smart cities through public Closed-Circuit Television (CCTV) cameras. An individual who is not wearing a facial mask is spotted and reported to the corresponding authority [4].

COVID-19 has had a large-scale negative impact on health across the globe. One major form of defense against viruses is to wear face masks in public places. RetinaFaceMask is a high-accuracy facemask detector. It is a one-stage detector that uses a feature pyramid network to fuse multiple feature maps with high-level semantic information, and a novel context attention module for facemask detection. Predictions with low confidence and high intersection over union are rejected using the proposed cross-class object removal algorithm [5]. The COVID-19 pandemic has caused a global health emergency, creating an urgent need for facemasks as a basic protection mechanism to break the chain of coronavirus transmission. Another work aims to detect facemasks on human faces in live streaming video as well as in still images. The face detector model is built using deep learning and a Single Shot Detector (SSD).

The presence of a facemask in an image or video stream is determined using basic concepts of transfer learning in neural networks [6]. The coronavirus has rapidly affected day-to-day life, in turn disrupting world trade and movement. A simplified method detects the face in an image and identifies whether it has a facemask on it using basic machine learning packages such as TensorFlow, Keras, OpenCV and Scikit-Learn. The model is able to detect a facemask even if the face is in motion [7]. Manually monitoring whether people are wearing masks in public and crowded areas is a difficult task. Using the pre-trained state-of-the-art deep learning architecture InceptionV3, an automated facemask detection model is developed by fine-tuning. The model is trained on the Simulated Masked Face Dataset (SMFD), in which masks are superimposed on faces from a public face dataset to support better training and testing of the model. Image augmentation is used to improve training and testing when the available data is limited [8].

© IEEE 2021. This article is free to access and download, along with rights for full text and data mining, re-use and analysis.

Authorized licensed use limited to: Huynh Trung. Downloaded on September 26,2021 at 05:59:49 UTC from IEEE Xplore. Restrictions apply.
In this paper, we propose a real-time facemask detection model using a Convolutional Neural Network (CNN), a class of Deep Neural Network (DNN) most commonly used in image classification and recognition. The proposed model can be embedded in surveillance cameras in organizations, schools, universities, shopping malls, multiplexes, etc., to automatically monitor whether individuals are wearing facemasks and, if not, to spot them, report them to higher authorities and notify them personally through a text message. The model helps to break the chain of virus transmission through close contact, which can reduce the rapidly increasing number of positive cases and help control the loss of lives.

II. RELATED WORK

Facemask detection has become an essential and much-needed capability during the COVID-19 pandemic. Since manually monitoring whether people are wearing facemasks in public and crowded areas is difficult, [1] built a real-time automated model integrated with public surveillance cameras that uses computer vision on a Raspberry Pi 4 to detect whether people are wearing facemasks and maintaining social distance, and reports violations to the respective authorities. Because the coronavirus has caused a global health crisis and facial masks are a basic means of prevention, [2] built a hybrid model from classical and deep machine learning consisting of two components. The first component uses ResNet50 for feature extraction, and the second classifies masks using ensemble, Support Vector Machine (SVM) and decision tree algorithms. Three datasets were used: the Simulated Masked Face Dataset (SMFD), Labeled Faces in the Wild (LFW) and the Real-World Masked Face Dataset (RMFD). The SVM classifier achieved 99.49% accuracy on SMFD, 99.64% on RMFD and 100% testing accuracy on LFW. With the healthcare system under crisis, a number of precautionary measures are being taken to reduce the spread of viruses, and wearing facemasks is one of them. In [3], a system is created to find people not wearing facemasks in smart cities using Closed-Circuit Television (CCTV) cameras.

The trained system achieved 98.7% accuracy in distinguishing people with and without masks. In [4], an efficient, high-accuracy detector called RetinaFaceMask is built to spot whether people are wearing facemasks. The framework is a one-stage detector with a novel context attention module that focuses on face mask identification and a feature pyramid network that combines high-level semantic information with multiple feature maps. It also introduces a cross-class object removal algorithm that rejects predictions with high intersection over union and low confidence. RetinaFaceMask achieves state-of-the-art results on a facemask dataset, with face and mask detection precision 2.3% and 1.5% higher than the baseline, respectively, and recall 11.0% and 5.9% higher than the baseline. In [5], an architecture is built to determine whether people are wearing facemasks in live streaming video and in still images, using a Single Shot Detector (SSD) for object detection. Transfer learning in neural networks is used to determine the presence or absence of a facemask in video streams and images. Experimental findings indicate that the model performs well, with 100% accuracy and 99% precision and recall. A simplified approach [6] detects facial masks, even in motion, using basic machine learning packages such as TensorFlow, Keras, OpenCV and Scikit-learn. The method attains accuracies of up to 95.77% and 94.58% on two different datasets. An automated process [7] determines whether individuals wear facemasks in public. The model is built by fine-tuning the pre-trained state-of-the-art deep learning model InceptionV3 and is trained on the Simulated Masked Face Dataset (SMFD), in which masks are superimposed on faces from a public face dataset to support better training and testing of the model. Image augmentation is used to improve training and testing when the available data is limited. The model reaches 99.9% precision during training and 100% accuracy during testing.

An efficient real-time, computer-vision-based system detects violations of both facemask wearing and social distancing in public areas. Combining advanced deep learning algorithms with geometric techniques yields a robust model that covers detection, tracking and validation. A Raspberry Pi 4 is attached to public surveillance cameras and runs the resulting model. The paper uses the lightweight neural network MobileNetV2 to analyse Real-Time Streaming Protocol (RTSP) video streams using OpenCV, together with transfer learning and a Single Shot Detector (SSD), to balance resource limitations against recognition accuracy when monitoring real-time public video surveillance cameras for violations of facemask wearing and social distancing.

Hybrid models for facemask detection are built using classical machine learning and deep learning. The system has two components: one extracts features using ResNet50 and the other classifies facemasks using a Support Vector Machine (SVM), an ensemble algorithm and a decision tree. The Real-World Masked Face Dataset (RMFD), Labeled Faces in the Wild (LFW) and the Simulated Masked Face Dataset (SMFD) are the three datasets used to evaluate the three algorithms. The SVM algorithm proved more effective than the other two, with 99.64% accuracy on RMFD, 99.49% on SMFD and 100% on LFW.

A system is proposed to detect people without facial masks in public areas of smart cities by monitoring Closed-Circuit Television (CCTV) cameras. A person without
a facial mask is detected, and the higher authorities are notified through the city network. The architecture is developed using a Convolutional Neural Network (CNN), which extracts features from the dataset images as well as from images captured by cameras in real time. The trained architecture achieves 98% accuracy.

III. PROPOSED WORK

In this research work, we propose an Optimistic Convolutional Neural Network that automatically monitors whether people in public are wearing masks. Fig. 1 shows the architecture of the system and how it functions automatically to help prevent the spread of COVID-19.

Fig. 1. Architecture of proposed system

Our system uses TensorFlow and Keras together with a Convolutional Neural Network model to detect whether an individual is wearing a face mask. We first train the system on the dataset from Kaggle using Keras and TensorFlow. Once training is done, we load the face mask classifier from disk and detect faces in a real-time video stream. The process also uses MobileNet to train on a large collection of images and to classify high-quality images.

The image dataset is loaded with Keras and the images are converted into arrays; MobileNet preprocessing is then applied to each input image, and the image is appended to the data list. The main contributions of the proposed system are face identification and face mask detection, both performed in real time with the help of MobileNet and OpenCV. A square box is displayed on every person's face in red or green: red indicates that the person is not wearing a mask and green indicates that the person is wearing a mask.

B. Dataset

We have used a face-cropped dataset from Kaggle of about 3918 images of persons with and without masks. These images are used to train a model that classifies faces into two categories: faces with masks and faces without masks. The images are converted into arrays in order to build the deep learning model. In the video output, each detected person is shown with a square bounding box. The system monitors continuously, and whenever a person is identified without a mask, the person's face is captured and sent to the higher authorities and to that person. Given the outbreak of the novel coronavirus, the proposed model can be deployed in public in real time to monitor whether people are wearing face masks. Our model can monitor public places automatically, which helps those who would otherwise monitor people physically/manually; that is the reason we picked this architecture. Our system can be used in airports, schools, railway stations, shopping malls, offices and other public areas to make sure that people in public are wearing masks.

A. Convolutional Neural Network

CNNs play a significant part in computer vision pattern recognition tasks, on account of their low computation cost and their capacity for spatial feature extraction. A CNN convolves convolution kernels with the original images to extract higher-level features. The Inception network proposed in [9] allows the network to learn the best combination of kernels. How to design a good Convolutional Neural Network architecture still remains an open question. To train much deeper neural networks, K. He et al. proposed the Residual Network (ResNet) [10], which can learn an identity mapping from the previous layer. As object detectors are usually deployed on mobile or embedded devices, where computing resources are very limited, the Mobile Network (MobileNet) [11] was proposed. It uses depthwise convolution to extract features and channel-wise convolutions to adjust the number of channels, so that the computational cost of MobileNet is much lower than that of networks using standard convolutions. Fig. 2 shows a schematic diagram of a basic Convolutional Neural Network.

Fig. 2. Schematic Diagram for Basic Convolution Neural Network

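The computational saving attributed to MobileNet in Section III can be illustrated with a rough operation count. The sketch below compares the multiply-accumulate operations of a standard convolution with those of a depthwise convolution followed by a 1×1 channel-wise convolution. The function names and the layer size (a 112×112 feature map, 32 input channels, 64 output channels, 3×3 kernels) are illustrative assumptions, not values reported in this paper.

```python
def standard_conv_macs(h, w, c_in, c_out, k):
    """Multiply-accumulates of a k x k standard convolution producing an h x w map."""
    return h * w * c_in * c_out * k * k


def depthwise_separable_macs(h, w, c_in, c_out, k):
    """Multiply-accumulates of a depthwise k x k convolution plus a 1 x 1 convolution."""
    depthwise = h * w * c_in * k * k   # one k x k filter per input channel
    pointwise = h * w * c_in * c_out   # 1 x 1 convolution that mixes the channels
    return depthwise + pointwise


# Hypothetical layer size, chosen only for illustration.
std = standard_conv_macs(112, 112, 32, 64, 3)
sep = depthwise_separable_macs(112, 112, 32, 64, 3)
ratio = sep / std

print(f"standard:  {std:,} MACs")
print(f"separable: {sep:,} MACs")
print(f"ratio:     {ratio:.4f}")
```

For a k×k kernel and c_out output channels the ratio reduces to 1/c_out + 1/k², so this assumed layer needs roughly one eighth of the operations of the standard convolution.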
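The red/green box convention described in Section III can be sketched as a small helper that turns a classifier output into a label and a box color. This is a minimal sketch under stated assumptions: the classifier is assumed to return two probabilities (mask, no mask), and the helper name `box_style` is hypothetical. The colors are given in blue-green-red order because that is the channel order OpenCV drawing functions expect.

```python
def box_style(p_mask, p_no_mask):
    """Return (label text, BGR color) for one detected face.

    p_mask and p_no_mask are assumed to be the two class probabilities
    produced by the mask classifier for a single face crop.
    """
    wearing = p_mask > p_no_mask
    label = "Mask" if wearing else "No Mask"
    # Green for a masked face, red for an unmasked face (BGR order).
    color = (0, 255, 0) if wearing else (0, 0, 255)
    confidence = max(p_mask, p_no_mask)
    return f"{label}: {confidence * 100:.2f}%", color


print(box_style(0.9, 0.1))
print(box_style(0.2, 0.8))
```

The returned text and color could then be passed to the drawing calls that render the box and the confidence score on each video frame.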

The dataset consists of 3918 images divided into two parts: faces with masks and faces without masks (Fig. 3). The images are taken from Kaggle. Faces without masks include faces with various skin colors, different angles, occlusion, etc. Faces with masks include faces with masks and hands, and faces with masks and other objects that cover the face, which adds useful variety to the dataset.

IV. MATHEMATICAL NOTATION

FaceMask gives two outputs for a given input image: the localization offset prediction $\hat{Y}_{loc} \in \mathbb{R}^{a \times 4}$ and the classification prediction $\hat{Y}_{c} \in \mathbb{R}^{a \times b}$, where $a$ and $b$ denote the number of generated anchors and the number of classes, respectively. We also have the default anchors $D \in \mathbb{R}^{a \times 4}$, the ground truth boxes $Y_{loc} \in \mathbb{R}^{o \times 4}$ and the classification labels $Y_{c} \in \mathbb{R}^{o \times 1}$, where $o$ is the number of objects.

First, the default anchors $D$ are matched with the ground truth boxes $Y_{loc}$ and the classification labels $Y_{c}$ to obtain $D_{ml} \in \mathbb{R}^{a \times 4}$ and $D_{mc} \in \mathbb{R}^{a \times 1}$, where each row of $D_{ml}$ holds the offsets and each row of $D_{mc}$ holds the top classification label for the corresponding default anchor.

Next, we select the positive localization predictions $\hat{Y}^{+}_{loc} \in \mathbb{R}^{a_{+} \times 4}$ and the positive matched default anchors $D^{+}_{ml} \in \mathbb{R}^{a_{+} \times 4}$, where $a_{+}$ is the number of positive default anchors. The smooth L1 loss between them, $L_{loc}(\hat{Y}^{+}_{loc}, D^{+}_{ml})$, is then computed.

Hard negative mining is then performed to sample negative default anchors and predictions, $D^{-}_{mc} \in \mathbb{R}^{a_{-} \times 1}$ and $\hat{Y}^{-}_{c} \in \mathbb{R}^{a_{-} \times b}$, where $a_{-}$ is the number of sampled negative anchors. The confidence loss is $L_{conf}(\hat{Y}^{-}_{c}, D^{-}_{mc}) + L_{conf}(\hat{Y}^{+}_{c}, D^{+}_{mc})$.

Hence, the total loss function is

$$L = \frac{1}{Z}\left(L_{conf}(\hat{Y}^{-}_{c}, D^{-}_{mc}) + L_{conf}(\hat{Y}^{+}_{c}, D^{+}_{mc}) + \alpha L_{loc}(\hat{Y}^{+}_{loc}, D^{+}_{ml})\right),$$

where $Z$ is the number of matched default anchors and $\alpha$ weights the localization term.

Algorithm: Object removal
Require: selected face detections D'_s, C'_s; selected mask detections D'_n, C'_n
for p_s in face detections D'_s do
    for p_n in mask detections D'_n do
        if IoU(p_s, p_n) > thresh then
            remove the object with the lower confidence
        end if
    end for
end for

V. EXPERIMENTAL RESULTS AND ANALYSIS

The proposed system is an Optimistic Convolutional Neural Network used to detect face masks in real time. The proposed system follows these four steps:

● Collection of data and pre-processing
● Building and training the model
● Model testing
● Implementing the model

A. Collection of Data and Pre-processing

The proposed system uses face-cropped images showing faces at different angles and in different poses, with and without masks; the images are labelled and used to train our model. Real-time automated face mask detection is performed with MobileNet and OpenCV. The dataset consists of 3918 images used to train the proposed model. The data are divided into two categories: faces with masks and faces without masks. Faces with masks include faces with masks and hands, and faces with masks and other objects that cover the face, which adds useful variety to the dataset.

Fig. 3. Some sample images from the dataset
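The object removal algorithm of Section IV can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: boxes are assumed to be (x1, y1, x2, y2) tuples, the threshold of 0.5 and the tie-breaking rule are assumptions, and removals are collected first and applied after both loops for simplicity.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def remove_cross_class(faces, masks, thresh=0.5):
    """Drop the less confident detection of every overlapping face/mask pair.

    faces and masks are lists of (box, confidence) pairs.
    """
    drop_faces, drop_masks = set(), set()
    for i, (face_box, face_conf) in enumerate(faces):
        for j, (mask_box, mask_conf) in enumerate(masks):
            if iou(face_box, mask_box) > thresh:
                # Ties drop the mask detection (an arbitrary assumption).
                if face_conf < mask_conf:
                    drop_faces.add(i)
                else:
                    drop_masks.add(j)
    kept_faces = [d for i, d in enumerate(faces) if i not in drop_faces]
    kept_masks = [d for j, d in enumerate(masks) if j not in drop_masks]
    return kept_faces, kept_masks


# Hypothetical detections: the first face overlaps the mask and is less confident.
faces = [((0, 0, 10, 10), 0.6), ((20, 20, 30, 30), 0.9)]
masks = [((1, 1, 10, 10), 0.8)]
kept_faces, kept_masks = remove_cross_class(faces, masks)
print(kept_faces)
print(kept_masks)
```

In this example the first face detection is removed because it overlaps the mask detection and has the lower confidence, while the second face does not overlap anything and is kept.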

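The total loss of Section IV can be evaluated numerically as follows. This is a sketch under assumptions the paper does not state: the confidence loss is taken to be cross-entropy over class probabilities, the smooth L1 loss switches from quadratic to linear at an absolute difference of 1, and alpha defaults to 1. The function and argument names are hypothetical.

```python
import math


def smooth_l1(pred, target):
    """Smooth L1 loss summed over the coordinates of one box."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d if d < 1.0 else d - 0.5
    return total


def cross_entropy(probs, label):
    """Negative log-probability assigned to the correct class."""
    return -math.log(probs[label])


def total_loss(pos_loc_pred, pos_loc_target, pos_probs, pos_labels,
               neg_probs, neg_labels, alpha=1.0):
    """(L_conf(negatives) + L_conf(positives) + alpha * L_loc) / Z."""
    z = len(pos_labels)  # number of matched (positive) default anchors
    if z == 0:
        return 0.0
    l_loc = sum(smooth_l1(p, t) for p, t in zip(pos_loc_pred, pos_loc_target))
    l_conf_pos = sum(cross_entropy(p, y) for p, y in zip(pos_probs, pos_labels))
    l_conf_neg = sum(cross_entropy(p, y) for p, y in zip(neg_probs, neg_labels))
    return (l_conf_neg + l_conf_pos + alpha * l_loc) / z


# One positive anchor and one sampled negative anchor (made-up numbers).
loss = total_loss(
    pos_loc_pred=[[0.5, 2.0, 0.0, 0.0]],
    pos_loc_target=[[0.0, 0.0, 0.0, 0.0]],
    pos_probs=[[0.5, 0.5]], pos_labels=[1],
    neg_probs=[[1.0, 0.0]], neg_labels=[0],
)
print(loss)
```

Here the localization term contributes 0.125 + 1.5, the positive confidence term contributes ln 2, and the perfectly classified negative anchor contributes nothing.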

B. Building and Training the Model

In the proposed system, the custom dataset is loaded and the algorithm is trained on the labelled images. In this step, the images are resized and converted into NumPy array format. The model uses MobileNet as its backbone and is trained using TensorFlow. Fig. 4 shows the model's training accuracy and loss curves. The training parameters are an initial learning rate of INIT_LR = 1e-4, a batch size of BS = 32 and EPOCHS = 20 epochs. A webcam is used for face mask detection, and once a person is found, we mark the person with a square bounding box.
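As a rough illustration of what these settings imply, the sketch below derives the number of weight updates per epoch and in total. The 80/20 train/test split and the helper name `training_schedule` are assumptions made for illustration; the paper does not state how the 3918 images were split.

```python
import math

INIT_LR = 1e-4   # initial learning rate, as reported above
EPOCHS = 20      # number of training epochs, as reported above
BS = 32          # batch size, as reported above

NUM_IMAGES = 3918       # dataset size reported in the paper
TEST_FRACTION = 0.20    # assumed hold-out fraction, not stated in the paper


def training_schedule(num_images, test_fraction, batch_size, epochs):
    """Return the split sizes and the number of optimizer steps."""
    n_test = int(round(num_images * test_fraction))
    n_train = num_images - n_test
    steps_per_epoch = math.ceil(n_train / batch_size)
    return {
        "train": n_train,
        "test": n_test,
        "steps_per_epoch": steps_per_epoch,
        "total_steps": steps_per_epoch * epochs,
    }


schedule = training_schedule(NUM_IMAGES, TEST_FRACTION, BS, EPOCHS)
print(schedule)
```

Under the assumed split, each epoch processes 3134 training images in 98 batches, for 1960 weight updates over the 20 epochs.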
Fig. 4. Training model accuracy and loss curves

C. Model Testing

Our system works in an automated manner: it identifies whether a person is wearing a mask and, if not, sends the person's image to that person as well as to the higher authorities. Once the model is fully trained on the provided dataset, we test it by displaying a bounding box with the confidence score on top of the box. From the camera feed, the proposed system marks every person's face with a green or red bounding box (Fig. 5) indicating whether the person is wearing a mask. If any person is not wearing a mask, the system captures that person's image and sends it to the person as well as to the higher authorities.

Fig. 5. Test result of system/model

D. Implementing the Model

Our system uses a custom dataset, with input video taken from any camera device. The system is fed real-time video from public places and automatically monitors and detects whether people are wearing face masks. Whenever a person is found not wearing a mask, his/her photo is captured and sent to the higher officials/authorities as well as to the person, so that they can take further action.

VI. CONCLUSION

In our research we have proposed a system that automatically identifies whether a person is wearing a face mask and notifies the higher authorities if not. The proposed system uses computer vision and MobileNet to help ensure that the public wear face masks and to limit the spread of the COVID-19 virus. Our research also helps police or higher authorities by making it easier to identify whether a person is wearing a mask; if not, they will also have the person's photo, on the basis of which they can take further action. The proposed system can be deployed in places like railway stations, shopping malls, offices, schools, airports, etc.

VII. FUTURE WORK

There are many more cases in which this model can be integrated for the safety of the public:
● Identifying a person who commits a crime while wearing a face mask.
● Identifying what type of mask a person is wearing.
● Coughing and sneezing detection.
● Temperature screening.


REFERENCES

[1] Yadav S. "Deep learning based safe social distancing and facemask detection in public areas for COVID-19 safety guidelines adherence." Int J Res Appl Sci Eng Technol. 2020;8(7):1368-1375.
[2] Loey M, Manogaran G, Taha MHN, Khalifa NEM. "A hybrid deep transfer learning model with machine learning methods for face mask detection in the era of the COVID-19 pandemic." Measurement (Lond). 2021;167:108288.
[3] Suresh. K and Pattabiraman. V, "An improved utility itemsets mining with respect to positive and negative values using mathematical model", International Journal of Pure and Applied Mathematics, Volume-101 No-5, Page No-763772, 2015.
[4] Suresh. K and Pattabiraman. V, "Developing a customer model for targeted marketing using association graph mining", International Journal of Recent Technology and Engineering, Volume 8, Issue 2S4, July 2019, Pages 292-296. DOI:10.35940/ijrte.B1055.0782S419
[5] K. Suresh and O. Praveen, "Extracting of Patterns Using Mining Methods Over Damped Window," 2020 Second International Conference on Inventive Research in Computing Applications (ICIRCA), Coimbatore, India, 2020, pp. 235-241, doi: 10.1109/ICIRCA48905.2020.9182893.
[7] S, Akshay & Bhat, Mandara & Rao, Aishwarya. (2019). "Facial Expression Recognition using Compressed Images." International Journal of Recent Technology and Engineering. DOI:10.35940/ijrte.B1041.078219.
[8] S. Akshay and P. Apoorva, "Segmentation and classification of FMM compressed retinal images using watershed and canny segmentation and support vector machine," 2017 International Conference on Communication and Signal Processing (ICCSP), Chennai, 2017, pp. 1035-1039, doi:10.1109/ICCSP.2017.8286531.
[9] Khan MKJ, Ud Din N, B.S.Y.J.: "Interactive removal of microphone object in facial images." Electronics 8(10) (2019)
[10] B J, Bipin & Nihar, K & Adarsh, C. (2020). "A Comparative Binarization Approach for Degraded Agreement Document Image from Various Pharmacies." Vol 12. 806-814. 10.31838/ijpr/2020.12.02.0124.
[11] B J, Bipin & Khamarudheen, K.S. & Ranjitha, H.s. (2016). "An approach for identifying the presence of factor IX gene in DNA sequences using position vector ANN." 87. 396-403.
[12] Szegedy C, Liu W, Jia Y, et al. "Going deeper with convolutions." In: 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE; 2015.
[13] Suresh. K and Pattabiraman. V, "Reduction of large database and identifying frequent patterns using enhanced high utility mining", International Journal of Pure and Applied Mathematics, Volume-109, No-5, Page No-161-169, 2016.
[16] Suresh. K and Devika Mohan, "Development of High Utility Itemsets In Streaming Database", Test Engineering and Management, Volume No-82, Page No-13052 - 13056, ISSN: 0193 – 4120, 2020.
[17] Suresh. K and Kashyap. C, "Effectively Mining on Utility Itemset by Using Conventional Method", Test Engineering and Management, Volume No-82, Page No-13062 - 13068, ISSN: 0193 – 4120, 2020.
[18] Chollet, F.: "Xception: Deep learning with depthwise separable convolutions." CoRR abs/1610.02357 (2016), http://arxiv.org/abs/1610.02357
[19] Sandler, M., Howard, A.G., Zhu, M., Zhmoginov, A., Chen, L.: "Inverted residuals and linear bottlenecks: Mobile networks for classification, detection and segmentation." CoRR abs/1801.04381 (2018), http://arxiv.org/abs/1801.04381
[20] Howard AG, Zhu M, Chen B, et al. "MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications." arXiv [csCV]. Published online 2017. http://arxiv.org/abs/1704.04861
