A Deep Learning Based Approach For Real Time Face Recognition System
Tanusree Das Tithy, Soarov Chakraborty, Rabaya Islam, and Abdul Aziz
Department of Computer Science and Engineering
Khulna University of Engineering & Technology
Khulna-9203, Bangladesh
Abstract—Face recognition technology provides a wide range of possibilities for creating a safe and secure environment. In this paper, two separate approaches are combined into a hybrid solution for face identification. The MTCNN method is applied to the real-time video frames produced by the security camera. A deep learning-based approach is then used to generate the facial landmarks from the obtained face images, and the model is trained using the Res-Net architecture. Furthermore, MongoDB is used as a database to store the facial landmarks and the corresponding identity. If the system cannot identify a person, an alarm is generated and an auto-generated message is immediately sent to the owner's email address. In this way, it creates a strong security system.

Index Terms—Face Recognition, Deep Learning, Res-Net, MongoDB.

I. INTRODUCTION

The amount of research performed in the domain of cybersecurity has been expanding in recent times. With the advancement of technology, various experiments have been conducted on secured home technology. Face recognition is a practical application commonly used for verifying identity, and it falls within the domain of cybersecurity [1], [2].

Security of personal privacy and official confidentiality is a widely discussed issue. As technology advances, security measures need to be upgraded as well. A common problem at home or in the workplace is that a person has to lock everything for privacy whenever he leaves for a moment, which is time-consuming and hectic. There is also the constant worry that someone might access the room in his absence and leak confidential information. Deep learning and image processing make it possible to build a safe system using deep neural networks: a system that processes the images of the people who enter the room and matches them against the selected people who have access to the room. However, if the system has an Internet of Things (IoT) background, it introduces several loopholes; for example, when the system is offline, the whole process collapses. Moreover, hackers can easily hack the system and create several problems for the admin; for example, they can delete the admin database and deny him access to the room, or grant access to unauthorized persons. So, we need a system that can work offline so that no hacker can gain access to it. Besides, we need a system that detects an unauthorized person, alerts the admin, and saves the necessary evidence against the individual. The residual neural network gives the opportunity to develop such a system, saving countless hours of worrying about safety.

In this paper, face recognition is performed using the facial landmarks generated by the Res-Net [3]. Face images are extracted from a video frame using multi-task cascaded convolutional neural networks (MTCNN). Furthermore, MongoDB [4] is used as a database to store the facial landmarks and the corresponding identity. If the system cannot identify a person, an alarm is generated and an auto-generated message is immediately sent to the owner's email address.

Recent works are described in Section II. Section III presents a brief analysis of the proposed methodology. The analysis of the outcome is presented in Section IV. Finally, Section V outlines the conclusions and future work.

II. LITERATURE REVIEW

Various CNN-based architectures have been developed for face recognition, e.g., DeepFace [5], DeepID [6], and FaceNet [7]. Deep learning-based face recognition models perform better than traditional face recognition systems. FaceNet [7] works with around 8 million images, including 2 million distinct identities, and reaches an accuracy of 99.63% on the LFW dataset.

Guo et al. [8] proposed a face recognition architecture that used both near-infrared and visible light images. The experimental outcomes showed that this deep network architecture was highly efficient in real-world situations. Taigman et al. [5] applied a deep learning approach with a dataset of 4,000,000 images covering 4000 distinct identities, and the system distinguishes two different identities using the Euclidean distance. This system achieved 97.35% accuracy when tested on the LFW dataset.

Wu et al. [9] developed a light convolutional neural network structure to learn compact embeddings from datasets with heavy, noisy labels. The experiment was performed using large-scale face data, and the outcome showed that the system was efficient in terms of storage space and computational cost. An and Liu [10] proposed a facial expression
Methods        Dataset   Successfully detected faces   Accuracy (%)
Haar Cascade   18,000    17,220                        95.67
III. METHODOLOGY
Authorized licensed use limited to: Zhejiang University. Downloaded on July 01,2024 at 16:32:49 UTC from IEEE Xplore. Restrictions apply.
Fig. 4. The ResNet-34 architecture [3]
Fig. 2 represents a video frame in which faces were detected using the MTCNN method. Fig. 3 shows the extracted face images, all of which were aligned and resized to 256×256 pixels. We applied Equations 1, 2, and 3 during the alignment process. The angle between the eyes was calculated from the eye landmarks: we computed the horizontal and vertical distances between the left and right eyes and derived the angle from them.

$$X_{distance} = \mathrm{LeftEye}_x - \mathrm{RightEye}_x \tag{1}$$

$$Y_{distance} = \mathrm{LeftEye}_y - \mathrm{RightEye}_y \tag{2}$$

$$\mathrm{Angle} = \cos^{-1}\left(\frac{X_{distance}}{\sqrt{X_{distance}^{2} + Y_{distance}^{2}}}\right) \tag{3}$$

the circumstance, like working with unstructured and semi-structured data. Moreover, the system faces challenges with high-speed data access at the time of complex queries [17]. Therefore, we used MongoDB to store the facial landmarks and the corresponding identity of each registered person.

When someone comes in front of the camera, the system generates facial landmarks and calculates the Euclidean distance to the registered people's facial landmarks. Equation 4 represents the calculation of the Euclidean distance. We used a tolerance value of 0.6. If the Euclidean distance is below this value, the person's face is considered registered in the system; if it exceeds the value, the person is considered unknown.

$$\sqrt{(p_1 - q_1)^2 + (p_2 - q_2)^2 + \dots + (p_n - q_n)^2} = \sqrt{\sum_{i=1}^{n} (p_i - q_i)^2} \tag{4}$$
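Equations 1 to 4 and the tolerance rule can be illustrated with a minimal Python sketch. It is not the implementation used in this work. The function names, the (x, y) tuple format for eye landmarks, the dictionary of registered vectors, and the choice to compare against the closest registered entry are all assumptions made for illustration; only the formulas and the 0.6 tolerance come from the text.

```python
import math


def eye_angle_radians(left_eye, right_eye):
    """Angle between the eyes following Equations 1-3.

    left_eye and right_eye are (x, y) tuples (assumed format).
    The result is in radians; use math.degrees() to convert.
    """
    x_distance = left_eye[0] - right_eye[0]   # Equation 1
    y_distance = left_eye[1] - right_eye[1]   # Equation 2
    norm = math.hypot(x_distance, y_distance)
    if norm == 0:
        raise ValueError("eye landmarks coincide; angle is undefined")
    # Clamp to [-1, 1] to guard against floating-point rounding.
    ratio = max(-1.0, min(1.0, x_distance / norm))
    return math.acos(ratio)                   # Equation 3


def euclidean_distance(p, q):
    """Euclidean distance between two equal-length vectors (Equation 4)."""
    if len(p) != len(q):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))


def identify(probe, registered, tolerance=0.6):
    """Return the name of the closest registered vector, or None.

    registered maps a name to its stored vector (hypothetical layout).
    A distance below the tolerance counts as a match; a distance equal
    to the tolerance is treated as unknown here, which is an assumption
    because the text does not cover that case.
    """
    best_name, best_distance = None, math.inf
    for name, vector in registered.items():
        distance = euclidean_distance(probe, vector)
        if distance < best_distance:
            best_name, best_distance = name, distance
    return best_name if best_distance < tolerance else None
```

A caller would pass the vector generated for the face in front of the camera as `probe`; a return value of `None` corresponds to the unknown-person case that triggers the alarm and the email.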
TABLE II
PERFORMANCE MEASURES
20   97.02
68   97.08

(a) Registered Face (b) Intruder Face
Fig. 5. 68 Facial landmarks
Fig. 6. Data load time to store data in MongoDB DBMS (in seconds)

Fig. 7. Query response time to extract data from MongoDB DBMS (in milliseconds)

and this security system is ready to operate in the professional domains.

B. MongoDB Performance Analysis

We analyzed the system's load time for uploading the facial images and landmarks into MongoDB data storage. It was measured for 25%, 50%, 75%, and 100% of the total dataset after executing all the steps. The data loading time is reported in seconds, as displayed in Fig. 6.

In the MongoDB database, we measured the total time for executing an operation. Fig. 7 presents four query response times and the average query response time in milliseconds. Here, we try to find a match for registered persons and calculate the amount of time needed for the recognition task.

V. CONCLUSIONS

In this paper, we proposed a technologically enhanced security system that can increase people's safety. Here, face recognition is performed using the facial landmarks generated by the Res-Net. Moreover, face images are extracted from a video frame using the MTCNN method. The system seemed quite successful in achieving its goal, and a more protected environment can be created with its help. Furthermore, this system can replace the human resources that security guards provide. While creating the authorized persons' dataset, a few factors should be considered, i.e., face poses, light intensity, and the distance between the camera and a person. Otherwise, an authorized person can be labeled as an intruder. The system worked well on a large dataset, and the accuracy of the model is quite satisfactory and able to work in the virtual environment. Moreover, it can detect multiple faces at the same time, which is helpful for real-time recognition. Further studies will be done on implementing another system that handles 3D images generated by multiple cameras, which would be a further step forward.

REFERENCES

[1] A. Makandar and A. Patrot, “Malware class recognition using image processing techniques,” in 2017 International Conference on Data Management, Analytics and Innovation (ICDMAI). IEEE, 2017, pp. 76–80.
[2] J. Kour, M. Hanmandlu, and A. Ansari, “Biometrics in cyber security,” Defence Science Journal, vol. 66, no. 6, 2016.
[3] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 770–778.
[4] K. Chodorow, MongoDB: the definitive guide: powerful and scalable data storage. O’Reilly Media, Inc., 2013.
[5] Y. Taigman, M. Yang, M. Ranzato, and L. Wolf, “Deepface: Closing the gap to human-level performance in face verification,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2014, pp. 1701–1708.
[6] Y. Sun, X. Wang, and X. Tang, “Deep learning face representation from predicting 10,000 classes,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2014, pp. 1891–1898.
[7] F. Schroff, D. Kalenichenko, and J. Philbin, “Facenet: A unified embedding for face recognition and clustering,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2015, pp. 815–823.
[8] K. Guo, S. Wu, and Y. Xu, “Face recognition using both visible light image and near-infrared image and a deep network,” CAAI Transactions on Intelligence Technology, vol. 2, no. 1, pp. 39–47, 2017.
[9] X. Wu, R. He, Z. Sun, and T. Tan, “A light cnn for deep face representation with noisy labels,” IEEE Transactions on Information Forensics and Security, vol. 13, no. 11, pp. 2884–2896, 2018.
[10] F. An and Z. Liu, “Facial expression recognition algorithm based on parameter adaptive initialization of cnn and lstm,” The Visual Computer, vol. 36, no. 3, pp. 483–498, 2020.
[11] J.-J. Lv, C. Cheng, G.-D. Tian, X.-D. Zhou, and X. Zhou, “Landmark perturbation-based data augmentation for unconstrained face recognition,” Signal Processing: Image Communication, vol. 47, pp. 465–475, 2016.
[12] P. Viola and M. Jones, “Rapid object detection using a boosted cascade of simple features,” in Proceedings of the 2001 IEEE computer society conference on computer vision and pattern recognition. CVPR 2001, vol. 1. IEEE, 2001, pp. I–I.
[13] K. Zhang, Z. Zhang, Z. Li, and Y. Qiao, “Joint face detection and alignment using multitask cascaded convolutional networks,” IEEE Signal Processing Letters, vol. 23, no. 10, pp. 1499–1503, 2016.
[14] Z. Zhang, Y. Song, and H. Qi, “Age progression/regression by conditional adversarial autoencoder,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2017, pp. 5810–5818.
[15] Q. Cao, L. Shen, W. Xie, O. M. Parkhi, and A. Zisserman, “Vggface2: A dataset for recognising faces across pose and age,” in 2018 13th IEEE international conference on automatic face & gesture recognition (FG 2018). IEEE, 2018, pp. 67–74.
[16] D. E. King, “Dlib-ml: A machine learning toolkit,” The Journal of Machine Learning Research, vol. 10, pp. 1755–1758, 2009.
[17] S. Chakraborty, S. Paul, and K. M. Azharul Hasan, “Performance comparison for data retrieval from nosql and sql databases: A case study for covid-19 genome sequence dataset,” in 2021 2nd International Conference on Robotics, Electrical and Signal Processing Techniques (ICREST), 2021, pp. 324–328.