Face Recognition With Deep Learning Architectures
Abstract— The progression of information discernment via facial identification and the emergence of innovative frameworks have exhibited remarkable strides in recent years. This progress has been particularly pronounced in the verification of individual credentials, a practice prominently harnessed by law enforcement agencies to advance the field of forensic science. A multitude of scholarly endeavors have been dedicated to the application of deep learning techniques within machine learning models, with the aim of facilitating the extraction of distinctive features and subsequent classification, thereby elevating the precision of unique individual recognition. In this study, the focal point is the exploration of deep learning methodologies tailored for facial recognition and its subsequent matching processes, centering on the augmentation of accuracy through training models with expansive datasets. Within this paper, a comprehensive survey is conducted, encompassing an array of diverse strategies utilized in facial recognition, and the survey in turn delves into the intricacies and challenges that underlie facial recognition within imagery analysis.
Paper & Year | Deep Learning Architecture | Dataset | Journal/Conference | Limitation & Future Work
[1] 2020 | CNN with augmented data | LFW (Labeled Faces in the Wild) | Systems Science & Control Engineering | Limited discussion on network specifics.
[2] 2019 | ArcFace | LFW, CFP, AgeDB, VggFace2 | IEEE CVPR | Assumes high-quality training data. Investigate techniques to make the model robust to noisy or unbalanced data.
[3] 2017 | Deep CNN | LFW (Labeled Faces in the Wild) | Springer | Performance on large unconstrained datasets might be limited. Study domain adaptation techniques to improve performance on diverse datasets.
[4] 2018 | Local Binary CNN | Surveillance video frames | ACM | Limited exploration of more recent advancements. Investigate hybrid architectures that combine local and global features for better recognition.
[5] 2017 | Template Adaptation | CASIA-WebFace | IEEE | Focus on template-based methods. Explore end-to-end architectures for verification and identification.
[6] 2018 | CosFace | CASIA-WebFace | IEEE CVPR | Assumes predefined class centers. Explore dynamic center assignment methods for more adaptive cosine loss.
[7] 2017 | Wasserstein CNN | CASIA NIR-VIS 2.0 | IEEE | Limited to NIR-VIS face recognition. Extend to broader cross-modal recognition scenarios.
[8] 2018 | Adversarial Embedding, Variational Aggregation | YouTube Faces, IJB-A | IEEE | Focus on video face recognition. Investigate temporal modeling for improved video-based recognition.
[9] 2018 | Deep Discriminative CNN | CASIA-WebFace, MS-Celeb-1M | IEEE CVPR | Limited exploration of architectural innovations. Incorporate recent CNN advancements to enhance feature learning.
[10] 2016 | Residual Networks (ResNet) | ImageNet | IEEE CVPR | No specific limitation mentioned. Investigate deeper architectures or modifications for face recognition.
[11] 2015 | FaceNet | LFW, YTF | IEEE CVPR | Limited exploration of intra-class variations. Study methods to handle extreme variations for robust clustering.
[12] 2014 | DeepFace | LFW, private Facebook dataset | IEEE CVPR | Assumes availability of labeled data. Develop techniques for effective face verification with limited labeled data.

III. CONVOLUTIONAL DEEP LEARNING: REVOLUTIONIZING FACE RECOGNITION
Deep learning employs artificial neural networks to perform extensive computations on vast volumes of data. This domain of artificial intelligence, referred to as "deep learning," is rooted in the intricate structure and functioning of the human brain. The principal classifications of deep learning algorithms encompass reinforcement learning, unsupervised learning, and supervised learning. Neural networks, designed analogously to the human brain's configuration, are comprised of artificial neurons commonly denoted as nodes. These nodes are arranged in a hierarchical manner across three tiers: the input layer, potential hidden layers, and the output layer. Deep belief networks, long short-term memory networks, multilayer perceptrons, generative adversarial networks, convolutional neural networks, and recurrent neural networks are only a few examples of the various types of neural networks that are accessible [13]. The fundamental procedures for implementing facial recognition through deep learning are depicted in the figure below.

Figure 1. Basic Block Diagram for Face Recognition

The above diagram shows the general technique of face recognition from an image or a video sequence, which is explained in detail as under:
1. Read Frame from an Image or Video Sequence: The process starts by obtaining an image or a frame from a video sequence where you want to perform face recognition. This could be a photograph or a single frame from a video clip.
2. Apply Preprocessing on the Image Frame: Before any analysis can be done on the image, it is often necessary to preprocess it. Preprocessing may involve resizing the image to a consistent size, converting it to grayscale (if color information is not needed), and performing various filtering or enhancement operations to improve the quality of the image and make subsequent steps more effective.
3. Facial Feature Extraction: This step involves identifying and extracting key facial features from the preprocessed image. Common facial features include eyes, nose, mouth, and sometimes landmarks like eyebrows or jawlines. There are various techniques for feature extraction, including traditional methods based on edge detection and newer deep learning methods that can automatically learn and identify features.
4. Classifier: A classifier is used to determine whether the extracted features represent a face or not. This step helps filter out non-face objects from the analysis. Common classifiers include Support Vector Machines (SVM), decision trees, or even deep learning models.
5. Face Database: A face database is a collection of pre-processed facial images that are used for recognition. This database serves as the reference for comparing and identifying the face in the input image or frame. The database contains multiple examples of each individual's face, captured under different lighting conditions, angles, and expressions.
6. Training Set using CNN: Convolutional Neural Networks (CNNs) are a type of deep learning model particularly well-suited for image analysis tasks. To build a CNN-based face recognition system, you need a training set. This set consists of labeled images where each image is associated with the identity of the person in the image. The CNN learns to extract features and patterns from these images that are specific to each person.
7. Face Recognition: In the face recognition step, the preprocessed input image's features are extracted and compared with the features stored in the face database. This involves measuring the similarity between the input image's features and the features of each individual in the database. The closest match is then considered the recognized person. (A minimal illustrative sketch of this pipeline is given below.)
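To make these seven steps concrete, the following is a minimal illustrative sketch, not the implementation of any surveyed system. It assumes Python with OpenCV and NumPy; a Haar-cascade detector stands in for the face/non-face filtering of step 4, and the hypothetical embed function stands in for the CNN feature extractor of step 6 (here it simply uses normalized pixel intensities, which a trained network would replace).

# Minimal face-recognition pipeline sketch (illustrative only).
# Assumes OpenCV (cv2) and NumPy; embed() is a stand-in for a trained CNN.
import cv2
import numpy as np

detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def preprocess(frame):
    # Step 2: convert to grayscale and normalize contrast.
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    return cv2.equalizeHist(gray)

def embed(face_img):
    # Steps 3/6 stand-in: a real system would compute CNN features here.
    vec = cv2.resize(face_img, (64, 64)).astype(np.float32).ravel()
    return vec / (np.linalg.norm(vec) + 1e-8)

def recognize(frame, database, threshold=0.9):
    # Steps 1, 4, 5 and 7: detect faces and compare embeddings with the database.
    gray = preprocess(frame)
    results = []
    for (x, y, w, h) in detector.detectMultiScale(gray, 1.1, 5):
        query = embed(gray[y:y + h, x:x + w])
        best_id, best_sim = "unknown", threshold
        for identity, reference in database.items():
            sim = float(np.dot(query, reference))   # cosine similarity (step 7)
            if sim > best_sim:
                best_id, best_sim = identity, sim
        results.append(best_id)
    return results

# Usage (step 5): the database maps each enrolled identity to a reference embedding.
# database = {"alice": embed(cv2.imread("alice.jpg", cv2.IMREAD_GRAYSCALE))}
# print(recognize(cv2.imread("frame.jpg"), database))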
Currently, one of the most commonly employed models is the Convolutional Neural Network (CNN). This computational framework within the domain of neural networks features the incorporation of one or multiple convolutional layers in conjunction with a variant of the multilayer perceptron. Its prevalent application is notably observed in scenarios requiring classification tasks. The fundamental operations integral to CNN architecture encompass convolution, pooling, and fully connected layers, collectively constituting the triad of essential processes.

Figure 2. CNN Architecture

A Convolutional Neural Network (CNN) stands as a specialized variant of a neural network meticulously crafted to process and dissect visual data, encompassing images and videos, with exceptional proficiency. Its efficacy becomes particularly pronounced in tasks such as image classification, object detection, and image generation. It is an architectural homage to the human visual system, adroitly harnessing its innate capability to autonomously assimilate hierarchical attributes from the ingested data. Herein lies an exhaustive exposition delineating the modus operandi of a CNN:
1. Input Layer: The CNN's ingress typically manifests as an image, expounded as an array of pixel values. Color images come endowed with multiple channels (e.g., the triad for RGB), whereas grayscale images bear a solitary channel.
Subsequently, the input image traverses the network, stratum
by stratum, with each stratum orchestrating discrete
operations.
2. Convolutional Layer: Constituting the linchpin of the CNN,
this layer is constituted by a compendium of filters (also
recognized as kernels) that manifest as matrices of diminished
proportions. These filters elegantly perambulate the input
image with a predetermined stride, instigating a cascade of
element-wise multiplications and ensuing summations—an
ensemble denominated as convolution. This intricate
convolution operation lays bare localized attributes through
the discernment of patterns encompassing edges, vertices, and
textures. Notably, each filter is endowed with the competence
to identify a distinct attribute. In the aftermath of convolution,
an adjunct bias term is assimilated with the yield of each filter,
and subsequently, a non-linear activation function, the likes of
Rectified Linear Activation (ReLU), is deployed. This
augmentation bequeaths the network with non-linearity,
capacitating it to encapsulate more intricate interdependencies
inherent in the data.
3. Pooling Layer: The precincts of pooling layers preside over
the contraction of spatial dimensions of the feature maps
garnered from convolutional strata. Among the gamut of pooling
techniques, the apogee is occupied by max-pooling. In this
schema, a window, usually of dimensions 2x2 or 3x3,
navigates the feature map, and only the acme value within the
said window endures. This stratagem expedites the curtailment
of computational intricacies inherent in the network,
concurrently fostering resilience against infinitesimal spatial
oscillations.
4. Flattening: Following the iterative succession of
convolutional and pooling strata, the resultant feature maps
undergo a metamorphosis into a vector. This vector
subsequently interfaces with fully connected layers—
proximate to the strata observed within traditional neural
networks.
5. Fully Connected Layers: The compressed vector,
engendered by the antecedent step, converges with one or
more fully connected layers. These layers, akin to the latent
strata in conventional neural networks, adroitly internalize
intricate amalgamations of attributes hailing from the
precedent layers. These convolutions culminate in definitive
decisions, founded upon the culminated attributes. The
ultimate product of the terminal fully connected layer, in
classification undertakings, invariably confronts a softmax
activation function, engendering a probability distribution
spanning myriad classes.
6. Output Layer: The valedictory stratum culminates in the formulation of ultimate predictions or classifications premised upon assimilated attributes. In the context of image classification, this layer typically embodies nodes correlative to diverse classes, each node epitomizing the probability of the input image's pertinence to a specific class.
7. Training: The orchestration of CNN training is mediated
by annotated data via an iterative technique denoted as
backpropagation. In this process, the network's weights and
biases undergo incremental recalibration utilizing
optimization algorithms, gradient descent chief among them,
with the intent of minimizing disparities between the
prognosticated and actual labels—this dissonance being
encapsulated by the conduit of a loss function.
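To ground the seven operations above, the following is a small hedged sketch, assuming PyTorch (the surveyed papers use various frameworks) and arbitrary layer sizes. It stacks convolution, ReLU activation, max-pooling, flattening and fully connected layers, and performs one backpropagation step with a cross-entropy loss and stochastic gradient descent.

# Illustrative CNN and one training step (PyTorch assumed; sizes are arbitrary).
import torch
import torch.nn as nn

class SmallFaceCNN(nn.Module):
    def __init__(self, num_identities=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),   # convolutional layer (step 2)
            nn.ReLU(),                                     # non-linear activation
            nn.MaxPool2d(2),                               # max-pooling (step 3)
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),                                  # flattening (step 4)
            nn.Linear(32 * 16 * 16, 128),                  # fully connected layer (step 5)
            nn.ReLU(),
            nn.Linear(128, num_identities),                # output layer logits (step 6)
        )

    def forward(self, x):                                  # x: batch of 64x64 RGB images (step 1)
        return self.classifier(self.features(x))

model = SmallFaceCNN()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()                          # applies log-softmax internally

images = torch.randn(8, 3, 64, 64)                         # dummy batch of 8 face images
labels = torch.randint(0, 10, (8,))                        # dummy identity labels
loss = criterion(model(images), labels)                    # gap between predicted and actual labels
optimizer.zero_grad()
loss.backward()                                            # backpropagation (step 7)
optimizer.step()                                           # gradient-descent weight update
probabilities = model(images).softmax(dim=1)               # class probabilities at inference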
The architecture of CNNs is susceptible to wide-ranging
variations with respect to strata configurations and
profundity. Embellished constructs such as VGG, ResNet,
and Inception, embrace supplementary strata and innovative
frameworks, thereby ameliorating precision whilst capturing
intricacies of attributes.
Briefly, a Convolutional Neural Network orchestrates a
sequential execution of convolutional, activation, pooling,
and fully connected strata vis-à-vis an input image. This
intricate procession inexorably imbibes hierarchical
attributes and patterns, concurring to endow the network with
a discernment that invariably culminates in judicious
prognostications or classifications.
C. ResNet.
Figure 6. Architecture ResNet [20]
The bedrock upon which the architectural underpinnings of deep Convolutional Neural Network (CNN) designs repose is rooted in the notion that, with the escalation of network depth, coupled with the utilization of an array of nonlinear mappings and the cultivation of more intricate feature hierarchies, the representational power of the model increases. ResNet attains such depth by routing identity (skip) connections around blocks of convolutional layers, but the design is not without drawbacks. It can impair the propagation of pertinent information through the feature map during the feed-forward process, a drawback that cannot be ignored. In addition to these concerns, it is essential to underscore that the ResNet architectural configuration entails an exceptionally high computational cost, which must be taken into careful consideration.
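Since the figure is not reproduced here, the following is a minimal sketch of the residual (skip-connection) block from which ResNet is built, assuming PyTorch and an illustrative channel count; the block's output is F(x) + x, letting information and gradients bypass the convolutional path.

# Minimal residual block sketch (PyTorch assumed; channel count is illustrative).
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels=64):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU()

    def forward(self, x):
        identity = x                          # shortcut (skip) connection
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + identity)      # F(x) + x

y = ResidualBlock()(torch.randn(1, 64, 56, 56))   # output keeps the input shape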
D. Region-Based Convolutional Neural Network
(R-CNN).
In the realm of computer vision, the paradigm of Region-based
Convolutional Neural Networks, or R-CNN, emerged as a
significant advancement. In the year 2014, Ross Girshick and his
collaborators presented R-CNN as a robust solution aimed at
rectifying the challenges associated with effective object
localization in the context of object recognition tasks. The
fundamental predicament addressed by R-CNN stems from the
inherent inefficiency of Convolutional Neural Networks
(CNNs) in swiftly and accurately pinpointing objects of
interest. This inefficiency arises from the nature of CNNs,
which directly extract pertinent features from the input data.
Consequently, the conventional approach to identifying a
specific object within an image entails a considerable
computational time investment. One of the primary limitations
of employing a traditional convolutional network followed by
a fully connected layer lies in the variability of the output
layer's size. Unlike a fixed-size output layer, the output of such
networks can assume variable dimensions, leading to the
creation of image representations containing an unpredictable
multitude of instances featuring various objects. This
unpredictability in the number of object instances further
complicates the process of object localization and recognition
within the image data.
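The way R-CNN sidesteps this cost, scoring only a limited set of candidate regions with one shared CNN rather than scanning every location, can be sketched as follows. This is an illustration of the idea only, assuming PyTorch; propose_regions is a hypothetical stand-in for an external proposal method such as selective search, and the tiny classifier is not Girshick's original pipeline.

# Sketch of the R-CNN idea: score each proposed region with one shared CNN.
# propose_regions() is a hypothetical stand-in for selective search.
import torch
import torch.nn as nn
import torch.nn.functional as F

def propose_regions(image_size, num_proposals=5, box=32):
    h, w = image_size
    boxes = []
    for _ in range(num_proposals):
        x0 = torch.randint(0, w - box, (1,)).item()
        y0 = torch.randint(0, h - box, (1,)).item()
        boxes.append((x0, y0, box, box))
    return boxes

classifier = nn.Sequential(                        # shared CNN applied to every region
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(8, 2))                               # two classes: face / background

image = torch.randn(3, 128, 128)                   # dummy input image
for (x, y, w, h) in propose_regions((128, 128)):
    crop = image[:, y:y + h, x:x + w].unsqueeze(0)
    crop = F.interpolate(crop, size=(64, 64))      # warp each region to a fixed size
    face_prob = classifier(crop).softmax(dim=1)[0, 0]
    print((x, y, w, h), float(face_prob))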
E. GoogleNet
In the scholarly publication titled "Going Deeper with
Convolutions," released in the year 2014 [22], a team of
researchers affiliated with Google introduced what has since
become widely recognized as GoogleNet, alternatively
referred to as Inception-V1. This architectural innovation
ascended to victory in the fiercely competitive arena of the
2014 ILSVRC image classification competition. In
comparison to the prior architectures employed in
Convolutional Neural Networks (CNNs), GoogleNet
demonstrated a notably diminished error rate, marking a
pivotal achievement in the realm of deep learning. The
overarching objective underpinning the creation of the
GoogleNet architecture was the pursuit of exceptional
accuracy in image classification tasks while maintaining a
judicious approach to computational resources. This
architectural marvel boasts a formidable depth, comprising 22 parameterized layers (27 when pooling layers are counted). Within this intricate framework, the
researchers thoughtfully integrated a 1x1 convolutional layer
in conjunction with average pooling techniques. An inherent
challenge faced in the development of GoogleNet was the
looming specter of overfitting. Given the profound depth of
the network's layers, there existed a palpable risk of an
excessively specialized model that performed exceedingly
well on the training data but struggled to generalize
effectively. In response, the GoogleNet architecture
ingeniously diverged from the conventional wisdom of
deepening the network and instead embraced a strategy that
broadened its computational capabilities. This strategy was
anchored in the deployment of filters of varying sizes,
enabling them to operate synergistically on the same input. The design, however, brought with it its own set of complications. A salient issue pertained to the heterogeneous topology that necessitated intricate module-to-module modifications, posing a considerable challenge in terms of design and implementation. Additionally, the architecture grappled with a bottleneck phenomenon within its representation flow. This bottleneck significantly compressed the feature space in subsequent layers, thereby occasionally leading to the unfortunate loss of pivotal data, adversely affecting the model's overall performance and robustness.
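The filters of varying sizes mentioned above are organized into what GoogleNet calls Inception modules, in which parallel 1x1, 3x3 and 5x5 convolutions and a pooling branch all see the same input and their outputs are concatenated; the 1x1 convolutions also provide the parameter reduction noted earlier. The following is an illustrative sketch of one such module, assuming PyTorch; the channel counts are arbitrary and do not reproduce GoogleNet's actual configuration.

# Illustrative Inception-style module (PyTorch assumed; channel counts arbitrary).
import torch
import torch.nn as nn

class InceptionModule(nn.Module):
    def __init__(self, in_ch=64):
        super().__init__()
        self.branch1 = nn.Conv2d(in_ch, 16, kernel_size=1)        # 1x1 branch
        self.branch3 = nn.Sequential(                             # 1x1 reduction, then 3x3
            nn.Conv2d(in_ch, 16, kernel_size=1), nn.ReLU(),
            nn.Conv2d(16, 24, kernel_size=3, padding=1))
        self.branch5 = nn.Sequential(                             # 1x1 reduction, then 5x5
            nn.Conv2d(in_ch, 8, kernel_size=1), nn.ReLU(),
            nn.Conv2d(8, 8, kernel_size=5, padding=2))
        self.branch_pool = nn.Sequential(                         # pooling branch
            nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
            nn.Conv2d(in_ch, 8, kernel_size=1))

    def forward(self, x):
        # Filters of different sizes see the same input; outputs are concatenated.
        return torch.cat([self.branch1(x), self.branch3(x),
                          self.branch5(x), self.branch_pool(x)], dim=1)

out = InceptionModule()(torch.randn(1, 64, 28, 28))    # -> shape (1, 56, 28, 28)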
TABLE II. COMPARATIVE STUDY OF VARIANTS OF CNN.
Architecture | Origin | Advantages | Applications
LeNet | 1998 | 1. Pioneer in CNNs. 2. Efficient for small image recognition tasks. 3. Utilizes convolution and pooling layers. | 1. Handwritten digit recognition (MNIST dataset). 2. Early character recognition.
AlexNet | 2012 | 1. Introduced deep CNNs. 2. Utilizes ReLU activation and dropout. 3. GPU acceleration for training. | 1. Image classification (ImageNet challenge). 2. Object detection. 3. Image segmentation.
ResNet | 2015 | 1. Deep architectures without the vanishing-gradients problem. 2. Improved training of very deep networks. | 1. Image classification (ImageNet challenge). 2. Object detection (e.g., Faster R-CNN). 3. Semantic segmentation.
R-CNN | 2013 | 1. Combines region proposals with CNNs. 2. Achieved state-of-the-art results in object detection tasks. | 1. Object detection and localization. 2. Image segmentation.
GoogLeNet | 2014 | 1. Inception modules for efficient and deep networks. 2. Reduces the number of parameters. | 1. Image classification (ImageNet challenge). 2. Object detection (e.g., YOLO).

In this exposition, we have delved into the rudimentary principles underpinning Convolutional Neural Networks (CNNs). CNNs represent a dependable and efficacious deep learning methodology, particularly germane to the realm of image processing. They excel in multifarious image-related tasks such as facial recognition, image categorization, and object detection. One of the salient virtues of CNNs is their innate capacity for feature extraction sans human intervention.
Nonetheless, it is imperative to acknowledge certain intrinsic limitations inherent to CNNs. Firstly, CNNs do not encode information pertaining to an object's spatial location or orientation. Consequently, when an object undergoes slight alterations in either its position or orientation, it may fail to activate the neural pathways responsible for its recognition. Additionally, the training process can become protracted, especially when a CNN encompasses numerous layers and the computational capabilities of the GPU are suboptimal. Another notable drawback of CNNs is their voracious appetite for voluminous training data, rendering them relatively sluggish in terms of processing speed. Furthermore, the pooling layer, an integral component of CNN architecture, tends to overlook the interrelationship between localized features and the holistic context, resulting in appreciable information loss. For instance, when discerning facial features from a video feed, a considerable degree of data dependency is requisite. Furthermore, CNNs are not ideally suited for tackling time series problems. Their extensive parameterization, comprising millions of tunable parameters, renders them susceptible to underperformance when confronted with inadequately sized datasets. A surfeit of data, conversely, imbues CNNs with greater robustness and the propensity to yield enhanced performance outcomes. To ameliorate these limitations and optimize the performance of CNNs, a judicious strategy involves amalgamating the CNN algorithm with other neural network paradigms such as Recurrent Neural Networks (RNNs), Long Short-Term Memory (LSTM) networks, or alternative approaches. This fusion facilitates enhanced computational efficiency and can substantially augment the efficacy of the CNN algorithm, particularly when confronted with complex, multifaceted tasks.
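As one concrete illustration of the CNN-plus-recurrent fusion suggested above for video-based face recognition, the sketch below extracts per-frame CNN features and aggregates them over time with an LSTM before classification. It assumes PyTorch, the layer sizes are illustrative, and it is a design sketch rather than a benchmarked architecture.

# Sketch: per-frame CNN features aggregated by an LSTM for video face recognition.
import torch
import torch.nn as nn

class CnnLstmFaceNet(nn.Module):
    def __init__(self, feature_dim=128, num_identities=10):
        super().__init__()
        self.cnn = nn.Sequential(                       # spatial feature extractor
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
            nn.Flatten(), nn.Linear(32, feature_dim))
        self.lstm = nn.LSTM(feature_dim, 64, batch_first=True)   # temporal aggregation
        self.head = nn.Linear(64, num_identities)

    def forward(self, clip):                            # clip: (batch, time, 3, H, W)
        b, t = clip.shape[:2]
        frames = clip.flatten(0, 1)                     # merge batch and time dimensions
        feats = self.cnn(frames).view(b, t, -1)         # back to (batch, time, features)
        _, (hidden, _) = self.lstm(feats)               # last hidden state summarizes the clip
        return self.head(hidden[-1])

logits = CnnLstmFaceNet()(torch.randn(2, 8, 3, 64, 64))   # two clips of eight frames -> (2, 10)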
V. PRACTICAL SCENARIOS FOR FACE RECOGNITION.
Face recognition technology has a wide range of practical scenarios across various industries and applications. Here are some practical scenarios for face recognition with explanations:
Access Control and Security: Facility Access: In office buildings or secure facilities, employees can gain access by simply having their faces recognized, enhancing security and convenience.
Airport Security: Facial recognition can expedite the passenger screening process at airports, identifying individuals on watch lists or verifying their identity.
Mobile Device Authentication: Smartphones: Users can unlock their smartphones or authorize mobile payments by facial recognition, adding an extra layer of security to their devices.
Payment Authorization: Retail Payments: Customers can make payments at stores or online by simply looking at a camera, reducing the need for physical cards or passwords.
Healthcare: Patient Identification: Hospitals can accurately identify patients to prevent medical errors and ensure that the right patient receives the right treatment.
Law Enforcement and Public Safety: Criminal Identification: Police departments can quickly identify suspects in crowds or match suspects to existing databases, aiding in crime prevention and solving cases.
Attendance Tracking: Schools and Universities: Educational institutions can track student and faculty attendance automatically, streamlining administrative tasks.
Customer Service: Retail and Hospitality: Businesses can use facial recognition to personalize customer experiences, recognize loyal customers, and improve service.
Human Resources: Time and Attendance: Companies can automate employee attendance tracking, reducing errors and ensuring fair compensation.
Public Events and Venues: Ticketless Entry: Attendees at concerts, sporting events, and amusement parks can gain entry by having their faces scanned, reducing ticket fraud.
Smart Homes or Home Automation: Homeowners can use facial recognition to control smart home devices, customize settings, and enhance security.
Retail Analytics or Customer Insights: Retailers can gather data on customer demographics, behavior, and shopping preferences, enabling targeted marketing strategies.
Customized Advertising or Digital Signage: Advertisers can display personalized ads based on the age and gender of individuals passing by digital billboards.
Aging and Healthcare Monitoring: Aging Population: Face recognition can help monitor the health and well-being of the elderly by detecting changes in facial expressions or vital signs.
Authentication in Banking: ATM Access: Banks can enhance ATM security by adding facial recognition as a biometric authentication method.
Visitor Management: Corporate Offices: Companies can streamline visitor check-ins and enhance security by using facial recognition for visitor management.
Forensics: Criminal Investigations: Law enforcement agencies can use facial recognition to identify potential suspects from surveillance footage or composite sketches.
Contactless Check-in at Hotels: Hospitality Industry: Guests can check into hotels without physical contact, improving the check-in process and safety during a pandemic.
Customized Healthcare Treatment: Medical Diagnosis: Facial recognition can assist in diagnosing certain medical conditions by analyzing facial features and expressions.
Search and Rescue Operations or Emergency Response: In disaster scenarios, facial recognition can help locate missing persons by matching faces with databases of survivors.

VI. CHALLENGES AND COMPLICATIONS IN THE SPHERE OF FACE RECOGNITION
Face recognition technology has made significant advancements in recent years, but it still faces several challenges. Here are some of the key challenges in face recognition:
Privacy Concerns:
Data Privacy: The collection and storage of facial data raise privacy concerns, especially when used without individuals' consent or knowledge.
Surveillance: Widespread use of facial recognition in public spaces can lead to mass surveillance concerns and potential abuse by governments and corporations.
Accuracy and Robustness:
Variability: Faces can vary significantly due to lighting conditions, angles, facial expressions, and occlusions, making it challenging to achieve consistently high accuracy.
Adversarial Attacks: Face recognition systems can be vulnerable to attacks that involve modifying or adding noise to input images to deceive the system.
Security Risks:
Spoofing: Attackers can use photos, videos, or 3D masks to trick face recognition systems, compromising security.
Privacy Invasion: Criminals or unauthorized individuals can use stolen biometric data to impersonate others or gain access to sensitive information.
Regulatory and Legal Challenges:
Lack of Standards: The absence of comprehensive regulations and standards can lead to inconsistent deployment and ethical concerns.
Legislation: Governments are still working to create appropriate legal frameworks to address the ethical and privacy implications of face recognition.
Scalability and Performance:
Real-time Processing: Achieving real-time performance on a large scale, such as in crowded public spaces, remains a technical challenge.
Hardware Constraints: Some applications may require specialized hardware to perform face recognition efficiently.
Aging and Long-term Changes:
Aging: Over time, people's faces change due to aging, which can reduce the accuracy of recognition systems.
Lifestyle Changes: Significant lifestyle changes, such as weight loss or gain, can also affect facial recognition accuracy.
Environmental Factors:
Environmental conditions such as poor lighting, weather, or low-resolution images can affect the performance of face recognition algorithms.
VII. CONCLUSION.
In this comprehensive review paper, we endeavor to provide a meticulous summary of the diverse Deep Learning methodologies that have been harnessed in the realm of facial recognition systems. A thorough and exhaustive scrutiny of the existing literature has yielded the realization that Deep Learning techniques have, undeniably, propelled significant advancements within the sphere of facial recognition. It is noteworthy to mention that a multitude of scholarly publications have not only proffered insightful perspectives but have also implemented a myriad of methodologies catering to various facets of face recognition, encompassing aspects such as the accommodation of multiple facial expressions, temporal invariance, variations in facial weight, fluctuations in illumination conditions, and more. It is noteworthy to highlight that the utilization of deep learning techniques in the context of facial recognition has thus far attracted a relatively modest number of academic articles. However, upon a comprehensive amalgamation of numerous evaluations, it becomes unequivocally apparent that the modified Convolutional Neural Network (CNN) variants, specifically tailored for facial recognition purposes, exhibit significant promise. This observation underscores the existence of a substantial scope for continued and extensive research endeavors employing Deep Learning techniques to further enhance the capabilities of facial recognition systems. It is of paramount importance to underscore that the findings of this review illuminate a relatively sparse adoption of the transfer-learning strategy within the domain of facial recognition systems, subsequent to the identification and analysis of various deep learning approaches currently in use. Consequently, this underscores the compelling need for future research endeavors to direct their focus towards the refinement and augmentation of facial recognition through the judicious application of deep learning methodologies. This emerging area beckons for further exploration and experimentation, promising breakthroughs that will undoubtedly bolster the efficacy and reliability of facial recognition systems in the times ahead.

REFERENCES
[1] Peng Lu, Baoye Song, Lin Xu. "Human face recognition based on convolutional neural network and augmented dataset." Systems Science & Control Engineering, 2020.
[2] Jiankang Deng, Jia Guo, Niannan Xue, Stefanos Zafeiriou. "ArcFace: Additive Angular Margin Loss for Deep Face Recognition." IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
[3] Jun-Cheng Chen, Rajeev Ranjan, Swami Sankaranarayanan, Amit Kumar, Ching-Hui Chen, Vishal M. Patel, Carlos D. Castillo, Rama Chellappa. "Unconstrained Still/Video-Based Face Verification With Deep Convolutional Neural Networks." Springer, 2017.
[4] Carolina Todedo Ferraz and Jose Hiroki. "A Comprehensive Analysis of Local Binary Convolution Neural Network for Fast Face Recognition in Surveillance Video." ACM, 2018.
[5] Nate Crosswhite, Jeffrey Byrne, Chris Stauffer, Omkar Parkhi, Aiong Cao and Andrew Zisserman. "Template Adaptation for Face Verification and Identification." 12th International Conference on Automatic Face & Gesture Recognition, IEEE, 2017.
[6] Hao Wang, Yitong Wang, Zheng Zhou, Xing Ji, Dihong Gong, Jingchao Zhou, Zhifeng Li and Wei Liu. "CosFace: Large Margin Cosine Loss for Deep Face Recognition." IEEE Conference on Computer Vision and Pattern Recognition, 2018.
[7] Ran He, Xiang Wu, Zhenan Sun and Tieniu Tan. "Wasserstein CNN: Learning Invariant Features for NIR-VIS Face Recognition." IEEE, 2017.
[8] Yibo Ju, Lingxiao Song, Bing Yu, Ran He, Zhenan Sun. "Adversarial Embedding and Variational Aggregation for Video Face Recognition." IEEE, 2018.
[9] S, D. A. (2021). CCT Analysis and Effectiveness in e-Business Environment. International Journal of New Practices in Management and Engineering, 10(01), 16–18. https://doi.org/10.17762/ijnpme.v10i01.97
[10] Wang, X., Lu, Y., Wang, Z., & Feng, J. (2018). Deep discriminative feature learning for face verification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
[11] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun. "Deep Residual Learning for Image Recognition." IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
[12] Florian Schroff, Dmitry Kalenichenko, James Philbin. "FaceNet: A Unified Embedding for Face Recognition and Clustering." IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
[13] Yaniv Taigman, Ming Yang, Marc'Aurelio Ranzato, Lior Wolf. "DeepFace: Closing the Gap to Human-Level Performance in Face Verification." IEEE Conference on Computer Vision and Pattern Recognition, 2014.
[14] Zubin C. Bhaidasna, Priya R. Swaminarayan. "A Survey on Convolution Neural Network for Face Recognition." Journal of Data Acquisition and Processing, Vol. 38 (2), 2023.
[15] Zubin C. Bhaidasna, Priya R. Swaminarayan. "A Survey on Convolution Neural Network for Face Recognition." Journal of Data Acquisition and Processing, Vol. 38 (2), 2023.
[16] Peng Lu, Baoye Song, Lin Xu. "Human face recognition based on convolutional neural network and augmented dataset." Systems Science & Control Engineering, 2020.
[17] Y. Lecun, L. Bottou, Y. Bengio and P. Haffner. "Gradient-based learning applied to document recognition." Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324, Nov. 1998.
[18] Zubin C. Bhaidasna, Priya R. Swaminarayan. "A Survey on Convolution Neural Network for Face Recognition." Journal of Data Acquisition and Processing, Vol. 38 (2), 2023.
[19] Khan, Asifullah et al. “A survey of the recent architectures of
deep convolutional neural networks.” Artificial Intelligence
Review (2020).
[20] https://www.google.com/search?sca_esv=561848188&q=alexnet
+architecture&tbm=isch&source=lnms&sa=X&ved=2ahUKEwj
e9aWa3IiBAxVyTmwGHfcfDQQQ0pQJegQIDBAB&biw=136
6&bih=619&dpr=1#imgrc=xqC2QyZ_mjTNqM.
[21] Zubin C. Bhaidasna, Priya R. Swaminarayan. "A Survey on Convolution Neural Network for Face Recognition." Journal of Data Acquisition and Processing, Vol. 38 (2), 2023.
[22] https://www.researchgate.net/figure/Block-diagram-of-Faster-R-
CNN_fig1_339463390.
[23] Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet,
Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent
Vanhoucke, Andrew Rabinovich; Proceedings of the IEEE
Conference on Computer Vision and Pattern Recognition
(CVPR).