Final Project Documentation
CHAPTER - 1
INTRODUCTION
Biometrics are part of cutting-edge technology. Biometrics are the metrics
types of biometrics are available, like Face Recognition, Iris Recognition,
comparing and analysing facial contours. It is not only used in security and law
enforcement but also used as a way to authenticate identity and unlock devices like
Face Recognition is one of the most widely used features in many fields. It is
used in providing security to data, Fraud detection of Passports and Visas, Track
attendance of the employees. The system collects and records the facial fine points of
the employees in the database. Once the process is done, the employee only needs to
look at the camera and the attendance is automatically marked in the face recognition
attendance system.
The project develops a novel CNN model which is used for Face recognition.
A person whose face is detected and recognized by this CNN model is marked present
for the period corresponding to the time at which the person faces the camera.
1.3 Objective
The main objective of the project is to build a custom CNN model that can
recognize the faces and post the appropriate attendance into the database for the
CHAPTER - 2
LITERATURE SURVEY
This chapter gives a brief description of various research papers related to the
order to understand the advantages, disadvantages and the limitations of the various
The term Deep Learning or Deep Neural Network refers to Artificial Neural
Networks (ANN) with multiple layers. Over the last few decades, it has been considered
to be one of the most powerful tools, and has become very popular in the literature as
it is able to handle a huge amount of data. The interest in having deeper hidden layers
especially in pattern recognition. One of the most popular deep neural networks is the
Convolutional Neural Network (CNN). It takes this name from mathematical linear
pooling layer and fully connected layer and all those layers have been explained. The
convolutional and fully-connected layers have parameters but pooling and
non-linearity layers don't have parameters. The CNN has an excellent performance in
Especially in applications that deal with image data, such as the largest image
classification data set (ImageNet), computer vision, and natural language
processing (NLP), where the results achieved were remarkable. This paper explains
and defines all the elements and important issues related to CNN, and explains the
effect of each parameter on the performance of the network. In addition, it states
the parameters that affect CNN efficiency. The dataset used in the paper is the CIFAR-10
dataset.
Networks
to Computer Vision and Image Processing. Some of the exciting application areas of
The powerful learning ability of deep CNN is primarily due to the use of
multiple feature extraction stages that can automatically learn representations from the
data. The availability of a large amount of data and improvement in the hardware
technology has accelerated the research in CNNs, and recently interesting deep CNN
CNNs have been explored, such as the use of different activation and loss functions,
through architectural innovations. Notably, the ideas of exploiting spatial and channel
have gained substantial attention. Similarly, the idea of using a block of layers as a
CNNs are one of the best learning algorithms for understanding image content
detection, and retrieval related tasks. CNN, with the automatic feature extraction
ability, reduces the need for a separate feature extractor. During training, CNN learns
similar to the response-based learning of the human brain. In 2015, the concept of
skip connections introduced by ResNet for the training of deep CNNs gained
popularity. Afterward, this concept was used by most of the succeeding networks,
such as Inception-ResNet, Wide ResNet, ResNeXt, etc. CNN has shown good
performance not only on images but also on 1D data. The use of 1D-CNN as compared
ability. The major challenge with CNN is that deep CNNs generally behave like a black
box and thus may lack interpretability and explainability. Therefore, it is sometimes
difficult to verify them. Data augmentation can help CNN in learning diverse internal
question: why CNNs work well and how to design a ‘good’ architecture. Existing
studies tend to focus on reporting CNN architectures that work well for face
systems (CNNFRS) on a common ground to make the work easily reproducible. The
public database LFW (Labeled Faces in the Wild). This paper proposes three CNN
architectures which are the first reported architectures trained using LFW data. The
architectures including number of filters and layers are compared. It also evaluates the
face recognition performance using features from different layers: pooling, fully
connected and softmax layers. It concluded that the features from softmax layer
perform slightly better than those from the most widely used fully connected layer.
The attendance system plays a very important role in the modern enterprise’s
operation, and the security of the building has always been a matter of concern to the
people. Based on networked surveillance video, this paper integrates the attendance
and security functions and fuses video image processing, deep learning, and face
The experimental results verify the effectiveness of the proposed method. The
false reject rate (FRR) reaches 0.51%, the false accept rate (FAR) reaches 2.52%, and
the correct identification rate reaches 98.85%. The system is applied to some video
persons’ attendance at the same time. This paper uses the MTCNN [9] algorithm,
which is widely used for human face detection. Like YOLO, it treats the detection
problem as a regression problem and uses a convolutional network to jointly optimize
classification and object locations. It first builds an image pyramid and then uses
three CNN models (P-Net, R-Net, and O-Net) to perform cascading prediction to
improve the accuracy. It defines the label of the image and uses a cross-entropy loss
and a Euclidean distance loss function. The position of the face and the five key points
This paper combines the network structure design of AlexNet and Inception to
and builds face dataset in actual scenes, verifying the effectiveness of the algorithm.
The system can record three violations of classroom discipline for automatic
attendance: absence, lateness and leaving early. An attendance table about
The system identifies faces quickly, needing only 100 milliseconds per frame,
while obtaining high accuracy. This face recognition model has an accuracy rate of
98.87%, and the true positive rate at a false positive rate of 1/1000 is 93.7% on
LFW. Based on deep learning, it combines MTCNN face detection and face
landmark localization with the Center Face algorithm to achieve non-interference,
automatic, whole-process class attendance. It can also report the three classroom
attendance indicators: absence, lateness and leaving early. It’s a very
The existing systems for Face Recognition use built-in datasets and built-in
architectures. Even where custom datasets have been prepared and used,
they rely on architectural models that already exist. Some of
the models that have been used in the existing systems are:
In the LBPH (Local Binary Patterns Histograms) approach for texture classification,
the occurrences of the LBP codes in an image are collected into a histogram. The
classification is then performed
for facial image representation results in a loss of spatial information and therefore
one should codify the texture information while also retaining their locations. One
way to achieve this goal is to use the LBP texture descriptors to build several local
The facial image is divided into local regions and LBP texture descriptors
are extracted from each region independently. The descriptors are then concatenated
levels of locality: the LBP labels for the histogram contain information about the
patterns on a pixel-level, the labels are summed over a small region to produce
information on a regional level and the regional histograms are concatenated to build
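The regional-histogram scheme described above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the implementation used by any of the surveyed systems: the function names, the 8-neighbour ordering, the `>=` comparison and the 2 x 2 grid are all assumptions made for the example, and border pixels are skipped because they lack a full neighbourhood.

```python
def lbp_code(img, r, c):
    """8-neighbour local binary pattern code of the pixel at (r, c)."""
    center = img[r][c]
    # Neighbours visited clockwise, starting at the top-left pixel (assumed order).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def region_histogram(img, top, left, height, width):
    """256-bin histogram of LBP codes for the interior pixels of one region."""
    hist = [0] * 256
    for r in range(max(top, 1), min(top + height, len(img) - 1)):
        for c in range(max(left, 1), min(left + width, len(img[0]) - 1)):
            hist[lbp_code(img, r, c)] += 1
    return hist


def lbph_descriptor(img, grid=2):
    """Concatenate the regional histograms of a grid x grid partition."""
    rows, cols = len(img), len(img[0])
    h, w = rows // grid, cols // grid
    descriptor = []
    for gr in range(grid):
        for gc in range(grid):
            descriptor += region_histogram(img, gr * h, gc * w, h, w)
    return descriptor
```

Two faces can then be compared by a distance between their concatenated descriptors; the choice of distance is left open here.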
video stream and the output is an identification of the subject or subjects that appear
Fisher Face is one of the popular algorithms used in face recognition, and is
effort to maximize the separation between classes in the training process. Image
recognition using the Fisher Face method is based on the reduction of face space
dimension using the Principal Component Analysis (PCA) method, then applying Fisher's
do not capture illumination as obviously as the Eigenfaces method. The Discriminant
Analysis instead finds the facial features to discriminate between the persons. The
Fisher Face method is especially useful when facial images have large variations in
Eigenfaces is the name given to a set of eigenvectors when used in the
computer vision problem of human face recognition. The eigenvectors are derived
from the covariance matrix of the probability distribution over the high-dimensional
vector space of face images. The eigenfaces themselves form a basis set of all images
allowing the smaller set of basis images to represent the original training images.
Classification can be achieved by comparing how faces are represented by the basis
set.
face ingredients", derived from statistical analysis of many pictures of faces. Any
example, one's face might be composed of the average face plus 10% from eigenface
1, 55% from eigenface 2, and even −3% from eigenface 3. Remarkably, it does not
take many eigenfaces combined together to achieve a fair approximation of most
The project proposes a pipeline to build a Deep Learning model for Face
Develops a novel Deep Learning model to detect and recognize the human
faces.
Develops a web application to post the attendance using the novel Deep
The process flow to develop a new face dataset is proposed, which is further
The result of the project will be the provision of attendance for the
corresponding period after recognising the face based on the time at which the
CHAPTER - 3
ANALYSIS
Computer Science and Engineering, SRIT Page
Face Recognition Using Convolutional Neural Networks
role in developing and achieving or meeting the desired goal. The Hardware
Requirements and the Software Requirements of the project for developing and
To overcome this issue, the deep learning model is developed on the Google
Colaboratory platform, which provides high-end hardware support to develop
machine learning and deep learning models. The configuration of the CPU and GPU
Central Processing Unit (CPU) used is 1x single-core hyper-threaded Xeon
These are the essential software requirements that are necessary to build and
deploy the Deep Learning model and the web application. The Windows 10 operating
system with an Intel i5 7th-generation processor and 8 GB RAM is used and the installation on the
programming language and the important libraries that are used to develop the project
1. Python
with dynamic semantics. Its high-level built-in data structures, combined with dynamic
typing and dynamic binding, make it very attractive for Rapid Application
program modularity and code reuse. Python 3 is used to develop the project. The detailed
2. TensorFlow
scale machine learning. TensorFlow bundles together a slew of machine learning and
deep learning models and algorithms and makes them useful by way of a common
performance C++.
The TensorFlow 2.2.0 version is used throughout the project. Here is the
3. Keras
Python and supports multiple back-end neural network computation engines. The
Keras 2.2.5 version is used throughout the project to develop the deep learning models.
(https://keras.io/#installation).
4. Scikit-learn
Scikit-learn is probably the most useful library for machine learning in Python.
The sklearn library contains a lot of efficient tools for machine learning and statistical
The command pip install scikit-learn can be used to install the library.
5. Pandas
Pandas is a fast, powerful, flexible and easy to use open source data analysis
and manipulation tool, built on top of the Python programming language. The pandas
1.0.3 is used in the entire project and the command to install the package is pip
6. Matplotlib
work like MATLAB. Each pyplot function makes some change to a figure: e.g.,
creates a figure, creates a plotting area in a figure, plots some lines in a plotting area,
decorates the plot with labels, etc. The 3.2.1 version is used in the project and the
7. Jupyter
The Jupyter Notebook is an open source web application that you can use to
create and share documents that contain live code, equations, visualizations, and text.
The version 1.0.0 is used in the project and the command to install the package is pip
8. Flask
framework because it does not require particular tools or libraries. It has no database
abstraction layer, form validation, or any other components where pre-existing third-
party libraries provide common functions. The version 1.0.2 is used in the project and
CHAPTER - 4
DESIGN
collection of diagrams. The notation has evolved from the work of Grady
Booch, James Rumbaugh, Ivar Jacobson, and the Rational Software Corporation to be
used for object-oriented design, but it has since been extended to cover a wider
shifted from the development phase to an analysis and design phase. This reduces risk
and provides a vehicle for testing the architecture of the system before coding begins.
The analysis and design overhead eventually pays dividends, as the system has
been user driven and documented, and when it’s time to start developing, many UML
tools will generate skeleton code that will be efficient, object oriented and promote
re-use.
Manage risk
4.2 Diagrams
The eight most widely used UML diagrams are:
Class Diagram
Use Case Diagram
Sequence Diagram
Activity Diagram
Collaboration Diagram
Deployment Diagram
State Chart Diagram
Component Diagram
1. Class Diagram:
application. Class diagram is not only used for visualizing, describing, and
documenting different aspects of a system but also for constructing executable code of
Class diagram describes the attributes and operations of a class and also the
constraints imposed on the system. The class diagrams are widely used in the
modelling of object-oriented systems because they are the only UML diagrams,
Use case diagrams consist of actors, use cases and their relationships. The
It is defined and created from use case analysis. Its purpose is to present a
goals (represented as use cases), and any dependencies between those use cases. The
main purpose of a use case diagram is to show what system functions are performed
for which actor. Hence to model the entire system, a number of use case diagrams
are used.
3. Sequence Diagram:
sequential order i.e. the order in which these interactions take place. We can also use
the terms event diagrams or event scenarios to refer to a sequence diagram. Sequence
diagrams describe how and in what order the objects in a system function. These
diagrams are widely used by businessmen and software developers to document and
4. Activity Diagram:
system. We can also use an activity diagram to refer to the steps involved in the
execution of a use case. We model sequential and concurrent activities using activity
5. Collaboration Diagram:
are used to show how objects interact to perform the behaviour of a particular use
case, or a part of a use case. Along with sequence diagrams, collaboration diagrams are used
by designers to define and clarify the roles of the objects that perform a particular
flow of events of a use case. They are the primary source of information used to
6. Deployment Diagram:
system, model the embedded system, model the hardware details for a client/server
system, model the hardware details of a distributed application and even in forward
and reverse engineering.
defined as a machine which defines different states of an object and these states are
State Chart diagrams are useful to model the reactive systems. Reactive
State Chart diagram describes the flow of control from one state to another
state. A state is defined as a condition in which an object exists, and it changes when
some event is triggered. The most important purpose of State Chart diagram is to
8. Component Diagram:
components are wired together to form larger components and/or software systems.
They are used to illustrate the structure of arbitrarily complex systems. It does not
describe the functionality of the system but it describes the components used to make
those functionalities.
can be mostly used in modelling the components of a system, modelling the database
In this sequence diagram we have four objects namely user, web application,
Initially the user runs the web application and the web application gives an
Then the camera extracts the faces and gives them to the CNN model.
Then the CNN model pre-processes the captured images and predicts the class
of the image.
Based on the predicted class, the attendance is posted into the database for
the designated periods based on the timings of the college hours and the details
Figure 4.1 represents the process flow in the form of a Sequence Diagram.
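The step of posting attendance for the designated period can be illustrated with a small sketch. The timetable below is entirely hypothetical (the report does not list the college timings), and the in-memory dictionary merely stands in for the database.

```python
from datetime import time

# Hypothetical college timetable: (period number, start, end).
PERIODS = [
    (1, time(9, 0), time(9, 50)),
    (2, time(9, 50), time(10, 40)),
    (3, time(11, 0), time(11, 50)),
]


def period_for(moment):
    """Return the period whose [start, end) interval contains `moment`, else None."""
    for number, start, end in PERIODS:
        if start <= moment < end:
            return number
    return None


def post_attendance(register, student, moment):
    """Mark `student` present for the current period in the `register` dict."""
    period = period_for(moment)
    if period is not None:
        register.setdefault(student, set()).add(period)
    return period
```

A face recognized outside every interval (for example during a break) returns `None` and leaves the register unchanged.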
CHAPTER - 5
The data for Deep Learning is a key input to models, which comprehend such
data and learn the features for future prediction. Although various aspects
come during the deep learning model development, without which various crucial
development without that it is not possible to train a machine that learns from humans
So, the project also focuses on creating the dataset of human faces for face
recognition. The data is collected through an automated program which captures the
faces of people, stores them and transforms them into a dataset. The dataset contains the
Figure 5.1 illustrates the process flow of the data collection as follows:
Extract the face image and convert it into a grayscale image.
Attach the label corresponding to the class of the image and write it into a CSV file.
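The last two steps above (grayscale conversion and writing a labelled row to a CSV file) can be sketched with the standard library alone. The luminance weights are the common BT.601 values and are an assumption, since the report does not state the conversion formula; the row layout (label first, then flattened pixels) and the function names are likewise hypothetical.

```python
import csv
import io


def to_grayscale(rgb_image):
    """Convert rows of (R, G, B) tuples to rows of 0-255 gray values."""
    # BT.601 luma weights (assumed; the report does not state the formula).
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
            for row in rgb_image]


def append_sample(writer, gray_image, label):
    """Write one row: the class label followed by the flattened pixel values."""
    pixels = [p for row in gray_image for p in row]
    writer.writerow([label] + pixels)


# Usage with an in-memory file; a real run would open a .csv file instead.
buffer = io.StringIO()
writer = csv.writer(buffer)
face = [[(255, 255, 255), (0, 0, 0)],
        [(255, 0, 0), (0, 0, 255)]]
append_sample(writer, to_grayscale(face), label=2)
```

Storing one image per row keeps the dataset loadable with a single CSV read, at the cost of a fixed image size for every sample.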
The Figure 5.2 illustrates the samples of the data based on the classes.
increase the diversity of data available for training models, without actually collecting
large amounts of new data. Techniques such as cropping, padding, and
horizontal flipping are commonly used to train large neural networks. In order to
achieve high accuracy, a large volume of training data is required. Hence, data
data. In the current work, synthetic images are generated randomly by applying
Zoom
Shear
Height shift
Rotation
Width shift
augmentation.
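As a rough illustration of how such synthetic images can be produced, the sketch below implements only the two shift transforms from the list, using plain Python lists. Zoom, shear and rotation need interpolation and are left out. The function names, the zero fill value and the 20% shift limit are assumptions for the example, not settings taken from the project.

```python
import random


def shift(image, dx=0, dy=0, fill=0):
    """Shift a 2D image right by dx and down by dy pixels, padding with `fill`."""
    rows, cols = len(image), len(image[0])
    out = [[fill] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            nr, nc = r + dy, c + dx
            if 0 <= nr < rows and 0 <= nc < cols:
                out[nr][nc] = image[r][c]
    return out


def random_shift(image, max_fraction=0.2, rng=random):
    """Apply a random width and height shift of up to max_fraction of each side."""
    rows, cols = len(image), len(image[0])
    max_dy, max_dx = int(rows * max_fraction), int(cols * max_fraction)
    return shift(image, rng.randint(-max_dx, max_dx), rng.randint(-max_dy, max_dy))
```

Calling `random_shift` several times on the same face yields slightly displaced copies that keep the original label.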
visuals within a specific context to help people understand and make sense of large
amounts of data. The data is often displayed in a story format that visualizes patterns,
understand the patterns in the data. Principal Component Analysis (PCA) is used in
order to visualise the data in different forms. Here are some of the different
Pair plot is used to understand the best set of features to explain a relationship
between two variables or to form the most separated clusters. It also helps to form
some simple classification models by drawing some simple lines or make linear
pattern between two variables or dimensions in the dataset as shown in Figure 5.4.
Scatter plots are important in statistics because they can show the extent of
variables). If no correlation exists between the variables, the points appear randomly
scattered on the coordinate plane. If a large correlation exists, the points concentrate
near a straight line. Figure 5.5 shows the three-dimensional scatter plot using
PCA.
The Count Plot is used to understand the samples in the data with respect to
two quantitative variables (e.g., height and weight). Often a slightly looser definition
is used, whereby correlation simply means that there is some type of relationship
between two variables. This section defines positive and negative correlation, provides
some examples of correlation, explains how to measure correlation and discusses some
When the values of one variable increase as the values of the other increase,
this is known as positive correlation (see the image below). When the values of one
variable decrease as the values of another increase to form an inverse relationship, this
is known as negative correlation. The Figure 5.7 represents the correlation between
CHAPTER - 6
constructed and perfected with time, primarily over one particular algorithm a
Learning algorithm which can take in an input image, assign importance (learnable
weights and biases) to various aspects/objects in the image and be able to differentiate
process, analyse images and videos and extract details in the same way a human mind
does. Earlier computer vision was meant only to mimic human visual systems until
retail, banking, automobile, financial services, etc. Extensive research is recorded for
face recognition using CNNs, which is a key aspect of surveillance applications. The
rise of deep learning has enabled Deep Neural Networks (DNN) to achieve greater
DNN which is most commonly applied to analysing visual imagery. It is used not
only in Computer Vision but also for text classification in Natural Language
Processing (NLP).
CNNs perform well when compared to feed-forward neural networks, as they
use fewer parameters for computation and also use different layers to
learn the patterns. CNNs are used for a wide range of image-related tasks such as
lower cost has enabled a path to solve a lot of problems which have been considered
as virtually a hard task for the computers for the past few years.
Many architectures are created to solve computer vision tasks and some of the
widely popular CNN architectures are AlexNet (2012), VGG-16 (2014), Inception-v1
image spatially to detect features like edges and shapes. This large number of filters
essentially learns to capture spatial features from the image based on the weights
learned through backpropagation, and stacked layers of filters can be used to detect
complex spatial shapes from the spatial features at every subsequent level. Hence they
can successfully transform the given image into a highly abstracted representation
There are different layers that are used in order to build an efficient CNN model. They
are:
Convolutional Layer
Dropout Layer
A detailed explanation of the above-mentioned layers that are used in
1. Convolutional Layer
features such as edges, from the input image. CNNs need not be limited to only one
capturing the Low-Level features such as edges, colour, gradient orientation, etc.
us a network which has the wholesome understanding of images in the dataset, similar
to how we would.
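To make the idea of extracting an edge feature concrete, here is a minimal sketch of a single-channel convolution as CNN layers compute it (strictly a cross-correlation, with no padding and stride 1). The kernel weights and the toy image are hypothetical; in a trained CNN the kernel values are learned.

```python
def conv2d(image, kernel):
    """Valid (no padding, stride 1) 2D cross-correlation, as used in CNN layers."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


# A vertical-edge kernel (hypothetical weights; a CNN learns its own).
edge_kernel = [[1, 0, -1],
               [1, 0, -1],
               [1, 0, -1]]

# 4 x 6 image: bright left half, dark right half.
image = [[9, 9, 9, 0, 0, 0]] * 4
feature_map = conv2d(image, edge_kernel)
```

The resulting feature map is zero over the flat regions and responds only where the window straddles the bright-to-dark boundary.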
The above example illustrated the convolutional operation in 2D, but in reality
with a dimension for width, height, and depth. Depth is a dimension because of the
input is performed, where each operation uses a different filter. This results in
different feature maps. In the end, we take all of these feature maps and put them
Training deep neural networks with tens of layers is challenging as they can be
sensitive to the initial random weights and configuration of the learning algorithm.
One possible reason for this difficulty is the distribution of the inputs to layers deep in
the network may change after each mini-batch when the weights are updated. This can
cause the learning algorithm to forever chase a moving target. This change in the
Batch normalization is a technique for training very deep neural networks that
standardizes the inputs to a layer for each mini-batch. This has the effect of stabilizing
the learning process and dramatically reducing the number of training epochs required
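A minimal sketch of the standardization step for one feature across a mini-batch is shown below. It covers the training-time computation only; the `gamma`, `beta` and `eps` defaults are illustrative assumptions, and the running statistics that a framework keeps for inference are not modelled.

```python
import math


def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Standardize one feature across a mini-batch, then scale and shift."""
    mean = sum(batch) / len(batch)
    var = sum((x - mean) ** 2 for x in batch) / len(batch)
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]


activations = [2.0, 4.0, 6.0, 8.0]
normalized = batch_norm(activations)
```

After this step the values have approximately zero mean and unit variance, whatever the scale of the incoming activations was.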
maintaining the process of effectively training the model.
Figure 6.11: Max Pooling Operation
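The max pooling operation can be sketched as follows. The 2 x 2 window with stride 2 is an assumption (a common default), and the input values are made up for the example.

```python
def max_pool2d(feature_map, size=2, stride=2):
    """Max pooling over size x size windows moved by `stride`."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [[max(feature_map[r + i][c + j]
                 for i in range(size) for j in range(size))
             for c in range(0, cols - size + 1, stride)]
            for r in range(0, rows - size + 1, stride)]


feature_map = [[1, 3, 2, 4],
               [5, 6, 1, 2],
               [7, 2, 9, 0],
               [1, 8, 3, 4]]
pooled = max_pool2d(feature_map)
```

Each output value keeps only the strongest activation of its window, so a 4 x 4 map shrinks to 2 x 2.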
4. Dropout Layer
adding this penalty, the model is trained such that it does not learn interdependent set
which helps reduce interdependent learning amongst the neurons. It forces a neural
network to learn more robust features that are useful in conjunction with many
different random subsets of the other neurons. Dropout roughly doubles the number of
iterations required to converge. However, training time for each epoch is less. The
linear function in that space. Now that we have converted our input image into a
suitable form for our Multi-Layer Perceptron, we shall flatten the image into a column
The proposed CNN model consists of 20 layers. Which of them are Two-
Pooling Layer, Dense Layer. The architecture takes a gray scale image with shape
The total number of parameters of the CNN is 7,658,629, of which 7,656,197 are
trainable and 2,432 are non-trainable. The detailed description of the architecture is
shown in Table-6.1.
from the previous input, the batch normalization layer is used in order to normalize
the input and also overcome the problem of vanishing gradient and exploding
gradient. The max pooling layer is used to reduce the dimensionality of the input and
containing 4 classes, which is divided in the ratio of 85:15 into train with 8524 samples
Optimizer – RMSProp
Epochs – 50
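For readers unfamiliar with the optimizer named above, the following sketch shows one RMSProp update on a list of weights. The learning rate, decay factor `rho` and `eps` are illustrative assumptions, not the values used to train the proposed model, and the toy objective is chosen only so that the loop runs on its own.

```python
import math


def rmsprop_step(weights, grads, cache, lr=0.001, rho=0.9, eps=1e-7):
    """One RMSProp update; returns the new weights and the new cache."""
    # The cache is a running average of squared gradients.
    new_cache = [rho * c + (1 - rho) * g * g for c, g in zip(cache, grads)]
    # Each weight moves against its gradient, scaled by the root of the cache.
    new_weights = [w - lr * g / (math.sqrt(c) + eps)
                   for w, g, c in zip(weights, grads, new_cache)]
    return new_weights, new_cache


# Toy objective f(w) = w^2 with gradient 2w; 50 update steps for the demo.
w, cache = [1.0], [0.0]
for _ in range(50):
    w, cache = rmsprop_step(w, [2 * w[0]], cache, lr=0.01)
```

Dividing by the running root-mean-square of the gradient keeps the step size roughly constant per weight, which is the property that distinguishes RMSProp from plain gradient descent.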
The accuracy and the loss of the CNN model during the training w.r.t number
of epochs on train and test data are represented in Figure 6.5. It is evident that with
increase in number of epochs the accuracy is increasing and loss is decreasing and at a
certain number of epochs, both the loss and accuracy do not change further.
Training the CNN involves feeding forward the training data, generating
predictions, and computing a loss score, which is used for optimization purposes.
However, the optimizer may get stuck after some time, and it is necessary to know
why this occurs. This concerns, above all, the convolutional layers, which learn features
from the image that can be used by densely connected layers for classification
purposes.
Especially with problems that are less straightforward, CNNs can be tough to
train. In some cases, the model does not even converge. Visualizing layer outputs becomes
important in those cases. Convolutional layers, together with additional layers such
as pooling layers, downsample the image in the sense that it gets smaller and more
abstract. When this happens, a neural network might no longer be able to discriminate
Visualising the features learned by the network helps with tuning, since one can
see an error made by the network and point out the cause directly. Furthermore, if
the model performance does not improve, extending and improving the overall
design of the model can be helpful, based on the knowledge of the current design,
including its strengths and weaknesses. The intermediate layer visualizations of
Figure 6.15: Intermediate Layer Visualisation of CNN (panels: Input, Conv2D_1, Conv2D_2, Conv2D_3, Conv2D_4, Conv2D_5)
regression, ranking, clustering, topic modelling, among others). Some metrics, such as
precision-recall, are useful for multiple tasks. Supervised learning tasks such as
1. Confusion Matrix
classification model. The number of correct and incorrect predictions are summarized
with count values and broken down by each class. This is the key to the confusion
2. Classification Accuracy
by the model over all kinds of predictions made. The numerator contains the correct
predictions (true positives and true negatives), and the denominator contains the kind
3. Precision
of our prediction.
Precision = TP / (TP + FP)
4. Recall
Recall = TP / (TP + FN)
5. F1 Score
The F1 score is a number between 0 and 1 and is the harmonic mean of precision and
recall.
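The three formulas above translate directly into code. The counts in the usage line are hypothetical and are not taken from the project's results; the zero-division guards are an added convention for empty classes.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from raw counts for one class."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts, not taken from the report's results.
p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
```

Because the harmonic mean is dominated by the smaller of the two values, F1 stays low whenever either precision or recall is low.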
6. Kappa Score
function computes Cohen’s kappa, a score that expresses the level of agreement
κ = (Po − Pe) / (1 − Pe)
sample (the observed agreement ratio), and Pe is the expected agreement when both
annotators assign labels randomly.
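As a worked illustration of the kappa formula, the sketch below computes the observed agreement Po and the expected agreement Pe from a confusion matrix. The function name and the 2-class matrix are hypothetical; rows are assumed to hold actual classes and columns predicted classes.

```python
def cohen_kappa(confusion):
    """Cohen's kappa from a square confusion matrix (rows: actual, cols: predicted)."""
    n = sum(sum(row) for row in confusion)
    k = len(confusion)
    # Observed agreement: fraction of samples on the diagonal.
    p_o = sum(confusion[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    # Expected agreement if labels were assigned independently at these rates.
    p_e = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    return (p_o - p_e) / (1 - p_e)


# Hypothetical 2-class confusion matrix.
kappa = cohen_kappa([[20, 5],
                     [10, 15]])
```

Here Po = 35/50 = 0.7 and Pe = (25·30 + 25·20)/50² = 0.5, so κ = 0.2/0.5 = 0.4.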
CHAPTER - 7
IMPLEMENTATION
Network (CNN). The CNN implicitly extracts the features from the human
faces and predicts the class for given human face which can be achieved through
training and hyper parameter tuning of the proposed novel CNN. The project also
aims to develop an end-to-end process to build a custom facial recognition system i.e.
to deal with real time data. This section describes the proposed web-based application
for attendance posting using face recognition using CNN, which is developed using
Front-end interface (Camera)
Server
CNN model
Database
1. Front-End interface
A simple web page which takes the video frame as input through the camera
using the Open Source Computer Vision (OpenCV) library. The video frame that is taken as input
is then sent to the server where the novel CNN model is already deployed. It is
to be noted that in the image returned from the front-end interface, the location of the
face is identified by the automated script and a new resized image fit for the input size
2. Server
The back-end of the application, which maintains the connection with the
front-end interface, model, database and display screen. The input frame that is taken
from the front-end interface is processed to identify the face location and resize it
to the input size of the model for prediction. The script needed to process the
image and the model needed for prediction
are deployed on the server beforehand and are called whenever the
application starts running. Finally, the most important duty of the server is
3. CNN model
The novel CNN model is developed for real-time datasets with high accuracy. It
receives the resized image obtained from the front-end interface as input and
4. Database
their attendance including the timestamps. It becomes handy for the server to update,
5. End-result
The end-result is a web page which includes the details of the student such as
resources about all the algorithms for machine learning and deep learning but when
solution is to use the very light web framework to deploy the successfully trained ML
applications for any type of project and also very suggestible when deploying API’s
After running the application, the very first screen is the web page
(front-end interface) that helps the user to capture the video frame. In the web page,
a small button is highlighted; once it is clicked, the video frame is processed to
identify the location of the face in the frame, and the face is then resized to the
dimensions that fit the input size of the developed CNN model. After
successfully resizing the input frame, it is passed to the model for making a
prediction.
The Flask framework allows any Python package to be imported
directly, so all the packages necessary for loading the model can be imported
in the application. Generally, the load_model function from the
TensorFlow package is used to load the model. After loading the model, the resized image is
fed to the model for processing to generate a label. The novel architecture of the
proposed custom CNN model uses the softmax classifier, which is an activation
The CNN model outputs a vector that represents the probability distribution
over a list of potential outcomes. Since the desired output is a single label related to only
one of the trained faces, the argmax function is used to pick out the maximum of
the probability distribution generated by softmax. Since the output from the softmax
layer is a one-dimensional vector, when it is passed to the argmax function the
index of the maximum value is returned. The server then matches that index value
against the existing label values and, if there is a match,
it posts all the details corresponding to that label in the form of a web page containing
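The softmax-then-argmax lookup described above can be sketched without any deep learning library. The raw scores and the label table are hypothetical (the report does not list the four class names), and in the real application the probability vector would come from the loaded model rather than from a hand-written list.

```python
import math


def softmax(logits):
    """Numerically stable softmax: probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)


# Hypothetical label table; the report's four class names are not given.
LABELS = {0: "student_1", 1: "student_2", 2: "student_3", 3: "student_4"}

probabilities = softmax([1.2, 0.3, 4.0, 0.5])
predicted = LABELS.get(argmax(probabilities))
```

Using `LABELS.get` returns `None` for an index with no matching label, which corresponds to the "if there is a match" condition in the text.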
CHAPTER - 8
RESULTS
and the data is divided into train and test sets in order to train and evaluate the
CNN model developed. The details of the test and train sets are shown in Table 8.1.
The experimental results of the CNN model and the web application which uses the
Data     Percentage   Samples
Train    85%          8524
Test     15%          1504
carried out using the classification metrics such as Confusion Matrix, Precision, Recall,
F1 Score and Kappa Score. The mentioned classification metrics are discussed in
The Confusion Matrix is a table that is often used to describe the performance of a
The class-wise accuracy of the CNN model represents the classification accuracy
with respect to each class based on the predictions on the test data and it is
Class Label   Accuracy
1             100%
2             99.49%
3             100%
4             100%
Recall, F1 Score and Kappa Score are used to evaluate the CNN model
performance on the test or unseen data and the results are illustrated in Table-8.3.
Metric Value
Precision 99.861%
Recall 99.874%
F1 Score 99.867%
The accuracy and the loss of the CNN model are 99.86% and 0.0019, respectively, and
sample predictions for images from the test data are shown in Figure 8.2.
The web application is designed to post the attendance using the developed
CNN model for face recognition. The web application posts the attendance into the
database for the designated periods based on the timings of the college hours.
Figure 8.3 shows screenshots of the web application which illustrate the capturing
of the images and the end result of the application, i.e., the status of the attendance is
displayed.
CONCLUSION
(CNNs) in Face Recognition and adaption of CNN in attendance posting and also
proposes a novel CNN architecture for Face Recognition. The project also provides a
detailed explanation of the key components which are essential to build a robust deep
learning model, such as collecting real-time data, i.e., Human Faces, Data
proposed CNN model. Moreover, the project also provides a web application for
attendance posting using Face Recognition by using the developed CNN model.
The future scope of the project is to build the dataset with a larger number of
classes and to train the proposed CNN model on it, as the project uses only four classes, and to
build a robust web application with different features such as Login Page, Attendance
REFERENCES
Papers:
[3] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, “You only look once:
arXiv:1901.06032, 2019.
[5] A. El-Sawy, E.-B. Hazem, and M. Loey, “Cnn for handwritten Arabic digits
[6] Z.-W. Yuan and J. Zhang, “Feature extraction and image retrieval based on
2016), vol. 10033. International Society for Optics and Photonics, 2016, p. 100330E.
for big data places image recognition,” in 2018 IEEE 8th Annual Computing and
[8] K. Sun, Q. Zhao, J. Zou, and X. Ma, “Attendance and security system based on
[9] M. Coşkun, A. Uçar, Ö. Yildirim, and Y. Demir, “Face recognition based on
[10] R. Fu, D. Wang, D. Li, and Z. Luo, “University classroom attendance based on
network based on tensorflow for face recognition,” in 2017 IEEE 2nd Advanced
[15] J. Kim, H. Kim et al., “An effective intrusion detection classifier using long
Conference on Platform Technology and Service (PlatCon). IEEE, 2017, pp. 1–6.