PROJECT REPORT
ON
FACE RECOGNITION USING PYTHON
BACHELOR OF ENGINEERING
Examiners
1.
Project Guide
Date:
Place:
DECLARATION
I declare that this written submission represents my ideas in my own words, and that where
others' ideas or words have been included, I have adequately cited and referenced the original
sources. I also declare that I have adhered to all principles of academic honesty and integrity in this
submission. I understand that any violation of the above will be cause for disciplinary action
by the Institute and can also evoke penal action from the sources which have thus not been
properly cited or from whom proper permission has not been taken when needed.
Date:
ACKNOWLEDGEMENT
I would like to take this opportunity to express my heartfelt gratitude to the people whose
help and co-ordination have made this project a success. I thank Prof. Naresh Shende for his
knowledge, guidance and co-operation in the process of making this project.
I owe the success of this project to my guide and convey my thanks to her. I would like to express
my heartfelt gratitude to all the teachers and staff members of the Computer Engineering department of AMRIT
for their full support. I would like to thank my principal for providing a conducive environment in the
institution.
I am grateful to the library staff of AMRIT for the numerous books and magazines made
available for handy reference, and for the use of the internet facility.
Lastly, I am also indebted to all those who have indirectly contributed to making this project
successful.
CONTENTS
CH. NO.  TOPIC NAME                          PAGE NO.
         LIST OF FIGURES                     I
         LIST OF TABLES                      II
         ABSTRACT                            V
1        INTRODUCTION                        1
         1.1 Introduction                    2
         1.2 Objective of the Project        2
         1.3 Problem Statement               3
         1.4 Aims and Objectives             4
         1.5 Main Purpose                    4
2        LITERATURE SURVEY                   5
3        3.3 Technology                      18
4        4.1 Introduction                    21
5        5.2.1 Applications                  31
6        SYSTEM DESIGN                       33
         6.1 Methodology                     34
         6.2 Working                         34
         6.3 Unified Modeling Language       36
7        PROJECT IMPLEMENTATION              40
         7.1 Results                         41
         7.2 Output                          44
LIST OF TABLES

LIST OF SYMBOLS AND ABBREVIATIONS
ABSTRACT
Face Recognition Using Python
CHAPTER 1
INTRODUCTION
AMRIT Page 1
1. INTRODUCTION
1.1 Introduction
Real-time human detection and tracking is a vast, challenging and important field of
research. It has a wide range of applications in human recognition, human-computer interaction
(HCI), video surveillance, etc. Research on biometric authentication of a person has advanced
considerably, but real-time tracking of human beings has not gained as much importance. Tracking
a human being can be used as a prior step in biometric face recognition. Keeping continuous track
of a person makes it possible to identify that person at any time. The system consists of two parts: first, human
detection, and second, tracking. The human detection step is split into face detection and eye
detection. The face is a vital part of a human being and represents the most important information about the
individual. The eyes are an important biometric feature used in person identification. Face detection
is done using skin color-based methods. A color model is used to detect skin regions, as it
represents intensity and color information separately. For eye region detection, projection
function and pixel count methods are used.
The paper by Zhao, W. et al. (2003) listed the difficulties of facial
identification. One of these difficulties is the distinction between known
and unknown images. In addition, the paper by Pooja G.R. et al. (2010) found that the
training process for a face recognition student attendance system is slow and time-consuming.
Furthermore, the paper by Priyanka Wagh et al. (2015) mentioned that varying lighting and
head poses are often the problems that degrade the performance of face recognition-based
student attendance systems.
Hence, there is a need to develop a real-time student attendance system, which
means the identification process must be done within defined time constraints to prevent
omission. The features extracted from facial images, which represent the identity of the students,
have to be consistent under changes in background, illumination, pose and expression. High
accuracy and fast computation time will be the evaluation points of the performance.
CHAPTER 2
LITERATURE SURVEY
2 LITERATURE SURVEY
Digital image processing is the processing, by a digital computer, of images which are digital
in nature. Digital image processing techniques are motivated by three major
applications:
An image can be modeled as f(x, y) = i(x, y) r(x, y), where r(x, y) is the reflectivity of
the surface at the corresponding image point and i(x, y) represents the intensity of the
incident light. A digital image f(x, y) is discretized both in spatial co-ordinates by grids and
in brightness by quantization. Effectively, the image can be represented as a matrix whose
row and column indices specify a point in the image and whose element value identifies the
gray level at that point. These elements are referred to as pixels or pels.
Typical image sizes used in image processing applications are 256 × 256 elements,
640 × 480 pels or 1024 × 1024 pixels. Quantization of these matrix pixels is done at 8 bits
for black and white images and 24 bits for colored images (because of the three color planes
Red, Green and Blue, each at 8 bits).
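This matrix view of a digital image can be sketched in a few lines of NumPy. This is an illustrative example only, not part of the project's code; the array names are hypothetical:

```python
import numpy as np

# A grayscale digital image is a matrix of quantized brightness values.
# 8-bit quantization gives 2**8 = 256 gray levels (0 = black, 255 = white).
height, width = 4, 4
rng = np.random.default_rng(0)
gray = rng.integers(0, 256, size=(height, width), dtype=np.uint8)

# Row and column indices address a single pixel (pel); the stored
# element value is its gray level.
pixel = gray[2, 3]

# A 24-bit color image adds a third axis: one 8-bit plane each for
# Red, Green and Blue.
color = np.zeros((height, width, 3), dtype=np.uint8)

print(gray.shape)              # (4, 4)
print(color.shape)             # (4, 4, 3)
print(int(gray.max()) <= 255)  # True: every value fits in 8 bits
```
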
● Image Acquisition – An imaging sensor and the capability to digitize the signal produced
by the sensor.
● Preprocessing – Enhances the image quality: filtering, contrast enhancement, etc.
● Segmentation – Partitions an input image into its constituent parts or objects.
● Description/Feature Selection – Extracts the description of image objects suitable for
further computer processing.
● Recognition and Interpretation – Assigns a label to the object based on the information
provided by its descriptors.
Face Detection
Face detection is the process of identifying and locating all faces present in a single
image or video, regardless of their position, scale, orientation, age and expression. Furthermore,
the detection should be irrespective of extraneous illumination conditions and of the image and
video content.
Face Recognition is therefore simply the task of identifying an already detected face as a
known or unknown face and in more advanced cases telling exactly whose face it is.
Face Detection
A face detector has to tell whether an image of arbitrary size contains a human face and, if
so, where it is. Face detection can be performed based on several cues: skin color (for faces in
color images and videos), motion (for faces in videos), facial/head shape, facial appearance, or a
combination of these parameters. Most face detection algorithms are appearance-based, without
using other cues. An input image is scanned at all possible locations and scales by a sub-window.
Face detection is posed as classifying the pattern in the sub-window as either a face or a non-face.
The face/non-face classifier is learned from face and non-face training examples using statistical
learning methods [9]. Most modern algorithms are based on the Viola-Jones object detection
framework, which is based on Haar cascades.
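As a hedged illustration of this sub-window scanning, OpenCV ships a pre-trained Viola-Jones frontal-face Haar cascade. The sketch below assumes the opencv-python package is installed and degrades gracefully when it is not; detect_faces is an illustrative helper, not part of the project code:

```python
import numpy as np

try:
    import cv2  # OpenCV bundles trained Viola-Jones Haar cascades
    HAS_CV2 = True
except ImportError:
    HAS_CV2 = False

def detect_faces(gray):
    """Scan a grayscale image at all locations and scales with a Haar
    cascade classifier and return a list of (x, y, w, h) face boxes.
    Returns [] when OpenCV is not installed."""
    if not HAS_CV2:
        return []
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    # detectMultiScale slides a sub-window over an image pyramid and
    # classifies each window as face or non-face.
    return list(cascade.detectMultiScale(gray, scaleFactor=1.1,
                                         minNeighbors=5))

# A uniform synthetic image contains no faces, so no boxes are returned.
blank = np.zeros((120, 120), dtype=np.uint8)
print(detect_faces(blank))  # []
```

In a real run, `gray` would come from a file or camera frame converted to grayscale.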
Face Detection
Method: Viola-Jones Algorithm
  Advantages: 1. High detection speed. 2. High accuracy.
  Disadvantages: 1. Long training time. 2. Limited head pose. 3. Not able to detect dark faces.

Method: Local Binary Pattern Histogram
  Advantages: 1. Simple computation. 2. High tolerance against monotonic illumination changes.
  Disadvantages: 1. Only used for binary and grey images. 2. Overall performance is inaccurate compared to the Viola-Jones algorithm.
The Viola-Jones algorithm, introduced by P. Viola and M. J. Jones (2001), is the most
popular algorithm for localizing the face segment in static images or video frames. Basically,
the Viola-Jones algorithm consists of four parts: the first part is the Haar features, the
second part is where the integral image is created, followed by the application of
AdaBoost in the third part, and lastly the cascading process.
The value of the integral image at a specific location is the sum of the pixels above and
to the left of that location. To illustrate, the value of the integral
image at location 1 is the sum of the pixels in rectangle A. The values
of the integral image at the rest of the locations are cumulative. For instance, the value at
location 2 is the summation of A and B, (A + B), at location 3 it is the summation of A and C, (A +
C), and at location 4 it is the summation of all the regions, (A + B + C + D). Therefore, the sum
within the D region can be computed with only addition and subtraction of the diagonal values,
location 4 + 1 − (2 + 3), to eliminate rectangles A, B and C.
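The 4 + 1 − (2 + 3) rule above can be verified with a small NumPy sketch. The helper names are illustrative and NumPy is assumed:

```python
import numpy as np

def integral_image(img):
    """Integral image: ii[r, c] = sum of all pixels above and to the left
    of (r, c), inclusive, computed with two cumulative sums."""
    return img.cumsum(axis=0).cumsum(axis=1)

def region_sum(ii, top, left, bottom, right):
    """Sum of any rectangle via 4 lookups: 4 + 1 - (2 + 3)."""
    total = ii[bottom, right]              # location 4: A + B + C + D
    if top > 0:
        total -= ii[top - 1, right]        # location 2: A + B
    if left > 0:
        total -= ii[bottom, left - 1]      # location 3: A + C
    if top > 0 and left > 0:
        total += ii[top - 1, left - 1]     # location 1: A
    return total

img = np.arange(16).reshape(4, 4)
ii = integral_image(img)
# Sum of the bottom-right 2x2 block {10, 11, 14, 15} = 50
print(region_sum(ii, 2, 2, 3, 3))  # 50
```

This is why Haar feature evaluation stays cheap at every scale: the rectangle sums never require rescanning the pixels.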
The Local Binary Pattern (LBP) was first described in 1994 and has since been found to be a powerful feature for
texture classification. It has further been determined that when LBP is combined with the
histograms of oriented gradients (HOG) descriptor, it improves the detection performance
considerably on some datasets. Using LBP combined with histograms, we can represent
face images with a simple data vector.
2. Training the Algorithm: First, we need to train the algorithm. To do so, we need to use a
dataset with the facial images of the people we want to recognize. We need to also set an ID (it
may be a number or the name of the person) for each image, so the algorithm will use this
information to recognize an input image and give you an output. Images of the same person must
have the same ID. With the training set already constructed, let’s see the LBPH computational
steps.
3. Applying the LBP operation: The first computational step of the LBPH is to create an
intermediate image that describes the original image in a better way, by highlighting the facial
characteristics. To do so, the algorithm uses a concept of a sliding window, based on the
parameters radius and neighbors.
Based on the image above, let’s break it into several small steps so we can understand it easily:
Then, we convert this binary value to a decimal value and set it as the central value of the
matrix, which is actually a pixel of the original image.
At the end of this procedure (the LBP procedure), we have a new image which better represents
the characteristics of the original image.
Sampling neighbours that fall between pixels can be done using bilinear interpolation: if a data
point lies between pixels, the values of the 4 nearest pixels (2x2) are used to estimate the
value of the new data point.
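The LBP steps above (threshold the 3x3 neighbourhood against the centre pixel, read the bits as a binary number, and write the decimal value back) can be sketched as follows. This is an illustrative implementation of the basic radius-1, 8-neighbour operator, without the bilinear-interpolation refinement:

```python
import numpy as np

def lbp_image(gray):
    """Apply the basic 3x3 LBP operator to a grayscale image.

    Each interior pixel is replaced by an 8-bit code: neighbours >= centre
    contribute a 1 bit, others a 0 bit, read clockwise from the top-left."""
    h, w = gray.shape
    out = np.zeros((h, w), dtype=np.uint8)
    # Clockwise neighbour offsets starting at the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            centre = gray[r, c]
            code = 0
            for dr, dc in offsets:
                code = (code << 1) | int(gray[r + dr, c + dc] >= centre)
            out[r, c] = code
    return out

patch = np.array([[10, 20, 30],
                  [40, 50, 60],
                  [70, 80, 90]], dtype=np.uint8)
# Neighbours clockwise from top-left: 10,20,30,60,90,80,70,40 vs centre 50
# -> bits 0,0,0,1,1,1,1,0 -> binary 00011110 -> decimal 30
print(lbp_image(patch)[1, 1])  # 30
```
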
4. Extracting the Histograms: Now, using the image generated in the last step, we can use
the Grid X and Grid Y parameters to divide the image into multiple grids, as can be seen in the
following image:
Based on the image above, we can extract the histogram of each region as follows:
● As we have an image in grayscale, each histogram (from each grid) will contain only
256 positions (0-255) representing the occurrences of each pixel intensity.
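A minimal sketch of this grid step, assuming NumPy; `lbph_descriptor` is an illustrative name, not the project's actual code:

```python
import numpy as np

def lbph_descriptor(lbp_img, grid_x=8, grid_y=8):
    """Split an LBP image into grid_x * grid_y regions, take a 256-bin
    histogram of each region, and concatenate them into one feature vector."""
    h, w = lbp_img.shape
    cell_h, cell_w = h // grid_y, w // grid_x
    hists = []
    for gy in range(grid_y):
        for gx in range(grid_x):
            cell = lbp_img[gy * cell_h:(gy + 1) * cell_h,
                           gx * cell_w:(gx + 1) * cell_w]
            hist, _ = np.histogram(cell, bins=256, range=(0, 256))
            hists.append(hist)
    return np.concatenate(hists)

lbp_img = np.random.default_rng(1).integers(0, 256, (64, 64), dtype=np.uint8)
desc = lbph_descriptor(lbp_img)
print(desc.shape)  # (16384,) = 8 x 8 grids x 256 bins each
```
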
5. Performing the face recognition: In this step, the algorithm is already trained. Each
histogram created is used to represent each image from the training dataset. So, given an input image,
we perform the steps again for this new image and create a histogram which represents the image.
a. To find the image that matches the input image, we just need to compare two histograms and
return the image with the closest histogram.
b. We can use various approaches to compare the histograms (i.e., calculate the distance between
two histograms), for example: Euclidean distance, chi-square, absolute value, etc. In this
example, we can use the well-known Euclidean distance, based on the following formula:
D = sqrt( Σ (hist1_i − hist2_i)² )
c. The algorithm output is then the ID of the image with the closest histogram. The algorithm
should also return the calculated distance, which can be used as a 'confidence' measurement.
d. We can then use a threshold and the 'confidence' to automatically estimate whether the algorithm
has correctly recognized the image. We can assume that the algorithm has recognized the image
successfully if the confidence is lower than the defined threshold.
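Steps (a)-(d) can be sketched as a small matching routine. The function names and the toy histograms below are illustrative, not the project's actual data:

```python
import numpy as np

def euclidean_distance(h1, h2):
    """D = sqrt(sum_i (h1_i - h2_i)^2), the distance used in step (b)."""
    h1 = np.asarray(h1, dtype=float)
    h2 = np.asarray(h2, dtype=float)
    return float(np.sqrt(np.sum((h1 - h2) ** 2)))

def recognize(input_hist, training_hists, threshold):
    """Return (best_id, confidence, recognized) for an input histogram.

    training_hists maps person ID -> stored histogram. The lowest distance
    is the 'confidence'; recognition succeeds when it is below threshold."""
    best_id, confidence = None, float("inf")
    for person_id, hist in training_hists.items():
        d = euclidean_distance(input_hist, hist)
        if d < confidence:
            best_id, confidence = person_id, d
    return best_id, confidence, confidence < threshold

train = {"alice": [4, 0, 2], "bob": [0, 4, 4]}
print(recognize([4, 1, 2], train, threshold=2.0))  # ('alice', 1.0, True)
```

A lower confidence value therefore means a better match, which is why the threshold test in step (d) uses "lower than".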
CHAPTER 3
SYSTEM REQUIREMENT
& SPECIFICATION
3.3 Technology:
PYTHON:
Python is a multi-paradigm programming language. Object-oriented programming and structured
programming are fully supported, and many of its features support functional programming and
aspect-oriented programming (including metaprogramming and metaobjects). Many other
paradigms are supported via extensions, including design by contract and logic programming.
Python uses dynamic typing and a combination of reference counting and a cycle-detecting garbage
collector for memory management. It uses dynamic name resolution (late binding), which binds
method and variable names during program execution.
Its design offers some support for functional programming in the Lisp tradition. It has filter,
map and reduce functions; list comprehensions, dictionaries, sets, and generator expressions. The
standard library has two modules (itertools and functools) that implement functional tools borrowed
from Haskell and Standard ML.
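A quick illustration of these functional tools, using only the standard library:

```python
from functools import reduce
from itertools import accumulate, islice

# filter/map/reduce and comprehensions in the functional style noted above.
squares = [x * x for x in range(6)]           # list comprehension
evens = list(filter(lambda x: x % 2 == 0, squares))
total = reduce(lambda a, b: a + b, squares)   # functools tool in the ML/Haskell lineage

# Generator expressions and itertools compose lazily: nothing is computed
# until the values are actually consumed.
running = list(islice(accumulate(x * x for x in range(6)), 3))

print(evens)    # [0, 4, 16]
print(total)    # 55
print(running)  # [0, 1, 5]
```
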
Rather than having all of its functionality built into its core, Python was designed to be highly extensible
via modules. This compact modularity has made it particularly popular as a means of adding
programmable interfaces to existing applications. Van Rossum's vision of a small core language
with a large standard library and an easily extensible interpreter stemmed from his frustrations with
ABC, which espoused the opposite approach.
Python strives for a simpler, less-cluttered syntax and grammar while giving developers a choice
in their coding methodology. In contrast to Perl's "there is more than one way to do it" motto,
Python embraces a "there should be one—and preferably only one—obvious way to do it"
philosophy.
Python's developers strive to avoid premature optimization and reject patches to non-critical parts
of the CPython reference implementation that would offer marginal increases in speed at the cost
of clarity. When speed is important, a Python programmer can move time-critical functions to
extension modules written in languages such as C, or use PyPy, a just-in-time compiler. Cython
is also available, which translates a Python script into C and makes direct C-level API calls into
the Python interpreter.
Python's developers aim for it to be fun to use. This is reflected in its name (a tribute to the British
comedy group Monty Python) and in occasionally playful approaches to tutorials and reference
materials, such as examples that refer to spam and eggs (a reference to a Monty Python sketch) instead
of the standard foo and bar. The language's name came from the BBC comedy series Monty Python's
Flying Circus: Guido van Rossum thought he needed a name that was short, unique and slightly
mysterious, and so he decided to name the language 'Python'.
A common neologism in the Python community is pythonic, which has a wide range of meanings
related to program style. "Pythonic" code may use Python idioms well, be natural or show fluency in the
language, or conform with Python's minimalist philosophy and emphasis on readability. Code that is
difficult to understand or reads like a rough transcription from another programming language is called
un-pythonic. Python users and admirers, especially those considered knowledgeable or experienced,
are often referred to as Pythonistas.
PYCHARM:
PyCharm is a dedicated Python Integrated Development Environment (IDE) providing a wide range of
essential tools for Python developers, tightly integrated to create a convenient environment for
productive Python, web, and data science development.
CHAPTER 4
MODEL IMPLEMENTATION
AND ANALYSIS
4.1 INTRODUCTION:
Face detection involves separating image windows into two classes: one containing
faces, and one containing the background (clutter). It is difficult because, although commonalities exist
between faces, they can vary considerably in terms of age, skin color and facial expression. The
problem is further complicated by differing lighting conditions, image qualities and
geometries, as well as the possibility of partial occlusion and disguise. An ideal face detector
would therefore be able to detect the presence of any face under any set of lighting
conditions, upon any background. The face detection task can be broken down into two steps.
The first step is a classification task that takes some arbitrary image as input and outputs a binary
value of yes or no, indicating whether there are any faces present in the image. The second
step is the face localization task, which takes an image as input and outputs the location of
any face or faces within that image as a bounding box (x, y, width, height). After
taking the picture, the system compares it with the pictures in its database and gives the
most closely related result. We will use an NVIDIA Jetson Nano Developer Kit, a Logitech C270 HD
webcam and the OpenCV platform, and will do the coding in the Python language.
The main component used in the implementation approach is the open source
computer vision library (OpenCV). One of OpenCV's goals is to provide a simple-to-use
computer vision infrastructure that helps people build fairly sophisticated vision
applications quickly. The OpenCV library contains over 500 functions that span many areas of
vision. The primary technology behind face recognition here is OpenCV. The user stands in
front of the camera, keeping a minimum distance of 50 cm, and his image is taken as
input. The frontal face is extracted from the image, then converted to grayscale and stored.
The Principal Component Analysis (PCA) algorithm is performed on the images and the
eigenvalues are stored in an XML file. When a user requests recognition, the frontal face
is extracted from the video frame captured through the camera. The eigenvalue is re-calculated
for the test face and matched against the stored data for the closest neighbour.
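The PCA matching pipeline described above can be sketched with NumPy. This is a simplified eigenface illustration on synthetic data, not the project's actual OpenCV/XML implementation; all function names are illustrative:

```python
import numpy as np

def train_pca(faces, num_components):
    """Eigenface training: faces is (n_samples, n_pixels) of flattened
    grayscale images. Returns the mean face and the top eigenvectors,
    obtained via SVD of the mean-centered data."""
    mean = faces.mean(axis=0)
    centered = faces - mean
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:num_components]

def project(face, mean, components):
    """Project a face into the low-dimensional eigenface space."""
    return components @ (face - mean)

def nearest_id(test_face, gallery, ids, mean, components):
    """Match a test face to the stored face with the closest projection."""
    test_w = project(test_face, mean, components)
    dists = [np.linalg.norm(test_w - project(g, mean, components))
             for g in gallery]
    return ids[int(np.argmin(dists))]

rng = np.random.default_rng(0)
gallery = rng.normal(size=(6, 50 * 50))       # six synthetic 50x50 "faces"
ids = ["s1", "s2", "s3", "s4", "s5", "s6"]
# 5 components capture the full variance of 6 mean-centered samples.
mean, comps = train_pca(gallery, num_components=5)

# A slightly noisy copy of face index 2 should still match ID "s3".
probe = gallery[2] + rng.normal(scale=0.05, size=50 * 50)
print(nearest_id(probe, gallery, ids, mean, comps))  # s3
```
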
● Interest points: detection and matching
We copied this script, placed it in a directory on our Raspberry Pi and saved it. Then,
through the terminal, we made the script executable and ran it.
2. Python IDE: There are lots of IDEs for Python. Some of them are PyCharm,
Thonny, Ninja and Spyder. Ninja and Spyder are both excellent and free, but we used
Spyder as it is more feature-rich than Ninja. Spyder is a little heavier than Ninja but still much
lighter than PyCharm. You can run them on the Pi and get the GUI on your PC.
It’s simpler than ever to get started! Just insert a microSD card with the system image,
boot the developer kit, and begin using the same NVIDIA JetPack SDK used across the
entire NVIDIA Jetson™ family of products. JetPack is compatible with NVIDIA’s world-
leading AI platform for training and deploying AI software, reducing complexity and effort
for developers.
Specifications:
Display: HDMI
USB: 1x USB 3.0 Type A, 2x USB 2.0 Type A, USB 2.0 Micro-B
Mechanical: 100 mm x 80 mm x 29 mm
The developer kit uses a microSD card as boot device and for main storage. It’s
important to have a card that’s fast and large enough for your projects; the minimum
requirement is a 32GB UHS-1 card.
Before utilizing it, we have to configure our NVIDIA Jetson Nano Board for Computer
Vision and Deep Learning with TensorFlow, Keras, TensorRT, and OpenCV.
The NVIDIA Jetson Nano packs 472 GFLOPS of computational horsepower. While it is a
very capable machine, it is not easy to configure.
4.3.2.2 Webcam:
Face Detection:
Start capturing images through the web camera of the client side:
Begin:
End
Face Recognition:
Using the PCA algorithm, the following steps are followed for face recognition:
Begin:
● Find the face information of the matched face image from the database.
● Update the log table with the corresponding face image and system time,
which completes the attendance for an individual student.
End
This section presents the results of the experiment conducted to capture the face
into a grayscale image of 50x50 pixels.
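The grayscale 50x50 preprocessing can be sketched in NumPy. In the real system, OpenCV routines such as cv2.cvtColor and cv2.resize would do this work, so the helpers below are illustrative stand-ins (nearest-neighbour resize rather than OpenCV's default interpolation):

```python
import numpy as np

def to_gray(rgb):
    """Luminance grayscale conversion using ITU-R BT.601 weights."""
    return (0.299 * rgb[..., 0] + 0.587 * rgb[..., 1]
            + 0.114 * rgb[..., 2]).astype(np.uint8)

def downsample(gray, size=50):
    """Nearest-neighbour resize to size x size (stand-in for cv2.resize)."""
    h, w = gray.shape
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return gray[np.ix_(rows, cols)]

# A synthetic 480x640 RGB frame stands in for a captured camera frame.
frame = np.random.default_rng(2).integers(0, 256, (480, 640, 3),
                                          dtype=np.uint8)
face_50 = downsample(to_gray(frame))
print(face_50.shape)  # (50, 50)
```
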
CHAPTER 5
SYSTEM ARCHITECTURE
5 SYSTEM ARCHITECTURE
Most existing digital video surveillance systems rely on human observers for detecting
specific activities in a real-time video scene. However, there are limitations in the human
capability to monitor simultaneous events in surveillance displays. Hence, human motion
analysis in automated video surveillance has become one of the most active and attractive
research topics in the area of computer vision and pattern recognition.
5.2.1 Applications:
People counting has a wide range of applications in the context of pervasive systems. These
applications range from efficient allocation of resources in smart buildings to handling emergency
situations. There exist several vision-based algorithms for people counting. Each algorithm
performs differently in terms of efficiency, flexibility and accuracy for different indoor scenarios.
Hence, evaluating these algorithms with respect to different application scenarios, environment
conditions and camera orientations will provide a better choice for actual deployment. For this
purpose, in our paper the most commonly implemented Frame Differencing, Circular Hough
Transform and Histogram of Oriented Gradient based methods are evaluated with respect to
different factors like camera orientation, lighting, occlusion etc. The performance of these
algorithms under different scenarios demonstrates the need for more accurate and faster people
counting algorithms.
CHAPTER 6
SYSTEM DESIGN
6 SYSTEM DESIGN
6.1 Methodology:
TensorFlow is an open-source API from Google, which is widely used for solving machine
learning tasks that involve deep neural networks. The TensorFlow Object Detection API is an
open-source library built on TensorFlow to support the training and evaluation of object
detection models. Here we take a look at the "TensorFlow Detection Model", which is a
collection of pre-trained models compatible with the TensorFlow Object Detection API. PyCharm is
the integrated development environment we used to turn this design into a working program.
We followed a three-step approach for creating the program: design the
appearance of the application, assign property settings to the objects of the program, and write the
code to direct specific tasks at runtime.
6.2 Working:
The user just needs to download the files and run main.py on their local system.
On the starting window of the application, the user will see START and EXIT options,
with which the user can start the application or exit from it.
When the user starts the application using the START button, a new window opens, which
presents options such as DETECT FROM IMAGE, DETECT FROM VIDEO and DETECT FROM
CAMERA.
When the user selects either of the first two options, he/she needs to select the respective file using
the SELECT button.
The user can preview the selected file using the PREVIEW button, and detect and count the humans
using the DETECT button.
When the user selects the last option, detecting through the camera, the user needs to open the
camera using the OPEN CAMERA button. As soon as the camera opens, the detection process starts.
After the detection process completes, or the user ends it manually, two graphs are plotted.
Along with these two plots, an option to generate a crowd report also appears. On clicking
it, a crowd report in the form of a PDF is generated and saved automatically at the project file
location. The generated crowd report contains information such as the maximum human
count, the maximum accuracy, the maximum average accuracy, and a two-line status about the crowd.
Fig 6.3.1:
The HUMAN DETECTION SYSTEM class diagram is a modeled diagram that explains the
system's classes and relationships. The diagram depicts the names and attributes of the classes, as well as their
links and their methods. It is the most essential type of UML diagram and is critical in
software development. It is an approach to showing the system's structure in detail, including its
properties and operations. The HUMAN DETECTION SYSTEM must have a designed class
diagram to define these classes and their relationships.
6.3.2 Use Case Diagram:
The objective of a use case diagram is to show the interactions of numerous items called
actors with the use case and to capture fundamental functionalities of a system. As you see
through the diagrams, there are the use cases involved to define the core functions of a system.
These processes were expected by the users to be connected to produce a certain output. Being
a programmer, this could be an important role that the HUMAN DETECTION SYSTEM
general Use Case Diagram should have.
The data included in the System flow chart diagram was labeled properly to guide the
developers on the graphical representation of the HUMAN FACE DETECTION SYSTEM.
A system architecture is the conceptual model that defines the structure, behavior, and more
views of a system. An architecture description is a formal description and representation of a
system, organized in a way that supports reasoning about the structures and behaviors of the
system. These diagrams visualize the boundaries, along with the software, nodes, and
processors that make up the system. They can also help you understand how different
components communicate with each other. Not only that, but they also give you an overview of
the physical hardware in the system.
CHAPTER 7
PROJECT IMPLEMENTATION
7. PROJECT IMPLEMENTATION
7.1 Results
7.2 Output
CONCLUSION
In the last section of the project, we generate a Crowd Report, which gives a message based on
the results obtained from the detection process. For this we took a threshold human
count and gave different messages for the different human-count results obtained from the
detection process.
Coming to the future scope of this project: since we take any image, video or camera feed,
detect the humans in it and obtain a count of them along with an accuracy, some of the future
scope can be: this can be used in various malls and other areas to analyse the
maximum people count, and then to place restrictions on the number of people allowed at a
time at that place.
This can replace various manual monitoring jobs, which can be done more efficiently by machines. It
will ultimately lead to some kind of crowd control in the places or areas where it is
implemented.
FUTURE SCOPE
Big data applications are consuming most of the space in industry and research. Among the
widespread examples of big data, the role of video streams from CCTV cameras is equally important
as other sources like social media data, sensor data, agriculture data, medical data and data evolved
from space research. Surveillance videos make a major contribution to unstructured big data. CCTV
cameras are deployed in all places where security is of much importance. Manual surveillance
seems tedious and time-consuming. Security can be defined in different terms in different contexts,
like theft identification, violence detection, chances of explosion, etc.
In crowded public places the term security covers almost all types of abnormal events. Among them,
violence detection is difficult to handle since it involves group activity. The analysis of anomalous
or abnormal activity in a crowd video scene is very difficult due to several real-world constraints. This
work includes a deep-rooted survey which starts from object recognition, action recognition and crowd
analysis, and finally covers violence detection in a crowd environment. The majority of the papers
reviewed in this survey are based on deep learning techniques. Various deep learning methods are
compared in terms of their algorithms and models.
REFERENCES
LINKS:
▪ https://ieeexplore.ieee.org/document/9760635
▪ https://ieeexplore.ieee.org/document/9730709
▪ [1] A Brief History of Facial Recognition, NEC, New Zealand, 26 May 2020. [Online]. Available: https://www.nec.co.nz/market-leadership/publications-media/a-brief-history-of-facial-recognition/
▪ [2] Face Detection, TechTarget Network, Corinne Bernstein, Feb. 2020. [Online]. Available: https://searchenterpriseai.techtarget.com/definition/face-detection
▪ [3] Paul Viola and Michael Jones, "Rapid Object Detection using a Boosted Cascade of Simple Features," Conference on Computer Vision and Pattern Recognition, 2001.
▪ [4] Face Detection with Haar Cascade, Towards Data Science, Girija Shankar Behera, India, Dec. 24, 2020. [Online]. Available: https://towardsdatascience.com/face-detection-with-haar-cascade-727f68dafd08
▪ [5] Face Recognition: Understanding LBPH Algorithm, Towards Data Science, Kelvin Salton do Prado, Nov. 11, 2017. [Online]. Available: https://towardsdatascience.com/face-recognition-how-lbph-works-90ec258c3d6b
▪ [6] What is Facial Recognition and How Sinister is It, The Guardian, Ian Sample, July 2019. [Online]. Available: https://www.theguardian.com/technology/2019/jul/29/what-is-facial-recognition-and-how-sinister-is-it
▪ [7] Kushsairy Kadir, Mohd Khairi Kamaruddin, Haidawati Nasir, Sairul I. Safie, Zulkifli Abdul Kadir Bakti, "A comparative study between LBP and Haar-like features for Face Detection using OpenCV," 4th International Conference on Engineering Technology and Technopreneurship (ICE2T), DOI: 10.1109/ICE2T.2014.7006273, 12 January 2015.
▪ [8] Senthamizh Selvi. R, D. Sivakumar, Sandhya. J.S, Siva Sowmiya. S, Ramya. S, Kanaga Suba Raja. S, "Face Recognition Using Haar-Cascade Classifier for Criminal
International Journal for Multidisciplinary Research (IJFMR), E-ISSN: 2582-2160, March-April 2025
Abstract
Human faces are dynamic, multidimensional systems that require good recognition
processing techniques. Over the past few decades, interest in automated face recognition has
been growing rapidly, including its theories and algorithms. Public security, criminal
identification, identity verification for physical and logical access, and intelligent autonomous
vehicles are a few examples of concrete applications of automated face recognition that are
gaining popularity among industries. Research in face recognition started in the 1960s. Since then,
various techniques have been developed and deployed, including local, holistic, and hybrid
approaches, which recognize faces using only a few face image features or whole facial features.
Yet robust and efficient face recognition still poses challenges for computer vision and
pattern recognition researchers. In this paper, the researchers offer an overview of face
recognition, the different techniques used in previous literature, and their applications.
1. INTRODUCTION
I. BACKGROUND
Facial recognition is a biometric tool. Like other regularly used biometric technologies such as fingerprint recognition, iris recognition, and finger-vein pattern recognition, it identifies a person based on specific physiological features. The introduction of facial recognition to the field of pattern recognition widened its range of applicability, particularly for cyber investigations. This has been possible due to advanced training techniques and progress made in analysis. Increased demand for robust security systems led researchers to work on finding a reliable technology for verifying identities. Facial recognition systems could be the best solution due to their speed and convenience over other biometric technologies. The identity of any person is incomplete without facial recognition. Just like any other form of identification, face recognition requires samples to be collected, identified, extracted for the necessary information (features), and stored for recognition.
Though the software used varies, the facial recognition process generally follows three main phases. First, the face is captured in real time. The software then determines a number of facial features known as landmarks or nodal points on the face, including the depth of the eye sockets, the distance between the eyes, the width of the nose, and the distance from the forehead to the chin. Each software package uses different nodal points and can obtain up to 80 different measurements. This data is then converted into a mathematical formula that represents the person's unique facial signature. Afterward, the facial signature is compared to a dataset of known faces. This can all happen in a matter of seconds. Facial recognition technology has been around since the 1960s. Woodrow Wilson Bledsoe developed an early system that identified photographs by manually entering the coordinates of facial features such as the mouth and nose using an electrical stylus. When given a photograph of a person, the system could extract the images from a dataset that most closely resembled it. However, since its inception, facial
recognition has been polarizing. Facial recognition technology is widely used in the field of
safety and security. It is used by law enforcement agencies to fight crime and locate missing
people. Furthermore, face recognition technology is increasingly being used at airport security
checkpoints around the world to protect passengers and identify criminals attempting to enter the
country [3, 4]. Today, some companies are developing a service using face recognition data
platforms to help prevent shoplifting and violent crime. Facial recognition technology is getting
faster and more accurate every year. However, its applications do not stop at safety and security.
It could also soon be used to make our lives more convenient. Facial recognition is increasingly
used in mobile devices and consumer products to authenticate users. College classrooms and
testing facilities are using it to take attendance and prevent cheating. Retailers are using it to
identify customers. Moreover, some automotive manufacturers are developing ways to use the
technology in place of car keys. Facial recognition could also be used for targeting products to
specific groups by offering a personalized experience [3]. There are vocal arguments against
facial recognition technology, with the biggest being its threat to an individual’s privacy. Some
cities across the world are already working towards banning real-time facial recognition. That is
mainly because facial data can be collected and stored without the person’s permission [1].
Yet, as has been demonstrated, the technology is still not perfect. It is nevertheless being widely adopted even as it continues to advance toward higher accuracy.
A. How Facial Recognition Works
Since computers do not understand faces as humans do, this technology is built on turning face images into numerical expressions, called templates, that can be processed by the computer and then compared with other face images. For the matching to be accurate and yield true results, the features extracted from an image must make it unique, so that when it is later compared with other images in the dataset, two images match only if they share the same features. Formerly, the distances between key points on a template were used for such processing; however, that was far from accurate.
Since computers represent images as matrices whose numbers encode pixel colors, facial recognition focuses on processing such matrices so that faces can be recognized from the way the numbers are organized. Modern approaches pass the digital face image through a series of "filters" to generate templates that are unique for each face. These filters are applied in a form that produces a distinctive, simplified fingerprint of the face being processed.
At the beginning of facial recognition, scientists picked the filters applied to the images themselves. Nowadays, computers handle that task through deep learning. The filters are selected by giving the system a series of image triplets, two of which show the same person while the third shows someone else. Through trial and error, the system must maximize the similarity between the two images of the same person and minimize the similarity with the third image of each triplet. The desired output is a collection of filters that are reliable enough to distinguish one face from another.
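The trial-and-error training on triplets described above is commonly formalized as a triplet objective. The sketch below is a minimal illustration of that idea, not the exact procedure used by any system surveyed here; the templates and the margin value are made-up toy numbers.

```python
def distance(a, b):
    # squared Euclidean distance between two templates (feature vectors)
    return sum((x - y) ** 2 for x, y in zip(a, b))

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Pull same-person templates together; push the different person's
    template at least `margin` further away. Zero loss means the triplet
    is already correctly separated."""
    return max(0.0, distance(anchor, positive) - distance(anchor, negative) + margin)

# toy templates: anchor and positive are the same person, negative is not
anchor   = [0.1, 0.9, 0.3]
positive = [0.1, 0.8, 0.3]
negative = [0.9, 0.1, 0.7]
loss = triplet_loss(anchor, positive, negative)
```

When the same-person pair is already much closer than the impostor, the loss is zero and the filters need no adjustment; a violated triplet produces a positive loss that drives the weight updates.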
IJFMR250239330 Volume 7, Issue 2, March-April 2025 2
Illumination: Illumination is a challenging condition that can heavily affect face recognition systems, as variation in lighting changes the appearance of a face dramatically. Even images of the same person taken with the same pose and expression can appear drastically different under varying lighting. It has also been shown that the difference between two images of two different people under the same illumination conditions is smaller than the difference between two images of the same person under different lighting conditions. This challenge has therefore attracted the attention of researchers and is widely considered hard for both algorithms and humans. To overcome it, three methods can be applied: gradient, grey-level, and face reflection field estimation techniques.
Occlusion: Occlusion can be defined as something that blocks the face or part of the face, such as a mustache, hands on the face, a scarf, sunglasses, or shadows caused by extreme light. Occlusion of less than 50% of the face is referred to as partial occlusion. This challenge degrades the performance of face recognition systems. The holistic approach is one method that can handle it, as it suppresses the occluded features, traits, and characteristics while the rest of the face is used as a piece of valuable information.
Resolution: The varying quality and resolution of input images is a crucial challenge. Any image below 16×16 pixels is considered low-resolution; such images are typical of CCTV cameras in public streets or supermarkets. They do not provide much information, as most of it is lost, so recognition performance drops drastically. Hence, there is a direct relationship between resolution and recognition performance: as the resolution increases, recognition becomes better, easier, and more efficient.
Aging: Aging is an uncontrollable process that affects every human face. In the following sections, this paper surveys numerous previous research papers on face recognition using different techniques and approaches, which provide a foundation for the current study. Finally, the conclusion section summarizes the main findings, discusses the limitations of the developed system, and suggests future research directions.
In a neural network, forward propagation passes the input through the hidden layers until it reaches an output, which is compared to a predicted output; backward propagation then adjusts the hidden-layer neurons' weights depending on the calculated error, as shown in Fig. 1, until the actual output equals the predicted one. Each neuron multiplies its inputs by weights and adds a bias, resulting in a net sum, and an activation function takes the net sum to produce the neuron's output; simply put, depending on a certain calculation it decides whether a neuron should be activated or not. There are various types of activation functions: step, sigmoid, tanh, and the Rectified Linear Unit (ReLU). Most of the time ReLU is used as the activation function, as it outputs the value when it is larger than zero and outputs zero otherwise, as given by equation 1:

f(x) = max{0, x} (1)

Each neuron within every hidden layer carries out these two functions and passes its output to the successive hidden layer, if there is one, until the output layer is reached. If the final output does not match the predicted one, the error is sent back through the neurons to adjust their weights. The output layer has a variable number of neurons, but most cases fall under the following classifications: regression, in which the output is a single neuron holding a continuous number; binary classification, where the neuron is either 1 or 0 to signify classes; and multi-class classification, which represents various classes [21].
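The net sum, bias, and ReLU activation described above can be sketched in a few lines of Python. The weights, biases, and inputs below are arbitrary toy values chosen only to illustrate one forward pass through a tiny two-neuron hidden layer.

```python
def relu(z):
    # Rectified Linear Unit (Eq. 1): outputs z when positive, otherwise 0
    return max(0.0, z)

def neuron(inputs, weights, bias):
    # net sum = weighted inputs plus bias; the activation decides the output
    net = sum(x * w for x, w in zip(inputs, weights)) + bias
    return relu(net)

# one forward pass: 2 inputs -> 2 hidden neurons -> 1 output neuron
x = [0.5, -1.0]
h = [neuron(x, [0.8, 0.2], 0.1),    # net = 0.3, stays active
     neuron(x, [-0.4, 0.9], 0.0)]   # net = -1.1, ReLU silences it
y = neuron(h, [1.0, 1.0], 0.0)
```

In training, y would be compared with the predicted output and the resulting error propagated back to adjust the weights, as described above.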
C. Contribution
Face recognition systems are widely used in many real-world applications, as documented in several research papers which are discussed in detail in the following sections. This paper provides a comprehensive survey of face recognition methods, from traditional feature-based methods to recent deep learning methods, identifies the key challenges that need to be addressed, and offers insights into the various aspects of system development and implementation in the Literature Review.
1) Convolution Layer: An image can be viewed as a 2D matrix of its pixels. In face recognition, a filter or kernel is an n × n matrix, usually sized 3×3, 4×4, or 5×5. The convolution layer acts as the key building block of a CNN, and it is where most of the processing takes place. Convolution is done by having a filter go over the 2D input image, multiplying it by the corresponding n × n matrix in the image, as shown in Fig. 3. The filter may stride over the input image matrix by 1 pixel (as the red box does) or 2 pixels (as the orange box does), then output the resulting matrix of the convolution, known as the feature map. The convolution layer is sometimes followed by one of the activation functions previously summarized before entering the following layer, pooling.
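A minimal Python sketch of the convolution step just described, using plain lists and no libraries; the 4×4 image and the 2×2 kernel are made-up toy values, and the stride parameter corresponds to the 1- or 2-pixel steps mentioned above.

```python
def convolve2d(image, kernel, stride=1):
    """Slide an n-by-n kernel over a 2D image and return the feature map."""
    n = len(kernel)
    rows = (len(image) - n) // stride + 1
    cols = (len(image[0]) - n) // stride + 1
    out = []
    for i in range(rows):
        row = []
        for j in range(cols):
            # element-wise product of the kernel with the window it covers
            s = sum(image[i * stride + a][j * stride + b] * kernel[a][b]
                    for a in range(n) for b in range(n))
            row.append(s)
        out.append(row)
    return out

image = [[1, 2, 3, 0],
         [4, 5, 6, 1],
         [7, 8, 9, 2],
         [1, 0, 1, 3]]
edge = [[1, 0], [0, -1]]          # toy 2x2 difference kernel
fmap = convolve2d(image, edge)    # 3x3 feature map with stride 1
```

With stride 1 a 4×4 input and 2×2 kernel yield a 3×3 feature map, matching the (N − n)/stride + 1 size rule implied by the description above.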
2) Pooling Layer: The pooling layer follows convolution and helps reduce the matrix size, which speeds up processing. Either max pooling or average pooling is used, as shown in Fig. 4. Max pooling is done by passing another filter over the feature map, taking the maximum value of the filtered region, and outputting it to the final feature map. The filter is usually set to 2×2 with a stride of 2. Average pooling takes the average of the filtered region instead and is usually applied once before the fully connected layer to reduce the number of learnable parameters. Just like convolution filters, pooling filters may vary in size and stride.
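The 2×2, stride-2 max pooling just described can be sketched as follows; the 4×4 feature map holds arbitrary toy values.

```python
def max_pool(fmap, size=2, stride=2):
    """Take the maximum of each size-by-size window, moving by `stride`."""
    rows = (len(fmap) - size) // stride + 1
    cols = (len(fmap[0]) - size) // stride + 1
    return [[max(fmap[i * stride + a][j * stride + b]
                 for a in range(size) for b in range(size))
             for j in range(cols)]
            for i in range(rows)]

fmap = [[1, 3, 2, 4],
        [5, 6, 1, 0],
        [7, 2, 9, 8],
        [0, 1, 3, 4]]
pooled = max_pool(fmap)   # 4x4 feature map shrinks to 2x2
```

Each output value is the strongest response in its window, so the map shrinks by a factor of 4 while the dominant activations survive.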
The feed-forward network then gives the fully connected layer's output, which may consist of multiple output neurons, as shown in Fig. 5, or a single one.
A. LBP and LBPH algorithm in OpenCV: As mentioned earlier, the OpenCV library currently contains more than 2,500 optimized algorithms. One of them is HOG, which is discussed in this paper. Additionally, there are other algorithms, such as Local Binary Patterns (LBP) and the Local Binary Pattern Histogram (LBPH), that are used in a variety of applications, such as face and object recognition, and can extract features from images. Various research papers have used these algorithms in their face recognition systems or applications. Local Binary Patterns (LBP) is a texture operator derived from a generic definition of texture in a local neighborhood; the original LBP operator was introduced by Ojala et al. It is defined as a grayscale-invariant texture measure. The LBP texture operator has been a prominent method in many applications as a result of its discriminative power and computational simplicity, which facilitates the analysis of images under difficult real-time conditions. Its crucial feature is its robustness to monotonic grayscale changes, for instance, changes in lighting. A local neighborhood refers to a particular region surrounding a pixel in an image and is defined by choosing a region around each pixel; this region's size is determined by the radius parameter, which specifies the distance between the center pixel and its neighbors. The original LBP operator generates labels for image pixels by thresholding the 3×3 neighborhood of each pixel with the center value and converting the result to a binary integer. The histogram of these labels can then be used as a texture descriptor. Before performing the LBP operation, the algorithm must be trained with a dataset of facial images of the person to be recognized, and an ID must be assigned to each image so that the algorithm can use that information in the output result. As Fig. 8 shows, the LBP operator selects a part of the image that is 3×3 pixels in size, which can be modeled as a 3×3 matrix holding the intensity of each pixel (0-255). The central value of the matrix is used as a threshold to define new values for its 8 neighbors: a new binary value is set for each neighbor, 0 if its value falls below the threshold and 1 otherwise. The matrix then contains only binary values which, when concatenated, form a new binary number. This binary number is converted to a decimal number and assigned to the matrix's central value, which is actually a pixel from the original image.
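The 3×3 LBP computation described above can be sketched directly in Python. Note that the reading order of the neighbors and the handling of ties (here a neighbor equal to the center gets a 1) are implementation choices; OpenCV's LBPH has its own conventions, and the pixel values below are made up.

```python
def lbp_value(patch):
    """LBP code of the center pixel of a 3x3 patch of intensities (0-255).
    Neighbors are read clockwise from the top-left corner; each contributes
    1 if it is >= the center value, else 0."""
    center = patch[1][1]
    order = [(0, 0), (0, 1), (0, 2), (1, 2),   # top row, then right
             (2, 2), (2, 1), (2, 0), (1, 0)]   # bottom row, then left
    bits = ''.join('1' if patch[i][j] >= center else '0' for i, j in order)
    return int(bits, 2)   # concatenated binary -> decimal label

patch = [[ 90, 120,  60],
         [ 50, 100, 140],
         [200,  30,  80]]
code = lbp_value(patch)   # binary '01010010' -> decimal 82
```

Sliding this over every pixel of a grayscale face and histogramming the resulting labels per grid cell is what LBPH then uses as the face descriptor.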
Fig. 8: Calculation of an integral image for a 5×5 grid region. The shaded subregion is the area of interest
To calculate the area sum V_R of a region of interest R (the shaded area in Fig. 8), we do not have to add up the value of every single point inside that area. Instead, the integral-image representation can be used to compute the sum easily, as shown in the following equation:

V_R = ii(4) + ii(1) − ii(2) − ii(3)

An oriented gradient map of an image comprises a 2D grid of bin numbers (zero-based, these numbers run from 0 to Nb − 1, where Nb is the number of angular bins). The HOG features for a region inside this map can be constructed using the map; the HOG feature of a region must be built in order to ascertain whether it contains the
Fig. 9: The HOG features of a region of interest are generated by connecting the HOG features of its four sub-areas. The four histograms are the features of the four overlapping areas indicated by dashed rectangles
R′ij = Rij · τ (9)
G′ij = Gij · τ (10)
Cij = {R′ij, G′ij, B′ij} (11)
where
τ = 1.4 if Yavg < 64; 0.6 if Yavg > 192; 1 otherwise (12)
To simplify computation, only the colors R and G are compensated. The chrominance Cr can accurately represent human skin; thus, to simplify computation, only Cr is taken into account when performing the color-space change. Cr is defined as follows:

Cr = 0.5R′ − 0.419G′ − 0.081B′

From the previous equation [74], it can be seen that R′ and G′ are the important factors due to their high weights, which is why only R and G are compensated to reduce computation. Human skin is then characterized by a binary matrix based on Cr and experimental findings:
IV. METHODOLOGY
A MATLAB-based face recognition system using image processing and neural networks
A novel method for recognizing human faces is presented in this research. By employing the two-dimensional discrete cosine transform (2D-DCT) to compress pictures and remove superfluous data from face photos, this method takes an image-based approach to artificial intelligence. Based on skin color, the DCT derives characteristics from photos of faces: DCT coefficients are calculated to create feature vectors, and the method's processing needs are drastically lowered by the use of a smaller feature space. To determine whether the object in the input picture is "present" or "not present" in the image dataset, the DCT-based feature vectors are classified into groups using a self-organizing map (SOM), which uses an unsupervised learning method. By categorizing the intensity levels of grayscale images into several categories, the SOM performs face recognition. In addition to these algorithms, MATLAB was used to evaluate the system. A different research study introduced a novel approach for identifying facial expressions. This method involves utilizing a two-dimensional discrete cosine transform (DCT) across the facial image to identify distinctive features. The authors employed a constructive one-hidden-layer feedforward neural network to classify facial expressions. To enhance the learning process and decrease network size without compromising performance, they incorporated a technique called input-side pruning, which they had previously proposed.
The 2D-DCT image compression method employs an intensity (grayscale) representation of the picture for further processing, relying on luminance rather than chrominance. Features based on the skin-color distribution that characterizes a human face are employed to divide the candidate regions into faces and non-faces. The second stage determines whether the subject in the input image is "present" or "not present" in the image dataset by classifying vectors into groups using a self-organizing map (SOM) and an unsupervised learning method. A non-linear luminance-based lighting compensation method is also used in the detection process, and it is particularly effective at boosting and restoring the natural colors in pictures.
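To illustrate the 2D-DCT feature extraction idea (this is a standard-library sketch, not the MATLAB pipeline of the cited study), the code below implements an un-normalized 2D DCT-II and keeps only the top-left, low-frequency coefficients, which is the "smaller feature space" the method relies on. The 4×4 block values are arbitrary.

```python
import math

def dct_matrix(n):
    # DCT-II basis (un-normalized): D[k][m] = cos(pi/n * (m + 0.5) * k)
    return [[math.cos(math.pi / n * (m + 0.5) * k) for m in range(n)]
            for k in range(n)]

def dct2(block):
    """2D DCT-II as two separable 1D transforms: C = D . A . D^T."""
    n = len(block)
    d = dct_matrix(n)
    tmp = [[sum(d[k][m] * block[m][j] for m in range(n)) for j in range(n)]
           for k in range(n)]                       # transform the rows
    return [[sum(tmp[k][m] * d[l][m] for m in range(n)) for l in range(n)]
            for k in range(n)]                      # then the columns

block = [[52, 55, 61, 66],
         [70, 61, 64, 73],
         [63, 59, 55, 90],
         [67, 61, 68, 104]]
coeffs = dct2(block)
# low-frequency (top-left) coefficients form a compact feature vector
features = [coeffs[i][j] for i in range(2) for j in range(2)]
```

Most of the image energy concentrates in those few coefficients, so the rest can be discarded before the vectors are fed to the SOM classifier.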
V. HARDWARE IMPLEMENTATION
Face Recognition System Based on Raspberry Pi Platform: This paper used a Raspberry Pi to implement its work on face recognition. The Raspberry Pi is like a minicomputer: a smaller version of a modern computer that is able to execute tasks efficiently. The module comes with a variety of processors, can run open-source operating systems and applications, and supports various programming languages such as Python, C, and C++. Numerous applications in which the analysis of human facial expressions is crucial may utilize the Raspberry Pi, which can detect facial emotions in real time using features provided by Python libraries such as OpenCV and NumPy. The NumPy library, short for Numerical Python, is used for scientific computing, so it is crucial for numerical calculations; it is considered the core of many other Python libraries derived from it, and it provides a high-performance multidimensional array object. The Raspberry Pi is the microprocessor used in this system, and an integral part of its memory is SQLite, the database in which the face data is stored. Face recognition normally works less appropriately when the background changes; however, in this paper a zero background effect is achieved by extracting only the face of the person from the stored images in the dataset, so recognition works appropriately even when the background or place changes. Another factor affecting the face recognition algorithm is tiny changes like beards, make-up, and lighting conditions. The algorithm of this system examines the structure of the person and every distinct edge of the human face; thus, the system detects the face reliably despite these tiny changes. The processing speed of the face recognition system is a crucial aspect. In this paper, images are saved in the dataset in XML format. An XML (Extensible Markup Language) file is used rather than a PNG (Portable Network Graphics) file, as XML is easier to handle and does not consume time during processing. Image processing also takes a shorter time because, throughout the process, only the face is extracted from the image, so other information in the image is not processed. Another integral factor in decreasing the time is gray scaling. As a result, four frames per second are processed by the system. On the other hand, identifying a moving person with a slow image recognition algorithm is a challenge faced in this paper.
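Gray scaling speeds things up because it reduces three channels to one before any further processing. A minimal sketch using the common ITU-R BT.601 luma weights (the paper does not state which weights its pipeline uses, so these are an assumption); the RGB pixels are toy values.

```python
def to_grayscale(rgb_row):
    """Collapse (R, G, B) triples to a single BT.601 luma value per pixel,
    cutting the data to be processed to one third."""
    return [round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in rgb_row]

# one toy scanline: pure red, green, blue, and white pixels
row = [(255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 255)]
gray = to_grayscale(row)
```

On a Raspberry Pi this per-pixel reduction, applied to every frame before detection, is a large part of why the system reaches its four frames per second.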
VI. CONCLUSION
Face recognition is a vital research topic in the image processing and computer vision fields because of both its theoretical and practical impact. Access control, image search, human-machine interfaces, security, and entertainment are just a few of the real-world applications of face recognition. However, these applications confront several difficulties, including lighting conditions. Numerous research articles on both software and hardware are highlighted in this document, along with a deep dig into the comprehension of the algorithms, concentrating mostly on methods based on local, holistic, and hybrid features.
VII. REFERENCES
1. Y. Said, M. Barr, and H. E. Ahmed, "Design of a face recognition system based on convolutional neural network (CNN)," Engineering, Technology & Applied Science Research, vol. 10, no. 3, pp. 5608–5612, 2020.
2. C. A. Hansen, "Face recognition," Institute for Computer Science, University of Tromso, 2017.
3. "... using principal component analysis (PCA) in MATLAB," pp. 115–119, Oct. 2016.
4. J. S. Rambey, N. Sinaga, and B. D. Waluyo, "Automatic door access system using face recognition," International
5. "AT&T database of faces: ORL face database." [Online]. Available: http://camorl.co.uk/facedatabase.html
6. M. Lal, K. Kumar, R. Hussain, Mait, S. Ali, and H. Shaikh, "Study of face recognition techniques: A survey," International Journal of Advanced Computer Science and