
Surveillance System using Computer Vision

STUDENT INTERNSHIP PROJECT REPORT

SUBMITTED BY
Anish Bose S S

Supervisor

Dr. R. Hemalatha
Associate Professor

DEPARTMENT OF ECE
SRI SIVASUBRAMANIYA NADAR COLLEGE OF ENGINEERING
KALAVAKKAM-603110
Date: 12-06-23

CERTIFICATE OF COMPLETION

This is to certify that the internship project titled “Surveillance System Using Computer
Vision” undertaken by Anish Bose S S of Sri Sairam Engineering College has been completed
as per the proposed aim and objectives.

Faculty In-charge: Dr. R. Hemalatha, Associate Professor
Head of the Department: Dr. P. Vijayalakshmi, Professor and Head

TABLE OF CONTENTS
1 PROJECT OVERVIEW
2 INTRODUCTION
3 OBJECTIVE
4 BLOCK DIAGRAM
5 PROPOSED SOLUTION
6 RESULT
REFERENCE

PROJECT OVERVIEW

I. PROJECT TITLE
Surveillance System Using Computer Vision

II. MAJOR RESEARCH AREA


MTCNN: Detects faces at various orientations and across various scales in images, and can even handle occlusions. It has no major drawback.
FaceNet Keras: A one-shot learning model. As a feature extractor, it produces 128-dimensional embedding vectors. It is well suited to situations where little training data is available, and retains good accuracy even then.
SVM: A support vector machine (SVM) constructs an optimal hyperplane that separates the classes of the training dataset based on the different facial features.
III. PROJECT DURATION
The project was carried out between the 29th of May and the 4th of August 2023.

INTRODUCTION

Surveillance System:
Surveillance is the monitoring of behaviour, activities, or information for the purpose of gathering intelligence, influencing, managing, or directing. It may entail remote observation using electronic equipment such as closed-circuit television (CCTV), the interception of electronically transmitted information such as Internet traffic, or straightforward technical methods such as postal interception and intelligence gathering from people.

Automated surveillance systems use cameras to monitor the environment. The observed scene is analysed through motion detection, crowd behaviour, individual behaviour, and the interactions between people, crowds, and their surroundings. These automated systems can perform a variety of tasks, such as detection, interpretation, understanding, and alert generation based on the analysis. By tuning different aspects of these systems, researchers have improved monitoring performance while reducing human error. This report takes a close look at video surveillance systems and their components, presents the most significant studies and designs in the area, and compares existing surveillance systems to give a broader picture.

Face Recognition:

A facial recognition system is a technology that can match a human face in an image or video frame against a database of faces. Such a system locates and measures facial features in an image and is typically used to authenticate users through ID verification services.

The face identification procedure requires only a device with digital photographic capability to capture the images and data needed to create and record the biometric facial pattern of the person to be identified.

The objective of face recognition is to match the face in an incoming image against a set of training images in a database. The great difficulty is carrying out this process in real time, something not all biometric face recognition software providers can offer.

OBJECTIVE

To build a face recognition system that recognises faces and identifies each person as known or unknown. The project can be deployed as an automatic attendance system in colleges, schools, and workplaces. A facial recognition attendance system can identify employees and confirm or deny admittance, which is crucial if a business handles sensitive data or expensive inventory: people outside the organisation, and unauthorised employees, cannot access the company's data.

The complete face recognition system is carried out by three major processes:

1. Face Detection

2. Feature extraction

3. Feature Matching

Face Detection:

The Face detection method locates any faces in the image, extracts them, and displays them
(or saves them as a compressed file to be used for feature extraction later on).

Methods used in face detection:

● Haar Cascade Face Detection
● Dlib (HOG) Face Detection
● Dlib (CNN) Face Detection
● MTCNN Face Detection

Feature Extraction:

Feature extraction is the fundamental and crucial first step of face recognition. It extracts the biological components of the face, the physical characteristics that vary from person to person. With the exception of identical twins, no two people share the same nodal points.

Methods used in feature extraction:

● VGG
● Face Recognition API
● FaceNet Keras

Feature Classification:
The features of the test data are categorised among the various classes of facial features in the
training data using a geometry-based or template-based algorithm called feature classification.
There are several statistical techniques that can be used to create these template-based
classifications.
● Euclidean Distance
● Cosine Similarity
● SVM
● KNN
● ANN
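As an illustration of the template-based metrics above, cosine similarity between two feature vectors can be computed directly with NumPy. This is a minimal sketch: the short vectors below are made-up stand-ins for real face embeddings.

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine of the angle between two feature vectors:
    # 1.0 means identical direction, 0.0 means orthogonal.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy stand-ins for two face feature vectors.
v1 = np.array([0.2, 0.8, 0.1])
v2 = np.array([0.25, 0.75, 0.12])

print(round(cosine_similarity(v1, v2), 3))  # close to 1.0 for similar vectors
```

Euclidean distance works the same way but measures straight-line separation rather than angle; both can serve as the similarity measure in feature matching.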
BLOCK DIAGRAM

[Block diagram: MTCNN face detection (P-Net → R-Net → O-Net) → FaceNet feature extraction → SVM classification]

PROJECT WORK
Process Flow Chart:

Face Detection → Feature Extraction → Feature Matching
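The three-stage flow above can be expressed as a simple composition. The function names below are illustrative placeholders, not the project's actual code; toy stand-ins are plugged in so the skeleton runs end to end.

```python
def recognise(image, detect, extract, match):
    """Run the three-stage pipeline: detect faces in the image,
    extract an embedding for each face, then match each embedding
    against the known classes."""
    results = []
    for face in detect(image):            # stage 1: face detection
        embedding = extract(face)         # stage 2: feature extraction
        results.append(match(embedding))  # stage 3: feature matching
    return results

# Toy stand-ins so the skeleton runs end to end.
fake_detect = lambda img: ["face_a", "face_b"]
fake_extract = lambda face: len(face)
fake_match = lambda emb: "known" if emb > 5 else "unknown"

print(recognise("frame", fake_detect, fake_extract, fake_match))  # → ['known', 'known']
```

In the actual system, the three slots are filled by MTCNN, FaceNet Keras, and the SVM classifier respectively.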

Face Detection (MTCNN) :


Among the many face detection methods, here we use the MTCNN algorithm, as it gives the most accurate results of the four methods listed above. It works for faces at various orientations in images, can detect faces across various scales, and can even handle occlusions.

It has no major drawback as such, but is comparatively slower than the HOG and Haar cascade methods. The first step is to resize the image to different scales in order to build an image pyramid, which is the input to the following three-stage cascaded network. The first stage is a fully convolutional network (FCN); the difference between a CNN and an FCN is that a fully convolutional network does not use a dense layer as part of the architecture. This Proposal Network is used to obtain candidate windows and their bounding box regression vectors.

Bounding box regression is a popular technique for predicting the localisation of boxes when the goal is to detect an object of some pre-defined class, in this case faces. After obtaining the bounding box vectors, some refinement is done to combine overlapping regions; the final output of this stage is the reduced set of candidate windows. All candidates from the P-Net are fed into the Refine Network. Note that this network is a CNN, not an FCN like the one before, since there is a dense layer at the last stage of its architecture.

The R-Net further reduces the number of candidates, performs calibration with bounding box regression, and employs non-maximum suppression (NMS) to merge overlapping candidates.
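Non-maximum suppression can be sketched as follows. This is a generic IoU-based NMS in NumPy, not MTCNN's exact implementation; the boxes and scores at the bottom are made-up examples.

```python
import numpy as np

def nms(boxes, scores, iou_threshold=0.5):
    """Keep the highest-scoring boxes, dropping any box whose
    overlap (IoU) with an already-kept box exceeds the threshold.
    Boxes are (x1, y1, x2, y2)."""
    order = np.argsort(scores)[::-1]  # indices, best score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        # Intersection rectangle between box i and the remaining boxes.
        xx1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        yy1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        xx2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        yy2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.maximum(0, xx2 - xx1) * np.maximum(0, yy2 - yy1)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter)
        order = rest[iou <= iou_threshold]  # discard heavy overlaps
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 10, 10], [20, 20, 30, 30]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
print(nms(boxes, scores))  # → [0, 2]: the two overlapping boxes collapse to one
```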

The final stage, the O-Net, outputs whether the input is a face or not, a 4-element vector giving the bounding box for the face, and a 10-element vector for facial landmark localisation.

OUTPUT OF MTCNN:

Proposal Network:
Fully convolutional network (FCN) is the first stage. A completely convolutional network
(FCN) does not include a dense layer as part of its architectural design, which is how it differs
from a CNN. To acquire candidate windows and their bounding box regression vectors, this
Proposal Network is used.

Refine Network:
The P-Net feeds the Refine Network with all candidates. Due to the presence of a dense layer at
the last stage of the network architecture, this network is a CNN rather than an FCN like the one
before.

Output Network:

Similar to the R-Net, this stage seeks to characterise the face in greater depth and outputs the locations of the five facial landmarks: the eyes, nose, and mouth corners.

Feature Extraction (FaceNet Keras):

Among the various feature extraction methods, here we use FaceNet Keras. FaceNet Keras is a one-shot learning model. As a feature extractor it produces a 128-dimensional embedding vector, and it is preferable in cases where training data is scarce, achieving good accuracy even then. FaceNet takes an image of a person's face as input and outputs a vector of 128 numbers representing the most important features of the face. In machine learning, this vector is called an embedding, because all the important information from the image is embedded into it. In essence, FaceNet compresses a person's face into a vector of 128 numbers. Ideally, embeddings of similar faces are also similar.
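The "similar faces have similar embeddings" property is typically exploited by thresholding the Euclidean distance between two L2-normalised 128-dimensional vectors. In this sketch the embeddings are random stand-ins, and the threshold of 1.0 is illustrative rather than the project's tuned value.

```python
import numpy as np

def l2_normalise(v):
    return v / np.linalg.norm(v)

def same_person(emb1, emb2, threshold=1.0):
    # Small distance between normalised embeddings => likely the same face.
    distance = np.linalg.norm(l2_normalise(emb1) - l2_normalise(emb2))
    return bool(distance < threshold)

rng = np.random.default_rng(0)
anna = rng.normal(size=128)                           # pretend embedding
anna_again = anna + rng.normal(scale=0.05, size=128)  # near-duplicate face
someone_else = rng.normal(size=128)                   # unrelated face

print(same_person(anna, anna_again))    # → True
print(same_person(anna, someone_else))  # → False
```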

Mapping high-dimensional data (like images) into low-dimensional representations (embeddings) has become a fairly common practice in machine learning these days. You can read more about embeddings in this lecture by Google.

Embeddings are vectors and we can interpret vectors as points in the Cartesian coordinate
system. That means we can plot an image of a face in the coordinate system using its
embeddings.

We don't directly tell FaceNet what the numbers in the vector should represent during training; we only require that the embedding vectors of similar faces are also similar. It's up to FaceNet to figure out how to represent faces with vectors so that the vectors of the same person are similar and the vectors of different people are not. For this to work, FaceNet must identify the key features of a person's face that separate it from other faces.

During training, FaceNet tries out many different combinations of these features until it finds the ones that work best. FaceNet (and neural networks in general) do not represent features in an image the same way we do, which is why these vectors are hard to interpret; but something like the distance between the eyes is very likely hidden behind the numbers in an embedding vector.

FEATURE EXTRACTION FROM FACE:

FACENET KERAS:

Feature Matching:

Feature matching, or feature classification, is a geometry-based or template-based algorithm used to classify the features of the test data among the different classes of facial features in the training data. These template-based classifications are possible using various statistical approaches.

Here we use an SVM (support vector machine), which creates an optimal hyperplane to separate the classes of the training dataset based on the different features of the face. The dimensionality of the hyperplane is one less than the number of features. Different kernels can be applied to see which features the classifier uses, and unneeded features can then be removed; this can help to improve speed.

In difference space, we are interested in two classes: the dissimilarities between images of the same person, and the dissimilarities between images of different people. These two classes are the input to an SVM algorithm, which generates a decision surface separating them. For face recognition, we re-interpret the decision surface to produce a similarity metric between two facial images, which allows us to construct face recognition algorithms. The work of Moghaddam et al. uses a Bayesian method in a difference space, but does not derive a similarity distance from both positive and negative samples.

We demonstrate our SVM-based algorithm on both verification and identification applications. In identification, the algorithm is presented with an image of an unknown person and reports its best estimate of that person's identity from a database of known individuals; in a more general response, it reports a list of the most similar individuals in the database. In verification (also referred to as authentication), the algorithm is presented with an image and a claimed identity of the person, and either accepts or rejects the claim, or returns a confidence measure of the claim's validity.

To provide a benchmark for comparison, we compared our algorithm with a principal component analysis (PCA) based algorithm. We report results on images from the FERET database, the de facto standard in the face recognition community. From our experience with the FERET database, we selected harder sets of images on which to test the algorithms; thus we avoided saturating the performance of either algorithm and provided a robust comparison between them. To test the ability of our algorithm to generalise to new faces, we trained and tested the algorithms on separate sets of faces.
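The identification and verification modes described above can be sketched over a small gallery of embeddings. Only NumPy is used; the names, embeddings, and the 0.8 threshold are illustrative stand-ins.

```python
import numpy as np

def identify(query, database):
    """Identification: return the known identity whose stored
    embedding is closest to the query embedding."""
    names = list(database)
    dists = [np.linalg.norm(query - database[n]) for n in names]
    return names[int(np.argmin(dists))]

def verify(query, claimed, database, threshold=0.8):
    """Verification: accept or reject a claimed identity by
    thresholding the distance to that identity's stored embedding."""
    return bool(np.linalg.norm(query - database[claimed]) < threshold)

rng = np.random.default_rng(1)
db = {"alice": rng.normal(size=128), "bob": rng.normal(size=128)}  # toy gallery

probe = db["alice"] + rng.normal(scale=0.05, size=128)  # noisy image of alice
print(identify(probe, db))       # → alice
print(verify(probe, "bob", db))  # → False: the claim is rejected
```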

SVM-Support Vector Machine:

DATASET

No. of persons   Training images   Testing images
30               360               90
50               653               163

Software used:

• Google Colab for testing and training
• PyCharm for face recognition

Result:

TABLE FOR ACCURACY:

No. of persons   Training accuracy   Testing accuracy
30               99.9986%            99.867%
50               96.4723%            94.512%

Reference:
• G. Al-Muhaidhri and J. Hussain, “Smart Attendance System using Face Recognition”, International Journal of Engineering Research & Technology (IJERT), vol. 8, pp. 51-54, 2017.
• P. Chakraborty, C. Muzammel, M. Khatun, Sk. Islam and S. Rahman, “Automatic Student Attendance System Using Face Recognition”, International Journal of Engineering and Advanced Technology (IJEAT), vol. 9, pp. 93-99, 2020.
• T. Ma, Q. Ji and N. Li, “Scene invariant crowd counting using multi-scale head detection in video surveillance”, IET Image Processing, 2015.
• M. Shani, C. Nanda, A. Sahu and B. Pattnaik, “Web-Based Online Embedded Door Access Control and Home Security System Based on Face Recognition”, 2017.
• B. Surekha, K. Nazare, S. Viswanadha Raju and N. Dey, “Attendance Recording System Using Partial Face Recognition Algorithm”, Intelligent Techniques in Signal Processing for Multimedia Security, vol. 660, pp. 293-319, 2016. doi: 10.1007/978-3-319-44790-2_14.
• D. Joseph, M. Mathew, T. Mathew, V. Vasappan and B. S. Mony, “Automatic Attendance System using Face Recognition”, International Journal for Research in Applied Science and Engineering Technology, vol. 8, pp. 769-773, 2020. doi: 10.22214/ijraset.2020.30309.
• R. S. Sabeenian, S. Aravind, P. Arunkumar, P. Joshua and G. Eswarraj, “Smart Attendance System Using Face Recognition”, Journal of Advanced Research in Dynamical and Control Systems, vol. 12, pp. 1079-1084, 2020. doi: 10.5373/JARDCS/V12SP5/20201860.
• Z.-W. Yuan and J. Zhang, “Feature extraction and image retrieval based on AlexNet”, in Eighth International Conference on Digital Image Processing (ICDIP 2016), vol. 10033. International Society for Optics and Photonics, 2016, p. 100330E.
• H. Qassim, A. Verma and D. Feinzimer, “Compressed residual-VGG16 CNN model for big data places image recognition”, in 2018 IEEE 8th Annual Computing and Communication Workshop and Conference (CCWC). IEEE, 2018, pp. 169-175.
• S. Khan, M. H. Javed, E. Ahmed, S. A. Shah and S. U. Ali, “Facial recognition using convolutional neural networks and implementation on smart glasses”, in 2019 International Conference on Information Science and Communication Technology.
• E.-J. Cheng, K.-P. Chou, S. Rajora, B.-H. Jin, M. Tanveer, C.-T. Lin, K.-Y. Young, W.-C. Lin and M. Prasad, “Deep sparse representation classifier for facial recognition and detection system”, Pattern Recognition Letters.
