
REINFORCING ATM SECURITY THROUGH

DEEP LEARNING BASED THEFT DETECTION AND WEAPON


IDENTIFICATION

A PROJECT REPORT

Submitted by

G. VISHNU RITHAN 912420104041


S. RAJESH KUMAR 912420104027
P. SAKTHIVEL 912420104028

in partial fulfilment for the award of the degree


of

BACHELOR OF ENGINEERING
IN

COMPUTER SCIENCE AND ENGINEERING

SHANMUGANATHAN ENGINEERING COLLEGE


ARASAMPATTI

ANNA UNIVERSITY :: CHENNAI 600 025

MAY 2024

ANNA UNIVERSITY :: CHENNAI 600 025

BONAFIDE CERTIFICATE

Certified that this project report “REINFORCING ATM SECURITY


THROUGH DEEP LEARNING BASED THEFT DETECTION AND
WEAPON IDENTIFICATION” is the bonafide work of “G. VISHNU RITHAN
(912420104041), S. RAJESH KUMAR (912420104026), P. SAKTHIVEL
(912420104028)” who carried out the project work under my supervision.

SUPERVISOR HEAD OF THE DEPARTMENT


Mrs. G. Sasireka, M.E., Dr. S. Saravanakumar.,
Assistant Professor, Department of Computer
Department of Computer Science and Engineering,
Science and Engineering, Shanmuganathan Engineering
Shanmuganathan Engineering College,
College, Arasampatti-622507.
Arasampatti-622507.

Submitted for the viva voce to be held on _____________

INTERNAL EXAMINER EXTERNAL EXAMINER

ACKNOWLEDGEMENT

"HARD WORK NEVER FAILS." So we thank God for having gracefully
blessed us to come this far, thereby giving us the strength and courage to
complete the project successfully. We sincerely submit this project at the
Almighty's lotus feet.
We wish to acknowledge with thanks the significant contribution made
by the management of our college: Chairman "KALLVI VALLAL
Mrs. Pichappa Valliammai", Correspondent Dr. P. Manikandan, and Secretary
Mr. Vishvanathan, Shanmuganathan Engineering College, Arasampatti, for
their extensive support.
We convey our humble thanks to our energetic Principal
Dr. KL. Muthuramu M.E., (W.R), M.E.(S.E), M.I.S.T.E., F.I.E., Ph.D., for his
moral support in completing this project.
We are grateful to our Head of the Department Dr. S. Saravanakumar
for his valuable guidance during the course of the project.
Our gratitude will never fail towards our dynamic and effective internal
guide Mrs. G. Sasireka M.E., for her unending help provided to us
throughout the project and during code debugging.
We wish to acknowledge the department project coordinator
Mr. R. Rajapandian M.E., for his valuable and innovative suggestions,
constructive instruction, and constant encouragement during the project
proceedings.
We also convey our humble thanks to all the staff members of the CSE
Department who contributed to steering our project towards successful
completion.
Finally, we thank our family members who have spent their wealth on our
studies and have motivated us to complete this project successfully.

ABSTRACT

The integration of embedded systems and facial recognition technology


represents a groundbreaking advancement in ATM security, particularly within
the framework of the New Wave paradigm. This innovative approach aims to
bolster authentication processes by employing facial recognition as a primary
means of identity verification. By leveraging embedded systems within ATM
infrastructure, such as specialized microcontrollers or processors, the system can
efficiently process and analyze facial features in real-time, ensuring high
accuracy and reliability. This integration not only addresses traditional security
vulnerabilities associated with PINs or cards but also offers a seamless and
convenient user experience. Moreover, the incorporation of facial recognition
adds an extra layer of security, significantly raising the bar for unauthorized
access to sensitive financial transactions. The integration of facial recognition
technology not only strengthens authentication processes but also deters
fraudulent activities, as potential attackers are less likely to attempt unauthorized
access when faced with biometric barriers. This abstract highlights the potential
of New Wave ATM security to revolutionize banking security, providing
robust protection against fraud while offering customers a frictionless and
secure banking experience. The adoption of these advanced security systems is
poised to transform the landscape of financial security, enhancing transaction
safety and fostering greater trust in banking systems worldwide.

KEYWORDS: ATM, Face detection, Convolutional Neural Network Algorithm
TABLE OF CONTENTS

CHAPTER NO TITLE PAGE NO


ABSTRACT iv
LIST OF FIGURES viii
LIST OF ABBREVIATIONS ix
1. INTRODUCTION 1
1.1 Artificial Intelligence 2
1.2 Basics 2
1.3 Deep Learning 4
2. LITERATURE SURVEY 5
2.1 Title: An IOT Based ATM 5
Surveillance System
2.1.1 Advantages 5
2.1.2 Disadvantages 5
2.2 Title: Advanced ATM Security 6
System Using Arduino
2.2.1 Advantages 6
2.2.2 Disadvantages 6
2.3 Title: A Novel Method of ATM 7
Anti-Theft Design
2.3.1 Advantages 7
2.3.2 Disadvantages 7
2.4 Title: IOT-Based ATM Pin Entry 8
by Random Word Generator Using
Design Thinking Framework
2.4.1 Advantages 8
2.4.2 Disadvantages 8
2.5 Title: A Method of Small Face 9
Detection Based On CNN

2.5.1 Advantages 9
2.5.2 Disadvantages 9
3. SYSTEM DESIGN 10
3.1 Existing system 10
3.1.1 Disadvantages 10
3.2 Proposed system 11
3.2.1 Advantages 12
3.3 Overall system architecture 13
3.4 Module list 14
3.4.1 Module description 14
3.4.1.1 Image Dataset Collection 14
3.4.1.2 Image Preprocessing 14
3.4.1.3 Importing Modules 14
3.4.1.4 Training Dataset 15
3.4.1.5 Camera Interfacing 15
4. SYSTEM DESIGN 16

4.1 UML Diagrams 16


4.2 Types of UML Diagram 16
4.2.1 Usecase diagram 17
4.2.2 Class diagram 18
4.2.3 Architecture Diagram 19
4.2.4 Data flow diagram 20
4.2.5 Activity diagram 21
5. IMPLEMENTATION 22
5.1 Algorithm 22
5.2 Layers in CNN 22
5.2.1 Input Layer 22
5.2.2 Convo Layer (Convo) 22
5.2.3 Fully Connected (FC) Layer 23
5.2.4 Pooling Layer 23
5.2.5 Soft Max/Logistic Layer 23
5.2.6 Output Layer 24
5.3 System Specification 24
5.3.1 Hardware Requirements 25
5.3.2 Software Requirements 25
5.4 Software Description 26
5.4.1 Python 27
5.4.2 Opencv 27
5.4.3 Python Pillow 29
5.5 Hardware Description 30
5.5.1 Power Supply 30
5.5.2 Linear Power Supply 31
5.5.3 Rectifier 32
5.5.4 Accelerometer 32
6. CONCLUSION AND FUTURE WORKS 34
6.1 Result and Discussion 34
6.2 Conclusion 35
6.3 Future Works 36
APPENDIXI 37
APPENDIX II 50
REFERENCE 53

LIST OF FIGURES

FIG NO FIGURE NAME PAGE NO


3.1 System architecture 13
4.1 Use case diagram 17
4.2 Class diagram 18
4.3 Architecture Diagram 19
4.4 Data flow diagram 20
4.5 Activity diagram 21

LIST OF ABBREVIATIONS

ATM - Automated Teller Machine

AI - Artificial Intelligence

TV - Television

CCTV - Closed-Circuit Television

GSM - Global System for Mobile Communication

GPS - Global Positioning System

MTCNN - Multi-task Convolutional Neural Network

SMS - Short Message Service

UML - Unified Modeling Language

CNN - Convolutional Neural Networks

FC - Fully connected

OS - Operating system

PIL - Python Imaging Library

CHAPTER 1

INTRODUCTION

In an era marked by rapid technological advancements and increasing


cybersecurity threats, traditional ATM security measures have become
insufficient to combat sophisticated attacks. To address this challenge, the
integration of embedded systems and facial recognition technology into ATM
security represents a new wave of innovation. This approach not only enhances
the security posture of ATMs but also elevates the user experience by providing
seamless and efficient authentication methods. Embedded systems serve as the
backbone of modern ATMs, offering a robust platform for integrating various
security features. These systems, comprising specialized hardware and software
components, enable ATMs to perform complex operations securely and
efficiently. By leveraging embedded systems, ATM manufacturers can
implement advanced encryption algorithms, secure boot mechanisms, and
tamper-resistant hardware, mitigating the risk of unauthorized access and data
breaches. Furthermore, embedded systems facilitate real-time monitoring and
remote management capabilities, empowering financial institutions to
proactively detect and respond to security incidents. Facial recognition
technology represents a cutting-edge innovation that holds immense potential in
revolutionizing ATM security. By employing sophisticated algorithms and high-
resolution cameras, ATMs can accurately identify users based on their unique
facial features. Unlike traditional authentication methods such as PINs or
magnetic stripe cards, facial recognition offers a seamless and frictionless
experience, eliminating the need for physical tokens or passwords. Moreover,
facial recognition enhances security by reducing the risk of identity theft and
card skimming, as users are authenticated based on biometric characteristics that
are difficult to replicate or forge.

1.1 ARTIFICIAL INTELLIGENCE
Artificial intelligence (AI) is the ability of a digital computer or computer-
controlled robot to perform tasks commonly associated with intelligent beings.
The term is frequently applied to the project of developing systems endowed
with the intellectual processes characteristic of humans, such as the ability to
reason, discover meaning, generalize, or learn from past experience. Since the
development of the digital computer in the 1940s, it has been demonstrated that
computers can be programmed to carry out very complex tasks—as, for
example, discovering proofs for mathematical theorems or playing chess—with
great proficiency. Still, despite continuing advances in computer processing
speed and memory capacity, there are as yet no programs that can match human
flexibility over wider domains or in tasks requiring much everyday knowledge.
On the other hand, some programs have attained the performance levels of
human experts and professionals in performing certain specific tasks, so that
artificial intelligence in this limited sense is found in applications as diverse as
medical diagnosis, computer search engines, and voice or handwriting
recognition.

1.2 BASICS
A typical AI analyzes its environment and takes actions that maximize its
chance of success. An AI's intended utility function (or goal) can be simple ("1
if the AI wins a game of Go, 0 otherwise") or complex ("Perform actions
mathematically similar to ones that succeeded in the past"). Goals can be explicitly
defined, or induced. If the AI is programmed for "reinforcement learning", goals
can be implicitly induced by rewarding some types of behavior or punishing
others. Alternatively, an evolutionary system can induce goals by using a
"fitness function" to mutate and preferentially replicate high-scoring AI systems,
similarly to how animals evolved to innately desire certain goals such as finding
food. Some AI systems, such as nearest-neighbor, instead of reason by analogy,
2
these systems are not generally given goals, except to the degree that goals are
implicit in their training data. Such systems can still be benchmarked if the non-
goal system is framed as a system whose "goal" is to successfully accomplish its
narrow classification task. AI often revolves around the use of algorithms. An
algorithm is a set of unambiguous instructions that a mechanical computer can
execute. A complex algorithm is often built on top of other, simpler, algorithms.
A simple example of an algorithm is an optimal (for the first player)
recipe for playing tic-tac-toe. Many AI algorithms are capable of learning from
data; they can enhance themselves by learning new heuristics (strategies, or
"rules of thumb", that have worked well in the past), or can themselves write
other algorithms. Some of the "learners" described below, including Bayesian
networks, decision trees, and nearest-neighbor, could theoretically, (given
infinite data, time, and memory) learn to approximate any function, including
which combination of mathematical functions would best describe the world.
These learners could therefore, derive all possible knowledge, by considering
every possible hypothesis and matching them against the data. In practice, it is
almost never possible to consider every possibility, because of the phenomenon
of "combinatorial explosion", where the amount of time needed to solve a
problem grows exponentially. Much of AI research involves figuring out how to
identify and avoid considering a broad range of possibilities that are unlikely to be
beneficial. For example, when viewing a map and looking for the shortest
driving route from Denver to New York in the East, one can in most cases skip
looking at any path through San Francisco or other areas far to the West; thus,
an AI wielding a pathfinding algorithm like A* can avoid the combinatorial
explosion that would ensue if every possible route had to be ponderously
considered in turn. The earliest (and easiest to understand) approach to AI was
symbolism (such as formal logic): "If an otherwise healthy adult has a fever,
then they may have influenza". A second, more general, approach is Bayesian
inference: "If the current patient has a fever, adjust the probability they have

influenza in such-and-such a way". The third major approach, extremely popular
in routine business AI applications, is analogizers such as SVM and nearest-
neighbor: "After examining the records of known past patients whose
temperature, symptoms, age, and other factors mostly match the current patient,
X% of those patients turned out to have influenza". A fourth approach is harder
to intuitively understand, but is inspired by how the brain's machinery works:
the artificial neural network approach uses artificial "neurons" that can learn by
comparing the network's output to the desired output and altering the strengths of the
connections between its internal neurons to "reinforce" connections that seem
to be useful. These four main approaches can overlap with each other and with
evolutionary systems; for example, neural nets can learn to make inferences, to
generalize, and to make analogies. Some systems implicitly or explicitly use
multiple of these approaches, alongside many other AI and non-AI algorithms;
the best approach is often different depending on the problem.

1.3 DEEP LEARNING


Deep learning is a machine learning technique that teaches computers to
do what comes naturally to humans: learn by example. Deep learning is a key
technology behind driverless cars, enabling them to recognize a stop sign, or to
distinguish a pedestrian from a lamppost. It is the key to voice control in
consumer devices like phones, tablets, TVs, and hands-free speakers. Deep
learning is getting lots of attention lately and for good reason. It’s achieving
results that were not possible before In Deep learning, a computer model learns
to perform classification tasks directly from images, text, or sound. Deep
learning models can achieve state-of-the-art accuracy, sometimes exceeding
human-level performance. Models are trained by using a large set of labeled
data and neural network architectures that contain many layers.

CHAPTER 2

LITERATURE SURVEY

2.1 TITLE : AN IOT BASED ATM SURVEILLANCE SYSTEM


AUTHOR : V JACINTHA
YEAR : 2017
In the present scenario, the majority of the population uses the ATM machine
to withdraw cash. At the same time, many ATM robberies have occurred in
many areas, even when CCTV cameras are placed in the ATM centre. Hence
the security system needs to be changed. In order to reduce these kinds of
robberies, we present a security system against ATM theft using a smart
and effective technology. This system also analyses various physical attacks on
ATMs. In the proposed system, a face-recognizing camera is used to capture the
face of the person entering. Tilt and vibration sensors are used to detect
irregular activities performed on the ATM machine. The purpose of the
temperature sensor is to determine the temperature inside the ATM booth. The
main aim of the proposed system is to send an alert through media such as
Facebook, Twitter, and Gmail using IoT and the GSM network. A chloroform
"liquidator" is used to spray chloroform to render the thief unconscious. This
system provides realistic monitoring and control.

2.1.1 ADVANTAGES
• Potential deterrence of theft through the use of incapacitating substances.

2.1.2 DISADVANTAGES
• Privacy concerns with face recognition and social media alert systems.

2.2 TITLE : ADVANCED ATM SECURITY SYSTEM USING ARDUINO
AUTHOR : SAKSHI TAKKAR
YEAR : 2021
The ATM machine is a great piece of technology used by millions of people
around the world. It makes our daily transactions easier without loading up the
banking systems. However, it is important to keep ATMs secure from theft and
other malicious activities. The traditional ATM system uses a PIN (personal
identification number) for authentication, and nowadays a smart card with a
magnetic stripe plus a PIN is in use, thereby taking care of security on the
user's side. Securing ATMs on the bank's side, however, needs a lot of work;
merely deploying security personnel at the ATM is not enough. This project
comprises an advanced security system that can monitor and activate various
security measures in case of robbery and theft. The system detects malicious
activities inside the ATM booth, checks different security parameters, and
keeps the concerned authorities updated. It uses sensors such as a reed switch,
an ultrasonic sensor, and cameras to do so. If any unauthorized person tries to
move or open the ATM machine, the reed switch opens the circuit. The
ultrasonic sensor is used to sense the presence of an intruder. If there is any
change in these two parameters, the surveillance camera takes a picture, and
the concerned authority is informed via SMS and can view the face of the
intruder via an IP address.

2.2.1 ADVANTAGES
• Real-time monitoring of ATM security parameters.

2.2.2 DISADVANTAGES
• Potential false alarms from sensors.

2.3 TITLE : A NOVEL METHOD OF ATM ANTI-THEFT DESIGN
USING SYSTEM ON CHIP
AUTHOR : MRIDUL SHUKLA
YEAR : 2023
The idea of designing this Automated Teller Machine security/anti-theft
system originated from observing real-life incidents that are prevalent around
us. As the number of ATMs has increased abruptly, there is an obvious
shortcoming in their security. This ATM anti-theft system deals with
preventing robberies at ATMs, which are often reported very late, and thus
helps the security agencies catch the culprits. The system is built using an
embedded system-on-chip (SoC) to overcome the problem of late reporting and
delayed investigation. If the money tray in the ATM is tampered with, the
system is activated: the ESP-32 senses that something is wrong and sends a
signal through the H12E encoder, triggering the camera to photograph the
culprits while simultaneously alerting the nearest police station and other
concerned authorities via the GSM module and sounding the buzzer. In this
project, an RF module, GSM module, ESP-32, H12E encoder, microcontroller,
and camera are used. A GPS module is used to track the location of the robbed
ATM.

2.3.1 ADVANTAGES
• Real-time alerts to authorities and activation of surveillance measures
upon detecting suspicious activity.

2.3.2 DISADVANTAGES
• Complexity in system integration and maintenance.

2.4 TITLE : IOT-BASED ATM PIN ENTRY BY RANDOM WORD
GENERATOR USING DESIGN THINKING FRAMEWORK
AUTHOR : ARUNA
YEAR : 2023
The method proposed by Xue Yang et al. [15] realizes an end-to-end detection
framework for ship detection called Rotation Dense Feature Pyramid
Networks (R-DFPN). The framework is based on a multi-scale detection
network, using a dense feature pyramid network, rotation anchors, multi-scale
ROI Align, and other structures. Compared with other rotation-region
detection methods such as RRPN and R2CNN, this framework is more
suitable for ship detection tasks and has achieved state-of-the-art
performance. It is a new ship detection framework based on rotation regions
that can handle different complex scenes, detect densely packed objects, and
reduce redundant detection regions. The densely connected feature pyramid,
built on a multi-scale detection framework, enhances feature propagation,
encourages feature reuse, and ensures the effectiveness of detecting
multi-scale objects. Multi-scale ROI Align is used instead of ROI pooling to
solve the problem of feature misalignment and to obtain fixed-length features
and regression bounding boxes that fully preserve semantic and spatial
information through the horizontal circumscribed rectangle of each proposal.

2.4.1 ADVANTAGES
• Ability to handle complex scenes and detect densely arranged targets.

2.4.2 DISADVANTAGES
• Possible challenges in parameter tuning and model optimization.

2.5 TITLE : A METHOD OF SMALL FACE DETECTION BASED ON
CNN
AUTHOR : QINGYU ZHANG
YEAR : 2019
Under existing technology, owing to the limitations of some scenes, image data
will exhibit illumination changes, blurring, occlusion, low resolution, and other
issues. These problems pose great challenges for face detection. At present,
many algorithm models can perform face detection well under frontal,
high-resolution conditions. However, most faces in real scenes are lateral and
of low resolution, and for this kind of face detection the existing models face
problems of accuracy and real-time performance. In this paper, various face
detection algorithm models are studied and analyzed in depth. Balancing the
accuracy and speed of the algorithm model, the paper designs a face detection
algorithm based on the MTCNN (Multi-task Convolutional Neural Network)
model. The algorithm is tested on Wider Face, the most commonly used
dataset in the field of face detection. The results show that the algorithm is
superior to other algorithms in the accuracy and speed of face detection.

2.5.1 ADVANTAGES
• Utilizes the MTCNN network model, known for its effectiveness in
detecting faces.

2.5.2 DISADVANTAGES
• The algorithm's performance may vary depending on the quality and
characteristics of the input image data.

CHAPTER 3

SYSTEM DESIGN

3.1 EXISTING SYSTEM


• Dataset Construction: Developed a Chinese face dataset (UCEC-Face)
captured in uncontrolled classroom environments using 35 real
surveillance videos, comprising 7395 images of 130 subjects, including
44 males and 86 females.
• Model Utilization: Employed four existing face verification models
including Open Face, Arc Face, and VGG-Face for gender, expression,
and age recognition tasks on the UCEC-Face dataset.
• Evaluation: Evaluated the performance of the existing face verification
models on the UCEC-Face dataset to assess their effectiveness in
uncontrolled environments.
• Challenges Identified: Identified that the UCEC-Face dataset poses
greater challenges for face verification due to its closer resemblance to
real-world scenarios, leading to lower accuracy rates compared to other
datasets.
• Room for Improvement: Highlighted the need for further research and
development to enhance the performance of existing face verification
models specifically for Asian faces in uncontrolled environments,
indicating a significant gap between current model capabilities and real-
world application requirements.

3.1.1 DISADVANTAGE OF EXISTING SYSTEM


• Limited Dataset Diversity: The constructed UCEC-Face dataset may not
fully represent the diversity of Asian faces in uncontrolled environments,
potentially leading to biased model performance and generalization
issues.
• Small Sample Size: With only 7395 images of 130 subjects, the dataset's
relatively small sample size may restrict the robustness and reliability of
model training and evaluation, particularly for deep learning algorithms
that often require large amounts of data.
• Dependency on Surveillance Setup: The reliance on surveillance videos
for data collection introduces dependencies on factors like camera
placement, lighting conditions, and video quality, which may not
accurately reflect the variability encountered in real-world scenarios.
• Performance Gap with Existing Models: The experimental results
indicate that the best performance achieved by existing face verification
models on the UCEC-Face dataset is notably lower compared to other
datasets, suggesting limitations in the applicability of these models to
uncontrolled environments.
• Scalability Challenges: Constructing and annotating datasets from real-
world surveillance videos can be labor-intensive and time-consuming.
Scaling up the dataset collection process to encompass a broader range of
environments and demographics may pose logistical challenges.

3.2 PROPOSED SYSTEM


• Embedded System Integration: The proposed system integrates
embedded systems within ATM infrastructure, utilizing specialized
microcontrollers or processors to efficiently process and analyze facial
features in real-time.
• Facial Recognition Technology: Leveraging facial recognition as a
primary means of identity verification, the system aims to enhance
authentication processes, ensuring high accuracy and reliability.

• Addressing Security Vulnerabilities: By moving away from traditional
security measures like PINs or cards, the system addresses inherent
vulnerabilities, offering a more secure environment for financial
transactions.
• Seamless User Experience: The incorporation of facial recognition
technology not only enhances security but also provides a seamless and
convenient user experience, eliminating the need for physical
authentication tokens.
• Enhanced Security Measures: With facial recognition adding an extra
layer of security, the proposed system significantly raises the bar for
unauthorized access to sensitive financial transactions, fostering greater
trust in banking systems worldwide.

3.2.1 ADVANTAGES OF PROPOSED SYSTEM

• Improved Accuracy: Deep learning models can be trained on vast


amounts of data, enabling them to accurately identify suspicious
behaviour or objects with high precision, minimizing false alarms.
• Real-time Detection: Deep learning algorithms can analyse video
feeds in real-time, allowing for immediate detection and response to
potential threats, such as theft attempts or the presence of weapons.
• Enhanced Security: By detecting theft attempts and identifying
weapons, deep learning-based systems can deter criminals and reduce
the risk of violent incidents at ATMs, making them safer for customers
and reducing the likelihood of financial losses.
• Scalability: Once deployed, deep learning models can easily scale to
monitor multiple ATMs simultaneously, providing comprehensive
security coverage across a network of machines without significant
infrastructure.

3.3 SYSTEM ARCHITECTURE

The figure below gives a high-level view of the ATM security system's
architecture, illustrating the interaction between the embedded system, the
facial recognition module, and the other components, along with the
communication protocols and data flow between them.

[Block diagram: a common power supply feeds the camera, accelerometer, and
microcontroller; the camera connects to the PC over a USB-to-UART link;
weapon detection and face recognition run on the PC, which updates an Excel
log, sounds a beep, and sends email/SMS alerts.]

Fig.No 3.1 System architecture
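The alerting path in Fig. 3.1 (log to Excel/CSV, beep, email/SMS) can be sketched as below. This is a minimal illustration, not the project's actual code: the file name, column layout, recipient address, and function names are all assumptions, and the email is composed but not sent.

```python
# Sketch of the alerting path in Fig. 3.1: when the model flags a frame,
# the event is appended to a log file and an e-mail alert is built.
# File, column, and recipient names are illustrative, not from the report.
import csv
from datetime import datetime
from email.message import EmailMessage

LOG_FILE = "atm_events.csv"  # hypothetical log name

def log_event(label, confidence, log_file=LOG_FILE):
    """Append one detection event to the CSV log and return the row."""
    row = [datetime.now().isoformat(timespec="seconds"), label, f"{confidence:.2f}"]
    with open(log_file, "a", newline="") as f:
        csv.writer(f).writerow(row)
    return row

def build_alert(label, recipient="security@bank.example"):
    """Compose (but do not send) the alert e-mail for a flagged frame."""
    msg = EmailMessage()
    msg["Subject"] = f"ATM ALERT: {label} detected"
    msg["To"] = recipient
    msg.set_content(f"The surveillance model flagged '{label}'. Please review the footage.")
    return msg  # actually sending it would use smtplib.SMTP(...).send_message(msg)

event = log_event("weapon", 0.93)
alert = build_alert("weapon")
print(alert["Subject"])  # ATM ALERT: weapon detected
```

In the deployed system the same event would also trigger the beep and the SMS path shown in the diagram.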

3.4 MODULES USED

• IMAGE DATASET COLLECTION


• IMAGE PREPROCESSING
• IMPORTING MODULES
• TRAINING DATASET
• CAMERA INTERFACING

3.4.1 MODULE DESCRIPTION

3.4.1.1 IMAGE DATASET COLLECTION


For this project, we must gather every image that depicts a theft attempt or a
weapon. This is the project's most crucial step. All of the visuals come from
real-time or recorded CCTV footage. The following procedures can be carried
out once we have the data.

3.4.1.2 IMAGE PREPROCESSING


After gathering all the images, pre-processing is required, since not all images
convey information clearly. We prepare the images by renaming, resizing, and
labelling them. Once this procedure is complete, we can use the images to
train our deep learning model.
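The resize-and-normalize step can be sketched as follows. This is a minimal illustration under stated assumptions (64x64 target size, a hand-rolled nearest-neighbour resize); the actual pipeline would typically use OpenCV or Pillow for resizing, and labelling by folder name.

```python
# Minimal preprocessing sketch: resize each frame to a fixed shape and
# scale pixel values to [0, 1] before training. Sizes are assumptions.
import numpy as np

def resize_nearest(img, out_h, out_w):
    """Nearest-neighbour resize of an HxW(xC) image array."""
    h, w = img.shape[:2]
    rows = np.arange(out_h) * h // out_h   # source row for each output row
    cols = np.arange(out_w) * w // out_w   # source column for each output column
    return img[rows][:, cols]

def preprocess(img, size=(64, 64)):
    """Resize to a fixed shape and scale pixel values to [0, 1]."""
    small = resize_nearest(img, *size)
    return small.astype("float32") / 255.0

frame = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)  # fake CCTV frame
x = preprocess(frame)
print(x.shape)  # (64, 64, 3)
```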
3.4.1.3 IMPORTING MODULES
Following that, we must import all of the required library files. Library
files are collections of functions and small pieces of executable code. These
library files assist us in performing all of the necessary steps of object
detection and image processing. In this project we use important libraries such
as NumPy, TensorFlow, OpenCV, and Keras. These libraries help make our
deep learning model more efficient and adaptable for processing real-time
images or videos.
The specific libraries we use for data preprocessing are:

1. NumPy ➔ mathematical operations
2. TensorFlow ➔ dataset training
3. OpenCV ➔ using the camera
4. Keras ➔ high-level neural network API running on top of TensorFlow

3.4.1.4 TRAINING DATASET


During training, the CNN is fed a large dataset of images labelled with their
corresponding class labels. The network's weights are initially assigned random
values; it then processes each image and compares its prediction with the class
label of the input image, adjusting the weights to reduce the error.
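The compare-and-update cycle described above can be illustrated with a toy stand-in: a single softmax layer with randomly initialized weights, trained by comparing its predictions against the class labels. This is a deliberate simplification for illustration; the project's actual model is a multi-layer CNN, and the data here is random.

```python
# Toy illustration of the training step: weights start random, the model
# scores each image, the score is compared with the class label, and the
# error drives the weight update. A single linear softmax layer stands in
# for the full CNN (an assumption made for brevity).
import numpy as np

rng = np.random.default_rng(0)
n_classes, n_features = 3, 64 * 64                # e.g. flattened 64x64 images
W = rng.normal(0, 0.01, (n_features, n_classes))  # random initial values

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def train_step(X, y, W, lr=0.1):
    """One gradient step of cross-entropy training on a batch (X, y)."""
    probs = softmax(X @ W)                  # model's class scores
    onehot = np.eye(n_classes)[y]           # ground-truth class labels
    grad = X.T @ (probs - onehot) / len(X)  # the comparison drives the update
    return W - lr * grad

X = rng.normal(size=(32, n_features))       # stand-in for a labelled image batch
y = rng.integers(0, n_classes, size=32)
for _ in range(50):
    W = train_step(X, y, W)
acc = float((softmax(X @ W).argmax(axis=1) == y).mean())
print(f"training accuracy on the toy batch: {acc:.2f}")
```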

3.4.1.5 CAMERA INTERFACING


One of the most important steps in image processing is computer vision, so we
must connect the camera to our deep learning model: the computer sees all
real-world objects through the camera. Each captured frame is passed to the
trained model, which flags suspicious activity such as the presence of a weapon.
CHAPTER 4

SYSTEM DESIGN

4.1 UML DIAGRAMS

A UML diagram is a diagram based on the Unified Modeling Language (UML),
whose purpose is to visually represent a system along with its main actors,
roles, actions, artifacts, or classes, in order to better understand, alter,
maintain, or document information about the system.

Additionally, sequence diagrams showcase the interactions between


objects or components over time, portraying the flow of messages or actions
between them. This aids in visualizing the dynamic behavior of the system and
identifying potential bottlenecks or inefficiencies in its operation.

Furthermore, use case diagrams highlight the system's functionalities from


the perspective of its users, illustrating the different use cases and the actors
involved in each scenario. This provides stakeholders with a clear understanding
of the system's capabilities and how users interact with it.

4.2 TYPES OF UML DIAGRAM

• Use Case Diagram


• Class Diagram
• Architecture Diagram
• Data Flow Diagram
• Activity Diagram

4.2.1 USECASE DIAGRAM
A Use Case is a list of steps, typically defining interactions between a role
(known in Unified Modeling Language (UML) as an “actor”) and a system, to
achieve a goal. The actor can be a human, an external system, or time. In systems
engineering, use cases are used at a higher level than within software
engineering, often representing missions or stakeholder goals. The use case
diagram has actors such as sender and receiver, and the use cases show the
activities handled by both sender and receiver.

[Use case diagram: access camera → movement detection → facial
recognition → beep sound and mail → predicted output.]
Fig.No 4.1 Usecase Diagram

4.2.2 Class Diagram
A class diagram is the main building block of object-oriented modeling. It
is used for general conceptual modeling of the structure of the application and
for detailed modelling, translating the models into programming code. Class
diagrams can also be used for data modelling. The classes in a class diagram
represent the main elements and interactions in the application, as well as the
classes to be programmed.

[Class diagram: dataset, training, new data, CNN algorithm, weapon detection,
detected face recognition, outcome]

Fig.No 4.2 Class Diagram

4.2.3 Architecture Diagram
An architecture diagram in UML provides a high-level overview
of the structure and components of a system. It showcases the relationships and
interactions between the various elements within the system and serves as a
blueprint for understanding the system's design and organization.

Fig.No 4.3 Architecture Diagram

4.2.4 Data Flow Diagram
Data flow diagrams in UML illustrate the flow of data within a system,
focusing on how information moves from one process to another. They depict
processes, data stores, data flows, and external entities, showing how they
interact to accomplish system functions. These diagrams help in visualizing the
flow of information and identifying data transformations and storage points.

[Data flow diagram: start → access in camera → movement checking → weapon
detection → face recognition]
Fig.No 4.4 Data Flow Diagram

4.2.5 Activity Diagram

Activity diagrams are graphical representations of workflows of
stepwise activities and actions, with support for choice, iteration, and
concurrency. In the Unified Modeling Language, activity diagrams are
intended to model both computational and organizational processes, and they
show the overall flow of control. An activity diagram has an initial and a
final state, with the activities mentioned between the states.

[Activity diagram: start → input images → CNN algorithm → weapon detection and
accelerometer → face recognition → Excel update and beep sound; email and SMS
→ stop]

Fig.No 4.5 Activity Diagram

CHAPTER 5

IMPLEMENTATION

5.1 ALGORITHM

5.1.1 CONVOLUTIONAL NEURAL NETWORK

Convolutional Neural Networks (CNNs) are a fundamental architecture in
deep learning, particularly renowned for their prowess in image recognition
tasks. At their core are convolutional layers, which apply filters to input data,
capturing essential features like edges and textures. These layers are often
followed by pooling layers, which downsample the data, reducing
computational load and guarding against overfitting. Non-linear activation
functions such as ReLU introduce complexity, enabling the network to grasp
intricate patterns. Fully connected layers then consolidate this information for
high-level reasoning and classification. Dropout, a regularization technique,
prevents overfitting by randomly deactivating neurons during training.
Techniques like padding and strides help manage spatial dimensions, while
batch normalization enhances stability and training speed.

5.2 Layers in CNN

5.2.1 Input layer

The input layer in a CNN contains the image data. Image data is represented
by a three-dimensional matrix, as we saw earlier, and must be reshaped into a
single column. Suppose you have an image of dimension 28 x 28; it contains
28 x 28 = 784 pixels, so you convert it into a 784 x 1 column before feeding it
into the input. If you have "m" training examples, then the dimension of the
input will be (784, m).
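This flattening step can be sketched in NumPy; the batch size of 5 below is an illustrative choice, not a value from the project:

```python
import numpy as np

# m = 5 training examples, each a 28 x 28 grayscale image.
images = np.random.rand(5, 28, 28)

# Flatten every image into a 784-element column, giving the (784, m) input.
flattened = images.reshape(5, 28 * 28).T

print(flattened.shape)  # (784, 5)
```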

5.2.2 Convo layer (Convo + ReLU)
The Convo layer is sometimes called the feature extractor layer because the
features of the image are extracted within this layer. First, a part of the image
is connected to the Convo layer to perform the convolution operation, as we saw
earlier, calculating the dot product between the receptive field (a local region
of the input image that has the same size as the filter) and the filter. The
result of the operation is a single integer of the output volume. Then we slide
the filter over the next receptive field of the same input image by a stride and
do the same operation again. We repeat this process until we have gone through
the whole image. The output becomes the input for the next layer.
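The sliding-filter operation described above can be sketched in plain NumPy; the 5 x 5 input, 3 x 3 averaging filter, and stride of 1 are illustrative choices, not values from the project:

```python
import numpy as np

def convolve2d(image, kernel, stride=1):
    """Slide `kernel` over `image`, taking the dot product at each receptive field."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    oh = (ih - kh) // stride + 1
    ow = (iw - kw) // stride + 1
    out = np.zeros((oh, ow))
    for y in range(oh):
        for x in range(ow):
            patch = image[y*stride:y*stride+kh, x*stride:x*stride+kw]
            out[y, x] = np.sum(patch * kernel)  # dot product with the receptive field
    return out

image = np.arange(25, dtype=float).reshape(5, 5)
kernel = np.ones((3, 3)) / 9.0        # simple averaging filter
feature_map = convolve2d(image, kernel)
print(feature_map.shape)              # (3, 3)
```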

5.2.3 Fully connected (FC) layer

The fully connected layer involves weights, biases, and neurons. It connects
neurons in one layer to neurons in another layer and is used to classify images
between different categories by training.

5.2.4 Pooling layer

The pooling layer is used to reduce the spatial volume of the input image after
convolution. It is used between two convolution layers. If we applied an FC
layer after a Convo layer without applying pooling or max pooling, it would be
computationally expensive, which we don't want. Max pooling is one way to reduce
the spatial volume of the input image. For example, applying max pooling to a
single depth slice with a stride of 2 reduces a 4 x 4 input to a 2 x 2 output.
There are no parameters in the pooling layer, but it has two hyperparameters —
Filter (F) and Stride (S).
• In general, if we have input dimension W1 x H1 x D1, then
• W2 = (W1−F)/S+1
• H2 = (H1−F)/S+1
• D2 = D1
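The output-size formulas above can be checked with a short helper; the 4 x 4 input with F = 2 and S = 2 mirrors the example in the text:

```python
import numpy as np

def pool_output_shape(w1, h1, d1, f, s):
    """Apply W2 = (W1 - F)/S + 1, H2 = (H1 - F)/S + 1, D2 = D1."""
    return ((w1 - f) // s + 1, (h1 - f) // s + 1, d1)

def max_pool(x, f=2, s=2):
    """Max pooling over a single depth slice."""
    h, w = x.shape
    oh, ow = (h - f) // s + 1, (w - f) // s + 1
    out = np.zeros((oh, ow))
    for y in range(oh):
        for x_ in range(ow):
            out[y, x_] = x[y*s:y*s+f, x_*s:x_*s+f].max()
    return out

print(pool_output_shape(4, 4, 1, f=2, s=2))   # (2, 2, 1)
slice_ = np.array([[1, 3, 2, 4],
                   [5, 6, 7, 8],
                   [3, 2, 1, 0],
                   [1, 2, 3, 4]], dtype=float)
print(max_pool(slice_))  # the maximum of each 2 x 2 block
```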
5.2.5 Softmax/logistic layer
The softmax or logistic layer is the last layer of a CNN. It resides at the end
of the FC layer. Logistic is used for binary classification and softmax for
multi-class classification.
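Both activations can be written directly in NumPy; the three class scores below are illustrative:

```python
import numpy as np

def logistic(z):
    """Sigmoid activation, used for binary classification."""
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    """Softmax activation, used for multi-class classification; outputs sum to 1."""
    e = np.exp(z - np.max(z))   # shift for numerical stability
    return e / e.sum()

scores = np.array([2.0, 1.0, 0.1])
probs = softmax(scores)
print(probs.sum())   # sums to 1 (up to floating-point rounding)
```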
5.2.6 Output layer
The output layer contains the labels in one-hot encoded form.

5.3 SYSTEM SPECIFICATION
5.3.1 HARDWARE REQUIREMENTS
SYSTEM : PC OR LAPTOP

PROCESSOR : INTEL / AMD

RAM : 4 GB RECOMMENDED

ROM : 1 GB

5.3.2 SOFTWARE REQUIREMENTS

OPERATING SYSTEM : WINDOWS 10/11

LANGUAGE USED : PYTHON

BACKEND : PYTHON IDLE

FRONTEND : PYTHON SHELL

LIBRARIES : OPENCV, PILLOW, PYTESSER

5.4 SOFTWARE DESCRIPTION
• PYTHON
• OPENCV
• PYTHON PILLOW

5.4.1 PYTHON
PYTHON WORKS
Python is an interpreted language, which precludes the need to compile
code before executing a program because Python does the compilation in the
background. Because Python is a high-level programming language, it abstracts
many sophisticated details from the programming code. Python focuses so much
on this abstraction that its code can be understood by most novice programmers.
Python code tends to be shorter than comparable code in other languages.
Although Python offers fast development times, it lags slightly in terms of
execution time. Compared to fully compiled languages like C and C++, Python
programs execute more slowly. Of course, with the processing speeds of
computers these days, the speed differences are usually only observed in
benchmarking tests, not in real-world operations. In most cases, Python is
already included in Linux distributions and Mac OS X machines. Python is a
dynamic, high-level, free, open-source, and interpreted programming language.
It supports object-oriented programming as well as procedural-oriented
programming. Python is very easy to code in compared to other languages like
C, C++, and Java, and it is a developer-friendly language. Python is also an
integrated language because we can easily integrate Python with other languages
like C and C++.

5.4.2 OPENCV
This OpenCV tutorial provides basic and advanced concepts of OpenCV and is
designed for beginners and professionals. OpenCV is an open-source library for
computer vision. It provides the facility for a machine to recognize faces or
objects. In this tutorial we will learn the concepts of OpenCV using the Python
programming language.

OPENCV WORKS

Human eyes provide lots of information based on what they see.
Machines, in turn, are facilitated with seeing by converting the vision into
numbers and storing them in memory. Here the question arises: how does a
computer convert images into numbers? The answer is that pixel values are used
to represent images as numbers. A pixel is the smallest unit of a digital image
or graphic that can be displayed and represented on a digital display device.

The picture intensity at a particular location is represented by a number.
In a grayscale image, the pixel value at each location consists of only one
value: the intensity of the black color at that location.
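The idea of an image as a grid of numbers can be shown without OpenCV itself; here a tiny synthetic grayscale array stands in for a camera frame:

```python
import numpy as np

# A 3 x 3 grayscale "image": each pixel is one intensity value in 0-255.
gray = np.array([[  0, 128, 255],
                 [ 64, 192,  32],
                 [255,   0, 127]], dtype=np.uint8)

# The pixel at row 0, column 2 is pure white (255); row 1, column 0 is dark gray.
print(gray[0, 2], gray[1, 0])   # 255 64
print(gray.shape)               # (3, 3) -- one channel, so one value per pixel
```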

Installation Of Opencv

Install Opencv Using Anaconda

The first step is to download the latest Anaconda graphical installer for
Windows from its official site. Choose the graphical installer matching your
system's architecture. You are suggested to install Anaconda 3.7, which works
with Python 3.

Install OpenCV in Windows via pip

OpenCV is a Python library, so it is necessary to install Python on the system
and then install OpenCV using the pip command:

pip install opencv-python

Open the command prompt and run a short import check to verify whether OpenCV
is installed or not.
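One generic way to perform that check from Python is via `importlib` (shown here with a standard-library module so it runs anywhere; replace the name with `cv2` to test OpenCV):

```python
import importlib.util

def is_installed(module_name):
    """Return True if the named module can be imported in this environment."""
    return importlib.util.find_spec(module_name) is not None

print(is_installed("csv"))   # True -- part of the standard library
print(is_installed("cv2"))   # True only if opencv-python is installed
```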

5.4.3 PYTHON PILLOW

In today's digital world, we come across lots of digital images. If we
are working with the Python programming language, it provides many image
processing libraries to add image processing capabilities to digital images.
Some of the most common image processing libraries are OpenCV, the Python
Imaging Library (PIL), Scikit-image, and Pillow. However, in this tutorial, we
are only focusing on the Pillow module and will try to explore its various
capabilities. Pillow is built on top of PIL (Python Image Library). PIL is one
of the important modules for image processing in Python. However, the PIL
module has not been supported since 2011 and doesn't support Python 3. The
Pillow module gives more functionality, runs on all major operating systems,
and supports Python 3. It supports a wide variety of image formats such as
"jpeg", "png", "bmp", "gif", "ppm", and "tiff". You can do almost anything with
digital images using the Pillow module, from basic image processing
functionality, including point operations and filtering images using built-in
convolution kernels, to color space conversions.
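A minimal Pillow round trip, assuming the package is installed as described below; the image here is generated in memory rather than loaded from disk:

```python
from PIL import Image

# Create a 100 x 50 red RGB image in memory.
img = Image.new("RGB", (100, 50), color=(255, 0, 0))

# Resize and convert to grayscale -- two of the operations mentioned above.
small = img.resize((50, 25))
gray = small.convert("L")

print(gray.size, gray.mode)   # (50, 25) L
```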

Python Pillow - Environment Setup

This chapter discusses how to install the Pillow package on your computer.
Installing the Pillow package is very easy, especially if you're installing it
using pip.

Installing Pillow using pip

To install Pillow using pip, just run the below commands in your command
prompt:

python -m pip install pip

python -m pip install pillow

In case pip and Pillow are already installed on your computer, the above
commands will simply report that the requirements are already satisfied.

5.5 HARDWARE DESCRIPTION

• POWER SUPPLY
• LINEAR POWER SUPPLY
• RECTIFIER
• ACCELEROMETER

5.5.1 POWER SUPPLY
A power supply is a source of electrical power. A device or system that
supplies electrical or other types of energy to an output load or group of
loads is called a power supply unit or PSU. The term is most commonly applied
to electrical energy supplies, less often to mechanical ones, and rarely to
others. Power supplies for electronic devices can be broadly divided into
linear and switching power supplies. The linear supply is a relatively simple
design that becomes increasingly bulky and heavy for high-current devices;
voltage regulation in a linear supply can result in low efficiency. A
switched-mode supply of the same rating as a linear supply will be smaller and
is usually more efficient, but will be more complex.

5.5.2 LINEAR POWER SUPPLY

An AC-powered linear power supply usually uses a transformer to convert the
voltage from the wall outlet (mains) to a different, usually a lower voltage. If it
is used to produce DC, a rectifier is used. A capacitor is used to smooth the
pulsating current from the rectifier. Some small periodic deviations from smooth
direct current will remain, which is known as ripple. These pulsations occur at a
frequency related to the AC power frequency (for example, a multiple of 50 or
60 Hz). The voltage produced by an unregulated power supply will vary
depending on the load and on variations in the AC supply voltage. For critical
electronics applications a linear regulator will be used to stabilize and adjust the
voltage. This regulator will also greatly reduce the ripple and noise in the output
direct current. Linear regulators often provide current limiting, protecting the
power supply and attached circuit from over current. Adjustable linear power
supplies are common laboratory and service shop test equipment, allowing the
output voltage to be set over a wide range. For example, a bench power supply
used by circuit designers may be adjustable up to 30 volts and up to 5 amperes
output. Some can be driven by an external signal, for example, for applications
requiring a pulsed output.

5.5.3 RECTIFIER
There are several ways of connecting diodes to make a rectifier to convert AC to
DC. The bridge rectifier is the most important and it produces full-wave varying
DC. A full-wave rectifier can also be made from just two diodes if a centre-tap
transformer is used, but this method is rarely used now that diodes are cheaper.
A single diode can be used as a rectifier but it only uses the positive (+) parts of
the AC wave to produce half-wave varying DC.
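The difference between half-wave and full-wave (bridge) rectification shows up in the ripple frequency: a half-wave rectifier passes one pulse per AC cycle, while a bridge passes two. A small illustrative calculation (50 Hz mains assumed):

```python
def ripple_frequency(mains_hz, full_wave):
    """Ripple frequency of rectified DC: 1 pulse per cycle half-wave, 2 full-wave."""
    return mains_hz * (2 if full_wave else 1)

print(ripple_frequency(50, full_wave=False))  # 50  -- single diode
print(ripple_frequency(50, full_wave=True))   # 100 -- bridge rectifier
```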

5.5.4 ACCELEROMETER

An accelerometer is a device that measures proper acceleration. Proper
acceleration is not the same as coordinate acceleration (rate of change of
velocity). For example, an accelerometer at rest on the surface of the Earth will
measure an acceleration due to Earth's gravity, straight upwards (by definition)
of g ≈ 9.81 m/s2. By contrast, accelerometers in free fall (falling toward the
center of the Earth at a rate of about 9.81 m/s2) will measure zero.
Accelerometers have multiple applications in industry and science. Highly
sensitive accelerometers are components of inertial navigation systems for
aircraft and missiles. Accelerometers are used to detect and monitor vibration in
rotating machinery. Accelerometers are used in tablet computers and digital
cameras so that images on screens are always displayed upright. Accelerometers
are used in drones for flight stabilisation. Coordinated accelerometers can be
used to measure differences in proper acceleration, particularly gravity, over
their separation in space; i.e., gradient of the gravitational field. This gravity
gradiometry is useful because absolute gravity is a weak effect and depends on
local density of the Earth which is quite variable. Single- and multi-axis models
of accelerometer are available to detect magnitude and direction of the proper
acceleration, as a vector quantity, and can be used to sense orientation (because
direction of weight changes), coordinate acceleration, vibration, shock, and
falling in a resistive medium (a case where the proper acceleration changes,
since it starts at zero, then increases). Micromachined accelerometers are
increasingly present in portable electronic devices and video game controllers,
to detect the position of the device or provide for game input. An accelerometer
measures proper acceleration, which is the acceleration it experiences relative to
free fall and is the acceleration felt by people and objects. Put another way, at
any point in spacetime the equivalence principle guarantees the existence of a
local inertial frame, and an accelerometer measures the acceleration relative to
that frame. Such accelerations are popularly denoted g-force i.e., in comparison
to standard gravity. Accelerometers are devices that measure acceleration, which
is the rate of change of the velocity of an object. They measure in meters per
second squared (m/s2) or in G-forces (g). A single G-force for us here on planet
Earth is equivalent to 9.8 m/s2, but this does vary slightly with elevation (and
will be a different value on different planets due to variations in gravitational
pull). Accelerometers are useful for sensing vibrations in systems and for
orientation-sensing applications.
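The unit conversion described above is a one-liner; 9.8 m/s² per g follows the value used in the text (it varies slightly with elevation):

```python
G = 9.8  # m/s^2 per g, as used in the text

def ms2_to_g(accel_ms2):
    """Convert an acceleration in m/s^2 to G-forces."""
    return accel_ms2 / G

# An accelerometer at rest on Earth's surface reads about 9.8 m/s^2 upward:
print(ms2_to_g(9.8))    # 1.0
print(ms2_to_g(19.6))   # 2.0
```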

CHAPTER 6
CONCLUSION AND FUTURE WORKS
6.1 RESULT AND DISCUSSION
• Improved Authentication Accuracy: The integration of embedded
systems and facial recognition technology in New Wave ATM security
has led to significantly improved authentication accuracy compared to
traditional methods like PINs or magnetic stripe cards.
• Enhanced Fraud Prevention: By leveraging facial recognition as a
primary means of identity verification, the system offers robust protection
against various forms of fraud, including skimming and card trapping,
thereby increasing overall security.
• Seamless User Experience: The adoption of facial recognition
technology offers a seamless and user-friendly authentication experience
for customers, eliminating the need for physical authentication tokens and
reducing friction in banking transactions.
• Real-time Processing Capabilities: Embedded systems within ATM
infrastructure enable real-time processing and analysis of facial features,
ensuring efficient and reliable authentication even in dynamic
environments.
• Future Potential and Adaptability: The results highlight the promising
potential of New Wave ATM security in revolutionizing banking security.
Moreover, the adaptability of embedded systems allows for the
continuous evolution of security measures in response to emerging threats
and regulatory requirements.

6.2 CONCLUSION
In conclusion, the integration of embedded systems and facial recognition
technology heralds a new era in ATM security, offering a paradigm shift
towards more robust and efficient authentication processes. By leveraging
specialized microcontrollers or processors within ATM infrastructure, coupled
with advanced facial recognition algorithms, this innovative approach addresses
longstanding vulnerabilities associated with traditional authentication methods
such as PINs and magnetic stripe cards. Moreover, the seamless integration of
facial recognition not only strengthens security but also improves the user
experience by eliminating the need for physical tokens, providing customers
with a frictionless banking experience.

With an extra layer of security provided by facial recognition, financial
institutions can significantly mitigate the risk of unauthorized access and
fraudulent transactions, thereby fostering greater trust and confidence in
banking systems worldwide. The adoption of New Wave ATM security represents a
critical step towards ensuring the safety and integrity of financial
transactions in an increasingly digitalized world.

The integration of deep learning algorithms enables the system to continuously
learn and adapt to evolving patterns of criminal activity, thereby staying
ahead of increasingly sophisticated theft techniques. This adaptive capability
enhances the overall effectiveness of ATM security measures, minimizing the
risk of successful attacks. The incorporation of theft detection mechanisms
enables prompt identification of unusual activities, such as loitering,
unauthorized access attempts, or tampering with the ATM equipment. By swiftly
alerting security personnel or law enforcement agencies, potential theft
incidents can be deterred or intercepted before they escalate, safeguarding
both the ATM infrastructure and the individuals utilizing it. Leveraging deep
learning-based theft detection and weapon identification technologies
represents a proactive and effective strategy for reinforcing ATM security.

6.3 FUTURE WORK

Future research in reinforcing ATM security through deep learning-based
theft detection and weapon identification could explore several avenues to
enhance the effectiveness and reliability of the system. One promising direction
involves integrating multiple sensor modalities, such as video, audio, and
environmental sensors, to provide a more comprehensive understanding of the
ATM environment. By developing deep learning models capable of fusing
information from these modalities, researchers can improve detection accuracy
while reducing false alarms. Moreover, future work could focus on
incorporating behavioural analysis techniques into the system. Rather than
solely relying on object detection, deep learning models could be trained to
recognize patterns associated with suspicious activities or anomalies in
customer behaviour around ATMs. This approach would enable the system to
identify potential threats more proactively and accurately. Another important
area for future research is enhancing the contextual understanding of ATM
transactions and surroundings. By developing deep learning models capable of
analysing contextual information such as time of day, location, and transaction
history, researchers can improve the system's ability to distinguish between
legitimate and suspicious activities. This contextual understanding could
significantly reduce false positives and improve overall detection performance.

APPENDICES
APPENDIX I
SOURCE CODE

##import serial
##ser = serial.Serial(port="COM3", baudrate='9600', timeout=0.5)
import numpy as np
import os
import sys
import tensorflow as tf
from distutils.version import StrictVersion
from collections import defaultdict
from PIL import Image
from object_detection.utils import ops as utils_ops
from time import sleep
import face_recognition
import cv2
import csv
import winsound

frequency = 4500
duration = 6000

from twilio.rest import Client
account_sid = 'ACfa3788966bfb19e69d5b18bbc4f0ae49'
auth_token = 'd0d2551a4b8f51809e3c6edc004e5f70'
client1 = Client(account_sid, auth_token)

def sms():
    message = client1.messages \
        .create(
            body='ALERT SMS: SOME SUSPICIOUS ACTIVITY',
            from_='+17254448259',
            to="+918637687225"
        )
    print(message.sid)
import smtplib
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.mime.image import MIMEImage
import getpass

# Email configuration
sender_email = '[email protected]'
receiver_email = '[email protected]'
subject = 'Email with Image Attachment'
body = 'Hello, This email contains an image attachment.'

msg = MIMEMultipart()
msg['From'] = sender_email
msg['To'] = receiver_email
msg['Subject'] = subject
# Attach body text to the email
msg.attach(MIMEText(body, 'plain'))

def email():
    # Attach image file
    image_filename = 'capture.jpg'  # Replace with your image file name
    with open(image_filename, 'rb') as img_file:
        img_data = img_file.read()
        image = MIMEImage(img_data, name=image_filename)
        msg.attach(image)
    # Email server configuration
    smtp_server = 'smtp.gmail.com'
    smtp_port = 587  # Use 465 for SSL connection
    smtp_username = sender_email
    smtp_password = 'zppezbzqwqktbvad'
    # Create an SMTP session and send the email
    server = smtplib.SMTP(smtp_server, smtp_port)
    server.starttls()
    server.login(smtp_username, smtp_password)
    server.sendmail(sender_email, receiver_email, msg.as_string())
    print('Email sent successfully!')
def crime(a):
    with open('REPORT.csv', 'r') as file:
        input_value = a
        reader = csv.reader(file)
        for row in reader:
            if row[0] == input_value:
                return row[1]
    return None
def face():
    # Get a reference to webcam #0 (the default one)
    video_capture = cv2.VideoCapture(0)
    print("camera enable")
    # Load a sample picture and learn how to recognize it.
    AdrianLamo_image = face_recognition.load_image_file("AdrianLamo.jpeg")
    AdrianLamo_face_encoding = face_recognition.face_encodings(AdrianLamo_image)[0]
    # Load a second sample picture and learn how to recognize it.
    kevinmitnick_image = face_recognition.load_image_file("kevinmitnick.jpeg")
    kevinmitnick_face_encoding = face_recognition.face_encodings(kevinmitnick_image)[0]
    # Load a second sample picture and learn how to recognize it.
    MatthewBevan_image = face_recognition.load_image_file("MatthewBevan.jpeg")
    MatthewBevan_face_encoding = face_recognition.face_encodings(MatthewBevan_image)[0]
    # Load a second sample picture and learn how to recognize it.
    MichaelCalce_image = face_recognition.load_image_file("MichaelCalce.jpeg")
    MichaelCalce_face_encoding = face_recognition.face_encodings(MichaelCalce_image)[0]
    # Load a second sample picture and learn how to recognize it.
    Richardpryce_image = face_recognition.load_image_file("Richardpryce.jpeg")
    Richardpryce_face_encoding = face_recognition.face_encodings(Richardpryce_image)[0]
    # Create arrays of known face encodings and their names
    known_face_encodings = [
        ## DHANABAL_face_encoding,
        AdrianLamo_face_encoding,
        kevinmitnick_face_encoding,
        MatthewBevan_face_encoding,
        MichaelCalce_face_encoding,
        Richardpryce_face_encoding
    ]
    known_face_names = [
        ## "DHANABAL",
        "AdrianLamo",
        "kevinmitnick",
        "MatthewBevan",
        "MichaelCalce",
        "Richardpryce",
    ]
    # Initialize some variables
    face_locations = []
    face_encodings = []
    face_names = []
    process_this_frame = True
    name = ""
    while True:
        # Grab a single frame of video
        ret, frame = video_capture.read()
        # Resize frame of video to 1/4 size for faster face recognition processing
        small_frame = cv2.resize(frame, (0, 0), fx=0.25, fy=0.25)
        # Convert the image from BGR color (which OpenCV uses) to RGB color
        # (which face_recognition uses)
        rgb_small_frame = small_frame[:, :, ::-1]
        # Only process every other frame of video to save time
        if process_this_frame:
            # Find all the faces and face encodings in the current frame of video
            face_locations = face_recognition.face_locations(rgb_small_frame)
            face_encodings = face_recognition.face_encodings(rgb_small_frame, face_locations)
            face_names = []
            for face_encoding in face_encodings:
                # See if the face is a match for the known face(s)
                matches = face_recognition.compare_faces(known_face_encodings, face_encoding)
                name = "Unknown"
                # # If a match was found in known_face_encodings, just use the first one.
                # if True in matches:
                #     first_match_index = matches.index(True)
                #     name = known_face_names[first_match_index]
                # Or instead, use the known face with the smallest distance to the new face
                face_distances = face_recognition.face_distance(known_face_encodings, face_encoding)
                best_match_index = np.argmin(face_distances)
                if matches[best_match_index]:
                    name = known_face_names[best_match_index]
                face_names.append(name)
        process_this_frame = not process_this_frame
        # Display the results
        for (top, right, bottom, left), name in zip(face_locations, face_names):
            # Scale back up face locations since the frame we detected in was
            # scaled to 1/4 size
            top *= 4
            right *= 4
            bottom *= 4
            left *= 4
            # Draw a box around the face
            cv2.rectangle(frame, (left, top), (right, bottom), (0, 0, 255), 2)
            # Draw a label with a name below the face
            cv2.rectangle(frame, (left, bottom - 35), (right, bottom), (0, 0, 255), cv2.FILLED)
            font = cv2.FONT_HERSHEY_DUPLEX
            cv2.putText(frame, name, (left + 6, bottom - 6), font, 1.0, (255, 255, 255), 1)
        # Display the resulting image
        cv2.imshow('Video', frame)
        if name == "AdrianLamo":
            cv2.imwrite('capture.jpg', frame)
            t = "Adrian"
            print("AdrianLamo FACE DETECTED....")
            input_value = t
            result = crime(input_value)
            print(result)
            email()
            break
        if name == "kevinmitnick":
            cv2.imwrite('capture.jpg', frame)
            print("kevinmitnick FACE DETECTED....")
            t = "kevin"
            input_value = t
            result = crime(input_value)
            print(result)
            email()
            break
        if name == "MatthewBevan":
            cv2.imwrite('capture.jpg', frame)
            print("MatthewBevan FACE DETECTED....")
            t = "Matthew"
            input_value = t
            result = crime(input_value)
            print(result)
            email()
            break
        if name == "MichaelCalce":
            cv2.imwrite('capture.jpg', frame)
            print("MichaelCalce FACE DETECTED....")
            t = "Michael"
            input_value = t
            result = crime(input_value)
            print(result)
            email()
            break
        if name == "Richardpryce":
            cv2.imwrite('capture.jpg', frame)
            print("Richardpryce FACE DETECTED....")
            t = "Richard"
            input_value = t
            result = crime(input_value)
            print(result)
            email()
            break
        # Hit 'q' on the keyboard to quit!
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    # Release handle to the webcam
    video_capture.release()
    cv2.destroyAllWindows()
if StrictVersion(tf.__version__) < StrictVersion('1.9.0'):
    raise ImportError('Please upgrade your TensorFlow installation to v1.9.* or later!')

from utils import label_map_util
# Needed for visualize_boxes_and_labels_on_image_array below.
from utils import visualization_utils as vis_util

MODEL_NAME = 'inference_graph'
PATH_TO_FROZEN_GRAPH = MODEL_NAME + '/frozen_inference_graph.pb'
PATH_TO_LABELS = 'training/labelmap.pbtxt'

detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_FROZEN_GRAPH, 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)
        tf.import_graph_def(od_graph_def, name='')

category_index = label_map_util.create_category_index_from_labelmap(
    PATH_TO_LABELS, use_display_name=True)
def run_inference_for_single_image(image, graph):
    if 'detection_masks' in tensor_dict:
        # The following processing is only for a single image
        detection_boxes = tf.squeeze(tensor_dict['detection_boxes'], [0])
        detection_masks = tf.squeeze(tensor_dict['detection_masks'], [0])
        # Reframe is required to translate mask from box coordinates to image
        # coordinates and fit the image size.
        real_num_detection = tf.cast(tensor_dict['num_detections'][0], tf.int32)
        detection_boxes = tf.slice(detection_boxes, [0, 0], [real_num_detection, -1])
        detection_masks = tf.slice(detection_masks, [0, 0, 0], [real_num_detection, -1, -1])
        detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
            detection_masks, detection_boxes, image.shape[0], image.shape[1])
        detection_masks_reframed = tf.cast(
            tf.greater(detection_masks_reframed, 0.5), tf.uint8)
        # Follow the convention by adding back the batch dimension
        tensor_dict['detection_masks'] = tf.expand_dims(
            detection_masks_reframed, 0)
    image_tensor = tf.get_default_graph().get_tensor_by_name('image_tensor:0')
    # Run inference
    output_dict = sess.run(tensor_dict,
                           feed_dict={image_tensor: np.expand_dims(image, 0)})
    # All outputs are float32 numpy arrays, so convert types as appropriate
    output_dict['num_detections'] = int(output_dict['num_detections'][0])
    output_dict['detection_classes'] = output_dict['detection_classes'][0].astype(np.uint8)
    output_dict['detection_boxes'] = output_dict['detection_boxes'][0]
    output_dict['detection_scores'] = output_dict['detection_scores'][0]
    global a2
    if 'detection_masks' in output_dict:
        output_dict['detection_masks'] = output_dict['detection_masks'][0]
    # Class 1 corresponds to 'Hammer' in the label map (class ID garbled in the
    # extracted text; class 2 below is 'Pen').
    if output_dict['detection_classes'][0] == 1 and output_dict['detection_scores'][0] > 0.85:
        print('Hammer')
        sms()
        print("Sms send")
        winsound.Beep(frequency, duration)
        face()
        sleep(1)
        cap.release()
        cv2.destroyAllWindows()
        a2 = 1
    if output_dict['detection_classes'][0] == 2 and output_dict['detection_scores'][0] > 0.70:
        print('Pen')
        sleep(1)
        a2 = 1
    if a2 == 1:
        a2 = 0
        sleep(1)
        ## email()
        sleep(1)
    return output_dict
def serial_func():
    print("serial enabled 1")
    a = ser.readline().decode('ascii')  # reading serial data
    print(a)
    b = a
    print(len(b))
    global a1
    if len(b) >= 2:
        for letter in b:
            if letter == 'A':
                D1 = b[1]
                a1 = int(D1)
                print("RECEIVED VALUE: ", a1)
        if a1 == 1:
            sms()
            print("SMS send")
            winsound.Beep(frequency, duration)
            cap.release()
            cv2.destroyAllWindows()
            face()

import serial
ser = serial.Serial('COM3', baudrate=9600, timeout=1)
ser.flushInput()
a1 = 0
a2 = 0
cap = cv2.VideoCapture(0)
try:
    with detection_graph.as_default():
        with tf.Session() as sess:
            # Get handles to input and output tensors
            ops = tf.get_default_graph().get_operations()
            all_tensor_names = {output.name for op in ops for output in op.outputs}
            tensor_dict = {}
            for key in ['num_detections', 'detection_boxes', 'detection_scores',
                        'detection_classes', 'detection_masks']:
                tensor_name = key + ':0'
                if tensor_name in all_tensor_names:
                    tensor_dict[key] = tf.get_default_graph().get_tensor_by_name(tensor_name)
            while True:
                (__, image_np) = cap.read()
                # Expand dimensions since the model expects images to have
                # shape: [1, None, None, 3]
                image_np_expanded = np.expand_dims(image_np, axis=0)
                ## cv2.imwrite('capture.jpg', image_np)
                serial_func()
                # Actual detection.
                output_dict = run_inference_for_single_image(image_np, detection_graph)
                # Visualization of the results of a detection.
                vis_util.visualize_boxes_and_labels_on_image_array(
                    image_np,
                    output_dict['detection_boxes'],
                    output_dict['detection_classes'],
                    output_dict['detection_scores'],
                    category_index,
                    instance_masks=output_dict.get('detection_masks'),
                    use_normalized_coordinates=True,
                    line_thickness=8)
                cv2.imshow('object_detection', cv2.resize(image_np, (800, 600)))
                if cv2.waitKey(1) & 0xFF == ord('q'):
                    cap.release()
                    cv2.destroyAllWindows()
                    break
except Exception as e:
    print(e)
    #cap.release()

APPENDIX – II

SCREENSHOTS

REFERENCES
[1] H. Du, H. Shi, D. Zeng, X.-P. Zhang, and T. Mei, ‘‘The elements of end-to-
end deep face recognition: A survey of recent advances,’’ ACM Comput. Surv.,
vol. 54, pp. 1–42, Jan. 2022, doi: 10.1145/3507902.
[2] Y. Liu, X. Tang, J. Han, J. Liu, D. Rui, and X. Wu, ‘‘HAMBox: Delving into mining high-quality anchors on face detection,’’ in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2020, pp. 13043–13051, doi: 10.1109/CVPR42600.2020.01306.
[3] K. Zhang, Z. Zhang, Z. Li, and Y. Qiao, ‘‘Joint face detection and alignment
using multitask cascaded convolutional networks,’’ IEEE Signal Process. Lett.,
vol. 23, no. 10, pp. 1499–1503, Oct. 2016, doi: 10.1109/LSP.2016.2603342.
[4] Y. Wang, X. Ji, Z. Zhou, H. Wang, and Z. Li, ‘‘Detecting faces using region-based fully convolutional networks,’’ 2017, arXiv:1709.05256.
[5] H. Li, Z. Lin, X. Shen, J. Brandt, and G. Hua, ‘‘A convolutional neural
network cascade for face detection,’’ in Proc. IEEE Conf. Comput. Vis. Pattern
Recognit. (CVPR), Jun. 2015, pp. 5325–5334, doi:
10.1109/CVPR.2015.7299170.
[6] D. Zeng, H. Liu, F. Zhao, S. Ge, W. Shen, and Z. Zhang, Proposal Pyramid Networks for Fast Face Detection, vol. 495. Amsterdam, The Netherlands: Elsevier, 2019, pp. 136–149.
[7] Y. Xu, W. Yan, G. Yang, J. Luo, T. Li, and J. He, ‘‘Centerface: Joint face
detection and alignment using face as point,’’ Sci. Program., vol. 2020, pp. 1–8,
Jul. 2020, doi: 10.1155/2020/7845384.

[8] X. Huang, W. Deng, H. Shen, X. Zhang, and J. Ye, ‘‘PropagationNet: Propagate points to curve to learn structure information,’’ in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2020, pp. 7263–7272, doi: 10.1109/cvpr42600.2020.00729.
[9] J. Wang, K. Sun, T. Cheng, B. Jiang, C. Deng, Y. Zhao, D. Liu, Y. Mu, M. Tan, X. Wang, W. Liu, and B. Xiao, ‘‘Deep high-resolution representation learning for visual recognition,’’ IEEE Trans. Pattern Anal. Mach. Intell., vol. 43, no. 10, pp. 3349–3364, Oct. 2021, doi: 10.1109/TPAMI.2020.2983686.
[10] Z. Liu, X. Zhu, G. Hu, H. Guo, M. Tang, Z. Lei, N. M. Robertson, and J.
Wang, ‘‘Semantic alignment: Finding semantically consistent ground-truth for
facial landmark detection,’’ in Proc. IEEE/CVF Conf. Comput. Vis. Pattern
Recognit. (CVPR), Jun. 2019, pp. 3462–3471, doi: 10.1109/cvpr.2019.00358.
[11] M. Jaderberg, K. Simonyan, A. Zisserman, and K. Kavukcuoglu, ‘‘Spatial
transformer networks,’’ in Proc. Adv. Neural Inf. Process. Syst., vol. 28, C.
Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, and R. Garnett, Eds. Curran
Associates, 2015, pp. 2017–2025. [Online].
[12] Y. Zhong, J. Chen, and B. Huang, ‘‘Toward end-to-end face recognition through alignment learning,’’ IEEE Signal Process. Lett., vol. 24, no. 8, pp. 1213–1217, Aug. 2017, doi: 10.1109/LSP.2017.2715076.
[13] A. Krizhevsky, I. Sutskever, and G. E. Hinton, ‘‘ImageNet classification with deep convolutional neural networks,’’ in Proc. Adv. Neural Inf. Process. Syst., vol. 25, F. Pereira, C. J. C. Burges, L. Bottou, and K. Q. Weinberger, Eds. Curran Associates, 2012, pp. 1097–1105. [Online].
[14] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, ‘‘ImageNet: A large-scale hierarchical image database,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2009, pp. 248–255, doi: 10.1109/CVPR.2009.5206848.
[15] K. Simonyan and A. Zisserman, ‘‘Very deep convolutional networks for large-scale image recognition,’’ in Proc. 3rd Int. Conf. Learn. Represent. (ICLR), Y. Bengio and Y. LeCun, Eds., San Diego, CA, USA, May 2015. [Online].
[16] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan,
V. Vanhoucke, and A. Rabinovich, ‘‘Going deeper with convolutions,’’ in Proc.
IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2015, pp. 1–9, doi:
10.1109/CVPR.2015.7298594.
[17] F. Schroff, D. Kalenichenko, and J. Philbin, ‘‘FaceNet: A unified embedding for face recognition and clustering,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2015, pp. 815–823, doi: 10.1109/CVPR.2015.7298682.
[18] K. He, X. Zhang, S. Ren, and J. Sun, ‘‘Deep residual learning for image
recognition,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun.
2016, pp. 770–778, doi: 10.1109/CVPR.2016.90.
[19] S. I. Serengil and A. Ozpinar, ‘‘LightFace: A hybrid deep face recognition framework,’’ in Proc. Innov. Intell. Syst. Appl. Conf. (ASYU), Oct. 2020, pp. 1–5, doi: 10.1109/ASYU50717.2020.9259802.
[20] Q. Cao, L. Shen, W. Xie, O. M. Parkhi, and A. Zisserman, ‘‘VGGFace2: A dataset for recognising faces across pose and age.’’

