A PROJECT REPORT
Submitted by
BACHELOR OF ENGINEERING
IN
MAY 2024
ANNA UNIVERSITY :: CHENNAI 600 025
BONAFIDE CERTIFICATE
ACKNOWLEDGEMENT
ABSTRACT
2.5.1 Advantages
2.5.2 Disadvantages
3. SYSTEM DESIGN
3.1 Existing system
3.1.1 Disadvantages
3.2 Proposed system
3.2.1 Advantages
3.3 Overall system architecture
3.4 Module list
3.4.1 Module description
3.4.1.1 Image Dataset Collection
3.4.1.2 Image Preprocessing
3.4.1.3 Importing Modules
3.4.1.4 Training Dataset
3.4.1.5 Camera Interfacing
4. SYSTEM DESIGN
LIST OF FIGURES
LIST OF ABBREVIATIONS
AI - Artificial Intelligence
TV - Television
FC - Fully connected
OS - Operating system
CHAPTER 1
INTRODUCTION
1.1 ARTIFICIAL INTELLIGENCE
Artificial intelligence (AI) is the ability of a digital computer or computer-controlled robot to perform tasks commonly associated with intelligent beings. The term is frequently applied to the project of developing systems endowed with the intellectual processes characteristic of humans, such as the ability to reason, discover meaning, generalize, or learn from past experience. Since the development of the digital computer in the 1940s, it has been demonstrated that computers can be programmed to carry out very complex tasks, such as discovering proofs for mathematical theorems or playing chess, with great proficiency. Still, despite continuing advances in computer processing speed and memory capacity, there are as yet no programs that can match human flexibility over wider domains or in tasks requiring much everyday knowledge. On the other hand, some programs have attained the performance levels of human experts and professionals in certain specific tasks, so that artificial intelligence in this limited sense is found in applications as diverse as medical diagnosis, computer search engines, and voice or handwriting recognition.
1.2 BASICS
A typical AI analyzes its environment and takes actions that maximize its chance of success. An AI's intended utility function (or goal) can be simple ("1 if the AI wins a game of Go, 0 otherwise") or complex ("Perform actions mathematically similar to ones that succeeded in the past"). Goals can be explicitly defined or induced. If the AI is programmed for "reinforcement learning", goals can be implicitly induced by rewarding some types of behavior and punishing others. Alternatively, an evolutionary system can induce goals by using a "fitness function" to mutate and preferentially replicate high-scoring AI systems, similarly to how animals evolved to innately desire certain goals such as finding food. Some AI systems, such as nearest-neighbor, instead reason by analogy; these systems are not generally given goals, except to the degree that goals are implicit in their training data. Such systems can still be benchmarked if the non-goal system is framed as a system whose "goal" is to successfully accomplish its narrow classification task.

AI often revolves around the use of algorithms. An algorithm is a set of unambiguous instructions that a mechanical computer can execute. A complex algorithm is often built on top of other, simpler algorithms. A simple example of an algorithm is the classic (optimal for the first player) recipe for play at tic-tac-toe: take a winning move when one exists, otherwise block the opponent's winning move, and otherwise prefer the centre, then the corners, then the edges (a code sketch follows at the end of this section). Many AI algorithms are capable of learning from data; they can enhance themselves by learning new heuristics (strategies, or "rules of thumb", that have worked well in the past), or can themselves write other algorithms. Some of the "learners" described below, including Bayesian networks, decision trees, and nearest-neighbor, could theoretically (given infinite data, time, and memory) learn to approximate any function, including whichever combination of mathematical functions would best describe the world. These learners could therefore derive all possible knowledge by considering every possible hypothesis and matching it against the data. In practice, it is almost never possible to consider every possibility, because of the phenomenon of "combinatorial explosion", where the amount of time needed to solve a problem grows exponentially. Much of AI research involves figuring out how to identify and avoid considering a broad range of possibilities that are unlikely to be beneficial. For example, when viewing a map and looking for the shortest driving route from Denver to New York in the east, one can in most cases skip looking at any path through San Francisco or other areas far to the west; thus, an AI wielding a pathfinding algorithm like A* can avoid the combinatorial explosion that would ensue if every possible route had to be ponderously considered in turn.

The earliest (and easiest to understand) approach to AI was symbolism (such as formal logic): "If an otherwise healthy adult has a fever, then they may have influenza". A second, more general, approach is Bayesian inference: "If the current patient has a fever, adjust the probability they have influenza in such-and-such way". The third major approach, extremely popular in routine business AI applications, is analogizers such as SVM and nearest-neighbor: "After examining the records of known past patients whose temperature, symptoms, age, and other factors mostly match the current patient, X% of those patients turned out to have influenza". A fourth approach is harder to understand intuitively, but is inspired by how the brain's machinery works: the artificial neural network approach uses artificial "neurons" that learn by comparing the network's output to the desired output and altering the strengths of the connections between internal neurons to "reinforce" connections that seemed to be useful. These four main approaches can overlap with each other and with evolutionary systems; for example, neural nets can learn to make inferences, to generalize, and to make analogies. Some systems implicitly or explicitly use several of these approaches, alongside many other AI and non-AI algorithms; the best approach often differs depending on the problem.
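The tic-tac-toe recipe mentioned above can be written out as a short program. Below is a minimal, simplified sketch (the full optimal recipe also handles forking moves); the board representation and helper names are illustrative, not taken from this report:

# A minimal sketch of a rule-based recipe for tic-tac-toe (simplified:
# the full optimal recipe also handles forking moves). The board is a
# list of 9 cells holding 'X', 'O', or None.

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

def winning_move(board, player):
    # Return a cell that completes three in a row for player, else None.
    for a, b, c in LINES:
        cells = [board[a], board[b], board[c]]
        if cells.count(player) == 2 and cells.count(None) == 1:
            return (a, b, c)[cells.index(None)]
    return None

def choose_move(board, me='X', opponent='O'):
    # Rule 1: win if possible. Rule 2: block the opponent's win.
    # Rule 3: otherwise prefer the centre, then corners, then edges.
    move = winning_move(board, me)
    if move is None:
        move = winning_move(board, opponent)
    if move is None:
        for cell in (4, 0, 2, 6, 8, 1, 3, 5, 7):
            if board[cell] is None:
                move = cell
                break
    return move

print(choose_move([None] * 9))  # first move: 4, the centre square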
CHAPTER 2
LITERATURE SURVEY
2.1.1 ADVANTAGES
• Potential deterrence of theft through the use of incapacitating substances.
2.1.2 DISADVANTAGES
• Privacy concerns with face recognition and social media alert systems.
2.2 TITLE : ADVANCED ATM SECURITY SYSTEM USING ARDUINO
AUTHOR : SAKSHI TAKKAR
YEAR : 2021
The ATM is a great piece of technology used by millions of people around the world. It makes daily transactions easier without loading up the banking systems. However, it is important to keep ATMs secure from theft and other malicious activities. Traditional ATM systems use a PIN (personal identification number) for authentication, and nowadays smart cards with magnetic stripes plus a PIN are in use, taking care of security from the user's side. Securing ATMs from the bank's side, however, needs much more work; merely deploying security personnel at the ATM is not enough. This project comprises an advanced security system that can monitor and activate various security measures in case of robbery and theft. The security system detects malicious activities inside the ATM booth, checks different security parameters, and keeps the concerned authorities updated. It uses sensors such as a reed switch, an ultrasonic sensor, and cameras to do so. If any unauthorized person tries to move or open the ATM machine, the reed switch opens the circuit. The ultrasonic sensor is used to sense the presence of an intruder. If there is any change in these two parameters, the surveillance camera takes a picture, and the concerned authority is informed via SMS and can check the face of the intruder via an IP address.
2.2.1 ADVANTAGES
• Real-time monitoring of ATM security parameters.
2.2.2 DISADVANTAGES
• Potential false alarms from sensors.
2.3 TITLE : A NOVEL METHOD OF ATM ANTI-THEFT DESIGN
USING SYSTEM ON CHIP
AUTHOR : MRIDUL SHUKLA
YEAR : 2023
The idea of designing this automated teller machine security/anti-theft system originated with the observation of real-life incidents that are prevalent around us. As the number of ATMs has been increasing rapidly, there is an obvious shortcoming in ATM security. This anti-theft system deals with robbery at ATMs, which is often reported very late, and helps the security agencies catch the culprits. The system is built on an embedded system-on-chip (SoC) precisely to overcome this issue of late reporting and delayed investigation. If something suspicious happens to the money tray in the ATM, the system is activated: the ESP-32 senses that something is not right and signals the H12E encoder to raise an alert, triggering the camera to photograph the culprits while at the same time alerting the nearest police station and other concerned authorities by sending a message through the GSM module and sounding the buzzer. This is how the project is intended to work. In this project, an RF module, a GSM module, an ESP-32, an H12E encoder, a microcontroller, and a camera are used. A GPS module is used to track the location of the robbed ATM.
2.3.1 ADVANTAGES
• Real-time alerts to authorities and activation of surveillance measures
upon detecting suspicious activity.
2.3.2 DISADVANTAGES
• Complexity in system integration and maintenance.
2.4 TITLE : IOT-BASED ATM PIN ENTRY BY RANDOM WORD
GENERATOR USING DESIGN THINKING FRAMEWORK
AUTHOR : ARUNA
YEAR : 2023
The method proposed by Xue Yang et al. [15] realizes an end-to-end detection framework for ship detection called Rotation Dense Feature Pyramid Networks (R-DFPN). The framework is based on a multi-scale detection network, using a dense feature pyramid network, rotation anchors, multi-scale ROI Align, and other structures. Compared with other rotation-region detection methods such as RRPN and R2CNN, this framework is more suitable for ship detection tasks and has achieved state-of-the-art performance. It is a new ship detection framework based on rotation regions that can handle different complex scenes, detect densely arranged objects, and reduce redundant detection regions. The densely connected feature pyramid, built on a multi-scale detection framework, enhances feature propagation, encourages feature reuse, and ensures the effectiveness of detecting multi-scale objects. Multi-scale ROI Align is used instead of ROI pooling to solve the problem of feature misalignment, and the fixed-length feature and regression bounding box are obtained through the horizontal circumscribed rectangle of each proposal to fully preserve semantic and spatial information.
2.4.1 ADVANTAGES
• Ability to handle complex scenes and detect densely arranged targets.
2.4.2 DISADVANTAGES
• Possible challenges in parameter tuning and model optimization.
2.5 TITLE : A METHOD OF SMALL FACE DETECTION BASED ON
CNN
AUTHOR : QINGYU ZHANG
YEAR : 2019
Under existing technology, due to the limitations of some scenes, image data will exhibit illumination changes, blurring, occlusion, low resolution, and other issues. These problems pose great challenges to face detection. At present, many algorithm models can detect faces well under frontal, high-resolution conditions. However, most faces in real scenes are lateral and have low resolution, and for this kind of face detection the existing algorithm models face problems of accuracy and real-time performance. In this paper, various face detection algorithm models are studied and analyzed in depth. Considering both the accuracy and speed of the algorithm model, the paper designs a face detection algorithm based on the MTCNN (Multi-task Cascaded Convolutional Neural Network) model. The algorithm is tested on WIDER FACE, the most commonly used dataset in the field of face detection. The results show that the algorithm is superior to other algorithms in the accuracy and speed of face detection.
2.5.1 ADVANTAGES
• Utilizes the MTCNN network model, known for its effectiveness in
detecting faces.
2.5.2 DISADVANTAGES
• The algorithm's performance may vary depending on the quality and
characteristics of the input image data.
CHAPTER 3
SYSTEM DESIGN
• Addressing Security Vulnerabilities: By moving away from traditional
security measures like PINs or cards, the system addresses inherent
vulnerabilities, offering a more secure environment for financial
transactions.
• Seamless User Experience: The incorporation of facial recognition
technology not only enhances security but also provides a seamless and
convenient user experience, eliminating the need for physical
authentication tokens.
• Enhanced Security Measures: With facial recognition adding an extra
layer of security, the proposed system significantly raises the bar for
unauthorized access to sensitive financial transactions, fostering greater
trust in banking systems worldwide.
3.3 SYSTEM ARCHITECTURE
[Block diagram: a camera and an accelerometer feed a microcontroller, which connects to a PC through a USB-to-UART link; the PC runs weapon detection, sends alerts by email or SMS, and logs records to Excel.]
3.4 MODULES USED
1. NumPy ➔ mathematical operations
2. TensorFlow ➔ dataset training
3. OpenCV ➔ camera access
4. Keras ➔ high-level neural-network API that runs on top of TensorFlow
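As a quick illustration, these four libraries would typically be imported as follows (a minimal sketch; the aliases are common conventions, not requirements):

import numpy as np            # mathematical operations on arrays
import tensorflow as tf       # building and training the dataset model
import cv2                    # OpenCV, used here for camera access
from tensorflow import keras  # high-level API on top of TensorFlow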
CHAPTER 4
SYSTEM DESIGN
4.2.1 USECASE DIAGRAM
A Use Case is a list of steps, typically defining interactions between a role
(known in Unified Modeling Language (UML) as an “actor”) and a system, to
achieve a goal. The actor can be a human, an external system, or time. In systems
engineering, use cases are used at a higher level than within software
engineering, often representing missions or stakeholder goals. Use Case Diagram
has actors like sender and receiver. Use cases show the activities handled by
both sender and receiver.
[Use case diagram: camera access, movement detection, facial recognition, and predicted output.]
Fig. No 4.1 Use Case Diagram
4.2.2 Class Diagram
A class diagram is the main building block of object-oriented modeling. It is used for general conceptual modeling of the structure of the application and for detailed modelling, translating the models into programming code. Class diagrams can also be used for data modelling. The classes in a class diagram represent the main elements of the application, their interactions, and the classes to be programmed.
[Class diagram: dataset training, new data, CNN algorithm, weapon detection, detected face, and outcome recognition.]
4.2.3 Architecture Diagram
An architecture diagram in UML provides a high-level overview of the structure and components of a system. It showcases the relationships and interactions between various elements within the system and serves as a blueprint for understanding the system's design and organization.
4.2.4 Data Flow Diagram
Data flow diagrams in UML illustrate the flow of data within a system,
focusing on how information moves from one process to another. They depict
processes, data stores, data flows, and external entities, showing how they
interact to accomplish system functions. These diagrams help in visualizing the
flow of information and identifying data transformations and storage points.
[Data flow diagram: start, camera access, movement checking, weapon detection, and face recognition.]
4.2.5 Activity Diagram
[Activity diagram: start, input images, CNN algorithm, stop.]
CHAPTER 5
IMPLEMENTATION
5.1 ALGORITHM
The input layer of a CNN holds the image data. Image data is represented by a three-dimensional matrix, as we saw earlier, and needs to be reshaped into a single column. Suppose you have an image of dimension 28 x 28 = 784; you need to convert it into a 784 x 1 vector before feeding it to the input. If you have "m" training examples, the dimension of the input will be (784, m).
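As a concrete sketch of this reshaping step, the following NumPy snippet flattens a 28 x 28 image into a 784 x 1 column and stacks a hypothetical batch of m examples into a (784, m) matrix (the array contents are random placeholders):

import numpy as np

image = np.random.rand(28, 28)      # one grayscale image
column = image.reshape(784, 1)      # single column vector
print(column.shape)                 # (784, 1)

m = 32                              # hypothetical batch of m examples
batch = np.random.rand(m, 28, 28)
inputs = batch.reshape(m, 784).T    # shape (784, m), as in the text
print(inputs.shape)                 # (784, 32)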
5.2.2 Convo layer (Convo + ReLU)
The convolutional layer is sometimes called the feature extractor layer because the features of the image are extracted within this layer. First, a part of the image is connected to the convolutional layer to perform the convolution operation, as we saw earlier, calculating the dot product between the receptive field (a local region of the input image that has the same size as the filter) and the filter. The result of the operation is a single number in the output volume. We then slide the filter over the next receptive field of the same input image by a stride and perform the same operation again, repeating the process until we have covered the whole image. The output becomes the input for the next layer.
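The sliding-window computation described above can be sketched in a few lines of NumPy. This toy version (stride 1, no padding, illustrative values) is for clarity only; real frameworks implement convolution far more efficiently:

import numpy as np

image = np.arange(25, dtype=float).reshape(5, 5)   # 5 x 5 input
filt = np.array([[1., 0.], [0., -1.]])             # 2 x 2 filter
F, S = filt.shape[0], 1                            # filter size, stride

out_size = (image.shape[0] - F) // S + 1
output = np.zeros((out_size, out_size))
for i in range(out_size):
    for j in range(out_size):
        receptive_field = image[i*S:i*S+F, j*S:j*S+F]
        output[i, j] = np.sum(receptive_field * filt)  # dot product

print(output.shape)  # (4, 4): one number per receptive-field position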
The pooling layer is used to reduce the spatial volume of the input image after convolution, and it sits between two convolutional layers. Applying an FC layer after a convolutional layer without pooling or max pooling would be computationally expensive, which we want to avoid. Max pooling is a standard way to reduce the spatial volume of the input image. In the example sketched in code after this list, max pooling is applied to a single depth slice with a stride of 2, and you can observe that a 4 x 4 input is reduced to a 2 x 2 output.
The pooling layer has no parameters, but it has two hyperparameters: Filter (F) and Stride (S).
• In general, if we have input dimension W1 x H1 x D1, then
• W2 = (W1−F)/S+1
• H2 = (H1−F)/S+1
• D2 = D1
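A small NumPy sketch of these formulas, using F = 2 and S = 2 on a 4 x 4 depth slice as in the example above, so that W2 = (4 − 2)/2 + 1 = 2:

import numpy as np

x = np.array([[1., 3., 2., 1.],
              [4., 6., 5., 0.],
              [3., 1., 2., 2.],
              [0., 2., 4., 3.]])
F, S = 2, 2
W2 = (x.shape[0] - F) // S + 1        # W2 = (4 - 2)/2 + 1 = 2
pooled = np.zeros((W2, W2))
for i in range(W2):
    for j in range(W2):
        pooled[i, j] = x[i*S:i*S+F, j*S:j*S+F].max()  # max of each 2 x 2 block
print(pooled)  # [[6. 5.] [3. 4.]]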
5.2.5 Softmax/logistic layer
The softmax or logistic layer is the last layer of a CNN, residing at the end of the FC layer. A logistic unit is used for binary classification, while softmax is used for multi-class classification.
5.2.6 Output layer
The output layer contains the labels in one-hot encoded form.
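As an illustration, the following sketch computes a numerically stable softmax over three FC-layer scores and shows the one-hot label format used by the output layer (the values are arbitrary examples):

import numpy as np

scores = np.array([2.0, 1.0, 0.1])      # raw FC-layer outputs for 3 classes
probs = np.exp(scores - scores.max())   # subtract max for numerical stability
probs /= probs.sum()
print(probs, probs.sum())               # class probabilities summing to 1.0

label = np.eye(3)[0]                    # one-hot encoding of class 0
print(label)                            # [1. 0. 0.]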
5.3 SYSTEM SPECIFICATION
5.3.1 HARDWARE REQUIREMENTS
SYSTEM : PC OR LAPTOP
RAM : 4 GB RECOMMENDED
ROM : 1 GB
5.4 SOFTWARE DESCRIPTION
• PYTHON
• OPENCV
• PYTHON PILLOW
5.4.1 PYTHON
PYTHON WORKS
Python is an interpreted language, which removes the need to compile code before executing a program because Python does the compilation in the background. Because Python is a high-level programming language, it abstracts many sophisticated details away from the programming code. Python focuses so much on this abstraction that its code can be understood by most novice programmers. Python code tends to be shorter than comparable code in other languages. Although Python offers fast development times, it lags slightly in execution time: compared to fully compiled languages like C and C++, Python programs execute more slowly. Of course, with the processing speeds of computers these days, the speed differences are usually only observed in benchmarking tests, not in real-world operations. In most cases, Python is already included in Linux distributions and on Mac OS X machines. Python is a dynamic, high-level, free, open-source, interpreted programming language. It supports object-oriented programming as well as procedure-oriented programming. Python is very easy to code in compared to other languages like C, C++, and Java, and it is a developer-friendly language. Python is also an integrated language, because Python can easily be integrated with other languages like C and C++.
5.4.2 OPENCV
This section provides basic and advanced concepts of OpenCV, suitable for beginners and professionals alike. OpenCV is an open-source library for computer vision. It provides the facility for a machine to recognize faces or objects. The concepts of OpenCV are introduced here using the Python programming language.
OPENCV WORKS
Human eyes provide lots of information based on what they see. Machines, in contrast, must be equipped to see: they convert what they see into numbers and store those numbers in memory. The question arises how a computer converts images into numbers. The answer is that pixel values are used to represent images as numbers. A pixel is the smallest unit of a digital image or graphic that can be displayed and represented on a digital display device.
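A short sketch of this idea with OpenCV, loading an image and inspecting its pixel values ('sample.jpg' is a hypothetical file name):

import cv2

img = cv2.imread('sample.jpg')    # NumPy array of shape (height, width, 3)
if img is not None:
    print(img.shape, img.dtype)   # e.g. (480, 640, 3) uint8
    print(img[0, 0])              # B, G, R values of the top-left pixel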
INSTALLATION OF OPENCV
Install OpenCV on Windows via pip
OpenCV is a Python library, so it is necessary to install Python on the system and then install OpenCV using the pip command:
pip install opencv-python
Then open the command prompt and run the following code to check whether OpenCV is installed or not.
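The report omits the snippet itself; a common way to verify the installation is to start Python and print the library version:

import cv2
print(cv2.__version__)  # prints the installed OpenCV version if the import succeeds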
If pip and Pillow are already installed on your computer, the above commands will simply report that the requirements are already satisfied.
• POWER SUPPLY
• LINEAR POWER SUPPLY
• RECTIFIER
• ACCELEROMETER
5.5.1 POWER SUPPLY
A power supply is a source of electrical power. A device or system that supplies electrical or other types of energy to an output load or group of loads is called a power supply unit or PSU. The term is most commonly applied to electrical energy supplies, less often to mechanical ones, and rarely to others. Power supplies for electronic devices can be broadly divided into linear and switching power supplies. The linear supply is a relatively simple design that becomes increasingly bulky and heavy for high-current devices; voltage regulation in a linear supply can result in low efficiency. A switched-mode supply of the same rating as a linear supply will be smaller and usually more efficient, but will be more complex.
5.5.3 RECTIFIER
There are several ways of connecting diodes to make a rectifier to convert AC to
DC. The bridge rectifier is the most important and it produces full-wave varying
DC. A full-wave rectifier can also be made from just two diodes if a centre-tap
transformer is used, but this method is rarely used now that diodes are cheaper.
A single diode can be used as a rectifier but it only uses the positive (+) parts of
the AC wave to produce half-wave varying DC.
5.5.4 ACCELEROMETER
CHAPTER 6
CONCLUSION AND FUTURE WORKS
6.1 RESULT AND DISCUSSION
• Improved Authentication Accuracy: The integration of embedded
systems and facial recognition technology in New Wave ATM security
has led to significantly improved authentication accuracy compared to
traditional methods like PINs or magnetic stripe cards.
• Enhanced Fraud Prevention: By leveraging facial recognition as a
primary means of identity verification, the system offers robust protection
against various forms of fraud, including skimming and card trapping,
thereby increasing overall security.
• Seamless User Experience: The adoption of facial recognition
technology offers a seamless and user-friendly authentication experience
for customers, eliminating the need for physical authentication tokens and
reducing friction in banking transactions.
• Real-time Processing Capabilities: Embedded systems within ATM
infrastructure enable real-time processing and analysis of facial features,
ensuring efficient and reliable authentication even in dynamic
environments.
• Future Potential and Adaptability: The results highlight the promising
potential of New Wave ATM security in revolutionizing banking security.
Moreover, the adaptability of embedded systems allows for the
continuous evolution of security measures in response to emerging threats
and regulatory requirements.
6.2 CONCLUSION
In conclusion, the integration of embedded systems and facial recognition
technology heralds a new era in ATM security, offering a paradigm shift
towards more robust and efficient authentication processes. By leveraging
specialized microcontrollers or processors within ATM infrastructure, coupled
with advanced facial recognition algorithms, this innovative approach addresses
longstanding vulnerabilities associated with traditional authentication methods
such as PINs and magnetic stripe cards. Moreover, the seamless integration of
facial recognition not only strengthens security but also improves the user experience
by eliminating the need for physical tokens, providing customers with a
frictionless banking experience. With an extra layer of security provided by
facial recognition, financial institutions can significantly mitigate the risk of
unauthorized access and fraudulent transactions, thereby fostering greater trust
and confidence in banking systems worldwide. The adoption of New Wave
ATM security represents a critical step towards ensuring the safety and integrity
of financial transactions in an increasingly digitalized world. The integration of
deep learning algorithms enables the system to continuously learn and adapt to
evolving patterns of criminal activity, thereby staying ahead of increasingly
sophisticated theft techniques. This adaptive capability enhances the overall
effectiveness of ATM security measures, minimizing the risk of successful
attacks. The incorporation of theft detection mechanisms enables prompt
identification of unusual activities, such as loitering, unauthorized access
attempts, or tampering with the ATM equipment. By swiftly alerting security
personnel or law enforcement agencies, potential theft incidents can be deterred
or intercepted before they escalate, safeguarding both the ATM infrastructure
and the individuals utilizing it. Leveraging deep learning-based theft detection
and weapon identification technologies represents a proactive and effective
strategy for reinforcing ATM security.
6.3 FUTURE WORK
APPENDICES
APPENDIX I
SOURCE CODE
##import serial
##ser = serial.Serial(port="COM3", baudrate='9600', timeout=0.5)
import numpy as np
import os
import sys
import tensorflow as tf
from distutils.version import StrictVersion
from collections import defaultdict
from PIL import Image
from object_detection.utils import ops as utils_ops
from time import sleep
import face_recognition
import cv2
import csv
import winsound

frequency = 4500
duration = 6000

from twilio.rest import Client

account_sid = 'ACfa3788966bfb19e69d5b18bbc4f0ae49'
auth_token = 'd0d2551a4b8f51809e3c6edc004e5f70'
client1 = Client(account_sid, auth_token)

def sms():
    # Send an alert SMS through Twilio
    message = client1.messages \
        .create(
            body='ALERT SMS: SOME SUSPICIOUS ACTIVITY ',
            from_='+17254448259',
            to="+918637687225"
        )
    print(message.sid)
import smtplib
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.mime.image import MIMEImage
import getpass

# Email configuration
sender_email = '[email protected]'
receiver_email = '[email protected]'
subject = 'Email with Image Attachment'
body = 'Hello, This email contains an image attachment.'

msg = MIMEMultipart()
msg['From'] = sender_email
msg['To'] = receiver_email
msg['Subject'] = subject
# Attach body text to the email
msg.attach(MIMEText(body, 'plain'))

def email():
    # Attach image file
    image_filename = 'capture.jpg'  # Replace with your image file name
    with open(image_filename, 'rb') as img_file:
        img_data = img_file.read()
    image = MIMEImage(img_data, name=image_filename)
    msg.attach(image)

    # Email server configuration
    smtp_server = 'smtp.gmail.com'
    smtp_port = 587  # Use 465 for SSL connection
    smtp_username = sender_email
    smtp_password = 'zppezbzqwqktbvad'

    # Create an SMTP session and send the email
    server = smtplib.SMTP(smtp_server, smtp_port)
    server.starttls()
    server.login(smtp_username, smtp_password)
    server.sendmail(sender_email, receiver_email, msg.as_string())
    print('Email sent successfully!')
def crime(a):
    # Look up the given name in REPORT.csv and return the matching record
    with open('REPORT.csv', 'r') as file:
        input_value = a
        reader = csv.reader(file)
        for row in reader:
            if row[0] == input_value:
                return row[1]
    return None
def face():
    # Get a reference to webcam #0 (the default one)
    video_capture = cv2.VideoCapture(0)
    print("camera enable")

    # Load a sample picture of each known suspect and learn how to recognize it.
    AdrianLamo_image = face_recognition.load_image_file("AdrianLamo.jpeg")
    AdrianLamo_face_encoding = face_recognition.face_encodings(AdrianLamo_image)[0]

    kevinmitnick_image = face_recognition.load_image_file("kevinmitnick.jpeg")
    kevinmitnick_face_encoding = face_recognition.face_encodings(kevinmitnick_image)[0]

    MatthewBevan_image = face_recognition.load_image_file("MatthewBevan.jpeg")
    MatthewBevan_face_encoding = face_recognition.face_encodings(MatthewBevan_image)[0]

    MichaelCalce_image = face_recognition.load_image_file("MichaelCalce.jpeg")
    MichaelCalce_face_encoding = face_recognition.face_encodings(MichaelCalce_image)[0]

    Richardpryce_image = face_recognition.load_image_file("Richardpryce.jpeg")
    Richardpryce_face_encoding = face_recognition.face_encodings(Richardpryce_image)[0]

    # Create arrays of known face encodings and their names
    known_face_encodings = [
        ## DHANABAL_face_encoding,
        AdrianLamo_face_encoding,
        kevinmitnick_face_encoding,
        MatthewBevan_face_encoding,
        MichaelCalce_face_encoding,
        Richardpryce_face_encoding,
    ]
    known_face_names = [
        ## "DHANABAL",
        "AdrianLamo",
        "kevinmitnick",
        "MatthewBevan",
        "MichaelCalce",
        "Richardpryce",
    ]
    # Short keys under which each suspect is recorded in REPORT.csv
    short_names = {
        "AdrianLamo": "Adrian",
        "kevinmitnick": "kevin",
        "MatthewBevan": "Matthew",
        "MichaelCalce": "Michael",
        "Richardpryce": "Richard",
    }

    # Initialize some variables
    face_locations = []
    face_encodings = []
    face_names = []
    process_this_frame = True
    name = ""

    while True:
        # Grab a single frame of video
        ret, frame = video_capture.read()
        # Resize frame of video to 1/4 size for faster face recognition processing
        small_frame = cv2.resize(frame, (0, 0), fx=0.25, fy=0.25)
        # Convert from BGR color (which OpenCV uses) to RGB color (which face_recognition uses)
        rgb_small_frame = small_frame[:, :, ::-1]

        # Only process every other frame of video to save time
        if process_this_frame:
            # Find all the faces and face encodings in the current frame of video
            face_locations = face_recognition.face_locations(rgb_small_frame)
            face_encodings = face_recognition.face_encodings(rgb_small_frame, face_locations)
            face_names = []
            for face_encoding in face_encodings:
                # See if the face is a match for the known face(s)
                matches = face_recognition.compare_faces(known_face_encodings, face_encoding)
                name = "Unknown"
                # Use the known face with the smallest distance to the new face
                face_distances = face_recognition.face_distance(known_face_encodings, face_encoding)
                best_match_index = np.argmin(face_distances)
                if matches[best_match_index]:
                    name = known_face_names[best_match_index]
                face_names.append(name)

        process_this_frame = not process_this_frame

        # Display the results
        for (top, right, bottom, left), name in zip(face_locations, face_names):
            # Scale face locations back up, since detection ran on a 1/4-size frame
            top *= 4
            right *= 4
            bottom *= 4
            left *= 4
            # Draw a box around the face
            cv2.rectangle(frame, (left, top), (right, bottom), (0, 0, 255), 2)
            # Draw a label with a name below the face
            cv2.rectangle(frame, (left, bottom - 35), (right, bottom), (0, 0, 255), cv2.FILLED)
            font = cv2.FONT_HERSHEY_DUPLEX
            cv2.putText(frame, name, (left + 6, bottom - 6), font, 1.0, (255, 255, 255), 1)

        # Display the resulting image
        cv2.imshow('Video', frame)

        # The original listing repeated one block per suspect; the blocks were
        # identical except for the name, so they are folded into one lookup here.
        if name in short_names:
            cv2.imwrite('capture.jpg', frame)
            print(name + " FACE DETECTED....")
            result = crime(short_names[name])
            print(result)
            email()
            break

        # Hit 'q' on the keyboard to quit!
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

    # Release handle to the webcam
    video_capture.release()
    cv2.destroyAllWindows()
if StrictVersion(tf.__version__) < StrictVersion('1.9.0'):
    raise ImportError('Please upgrade your TensorFlow installation to v1.9.* or later!')

from utils import label_map_util
from utils import visualization_utils as vis_util  # needed by the main loop below

MODEL_NAME = 'inference_graph'
PATH_TO_FROZEN_GRAPH = MODEL_NAME + '/frozen_inference_graph.pb'
PATH_TO_LABELS = 'training/labelmap.pbtxt'

detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_FROZEN_GRAPH, 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)
        tf.import_graph_def(od_graph_def, name='')

category_index = label_map_util.create_category_index_from_labelmap(PATH_TO_LABELS, use_display_name=True)
def run_inference_for_single_image(image, graph):
    if 'detection_masks' in tensor_dict:
        # The following processing is only for a single image
        detection_boxes = tf.squeeze(tensor_dict['detection_boxes'], [0])
        detection_masks = tf.squeeze(tensor_dict['detection_masks'], [0])
        # Reframing is required to translate the mask from box coordinates to
        # image coordinates and fit the image size.
        real_num_detection = tf.cast(tensor_dict['num_detections'][0], tf.int32)
        detection_boxes = tf.slice(detection_boxes, [0, 0], [real_num_detection, -1])
        detection_masks = tf.slice(detection_masks, [0, 0, 0], [real_num_detection, -1, -1])
        detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
            detection_masks, detection_boxes, image.shape[0], image.shape[1])
        detection_masks_reframed = tf.cast(
            tf.greater(detection_masks_reframed, 0.5), tf.uint8)
        # Follow the convention by adding back the batch dimension
        tensor_dict['detection_masks'] = tf.expand_dims(detection_masks_reframed, 0)

    image_tensor = tf.get_default_graph().get_tensor_by_name('image_tensor:0')
    # Run inference
    output_dict = sess.run(tensor_dict,
                           feed_dict={image_tensor: np.expand_dims(image, 0)})

    # All outputs are float32 numpy arrays, so convert types as appropriate
    output_dict['num_detections'] = int(output_dict['num_detections'][0])
    output_dict['detection_classes'] = output_dict['detection_classes'][0].astype(np.uint8)
    output_dict['detection_boxes'] = output_dict['detection_boxes'][0]
    output_dict['detection_scores'] = output_dict['detection_scores'][0]
    global a2
    if 'detection_masks' in output_dict:
        output_dict['detection_masks'] = output_dict['detection_masks'][0]

    # The class id was missing in the original listing; class 1 is assumed to
    # be 'Hammer' here, since class 2 is 'Pen' below.
    if output_dict['detection_classes'][0] == 1 and output_dict['detection_scores'][0] > 0.85:
        print('Hammer')
        sms()
        print("Sms send")
        winsound.Beep(frequency, duration)
        face()
        sleep(1)
        cap.release()
        cv2.destroyAllWindows()
        a2 = 1
    if output_dict['detection_classes'][0] == 2 and output_dict['detection_scores'][0] > 0.70:
        print('Pen')
        sleep(1)
        a2 = 1
    if a2 == 1:
        a2 = 0
        sleep(1)
        ## email()
        sleep(1)
    return output_dict
def serial_func():
    print("serial enabled 1")
    a = ser.readline().decode('ascii')  # reading serial data
    print(a)
    b = a
    print(len(b))
    global a1
    if len(b) >= 2:
        for letter in b:
            if letter == 'A':
                D1 = b[1]
                a1 = int(D1)
        print("RECEIVED VALUE: ", a1)
        if a1 == 1:
            sms()
            print("SMS send")
            winsound.Beep(frequency, duration)
            cap.release()
            cv2.destroyAllWindows()
            face()
import serial

ser = serial.Serial('COM3', baudrate=9600, timeout=1)
ser.flushInput()
a1 = 0
a2 = 0

cap = cv2.VideoCapture(0)
try:
    with detection_graph.as_default():
        with tf.Session() as sess:
            # Get handles to input and output tensors
            ops = tf.get_default_graph().get_operations()
            all_tensor_names = {output.name for op in ops for output in op.outputs}
            tensor_dict = {}
            for key in ['num_detections', 'detection_boxes', 'detection_scores',
                        'detection_classes', 'detection_masks']:
                tensor_name = key + ':0'
                if tensor_name in all_tensor_names:
                    tensor_dict[key] = tf.get_default_graph().get_tensor_by_name(tensor_name)
            while True:
                (__, image_np) = cap.read()
                # Expand dimensions since the model expects images to have
                # shape: [1, None, None, 3]
                image_np_expanded = np.expand_dims(image_np, axis=0)
                ## cv2.imwrite('capture.jpg', image_np)
                serial_func()
                # Actual detection.
                output_dict = run_inference_for_single_image(image_np, detection_graph)
                # Visualization of the results of a detection.
                vis_util.visualize_boxes_and_labels_on_image_array(
                    image_np,
                    output_dict['detection_boxes'],
                    output_dict['detection_classes'],
                    output_dict['detection_scores'],
                    category_index,
                    instance_masks=output_dict.get('detection_masks'),
                    use_normalized_coordinates=True,
                    line_thickness=8)
                cv2.imshow('object_detection', cv2.resize(image_np, (800, 600)))
                if cv2.waitKey(1) & 0xFF == ord('q'):
                    cap.release()
                    cv2.destroyAllWindows()
                    break
except Exception as e:
    print(e)
    # cap.release()
APPENDIX – II
SCREENSHOTS
REFERENCES
[1] H. Du, H. Shi, D. Zeng, X.-P. Zhang, and T. Mei, ‘‘The elements of end-to-end deep face recognition: A survey of recent advances,’’ ACM Comput. Surv., vol. 54, pp. 1–42, Jan. 2022, doi: 10.1145/3507902.
[2] Y. Liu, X. Tang, J. Han, J. Liu, D. Rui, and X. Wu, ‘‘HAMBox: Delving into mining high-quality anchors on face detection,’’ in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2020, pp. 13043–13051, doi: 10.1109/CVPR42600.2020.01306.
[3] K. Zhang, Z. Zhang, Z. Li, and Y. Qiao, ‘‘Joint face detection and alignment
using multitask cascaded convolutional networks,’’ IEEE Signal Process. Lett.,
vol. 23, no. 10, pp. 1499–1503, Oct. 2016, doi: 10.1109/LSP.2016.2603342.
[4] Y. Wang, X. Ji, Z. Zhou, H. Wang, and Z. Li, ‘‘Detecting faces using region-based fully convolutional networks,’’ 2017, arXiv:1709.05256.
[5] H. Li, Z. Lin, X. Shen, J. Brandt, and G. Hua, ‘‘A convolutional neural
network cascade for face detection,’’ in Proc. IEEE Conf. Comput. Vis. Pattern
Recognit. (CVPR), Jun. 2015, pp. 5325–5334, doi:
10.1109/CVPR.2015.7299170.
[6] D. Zeng, H. Liu, F. Zhao, S. Ge, W. Shen, and Z. Zhang, Proposal Pyramid Networks for Fast Face Detection, vol. 495. Amsterdam, The Netherlands: Elsevier, 2019, pp. 136–149.
[7] Y. Xu, W. Yan, G. Yang, J. Luo, T. Li, and J. He, ‘‘Centerface: Joint face
detection and alignment using face as point,’’ Sci. Program., vol. 2020, pp. 1–8,
Jul. 2020, doi: 10.1155/2020/7845384.
learning for visual recognition,’’ IEEE Trans. Pattern Anal. Mach. Intell., vol.
43, no. 10, pp. 3349–3364, Oct. 2021, doi: 10.1109/TPAMI.2020.2983686.
[10] Z. Liu, X. Zhu, G. Hu, H. Guo, M. Tang, Z. Lei, N. M. Robertson, and J.
Wang, ‘‘Semantic alignment: Finding semantically consistent ground-truth for
facial landmark detection,’’ in Proc. IEEE/CVF Conf. Comput. Vis. Pattern
Recognit. (CVPR), Jun. 2019, pp. 3462–3471, doi: 10.1109/cvpr.2019.00358.
[11] M. Jaderberg, K. Simonyan, A. Zisserman, and K. Kavukcuoglu, ‘‘Spatial
transformer networks,’’ in Proc. Adv. Neural Inf. Process. Syst., vol. 28, C.
Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, and R. Garnett, Eds. Curran
Associates, 2015, pp. 2017–2025. [Online].
[12] Y. Zhong, J. Chen, and B. Huang, ‘‘Toward end-to-end face recognition through alignment learning,’’ IEEE Signal Process. Lett., vol. 24, no. 8, pp. 1213–1217, Aug. 2017, doi: 10.1109/LSP.2017.2715076.
[13] A. Krizhevsky, I. Sutskever, and G. E. Hinton, ‘‘ImageNet classification with deep convolutional neural networks,’’ in Proc. Adv. Neural Inf. Process. Syst., vol. 25, F. Pereira, C. J. C. Burges, L. Bottou, and K. Q. Weinberger, Eds. Curran Associates, 2012, pp. 1097–1105. [Online].
[14] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, ‘‘ImageNet: A large-scale hierarchical image database,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2009, pp. 248–255, doi: 10.1109/CVPR.2009.5206848.
[15] K. Simonyan and A. Zisserman, ‘‘Very deep convolutional networks for
large-scale image recognition,’’ in Proc. 3rd Int. Conf. Learn. Represent.
(ICLR), Y. Bengio and Y. LeCun, Eds. San Diego, CA, USA, Jul. 2019.
[Online].
[16] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan,
V. Vanhoucke, and A. Rabinovich, ‘‘Going deeper with convolutions,’’ in Proc.
IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2015, pp. 1–9, doi:
10.1109/CVPR.2015.7298594.
[17] F. Schroff, D. Kalenichenko, and J. Philbin, ‘‘FaceNet: A unified embedding for face recognition and clustering,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2015, pp. 815–823, doi: 10.1109/CVPR.2015.7298682.
[18] K. He, X. Zhang, S. Ren, and J. Sun, ‘‘Deep residual learning for image
recognition,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun.
2016, pp. 770–778, doi: 10.1109/CVPR.2016.90.
[19] S. I. Serengil and A. Ozpinar, ‘‘LightFace: A hybrid deep face recognition framework,’’ in Proc. Innov. Intell. Syst. Appl. Conf. (ASYU), Oct. 2020, pp. 1–5, doi: 10.1109/ASYU50717.2020.9259802.
[20] Q. Cao, L. Shen, W. Xie, O. M. Parkhi, and A. Zisserman, ‘‘VGGFace2: A dataset for recognising faces across pose and age,’’