Smart Attendance System
DEPARTMENT OF
COMPUTER SCIENCE
IOT
GROUP NO.11
Sachin Kumar(2001321550043)
Project Supervisor
Dr. Indradeep Verma
This is to certify that this report embodies the original work done by Sahil Kumar,
Manish Singh Tomar and Sachin Kumar during this project submission as partial
fulfilment of the requirement for the Project of B.Tech (Computer Science - Internet
Of Things) V Semester, of the Greater Noida Institute Of Technology, Greater Noida.
We are grateful to our project guide Dr. Indradeep Verma for the guidance,
inspiration and constructive suggestions that helped us in the preparation of this project.
We also thank our colleagues who helped in the successful completion of the project.
We also declare that this project is the outcome of our own effort and that it has not been
submitted to any other university for the award of any degree.
1. Abstract
2. Introduction
3. Problem statement
4. Requirement & Specifications
4.1 Scope of the project
4.2 Objective of the project
5. System analysis
5.1 Existing system
5.2 Proposed system
6. Feasibility study
6.1 Economic feasibility
6.2 Technical feasibility
6.3 Operational feasibility
7. Methodology
8. System requirements
8.1 Hardware Requirements
8.2 Software Requirements
9. Face Recognition Process
10. Coding
11. Implementation
12. Analysis of the project
13. Conclusion
13.1 Scope for future development
14. References
1. Abstract
Attendance marking in a classroom during a lecture is not only an onerous task but also
a time-consuming one. With an unusually high number of students present during a
lecture, there is always a probability of proxy attendance. Attendance marking with
conventional methods has long been an area of challenge, and the need for efficient,
automatic techniques of marking attendance is a growing problem in the area of face
recognition. In recent years, automatic attendance marking has been widely addressed
through standard biometrics such as fingerprints and Radio Frequency Identification
(RFID) tags. However, these techniques lack reliability. In this project, an automated
attendance marking and management system is proposed that makes use of face detection
and recognition algorithms. Instead of relying on conventional methods, the proposed
system records a student's attendance using facial recognition technology. The main
objective of this work is to make attendance marking and management efficient, time
saving, simple and easy. Faces are recognized using face recognition algorithms; the
processed image is then compared against the existing stored records, and attendance is
marked in the database accordingly. Compared to the traditional attendance marking
system, this system reduces people's workload. The proposed system is implemented in
four phases, including image capturing and segmentation.
2. Introduction
Nowadays educational institutions are concerned about the regularity of student
attendance, mainly because a student's overall academic performance is affected by his
or her attendance at the institute.
There are two main conventional methods of marking attendance: calling out the roll or
having students sign on paper. Both are time-consuming and cumbersome. Hence, there
is a requirement for a computer-based student attendance management system which
assists the faculty in maintaining attendance records automatically. In this project we
have implemented an automated attendance system using Python. We have projected our
ideas to implement an "Automated Attendance System Based on Facial Recognition",
which has wide applications. The application includes face identification, which saves
time and eliminates the chance of proxy attendance because of face authorization.
Background:
Face recognition is crucial in daily life in order to identify family, friends or anyone
we are familiar with. We might not perceive that several steps are actually taken in
order to identify human faces. Human intelligence allows us to receive information and
interpret it in the recognition process. We receive information through the image
projected into our eyes, specifically onto the retina, in the form of light. Light is a form
of electromagnetic wave which is radiated from a source onto an object and projected
to human vision. Robinson-Riegler, G., & Robinson-Riegler, B. (2008) mention that
after the visual processing done by the human visual system, we classify the shape,
size, contour and texture of the object in order to analyse the information. The
analysed information is then compared to other representations of objects or faces that
exist in our memory for recognition. In fact, it is a hard challenge
to build an automated system with the same capability as a human to recognize faces.
However, recognizing many different faces requires a large memory; for example, in
universities there are many students of different races and genders, and it is impossible
to remember every individual's face without making mistakes. In order to overcome
human limitations, computers with almost limitless memory, high processing speed and
power are used in face recognition systems. The human face is a unique representation
of individual identity. Thus, face recognition is defined as a biometric method in which
identification of an individual is performed by comparing a real-time captured image with
the stored images of that person in a database (Margaret Rouse, 2012). Nowadays, face
recognition systems are prevalent due to their simplicity and excellent performance. For
instance, airport protection systems and the FBI use face recognition for criminal
investigations by tracking suspects, missing children and drug activities (Robert Silk,
2017). Apart from that, Facebook, a popular social networking website, implements face
recognition to allow users to tag their friends in photos for entertainment purposes
(Sidney Fussell, 2018). Furthermore, Intel allows users to use face recognition to get
access to their online accounts (Reichert, C., 2017), and Apple allows users to unlock
their mobile phone, the iPhone X, by using face recognition (deAgonia, M., 2017).
Work on face recognition began in the 1960s. Woody Bledsoe, Helen Chan Wolf and
Charles Bisson introduced a system which required the administrator to locate the eyes,
ears, nose and mouth in images. The distances and ratios between the located features
and common reference points were then calculated and compared. These studies were
further enhanced by Goldstein, Harmon, and Lesk in 1970 by using other features such
as hair colour and lip thickness to automate the recognition. In 1988, Kirby and Sirovich
first suggested principal component analysis (PCA) to solve the face recognition
problem. Many studies on face recognition have been conducted continuously until
today (Ashley DuVal, 2012).
3. Problem Statement
In the previous attendance management system, the accuracy of the collected data is the
biggest issue, because the attendance might not be recorded personally by the original
person; in other words, the attendance of a particular person can be taken by a third
party without the institution realizing it, which violates the accuracy of the data. For
example, student A is reluctant to attend a particular class, so student B signs the
attendance on his/her behalf even though student A did not attend, and the system
overlooks this because no enforcement is practiced. If the institution were to establish
enforcement, it would consume a great deal of human resources and time, which is not
practical at all. Thus, none of the attendance recorded in the previous system is reliable
for analysis. The second problem with the previous system is that it is too time
consuming. Assuming the time taken for a student to sign his/her attendance on a 3-4
page name list is approximately 1 minute, only about 60 students can sign their
attendance in 1 hour, which is obviously inefficient. The third issue is the accessibility
of this information to the legitimate concerned parties. For example, most parents are
very concerned to track their child's actual whereabouts and to ensure their child really
attends classes in college/school. However, in the previous system there is no way for
parents to access such information. Therefore, the previous system needs to evolve to
improve efficiency and data accuracy, and to provide accessibility to the information
for the legitimate parties.
Traditional student attendance marking techniques often face a lot of trouble. The
face recognition student attendance system emphasizes simplicity by eliminating
classical attendance-marking techniques such as calling student names or checking
identification cards. These methods not only disturb the teaching process but also cause
distraction for students during exam sessions. Apart from calling names, an attendance
sheet may be passed around the classroom during lecture sessions; a class with a large
number of students might find it difficult to have the attendance sheet passed around.
Thus, a face recognition student attendance system is proposed in order to replace the
manual signing of the presence of students, which is burdensome and distracts students
who must sign for their attendance. Furthermore, a face recognition based automated
student attendance system is able to overcome fraudulent approaches, and lecturers do
not have to count the number of students several times to ensure their presence. The
paper by Zhao, W. et al. (2003) lists the difficulties of facial identification, one of
which is the discrimination between known and unknown images. In addition, Pooja
G.R. et al. (2010) found that the training process for a face recognition student
attendance system is slow and time-consuming, and Priyanka Wagh et al. (2015)
mention that different lighting and head poses are problems that can degrade the
performance of a face recognition based student attendance system.
Hence, there is a need to develop a real-time student attendance system, which
means the identification process must be completed within defined time constraints to
prevent omission. The features extracted from facial images, which represent the
identity of the students, have to be consistent under changes in background,
illumination, pose and expression. High accuracy and fast computation time are the
evaluation points of the performance.
Our project targets students of different academic levels and faculty members. The
main constraint we faced is distinguishing between identical twins. This situation is still
a challenge to biometric systems, especially facial recognition technology. According to
the paper by Phillips and his co-researchers, to get the best results from the algorithms
a system employs, the pictures should be taken under certain conditions (e.g. age,
gender, expression, studio environment); otherwise, the problem remains open.
They provide a method to solve this problem, but in order to use this solution
one has to sign a contract with the organization and be a researcher or developer.
For our part, to address this issue we suggest recording twins' attendance manually.
Our primary goal is to help lecturers improve and organize the process of tracking and
managing student attendance and absence.
4.2 Aim and Objectives of the project:
The objective of this project is to develop a face recognition based automated student
attendance system. Expected achievements in order to fulfil the objectives are:
5. System Analysis
Analysis can be defined as breaking up a whole so as to find out its nature, function
and so on. Design is defined as making preliminary sketches of something; sketching a
pattern or outline for a plan, and planning and carrying it out, especially by artistic
arrangement or in a skilful way. System analysis and design can be characterized as a
set of techniques and processes, a community of interests, a culture and an intellectual
orientation. The various tasks in system analysis include the following:
Understanding the application.
Planning.
Scheduling.
Developing candidate solutions.
Performing cost-benefit analysis.
Recommending alternative solutions.
Supervising, installing and maintaining the system.
This system handles the analysis of report creation and replaces manual entry of student
attendance. First, the student entry form, staff allocation form and timetable allocation
forms are designed. This project helps the department's attendance system calculate
percentages and produce reports for the eligibility criteria of examinations. The
attendance entry system provides flexible reports for all students.
5.1 Existing System
The existing system is manual entry for the students: attendance is recorded in
hand-written registers. It is a tedious job to maintain the record, and the human effort
involved is high. Retrieval of information is not easy, as the records are maintained in
hand-written registers. This approach requires correct input into the respective fields;
if wrong inputs are entered, the process breaks down, so users find it difficult to use.
5.2 Proposed System
To overcome the drawbacks of the existing system, the proposed system has been
evolved. This project aims to reduce paperwork and save time by generating accurate
results from the students' attendance. The system provides the best possible user
interface, and efficient reports can be generated using the proposed system.
• User Friendly: The proposed system is very user friendly, because retrieval and
storage of data are fast and data is maintained efficiently. Furthermore, a graphical
user interface is provided in the proposed system, which lets the user deal with the
system very comfortably.
• Reports are easily generated: Defaulter reports can be generated very easily in
the proposed system, so the user can generate a report as per his/her requirement
(monthly) or in the middle of the session, and can issue notices to students so
as to be regular.
• No paper work: The proposed system does not require much paperwork. All the data
is entered into the database immediately and reports can be generated very easily by the
teachers. Furthermore, work becomes very easy because there is no need to keep data on
paper.
6. Feasibility Study
Feasibility analysis begins once the goals are defined. It starts by generating broad
possible solutions, which give an indication of what the new system should look like.
This is where creativity and imagination are used: analysts must think up ways of doing
things and generate new ideas. There is no need to go into detailed system operation yet;
the solution should provide enough information to make reasonable estimates about
project cost and give users an indication of how the new system will fit into the
organization. It is important not to exert considerable effort at this stage only to find
out that the project is not worthwhile or that there is a need to significantly change
the original goal. Feasibility of a new system means ensuring that the new system,
which we are going to implement, is efficient and affordable. There are various types
of feasibility to be determined.
6.1 Economic Feasibility
Development of this application is highly economically feasible; the only requirement
is an environment with effective supervision.
It is cost-effective in the sense that it eliminates paperwork completely. The system is
also time-effective because the calculations are automated and reports are generated as
per the user's requirement.
6.2 Technical Feasibility
All required framework upgrades can be installed, as the .NET package supports
Windows-based applications. This application depends on Microsoft Office, an intranet
service and a database. Users enter their attendance and reports are generated to an
Excel sheet.
6.3 Operational Feasibility
The system is quite easy to use and learn due to its simple but attractive interface, and
users require no special training to operate it. Technical performance includes issues
such as determining whether the system can provide the right information about
department personnel and student details, and whether the system can be organized so
that it always delivers this information at the right place and on time using intranet
services.
7. METHODOLOGY
Based on the literature survey, in which we thoroughly studied various topics directly
linked with our project, we are going to design a possible solution to our problem.
In this part we propose a method that gives an overview of our approach to the project
and the way it should be done.
The shortcomings of previous work led us to develop this project in the most feasible
and efficient way possible. The face detection module proposed for this project uses
the Viola-Jones algorithm, and the face recognition module proposed for this project
is a neural network architecture combined with LBPH.
The following figure shows the project system circuit design.
Although our own database should be used to design a real-time face recognition student
attendance system, databases provided by previous researchers are also used to design
the system more effectively and efficiently, and for evaluation purposes. The Yale face
database is used as both training set and testing set to evaluate performance. It contains
one hundred and sixty-five grayscale images of fifteen individuals, with eleven images
per individual, each under a different condition: centre-light, with glasses, happy,
left-light, without glasses, normal, right-light, sad, sleepy, surprised and wink. These
different variations ensure that the system can operate consistently in a variety of
situations and conditions.
Figure 5: Sample Images
For our own database, the images of students are captured using a laptop's built-in
camera and a mobile phone camera. Each student provided four images, two for the
training set and two for the testing set. Images captured using the laptop's built-in
camera are categorized as low-quality images, whereas images captured with the mobile
phone camera are categorized as high-quality images. The high-quality set consists of
seventeen students, while the low-quality set consists of twenty-six students. The
recognition rates of low-quality and high-quality images are compared to draw a
conclusion about performance between image sets of different quality.
The input image for the proposed approach has to be frontal, upright and contain only a
single face. Although the system is designed to recognize students both with and without
glasses, students should provide facial images both with and without glasses for
training, to increase the accuracy of being recognized without glasses. The training and
testing images should be captured with the same device to avoid quality differences.
Students have to register in order to be recognized; enrolment can be done on the spot
through the user-friendly interface. These conditions have to be satisfied to ensure
that the proposed approach performs well.
8. System Requirements
8.1 Hardware Requirements
The hardware used in this project, around the Raspberry Pi setup, consists of only 7
components:
• Raspberry Pi 4
• Webcam (8 MP) or Camera Module
• Power Supply Cable
• Micro SD Card
• Display panel
• Mouse & Keyboard
• HDMI Cable
Raspberry Pi 4
The Raspberry Pi Foundation has come up with a new, more powerful next generation of
the Pi computer: the Raspberry Pi 4 Model B, a significant upgrade to the Raspberry Pi 3
generation.
The Raspberry Pi 4 Model B is the latest product in the popular Raspberry Pi range of
computers. It offers ground-breaking increases in processor speed, multimedia
performance, memory and connectivity compared to the prior-generation Raspberry Pi
3 Model B+, while retaining backward compatibility and similar power consumption.
For the end-user, the Raspberry Pi 4 Model B provides desktop performance comparable
to entry-level x86 PC systems.
The latest Bluetooth 5.0 technology improves IoT solutions with 2x the speed, 4x the
range and 8x the data transfer rate; with such faster, longer-distance connectivity,
Bluetooth peripherals work on a new level. Furthermore, one much-needed upgrade the
Pi Foundation has made in the Raspberry Pi 4 B is the Type-C USB port, through which
the Pi can draw up to 3 A, so the Pi 4 can provide more power to onboard chips and
peripheral interfaces.
Key Features:
Faster processing:
The latest Broadcom BCM2711 quad-core Cortex-A72 (ARM v8-A) 64-bit SoC, clocked
at 1.5 GHz, together with improved power consumption and thermals on the Pi 4 B,
means that the CPU can now run at 1.5 GHz, an increase on the previous Pi 3 models
(which ran at 1.2 GHz).
Video performance on the Pi 4 B is upgraded with dual-display support at resolutions up
to 4K via a pair of micro-HDMI ports, and hardware video decode at up to 4Kp60, which
supports H.265 decode (4Kp60) and H.264/MPEG-4 decode (1080p60).
Faster wireless:
A significant change on the Pi 4 B compared to the previous Pi 3 models is the inclusion
of a new, faster dual-band wireless chip with 802.11 b/g/n/ac wireless LAN.
The dual-band 2.4 GHz and 5 GHz wireless LAN enables faster networking with less
interference, and the new PCB antenna technology allows better reception.
The latest Bluetooth 5.0 lets you use a wireless keyboard/trackpad with more range
than before without extra dongles, keeping things nice and tidy.
The GPIO header remains the same, with 40 pins, fully backward-compatible with the
previous three models of Pi. However, it should be noted that the new PoE header pins
may contact components on the underside of some HATs, like the Rainbow HAT; some
standoffs will prevent any mischief from occurring, though.
Camera Module
The Raspberry Pi Camera Module 2 replaced the original Camera Module in April 2016.
The v2 Camera Module has a Sony IMX219 8-megapixel sensor (compared to the
5-megapixel OmniVision OV5647 sensor of the original camera).
The Camera Module 2 can be used to take high-definition video, as well as stills
photographs. It’s easy to use for beginners, but has plenty to offer advanced users if
you’re looking to expand your knowledge. There are lots of examples online of people
using it for time-lapse, slow-motion, and other video cleverness. You can also use the
libraries we bundle with the camera to create effects.
You can read all the gory details about IMX219 and the Exmor R back-illuminated
sensor architecture on Sony’s website, but suffice to say this is more than just a
resolution upgrade: it’s a leap forward in image quality, colour fidelity, and low-light
performance. It supports 1080p30, 720p60 and VGA90 video modes, as well as still
capture. It attaches via a 15cm ribbon cable to the CSI port on the Raspberry Pi.
The camera works with all models of Raspberry Pi 1, 2, 3 and 4. It can be accessed
through the MMAL and V4L APIs, and there are numerous third-party libraries built
for it, including the Picamera Python library. See the Getting Started with Picamera
resource to learn how to use it.
All models of Raspberry Pi Zero require a Raspberry Pi Zero camera cable; the standard
cable supplied with the camera module is not compatible with the Raspberry Pi Zero
camera connector. Suitable cables are available at low cost from many Raspberry Pi
Approved Resellers, and are supplied with the Raspberry Pi Zero Case.
The camera module is very popular in home security applications, and in wildlife
camera traps.
Micro SD Card
Raspberry Pi computers use a micro SD card, except for very early models which use a
full-sized SD card.
Figure 10: SD card
We recommend using an SD card of 8GB or greater capacity with Raspberry Pi OS. If
you are using the lite version of Raspberry Pi OS, you can use a 4GB card. Other
operating systems have different requirements: for example, LibreELEC can run
from a smaller card. Please check with the supplier of the operating system to find out
what capacity of card they recommend.
Display panel
Raspberry Pi OS provides touchscreen drivers with support for ten-finger touch and an
on-screen keyboard, giving you full functionality without the need to connect a
keyboard or mouse.
The 800 x 480 display connects to the Raspberry Pi via an adapter board that handles
power and signal conversion. Only two connections to your Raspberry Pi are required:
power from the GPIO port, and a ribbon cable that connects to the DSI port present on
all Raspberry Pi computers except the Raspberry Pi Zero line.
Touch panel: true multi-touch capacitive touch panel with up to 10 points of touch.
The power supply is connected to the Raspberry Pi, and the keyboard is connected to the
Raspberry Pi. If the power supply were connected to the keyboard, with the Raspberry
Pi powered via the keyboard, the keyboard would not operate correctly.
HDMI Cable
The Raspberry Pi 4 B was launched recently and everyone is busy grabbing their
favourite Raspberry Pis, but many have doubts regarding the HDMI ports, because the
Raspberry Pi 4 comes with micro HDMI ports.
The official Raspberry Pi micro HDMI to HDMI (A/M) cable is designed for the
Raspberry Pi 4 computer.
8.2 Software Requirements
Dlib:
Dlib is a modern C++ toolkit containing machine learning algorithms and tools for
creating complex software in C++ to solve real-world problems. It is used in both
industry and academia in a wide range of domains including robotics, embedded
devices, mobile phones, and large high-performance computing environments.
Dlib's open source licensing allows you to use it in any application, free of charge.
OpenCV-Python software:
OpenCV is a library that supports several programming languages such as Java, Python
and C++, and is usable on different platforms including iOS, Android, OS X, Linux and
Windows. Interfaces for high-speed GPU operations based on CUDA and OpenCL are
also under active development. OpenCV-Python is a Python library intended to solve
computer vision problems (OpenCV, 2018).
Python is a general purpose programming language started by Guido van Rossum that
became very popular very quickly, mainly because of its simplicity and code readability.
It enables the programmer to express ideas in fewer lines of code without reducing
readability.
Compared to languages like C/C++, Python is slower. That said, Python can be easily
extended with C/C++, which allows us to write computationally intensive code in C/C++
and create Python wrappers that can be used as Python modules. This gives us two
advantages: first, the code is as fast as the original C/C++ code (since it is the actual C++
code working in the background) and second, it is easier to code in Python than in C/C++.
OpenCV-Python is a Python wrapper for the original OpenCV C++ implementation.
OpenCV-Python makes use of Numpy, which is a highly optimized library for numerical
operations with a MATLAB-style syntax. All the OpenCV array structures are converted
to and from Numpy arrays. This also makes it easier to integrate with other libraries that
use Numpy such as SciPy and Matplotlib.
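The exact Python package versions used in this project are pinned in the following
requirements list: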
certifi==2020.6.20
chardet==3.0.4
click==7.1.2
cmake==3.18.2.post1
decorator==4.4.2
dlib==19.18.0
face-recognition==1.3.0
face-recognition-models==0.3.0
idna==2.10
imageio==2.9.0
imageio-ffmpeg==0.4.2
moviepy==1.0.3
numpy==1.18.4
opencv-python==4.4.0.46
Pillow==8.0.1
proglog==0.1.9
requests==2.24.0
tqdm==4.51.0
urllib3==1.25.11
wincertstore==0.2
Visual Studio Code (famously known as VS Code) is a free open source text editor by
Microsoft. VS Code is available for Windows, Linux, and macOS. Although the editor
is relatively lightweight, it includes some powerful features that have made VS Code
one of the most popular development environment tools in recent times.
Visual Studio Code helps with debugging, has embedded Git control and GitHub
integration, syntax highlighting, intelligent code completion, snippets and code
refactoring. We used it in this project to run the Python code.
VS Code supports a wide array of programming languages, from Java, C++ and Python
to CSS, Go and Dockerfiles. Moreover, VS Code allows you to add and even create new
extensions, including code linters, debuggers, and cloud and web development support.
Microsoft Excel
Microsoft Excel is a spreadsheet program included in the Microsoft Office suite of
applications. Spreadsheets present tables of values arranged in rows and columns that
can be mathematically manipulated using both basic and complex arithmetic functions
and operations.
Apart from its standard spreadsheet features, Excel also offers programming support
via Microsoft's Visual Basic for Applications (VBA), the capacity to access data from
external sources via Microsoft's Dynamic Data Exchange (DDE) and extensive
graphing and charting abilities.
Excel, being an electronic spreadsheet program, can be used to store, organize and
manipulate data. Electronic spreadsheet programs were originally based on the paper
spreadsheets used for accounting purposes.
The basic layout of computerized spreadsheets is more or less the same as that of the
paper ones. Related data can be stored in tables, which are groups of small rectangular
boxes or cells organized into rows and columns.
DATABASE CREATION:
The first step in the attendance system is the creation of a database of the faces that
will be used. Different individuals are considered, and a camera is used to detect faces
and record the frontal face. The number of frames taken into consideration can be
adjusted for the desired accuracy level. These images are then stored in the database
along with the Registration ID.
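As an illustration, the sketch below shows a minimal version of this enrolment step in
Python; the directory name, the registration ID prompt and the frame count are
hypothetical placeholders rather than the exact values used in the project.

# Sketch: capture frontal face crops for the database (names/paths assumed).
import cv2

registration_id = input('Enter Registration ID: ')  # hypothetical prompt
detector = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
cap = cv2.VideoCapture(0)
saved = 0
while saved < 20:  # the number of frames kept is adjustable for accuracy
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5):
        saved += 1
        cv2.imwrite(f'Training_images/{registration_id}_{saved}.jpg', frame[y:y + h, x:x + w])
cap.release()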
9. Face Recognition Process
Face detection is important: when the image taken by the camera is given to the system,
a face detection algorithm is applied to identify the human faces in that image. A number
of image processing algorithms have been introduced to detect faces in an image along
with the locations of those detected faces. We have used the HOG method to detect
human faces in a given image.
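A minimal sketch of HOG-based detection with the face_recognition library (which wraps
dlib's HOG detector) could look as follows; the image file name is an assumed placeholder.

import face_recognition

image = face_recognition.load_image_file('class_photo.jpg')            # placeholder file
face_locations = face_recognition.face_locations(image, model='hog')  # HOG-based detector
print(f'Found {len(face_locations)} face(s):', face_locations)        # (top, right, bottom, left) boxes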
The value of the integral image at a specific location is the sum of the pixels above and
to the left of that location. To illustrate, the value of the integral image at location 1
is the sum of the pixels in rectangle A. The values of the integral image at the remaining
locations are cumulative: the value at location 2 is the summation of A and B, (A + B);
at location 3 it is the summation of A and C, (A + C); and at location 4 it is the
summation of all the regions, (A + B + C + D) (Srushti Girhe et al., 2015). Therefore,
the sum within region D can be computed with only additions and subtractions of the
diagonal corners, 4 + 1 − (2 + 3), eliminating rectangles A, B and C.
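This computation can be sketched with NumPy's cumulative sums; the sample pixel values
below are illustrative only.

import numpy as np

img = np.arange(16, dtype=np.int64).reshape(4, 4)  # illustrative 4x4 pixel values
ii = img.cumsum(axis=0).cumsum(axis=1)             # integral image: sum above and to the left
# Sum over the block with corners (r0, c0)..(r1, c1) via the 4 + 1 - (2 + 3) rule:
r0, c0, r1, c1 = 1, 1, 3, 3
block_sum = ii[r1, c1] - ii[r0 - 1, c1] - ii[r1, c0 - 1] + ii[r0 - 1, c0 - 1]
assert block_sum == img[r0:r1 + 1, c0:c1 + 1].sum()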
Burak Ozen (2017) and Chris McCormick (2013) mention that AdaBoost, also known as
'Adaptive Boosting', is a famous boosting technique in which multiple "weak classifiers"
are combined into a single "strong classifier". The training set for each new classifier
is selected according to the results of the previous classifier, and the algorithm
determines how much weight should be given to each classifier in order to make it
significant. However, false detections may occur, and these have to be removed manually
based on human inspection; Figure 18 shows an example of a false face detection
(circled in blue).
Figure 18: Face Detection
Convolutional Neural Network (CNN) is another neural network algorithm for face
recognition. Similar to an ANN, a CNN consists of an input layer, hidden layers and an
output layer. The hidden layers of a CNN comprise multiple layer types: convolutional
layers, pooling layers, fully connected layers and normalization layers. However,
thousands or millions of facial images have to be trained for a CNN to work accurately,
and training takes a long time; an example is DeepFace, introduced by Facebook.
There are 68 specific points in a human face, in other words, 68 face landmarks. The
main function of this step is to detect the landmarks of faces and to position the image.
A Python script is used to automatically detect the face landmarks and to position the
face as much as possible without distorting the image.
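A brief sketch of this landmark step using the face_recognition library (built on dlib's
68-point shape predictor) is shown below; the image file name is assumed.

import face_recognition

image = face_recognition.load_image_file('student.jpg')  # assumed file name
landmarks = face_recognition.face_landmarks(image)       # one dict per detected face
for face in landmarks:
    for feature, points in face.items():                 # e.g. 'left_eye', 'nose_tip', ...
        print(feature, len(points), 'points')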
This module takes input from the camera and tries to detect a face in the video input.
The detection of the face is achieved through Haar classifiers, mainly the frontal face
cascade classifier. The face is detected as a rectangle, converted to a grayscale image
and stored in memory to be used for training the model.
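A compact sketch of this module, assuming OpenCV's bundled frontal face Haar cascade
and a default webcam:

import cv2

face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
cap = cv2.VideoCapture(0)
ok, frame = cap.read()
if ok:
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # each detection is a rectangle (x, y, w, h); the grayscale crop is kept for training
    for (x, y, w, h) in face_cascade.detectMultiScale(gray, 1.3, 5):
        face_roi = gray[y:y + h, x:x + w]
cap.release()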
Once the faces are detected in the given image, the next step is to extract the unique
identifying facial features of each image. Whenever we obtain a face localization, 128
key facial points are extracted for each input image; these highly accurate 128-d facial
points are stored in a data file for face recognition.
This is the last step of the face recognition process. We have used one of the best
learning techniques, deep metric learning, which is highly accurate and capable of
outputting real-valued feature vectors. Our system quantifies the faces, constructing
the 128-d embedding (quantification) for each. Internally, a compare-faces function is
used to compute the Euclidean distance between the face in the image and all faces in
the dataset. If the current image matches the existing dataset within the 60% threshold,
the system moves on to attendance marking.
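A sketch of this matching step is given below, assuming the 60% threshold maps to the
face_recognition library's default distance tolerance of 0.6 (a lower Euclidean distance
means a better match).

import numpy as np
import face_recognition

def best_match(known_encodings, known_names, probe_encoding, tolerance=0.6):
    # Euclidean distance between the probe face and every known 128-d encoding
    distances = face_recognition.face_distance(np.array(known_encodings), probe_encoding)
    best = int(np.argmin(distances))
    # a distance at or below the tolerance counts as a match
    return known_names[best] if distances[best] <= tolerance else None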
TRAINING OF FACES:
The images are saved in grayscale after being recorded by the camera. The LBPH
recognizer is employed to train these faces, because the training-set resolution and the
recognized face resolutions can be completely different. A part of the image is taken as
the centre and its neighbours are thresholded against it: a neighbour whose intensity is
greater than or equal to that of the centre is denoted as 1, and 0 if not. This results in
binary patterns generally known as LBP codes.
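A minimal NumPy sketch of computing one LBP code for a single 3x3 neighbourhood, under
the thresholding convention described above (the patch values are illustrative):

import numpy as np

def lbp_code(block):
    # block: 3x3 grayscale patch; threshold the 8 neighbours against the centre
    centre = block[1, 1]
    neighbours = [block[0, 0], block[0, 1], block[0, 2], block[1, 2],
                  block[2, 2], block[2, 1], block[2, 0], block[1, 0]]  # clockwise
    bits = [1 if n >= centre else 0 for n in neighbours]
    return sum(bit << i for i, bit in enumerate(bits))  # pack the 8 bits into one code

patch = np.array([[90, 120, 60], [200, 100, 40], [150, 100, 30]])
print(lbp_code(patch))  # one LBP code for this neighbourhood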
Figure 23: 2D Human face recognition block diagram
Figure 24: Flow of the Proposed Approach (Training Part)
CNN (Convolutional Neural Network)
Convolutional Neural Network (CNN) is a well-known deep learning architecture
inspired by the natural visual perception mechanism of living creatures. In 1959,
Hubel & Wiesel found that cells in the animal visual cortex are responsible for detecting
light in receptive fields. Inspired by this discovery, Kunihiko Fukushima proposed the
neocognitron in 1980, which can be regarded as the predecessor of CNN. In 1990,
LeCun et al. published the seminal paper establishing the modern framework of CNN,
and later improved upon it. They developed a multi-layer artificial neural network called
LeNet-5 which could classify handwritten digits. Like other neural networks, LeNet-5
has multiple layers and can be trained with the backpropagation algorithm. It can obtain
effective representations of the original image, which makes it possible to recognize
visual patterns directly from raw pixels with little to no preprocessing. A parallel
study by Zhang et al. used a shift-invariant artificial neural network (SIANN) to
recognize characters from an image. However, due to the lack of large training data and
computing power at that time, these networks could not perform well on more complex
problems, e.g., large-scale image and video classification.
Since 2006, many methods have been developed to overcome the difficulties encountered
in training deep CNNs. Most notably, Krizhevsky et al. proposed a classic CNN
architecture and showed significant improvements over previous methods on the image
classification task. The overall architecture of their method, i.e., AlexNet, is similar to
LeNet-5 but with a deeper structure. With the success of AlexNet, many works have been
proposed to improve its performance; among them, four representative works are ZFNet,
VGGNet, GoogLeNet and ResNet. From the evolution of these architectures, a typical trend
is that the networks are getting deeper: e.g., ResNet, which won ILSVRC 2015, is about
20 times deeper than AlexNet and 8 times deeper than VGGNet. By increasing depth, the
network can better approximate the target function with increased nonlinearity and
obtain better feature representations. However, depth also increases the complexity of
the network, which makes it more difficult to optimize and more prone to overfitting.
Along this line, various methods have been proposed to deal with these problems in
various aspects.
There are numerous variants of CNN architectures in the literature; however, their basic
components are very similar. Taking the famous LeNet-5 as an example, it consists of
three types of layers, namely convolutional, pooling, and fully-connected layers. The
convolutional layer aims to learn feature representations of the inputs, and is composed
of several convolution kernels which are used to compute different feature maps.
With the increasing challenges of computer vision and machine learning tasks, the models
of deep neural networks have become more and more complex. These powerful models
require more data for training in order to avoid overfitting. Meanwhile, big training
data also brings new challenges, such as how to train the networks in a feasible amount
of time.
Deep CNNs have made breakthroughs in processing image, video, speech and text, and
have been improved in many different aspects, namely layer design, activation
functions, loss functions, regularization, optimization and fast computation.
The ability to accurately represent sentences is central to language understanding.
We describe a convolutional architecture dubbed the Dynamic Convolutional
Neural Network (DCNN) that we adopt for the semantic modelling of sentences.
The network uses Dynamic k-Max Pooling, a global pooling operation over linear
sequences. The network handles input sentences of varying length and induces a
feature graph over the sentence that is capable of explicitly capturing short and
long-range relations. The network does not rely on a parse tree and is easily
applicable to any language. We test the DCNN in four experiments: small scale
binary and multi-class sentiment prediction, six-way question classification and
Twitter sentiment prediction by distant supervision. The network achieves excellent
performance in the first three tasks and a greater than 25% error reduction in the
last task with respect to the strongest baseline.
Figure 25: CNN
A Convolutional Neural Network (ConvNet/CNN) is a Deep Learning algorithm that
can take in an input image, assign importance (learnable weights and biases) to various
aspects/objects in the image, and be able to differentiate one from the other. The pre-
processing required in a ConvNet is much lower as compared to other classification
algorithms. While in primitive methods filters are hand-engineered, with enough training,
ConvNets have the ability to learn these filters/characteristics.
The architecture of a ConvNet is analogous to that of the connectivity pattern of Neurons
in the Human Brain and was inspired by the organization of the Visual Cortex. Individual
neurons respond to stimuli only in a restricted region of the visual field known as the
Receptive Field. A collection of such fields overlap to cover the entire visual area.
An image is simply a matrix of pixel values, so why not just flatten the image (e.g. a
3x3 image matrix into a 9x1 vector) and feed it to a multi-level perceptron for
classification? In practice this does not work well: in cases of extremely basic binary
images, the method might show an average precision score while performing prediction
of classes, but it would have little to no accuracy for complex images having pixel
dependencies throughout.
A ConvNet is able to successfully capture the Spatial and Temporal dependencies in
an image through the application of relevant filters. The architecture performs a better
fitting to the image dataset due to the reduction in the number of parameters involved and
the reusability of weights. In other words, the network can be trained to understand the
sophistication of the image better.
Figure 26.
We have an RGB image that has been separated into its three color planes: Red, Green,
and Blue. There are a number of such color spaces in which images exist: Grayscale,
RGB, HSV, CMYK, etc.
You can imagine how computationally intensive things would get once images reach
dimensions of, say, 8K (7680x4320). The role of the ConvNet is to reduce the images into
a form that is easier to process, without losing features that are critical for getting a
good prediction. This is important when we are to design an architecture that is not only
good at learning features but is also scalable to massive datasets.
Similar to the Convolutional Layer, the Pooling layer is responsible for reducing the
spatial size of the Convolved Feature. This is to decrease the computational power
required to process the data through dimensionality reduction. Furthermore, it is useful
for extracting dominant features which are rotational and positional invariant, thus
maintaining the process of effectively training the model.
There are two types of Pooling: Max Pooling and Average Pooling. Max Pooling returns
the maximum value from the portion of the image covered by the Kernel. On the other
hand, Average Pooling returns the average of all the values from the portion of the
image covered by the Kernel.
Max pooling also acts as a noise suppressant: it discards the noisy activations
altogether and performs de-noising along with dimensionality reduction. Average
pooling, on the other hand, simply performs dimensionality reduction as a
noise-suppressing mechanism. Hence, we can say that max pooling performs a lot better
than average pooling.
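Both pooling operations can be illustrated on a small feature map (values illustrative):

import numpy as np

fmap = np.array([[1., 3., 2., 4.],
                 [5., 6., 1., 2.],
                 [7., 2., 8., 1.],
                 [3., 4., 9., 5.]])
# split the 4x4 map into non-overlapping 2x2 windows (2x2 pooling, stride 2)
windows = fmap.reshape(2, 2, 2, 2).swapaxes(1, 2)
print(windows.max(axis=(2, 3)))   # max pooling     -> [[6. 4.] [7. 9.]]
print(windows.mean(axis=(2, 3)))  # average pooling -> [[3.75 2.25] [4.   5.75]]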
The Convolutional Layer and the Pooling Layer, together form the i-th layer of a
Convolutional Neural Network. Depending on the complexities in the images, the number
of such layers may be increased for capturing low-level details even further, but at the
cost of more computational power.
After going through the above process, we have successfully enabled the model to
understand the features. Moving on, we are going to flatten the final output and feed it to
a regular Neural Network for classification purposes.
Figure 27.
Adding a Fully-Connected layer is a (usually) cheap way of learning non-linear
combinations of the high-level features as represented by the output of the convolutional
layer. The Fully-Connected layer is learning a possibly non-linear function in that space.
Now that we have converted our input image into a suitable form for our Multi-Level
Perceptron, we shall flatten the image into a column vector. The flattened output is fed to
a feed-forward neural network and backpropagation is applied to every iteration of
training. Over a series of epochs, the model is able to distinguish between dominating
and certain low-level features in images and classify them using the Softmax
Classification technique.
CNNs work with both grayscale and RGB images. Before we move on, you need to
understand the difference between grayscale and RGB images.
An image consists of pixels; in deep learning, images are represented as arrays of pixel
values.
CNN Architecture
The CNN architecture is complicated when compared to the MLP architecture. There are
different types of additional layers and operations in the CNN architecture.
CNNs take the images in the original format. We do not need to flatten the images to
use with CNNs as we did in MLPs.
Layers in a CNN
There are three main types of layers in a CNN: Convolutional layers, Pooling layers
and Fully connected (dense) layers. In addition to that, activation layers are added
after each convolutional layer and fully connected layer.
Operations in a CNN
There are four main types of operations in a CNN: Convolution operation, Pooling
operation, Flatten operation and Classification (or other relevant) operation.
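As a sketch only, these layers and operations can be expressed with Keras (an assumption
made for illustration; the attendance pipeline in this report does not itself use Keras):

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Input(shape=(64, 64, 1)),               # e.g. a 64x64 grayscale face image
    layers.Conv2D(32, (3, 3), activation='relu'),  # convolution operation + activation
    layers.MaxPooling2D((2, 2)),                   # pooling operation
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),                              # flatten operation
    layers.Dense(128, activation='relu'),          # fully connected layer
    layers.Dense(10, activation='softmax'),        # classification, e.g. 10 students
])
model.summary()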
Figure 28.
The green box represents a neural network; the arrows indicate memory, or simply
feedback to the next input. The first figure shows the RNN; the second shows the same
RNN unrolled in time.
Consider the sequence [i am a good boy]; we can say the sequence is arranged in time.
At t=0, X0="i" is given as the input; at t=1, X1="am" is given as the input. The state
from the first time step is remembered and given as input during the second time step
along with the current input at that time step.
In a feed-forward neural network, the network is forward-propagated only once per
sample. But in an RNN, the network is forward-propagated a number of times equal to the
number of time steps per sample.
A Recurrent Neural Network is a type of artificial neural network that is good at
modeling sequential data. Traditional deep neural networks assume that inputs and
outputs are independent of each other, whereas the output of a recurrent neural network
depends on the prior elements within the sequence. RNNs have an inherent "memory", as
they take information from prior inputs to influence the current input and output; one
can think of this as a hidden layer that remembers information through the passage of
time.
RNNs are mainly used for predictions of sequential data over many time steps. A
simplified way of representing a recurrent neural network is by unfolding/unrolling
the RNN over the input sequence. For example, if we feed a sentence of 10 words as input
to the recurrent neural network, the network would be unfolded such that it
has 10 neural network layers.
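A minimal NumPy sketch of unrolling a simple RNN over a sequence, with all shapes and
values illustrative:

import numpy as np

rng = np.random.default_rng(0)
Wx = rng.normal(size=(4, 3))                       # input-to-hidden weights
Wh = rng.normal(size=(4, 4))                       # hidden-to-hidden (recurrent) weights
h = np.zeros(4)                                    # initial hidden state (the "memory")
sequence = [rng.normal(size=3) for _ in range(5)]  # 5 time steps, e.g. 5 words
for x_t in sequence:
    # the new state depends on the current input AND the previous state
    h = np.tanh(Wx @ x_t + Wh @ h)
print(h)                                           # final hidden state after unrolling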
Figure 29.
Exploding gradients
In some cases the value of the gradients keeps getting exponentially larger, causing very
large weight updates and making gradient descent diverge, which makes the training
process very unstable. This problem is called the exploding gradient.
Vanishing gradients
In other cases, as backpropagation advances from the output layer to the input layer,
the gradient terms go to zero exponentially fast, which eventually leaves the weights of
the initial or lower layers nearly unchanged and makes it difficult to learn long-range
dependencies. As a result, gradient descent never converges to the optimum. This problem
is called the vanishing gradient.
Bidirectional recurrent neural networks (BRNN)
A typical RNN relies on past and present events. However, there can be situations where
a prediction depends on past, present, and future events.
For example, predicting a word to be included in a sentence might require us to look into
the future, i.e., a word in a sentence could depend on a future event. Such linguistic
dependencies are customary in several text prediction tasks.
Thus, capturing and analyzing both past and future events is helpful.
To enable straight (past) and reverse traversal of input (future), Bidirectional RNNs or
BRNNs are used. A BRNN is a combination of two RNNs - one RNN moves forward,
beginning from the start of the data sequence, and the other, moves backward, beginning
from the end of the data sequence. The outputs of the two RNNs are usually concatenated
at each time step, though there are other options, e.g. summation. The individual network
blocks in a BRNN can either be a traditional RNN, GRU, or LSTM depending upon the
use-case.
10. Coding

import cv2
import numpy as np
import face_recognition
import os
from datetime import datetime

path = 'Training_images'
images = []
classNames = []
myList = os.listdir(path)
print(myList)
for cl in myList:
    curImg = cv2.imread(f'{path}/{cl}')
    images.append(curImg)
    classNames.append(os.path.splitext(cl)[0])
print(classNames)

def findEncodings(images):
    encodeList = []
    # (reconstructed body) encode each known face image into a 128-d vector
    for img in images:
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        encode = face_recognition.face_encodings(img)[0]
        encodeList.append(encode)
    return encodeList

def markAttendance(name):
    with open('Attendance.csv', 'r+') as f:
        myDataList = f.readlines()
        nameList = []
        for line in myDataList:
            entry = line.split(',')
            nameList.append(entry[0])
        if name not in nameList:
            now = datetime.now()
            dtString = now.strftime('%H:%M:%S')
            f.writelines(f'\n{name},{dtString}')

encodeListKnown = findEncodings(images)
print('Encoding Complete')

cap = cv2.VideoCapture(0)
while True:
    success, img = cap.read()
    # img = captureScreen()
    imgS = cv2.resize(img, (0, 0), None, 0.25, 0.25)
    imgS = cv2.cvtColor(imgS, cv2.COLOR_BGR2RGB)
    facesCurFrame = face_recognition.face_locations(imgS)
    encodesCurFrame = face_recognition.face_encodings(imgS, facesCurFrame)
    # (reconstructed loop) compare each detected face against the known encodings
    for encodeFace, faceLoc in zip(encodesCurFrame, facesCurFrame):
        matches = face_recognition.compare_faces(encodeListKnown, encodeFace)
        faceDis = face_recognition.face_distance(encodeListKnown, encodeFace)
        matchIndex = np.argmin(faceDis)
        if matches[matchIndex]:
            name = classNames[matchIndex].upper()
            # print(name)
            y1, x2, y2, x1 = faceLoc
            y1, x2, y2, x1 = y1 * 4, x2 * 4, y2 * 4, x1 * 4
            cv2.rectangle(img, (x1, y1), (x2, y2), (0, 255, 0), 2)
            cv2.rectangle(img, (x1, y2 - 35), (x2, y2), (0, 255, 0), cv2.FILLED)
            cv2.putText(img, name, (x1 + 6, y2 - 6), cv2.FONT_HERSHEY_COMPLEX, 1,
                        (255, 255, 255), 2)
            markAttendance(name)
    cv2.imshow('Webcam', img)
    cv2.waitKey(1)
Design:
Input Design:
Input design is the process of converting user-originated inputs to a computer-based
format. A system user interacting through a workstation must be able to tell the system
whether to accept the input or to produce reports. The collection of input data is
considered to be the most expensive part of system design. Since the input has to be
planned in such a manner as to get relevant information, extreme care is taken to obtain
pertinent information. Input design is the part of overall system design that requires
special attention; the aim of designing input data is to make data entry easy and free
from errors. The input forms are designed using the controls available in the .NET
framework.
In this project, input is first entered through the allocation forms: the student details
form and the subject entry form. The timetable form helps calculate subject-wise
attendance.
Output Design:
For this application, "Student Attendance Management System", output design generally
refers to the results and information generated by the system. For many end-users,
output is the main reason for developing the system and the basis on which they
evaluate the usefulness of the application.
The output is designed in such a way that it is attractive, convenient and informative.
Forms are designed with various features which make the output more pleasing.
As the outputs are the most important source of information for the users, a better
design improves the system's relationship with them and also helps in decision making.
Form design elaborates the way output is presented and the layout available for
capturing information.
One of the important factors of a system is the output it produces; the output of a
computer system is used to communicate the results of processing to the user.
In the attendance management system, subject-wise attendance reports maintained by
staff can be shown, and the report as a whole can be obtained with administrator
privileges only.
11. Implementation
First, the Raspberry Pi is connected to the required components, as shown in the
following figure:
Introduction:
Once the source code has been generated, the software must be tested to uncover as many
errors as possible before delivery to customers.
Our goal is to design a series of test cases that have a high likelihood of finding errors.
Software testing techniques provide systematic guidance for designing tests that
exercise the internal logic of software components and exercise the input and output
domains of the program, to uncover errors in program function, behaviour and
performance.
Internal program logic is exercised using white-box test case design techniques.
Software requirements are exercised using black-box test case design techniques.
In both cases, the intent is to find the maximum number of errors with the minimum
amount of effort and time.
Testing Methodologies:
A software testing strategy must accommodate low-level tests that are necessary to
verify that small source code segments have been correctly implemented, as well as
high-level tests that validate major system functions against customer requirements. A
strategy must provide guidance for the practitioner and a set of milestones for the
manager. Because the steps of the test strategy occur at a time when deadline pressure
begins to rise, progress must be measurable and problems must surface as early as
possible. The following testing techniques are well known, and the same strategy was
adopted during this project's testing.
Performance Testing:
Performance testing is designed to test the run-time performance of software within the
context of an integrated system. Performance testing occurs throughout all steps in the
testing process; even at the unit level, the performance of an individual module may be
assessed as white-box tests are conducted. This project reduces the attendance table and
code overhead, and generates reports quickly with no extra waiting time: correctly
entered data produces results within a few milliseconds while using only a small amount
of system memory. The application does not automatically gain access to other software;
user permission is required for access to other applications.
System Maintenance
Software maintenance is far more than finding mistakes. Provision must be made for
environmental changes, which may affect either the computer or other parts of the
computer-based system; such activity is normally called maintenance.
It includes both the improvement of system functions and the correction of faults
which arise during the operation of a new system.
Storing data on a separate secondary device leads to effective and efficient maintenance
of the system. The nominated person should have sufficient knowledge of the
organisation's computer systems to assess each proposed change.
1. Life-long Learning
With the implementation of this project, we gained skills in the commands of LabVIEW,
specifically the Vision Assistant and Acquisition based modules. Understanding the
machine vision algorithm for face detection and reading the LabVIEW manual enhanced
our skills with LabVIEW.
Furthermore, dividing the project into different phases and time slots not only
developed our project management skills but also improved our time management skills.
12. Analysis of the Project
This proposed approach provides a method to perform face recognition for a student
attendance system, based on the texture-based features of facial images. Face
recognition is the identification of an individual by comparing his/her real-time
captured image with the stored images of that person in a database. Thus, the training
set has to be chosen based on the latest appearance of an individual, besides taking
important factors such as illumination into consideration. The proposed approach is
trained and tested on different datasets. The Yale face database, which consists of one
hundred and sixty-five images of fifteen individuals under multiple conditions, is
used; however, this database consists of only grayscale images. Hence, we also use our
own database of colour images, which is further categorized into a high-quality set and
a low-quality set, as the images differ in quality: some images are blurred while some
are clearer. The statistics of each data set were discussed in the earlier chapter. The
Viola-Jones object detection framework is applied in this approach to detect and
localize the face, given a facial image or a video frame. From the detected face, an
algorithm that extracts the important features needed to perform face recognition is
designed. Some pre-processing steps are performed on the input facial image before the
features are extracted. Median filtering is used because it is able to preserve the
edges of the image while removing image noise. The facial image is scaled to a suitable
size for standardization and converted to a grayscale image if it is not one already,
because CLAHE and the LBP operator work on grayscale images. One of the factors that is
usually a stumbling block for face recognition performance is uneven lighting; hence,
several measures have been taken in this proposed approach to reduce the effect of
non-uniform lighting conditions. Before feature extraction takes place, pre-processing
is performed on the cropped face image (ROI) to reduce the illumination problem.
13. Conclusion
13.1 Scope for Future Development
The project has a very vast scope in the future. It can be implemented on an intranet,
and it can be updated in the near future as and when requirements arise, as it is very
flexible in terms of expansion. With the proposed software ready and fully functional,
the client is now able to manage, and hence run, the entire work in a much better, more
accurate and error-free manner.
14. References
RoshanTharanga, J. G., et al., "Smart attendance using real time face recognition
(SMART-FR)," Department of Electronic and Computer Engineering, Sri Lanka Institute
of Information Technology (SLIIT), Malabe, Sri Lanka, 2013.
Vinay Hermath, Ashwini Mayakar, "Face Recognition Using Eigen Faces and," IACSIT,
vol. 2, no. 4, pp. 1793-8201, August 2010.
Docs.opencv.org (2018). OpenCV: Introduction to OpenCV-Python Tutorials. [online]
Available at: https://docs.opencv.org/3.4/d0/de3/tutorial_py_intro.html