Black Book
SUBMITTED BY
CERTIFICATE
Submitted by
MITESH PARMAR Exam No:
NUPUR PURANIK Exam No:
DHRUVA JOSHI Exam No:
SONAL MALPANI Exam No:
are bonafide students of this institute and the work has been carried out by them
under the supervision of Prof. B. S. Thakare and it is approved for the partial fulfilment
of the requirement of Savitribai Phule Pune University, for the award of the degree of
Bachelor of Engineering (Computer Engineering).
Place: Pune
Date:
ABSTRACT
Manual verification of signatures currently takes a lot of time. Signature forgery in
various transactions has created the need for an automated checking system. The dynamic
signature is a biometric trait which is used in identification. Automatic signature
verification is an application of Image Processing. The aim of the model is to identify
the correct signature and thereby reduce fraudulent transactions. It is difficult to
obtain multiple signatures of the same person, so the idea is to duplicate a given
signature a number of times by inducing variations, and then to use verifiers to compute
a similarity score. Signature duplication follows a Cognitive Inspired Model: human-like
signatures are created by introducing Intra-Component and Inter-Component variability.
For verification, Structural Similarity Index (SSIM) and Image Hashing techniques are
used.
Keywords: SRSS, Cognitive Inspired Model, Signature Duplication, SSIM, Image
Hashing
ACKNOWLEDGEMENTS
I would like to take this opportunity to thank my internal guide Prof. B. S. Thakare for
giving me all the help and guidance I needed. I am really grateful to him for his kind
support. His valuable suggestions were very helpful.
I would also like to thank my project co-ordinator Prof. Santosh Shelke for giving me
the necessary guidance and support. I would also like to thank him for his useful inputs
and tips which really helped this project.
Mitesh Parmar
Nupur Puranik
Dhruva Joshi
Sonal Malpani
(B.E. Computer Engg.)
INDEX
1 INTRODUCTION
OVERVIEW
MOTIVATION
Dataset
Image Processing
OpenCV
2 LITERATURE SURVEY
FUNCTIONAL REQUIREMENTS
System Features
User Interfaces
Hardware Interfaces
Software Interfaces
NON-FUNCTIONAL REQUIREMENTS
Performance Requirements
Safety Requirements
Security Requirements
Software Quality Attributes
SYSTEM REQUIREMENTS
Database Requirements
Software Requirements
Hardware Requirements
4 SYSTEM DESIGN
SYSTEM ARCHITECTURE
MATHEMATICAL MODEL
UML DIAGRAMS
Use-Case Diagram
Activity Diagram
Sequence Diagram
Class Diagram
Deployment Diagram
5 PROJECT PLAN
PROJECT ESTIMATES
Reconciled Estimates
Project Resources
RISK MANAGEMENT
PROJECT SCHEDULE
Project Task Set
Task Network
Timeline Chart
TEAM ORGANISATION
Team Structure
6 PROJECT IMPLEMENTATION
Software Tools
Programming Languages
ALGORITHM DETAILS
Image Recognition
Signature Verification
7 SOFTWARE TESTING
TYPES OF TESTING
Unit Testing
Integration Testing
Manual Testing
OUTCOMES
SCREENSHOTS
MySQL Database
9 CONCLUSIONS
CONCLUSIONS
FUTURE WORK
APPLICATIONS
Platform
Tools Used
Libraries Used
Deployment
PyCharm Setup
XAMPP Setup
Database
Resultant GUI
REFERENCES
LIST OF FIGURES
System Architecture
Activity Diagram
Sequence Diagram
Class Diagram
Deployment Diagram
Task Network
Sign-up Screen
Log-in Screen
User Profile
Output
Anaconda Setup
XAMPP Setup
1.1 Dataset
4.1 Use-Case
Estimate of Cost
Risk Table
INTRODUCTION
OVERVIEW
Signatures have been an integral part of authentication ever since authentication came
into existence. Legal documents, agreements, bank documents such as cheques, forms and
wills all depend on a signature. The 21st century has brought great advances in other
methods of authentication, such as iris recognition and fingerprint detection, yet
signatures remain indispensable. This has given rise to improved systems for verifying
signatures, which is a remarkable achievement and shows how far science and technology
have advanced. The need for such verification arises from forged signatures. Forgery is
the act of duplicating another person's signature, and it needs to be checked.
In this project Image Processing has been implemented in Python. It is not convenient,
and often not possible, for a person to provide many signatures with slight differences.
Image Processing is an efficient way of working with images: it uses various
transformation and representation techniques and algorithms to analyse images and to
modify them as needed, making the user's task easier. Python is an object-oriented
language that also supports structured programming. It uses dynamic typing, reference
counting and a cycle-detecting garbage collector for memory management. It also features
late binding, which binds method and variable names during program execution. Python was
designed to be highly extensible.
MOTIVATION
The title of our project is ‘Signature Verification Using One Real Signature’. The main
purpose of the project is to verify a signature. The user takes a picture of his/her
signature and uploads it to the application, and the application creates duplicates that
are used in the verification process. When a new signature from the same signer is given
as input, its similarity to the duplicates is computed and the result is reported as
forged or genuine.
The scope of our project is to help in the signature verification process. The
single-signature system is unique and needed because it is often not possible to collect
several signatures from a signer. Signatures play an important role in authentication,
and with advances in technology the forgery rate has also gone up. This system is a
counter-measure that helps authenticate correct information. Although we take only one
signature, the goal is to create as many duplicates as possible, because the signature of
the same person varies under different circumstances. The system would be helpful in
banks where cheques have to be processed, and also for wills and legal documents. The
signature provided by the user is processed and a final output is produced. This output
is given to the bank or other organization using the system so that it can differentiate
between genuine and forged signatures. This prevents forgeries and makes the use of
signatures for authentication safer.
Limitations-
- Fewer variations are created if the signature is complex, and hence accuracy is
affected.
- If any external object is present in the image, prediction accuracy may be
reduced.
METHODOLOGIES OF PROBLEM SOLVING
Dataset
The dataset we used is self-generated. Every time a new user registers, the user
provides a signature, and this signature is taken for processing. We have used OpenCV to
create duplicates of the input signature. The duplicates are created by adding
variations to each part of the signature. The most variation is observed in the curves
of a signature, and hence most duplicates differ in their curves. The number of
duplicates per signature depends largely on the number of components; on average, about
50 duplicates are formed per signature. This is a complex dataset because, during
verification, the new signature is compared only with the duplicates that were
previously created for that signer. Our proposed system gives a correct result 89.6% of
the time when checked with 20 different signers. To calculate this result we have taken
into account both the real and the forged signatures of each signer.
Figure 1.1 Dataset Images
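An accuracy figure of this kind, together with the false acceptance rate (FAR) and false
rejection rate (FRR) usually quoted for signature verifiers, can be computed from a list of
labelled trials. The following is a minimal sketch and not the evaluation script behind the
reported result; the Trial record, the verification_metrics function and the sample outcomes
are hypothetical and made up purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    is_genuine: bool         # ground truth: the signature really belongs to the signer
    predicted_genuine: bool  # verdict returned by the verification system


def verification_metrics(trials):
    """Return accuracy, FRR and FAR for a list of labelled trials."""
    genuine = [t for t in trials if t.is_genuine]
    forged = [t for t in trials if not t.is_genuine]
    correct = sum(t.is_genuine == t.predicted_genuine for t in trials)
    false_rejects = sum(not t.predicted_genuine for t in genuine)  # genuine but rejected
    false_accepts = sum(t.predicted_genuine for t in forged)       # forged but accepted
    return {
        "accuracy": correct / len(trials),
        "frr": false_rejects / len(genuine) if genuine else 0.0,
        "far": false_accepts / len(forged) if forged else 0.0,
    }


# Made-up outcomes: three genuine and two forged test signatures.
trials = [
    Trial(True, True), Trial(True, True), Trial(True, False),
    Trial(False, False), Trial(False, True),
]
metrics = verification_metrics(trials)
print(metrics)
```

Reporting FAR and FRR alongside accuracy shows whether errors come mainly from accepted
forgeries or from rejected genuine signatures.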
Image Processing:
Image processing deals with image-to-image transformation: it is a method of performing
operations on an image in order to get an enhanced image or to extract useful information
from it. It is a type of signal processing in which the input is an image and the output can
be an image or its important characteristics. The process involves 3 main steps: importing
the image, analysing and manipulating the image, and producing the output with the necessary
changes. For the proposed system, the input data are the signatures, which need to be read
and passed on for duplication. Duplication here is a process of image modification based on
certain logic. For this we have used OpenCV: the manipulations are done using features of
OpenCV and the duplicates are created.
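The idea of duplicating a signature by varying its components can be illustrated on a tiny
binary grid. The sketch below is not the OpenCV-based Cognitive Inspired Model used in the
project: it only shows one simple form of inter-component variability, in which every
connected stroke is shifted by a small random offset. The function names, the max_shift
parameter and the sample grid are assumptions made for illustration.

```python
import random
from collections import deque


def label_components(img):
    """Return connected ink components as lists of (row, col), using 4-connectivity."""
    rows, cols = len(img), len(img[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if img[r][c] and not seen[r][c]:
                queue, pixels = deque([(r, c)]), []
                seen[r][c] = True
                while queue:  # breadth-first flood fill of one component
                    y, x = queue.popleft()
                    pixels.append((y, x))
                    for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and img[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                components.append(pixels)
    return components


def make_duplicate(img, max_shift=1, rng=random):
    """Shift each component by an independent random offset (hypothetical variation rule)."""
    rows, cols = len(img), len(img[0])
    out = [[0] * cols for _ in range(rows)]
    for pixels in label_components(img):
        dy = rng.randint(-max_shift, max_shift)
        dx = rng.randint(-max_shift, max_shift)
        for y, x in pixels:
            ny, nx = y + dy, x + dx
            if 0 <= ny < rows and 0 <= nx < cols:  # drop pixels pushed off the canvas
                out[ny][nx] = 1
    return out


# Toy "signature" with two separate strokes (1 = ink, 0 = background).
signature = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
rng = random.Random(42)  # fixed seed so the sketch is repeatable
duplicates = [make_duplicate(signature, rng=rng) for _ in range(5)]
```

A fuller version would also deform each stroke (intra-component variability), for example by
perturbing its curves, which is where the report observes most of the variation.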
OpenCV:
OpenCV (Open Source Computer Vision) is a library of programming functions mainly aimed
at real-time computer vision. It is a library used for Image Processing and supports most
operations related to images. OpenCV was originally developed in C++; Python and Java
bindings are also provided. OpenCV runs on various operating systems such as Windows,
Linux, OS X, FreeBSD, NetBSD and OpenBSD. OpenCV is used for all sorts of image and video
analysis, like facial recognition and detection, license plate reading, photo editing,
advanced robotic vision, optical character recognition and verification.
p-Hash
pHash is an open source software library released under the GPLv3 license that implements
several perceptual hashing algorithms and provides a C-like API to use those functions in your
own programs. Perceptual hashing can be applied to a wide variety of situations. Besides
comparing images for copyright infringement, a group of researchers found that it could be used
to compare and match images in a database. A perceptual hash is a string (hash) generated from
an input picture by a special algorithm. Two images can be compared by calculating the Hamming
distance between their hashes (which basically counts the number of differing individual bits).
With a conventional hashing technique, such as a cryptographic hash, the slightest change to the
picture generates a totally different hash. The perceptual hash (p-hash) uses an approach similar
to the average hash (a-hash), but instead of averaging pixel values it relies on the discrete
cosine transform.
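The DCT-based hash and the Hamming distance described above can be sketched in a few lines.
This is an illustrative, pure-Python version and not the pHash library's implementation: it
assumes the image has already been converted to a 32x32 grayscale grid, keeps the 8x8 block of
lowest-frequency DCT coefficients, and sets each bit according to whether the coefficient lies
above the median of the block (ignoring the DC term). All function names and the synthetic test
image are assumptions.

```python
import math


def dct_1d(values):
    """Unnormalised DCT-II of a sequence."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i, v in enumerate(values))
        for k in range(n)
    ]


def dct_2d(grid):
    """Separable 2-D DCT: transform every row, then every column."""
    row_pass = [dct_1d(row) for row in grid]
    col_pass = [dct_1d(col) for col in zip(*row_pass)]
    return [list(row) for row in zip(*col_pass)]  # transpose back to row-major order


def phash(grid, hash_size=8):
    """Return a hash_size*hash_size-bit integer hash of a square grayscale grid."""
    coeffs = dct_2d(grid)
    low = [coeffs[r][c] for r in range(hash_size) for c in range(hash_size)]
    ac_sorted = sorted(low[1:])               # skip the DC term when finding the median
    median = ac_sorted[len(ac_sorted) // 2]   # middle element of the sorted AC terms
    bits = 0
    for value in low:
        bits = (bits << 1) | (1 if value > median else 0)
    return bits


def hamming(a, b):
    """Number of bit positions in which two integer hashes differ."""
    return bin(a ^ b).count("1")


# Synthetic 32x32 test pattern and a lightly perturbed copy of it.
image = [[(r * 7 + c * 13) % 256 for c in range(32)] for r in range(32)]
noisy = [[min(255, p + (3 if (r + c) % 5 == 0 else 0)) for c, p in enumerate(row)]
         for r, row in enumerate(image)]
h_image, h_noisy = phash(image), phash(noisy)
distance = hamming(h_image, h_noisy)
print(f"{h_image:016x} {h_noisy:016x} distance={distance}")
```

A small distance suggests visually similar images; where to draw the line between "similar"
and "different" is a tuning decision that this sketch does not make.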
d-Hash
Difference hash uses the same approach as a-hash, but instead of using information about
average values, it uses gradients (the differences between adjacent pixels). The difference from
conventional hashing is that with image hashing, if two pictures look practically identical but
are in a different format or resolution (or there is minor corruption, perhaps due to
compression), they should hash to the same, or nearly the same, number. Even though the actual
bits of their data are totally different, if they look practically identical to a human, they
hash to nearly the same thing.
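The gradient idea can be shown with a short sketch. It assumes the image has already been
reduced to a small grayscale grid of 8 rows by 9 columns (in practice this resizing would be
done with an image library such as OpenCV); each bit records whether a pixel is brighter than
its right-hand neighbour. The function names and the sample grid are hypothetical.

```python
def dhash(grid):
    """Difference hash: one bit per horizontally adjacent pixel pair, row by row."""
    bits = 0
    for row in grid:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a, b):
    """Number of bit positions in which two integer hashes differ."""
    return bin(a ^ b).count("1")


# 8 rows x 9 columns gives 8 comparisons per row, i.e. a 64-bit hash.
base = [[(r * 31 + c * c * 17) % 200 for c in range(9)] for r in range(8)]
brighter = [[p + 40 for p in row] for row in base]  # uniform brightness change

h_base, h_brighter = dhash(base), dhash(brighter)
print(hamming(h_base, h_brighter))
```

Because only the sign of each neighbouring difference is kept, a uniform brightness change
leaves the hash untouched, which is the kind of robustness the paragraph above describes.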
SSIM
SSIM stands for Structural Similarity Index. It is a perceptual metric that quantifies image
quality degradation caused by processing such as data compression or by losses in data
transmission. It is a full reference metric that requires two images from the same image
capture: a reference image and a processed image. The processed image is typically compressed.
It may, for example, be obtained by saving a reference image as a JPEG (at any quality level)
and then reading it back in. SSIM is best known in the video industry, but has strong
applications for still photography. Any image may be used, including those of Imatest test
patterns such as Spilled Coins or Log F-Contrast. SSIM actually measures the perceptual
difference between two similar images. It cannot judge which of the two is better: that must be
inferred from knowing which is the “original” and which has been subjected to additional
processing such as data compression.
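For two images x and y, SSIM combines their means, variances and covariance as
((2*mu_x*mu_y + C1) * (2*cov_xy + C2)) / ((mu_x^2 + mu_y^2 + C1) * (var_x + var_y + C2)),
with C1 = (0.01*L)^2 and C2 = (0.03*L)^2 for a dynamic range L. The sketch below evaluates this
formula once over the whole image. That is a simplification: library implementations such as
scikit-image's structural_similarity compute it over local windows and average the results.
The function name global_ssim and the test grids are assumptions.

```python
def global_ssim(x, y, dynamic_range=255):
    """Single-window SSIM over two equally sized grayscale grids."""
    xs = [p for row in x for p in row]
    ys = [p for row in y for p in row]
    n = len(xs)
    mu_x, mu_y = sum(xs) / n, sum(ys) / n
    var_x = sum((p - mu_x) ** 2 for p in xs) / n
    var_y = sum((q - mu_y) ** 2 for q in ys) / n
    cov_xy = sum((p - mu_x) * (q - mu_y) for p, q in zip(xs, ys)) / n
    c1 = (0.01 * dynamic_range) ** 2  # stabilises the luminance term
    c2 = (0.03 * dynamic_range) ** 2  # stabilises the contrast/structure term
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


reference = [[(r * 16 + c * 8) % 256 for c in range(8)] for r in range(8)]
inverted = [[255 - p for p in row] for row in reference]

same_score = global_ssim(reference, reference)
inverted_score = global_ssim(reference, inverted)
print(same_score, inverted_score)
```

Identical images score 1, and the score falls as structure, contrast or brightness diverge.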
CHAPTER 2
LITERATURE SURVEY
Authors | Main Features | Database | Approach | Results/Conclusions
Napa Sae Bae, et al. [15] | Histogram Feature Extraction, User Template Generation, Manhattan Distance | MCYT-100, SUSIG | HMM | FAR: 4.66%
B.S Thakare, H.R Deshmukh [15] | SIFT, LBP for feature extraction, Computer Vision | GPDS-300 | Markov Random Model | FRR: 16.62%, FAR: 14.33
Zhihua Xia, et al. [2] | Two step verification | SGNOTE, MCYT-100 | K-Nearest Neighbour (KNN) | 2 step verification improves the accuracy
Assumptions:
1. The end user device should be a laptop.
2. Additionally, the end user has used the application before for his duplicates to be present in
the database.
3. The system has a dataset inbuilt in the backend which stores all duplicates for
verification and prediction purpose.
Dependencies:
FUNCTIONAL REQUIREMENTS
System Features
User Interfaces
The user needs to give his/her original signature while signing up. The user takes a
picture of the signature and uploads it. The picture is sent to the program for
processing and duplication. The duplicates are stored as a dataset for later use. When
the signer gives a signature for verification, this new signature is checked against the
duplicates and the result is given as output.
Hardware Interfaces
The Personal Computer (PC) is used to access the desktop application. It acts as a
medium through which the user can interact with the database. The user inputs his/her
data as required by the application; the PC receives this data and the system then
stores it in the database.
Software Interfaces
The desktop application interacts with the Flask server and the SQLite database, which
is accessed for login/signup purposes. Here the records are stored as given by the user.
These records are retrieved by the desktop application for further analysis and
prediction.
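The login/signup storage described here might look like the following sketch, which uses
Python's built-in sqlite3 module with an in-memory database. The table layout, the column
names and the sign_up and log_in helpers are assumptions made for illustration, not the
project's actual schema or Flask routes. Passwords are stored as salted PBKDF2 hashes rather
than as plain text.

```python
import hashlib
import hmac
import os
import sqlite3


def hash_password(password, salt):
    """Derive a salted hash so the plain password is never stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def sign_up(conn, user_id, password):
    """Insert a new user; return False if the user id is already taken."""
    salt = os.urandom(16)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO users (user_id, salt, pw_hash) VALUES (?, ?, ?)",
                (user_id, salt, hash_password(password, salt)),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def log_in(conn, user_id, password):
    """Check the supplied credentials against the stored salted hash."""
    row = conn.execute(
        "SELECT salt, pw_hash FROM users WHERE user_id = ?", (user_id,)
    ).fetchone()
    if row is None:
        return False
    salt, stored_hash = row
    return hmac.compare_digest(stored_hash, hash_password(password, salt))


conn = sqlite3.connect(":memory:")  # hypothetical in-memory database for the sketch
conn.execute(
    "CREATE TABLE users (user_id TEXT PRIMARY KEY, salt BLOB NOT NULL, pw_hash BLOB NOT NULL)"
)
created = sign_up(conn, "alice", "s3cret")
duplicate = sign_up(conn, "alice", "other")
ok = log_in(conn, "alice", "s3cret")
bad = log_in(conn, "alice", "wrong")
```

Parameterised queries (the ? placeholders) keep user input out of the SQL text, which matters
for any form that feeds a database.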
NON-FUNCTIONAL REQUIREMENTS
Performance Requirements
Accuracy: The system can verify a signature as forged or real with an accuracy of
89.6% using the p-hash, d-hash and SSIM verifiers.
Privacy: Authentication is provided in the form of a user id and password, which ensures
that no unauthorized person can access another user’s records.
Usability: The system requires the user to take a picture of the signature. The main
objective is to create duplicates from a single reference signature and predict the result.
Safety Requirements
● The image being processed needs to be of good quality, i.e. the signature in the
image of the document needs to be clearly visible.
● When photographing the signature, other parts of the document may be visible
in the image, but this may affect extraction of the exact signature.
● When cropping the image, the user should crop only the part of the image that
has to be detected.
Security Requirements
Third-party access to the dataset should be restricted, and care should be taken to
ensure that unauthorized access to the application is also prevented.
Software Quality Attributes
● Availability: The system does not crash and performs its function in every
possible scenario.
● Robustness: The system reduces the impact of operational mistakes or invalid
data.
● Extensibility: The system can further be extended to detect any number of signatures.
● Portability: The system can operate and adapt in new environments efficiently.
SYSTEM REQUIREMENTS
Database Requirements
Software Requirements
Platform:
1. Operating System: Windows 7/8/10 (32 or 64 bit)
2. IDE: Anaconda (Jupyter Notebook, Spyder)
3. Programming Languages: Python, JavaScript, HTML, MySQL, jQuery
Hardware Requirements
The Agile SDLC model is a combination of iterative and incremental process models with a focus
on process adaptability and customer satisfaction through rapid delivery of a working software
product. Agile methods break the product into small incremental builds. These builds are
provided in iterations. Each iteration typically lasts from about one to three weeks. Every
iteration involves cross-functional teams working simultaneously on various areas like:
● Planning
● Requirements Analysis
● Design
● Coding
● Acceptance Testing
All these phases are interwoven, and progress is seen as flowing steadily like a spiral
through the phases. The next phase is started, and we come back to an earlier phase if
any changes are required; hence the name Agile Model. In this model the phases do
overlap.
CHAPTER 4
SYSTEM DESIGN
SYSTEM ARCHITECTURE
Input x, Output y
Therefore, y=f(x)
I1: Login
I2: User details
I3: Image of Signature
O1: Prediction if Forged or Authentic
An entity relationship model (ER model for short) describes interrelated things of
interest in a specific domain of knowledge. A basic ER model is composed of entity
types (which classify the things of interest) and specifies relationships that can exist
between instances of those entity types. In software engineering, an ER model is
commonly formed to represent things that a business needs to remember in order to
perform business processes. Consequently, the ER model becomes an abstract data
model, that defines a data or information structure which can be implemented in a
database, typically a relational database.
Use-Case Diagram
A use case diagram in the Unified Modelling Language (UML) is a type of behaviour
diagram defined by and created from a use case analysis. Its purpose is to present a
graphical overview of the functionality provided by a system in terms of actors, their
goals (represented as use cases), and any dependencies between those use cases. The
main purpose of a use case diagram is to show what system functions are performed for
which actor. Roles of the actors in the system can be depicted.
A sequence diagram in UML is a kind of interaction diagram that shows how processes
operate with one another and in what order. It is a construct of a message sequence chart.
A sequence diagram shows object interactions arranged in time sequence. It depicts the
objects and classes involved in the scenario and the sequence of messages exchanged
between the objects needed to carry out the functionality of the scenario. Sequence
diagrams are typically associated with use case realizations in the logical view of the
system under development.
A class diagram in the UML is a type of static structure diagram that describes the
structure of a system by showing the system classes, their attributes and the
relationships between the classes. This diagram shows the various classes or main entities
involved in the system and also their relationships with each other. It depicts the
attributes and operations each class can carry out, individually and with the help of
other classes in the system designed.
UML state machine diagrams depict the various states that an object may be in and the
transitions between those states. In fact, in other modelling languages it is common for
this type of diagram to be called a state transition diagram or even simply a state
diagram. A state represents a stage in the behaviour pattern of an object. A transition is
a progression from one state to another and will be triggered by an event that is either
internal or external to the object.
PROJECT PLAN
PROJECT ESTIMATES
Reconciled Estimates
Cost Estimates
Component Cost in INR
Paper Publication fees per person ₹2000
Project Competition -
Table 5.1: Estimate of Cost
Time Estimates
Activity Duration
Project Team Identification 10 days
Topic Selection 2 weeks
Research for proposed system 3 weeks
Design of the proposed system 1 month
Implementation of the signature verification application 2 months
Testing 3 weeks
Documentation 1 week
Table 5.2: Project Implementation Plan
Project Resources
Software Tools
● Windows 7/8/10 (32/64 bit)
● Anaconda
Programming Languages
● Python
● JavaScript
● HTML
● MySQL
● jQuery
RISK MANAGEMENT
The risks for the Project can be analysed within the constraints of time and quality
Probability Probability of occurrence
High > 75%
Medium 26 – 75 %
Low < 25%
Table 5.4 Risk Probability Definitions
Risk ID 1
Risk Description Signature Prediction Failure
Category Technology
Source Software requirement specification document
Probability Medium
Impact Very High
Response Monitor
Strategy Training the system with more and complex images will help
improve the results.
Risk Status Identified
Risk ID 2
Risk Description Signature Duplication Failure
Category Technology
Source Software requirement specification document
Probability High
Impact High
Response Mitigate
Strategy Using a better dataset for training.
Risk Status Identified
Risk ID 3
Risk Description Software Database interfacing failure
Category Technology
Source Identified during early development and testing
Probability Medium
Impact Low
Response Accept and Improve means to store local data
Strategy Identify database servers and problems, technology used
and mitigate issue
Risk Status Identified
PROJECT SCHEDULE
● Task 2: Design and maintain the schema and database connection with front end
for signature and credential record management.
● Task 3: Testing the connection and interfacing, checking for proper functionality
and working before deployment.
Figure 5.1 Task Network
TEAM ORGANISATION
Team Structure
2) Review II
5) Review III
6) Paper conference
7) Paper conference
PROJECT IMPLEMENTATION
6.1 OVERVIEW OF PROJECT MODULES
● Windows 7/8/10(32/64bit)
The project was implemented on a laptop running Windows 10 64-bit, on a 6th
generation Intel Core i5 processor 5Ghz with 8GB RAM.
● Database Server
A web hosting environment is deployed. It gives a panel for managing web pages and PHP
scripts and also provides an SQLite database server for querying and record
storage.
6.2.2 Programming Languages
● Python
Python is an interpreted, high-level, general-purpose programming language.
Python uses dynamic typing, reference counting and a cycle-detecting garbage
collector for memory management. It also features late binding, which binds method
and variable names during program execution. Python was designed to be highly
extensible.
● JavaScript
JavaScript is the programming language of HTML and the Web. It is a prototype-
based, multi-paradigm, dynamic language, supporting object-oriented, imperative,
and declarative (e.g. functional programming) styles.
● HTML
The Hypertext Markup Language (HTML) is the most widely used language on the
Web for developing web pages. It is a standard markup language for creating web pages
and web applications. With Cascading Style Sheets (CSS) and JavaScript, it forms a
triad of cornerstone technologies for the World Wide Web.
● MySQL
MySQL is an Oracle-backed open-source relational database management system
(RDBMS) based on Structured Query Language (SQL). MySQL runs on virtually
all platforms, including Linux, UNIX and Windows. MySQL is an important
component of an open source enterprise stack called LAMP.
● jQuery
jQuery is a JavaScript library designed to simplify HTML DOM tree traversal and
manipulation, as well as event handling, animation, and Ajax. jQuery is a
lightweight, “write less, do more” JavaScript library. The purpose of jQuery is to make
it much easier to use JavaScript on your website. jQuery also simplifies a lot of the
complicated things from JavaScript, like AJAX calls and DOM manipulation.
6.3 ALGORITHM DETAILS
SOFTWARE TESTING
TYPES OF TESTING
User acceptance testing is concerned with what is to be tested from the user's viewpoint
of what the system does. This is not a technical description of the software, but the
user's view of the functions.
Unit testing focuses verification effort on the smallest unit of the software design: the
software component or module. The important control paths are tested to uncover errors
within the boundary of the module. The unit test is white-box oriented. In our software
product the following components would undergo unit testing:
Manual testing is the process of manually testing software for defects. It requires a
tester to play the role of an end user and use most or all features of the application to
ensure correct behavior. Manual testing is a very basic type of testing which helps to
find bugs in the application under test. It is preliminary testing that must be carried out
before starting to automate the test cases, and it is also needed to check the feasibility of
automation testing.
RESULTS
OUTCOMES
SCREENSHOTS
CONCLUSIONS
CONCLUSIONS
Signature verification is an advancing field with varied uses, such as in the
banking sector and for legal documents. A user signing a document does not always
produce the same signature; there are a few variations. It is necessary to develop a
system that can incorporate these variations and accurately predict whether the
signature is forged or genuine. One Real Signature is given emphasis in our
experiment as it is not feasible to collect multiple signatures. Our system modifies
the One Real Signature by adding slight variations. When the same signer later
gives an input for the second time, this new image is checked against the previously
made duplicates. Finally, we used SSIM and Image Hash to compute similarity and
predict the output. Thus, using the Cognitive Inspired Model for image duplication is
a better approach than previously used techniques based on mathematical
formulae, as this model gives a cognitive insight.
FUTURE WORK
● Banks: Banks can collect an account holder's signature once and use it later
for verification purposes.
1. P type Problem:
The collection of problems that can be solved in polynomial time is called P.
The polynomial is typically of small degree. The problems belonging to this class
are considered tractable and easy to solve.
2. NP type Problem:
An algorithm in which an operation may not have a unique result, but instead a
specified set of possible results, is called a non-deterministic algorithm. A
non-deterministic algorithm is a two stage algorithm (guessing and verification).
Problems that such an algorithm can solve in polynomial time form the class NP.
A. NP-complete: They are harder to compute than to verify; no polynomial-time
algorithm is known for solving them, but a proposed solution can be verified in
polynomial time.
Input x, Output y
Therefore, y=f(x)
I1: Login
I2: User details
I3: Image of Signature
O1: Prediction if Forged or Authentic
1. Paper Title: State of the Art Literature Survey 2019 on Signature Verification
1. Paper Title: Image Processing Based Signature Duplication and its Verification.
PLAGIARISM REPORT
APPENDIX D
AIM- Review of design and necessary corrective actions taking into consideration the
feedback report of Term 1 assessment, and other competitions/conferences participated
like IIT, Central Universities, University Conferences or equivalent centres of
excellence etc.
● It was suggested to try various optimizers and to run a trial with each one to
predict the output.
● A trial of other neural network architectures was suggested in order to converge
to a positive outcome.
AIM: Project workstation selection, installations along with setup and installation
report preparations.
Platform
Tools Used
Anaconda
SQLite Database
Flask
Libraries Used
1. Python 3.6
2. Pandas
3. Numpy
4. Keras
5. Matplotlib
6. TensorFlow
7. OpenCV
8. Flask
Anaconda Setup
XAMPP Setup
Database
D.2.3.1 SQLite
A website with SQLite database connectivity was used for the specific purpose of
data management. Structured, primarily relational, data is used for record storage.
The records contain data about the login credentials, the images and the correct
predictions, from which the accuracy analysis data is constructed.
LABORATORY ASSIGNMENT NO. 3
AIM: Programming of the project functions, interfaces and GUI (if any) as per 1st Term
term-work submission, using corrective actions recommended in Term-1 assessment of
Term-Work
Functions
1. Login & SignUp: Function that checks the credentials and logs into the account
for Login, and inserts the details of a new user for the SignUp option.
2. Detection: Function that takes a cropped image as input, performs
duplication and gives the prediction after image recognition.
3. Duplication: Function that duplicates the signature given by the user and stores
the duplicates in the system.
4. Similarity Prediction: Function that gives the similarity between 2 images and
predicts whether the uploaded signature is authentic or not.
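The Similarity Prediction step can be sketched as comparing the uploaded signature's hash with
the hashes of the stored duplicates and thresholding the best match. The report does not state
how the p-hash, d-hash and SSIM scores are combined or which threshold is used, so the single
hash verifier, the 0.8 threshold, the function names and the sample hash values below are all
assumptions for illustration.

```python
def hash_similarity(h1, h2, bits=64):
    """Map the Hamming distance between two integer hashes to a similarity in [0, 1]."""
    return 1.0 - bin(h1 ^ h2).count("1") / bits


def verify(query_hash, duplicate_hashes, threshold=0.8):
    """Return a verdict and the best similarity against the stored duplicates."""
    best = max(hash_similarity(query_hash, d) for d in duplicate_hashes)
    verdict = "genuine" if best >= threshold else "forged"
    return verdict, best


# Made-up 64-bit hashes standing in for the duplicates created at sign-up.
duplicate_hashes = [0xFFFF0000FFFF0000, 0xFFFF0000FFFF00FF]

close_query = 0xFFFF0000FFFF000F  # differs from each duplicate in 4 bits
far_query = 0x0000FFFF0000FFFF    # differs in most bits

close_result = verify(close_query, duplicate_hashes)
far_result = verify(far_query, duplicate_hashes)
print(close_result, far_result)
```

Taking the best match over all duplicates reflects the purpose of duplication: a genuine
signature only needs to resemble one plausible variation of the reference.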
Resultant GUI:
LABORATORY ASSIGNMENT NO. 4
AIM: Test Tool selection and testing of various test cases for the project performed and
generate various testing result charts, graphs etc, including reliability testing,
3. Button - PASSED
4. User Input - PASSED
5. User Interface - PASSED
6. Link - PASSED