All content following this page was uploaded by Nada A Rasheed on 30 October 2018.
BY
NADA A. RASHEED
SUPERVISED BY
September 2002
Supervisor Certification
Signature:
Name: Dr. IMAD H. AL-HUSSAINI
Signature:
Name: Dr. MOAID A. FADHIL
To the moon, which honors me and
lightens my way of life…
To whom who is present inside my heart,
but away from my eyes…
Without him life would not have any kind
of meaning…
To that who grows inside my spirit the
hope of great success…
I may be honored to dedicate this thesis as
a kind of gratefulness and to be always
faithful.
My brother
( Laith )
Signature Recognition Using
Neural Networks
A thesis submitted to
the Military Engineering College in partial fulfillment of the requirements
for the degree of Master of Science in Computers
By
Nada Abdullah Rasheed Al-Jubouri
Supervised by
September
2002
Acknowledgements
NADA
Abstract
Recognition is regarded as a basic attribute of human beings, as
well as of other living organisms. There are many practical applications
of pattern recognition, such as signature recognition, handwriting,
fingerprints, face recognition and others. Signatures are used every day
to authorize the transfer of funds for millions of people; bank checks,
credit cards and legal documents all require our signatures. This work
proposes a Signature Recognition System (SRS) that is able to recognize
signatures by using neural networks. To achieve this objective, the
image file is first read by loading the image from an optical scanner.
The input image is then preprocessed through a series of modifications:
Noise Removal, Image Scaling-1, Image Centralization-1, Image
Rotation, Image Trimming, Image Scaling-2 and Image Centralization-2.
Features are then extracted by finding the frequency histograms of the
on-pixels in the (50x50) image matrix along the horizontal lines, the
vertical lines, the upper triangle of the main diagonal and the lower
triangle of the main diagonal. Finally, the Backpropagation Algorithm is
used for recognition. Experimental results showed that, out of (100) test
signatures, (98) were correctly recognized, which amounts to a (98%)
success rate. The proposed system produced its best performance using a
backpropagation neural network with (49) hidden nodes and a (0.5)
learning rate, within a very short training time.
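The histogram features described above can be sketched as follows. This is an illustrative reconstruction, not the thesis code; it assumes a binary (0/1) image matrix, and whether the triangle features are totals or per-diagonal counts is our assumption (totals are shown):

```python
def extract_features(img):
    """Count the on-pixels along each row, each column, and in the two
    triangles split by the main diagonal of a binary image matrix.
    The thesis uses a (50x50) matrix; any size works here."""
    n = len(img)
    horizontal = [sum(row) for row in img]                          # per-row counts
    vertical = [sum(row[j] for row in img) for j in range(len(img[0]))]
    upper = sum(img[i][j] for i in range(n) for j in range(len(img[i])) if j > i)
    lower = sum(img[i][j] for i in range(n) for j in range(len(img[i])) if j < i)
    return horizontal, vertical, upper, lower
```

Concatenating such counts yields the fixed-length feature vector fed to the network.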
Abstract (in Arabic)
Recognition is considered a basic attribute of human beings as well
as of other living organisms. Pattern recognition has many practical
applications, such as signature recognition, handwriting, fingerprints,
face recognition and others. Signatures are used daily to authorize the
transfer of funds for millions of people; bank checks, credit cards and
legal documents all require signatures. This work proposes a signature
recognition system able to recognize signatures using neural networks.
To achieve this goal, the signature is acquired with an optical scanner
to load the image; the input image is then preprocessed, including noise
removal, image scaling-1, image centralization-1, image rotation, image
trimming, image scaling-2 and image centralization-2. Features are then
extracted by finding the frequency histograms of the on-pixels in the
(50x50) image matrix along the horizontal lines, the vertical lines, the
upper triangle of the main diagonal and the lower triangle of the main
diagonal. Finally, the backpropagation algorithm is used for
recognition. Experimental results showed that, out of (100) test
signatures, (98) were correctly recognized, a (98%) success rate. The
proposed system achieved its best performance using a backpropagation
neural network with (49) hidden nodes and a (0.5) learning rate within a
very short training time.
Table of Abbreviations
Symbol Name
ANNs Artificial Neural Networks.
BP BackPropagation.
FAR False Acceptance Rate.
FRR False Rejection Rate.
HSV Handwritten Signature Verification.
LMS Least Mean Square.
NNs Neural Networks.
S.R.S Signature Recognition System.
TAR True Acceptance Rate.
TRR True Rejection Rate.
Table of Contents
Chapter One: Introduction
Chapter Three: Design of the Proposed Signature
Recognition System (S.R.S.)
3.1 Introduction …………………………………………. 47
3.2 The SRS……………………………………………… 47
Chapter Four: (SRS) Implementation and Experimental
Results.
4.1 Introduction ………………………………………….. 74
4.2 SRS Implementation …………………….………….. 74
4.2.1 Image Loading ………………………………………. 76
4.2.2 Preprocessing………………………………………….. 76
4.2.3 Feature Extraction …………………………………… 81
List of Tables
4.4 Results for the networks with (0.8) Learning rate ……………… 96
4.5 No. of iterations according to the networks with varying
Learning rate….………………………………………………… 97
4.6 Time (minutes) according to the networks with varying Learning
rate. ……………………………………………………………. 98
List of Figures
3.8 Image Trimming. ………………………………………………… 59
3.9 Image Scaling. …………………………………………………….. 60
3.10 Image Centralisation-2. …………………………………………… 61
3.11 Features Extraction on Horizontal Histogram. …………………… 62
3.12 Features Extraction on Vertical Histogram. ……………………… 63
3.13 Features Extraction on the lower part of the main diagonal. …….. 64
3.14 Features Extraction on the upper part of the main diagonal. ……. 65
3.15 Diagram of a Neural Network. …………………………………… 68
4.1 Framework of the SRS. …………………………………………… 75
4.2 Image Loading. ………………………………………………….. 76
4.3 Noise Removal …………………………………………………… 77
4.4 Image Scaling-1 ………………………………………………… 77
4.5 Image Centralisation-1 ……………………………………………. 78
4.6 Image Rotation…………………………………………………….. 79
4.7 Image Trimming…………………………………………………… 79
4.8 Image Scaling-2 ………………………………………………….. 80
4.9 Image Centralisation-2…………………………………………….. 80
4.10 Features Extraction on Horizontal Histogram. …………………… 81
4.11 Features Extraction on Vertical Histogram. ……………………… 82
4.12 Features Extraction on the lower part of the main diagonal. …….. 82
4.13 Features Extraction on the upper part of the main diagonal. …….. 83
4.16 Delete Person's Signature. ………………………………………… 85
4.17 Database File Loading. …………………………………………… 86
Chapter One
Introduction
1.1 Introduction
Since the beginning of the computer industry, users of computers
have been forced to modify their behavior to utilize these devices.
User interfaces ranged from confusing to downright hostile. As
computers became more powerful, as measured in processing speed,
user interfaces were written in a more intuitive fashion, but users still
had to change the way they normally interacted with the world. For
example, talking to a computer, until very recently, would not get it to
accomplish a desired task, and smiling at a computer won't make it
respond in a friendlier fashion. Some users may want or even need to
be able to speak commands directly to the computer. Maybe the
computer should be able to read our handwriting [1].
Though the benefits of an automatic handwriting recognition
system have long been known, research in this field has been slow to
advance. It has only been within the last ten years that research in the
field of handwriting recognition has begun to take major strides.
However, as personal computers have become a standard
consumer staple, and computing power has drastically increased, the
barriers to handwriting recognition research have been eliminated.
This rekindled interest has been furthered by the realization that
Artificial Neural Networks (ANNs) have the ability to generalize,
adapt, and learn implicit concepts. These ANN properties are
particularly well suited to address the problems caused by high
variability between handwriting samples. An examination of the
signature verification process provides a nice illustration of the
usefulness of ANNs. The power of ANNs lies in their ability to
recognize general patterns. This is particularly useful because,
although there is a large amount of variability between signatures,
there will be general characteristics present in each signature [2].
A handwritten signature is a kind of agreement. Mostly, it is an
agreement with the content of a document. It can be the signature on a
contract, an application for an official document [3].
Handwritten signatures come in many different forms and there
is a great deal of variability even in signatures of people who use the
same language. Some people simply write their name while others
may have signatures that are only vaguely related to their name, some
signatures may be quite complex while others are simple and appear
as if they may be forged easily. It is also interesting to note that the
signature style of individuals relates to the environment in which the
individual developed their signature.
It is known that no two genuine signatures of a person are
precisely the same and some signature experts note that if two
signatures of the same person written on paper were identical they
could be considered forgery by tracing. Successive signatures by the
same person will differ and may also differ in scale and orientation[4].
Handwritten signatures are the most widely employed form of
secure personal identification, especially for cashing checks and card
transactions. However, for several reasons the task of verifying human
signatures cannot be considered a trivial pattern recognition problem.
It is a difficult problem because signature samples from the same
person are similar but not identical. In addition, a person's signature
often changes radically during his life. Figure (1.1) shows an example
of how a signature may develop over time. In fact, great variability can
be observed in signatures according to country, age, time, habits,
psychological or mental state, and physical and practical conditions[5].
To illustrate the concept of signature recognition, figure (1.2)
shows object 'classification' problems that can generally be divided
into four types: 'cognition,' 'recognition,' 'identification,' and
'verification'. This depends on the a priori knowledge available about
the classes under study and on the type of information extracted from
the different objects. In this context, the 'semantic' part of the object
'handwriting' refers to the message content of the handwritten data
and the 'singular' part reflects some individual characteristics of the
writer. Assuming that both types of information could be separately
processed by computer, four types of systems can be differentiated.
'Cognition' and 'recognition' systems refer to man-computer
interfaces processing the 'semantic' information of a message to
recover its content irrespective of the writer. When the classes are
known a priori (i.e. the 36 alphanumeric English symbols for block
characters) the term handwriting or text recognition is used. When the
classes are unknown (as in some approaches to the study of cursive
script) a cognition phase is necessary to define these classes before
going to recognition.
In a similar fashion, 'identification' and 'verification' systems
describe interfaces that process the 'singular' information of handwriting
to establish the identity of the writer, irrespective of handwritten content.
A writer 'identification' system must establish a writer's identity by
comparing some specific attributes of his handwriting with those of all
the (N) writers enrolled in a reference database. A 'verification'
system decides on the claimed identity of any writer by a one-to-one
comparison process. Signature and text can be acquired 'on-line' or
'off-line'; their processing is often referred to as 'dynamic' and 'static'
accordingly.
Historically, most handwriting classification projects have been
oriented toward text or signature analysis, as in figure (1.3). On the one
hand, the text recognition problem has been the most popular,
although several groups have been active in identification and
verification problems. When signatures are used, the authentication
study is reduced to a verification task, and when text is used,
identification experiments are generally run [6].
Figure (1.2): Types of object classification, by type of a priori knowledge:

                  | A priori unknown classes | A priori known classes
  Semantic part   | Cognition or learning    | Recognition
  Singular part   | Identification           | Verification

Figure (1.3): Areas of handwriting classification: handwriting, acquired
on-line or off-line, is analyzed either as text or as a signature.
Handwriting recognition has a number of different areas, as
shown in figure (1.3). The two major methods of handwriting
recognition are off-line and on-line [5].
Off-line Recognition
Off-line processing happens after the writing is complete; the
scanned image is preprocessed, and off-line inputs have no temporal
information associated with the image. The system is not able to infer
any relationships between pixels or the order in which strokes were
created. Its knowledge is limited to whether a given pixel is on or
off[7].
On-line Recognition
On-line handwritten recognition accepts (x, y) coordinate pairs
from an electronic pen touching a pressure-sensitive digital tablet.
On-line processing happens in real-time while the writing is taking
place. Also, relationships between pixels and strokes are supplied due
to the implicit sequencing of on-line systems, which can assist in the
recognition task [7].
These systems find particular application, for instance, in verifying
the identity of credit card users. In this case the person will be required to
sign on an electronic device, typically a digitizing tablet. The test
signature is verified by comparing it with a template in a database,
which may take the form of a reference signature or a vector of
parameters describing the features of the signature. In this context a
template will consist of data extracted from a set of sample signatures
supplied by the individual at the time of registration [8].
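The comparison against a template described above can be sketched roughly as follows. This is our illustration, not the thesis method: the text leaves the comparison rule open, so a Euclidean-distance threshold on the parameter vectors is an assumption.

```python
import math

def verify(test_features, template, threshold):
    """Accept the test signature when its feature vector lies within a
    Euclidean-distance threshold of the enrolled template vector."""
    return math.dist(test_features, template) <= threshold
```

In practice the template would be built from the set of sample signatures supplied at registration, e.g. as a mean feature vector.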
1.2 Review of Related Works
Carr and Fox (1985) propose more than (90) features for
consideration. Once a set of features has been selected, there may be
no need to store the reference signature and only the features' values
of the reference signature need to be stored. Also, when a test
signature is presented, only the features' values are needed, not the
signature. This often saves on storage (storage may be at a premium if,
for example, the reference signature needs to be stored on a card) and
that is why representing a signature by a set of values of its features is
sometimes called compression of the signature[35].
Pender (1991) explores the use of neural networks for detecting
casual forgeries. The training of neural networks requires genuine
signatures as well as forgeries although signatures of other people may
be used as forgeries. A database of signatures was created, in which
static signature features were collected from five individuals over two
years. It had (380) genuine signatures and the same five individuals
signed (265) forgeries in which the individuals knew the name of the
person whose signature was being forged, but had not viewed a
genuine signature. As in the earlier work, the signatures were scanned
and processed by reducing them to (128x64) size. An FRR of (3%)
and FAR of zero-effort forgeries of (3%) have been reported[37].
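The FRR and FAR figures quoted in these studies follow from simple counts; a minimal sketch (our illustration, with hypothetical accept/reject lists):

```python
def error_rates(genuine_accepted, forgery_accepted):
    """FRR: fraction of genuine signatures wrongly rejected.
    FAR: fraction of forgeries wrongly accepted.
    Each argument is a list of booleans, True = system accepted."""
    frr = genuine_accepted.count(False) / len(genuine_accepted)
    far = forgery_accepted.count(True) / len(forgery_accepted)
    return frr, far
```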
remaining (8) were used for testing. Signatures were collected on
paper at different times of the day and then scanned. A fast
backpropagation neural network method was used as the classifier. An
FRR of (1.4%) is reported. No skilled forgeries were available and no
zero-effort forgeries were tested[38].
signatures taken from the person. The true signature is scanned into the
system, which produces an (84x20 bit) binary image. The image is
divided into ten parts, and each part is used to teach one part of the
neural network. When an unknown signature is presented to the network
it will reach some stable state; the system will then either accept or
reject the signature with a matching factor, with a (9%) FRR and (16%)
FAR[9].
method was used to code the extracted contour from the first step. The
third step included applications of the (invariant moments) method.
The system accepts signatures with a (94.25%) TAR and rejects with a
(5.75%) FAR [11].
the backpropagation networks most closely related to the experiments
of this research.
Chapter Two
Pattern Recognition and
Neural Networks
2.1 Introduction
The contents of this chapter fall into two parts. The first part
overviews pattern recognition, which can be considered as a two-stage
device (feature extraction and classification), and its approaches. The
second part overviews artificial neural networks, feedforward networks
and training by the backpropagation algorithm.
Pattern recognition can be defined as an area of science
concerned with discriminating objects on the basis of information
available about them. Each distinct bit of information about objects is
called a feature [14].
Typical problems in pattern recognition begin with representing
the objects by some form of data that is fed into a computing system
as "input". The fundamental objective of pattern recognition is
classification. A classification algorithm translates into a program that
"reads" the data and "classifies" the input into a collection of different
answers. Thus, the solution is read as the "output" of the pattern
recognition system. A pattern recognition system can be considered as
a two-stage device: the first stage is feature extraction and the second
is classification[15].
2.2.1 Features
The main objective of the feature extraction process is to capture
the most relevant and discriminative characteristics of the objects to be
recognized. The dimension of the resulting feature vector is usually
smaller than the dimension of the original pixel images, thus
facilitating the subsequent classification processes[16].
Typically, when the observations from which a source is to be
classified are images, the data may consist of an array of tens of
thousands of pixels. In order to successfully develop a decision rule, it
is often necessary to extract features of the data; good features
summarize the data into a few values that provide good discrimination
between source classes. Feature selection is extremely important to the
success of a pattern recognition system: good features lead to good
performance[17].
These features are either geometrical shapes in the body of the
pattern or non-geometrical features. The geometrical features are the
recognizable features in the body of a character, such as curves and
their curvature values, strokes and their directions, loops, corners,
crossing points, and others. The non-geometrical features are
mathematical transformations. The extraction of these features
simplifies the classification, which could be achieved by comparing
them with the standard features [18].
Usually several features are required to be able to adequately
distinguish inputs that belong to different classes. Selecting these
features can be a difficult problem that may require significant
computational effort. Feature selection is the process of choosing the
input to the pattern recognition system. This usually involves
judgement. The key is to choose and extract features that:
1. Are computationally feasible.
2. Lead to a good classification system with fewer misclassification
errors.
3. Reduce the problem data into a manageable amount of
information without discarding valuable or vital information[5].
2.2.2 Classification
Classification is rarely performed using a single measurement, or
feature, from the input pattern. Usually, several measurements are
required to be able to adequately distinguish inputs that belong to
different categories (or classes)[15].
The aim of pattern recognition is the design of a classifier: a
mechanism which takes the features of objects as its input and which
results in a classification, a label or value indicating to which class the
object belongs, by measuring the similarity between these object
features and the class features [14].
Given a set of learning search areas, the classifier selects the
"best" search area for new inputs. The classifier must develop a
description of each search area that can be used to make an informed
selection. Each search area is comprised of a set of solution traces,
where each solution trace corresponds to a unique input. The set of
inputs which form a search area can be considered a class of input.
The expectation is that the inputs from a particular class share some
common features. These features are ultimately responsible for the
similarity of the inputs' solution traces. It is also expected that two
inputs from different classes have some variant features, which
distinguish their solution traces. Identifying these features is essential
in classifying new input, in addition to experimenting with existing
feature analysis techniques and custom strategies for identifying input
features that influence input classification.
For a given search area, it will be necessary to store a
representation of the inputs that formed the search area. The
representation could be statistical information on the features of every
input in the class. Thus, it may be necessary to store feature
information for each input. The features of new input will be
compared to the representative from each class. The search area of the
representative that shares the most features with the new input will be
used to compute the solution. The feature comparison must be
efficient in order for the whole process to compete with a
straightforward algorithm [19].
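The comparison of a new input's features against each class representative, described above, can be sketched as a nearest-representative rule. This is an illustrative reading of the passage; squared Euclidean distance is our hypothetical similarity measure.

```python
def classify(features, representatives):
    """Return the class whose stored representative feature vector is
    closest (smallest squared distance) to the new input's features."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(representatives, key=lambda cls: sqdist(features, representatives[cls]))
```

Here each representative might be statistical information, such as the mean feature vector of the inputs that formed the search area.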
The third problem involves the determination of optimum
decision procedures, which are needed in the identification and
classification process[13].
by using a suitable criterion, for example, the matching of the patterns
with templates which are stored in terms of feature measurements[18].
Features are assumed to be variable with statistical distributions
and different classes to have different values for the distribution
parameters. Values for the parameters can be estimated from
measurements. Successful applications of this approach to practical
problems include character recognition, medical diagnosis, etc[20].
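A minimal sketch of this statistical approach for a single feature (our illustration): estimate a Gaussian mean and variance per class from measurements, then assign a new value to the class with the highest likelihood.

```python
import math

def fit(samples):
    """Estimate (mean, variance) of one feature for one class."""
    m = sum(samples) / len(samples)
    v = sum((s - m) ** 2 for s in samples) / len(samples)
    return m, v

def gaussian_likelihood(x, params):
    """Normal density of x under the estimated (mean, variance)."""
    m, v = params
    return math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)

def classify(x, class_params):
    """Pick the class whose estimated distribution best explains x."""
    return max(class_params, key=lambda c: gaussian_likelihood(x, class_params[c]))
```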
presented in two ways: as a string grammar or a graphical grammar. A
string grammar defines, in a one-dimensional way, how a string of
non-terminal symbols can be replaced by a string of non-terminal or
terminal symbols. Graphical grammars can present the relationships of
the features in a higher-dimensional way. Syntactical methods have
one major drawback: if the structure of an unknown pattern does not
exactly match any of the structures of the predefined classes, a
syntactical classifier cannot say anything about the class. Not even
which class would be close to the pattern, as no sensible distance
measure can be defined between the pattern instances. Serious
problems also arise if the representatives of two different classes can
have similar structures. In such cases, some additional information or
features are needed to discriminate the classes.
Structural methods are also based on an analysis of the features
and their relations. They differ from the syntactical methods, as the
classification is not based on parsing but on matching and various
decision rules. The characters can be recognized hierarchically: the
candidate classes are first ruled out or selected according to lower-level
features. Then, more complicated higher-level structures are
used. The pattern is classified to the class which has a matching
structure. In case of ambiguity, decision rules containing context and
language information, or rules specific to the confusing class pairs can
be used. The hierarchical matching and decision procedures can
usually be described with decision trees[20].
The Neural Recognition Approach
Neural Networks (NNs) deal with classification ('pattern
recognition') problems with greater reliability than their human
counterparts. Automatic training allows NNs to be trained before being
used on real problems. Certainty factors accompany the results as a
mitigation against possible errors; a low certainty factor accompanies
borderline cases [21].
The field of NNs can be thought of as being related to artificial
intelligence, machine learning, parallel processing, statistics, and other
fields. The attraction of NNs is that they are best suited to solving the
problems that are the most difficult to solve by traditional
computational methods [22].
The ANN approach has many similarities with statistical pattern
recognition concerning both the data representation and the
classification principles. The practical implementation is, however,
very different. The analysis mode involves the configuration of a
network of artificial neurons and the training of the net to determine
how the individual neurons can affect each other. The recognition
mode involves sending data through the net and evaluating which
class got the highest score [12].
2.3 Artificial Neural Networks (ANN)
2.3.1 Biological Neural System
It is estimated that the human brain contains over (100) billion
(10^11) neurons, with a vastly larger number of synapses, in the human
nervous system. Studies of brain anatomy indicate more than (1000)
synapses on the input and output of each neuron[23].
There is a close analogy between the structure of a biological
neuron (i.e., a brain or nerve cell) and the processing element (or
artificial neuron). In fact, the structure of an individual neuron varies
much less from species to species than does the organization of the
system of which the neuron is an element [24].
Neurons and their interconnections, the synapses, constitute the key
elements for neural information processing, as shown in figure (2.1).

Figure (2.1): A biological neuron: dendrites, soma, axon and synapses.
Most neurons possess tree-like structures called dendrites, which
receive incoming signals from other neurons across junctions called
synapses. There are three parts in a neuron:
1. A neuron cell body.
2. Branching extensions called dendrites for receiving input.
3. An axon that carries the neuron's output to the dendrites of other
neurons.
A neuron sends its output to other neurons via its axon. An
axon carries information through a series of action potentials that
depend on the neuron's potential. This process is often modeled as a
propagation rule represented by a net value. A neuron collects signals
at its synapses by summing all the excitatory and inhibitory influences
acting on it. If the excitatory influences are dominant, then the neuron
fires and sends this message to other neurons via the outgoing
synapses. In this sense, the neuron function can be modeled as a
simple threshold function called activation function[23].
An important characteristic that ANNs share with biological
neural systems is fault tolerance. Biological neural systems are fault
tolerant in two respects:
First, we are able to recognize many input signals that are somewhat
different from any signal we have seen before. An example
of this is our ability to recognize a person in a picture we have not
seen before or to recognize a person after a long period of time.
Second, we are able to tolerate damage to the neural system itself.
Humans are born with as many as (100) billion neurons. Most of these
are in the brain, and most are not replaced when they die. In spite of
our continuous loss of neurons, we continue to learn. Even in cases of
traumatic neural loss, other neurons can sometimes be trained to take
over the functions of the damaged cells. In a similar manner, ANNs can
be designed to be insensitive to small damage to the network, and the
network can be re-trained in cases of significant damage[24].
Figure (2.2): A processing element: inputs x1, x2, …, xn are weighted by
w1, w2, …, wn and combined to produce the output Y.
the net value net_j. The mapping is mathematically described by a net
function to yield a new activation value Y_j, as shown in equation (2.1)
[23]. As shown in the figure below, we can summarize the operations
as follows:

    Y_j = f( \sum_{i=1}^{n} x_i w_{ij} + w_{0j} ),    j = 1, 2, …, m        (2.1)

Figure: a unit combining the inputs x1, …, xn through the weights
w1, …, wn and a bias w0 into net_j, and producing Y_j = f(net_j).

The weighted sum \sum_i w_{ij} x_i is called the net input to unit j, often
written net_j.
Note that w_{ij} refers to the weight from unit i to unit j (not the other
way around).
The bias weight w_{0j} shifts the function in the x- and y-directions,
while the other weights scale it along those two directions.
The function Y_j is the unit's activation function. In the simplest
case, it is the identity function, and the unit's output is just its net
input. This is called a linear unit[27].
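Equation (2.1) can be sketched directly for a single unit. This is an illustrative snippet; the binary step activation used as the default for f is one of the activation functions discussed in the next section.

```python
def unit_output(inputs, weights, bias, f=lambda net: 1 if net >= 0 else 0):
    """Compute Y = f( sum_i x_i * w_i + w_0 ), equation (2.1),
    for a single unit; f defaults to a binary step activation."""
    net = sum(x * w for x, w in zip(inputs, weights)) + bias
    return f(net)
```

With weights (0.5, 0.5) and bias (-0.7), for example, the unit computes a logical AND of two binary inputs.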
The most common activation functions are the step and sigmoid
functions:

Identity function

    f(x) = x,  for all x        (2.2)

Single-layer nets often use a step function to convert the net
input, which is a continuously valued variable, to an output that is
binary (1 or 0) or bipolar (1 or -1), as shown in figure (2.4a):

    f(x) = 1 if x >= 0;  f(x) = 0 if x < 0        (2.3)
Sigmoid function
One commonly used activation function is the sigmoid function.
The geometrical representation of this function is an "S" shape, which
means that the output signal has two saturation states connected by a
transitional state[2].
The sigmoid function is (0) at negative infinity. As x approaches
(0), the function rises toward (0.5). Most of the rising is done between
(-10 and 10). Then it continues to rise until it reaches (1) at infinity.
Think of this function as a kind of "smoothed" form of the step
functions used before. Other functions can be used besides the sigmoid,
but it is particularly popular because it has a very simple first
derivative, which makes the math easy[28].
1- Binary Sigmoid
The logistic function, a sigmoid function with range (0, 1), is
often used as the activation function for neural networks in which the
desired output values either are binary or are in the interval between
(0 and 1). To emphasize the range of the function, it is called the binary
sigmoid; it is also called the logistic sigmoid [24].
The graph in figure (2.4c) shows the output for a steepness
constant σ of (0.5, 1, and 10), as the activation varies from (-10 to 10)
[29].

    f(x) = 1 / (1 + exp(-σx))        (2.4)

The function takes the input x and transforms it with reference to the
constant σ.

2- Bipolar Sigmoid
The logistic sigmoid function can be scaled to have any range of
values that is appropriate for a given problem. The most common
range is from (-1 to 1); this sigmoid is called the bipolar sigmoid, as
shown in figure (2.4d):

    g(x) = 2 f(x) - 1 = 2 / (1 + exp(-σx)) - 1        (2.5)
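The binary and bipolar sigmoids can be sketched as follows (a minimal illustration; `steepness` plays the role of the constant in the logistic exponent):

```python
import math

def binary_sigmoid(x, steepness=1.0):
    """Logistic function with range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-steepness * x))

def bipolar_sigmoid(x, steepness=1.0):
    """Scaled logistic with range (-1, 1): g(x) = 2*f(x) - 1."""
    return 2.0 * binary_sigmoid(x, steepness) - 1.0
```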
Figure (2.4): (a) Identity function. (b) Binary step function. (c) Binary
sigmoid. (d) Bipolar sigmoid.
2.3.4 Training Algorithms
The method of setting the values of the weights (training) is an
important distinguishing characteristic of different neural nets. The
neural networks are commonly categorized in terms of their
corresponding training algorithms: supervised networks and
unsupervised networks [23].
Many of the tasks that neural nets can be trained to perform fall
into the areas of mapping and clustering, and can be considered special
forms of the more general problem of mapping input vectors or
patterns to specified output vectors or patterns [24].
Supervised Learning
In supervised training, there is a teacher that presents input
patterns to the network, compares the resulting outputs with those
desired, and then adjusts the network weights in such a way as to
reduce the difference. It is difficult to conceive such a teaching
mechanism in biological systems.
Examples of supervised learning algorithms include the least-
mean-square (LMS) algorithm and its generalization, known as the
BackPropagation (BP) algorithm. The (LMS) algorithm involves a
single neuron, whereas the (BP) algorithm involves a multilayered
interconnection of neurons. The backpropagation algorithm derives its
name from the fact that error terms in the algorithm are back
propagated through the network, on a layer by layer basis. Naturally,
the backpropagation algorithm is more powerful in application than
(44)
the (LMS) algorithm. Indeed, the backpropagation algorithm includes
the (LMS) algorithm as a special case[30].
Unsupervised Learning
Unsupervised training does not require a teacher. Input patterns
are applied, and the network self-organizes by adjusting its weights
according to a well-defined algorithm. Because no desired output is
specified during the training process, the results are unpredictable in
terms of the firing patterns of specific neurons. What does occur,
however, is that the network organizes itself in a fashion that develops
emergent properties of the training set. For example, input patterns
may be classified according to their degree of similarity, with similar
patterns activating the same output neuron [30].
nearest neighbors. It is common to use networks with a regular
connection structure to facilitate their implementation [31].
2.3.6 Taxonomy of Neural Networks
There are two phases in neural information processing: the
learning phase and the production phase. In the learning phase, a
training data set is used to determine the weight parameters that define
the neural model. This trained neural model will be used later in the
production (retrieving) phase to process real test patterns and yield
classification results.
Learning Phase: A salient feature of neural networks is their learning
ability. They learn by adaptively updating the synaptic weights that
characterize the strength of the connections. The weights are updated
according to the information extracted from new training patterns.
Production Phase: Various nonlinear systems have been proposed for
retrieving desired or stored patterns. The results can be either
computed in one shot or updated iteratively based on the retrieving
dynamics equations. The final neuron values represent the desired
output to be retrieved. A possible taxonomy of neural networks is
shown in figure (2.6a)[23].
Another taxonomy of neural networks that can be used for the
classification of static patterns is presented in figure (2.6b). The
taxonomy is first divided between nets with binary and continuous-valued
input. Below this, the nets are divided between those trained
with and without supervision [9].
Figure (2.6): (a) A possible taxonomy of neural networks. (b) A taxonomy of neural networks for the classification of static patterns.
It is a powerful mapping network, which has been successfully
applied to a wide variety of problems.
Feedforward Propagation
The feedforward network is composed of a hierarchy of
processing units, organized in a series of two or more mutually
exclusive sets of neurons, or layers. The first, or input, layer serves as a
holding site for the values applied to the network. The last, or output,
layer is the point at which the final state of the network is read.
Between these two extremes lie zero or more layers of hidden units.
Links, or weights, connect each unit in one layer to only those in the
next-higher layer [5].
Figure (2.7): An example feedforward neural network: six input units (X1-X6), four hidden units (Z1-Z4), and two output units (Y1, Y2); V denotes the input-to-hidden weights and W the hidden-to-output weights.
Figure (2.7) depicts an example feedforward neural network.
This network has six units in the first layer, four units in the
second layer, and two units in the third layer. Finally, this network
has six network inputs and two network outputs. Each input-to-hidden
and hidden-to-output connection (the lines in figure (2.7)) is
modified by a weight [33].
In the implementation of the network, each neuron receives a signal
from the neurons in the previous layer, and each of those signals is
multiplied by a separate weight value. The weighted inputs are
summed and passed through a limiting function, called the sigmoid
function, which scales the output to a fixed range of values [22].
The sigmoid function is an equation that determines the strength,
or activation energy, with which the hidden layer node will fire its
signal to the output nodes. It returns the activation energy of the new
signal. The output nodes then perform this same process, using the
activation energy received from the hidden layer nodes rather than the
values that the hidden layer used.
Although this architecture is all that is needed to carry out feed-
forward propagation, several other design considerations were
implemented. For instance, a special "bias node" that always fires a
signal is added to both the input layer and the hidden layer, acting as
yet another control on the flow of the network. By altering the weight
between the bias and its postsynaptic nodes, the network can
effectively shut down nodes that continuously fail and boost those that
succeed [34].
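The feedforward pass described above can be sketched as follows (an illustrative sketch, not the thesis implementation: the layer sizes and random weights are arbitrary, and the always-firing bias node is modelled as a constant 1.0 prepended to each layer):

```python
import math
import random

def sigmoid(x):
    # Limiting function that scales each neuron's output to (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def feedforward(x, V, W):
    # x: input vector; V: hidden-layer weight rows (bias weight first);
    # W: output-layer weight rows (bias weight first).
    xb = [1.0] + x                      # prepend the bias node's constant signal
    z = [sigmoid(sum(v * a for v, a in zip(row, xb))) for row in V]
    zb = [1.0] + z                      # bias node for the hidden layer
    return [sigmoid(sum(w * a for w, a in zip(row, zb))) for row in W]

random.seed(0)
V = [[random.uniform(-1, 1) for _ in range(7)] for _ in range(4)]  # 6 inputs + bias -> 4 hidden
W = [[random.uniform(-1, 1) for _ in range(5)] for _ in range(2)]  # 4 hidden + bias -> 2 outputs
y = feedforward([0.1, 0.9, 0.3, 0.0, 1.0, 0.5], V, W)
print(y)  # two activations, each in (0, 1)
```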
Feed backward Propagation
In order for the neural network to adapt and improve its
responses, it must have some method of learning. The neural network
learns through backpropagation. After the main sequence of the
program has passed its messages from the input layer through the
hidden layer, and to the output layer, backpropagation begins[34].
To train the neural network, present the inputs and determine the
values of the hidden layer and of the output layer. Compare the results
of the output layer with the correct results the network is being trained to
produce. Then modify the weights between the hidden and output layers
(W) and the weights between the input and hidden layers (V) so that the
network is closer to producing that output. The rule used for modifying the
weights is known as the delta rule, as shown in the following
equations, because it changes each weight according to how much that
weight contributed to the final outcome (the delta):
f '(z_inj) : The derivative of the sigmoid function in the hidden layer.
f '(y_ink) : The derivative of the sigmoid function in the output layer.
z_inj : The net input actually calculated at hidden unit j.
zj : Hidden unit j: the output signal after applying its
    activation function, zj = f(z_inj).
y_ink : The net input actually calculated at output unit k.
yk : Output unit k: the output signal after applying its
    activation function, yk = f(y_ink).
δk : Portion of the error-correction weight adjustment for wjk that
    is due to an error at output unit Yk; also, the information
    about the error at unit Yk.
δj : Portion of the error-correction weight adjustment for vij that
    is due to backpropagation of error information from the
    output layer to hidden unit zj.
Δvij : Weight update in the hidden layer.
Δwjk : Weight update in the output layer.
α : Learning rate, less than one.
This rule is applied to all weights at the same time; in other
words, do not change the (W) weights and then use those new weights
in equation (2.7): use the old (W) weights.
The training procedure finishes when the network is reliably
producing something very close to the expected output for every input
provided to it.
One way to help the neural network converge is to lower the
learning rate; it will take longer to learn, but will have a better chance of
finding the global optimum. Another way to help it is to increase the
number of neurons in the hidden layer. This has a downside: if you
increase the neurons in the hidden layer by too much, then the
network will learn exactly the inputs provided, but won't be able to
come up with a general solution.
Often you can't provide all the inputs to a network because there are
just too many. In this case, you want the network to learn the function
from just a subset, and successfully generalize what it learned to all
possible inputs as a whole. Large numbers of hidden layer neurons
make this less likely to succeed [28].
    Ep = (1/2) Σk [tk - yk]²                                    (2.10)

where tk represents the target output for pattern (p) on node (k).
This error is a function of the connection weights (wij), which are
the parameters that have to be optimized in such a way that the error
(Ep) becomes minimal, on a multi-dimensional error surface where
each dimension corresponds to one weight of the net [32].
Since the output of the net is related to the weights between the
units and the input applied, the error is a function of the weights and
inputs to the network. Thus, for a fixed input pattern, the error
function will change with different weight settings. The error function
can be thought of as a surface sitting “above” the weight space of the
network. This surface is known as the error surface. Figure (2.8)
shows a cross-section of an error surface; the point Hmin is called the
global minimum. H1 and H2 are other minimum points (local minima)
where the search for the global minimum might accidentally get trapped.

Figure (2.8): A cross-section of an error surface, showing the global minimum Hmin and local minima H1 and H2.
The error function has points of minimum value (wells) that
correspond to minimum error, and points of maximum value
(peaks) that correspond to maximum error. Training the network aims
to find the weight vector (w) that will result in reaching the global
minimum (minimizing the error) [5].
One way to achieve this goal is with the so-called gradient-descent
method. The negative gradient of the error surface, -∂Ep/∂wij,
represents the local direction of descent and thus the search
direction for the new weight (wij) on the way toward a minimum,
which hopefully is global. After the nth iteration, the error can be
calculated and all weights of the net have to be updated:

    wij(t+1) = wij(t) + Δwij                                    (2.11)

Figure (2.9) shows the weight update for a single weight for an
idealized error curve. If the gradient ∂E/∂w is positive, the weight has to be
diminished by Δw, in order to proceed in the direction of the
minimum. If the gradient is negative, the weight update has to be
positive, since w(t) is still smaller than w(min) [32].
Figure 2.9: Weight update for a single weight for an idealized error curve.
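Equation (2.11) applied to a single weight can be illustrated on a toy one-dimensional error curve (an illustrative sketch; the quadratic error function and the learning rate are arbitrary choices):

```python
def descend(w, grad, lr=0.1, steps=100):
    # Gradient descent on one weight: w(t+1) = w(t) + delta_w,
    # with delta_w = -lr * dE/dw, as in equation (2.11).
    for _ in range(steps):
        w = w - lr * grad(w)
    return w

# Toy error surface E(w) = (w - 3)^2, whose single (global) minimum is at w = 3.
grad = lambda w: 2.0 * (w - 3.0)
w_final = descend(w=0.0, grad=grad)
print(round(w_final, 3))  # close to 3.0
```

On this convex curve the descent always reaches the global minimum; on the multi-well surface of figure (2.8) the same update rule could instead settle in H1 or H2.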
The "generalised delta rule" learning law used with the
feedforward neural network has the property that, given any starting
point (w) on the error surface that is not a minimum, the learning law
will modify the weight vector (w) so that the error (Ep) will decrease.
The generalised delta rule does this by calculating the value of the
error function, which is then propagated backward through the
network and used to determine weight changes within the network.
Each unit in the net has its weights adjusted so that it reduces the
value of the error function. This process is repeated until the network
reaches a desired state of response.
Adjusting the weights for units actually on the output layer is
relatively simple, as both the actual and desired outputs are known.
On the other hand, weight adjustment for a unit in the hidden layers
should be in direct proportion to the error in the units to which it is
connected. That is why the error estimation is backpropagated through
the net to allow the weights between all the layers to be correctly
adjusted[5].
along smooth areas of the error surface. Further, it may prevent
oscillations in the system and may help in escaping local minima in
the training process[5].
Figure (2.11): (a) Small learning rate: slow convergence. (b) Large learning rate: bouncing around the error surface.
The Backpropagation Training Algorithm
The backpropagation training algorithm is an iterative gradient
algorithm designed to minimize the mean square error between the
actual output of a multilayer feedforward net and the desired output. It
requires a continuous, differentiable non-linearity. The following
algorithm assumes a sigmoid function is used [30]:
Step 1: Initialize weights and offsets.
    Set all weights to small random values between (-1 and 1).
Step 2: Present inputs and desired outputs.
    Present a continuous-valued input vector (x0, x1, .., xn-1)
    and specify the desired outputs (t0, t1, ……, tm-1). If the net
    is used as a classifier, then all desired outputs are typically
    set to zero except for the one corresponding to the class the
    input is from; that desired output is (1). The input could be
    new on each trial, or samples from a training set could be
    presented cyclically until the weights stabilize.
Step 3: Calculate actual outputs.
    Use the sigmoid non-linearity from equation (2.1) to
    calculate the outputs y0, y1, ….., ym-1.
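Steps 1-3, together with the backward error pass described earlier, can be condensed into a compact training loop (an illustrative sketch: the toy OR training set, layer sizes, learning rate, and epoch count are arbitrary choices, not the thesis configuration):

```python
import math, random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def predict(x, V, W):
    # Forward pass; index 0 of each weight row holds the bias weight.
    xb = [1.0] + x
    z = [sigmoid(sum(v * a for v, a in zip(row, xb))) for row in V]
    zb = [1.0] + z
    return [sigmoid(sum(w * a for w, a in zip(row, zb))) for row in W]

def train(samples, n_in, n_hidden, n_out, lr=0.5, epochs=3000, seed=1):
    rnd = random.Random(seed)
    # Step 1: small random weights between (-1 and 1).
    V = [[rnd.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    W = [[rnd.uniform(-1, 1) for _ in range(n_hidden + 1)] for _ in range(n_out)]
    for _ in range(epochs):
        for x, t in samples:                       # Step 2: present input/target
            xb = [1.0] + x
            z = [sigmoid(sum(v * a for v, a in zip(row, xb))) for row in V]
            zb = [1.0] + z
            y = [sigmoid(sum(w * a for w, a in zip(row, zb))) for row in W]  # Step 3
            # Delta rule, using f'(net) = f(net) * (1 - f(net)).
            dk = [(tk - yk) * yk * (1.0 - yk) for tk, yk in zip(t, y)]
            dj = [zj * (1.0 - zj) * sum(dk[k] * W[k][j + 1] for k in range(n_out))
                  for j, zj in enumerate(z)]       # uses the old W, as required
            for k in range(n_out):
                for j in range(n_hidden + 1):
                    W[k][j] += lr * dk[k] * zb[j]
            for j in range(n_hidden):
                for i in range(n_in + 1):
                    V[j][i] += lr * dj[j] * xb[i]
    return V, W

# Toy training set: the OR function.
samples = [([0.0, 0.0], [0.0]), ([0.0, 1.0], [1.0]),
           ([1.0, 0.0], [1.0]), ([1.0, 1.0], [1.0])]
V, W = train(samples, n_in=2, n_hidden=3, n_out=1)
print([round(predict(x, V, W)[0]) for x, _ in samples])  # [0, 1, 1, 1]
```

Note that the hidden-layer deltas are computed from the old output weights before either layer is updated, matching the simultaneous-update requirement of the delta rule.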
Figure (2.12) shows a flowchart for the backpropagation training
algorithm: starting from iteration 1, the error is compared with the
tolerance after each iteration, and the iteration count is incremented
until the error falls below the tolerance.
Chapter Three
Design of the Proposed Signature
Recognition System (S.R.S.)
3.1 Introduction
This chapter is devoted to presenting the proposed system. The
proposition is to design an off-line signature recognition system using the
backpropagation algorithm.
The stage sequence of the proposed system will be described; it
includes Image Loading, Noise Removal, Image Scaling-1, Image
Centralization-1, Image Rotation, Image Trimming, Image Scaling-2,
and Image Centralization-2, followed by Feature Extraction and
Recognition Using the Backpropagation Algorithm.
3.2 The S. R. S.
The system is divided into a set of procedures, each of which
does a specific job; the result is then given to the post procedure. In
the SRS, image loading is done to read the image file and to transform
it into a binary image. The next procedure is noise removal, which is
done by removing the noisy objects.
After that, the image scaling-1 procedure takes place to reduce or
enlarge the size of the pattern, which facilitates the image rotation
operation, and then the image is centralized. The next stage is the Image
Rotation procedure, to get fixed directions of the signatures. Then the
image trimming procedure takes place to get the signature body.
Finally, image scaling is done to get a fixed image size in an
array of (50x50) pixels, and the image is centralized around the Y-axis.
The feature vector is extracted from the (vertical, horizontal,
upper triangle and lower triangle of the main diagonal) histograms of
the image, which is used as input for a backpropagation network to
recognize the signature. The SRS processing sequence is depicted in
figure (3.1).
Figure (3.1): The SRS processing sequence: (1) Image Loading; (2) Preprocessing, comprising (2.1) Noise Removal, (2.2) Image Scaling-1, (2.3) Image Centralization-1, (2.4) Image Rotation, (2.5) Image Trimming, (2.6) Image Scaling-2, and (2.7) Image Centralization-2; (3) Feature Extraction; (4) Recognition Using the Backpropagation Algorithm.
Moreover, preprocessing steps are performed in order to reduce
noise in the input images and to remove most of the variability of the
handwriting.
Different people sign their signatures with different orientation,
size, deviation, etc. Even the signatures of the same individual vary
in the aforementioned attributes under different
circumstances (e.g. the size of the signing space). To minimize the
variation in the final results, all signatures are normalized for duration,
rotation, position, and size. The preprocessing includes operations
such as:
Figure (3.2): Noise Removal (before and after).
the object size at a fixed ratio equal to (0.7071068) to facilitate the
rotation operation; this ratio is calculated as follows.
As shown in figure (3.3), the radius (r) divides the side opposite
to the center into two equal parts. Thus we obtain a right triangle. Applying
the Pythagorean theorem, the hypotenuse can be calculated from
equation (3.1):

    x = √(r² + r²)                                              (3.1)
    x = √2 · r

where (x) is the hypotenuse.
When the image is rotated, its sides intersect with the original image
angles with a magnitude (e) calculated in (3.2):

    e = √2 · r - r                                              (3.2)
    e = r (√2 - 1)
    Sc = r / (√2 · r) = 1 / √2                                  (3.3)
    Sc ≈ 0.7071068

    Sc = 50 / (R - L)                                           (3.4)
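These ratios can be verified numerically (a small check of equations (3.3) and (3.4); the boundary values R and L are hypothetical):

```python
import math

# Equation (3.3): the fixed pre-rotation scale, Sc = r / (sqrt(2) * r).
Sc_rotation = 1.0 / math.sqrt(2.0)
print(round(Sc_rotation, 7))  # 0.7071068

# Equation (3.4): the scale that maps the signature width onto 50 pixels;
# R and L (the right and left boundaries of the signature body) are
# hypothetical values here.
R, L = 220, 30
Sc_width = 50.0 / (R - L)
print(round(Sc_width, 4))  # 0.2632
```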
Figure (3.4): Image Scaling-1 (before and after).
Scaling-1 Algorithm
Input:
    Xold, Yold : The pixel which will be scaled.
    L, R, U, D : The boundaries of the signature body.
Output:
    Xnew, Ynew : The scaled pixel.
Program body:
Step 1: Compute the difference between the left and right boundaries of
    the signature body, to decide whether to scale by the minimizing or
    the maximizing ratio:
    if R - L < 50 then the scale should be the maximizing ratio
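The algorithm is truncated in the source; a minimal sketch of the per-pixel scaling step, under the assumption that each pixel coordinate is simply multiplied by the chosen ratio, might look like:

```python
def choose_scale(L, R, target=50):
    # If the signature is narrower than the target width, the ratio is
    # greater than 1 (maximizing); otherwise less than 1 (minimizing).
    return target / (R - L)

def scale_pixel(x_old, y_old, sc):
    # Map an on-pixel to its scaled position with nearest-integer rounding.
    return round(x_old * sc), round(y_old * sc)

sc = choose_scale(L=30, R=220)     # hypothetical boundaries
print(scale_pixel(100, 80, sc))    # (26, 21)
```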
the centralization is done according to the X-axis and the Y-axis. Therefore,
the rotation operation is very dependent on image centralization.
Figure (3.5): Image Centralization-1 (before and after).
Centralization-1 Algorithm
Input:
    Xold, Yold : The pixel which will be centralized.
    Sx, Sy : The offsets that centralize the image according to the X-axis
    and the Y-axis.
Output:
    Xnew, Ynew : The centralized pixel.
Program body:
Step 1: for i = 0 … n
Step 2: for j = 0 … m
    if (Xold + Sx, Yold + Sy) is between the boundaries of the image
Hence, the rotation algorithm must be used to unify the signature
orientation in a horizontal manner to overcome this problem.
It is important to compute the angle (θ), which is used in the
rotation operation. The rotation of an image requires the calculation of
a new position for each point of the image after the transformation.
Each image point is rotated through an angle (θ) about the origin,
which varies from one signature to another and can be calculated
according to the inclination angle. The following algorithm is used
for this purpose.
Figure (3.6): Computing the angle (θ): the image is divided into 8 vertical sections; A1, A2 and B1, B2 are the side points found in sections S1 and S2.
Step 1: Divide the image vertically into (8) equal sections. Consider
    the first and last sections, (S1) and (S2).
Step 2: Find the first point of the signature from the top to the
    base in (S1) and in (S2).
Step 3: Calculate the average of the two side points:

    C1 = (A1 + A2) / 2                                          (3.10)
    C2 = (B1 + B2) / 2                                          (3.11)

Step 4: Connect C1 and C2.
Step 5: Calculate the inclination of the line connecting the points
    C1 (x1, y1) and C2 (x2, y2) by using:

    θ = tan⁻¹( (y2 - y1) / (x2 - x1) )                          (3.12)

As shown in figure (3.7), the signature is rotated through the angle (θ).
Rotation Algorithm:
Input:
    Xold, Yold : The pixel which will be rotated.
    L, R, D, U : The boundaries of the image.
Output:
    Xnew, Ynew : The rotated pixel.
    θ : The angle of rotation.
Program body:
Step 1: Calculate the angle (θ).
Step 1.1: Calculate (S1, S2), which represent the columns at the left
    and the right of the object, cutting the signature at the first
    on-pixel from the top and the bottom.
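The angle computation in steps 1-5 and equation (3.12) can be sketched as follows (an illustrative sketch; the side points A1, A2, B1, B2 are hypothetical values, whereas in practice they would be found in sections S1 and S2 of the binary image):

```python
import math

def rotation_angle(A1, A2, B1, B2):
    # Midpoints of the left and right side points, equations (3.10)-(3.11).
    C1 = ((A1[0] + A2[0]) / 2.0, (A1[1] + A2[1]) / 2.0)
    C2 = ((B1[0] + B2[0]) / 2.0, (B1[1] + B2[1]) / 2.0)
    # Inclination of the line C1-C2, equation (3.12).
    return math.atan2(C2[1] - C1[1], C2[0] - C1[0])

# Hypothetical side points of a slightly tilted signature.
theta = rotation_angle(A1=(5, 10), A2=(5, 20), B1=(45, 14), B2=(45, 24))
print(round(math.degrees(theta), 2))  # 5.71
```

Rotating every pixel through -θ about the origin would then bring the signature baseline to the horizontal.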
Figure (3.7): Image Rotation (before and after).
Trimming Algorithm
Input:
    L, R, D, U : The boundaries of the signature body.
Output:
    The trimmed image.
Program body:
Step 1: for i = 0 … n
Step 2: for j = 0 … m
        a = i - U                                               (3.17)
        b = j - L                                               (3.18)
        y[a, b] = x[i, j]                                       (3.19)
    end for
Step 3: n = D - U                                               (3.20)
        m = R - L                                               (3.21)
Step 4: Trim the image at the new dimensions (n x m).
    end for
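The trimming steps (3.17)-(3.21) amount to copying the bounding box of the signature into a new array; a minimal sketch (treating the boundaries as half-open index ranges, which is an assumption about the exact convention):

```python
def trim(image, L, R, U, D):
    # Copy rows U..D-1 and columns L..R-1 into a new (D-U) x (R-L) array,
    # implementing a = i - U, b = j - L, y[a, b] = x[i, j].
    return [[image[i][j] for j in range(L, R)] for i in range(U, D)]

# A 5x6 toy image whose signature body occupies rows 1..3, columns 2..4.
img = [[0] * 6 for _ in range(5)]
img[1][2] = img[2][3] = img[3][4] = 1
trimmed = trim(img, L=2, R=5, U=1, D=4)
print(len(trimmed), len(trimmed[0]))  # 3 3
```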
Figure (3.8): Image Trimming (before and after).
Scaling-2 Algorithm
Input:
    Xold, Yold : The pixel which will be scaled.
Output:
    Xnew, Ynew : The scaled pixel.
Program body:
Step 1: Sc = 50 / (R - L)
Step 2: for i = 0 … n
Step 3: for j = 0 … m
    Use equations (3.6), (3.7) to get Xnew, Ynew.
Step 5: Use the same equations above to get Nnew, Mnew,
    the new image dimensions.
Figure (3.9): Image Scaling-2 (before and after).
Centralization-2 Algorithm
(The centralization is done according to the Y-axis only.)
Input:
    L, R, D, U : The boundaries of the signature body.
Output:
    Ynew : The centralized pixel.
Program body:
Step 1: Sc = 50 / (R - L)
        Sy = (50 - U - D) * Sc / 4                              (3.22)
Step 2: for i = 0 … n
Step 3: for j = 0 … m
    if Yold + Sy is between the boundaries of the image,
        use equation (3.9) to get Ynew.
Figure (3.10): Image Centralization-2 (before and after).
Having applied the above stages on a signature, the feature
extraction stage is started by finding the frequency histograms of the
on-pixels in the (50x50) image matrix.
The extracted features in this work include four histograms that
are computed in different directions. Each histogram has (50) values,
and these histograms are explained as the following:
1- First histogram: this histogram is computed by counting the on-pixels
along the horizontal lines of the image matrix, producing a vector of (50)
values; these values are represented in table (3.1) and illustrated as a
histogram in figure (3.11).
Figure (3.11): Horizontal histogram of the on-pixel frequencies (frequency vs. row number).
2- Second histogram: this histogram is computed by counting the on-pixels
along the vertical lines of the image matrix, producing a vector of (50)
values; these values are represented in table (3.2) and illustrated as a
histogram in figure (3.12).
Figure (3.12): Vertical histogram of the on-pixel frequencies (frequency vs. column number).
Table (3.3): Features Extraction on the lower part of the main diagonal.
Lower d. No. 1 2 3 4 5 6 7 8 9 10 11 12 13
Frequency 0 0 0 1 4 2 3 4 2 3 2 2 3
Lower d. No. 14 15 16 17 18 19 20 21 22 23 24 25 26
Frequency 2 5 3 4 3 3 3 3 4 2 2 2 4
Lower d. No. 27 28 29 30 31 32 33 34 35 36 37 38 39
Frequency 7 4 3 6 2 4 2 4 2 1 3 1 2
Lower d. No. 40 41 42 43 44 45 46 47 48 49 50
Frequency 2 1 1 0 0 0 0 0 0 0 0
Histogram of the frequencies in table (3.3) (frequency vs. lower-diagonal number).
Table (3.4): Features Extraction on the upper part of the main diagonal.
Upper d. No. 1 2 3 4 5 6 7 8 9 10 11 12 13
Frequency 0 0 0 1 7 6 5 5 1 2 2 2 2
Upper d. No. 14 15 16 17 18 19 20 21 22 23 24 25 26
Frequency 4 4 2 3 4 3 4 4 5 2 2 2 2
Upper d. No. 27 28 29 30 31 32 33 34 35 36 37 38 39
Frequency 2 2 2 2 4 6 3 5 3 3 3 2 0
Upper d. No. 40 41 42 43 44 45 46 47 48 49 50
Frequency 2 1 1 0 0 0 0 0 0 0 0
Histogram of the frequencies in table (3.4) (frequency vs. upper-diagonal number).
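The four histograms can be sketched for an n x n binary matrix as follows (a minimal illustration of the counting scheme; that the diagonal features count on-pixels along lines parallel to the main diagonal, one count per offset, is an assumption about the exact procedure):

```python
def histograms(img):
    n = len(img)                                   # an n x n binary matrix
    horiz = [sum(row) for row in img]              # on-pixels per row
    vert = [sum(img[i][j] for i in range(n)) for j in range(n)]  # per column
    # Diagonals parallel to the main diagonal, covering the lower and
    # the upper triangle respectively: one count per offset d.
    lower = [sum(img[i + d][i] for i in range(n - d)) for d in range(n)]
    upper = [sum(img[i][i + d] for i in range(n - d)) for d in range(n)]
    return horiz + vert + lower + upper            # 4n feature values

# Toy 4x4 matrix with three on-pixels (a 50x50 matrix would give 200 values).
img = [[0] * 4 for _ in range(4)]
img[0][0] = img[1][1] = img[2][0] = 1
features = histograms(img)
print(len(features))  # 16
```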
Most techniques for HSV (handwritten signature verification) involve the
following five phases: data acquisition, preprocessing, feature extraction,
comparison process, and performance evaluation.
Most methods, but not all, during the first three phases (data
acquisition, preprocessing, and feature extraction) generate a
reference signature (or a set of reference signatures) for each
individual. This normally requires a number of signatures of the user
to be captured at enrollment or registration time (these signatures are
called sample signatures) and processed. When a user claims to be a
particular individual and presents a signature (we call this signature
the test signature), the test signature is compared with the reference
signature for that individual. The difference between the two is then
computed using one of the many existing (or specially developed)
distance measures. If the distance is above a predefined threshold
value, the user is rejected; otherwise, the user is authenticated.
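The accept/reject decision can be sketched as follows (an illustrative sketch: the Euclidean distance and the threshold value are assumptions, since the text does not fix a particular measure here):

```python
import math

def verify(test_vec, reference_vec, threshold):
    # Reject when the distance between the test signature's feature vector
    # and the reference vector exceeds the threshold; otherwise authenticate.
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(test_vec, reference_vec)))
    return dist <= threshold

print(verify([1.0, 2.0], [1.1, 2.1], threshold=0.5))  # True
print(verify([1.0, 2.0], [4.0, 6.0], threshold=0.5))  # False
```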
The first step in preparing the signature database is to determine
the number of sample signers that need to be used in recognition, here
the sample is (50) signers.
The second step is to determine the number of sample signatures
that need to be collected from each person. Each group consists of (5)
signatures from each signer, and (50) signers contributed their
signatures to the database.
Each sample signature was collected in (250x250) pixels, using
the same type of writing tool and white sheets of paper. It has been
found that there is negligible noise on a scanned image when using a
black pen on plain white paper. The image is then passed through
preprocessing; after that, the feature extraction stage begins. The result
of this stage is a vector of (200) values for each signature; since there
are (5) signatures for each signer and (50) signers, there are
(250) signatures in the whole database. The database also includes the
values of the desired output layer: the appropriate six-bit
code, which is arbitrarily assigned to each original input pattern.
Firstly, the data used in recognition involves calculating the
average of the vector values for each person and determining the
deviation between the average and the vector values for all
signatures of the same person. The deviation value is then added to
and subtracted from the average. This results in new data (i.e., the
average, the average + deviation, and the average - deviation), which
forms the actual database. A large deviation among a person's
signatures has a great effect on recognition; thus the original vectors
are used as recognition data.
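The construction of the recognition data from one person's feature vectors can be sketched as follows (a minimal illustration; taking the "deviation" to be the mean absolute deviation from the average is an assumption about the exact measure):

```python
def build_recognition_data(vectors):
    # vectors: the feature vectors of one signer's sample signatures.
    n = len(vectors[0])
    avg = [sum(v[i] for v in vectors) / len(vectors) for i in range(n)]
    dev = [sum(abs(v[i] - avg[i]) for v in vectors) / len(vectors)
           for i in range(n)]
    plus = [a + d for a, d in zip(avg, dev)]    # average + deviation
    minus = [a - d for a, d in zip(avg, dev)]   # average - deviation
    return avg, plus, minus

# Toy two-dimensional vectors for three hypothetical sample signatures.
vecs = [[2.0, 4.0], [4.0, 8.0], [3.0, 6.0]]
avg, plus, minus = build_recognition_data(vecs)
print(avg)  # [3.0, 6.0]
```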
This model consists of (200) units in the input layer, in addition
to an extra unit representing the bias. The number of nodes in the
hidden layer (the second layer) is chosen to be (49) nodes (which was found
suitable for this problem by experimentation), in addition to a bias
unit as in the input layer. Finally, the network outputs the appropriate six-bit
code arbitrarily assigned to each original input pattern. Figure (3.15)
shows the net architecture.
Figure (3.15): The network architecture: 200 input units, 49 hidden units, and 6 output units, with an additional bias unit in the input and hidden layers.
is the input to the sigmoid function, which returns the activation energy of
the new signal. The hidden nodes then propagate the signal to the
output layer in the same manner.
The output nodes then perform this same process, using the
activation energy received from the hidden layer nodes rather than the
values that the hidden layer used.
Although this architecture is all that is needed to carry out feed-
forward propagation, several other design considerations were
implemented. For instance, a special "bias node" that always fires a
signal is added to both the input layer and the hidden layer, acting as
yet another control on the flow of the network. By altering the weight
between the bias and its postsynaptic nodes, the network can
effectively shut down nodes that continuously fail and boost those that
succeed.
To illustrate: the backpropagation training algorithm is an
iterative gradient algorithm designed to minimize the mean square
error between the actual output of the feedforward phase and the desired
output. It requires a continuous, differentiable non-linearity.
Step 1. Initialize weights and offsets.
Set all weights and node offsets to small random values
between (-1 and 1). Here, (9800) weight values are used between the input
and hidden layers and (294) values between the hidden and output layers.
Furthermore, the bias accounts for (55) values:
- (49) bias values between the input and hidden layer (Vij).
- (6) bias values between the hidden and output layer (Wjk).
Step 2. While the stopping condition is false, do steps 3-6.
Step 3. For each training pair, do steps 4-5.
Step 4. Feedforward phase.
Step 4.1: Each input unit (x1, ……, x200) receives an input value and
    broadcasts this value to all units in the layer above (the
    hidden units).
Step 4.2: Each hidden unit (Zj, j = 1, 2, …, 49) sums its weighted input
    signals,

    z_inj = v0j + Σi=1..200 xi · vij                            (3.23)

    applies its activation function to compute its output value,

    zj = f(z_inj) = 1 / (1 + exp(-z_inj))                       (3.24)

    and sends this signal to all units in the layer above (the output
    units). Here the steepness σ equals (1).
Step 4.3: Each output unit (Yk, k = 1, …, 6) sums its weighted input
    signals,

    y_ink = w0k + Σj=1..49 zj · wjk

    and applies its activation function to compute its output
    value:

    yk = f(y_ink) = 1 / (1 + exp(-y_ink))
Step 5. Backpropagation of error phase.
Step 5.1: Each output unit (Y1, ……, Y6) receives a target
    corresponding to the input training signature and computes
    its error information term,

    δk = (tk - yk) f '(y_ink),   k = 1, …, 6                    (3.25)
    f '(y_ink) = f(y_ink) [1 - f(y_ink)]

    calculates its weight correction term (used to update wjk later),

    Δwjk = α δk zj                                              (3.26)

    calculates its bias correction term (used to update w0k later),
    with the learning rate (α) equal to (0.5),

    Δw0k = α δk                                                 (3.27)

    and sends δk to the units in the layer below.
Step 5.2: Each hidden unit (Z1, Z2, ….. Z49) sums its delta inputs (from
    the units in the layer above),

    δ_inj = Σk=1..6 δk wjk                                      (3.28)

    multiplies by the derivative of its activation function to obtain
    its error information term,

    δj = δ_inj f '(z_inj)                                       (3.29)

    calculates its weight correction term (used to update vij later),

    Δvij = α δj xi

    and calculates its bias correction term (used to update v0j
    later):

    Δv0j = α δj

Step 5.3: Update weights and biases. Each output unit (Yk, k = 1, …, 6)
    updates its bias and weights:

    wjk(t+1) = wjk(t) + Δwjk,   j = 1, 2, …, 49                 (3.30)
    Similarly, each hidden unit (Z1, Z2, …., Z49) updates its bias and
    weights:

    vij(t+1) = vij(t) + Δvij,   i = 0, …, 200
Step6. Test stopping condition.
An epoch is one cycle through the entire set of training vectors.
Typically, many epochs are required for training a backpropagation
neural network. The foregoing algorithm updates the weights after
each training signature is presented. A common variation is batch
updating, in which weight updates are accumulated over an entire
epoch (or some other number of presentations of signatures) before
being applied.
Training continued until the total squared error for (250)
signatures was less than (0.01).
Step 7. Testing phase.
After training, a backpropagation neural net is applied by using
only the feedforward phase of the training algorithm. The testing
procedure is as follows:
Step 7.1: Initialize the weights (from the training algorithm).
Step 7.2: For each input vector, do steps 7.3-7.5.
Step 7.3: For i = 1, …, 200: set the activation of input unit (xi).
Step 7.4: For j = 1, …, 49:

    z_inj = v0j + Σi=1..200 xi · vij
    zj = f(z_inj)

Step 7.5: For k = 1, …, 6:

    y_ink = w0k + Σj=1..49 zj · wjk
    yk = f(y_ink)
After the set of preprocessing procedures has been applied to the
unknown signature and the training algorithm on the database is complete,
the unknown signature is tested for recognition by using the
feedforward phase to compute the actual output for the unknown
signature and calculating the minimum distance between the computed
output and all targets of the database signatures. A
signature is taken to be recognized if the minimum distance between the
computed output nodes and the original target nodes equals zero.
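The testing phase followed by the minimum-distance match can be sketched with tiny dimensions in place of the 200-49-6 network (an illustrative sketch; the two-bit target codebook and random weights are hypothetical):

```python
import math, random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, V, W):
    # Feedforward phase only (steps 7.3-7.5); index 0 holds the bias weight.
    xb = [1.0] + x
    z = [sigmoid(sum(v * a for v, a in zip(row, xb))) for row in V]
    zb = [1.0] + z
    return [sigmoid(sum(w * a for w, a in zip(row, zb))) for row in W]

def recognize(output, targets):
    # Round each output node to a bit and find the target code at minimum
    # distance; accept only when that minimum distance is exactly zero.
    bits = [round(o) for o in output]
    best = min(targets, key=lambda t: sum(b != tb for b, tb in zip(bits, t)))
    dist = sum(b != tb for b, tb in zip(bits, best))
    return best if dist == 0 else None

targets = [[0, 1], [1, 0]]             # hypothetical two-signer codebook
print(recognize([0.1, 0.9], targets))  # [0, 1]
print(recognize([0.6, 0.7], targets))  # None: no code matches exactly

random.seed(0)
V = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
W = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(2)]
out = forward([0.2, 0.5, 0.8], V, W)   # accepted only if its bits match a code
print(recognize(out, targets))
```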
Chapter Four
(S.R.S) Implementation and
Experimental Results
4.1 Introduction
This chapter is divided into two parts. The first part is devoted to
presenting the construction of the proposed system, where a graphic
interface module was implemented with the main recognition program
modules to produce the SRS. The second part is devoted to
presenting the results of testing the performance of the proposed SRS.
Flowchart: both the reference signatures and the test signature pass through preprocessing; the test signature is then classified against the references database, leading to signature recognition.
4.2.1: Image Loading
This procedure loads the image after it is transduced via the
scanner device. Figure (4.2) presents the user interface of the SRS.
4.2.2: Preprocessing
The preprocessing steps applied in the system are Noise Removal,
Image Scaling-1, Image Centralization-1, Image Rotation, Image
Trimming, Image Scaling-2, and then Image Centralization-2. Details of
the processing steps are as follows:
4.2.2.1: Noise Removal
This procedure removes the noisy objects, as in figure (4.3).
Figure (4.4b): Maximize the signature size.
4.2.2.4: Image Rotation
This procedure rotates the signature through an angle about the
origin, as shown in figure (4.6).
4.2.2.6 Image Scaling-2
This procedure scales the signature at a fixed size, as shown in
figure (4.8).
4.2.3 Feature Extraction
An important phase when designing a pattern recognition system
is to identify which attributes are most relevant for decision making.
In this thesis four histograms are computed in different directions.
Each histogram has (50) values, and these histograms can be
explained as follows:
4.2.3.2: Features Extraction on Vertical Histogram.
This procedure is related to the histogram that represents the
features on the Vertical lines of the image matrix, as in figure (4.11).
4.2.3.4: Features Extraction on the Upper Triangle of the Main Diagonal
This procedure is related to the histogram that presents the
frequency on the upper triangle of the main diagonal, as shown in
figure (4.13).
Figure (4.14): Recognition Using Backpropagation Algorithms
4.2.6: Delete Signature
This procedure deletes the person's name and data from system's
database, as shown in figure (4.16).
gives the weights between the input and hidden layers for each node
and the weights between the hidden and output layers. Finally, the
database file shows the data for each pattern, which represents the
inputs; it also shows the targets for each pattern.
4.2.7.2: Database File Saving
This procedure saves the database file.
4.2.7.4 Computing Output
This procedure computes the output of each output node.
Chapter Five
Conclusions and Recommendations
for Future Work
5.1 Overview
In the available literature, the handwritten signature
recognition problem has been approached in various ways.
In general, handwriting recognition is a difficult task because of
the variation in writing styles, even for the same writer; therefore,
great attention must be taken in designing a recognition system.
The current research presents satisfactory results in the
recognition of handwritten signatures using Backpropagation Neural
Network.
5.2 Conclusions
The main conclusions of the research work can be summarized
as follows:
1- The SRS gives a high recognition rate, amounting to (98%). (100)
test signatures were chosen: (50) signatures belonging to persons
whose signatures exist in the database and another (50) signatures
not included in the database.
2- The system proved its effectiveness in classification by applying NN,
especially the Backpropagation network, which also proved effective
in pattern recognition. In experiments performed on (100) test
signatures, the system accepted (98) signatures and refused (2), so
the type I error is (2%).
4- The results of the experiments listed in tables (4.1) and (4.3) have
shown that the enhanced backpropagation network with a learning
rate of (0.5) is the most suitable for the task of off-line signature
recognition.
5- The rotation algorithm must be used to unify signature orientation
horizontally, to overcome the variation in a person's signature slant
from one signing to another, or even within the same session. Using
the SRS without the rotation operation gives a recognition rate of
only (20%).
6- The extracted features in this work include four histograms, which
produce good results; the same idea may be applied to other pattern
recognition problems.
5.3 Recommendations for Future Research
The following recommendations for further research can be
identified:
1- Structural pattern recognition based on Bezier curves or elliptic
curves can supply the attributes used for recognition, with the
backpropagation algorithm applied after solving the signature
segmentation problem.
2- Since the Backpropagation network was found effective in pattern
recognition, it can be improved by using multiple hidden layers
instead of a single hidden layer.
3- Other networks can be used instead of the Backpropagation network,
such as Learning Vector Quantization or the Neocognitron. These
networks may give faster results.
4- There are many reasons for the significant differences among
signatures of the same person, such as psychological state and age.
Therefore, it is necessary to find a system for minimizing the
differences among the signatures of the same person.
5- The SRS approach can also be used for handwriting, fingerprint and
face recognition, where it may be even more effective than for
signature recognition.
References
7- Klassen, T. “Towards Neural Network Recognition of Handwritten
Arabic Letters”, University of Dalhousie, P.12, 2001.
Internet Site: http://www.cs.dal.ca./~mheywood/Reports/Tklassen.pdf
8- Herbst, B., "On an Off-line Signature Verification System", University
of Stellenbosch, P.1, 1998.
Internet Site:
http://citeseer.nj.nec.com/rd/635002,352799,1,0,Download/http%3A%
2F%2Fdip.sun.ac.za/%7Eherbst/Publications_ps/prasa98sign.ps.
9- Krykor, L. Z., thesis in "Signature Verification Using Neural Network",
University of Technology, 1995.
10- Musa, A. K., thesis in " Signature Recognition and Verification by
Using Complex-Moments Characteristics", Baghdad University, 1998.
11- Adhaem, R., thesis in "Automatic Computer Technique for Personal
Signature Verification", Baghdad University, 2001.
12- "Introduction to Pattern Recognition", Internet Site:
http://cwis.auc.dk/phd/fulltext/larsen/html/node5.html.
13- Tou, J. T. and Gonzalez, R. C. “Pattern Recognition Principles”,
Addison-Wesley Publishing Company, Inc., 1974.
14- Internet Site: http://www.ph.tn.tudelft.nl/Research/neural/feature-extraction/papers/thesis/node67.html.
15- Assadi, A., "A Basic Introduction To Pattern Recognition", 2001.
Internet Site: http://www.Imcg.wisc.edu/bioCVG/courses/math991-f2001/courseware/Pattern.Recognition-Reading.pdf
16- Tay, Y. H., Lallican, P. M., Khalid, M., Gaudin, C. V., Knerr, S., "An
Off-line Cursive Handwritten Word Recognition System", P.123,
2001.
Internet Site: http://www.casro.utm.my/publications/yhtay tenc onol.pdf
17- EEE350 Project # 4, "Pattern Recognition", P.2, 1998.
Internet Site: http://www.eas.asu.edu/~morrell/350summer98/projectu.pdf
18- Rasheed, S. A., thesis in "Genetic Algorithm Application in Pattern
Recognition", Higher Educational Institute of Computer and Information,
P.22, 2000.
19- Breimer, E. A., "A Learning Approach for Designing Dynamic
Programming Algorithms - Input Classification", Rensselaer
Polytechnic Institute, 2000.
Internet Site: http://www.cs.rpi.edu/~breime/slide/node21.html.
20- Vuori, V., thesis in “Adaptation in On-line Recognition of
Handwriting”, Helsinki University of Technology, 1999.
21- David, C., “Neural Networks and Genetic Algorithms”, Reading
University, P.1, 1998.
Internet Site:
http://www.seattlerobotics.org/encoder/nov98/neural.html.
Internet Site: http://hem.hj.se/~de96klda/NeuralNetworks.htm#1.1Method.
Patent Number 4, 495, 644, 1985.
36- Mighell, D. A., "Backpropagation and its Application to Handwritten
Signature", IEEE Spectrum, P.P. 22-30, 1989.
37- Pender, D. A., "Neural Networks and Handwritten Signature
Verification", PhD Thesis, Department of Electrical Engineering,
Stanford University, 1991.
38- Darwish, A. M. and Auda, G. A., "A New Composite Feature Vector for
Arabic Handwritten Signature Verification", Proc. IEEE Int. Conf. on
Acoustics, V2, P.P. 613-666, 1994.