0% found this document useful (0 votes)

5 views58 pages

Comparative Performance of Classification Methods For Fingerprints

The document presents a comparative analysis of various classification methods for fingerprint recognition, utilizing NIST Special Database 4, which includes images from 2000 different fingers. It evaluates traditional classifiers and neural network approaches, focusing on their accuracy, computational requirements, and feature extraction techniques. The ultimate goal is to develop an effective automatic fingerprint classifier to enhance the efficiency of Automated Fingerprint Identification Systems (AFIS).

Uploaded by

Elias Jesus Lopez Mendoza

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

5 views58 pages

Comparative Performance of Classification Methods For Fingerprints

Uploaded by

Elias Jesus Lopez Mendoza

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 58

Comparative Performance of

Classification Methods for

Fingerprints

G. T. Candela
R. Chellappa

U.S. DEPARTMENT OF COMMERCE

Technology Administration
National Institute of Standards
and Technology
Computer Systems Laboratory
Advanced Systems Division
Gaithersburg, MD 20899

r '

GC
100 NIST
.U56
#51 65
1993
NISTIR 5163

Comparative Performance of
Classification Methods for
Fingerprints

G. T. Candela
R. Chellappa

U.S. DEPARTMENT OF COMMERCE

Technology Administration
National Institute of Standards
and Technology
Computer Systems Laboratory
Advanced Systems Division
Gaithersburg, MD 20899

April 1993

U.S. DEPARTMENT OF COMMERCE

Ronald H. Brown, Secretary

NATIONAL INSTITUTE OF STANDARDS

AND TECHNOLOGY
Raymond G. Hammer, Acting Director
Comparative Performance of Classification Methods
for Fingerprints

G. T. Candela
R. Chellappa

U. S. DEPARTMENT OF COMMERCE
National Institute of Standards and Technology
Technology Administration
Gaithersburg, MD 20899

(Dr. Chellappa is a member of the

Department of Electrical Engineering
Institute for Advanced Computer Studies
University of Maryland
College Park, MD 20742)

April 1993
Abstract

We report the results of several pattern classifiers as tested on NIST Special Database 4, which
consists of fingerprint images produced from two rollings of each of 2000 different fingers. The
fingerprints are to be classified into five broad classes known as Arch, Tented Arch, Left Loop,
Right Loop and Whorl. The classifiers tested are drawn from traditional pattern recognition
literature (minimum distance, maximum a posteriori, nearest neighbor) as well as neural network
literature (multilayer perception, radial basis functions, probabilistic neural network). To enable a
fair comparison of the classifiers, preprocessing steps such as enhancement and feature extraction
are kept the same for all the classifiers. Classification accuracies for all the classifiers are tabulated
for the case when they are trained on fingerprints from one rolling and tested on fingerprints from
a different rolling. Variations of classifier accuracies as the feature vector dimension varies are
studied. Computational and memory requirements of the different classifiers are also compared.
Contents
1 Introduction 1

2 Previous Fingerprint Classification Work 5

3 Experimental Fingerprint Database 6

4 Components of Classification System 6

5 Feature Extractor 7

6 Discriminant Functions (Classifiers) of Various Types 15

6.1 Notation 16

6.2 Ad Hoc Classifiers 17

6.2.1 Euclidean Minimum Distance (EMD) 17

6.2.2 Quadratic Minimum Distance (QMD) 18

6.3 Normal (NRML): A Parametric Classifier 19

6.4 Nearest Neighbor Classifiers 21

6.4.1 Single Nearest Neighbor (1-NN) 21

6.4.2 Weighted Several Nearest Neighbors (WSNN) 22

6.5 Neural Net Classifiers 24

6.5.1 Multi-Layer Perceptron (MLP) 24

6.5.2 Radial Basis Functions (RBF1 and RBF2) 27

6.5.3 Probabilistic Neural Net (PNN) 31

6.6 List of Discriminant Functions 33

7 Maximum Finder and Rejector 33

8 Comparative Error Rates of the Classifiers 34

8.1 Realistic Scoring Using Balanced Test Set 34

8.2 Error vs. Reject Curves 36

9 Future Plans 39

10 Summary and Conclusions 39

i
List of Figures

1 A fingerprint of the Arch class 1

2 A fingerprint of the Tented Arch class 3

3 A fingerprint of the Left Loop class 3

4 A fingerprint of the Right Loop class 4

5 A fingerprint of the Whorl class 4

6 Components of Classification System 7

7 The Whorl example fingerprint after FFT-based filtering 8

8 Ridge directions, core point, and median of core points 10

9 The 840 unequally- spaced ridge directions of the Whorl example print 11

10 The top 112 eigenvalues of the covariance matrix 11

11 The mean vector 12

12 Eigenvector no. 1; eigenvalue is 48.35 12

13 Eigenvector no. 2; eigenvalue is 23.33 13

14 Eigenvector no. 3; eigenvalue is 16.92 13

15 The full set of 2000 training examples, represented using the first two K-L features 14

16 EMD class regions. Estimated class means are marked 17

17 QMD class regions 18

18 NRML class regions 20

19 1-NN class regions 21

20 WSNN class regions, for a = 3, 10, 50, and 90 23

21 MLP class regions, for 1, 2, 24, and 80 hidden nodes 26

22 RBF1 class regions, for 1, 2, 4, and 6 hidden nodes per class 29

23 RBF2 class regions, for 1, 2, 4, and 6 hidden nodes per class 30

24 PNN class regions, for a = 0.14, 0.42, 1.00, and 2.12 32

25 Error vs. reject for 1-NN; confidence function 2 37

26 Error vs. reject, MLP, confidence function 2 37

27 Error vs. reject, PNN, confidence function 4 38

List of Tables

1 Lowest error percentages for the various classifier types, and the parameters that
produced them 35

ii
Error percentages for classifiers and numbers of features
1 Introduction

This report compares the performance of several types of automatic classifiers as applied to the
Pattern-level Classification Automation (PCA) problem for fingerprints. This is the problem of
building a machine to classify fingerprint images into five broad classes known as Arch, Tented
Arch, Left Loop, Right Loop, and Whorl. Figures 1 through 5 are example fingerprints of the
five classes. An automatic fingerprint classifier will be useful as a component of an Automated
Fingerprint Identification System, or AFIS. The purpose of an AFIS
whether the
is to find out
individual represented by an incoming “search” fingerprint card is the same as an individual
represented by one of a large database of “file” cards. First, the name and other textual data
on the incoming card (sex, birthdate, etc.) are automatically searched for in a database of the
corresponding data from file cards; very often, a candidate matching card is found by this “name
search”, and the match can then be confirmed or denied by comparison of the fingerprints on the
search card and the candidate file card. However, if the name search turns up no candidate card,
or if the candidate card's prints do not match those of the search card, then it is necessary to
conduct a “technical search” to find out whether there is any file card (regardless of textual data)
whose prints match those of the search card.

Figure 1: A fingerprint of the Arch class

Automated systems for comparing fingerprints have existed for many years. The Federal Bu-
reau of Investigation uses an automatic “minutiae finder” to find the locations and directions
of many small details (ridge endings and bifurcations) of the search fingerprints, and an au-
tomatic “minutiae matcher” to rapidly compare the resulting minutiae-lists with electronically
stored minutiae-lists of the fingerprints from file cards. The matcher produces a small list of can-
didate matching cards, and a fingerprint examiner then manually compares the candidates with

1
the search card to ascertain whether one of them is a positive match. If a match is found, a second
examiner verifies the identification.

The automated technical search process currently used by the FBI precedes the minutiae-
matching step with a manual classification step. Technicians classify each of the fingerprints on
the search card, then assign a classification code to the card as a whole. The system used for
manual classification is the so-called Henry system, with modifications and extensions introduced
by the FBI. (The handbook [1] provides a complete description of the rules for manual finger-
print classification.) The file cards are stored in cabinets in the sequence defined by their Henry
classifications, and their electronically stored minutiae-lists are also organized according to these
classifications. Therefore, after a search card has been classified, it is not necessary to use the
automatic matcher to compare its minutiae with those of every file card; rather, it is only nec-
essary to compare its minutiae with those of cards of the same class. Because the Henry system
separates the file cards into a very large number of classes, the number of minutiae-lists that the
automatic matcher must compare with the search minutiae-list is often very much smaller than
the total number of file cards. Manual classification therefore saves a large amount of processing
time that would otherwise be used by the automatic matcher. »

Classification done manually because an automatic classifier of sufficient accuracy and effi-
is

ciency has not yet been developed. The goal of the PCA project is to produce such an automatic
classifier, albeit one that separates fingerprints into only five broad pattern-level classes rather

than the much more numerous classes used in the Henry system. Automated classification using
the Henry system may be achieved eventually, but a shorter-term goal is to produce an accurate
classifier that uses only the five pattern-level classes. Note that although the PCA method uses
10
only five classes for a single fingerprint, it separates a database of file cards into 9,765,625 (= 5 )

different card classes, each consisting of one of the possible ordered ten-tuples of fingerprint classes.
One cannot assume that these classes will separate the file cards into 5 10 equal-sized, groups; nev-
ertheless, the classification of a search card will often be that of one of the reasonably small
groups, and therefore automatic pattern-level classification can be expected to save considerable
minutiae-matching work.

2
Figure 2: A fingerprint of the Tented Arch class

Figure 3: A fingerprint of the Left Loop class

3
Figure 4: A fingerprint of the Right Loop class

Figure 5: A fingerprint of the Whorl class

4
2 Previous Fingerprint Classification Work
Over the last fifteen years, at least four major approaches have been taken for automatic fingerprint
classification. These are the structural, syntactic, rulebased, and artificial neural network (ANN)
approaches. In the syntactic approach [2], one typically approximates ridge patterns [3] as a string
of primitives and models the plausible strings from a class such as Tented Arch using production
rules as in grammar. Depending on the type grammar used one can see a significant drop in the
of
number of production rules required. For example, a context free grammar would require fewer
production rules than a regular grammar. When a new fingerprint whose identification is sought
arrives, one extracts the string of primitives and passes it through a parser. A successful parser
output indicates the class to which the fingerprint belongs. Whether the parsing is successful or
not, a description of the input string is always generated. The more general the grammar is, the
more complex the parser tends to be. Applications of the syntactic approach using more complex
grammars such as stochastic grammars [4] (where probablities are associated with the production
rules), tree grammars [5], and programmed grammars [6] for the fingerprint classification problem
have been considered. The main stumbling block of the syntactic approach is that mechanisms
for the inference of grammar from training samples have not been well understood [7, 8]. Recent
advances in learning the structure of the grammar from training samples using neural nets [9] look
promising.

At the Image Recognition Group of the National Institute of Standards and Technology, a
massively parallel neural network fingerprint classification system has recently been implemented
using a parallel computer of the SIMD variety (single-instruction-stream, multiple-data-stream)
[10]. The work described in the present report is the result of modifications made to that system.

More recently [11], a fingerprint matching and classification system that uses a hierarchical pyramid
structure has been reported. The matching module takes two images as inputs and outputs a
number between 0 and 1. This number, which reflects the degree of belief that the two images
are from the same finger, is estimated using a probabilistic Bayesian approach. The network
parameters (about 104 in number) are trained using a steepest descent procedure that minimizes
a cross entropy measure. Impressive matching results are reported. It is interesting to note that
the receptive fields of the learning filter at the bottom of the pyramid appear to be edge or ridge
orientation detectors, but sometimes correspond to minutia detectors. This finding supports the
choice of ridge direction components used as features in the NIST system. It would be interesting
to compare the performance of the method described in [11] and the NIST system on the same
dataset [12].

5
3 Experimental Fingerprint Database
The classifiers described in this report were trained and tested using feature vectors derived from
the fingerprint images of NIST Special Database 4 [12]. This database consists of 8 bit per pixel
graylevel raster images of two inked impressions (“rollings”) of each of 2000 different fingers. The
feature vectors used to train the classifiers were made from the 2000 first-rollings, and those used
to test the classifiers were made from the 2000 second-rollings. Every fingerprint in the database
has an associated class-label, assigned by experts. The two rollings of any finger have the same
class, since the class of a fingerprint is not affected by variations that occur between different

rollings of the finger.

Fingerprints as they naturally occur are not distributed equally into the five classes. We
have taken a summary of the NCIC classes of fingerprints from more than 22.2 million cards and
1

reduced the numbers contained therein to estimates of the true frequencies or a priori probabilities
of the five PCA classes. The estimated probabilities are .037, .029, .338, .317, and .279, for the
classes Arch, Tented Arch, Left Loop, Right Loop, and Whorl respectively.

The 2000 fingers represented in Special Database 4 are equally divided among the five classes.
The database was produced this way, rather than by using a natural distribution, so as to increase
the representation of the relatively rare, and also Arch and Tented Arch classes. This
difficult,

provides trainable classifiers with more examples with which to learn these problematical classes.
The training set (first rollings) therefore has equally many prints of each class, so the testing set
(second rollings) also has equally many prints of each class, since the database consists of rolling
pairs.

4 Components of Classification System

Each of our experimental fingerprint classifiers consists of a set of components as shown in Figure 6.
The ovals represent input and output data, the rectangles represent processing components, and
the arrows represent the flow of data. The components do not necessarily correspond to separate
devices or programs; they merely represent a separation of the processing into conceptual units, so
that the overall structure may be discerned. The fingerprint image is a raster of 512 by 512 8-bit
grayscale pixels, produced by scanning the fingerprint card with a CCD device or by some other
method. The “Feature Extractor” component reduces the fingerprint from a raster of 262,144 bytes
to a feature vector a small ordered list of numbers that takes only about 448 bytes. The feature
,

vector is intended to represent most of the information relevant to classification in a compact

form. The next component is the bank of “Discriminant Functions” There are five discriminant .

functions, one for each class. Each one produces a single floating-point number, which tends to have
a large value if the input fingerprint is of the class corresponding to that particular discriminant
function. The five-tuple of values produced by the bank of discriminant functions is sent to two
final components, the “Maximum Finder” and the “Rejector”. The maximum finder merely finds
x
The NCIC classification system separates fingerprints into a very numerous set of classes, which are basically
the same as the classes used in a Henry system. The NCIC method for producing a card’s class from the classes of
its individual fingerprints is different than the summarizing method used in the Henry system.

6
which one of the discriminant values is highest, and assigns its class as the hypothesized class of the
fingerprint, thatis, the classification system’s best guess as to the class. The rejector component

computes a “confidence function’’ of the discriminant values, compares the resulting value with
a preset threshold, and thereby decides whether the fingerprint should be accepted or rejected.
Rejection of a fingerprint means that the classification system refuses to assign a class, because
it cannot be sufficiently certain as to the correct class. Rejected fingerprints must be classified
manually.

Figure 6: Components of Classification System

The following sections describe the classification system components in some detail. The feature
extractor and rejector, while important components of the classification system, are peripheral to
the main theme of this report, which is the comparison of different kinds of discriminant functions.

5 Feature Extractor

The classifiers described in this report take as their input a small vector of numerical features
derived from a fingerprint raster image. The fingerprint is reduced to 112 features (not all of
which need be used) as follows. First, it is subjected to an FFT-based filter that increases the
relative power of dominant frequencies, increasing the ratio of signal (fingerprint ridges) to noise.
Figure 7 shows the result of applying this filter to the Whorl example fingerprint of Figure 5. The
local orientations of the ridges at 840 equally- spaced locations (a 28 by 30 grid) are then measured,
using an orientation finder described in [10]. The orientation finder is based on a “ridge- valley”
fingerprint binarizer described in [13]. It computes an orientation at the location of each pixel,
then averages these basic orientations in nonoverlapping 16xl6-pixel squares to produce the grid of
840 orientations. Figure 8 represents the output of this program, with the computed orientations
indicated by the bars; the program has succeeded if the bars are parallel to nearby ridges. The
resulting array of angles is sent to a registration program [14, 10], which finds a “core” feature.
The plus sign in Figure 8 marks this core point, and the plus sign enclosed by a square marks the
median of the core points of a sample of fingerprints. The array of pixel-specific ridge orientations
is shifted horizontally and vertically so as to bring its core point to the median core point, and

then it is averaged using an overlapping set of 16xl6-pixel squares to produce an array of 3596 (58
x 62) orientations. The purpose of registration is to reduce differences between fingerprint images

7
that result from differences in how the fingers were rolled, so that the differences that remain will
be more likely to indicate true variations in fingerprint structure.

Figure 7: The Whorl example fingerprint after FFT-based filtering

Each of the 3596 orientations is represented, not as a ridge angle 0, but as a two-dimensional
vector (cos(20), sin(20)). This representation seems prefererable because it takes account of the
have an orientation
fact that ridges (e.g. north-south) but not a sense (e.g. north as oppposed to
south), and because it removes the artificial discontinuity that an angular representation always
has at some point around its circle. The orientations are thus represented as a vector of 7192
floating-point numbers.

We had decided very early that the Karhunen-Loeve (K-L) transform [15], described below,
would probably be useful, because it could greatly reduce the dimensionality of feature vectors
before they were sent to a classifier (that is, a bank of discriminant functions). Reducing feature
dimensionality saves training time for classifiers such as multilayer perceptrons or radial basis
functions, and saves memory for multiple-prototype classifiers such as the Probabilistic Neural
Net. However, production of the K-L transform for 7192-dimensional input vectors would have
been unfeasible on our computing equipment, because of the huge size of the required covariance
matrix. Therefore, we first reduce the array of 3596 equally-spaced ridge orientations to a set
of 840 orientations, arranged in a fixed unequally-spaced pattern that is intended to be optimal.
The areas of the fingerprint raster that contained large numbers of the “cores” and “deltas” of a
sample of fingerprints receive relatively denser spacings of orientations in the fixed pattern. (A
description of cores, deltas, and the official rules for manual fingerprint classification can be found
in [1]. The method of producing the unequally-spaced pattern is described in [16].) Figure 9 shows
the unequally-spaced orientations for the Whorl example fingerprint. The K-L transform is applied
to the vector of 1680 elements which this figure represents, reducing it to a vector of 112 elements,
not all of which need be used. We have tried both equally- and unequally-spaced patterns of 840

8
orientations as the final features before the K-L transform, and the unequally-spaced patterns
produced better classification results.

The K-L transform is an affine function produced using the mean vector and covariance matrix

of a sample of the vectors whose dimension is to be reduced. (The production of this transform
is also known as principal factors or principal components analysis.) The covariance matrix is
subjected to a diagonalization program (made from EISPACK routines) to produce a set of largest
eigenvalues and corresponding eigenvectors. To perform the K-L transform on an input vector of
1680 elements, the mean vector is subtracted from it and the resulting vector is then premultiplied
by the matrix whose rows are the eigenvectors. In our case, the diagonalization program produced
112 eigenvectors. (It was set up to find all eigenvalues greater than 0.1 and their corresponding
eigenvectors.) Each K-L feature is associated with an eigenvector and eigenvalue. In fact, the
eigenvalues are equal to the sample variances of the corresponding features over the training set,
and are therefore a measure of the “sizes” of the features. Figure 10 shows the top 112 eigenvalues;
their rapid dropoff shows that the information content of the K-L features is concentrated in the
first few features. Since the features associated with relatively small, later eigenvalues encode
small amounts of the information contained in the original vectors, they are naturally expected
to be less valuable than higher-ranking features, for purposes of classification. Therefore, if it

is desired to use only a subset of k of the 112 K-L features, one obviously uses the first k that
occur when they are arranged in decreasing order of eigenvalue. Figure 11 shows the mean vector;
Figures 12 through 14 show the first three eigenvectors.

Figure 15 shows the full training set of 2000 examples represented using the two K-L
first

features, with the letters indicating the classes of the examples. It can be seen that even just two
features can separate the classes somewhat: the A examples are concentrated on the right side,
L below, R above, and W to the left, although the T examples are not very well separated from
the others in this figure. Only two features are used in this illustration, for the simple reason that
the paper on which the figure is printed is a two-dimensional surface; when the various kinds of
discriminant functions were tested, they were allowed to use whatever number of features produced
the best results, up to a maximum of 112 features. (The descriptions of discriminant functions
are illustrated using diagrams of hypothetical class regions, also in two features.)

The quality of the features used with a classifier can have a major effect on the resulting
classification accuracy. This report, however, is intended primarily to compare different versions
of the classifier per se, that is, the last stage of the classification process. In order to make a fair
comparison, all classifiers described were tested using the same features described above, although
each classifier was allowed to use its particular optimal number of the features.

9
Figure 8: Ridge directions from filtered W example print. Plus sign marks core point; plus sign
in square marks median of core points

10
/ / / I / /

/ / J / I

1 I I I \ \ ' v

/ I I I 1 I » \

1 //////// f J
l / J / / f f/J
. '
, '. , , /f/J
' /fJJ ‘

Figure 9: The 840 unequally-spaced ridge directions of the Whorl example print

Figure 10: The top 112 eigenvalues of the covariance matrix

11
Figure 12: Eigenvector no. 1; eigenvalue is 48.35
13
t
L A mT *
Lfr l L L *
Tt LL W L W&
I
* ^ L fc
L

Figure 15: The full set of 2000 training examples, represented using only the first two K-L features.
A= Arch, T = Tented Arch, L = Left Loop, R= Right Loop, W = Whorl.

14
6 Discriminant Functions (Classifiers) of Various Types

This section provides descriptions for each of the types of discriminant functions we have tested.
As noted earlier, the bank of discriminant functions is the classifier per se, as distinguished from
other components of the classification system which can be thought of as pre- and post-processors.
The classifiers we have used may be separated into four categories, bearing in mind that the
category names are somewhat arbitrary and that some classifiers have attributes of more than
one category. The “Ad Hoc ” category contains the simple EuclideanMinimum Distance (EMD)
classifier and a more advanced version, the Quadratic Minimum Distance (QMD) classifier. The

“Parametric” category contains only one example, the Normal (NRML) classifier. The “Nearest
Neighbors"' category contains the Single Nearest Neighbor (1-NN) classifier and an improvement
we call Weighted Several Nearest Neighbors (WSNN). Finally, the Neural Net category contains
the Multi-Layer Perceptron (MLP), Radial Basis Functions classifiers of two types (RBF1 and
RBF2), and the Probabilistic Neural Net (PNN).
For each type of discriminant function, one or more diagrams are provided showing the resulting
hypothetical class regions in two-dimensional feature space. (More than one diagram is provided
if the discriminant functions have an adjustable parameter, such as a number of hidden nodes.)
These diagrams show the hypothesized classes that are assigned to each of a large number of
regularly spaced feature vectors: each one is made using a square array of 512 by 512 points with
(0,0) at the center and with the extent large enough to contain all the training feature vectors.
It would have been better, perhaps, to make these diagrams for the optimal dimensionalities of

feature space for each type of classifier - such as 28 or 112 dimensions - but we have decided to
use just two dimensions because it would probably be difficult to produce satisfactory pictures
of high-dimensional regions or boundaries. The reader should try to imagine class regions in
high- dimensional feature spaces that are analogous to the regions depicted in the diagrams.

15
6.1 Notation

The notation below will be used in the descriptions of the discriminant functions.

N number of classes. For fingerprint PCA, N=5

p(i) a priori probability that a
:
random fingerprint is of class i (1 < i < N)
p(i) an estimate of p(i)

n number of features used

R n
the set of all n-tuples of real numbers = “feature space”

x a “feature vector” representing a fingerprint (x € Rn )

Mi number of training prints of class i (1 < i < N)

x« feature vector from j th training print of class i (1 < i < iV, 1 < j < Mi) (x^ £ Rn )

mean feature vector for class i (1 < i < N) E R n

)

m, an estimate of fi t

covariance matrix for class i (1 <i< N )

(S -

t
is an n x n matrix)
s, an estimate of X;
T
d 2 (x,y) (x — y) (x — y) = squared Euclidean distance between x and y (x,y £ Rn )

A(x) ?
th
discriminant function (1 < i < N 1
x € Rn )

16
d

6.2 Ad Hoc Classifiers

6.2.1 Euclidean Minimum Distance (EMD)

This is perhaps one of the simplest classifiers that one can design. Its discriminant functions are
of the form
A(x) - — 2
(x, m,).

Classifying an unknown to the class of the highest-valued discriminant function is equivalent to

using the class label of the estimated class-mean that is closest to the unknown in the sense
of squared Euclidean distance. The resulting hypothetical class regions are convex polygons.
Figure 16 shows the class regions when only two features are used. The estimated class mean
vectors m^ are marked with plus signs.

Figure 16: EMD class regions. Estimated class means are marked

17
6.2.2 Quadratic Minimum Distance (QMD)

In this classifier, the training examples of each class i are used to produce an estimate S t
of the
class covariance matrix, as well as an estimate m* of the class mean vector. Then the following
discriminants are used:
T
A(x) - -(x - m ')
t
S“ 1 (x - m,).

This can be thought of as a form intermediate between EMD and the Normal (NRML) classifier

(described next). Figure 17 shows the resulting class regions; their boundaries are quadratic.

Figure 17: QMD class regions

18
•

6.3 Normal (NRML): A Parametric Classifier

This based on parametric density estimation that presupposes a multivariate normal

classifier is
distribution for each class of fingerprints. First, it will be useful to mention a few facts that pertain
to any parametric classifier, using the following terminology:

A(«|j) = loss incurred by classifying to z a print that is of class j (1 < i : j < N)

Given a particular loss function A(z|j), the optimal or “Bayesian” classifier is the one that mini-
mizes the expected loss. Suppose the “symmetric” loss function is used:

0 i-j
A (*li)
1 otherwise

This means that correct classifications produce no losses and that all kinds of incorrect classifi-

cations produce equal loss values of 1 unit. In this case, the Bayesian classifier is the one that
classifies each unknown x to the class z for which the a posteriori probability p(z|x) is highest.
According to Bayes’s rule [17],

x) = TF
p(x)
—
Since the value of the mixture density p(x) has no effect on which possible z value maximizes
p(z|x), p(x) may be disregarded, resulting in the rule that one should classify x to the class z for
which p(z)p(x|z) is highest.

When using the Normal classifier, one assumes that each class z of fingerprint feature vectors
has conditional density function

P(x|«) = (
27r)-t|S 2
|"^exp (-^(x - ~ M«)) >

where n andl
are the mean vector and covariance matrix for class z. Therefore, the optimal
classifier picks the z that maximizes

p{i)p(x\i) =p(i)( 27 r)"t|S -|"Jexp l

(-^(x - - /Xi)) •

The above expression may be modified by discarding the (27r)”f term and taking the logarithm;
these changes do not affect which of the classes z maximizes the expression. The rule, then, is that

one should maximize

log p(i) - -1 log I

Si I
1
- -(x - Mi)
Ts _1 (x -Mi)-
t

Finally, in order to produce the discriminant functions for the Normal classifier, one replaces
the several p(i) with estimates p(i) and replaces the class mean vectors ijl z and class covariance

19
matrices S; with their estimates, the m ? (class sample mean vectors) and the S, (class sample
covariance matrices). The resulting discriminant functions are

A(x) = log p(i) - i log |S*-| - i(x- mi^Sr^x-mi).

The hypothetical class regions for the Normal classifier are shown in Figure 18. Their boundaries
QMD classifier, but the Arch region has become much smaller and the
are quadratic, as for the
Tented Arch region has disappeared completely.

Figure 18: NRML class regions

20
6.4 Nearest Neighbor Classifiers

6.4.1 Single Nearest Neighbor (1-NN)

This can be thought of as a generalization of EMD. Instead of using just m 2 ,

the estimated mean
vector for class i, as a single prototype for the class (as EMD does), the 1-NN classifier uses
all of the class-z training examples as prototypes for the class. The hypothetical class produced
by 1-NN for an unknown vector is simply the class of the closest prototype. This rule seems
intuitively appealing, and Cover and Hart [18] have shown it to have good asymptotic behavior:
under mild assumptions, its large-sample probability of error is bounded above by twice the Bayes
(i.e. minimum possible) probability of error.

Discriminant functions of the following form may be used to implement the 1-NN classifier:

Di(x) — — min 2
d (x.x
(!)
) .

Figure 19 shows the class regions. Each region is the union of many convex polygons each con-
taining a single prototype of the class; hence, a class region is a very complicated polygon, not
necessarily convex or even connected.

|A
gggT IHH^H L j
w

Figure 19: 1-NN class regions

21
6.4.2 Weighted Several Nearest Neighbors (WSNN)
After trying the 1-NN classifier, we naturally tried its well-known generalization the k-NN classi-
fier, expecting improved results. This classifier finds the k nearest prototypes to the unknown and
classifies it to the class thatmost abundantly represented among these near neighbors. (For
is

practical purposes, k may be optimized by simply finding which value produces the highest test
scores.) Suprisingly, our experiments always produced lower scores for k values greater than 1
than for k = 1. However, results better than the 1-NN results were obtained with a slightly more
elaborate version of k-NN, which may be called a Weighted Several Nearest Neighbor classifier.
This classifier finds the closest prototype to the unknown, then defines the “neighboring'’ proto-
types to be those whose squared Euclidean distance from the unknown is less than a times the
squared-distance of the nearest prototype, where a is a constant. The number of “votes” received
by class ? is divided by the square root of the sum of squared-distances of class-? near neighbors
from the unknown, so as to diminish the importance of neighbors that are relatively far away
compared to other neighbors.

To put the above description in mathematical notation, let:

a = neighborhood-size factor
= the set of indices of class-? training vectors that are
in the a-neighborhood of unknown vector x

= {'
1 < j <
~ M{,d
2
(x,x 3 ^)
(
< a min 2
d (x,
fc)
xj,
p ) )
\ > l<k<N,l<p<Mk V > J

VW = |<S^| = number of “votes” for class ?

The discriminant functions are then

A(x) = <
Vi
(,)
(n &sLO<*‘(x,xf)) if n
(i)
> o

0 otherwise

Figure 20 shows the WSNN class regions resulting from a values of 3, 10, 50, and 90.

22
3 10

50 90

Figure 20: WSNN class regions, for a 3, 10, 50, and 90

23
6.5 Neural Net Classifiers
6.5.1 Multi-Layer Perceptron (MLP)

This classifier is also known as a feedforward neural net. We have used an MLP with three layers
(counting the inputs as a layer). It will be convenient to define the following notation:

= number of nodes in th layer (i = 0, 1,2) 2

x
f(x) = 1/(1 + e~ = sigmoid function )

k
b\
= bias weight of th node of th layer
^
2 £:

wjj) = weight connecting th node of k th layer to j th node of 2

1 < j < Nf^

-1
(k — l) layer (k = 1, 2; 1 <
th
< )) i

T =
x = (xi, xn .a feature vector
.
. )

Note that

N^ = number of input nodes = n = number of features

j\dP = number of hidden nodes

N * 2)
= number of output nodes = N= number of classes

The discriminant functions are then of the form

/ M1
) / M°) \ \
a(x) = up >

j: j

For the training of the weights of this network, a reasonable procedure is the use of an optimiza-
tion algorithm to minimize the mean- squared-error, over the training set, between the discriminant
values actually produced and “target discriminant values” consisting of the appropriate strings of
l’s and 0’s as defined by the actual classes of the training examples. For example, if a training
feature vector is of class 2, then its target vector of discriminant values is set to (0, 1, 0, 0, 0). It

ismore feasible to minimize this kind of an “error function” than to attempt to directly minimize
the number of incorrectly classified training examples, since the latter number will take on only
relatively few values and is a discontinuous “step function”. In fact, the error function that is min-
imized is defined to contain an additional “regularization” term in addition to the mean-squared
discriminant error. This term consists of a factor times the average of the squares of the weights.
The motivation for including the regularization term is that the resulting weights will be somewhat
controlled in magnitude, and that this will increase the generalization ability of the network. The
goal of training is to produce a network that will “generalize” well, that is, accurately classify new
examples that were not part of the training set; even if the training algorithm produces weights
that result in perfect classification of the training examples, this is no guarantee that the network
will generalize well.

Networks of the MLP type are the most commonly used “neural nets” in use today, and they
are usually trained using a “backpropagation” algorithm [19]. However, we have used a “scaled
conjugate gradient” training method instead [20, 21, 22, 23]. The procedure trains the network

24
much fasterthan backpropagation and produces networks that generalize as well as or better than
those trained by backpropagation. Figure 21 shows MLP class regions resulting from the use of
1, 2, 24, and 80 hidden nodes.

25
I ^

24 80

Figure 21: MLP class regions, for 1, 2, 24, and 80 hidden nodes

26
, -

6.5.2 Radial Basis Functions (RBF1 and RBF2)

Neural nets of the Radial Basis Functions type get their name from the fact that they are built
from radially symmetric Gaussian functions of the inputs. Actually, the RBF nets discussed here
use Gaussian functions that are more general than radially symmetric functions: their constant
potential surfaces are ellipsoids whose axes are parallel to the coordinate axes, whereas radially
symmetric Gaussian functions have spherical constant potential surfaces. However, the name
Radial Basis Functions has become customary for any neural net that uses Gaussian functions in
its first layer.

We have experimented with RBF networks of two types, which will be denoted RBF1 and
RBF2. The following notation will be convenient:

= number of nodes in th layer (i = 0, 1,2) z

c ti) — center vector of j th hidden node (1 < j < N (c^ E Rn )

t
= j)
(4 ,...,4
j)
)

od'd = width vector of j

th
hidden node (1 < j < N^) (tr^E Rn )

=9j bias weight of 7

th
hidden node (RBF2 only)
f(x) =
x
1/(1 + e~ )
= sigmoid function
=6, bias weight of i
th
output node (1 < i < N^)
W{j = weight connecting i
th
output node to j th hidden node (1 < i < N 1 < j < N^)
T =
x = (aq, . .
.
xn ) a feature vector

Each hidden node computes a radial basis function. For RBF1, these functions are of the
exponential form
( N(0) , .9 ,\
^i(x) = exp f - £ (x k - j)
c[
) /
j
,

and for RBF2, they are of the sigmoidal form

/ N(0) .9 9\
^( x = / ) - E {
x* ck
]

) / (4
J)

)
j
•

th
For either type of RBF, the i discriminant function is the following function of the radial basis
functions:
/ M 1)
\

A(x) = / Ui + Xj w y^( x ) I
•

Packing both layers of the neural net into a single equation for the discriminant function results
in the following for RBF1:
and the following for RBF2:

/ M 1 ) / 7V(°)

A(x) = f Ut + w ijf ~ ei ~ (
x* ~ J)
4 ) / (
ak ]

)
V j=i \ fc=i

The centers c^\ widths a^ 3 \ hidden- node bias weights 9j (RBF2 only), output-node bias
weights b{, and output-node weights W{ 3 may be collectively thought of as the trainable “weights”
of the RBF network. They are trained as follows. First, a k-means clustering program is applied
to each class-set of the training patterns in turn, that is, each set of all training patterns that are

of a particular class. This program separates the class-set into clusters of similar patterns. The
mean vectors of these clusters are used as initial values for the center vectors To initialize

the width vectors cr^\ we simply set all components of all width vectors to a single positive value;
good values for this initial width component may be found by trial and error. The initial values of
the output-node weights are set in such a way that each output node is connected with a positive
weight to hidden nodes of its class (that is, hidden nodes whose initial center vectors are means
of clusters from its class), and connected with a negative weight to hidden nodes of other classes.

The k-means algorithm and the above rules are used to set the initial values of the weights.
Then, the weights are optimized using a conjugate- gradient algorithm, which is the same algorithm
we have used to optimize the weights of the MLP network. Figure 22 shows RBF1 class regions
resulting from the use of 1, 2, 4, and 6 hidden nodes per class, and Figure 23 shows RBF2 class
regions for the same numbers of hidden nodes per class.

28
A 6

Figure 22: RBF1 class regions, for 1, 2, 4, and 6 hidden nodes per class

29
Figure 23: RBF2 class regions, for 1, 2, 4, and 6 hidden nodes per class

30
6.5.3 Probabilistic Neural Net (PNN)

This classifier is promulgated in a recent paper by Specht [24]. Each training example becomes
the center of a kernel function which takes its maximum at the example and recedes gradually as
one moves away from the example in feature space. An unknown x is classified by computing,
for each class z, the sum of the values of the class-z kernels at x, multiplying these numbers by
compensatory factors involving the estimated a priori probabilities, and picking the class whose
resulting discriminant value is highest. Many forms are possible for the kernel functions; we have
obtained our best results using radially symmetric Gaussian kernels. The resulting discriminant
functions are of the form

A(x) =

where a is a scalar “smoothing parameter" that may be optimized by trial and error. Figure 24
shows the PNN class regions resulting from the use of <r values of 0.14, 0.42, 1.00, and 2.12. Notice
that a small a value produces very complex class regions similar to those of 1-NN, and that as a
is increased, the regions become simpler.

31
7, ~ 7 ;,
' '*'
'

r U ' ,, \
is '

‘A'/?y/S;Y//fovAw

A
>

::' .
S' •'

grv
' .

ill
1.00 2.12

Figure 24: PNN class regions, for a 0.14, 0.42, 1.00, and 2.12

32
d

6.6 List of Discriminant Functions

The following list is included for comparison of the functional forms.

EMD:
Di(x) = — 2
(x, m,)

QMD:
T
Di(x) = -(x - mi) S-
1
(x - m,)

NRML:
A(x) = log p(i) - i log Si |
|
- i(x-mi) T S f
1
(x m,

1-NN:
DAx) = — min
V
2
d (x. x
V ’
(

3;
!)

)
J
'
l<j<Mi

WSNN:
2

A(x) = < {’*?)) «^°>0 ,

where
0 otherwise

2 2 k
si0 = 1 <
- j
J <
- Mi,d (x.x
( j)

) < a min d (x,x[D ^) 1-

x = I5«|
\<k<N,\< P <Mk
' ,
{. \ 3 ’ J V 1

MLP:
/ M 1
)
/ M°)
A(x) = / &!
2)
+ v „< 2
>/ + 5;
V 3=1 \ k= 1

RBF1:
AT (1) / MO)
a(x) = /(&*+ 53 w ij ex p - 53 (
x* ~ 4J) ) / (4
J)
)
j=i V *=i

RBF2:
M 1)
M°)
a(x) = / (
b, + y. waf
/

- E (** - m ^a w
4?’)' /1
2
(
2

j=i V fc=i

PNN:

= I^E exP (“5^ J=1

2<j 2 (
x x 5°)
’

7 Maximum Finder and Rejector

th
The “Maximum Finder” component is very simple: it finds the i for which Z),-(x), the i discrim-
inant function evaluated at the feature vector x, is maximal, and outputs i as the hypothesized
class of the fingerprint of which x is the feature vector.

33
Operating in parallel with the Maximum Finder, and using the same data - the outputs of
the bank of discriminant functions - is the Rejector component. The motivation for a rejection
mechanism is that by refusing to classify some “ambiguous’' fingerprints (which must then be
classified manually), the automatic classifier can achieve a lower substitutional error rate on those
fingerprints which it does classify.

The rejectors used in our experiments are of the following form. The outputs of the discriminant
functions are fed to a “confidence function”, which produces a number that is treated as if it were a
measure of reliability of the classification decision made by the Maximum Finder. In other words,
fingerprints that produce high values of the confidence function are considered to be more likely to
have been assigned correct classes than those that produce lower confidence values. The following
confidence function (referred to as confidence function 2, for historical reasons) often produces
good results: define its value to be the highest discriminant-function value minus the second-
highest value. This is intuitively reasonable, since it will assign low confidence to fingerprints
that are near a hypothetical class boundary. Another successful confidence function (number 4)
is the same as function 2 except that it first “normalizes” the discriminant-function values by

dividing them by sum. To reject some fraction of the fingerprints presented, one simply sets
their
a “confidence threshold” and then rejects all fingerprints whose confidence values are below the
threshold. Higher thresholds result in rejecting greater percentages of the presented fingerprints
and also result in correctly classifying higher percentages of the accepted fingerprints.

8 Comparative Error Rates of the Classifiers

8.1 Realistic Scoring Using Balanced Test Set

The test set used in these experiments - the second rollings of Special Database 4 - consists of an
equal number (400) of each of the five classes of fingerprints. Because naturally occurring prints
have a very unequal distribution into classes, it would be a mistake to use the percentage of this test
set incorrectly classified as an estimate of the probability that a classifier will incorrectly classify a
naturally occurring print. Instead, the following calculations are used to produce the test “score”
(estimated probability of incorrect classification). For each class z, the number of the 400 class-z
test prints (i.e. test prints whose actual class is z
)
that were incorrectly classified, is counted; this
count is denoted W{. Clearly, tc;/400 may be used as an estimate of the conditional probability of
incorrect classification of a print given that the actual class of the print is z. Therefore,

N
X^iKOWdoo
2 =1

may be used as an estimate of the probability of incorrect classification. Accuracy “scores” men-
tioned below are these estimated probabilities, expressed as percentages.

Tableshows the lowest error rate that was obtained for each type of classifier. Also listed,
1

for each are the optimal settings that were found for other parameters: number of
classifier type,
K-L features used; type of training set - the full “balanced” set of 400 prints of each class, or a
“natural” set produced by discarding some of the training prints so as to cause the frequencies to

34
be approximately equal to the estimated a priori probabilities; and, for some of the classfier types,
another adjustable parameter or a number of hidden nodes. Table 2 shows, for each classifier type,
the lowest error rates that were obtained for each of several numbers of features; it is clear that
the optimal number of features is not the same for all of these classifier types.

Classifier Error % No. of features Training set Other parameter values

EMD 26.2 80 balanced -
QMD 12.8 16 balanced -
NRML 11.3 28 balanced -
1-NN 9.0 96 natural -
WSNN 8.9 96 natural a = 1.09
MLP 8.2 64 natural 64 hidden nodes
RBF1 8.3 112 natural 70 hidden nodes
RBF2 8.1 64 natural 110 hidden nodes
PNN 7.2 112 balanced <7 = 2.26

Table 1: Lowest error percentages for the various classifier types, and the parameters that produced
them

Number of features
Classifier 16 32 48 64 80 96 112

EMD 26.9 26.6 26.4 26.3 26.2 26.3 26.3

QMD 12.8 15.6 18.0 20.1 20.7 21.6 23.0
NRML 13.5 12.8 16.8 18.1 19.7 20.7 23.0
1-NN 10.7 9.6 9.7 9.3 9.1 9.0 9.3
WSNN 10.3 9.3 9.1 9.1 8.9 8.9 9.0
MLP 9.1 8.8 8.6 8.2 8.2 8.4 8.5
RBFl 9.8 8.6 9.1 8.8 8.8 8.5 8.3
RBF2 10.7 9.5 10.7 8.1 8.8 8.4 8.2
PNN 9.0 7.9 7.5 7.6 7.4 7.3 7.2

Table 2: Error percentages for classifiers and numbers of features. NRML produced a smaller
error percentage for a number of features not in the table: 11.3, for 28 features.

35
8.2 Error vs. Reject Curves

The above weighted scoring method can be extended to produce realistic scores at various levels of
rejection, thereby producing an “error vs. reject” curve that is really a graph of an “estimated con-
ditional probability of incorrect classification given acceptance” versus an “estimated probability
of rejection”. First, define some notation:

p{r) probability of rejection

p(r\i) conditional probability of rejection, given that actual class is i

P{a probability of acceptance = 1 — p(r)

p{a,w) probability of acceptance and incorrect classification

p{a,w\i conditional probability of acceptance and incorrect

classification, given that actual class is i

p{w\a) conditional probability of incorrect classification given acceptance

P(a )
p{a-,w)
1 -p{r)
r t
number of the 400 class-? prints that were rejected
Wi number of the 400 class-z prints that were accepted and incorrectly classified

Clearly r,-/ 400 may be used as an estimate of p(r|z), and W{j 400 as an estimate of p(a.w\i).
Therefore
N N
p( r ) = J2p( i
)p( r 10 ~ X^( >*/4oo, ?

2 =1 2 =1
N N
P(a,w) = E=
2 1
p(i)p(a, w\i) ~ ^p(z')iCj/400, and
2=1

X,=i p(i)wj/m
p{w\a) = p(a,w)/( 1 ~p{r))
1
-£Lp(*>,/400'
Figures 25 through 27 are error vs. reject curves for some of the classifiers, produced by plotting

the above estimates of p{w\a) against the estimates of p(r). The confidence functions that were
used are indicated.

36
Figure 25: Error vs. reject for 1-NN; confidence function 2

Figure 26: Error vs. reject, MLP, confidence function 2

37
Figure 27: Error vs. reject, PNN, confidence function 4

38
NIST-114 U.S. DEPARTMENT OF COMMERCE (ERB USE ONLY)
(REV. 9-92) NATIONAL INSTITUTE OF STANDARDS AND TECHNOLOGY ERB CONTROL NUMBER DIVISION

ADMAN 4.09 4 7:
PUBUCATION REPORT NUMBER CATEGORY CODE
MANUSCRIPT REVIEW AND APPROVAL
INSTRUCTIONS: ATTACH ORIGINAL OF THIS FORM TO ONE (1) COPY OF MANUSCRIPT AND SEND TO: PUBUCATION DATE NUMBER PRINTED PAGES
THE SECRETARY, APPROPRIATE EDITORIAL REVIEW BOARD.
TITLE AND SUBTITLE (CITE IN FULL)

Comparative Performance of Classification Methods for Fingerprints

CONTRACT OR GRANT NUMBER TYPE OF REPORT AND/OR PERIOD COVERED

AUTHOR(S) (LAST NAME, FIRST INITIAL SECOND INITIAL) PERFORMING ORGANIZATION (CHECK (X) ONE BOX)
X NIST/GAITHERSBURG
NIST/BOULDER
Candela, G. T. and Chellappa, R. JILA/BOULDER
LABORATORY AND DIVISION NAMES (FIRST NIST AUTHOR ONLY)

Co mpntpr SvgrprrK T nhnrnmrv AHvnnrpri Systems Division C 875. 121

UNO ORGANIZATION NAME AND COMPLETE ADDRESS (STREET, CITY, STATE, ZIP)
SPONSORI
Nib a NO*
P% s V
RECOMMENDED FCR NIST PUBUCATION

JOURNAL OF RESEARCH (NIST JRES) MONOGRAPH (NIST MN) LETTER CIRCULAR

J. PHYS. 4 CHEM. REF. DATA (JPCRD) NATL STD. REF. DATA SERIES (NIST NSRDS) BUILDING SCIENCE SERIES
HANDBOOK (NIST HB) FEDERAL INF. PROCESS. STDS. (NIST FIPS) PRODUCT STANDARDS
SPECIAL PUBUCATION (NIST SP) UST OF PUBUCATIONS (NIST LP) OTHER
TECHNICAL NOTE (NIST TN) X
H —
NIST INTERAGENCY/INTERNAL
1

U.S.
1
REPORT
FOREIGN
(NISTIR)

PAPER CD-ROM
R 5^3 l a do rv\
‘ DISKETTE (SPECIFY)
fjt
V-S-J3 OTHER (SPECIFY)
SUPPLEMENTARY NOTES

ABSTRACT (A 1500-CHARACTER OR LESS FACTUAL SUMMARY OF MOST SIGNIFICANT INFORMATION. IF DOCUMENT INCLUDES A SIGNIFICANT BIBLIOGRAPHY
OR LITERATURE SURVEY, CITE IT HERE. SPELL OUT ACRONYMS ON FIRST REFERENCE.) (CONTINUE ON SEPARATE PAGE, IF NECESSARY.)

We report the results of several pattern classifiers as tested on NIST Special Database 4, which consists of fingerprint

images produced from two rollings of each of 2000 different fingers. The fingerprints are to be classified into five broac
classes known as Arch, Tented Arch, Left Loop, Right Loop, and Whorl. The classifiers tested are drawn from
traditional pattern recognition literature (minimum distance, maximum a posteriori, nearest neighbor) as well as neura
network literature (multilayer perceptron, radial basis functions, probabilistic neural network). To enable a fair
comparison of the classifiers, preprocessing steps such as enhancement and feature extraction are kept the same for al
the classifiers. Classification accuracies for all the classifiers are tabulated for the case when they are trained or
fingerprints from one rolling and tested on fingerprints from a different rolling. The effect of feature vector dimension

on classifier accuracies is indicated. Computational and memory requirements of the classifiers are compared.

KEY WORDS (MAXIMUM 9 KEY WORDS; 28 CHARACTERS AND SPACES EACH; ALPHABETICAL ORDER; CAPITAUZE ONLY PROPER NAMES)

classification: fingerprint classification: multilayer perceptron: nearest neighbor: neural nets: paralle

omputation; pattern recognition: probabilistic neural net: radial basi^finictions

AVAILABILITY NOTE TO AUTHOR(S) IF YOU DO NOT WISH THIS
X UNUMITED |
FOR OFFICIAL DISTRIBUTION. DO NOT RELEASE TO
|
NTIS. MANUSCRIPT ANNOUNCED BEFORE PUBUCATION.
ORDER FROM SUPERINTENDENT OF DOCUMENTS, U.S. GPO. WASHINGTON. D.C.2O402 PLEASE CHECK HERE.
ORDER FROM NTIS. SPRINGFIELD. V A 22161

ELECTRONIC FORM
:
&v«w i'c Vi

mi
9 Future Plans
The Image Recognition Group will continue work on automatic fingerprint classification (among
other subjects), with the aim of improving the accuracy of the experimental classifiers while keep-
ing memory and time costs under control. More generally, we will continue our attempts to
characterize the difficulties inherent in automatic fingerprint classification, and will continue to
increase our knowledge about the feasibility of various classification methods as applied to this
task. The group is involved in other image processing work, in particular the classification of
handwritten characters; cross-fertilization between the areas of fingerprint and character classifi-

cation has proved beneficial so far, and will probably continue to be so. (The classifier algorithms
discussed here are applicable to any classification task; they are not specific to fingerprints.)

Two of the possible areas of future research for this group can be mentioned here. One is the
use of “decision tree” classifiers and their combination (discussed by Sethi [25]) with neural nets.
Indications are that such combination tree/net methods have advantages compared to either tree
or net methods alone. Another area of methods for reducing the costs of manv-
interest consists of
prototype classifiers. In particular, classification memory and time can be saved by reducing the
prototype set to a well-chosen subset of the training set, and time can be saved by using algorithms
that do not require computing the distances of the unknown from all stored prototypes. Nearest-
neighbor classifiers have been the subject of decades of research (see, for example, Dasarathy’s
collection of papers [26]), and it is likely that many of the cost-reducing techniques developed over
the years can be applied to other classifiers involving numerous prototypes, in particular PNN.

10 Summary and Conclusions

Pattern-level classification of fingerprints is a nontrivial task even for humans; it appears to be

very difficult to automate, even using a massively parallel computer. We have performed many
experiments in fingerprint classification, using “statistical” and “neural net” approaches. When
our features are used - that is, a fixed pattern of registration-corrected local ridge directions
dimension-reduced by a K-L transform - the differences in classification accuracy resulting from
varying the classifier per se between the types reported here, are significant but not dramatic.
Our test results indicate that the accuracy differences are fairly small when certain “reasonable”
classifiers arecompared. Of course, it is possible that some other classifier algorithm that we have
not tested, or that has not yet been invented, will produce much better results using the same
features as input; research and testing must continue. The best accuracy results we obtained were
for the Probabilistic Neural Net (PNN) method, which achieved an error rate of 7.2% (3.2% at
a 10% rejection level). Improvements in accuracy seem likely to be attainable by using a larger
training set and better features, even if the same PNN algorithm is used for the classifier itself.
Production of special-purpose hardware is probably required if fingerprint classifiers are to be
implemented and with the highest possible
at the lowest cost classification speed; the algorithms
discussed here are naturally parallel and their implementation in special hardware would probably
not be very difficult.

39
References

[1] The Science of Fingerprints. U. S. Department of Justice, Washington, DC, 1984.

[2] I\. S. Fu. Syntactic Pattern Recognition. Prentice-Hall, Englewood Cliffs, NJ, 1982.

[3] B. Moayer and K. S. Fu. A syntactic approach to fingerprint pattern recognition. Pattern
Recognition 7:1-23, 1975.
,

[4] B. Moayer and K. S. Fu. An application of stochastic languages to fingerprint pattern recog-
nition. Pattern Recognition 8:173-179, 1976.
,

[5] B. Moayer and K. S. Fu. A tree system approach for fingerprint pattern recognition. IEEE
Trans, on Computers 25:262-274, 1976.
,

[6] K. Rao and K. Balck. Type classification of fingerprints: A syntactic approach. IEEE Trans,
on Patt. Anal. Mach. Intel!, 2:223-231, 1980.

[7] K. S. Fu and T. L. Booth. Grammatical inference: Introduction and survey - part i. IEEE
Trans, on Systems, Man, and Cybernetics 5:95-111, January
,
1975.

[8] K. S. Fu and T. L. Booth. Grammatical inference: Introduction and survey - part ii. IEEE
Trans, on Systems, Man, and Cybernetics 5:409-423, July
,
1975.

[9] C. L. Giles, C. B. Miller, D. Chen, H. H. Chen, G. Z. Sun, and Y. C. Lee. Learning and
extracting finite state automata with second-order recurrent neural networks. Neural Com-
putation, 4(3):380, 1992.

[10] C. L. Wilson, G. T. Candela, P. J. Grother, C. I. Watson, and R. A. Wilkinson. Massively

Parallel Neural Network Fingerprint Classification System. Technical Report NISTIR 4880,
National Institute of Standards and Technology, July 1992.

[11] P. Baldi and Y. Chauvin. Neural networks for fingerprint matching and classification (ab-
stract). In Proceedings, Neural Information Processing Systems Conference, Vail, Colorado
1992.

[12] C. I. Watson and C. L. Wilson. Fingerprint database. National Institute of Standards and
Technology Special Database
, 4, FPDB, April 18, 1992.

[13] R. M. Stock and C. W. Swonger. Development and evaluation of a reader of fingerprint

minutiae. Cornell Aeronautical Laboratory Technical Report ,
CAL No. XM-2478-X-l:13-17,
1969.

[14] J. H. Wegstein. An automated fingerprint identification system. National Institute of Stan-

dards and Technology NBS Special Publication 500-89, Febuary
,
1982.

[15] Anil K. Jain. Fundamentals of Digital Image Processing, chapter 5.11, pages 163-174. Prentice
Hall Inc., prentice hall international edition, 1989.

40
[16] C. L. Wilson, G. T. Candela, and C. Watson. Neural network fingerprint
I. classification.
Journal of Artificial Neural Networks 1992. to be published.
,

[17] Nils J. Nilsson. Learning Machines: Foundations of trainable pattern-classifying systems ,

chapter 3, page 45. McGraw-Hill Book Company, 1965.

[18] T. M. Cover and P. E. Hart. Nearest neighbour pattern classification. IEEE Transactions on
Information Processing IT-13:21-27, 1967.
,

[19] D. E. Rumelhart, G. E. Hinton, and R. J. Williams. Learning representations by back-

propagating errors. Nature 332:533-536, 1986.,

[20] R. Fletcher and C. M. Reeves. Function minimization by conjugate gradients. Computer

Journal 7:149-154, 1964.
,

[21] E. M. Johansson, F. U. Dowda, and D. M. Goodman. Backpropagation learning for multi-

layer feed-forw ard neural networks using the conjugate gradient
T
method. IEEE Transactions
on Neural Networks 1991. ,

[22] M. F. M0ller. A scaled conjugate gradient algorithm for fast supervised learning. Technical
Report PB-339, Aarhus University, 1990.

[23] J. L. Blue and P. J. Grother. Training Feed Forward Networks Using Conjugate Gradients. In
Conference on Character Recognition and Digitizer Technologies volume 1661, pages 179-190,
,

San Jose California, February 1992. SPIE.

[24] Donald F. Specht. Probabilistic neural networks. Neural Networks , 3(1) 109— 1 18, 1990.
:

[25] I.K. Sethi. Entropy nets: from decision trees to neural networks. Proceedings of the IEEE ,

78:1605-1613, 1990.

[26] B. V. Dasarathy, editor. Nearest Neighbor (NN) Norms: NN Pattern Classification Tech-
niques. IEEE Computer Society Press, 1991.

41
mm

Philosophy of Law
No ratings yet
Philosophy of Law
8 pages
Dark & Soft Gradients Pitch Deck by Slidesgo
No ratings yet
Dark & Soft Gradients Pitch Deck by Slidesgo
52 pages
Fingerprint
No ratings yet
Fingerprint
36 pages
A Combination Fingerprint Classifier
No ratings yet
A Combination Fingerprint Classifier
10 pages
PCASYSA Pattern-Level Classification Automation System For Fingerprints
No ratings yet
PCASYSA Pattern-Level Classification Automation System For Fingerprints
50 pages
Classification of Fingerprints
No ratings yet
Classification of Fingerprints
38 pages
A Multichannel Approach To Fingerprint Classification
No ratings yet
A Multichannel Approach To Fingerprint Classification
12 pages
Sherin
No ratings yet
Sherin
8 pages
Fingerprint Retrieval For Identification
No ratings yet
Fingerprint Retrieval For Identification
11 pages
Yager 2004
No ratings yet
Yager 2004
17 pages
Chapter 2 Intro - FP - Classification
No ratings yet
Chapter 2 Intro - FP - Classification
4 pages
Classification of Fingerprint Images: July 1998
No ratings yet
Classification of Fingerprint Images: July 1998
8 pages
Electronics and Communication Engineering: Fingerprint Recognition
No ratings yet
Electronics and Communication Engineering: Fingerprint Recognition
47 pages
Revised: Fingerprint Classification Based On Gray-Level Fuzzy Clustering Co-Occurrence Matrix
No ratings yet
Revised: Fingerprint Classification Based On Gray-Level Fuzzy Clustering Co-Occurrence Matrix
11 pages
Fingerprint Image Classification Using Local Diagonal and Directional Extrema Patterns
No ratings yet
Fingerprint Image Classification Using Local Diagonal and Directional Extrema Patterns
14 pages
Technical Seminar Sample
No ratings yet
Technical Seminar Sample
15 pages
Biometrics - Fingerprint
No ratings yet
Biometrics - Fingerprint
14 pages
Fingerprint Recognition: Some Biological Principles (Moenssens 1971) Related To Fingerprint Recognition Are As Follows
100% (1)
Fingerprint Recognition: Some Biological Principles (Moenssens 1971) Related To Fingerprint Recognition Are As Follows
40 pages
For Biometrics
No ratings yet
For Biometrics
18 pages
2
No ratings yet
2
28 pages
Efficient Fingerprint Search Based On Database Clustering: Manhua Liu, Xudong Jiang, Alex Chichung Kot
No ratings yet
Efficient Fingerprint Search Based On Database Clustering: Manhua Liu, Xudong Jiang, Alex Chichung Kot
11 pages
Classification of Images
No ratings yet
Classification of Images
14 pages
Fingerprint Recognition System: Design & Analysis: January 2011
No ratings yet
Fingerprint Recognition System: Design & Analysis: January 2011
6 pages
Seminar On Fingerprint Recognization
No ratings yet
Seminar On Fingerprint Recognization
18 pages
Fingerprint Recognition: 18-551 Spring 2001 Group #5 Final Report
No ratings yet
Fingerprint Recognition: 18-551 Spring 2001 Group #5 Final Report
21 pages
Atm Terminal Design Based On Fingerprint
No ratings yet
Atm Terminal Design Based On Fingerprint
29 pages
Fingerprint Classification and Identification Algorithms For Criminal Investigation A Survey
No ratings yet
Fingerprint Classification and Identification Algorithms For Criminal Investigation A Survey
14 pages
Efficient and Effective Content-Based Image Retrieval Framework For Fingerprint Databases
No ratings yet
Efficient and Effective Content-Based Image Retrieval Framework For Fingerprint Databases
6 pages
Fingerprint Recognition: Presented by Dr. Nazneen Akhter
No ratings yet
Fingerprint Recognition: Presented by Dr. Nazneen Akhter
28 pages
2934 - Finger Print Authentication
No ratings yet
2934 - Finger Print Authentication
17 pages
A Survey of Fingerprint Classification Part I
No ratings yet
A Survey of Fingerprint Classification Part I
22 pages
Fingerprint
No ratings yet
Fingerprint
16 pages
FingerPrint Lecture1
No ratings yet
FingerPrint Lecture1
48 pages
Hybrid Finger Print Recognition
No ratings yet
Hybrid Finger Print Recognition
10 pages
MODULE 7 - Fingerprint Classification System SFO1 Libranda
No ratings yet
MODULE 7 - Fingerprint Classification System SFO1 Libranda
18 pages
A.Rattani, D.R. Kisku, M.Bicego, Member, IEEE and M. Tistarelli
No ratings yet
A.Rattani, D.R. Kisku, M.Bicego, Member, IEEE and M. Tistarelli
30 pages
4-Fingerprint Presentation and Acquisition-10!01!2025
No ratings yet
4-Fingerprint Presentation and Acquisition-10!01!2025
39 pages
6-Fingerprint Anatomy - History-10-01-2025
No ratings yet
6-Fingerprint Anatomy - History-10-01-2025
39 pages
A Fingerprint Verification System Based
No ratings yet
A Fingerprint Verification System Based
11 pages
Implementation of An Automatic Fingerprint Identification System
No ratings yet
Implementation of An Automatic Fingerprint Identification System
81 pages
Enhanced Thinning Based Finger Print Recognitio
100% (1)
Enhanced Thinning Based Finger Print Recognitio
14 pages
Proposal To Enhance Fingerprint Recognit
No ratings yet
Proposal To Enhance Fingerprint Recognit
13 pages
Fingerprint Recognition
67% (6)
Fingerprint Recognition
44 pages
Fingerprint PPT Title
No ratings yet
Fingerprint PPT Title
3 pages
Lecture 1
No ratings yet
Lecture 1
24 pages
Automatic Fingerprint Classification Using Graph Theory
No ratings yet
Automatic Fingerprint Classification Using Graph Theory
5 pages
Short Papers: Fingerprint Indexing Based On Novel Features of Minutiae Triplets
No ratings yet
Short Papers: Fingerprint Indexing Based On Novel Features of Minutiae Triplets
7 pages
Fingerprint Identification: BIOM 426 Instructor: Natalia A. Schmid
No ratings yet
Fingerprint Identification: BIOM 426 Instructor: Natalia A. Schmid
29 pages
Biometric Recognition Based On Fingerprint (2017)
No ratings yet
Biometric Recognition Based On Fingerprint (2017)
6 pages
Group 6 Final Report
No ratings yet
Group 6 Final Report
26 pages
Handbook of Fingerprint Recognition: Chapters 1 & 2 Presentation by Konda Jayashree
No ratings yet
Handbook of Fingerprint Recognition: Chapters 1 & 2 Presentation by Konda Jayashree
53 pages
Fingerprint Recognition Using Fuzzy Inferencing Techniques
No ratings yet
Fingerprint Recognition Using Fuzzy Inferencing Techniques
9 pages
Classification of Fingerprint
No ratings yet
Classification of Fingerprint
4 pages
Externalized Fingerprint Matching
No ratings yet
Externalized Fingerprint Matching
17 pages
Math of Fingerprint
No ratings yet
Math of Fingerprint
55 pages
Security Access Control System Using 89C51: A Major Project Synopsis Report On
No ratings yet
Security Access Control System Using 89C51: A Major Project Synopsis Report On
7 pages
Industry Internship Summary Report
No ratings yet
Industry Internship Summary Report
8 pages
Chapter 4 Fingerprint
No ratings yet
Chapter 4 Fingerprint
14 pages
Report On Fingerprint Recognition System
No ratings yet
Report On Fingerprint Recognition System
9 pages
Henry CL System
No ratings yet
Henry CL System
12 pages
DATA MINING and MACHINE LEARNING. CLASSIFICATION PREDICTIVE TECHNIQUES: NAIVE BAYES, NEAREST NEIGHBORS and NEURAL NETWORKS: Examples with MATLAB
From Everand
DATA MINING and MACHINE LEARNING. CLASSIFICATION PREDICTIVE TECHNIQUES: NAIVE BAYES, NEAREST NEIGHBORS and NEURAL NETWORKS: Examples with MATLAB
César Pérez López
No ratings yet
DATA MINING and MACHINE LEARNING: CLUSTER ANALYSIS and kNN CLASSIFIERS. Examples with MATLAB
From Everand
DATA MINING and MACHINE LEARNING: CLUSTER ANALYSIS and kNN CLASSIFIERS. Examples with MATLAB
César Pérez López
No ratings yet
Ansi-Nist - 2007 - Formato D Intercmbio de Imagenes Rostro Tatuajes y Marcas
No ratings yet
Ansi-Nist - 2007 - Formato D Intercmbio de Imagenes Rostro Tatuajes y Marcas
22 pages
Determination of Sex Difference From Fingerprint R
No ratings yet
Determination of Sex Difference From Fingerprint R
9 pages
Position of The European Fingerprint Working Group (EFPWG) of The European Network of Forensic Science Institutes (ENFSI) Regarding The NRC Report
No ratings yet
Position of The European Fingerprint Working Group (EFPWG) of The European Network of Forensic Science Institutes (ENFSI) Regarding The NRC Report
4 pages
Transferencia Directa - Obtención de Huellas Latentes de La Piel de Una Persona Viva
No ratings yet
Transferencia Directa - Obtención de Huellas Latentes de La Piel de Una Persona Viva
44 pages
Without The Help of Glasses The Anthrop
No ratings yet
Without The Help of Glasses The Anthrop
16 pages
MMM-Case Study
No ratings yet
MMM-Case Study
24 pages
Sudmo Components en
No ratings yet
Sudmo Components en
492 pages
Troubleshooting Ten Step Tango
No ratings yet
Troubleshooting Ten Step Tango
16 pages
FUJITSU SoftwareServerView Suite Remote Management
No ratings yet
FUJITSU SoftwareServerView Suite Remote Management
426 pages
Island Biogeography With Questions
No ratings yet
Island Biogeography With Questions
2 pages
WMI 2024 Final Grade 04 Paper B Question
No ratings yet
WMI 2024 Final Grade 04 Paper B Question
3 pages
External Thermal Insulation Composite Systems Etics
No ratings yet
External Thermal Insulation Composite Systems Etics
34 pages
User's Manual: Satellite C660/C665/C660D/C665D Satellite Pro C660/C660D Series
No ratings yet
User's Manual: Satellite C660/C665/C660D/C665D Satellite Pro C660/C660D Series
158 pages
Philosophy, Goals and Objectives
No ratings yet
Philosophy, Goals and Objectives
31 pages
Projectile Motion With Examples PDF
No ratings yet
Projectile Motion With Examples PDF
3 pages
Jubilee-Greenway-Route-Section-One
No ratings yet
Jubilee-Greenway-Route-Section-One
5 pages
Nogueira 2015 Stir Bar - Sorptive - Ex
No ratings yet
Nogueira 2015 Stir Bar - Sorptive - Ex
10 pages
AIATS Second Step JEE (Main & Advanced) 2024
No ratings yet
AIATS Second Step JEE (Main & Advanced) 2024
5 pages
UG Piping - Mechanical Handbook
100% (1)
UG Piping - Mechanical Handbook
7 pages
Momentum and Forces in Fluid Flow
No ratings yet
Momentum and Forces in Fluid Flow
11 pages
Ece Result 6TH Sem Ipu
No ratings yet
Ece Result 6TH Sem Ipu
325 pages
Nature vs. Nuture PDF
No ratings yet
Nature vs. Nuture PDF
4 pages
(IFAC Symposia Series) International Federation of Automatic Control, C. McGreavy-Dynamics and Control of Chemical Reactors and Distillation Columns. Selected Papers from the IFAC Symposium, Bournemou.pdf
No ratings yet
(IFAC Symposia Series) International Federation of Automatic Control, C. McGreavy-Dynamics and Control of Chemical Reactors and Distillation Columns. Selected Papers from the IFAC Symposium, Bournemou.pdf
322 pages
Candy Crossword
No ratings yet
Candy Crossword
3 pages
Module THERMO Thermodynamics BEXET 2 BSIT RACT X BMET MT 2 OK
No ratings yet
Module THERMO Thermodynamics BEXET 2 BSIT RACT X BMET MT 2 OK
124 pages
RT9724GB GP
No ratings yet
RT9724GB GP
13 pages
Manual Digital 1200.4 Evox Ingl Rev3
No ratings yet
Manual Digital 1200.4 Evox Ingl Rev3
16 pages
Effects of Food On Drug Therapy
No ratings yet
Effects of Food On Drug Therapy
12 pages
VZ7 - Viking Z Seven - Datasheet
No ratings yet
VZ7 - Viking Z Seven - Datasheet
2 pages
Rincian Harga CCTV: Paket All Dahua 4 Channel
No ratings yet
Rincian Harga CCTV: Paket All Dahua 4 Channel
2 pages
Khilgaon Flyover
No ratings yet
Khilgaon Flyover
18 pages
IR.20 150 Manual GB120314
No ratings yet
IR.20 150 Manual GB120314
12 pages
Thesis LD
No ratings yet
Thesis LD
4 pages