Learning A Fixed-Length Fingerprint Representation: Joshua J. Engelsma, Kai Cao, and Anil K. Jain, Life Fellow, IEEE
Abstract—We present DeepPrint, a deep network, which learns to extract fixed-length fingerprint representations of only 200 bytes.
DeepPrint incorporates fingerprint domain knowledge, including alignment and minutiae detection, into the deep network architecture
to maximize the discriminative power of its representation. The compact, DeepPrint representation has several advantages over the
prevailing variable length minutiae representation which (i) requires computationally expensive graph matching techniques, (ii) is
difficult to secure using strong encryption schemes (e.g. homomorphic encryption), and (iii) has low discriminative power in poor quality
fingerprints where minutiae extraction is unreliable. We benchmark DeepPrint against two top performing COTS SDKs (Verifinger and
Innovatrics) from the NIST and FVC evaluations. Coupled with a re-ranking scheme, the DeepPrint rank-1 search accuracy on the
NIST SD4 dataset against a gallery of 1.1 million fingerprints is comparable to the top COTS matcher, but it is significantly faster
(DeepPrint: 98.80% in 0.3 seconds vs. COTS A: 98.85% in 27 seconds). To the best of our knowledge, the DeepPrint representation
is the most compact and discriminative fixed-length fingerprint representation reported in the academic literature.
Index Terms—Fingerprint Matching, Minutiae Representation, Fixed-Length Representation, Representation Learning, Deep
Networks, Large-scale Search, Domain Knowledge in Deep Networks
1 INTRODUCTION

Over 100 years ago, the pioneering giant of modern day fingerprint recognition, Sir Francis Galton, astutely commented on fingerprints in his 1892 book titled "Finger Prints":

"They have the unique merit of retaining all their peculiarities unchanged throughout life, and afford in consequence an incomparably surer criterion of identity than any other bodily feature." [1]

Galton went on to describe fingerprint minutiae, the small details woven throughout the papillary ridges on each of our fingers, which Galton believed provided uniqueness and permanence properties for accurately identifying individuals. Over the 100 years since Galton's groundbreaking scientific observations, fingerprint recognition systems have become ubiquitous and can be found in a plethora of different domains [2] such as forensics [3], healthcare, mobile device security [4], mobile payments [4], border crossing [5], and national ID [6]. To date, virtually all of these systems continue to rely upon the location and orientation of minutiae within fingerprint images for recognition (Fig. 1).

Fig. 1. The most popular fingerprint representation consists of (a) global level-1 features (ridge flow, core, and delta) and (b) local level-2 features, called minutiae points, together with their descriptors (e.g., texture in local minutiae neighborhoods). The fingerprint image illustrated here is a rolled impression from the NIST SD4 database [7]. The number of minutiae in NIST SD4 rolled fingerprint images ranges from 12 to 196.
Although automated fingerprint recognition systems based on minutiae representations (i.e., handcrafted features) have seen tremendous success over the years, they have several limitations.

• Minutiae-based representations are of variable length, since the number of extracted minutiae (Table 1) varies amongst different fingerprint images even of the same finger (Fig. 2 (a)). Variations in the number of minutiae originate from a user's interaction with the fingerprint reader (placement position and applied pressure) and condition of the finger (dry, wet, cuts, bruises, etc.). This variation in the number of minutiae causes two main problems: (i) pairwise fingerprint comparison is computationally demanding and varies with number of minutiae, and (ii) matching in the encrypted domain, a necessity for user privacy protection, is computationally expensive and results in loss of accuracy [9].
• In the context of global population registration, fingerprint recognition can be viewed as a 75 billion class problem (≈ 7.5 billion living persons, assuming nearly all with 10 fingers) with large intra-class variability and large inter-class similarity (Fig. 2). This necessitates extremely discriminative yet compact representations that are complementary and at least as discriminative as the traditional minutiae-based representation. For example, India's civil registration system, Aadhaar, now has a database of ≈ 1.3 billion residents who are enrolled based on their 10 fingerprints, 2 irises, and face image [6].
• Reliable minutiae extraction in low quality fingerprints (due to noise, distortion, finger condition) is problematic, causing false rejects in the recognition system (Fig. 2 (a)). See also NIST fingerprint evaluation FpVTE 2012 [10].

• J. J. Engelsma and A. K. Jain are with the Department of Computer Science and Engineering, Michigan State University, East Lansing, MI, 48824. E-mail: [email protected], [email protected]
• K. Cao is a Senior Biometrics Researcher at Goodix, San Diego, CA. E-mail: [email protected]
Fig. 4. Flow diagram of DeepPrint: (i) a query fingerprint is aligned via a Localization Network which has been trained end-to-end with the Base-
Network and Feature Extraction Networks (no reference points are needed for alignment); (ii) the aligned fingerprint proceeds to the Base-Network
which is followed by two branches; (iii) the first branch extracts a 96-dimensional texture-based representation; (iv) the second branch extracts a
96-dimensional minutiae-based representation, guided by a side-task of minutiae detection (via a minutiae map which does not have to be extracted
during testing); (v) the texture-based representation and minutiae-based representation are concatenated into a 192-dimensional representation of
768 bytes (192 features and 4 bytes per float). The 768 byte template is compressed into a 200 byte fixed-length representation by truncating floating
point value features into integer value features, and saving the scaling and shifting values (8 bytes) used to truncate from floating point values to
integers. The 200 byte DeepPrint representations can be used both for authentication and large-scale fingerprint search. The minutiae-map can be
used to further improve system accuracy and interpretability by re-ranking candidates retrieved by the fixed-length representation.
fingerprint domain knowledge via an automatic alignment module and a multi-task learning objective which requires minutiae-detection (in the form of a minutiae-map) as a side task to representation learning. More specifically, DeepPrint automatically aligns an input fingerprint and subsequently extracts both a texture representation and a minutiae-based representation (both with 96 features). The 192-dimensional concatenation of these two representations, followed by compression from floating point features to integer value features, comprises a 200 byte fixed-length representation (192 bytes for the feature vector and 4 bytes for each of the 2 compression parameters). As a final step, we utilize Product Quantization [18] to further compress the DeepPrint representations stored in the gallery, significantly reducing the computational requirements and time for large-scale fingerprint search.

Detecting minutiae (in the form of a minutiae-map) as a side-task to representation learning has several key benefits:

• We guide our representation to incorporate domain inspired features pertaining to minutiae by sharing parameters between the minutiae-map output task and the representation learning task in the multi-task learning framework.
• Since minutiae representations are the most popular for fingerprint recognition, we posit that our method for guiding the DeepPrint feature extraction via its minutiae-map side-task falls in line with the goal of "Explainable AI" [19].
• Given a probe fingerprint, we first use its DeepPrint representation to find the top k candidates and then re-rank the top k candidates using the minutiae-map provided by DeepPrint³. This optional re-ranking add-on further improves both accuracy and interpretability.

The primary benefit of the 200 byte representation extracted by DeepPrint comes into play when performing mega-scale search against millions or even billions of identities (e.g., India's Aadhaar [6] and the FBI's Next Generation Identification (NGI) databases [3]). To highlight the significance of this benefit, we benchmark the search performance of DeepPrint against the latest version SDKs (as of July, 2019) of two top performers in the NIST FpVTE 2012 (Innovatrics⁴ v7.2.1.40 and Verifinger⁵ v10.0⁶) on the NIST SD4 [7] and NIST SD14 [21] databases augmented with a gallery of nearly 1.1 million rolled fingerprints. Our empirical results demonstrate that DeepPrint is competitive with these two state-of-the-art COTS matchers in accuracy while requiring only a fraction of the search time. Furthermore, a given DeepPrint fixed-length representation can also be matched in the encrypted domain via homomorphic encryption with minor loss to recognition accuracy as shown in [14] for face recognition.

More concisely, the primary contributions of this work are:

• A customized deep network (Fig. 4), called DeepPrint, which utilizes fingerprint domain knowledge (alignment and minutiae detection) to learn and extract a discriminative fixed-length fingerprint representation.
• Demonstrating in a manner similar to [29] that Product Quantization can be used to compress DeepPrint fingerprint representations, enabling even faster mega-scale search (51 ms search time against a gallery of 1.1 million fingerprints vs. 27,000 ms for a COTS with comparable accuracy).
• Demonstrating with a two-stage search scheme similar to [29] that candidates retrieved by DeepPrint representations can be re-ranked using a minutiae-matcher in conjunction with the DeepPrint minutiae-map. This further improves system interpretability and accuracy and demonstrates that the DeepPrint features are complementary to the traditional minutiae representation.

3. The 128 × 128 × 6 DeepPrint minutiae-map can be easily converted into a minutiae-set with n minutia: {(x₁, y₁, θ₁), ..., (xₙ, yₙ, θₙ)} and passed to any minutia-matcher (e.g., COTS A, COTS B, or [20]); a sketch of one such conversion is given after the list of contributions below.
4. https://www.innovatrics.com/
5. https://www.neurotechnology.com/
6. We note that Verifinger v10.0 performs significantly better than earlier versions of the SDK often used in the literature.
• Benchmarking DeepPrint against two state-of-the-art COTS matchers (Innovatrics and Verifinger) on NIST SD4 and NIST SD14 against a gallery of 1.1 million fingerprints. Empirical results demonstrate that DeepPrint is comparable to COTS matchers in accuracy at a significantly faster search speed.
• Benchmarking the authentication performance of DeepPrint on the NIST SD4 and NIST SD14 rolled-fingerprint databases and the FVC 2004 DB1 A slap fingerprint database [8]. Again, DeepPrint shows comparable performance against the two COTS matchers, demonstrating the generalization ability of DeepPrint to both rolled and slap fingerprint databases.
• Demonstrating that homomorphic encryption can be used to match DeepPrint templates in the encrypted domain, in real time (1.26 ms), with minimal loss to matching accuracy as shown for fixed-length face representations [14].
• An interpretability visualization which demonstrates our ability to guide DeepPrint to look at minutiae-related features.
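Footnote 3 states that the 128 × 128 × 6 minutiae-map can be converted into a standard minutiae set for any minutiae matcher. Below is a minimal sketch of one such decoder; the score threshold, the non-maximum-suppression window, and the circular-mean angle recovery are our own illustrative assumptions, not the authors' exact procedure.

```python
import numpy as np

def minutiae_map_to_set(H, score_thresh=0.25, nms_radius=4):
    """Decode a 128x128x6 minutiae map H into a set {(x_i, y_i, theta_i)}.

    Sketch (not the paper's exact decoder): the sum over the 6 orientation
    channels acts as a minutiae score map; local maxima above score_thresh
    are kept, and each minutia's angle is recovered as a circular mean of
    the channel centers 2*k*pi/6 (k = 0..5) weighted by the activations.
    """
    score = H.sum(axis=2)                       # (128, 128) spatial evidence
    centers = 2.0 * np.arange(6) * np.pi / 6.0  # orientation channel centers
    minutiae = []
    for j in range(H.shape[0]):                 # rows (y)
        for i in range(H.shape[1]):             # cols (x)
            s = score[j, i]
            if s < score_thresh:
                continue
            # non-maximum suppression within a small window
            y0, y1 = max(0, j - nms_radius), min(H.shape[0], j + nms_radius + 1)
            x0, x1 = max(0, i - nms_radius), min(H.shape[1], i + nms_radius + 1)
            if s < score[y0:y1, x0:x1].max():
                continue
            w = H[j, i, :]
            theta = np.arctan2((w * np.sin(centers)).sum(),
                               (w * np.cos(centers)).sum()) % (2 * np.pi)
            minutiae.append((float(i), float(j), float(theta)))
    return minutiae

mset = minutiae_map_to_set(np.random.rand(128, 128, 6) * 0.1)  # toy input
```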
TABLE 2
Published Studies on Fixed-Length Fingerprint Representations

2 PRIOR WORK

Several early works [22], [23], [24] presented fixed-length fingerprint representations using traditional image processing techniques. In [22], [23], Jain et al. extracted a global fixed-length representation of 640 bytes, called Fingercode, using a set of Gabor Filters. Cappelli et al. introduced a fixed-length minutiae descriptor, called Minutiae Cylinder Code (MCC), using 3D cylindrical structures computed with minutiae points [24]. While both of these representations demonstrated success at the time they were proposed, their accuracy is now significantly inferior to state-of-the-art COTS matchers.

Following the seminal contributions of [22], [23] and [24], the past 10 years of research on fixed-length fingerprint representations [31], [32], [33], [34], [35], [36], [37], [38], [39] has not produced a representation competitive in terms of fingerprint recognition accuracy with the traditional minutiae-based representation. However, recent studies [25], [26], [27], [28] have utilized deep networks to extract highly discriminative fixed-length fingerprint representations. More specifically, (i) Cao and Jain [25] used global alignment and Inception v3 to learn fixed-length fingerprint representations. (ii) Song and Feng [26] used deep networks to extract representations at various resolutions which were then aggregated into a global fixed-length representation. (iii) Song et al. [27] further learned fixed-length minutiae descriptors which were aggregated into a global fixed-length representation via an aggregation network. Finally, (iv) Li et al. [28] extracted local descriptors from predefined "fingerprint classes" which were then aggregated into a global fixed-length representation through global average pooling.

While these efforts show tremendous promise, each method has some limitations. In particular, (i) the algorithms proposed in [25] and [26] both required computationally demanding global alignment as a preprocessing step, and the accuracy is inferior to state-of-the-art COTS matchers. (ii) The representations extracted in [27] require the arduous process of minutiae-detection, patch extraction, patch-level inference, and an aggregation network to build a single global feature representation. (iii) While the algorithm in [28] obtains high performance on rolled fingerprints
(with small gallery size), the accuracy was not reported for slap fingerprints. Since [28] aggregates local descriptors by averaging them together, it is unlikely that the approach would work well when areas of the fingerprint are occluded or missing (oftentimes the case in slap fingerprint databases like FVC 2004 DB1 A), and (iv) all of the algorithms suffer from lack of interpretability compared to traditional minutiae representations.

In addition, existing studies targeting deep, fixed-length fingerprint representations all lack an extensive, large-scale evaluation of the deep features. Indeed, one of the primary motivations for fixed-length fingerprint representations is to perform orders of magnitude faster large scale search. However, with the exception of Cao and Jain [25], who evaluate against a database of 250K fingerprints, the next largest gallery size used in any of the aforementioned studies is only 2,700.

As an addendum, deep networks have also been used to improve specific sub-modules of fingerprint recognition systems such as segmentation [40], [41], [42], [43], orientation field estimation [44], [45], [46], minutiae extraction [47], [48], [49], and minutiae descriptor extraction [50]. However, these works all still operate within the conventional paradigm of extracting an unordered, variable length set of minutiae for fingerprint matching.

Fig. 5. Fingerprint impressions from one subject in the DeepPrint training dataset [30]. Impressions were captured longitudinally, resulting in the variability across impressions (contrast and intensity from environmental conditions; distortion and alignment from user placement). Importantly, training with longitudinal data enables learning compact representations which are invariant to the typical noise observed across fingerprint impressions over time, a necessity in any fingerprint recognition system.

Algorithm 1 Extract DeepPrint Representation
1: L(I_f): Shallow localization network, outputs x, y, θ
2: A: Affine matrix composed with parameters x, y, θ
3: G(I_f, A): Bilinear grid sampler, outputs aligned fingerprint
4: S(I_t): Inception v4 stem
5: E(x): Shared minutiae parameters
6: M(x): Minutiae representation branch
7: D(x): Minutiae map estimation
8: T(x): Texture representation branch
9:
10: Input: Unaligned 448 × 448 fingerprint image I_f
11: A ← (x, y, θ) ← L(I_f)
12: I_t ← G(I_f, A)
13: F_map ← S(I_t)
14: M_map ← E(F_map)
15: R_1 ← M(M_map)
16: H ← D(M_map)
17: R_2 ← T(F_map)
18: R ← R_1 ⊕ R_2
19: Output: Fingerprint representation R ∈ ℝ¹⁹² and minutiae-map H. (H can be optionally utilized for (i) visualization and (ii) fusion of DeepPrint scores obtained via R with minutiae-matching scores.)
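For illustration, a minimal, framework-agnostic sketch of Algorithm 1 is given below. Each trained network is stubbed out with a placeholder callable and the stub shapes are our assumptions; only the composition of steps follows the algorithm.

```python
import numpy as np

# Stub networks standing in for the trained DeepPrint modules; the bodies
# and shapes are placeholders -- only the composition mirrors Algorithm 1.
def L(I_f):          # shallow localization network -> alignment parameters
    return 0.0, 0.0, 0.0                            # (x, y, theta)

def G(I_f, A):       # bilinear grid sampler -> aligned fingerprint
    return I_f                                       # identity for the stub

def S(I_t):          # Inception v4 stem -> shared feature maps
    return np.random.rand(28, 28, 384).astype(np.float32)

def E(F_map): return F_map                           # shared minutiae parameters
def M(M_map): return np.random.rand(96)              # minutiae representation branch
def D(M_map): return np.random.rand(128, 128, 6)     # minutiae-map estimation
def T(F_map): return np.random.rand(96)              # texture representation branch

def extract_deepprint(I_f):
    x, y, theta = L(I_f)
    A = (x, y, theta)                 # affine alignment parameters
    I_t = G(I_f, A)                   # aligned fingerprint
    F_map = S(I_t)
    M_map = E(F_map)
    R1, H = M(M_map), D(M_map)        # minutiae branch + minutiae map
    R2 = T(F_map)                     # texture branch
    R = np.concatenate([R1, R2])      # R in R^192
    return R / np.linalg.norm(R), H   # unit length, ready for cosine matching

R, H = extract_deepprint(np.zeros((448, 448), dtype=np.float32))
```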
3 DEEPPRINT

In the following section, we (i) provide a high-level overview and intuition of DeepPrint, (ii) present how we incorporate automatic alignment into DeepPrint, and (iii) demonstrate how the accuracy and interpretability of DeepPrint is improved through the injection of fingerprint domain knowledge.

3.1 Overview

A high level overview of DeepPrint is provided in Figure 4 with pseudocode in Algorithm 1. DeepPrint is trained with a longitudinal database (Fig. 5) comprised of 455K rolled fingerprint images stemming from 38,291 unique fingers [30]. Longitudinal fingerprint databases consist of fingerprints from distinct subjects captured over time (Fig. 5) [30]. It is necessary to train DeepPrint with a large, longitudinal database so that it can learn compact, fixed-length representations which are invariant to the differences introduced during fingerprint image acquisition at different times and in different environments (humidity, temperature, user interaction with the reader, and finger injuries). The primary task during training is to predict the finger identity label c ∈ [0, 38291] (encoded as a one-hot vector) of each of the 455K training fingerprint images (≈ 12 fingerprint impressions / finger). The last fully connected layer is taken as the representation for fingerprint comparison during authentication and search.

The input to DeepPrint is a 448 × 448⁷ grayscale fingerprint image, I_f, which is first passed through the alignment module (Fig. 4). The alignment module consists of a localization network, L, and a grid sampler, G [51]. After applying the localization network and grid sampler to I_f, an aligned fingerprint I_t is passed to the base-network, S.

The base-network is the stem of the Inception v4 architecture (Inception v4 minus Inception modules). Following the base-network are two different branches (Fig. 4) comprised primarily of the three Inception modules (A, B, and C) described in [52]. The first branch, T(x), completes the Inception v4 architecture⁸ as

7. Fingerprint images in our training dataset vary in size from ≈ 512 × 512 to ≈ 800 × 800. As a pre-processing step, we apply a center cropping (using Gaussian filtering, dilation and erosion, and thresholding) to all images to ≈ 448 × 448. This size is sufficient to cover most of the rolled fingerprint area without extraneous background pixels.
8. We selected Inception v4 after evaluating numerous other architectures such as: ResNet, Inception v3, Inception ResNet, and MobileNet.
Fig. 6. Unaligned fingerprint images from NIST SD4 (top row) and corresponding DeepPrint aligned fingerprint images (bottom row).
H(i, j, k) = Σ_{t=1}^{n} C_s((x_t, y_t), (i, j)) · C_o(θ_t, 2kπ/6)   (2)

where C_s(.) and C_o(.) calculate the spatial and orientation contribution of minutia m_t to the minutiae map at (i, j, k) based upon the euclidean distance of (x_t, y_t) to (i, j) and the orientation difference between θ_t and 2kπ/6 as follows:
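The exact forms of C_s and C_o do not survive in this excerpt. Purely as an illustration of Eq. (2), the sketch below assumes Gaussian contributions in the euclidean distance and in the wrapped orientation difference; the sigma values are placeholders, not the paper's.

```python
import numpy as np

def build_minutiae_map(minutiae, size=128, sigma_s=2.0, sigma_o=np.pi / 6):
    """Construct the minutiae map H of Eq. (2) from a list of (x, y, theta).

    C_s and C_o are assumed Gaussian here (an illustrative stand-in for the
    paper's definitions, which are truncated in this excerpt)."""
    H = np.zeros((size, size, 6), dtype=np.float32)
    centers = 2.0 * np.arange(6) * np.pi / 6.0      # channel orientations 2k*pi/6
    for (x_t, y_t, theta_t) in minutiae:
        for j in range(size):
            for i in range(size):
                d2 = (x_t - i) ** 2 + (y_t - j) ** 2
                C_s = np.exp(-d2 / (2 * sigma_s ** 2))
                if C_s < 1e-4:                      # skip far-away cells
                    continue
                # wrap orientation differences to (-pi, pi]
                dtheta = np.angle(np.exp(1j * (theta_t - centers)))
                C_o = np.exp(-dtheta ** 2 / (2 * sigma_o ** 2))
                H[j, i, :] += C_s * C_o
    return H

H = build_minutiae_map([(40.0, 60.0, np.pi / 4), (90.0, 30.0, 3 * np.pi / 2)])
```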
Finally, using the addition of all these loss terms, and a dataset comprised of N training images, our model parameters w are trained in accordance with:

argmin_w Σ_{i=1}^{N} λ₁ L₁(I_t^i, y^i) + λ₂ L₂(I_t^i) + λ₃ L₃(I_t^i, H^i)   (9)

Fig. 8. The custom multi-task minutiae branch of DeepPrint. The dimensions inside each box represent the input dimensions, kernel size, and stride length, respectively.
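A small sketch of the weighted multi-task objective of Eq. (9) follows. The concrete forms of L₁, L₂, and L₃ are defined on pages not included in this excerpt, so the stand-ins below (identity cross-entropy for L₁, an opaque auxiliary term for L₂, minutiae-map regression for L₃) and the lambda values are illustrative assumptions only.

```python
import numpy as np

def softmax_cross_entropy(logits, label):
    z = logits - logits.max()
    return float(np.log(np.exp(z).sum()) - z[label])

def minutiae_map_mse(H_pred, H_true):
    return float(np.mean((H_pred - H_true) ** 2))

def total_loss(batch, lambdas=(1.0, 1.0, 1.0)):
    """Weighted multi-task objective in the spirit of Eq. (9).

    Each batch item is (logits, label, H_pred, H_true, aux); the loss
    definitions and lambdas are stand-ins, not the paper's exact ones."""
    lam1, lam2, lam3 = lambdas
    total = 0.0
    for logits, label, H_pred, H_true, aux in batch:
        L1 = softmax_cross_entropy(logits, label)  # finger identity, c in [0, 38291]
        L2 = aux                                   # auxiliary term, e.g., center loss [55]
        L3 = minutiae_map_mse(H_pred, H_true)      # minutiae-map side task
        total += lam1 * L1 + lam2 * L2 + lam3 * L3
    return total

demo = [(np.random.rand(38292), 7,
         np.zeros((128, 128, 6)), np.zeros((128, 128, 6)), 0.0)]
loss = total_loss(demo)
```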
where min(R) and max(R) output the minimum and maximum feature values of the vector R, respectively. In order to decompress the features back to float values for matching, we need to save the minimum and maximum values for each representation. Thus, our final representation is 200 bytes: 192 bytes for the features, 4 bytes for the minimum value, and 4 bytes for the maximum value. To decompress the representations (when loading them into RAM), we simply reverse the min-max normalization using the saved minimum and maximum values. Table 4 shows that compression only minimally impacts the matching accuracy.

TABLE 4
Effect of Compression on Accuracy

Dataset | DeepPrint Uncompressed Features | DeepPrint Compressed Features
NIST SD4† | 97.95% | 97.90%
FVC 2004 DB1 A†† | 97.53% | 97.50%

† TAR @ FAR = 0.01% is reported.
†† TAR @ FAR = 0.1% is reported.
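To make the byte accounting concrete, here is a minimal sketch of the min-max compression described above: 192 floats are scaled to 8-bit integers and stored with the two float parameters, giving 192 + 4 + 4 = 200 bytes. The 8-bit target and rounding scheme are our assumptions; the paper specifies only that float features are truncated to integers with two saved scaling/shifting values.

```python
import numpy as np

def compress(R):
    """Quantize a 192-d float32 DeepPrint vector to a 200 byte template."""
    r_min, r_max = np.float32(R.min()), np.float32(R.max())
    q = np.round((R - r_min) / (r_max - r_min) * 255).astype(np.uint8)
    # 192 bytes of features + 4 bytes min + 4 bytes max = 200 bytes
    return q.tobytes() + r_min.tobytes() + r_max.tobytes()

def decompress(template):
    """Reverse the min-max normalization using the saved parameters."""
    q = np.frombuffer(template[:192], dtype=np.uint8).astype(np.float32)
    r_min = np.frombuffer(template[192:196], dtype=np.float32)[0]
    r_max = np.frombuffer(template[196:200], dtype=np.float32)[0]
    return q / 255.0 * (r_max - r_min) + r_min

R = np.random.rand(192).astype(np.float32)
R /= np.linalg.norm(R)                           # unit length, as in Section 4
assert len(compress(R)) == 200
err = np.abs(decompress(compress(R)) - R).max()  # small quantization error
```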
4 DEEPPRINT MATCHING

Two, unit length, DeepPrint representations R_p and R_g can be easily matched using the cosine similarity between the two representations. In particular:

s(R_p, R_g) = R_p^T · R_g   (11)

Thus, DeepPrint authentication (1:1 matching) requires only 192 multiplications and 191 additions. We also experimented with euclidean distance as a scoring function, but consistently obtained higher performance with cosine similarity. Note that if compression is added, there would be an additional d subtractions and d multiplications to reverse the min-max normalization of the enrolled representation. Therefore, the authentication time effectively doubles. However, depending on the application or implementation, compression does not necessarily affect the search speed since the gallery of representations could be already decompressed and in RAM before performing a search.
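Equation (11) in code: a short sketch of 1:1 authentication between two stored templates. It reuses the `decompress` helper sketched above, and the decision threshold is an arbitrary illustrative operating point, not a value reported in the paper.

```python
import numpy as np

def authenticate(template_p, template_g, threshold=0.6):
    """1:1 match per Eq. (11): cosine similarity of two unit-length vectors.

    `threshold` is illustrative only; an operational system would set it
    from a target false accept rate."""
    R_p, R_g = decompress(template_p), decompress(template_g)
    score = float(np.dot(R_p, R_g))   # 192 multiplications, 191 additions
    return score, score >= threshold
```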
4.1 Fusion of DeepPrint Score with Minutiae Score

Given the speed of matching two DeepPrint representations, the minutiae-based match scores of any existing AFIS can also be fused together with the DeepPrint scores with minimal loss to the overall AFIS authentication speed (i.e., DeepPrint can be easily used as an add-on to existing minutiae-based AFIS to improve recognition accuracy). In our experimental analysis, we demonstrate this by fusing DeepPrint scores together with the scores of minutiae-based matchers COTS A, COTS B, and [20] and subsequently improving authentication accuracy. This indicates that the information contained in the compact DeepPrint representation is complementary to that of minutiae representations. Note, since DeepPrint already extracts minutiae as a side task, fusion with a minutiae-based matcher requires little extra computational overhead (simply feed the minutiae extracted by DeepPrint directly to the minutiae matcher, eliminating the need to extract minutiae a second time).
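The exact fusion rule is not given in this excerpt, so the sketch below uses a generic sum rule with min-max score normalization and equal weights — all of which are our assumptions, not the paper's configuration.

```python
def fuse_scores(deepprint_score, minutiae_score,
                dp_range=(-1.0, 1.0), mn_range=(0.0, 300.0), w=0.5):
    """Sum-rule fusion of a DeepPrint cosine score with a minutiae match
    score. Each score is min-max normalized to [0, 1] using an assumed
    score range, then combined with an (assumed) equal weighting."""
    dp = (deepprint_score - dp_range[0]) / (dp_range[1] - dp_range[0])
    mn = (minutiae_score - mn_range[0]) / (mn_range[1] - mn_range[0])
    return w * dp + (1.0 - w) * mn
```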
5 DEEPPRINT SEARCH

Fingerprint search entails finding the top k candidates, in a database (gallery or background) of N fingerprints, for an input probe fingerprint. The simplest algorithm for obtaining the top k candidates is to (i) compute a similarity measure between the probe template and every enrolled template in the database, (ii) sort the enrolled templates by their similarity to the probe¹³, and (iii) select the top k most similar enrollees. More formally, finding the top k candidates C_k(.) in a gallery G for a probe fingerprint R_p is formulated as:

C_k(R_p) = Rank_k({s(R_p, R_g) | R_g ∈ G})   (12)

where Rank_k(.) returns the k most similar candidates from an input set of candidates and s is a similarity function such as defined in Equation 11.

Since minutiae-based matching is computationally expensive, comparing the probe to every template enrolled in the database in a timely manner is not feasible with minutiae matchers. This has led to a number of schemes to either significantly reduce the search space, or utilize high-level features to quickly index top candidates [56], [57], [58], [59], [60]. However, such methods have not achieved high levels of accuracy on public benchmark datasets such as NIST SD4 or NIST SD14.

In contrast to minutiae-matchers, the fixed-length, 200 byte DeepPrint representations can be matched extremely quickly using Equation 11. Therefore, large scale search with DeepPrint can be performed by exhaustive comparison of the probe template to every gallery template in accordance with Equation 12. The complexity of exhaustive search is linear with respect to both the gallery size N and the dimensionality d of the DeepPrint representation (d = 192 in this case).

5.1 Faster Search

Although exhaustive search can be effectively utilized with DeepPrint representations in conjunction with Equation 12, it may be desirable to even further decrease the search time. For example, when searching against 100 million fingerprints, the DeepPrint search time is still 11 seconds (on an i9 processor with 64 GB of RAM)¹⁴. A natural way to reduce the search time further with minimal loss to accuracy is to utilize an effective approximate nearest neighbor (ANN) algorithm.

Product Quantization is one such ANN algorithm which has been successfully utilized in large-scale face search [29]. Product quantization is still an exhaustive search algorithm; however, representations are first compressed via keys to a lookup table, which significantly reduces the comparison time between two representations. In other words, product quantization reformulates the comparison function in Equation 11 to a series of lookup operations in a table stored in RAM. More formally, a given DeepPrint representation R_g of dimensionality d is first decomposed into m sub-vectors as:

13. In our search experiments, we reduce the typical sorting time from N log(N) to N log(k) (where k << N) by maintaining a priority queue of size k, since we only care about the scores of the top k candidates. This trick reduces sorting time from 23 seconds to 8 seconds when the gallery size N = 100,000,000 and the candidate list size k = 100.
14. Search time for the 100 million gallery was simulated by generating 100 million random representations, where each feature was a 32-bit float value drawn from a uniform distribution from 0 to 1.
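Equations (11)-(12) plus the priority-queue trick of footnote 13, as a short sketch: a bounded min-heap of size k replaces a full sort of the gallery, reducing the sorting cost from O(N log N) to O(N log k).

```python
import heapq
import numpy as np

def search_top_k(R_p, gallery, k=100):
    """Exhaustive search (Eq. 12) with a min-heap of the k best scores
    (the footnote-13 trick) instead of sorting all N candidates."""
    heap = []                                   # (score, gallery_index)
    for idx, R_g in enumerate(gallery):
        score = float(np.dot(R_p, R_g))         # Eq. (11)
        if len(heap) < k:
            heapq.heappush(heap, (score, idx))
        elif score > heap[0][0]:
            heapq.heapreplace(heap, (score, idx))
    return sorted(heap, reverse=True)           # best-first candidate list

gallery = [v / np.linalg.norm(v)
           for v in np.random.rand(10000, 192).astype(np.float32)]
probe = gallery[42]
candidates = search_top_k(probe, gallery, k=5)  # top hit should be index 42
```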
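The excerpt cuts off before the formal sub-vector decomposition, so below is a generic product-quantization sketch in the spirit of Section 5.1: each 192-d vector is split into m sub-vectors, each quantized to its nearest codebook centroid, and probe-to-gallery scoring becomes m table lookups. Splitting into 64 sub-vectors of 3 dimensions yields the 64 byte gallery codes mentioned later (Section 10.3); the k-means training and dot-product lookup tables are standard PQ choices [18], not details taken from this excerpt.

```python
import numpy as np

rng = np.random.default_rng(0)
d, m, ks = 192, 64, 256          # 64 sub-vectors of 3 dims -> 64 byte codes
sub = d // m

def train_codebooks(train, iters=5):
    """Per-sub-vector k-means (crude Lloyd's iterations) -> (m, ks, sub)."""
    books = []
    for j in range(m):
        X = train[:, j * sub:(j + 1) * sub]
        C = X[rng.choice(len(X), ks, replace=False)]
        for _ in range(iters):
            assign = np.argmin(((X[:, None, :] - C[None]) ** 2).sum(-1), axis=1)
            for c in range(ks):
                if np.any(assign == c):
                    C[c] = X[assign == c].mean(axis=0)
        books.append(C)
    return np.stack(books)

def encode(books, R):            # 192 floats -> m one-byte centroid indices
    return np.array([np.argmin(((R[j * sub:(j + 1) * sub] - books[j]) ** 2).sum(-1))
                     for j in range(m)], dtype=np.uint8)

def score(books, R_p, codes):    # Eq. (11) approximated via m table lookups
    tables = np.array([books[j] @ R_p[j * sub:(j + 1) * sub] for j in range(m)])
    return float(sum(tables[j, codes[j]] for j in range(m)))

train = rng.random((1000, d), dtype=np.float32)
train /= np.linalg.norm(train, axis=1, keepdims=True)
books = train_codebooks(train)
g = train[0]
approx = score(books, g, encode(books, g))   # approximates the exact g . g
```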
Fig. 9. Examples of poor quality fingerprint images from benchmark datasets. Row 1: Rolled fingerprint impressions from NIST SD4. Row 2: Slap
fingerprint images from FVC 2004 DB1 A. Rolled fingerprints are often heavily smudged, making them challenging to accurately recognize. FVC
2004 DB1 A also has several distinct challenges such as small overlapping fingerprint area between two fingerprint images, heavy non-linear
distortions, and extreme finger conditions (wet or dry). Minutiae annotated with COTS A.
TABLE 5
Benchmarking DeepPrint Search Accuracy against Fixed-Length Representations in the Literature and COTS

1. Only 2,000 fingerprints are included in the gallery to enable comparison with previous works.
2. Last 2,700 pairs are used to enable comparison with previous works.
3. Search times for all algorithms benchmarked on NIST SD4 with an Intel Core i9-7900X CPU @ 3.30GHz.
4. We use the proprietary COTS templates which are comprised of minutiae together with other proprietary features.
† These results primarily show that (i) DeepPrint is competitive with the best fixed-length representation in the literature [28] (with a smaller template size) and state-of-the-art COTS, but also (ii) the benchmark dataset performances are saturated due to small gallery sizes. Therefore, in subsequent experiments we compare with state-of-the-art COTS against a background of 1.1 million.
7.2 FVC 2004 DB1 A

The FVC 2004 DB1 A dataset is an extremely challenging benchmark dataset (even for commercial matchers) for several reasons: (i) small overlapping fingerprint area between fingerprint images from the same subject, (ii) heavy non-linear distortion, and (iii) extremely wet and dry fingers (Fig. 9). Another major motivation for selecting FVC 2004 DB1 A as a benchmark dataset is that it is comprised of slap fingerprint images. Because of this, we are able to demonstrate that even though DeepPrint was trained on rolled fingerprint images similar to NIST SD4 and NIST SD14, our incorporation of domain knowledge into the network architecture enables it to generalize well to slap fingerprint datasets.

8 COTS MATCHERS

In almost all of our experiments, we benchmark DeepPrint against COTS A and COTS B (Verifinger v10.0 or Innovatrics v7.2.1.40, the latest version of the SDK as of July, 2019). Due to our Non-disclosure agreement, we cannot provide a link between the aliases COTS A and COTS B and Verifinger or Innovatrics. Both of these SDKs provide an ISO minutia-only template as well as a proprietary template comprised of minutiae and other features. To obtain the best performance from each SDK, we extracted the more discriminative proprietary templates. The proprietary templates are comprised of minutiae and other features unknown to us. We note that both Verifinger and Innovatrics are top performers in the NIST and FVC evaluations [10], [63].

9 BENCHMARK EVALUATIONS

We begin our experiments by comparing the DeepPrint search performance to the state-of-the-art fixed-length representations reported in the literature. Then, we show that the DeepPrint representation can also be used for state-of-the-art authentication by
TABLE 8
Encrypted Authentication using DeepPrint Representation

TABLE 9
DeepPrint + Minutiae Re-ranking Search Accuracy (1.1 million background)

Algorithm | NIST SD4 Rank-1 Search Accuracy | NIST SD14 Rank-1 Search Accuracy | Search Time (milliseconds)¹
DeepPrint + COTS B² | 98.25% | 98.41% | 13,000
COTS A³ | 98.85% | 99.51% | 27,472
COTS B³ | 89.2% | 85.6% | 428

1. Search times benchmarked on an Intel Core i9-7900X CPU @ 3.30GHz.
2. COTS only used for re-ranking the top 500 DeepPrint candidates.
3. COTS used to perform search against the entire 1.1 million gallery.

TABLE 10
DeepPrint + PQ: Search Accuracy (1.1 million background)

Algorithm | NIST SD4 Rank-1 Search Accuracy | NIST SD14 Rank-1 Search Accuracy | Search Time (milliseconds)¹
DeepPrint | 95.15% | 94.44% | 160
DeepPrint + PQ | 94.80% | 94.18% | 51

1. Search times benchmarked on an Intel Core i9-7900X CPU @ 3.30GHz.

search against large galleries. To adequately showcase this feature, we benchmark the DeepPrint search accuracy against COTS A and COTS B on a gallery of over 1.1 million rolled fingerprint images. The experimental results show that DeepPrint is able to obtain competitive search accuracy with the top COTS algorithm, at orders of magnitude faster speeds. Note, we are unable to benchmark other recent fixed-length representations in the literature against the large scale background, since code for these algorithms has not been open-sourced.

10.1 DeepPrint Search

First, we show the search performance of DeepPrint using the simple exhaustive search technique previously described. In particular, we match a probe template to every template in the gallery, and select the k candidates with the highest similarity scores. We use the NIST SD4 and NIST SD14 databases in conjunction with a gallery of 1.1 million rolled fingerprints. Under this exhaustive search scheme, the DeepPrint representation enables obtaining Rank-1 identification accuracies of 95.15% and 94.44%, respectively (Table 10 and Fig. 10). Notably, the search time is only 160 milliseconds. At Rank-100, the search accuracies for both datasets cross over 99%. In our subsequent experiments, we demonstrate how we can re-rank the top k candidates to further improve the Rank-1 accuracy with minimal cost to the search time.
Fig. 10. Closed-Set Identification Accuracy of DeepPrint (with and without Product Quantization (PQ)) on NIST SD4 and NIST SD14 (last 2,700 pairs) supplemented with a gallery of 1.1 million. Rank-1 identification accuracies are 95.15% and 94.44%, respectively. Search time is only 160 milliseconds. After adding product quantization, the search time is reduced to 51 milliseconds and the Rank-1 accuracies only drop to 94.8% and 94.2%, respectively.

10.2 Minutiae Re-ranking

Using the open-source minutiae matcher proposed in [20], COTS A, and COTS B, we re-rank the top-500 candidates retrieved by the DeepPrint representation to further improve the Rank-1 identification accuracy. Following this re-ranking, we obtain search accuracy competitive with the top COTS SDK, but at significantly faster speeds (Table 9).
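A sketch of this two-stage scheme is given below, reusing `search_top_k` from Section 5 and an abstract `minutiae_matcher` callable (e.g., COTS A/B or [20]). Replacing the cosine score outright with the minutiae score in stage 2 is an illustrative simplification; the paper's exact re-ranking rule is not given in this excerpt.

```python
def two_stage_search(probe_R, probe_minutiae, gallery_R, gallery_minutiae,
                     minutiae_matcher, k=500):
    """Stage 1: top-k candidates by DeepPrint cosine score (Eq. 12).
    Stage 2: re-rank those k candidates with a minutiae matcher fed the
    minutiae decoded from the DeepPrint minutiae-map (footnote 3)."""
    shortlist = search_top_k(probe_R, gallery_R, k=k)       # (score, index)
    reranked = sorted(((minutiae_matcher(probe_minutiae, gallery_minutiae[i]), i)
                       for _, i in shortlist), reverse=True)
    return reranked
```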
10.3 Product Quantization

We further improve the already fast search speed enabled by the DeepPrint representation by performing product quantization on the templates stored in the gallery. This reduces the DeepPrint template size to only 64 bytes and reduces the search time down to 51 milliseconds from 160 milliseconds, with only marginal loss to search accuracy (Table 10 and Fig. 10).

11 ABLATION STUDY

Finally, we perform an ablation study to highlight the importance of (i) the automatic alignment module in the DeepPrint architecture and (ii) the minutiae-map domain knowledge added during training of the network. In our ablation study, we report the authentication performance of DeepPrint with/without the constituent modules.

We note that in all scenarios, the addition of domain knowledge improves authentication performance (Tables 11 and 12). This is especially true for the FVC 2004 DB1 A database, which is comprised of slap fingerprints with different characteristics (size, distortion, conditions) than the rolled fingerprints used for training DeepPrint. Thus we show how adding minutiae domain knowledge enables better generalization of DeepPrint to datasets which are very disparate from its training dataset.
We note that alignment does not help in the case of NIST SD4 and NIST SD14 (since rolled fingerprints are already mostly aligned); however, it significantly improves the performance on FVC 2004 DB1 A, where fingerprint images are likely to be severely unaligned. We also note that the minutiae-based representation and the texture-based representation from DeepPrint are indeed complementary, evidenced by the improvement in accuracy when fusing the scores from both representations (Table 11).

Fig. 11. Illustration of DeepPrint interpretability. The first row shows three example fingerprints from NIST SD4 which act as inputs to DeepPrint. The second row shows which pixels the texture branch is focusing on as it extracts its feature representation. Singularity points are overlaid to show that the texture branch fixates primarily on regions surrounding the singularity points. The last row shows pixels which the minutiae branch focuses on as it extracts its feature representation. We overlay minutiae to show how the minutiae branch focuses primarily on regions surrounding minutiae points. Thus, each branch of DeepPrint extracts complementary features which comprise more accurate and interpretable fixed-length fingerprint representations than previously reported in the literature.

12 INTERPRETABILITY

As a final experiment, we demonstrate the interpretability of the DeepPrint representation using the deconvolutional network proposed in [64]. In particular, we show in Fig. 11 which pixels in an input fingerprint image are fixated upon by the DeepPrint network as it extracts a representation. From this figure, we make some interesting observations. In particular, we note that while the texture branch of the DeepPrint network seems to only focus on texture surrounding singularity points in the fingerprint (core points, deltas), the minutiae branch focuses on a larger portion of the fingerprint in areas where the density of minutiae points is high. This indicates to us that our guiding the DeepPrint network with minutiae domain knowledge does indeed draw the attention of the network to minutiae points. Since both branches focus on complementary areas and features, the fusion of the representations improves the overall matching performance (Table 11).

13 COMPUTATIONAL RESOURCES

DeepPrint models and training code are implemented in TensorFlow 1.14.0. All models were trained across 2 NVIDIA GeForce RTX 2080 Ti GPUs. All search and authentication experiments were performed with an Intel Core i9-7900X CPU @ 3.30GHz and 32 GB of RAM.

14 CONCLUSION

We have presented the design of a custom deep network architecture, called DeepPrint, capable of extracting highly discriminative fixed-length fingerprint representations (200 bytes) for both authentication (1:1 fingerprint comparison) and search (1:N fingerprint comparison). We showed how alignment and
[30] S. Yoon and A. K. Jain, "Longitudinal Study of Fingerprint Recognition," Proceedings of the National Academy of Sciences, vol. 112, no. 28, pp. 8555-8560, 2015.
[31] Y. Sutcu, H. T. Sencar, and N. Memon, "A Geometric Transformation to Protect Minutiae-based Fingerprint Templates," in Biometric Technology for Human Identification IV, vol. 6539, p. 65390E, International Society for Optics and Photonics, 2007.
[32] Y. Sutcu, S. Rane, J. S. Yedidia, S. C. Draper, and A. Vetro, "Feature Extraction for a Slepian-Wolf Biometric System using LDPC Codes," in 2008 IEEE International Symposium on Information Theory, pp. 2297-2301, IEEE, 2008.
[33] A. Nagar, S. Rane, and A. Vetro, "Privacy and Security of Features Extracted from Minutiae Aggregates," in 2010 IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 1826-1829, IEEE, 2010.
[34] J. Bringer and V. Despiegel, "Binary Feature Vector Fingerprint Representation from Minutiae Vicinities," in 2010 Fourth IEEE International Conference on Biometrics: Theory, Applications and Systems (BTAS), pp. 1-6, IEEE, 2010.
[35] E. Liu, H. Zhao, J. Liang, L. Pang, H. Chen, and J. Tian, "Random Local Region Descriptor (RLRD): A New Method for Fixed-length Feature Representation of Fingerprint Image and its Application to Template Protection," Future Generation Computer Systems, vol. 28, no. 1, pp. 236-243, 2012.
[36] F. Farooq, R. M. Bolle, T.-Y. Jea, and N. Ratha, "Anonymous and Revocable Fingerprint Recognition," in 2007 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1-7, IEEE, 2007.
[37] H. Xu, R. N. Veldhuis, A. M. Bazen, T. A. Kevenaar, T. A. Akkermans, and B. Gokberk, "Fingerprint Verification using Spectral Minutiae Representations," IEEE Transactions on Information Forensics and Security, vol. 4, no. 3, pp. 397-409, 2009.
[38] K. Nandakumar, "A Fingerprint Cryptosystem Based on Minutiae Phase Spectrum," in 2010 IEEE International Workshop on Information Forensics and Security, pp. 1-6, IEEE, 2010.
[39] Z. Jin, M.-H. Lim, A. B. J. Teoh, B.-M. Goi, and Y. H. Tay, "Generating Fixed-length Representation from Minutiae using Kernel Methods for Fingerprint Authentication," IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 46, no. 10, pp. 1415-1428, 2016.
[40] Y. Zhu, X. Yin, X. Jia, and J. Hu, "Latent Fingerprint Segmentation based on Convolutional Neural Networks," in Information Forensics and Security (WIFS), 2017 IEEE Workshop on, pp. 1-6, IEEE, 2017.
[41] X. Dai, J. Liang, Q. Zhao, and F. Liu, "Fingerprint Segmentation via Convolutional Neural Networks," in Chinese Conference on Biometric Recognition, pp. 324-333, Springer, 2017.
[42] J. Ezeobiejesi and B. Bhanu, "Latent Fingerprint Image Segmentation using Deep Neural Network," in Deep Learning for Biometrics, pp. 83-107, Springer, 2017.
[43] D.-L. Nguyen, K. Cao, and A. K. Jain, "Automatic Latent Fingerprint Segmentation," in 2018 IEEE 9th International Conference on Biometrics Theory, Applications and Systems (BTAS), pp. 1-9, IEEE, 2018.
[44] K. Cao and A. K. Jain, "Latent Orientation Field Estimation via Convolutional Neural Network," in Biometrics (ICB), 2015 International Conference on, pp. 349-356, IEEE, 2015.
[45] P. Schuch, S.-D. Schulz, and C. Busch, "Deep Expectation for Estimation of Fingerprint Orientation Fields," in Biometrics (IJCB), 2017 IEEE International Joint Conference on, pp. 185-190, IEEE, 2017.
[46] Z. Qu, J. Liu, Y. Liu, Q. Guan, R. Li, and Y. Zhang, "A Novel System for Fingerprint Orientation Estimation," in Chinese Conference on Image and Graphics Technologies, pp. 281-291, Springer, 2018.
[47] D.-L. Nguyen, K. Cao, and A. K. Jain, "Robust Minutiae Extractor: Integrating Deep Networks and Fingerprint Domain Knowledge," in 2018 International Conference on Biometrics (ICB), pp. 9-16, IEEE, 2018.
[48] Y. Tang, F. Gao, J. Feng, and Y. Liu, "FingerNet: An Unified Deep Network for Fingerprint Minutiae Extraction," in Biometrics (IJCB), 2017 IEEE International Joint Conference on, pp. 108-116, IEEE, 2017.
[49] L. N. Darlow and B. Rosman, "Fingerprint Minutiae Extraction using Deep Learning," in Biometrics (IJCB), 2017 IEEE International Joint Conference on, pp. 22-30, IEEE, 2017.
[50] K. Cao and A. K. Jain, "Automated Latent Fingerprint Recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018.
[51] M. Jaderberg, K. Simonyan, A. Zisserman, et al., "Spatial Transformer Networks," in Advances in Neural Information Processing Systems, pp. 2017-2025, 2015.
[52] C. Szegedy, S. Ioffe, V. Vanhoucke, and A. A. Alemi, "Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning," in AAAI, vol. 4, p. 12, 2017.
[53] R. Caruana, "Multitask Learning," Machine Learning, vol. 28, pp. 41-75, Jul 1997.
[54] X. Yin and X. Liu, "Multi-task Convolutional Neural Network for Pose-Invariant Face Recognition," IEEE Transactions on Image Processing, vol. 27, no. 2, pp. 964-975, 2018.
[55] Y. Wen, K. Zhang, Z. Li, and Y. Qiao, "A Discriminative Feature Learning Approach for Deep Face Recognition," in European Conference on Computer Vision, pp. 499-515, Springer, 2016.
[56] B. Bhanu and X. Tan, "Fingerprint Indexing Based on Novel Features of Minutiae Triplets," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, no. 5, pp. 616-622, 2003.
[57] R. Cappelli, M. Ferrara, and D. Maltoni, "Fingerprint Indexing based on Minutia Cylinder-Code," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 5, pp. 1051-1057, 2010.
[58] X. Jiang, M. Liu, and A. C. Kot, "Fingerprint Retrieval for Identification," IEEE Transactions on Information Forensics and Security, vol. 1, no. 4, pp. 532-542, 2006.
[59] M. Liu and P.-T. Yap, "Invariant Representation of Orientation Fields for Fingerprint Indexing," Pattern Recognition, vol. 45, no. 7, pp. 2532-2542, 2012.
[60] Y. Su, J. Feng, and J. Zhou, "Fingerprint Indexing with Pose Constraint," Pattern Recognition, vol. 54, pp. 1-13, 2016.
[61] U. Uludag, S. Pankanti, and A. K. Jain, "Fuzzy Vault for Fingerprints," in International Conference on Audio- and Video-Based Biometric Person Authentication, pp. 310-319, Springer, 2005.
[62] J. Fan and F. Vercauteren, "Somewhat Practical Fully Homomorphic Encryption," IACR Cryptology ePrint Archive, vol. 2012, p. 144, 2012.
[63] "FVC-onGoing." https://biolab.csr.unibo.it/fvcongoing/UI/Form/Home.aspx.
[64] M. D. Zeiler and R. Fergus, "Visualizing and Understanding Convolutional Networks," in European Conference on Computer Vision, pp. 818-833, Springer, 2014.

Joshua J. Engelsma graduated magna cum laude with a B.S. degree in computer science from Grand Valley State University, Allendale, Michigan, in 2016. He is currently working towards a PhD degree in the Department of Computer Science and Engineering at Michigan State University. His research interests include pattern recognition, computer vision, and image processing with applications in biometrics. He won the best paper award at the 2019 IEEE International Conference on Biometrics (ICB).

Kai Cao received the Ph.D. degree from the Key Laboratory of Complex Systems and Intelligence Science, Institute of Automation, Chinese Academy of Sciences, Beijing, China, in 2010. He was a Post Doctoral Fellow in the Department of Computer Science & Engineering, Michigan State University. His research interests include biometric recognition, image processing and machine learning.

Anil K. Jain is a University Distinguished Professor in the Department of Computer Science at Michigan State University. He is a Fellow of the ACM, IEEE, IAPR, AAAS and SPIE. His research interests include pattern recognition and biometric authentication. He served as the editor-in-chief of the IEEE Transactions on Pattern Analysis and Machine Intelligence, a member of the United States Defense Science Board and the Forensics Science Standards Board. He has received Fulbright, Guggenheim, Alexander von Humboldt, and IAPR King Sun Fu awards. He is a member of the United States National Academy of Engineering, a member of The World Academy of Science, and a foreign member of the Indian National Academy of Engineering and the Chinese Academy of Sciences.