6 Template Matching
● The Goal: Given a set of reference patterns, known
as TEMPLATES, find the one that an unknown
pattern matches best. That is, each class is
represented by a single typical pattern.
● Typical Applications
➢ Speech Recognition
➢ Motion Estimation in Video Coding
➢ Data Base Image Retrieval
➢ Written Word Recognition
➢ Bioinformatics
● Reference pattern: $r(1), r(2), \ldots, r(I)$
● Test pattern: $t(1), t(2), \ldots, t(J)$
➢ In general I ≠ J
➢ Path: A path through the grid, from an initial node
$(i_0, j_0)$ to a final one $(i_f, j_f)$, is an ordered set of nodes
$(i_0, j_0), (i_1, j_1), (i_2, j_2), \ldots, (i_k, j_k), \ldots, (i_f, j_f)$
➢ Each path is associated with a cost
$D = \sum_{k=0}^{K-1} d(i_k, j_k)$
where K is the number of nodes across the path
➢ Search for the path with the optimal cost Dopt.
BELLMAN’S OPTIMALITY PRINCIPLE
● Optimum path:
$(i_0, j_0) \xrightarrow{\text{opt}} (i_f, j_f)$
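● Bellman’s Principle: if the optimum path passes through a node $(i, j)$, then
$(i_0, j_0) \xrightarrow{\text{opt}} (i_f, j_f) = (i_0, j_0) \xrightarrow{\text{opt}} (i, j) \oplus (i, j) \xrightarrow{\text{opt}} (i_f, j_f)$
where $\oplus$ denotes the concatenation of paths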
$D_{\text{opt}}(i_k, j_k) = \operatorname{opt}\{D_{\text{opt}}(i_{k-1}, j_{k-1}) + d(i_k, j_k)\}$
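The recursion can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the slides' own implementation: "opt" is taken to be the minimum, the path is assumed to start at node (0, 0) and end at the last node of the grid, and the allowable predecessors of a node are assumed to be its diagonal, horizontal, and vertical neighbours. The names `optimal_path_cost` and `d` are hypothetical.

```python
def optimal_path_cost(d):
    """Minimum accumulated cost from node (0, 0) to node (I-1, J-1).

    d[i][j] is the cost of visiting node (i, j) of an I x J grid.
    """
    I, J = len(d), len(d[0])
    INF = float("inf")
    # D[i][j] holds the optimal cost of reaching node (i, j).
    D = [[INF] * J for _ in range(I)]
    D[0][0] = d[0][0]
    for i in range(I):
        for j in range(J):
            if i == 0 and j == 0:
                continue
            best_prev = INF
            # Assumed allowable predecessors: diagonal, horizontal, vertical.
            for pi, pj in ((i - 1, j - 1), (i - 1, j), (i, j - 1)):
                if pi >= 0 and pj >= 0:
                    best_prev = min(best_prev, D[pi][pj])
            # Bellman recursion: best way to reach a predecessor, plus this node.
            D[i][j] = best_prev + d[i][j]
    return D[I - 1][J - 1]
```

Because each node is visited once and only its predecessors are inspected, the optimal cost is found without enumerating every path through the grid.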
● The Edit distance
➢ It is used for matching written words.
Applications:
• Automatic Editing
• Text Retrieval
● The cost is based on the philosophy behind the
so-called variational similarity, i.e.,
➢ Measure the cost associated with converting one pattern
to the other
$D(A, B) = \min_j [C(j) + I(j) + R(j)]$
where C(j), I(j), and R(j) are the numbers of symbol changes, insertions, and removals, and j runs over all possible variations of symbols that convert A into B
● Allowable predecessors and costs
➢ Diagonal: $(i - 1, j - 1) \to (i, j)$
$d(i, j \mid i - 1, j - 1) = \begin{cases} 0, & \text{if } r(i) = t(j) \\ 1, & \text{if } r(i) \neq t(j) \end{cases}$
➢ Horizontal
$d(i, j \mid i - 1, j) = 1$
➢ Vertical
$d(i, j \mid i, j - 1) = 1$
● The Algorithm
➢ D(0,0)=0
➢ For i=1 to I
• D(i,0)=D(i-1,0)+1
➢ END {FOR}
➢ For j=1 to J
• D(0,j)=D(0,j-1)+1
➢ END{FOR}
➢ For i=1 to I
• For j=1 to J
– C1=D(i-1,j-1)+d(i,j|i-1,j-1)
– C2=D(i-1,j)+1
– C3=D(i,j-1)+1
– D(i,j)=min (C1,C2,C3)
• END {FOR}
➢ END {FOR}
➢ D(A,B)=D(I,J)
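The steps above translate almost line by line into Python. The following is a minimal sketch; the function name `edit_distance` and the argument names are hypothetical, and the strings are indexed from zero, so symbol i of the slides is `r[i - 1]` in the code.

```python
def edit_distance(r, t):
    """Edit distance between the reference string r and the test string t."""
    I, J = len(r), len(t)
    # D[i][j]: cost of converting the first i symbols of r
    # into the first j symbols of t.
    D = [[0] * (J + 1) for _ in range(I + 1)]
    for i in range(1, I + 1):          # first column
        D[i][0] = D[i - 1][0] + 1
    for j in range(1, J + 1):          # first row
        D[0][j] = D[0][j - 1] + 1
    for i in range(1, I + 1):
        for j in range(1, J + 1):
            # Diagonal transition costs 0 when the symbols agree, 1 otherwise.
            diag = 0 if r[i - 1] == t[j - 1] else 1
            c1 = D[i - 1][j - 1] + diag
            c2 = D[i - 1][j] + 1       # horizontal transition
            c3 = D[i][j - 1] + 1       # vertical transition
            D[i][j] = min(c1, c2, c3)
    return D[I][J]
```

For instance, `edit_distance("beauty", "beuty")` gives 1 (one removal) and `edit_distance("kitten", "sitting")` gives 3.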
● Dynamic Time Warping in Speech Recognition
The isolated word recognition (IWR) task is discussed here.
➢ The goal: Given a segment of speech corresponding to an
unknown spoken word (test pattern), identify the word by
comparing it against a number of known spoken words in a
data base (reference patterns).
➢ The procedure:
• Express the test and each of the reference patterns as
sequences of feature vectors, $t(j)$ and $r(i)$.
• To this end, divide each of the speech segments into a number
of successive frames.
• For each frame compute a feature vector, for example
the DFT coefficients, and use, say, $\ell$ of those:
$r(i) = \begin{bmatrix} x_i(0) \\ x_i(1) \\ \vdots \\ x_i(\ell-1) \end{bmatrix},\ i = 1, \ldots, I \qquad t(j) = \begin{bmatrix} x_j(0) \\ x_j(1) \\ \vdots \\ x_j(\ell-1) \end{bmatrix},\ j = 1, \ldots, J$
• Choose a cost function associated with each node across a
path, e.g., the Euclidean distance
➢ Prior to performing the matching one has to choose:
• The global constraints: Defining the region of space within
which the search for the optimal path will be performed.
• The local constraints: Defining the type of transitions
allowed between the nodes of the grid.
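The procedure can be sketched as follows. This is a hedged illustration rather than the slides' exact method, and every specific choice in it is an assumption: the node cost is the Euclidean distance between feature vectors, the local constraints allow only diagonal, horizontal, and vertical predecessors with no extra transition weights, the global constraint is a simple band |i − j| ≤ `window` around the diagonal, and the path is forced to start at the first frames and end at the last frames. The names `dtw_cost`, `recognize`, and `window` are hypothetical.

```python
import math

def dtw_cost(r, t, window=None):
    """Accumulated cost of the best warping path between the sequences of
    feature vectors r (reference) and t (test)."""
    I, J = len(r), len(t)
    if window is None:
        window = max(I, J)             # no global constraint
    window = max(window, abs(I - J))   # the band must contain the final node
    D = [[math.inf] * (J + 1) for _ in range(I + 1)]
    D[0][0] = 0.0
    for i in range(1, I + 1):
        # Global constraint (assumed): search only inside the band.
        for j in range(max(1, i - window), min(J, i + window) + 1):
            d = math.dist(r[i - 1], t[j - 1])   # Euclidean node cost
            # Local constraints (assumed): diagonal, horizontal, vertical.
            D[i][j] = d + min(D[i - 1][j - 1], D[i - 1][j], D[i][j - 1])
    return D[I][J]

def recognize(test, references):
    """Return the label of the reference pattern with the smallest DTW cost.

    references maps a word label to its sequence of feature vectors.
    """
    return min(references, key=lambda word: dtw_cost(references[word], test))
```

A narrower `window` reduces the number of nodes that are searched, at the risk of excluding the truly optimal path when the two words are spoken at very different rates.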
● Measures based on Correlations: The major task here is to
find whether a specific known reference pattern resides within
a given block of data. Such problems arise in applications such as
target detection, robot vision, and video coding. There are two
basic steps in such a procedure:
➢ Step 1: Move the reference pattern to all possible
positions within the block of data. For each position,
compute the “similarity” between the reference pattern and
the respective part of the block of data.
➢ Application to images: Given a reference image r(i,j) of
size M×N and an I×J image array t(i,j), move r(i,j) to all
possible positions (m,n) within t(i,j). Compute:
• $D(m, n) = \sum_i \sum_j |t(i, j) - r(i - m, j - n)|^2$
for every (m,n).
• $c_N(m, n) = \dfrac{c(m, n)}{\sqrt{\sum_i \sum_j |t(i, j)|^2 \cdot \sum_i \sum_j |r(i, j)|^2}}$
– $c_N(m,n)$ never exceeds one and becomes equal to one only
if $t(i, j) = \alpha \cdot r(i - m, j - n)$
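Both measures can be computed with a direct scan over the positions (m,n). The sketch below is illustrative only and rests on several assumptions that the slides do not state: c(m,n) is taken to be the cross-correlation $\sum_i \sum_j t(i,j)\, r(i-m, j-n)$, only positions where the reference image fits entirely inside t are examined, the energy of t is summed over the window currently covered by r, and the images are real-valued lists of rows. The name `match_scores` is hypothetical.

```python
import math

def match_scores(t, r):
    """Return (D, cN): dicts mapping each offset (m, n) to the squared-error
    score and the normalized-correlation score of r placed at (m, n) in t."""
    I, J = len(t), len(t[0])
    M, N = len(r), len(r[0])
    r_energy = sum(v * v for row in r for v in row)
    D, cN = {}, {}
    for m in range(I - M + 1):
        for n in range(J - N + 1):
            ssd = corr = t_energy = 0.0
            for i in range(M):
                for j in range(N):
                    tv, rv = t[m + i][n + j], r[i][j]
                    ssd += (tv - rv) ** 2    # term of D(m, n)
                    corr += tv * rv          # term of c(m, n), as assumed
                    t_energy += tv * tv      # energy of t under the window
            D[(m, n)] = ssd
            denom = math.sqrt(t_energy * r_energy)
            # An all-zero window or template has no defined correlation.
            cN[(m, n)] = corr / denom if denom > 0 else 0.0
    return D, cN
```

The best match is the position that minimizes D or maximizes cN, for example `min(D, key=D.get)` or `max(cN, key=cN.get)`. If the window of t is a scaled copy of r, cN still equals one while D does not vanish, which is why the normalized measure tolerates changes in overall intensity.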
● Deformable Template Matching
In correlation matching, the reference pattern was assumed to
reside within the test block of data. However, in most practical
cases the test data contain a version of the reference pattern
that is “similar” to it, but not
exactly the same. Such cases are encountered in applications
such as content-based retrieval from databases.
➢ The philosophy: Given a reference pattern r(i,j), known as the
prototype:
• Deform the prototype to produce different variants.
Deformation is described by the application of a parametric
transform on r(i,j):
$T_{\xi}[r(i, j)]$
• For different values of the parameter vector $\xi$, the goodness
of fit with the test pattern is given by the matching energy.