The Zhu-Takaoka Algorithm
Advisor: Prof. R. C. T. Lee
Speaker: S. Y. Tang
R. F. ZHU, T. TAKAOKA, On improving the average case of the Boyer-Moore string matching algorithm, Journal of Information Processing 10(3):173-177, 1987
Index    0  1  2  3  4  5  6  7  8  9 10 11
Text     A  C  T  G  C  C  T  A  A  G  T  A
Pattern  C  T  A  A  G
The last 2-substring of the window is GC (text positions 3 and 4).
No GC appears in P.
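The check on this slide can be sketched in Python. The helper name two_substrings is hypothetical (it is not from the slides); it simply collects every 2-substring of P so that the last pair of the window can be looked up.

```python
def two_substrings(pattern):
    """Return the set of all 2-substrings (adjacent character pairs) of pattern."""
    return {pattern[i:i + 2] for i in range(len(pattern) - 1)}


pattern = "CTAAG"
text = "ACTGCCTAAGTA"

# The window at position 0 covers text[0:5]; its last two characters form the pair.
window_start = 0
window_end = window_start + len(pattern)
last_pair = text[window_end - 2:window_end]

pairs_in_pattern = two_substrings(pattern)
print(last_pair, last_pair in pairs_in_pattern)  # GC False
```

A plain set only answers "does the pair occur in P"; the ztBc table goes one step further and stores how far to shift for each pair.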
How can we know whether a specified 2-substring appears in P or not?
Whenever a mismatch or a complete match occurs, we take the last 2-substring of the current window of T and look for the rightmost location of this 2-substring in P, if it exists. This lookup is done with a precomputed ztBc table.
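A minimal Python sketch of how such a ztBc table could be built, assuming the usual construction: every entry starts at m; every entry whose second character b equals x[0] becomes m - 1; and each 2-substring x[i-1]x[i] with 1 <= i <= m - 2 gets m - 1 - i, so that the rightmost occurrence wins. The name build_zt_bc and the explicit alphabet argument are assumptions made for illustration.

```python
def build_zt_bc(pattern, alphabet):
    """Build ztBc as a nested dict: table[a][b] is the shift for the 2-substring ab.

    Assumes len(pattern) >= 2 and that every pattern character is in alphabet.
    """
    m = len(pattern)
    # Case 1: ab does not occur in the pattern at all -> shift by m.
    table = {a: {b: m for b in alphabet} for a in alphabet}
    # Case 2: b equals the first pattern character -> shift by m - 1.
    for a in alphabet:
        table[a][pattern[0]] = m - 1
    # Case 3: ab occurs at positions i-1, i -> shift by m - 1 - i.
    # Later (more rightward) occurrences overwrite earlier ones.
    for i in range(1, m - 1):
        table[pattern[i - 1]][pattern[i]] = m - 1 - i
    return table


table = build_zt_bc("GCAGAGAG", "ACGT")
for a in "ACGT":
    print(a, [table[a][b] for b in "ACGT"])
```

For the pattern G C A G A G A G this gives, for example, table["C"]["A"] = 5 and table["G"]["A"] = 1.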
• Example
Index    0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
Text     G  C  A  T  C  G  C  A  G  A  G  G  A  T  A  T  A  C  A  G  T  A  C  G
Pattern  G  C  A  G  A  G  A  G
Shift by 5 (pattern now at position 5)
Pattern                 G  C  A  G  A  G  A  G
Shift by 1 (pattern now at position 6)
Pattern                    G  C  A  G  A  G  A  G
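The two shifts in this example can be reproduced with a short Python trace. This is only a sketch: it moves the window using the ztBc value alone and ignores the good-suffix table, and the names zt_bc_lookup and window_shifts are hypothetical.

```python
def zt_bc_lookup(pattern):
    """Return a function (a, b) -> ztBc shift; assumes len(pattern) >= 2."""
    m = len(pattern)
    # Rightmost occurrence of each pair x[i-1]x[i], 1 <= i <= m - 2.
    pairs = {(pattern[i - 1], pattern[i]): m - 1 - i for i in range(1, m - 1)}
    return lambda a, b: pairs.get((a, b), m - 1 if b == pattern[0] else m)


def window_shifts(text, pattern):
    """List (window position, last 2-substring, ztBc shift) for each window visited."""
    m = len(pattern)
    lookup = zt_bc_lookup(pattern)
    trace = []
    j = 0
    while j <= len(text) - m:
        a, b = text[j + m - 2], text[j + m - 1]
        shift = lookup(a, b)
        trace.append((j, a + b, shift))
        j += shift
    return trace


trace = window_shifts("GCATCGCAGAGGATATACAGTACG", "GCAGAGAG")
print(trace)  # [(0, 'CA', 5), (5, 'GA', 1), (6, 'AT', 8), (14, 'TA', 8)]
```

The first two entries correspond to the shifts by 5 and by 1 shown above.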
Pattern
i     0 1 2 3 4 5 6 7
x[i]  G C A G A G A G
ztBc table (row = a, the first character of the 2-substring; column = b, the second character; * = any other character)
ztBc  A C G *
A     8 8 2 8
C     5 8 7 8
G     1 6 7 8
*     8 8 7 8
For a = C and b = A: shift by 5
Shift by 7
When ab does not occur in x[0..6] but b = x[0] = G, the shift is m - 1 = 7. This covers ztBc[C][G], ztBc[G][G] and ztBc[*][G].
Pattern
i     0 1 2 3 4 5 6 7
x[i]  G C A G A G A G
bmGs  7 7 7 2 7 4 7 1
The last 2-substring of the window is ab = CA.
Shift by 5 (bmGs[7] = 1; ztBc[C][A] = 5)
• Full Example
Index    0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
Text     G  C  A  T  C  G  C  A  G  A  G  A  G  T  A  T  A  C  A  G  T  A  C  G
Pattern                 G  C  A  G  A  G  A  G
Exact match at position 5. The last 2-substring of the window is ab = AG.
Shift by 7 (bmGs[0] = 7; ztBc[A][G] = 2)
Pattern                                      G  C  A  G  A  G  A  G
• Full Example
Index    0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
Text     G  C  A  T  C  G  C  A  G  A  G  A  G  T  A  T  A  C  A  G  T  A  C  G
Pattern                                      G  C  A  G  A  G  A  G
The last 2-substring of the window is ab = AG.
Shift by 4 (bmGs[5] = 4; ztBc[A][G] = 2)
Pattern                                                  G  C  A  G  A  G  A  G
• Full Example
Index    0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
Text     G  C  A  T  C  G  C  A  G  A  G  A  G  T  A  T  A  C  A  G  T  A  C  G
Pattern                                                  G  C  A  G  A  G  A  G
The last 2-substring of the window is ab = CG.
Shift by 7 (bmGs[6] = 7; ztBc[C][G] = 7). The window then passes the end of the text.
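The full example can be replayed with a short Python sketch. It assumes that each shift is the larger of the bmGs value and the ztBc value, that bmGs[0] is used after a complete match, and that the bmGs table is supplied by the caller (here the values are copied from the slides rather than computed). The name zt_search is hypothetical.

```python
def zt_search(text, pattern, bm_gs):
    """Sketch of the search phase; returns (match positions, shifts taken).

    Assumes len(pattern) >= 2 and that bm_gs is the good-suffix table of pattern.
    """
    m, n = len(pattern), len(text)
    # ztBc stored sparsely: only pairs that occur in x[0..m-2] are kept.
    pairs = {(pattern[i - 1], pattern[i]): m - 1 - i for i in range(1, m - 1)}

    def zt_bc(a, b):
        if (a, b) in pairs:
            return pairs[(a, b)]
        return m - 1 if b == pattern[0] else m

    matches, shifts = [], []
    j = 0
    while j <= n - m:
        # Compare from right to left, as in Boyer-Moore.
        i = m - 1
        while i >= 0 and pattern[i] == text[j + i]:
            i -= 1
        if i < 0:
            matches.append(j)
            i = 0  # after a complete match, use bmGs[0]
        shift = max(bm_gs[i], zt_bc(text[j + m - 2], text[j + m - 1]))
        shifts.append(shift)
        j += shift
    return matches, shifts


bm_gs = [7, 7, 7, 2, 7, 4, 7, 1]  # bmGs row from the slides
matches, shifts = zt_search("GCATCGCAGAGAGTATACAGTACG", "GCAGAGAG", bm_gs)
print(matches, shifts)  # [5] [5, 7, 4, 7]
```

The shifts 5, 7, 4 and 7 match the four steps of the full example, with the single match at position 5.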
• preprocessing phase in O(m + σ²) time and space complexity (σ = the number of symbols in the alphabet of the text).
• searching phase in O(m × n) time complexity.
References
1. ZHU, R.F. and TAKAOKA, T., 1987, On improving the average case of the Boyer-Moore string matching algorithm, Journal of Information Processing 10(3):173-177.
Thank you for your attention.