
17 String Matching - Rabin Karp Algorithm


Design and Analysis of Algorithms

String Matching Problem


(Rabin - Karp Algorithm)

Dr. D. P. Acharjya
Professor, SCOPE
SJT Annex – 201 E
Dr. D. P. Acharjya 2/26/2024 11
String Matching Problem
• Finding all occurrences of a pattern (a string) in a text
is known as the string matching problem.
• It arises frequently in text-editing programs, where the
text is a document and the string searched for is a
particular word.
• String-matching algorithms are also used to search for
particular patterns in DNA sequences.



Mathematical Formulation
• Assume that the text is an array T[1 .. n] of length n and
the pattern is an array P[1 .. m] of length m, where m ≤ n.
• The elements of P and T are characters drawn from a finite
alphabet Σ. For example, Σ = {0, 1} or Σ = {a, b, ..., z}.
• The character arrays P and T are often called strings of
characters.
Continued…
• The pattern P occurs with shift s in text T (equivalently,
P occurs beginning at position s + 1 in T) if
0 ≤ s ≤ n − m and T[s+1 .. s+m] = P[1 .. m].
• That is, T[s + j] = P[j] for 1 ≤ j ≤ m.
• If P occurs with shift s in T, then s is a valid shift;
otherwise, s is an invalid shift.
• The string-matching problem is the problem of finding
all valid shifts with which a given pattern P occurs in
a given text T.
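
The definition of a valid shift can be checked directly, one shift at a time. The following is a minimal sketch in Python, not part of the original slides; the function name `valid_shifts` is hypothetical, and the sample text and pattern are the ones used in the numerical illustration of these slides.

```python
def valid_shifts(T: str, P: str) -> list[int]:
    """Return every shift s (0 <= s <= n - m) with T[s+1 .. s+m] = P[1 .. m]."""
    n, m = len(T), len(P)
    # Python slices are 0-based, so T[s:s + m] is the slides' T[s+1 .. s+m].
    return [s for s in range(n - m + 1) if T[s:s + m] == P]


print(valid_shifts("2359023141526739921", "31415"))  # prints [6]
```

This brute-force check compares up to m characters at each of the n − m + 1 shifts, which is the cost that hashing tries to avoid in the common case.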



Rabin-Karp Algorithm
RABIN-KARP-MATCHER(T, P, d, q)
    n ← Length[T]
    m ← Length[P]
    h ← d^(m−1) mod q
    p ← 0
    t_0 ← 0
    for i ← 1 to m                          (Preprocessing)
        do p ← (d×p + P[i]) mod q
           t_0 ← (d×t_0 + T[i]) mod q

Continued …
    for s ← 0 to (n − m)                    (Matching)
        do if p = t_s
              then if P[1 .. m] = T[s+1 .. s+m]
                      then print “Pattern occurs with shift s”
           if s < (n − m)
              then t_(s+1) ← (d(t_s − T[s+1]×h) + T[s+m+1]) mod q
• The preprocessing step takes Θ(m) time.
• The matching loop runs n − m + 1 times, once for each shift s.
• In the worst case, every shift requires m character
comparisons P[j] = T[s + j], 1 ≤ j ≤ m.
• The worst-case matching time is therefore O((n − m + 1)m).
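
The pseudocode above can be sketched in Python as follows. This is an illustrative translation under stated assumptions, not the author's implementation: the characters are assumed to be decimal digits (so `int(ch)` gives the digit value), shifts are reported 0-based exactly as `s` in the slides, and the guard for an empty or over-long pattern is an addition.

```python
def rabin_karp_matcher(T: str, P: str, d: int, q: int) -> list[int]:
    """Return all valid shifts of P in T (characters assumed to be digits)."""
    n, m = len(T), len(P)
    if m == 0 or m > n:           # assumption: the slides require 1 <= m <= n
        return []
    h = pow(d, m - 1, q)          # d^(m-1) mod q
    p = t = 0
    for i in range(m):            # preprocessing
        p = (d * p + int(P[i])) % q
        t = (d * t + int(T[i])) % q
    shifts = []
    for s in range(n - m + 1):    # matching
        # Equal hashes only suggest a match; compare characters to confirm.
        if p == t and P == T[s:s + m]:
            shifts.append(s)
        if s < n - m:
            # T[s] is the slides' T[s+1]; T[s + m] is the slides' T[s+m+1].
            # Python's % is non-negative for q > 0, so no extra "+ q" is needed.
            t = (d * (t - int(T[s]) * h) + int(T[s + m])) % q
    return shifts


print(rabin_karp_matcher("2359023141526739921", "31415", 10, 13))  # prints [6]
```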
Numerical Illustration
• Consider the text T = 2359023141526739921. Find all
the valid shifts for the pattern P = 31415.
• Here Σ = {0, 1, 2, …, 9}
• Let q = 13 (prime)
• d = |Σ| = 10
• Here n = Length[T] = 19
• m = Length[P] = 5
• h = d^(m−1) mod q = 10^4 mod 13 = 3
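
As a quick check of these parameters, the values of n, m and h can be reproduced in a few lines of Python. This is only a sketch; the variable names mirror the slides, and the three-argument form of the built-in `pow` performs modular exponentiation.

```python
T, P = "2359023141526739921", "31415"
d, q = 10, 13
n, m = len(T), len(P)
h = pow(d, m - 1, q)   # 10^4 mod 13
print(n, m, h)         # prints 19 5 3
```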
Continued … (Preprocessing)
• p = 0 and t_0 = 0
• For i = 1
p = (d×p + P[1]) % 13
= (10×0 + 3) % 13 = 3
t_0 = (d×t_0 + T[1]) % 13
= (10×0 + 2) % 13 = 2
• For i = 2
p = (d×p + P[2]) % 13 = (10×3 + 1) % 13 = 5
t_0 = (d×t_0 + T[2]) % 13 = (10×2 + 3) % 13 = 10



Continued …
• For i = 3
p = (d×p + P[3]) % 13
= (10×5 + 4) % 13 = 2
t_0 = (d×t_0 + T[3]) % 13
= (10×10 + 5) % 13 = 1
• For i = 4
p = (d×p + P[4]) % 13
= (10×2 + 1) % 13 = 8
t_0 = (d×t_0 + T[4]) % 13
= (10×1 + 9) % 13 = 6
Continued …
• For i = 5
p = (d×p + P[5]) % 13
= (10×8 + 5) % 13 = 7
t_0 = (d×t_0 + T[5]) % 13
= (10×6 + 0) % 13 = 8



Continued … (Matching)
• For s = 0
p (7) = t_0 (8) (False)
s = 0 < (n − m) = 14
t_1 = (d(t_0 − T[s+1]×h) + T[s+m+1]) % q
= (10(8 − T[1]×3) + T[0+5+1]) % 13
= (10(8 − 2×3) + 2) % 13 = 9
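
The rolling update can be sanity-checked against a direct recomputation of the next window's hash. The sketch below does this for the step from s = 0 to s = 1; it assumes digit characters, and it reads each five-digit window as an integer purely for the comparison.

```python
T = "2359023141526739921"
d, q, m = 10, 13, 5
h = pow(d, m - 1, q)

t0 = int(T[0:m]) % q                                    # 23590 % 13
t1_rolled = (d * (t0 - int(T[0]) * h) + int(T[m])) % q  # update rule from the slides
t1_direct = int(T[1:m + 1]) % q                         # 35902 % 13
print(t0, t1_rolled, t1_direct)                         # prints 8 9 9
```

The rolled value needs a constant number of arithmetic operations, whereas recomputing from scratch touches all m digits of the window.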



Continued …
• For s = 1
p (7) = t_1 (9) (False)
s = 1 < (n − m) = 14
t_2 = (d(t_1 − T[2]×h) + T[1+5+1]) % 13
= (10(9 − 3×3) + 3) % 13 = 3



Continued …
• For s = 2
p (7) = t_2 (3) (False)
s = 2 < (n − m) = 14
t_3 = (d(t_2 − T[3]×h) + T[2+5+1]) % 13
= (10(3 − 5×3) + 1) % 13 = 11
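
In this step the value inside the parentheses is negative before the modulus is taken. Python's `%` returns a non-negative result for a positive modulus, but in languages where `%` truncates toward zero (C and Java, for example) the result would be negative unless q is added back. The sketch below shows both forms; the helper name `mod_floor` is hypothetical.

```python
def mod_floor(x: int, q: int) -> int:
    """Non-negative remainder, written so it also works where % truncates toward zero."""
    return ((x % q) + q) % q


x = 10 * (3 - 5 * 3) + 1
print(x, x % 13, mod_floor(x, 13))  # prints -119 11 11
```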



Continued …
• For s = 3
p (7) = t_3 (11) (False)
s = 3 < (n − m) = 14
t_4 = (d(t_3 − T[4]×h) + T[3+5+1]) % 13
= (10(11 − 9×3) + 4) % 13 = 0



Continued …
• For s = 4
p (7) = t_4 (0) (False)
s = 4 < (n − m) = 14
t_5 = (d(t_4 − T[5]×h) + T[4+5+1]) % 13
= (10(0 − 0×3) + 1) % 13 = 1



Continued …
• For s = 5
p (7) = t_5 (1) (False)
s = 5 < (n − m) = 14
t_6 = (d(t_5 − T[6]×h) + T[5+5+1]) % 13
= (10(1 − 2×3) + 5) % 13 = 7



Continued …
• For s = 6
p (7) = t_6 (7) (True)
P[1 .. 5] = T[7 .. 11] = 31415, so the pattern occurs with shift s = 6
s = 6 < (n − m) = 14
t_7 = (d(t_6 − T[7]×h) + T[6+5+1]) % 13
= (10(7 − 3×3) + 2) % 13 = 8



Continued …
• For s = 7
p (7) = t_7 (8) (False)
s = 7 < (n − m) = 14
t_8 = (d(t_7 − T[8]×h) + T[7+5+1]) % 13
= (10(8 − 1×3) + 6) % 13 = 4



Continued …
• For s = 8
p (7) = t_8 (4) (False)
s = 8 < (n − m) = 14
t_9 = (d(t_8 − T[9]×h) + T[8+5+1]) % 13
= (10(4 − 4×3) + 7) % 13 = 5



Continued …
• For s = 9
p (7) = t_9 (5) (False)
s = 9 < (n − m) = 14
t_10 = (d(t_9 − T[10]×h) + T[9+5+1]) % 13
= (10(5 − 1×3) + 3) % 13 = 10



Continued …
• For s = 10
p (7) = t_10 (10) (False)
s = 10 < (n − m) = 14
t_11 = (d(t_10 − T[11]×h) + T[10+5+1]) % 13
= (10(10 − 5×3) + 9) % 13 = 11



Continued …
• For s = 11
p (7) = t_11 (11) (False)
s = 11 < (n − m) = 14
t_12 = (d(t_11 − T[12]×h) + T[11+5+1]) % 13
= (10(11 − 2×3) + 9) % 13 = 7



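Continued …
• For s = 12
p (7) = t_12 (7) (True)
P[j] ≠ T[s + j]: P[1 .. 5] = 31415 but T[13 .. 17] = 67399,
so this is a spurious hit and s = 12 is not a valid shift
s = 12 < (n − m) = 14
t_13 = (d(t_12 − T[13]×h) + T[12+5+1]) % 13
= (10(7 − 6×3) + 2) % 13 = 9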
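
The hash match at s = 12 can be examined directly. The sketch below, which assumes digit characters and reads each window as an integer, shows that the window and the pattern share the same value modulo 13 even though their characters differ, which is why the character-by-character check is required.

```python
T, P, q = "2359023141526739921", "31415", 13
window = T[12:17]  # 0-based slice for the slides' T[13 .. 17]
print(window, int(window) % q, int(P) % q, window == P)
# prints 67399 7 7 False
```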



Continued …
• For s = 13
p (7) = t_13 (9) (False)
s = 13 < (n − m) = 14
t_14 = (d(t_13 − T[14]×h) + T[13+5+1]) % 13
= (10(9 − 7×3) + 1) % 13 = 11



Continued …
• For s = 14
p (7) = t_14 (11) (False)
s = 14 < (n − m) = 14 (False)

Process Terminates.
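
The whole trace can be summarised in one pass. The sketch below recomputes every window hash directly, rather than with the rolling update, so it serves as an independent check of the values t_0 to t_14; it then separates hash hits from valid shifts. It assumes digit characters, and the variable names are illustrative.

```python
T, P, q = "2359023141526739921", "31415", 13
m = len(P)
p = int(P) % q

# Hash of every window T[s+1 .. s+m], for s = 0 .. n - m.
hashes = [int(T[s:s + m]) % q for s in range(len(T) - m + 1)]
hits = [s for s, t_s in enumerate(hashes) if t_s == p]  # hash equals p
valid = [s for s in hits if T[s:s + m] == P]            # characters also match

print(hashes)       # prints [8, 9, 3, 11, 0, 1, 7, 8, 4, 5, 10, 11, 7, 9, 11]
print(hits, valid)  # prints [6, 12] [6]
```

Of the 15 shifts, only two pass the hash test, and only s = 6 survives verification.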

