0% found this document useful (0 votes)
10 views

Data-Structures-Lecture 3

The document discusses string processing and pattern matching algorithms. It introduces strings, how they are stored, and the pattern matching problem of determining if a pattern occurs within a text string. It then describes a first algorithm that compares all substring combinations and has complexity of O(|P| * (|T| - |P| + 1)) in the worst case. Finally, it introduces a fast pattern matching algorithm that uses a table derived from the pattern to efficiently determine matches.

Uploaded by

Zakir Hossen
Copyright
© © All Rights Reserved
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
10 views

Data-Structures-Lecture 3

The document discusses string processing and pattern matching algorithms. It introduces strings, how they are stored, and the pattern matching problem of determining if a pattern occurs within a text string. It then describes a first algorithm that compares all substring combinations and has complexity of O(|P| * (|T| - |P| + 1)) in the worst case. Finally, it introduces a fast pattern matching algorithm that uses a table derived from the pattern to efficiently determine matches.

Uploaded by

Zakir Hossen
Copyright
© © All Rights Reserved
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
You are on page 1/ 20

Data Structures

Class Lecture-3

Mohammad Ashraful Islam


Lecturer
Department of CSE, JU
08/15/20 1
String Processing
String is a finite sequence S of zero or more
character.
The String with zero character is called empty
string.
Alphabet: a, b, c, d ... x, y, z
Digits: 0 to 9
Special character: +,-,/,(), $, =

08/15/20 2
String Processing
String are stored in three types of structure:
• Fix length structure
• Variable length structure
• Linked structure

08/15/20 3
Pattern Matching Problem
You are given a string text T and string pattern P. You
have to determine whether P occurs or appears in T
or not. This problem is known as pattern matching
algorithm.
For example:
o Let T =“abcdef” and P = “bcd”. Here P occurs in T.
o Let T =“abcdef” and P = “abd”. Here P does not
occurs in T.

08/15/20 4
First Pattern Matching Algorithm
• T = GCTACACATC
• P = ACAT
• T = T[1]T[2]T[3] T[4] T[5] T[6] T[7] T[8] T[9]T[10]
• P = P[1]P[2]P[3]P[4]
• So, T has 10-4+1 substrings of length 4.

08/15/20 5
First Pattern Matching Algorithm
• These 10-4+1 =7 substrings of length 4 are,
• W1 = T[1][2][3][4]
• W2 = T [2][3][4][5]
• W3 = T[3][4][5][6]
• W4 = T[4][5][6][7]
• W5 = T[5][6][7] [8]
• W6 = T[6][7] [8] [9]
• W7 = T[7] [8] [9][10]
08/15/20 6
First Pattern Matching Algorithm
• More formally, substring of T starts at kth
position : WK = T[K][K+1][K+2][K+3]

• In first pattern matching algorithm, for each K =


1 to 7, we will compare WK with Pattern P. If any
of the substring matches exactly with the
pattern P, then we can say Text T contains
pattern P.

08/15/20 7
First Pattern Matching Algorithm

08/15/20 8
First Pattern Matching Algorithm

08/15/20 9
First Pattern Matching Algorithm

08/15/20 10
Complexity
• The complexity of this pattern matching
algorithm is the number of comparison
between characters in pattern Pand
characters in text T.
In the given example, C = 1+1+1+4+1+4 =12

08/15/20 11
Complexity of First Pattern
Matching Algorithm
• Let, P = aaaab
• T = aaaaaaaaaaa
• |P| = 5, |T|= 11
• So, W1 = W2 = W3 = W4 = W5= W6=W7 = aaaaa
• C = 5+5+5+5+5+5+5 = 35.
• So, in worse case, C = |P| × (|T| - |P| + 1)

08/15/20 12
Fast Pattern Matching Algorithm

In second pattern matching algorithm we use a table


which is derived from P.
P = aaba
Q0 = “” (Empty string)
Q1= “a”
Q2= “aa”
Q3= “aab”
Q4= “aaba”
08/15/20 13
Fast Pattern Matching Algorithm
Initially, we have an empty string Q0.
Character ‘a’ after Q0 gives us Q1 =“a”
Character ‘a’ after Q1 gives us Q2 =“aa”
Character ‘b’ after Q2 gives us Q3 =“aab”
Character ‘a’ after Q3 gives us Q4 =“aaba” = P

08/15/20 14
Fast Pattern Matching Algorithm
Initially, we have an empty string Q0.
Character ‘a’ after Q0 gives us Q1 =“a”
But, what will happen if we get ‘b’ after Q0 ?
Answer : ‘b’ after Q0 means string, Str = “b”.
The maximum length Suffix of Str that matches
with Prefix of P is 0. So, it takes us to Q0.

08/15/20 15
Fast Pattern Matching Algorithm
What will happen if we get ‘b’ after Q 1 ?
Answer : ‘b’ after Q1 means string, Str = “ab”.
The maximum length Suffix of Str that matches
with Prefix of P is 0. So, it takes us to Q0.
The Suffix of Str are , “b” , “ab”
The prefix of P are , “a” , “aa”

08/15/20 16
Fast Pattern Matching Algorithm
What will happen if we get ‘a’ after Q2 ?
Answer : ‘a’ after Q2 means string, Str = “aaa”.
The maximum length Suffix of Str that matches with
Prefix of P is 0. So, it takes us to Q0.
The Suffix of Str are , “a” , “aa” , “aaa”
The prefix of P are , “a” , “aa” , “aab”
So, Maximum length that matches is 2.

08/15/20 17
Fast Pattern Matching Algorithm
What will happen if we get ‘b’ after Q 3 ?
Answer : ‘b’ after Q3 means string, Str = “aabb”.
The maximum length Suffix of Str that matches
with Prefix of P is 0. So, it takes us to Q0.
The Suffix of Str are , “b” , “bb” , “abb”, “aabb”
The prefix of P are , “a” , “aa” , “aab” , “aaba”.

08/15/20 18
Pattern Matching Table
a b

Q0 Q1 Q0

Q1 Q2 Q0

Q2 Q2 Q3

Q3 Q4 = P Q0

08/15/20 19
Thank You

20

You might also like