DAA Unit 5 Part 1

The document discusses various string matching algorithms, including the Naive String Matching Algorithm, Rabin-Karp Algorithm, and Knuth-Morris-Pratt Algorithm. It explains the principles behind each algorithm, such as the use of hashing in Rabin-Karp and the prefix table in KMP for efficient pattern searching. Additionally, it provides examples and key concepts associated with these algorithms.

String Matching
• A string matching algorithm is also called a "string searching algorithm."
• String matching is a vital class of string algorithms: it is the problem of
finding the places where one or several strings (patterns) occur within
a larger string (the text).
• We are given a text array T[1..n] of n characters and a pattern array
P[1..m] of m characters, with m ≤ n. The problem is to find every integer s,
called a valid shift, such that 0 ≤ s ≤ n-m and T[s+1..s+m] = P[1..m].
• In other words, we want to find whether P occurs in T, i.e., whether P is a
substring of T. The elements of P and T are characters drawn from some finite
alphabet such as {0, 1} or {A, B, ..., Z, a, b, ..., z}.
• Given a string T[1..n], its substrings are written T[i..j] for some 1 ≤ i ≤ j ≤ n:
the string formed by the characters of T from index i to index j, inclusive. This means that
a string is a substring of itself (take i = 1 and j = n).
Algorithms used for String Matching:
• Several different methods are used to find a pattern in a text:
• The Naive String Matching Algorithm
• The Rabin-Karp-Algorithm
• Finite Automata
• The Knuth-Morris-Pratt Algorithm
• The Boyer-Moore Algorithm
The Naive String Matching Algorithm
• The naive approach tests every possible placement of the pattern P[1..m] relative to the text
T[1..n]. We try the shifts s = 0, 1, ..., n-m successively, and for each shift s we compare
T[s+1..s+m] with P[1..m].
• The naive algorithm finds all valid shifts using a loop that checks the condition P[1..m]
= T[s+1..s+m] for each of the n - m + 1 possible values of s.
Naïve String Matching
Example:
Suppose T = 1011101110
P = 111
Find all the valid shifts.
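The exercise above can be checked with a short program. The following is a minimal sketch of the naive method, assuming Python strings with 0-based indexing, so the slice text[s:s+m] plays the role of T[s+1..s+m]. The function name naive_match is illustrative and not part of the original slides.

```python
def naive_match(text: str, pattern: str) -> list[int]:
    """Return every valid shift s at which pattern occurs in text."""
    n, m = len(text), len(pattern)
    shifts = []
    # Try each of the n - m + 1 possible shifts s = 0, 1, ..., n - m.
    for s in range(n - m + 1):
        # text[s:s + m] corresponds to T[s+1..s+m] in 1-based notation.
        if text[s:s + m] == pattern:
            shifts.append(s)
    return shifts


print(naive_match("1011101110", "111"))  # [2, 6]
```

For T = 1011101110 and P = 111 the valid shifts are s = 2 and s = 6. Each shift costs up to m character comparisons, so the worst-case running time is O((n - m + 1) · m).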
Rabin-Karp String Matching
• The Rabin-Karp algorithm is a string matching algorithm that efficiently finds patterns
within a larger text using a technique called hashing.
• The basic idea is to convert the pattern and each possible substring of the text into
numeric values (hashes) and then compare these values rather than the strings
themselves. This allows for faster comparisons, especially when dealing with large texts.
• The Rabin-Karp algorithm for string matching is useful because it can quickly find patterns
in large texts. It’s especially good when you need to search for multiple patterns at once
or when the text is very long.
Rabin-Karp String Matching - Working
• Hashing the Pattern: First, the algorithm turns the pattern (the word or sequence
you're looking for) into a number using a hash function. This number acts as a
compact fingerprint of the pattern.
• Hashing the Text: Next, the algorithm takes the first part of the text that is the
same length as the pattern and turns it into a number using the same hash
function.
• Comparing Hashes: The algorithm compares the number (hash) of the pattern
with the number (hash) of the part of the text. If the numbers are the same, it
means the pattern might be there. If the numbers are different, the pattern is
definitely not there.
• Sliding the Window: If the numbers don't match, the algorithm slides over to the
next part of the text, creates a new hash for this part, and compares again. This
sliding continues until the whole text is checked.
• Collision Check: Sometimes, different parts of the text can have the same hash
number even if they are not the same as the pattern. When the hash numbers
match, the algorithm checks the actual text to make sure it really found the
pattern.
Rabin-Karp String Matching - Key Concepts
1. Hash Function
• A hash function is a mathematical function that converts a string (like a sequence of
characters) into a fixed-size numeric value, which is called a hash.
• The hash function takes a string as input and returns a number (hash) for that string. Equal
strings always get the same hash, but two different strings can occasionally share one (a collision).
• By converting strings into numbers, the Rabin-Karp algorithm can compare the pattern with
parts of the text much faster. Instead of comparing the strings character by character, it first
compares their hash values. If the hash values match, then the strings are likely to match as
well, and a character-by-character check confirms it.
2. Sliding Window Technique
• The sliding window technique is a method used by the Rabin-Karp algorithm to move
through the text efficiently while updating the hash value.

How It Works:
• Imagine you have a window that covers a substring of the text that is the same length as the
pattern. You start by calculating the hash for the first substring within this window. Then, you
slide the window one character to the right to cover the next substring.
• Instead of recalculating the hash from scratch for each new substring, the algorithm uses the
hash of the previous substring and updates it by removing the effect of the first character and
adding the effect of the new character.
• For example, if your current window covers "abc" (with a hash of 294), and you move the
window to cover "bcd", you can quickly update the hash by subtracting the effect of 'a' and
adding the effect of 'd'.
• This technique reduces the number of operations needed to calculate the hash for each new
substring, making the algorithm much faster. Instead of recalculating the hash for every
substring from scratch, you only need to make a few adjustments, which is much quicker.
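The "abc" to "bcd" update can be tried out directly. This sketch assumes the hash is simply the sum of the characters' ASCII codes, which is one hash function consistent with the value 294 quoted above (97 + 98 + 99); the slides do not state which function was used. The names char_sum_hash and slide_window are illustrative.

```python
def char_sum_hash(s: str) -> int:
    """Assumed hash: the sum of the character codes of s."""
    return sum(ord(c) for c in s)


def slide_window(prev_hash: int, outgoing: str, incoming: str) -> int:
    """Update the hash in O(1): drop the outgoing character, add the incoming one."""
    return prev_hash - ord(outgoing) + ord(incoming)


h_abc = char_sum_hash("abc")
h_bcd = slide_window(h_abc, "a", "d")
print(h_abc, h_bcd)  # 294 297
```

A plain sum ignores character order, so "abc" and "cba" collide. Practical Rabin-Karp implementations usually weight each position by a power of a base and reduce modulo a prime, as in the worked example later in these notes.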
Rabin-Karp String Matching - Algorithm
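A minimal sketch of the procedure described in the preceding slides (hash the pattern, hash the first window, compare, slide, and verify on a hash match). It assumes a polynomial hash with base d and prime modulus q; the default values d = 256 and q = 101, the use of ord() as the character value, and the function name rabin_karp are illustrative choices, not taken from the slides.

```python
def rabin_karp(text: str, pattern: str, d: int = 256, q: int = 101) -> list[int]:
    """Return all 0-based shifts at which pattern occurs in text."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []

    h = pow(d, m - 1, q)  # weight of the leading character of a window
    p = t = 0
    for i in range(m):
        p = (d * p + ord(pattern[i])) % q  # hash of the pattern
        t = (d * t + ord(text[i])) % q     # hash of the first window

    shifts = []
    for s in range(n - m + 1):
        # Equal hashes only suggest a match; verify to rule out a spurious hit.
        if p == t and text[s:s + m] == pattern:
            shifts.append(s)
        if s < n - m:
            # Rolling update: remove text[s], shift left, append text[s + m].
            t = (d * (t - ord(text[s]) * h) + ord(text[s + m])) % q
    return shifts


print(rabin_karp("1011101110", "111"))  # [2, 6]
```

Python's % operator always returns a non-negative result for a positive modulus, so the rolling update needs no extra correction when the subtraction goes negative. In languages where % can return a negative value, add q before reducing.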
Rabin-Karp String Matching - Example
Let the text be a 10-character string over the alphabet {A, B, ..., J}; its first two windows,
used below, are ABC and BCC.

Let the pattern be CDD.

Let us assign a numerical value (v), or weight, to each character used in the
problem. Here we use only the first ten letters (A to J), with A = 1, B = 2, C = 3, D = 4, and so on.
Let n be the length of the text and m be the length of the pattern.
Here, n = 10 and m = 3.
Let d be the number of characters in the input set.
Here, the input set is {A, B, C, ..., J}.
So, d = 10. You can assume any suitable value for d.

Calculate the hash value of the pattern.

hash value for pattern (p) = $\left(\sum_{i=1}^{m} v_i \cdot d^{m-i}\right) \bmod 13$
= $(3 \cdot 10^{2} + 4 \cdot 10^{1} + 4 \cdot 10^{0}) \bmod 13$
= $344 \bmod 13$
= 6

The modulus is a prime number (here, 13), chosen so that all the
calculations can be performed with single-precision arithmetic.
Calculate the hash value for the text window of size m.
For the first window ABC, hash value for text (t)
= $\left(\sum_{i=1}^{m} v_i \cdot d^{m-i}\right) \bmod 13$
= $(1 \cdot 10^{2} + 2 \cdot 10^{1} + 3 \cdot 10^{0}) \bmod 13$
= $123 \bmod 13$
= 6
• Compare the hash value of the pattern with the hash value of the text window. If they match,
character-by-character matching is performed.
• In the example above, the hash value of the first window (t = 6) matches p = 6, so we
compare ABC with CDD character by character.
• Since they do not match, we move on to the next window.

Spurious Hit
• When the hash value of the pattern matches the hash value of a text window but
the window is not actually the pattern, it is called a spurious hit. The first window ABC is a spurious hit.
We calculate the hash value of the next window (BCC) by subtracting the first term and adding the
next term, as shown below.
t = $(2 \cdot 10^{2} + 3 \cdot 10^{1} + 3 \cdot 10^{0}) \bmod 13$
= $233 \bmod 13$
= 12
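The same value can be obtained without re-evaluating the whole window. The derivation below is a sketch of the standard rolling update, where the helper symbol h (the weight of the leading character, d squared modulo 13 for a three-character window) and the names v_out and v_in for the outgoing and incoming character values are introduced here for illustration. It reuses the previous window hash t = 6 for ABC, removes A (value 1), and appends C (value 3).

```latex
\begin{align*}
h  &= d^{2} \bmod 13 = 100 \bmod 13 = 9 \\
t' &= \bigl(d\,(t - v_{\text{out}} \cdot h) + v_{\text{in}}\bigr) \bmod 13 \\
   &= \bigl(10\,(6 - 1 \cdot 9) + 3\bigr) \bmod 13 \\
   &= (-27) \bmod 13 \\
   &= 12
\end{align*}
```

This agrees with the direct computation 233 mod 13 = 12, and the update takes a constant number of operations regardless of the window length.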
Knuth-Morris-Pratt Algorithm
• The KMP algorithm was developed by Donald Knuth and Vaughan Pratt and,
independently, by James H. Morris in 1970. In 1977, the three jointly
published the algorithm.
• The KMP algorithm is used to find a "Pattern" in a "Text". It compares characters one by
one from left to right.
• Whenever a mismatch occurs, however, it uses a preprocessed table called the "Prefix Table" to skip
character comparisons while matching.
• The prefix table is sometimes also known as the LPS table. Here LPS stands for "Longest proper
Prefix which is also Suffix".
Steps for Creating LPS Table (Prefix Table)
• Step 1 - Define a one-dimensional array whose size equals the length of the Pattern
(LPS[size]).
• Step 2 - Define variables i and j. Set i = 0, j = 1 and LPS[0] = 0.
• Step 3 - Compare the characters at Pattern[i] and Pattern[j].
• Step 4 - If they match, set LPS[j] = i+1 and increment both i and j by one.
Go to Step 3.
• Step 5 - If they do not match, check the value of i. If it is 0, set LPS[j]
= 0 and increment j by one; if it is not 0, set i = LPS[i-1]. Go to Step 3.
• Step 6 - Repeat the above steps until all the values of LPS[] are filled, i.e., until j reaches the length of the Pattern.
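The steps above can be sketched in a few lines. This is a minimal illustration that keeps the variable names i, j and LPS from the steps and uses 0-based indexing; the function name build_lps and the sample patterns are assumptions made for the example.

```python
def build_lps(pattern: str) -> list[int]:
    """Build the LPS (prefix) table by following Steps 1 to 6."""
    m = len(pattern)
    lps = [0] * m          # Step 1: one entry per pattern character
    i, j = 0, 1            # Step 2: LPS[0] is already 0
    while j < m:           # Step 6: stop once every entry is filled
        if pattern[i] == pattern[j]:   # Steps 3 and 4
            lps[j] = i + 1
            i += 1
            j += 1
        elif i == 0:                   # Step 5, case i == 0
            lps[j] = 0
            j += 1
        else:                          # Step 5, case i != 0: fall back
            i = lps[i - 1]
    return lps


print(build_lps("ABABD"))    # [0, 0, 1, 2, 0]
print(build_lps("ABCDABD"))  # [0, 0, 0, 0, 1, 2, 0]
```

For "ABABD", LPS[3] = 2 because the prefix "AB" is also a suffix of "ABAB". The table is built in O(m) time, since j never decreases and i can fall back only as often as it has advanced.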
How to use LPS Table
• We use the LPS table to decide how many character comparisons can be skipped
when a mismatch occurs.
• When a mismatch occurs, check the LPS value of the pattern character just before the
mismatched character.
• If it is 0, restart from the first character of the pattern and compare it with the
mismatched character in the text. If the mismatch is at the very first character of the pattern,
move on to the next character in the text.
• If it is not 0, resume by comparing the pattern character at the index equal to that LPS
value with the mismatched character in the text.
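The rules above can be sketched as a complete search routine. This is a minimal illustration with 0-based indexing; the names kmp_search and build_lps are assumptions, and the table-building helper is repeated so that the block runs on its own. After a mismatch, the text index i is never moved backwards; only the pattern index j falls back through the LPS table.

```python
def build_lps(pattern: str) -> list[int]:
    """LPS table: longest proper prefix that is also a suffix, per position."""
    lps = [0] * len(pattern)
    i, j = 0, 1
    while j < len(pattern):
        if pattern[i] == pattern[j]:
            lps[j] = i + 1
            i += 1
            j += 1
        elif i == 0:
            j += 1
        else:
            i = lps[i - 1]
    return lps


def kmp_search(text: str, pattern: str) -> list[int]:
    """Return all 0-based shifts at which pattern occurs in text."""
    n, m = len(text), len(pattern)
    if m == 0:
        return []
    lps = build_lps(pattern)
    shifts = []
    i = j = 0  # i indexes the text, j indexes the pattern
    while i < n:
        if text[i] == pattern[j]:
            i += 1
            j += 1
            if j == m:                 # full match ending at text[i - 1]
                shifts.append(i - m)
                j = lps[j - 1]         # keep going to find overlapping matches
        elif j == 0:
            i += 1                     # mismatch at the first pattern character
        else:
            j = lps[j - 1]             # reuse the already matched prefix
    return shifts


print(kmp_search("1011101110", "111"))  # [2, 6]
```

Because i only moves forward and j falls back at most as often as it has advanced, the search runs in O(n) time after the O(m) preprocessing.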