Daa Mini Project

Uploaded by

nsinghpanwar565

Guru Gobind Singh College of Engineering & Research Centre,

Nashik

Mini-Project Report

Name of Programme: Computer Engineering Academic Year: 2024-25


Semester: BECO-Sem 7 Course code: 410246
Name of Course: Laboratory Practice-III (DAA)
-------------------------------------------------------------------------------------------------------------------------------
Title of Mini-Project: “Naive string matching algorithm and Rabin-Karp algorithm for
string matching”

1.0 Rationale:
In this mini project, we have implemented two popular algorithms for string matching: the Naive String
Matching Algorithm and the Rabin-Karp Algorithm. String matching is a fundamental problem in computer
science, and it has a wide range of applications, such as in search engines, DNA sequence analysis, and text
editing software. The Naive String Matching Algorithm is simple and brute force in nature, while the Rabin-
Karp Algorithm introduces a more efficient approach using hashing for pattern matching.

2.0 Aim/Benefits of Mini-Project:


1. To understand the working principles of string matching algorithms.
2. To compare the performance of the Naive String Matching Algorithm and the Rabin-Karp
Algorithm in terms of time complexity.
3. To demonstrate the use of hashing in improving the efficiency of string matching.

3.0 Course Outcomes achieved (COs):


1. Analyze performance of an algorithm.
2. Learn how to implement algorithms that follow algorithm design strategies namely divide and
conquer, greedy, dynamic programming, backtracking, branch and bound.

4.0 Literature Review:


The Naive String Matching Algorithm checks for a pattern within a text by comparing the pattern with every
length-m substring (window) of the text, one starting position at a time. While simple, this approach can be
inefficient: there are n - m + 1 windows and each comparison costs up to m character checks, giving a worst-case
time complexity of O(n*m), where n is the length of the text and m is the length of the pattern.
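
To make the O(n*m) bound concrete, the following minimal sketch counts the character comparisons made by a character-by-character version of the naive search. The function name naive_count_comparisons and the worst-case input are illustrative assumptions, not part of the project code.

```python
def naive_count_comparisons(text, pattern):
    """Return (match_indices, comparisons) for a character-by-character naive search."""
    n, m = len(text), len(pattern)
    matches = []
    comparisons = 0
    for i in range(n - m + 1):
        j = 0
        # Compare the window text[i:i+m] with the pattern one character at a time
        while j < m:
            comparisons += 1
            if text[i + j] != pattern[j]:
                break
            j += 1
        if j == m:
            matches.append(i)
    return matches, comparisons


# Worst case: every window agrees with the pattern until its last character,
# so each of the n - m + 1 = 16 windows costs m = 5 comparisons
print(naive_count_comparisons("A" * 20, "AAAAB"))  # → ([], 80)

# Friendlier case: most windows are rejected after one or two comparisons
print(naive_count_comparisons("ABAAABCDBBABCDDEBCABC", "ABC")[0])  # → [4, 10, 18]
```

On the worst-case input the count equals (n - m + 1) * m exactly, which is where the O(n*m) bound comes from.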

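The Rabin-Karp Algorithm improves upon this by introducing hashing. Instead of comparing the pattern with
every substring directly, it computes a hash for the pattern and for each length-m window of the text, and only
compares the actual strings if the hash values match. A rolling hash derives each window's hash from the previous
one in constant time, so the hash does not have to be recomputed from scratch at every position.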
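
As a rough illustration of the hashing idea, the sketch below computes a polynomial hash for every window in two ways: from scratch, and by a constant-time rolling update. The helper names window_hash and rolling_hashes are hypothetical; the base d = 256 and modulus q = 101 are the values used in the project code.

```python
def window_hash(s, d=256, q=101):
    """Polynomial hash of s, computed from scratch with Horner's rule."""
    value = 0
    for ch in s:
        value = (d * value + ord(ch)) % q
    return value


def rolling_hashes(text, m, d=256, q=101):
    """Return the hash of every length-m window, using an O(1) update per step."""
    n = len(text)
    if m == 0 or m > n:
        return []
    h = pow(d, m - 1, q)  # weight of the leading character, d^(m-1) mod q
    t = window_hash(text[:m], d, q)
    hashes = [t]
    for i in range(n - m):
        # Remove text[i], shift the rest by one position, append text[i + m]
        t = (d * (t - ord(text[i]) * h) + ord(text[i + m])) % q
        hashes.append(t)
    return hashes


text = "ABAAABCDBBABCDDEBCABC"
m = 3
rolled = rolling_hashes(text, m)
recomputed = [window_hash(text[i:i + m]) for i in range(len(text) - m + 1)]
print(rolled == recomputed)  # → True
```

Both lists agree, but the rolling version does a constant amount of work per window instead of m operations.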

5.0 Actual Methodology followed:
1. Start: Identify the problem of string matching and decide on two algorithms: Naive and Rabin-Karp.

2. Algorithm Implementation:

a. Implement the Naive String Matching Algorithm, iterating through the text and comparing substrings.

b. Implement the Rabin-Karp Algorithm, using a rolling hash function for efficient pattern matching.

3. Testing: Test both algorithms on various text inputs to observe their performance with different string lengths and patterns.

4. Time Complexity Analysis: Measure the time taken by both algorithms and analyze their time complexities for different input sizes.

5. Edge Case Handling: Ensure the algorithms work correctly when the pattern is not found, or when the pattern length is greater than the text length.

6. Comparison: Compare both algorithms based on performance metrics, specifically focusing on time complexity.
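
Steps 3 to 5 above could be carried out with a small harness along the following lines. This is only a sketch under stated assumptions: naive_search and rabin_karp_search are hypothetical variants of the two algorithms that return a list of match indices instead of printing, time_call is a hypothetical helper, and the random DNA-like test texts are made up for illustration.

```python
import random
import time


def naive_search(text, pattern):
    """Naive matcher that returns match indices (hypothetical variant)."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    return [i for i in range(n - m + 1) if text[i:i + m] == pattern]


def rabin_karp_search(text, pattern, d=256, q=101):
    """Rabin-Karp matcher that returns match indices (hypothetical variant)."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    h = pow(d, m - 1, q)
    p = t = 0
    for i in range(m):
        p = (d * p + ord(pattern[i])) % q
        t = (d * t + ord(text[i])) % q
    matches = []
    for i in range(n - m + 1):
        if p == t and text[i:i + m] == pattern:
            matches.append(i)
        if i < n - m:
            t = (d * (t - ord(text[i]) * h) + ord(text[i + m])) % q
    return matches


def time_call(func, *args):
    """Return (result, elapsed_seconds) for a single call."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


# Edge cases: pattern absent, and pattern longer than the text
assert naive_search("ABC", "XYZ") == rabin_karp_search("ABC", "XYZ") == []
assert naive_search("AB", "ABC") == rabin_karp_search("AB", "ABC") == []

random.seed(0)  # fixed seed so the generated texts are reproducible
for n in (1_000, 10_000, 100_000):
    text = "".join(random.choice("ACGT") for _ in range(n))
    pattern = text[n // 2:n // 2 + 8]  # guarantees at least one match
    naive_result, naive_time = time_call(naive_search, text, pattern)
    rk_result, rk_time = time_call(rabin_karp_search, text, pattern)
    assert naive_result == rk_result  # both algorithms must agree
    print(f"n={n:>7}  naive={naive_time:.4f}s  rabin_karp={rk_time:.4f}s")
```

Measured times depend on the machine and interpreter. In CPython the slice comparison runs in compiled code while the rolling hash runs as interpreted arithmetic, so the naive version may well come out faster on inputs like these even though its worst-case bound is weaker.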

6.0 Actual Code of Program:

def naive_string_matching(text, pattern):
    n = len(text)
    m = len(pattern)

    # Loop over every position where the pattern can fit in the text
    for i in range(n - m + 1):
        # Check for a match between the text and the pattern
        if text[i:i + m] == pattern:
            print(f"Pattern found at index {i}")

# Rabin-Karp Algorithm implementation


def rabin_karp(text, pattern, q=101):
    d = 256  # Number of characters in the input alphabet
    n = len(text)
    m = len(pattern)
    p = 0  # Hash value for the pattern
    t = 0  # Hash value for the current window of text
    h = 1  # Value of d^(m-1) % q

    # A pattern longer than the text cannot match; returning early also
    # avoids an IndexError when hashing the first window below
    if m > n:
        return

    # Precompute h = pow(d, m-1) % q
    for i in range(m - 1):
        h = (h * d) % q

    # Compute the hash value of the pattern and the first window of text
    for i in range(m):
        p = (d * p + ord(pattern[i])) % q
        t = (d * t + ord(text[i])) % q

    # Slide the pattern over the text
    for i in range(n - m + 1):
        # Check the hash values of the current window of text and the pattern
        if p == t:
            # If the hash values match, confirm by comparing the actual characters
            if text[i:i + m] == pattern:
                print(f"Pattern found at index {i}")

        # Compute the hash for the next window of text
        if i < n - m:
            t = (d * (t - ord(text[i]) * h) + ord(text[i + m])) % q

            # Python's % already yields a non-negative result for q > 0;
            # this guard only matters if the code is ported to a language
            # where % can return a negative value
            if t < 0:
                t += q

# Test the algorithms
text = "ABAAABCDBBABCDDEBCABC"
pattern = "ABC"

print("Naive String Matching:")
naive_string_matching(text, pattern)

print("\nRabin-Karp String Matching:")
rabin_karp(text, pattern)

Explanation of Differences

1. Naive Algorithm:
• Time Complexity: O(n * m) in the worst case, where n is the length of the text and m is the length of the pattern.
• It checks each length-m substring of the text by comparing it character by character with the pattern.
• This is simple but can be inefficient for larger texts and patterns, because the cost grows with the product of
the two lengths.

2. Rabin-Karp Algorithm:
• Time Complexity: O(n + m) on average and O(n * m) in the worst case (when many hash collisions or many
true matches force full string comparisons).
• It uses a rolling hash to compare the pattern with each window of the text by hash value first. Only if the
hashes match does it compare the actual strings.
• This can be much faster than the naive method when collisions are rare, since most windows are rejected with a
single integer comparison.
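
The effect of hash collisions on the worst case can be seen by counting how often the hashes match without the strings matching. The sketch below is illustrative only: count_hash_hits is a hypothetical helper, the repeated test text is made up, and the three moduli were chosen to show the two extremes alongside the project's default q = 101.

```python
def count_hash_hits(text, pattern, q, d=256):
    """Return (true_matches, spurious_hits) for a Rabin-Karp scan with modulus q."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return 0, 0
    h = pow(d, m - 1, q)
    p = t = 0
    for i in range(m):
        p = (d * p + ord(pattern[i])) % q
        t = (d * t + ord(text[i])) % q
    true_matches = spurious = 0
    for i in range(n - m + 1):
        if p == t:
            # A hash hit still needs a full comparison to be confirmed
            if text[i:i + m] == pattern:
                true_matches += 1
            else:
                spurious += 1
        if i < n - m:
            t = (d * (t - ord(text[i]) * h) + ord(text[i + m])) % q
    return true_matches, spurious


text = "ABAAABCDBBABCDDEBCABC" * 50
pattern = "ABC"

# q = 1 maps every window to hash 0, so every window needs a full comparison
# q = 101 is the default used in the project code
# q = 2**31 - 1 exceeds d**m here, so distinct 3-character windows cannot collide
for q in (1, 101, 2**31 - 1):
    print(q, count_hash_hits(text, pattern, q))
```

With q = 1 the algorithm degenerates to the naive method plus hashing overhead, which is the O(n * m) worst case. A larger modulus reduces spurious hits at the cost of larger intermediate values.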

7.0 Skills Developed / Learning Outcomes from this Mini-Project:

1. Understanding of the fundamental principles behind string matching algorithms.
2. Proficiency in implementing and optimizing algorithms using efficient data structures.
3. Ability to use hash functions to improve the efficiency of pattern matching.
4. Experience in analyzing and comparing the performance of different algorithms.

8.0 Applications of Mini-Project:

1. Search engines: To find occurrences of specific words or phrases within a large body of text.
2. Text editors: To implement the "Find and Replace" functionality efficiently.
3. Plagiarism detection systems: To compare large sets of text for similarities.
4. DNA sequence analysis: To match specific patterns of nucleotides in biological data.

Evaluated by: Ms. B.P Ahuja


Date: Name & Signature of Guide
