Algo Research
Comparative Analysis
Name: Muhammad Zeeshan Roll No: 005
University of Sialkot

Abstract

The reviewed research paper focuses on the analysis of various string matching algorithms, particularly in the context of desktop search systems. Desktop search, a specialized variant of enterprise search, involves the retrieval of information stored on personal computers, including documents, emails, and web content. The research compares four specific algorithms (Two-Way, Colussi, Optimal Mismatch, and Maximal Shift) in terms of their performance, including search time, number of iterations, and accuracy. The study concludes that the Colussi algorithm outperforms the others in terms of accuracy, making it a preferable choice for desktop search applications.

1. Introduction

The field of Information Retrieval (IR) is essential in managing and retrieving relevant documents from large datasets based on user queries. String matching algorithms are fundamental in this process, particularly for desktop search systems, which focus on retrieving content stored locally on computers. These algorithms are critical in various applications, including text mining, bioinformatics, and plagiarism detection.

This review paper examines the performance of several string matching algorithms, focusing on their application in desktop search environments. The algorithms under review include the Two-Way, Colussi, Optimal Mismatch, and Maximal Shift algorithms.

2. Review of String Matching Algorithms

2.1 Two-Way Algorithm

The Two-Way algorithm is designed for efficient string matching by factorizing the search pattern into two parts. The algorithm processes the pattern in two stages, ensuring that mismatches are quickly handled with minimal unnecessary comparisons. The time complexity of this algorithm is O(n), making it suitable for various text-matching applications.

2.2 Colussi Algorithm

The Colussi algorithm is an enhancement of the Knuth-Morris-Pratt (KMP) algorithm, focusing on optimizing the comparison process by dividing the pattern positions into two subsets. This algorithm performs comparisons in two phases: the no-holes phase (left-to-right) and the holes phase (right-to-left). Its primary advantage is the reduced need for re-comparison of certain text characters, leading to improved performance in many cases.

... analyzed to determine the most efficient algorithm for desktop search purposes.
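
Since the Colussi algorithm is described as an enhancement of KMP, a minimal sketch of the KMP baseline may help show what is being improved. This is not the Colussi algorithm itself and not code from the reviewed paper; the function names build_failure_table and kmp_search are chosen here for illustration only. The sketch shows how a precomputed failure table lets the search continue after a mismatch without moving backwards in the text.

```python
from typing import List


def build_failure_table(pattern: str) -> List[int]:
    """failure[i] is the length of the longest proper prefix of
    pattern[:i + 1] that is also a suffix of it."""
    failure = [0] * len(pattern)
    k = 0  # length of the currently matched prefix
    for i in range(1, len(pattern)):
        # Fall back to shorter borders until the next character fits.
        while k > 0 and pattern[i] != pattern[k]:
            k = failure[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        failure[i] = k
    return failure


def kmp_search(text: str, pattern: str) -> List[int]:
    """Return the start index of every occurrence of pattern in text."""
    if not pattern:
        return []
    failure = build_failure_table(pattern)
    matches = []
    k = 0  # number of pattern characters matched so far
    for i, ch in enumerate(text):
        # On a mismatch, reuse the failure table instead of
        # re-reading earlier text characters.
        while k > 0 and ch != pattern[k]:
            k = failure[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = failure[k - 1]  # allow overlapping matches
    return matches


if __name__ == "__main__":
    print(kmp_search("abababca", "aba"))
```

In this baseline every text character is read left to right, and some may be compared more than once during the fallback loop. The two-phase ordering of pattern positions in Colussi's method, as described above, is aimed at reducing those repeated comparisons.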
References: