Searching A Substring - JetBrains Academy - Learn Programming by Building Your Own Apps
Searching a substring
§1. The problem
Strings create the programming universe, being one of the most used data
structures, so it is essential to know how to handle them. Searching for a substring
is one of the skills you're likely to use quite often. For example, while working in a
text editor, you might want to find all occurrences of a particular word in a text, or
you might need to find a particular phrase on a web page without having to read
the entire thing. For convenience and clarity, let's differentiate: the substring that we
are looking for is called a pattern, and the string in which we search is called a text.
Say we have a pattern "ACA" and a text "ACBACAD". We can see that the pattern is
a substring of the text because it occurs there starting at index 3 and
ending at index 5 (assuming zero-based indexing). With this short
string, our watchful eye is enough, but in programming and with longer sequences
we need to come up with an algorithm that can solve the problem for an arbitrary
pattern and text.
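As a quick sanity check of this example, Python's built-in `str.find` returns the zero-based index of the first occurrence of a pattern, or -1 if there is none. The variable names below are only illustrative:

```python
text = "ACBACAD"
pattern = "ACA"

# str.find returns the zero-based index of the first occurrence, or -1.
start = text.find(pattern)
end = start + len(pattern) - 1  # index of the last matched symbol

print(start, end)  # 3 5
```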
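§2. The simplest algorithm
One of the simplest algorithms for finding a pattern in a text works as follows:
compare the pattern with the beginning of the text symbol by symbol. If all the
symbols match, an occurrence is found. Otherwise, shift the pattern one symbol to
the right and compare again, until an occurrence is found or the pattern no longer
fits in the text.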
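A minimal Python sketch of this brute-force search, assuming we only need the first occurrence. The function name `find_first` and the choice of -1 as the "not found" value are illustrative assumptions, not something the text prescribes:

```python
def find_first(text: str, pattern: str) -> int:
    """Return the index of the first occurrence of pattern in text, or -1."""
    n, m = len(text), len(pattern)
    # Try every shift at which the whole pattern still fits in the text.
    for shift in range(n - m + 1):
        matched = 0
        # Compare symbol by symbol, stopping at the first mismatch.
        while matched < m and text[shift + matched] == pattern[matched]:
            matched += 1
        if matched == m:
            return shift  # every symbol matched: occurrence found
    return -1  # assumed convention for "no occurrence"


print(find_first("ACBACAD", "ACA"))  # 3
print(find_first("ACBACAD", "ABC"))  # -1
```

If the pattern is longer than the text, the range of shifts is empty and the function returns -1 without any comparisons.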
§3. An example
The picture below illustrates how this algorithm works for the pattern "ACA" and
the text "ACBACAD". To please your eye and aid your comprehension, matching
symbols are shown in green, non-matching ones are red, and those not used in the
current step are colored with a gentle blue hue:
Here's what's happening: in the first step, we compare the pattern with the very
beginning of the text. The first two symbols match but, alas, the third doesn't. The
attempt isn't successful, so we shift the pattern to the right. In the second and the
third steps, we try to compare the corresponding symbols again but have an
obvious mismatch in the first symbol. In the fourth step, all the corresponding
symbols match, so an occurrence is successfully found.
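The four steps described above can also be reproduced in text form. The sketch below is only an illustration: the helper name `trace_search` and the `+`/`-` notation (a match and a mismatch, respectively) are assumptions made here, standing in for the green and red colors:

```python
def trace_search(text: str, pattern: str) -> list[str]:
    """Return one line per step: '+' marks a match, '-' the first mismatch."""
    lines = []
    m = len(pattern)
    for shift in range(len(text) - m + 1):
        marks = ""
        for j in range(m):
            if text[shift + j] == pattern[j]:
                marks += "+"
            else:
                marks += "-"
                break  # stop this step at the first mismatch
        lines.append(f"step {shift + 1}: shift {shift}, {marks}")
        if marks == "+" * m:
            break  # occurrence found, no further steps needed
    return lines


for line in trace_search("ACBACAD", "ACA"):
    print(line)
# step 1: shift 0, ++-
# step 2: shift 1, -
# step 3: shift 2, -
# step 4: shift 3, +++
```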
You can see that sometimes there is no need to process all the symbols of the
pattern. If we have a mismatch, say, in the first symbol, we don't need to compare
the rest. In case of failure, we can immediately shift the pattern and start a new
step: a nice life lesson to learn from.
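To get a feel for how much work the early exit saves, here is a rough sketch that counts symbol comparisons over every possible shift of the pattern, with and without stopping at the first mismatch. The function `count_comparisons` and its `stop_early` flag are hypothetical names introduced for this illustration:

```python
def count_comparisons(text: str, pattern: str, stop_early: bool) -> int:
    """Count symbol comparisons made while trying every shift of the pattern."""
    m = len(pattern)
    comparisons = 0
    for shift in range(len(text) - m + 1):
        for j in range(m):
            comparisons += 1
            if stop_early and text[shift + j] != pattern[j]:
                break  # abandon this shift at the first mismatch
    return comparisons


print(count_comparisons("ACBACAD", "ACA", stop_early=True))   # 9
print(count_comparisons("ACBACAD", "ACA", stop_early=False))  # 15
```

For this text and pattern there are five possible shifts, so comparing all three symbols every time costs 15 comparisons, while stopping at the first mismatch needs only 9.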