Trie and Suffix Trees

Uploaded by

Muin Sayyad

INDIAN INSTITUTE OF INFORMATION TECHNOLOGY, NAGPUR

Department of Computer Science and Engineering

DSA
Module 3
An introduction to string matching
String matching is an important branch of algorithms, with applications in many fields, such as:
Text searching
Molecular biology
Data compression
and so on…
Exact String Matching
Locating a given pattern in a string and standard solutions
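
As a baseline for exact matching, the sketch below checks the pattern against every window of the text. It is only the naive approach, not one of the standard solutions these slides refer to, and the function name find_all is an illustrative assumption.

```python
def find_all(text, pattern):
    """Return every 0-based start index where pattern occurs in text."""
    n, m = len(text), len(pattern)
    hits = []
    for start in range(n - m + 1):
        # Compare the pattern against the window text[start:start + m].
        if text[start:start + m] == pattern:
            hits.append(start)
    return hits


print(find_all("abaaba", "aba"))  # [0, 3]
print(find_all("abaaba", "baa"))  # [1]
```

This takes O(n·m) time in the worst case, which is what motivates index structures such as tries and suffix trees when many patterns are searched in the same text.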

String, substring, subsequence
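
The difference between a substring (contiguous) and a subsequence (order preserved, gaps allowed) can be illustrated with a small sketch. The helper names are illustrative assumptions.

```python
def is_substring(p, s):
    """True if p occurs in s as a contiguous block."""
    return p in s


def is_subsequence(p, s):
    """True if the characters of p appear in s in order, gaps allowed."""
    remaining = iter(s)
    # Each membership test consumes the iterator up to the matched
    # character, so later characters of p must be found after it.
    return all(ch in remaining for ch in p)


print(is_substring("aaa", "abaaba"))    # False
print(is_subsequence("aaa", "abaaba"))  # True
```

Every substring is also a subsequence, but not the other way round.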


Problem Solving
Find multiple patterns in a text
A good solution: suffix trees and tries
Compressed Trie
Build Trie
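
A minimal sketch of building a trie from a set of words, assuming one node per character and a flag marking where a word ends. The class and function names, and the sample words, are illustrative assumptions.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # character -> TrieNode
        self.is_word = False  # True if a stored word ends at this node


def build_trie(words):
    """Insert each word character by character, sharing common prefixes."""
    root = TrieNode()
    for word in words:
        node = root
        for ch in word:
            if ch not in node.children:
                node.children[ch] = TrieNode()
            node = node.children[ch]
        node.is_word = True
    return root


def contains(root, word):
    """Follow the path spelled by word; succeed only if a word ends there."""
    node = root
    for ch in word:
        node = node.children.get(ch)
        if node is None:
            return False
    return node.is_word


root = build_trie(["bear", "bell", "bid"])
print(contains(root, "bell"))  # True
print(contains(root, "be"))    # False (only a prefix of stored words)
```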
Trie
Applications:
What questions can a trie answer?
1. Is pattern P a substring of S?
◦ Follow the path spelled by P from the root; if a needed outgoing edge is missing, P is not a substring
◦ If all of P can be followed (the path continues toward a leaf), P is a substring
◦ E.g., baa: yes, a substring

2. Is P a suffix of S?
◦ If a needed outgoing edge is missing, P is not a suffix
◦ If the path for P runs all the way to a leaf, P is a suffix
◦ E.g., baa: not a suffix

3. Number of times P occurs as a substring of S
◦ If a needed outgoing edge is missing, P does not occur
◦ If the path ends at node n, the answer is the number of leaves in the subtree rooted at n
◦ E.g., aba: there are 2 leaves below node n, so aba occurs 2 times

4. Longest repeated substring of S
◦ Find the deepest node with more than one child; the path from the root to that node spells the answer
◦ E.g., here it is aba
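
The four queries above can be sketched on a suffix trie stored as nested dictionaries. This is a minimal sketch under two assumptions not stated in the source: the example string is S = abaaba (which is consistent with the examples above), and a terminator $ is appended so that every suffix ends at a leaf. All function names are illustrative.

```python
def build_suffix_trie(s, end="$"):
    """Insert every suffix of s + end into a nested-dict trie."""
    t = s + end
    root = {}
    for i in range(len(t)):
        node = root
        for ch in t[i:]:
            node = node.setdefault(ch, {})
    return root


def walk(root, p):
    """Follow the path spelled by p; return the node reached, or None."""
    node = root
    for ch in p:
        if ch not in node:
            return None  # no outgoing edge for this character
        node = node[ch]
    return node


def is_substring(root, p):
    return walk(root, p) is not None


def is_suffix(root, p, end="$"):
    # p is a suffix if the terminator can follow it, i.e. p + end reaches a leaf.
    node = walk(root, p)
    return node is not None and end in node


def count_leaves(node):
    if not node:  # an empty dict is a leaf
        return 1
    return sum(count_leaves(child) for child in node.values())


def count_occurrences(root, p):
    node = walk(root, p)
    return 0 if node is None else count_leaves(node)


def longest_repeated(root):
    """Label of the deepest node that has more than one child."""
    best = ""
    stack = [(root, "")]
    while stack:
        node, label = stack.pop()
        if len(node) > 1 and len(label) > len(best):
            best = label
        for ch, child in node.items():
            stack.append((child, label + ch))
    return best


trie = build_suffix_trie("abaaba")
print(is_substring(trie, "baa"))       # True
print(is_suffix(trie, "baa"))          # False
print(count_occurrences(trie, "aba"))  # 2
print(longest_repeated(trie))          # aba
```

An uncompressed suffix trie can need space quadratic in the length of S, so this form is only practical for short strings.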
Trie
Suffix Trees
Definition: A suffix tree for a string T of length m is a rooted tree such that:
1. It has exactly m leaves, numbered from 1 to m;
2. Every edge has a label, which is a substring of T;
3. Every internal node has at least two children;
4. Labels of two edges starting at an internal node do not start with the same character;
5. The label of the path from the root to the leaf numbered i is the suffix of T starting at position i, i.e. T[i..m]
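
To make the definition concrete, the sketch below builds a suffix tree by inserting the suffixes one at a time and splitting an edge where a new suffix diverges. This is a naive O(m^2) construction chosen for clarity, not a linear-time algorithm, and it assumes T ends with a terminator that occurs nowhere else (here T = abaaba$, an assumed example). Class and function names are illustrative.

```python
class Node:
    def __init__(self, leaf_id=None):
        self.children = {}      # first character -> (edge label, child Node)
        self.leaf_id = leaf_id  # 1-based suffix start for leaves, else None


def build_suffix_tree(t):
    """Naive construction; t must end with a unique terminator."""
    if not t or t.count(t[-1]) != 1:
        raise ValueError("t must end with a character that occurs only once")
    root = Node()
    for i in range(len(t)):
        node, rest = root, t[i:]
        while True:
            first = rest[0]
            if first not in node.children:
                # No edge starts with this character: add a new leaf edge.
                node.children[first] = (rest, Node(leaf_id=i + 1))
                break
            label, child = node.children[first]
            # Length of the common prefix of the edge label and the rest.
            k = 0
            while k < len(label) and k < len(rest) and label[k] == rest[k]:
                k += 1
            if k == len(label):
                # Whole edge matched: continue below the child.
                node, rest = child, rest[k:]
                continue
            # Mismatch inside the edge: split it with a new internal node.
            mid = Node()
            mid.children[label[k]] = (label[k:], child)
            mid.children[rest[k]] = (rest[k:], Node(leaf_id=i + 1))
            node.children[first] = (label[:k], mid)
            break
    return root


def leaves(node, path=""):
    """Yield (leaf number, root-to-leaf path label) for every leaf."""
    if not node.children:
        yield node.leaf_id, path
    for label, child in node.children.values():
        yield from leaves(child, path + label)


t = "abaaba$"
tree = build_suffix_tree(t)
for leaf_id, path in sorted(leaves(tree)):
    print(leaf_id, path)  # leaf i spells the suffix t[i-1:]
```

With T = abaaba$ the tree has 7 leaves, and the path to leaf i spells the suffix starting at position i, matching properties 1 and 5.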
Suffix trees using trie
Generalized Suffix Trees

A generalized suffix tree is simply a suffix tree for a set of strings, each one ending with a
different marker. The leaves carry two numbers, one identifying the string and the other
identifying the position inside the string.

[Figure: generalized suffix tree for S1 = abbc$ and S2 = babc#, with leaves labeled (1,1) to (1,4) and (2,1) to (2,4)]
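
The (string, position) leaf labels can be sketched for S1 = abbc$ and S2 = babc#. For brevity this sketch uses an uncompressed trie rather than a compressed tree, and it skips the suffixes that consist only of the end marker, so that it produces the eight leaf labels (1,1) to (2,4). The function names and the use of None as the key for a leaf label are illustrative assumptions.

```python
def build_generalized_suffix_trie(strings):
    """Insert the suffixes of every string; leaves store (string id, position)."""
    root = {}
    for sid, s in enumerate(strings, start=1):
        # Skip the last suffix, which is just the end marker.
        for pos in range(len(s) - 1):
            node = root
            for ch in s[pos:]:
                node = node.setdefault(ch, {})
            node[None] = (sid, pos + 1)  # 1-based position inside the string
    return root


def occurrences(root, pattern):
    """Return the sorted (string id, position) pairs where pattern starts."""
    node = root
    for ch in pattern:
        if ch not in node:
            return []
        node = node[ch]
    found, stack = [], [node]
    while stack:
        current = stack.pop()
        for key, value in current.items():
            if key is None:
                found.append(value)  # leaf label
            else:
                stack.append(value)  # child node
    return sorted(found)


trie = build_generalized_suffix_trie(["abbc$", "babc#"])
print(occurrences(trie, "bc"))  # [(1, 3), (2, 3)]
print(occurrences(trie, "ab"))  # [(1, 1), (2, 2)]
```

Because each string ends with its own marker, no suffix of one string is a prefix of a suffix of another, so every suffix gets its own leaf.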
Generalized Suffix Trees

Longest Common Palindrome (LCP)
