Tries and Radix Tree1

Uploaded by

KUNCHANGI GNANA SURYA DEEPIKA 421205

Tries, radix trees, suffix trees
Tries
A trie (pronounced try) is a data structure for
representing a set of strings
● It can also be used for a map where the keys are
strings
● Or where the key is a list of some kind
It is a kind of tree, but not based on
comparisons
Name: pun on retrieval and tree
● Originally pronounced “tree”, but now pronounced
“try” to avoid confusion with trees…
Tries
This trie represents the set {“cat”, “cats”, “cow”, “pig”, “pin”}:
[Figure: the trie for this set; edges are labelled with characters]
● Double circle: “cat” is in the set (concatenate all the characters on the path from the root to this node: c-a-t)
● Single circle: “co” is not in the set
● Invariant: no node has two edges with the same label
● Invariant: all leaves are “double circled”
Tries, more formally
A trie is a tree where edges are labelled with characters
● Represents a set or map of strings
● More generally, keys can be lists; the edges are labelled with single
elements
Each node in the tree represents a string
● Which string? Follow the path from the root to the node and
concatenate the characters on those edges
Some nodes are marked as corresponding to an element of
the set
● In diagrams: a double circle
Invariant:
● Each node has at most one outgoing edge with a given label
● All leaf nodes are “double circles” (they represent elements of the set)
Tries
To check if a string is in the set, just follow
the edges, starting from the root!
Double circle: “cat” is in the set
Tries
To insert a new string, also follow the edges,
making new nodes as you go
● The final node should
be a “double circle”
Tries
Inserting “pi” creates no new nodes, but we
mark the final node as a “double circle”
Tries
To delete a string, we first turn the node into
a “single circle”…
Example: deleting “crow”
Tries
If the node is a leaf, we should remove it. We
go up the tree removing any single-circled
leaves, which restores the invariant:
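The two deletion steps can be sketched in Python as follows. This is only an illustration: it assumes each node keeps its children in a dict and records the “double circle” in a Boolean field, and the names TrieNode, is_member, insert and delete are made up for this sketch.

```python
class TrieNode:
    def __init__(self):
        self.children = {}      # edge label (one character) -> child node
        self.is_member = False  # True means "double circle"


def insert(root, word):
    node = root
    for ch in word:
        if ch not in node.children:
            node.children[ch] = TrieNode()
        node = node.children[ch]
    node.is_member = True


def delete(root, word):
    """Remove word from the trie; return False if it was not in the set."""
    path = []  # (parent, label) for every edge followed on the way down
    node = root
    for ch in word:
        child = node.children.get(ch)
        if child is None:
            return False
        path.append((node, ch))
        node = child
    if not node.is_member:
        return False
    node.is_member = False  # step 1: turn the node into a "single circle"
    # Step 2: walk back up, removing single-circled leaves.
    while path and not node.is_member and not node.children:
        parent, ch = path.pop()
        del parent.children[ch]
        node = parent
    return True


root = TrieNode()
for w in ["cat", "cats", "cow", "crow"]:
    insert(root, w)
delete(root, "crow")  # removes the chain r-o-w below "c", keeps "cow"
```

Deleting “cat” from this trie would only clear the flag, because its node still has the child leading to “cats”.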
Tries – other neat things we can do
Given a set of strings stored as a trie, we can:
● Find all strings starting with a given prefix
(for this reason a trie is often called a prefix tree)
We can take the union or intersection of two tries
● Linear time, but much faster if the two tries are mostly disjoint
If we can iterate over all edges of a node in
alphabetical order, we can also:
● Generate a list of strings in dictionary order (i.e., we can use a
trie for sorting)
● Find all strings lying between two words in dictionary order
(e.g., all words in the set that are after “chicken” but before
“pickle” in the dictionary)
Tries
To find all strings starting with a prefix, just
follow the edges along that prefix:
Then return all words in the subtree below the node you reach
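A possible Python sketch of the prefix search. It assumes nodes with a dict of children and a Boolean is_member flag; these names, and the helper words_with_prefix, are illustrative and not taken from the slides.

```python
class TrieNode:
    def __init__(self):
        self.children = {}      # edge label (one character) -> child node
        self.is_member = False  # True means "double circle"


def insert(root, word):
    node = root
    for ch in word:
        if ch not in node.children:
            node.children[ch] = TrieNode()
        node = node.children[ch]
    node.is_member = True


def words_with_prefix(root, prefix):
    # Follow the edges along the prefix.
    node = root
    for ch in prefix:
        node = node.children.get(ch)
        if node is None:
            return []  # no string in the set starts with this prefix
    # Collect every double-circled node in the subtree that was reached.
    results = []
    stack = [(node, prefix)]
    while stack:
        current, text = stack.pop()
        if current.is_member:
            results.append(text)
        for ch, child in current.children.items():
            stack.append((child, text + ch))
    return results


root = TrieNode()
for w in ["cat", "cats", "cow", "pig", "pin"]:
    insert(root, w)
```

The results come back in no particular order; sort them, or visit the children in alphabetical order, if dictionary order is needed.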
Tries
To find all strings between “chicken” and
“pickle”, just follow all edges that lie between
them in dictionary order:
● “ca” is too small
● “pin” is too big
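One way to sketch this range query in Python, under the same assumptions as before (dict of children, Boolean is_member flag, hypothetical names). A subtree is skipped when its string is already too big, or when it is too small and is not a prefix of the lower bound. The words “dog” and “pea” are added to the example set only to make the result more interesting.

```python
class TrieNode:
    def __init__(self):
        self.children = {}      # edge label (one character) -> child node
        self.is_member = False  # True means "double circle"


def insert(root, word):
    node = root
    for ch in word:
        if ch not in node.children:
            node.children[ch] = TrieNode()
        node = node.children[ch]
    node.is_member = True


def words_between(root, lo, hi):
    """Return all stored words w with lo < w < hi, in dictionary order."""
    results = []

    def visit(node, text):
        if text >= hi:
            return  # too big: every word below this node is >= hi
        if text < lo and not lo.startswith(text):
            return  # too small: every word below this node is < lo
        if node.is_member and text > lo:
            results.append(text)
        for ch in sorted(node.children):  # alphabetical order of edges
            visit(node.children[ch], text + ch)

    visit(root, "")
    return results


root = TrieNode()
for w in ["cat", "cats", "cow", "pig", "pin", "dog", "pea"]:
    insert(root, w)
```

Here words_between(root, "chicken", "pickle") skips the “ca” subtree as too small and both “pig” and “pin” as too big.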
Tries – implementation
How to represent a trie? Not obvious:
● Edges are labelled
● Each node can have many edges
One reasonable choice: each node carries a map from label to
child node
● e.g., using a hash table means that following an edge will take O(1) time
“Double circles” are recorded by having a Boolean field in the
node object
● If the trie is used as a map, each node object can contain a value
Just as with any tree data structure, the trie itself is
represented as a reference to the root node
Fairly simple to implement!
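A minimal Python sketch of this representation, using a dict as the map from label to child. The class and field names (TrieNode, Trie, is_member) are assumptions made for the example.

```python
class TrieNode:
    """One node of a trie; the path from the root spells the node's string."""

    def __init__(self):
        self.children = {}      # map from edge label (one character) to child node
        self.is_member = False  # Boolean field recording the "double circle"


class Trie:
    def __init__(self):
        self.root = TrieNode()  # the trie itself is a reference to the root node

    def insert(self, word):
        node = self.root
        for ch in word:
            # Follow the edge labelled ch, making a new node if it is missing.
            if ch not in node.children:
                node.children[ch] = TrieNode()
            node = node.children[ch]
        node.is_member = True   # the final node becomes a "double circle"

    def contains(self, word):
        node = self.root
        for ch in word:
            node = node.children.get(ch)
            if node is None:
                return False    # no edge with this label
        return node.is_member


trie = Trie()
for w in ["cat", "cats", "cow", "pig", "pin"]:
    trie.insert(w)
```

To use the trie as a map, each node could hold a value field in place of (or alongside) the Boolean.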
Tries – performance
Trie operations take O(w) time, where w is
the length of the string being looked up, inserted or deleted
● Independent of the number of strings stored in the
trie!
Is this better or worse than BSTs?
● Better if the trie consists of many short strings,
because each node will have many children
● Worse if the trie consists of few long strings, because
many nodes will only have one child
Tries – a bad case for performance
Tries containing few long strings perform
worse than BSTs
Many nodes have one child!
Long chains of nodes
without any branching
Radix trees are a refinement of
tries that only introduce nodes
when branching is needed
Radix trees
Idea: label edges with strings rather than
characters, and compress chains of nodes
into a single string
Radix trees
Finding values in a radix tree works the same
as in a trie
● Important invariant: each node only has one outgoing
edge starting with each letter!
● Can also maintain: each non-double-circled node has at
least 2 children

Theorem:
number of nodes is at most 2n,
where n is size of set!
Radix trees
Insertion works like in a trie, except that you
sometimes have to split an edge into two
E.g. to insert “cabbie”, we have to split
“bbage” into “bb” and “age”:
Radix trees – implementation
To navigate in a radix tree we need to be able
to look up a character and get the outgoing
edge starting with that character
So, each node stores its outgoing edges as a
map:
● the key is the first character of the label
● the value is a pair (rest of the label, target node)
Apart from that, implementation is similar
to tries
● Main other difference: splitting an edge in two
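A sketch of this edge representation in Python, including the edge split during insertion. Each node maps the first character of a label to a pair (rest of the label, target node); the names RadixNode, edges, insert and contains are hypothetical.

```python
class RadixNode:
    def __init__(self):
        self.edges = {}         # first character -> (rest of label, target node)
        self.is_member = False  # True means "double circle"


def common_prefix_len(a, b):
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return n


def insert(root, word):
    node = root
    while word:
        first, tail = word[0], word[1:]
        if first not in node.edges:
            # No edge starts with this character: one new edge holds the remainder.
            leaf = RadixNode()
            leaf.is_member = True
            node.edges[first] = (tail, leaf)
            return
        rest, child = node.edges[first]
        k = common_prefix_len(rest, tail)
        if k < len(rest):
            # The word leaves the edge part-way along it: split the edge in two.
            middle = RadixNode()
            middle.edges[rest[k]] = (rest[k + 1:], child)
            node.edges[first] = (rest[:k], middle)
            child = middle
        node = child
        word = tail[k:]
    node.is_member = True


def contains(root, word):
    node = root
    while word:
        entry = node.edges.get(word[0])
        if entry is None:
            return False
        rest, child = entry
        if not word[1:].startswith(rest):
            return False  # the word diverges from the edge label
        node = child
        word = word[1 + len(rest):]
    return node.is_member


root = RadixNode()
insert(root, "cabbage")
insert(root, "cabbie")  # splits the edge "cabbage" into "cabb" and "age"
```

In this small example the tree starts empty, so the split happens on the edge “cabbage” rather than on “bbage” as in the slide's larger tree, but the step is the same.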
Suffix trees
A suffix tree is a radix tree that stores all suffixes of a
given string
● Example: suffixes of “catinthehat” are:
“catinthehat”, “atinthehat”, “tinthehat”, etc.
Why? It can be used to search for all occurrences of a
given substring in a string
● In a radix tree, you can find all strings that start with a given
prefix
● In a suffix tree, you can find all suffixes of a string that start
with a given prefix
● This is the same as finding all occurrences of the prefix
● “at” is a substring of “catinthehat” if and only if some suffix of
“catinthehat” starts with “at”
Suffix trees
A suffix tree for “catinthehat”:

To find “at” in “catinthehat”, let’s check which suffixes start with “at”. There are two:
● “atinthehat” → at occurs followed by “inthehat”
● “at” → at occurs followed by the end of the string
From the length of the suffix we can tell what positions
“at” occurs at!
● “catinthehat”.length - “atinthehat”.length = 1
● “catinthehat”.length - “at”.length = 9
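The position arithmetic can be checked with a few lines of Python. This sketch deliberately does not build a suffix tree: it scans every suffix directly, which is slow, and only illustrates how the suffix length gives the position. The function name occurrences is made up for the example.

```python
def occurrences(text, pattern):
    """Positions where pattern occurs in text, found by checking every suffix."""
    positions = []
    for start in range(len(text)):
        suffix = text[start:]
        if suffix.startswith(pattern):
            # Position of the occurrence = text length - suffix length.
            positions.append(len(text) - len(suffix))
    return positions


print(occurrences("catinthehat", "at"))  # prints [1, 9]
```

A suffix tree finds the same suffixes by following the edges for “at” from the root and collecting the subtree below, instead of testing each suffix in turn.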
Suffix trees, implementation

If implemented carelessly, this takes O(n²) memory!

● Each edge is labelled with a substring of the input string,
which may take O(n) memory to store
The trick:
● Remember the original input string
● Label each edge with data that records which substring of the
input string it is
One way: a pair (position in input string, length)
e.g. “intheh” would become (3, 6) – starts at index 3 of
“catinthehat” and goes on for 6 characters
Suffix trees, implementation

(We also need to remember that the input string is “catinthehat”.)

This takes O(n) memory! (Recall that a radix tree containing n
values has at most 2n nodes)
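A tiny Python sketch of the (position, length) trick. The helper name label_text is an assumption; the point is that each edge stores two integers and the label is recovered from the remembered input string only when needed.

```python
text = "catinthehat"  # the remembered input string


def label_text(text, label):
    """Recover an edge label from its (position, length) pair."""
    position, length = label
    return text[position:position + length]


print(label_text(text, (3, 6)))  # prints intheh
```

Every edge now costs a constant amount of memory regardless of how long its label is, which is what makes the O(n) total possible.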
Radix trees for numbers (not on exam)
You can view an integer as a sequence of digits
● e.g. 12345 → [1, 2, 3, 4, 5]
So you can use radix trees to store sets of numbers!
Example: {123, 456, 12456}

Note: we pad the numbers with leading zeroes when necessary, so that we can do range queries
like “find all the numbers between 100 and 500”
Radix trees for numbers,
implementation (not on exam)
We can use several tricks to implement radix
trees for numbers super-efficiently!
● Instead of storing the children of each node in a map,
store them in an array of size 10 (one for each digit)
● Store strings of digits as a pair
(number, length of number)
e.g. [0,1,3] becomes (13, 3)
● Don’t use base 10 but (e.g.) base 16, so that we can
use e.g. bit-shifting instead of division – in other
words, we view an integer as a list of 4-bit numbers
(“hexadecimal digits”)
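The base-16 view can be sketched in Python with shifts and masks. The function name hex_digits and its width parameter (the padded number of digits) are assumptions made for this example.

```python
def hex_digits(value, width):
    """Split a non-negative integer into `width` 4-bit digits, most significant first.

    Leading zero digits are kept, which matches padding with leading zeroes.
    """
    digits = []
    for i in range(width - 1, -1, -1):
        # Shift the wanted digit down to the low 4 bits, then mask it out.
        digits.append((value >> (4 * i)) & 0xF)
    return digits


print(hex_digits(0x1A3, 4))  # prints [0, 1, 10, 3]
```

Each digit is a number from 0 to 15, so it can index directly into a child array of size 16.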
Radix trees for strings,
using radix trees for numbers
If we have a radix tree for lists of 4-bit numbers
we can use it to store strings!
We can view a string as a series of 4-bit numbers:
● Each character has a character code, which is an 8 to 32-bit
number (depending on the encoding)
● Chop up the string into a list of character codes
e.g. “hello” → [104, 101, 108, 108, 111]
● Chop up each character code into 4-bit pieces
e.g. 108 = (binary) 0110 1100 = [0110, 1100]
● Now you have a series of 4-bit numbers
For efficiency, radix trees for strings are often
implemented this way!
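The chopping steps can be sketched in Python as below. The sketch assumes UTF-8 encoding, so each code unit is one 8-bit byte that splits into exactly two 4-bit pieces; the function name string_to_nibbles is hypothetical.

```python
def string_to_nibbles(text):
    """Turn a string into a list of 4-bit numbers (assumes UTF-8 encoding)."""
    nibbles = []
    for byte in text.encode("utf-8"):
        nibbles.append(byte >> 4)   # high 4 bits
        nibbles.append(byte & 0xF)  # low 4 bits
    return nibbles


print(list("hello".encode("utf-8")))  # prints [104, 101, 108, 108, 111]
print(string_to_nibbles("l"))         # prints [6, 12], i.e. 0110 1100
```

The resulting list can be used as the key in a radix tree whose nodes have at most 16 children.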
Summary
Radix trees can be used to implement sets or maps, where the
keys are lists
● e.g. strings, or numbers treated as strings of base-16 “digits”
● Requires a map data structure for list elements, often implemented as an array
Time taken by each operation is low!
● O(min(w, log n)) where w is length of string, n is size of set
Tries are simpler to implement, but have O(w) performance
Both also support: finding strings starting with a given prefix,
range queries
● Also union and intersection, which we didn’t see
Suffix trees are radix trees which store all suffixes of a string, and
can be used to find all occurrences of a given substring
● A suffix tree can be built in O(n) time (which we didn’t see)
● Then searching for a substring of length m takes O(m) time, plus time proportional to the number of occurrences reported
