Trie Insertion

A trie is a tree data structure that stores a dynamic set of strings and supports efficient retrieval of strings by their prefixes. Looking up a whole key takes time proportional to the key length, as with a hash table, but a trie usually needs more memory accesses per lookup and more memory per stored character. A trie stores strings as sequences of node traversals from the root, with each link representing a character. Common prefixes are shared across subtrees, which reduces memory usage compared to storing each string separately. Compressed tries merge internal nodes that have a single child into their parents to reduce memory usage further. PATRICIA tries process keys by comparing them one bit at a time.

Uploaded by

Linux Things

TRIE

PROBLEM: Given a dictionary that contains a list of strings, and a query string, we want to check whether or not the query string is in the dictionary.

1. A hash table (also called a hash map) is a data structure that is used to map
keys to values in an unsorted way. In our problem, we can treat each string in
the dictionary as a key to the hash table.

Time complexity: Since a hash function needs to consider all characters of the
input string, computing the hash is O(n), where n is the length of the input string.
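
The hash-table approach described above can be sketched with the standard library's std::unordered_set. This is only a minimal illustration; the function names build_dictionary and in_dictionary are hypothetical and do not come from the original notes.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Build a hash set from the dictionary; each string acts as a key.
std::unordered_set<std::string> build_dictionary(const std::vector<std::string>& words) {
    return std::unordered_set<std::string>(words.begin(), words.end());
}

// Hashing reads every character of the query, so an average lookup costs
// O(n), where n is the length of the query string.
bool in_dictionary(const std::unordered_set<std::string>& dict, const std::string& query) {
    return dict.find(query) != dict.end();
}
```

With a dictionary built from "abc", "xy" and "xyz", in_dictionary reports "xy" as present and "x" as absent, since a hash set only answers whole-key queries.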

2. A trie, or prefix tree, is a particular kind of search tree whose
nodes are usually keyed by strings.
In a trie, a link between two nodes represents a character in the
keyed string.
As an example, consider the trie that stores the key-value pairs ("abc", 1), ("xy", 2), ("xyz", 5),
("abb", 9), ("xyzb", 8), ("word", 5).
Comparisons

Lookup Speed

When we look up a string in a hash table, we first calculate the hash value of the
string, which takes O(n) time. Then, it takes O(1) time to locate the hash value in
memory, assuming we have a good hash function. Therefore, the overall
lookup time complexity is O(n).
When we look up a string in a trie, we go through each character of the string and
locate its corresponding node in the trie. The overall lookup time complexity is
also O(n).
However, the trie has more overhead when retrieving the whole string. We need
to access memory multiple times to locate the trie nodes along the character
path. For the hash table, we only need to compute the hash value of the input
string once. Therefore, looking up a whole string is usually somewhat faster in a
hash table.
Memory Requirement

When we first construct a hash table, we normally pre-allocate a big chunk of
memory and hash uniformly over its size to keep collisions rare. When we later
insert a string into the hash table, we only need to store the string
content.
For a trie data structure, we need to store extra data such as character link
pointers and end-of-key flags. Therefore, the trie requires more memory to
store the string data.
However, if there are many common prefixes, the memory requirement becomes
smaller because the prefix nodes are shared.
Overall, which of the two structures needs more memory depends on the size
of the pre-allocated hash table and on the input dictionary strings.
Applications
Trie: can quickly look up prefixes of keys, enumerate all entries with
a given prefix, etc.
Trie advantages:
Predictable O(k) lookup time, where k is the length of the key
A lookup can stop after fewer than k steps if the key is not there
Supports ordered traversal
No need for a hash function
Deletion is straightforward
It all depends on what problem you're trying to solve. If all you
need to do is insertions and lookups, go with a hash table. If you
need to solve more complex problems such as prefix-related
queries, then a trie might be the better solution.
Trie insertion
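#include <cstddef>  // NULL
#include <string>   // std::string, used by the functions that follow

struct TrieNode {
    // Pointer array for the child nodes of each node, one slot per letter a-z
    TrieNode *childNode[26];
    // Number of inserted keys that end at this node
    int wordEndCnt;

    // Constructor
    TrieNode()
    {
        // Initialize wordEndCnt with 0 and every index of the childNode array with NULL
        wordEndCnt = 0;
        for (int i = 0; i < 26; i++)
        {
            childNode[i] = NULL;
        }
    }
};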
Each TrieNode has up to 26 children, one per letter from a to z, referenced
through an array of TrieNode pointers.

Each node has an integer variable wordEndCnt. This variable
stores the number of strings in the trie that are exactly equal to
the prefix represented by that node.

Inside the TrieNode structure we define a constructor
TrieNode(), which initializes every index of the childNode pointer
array with NULL whenever a new node is created. It also
initializes the wordEndCnt value of the node with 0.
TrieNode* insert_key(TrieNode *root, const std::string &key){
    // Initialize the currentNode pointer with the root node
    TrieNode *currentNode = root;

    // Store the length of the key string
    int length = key.size();

    // Iterate across the length of the string
    for (int i = 0; i < length; i++)
    {
        // Index of the current character X, assuming lowercase a-z
        int index = key[i] - 'a';

        // Check whether the X-'a' th index is NULL or not
        if (currentNode->childNode[index] == NULL)
        {
            // If NULL, make a new node
            TrieNode *newNode = new TrieNode();
            // Point the X-'a' th index of the current node to the new node
            currentNode->childNode[index] = newNode;
        }

        // Move the current node pointer to the child node (new or existing)
        currentNode = currentNode->childNode[index];
    }

    // Record that one more key ends at this node
    currentNode->wordEndCnt++;

    // Return the root node
    return root;
}

Implementation of the Search Operation in a Trie Data Structure
bool search_key(TrieNode *root, const std::string &queryString){
    // Initialize the currentNode pointer with the root node
    TrieNode *currentNode = root;

    // Store the length of the query string
    int length = queryString.size();

    for (int i = 0; i < length; i++)
    {
        // Index of the current character X, assuming lowercase a-z
        int index = queryString[i] - 'a';

        // If the X-'a' th index is NULL, the query string is not present in the trie
        if (currentNode->childNode[index] == NULL)
        {
            return false;
        }

        // Otherwise move the currentNode pointer to the node pointed to by that index
        currentNode = currentNode->childNode[index];
    }

    // Return true only if currentNode is not NULL and its wordEndCnt is greater than 0
    return currentNode != NULL && currentNode->wordEndCnt > 0;
}
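
To see insertion and search working together, here is a compact, self-contained sketch of the same idea. It assumes keys contain only the lowercase letters a-z. The helpers starts_with and free_trie are hypothetical additions for illustration and are not part of the original notes.

```cpp
#include <string>

struct TrieNode {
    TrieNode* childNode[26] = {};  // all children start as nullptr
    int wordEndCnt = 0;            // number of inserted keys ending at this node
};

// Insert key (assumed lowercase a-z), creating nodes along the path as needed.
void insert_key(TrieNode* root, const std::string& key) {
    TrieNode* cur = root;
    for (char c : key) {
        int idx = c - 'a';
        if (cur->childNode[idx] == nullptr) cur->childNode[idx] = new TrieNode();
        cur = cur->childNode[idx];
    }
    cur->wordEndCnt++;
}

// True only if key was inserted as a whole word.
bool search_key(const TrieNode* root, const std::string& key) {
    const TrieNode* cur = root;
    for (char c : key) {
        cur = cur->childNode[c - 'a'];
        if (cur == nullptr) return false;  // path breaks early: key is absent
    }
    return cur->wordEndCnt > 0;
}

// Hypothetical helper: true if at least one stored key starts with prefix.
bool starts_with(const TrieNode* root, const std::string& prefix) {
    const TrieNode* cur = root;
    for (char c : prefix) {
        cur = cur->childNode[c - 'a'];
        if (cur == nullptr) return false;
    }
    return true;
}

// Hypothetical helper: release every node of the trie.
void free_trie(TrieNode* node) {
    if (node == nullptr) return;
    for (TrieNode* child : node->childNode) free_trie(child);
    delete node;
}
```

After inserting "xy" and "xyz", search_key finds "xy" but not "x", while starts_with accepts "x". The difference between the two is exactly the wordEndCnt check at the end of the path.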
Deletion

1. If key 'k' is not present in the trie, then the trie should not be modified in any way.
2. If key 'k' is not a prefix of any other key, no other key is a prefix of 'k', and the nodes of 'k' are not part
of any other key, then all the nodes from the root node (excluding the root node) down to the
leaf node of key 'k' should be deleted.
3. If key 'k' is a prefix of some other key, then the node at which key 'k' ends should
be marked as 'not the end of a key'. No node should be deleted in this case.
4. If some other key 'k1' is a prefix of key 'k' (that is, 'k' extends 'k1'), then all nodes of key 'k' which are not part
of key 'k1' should be deleted.

5. If key 'k' falls under neither case 3 nor case 4 but some nodes of key 'k' are
shared with some other key 'k1', then the nodes of key 'k' which are not common to any
other key should be deleted and the shared nodes should be kept intact.
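
The deletion cases above can be covered by one recursive routine that first unmarks the key and then prunes, on the way back up, every node that neither ends a key nor has children. The following is a sketch under the assumption of lowercase a-z keys; delete_key, remove_helper, has_children and count_nodes are hypothetical names chosen for illustration.

```cpp
#include <cstddef>
#include <string>

struct TrieNode {
    TrieNode* childNode[26] = {};
    int wordEndCnt = 0;
};

void insert_key(TrieNode* root, const std::string& key) {
    TrieNode* cur = root;
    for (char c : key) {
        int idx = c - 'a';
        if (cur->childNode[idx] == nullptr) cur->childNode[idx] = new TrieNode();
        cur = cur->childNode[idx];
    }
    cur->wordEndCnt++;
}

bool has_children(const TrieNode* node) {
    for (const TrieNode* child : node->childNode)
        if (child != nullptr) return true;
    return false;
}

// Returns true if the key was present and one occurrence was removed.
bool remove_helper(TrieNode* node, const std::string& key, std::size_t depth) {
    if (depth == key.size()) {
        if (node->wordEndCnt == 0) return false;  // case 1: key not stored
        node->wordEndCnt--;                       // case 3: only unmark the node
        return true;
    }
    int idx = key[depth] - 'a';
    TrieNode* child = node->childNode[idx];
    if (child == nullptr) return false;           // case 1: path does not exist
    if (!remove_helper(child, key, depth + 1)) return false;
    // Cases 2, 4, 5: drop the child only if no key ends there or passes through it.
    if (child->wordEndCnt == 0 && !has_children(child)) {
        delete child;
        node->childNode[idx] = nullptr;
    }
    return true;
}

bool delete_key(TrieNode* root, const std::string& key) {
    return remove_helper(root, key, 0);  // the root itself is never deleted
}

// Counts all nodes including the root; useful for checking what was pruned.
std::size_t count_nodes(const TrieNode* node) {
    std::size_t total = 1;
    for (const TrieNode* child : node->childNode)
        if (child != nullptr) total += count_nodes(child);
    return total;
}
```

Because pruning stops at the first node that still ends a key or still has another child, shared prefixes survive automatically and no case needs special handling.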
Compressed Trie
Standard Trie: The size of a trie is directly correlated to the size of all
the possible values that the trie could represent.
The first thing to notice in the example trie is that there
are two keys for which we have redundant nodes as well as a
redundant chain of edges.
A redundant node is one that takes up an undue amount of space
because it only has one child node of its own. For the
key "deck", the node for the character "e" is redundant, because it
only has a single child node, but we still have to initialize an entire
node, with all of its pointers, as a result.
Similarly, the edges that connect the key "did" are redundant, as
they connect redundant nodes that don't really all need to be
initialized, since they each have only one child node of their own.
The redundancy of a standard trie comes from the fact that we are
repeating ourselves by allocating space for nodes or edges that
contain only one possible string or word. Another way to think about it
is that we repeat ourselves by allocating a lot of space for something
that only has one possible branch path.
RULE FOR COMPRESSED TRIE
Each internal node (every parent node) must have two or more child nodes. If a
parent has two child nodes, that is, two branch paths to potential leaf nodes,
then it doesn't need to be compressed, since we actually need to allocate space
and memory for both of these branch paths.
However, if a parent node only has one child node, that is to say, if it only has
one possible branch path to a leaf node, then it can be compressed. In order to
do the work of "compacting" the trie, each node that is the only child of its
parent node is merged into its parent. The parent node and the single-child node
are fused together, as are the values that they contain.
Compressed tries are also known as radix trees, radix tries, or compact prefix
trees.
A radix tree is a space-optimized version of a standard trie. Unlike in regular tries, the
references/edges/pointers of a radix tree can hold a whole sequence of characters, and
not just a single character element.
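
One possible way to realize the compression rule is to let every node carry the whole substring on the edge that leads to it, and to split an edge only when a new key diverges in the middle of its label. The sketch below assumes this layout; RadixNode, radix_insert, radix_search, common_prefix and count_nodes are hypothetical names, and other radix tree implementations may be organized differently.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>

struct RadixNode {
    std::string label;    // characters on the edge leading into this node
    bool isWord = false;  // true if a stored key ends here
    std::map<char, std::unique_ptr<RadixNode>> children;  // keyed by first label char
};

std::size_t common_prefix(const std::string& a, const std::string& b) {
    std::size_t i = 0;
    while (i < a.size() && i < b.size() && a[i] == b[i]) ++i;
    return i;
}

void radix_insert(RadixNode* node, const std::string& key) {
    if (key.empty()) { node->isWord = true; return; }
    auto it = node->children.find(key[0]);
    if (it == node->children.end()) {
        // No edge starts with this character: hang the whole rest on one edge.
        auto leaf = std::make_unique<RadixNode>();
        leaf->label = key;
        leaf->isWord = true;
        node->children[key[0]] = std::move(leaf);
        return;
    }
    RadixNode* child = it->second.get();
    std::size_t k = common_prefix(child->label, key);
    if (k < child->label.size()) {
        // The key diverges inside the label: split the edge at position k.
        auto mid = std::make_unique<RadixNode>();
        mid->label = child->label.substr(0, k);
        child->label = child->label.substr(k);
        mid->children[child->label[0]] = std::move(it->second);
        it->second = std::move(mid);
        child = it->second.get();
    }
    radix_insert(child, key.substr(k));
}

bool radix_search(const RadixNode* node, const std::string& key) {
    std::size_t pos = 0;
    while (pos < key.size()) {
        auto it = node->children.find(key[pos]);
        if (it == node->children.end()) return false;
        const std::string& label = it->second->label;
        if (key.compare(pos, label.size(), label) != 0) return false;
        pos += label.size();
        node = it->second.get();
    }
    return node->isWord;
}

// Counts all nodes including the root.
std::size_t count_nodes(const RadixNode* node) {
    std::size_t total = 1;
    for (const auto& entry : node->children) total += count_nodes(entry.second.get());
    return total;
}
```

Inserting "deck" and "did" yields a root plus three nodes labeled "d", "eck" and "id", where a standard trie would need one node per character.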
PATRICIA tree.
A trie’s keys could be read and processed a byte at a time, half a byte at a time, or two bits at a
time. However, there is one particular type of radix tree that processes keys in a really
interesting way, called a PATRICIA tree.
PATRICIA stands for “Practical Algorithm To Retrieve Information Coded In Alphanumeric”.

The most important thing to remember about a PATRICIA tree is that its radix is 2. Keys in a
radix tree are compared r bits at a time, where 2 to the power of r is
the radix of the tree, so we can use this relationship to figure out how a PATRICIA tree reads a key.

Since the radix of a PATRICIA tree is 2, we know that r must be equal to 1, since 2¹ = 2. Thus, a
PATRICIA tree processes its keys one bit at a time.
Let's say that we want to turn our original set of keys, ["dog", "doge",
"dogs"], into a PATRICIA tree representation. Since a PATRICIA tree
reads keys one bit at a time, we'll need to convert these strings
into binary so that we can look at them bit by bit.
dog: 01100100 01101111 01100111
doge: 01100100 01101111 01100111 01100101
dogs: 01100100 01101111 01100111 01110011
Notice how the key "dog" is a prefix of both "doge" and "dogs". The binary
representations of these words are exactly the same for the first 24 bits, and the longer
keys first go beyond "dog" at bit 25. Interestingly, "doge" and "dogs" share an even longer
binary prefix: their representations are the same for the first 27 bits and first
differ at bit 28.

Since we know that "dog" is a prefix of "doge", we will compare them bit by bit. The
point at which they diverge is bit 25, where "doge" has a value of 0. Since we know
that our binary radix tree can only have 0's and 1's, we just need to put "doge" in the
correct place. Since it diverges with a value of 0, we'll add it as the left child node of
our root node "dog".
Now we'll do the same thing with "dogs". Since "dogs" differs from "doge" at bit
28, we'll compare bit by bit up until that point.
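
The bit positions quoted above can be checked with a few lines of code. This sketch assumes 8-bit ASCII characters and numbers bits from 1, starting at the most significant bit of the first character. The names to_bits, bit_at and first_diff_bit are hypothetical, and the snippet only compares keys; it does not build a PATRICIA tree.

```cpp
#include <algorithm>
#include <bitset>
#include <cstddef>
#include <string>

// Binary representation of a key, one 8-bit group per character.
std::string to_bits(const std::string& key) {
    std::string out;
    for (std::size_t i = 0; i < key.size(); ++i) {
        if (i > 0) out += ' ';
        out += std::bitset<8>(static_cast<unsigned char>(key[i])).to_string();
    }
    return out;
}

// Bit number i of key (1-based, most significant bit first); 0 past the end.
int bit_at(const std::string& key, std::size_t i) {
    std::size_t byteIndex = (i - 1) / 8;
    if (byteIndex >= key.size()) return 0;
    unsigned char c = static_cast<unsigned char>(key[byteIndex]);
    return (c >> (7 - (i - 1) % 8)) & 1;
}

// 1-based position of the first bit at which the keys diverge.
// If one key is a prefix of the other, this is the first bit after the
// shorter key. Returns 0 when the keys are identical.
std::size_t first_diff_bit(const std::string& a, const std::string& b) {
    std::size_t commonBits = 8 * std::min(a.size(), b.size());
    for (std::size_t i = 1; i <= commonBits; ++i) {
        if (bit_at(a, i) != bit_at(b, i)) return i;
    }
    return a.size() == b.size() ? 0 : commonBits + 1;
}
```

For the keys in the text, first_diff_bit("dog", "doge") gives 25 and first_diff_bit("doge", "dogs") gives 28; bit_at("doge", 25) is 0, which is why "doge" goes to the left of "dog".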
Suffix trie
A suffix tree is an extended version of a trie. It is a compressed trie
which includes all of a string's suffixes. There are some string-related problems
which can be solved using suffix trees. Some of those problems are pattern
matching, identifying unique substrings inside a string, and determining the
longest palindrome. The suffix tree for the string S of length n is defined as a
tree that has the following properties:
1. There are exactly n leaves on the tree, numbered 1 through n.
2. Each edge is labeled with a non-empty substring of S.
3. Every internal node, with the exception of the root, has at least two children.
4. No two edges that emerge from the same node have string-labels that start with
the same character.
5. The suffix S[i..n], for i from 1 to n, is formed by concatenating all the string-labels
encountered on the path from the root to leaf i.

Functionality of Suffix Tree

A suffix tree for a string S of length n can be built in Θ(n) time if the letters
come from an alphabet of integers in a polynomial range (in
particular, this is true for fixed-size alphabets). For larger alphabets, the running time is
dominated by first sorting the letters to bring them into a range of size O(n); in general, this takes
O(n log n) time. Assume that a suffix tree has been built for the string S of length n.
Then you can:
1. Look for strings:
In O(m) time, determine whether a string P of length m is a substring.
In O(m) time, find the first occurrence of the patterns P1, ..., Pq of total
length m as substrings.
In O(m + z) time, find all z occurrences of the patterns P1, ..., Pq of total length m as
substrings.
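
The O(m) substring check can be illustrated with a much simpler structure than a real suffix tree: an uncompressed suffix trie built by inserting every suffix one character at a time. Note that this naive construction takes O(n^2) time and space, so it does not demonstrate the linear-time bound mentioned above; it only shows why walking from the root answers substring queries. SuffixNode, build_suffix_trie and is_substring are hypothetical names.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>

struct SuffixNode {
    std::map<char, std::unique_ptr<SuffixNode>> next;  // one edge per character
};

// Naive construction: insert each suffix s[i..] character by character.
std::unique_ptr<SuffixNode> build_suffix_trie(const std::string& s) {
    auto root = std::make_unique<SuffixNode>();
    for (std::size_t i = 0; i < s.size(); ++i) {
        SuffixNode* cur = root.get();
        for (std::size_t j = i; j < s.size(); ++j) {
            std::unique_ptr<SuffixNode>& slot = cur->next[s[j]];
            if (!slot) slot = std::make_unique<SuffixNode>();
            cur = slot.get();
        }
    }
    return root;
}

// Every substring of s is a prefix of some suffix of s, so p occurs in s
// exactly when p can be spelled out along a path starting at the root.
bool is_substring(const SuffixNode* root, const std::string& p) {
    const SuffixNode* cur = root;
    for (char c : p) {
        auto it = cur->next.find(c);
        if (it == cur->next.end()) return false;
        cur = it->second.get();
    }
    return true;
}
```

The query visits one node per character of P, which is where the O(m) figure comes from (with an extra logarithmic factor in the alphabet size here, because the children are kept in a std::map).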
Trie applications
Consider a web browser. Do you know how the web browser can autocomplete
your text or show you the possible completions of the text that you are writing? With
a trie this can be done very fast. Do you know how a spell checker
can check that every word that you type is in a dictionary? Again, a trie. You can
also use a trie to suggest corrections for words that are present in the text
but not in the dictionary.
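
As a rough illustration of the autocomplete use case, the sketch below walks down to the node that represents the typed prefix and then collects every key stored beneath it. It assumes lowercase a-z keys and reuses the node layout from the insertion section; autocomplete and collect are hypothetical names.

```cpp
#include <string>
#include <vector>

struct TrieNode {
    TrieNode* childNode[26] = {};
    int wordEndCnt = 0;
};

void insert_key(TrieNode* root, const std::string& key) {
    TrieNode* cur = root;
    for (char c : key) {
        int idx = c - 'a';
        if (cur->childNode[idx] == nullptr) cur->childNode[idx] = new TrieNode();
        cur = cur->childNode[idx];
    }
    cur->wordEndCnt++;
}

// Depth-first walk; children are visited from 'a' to 'z', so the
// results come out in alphabetical order.
void collect(const TrieNode* node, std::string& current, std::vector<std::string>& out) {
    if (node->wordEndCnt > 0) out.push_back(current);
    for (int i = 0; i < 26; ++i) {
        if (node->childNode[i] != nullptr) {
            current.push_back(static_cast<char>('a' + i));
            collect(node->childNode[i], current, out);
            current.pop_back();
        }
    }
}

// All stored keys that start with prefix (empty if there are none).
std::vector<std::string> autocomplete(const TrieNode* root, const std::string& prefix) {
    const TrieNode* cur = root;
    for (char c : prefix) {
        cur = cur->childNode[c - 'a'];
        if (cur == nullptr) return {};
    }
    std::vector<std::string> out;
    std::string current = prefix;
    collect(cur, current, out);
    return out;
}
```

With the keys "abc", "xy", "xyz", "abb", "xyzb" and "word" stored, the prefix "xy" yields "xy", "xyz", "xyzb", and the prefix "ab" yields "abb", "abc".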
