Tries
What is Trie?
• A trie is a type of k-ary search tree used for storing a set of keys and searching for a specific key in it.
With a trie, the cost of a search can be brought down to the optimal limit, the length of the key.
• A trie (derived from retrieval) is a multiway tree data structure used for storing strings over
an alphabet. It is used to store a large number of strings, and pattern matching can be done
efficiently with it.
• Trie is also known as digital tree or prefix tree.
• A trie is a tree-based data structure used for storing a collection of strings and performing
efficient search operations on them. The word trie is derived from reTRIEval, which means
finding something or obtaining it.
• Tries have the property that if two strings share a common prefix, they share the same
ancestor nodes for that prefix. A trie can be used to sort a collection of strings alphabetically
and to check whether any string with a given prefix is present in the trie.
• For example, a trie can store words like allot, alone, ant, and, are, bat, bad. The idea is that
all strings sharing a common prefix descend from a common node. Tries are used in
spell-checking programs.
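The alphabetical-sorting property can be sketched in C++ as follows. This is only a minimal illustration under stated assumptions: the names Node, insert, collect, and sortedWords are made up for this example, and keys are assumed to contain only the letters a-z. Because children are visited in index order ('a' before 'b', and so on), a depth-first walk produces the stored words in alphabetical order.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical node type for this sketch: 26 child slots for 'a'..'z'
// plus a flag marking that a word ends at this node.
struct Node {
    std::unique_ptr<Node> child[26];
    bool isEnd = false;
};

// Insert a lowercase word, creating nodes along its path as needed.
void insert(Node& root, const std::string& word) {
    Node* cur = &root;
    for (char c : word) {
        assert(c >= 'a' && c <= 'z');  // assumption: lowercase keys only
        auto& next = cur->child[c - 'a'];
        if (!next) next = std::make_unique<Node>();
        cur = next.get();
    }
    cur->isEnd = true;
}

// Depth-first walk in child-index order; appends words alphabetically.
void collect(const Node& node, std::string& prefix,
             std::vector<std::string>& out) {
    if (node.isEnd) out.push_back(prefix);
    for (int i = 0; i < 26; i++) {
        if (node.child[i]) {
            prefix.push_back(static_cast<char>('a' + i));
            collect(*node.child[i], prefix, out);
            prefix.pop_back();
        }
    }
}

// Return every stored word in alphabetical order.
std::vector<std::string> sortedWords(const Node& root) {
    std::vector<std::string> out;
    std::string prefix;
    collect(root, prefix, out);
    return out;
}
```

Inserting allot, alone, ant, and, are, bat, bad in that order and then calling sortedWords returns them sorted, without any comparison-based sort.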
• Preprocessing the pattern improves the performance of a
pattern matching algorithm. But if the text is very large, it is
better to preprocess the text instead of the pattern for
efficient search.
• A trie is a data structure that supports pattern matching
queries in time proportional to the pattern size.
• If we store keys in a binary search tree, a well-balanced BST
needs time proportional to M * log N per search, where M is
the maximum string length and N is the number of keys in the
tree. Using a trie, a key can be searched in O(M) time.
However, the penalty is the trie's storage requirements.
Need for Trie Data Structure?
• A trie is used for storing and retrieving
data. The same operations can be done with
a hash table, but a trie can perform them
more efficiently in some cases, since a lookup
never has to compute a hash or resolve collisions.
• Moreover, a trie has an advantage of its own
over a hash table: it supports prefix-based
searching, which a hash table cannot do
directly.
Advantages of Trie Data Structure over a Hash Table
• A trie data structure has the following advantages over
a hash table:
– We can efficiently do prefix search (or auto-complete) with Trie.
– We can easily print all words in alphabetical order which is not
easily possible with hashing.
– There is no overhead of Hash functions in a Trie data structure.
– Searching for a string, even in a large collection of strings, takes
O(L) time in a trie, where L is the number of characters in the
query string. The search can stop even earlier if the query string
is not in the trie, because it ends at the first missing character.
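The prefix search and the early exit for absent keys can be sketched as follows. This is a minimal illustration, not a definitive implementation: Node, insert, walk, contains, and hasPrefix are hypothetical names chosen for this example, and keys are assumed to be lowercase a-z.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical node type for this sketch.
struct Node {
    std::unique_ptr<Node> child[26];
    bool isEnd = false;  // true when a stored word ends at this node
};

void insert(Node& root, const std::string& word) {
    Node* cur = &root;
    for (char c : word) {
        assert(c >= 'a' && c <= 'z');  // assumption: lowercase keys only
        auto& next = cur->child[c - 'a'];
        if (!next) next = std::make_unique<Node>();
        cur = next.get();
    }
    cur->isEnd = true;
}

// Follow key from the root. Returns the node reached, or nullptr as soon
// as a character has no matching child (the early exit for absent keys).
const Node* walk(const Node& root, const std::string& key) {
    const Node* cur = &root;
    for (char c : key) {
        if (c < 'a' || c > 'z') return nullptr;  // outside the assumed alphabet
        cur = cur->child[c - 'a'].get();
        if (!cur) return nullptr;
    }
    return cur;
}

// Exact-match lookup: the path must exist and a word must end there.
bool contains(const Node& root, const std::string& word) {
    const Node* n = walk(root, word);
    return n != nullptr && n->isEnd;
}

// Prefix lookup: the path only has to exist.
bool hasPrefix(const Node& root, const std::string& prefix) {
    return walk(root, prefix) != nullptr;
}
```

Both lookups visit at most one node per character of the query, which is where the O(L) bound comes from. A hash table could answer contains, but it has no direct equivalent of hasPrefix.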
Properties of a Trie Data Structure
• Below are some important properties of the Trie data structure:
– There is one root node in each Trie.
– Each node of a Trie represents a string and each edge represents a
character.
– Every node holds a hash map or an array of child pointers, with each
index (or key) representing a character, plus a flag indicating whether any
string ends at the current node.
– A trie can contain any characters, including letters, digits, and
special characters. Here we discuss only strings with characters a-z, so
every node needs only 26 pointers, where index 0 represents ‘a’ and
index 25 represents ‘z’.
– Each path from the root to a node spells out a prefix; it is a complete
word when the end-of-word flag is set at that node.
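The hash-map form of a node mentioned in the properties above can be sketched as follows. It is a minimal illustration under assumptions: MapNode, insert, and contains are hypothetical names, and the sketch chooses not to store the empty string. With a map keyed by character, a node is not limited to 26 slots, so digits and special characters work as well.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical node type: children are looked up by character in a hash
// map instead of by index in a fixed 26-slot array.
struct MapNode {
    std::unordered_map<char, std::unique_ptr<MapNode>> child;
    bool isEnd = false;  // true when a stored key ends at this node
};

void insert(MapNode& root, const std::string& key) {
    assert(!key.empty());  // assumption of this sketch: no empty keys
    MapNode* cur = &root;
    for (char c : key) {
        auto& next = cur->child[c];  // creates an empty slot if absent
        if (!next) next = std::make_unique<MapNode>();
        cur = next.get();
    }
    cur->isEnd = true;
}

bool contains(const MapNode& root, const std::string& key) {
    const MapNode* cur = &root;
    for (char c : key) {
        auto it = cur->child.find(c);
        if (it == cur->child.end()) return false;  // no edge for this character
        cur = it->second.get();
    }
    return cur->isEnd;
}
```

The trade-off is the usual one: the array form gives constant-time indexing but reserves 26 pointers per node, while the map form only pays for the edges that exist.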
How does Trie Data Structure work?
• A trie can contain any characters, including letters, digits,
and special characters.
• Here only strings with characters a-z are considered. Therefore, every node needs only
26 pointers, where index 0 represents ‘a’ and index 25 represents ‘z’.
• Any lowercase English word can start with a-z, then the next letter of the word could
be a-z, the third letter of the word again could be a-z, and so on.
• So to store words, each node uses an array (container) of size 26. Initially all of its
slots are empty, because no words have been inserted yet.
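The 26-slot layout and the way insertion fills it can be sketched as follows. This is a minimal illustration: the node layout (childNode and wordCount) is assumed from the field names used in the deletion code of these notes, while insert_key, child_count, and free_trie are hypothetical helper names.

```cpp
#include <cassert>
#include <string>

// Assumed node layout: one pointer per letter plus a count of the words
// that end at this node.
struct TrieNode {
    TrieNode* childNode[26] = {};  // all 26 slots start out empty (nullptr)
    int wordCount = 0;
};

// Walk the word letter by letter, creating a node wherever a slot is empty.
void insert_key(TrieNode* root, const std::string& word) {
    TrieNode* cur = root;
    for (char c : word) {
        assert(c >= 'a' && c <= 'z');  // assumption: lowercase keys only
        int i = c - 'a';               // 'a' -> slot 0, ..., 'z' -> slot 25
        if (cur->childNode[i] == nullptr) cur->childNode[i] = new TrieNode();
        cur = cur->childNode[i];
    }
    cur->wordCount++;
}

// Number of occupied slots in a node.
int child_count(const TrieNode* node) {
    int count = 0;
    for (int i = 0; i < 26; i++) {
        if (node->childNode[i] != nullptr) count++;
    }
    return count;
}

// Release a node and everything below it.
void free_trie(TrieNode* node) {
    if (node == nullptr) return;
    for (int i = 0; i < 26; i++) free_trie(node->childNode[i]);
    delete node;
}
```

After inserting "and" and "ant" into an empty root, the root has one occupied slot (for 'a'), and the node reached by "an" has two (for 'd' and 't'), which is the shared-prefix behaviour described earlier.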
An easy solution to perform a delete operation for this case is to just decrement the
wordCount by 1 at the ending node of the word.
3.2 The deleted word shares a common prefix with other words in Trie
• The deleted word “and” shares the common prefix ‘an’ with another word, ‘ant’.
The solution for this case is to delete all the nodes from the end of the shared prefix to
the last character of the given word.
3.3 The deleted word does not share any common prefix with other words in Trie
• The word “geek” does not share a common prefix with any other word.
The solution for this case is just to delete all the nodes of the word.
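The deletion cases above can be sketched in one self-contained function. This is a hedged variant, not the code from these notes: the node layout (childNode, wordCount) is assumed, insert_key, search_key, child_count, and free_subtree are hypothetical helpers, and two choices are this sketch's own: a node where another word ends is also treated as a node that must be kept, and removed nodes are freed.

```cpp
#include <cassert>
#include <string>

struct TrieNode {                  // assumed layout
    TrieNode* childNode[26] = {};
    int wordCount = 0;
};

void insert_key(TrieNode* root, const std::string& word) {
    TrieNode* cur = root;
    for (char c : word) {
        assert(c >= 'a' && c <= 'z');  // assumption: lowercase keys only
        if (!cur->childNode[c - 'a']) cur->childNode[c - 'a'] = new TrieNode();
        cur = cur->childNode[c - 'a'];
    }
    cur->wordCount++;
}

bool search_key(const TrieNode* root, const std::string& word) {
    const TrieNode* cur = root;
    for (char c : word) {
        if (c < 'a' || c > 'z') return false;
        cur = cur->childNode[c - 'a'];
        if (cur == nullptr) return false;
    }
    return cur->wordCount > 0;
}

int child_count(const TrieNode* node) {
    int count = 0;
    for (int i = 0; i < 26; i++)
        if (node->childNode[i] != nullptr) count++;
    return count;
}

void free_subtree(TrieNode* node) {
    if (node == nullptr) return;
    for (int i = 0; i < 26; i++) free_subtree(node->childNode[i]);
    delete node;
}

bool delete_key(TrieNode* root, const std::string& word) {
    if (word.empty()) return false;   // sketch: empty keys are not stored
    TrieNode* cur = root;
    TrieNode* lastKeep = root;        // deepest path node that must survive
    char lastKeepChar = word[0];      // edge leaving lastKeep toward the word
    for (char c : word) {
        if (c < 'a' || c > 'z') return false;
        if (cur->childNode[c - 'a'] == nullptr) return false;  // word absent
        // Keep a node if it branches or if another word ends on it.
        if (child_count(cur) > 1 || cur->wordCount > 0) {
            lastKeep = cur;
            lastKeepChar = c;
        }
        cur = cur->childNode[c - 'a'];
    }
    if (cur->wordCount == 0) return false;  // path exists, but no word ends here
    cur->wordCount--;
    // Word is a prefix of other words (or a duplicate remains): keep the node.
    if (cur->wordCount > 0 || child_count(cur) > 0) return true;
    // Otherwise unlink and free the chain below the last node to keep.
    // With no shared prefix, lastKeep is still the root (case 3.3).
    TrieNode* removed = lastKeep->childNode[lastKeepChar - 'a'];
    lastKeep->childNode[lastKeepChar - 'a'] = nullptr;
    free_subtree(removed);
    return true;
}
```

For a trie holding "an", "and", "ant", and "geek": deleting "an" only decrements its wordCount, deleting "and" removes just the 'd' node below the shared prefix "an", and deleting "geek" removes the whole chain from the root.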
Implementation of all the cases
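bool delete_key(TrieNode* root, string& word)
{
    TrieNode* currentNode = root;
    TrieNode* lastBranchNode = NULL;  // last node on the path with more than one child
    char lastBrachChar = 'a';         // character followed out of lastBranchNode
    for (char c : word) {
        if (currentNode->childNode[c - 'a'] == NULL)
            return false;             // the word is not in the trie
        // Count the children of the current node.
        int count = 0;
        for (int i = 0; i < 26; i++) {
            if (currentNode->childNode[i] != NULL)
                count++;
        }
        if (count > 1) {
            lastBranchNode = currentNode;
            lastBrachChar = c;
        }
        currentNode = currentNode->childNode[c - 'a'];
    }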
int count = 0;
for (int i = 0; i < 26; i++) {
if (currentNode->childNode[i] != NULL)
count++;
}