Implement Trie (Prefix Tree)

The document describes a trie data structure and provides an example Python implementation of the Trie class. A trie is a tree that stores strings to efficiently perform operations like autocomplete and spellchecking. The Trie class needs to implement insert(), search(), and startsWith() methods. insert() adds a string to the trie. search() checks if a string is in the trie. startsWith() checks if a string prefix is in the trie.

Uploaded by

frencheddonuts

A trie (pronounced as "try") or prefix tree is a tree data structure used to efficiently store and retrieve keys in a dataset of strings. There are various applications of this data structure, such as autocomplete and spellchecker.

Implement the Trie class:
    Trie() Initializes the trie object.
    void insert(String word) Inserts the string word into the trie.
    boolean search(String word) Returns true if the string word is in the trie (i.e., was inserted before), and false otherwise.
    boolean startsWith(String prefix) Returns true if there is a previously inserted string word that has the prefix prefix, and false otherwise.

Example 1:
Input
["Trie", "insert", "search", "search", "startsWith", "insert", "search"]
[[], ["apple"], ["apple"], ["app"], ["app"], ["app"], ["app"]]
Output
[null, null, true, false, true, null, true]

Explanation
Trie trie = new Trie();
trie.insert("apple");
trie.search("apple");   // return True
trie.search("app");     // return False
trie.startsWith("app"); // return True
trie.insert("app");
trie.search("app");     // return True

Constraints:
    1 <= word.length, prefix.length <= 2000
    word and prefix consist only of lowercase English letters.
    At most 3 * 10^4 calls in total will be made to insert, search, and startsWith.

class Trie:

    def __init__(self):

    def insert(self, word: str) -> None:

    def search(self, word: str) -> bool:

    def startsWith(self, prefix: str) -> bool:

# Your Trie object will be instantiated and called as such:
# obj = Trie()
# obj.insert(word)
# param_2 = obj.search(word)
# param_3 = obj.startsWith(prefix)
My Solution

class Trie:
    def __init__(self):
        self.root = { 'EOW': None }

    def insert(self, word: str) -> None:
        i,cur = 0,self.root
        while i < len(word):
            if word[i] not in cur:
                cur[word[i]] = {}
            i,cur = i+1,cur[word[i]]
        cur['EOW'] = None

    def search(self, word: str) -> bool:
        i,cur = 0,self.root
        while i < len(word) and cur != None:
            i,cur = i+1,cur.get(word[i], None)
        return i == len(word) and cur != None and 'EOW' in cur

    def startsWith(self, prefix: str) -> bool:
        i,cur = 0,self.root
        while i < len(prefix) and cur != None:
            i,cur = i+1,cur.get(prefix[i], None)
        return i == len(prefix) and cur != None
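
One way to sanity-check the solution is to replay Example 1 from the problem statement. The sketch below repeats the Trie class so that it runs on its own; the replay helper and its name are my own additions, not part of the original solution or of LeetCode's harness.

```python
class Trie:
    """The nested-dictionary solution from above, repeated for self-containment."""

    def __init__(self):
        self.root = {'EOW': None}

    def insert(self, word: str) -> None:
        i, cur = 0, self.root
        while i < len(word):
            if word[i] not in cur:
                cur[word[i]] = {}
            i, cur = i + 1, cur[word[i]]
        cur['EOW'] = None

    def search(self, word: str) -> bool:
        i, cur = 0, self.root
        while i < len(word) and cur is not None:
            i, cur = i + 1, cur.get(word[i], None)
        return i == len(word) and cur is not None and 'EOW' in cur

    def startsWith(self, prefix: str) -> bool:
        i, cur = 0, self.root
        while i < len(prefix) and cur is not None:
            i, cur = i + 1, cur.get(prefix[i], None)
        return i == len(prefix) and cur is not None


def replay(ops, args):
    """Hypothetical helper: run LeetCode-style op/argument lists, collect results."""
    trie, out = None, []
    for op, arg in zip(ops, args):
        if op == "Trie":
            trie = Trie()
            out.append(None)          # the constructor reports null
        else:
            out.append(getattr(trie, op)(*arg))
    return out


ops = ["Trie", "insert", "search", "search", "startsWith", "insert", "search"]
args = [[], ["apple"], ["apple"], ["app"], ["app"], ["app"], ["app"]]
result = replay(ops, args)
print(result)  # [None, None, True, False, True, None, True]
```

Python's None, True and False line up with the null, true and false of the expected output.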

Perf:
Runtime: 92.58%
Memory: 85.49%

Learnings
Representing the induction hypothesis for node-based data structures like Linked Lists and Trees can often take the form: DataStructure[head:cur].
Examples:
    Linked List: List[head:cur]
    Tree: Tree[root:cur]
Note that for Tree, this only works when we are going along a linear path. With this caveat in mind, it's no surprise that they look so similar: they are essentially the same thing.
Also for node-based data structures, remember a couple of these useful concepts:
    Dummy node (like root in our example)
    Terminal value (like None in both our example and Linked Lists)
    Sentinel value (like 'EOW' in our example)
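
To make the List[head:cur] form concrete, here is a minimal sketch of the same invariant on a singly linked list. The ListNode class and the starts_with function are hypothetical names made up for this illustration; they do not come from the problem.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class ListNode:
    """Hypothetical singly linked list node holding one character."""
    val: str
    next: "Optional[ListNode]" = None


def starts_with(head: Optional[ListNode], prefix: Sequence[str]) -> bool:
    # Invariant: prefix[0:i] == List[head:cur], with cur exclusive.
    i, cur = 0, head
    while i < len(prefix) and cur is not None and cur.val == prefix[i]:
        i, cur = i + 1, cur.next
    # The whole prefix was matched only if i reached its end.
    return i == len(prefix)


# The list a -> p -> p
head = ListNode("a", ListNode("p", ListNode("p")))
print(starts_with(head, "ap"))    # True
print(starts_with(head, "apx"))   # False
print(starts_with(head, "appl"))  # False (the list ends first)
```

Here None plays the terminal-value role: the walk stops as soon as cur runs off the end of the list.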

Interesting Notes
Something that initially surprised me was that the loop invariant for the read methods, search/startsWith, was the same as for the write method, insert. Upon reflection though, it makes sense, as a lot of it is just making sure we are keeping i and cur in sync. Really, the main contribution of the loop invariant here is to help us traverse through our data structures.
Ok. We have to implement
insert(word)
search(word)
startsWith(prefix)

I can visualize a Trie in my head. My intuition is also telling me that I should have a dummy node acting as a
root, so that we don't end up with potentially 26 disparate trees. We should probably instantiate it in the
constructor. My first thought is to create a Node class, but is there a different way to model this? Perhaps
just a bunch of nested dictionaries? Let's start with the nested dictionaries approach. I'm not even sure how
nested classes in Python work off the top of my head.
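
As a rough picture of the nested-dictionaries idea, the sketch below builds the structure for three made-up words and prints it. It uses dict.setdefault purely for brevity, which is my shortcut rather than the approach derived in these notes, and it ignores for now how the end of a word gets marked.

```python
# A single dummy root dictionary holds every first letter,
# so the words share one tree instead of up to 26 separate ones.
root = {}

for word in ("app", "apt", "bat"):
    cur = root
    for ch in word:
        # Reuse the child dictionary if it exists, otherwise create it.
        cur = cur.setdefault(ch, {})

print(root)
# {'a': {'p': {'p': {}, 't': {}}}, 'b': {'a': {'t': {}}}}
```

Each key is one character and each value is the dictionary of characters that can follow it, so "app" and "apt" share the path a -> p.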
I feel like search and startsWith are easier to implement. I'm wondering if there's any downside to starting with them, but let's go with it.

Ok. So how do I implement search(word)? Well, we'd have a pointer, i, to the character, word[i]. What is the (loop) invariant? Let root[0:i] denote all nodes up to the i-th level. Then we can express our invariant as:
    root[0:i] contains word[0:i]
We can initialize the invariant with:
    i = 0
Given that root[0:i] contains word[0:i], how do we make sure that root[0:i+1] contains word[0:i+1]? Well, we can't. So the loop terminates if root[0:i+1] does not contain word[0:i+1]. If it does, then we can execute i = i+1.

Ok, so how can we check if root[0:i+1] contains word[0:i+1]? We can check by seeing if word[i+1] in root[i]. Ok. Now I feel as if we have to get more precise with how we are indexing into root. So, root is a nested dictionary, which means root[key] == { ... } (some dictionary). Ok. It looks like we'll have to maintain a different kind of index for root besides i. A kind of pointer. Let's call it cur. Then our invariant is:
    word[i] in cur
Well, the only way we can satisfy this invariant in the beginning is:
    i,cur = -1,root
Loop guard:
    i+1 < len(word) and word[i+1] in cur[word[i]]
I'm having a hard time with the fact that we actually can't guarantee word[0] in root. Ok. So how about word[-1] in root, where we define word[-1] as None? Ahh. Actually yeah, that's fine. I was actually having a hard time figuring out how to index into cur in the loop guard...but that's not what we're supposed to do!!

Ok. Let's begin again. We have the right initialization code:
    i,cur = -1,root
The loop guard is:
    i+1 < len(word) and word[i+1] in cur
The body? Well, we know we have to run several commands:

i = i+1
cur = cur[i]

The loop guard guarantees that they maintain the loop invariant so...I think we're good! Then we can just return i == len(word) after the loop. Let's put it all together:

def search(self, word: str) -> bool:
    i,cur = -1,root
    while i+1 < len(word) and word[i+1] in cur:
        i,cur = i+1,cur[word[i+1]]
    return i == len(word)-1 and word[i] in cur

As you can see, I had to modify the code in order to correct some mistakes in reasoning I caught along the way. The code feels a bit clumsy though. Among other things, I don't like that we have to manually do an extra check at the end... Perhaps a better way to articulate the invariant is:
    word[0:i] == tree[root:cur]
Actually, by conceptualizing the invariant this way, perhaps we can get rid of that extra check at the end:

def search(self, word: str) -> bool:
    i,cur = 0,root
    while i < len(word) and word[i] in cur:
        i,cur = i+1,cur[word[i+1]]
    return i == len(word)

I actually keep seeing this kind of thing over and over again when it comes to node-based data structures
like trees and linked lists. In fact, I did something very similar when reasoning through some linked list
algorithms. But it's still not quite right: word[i+1] will be out of bounds and, even if I get that fixed,
cur[word[i+1]] will result in a KeyError if the word we are looking for isn't in our Trie. Hmmm...I get
the feeling that I'm actually not satisfying the loop invariant. Specifically, perhaps the right bound, cur , in
tree[root:cur] is not truly exclusive? Or we can just guard cur = cur[word[i+1]] lol. So simple.
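
The two failure modes mentioned here can be reproduced directly. The sketch below copies the second draft as a free function, bug included, and runs it against a hand-built nested dictionary; search_draft, outcome and the sample trie are names and data I made up for the demonstration.

```python
def search_draft(root: dict, word: str) -> bool:
    """The second draft from above, reproduced with its bug intact."""
    i, cur = 0, root
    while i < len(word) and word[i] in cur:
        # Bug: the node is indexed with word[i+1] instead of word[i].
        i, cur = i + 1, cur[word[i + 1]]
    return i == len(word)


# Hand-built trie containing just the path a -> c
root = {'a': {'c': {}}}


def outcome(word: str):
    """Return the draft's result, or the name of the exception it raised."""
    try:
        return search_draft(root, word)
    except (IndexError, KeyError) as exc:
        return type(exc).__name__


print(outcome("a"))   # IndexError: word[1] is past the end of "a"
print(outcome("ab"))  # KeyError: root has no key 'b'
print(outcome("ac"))  # KeyError: root has no key 'c', even though a -> c exists
```

The last case suggests the problem is more than a missing guard: the lookup key is one character ahead of the node being indexed.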

def search(self, word: str) -> bool:
    i,cur = 0,root
    while i < len(word) and word[i] in cur:
        i = i+1
        if i < len(word) and word[i] in cur:
            cur = cur[word[i]]
    return i == len(word)

Eh. The first one actually feels better.

I suspect some version of that loop will be written over and over again during the course of implementing
the rest of the methods.
The Next Day... You know what? I'm still not happy with search() , but let's move on to startsWith() .
Perhaps by deriving it from scratch, I can get a different perspective on search() as well.
Alright. startsWith(prefix) ... We want to see if prefix exists as a path in our Trie. Oh! Oh shit. I just
realized that what I wrote for search() is actually startsWith() ! To properly implement search() ,
we actually have to make sure that cur contains or becomes some kind of sentinel value that denotes the
end of a word! Man. This is what I was kind of worried about when deciding to focus on implementing one function at a time. Still, deciding to implement search() and startsWith() first was the right idea, as figuring out what they need informs how we implement insert() as well.
So what I'm realizing is that every valid word's last node will also include an end-of-word (EOW) sentinel
value. What should this sentinel value be? Perhaps something like "EOW"? I should probably initialize root
w/ it as well. The corresponding value can be None .
Ok. Actually, let's go back to implementing search() with this new insight. What do we want our invariant to be? I really want word[0:i] == tree[root:cur] to work. Let's try it again. Initialization:
    i,cur = 0,self.root

Loop guard...hmmm...what do we want the postcondition of the loop to be? Let's think that through. We know that if i == len(word) and cur == None, then we can return True. For anything else, we return False. To explore a little more, what do the other cases correspond to?

    i < len(word) and cur == None
        Our Trie only contains a proper prefix of the word.
    i == len(word) and cur != None
        Our Trie contains word as a proper prefix, but not as a whole word.
        word is a proper prefix of some other word that is in our Trie.
    i < len(word) and cur != None
        This should never happen because in this state, the loop should still be processing word.
        The negation of this predicate is consistent with the 3 other states.

Here's a question: can we just use the negation of i == len(word) and cur == None as our loop guard?
    i != len(word) or cur != None
Hmmm...If we go this route, we'd definitely need return False inside the loop body, since the loop would only terminate if our Trie contains word. What would be nice about this approach is that we can simply return True after the loop. Let's try both approaches. One approach:

i,cur = 0,self.root
while i != len(word) or cur != None:
    if i < len(word) and cur != None:
        i,cur = i+1,cur[word[i]]
    else:
        return False
return True

Another approach:

i,cur = 0,self.root
while i < len(word) and cur != None:
    i,cur = i+1,cur[word[i]]
return i == len(word) and cur == None

I realized while implementing these approaches that the source of my confusion yesterday was in how I was
incorrectly "incrementing" cur . Yesterday, I was doing: cur = cur[word[i+1]] . But think about the
initial conditions when i,cur = 0,self.root ... If word[0] in self.root , then for word[1] , we
need to check if it's in self.root[word[0]] ! So I don't have to worry about word[i+1] going out of
bounds anymore...
Whoops. I just realized that in these 2 approaches, I totally forgot to check if word[i] in cur ! But you
know what? I can solve this in a slick way: have the default value for a non-existent key be None ! I'll have
to take care of that in the constructor and insert() , so I don't have to change anything in search() .
Wait wait wait. I don't think this is quite right. Hold on. What happened to our 'EOW' sentinel value? Ok. I think the only modification I need to make is to change the return statement:
    return i == len(word) and cur != None and 'EOW' in cur
Wow. That case analysis I did earlier was off! Let's re-do it to make sure I really understand what is happening. Let's first make sure we know exactly what values cur can be:
    A dictionary
    None, in the case where word[i-1] not in parent(cur)
        This happens when our Trie doesn't contain word even as a prefix.
Right. So if cur == None we can definitely return False. And if cur != None, but 'EOW' not in cur, then our Trie contains word as a proper prefix, but not as a word.
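
The redone case analysis can be spelled out as a small classifier that names each outcome instead of collapsing them into True/False. This is only a sketch: the classify function, its three labels and the hand-built trie are my own illustration, using the 'EOW' marker described above.

```python
def classify(root: dict, word: str) -> str:
    """Hypothetical helper: report which case of the analysis a word falls into."""
    i, cur = 0, root
    while i < len(word) and cur is not None:
        i, cur = i + 1, cur.get(word[i])
    if cur is None:
        # Some character had no matching child: not even a prefix.
        return "absent"
    if 'EOW' not in cur:
        # The path exists but no inserted word ends on this node.
        return "prefix only"
    return "word"


# Hand-built trie holding "app" and "apple"
root = {'EOW': None,
        'a': {'p': {'p': {'EOW': None,
                          'l': {'e': {'EOW': None}}}}}}

print(classify(root, "app"))    # word
print(classify(root, "appl"))   # prefix only
print(classify(root, "apply"))  # absent
```

In these terms, search() returns True only for "word", while startsWith() returns True for both "word" and "prefix only".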
Ok. With all that, this is my preferred version:
i,cur = 0,self.root
while i < len(word) and cur != None:
    i,cur = i+1,cur.get(word[i], None)
return i == len(word) and cur != None and 'EOW' in cur

I was trying to decide between using collections.defaultdict or .get() , and I decided to go with
.get() as my "go-to" pattern so that I can use dictionary comprehensions. This also means that I won't
have to do anything special in the constructor or in insert() .
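
One further difference between the two options may be worth keeping in mind: indexing a collections.defaultdict with a missing key stores that key, whereas .get() leaves the dictionary untouched. The sketch below shows this on a single node; the variable names are illustrative only.

```python
from collections import defaultdict

# A plain node and a defaultdict node holding the same single child 'a'.
plain = {'a': {}}
auto = defaultdict(lambda: None, {'a': {}})

missing_plain = plain.get('z')   # read-only lookup
missing_auto = auto['z']         # triggers the default factory and stores 'z'

print(missing_plain, sorted(plain))  # None ['a']
print(missing_auto, sorted(auto))    # None ['a', 'z']
```

With defaultdict nodes, a read-style traversal that indexes with square brackets would therefore grow the trie as a side effect, which .get() avoids.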
Ok ok. Now we also know how to implement startsWith pretty easily! It's literally the same thing, but we
just don't have to check for 'EOW' in cur .
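
Since the two read methods differ only in the final check, the shared traversal could be pulled into one helper. The following is a sketch of that refactoring as free functions over a hand-built trie; walk, search and starts_with are hypothetical names, and the final solution in these notes keeps the two loops separate.

```python
from typing import Optional


def walk(root: dict, s: str) -> Optional[dict]:
    """Follow s from root; return the node reached, or None if the path breaks."""
    i, cur = 0, root
    while i < len(s) and cur is not None:
        i, cur = i + 1, cur.get(s[i])
    return cur


def search(root: dict, word: str) -> bool:
    node = walk(root, word)
    # A whole word needs both a complete path and the end-of-word marker.
    return node is not None and 'EOW' in node


def starts_with(root: dict, prefix: str) -> bool:
    # A prefix only needs the path to exist.
    return walk(root, prefix) is not None


# Hand-built trie holding only "apple"
root = {'EOW': None, 'a': {'p': {'p': {'l': {'e': {'EOW': None}}}}}}

print(search(root, "apple"), search(root, "app"))  # True False
print(starts_with(root, "app"))                    # True
```

The i == len(word) part of the original return statement is implied here: when walk returns a node rather than None, the loop can only have stopped because i reached the end of the string.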
Now, let's work on insert(word). The loop invariant is:
    word[0:i] == tree[root:cur]
Lol. It's the same loop invariant. But this time, instead of terminating when we encounter a key that doesn't exist, we insert a new dictionary at that key and update cur to be that new dictionary. We terminate when i == len(word), when we've processed the entire word.

i,cur = 0,self.root
while i < len(word):
    if word[i] not in cur:
        cur[word[i]] = {}
    i,cur = i+1,cur[word[i]]
cur['EOW'] = None

Beautiful. Now, for a final touch, I think it's good Software Engineering to make 'EOW' some kind of private constant. I wonder how to do this in Python?
    Lol. Python doesn't have true privates. Instead, you put a leading underscore to mark it for internal use only.
    https://www.geeksforgeeks.org/underscore-_-python/
    Python also doesn't have constants. Instead, you capitalize all letters to communicate that the variable should not be changed.
    https://stackoverflow.com/questions/2682745/how-do-i-create-a-constant-in-python
Ugh. I'm getting errors with using _EOW. I'm going to slate this for later.
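
The notes don't say which error came up, so the following is only a guess at one common cause: a name defined in the class body is not visible as a bare name inside a method, so writing _EOW instead of self._EOW raises NameError. The Broken class is a hypothetical reproduction, and the Trie variant uses for loops for brevity rather than the while loops derived above.

```python
class Broken:
    _EOW = 'EOW'

    def marker(self):
        # Class attributes are not in a method's scope as bare names.
        return _EOW  # raises NameError when called


class Trie:
    # Leading underscore: internal use. All caps: treat as a constant.
    _EOW = 'EOW'

    def __init__(self):
        self.root = {self._EOW: None}

    def insert(self, word: str) -> None:
        cur = self.root
        for ch in word:
            if ch not in cur:
                cur[ch] = {}
            cur = cur[ch]
        cur[self._EOW] = None

    def search(self, word: str) -> bool:
        cur = self.root
        for ch in word:
            cur = cur.get(ch)
            if cur is None:
                return False
        return self._EOW in cur


try:
    Broken().marker()
    error_name = None
except NameError as exc:
    error_name = type(exc).__name__

trie = Trie()
trie.insert("apple")
print(error_name)                                 # NameError
print(trie.search("apple"), trie.search("app"))   # True False
```

Because every real key is a single lowercase letter, a three-character marker such as 'EOW' cannot collide with a child key.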
class Trie:
    def __init__(self):
        self.root = { 'EOW': None }

    def insert(self, word: str) -> None:
        i,cur = 0,self.root
        while i < len(word):
            if word[i] not in cur:
                cur[word[i]] = {}
            i,cur = i+1,cur[word[i]]
        cur['EOW'] = None

    def search(self, word: str) -> bool:
        i,cur = 0,self.root
        while i < len(word) and cur != None:
            i,cur = i+1,cur.get(word[i], None)
        return i == len(word) and cur != None and 'EOW' in cur

    def startsWith(self, prefix: str) -> bool:
        i,cur = 0,self.root
        while i < len(prefix) and cur != None:
            i,cur = i+1,cur.get(prefix[i], None)
        return i == len(prefix) and cur != None

Killed it! First submission passed.

Runtime: 92.58%
Memory: 85.49%
The core difficulty I ran into here was the implementation. Specifically, reasoning about how to develop
guards for termination for Trees. Like, what is the analogue of i < len(arr) for Trees?
