cpp-coding-interview-community
Šimon Tóth
This book is for sale at http://leanpub.com/cpp-coding-interview
This is a Leanpub book. Leanpub empowers authors and publishers with the Lean Publishing
process. Lean Publishing is the act of publishing an in-progress ebook using lightweight tools and
many iterations to get reader feedback, pivot until you have the right book and build traction once
you do.
Preface
  The commercial edition vs. community edition
  The author
Introduction
  Book structure
  Companion repository
  Using this book
Linked Lists
  std::list and std::forward_list
  Custom lists
  Simple operations
  Canonical problems
  Hints
  Solutions
Traversal algorithms
  Depth-first search
  Breadth-first search
  Backtracking
  Notable variants
  Canonical problems
  Hints
  Solutions
Trees
  Representing trees
  Tree traversals
  BST: Binary Search Tree
  Paths in trees
  Canonical problems
  Hints
  Solutions
Preface
Welcome to Surviving the C++ Coding Interview. I conceived the idea for this book during the
mass layoffs of 2022/2023.
Most companies still insist on using coding puzzles during interviews, and while for some, this is
only a convenient scaffolding, for many, it remains the primary filter for candidates. Therefore,
training for this part of the interview remains a necessity.
This book aims to guide you through the different types of problems you can come across while also
focusing on information that will remain relevant past the interview process.
After finishing this book, you should be able to recognize the solution patterns when presented with
a problem and transform those patterns into an implementation.
As with all my books, this book focuses on practical information. The text is interspersed
with commented examples, and the book comes with a companion repository that contains a
comprehensive test suite. You are encouraged to attempt to solve each of the presented problems
yourself and only then compare your solution with the commented one.
The author
I am Šimon Tóth, the sole author of this book. My primary qualification is 20 years of C++
experience, with C++ being my primary language in a commercial setting for approximately 15
of those years.
My background is in HPC, spanning academia, big tech, and startup environments.
I have architected, built, and operated systems of all scales, from single-machine hardware-supported
high-availability deployments to planet-scale services.
Throughout my career, my passion has always been teaching and mentoring junior engineers, which
is why you are now reading this book.
For more about me, check out my LinkedIn profile: https://www.linkedin.com/in/simontoth/
Introduction
Book structure
I believe in interleaving theory and practical training, and I have structured this book to facilitate and
enrich this workflow. Each chapter adheres to a consistent structure to ensure a steady progression.
We start with an introduction that covers relevant C++ background as necessary. Then, we move on
to essential patterns and simple operations. Each chapter concludes with carefully selected problems,
complemented by solutions and commentary.
While the chapters are sequential, each building on the foundations of the previous, the book
doesn't restrict you to a strict reading order. Instead, it comes with a comprehensive index. You
can always refer to the index to look up more details if you encounter an unfamiliar concept or
algorithm.
Companion repository
This book has a companion repository (https://github.com/HappyCerberus/cpp-coding-interview-companion) with a test suite and scaffolding for each problem.
The repository is set up with a DevContainer configuration. It allows for a seamless C++
development environment, equipped with the latest stable versions of GCC, GDB, and Clang when
accessed through VS Code. All you need to take full advantage of this are Visual Studio Code
(https://code.visualstudio.com/download) and Docker (https://www.docker.com/products/docker-desktop/).
To get up and running, follow these steps:
4. Visual Studio Code will ask for a location and, once done, will ask to open the cloned repository.
Confirm.
5. Visual Studio Code will now ask whether you trust me. Confirm that you do. You can see all
the relevant configuration inside the repository in the .vscode and .devcontainer directories.
6. Finally, Visual Studio Code will detect the devcontainer configuration and ask whether you
want to re-open the project in the devcontainer. Confirm. After VSCode downloads the
container, you will have a fully working C++ development environment with the latest GCC,
Clang, GDB, and Bazel.
#include <list>

std::list<int> first{1,2,3};
std::list<int> second{4,5,6};

// Get iterator to the element with value 2.
auto it = std::next(first.begin());

// Move the element to the beginning of the second list.
second.splice(second.begin(), first, it);

// first == {1, 3}, second == {2,4,5,6}

// iterator still valid
// *it == 2
Iterator stability is one of the use cases for which we would pick a std::list or std::forward_list
in practical applications. The only reasonable alternative would be wrapping each element in a
std::unique_ptr, which offers reference stability (but not iterator stability) irrespective of the
wrapping container.
Of course, we do pay for this stability with performance. Linked lists are node-based containers,
meaning each element is allocated in a separate node, potentially very distant from each other in
memory. When we combine this with the inherent overhead of the indirection, traversing a std::list
can regularly end up 5x-10x slower than an equivalent flat std::vector.
Aside from iterator stability, we also get access to a suite of O(1) operations, and these can potentially
outweigh the inherent overhead of a std::list.
Figure 9. O(1) operations using a std::list and std::forward_list.
#include <list>

std::list<int> data{1,2,3,4,5};

// O(1) splicing between lists, or within one list

// effectively rotate left by one element
data.splice(data.end(), data, data.begin());
// data == {2,3,4,5,1}

// O(1) erase

// iterator to element with value 4
auto it = std::next(data.begin(), 2);
data.erase(it);
// data == {2,3,5,1}

// O(1) insertion

// effectively push_front()
data.insert(data.begin(), 42);
// data == {42,2,3,5,1}
Because std::list is a bidirectional range and std::forward_list is a forward range, we lose access to
some standard algorithms. Both lists expose custom implementations of sort, unique, merge, reverse,
remove, and remove_if as member functions.
Figure 10. List specific algorithms.
#include <list>

std::list<int> data{1,2,3,4,5};

data.reverse();
// data == {5,4,3,2,1}

data.sort();
// data == {1,2,3,4,5}

data.remove_if([](int v) { return v % 2 == 0; });
// data == {1,3,5}
The std::forward_list has an additional quirk: because we can only erase and insert after an
iterator, it offers a modified interface.
Figure 11. Modified interface of std::forward_list.
#include <forward_list>

std::forward_list<int> data{1,2,3,4,5};

// before_begin() iterator
auto it = data.before_begin();

// insert and erase only possible after the iterator
data.insert_after(it, 42);
// data == {42,1,2,3,4,5}
data.erase_after(it);
// data == {1,2,3,4,5}
Custom lists
When implementing a simple custom linked list, you might be tempted to use a straightforward
implementation based on std::unique_ptr.
Sadly, this approach isn’t usable. The fundamental problem here is the design. We are mixing
ownership with structural information. In this case, this problem manifests during destruction.
Because we have tied the ownership with the structure, the destruction of a list will be recursive,
potentially leading to stack exhaustion and a crash.
Figure 13. A demonstration of a problem caused by recursive destruction.
#include <memory>

struct Node {
    int value;
    std::unique_ptr<Node> next;
};

{
    std::unique_ptr<Node> head = std::make_unique<Node>(0,nullptr);
    // Depending on the architecture/compiler, the specific number
    // of elements we can handle without crash will differ.
    Node* it = head.get();
    for (int i = 0; i < 100000; ++i)
        it = (it->next = std::make_unique<Node>(0,nullptr)).get();
} // BOOM
If we desire both the O(1) operations and iterator stability, the only option is to rely on manual
resource management (at which point we might as well use std::list or std::forward_list).
However, if we limit ourselves, there are a few alternatives to std::list and std::forward_list.
If we want to capture the structure of a linked list with reference stability, we can rely on the
previously mentioned combination of a std::vector and a std::unique_ptr. This approach doesn’t give
us any O(1) operations or iterator stability; however, this approach is often used during interviews.
Figure 14. Representing the structure of a linked list using a std::vector and std::unique_ptr.
#include <vector>
#include <memory>

struct List {
    struct Node {
        int value;
        Node* next;
    };
    Node *head = nullptr;
    Node *new_after(Node* prev, int value) {
        nodes_.push_back(std::make_unique<Node>(value, nullptr));
        if (prev == nullptr)
            return head = nodes_.back().get();
        else
            return prev->next = nodes_.back().get();
    }
private:
    std::vector<std::unique_ptr<Node>> nodes_;
};


List list;
auto it = list.new_after(nullptr, 1);
it = list.new_after(it, 2);
it = list.new_after(it, 3);

// list.head->value == 1
// list.head->next->value == 2
// list.head->next->next->value == 3
The crucial difference from the naive approach is that the list data structure owns all nodes, and the
structure is encoded using only non-owning raw pointers.
Finally, if we do not require stable iterators or references but do require O(1) operations, we can use
a flat list approach. We can store all elements directly in a std::vector and represent information
about the next and previous nodes using indexes.
However, this introduces a problem. Erasing an element from the middle of a std::vector is O(n)
because we need to shift successive elements to fill the gap. Since the list structure is encoded
separately from element positions, we can instead swap the to-be-erased element with the last
element and only then pop it, giving us an O(1) erase.
Figure 15. Erase an element from the middle of a flat list in O(1).
#include <cstddef>
#include <iterator>
#include <utility>
#include <vector>

inline constexpr ptrdiff_t nill = -1;

struct List {
    struct Node {
        int value;
        ptrdiff_t next;
        ptrdiff_t prev;
    };
    ptrdiff_t new_after(ptrdiff_t prev, int value) {
        storage.push_back({value, nill, prev});
        if (prev != nill)
            storage[prev].next = std::ssize(storage)-1;
        else
            head = std::ssize(storage)-1;
        return std::ssize(storage)-1;
    }
    void erase(ptrdiff_t idx) {
        // move head
        if (idx == head)
            head = storage[idx].next;
        // unlink the erased element
        if (storage[idx].next != nill)
            storage[storage[idx].next].prev = storage[idx].prev;
        if (storage[idx].prev != nill)
            storage[storage[idx].prev].next = storage[idx].next;
        // relink the last element, which is about to move to idx
        if (idx != std::ssize(storage)-1) {
            if (storage.back().next != nill)
                storage[storage.back().next].prev = idx;
            if (storage.back().prev != nill)
                storage[storage.back().prev].next = idx;
            // if the last element happens to be the head, it moves too
            if (head == std::ssize(storage)-1)
                head = idx;
        }
        // swap and O(1) erase
        std::swap(storage[idx], storage.back());
        storage.pop_back();
    }
    ptrdiff_t get_head() { return head; }
    Node& at(ptrdiff_t idx) { return storage[idx]; }
private:
    ptrdiff_t head = nill;
    std::vector<Node> storage;
};


List list;
ptrdiff_t idx = list.new_after(nill, 1);
idx = list.new_after(idx, 2);
idx = list.new_after(idx, 3);
idx = list.new_after(idx, 4);
idx = list.new_after(idx, 5);
// list == {1,2,3,4,5}

idx = list.get_head();
list.erase(idx);
// list == {2,3,4,5}
Simple operations
Let’s explore some basic operations frequently used as the base for a more complex solution. The
three most frequent operations are:
#include <list>
#include <forward_list>

{
    std::list<int> left{2,4,5};
    std::list<int> right{1,3,9};
    left.merge(right);
    // left == {1,2,3,4,5,9}
    // right == {}
}

{
    std::forward_list<int> left{2,4,5};
    std::forward_list<int> right{1,3,9};
    left.merge(right);
    // left == {1,2,3,4,5,9}
    // right == {}
}
However, implementing a merge from scratch isn't particularly complicated either. We consume the
merged-in list one element at a time, advancing the insertion position as needed.
Figure 17. Custom merge operation.
#include <forward_list>

std::forward_list dst{1, 3, 5, 6};
std::forward_list src{2, 4, 7};

auto dst_it = dst.begin();

while (!src.empty()) {
    if (std::next(dst_it) == dst.end() ||
        *std::next(dst_it) >= src.front()) {
        dst.splice_after(dst_it, src, src.before_begin());
    } else {
        ++dst_it;
    }
}
// dst == {1,2,3,4,5,6,7}
// src == {}
The same situation applies to reversing a list. Both lists provide a built-in in-place reverse
operation.
Figure 18. Built-in in place reverse.
#include <forward_list>

std::forward_list<int> src{1,2,3,4,5,6,7};

src.reverse();
// src == {7,6,5,4,3,2,1}
Implementing a custom reverse is straightforward if we use a second list. However, the in-place
version can be tricky.
Figure 19. Custom implementations of linked list reverse.
#include <forward_list>

std::forward_list<int> src{1,2,3,4,5,6,7};

// Custom reverse using a second list
std::forward_list<int> dst;
while (!src.empty())
    dst.splice_after(dst.before_begin(), src, src.before_begin());
// dst == {7,6,5,4,3,2,1}
// src == {}

// Custom in-place reverse
auto tail = dst.begin();
if (tail != dst.end())
    while (std::next(tail) != dst.end())
        dst.splice_after(dst.before_begin(), dst, tail);
// dst == {1,2,3,4,5,6,7}
The in-place reverse takes advantage of the fact that the first element will be the last once the list is
reversed.
Finally, scanning with two iterators is a common search technique for finding a sequence of elements
that conforms to a particular property. As long as this property is calculated strictly from elements
entering and leaving the sequence, we do not need to access the elements currently in the sequence.
Figure 21. Find the longest subsequence with sum less than 4.
#include <algorithm>
#include <forward_list>

std::forward_list<int> data{4,2,1,1,1,3,5};

// two iterators denoting the sequence [left, right)
auto left = data.begin();
auto right = data.begin();
int sum = 0;
int len = 0;
int max = 0;

while (right != data.end()) {
    // extend the sequence to the right by one element
    sum += *right;
    ++len;
    ++right;
    // shrink from the left, until the property is restored
    while (sum >= 4 && left != right) {
        sum -= *left;
        --len;
        ++left;
    }
    // [left, right) now satisfies the property; record its length
    max = std::max(max, len);
}
// max == 3, i.e. {1,1,1}
Canonical problems
Now that we’ve covered the basics, let’s move on to real-world problems often seen in technical
interviews. This next section will cover four linked list challenges: reversing k-groups in a list,
merging a list of sorted lists, removing the kth element from the end, and finding a loop in a corrupted
list.
It’s a step up from what we’ve done so far, but with the foundation you’ve built, you should be
well-prepared to handle these tasks. Let’s get started.
You should be able to implement a version that operates in O(n) time and O(1) additional space,
where n is the number of elements in the list.
The scaffolding for this problem is located at lists/k_groups. Your goal is to make the
following commands pass without any errors: bazel test //lists/k_groups/..., bazel
test --config=addrsan //lists/k_groups/..., bazel test --config=ubsan //lists/k_groups/....
You should be able to implement a version that operates in O(n*log(k)) time and uses O(k) additional
memory, where n is the total number of elements and k is the number of lists we are merging.
The scaffolding for this problem is located at lists/merge. Your goal is to make the
following commands pass without any errors: bazel test //lists/merge/..., bazel test
--config=addrsan //lists/merge/..., bazel test --config=ubsan //lists/merge/....
Figure 24. Example of removing the 3rd element from the end of the list.
You should be able to implement a version that operates in O(n) time and uses O(1) additional
memory, where n is the number of elements in the list.
The scaffolding for this problem is located at lists/end_of_list. Your goal is to make
the following commands pass without any errors: bazel test //lists/end_of_list/...,
bazel test --config=addrsan //lists/end_of_list/..., bazel test --config=ubsan
//lists/end_of_list/....
• Progression A: return the first node of the loop.
• Progression B: fix the list.
You should be able to implement a version that operates in O(n) and uses O(1) additional memory,
where n is the number of elements in the list.
The scaffolding for this problem is located at lists/loop for the basic version and
lists/loop_node, lists/loop_fix for the two progressions. Your goal is to make the
following commands pass without any errors: bazel test //lists/loop/..., bazel test
--config=addrsan //lists/loop/..., bazel test --config=ubsan //lists/loop/....
Adjust the directory to loop_node and loop_fix for the relevant progression.
Hints
Solutions
The complexity lies in applying this operation multiple times in sequence. For that, we need to keep
track of terminating nodes:
• the head of the already processed part; this will be our final result
• the tail of the already processed part; this is where we will attach each reversed section as we
iterate
• the head of the unprocessed part; needed for iteration, and also to relink the reversed group
of k elements
Figure 28. Calculate the kth element by advancing k steps from the unprocessed head.
Figure 30. Connect the reversed group to the tail of the processed part. Connect the unprocessed head to the kth
element.
Figure 31. The new tail is the unprocessed head. The new unprocessed head is the kth element.
We access each element at most twice: once when advancing by k elements and a second time when
we are reversing a group of k elements. This means that our time complexity is O(n), and since we
only store the terminal nodes, our space complexity is O(1).
    while (!q.empty()) {
        // Extract the node that holds the element,
        // without making a copy
        auto top = q.extract(q.begin());

        // Splice the first element of the top list to the result
        result.splice_after(tail,
            top.value(), top.value().before_begin());
        ++tail;

        if (!top.value().empty())
            q.insert(std::move(top)); // put back
    }
    return result;
}
Because we extract each element once and each extraction involves an O(log(k)) re-insertion,
we end up with O(n*log(k)) time complexity. Our std::multiset will use O(k) memory.
Figure 34. Solution using pairwise merging.
We merge n elements in every iteration, repeating this for log(k) iterations, leading to O(n*log(k))
time complexity. The only additional memory we use is to store the partially merged lists; therefore,
we end up with O(k) space complexity.
If we iterate over a list without a loop, we will eventually reach the end. However, if there is a
loop, we will end up stuck in it.
The tricky part is detecting that we are stuck in the loop. If we use two pointers to iterate, one
slow one, iterating normally, and one fast one, advancing over two nodes in each step, we have a
guarantee that if they get stuck in a loop, they will eventually meet: once both pointers are inside
the loop, the fast pointer closes the gap on the slow pointer by exactly one node per step.
Figure 36. Initial configuration: slow and fast pointers are pointing to the head of the list.
To detect the start of the loop, we must look at how many steps both pointers made before they met
up.
Consider that the slow pointer moved x steps before entering the loop and then y steps after entering
the loop for a slow = x + y total.
The fast pointer moved similarly. It also moved x steps before entering the loop and then y steps
after entering the loop when it met up with the slow pointer; however, in between, it completed an
unknown number of full loops: fast = x + n*loop + y. Importantly, we also know that the fast
pointer did exactly 2*slow steps.
If we put this together, we end up with the following:
• 2*(x + y) = x + n*loop + y
• x + y = n*loop
• x = n*loop - y
This means that the number of steps to reach the loop is the same as the number of steps remaining
from where the pointers met up to the start of the loop.
So to find the start of the loop, we can iterate from the start and the meeting point. Once these two
new pointers meet, we have our loop start.
Figure 40. One pointer at the meeting point, one at the list head.
The main difficulty in fixing the list is that we are working with a singly-linked list. Fixing the list
means that we must unlink the node one before the start of the loop.
Figure 43. Solution for fixing the list.
    before->next = nullptr;
}
Traversal algorithms
This chapter is dedicated to three algorithms we will keep revisiting in different variants throughout
the book: the two search-traversal algorithms, depth-first and breadth-first search, and the
constraint-traversal algorithm, backtracking.
Let’s start with a problem: imagine you need to find a path in a maze; how would you do it?
You could wander randomly, and while that might take a very long time, you will eventually reach
the end.
However, for a more structured approach, you might consider an approach similar to a depth-first
search, exploring each branch until you reach a dead-end, then returning to the previous crossroads
and taking a different path.
Depth-first search
The depth-first search opportunistically picks a direction at each space and explores that direction
fully before returning to this space and picking a different path.
A typical approach would use a consistent strategy for picking the direction order: e.g., north, south,
west, east; however, as long as the algorithm explores every direction, the order doesn’t matter and
can be randomized.
Because of the repeating nested nature, a recursive implementation is a natural fit for the depth-first
search.
Figure 46. Recursive implementation of a depth-first search.
bool dfs(int64_t row, int64_t col,
         std::vector<std::vector<char>>& map) {
    // Out-of-bounds spaces, walls, and already visited spaces
    // terminate this branch of the exploration.
    if (row < 0 || row >= std::ssize(map) ||
        col < 0 || col >= std::ssize(map[row]) ||
        (map[row][col] != ' ' && map[row][col] != 'E'))
        return false;
    // If we reached the exit, we are done.
    if (map[row][col] == 'E')
        return true;
    // Mark as visited
    map[row][col] = '.';

    return dfs(row-1,col,map) || // North
           dfs(row+1,col,map) || // South
           dfs(row,col-1,map) || // West
           dfs(row,col+1,map);   // East
}
We can flatten the recursive version using a stack data structure. However, we need to remember
the LIFO nature of a stack: the order of exploration will be the reverse of the order in which we
insert the elements into the stack.
Figure 47. Implementation of depth-first search using a std::stack.
#include <cstdint>
#include <stack>
#include <utility>
#include <vector>

bool dfs(int64_t row, int64_t col,
         std::vector<std::vector<char>>& map) {
    std::stack<std::pair<int64_t,int64_t>> next;
    next.push({row,col});

    // As long as we have spaces to explore.
    while (!next.empty()) {
        auto [row,col] = next.top();
        next.pop();

        // If we reached the exit, we are done.
        if (map[row][col] == 'E')
            return true;

        // Mark as visited
        map[row][col] = '.';

        // Helper to check if a space can be stepped on
        // i.e. not out-of-bounds and either empty or exit.
        auto is_path = [&map](int64_t row, int64_t col) {
            return row >= 0 && row < std::ssize(map) &&
                   col >= 0 && col < std::ssize(map[row]) &&
                   (map[row][col] == ' ' || map[row][col] == 'E');
        };

        // Due to the stack data structure we need to insert
        // elements in the reverse order we want to explore.
        if (is_path(row,col+1)) // East
            next.push({row,col+1});
        if (is_path(row,col-1)) // West
            next.push({row,col-1});
        if (is_path(row+1,col)) // South
            next.push({row+1,col});
        if (is_path(row-1,col)) // North
            next.push({row-1,col});
    }

    // We have explored all reachable spaces
    // and didn't find the exit.
    return false;
}
While the depth-first search is excellent for finding a path, we don’t necessarily get the shortest
path. If our goal is to determine reachability, a depth-first search will be sufficient; however, if we
require the path to be optimal, we must use the breadth-first search.
Breadth-first search
As the name suggests, the algorithm expands in breadth, visiting spaces in lock-step. The algorithm
first visits all spaces next to the starting point, then all spaces next to those, i.e., two spaces away
from the start, then three, four, and so on. To visualize, you can think about how water would flood
the maze from the starting point.
When implementing a breadth-first search, we need a data structure that will allow us to process
the elements in the strict order we discover them, a queue.
Figure 49. Implementation of breadth-first search using a std::queue.
#include <cstdint>
#include <queue>
#include <tuple>
#include <vector>

int64_t bfs(int64_t row, int64_t col, std::vector<std::vector<char>>& map) {
    std::queue<std::tuple<int64_t,int64_t,int64_t>> next;
    next.push({row,col,0});

    // As long as we have spaces to explore.
    while (!next.empty()) {
        auto [row,col,dist] = next.front();
        next.pop();

        // If we reached the exit, we are done.
        // Return the current length.
        if (map[row][col] == 'E')
            return dist;

        // Mark as visited.
        map[row][col] = '.';

        // Helper to check if a space can be stepped on
        // i.e. not out-of-bounds and either empty or exit.
        auto is_path = [&map](int64_t row, int64_t col) {
            return row >= 0 && row < std::ssize(map) &&
                   col >= 0 && col < std::ssize(map[row]) &&
                   (map[row][col] == ' ' || map[row][col] == 'E');
        };

        if (is_path(row-1,col)) // North
            next.push({row-1,col,dist+1});
        if (is_path(row+1,col)) // South
            next.push({row+1,col,dist+1});
        if (is_path(row,col-1)) // West
            next.push({row,col-1,dist+1});
        if (is_path(row,col+1)) // East
            next.push({row,col+1,dist+1});
    }

    // We have explored all reachable spaces
    // and didn't find the exit.
    return -1;
}
Backtracking
Both depth-first and breadth-first searches are traversal algorithms that attempt to reach a specific
goal. The difference between the two algorithms is only in the order in which they traverse the
space.
However, in some situations, we may not know the goal and only know the properties (constraints)
the path toward the goal must fulfill.
The backtracking algorithm explores the solution space in a depth-first order, discarding paths that
do not fulfill the requirements.
Let’s take a look at a concrete example: The N-Queens problem. The goal is to place N-Queens onto
an NxN chessboard without any of the queens attacking each other, i.e., no queens sharing a row,
column, or diagonal.
The paths we explore are partial but valid solutions that build upon each other. In the above example,
we traverse the solution space in row order. First, we pick a position for a queen in the first row,
then second, then third, and finally fourth. The example also demonstrates two dead-ends we reach
if we place the queen in the first row into the first column.
A backtracking algorithm implementation will be similar to a depth-first search. However, we
must keep track of the partial solution (the path), adding to the solution as we explore further and
removing from the solution when we return from a dead-end.
Figure 51. Example implementation of backtracking.
#include <vector>
#include <cstdint>

// Check if we can place a queen in the specified row and column
bool available(std::vector<int64_t>& solution,
               int64_t row, int64_t col) {
    for (int64_t queen = 0; queen < std::ssize(solution); ++queen) {
        // Column occupied
        if (solution[queen] == col)
            return false;
        // NorthEast/SouthWest diagonal occupied
        if (row + col == queen + solution[queen])
            return false;
        // NorthWest/SouthEast diagonal occupied
        if (row - col == queen - solution[queen])
            return false;
    }
    return true;
}

bool backtrack(std::vector<int64_t>& solution, int64_t n) {
    if (std::ssize(solution) == n)
        return true;

    // We are trying to fit a queen on row std::ssize(solution)
    for (int64_t column = 0; column < n; ++column) {
        if (!available(solution, std::ssize(solution), column))
            continue;

        // This space is not in conflict
        solution.push_back(column);
        // We found a solution, exit
        if (backtrack(solution, n))
            return true;
        // Dead-end, remove the queen and try the next column
        solution.pop_back();
    }
    return false;
}
Notable variants
The traversal algorithms mentioned earlier are largely standalone, ready to be deployed to solve
diverse problems with minimal tweaks.
Yet, we frequently encounter a few additional versions. In this section, we’ll tackle three
such variants: traversing multiple dimensions, adjusting for non-unit costs, and managing the
propagation of constraints.
We’ll also illustrate each variant using a concrete problem, all of which you can find in the
companion repository. Try solving each of them before you read the corresponding solution.
Multi-dimensional traversal
Applying a depth-first or breadth-first search to a problem with additional spatial dimensions is
straightforward. From the algorithm’s perspective, additional dimensions only introduce a broader
neighborhood for each space. However, in some problems, the additional dimensions will not be
that obvious.
Consider the following problem: Given a 2D grid of size m*n, containing 0s (spaces) and 1s
(obstacles), determine the length of the shortest path from the coordinate {0,0} to {m-1,n-1}, given
that you can remove up to k obstacles.
Before you continue reading, try solving this problem yourself. The scaffolding for
this problem is located at traversal/obstacles. Your goal is to make the following
commands pass without any errors: bazel test //traversal/obstacles/..., bazel
test --config=addrsan //traversal/obstacles/..., bazel test --config=ubsan
//traversal/obstacles/....
Because we are looking for the shortest path, we don’t have a choice of the traversal algorithm. We
must use a breadth-first search. But how do we deal with the obstacles?
Let’s consider adding a 3rd dimension to the problem. Instead of removing an obstacle, we can
virtually move to a new maze floor, where this obstacle never existed. However, this introduces a
problem. We can’t apply this logic mindlessly since there are potentially m*n obstacles.
Fortunately, we can lean on the behaviour of breadth-first search. When we enter a new floor of
the maze, we have a guarantee that we will never revisit the space we entered through. This means
we do not have to track which specific obstacles we removed, only how many we can still remove,
shrinking the number of floors to k+1. Applying a breadth-first search then leaves us with a total
time complexity of O(m*n*(k+1)).
Figure 52. Breadth-first search in a maze with obstacle removal.
#include <vector>
#include <queue>
#include <cstdint>

struct Dir {
    int64_t row;
    int64_t col;
};

struct Pos {
    int64_t row;
    int64_t col;
    int64_t k;
    int64_t distance;
};

int shortest_path(const std::vector<std::vector<int>>& grid, int64_t k) {
    // Keep track of visited spaces, initialize all spaces as unvisited.
    std::vector<std::vector<std::vector<bool>>> visited(
        grid.size(), std::vector<std::vector<bool>>(
            grid[0].size(), std::vector<bool>(k+1, false)
        )
    );

    // BFS
    std::queue<Pos> q;
    // start in {0,0} with zero removed obstacles
    q.push(Pos{0,0,0,0});
    visited[0][0][0] = true;

    while (!q.empty()) {
        auto current = q.front();
        q.pop();
        // The first time we visit the end coordinate is the shortest path
        if (current.row == std::ssize(grid)-1 &&
            current.col == std::ssize(grid[current.row])-1) {
            return current.distance;
        }

        // For every direction, try to move there
        for (auto dir : {Dir{-1,0}, Dir{1,0}, Dir{0,-1}, Dir{0,1}}) {
            // This space is out of bounds, ignore.
            if ((current.row + dir.row < 0) ||
                (current.col + dir.col < 0) ||
                (current.row + dir.row >= std::ssize(grid)) ||
                (current.col + dir.col >= std::ssize(grid[0])))
                continue;

            // If the space in the current direction is an empty space:
            Pos empty = {current.row + dir.row, current.col + dir.col,
                         current.k, current.distance + 1};
            if (grid[empty.row][empty.col] == 0 &&
                !visited[empty.row][empty.col][empty.k]) {
                // add it to the queue
                q.push(empty);
                // and mark as visited
                visited[empty.row][empty.col][empty.k] = true;
            }

            // If we have already removed k obstacles,
            // we don't consider removing more.
            if (current.k == k)
                continue;

            // If the space in the current direction is an obstacle:
            Pos wall = {current.row + dir.row, current.col + dir.col,
                        current.k + 1, current.distance + 1};
            if (grid[wall.row][wall.col] == 1 &&
                !visited[wall.row][wall.col][wall.k]) {
                // add it to the queue
                q.push(wall);
                // and mark as visited
                visited[wall.row][wall.col][wall.k] = true;
            }
        }
    }

    // If we are here, we did not reach the end coordinate.
    return -1;
}
Before you continue reading, try solving this problem yourself. The scaffolding for
this problem is located at traversal/heightmap. Your goal is to make the following
commands pass without any errors: bazel test //traversal/heightmap/..., bazel
test --config=addrsan //traversal/heightmap/..., bazel test --config=ubsan
//traversal/heightmap/....
The primary requirement of BFS is that we process elements in the order of their distance from the
start of the path. When all transitions have a unit cost, we can achieve this by relying on a queue.
However, with non-unit costs, we must use an ordered structure such as std::priority_queue. Note
that switching to a priority queue will affect the time complexity as we are moving from O(1) push
and pop operations to O(log(n)) push and pop operations.
The second guarantee we lose concerns the shortest path when we first push a space into the queue.
If we discovered a space with a path of length X, we had a guarantee that all later paths that also
lead to this space would, at best, equal X. Because of this property, we could limit ourselves to
adding each space into the queue only once. With non-unit costs, this property no longer holds.
It is possible for a path discovered later (and with more steps) to enter the same space with a
shorter overall path length. Consequently, we might need to insert a space into our queue multiple
times (bounded by the number of neighbours). However, we still have a slightly weaker but still
significant guarantee.
The ordered nature of the priority queue guarantees that the first time we pop a space from the
queue, it is part of the shortest path that enters this space.
Due to the queue’s logarithmic complexity, we end up with O(m*n*log(m*n)) overall time
complexity for the breadth-first search.
Figure 53. Breadth-first search using a priority queue to handle non-unit costs.
#include <vector>
#include <queue>
#include <cstdint>

struct Coord {
    int64_t row;
    int64_t col;
};

int64_t shortest_path(const std::vector<std::vector<int>>& map,
                      Coord start, Coord end) {
    struct Pos {
        int64_t length;
        Coord coord;
    };

    // For tracking visited spaces
    std::vector<std::vector<bool>> visited(map.size(),
        std::vector<bool>(map[0].size(), false));

    // Helper to check whether a space can be stepped on:
    // not out of bounds, not impassable and not visited
    auto can_step = [&map, &visited](Coord coord) {
        auto [row, col] = coord;
        return row >= 0 && col >= 0 &&
               row < std::ssize(map) && col < std::ssize(map[row]) &&
               map[row][col] >= 0 &&
               !visited[row][col];
    };

    // Priority queue instead of a simple queue
    std::priority_queue<Pos, std::vector<Pos>,
        decltype([](const Pos& l, const Pos& r) {
            return l.length > r.length;
        })> q;
    // Start with path length zero at start
    q.push({0, start});

    // Helper to determine the cost of moving between two spaces
    auto step_cost = [&map](Coord from, Coord to) {
        if (map[from.row][from.col] < map[to.row][to.col]) return 4;
        if (map[from.row][from.col] > map[to.row][to.col]) return 1;
        return 2;
    };

    while (!q.empty()) {
        // Grab the position closest to the start
        auto [length, pos] = q.top();
        q.pop();

        if (visited[pos.row][pos.col]) continue;
        // The first time we grab a position from the queue is guaranteed
        // to be the shortest path, so now we need to mark it as visited.
        // If we later visit the same position (already in queue at this point)
        // with a longer path, we skip it based on the above check.
        visited[pos.row][pos.col] = true;

        // The first time we would try to exit the end space is the shortest path.
        if (pos.row == end.row && pos.col == end.col)
            return length;

        // Expand to all four directions
        for (auto next : {Coord{pos.row-1, pos.col},
                          Coord{pos.row+1, pos.col},
                          Coord{pos.row, pos.col-1},
                          Coord{pos.row, pos.col+1}}) {
            if (!can_step(next)) continue;
            q.push({length + step_cost(pos, next), next});
        }
    }

    return -1;
}
Constraint propagation
In the previous section, we used backtracking to solve the N-Queens problem. However, if you look
at the implementation, we repeatedly check each new queen against all previously placed queens.
We can do better.
When working with backtracking, we cannot escape the inherent exponential complexity of the
worst case. However, we can often significantly reduce the exponent by propagating the problem’s
constraints forward. The main objective is to remove as many options from consideration as
possible by ensuring that the constraints are maintained as we extend the partial solution.
Before you continue reading, try modifying the previous version yourself. The scaf-
folding for this problem is located at traversal/queens. Your goal is to make the
following commands pass without any errors: bazel test //traversal/queens/...,
bazel test --config=addrsan //traversal/queens/..., bazel test --config=ubsan
//traversal/queens/....
Specifically for the N-Queens problem, we have N rows, N columns, 2*N-1 NorthWest, and 2*N-1
NorthEast diagonals. Placing a queen translates to claiming one row, column, and the corresponding
diagonals. Instead of checking each queen against all previous queens, we can limit ourselves to
checking whether the corresponding row, column, or one of the two diagonals was already claimed.
Figure 54. Solving the N-Queens problem with backtracking and constraint propagation.
#include <vector>
#include <cstdint>

// Helper to store the current state:
struct State {
    State(int64_t n) : n(n), solution{}, cols(n), nw_dia(2*n-1), ne_dia(2*n-1) {}
    // Size of the problem.
    int64_t n;
    // Partial solution
    std::vector<int64_t> solution;
    // Occupied columns
    std::vector<bool> cols;
    // Occupied NorthWest diagonals
    std::vector<bool> nw_dia;
    // Occupied NorthEast diagonals
    std::vector<bool> ne_dia;
    // Check column, and both diagonals
    bool available(int64_t row, int64_t col) const {
        return !cols[col] && !nw_dia[row-col+n-1] && !ne_dia[row+col];
    }
    // Mark this position as occupied and add it to the partial solution
    void mark(int64_t row, int64_t col) {
        solution.push_back(col);
        cols[col] = true;
        nw_dia[row-col+n-1] = true;
        ne_dia[row+col] = true;
    }
    // Unmark this position as occupied and remove it from the partial solution
    void erase(int64_t row, int64_t col) {
        solution.pop_back();
        cols[col] = false;
        nw_dia[row-col+n-1] = false;
        ne_dia[row+col] = false;
    }
};

bool backtrack(auto& state, int64_t row, int64_t n) {
    // All queens have their positions, we have a solution
    if (row == n) return true;

    // Try to find a feasible column on this row
    for (int64_t c = 0; c < n; ++c) {
        if (!state.available(row,c))
            continue;
        // Mark this position
        state.mark(row,c);
        // Recurse to the next row
        if (backtrack(state, row+1, n))
            return true; // We found a solution on this path
        // This position led to a dead-end, erase it and try another
        state.erase(row,c);
    }
    // This is a dead-end
    return false;
}
Canonical problems
Traversal algorithms are possibly the most frequent algorithms during technical interviews. In this
section, we will limit ourselves to only five problems that exemplify the different variants of traversal
algorithms we have discussed in the last two sections.
Locked rooms
Given an array of n locked rooms, each room containing 0..n distinct keys, determine whether you
can visit each room. You are given the key to room zero, and each room can only be opened with
the corresponding key (however, there may be 0..n copies of that key).
Assume you can freely move between rooms; the key is the only thing you need.
The scaffolding for this problem is located at traversal/locked. Your goal is to make
the following commands pass without any errors: bazel test //traversal/locked/...,
bazel test --config=addrsan //traversal/locked/..., bazel test --config=ubsan
//traversal/locked/....
For example, in the above situation, we can open the red lock to collect the blue and green keys,
then the green lock to collect the brown key, and finally, open the remaining blue and brown locks.
Bus routes
Given a list of bus routes, where route[i] = {b1,b2,b3} means that bus i stops at stops b1, b2, and b3,
determine the smallest number of buses you need to reach the target bus stop starting at the source.
Return -1 if the target is unreachable.
The scaffolding for this problem is located at traversal/buses. Your goal is to make
the following commands pass without any errors: bazel test //traversal/buses/...,
bazel test --config=addrsan //traversal/buses/..., bazel test --config=ubsan
//traversal/buses/....
Figure 56. Example of possible sequences of bus trips for different combinations of source and target stops.
In the above situation, we can reach stop six from stop one by first taking the red bus and then
switching to the blue bus at stop four.
Counting islands
Given a map as a std::vector<std::vector<char>> where ‘L’ represents land and ‘W’ represents water,
return the number of islands on the map.
An island is an orthogonally (four directions) connected area of land spaces that is fully (orthogo-
nally) surrounded by water.
The scaffolding for this problem is located at traversal/islands. Your goal is to make
the following commands pass without any errors: bazel test //traversal/islands/...,
bazel test --config=addrsan //traversal/islands/..., bazel test --config=ubsan
//traversal/islands/....
For example, in the above map, we only have one island since no other land masses are fully
surrounded by water.
Figure 58. Example of all possible valid combinations of parentheses for three pairs of parentheses.
For example, for n==3 all valid combinations are: ()()(), (()()), ((())), (())() and ()(()).
Sudoku solver
Given a Sudoku puzzle as std::vector<std::vector<char>>, where unfilled spaces are represented as a
space, solve the puzzle.
In a solved Sudoku puzzle, each of the nine rows, columns, and 3x3 boxes must contain all digits
1..9.
The scaffolding for this problem is located at traversal/sudoku. Your goal is to make
the following commands pass without any errors: bazel test //traversal/sudoku/...,
bazel test --config=addrsan //traversal/sudoku/..., bazel test --config=ubsan
//traversal/sudoku/....
Hints
Locked rooms
1. Think about the keys.
2. The question we need to answer is whether we can collect all keys.
3. We do not need to find the shortest route; therefore, a depth-first search will be good enough.
Bus routes
1. Don’t think in terms of bus stops.
2. You will need to pre-compute something.
3. Pre-computing a connection mapping (which other buses we can switch to from a given bus)
will allow you to traverse over the buses.
4. We need the shortest path, so we must use a breadth-first search.
Counting islands
1. If we traverse a potential island, we will visit all its spaces and all of its neighbouring spaces.
2. What type of space will we not encounter if the landmass is an island?
Sudoku solver
1. We have walked through the solution for a very similar problem.
2. Try modifying the solution for N-Queens.
3. What constraints can we propagate to improve the solving performance?
Solutions
Locked rooms
Let’s start with our goal. We want to determine whether we can visit all the locked rooms. However,
this is a bit too complex, as we would need to consider both rooms and keys. We can simplify the
problem by reformulating our goal: collect a complete set of keys.
Because we are not concerned with the optimality of our solution, only whether it is possible to
collect all keys, we can choose depth-first search as our base algorithm. We will use one key in each
step of our solution. Using a key will potentially give us access to more keys.
Figure 60. Example of one possible DFS execution on the example problem.
Once we run out of keys, we can check whether we have collected a complete set. With a complete
set of keys, we can visit all rooms.
Bus routes
We are trying to find the shortest path that minimizes the number of changes at bus stops. We could
therefore use a breadth-first search and search across bus stops.
However, that poses a problem. We don’t have a convenient way to determine which bus stops we
can reach. We could construct a data structure representing which bus stops can be reached by a
single connection. However, such a data structure would grow based on the overall number of bus
stops.
We can do a lot better. Instead of considering bus stops, we can think in terms of bus lines. We still
need to build a data structure that will provide the mapping of connections, i.e., for each bus line,
list all other bus lines that we can switch to directly from this line. However, the big difference is
that now the size of our data structure scales with the number of bus lines, not bus stops.
To construct the bus line mapping, we can sort the list of bus stops for each line and then check
each pair of buses for overlap. This leads to O(s*log(s)) complexity for the sort and O(b*b*s) for
the construction of the line mapping.
Executing the breadth-first search on the bus line mapping will require O(b*b) time.
Figure 62. Solution for the Bus routes problem.
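A sketch of this approach follows; the function name and interface are my assumptions, and the figure in the commercial edition may differ in details.

```cpp
#include <vector>
#include <queue>
#include <algorithm>
#include <utility>
#include <cstdint>

// Returns the smallest number of buses needed to get from the source
// stop to the target stop, or -1 if the target is unreachable.
int64_t min_buses(std::vector<std::vector<int64_t>> routes,
                  int64_t source, int64_t target) {
    if (source == target) return 0;
    // Sort each route so we can check two routes for overlap in O(s).
    for (auto& route : routes)
        std::sort(route.begin(), route.end());
    auto overlap = [](const std::vector<int64_t>& l,
                      const std::vector<int64_t>& r) {
        // Two-pointer intersection check on sorted ranges.
        size_t i = 0, j = 0;
        while (i < l.size() && j < r.size()) {
            if (l[i] == r[j]) return true;
            l[i] < r[j] ? ++i : ++j;
        }
        return false;
    };
    // connections[i] lists all lines we can switch to from line i.
    std::vector<std::vector<int64_t>> connections(routes.size());
    for (size_t i = 0; i < routes.size(); ++i)
        for (size_t j = i + 1; j < routes.size(); ++j)
            if (overlap(routes[i], routes[j])) {
                connections[i].push_back(j);
                connections[j].push_back(i);
            }
    // BFS over bus lines, seeded with every line serving the source stop.
    std::queue<std::pair<int64_t, int64_t>> q; // {line, buses taken}
    std::vector<bool> visited(routes.size(), false);
    for (size_t i = 0; i < routes.size(); ++i)
        if (std::binary_search(routes[i].begin(), routes[i].end(), source)) {
            q.push({(int64_t)i, 1});
            visited[i] = true;
        }
    while (!q.empty()) {
        auto [line, buses] = q.front();
        q.pop();
        if (std::binary_search(routes[line].begin(), routes[line].end(), target))
            return buses;
        for (int64_t next : connections[line])
            if (!visited[next]) {
                visited[next] = true;
                q.push({next, buses + 1});
            }
    }
    return -1;
}
```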
Counting islands
Our first objective is to figure out a way to determine that a connected piece of land is an island.
If we consider this problem from the perspective of traversing a piece of land, we will encounter
not only the spaces this piece of land occupies but also all neighbouring spaces (otherwise, we could
miss a piece of land). Therefore, we can reformulate this property.
A piece of land is an island if we do not encounter the map boundary during our traversal.
Encountering a land space extends this land mass, and encountering water maintains the island
property.
To ensure that we check all possible islands, we have to scan through the entire map, and when we
encounter a space that hasn’t been traversed yet, we start a new traversal to determine whether this
land mass is an island.
So far, I haven’t specified whether we should use a depth-first or a breadth-first search. Unlike most
other problems, where there is a clear preference towards one or the other, in this case, both end up
equal in both time and space complexity. The example solution relies on a depth-first search.
Figure 63. Solution for the counting islands problem.
// depth-first search
bool island(int64_t i, int64_t j, std::vector<std::vector<char>>& grid) {
    // If we managed to reach out of bounds, this is not an island
    if (i == -1 || i == std::ssize(grid) || j == -1 || j == std::ssize(grid[i]))
        return false;
    // If this space is not land, ignore
    if (grid[i][j] != 'L')
        return true;
    // Mark this space as visited
    grid[i][j] = 'V';

    // We can only return true (this is an island) if all four
    // directions of our DFS return true. However, at the same time
    // even if this is not an island we want to explore all spaces
    // of the land mass, just to mark it as visited.
    // If we used a boolean expression, we would run into
    // short-circuiting, the first "false" result would stop
    // the evaluation.
    // Here we take advantage of the bool->int conversion:
    // false == 0, true == 1
    return (island(i-1,j,grid) + island(i+1,j,grid)
        + island(i,j-1,grid) + island(i,j+1,grid)) == 4;
}

int count_islands(std::vector<std::vector<char>> grid) {
    int cnt = 0;
    // For every space
    for (int64_t i = 0; i < std::ssize(grid); ++i)
        for (int64_t j = 0; j < std::ssize(grid[i]); ++j)
            // If it is an unvisited land space, check if it is an island
            if (grid[i][j] == 'L' && island(i,j,grid))
                ++cnt;
    return cnt;
}
Sudoku solver
One of the requirements for a proper Sudoku puzzle is that it can be solved entirely without guessing
simply by propagating the constraints.
However, implementing a non-guessing Sudoku solver is not something you could do within a
coding interview; therefore, we will need to limit our scope and do at least some guessing. At
the same time, we do not want to completely brute force the puzzle, as that will be pretty slow.
A good middle ground is applying the primary Sudoku constraint: each number can only appear
once in each row, column, and box. Consequently, if we are guessing a number for a particular
space, we can skip all the numbers already present in that row, column, and box.
Figure 65. Example of the effect of primary Sudoku constraints. The highlighted cell has only two possible values:
six and seven.
The implementation mirrors the solution for the N-Queens problem with constraint propagation;
however, because we are working with a statically sized puzzle (9x9), we can additionally take
advantage of the fastest C++ containers: std::array and std::bitset.
Each Sudoku puzzle has nine rows, nine columns, and nine boxes. Each of which we can represent
with a std::bitset, where 1s represent digits already present in the corresponding row, column, or
box.
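One possible shape for such a state is sketched below; the member and function names are my assumptions, not the book's exact figure.

```cpp
#include <array>
#include <bitset>

// One bitset per row, column and box; bit d is set when digit d+1
// is already present in the corresponding row/column/box.
struct SudokuState {
    std::array<std::bitset<9>, 9> rows{}, cols{}, boxes{};

    // Map a cell to its 3x3 box index.
    static int box(int r, int c) { return (r / 3) * 3 + c / 3; }

    // A digit is available if its row, column and box are all free.
    bool available(int r, int c, int digit) const {
        return !rows[r][digit - 1] && !cols[c][digit - 1] &&
               !boxes[box(r, c)][digit - 1];
    }
    void mark(int r, int c, int digit) {
        rows[r].set(digit - 1);
        cols[c].set(digit - 1);
        boxes[box(r, c)].set(digit - 1);
    }
    void unmark(int r, int c, int digit) {
        rows[r].reset(digit - 1);
        cols[c].reset(digit - 1);
        boxes[box(r, c)].reset(digit - 1);
    }
};
```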
        state.unmark(r_next, c_next, i);
        puzzle[r_next][c_next] = ' ';
        // And try the next digit
    }
    return false;
}

bool solve(std::vector<std::vector<char>>& puzzle) {
    State state(puzzle);
    return backtrack(puzzle, state, 0, 0);
}
Trees
Interview questions that include trees can be tricky, notably in C++. You might expect problems
involving trees to be of similar complexity to linked lists. In fact, on a fundamental level, both trees
and linked lists are directed graphs. However, unlike linked lists, trees do not get support from the
standard C++ library. No data structure can directly represent trees, and no algorithms can directly
operate on trees.¹
Representing trees
Since we cannot rely on the standard library to provide a tree data structure, we must build our own.
The design options mirror our approaches when implementing a custom linked list (see. Custom
lists).
The most straightforward approach for a binary tree would be to rely on std::unique_ptr and have
each node own its children.
Figure 67. Flawed approach for implementing a binary tree.
#include <memory>
#include <string>

template <typename T>
struct TreeNode {
    T value = T{};
    std::unique_ptr<TreeNode> left;
    std::unique_ptr<TreeNode> right;
};

auto root = std::make_unique<TreeNode<std::string>>(
    "root node", nullptr, nullptr);
// root->value == "root node"
root->left = std::make_unique<TreeNode<std::string>>(
    "left node", nullptr, nullptr);
// root->left->value == "left node"
root->right = std::make_unique<TreeNode<std::string>>(
    "right node", nullptr, nullptr);
// root->right->value == "right node"
While this might be tempting, and notably, this approach even makes sense from an ownership
perspective, it suffers from the same recursive destruction problem as the linked list.
When working with well-balanced trees, the problem might not manifest; however, a forward-only
linked list is still a valid binary tree, so we can easily trigger the problem.
¹ You could argue that heap algorithms fit into this category.
As a reminder: The recursive nature comes from the chaining of std::unique_ptr. As part of
destroying a std::unique_ptr<TreeNode<int>> we first need to destroy the child nodes, which
first need to destroy their children, and so on. Each program has a limited stack space, and
a sufficiently deep naive binary tree can quickly exhaust this space.
While the above approach isn’t quite suitable for production code, it does offer a convenient
interface. For example, splicing the tree requires only calling std::swap on the source and destination
std::unique_ptr, which will work even across trees.
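For illustration, such a splice boils down to a single swap of the owning pointers; the helper name below is mine, not part of the book's interface.

```cpp
#include <memory>
#include <string>
#include <utility>

template <typename T>
struct TreeNode {
    T value = T{};
    std::unique_ptr<TreeNode> left;
    std::unique_ptr<TreeNode> right;
};

// Move the left subtree of `from` into the right slot of `to`.
// Ownership transfers wholesale, even between two different trees.
template <typename T>
void splice_left_to_right(TreeNode<T>& from, TreeNode<T>& to) {
    std::swap(from.left, to.right);
}
```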
To avoid recursive destruction, we can separate the encoding of the structure of the tree from
resource ownership.
Figure 69. A binary tree with structure and resource ownership separated.
template <typename T>
struct Tree {
    struct Node {
        T value;
        Node* left = nullptr;
        Node* right = nullptr;
    };
    // Construct a new node; ownership stays with the tree.
    Node* add(auto&&... args) {
        storage_.push_back(std::make_unique<Node>(
            std::forward<decltype(args)>(args)...));
        return storage_.back().get();
    }
    Node* root = nullptr;
private:
    std::vector<std::unique_ptr<Node>> storage_;
};

Tree<std::string> tree;
tree.root = tree.add("root node");
// tree.root->value == "root node"
tree.root->left = tree.add("left node");
// tree.root->left->value == "left node"
tree.root->right = tree.add("right node");
// tree.root->right->value == "right node"
This approach does completely remove the recursive destruction; however, we pay for that. While
we can still easily splice within a single tree, splicing between multiple trees becomes cumbersome
(because it involves splicing between the resource pools).
In the context of C++, neither of the above solutions is particularly performance-friendly. The
biggest problem is that we are allocating each node separately, which means that they can be
allocated far apart, in the worst-case situation, each node mapping to a different cache line.
Conceptually, the solution is obvious: flatten the tree. However, as we are talking about a
performance-sensitive design, the specific details of the approach matter a lot. A serious
implementation will have to take into account the specific data access pattern.
The following is one possible approach for a binary tree.
Figure 70. One possible approach for representing a binary tree using flat data structures.
#include <string>
#include <vector>

template <typename T>
struct Tree {
    struct Children {
        size_t left = 0;
        size_t right = 0;
    };
    // Values and structure live in flat arrays;
    // nodes are referred to by their index.
    std::vector<T> data;
    std::vector<Children> children;

    size_t add(auto&&... args) {
        data.emplace_back(std::forward<decltype(args)>(args)...);
        children.push_back(Children{});
        return data.size()-1;
    }
    size_t add_as_left_child(size_t idx, auto&&... args) {
        size_t cid = add(std::forward<decltype(args)>(args)...);
        children[idx].left = cid;
        return cid;
    }
    size_t add_as_right_child(size_t idx, auto&&... args) {
        size_t cid = add(std::forward<decltype(args)>(args)...);
        children[idx].right = cid;
        return cid;
    }
};

Tree<std::string> tree;
auto root = tree.add("root node");
// tree.data[root] == "root node"
auto left = tree.add_as_left_child(root, "left node");
// tree.data[left] == "left node", tree.children[root].left == left
auto right = tree.add_as_right_child(root, "right node");
// tree.data[right] == "right node", tree.children[root].right == right
As usual, we pay for the added performance by increased complexity. We must refer to nodes
through their indices since both iterators and references get invalidated during a std::vector
reallocation. On top of that, implementing splicing for a flat tree would be non-trivial and not
particularly performant as it involves re-indexing.
Tree traversals
Before you read this section, I encourage you to familiarize yourself with depth-first and
breadth-first search. Both searches are suitable for traversing a tree.
However, for binary trees in particular, the property we care about is the specific order in which we
visit the nodes of the tree. We will start with three traversals that are all based on the depth-first
search.
Pre-order traversal
In pre-order traversal, we visit each node before visiting its children.
Figure 72. Order of visiting nodes using pre-order traversal in a full binary tree.
A typical use case for pre-order traversal is when we need to serialise or deserialise a tree. In the
following example, we serialise a binary tree as a series of space-delimited integers with missing
child nodes represented by zeroes.
Figure 73. Serializing a binary tree of integers into a stream of space-delimited integers.
We must have already deserialised the parent node before we can insert its children into the tree.
This is why pre-order traversal is a natural fit for this use case.
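Such a serialiser might look like the following sketch, assuming a simple owning-pointer node type; since zero marks a missing child, stored values are assumed to be non-zero.

```cpp
#include <memory>
#include <ostream>
#include <sstream>

struct Node {
    int value = 0;
    std::unique_ptr<Node> left;
    std::unique_ptr<Node> right;
};

// Pre-order serialisation: emit the node first, then its children.
// Missing children are encoded as zeroes.
void serialize(const Node* node, std::ostream& out) {
    if (node == nullptr) {
        out << 0 << ' ';
        return;
    }
    out << node->value << ' ';
    serialize(node->left.get(), out);
    serialize(node->right.get(), out);
}
```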
Non-recursive pre-order
With a recursive approach, we can run into the same stack exhaustion problem we faced during tree
destruction. Fortunately, similar to the baseline depth-first search, we can switch to a non-recursive
implementation by relying on a std::stack or std::vector to store the traversal state.
Figure 75. Non-recursive implementation of pre-order traversal.
void pre_order_stack(Node* root, const std::function<void(Node*)>& visitor) {
    std::stack<Node*> stack;
    stack.push(root);
    while (!stack.empty()) {
        Node *curr = stack.top();
        stack.pop();
        visitor(curr);
        // We visit "null" nodes with this approach, which might be helpful.
        if (curr == nullptr) continue;
        // Alternatively, we could move this condition to the push:
        // if (curr->right != nullptr) stack.push(curr->right);

        // We must insert in reverse to maintain the same
        // ordering as recursive pre-order.
        stack.push(curr->right);
        stack.push(curr->left);
    }
}
Post-order traversal
In post-order traversal, we visit each node after its children.
Figure 76. Recursive post-order traversal of a binary tree.
Figure 77. Order of visiting nodes using post-order traversal in a full binary tree.
Because of this ordering, one use case for post-order traversal is expression trees, where we can
only evaluate the parent expression once both of its children have been evaluated.
Figure 78. Example of a simple expression tree implementation where each node contains a value or a simple
operation.
struct Eventual {
    std::variant<int, // value
                 // or operation
                 std::function<int(const Eventual& l, const Eventual& r)>> content;
};

Tree<Eventual> tree;
auto plus = [](const Eventual& l, const Eventual& r) {
    return get<0>(l.content) + get<0>(r.content);
};
auto minus = [](const Eventual& l, const Eventual& r) {
    return get<0>(l.content) - get<0>(r.content);
};
auto times = [](const Eventual& l, const Eventual& r) {
    return get<0>(l.content) * get<0>(r.content);
};
// encode (4-2)*(2+1)
auto root = tree.root = tree.add(Eventual{times});
root->left = tree.add(Eventual{minus});
root->left->left = tree.add(Eventual{4});
root->left->right = tree.add(Eventual{2});
root->right = tree.add(Eventual{plus});
root->right->left = tree.add(Eventual{2});
root->right->right = tree.add(Eventual{1});
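Evaluation is then a post-order visit that folds children into their parent. The following self-contained sketch mirrors the Eventual type with a plain non-owning node type; the names are my assumptions.

```cpp
#include <functional>
#include <variant>

struct Eventual {
    std::variant<int, // value
                 // or operation
                 std::function<int(const Eventual&, const Eventual&)>> content;
};

struct Node {
    Eventual value;
    Node* left = nullptr;
    Node* right = nullptr;
};

// Post-order evaluation: both children hold plain values
// by the time we process the parent's operation.
void evaluate(Node* node) {
    if (node == nullptr) return;
    evaluate(node->left);
    evaluate(node->right);
    if (node->value.content.index() == 1) {
        auto op = std::get<1>(node->value.content);
        node->value.content = op(node->left->value, node->right->value);
    }
}
```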
Non-recursive post-order
For a non-recursive approach, we could visit all nodes in a mirrored pre-order (node, right child,
left child), remembering each node, and then iterate over the remembered nodes in reverse order.
However, we can do better.
The main problem we must solve is remembering enough information to correctly decide whether
it is time to visit the parent node. The following approach eagerly explores the left sub-tree,
remembering both the right sibling and the parent node. When we revisit the parent node, we
can decide whether it is time to visit it based on the presence of the right sibling.
Figure 79. Non-recursive post-order traversal with only partial memoization.
void post_order_stack(Node* root, const std::function<void(Node*)>& visitor) {
    std::stack<Node*> s;
    Node* current = root;
    while (true) {
        // Eagerly explore the left subtree; before descending into
        // a node's left child, remember the right sibling (if any)
        // below the node itself on the stack.
        while (current != nullptr) {
            if (current->right != nullptr)
                s.push(current->right);
            s.push(current);
            current = current->left;
        }
        if (s.empty()) return;
        current = s.top();
        s.pop();
        // If we have the right child remembered,
        // it would be on the top of the stack.
        if (current->right && !s.empty() && current->right == s.top()) {
            // If it is, we must visit it (and its children) first.
            s.pop();
            s.push(current);
            current = current->right;
        } else {
            // If the top of the stack is not the right child,
            // we have already visited it.
            visitor(current);
            current = nullptr;
        }
    }
}
In-order traversal
In in-order traversal, we visit each node in between visiting its left and right children.
Unlike pre- and post-order traversals, which are relatively general and can easily be applied to
n-ary trees, in-order traversal only makes sense in the context of binary trees.
Figure 80. Recursive in-order traversal of a binary tree.
// in-order traversal
void in_order(Node* node, const std::function<void(Node*)>& visitor) {
    if (node == nullptr) return;
    in_order(node->left, visitor);
    visitor(node);
    in_order(node->right, visitor);
}
Figure 81. Order of visiting nodes using in-order traversal in a full binary tree.
The typical use case for in-order traversal is for traversing binary trees that encode an ordering of
elements. The in-order traversal naturally maintains this order during the traversal.
Figure 82. Traversing a BST to produce a sorted output.
Non-recursive in-order
The non-recursive approach is similar to post-order, but we avoid the complexity of remembering
the right child.
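A minimal sketch of this approach, assuming the same binary Node and visitor types as the previous figures (an illustrative implementation, not the book's exact listing): the stack only remembers the chain of parents whose left sub-tree we are currently exploring.

```cpp
#include <cassert>
#include <functional>
#include <stack>
#include <vector>

struct Node {
    int value;
    Node* left = nullptr;
    Node* right = nullptr;
};

// Non-recursive in-order traversal.
void in_order(Node* root, const std::function<void(Node*)>& visitor) {
    std::stack<Node*> stack;
    Node* current = root;
    while (current != nullptr || !stack.empty()) {
        // Eagerly descend into the left sub-tree, remembering parents.
        while (current != nullptr) {
            stack.push(current);
            current = current->left;
        }
        // The left sub-tree is exhausted: visit the node itself,
        // then continue with its right sub-tree.
        current = stack.top();
        stack.pop();
        visitor(current);
        current = current->right;
    }
}
```

Because every node is pushed and popped exactly once, the traversal runs in O(n) time with O(h) extra space for a tree of height h.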
Rank-order traversal
The rank-order or level-order traversal traverses nodes in the order of their distance from the root
node.
All the previous traversals (pre-order, post-order, and in-order) are based on depth-first search. Rank-order traversal is instead based on breadth-first search, which naturally avoids the recursion problem.
Figure 84. Rank-order traversal implementation.
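A minimal sketch of a queue-based rank-order traversal, assuming the same binary Node and visitor types as the earlier figures (illustrative; the details of the book's listing may differ):

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <vector>

struct Node {
    int value;
    Node* left = nullptr;
    Node* right = nullptr;
};

// Breadth-first search: the queue always holds nodes in
// non-decreasing distance from the root.
void rank_order(Node* root, const std::function<void(Node*)>& visitor) {
    if (root == nullptr) return;
    std::queue<Node*> queue;
    queue.push(root);
    while (!queue.empty()) {
        Node* current = queue.front();
        queue.pop();
        visitor(current);
        // Children are one level further from the root, so they are
        // enqueued behind every node on the current level.
        if (current->left != nullptr) queue.push(current->left);
        if (current->right != nullptr) queue.push(current->right);
    }
}
```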
Figure 85. Order of visiting nodes using rank-order traversal in a full binary tree.
Rank-order traversal typically comes up as part of more complex problems. Most commonly, it is used to find the node closest to the root that satisfies particular criteria, or to calculate each node's distance from the root.
Figure 86. Calculating the maximum node value at each level of the tree.
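One way to compute a per-level aggregate such as the maximum is to process the queue one full level per iteration of the outer loop; a hedged sketch, assuming a binary Node with int values (level_maximums is an illustrative name):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

struct Node {
    int value;
    Node* left = nullptr;
    Node* right = nullptr;
};

// Every node in the queue at the start of the outer loop lies on
// the same level, so we can drain exactly that many nodes and
// aggregate them before moving one level down.
std::vector<int> level_maximums(Node* root) {
    std::vector<int> result;
    if (root == nullptr) return result;
    std::queue<Node*> queue;
    queue.push(root);
    while (!queue.empty()) {
        std::size_t level_size = queue.size();
        int maximum = queue.front()->value;
        for (std::size_t i = 0; i < level_size; ++i) {
            Node* current = queue.front();
            queue.pop();
            maximum = std::max(maximum, current->value);
            if (current->left != nullptr) queue.push(current->left);
            if (current->right != nullptr) queue.push(current->right);
        }
        result.push_back(maximum);
    }
    return result;
}
```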
Binary search trees
One specific type of tree you can encounter during interviews is the binary search tree. This tree encodes a simple property: for each node, all values in the left subtree are lower than the value of this node, and all values in the right subtree are higher.
A balanced binary search tree can be used as a quick lookup table, as we can look up any value in O(log n) operations. However, whether we arrive at a balanced tree very much depends on the order in which elements are inserted into the tree, as the binary search tree doesn't come with any self-balancing algorithm (for that, we would have to go to Red-Black trees, which are outside the scope of this book).
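The lookup descends one level per comparison, which is what yields O(log n) on a balanced tree; a minimal sketch, assuming a binary Node with int values (contains is an illustrative helper, not part of the book's scaffolding):

```cpp
#include <cassert>

struct Node {
    int value;
    Node* left = nullptr;
    Node* right = nullptr;
};

// Each comparison discards one of the two subtrees, so a lookup
// in a balanced tree takes O(log n) steps.
bool contains(const Node* node, int value) {
    while (node != nullptr) {
        if (value == node->value) return true;
        node = value < node->value ? node->left : node->right;
    }
    return false;
}
```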
Constructing a BST
To construct a binary search tree, we follow the lookup logic to find a null node where the added
value should be located.
Figure 88. Constructing a BST from a range.
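A minimal sketch of this construction, assuming a binary Node with int values (Tree, insert, and make_bst are illustrative names; the book's scaffolding may differ):

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Node {
    int value;
    Node* left = nullptr;
    Node* right = nullptr;
};

struct Tree {
    Node* root = nullptr;
    // Follow the lookup logic until we find the null position
    // where the new value belongs.
    void insert(int value) {
        storage_.push_back(std::unique_ptr<Node>(new Node{value}));
        Node* added = storage_.back().get();
        if (root == nullptr) {
            root = added;
            return;
        }
        Node* current = root;
        while (true) {
            if (value < current->value) {
                if (current->left == nullptr) { current->left = added; return; }
                current = current->left;
            } else {
                if (current->right == nullptr) { current->right = added; return; }
                current = current->right;
            }
        }
    }
private:
    // The tree owns all of its nodes.
    std::vector<std::unique_ptr<Node>> storage_;
};

Tree make_bst(const std::vector<int>& values) {
    Tree t;
    for (int v : values) t.insert(v);
    return t;
}
```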
As mentioned above, the binary search tree doesn’t come with any self-balancing algorithms; we
can, therefore, end up in pathological situations, notably when constructing a binary search tree
from a sorted input.
Validating a BST
Binary search trees frequently appear during coding interviews as they are relatively simple, yet
they encode an interesting property.
The most straightforward problem (aside from constructing a BST) is validating whether a binary
tree is a binary search tree.
Before you continue reading, I encourage you to try to solve it yourself. The scaffolding for this problem is located at trees/validate_bst. Your goal is to make the following commands pass without any errors: bazel test //trees/validate_bst/..., bazel test --config=addrsan //trees/validate_bst/..., bazel test --config=ubsan //trees/validate_bst/....
When we check a particular node in a binary search tree, its value sets an upper bound on all the values in its left subtree and a lower bound on all the values in its right subtree.
Figure 90. Example of partitioning of values imposed by nodes in a binary search tree.
If we traverse the tree while keeping track of and verifying these bounds, we validate the BST: if we do not discover any violations, the tree is a BST; if we do, it isn't.
Figure 91. Validating a binary search tree.
Note that this solution assumes no repeated values. To support duplicate values, we would have to adjust the check in the left branch, recursing with the same limit value instead of the value minus one.
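A sketch of the bounds-tracking validation for distinct values; the bounds are widened to long long so the initial sentinels cannot collide with any int value (illustrative code, not the book's listing):

```cpp
#include <cassert>
#include <limits>

struct Node {
    int value;
    Node* left = nullptr;
    Node* right = nullptr;
};

// Every ancestor imposes a bound: going left tightens the upper
// bound, going right tightens the lower bound.
bool validate(const Node* node, long long low, long long high) {
    if (node == nullptr) return true;
    if (node->value <= low || node->value >= high) return false;
    return validate(node->left, low, node->value) &&
           validate(node->right, node->value, high);
}

bool is_bst(const Node* root) {
    return validate(root, std::numeric_limits<long long>::min(),
                    std::numeric_limits<long long>::max());
}
```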
Paths in trees
Another ubiquitous category of tree-oriented problems is dealing with paths in trees. Notably, paths
in trees can be non-trivial to reason about, but at the same time, the tree structure still offers the
possibility for very efficient solutions.
A path is a sequence of nodes where every two consecutive nodes have a parent/child relationship,
and each node is visited at most once.
Before you continue reading, I encourage you to try to solve it yourself. The scaffolding for this problem is located at trees/maximum_path. Your goal is to make the following commands pass without any errors: bazel test //trees/maximum_path/..., bazel test --config=addrsan //trees/maximum_path/..., bazel test --config=ubsan //trees/maximum_path/....
Let’s consider a single node in the tree. Only four possible paths can be the maximum path that crosses this node:
• the node by itself,
• a path that terminates in this node and continues into the left child,
• a path that terminates in this node and continues into the right child,
• a path that passes through this node, joining a path from the left child with a path from the right child.
Considering the above list, we can limit the information we need to calculate the maximum path in
a sub-tree whose root is the above node.
If the maximum path doesn’t cross this node, then the path is entirely contained in one of the child
subtrees.
If the path crosses this node, we can calculate the maximum path by using the information about
maximum paths that terminate in the left and right child.
The maximum path crossing this node is the maximum of the above paths.
Now that we know what to calculate, we can traverse the tree in post-order (visiting the children
before the parent node) while keeping track of the aforementioned values.
Figure 94. Solution using post-order traversal.
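A sketch of the post-order computation described above, assuming the path weight is the sum of possibly negative int node values (illustrative, not the book's exact listing):

```cpp
#include <algorithm>
#include <cassert>
#include <limits>

struct Node {
    int value;
    Node* left = nullptr;
    Node* right = nullptr;
};

// Returns the maximum path that terminates in node and continues
// into at most one child; updates best with the maximum path that
// crosses any node in this sub-tree.
int best_ending_here(const Node* node, int& best) {
    if (node == nullptr) return 0;
    // A negative downward path can always be replaced by an empty one.
    int left = std::max(0, best_ending_here(node->left, best));
    int right = std::max(0, best_ending_here(node->right, best));
    // The best path crossing this node joins both downward paths.
    best = std::max(best, node->value + left + right);
    return node->value + std::max(left, right);
}

int maximum_path(const Node* root) {
    int best = std::numeric_limits<int>::min();
    best_ending_here(root, best);
    return best;
}
```

Dropping negative downward paths with std::max(0, ...) is what encodes the choice between the four path shapes at each node.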
Canonical problems
Tree problems can cover quite a range, from simple variants of the basic traversals through various
variants of paths in trees to tricky problems that require non-trivial analysis for an efficient solution.
This section covers three medium complexity problems: (de)serializing an n-ary tree, all nodes’ k-
distance, and the number of reorders of a BST. The section also covers two tricky problems: sum of
distances to all nodes and well-behaved paths.
struct Node {
    uint32_t value;
    std::vector<Node*> children;
};

struct Tree {
    Node* root = nullptr;
    // Add node to the tree, when parent == nullptr, the method sets the tree root
    Node* add_node(uint32_t value, Node* parent = nullptr);

    friend std::istream& operator>>(std::istream& s, Tree& tree);
    friend std::ostream& operator<<(std::ostream& s, Tree& tree);
private:
    std::vector<std::unique_ptr<Node>> storage_;
};
Each node stores a uint32_t value and a vector of non-owning pointers to its children. To add a node to the tree, use the add_node method (the method sets the tree root when the parent is nullptr).
The scaffolding for this problem is located at trees/nary_tree. Your goal is to make
the following commands pass without any errors: bazel test //trees/nary_tree/...,
bazel test --config=addrsan //trees/nary_tree/..., bazel test --config=ubsan
//trees/nary_tree/....
Figure 97. Example tree with highlighted nodes distance two from the node with value 9.
The scaffolding for this problem is located at trees/kdistance. Your goal is to make
the following commands pass without any errors: bazel test //trees/kdistance/...,
bazel test --config=addrsan //trees/kdistance/..., bazel test --config=ubsan
//trees/kdistance/....
Figure 98. Example of a tree with four nodes and the corresponding calculated sums of distances.
The scaffolding for this problem is located at trees/sum_distances. Your goal is to make
the following commands pass without any errors: bazel test //trees/sum_distances/...,
bazel test --config=addrsan //trees/sum_distances/..., bazel test --config=ubsan
//trees/sum_distances/....
Return the number of well-behaved paths. A well-behaved path begins and ends in nodes with the same value, with all intermediate nodes having values lower than or equal to the value at the ends of the path.
Figure 99. Example of a tree with five single-node well-behaved paths and one four-node (dashed line) well-behaved
path.
The scaffolding for this problem is located at trees/well_behaved. Your goal is to make
the following commands pass without any errors: bazel test //trees/well_behaved/...,
bazel test --config=addrsan //trees/well_behaved/..., bazel test --config=ubsan
//trees/well_behaved/....
Figure 100. Example of two reorders that lead to the same binary search tree.
The scaffolding for this problem is located at trees/bst_reorders. Your goal is to make
the following commands pass without any errors: bazel test //trees/bst_reorders/...,
bazel test --config=addrsan //trees/bst_reorders/..., bazel test --config=ubsan
//trees/bst_reorders/....
Hints
4. You should be able to derive a straightforward formula for calculating the sum of distances for
a child if you know the value for the parent node.
5. If you consider the values, a straightforward formula for calculating the sum of distances for
a child from the parent node will pop out.
6. Once you have the formula, a pre-order traversal will allow you to fill in the missing values.
Solutions
Figure 102. Example that demonstrates the distance changing between children on the path to the target node or not.
If we explore the tree using pre-order traversal, we also have a second guarantee: a node has a precomputed value only if it is on the path between the root and the target node.
Using these observations, we can construct a two-pass solution.
First, we find our target and initialize the distances for all nodes on the path between the target and
the tree’s root.
In the second pass, if we have a precomputed value for a node, we know that it is on the path,
which allows us to distinguish between the two situations. Also, when we encounter a node with
the appropriate distance, we remember it.
Figure 103. Solution for the k-distance problem.
// First pass: find the target, recording distances for
// all nodes on the path between the root and the target.
int distance_search(Node* root, Node* target,
                    std::unordered_map<int,int>& distances) {
    if (root == nullptr) return -1;
    // Found the target, its distance to itself is zero.
    if (root == target) {
        distances[root->value] = 0;
        return 0;
    }
    // Target in the left sub-tree
    if (int left = distance_search(root->left, target, distances);
        left >= 0) {
        distances[root->value] = left + 1;
        return left + 1;
    }
    // Target in the right sub-tree
    if (int right = distance_search(root->right, target, distances);
        right >= 0) {
        distances[root->value] = right + 1;
        return right + 1;
    }
    // Target not in this sub-tree
    return -1;
}

// Second pass traversal.
void dfs(Node* root, Node* target, int k, int dist,
         std::unordered_map<int,int>& distances,
         std::vector<int>& result) {
    if (root == nullptr) return;
    // Check if this node is on the path to target.
    auto it = distances.find(root->value);
    // Node is on the path to target, update distance.
    if (it != distances.end())
        dist = it->second;
    // This node is k distance from the target.
    if (dist == k)
        result.push_back(root->value);

    // Distances to children are one more, unless they are on the path
    // which is handled above.
    dfs(root->left, target, k, dist + 1, distances, result);
    dfs(root->right, target, k, dist + 1, distances, result);
}

std::vector<int> find_distance_k_nodes(Node* root, Node* target, int k) {
    // First pass
    std::unordered_map<int,int> distances;
    distance_search(root, target, distances);
    // Second pass
    std::vector<int> result;
    dfs(root, target, k, distances[root->value], distances, result);
    return result;
}
Figure 104. Example of a disconnected tree with the two highlighted nodes for which we have calculated the sum of
distances values.
We can reconstruct the sum of distances for the connected tree from the two disjoint values: moving from a parent to its child brings us one step closer to each of the node_count(child) nodes in the child's subtree and one step further from the remaining n - node_count(child) nodes, i.e. sum(child) = sum(parent) + n - 2*node_count(child).
This formula gives us the opportunity to calculate the answer for a child from the value of its parent.
After we have calculated the sum of distances for the root node with a post-order traversal, we do a second traversal, this time in pre-order, filling in the values for all nodes using the above formula.
This gives us a much better O(n) time complexity.
Figure 105. Solution for the sum of distances problem.
struct TreeInfo {
    TreeInfo(int n) : subtree_sum(n,0), node_count(n,0), result(n,0) {}
    std::vector<int> subtree_sum;
    std::vector<int> node_count;
    std::vector<int> result;
};

void post_order(int node, int parent,
        const std::unordered_multimap<int,int>& neighbours, TreeInfo& info) {
    // If there are no children we have zero distance and one node.
    info.subtree_sum[node] = 0;
    info.node_count[node] = 1;

    auto [begin, end] = neighbours.equal_range(node);
    for (auto [from, to] : std::ranges::subrange(begin, end)) {
        // Avoid looping back to the node we came from.
        if (to == parent) continue;
        // post_order traversal, visit children first
        post_order(to, node, neighbours, info);
        // accumulate number of nodes and distances
        info.subtree_sum[node] += info.subtree_sum[to] + info.node_count[to];
        info.node_count[node] += info.node_count[to];
    }
}

void pre_order(int node, int parent,
        const std::unordered_multimap<int,int>& neighbours, TreeInfo& info) {
    // For the root node the subtree_sum matches the result.
    if (parent == -1) {
        info.result[node] = info.subtree_sum[node];
    } else {
        // Otherwise, we can calculate the result from the parent,
        // because in pre-order we visit the parent before the children.
        info.result[node] = info.result[parent] + info.result.size()
            - 2*info.node_count[node];
    }
    // Now visit any children.
    auto [begin, end] = neighbours.equal_range(node);
    for (auto [from, to] : std::ranges::subrange(begin, end)) {
        if (to == parent) continue;
        pre_order(to, node, neighbours, info);
    }
}
• The maximum value in either of the trees we are connecting by adding this edge is at most
std::max(value[node_left], value[node_right]), and that has to be the maximum value in at least
one of the trees (because both nodes already exist in those trees).
• If the maximum value in one of the subtrees is lower than std::max(value[node_left],
value[node_right]), no valid paths are crossing this edge (since the maximum node creates a
barrier).
• If the maximum value in both of the subtrees is std::max(value[node_left], value[node_right]), then this edge adds freq_of_max[left]*freq_of_max[right] valid paths: one from each node with the maximum value in the left subtree to each node with the maximum value in the right subtree.
While this looks like a complete solution, we have a big problem: how do we efficiently keep track of the frequencies of the maximum values? Moreover, a new edge might connect to any node in a connected subtree, so we also need to be able to retrieve the frequency for a connected subtree based on any node in that subtree.
Fortunately, the union-find algorithm offers a solution. Union-find keeps track of connected components by maintaining a representative node for each component. In our case, the components are subtrees, and we additionally want the representative node to be one of the nodes with the maximum value.
Figure 106. Solution for the well-behaved paths problem.
// UnionFind
int64_t find(std::vector<int64_t>& root, int64_t i) {
    if (root[i] == i) // This is the representative node for this subtree
        return i;
    // Otherwise find the representative node and cache for future lookups.
    // The caching is what provides the O(alpha(n)) complexity.
    return root[i] = find(root, root[i]);
}

int64_t well_behaved_paths(std::vector<int64_t> values,
        std::vector<std::pair<int64_t,int64_t>> edges) {
    // Start with all nodes disconnected, each node is the
    // representative node of its subtree.
    std::vector<int64_t> root(values.size(), 0);
    std::ranges::iota(root,0);

    // The frequencies of the maximum value in each subtree.
    std::vector<int64_t> freq(values.size(), 1);

    // Start with trivial paths.
    int64_t result = values.size();

    std::ranges::sort(edges, [&](auto& l, auto& r) {
        return std::max(values[l.first], values[l.second]) <
               std::max(values[r.first], values[r.second]);
    });

    for (auto &edge : edges) {
        // Find the representative nodes for the two ends.
        int64_t left = find(root, edge.first);
        int64_t right = find(root, edge.second);
        if (values[left] == values[right]) {
            // Both subtrees share the same maximum value; this edge
            // creates freq[left]*freq[right] new well-behaved paths.
            result += freq[left] * freq[right];
            // Merge the subtrees, accumulating the frequency.
            root[left] = right;
            freq[right] += freq[left];
        } else if (values[left] < values[right]) {
            // The right subtree holds the higher maximum;
            // its representative remains valid after the merge.
            root[left] = right;
        } else {
            root[right] = left;
        }
    }
    return result;
}
Let’s consider the permutation {3,1,2,4,5}. Changing the order of elements within each partition (i.e. {1,2} and {4,5}) would produce a different tree; however, we can freely interleave the two partitions without changing the result. More formally, we are looking for the number of ways to pick the positions for the left (or right) partition out of all available positions, i.e. C(n-1,k) (binomial coefficient), where n is the total number of elements in the permutation and k is the number of elements in the left partition.
The second point we have not considered is the number of reorderings of the two sub-trees, which we can calculate recursively.
This leads to a total formula: reorderings(left)*reorderings(right)*coeff(n-1,left.size()).
This implies that we will have to pre-calculate the binomial coefficients, which we can do using
Pascal’s triangle.
Finally, we need to apply the requested modulo operation where appropriate.
Figure 107. Solution for the number of BST reorders problem.
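A sketch of the approach described above; it assumes the commonly requested modulus 1'000'000'007 and the convention that the original ordering itself is excluded from the count (the book's exact conventions may differ):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr int64_t MOD = 1'000'000'007;

// Binomial coefficients C(i, j) for i, j <= n, pre-computed
// with Pascal's triangle, everything modulo MOD.
std::vector<std::vector<int64_t>> make_coeff(std::size_t n) {
    std::vector<std::vector<int64_t>> c(n + 1);
    for (std::size_t i = 0; i <= n; ++i) {
        c[i].assign(i + 1, 1);
        for (std::size_t j = 1; j < i; ++j)
            c[i][j] = (c[i - 1][j - 1] + c[i - 1][j]) % MOD;
    }
    return c;
}

// reorderings(left) * reorderings(right) * C(n - 1, left.size())
int64_t reorderings(const std::vector<int64_t>& perm,
                    const std::vector<std::vector<int64_t>>& coeff) {
    if (perm.size() <= 2) return 1;
    // The first element is the root; it partitions the rest.
    std::vector<int64_t> left, right;
    for (std::size_t i = 1; i < perm.size(); ++i)
        (perm[i] < perm[0] ? left : right).push_back(perm[i]);
    int64_t result = coeff[perm.size() - 1][left.size()];
    result = result * reorderings(left, coeff) % MOD;
    result = result * reorderings(right, coeff) % MOD;
    return result;
}

int64_t count_reorders(const std::vector<int64_t>& perm) {
    auto coeff = make_coeff(perm.size());
    // Exclude the original ordering from the count.
    return (reorderings(perm, coeff) - 1 + MOD) % MOD;
}
```

For the example permutation {3,1,2,4,5}, the recursion evaluates C(4,2)*1*1 = 6 orderings in total, five of which differ from the input.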