cpp-coding-interview-community

Surviving the C++ Coding Interview is a practical guide designed to help candidates prepare for coding interviews, particularly focusing on common coding puzzles. The book includes a structured approach with theory, practical examples, and a companion repository for hands-on practice. Authored by Šimon Tóth, who has extensive experience in C++, the book aims to equip readers with the skills to recognize and implement solution patterns effectively.

Surviving the C++ Coding Interview

Šimon Tóth
This book is for sale at http://leanpub.com/cpp-coding-interview

This version was published on 2023-10-01

This is a Leanpub book. Leanpub empowers authors and publishers with the Lean Publishing
process. Lean Publishing is the act of publishing an in-progress ebook using lightweight tools and
many iterations to get reader feedback, pivot until you have the right book and build traction once
you do.

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License
Contents

Preface
    The commercial edition vs. community edition
    The author

Introduction
    Book structure
    Companion repository
    Using this book

Linked Lists
    std::list and std::forward_list
    Custom lists
    Simple operations
    Canonical problems
    Hints
    Solutions

Traversal algorithms
    Depth-first search
    Breadth-first search
    Backtracking
    Notable variants
    Canonical problems
    Hints
    Solutions

Trees
    Representing trees
    Tree traversals
    BST: Binary Search Tree
    Paths in trees
    Canonical problems
    Hints
    Solutions
Preface
Welcome, you are reading Surviving the C++ Coding Interview. I conceived this book's idea during the mass layoffs of 2022/2023.
Most companies still insist on using coding puzzles during interviews, and while for some, this is only a convenient scaffolding, for many, it remains the primary filter for candidates. Therefore, training for this part of the interview remains a necessity.
This book aims to guide you through the different types of problems you can come across while also
focusing on information that will remain relevant past the interview process.
After finishing this book, you should be able to recognize the solution patterns when presented with
a problem and transform those patterns into an implementation.
As with all my books, this book focuses on practical information. The text is interspersed with commented examples, and the book comes with a companion repository that contains a comprehensive test suite. You are encouraged to attempt each of the presented problems yourself and only then compare your solution with the commented one.

The commercial edition vs. community edition


This book is part of my Creative Commons series (under the CC-BY-NC-SA 4.0 license), and as such, the commercial edition doesn't differ from the community edition.
On top of that, 100% of my royalties from the purchases of this book go to the Electronic Frontier
Foundation.

The author

I am Šimon Tóth, the sole author of this book. My primary qualification is 20 years of C++
experience, with C++ being my primary language in a commercial setting for approximately 15
of those years.
My background is in HPC, spanning academia, big tech, and startup environments.
I have architected, built, and operated systems of all scales, from single-machine hardware-supported high-availability to planet-scale services.
Throughout my career, my passion has been teaching and mentoring junior engineers, which is why you are now reading this book.
For more about me, check out my LinkedIn profile¹.
¹ https://www.linkedin.com/in/simontoth/
Introduction
Book structure
I believe in interleaving theory and practical training, and I have structured this book to facilitate and
enrich this workflow. Each chapter adheres to a consistent structure to ensure a steady progression.
We start with an introduction that covers relevant C++ background as necessary. Then, we move on
to essential patterns and simple operations. Each chapter concludes with carefully selected problems,
complemented by solutions and commentary.
While the chapters are sequential, with each building on the foundations of the previous, the book doesn't restrict you to a strict reading order. Instead, it comes with a comprehensive index. You can always refer to the index to look up more details if you encounter an unfamiliar concept or algorithm.

Companion repository
This book has a companion repository¹ with a test suite and scaffolding for each problem.
The repository is set up with a DevContainer configuration. When accessed through VS Code, it provides a seamless C++ development environment equipped with the latest stable versions of GCC, GDB, and Clang. All you need to take full advantage of this are Visual Studio Code² and Docker³.
To get up and running, follow these steps:

1. Open Visual Studio Code, and select View>Command Palette.

Figure 1. Open Command Palette

¹ https://github.com/HappyCerberus/cpp-coding-interview-companion
² https://code.visualstudio.com/download
³ https://www.docker.com/products/docker-desktop/

2. Type in git clone and select the Git: Clone action.

Figure 2. Write git clone

3. Paste in the companion repository URL: github.com/HappyCerberus/cpp-coding-interview-companion⁴ and confirm.

Figure 3. Paste in the URL

4. Visual Studio Code will ask for a location and, once done, will ask to open the cloned repository.
Confirm.

Figure 4. Confirm open

5. Visual Studio Code will now ask whether you trust me. Confirm that you do. You can see all
the relevant configuration inside the repository in the .vscode and .devcontainer directories.

⁴ https://github.com/HappyCerberus/cpp-coding-interview-companion

Figure 5. Confirm trust

6. Finally, Visual Studio Code will detect the devcontainer configuration and ask whether you
want to re-open the project in the devcontainer. Confirm. After VSCode downloads the
container, you will have a fully working C++ development environment with the latest GCC,
Clang, GDB, and Bazel.

Figure 6. Reopen in container



Using this book


The process you employ to solve the problems presented in this book is essential. Typically, in a
coding interview, you vocalize your thoughts to allow for feedback and guidance. However, while
using this book, that instant interaction isn’t available. Should you find yourself stuck, consider this
strategy:
First, ensure you grasp the problem at hand. Then, try sketching it out or examining some examples
on paper.
Next, see if you can implement a simple brute-force solution as a starting point. From there, ask
yourself what could be optimized. Is there repeated work? Can one solution inform another?
If you’re genuinely stumped, the hints section could offer valuable insight.
Linked Lists
While rare in practical applications, linked lists crop up frequently in interviews. This is partly because the node structure lends itself to formulating tricky problems, similar to trees and graphs, without the added topological complexity.

std::list and std::forward_list


The standard library offers two list types: std::list, a doubly-linked list, and std::forward_list, a singly-linked list. The std::forward_list exists primarily as a space optimization, saving 8 bytes per element on 64-bit architectures.
Both offer perfect iterator and reference stability, i.e., the only operation that invalidates iterators or references is the erasure of an element, and only for the erased element. This stability extends even to moving elements between lists.
Figure 7. Iterator stability with std::list.

#include <list>

std::list<int> first{1,2,3};
std::list<int> second{4,5,6};

// Get an iterator to the element with value 2.
auto it = std::next(first.begin());

// Move the element to the beginning of the second list.
second.splice(second.begin(), first, it);

// first == {1, 3}, second == {2,4,5,6}

// iterator still valid
// *it == 2

This iterator stability is one of the use cases where we would reach for a std::list or std::forward_list in practical applications. The only reasonable alternative is wrapping each element in a std::unique_ptr, which offers reference stability (but not iterator stability) irrespective of the wrapping container.

Figure 8. Reference stability using std::unique_ptr.


#include <vector>
#include <memory>

std::vector<std::unique_ptr<int>> stable;
stable.push_back(std::make_unique<int>(1));
stable.push_back(std::make_unique<int>(2));
stable.push_back(std::make_unique<int>(3));

// get a stable weak reference (or pointer) to an element
int *it = stable[1].get();
stable.erase(stable.begin()); // invalidates all iterators
// it still valid, *it == 2

Of course, we do pay for this stability with performance. Linked lists are node-based containers,
meaning each element is allocated in a separate node, potentially very distant from each other in
memory. When we combine this with the inherent overhead of the indirection, traversing a std::list
can regularly end up 5x-10x slower than an equivalent flat std::vector.
Aside from iterator stability, we also get access to a suite of O(1) operations, and these can potentially
outweigh the inherent overhead of a std::list.
Figure 9. O(1) operations using a std::list and std::forward_list.
#include <list>

std::list<int> data{1,2,3,4,5};

// O(1) splicing between lists, or within one list

// effectively rotate left by one element
data.splice(data.end(), data, data.begin());
// data == {2,3,4,5,1}

// O(1) erase

// iterator to element with value 4
auto it = std::next(data.begin(), 2);
data.erase(it);
// data == {2,3,5,1}

// O(1) insertion

// effectively push_front()
data.insert(data.begin(), 42);
// data == {42,2,3,5,1}

Because std::list is a bidirectional range and std::forward_list is a forward range, we lose access to
some standard algorithms. Both lists expose custom implementations of sort, unique, merge, reverse,
remove, and remove_if as member functions.
Figure 10. List specific algorithms.
#include <list>

std::list<int> data{1,2,3,4,5};

data.reverse();
// data == {5,4,3,2,1}

data.sort();
// data == {1,2,3,4,5}

data.remove_if([](int v) { return v % 2 == 0; });
// data == {1,3,5}

The std::forward_list has an additional quirk; since we can only erase and insert after an iterator,
the std::forward_list offers a modified interface.
Figure 11. Modified interface of std::forward_list.
#include <forward_list>

std::forward_list<int> data{1,2,3,4,5};

// before_begin() iterator
auto it = data.before_begin();

// insert and erase only possible after the iterator
data.insert_after(it, 42);
// data == {42,1,2,3,4,5}
data.erase_after(it);
// data == {1,2,3,4,5}

Custom lists
When implementing a simple custom linked list, you might be tempted to reach for a straightforward implementation based on std::unique_ptr.

Figure 12. A naive approach to a linked list.


#include <memory>

struct Node {
    int value;
    std::unique_ptr<Node> next;
};

std::unique_ptr<Node> head = std::make_unique<Node>(20,nullptr);
head->next = std::make_unique<Node>(42,nullptr);
// head->value == 20
// head->next->value == 42

Sadly, this approach isn't usable. The fundamental problem here is the design: we are mixing ownership with structural information. In this case, the problem manifests during destruction. Because we have tied ownership to the structure, the destruction of the list will be recursive, potentially leading to stack exhaustion and a crash.
Figure 13. A demonstration of a problem caused by recursive destruction.
#include <memory>

struct Node {
    int value;
    std::unique_ptr<Node> next;
};

{
    std::unique_ptr<Node> head = std::make_unique<Node>(0,nullptr);
    // Depending on the architecture/compiler, the specific number
    // of elements we can handle without crash will differ.
    Node* it = head.get();
    for (int i = 0; i < 100000; ++i)
        it = (it->next = std::make_unique<Node>(0,nullptr)).get();
} // BOOM

The recursive nature comes from chaining std::unique_ptr. As part of destroying a std::unique_ptr<Node>, we first need to destroy the nested next pointer, which in turn needs to destroy its nested next pointer, and so on. Destroying the linked list therefore means a full expansion of destructors until we reach the end of the list. Only after reaching the end can we finish destroying the trailing node, propagating back towards the front. Each program has a limited stack space, and a sufficiently long naive linked list can easily exhaust this space.

If we desire both the O(1) operations and iterator stability, the only option is to rely on manual resource management (at which point we might as well use std::list or std::forward_list). However, if we can accept some limitations, there are a few alternatives to std::list and std::forward_list.
If we want to capture the structure of a linked list with reference stability, we can rely on the previously mentioned combination of a std::vector and std::unique_ptr. This approach doesn't give us any O(1) operations or iterator stability; however, it is often used during interviews.
Figure 14. Representing the structure of a linked list using a std::vector and std::unique_ptr.

#include <vector>
#include <memory>

struct List {
    struct Node {
        int value;
        Node* next;
    };
    Node *head = nullptr;
    Node *new_after(Node* prev, int value) {
        nodes_.push_back(std::make_unique<Node>(value, nullptr));
        if (prev == nullptr)
            return head = nodes_.back().get();
        else
            return prev->next = nodes_.back().get();
    }
private:
    std::vector<std::unique_ptr<Node>> nodes_;
};


List list;
auto it = list.new_after(nullptr, 1);
it = list.new_after(it, 2);
it = list.new_after(it, 3);

// list.head->value == 1
// list.head->next->value == 2
// list.head->next->next->value == 3

The crucial difference from the naive approach is that the list data structure owns all nodes, and the
structure is encoded only using weak pointers.
Finally, if we do not require stable iterators or references but do require O(1) operations, we can use a flat-list approach. We can store all elements directly in a std::vector and represent information about the next and previous nodes using indexes.
However, this introduces a problem. Erasing an element from the middle of a std::vector is O(n) because we need to shift successive elements to fill the gap. Since we are encoding the list structure separately, we can instead swap the to-be-erased element with the last element and only then erase it in O(1).
Figure 15. Erase an element from the middle of a flat list in O(1).
#include <cstddef>
#include <utility>
#include <vector>

inline constexpr ptrdiff_t nill = -1;

struct List {
    struct Node {
        int value;
        ptrdiff_t next;
        ptrdiff_t prev;
    };
    ptrdiff_t new_after(ptrdiff_t prev, int value) {
        storage.push_back({value, nill, prev});
        if (prev != nill)
            storage[prev].next = std::ssize(storage)-1;
        else
            head = std::ssize(storage)-1;
        return std::ssize(storage)-1;
    }
    void erase(ptrdiff_t idx) {
        // move head
        if (idx == head)
            head = storage[idx].next;
        // unlink the erased element
        if (storage[idx].next != nill)
            storage[storage[idx].next].prev = storage[idx].prev;
        if (storage[idx].prev != nill)
            storage[storage[idx].prev].next = storage[idx].next;
        // relink the last element
        if (idx != std::ssize(storage)-1) {
            if (storage.back().next != nill)
                storage[storage.back().next].prev = idx;
            if (storage.back().prev != nill)
                storage[storage.back().prev].next = idx;
        }
        // swap and O(1) erase
        std::swap(storage[idx], storage.back());
        storage.pop_back();
    }
    ptrdiff_t get_head() { return head; }
    Node& at(ptrdiff_t idx) { return storage[idx]; }
private:
    ptrdiff_t head = nill;
    std::vector<Node> storage;
};


List list;
ptrdiff_t idx = list.new_after(nill, 1);
idx = list.new_after(idx, 2);
idx = list.new_after(idx, 3);
idx = list.new_after(idx, 4);
idx = list.new_after(idx, 5);
// list == {1,2,3,4,5}

idx = list.get_head();
list.erase(idx);
// list == {2,3,4,5}

Simple operations
Let’s explore some basic operations frequently used as the base for a more complex solution. The
three most frequent operations are:

• merge two sorted lists


• reverse a list
• scan with two pointers

Both std::list and std::forward_list come with a built-in merge operation.



Figure 16. Merging two sorted lists using merge().

#include <list>
#include <forward_list>

{
    std::list<int> left{2,4,5};
    std::list<int> right{1,3,9};
    left.merge(right);
    // left == {1,2,3,4,5,9}
    // right == {}
}

{
    std::forward_list<int> left{2,4,5};
    std::forward_list<int> right{1,3,9};
    left.merge(right);
    // left == {1,2,3,4,5,9}
    // right == {}
}

However, implementing one from scratch isn’t particularly complicated either. We consume the
merged-in list, one element at a time, advancing the insertion position as needed.
Figure 17. Custom merge operation.

#include <forward_list>

std::forward_list dst{1, 3, 5, 6};
std::forward_list src{2, 4, 7};

// Start before the first element so insertion at the front also works.
auto dst_it = dst.before_begin();

while (!src.empty()) {
    if (std::next(dst_it) == dst.end() ||
        *std::next(dst_it) >= src.front()) {
        dst.splice_after(dst_it, src, src.before_begin());
    } else {
        ++dst_it;
    }
}
// dst == {1,2,3,4,5,6,7}
// src == {}

The same situation applies to reversing a list. Both lists provide a built-in in-place reverse operation.
Figure 18. Built-in in place reverse.

#include <forward_list>

std::forward_list<int> src{1,2,3,4,5,6,7};

src.reverse();
// src == {7,6,5,4,3,2,1}

Implementing a custom reverse is straightforward if we use a second list. However, the in-place
version can be tricky.
Figure 19. Custom implementations of linked list reverse.

#include <forward_list>

std::forward_list<int> src{1,2,3,4,5,6,7};

// Custom reverse using a second list
std::forward_list<int> dst;
while (!src.empty())
    dst.splice_after(dst.before_begin(), src, src.before_begin());
// dst == {7,6,5,4,3,2,1}
// src == {}

// Custom in-place reverse
auto tail = dst.begin();
if (tail != dst.end())
    while (std::next(tail) != dst.end())
        dst.splice_after(dst.before_begin(), dst, tail);
// dst == {1,2,3,4,5,6,7}

The in-place reverse takes advantage of the fact that the first element will be the last once the list is
reversed.

Figure 20. Process of reversing a forward list in place.

Finally, scanning with two iterators is a common search technique for finding a sequence of elements that conforms to a particular property. As long as this property is calculated strictly from the elements entering and leaving the sequence, we do not need to access the elements currently in the sequence.
Figure 21. Find the longest subsequence with sum less than 4.

#include <algorithm>
#include <forward_list>

std::forward_list<int> data{4,2,1,1,1,3,5};

// two iterators denoting the sequence [left, right)
auto left = data.begin();
auto right = data.begin();
int sum = 0;
int len = 0;
int max = 0;

while (right != data.end()) {
    // extend right, until we break the property
    while (sum < 4 && right != data.end()) {
        max = std::max(max, len);
        ++len;
        sum += *right;
        ++right;
    }
    // shrink from left, until the property is restored
    while (sum >= 4 && left != right) {
        sum -= *left;
        --len;
        ++left;
    }
}
// max == 3, i.e. {1,1,1}

Canonical problems
Now that we’ve covered the basics, let’s move on to real-world problems often seen in technical
interviews. This next section will cover four linked list challenges: reversing k-groups in a list,
merging a list of sorted lists, removing the kth element from the end, and finding a loop in a corrupted
list.
It’s a step up from what we’ve done so far, but with the foundation you’ve built, you should be
well-prepared to handle these tasks. Let’s get started.

Reverse k-groups in a list


Our first challenge is all about diligence. Given a singly-linked list and a positive integer k, transform the list so that each group of k elements appears in reversed internal order. If k doesn't evenly divide the number of elements, the remaining nodes should be left in their original order.

Figure 22. Example of reversing groups of 3 elements.

You should be able to implement a version that operates in O(n) time and O(1) additional space,
where n is the number of elements in the list.

The scaffolding for this problem is located at lists/k_groups. Your goal is to make the following commands pass without any errors: bazel test //lists/k_groups/..., bazel test --config=addrsan //lists/k_groups/..., bazel test --config=ubsan //lists/k_groups/....

Merge a list of sorted lists


Given a list of sorted lists, return a merged sorted list.

Figure 23. Example of merging three sorted lists.

You should be able to implement a version that operates in O(n*log(k)) time and uses O(k) additional
memory, where n is the total number of elements and k is the number of lists we are merging.

The scaffolding for this problem is located at lists/merge. Your goal is to make the
following commands pass without any errors: bazel test //lists/merge/..., bazel test
--config=addrsan //lists/merge/..., bazel test --config=ubsan //lists/merge/....

Remove the kth element from the end of a singly-linked list


Given a singly-linked list and a positive integer k, remove the kth element, counting from the end of the list.

Figure 24. Example of removing the 3rd element from the end of the list.

You should be able to implement a version that operates in O(n) time and uses O(1) additional
memory, where n is the number of elements in the list.

The scaffolding for this problem is located at lists/end_of_list. Your goal is to make
the following commands pass without any errors: bazel test //lists/end_of_list/...,
bazel test --config=addrsan //lists/end_of_list/..., bazel test --config=ubsan
//lists/end_of_list/....

Find a loop in a linked list


Given a potentially corrupted singly-linked list, determine whether it is corrupted, i.e., whether it forms a loop.

Figure 25. Example of a corrupted list.

• Progression A: return the node that is the first node on the loop.
• Progression B: fix the list.

You should be able to implement a version that operates in O(n) time and uses O(1) additional memory, where n is the number of elements in the list.

The scaffolding for this problem is located at lists/loop for the basic version and
lists/loop_node, lists/loop_fix for the two progressions. Your goal is to make the
following commands pass without any errors: bazel test //lists/loop/..., bazel test
--config=addrsan //lists/loop/..., bazel test --config=ubsan //lists/loop/....
Adjust the directory to loop_node and loop_fix for the relevant progression.

Hints

Reverse k-groups in a list


1. You will need a function to reverse a list.
2. When you reverse k elements, what needs to re-connect to what? Drawing a picture may help.
3. When we reverse a group of k elements, what was previously the first element is now the last
element of the group.

Merge a list of sorted lists


1. There are two ways to achieve log scaling.
2. If you can decrease the number of unmerged lists by a factor of two in each step, you will get
log scaling.
3. Picking the lowest element from a sorted range is log(n).

Remove the kth element from the end of a singly-linked list


1. How can you check that an element is the kth element from the end of the list?
2. How can you re-use the information you calculated to check the next element?

Find a loop in a linked list


1. What happens when there is no loop?
2. If we iterate over the list and there is a loop, we get stuck; how could we detect this situation?
3. Can we use two pointers?
4. If we use one slow pointer (moving by one) and one fast pointer (moving by two), they will meet if there is a loop.

Solutions

Reverse k-groups in a list


There isn’t anything algorithmically complex in this problem. However, there are many opportuni-
ties to trip up and make a mistake.
We talked about reversing a singly-linked list in the simple operations section. This operation will
be our base for reversing the group of k elements.
Figure 26. Reversing a group of k elements.

List::Node* reverse(List::Node *head, int64_t k) {
    List::Node *result = nullptr;
    List::Node *iter = head;

    for (int64_t i = 0; i < k; ++i)
        iter = std::exchange(iter->next, std::exchange(result, iter));
        /* Prepend iter to the front of the reversed part:
           the original value of result becomes the new iter->next,
           and the original value of iter->next becomes the new iter. */

    return result;
}

The complexity lies in applying this operation multiple times in sequence. For that, we need to keep
track of terminating nodes:

• the head of the already processed part; this will be our final result
• the tail of the already processed part; this is where we will attach each reversed section as we
iterate
• the head of the unprocessed part; we use it for iteration, and we also need it to relink the reversed group of k elements

Figure 27. Keeping track of nodes

The algorithm repeats these steps:

Figure 28. Calculate the kth element by advancing k steps from the unprocessed head.

Figure 29. Reverse k elements starting from the unprocessed head.

Figure 30. Connect the reversed group to the tail of the processed part. Connect the unprocessed head to the kth
element.

Figure 31. The new tail is the unprocessed head. The new unprocessed head is the kth element.

Figure 32. Solution

void reverse_groups(List &list, int64_t k) {
    List::Node *unprocessed_head = list.head;
    List::Node *processed_tail = nullptr;
    List::Node *result = nullptr;
    List::Node *iter = list.head;

    while (iter != nullptr) {
        // advance by k elements
        int cnt = 0;
        iter = unprocessed_head;
        while (cnt < k && iter != nullptr) {
            iter = iter->next;
            ++cnt;
        }

        // if we have a full set of k elements
        if (cnt == k) {
            List::Node *processed_head = reverse(unprocessed_head, k);
            // initialize the result if this is the first group
            if (result == nullptr)
                result = processed_head;
            // if this isn't the first group, link the existing tail
            if (processed_tail != nullptr)
                processed_tail->next = processed_head;

            // what was the head is now the tail of the reversed section
            processed_tail = unprocessed_head;
            // and iter is the new head
            unprocessed_head = iter;
        }
    }
    if (processed_tail != nullptr)
        processed_tail->next = unprocessed_head;
    list.head = result == nullptr ? unprocessed_head : result;
}

We access each element at most twice: once when advancing by k elements, and a second time when reversing a group of k elements. This means that our time complexity is O(n), and since we only store the terminal nodes, our space complexity is O(1).

Merge a list of sorted lists


We already discussed merging two lists in the simple operations section. However, we need to be careful here. If we merged in each list in a loop, we would end up with O(k*n) time complexity, where k is the number of lists and n is the total number of nodes.
The desired O(n*log(k)) time complexity should point you towards some form of a sorted structure (std::priority_queue, heap algorithms, std::set). A sorted structure would give us log(k) lookup, which we can then repeat n times to traverse all the elements in order.
The second way to reach log scaling is with an exponential factor. If we merge lists in pairs, we will halve the number of lists in each step, with log(k) steps in total, leaving us again with O(n*log(k)) complexity.
In either case, we must be careful not to introduce accidental copies.
Figure 33. Solution using an always-sorted structure.

std::forward_list<int64_t> merge(std::forward_list<std::forward_list<int64_t>> in) {
    using fwlst = std::forward_list<int64_t>;

    // Custom comparator, compare based on the first element
    auto cmp = [](const fwlst& l, const fwlst& r) {
        return l.front() < r.front();
    };
    // View of non-empty lists, if we filter here,
    // we don't have to check later.
    auto non_empty = in |
        std::views::filter([](auto& l) { return !l.empty(); }) |
        std::views::common;
    // Consume the input using std::move_iterator,
    // avoids making copies of the lists.
    std::multiset<fwlst, decltype(cmp)> q(
        std::move_iterator(non_empty.begin()),
        std::move_iterator(non_empty.end()),
        cmp);

    fwlst result;
    fwlst::iterator tail = result.before_begin();
    while (!q.empty()) {
        // Extract the node that holds the element,
        // without making a copy
        auto top = q.extract(q.begin());

        // Splice the first element of the top list to the result
        result.splice_after(tail,
            top.value(), top.value().before_begin());
        ++tail;

        if (!top.value().empty())
            q.insert(std::move(top)); // put back
    }
    return result;
}

Because we extract each element once, and each extraction and re-insertion is an O(log(k)) operation, we end up with O(n*log(k)) time complexity. Our std::multiset will use O(k) memory.
Figure 34. Solution using pairwise merging.

std::forward_list<int64_t> merge(std::forward_list<std::forward_list<int64_t>> in) {
    std::forward_list<std::forward_list<int64_t>> merged;
    // While we have more than one list
    while (std::next(in.begin()) != in.end()) {
        auto it = in.begin();
        // Take a pair of lists
        while (it != in.end() && std::next(it) != in.end()) {
            // Merge
            it->merge(*std::next(it));
            merged.emplace_front(std::move(*it));
            std::advance(it, 2);
        }
        // If we have an odd number of lists
        if (it != in.end())
            merged.emplace_front(std::move(*it));
        // Switch the lists for the next iteration
        in = std::move(merged);
    }
    return std::move(in.front());
}

We merge n elements in every iteration, repeating this for log(k) iterations, leading to O(n*log(k)) time complexity. The only additional memory we use is to store the partially merged lists; therefore, we end up with O(k) space complexity.

Remove the kth element from the end of a list


The trivial approach would be to check, for each element, whether it is the kth element from the back and, if not, advance to the next element. However, this approach would have O(n*k) time complexity.
Notice that once we have checked one element, we hold two iterators that are k-1 elements apart. To check the next element, we do not have to repeat the entire process; instead, we can simply advance both iterators by one step, again ending with a pair of iterators that are k-1 elements apart.
Extending this idea allows us to implement an O(n) solution. We construct the first pair of iterators that are k-1 elements apart, and from there, we advance both iterators in lock-step until the leading iterator reaches the end of the list.
Figure 35. Solution

void remove_kth_from_end(std::forward_list<int64_t>& head, int64_t k) {
    auto node = head.before_begin();
    auto tail = head.begin();
    // advance the tail by k-1 steps
    for (int64_t i = 1; i < k && tail != head.end(); ++i)
        ++tail;

    // there is no k-th element from the back
    if (tail == head.end()) return;

    // advance both node and tail, until we reach the end with tail;
    // node then points one before the k-th element from the back
    while (std::next(tail) != head.end()) {
        ++node;
        ++tail;
    }

    // remove the node
    head.erase_after(node);
}

Find a loop in a linked list


This problem is the prototypical problem for the fast and slow pointer technique (a.k.a. Floyd's tortoise and hare).
If we iterate over a list without a loop, we will eventually reach the end. However, if there is a loop, we will end up stuck inside it.
The tricky part is detecting that we are stuck in a loop. If we iterate with two pointers, a slow one advancing one node per step and a fast one advancing two nodes per step, we have a guarantee that if they get stuck in a loop, they will eventually meet: inside the loop, the fast pointer gains one node on the slow pointer with every step, so the gap between them must eventually shrink to zero.

Figure 36. Initial configuration: slow and fast pointers are pointing to the head of the list.

Figure 37. Configuration after two steps.

Figure 38. Configuration after four steps.



Figure 39. Configuration after six steps. Loop detected.

bool loop_detection(const List &list) {
    List::Node *slow = list.head;
    List::Node *fast = list.head;
    do {
        // nullptr == no loop
        if (slow == nullptr)
            return false;
        if (fast == nullptr || fast->next == nullptr)
            return false;
        slow = slow->next;
        fast = fast->next->next;
    } while (slow != fast);

    return true;
}

Identifying the start of the loop

To find the start of the loop, we must look at how many steps each pointer took before the two met.
Consider that the slow pointer moved x steps before entering the loop and then y steps after entering the loop, for a total of slow = x + y.
The fast pointer moved similarly: it also moved x steps before entering the loop and then y steps after entering the loop before it met up with the slow pointer; however, in between, it completed an unknown number of full loops: fast = x + n*loop + y. Importantly, we also know that the fast pointer took 2*slow steps in total.
If we put this together, we end up with the following:

• 2*(x + y) = x + n*loop + y
• x = n*loop - y

This means that the number of steps needed to reach the loop from the head is the same as the number of steps remaining from the meeting point to the start of the loop.
So, to find the start of the loop, we can iterate simultaneously from the head of the list and from the meeting point. Once these two new pointers meet, we have our loop start.
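To make the relationship concrete, here is a small worked example (with arbitrarily chosen numbers): suppose the loop has length loop = 6 and the head is x = 4 steps before the loop start. The pointers can only meet once x + y = n*loop, which first happens for n = 1 and y = 2, i.e., two steps into the loop. From the meeting point, loop - y = 4 steps remain to walk forward to the loop start, which is exactly x.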

Figure 40. One pointer at the meeting point, one at the list head.

Figure 41. The loop start is identified after two steps.

Figure 42. Solution with start detection.

List::Node *loop_start(const List &list) {
    // Phase 1, detect the loop.
    List::Node *slow = list.head;
    List::Node *fast = list.head;
    do {
        // nullptr == no loop
        if (slow == nullptr)
            return nullptr;
        if (fast == nullptr || fast->next == nullptr)
            return nullptr;
        slow = slow->next;
        fast = fast->next->next;
    } while (slow != fast);

    // Phase 2, iterate from head and from meeting point.
    List::Node *onloop = slow;
    List::Node *offloop = list.head;
    while (onloop != offloop) {
        onloop = onloop->next;
        offloop = offloop->next;
    }
    return onloop;
}

Fixing the list

The main difficulty in fixing the list is that we are working with a singly-linked list. Fixing the list means that we must unlink the node one before the start of the loop.
Figure 43. Solution for fixing the list.

void loop_fix(List &list) {
    // Phase 1, detect the loop.
    List::Node *slow = list.head;
    List::Node *fast = list.head;
    List::Node *before = nullptr;
    do {
        // nullptr == no loop
        if (slow == nullptr)
            return;
        if (fast == nullptr || fast->next == nullptr)
            return;
        slow = slow->next;
        // Keep track of the node one before the fast pointer
        before = fast->next;
        fast = fast->next->next;
    } while (slow != fast);

    // Phase 2, iterate from head and from meeting point.
    List::Node *onloop = slow;
    List::Node *offloop = list.head;
    while (onloop != offloop) {
        // Keep track of the node one before the onloop pointer
        before = onloop;
        onloop = onloop->next;
        offloop = offloop->next;
    }

    // Phase 3, fix the list, before != nullptr
    before->next = nullptr;
}
Traversal algorithms
This chapter is dedicated to three algorithms we will keep revisiting in different variants throughout the book: the two search-traversal algorithms, depth-first and breadth-first search, and the constraint-traversal algorithm, backtracking.
Let’s start with a problem: imagine you need to find a path in a maze; how would you do it?

Figure 44. Maze with one entrance and one exit.

You could wander randomly, and while that might take a very long time, you will eventually reach
the end.
However, for a more structured approach, you might consider an approach similar to a depth-first
search, exploring each branch until you reach a dead-end, then returning to the previous crossroads
and taking a different path.

Depth-first search
The depth-first search opportunistically picks a direction at each space and explores that direction
fully before returning to this space and picking a different path.

A typical approach would use a consistent strategy for picking the direction order: e.g., north, south,
west, east; however, as long as the algorithm explores every direction, the order doesn’t matter and
can be randomized.

Figure 45. Depth-first search using the NSWE strategy.

Because of its repeating, nested nature, a recursive implementation is a natural fit for the depth-first search.
Figure 46. Recursive implementation of a depth-first search.

bool dfs(int64_t row, int64_t col,
         std::vector<std::vector<char>>& map) {
    // Check for out-of-bounds.
    if (row < 0 || row == std::ssize(map) ||
        col < 0 || col == std::ssize(map[row]))
        return false;

    // If we reached the exit, we are done.
    if (map[row][col] == 'E')
        return true;
    // If this is not an unvisited space, do not
    // terminate, but also do not continue.
    if (map[row][col] != ' ')
        return false;

    // Mark this space as visited.
    map[row][col] = '.';

    return dfs(row-1,col,map) || // North
           dfs(row+1,col,map) || // South
           dfs(row,col-1,map) || // West
           dfs(row,col+1,map);   // East
}

We can flatten the recursive version using a stack data structure. However, we need to remember the LIFO nature of a stack: the order of exploration will be the reverse of the order in which we insert the elements into the stack.
Figure 47. Implementation of depth-first search using a std::stack.
bool dfs(int64_t row, int64_t col,
         std::vector<std::vector<char>>& map) {
    std::stack<std::pair<int64_t,int64_t>> next;
    next.push({row,col});

    // As long as we have spaces to explore.
    while (!next.empty()) {
        auto [row,col] = next.top();
        next.pop();

        // If we reached the exit, we are done.
        if (map[row][col] == 'E')
            return true;

        // Mark as visited
        map[row][col] = '.';

        // Helper to check if a space can be stepped on,
        // i.e. not out-of-bounds and either empty or exit.
        auto is_path = [&map](int64_t row, int64_t col) {
            return row >= 0 && row < std::ssize(map) &&
                   col >= 0 && col < std::ssize(map[row]) &&
                   (map[row][col] == ' ' || map[row][col] == 'E');
        };

        // Due to the stack data structure, we need to insert
        // elements in the reverse of the order we want to explore.
        if (is_path(row,col+1)) // East
            next.push({row,col+1});
        if (is_path(row,col-1)) // West
            next.push({row,col-1});
        if (is_path(row+1,col)) // South
            next.push({row+1,col});
        if (is_path(row-1,col)) // North
            next.push({row-1,col});
    }

    // We have explored all reachable spaces
    // and didn't find the exit.
    return false;
}

While the depth-first search is excellent for finding a path, we don’t necessarily get the shortest
path. If our goal is to determine reachability, a depth-first search will be sufficient; however, if we
require the path to be optimal, we must use the breadth-first search.

Breadth-first search
As the name suggests, the algorithm expands in breadth, visiting spaces in lock-step. The algorithm
first visits all spaces next to the starting point, then all spaces next to those, i.e., two spaces away
from the start, then three, four, and so on. To visualize, you can think about how water would flood
the maze from the starting point.

Figure 48. Breadth-first search demonstration.



When implementing a breadth-first search, we need a data structure that will allow us to process
the elements in the strict order we discover them, a queue.
Figure 49. Implementation of breadth-first search using a std::queue.
int64_t bfs(int64_t row, int64_t col, std::vector<std::vector<char>>& map) {
    std::queue<std::tuple<int64_t,int64_t,int64_t>> next;
    next.push({row,col,0});

    // As long as we have spaces to explore.
    while (!next.empty()) {
        auto [row,col,dist] = next.front();
        next.pop();

        // If we reached the exit, we are done.
        // Return the current length.
        if (map[row][col] == 'E')
            return dist;

        // Mark as visited.
        map[row][col] = '.';

        // Helper to check if a space can be stepped on,
        // i.e. not out-of-bounds and either empty or exit.
        auto is_path = [&map](int64_t row, int64_t col) {
            return row >= 0 && row < std::ssize(map) &&
                   col >= 0 && col < std::ssize(map[row]) &&
                   (map[row][col] == ' ' || map[row][col] == 'E');
        };

        if (is_path(row-1,col)) // North
            next.push({row-1,col,dist+1});
        if (is_path(row+1,col)) // South
            next.push({row+1,col,dist+1});
        if (is_path(row,col-1)) // West
            next.push({row,col-1,dist+1});
        if (is_path(row,col+1)) // East
            next.push({row,col+1,dist+1});
    }

    // We have explored all reachable spaces
    // and didn't find the exit.
    return -1;
}

Backtracking
Both depth-first and breadth-first searches are traversal algorithms that attempt to reach a specific
goal. The difference between the two algorithms is only in the order in which they traverse the
space.
However, in some situations, we may not know the goal and only know the properties (constraints)
the path toward the goal must fulfill.
The backtracking algorithm explores the solution space in a depth-first order, discarding paths that
do not fulfill the requirements.
Let’s take a look at a concrete example: the N-Queens problem. The goal is to place N queens onto an NxN chessboard without any of the queens attacking each other, i.e., no two queens sharing a row, column, or diagonal.

Figure 50. Demonstration of backtracking for the 4-Queens problem.



The paths we explore are partial but valid solutions that build upon each other. In the above example, we traverse the solution space in row order. First, we pick a position for a queen in the first row, then the second, then the third, and finally the fourth. The example also demonstrates the two dead-ends we reach if we place the first-row queen into the first column.
A backtracking algorithm implementation will be similar to a depth-first search. However, we must keep track of the partial solution (the path), adding to the solution as we explore further and removing from the solution when we return from a dead-end.
Figure 51. Example implementation of backtracking.
#include <vector>
#include <cstdint>

// Check if we can place a queen in the specified row and column
bool available(std::vector<int64_t>& solution,
               int64_t row, int64_t col) {
    for (int64_t queen = 0; queen < std::ssize(solution); ++queen) {
        // Column occupied
        if (solution[queen] == col)
            return false;
        // NorthEast/SouthWest diagonal occupied
        if (row + col == queen + solution[queen])
            return false;
        // NorthWest/SouthEast diagonal occupied
        if (row - col == queen - solution[queen])
            return false;
    }
    return true;
}

bool backtrack(std::vector<int64_t>& solution, int64_t n) {
    if (std::ssize(solution) == n)
        return true;

    // We are trying to fit a queen on row std::ssize(solution)
    for (int64_t column = 0; column < n; ++column) {
        if (!available(solution, std::ssize(solution), column))
            continue;

        // This space is not in conflict
        solution.push_back(column);
        // We found a solution, exit
        if (backtrack(solution, n))
            return true;
        // This was a dead-end, remove the queen from this position
        solution.pop_back();
    }

    // This is a dead-end
    return false;
}

Notable variants
The traversal algorithms mentioned earlier are largely standalone, ready to be deployed to solve
diverse problems with minimal tweaks.
Yet, we frequently encounter a few additional versions. In this section, we’ll tackle three
such variants: traversing multiple dimensions, adjusting for non-unit costs, and managing the
propagation of constraints.
We’ll also illustrate each variant using a concrete problem, all of which you can find in the
companion repository. Try solving each of them before you read the corresponding solution.

Multi-dimensional traversal
Applying a depth-first or breadth-first search to a problem with additional spatial dimensions is straightforward. From the algorithm’s perspective, additional dimensions only introduce a broader neighborhood for each space. However, in some problems, the additional dimensions will not be as obvious.
Consider the following problem: Given a 2D grid of size m*n, containing 0s (spaces) and 1s
(obstacles), determine the length of the shortest path from the coordinate {0,0} to {m-1,n-1}, given
that you can remove up to k obstacles.

Before you continue reading, try solving this problem yourself. The scaffolding for
this problem is located at traversal/obstacles. Your goal is to make the following
commands pass without any errors: bazel test //traversal/obstacles/..., bazel
test --config=addrsan //traversal/obstacles/..., bazel test --config=ubsan
//traversal/obstacles/....

Because we are looking for the shortest path, we don’t have a choice of the traversal algorithm. We must use a breadth-first search. But how do we deal with the obstacles?
Let’s consider adding a third dimension to the problem. Instead of removing an obstacle, we can virtually move to a new maze floor, where this obstacle never existed. However, this introduces a problem: we can’t apply this logic mindlessly, since there are potentially m*n obstacles.

Fortunately, we can lean on the behaviour of breadth-first search. When we enter a new floor of
the maze, we have a guarantee that we will never revisit the space we entered through. This means
we do not have to track which specific obstacles we removed, only how many we can still remove,
shrinking the number of floors to k+1. Applying a breadth-first search then leaves us with a total
time complexity of O(m*n*(k+1)).
Figure 52. Breadth-first search in a maze with obstacle removal.

#include <vector>
#include <queue>
#include <cstdint>

struct Dir {
    int64_t row;
    int64_t col;
};

struct Pos {
    int64_t row;
    int64_t col;
    int64_t k;
    int64_t distance;
};

int shortest_path(const std::vector<std::vector<int>>& grid, int64_t k) {
    // Keep track of visited spaces, initialize all spaces as unvisited.
    std::vector<std::vector<std::vector<bool>>> visited(
        grid.size(), std::vector<std::vector<bool>>(
            grid[0].size(), std::vector<bool>(k+1, false)
        )
    );

    // BFS
    std::queue<Pos> q;
    // start in {0,0} with zero removed obstacles
    q.push(Pos{0,0,0,0});
    visited[0][0][0] = true;

    while (!q.empty()) {
        auto current = q.front();
        q.pop();
        // The first time we visit the end coordinate is the shortest path
        if (current.row == std::ssize(grid)-1 &&
            current.col == std::ssize(grid[current.row])-1) {
            return current.distance;
        }

        // For every direction, try to move there
        for (auto dir : {Dir{-1,0}, Dir{1,0}, Dir{0,-1}, Dir{0,1}}) {
            // This space is out of bounds, ignore.
            if ((current.row + dir.row < 0) ||
                (current.col + dir.col < 0) ||
                (current.row + dir.row >= std::ssize(grid)) ||
                (current.col + dir.col >= std::ssize(grid[0])))
                continue;

            // If the space in the current direction is an empty space:
            Pos empty = {current.row + dir.row, current.col + dir.col,
                         current.k, current.distance + 1};
            if (grid[empty.row][empty.col] == 0 &&
                !visited[empty.row][empty.col][empty.k]) {
                // add it to the queue
                q.push(empty);
                // and mark as visited
                visited[empty.row][empty.col][empty.k] = true;
            }

            // If we have already removed k obstacles,
            // we don't consider removing more.
            if (current.k == k)
                continue;

            // If the space in the current direction is an obstacle:
            Pos wall = {current.row + dir.row, current.col + dir.col,
                        current.k + 1, current.distance + 1};
            if (grid[wall.row][wall.col] == 1 &&
                !visited[wall.row][wall.col][wall.k]) {
                // add it to the queue
                q.push(wall);
                // and mark as visited
                visited[wall.row][wall.col][wall.k] = true;
            }
        }
    }

    // If we are here, we did not reach the end coordinate.
    return -1;
}

Shortest path with non-unit costs


In all the problems we discussed, the cost of moving from one space to another was always one unit. The breadth-first search algorithm explicitly relies on this property to process spaces strictly by distance.
Therefore, if we are working with a problem that doesn’t have unit costs, we must adjust our approach.
Consider the following problem: Given a 2D heightmap of size m*n, where negative integers
represent impassable terrain and positive integers represent the terrain height, determine the shortest
path between the two given coordinates under the following constraints:

• the path cannot cross impassable terrain
• moving on level terrain costs two time-units
• moving downhill costs one time-unit
• moving uphill costs four time-units

Before you continue reading, try solving this problem yourself. The scaffolding for
this problem is located at traversal/heightmap. Your goal is to make the following
commands pass without any errors: bazel test //traversal/heightmap/..., bazel
test --config=addrsan //traversal/heightmap/..., bazel test --config=ubsan
//traversal/heightmap/....

The primary requirement of BFS is that we process elements in the order of their distance from the start of the path. When all transitions have a unit cost, we can achieve this by relying on a queue. However, with non-unit costs, we must use an ordered structure such as std::priority_queue; this effectively turns the breadth-first search into Dijkstra’s algorithm. Note that switching to a priority queue will affect the time complexity, as we are moving from O(1) push and pop operations to O(log(n)) push and pop operations.
The second guarantee we lose concerns the shortest path when we first push a space into the queue. If we discovered a space with a path of length X, we had a guarantee that all later paths that also lead to this space would, at best, equal X. Because of this property, we could limit ourselves to adding each space into the queue only once. With non-unit costs, this no longer holds.
It is possible that a later (and longer) path can enter the same space with an overall shorter path length. Consequently, we might need to insert a space multiple times into our queue (bounded by the number of neighbours). However, we still have a slightly weaker but significant guarantee: the ordered nature of the priority queue guarantees that the first time we pop a space from the queue, it is part of the shortest path that enters this space.
Due to the queue’s logarithmic complexity, we end up with O(m*n*log(m*n)) overall time complexity for the breadth-first search.

Figure 53. Breadth-first search using a priority queue to handle non-unit costs.
#include <vector>
#include <queue>
#include <cstdint>

struct Coord {
    int64_t row;
    int64_t col;
};

int64_t shortest_path(const std::vector<std::vector<int>>& map,
                      Coord start, Coord end) {
    struct Pos {
        int64_t length;
        Coord coord;
    };

    // For tracking visited spaces
    std::vector<std::vector<bool>> visited(map.size(),
        std::vector<bool>(map[0].size(), false));

    // Helper to check whether a space can be stepped on:
    // not out of bounds, not impassable and not visited
    auto can_step = [&map, &visited](Coord coord) {
        auto [row, col] = coord;
        return row >= 0 && col >= 0 &&
               row < std::ssize(map) && col < std::ssize(map[row]) &&
               map[row][col] >= 0 &&
               !visited[row][col];
    };

    // Priority queue instead of a simple queue
    std::priority_queue<Pos, std::vector<Pos>,
        decltype([](const Pos& l, const Pos& r) {
            return l.length > r.length;
        })> q;
    // Start with path length zero at start
    q.push({0,start});

    // Helper to determine the cost of moving between two spaces
    auto step_cost = [&map](Coord from, Coord to) {
        if (map[from.row][from.col] < map[to.row][to.col]) return 4;
        if (map[from.row][from.col] > map[to.row][to.col]) return 1;
        return 2;
    };

    while (!q.empty()) {
        // Grab the position closest to the start
        auto [length, pos] = q.top();
        q.pop();

        if (visited[pos.row][pos.col]) continue;
        // The first time we grab a position from the queue is guaranteed
        // to be the shortest path, so now we need to mark it as visited.
        // If we later visit the same position (already in queue at this point)
        // with a longer path, we skip it based on the above check.
        visited[pos.row][pos.col] = true;

        // The first time we pop the end space is the shortest path.
        if (pos.row == end.row && pos.col == end.col)
            return length;

        // Expand to all four directions
        for (auto next : {Coord{pos.row-1,pos.col},
                          Coord{pos.row+1, pos.col},
                          Coord{pos.row, pos.col-1},
                          Coord{pos.row, pos.col+1}}) {
            if (!can_step(next)) continue;
            q.push({length + step_cost(pos, next), next});
        }
    }

    return -1;
}

Constraint propagation
In the previous section, we used backtracking to solve the N-Queens problem. However, if you look
at the implementation, we repeatedly check each new queen against all previously placed queens.
We can do better.
When working with backtracking, we cannot escape the inherent exponential complexity of the worst case. However, we can often significantly reduce the exponent by propagating the problem’s constraints forward. The main objective is to remove as many options from consideration as possible by ensuring that the constraints are maintained as we search.

Before you continue reading, try modifying the previous version yourself. The scaf-
folding for this problem is located at traversal/queens. Your goal is to make the
following commands pass without any errors: bazel test //traversal/queens/...,
bazel test --config=addrsan //traversal/queens/..., bazel test --config=ubsan
//traversal/queens/....

Specifically for the N-Queens problem, we have N rows, N columns, 2*N-1 NorthWest diagonals, and 2*N-1 NorthEast diagonals. Placing a queen translates to claiming one row, one column, and the corresponding diagonals. Instead of checking each queen against all previous queens, we can limit ourselves to checking whether the corresponding row, column, or one of the two diagonals has already been claimed.
Figure 54. Solving the N-Queens problem with backtracking and constraint propagation.
// Helper to store the current state:
struct State {
    State(int64_t n) : n(n), solution{}, cols(n), nw_dia(2*n-1), ne_dia(2*n-1) {}
    // Size of the problem.
    int64_t n;
    // Partial solution
    std::vector<int64_t> solution;
    // Occupied columns
    std::vector<bool> cols;
    // Occupied NorthWest diagonals
    std::vector<bool> nw_dia;
    // Occupied NorthEast diagonals
    std::vector<bool> ne_dia;
    // Check column, and both diagonals
    bool available(int64_t row, int64_t col) const {
        return !cols[col] && !nw_dia[row-col+n-1] && !ne_dia[row+col];
    }
    // Mark this position as occupied and add it to the partial solution
    void mark(int64_t row, int64_t col) {
        solution.push_back(col);
        cols[col] = true;
        nw_dia[row-col+n-1] = true;
        ne_dia[row+col] = true;
    }
    // Unmark this position as occupied and remove it from the partial solution
    void erase(int64_t row, int64_t col) {
        solution.pop_back();
        cols[col] = false;
        nw_dia[row-col+n-1] = false;
        ne_dia[row+col] = false;
    }
};

bool backtrack(auto& state, int64_t row, int64_t n) {
    // All queens have their positions, we have a solution
    if (row == n) return true;

    // Try to find a feasible column on this row
    for (int64_t c = 0; c < n; ++c) {
        if (!state.available(row,c))
            continue;
        // Mark this position
        state.mark(row,c);
        // Recurse to the next row
        if (backtrack(state, row+1, n))
            return true; // We found a solution on this path
        // This position led to a dead-end, erase and try another
        state.erase(row,c);
    }
    // This is a dead-end
    return false;
}

Canonical problems
Traversal algorithms are possibly the most frequent algorithms during technical interviews. In this
section, we will limit ourselves to only five problems that exemplify the different variants of traversal
algorithms we have discussed in the last two sections.

Locked rooms
Given an array of n locked rooms, each room containing 0..n distinct keys, determine whether you
can visit each room. You are given the key to room zero, and each room can only be opened with
the corresponding key (however, there may be 0..n copies of that key).
Assume you can freely move between rooms; the key is the only thing you need.

The scaffolding for this problem is located at traversal/locked. Your goal is to make
the following commands pass without any errors: bazel test //traversal/locked/...,
bazel test --config=addrsan //traversal/locked/..., bazel test --config=ubsan
//traversal/locked/....

Figure 55. Example of a situation where each room can be reached.

For example, in the above situation, we can open the red lock to collect the blue and green keys,
then the green lock to collect the brown key, and finally, open the remaining blue and brown locks.

Bus routes
Given a list of bus routes, where route[i] = {b1,b2,b3} means that bus i stops at stops b1, b2, and b3,
determine the smallest number of buses you need to reach the target bus stop starting at the source.
Return -1 if the target is unreachable.

The scaffolding for this problem is located at traversal/buses. Your goal is to make
the following commands pass without any errors: bazel test //traversal/buses/...,
bazel test --config=addrsan //traversal/buses/..., bazel test --config=ubsan
//traversal/buses/....

Figure 56. Example of possible sequences of bus trips for different combinations of source and target stops.

In the above situation, we can reach stop six from stop one by first taking the red bus and then
switching to the blue bus at stop four.

Counting islands
Given a map as a std::vector<std::vector<char>> where ‘L’ represents land and ‘W’ represents water,
return the number of islands on the map.
An island is an orthogonally (four directions) connected area of land spaces that is fully (orthogo-
nally) surrounded by water.

The scaffolding for this problem is located at traversal/islands. Your goal is to make
the following commands pass without any errors: bazel test //traversal/islands/...,
bazel test --config=addrsan //traversal/islands/..., bazel test --config=ubsan
//traversal/islands/....

Figure 57. Example of a 4x4 map with only a single island.

For example, in the above map, we only have one island since no other land masses are fully
surrounded by water.

All valid parentheses sequences


Given n pairs of parentheses, generate all valid combinations of these parentheses.

The scaffolding for this problem is located at traversal/parentheses. Your goal is to make the following commands pass without any errors: bazel test //traversal/parentheses/..., bazel test --config=addrsan //traversal/parentheses/..., bazel test --config=ubsan //traversal/parentheses/....

Figure 58. Example of all possible valid combinations of parentheses for three pairs of parentheses.

For example, for n==3 all valid combinations are: ()()(), (()()), ((())), (())() and ()(()).

Sudoku solver
Given a Sudoku puzzle as std::vector<std::vector<char>>, where unfilled spaces are represented as a
space, solve the puzzle.
In a solved Sudoku puzzle, each of the nine rows, columns, and 3x3 boxes must contain all digits
1..9.

The scaffolding for this problem is located at traversal/sudoku. Your goal is to make
the following commands pass without any errors: bazel test //traversal/sudoku/...,
bazel test --config=addrsan //traversal/sudoku/..., bazel test --config=ubsan
//traversal/sudoku/....

Figure 59. Example of a Sudoku puzzle.



Hints

Locked rooms
1. Think about the keys.
2. The question we need to answer is whether we can collect all keys.
3. We do not need to find the shortest route; therefore, a depth-first search will be good enough.

Bus routes
1. Don’t think in terms of bus stops.
2. You will need to pre-compute something.
3. Pre-computing a connection mapping (for each bus, which other buses we can switch to directly)
will allow you to traverse over buses instead of bus stops.
4. We need the shortest path, so we must use a breadth-first search.

Counting islands
1. If we traverse a potential island, we will visit all of its spaces and all neighbouring spaces.
2. What type of space will we not encounter if the landmass is an island?

All valid parentheses


1. What are the properties that hold true for a valid parentheses sequence?
2. We only have n pairs of parentheses. This leads to one constraint.
3. We can only add a right parenthesis under specific circumstances. This leads to one constraint.
4. We need the backtracking algorithm to find all paths that satisfy the above constraints.

Sudoku solver
1. We have walked through the solution for a very similar problem.
2. Try modifying the solution for N-Queens.
3. What constraints can we propagate to improve the solving performance?

Solutions

Locked rooms
Let’s start with our goal. We want to determine whether we can visit all the locked rooms. However,
this is a bit too complex, as we would need to consider both rooms and keys. We can simplify the
problem by reformulating our goal: collect a complete set of keys.
Because we are not concerned with the optimality of our solution, only whether it is possible to
collect all keys, we can choose depth-first search as our base algorithm. We will use one key in each
step of our solution. Using a key will potentially give us access to more keys.

Figure 60. Example of one possible DFS execution on the example problem.

Once we run out of keys, we can check whether we have collected a complete set. With a complete
set of keys, we can visit all rooms.

Figure 61. Solution for the locked rooms problem.


1 bool locked_rooms(const std::vector<std::vector<int>>& rooms) {
2 // Keep track of the keys we have collected
3 std::vector<bool> keys(rooms.size(),false);
4
5 std::stack<int> keys_to_use;
6 keys_to_use.push(0); // We start with the key to room 0
7 keys[0] = true;
8
9 while (!keys_to_use.empty()) {
10 // Use the key to open a room
11 int key = keys_to_use.top();
12 keys_to_use.pop();
13
14 // Check if any of the keys in the room are new
15 for (int k : rooms[key])
16 if (!keys[k]) {
17 keys_to_use.push(k);
18 keys[k] = true;
19 }
20 }
21
22 // Do we have all the keys?
23 return std::ranges::all_of(keys, std::identity{});
24 }

Bus routes
We are trying to find the path that uses the fewest buses, i.e., minimizes the number of changes at
bus stops. We could therefore use a breadth-first search, searching across bus stops.
However, that poses a problem. We don’t have a convenient way to determine which bus stops we
can reach. We could construct a data structure representing which bus stops can be reached by a
single connection. However, such a data structure would grow based on the overall number of bus
stops.
We can do a lot better. Instead of considering bus stops, we can think in terms of bus lines. We still
need to build a data structure that will provide the mapping of connections, i.e., for each bus line,
list all other bus lines that we can switch to directly from this line. However, the big difference is
that now the size of our data structure scales with the number of bus lines, not bus stops.
To construct the bus line mapping, we can sort the list of bus stops for each line and then check
each pair of buses for overlap. This leads to O(s*log(s)) complexity for the sort and O(b*b*s) for the
construction of the line mapping.

Executing the breadth-first search on the bus line mapping will require O(b*b) time.
Figure 62. Solution for the Bus routes problem.

1 // There is no convenient is_overlapping algorithm unfortunately
2 bool overlaps(const std::vector<int>& left, const std::vector<int>& right) {
3 ptrdiff_t i = 0; ptrdiff_t j = 0;
4 while (i < std::ssize(left) && j < std::ssize(right)) {
5 while (i < std::ssize(left) &&
6 left[i] < right[j])
7 ++i;
8 while (i < std::ssize(left) &&
9 j < std::ssize(right) &&
10 left[i] > right[j])
11 ++j;
12 if (i < std::ssize(left) &&
13 j < std::ssize(right) &&
14 left[i] == right[j])
15 return true;
16 }
17 return false;
18 }
19
20 int min_tickets(std::vector<std::vector<int>> routes, int source, int target) {
21 if (source == target) { return 0; }
22
23 // Map of bus -> connecting buses
24 std::vector<std::vector<int>> connections(routes.size());
25 for (auto &route : routes)
26 std::ranges::sort(route);
27
28 // Flag for whether a bus stops at target
29 std::vector<bool> is_dest(routes.size(), false);
30 // Flag for whether this bus was already visited
31 std::vector<bool> visited(routes.size(), false);
32 // Queue for BFS
33 std::queue<std::pair<int,int>> q;
34
35 for (ptrdiff_t i = 0; i < std::ssize(routes); ++i) {
36 // The bus stops at source, one of our starting buses
37 if (std::ranges::binary_search(routes[i], source)) {
38 q.push({i,1});
39 visited[i] = true;
40 }
41 // The bus stops at target
42 if (std::ranges::binary_search(routes[i], target))
43 is_dest[i] = true;
44
45 // Find all other buses that connect to this bus
46 for (ptrdiff_t j = i+1; j < std::ssize(routes); ++j) {
47 if (overlaps(routes[i],routes[j])) {
48 connections[i].push_back(j);
49 connections[j].push_back(i);
50 }
51 }
52 }
53
54 // BFS
55 while (!q.empty()) {
56 auto [current,len] = q.front();
57 q.pop();
58 if (is_dest[current])
59 return len;
60
61 for (auto bus : connections[current]) {
62 if (visited[bus]) continue;
63 q.push({bus, len+1});
64 visited[bus] = true;
65 }
66 }
67
68 return -1;
69 }

Counting islands
Our first objective is to figure out a way to determine that a connected piece of land is an island.
If we consider this problem from the perspective of traversing a piece of land, we will encounter
not only the spaces this piece of land occupies but also all neighbouring spaces (otherwise, we could
miss a piece of land). Therefore, we can reformulate this property.
A piece of land is an island if we do not encounter the map boundary during our traversal.
Encountering a land space extends this land mass, and encountering water maintains the island
property.
To ensure that we check all possible islands, we have to scan through the entire map, and when we
encounter a space that hasn’t been traversed yet, we start a new traversal to determine whether this
land mass is an island.
So far, I haven’t specified whether we should use a depth-first or a breadth-first search. Unlike most
other problems, where there is a clear preference towards one or the other, in this case, both end up
equal in both time and space complexity. The example solution relies on a depth-first search.
Figure 63. Solution for the counting islands problem.

1 // depth-first search
2 bool island(int64_t i, int64_t j, std::vector<std::vector<char>>& grid) {
3 // If we managed to reach out of bounds, this is not an island
4 if (i == -1 || i == std::ssize(grid) || j == -1 || j == std::ssize(grid[i]))
5 return false;
6 // If this space is not land, ignore
7 if (grid[i][j] != 'L')
8 return true;
9 // Mark this space as visited
10 grid[i][j] = 'V';
11
12 // We can only return true (this is an island) if all four
13 // directions of our DFS return true. However, at the same time
14 // even if this is not an island we want to explore all spaces
15 // of the land mass, just to mark it as visited.
16 // If we used a boolean expression, we would run into
17 // short-circuiting, the first "false" result would stop
18 // the evaluation.
19 // Here we take advantage of the bool->int conversion:
20 // false == 0, true == 1
21 return (island(i-1,j,grid) + island(i+1,j,grid)
22 + island(i,j-1,grid) + island(i,j+1,grid)) == 4;
23 }
24
25 int count_islands(std::vector<std::vector<char>> grid) {
26 int cnt = 0;
27 // For every space
28 for (int64_t i = 0; i < std::ssize(grid); ++i)
29 for (int64_t j = 0; j < std::ssize(grid[i]); ++j)
30 // If it is an unvisited land space, check if it is an island
31 if (grid[i][j] == 'L' && island(i,j,grid))
32 ++cnt;
33 return cnt;
34 }

All valid parentheses sequences


Enumerating all possible combinations under a specific constraint is a canonical problem for
backtracking. Our first objective is to formulate our constraints.
The first constraint follows from the input. Because we only have n pairs of parentheses, we can
only add n left and n right parentheses.
The second constraint encodes the validity of a parentheses sequence. Adding a left parenthesis
will never produce an invalid sequence; however, each right parenthesis must match a previous left
parenthesis, meaning we can never have more right parentheses than left parentheses.
Finally, we need to make sure that we keep track of the values required to validate both constraints
as we go, to avoid continually recounting the number of parentheses.
Figure 64. Solution for the valid parentheses problem.
1 void generate(std::vector<std::string>& solutions, size_t n,
2 std::string& prefix, size_t left, size_t right) {
3 // All n pairs placed, we have a complete solution
4 if (prefix.length() == 2*n)
5 solutions.push_back(prefix);
6
7 // We can only add a left parenthesis if we haven't used all of them.
8 if (left < n) {
9 prefix.push_back('(');
10 // Explore all solutions with this prefix
11 generate(solutions, n, prefix, left + 1, right);
12 prefix.pop_back();
13 }
14 // We can only add a right parenthesis if we have used more left
15 // than right parentheses.
16 if (left > right) {
17 prefix.push_back(')');
18 // Explore all solutions with this prefix
19 generate(solutions, n, prefix, left, right + 1);
20 prefix.pop_back();
21 }
22 }
23
24 std::vector<std::string> valid_parentheses(size_t n) {
25 std::vector<std::string> solutions;
26 std::string prefix;
27 generate(solutions, n, prefix, 0, 0);
28 return solutions;
29 }

Sudoku solver
One of the requirements for a proper Sudoku puzzle is that it can be solved entirely without guessing
simply by propagating the constraints.
However, implementing a non-guessing Sudoku solver is not something you could do within a
coding interview; therefore, we will need to limit our scope and do at least some guessing. At
the same time, we do not want to completely brute force the puzzle, as that will be pretty slow.
A good middle ground is applying the primary Sudoku constraint: each number can only appear
once in each row, column, and box. Consequently, if we are guessing a number for a particular
space, we can skip all the numbers already present in that row, column, and box.

Figure 65. Example of the effect of primary Sudoku constraints. The highlighted cell has only two possible values:
six and seven.

The implementation mirrors the solution for the N-Queens problem with constraint propagation;
however, because we are working with a statically sized puzzle (9x9), we can additionally take
advantage of the fastest C++ containers: std::array and std::bitset.
Each Sudoku puzzle has nine rows, nine columns, and nine boxes, each of which we can represent
with a std::bitset, where set bits represent digits already present in the corresponding row, column, or
box.

Figure 66. Solution for the Sudoku solver problem.


1 /* Calculate the corresponding box for row/col coordinates:
2 0 1 2
3 3 4 5
4 6 7 8
5
6 Any mapping will work, as long as it is consistent.
7 */
8 int64_t get_box(int64_t row, int64_t col) {
9 return (row/3)*3+col/3;
10 }
11
12 struct State {
13 // Initialize the state with given digits
14 State(const std::vector<std::vector<char>>& puzzle) {
15 for (int64_t i = 0; i < 9; ++i)
16 for (int64_t j = 0; j < 9; ++j)
17 if (puzzle[i][j] != ' ')
18 mark(i, j, puzzle[i][j]-'1');
19 }
20
21 std::array<std::bitset<9>,9> row;
22 std::array<std::bitset<9>,9> col;
23 std::array<std::bitset<9>,9> box;
24
25 // Get the already used digits for a specific space.
26 std::bitset<9> used(int64_t r_idx, int64_t c_idx) {
27 return row[r_idx] | col[c_idx] | box[get_box(r_idx,c_idx)];
28 }
29 // Mark this digit as used in the corresponding row, column and box.
30 void mark(int64_t r_idx, int64_t c_idx, int64_t digit) {
31 row[r_idx][digit] = true;
32 col[c_idx][digit] = true;
33 box[get_box(r_idx, c_idx)][digit] = true;
34 }
35 // Mark this digit as unused in the corresponding row, column and box.
36 void unmark(int64_t r_idx, int64_t c_idx, int64_t digit) {
37 row[r_idx][digit] = false;
38 col[c_idx][digit] = false;
39 box[get_box(r_idx, c_idx)][digit] = false;
40 }
41 };
42
43 // Get the next empty space after {row,col}
44 std::pair<int64_t,int64_t> next(
45 const std::vector<std::vector<char>>& puzzle,
46 int64_t row, int64_t col) {
47 int64_t start = col;
48 for (int64_t i = row; i < std::ssize(puzzle); ++i)
49 for (int64_t j = std::exchange(start,0); j < std::ssize(puzzle[i]); ++j)
50 if (puzzle[i][j] == ' ')
51 return {i,j};
52 return {-1,-1};
53 }
54
55 bool backtrack(
56 std::vector<std::vector<char>>& puzzle,
57 State& state,
58 int64_t r_curr, int64_t c_curr) {
59
60 // next coordinate to fill
61 auto [r_next, c_next] = next(puzzle, r_curr, c_curr);
62 // {-1,-1} means there is no unfilled space,
63 // i.e. we have solved the puzzle
64 if (r_next == -1 && c_next == -1)
65 return true;
66
67 // The candidate numbers for this space cannot
68 // repeat in the row, column or box.
69 auto used = state.used(r_next, c_next);
70
71 // Guess a number
72 for (int64_t i = 0; i < 9; ++i) {
73 // Already in a row, column or box
74 if (used[i]) continue;
75
76 // Mark it on the puzzle
77 puzzle[r_next][c_next] = '1'+i;
78 state.mark(r_next,c_next,i);
79
80 if (backtrack(puzzle,state,r_next,c_next))
81 return true;
82 // we get false if this was a guess
83 // that didn't lead to a solution
84
85 // Unmark from the puzzle
86 state.unmark(r_next,c_next,i);
87 puzzle[r_next][c_next] = ' ';
88 // And try the next digit
89 }
90 return false;
91 }
92
93 bool solve(std::vector<std::vector<char>>& puzzle) {
94 State state(puzzle);
95 return backtrack(puzzle,state,0,0);
96 }
Trees
Interview questions that include trees can be tricky, notably in C++. You might expect problems
involving trees to be of similar complexity to linked lists. In fact, on a fundamental level, both trees
and linked lists are directed graphs. However, unlike linked lists, trees do not get support from the
standard C++ library. No data structure can directly represent trees, and no algorithms can directly
operate on trees¹.

Representing trees
Since we cannot rely on the standard library to provide a tree data structure, we must build our own.
The design options mirror our approaches when implementing a custom linked list (see Custom
lists).
The most straightforward approach for a binary tree would be to rely on std::unique_ptr and have
each node own its children.
Figure 67. Flawed approach for implementing a binary tree.
1 template <typename T>
2 struct TreeNode {
3 T value = T{};
4 std::unique_ptr<TreeNode> left;
5 std::unique_ptr<TreeNode> right;
6 };
7
8 auto root = std::make_unique<TreeNode<std::string>>(
9 "root node", nullptr, nullptr);
10 // root->value == "root node"
11 root->left = std::make_unique<TreeNode<std::string>>(
12 "left node", nullptr, nullptr);
13 // root->left->value == "left node"
14 root->right = std::make_unique<TreeNode<std::string>>(
15 "right node", nullptr, nullptr);
16 // root->right->value == "right node"

While this might be tempting, and notably, this approach even makes sense from an ownership
perspective, it suffers from the same recursive destruction problem as a linked list.
When working with well-balanced trees, the problem might not manifest; however, a forward-only
linked list is still a valid binary tree. Therefore, we can easily trigger the problem.
¹ You could argue that heap algorithms fit into this category.

Figure 68. A demonstration of a problem caused by recursive destruction.

1 template <typename T>
2 struct TreeNode {
3 T value = T{};
4 std::unique_ptr<TreeNode<T>> left;
5 std::unique_ptr<TreeNode<T>> right;
6 };
7
8 {
9 auto root = std::make_unique<TreeNode<int>>(0,nullptr);
10 // Depending on the architecture/compiler, the specific number
11 // of elements we can handle without crash will differ.
12 TreeNode<int>* it = root.get();
13 for (int i = 0; i < 100000; ++i)
14 it = (it->left = std::make_unique<TreeNode<int>>(0,nullptr)).get();
15 } // BOOM

As a reminder: The recursive nature comes from the chaining of std::unique_ptr. As part of
destroying a std::unique_ptr<TreeNode<int>> we first need to destroy the child nodes, which
first need to destroy their children, and so on. Each program has a limited stack space, and
a sufficiently deep naive binary tree can quickly exhaust this space.

While the above approach isn’t quite suitable for production code, it does offer a convenient
interface. For example, splicing the tree requires only calling std::swap on the source and destination
std::unique_ptr, which will work even across trees.
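A minimal sketch of such a splice, reusing the TreeNode shape from the figure above (the splice helper name is illustrative, not part of the book's repository):

```cpp
#include <memory>
#include <string>
#include <utility>

template <typename T>
struct TreeNode {
    T value = T{};
    std::unique_ptr<TreeNode> left;
    std::unique_ptr<TreeNode> right;
};

// Exchange two subtrees in O(1) by swapping the owning pointers.
// Because each node owns its children, this works even when the
// two pointers belong to two different trees.
template <typename T>
void splice(std::unique_ptr<TreeNode<T>>& a, std::unique_ptr<TreeNode<T>>& b) {
    std::swap(a, b);
}
```

Swapping with an empty std::unique_ptr simply moves a subtree from one tree to the other.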
To avoid recursive destruction, we can separate the encoding of the structure of the tree from
resource ownership.
Figure 69. A binary tree with structure and resource ownership separated.

1 template <typename T>
2 struct Tree {
3 struct Node {
4 T value = T{};
5 Node* left = nullptr;
6 Node* right = nullptr;
7 };
8 Node* add(auto&& ... args) {
9 storage_.push_back(std::make_unique<Node>(
10 std::forward<decltype(args)>(args)...));
11 return storage_.back().get();
12 }
13 Node* root;
14 private:
15 std::vector<std::unique_ptr<Node>> storage_;
16 };
17
18 Tree<std::string> tree;
19 tree.root = tree.add("root node");
20 // tree.root->value == "root node"
21 tree.root->left = tree.add("left node");
22 // tree.root->left->value == "left node"
23 tree.root->right = tree.add("right node");
24 // tree.root->right->value == "right node"

This approach does completely remove the recursive destruction; however, we pay a price for it. While
we can still easily splice within a single tree, splicing between multiple trees becomes cumbersome
(because it involves splicing between the resource pools).
In the context of C++, neither of the above solutions is particularly performance-friendly. The
biggest problem is that we are allocating each node separately, which means that they can be
allocated far apart, in the worst-case situation, each node mapping to a different cache line.
Conceptually, the solution is obvious: flatten the tree. However, as we are talking about
performance-sensitive design, the specific details of the approach matter a lot. A serious
implementation will have to take into account the specific data access pattern.
The following is one possible approach for a binary tree.
Figure 70. One possible approach for representing a binary tree using flat data structures.

1 constexpr inline size_t nillnode =
2 std::numeric_limits<size_t>::max();
3
4 template <typename T>
5 struct Tree {
6 struct Children {
7 size_t left = nillnode;
8 size_t right = nillnode;
9 };
10
11 std::vector<T> data;
12 std::vector<Children> children;
13
14 size_t add(auto&&... args) {
15 data.emplace_back(std::forward<decltype(args)>(args)...);
16 children.push_back(Children{});
17 return data.size()-1;
18 }
19 size_t add_as_left_child(size_t idx, auto&&... args) {
20 size_t cid = add(std::forward<decltype(args)>(args)...);
21 children[idx].left = cid;
22 return cid;
23 }
24 size_t add_as_right_child(size_t idx, auto&&... args) {
25 size_t cid = add(std::forward<decltype(args)>(args)...);
26 children[idx].right = cid;
27 return cid;
28 }
29 };
30
31 Tree<std::string> tree;
32 auto root = tree.add("root node");
33 // tree.data[root] == "root node"
34 auto left = tree.add_as_left_child(root, "left node");
35 // tree.data[left] == "left node", tree.children[root].left == left
36 auto right = tree.add_as_right_child(root, "right node");
37 // tree.data[right] == "right node", tree.children[root].right == right

As usual, we pay for the added performance with increased complexity. We must refer to nodes
reallocation. On top of that, implementing splicing for a flat tree would be non-trivial and not
particularly performant as it involves re-indexing.
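Walking the flat representation follows the same logic as with pointer-based nodes, just through indices. A minimal sketch of a depth-first walk, with the Tree definition condensed from the figure above (pre_order_flat is an illustrative name):

```cpp
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

constexpr inline size_t nillnode = std::numeric_limits<size_t>::max();

// Condensed version of the flat binary tree from the figure above.
template <typename T>
struct Tree {
    struct Children {
        size_t left = nillnode;
        size_t right = nillnode;
    };
    std::vector<T> data;
    std::vector<Children> children;
    size_t add(auto&&... args) {
        data.emplace_back(std::forward<decltype(args)>(args)...);
        children.push_back(Children{});
        return data.size() - 1;
    }
};

// Depth-first (node, left, right) walk over the index-based nodes;
// nillnode plays the role that nullptr plays for pointer-based nodes.
template <typename T, typename Visitor>
void pre_order_flat(const Tree<T>& tree, size_t idx, Visitor&& visit) {
    if (idx == nillnode) return;
    visit(tree.data[idx]);
    pre_order_flat(tree, tree.children[idx].left, visit);
    pre_order_flat(tree, tree.children[idx].right, visit);
}
```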

Tree traversals
Before you read this section, I encourage you to familiarize yourself with depth-first and breadth-
first search. Both searches are suitable for traversing a tree.
However, for binary trees in particular, the property we care about is the specific order in which we
visit the nodes of the tree. We will start with three traversals that are all based on the depth-first
search.

Pre-order traversal
In pre-order traversal, we visit each node before visiting its children.

Figure 71. Pre-order traversal on a binary tree.

1 void pre_order(Node *node, const std::function<void(Node*)>& visitor) {
2 if (node == nullptr) return;
3 visitor(node);
4 pre_order(node->left, visitor);
5 pre_order(node->right, visitor);
6 }

Figure 72. Order of visiting nodes using pre-order traversal in a full binary tree.

A typical use case for pre-order traversal is when we need to serialise or deserialise a tree. In the
following example, we serialise a binary tree as a series of space-delimited integers with missing
child nodes represented by zeroes.
Figure 73. Serializing a binary tree of integers into a stream of space-delimited integers.

1 // Serialize using pre-order traversal.
2 void serialize(Node *node, std::ostream& s) {
3 if (node == nullptr) {
4 s << 0 << " ";
5 return;
6 }
7 s << node->value << " ";
8 serialize(node->left, s);
9 serialize(node->right, s);
10 }

We must have already deserialised the parent node before we can insert its children into the tree.
This is why pre-order traversal is a natural fit for this use case.

Figure 74. Deserializing a binary tree from a stream of space-delimited integers.


1 // Helper for deserializing a single node.
2 Tree<int>::Node *deserialize_single(Tree<int>& tree, std::istream& s) {
3 int value = 0;
4 if (!(s >> value) || value <= 0) return nullptr;
5 return tree.add(value);
6 }
7
8 // Deserialize using pre-order traversal.
9 Tree<int>::Node *deserialize(Tree<int>& tree, std::istream& s) {
10 auto node = deserialize_single(tree, s);
11 if (node == nullptr) return node;
12 node->left = deserialize(tree, s);
13 node->right = deserialize(tree, s);
14 return node;
15 }

Non-recursive pre-order

With a recursive approach, we can run into the same stack exhaustion problem we faced during tree
destruction. Fortunately, similar to the baseline depth-first search, we can switch to a non-recursive
implementation by relying on a std::stack or std::vector to store the traversal state.
Figure 75. Non-recursive implementation of pre-order traversal.
1 void pre_order_stack(Node* root, const std::function<void(Node*)>& visitor) {
2 std::stack<Node*> stack;
3 stack.push(root);
4 while (!stack.empty()) {
5 Node *curr = stack.top();
6 stack.pop();
7 visitor(curr);
8 // We visit "null" nodes with this approach, which might be helpful.
9 if (curr == nullptr) continue;
10 // Alternatively, we could move this condition to the push:
11 // if (curr->right != nullptr) stack.push(curr->right);
12
13 // We must insert in reverse to maintain the same
14 // ordering as recursive pre-order.
15 stack.push(curr->right);
16 stack.push(curr->left);
17 }
18 }

Post-order traversal
In post-order traversal, we visit each node after its children.
Figure 76. Recursive post-order traversal of a binary tree.

1 void post_order(Node *node, const std::function<void(Node*)>& visitor) {
2 if (node == nullptr) return;
3 post_order(node->left, visitor);
4 post_order(node->right, visitor);
5 visitor(node);
6 }

Figure 77. Order of visiting nodes using post-order traversal in a full binary tree.

Because of this ordering, one use case for post-order is in expression trees, where we can only
evaluate the parent expression once both its children have already been evaluated.
Figure 78. Example of a simple expression tree implementation where each node contains a value or a simple
operation.

1 struct Eventual {
2 std::variant<int, // value
3 // or operation
4 std::function<int(const Eventual& l, const Eventual& r)>> content;
5 };
6
7 Tree<Eventual> tree;
8 auto plus = [](const Eventual& l, const Eventual& r) {
9 return get<0>(l.content) + get<0>(r.content);
10 };
11 auto minus = [](const Eventual& l, const Eventual& r) {
12 return get<0>(l.content) - get<0>(r.content);
13 };
14 auto times = [](const Eventual& l, const Eventual& r) {
15 return get<0>(l.content) * get<0>(r.content);
16 };
17 // encode (4-2)*(2+1)
18 auto root = tree.root = tree.add(Eventual{times});
19 auto left = root->left = tree.add(Eventual{minus});
20 auto right = root->right = tree.add(Eventual{plus});
21 left->left = tree.add(Eventual{4});
22 left->right = tree.add(Eventual{2});
23 right->left = tree.add(Eventual{2});
24 right->right = tree.add(Eventual{1});
25
26 post_order(tree.root, [](Node* node) {
27 // If this node already has a result value, we don't have to do anything.
28 if (std::holds_alternative<int>(node->value.content)) return;
29 // If it is an operation, then evaluate.
30 // Post-order guarantees that node->left->value
31 // and node->right->value are both values.
32 node->value.content = std::get<1>(node->value.content)(
33 node->left->value, node->right->value);
34 });
35 // get<0>(root->value.content) == 6

Non-recursive post-order

For a non-recursive approach, we could visit all nodes using a mirrored pre-order (node, right child,
left child), remembering each node, and then iterate over the recorded nodes in reverse order.
However, we can do better.
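As a sketch, this reversal-based idea could look like the following. Note that it records every node before visiting any of them, so it costs O(n) extra memory; post_order_reversed is an illustrative name, not part of the book's repository.

```cpp
#include <functional>
#include <stack>
#include <vector>

struct Node {
    int value = 0;
    Node* left = nullptr;
    Node* right = nullptr;
};

// Post-order by reversal: record nodes in mirrored pre-order
// (node, right, left); replaying the record backwards yields
// left, right, node, i.e. post-order.
void post_order_reversed(Node* root, const std::function<void(Node*)>& visitor) {
    std::stack<Node*> stack;
    std::vector<Node*> record;
    if (root != nullptr) stack.push(root);
    while (!stack.empty()) {
        Node* curr = stack.top();
        stack.pop();
        record.push_back(curr);
        // Push left first so the right child is processed first.
        if (curr->left != nullptr) stack.push(curr->left);
        if (curr->right != nullptr) stack.push(curr->right);
    }
    for (auto it = record.rbegin(); it != record.rend(); ++it)
        visitor(*it);
}
```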
The main problem we must solve is remembering enough information to correctly decide whether
it is time to visit the parent node. The following approach eagerly explores the left sub-tree,
remembering both the right sibling and the parent node. When we revisit the parent node, we
can decide whether it is time to visit it based on the presence of the right sibling.
Figure 79. Non-recursive post-order traversal with only partial memoization.

1 void post_order_nonrecursive(Node *root, const std::function<void(Node*)>& visitor) {
2 std::stack<Node*> s;
3 Node *current = root;
4 while (true) {
5 // Explore left, but remember node & right child.
6 if (current != nullptr) {
7 if (current->right != nullptr)
8 s.push(current->right);
9 s.push(current);
10 current = current->left;
11 continue;
12 }
13 // current == nullptr
14 if (s.empty()) return;
15 current = s.top();
16 s.pop();
17 // If we have the right child remembered,
18 // it would be on the top of the stack.
19 if (current->right && !s.empty() && current->right == s.top()) {
20 // If it is, we must visit it (and its children) first.
21 s.pop();
22 s.push(current);
23 current = current->right;
24 } else {
25 // If the top of the stack is not the right child,
26 // we have already visited it.
27 visitor(current);
28 current = nullptr;
29 }
30 }
31 }

In-order traversal
In in-order traversal, we visit each node in between visiting its left and right children.
Unlike pre- and post-order traversals, which are relatively general and can easily be applied to
n-ary trees, in-order traversal only makes sense in the context of binary trees.
Figure 80. Recursive in-order traversal of a binary tree.
1 // in-order traversal
2 void in_order(Node* node, const std::function<void(Node*)>& visitor) {
3 if (node == nullptr) return;
4 in_order(node->left, visitor);
5 visitor(node);
6 in_order(node->right, visitor);
7 }

Figure 81. Order of visiting nodes using in-order traversal in a full binary tree.

The typical use case for in-order traversal is for traversing binary trees that encode an ordering of
elements. The in-order traversal naturally maintains this order during the traversal.
Figure 82. Traversing a BST to produce a sorted output.

1 // Insert an element into the tree in sorted order
2 void add_sorted(Tree<int64_t>& tree, Node* node, int64_t value) {
3 if (value <= node->value) {
4 if (node->left == nullptr)
5 node->left = tree.add(value);
6 else
7 add_sorted(tree, node->left, value);
8 } else {
9 if (node->right == nullptr)
10 node->right = tree.add(value);
11 else
12 add_sorted(tree, node->right, value);
13 }
14 }
15
16 Tree<int64_t> tree;
17 // Generate a sorted binary tree with 10 nodes
18 std::mt19937 gen(0); // change the seed for a different output
19 std::uniform_int_distribution<> dist(0,1000);
20 tree.root = tree.add(dist(gen));
21 for (int i = 0; i < 9; i++) {
22 add_sorted(tree, tree.root, dist(gen));
23 }
24
25 // in-order traversal will print the values in sorted order
26 in_order(tree.root, [](Node* node) {
27 std::cout << node->value << " ";
28 });
29 std::cout << "\n";
30 // stdlibc++: 424 545 549 593 603 624 715 845 848 858
31 // libc++: 9 192 359 559 629 684 707 723 763 835

Non-recursive in-order

The non-recursive approach is similar to post-order, but we avoid the complexity of remembering
the right child.

Figure 83. Non-recursive in-order traversal implementation.

1 void in_order_nonrecursive(Node *root, const std::function<void(Node*)>& visitor) {
2 std::stack<Node*> s;
3 Node *current = root;
4 while (current != nullptr || !s.empty()) {
5 // Explore left
6 while (current != nullptr) {
7 s.push(current);
8 current = current->left;
9 }
10 // Now going back up the left path visit each node,
11 // then explore the right child.
12 // This works, because the left child was already
13 // visited as we go up the path.
14 current = s.top();
15 s.pop();
16 visitor(current);
17 current = current->right;
18 }
19 }

Rank-order traversal
The rank-order or level-order traversal traverses nodes in the order of their distance from the root
node.
All the previous traversals (pre-order, post-order, and in-order) are based on depth-first search;
rank-order traversal is based on breadth-first search, which naturally avoids the recursion problem.
Figure 84. Rank-order traversal implementation.

1 void rank_order(Node* root, const std::function<void(Node*)>& visitor) {
2 std::queue<Node*> q;
3 if (root != nullptr)
4 q.push(root);
5 while (!q.empty()) {
6 Node* current = q.front();
7 q.pop();
8 if (current == nullptr) continue;
9 visitor(current);
10 q.push(current->left);
11 q.push(current->right);
12 }
13 }

Figure 85. Order of visiting nodes using rank-order traversal in a full binary tree.

Rank-order traversal typically comes up as part of more complex problems. On its own, it can be used
to find the closest node to the root that satisfies particular criteria, or to calculate each node's distance
from the root.
Figure 86. Calculating the maximum node value at each level of the tree.

1 std::vector<int> max_at_level(Node* root) {
2 std::vector<int> result;
3 std::queue<std::pair<Node*,size_t>> q;
4 if (root != nullptr)
5 q.push({root,0});
6 while (!q.empty()) {
7 auto [node,rank] = q.front();
8 q.pop();
9 if (result.size() <= rank)
10 result.push_back(node->value);
11 else
12 result[rank] = std::max(result[rank], node->value);
13 if (node->left != nullptr)
14 q.push({node->left,rank+1});
15 if (node->right != nullptr)
16 q.push({node->right, rank+1});
17 }
18 return result;
19 }

BST: Binary Search Tree


Binary trees are a commonly used data structure as they can efficiently encode decisions (at each
node, we can progress to the left or right child), leading to log(n) complexity (for a balanced tree).

One specific type of tree you can encounter during interviews is the binary search tree. This tree encodes a simple property: for each node, all values in the left subtree are lower than the value of this node, and all values in the right subtree are higher.

Figure 87. Example of a balanced binary search tree.

A balanced binary search tree can be used as a quick lookup table, as we can look up any value in log(n) operations. However, whether we arrive at a balanced tree very much depends on the order in which elements are inserted, as the binary search tree doesn't come with any self-balancing algorithm (for that, we would have to turn to red-black trees, which are outside the scope of this book).

Constructing a BST
To construct a binary search tree, we follow the lookup logic to find a null node where the added
value should be located.
Figure 88. Constructing a BST from a range.

Node*& find_place_for(Node*& root, int value) {
    // The first empty (null) node we encounter
    // is the place where we want to insert.
    if (root == nullptr)
        return root;
    // Lower values go to the left
    if (root->value > value)
        return find_place_for(root->left, value);
    // Higher values go to the right
    if (root->value <= value) // == is for equivalent values
        return find_place_for(root->right, value);
    return root;
}

Tree construct_bst(const std::vector<int>& rng) {
    Tree t;
    for (int v : rng)
        find_place_for(t.root, v) = t.add(v);
    return t;
}

As mentioned above, the binary search tree doesn’t come with any self-balancing algorithms; we
can, therefore, end up in pathological situations, notably when constructing a binary search tree
from a sorted input.

Figure 89. Example of an unbalanced tree formed by inserting elements {1,2,3,4}.

Validating a BST
Binary search trees frequently appear during coding interviews as they are relatively simple, yet
they encode an interesting property.
The most straightforward problem (aside from constructing a BST) is validating whether a binary
tree is a binary search tree.

Before you continue reading, I encourage you to try to solve it yourself. The scaf-
folding for this problem is located at trees/validate_bst. Your goal is to make the
following commands pass without any errors: bazel test //trees/validate_bst/...,
bazel test --config=addrsan //trees/validate_bst/..., bazel test --config=ubsan
//trees/validate_bst/....

If we are checking a particular node in a binary search tree, going to the left subtree sets an upper
bound on all the values in the left subtree and going to the right subtree sets a lower bound on all
the values in the right subtree.

Figure 90. Example of partitioning of values imposed by nodes in a binary search tree.

If we traverse the tree while keeping track of and verifying these bounds, we validate the BST: if we do not discover any violations, the tree is a BST; if we do, it isn't.
Figure 91. Validating a binary search tree.

bool is_valid_bst(Node* root, int min, int max) {
    // Is this node within the bounds?
    if (root->value > max || root->value < min)
        return false;
    // Explore the left subtree with the updated bounds
    if (root->left != nullptr) {
        if (root->value == INT_MIN) // avoid underflow
            return false;
        if (!is_valid_bst(root->left, min, root->value - 1))
            return false;
    }
    // Explore the right subtree with the updated bounds
    if (root->right != nullptr) {
        if (root->value == INT_MAX) // avoid overflow
            return false;
        if (!is_valid_bst(root->right, root->value + 1, max))
            return false;
    }
    return true;
}

bool is_valid_bst(const Tree& tree) {
    // An empty tree is a valid BST.
    if (tree.root == nullptr)
        return true;
    // Root can be any value
    return is_valid_bst(tree.root, INT_MIN, INT_MAX);
}

Note that this solution assumes no repeated values. To support duplicate values, we would have to adjust the check in the left branch, recursing with the same limit value (root->value) instead of root->value - 1.

Paths in trees
Another ubiquitous category of tree-oriented problems is dealing with paths in trees. Notably, paths
in trees can be non-trivial to reason about, but at the same time, the tree structure still offers the
possibility for very efficient solutions.
A path is a sequence of nodes where every two consecutive nodes have a parent/child relationship,
and each node is visited at most once.

Figure 92. Example of a tree with two highlighted paths.

Let’s demonstrate this on a concrete interview problem.

Maximum path in a tree


Given a binary tree, where each node has an integer value, determine the maximum path in this
tree. The value of a path is the total of all the node values visited.

Before you continue reading, I encourage you to try to solve it yourself. The scaf-
folding for this problem is located at trees/maximum_path. Your goal is to make the
following commands pass without any errors: bazel test //trees/maximum_path/...,
bazel test --config=addrsan //trees/maximum_path/..., bazel test --config=ubsan
//trees/maximum_path/....

Let’s consider a single node in the tree. Only four possible paths can be the maximum path that
crosses this node:

• a single-node path that contains this node only
• a path coming from the left child, terminating in this node
• a path coming from the right child, terminating in this node
• a path crossing this node, i.e. going from one child, crossing this node, and continuing to the other child

Figure 93. The four possible paths crossing a node.

Considering the above list, we can limit the information we need to calculate the maximum path in
a sub-tree whose root is the above node.
If the maximum path doesn’t cross this node, then the path is entirely contained in one of the child
subtrees.

If the path crosses this node, we can calculate the maximum path by using the information about
maximum paths that terminate in the left and right child.

• a single-node path is simply the value of the node
• a path coming from the left child terminating in this node is the value of the maximum path terminating in the left child, plus the value of this node
• a path coming from the right child terminating in this node is the value of the maximum path terminating in the right child, plus the value of this node
• a path crossing this node is the value of the maximum path terminating in the right child, plus the value of the maximum path terminating in the left child, plus the value of this node

The maximum path crossing this node is the maximum of the above paths.
Now that we know what to calculate, we can traverse the tree in post-order (visiting the children
before the parent node) while keeping track of the aforementioned values.
Figure 94. Solution using post-order traversal.

// We return two values:
// - the maximum path that terminates in this node
// - the maximum path in this sub-tree
std::pair<int,int> maxPath(Tree::Node* node) {
    // initialize with single-node paths
    int max_path = node->value;
    int max_subtree = node->value;
    int full_path = node->value;

    if (node->left != nullptr) {
        // Calculate recursive values for the left path
        auto [path, tree] = maxPath(node->left);
        // Path terminating in this node: max of case 1 and 2
        max_path = std::max(max_path, path + node->value);
        // The maximum path might not cross this node;
        // it can be contained in the left subtree.
        max_subtree = std::max(max_subtree, tree);
        // value of the crossing path (case 4)
        full_path += path;
    }
    if (node->right != nullptr) {
        // Calculate recursive values for the right path
        auto [path, tree] = maxPath(node->right);
        // Path terminating in this node: max of case 1 and 3
        // (note, we already covered case 2 in the left-child branch)
        max_path = std::max(max_path, path + node->value);
        // The maximum path might not cross this node;
        // it can be contained in the right subtree.
        max_subtree = std::max(max_subtree, tree);
        // value of the crossing path (case 4)
        full_path += path;
    }
    // The full path starts in the left subtree, crosses this node,
    // and continues into the right subtree. The maximum path in
    // this subtree is the maximum of all of these paths.
    max_subtree = std::max(max_subtree, std::max(full_path, max_path));
    // max_path is the maximum path terminating in this node
    return {max_path, max_subtree};
}

// Final computation, simply return the maximum
int maxPath(const Tree& t) {
    auto [path, tree] = maxPath(t.root);
    return tree;
}

Canonical problems
Tree problems cover quite a range, from simple variants of the basic traversals, through various path problems, to tricky problems that require non-trivial analysis for an efficient solution.
This section covers three medium-complexity problems: (de)serializing an n-ary tree, all nodes at distance k, and the number of reorders of a BST. The section also covers two tricky problems: the sum of distances to all nodes and well-behaved paths.

Serialise and de-serialise n-ary tree


Given an n-ary tree data structure, implement stream extraction and insertion operations that
serialise and deserialise the tree. The choice of format is part of the assignment.

Figure 95. The tree data structure.

struct Node {
    uint32_t value;
    std::vector<Node*> children;
};

struct Tree {
    Node* root = nullptr;
    // Add node to the tree, when parent == nullptr, the method sets the tree root
    Node* add_node(uint32_t value, Node* parent = nullptr);

    friend std::istream& operator>>(std::istream& s, Tree& tree);
    friend std::ostream& operator<<(std::ostream& s, Tree& tree);
private:
    std::vector<std::unique_ptr<Node>> storage_;
};

Figure 96. Serialising and deserialising an n-ary tree.

Each node stores a uint32_t value and a vector of non-owning pointers to its children (the nodes themselves are owned by the Tree's internal storage). To add a node to the tree, use the add_node method (the method sets the tree root when the parent is nullptr).

The scaffolding for this problem is located at trees/nary_tree. Your goal is to make
the following commands pass without any errors: bazel test //trees/nary_tree/...,
bazel test --config=addrsan //trees/nary_tree/..., bazel test --config=ubsan
//trees/nary_tree/....

Find all nodes of distance k in a binary tree


Given a binary tree containing unique integer values, return all nodes that are k distance from the
given node n.

Figure 97. Example tree with highlighted nodes distance two from the node with value 9.

The scaffolding for this problem is located at trees/kdistance. Your goal is to make
the following commands pass without any errors: bazel test //trees/kdistance/...,
bazel test --config=addrsan //trees/kdistance/..., bazel test --config=ubsan
//trees/kdistance/....

Sum of distances to all nodes


Given a tree with n nodes, represented as a graph using a neighbourhood map, calculate the sum of
distances to all other nodes for each node.

Figure 98. Example of a tree with four nodes and the corresponding calculated sums of distances.

The node ids are in the range [0,n).

The scaffolding for this problem is located at trees/sum_distances. Your goal is to make
the following commands pass without any errors: bazel test //trees/sum_distances/...,
bazel test --config=addrsan //trees/sum_distances/..., bazel test --config=ubsan
//trees/sum_distances/....

Well-behaved paths in a tree


Given a tree, represented using two arrays of length n:

• an array of node values, with values represented by positive integers
• an array of edges represented as pairs of indexes

Return the number of well-behaved paths. A well-behaved path begins and ends in a node with the same value, with all intermediate node values lower than or equal to the value at the ends of the path.

Figure 99. Example of a tree with five single-node well-behaved paths and one four-node (dashed line) well-behaved
path.

The scaffolding for this problem is located at trees/well_behaved. Your goal is to make
the following commands pass without any errors: bazel test //trees/well_behaved/...,
bazel test --config=addrsan //trees/well_behaved/..., bazel test --config=ubsan
//trees/well_behaved/....

Number of reorders of a serialized BST


Given a permutation of 1..N as a std::vector, return the number of other permutations that produce the same BST. The BST is produced by inserting elements in the order of their indexes (i.e. left to right). Because the number of permutations can be high, return the result modulo 10^9+7.
Trees 81

Figure 100. Example of two reorders that lead to the same binary search tree.

The scaffolding for this problem is located at trees/bst_reorders. Your goal is to make
the following commands pass without any errors: bazel test //trees/bst_reorders/...,
bazel test --config=addrsan //trees/bst_reorders/..., bazel test --config=ubsan
//trees/bst_reorders/....

Hints

Serialise and de-serialise n-ary tree


1. We have covered an approach for (de)serializing a binary tree.
2. Use pre-order traversal with a format that can represent the list of children.
3. You will need to store the number of children or use a terminal value.

Find all nodes of distance k in a binary tree


1. Can you calculate the value for a child if you know the distance for the parent node?
2. Consider whether the child lies on the path towards the target or not.
3. You will need a two-step process: finding the path to the target from the root and then
calculating the distances.

Sum of distances to all nodes


1. Can you calculate the value for the root node?
2. You can use post-order traversal to calculate the sum of distances for the root node from the children's subtrees.
3. What would happen if we removed one edge, calculated the values for the two roots, and then recombined?
4. You should be able to derive a straightforward formula for calculating the sum of distances for a child if you know the value for the parent node.
5. If you consider the values, a straightforward formula for calculating the sum of distances for a child from the parent node will pop out.
6. Once you have the formula, a pre-order traversal will allow you to fill in the missing values.

Well-behaved paths in a tree


1. You will want to iterate over edges instead of nodes.
2. Suppose we connect two sub-trees with an edge, and the maximum value in both sub-trees is the same. In that case, we can trivially calculate the number of paths this connection generates if we know the number of instances of the maximum value in both trees.
3. You will need the union-find algorithm for efficient lookup.

Number of reorders of a serialized BST


1. What is the role of the first value in the serialized tree?
2. We end up with two partitions; what operation can we do that doesn’t affect the value of these
partitions?
3. You will need to pre-calculate the Pascal triangle.

Solutions

Serialize and de-serialize n-ary tree


There are many possible approaches, as we can choose any format for serialization. However, all
approaches should use a pre-order traversal since we need the parent node when we process its
children.
The following approach uses a recursive pre-order traversal and a “terminal value” format. The
format terminates the list of children with the value -1, which is outside of the domain of uint32_t.
For example, a single node tree with the root value of 0 will serialize into “0 -1”.
We have to choose how to handle an empty tree. We can either serialize an empty tree as "-1" or leave the output unchanged, representing an empty tree as an empty string. The former has the benefit of explicitly denoting an empty tree, allowing us to store a specific number of consecutive empty trees in serialized form (with the empty-string representation, any number of empty trees collapses into the same empty output).
To deserialize a node, we deserialize children until we read a negative value, processing the entire
input using pre-order traversal.

Figure 101. Solution for serializing and de-serializing an n-ary tree.

void serialize(Node* root, std::ostream& s) {
    // each node is serialized into "value {children} -1"
    s << root->value << " ";
    for (auto c : root->children)
        serialize(c, s);
    s << "-1 ";
}

std::ostream& operator<<(std::ostream& s, Tree& tree) {
    if (tree.root != nullptr)
        serialize(tree.root, s);
    else
        // serialize an empty tree as "-1"
        s << "-1 ";
    return s;
}

Node* deserialize(Tree& tree, Node* parent, uint32_t parent_value, std::istream& s) {
    // pre-reading the value allows for cleaner code
    Node* result = tree.add_node(parent_value, parent);
    // we have to use int64_t to represent all values of uint32_t and -1
    int64_t value;
    while (s >> value) {
        if (value < 0) // if we read -1, we are done reading
            // the children of this node
            return result;
        // otherwise recursively de-serialize this child
        deserialize(tree, result, value, s);
    }
    return result;
}

std::istream& operator>>(std::istream& s, Tree& tree) {
    int64_t value;
    if (s >> value && value >= 0)
        deserialize(tree, nullptr, value, s);
    return s;
}

Find all nodes of distance k in a binary tree


One option is to translate the binary tree, essentially a directed graph, into an undirected graph, in
which we can easily find k-distance nodes by applying a breadth-first search.
However, we have a simpler option based on the following observation. Consider a node with one
of its children.
If the child doesn’t lie on the path between the node and our target, its distance to the target is
simply one more than the distance of the parent node. If it does lie on the path between the node
and the target, its distance is one less.

Figure 102. Example that demonstrates the distance changing between children on the path to the target node or not.

If we explore the tree using pre-order traversal, we also get a second guarantee: a node lies on the path to the target only if it also lies on the path between the root and the target node.
Using these observations, we can construct a two-pass solution.
First, we find our target and initialize the distances for all nodes on the path between the target and
the tree’s root.
In the second pass, if we have a precomputed value for a node, we know that it is on the path,
which allows us to distinguish between the two situations. Also, when we encounter a node with
the appropriate distance, we remember it.
Figure 103. Solution

// Search for the target and build distances to the root
int distance_search(Node* root, Node* target,
        std::unordered_map<int,int>& distances) {
    if (root == nullptr)
        return -1;
    if (root == target) {
        distances[root->value] = 0;
        return 0;
    }
    // Target in the left sub-tree
    if (int left = distance_search(root->left, target, distances);
        left >= 0) {
        distances[root->value] = left + 1;
        return left + 1;
    }
    // Target in the right sub-tree
    if (int right = distance_search(root->right, target, distances);
        right >= 0) {
        distances[root->value] = right + 1;
        return right + 1;
    }
    // Target not in this sub-tree
    return -1;
}

// Second pass traversal.
void dfs(Node* root, Node* target, int k, int dist,
        std::unordered_map<int,int>& distances,
        std::vector<int>& result) {
    if (root == nullptr) return;
    // Check if this node is on the path to the target.
    auto it = distances.find(root->value);
    // Node is on the path to the target, update the distance.
    if (it != distances.end())
        dist = it->second;
    // This node is k distance from the target.
    if (dist == k)
        result.push_back(root->value);

    // Distances to children are one more, unless they are on the path,
    // which is handled above.
    dfs(root->left, target, k, dist + 1, distances, result);
    dfs(root->right, target, k, dist + 1, distances, result);
}

std::vector<int> find_distance_k_nodes(Node* root, Node* target, int k) {
    // First pass
    std::unordered_map<int,int> distances;
    distance_search(root, target, distances);
    // Second pass
    std::vector<int> result;
    dfs(root, target, k, distances[root->value], distances, result);
    return result;
}

Sum of distances to all nodes


We will start with a simpler sub-problem: calculate the sum of distances to all nodes for the root
node only.
Let’s consider a node with a child subtree represented by the left child. When we move from the
child to the parent, we increase the distance to all nodes in this subtree by one or put another way,
we increase the total distance by node_count(left_subtree).
Therefore, if we want to calculate the sum of distances for the root node, we can do a post-
order traversal. At each node, we calculate the sum of distances as sum_of_distances(left) +
node_count(left) + sum_of_distances(right) + node_count(right).
Because this approach only gives us the solution to the root node, applying this process to the entire
tree would require rotating the tree, but more importantly, it would lead to O(n*n) complexity.
Fortunately, we can do better.
Instead of focusing on the nodes, let's focus on the edges between them. Consider a specific edge: suppose we remove it from the tree and calculate the sum of distances for each of the two nodes originally connected by the removed edge (each within its now-disjoint tree).

Figure 104. Example of a disconnected tree with the two highlighted nodes for which we have calculated the sum of
distances values.

We can reconstruct the sum of distances for the connected tree from the two disjoint values.

• sum_of_distances(a) = disconnected_sum(a) + disconnected_sum(b) + node_count(b)
• sum_of_distances(b) = disconnected_sum(b) + disconnected_sum(a) + node_count(a)
• sum_of_distances(a) - sum_of_distances(b) = node_count(b) - node_count(a)

This formula gives us the opportunity to calculate the answer for a child from the value of a parent.

• sum_of_distances(child) = sum_of_distances(parent) + node_count(parent) - node_count(child)
• sum_of_distances(child) = sum_of_distances(parent) + (total_nodes - node_count(child)) - node_count(child)
• sum_of_distances(child) = sum_of_distances(parent) + total_nodes - 2*node_count(child)

After we have calculated the sum of distances for the root node with post-order traversal, we do a
second traversal, this time in pre-order, filling in values for all nodes using the above formula.
This gives us a much better O(n) time complexity.
Figure 105. Solution for the sum of distances problem.
struct TreeInfo {
    TreeInfo(int n) : subtree_sum(n,0), node_count(n,0), result(n,0) {}
    std::vector<int> subtree_sum;
    std::vector<int> node_count;
    std::vector<int> result;
};

void post_order(int node, int parent,
        const std::unordered_multimap<int,int>& neighbours, TreeInfo& info) {
    // If there are no children, we have zero distance and one node.
    info.subtree_sum[node] = 0;
    info.node_count[node] = 1;

    auto [begin, end] = neighbours.equal_range(node);
    for (auto [from, to] : std::ranges::subrange(begin, end)) {
        // Avoid looping back to the node we came from.
        if (to == parent) continue;
        // post-order traversal, visit children first
        post_order(to, node, neighbours, info);
        // accumulate the number of nodes and the distances
        info.subtree_sum[node] += info.subtree_sum[to] + info.node_count[to];
        info.node_count[node] += info.node_count[to];
    }
}

void pre_order(int node, int parent,
        const std::unordered_multimap<int,int>& neighbours, TreeInfo& info) {
    // For the root node, the subtree_sum matches the result.
    if (parent == -1) {
        info.result[node] = info.subtree_sum[node];
    } else {
        // Otherwise, we can calculate the result from the parent,
        // because in pre-order we visit the parent before the children.
        info.result[node] = info.result[parent] + info.result.size()
            - 2*info.node_count[node];
    }
    // Now visit any children.
    auto [begin, end] = neighbours.equal_range(node);
    for (auto [from, to] : std::ranges::subrange(begin, end)) {
        if (to == parent) continue;
        pre_order(to, node, neighbours, info);
    }
}

std::vector<int> distances_in_tree(
        int n, const std::unordered_multimap<int,int>& neighbours) {
    TreeInfo info(n);
    // post-order pass to calculate subtree_sum and node_count
    post_order(0, -1, neighbours, info);
    // pre-order pass to calculate the result
    pre_order(0, -1, neighbours, info);
    return info.result;
}

Well-behaved paths in a tree


This is a reasonably tricky problem. If the problem were limited to binary trees, we could use post-order traversal, keep track of values not blocked by parent nodes with higher values, and then construct intersections from the left and right subtrees.
We can't apply this simple logic to an n-ary tree, as we would have to check every subtree against every other subtree at each node, quickly exploding the complexity.
Instead of considering the nodes, let’s think in terms of edges. Because we are working with a tree,
each edge divides the tree into two trees.
As a reminder, a valid path requires that both ends have the same values and all intermediate nodes
are, at most, equal to the ends of the path. This gives us a hint towards using an ordered approach.
Let’s start with an empty tree with no edge and slowly add the edges in non-descending order based
on the values of the nodes on their ends, i.e. std::max(value[node_left], value[node_right]).
This gives us some interesting properties:

• The maximum value in either of the trees we are connecting by adding this edge is at most std::max(value[node_left], value[node_right]), and that has to be the maximum value in at least one of the trees (because both nodes already exist in those trees).
• If the maximum value in one of the subtrees is lower than std::max(value[node_left], value[node_right]), no valid paths cross this edge (since the maximum node creates a barrier).
• If the maximum value in both of the subtrees is std::max(value[node_left], value[node_right]), then this edge adds freq_of_max[left]*freq_of_max[right] valid paths: one from each node with the maximum value in the left subtree to each node with the maximum value in the right subtree.

While this looks like a complete solution, we have a big problem. How do we efficiently keep track
of the frequencies of the maximum values? Not only that, a new edge might be connecting to any
node in a connected subtree, so we also need to be able to retrieve the frequency for a connected
subtree based on any node in this subtree.
Fortunately, the union-find algorithm offers a solution. Union-find keeps track of connected components by designating a representative node for each component. In our case, the components are subtrees, and we additionally want the representative node to be one of the nodes with the maximum value.
Figure 106. Solution for the well-behaved paths problem.

// Union-find: find the representative node of i's component.
int64_t find(std::vector<int64_t>& root, int64_t i) {
    if (root[i] == i) // This is the representative node for this subtree
        return i;
    // Otherwise find the representative node and cache it for future
    // lookups. This path compression is what keeps lookups cheap.
    return root[i] = find(root, root[i]);
}

int64_t well_behaved_paths(std::vector<int64_t> values,
        std::vector<std::pair<int64_t,int64_t>> edges) {
    // Start with all nodes disconnected; each node is the
    // representative node of its own subtree.
    std::vector<int64_t> root(values.size(), 0);
    std::ranges::iota(root, 0);

    // The frequency of the maximum value in each subtree.
    std::vector<int64_t> freq(values.size(), 1);

    // Start with the trivial single-node paths.
    int64_t result = values.size();

    std::ranges::sort(edges, [&](auto& l, auto& r) {
        return std::max(values[l.first], values[l.second]) <
               std::max(values[r.first], values[r.second]);
    });

    for (auto &edge : edges) {
        // Find the representative nodes for the two ends.
        // The representative nodes are always the maximum-value nodes.
        int64_t l_max = find(root, edge.first);
        int64_t r_max = find(root, edge.second);

        // The maximum in both subtrees is the same.
        if (values[l_max] == values[r_max]) {
            // Add a path from each maximum node in the left subtree
            // to each maximum node in the right subtree.
            result += freq[l_max]*freq[r_max];

            // Merge the right tree into the left.
            freq[l_max] += freq[r_max];
            root[r_max] = l_max;
        } else if (values[l_max] > values[r_max]) {
            // No new paths, but merge the right tree into the left.
            root[r_max] = l_max;
            // This doesn't change the frequency because
            // all nodes in the r_max subtree < values[l_max].
        } else { // (values[r_max] > values[l_max])
            // No new paths, but merge the left tree into the right.
            root[l_max] = r_max;
            // This doesn't change the frequency because
            // all nodes in the l_max subtree < values[r_max].
        }
    }
    return result;
}

Number of reorders of a serialized BST


First, let’s remind ourselves how Binary Search trees operate. For each node, all the nodes in the
left subtree are lower than the value of this node, and all nodes in the right subtree are higher than
the value of this node.
This allows us to find any node with a specific value in log(n).
Determining the number of reorderings that produce the same BST is much less obvious. However,
the first fairly straightforward observation is that the first node will always be the root node, and it
creates a partition over the other nodes (for the left and right subtrees).
This points to the first factor contributing to the total count of reorderings: any reordering of elements that produces the same stable partition will lead to the same BST.

Let’s consider the permutation {3,1,2,4,5}. Changing the order of elements within each partition (i.e.
{1,2}, {4,5}) would produce different partitions; however, we can freely interleave these partitions
without changing the result. More formally, we are looking for the number of ways to pick the
positions for the left (or right) partition out of all positions, i.e. C(n-1,k) (binomial coefficient),
where n is the total number of elements in the permutation and k is the number of elements in the
left partition.
The second point we have not considered is the number of reorderings within the two sub-trees, which we can calculate recursively.
This leads to a total formula: reorderings(left)*reorderings(right)*coeff(n-1,left.size()).
This implies that we will have to pre-calculate the binomial coefficients, which we can do using
Pascal’s triangle.
Finally, we need to apply the requested modulo operation where appropriate.
Figure 107. Solution for the number of BST reorders problem.

constexpr inline int32_t mod = 1e9+7;

int32_t count(std::span<int> nums,
        std::vector<std::vector<int>>& coef) {
    if (nums.size() < 3) return 1;

    // Partition into the left and right child
    auto rng = std::ranges::stable_partition(nums,
        [pivot = nums[0]](int v) { return v < pivot; });
    auto left = std::span(nums.begin(), rng.begin());
    // Skip the pivot, since the pivot is the parent node
    auto right = std::span(rng.begin()+1, nums.end());

    // Calculate the number of reorders for both sub-trees
    int64_t left_cnt = count(left, coef) % mod;
    int64_t right_cnt = count(right, coef) % mod;
    // Side note: we need 64 bits here because we need
    // to fit 32bit*32bit in the calculation below.

    // The result is:
    // left * right * number of ways to pick positions
    // for left.size() elements in nums.size()-1 positions.
    return ((left_cnt*right_cnt) % mod)
        *coef[nums.size()-1][left.size()] % mod;
}

int32_t number_of_reorders(std::span<int> nums) {
    // Precalculate the binomial coefficients up to nums.size()-1
    std::vector<std::vector<int>> binomial_coefficients;
    binomial_coefficients.resize(nums.size());
    for (int64_t i = 0; i < std::ssize(nums); ++i) {
        binomial_coefficients[i].resize(i+1, 1);
        for (int64_t j = 1; j < i; ++j)
            // Pascal's triangle
            binomial_coefficients[i][j] =
                (binomial_coefficients[i-1][j-1] +
                 binomial_coefficients[i-1][j]) % mod;
    }

    return (count(nums, binomial_coefficients)-1) % mod;
}
Index

always-sorted data structure, 16
divide and conquer, 16
lists
    custom flat, 11
    custom simple, 10
    merge, 13
    reverse, 16
std::forward_list, 6
std::list, 6
two pointers
    sliding window, 17
    slow and fast, 18
