Competitive Programming
Foreword vi
Preface vii
Convention ix
Abbreviations x
List of Tables xi
1 Introduction 1
1.1 Competitive Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Tips to be Competitive . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2.1 Tip 1: Quickly Identify Problem Types . . . . . . . . . . . . . . . . . . . . . 4
1.2.2 Tip 2: Do Algorithm Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.2.3 Tip 3: Master Programming Languages . . . . . . . . . . . . . . . . . . . . . 7
1.2.4 Tip 4: Master the Art of Testing Code . . . . . . . . . . . . . . . . . . . . . . 9
1.2.5 Tip 5: Practice and More Practice . . . . . . . . . . . . . . . . . . . . . . . . 10
1.3 Getting Started: Ad Hoc Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.4 Chapter Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
3.3 Greedy ⊖ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.3.1 Classical Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.3.2 Non Classical Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.3.3 Remarks About Greedy Algorithm in Programming Contests . . . . . . . . . 37
3.4 Dynamic Programming ⊖ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.4.1 DP Illustration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.4.2 Several Classical DP Examples . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.4.3 Non Classical Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
3.4.4 Remarks About Dynamic Programming in Programming Contests . . . . . . 54
3.5 Chapter Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
4 Graph 58
4.1 Overview and Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
4.2 Depth First Search . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
4.3 Breadth First Search . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
4.4 Kruskal’s . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
4.5 Dijkstra’s . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
4.6 Bellman Ford’s . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
4.7 Floyd Warshall’s . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
4.8 Edmonds Karp’s (excluded in IOI syllabus) . . . . . . . . . . . . . . . . . . . . . . . 81
4.9 Special Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
4.9.1 Tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
4.9.2 Directed Acyclic Graph . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
4.9.3 Bipartite Graph (excluded in IOI syllabus) . . . . . . . . . . . . . . . . . . . 89
4.10 Chapter Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
5 Mathematics 93
5.1 Overview and Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
5.2 Ad Hoc Mathematics Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
5.3 Number Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
5.3.1 Prime Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
5.3.2 Greatest Common Divisor (GCD) & Least Common Multiple (LCM) . . . . 98
5.3.3 Euler’s Totient (Phi) Function . . . . . . . . . . . . . . . . . . . . . . . . . . 98
5.3.4 Extended Euclid: Solving Linear Diophantine Equation . . . . . . . . . . . . 99
5.3.5 Modulo Arithmetic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
5.3.6 Fibonacci Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
5.3.7 Factorial . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
5.4 Java BigInteger Class . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
5.4.1 Basic Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
5.4.2 Bonus Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
5.5 Miscellaneous Mathematics Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
5.5.1 Combinatorics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
5.5.2 Cycle-Finding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
5.5.3 Existing (or Fictional) Sequences and Number Systems . . . . . . . . . . . . 107
5.5.4 Probability Theory (excluded in IOI syllabus) . . . . . . . . . . . . . . . . . . 108
5.5.5 Linear Algebra (excluded in IOI syllabus) . . . . . . . . . . . . . . . . . . . . 108
5.6 Chapter Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
Bibliography 135
Acknowledgements
Steven wants to thank:
• God, Jesus Christ, Holy Spirit, for giving me the talent and passion for competitive programming.
• My lovely wife, Grace Suryani, for allowing me to spend our precious time on this project.
• My younger brother and co-author, Felix Halim, for sharing many data structures, algorithms,
and programming tricks that have improved the writing of this book.
• My father Lin Tjie Fong and mother Tan Hoey Lan for raising us and encouraging us to do
well in our study and work.
• Fellow Teaching Assistants of CS3233 and ACM ICPC Trainers @ NUS: Su Zhan, Ngo Minh
Duc, Melvin Zhang Zhiyong, Bramandia Ramadhana.
• My CS3233 students in Sem2 AY2008/2009 who inspired me to come up with the lecture
notes, and my CS3233 students in Sem2 AY2009/2010 who helped me verify the content of this
book plus the Live Archive contribution.
Copyright
This book was written mostly during National University of Singapore (NUS) office hours, as part
of the ‘lecture notes’ for a module titled CS3233 - Competitive Programming. Hundreds of hours
have been devoted to writing this book.
Therefore, no part of this book may be reproduced or transmitted in any form or by any means,
electronic or mechanical, including photocopying, scanning, or uploading to any information
storage and retrieval system.
Foreword
A long time ago (on Tuesday, November 11th, 2003 at 3:55:57 UTC), I received an e-mail
with the following sentence: I should say in a simple word that with the UVa Site, you have given
birth to a new CIVILIZATION and with the books you write (he meant “Programming Challenges:
The Programming Contest Training Manual” [23], coauthored with Steven Skiena), you inspire the
soldiers to carry on marching. May you live long to serve the humanity by producing super-human
programmers.
Although it is clear that this was an exaggeration, to tell the truth I started thinking a bit about it,
and I had a dream: to create a community around the project I had started as a part of my teaching
job at UVa, with people from all around the world working together towards that ideal. Just
by searching the Internet, I immediately found a lot of people who were already creating a web-ring
of sites with excellent tools to cover the many gaps of the UVa site.
The most impressive to me was the ’Methods to Solve’ page from Steven Halim, a very young
student from Indonesia, and I started to believe that the dream would become real one day, because
the contents of the site were the result of the hard work of a genius of algorithms and informatics.
Moreover, his declared objectives matched the main part of my dream: to serve humanity. And
best of all, he has a brother with similar interests and capabilities, Felix Halim.
It is a pity it took so much time to start a real collaboration, but life is as it is. Fortunately,
all of us have continued working in parallel, and the book that you have in your hands is the
best proof.
I cannot imagine a better complement for the UVa Online Judge site, as the book uses lots of examples
from there, carefully selected and categorized both by problem type and by solving technique – an
incredibly useful help for the users of the site. By mastering and practicing most of the programming
exercises in this book, a reader can easily reach 500 problems solved on the UVa online judge, which will
place them in the top 400-500 among ≈ 100,000 UVa OJ users.
It is clear, then, that the book “Competitive Programming: Increasing the Lower Bound of
Programming Contests” is suitable for programmers who want to improve their ranks in upcoming
ICPC regionals and IOIs. The two authors have gone through these contests (ICPC and IOI)
themselves as contestants and now as coaches. But it is also an essential companion for newcomers,
because, as Steven and Felix say in the introduction, ‘the book is not meant to be read once, but
several times’.
Moreover, it contains practical C++ source code to implement the given algorithms. Because
understanding the problems is one thing, knowing the algorithms is another, and implementing them
well in short and efficient code is tricky. After you read this extraordinary book three times you
will realize that you are a much better programmer and, more importantly, a happier person.
Miguel A. Revilla
UVa Online Judge site creator
ACM-ICPC International Steering Committee Member and Problem Archivist
University of Valladolid
http://uva.onlinejudge.org
http://acmicpc-live-archive.uva.es
Preface
This is a book that every competitive programmer must read – and master, at least during the
middle phase of their programming career: when they want to leap forward from ‘just knowing
some programming language commands’ and ‘some algorithms’ to becoming a top programmer.
Typical readers of this book will be: 1) thousands of university students competing in the annual
ACM International Collegiate Programming Contest (ICPC) [27] regional contests, 2) hundreds of
secondary or high school students competing in the annual International Olympiad in Informatics
(IOI) [12], 3) their coaches, who are looking for comprehensive training materials [9], and 4)
basically anyone who loves problem solving using computers.
Beware that this book is not for novice programmers. When we wrote the book, we set it
for readers with knowledge of basic programming methodology, familiarity with at least one
programming language (C/C++/Java), and a pass in a basic data structures and algorithms course (or
equivalent) typically taught in year one of a Computer Science university curriculum.
Due to the diversity of its content, this book is not meant to be read once, but several times.
There are many exercises and programming problems scattered throughout the body text of this
book; they can be skipped at first if their solutions are not known at that point in time, and
revisited later after the reader has accumulated the new knowledge needed to solve them. Solving these
exercises helps strengthen the concepts taught in this book, as they usually contain interesting
twists or variants of the topic being discussed, so make sure to attempt them.
Use uva.onlinejudge.org/index.php?option=com_onlinejudge&Itemid=8&category=118,
felix-halim.net/uva/hunting.php, www.uvatoolkit.com/problemssolve.php, and
www.comp.nus.edu.sg/~stevenha/programming/acmoj.html to help you deal with the UVa [17]
problems listed in this book.
We know that one probably cannot win an ACM ICPC regional or get a gold medal in IOI just
by mastering the current version of this book. While we have included a lot of material in this
book, we are well aware that much more than what this book can offer is required to achieve
those feats. Some pointers are listed throughout this book for those who are hungry for more.
We believe this book is and will be relevant to many university and high school students, as
ICPC and IOI will be around for many years ahead. New students will require the ‘basic’ knowledge
presented in this book before hunting for more challenges after mastering it. But before
you assume anything, please check this book’s table of contents to see what we mean by ‘basic’.
We will be happy if, in the year 2010 and beyond, the level of competition in ICPC and IOI increases
because many of the contestants have mastered the content of this book. We hope to see many
ICPC and IOI coaches around the world, especially in South East Asia, adopt this book, knowing
that without mastering the topics in and beyond this book, their students have no chance of doing
well in future ICPCs and IOIs. If such an increase in ‘required lower-bound knowledge’ happens, this
book will have fulfilled its objective of advancing the level of human knowledge in this era.
Authors’ Profiles
Felix Halim is currently a PhD student at the same university: SoC, NUS. In terms of programming
contests, Felix has a more colorful reputation than his older brother. He was an IOI 2002 contestant.
His teams (at that time, from Bina Nusantara University) took part in the ACM ICPC Manila Regionals
of 2003, 2004, and 2005 and obtained ranks 10th, 6th, and 10th, respectively. Then, in his final year, his
team finally won the ACM ICPC Kaohsiung Regional 2006 and thus became an ACM ICPC World Finalist
@ Tokyo 2007 (Honorable Mention). Today, Felix Halim actively joins TopCoder Single Round
Matches; his highest rating is that of a yellow coder.
Convention
There is a lot of C++ code shown in this book. Much of it uses typedefs, shortcuts, or
macros that are commonly used by competitive programmers to speed up coding time. In this
short section, we list several examples.
#define _CRT_SECURE_NO_DEPRECATE // suppress some compilation warning messages (for VC++ users)
// To simplify repetitions/loops, Note: define your loop style and stick with it!
#define REP(i, a, b) \
for (int i = int(a); i <= int(b); i++) // a to b, and variable i is local!
#define TRvi(c, it) \
for (vi::iterator it = (c).begin(); it != (c).end(); it++)
#define TRvii(c, it) \
for (vii::iterator it = (c).begin(); it != (c).end(); it++)
#define TRmsi(c, it) \
for (msi::iterator it = (c).begin(); it != (c).end(); it++)
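The TRvi/TRvii/TRmsi macros above iterate over containers whose type shortcuts are defined in the
same style elsewhere in the book. A minimal sketch of the shortcuts these particular macros assume:

typedef vector<int> vi;          // 'vi' = vector of integers
typedef pair<int, int> ii;       // 'ii' = pair of integers
typedef vector<ii> vii;          // 'vii' = vector of integer pairs
typedef map<string, int> msi;    // 'msi' = map from string to integer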
CC : Coin Change
CCW : Counter ClockWise
CS : Computer Science
ED : Edit Distance
OJ : Online Judge
PE : Presentation Error
RB : Red-Black (BST)
RMQ : Range Minimum Query
RSQ : Range Sum Query
RTE : Run Time Error
List of Tables
6.1 Some String Processing Problems in Recent ACM ICPC Asia Regional . . . . . . . . 110
7.1 Some (Computational) Geometry Problems in Recent ACM ICPC Asia Regional . . 120
List of Figures
6.1 String Alignment Example for A = ‘ACAATCC’ and B = ‘AGCATGC’ (score = 7) 113
6.2 Suffix Trie (Left) and Suffix Tree (Right) of S = ’acacag$’ (Figure from [24]) . . . . 114
6.3 Generalized Suffix Tree of S1 = ’acgat#’ and S2 = ’cgt$’ (Figure from [24]) . . . . . 116
6.4 Suffix Array of S = ’acacag$’ (Figure from [24]) . . . . . . . . . . . . . . . . . . . . . 116
6.5 Suffix Tree versus Suffix Array of S = ’acacag$’ (Figure from [24]) . . . . . . . . . . 116
Chapter 1
Introduction
In this chapter, we introduce readers to the world of competitive programming. We hope you enjoy the
ride and continue reading and learning until the very last page of this book, enthusiastically.
Illustration on solving UVa Online Judge [17] Problem Number 10911 (Forming Quiz Teams).
Abridged problem description: Let (x, y) be the coordinates of a student’s house on a 2-D plane.
There are 2N students and we want to pair them into N groups. Let d_i be the distance
between the houses of the 2 students in group i. Form N groups such that the sum Σ_{i=1}^{N} d_i
is minimized. Constraints: N ≤ 8; 0 ≤ x, y ≤ 1000. Think first, try not to flip this page immediately!
Now, ask yourself: which one are you? Note that if you are unclear about the materials or
terminologies shown in this chapter, you can re-read it after going through this book once.
No kidding! Although this tip may not mean much – neither ICPC nor IOI is a typing-speed
competition – we have seen recent ICPCs where rank i and rank i + 1 were separated by just a few
minutes. When you can solve the same number of problems as your competitor, it comes down to
coding skill and ... typing speed.
Try the typing test at http://www.typingtest.com and follow the instructions there on how to
improve your typing skill. Steven’s speed is ∼85-95 wpm and Felix’s is ∼55-65 wpm. You also need to
familiarize your fingers with the positions of frequently used programming language characters, e.g.
braces {} or () or <>, semicolon ‘;’, single quotes for ‘char’ and double quotes for “string”, etc.
As a little practice, try typing the C++ code below (part of a solution for the UVa 10911 problem above) as fast as possible.
int N;
double dist[20][20], memo[1 << 16]; // 1 << 16 is 2^16, recall that max N = 8
int main() {
char line[1000], name[1000];
int i, j, caseNo = 1, x[20], y[20];
// freopen("10911.txt", "r", stdin); // one way to simplify testing
while (sscanf(gets(line), "%d", &N), N) {
for (i = 0; i < 2 * N; i++)
sscanf(gets(line), "%s %d %d", &name, &x[i], &y[i]);
The classification in Table 1.1 is adapted from [18] and is by no means complete. Some problems,
e.g. ‘sorting’, are not classified here as they are ‘trivial’ and usually only used as a ‘sub-routine’ in a
bigger problem. We do not include ‘recursion’ as it is embedded in other categories. We also omit
‘data structure related problems’; such problems will be categorized as ‘Ad Hoc’.
Of course there can be a mix and match of problem types: one problem can be classified into
more than one type, e.g. Floyd Warshall’s is either a solution for graph problem: All-Pairs Shortest
Paths (APSP, Section 4.7) or a Dynamic Programming (DP) algorithm (Section 3.4).
In the future, these classifications may grow or change. One significant example is DP. This
technique was not known before the 1940s and was not frequently used in ICPCs or IOIs before the
mid-1990s, but it is a must today. There were ≥ 3 DP problems (out of 11) in the recent ICPC World Finals 2010.
As an exercise, read the UVa [17] problems shown in Table 1.2 and determine their problem
types. The first one has been filled for you. Filling this table is easy after mastering this book.
The goal is not just to map problems into categories as in Table 1.1. After you are familiar with
most of the topics in this book, you can classify the problems into just four types as in Table 1.3.
To be competitive, you must frequently classify the problems that you read in the problem set into
type A (or at least type B).
• Prove the correctness of an algorithm (especially for Greedy algorithms, see Section 3.3).
• Analyze the time/space complexity of iterative and recursive algorithms.
• Perform amortized analysis (see [4], Chapter 17) – although rarely used in contests.
• Do output-sensitive analysis, i.e. analyze an algorithm whose running time depends on the output
size, e.g. the O(|Q| + occ) complexity for finding all occ occurrences of a query string Q with the
help of a Suffix Tree (see Section 6.4).
Many novice programmers skip this phase, tempted to directly code the first algorithm
they can think of (usually the naïve version), only to realize later that the chosen
data structure is not efficient or that their algorithm is not fast enough (or is wrong). Our advice:
refrain from coding until you are sure that your algorithm is both correct and fast enough.
To help you judge how fast is ‘enough’, we produce Table 1.4. Variants of Table 1.4
can be found in many algorithms books; however, we put another one here from a programming
contest perspective. Usually, the input size constraints are given in the problem description.
Using the logical assumption that a typical year 2010 CPU can do 1M operations in 1s and a time
limit of 3s (the typical time limit used in most UVa online judge [17] problems), we can predict the
‘worst’ algorithm that can still pass the time limit. Usually, the simplest algorithm has poor time
complexity, but if it can already pass the time limit, just use it!
From Table 1.4, we see the importance of knowing good algorithms with lower orders of growth,
as they allow us to solve problems with bigger input sizes. Beware that a faster algorithm is usually
non-trivial and harder to code. In Section 3.1.2 later, we will see a few tips that may allow us to
enlarge the possible input size n for the same class of algorithm.
Table 1.4: Rule of Thumb for the ‘Worst AC Algorithm’ for various input sizes n (single test case
only), assuming that a year 2010 CPU can compute 1M items in 1s with a Time Limit of 3s.
• A program with nested loops of depth k, each running about n iterations, has O(n^k) complexity.
• If your program is recursive with b recursive calls per level and has L levels, the program has
roughly O(b^L) complexity. But this is an upper bound: the actual complexity depends on
what actions are done per level and whether some pruning is possible.
• A Dynamic Programming algorithm which fills a 2-D matrix in O(k) per cell runs in O(k × n^2).
• The best time complexity of a comparison-based sorting algorithm is Ω(n log₂ n).
• Most of the time, O(n log₂ n) algorithms will be sufficient for most contest problems.
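As a quick worked example under the assumption behind Table 1.4 (1M operations per second, 3s
limit): for n = 1000, an O(n^2) algorithm performs about 10^6 operations – roughly one second,
safely within the limit – while an O(n^3) algorithm on the same input performs about 10^9
operations, a guaranteed TLE, so a better algorithm is mandatory.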
1. There are n webpages (1 ≤ n ≤ 10M). Each webpage i has a different page rank r_i. You want
to pick the top 10 pages with the highest page ranks. Which method is more feasible?
(a) Load all n webpages’ page ranks into memory, sort (Section 2.2.1), and pick the top 10.
(b) Use a priority queue data structure (heap) (Section 2.2.2).
2. Given a list L of up to 10K integers, you want to frequently ask the value of sum(i, j), i.e.
the sum of L[i] + L[i+1] + ... + L[j]. Which data structure should you use?
3. You have to compute the ‘shortest path’ between two vertices on a weighted Directed Acyclic
Graph (DAG) with |V |, |E| ≤ 100K. Which algorithm(s) can be used?
(a) Dynamic Programming + Topological Sort (Section 3.4, 4.2, & 4.9.2).
(b) Breadth First Search (Section 4.3).
(c) Dijkstra’s (Section 4.5).
(d) Bellman Ford’s (Section 4.6).
(e) Floyd Warshall’s (Section 4.7).
4. Which algorithm is faster (based on its time complexity) for producing a list of the first 10K
prime numbers? (Section 5.3.1)
import java.util.*;
import java.math.*;
Another illustration to reassure you that mastering a programming language is good: read this
input: there are N lines; each line always starts with the character ’0’ followed by ’.’, then an unknown
number of digits x, and finally each line is always terminated with three dots ”...”. See the example below.
2
0.1227...
0.517611738...
int N; char digits[1000]; // assumed declarations (the originals were cut from this listing)

int main() {
scanf("%d", &N);
while (N--) { // we simply loop from N, N-1, N-2, ... 0
scanf("0.%[0-9]...", &digits); // surprised?
printf("the digits are 0.%s\n", digits);
} }
Not many C/C++ programmers are aware of the trick above. Although scanf/printf are C-style
I/O routines, they can still be used in C++ code. Many C++ programmers ‘force’ themselves to
use cin/cout all the time, which, in our opinion, are not as flexible as scanf/printf – and slower.
In ICPCs, coding should not be your bottleneck at all. That is, once you figure out the ‘worst
AC algorithm’ that will pass the given time limit, you are supposed to be able to translate it into
bug-free code – and fast! Try the exercises below. If you need more than
10 lines of code to solve any of them, you will need to relearn your programming language(s) in depth!
Mastery of programming language routines will help you a lot in programming contests.
1. Given a string that represents a base X number, e.g. FF (base 16, hexadecimal), convert it
to base Y, e.g. 255 (base 10, decimal), 2 ≤ X, Y ≤ 36. (More details in Section 5.4.2; a C/C++ sketch appears right after this list.)
3. Given a date, determine the day of the week (Monday, Tuesday, ..., Sunday) on which it falls.
4. Given a long string, replace all occurrences of a character followed by two consecutive
digits with “***”, e.g. S = “a70 and z72 will be replaced, but aa24 and a872 will not” will
be transformed to S = “*** and *** will be replaced, but aa24 and a872 will not”.
Here are some guidelines for designing good test cases, based on our experience:
1. They must include the sample input, as you already have its answer... Use ‘fc’ in Windows or ‘diff’ in
UNIX to check your code’s output against the sample output.
2. They must include boundary cases. Increase the input size incrementally up to the maximum
possible. Sometimes your program works for small input sizes but behaves wrongly when the input
size increases. Check for overflow, out-of-bounds errors, etc.
3. For multiple-input test cases, use two identical test cases consecutively. Both must output the
same result. This checks whether you have forgotten to initialize some variables, which
is easily identified if the 1st instance produces correct output but the 2nd one does not.
4. Create tricky test cases by identifying cases that are ‘hidden’ in the problem description.
5. Do not assume the input will always be nicely formatted if the problem description does not
say so (especially for badly written programming problems). Try inserting whitespace (spaces,
tabs) into your input, and check whether your code still reads the values correctly.
6. Finally, generate large random test cases. See if your code terminates on time and still gives
reasonably sane output (correctness is hard to verify here – this test only verifies that your
code runs within the time limit).
However, after all these careful steps, you may still get non-AC responses. In ICPC, you and your
team can actually use the judge’s responses to determine your next action. With more experience
in such contests, you will be able to make better judgments. See the next exercises:
1. You receive a WA response for a very easy problem. What should you do?
2. You receive a TLE response for your O(N^3) solution. However, the maximum N is just 100.
What should you do?
Figure 1.1: University of Valladolid (UVa) Online Judge, a.k.a Spanish OJ [17]
UVa’s ‘sister’ online judge is the ACM ICPC Live Archive, which contains the ACM ICPC Regionals
and World Finals problem sets since year 2000. Train here if you want to do well in future ICPCs.
The USA Computing Olympiad has a very useful training website [18] for you to learn about
programming contests. This one is geared more towards IOI participants. Go straight to their website,
register an account, and train yourself.
TopCoder arranges frequent ‘Single Round Matches’ (SRMs) [26] that consist of a few problems
to be solved in 1-2 hours. Afterwards, you are given the chance to ‘challenge’ other
contestants’ code by supplying tricky test cases. This online judge uses a rating system (red, yellow,
blue, etc. coders) to reward contestants who are really good at problem solving with a higher rating,
as opposed to more diligent contestants who happen to solve more of the easier problems.
1. UVa 100 - The 3n + 1 problem (follow the problem description, note the term ‘between’ !)
2. UVa 272 - TEX Quotes (simply replace all double quotes with TeX-style quotes)
3. UVa 394 - Mapmaker (array manipulation)
4. UVa 483 - Word Scramble (read char by char from left to right)
5. UVa 573 - The Snail (be careful of boundary cases!)
6. UVa 661 - Blowing Fuses (simulation)
7. UVa 739 - Soundex Indexing (straightforward conversion problem)
8. UVa 837 - Light and Transparencies (sort the x-axis first)
9. UVa 941 - Permutations (find the n-th permutation of a string, simple formula exists)
10. UVa 10082 - WERTYU (keyboard simulation)
11. UVa 10141 - Request for Proposal (this problem can be solved with one linear scan)
12. UVa 10281 - Average Speed (distance = speed × time elapsed)
13. UVa 10363 - Tic Tac Toe (simulate the Tic Tac Toe game)
14. UVa 10420 - List of Conquests (simple frequency counting)
15. UVa 10528 - Major Scales (the music knowledge is given in the problem description)
16. UVa 10683 - The decadary watch (simple clock system conversion)
17. UVa 10703 - Free spots (array size is ‘small’, 500 x 500)
18. UVa 10812 - Beat the Spread (be careful with boundary cases!)
19. UVa 10921 - Find the Telephone (simple conversion problem)
20. UVa 11044 - Searching for Nessy (one liner code exists)
21. UVa 11150 - Cola (be careful with boundary cases!)
22. UVa 11223 - O: dah, dah, dah! (tedious morse code conversion problem)
23. UVa 11340 - Newspaper (use ‘Direct Addressing Table’ to map char to integer value)
24. UVa 11498 - Division of Nlogonia (straightforward problem)
25. UVa 11547 - Automatic Answer (one liner code exists)
26. UVa 11616 - Roman Numerals (roman numeral conversion problem)
27. UVa 11727 - Cost Cutting (sort the 3 numbers and get the median)
28. UVa 11800 - Determine the Shape (Ad Hoc geometry problem)
29. LA 2189 - Mobile Casanova (Dhaka06)
30. LA 3012 - All Integer Average (Dhaka04)
31. LA 3173 - Wordfish (Manila06) (STL next_permutation, prev_permutation)
32. LA 3996 - Digit Counting (Danang07)
33. LA 4202 - Schedule of a Married Man (Dhaka08)
34. LA 4786 - Barcodes (World Finals Harbin10)
Figure 1.5: Some Reference Books that Inspired the Authors to Write This Book
This and the subsequent chapters are supported by many textbooks (see Figure 1.5) and Internet
resources. Tip 1 is an adaptation of the introduction text in the USACO training gateway [18]. More
details about Tip 2 can be found in many CS books, e.g. Chapters 1-5 and 17 of [4]. References for
Tip 3 are http://www.cppreference.com and http://www.sgi.com/tech/stl/ for C++ STL, and
http://java.sun.com/javase/6/docs/api for the Java API. For more insight into better testing
(Tip 4), a little detour into software engineering books may be worthwhile. There are many other
online judges besides those mentioned in Tip 5, e.g.
SPOJ http://www.spoj.pl,
POJ http://acm.pku.edu.cn/JudgeOnline,
TOJ http://acm.tju.edu.cn/toj,
ZOJ http://acm.zju.edu.cn/onlinejudge/,
Ural/Timus OJ http://acm.timus.ru, etc.
Chapter 2
Data Structures and Libraries
2.2 Data Structures with Built-in Libraries △
There are two central operations commonly performed on arrays: sorting and searching.
There are many sorting algorithms mentioned in CS textbooks, which we classify as:
1. O(n^2) comparison-based sorting algorithms [4]: Bubble/Selection/Insertion Sort.
These algorithms are slow and usually avoided, but understanding them is important.
2. O(n log n) comparison-based sorting algorithms [4]: Merge/Heap/Random Quick Sort.
We can use C++ STL sort, partial_sort, and stable_sort in <algorithm> to achieve
this purpose (Java Collections.sort). We only need to specify the required comparison
function and these library routines will handle the rest.
3. Special purpose sorting algorithms [4]: O(n) Counting Sort, Radix Sort, Bucket Sort.
These special purpose algorithms are good to know, as they can speed up the sorting time
if the problem has special characteristics, like a small range of integers for Counting Sort,
but they rarely appear in programming contests.
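As a quick illustration of item 2, here is a sketch (our own toy example) of sorting a vector of
integer pairs by the second item, breaking ties by the first item:

#include <algorithm>
#include <cstdio>
#include <vector>
using namespace std;

typedef pair<int, int> ii;

bool cmp(ii a, ii b) { // our custom comparison function
  if (a.second != b.second) return a.second < b.second; // primary key: second item
  return a.first < b.first; // tie-breaker: first item
}

int main() {
  vector<ii> v; v.push_back(ii(1, 9)); v.push_back(ii(2, 5)); v.push_back(ii(0, 5));
  sort(v.begin(), v.end(), cmp); // O(n log n); prints (0,5) (2,5) (1,9)
  for (int k = 0; k < (int)v.size(); k++) printf("(%d,%d) ", v[k].first, v[k].second);
  return 0;
}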
Then, there are basically three ways to search for an item in an array, which we classify as:
1. O(n) Linear Search, from index 0 to index n − 1 (avoid this in programming contests).
2. O(log n) Binary Search: use lower_bound in C++ STL <algorithm> (or Java
Collections.binarySearch). If the input is unsorted, it is fruitful to sort it just once
using an O(n log n) sorting algorithm above in order to use Binary Search many times.
3. O(1) with Hashing (but we can live without hashing for most contest problems).
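A minimal sketch of item 2 on a sorted array:

#include <algorithm>
#include <cstdio>
#include <vector>
using namespace std;

int main() {
  int arr[] = {2, 4, 4, 8, 16};
  vector<int> v(arr, arr + 5); // already sorted, a pre-requisite of binary search
  // lower_bound returns an iterator to the first element >= 4 (index 1 here),
  // or v.end() if every element is smaller than the searched value
  printf("%d\n", (int)(lower_bound(v.begin(), v.end(), 4) - v.begin()));
  return 0;
}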
• Balanced Binary Search Tree (BST): C++ STL <map>/<set> (Java TreeMap/TreeSet)
A BST is a way to organize data in a tree structure. In each subtree rooted at x, this BST
property holds: items in the left subtree of x are smaller than x and items in the right
subtree of x are greater than (or equal to) x. Organizing the data like this (see Figure 2.1, left)
allows O(log n) insertion, search, and deletion, as only an O(log n) worst-case root-to-leaf scan is
needed to perform those actions (details in [4]) – but this only works if the BST is balanced.
Implementing a bug-free balanced BST like an AVL Tree or a Red-Black (RB) Tree is tedious
and hard to do under a time-constrained contest environment. Fortunately, C++ STL has
<map> and <set>, which are usually implemented as RB Trees, thus all operations
are in O(log n). Mastery of these two STL templates can save a lot of precious coding time
during contests! The difference between them is simple: <map> stores (key → data) pairs whereas <set>
only stores the key.
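A short sketch of both containers in action (a toy example of ours):

#include <cstdio>
#include <map>
#include <set>
#include <string>
using namespace std;

int main() {
  map<string, int> price; set<string> seen; // both are typically RB Trees
  price["apple"] = 3; price["durian"] = 10; // O(log n) insertion of (key -> data)
  seen.insert("apple"); // O(log n) insertion of a key only
  if (price.count("durian")) printf("%d\n", price["durian"]); // O(log n) search: 10
  printf("%d\n", (int)seen.count("banana")); // prints 0: this key is absent
  return 0;
}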
A Heap is useful for modeling a Priority Queue, where the item with the highest priority can be deleted in
O(log n) and a new item can be inserted into the priority queue, also in O(log n). The implementation
of a priority queue is available in C++ STL <queue>. The Priority Queue is an important
component in algorithms like Kruskal’s for the Minimum Spanning Tree (MST) problem (Section
4.4) and Dijkstra’s for the Single-Source Shortest Paths (SSSP) problem (Section 4.5).
This data structure is also used to perform partial_sort in C++ STL <algorithm>. This
is done by taking the max element k times (k is the number of top items to be
sorted). As each delete-max is O(log n), partial_sort has O(k log n) time complexity.
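A minimal sketch of the STL priority queue (a max heap by default):

#include <cstdio>
#include <queue>
using namespace std;

int main() {
  priority_queue<int> pq; // max heap; use priority_queue<int, vector<int>, greater<int> > for a min heap
  pq.push(7); pq.push(30); pq.push(4); // each insertion is O(log n)
  while (!pq.empty()) { // prints 30 7 4: items come out in decreasing priority
    printf("%d ", pq.top());
    pq.pop(); // each delete-max is O(log n)
  }
  return 0;
}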
Programming exercises to practice using basic data structures and algorithms (with libraries):
2.3 Data Structures with Our-Own Libraries △
2.3.1 Graph
Graph is a pervasive data structure which appears in many CS problems. A graph is simply a
collection of vertices and edges (that store connectivity information between those vertices). In
Chapters 3 & 4, we will explore many important graph problems and algorithms. In this subsection,
we briefly discuss four basic ways (there are others) to store graph information. Assuming
that we have a graph G with V vertices and E edges, here are the ways to store them:
A Adjacency Matrix, usually in the form of a 2-D array int AdjMat[V][V], where AdjMat[i][j]
stores the weight of edge (i, j), or 0 if the edge does not exist. This needs O(V^2) space, which is
only feasible for small graphs.
B Adjacency List, usually in the form of C++ STL vector<vii> AdjList, with vii defined as:
typedef pair<int, int> ii; typedef vector<ii> vii; // our data type shortcuts
In an Adjacency List, we have a vector of V vertices, and for each vertex v we store another
vector that contains (neighboring vertex, edge weight) pairs for every edge incident to v.
If the graph is unweighted, simply store weight = 0 or drop this second attribute.
With an Adjacency List, we can enumerate the list of neighbors of a vertex v efficiently. If v has
k neighbors, this enumeration is O(k). As this is one of the most common operations
in most graph algorithms, it is advisable to stick with Adjacency List as your default choice.
C Edge List, usually in the form of C++ STL priority_queue<pair<int, ii> > EdgeList.
In an Edge List, we store the list of edges, usually in some order. This structure is very useful
for Kruskal’s algorithm for MST (Section 4.4), where the collection of edges is sorted by
length from shortest to longest.
D Parent-Child Tree Structure, usually in the form of int parent & C++ STL vector<int> child.
If the graph is a tree (a connected graph with no cycle and E = V − 1), like a directory/folder
structure, then there exists another form of data structure: for each vertex, we only store two
attributes, the parent (NULL for the root vertex) and the list of children (NULL for leaves).
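As an illustration of option B, a sketch of reading a weighted graph into an Adjacency List
(the ‘u v w’ edge input format below is our assumption):

#include <cstdio>
#include <vector>
using namespace std;

typedef pair<int, int> ii; typedef vector<ii> vii;

int main() {
  int V, E, u, v, w;
  scanf("%d %d", &V, &E);
  vector<vii> AdjList(V);
  for (int e = 0; e < E; e++) {
    scanf("%d %d %d", &u, &v, &w);
    AdjList[u].push_back(ii(v, w)); // directed edge u -> v with weight w
    AdjList[v].push_back(ii(u, w)); // add this line too if the graph is undirected
  }
  for (int k = 0; k < (int)AdjList[0].size(); k++) // enumerate neighbors of vertex 0
    printf("0 -> %d (w = %d)\n", AdjList[0][k].first, AdjList[0][k].second);
  return 0;
}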
Exercise: Show the Adjacency Matrix, Adjacency List, and Edge List of the graph in Figure 4.1.
2.3.2 Union-Find Disjoint Sets
Determining which set an item belongs to is expensive if we model disjoint sets naïvely, e.g. as a
vector of sets! C++ STL <algorithm>’s set_union is also not efficient enough: although it combines
two sets in linear time, we still have to deal with the shuffling of the content inside the vector of sets!
Thus, we need our own library to support this data structure. One such example is shown in this section.
The key idea of this data structure is as follows: keep a representative (‘parent’) item of each
set. This information is stored in vector<int> pset, where pset[i] tells the representative item
of the set that contains item i. Example: suppose we have 5 items {A, B, C, D, E} as 5 disjoint
sets of 1 item each. Each item initially has itself as its representative, as shown in Figure 2.3.
When we want to merge two sets, we call unionSet(i, j), which makes both items ‘i’ and ‘j’
have the same representative item⁵ – directly or indirectly (see Path Compression below). This
is done by calling findSet(j) – which finds the representative of item ‘j’ – and assigning that value to
pset[findSet(i)] – i.e. updating the parent of the representative item of item ‘i’.
In Figure 2.4, we see what happens when we call unionSet(i, j): every union is simply done
by changing the representative item of one item to point to the other’s representative item.
⁵ There is another heuristic called ‘union-by-rank’ [4] that can further improve the performance of this data
structure, but we omit this enhancing heuristic from this book to simplify the discussion.
Figure 2.5: Calling findSet(i) to Determine the Representative Item (and Compressing the Path)
In Figure 2.5, we see what happens when we call findSet(i). This function recursively
calls itself whenever pset[i] is not yet ‘i’ itself. Then, once it finds the main representative item
(e.g. ‘x’) for that set, it compresses the path by setting pset[i] = x. Thus, subsequent calls of
findSet(i) will be O(1). This simple heuristic strategy is aptly named ‘Path Compression’.
In Figure 2.6, we illustrate another operation for this data structure, isSameSet(i, j),
which simply calls findSet(i) and findSet(j) and checks whether both refer to the same representative
item. If yes, ‘i’ and ‘j’ belong to the same set; otherwise, they do not.
Figure 2.6: Calling isSameSet(i, j) to Determine if Both Items Belong to the Same Set
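A minimal sketch of such a library with Path Compression, matching the operations described
above (our reconstruction; the book’s actual listing may differ in details):

#include <vector>
using namespace std;

vector<int> pset; // pset[i] = representative ('parent') of the set containing item i

void initSet(int N) { pset.resize(N); for (int i = 0; i < N; i++) pset[i] = i; }

int findSet(int i) { // recursive, with Path Compression on the way back
  return (pset[i] == i) ? i : (pset[i] = findSet(pset[i]));
}

void unionSet(int i, int j) { pset[findSet(i)] = findSet(j); }

bool isSameSet(int i, int j) { return findSet(i) == findSet(j); }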
Exercise 1: There are two more queries commonly performed on the Union-Find Disjoint Sets data
structure: int numberOfSets(), which returns the number of disjoint sets currently in the structure,
and int sizeOfSet(int i), which returns the size of the set that currently contains item i. Update
the code shown in this section to support these two queries efficiently!
Exercise 2: In [4], there is a ‘union by rank’ heuristic to speed up this data structure. Do you think
this heuristic will speed up the data structure significantly? If yes, in which case(s)? Is there
any programming trick that achieves a similar effect without using this heuristic?
2.3.3 Segment Tree
Consider this Range Minimum Query (RMQ) problem: given an array A of N numbers, answer
queries RMQ(i, j), i.e. report the index of the minimum element among A[i], A[i+1], ..., A[j].
For example, using the array A below, RMQ(1, 3) = 2 and RMQ(4, 6) = 5.
Values  = 8 | 7 | 3 | 9 | 5 | 1 | 10
          --------------------------
Indices = 0 | 1 | 2 | 3 | 4 | 5 | 6
There are several ways to solve this RMQ problem. One trivial algorithm is to simply iterate the
array from index i to j and report the index with the minimum value, but this is O(n) per query.
When n is large, such an algorithm may be infeasible.
In this section, we solve the RMQ with a Segment Tree: a binary tree similar to a heap, but usually
not a complete binary tree. For the array A above, the segment tree is shown in Figure 2.7. The
root of this tree contains the full segment [0, N - 1], and for each segment [l, r], we split it
into [l, (l + r) / 2] and [(l + r) / 2 + 1, r] until l = r. See the O(n log n) build_segment_tree
routine below. With the segment tree ready, answering an RMQ can be done in O(log n).
For example, we want to answer RMQ(1, 3). The execution in Figure 2.7 (red solid lines) is
as follows: from the root [0, 6], we know that the answer for RMQ(1, 3) lies in the left subtree of
vertex [0, 6], but [0, 6] is larger than the query range, so the stored min(imum) value of [0, 6] = 5 is not
appropriate: it is the min value over a larger segment [0, 6] than the queried RMQ(1, 3).
We move to the left segment [0, 3]. At vertex [0, 3], we have to search both sides, as [0, 3] is still
larger than RMQ(1, 3) and intersects both the left segment [0, 1] and the right segment [2, 3].
The right segment [2, 3] lies inside the required RMQ(1, 3), so from the stored min
value inside this node, we know that RMQ(2, 3) = 2. We do not need to traverse further down.
The left segment [0, 1] is not fully inside RMQ(1, 3), so another split is necessary.
From [0, 1], we move right to segment [1, 1], which is now inside RMQ(1, 3). Then, we return
the min value index = 1 to the caller.
Back in segment [0, 3], we now know that RMQ(1, 1) = 1 and RMQ(2, 3) = 2. Because
A[RMQ(1, 1)] > A[RMQ(2, 3)] (since A[1] = 7 and A[2] = 3), we conclude that RMQ(1, 3) = 2.
Let’s take a look at another example: RMQ(4, 6). The execution in Figure 2.7 (blue dashed line)
is as follows: We again start from the root [0, 6]. Since it is bigger than the query, we move right
to segment [4, 6]. Since this segment is exactly the RMQ(4, 6), we simply return the index of
minimum element that is stored in this node, which is 5. Thus RMQ(4, 6) = 5. We do not have to
traverse the unnecessary parts of the tree! In the worst case, we have two root-to-leaf paths which
is just O(log n). For example in RMQ(3, 4) = 4, we have one root-to-leaf path from [0, 6] to [3, 3]
and another root-to-leaf path from [0, 6] to [4, 4].
If the array A is static, then using a Segment Tree to solve RMQ is overkill, as there exists
a Dynamic Programming (DP) solution that requires O(n log n) one-time pre-processing and O(1)
per RMQ. This DP solution will be discussed later in Section 3.4.3.
The Segment Tree becomes useful when array A is frequently updated. For example, if A[5] is
changed from 1 to 100, then we only need to update the nodes along the leaf-to-root path, which can be
done in O(log n). The DP solution requires another O(n log n) pre-processing to do the same.
Figure 2.8: Updating Array A to {8, 7, 3, 9, 5, 100, 10}. Only leaf-to-root nodes are affected.
Our library implementation of the Segment Tree is shown below. The code supports static
Range Minimum/Maximum/Sum Queries (the dynamic update part is left as an exercise). There are
of course other ways to implement a segment tree, e.g. a more efficient version that only expands
segments when needed.
#include <cstdio>
#include <vector>
using namespace std;

// reconstructed declarations (the top of this listing was cut; the code below is consistent with them)
#define RANGE_MIN 0
#define RANGE_MAX 1
#define RANGE_SUM 2

vector<int> segment_tree; // 1-based heap-like structure; about 4 * N entries suffice

void init_segment_tree(int N) { segment_tree.assign(4 * N, 0); }

void build_segment_tree(int code, int A[], int node, int b, int e) {
  if (b == e) { // leaf: store the value itself (SUM) or the index (MIN/MAX)
    if (code == RANGE_SUM) segment_tree[node] = A[b];
    else segment_tree[node] = b;
  }
else { // recursively compute the values in the left and right subtrees
int leftIdx = 2 * node, rightIdx = 2 * node + 1;
build_segment_tree(code, A, leftIdx , b , (b + e) / 2);
build_segment_tree(code, A, rightIdx, (b + e) / 2 + 1, e );
int lContent = segment_tree[leftIdx], rContent = segment_tree[rightIdx];
if (code == RANGE_SUM) // make this segment contains sum of left and right subtree
segment_tree[node] = lContent + rContent;
else { // (code == RANGE_MIN/MAXIMUM)
int lValue = A[lContent], rValue = A[rContent];
if (code == RANGE_MIN) segment_tree[node] = (lValue <= rValue) ? lContent : rContent;
else segment_tree[node] = (lValue >= rValue) ? lContent : rContent;
} } }
int query(int code, int A[], int node, int b, int e, int i, int j) {
if (i > e || j < b) return -1; // if the current interval does not intersect query interval
if (b >= i && e <= j) return segment_tree[node]; // if the current interval is inside query interval
// compute the minimum position in the left and right part of the interval
int p1 = query(code, A, 2 * node , b , (b + e) / 2, i, j);
int p2 = query(code, A, 2 * node + 1, (b + e) / 2 + 1, e , i, j);
  // combine the two partial answers (-1 = that side lies outside the query interval)
  if (p1 == -1) return p2;
  if (p2 == -1) return p1;
  if (code == RANGE_SUM) return p1 + p2;
  if (code == RANGE_MIN) return (A[p1] <= A[p2]) ? p1 : p2;
  return (A[p1] >= A[p2]) ? p1 : p2; // code == RANGE_MAX
}
int main() {
int A[] = {8,7,3,9,5,1,10};
init_segment_tree(7); build_segment_tree(RANGE_MIN, A, 1, 0, 6);
printf("%d\n", query(RANGE_MIN, A, 1, 0, 6, 1, 3)); // answer is index 2
return 0;
}
Exercise 1: Draw the segment tree of array A = {10, 2, 47, 3, 7, 9, 1, 98, 21, 37} and answer
RMQ(1, 7) and RMQ(3, 8)!
Exercise 2: Using the same tree as in Exercise 1, answer the Range Sum Query RSQ(i, j),
i.e. the sum A[i] + A[i + 1] + ... + A[j]. What are RSQ(1, 7) and RSQ(3, 8)? Is this a good
approach to solve this problem? (See Section 3.4.)
Exercise 3: The Segment Tree code shown above lacks the update operation. Add an O(log n) update
function to update the value of a certain segment in the Segment Tree!
Programming exercises that use data structures with our own libraries:
Chapter 3
Problem Solving Paradigms
This chapter highlights four problem solving paradigms commonly used to attack problems in programming
contests, namely Complete Search, Divide & Conquer, Greedy, and Dynamic Programming. Mastery of
all these problem solving paradigms will help contestants attack each problem with the appropriate ‘tool’,
rather than ‘hammering’ every problem with a brute-force solution... which is clearly not competitive. Our
advice before you start reading: do not just remember the solutions to the problems presented in this chapter,
but remember the way, the spirit, of solving those problems!
3.1 Complete Search ⊖
In this section, we give two examples of this simple paradigm and provide a few tips to give a
Complete Search solution a better chance of passing the required Time Limit.
3.1.1 Examples
We show two examples of Complete Search: one implemented iteratively and one
implemented recursively (backtracking). We also mention a few optimization tricks that make some
‘impossible’ cases possible.
The main routine below relies on a recursive NQueens routine and a place predicate that were cut
from this listing. A sketch of these missing pieces – our reconstruction of the standard 8-queens
backtracking with row and diagonal checks, assuming 1-based rows/columns and the UVa 750 output format:

#include <cstdio>
#include <cstdlib>
#include <cstring>

int row[9], TC, a, b, lineCounter; // row[i] = row position of the queen in column i

bool place(int col, int tryrow) { // can a queen at (tryrow, col) coexist with columns 1..col-1?
  for (int prev = 1; prev < col; prev++)
    if (row[prev] == tryrow || abs(row[prev] - tryrow) == abs(prev - col))
      return false; // same row or same diagonal as a previous queen -> infeasible
  return true;
}

void NQueens(int col) { // try all feasible rows for the queen in this column
  for (int tryrow = 1; tryrow <= 8; tryrow++)
    if (place(col, tryrow)) {
      row[col] = tryrow;
      if (col == 8 && row[b] == a) { // a full candidate solution with a queen at (a, b)
        printf("%2d      %d", ++lineCounter, row[1]);
        for (int j = 2; j <= 8; j++) printf(" %d", row[j]);
        printf("\n");
      } else
        NQueens(col + 1);
    }
}
int main() {
scanf("%d", &TC);
while (TC--) {
scanf("%d %d", &a, &b);
memset(row, 0, sizeof row); lineCounter = 0;
printf("SOLN COLUMN\n");
printf(" # 1 2 3 4 5 6 7 8\n\n");
NQueens(1); // generate all possible 8! candidate solutions
if (TC) printf("\n");
}
return 0;
}
3.1.2 Tips
The biggest gamble in writing a Complete Search solution is whether it will be able to pass the
Time Limit. If it is 1 minute and your program currently runs in 1 minute 5 seconds, you may
want to tweak the ‘critical code’1 of your program first rather than painfully redo the problem with
a faster algorithm – which may not be trivial to design.
Here are some tips that you may want to consider when designing your solution, especially a
Complete Search solution, to give it a higher chance for passing the Time Limit.
Programs that generate lots of candidate solutions and then choose the ones that are correct
(or remove the incorrect ones) are called ‘filters’ – recall the naïve 8-queens solver with 8^8 time
complexity. Those that hone in exactly on the correct answer without any false starts are called
‘generators’ – recall the improved 8-queens solver with 8! complexity plus diagonal checks.
Generally, filters are easier to code but run slower. Do the math to see if a filter is good enough
or if you need to create a generator.
In generating solutions (see tip 1 above), we may encounter a partial solution that will never lead
to a full solution. We can prune the search there and explore other parts. For example, see the
diagonal check in the 8-queens solution above. Suppose we have placed a queen at row[1] = 2; then
placing another queen at row[2] = 1 or row[2] = 3 will cause a diagonal conflict, and placing
another queen at row[2] = 2 will cause a row conflict. Continuing from any of these branches will
never lead to a valid solution. Thus we can prune these branches right at this juncture and concentrate
on only the valid positions row[2] = {4, 5, 6, 7, 8}, saving overall runtime.
Some problems have symmetries, and we should try to exploit them to reduce execution time!
In the 8-queens problem, there are 92 solutions but only 12 unique (or fundamental)
solutions, as there are rotation and reflection symmetries in this problem [35]. You can utilize this
fact by only generating the 12 unique solutions and, if needed, generating the whole 92 by rotating
and reflecting these 12 unique solutions.
Sometimes it is helpful to generate tables or other data structures that enable the fastest possible
lookup of a result – prior to the execution of the program itself. This is called Pre-Computation,
in which one trades memory/space for time.
Again using the 8-queens problem above: if we know that there are only 92 solutions, then
we can create a 2-dimensional array int solution[92][8] and fill it with all 92 valid
permutations of the 8 queens’ row positions! That is, we create a generator program (which takes some
runtime) to fill this 2-D array, but afterwards we submit code that just prints out the correct
permutations with 1 queen at (a, b) (very fast).
¹ It is said that every program does most of its work in only about 10% of its code – the critical code.
Surprisingly, some contest problems seem far easier when they are solved backwards than when
they are attacked frontally. Be on the lookout for processing data in reverse order or
building an attack that looks at the data in some order other than the obvious.
This tip is best shown with an example: UVa 10360 - Rat Attack. Abridged problem description:
imagine a 2-D array (up to 1024 x 1024) containing rats. There are n ≤ 20000 rats at some
cells. Determine which cell (x, y) should be gas-bombed so that the number of rats killed in the square
box (x - d, y - d) to (x + d, y + d) is maximized. The value d is the power of the gas bomb (d is
up to 50); see Figure 3.2.
The first option is to attack this problem frontally: try bombing each of the 1024^2 cells and see which
one is the most effective. For each bombed cell (x, y), we need to do an O(d^2) scan to count the
number of rats killed within the square bombing radius. For the worst case, when the array has
size 1024^2 and d = 50, this takes 1024^2 × 50^2 = 2621M operations. Clearly TLE!
The second option is to attack this problem backwards: create an array int killed[1024][1024].
For each of the n rat populations at coordinate (x, y), add the number of rats at (x, y) to every
entry killed[i][j] for which a bomb placed at (i, j) would reach (x, y), i.e. (i, j) is within the
square bombing radius (|i − x| ≤ d and |j − y| ≤ d). This pre-processing takes O(n × d^2)
operations. Then, to determine the most optimal bombing position, we find the coordinate of the
highest entry in the killed array, which can be done with a 1024^2 scan. This backwards approach
only requires 20000 × 50^2 + 1024^2 ≈ 51M operations for the worst test case (n = 20000, d = 50),
approximately 51 times faster than the frontal attack!
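A sketch of this backwards approach (the input order below – bomb power d, then rat count n – is our
assumption; check the actual problem statement):

#include <cstdio>
#include <algorithm>
using namespace std;

int killed[1024][1024]; // killed[i][j] = rats killed if the bomb lands at (i, j)

int main() {
  int n, d, x, y, cnt;
  scanf("%d %d", &d, &n);
  for (int r = 0; r < n; r++) {
    scanf("%d %d %d", &x, &y, &cnt); // cnt rats live at cell (x, y)
    for (int i = max(0, x - d); i <= min(1023, x + d); i++) // all bomb positions
      for (int j = max(0, y - d); j <= min(1023, y + d); j++) // that reach (x, y)
        killed[i][j] += cnt; // a bomb at (i, j) kills these cnt rats
  }
  int best = -1, bx = 0, by = 0;
  for (int i = 0; i < 1024; i++) // find the most effective bombing position
    for (int j = 0; j < 1024; j++)
      if (killed[i][j] > best) { best = killed[i][j]; bx = i; by = j; }
  printf("%d %d %d\n", bx, by, best);
  return 0;
}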
There are many tricks that you can use to optimize your code. Understanding computer hardware,
especially I/O, memory, and cache behavior, can help you design better programs. Some examples:
1. Use the faster C-style scanf/printf for I/O-heavy problems rather than cin/cout, which, as noted
earlier in this book, are slower.
2. Use the expected O(n log n) but cache-friendly quicksort (built into C++ STL sort as part
of ‘introsort’) rather than the true O(n log n) but not cache-friendly mergesort.
3. Access a 2-D array in row-major fashion (row by row) rather than column by column.
4. Bitwise manipulation on integers is faster than using an array of bits (see Section 3.4.3).
5. Use STL <bitset> rather than vector<bool> for the Sieve of Eratosthenes (see Section 5.3.1).
6. Declare a bulky data structure just once, by giving it global scope, so you do not
have to pass the structure as a function argument.
7. Allocate memory just once, sized according to the largest possible input in the problem description,
rather than re-allocating it for every test case in a multiple-input problem.
Browse the Internet or reference books to find more information on how to speed up your code.
No kidding. Using better data structures and algorithms always outperforms any of the optimizations
mentioned in Tips 1-6 above. If all else fails, abandon the Complete Search approach.
• Iterative
1. UVa 154 - Recycling (try all combinations)
2. UVa 441 - Lotto (6 nested loops!)
3. UVa 639 - Don’t Get Rooked (generate 2^16 possible combinations, prune invalid ones)
4. UVa 725 - Division (elaborated in this section)
5. UVa 10360 - Rat Attack (this problem is also solvable using a 1024^2 DP range sum)
6. UVa 10662 - The Wedding (3 nested loops!)
7. UVa 11242 - Tour de France (iterative complete search + sorting)
8. UVa 11804 - Argentina (5 nested loops!)
• Recursive Backtracking
1. UVa 193 - Graph Coloring (Maximum Independent Set)
2. UVa 222 - Budget Travel (input not large)
3. UVa 524 - Prime Ring Problem (also see Section 5.3.1)
4. UVa 624 - CD (input size is small, use backtracking; also solve-able with DP)
5. UVa 628 - Passwords (backtracking)
6. UVa 729 - The Hamming Distance Problem (backtracking)
7. UVa 750 - 8 Queens Chess Problem (solution already shown in this section)
8. UVa 10285 - Longest Run on a Snowboard (backtracking, also solve-able with DP)
9. UVa 10496 - Collecting Beepers (small TSP instance)
10. LA 4793 - Robots on Ice (World Finals Harbin10, recommended problem for practice)
Problem I - ‘Robots on Ice’ in the recent ACM ICPC World Finals 2010 can be viewed as a ‘tough
test on pruning strategy’. The problem is simple: given an M x N board with 3 check-in points
{A, B, C}, find a Hamiltonian path of length (M x N) from coordinate (0, 0) to coordinate
(0, 1). This Hamiltonian path must hit check points {A, B, C} at one-fourth, one-half, and
three-fourths of the way through its tour, respectively. Constraints: 2 ≤ M, N ≤ 8.
A naïve recursive backtracking algorithm will get TLE. To speed it up, we must prune the search
space if: 1) it does not hit the appropriate target check point at 1/4, 1/2, or 3/4 distance; 2)
it hits a target check point earlier than the target time; 3) it will not be able to reach the next
check point on time from the current position; or 4) it will not be able to reach the final point (0, 1)
because the current path blocks the way. These four pruning strategies are sufficient to solve LA 4793.
3.2 Divide and Conquer ⊖
Divide and Conquer (abbreviated D&C) is a problem solving paradigm in which we try to make a
problem simpler by ‘dividing’ it into smaller parts and ‘conquering’ them. The steps:
1. Divide the original problem into sub-problems – usually by half or nearly half,
2. Find (sub-)solutions for each of these sub-problems – which are now easier,
3. If needed, combine the sub-solutions to produce a complete solution for the main problem.
We have seen this D&C paradigm in previous chapters of this book: various sorting algorithms like
Quick Sort, Merge Sort, and Heap Sort, as well as Binary Search in Section 2.2.1, utilize this paradigm.
The way data is organized in the Binary Search Tree, Heap, and Segment Tree of Sections 2.2.2 & 2.3.3
also has the spirit of Divide & Conquer.
Recall: the ordinary usage of Binary Search is searching for an item in a static sorted array. We
check the middle portion of the sorted array to see if it contains what we are looking for. If it does, or
if there are no more items to consider, we stop. Otherwise, we decide whether the answer is in the left
or the right portion of the sorted array. As the size of the search space is halved (binary) after each
query, the complexity of this algorithm is O(log n). In Section 2.2.1, we saw that this algorithm has
library routines, e.g. C++ STL <algorithm>’s lower_bound and Java’s Collections.binarySearch.
This is not the only way to use and apply binary search. The pre-requisite for running the binary search
algorithm – a static sorted array (or vector) – can also be found in other, uncommon data structures,
as in the root-to-leaf paths of the structured tree below.
Binary Search on Uncommon Data Structure (Thailand ICPC National Contest 2009)
Binary Search on Uncommon Data Structure (Thailand ICPC National Contest 2009)
Problem in short: given a weighted (family) tree of N vertices, N ≤ 80K, with a special trait –
vertex values increase from root to leaves – find the ancestor vertex closest to the root from a
starting vertex v that has weight at least P. There are up to Q ≤ 20K such queries.
The naïve solution is to do a linear O(N) scan per query: start from the given vertex v, then move
up the family tree until we hit the first ancestor with value < P. Overall, as there are Q queries,
this approach runs in O(QN) and will get TLE as N ≤ 80K and Q ≤ 20K.
A better solution is to store all the 20K queries first. Then traverse the family tree just once
from root using O(N ) Depth First Search (DFS) algorithm (Section 4.2). Search for some non-
existent value so that DFS explores the entire tree, building a partial root-to-leaf sorted array as it
goes – this is because the vertices in the root-to-leaf path have increasing weights. Then, for each
vertex asked about in a query, perform an O(log N) binary search, i.e. lower_bound, along the
current path from the root to that vertex to get the ancestor closest to the root with weight at
least P. Finally, do O(Q) post-processing to output the results. The overall time complexity of
this approach is O(Q log N), which is now manageable.
Bisection Method
What we have seen so far is the use of binary search for finding items in a static sorted array.
However, the binary search principle can also be used to find the root of a function that may be
difficult to compute analytically.
Sample problem: you want to buy a car with a loan and want to pay it off in monthly installments
of d dollars for m months. Suppose the value of the car is originally v dollars and the bank charges
i% interest on the remaining loan at the end of each month. What is the amount d that you must
pay per month (rounded to 2 digits after the decimal point)?
Suppose d = 576, m = 2, v = 1000, and i = 10%. After one month, your loan becomes 1000 ×
(1.1) - 576 = 524. After two months, your loan becomes 524 × (1.1) - 576 ≈ 0.
But if we are only given m = 2, v = 1000, and i = 10%, how do we determine that d ≈ 576? In
other words, find the root d such that the loan payment function f(d, 2, 1000, 10) ≈ 0. The easy
way is to run the bisection method². We pick a reasonable starting range [a . . . b] for d: a = 1, as
we have to pay something (at least d = 1 dollar), and b = (1 + i%) × v = 1.1 × 1000 = 1100, as
the earliest we can complete the payment is m = 1, by paying the whole loan plus one month of
interest. Then, we apply the bisection method to obtain d:
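A minimal runnable sketch of this bisection (our code, using the 'loop 100 times' trick from footnote 4):

#include <cstdio>

const int m = 2; const double v = 1000, i = 0.1; // sample: 2 months, $1000, 10%

double f(double d) {             // remaining loan after m months of paying d
  double loan = v;
  for (int month = 0; month < m; month++)
    loan = loan * (1 + i) - d;   // interest accrues, then one installment paid
  return loan;
}

int main() {
  double lo = 1, hi = (1 + i) * v;         // f(lo) > 0, f(hi) < 0: opposite signs
  for (int iter = 0; iter < 100; iter++) { // guaranteed termination
    double mid = (lo + hi) / 2;
    if (f(mid) > 0) lo = mid;              // still owe money: d must be larger
    else hi = mid;
  }
  printf("%.2lf\n", lo);                   // prints 576.19 for this example
}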
For the bisection method to work³, we must ensure that the function values at the two extreme
points of the initial real range [a . . . b], i.e. f(a) and f(b), have opposite signs (true in the problem
above). The bisection method in this example takes only log₂(1099/ϵ) tries. Using a small ϵ = 1e-9,
this is just ≈ 40 tries. Even with a much smaller ϵ = 1e-15, we still need only ≈ 60 tries⁴. The
bisection method is far more efficient than linearly trying each of the ≈ 1100/ϵ possible values of d.
² We use the term 'binary search principle' for the divide and conquer technique of halving the range of possible answers. The 'binary search algorithm' (finding the index of a certain item in a sorted array) and the 'bisection method' (finding the root of a function) are instances of this principle.
³ Note that the requirement of the bisection method (which uses the binary search principle) is slightly different from that of the more well-known binary search algorithm, which needs a sorted array.
⁴ Thus some competitive programmers choose to 'loop 100 times', which guarantees termination, instead of testing whether the error is less than ϵ, as floating point errors may otherwise lead to an endless loop.
Binary Search ‘the Answer’ is another problem solving strategy that can be quite powerful. This
strategy is shown using UVa 714 - Copying Books below.
In this problem, you are given m books numbered 1, 2, . . . , m that may have different numbers
of pages (p1, p2, . . . , pm). You want to make one copy of each of them. Your task is to divide these
books among k scribes, k ≤ m. Each book can be assigned to a single scribe only, and every
scribe must get a contiguous sequence of books. That means there exists an increasing succession
of numbers 0 = b0 < b1 < b2 < . . . < bk−1 ≤ bk = m such that the i-th scribe gets the books
numbered between bi−1 + 1 and bi. The time needed to copy all the books is determined by the
scribe who is assigned the most work. The task is to minimize the maximum number of pages
assigned to a single scribe.
There exists a Dynamic Programming solution for this problem, but it can already be solved
by guessing the answer in binary search fashion! Suppose m = 9, k = 3, and p1, p2, . . . , p9
are 100, 200, 300, 400, 500, 600, 700, 800, and 900, respectively.
If we guess ans = 1000, the problem becomes 'simpler': if the scribe with the most work can
only copy up to 1000 pages, can the books be divided among k scribes? The answer is 'no'. We can
greedily assign the jobs as: {100, 200, 300, 400} for scribe 1, {500} for scribe 2, {600} for scribe 3,
but then 3 books {700, 800, 900} remain unassigned. The answer must be at least 1000.
If we guess ans = 2000, we can greedily assign the jobs as: {100, 200, 300, 400, 500} for scribe
1, {600, 700} for scribe 2, and {800, 900} for scribe 3. We still have some slack, i.e. scribes 1, 2,
and 3 still have {500, 700, 300} of unused capacity. The answer must be at most 2000.
This ans is binary-searchable between lo = 1 (1 page) and hi = p1 + p2 + . . . + pm (all pages).
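A hedged sketch of this 'binary search the answer' approach (our code, assuming the page counts are already read into p):

#include <cstdio>
#include <vector>
#include <numeric>
using namespace std;

int m, k;
vector<long long> p;                      // pages of each book

bool valid(long long ans) {               // can k scribes copy everything if
  int scribes = 1; long long load = 0;    // nobody copies more than ans pages?
  for (int i = 0; i < m; i++) {
    if (p[i] > ans) return false;         // one book alone already exceeds ans
    if (load + p[i] > ans) { scribes++; load = 0; } // greedily start a new scribe
    load += p[i];
  }
  return scribes <= k;
}

long long solve() {                       // smallest ans for which valid() holds
  long long lo = 1, hi = accumulate(p.begin(), p.end(), 0LL);
  while (lo < hi) {
    long long mid = (lo + hi) / 2;
    if (valid(mid)) hi = mid; else lo = mid + 1;
  }
  return lo;
}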
3.3 Greedy
An algorithm is said to be greedy if it makes the locally optimal choice at each step in the hope of
finding the optimal solution. For some cases, greedy works: the solution code becomes short and
runs efficiently. For many others, it does not. As discussed in [4], a problem must exhibit these
two ingredients in order for a greedy algorithm to work:
1. It has optimal sub-structures: an optimal solution to the problem contains optimal solutions to its sub-problems.
2. It has the greedy property: a globally optimal solution can be reached by making the locally optimal choice, without having to reconsider previous choices.
Classical example: given a target amount V cents and coin denominations {25, 10, 5, 1} cents, we can make V with the minimum number of coins by repeatedly taking the largest coin that does not exceed the remaining amount, e.g. V = 42 → 25 + 10 + 5 + 1 + 1 (5 coins). This coin changing example has the two ingredients for a successful greedy algorithm: optimal sub-structures (the optimal solution for V contains the optimal solution for the remaining amount) and the greedy property (for this denomination set, taking the largest feasible coin is always safe).
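A minimal sketch of this greedy coin changing for the suitable denomination set:

#include <cstdio>

int main() {
  int coin[4] = {25, 10, 5, 1}, V = 42, used = 0;
  for (int i = 0; i < 4; i++)
    while (V >= coin[i]) { V -= coin[i]; used++; } // largest feasible coin first
  printf("%d coins\n", used);                      // 5: 25 + 10 + 5 + 1 + 1
}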
However, this greedy algorithm does not work for all sets of coin denominations, e.g. {1, 3, 4}
cents. To make 6 cents with that set, a greedy algorithm would choose 3 coins {4, 1, 1} instead of
the optimal solution using 2 coins {3, 3}. This problem is revisited later in Section 3.4.2.
There are many other classical examples of greedy algorithms in algorithm textbooks, for ex-
ample: Kruskal’s for Minimum Spanning Tree (MST) problem – Section 4.4, Dijkstra’s for Single-
Source Shortest Paths (SSSP) problem – Section 4.5, Greedy Activity Selection Problem [4], Huff-
man Codes [4], etc.
Abridged problem statement of UVa 410 - Station Balance: given C chambers, each of which can
store at most 2 specimens, and S (S ≤ 2C) specimens with masses M1 . . . MS, place the specimens
into the chambers so that IMBALANCE – the sum over all chambers of the difference between that
chamber's total mass and the average chamber mass – is minimized.
This problem can be solved using a greedy algorithm, but first we have to make several observations.
If there exists an empty chamber, at least one specimen from a chamber holding 2 specimens must
be moved into that empty chamber; otherwise the empty chambers contribute too much to
IMBALANCE! See Figure 3.4.
Next observation: if S > C, then S − C specimens must be paired with a specimen already placed
in some chamber – the Pigeonhole principle! See Figure 3.5.
Now, the key insight that simplifies the problem: if S < 2C, add 2C − S dummy specimens with
mass 0. For example, C = 3, S = 4, M = {5, 1, 2, 7} → C = 3, S = 6, M = {5, 1, 2, 7, 0, 0}.
Then, sort these specimens by mass so that M1 ≤ M2 ≤ . . . ≤ M2C−1 ≤ M2C. In this example,
M = {5, 1, 2, 7, 0, 0} → {0, 0, 1, 2, 5, 7}.
By adding dummy specimens and then sorting them, a greedy strategy ‘appears’. We can now:
Pair the specimens with masses M1 &M2C and put them in chamber 1, then
Pair the specimens with masses M2 &M2C−1 and put them in chamber 2, and so on . . .
This greedy algorithm – known as 'Load Balancing' – works! See Figure 3.6 and the sketch below.
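A minimal runnable sketch of this Load Balancing greedy on the example above:

#include <cstdio>
#include <vector>
#include <algorithm>
using namespace std;

int main() {
  int C = 3;                                 // chambers
  vector<int> Mass;                          // masses, dummies already added
  int init[] = {5, 1, 2, 7, 0, 0};
  Mass.assign(init, init + 6);               // S = 2C specimens
  sort(Mass.begin(), Mass.end());            // {0, 0, 1, 2, 5, 7}
  for (int i = 0; i < C; i++)                // pair lightest with heaviest
    printf("chamber %d: %d + %d = %d\n", i + 1,
           Mass[i], Mass[2 * C - 1 - i], Mass[i] + Mass[2 * C - 1 - i]);
  // chambers: 7, 5, 3; average = 5; IMBALANCE = |7-5| + |5-5| + |3-5| = 4
}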
In this section, we want to highlight another problem solving trick called decomposition!
While there are only a 'few' basic algorithms used in contest problems (most of them are covered
in this book), harder problems may require a combination of two (or more) algorithms for their
solution. For such problems, try to decompose parts of the problem so that you can solve
the different parts independently. We illustrate this decomposition technique using a recent
top-level programming problem that combines three problem solving paradigms that we have
just learned: Complete Search, Divide & Conquer, and Greedy!
You are given a scenario of airplane landings. There are 2 ≤ n ≤ 8 airplanes in the scenario.
Each airplane has a time window during which it can safely land. This time window is specified
by two integers ai , bi , which give the beginning and end of a closed interval [ai , bi ] during which
the i-th plane can land safely. The numbers ai and bi are specified in minutes and satisfy
0 ≤ ai ≤ bi ≤ 1440. In this problem, plane landing time is negligible. Then, your task is to:
1. Compute an order for landing all airplanes that respects these time windows.
HINT: order = permutation = Complete Search?
2. Furthermore, the airplane landings should be stretched out as much as possible so that the
minimum achievable time gap between successive landings is as large as possible. For example,
if three airplanes land at 10:00am, 10:05am, and 10:15am, then the smallest gap is five minutes,
which occurs between the first two airplanes. Not all gaps have to be the same, but the smallest
gap should be as large as possible!
HINT: Is this similar to ‘greedy activity selection’ problem [4]?
3. Print the answer split into minutes and seconds, rounded to the closest second.
See Figure 3.7 for illustration: line = the time window of a plane; star = its landing schedule.
Solution:
Since the number of planes is at most 8, an optimal solution can be found by simply trying all
8! = 40320 possible landing orders. This is the Complete Search portion of the problem, which is
easily handled with the C++ STL next_permutation.
Now, for each specific landing order, we want to know the largest achievable gap between successive
landings. Suppose we try a certain gap length L. We can greedily check whether this L is feasible
by forcing the first plane to land as soon as possible and each subsequent plane to land at
max(a[that plane], previous landing time + L). This is the Greedy portion. Finally, note that if a
gap L is feasible, then any smaller gap is also feasible, so the largest feasible L can be found with
bisection – the Divide & Conquer portion. A sketch of the greedy check is shown below.
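A minimal sketch of this greedy feasibility check (our code, assuming the global arrays a[], b[], and order[] from the skeleton below):

bool feasible(double L) {                 // with the current order[], can all
  double prev = -1e18;                    // planes land at least L apart?
  for (int i = 0; i < n; i++) {
    int p = order[i];
    double t = max(a[p], prev + L);       // land as early as the rules allow
    if (t > b[p]) return false;           // missed plane p's window
    prev = t;
  }
  return true;
}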
#include <cstdio>
#include <algorithm>
using namespace std;
int n, order[8]; double a[8], b[8];
int main() {
  while (scanf("%d", &n), n) { // 2 <= n <= 8
    for (int i = 0; i < n; i++) {
      scanf("%lf %lf", &a[i], &b[i]); // [ai, bi] is the interval where plane i can land safely
      a[i] *= 60; b[i] *= 60; // originally in minutes, convert to seconds
      order[i] = i;
    }
    // ... try all permutations of order[]; bisect the largest feasible gap L ...
  }
}
3.4 Dynamic Programming
Let us illustrate DP with UVa 11450 - Wedding Shopping: given a budget M (M ≤ 200) and C
garments (C ≤ 20), where each garment has up to 20 models with given prices, buy exactly one
model of each garment, spending as much of the budget as possible without exceeding it.
Test case A: M = 20, C = 3, with garment 0 models priced {6, 4, 8}, garment 1 models {5, 10},
and garment 2 models {1, 5, 3, 5}. Then the answer is 19, which may come from buying the items
(8+10+1). Note that this solution is not unique, as we also have (6+10+3) and (4+10+5).
Test case B: the same garments with a budget below 10. Then the answer is "no solution", as
buying even the cheapest models (4+5+1) = 10 still exceeds M.
First, let's see if Complete Search (backtracking) can solve this problem: start with money_left = M
and garment_id = 0. Try all possible models for garment_id = 0 (at most 20 models). If model
i is chosen, subtract model i's price from money_left, then recursively do the same for
garment_id = 1 (also up to 20 models), etc. Stop when the model for the last
garment_id = C - 1 has been chosen. If money_left < 0 before we reach the last garment_id,
prune that partial solution. Among all valid combinations, pick the one that makes money_left as
close to 0 as possible while still ≥ 0. This maximizes the money spent, which is (M - money_left).
This solution works correctly, but it is very slow! Let's analyze its worst case time complexity. In
the largest test case, garment 0 has up to 20 choices; garment 1 also has up to 20 choices; . . . ;
and the last garment 19 also has up to 20 choices. Therefore, Complete Search like this runs
in 20 × 20 × . . . × 20 (20 times) = 20²⁰ operations in the worst case, an astronomically large
number. If we only know Complete Search, there is no way we can solve this problem.
Since we want to maximize the budget spent, why don't we just take the most expensive model of
each garment_id that still fits our budget? For example, in test case A above, we choose the most
expensive model 3 of garment_id = 0 with cost 8 (money_left = 20-8 = 12), then the most
expensive model 2 of garment_id = 1 with cost 10 (money_left = 12-10 = 2), and then for
garment_id = 2 we can only choose model 1 with cost 1, as money_left does not allow us to buy
the other models with cost 3 or 5. This greedy strategy 'works' for test cases A and B above,
producing the same optimal answers (8+10+1) = 19 and "no solution", respectively. It also runs
very fast: 20 + 20 + . . . + 20 (20 times) = 400 operations in the worst case.
But greedy does not work for many other cases. This test case below is a counter-example:
M = 12, C = 3
3 models of garment id 0 → 6 4 8
2 models of garment id 1 → 5 10
4 models of garment id 2 → 1 5 3 5
The greedy strategy selects model 3 of garment_id = 0 with cost 8 (money_left = 12-8 = 4), after
which we do not have enough money to buy any model of garment_id = 1, and we wrongly report
"no solution". The optimal solution is actually (4+5+3) = 12, which uses the entire budget.
To solve this problem, we have to use DP. Let's see the key ingredients that make DP work:
1. This problem has optimal sub-structures.
This is shown in the Complete Search recurrence above: the solution for a sub-problem is part of
the solution of the original problem. Although optimal sub-structures are also an ingredient of a
working Greedy Algorithm, this problem lacks the 'greedy property' ingredient.
2. This problem has overlapping sub-problems.
This is the key point of DP! The search space is actually not as big as the 20²⁰ analyzed in the
Complete Search discussion above, as many sub-problems overlap!
Let's verify that this problem has overlapping sub-problems. Suppose there are 2 models in a certain
garment_id with the same price p. Then Complete Search will move to the same sub-problem
shop(money_left - p, garment_id + 1) after picking either model! The same situation also
occurs if some combination of money_left and chosen model prices causes money_left1 - p1
= money_left2 - p2. The same sub-problem would be computed more than once – inefficient!
So, how many distinct sub-problems (a.k.a. states) are there in this problem? Only 201 × 20 = 4020,
as there are only 201 possible values of money_left (from 0 to 200, inclusive) and 20 possible values
of garment_id (from 0 to 19, inclusive). Each sub-problem needs to be computed only once. If we
can ensure this, we can solve this problem much faster.
Implementation of this DP solution is surprisingly simple. If we already have the recursive
backtracking (the recurrence relation shown previously), we can implement top-down DP with
these few additional steps:
1. Initialize a DP 'memo' table with dummy values not used in the problem, e.g. -1.
2. At the start of the recursive function, check whether this state has been computed before.
(a) If it has, simply return the value from the DP memo table, O(1).
(b) If it has not, compute it as per normal (just once) and then store the computed value in the
DP memo table so that further calls to this sub-problem are fast.
Analyzing a DP solution is easy. If it has M distinct states, it requires at least O(M) memory space.
If computing one state requires O(k) steps, the overall time complexity is O(kM). The UVa 11450
- Wedding Shopping problem above has M = 201 × 20 = 4020 and k = 20 (as we iterate through
at most 20 models per garment_id). Thus the time complexity is 4020 × 20 = 80400 operations,
which is very manageable. We show our code below as an illustration, especially for those who
have never coded a top-down DP algorithm before.
scanf("%d", &TC);
while (TC--) {
scanf("%d %d", &M, &C);
for (i = 0; i < C; i++) {
scanf("%d", &K);
price[i][0] = K; // to simplify coding, we store K in price[i][0]
for (j = 1; j <= K; j++)
scanf("%d", &price[i][j]);
}
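The book's code continues beyond this excerpt. A hedged sketch of the memoized routine that the steps above describe, with memo[][] assumed to be reset to -1 for every test case:

int memo[210][25], price[25][25], M, C; // as in the declarations above

int shop(int money_left, int g) {        // returns max money spendable
  if (money_left < 0) return -1000000000;            // invalid state, prune
  if (g == C) return M - money_left;                 // all garments bought
  if (memo[money_left][g] != -1) return memo[money_left][g]; // computed before
  int ans = -1000000000;
  for (int model = 1; model <= price[g][0]; model++) // try every model
    ans = max(ans, shop(money_left - price[g][model], g + 1));
  return memo[money_left][g] = ans;
}
// answer: shop(M, 0); report "no solution" if it is still very negative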
There is another style of writing DP solutions, called bottom-up DP. This is actually the 'true
form' of DP, a.k.a. the 'tabular method'. The steps to build a bottom-up DP solution are:
1. Identify the set of parameters that uniquely describe a sub-problem (the DP state).
2. Prepare a DP table with one dimension per parameter, and initialize its base-case cells.
3. Determine how to fill the rest of the DP table based on the Complete Search recurrence,
usually involving one or more nested loops.
For UVa 11450 above, we can write the bottom-up DP as follows. For clarity of this discussion,
please see Figure 3.8, which illustrates test case A above.
First, set up a boolean matrix can_reach[money_left][garment_id] of size 201 × 20. Initially,
only the cells reachable by buying one of the models of garment_id = 0 are true. See Figure
3.8, leftmost, where only rows '20-6 = 14', '20-4 = 16', and '20-8 = 12' of column 0 are true.
Then, we loop from the second garment to the last garment. We set can_reach[a][b]
to true if it is possible to reach this state from some state in the previous column, i.e. from a state
can_reach[a + price of some model of garment_id b][b - 1]. See Figure 3.8, middle,
where for example can_reach[11][1] can be reached from can_reach[11 + 5][0] by buying a
model with cost 5 of garment_id = 1; can_reach[2][1] can be reached from can_reach[2 + 10][0]
by buying a model with cost 10 of garment_id = 1; etc.
Finally, the answer can be found in the last column: find the true cell of that column nearest to
index 0. In Figure 3.8, rightmost, the cell can_reach[1][2] is the answer. This means that we can
somehow reach the state money_left = 1 by buying some combination of garment models. The
final answer is M - money_left, or in this case, 20-1 = 19. The answer is "no solution" if no cell
in the last column is true.
#include <cstdio>
#include <cstring>
using namespace std;
int main() {
int i, j, l, TC, M, C, K, price[25][25]; // price[garment_id (<= 20)][model (<= 20)]
bool can_reach[210][25]; // can_reach table[money_left (<= 200)][garment_id (<= 20)]
// question: is 2nd dimension (model) needed? M = (201*20) -> (201) only?
scanf("%d", &TC); // can we compute the solution by just maintaining 2 most recent columns?
while (TC--) { // hint: DP-on-the-fly (a.k.a space saving trick)
scanf("%d %d", &M, &C);
for (i = 0; i < C; i++) {
scanf("%d", &K);
price[i][0] = K; // to simplify coding, we store K in price[i][0]
for (j = 1; j <= K; j++)
scanf("%d", &price[i][j]);
}
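    // a hedged sketch of the DP table fill performed at this point (not the
    // book's verbatim code):
    memset(can_reach, false, sizeof can_reach);
    for (j = 1; j <= price[0][0]; j++) // base case: buy any model of garment 0
      if (M - price[0][j] >= 0) can_reach[M - price[0][j]][0] = true;
    for (i = 1; i < C; i++) // then fill column by column, garment 1 .. C-1
      for (l = 0; l <= M; l++) if (can_reach[l][i - 1])
        for (j = 1; j <= price[i][0]; j++) // buy model j of garment i
          if (l - price[i][j] >= 0) can_reach[l - price[i][j]][i] = true;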
for (i = 0; i <= M && !can_reach[i][C - 1]; i++); // the answer is in the last column
As you can see, the way the bottom-up DP table is filled is not as intuitive as top-down DP, as it
requires some 'reversals' of the signs in the Complete Search recurrence that we developed in the
previous sections. However, we are aware that some programmers actually find the bottom-up
version more intuitive. The decision of which DP style to use is in your hands. To help you decide,
we present the trade-offs between top-down and bottom-up DP in Table 3.1.
Table 3.1: Top-Down versus Bottom-Up DP

Top-Down
Pros:
1. It is a natural transformation from the normal Complete Search recursion.
2. It computes sub-problems only when necessary (sometimes this is faster).
Cons:
1. It is slower if many sub-problems are revisited, due to recursive call overhead (usually this is not penalized in programming contests).
2. If there are M states, it can use up to O(M) table size, which can lead to Memory Limit Exceeded (MLE) for some hard problems.

Bottom-Up
Pros:
1. It is faster if many sub-problems are revisited, as there is no overhead from recursive calls.
2. It can save memory space with the DP 'on-the-fly' technique (see the comment in the code above).
Cons:
1. For programmers who are inclined towards recursion, this style may not be intuitive.
2. If there are M states, bottom-up DP visits and fills the value of all M states.
Longest Increasing Subsequence (LIS)
Problem: given a sequence {X[0], X[1], . . . , X[N-1]}, determine its Longest Increasing Subsequence
(LIS)⁵ – as the name implies. Take note that a 'subsequence' is not necessarily contiguous.
Example:
N = 8, sequence = {-7, 10, 9, 2, 3, 8, 8, 1}
The LIS is {-7, 2, 3, 8} of length 4.
Solution: let LIS(i) be the length of the LIS ending at index i; then we have these recurrences:
1. LIS(0) = 1 // base case
2. LIS(i) = ans, computed with the loop below:
int ans = 1; // at worst, X[i] by itself is an increasing subsequence of length 1
for (int j = 0; j < i; j++) // O(n)
  if (X[i] > X[j]) // if we can extend the LIS ending at j with X[i]
    ans = max(ans, 1 + LIS(j));
⁵ There are other variants of this problem: Longest Decreasing Subsequence, Longest Non-Increasing/Decreasing Subsequence, and the O(n log k) solution that utilizes the fact that the LIS is sorted and binary-searchable. See http://en.wikipedia.org/wiki/Longest_increasing_subsequence for more details. Note that increasing subsequences can be modeled as a Directed Acyclic Graph (DAG), so finding the LIS is equivalent to finding the longest path in a DAG.
The answer is the highest value of LIS(k) for all k in the range [0 . . . N-1].
There are clearly many overlapping sub-problems in the LIS problem, but there are only N distinct
states: the LIS ending at index i, for all i ∈ [0 . . . N-1]. As we need an O(n) loop to compute each
state, this DP algorithm runs in O(n²). The LIS solution(s) can be reconstructed by following
the arrows via some backtracking routine (scrutinize the arrows in Figure 3.9 for LIS(5) or LIS(6)).
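A minimal runnable bottom-up version of this recurrence (our sketch):

#include <cstdio>
#include <algorithm>
using namespace std;

int main() {
  int N = 8, X[8] = {-7, 10, 9, 2, 3, 8, 8, 1}, lis[8], best = 0;
  for (int i = 0; i < N; i++) {
    lis[i] = 1;                    // LIS ending at i is at least X[i] itself
    for (int j = 0; j < i; j++)    // try to extend every earlier LIS
      if (X[i] > X[j]) lis[i] = max(lis[i], 1 + lis[j]);
    best = max(best, lis[i]);
  }
  printf("%d\n", best);            // 4, e.g. {-7, 2, 3, 8}
}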
Coin Change
Problem⁶: given a target amount V cents and a list of denominations of N coins, i.e. we have
coinValue[i] (in cents) for coin types i ∈ [0 . . . N-1], what is the minimum number of coins that we
must use to obtain the amount V? Assume an unlimited supply of coins of each type.
Example 1:
V = 10, N = 2, coinValue = {1, 5}
We can use:
A. Ten 1 cent coins = 10 × 1 = 10; Total coins used = 10
B. One 5 cents coin + Five 1 cent coins = 1 × 5 + 5 × 1 = 10; Total coins used = 6
C. Two 5 cents coins = 2 × 5 = 10; Total coins used = 2 → Optimal
Recall that we can use a greedy solution if the coin denominations are suitable, as in Example 1
above (see Section 3.3). But for general cases, we have to use DP, as in Example 2 below:
Example 2:
V = 7, N = 4, coinValue = {1, 3, 4, 5}
The greedy approach answers 3, using 5+1+1 = 7, but the optimal solution is 2, using 4+3 only!
⁶ There are other variants of this problem, e.g. counting how many ways change can be made, and a variant where the supply of each coin is limited.
We can see that there are a lot of overlapping sub-problems in this Coin Change problem, but there
are only O(V) distinct states! As we need to try O(N) coin types per state, the overall time
complexity of this DP solution is O(V N).
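A hedged top-down sketch of this Coin Change DP (our code; memo[] is assumed to be reset to -1 per test case, and the V ≤ 10000 array bound is our assumption):

#include <cstring>
#include <algorithm>
using namespace std;

const int INF = 1000000000;
int N, V, coinValue[10], memo[10010];

int change(int value) {             // min coins to make 'value' cents
  if (value == 0) return 0;         // base case: amount fully paid
  if (value < 0)  return INF;       // invalid: we overshot the amount
  if (memo[value] != -1) return memo[value]; // computed before
  int ans = INF;
  for (int i = 0; i < N; i++)       // try each coin type, O(N) per state
    ans = min(ans, 1 + change(value - coinValue[i]));
  return memo[value] = ans;
}
// usage: memset(memo, -1, sizeof memo); then change(V)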
Maximum Sum
Abridged problem statement: given an n × n (1 ≤ n ≤ 100) array of integers, each in the range
[-127, 127], find the sub-rectangle with the maximum total value. Naïvely enumerating every
sub-rectangle and summing it cell by cell does not work, as that is an O(n⁶) ≈ 100⁶ algorithm.
Solution: there are several well-known DP solutions for static range problems. DP can work for
this problem because computing a large sub-rectangle definitely involves computing the smaller
sub-rectangles inside it, and such computations involve overlapping sub-rectangles!
One possible DP solution is to turn this n × n array into an n × n sum array, where arr[i][j]
no longer contains its own value but the sum of all items within the sub-rectangle (0, 0) to (i, j).
This can easily be done on the fly while reading the input, still in O(n²):
scanf("%d", &n);
for (int i = 0; i < n; i++) for (int j = 0; j < n; j++) {
scanf("%d", &arr[i][j]);
if (i > 0) arr[i][j] += arr[i - 1][j]; // if possible, add values from top
if (j > 0) arr[i][j] += arr[i][j - 1]; // if possible, add values from left
if (i > 0 && j > 0) arr[i][j] -= arr[i - 1][j - 1]; // to avoid double count
} // inclusion-exclusion principle
This code turns input array (shown in the left) into sum array (shown in the right):
0 -2 -7 0 ==> 0 -2 -9 -9
9 2 -6 2 ==> 9 9 -4 2
-4 1 -4 1 ==> 5 6 -11 -8
-1 8 0 -2 ==> 4 13 -4 -3
Now, with this sum array, we can answer the sum of any sub-rectangle (i, j) to (k, l) in O(1)! Suppose
we want to know the sum of (1, 2) to (3, 3). We split the sum array into 4 sections and compute
arr[3][3] - arr[0][3] - arr[3][1] + arr[0][1] = -3 - (-9) - 13 + (-2) = -9.
0 [-2]| -9 [-9]
-----------------
9 9 | -4 2
5 6 |-11 -8
4 [13]| -4 [-3]
With this O(1) DP formulation, this problem can now be solved in O(n⁴) ≈ 100⁴ operations, as in the fragment below.
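A fragment continuing the code above (with <algorithm> assumed for max) that enumerates all sub-rectangles using the O(1) sum trick:

int maxSum = -127 * 100 * 100;                              // a safe lower bound
for (int i = 0; i < n; i++) for (int j = 0; j < n; j++)     // top-left corner
  for (int k = i; k < n; k++) for (int l = j; l < n; l++) { // bottom-right corner
    int sum = arr[k][l];
    if (i > 0) sum -= arr[i - 1][l];
    if (j > 0) sum -= arr[k][j - 1];
    if (i > 0 && j > 0) sum += arr[i - 1][j - 1]; // inclusion-exclusion again
    maxSum = max(maxSum, sum);
  }
// maxSum now holds the answer, computed in O(n^4)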
Lesson: not every range problem requires a Segment Tree as in Section 2.3.3! Problems like this
one, where the input data is static, are usually solvable with DP techniques.
Exercise 1: The solution above runs in O(n⁴) ≈ 100⁴. There exists an O(n³) ≈ 100³ solution. Can
you figure out how to formulate it?
Exercise 2: What if the given static array is 1-D? Can you form a similar O(1) DP solution to
answer a range sum query(i, j), i.e. arr[i] + arr[i+1] + ... + arr[j]?
Exercise 3: Use the solution from Exercise 2 to find the maximum sum in a 1-D array in O(n²).
Can you further improve the solution to O(n)?
Exercise 4: Now, what if the query is a range minimum query(i, j) on a 1-D static array? The
solution in Section 2.3.3 uses a Segment Tree. Can you utilize DP to solve the same problem,
assuming there are no update operations?
Remarks
There are other classical DP problems that we choose not to cover in this book, such as Matrix
Chain Multiplication [4], Optimal Binary Search Tree [14], and 0-1 Knapsack [5, 14]. However,
Floyd Warshall's is discussed in Section 4.7, and String Edit Distance, Longest Common Subsequence
(LCS), plus other DP on Strings problems appear in Section 6.3.
Abridged problem statement: given a stick of length 1 ≤ l ≤ 1000, make 1 ≤ n ≤ 50 cuts to that
stick (the cut coordinates, within the range (0 . . . l), are given). The cost of a cut is determined by
the length of the stick being cut. Find a cutting sequence that minimizes the overall cost!
Example: l = 100, n = 3, and cut coordinates: coord = {25, 50, 75} (already sorted)
Solution: use this Complete Search recurrence + top-down DP (memoization) for cut(left, right):
1. If left + 1 = right, where left and right are indices into the array coord,
then cut(left, right) = 0: we are left with one segment, so there is no need to cut anymore.
2. Otherwise, try all possible cutting points and pick the minimum cost, as in the sketch below:
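A hedged sketch of this memoized recurrence; extending coord[] with the sentinels coord[0] = 0 and coord[n+1] = l is our convention:

#include <cstring>
#include <algorithm>
using namespace std;

int n, coord[55], memo[55][55]; // memo initialized to -1; coord[0] = 0 and
                                // coord[n + 1] = l are our added sentinels

int cut(int left, int right) {     // min cost to fully cut segment (left, right)
  if (left + 1 == right) return 0; // single segment: nothing left to cut
  if (memo[left][right] != -1) return memo[left][right];
  int ans = 1000000000;
  for (int i = left + 1; i < right; i++) // try every remaining cutting point
    ans = min(ans, cut(left, i) + cut(i, right)
                   + (coord[right] - coord[left])); // cost = this stick's length
  return memo[left][right] = ans;
}
// usage: memset(memo, -1, sizeof memo); answer = cut(0, n + 1)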
Lesson: the Complete Search recurrence above has only 50 × 50 possible left/right index
configurations (states) and runs in just 50 × 50 × 50 = 125K operations! It is easy to convert such
a Complete Search / recursive backtracking recurrence into a top-down DP, a.k.a. the memoization
technique!
For the abridged problem statement and full solution code, refer to the very first problem mentioned
in Chapter 1. The grandiose name of this problem is "Minimum Weighted Perfect Matching on a
Small General Graph". In general, this problem is hard, and the solution is Edmonds' Matching
algorithm (see [34]), which is not easy to code.
However, if the input size is small, up to M ≤ 20, then the following DP + bitmasks technique
can work. The idea is simple, as illustrated for M = 6: when nothing is matched yet, the state is
bit_mask=000000. If item 0 and item 2 are matched, we can turn on bits 0 and 2 via these simple
bit operations, i.e. bit_mask | (1 << 0) | (1 << 2), and the state becomes bit_mask=000101.
Then, if from this state item 1 and item 5 are matched, the state becomes bit_mask=100111. The
perfect matching is obtained when the state is all '1's, in this case: bit_mask=111111.
Although there are many ways to arrive at a certain state, there are only O(2^M) distinct states!
For each state, we record the minimum weight of the previous matchings needed to reach it. As we
want a perfect matching, for a currently 'off' bit i we must find the best other 'off' bit j from
[i+1 . . . M-1] using one O(M) loop. These checks are again done with bit operations, i.e.
if (!(bit_mask & (1 << i))) – and similarly for j. This algorithm runs in O(M × 2^M). In
problem UVa 10911, M = 2N and 2 ≤ N ≤ 8, so this DP + bitmasks approach is feasible. For
more details, please study the code shown in Section 1.2 or the sketch below.
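A hedged sketch of this DP + bitmasks matching (dist[i][j], the cost of pairing items i and j, the M ≤ 16 bound from UVa 10911, and the usual headers are our assumptions; memo entries are initialized to -1.0):

int M;
double dist[16][16], memo[1 << 16];

double matching(int bit_mask) {
  if (bit_mask == (1 << M) - 1) return 0;  // all items are matched
  if (memo[bit_mask] > -0.5) return memo[bit_mask];
  int i = 0;
  while (bit_mask & (1 << i)) i++;         // fix the first 'off' bit i
  double ans = 1e18;
  for (int j = i + 1; j < M; j++)          // find the best partner j for i
    if (!(bit_mask & (1 << j)))
      ans = min(ans, dist[i][j] + matching(bit_mask | (1 << i) | (1 << j)));
  return memo[bit_mask] = ans;
}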
Exercise: the code in Section 1.2 says that "this 'break' is necessary. do you understand why?",
with the hint: "it helps reducing time complexity from O((2N)² × 2^(2N)) to O((2N) × 2^(2N))". Answer it!
Abridged problem statement: given the number of cities 3 ≤ n ≤ 50, the available time 1 ≤ t ≤ 1000,
and two n × n matrices (one giving the travel times and the other giving the tolls between cities),
choose a route from the first city 0 such that we pay as little money in tolls as possible while
arriving at the last city n − 1 within time t. Output two pieces of information: the total tolls
actually paid and the actual traveling time.
Notice that there are two potentially conflicting requirements in this problem. Requirement
one is to minimize the tolls along the route. Requirement two is to ensure that we arrive at the last
city within the allocated time, which may mean paying higher tolls on some parts of the path.
Requirement two is a hard constraint: we must satisfy it, or we do not have a solution.
It should be quite clear after trying several test cases that a greedy Single-Source Shortest Paths
(SSSP) algorithm like Dijkstra's (Section 4.5), in its pure form, will not work: picking the route
with the shortest travel time (to stay within the available time t) may not lead to the smallest
possible tolls, while picking the route with the cheapest tolls may not get us there within time t.
The two requirements cannot be handled independently!
Solution: we can use the following Complete Search recurrence + top-down DP (memoization)
go(curCity, time_left), which returns a pair of information (actual toll paid, actual time used):
1. go(any curCity, < 0) = make_pair(INF, INF); // cannot go further if we run out of time
2. go(n − 1, ≥ 0) = make_pair(0, 0); // arrived at the last city, no need to pay or travel anymore
3. The general case go(curCity, time_left) is best described by the C++ code below:
pair<int, int> go(int curCity, int time_left) { // top-down DP, returns a pair
if (time_left < 0) // invalid
return make_pair(INF, INF); // a trick: return large value so that this state is not chosen
if (curCity == n - 1) // at last city
return make_pair(0, 0); // no need to pay toll, and time needed is 0
if (memo[curCity][time_left].first != -1) // visited before
return memo[curCity][time_left]; // simply return the answer
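  // (a hedged sketch of the general case; travelTime[][] and toll[][] are our
  //  assumed names for the two input matrices)
  pair<int, int> best = make_pair(INF, INF);
  for (int X = 0; X < n; X++)                 // try all possible next cities
    if (X != curCity) {
      pair<int, int> nxt = go(X, time_left - travelTime[curCity][X]);
      if (nxt.first == INF) continue;         // that continuation cannot finish
      if (nxt.first + toll[curCity][X] < best.first) { // strictly cheaper tolls
        best.first = nxt.first + toll[curCity][X];
        best.second = nxt.second + travelTime[curCity][X];
      }
    }
  return memo[curCity][time_left] = best;
}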
Lesson: some graph shortest (longest) path problems look solvable with the classical (usually
greedy) graph algorithms in Chapter 4, but in reality should be solved using DP techniques⁷.
Abridged problem statement: given t oak trees, the height h of all trees, the height f that Jayjay
loses each time it flies from one tree to another, 1 ≤ t, h ≤ 2000, 1 ≤ f ≤ 500, and the positions of
the acorns on each of the oak trees in acorn[tree][height], determine the maximum number of
acorns that Jayjay can collect in one single descent. Example: if t = 3, h = 10, f = 2, and
acorn[tree][height] is as in Figure 3.11, the best descent path has a total of 8 acorns (dotted line).
Figure 3.11: ACM ICPC Singapore 2007 - Jayjay the Flying Squirrel Collecting Acorns
⁷ This problem can also be solved with a modified Dijkstra's.
Naïve DP solution: use a memo table total[tree][height] that stores the best number of acorns
collected when Jayjay is on a certain tree at a certain height. Then Jayjay recursively tries to either
go down (-1) unit on the same oak tree or fly (-f) units down to one of the other t − 1 oak trees.
This approach requires up to 2000 × 2000 = 4M states and has time complexity 4M states × 2000
transitions = 8B operations – clearly TLE!
Better DP solution: we can actually ignore the information "which tree Jayjay is currently on", as
memoizing the best over all trees is sufficient. Set up a table dp[height] that stores the best
number of acorns collected when Jayjay is at this height. The bottom-up DP code, requiring only
2000 = 2K states and 2000 × 2000 = 4M operations, is as follows:
for (int tree = 0; tree < t; tree++) // initialization (dp[] assumed zeroed)
  dp[h] = max(dp[h], acorn[tree][h]);
for (int height = h - 1; height >= 0; height--)
for (int tree = 0; tree < t; tree++) {
acorn[tree][height] +=
max(acorn[tree][height + 1], // from this tree, +1 above
((height + f <= h) ? dp[height + f] : 0)); // best from tree at height + f
dp[height] = max(dp[height], acorn[tree][height]); // update this too
}
printf("%d\n", dp[0]); // solution will be here
Lesson: when the naïve DP state space is too large, making the overall DP time complexity
infeasible, think of ways other than the obvious to represent the possible states. Remember that
no programming contest problem is unsolvable: the problem setter must have had a trick in mind.
Abridged problem statement: you are given a simple arithmetic expression consisting only of
addition and subtraction operators, e.g. 1 - 2 + 3 - 4 - 5. You are free to put parentheses
anywhere in the expression, and as many as you want, as long as the expression remains valid. How
many different numbers can you make? For the simple expression above, the answer is 6:
1 - 2 + 3 - 4 - 5 = -7
1 - (2 + 3 - 4 - 5) = 5
1 - (2 + 3) - 4 - 5 = -13
1 - 2 + 3 - (4 - 5) = 3
1 - (2 + 3 - 4) - 5 = -5
1 - (2 + 3) - (4 - 5) = -3
The expression consists of only 2 ≤ N ≤ 30 non-negative numbers less than 100, separated by
addition or subtraction operators. There is no operator before the first number or after the last one.
An obvious first attempt is the state (idx, open), where idx is the index of the number currently
being processed and open is the number of opened brackets that are not yet closed. But these two
parameters do not yet form a unique state. For example, the partial expression '1-1+1-1...' has
state idx=3 (indices 0, 1, 2, 3 have been processed), open=0 (no close bracket can be placed
anymore), and sums to 0. The partial expression '1-(1+1-1)...' also has state idx=3, open=0 and
sums to 0. But '1-(1+1)-1...' has the same idx=3, open=0, yet sums to -2. This DP state is not
yet unique, so we need an additional value, the running sum 'val', to make the states unique.
We can represent all possible states of this problem with a 3-D boolean array
state[idx][open][val]. As 'val' ranges from -2800 to 3000 (5801 distinct values), the number of
states is 30 × 30 × 5801 ≈ 5M, with only O(1) processing per state – fast enough.
Lesson: the DP formulation for this problem is not trivial. Try to find a state representation that
uniquely identifies sub-problems. Make observations and attack the problem from there.
Abridged problem statement: given a weighted graph G, find its Maximum Weighted Independent
Set (MWIS). A subset of the vertices of graph G is said to be an Independent Set (IS)⁸ if there is no
edge of G between any two vertices in the subset. Our task is to select the IS of G with the maximum
total weight. If graph G is a tree, this problem has an efficient DP solution (see Figure 3.12).
Tip: for almost all tree-related problems, we need to 'root the tree' first if it is not yet rooted: if the
tree does not have a vertex dedicated as the root, pick an arbitrary vertex as the root. By doing
this, sub-problems w.r.t. subtrees appear, as in this MWIS-on-Tree problem (see Figure 3.13).
Once we have a rooted (sub)tree, we can formulate the MWIS recurrence w.r.t. a vertex's children.
One formulation is: C(v, selected) = max weight of the subtree rooted at v, given whether v itself
is 'selected'; a hedged completion of this recurrence is sketched below.
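A hedged completion of this formulation, using the standard MWIS-on-tree recurrence (the array names children, w, and memo are ours; memo is initialized to -1):

#include <vector>
#include <algorithm>
using namespace std;

const int MAXV = 100010;
vector<int> children[MAXV];       // the rooted tree
long long w[MAXV], memo[MAXV][2];

long long C(int v, int selected) {
  long long &ans = memo[v][selected];
  if (ans != -1) return ans;
  ans = selected ? w[v] : 0;      // take v's weight only if v is selected
  for (size_t i = 0; i < children[v].size(); i++) {
    int c = children[v][i];
    ans += selected ? C(c, 0)                // v selected: children cannot be
                    : max(C(c, 0), C(c, 1)); // v not selected: children are free
  }
  return ans;
}
// answer: max(C(root, 0), C(root, 1))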
Lesson: optimization problems on trees may be solved with DP techniques. The solution usually
involves passing information to the parent and gathering information from the children of a rooted tree.
In summary, the key to a DP solution is to determine a table of states that can uniquely and
efficiently represent sub-problems, and then to determine how to fill that table, either via top-down
recursion or a bottom-up loop.
We suggest that, beyond the examples shown in Section 3.4.3, contestants study the newer forms
of DP problems that keep appearing in recent programming contests, especially tricks for speeding
up DP solutions using the 'quadrangle inequality', convexity properties, binary search, etc.
Chapter 4
Graph
Many real-life problems can be classified as graph problems. Some have efficient solutions; some do not
have them yet. In this chapter, we learn various graph problems with known efficient solutions, ranging
from basic traversal to minimum spanning tree, shortest path, and network flow algorithms.
4.2 Depth First Search
Depth First Search (DFS) keeps an array dfs_num with one entry per vertex, all initialized to
'unvisited' (a constant DFS_WHITE = -1). DFS starts from a vertex u, marks u as 'visited' (sets
dfs_num[u] to DFS_BLACK = 1), and then, for each 'unvisited' neighbor v of u (i.e. edge u − v
exists in the graph), recursively visits v. A snippet of DFS code is shown below:
typedef pair<int, int> ii; // we will frequently use these two data type shortcuts
typedef vector<ii> vii;
// all sample code involving TRvii uses this macro:
#define TRvii(c, it) \
  for (vii::iterator it = (c).begin(); it != (c).end(); it++)
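A hedged sketch of the DFS routine the text describes (close to, but not necessarily identical to, the book's own snippet):

#define DFS_WHITE -1
#define DFS_BLACK 1
vector<vii> AdjList;    // AdjList[u]: list of (neighbor, edge weight) pairs
vector<int> dfs_num;    // initialized to DFS_WHITE for all vertices

void dfs(int u) {
  dfs_num[u] = DFS_BLACK;                 // mark u as visited
  TRvii (AdjList[u], v)                   // try all neighbors v of u
    if (dfs_num[v->first] == DFS_WHITE)   // avoid revisiting: avoid cycling
      dfs(v->first);
}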
The time complexity of this DFS implementation depends on the graph data structure used. In
a graph with V vertices and E edges, dfs runs in O(V + E) if the graph is stored as an Adjacency
List and in O(V²) if it is stored as an Adjacency Matrix.
Figure 4.1: Sample graph for the early part of this section
On the sample graph in Figure 4.1, dfs(0) – calling DFS from start vertex u = 0 – will trigger
this sequence of visitation: 0 → 1 → 2 → 3 → 4. This sequence is 'depth-first', i.e. DFS goes to
the deepest possible vertex from the start vertex before attempting another branch. Note that
this sequence of visitation depends very much on the order in which we list the neighbors of a
vertex; the sequence 0 → 1 → 3 → 2 (backtrack to 3) → 4 is also a possible visitation sequence.
Also notice that one call of dfs(u) will only visit the vertices that are connected to vertex u. That
is why vertices 5, 6, and 7 in Figure 4.1 remain unvisited after calling dfs(0).
The DFS code shown here is very similar to the recursive backtracking code shown earlier in
Section 3.1. If we compare the pseudocode of a typical backtracking code (replicated below) with
the DFS code shown above, we can see that the main difference is just whether we flag visited
vertices. DFS does. Backtracking does not. By not revisiting vertices, DFS runs in O(V + E), but
the time complexity of backtracking goes up exponentially.
void backtracking(state) {
if (hit end state or invalid state) // invalid state includes states that cause cycling
return; // we need terminating/pruning condition
for each neighbor of this state // regardless it has been visited or not
backtracking(neighbor);
}
Other Applications
DFS is not only useful for traversing a graph. It can be used to solve many other graph problems.
The fact that one single call of dfs(u) will only visit vertices that are actually connected to u can
be utilized to find (and to count) the connected components of an undirected graph (see further
below for a similar problem on directed graph). We can simply use the following code to restart
DFS from one of the remaining unvisited vertices to find the next connected component (until all
are visited):
// all sample code involving REP uses this macro:
#define REP(i, a, b) \
  for (int i = int(a); i <= int(b); i++)
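A hedged sketch of the component-counting loop the text describes (V, dfs_num, and dfs() as above):

int numComponents = 0;
dfs_num.assign(V, DFS_WHITE);
REP (i, 0, V - 1)                 // for each vertex i in [0 .. V-1]
  if (dfs_num[i] == DFS_WHITE) {  // not yet visited: a new component found
    numComponents++;
    dfs(i);                       // visit everything connected to i
  }
printf("There are %d connected components\n", numComponents);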
Exercise: We can also use Union-Find Disjoint Sets to solve this graph problem. How?
DFS can be used for purposes other than finding (and counting) connected components.
Here, we show how a simple tweak of dfs(u) can be used to label the components. Typically, we
'label' (or 'color') each component with its component number. This variant is more famously
known as 'flood fill'.
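A hedged sketch of the flood fill tweak:

void floodFill(int u, int color) {
  dfs_num[u] = color;                     // label u instead of just DFS_BLACK
  TRvii (AdjList[u], v)
    if (dfs_num[v->first] == DFS_WHITE)
      floodFill(v->first, color);         // spread the same component number
}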
Exercise: Flood Fill is more commonly performed on 2-D grid (implicit graph). Try to solve UVa
352, 469, 572, etc.
Running DFS on a connected component of a graph forms a DFS spanning tree (or a spanning
forest if the graph has more than one component and DFS is run on each of them). With one
more vertex state, DFS_GRAY = 2 (visited but not yet completed), on top of DFS_WHITE (unvisited)
and DFS_BLACK (visited and completed), we can use this DFS spanning tree (or forest) to classify
graph edges into four types:
1. Tree edges: those traversed by DFS, i.e. from a vertex with DFS_GRAY to a vertex with DFS_WHITE.
2. Back edges: part of a cycle, i.e. from a vertex with DFS_GRAY to another vertex with DFS_GRAY.
Note that usually we do not count bi-directional edges as forming a 'cycle'
(we need to remember dfs_parent to distinguish this, see the code below).
3./4. Forward and Cross edges: from a vertex with DFS_GRAY to a vertex with DFS_BLACK (these
two types are not needed for the algorithms discussed here).
Figure 4.2 shows an animation (from top left to bottom right) of calling dfs(0), then dfs(5), and
finally dfs(6) on the sample graph in Figure 4.1. We can see that 1 → 2 → 3 → 1 is a (true) cycle,
so we classify edge (3 → 1) as a back edge, whereas 0 → 1 → 0 is not a cycle: edge (1 → 0) is
just a bi-directional edge. A sketch of this DFS variant is shown below.
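A hedged reconstruction of this edge-classification DFS (the book's actual code is not reproduced in this excerpt):

#define DFS_GRAY 2
vector<int> dfs_parent;   // to distinguish bi-directional edges from cycles

void graphCheck(int u) {
  dfs_num[u] = DFS_GRAY;                         // visited, not yet completed
  TRvii (AdjList[u], v) {
    if (dfs_num[v->first] == DFS_WHITE) {        // tree edge: GRAY -> WHITE
      dfs_parent[v->first] = u;
      graphCheck(v->first);
    }
    else if (dfs_num[v->first] == DFS_GRAY) {    // GRAY -> GRAY
      if (v->first == dfs_parent[u])             // just a bi-directional edge
        printf(" Bidirectional (%d, %d)\n", u, v->first);
      else                                       // a back edge: a true cycle
        printf(" Back Edge (%d, %d) (Cycle)\n", u, v->first);
    }
  }
  dfs_num[u] = DFS_BLACK;                        // visited and completed
}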
Motivating problem: given a road map (undirected graph) with costs associated with all intersections
(vertices) and roads (edges), sabotage either a single intersection or a single road of minimum
cost such that the road network breaks down. This is the problem of finding the least-cost
Articulation Point (intersection) or the least-cost Bridge (road) in an undirected graph (road map).
An 'Articulation Point' is defined as a vertex in a graph G whose removal disconnects G. A
graph without any articulation point is called 'Biconnected'. Similarly, a 'Bridge' is defined as
an edge in a graph G whose removal disconnects G. These two problems are usually defined for
undirected graphs, although they are still well defined for directed graphs.
A naïve algorithm to find articulation points (it can be tweaked to find bridges too):
1. Run O(V + E) DFS to count the number of connected components of the original graph.
2. For each vertex v ∈ V // O(V)
(a) Cut (remove) vertex v and its incident edges.
(b) Run O(V + E) DFS to check whether the number of connected components increased.
(c) If yes, v is an articulation point/cut vertex; restore v and its incident edges.
This naïve algorithm calls DFS O(V) times, so it runs in O(V × (V + E)) = O(V² + V E). But
this is not the best algorithm: we can actually run the O(V + E) DFS just once to identify all
the articulation points and bridges.
This DFS variant, due to John Hopcroft and Robert Endre Tarjan (see problem 22.2 in [4]), is
just another extension of the DFS code shown earlier.
The algorithm maintains two numbers: dfs_num(u) and dfs_low(u). Here, dfs_num(u) stores
the iteration counter at which vertex u is visited for the first time, not just a flag distinguishing
DFS_WHITE from DFS_GRAY/DFS_BLACK. The other number, dfs_low(u), stores the lowest dfs_num
reachable from the DFS spanning subtree of u. Initially dfs_low(u) = dfs_num(u) when vertex u
is first visited. Then, dfs_low(u) can only become smaller if there is a cycle (some back edge
exists). Note that we do not update dfs_low(u) with back edge (u, v) if v is the direct parent of u.
Suppose in the graph in Figure 4.3 – left side, the sequence of visitation is 0 (at iteration 0) → 1 (1)
→ 2 (2) (backtrack to 1) → 4 (3) → 3 (4) (backtrack to 4) → 5 (5). See that these iteration counters
are shown correctly in dfs_num. Since there is no back edge in this graph, all dfs_low = dfs_num.
Figure 4.3: Introducing two more DFS attributes: dfs number and dfs low
In the graph in Figure 4.3 – right side, the sequence of visitation is 0 (at iteration 0) → 1 (1) → 2
(2) (backtrack to 1) → 3 (3) (backtrack to 1) → 4 (4) → 5 (5). There is an important back edge
that forms a cycle, i.e. edge 5-1 that is part of cycle 1-4-5-1. This causes vertices 1, 4, and 5 to be
able to reach vertex 1 (with dfs_num 1). Thus dfs_low of {1, 4, 5} are all 1.
When we are at a vertex u with a neighbor v such that dfs_low(v) ≥ dfs_num(u), then u is
an articulation vertex. This is because dfs_low(v) not being smaller than dfs_num(u) implies
that there is no back edge from the subtree of v that can reach any vertex w with a dfs_num(w)
lower than dfs_num(u) (such a w would be an ancestor of u in the DFS spanning tree). Thus, the
only way to reach an ancestor of u from v is to pass through vertex u, which means that removing
vertex u disconnects the graph.
Special case: the root of the DFS spanning tree (the vertex chosen as the start of the DFS call) is
an articulation point only if it has more than one child (a trivial case that is not detected by
this algorithm).
Figure 4.4: Finding articulation points with dfs num and dfs low
See Figure 4.4 for more details. On the graph in Figure 4.4 – left side, vertices 1 and 4 are
articulation points, because for example in edge 1-2, we see that dfs_low(2) ≥ dfs_num(1) and
in edge 4-5, we also see that dfs_low(5) ≥ dfs_num(4). On the graph in Figure 4.4 – right side,
vertex 1 is the articulation point, because for example in edge 1-5, dfs_low(5) ≥ dfs_num(1).
The process for finding bridges is similar. When dfs_low(v) > dfs_num(u), edge u-v is a
bridge. In Figure 4.5, almost all edges of the left and right graphs are bridges; only edges 1-4, 4-5,
and 5-1 of the right graph are not. This is because, for edge 4-5 for example, dfs_low(5)
≤ dfs_num(4): even if edge 4-5 is removed, we know for sure that vertex 5 can still reach
vertex 1 via another path that bypasses vertex 4, as dfs_low(5) = 1.
Figure 4.5: Finding bridges, also with dfs num and dfs low
void articulationPointAndBridge(int u) {
dfs_low[u] = dfs_num[u] = dfsNumberCounter++; // dfs_low[u] <= dfs_num[u]
TRvii (AdjList[u], v)
if (dfs_num[v->first] == DFS_WHITE) { // a tree edge
dfs_parent[v->first] = u; // parent of this children is me
if (u == dfsRoot) // special case
rootChildren++; // count children of root
articulationPointAndBridge(v->first);
if (dfs_low[v->first] >= dfs_num[u]) // for articulation point
articulation_vertex[u] = true; // store this information first
if (dfs_low[v->first] > dfs_num[u]) // for bridge
printf(" Edge (%d, %d) is a bridge\n", u, v->first);
dfs_low[u] = min(dfs_low[u], dfs_low[v->first]); // update dfs_low[u]
}
else if (v->first != dfs_parent[u]) // a back edge and not direct cycle
dfs_low[u] = min(dfs_low[u], dfs_num[v->first]); // update dfs_low[u]
}
printf("Articulation Points:\n");
REP (i, 0, V - 1)
if (articulation_vertex[i])
printf(" Vertex %d\n", i);
Yet another application of DFS is finding the strongly connected components of a directed graph.
This is a different problem from finding the connected components of an undirected graph. In
Figure 4.6, we have a graph similar to the one in Figure 4.1, but now the edges are directed.
Although the graph in Figure 4.6 appears 'connected' as one component, it is not 'strongly'
connected. In directed graphs, we are more interested in the notion of a 'Strongly Connected
Component (SCC)': a component such that, if we pick any pair of vertices u and v in it, we can
find a path from u to v and vice versa. There are three SCCs in Figure 4.6, as highlighted: {0},
{1, 3, 2}, and {4, 5, 7, 6}.
Figure 4.6: An example of directed graph and its Strongly Connected Components (SCC)
There are at least two known algorithms for finding SCCs: Kosaraju's, explained in [4], and Tarjan's
algorithm [45]. In this book, we adopt Tarjan's version, as it extends naturally from our previous
discussion of finding articulation points and bridges – which is also due to Tarjan.
The basic idea of the algorithm is that SCCs form subtrees of the DFS spanning tree, and
the roots of those subtrees are also the roots of the SCCs. To determine whether a vertex u
is the root of an SCC, Tarjan's SCC algorithm uses dfs_num and dfs_low, i.e. it checks whether
dfs_low(u) = dfs_num(u). The visited vertices are pushed onto a stack in the order of their
dfs_num. When DFS returns from a subtree whose root u is the root of an SCC, then u and all the
vertices popped from the stack above it form that SCC. The code is shown below:
void tarjanSCC(int u) {
dfs_low[u] = dfs_num[u] = dfsNumberCounter++; // dfs_low[u] <= dfs_num[u]
dfs_scc.push(u); in_stack.insert(u); // stores u based on order of visitation
TRvii (AdjList[u], v) {
if (dfs_num[v->first] == DFS_WHITE) // a tree edge
tarjanSCC(v->first);
if (in_stack.find(v->first) != in_stack.end()) // condition for update
dfs_low[u] = min(dfs_low[u], dfs_low[v->first]); // update dfs_low[u]
}
if (dfs_low[u] == dfs_num[u]) { // if this is a root of SCC
printf("SCC: ");
while (!dfs_scc.empty() && dfs_scc.top() != u) {
printf("%d ", dfs_scc.top()); in_stack.erase(dfs_scc.top()); dfs_scc.pop();
}
printf("%d\n", dfs_scc.top()); in_stack.erase(dfs_scc.top()); dfs_scc.pop();
} }
Exercise: this implementation can be improved by an O(log V) factor by using a data structure
other than set<int> for in_stack. How?
Topological sort (or topological ordering) of a Directed Acyclic Graph (DAG) is a linear ordering of
the vertices of the DAG such that vertex u comes before vertex v whenever edge (u → v) exists in
the DAG. Every DAG has one or more topological sorts. There are several ways to implement a
Topological Sort algorithm; the simplest is to slightly modify the simple DFS implementation shown
earlier in this section:
void topoVisit(int u) {
dfs_num[u] = DFS_BLACK;
TRvii (AdjList[u], v)
if (dfs_num[v->first] == DFS_WHITE)
topoVisit(v->first);
topologicalSort.push_back(u); // this is the only change
}
In topoVisit(u), we append u to the list of explored vertices only after visiting all subtrees below u.
As vector only supports efficient insertion at the back, we work around this by simply reversing
the print order in the output phase. This simple algorithm for finding (a valid) topological sort is
again due to Tarjan, and it again runs in O(V + E), as with DFS.
4.3 Breadth First Search
Breadth First Search (BFS) starts from a source vertex s, puts s into a queue, and then repeatedly
takes the front-most vertex u out of the queue and enqueues each unvisited neighbor of u. With
the help of the queue, BFS visits vertex s and all vertices in the connected component that
contains s layer by layer. This is why the name is breadth-first. The BFS algorithm also runs in
O(V + E) on a graph represented using an Adjacency List.
Implementing BFS is easy if we utilize the C++ STL: we use queue to order the sequence
of visitation and map to record whether a vertex has been visited – which at the same time also
records the distance (layer number) of each vertex from the source vertex. This feature is important,
as it can be used to solve a special case of the Single-Source Shortest Paths problem (discussed below).
queue<int> q; map<int, int> dist;
q.push(s); dist[s] = 0; // start from source
while (!q.empty()) {
int u = q.front(); q.pop(); // queue: layer by layer!
printf("Visit %d, Layer %d\n", u, dist[u]);
TRvii (AdjList[u], v) // for each neighbours of u
if (!dist.count(v->first)) { // dist.find(v) != dist.end() also works
dist[v->first] = dist[u] + 1; // if v not visited before + reachable from u
q.push(v->first); // enqueue v for next steps
} }
Exercise: This implementation uses map<STATE-TYPE, int> dist to store distance information.
This may be useful if STATE-TYPE is not integer, e.g. a pair<int, int> of (row, col) coordinate.
However, this trick adds a log V factor to the O(V + E) BFS complexity. Please, rewrite this
implementation to use vector<int> dist instead!
If we run BFS from the vertex labeled with 35 (i.e. the source vertex s = 35) on the connected
undirected graph shown in Figure 4.7, we will visit the vertices in the following order:
Layer 0: >35< (source)
Layer 1: 15, 55, 40
Layer 2: 10, 20, 50, 60
Layer 3: >30<, 25, 47, 65
Layer 4: 45
// Three layers from ’35’ to ’30’ implies that the shortest path between them
// on this unweighted graph is 3 distance units.
Other Applications
Single-Source Shortest Paths (SSSP) on Unweighted Graphs
The fact that BFS visits the vertices of a graph layer by layer from a source vertex makes BFS a
good solver for the Single-Source Shortest Paths (SSSP) problem on unweighted graphs. This is
because in an unweighted graph, the distance between two neighboring vertices connected by an
edge is simply one unit, so the layer count of a vertex, as seen previously, is precisely the shortest
path length from the source to that vertex. For example, in Figure 4.7, the shortest path from the
vertex labeled '35' to the vertex labeled '30' is 3, as '30' is in the third layer of the BFS sequence
of visitation. Reconstructing the shortest path 35 → 15 → 10 → 30 is easy if we store the BFS
spanning tree: vertex '30' remembers '10' as its parent, vertex '10' remembers '15', and vertex '15'
remembers '35' (the source).
4.4 Kruskal’s
Basic Form and Application
Motivating problem: given a connected, undirected, and weighted graph G (see the leftmost graph
in Figure 4.8), select a subset of edges E′ ⊆ E such that the graph remains connected and the
total weight of the selected edges E′ is minimal!
To satisfy the connectivity criterion, the edges in E′ must form a tree that spans (covers) all the
vertices of G – a spanning tree! There can be several valid spanning trees in G, e.g. see Figure 4.8,
middle and right sides. One of them is the required solution that satisfies the minimal weight
criterion.
This problem is called the Minimum Spanning Tree (MST) problem and has many practical
applications, as we will see later in this section.
Figure 4.8: Example of a Minimum Spanning Tree (MST) Problem (from UVa 908 [17])
This MST problem can be solved with several well-known algorithms, e.g. Prim's and Kruskal's,
both greedy algorithms explained in [4, 21, 14, 23, 16, 1, 13, 5]. For programming contests, we
opt for Kruskal's, as its implementation is very easy with the help of two data structures.
Joseph Bernard Kruskal Jr.'s algorithm first sorts the E edges by non-decreasing weight in
O(E log E). This can easily be done using a priority_queue (or, alternatively, a vector plus sort).
Then, it greedily tries to add each of the O(E) edges, minimum cost first, to the solution, as long
as the addition does not form a cycle. This cycle check can be done easily using Union-Find
Disjoint Sets. The code is short and overall runs in O(E log E).
typedef pair<int, int> ii; // we use ii as a shortcut for the integer pair data type
priority_queue< pair<int, ii> > EdgeList; // sort by edge weight O(E log E)
// PQ default: sort descending. To sort ascending, store <(negative) weight(i, j), <i, j>>
mst_cost = 0; initSet(V); // all V are disjoint sets initially, see Section 2.3.2
while (!EdgeList.empty()) { // while there exist more edges, O(E)
pair<int, ii> front = EdgeList.top(); EdgeList.pop();
if (!isSameSet(front.second.first, front.second.second)) { // if no cycle
mst_cost += (-front.first); // weight was negated in the PQ; negate it back
unionSet(front.second.first, front.second.second); // link these two vertices
} }
// note that the number of disjoint sets must eventually be one!
// otherwise, no MST has been formed...
Figure 4.9: Kruskal’s Algorithm for MST Problem (from UVa 908 [17])
Exercise: the implementation shown here only stops when EdgeList is empty. For some cases, we
can stop Kruskal's algorithm earlier. When, and how should the code be modified to handle this?
Figure 4.9 shows the execution of Kruskal's algorithm on the graph shown in Figure 4.8, leftmost¹.
Other Applications
Variants of the basic MST problem are interesting. In this section, we will explore some of them.
The first is a simple variant where we want the maximum, instead of the minimum, spanning tree.
In Figure 4.10, we see a comparison between the MST and the Maximum ST. Solution: sort the
edges in non-increasing order.
Partial 'Minimum' Spanning Tree
In this variant, we do not start with a clean slate. Some edges in the given graph are already fixed and must be taken as part of the Spanning Tree solution. We must continue building the 'M'ST from there, so the resulting Spanning Tree may no longer be minimal overall. That is why we put the term 'Minimum' in quotes. In Figure 4.11, we see an example where the edge 1-2 is already fixed (left). The actual MST is 10+13+17 = 40, which omits the edge 1-2 (middle). However, the solution for this problem must be (25)+10+13 = 48, which uses the edge 1-2 (right).
The solution for this variant is simple. After taking into account all the fixed edges, we continue running Kruskal's algorithm on the remaining free edges.
Minimum 'Spanning Forest'
In this variant, we still want the spanning criterion, i.e. all vertices must be covered by some selected edges, but we can stop before a single spanning tree is formed, as long as the spanning criterion is satisfied! This happens when we end up with a spanning 'forest'. Usually, the desired number of components is given beforehand in the problem description. In Figure 4.12, left, we observe that the MST for this graph is 10+13+17 = 40. But if we want a spanning forest with 2 components, then the solution is just 10+13 = 23, as in Figure 4.12, right.
Getting the minimum spanning forest is simple. Run Kruskal's algorithm as per normal, but as soon as the number of connected components equals the desired pre-determined number, we can terminate the algorithm.
Second Best Spanning Tree
Sometimes, we are interested in a backup plan. In the context of finding the MST, we may want not just the MST, but also the second best spanning tree, in case the MST is not workable. Figure 4.13 shows the MST (left) and the second best ST (right). We can see that the second best ST is actually the MST with just a two-edge difference, i.e. one edge is taken out of the MST and another chord² edge is added to the MST. In this example: the edge 4-5 is taken out and the edge 2-5 is added in.
¹ Note that the solution for this problem is definitely a tree, i.e. no cycles in the solution!
² A chord edge is defined as an edge in graph G that is not selected in the MST of G.
Figure 4.13: Second Best Spanning Tree (from UVa 10600 [17])
A simple solution is to first sort the edges in O(E log E), then find the MST using Kruskal's in O(E). Then, for each edge in the MST, set its weight to INF (in practice, just a very big number) to 'delete' it, and try to find the second best ST in O(V E), as there are E = V − 1 edges in the MST that we have to try. Figure 4.14 shows this algorithm on the given graph. Remember that both the MST and the second best ST are spanning trees, i.e. they are connected! Overall, this algorithm runs in O(E log E + E + V E).
Figure 4.14: Finding the Second Best Spanning Tree from the MST
• Variants
1. UVa 10147 - Highways (Partial ‘Minimum’ Spanning Tree)
2. UVa 10369 - Arctic Networks (Minimum Spanning ‘Forest’)
3. UVa 10397 - Connect the Campus (Partial ‘Minimum’ Spanning Tree)
4. UVa 10600 - ACM Contest and Blackout (Second Best Spanning Tree)
5. UVa 10842 - Traffic Flow (find min weighted edge in ‘Maximum’ Spanning Tree)
6. LA 3678 - The Bug Sensor Problem (Minimum Spanning ‘Forest’)
7. LA 4110 - RACING (‘Maximum’ Spanning Tree)
8. POJ 1679 - The Unique MST (Second Best Spanning Tree)
4.5 Dijkstra’s
Motivating problem: Given a weighted graph G and a starting source vertex s, what are the shortest
paths from s to the other vertices of G?
This problem is called the Single-Source³ Shortest Paths (SSSP) problem on a weighted graph.
It is a classical problem in graph theory and efficient algorithms exist. If the graph is unweighted,
we can use the BFS algorithm as shown earlier in Section 4.3. For a general weighted graph, BFS
does not work correctly and we should use algorithms like the O((E +V ) log V ) Dijkstra’s algorithm
(discussed here) or the O(V E) Bellman Ford’s algorithm (discussed in Section 4.6).
Edsger Wybe Dijkstra’s algorithm is a greedy algorithm: Initially, set the distance to all vertices
to be INF (a large number) but set the dist[source] = 0 (base case). Then, repeat the following
process from the source vertex: From the current vertex u with the smallest dist[u], ‘relax’ all
neighbors of u. relax(u, v) sets dist[v] = min(dist[v], dist[u] + weight(u, v)). Vertex
u is now done and will not be visited again. Then we greedily replace u with the unvisited vertex
x with currently smallest dist[x]. See proof of correctness of this greedy strategy in [4].
There can be many ways to implement this algorithm, especially in the use of the priority_queue. The following code snippet may be one of the easiest implementations.
vector<int> dist(V, INF); dist[s] = 0; // INF = 2*10^9, not MAX_INT, to avoid overflow
priority_queue<ii, vector<ii>, greater<ii> > pq; pq.push(ii(0, s)); // sort by distance
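A sketch of the main loop that would follow (assuming AdjList[u] stores the (neighbor, weight) pairs of vertex u as a vector<ii>; the 'lazy deletion' of outdated pairs matches the step-by-step run below):

while (!pq.empty()) { // main loop, O((E + V) log V)
  ii front = pq.top(); pq.pop(); // greedily pick the pair with the smallest distance
  int d = front.first, u = front.second;
  if (d > dist[u]) continue; // outdated entry in pq: ignore it
  for (int j = 0; j < (int)AdjList[u].size(); j++) {
    ii v = AdjList[u][j]; // relax all neighbors of u
    if (dist[u] + v.second < dist[v.first]) {
      dist[v.first] = dist[u] + v.second; // relax(u, v)
      pq.push(ii(dist[v.first], v.first)); // enqueue the improved pair
    } } }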
³ This generic SSSP problem can also be used to solve: 1) the Single-Pair Shortest Path problem, where both the source and destination vertices are given, and 2) the Single-Destination Shortest Paths problem, where we simply reverse the roles of the source and destination vertices.
Figure 4.15: Dijkstra Animation on a Weighted Graph (from UVa 341 [17])
Figure 4.15 shows an example of running Dijkstra's on a simple weighted graph with |V| = 5 and |E| = 7:
1. At the start, only dist[2] = 0 (vertex 2 is the source); every other dist value is INF. Our priority_queue pq contains one item, (0,2), which we dequeue.
2. From vertex 2, we relax vertices {1, 3, 5}. Now dist[1] = 2, dist[3] = 7, and dist[5] = 6.
Vertex 2 is done. The content of our priority_queue pq is {(2,1), (6,5), (7,3)}.
3. Among unprocessed vertices {1, 5, 3} in pq, vertex 1 has the least dist[1] = 2 and is in front
of pq. We dequeue (2,1) and relax all its neighbors: {3, 4} such that dist[3] = min(dist[3],
dist[1]+weight(1,3)) = min(7, 2+3) = 5 and dist[4] = 8. Vertex 1 is done. Now pq
contains {(5,3), (6,5), (7,3), (8,4)}. See that we have 2 vertex 3. But it does not matter, as
our Dijkstra’s implementation will only pick one with minimal distance later.
4. We dequeue (5,3) and try to do relax(3,4), but 5+5 = 10, whereas dist[4] = 8 (from
path 2-1-4). So dist[4] is unchanged. Vertex 3 is done and pq contains {(6,5), (7,3), (8,4)}.
5. We dequeue (6,5) and relax(5, 4), making dist[4] = 7 (the shorter path from 2 to 4 is
now 2-5-4 instead of 2-1-4). Vertex 5 is done and pq contains {(7,3), (7,4), (8,4)}.
6. Now, (7,3) can be ignored as we know that d > dist[3] (i.e. 7 > 5). Then (7,4) is processed
as before. And finally (8,4) is ignored again as d > dist[4] (i.e. 8 > 7). Dijkstra’s stops
here as the priority queue is empty.
4.6 Bellman Ford's
To solve the SSSP problem in the presence of negative edge weights, the more generic (but slower) Bellman Ford's algorithm must be used. This algorithm was invented by Richard Ernest Bellman (the pioneer of DP techniques) and Lester Randolph Ford, Jr (the same person who invented the Ford Fulkerson's method in Section 4.8). The algorithm is simple: relax all E edges V − 1 times!
This is based on the idea that dist[source] = 0, and if we relax an edge(source, u), then dist[u] will have the correct value. If we then relax an edge(u, v), then dist[v] will also have the correct value, and so on. If we have relaxed all E edges V − 1 times, then the shortest path from the source vertex to the furthest vertex from the source (a path of at most V − 1 edges) must have been correctly computed. The code is simple:
vector<int> dist(V, INF); dist[s] = 0; // INF = 2*10^9, not MAX_INT, to avoid overflow
REP (i, 0, V - 2) // relax all E edges V-1 times (REP is inclusive), O(V)
  REP (u, 0, V - 1) // these two inner loops together = O(E)
    TRvii (AdjList[u], v) // TRvii iterates over the (neighbor, weight) pairs of u
      dist[v->first] = min(dist[v->first], dist[u] + v->second); // relax this edge
REP (i, 0, V - 1)
  printf("SSSP(%d, %d) = %d\n", s, i, dist[i]);
The complexity of Bellman Ford's algorithm is O(V³) if the graph is stored as an Adjacency Matrix, or O(V E) if the graph is stored as an Adjacency List. This is (much) slower than Dijkstra's. Thus, Bellman Ford's is typically only used to solve the SSSP problem when the input graph is not too big and is not guaranteed to have all non-negative edge weights!
Bellman Ford's algorithm has one more interesting usage. After relaxing all E edges V − 1 times, the SSSP problem should have been solved, i.e. no further relaxation should be possible. This fact can be used to check for the presence of a negative cycle, even though the shortest paths problem itself is ill-defined in that case. In Figure 4.17, left, we see a simple graph with a negative cycle. After 1 pass, dist[1] = 973 and dist[2] = 1015 (middle). After V − 1 = 2 passes, dist[1] = 988 and dist[2] = 946 (right). But since there is a negative cycle, we can still relax one more time, i.e. dist[1] = 946 + 15 = 961 < 988!
Figure 4.17: Bellman Ford’s can detect the presence of negative cycle (from UVa 558 [17])
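A sketch of this extra check, reusing the REP/TRvii macros from the code above: run one more relaxation pass after the V − 1 rounds; if any dist value still improves, a negative cycle is reachable from the source.

bool hasNegativeCycle = false;
REP (u, 0, V - 1) // one more pass over all E edges
  TRvii (AdjList[u], v)
    if (dist[u] + v->second < dist[v->first]) // can this edge still be relaxed?
      hasNegativeCycle = true; // if yes, a negative cycle exists
printf("Negative Cycle Exist? %s\n", hasNegativeCycle ? "Yes" : "No");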
In Table 4.3, we present an SSSP algorithm decision table with programming contests in mind, to help readers decide which algorithm to choose depending on various graph criteria.
4.7 Floyd Warshall's
Load the small graph into an Adjacency Matrix and then run the following short code with 3 nested loops. When it terminates, AdjMatrix[i][j] will contain the shortest path distance between any pair of vertices i and j in G. The original problem now becomes easy.
REP (k, 0, V - 1) // recall that #define REP(i, a, b) for (int i = int(a); i <= int(b); i++)
REP (i, 0, V - 1)
REP (j, 0, V - 1)
AdjMatrix[i][j] = min(AdjMatrix[i][j], AdjMatrix[i][k] + AdjMatrix[k][j]);
This algorithm is called Floyd Warshall's algorithm, invented by Robert W Floyd and Stephen Warshall. Floyd Warshall's is a DP algorithm that clearly runs in O(V³) due to its 3 nested loops, but since |V| ≤ 100 for the given problem, this is doable. In general, Floyd Warshall's solves another classical graph problem: the All-Pairs Shortest Paths (APSP) problem. It is an alternative algorithm (for small graphs) to calling an SSSP algorithm multiple times.
In a programming contest setting, Floyd Warshall's main attraction is basically its implementation speed – only four short lines. If the given graph is small, do not hesitate to use this algorithm – even if you only need a solution for the SSSP problem.
In Figure 4.18, we want to find sp(3,4). The shortest possible path is 3-0-2-4 with cost 3. But how
to reach this solution? We know that with direct edges only, sp(3,4) = 5, as in Figure 4.18.A.
The solution for sp(3,4) will eventually be reached from sp(3,2)+sp(2,4). But with only direct
edges, sp(3,2)+sp(2,4) = 3+1 is still > 3.
When we allow k = 0, i.e. vertex 0 can now be used as an intermediate vertex, then sp(3,4) is
reduced as sp(3,4) = sp(3,0)+sp(0,4) = 1+3 = 4, as in Figure 4.18.B. Note that with k = 0,
sp(3,2) – which we will need later – also drops from 3 to sp(3,0)+sp(0,2) = 1+1 = 2. Floyd Warshall's will process sp(i,j) for all pairs considering only vertex 0 as the intermediate vertex. When we allow k = 1, i.e. vertices 0 and 1 can now be used as intermediate vertices, it happens that there is no change to sp(3,4), sp(3,2), or sp(2,4).
When we allow k = 2, i.e. vertices 0, 1, and 2 now can be used as the intermediate vertices, then
sp(3,4) is reduced again as sp(3,4) = sp(3,2)+sp(2,4) = 2+1 = 3. Floyd Warshall’s repeats
this process for k = 3 and finally k = 4 but sp(3,4) remains at 3.
We define D[k][i][j] to be the shortest distance from i to j when only vertices [0..k] are allowed as intermediate vertices. This DP formulation requires us to fill the entries layer by layer: to fill an entry in table k, we use entries in table k−1. For example, to calculate D[2][3][4] (row 3, column 4, in table k = 2), we take the minimum of D[1][3][4] and D[1][3][2] + D[1][2][4]. In general: D[k][i][j] = min(D[k-1][i][j], D[k-1][i][k] + D[k-1][k][j]).
The naïve implementation is to use a 3-dimensional matrix D[k][i][j] of size O(V³). However, we can utilize a space-saving trick: drop dimension k and compute D[i][j] 'on-the-fly'. Thus, Floyd Warshall's algorithm just needs O(V²) space, although it still runs in O(V³).
Other Applications
The basic purpose of Floyd Warshall’s algorithm is to solve the APSP problem. However, it can
also be applied to other graph problems.
Stephen Warshall developed an algorithm for solving the Transitive Closure problem: given a graph, determine if vertex i is connected to vertex j. This variant uses logical bitwise operators, which are much faster than arithmetic operators. Initially, d[i][j] contains 1 (true) if vertex i is directly connected to vertex j, and 0 (false) otherwise. After running the O(V³) Warshall's algorithm below, we can check if any two vertices i and j are directly or indirectly connected by checking d[i][j].
REP (k, 0, V - 1)
REP (i, 0, V - 1)
REP (j, 0, V - 1)
d[i][j] = d[i][j] | (d[i][k] & d[k][j]);
The Minimax path problem is a problem of finding the minimum of maximum edge weight among
all possible paths from i to j. There can be many paths from i to j. For a single path from i to
j, we pick the maximum edge weight along this path. Then for all possible paths from i to j, pick
the one with the minimum max-edge-weight. The reverse problem of Maximin is defined similarly.
The solution using Floyd Warshall's is shown below. First, initialize d[i][j] to be the weight of edge (i,j); this is the default minimax cost for two vertices that are directly connected. For a pair i-j without any direct edge, set d[i][j] = INF. Then, we try all possible intermediate vertices k: the minimax cost d[i][j] is the minimum of either (itself) or (the max of d[i][k] and d[k][j]).
REP (k, 0, V - 1)
REP (i, 0, V - 1)
REP (j, 0, V - 1)
d[i][j] = min(d[i][j], max(d[i][k], d[k][j]));
Exercise: In this section, we have shown you how to solve Minimax (and Maximin) with Floyd
Warshall’s algorithm. However, this problem can also be modeled as an MST problem and solved
using Kruskal’s algorithm. Find out the way!
• Variants
1. UVa 334 - Identifying Concurrent Events (transitive closure is only the sub-problem)
2. UVa 534 - Frogger (Minimax)
3. UVa 544 - Heavy Cargo (Maximin)
4. UVa 869 - Airline Comparison (run Warshall’s twice, then compare the AdjMatrices)
5. UVa 925 - No more prerequisites, please!
6. UVa 10048 - Audiophobia (Minimax)
7. UVa 10099 - Tourist Guide (Maximin)
4.8 Edmonds Karp's (excluded in IOI syllabus)
Figure 4.21: Illustration of Max Flow (From UVa 820 [17] - ICPC World Finals 2000 Problem E)
One solution is the Ford Fulkerson's method, invented by the same Lester Randolph Ford, Jr who created Bellman Ford's algorithm, together with Delbert Ray Fulkerson. The pseudo code is like this:
max_flow = 0
while (there exists an augmenting path p from s to t) { // iterative algorithm
augment flow f along p, i.e.
f = min edge weight in the path p
max_flow += f // we can send flow f from s to t
forward edges -= f // reduce capacity of these edges
backward edges += f // increase capacity of reverse edges
}
output max_flow
There are several ways to find an augmenting path in the pseudo code above, each with different
time complexity. In this section, we highlight two ways: via DFS or via BFS.
Ford Fulkerson's method implemented using DFS can run in O(f*E), where f* is the Max Flow value. This is because we can have a graph like the one in Figure 4.22, where every path augmentation only decreases the edge capacities along the path by 1, and this is repeated f* times (in Figure 4.22, 200 times). Since a flow graph has E ≥ V − 1 edges (to ensure the existence of at least one s-t flow), a single DFS run is O(E), so the overall time complexity is O(f*E).
A better implementation of Ford Fulkerson's method is to use BFS to find the shortest path – in terms of number of layers/hops – between s and t. This algorithm was discovered by Jack Edmonds and Richard Karp, and is thus named Edmonds Karp's algorithm. It runs in O(V E²), as it can be proven that after O(V E) iterations, all augmenting paths will already be exhausted. Interested readers can browse books like [4] to study this algorithm further. As BFS also runs in O(E), the overall time complexity is O(V E²). Edmonds Karp's only needs two s-t paths in Figure 4.22: s-a-t (send 100 units of flow) and s-b-t (send another 100).
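The following is a minimal sketch of such a BFS-based implementation (the names res for the residual-capacity matrix, AdjList, augment, and the global bottleneck f are our own choices, not fixed by the text):

#include <cstring>
#include <vector>
#include <queue>
#include <algorithm>
using namespace std;

const int MAX_V = 110, INF = 1000000000;
int res[MAX_V][MAX_V], p[MAX_V]; // residual capacities and BFS parents
vector<int> AdjList[MAX_V]; // AdjList[u] = neighbors of u in the flow graph
int V, s, t, f; // f = bottleneck capacity of the current augmenting path

void augment(int v, int minEdge) { // walk the BFS parents from t back to s
  if (v == s) { f = minEdge; return; } // reached the source: record the bottleneck
  if (p[v] != -1) {
    augment(p[v], min(minEdge, res[p[v]][v])); // recurse towards the source
    res[p[v]][v] -= f; // forward edges -= f
    res[v][p[v]] += f; } } // backward edges += f

int maxFlow() {
  int mf = 0;
  while (true) { // at most O(VE) BFS iterations in total
    f = 0;
    memset(p, -1, sizeof p); p[s] = s; // BFS for the shortest (hop-count) s-t path
    queue<int> q; q.push(s);
    while (!q.empty()) {
      int u = q.front(); q.pop();
      if (u == t) break;
      for (int j = 0; j < (int)AdjList[u].size(); j++) {
        int v = AdjList[u][j];
        if (res[u][v] > 0 && p[v] == -1) { p[v] = u; q.push(v); } } }
    augment(t, INF); // augment along the path found, if any
    if (f == 0) break; // no more augmenting path: the Max Flow is found
    mf += f; }
  return mf;
}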
The code snippet above shows how to implement Edmonds Karp’s algorithm in a way that it still
achieves its O(V E 2 ) time complexity. The code uses both Adjacency List (for fast enumeration of
neighbors) and Adjacency Matrix (for fast access to edge weight) of the same flow graph.
In general, this O(V E 2 ) Edmonds Karp’s implementation is sufficient to answer most network
flow problems in programming contests. However, for harder problems, we may need O(V 2 E)
Dinic’s or O(V 3 ) Push-Relabel (relabel-to-front) Max Flow algorithms [4].
Exercise: The implementation of Edmonds Karp's algorithm shown here uses an Adjacency Matrix to store the residual capacity of each edge. A better way is to store the flow of each edge, and then derive the residual capacity from the edge's capacity minus its flow. Modify the implementation!
Other Applications
There are several other interesting applications of Max Flow problem. We discuss six examples
here while some others are deferred until Section 4.9. Note that some tricks shown in this section
may be applicable to other graph problems.
Min Cut
Let's define an s-t cut C = (S, T) as a partition of the vertex set V of G such that the source s ∈ S and the sink t ∈ T. Let's also define the cut-set of C to be the set {(u, v) ∈ E | u ∈ S, v ∈ T}, such that if all edges in the cut-set of C are removed, the Max Flow from s to t is 0 (i.e. s and t are disconnected). The cost of an s-t cut C is defined as the sum of the capacities of the edges in the cut-set of C. The Min Cut problem is to minimize this cost over all s-t cuts. This problem is more general than finding bridges in a graph (see Section 4.2): here we can cut more than just one edge, but we want to do so in the least-cost way.
The solution for this problem is simple. The by-product of computing the Max Flow is the Min Cut. In Figure 4.21.D, we can see that the saturated edges, i.e. those whose flow equals their capacity, belong to the Min Cut: edges 1-4 (capacity 30, flow 30), 3-4 (5/5), and 3-2 (25/25). The cost of this cut is 30+5+25 = 60, the minimum over all possible s-t cuts. All vertices that are still reachable from the source s belong to set S; the rest belong to set T. Here, S = {1, 3} and T = {4, 2}.
Sometimes, we can have more than one source and/or more than one sink. However, this variant
is no harder than the original Max Flow problem with a single source and a single sink. Create a
super source ss and a super sink st. Connect ss to all the sources with infinite capacity, connect all the sinks to st with infinite capacity, then run Edmonds Karp's algorithm as per normal.
We can also have a Max Flow variant where capacities are defined not just on the edges but also on the vertices. To solve this variant, we can use the vertex splitting technique. A graph with vertex weights can be converted into a more familiar graph without vertex weights by splitting each weighted vertex v into vin and vout, reassigning its incoming edges to vin and its outgoing edges to vout, and putting the original vertex v's weight as the weight of the edge (vin, vout). For details, see Figure 4.23. Then run Edmonds Karp's algorithm as per normal.
The problem of finding the maximum number of independent paths from source s to sink t can be
reduced to the Max Flow problem. Two paths are said to be independent if they do not share any
vertex apart from s and t (vertex-disjoint). Solution: construct a flow network N = (V, E) from G
with vertex capacities, where N is the carbon copy of G except that the capacity of each v ∈ V is
1 (i.e. each vertex can only be used once) and the capacity of each e ∈ E is also 1 (i.e. each edge
can only be used once too). Then run Edmonds Karp’s algorithm as per normal.
Finding the maximum number of edge-disjoint paths from s to t is similar to finding max inde-
pendent paths. The only difference is that this time we do not have any vertex capacity (i.e. two
edge-disjoint paths can still share the same vertex). See Figure 4.24 for a comparison between
independent paths and edge-disjoint paths from s = 1 to t = 7.
Figure 4.24: Comparison Between Max Independent Paths versus Max Edge-Disjoint Paths
The min cost flow problem is the problem of finding the cheapest possible way of sending a certain amount of flow (usually the Max Flow) through a flow network. In this problem, every edge has two attributes: the flow capacity through the edge and the cost of using the edge. Min Cost (Max) Flow, or MCMF for short, can be solved by replacing the BFS that finds an augmenting path in Edmonds Karp's with Bellman Ford's (Dijkstra's may not work, as the edge weights in the residual flow graph can be negative).
4.9 Special Graphs
Figure 4.25: Special Graphs (Left to Right): Tree, Directed Acyclic Graph, Bipartite Graph
4.9.1 Tree
A tree is a special graph with the following characteristics: it has E = V − 1 edges (so any O(V + E) algorithm is O(V)), it has no cycle, it is connected, and there exists one unique path between any pair of vertices.
In Section 4.2, we have seen O(V + E) Tarjan’s DFS algorithm for finding articulation points and
bridges of a graph. However, if the given graph is a tree, the problem becomes simpler: all edges on
a tree are bridges and all internal vertices (degree > 1) are articulation points. This is still O(V )
as we have to scan the tree to count the number of internal vertices, but the code is simpler.
In Sections 4.5 and 4.6, we have seen two general purpose algorithms (O((E + V ) log V ) Dijkstra’s
and O(V E) Bellman-Ford’s) for solving the SSSP problem on a weighted graph. However, if the
given graph is a tree, the SSSP problem becomes simpler: any O(V ) graph traversal algorithm, i.e.
BFS or DFS, can be used to solve this problem. Since there is only one unique path between any two vertices in a tree, we simply traverse the tree to find the path connecting the two vertices; the shortest path between them is then the sum of the edge weights along this unique path.
In Section 4.7, we have seen a general purpose algorithm (O(V 3 ) Floyd Warshall’s) for solving
the APSP problem on weighted graph. However, if the given graph is a tree, the APSP problem
becomes simpler: repeat the SSSP process V times from each vertex, thus it is O(V 2 ).
But this can still be improved to O(V + Q × L): Q is the number of queries and L is the complexity of the Lowest Common Ancestor (LCA) implementation (see [40] for more details). We run one O(V) DFS/BFS from any vertex v to find dist[v][other vertices] in the tree. Then we can answer any shortest path query (i, j) on this tree by reporting dist[v][i] + dist[v][j] − 2 × dist[v][LCA(i, j)].
Diameter of Tree
The diameter of a graph is defined as the greatest distance between any pair of vertices. To find
the diameter of a graph, we first find the shortest path between each pair of vertices (the APSP
problem). The greatest length of any of these paths is the diameter of the graph. For a general graph, we may need the O(V³) Floyd Warshall's algorithm discussed in Section 4.7. However, if the given graph is a tree, the problem becomes simpler: do DFS/BFS from any vertex s to find the furthest vertex x, then do DFS/BFS one more time from vertex x to get the true furthest vertex y from x. The length of the unique path from x to y is the diameter of that tree. This solution only requires two calls of an O(V) graph traversal algorithm, as sketched below.
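A minimal sketch of this two-pass idea on an unweighted tree (the function name bfsFarthest is our own):

#include <vector>
#include <queue>
#include <utility>
using namespace std;

pair<int, int> bfsFarthest(int start, vector< vector<int> > &adj) {
  vector<int> dist((int)adj.size(), -1); // -1 marks unvisited vertices
  queue<int> q; q.push(start); dist[start] = 0;
  int far = start; // the farthest vertex found so far
  while (!q.empty()) {
    int u = q.front(); q.pop();
    if (dist[u] > dist[far]) far = u;
    for (int j = 0; j < (int)adj[u].size(); j++)
      if (dist[adj[u][j]] == -1) {
        dist[adj[u][j]] = dist[u] + 1; q.push(adj[u][j]); } }
  return make_pair(far, dist[far]); // (farthest vertex, its distance)
}
// int x = bfsFarthest(s, adj).first; // first pass from any vertex s
// int diameter = bfsFarthest(x, adj).second; // second pass from x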
In Section 3.4, we have shown a DP on Tree example that solves Max Weighted Independent Set
(MWIS) on Tree in O(V ). In Section 4.9.3 below, we will revisit this problem on a Bipartite Graph,
which can be reduced to Max Flow problem and runs in O(V E 2 ) with Edmonds Karp’s. However,
this problem is NP-complete on general graph.
Programming Exercises related to Tree (also see Section 3.4.3 for ‘DP on Tree’ Topic):
4.9.2 Directed Acyclic Graph
The Single-Source Shortest Paths (SSSP) problem becomes much simpler if the given graph is a DAG, as a DAG has at least one topological order! We can use the O(V + E) topological sort algorithm in Section 4.2 to find one such topological order, then relax the edges according to this order. The topological order ensures that if a vertex b has an incoming edge from a vertex a, then vertex b is relaxed after vertex a. This way, the distance information is propagated correctly in just one O(V + E) linear pass, as in the sketch below!
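A sketch of this one-pass relaxation (assuming topoOrder already holds a valid topological order and AdjList[u] stores (neighbor, weight) pairs as vector<ii>):

vector<int> dist(V, INF); dist[s] = 0;
for (int k = 0; k < V; k++) { // one linear pass over the topological order
  int u = topoOrder[k];
  if (dist[u] == INF) continue; // u is not reachable from s
  for (int j = 0; j < (int)AdjList[u].size(); j++) {
    ii v = AdjList[u][j]; // relax the edge u -> v.first
    dist[v.first] = min(dist[v.first], dist[u] + v.second); } }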
The Single-Source Longest Paths problem, i.e. finding the longest path from a starting vertex s, is NP-complete on a general graph [39]. However, the problem is again easy if the graph has no cycle, which is true for a DAG. The solution for Longest Paths in a DAG (a.k.a. Critical Paths) is just a minor tweak of the DAG SSSP solution above: simply negate all edge weights.
Motivating problem: Imagine that the vertices in Figure 4.26.A are passengers, and we draw an
edge between two vertices u − v if a single taxi can serve passenger u then passenger v on time. The
question is: What is the minimum number of taxis that must be deployed to serve all passengers?
The answer for the motivating problem above is two taxis. In Figure 4.26.D, we see one possible
solution. One taxi (red dotted line) serves passenger 1 (colored with red), passenger 2 (blue), and
then passenger 4 (yellow). Another taxi (green dashed line) serves passenger 3 (green) and passenger
5 (orange). All passengers are served with just two taxis.
In general, the Min Path Cover (MPC) problem on a DAG is described as the problem of finding the minimum number of paths needed to cover every vertex of the DAG G = (V, E).
This problem has a polynomial solution: construct a bipartite graph G′ = (Vout ∪ Vin, E′) from G, where Vout = {v ∈ V : v has positive out-degree}, Vin = {v ∈ V : v has positive in-degree}, and E′ = {(u, v) : u ∈ Vout, v ∈ Vin, (u, v) ∈ E}. This G′ is a bipartite graph. Finding a matching on the bipartite graph G′ forces us to select at most one outgoing edge from every v ∈ Vout (and similarly for Vin).
DAG G initially has n vertices, which can be covered with n paths of length 0 (the vertex itself).
One matching between vertex a and vertex b using edge (a, b) says that we can use one fewer path, as it covers both vertices a ∈ Vout and b ∈ Vin. Thus, if the Max Cardinality Bipartite Matching (MCBM) in G′ has size m, then we just need n − m paths to cover each vertex in G.
The MCBM in G′ that is needed to solve the MPC in G is discussed below. The solution for
bipartite matching is polynomial, thus the solution for the MPC in DAG is also polynomial. Note
that MPC in general graph is NP-Complete [42].
4.9.3 Bipartite Graph (excluded in IOI syllabus)
Motivating problem (from TopCoder [26] Open 2009 Qualifying 1): group a list of numbers into pairs such that the sum of each pair is prime. For example, given the numbers {1, 4, 7, 10, 11, 12}, we can have: {1 + 4 = 5}, {1 + 10 = 11}, {1 + 12 = 13}, {4 + 7 = 11}, {7 + 10 = 17}, etc.
Actual task: Given a list of numbers N , return a list of all the elements in N that could be
paired with N [0] successfully as part of a complete pairing (i.e. each element a in N is paired to a
unique other element b in N such that a + b is prime), sorted in ascending order. The answer for
the example above would be {4, 10} – omitting 12. This is because even though (1+12) is prime,
there would be no way to pair the remaining 4 numbers whereas if we pair (1+4), we have (7+10),
(11+12) and if we pair (1+10), we have (4+7), (11+12).
Constraints: list N contains an even number of elements (within [2 . . . 50], inclusive). Each
element of N will be between 1 and 1000, inclusive. Each element of N will be distinct.
Although this problem involves finding prime numbers, it is not a pure math problem, as the elements of N are no more than 1K – and there are not that many primes below 1000. The issue is that we cannot do a Complete Search over pairings, as there are C(50,2) possibilities for the first pair, C(48,2) for the second pair, ..., down to C(2,2) for the last pair. The DP + bitmask technique is not an option either, because 2^50 states are far too many.
The key to solving this problem is to realize that this pairing (matching) is done on a bipartite graph! To get an odd prime as a sum, we need to add 1 odd and 1 even number, because odd + odd = even (not prime, as the sums here are greater than 2), and even + even = even too. Thus we can split the odd/even numbers into set1/set2 and add an edge from set1[i] to set2[j] if set1[i] + set2[j] is prime.
After we build this bipartite graph, the solution is simple: if the sizes of set1 and set2 differ, a complete pairing is impossible. Otherwise, if the size of each set is n/2, try to match set1[0] with set2[k] for each k = [0..n/2 − 1] and do Max Cardinality Bipartite Matching (MCBM) for the rest. This can be done with a Max Flow algorithm like the O(V E²) Edmonds Karp's algorithm. If we obtain n/2 − 1 more matchings, add set2[k] to the answer. For this test case, the answer is {4, 10}.
Bipartite Matching can be reduced to the Max Flow problem by assigning a dummy source
vertex connected to all vertices in set1 and a dummy sink vertex connected to all vertices in set2.
By setting capacities of all edges in this graph to be 1, we force each vertex in set1 to be matched to
only one vertex in set2. The Max Flow will be equal to the maximum number of possible matchings
on the original graph (see Figure 4.27).
Motivating Problem: Suppose that there are two users: User A and B. Each user has transactions,
e.g. A has {A1 , A2 , . . . , An } and each transaction has a weight, e.g. W (A1 ), W (A2 ), etc. These
transactions use shared resources, e.g. transaction A1 uses {r1 , r2 }. Access to a resource is exclusive,
e.g. if A1 is selected, then any of user B’s transaction(s) that use either r1 or r2 cannot be selected.
It is guaranteed that two requests from user A will never use the same resource, but two requests
from different users may be competing for the same resource. Our task is to maximize the sum of the weights of the selected transactions!
Let's do some keyword analysis of this problem. If a transaction from user A is selected, then
transactions from user B that share some or all resources cannot be selected. This is a strong hint
for Independent Set. And since we want to maximize sum of weight of selected transactions, this
is Max Weighted Independent Set (MWIS). And since there are only two users (two sets)
and the problem statement guarantees that there is no resource conflict between the transactions
from within one user, we are sure that the input graph is a Bipartite Graph. Thus, this problem
is actually an MWIS on Bipartite Graph.
Let's look at Figure 4.28 for an illustration. We have two users. We list all transactions of A on the left and all transactions of B on the right. We draw an edge between two transactions if they share a resource. For example, transaction A1 uses resource 1 and transaction B1 also uses resource 1, so we draw an edge between A1 (with weight 45) and B1 (with weight 54). In fact, there are two more edges: between A2 − B2 and between A3 − B3. Transaction B4 has no edge, because the resources it uses, {4, 5}, are not shared with any other transaction.
In this instance, {B1 (54), A2 (51), A3 (62), B4 (2)} is the MWIS, with total weight = 54+51+62+2
= 169.
To find the solution for non-trivial cases, we have to reduce this problem to a Max Flow problem. We assign the original vertex cost (the weight of taking that vertex) as the capacity from the source to that vertex for user A, and as the capacity from that vertex to the sink for user B. Then, we give 'infinite' capacity to every edge between sets A and B. See Figure 4.29.
Figure 4.29: Reducing MWIS on Bipartite Graph to Max Flow Problem (from LA 3487 [11])
Then, we run the O(V E²) Edmonds Karp's algorithm on this flow graph. After the Max Flow algorithm terminates, the solution is Σ{s-component vertices in User A} + Σ{t-component vertices in User B}, where the s-component (t-component) vertices are those still reachable from the source vertex (sink vertex) after running Max Flow. In Figure 4.30, the solution is: {A1 (20), A2 (18), A4 (54)} + {B3 (47)} = 139. This value can also be obtained via: MWIS = Total Weight − Max Flow = 259 − 120 = 139.
Chapter 5
Mathematics
We all use math every day; to predict weather, to tell time, to handle money.
Math is more than formulas or equations; it’s logic, it’s rationality,
it’s using your mind to solve the biggest mysteries we know.
— TV show NUMB3RS
5.1 Overview and Motivation
Recent ICPCs (especially in Asia) usually contain one or two mathematics problems. This chapter aims to prepare contestants for dealing with them.
Table 5.1: Some Mathematics Problems in Recent ACM ICPC Asia Regional
The appearance of mathematics problems in programming contests is not surprising, since Computer Science is deeply rooted in Mathematics. The term 'computer' itself comes from the word 'compute', as computers were built primarily to help humans compute numbers.
We are aware that different countries place different emphases on mathematics training in pre-University education. Thus, for some new ICPC contestants, the term 'Euler Phi' is familiar, while for others it does not ring any bell – perhaps because they have not learned it before, or because the term is different in their native language. In this chapter, we want to create a more level playing field for the readers by listing common mathematical terminology, definitions, problems, and algorithms that frequently appear in programming contests.
5.2 Ad Hoc Mathematics Problems
1. UVa 344 - Roman Numerals (conversion from roman numerals to decimal and vice versa)
2. UVa 377 - Cowculations (base 4 operations)
3. UVa 10346 - Peter’s Smoke (simple math)
4. UVa 10940 - Throwing Cards Away II (find pattern using brute force, then use the pattern)
5. UVa 11130 - Billiard bounces (use billiard table reflection technique)
6. UVa 11231 - Black and White Painting (use the O(1) formula once you spot the pattern)
7. UVa 11313 - Gourmet Games (similar to UVa 10346)
8. UVa 11428 - Cubes (simple math with complete search)
9. UVa 11547 - Automatic Answer (one liner O(1) solution exists)
10. UVa 11723 - Numbering Road (simple math)
11. UVa 11805 - Bafana Bafana (very simple O(1) formula exists)
5.3 Number Theory
5.3.1 Prime Numbers
Prime numbers are an important topic in number theory and the source of many programming problems¹. In this section, we will discuss algorithms involving prime numbers.
The first algorithm presented in this section tests whether a given natural number N is prime, i.e. bool isPrime(N). The most naïve version is to test by definition, i.e. test if N is divisible by each divisor ∈ [2..N−1]. This works, but runs in O(N) – in terms of the number of divisions. This is not the best way and there are several possible improvements.
The first improvement is to test if N is divisible by a divisor ∈ [2..√N], i.e. we stop when the divisor is already greater than √N. This is because if N is divisible by p, then N = p × q. If q were smaller than p, then q or a prime factor of q would have divided N earlier. This is O(√N), which is already much faster than the previous version, but can still be improved to be twice as fast.
The second improvement is to test if N is divisible by a divisor ∈ [3, 5, 7..√N], i.e. we only test odd numbers up to √N. This is because there is only one even prime number, namely 2, which can be tested separately. This is O(√N/2), which is also O(√N).
The third improvement², which is already good enough for contest problems, is to test if N is divisible by the prime divisors ≤ √N. This is because if a prime number X cannot divide N, then there is no point testing whether multiples of X divide N or not. This is faster than O(√N): it is about O(π(√N)), where π(M) denotes the number of primes ≤ M. For example, there are 500 odd numbers in [1..√(10⁶)], but only 168 primes in the same range. π(M) is bounded by O(M/(ln(M) − 1)), so the complexity of this prime testing function is about O(√N / ln(√N)). The code is shown in the next discussion below.
If we want to generate a list of prime numbers in the range [0..N], there is a better algorithm than testing each number in the range for primality. The algorithm is called the 'Sieve of Eratosthenes', invented by Eratosthenes of Alexandria. It works as follows.
First, it sets all numbers in the range to be 'probably prime', but sets numbers 0 and 1 to be not prime. Then, it takes 2 as prime and crosses out all multiples³ of 2 starting from 2 × 2 = 4 (then 6, 8, 10, ...) until the multiple is greater than N. Then it takes the next non-crossed number 3 as a prime and crosses out all multiples of 3 starting from 3 × 3 = 9 (then 12, 15, 18, ...). Then it takes 5 and crosses out all multiples of 5 starting from 5 × 5 = 25 (then 30, 35, 40, ...). After that, whatever remains uncrossed within the range [0..N] is prime. This algorithm does approximately (N × (1/2 + 1/3 + 1/5 + 1/7 + ... + 1/last prime in range ≤ N)) operations. Using the 'sum of the reciprocals of the primes up to n', we end up with a time complexity of roughly O(N log log N) [44].
Since generating a list of small primes ≤ 10K using the sieve is fast (our library code below can go up to 10⁷ under contest settings), we opt for the sieve for smaller primes and reserve the optimized prime testing function for larger primes – see the previous discussion. The combined code is as follows:
¹ In real life, large primes are used in cryptography because it is hard to factor a number x·y into x × y when both are relatively prime.
² This is a bit recursive – testing whether a number is prime by using other (smaller) prime numbers. But the reason should be obvious after reading the next section.
³ A common sub-optimal implementation is to start crossing out from 2 × i instead of i × i, but the difference is not that much.
#include <bitset> // compact STL for Sieve, more efficient than vector<bool>!
ll _sieve_size; // ll is defined as: typedef long long ll;
bitset<10000010> bs; // 10^7 + small extra bits should be enough for most prime-related problems
vi primes; // compact list of primes in form of vector<int>
// in int main()
sieve(10000000); // can go up to 10^7
printf("%d\n", isPrime(5915587277)); // 10 digit prime
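The sieve and isPrime routines used above are not reproduced in full here; the following is a sketch that follows the algorithms just described (reusing the same _sieve_size, bs, and primes globals):

void sieve(ll upperbound) { // create list of primes in [0..upperbound]
  _sieve_size = upperbound + 1; // add 1 to include upperbound itself
  bs.set(); // set all bits to 1, i.e. 'probably prime'
  bs[0] = bs[1] = 0; // except indices 0 and 1
  for (ll i = 2; i <= _sieve_size; i++) if (bs[i]) {
    for (ll j = i * i; j <= _sieve_size; j += i) bs[j] = 0; // cross out multiples of i
    primes.push_back((int)i); } // add this prime to the list of primes
}

bool isPrime(ll N) { // a good enough deterministic prime tester
  if (N <= _sieve_size) return bs[N]; // O(1) answer for small N
  for (int i = 0; i < (int)primes.size(); i++)
    if (N % primes[i] == 0) return false; // trial division by small primes
  return true; // it takes longer if N is a large prime
}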
In number theory, we know that a prime number N has only 1 and itself as factors, but a composite number N, i.e. a non-prime, can be written uniquely as a multiplication of its prime factors. That is, prime numbers are the multiplicative building blocks of integers. For example, N = 240 = 2 × 2 × 2 × 2 × 3 × 5 = 2⁴ × 3 × 5 (the latter form is called the prime-power factorization).
A naı̈ve algorithm generates a list of primes (e.g. with sieve) and check how many of them can
actually divide the integer N – without changing N . This can be improved!
A better algorithm utilizes a Divide and Conquer spirit. An integer N can be expressed as N = PF × N′, where PF is a prime factor and N′ = N/PF is another number – i.e. we can reduce the size of N by taking out its factor PF. We can keep doing this until eventually N′ = 1. There is a special case when N is itself a prime number. The code template below takes in an integer N and returns the list of prime factors:
vi primeFactors(int N) {
vi factors; // vi "primes" (generated by sieve) is optional
int PF_idx = 0, PF = primes[PF_idx]; // using PF = 2, 3, 4, ..., is also ok.
while (N != 1 && (PF * PF <= N)) { // stop at sqrt(N), but N can get smaller
while (N % PF == 0) { N /= PF; factors.push_back(PF); } // remove this PF
PF = primes[++PF_idx]; // only consider primes!
}
if (N != 1) factors.push_back(N); // special case if N is actually a prime
return factors;
}
// in int main()
sieve(100); // prepare list of primes [0 .. 100]
vi result = primeFactors(10000); // with that, we can factor up to 100^2 = 10000
vi::iterator last = unique(result.begin(), result.end()); // to remove duplicates
for (vi::iterator i = result.begin(); i != last; i++) // output: 2 and 5
printf("%d\n", *i);
In the worst case – when N is prime – this prime factoring algorithm with trial division requires testing all smaller primes up to √N, mathematically denoted as O(π(√N)) = O(√N / ln √N). However, when given composite numbers, this algorithm is reasonably fast.
5.3.2 Greatest Common Divisor (GCD) & Least Common Multiple (LCM)
The Greatest Common Divisor (GCD) of two integers (a, b), denoted by gcd(a, b), is defined as the largest positive integer d such that d | a and d | b, where x | y means that x divides y. Examples of GCD: gcd(4, 8) = 4, gcd(10, 5) = 5, gcd(20, 12) = 4. One practical usage of GCD is to simplify fractions, e.g. 4/8 = (4/gcd(4,8)) / (8/gcd(4,8)) = (4/4) / (8/4) = 1/2.
Finding the GCD of two integers is an easy task with the effective Euclid's algorithm [20, 4], which can be implemented as a one-liner (see below). Thus finding the GCD is usually not the actual issue in a Math-related contest problem, but just part of a bigger solution.
The GCD is closely related to the Least (or Lowest) Common Multiple (LCM). The LCM of two integers (a, b), denoted by lcm(a, b), is defined as the smallest positive integer l such that a | l and b | l. Examples of LCM: lcm(4, 8) = 8, lcm(10, 5) = 10, lcm(20, 12) = 60. It has been shown [20] that lcm(a, b) = a × b / gcd(a, b). This can also be implemented as a one-liner (see below).
The GCD of more than 2 numbers, e.g. gcd(a, b, c), is equal to gcd(a, gcd(b, c)), etc., and similarly for the LCM. Both the GCD and LCM algorithms run in O(log10 n), where n = max(a, b).
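The one-liners referred to above are likely along these lines (a sketch; for large inputs, use long long so that a × b / gcd(a, b) does not overflow):

int gcd(int a, int b) { return (b == 0) ? a : gcd(b, a % b); } // Euclid's algorithm
int lcm(int a, int b) { return a / gcd(a, b) * b; } // divide first to reduce overflow risk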
In fact, there are only 12 integers less than or equal to 36 that are relatively prime to 36: 1, 5, 7, 11, 13, 17, 19, 23, 25, 29, 31, and 35. As we need to factor N, the complexity of this algorithm is similar to the complexity of factoring an integer with the trial division mentioned earlier. The code is below.
int EulerPhi(int N) {
vi factors = primeFactors(N);
vi::iterator new_end = unique(factors.begin(), factors.end()); // get unique
int result = N;
for (vi::iterator i = factors.begin(); i != new_end; i++)
result = result - result / *i;
return result;
}
1. UVa 10179 - Irreducible Basic Fractions (direct application of Euler’s Phi function)
2. UVa 10299 - Relatives (another direct application of Euler’s Phi function)
3. UVa 10820 - Send A Table (a[i] = a[i - 1] + 2 ×ϕ(i))
4. UVa 11064 - Number Theory
5. UVa 11327 - Enumerating Rational Numbers
Using extendedEuclid, the Linear Diophantine Equation with two variables can be easily solved.
For our motivating problem above: 25x + 18y = 839, we have:
a = 25, b = 18, extendedEuclid(25, 18) = ((−5, 7), 1), or
25 × (−5) + 18 × 7 = 1.
Multiplying the left and right hand side of the equation above by 839/gcd(25, 18) = 839, we have:
25 × (−4195) + 18 × 5873 = 839. Thus,
x = −4195 + (18/1)n, y = 5873 − (25/1)n.
Since we need to have non-negative x and y, we have:
−4195 + 18n ≥ 0 and 5873 − 25n ≥ 0, or
4195/18 ≤ n ≤ 5873/25, or
233.05 ≤ n ≤ 234.92.
The only possible integer for n is 234.
Thus x = −4195 + 18 × 234 = 17 and y = 5873 − 25 × 234 = 23,
i.e. 17 apples (of 25 cents each) and 23 oranges (of 18 cents each) of a total of 8.39 SGD.
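A sketch of the extendedEuclid routine used above, which computes x, y, and d such that a·x + b·y = d = gcd(a, b) (here x, y, d are globals; this is one common way to write it):

int x, y, d; // global results of extendedEuclid
void extendedEuclid(int a, int b) {
  if (b == 0) { x = 1; y = 0; d = a; return; } // base case: gcd(a, 0) = a
  extendedEuclid(b, a % b); // recurse, same structure as Euclid's gcd
  int x1 = y, y1 = x - (a / b) * y; // back-substitute the recursive answer
  x = x1; y = y1;
}
// extendedEuclid(25, 18) leaves x = -5, y = 7, d = 1, i.e. 25*(-5) + 18*7 = 1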
5.3.7 Factorial
The factorial of n, i.e. fac(n), is defined as 1 if n = 0, and n × fac(n − 1) if n > 0. Factorials grow very fast, so they may also need the Java BigInteger library (Section 5.4).
6. UVa 10061 - How many zeros & how many digits? (there exists a formula for this)
7. UVa 10139 - Factovisors (there exists a formula for this)
8. UVa 10858 - Recover Factorial
9. UVa 10220 - I Love Big Numbers! (use Java BigInteger class)
10. UVa 10323 - Factorial! You Must Be Kidding
11. UVa 10780 - Again Prime? No time.
12. UVa 11347 - Multifactorials
13. UVa 11415 - Count the Factorials
5.4 Java BigInteger Class
num1 = 1,000,000,000,000,000,000,000
num2 = 17
------------------------------- *
 7,000,000,000,000,000,000,000
10,000,000,000,000,000,000,000
------------------------------- +
num1 * num2 = 17,000,000,000,000,000,000,000
Addition and subtraction are the two simpler operations on big integers. Multiplication takes a bit more programming effort, as illustrated above. Efficient division and raising a number to a power are more complicated. Anyway, coding these library routines in C/C++ under a stressful contest environment can be a buggy affair, even if we can bring notes containing such a C/C++ library to ICPC. Fortunately, Java has a BigInteger class that we can use for this purpose (as of 9 August 2010, the C++ STL does not have such a library, so it is a good idea to use Java for BigInteger problems).
The Java BigInteger (BI) class supports the following basic integer operations: addition – add(BI), subtraction – subtract(BI), multiplication – multiply(BI), division – divide(BI), remainder – remainder(BI), combination of division and remainder – divideAndRemainder(BI), modulo – mod(BI) (slightly different from remainder(BI)), and power – pow(int exponent). For example, the following short Java code is the solution for UVa 10925 - Krakovia, which simply requires BigInteger addition (to sum N large bills) and division (to divide the large sum among F friends).
import java.io.*;
import java.util.*; // Scanner class is inside this package
import java.math.*; // BigInteger class is inside this package
GCD Revisited
When we need to compute the GCD of two big integers, we do not have to worry either. See the example below for UVa 10814 - Simplifying Fractions, which asks us to simplify a given (large) fraction to its simplest form by dividing both the numerator and the denominator by their gcd.
import java.io.*;
import java.util.*;
import java.math.*;
One of the problems presented in the previous section is LA 4104 - MODEX. We are asked to find the value of x^y (mod n). It turns out that this problem can be solved with:
import java.io.*;
import java.util.*;
import java.math.*;
Base number conversion is actually a not-so-difficult mathematical problem, but it becomes even simpler with the Java BigInteger class. We can construct and print a big integer in any base (radix) as shown below:
import java.io.*;
import java.util.*;
import java.math.*;
Programming Exercises related to Big Integer that are not mentioned elsewhere in this chapter.
5.5 Miscellaneous Mathematics Problems
5.5.1 Combinatorics
Combinatorics [31] is a branch of discrete mathematics concerning the study of finite or countable discrete structures. Programming problems involving combinatorics are usually titled 'How Many [Object]', 'Count [Object]', etc., although some problem setters choose to hide this fact in their problem titles. The code is usually short, but finding the recurrence formula takes some mathematical brilliance and patience. In ICPC, if such a problem exists in the given problem set, ask one team member to derive the formula while the other two concentrate on other problems. Quickly code the usually-short formula once it is obtained.
For example, try solving UVa 10401 - Triangle Counting. This problem has a short description: "given n rods of length 1, 2, . . . , n, pick any 3 of them and build a triangle. How many distinct triangles can you make? (3 ≤ n ≤ 1M)". Note that two triangles are considered different if they have at least one pair of arms with different lengths. If you are lucky, you may spend only a few minutes spotting the pattern; otherwise, this problem may end up unsolved by the time the contest is over. Hint: the answers for the small cases n = 3, 4, 5, 6, 7, 8, 9, and 10 are 0, 1, 3, 7, 13, 22, 34, and 50, respectively.
5.5.2 Cycle-Finding
Cycle-finding [32] is a problem of finding a cycle in a sequence of iterated function values.
For any function f : S → S and any initial value x0 ∈ S, the sequence of iterated function values:
x0 , x1 = f (x0 ), x2 = f (x1 ), . . . , xi = f (xi−1 ), . . . must eventually use the same value twice (cycle),
i.e. ∃i ̸= j such that xi = xj . Once this happens, the sequence must repeat the cycle of values from
xi to xj−1. We let µ (the tail length) be the smallest such index i, and let λ (the loop length) be the smallest positive integer such that xµ = xµ+λ. Cycle-finding is the problem of finding µ and λ, given f and x0.
For example in UVa 350 - Pseudo-Random Numbers, we are given a pseudo-random number
generator f (x) = (Z × x + I)%M with x0 = L and we want to find out the sequence length before
any number is repeated.
A naı̈ve algorithm that works in general for this problem uses a data structure (e.g. C++ STL
<set>, hash table or direct addressing table) to store information that a number xi has been
visited in the sequence and then for each xj found later (j > i), we test if xj is stored in the data
structure or not. If it is, it implies that xj = xi , µ = i, λ = j − i. This algorithm runs in O(µ + λ)
but also requires at least O(µ + λ) space to store past values.
There is a better algorithm called Floyd’s Cycle-Finding algorithm [32] that runs in the same
O(µ + λ) time complexity but only uses O(1) memory space – much smaller than the naı̈ve version.
The working C/C++ implementation of this algorithm is shown below:
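A sketch of that implementation, assuming f(x) is the iterated function (e.g. f(x) = (Z × x + I) % M for UVa 350) and ii is pair<int, int>:

ii floydCycleFinding(int x0) { // returns (mu, lambda)
  // 1st phase: find a meeting point inside the cycle; hare moves 2x as fast
  int tortoise = f(x0), hare = f(f(x0));
  while (tortoise != hare) { tortoise = f(tortoise); hare = f(f(hare)); }
  // 2nd phase: find mu; reset hare to x0, both now move at the same speed
  int mu = 0; hare = x0;
  while (tortoise != hare) { tortoise = f(tortoise); hare = f(hare); mu++; }
  // 3rd phase: find lambda; hare moves, tortoise stays
  int lambda = 1; hare = f(tortoise);
  while (tortoise != hare) { hare = f(hare); lambda++; }
  return ii(mu, lambda);
}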
• Sequences
1. UVa 100 - The 3n + 1 problem
2. UVa 413 - Up and Down Sequences
3. UVa 694 - The Collatz Sequence (similar to UVa 100)
4. UVa 10408 - Farey Sequences
5. UVa 10930 - A-Sequence
6. UVa 11063 - B2 Sequences
• Number Systems
1. UVa 136 - Ugly Numbers
2. UVa 138 - Street Numbers
3. UVa 443 - Humble Numbers
4. UVa 640 - Self Numbers (DP)
5. UVa 962 - Taxicab Numbers (Pre-calculate the answer)
6. UVa 974 - Kaprekar Numbers
7. UVa 10001 - Bangla Numbers
8. UVa 10006 - Carmichael Numbers
9. UVa 10042 - Smith Numbers
10. UVa 10044 - Erdos Numbers (solvable with BFS)
11. UVa 10591 - Happy Number (solvable with the Floyd’s Cycle-Finding algorithm)
12. UVa 11461 - Square Numbers
13. UVa 11472 - Beautiful Numbers
5.6 Chapter Notes
In this chapter, we have seen a quite effective trial division method for finding the prime factors of an integer. For faster integer factorization, one can use Pollard's rho algorithm [4]. However, if the integer to be factored is a large prime number, then this is still a slow business. This fact is the basis of security in modern cryptography techniques.
There are other theorems, hypotheses, and conjectures about prime numbers, e.g. Carmichael's function, Riemann's hypothesis, Goldbach's conjecture, the twin prime conjecture, etc. However, when such things appear in programming contests, their definitions are usually given!
We can compute fib(n) in O(log n) using matrix multiplication, but this is usually not needed in a contest setting unless the problem setter uses a very big n.
Other mathematics problems that may appear in programming contests are those involving: the Chinese Remainder Theorem (e.g. UVa 756 - Biorhythms), divisibility properties (e.g. UVa 995), Pascal's Triangle, Combinatorial Games (e.g. the Sprague-Grundy theorem for games like UVa 10165 - Stone Game (Nim game), Chess, Tic-Tac-Toe, etc.), problems involving non-conventional grid systems (e.g. UVa 10182 - Bee Maja), etc.
Note that (Computational) Geometry is also part of Mathematics, but since we have a special chapter for that, we reserve the discussion of geometry problems for Chapter 7.
In terms of doing well in ICPC, it is a good idea to have at least one strong mathematician on your ICPC team, because there usually exist one or two mathematics problems in the set where the solution is short but getting the solution/formula requires a strong thinking cap. We suggest that interested readers browse more about number theory – see books like [20], http://mathworld.wolfram.com/, and Wikipedia – and do more programming exercises related to mathematics problems, e.g. on http://projecteuler.net/ [7].
Chapter 6
String Processing
In this chapter, we present one more topic that is tested in ICPC: string processing. Processing (long) strings is quite common in the research field of bioinformatics, and some of those problems are presented as contest problems in ICPC.
Table 6.1: Some String Processing Problems in Recent ACM ICPC Asia Regional
6.2 Ad Hoc String Processing Problems
Many of these problems can be solved with basic programming skills and the standard libraries: the C <string.h> functions, the C++ <string> class, or the Java String class. For example, we can use strstr in C to find a certain substring in a longer string (also known as string matching or string searching), and strtok in C to tokenize a longer string into tokens based on some delimiters. Here are some examples:
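A small illustration of these two functions (a sketch we supply here; both functions are also usable from C++ via <cstring>):

#include <stdio.h>
#include <string.h>

int main() {
  char s[] = "competitive programming is fun";
  char *p = strstr(s, "gram"); // string matching: find "gram" inside s
  printf("found at index %d\n", (int)(p - s)); // prints: found at index 15
  char *tok = strtok(s, " "); // tokenize s by spaces (note: strtok modifies s!)
  while (tok != NULL) {
    printf("token: %s\n", tok);
    tok = strtok(NULL, " "); } // subsequent calls pass NULL
  return 0;
}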
However, recent contest problems in ACM ICPC usually do not ask for solutions to basic string processing, except for the 'giveaway' problem that all teams should be able to solve. Some string processing problems are solvable with the Dynamic Programming (DP) technique; we discuss them in Section 6.3. Some other string processing problems have to deal with long strings, so an efficient data structure for strings, such as the Suffix Tree or Suffix Array, must be used. We discuss these data structures and several specialized algorithms using them in Section 6.4.
6.3 String Processing with Dynamic Programming
After aligning¹ A with B, there are a few possibilities between character A[i] and B[i] for each index i:
1. Character A[i] and B[i] match (assume we give ‘+2’ score),
2. Character A[i] and B[i] mismatch and we replace A[i] with B[i] (‘-1’ score),
3. We insert a space in A[i] (also ‘-1’ score), or
4. We delete a letter from A[i] (also ‘-1’ score).
For example:
A = ’information’ -> ’___information_’
B = ’bioinformatics’ -> ’bioinformatic_s’
---222222222--- -> String Alignment Score = 9 x 2 - 6 = 12
A brute force solution that tries all possible alignments will typically end up with a TLE verdict for long strings A and/or B. The solution to this problem is a well-known DP algorithm (Needleman-Wunsch's algorithm [24]). Consider two strings A[1..n] and B[1..m]. We define V(i, j) to be the score of the optimal alignment between the prefixes A[1..i] and B[1..j], and score(C1, C2) to be the score when character C1 is aligned with character C2.
Base case:
V (0, 0) = 0 // no score for matching two empty strings
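The remaining base cases and the recurrence, reconstructed here from the three possibilities discussed above, are:

V(i, 0) = i × score of a deletion // align A[1..i] with an empty prefix of B
V(0, j) = j × score of an insertion // align an empty prefix of A with B[1..j]
Recurrence, for i, j > 0:
V(i, j) = max(V(i − 1, j − 1) + score(A[i], B[j]), // match or mismatch
              V(i − 1, j) + score of a deletion, // delete A[i]
              V(i, j − 1) + score of an insertion) // insert a space into A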
In short, this DP algorithm concentrates on the three possibilities for the last pair of characters,
which must be either a match/mismatch, a deletion, or an insertion. Although we do not know
which one is the best, we can try all possibilities while avoiding re-computation of overlapping
subproblems (i.e. basically a DP technique).
With a simple cost function where a match gets +2 points and a mismatch, insert, or delete each gets −1 point, the details of the string alignment score of A = 'ACAATCC' and B = 'AGCATGC' are shown
¹ Aligning is a process of inserting spaces into strings A or B such that they have the same number of characters.
Figure 6.1: String Alignment Example for A = ‘ACAATCC’ and B = ‘AGCATGC’ (score = 7)
in Figure 6.1. The alignment score is 7 (bottom right). Follow the dashed (red) arrows from the bottom-right cell to reconstruct the solution. A diagonal arrow means a match or a mismatch (e.g. the last 'C'). A vertical arrow means a deletion (e.g. ..CAT.. to ..C_A..). A horizontal arrow means an insertion (e.g. A_C.. to AGC..).
A = ’A_CAAT[C]C’
B = ’AGC_AT[G]C’
As we need to fill in all entries of the n × m table and each entry can be computed in
O(1), the time complexity is O(nm). The space complexity is O(nm) – the size of the DP table.
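As an illustration, here is a minimal bottom-up sketch of this DP (our own code, using the +2/-1 cost function above; the 1000-character bound is our assumption):

#include <algorithm>
#include <cstring>
using namespace std;

int V[1001][1001]; // assumes strings of at most 1000 characters

int align_score(const char *A, const char *B) {
  int n = strlen(A), m = strlen(B);
  for (int i = 0; i <= n; i++) V[i][0] = -i; // delete the entire prefix A[1..i]
  for (int j = 0; j <= m; j++) V[0][j] = -j; // insert the entire prefix B[1..j]
  for (int i = 1; i <= n; i++)
    for (int j = 1; j <= m; j++)
      V[i][j] = max((A[i-1] == B[j-1] ? 2 : -1) + V[i-1][j-1], // match/mismatch
                max(V[i-1][j] - 1,   // delete A[i]
                    V[i][j-1] - 1)); // insert B[j]
  return V[n][m];
}

Calling align_score("ACAATCC", "AGCATGC") returns 7, matching Figure 6.1.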
6.3.3 Palindrome
A palindrome is a string that reads the same in either direction. Some variants of
palindrome finding problems are solvable with the DP technique, as shown in this example: given a
string of up to 1000 characters, determine the length of the longest palindrome that you can make
from it by deleting zero or more characters. For example, from 'ADAM' we can delete the letter 'M'
to obtain 'ADA', a palindrome of length 3.
The DP solution: let len(l, r) be the length of the longest palindrome obtainable from string A[l ... r].
Base cases:
If (l = r), then len(l, r) = 1. // a single character, the odd-length base case
If (l + 1 = r), then len(l, r) = 2 if (A[l] = A[r]), or 1 otherwise. // the even-length base case
Recurrences:
If (A[l] = A[r]), then len(l, r) = 2 + len(l + 1, r − 1). // both corner characters are the same
else len(l, r) = max(len(l, r − 1), len(l + 1, r)). // drop either the right or the left corner character
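A top-down (memoized) sketch of this recurrence (our own code; the 1000-character limit comes from the problem statement above):

#include <algorithm>
#include <cstring>
using namespace std;

char A[1001];
int memo[1001][1001];

int len(int l, int r) { // length of the longest palindrome in A[l .. r]
  if (l > r) return 0; // empty range (reachable after the even-length step)
  if (l == r) return 1; // single character
  if (memo[l][r] != -1) return memo[l][r];
  if (A[l] == A[r]) return memo[l][r] = 2 + len(l + 1, r - 1);
  return memo[l][r] = max(len(l, r - 1), len(l + 1, r));
}
// usage: memset(memo, -1, sizeof memo); answer = len(0, strlen(A) - 1);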
Figure 6.2: Suffix Trie (Left) and Suffix Tree (Right) of S = ’acacag$’ (Figure from [24])
Consider a string S = 'acacag$'. A Suffix Trie of S is a tree that contains all possible suffixes
of S (see Figure 6.2, left). Two suffixes that share a common prefix share the same first few
vertices, e.g. 'cag$' and 'cacag$' share the first two vertices 'ca' before they split. The leaves
contain the indices of the suffixes. A Suffix Tree of S is the Suffix Trie where we merge vertices
that have only one child (see Figure 6.2, right). Notice the 'edge-label' and 'path-label' in the figure.
With a Suffix Tree, we can find all (exact) occurrences of a query string Q in S in O(|Q| + occ), where
|Q| is the length of the query string Q itself and occ is the total number of occurrences of Q in S –
no matter how long the string S is. Once the Suffix Tree has been built, this approach is faster
than many exact string matching algorithms (e.g. KMP).
With a Suffix Tree, our task is to search for the vertex x in the Suffix Tree that represents the
query string Q. This can be done with just one root-to-leaf traversal that follows the edge labels.
The vertex with path-label = Q is the desired vertex x. The leaves in the subtree rooted at x are then
the occurrences of Q in S; we can read the starting indices of these occurrences from the
leaves of that subtree.
For example, in the Suffix Tree of S = 'acacag$' shown in Figure 6.2, right, with Q = 'aca',
we simply traverse from the root, along the edge label 'a', then along the edge label 'ca', to find the vertex
x with path-label 'aca' (follow the dashed red arrow in Figure 6.2, right). The leaves of this
vertex x point to index 1 (suffix: 'acacag$') and index 3 (suffix: 'acag$').
With a Suffix Tree, we can also find the longest repeated substring in S easily: the path-label of the
deepest internal vertex X in the Suffix Tree of S is the answer. Vertex X can be found with an O(n)
tree traversal. The fact that X is an internal vertex implies that it represents more than one suffix
(leaf) of string S, and these suffixes share a common prefix – a repeated substring. The fact that X
is the deepest internal vertex (from the root) implies that its path-label is the longest repeated substring.
For example, in the Suffix Tree of S = 'acacag$' shown in Figure 6.2, right, the longest
repeated substring is 'aca', as it is the path-label of the deepest internal vertex.
The problem of finding the Longest Common Substring (not Subsequence)5 of two or more
strings can be solved in linear time with a Suffix Tree. Consider two strings S1 and S2. We can
build a generalized Suffix Tree for S1 and S2 with two different ending markers, e.g. S1 ending with
character '#' and S2 with character '$'. Then, we mark each internal vertex that has leaves
representing suffixes of both S1 and S2 – this means those suffixes share a common prefix. We then
report the path-label of the deepest marked vertex as the answer.
For example, with S1 = 'acgat#' and S2 = 'cgt$', the Longest Common Substring is 'cg'
of length 2. In Figure 6.3, we see that the root and the vertices with path-labels 'cg', 'g', and 't' all have
two different leaf markers. The deepest marked vertex is 'cg'; the two suffixes cgat# and cgt$
share the common prefix 'cg'.
4 As the Suffix Tree is more compact than the Suffix Trie, we will concentrate on the Suffix Tree.
5 In 'abcdef', 'bce' (skipping character 'd') is a subsequence, while 'bcd' (contiguous) is a substring and also a subsequence.
Figure 6.3: Generalized Suffix Tree of S1 = ’acgat#’ and S2 = ’cgt$’ (Figure from [24])
Suffix Tree and Suffix Array are closely related. As we can see in Figure 6.5, the leaves of a Suffix
Tree (from left to right) are in Suffix Array order. In short, a vertex in a Suffix Tree corresponds to
a range in the Suffix Array!
Figure 6.5: Suffix Tree versus Suffix Array of S = ’acacag$’ (Figure from [24])
A Suffix Array is good enough for many practical string operations in contest problems. In this
section, we present two simple ways to build a Suffix Array given a string S[0 ... n-1].
#include <cstdio>
#include <cstdlib>
#include <cstring>
using namespace std;

char S[1001]; // this naive Suffix Array cannot go beyond 1000 characters
int SA[1001], n;

// compare two suffixes by their starting indices
int SA_cmp(const void *a, const void *b) { return strcmp(S + *(int*)a, S + *(int*)b); }

int main() { // first approach: O(n^2 log n), only for n <= 1000
  fgets(S, sizeof S, stdin);
  S[strcspn(S, "\n")] = 0; // strip the trailing newline (gets is unsafe)
  n = strlen(S);
  for (int i = 0; i < n; i++) SA[i] = i;
  qsort(SA, n, sizeof(int), SA_cmp); // O(n log n) comparisons * O(n) per strcmp = O(n^2 log n)
  for (int i = 0; i < n; i++) printf("%d %s\n", SA[i], S + SA[i]);
  return 0;
}
When applied to the string S = 'acacag$', this simple code, which just sorts all suffixes with the sort
library, produces the correct Suffix Array = {6, 0, 2, 4, 1, 3, 5} (note that indexing starts from 0).
However, it is barely useful except for contest problems with n ≤ 1000.
A better way to build the Suffix Array is to sort the suffixes by their prefixes of exponentially
increasing length: first by their first character, then by their first 2, 4, 8, ... characters, up to n.
As the length grows exponentially, we only need O(log n) sorting steps of O(n log n) each, so the
overall complexity is O(n log² n). With this complexity, working with strings of length n ≤ 100K –
the typical programming contest range – is not a problem. Part of the library code is shown below.
For the explanation, see [29, 24].
#include <algorithm>
#include <iostream>
#include <stdlib.h>
#include <string.h>
using namespace std;
#define MAXN 200010
int RA[MAXN], SA[MAXN], LCP[MAXN], *FC, *SC, step;
char S[MAXN], Q[MAXN];
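The suffix_array routine itself is not listed here. A minimal prefix-doubling sketch (our own code and naming, not the book's library; see [29, 24] for the full version) could look like this:

#include <algorithm>
#include <vector>
using namespace std;

vector<int> sa, rnk, tmp; // rnk[i] = rank of suffix i by its first k characters

void build_sa(const char *T, int len) {
  sa.assign(len, 0); rnk.assign(len, 0); tmp.assign(len, 0);
  for (int i = 0; i < len; i++) { sa[i] = i; rnk[i] = T[i]; }
  for (int k = 1; k < len; k <<= 1) {
    auto cmp = [&](int a, int b) { // compare first 2k characters via two ranks
      if (rnk[a] != rnk[b]) return rnk[a] < rnk[b];
      int ra = a + k < len ? rnk[a + k] : -1;
      int rb = b + k < len ? rnk[b + k] : -1;
      return ra < rb;
    };
    sort(sa.begin(), sa.end(), cmp); // O(n log n) per step, O(log n) steps
    tmp[sa[0]] = 0;
    for (int i = 1; i < len; i++) // re-rank: equal prefixes share a rank
      tmp[sa[i]] = tmp[sa[i - 1]] + (cmp(sa[i - 1], sa[i]) ? 1 : 0);
    rnk = tmp;
  }
}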
Now, with the Suffix Array already built, we can search for a query string Q of length m in a string S of
length n in O(m log n). This is O(log n) times slower than the Suffix Tree version, but quite
acceptable in practice. The O(m log n) complexity comes from the fact that we can do an O(log n) binary
search over the sorted suffixes, with up to O(m) character comparisons per suffix.
The fact that the occurrences of Q in the Suffix Array of S are consecutive can also be used to
deal with the longest repeated substring and the longest common substring problems in similar
O(m log n) + O(occ) fashion. For example, in S = 'acacag$', the repeated substring 'ca' occurs in
SA[4] and SA[5]. For the common substring between S1 = 'acgat#' and S2 = 'cgt$', we can
concatenate the two strings into S = 'acgat#cgt$', build the Suffix Array of the concatenated
string, and then modify the comparison function to check for the marker characters '#' and '$'.
Our code for finding the query string Q in a Suffix Array is shown below:
int main() {
  fgets(S, MAXN, stdin);
  S[strcspn(S, "\n")] = 0; // strip the trailing newline (gets is unsafe)
  int n = strlen(S);
  suffix_array(S, n + 1); // the terminating NUL is included!
  for (int i = 1; i <= n; i++) // SA[0] is the NUL suffix
    printf("%d %s\n", SA[i], S + SA[i]);
  fgets(Q, MAXN, stdin);
  Q[strcspn(Q, "\n")] = 0;
  pair<int, int> pos = range(n, Q);
  if (pos.first != -1 && pos.second != -1) {
    printf("%s is found in SA [%d .. %d] of %s\n", Q, pos.first, pos.second, S);
    printf("They are:\n");
    for (int i = pos.first; i <= pos.second; i++)
      printf(" %s\n", S + SA[i]);
  }
  else
    printf("%s is not found in %s\n", Q, S);
  return 0;
}
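The range helper used above is not listed in this excerpt. A minimal sketch (our own code: two binary searches over the sorted suffixes, using the globals S and SA and the convention that SA[0] is the empty suffix) could be:

pair<int, int> range(int n, char *Q) { // first and last SA index matching Q
  int m = strlen(Q);
  int lo = 1, hi = n;
  while (lo < hi) { // lower bound: the first suffix >= Q (by its first m chars)
    int mid = (lo + hi) / 2;
    if (strncmp(S + SA[mid], Q, m) < 0) lo = mid + 1;
    else hi = mid;
  }
  if (strncmp(S + SA[lo], Q, m) != 0) return make_pair(-1, -1); // Q does not occur
  int start = lo;
  hi = n;
  while (lo < hi) { // upper bound: the last suffix that has Q as a prefix
    int mid = (lo + hi + 1) / 2;
    if (strncmp(S + SA[mid], Q, m) == 0) lo = mid;
    else hi = mid - 1;
  }
  return make_pair(start, lo);
}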
1. UVa 719 - Glass Beads (minimum lexicographic rotation, solvable with SA)
2. UVa 10526 - Intellectual Property
3. UVa 11107 - Life Forms
4. UVa 11512 - GATTACA
5. LA 4657 - Top 10 (ACM ICPC Jakarta 2009, problem setter: Felix Halim)
6. https://www.spoj.pl/problems/SARRAY/ (problem setter: Felix Halim)
Chapter 7
(Computational) Geometry
(Computational) Geometry is yet another topic that frequently appears in programming contests. Many
contestants are afraid to tackle these problems due to floating point precision errors1 or the many tricky
'special cases' commonly found in geometry problems. Others skip these problems because they have
forgotten some important formulas and are unable to derive the required formulas from basic concepts.
Study this chapter for some ideas on tackling (computational) geometry problems in ICPC.
Table 7.1: Some (Computational) Geometry Problems in Recent ACM ICPC Asia Regional
1 To avoid this error, we usually do floating-point comparison tests this way: fabs(a − b) < EPS, where EPS
is usually a small number like 1e-9.
We divide this chapter into two parts. The first part is geometry basics in Section 7.2, where we
review many (though not all) English geometric terminologies and formulas that are commonly used in
programming contests. The second part deals with computational geometry in Sections 7.4 - 7.5,
where we use data structures and algorithms that can be stated in terms of geometry.
• Circles
1. In a 2-D Cartesian coordinate system, the Circle centered at (a, b) with radius r is the
set of all points (x, y) such that (x − a)² + (y − b)² = r².
2. The constant π is the ratio of any circle's circumference to its diameter in Euclidean
space. To avoid precision errors, the safest value for programming contests is pi = 2 ×
acos(0.0), unless this constant is defined in the problem description!
3. The Circumference c of a circle with Diameter d is c = π × d, where d = 2 × r.
4. The length of an Arc of a circle with circumference c and angle α (in degrees2) is
(α / 360.0) × c.
2 Humans usually work with degrees, but many mathematical functions in programming languages work with
radians. Check your programming language manual to verify this. To help with conversion, just remember that
π radians equal 180 degrees.
5. The length of a Chord of a circle with radius r and angle α (in degrees) can be
obtained with the Law of Cosines: sqrt(2r² × (1 − cos(α))) – see the explanation of this law
in the discussion about Triangles below.
6. The Area A of a circle with radius r is A = π × r².
7. The area of a Sector of a circle with area A and angle α (in degrees) is (α / 360.0) × A.
8. The area of a Segment of a circle can be found by subtracting from the area of the corre-
sponding Sector the area of an Isosceles Triangle with sides: r, r, and Chord-length. These
formulas are sketched in code below.
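A short sketch of these formulas in code (our own helper functions, not the book's library; angles are taken in degrees):

#include <cmath>
const double PI = 2.0 * acos(0.0); // the safe value of pi mentioned above

double deg2rad(double alpha) { return alpha * PI / 180.0; }
double arc_length(double r, double alpha) { return alpha / 360.0 * 2.0 * PI * r; }
double chord_length(double r, double alpha) { return sqrt(2.0 * r * r * (1.0 - cos(deg2rad(alpha)))); }
double sector_area(double r, double alpha) { return alpha / 360.0 * PI * r * r; }
double segment_area(double r, double alpha) { // sector minus the isosceles triangle (alpha <= 180)
  return sector_area(r, alpha) - 0.5 * r * r * sin(deg2rad(alpha));
}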
• Triangles
1. A Triangle is a polygon (defined below) with three vertices and three edges. There are
several types of triangles:
Equilateral Triangle: all three edges have the same length and all interior angles
are 60 degrees;
• Rectangles
1. A Rectangle is a polygon with four edges, four vertices, and four right angles.
2. The Area A of a rectangle with width w and height h is A = w × h.
3. The Perimeter p of a rectangle with width w and height h is p = 2 × (w + h).
4. A Square is a special case of rectangle where w = h.
• Trapeziums
1. A Trapezium is a polygon with four edges, four vertices, and one pair of parallel edges.
If the two non-parallel sides of the trapezium have the same length, we have an Isosceles
Trapezium.
2. The Area A of a trapezium with base w1, the other edge parallel to the base w2, and
height h is A = 0.5 × (w1 + w2) × h.
• Quadrilaterals
1. A Quadrilateral or Quadrangle is a polygon with four edges (and four vertices). The
Rectangles, Squares, and Trapeziums mentioned above are Quadrilaterals. Figure 7.3
shows a few more examples: Parallelogram, Kite, Rhombus.
• Spheres
Figure 7.4: Great-Circle and Great-Circle Distance (Arc A-B) (Figures from [46])
• Polygons
3. The Area A of a polygon with n pairs of coordinates (x1, y1), (x2, y2), ..., (xn, yn) given
in some order (clockwise or counter-clockwise) is half the magnitude of the determinant
of the coordinate matrix (the 'Shoelace formula'):
A = 1/2 × (x1y2 + x2y3 + ... + xny1 − x2y1 − x3y2 − ... − x1yn)
This can be written as the library code below. Notice that our default setting is all-
integer coordinates, using all-integer operations whenever possible. You need to change
parts of this code if the given points are not integers:
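The library code itself does not appear in this excerpt; a minimal sketch (our own code, following the all-integer convention just described, with a point struct like the one used by the CCW test below) is:

#include <vector>
using namespace std;

struct point { int x, y; };

// returns twice the signed area so that everything stays in integers;
// the result is positive if the vertices are listed counter-clockwise
int area2(const vector<point> &P) {
  int result = 0, n = (int)P.size();
  for (int i = 0; i < n; i++) {
    int j = (i + 1) % n; // wrap around from the last vertex to the first
    result += P[i].x * P[j].y - P[j].x * P[i].y;
  }
  return result; // the actual area is abs(result) / 2.0
}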
4. The perimeter p of an n-sided polygon with n pairs of coordinates given in some order
(clockwise or counter-clockwise) can be computed with the Pythagorean theorem, by summing
the Euclidean distances between consecutive vertices (including from the last vertex back
to the first).
5. Testing whether a polygon is convex (or concave) is easy with a quite robust3 geometric
predicate test called the CCW (Counter Clockwise) Test (a.k.a. the Left-Turn Test).
The CCW test is a simple yet important predicate test in computational geometry. This test
takes three points p, q, r in a plane and determines whether the sequence p → q → r makes a left
turn4. For example, CCW (p, q, r) where p = (0, 0), q = (1, 0), r = (0, 1) is true. This
test can be implemented with the following library code:
int turn(point p, point q, point r) {
  int result = (r.x - q.x) * (p.y - q.y) - (r.y - q.y) * (p.x - q.x);
  if (result < 0) return -1; // P->Q->R is a right turn
  if (result > 0) return 1; // P->Q->R is a left turn
  return 0; // P->Q->R is a straight line, i.e. P, Q, R are collinear
}
With the library code above, we can now check whether a polygon is convex by verifying that
every triple of consecutive points in the polygon makes a left turn when visited in counter-
clockwise order. If we can find at least one triple where this is false, the polygon is concave;
a sketch of this check follows below.
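A minimal sketch of this convexity check (our own code, reusing the turn function and point struct above; the polygon is assumed to have no repeated vertices):

bool is_convex(const vector<point> &P) {
  int n = (int)P.size();
  for (int i = 0; i < n; i++) // test every triple of consecutive vertices
    if (turn(P[i], P[(i + 1) % n], P[(i + 2) % n]) < 0)
      return false; // a right turn: the polygon is concave
  return true; // collinear triples (turn == 0) are tolerated here
}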
• There are, of course, many other geometric shapes, objects, and formulas that have not
been covered here, such as 3-D objects. What we have covered so far are the ones that
appear most frequently in programming contests.
Other Programming Exercises related to Basic Geometry that are not listed above:
1. UVa 10088 - Trees on My Island (Georg A. Pick's Theorem: A = i + b/2 − 1, see [43])
2. UVa 10297 - Beavergnaw (cones, cylinders, volumes)
3. UVa 10387 - Billiard (expanding surface, trigonometry)
4. UVa 11232 - Cylinder (circles, rectangles, cylinders)
5. UVa 11507 - Bender B. Rodriguez Problem (simulation)
3 Geometric programs should be robust, namely, free of numerical errors. To help achieve that quality,
computations are often done via predicate tests (e.g. the CCW test) rather than floating point calculations,
which are prone to precision errors. Moreover, the arithmetic operations used are limited to additions,
subtractions, and multiplications on integers only (exact arithmetic).
4 In other words: p → q → r is counter-clockwise, r is on the left of line pq, triangle pqr has a positive area, and
its determinant is greater than zero.
There are several convex hull algorithms available. In this section, we choose the O(n log n) Ronald
Graham's Scan algorithm. This algorithm first sorts the n points of P (Figure 7.5.A) by their
angles w.r.t. a point called the pivot. In our example, we pick the bottommost and rightmost point in P as
the pivot (see point 0 and the counter-clockwise order of the remaining points in Figure 7.5.B).
Then, the algorithm maintains a stack S of candidate points. Each point of P is pushed once
onto S, and points that are not going to be part of CH(P) are eventually popped from S.
Examine Figure 7.5.C. The stack previously contains (bottom) 11-0-1-2 (top), but when we try to
insert 3, 1-2-3 is a right turn, so we pop 2. Now 0-1-3 is a left turn, so we push 3 onto the stack,
which now contains (bottom) 11-0-1-3 (top).
When Graham's Scan terminates, whatever is left in S are the points of CH(P) (see Figure 7.5.D;
the stack contains (bottom) 11-0-1-4-7-10-11 (top)). Graham's Scan eliminates all the right turns!
Check that the sequence of vertices in S always makes left turns – a convex polygon.
The implementation of Graham's Scan, omitting parts shown earlier such as the ccw function,
is shown below.
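A minimal sketch of the main scan loop (our own code, not the book's exact implementation; P[0 .. n-1] is assumed to be sorted by angle around the pivot, with the pivot itself at P[0], and turn is the CCW test from Section 7.2):

stack<point> S; // requires #include <stack>
S.push(P[n - 1]); // sentinel below the pivot
S.push(P[0]); // the pivot itself
int i = 1;
while (i < n) {
  point now = S.top(); S.pop();
  if (turn(S.top(), now, P[i]) > 0) { // left turn: keep both points
    S.push(now); S.push(P[i]); i++;
  } // otherwise `now` caused a right turn: leave it popped and re-test P[i]
}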
vector<point> ConvexHull;
while (!S.empty()) { // from stack back to vector
  ConvexHull.push_back(S.top());
  S.pop();
}
ConvexHull.pop_back(); // the last one is a duplicate of the first one
However, intersections can occur between different types of objects other than two line segments:
Cubes, Boxes, Circles, Polygons, Triangles, Rectangles, etc. A basic building block, the line
segment intersection test, is sketched below.
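This is a sketch of our own (built on the turn predicate from Section 7.2); it reports only proper crossings and leaves the collinear/touching special cases out:

// segments p-q and r-s properly intersect iff the endpoints of each
// segment strictly straddle the line through the other segment
bool segments_intersect(point p, point q, point r, point s) {
  int d1 = turn(r, s, p), d2 = turn(r, s, q); // p and q w.r.t. line rs
  int d3 = turn(p, q, r), d4 = turn(p, q, s); // r and s w.r.t. line pq
  return d1 * d2 < 0 && d3 * d4 < 0;
}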
Suppose that we have a problem as follows: given a desired soccer field with length : width
ratio = a : b, two arcs from the same circle whose center is located in the middle of the field,
and the requirement that the athletics track (the perimeter: 2 × length + 2 × arc) be 400m long, what are the
actual length and width of the field? See Figure 7.6.
It is quite hard to find the solution with pen and paper, but with the help of a computer and
the bisection method (binary search), we can find the solution easily.
Suppose we binary search on the value of L; then W = b × L / a. The expected length
of one arc is (400 − 2 × L) / 2. Now we can use trigonometry to compute the radius r and the angle o
via the triangle CM X (see Figure 7.6). With r and o, we can compute the actual arc length. We then
compare this value with the expected arc length to decide whether we have to enlarge or reduce
the length L. The important portion of the code is shown below.
lo = 0.0; hi = 400.0; // the range of the answer
while (fabs(lo - hi) > 1e-9) { // bisection method on L
  L = (lo + hi) / 2.0;
  W = b * L / a; // W is derived from L and the ratio a : b
  expected_arc = (400 - 2.0 * L) / 2.0;
  half_L = 0.5 * L; half_W = 0.5 * W;
  r = sqrt(half_L * half_L + half_W * half_W);
  angle = 2.0 * atan(half_W / half_L) * 180.0 / PI;
  this_arc = angle / 360.0 * PI * (2.0 * r);
  if (this_arc > expected_arc) hi = L;
  else lo = L;
}
printf("Case %d: %.12lf %.12lf\n", caseNo++, L, W);
1. UVa 10245 - The Closest Pair Problem (as the problem name implies)
2. UVa 10566 - Crossed Ladders (Bisection Method)
3. UVa 11378 - Bey Battle (also a Closest Pair Problem)
4. UVa 11646 - Athletics Track (Bisection Method, the circle is at the center of track)
5. UVa 11648 - Divide The Land (Bisection Method)
Appendix A
Problem Credits
The problems discussed in this book are mainly taken from UVa online judge [17] and ACM ICPC
Live Archive [11]. We have tried our best to contact the original authors and get their permissions.
So far, we have contacted the following problem setters and obtained their permissions: Sohel
Hafiz, Shahriar Manzoor, Manzurur Rahman Khan, Rujia Liu, Gordon Cormack, Jim Knisely,
Melvin Zhang, and Colin Tan.
If the author of a particular problem discussed in this book whom we have not yet contacted
does not allow his/her problem to be used, we will replace that particular problem
with a similar problem from a different problem setter.
Appendix B
You, the reader, can help us improve the quality of future versions of this book. If you spot
any technical, grammatical, or spelling errors in this book, or if you want to contribute certain
parts for a future version of this book (i.e. "I have a better example/algorithm to illustrate a
certain point"), please email the main author directly: [email protected].
Bibliography
[1] Ahmed Shamsul Arefin. Art of Programming Contest (from Steven’s Website). Gyankosh
Prokashoni (Available Online), 2006.
[2] Jon Bentley. Programming Pearls. Addison Wesley, 2nd edition, 2000.
[3] Frank Carrano. Data Abstraction & Problem Solving with C++. Pearson, 5th edition, 2007.
[4] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to
Algorithms. MIT Press, 2nd edition, 2001.
[5] Sanjoy Dasgupta, Christos Papadimitriou, and Umesh Vazirani. Algorithms. McGraw Hill,
2008.
[6] Mark de Berg, Marc van Kreveld, Mark Overmars, and Otfried Cheong Schwarzkopf.
Computational Geometry: Algorithms and Applications. Springer, 2nd edition, 2000.
[9] Steven Halim and Felix Halim. Competitive Programming in National University of Singapore.
Ediciones Sello Editorial S.L. (Presented at the Collaborative Learning Initiative Symposium CLIS
@ ACM ICPC World Final 2010, Harbin, China), 2010.
[10] Steven Halim, Roland Hock Chuan Yap, and Felix Halim. Engineering Stochastic Local Search
for the Low Autocorrelation Binary Sequence Problem. In Principles and Practice of
Constraint Programming, pages 640–645, 2008.
[13] Jon Kleinberg and Eva Tardos. Algorithm Design. Addison Wesley, 2006.
[14] Anany Levitin. Introduction to The Design & Analysis of Algorithms. Addison Wesley, 1st
edition, 2002.
[15] Rujia Liu. Algorithm Contests for Beginners (In Chinese). Tsinghua University Press, 2009.
[16] Rujia Liu and Liang Huang. The Art of Algorithms and Programming Contests (In Chinese).
Tsinghua University Press, 2003.
[19] Joseph O’Rourke. Computational Geometry in C. Cambridge University Press, 2nd edition,
1998.
[20] Kenneth H. Rosen. Elementary Number Theory and Its Applications. Addison Wesley Longman,
4th edition, 2000.
[21] Robert Sedgewick. Algorithms in C++, Part 1-5. Addison Wesley, 3rd edition, 2002.
[23] Steven S. Skiena and Miguel A. Revilla. Programming Challenges. Springer, 2003.
[24] Wing-Kin Sung. Algorithms in Bioinformatics: A Practical Introduction. CRC Press (Taylor
& Francis Group), 1st edition, 2010.
[28] Tom Verhoeff. 20 Years of IOI Competition Tasks. Olympiads in Informatics, 3:149–166, 2009.
[29] Adrian Vladu and Cosmin Negruşeri. Suffix arrays - a programming contest approach. 2008.
[47] Yonghui Wu and Jiang De Wang. Practical Algorithm Analysis and Program Design (In
Chinese). Posts and Telecom Press, 2009.