CITS3210 Algorithms
Lecture Notes
Notes by CSSE, Comics by xkcd.com

3. Sorting
(c) QuickSort.
9. Optimization Algorithms.
What are the outcomes of this unit?

Theoretical results

Exercise: Show the number of method calls made to fib() is 2Fn − 1.

[Figure: plot accompanying the fib() example, y-axis 10.0 to 40.0, x-axis 22 to 26.]
Re-design the algorithm

  ...
  return f_0;
}
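Only the tail of the redesigned method survives above. A minimal sketch of the usual iterative redesign (not from the notes; the names f_0 and f_1 are chosen to match the surviving fragment) runs in O(n) additions rather than the exponential time of the naive recursion:

static long fib(int n) {
    long f_0 = 1, f_1 = 1;           // f_0 = F(1), f_1 = F(2)
    for (int i = 1; i < n; i++) {    // after i steps, f_0 = F(i + 1)
        long next = f_0 + f_1;
        f_0 = f_1;
        f_1 = next;
    }
    return f_0;
}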
Another solution?

Recurrence Relations

Recurrence relations can be a useful way to specify the complexity of recursive functions. For example, the Fibonacci sequence is specified by the linear homogeneous recurrence relation:

F(n) = 1 if n = 1, 2;
F(n) = F(n − 1) + F(n − 2) otherwise

which specifies the sequence 1, 1, 2, 3, 5, 8, 13, ...

All linear homogeneous recurrence relations specify exponential functions, and in general we can define a closed form F(n) = Aα^n + Bβ^n for these recurrence equations. We can find a closed form for the recurrence relation as follows.

Suppose that F(n) = r^n. Then r^n = a_1 r^(n−1) + ... + a_k r^(n−k). We divide both sides of the equation by r^(n−k). Then r^k = a_1 r^(k−1) + ... + a_k. To find r we can solve the polynomial equation:

r^k − a_1 r^(k−1) − ... − a_k = 0.

There are k solutions, r_1, ..., r_k, to this equation, and each satisfies the recurrence:

F(n) = a_1 F(n − 1) + a_2 F(n − 2) + ... + a_k F(n − k).

We also have to satisfy the rest of the recurrence relation, F(1) = c_1 etc. To do this we can use a linear combination of the solutions, r_k^n. That is, we must find α_1, ..., α_k such that the initial conditions are satisfied.

For the Fibonacci recurrence the roots of the polynomial r^2 − r − 1 = 0 are

(−b ± √(b^2 − 4ac)) / 2a = (1 ± √5) / 2

and so the solution is

F(n) = A ((1 + √5)/2)^n + B ((1 − √5)/2)^n

If we substitute n = 1 and n = 2 into the equation we get

A = 1/√5,  B = −1/√5

Thus

F(n) = (1/√5) ((1 + √5)/2)^n − (1/√5) ((1 − √5)/2)^n
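As a quick numerical check of the closed form (a sketch, not from the notes), the Java below compares the formula against the directly computed sequence; the two agree to within floating-point error for moderate n:

public class FibCheck {
    public static void main(String[] args) {
        double sqrt5 = Math.sqrt(5.0);
        double phi = (1 + sqrt5) / 2;      // (1 + sqrt 5)/2
        double psi = (1 - sqrt5) / 2;      // (1 - sqrt 5)/2
        long fPrev = 1, fCur = 1;          // F(1), F(2)
        for (int n = 1; n <= 40; n++) {
            double closed = (Math.pow(phi, n) - Math.pow(psi, n)) / sqrt5;
            System.out.printf("F(%2d) = %9d   closed form = %.1f%n", n, fPrev, closed);
            long next = fPrev + fCur;      // advance the pair one place
            fPrev = fCur;
            fCur = next;
        }
    }
}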
What is an algorithm?
A computational problem: Travelling Salesman

Instance: A set of “cities” X together with a “distance” d(x, y) between any pair x, y ∈ X.

Question: What is the shortest circular route that starts and ends at a given city and visits all the cities?

An instance of Travelling Salesman is a list of cities, together with the distances between the cities, such as

X = {A, B, C, D, E, F}

        A  B  C  D  E  F
    A   0  2  4  ∞  1  3
    B   2  0  6  2  1  4
d = C   4  6  0  1  2  1
    D   ∞  2  1  0  6  1
    E   1  1  2  6  0  3
    F   3  4  1  1  3  0

An algorithm for Sorting

One simple algorithm for Sorting is called Insertion Sort. The basic principle is that it takes a series of steps such that after the i-th step, the first i objects in the array are sorted. Then the (i + 1)-th step inserts the (i + 1)-th element into the correct position, so that now the first i + 1 elements are sorted.

procedure INSERTION-SORT(A)
  for j ← 2 to length[A]
    do key ← A[j]
       ! Insert A[j] into the sorted sequence A[1 . . . j − 1]
       i ← j − 1
       while i > 0 and A[i] > key
         do A[i + 1] ← A[i]
            i ← i − 1
       A[i + 1] ← key
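The pseudocode translates directly into Java (a sketch, not from the notes; the 1-based pseudocode array becomes a 0-based Java array):

static void insertionSort(int[] a) {
    for (int j = 1; j < a.length; j++) {
        int key = a[j];
        // Insert a[j] into the sorted sequence a[0 .. j-1]
        int i = j - 1;
        while (i >= 0 && a[i] > key) {
            a[i + 1] = a[i];   // shift larger elements one place right
            i = i - 1;
        }
        a[i + 1] = key;
    }
}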
Evaluating Algorithms

There are many considerations involved in this question.

• Correctness
  1. Theoretical correctness
  2. Numerical stability

Correctness of insertion sort

Insertion sort can be shown to be correct by a proof by induction.

procedure INSERTION-SORT(A)
  for j ← 2 to length[A]
    do key ← A[j]
       ! Insert A[j] into the sorted sequence A[1 . . . j − 1]
       i ← j − 1
       while i > 0 and A[i] > key
         do A[i + 1] ← A[i]
            i ← i − 1
       A[i + 1] ← key
Proof by Induction

To show insertion sort is correct, let p(n) be the statement “after the nth iteration, the first n + 1 elements of the array are sorted”.

To show p(0) we simply note that a single element is always sorted.

Another proof technique you may need is proof by contradiction. Here, if you want to show some property p is true, you assume p is not true, and show this assumption leads to a contradiction (something we know is not true, like i < i).
Complexity of insertion sort

For simple programs, we can directly calculate the number of basic operations that will be performed:

procedure INSERTION-SORT(A)
1 for j ← 2 to length[A]
2   do key ← A[j]
      ! Insert A[j] into the sorted sequence A[1 . . . j − 1]
3     i ← j − 1
4     while i > 0 and A[i] > key
5       do A[i + 1] ← A[i]
6          i ← i − 1
7     A[i + 1] ← key

The block containing lines 2-7 will be executed length[A] − 1 times, and contains 3 basic operations.

In the worst case the block containing lines 5-7 will be executed j − 1 times, and contains 2 basic operations.

In the worst case the algorithm will take

(N − 1) · 3 + 2(2 + 3 + ... + N) = N² + 4N − 5

operations, where length[A] = N.

Correctness

An algorithm is correct if, when it terminates, the output is a correct answer to the given question.

Incorrect algorithms or implementations abound, and there are many costly and embarrassing examples:

• Intel’s Pentium division bug — a scientist discovered that the original Pentium chip gave incorrect results on certain divisions. Intel only reluctantly replaced the chips.

• USS Yorktown — after switching their systems to Windows NT, a “division by zero” error crashed every computer on the ship, causing a multi-million dollar warship to drift helplessly for several hours.

• Others...?
Types of Algorithm
Theoretical correctness
Numerical Stability

You can be fairly certain of exact results from a computer program provided all arithmetic is done with the integers Z = {. . . , −3, −2, −1, 0, 1, 2, 3, . . .} and you guard carefully against any overflow.

However the situation is entirely different when the problem involves real numbers, because there is necessarily some round-off error when real numbers are stored in a computer. A floating point representation of a number in base β with precision p is a representation of the form

d.ddddd × β^e

where d.ddddd has exactly p digits.

Accumulation of errors

Performing repeated calculations will take the small truncation errors and cause them to accumulate. The resulting error is known as roundoff error. If we are careful or lucky, the roundoff error will tend to behave randomly, both positive and negative, and the growth of error will be slow.

Certain calculations, however, vastly increase roundoff error and can cause errors to grow catastrophically to the point where they completely swamp the real result. Two particular operations that can cause numerical instability are

• Subtraction of nearly equal quantities
• Division by numbers that are nearly zero
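A small demonstration of the first of these (a sketch, not from the notes): in single precision, subtracting two nearly equal quantities leaves almost no correct digits.

public class Cancellation {
    public static void main(String[] args) {
        float eps = 1e-7f;
        // (1 + eps) - 1 should equal eps, but 1 + eps is rounded to the
        // nearest representable float before the subtraction happens:
        float x = (1.0f + eps) - 1.0f;
        System.out.println(x);   // prints 1.1920929E-7, an error of about 19%
    }
}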
Efficiency
Measuring time
Complexity

The complexity of an algorithm is a “device-independent” measure of how much time it consumes. Rather than expressing the time consumed in seconds, we attempt to count how many “elementary operations” the algorithm performs when presented with instances of different sizes.

The result is expressed as a function, giving the number of operations in terms of the size of the instance. This measure is not as precise as a benchmark, but much more useful for answering the kind of questions that commonly arise:

• I want to solve a problem twice as big. How long will that take me?
• We can afford to buy a machine twice as fast. What size of problem can we solve in the same time?

Example

Suppose you run a small business and have a program to keep track of your 1024 customers. The list of customers is changing frequently and you often need to sort it. Your two programmers Alice and Bob both come up with algorithms.

Alice presents an algorithm that will sort n names using 256n lg n comparisons and Bob presents an algorithm that uses n² comparisons. (Note: lg n ≡ log₂ n.)

Your current computer system takes 10⁻³ seconds to make one comparison, and so when your boss benchmarks the algorithms he concludes that clearly Bob’s algorithm is better.

Time taken (seconds):

Size   Alice   Bob
1024   2621    1049
Hardware improvement
Big-O notation

Our analysis of insertion sort showed that it took about n² + 4n − 5 operations, but this is more precise than necessary. As previously discussed, the most important thing about the time taken by an algorithm is its rate of growth. The fact that it is n²/2 rather than 2n² or n²/10 is considered irrelevant. This motivates the traditional definition of Big-O notation.

Definition A function f(n) is said to be O(g(n)) if there are constants c and N such that

f(n) ≤ cg(n)  ∀n ≥ N.

Thus by taking g(n) = n², c = 2 and N = 1 (since n² + 4n − 5 ≤ 2n² exactly when (n − 2)² + 1 ≥ 0) we conclude that the running time of Insertion Sort is O(n²), and moreover this is the best bound that we can find. (In other words Insertion Sort is not O(n) or O(n lg n).)

Big-Theta notation

Big-O notation defines an asymptotic upper bound for a function f(n). But sometimes we can define a lower bound as well, allowing a tighter constraint to be defined. In this case we use an alternative notation.

Definition A function f(n) is said to be Θ(g(n)) if there are constants c1, c2 and N such that

0 ≤ c1 g(n) ≤ f(n) ≤ c2 g(n)  ∀n ≥ N.

If we say that f(n) = Θ(n²) then we are implying that f(n) is approximately proportional to n² for large values of n.

See CLRS (section 3) for a more detailed description of the O and Θ notation.
Why is big-O notation useful?

An asymptotically better sorting algorithm
Merge-sort complexity

The complexity of Merge Sort can be shown to be Θ(n lg n). Merge Sort’s complexity can be described by the recurrence relation:

T(n) = 2T(n/2) + Θ(n).

The Master Theorem
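The statement of the Master Theorem does not survive at this point in the notes, but for this particular recurrence a direct unrolling already gives the bound (a sketch, assuming n is a power of 2 and writing cn for the Θ(n) term):

T(n) = 2T(n/2) + cn
     = 4T(n/4) + 2·c(n/2) + cn = 4T(n/4) + 2cn
     = 8T(n/8) + 3cn
     ...
     = nT(1) + cn lg n

Each of the lg n levels of the recursion contributes cn work in total, so T(n) = Θ(n lg n).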
Average case analysis

Inversions

An asymptotically worse algorithm

Quicksort is Θ(n²) in the worst case, but its average complexity is better than Merge-sort’s! (CLRS Chapter 7)

procedure QUICKSORT(A, p, r)
  if p < r
    then q ← PARTITION(A, p, r)
         QUICKSORT(A, p, q − 1)
         QUICKSORT(A, q + 1, r)

procedure PARTITION(A, p, r)
  x ← A[r]
  i ← p − 1
  for j ← p to r − 1
    do if A[j] ≤ x
         then i ← i + 1
              exchange A[i] ↔ A[j]
  exchange A[i + 1] ↔ A[r]
  return i + 1

Input size

The complexity of an algorithm is a measure of how long it takes as a function of the size of the input. For Sorting we took the number of items n as a measure of the size of the input.

This is only true provided that the actual size of the items does not grow as their number increases. As long as they are all some constant size K, then the input size is Kn. The actual value of the constant does not matter, as we are only expressing the complexity in big-O notation, which suppresses all constants.

But what is an appropriate input parameter for Travelling Salesman? If the instance has n cities, then the input itself has size Kn² — this is because we need to specify the distance between each pair of cities.

Therefore you must be careful about what parameter most accurately reflects the size of the input.
Travelling Salesman

Naive solution: Consider every permutation of the n cities, and compute the length of the resulting tour, saving the shortest length.

How long will this take? We count two main operations ...

Good Algorithms

Theoretical computer scientists use a very broad brush to distinguish between good and bad algorithms.

An algorithm is good if it runs in time that is a polynomial function of the size of the input, otherwise it is bad.
CITS3210 Algorithms
Graph Algorithms
Computer Science and Software Engineering, 2011
Notes by CSSE, Comics by xkcd.com

Overview

1. Introduction
2. Tree Search
- Breadth first search
- Depth first search
- Topological sort
- Kruskal’s algorithm
- Prim’s algorithm
- Implementations
- Priority first search
- Dijkstra’s algorithm
- Bellman-Ford algorithm
- Dynamic Programming
Isomorphisms

Consider the following two graphs:

[Diagram: two graphs, one with vertices labelled A, B, ..., the other with vertices labelled 1, 2, 3, ...]

What is a graph?

Example

V(G2) = all the airstrips in the world, and {x, y} ∈ E(G2) if there is a direct passenger flight from x to y.

V(G3) = all the people who have ever ...

[Diagram: the graph G4, drawn on the vertices 1 to 7.]

The graph G4 has 7 vertices and 9 edges.
Basic properties of graphs
Counting Exercises
Directed and weighted graphs

There are two important extensions to the basic definition of a graph.

Directed graphs In a directed graph, an edge is an ordered pair of vertices, and hence has a direction. In directed graphs, edges are often called arcs.

Directed Tree Each vertex has at most one directed edge leading into it, and there is one vertex (the root) which has a path to every other vertex.

Distance in weighted graphs

When talking about weighted graphs, we need to extend the concept of distance.

Definition In a weighted graph X a path

x = x0 ∼ x1 ∼ · · · ∼ xn = y

has weight

Σ_{i=0}^{n−1} w(xi, xi+1).

The shortest path between two vertices x and y is the path of minimum weight.
Representation of graphs

The adjacency list representation of G4:

1 : 2
2 : 1 3 5
3 : 2 5 6
4 : 5
5 : 2 3 4 6 7
6 : 3 5 7
7 : 5 6

This representation requires two list elements for each edge and therefore the space required is Θ(|V(G)| + |E(G)|).

The adjacency matrix of a graph G is a V × V matrix A where the rows and columns are indexed by the vertices and such that Aij = 1 if and only if vertex i is adjacent to vertex j.

For graph G4 we have the following adjacency matrix:

    0 1 0 0 0 0 0
    1 0 1 0 1 0 0
    0 1 0 0 1 1 0
A = 0 0 0 0 1 0 0
    0 1 1 1 0 1 1
    0 0 1 0 1 0 1
    0 0 0 0 1 1 0

The adjacency matrix representation uses Θ(V²) space.

For a sparse graph E is much less than V², and hence we would normally prefer the adjacency list representation. For a dense graph E is close to V² and the adjacency matrix representation is preferred. For small graphs or those without weighted edges it is often better to use the adjacency matrix representation anyway.

It is also easy and more intuitive to define adjacency matrix representations for directed and weighted graphs.

However your final choice of representation depends on precisely what questions you will be asking. Consider how you would answer the following questions in both representations (in particular, how much time it would take).

Is vertex v adjacent to vertex w in an undirected graph?

What is the out-degree of a vertex v in a directed graph?

What is the in-degree of a vertex v in a directed graph?
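In Java the two representations might be built as follows (a sketch, not from the notes; the vertex numbering and method names are illustrative only):

import java.util.ArrayList;
import java.util.List;

class GraphReps {
    // Adjacency lists: Theta(V + E) space.
    static List<List<Integer>> buildLists(int n, int[][] edges) {
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
        for (int[] e : edges) {          // each undirected edge appears
            adj.get(e[0]).add(e[1]);     // in exactly two lists
            adj.get(e[1]).add(e[0]);
        }
        return adj;
    }

    // Adjacency matrix: Theta(V^2) space, but an O(1) adjacency test.
    static boolean[][] buildMatrix(int n, int[][] edges) {
        boolean[][] a = new boolean[n][n];
        for (int[] e : edges) {
            a[e[0]][e[1]] = true;
            a[e[1]][e[0]] = true;
        }
        return a;
    }
}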
A third representation to consider is a recursive representation. In this representation you may not have access to a list of all vertices in the graph. Instead you have access to a single vertex, and from that vertex you can deduce the adjacent vertices.

The following java class is an example of such a representation:

abstract class Vertex{

Searching through a graph is one of the most fundamental of all algorithmic tasks, and therefore we shall examine several techniques for doing so.

Breadth-first search is a simple but extremely important technique for searching a graph. This search technique starts from a given vertex v and constructs a spanning tree for G, called the breadth-first tree. It uses a (first-in, first-out) queue as its main data structure.
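Returning to the recursive representation: the Vertex class above is cut off after its first line in these notes. A minimal sketch of what such a class might contain (the method name adjacent() is an assumption, not from the notes):

abstract class Vertex {
    // From a vertex we can reach its neighbours, even though no
    // global list of all vertices is available.
    abstract Vertex[] adjacent();   // hypothetical accessor
}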
Queues
Example of breadth-first search

[Diagrams: breadth-first search on the example graph, starting from vertex 1, one panel per step ("After visiting vertex 1", and so on). The queue grows as follows: 1, then 1 2, then 1 2 3 5, then 1 2 3 5 6, and finally 1 2 3 5 6 4 7, at which point every vertex of the connected component has been visited.]
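A breadth-first search in Java might look as follows (a sketch, not from the notes, assuming the adjacency-list representation from earlier; it returns the π array of the breadth-first tree):

import java.util.ArrayDeque;
import java.util.List;
import java.util.Queue;

class BFS {
    static int[] bfs(List<List<Integer>> adj, int s) {
        int n = adj.size();
        int[] pi = new int[n];                  // parent in the breadth-first tree
        boolean[] seen = new boolean[n];
        java.util.Arrays.fill(pi, -1);
        Queue<Integer> q = new ArrayDeque<>();  // first-in, first-out queue
        seen[s] = true;
        q.add(s);
        while (!q.isEmpty()) {
            int u = q.remove();                 // take the head of the queue
            for (int v : adj.get(u)) {
                if (!seen[v]) {                 // first time v is reached
                    seen[v] = true;
                    pi[v] = u;
                    q.add(v);
                }
            }
        }
        return pi;
    }
}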
Basic recursive depth-first search

The following recursive program computes the depth-first search tree for a graph G starting from the source vertex v.

To initialize the search we mark the colour of every vertex as white. Then we call the recursive routine DFS(v) where v is the source vertex.

procedure DFS(w)
  colour[w] ← grey
  for each vertex x adjacent to w do
    if colour[x] is white then
      π[x] ← w
      DFS(x)
    end if
  end for
  colour[w] ← black

At the end of this depth-first search procedure we have produced a spanning tree containing every vertex in the connected component containing v.

A Non-recursive DFS

All recursive algorithms can be implemented as non-recursive algorithms. A non-recursive DFS requires a stack to record the previously visited vertices.

procedure DFS(w)
  initialize stack S
  push w onto S
  while S not empty do
    x ← pop off S
    if colour[x] = white then
      colour[x] ← black
      for each vertex y adjacent to x do
        if colour[y] is white then
          push y onto S
          π[y] ← x
        end if
      end for
    end if
  end while
[Diagrams: step-by-step depth-first search on the example graph starting from vertex 1, and the resulting depth-first search tree.]
The parenthesis property

A depth-first search assigns to each vertex a discovery time, which is the time at which it is first discovered, and a finish time, which is the time at which all its neighbours have been searched and it no longer plays any further role in the search.

The discovery and finish times satisfy a property called the parenthesis property: if we write an opening parenthesis “(u” when vertex u is discovered and a closing one “u)” when it is finished, then the resulting expression is a well-formed expression with correctly nested parentheses.

For our example depth-first search we get:

(1 (2 (3 (4 (5 5) (6 (7 7) 6) 4) 3) 2) 1)

Depth-first search for directed graphs

A depth-first search on an undirected graph produces a classification of the edges of the graph into tree edges or back edges. For a directed graph, there are further possibilities. The same depth-first search algorithm can be used to classify the edges into four types:

tree edges If the procedure DFS(u) calls DFS(v) then (u, v) is a tree edge

forward edges If the procedure DFS(u) explores the edge (u, v) but finds that v is an already visited descendant of u, then (u, v) is a forward edge

back edges If the procedure DFS(u) explores the edge (u, v) but finds that v is an ancestor of u, then (u, v) is a back edge

cross edges All other edges are cross-edges
We shall consider a classic simple application of depth-first search.

Definition A directed acyclic graph (dag) is a directed graph with no directed cycles.

Theorem In a depth-first search of a dag there are no back edges.

Consider now some complicated process in which various jobs must be completed before others are started. We can model this by a graph D where the vertices are the jobs to be completed and there is an edge from job u to job v if job u must be completed before job v is started. Our aim is to find some linear ordering of the jobs such that they can be completed without violating any of the constraints.

This is called finding a topological sort of the dag D.

For example, consider this dag describing the stages of getting dressed and the dependency between items of clothing (from CLRS, page 550).

[Diagram: a dag on the items underpants, socks, trousers, shoes, watch, shirt, belt, tie, jacket.]

What is the appropriate linear order in which to do these jobs so that all the precedences are satisfied?
Doing the topological sort

Algorithm for TOPOLOGICAL SORT

The algorithm for topological sort is an extremely simple application of depth-first search.

Algorithm Apply the depth-first search procedure to find the finishing times of each vertex. As each vertex is finished, put it onto the front of a linked list.

[Diagram: the running example, a dag on the vertices A to P arranged in a grid.]
!" !"
2/27 9/24 10/23 30/31
"! "! ! ! "
B " F " J ! N
" !!
!
!
2/27 9/24 10/23 " !
!
!
" !!
!
! ! " !"
B " F " J ! N "! "
"
!!
!
!
!
!
" !!
!
! " !
!
!
" !
!
! " !!
" !! 1/28 11/12 ! 13/16 14/15
! !" " !
!
!
" !
!
! " !
"!
"
"
!
!
!!
! A E ! I " M
!
" !!
!
1/28 " 11/12 !
!
!
13/16 14/15
" !
A E ! I " M
As the vertices were placed at the front of a
linked list as they became finished the final
Notice that there is a component that has not
topological sort is: O − N − A − B − C − G − F −
been reached by the depth-first search. To
J −K −L−P −I −M −E−D−H
complete the search we just repeatedly perform
depth-first searches until all vertices have been
A topologically sorted dag has the property
examined.
that any edges drawn in the above diagram will
got from left-to-right.
Analysis and correctness

Time analysis of the algorithm is very easy — to the Θ(V + E) time for the depth-first search we must add Θ(V) time for the manipulation of the linked list. Therefore the total time taken is again Θ(V + E).

Proof of topological sort

Suppose DFS has calculated the finish times of a dag G = (V, E). For any pair of adjacent vertices u, v ∈ V (implying (u, v) ∈ E) we just need to show f[v] < f[u] (the destination vertex v must finish first).

For each edge (u, v) explored by DFS of G consider the colour of vertex v.

GREY: v can never be grey since v would therefore be an ancestor of u and so the graph would be cyclic.

WHITE: v is a descendant of u so we will set its time now, but we are still exploring u so we will set its finished time at some point in the future (and so therefore f[v] < f[u]). (Refer back to the pseudocode.)

BLACK: v has already been visited and so its finish time must have been set earlier, whereas we are still exploring u and so we will set its finish time in the future (and so again f[v] < f[u]).

Since for every edge in G there are two possible destination vertex colours and in each case we can show f[v] < f[u], we have shown that this property applies to every connected vertex in G.

See CLRS (theorem 22.11) for a more thorough treatment.
Depth-first search can be used on a graph to find all the articulation points in time Θ(V + E).

It is clear that finding a MST for W is the solution to this problem.
Kruskal’s method
After using edges of weight 2 / The final MST

[Diagrams: Kruskal’s algorithm on the example weighted graph — the forest after using the edges of weight 2, and the final minimum spanning tree.]
Prim’s algorithm
Problem solved?
Implementation of Kruskal
Can we improve the time of the merging operation?

However we can achieve something similar by just adjusting one pointer — suppose we simply change the pointer for the element 1, by making it point to 0 instead of itself.
The new data structure

[Diagram: the elements stored as a forest of trees, each element pointing to its parent; the root of each tree is the leader of its cell.]

This new improved merging has complexity only Θ(1). However we have now lost the ability to do the Find properly. In order to correctly find the leader of the cell containing an element we have to run through a little loop:

procedure Find(x)
  while x != π(x)
    x ← π(x)

There are two heuristics that can be applied to the new data structure, that speed things up enormously at the cost of maintaining a little extra data.

Let the rank of a root node of a tree be the height of that tree (the maximum distance from a leaf to the root).

The union-by-rank heuristic tries to keep the trees balanced at all times. When a merging operation needs to be done, the root of the shorter tree is made to point to the root of the taller tree. The resulting tree therefore does not increase its height unless both trees are the same height, in which case the height increases by one.
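Putting the pointer representation, Find and union-by-rank together in Java (a sketch, not from the notes):

class DisjointSets {
    int[] pi;    // pi[x] is x's parent pointer; a root points to itself
    int[] rank;  // rank of each root (the height of its tree)

    DisjointSets(int n) {
        pi = new int[n];
        rank = new int[n];
        for (int i = 0; i < n; i++) pi[i] = i;   // n singleton cells
    }

    int find(int x) {                 // follow pointers up to the leader
        while (x != pi[x]) x = pi[x];
        return x;
    }

    void union(int x, int y) {        // point the shorter tree at the taller
        int rx = find(x), ry = find(y);
        if (rx == ry) return;
        if (rank[rx] < rank[ry]) pi[rx] = ry;
        else if (rank[rx] > rank[ry]) pi[ry] = rx;
        else { pi[ry] = rx; rank[rx]++; }  // equal heights: height grows by one
    }
}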
The priority queue ADT

For Prim’s algorithm we repeatedly have to select the next vertex that is closest to the tree that we have built so far. Therefore we need some sort of data structure that will enable us to associate a value with each vertex (being the distance to the tree under construction) and rapidly select the vertex with the lowest value.

From our study of Data Structures we know that the appropriate data structure is a priority queue and that a priority queue is implemented by using a heap.

The operations associated with this data type include

insert(queue,entry,key) Places an entry with its associated key into the data structure

change(queue,entry,newkey) Changes the value of the key associated with a given entry

max(queue) Returns the element with the highest priority

extractmax(queue) Removes and returns the element with the highest priority
Heaps

A heap is a complete binary tree such that the key associated with any node is larger than (or equal to) the key associated with either of its children. This means that the root of the binary tree has the largest key.

[Diagram: a heap with 99 at the root; among the other keys are 15, 7, 55, 23, 5, 1, 6, 2, 50, 40 and 11.]

We can insert items into a heap, change the key value of an item in the heap, and remove the item at the root from a heap (always maintaining the heap property) in time O(log n), where n is the size of the heap.

A heap can be used to implement all priority queue operations in time O(log n).

Prim’s algorithm

It is now easy to see how to implement Prim’s algorithm.

We first initialize our priority queue Q to empty. We then select an arbitrary vertex s to grow our minimum spanning tree A and set the key value of s to 0. In addition, for each vertex v we maintain π[v], its prospective parent in the tree being grown.

Here we want low key values to represent high priorities, so we will rename our two last priority queue operations to min(queue) and extractmin(queue).

Next, we add each vertex v != s to Q and set the key value key[v] using the following criteria:

key[v] = weight(v, s) if (v, s) ∈ E
key[v] = ∞ otherwise
Each time an element is added to the priority queue Q, a heapify is carried out to maintain the heap property of Q. Since low key values represent high priorities, the heap for Q is so maintained that the key associated with any node is smaller (rather than larger) than the key of any of its children. This means that the root of the binary tree always has the smallest key.

We store the following information in the minimum spanning tree A: (v, key[v], π[v]). Thus, at the beginning of the algorithm, A = {(s, 0, undef)}.

At each stage of the algorithm:

1. We extract the vertex u that has the highest priority (that is, the lowest key value!). With the binary tree being heapified, u is simply the root of the tree.

2. We add (u, key[u], π[u]) to A and carry out extractmin(Q).

3. We then examine the neighbours of u. For each neighbour v, there are two possibilities:

(a) If v is already in the spanning tree A being constructed then we do not consider it further.

(b) If v is currently on the priority queue Q, then we see whether this new edge (u, v) should cause an update in the priority of v. If the value weight(u, v) is lower than the current key value of v, then we change key[v] to weight(u, v) and set π[v] = u. Note that each time the key value of a vertex in Q is updated, a heapify is carried out to maintain the heap property of Q.

At the termination of the algorithm, Q = ∅ and the spanning tree A contains all the edges, together with their weights, that span the tree.
Priority-first search
Complexity of Prim
The operation of PFS

After initialization the operation of PFS is as follows:

procedure PFS(s)
  change(s,0)
  while Q != ∅
    u ← extractmin(Q)
    for each v adjacent to u do
      if v ∈ Q ∧ PRIORITY < key[v] then
        π[v] ← u
        change(Q,v,PRIORITY)
      end if
    end for
  end while

It is important to notice how the array π is managed — for every vertex v ∈ Q with a finite key value, π[v] is the vertex not in Q that was responsible for the key of v reaching the highest priority it has currently reached.

Complexity of PFS

The complexity of this search is easy to calculate — the main loop is executed V times, and each extractmin operation takes O(lg V), yielding a total time of O(V lg V) for the extraction operations.

During all V operations of the main loop we examine the adjacency list of each vertex exactly once — hence we make E calls, each of which may cause a change to be performed. Hence we do at most O(E lg V) work on these operations.

Therefore the total is

O(V lg V + E lg V) = O(E lg V).
Shortest paths
Dijkstra’s algorithm

Dijkstra’s algorithm can be implemented as a priority-first search by taking the priority of a vertex v ∈ Q to be the shortest path from s to v that consists entirely of vertices in the priority-first search tree (except of course for v).

This can be implemented as a PFS by replacing PRIORITY with

key[u] + weight(u, v)

[Diagram: the example weighted graph used earlier.]
Proof of correctness
Relaxation
Relaxation schedules
Negative edge weights
Correctness of Bellman-Ford
All-pairs shortest paths

A dynamic programming method
We shall let d_ij^(m) denote the distance from vertex i to vertex j along a path that uses at most m edges, and define D^(m) to be the matrix whose ij-entry is the value d_ij^(m).

What is the smallest weight of the path from vertex i to vertex j that uses at most m edges? Now a path using at most m edges can either be ...
Computing D^(2)

Recall the method for computing d_ij^(m), the (i, j) entry of the matrix D^(m), using the method similar to matrix multiplication:

d_ij^(m) ← ∞
for k = 1 to V do
  d_ij^(m) = min(d_ij^(m), d_ik^(m−1) + w(k, j))
end for

Let us use ⊗ to denote this new matrix product. Then we have

D^(m) = D^(m−1) ⊗ A

Hence it is an easy matter to see that we can compute as follows: ...

The remaining matrices

Proceeding to compute D^(3) from D^(2) and A, and then D^(4) from D^(3) and A we get:

          0   4   8   2   5
          1   0   4   3   6
D^(3) =  10  14   0  12  15
          3   2   6   0   3
         16   ∞   6  18   0

          0   4   8   2   5
          1   0   4   3   6
D^(4) =  10  14   0  12  15
          3   2   6   0   3
         16  20   6  18   0
Complexity of this method
For the inductive step we assume that we have constructed already the matrix D^(k−1) and wish to use it to construct the matrix D^(k).

Let us consider all the paths from i to j whose intermediate vertices lie in {1, 2, . . . , k}. There are two possibilities for such paths:

(1) The path does not use vertex k
(2) The path does use vertex k

The shortest possible length of all the paths in category (1) is given by d_ij^(k−1), which we already know.

If the path does use vertex k then it must go from vertex i to k and then proceed on to j, and the length of the shortest path in this category is d_ik^(k−1) + d_kj^(k−1).

The overall algorithm is then simply a matter of running V times through a loop, with each entry being assigned as the minimum of two possibilities. Therefore the overall complexity of the algorithm is just O(V³).

D^(0) ← A
for k = 1 to V do
  for i = 1 to V do
    for j = 1 to V do
      d_ij^(k) = min(d_ij^(k−1), d_ik^(k−1) + d_kj^(k−1))
    end for j
  end for i
end for k

At the end of the procedure we have the matrix D^(V) whose (i, j) entry contains the length of the shortest path from i to j, all of whose vertices lie in {1, 2, . . . , V} — in other words, the shortest path in total.
Example

Consider the weighted directed graph with the following adjacency matrix:

          0   ∞  11   2   6
          1   0   4   ∞   ∞
D^(0) =  10   ∞   0   ∞   ∞
          ∞   2   6   0   3
          ∞   ∞   6   ∞   0

Let us see how to compute D^(1). To find the (2, 4) entry of this matrix we have to consider the paths through the vertex 1 — is there a path from 2 – 1 – 4 that has a better value than the current path? If so, then that entry is updated. Doing this for every entry gives

          0   ∞  11   2   6
          1   0   4   3   7
D^(1) =  10   ∞   0  12  16
          ∞   2   6   0   3
          ∞   ∞   6   ∞   0

The entire sequence of matrices

          0   ∞  11   2   6
          1   0   4   3   7
D^(2) =  10   ∞   0  12  16
          3   2   6   0   3
          ∞   ∞   6   ∞   0

          0   ∞  11   2   6
          1   0   4   3   7
D^(3) =  10   ∞   0  12  16
          3   2   6   0   3
         16   ∞   6  18   0

          0   4   8   2   5
          1   0   4   3   6
D^(4) =  10  14   0  12  15
          3   2   6   0   3
         16  20   6  18   0

          0   4   8   2   5
          1   0   4   3   6
D^(5) =  10  14   0  12  15
          3   2   6   0   3
         16  20   6  18   0
Summary
Summary (contd)

7. BFS visits the vertices nearest to the source first. It can be used to determine whether a graph is connected.

8. DFS visits the vertices furthest from the source first. It can be used to perform a topological sort.

10. Dijkstra’s method determines the shortest path between any two vertices in a directed graph so long as all the weights are non-negative.

12. Dynamic Programming is a general approach for solving problems which can be decomposed into sub-problems and where solutions to sub-problems can be combined to solve the main problem.

14. The minimum path problem can be used for motion planning of robots through large graphs using a priority first search.
Computer Science and Software Engineering, 2011
Notes by CSSE, Comics by xkcd.com

Flow networks

[Diagram: an example flow network with source s and sink t, each edge labelled with its capacity.]
A flow

A flow in a flow network is a function

f : V × V → R

that satisfies the following properties.

Capacity constraint For each edge (u, v),

f(u, v) ≤ c(u, v)

Skew symmetry For each pair of vertices u, v,

f(u, v) = −f(v, u)

Flow conservation For each vertex u ∈ V − {s, t},

Σ_{v∈V} f(u, v) = 0

The MAX FLOW problem

MAX FLOW
Instance. A flow network G with source s and sink t.
Question. What is the maximum flow from s to t?

The most convenient mental model for the network flow problem is to think of the edges of the capacity graph as representing pipelines of various capacities.

The source is to be viewed as a producer of some sort of fluid (maybe an oil well), and the sink as a consumer of some sort of fluid (maybe an oil refinery).
An example flow

[Diagram: a flow in the example network, each edge labelled with its flow.]

The value of a flow f is defined to be the total flow leaving the source vertex

|f| = Σ_{v∈V} f(s, v)

If we increase the flow from A to B by one unit then the new flow will be 4 units from A to B (same as −4 units from B to A), whereas if we increase the flow from B to A by one unit then the new flow will be −2 units from B to A (same as 2 units from A to B).

Consider the same flow, but this time also including the (original) capacities of the edges on the same diagram.

[Diagram: the same flow with each edge labelled flow/capacity, e.g. 2/4, 3/6, 2/9, 3/3.]

It is clear that some of the pipes have got some residual capacity in that they are not being fully used.

The residual network is the network where we just list the “unused capacities” of the pipes. Given a capacity graph G and a flow f the residual network is called Gf, where Gf has the same vertex set as G and capacities cf(u, v) given by

cf(u, v) = c(u, v) − f(u, v)

[Diagram: the residual network of the example flow.]
Augmenting flows

[Diagrams: an augmenting path in the residual network, and the example flow after pushing extra flow along it.]
The Ford-Fulkerson method is an iterative method for solving the maximum flow problem. It proceeds by starting with the zero valued flow (where f(u, v) = 0 for all u, v ∈ V).

At each stage in the method an augmenting path is found — that is, a path from s to t along which we may push some additional flow. Given an augmenting path, the bottleneck capacity b is the smallest residual capacity of the edges along the path.

[Diagrams: the residual network and the flow after augmenting along such a path.]
Cuts

An s,t-cut is a partition of V into two subsets S and T such that s ∈ S and t ∈ T.

[Diagram: the example network with a cut separating S from T.]

For a given flow network G = (V, E), a source vertex s and a sink vertex t, the Ford-Fulkerson method can be summarised as follows:

Ford-Fulkerson(G, s, t)
  for each edge (u, v) ∈ E do
    f(u, v) ← 0
    f(v, u) ← 0
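The remainder of the pseudocode is cut off in these notes. The standard continuation of the method, in the same style, is roughly:

  while there is an augmenting path p from s to t in the residual network Gf do
    b ← the bottleneck capacity of p
    for each edge (u, v) on p do
      f(u, v) ← f(u, v) + b
      f(v, u) ← −f(u, v)
  return f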
[Diagram: the example flow with a cut; the flow across the cut is computed edge by edge.]

The flow across the cut is ...

Now let us compute the flow across a different cut.

[Diagram: the same flow with a different cut.]

The flow across this cut is 3 + 3 + 1 = 7.

Theorem. The flow across every cut has the same value.

In order to prove this theorem our strategy will be to show that moving a single vertex from one side of a cut to the other does not affect the flow across that cut.

This will then show that any two cuts have the same flow across them because we can shift any number of vertices from one side of the cut to the other without affecting the flow across the cut.

Proof. Suppose S, T is a cut such that u ∈ S. We show that we can move u to T without altering the flow across the cut by considering

f(S, T) − f(S − {u}, T + {u})
Proof continued

The contribution that vertex u makes to the flow f(S, T) is

Σ_{w∈T} f(u, w)

whereas the contribution it makes to the flow f(S − {u}, T + {u}) is

Σ_{w∈S} f(w, u)

Therefore

f(S, T) − f(S − {u}, T + {u})
  = Σ_{w∈T} f(u, w) − Σ_{w∈S} f(w, u)
  = Σ_{w∈T} f(u, w) + Σ_{w∈S} f(u, w)   (by skew symmetry)
  = Σ_{w∈V} f(u, w)
  = 0                                   (by flow conservation)

Minimum cut

For any s,t-cut S, T it is clear that

f(S, T) ≤ c(S, T)

Therefore the value of the flow is at most the capacity of the cut. Therefore we can consider the cut with the lowest possible capacity — the minimum cut, and it is clear that the capacity of this cut is an upper bound for the maximum flow. Therefore

max flow ≤ min cut

Example

For our example, the cut S = V − {t}, T = {t} has capacity 7, so the maximum flow has value at most 7. (As we have already found a flow of value 7 we can be sure that this is indeed the maximum.)
Max-flow min-cut theorem

(That max flow ≤ min cut is obvious, from the previous section.)

The max-flow min-cut theorem is an instance of duality that is used in linear optimization.
Let us find the minimum cut in our previous example.

[Diagram: the example network with its minimum cut marked.]

The main significance of the max-flow min-cut theorem is that it tells us that if our current flow is not the maximum flow, then we are guaranteed that there will be an augmenting path.
Complexity of Ford-Fulkerson

Therefore the complexity is O(E|f∗|) where |f∗| is the value of the maximum flow.

Improving the performance

• Always augment by a path of maximum bottleneck capacity
The second heuristic

Edmonds and Karp’s second heuristic produces an asymptotic complexity which is independent of the edge capacities.

Analysis

We can view the Edmonds-Karp heuristic as operating in several “stages” where each stage deals with all the augmenting paths of a given length.
Applications of network flow

One interesting application of network flow is to solve the bipartite matching problem.

A matching in a graph G is a set of edges that do not share any vertices.

Bipartite graph

A bipartite graph is an undirected graph G = (V, E) in which V can be partitioned into two sets V1 and V2 such that (u, v) ∈ E implies either u ∈ V1 and v ∈ V2 or u ∈ V2 and v ∈ V1. That is, all edges go between the two sets V1 and V2.
[Diagram: a bipartite graph with V1 = {A, B, C, D, E} and V2 = {F, G, H, I}, extended with a source s joined to every vertex of V1 and a sink t joined to every vertex of V2. Assume all edge capacities are 1.]

The stable marriage problem is not a network flow problem, but it is a matching problem. The scenario is as follows:

We are given two sets, VM and VF (male and female) of the same size. Also, every man v ∈ VM ranks every woman u ∈ VF and every woman u ∈ VF ranks every man in VM.

We will write u <v u′ if v would rather marry u than u′.

The stable marriage problem is to find a matching E ⊂ VM × VF such that if (v, u) and (w, z) are in E, then either u <v z or w <z v.
Solution

The Gale-Shapley algorithm is a solution to the stable marriage problem that involves a number of rounds. Heuristically, each round every “unengaged” man proposes to his most preferred woman that he has not already proposed to. If this woman is unengaged or engaged to someone she prefers less than her new suitor, she breaks off her current engagement and accepts the new proposal.

Pseudo-Code

procedure StableMarriage(VM, VF, P)
  E ← ∅
  while |E| < n
    for each v ∈ VM where ∀u, (v, u) ∉ E
      u ← v’s next preference
      if (w, u) ∈ E and v <u w
        E ← E − {(w, u)} ∪ {(v, u)}
      else if ∀w, (w, u) ∉ E
        E ← E ∪ {(v, u)}
  return E
Computational Geometry
Computer Science and Software Engineering, 2011

Computational Geometry is the study of algorithms for solving geometric problems. This has applications in many areas such as graphics, robotics, molecular modelling, forestry, statistics, meteorology... basically any field that stores data as a set of points in space.
Geometric Problems

Is this point on a line?

Do any of these lines intersect?

It turns out that surprisingly little mathematics is required for the algorithms we will be using, although some basic linear algebra is assumed. (You should know how to add and subtract points in the plane, perform scalar multiplication, calculate Euclidean distances and solve systems of linear equations.)
The Cross-Product

In fact the geometric mathematics in the problems we have mentioned can generally be reduced to the question:

With respect to point A, is point B to the left or right of point C?

Suppose p1 = (x1, y1) and p2 = (x2, y2) are two vectors (i.e. lines that start at the origin and end at the given point). The cross product of these two vectors is the (3D) vector that is perpendicular to both vectors, and has magnitude

p1 × p2 = x1y2 − x2y1

which in turn is equal to the signed area of the parallelogram with edges p1 and p2.

Simplifying the cross product

The mathematical definition and properties of the cross product are all very interesting, but the only thing we need to worry about is the sign of p1 × p2: if it is positive, then p1 is to the right of p2; if it is 0, p1 and p2 are on the same line; and if it is negative p1 is to the left of p2 (all with respect to the origin).

To this end, let’s define a function for the direction of an angle

Dir(p0, p1, p2) = (p1 − p0) × (p2 − p0)

The only thing you need to be careful of is that you get the order of the points right.

[Diagram: the two cases Dir(p0, p1, p2) > 0 and Dir(p0, p1, p2) < 0.]
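In Java, Dir reduces to a couple of subtractions and multiplications (a sketch, not from the notes; points are represented as double[2] arrays for brevity):

// Sign of the cross product (p1 - p0) x (p2 - p0): positive means p1
// is to the right of p2 (with respect to p0, per the convention above),
// zero means the three points are collinear, negative means to the left.
static double dir(double[] p0, double[] p1, double[] p2) {
    double x1 = p1[0] - p0[0], y1 = p1[1] - p0[1];
    double x2 = p2[0] - p0[0], y2 = p2[1] - p0[1];
    return x1 * y2 - x2 * y1;
}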
A related concept is the dot product of two vectors v = (x1, y1) and u = (x2, y2), which is calculated as x1x2 + y1y2.

The dot product is the length of the vector v when it is projected orthogonally onto the vector u, multiplied by the length of u (or vice versa).

The next question we can answer is whether two lines, (p0, p1) and (p2, p3), intersect. (To simplify the code we will suppose intersect means the lines cross properly, rather than touch at an end-point, and the endpoints of each line are sorted lexicographically.)

procedure Crosses(p0, p1, p2, p3)
  d ← Dir(p0, p1, p2) × Dir(p0, p1, p3)
  if d < 0
    then return true
    else return false
Events

There are two types of events we must consider: when we encounter the first point of a line segment, and when we encounter the second point of a line segment:

• When we encounter the first point of a line segment, we insert the line segment into our ordered list of line segments. When we do this we should check to see if the new line segment intersects the line segment directly above it or below it in the list.

• When we encounter the end point of a line segment, we remove the segment from the ordered list. This will cause the line segments above and below the removed segment to become adjacent, so we must check if they intersect.

Pseudo-code

Suppose that S = {s1, s2, ...sn} is a set of line segments, where si = (pi, qi), and let T be an ordered list.

Sort the set of points {pi, qi | si ∈ S}
for each point r in the sorted list
  do if r = pi
       then INSERT(T, si)
            if ABOVE(T, si) exists and intersects si
              then return true
            if BELOW(T, si) exists and intersects si
              then return true
     else if r = qi
       then if ABOVE(T, si) and BELOW(T, si) intersect
              then return true
            DELETE(T, si)
return false
Correctness

1. If two line segments si and sj intersect, there is some event point x, such that at x si is next to sj in the list (say si and sj are neighbours).

2. In between event points, the only way new neighbours may arise is if already neighbouring line segments intersect.

From the previous statements we are able to conclude the algorithm is correct. However:

• What if lines start and end at the same sweep line? Does the proof still work?

• What simplifying assumption does this algorithm make?
Complexity

• Then, for each point we need to either:

1. insert the corresponding segment into an ordered list (finding the segments above and below) O(lg N), and calculate whether the segments intersect O(1); or

2. find the segments above and below (O(lg N)), calculate whether they intersect (O(1)), and delete the segment from the ordered list (O(lg N)).

This gives overall performance of O(N lg N).

Note that the insertions, deletions and find operations on the ordered list require an efficient implementation, such as a Red-Black tree (java.util.TreeMap) or an AVL tree.

A shape is convex if any line segment between two points inside the shape remains inside the shape.

The convex hull algorithm we will examine is known as Graham’s scan. Like the all-segments intersection algorithm, it is based on a sweep, but this time it is a rotational sweep, rather than a linear sweep.

The algorithm is based on the fact that you can walk clockwise round the perimeter of a convex hull, and every point of the set will always be on your right.
Pseudo-code

[Diagram: a set of points 0 to 9 with point 0 the left-most, the others numbered in sorted angular order around it.]

procedure Graham-Scan(P)
  Find the left-most point p0 in P
  Sort the set of points P − {p0}
    according to their angle around p0
  for each point p in the sorted list
    do if |S| = 1
         then PUSH(S, p)
         else q1 ← POP(S), q0 ← POP(S)
              while DIR(q0, q1, p) < 0
                do q1 ← q0, q0 ← POP(S)
              PUSH(S, q0), PUSH(S, q1), PUSH(S, p)
  RETURN S.
Correctness (sketch)

The algorithm returns a stack of points. If we list these points in order (wrapping back to the start vertex) we get the edges of a polygon.

We now must show two things:

1. The polygon is convex: This follows from the fact that the algorithm ensures that each corner of the polygon turns right (for a clockwise direction).

2. The polygon contains every point in P: Every point p is added to S, and points are only removed if we find an edge p1p2 such that the triangle p0p1p2 contains p. As p0, p1, p2 will then appear in the stack, p will be contained in the polygon.

Complexity

We can find the left-most point in time O(n).

We can sort the points according to their angle around p0 in time O(n lg n). Note we do not have to calculate the angle to do this. We can just do a normal sort, but instead of using < for comparison, we can use the DIR function.

The algorithm then has two nested loops each potentially going through n iterations. However, we may note that the inner loop is popping elements off the stack. As each element is added to the stack exactly once, this operation cannot be performed more than n times.

Therefore the total complexity is O(n lg n)
Closest pair of points
Checking Points Across Partitions

If we have solved the Closest Pair of Points problem for PL and PR then we know the minimum distance between any pair of points in either partition. Let this distance be δ. Therefore, we only need to check if points within a δ-width strip on either side of the divide are closer.

Furthermore, we know all the points on either side of the 2δ-width strip must be at least a distance of δ from one another. We can use this fact to show that each point in the strip only needs to be compared to the 5 subsequent points (ordered from top to bottom).

Pseudo-code

Let P = {p1, ..., pn} be a set of points. We will just give the method for finding the closest distance. We will assume that P is sorted by x-coordinate, and also that we have access to a copy of P, Q, that is sorted by y-coordinate (with an inverse index).

procedure ClosestPair(P)
  Split P into PL (the n/2 leftmost points)
    and PR (the other points)
  δ = min{ClosestPair(PL), ClosestPair(PR)}
  For each point p in P
    if pˣ is within δ of pˣ_{n/2}
      then add p to A
  For each point qi in A in order of qiʸ
    For j = 1, ..., 5
      δ = min{δ, DIST(qi, qi+j)}
  RETURN δ
Analysis

The divide and conquer strategy is easily seen to be valid, however the algorithm described above uses a number of optimizations, such as presorting and examining a relatively small set of pairs of points. See CLRS chapter 33 for a justification of these optimizations.

The complexity of O(n lg n) can be shown using the recurrence:

T(n) = 2T(n/2) + O(n)

(see for example merge-sort). What optimizations do we require to ensure that the divide and merge can be performed in time O(n)?

Point inside a polygon

Suppose we are given a polygon (as a set of points P = (p0, p1, ...pn) where p0 = pn) and we are required to determine whether a point q is inside the polygon or not.

This algorithm is relatively simple. We take a line segment starting at q and ending at a point we know to be outside the polygon. Then we count the number of times the line segment crosses an edge of the polygon. Every time it crosses an edge, the line segment goes from inside the polygon to outside, or from outside the polygon to inside. Therefore if there are an odd number of such crossings, the point is inside, otherwise it is outside.
procedure Pt-In-Poly(P, q)
  Find a point r outside the polygon
    e.g. ((minX) − 1, 0)
  set c ← 0
  for i = 1 to n
    do if Intersects(r, q, pi−1, pi)
         then c ← c + 1
  if c is even return false
  else return true.

[Diagram: a polygon with a ray from q crossing its edges; the crossings are numbered 1 to 5.]

You should be careful how the Intersects method treats lines touching at a point, or an interval. The intersects method provided in these notes requires that the lines properly cross, but it still requires a careful treatment, in the rare case that the ray passes through the corner of a polygon, to be appropriate in this context.
Summary
We have examined:
• 2D Geometric objects.
• Sweep lines:
– Convex hull.
Pattern Matching
Computer Science and Software Engineering, 2011

Overview
Matches

String-matching clearly has many important applications — text editing programs being only the most obvious of these. Other applications include searching for patterns in DNA sequences or searching for particular patterns in bit-mapped images.

We consider the following problem. Suppose T is a string of length n over a finite alphabet Σ, and that P is a string of length m over Σ. The pattern-matching problem is to find occurrences of P within T. Analysis of the problem varies according to whether we are searching for all occurrences of P or just the first occurrence of P.

We can describe a match by giving the number of characters s that the pattern must be shifted along the text in order for every character in the shifted pattern to match the corresponding text characters. We call this number a valid shift.

For example, suppose that we have Σ = {a, b, c} and

T = abaaabacccaabbaccaababacaababaac
P = aab

Our aim is to find all the substrings of the text that are equal to aab:

abaaabacccaabbaccaababacaababaac
   aab
          aab
                 aab
                        aab

The naive string matcher checks every shift in turn. When s = 0 we compare

abaaabacccaabbaccaababacaababaac
aab

which fails at the second character of the pattern. When s = 1 we have the next alignment, and so on; eventually this will succeed when s = 3. Since each of the n − m + 1 shifts may need up to m character comparisons, in the worst case this takes

m(n − m + 1)

comparisons.

The naive string matcher is inefficient because when it checks the shift s it makes no use of any information that might have been found earlier (when checking previous shifts). ... then it is clear that no shift s ≤ 9 can possibly work.
Rabin-Karp algorithm

The naive algorithm basically consists of two nested loops — the outermost loop runs through all the n − m + 1 possible shifts, and for each such shift the innermost loop runs through the m characters seeing if they match.

Rabin and Karp propose a modified algorithm that tries to replace the innermost loop with a single comparison as often as possible.

Consider the following example, with the alphabet being decimal digits.

122938491281760821308176283101
176

Suppose now that we have computer words that can store decimal numbers of size less than 1000 in one word (and hence compare such numbers in one operation). Then we can view the entire pattern as a single decimal number and the substrings of the text of length m as single numbers.

Rabin-Karp continued

Thus to try the shift s = 0, instead of comparing

1−7−6

against

1−2−2

character by character, we simply do one operation comparing 176 against 122.

It takes time O(m) to compute the value 176 from the string of characters in the pattern P.

However it is possible to compute all the n − m + 1 decimal values from the text just in time O(n), because it takes a constant number of operations to get the “next” value from the previous. To go from 122 to 229 only requires dropping the 1, multiplying by 10 and adding the 9.
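A sketch of the rolling computation in Java (not from the notes, and ignoring for now the overflow problem addressed below; in practice every step would also be taken mod q):

static void windowValues(String text, int m) {
    int pow = (int) Math.pow(10, m - 1);        // weight of the leading digit
    int t = Integer.parseInt(text.substring(0, m));
    System.out.println(t);                      // value for shift s = 0
    for (int s = 1; s <= text.length() - m; s++) {
        // drop the leading digit, multiply by 10, add the next digit
        t = (t - pow * (text.charAt(s - 1) - '0')) * 10
              + (text.charAt(s + m - 1) - '0');
        System.out.println(t);                  // value for shift s
    }
}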
Rabin-Karp formalized

Being a bit more formal, let P[1..m] be an array holding the pattern and T[1..n] be an array holding the text.

We define the values ...

But what if the pattern is long?

This algorithm works well, but under the unreasonable restriction that m is sufficiently small that the values p and {t_s | 0 ≤ s ≤ n − m} all fit into a single word.

If the alphabet is not decimal, but in fact has size d, then we can simply regard the values as d-ary integers and proceed as before.

Again it is easy to see that t′_{s+1} can be computed from t′_s in constant time.
The whole algorithm

If t′_s ≠ p′ then the shift s is definitely not valid, and can thus be rejected with only one comparison. If t′_s = p′ then either t_s = p and the shift s is valid, or t_s ≠ p and we have a spurious hit.

Example

Suppose we have the following text and pattern

5 4 1 4 2 1 3 5 6 2 1 4 1 4
4 1 4

Suppose we use the modulus q = 13; then p′ = 414 mod 13 = 11.
The states will be

Q = {0, 1, . . . , m}

where the state i corresponds to Pi, the leading substring of P of length i.

The start state q0 = 0 and the only accepting state is m.

[Diagram: the partially specified automaton for the pattern abbabaa:]

0 –a→ 1 –b→ 2 –b→ 3 –a→ 4 –b→ 5 –a→ 6 –a→ 7

This is only a partially specified automaton, but it is clear that it will accept the pattern P.

We will specify the remainder of the automaton so that it is in state i if the last i characters read match the first i characters of the pattern.

Now suppose, for example, that the automaton is given the string

a b b a b b · · ·

The first five characters match the pattern, so the automaton moves from state 0, to 1, to 2, to 3, to 4 and then 5. After receiving the sixth character b which does not match the pattern, what state should the automaton enter?

As we observed earlier, the longest suffix of this string that is a prefix of the pattern abbabaa has length 3, so we should move to state 3, indicating that only the last 3 characters read match the beginning of the pattern.
We can express this more formally: if the machine is in state q and receives a character c, then the next state should be q′, where q′ is the largest number such that Pq′ is a suffix of Pq c.

Applying this rule we get the following finite state automaton to match the string abbabaa.

[Figure: the complete automaton for abbabaa, with the failure transitions added to the chain of states 0 through 7.]

The automaton has the following transition function:

q   a  b
0   1  0
1   1  2
2   1  3
3   4  0
4   1  5
5   6  3
6   7  2
7   1  2

Use it on the following string

ababbbabbabaabbabaabaaabababbabaabbabbaa

Character   a  b  a  b  b  b  a  b  b  a  b
Old state   0  1  2  1  2  3  0  1  2  3  4
New state   1  2  1  2  3  0  1  2  3  4  5
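The table can be built mechanically from the rule above. A Python sketch (our names; this direct construction costs O(m³|Σ|), which is acceptable for short patterns):

    def build_dfa(pattern, alphabet):
        """delta[q][c] = largest k such that P_k is a suffix of P_q c."""
        m = len(pattern)
        delta = [{} for _ in range(m + 1)]
        for q in range(m + 1):
            for c in alphabet:
                k = min(m, q + 1)
                while k > 0 and not (pattern[:q] + c).endswith(pattern[:k]):
                    k -= 1
                delta[q][c] = k
        return delta

    def matches(text, pattern, alphabet):
        """Yield the end position of every match, running the automaton over text."""
        delta, q, m = build_dfa(pattern, alphabet), 0, len(pattern)
        for i, c in enumerate(text):
            q = delta[q][c]
            if q == m:
                yield i

build_dfa("abbabaa", "ab") reproduces the transition table above, and running matches over the test string reports the two matches, ending at positions 12 and 32.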
The KMP algorithm
The Boyer-Moore algorithm

If a mismatch is found, then the shift s is not valid, and we try the next possible shift by setting

s ← s + 1

and starting the testing loop again.

The two heuristics both operate by providing a number other than 1 by which the current shift can be incremented without missing any matches.

For example, as soon as we detect the bad character i we know immediately that the next shift must be at least 6 places, or the i will simply not match. Notice that advancing the shift by 6 places means that 6 text characters are not examined at all.
The bad character heuristic

If a mismatch is detected when scanning position j of the pattern (remember we are going from right-to-left, so j goes from m down to 1), the bad character heuristic proposes advancing the shift by

s ← s + (j − λ(T[s + j]))

where λ(c) is the position of the last (rightmost) occurrence of the character c in the pattern. Notice that the bad-character heuristic might occasionally propose altering the shift to the left, so it cannot be used alone.

The good suffix heuristic

The characters of the text that do match with the pattern are called the good suffix. In this case the good suffix is ed. Any shift of the pattern cannot be valid unless it matches at least the good suffix that we have already found. In this case we must move the pattern at least 4 spaces in order that the ed at the beginning of the pattern matches the good suffix.
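The last occurrence function is easy to precompute. A sketch (our names; positions run from 1 to m as in the notes):

    def last_occurrence(pattern, alphabet):
        """lam[c] = position of the rightmost occurrence of c in pattern, or 0."""
        lam = {c: 0 for c in alphabet}
        for j, c in enumerate(pattern, start=1):
            lam[c] = j
        return lam

    # On a mismatch at pattern position j against text character c = T[s + j],
    # advance by s <- s + max(1, j - lam[c]); taking the max with 1 is the usual
    # guard against the leftward shifts mentioned above.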
Example

What is the last occurrence function λ?

o n e _ s h o n e _ t h e _ o n e _ p h o n e
o n e _ s h o n e _ t h e _ o n e

so γ(20) = 6.

Example continued

o n e _ s h o n e _ t h e _ o n e _ p h o n e
o n e _ s h o n e

so γ(19) = 14.

What about γ(18)? What is the smallest shift that can match the characters p h o n e? A shift of 20 will match all those that are still left.

o n e _ s h o n e _ t h e _ o n e _ p h o n e
o n e

This then shows us that γ(j) = 20 for all j ≤ 18, so

γ(j) = 6     for 20 ≤ j ≤ 22
γ(j) = 14    for j = 19
γ(j) = 20    for 1 ≤ j ≤ 18

LONGEST COMMON SUBSEQUENCE

Instance: Two sequences X and Y.
Question: What is a longest common subsequence of X and Y?

Example

If

X = ⟨A, B, C, B, D, A, B⟩

and

Y = ⟨B, D, C, A, B, A⟩

then a longest common subsequence is either

⟨B, C, B, A⟩ or ⟨B, D, A, B⟩
A recursive relationship

As is usual for dynamic programming problems, we start by finding an appropriate recursion, whereby the problem can be solved by solving smaller subproblems.

Suppose that

X = ⟨x1, x2, . . . , xm⟩
Y = ⟨y1, y2, . . . , yn⟩

and that they have a longest common subsequence

Z = ⟨z1, z2, . . . , zk⟩

If xm = yn then zk = xm = yn and Zk−1 is a LCS of Xm−1 and Yn−1. Otherwise Z is either a LCS of Xm−1 and Y or a LCS of X and Yn−1. (This depends on whether zk ≠ xm or zk ≠ yn respectively — at least one of these two possibilities must arise.)

A recursive solution

This can easily be turned into a recursive algorithm as follows. Given the two sequences X and Y we find the LCS Z as follows:

If xm = yn then find the LCS Z′ of Xm−1 and Yn−1 and set Z = Z′xm.

If xm ≠ yn then find the LCS Z1 of Xm−1 and Y, and the LCS Z2 of X and Yn−1, and set Z to be the longer of these two.

It is easy to see that this algorithm requires the computation of the LCS of Xi and Yj for all values of i and j. We will let l(i, j) denote the length of the longest common subsequence of Xi and Yj. Then we have the following relationship on the lengths:

l(i, j) = 0                                 if ij = 0
l(i, j) = l(i − 1, j − 1) + 1               if xi = yj
l(i, j) = max(l(i − 1, j), l(i, j − 1))     if xi ≠ yj
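Filling the table of lengths and tracing it back can be written directly from this recurrence. A Python sketch (our names):

    def lcs(X, Y):
        m, n = len(X), len(Y)
        l = [[0] * (n + 1) for _ in range(m + 1)]   # l[i][j] = LCS length of Xi, Yj
        for i in range(1, m + 1):
            for j in range(1, n + 1):
                if X[i - 1] == Y[j - 1]:
                    l[i][j] = l[i - 1][j - 1] + 1
                else:
                    l[i][j] = max(l[i - 1][j], l[i][j - 1])
        Z, i, j = [], m, n                           # trace back from l(m, n);
        while i > 0 and j > 0:                       # diagonal steps emit elements
            if X[i - 1] == Y[j - 1]:
                Z.append(X[i - 1]); i -= 1; j -= 1
            elif l[i - 1][j] >= l[i][j - 1]:
                i -= 1
            else:
                j -= 1
        return "".join(reversed(Z))

With the sequences of the worked example below, lcs("01101001", "110110") returns "11010".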
Finding the LCS

The LCS can be found (in reverse) by tracing the path of the arrows from l(m, n). Each diagonal arrow encountered gives us another element of the LCS.

As l(8, 6) points to l(7, 6), we know that the LCS of X and Y is the LCS of X7 and Y6.

Now l(7, 6) has a diagonal arrow, pointing to l(6, 5), so in this case we have found the last entry of the LCS — namely it is x7 = y6 = 0.

Then l(6, 5) points (upwards) to l(5, 5), which points diagonally to l(4, 4), and hence 1 is the second-last entry of the LCS.

Proceeding in this way, we find that the LCS is 11010.

Notice that if at the very final stage of the algorithm (where we had a free choice) we had chosen to make l(8, 6) point to l(8, 5) we would have found a different LCS, 11011.

We can trace back the arrows in our final array, in the manner just described, to determine that the LCS is 11010 and see which elements within the two sequences match.

 i  xi \ j    0    1    2    3    4    5    6
        yj:        1    1    0    1    1    0
 0            0    0    0    0    0    0    0
 1   0        0   ↑0   ↑0   ↖1   ←1   ←1   ↖1
 2   1        0   ↖1   ↖1   ↑1   ↖2   ↖2   ←2
 3   1        0   ↖1   ↖2   ←2   ↖2   ↖3   ←3
 4   0        0   ↑1   ↑2   ↖3   ←3   ↑3   ↖4
 5   1        0   ↖1   ↖2   ↑3   ↖4   ↖4   ↑4
 6   0        0   ↑1   ↑2   ↖3   ↑4   ↑4   ↖5
 7   0        0   ↑1   ↑2   ↖3   ↑4   ↑4   ↖5
 8   1        0   ↑1   ↖2   ↑3   ↖4   ↖5   ↑5

A match occurs whenever we encounter a diagonal arrow along the reverse path.

See section 15.4 of CLRS for the pseudo-code for this algorithm.
After initialization we simply fill in the mn entries in the table, with each entry costing only a constant number of comparisons. Therefore the cost to produce the table is Θ(mn). Following the trail back to actually find the LCS takes time at most O(m + n), and therefore the total time taken is Θ(mn).

Data compression

Data compression algorithms are used by programs such as WinZip, pkzip and zip. They are also used in the definition of many data formats such as pdf, jpeg, mpeg and .doc.

Data compression algorithms can either be lossless (e.g. for archiving purposes) or lossy (e.g. for media files).
Simplification

Notice that there are many more occurrences of 0 and 1 than the other characters.

A good code

The string is encoded as

0000011101001000100100000001

(28 bits) using the fixed length code, and as

10110001010000111011010

(23 bits) using the variable length code.
Prefix codes

[Figure: a binary code tree; each left edge is labelled 0 and each right edge 1, and the symbols 8, 1, 3, 7, 6, 4, 2, 5, 9 sit at the leaves.]

Then the number of bits required to encode a file is

B(T) = Σc∈C f(c) dT(c)

which we define as the cost of the tree T.
"$$$
0:2 d=1 every node is either a leaf or has precisely two
"
"" $$
"" $$
"" $ children.
d=2
!# !#
! # ! #
! # ! #
8:1 1:1 d=3 Therefore if we are dealing with an alphabet of
!#
! #
!#
! # s symbols we can be sure that our tree has
! # ! #
3:0 7:1 d=4 precisely s leaves and s − 1 internal nodes, each
!!#
# !!##
! # ! # with two children.
6:0 4:1 2:0 d=5
!!#
#
! # Huffman invented a greedy algorithm to
5:0 9:1 d=6
construct such an optimal tree.
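A compact rendering of Huffman's algorithm in Python, using a heap to extract the two lowest-frequency trees at each step (a sketch; the names and the sample frequencies are ours):

    import heapq
    from itertools import count

    def huffman(freq):
        """freq: symbol -> frequency. Returns symbol -> codeword, built bottom-up."""
        order = count()                       # tie-breaker so equal frequencies compare
        heap = [(f, next(order), (sym,)) for sym, f in freq.items()]
        heapq.heapify(heap)
        code = {sym: "" for sym in freq}
        while len(heap) > 1:
            f1, _, t1 = heapq.heappop(heap)   # merge the two cheapest trees,
            f2, _, t2 = heapq.heappop(heap)   # prepending one more bit to every
            for sym in t1:                    # codeword beneath them
                code[sym] = "0" + code[sym]
            for sym in t2:
                code[sym] = "1" + code[sym]
            heapq.heappush(heap, (f1 + f2, next(order), t1 + t2))
        return code

    # huffman({"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5}) gives, up to
    # the 0/1 labelling, the classic codes a=0, c=100, b=101, f=1100, e=1101, d=111.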
Huffman’s algorithm
[Figure: a snapshot of Huffman's algorithm part-way through. The forest contains the partial trees 4821 (children 6:2260 and 4:2561), 3:5960, 7:6878, and 7261 (children 2819 — itself the parent of 5:1294 and 9:1525 — and 2:4442), among others.]

Notice how we are growing sections of the tree from the bottom-up (compare with the tree on slide 16).

See CLRS (page 388) for the pseudo-code corresponding to this algorithm.

The proof is divided into two steps:

First it is necessary to demonstrate that the first step (merging the two lowest frequency characters) cannot cause the tree to be non-optimal. This is done by showing that any optimal tree can be reorganised so that these two characters have the same parent node. (See CLRS, Lemma 16.2, page 388.)

Secondly we note that after making an optimal first choice, the problem can be reduced to finding a Huffman code for a smaller alphabet. (See CLRS, Lemma 16.3, page 391.)
Algorithms: Adaptive Huffman Coding
Ziv-Lempel compression algorithms

The Ziv-Lempel compression algorithms are a family of compression algorithms that can be applied to arbitrary file types. The Ziv-Lempel algorithms represent recurring strings with abbreviated codes. There are two main types: those based on LZ77, which use a sliding window, and those based on LZ78, which build an explicit dictionary (LZW is of this second kind).

Algorithms: LZ77

The LZ77 algorithms use a sliding window. The sliding window is a buffer consisting of the last m letters encoded (a0 . . . am−1) and the next n letters to be encoded (b0 . . . bn−1).

Initially we let a0 = a1 = . . . = am−1 = w0 and output ⟨0, 0, w0⟩, where w0 is the first letter of the word to be compressed.
This outputs

⟨0, 0, a⟩ ⟨0, 2, b⟩ ⟨2, 3, c⟩ ⟨1, 2, a⟩

Note the trick with the third triple ⟨2, 3, c⟩ that allows the look-back buffer to overflow into the look-ahead buffer. See http://en.wikipedia.org/wiki/LZ77_and_LZ78 for more information.
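Decoding shows why the overflow is harmless: a triple is expanded one letter at a time, so a copy may read letters that it has itself just written. A sketch of a decoder, under the convention that the first component is a distance back from the current position (the convention and the example triples are ours, chosen to show the trick; distance 0 is only used with length 0):

    def lz77_decode(triples):
        out = []
        for back, length, char in triples:
            for _ in range(length):
                out.append(out[-back])   # one letter at a time: the copy may
            out.append(char)             # overlap what it is producing
        return "".join(out)

    # lz77_decode([(0, 0, "a"), (1, 1, "b"), (2, 3, "c")]) gives "aababac":
    # the last triple copies 3 letters from a window only 2 letters back.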
Algorithms: LZW
Summary

1. String matching is the problem of finding all matches for a given pattern, in a given sample of text.

2. The Rabin-Karp algorithm uses prime numbers to find matches in linear time in the expected case.

3. A string matching automaton works in linear time, but requires a significant amount of precomputing.

Summary cont.

5. The Boyer-Moore algorithm uses the bad character and good suffix heuristics to give the best performance in the expected case.

6. The longest common subsequence problem can be solved using dynamic programming.

7. Dynamic programming can improve the efficiency of divide and conquer algorithms by storing the results of sub-computations so they can be reused later.
Summary cont.
Computer Science and Software Engineering, 2011
Algorithm Design
[a, b] = {x ∈ R | a ≤ x ≤ b}
The following schedules are all allowable. [Figure: several allowable schedules.]
Activity Selection

Consider the greedy approach of selecting the interval that finishes first from the collection

{[6, 9), [1, 10), [2, 4), [1, 7), [5, 6), [8, 11), [9, 11)}

Then we would choose [2, 4) as the first interval, and after eliminating clashes we are left with the task of finding the largest set of mutually disjoint intervals from the set

{[6, 9), [5, 6), [8, 11), [9, 11)}.

At this stage, we simply apply the algorithm recursively. Therefore, being greedy in the same way, we select [5, 6) as the next interval, and after eliminating clashes (none in this case) we are left with

{[6, 9), [8, 11), [9, 11)}.

Continuing in this way gives the ultimate result that the largest possible collection of non-intersecting intervals is

[2, 4) then [5, 6) then [6, 9) then [9, 11).

Algorithm

As a precondition the list of tasks must be sorted into ascending order of their finish times, to ensure

finish(t1) ≤ finish(t2) ≤ finish(t3) ≤ . . .

The pseudo-code will then process the sorted list of tasks t:

procedure GREEDY-ACTIVITY-SEL(t)
    A ← {t1}
    i ← 1
    for m ← 2 to length(t) do
        if start(tm) ≥ finish(ti) then
            A ← A ∪ {tm}
            i ← m
        end if
    end for
    return A

It returns A, a subset of compatible activities.
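A runnable Python version of this pseudo-code, with tasks as (start, finish) pairs and the sorting precondition established explicitly:

    def greedy_activity_sel(tasks):
        """Return a largest set of mutually compatible tasks."""
        tasks = sorted(tasks, key=lambda t: t[1])   # ascending finish times
        A = [tasks[0]]
        for task in tasks[1:]:
            if task[0] >= A[-1][1]:                 # starts after the last chosen finish
                A.append(task)
        return A

    # greedy_activity_sel([(6, 9), (1, 10), (2, 4), (1, 7), (5, 6), (8, 11), (9, 11)])
    # returns [(2, 4), (5, 6), (6, 9), (9, 11)], matching the worked example.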
Does it work?

The greedy algorithm gives us a solution to the activity scheduling problem — but is it actually the best solution, or could we do better by considering the global impact of our choices more carefully?

For the problem ACTIVITY SELECTION we can show that the greedy algorithm always finds an optimal solution.

We suppose first that the activities are ordered by finishing time, so that

f1 ≤ f2 ≤ · · · ≤ fn

Now consider some optimal solution for the problem consisting of k tasks

ti1, ti2, . . . , tik

Then

t1, ti2, . . . , tik

is also an optimal solution: since f1 ≤ fi1, the task t1 cannot clash with ti2, . . . , tik, and the schedule still consists of k tasks.

Intuitive Proof

The formal proof that we can use t1 as the first task and be certain that we will not change the number of compatible tasks is rather involved, and you are referred to the text book (see CLRS, pages 373-375).

However the basic idea is a proof by contradiction. Assume using t1 results in a sub-optimal solution, so that we can find a compatible solution with (k + 1) tasks. This would only be possible if we can find two tasks t′1 and t′′1 which occupy the same interval as t1. But this would imply

s1 ≤ s′1 < f′1 ≤ s′′1 < f′′1 ≤ f1

and hence that f′1 < f1. But we know that the tasks are sorted in order of ascending finish times, so no task can have a finish time less than that of t1, leading to a contradiction. Hence using t1 as the first task must lead to an optimal solution with k tasks.
Running time

The running time for this algorithm is dominated by the time taken to sort the n inputs at the start of the algorithm.

Unfortunately, greedy algorithms only work for a certain narrow range of problems.

Vertex Cover

A vertex cover for a graph G is a set of vertices V′ ⊆ V(G) such that every edge has at least one end in V′ (the set of vertices covers all the edges).

[Figure: a graph with a vertex cover highlighted.]
A greedy algorithm

[Figure: an example graph, and the cover that the greedy algorithm gives.]

In problems where the greedy algorithm works, the earlier choices do not interfere negatively with the later choices.

Unfortunately, most problems are not amenable to the greedy algorithm.
More NP-problems
Non-deterministic polynomial time
How hard are these problems?
A dynamic programming solution

Let V(m, w) be the value of the optimal solution to this subproblem. Then for any m and any w, we can see

V(m, w) = max{V(m − 1, w), vm + V(m − 1, w − wm)}

(where the second option is available only when wm ≤ w). Since V(0, w) = 0 for all w, this allows us to define a (very inefficient) recursive algorithm.

The dynamic programming approach is instead to:

1. divide the problem into smaller sub-problems;

2. recursively solve the smaller sub-problems, recording the solutions in a table;

3. construct the solution to the original problem from the table.

For the 0-1 knapsack problem we will construct a table where the entries are V(i, j) for i = 0, . . . , n and j = 0, . . . , W.
Pseudo-code
Example
Example

i\w   0  1  2  3  4  5
0     0  0  0  0  0  0
1     0  2  2  2  2  2
2     0  2  3  4  4  4
3     0  2  3  4  6  7

Note that the actual items contributing to the solution (that is, items 2 and 3) can be found by examination of the table. If T(i, w) are the items that produce the solution V(i, w), then

T(i, w) = T(i − 1, w)               if V(i, w) = V(i − 1, w)
        = {i} ∪ T(i − 1, w − wi)    otherwise.

Linear Programming

The fractional knapsack problem is an example of a linear programming problem. A linear programme is an optimization problem of the form:

Find real numbers x1, . . . , xn
that maximize Σni=1 ci xi
subject to Σni=1 aij xi ≤ bj for j = 1, . . . , m
and xi ≥ 0 for i = 1, . . . , n.

Therefore a linear programme is parameterized by the cost vector (c1, . . . , cn), an n × m array of constraint coefficients aij, and a bounds vector (b1, . . . , bm).

It is clear that the fractional knapsack problem can be presented as a linear programme.
For example, given a weighted, directed graph, G = (V, E), the length of the shortest path from s to t can be described using a linear programme. Using the distance array from the Bellman-Ford algorithm, we have the programme:

Maximize d[t]
subject to d[v] − d[u] ≤ w(u, v) for each edge (u, v) ∈ E
and d[s] = 0.

Maximum flow problems can also be easily converted into linear programmes.

The simplex algorithm is effectively a hill-climbing algorithm that incrementally improves the solution until no further improvements can be made. There are also polynomial-time interior point methods to solve linear programmes.

We won't examine these algorithms. Rather we will simply consider the technique of converting problems into linear programmes.
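As a hedged illustration of the conversion (assuming SciPy is available; scipy.optimize.linprog minimizes, so the objective is negated, and the toy graph is ours):

    import numpy as np
    from scipy.optimize import linprog

    # vertices s = 0, a = 1, t = 2; weighted edges (u, v, w(u, v))
    edges = [(0, 1, 1.0), (1, 2, 2.0), (0, 2, 4.0)]
    nv = 3

    c = np.zeros(nv)
    c[2] = -1.0                                   # maximize d[t] = minimize -d[t]
    A_ub, b_ub = [], []
    for u, v, w in edges:                         # d[v] - d[u] <= w(u, v)
        row = np.zeros(nv)
        row[v], row[u] = 1.0, -1.0
        A_ub.append(row)
        b_ub.append(w)
    A_eq = np.zeros((1, nv))
    A_eq[0, 0] = 1.0                              # d[s] = 0
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[0.0])
    print(res.x[2])                               # 3.0, the shortest s-t distance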
Example

[Figure: the feasible region of a linear programme in two variables, bounded by the constraint lines, with the optimum attained at a vertex.]

Integer Linear Programming

Adding the constraint that all solutions to a linear programme be integer values gives an integer linear programme.

The 0-1 knapsack problem can be written as an integer linear programme, as can the travelling salesman problem.
Approximation Algorithms

An approximation algorithm is an algorithm that produces some feasible solution, but with no guarantee that the solution is optimal.

Therefore an approximation algorithm for the travelling salesman problem would produce some valid circular tour, but it may not be the shortest tour. An approximation algorithm for the minimum dominating set problem would produce some dominating set for G, but it may not be the smallest possible dominating set.

The performance of an approximation algorithm on a given instance I is measured by the ratio

A(I)/OPT(I)

where A(I) is the value given by the approximation algorithm and OPT(I) is the true optimum value.

Standard Instances

Both TRAVELLING SALESMAN and DOMINATING SET have been fairly extensively studied, and a number of algorithms for their solution have been proposed.

In each case there are some standard instances for would-be solvers to test their code on. A package called TSPLIB provides a variety of standard travelling salesman problems. Some of them have known optimal solutions, while others are currently unsolved and TSPLIB just records the best known solution.

There are problems with around 2000 cities for which the best solution is not known, but this problem has been very heavily studied by brilliant groups of researchers using massive computer power and very sophisticated techniques.
The football pool problem
A greedy approximation algorithm

There is a natural greedy approximation algorithm for the minimum dominating set problem.

Start by selecting a vertex of maximum degree (so it dominates the greatest number of vertices). Then mark or delete all of the dominated vertices, and select the next vertex that dominates the greatest number of currently undominated vertices. Repeat until all vertices are dominated. (A sketch of this procedure appears below.)

The graph P5 (a path with 5 vertices) shows that this algorithm does not always find the optimal solution.

Types of Travelling Salesman Instance

Consider a travelling salesman problem defined in the following way. The "cities" are n randomly chosen points ci = (xi, yi) on the Euclidean plane, and the "distances" are defined by the normal Euclidean distance

d(ci, cj) = √((xi − xj)² + (yi − yj)²)

or the Manhattan distance

d(ci, cj) = |xi − xj| + |yi − yj|

Instances of the travelling salesman problem that arise in this fashion are called geometric travelling salesman problems. Here the "distance" between the cities is actually the geometric distance between the corresponding points under some metric.
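A sketch of the greedy dominating-set heuristic described above (the names are ours; adj maps each vertex to its set of neighbours):

    def greedy_dominating_set(adj):
        undominated = set(adj)
        D = set()
        while undominated:
            # vertex dominating the most currently undominated vertices
            best = max(adj, key=lambda v: len(({v} | adj[v]) & undominated))
            D.add(best)
            undominated -= {best} | adj[best]
        return D

    # On P5 = a-b-c-d-e this happens to return the optimal {b, d}, but breaking
    # the initial degree tie in favour of the centre c forces a third vertex,
    # which is the point of the P5 example above.
    P5 = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "e"}, "e": {"d"}}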
Properties of geometric instances

All geometric instances have the properties that they are symmetric and satisfy the triangle inequality.

If

d(ci, cj) = d(cj, ci)

for all pairs of cities in an instance of TRAVELLING SALESMAN then we say that the instance is symmetric.

If

d(ci, ck) ≤ d(ci, cj) + d(cj, ck)

for all triples of cities in an instance of TRAVELLING SALESMAN then we say that the instance satisfies the triangle inequality.

Non-geometric instances

Of course it is easy to define instances that are not geometric. Let X = {A, B, C, D, E, F} and let d be given by

      A  B  C  D  E  F
A     0  2  4  ∞  1  3
B     2  0  6  2  1  4
C     4  ∞  0  1  2  1
D     ∞  2  1  0  9  1
E     1  1  2  6  0  3
F     3  4  1  1  3  0

Many approximation algorithms only work for geometric instances because it is such an important special case, but remember that it is only a special case!
A geometric instance of NN

[Figure: a geometric travelling salesman instance, and the tour found by the nearest neighbour algorithm.]

For an n-city instance of TRAVELLING SALESMAN this algorithm takes time O(n²).
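The nearest neighbour heuristic, as usually stated, starts at any city, repeatedly travels to the closest not-yet-visited city, and finally returns home. A sketch (our names) whose two nested scans give the O(n²) bound quoted above:

    from math import hypot

    def nearest_neighbour(cities):
        """cities: list of (x, y) points. Returns a tour as a list of indices."""
        tour, unvisited = [0], set(range(1, len(cities)))
        while unvisited:                         # n - 1 steps ...
            x0, y0 = cities[tour[-1]]
            nxt = min(unvisited,                 # ... each scanning O(n) cities
                      key=lambda j: hypot(cities[j][0] - x0, cities[j][1] - y0))
            tour.append(nxt)
            unvisited.remove(nxt)
        return tour                              # the tour closes back at city 0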
Minimum spanning tree

Suppose that we have an instance I of TRAVELLING SALESMAN that is symmetric and satisfies the triangle inequality. Then the following algorithm, called MST, is guaranteed to find a tour that is at most twice the optimal length:

MST(I) ≤ 2 OPT(I).

In order to see why this works, we first observe that removing one edge from the optimal tour yields a spanning tree for I, and therefore the weight of a minimum spanning tree is at most OPT(I).

Search the tree . . .

Perform a depth first search on the minimum spanning tree.

[Figure: a minimum spanning tree on 16 vertices, labelled 1-16 in the order that the depth-first search visits them.]

This algorithm proceeds very much like Kruskal's algorithm, but the added simplicity means that the complicated union-find data structure is unnecessary.
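A sketch of the whole MST algorithm: Prim's algorithm on the distance matrix, then a depth-first preorder of the tree (by the triangle inequality, short-cutting past already-visited vertices never lengthens the walk):

    def mst_tour(d):
        """d: symmetric n x n distance matrix. Returns an ordering of 0..n-1."""
        n = len(d)
        parent = [0] * n
        best = [d[0][j] for j in range(n)]        # cheapest link into the tree
        in_tree = [False] * n
        in_tree[0] = True
        for _ in range(n - 1):                    # Prim: no union-find needed
            u = min((j for j in range(n) if not in_tree[j]), key=lambda j: best[j])
            in_tree[u] = True
            for j in range(n):
                if not in_tree[j] and d[u][j] < best[j]:
                    best[j], parent[j] = d[u][j], u
        children = [[] for _ in range(n)]
        for v in range(1, n):
            children[parent[v]].append(v)
        tour, stack = [], [0]                     # depth-first preorder of the tree
        while stack:
            v = stack.pop()
            tour.append(v)
            stack.extend(reversed(children[v]))
        return tour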
Insertion methods

Suppose that we are intending to insert the new vertex x into the partial tour C (called C because it is a cycle).

Three insertion techniques

Nearest insertion: at each stage the vertex x is chosen to be the one closest to C.
Nearest insertion tours ranged from 631 to 701 on the above example.

[Figure: example of a tour found by nearest insertion.]

Farthest insertion tours ranged from 594 to 679 on the above example.

[Figure: example of a tour found by farthest insertion.]
[Figure: a tour found by random insertion.]

Whichever rule chooses x, inserting it between consecutive tour vertices u and v increases the tour length by

d(u, x) + d(x, v) − d(u, v)

so x is placed where this increase is smallest.
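Choosing the cheapest position for the chosen vertex is then a direct transcription of that formula (a sketch; the rule that picks x — nearest, farthest or random — is left to the caller):

    def insert_cheapest(tour, x, d):
        """Insert x into the cyclic tour where the length increase is smallest."""
        n = len(tour)
        best_i = min(range(n), key=lambda i:
                     d[tour[i]][x] + d[x][tour[(i + 1) % n]]
                     - d[tour[i]][tour[(i + 1) % n]])
        return tour[:best_i + 1] + [x] + tour[best_i + 1:]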
A state space graph

Searching the state space graph

In general S(I) is so vast that it is totally impossible to write down the entire graph.

The greedy insertion methods all provide us with a single vertex in S(I) (a single tour), and the iterative improvement heuristics all involve doing a walk in S(I), moving along edges from tour to neighbouring tour, attempting to find the lowest cost vertex.

In this type of state space searching we have the concept of a "current" tour T, and at each stage of the search we generate a neighbour T′ of T and decide whether the search should proceed to T′ or not.

Hill-climbing

The simplest heuristic state-space search is known as hill-climbing. The rule for proceeding from one state to another is very easy:

• Systematically generate neighbours T′ of T and move to the first neighbour of lower cost than T.

This procedure will terminate when T has no neighbours of lower cost — in this case T is a 2-optimal tour.

An obvious variant of this is to always choose the best move at each step.
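For tours, the neighbourhood implied here is the 2-opt move, which reverses a segment of the tour. A sketch of the hill-climb just described (our names):

    def tour_length(tour, d):
        return sum(d[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))

    def hill_climb(tour, d):
        """Move to the first improving 2-opt neighbour until none exists."""
        improved = True
        while improved:
            improved = False
            for i in range(len(tour) - 1):
                for j in range(i + 2, len(tour)):
                    nbr = tour[:i + 1] + tour[i + 1:j + 1][::-1] + tour[j + 1:]
                    if tour_length(nbr, d) < tour_length(tour, d):
                        tour, improved = nbr, True     # first downhill neighbour
                        break
                if improved:
                    break
        return tour                                     # now 2-optimal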
A local optimum

A hill-climb will always finish on a vertex of lower cost than all its neighbours — such a vertex is a local minimum.

Unfortunately the state space graph has an enormous number of local minima, each of them possibly tremendously different from the global minimum.

If we mentally picture the state space graph as a kind of "landscape" where costs are represented by heights, then S(I) is a savagely jagged landscape of enormously high dimension.

Hill climbing merely moves directly into the nearest local optimum and cannot proceed from there.

State-space for DOMINATING SET

We can apply similar methods to the graph domination problem provided that we define the state-space graph carefully.

Suppose that we are trying to see whether a graph G has a dominating set of size k. Then the "states" in the state space graph are all the possible subsets of V(G) of size k. The "cost" of each can be taken to be the number of vertices not dominated by the corresponding k-subset. The solution that we are seeking is then a state of cost 0.

Now we must define some concept of "neighbouring states". In this situation a natural way to define a neighbouring state is the state that results from moving one of the k vertices to a different position.
We can now apply the hill-climbing procedure to this state space graph. In this fashion the search "wanders" around the state-space graph, but again it will inevitably end up in a local minimum from which there is no escape.

Hill climbing is unsatisfactory because it has no mechanism for escaping locally optimal solutions. Ideally we want a heuristic search technique that tries to improve the current solution but has some method for escaping local optima. Two techniques that have been proposed and extensively investigated in the last decade or so are simulated annealing and tabu search.

Simulated annealing

Annealing is a physical process used in forming crystalline solids.

At a high temperature the solid is molten, and the molecules are moving fast and randomly. If the mixture is very gradually cooled, then as the temperature drops the mixture becomes more ordered, with molecules beginning to align into a crystalline structure. If the cooling is sufficiently slow, then at freezing point the resulting solid has a perfect regular crystalline structure.

The crystalline structure has the lowest potential energy, so we can regard the process as trying to find the configuration of a group of molecules with a global minimum potential energy.
Therefore at high temperatures almost all moves are accepted, good or bad, whereas as the temperature reduces, fewer bad moves are accepted and the procedure settles down again. When t ≈ 0 the procedure reverts to a hill-climb.

The value of the initial temperature and the way in which it is reduced is called a cooling schedule:

• Start with some initial temperature t0
• Perform N iterations at each temperature
• Reduce the temperature by a constant multiplicative factor t ← Kt

For example the values t0 = 1, N = 1000, K = 0.95 might be suitable.

Simulated annealing has had success in several areas of combinatorial optimization, particularly in problems with continuous variables. In general it seems to work considerably better than hill-climbing, though it is not clear whether it works much better than multiple hill-climbs.

Each of these combinatorial optimization heuristics has its own adherents, and something akin to religious wars can erupt if anyone is rash enough to say "X is better than Y". Experimentation is fraught with problems also, in that an empirical comparison of techniques depends so heavily on the test problems that almost any desired result can be convincingly produced by careful enough choice.
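Putting the acceptance behaviour and the cooling schedule together (a sketch using the slides' values t0 = 1, N = 1000, K = 0.95, the usual Metropolis acceptance rule, and problem-supplied neighbour and cost functions):

    import math, random

    def anneal(state, neighbour, cost, t0=1.0, N=1000, K=0.95, t_min=1e-3):
        t = t0
        while t > t_min:
            for _ in range(N):                    # N iterations per temperature
                nxt = neighbour(state)
                delta = cost(nxt) - cost(state)
                # downhill moves always accepted; uphill with probability e^(-delta/t)
                if delta <= 0 or random.random() < math.exp(-delta / t):
                    state = nxt
            t *= K                                # t <- Kt
        return state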
Tabu search

The word tabu (or taboo) means something prohibited or forbidden. Tabu search is another combinatorial search heuristic that combines some of the features of hill-climbing and simulated annealing. However it can only be used in slightly more restricted circumstances.

The basic idea of a tabu search is that it always maintains a tabu list detailing the last h vertices that it has visited.

• Select the best possible neighbour T′ of T.
• If T′ is not on the tabu list, then move to it and update the tabu list accordingly.
Practical considerations
A glimpse of GAs
Summary cont.