
CS711008Z Algorithm Design and Analysis

Lecture 1. Introduction and some representative problems

Dongbo Bu

Institute of Computing Technology


Chinese Academy of Sciences, Beijing, China

The origin of the word “Algorithm”

Figure 1: Muhammad ibn Musa al-Khwarizmi (c. 780–850), a Persian scholar, formerly Latinized as Algoritmi

In the twelfth century, Latin translations of his work on the Indian-Arabic numerals introduced the decimal positional number system to the Western world.

Al-Khwarizmi's contributions

Al-Khwarizmi's The Compendious Book on Calculation by Completion and Balancing presented the first systematic solution of linear and quadratic equations in Arabic.
Two words derive from his work:
Algebra: from the Arabic "al-jabr", meaning "reunion of broken parts", one of the two operations he used to solve equations.
Algorithm: a step-by-step set of operations to obtain the solution to a problem.

Algorithm design: the art of computer programming

V. Vazirani said:

Our philosophy on the design and exposition of algorithms is nicely illustrated by the following analogy with an aspect of Michelangelo's art: a major part of his effort involved looking for interesting pieces of stone in the quarry and staring at them for long hours to determine the form they naturally wanted to take. The chisel work exposed, in a minimal manner, this form.

V. Vazirani said: cont'd

By analogy, we would like to start with a clean, simply stated problem. Most of the algorithm design effort actually goes into understanding the algorithmically relevant combinatorial structure of the problem. The algorithm exploits this structure in a minimal manner ... with emphasis on stating the structure offered by the problems, and keeping the algorithms minimal.

(See extra slides.)

Basic algorithmic strategies

Divide-and-conquer: let's start from the “smallest” problem first, and investigate whether a large problem can be reduced to smaller subproblems.
Improvement: let's start from an initial complete solution, and try to improve it step by step.
“Intelligent” enumeration: consider an optimization problem. If a solution can be constructed step by step, we can enumerate all possible complete solutions by building a partial solution tree. Due to the huge size of the search tree, some techniques should be employed to prune it.

The first example: calculating the greatest common divisor
(gcd)

The first problem: calculating gcd

Definition (gcd)
The greatest common divisor of two integers a and b, at least one of which is nonzero, is the largest positive integer that divides both numbers without a remainder.

Example:
The divisors of 54 are: 1, 2, 3, 6, 9, 18, 27, 54
The divisors of 24 are: 1, 2, 3, 4, 6, 8, 12, 24
Thus, gcd(54, 24) = 6.

The first problem: calculating gcd

INPUT: two n-bit numbers a and b (a ≥ b)
OUTPUT: gcd(a, b)

Observation: the problem size can be measured by n.
Let's start from the “smallest” instance: gcd(1, 0) = 1.
But how do we efficiently solve a “larger” instance, say gcd(1949, 101)?

Euclidean algorithm
Problem instances

gcd(1949, 101)

gcd(1, 0)

Observation: a large problem can be reduced to a smaller subproblem:
gcd(1949, 101) = gcd(101, 1949 mod 101) = gcd(101, 30)
Strategy: reduce to “smaller” problems

Problem instances

gcd(1949, 101)

gcd(101, 30)

gcd(1, 0)

gcd(101, 30) = gcd(30, 101 mod 30) = gcd(30, 11)


Strategy: reduce to ”smaller” problems

Problem instances

gcd(1949, 101)

gcd(101, 30)

gcd(30, 11)

gcd(1, 0)

gcd(30, 11) = gcd(11, 30 mod 11) = gcd(11, 8)


Strategy: reduce to ”smaller” problems

Problem instances

gcd(1949, 101)

gcd(101, 30)

gcd(30, 11)

gcd(1, 0)

gcd(30, 11) = gcd(11, 8) = gcd(8, 3) = gcd(3, 2) = gcd(2, 1) = gcd(1, 0) = 1
Sub-instance relationship graph

Problem instances

gcd(1949, 101)

gcd(101, 30)

gcd(30, 11)

gcd(1, 0)

Node: subproblems
Edge: reduction relationship
Euclid algorithm

function Euclid(a, b)
1: if b = 0 then
2: return a;
3: end if
4: return Euclid(b, a mod b);
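For concreteness, the pseudocode above can be transcribed directly into Python (the language choice is ours; the slides give only pseudocode):

```python
# A direct transcription of the Euclid pseudocode above.
def euclid(a: int, b: int) -> int:
    """Return gcd(a, b), assuming a >= b >= 0 and not both zero."""
    if b == 0:
        return a
    return euclid(b, a % b)
```

For example, euclid(54, 24) returns 6, matching the divisor lists on the earlier slide, and euclid(1949, 101) follows exactly the reduction chain shown above.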

Time complexity analysis

Theorem
Suppose a is an n-bit integer. Euclid(a, b) ends in O(n^3) time.

Proof.
There are at most 2n recursive calls.
Note that a mod b < a/2 whenever a ≥ b.
Hence, after every two rounds of recursive calls, both a and b shrink to at most half their size.
Each recursive call performs one mod operation, which costs O(n^2) time.
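The "at most 2n recursive calls" bound can be checked empirically with a small iterative version of Euclid (the counting helper below is our own, written only to illustrate the proof):

```python
def euclid_steps(a: int, b: int) -> int:
    """Count the recursive calls Euclid(a, b) would make."""
    steps = 0
    while b != 0:
        a, b = b, a % b
        steps += 1
    return steps

# Exhaustive check of steps <= 2n over small inputs, where n is the
# bit length of a (illustrative only, not a proof):
for a in range(2, 300):
    for b in range(1, a + 1):
        assert euclid_steps(a, b) <= 2 * a.bit_length()
```

On the running example, euclid_steps(1949, 101) is 7, well below 2 × 11 = 22 for the 11-bit number 1949.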

The second example: traveling salesman problem (TSP)

TSP: a concrete example

In 1925, H. M. Cleveland, a salesman of the Page seed company, traveled to 350 cities to gather order forms.
Of course, the shorter the total distance, the better.
How did they do? Pin and wire!

Two pictures excerpted from Secretarial Studies, 1922.


Travelling Salesman Problem

INPUT: n cities V = {1, 2, ..., n} and a distance matrix D, where dij (1 ≤ i, j ≤ n) denotes the distance between cities i and j.
OUTPUT: the shortest tour that visits each city exactly once and returns to the origin city.

[Figure: a 4-city example graph with weighted edges among cities 1 to 4.]

Tour 1: 1→2→3→4→1 (distance: 12)
Tour 2: 1→2→4→3→1 (distance: 19)
Tour 3: 1→3→2→4→1 (distance: 23)
Tour 4: 1→3→4→2→1 (distance: 19)
Tour 5: 1→4→2→3→1 (distance: 23)
Tour 6: 1→4→3→2→1 (distance: 12)
Trial 1: Divide and conquer

Decompose the original problem into subproblems

[Figure: a 5-city example graph on cities a, b, c, d, e; the tour in blue solves the original problem, the tour in red a subproblem.]

Note that it is not easy to obtain the optimal solution to the original problem (e.g., the tour in blue) from the optimal solution to a subproblem (e.g., the tour in red).
Consider a tightly-related problem
Let's consider a tightly related problem: calculating M(s, S, x), the minimum distance of a path that starts from city s, visits each city in S exactly once, and ends at city x.
It is easy to decompose this problem into subproblems, and the original problem could be easily solved if M(s, S, x) were determined for all subsets S ⊆ V and all cities x ∈ V.

[Figure: the 4-city example graph again.]

For example, since there are 3 cases for the city from which we return to city 1, the shortest tour can be calculated as:
min{ d2,1 + M(1, {3, 4}, 2),
d3,1 + M(1, {2, 4}, 3),
d4,1 + M(1, {2, 3}, 4) }.
Consider the smallest instance of M(s, S, x)

It is trivial to calculate M(s, S, x) when S consists of only one city.

[Figure: two 3-city fragments of the example graph.]

M(1, {2}, 3) = d12 + d23    M(1, {3}, 2) = d13 + d32

But how do we solve a larger problem, say M(1, {2, 3}, 4)?
Divide a large problem into smaller problems

[Figure: the 4-city example graph, shown twice with the two possible decompositions highlighted.]

M(1, {2, 3}, 4) = min{ d34 + M(1, {2}, 3), d24 + M(1, {3}, 2) }
Sub-instance relationship graph

Problem instances

M(1, {2, 3}, 4)

M(1, {3}, 2) M(1, {2}, 3)

A large problem can be reduced to smaller subproblems.
Held-Karp algorithm [1962]

function TSP(D)
1: return min over e ∈ V, e ≠ s of M(s, V − {s, e}, e) + de,s;

function M(s, S, x)
1: if S = {v} then
2: M(s, S, x) = ds,v + dv,x;
3: return M(s, S, x);
4: end if
5: return min over i ∈ S of M(s, S − {i}, i) + di,x;
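A minimal executable sketch of the recurrence above, with two assumptions of ours: the distance matrix is symmetric, the start city s is fixed to 0, and there are at least 3 cities.

```python
from functools import lru_cache

def tsp_held_karp(d):
    """Held-Karp dynamic programming for TSP, a sketch of the
    M(s, S, x) recurrence above (s fixed to city 0, n >= 3)."""
    n = len(d)
    s = 0

    @lru_cache(maxsize=None)
    def m(S, x):
        # M(s, S, x): shortest path starting at s, visiting every
        # city of the frozenset S exactly once, ending at x (x not in S).
        if len(S) == 1:
            (v,) = S
            return d[s][v] + d[v][x]
        return min(m(S - {i}, i) + d[i][x] for i in S)

    # Close the tour: try each possible city e visited last before
    # returning to s, as in the min{...} expression on the slides.
    others = frozenset(range(1, n))
    return min(m(others - {e}, e) + d[e][s] for e in others)
```

The memoization over (S, x) pairs is what gives the O(2^n · n^2) running time quoted on the next slide, instead of the (n−1)! cost of enumerating all tours.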

An example

            x
S           b     c     d
{b}         -     7     6
{c}         8     -     12
{d}         10    15    -
{b, c}      -     -     15
{b, d}      -     14    -
{c, d}      15    -     -

Time complexity: O(2^n · n^2).
Trial 2: Improvement strategy

Solution space

Landscape of the solution space:
Node: a complete solution. Each node is associated with an objective function value.
Edge: if two nodes are neighbours, an edge is added to connect them. Here “neighbours” refers to two solutions with only a small difference.
Improvement strategy: start from a rough complete solution, and try to improve it step by step.
Improvement strategy

[Figure: the 4-city example graph.]

Note that a complete solution can be expressed as a permutation of the n cities, e.g., 1, 2, 3, 4.
Let's start from an initial complete solution, and try to improve it.
Improvement strategy

function GenericImprovement(V, D)
1: Let s be an initial tour;
2: while TRUE do
3: Select a new tour s′ from the neighbourhood of s;
4: if s′ is shorter than s then
5: s = s′ ;
6: end if
7: if stopping(s) then
8: return s;
9: end if
10: end while
Here, the neighbourhood describes how an existing tour can be changed into a new one.

But how to define neighbourhood of a tour?

2-opt strategy: s′ is a neighbour of s if the two tours differ in exactly two edges (note: differing in a single edge is impossible for tours).

[Figure: tours s and s′ on the 4-city example, differing in exactly two edges.]
An example

[Figure: the 5-city example graph on cities a, b, c, d, e.]
Step 1

[Figure: the 5-city example graph with the initial tour drawn.]

Initial complete solution s: a → b → c → d → e → a (distance: 25)
Step 2: a 2-opt operation improves s ⇒ s′

[Figure: tours s and s′ side by side on the 5-city example.]

Initial solution s: a → b → c → d → e → a (distance: 25)
Improve from s to s′: a → b → c → e → d → a (distance: 23)
Step 3: One more 2-opt operation improves s′ ⇒ s′′

[Figure: tours s′ and s′′ side by side on the 5-city example.]

A complete solution s′: a → b → c → e → d → a (distance: 23)
Improve s′ to s′′: a → c → b → e → d → a (distance: 19)
Done! No 2-opt move can improve the tour any further.
Recent development: Learning 2-OPT heuristics

We can apply a 2-opt move to any edge pair.
How do we select an appropriate edge pair? Using neural networks and reinforcement learning [Roberto2020].
Extension: 3-OPT and LKH technique

Selecting three edges (a, b), (c, d), and (e, f), and applying 2-opt and 3-opt moves to them, we obtain a total of 7 possible new tours.
The LK-1 technique uses up to sequential 5-opt as well as non-sequential 4-opt moves.
Trial 3: “Intelligent” enumeration strategy

Solution form
Note that a complete tour uses exactly n of the graph's m edges. Given a fixed order of the m edges, a complete solution can be represented as X = [x1, x2, ..., xm], where xi = 1 if edge i is used in the tour, and xi = 0 otherwise.
For example, the tour a → b → c → d → e → a can be represented as X = [1, 0, 0, 1, 1, 0, 0, 1, 0, 1].
As every solution is a 0/1 choice over the m edges, we can enumerate all possible solutions.

[Figure: the 5-city example graph with its 10 edges labeled e1 through e10 (weights: e1: 3, e2: 4, e3: 2, e4: 4, e5: 7, e6: 6, e7: 3, e8: 5, e9: 8, e10: 6).]
Organizing solutions into a partial solution tree

[Figure: a partial solution tree. The root X = ?????????? branches on x1 = 1/0, then on x2, x3, x4, ..., down to leaves such as 1011000011 and 1010001110.]

Partial solution tree:
Leaf node: represents a complete solution associated with an objective function value, e.g., X = 1011000011 has an objective function value of 23.
Internal node: represents a partial solution, where only a subset of the items (those on the path from the root to the node) are determined, e.g., X = 10???????? refers to tours including e1 but excluding e2.
Two alternative views of a partial solution

[Figure: the same partial solution tree, with the node 101?0????? and its two leaves highlighted in red.]

A subproblem: determine the unknown items, e.g., X = 10???????? means finding the shortest tour including e1 but excluding e2.
A collection of complete solutions: the leaf nodes of the subtree rooted at this partial solution, e.g., X = 101?0????? represents the two leaves 1011000011 and 1010001110.
Deduction rules for simplification

[Figure: the same partial solution tree.]

Note that the following deduction rules were applied for simplification:
An edge (i, j) should be included in the tour if its removal would leave fewer than two usable edges incident to i or j, e.g., x8 = 1 in X = 1010001110.
An edge (i, j) should be excluded from the tour if its addition would lead to more than two edges incident to i or j, or to a small circuit, e.g., x5 = 0 in X = 101?0?????.
Enumerate all complete solutions: Backtrack algorithm

function GenericBacktrack
1: Let A = {X0}; // start with the root node X0 = ??...?. Here, A denotes the active set of unexplored nodes
2: best_so_far = ∞;
3: while A ≠ ∅ do
4: Choose and remove a node X from A;
5: Select an undetermined item in X, enumerate its values, and thus expand X into nodes X1, X2, ..., Xk;
6: for i = 1 to k do
7: if Xi represents a complete solution then
8: Update best_so_far if Xi has a better objective function value;
9: else
10: Insert Xi into A; // Xi needs to be explored further
11: end if
12: end for
13: end while
14: return best_so_far;
An example: Step 1

[Figure: the 5-city example graph with labeled edges.]

A = {X0}
best_so_far = +∞

Partial Solution Tree: only the root X0 (X = ??????????).

Initially, the partial solution tree has only the root node, i.e., the original problem X0. The value best_so_far is set to +∞.
Step 2

[Figure: the example graph.]

A = {X1, X2}
best_so_far = +∞

Partial Solution Tree: X0 branches on x1 = 1/0 into X1 (1?????????) and X2 (0?????????).

X0 is decomposed into two subproblems X1 and X2. We then expand X1 at the next step.
Step 3

[Figure: the example graph.]

A = {X2, X3, X4}
best_so_far = +∞

Partial Solution Tree: X1 branches on x2 = 1/0 into X3 (110?0?????) and X4 (10????????).

At this stage X3 is selected for expansion.
Step 4
[Figure: the partial solution tree after Step 4. X3 branches on x3 = 1/0 into X5 (11100?????, invalid) and X6 (11000?????).]

In this way, all possible complete solutions can be enumerated step by step, and it is unnecessary to store the whole tree in memory, which reduces the space requirement.
Note that some internal nodes, say X5, can be removed, as the corresponding edges cannot form a valid tour.
Theoretically speaking, it takes exponential time to enumerate all possible complete solutions.
Question: can we make the enumeration efficient?
Speeding up enumeration process by pruning branches

Basic idea: to speed up the enumeration process, a feasible way is to prune branches at internal nodes of “low quality”. That is, partial solutions can be exploited to exclude some complete solutions. Representatives of this strategy include the greedy method and branch-and-bound.
Note that only complete solutions are associated with an objective function value. How, then, can we estimate the quality of a partial solution?
In greedy approaches, heuristic functions are used to estimate the objective function value of a partial solution.
In branch-and-bound approaches, lower bound functions are used to calculate a lower bound on the objective function value of a partial solution, i.e., on the objective function values of all complete solutions represented by this partial solution.
Calculate lower bound for partial solution: an example
[Figure: the 5-city example graph with labeled edges.]

For the partial solution X = ??????????, we bound the length of the shortest tour from below as follows:
For each city, select its two shortest incident edges;
The sum of these 2n edge weights is less than or equal to 2 times the length of the optimal tour, since an optimal tour uses exactly two edges at each city;
Thus, we can give a lower bound of (1/2)(5 + 6 + 8 + 7 + 9) = 17.5.
Similarly, the lower bound for the partial solution X = 10???????? is estimated as (1/2)(5 + 6 + 9 + 7 + 9) = 18.
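This half-sum bound can be computed generically from a distance matrix; a sketch (working on a matrix rather than the labeled edge list of the figure is our simplification):

```python
def half_sum_lower_bound(d):
    """Lower bound on the optimal tour length: every city contributes
    its two cheapest incident edges; an optimal tour uses exactly two
    edges at each city, so half of this sum cannot exceed the optimal
    tour length. Assumes a symmetric distance matrix d."""
    n = len(d)
    total = 0
    for i in range(n):
        incident = sorted(d[i][j] for j in range(n) if j != i)
        total += incident[0] + incident[1]  # two cheapest edges at city i
    return total / 2
```

Fixing some edges in or out of the tour (as the partial solutions do) only restricts which incident edges may be counted, which is why the bound can grow as the partial solution becomes more specific.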
“Intelligent” enumeration
function IntelligentBacktrack
1: Let A = {X0}; // start with the root node X0 = ??...?. Here A denotes the active set of unexplored nodes
2: best_so_far = ∞;
3: while A ≠ ∅ do
4: Choose a node X ∈ A with lower bound less than best_so_far, and remove X from A;
5: Select an undetermined item in X, enumerate its values, and thus expand X into nodes X1, X2, ..., Xk;
6: for i = 1 to k do
7: if Xi represents a complete solution then
8: Update best_so_far if Xi has a better objective function value;
9: else
10: if lowerbound(Xi) ≤ best_so_far then
11: Insert Xi into A; // Xi needs to be explored further
12: end if
13: end if
14: end for
15: end while
16: return best_so_far;
An example: Step 1

[Figure: the example graph.]

A = {X0}
best_so_far = +∞

Partial Solution Tree: only the root X0 (X = ??????????, lower bound 17.5).

Initially, the partial solution tree has only the root node, i.e., the original problem X0 (lower bound: 17.5). The value best_so_far is set to +∞.
Step 2
[Figure: the example graph.]

A = {X1, X2}
best_so_far = +∞

Partial Solution Tree: X0 (17.5) branches on x1 = 1/0 into X1 (1?????????, bound 17.5) and X2 (0?????????, bound 18.5).

X0 is decomposed into two subproblems X1 and X2, and the lower bounds of these two subproblems are estimated accordingly.
As X1 has a smaller lower bound than X2, X1 will be chosen for expansion at the next step.
Here, we adopt the rule of choosing the subproblem with the smallest lower bound, in the hope of finding the optimal solution as soon as possible.
Step 3

[Figure: the example graph.]

A = {X2, X3, X4}
best_so_far = +∞

Partial Solution Tree: X1 branches on x2 = 1/0 into X3 (110?0?????, bound 20.5) and X4 (10????????, bound 18).

X4 has the smallest lower bound; thus, it will be chosen for expansion.
Step 4

[Figure: the example graph.]

A = {X2, X3, X5, X6}
best_so_far = +∞

Partial Solution Tree: X4 branches on x3 = 1/0 into X5 (101?0?????, bound 18) and X6 (100?1?????, bound 23).

X5 has the smallest lower bound; thus, it will be chosen for expansion.
Step 5
[Figure: the example graph.]

A = {X2, X3, X6}
best_so_far = 21

Partial Solution Tree: X5 branches on x4 = 1/0 into the complete solutions X7 (1011000011, value 23) and X8 (1010001110, value 21).

Two complete solutions, X7 and X8, are obtained, and best_so_far is updated to 21. X3 and X6 should be removed, and X2 will be chosen for expansion.
Step 6

[Figure: the example graph.]

A = {X9}
best_so_far = 21

Partial Solution Tree: X2 branches into X9 (01????????, bound 18.5) and X10 (001?1?????, bound 21, pruned).

X10 should not be added into A, and X9 is chosen for expansion.
Step 7

[Figure: the example graph.]

A = {X11}
best_so_far = 21

Partial Solution Tree: X9 branches into X11 (011?0?????, bound 18.5) and X12 (010?1?????, bound 23.5, pruned).

X12 should not be added into A, and X11 will be expanded.
Final step

[Figure: the example graph.]

A = {}
best_so_far = 19

Partial Solution Tree: X11 expands into the complete solutions X13 (0111001001, value 19) and X14 (0110011010, value 23).

Note: the tree was pruned to contain only 15 nodes, making the algorithm efficient.
Backtracking strategy: another trial
[Figure: the 5-city example graph with labeled edges.]

Another solution form: note that a tour can also be expressed as a sequence of cities, i.e., X = [x1, x2, ..., xn−1], where xi ∈ V and the remaining city is implied as the last stop. Without loss of generality, we assume x1 = a, and that b appears before c in the tour. For example, the tour in blue can be represented as X = abcd.
As solutions have this form, we can enumerate them by building a partial solution tree together with a pruning technique.
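A branch-and-bound over such city sequences can be sketched as follows. The lower bound used here (cheapest edge times the number of remaining legs) is cruder than the half-sum bound on the slides; the function names are ours:

```python
def bnb_tsp(d):
    """Branch and bound over partial city sequences: fix city 0
    first, extend a prefix one city at a time, and prune a prefix
    when a simple lower bound already reaches best_so_far.
    A sketch, not an optimized implementation."""
    n = len(d)
    cheapest = min(d[i][j] for i in range(n) for j in range(n) if i != j)
    best_so_far = float("inf")

    def extend(prefix, length):
        nonlocal best_so_far
        if len(prefix) == n:
            # Complete solution: close the tour and update best_so_far.
            best_so_far = min(best_so_far, length + d[prefix[-1]][0])
            return
        # A prefix of k cities uses k-1 edges; n - k + 1 legs remain,
        # and each costs at least `cheapest`.
        remaining_legs = n - len(prefix) + 1
        if length + remaining_legs * cheapest >= best_so_far:
            return  # prune this branch
        for city in range(1, n):
            if city not in prefix:
                extend(prefix + [city], length + d[prefix[-1]][city])

    extend([0], 0)
    return best_so_far
```

The tighter the lower bound, the more branches are pruned; with the trivial bound of 0, this degenerates into the plain backtracking enumeration of the previous slides.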
An example: Step 1

[Figure: the example graph.]

A = {X0}
best_so_far = ∞

Partial Solution Tree: only the root X0 (X = a???, lower bound 17.5).

Initially, we have only one active subproblem X0, with a lower bound of 17.5 on the tour distance. X0 will be expanded at the next step.
Step 2

[Figure: the example graph.]

A = {X1, X3, X4}
best_so_far = ∞

Partial Solution Tree: X0 branches on x2 = b/c/d/e into X1 (ab??, bound 17.5), X2 (ac??, abandoned), X3 (ad??, bound 17.5), and X4 (ae??, bound 20).

Note that the partial solution X = ac?? was abandoned, as we assume b appears before c in the tour. X1 will be expanded at the next step.
Step 3

[Figure: the example graph.]

A = {X3, X4, X5, X6, X7}
best_so_far = ∞

Partial Solution Tree: X1 branches on x3 = b/c/d/e into X5 (abc?, bound 18.5), X6 (abd?, bound 19.5), and X7 (abe?, bound 17.5).

X7 will be expanded at the next step.
Step 4

[Figure: the example graph.]

A = {X3, X4, X5, X6}
best_so_far = 21

Partial Solution Tree: X7 branches on x4 into the complete solutions X8 (abec, distance 21) and X9 (abed, distance 21).

We obtain two complete solutions and update best_so_far accordingly. X3 will be expanded next.
Step 5

[Figure: the example graph.]

A = {X4, X5, X6, X11, X12}
best_so_far = 21

Partial Solution Tree: X3 branches on x3 into X10 (adc?, abandoned), X11 (adb?, bound 19.5), and X12 (ade?, bound 18).

X12 will be expanded at the next step.
Final step
[Figure: the example graph.]

A = {}
best_so_far = 19

Partial Solution Tree: X12 expands into the complete solutions X13 (adec, pruned) and X14 (adeb, distance 19); all remaining active nodes are pruned by their lower bounds.

Note: the tree was pruned to contain only 15 nodes, making the algorithm efficient.
Time complexity and space complexity

Time complexity and space complexity

The time (space) complexity of an algorithm quantifies the time (space) taken by the algorithm.
Since the time taken by an algorithm grows with the size of the input, it is traditional to describe running time as a function of the input size.
Input size: the best measure of input size depends on the problem being studied.
For the TSP problem, it is the number of cities in the input.
For the Multiplication problem, the total number of bits needed to represent the input numbers is the best measure.
Running time: we are interested in its growth rate

A straightforward measure is the exact number of seconds a program takes. However, this measure depends heavily on the CPU, OS, compiler, etc.
Several simplifications ease the analysis of an algorithm:
1. We count the number of primitive operations (rather than the exact seconds used), under the assumption that a primitive operation costs constant time. Thus the running time has the form T(n) = an^2 + bn + c for some constants a, b, c.
2. We consider only the leading term, i.e., an^2, since the lower-order terms are relatively insignificant for large n.
3. We also ignore the leading term's coefficient a, since it is less significant than the growth rate.
Thus, we have T(n) = an^2 + bn + c = O(n^2). Here, the letter O denotes order.
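The effect of these simplifications can be seen numerically: for T(n) = an^2 + bn + c, the ratio T(n)/n^2 approaches the leading coefficient a as n grows (the constants below are chosen arbitrarily for illustration):

```python
# Arbitrary constants for illustration only.
a, b, c = 3, 100, 1000

def T(n):
    """A running time of the form an^2 + bn + c."""
    return a * n**2 + b * n + c

# T(n) / n^2 tends to a, which is why the lower-order terms and the
# coefficient are both dropped when we write T(n) = O(n^2).
for n in (10, 100, 10_000, 1_000_000):
    print(n, T(n) / n**2)
```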

Big O notation
Recall that big O notation is also used to describe the error term in a Taylor series, say:
e^x = 1 + x + x^2/2 + O(x^3) as x → 0

[Figure: a plot of f(x) and g(x) over x = 1 to 8.]
Figure 2: Example: f(x) = O(g(x)), as there exist c > 0 (e.g., c = 1) and x0 = 5 such that f(x) < c·g(x) whenever x > x0
Big Ω and Big Θ notations

In 1976, D. E. Knuth published a paper to justify his use of the Ω-symbol for a stronger property. Knuth wrote: “For all the applications I have seen so far in computer science, a stronger requirement [...] is much more appropriate.”
He defined
f(x) = Ω(g(x)) ⇔ g(x) = O(f(x))
with the comment: “Although I have changed Hardy and Littlewood's definition of Ω, I feel justified in doing so because their definition is by no means in wide use, and because there are other ways to say what they want to say in the comparatively rare cases when their definition applies.”
Big Θ notation is used to describe “f(n) grows asymptotically as fast as g(n)”:
f(x) = Θ(g(x)) ⇔ g(x) = O(f(x)) and f(x) = O(g(x)).
Worst case and average case

Worst case: the case that takes the longest time.
Average case: here we need to know the distribution of the instances.
