Algorithms and Data Structures
Lecture X
Simonas Šaltenis
Nykredit Center for Database Research
Aalborg University
[email protected]
October 21, 2002
This Lecture
Dynamic programming
Fibonacci numbers example
Optimization problems
Matrix multiplication optimization
Principles of dynamic programming
Longest Common Subsequence
What have we learned? (1)
Ways of reasoning about algorithms:
Correctness
Concrete and asymptotic running time
Data structures and algorithms for
implementing sorted and unsorted
dictionary ADTs:
Hashing
Binary trees, Red-Black trees
Data structures for secondary storage (B-trees)
What have we learned? (2)
Examples of classic algorithms:
Searching (binary search)
Sorting (selection sort, insertion sort,
heap-sort, merge-sort, quick-sort)
Algorithm design techniques:
Iterative algorithms
Divide-and-conquer algorithms
Divide and Conquer
Divide and conquer method for
algorithm design:
Divide: If the input size is too large to deal
with in a straightforward manner, divide the
problem into two or more disjoint
subproblems
Conquer: Use divide and conquer
recursively to solve the subproblems
Combine: Take the solutions to the
subproblems and merge these solutions
into a solution for the original problem
Divide and Conquer (2)
For example, Merge-Sort.
The subproblems are independent and all different.
Merge-Sort(A, p, r)
  if p < r then
    q ← ⌊(p+r)/2⌋
    Merge-Sort(A, p, q)
    Merge-Sort(A, q+1, r)
    Merge(A, p, q, r)
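The Merge-Sort pseudocode above can be sketched in Python. This is a list-returning version for clarity, rather than the in-place, index-based variant the pseudocode uses:

```python
def merge_sort(a):
    """Divide: split in half; Conquer: sort halves recursively; Combine: merge."""
    if len(a) <= 1:                      # base case: already sorted
        return a
    q = len(a) // 2                      # q <- (p+r)/2
    left, right = merge_sort(a[:q]), merge_sort(a[q:])
    # Merge(A, p, q, r): repeatedly take the smaller front element
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]    # append the leftover tail

print(merge_sort([5, 2, 4, 7, 1, 3, 2, 6]))  # [1, 2, 2, 3, 4, 5, 6, 7]
```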
Fibonacci Numbers
F(n) = F(n-1) + F(n-2)
F(0) = 0, F(1) = 1
0, 1, 1, 2, 3, 5, 8, 13, 21, 34
Straightforward recursive procedure is
slow!
Why? How slow?
Let's draw the recursion tree
Fibonacci Numbers (2)
[Figure: recursion tree for F(6) = 8. The root F(6) calls F(5) and F(4); F(5) calls F(4) and F(3), and so on down to F(1) and F(0) leaves. Subtrees such as F(4), F(3), and F(2) are computed several times over.]
We keep calculating the same value over
and over!
Fibonacci Numbers (3)
How many summations are there?
Golden ratio: F(n+1)/F(n) → (1+√5)/2 ≈ 1.61803...
Thus F(n) ≈ 1.6^n
Our recursion tree has only 0s and 1s as leaves; thus we have about 1.6^n summations
Running time is exponential!
Fibonacci Numbers (4)
We can calculate F(n) in linear time by remembering solutions to the solved subproblems: dynamic programming
Compute solution in a bottom-up fashion
Trade space for time!
In this case, only two values need to be
remembered at any time (probably less than
the depth of your recursion stack!)
Fibonacci(n)
  F[0] ← 0
  F[1] ← 1
  for i ← 2 to n do
    F[i] ← F[i-1] + F[i-2]
  return F[n]
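In Python, the bottom-up computation needs only the last two values, so the array in the pseudocode can be replaced by two variables (a minimal sketch):

```python
def fibonacci(n):
    """Compute F(n) bottom-up, remembering only the last two values."""
    prev, curr = 0, 1            # F(0), F(1)
    for _ in range(n):
        prev, curr = curr, prev + curr
    return prev                  # after n steps, prev holds F(n)

print([fibonacci(i) for i in range(10)])  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```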
Optimization Problems
We have to choose one solution out of many: one with the optimal (minimum or maximum) value.
A solution exhibits a structure.
It consists of a string of choices that were made. What choices have to be made to arrive at an optimal solution?
The algorithm computes the optimal value plus, if needed, the optimal solution.
Multiplying Matrices
Two matrices, an n×m matrix A and an m×k matrix B, can be multiplied to get C with dimensions n×k, using n·m·k scalar multiplications
[Figure: entry c(2,2) of the n×k matrix C combines row 2 of A with column 2 of B.]
c(i,j) = Σ_{l=1..m} a(i,l)·b(l,j)
Problem: Compute a product of many
matrices efficiently
Matrix multiplication is associative
(AB)C = A(BC)
Multiplying Matrices (2)
The parenthesization matters
Consider ABCD, where
A is 30×1, B is 1×40, C is 40×10, D is 10×25
Costs:
((AB)C)D: 1200 + 12000 + 7500 = 20700
(AB)(CD): 1200 + 10000 + 30000 = 41200
A((BC)D): 400 + 250 + 750 = 1400
We need to optimally parenthesize A1·A2·…·An, where Ai is a d(i-1)×d(i) matrix
Multiplying Matrices (3)
Let M(i,j) be the minimum number of multiplications necessary to compute the product Ai·Ai+1·…·Aj
Key observations
The outermost parentheses partition the chain of matrices (i,j) at some k (i ≤ k < j): (Ai ⋯ Ak)(Ak+1 ⋯ Aj)
The optimal parenthesization of matrices (i,j) has optimal parenthesizations on either side of k: for matrices (i,k) and (k+1,j)
Multiplying Matrices (4)
We try out all possible k. Recurrence:
M(i,i) = 0
M(i,j) = min over i ≤ k < j of { M(i,k) + M(k+1,j) + d(i-1)·d(k)·d(j) }
A direct recursive implementation is exponential: there is a lot of duplicated work (why?)
But there are only (n choose 2) + n = Θ(n²) different subproblems (i,j), where 1 ≤ i ≤ j ≤ n
Multiplying Matrices (5)
Thus, it requires only Θ(n²) space to store the optimal cost M(i,j) for each of the subproblems: half of a 2D array M[1..n,1..n]
Matrix-Chain-Order(d0…dn)
  for i ← 1 to n do
    M[i,i] ← 0
  for l ← 2 to n do
    for i ← 1 to n-l+1 do
      j ← i+l-1
      M[i,j] ← ∞
      for k ← i to j-1 do
        q ← M[i,k] + M[k+1,j] + d(i-1)·d(k)·d(j)
        if q < M[i,j] then
          M[i,j] ← q
          c[i,j] ← k
  return M, c
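A Python transcription of this pseudocode might look as follows (1-indexed tables with row/column 0 unused, mirroring the slides; `float('inf')` plays the role of ∞):

```python
def matrix_chain_order(d):
    """Bottom-up matrix-chain order: A_i has dimensions d[i-1] x d[i], i = 1..n."""
    n = len(d) - 1
    M = [[0] * (n + 1) for _ in range(n + 1)]   # M[i][j]: min cost for A_i..A_j
    c = [[0] * (n + 1) for _ in range(n + 1)]   # c[i][j]: optimal split point k
    for l in range(2, n + 1):                   # l = chain length
        for i in range(1, n - l + 2):
            j = i + l - 1
            M[i][j] = float('inf')
            for k in range(i, j):               # try every split (A_i..A_k)(A_k+1..A_j)
                q = M[i][k] + M[k + 1][j] + d[i - 1] * d[k] * d[j]
                if q < M[i][j]:
                    M[i][j] = q
                    c[i][j] = k
    return M, c

M, c = matrix_chain_order([10, 20, 3, 5, 30])   # the d used on the next slide
print(M[1][4], c[1][4])  # 1950 2, i.e. the optimal split is (A1 A2)(A3 A4)
```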
Multiplying Matrices (6)
After execution: M[1,n] contains the
value of the optimal solution and c
contains optimal subdivisions (choices of
k) of any subproblem into two
subsubproblems
A simple recursive algorithm Print-Optimal-Parents(c, i, j) can be used to reconstruct an optimal parenthesization
Let us run the algorithm on
d = [10, 20, 3, 5, 30]
Multiplying Matrices (7)
Running time
It is easy to see that it is O(n³)
It turns out, it is also Θ(n³)
From exponential time to polynomial
Memoization
If we still like recursion very much, we can structure our algorithm as a recursive algorithm:
Initialize all M elements to ∞ and call Lookup-Chain(d, i, j)
Lookup-Chain(d, i, j)
  if M[i,j] < ∞ then
    return M[i,j]
  if i = j then
    M[i,j] ← 0
  else for k ← i to j-1 do
    q ← Lookup-Chain(d, i, k) + Lookup-Chain(d, k+1, j) + d(i-1)·d(k)·d(j)
    if q < M[i,j] then
      M[i,j] ← q
  return M[i,j]
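The same memoized recursion can be sketched in Python with functools.lru_cache standing in for the M table (the cache is keyed on (i, j), exactly as M is indexed):

```python
from functools import lru_cache

def memoized_chain_order(d):
    """Top-down matrix-chain cost; lru_cache memoizes each (i, j) subproblem."""
    @lru_cache(maxsize=None)
    def lookup(i, j):
        if i == j:                       # a single matrix needs no multiplications
            return 0
        return min(lookup(i, k) + lookup(k + 1, j) + d[i - 1] * d[k] * d[j]
                   for k in range(i, j))
    return lookup(1, len(d) - 1)

print(memoized_chain_order([10, 20, 3, 5, 30]))  # 1950
```

Note that each of the Θ(n²) subproblems is now computed once, so the running time drops from exponential to polynomial, as in the bottom-up version.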
Dynamic Programming
In general, to apply dynamic programming,
we have to address a number of issues:
1. Show optimal substructure: an optimal solution to the problem contains within it optimal solutions to sub-problems
Solution to a problem:
Making a choice out of a number of possibilities (look
what possible choices there can be)
Solving one or more sub-problems that are the result of
a choice (characterize the space of sub-problems)
Show that solutions to sub-problems must themselves
be optimal for the whole solution to be optimal (use
cut-and-paste argument)
Dynamic Programming (2)
2. Write a recurrence for the value of an
optimal solution
M_opt = min over all choices k { (sum of M_opt of all sub-problems resulting from choice k) + (the cost associated with making choice k) }
Show that the number of different
instances of sub-problems is bounded by a
polynomial
Dynamic Programming (3)
3. Compute the value of an optimal solution
in a bottom-up fashion, so that you always
have the necessary sub-results precomputed (or use memoization)
See if it is possible to reduce the space
requirements, by forgetting solutions to
sub-problems that will not be used any more
4. Construct an optimal solution from
computed information (which records a
sequence of choices made that lead to an
optimal solution)
Longest Common Subsequence
Two text strings are given: X and Y
There is a need to quantify how
similar they are:
Comparing DNA sequences in studies of
evolution of different species
Spell checkers
One of the measures of similarity is
the length of a Longest Common
Subsequence (LCS)
LCS: Definition
Z is a subsequence of X, if it is possible
to generate Z by skipping some
(possibly none) characters from X
For example: X =ACGGTTA, Y=CGTAT,
LCS(X,Y) = CGTA or CGTT
To solve the LCS problem, we have to find the skips that generate LCS(X,Y) from X, and the skips that generate LCS(X,Y) from Y
LCS: Optimal Substructure
We make Z empty and proceed from the ends of Xm = x1 x2 … xm and Yn = y1 y2 … yn
If xm = yn, append this symbol to the beginning of Z, and find optimally LCS(Xm-1, Yn-1)
If xm ≠ yn,
skip either a letter from X
or a letter from Y
Decide which by comparing LCS(Xm, Yn-1) and LCS(Xm-1, Yn)
Cut-and-paste argument
LCS: Recurrence
The algorithm could be easily extended by
allowing more editing operations in addition
to copying and skipping (e.g., changing a letter)
Let c[i,j] = |LCS(Xi, Yj)|
c[i,j] = 0 if i = 0 or j = 0
c[i,j] = c[i-1,j-1] + 1 if i, j > 0 and xi = yj
c[i,j] = max{ c[i,j-1], c[i-1,j] } if i, j > 0 and xi ≠ yj
Observe: conditions in the problem restrict subproblems (What is the total number of subproblems?)
LCS: Compute the Optimum
LCS-Length(X, Y, m, n)
  for i ← 1 to m do
    c[i,0] ← 0
  for j ← 0 to n do
    c[0,j] ← 0
  for i ← 1 to m do
    for j ← 1 to n do
      if xi = yj then
        c[i,j] ← c[i-1,j-1] + 1
        b[i,j] ← "copy"
      else if c[i-1,j] ≥ c[i,j-1] then
        c[i,j] ← c[i-1,j]
        b[i,j] ← "skipx"
      else
        c[i,j] ← c[i,j-1]
        b[i,j] ← "skipy"
  return c, b
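LCS-Length can be sketched in Python together with the backward walk that reconstructs one LCS. In this sketch the traceback re-compares table entries instead of storing the b array of copy/skipx/skipy choices, which gives the same path:

```python
def lcs_length(x, y):
    """c[i][j] = length of an LCS of x[:i] and y[:j], built bottom-up."""
    m, n = len(x), len(y)
    c = [[0] * (n + 1) for _ in range(m + 1)]   # row 0 and column 0 stay 0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:            # "copy" case
                c[i][j] = c[i - 1][j - 1] + 1
            else:                               # "skipx" / "skipy" case
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
    return c

def reconstruct_lcs(c, x, y):
    """Walk back from c[m][n], emitting matched symbols in reverse order."""
    i, j, out = len(x), len(y), []
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:
            out.append(x[i - 1]); i -= 1; j -= 1
        elif c[i - 1][j] >= c[i][j - 1]:        # skip a letter of x
            i -= 1
        else:                                    # skip a letter of y
            j -= 1
    return "".join(reversed(out))

c = lcs_length("ACGGTTA", "CGTAT")
print(c[-1][-1])                                 # 4
print(reconstruct_lcs(c, "ACGGTTA", "CGTAT"))    # one LCS of length 4
```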
LCS: Example
Let's run: X = ACGGTTA, Y = CGTAT
How much can we reduce our space
requirements, if we do not need to
reconstruct LCS?
Next Lecture
Graphs:
Representation in memory
Breadth-first search
Depth-first search
Topological sort