
ECE 250 Algorithms and Data Structures

Dynamic programming

Douglas Wilhelm Harder, M.Math. LEL


Department of Electrical and Computer Engineering
University of Waterloo
Waterloo, Ontario, Canada

ece.uwaterloo.ca
[email protected]

© 2006-2013 by Douglas Wilhelm Harder. Some rights reserved.


Dynamic programming
2

Dynamic programming

To begin, the word programming is used by mathematicians to
describe a set of rules which must be followed to solve a problem
– Thus, linear programming describes the sets of rules that must be followed
to solve a linear problem
– In our context, the adjective dynamic describes how the set of rules
works
Dynamic programming
3

Outline

We will begin by looking at three problems:


– Fibonacci numbers
– Newton polynomial coefficients
– Determining connectedness in a DAG

We will examine the naïve top-down approach


– We will implement the naïve top-down approach
– We will discuss the run-times of these algorithms
– We will consider the application of memoization
– We will discuss the benefits of a bottom-up approach
Dynamic programming
4

Fibonacci numbers

Consider this function:


double F( int n ) {
return ( n <= 1 ) ? 1.0 : F(n - 1) + F(n - 2);
}

The run-time of this algorithm is

$$T(n) = \begin{cases} \Theta(1) & n \le 1 \\ T(n-1) + T(n-2) + \Theta(1) & n > 1 \end{cases}$$
Dynamic programming
5

Fibonacci numbers

Consider this function calculating Fibonacci numbers:


double F( int n ) {
return ( n <= 1 ) ? 1.0 : F(n - 1) + F(n - 2);
}

The run-time is similar to the actual definition of the Fibonacci numbers:

$$T(n) = \begin{cases} \Theta(1) & n \le 1 \\ T(n-1) + T(n-2) + \Theta(1) & n > 1 \end{cases}
\qquad
F(n) = \begin{cases} 1 & n \le 1 \\ F(n-1) + F(n-2) & n > 1 \end{cases}$$

Therefore, T(n) = Ω(F(n)) = Ω(φⁿ)
– In actual fact, T(n) = Θ(φⁿ), since $\lim_{n \to \infty} \frac{T(n)}{F(n)} = 2$
Dynamic programming
6

Fibonacci numbers

To demonstrate, consider:

#include <iostream>
#include <ctime>
using namespace std;

double F( int n ) {
    return ( n <= 1 ) ? 1.0 : F(n - 1) + F(n - 2);
}

int main() {
    cout.precision( 16 );   // print 16 decimal digits of precision for doubles
                            // 53/lg(10) = 15.95458977...

    for ( int i = 33; i < 100; ++i ) {
        cout << "F(" << i << ") = "
             << F(i) << '\t' << time(0) << endl;
    }

    return 0;
}
Dynamic programming
7

Fibonacci numbers

The output:

F(33) = 5702887       1206474355
F(34) = 9227465       1206474355    F(33), F(34), and F(35) in 1 s
F(35) = 14930352      1206474355
F(36) = 24157817      1206474356
F(37) = 39088169      1206474358
F(38) = 63245986      1206474360
F(39) = 102334155     1206474363
F(40) = 165580141     1206474368
F(41) = 267914296     1206474376
F(42) = 433494437     1206474389
F(43) = 701408733     1206474411
F(44) = 1134903170    1206474469    ~1 min to calculate F(44)
Dynamic programming
8

Fibonacci numbers

Problem:
– To calculate F(44), it is necessary to calculate F(43) and F(42)
– However, to calculate F(43), it is also necessary to calculate F(42)
– It gets worse, for example
• F(40) is called 5 times
• F(30) is called 620 times
• F(20) is called 75 025 times
• F(10) is called 9 227 465 times
• F(0) is called 433 494 437 times

Surely we don’t have to recalculate F(10) almost ten million times…


Dynamic programming
9

Fibonacci numbers

Here is a possible solution:


– To avoid calculating values multiple times, store intermediate
calculations in a table
– This process of storing intermediate results is called memoization
• The root is memo
– We save (memoize) computed answers for possible later reuse, rather
than re-computing the answer multiple times
Dynamic programming
10

Fibonacci numbers

Once we have calculated a value, can we not store that value and
return it if a function is called a second time?
– One solution: use an array
– Another: use an associative hash table
Dynamic programming
11

Fibonacci numbers

static const int ARRAY_SIZE = 1000;

double *array = new double[ARRAY_SIZE];

array[0] = 1.0;
array[1] = 1.0;

// use 0.0 to indicate we have not yet calculated a value
for ( int i = 2; i < ARRAY_SIZE; ++i ) {
    array[i] = 0.0;
}

double F( int n ) {
    if ( array[n] == 0.0 ) {
        array[n] = F( n - 1 ) + F( n - 2 );
    }

    return array[n];
}

Problems:
– What if you go beyond the end of the array?
– What if our problem is not indexed by integers?
Dynamic programming
12

Fibonacci numbers

Recall the characteristics of an associative container:

template <typename S, typename T>
class Hash_table {
    public:
        // is something stored with the given key?
        bool member( S key ) const;

        // returns the value associated with the inserted key
        T retrieve( S key ) const;

        void insert( S key, T value );

        // ...
};
Dynamic programming
13

Fibonacci numbers

This program uses the Standard Template Library:


#include <map>

/* calculate the nth Fibonacci number */


double F( int n ) {
static std::map<int, double> memo;
// shared by all calls to F
// the key is int, the value is double

if ( n <= 1 ) {
return 1.0;
} else {
if ( memo[n] == 0.0 ) {
memo[n] = F(n - 1) + F(n - 2);
}

return memo[n];
}
}
Dynamic programming
14

Fibonacci numbers

This prints the output up to F(1476):


int main() {
std::cout.precision( 16 );

for ( int i = 0; i <= 1476; ++i ) {


std::cout << "F(" << i << ") = " << F(i) << std::endl;
}

return 0;
}

The last two lines are


F(1475) = 1.306989223763399e+308
F(1476) = inf

Exact value: F(1475) =


13069892237633993180363115538027198309839244390741264072
6006659460192793070479231740288681087777017721095463154979012276234322246936939647185366706368489362660844147
4499413484628009227558189696347433489829164249540627441359698656154072764924106537217745906695448014908376491
61732095972658064630033793347171632
Dynamic programming
15

Top-down and bottom-up algorithms

This also allows for another approach:


– Our implementation is top-down
– From the very top, we break the problem into sub-problems
– All divide-and-conquer algorithms we have seen are top-down

An alternative is a bottom-up approach

– Solve smaller problems and use these to build a solution for the larger problem
– Merge sort could be implemented bottom-up, as sketched below:
• Divide the array into groups of size m and sort each group with insertion sort
• Merge adjacent pairs into groups of size 2m where possible
• Repeat this successively until the entire list is sorted
Dynamic programming
16
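
Here is a minimal sketch of the bottom-up merge sort just described; for
simplicity it starts from runs of size 1 rather than insertion-sorted groups
of size m, and the function name is illustrative:

#include <algorithm>
#include <vector>

void bottom_up_merge_sort( std::vector<int> &a ) {
    int n = a.size();

    // merge adjacent runs of width 1, 2, 4, ... until one sorted run remains
    for ( int width = 1; width < n; width *= 2 ) {
        for ( int i = 0; i + width < n; i += 2*width ) {
            std::inplace_merge( a.begin() + i,
                                a.begin() + i + width,
                                a.begin() + std::min( i + 2*width, n ) );
        }
    }
}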

Top-down and bottom-up algorithms

Here is a bottom-up approach to calculating Fibonacci numbers:

double F( int n ) {
if ( n <= 1 ) {
return 1.0;
}

double a = 1.0, b = 1.0;

for ( int i = 2; i <= n; ++i ) {


a += b;
b = a - b;
}

return a;
}
Dynamic programming
17

Newton polynomials

The coefficients of a Newton polynomial of degree n – 1 passing
through n points may be calculated by:

$$y_{k,\dots,1} = \frac{y_{k,\dots,2} - y_{k-1,\dots,1}}{x_k - x_1}$$

where k = 1, …, n, and the divided differences are defined recursively:

$$y_{j,\dots,i} = \frac{y_{j,\dots,i+1} - y_{j-1,\dots,i}}{x_j - x_i}
\qquad\qquad
y_{j+1,j} = \frac{y_{j+1} - y_j}{x_{j+1} - x_j}$$
Dynamic programming
18

Newton polynomials

A recursive implementation for calculating Newton polynomial
coefficients would have a run time of Θ(2ⁿ):

double Newton_polynomials( int i, int j, double *xs, double *ys ) {
    if ( i == j ) {
        return ys[i];
    } else if ( i + 1 == j ) {
        return ( ys[j] - ys[i] )/( xs[j] - xs[i] );
    } else {
        return (
            Newton_polynomials( i + 1, j, xs, ys )
          - Newton_polynomials( i, j - 1, xs, ys )
        ) / ( xs[j] - xs[i] );
    }
}
Dynamic programming
19

Newton polynomials
n n  1
Note, however, that there are only coefficients
2
– This shouldn’t require such recursion

x0 y0
y1  y0
y1, 0 
x1  x0 y2 ,1  y1, 0
x1 y1 y2 ,1, 0 
y2  y1 x2  x0 y3, 2 ,1  y2 ,1, 0
y2 ,1  y3, 2 ,1, 0 
x2  x1 y3 , 2  y2 ,1 x3  x0
x2 y2 y3, 2 ,1 
y3  y 2 x3  x1
y3 , 2 
x3  x2
x3 y3
Dynamic programming
20

Newton polynomials

Here is an array-based implementation:

#include <cmath>

static const int ARRAY_SIZE = 1000;

double (*array)[ARRAY_SIZE] = new double[ARRAY_SIZE][ARRAY_SIZE];

for ( int i = 0; i < ARRAY_SIZE; ++i ) {
    for ( int j = 0; j < ARRAY_SIZE; ++j ) {
        array[i][j] = nan( "" );
    }
}

double Newton_polynomials( int i, int j, double *xs, double *ys ) {
    if ( i == j ) {
        return ys[i];
    } else if ( i + 1 == j ) {
        return ( ys[j] - ys[i] )/( xs[j] - xs[i] );
    } else {
        if ( isnan( array[i][j] ) ) {
            array[i][j] = (
                Newton_polynomials( i + 1, j, xs, ys )
              - Newton_polynomials( i, j - 1, xs, ys )
            ) / ( xs[j] - xs[i] );
        }

        return array[i][j];
    }
}
Dynamic programming
21

Newton polynomials

Here is a hash-table based implementation:


#include <map>
#include <cmath>
#include <utility>

double Newton_polynomials( int i, int j, double *xs, double *ys, bool clear = true
) {
static std::map< std::pair< int, int >, double > memo;

if ( clear ) {
memo.clear();
}

if ( i == j ) {
return ys[i];
} else if ( i + 1 == j ) {
return ( ys[j] - ys[i] )/( xs[j] - xs[i] );
} else {
if ( memo.find( std::pair<int, int>( i, j ) ) == memo.end() ) {
memo[std::pair<int, int>( i, j )] = (
Newton_polynomials( i + 1, j, xs, ys, false )
- Newton_polynomials( i, j - 1, xs, ys, false )
) / (xs[j] - xs[i]);
}

        return memo[std::pair<int, int>( i, j )];
    }
}
Dynamic programming
22

Newton polynomials

Here is a bottom-up approach:


void Newton_polynomials( int n, double *xs, double *ys, double *coeffs ) {
for ( int i = 0; i < n; ++i ) {
coeffs[i] = ys[i];
}

for ( int i = 1; i < n; ++i ) {


for ( int j = n - 1; j >= i; --j ) {
coeffs[j] = ( coeffs[j] - coeffs[j - 1] ) / ( xs[j] - xs[j - i] );
}
}
}

Note that the memory required is now Θ(n) and not Θ(n²)
Dynamic programming
23
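
As a quick sanity check of this routine, here is a small driver on three
points of y = x² + 1 (the sample data is ours, not from the slides):

#include <iostream>

int main() {
    double xs[3] = { 0.0, 1.0, 2.0 };
    double ys[3] = { 1.0, 2.0, 5.0 };
    double coeffs[3];

    Newton_polynomials( 3, xs, ys, coeffs );

    // expect 1 1 1:  p(x) = 1 + (x - 0) + (x - 0)(x - 1) = x^2 + 1
    std::cout << coeffs[0] << ' ' << coeffs[1] << ' ' << coeffs[2] << std::endl;

    return 0;
}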

Connected

To determine whether two vertices are connected in a DAG, we could
implement the following:
bool Weighted_graph::connected( int i, int j ) {
if ( adjacent( i, j ) ) {
return true;
}

for ( int v : neighbors( i ) ) {


if ( connected( v, j ) ) {
return true;
}
}

return false;
}

– What are the issues with this implementation? (see the sketch below)


Dynamic programming
24
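
One issue is that connected( v, j ) may be recomputed many times for a
vertex reachable along many paths, giving an exponential run time. A
possible fix, sketched here with a hypothetical helper and the same assumed
interface, is to memoize the vertices already shown not to reach j:

#include <set>

bool Weighted_graph::connected( int i, int j ) {
    std::set<int> dead_ends;   // vertices known not to reach j

    return connected_memo( i, j, dead_ends );
}

bool Weighted_graph::connected_memo( int i, int j, std::set<int> &dead_ends ) {
    if ( adjacent( i, j ) ) {
        return true;
    }

    for ( int v : neighbors( i ) ) {
        // skip vertices already shown not to reach j
        if ( dead_ends.find( v ) == dead_ends.end() ) {
            if ( connected_memo( v, j, dead_ends ) ) {
                return true;
            }

            dead_ends.insert( v );
        }
    }

    return false;
}

Each vertex is now fully expanded at most once, so the run time is linear
in the size of the graph rather than exponential.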

Dynamic programming

In solving optimization problems, the top-down approach may
require repeatedly obtaining optimal solutions for the same sub-problem
– The mathematician Richard Bellman initially formulated the concept of
dynamic programming in 1953 to solve such problems
– The idea isn’t new, but Bellman formalized this process
Dynamic programming
25

Dynamic programming

Dynamic programming is distinct from divide-and-conquer, as the


divide-and-conquer approach works well if the sub-problems are
essentially unique
– Storing intermediate results would only waste memory

If sub-problems re-occur, the problem is said to have overlapping


sub-problems
Dynamic programming
26

Matrix chain multiplication

Suppose A is k × m and B is m × n
– Then AB is k × n and calculating AB is Θ(kmn)
– The number of multiplications required is exactly kmn

Suppose we are multiplying three matrices ABC


– Matrix multiplication is associative so we may choose (AB)C or A(BC)
– The order of the multiplications may significantly affect the run time

For example, if A and B are n × n matrices and v is an n-dimensional


column vector:
– Calculating (AB)v is Θ(n³)
– Calculating A(Bv) is Θ(n²)
Dynamic programming
27

Matrix chain multiplication

Suppose we want to multiply four matrices ABCD


– There are five ways of parenthesizing this product:
((AB)C)D
(AB)(CD)
(A(BC))D
A((BC)D)
A(B(CD))

– Which has the least number of operations?


Dynamic programming
28

Matrix chain multiplication

For example, consider these four:

Matrix Dimensions
A 20 × 5
B 5 × 40
C 40 × 50
D 50 × 10
Dynamic programming
29

Matrix chain multiplication

Considering each order:


((AB)C)D

The required number of multiplications is:


AB 20 × 5 × 40 = 4000
(AB)C 20 × 40 × 50 = 40000
((AB)C)D 20 × 50 × 10 = 10000

This totals to 54000 multiplications


Dynamic programming
30

Matrix chain multiplication

Considering the next order:


(AB)(CD)

The required number of multiplications is:


AB 20 × 5 × 40 = 4000
CD 40 × 50 × 10 = 20000
(AB)(CD) 20 × 40 × 10 = 8000

This totals to 32000 multiplications


Dynamic programming
31

Matrix chain multiplication

Considering the next order:


(A(BC))D

The required number of multiplications is:


BC 5 × 40 × 50 = 10000
A(BC) 20 × 5 × 50 = 5000
(A(BC))D 20 × 50 × 10 = 10000

This totals to 25000 multiplications


Dynamic programming
32

Matrix chain multiplication

And the next order:


A((BC)D)

The required number of multiplications is:


BC 5 × 40 × 50 = 10000
(BC)D 5 × 50 × 10 = 2500
A((BC)D) 20 × 5 × 10 = 1000

This totals to 13500 multiplications


Dynamic programming
33

Matrix chain multiplication

Repeating this for the last, we get the following table:

Order Multiplications
((AB)C)D 54000
(AB)(CD) 32000
(A(BC))D 25000
A((BC)D) 13500
A(B(CD)) 23000

The optimal run time uses A((BC)D)


Dynamic programming
34

Matrix chain multiplication

Thus, the optimal run time may be found by calculating the product
in the order
A((BC)D)

Problem: what if we are multiplying n matrices?


Dynamic programming
35

Matrix chain multiplication

Can we generate a greedy algorithm to achieve this?

– Greedy by the smallest number of operations?
• Unfortunately, for matrices of dimensions 2 × 1, 1 × 2 and 2 × 3:
2 × 1 × 2 + 2 × 2 × 3 = 16 > 12 = 1 × 2 × 3 + 2 × 1 × 3
even though 2 × 1 × 2 = 4 < 6 = 1 × 2 × 3

– Greedy by generating the smallest matrices (sum of dimensions)?
• Unfortunately, for matrices of dimensions 2 × 1, 1 × 2 and 2 × 4:
2 × 1 × 2 + 2 × 2 × 4 = 20 > 16 = 1 × 2 × 4 + 2 × 1 × 4
even though 2 + 2 = 4 < 5 = 1 + 4
Dynamic programming
36

Matrix chain multiplication

If we are multiplying A1A2A3A4⋅⋅⋅An, starting top-down, there are
n – 1 different ways of parenthesizing this sequence at the top level:
(A1)(A2A3A4⋅⋅⋅An)
(A1A2)(A3A4⋅⋅⋅An)
(A1A2A3)(A4⋅⋅⋅An)
⋅⋅⋅
(A1⋅⋅⋅An – 2)(An – 1An)
(A1⋅⋅⋅An – 2An – 1)(An)

For each one we must ask:

– What is the work required to perform this final multiplication?
– What is the minimal amount of work required to perform each of the
two sub-products?
Dynamic programming
37

Matrix chain multiplication

For example, in finding the best product of
(A1⋅⋅⋅Ai)(Ai + 1⋅⋅⋅An)
the work required is:
– The product rows(A1) · columns(Ai) · columns(An)
• Note that columns(Ai) and rows(Ai + 1) must be equal
– The minimal work required to multiply A1⋅⋅⋅Ai
– The minimal work required to multiply Ai + 1⋅⋅⋅An
Dynamic programming
38
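
Putting these points together, a sketch of the recurrence (in notation of
our own: rᵢ and cᵢ denote rows(Aᵢ) and columns(Aᵢ)) for the minimal number
of multiplications N(i, j) needed to compute Ai⋅⋅⋅Aj is

$$N(i,j) = \begin{cases} 0 & i = j \\ \displaystyle\min_{i \le k < j}\left\{ N(i,k) + N(k+1,j) + r_i\, c_k\, c_j \right\} & i < j \end{cases}$$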

Matrix chain multiplication

int matrix_chain( Matrix *ms, int i, int j ) {
    // There is only one matrix
    if ( i + 1 == j ) {
        return 0;
    } else if ( i + 2 == j ) {
        assert( ms[i].columns() == ms[i + 1].rows() );
        return ms[i].rows() * ms[i].columns() * ms[i + 1].columns();
    }

    // We are multiplying at least three matrices

    // Start with calculating the work for M[i] * (M[i + 1] * ... * M[j - 1])
    assert( ms[i].columns() == ms[i + 1].rows() );
    int minimum = matrix_chain( ms, i + 1, j )
                + ms[i].rows() * ms[i].columns() * ms[j - 1].columns();

    for ( int k = i + 2; k < j; ++k ) {
        // Find the work for (M[i] * ... * M[k - 1]) * (M[k] * ... * M[j - 1])
        // and update if it is less
        assert( ms[k - 1].columns() == ms[k].rows() );
        int current = matrix_chain( ms, i, k ) + matrix_chain( ms, k, j )
                    + ms[i].rows() * ms[k].rows() * ms[j - 1].columns();

        if ( current < minimum ) {
            minimum = current;
        }
    }

    return minimum;
}
Dynamic programming
39

Matrix chain multiplication

Because of the recursive nature, we will on numerous occasions be
asking for the optimal behaviour of a given subsequence
– We will ask for the optimal way to multiply A3 ⋅⋅⋅ An – 2 when we ask for the
optimal way to multiply any of the following:
A1 (A2 (((A3 ⋅⋅⋅ An – 2) An – 1) An))
A1 (A2 ((A3 ⋅⋅⋅ An – 2) (An – 1 An)))
A1 ((A2 (A3 ⋅⋅⋅ An – 2)) (An – 1 An))
A1 ((A2 ((A3 ⋅⋅⋅ An – 2) An – 1)) An)
A1 (((A2 (A3 ⋅⋅⋅ An – 2)) An – 1) An)
(A1 A2) (((A3 ⋅⋅⋅ An – 2) An – 1) An)
(A1 A2) ((A3 ⋅⋅⋅ An – 2) (An – 1 An))
(A1 (A2 (A3 ⋅⋅⋅ An – 2))) (An – 1 An)
((A1 A2) (A3 ⋅⋅⋅ An – 2)) (An – 1 An)
(A1 (A2 ((A3 ⋅⋅⋅ An – 2) An – 1))) An
(A1 ((A2 (A3 ⋅⋅⋅ An – 2)) An – 1)) An
((A1 A2) ((A3 ⋅⋅⋅ An – 2) An – 1)) An
((A1 (A2 (A3 ⋅⋅⋅ An – 2))) An – 1) An
(((A1 A2) (A3 ⋅⋅⋅ An – 2)) An – 1) An
Dynamic programming
40

Matrix chain multiplication

The actual number of possible orderings is given by the following
recurrence relation:

$$P(n) = \begin{cases} 1 & n = 1 \\ \displaystyle\sum_{k=1}^{n-1} P(k)\,P(n-k) & n > 1 \end{cases}$$

Without proof, this recurrence relation is solved by P(n) = C(n – 1),
where C(n – 1) is the (n – 1)th Catalan number:

$$C(n) = \frac{1}{n+1}\binom{2n}{n} = \Theta\!\left(\frac{4^n}{n^{3/2}}\right)$$

C(1) = 1     C(6) = 132
C(2) = 2     C(7) = 429
C(3) = 5     C(8) = 1430
C(4) = 14    C(9) = 4862
C(5) = 42    C(10) = 16796
Dynamic programming
41
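
A tiny program can confirm the first few values of P(n) against the Catalan
numbers listed above (a straightforward translation of the recurrence):

#include <iostream>

int main() {
    long long P[12];
    P[1] = 1;

    for ( int n = 2; n <= 11; ++n ) {
        P[n] = 0;

        for ( int k = 1; k < n; ++k ) {
            P[n] += P[k] * P[n - k];
        }
    }

    // P(n) = C(n - 1):  expect 1 1 2 5 14 42 132 429 1430 4862 16796
    for ( int n = 1; n <= 11; ++n ) {
        std::cout << P[n] << ' ';
    }
    std::cout << std::endl;

    return 0;
}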

Matrix chain multiplication

The number of function calls of the implementation is given by

$$T(n) = \begin{cases} 1 & n = 1, 2 \\ 4 \cdot 3^{\,n-3} & n \ge 3 \end{cases} = \Theta(3^n)$$

Can we speed this up?


– Memoization: once we’ve found the optimal number of multiplications for a
given subsequence, store it
Dynamic programming
42

Matrix chain multiplication


#include <map>
#include <utility>

int matrix_chain_memo( Matrix *ms, int i, int j, bool clear = true ) {


static std::map< std::pair< int, int >, int > memo;

    if ( clear ) {
        memo.clear();   // associate a pair (i, j) with an integer
    }
if ( i + 1 == j ) {
return 0;
} else if ( memo[std::pair<int, int>(i, j)] == 0 ) {
if ( i + 2 == j ) {
memo[std::pair<int, int>(i, j)] = ms[i].rows() * ms[i].columns() * ms[i + 1].columns();
} else {
int minimum = matrix_chain_memo( ms, i + 1, j, false )
+ ms[i].rows() * ms[i].columns() * ms[j - 1].columns();

for ( int k = i + 2; k < j; ++k ) {


int current = matrix_chain_memo( ms, i, k, false ) + matrix_chain_memo( ms, k, j, false )
+ ms[i].rows() * ms[k].rows() * ms[j - 1].columns();

if ( current < minimum ) {


minimum = current;
}
}

memo[std::pair<int, int>(i, j)] = minimum;


}
}

return memo[std::pair<int, int>(i, j)];


}
Dynamic programming
43

Matrix chain multiplication

Our memoized version now runs in

$$T(n) = \begin{cases} 1 & n = 1 \\ (n-1)(n+1) & n \ge 2 \end{cases} = \Theta(n^2)$$

This is a top-down implementation


– Can we implement a bottom-up version?
Dynamic programming
44

Matrix chain multiplication

For a bottom-up implementation, we need a table that stores our
best current solution for each subsequence Ai···Aj:

        j = 1   2   3   4   ···   n
i = 1     0
    2         0
    3             0
    4                 0
    ···                     ···
    n                             0
Dynamic programming
45

Matrix chain multiplication

As we calculate the minimum number of multiplications required for
a specific sequence Ai···Aj, we fill in the entry aij in the table

        j = 1   2   3   4   ···   n
i = 1     0
    2         0
    3             0
    4                 0
    ···                     ···
    n                             0
Dynamic programming
46

Matrix chain multiplication

This table has n² entries, and therefore our run time must be at least
Ω(n²)

However, filling each entry may require minimizing over up to n – 1 splits,
so the actual run time is Θ(n³)


Dynamic programming
47

Matrix chain multiplication

For example, given the previous example

Matrix Dimensions
A1 20 × 5
A2 5 × 40
A3 40 × 50
A4 50 × 10
we can calculate the table as follows...
Dynamic programming
48

Matrix chain multiplication

We can calculate the off-diagonal easily:

Matrix  Dimensions
A1      20 × 5
A2      5 × 40
A3      40 × 50
A4      50 × 10

         1           2              3              4
1    0 (20×5)    4000 (20×40)
2                0 (5×40)      10000 (5×50)
3                              0 (40×50)     20000 (40×10)
4                                            0 (50×10)

(the dimensions under each entry give the size of the corresponding product)
Dynamic programming
49

Matrix chain multiplication

Next, we may calculate either:

A1(A2A3)    20 × 5 × 50 + 10000 = 15000
(A1A2)A3    4000 + 20 × 40 × 50 = 44000

Matrix  Dimensions
A1      20 × 5
A2      5 × 40
A3      40 × 50
A4      50 × 10

         1           2              3              4
1    0 (20×5)    4000 (20×40)  15000 (20×50)
2                0 (5×40)      10000 (5×50)
3                              0 (40×50)     20000 (40×10)
4                                            0 (50×10)
Dynamic programming
50

Matrix chain multiplication

We continue:

A2(A3A4)    5 × 40 × 10 + 20000 = 22000
(A2A3)A4    10000 + 5 × 50 × 10 = 12500

Matrix  Dimensions
A1      20 × 5
A2      5 × 40
A3      40 × 50
A4      50 × 10

         1           2              3              4
1    0 (20×5)    4000 (20×40)  15000 (20×50)
2                0 (5×40)      10000 (5×50)   12500 (5×10)
3                              0 (40×50)     20000 (40×10)
4                                            0 (50×10)
Dynamic programming
51

Matrix chain multiplication

Finally, we calculate:

A1((A2A3)A4)   20 × 5 × 10 + 12500 = 13500
(A1A2)(A3A4)   4000 + 20 × 40 × 10 + 20000 = 32000
(A1(A2A3))A4   15000 + 20 × 50 × 10 = 25000

Matrix  Dimensions
A1      20 × 5
A2      5 × 40
A3      40 × 50
A4      50 × 10

         1           2              3              4
1    0 (20×5)    4000 (20×40)  15000 (20×50)  13500 (20×10)
2                0 (5×40)      10000 (5×50)   12500 (5×10)
3                              0 (40×50)     20000 (40×10)
4                                            0 (50×10)
Dynamic programming
52

Matrix chain multiplication

Thus, counting the number of calculations required (each Θ(1)):
– n – 1 entries each requiring 1 step
– n – 2 entries each requiring 2 steps
– n – 3 entries each requiring 3 steps
which suggests the sum

$$\sum_{i=1}^{n-1} (n-i)\,i
= n\sum_{i=1}^{n-1} i - \sum_{i=1}^{n-1} i^2
= \frac{n^2(n-1)}{2} - \frac{n(n-1)(2n-1)}{6}
= \frac{n^3 - n}{6}$$

Therefore, the run time is Θ(n³)


Dynamic programming
53

Matrix chain multiplication


int matrix_chain_iterative( Matrix *ms, int n ) {
int array[n][n];

for ( int i = 0; i < n; i++ ) {


array[i][i] = 0;
}

for ( int i = 1; i < n; i++ ) {


for ( int j = 0; j < n - i; j++ ) {
array[j][j + i] = array[j][j] + array[j + 1][j + i]
+ ms[j].rows()*ms[j + 1].rows()*ms[j + i].columns();

for ( int k = j + 1; k < j + i; k++) {


int current = array[j][k] + array[k + 1][j + i]
+ ms[j].rows()*ms[k + 1].rows()*ms[j + i].columns();

if ( current < array[j][j + i] ) {


array[j][j + i] = current;
}
}
}
}

return array[0][n - 1];


}
Dynamic programming
54
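
To exercise this routine on the running example, here is a small driver;
the Matrix class below is a hypothetical stub carrying only the dimensions:

#include <iostream>

class Matrix {
    int m, n;

    public:
        Matrix( int rows, int columns ) : m( rows ), n( columns ) {}
        int rows() const { return m; }
        int columns() const { return n; }
};

int main() {
    Matrix ms[4] = { Matrix( 20, 5 ), Matrix( 5, 40 ),
                     Matrix( 40, 50 ), Matrix( 50, 10 ) };

    std::cout << matrix_chain_iterative( ms, 4 ) << std::endl;   // expect 13500

    return 0;
}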

Matrix chain multiplication

How can you estimate run times?


– Which is faster—the top-down implementation with memoization or the
bottom-up implementation?
– You could plot run times…
– Question: what is the growth of this plotted data?
• Quadratic?
• Cubic?
• Θ(n² ln(n))?
Dynamic programming
55

Matrix chain multiplication

If a function grows in polynomial time, it is of the form:
T(n) = a·n^b
Take the logarithm of both sides:
ln(T(n)) = ln(a·n^b) = ln(a) + b·ln(n)
– On a log-log plot, it grows linearly with slope b

Top-down with memoization: b ≈ 2        Bottom-up: b ≈ 3
Dynamic programming
56
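
The slope can be estimated from as few as two timings (a sketch; the
timing values below are made up for illustration):

#include <cmath>
#include <iostream>

int main() {
    double n1 = 1000.0, t1 = 0.8;   // hypothetical measurements T(n1) and T(n2)
    double n2 = 2000.0, t2 = 6.5;

    // slope of ln(T) versus ln(n)
    double b = std::log( t2/t1 ) / std::log( n2/n1 );

    std::cout << "estimated exponent b = " << b << std::endl;   // about 3

    return 0;
}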

Optimal polygon triangulation

In graphics and geometry, convex polygons are a basic unit


– Applications in graphics and finite-element methods
Dynamic programming
57

Optimal polygon triangulation

In graphics and geometry, convex polygons are a basic unit


– Dividing such a polygon into simpler triangles is a common operation
Dynamic programming
58

Optimal polygon triangulation

In graphics and geometry, convex polygons are a basic unit


– Some triangulations may be better than others
– For example, fewer thin triangles
Dynamic programming
59

Optimal polygon triangulation

Now, a convex quadrilateral (tetragon) can only be triangulated in


two different ways
Dynamic programming
60

Optimal polygon triangulation

A convex pentagon can be triangulated in five different ways


Dynamic programming
61

Optimal polygon triangulation

And a convex hexagon can be triangulated in 14 different ways


Dynamic programming
62

Optimal polygon triangulation

The sequence 2, 5, 14 may already appear familiar


– The Catalan numbers...

If we can put a weight on each generated triangle, can we find an


optimal triangulation?
– Can we come up with a good algorithm?
– Consider the previous problem of finding an optimal order for multiplying
matrices
Dynamic programming
63

Optimal polygon triangulation

Choose a side and begin numbering the sides in order


– Any two adjacent sides can be joined to create a triangle
– Any two adjacent matrices could be multiplied
Dynamic programming
64

Optimal polygon triangulation

Taking two adjacent sides and creating a triangle is similar to


bracketing
Dynamic programming
65

Optimal polygon triangulation

Instead of sides 1 and 2, we could choose 2 and 3


Dynamic programming
66

Optimal polygon triangulation

Or 3 and 4
Dynamic programming
67

Optimal polygon triangulation

Suppose the triangle 3/4 was optimal


– Next, do we add the side 2?
Dynamic programming
68

Optimal polygon triangulation

Suppose the triangle 3/4 was optimal


– Or do we add the side 5?
Dynamic programming
69

Optimal polygon triangulation

The analogy is not exact, because there is no meaning to bracketing and
multiplying matrices A1 and A7

Nevertheless, this strongly suggests that there is an efficient
algorithm based on dynamic programming that will find an optimal
triangulation
Dynamic programming
70

Interval scheduling

A problem we saw in the topic on greedy algorithms was interval scheduling

Process   Interval
A         5 – 8
B         10 – 13
C         6 – 9
D         12 – 15
E         3 – 7
F         8 – 11
G         1 – 6
H         8 – 12
J         3 – 5
Dynamic programming
71

Interval scheduling

The earliest-deadline-first greedy algorithm will maximize the number
of processes that can be scheduled

Process   Interval
A         5 – 8
B         10 – 13
C         6 – 9
D         12 – 15
E         3 – 7
F         8 – 11
G         1 – 6
H         8 – 12
J         3 – 5
Dynamic programming
72

Interval scheduling

Suppose, however, that there are weights associated with the processes
– Can we find the schedulable processes that have maximal weight?

Currently, no greedy algorithm is known that can solve this problem

Process   Interval   Weight
A         5 – 8      1.7
B         10 – 13    1.3
C         6 – 9      3.9
D         12 – 15    3.2
E         3 – 7      5.9
F         8 – 11     3.3
G         1 – 6      5.4
H         8 – 12     1.2
J         3 – 5      5.8
Dynamic programming
73

Interval scheduling

Note: if we wanted to optimize based on processor usage, the weight would
be equal to the computation time

Process   Interval   Weight
A         5 – 8      3
B         10 – 13    3
C         6 – 9      3
D         12 – 15    3
E         3 – 7      4
F         8 – 11     3
G         1 – 6      5
H         8 – 12     4
J         3 – 5      2
Dynamic programming
74

Interval scheduling

First, as before, we sort the processes by their deadlines

Place   Process   Interval   Weight
1       K         2 – 4      4.8
2       J         3 – 5      5.8
3       G         1 – 6      5.4
4       E         3 – 7      5.9
5       A         5 – 8      1.7
6       C         6 – 9      3.9
7       F         8 – 11     3.3
8       H         8 – 12     1.2
9       B         10 – 13    1.3
Dynamic programming
75

Interval scheduling

Process L could only be run if nothing after the 7th process is chosen

Place   Process   Interval   Weight   Previous
1       K         2 – 4      4.8      0
2       J         3 – 5      5.8      0
3       G         1 – 6      5.4      0
4       E         3 – 7      5.9      0
5       A         5 – 8      1.7      2
6       C         6 – 9      3.9      3
7       F         8 – 11     3.3      5
8       H         8 – 12     1.2      5
9       B         10 – 13    1.3      6
Dynamic programming
76

Interval scheduling

Process M could only be run if nothing after the 6th process is chosen

Place   Process   Interval   Weight   Previous
1       K         2 – 4      4.8      0
2       J         3 – 5      5.8      0
3       G         1 – 6      5.4      0
4       E         3 – 7      5.9      0
5       A         5 – 8      1.7      2
6       C         6 – 9      3.9      3
7       F         8 – 11     3.3      5
8       H         8 – 12     1.2      5
9       B         10 – 13    1.3      6
Dynamic programming
77

Interval scheduling

If a process must be run first, we mark its previous process as 0

Place   Process   Interval   Weight   Previous
1       K         2 – 4      4.8      0
2       J         3 – 5      5.8      0
3       G         1 – 6      5.4      0
4       E         3 – 7      5.9      0
5       A         5 – 8      1.7      2
6       C         6 – 9      3.9      3
7       F         8 – 11     3.3      5
8       H         8 – 12     1.2      5
9       B         10 – 13    1.3      6
Dynamic programming
78


Interval scheduling

Thus, given process n, the optimal weight is either:


The weight of this process plus the optimal weight
of that previous process that could have been run
or
The optimal weight of the (n – 1)st process

This can be quickly converted to code:


int optimal_weight( int n ) const {
// The zeroth process has weight 0
if ( n == 0 ) {
return 0;
}

return std::max(
weight[n] + optimal_weight( previous[n] ),
optimal_weight( n - 1 )
);
}
Dynamic programming
80

Interval scheduling

Unfortunately, consider the worst-case run-time analysis:
– Suppose it is true that
previous[n] == (n - 1)
for all n; the run-time analysis is then:

int optimal_weight( int n ) const {
    if ( n == 0 ) {
        return 0;
    }

    return std::max(
        weight[n] + optimal_weight( previous[n] ),
        optimal_weight( n - 1 )
    );
}

$$T(n) = \begin{cases} \Theta(1) & n = 0 \\ 2T(n-1) + \Theta(1) & n > 0 \end{cases}$$
Dynamic programming
81

Interval scheduling

In the optimal case where we can run every process, the run time is
exponential:

$$T(n) = \begin{cases} \Theta(1) & n = 0 \\ 2T(n-1) + \Theta(1) & n > 0 \end{cases} = \Theta(2^n)$$
Dynamic programming
82

Interval scheduling

However, using memoization:


int optimal_weight_memo( int n, bool clear = true ) const {
static std::map<int, int> memo;

if ( clear ) {
memo.clear();
}

if ( n == 0 ) {
return 0;
} else if ( memo[n] == 0 ) {
memo[n] = std::max(
weight[n] + optimal_weight_memo( previous[n], false ),
optimal_weight_memo( n - 1, false )
);
}

return memo[n];
}
Dynamic programming
83

Interval scheduling

We can also implement an iterative bottom-up approach:

int optimal_weight_iterative( int n ) const {
    int optimal[n + 1];
    optimal[0] = 0;

    for ( int k = 1; k <= n; ++k ) {
        optimal[k] = std::max(
            weight[k] + optimal[previous[k]],
            optimal[k - 1]
        );
    }

    return optimal[n];
}
Dynamic programming
84

Interval scheduling

There is only one loop, so the run time is T(n) = Θ(n)

int optimal_weight_iterative( int n ) const {
    int optimal[n + 1];
    optimal[0] = 0;

    for ( int k = 1; k <= n; ++k ) {
        optimal[k] = std::max(
            weight[k] + optimal[previous[k]],
            optimal[k - 1]
        );
    }

    return optimal[n];
}
Dynamic programming
85
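
Running the same loop on the sorted table from the previous slides (weights
and previous-process indices copied from there, with entry 0 as a sentinel)
gives a maximal weight of 11 = 5.8 + 3.9 + 1.3, scheduling J, C and B:

#include <algorithm>
#include <iostream>

int main() {
    double weight[10]   = { 0.0, 4.8, 5.8, 5.4, 5.9, 1.7, 3.9, 3.3, 1.2, 1.3 };
    int    previous[10] = { 0,   0,   0,   0,   0,   2,   3,   5,   5,   6 };

    double optimal[10];
    optimal[0] = 0.0;

    for ( int k = 1; k <= 9; ++k ) {
        optimal[k] = std::max( weight[k] + optimal[previous[k]], optimal[k - 1] );
    }

    std::cout << optimal[9] << std::endl;   // prints 11

    return 0;
}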
Project management
0/1 knapsack problem
Recall our problem of project management, or the 0/1 knapsack problem
– The constraint W is an integer value
– There are n items we could include:
• The kth item has an integral weight wk
• Its value is vk
– The optimal value of objects we can place in the knapsack is:

$$\text{knapsack}(k, w) = \begin{cases}
0 & k = 0 \\
\text{knapsack}(k-1,\ w) & w_k > w \\
\max\bigl\{\text{knapsack}(k-1,\ w),\ \text{knapsack}(k-1,\ w - w_k) + v_k\bigr\} & w_k \le w
\end{cases}$$

– Again, this is a recursive optimization problem


Dynamic programming
86
Project management
0/1 knapsack problem
Here is a hash-table based implementation:
#include <map>
#include <cmath>
#include <utility>

double knapsack( int *weight, double *value, int k, int W, bool clear = true ) {
static std::map< std::pair< int, int >, double > memo;

if ( clear ) memo.clear();

if ( k == 0 ) {
return 0;
} else {
if ( memo.find( std::pair<int, int>( k, W ) ) == memo.end() ) {
if ( weight[k - 1] > W ) {
memo[std::pair<int, int>( k, W )] = knapsack( weight, value, k - 1, W,
false );
} else {
memo[std::pair<int, int>( k, W )] = std::max(
knapsack( weight, value, k - 1, W, false ),
knapsack( weight, value, k - 1, W - weight[k - 1], false ) + value[k
- 1]
);
}
}

        return memo[std::pair<int, int>( k, W )];
    }
}
Dynamic programming
87
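
A bottom-up version of the same recurrence is also possible (a sketch,
assuming the same weight and value arrays): fill an (n + 1) × (W + 1)
table row by row, so no recursion or hash table is needed:

#include <algorithm>
#include <vector>

double knapsack_iterative( int *weight, double *value, int n, int W ) {
    // table[k][w] = best value using the first k items with capacity w
    std::vector< std::vector<double> > table( n + 1, std::vector<double>( W + 1, 0.0 ) );

    for ( int k = 1; k <= n; ++k ) {
        for ( int w = 0; w <= W; ++w ) {
            if ( weight[k - 1] > w ) {
                table[k][w] = table[k - 1][w];
            } else {
                table[k][w] = std::max( table[k - 1][w],
                                        table[k - 1][w - weight[k - 1]] + value[k - 1] );
            }
        }
    }

    return table[n][W];
}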
Project management
0/1 knapsack problem
The run-time is O(nW)
– This is significantly slower than our greedy algorithm with O(n ln(n))
– The greedy algorithm also works if the weights are not integral

For our previous example, the output of

int main() {
    int weight[10] = { 15, 12, 10, 9, 8, 7, 5, 4, 3, 1 };
    double value[10] = { 210, 220, 180, 120, 160, 170, 90, 40, 60, 10 };

    std::cout << knapsack( weight, value, 10, 26 ) << std::endl;

    return 0;
}

is 520, which is the optimal result
Dynamic programming
88

Arrays versus hash tables

Both simple arrays and hash tables may be dynamically resized

As we saw before, this does not significantly affect the run time

Two problems with arrays are:


– They are indexed only by integers
– They are dense (we may not be solving all sub-problems)
Dynamic programming
89

Remember tables in Maple

Maple is a programming language with built-in memoization


– In Maple, memoization is achieved through the use of hash tables
termed remember tables
– To demonstrate, we will implement the calculation of Fibonacci numbers
in Maple
Dynamic programming
90

Remember tables in Maple

In Maple, we state that a function must have option remember

This gives the function a remember table

> F := proc(n)
      option remember;
      printf( "Calling F(%d)\n", n );
      procname( n - 1 ) + procname( n - 2 );  # procname is Maple's version of "this" function
  end proc:
> F( 0 ) := 1:  # assign F(0) = 1 to remember table
> F( 1 ) := 1:  # assign F(1) = 1 to remember table
Dynamic programming
91

Remember tables in Maple

> F(5); # the first time it is called,


# we call F(5), F(4), F(3), and F(2)
Calling F(5)
Calling F(4)
Calling F(3)
Calling F(2)
8
> F(5); # the second time, the result comes
# from the remember table
8
> F(7); # should only call F(7) and F(6)
Calling F(7)
Calling F(6)
21
Dynamic programming
92

Remember tables in Maple

Recall, however, that remember tables are hash tables (with 1024
bins)

They can therefore fill up, and accessing previous results can
become expensive
Dynamic programming
93

Remember tables in Maple

Maple uses a static chained hash table with 1024 bins


– Making n accesses to a hash table should occur in Θ(n) time
– The following shows the time (in seconds) it takes to calculate
F( 1024n ) for n up to 80 (plot not reproduced here)

– Behaviour is initially linear, but it degrades to quadratic growth


Dynamic programming
94

Applications

Other applications of dynamic programming include:


– The Floyd-Warshall algorithm solving the all-pairs shortest path problem
– The longest common subsequence of two strings
• This has applications in bioinformatics, such as DNA matching
– Solving the travelling salesman problem in O(n² 2ⁿ) time
– The Earley parser of context-free languages (ECE 351 Compilers)
– The Levenshtein distance between two strings (used in spell checkers)

These will be discussed in the homework


Dynamic programming
95

Applications

And finally:
– the Duckworth-Lewis method for calculating the target score in an
interrupted cricket match:
• In the 2006 One-Day International between India and Pakistan, India batted first and
was all out in the 49th over for 328
• Pakistan was 7 wickets down for 311 when bad light stopped play after the 47th over
• With three full overs left to play and three wickets in hand, the Duckworth-Lewis method
calculated the target to be 304, so the result of the match is officially listed as
“Pakistan won by 7 runs (D/L Method)”

http://en.wikipedia.org/wiki/Duckworth-Lewis_method
Dynamic programming
96

Summary

In this class, we have considered the algorithm design strategy of


dynamic programming
– Useful when recursive algorithms have overlapping sub-problems
– Storing calculated values allows significant reductions in time
• Memoization
– Considered top-down and bottom-up approaches to solving problems
– Looked at both simple and more complex solutions:
• Fibonacci numbers
• Newton polynomial coefficients
• Matrix chain multiplications
• Optimal polygon triangulation
• Interval scheduling
– Also considered Maple’s remember tables
Dynamic programming
97


These slides are provided for the ECE 250 Algorithms and Data Structures course. The
material in it reflects Douglas W. Harder’s best judgment in light of the information available to
him at the time of preparation. Any reliance on these course slides by any party for any other
purpose are the responsibility of such parties. Douglas W. Harder accepts no responsibility for
damages, if any, suffered by any party as a result of decisions made or actions based on these
course slides for any other purpose than that for which it was intended.
