Matrix Chain Multiplication
Given a sequence of matrices to multiply, determine the order of multiplication that minimizes the number of single-element multiplications.
That is, determine how the product should be parenthesized.
First, note that matrix multiplication is associative but not commutative. Because it is associative, we always have ((AB)(CD)) = (A(B(CD))), or any other grouping, as long as the matrices stay in the same consecutive order. BUT NOT: ((AB)(CD)) = ((BA)(DC))
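A small check of the associative-but-not-commutative claim, as a sketch. The helper `matmul` and the three 2x2 integer matrices are made up for illustration and are not part of the original slides.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows (hypothetical helper)."""
    assert len(X[0]) == len(Y), "inner dimensions must agree"
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

# Illustrative matrices chosen only for this example.
A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [1, 3]]

# Regrouping keeps the result: (AB)C equals A(BC).
associative = matmul(matmul(A, B), C) == matmul(A, matmul(B, C))

# Reordering does not: AB differs from BA for these matrices.
commutative = matmul(A, B) == matmul(B, A)

print(associative, commutative)
```

Integer entries are used so that the equality test is exact; with floating-point entries the two groupings could differ by rounding error.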
It may appear that the amount of work done won't change if you change the parenthesization of the expression, but we can show that is not the case! Let us use the following example:
Let A be a 2x10 matrix
Let B be a 10x50 matrix
Let C be a 50x20 matrix
Let's get back to our example: we will show that the way we group matrices when multiplying A, B, C matters. Multiplying a p x q matrix by a q x r matrix takes p*q*r single element multiplications, so the cost of each grouping depends on the sizes of the intermediate products.
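The comparison for this example can be worked out as a short sketch. It assumes the standard cost of p*q*r element multiplications for a p x q times q x r product; the function name `multiply_cost` is made up for illustration.

```python
def multiply_cost(p, q, r):
    """Element multiplications needed for a (p x q) times (q x r) product."""
    return p * q * r

# A is 2x10, B is 10x50, C is 50x20.

# (AB)C: AB is 2x50 and costs 2*10*50, then (AB)C costs 2*50*20.
cost_ab_first = multiply_cost(2, 10, 50) + multiply_cost(2, 50, 20)

# A(BC): BC is 10x20 and costs 10*50*20, then A(BC) costs 2*10*20.
cost_bc_first = multiply_cost(10, 50, 20) + multiply_cost(2, 10, 20)

print(cost_ab_first)  # 3000
print(cost_bc_first)  # 10400
```

Both groupings produce the same 2x20 result, but (AB)C needs 3000 multiplications while A(BC) needs 10400.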
The key to solving this problem is noticing the sub-problem optimality condition:
If a particular parenthesization of the whole product is optimal, then any subparenthesization in that product is optimal as well.
Say What?
Assume that we are calculating ABCDEF and that the following parenthesization is optimal:
(A (B ((CD) (EF)) ) )
Then (B ((CD) (EF)) ) is an optimal parenthesization of BCDEF as well.
Why is this?
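Because if it weren't, and say ( ((BC) (DE)) F) were better, then it would also follow that
(A ( ((BC) (DE)) F) ) is better than (A (B ((CD) (EF)) ) ), since the final multiplication of A by the product BCDEF costs the same either way. That contradicts the assumption that (A (B ((CD) (EF)) ) ) is optimal.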
In essence, there is some value of k at which we should "split" the product into two sub-products, (A0 ... Ak) and (Ak+1 ... An-1), so that we get an optimal result. The possible splits are:
(A0) (A1 A2 ... An-1)
(A0 A1) (A2 A3 ... An-1)
(A0 A1 A2) (A3 A4 ... An-1)
...
(A0 A1 ... An-3) (An-2 An-1)
(A0 A1 ... An-2) (An-1)
Basically, count the number of multiplications in each of these choices and pick the minimum.
One other point to notice is that you have to account for the minimum number of multiplications in each of the two products.
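Let matrix A_i have dimensions d_i x d_{i+1}, and let N_{i,j} be the minimum number of multiplications needed to compute A_i A_{i+1} ... A_j. Then N_{i,i} = 0 and, for i < j,
N_{i,j} = min over i <= k < j of ( N_{i,k} + N_{k+1,j} + d_i d_{k+1} d_{j+1} )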
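The recursive formula can be transcribed almost directly, as a sketch. It assumes the dimensions are given as a list `d` of length n+1, so that matrix i is d[i] x d[i+1]; the function name `min_multiplications` and the use of `functools.lru_cache` for memoization are choices made for this illustration, not something the slides prescribe.

```python
from functools import lru_cache

def min_multiplications(d):
    """Minimum element multiplications for a chain with dimension list d.

    Assumes at least one matrix, i.e. len(d) >= 2.
    """
    n = len(d) - 1  # number of matrices

    @lru_cache(maxsize=None)
    def N(i, j):
        # A single matrix needs no multiplications.
        if i == j:
            return 0
        # Try every split point k and keep the cheapest.
        return min(N(i, k) + N(k + 1, j) + d[i] * d[k + 1] * d[j + 1]
                   for k in range(i, j))

    return N(0, n - 1)

# A is 2x10, B is 10x50, C is 50x20.
best = min_multiplications([2, 10, 50, 20])
print(best)  # 3000
```

Without the cache, the same sub-chains would be recomputed many times; caching each (i, j) pair is what makes the recursion practical.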
Now let's turn this recursive formula into a dynamic programming solution.
Which sub-problems are necessary to solve first? Clearly it's necessary to solve the smaller problems before the larger ones.
In particular, we need to know N_{i,i+1}, the number of multiplications needed to multiply each adjacent pair of matrices, before we move on to larger tasks. Similarly, the next task we want to solve is finding all the values of the form N_{i,i+2}, then N_{i,i+3}, etc.
Algorithm:
1) Initialize N[i][i] = 0, and all other entries in N to infinity.
2) for i = 1 to n-1 do the following
   2i) for j = 0 to n-1-i do
      2ii) for k = j to j+i-1
         2iii) if (N[j][j+i] > N[j][k] + N[k+1][j+i] + d_j d_{k+1} d_{j+i+1})
                  N[j][j+i] = N[j][k] + N[k+1][j+i] + d_j d_{k+1} d_{j+i+1}
Basically, we're checking different places to split our matrices by checking different values of k and seeing if they improve our current minimum.
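A bottom-up version of the algorithm might look like the sketch below. It assumes the same dimension list `d` (matrix i is d[i] x d[i+1]). The `split` table and the `parenthesize` helper are additions for illustration, so that the chosen grouping can be printed; the names are not from the original slides.

```python
import math

def matrix_chain_order(d):
    """Return (N, split): minimum costs and the best split point per sub-chain."""
    n = len(d) - 1  # number of matrices
    N = [[0 if i == j else math.inf for j in range(n)] for i in range(n)]
    split = [[None] * n for _ in range(n)]
    for gap in range(1, n):              # distance between first and last index
        for start in range(0, n - gap):  # first matrix of the sub-chain
            end = start + gap
            for k in range(start, end):  # candidate split point
                cost = N[start][k] + N[k + 1][end] + d[start] * d[k + 1] * d[end + 1]
                if cost < N[start][end]:
                    N[start][end] = cost
                    split[start][end] = k
    return N, split

def parenthesize(split, i, j, names):
    """Rebuild the chosen grouping of matrices i..j from the split table."""
    if i == j:
        return names[i]
    k = split[i][j]
    return "(" + parenthesize(split, i, k, names) + parenthesize(split, k + 1, j, names) + ")"

# A is 2x10, B is 10x50, C is 50x20.
N, split = matrix_chain_order([2, 10, 50, 20])
print(N[0][2])                            # 3000
print(parenthesize(split, 0, 2, "ABC"))   # ((AB)C)
```

The three nested loops give O(n^3) running time and the table uses O(n^2) space. Sub-chains are filled in order of increasing gap, so every entry read on the right-hand side has already been computed.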
References
Slides adapted from Arup Guha's Computer Science II lecture notes: http://www.cs.ucf.edu/~dmarino/ucf/cop3503/lectures/
Additional material from the textbook: Data Structures and Algorithm Analysis in Java (Second Edition) by Mark Allen Weiss
Additional images:
www.wikipedia.com xkcd.com