Example of Matrix Multiplication by Fox Method

Thomas Anastasio
November 23, 2003

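Fox’s algorithm for matrix multiplication is described in Pacheco¹. This handout gives an example of the
algorithm applied to $2 \times 2$ matrices, $A$ and $B$. The product is a $2 \times 2$ matrix $C$.

\[
A = \begin{pmatrix} A_{00} & A_{01} \\ A_{10} & A_{11} \end{pmatrix}, \quad
B = \begin{pmatrix} B_{00} & B_{01} \\ B_{10} & B_{11} \end{pmatrix}, \quad
C = \begin{pmatrix}
A_{00} B_{00} + A_{01} B_{10} & A_{00} B_{01} + A_{01} B_{11} \\
A_{10} B_{00} + A_{11} B_{10} & A_{10} B_{01} + A_{11} B_{11}
\end{pmatrix}
\]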

Assume that we have $n^2$ processes (four in this example), one for each element position in $A$, $B$, and $C$. Call the processes $P_{00}$, $P_{01}$,
$P_{10}$, and $P_{11}$, and think of them as being arranged in a grid as follows:

\[
\begin{array}{|c|c|}
\hline
P_{00} & P_{01} \\
\hline
P_{10} & P_{11} \\
\hline
\end{array}
\]

Allocate space on each process $P_{ij}$ for an $A$ element, a $B$ element, and a $C$ element.

Fox’s algorithm takes $n$ stages for matrices of order $n$. The algorithm starts off with each $C_{i,j} = 0$. In
stage $k$ (for $k = 0, \dots, n-1$), process $P_{i,j}$ computes

\[
C_{i,j} = C_{i,j} + A_{i,\,(i+k) \bmod n} \times B_{(i+k) \bmod n,\,j}
\]

In this example, since our matrices are of order 2, there will be two stages. In stage 0, $P_{i,j}$ computes
$C_{i,j} = C_{i,j} + A_{i,i} \times B_{i,j}$. In stage 1, $P_{i,j}$ computes $C_{i,j} = C_{i,j} + A_{i,i+1} \times B_{i+1,j}$ (indices taken mod 2), a column to the “right” in
$A$ and a row “down” in $B$.
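The stage formula can be checked directly with a short serial Python sketch. It applies the update once per stage, with the column index wrapping mod n. The function name `fox_by_formula` and the sample matrices are illustrative assumptions, not part of Pacheco's presentation, and no actual message passing takes place.

```python
def fox_by_formula(A, B):
    """Accumulate C stage by stage: C[i][j] += A[i][(i+k) % n] * B[(i+k) % n][j]."""
    n = len(A)
    C = [[0] * n for _ in range(n)]      # every C[i][j] starts at 0
    for k in range(n):                   # one pass per stage
        for i in range(n):
            col = (i + k) % n            # wraps around for the last rows
            for j in range(n):
                C[i][j] += A[i][col] * B[col][j]
    return C


# Hypothetical sample data for a 2 x 2 check.
A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(fox_by_formula(A, B))  # [[19, 22], [43, 50]]
```

After n stages every term of the usual row-by-column sum has been added exactly once, only in a rotated order that differs from row to row.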

1. Stage 0
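(a) We want $A_{i,i}$ on process $P_{i,j}$, so broadcast the diagonal elements of $A$ across the rows ($A_{ii} \to P_{ij}$).
This will place $A_{0,0}$ on each $P_{0,j}$ and $A_{1,1}$ on each $P_{1,j}$. The $A$ elements on the $P$ matrix will be

\[
\begin{array}{|c|c|}
\hline
A_{00} & A_{00} \\
\hline
A_{11} & A_{11} \\
\hline
\end{array}
\]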
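(b) We want $B_{i,j}$ on process $P_{i,j}$, so distribute $B$ one element per process ($B_{ij} \to P_{ij}$). The $A$ and $B$ values
on the $P$ matrix will be

\[
\begin{array}{|c|c|}
\hline
A_{00} & A_{00} \\
B_{00} & B_{01} \\
\hline
A_{11} & A_{11} \\
B_{10} & B_{11} \\
\hline
\end{array}
\]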
¹ Peter Pacheco, Parallel Programming with MPI, Morgan Kaufmann, 1996, Section 7.2

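(c) Compute $C_{ij} = AB$ on each process, where $A$ and $B$ are the values that process currently holds (each $C_{ij}$ started at 0)

\[
\begin{array}{|c|c|}
\hline
A_{00} & A_{00} \\
B_{00} & B_{01} \\
C_{00} = A_{00} B_{00} & C_{01} = A_{00} B_{01} \\
\hline
A_{11} & A_{11} \\
B_{10} & B_{11} \\
C_{10} = A_{11} B_{10} & C_{11} = A_{11} B_{11} \\
\hline
\end{array}
\]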

We are now ready for the second stage. In this stage, we broadcast the next column (mod $n$) of $A$
across the rows of processes and shift the $B$ values up (mod $n$).
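To see the data movement at a glance, the following Python sketch lists which $A$ and $B$ elements each process holds in each stage. It manipulates element names as strings only; the function name `fox_trace` is an illustrative assumption, and the single-digit naming is meant for small n such as this example.

```python
def fox_trace(n):
    """Return, for each stage k, the (A, B) element names held by each process."""
    stages = []
    for k in range(n):
        grid = []
        for i in range(n):
            src = (i + k) % n  # column of A broadcast in row i; also the row B came from
            grid.append([(f"A{i}{src}", f"B{src}{j}") for j in range(n)])
        stages.append(grid)
    return stages


for k, grid in enumerate(fox_trace(2)):
    print("stage", k)
    for row in grid:
        print("  ", row)
```

For n = 2, stage 0 pairs the diagonal elements of $A$ with the original $B$ layout, and stage 1 pairs $A_{01}$ and $A_{10}$ with the $B$ rows swapped.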

2. Stage 1

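(a) The next column of $A$ is $A_{0,1}$ for the first row and $A_{1,0}$ for the second row (it wrapped around,
mod $n$). Broadcast the next $A$ across the rows

\[
\begin{array}{|c|c|}
\hline
A_{01} & A_{01} \\
B_{00} & B_{01} \\
C_{00} = A_{00} B_{00} & C_{01} = A_{00} B_{01} \\
\hline
A_{10} & A_{10} \\
B_{10} & B_{11} \\
C_{10} = A_{11} B_{10} & C_{11} = A_{11} B_{11} \\
\hline
\end{array}
\]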
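(b) Shift the $B$ values up. $B_{1,0}$ moves up from process $P_{1,0}$ to process $P_{0,0}$ and $B_{0,0}$ moves up (mod $n$)
from $P_{0,0}$ to $P_{1,0}$. Similarly for $B_{1,1}$ and $B_{0,1}$.

\[
\begin{array}{|c|c|}
\hline
A_{01} & A_{01} \\
B_{10} & B_{11} \\
C_{00} = A_{00} B_{00} & C_{01} = A_{00} B_{01} \\
\hline
A_{10} & A_{10} \\
B_{00} & B_{01} \\
C_{10} = A_{11} B_{10} & C_{11} = A_{11} B_{11} \\
\hline
\end{array}
\]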
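(c) Compute $C_{ij} = C_{ij} + AB$ on each process, adding the new product to the value from stage 0

\[
\begin{array}{|c|c|}
\hline
A_{01} & A_{01} \\
B_{10} & B_{11} \\
C_{00} = C_{00} + A_{01} B_{10} & C_{01} = C_{01} + A_{01} B_{11} \\
\hline
A_{10} & A_{10} \\
B_{00} & B_{01} \\
C_{10} = C_{10} + A_{10} B_{00} & C_{11} = C_{11} + A_{10} B_{01} \\
\hline
\end{array}
\]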
The algorithm is complete after $n$ stages, and process $P_{i,j}$ contains the final result for $C_{i,j}$.
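The broadcast, shift, and compute steps can be put together in a short serial simulation. This is a minimal Python sketch in which nested lists stand in for the local storage of each process; it performs no real message passing, and the function name `fox_simulate` and the sample matrices are illustrative assumptions.

```python
def fox_simulate(A, B):
    """Serially simulate Fox's algorithm on an n x n grid of processes."""
    n = len(A)
    local_b = [row[:] for row in B]            # P[i][j] starts with B[i][j]
    local_c = [[0] * n for _ in range(n)]      # every C[i][j] starts at 0
    for k in range(n):
        if k > 0:
            # Shift step: every B value moves up one row, wrapping mod n.
            local_b = [local_b[(i + 1) % n] for i in range(n)]
        for i in range(n):
            # Broadcast step: every process in row i receives A[i][(i + k) % n].
            a = A[i][(i + k) % n]
            for j in range(n):
                # Local multiply-accumulate on process P[i][j].
                local_c[i][j] += a * local_b[i][j]
    return local_c


# Hypothetical sample data for a 2 x 2 check.
print(fox_simulate([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```

In a real MPI program the broadcast would be a collective operation within each process row and the shift a send and receive between vertically neighboring processes; here both are modeled as plain list operations.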