Lecture 31
CS G553
High-Level Synthesis for Reconfigurable Devices
(Behavioral Synthesis): Temporal Partitioning Algorithms
Temporal partitioning & Scheduling
Scheduling:
o Inputs:
• A DFG
• An architecture (i.e. a set of processing elements)
o Output:
• Starting time of each node on a given resource
Temporal partitioning:
o Input:
• A DFG
• A reconfigurable device
o Output:
• A set of partitions
• Starting time of each node: the starting time of the partition to which it belongs
Temporal partitioning & Scheduling
Solution approaches:
o List scheduling
o Integer linear programming (exact method)
o Network flow
o Spectral method
• Recursive bi-partitioning approaches
Unconstrained Scheduling
Unconstrained scheduling:
o Assumption: unlimited amount of resources
• Device with unlimited size
Unconstrained Scheduling
ASAP (as soon as possible):
o Defines the earliest starting time for each node in the DFG
ALAP (as late as possible):
o Defines the latest starting time for each node in the DFG according to a given latency
The mobility of a node:
o (ALAP starting time) – (ASAP starting time)
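A minimal sketch of the mobility computation. The node names and start times below are hypothetical values chosen for illustration; they are not taken from the lecture example.

```python
def mobility(asap_start, alap_start):
    """Mobility of each node: ALAP start time minus ASAP start time."""
    return {v: alap_start[v] - asap_start[v] for v in asap_start}

# Hypothetical start times (in control steps) for four nodes.
asap_start = {"m1": 0, "m2": 1, "a1": 0, "c1": 1}
alap_start = {"m1": 0, "m2": 1, "a1": 2, "c1": 3}

slack = mobility(asap_start, alap_start)
for node, value in slack.items():
    print(node, value)
```

Nodes with mobility 0 lie on a critical path: they cannot be moved without increasing the latency.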
ASAP Algorithm
ASAP(G(V,E), d) {
  FOREACH (vi without predecessor)
    s(vi) := 0;
  REPEAT {
    choose a node vi whose predecessors are all planned;
    s(vi) := max_{j:(vj,vi)∈E} {s(vj) + dj};
  }
  UNTIL (all nodes vi are planned);
  RETURN s;
}
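A minimal Python sketch of the ASAP pseudocode above. The 11-node graph, its edges and the node names v1..v11 are assumptions: they were chosen so that a unit delay per node reproduces the time steps of the ASAP example figure.

```python
from collections import defaultdict

def asap(nodes, edges, delay):
    """Earliest start time of every node, assuming unlimited resources."""
    preds = defaultdict(list)
    for u, v in edges:
        preds[v].append(u)
    start = {}
    pending = list(nodes)
    while pending:
        # Nodes whose predecessors are all planned can be planned now.
        planned_now = [v for v in pending if all(p in start for p in preds[v])]
        if not planned_now:
            raise ValueError("the graph contains a cycle")
        for v in planned_now:
            # A node starts once its slowest predecessor has finished.
            start[v] = max((start[p] + delay[p] for p in preds[v]), default=0)
            pending.remove(v)
    return start

# Assumed example DFG (operation per node, then data dependencies).
OPS = {"v1": "*", "v2": "*", "v3": "*", "v4": "-", "v5": "-", "v6": "*",
       "v7": "*", "v8": "*", "v9": "+", "v10": "+", "v11": "<"}
EDGES = [("v1", "v3"), ("v2", "v3"), ("v3", "v4"), ("v4", "v5"),
         ("v6", "v7"), ("v7", "v5"), ("v8", "v9"), ("v10", "v11")]
DELAY = {v: 1 for v in OPS}  # assumption: every operation takes one time step

asap_start = asap(OPS, EDGES, DELAY)
for step in range(4):
    print("Time", step, [OPS[v] for v in OPS if asap_start[v] == step])
```

Passing per-operation latencies in `delay` instead of 1 gives start times in clocks rather than in time steps.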
ASAP Example
[Figure: ASAP schedule of the example DFG, one row per time step]
Time 0: * * * * +
Time 1: * * + <
Time 2: -
Time 3: -
Time 4: end of schedule
ASAP Example
Assumptions:
o Multiplication: latency of 100 clocks
o Addition/subtraction: latency of 50 clocks
o Data transmission delay is neglected
[Figure: the DFG annotated with ASAP start times in clocks; the edge annotations give the computation delay of the previous node.]
ALAP-Algorithm
ALAP(G(V,E), d, L) {
  FOREACH (vi without successor)
    s(vi) := L - di;
  REPEAT {
    choose a node vi whose successors are all planned;
    s(vi) := min_{j:(vi,vj)∈E} {s(vj)} - di;
  }
  UNTIL (all nodes vi are planned);
  RETURN s;
}
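A minimal Python sketch of the ALAP pseudocode above. The graph, the node names v1..v11, the unit delays and the latency bound L = 4 are assumptions, chosen so that the result matches the time steps of the ALAP example figure.

```python
from collections import defaultdict

def alap(nodes, edges, delay, latency):
    """Latest start time of every node for a given overall latency."""
    succs = defaultdict(list)
    for u, v in edges:
        succs[u].append(v)
    start = {}
    pending = list(nodes)
    while pending:
        # Nodes whose successors are all planned can be planned now.
        planned_now = [v for v in pending if all(s in start for s in succs[v])]
        if not planned_now:
            raise ValueError("the graph contains a cycle")
        for v in planned_now:
            # A node must finish before its earliest successor starts;
            # a node without successors must finish by the latency bound.
            deadline = min((start[s] for s in succs[v]), default=latency)
            start[v] = deadline - delay[v]
            pending.remove(v)
    return start

# Assumed example DFG (operation per node, then data dependencies).
OPS = {"v1": "*", "v2": "*", "v3": "*", "v4": "-", "v5": "-", "v6": "*",
       "v7": "*", "v8": "*", "v9": "+", "v10": "+", "v11": "<"}
EDGES = [("v1", "v3"), ("v2", "v3"), ("v3", "v4"), ("v4", "v5"),
         ("v6", "v7"), ("v7", "v5"), ("v8", "v9"), ("v10", "v11")]
DELAY = {v: 1 for v in OPS}  # assumption: every operation takes one time step

alap_start = alap(OPS, EDGES, DELAY, latency=4)
for step in range(4):
    print("Time", step, [OPS[v] for v in OPS if alap_start[v] == step])
```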
ALAP Example
[Figure: ALAP schedule of the example DFG, one row per time step]
Time 0: * *
Time 1: * *
Time 2: - * * +
Time 3: - + <
Time 4: end of schedule
Mobility
[Figure: the example DFG drawn over Time 0 to Time 4, each node annotated with its mobility. Values shown: 0 (five nodes), 1 (two nodes), 2 (four nodes).]
ALAP Example
Assumptions:
o Multiplication: latency of 100 clocks
o Addition/subtraction: latency of 50 clocks
o Overall computation time: 250
[Figure: the DFG annotated with ALAP start times in clocks; the edge annotations give the computation delay of the previous node.]
Constrained Scheduling
Constrained scheduling:
o A fixed set of resources is available (ASIC).
o Many tasks compete for a given resource.
• → One of them must be chosen according to a given criterion; the rest are scheduled later.
1. Extended ASAP, ALAP:
o Compute the ASAP or ALAP schedule.
o Move tasks later (starting from ASAP) or earlier (starting from ALAP) until the resource constraints (e.g. area) are fulfilled.
Extended ASAP
● Constraint:
o 2 multipliers, 2 ALUs (+, −, <)
[Figure: extended ASAP schedule under this constraint, one row per time step]
Time 0: * * +
Time 1: * * <
Time 2: - * *
Time 3: - +
Time 4: end of schedule
Constrained Scheduling
List scheduling:
o Sort nodes in topological order
o Assign priority to nodes
o Criteria can be:
• number of successors,
• depth (length of longest path from inputs),
• latency-weighted depth,
– w: latency of the operation to be executed by the nodes on the path.
• mobility,
• connectivity,
• ...
Constrained Scheduling
o At each step, every free resource is assigned the ready task with the highest priority.
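A minimal sketch of resource-constrained list scheduling. Several things here are assumptions: the 11-node graph and node names, a unit delay per operation, one multiplier and one ALU (which handles +, - and <), and priorities equal to the length of the longest path from a node to a sink. Ties are broken by node order.

```python
def list_schedule(ops, edges, priority, units, unit_of):
    """Assign a start step to every node under resource constraints.
    Assumes an acyclic graph and a unit delay for every operation."""
    preds = {v: [] for v in ops}
    for u, v in edges:
        preds[v].append(u)
    start = {}
    step = 0
    while len(start) < len(ops):
        free = dict(units)  # all units are free again at each step
        # Ready: not yet scheduled and all predecessors finished before this step.
        ready = [v for v in ops if v not in start
                 and all(p in start and start[p] < step for p in preds[v])]
        # Highest priority first; the stable sort keeps node order on ties.
        ready.sort(key=lambda v: -priority[v])
        for v in ready:
            kind = unit_of[ops[v]]
            if free[kind] > 0:
                free[kind] -= 1
                start[v] = step
        step += 1
    return start

# Assumed example DFG, priorities and resources.
OPS = {"v1": "*", "v2": "*", "v3": "*", "v4": "-", "v5": "-", "v6": "*",
       "v7": "*", "v8": "*", "v9": "+", "v10": "+", "v11": "<"}
EDGES = [("v1", "v3"), ("v2", "v3"), ("v3", "v4"), ("v4", "v5"),
         ("v6", "v7"), ("v7", "v5"), ("v8", "v9"), ("v10", "v11")]
PRIORITY = {"v1": 3, "v2": 3, "v3": 2, "v4": 1, "v5": 0, "v6": 2,
            "v7": 1, "v8": 1, "v9": 0, "v10": 1, "v11": 0}
UNITS = {"mul": 1, "alu": 1}
UNIT_OF = {"*": "mul", "+": "alu", "-": "alu", "<": "alu"}

ls_start = list_schedule(OPS, EDGES, PRIORITY, UNITS, UNIT_OF)
for step in range(max(ls_start.values()) + 1):
    print("Time", step, [OPS[v] for v in OPS if ls_start[v] == step])
```

With a single multiplier, the six multiplications alone need six steps, so the schedule is dominated by the multiplier rather than by the critical path.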
Constrained Scheduling (Example)
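[Figure: example DFG; each node is annotated with its priority, given here in parentheses]
Level 1: *(3) *(3) *(2) *(1) +(1)
Level 2: *(2) *(1) +(0) <(0)
Level 3: -(1)
Level 4: -(0)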
Constrained Scheduling (Example)
[Figure: resulting schedule, one row per time step]
Time 0: * +
Time 1: * <
Time 2: *
Time 3: * -
Time 4: *
Time 5: * -
Time 6: +
Time 7: end of schedule
List Scheduling: Example
Resources: 1 multiplier, 1 adder
Latency:
o Multiplication: 100 clocks
o Add/sub: 50 clocks
Priority: number of successors
[Figure: resulting list schedule on a time axis in clocks; only the label 400 survives.]
Temporal Partitioning vs. Constrained Scheduling
In RCS,
o Only the starting time and the end time of the complete partition are usually considered.
Temporal Partitioning in RCS
Temporal partitioning:
o The same as list scheduling
o Assignment criterion: there must be enough space left on the device to accommodate the new component.
Algorithm: List-scheduling algorithm for reconfigurable devices
sort the nodes of V according to their priorities
P0 := Ø
while V ≠ Ø do
  select a vertex v ∈ V with highest priority and whose predecessors are all placed
  if (a partition Pi exists with s(Pi) + s(v) ≤ s(H)) then
    Pi := Pi ∪ {v}
  else
    create a new partition Pi+1 and set Pi+1 := {v}
  end if
  V := V \ {v}
end while
Here s(·) denotes size and H is the reconfigurable device.
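A minimal sketch of the list-scheduling-based partitioning algorithm above. The graph, node names and priorities are assumptions. The sizes (device 250, multiplier 100, adder/subtractor 20, comparator 10) are those of the lecture example. One further assumption goes beyond the pseudocode: a node is only placed in a partition that is not earlier than the partitions of its predecessors, so that the temporal order is respected.

```python
def temporal_partition(ops, edges, priority, size, device_size):
    """Greedy list-scheduling-style temporal partitioning (first fit)."""
    preds = {v: [] for v in ops}
    for u, v in edges:
        preds[v].append(u)
    partitions, used, part_of = [], [], {}
    remaining = list(ops)
    while remaining:
        # Candidates: nodes whose predecessors are all placed.
        ready = [v for v in remaining if all(p in part_of for p in preds[v])]
        v = max(ready, key=lambda n: priority[n])  # first node wins on ties
        # Assumption: never place v before the partition of a predecessor.
        first = max((part_of[p] for p in preds[v]), default=0)
        for i in range(first, len(partitions)):
            if used[i] + size[ops[v]] <= device_size:
                break
        else:
            # No existing partition has room: open a new one.
            partitions.append([])
            used.append(0)
            i = len(partitions) - 1
        partitions[i].append(v)
        used[i] += size[ops[v]]
        part_of[v] = i
        remaining.remove(v)
    return partitions

# Assumed example DFG and priorities; sizes as in the lecture example.
OPS = {"v1": "*", "v2": "*", "v3": "*", "v4": "-", "v5": "-", "v6": "*",
       "v7": "*", "v8": "*", "v9": "+", "v10": "+", "v11": "<"}
EDGES = [("v1", "v3"), ("v2", "v3"), ("v3", "v4"), ("v4", "v5"),
         ("v6", "v7"), ("v7", "v5"), ("v8", "v9"), ("v10", "v11")]
PRIORITY = {"v1": 3, "v2": 3, "v3": 2, "v4": 1, "v5": 0, "v6": 2,
            "v7": 1, "v8": 1, "v9": 0, "v10": 1, "v11": 0}
SIZE = {"*": 100, "+": 20, "-": 20, "<": 10}

parts = temporal_partition(OPS, EDGES, PRIORITY, SIZE, device_size=250)
for i, p in enumerate(parts, start=1):
    print(f"P{i}:", [OPS[v] for v in p], "size", sum(SIZE[OPS[v]] for v in p))
```

The first-fit choice means small nodes such as the adder and comparator fill leftover space in early partitions, regardless of how they connect to the other nodes placed there.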
Temporal Partitioning vs. Constrained Scheduling
[Figure: the example DFG partitioned by list scheduling into P1, P2 and P3]
● Criterion: number of successors
● size(FPGA) = 250
● size(mult) = 100
● size(add) = size(sub) = 20
● size(comp) = 10
● Partitions (node priority in parentheses):
o P1: *(3) *(3) +(1) <(0)
o P2: *(2) *(2) -(1)
o P3: *(1) *(1) -(0) +(0)
● Connectivity:
o c(P1) = 1/6
o c(P2) = 1/3
o c(P3) = 2/6
● Quality: 0.28
Improvement
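Best criterion:
o Total computation time of the DFG:
$t_{DFG} = n \times C_H + \sum_{i=1}^{n} t_{P_i}$
where n is the number of partitions, $C_H$ the reconfiguration time of the device H, and $t_{P_i}$ the execution time of partition $P_i$.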
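A small sketch of the total computation time, read here as the number of partitions times the reconfiguration time of the device plus the sum of the partition execution times. All numeric values are hypothetical and only illustrate the trade-off.

```python
def total_time(reconfig_time, partition_times):
    """Total DFG time: one reconfiguration per partition plus all execution times."""
    n = len(partition_times)
    return n * reconfig_time + sum(partition_times)

# Hypothetical numbers (clocks): three short partitions vs. two longer ones.
three_parts = total_time(1000, [150, 150, 100])
two_parts = total_time(1000, [200, 250])
print(three_parts, two_parts)
```

When the reconfiguration time dominates, reducing the number of partitions matters more than shortening each partition.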
Improvement
[Figure: levelized DFG. Level 1: + - *; Level 2: - /; Level 3: not recoverable]
• Disadvantage:
➢ Levelization:
− Modules are assigned to partitions based more on their level number than on their interconnectivity with other components.
➢ Interconnectivity (data exchange) must be optimized.
LS-Based Temporal Partitioning
[Figure: list-scheduling-based partitioning of the example DFG into P1, P2 and P3]
● Criterion: number of successors
● size(FPGA) = 250, size(mult) = 100, size(add) = size(sub) = 20, size(comp) = 10
● Connectivity: c(P1) = 1/6, c(P2) = 1/3, c(P3) = 2/6
● Quality: 0.28
Improved Temporal Partitioning
[Figure: improved partitioning of the same DFG into P1, P2 and P3]
● Partitions (node priority in parentheses):
o P1: *(3) *(1) +(1) +(0) <(0)
o P2: *(3) *(2) -(1)
o P3: *(2) *(1) -(0)
● Connectivity:
o c(P1) = 2/10
o c(P2) = 2/3
o c(P3) = 2/3
● Quality: 0.51
● Quality is better
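The connectivity and quality values on these slides are consistent with two definitions that the slides do not state and that are assumed here: the connectivity of a partition with n nodes and e internal edges is 2e / (n² − n), and the quality is the average connectivity over all partitions. The graph, node names and the assignment of nodes to partitions are also assumptions.

```python
def connectivity(part, edges):
    """Assumed definition: 2 * internal edges / (n^2 - n)."""
    n = len(part)
    if n < 2:
        return 0.0
    internal = sum(1 for u, v in edges if u in part and v in part)
    return 2 * internal / (n * n - n)

def quality(partitions, edges):
    """Assumed definition: average connectivity over all partitions."""
    return sum(connectivity(p, edges) for p in partitions) / len(partitions)

# Assumed example DFG edges.
EDGES = [("v1", "v3"), ("v2", "v3"), ("v3", "v4"), ("v4", "v5"),
         ("v6", "v7"), ("v7", "v5"), ("v8", "v9"), ("v10", "v11")]

# Assumed node sets for the list-scheduling-based and the improved partitioning.
ls_based = [{"v1", "v2", "v10", "v11"}, {"v3", "v6", "v4"}, {"v7", "v8", "v5", "v9"}]
improved = [{"v1", "v8", "v9", "v10", "v11"}, {"v2", "v3", "v4"}, {"v6", "v7", "v5"}]

q_ls = quality(ls_based, EDGES)
q_improved = quality(improved, EDGES)
print(round(q_ls, 2), round(q_improved, 2))
```

The improved partitioning keeps whole dependency chains inside one partition, which raises the number of internal edges and reduces the data exchanged between partitions.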
Improved List Scheduling
Pairwise interchange
Temporal partitioning – ILP
With ILP (Integer Linear Programming),
o Temporal partitioning constraints are formulated as linear equations and inequalities.
o These are then solved using an ILP solver.
The constraints usually considered are:
o Uniqueness constraint
o Temporal order constraint
o Memory constraint
o Resource constraint
o Latency constraint
Notations:
$y_{vi} = 1 \iff v \in P_i$
$w_{uv} \neq 0 \iff (u \in P_i) \wedge (v \in P_j) \wedge (P_i \neq P_j)$
Temporal partitioning – ILP
Unique assignment constraint: each task must be placed in exactly one partition (m = number of partitions).
$\forall v \in V: \sum_{i=1}^{m} y_{vi} = 1$
Temporal partitioning – ILP
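Resource constraint: the nodes placed in a partition must fit on the device.
$\forall P_i \in P: \sum_{v \in V} y_{vi} \cdot s(v) \le s(\text{device})$
Constraint on the data crossing the boundary of a partition:
$\forall P_i \in P: \sum_{(u \in P_i) \wedge (v \notin P_i)} w_{uv} + \sum_{(u \notin P_i) \wedge (v \in P_i)} w_{uv} \le T(\text{device})$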
Temporal partitioning by ILP: Example
Assignment constraint:
o y11 + y12 + y13 = 1
o y21 + y22 + y23 = 1
o …
o y71 + y72 + y73 = 1
o Partition P1:
• y22 = y23 = 0, y21 = 1
• y32 = y33 = 0, y31 = 1
• y42 = y43 = 0, y41 = 1
o Partition P2:
• y11 = y13 = 0, y12 = 1
• y51 = y53 = 0, y52 = 1
• y61 = y63 = 0, y62 = 1
o Partition P3:
• y71 = y72 = 0, y73 = 1
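A small sketch that builds the 0/1 variables y(v, i) from the assignment listed above and checks the unique assignment constraint. The dictionary layout and the function name are illustrative choices, not part of the lecture.

```python
def unique_assignment_ok(y, nodes, m):
    """True if every node is placed in exactly one of the m partitions."""
    return all(sum(y[(v, i)] for i in range(1, m + 1)) == 1 for v in nodes)

# Node -> partition, as listed in the example: nodes 2, 3, 4 in P1;
# nodes 1, 5, 6 in P2; node 7 in P3.
partition_of = {1: 2, 2: 1, 3: 1, 4: 1, 5: 2, 6: 2, 7: 3}
M = 3

# y[(v, i)] = 1 if node v is in partition i, else 0.
y = {(v, i): int(p == i) for v, p in partition_of.items() for i in range(1, M + 1)}

print(unique_assignment_ok(y, partition_of, M))
```

An ILP solver would treat the y(v, i) as unknowns and search for values that satisfy this and the other constraints; here the values are simply given and verified.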
Temporal partitioning by ILP: Example
Precedence constraint:
[Equations not recoverable; only the summation index i survives.]
Temporal partitioning by ILP: Example
Resource constraint:
o Device size: 200 LUTs; multiplication: 100 LUTs; addition and comparison: 50 LUTs each
$\sum_{u \in V} y_{ui} \cdot s(u) \le 200$ for $i = 1, 2, 3$
The End
Questions?