Exercise8 - Solution - Introduction For Embedded Systems
Pipeline-resources process data in time intervals that are shorter than the actual execution time w. As soon as
the so-called pipeline-interval PI has elapsed after the start of a task v_1, the next task v_2 can be started on
the same resource (see Figure 1). Non-pipeline-resources are a special case of pipeline-resources with PI = w.
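[Figure 1: Tasks v_1 and v_2 on a pipeline-resource over time t, with execution time w and pipeline-interval PI]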
[Figure 2: Sequence graph with nodes 0 (NOP), 1 to 5, and n (NOP)]
a) Modify the LIST algorithm given in the lecture notes so that pipeline-resources are considered. Which
step has to be reformulated and how? (Explain your answer!)
b) Perform the scheduling for the sequence graph given in Figure 2 using the modified algorithm. You can
use Table 1. The multiplication (r_2) lasts 4 time units and the length of the pipeline-interval is 2 time
units. The addition (r_1) lasts 2 time units and cannot be executed as pipeline-operation. 1 adder and 1
multiplier are available. Use the number of successor nodes as priority criterion. What is the resulting
latency?
Solution to Task 1:
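a) [. . . ]
Determine candidates U_{t,k} to be scheduled;
Determine set of occupied resources O_{t,k};
Choose subset S_{t,k} ⊆ U_{t,k} with maximal priority and |S_{t,k}| + |O_{t,k}| ≤ α(v_k)
[. . . ]
O_{t,k} is the set of resources of type k that are occupied in time slot t and are not yet available for a
following operation. On each of these resources, exactly one operation is currently within its pipeline-interval.
Only the determination of the occupied resources has to be reformulated: a resource counts as occupied for PI
time units after an operation has started on it, instead of for the full execution time w.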
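The modified step can be sketched in Python as follows. This is a minimal sketch, not the lecture's reference implementation: the function names, the small example graph (three multiplications and two additions) and the tie-breaking by insertion order are assumptions made for illustration, and the graph is not the one from Figure 2. The priority is taken as the number of all (transitive) successor nodes, which is one possible reading of the priority criterion.

```python
def list_schedule(op_type, edges, w, pi, alloc):
    """List scheduling with pipeline-resources (time starts at 1).

    op_type: operation -> resource type
    edges:   (u, v) pairs, u must finish before v starts (graph must be acyclic)
    w, pi:   execution time and pipeline-interval per resource type
    alloc:   number of allocated instances per resource type
    """
    succ = {v: [] for v in op_type}
    pred = {v: [] for v in op_type}
    for u, v in edges:
        succ[u].append(v)
        pred[v].append(u)

    def descendants(v):
        seen, stack = set(), list(succ[v])
        while stack:
            x = stack.pop()
            if x not in seen:
                seen.add(x)
                stack.extend(succ[x])
        return seen

    # Assumption: priority = number of transitive successors.
    priority = {v: len(descendants(v)) for v in op_type}
    tau, t = {}, 1
    while len(tau) < len(op_type):
        for k in alloc:
            # U_{t,k}: unscheduled operations of type k whose predecessors have finished
            candidates = [v for v in op_type
                          if v not in tau and op_type[v] == k
                          and all(p in tau and tau[p] + w[op_type[p]] <= t
                                  for p in pred[v])]
            # O_{t,k}: an instance is occupied only for pi[k] time units after a start
            occupied = [v for v in tau
                        if op_type[v] == k and tau[v] <= t < tau[v] + pi[k]]
            free = max(alloc[k] - len(occupied), 0)
            candidates.sort(key=lambda v: -priority[v])  # stable: ties keep input order
            for v in candidates[:free]:
                tau[v] = t
        t += 1
    return tau


def latency(tau, op_type, w):
    """Finish time of the last operation minus start time of the first one."""
    return max(tau[v] + w[op_type[v]] for v in tau) - min(tau.values())


# Hypothetical example graph (NOT the graph of Figure 2).
op_type = {"m1": "mul", "m2": "mul", "m3": "mul", "a1": "add", "a2": "add"}
edges = [("m1", "a1"), ("m2", "a1"), ("a1", "a2"), ("m3", "a2")]
w = {"mul": 4, "add": 2}
alloc = {"mul": 1, "add": 1}

pipelined = list_schedule(op_type, edges, w, {"mul": 2, "add": 2}, alloc)
plain = list_schedule(op_type, edges, w, {"mul": 4, "add": 2}, alloc)
print(pipelined, latency(pipelined, op_type, w))
print(plain, latency(plain, op_type, w))
```

Setting the pipeline-interval of a type equal to its execution time (as done for `plain`) reproduces the unmodified algorithm, which matches the remark that non-pipeline-resources are the special case PI = w.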
Task 2: Integer Linear Programming
[Figure 3: Sequence graph with nodes 0 (NOP), 1 to 11, and n (NOP)]
For the execution times of the operations assume: A multiplication operation (MULT) takes 2 time units and
all other (ALU) operations take 1 time unit each. Two units of the resource type r_1 (multiplier) and two units
of the resource type r_2 (ALU) are allocated.
(a) Apply the ASAP and ALAP algorithms to compute the earliest (l_i) and the latest (h_i) starting time of
all operations v_i ∈ V_S, i ∈ {1, . . . , 11}. For ALAP, assume the maximum latency L = 7. Fill in the
starting times in Table 2.
(b) Formulate the problem of latency minimization with restricted resources as an integer linear program
(ILP). For this, you should introduce the binary variables x_{i,t} ∈ {0, 1} ∀v_i ∈ V_S and ∀t ∈ {t ∈
ℤ | l_i ≤ t ≤ h_i}. τ(v_i) is used to denote the starting time of operation v_i ∈ V_S and α(r_i) with
r_i ∈ V_R = {MULT, ALU} denotes the number of allocated resource instances. Given the above
notations, write down the following equations/inequations without using the ∑ symbol.
(i) Express the objective function of the ILP
(ii) Define τ(v_i) ∀i ∈ {1, . . . , 11} as a function of x_{i,t}, where l_i ≤ t ≤ h_i
(iii) Express all data dependencies
(iv) Express all resource limitations
(c) In an analogous manner, try to formulate an ILP that solves the problem of cost minimization with
latency limitation. Hint: We assume that the cost of a realization is the sum of the costs c of the
multipliers with c(r_1) = 2 per allocated unit, and of the ALUs with c(r_2) = 1 per allocated unit. For
the latency bound, we choose L̄ = 6.
Solution to Task 2:
(a) The starting times are listed in Table 2. The corresponding ASAP/ALAP schedules are depicted in
Figure 4.
        l_i (ASAP)   h_i (ALAP)
v_1         1            2
v_2         1            2
v_3         3            4
v_4         5            6
v_5         6            7
v_6         1            3
v_7         3            5
v_8         1            5
v_9         3            7
v_10        1            6
v_11        2            7
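[Figure 4: ASAP and ALAP schedules of the sequence graph, both between node 0 (NOP) and node n (NOP).
ASAP: t=1: 1, 2, 6, 8, 10; t=2: 11; t=3: 3, 7, 9; t=5: 4; t=6: 5.
ALAP: t=2: 1, 2; t=3: 6; t=4: 3; t=5: 7, 8; t=6: 4, 10; t=7: 5, 9, 11.]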
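The values in Table 2 can be cross-checked with a few lines of Python. This is a minimal sketch under a stated assumption: the edges of the sequence graph are not recoverable from the text here, so the edge list below is an assumed one, chosen to be consistent with Table 2 and with operations 1, 2, 3, 6, 7, 8 being multiplications. The function names are illustrative as well. Time starts at 1, and a latency of L means that every operation has finished by time L + 1.

```python
# Execution times: MULT takes 2 time units, ALU operations take 1.
w = {i: 2 for i in (1, 2, 3, 6, 7, 8)}          # assumed multiplications
w.update({i: 1 for i in (4, 5, 9, 10, 11)})     # assumed ALU operations

# Assumed data dependencies (u, v): v may start once u has finished.
edges = [(1, 3), (2, 3), (3, 4), (4, 5), (6, 7), (7, 5), (8, 9), (10, 11)]


def asap(w, edges):
    """Earliest start times l_i."""
    l = {v: 1 for v in w}
    for _ in w:                       # |V| relaxation passes suffice on a DAG
        for u, v in edges:
            l[v] = max(l[v], l[u] + w[u])
    return l


def alap(w, edges, L):
    """Latest start times h_i such that everything finishes by time L + 1."""
    h = {v: L + 1 - w[v] for v in w}
    for _ in w:
        for u, v in edges:
            h[u] = min(h[u], h[v] - w[u])
    return h


l = asap(w, edges)
h = alap(w, edges, L=7)
for i in sorted(w):
    print(f"v_{i}: l = {l[i]}, h = {h[i]}")
```

The difference h_i − l_i is the mobility of an operation; under the assumed edges, v_1 to v_5 have a mobility of 1 for L = 7.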
(iii) Data dependencies:
Latency limitations:
L = τ(v_n) − τ(v_0) ≤ L̄ = 6
New objective function:
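The formulas of this solution are not reproduced here. As a hedged sketch, the usual time-indexed ILP for this kind of problem can be written compactly as below; the task then asks to expand the sums explicitly. The symbols E_S (edge set of the sequence graph), w(v_i) (execution time) and β(v_i) (resource type of v_i) are assumptions introduced for this sketch and are not defined in the text above.

```latex
% Latency minimization with restricted resources, part (b)
\begin{align*}
\text{minimize}\quad & \tau(v_n) - \tau(v_0) \\
\text{subject to}\quad
 & \sum_{t=l_i}^{h_i} x_{i,t} = 1
   && \forall v_i \in V_S \\
 & \tau(v_i) = \sum_{t=l_i}^{h_i} t \cdot x_{i,t}
   && \forall v_i \in V_S \\
 & \tau(v_j) - \tau(v_i) \ge w(v_i)
   && \forall (v_i, v_j) \in E_S \\
 & \sum_{i:\,\beta(v_i) = r_k}\;
   \sum_{p=\max\{0,\,t-h_i\}}^{\min\{w(v_i)-1,\,t-l_i\}} x_{i,t-p}
   \le \alpha(r_k)
   && \forall r_k \in V_R,\ \forall t
\end{align*}

% Cost minimization with latency limitation, part (c):
% the allocations become integer variables and the latency becomes a constraint.
\begin{align*}
\text{minimize}\quad & c(r_1)\,\alpha(r_1) + c(r_2)\,\alpha(r_2)
                       = 2\,\alpha(r_1) + \alpha(r_2) \\
\text{subject to}\quad
 & \tau(v_n) - \tau(v_0) \le \bar{L} = 6, \qquad
   \alpha(r_k) \in \mathbb{Z}_{\ge 0}
\end{align*}
```

In part (c) the remaining constraints keep the same form as in part (b); the bounds h_i would presumably have to be recomputed with ALAP for the latency bound 6 instead of 7.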
Task 3: Iterative Algorithms
Please answer the following questions considering the given video codec application specified as a marked
graph in Figure 5.
[Figure 5: Video codec marked graph representation]

Table 3: Execution time of each function
          ν_1   ν_2   ν_3   ν_4   ν_5
w(ν_i)    10    10    10    5     5
where P is the minimum iteration interval. The execution time of each function is listed in Table 3.
(b) Assuming unlimited resources and only one token on the edge between ν_5 and ν_1, determine the minimum
iteration interval P and the latency L. To justify your answer, draw the scheduling on the timeline given
in Figure 6 with the dependency from ν_5 to ν_1 highlighted.
(c) The motion estimation function (ν_1) uses the result of the previous frame (see the dependency between
ν_1 and ν_5). Let us now suppose that an arbitrary number of tokens can be inserted to reduce P using
functional pipelining. Determine the minimum number of tokens that should be added on the
edge ν_5 → ν_1 to achieve P = 10. To justify your answer, draw the pipelined scheduling on the timeline
given in Figure 7 with the dependency from ν_5 to ν_1 highlighted and calculate the latency L of the
schedule.
Solution to Task 3:
(a) Dependencies:
τ(ν_2) − τ(ν_1) ≥ 10
τ(ν_3) − τ(ν_2) ≥ 10
τ(ν_4) − τ(ν_2) ≥ 10
τ(ν_5) − τ(ν_4) ≥ 5
τ(ν_1) − τ(ν_5) ≥ 5 − 1 · P
[Figure 7: Pipelined scheduling result of the video codec]
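(c) Now the iteration interval P is given (P = 10) and we are looking for the number of tokens n. Therefore,
we replace the last inequation in (a) by τ(ν_1) − τ(ν_5) ≥ 5 − n · 10 and solve the new set of inequations
for n. Summing the inequations along the cycle ν_1 → ν_2 → ν_4 → ν_5 → ν_1 gives 0 ≥ 10 + 10 + 5 + 5 − 10 · n,
i.e. n ≥ 3.
⇒ n_min = 3
Since one token is already on the edge, we have to add at least 2 tokens on the edge between ν_5 and ν_1.
L = 30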
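The inequations of this task can also be checked numerically. The sketch below is an illustration, not part of the original solution: it takes the edges and execution times from the dependencies in (a) and from Table 3, and tests for a given P and token count n whether start times exist, using a longest-path relaxation in the style of Bellman-Ford. The function names `start_times` and `codec_edges` are hypothetical.

```python
def start_times(w, edges, P):
    """Earliest start times with tau(j) - tau(i) >= w(i) - d * P for every
    edge (i, j, d) carrying d tokens; returns None if no schedule exists."""
    tau = {v: 0 for v in w}
    for _ in range(len(w)):
        changed = False
        for i, j, d in edges:
            bound = tau[i] + w[i] - d * P
            if bound > tau[j]:
                tau[j], changed = bound, True
        if not changed:
            return tau
    return None  # still changing after |V| passes: some cycle cannot be satisfied


def latency(tau, w):
    return max(tau[v] + w[v] for v in tau) - min(tau.values())


def codec_edges(n):
    """Edges from the dependencies in (a), with n tokens on nu_5 -> nu_1."""
    return [(1, 2, 0), (2, 3, 0), (2, 4, 0), (4, 5, 0), (5, 1, n)]


w = {1: 10, 2: 10, 3: 10, 4: 5, 5: 5}  # Table 3

# (b) one token: smallest P for which a schedule exists
P_min = next(P for P in range(1, sum(w.values()) + 1)
             if start_times(w, codec_edges(1), P) is not None)
L_b = latency(start_times(w, codec_edges(1), P_min), w)

# (c) P = 10: smallest number of tokens on nu_5 -> nu_1
n_min = next(n for n in range(1, 10)
             if start_times(w, codec_edges(n), 10) is not None)
L_c = latency(start_times(w, codec_edges(n_min), 10), w)

print(P_min, L_b, n_min, L_c)
```

With these inputs the search reproduces n_min = 3 and L = 30 from the solution above, and gives P = 30 for a single token.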