DSP-FPGA - Ch02 - Iteration Bound - HK202

Uploaded by

tam.duongweed3010

Ch02 – Iteration bound

References:
1. Slides from Prof. Parhi's book
2. Slides by Prof. Viktor Öwall
3. Lecture slides by instructor Hồ Trung Mỹ
4. Past midterm exams

Outline
• Introduction
• Loop Bound
  • Important Definitions and Examples
• Iteration Bound
  • Important Definitions and Examples
  • Techniques to Compute Iteration Bound
• Algorithms to compute iteration bound
  • Longest Path Matrix (LPM)
  • Minimum Cycle Mean (MCM)

Some Definitions For DSP Algorithms

• Non-terminating programs
  • Execute repetitively
• Iteration
  • Execution of all the computations in the algorithm once
• Iteration period
  • The time required to execute one iteration
• Iteration rate
  • Number of iterations executed per second (the reciprocal of the iteration period)

Some Definitions For DSP Algorithms
• Sampling period
  • The time difference between two consecutive samples
• Sampling rate (throughput)
  • Number of samples processed per second
• Critical path for a combinational logic circuit
  • Longest path between inputs and outputs
• Critical path for sequential circuits
  • Longest path between any two storage (delay) elements
• In general, the critical path is defined as the path with the longest computation time among all paths that contain zero delays.

Some Definitions For DSP Algorithms
• Critical path computation time determines
  • The minimum feasible clock period of a DSP system
  • Clock period = clock cycle time = 1/clock rate, and it cannot be shorter than the critical path computation time
• Latency
  • Difference between the time an output is generated and the time at which its corresponding input was received by the system
  • Latency representations
    • Absolute time units, or the number of gate delays, for combinational logic circuits
    • Number of clock cycles for sequential systems

Various Representations of DSP Systems

• DSP Algorithm Descriptions
• Behavioral Descriptions
• Graphical Representations

DSP algorithm descriptions
• Mathematical formulations
  • Specify the functionality of the DSP algorithm
  • Do not specify the order and structure of the internal operations
• Behavioral description languages or graphical representations for architectural design

Behavioral Descriptions
• Applicative languages
  • A set of equations
  • Silage language
• Prescriptive languages
  • Specify the order of the assignment statements
  • Pascal, C, SystemC, ...
• Descriptive languages
  • Represent the structure of the DSP system
  • Verilog HDL or VHDL

Graphical Representations
• Efficient for investigating and analyzing data flow properties
• Efficient for exploiting the inherent parallelism among different subtasks
• Easy mapping of DSP algorithm descriptions to hardware structural implementations
• Provide a technology-independent architecture

Various graphical representations

• Block Diagram (BD)
• Signal Flow Graph (SFG)
• Data Flow Graph (DFG)
• Dependence Graph (DG)

Graphical representations

[Figure not recovered; surviving label: "functional blocks"]

Ex: Direct Form 4-tap FIR filter

Note:
Clock speed is limited by the critical path:
$T_{critical} = T_M + (N-1)\,T_A$
where $N$ is the number of taps, $T_M$ is the multiplier delay and $T_A$ is the adder delay.

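The critical-path formula for the direct-form FIR filter can be evaluated directly. The sketch below is only an illustration: the function name and the timing values (4 u.t. per multiplier, 2 u.t. per adder) are assumptions chosen for the example, not values taken from the slide.

```python
def fir_direct_form_critical_path(n_taps, t_mult, t_add):
    """Critical path of an N-tap direct-form FIR filter: T_M + (N - 1) * T_A."""
    if n_taps < 1:
        raise ValueError("n_taps must be at least 1")
    # One multiplier followed by a chain of (N - 1) adders.
    return t_mult + (n_taps - 1) * t_add


# Assumed timings for illustration only: T_M = 4 u.t., T_A = 2 u.t.
t_crit = fir_direct_form_critical_path(n_taps=4, t_mult=4, t_add=2)
print(t_crit)  # 4 + 3 * 2 = 10
```

Because the adder chain grows with N, the critical path of the direct form grows linearly with the number of taps.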
Graphical Representation Method 2:
Signal-Flow Graph (SFG)

Note:
• Source = node with no entering edge
• Sink = node with only entering edges

Block diagram with data broadcast

Graphical Representation Method 3:
Data-Flow Graph (DFG)

Loop Bound
• Loop: a directed path that begins and ends at the same node.
• Loop bound of a loop
  • Lower bound on the loop computation time, i.e. the time required to execute one iteration of the loop.
  • Defined as $t_l / w_l$, where $t_l$ is the loop computation time and $w_l$ is the number of delays in the loop.
• E.g. $y(n) = a\,y(n-1) + x(n)$
  [Figure: DFG with input x(n), output y(n), node A (2), node B (4, coefficient a) and one delay D]
  • A→B→A is a loop
  • $t_l = 2 + 4 = 6$, $w_l = 1$ ⟹ loop bound $= t_l / w_l = 6$

Loop Bound (cont’d)
• Another example
  [Figure: DFG with input x(n), output y(n), node A (2), node B (4, coefficient a) and a delay 2D]
  • A→B→A is a loop
  • $t_l = 2 + 4 = 6$, $w_l = 2$ (since 2D) ⟹ loop bound $= t_l / w_l = 3$
  • This means one iteration of the loop can be executed in 3 time units.
• Another example
  [Figure: DFG with nodes A (2), B (4), C (5) and delays D and 2D]
  • Two loops
    • A→B→A: T = 6, W = 2, loop bound = 3
    • A→B→C→A: T = 11, W = 1, loop bound = 11

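The three loop-bound examples can be reproduced with a few lines of code. This is a minimal sketch; the helper name `loop_bound` is hypothetical, and `Fraction` is used so that non-integer bounds stay exact.

```python
from fractions import Fraction


def loop_bound(node_times, num_delays):
    """Loop bound t_l / w_l for a single loop.

    node_times: computation times of the nodes on the loop
    num_delays: total number of delays on the loop (w_l)
    """
    if num_delays <= 0:
        raise ValueError("a loop must contain at least one delay")
    return Fraction(sum(node_times), num_delays)


print(loop_bound([2, 4], 1))     # A->B->A with one delay D: 6
print(loop_bound([2, 4], 2))     # A->B->A with 2D: 3
print(loop_bound([2, 4, 5], 1))  # A->B->C->A with one delay D: 11
```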
Loop Bound (cont’d)

[Figure not recovered; surviving captions: "Same critical path", "Different loop bound"]

Critical path:
• For a combinational logic circuit: longest path between inputs and outputs
• For a sequential circuit: longest path between any two storage (delay) elements

Loop Bound (cont’d)

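Iteration Bound
• Iteration bound: the maximum loop bound over all loops in the DFG,
  $T_\infty = \max_{l \in L} \{ t_l / w_l \}$, where $L$ is the set of loops.
• In other words, it is not possible to achieve an iteration period (or sample period) lower than the iteration bound:
  $T_{sample} \ge T_\infty$
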
Example: Iteration Bound
• Compute the loop bounds and iteration bound of the following DFG.
  [Figure: DFG with nodes A (2), B (4), C (5) and delays D and 2D]
  • Two loops
    • A→B→A: T = 6, W = 2, loop bound = 3
    • A→B→C→A: T = 11, W = 1, loop bound = 11
  ⇒ $T_\infty = \max\{3, 11\} = 11$ u.t.
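For small graphs the iteration bound can be found by enumerating every loop and taking the largest loop bound. The sketch below does this with a depth-first search. The edge list is an assumption reconstructed from the stated loop delays (2D on B→A, D on C→A, no delay on A→B and B→C), since the figure itself is not available, and the function name is hypothetical.

```python
from fractions import Fraction


def iteration_bound(node_time, edges):
    """Maximum loop bound over all simple loops of a small DFG.

    node_time: {node: computation time}
    edges: list of (src, dst, number_of_delays)
    """
    order = {n: i for i, n in enumerate(node_time)}
    succ = {n: [] for n in node_time}
    for u, v, w in edges:
        succ[u].append((v, w))

    bounds = []

    def dfs(start, node, time, delays, visited):
        for nxt, w in succ[node]:
            if nxt == start:
                total = delays + w
                if total == 0:
                    raise ValueError("delay-free loop found")
                bounds.append(Fraction(time, total))
            elif nxt not in visited and order[nxt] > order[start]:
                # Only visit nodes ordered after `start`, so each loop
                # is counted once (from its lowest-ordered node).
                dfs(start, nxt, time + node_time[nxt], delays + w, visited | {nxt})

    for s in node_time:
        dfs(s, s, node_time[s], 0, {s})
    if not bounds:
        raise ValueError("DFG has no loops")
    return max(bounds)


# Assumed edge placement for the three-node example.
node_time = {"A": 2, "B": 4, "C": 5}
edges = [("A", "B", 0), ("B", "A", 2), ("B", "C", 0), ("C", "A", 1)]
print(iteration_bound(node_time, edges))  # max{3, 11} = 11
```

Loop enumeration grows exponentially with graph size in the worst case, which is the motivation for matrix-based algorithms such as LPM and MCM.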

• Compute the loop bounds and iteration bound of the following DFG.
  • Three loops
  [Figure and loop-bound computation not recovered; surviving result: = 12 ns]

Iteration Bound (cont’d)

Critical Path
• Critical path: the path with the longest computation time among all paths that contain zero delays.
• The critical path computation time is the lower bound on the clock period: $T_{clock} \ge T_{critical}$
• Example: critical paths
  • 6→3→2→1: 5 u.t.
  • 5→3→2→1: 5 u.t.
  • Minimum clock period = 5 u.t.
• To achieve high speed, the length of the critical path can be reduced by pipelining and parallel processing (Chapter 3).

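The critical-path definition translates into a longest-path search over the edges that carry no delay. The sketch below is an illustration, not the slide's example: it reuses the three-node DFG from the loop-bound slides with an assumed edge placement (2D on B→A, D on C→A), and it assumes the zero-delay subgraph has no cycles, which holds for any computable DFG.

```python
from functools import lru_cache


def critical_path_time(node_time, edges):
    """Longest computation time over all paths that use only zero-delay edges."""
    succ = {n: [] for n in node_time}
    for u, v, delays in edges:
        if delays == 0:
            succ[u].append(v)

    @lru_cache(maxsize=None)
    def longest_from(n):
        # Time of node n plus the longest zero-delay continuation, if any.
        return node_time[n] + max((longest_from(v) for v in succ[n]), default=0)

    return max(longest_from(n) for n in node_time)


# Assumed edge placement for the three-node example.
node_time = {"A": 2, "B": 4, "C": 5}
edges = [("A", "B", 0), ("B", "A", 2), ("B", "C", 0), ("C", "A", 1)]
print(critical_path_time(node_time, edges))  # A->B->C: 2 + 4 + 5 = 11
```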
Algorithms to compute iteration bound

• Longest Path Matrix (LPM)
• Minimum Cycle Mean (MCM)

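Longest Path Matrix (LPM)

• With $d$ the number of delay elements, the matrices $L^{(m)}$ are computed recursively from $L^{(1)}$:
  $$l^{(m)}_{i,j} = \max_{k}\left(-1,\; l^{(1)}_{i,k} + l^{(m-1)}_{k,j}\right) \quad \text{for } k \in \{1, 2, \dots, d\};\; m = 2, 3, \dots, d$$
• The iteration bound is obtained from the diagonal entries:
  $$T_\infty = \max_{i,\,m}\left\{ \frac{l^{(m)}_{i,i}}{m} \right\} \quad \text{for } i, m \in \{1, 2, \dots, d\} \text{ and } l^{(m)}_{i,i} \ne -1$$
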
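The recurrence can be sketched in a few lines. The code below is a minimal illustration, not a reference implementation: the function names are hypothetical, and the iteration bound is taken as the largest diagonal entry $l^{(m)}_{i,i}/m$ that is not −1, which is how the LPM method is usually stated. The small test matrix is an assumption: it models the earlier A→B→A loop with 2D split into two unit delays $d_1 \to d_2$, so that $l^{(1)}_{1,2} = 0$ and $l^{(1)}_{2,1} = 2 + 4 = 6$.

```python
from fractions import Fraction


def next_lpm(l1, prev):
    """Compute L^(m) from L^(1) and L^(m-1); -1 marks 'no path'."""
    d = len(l1)
    out = [[-1] * d for _ in range(d)]
    for i in range(d):
        for j in range(d):
            # Only k with both entries different from -1 contribute.
            candidates = [l1[i][k] + prev[k][j]
                          for k in range(d)
                          if l1[i][k] != -1 and prev[k][j] != -1]
            out[i][j] = max(candidates, default=-1)
    return out


def iteration_bound_lpm(l1):
    """Largest l^(m)_{i,i} / m over m = 1..d, skipping entries equal to -1."""
    d = len(l1)
    current, best = l1, None
    for m in range(1, d + 1):
        for i in range(d):
            if current[i][i] != -1:
                ratio = Fraction(current[i][i], m)
                best = ratio if best is None else max(best, ratio)
        if m < d:
            current = next_lpm(l1, current)
    return best


# Assumed model of the A->B->A loop with 2D as two unit delays d1 -> d2.
l1 = [[-1, 0],
      [6, -1]]
print(iteration_bound_lpm(l1))  # 6 / 2 = 3
```

The result agrees with the loop bound of 3 obtained for the same loop by direct inspection.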
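Examples for LPM

$$L^{(1)} = \begin{bmatrix} -1 & 0 & -1 & -1 \\ 4 & -1 & 0 & -1 \\ 5 & -1 & -1 & 0 \\ 5 & -1 & -1 & -1 \end{bmatrix}$$

$$l^{(1)}_{i,j} = \begin{cases} \max_q t_q(d_i \to d_j) & \text{if at least one zero-delay path } d_i \to d_j \text{ exists} \\ -1 & \text{if no such path exists} \end{cases}$$

where $\max_q t_q(d_i \to d_j)$ is the longest computation time among all zero-delay paths from delay element $d_i$ to delay element $d_j$.
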
$l^{(m)}_{i,j} = \max_{k \in [1,d]}\left(-1,\; l^{(1)}_{i,k} + l^{(m-1)}_{k,j}\right)$, taken over the $k$ with $l^{(1)}_{i,k} \ne -1$ and $l^{(m-1)}_{k,j} \ne -1$

• Ex. i = 1; m = 2; $L^{(2)}$ (entries used: $l^{(1)}_{1,2}$, $l^{(1)}_{2,1}$, $l^{(1)}_{2,3}$)
  • j = 1: $l^{(2)}_{1,1} = \max_{k \in \{2\}}\left(-1,\; l^{(1)}_{1,k} + l^{(1)}_{k,1}\right) = \max(-1,\; 0 + 4) = 4$
  • j = 2: $l^{(2)}_{1,2} = \max\left(-1,\; l^{(1)}_{1,k} + l^{(1)}_{k,2}\right) = \max(-1) = -1$
  • j = 3: $l^{(2)}_{1,3} = \max_{k \in \{2\}}\left(-1,\; l^{(1)}_{1,k} + l^{(1)}_{k,3}\right) = \max(-1,\; 0 + 0) = 0$
  • j = 4: $l^{(2)}_{1,4} = \max\left(-1,\; l^{(1)}_{1,k} + l^{(1)}_{k,4}\right) = \max(-1) = -1$
  • Row i = 1 of $L^{(2)}$: $[\,4 \;\; -1 \;\; 0 \;\; -1\,]$

• Ex. i = 2; m = 2; $L^{(2)}$ (entries used: $l^{(1)}_{2,1}$, $l^{(1)}_{2,3}$, $l^{(1)}_{3,1}$)
  • $l^{(2)}_{2,1} = \max_{k \in \{3\}}\left(-1,\; l^{(1)}_{2,k} + l^{(1)}_{k,1}\right) = \max(-1,\; 0 + 5) = 5$
  • $l^{(2)}_{2,2} = \max_{k \in \{1\}}\left(-1,\; l^{(1)}_{2,k} + l^{(1)}_{k,2}\right) = \max(-1,\; 4 + 0) = 4$
  • $l^{(2)}_{2,3} = \max\left(-1,\; l^{(1)}_{2,k} + l^{(1)}_{k,3}\right) = \max(-1) = -1$
  • $l^{(2)}_{2,4} = \max_{k \in \{3\}}\left(-1,\; l^{(1)}_{2,k} + l^{(1)}_{k,4}\right) = \max(-1,\; 0 + 0) = 0$
  • Row i = 2 of $L^{(2)}$: $[\,5 \;\; 4 \;\; -1 \;\; 0\,]$

• Ex. i = 3; m = 2; $L^{(2)}$ (entries used: $l^{(1)}_{3,1}$, $l^{(1)}_{3,4}$, $l^{(1)}_{4,1}$)
  • $l^{(2)}_{3,1} = \max_{k \in \{4\}}\left(-1,\; l^{(1)}_{3,k} + l^{(1)}_{k,1}\right) = \max(-1,\; 0 + 5) = 5$
  • $l^{(2)}_{3,2} = \max_{k \in \{1\}}\left(-1,\; l^{(1)}_{3,k} + l^{(1)}_{k,2}\right) = \max(-1,\; 5 + 0) = 5$
  • $l^{(2)}_{3,3} = \max\left(-1,\; l^{(1)}_{3,k} + l^{(1)}_{k,3}\right) = \max(-1) = -1$
  • $l^{(2)}_{3,4} = \max\left(-1,\; l^{(1)}_{3,k} + l^{(1)}_{k,4}\right) = \max(-1) = -1$
  • Row i = 3 of $L^{(2)}$: $[\,5 \;\; 5 \;\; -1 \;\; -1\,]$

• Ex. i = 4; m = 2; $L^{(2)}$ (entry used: $l^{(1)}_{4,1}$)
  • $l^{(2)}_{4,1} = \max\left(-1,\; l^{(1)}_{4,k} + l^{(1)}_{k,1}\right) = \max(-1) = -1$
  • $l^{(2)}_{4,2} = \max_{k \in \{1\}}\left(-1,\; l^{(1)}_{4,k} + l^{(1)}_{k,2}\right) = \max(-1,\; 5 + 0) = 5$
  • $l^{(2)}_{4,3} = \max\left(-1,\; l^{(1)}_{4,k} + l^{(1)}_{k,3}\right) = \max(-1) = -1$
  • $l^{(2)}_{4,4} = \max\left(-1,\; l^{(1)}_{4,k} + l^{(1)}_{k,4}\right) = \max(-1) = -1$

$$L^{(2)} = \begin{bmatrix} 4 & -1 & 0 & -1 \\ 5 & 4 & -1 & 0 \\ 5 & 5 & -1 & -1 \\ -1 & 5 & -1 & -1 \end{bmatrix}$$

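The entry-by-entry walk-through of $L^{(2)}$ can be checked mechanically. The sketch below uses the $L^{(1)}$ matrix from the example; the helper name `entry_with_k` is hypothetical. It returns both the set of contributing indices $k$ (1-based, as on the slides) and the resulting entry.

```python
L1 = [[-1, 0, -1, -1],
      [4, -1, 0, -1],
      [5, -1, -1, 0],
      [5, -1, -1, -1]]


def entry_with_k(l1, prev, i, j):
    """Return (K, value) for l^(m)_{i,j}, using 1-based i, j and k."""
    d = len(l1)
    # K holds the k for which neither l^(1)_{i,k} nor l^(m-1)_{k,j} is -1.
    K = [k for k in range(1, d + 1)
         if l1[i - 1][k - 1] != -1 and prev[k - 1][j - 1] != -1]
    value = max((l1[i - 1][k - 1] + prev[k - 1][j - 1] for k in K), default=-1)
    return K, value


# m = 2, so the previous matrix is L^(1) itself.
L2 = [[entry_with_k(L1, L1, i, j)[1] for j in range(1, 5)] for i in range(1, 5)]
for row in L2:
    print(row)
# [4, -1, 0, -1]
# [5, 4, -1, 0]
# [5, 5, -1, -1]
# [-1, 5, -1, -1]

print(entry_with_k(L1, L1, 2, 1))  # ([3], 5)
```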
[Figure: four panels labeled (m=1), (m=2), (m=3), (m=4); contents not recovered]

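Applying the same recurrence twice more to the $L^{(1)}$ and $L^{(2)}$ of the example gives the remaining matrices. The values below were worked out here from the recurrence rather than read from the slides, and the last line assumes the usual LPM rule of taking the largest diagonal entry $l^{(m)}_{i,i}/m$ that is not −1, so treat them as a check.

```latex
\[
L^{(3)} = \begin{bmatrix} 5 & 4 & -1 & 0 \\ 8 & 5 & 4 & -1 \\ 9 & 5 & 5 & -1 \\ 9 & -1 & 5 & -1 \end{bmatrix},
\qquad
L^{(4)} = \begin{bmatrix} 8 & 5 & 4 & -1 \\ 9 & 8 & 5 & 4 \\ 10 & 9 & 5 & 5 \\ 10 & 9 & -1 & 5 \end{bmatrix}
\]
\[
T_\infty = \max\left\{ \tfrac{4}{2}, \tfrac{4}{2}, \tfrac{5}{3}, \tfrac{5}{3}, \tfrac{5}{3}, \tfrac{8}{4}, \tfrac{8}{4}, \tfrac{5}{4}, \tfrac{5}{4} \right\} = 2 \text{ u.t.}
\]
```

The fractions come from the diagonals of $L^{(2)}$ (4, 4), $L^{(3)}$ (5, 5, 5) and $L^{(4)}$ (8, 8, 5, 5); the diagonal of $L^{(1)}$ contains only −1 and contributes nothing.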
Examples for LPM (other solution)
$l^{(m)}_{i,j} = \max_{k \in [1,d]}\left(-1,\; l^{(1)}_{i,k} + l^{(m-1)}_{k,j}\right)$