
Vlsi DSP Chapter 3 Solution

Digital filters can be represented using block diagrams, signal flow graphs, and data flow graphs. These representations show the structure and flow of data through the system. Signal flow graphs and data flow graphs are useful for analyzing properties like iteration bounds, which represent the minimum time needed to complete one iteration through a feedback loop. Retiming techniques can modify these graphs to reduce iteration bounds and optimize hardware implementations of digital filters.

Uploaded by

sachin

Digital Filtering In Hardware

Representations of DSP
• Mathematical formulations
• Algorithms
• Behavioral description languages
• Applicative language
• Represents a set of equations satisfied by the variables, e.g. Silage
• Prescriptive language
• Explicitly specify the order of assignment, e.g. C and other HLLs
• Descriptive language
• Represents the structure of a DSP system, e.g. VHDL, Verilog
• Graphical representations
• Block diagrams
• Signal flow graph (SFG)
• Data flow graph (DFG)
• Dependence graph (DG)
VLSI DSP 2019 3-2
Block Diagrams (1)
• Consists of functional blocks connected with directed edges
• Functional block, e.g. Add, Mult
• Unit delay element
• Directed edge representing the data flow between blocks
• Basic blocks

VLSI DSP 2008 Y.T. Hwang 3-3


Block Diagrams (2)
• 3-tap FIR example

• Alternative block diagram with data broadcast



Signal Flow Graph (1)
• A collection of nodes and directed edges
• Node: computation or task
• Directed edge (j,k)
• A linear transformation from node j to node k
• Usually a constant-gain multiplier or a delay element
• Widely used in digital filter structures
• Flow graph reversal (transposition)
• A transform to obtain equivalent structure
• Applicable to single-input single-output (SISO) systems
• Reverse the directions of all edges
• Exchange the input and output nodes
• Retain the edge gain and edge delay



Signal Flow Graph (2)
• SFG of a 3-tap FIR filter
Original SFG

Transposed SFG
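
The equivalence of the original and transposed structures can be checked numerically. The following is a minimal sketch, assuming zero initial register contents; the function names `fir_direct` and `fir_transposed` are illustrative and do not come from the slides.

```python
def fir_direct(x, b):
    """Direct form: a delay line on the input feeds the multipliers."""
    taps = [0] * len(b)                  # taps[k] holds x(n-k)
    y = []
    for sample in x:
        taps = [sample] + taps[:-1]      # shift the delay line
        y.append(sum(bk * tk for bk, tk in zip(b, taps)))
    return y


def fir_transposed(x, b):
    """Transposed form: the input is broadcast, delays sit between adders."""
    state = [0] * (len(b) - 1)           # state[k] feeds adder k
    y = []
    for sample in x:
        products = [bk * sample for bk in b]
        y.append(products[0] + (state[0] if state else 0))
        # Each register takes the next product plus the next register's value.
        state = [
            products[k + 1] + (state[k + 1] if k + 1 < len(state) else 0)
            for k in range(len(state))
        ]
    return y


x = [1, 2, 3, 4]
b = [1, 2, 3]                            # 3-tap example coefficients (assumed)
print(fir_direct(x, b))
print(fir_transposed(x, b))
```

Both calls print the same sequence, which is what flow graph reversal predicts for a SISO system: the edge gains and delays are retained, only the edge directions and the input/output roles change.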



Signal Flow Graph (3)
• Limitations of transposition
• Can be applied to MIMO systems described by symmetric transform matrices
• More on SFG
• Applicable to linear networks
• Cannot be used to describe multi-rate systems



Data Flow Graph (1)
• DFG
• Node: computation (function or subtask)
• Directed edge: data path or communication between nodes
• Associated edge delay: non-negative
• Associated node delay: execution time of each node
[Figure: three panels titled Block diagram, Conventional DFG, and Synchronous DFG; nodes labeled add and mpy]


Data Flow Graph (2)
• Applications: high level synthesis
• Firing rules
• A node can fire whenever all the input data are available
• Concurrency: multiple nodes can be fired simultaneously
• Data driven (implicit) scheduling
• Precedence constraint
• Intra-iteration: imposed by edge with no delay
• Inter-iteration: imposed by edge with delay
• Fine-grain (atomic) vs. coarse-grain DFG



Data Flow Graph (3)
• 3-tap FIR filter example
Direct form

Transpose form



Data Flow Graph (4)
• Synchronous DFG
• Number of data samples produced or consumed by each node is specified a priori
• Single-rate system
• Multi-rate system: different nodes operate at different frequencies
• A multi-rate system can be represented by a single-rate system via unfolding (unrolling)



Introduction to Iteration bound

• DSP algorithms often contain feedback loops


• Feedback loops impose an inherent lower bound on the achievable iteration or sample period
• Iteration bound
• Impossible to achieve an iteration period less than the iteration bound, even with infinite hardware
[Figure: timeline along t showing iterations k-1, k, k+1, k+2 and the iteration period]
Data Flow Graph Representations
• For n = 0 to ∞: y(n) = ay(n-1) + x(n)
[Figure: DFG of this recursion with nodes A and B, annotated with the intra-iteration edge, the inter-iteration edge, the execution time of a node, and the critical path A→B]

• Iteration – execution of each DFG node once


• Precedence constraints
• Intra-iteration – no delay on edge
• Inter-iteration – at least one delay on edge



Critical Path
• Critical path of a DFG
• The path with the longest computation time among all paths containing zero delays
• The minimum computation time for one iteration of the DFG
• 6→3→2→1
• 5→3→2→1
• Iteration period = 5 u.t.
• Iteration bound
• A recursive DFG has a lower bound on the shortest iteration period



Loop bound and iteration bound (1)
• Loop bound
• Minimum time to execute one loop in the DFG
• tl / wl: tl = loop computation time, wl = number of delays in the loop

• (a) loop bound = (4+2)/2 = 3


• (b) loop bound 1 = (4+2)/2 = 3
• (b) loop bound 2 = (2+4+5)/1 = 11



Loop bound and iteration bound (2)
• In (a), two independent sets of computing threads
• Two iterations in every 6 u.t. ⇒ iteration period = 3 u.t.
• A0→B0→A2→B2→A4→B4→A6→…
• A1→B1→A3→B3→A5→B5→A7→…

• In (b)
• Loop 1: A→B→A
• Loop 2: A→B→C→A (critical loop)



Loop bound and iteration bound (3)
• Loop bound of the critical loop = iteration bound of the DSP algorithm

$$T_\infty = \max_{l \in L} \left\{ \frac{t_l}{w_l} \right\}$$

• For DFG (b):

$$T_\infty = \max_{l \in L} \left\{ \frac{6}{2}, \frac{11}{1} \right\} = 11 \text{ u.t.}$$

• Algorithms to find T∞
• Longest path matrix algorithm
• Minimum cycle mean algorithm
• Negative cycle detection algorithm
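
For small graphs, T∞ can also be found by brute force: enumerate every simple loop and take the largest tl / wl. The sketch below does exactly that; it is not one of the three algorithms listed above. The function name `iteration_bound`, the edge-list format, and the exact wiring of DFG (b) are assumptions chosen to reproduce the loop sums (2+4)/2 and (2+4+5)/1.

```python
from fractions import Fraction


def iteration_bound(node_time, edges):
    """Max over all simple loops of (loop computation time / loop delays).

    node_time: dict mapping node -> integer computation time
    edges: list of (src, dst, delays) tuples
    """
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, []).append((v, w))
    order = sorted(node_time)
    rank = {n: i for i, n in enumerate(order)}
    best = None

    def dfs(start, node, time, delays, visited):
        nonlocal best
        for nxt, w in adj.get(node, []):
            if nxt == start:                      # closed a loop
                total = delays + w
                if total == 0:
                    raise ValueError("delay-free loop: DFG is not computable")
                bound = Fraction(time, total)
                best = bound if best is None else max(best, bound)
            elif nxt not in visited and rank[nxt] > rank[start]:
                # Only visit higher-ranked nodes so each loop is found once.
                dfs(start, nxt, time + node_time[nxt], delays + w, visited | {nxt})

    for s in order:
        dfs(s, s, node_time[s], 0, {s})
    return best


# Assumed wiring for DFG (b): loop A->B->A with 2 delays, loop A->B->C->A with 1.
times = {"A": 2, "B": 4, "C": 5}
edges = [("A", "B", 0), ("B", "A", 2), ("B", "C", 0), ("C", "A", 1)]
print(iteration_bound(times, edges))
```

Exhaustive enumeration grows exponentially with graph size, which is why the dedicated algorithms in the list are preferred for larger DFGs.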



Cut-set Retiming
• Feed-forward cut-set
• Feed-back cut-set
• Delay transfer theorem
• Adding an arbitrary non-negative number of delays to each edge of a feed-forward cut-set of a DFG will not alter its output, except that the output timing will be delayed.
• Transferring the same number of delays from the edges of one direction across a feed-back cut-set of a DFG to all edges of the opposite direction across the same cut-set will not alter the output, only its timing.

(C) 2004-2006 by Yu Hen Hu


Feed-forward Cut-Set Retiming
• Consider the FIR digital filter and its DFG:
y(n) = b0x(n) + b1x(n-1)
• Critical path length = TM+TA
• Select a cut set
• Insert a delay on each edge in the cut set.
• Retiming:
ynew(n) = b0x(n-1) + b1x(n-2)
ynew(n) = y(n-1)
• Critical path = Max(TM, TA)
[Figure: DFG of the filter before and after retiming; after retiming, a delay D sits on each edge between the multipliers (b0, b1) and the adder]
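
The relation ynew(n) = y(n-1) can be confirmed with a short simulation. This is a sketch under the assumption that all registers start at zero and that the cut set consists of the two multiplier-to-adder edges; the names `fir2` and `fir2_retimed` are illustrative.

```python
def fir2(x, b0, b1):
    """y(n) = b0*x(n) + b1*x(n-1), with x(-1) = 0."""
    prev = 0
    y = []
    for s in x:
        y.append(b0 * s + b1 * prev)
        prev = s
    return y


def fir2_retimed(x, b0, b1):
    """Same filter with one register on each multiplier-to-adder edge."""
    prev = 0          # input delay register, holds x(n-1)
    r0 = r1 = 0       # registers inserted on the cut-set edges
    y = []
    for s in x:
        y.append(r0 + r1)             # adder uses last cycle's products
        r0, r1 = b0 * s, b1 * prev    # multipliers work in the same cycle
        prev = s
    return y


x = [1, 2, 3, 4]
print(fir2(x, 2, 3))
print(fir2_retimed(x, 2, 3))
```

The second sequence is the first one shifted by one sample, so the filter function is unchanged while the adder and multipliers no longer sit on the same register-to-register path.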



Feed-back Cut Set Retiming
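• Consider an IIR digital filter
y(n) = a·y(n-2) + x(n)
• Before retiming: loop bound = (TM+TA)/2, clock cycle = TM+TA
• Shift 1 delay to the other edge across a feed-back cut set
• After retiming: loop bound = (TM+TA)/2, clock cycle = Max(TM, TA)
• Filter remains unchanged.
[Figure: the IIR loop before retiming, with 2D on one edge, and after retiming, with the two delays split as D and D across the cut set]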
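
A register-level simulation illustrates that moving one delay across the feed-back cut set leaves the output unchanged. This is a minimal sketch assuming zero initial state, with one register placed after the adder and one after the multiplier in the retimed version; the function names are illustrative.

```python
def iir_original(x, a):
    """y(n) = a*y(n-2) + x(n), with both delays (2D) on the same edge."""
    d1 = d2 = 0                   # d1 = y(n-1), d2 = y(n-2)
    y = []
    for s in x:
        out = s + a * d2          # multiply and add in the same clock cycle
        d1, d2 = out, d1
        y.append(out)
    return y


def iir_retimed(x, a):
    """Same filter with one delay before and one after the multiplier."""
    r1 = r2 = 0                   # r1 = y(n-1), r2 = a*y(n-2)
    y = []
    for s in x:
        out = s + r2              # adder only
        r1, r2 = out, a * r1      # multiplier only, stored for the next cycle
        y.append(out)
    return y


x = [1, 0, 0, 0, 0]
print(iir_original(x, 2))
print(iir_retimed(x, 2))
```

Both versions produce the same impulse response. In the retimed loop, the adder and the multiplier are separated by a register, which is why the clock cycle drops from TM+TA to Max(TM, TA).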



Timing Diagram
• Assume tM = tA = 1 t.u.
• Before retiming
x(1) x(2) x(3) x(4)
MAC 1 2 3 4
y(1) y(2) y(3) y(4)

• After retiming

x(1) x(2) x(3) x(4) x(5) x(6) x(7) x(8)
Add 1 2 3 4 5 6 7 8
y(1) y(2) y(3) y(4) y(5) y(6) y(7) y(8)
a·y(1)
Mul 0 1 2 3 4 5 6 7 8
Feed-back Cut Set Retiming
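• Consider an IIR digital filter
y(n) = ay(n-1) + x(n)
• loop bound = (TM+TA)
• throughput = 1/(TM+TA)
• Slowed-down version with input x(m), output y(m), and D replaced by 2D:
x(2k-1) = x(k), x(2k) = 0
• Clock period = (TM+TA)
• Throughput = 1/[2(TM+TA)]
[Figure: the first-order IIR loop with delay D, and its slowed-down version with delay 2D]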
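
The slowdown step can be sketched in code: every delay is doubled and a zero is interleaved after each input sample, so every second output of the slow filter reproduces the original output. The sketch assumes zero initial state and uses 0-based indexing (data on even samples), whereas the slide's x(2k-1) = x(k) convention is 1-based; the function names are illustrative.

```python
def iir1(x, a):
    """y(n) = a*y(n-1) + x(n), with y(-1) = 0."""
    prev = 0
    y = []
    for s in x:
        prev = a * prev + s
        y.append(prev)
    return y


def iir1_two_slow(x, a):
    """2-slow version: D replaced by 2D, zeros interleaved into the input."""
    u = []
    for s in x:
        u.extend([s, 0])          # u(2k) = x(k), u(2k+1) = 0
    d1 = d2 = 0                   # two delays in the feedback loop
    z = []
    for s in u:
        out = s + a * d2          # z(m) = a*z(m-2) + u(m)
        d1, d2 = out, d1
        z.append(out)
    return z


x = [1, 2, 3]
print(iir1(x, 2))
print(iir1_two_slow(x, 2)[::2])   # keep only the samples that carry data
```

Only half of the clock cycles carry useful data, which matches the throughput of 1/[2(TM+TA)] on the slide. The extra delay in the loop is what later makes retiming possible.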



Slowdown + Retiming
• Start with y(n) = a y(n-1) + x(n), slowed down (input x(m), output y(m))
• clock cycle = Max(TM, TA)
• Throughput = 1/[2max(TM,TA)]
• Start with y(n) = a y(n-2) + x(n)
• loop bound = (TM+TA)/2
• clock cycle = Max(TM, TA)
• throughput = 1/Max(TM, TA)
[Figure: both retimed loops, each with one delay D on either side of the multiplier a]



Example 3.2.1
• Node delay = 1 t.u.
• Before retiming:
• Critical path: a3 → a4 → a5 → a6
• Clock cycle time = 4
• 2 delay units
• After cut-set retiming
• Critical path: a3 → a5, a4 → a6
• Clock cycle time = 2
• 6 delay units
• After additional retiming
• Critical path: none
• Clock cycle time = 1
• 11 delay units
[Figure: the DFG with nodes a1 to a6 before retiming, after cut-set retiming, and after additional retiming, with edge delays D and 2D]
DFG Illustration of the Example

T = max. {(1+2+1)/2, (1+2+1)/3} = 2 T = max. {(1+2+1)/2, (1+2+1)/3} = 2


Cr. Path delay = 2+1 = 3 t.u Cr. Path Delay = max{2,2,1+1} = 2 t.u



Dependence Graph (DG)
• The basic representation of an algorithm.
• Shows only dependency among operations.
• No notion of delay is represented.
• No loops or cycles allowed.
• No implementation or hardware constraints are imposed on the DG.
• Can be used to represent asynchronous operations.
• Most useful in exploiting inherent parallelism in the algorithm

(C)2002-2006 Yu Hen Hu 26
Data Flow Graph
• Node:
• Computation
• Associated with a computing time.
• Directed edge:
• Data path and delay
• Delay labeled with D or a positive integer on edges
• Delay: iteration count
• Example
y(n) = a*y(n-1) + b*u(n)
• The delay of 1 u.t. indicates that computing y(n+1) in the next iteration depends on the result y(n) of the present iteration.

DFG
• Intra-iteration dependency
• A directed edge without any delay
• Inter-iteration dependency
• A directed edge with 1 or more delays
• Node computing delay is labeled in parentheses.
• Critical path: longest path between …
• Example: critical path delay = 4+2+2 = 8 t.u.
• Recursive DFG: contains loops. Must have at least one delay element along any loop. Otherwise, the algorithm is NON-computable!
[Figure: DFG with input x(n), two delays D, multipliers M0, M1, M2 (4 each), adders A0, A1 (2 each), and output y(n)]
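
The critical path delay of 8 t.u. can be reproduced by a longest-path search over the zero-delay edges only. The sketch below assumes the wiring M0→A0, M1→A0, A0→A1, M2→A1, which is consistent with the node labels and the sum 4+2+2 but is not confirmed by the lost figure. It also assumes the zero-delay subgraph is acyclic, which holds for any computable DFG.

```python
from functools import lru_cache


def critical_path(node_time, edges):
    """Longest computation time along any path made of zero-delay edges.

    node_time: dict mapping node -> computation time
    edges: list of (src, dst, delays) tuples
    """
    zero = {}
    for u, v, d in edges:
        if d == 0:                        # delayed edges break the path
            zero.setdefault(u, []).append(v)

    @lru_cache(maxsize=None)
    def longest_from(u):
        tails = (longest_from(v) for v in zero.get(u, []))
        return node_time[u] + max(tails, default=0)

    return max(longest_from(u) for u in node_time)


# Assumed wiring of the example DFG (multipliers 4 t.u., adders 2 t.u.).
times = {"M0": 4, "M1": 4, "M2": 4, "A0": 2, "A1": 2}
edges = [("M0", "A0", 0), ("M1", "A0", 0), ("A0", "A1", 0), ("M2", "A1", 0)]
print(critical_path(times, edges))
```

Placing a delay on the A0→A1 edge in this model would shorten the result to 6, which is the effect retiming aims for.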

Loop bound and Iteration bound
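[Figure: DFG with nodes A (2), B (4), C (5), where loop A-B-A carries 2D and loop A-B-C-A carries D; and a DFG with only nodes A (2), B (4) and 2D]

$$T_{loop} = \frac{\sum_i t_i^{loop}}{\sum_i d_i^{loop}}$$

$$T_\infty = \max_{\text{all loops}} T_{loop}$$

• T{A-B-A} = (2+4)/2 = 3 t.u.
• T∞ = max{(2+4)/2, (2+4+5)/1} = max{3, 11} = 11 t.u.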
