
FPGAs in DSP Applications: Haibo Wang, ECE Department, Southern Illinois University, Carbondale, IL 62901

This document provides an overview of fixed-point and floating-point number systems and their application in digital signal processing (DSP) circuits implemented using FPGAs. It discusses number representation formats, arithmetic operations, and common DSP functions such as filters and fast Fourier transforms. Filter implementation techniques including direct form, pipelined, parallel and serial architectures are described. Distributed arithmetic is also covered as an efficient method for implementing multipliers with a constant input.

Copyright
© Attribution Non-Commercial (BY-NC)

13-1

ECE 428 Programmable ASIC Design


Haibo Wang
ECE Department
Southern Illinois University
Carbondale, IL 62901
FPGAs in DSP Applications
13-2
Overview
Motivation
  Digital signal processing (DSP) is one of the most active areas in VLSI applications.
  Traditionally, DSP algorithms are implemented either on general-purpose DSP processors (low speed, less expensive, flexible) or in ASICs (high speed, expensive, less flexible).
  FPGAs provide solutions that retain the advantages of both the DSP-processor approach and the ASIC approach.
Outline
  Number Systems
    Fixed-Point Number System
    Floating-Point Number System
  VLSI Architectures for DSP Circuits
  Distributed Arithmetic Circuits
13-3
Fixed-Point Number System
Binary number representation of fixed-point numbers
Bit layout (the "." marks the binary point): $b_n \, b_{n-1} \, b_{n-2} \ldots b_0 \, . \, b_{-1} \, b_{-2} \ldots b_{-m}$
$$ \text{value} = \sum_{i=0}^{n} b_i \cdot 2^{i} + \sum_{j=1}^{m} b_{-j} \cdot 2^{-j} $$
Examples:
Binary      Decimal
101.0101    5.3125
000.011     0.375

Decimal     Binary
1.25        1.01
1.24        1.001111010...
If the binary number can have 8 bits for the fractional part, we can use 1.00111101 (1.23828125) to approximate 1.24.
If the binary number can have 7 bits for the fractional part, we can use 1.0011111 (1.2421875) to approximate 1.24.
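The approximations of 1.24 above can be reproduced with a short Python sketch. The function names `to_fixed` and `to_binary_string` are hypothetical, and the sketch assumes non-negative values and at least one fractional bit. The 8-bit example corresponds to truncation and the 7-bit example to rounding to the nearest code.

```python
import math

def to_fixed(value, frac_bits, mode="truncate"):
    """Approximate a non-negative value using frac_bits fractional bits."""
    scaled = value * (1 << frac_bits)
    if mode == "truncate":
        code = math.floor(scaled)          # drop everything below the last bit
    else:
        code = math.floor(scaled + 0.5)    # round to the nearest code
    return code / (1 << frac_bits)

def to_binary_string(value, int_bits, frac_bits):
    """Render an exactly representable value as a binary string with a binary point."""
    code = round(value * (1 << frac_bits))
    digits = format(code, "0{}b".format(int_bits + frac_bits))
    return digits[:-frac_bits] + "." + digits[-frac_bits:]

print(to_fixed(1.24, 8))                     # truncated to 8 fractional bits
print(to_fixed(1.24, 7, mode="round"))       # rounded to 7 fractional bits
print(to_binary_string(5.3125, 3, 4))
```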
13-4
Arithmetic Operations for Fixed-Point Numbers
Add/Subtract
[Figure: operands $B = b_n \, b_{n-1} \ldots b_0 \, . \, b_{-1} \ldots b_{-m}$ and $A = a_n \, a_{n-1} \ldots a_0 \, . \, a_{-1} \ldots a_{-m}$ feed an adder/subtracter (+/-) that produces $S = s_n \, s_{n-1} \ldots s_0 \, . \, s_{-1} \ldots s_{-m}$ and a carry c]
The addition or subtraction of two fixed-point numbers can be performed by a regular adder or subtracter if the binary points of the two numbers are aligned.
The binary point stays in the same position in the result.
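As a rough illustration of binary-point alignment, the following Python sketch stores each fixed-point number as an integer code plus a count of fractional bits. The function name `align_and_add` and this representation are assumptions made for the example, not part of the slides.

```python
def align_and_add(a_code, a_frac_bits, b_code, b_frac_bits, subtract=False):
    """Add or subtract two fixed-point numbers given as (integer code, fractional bits)."""
    frac_bits = max(a_frac_bits, b_frac_bits)
    # Left-shift the operand with fewer fractional bits so the binary points line up.
    a_aligned = a_code << (frac_bits - a_frac_bits)
    b_aligned = b_code << (frac_bits - b_frac_bits)
    total = a_aligned - b_aligned if subtract else a_aligned + b_aligned
    # The binary point of the result is in the same position as in the aligned operands.
    return total, frac_bits

# 101.0101 (5.3125, 4 fractional bits) plus 000.011 (0.375, 3 fractional bits)
code, frac_bits = align_and_add(0b1010101, 4, 0b000011, 3)
print(code / (1 << frac_bits))
```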
13-5
Arithmetic Operations for Fixed-Point Numbers
Multiplication
[Figure: (k bits | n - k bits) x (l bits | m - l bits) = (k + l bits | n + m - k - l bits)]
Arithmetic operation with fixed word length
For the convenience of hardware implementation, we prefer the product of a multiplication to keep the same length as the multiplicand or the multiplier (assuming they have the same length). To achieve this, we normally truncate the least significant bits of the product.
[Figure: an n-bit number times an n-bit number gives a 2n-bit product, which is truncated to n bits]
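The growth of the product word length and the truncation step can be sketched in Python as follows. The helper names `fixed_mul` and `truncate` are hypothetical, and the sketch assumes non-negative numbers stored as integer codes with a known number of fractional bits.

```python
def fixed_mul(a_code, a_frac_bits, b_code, b_frac_bits):
    """Full-precision product: the fractional bit counts of the operands add up."""
    return a_code * b_code, a_frac_bits + b_frac_bits

def truncate(code, frac_bits, keep_frac_bits):
    """Drop the least significant fractional bits of a non-negative code."""
    return code >> (frac_bits - keep_frac_bits)

# 1.01 (1.25, 2 fractional bits) times 0.011 (0.375, 3 fractional bits)
full, full_frac = fixed_mul(0b101, 2, 0b0011, 3)
print(full / (1 << full_frac))          # exact product with 5 fractional bits
short = truncate(full, full_frac, 3)
print(short / (1 << 3))                 # product truncated to 3 fractional bits
```

Truncation makes the stored result slightly smaller than the exact product whenever the dropped bits are not all zero.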
13-6
Arithmetic Operations for Fixed-Point Numbers
Normalized fixed-point numbers
Scale all the numbers involved in the computation by a factor K such that all of them fall within the range from 0 to 1.
[Figure: fixed-point number after normalization, n bits]
Addition/Subtraction
[Figure: n bits +/- n bits = n bits]
Multiplication
[Figure: n bits x n bits = n bits | n bits, where the lower n bits are truncated]
13-7
Representation of Negative Numbers
Signed-magnitude numbers
[Figure: S | normalized magnitude]
Sign bit S: 0 for a positive number and 1 for a negative number
2's complement numbers
[Figure: S | normalized 2's complement number]
Sign bit S: 0 for a positive number and 1 for a negative number
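The two encodings can be compared with a small Python sketch. It assumes a normalized value with magnitude below 1, stored as one sign bit followed by n fractional bits; the function names are hypothetical.

```python
def signed_magnitude(value, n):
    """Sign bit followed by the n-bit magnitude of a value with |value| < 1."""
    magnitude = int(abs(value) * (1 << n))
    sign = 1 if value < 0 else 0
    return (sign << n) | magnitude

def twos_complement(value, n):
    """(n + 1)-bit 2's complement code of a value with |value| < 1."""
    code = int(abs(value) * (1 << n))
    if value < 0:
        code = -code
    return code & ((1 << (n + 1)) - 1)   # keep only the n + 1 stored bits

for v in (0.375, -0.375):
    print(v, format(signed_magnitude(v, 3), "04b"), format(twos_complement(v, 3), "04b"))
```

Positive numbers have the same code in both formats; only negative numbers differ.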
13-8
Floating-Point Numbers
Scientific Notation
$6.02 \times 10^{23}$   [Figure labels: decimal point, radix (base)]
Binary Floating-Point Numbers
$1.0_{\text{two}} \times 2^{-1}$   [Figure labels: mantissa, binary point, radix (base)]
13-9
Floating-Point Representation
Normal format: $+1.xxxxxxxxxx_{\text{two}} \times 2^{yyyy_{\text{two}}}$
Field layout: S | Exponent | Significand
S represents the sign
- 1 for a negative number and 0 for a positive number

Exponent represents yyyy
- It is a biased number, also called an excess-bias number. E.g., if a number A is in excess-8 coding, the real value of the number is A - 8.

Significand represents xxxxxxxxxx
$$ \text{value} = (-1)^{S} \times (1 + \text{Significand}) \times 2^{(\text{Exponent} - \text{Bias})} $$
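A minimal Python sketch of decoding the three fields is given below. The default field widths (8 exponent bits, 23 significand bits, bias 127) are those of IEEE 754 single precision, which is an assumption here since the slides do not name a specific format. The sketch handles normalized numbers only (no zero, subnormal, infinity, or NaN), and the function names are hypothetical.

```python
import struct

def decode_fields(bits, exp_bits=8, frac_bits=23, bias=127):
    """Evaluate (-1)^S * (1 + Significand) * 2^(Exponent - Bias) for a normalized number."""
    sign = bits >> (exp_bits + frac_bits)
    exponent = (bits >> frac_bits) & ((1 << exp_bits) - 1)
    fraction = bits & ((1 << frac_bits) - 1)
    significand = fraction / (1 << frac_bits)     # the xxxx part as a fraction
    return (-1) ** sign * (1 + significand) * 2.0 ** (exponent - bias)

def float32_bits(x):
    """Bit pattern of x when stored as an IEEE 754 single-precision number."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

print(decode_fields(float32_bits(-6.5)))
# A toy format with excess-8 coding: 1 sign bit, 4 exponent bits, 3 significand bits
print(decode_fields(0b01001100, exp_bits=4, frac_bits=3, bias=8))
```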

13-10
Arithmetic Operations of Floating-Point Numbers
Assume numbers
$$ X = X_M \cdot 2^{X_E}, \qquad Y = Y_M \cdot 2^{Y_E} $$
Addition/Subtraction:
$$ X \pm Y = (X_M \cdot 2^{X_E - Y_E} \pm Y_M) \cdot 2^{Y_E}, \quad \text{where } X_E < Y_E $$
1. Compute $Y_E - X_E$, a fixed-point subtraction
2. Right shift $X_M$ by $Y_E - X_E$ bits to obtain $X_M \cdot 2^{X_E - Y_E}$
3. Compute $X_M \cdot 2^{X_E - Y_E} \pm Y_M$, a fixed-point addition or subtraction
Multiplication:
$$ X \cdot Y = (X_M \cdot Y_M) \cdot 2^{X_E + Y_E} $$
1. Compute $X_M \cdot Y_M$, a fixed-point multiplication
2. Compute $X_E + Y_E$, a fixed-point addition
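The steps above can be sketched in Python with integer mantissas and exponents. This is a simplified illustration under stated assumptions: mantissas are non-negative integers, only addition is shown, and the result is not renormalized or rounded. The function names `fp_add` and `fp_mul` are hypothetical.

```python
def fp_add(xm, xe, ym, ye):
    """Return (mantissa, exponent) of X + Y for X = xm * 2**xe and Y = ym * 2**ye."""
    if xe > ye:
        xm, xe, ym, ye = ym, ye, xm, xe   # make X the operand with the smaller exponent
    shift = ye - xe                       # step 1: fixed-point subtraction
    xm_aligned = xm >> shift              # step 2: right shift (shifted-out bits are lost)
    return xm_aligned + ym, ye            # step 3: fixed-point addition

def fp_mul(xm, xe, ym, ye):
    """Return (mantissa, exponent) of X * Y."""
    return xm * ym, xe + ye               # multiply mantissas, add exponents

m, e = fp_add(12, 0, 5, 2)                # 12 + 20
print(m * 2 ** e)
m, e = fp_mul(12, 0, 5, 2)                # 12 * 20
print(m * 2 ** e)
```

Because the right shift discards low-order bits, `fp_add(13, 0, 5, 2)` returns the same result as `fp_add(12, 0, 5, 2)`; this is the precision loss inherent in aligning the mantissas.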
13-11
DSP Applications
Common DSP functions that are implemented in VLSI
Filters (FIR, IIR)
Fast Fourier Transform (FFT)
Discrete Cosine Transform (DCT)
Encoder/decoder and error correction/detection functions

FIR (Finite Impulse Response) Filter
$$ y[n] = a_0 \cdot x[n] + a_1 \cdot x[n-1] + \cdots + a_k \cdot x[n-k] $$
1. y[n] is the output at the nth clock cycle; x[n] is the input at the nth clock cycle
2. $a_0, a_1, \ldots, a_k$ are the filter coefficients
IIR (Infinite Impulse Response) Filter
$$ y[n] = a_0 \cdot x[n] + \cdots + a_k \cdot x[n-k] + b_1 \cdot y[n-1] + \cdots + b_m \cdot y[n-m] $$
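The two difference equations can be evaluated directly in software. The Python sketch below is a behavioral reference, not a hardware description; it assumes samples before n = 0 are zero, and it takes the feedback coefficients as a list `b` holding b_1 through b_m. The function names are hypothetical.

```python
def fir(a, x):
    """y[n] = a[0]*x[n] + a[1]*x[n-1] + ... + a[k]*x[n-k]."""
    y = []
    for n in range(len(x)):
        acc = 0
        for i, coeff in enumerate(a):
            if n - i >= 0:                 # samples before n = 0 are taken as zero
                acc += coeff * x[n - i]
        y.append(acc)
    return y

def iir(a, b, x):
    """FIR part as above plus feedback b[j-1]*y[n-j] for j = 1..m."""
    y = []
    for n in range(len(x)):
        acc = sum(a[i] * x[n - i] for i in range(len(a)) if n - i >= 0)
        acc += sum(b[j - 1] * y[n - j] for j in range(1, len(b) + 1) if n - j >= 0)
        y.append(acc)
    return y

print(fir([1, 2, 3, 4], [1, 0, 0, 0, 0]))   # impulse response stops after 4 samples
print(iir([1], [0.5], [1, 0, 0, 0]))        # impulse response keeps decaying
```

Feeding an impulse shows the difference in the names: the FIR output returns to zero once the impulse has passed all taps, while the IIR output never reaches exactly zero in ideal arithmetic.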
13-12
FIR Filter Implementation
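Example:
$$ y[n] = a_0 \cdot x[n] + a_1 \cdot x[n-1] + a_2 \cdot x[n-2] + a_3 \cdot x[n-3] $$
Each term is a tap. This is a 4-tap FIR filter.
Canonic form implementation:
[Figure: input x[n], three delay elements (D), four multipliers with coefficients a_0, a_1, a_2, a_3, three adders, output y[n]]
Clock frequency:
$$ f_{clk} \le \frac{1}{t_{mult} + 3 \cdot t_{adder}} $$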
13-13
FIR Filter Implementation
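Pipelined implementation 1:
[Figure: 4-tap FIR filter with input x[n], coefficients a_0, a_1, a_2, a_3, three adders, additional pipeline registers (D), and output y[n-3]]
Clock frequency:
$$ f_{clk} \le \frac{1}{t_{mult} + t_{adder}} $$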
13-14
FIR Filter Implementation
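Pipelined implementation 2:
[Figure: 4-tap FIR filter with input x[n], coefficients a_0, a_1, a_2, a_3, three adders, additional pipeline registers (D), and output y[n-3]]
Clock frequency (assume $t_{mult} > t_{adder}$):
$$ f_{clk} \le \frac{1}{t_{mult}} $$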
13-15
FIR Filter Implementation
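Pipelined implementation 3 (inverted form):
[Figure: input x[n], four multipliers with coefficients a_3, a_2, a_1, a_0, three adders, three delay elements (D), output y[n]]
Clock frequency:
$$ f_{clk} \le \frac{1}{t_{mult} + t_{adder}} $$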
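A behavioral Python sketch can show why the inverted form produces y[n] without extra latency. It assumes the inverted form is the structure often called the transposed form, in which x[n] drives all multipliers at once and the registers sit between the adders; that reading of the figure, and the function names, are assumptions.

```python
def fir_direct(a, x):
    """Reference: y[n] = sum over i of a[i] * x[n-i], with zero initial conditions."""
    return [sum(a[i] * x[n - i] for i in range(len(a)) if n - i >= 0)
            for n in range(len(x))]

def fir_inverted(a, x):
    """Registers between the adders; every multiplier sees the current sample."""
    regs = [0] * (len(a) - 1)             # regs[0] feeds the output adder
    y = []
    for sample in x:
        products = [coeff * sample for coeff in a]
        y.append(products[0] + (regs[0] if regs else 0))
        # On the clock edge each register stores its product plus the upstream register.
        new_regs = []
        for i in range(len(regs)):
            upstream = regs[i + 1] if i + 1 < len(regs) else 0
            new_regs.append(products[i + 1] + upstream)
        regs = new_regs
    return y

a = [1, 2, 3, 4]
x = [3, -1, 4, 1, -5, 9]
print(fir_inverted(a, x) == fir_direct(a, x))
```

In this arrangement only one multiplier and one adder lie between any two registers, which matches the clock bound of one multiplier delay plus one adder delay.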
13-16

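FIR Filter Implementation
Pipelined implementation 4:
[Figure: 4-tap FIR filter with input x[n], coefficients a_3, a_2, a_1, a_0, three adders, additional pipeline registers (D), and output y[n-1]]
Clock frequency (assume $t_{mult} > t_{adder}$):
$$ f_{clk} \le \frac{1}{t_{mult}} $$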
13-17
FIR Filter Implementation
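Pipelined implementation 5:
[Figure: 4-tap FIR filter with input x[n], coefficients a_0, a_1, a_2, a_3, three adders, pipeline registers (D), and output y[n-2]]
Clock frequency (assume $t_{mult} > t_{adder}$):
$$ f_{clk} \le \frac{1}{t_{mult}} $$
Difficult to lay out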
13-18
FIR Filter Implementation
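Parallel implementation 1:
[Figure: parallel 4-tap FIR filter. Signal labels: x[n+3], x[n+2], x[n+1], x[n], x[n-1]; two sets of multipliers with coefficients a_0, a_1, a_2, a_3; adders and registers (D); outputs y[n+1] and y[n]]
Clock frequency:
$$ f_{clk} \le \frac{1}{t_{mult} + t_{adder}} $$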
13-19
FIR Filter Implementation
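Parallel implementation 2:
[Figure: parallel 4-tap FIR filter. Signal labels: x[n+3], x[n+2], x[n+1], x[n], x[n-1]; two sets of multipliers with coefficients a_0, a_1, a_2, a_3; adders and additional pipeline registers (D); outputs y[n-1] and y[n-2]]
Clock frequency, if $t_{mult} > t_{adder}$:
$$ f_{clk} \le \frac{1}{t_{mult}} $$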

13-20
FIR Filter Implementation
Parallel implementation 3:
[Figure: parallel 4-tap FIR filter. Signal labels: x[n+3], x[n+2], x[n+1], x[n]; two sets of multipliers with coefficients a_0, a_1, a_2, a_3; adders and additional pipeline registers (D); outputs y[n-5] and y[n-6]]

13-21
FIR Filter Implementation
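Serial implementation:
[Figure: the samples x[n], x[n-1], x[n-2], x[n-3] (held in delay elements D) and the coefficients a_0, a_1, a_2, a_3 are fed to a single multiplier accumulator (MAC), which produces y[n]]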
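The serial scheme can be modeled in Python as one multiply-accumulate step per tap. This is a behavioral sketch that assumes the MAC is cleared before each output and that the sample window starts at zero; the function names are hypothetical.

```python
def mac_output(a, window):
    """One shared multiplier and accumulator, used once per tap."""
    acc = 0                                   # accumulator cleared for each output
    for coeff, sample in zip(a, window):      # one MAC cycle per tap
        acc += coeff * sample
    return acc

def fir_serial(a, x):
    """window[i] holds x[n-i]; each output costs len(a) MAC cycles."""
    window = [0] * len(a)
    y = []
    for sample in x:
        window = [sample] + window[:-1]       # shift the new sample into the delay line
        y.append(mac_output(a, window))
    return y

print(fir_serial([1, 2, 3, 4], [1, 1, 1, 1]))
```

The trade-off is area against throughput: a single multiplier is reused, so a 4-tap filter needs four MAC cycles for every output sample.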
13-22
FIR Filter Implementation
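Implementation of FIR filters with a large number of taps
Example: implementation of a 16-tap FIR filter
$$ y[k] = \sum_{i=0}^{15} a_i \cdot x[k-i] $$
[Figure: four multiplier-adder-register units whose outputs are combined by adders and registers (D). Operand pairs per unit:
  unit 1: (X_{k-12}, a_12), (X_{k-8}, a_8), (X_{k-4}, a_4), (X_k, a_0)
  unit 2: (X_{k-13}, a_13), (X_{k-9}, a_9), (X_{k-5}, a_5), (X_{k-1}, a_1)
  unit 3: (X_{k-14}, a_14), (X_{k-10}, a_10), (X_{k-6}, a_6), (X_{k-2}, a_2)
  unit 4: (X_{k-15}, a_15), (X_{k-11}, a_11), (X_{k-7}, a_7), (X_{k-3}, a_3)]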
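One way to read the 16-tap figure is that four units each accumulate every fourth tap and their partial sums are then added. The Python sketch below models that reading; the interpretation and the function name `fir16_interleaved` are assumptions, not something the slide states explicitly.

```python
def fir16_interleaved(a, window):
    """window[i] holds x[k-i]; unit u accumulates taps u, u+4, u+8, u+12."""
    partial = [0, 0, 0, 0]
    for cycle in range(4):                 # four accumulate cycles per output
        for unit in range(4):              # the four units operate in parallel in hardware
            i = unit + 4 * cycle
            partial[unit] += a[i] * window[i]
    return sum(partial)                    # final adders combine the partial sums

a = list(range(1, 17))
window = list(range(16))
print(fir16_interleaved(a, window) == sum(a[i] * window[i] for i in range(16)))
```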
13-23
IIR Filter Implementation
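Example:
$$ y[n] = b_0 \cdot x[n] + b_1 \cdot x[n-1] + a_1 \cdot y[n-1] + a_2 \cdot x[n-2] $$
Direct implementation:
[Figure: input x[n], three delay elements (D), four multipliers with coefficients a_1, a_2, b_0, b_1, three adders, output y[n]]
Clock frequency:
$$ f_{clk} \le \frac{1}{t_{mult} + 3 \cdot t_{adder}} $$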
13-24
IIR Filter Implementation
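Pipelined Implementation 1:
[Figure: IIR filter with input x[n], coefficients a_1, a_2, b_0, b_1, three adders, additional pipeline registers (D), and output y[n-3]]
Clock frequency (assume $t_{mult} > t_{adder}$):
$$ f_{clk} \le \frac{1}{t_{mult}} $$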
13-25
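IIR Filter Implementation
Pipelined Implementation 2:
[Figure: IIR filter with input x[n], coefficients a_1, a_2, b_0, b_1, three adders, three registers (D), and output y[n-1]]
Clock frequency:
$$ f_{clk} \le \frac{1}{t_{mult} + 2 \cdot t_{adder}} $$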
13-26
LUT-Based Multiplier
In many DSP circuits, one input of a multiplier is always a constant.
[Figure: x[n] multiplied by C_i (constant) gives y[n]]
For the above multiplier, y[n] depends purely on x[n]. Thus, a look-up table (LUT) can be used to implement the multiplier.
[Figure: x[n] drives the address input of the LUT; the LUT output is y[n]]
For example, a 256 x 16-bit memory can be used to implement an 8-bit multiplier if one of its inputs is always constant.
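The 256 x 16-bit example can be sketched in Python as a table with one entry per possible 8-bit input. The sketch assumes unsigned 8-bit operands, and the function names `build_lut` and `lut_multiply` are hypothetical.

```python
def build_lut(constant):
    """256-entry table; the entry at address x holds constant * x as a 16-bit word."""
    assert 0 <= constant < 256
    return [(constant * x) & 0xFFFF for x in range(256)]

def lut_multiply(lut, x):
    """The input sample acts as the memory address; no multiplier is needed."""
    return lut[x]

lut = build_lut(201)
print(lut_multiply(lut, 255))
```

Sixteen bits per entry are enough because the largest product of two unsigned 8-bit numbers is 255 x 255 = 65025, which is below 2^16.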
13-27
Distributed Arithmetic
Multiplication using the shift-and-add technique
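As a reminder of how shift-and-add multiplication works, here is a minimal Python sketch for unsigned operands. The function name and the 8-bit default width are assumptions for the example.

```python
def shift_add_multiply(multiplicand, multiplier, bits=8):
    """Add a shifted copy of the multiplicand for every 1 bit of the multiplier."""
    acc = 0
    for i in range(bits):
        if (multiplier >> i) & 1:          # examine multiplier bit i
            acc += multiplicand << i       # weight the partial product by 2**i
    return acc

print(shift_add_multiply(13, 11))
```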
13-28
Distributed Arithmetic
Calculate $A \cdot Y_0 + B \cdot Y_1 + C \cdot Y_2 + D \cdot Y_3$

13-29
Distributed Arithmetic
Serial Distributed Arithmetic for Computing $A \cdot Y_0 + B \cdot Y_1 + C \cdot Y_2 + D \cdot Y_3$

13-30
Distributed Arithmetic
LUT-Based SDA for Computing $A \cdot Y_0 + B \cdot Y_1 + C \cdot Y_2 + D \cdot Y_3$
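The idea behind LUT-based serial distributed arithmetic can be sketched in Python. The sketch assumes that A, B, C, D are constants and that Y_0 to Y_3 are unsigned inputs processed one bit per clock cycle, least significant bit first; the slide text does not state these details, and the function names are hypothetical.

```python
def build_da_lut(coeffs):
    """Entry addr holds the sum of the coefficients whose address bit is 1."""
    lut = []
    for addr in range(1 << len(coeffs)):
        lut.append(sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1))
    return lut

def sda_dot(coeffs, inputs, bits=8):
    """Compute sum(coeffs[k] * inputs[k]) with one LUT access per input bit."""
    lut = build_da_lut(coeffs)             # 16 entries for four coefficients
    acc = 0
    for bit in range(bits):                # one clock cycle per bit position
        addr = 0
        for k, value in enumerate(inputs):
            addr |= ((value >> bit) & 1) << k   # bit `bit` of every input forms the address
        acc += lut[addr] << bit            # shift-and-accumulate the LUT output
    return acc

print(sda_dot([3, 5, 7, 11], [10, 20, 30, 40]))
```

The four multipliers are replaced by one 16-entry table and a shift-accumulator, at the cost of one clock cycle per input bit.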

13-31
Distributed Arithmetic
LUT Technique for Distributed Arithmetic
13-32
Distributed Arithmetic
SDA 16-MAC Circuit
13-33
Distributed Arithmetic
SDA 16-Tap FIR Filter
13-34
Parallel Distributed Arithmetic
