DSD ch-5 Building Blocks


Uploaded by

hafsa

Basic building blocks design options
Introduction
• For highly parallel signal processing, FPGAs outperform
their traditional competing technology, digital signal
processors (DSPs).
• No matter how many multiply-accumulate (MAC) units a DSP
vendor places on a chip, it cannot compete with the
hundreds of such units available on a high-end FPGA
device.
• Modern FPGAs come with embedded processors, standard
interfaces, and signal processing building blocks
consisting of multipliers, adders, registers, and
multiplexers.
• The number and variety of these building blocks on
FPGAs are expected to keep increasing.
Dedicated computational blocks
• 18x18 multiplier in Virtex-II, Virtex-II Pro, and
Spartan-3 FPGAs
• 8x8 multiplier and 16-bit adder in QuickLogic
FPGAs
• 18x18 multiplier and adder in Altera FPGAs
• DSP48 blocks in Xilinx 7 series FPGAs
18x18 multiplier
18x18 multiplier and adder in Altera
DSP48 blocks in Xilinx
Embedded processors
• FPGA vendors are also incorporating programmable
processor cores and numerous high-speed interfaces.
• High-end Xilinx FPGAs embed hard IP cores such as
PowerPC or ARM Cortex, or use the MicroBlaze soft IP
core, along with standard interfaces such as PCI
Express and Gigabit Ethernet.
FPGA embedded processors and other interfaces
Instantiation of Embedded Blocks
• Example: a second order infinite impulse
response (IIR) filter in Direct Form (DF) II
realization
RTL Code
Code
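As a behavioral reference for the second-order Direct Form II filter named in the example, the recurrence can be sketched in Python. This is a minimal model, not the RTL from the slides: the coefficient names b0, b1, b2, a1, a2 and the convention a0 = 1 are assumptions, and floating point stands in for whatever fixed-point format the hardware uses.

```python
def iir_df2(x, b, a):
    """Second-order IIR filter in Direct Form II.

    b = (b0, b1, b2) are feed-forward coefficients, a = (a1, a2) are
    feedback coefficients (a0 is assumed to be 1). These names are
    illustrative and not taken from the slides.
    """
    b0, b1, b2 = b
    a1, a2 = a
    w1 = w2 = 0.0              # the two shared delay registers of DF-II
    y = []
    for xn in x:
        w0 = xn - a1 * w1 - a2 * w2          # feedback (recursive) part
        y.append(b0 * w0 + b1 * w1 + b2 * w2)  # feed-forward part
        w2, w1 = w1, w0                      # shift the delay line
    return y
```

Each output sample needs five multiplications and four additions, which is what the synthesis tool maps onto the dedicated multipliers or MAC blocks.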
Spartan-3 architecture
• Multiplication is implemented using 18x18
dedicated multipliers
Device utilization summary
Spartan-3
Virtex-4 architecture
• The multiplication and addition operations are
mapped on DSP48 multiply accumulate (MAC)
embedded blocks.
Device utilization summary
Virtex-4
Design optimization by pipelining
• Example-2: 8 tap FIR filter
RTL code: FIR
Schematic Spartan-3
Device utilization
Design optimization
• Introducing pipeline registers
RTL code
Synthesis report
• The pipelined design is 9 times faster
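The effect of pipeline registers on the data path can be sketched with a cycle-level Python model of the FIR filter. The register placement here (one register after each multiplier and one after the adder tree) is an assumption for illustration; the slides' RTL is not reproduced in the text. The model shows only that pipelining preserves the output sequence and adds latency. It does not model clock frequency.

```python
def fir_direct(x, h):
    """Unpipelined FIR: multipliers and adders form one combinational path."""
    taps = [0] * len(h)
    out = []
    for xn in x:
        taps = [xn] + taps[:-1]                       # tapped delay line
        out.append(sum(hk * tk for hk, tk in zip(h, taps)))
    return out


def fir_pipelined(x, h):
    """FIR with a register after each multiplier and one after the adder tree.

    The register placement is an assumed example. Output is delayed by
    two clock cycles relative to fir_direct.
    """
    taps = [0] * len(h)
    prod_reg = [0] * len(h)    # pipeline stage 1: registered products
    sum_reg = 0                # pipeline stage 2: registered sum
    out = []
    for xn in x:
        out.append(sum_reg)    # output reflects what was registered earlier
        sum_reg = sum(prod_reg)                       # uses old products
        taps = [xn] + taps[:-1]
        prod_reg = [hk * tk for hk, tk in zip(h, taps)]
    return out
```

With registers breaking the multiply-add chain, the longest combinational path shrinks to roughly one multiplier or one adder stage, which is where the clock-rate gain reported by synthesis comes from.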
Basic Building Blocks architecture
• After the foregoing discussion of the use of
dedicated multipliers and MAC blocks, it is
pertinent to look at the architectures for the
basic building blocks
• Several architectural options are available for
selecting an appropriate HW block for
operations like addition, multiplication and
shifting
– Parallel adders
– Barrel shifters
– Parallel multipliers
Adders
• Adders are used in addition, subtraction,
multiplication, and division.
• The speed of any digital design of a signal
processing or communication system depends
heavily on these functional units.
• The ripple carry adder (RCA) is the slowest member
of the adder family.
• To overcome the slow carry propagation, fast
adders are designed. These speed up carry
generation and propagation.
Fast adders
• Carry look ahead adder
• Conditional sum adder
• Carry select adder
• Hierarchical Carry select adder
Single bit full adder/ Gate level options
Ripple carry adder
• Slowest adder due to carry propagation delay
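A bit-level Python sketch of the ripple carry adder makes the serial carry dependency visible: each full adder must wait for the carry out of the previous one. The function names are illustrative.

```python
def full_adder(a, b, cin):
    """Single-bit full adder built from gate-level operations."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def ripple_carry_add(a, b, width, cin=0):
    """Add two width-bit numbers by chaining full adders bit by bit."""
    carry = cin
    result = 0
    for i in range(width):
        # Bit i cannot be resolved until the carry from bit i-1 is known.
        s, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        result |= s << i
    return result, carry
```

The loop body runs once per bit, so the worst-case delay grows linearly with the word length.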
Logic placement in CLBs
Carry look ahead adder
• A simple consideration of full adder logic
identifies that a carry c(i+1) is generated if
a(i) = b(i) = 1, and a carry is propagated if
either a(i) or b(i) is 1. This can be written as:
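The equations from the slide are not present in the text. In the standard form consistent with the description above, with g_i as the generate term and p_i as the propagate term (symbol names assumed), they read:

```latex
g_i = a_i \, b_i, \qquad
p_i = a_i + b_i, \qquad
c_{i+1} = g_i + p_i \, c_i
```

Here the product denotes logical AND and the sum denotes logical OR.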
Carry look ahead adder
CLA logic
Grouping of CLAs
• Industrial practice is to use 4-bit-wide blocks.
Each block computes carries only up to c3; c4 is not
computed inside the block. Instead, the first four
terms of c4 are grouped as G0, and the product
p3p2p1p0 in the last term is tagged as P0, as
given here:
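The grouping equation itself is missing from the text. Unrolling the carry recurrence c(i+1) = g(i) + p(i)c(i) four times gives the following expansion, which matches the description of G0 and P0 (AND written as a product, OR as a sum):

```latex
c_4 = \underbrace{g_3 + p_3 g_2 + p_3 p_2 g_1 + p_3 p_2 p_1 g_0}_{G_0}
      + \underbrace{p_3 p_2 p_1 p_0}_{P_0} \, c_0
    = G_0 + P_0 \, c_0
```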
Grouping of CLAs
• Similarly, bits 4 to 7 are also grouped together and
c5, c6 and c7 are computed in the first level of the
CLA block using c4 from the second level of CLA logic.
The first level CLA block for these bits also generates
G1 and P1.
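The two-level arrangement described above can be sketched in Python for an 8-bit adder made of two 4-bit blocks. Function names are illustrative, and the propagate term uses OR, following the description that a carry is propagated if either input bit is 1.

```python
def cla_block(a, b, cin):
    """First-level 4-bit CLA block: returns sum, group generate G, group propagate P."""
    g = [(a >> i) & (b >> i) & 1 for i in range(4)]
    p = [((a >> i) | (b >> i)) & 1 for i in range(4)]
    # Carries c1..c3 are computed directly from g, p and the block carry-in.
    c = [cin,
         g[0] | p[0] & cin,
         g[1] | p[1] & g[0] | p[1] & p[0] & cin,
         g[2] | p[2] & g[1] | p[2] & p[1] & g[0] | p[2] & p[1] & p[0] & cin]
    s = sum((((a >> i) ^ (b >> i) ^ c[i]) & 1) << i for i in range(4))
    G = g[3] | p[3] & g[2] | p[3] & p[2] & g[1] | p[3] & p[2] & p[1] & g[0]
    P = p[3] & p[2] & p[1] & p[0]
    return s, G, P


def cla_add8(a, b, c0=0):
    """8-bit adder from two first-level blocks plus second-level carry logic."""
    s_lo, G0, P0 = cla_block(a & 0xF, b & 0xF, c0)
    c4 = G0 | P0 & c0                       # second level supplies c4
    s_hi, G1, P1 = cla_block((a >> 4) & 0xF, (b >> 4) & 0xF, c4)
    c8 = G1 | P1 & G0 | P1 & P0 & c0        # second level supplies c8
    return s_lo | (s_hi << 4), c8
```

G and P depend only on the operand bits, so the second level can produce c4 and c8 without waiting for carries to ripple through the blocks.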
A 16-bit carry look-ahead adder using two levels of CLA logic
A 64-bit carry look-ahead adder using three levels of CLA logic
Conditional sum adder
16-bit uniform groups carry select adder
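A minimal Python sketch of a 16-bit carry select adder with uniform 4-bit groups follows. Each group computes its sum for both possible carry-in values, and the actual carry from the previous group only drives a selection. The group adder is modeled with integer addition for brevity; function names are illustrative.

```python
def add_group(a, b, cin, width=4):
    """Add one group; returns (sum, carry_out)."""
    total = a + b + cin
    return total & ((1 << width) - 1), total >> width


def carry_select_add16(a, b, cin=0):
    """16-bit carry select adder built from four uniform 4-bit groups."""
    result = 0
    carry = cin
    for g in range(4):
        ag = (a >> (4 * g)) & 0xF
        bg = (b >> (4 * g)) & 0xF
        s0, c0 = add_group(ag, bg, 0)   # precomputed assuming carry-in 0
        s1, c1 = add_group(ag, bg, 1)   # precomputed assuming carry-in 1
        # The incoming carry acts as a multiplexer select.
        s, carry = (s1, c1) if carry else (s0, c0)
        result |= s << (4 * g)
    return result, carry
```

In hardware both candidate sums of every group are computed in parallel, so the critical path is one group adder followed by a chain of multiplexers.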
Hierarchical carry select adder
Barrel Shifter
• The circuit should support
– Logical right shift: x>>S
– Logical left shift: x<<S
– Arithmetic right shift: x>>>S
– Arithmetic left shift: x<<<S
8-bit arithmetic shift
8-bit logical and arithmetic shift
Multi-stage barrel shifter
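The four shift operations listed above, realized as a multi-stage shifter, can be modeled in Python for an 8-bit word. This is a sketch under assumed conventions: three stages shift by 4, 2, and 1 positions, each enabled by one bit of the shift amount S, and the parameter names are illustrative. Arithmetic left shift produces the same bits as logical left shift.

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1


def barrel_shift(x, s, right=False, arithmetic=False):
    """Shift 8-bit x by s (0..7) using three mux stages (shift by 4, 2, 1)."""
    x &= MASK
    # Only an arithmetic right shift replicates the sign bit.
    fill = (x >> (WIDTH - 1)) & 1 if (right and arithmetic) else 0
    for stage in (2, 1, 0):
        if (s >> stage) & 1:              # this stage's mux selects "shifted"
            k = 1 << stage
            if right:
                fill_bits = (MASK << (WIDTH - k)) & MASK if fill else 0
                x = (x >> k) | fill_bits
            else:
                x = (x << k) & MASK
    return x
```

For a W-bit word this needs log2(W) stages of multiplexers instead of one W-to-1 multiplexer per output bit.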
Parallel Multipliers
Carry Save Addition
• Carry save addition, while reducing three operands
to two, does not propagate carries; rather, each
carry is saved to the next significant bit position.
Three operands are thus reduced to two without
carry propagation delay.
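At word level, carry save addition is a bank of independent full adders, one per bit position. A minimal Python sketch (function name illustrative):

```python
def carry_save_add(x, y, z):
    """Reduce three operands to a sum word and a carry word (3:2 compression)."""
    s = x ^ y ^ z                                # per-bit sum, no carry chain
    c = ((x & y) | (y & z) | (x & z)) << 1       # per-bit carry, saved one position up
    return s, c
```

The two outputs satisfy s + c = x + y + z, so only the final addition of s and c needs a carry propagate adder.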
Dot notation
• Dot notation facilitates description of different
reduction schemes
• Dots are used to represent each bit of the
partial product
Parallel multiplier circuits
• A CSA is one of the fundamental building
blocks of most parallel multiplier
architectures. The partial products are first
reduced to two numbers using a CSA tree.
These two numbers are then added to get the
final product.
Three components of a multiplier
Partial Product Generation
• Partial products PP[i] are generated by ANDing each
bit a(i) of the multiplier with all the bits of the
multiplicand b
• Each PP[i] is shifted to the left by i bit positions
PPs generation code
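The slide's code listing is not present in the text. A minimal Python sketch of the partial product generation described above, for unsigned operands, might look like this (function and parameter names are illustrative):

```python
def partial_products(a, b, n):
    """Generate the n partial products of an unsigned n x n multiplication.

    a is the multiplier, b the multiplicand. PP[i] is b ANDed with bit
    a(i), shifted left by i positions.
    """
    mask = (1 << n) - 1
    return [((b & mask) if (a >> i) & 1 else 0) << i for i in range(n)]
```

Summing the list gives the product, for example `sum(partial_products(13, 11, 4))` equals 13 * 11.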
Partial Product Reduction
• For a general N1xN2 multiplier, the following
four techniques are generally used to reduce
N1 layers of the partial products to two layers
for their final addition using any CPA:
– carry save reduction
– dual carry save reduction
– Wallace tree reduction
– Dadda tree reduction.
Carry Save Reduction
• The first three layers of the PPs are reduced to two
layers using carry save addition (CSA).
• Isolated bits in a column of the three layers are simply
dropped down to the same column in the next level.
• Columns with two bits are reduced to two bits using
half adders, and columns with three bits are
reduced to two bits using full adders.
• Once the first three PPs are reduced to two layers,
the fourth partial product is grouped with them to
make a new group of three layers.
• The process is repeated until two layers are left,
which are added using a carry propagate adder (CPA).
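The steps above can be sketched at word level in Python. This model treats each layer as an integer, so the per-column choice between dropping a bit, a half adder, and a full adder is implicit in the bitwise operations; function names are illustrative.

```python
def carry_save_add(x, y, z):
    """3:2 compression of three layers into a sum layer and a carry layer."""
    return x ^ y ^ z, ((x & y) | (y & z) | (x & z)) << 1


def carry_save_reduce(layers):
    """Reduce layers to two, folding in one new layer per CSA level.

    Returns (remaining_layers, number_of_csa_levels).
    """
    layers = list(layers)
    levels = 0
    while len(layers) > 2:
        s, c = carry_save_add(layers[0], layers[1], layers[2])
        layers = [s, c] + layers[3:]     # next PP joins the two reduced layers
        levels += 1
    return layers, levels
```

For N partial products this scheme needs N - 2 CSA levels, so a 6x6 multiplier goes through four levels before the final CPA.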
12x12 multiplier PPs reduction
Carry save reduction scheme layout for a 6x6 multiplier
• Level 0
• Level 1
Tree diagram 6x6
Tree diagram
Dual Carry Save Reduction
• The partial products are divided into two
equal-size groups
• The carry save reduction scheme is applied to
both groups simultaneously
• This results in two partial product layers in
each group
• The four layers are then reduced using carry
save reduction
• The last two layers are added using any CPA
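A word-level Python sketch of the dual carry save scheme follows. The level count assumes the two groups are reduced in parallel, so the cost of the first phase is that of the deeper group. Function names are illustrative.

```python
def carry_save_add(x, y, z):
    """3:2 compression of three layers into a sum layer and a carry layer."""
    return x ^ y ^ z, ((x & y) | (y & z) | (x & z)) << 1


def carry_save_reduce(layers):
    """Sequential carry save reduction; returns (two_layers, levels)."""
    layers = list(layers)
    levels = 0
    while len(layers) > 2:
        s, c = carry_save_add(layers[0], layers[1], layers[2])
        layers = [s, c] + layers[3:]
        levels += 1
    return layers, levels


def dual_carry_save_reduce(pps):
    """Split the PPs into two halves, reduce both, then merge the four layers."""
    half = len(pps) // 2
    group0, levels0 = carry_save_reduce(pps[:half])
    group1, levels1 = carry_save_reduce(pps[half:])
    final, merge_levels = carry_save_reduce(group0 + group1)
    # The two groups work simultaneously, so only the deeper one counts.
    return final, max(levels0, levels1) + merge_levels
```

For eight partial products this gives four CSA levels, against six for the single carry save reduction scheme.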
Tree diagram 8x8
Tree diagram
Wallace Tree Multipliers
• One of the most commonly used multiplier
architectures
• The number of adder levels increases
logarithmically as the number of partial products
increases
Wallace Tree Multipliers
• Form groups of three rows and apply CSA reduction
to all groups in parallel
• Each CSA layer produces two rows
• Repeat the above two steps until two rows are
left
• The final two rows are added together using a CPA
to obtain the final product
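The Wallace reduction loop described above can be sketched at word level in Python. Rows are modeled as integers and rows left over after grouping pass unchanged to the next level; this is a simplified model that does not track individual dots or half adders. Function names are illustrative.

```python
def carry_save_add(x, y, z):
    """3:2 compression of three rows into a sum row and a carry row."""
    return x ^ y ^ z, ((x & y) | (y & z) | (x & z)) << 1


def wallace_reduce(rows):
    """Reduce rows to two using parallel CSA levels.

    Returns (remaining_rows, number_of_levels).
    """
    rows = list(rows)
    levels = 0
    while len(rows) > 2:
        grouped = len(rows) - len(rows) % 3
        nxt = []
        for i in range(0, grouped, 3):           # all groups work in parallel
            nxt.extend(carry_save_add(*rows[i:i + 3]))
        nxt.extend(rows[grouped:])               # leftover rows pass through
        rows = nxt
        levels += 1
    return rows, levels
```

For 12 partial products the row count goes 12, 8, 6, 4, 3, 2, which is five CSA levels compared with ten for sequential carry save reduction.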
12x12 multiplication using Wallace Reduction Tree
Wallace tree diagram 12x12
Wallace tree
Wallace Reduction layout for a 6x6 array of PPs
Adder delays in Wallace tree
A Decomposed Multiplier
• Four Multipliers of size NxN can be combined
to make a 2N x 2N multiplier
A 16x16 bit Multiplier decomposed into four 8x8 multipliers
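The decomposition follows from splitting each 16-bit operand into high and low bytes. A minimal Python sketch for unsigned operands (function names are illustrative, and `mul8` stands in for a dedicated 8x8 multiplier block):

```python
def mul8(a, b):
    """Model of an 8x8 unsigned multiplier block."""
    assert 0 <= a < 256 and 0 <= b < 256
    return a * b


def mul16_decomposed(a, b):
    """16x16 unsigned multiplication from four 8x8 multiplications."""
    a_lo, a_hi = a & 0xFF, (a >> 8) & 0xFF
    b_lo, b_hi = b & 0xFF, (b >> 8) & 0xFF
    # (a_hi*2^8 + a_lo) * (b_hi*2^8 + b_lo), expanded term by term
    return ((mul8(a_hi, b_hi) << 16)
            + ((mul8(a_hi, b_lo) + mul8(a_lo, b_hi)) << 8)
            + mul8(a_lo, b_lo))
```

In hardware the four products are aligned by their shifts and combined with a reduction tree and a final CPA.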
Two’s Complement Signed Multiplier
• 4 x 4-bit signed by signed multiplication
– The sign bits of the first three PPs are extended
– Two’s complement of the last PP is taken
– HW implementation results in additional logic
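The two steps listed above can be checked with a small Python model of 4 x 4-bit signed multiplication. It sign-extends each partial product to the 8-bit product width and takes the two's complement of the last one, since the MSB of a two's complement multiplier carries negative weight. Function names are illustrative.

```python
N = 4                    # operand width
W = 2 * N                # product width
MASK = (1 << W) - 1


def sign_extend(v, frm, to):
    """Sign-extend a frm-bit pattern to a to-bit pattern."""
    v &= (1 << frm) - 1
    if v >> (frm - 1):
        v |= ((1 << to) - 1) ^ ((1 << frm) - 1)
    return v


def signed_multiply(a, b):
    """Multiply 4-bit signed a (multiplicand) by 4-bit signed b (multiplier)."""
    a_bits = a & ((1 << N) - 1)
    b_bits = b & ((1 << N) - 1)
    total = 0
    for i in range(N):
        pp = sign_extend(a_bits if (b_bits >> i) & 1 else 0, N, W)
        if i == N - 1:
            pp = (~pp + 1) & MASK        # two's complement of the last PP
        total += pp << i
    total &= MASK
    # Interpret the 8-bit result as a signed number.
    return total - (1 << W) if total >> (W - 1) else total
```

The explicit sign extension and the two's complement step are the additional logic the slide refers to.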
The End

Q&A
Appendix
8x8 multiplier and 16 bit adder
Tree diagram
Sign-extension elimination
• Flip the sign bit, extend the number with all 1s and add a 1 at
the location of the sign bit
• Irrespective of the sign of the number, the technique makes
all the extended bits into 1s
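The identity behind this technique can be verified with a short Python sketch that compares it against ordinary sign extension. Here n is the original width and m the extended width; the function names are illustrative.

```python
def sign_extend_reference(x, n, m):
    """Ordinary sign extension of an n-bit pattern to m bits."""
    x &= (1 << n) - 1
    ext = ((1 << m) - 1) ^ ((1 << n) - 1)
    return (x | ext) if x >> (n - 1) else x


def sign_extend_by_elimination(x, n, m):
    """Flip the sign bit, extend with all 1s, add 1 at the sign-bit location."""
    x &= (1 << n) - 1
    flipped = x ^ (1 << (n - 1))
    ones_ext = ((1 << m) - 1) ^ ((1 << n) - 1)    # same 1s for any sign
    return ((flipped | ones_ext) + (1 << (n - 1))) & ((1 << m) - 1)
```

For a positive number the added 1 ripples through the extended 1s and clears them; for a negative number it restores the sign bit and leaves the 1s in place.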
Applied to multiplication
• First, the MSBs of all the PPs except the last one are
flipped, a 1 is added at each sign bit location, and
each number is extended by all 1s.
• For the last PP, the two's complement is computed by
flipping all the bits and adding 1 to the LSB position.
• The MSB of the last PP is flipped again and a 1 is added
at this bit location for sign extension.
• All these 1s are added to find a correction vector (CV).
• All the 1s are then removed and the CV is simply added;
it takes care of the sign extension logic.
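The procedure above can be sketched in Python for the 4x4 case. The correction vector is built by summing every constant 1 the steps introduce (the extended 1s, the 1 at each sign-bit location, and the 1 at the LSB of the last PP), truncated to the 8-bit product width. This is an illustrative model with assumed function names, not the slides' implementation.

```python
N = 4                    # operand width
W = 2 * N                # product width
MASK = (1 << W) - 1


def ones(lo, hi):
    """Mask with 1s in bit positions lo..hi inclusive (empty if lo > hi)."""
    return ((1 << (hi + 1)) - (1 << lo)) if hi >= lo else 0


def correction_vector():
    """Sum of all constant 1s introduced by sign-extension elimination."""
    cv = 0
    for i in range(N):
        sign_pos = i + N - 1
        cv += ones(sign_pos + 1, W - 1)   # the extended 1s of PP[i]
        cv += 1 << sign_pos               # the 1 added at its sign-bit location
    cv += 1 << (N - 1)                    # the 1 at the LSB of the last PP
    return cv & MASK


def signed_multiply_cv(a, b):
    """4x4 signed multiply using flipped bits plus one correction vector."""
    a_bits = a & ((1 << N) - 1)
    b_bits = b & ((1 << N) - 1)
    total = correction_vector()
    for i in range(N):
        pp = a_bits if (b_bits >> i) & 1 else 0
        if i == N - 1:
            pp = ~pp & ((1 << N) - 1)     # last PP: flip all bits
        pp ^= 1 << (N - 1)                # flip the MSB (again, for the last PP)
        total += pp << i
    total &= MASK
    return total - (1 << W) if total >> (W - 1) else total
```

Because the correction vector is a constant, it can be precomputed and fed into the reduction tree as one extra row, replacing all sign-extension logic.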
Correction vector
4x4 multiplication example
