Mirror Adder


ECE 124A

VLSI Principles
Lecture 16

Prof. Kaustav Banerjee


Electrical and Computer Engineering
E-mail: [email protected]

Lecture 16, ECE 124A, VLSI Principles Kaustav Banerjee


A Generic Digital Processor
[Figure: block diagram of a generic digital processor with MEMORY, CONTROL, DATAPATH and Input/Output blocks, linked by Interconnect]
Lecture 15: Sequential Circuit (Static, Dynamic Memory; Latch, Register; Timing, Stability)
Today: DATAPATH


DATAPATH
• The core of a digital processor
• Datapath consists of
  ◦ Logic Blocks
    – Combinational Logic Functions (AND, OR, XOR, ...)
  ◦ Arithmetic Blocks
    – Addition
    – Multiplication
    – Comparison
    – Shift


An Intel Microprocessor
[Figure: Intel Itanium® integer datapath, about 1000 um wide. Labeled blocks: 9-1, 5-1 and 2-1 muxes, CARRYGEN, SUMSEL, SUMGEN + LU, and REG, with an output to the cache. LU: Logical Unit]

Itanium has 6 integer execution units like this

Fetzer, Orton, ISSCC’02


Bit-Sliced Design
[Figure: bit-sliced datapath. Identical slices from Bit 0 to Bit 32, each containing Register, Adder, Shifter and Multiplexer, sit between DATA IN and DATA OUT and share the Control signals]

Design a single-bit datapath and repeat for all bits


Adders



Design an Adder
• Fundamental Arithmetic Building Block
• Performance
  ◦ Logic Level Optimization
    – Optimize Boolean Functions
    – Carry Lookahead
  ◦ Circuit Level Optimization
    – Transistor Sizing
• Power Consumption


Half-Adder Implementations
[Figure: half-adder symbol with inputs x, y and outputs s (sum), c (carry), together with gate-level implementations of the carry and sum outputs]


Full-Adder

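[Figure: full-adder symbol with inputs A, B, Cin and outputs Sum, Cout, and its construction from two half adders (HA)]

$S = A \oplus B \oplus C_{in} = A\,\overline{B}\,\overline{C_{in}} + \overline{A}\,B\,\overline{C_{in}} + \overline{A}\,\overline{B}\,C_{in} + A\,B\,C_{in}$

$C_{out} = AB + BC_{in} + AC_{in}$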
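As a quick check of the two-half-adder construction against the sum and carry equations, here is a minimal Python sketch. The names `half_adder` and `full_adder` are illustrative, and the OR gate that merges the two half-adder carries is the usual construction, assumed here because the slide only shows the two HA blocks.

```python
from itertools import product

def half_adder(x, y):
    """Return (sum, carry) of two bits."""
    return x ^ y, x & y

def full_adder(a, b, cin):
    """Full adder from two half adders; an OR merges the two carries (assumed)."""
    s1, c1 = half_adder(a, b)
    s, c2 = half_adder(s1, cin)
    return s, c1 | c2

# Compare against the sum-of-products equations for every input combination.
for a, b, cin in product((0, 1), repeat=3):
    s, cout = full_adder(a, b, cin)
    assert s == a ^ b ^ cin
    assert cout == (a & b) | (b & cin) | (a & cin)
```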



Express Sum and Carry
Truth Table for Full Adder

A  B  Ci | S  Co | status
0  0  0  | 0  0  | D
0  0  1  | 1  0  | D
0  1  0  | 1  0  | P
0  1  1  | 0  1  | P
1  0  0  | 1  0  | P
1  0  1  | 0  1  | P
1  1  0  | 0  1  | G
1  1  1  | 1  1  | G

Generate (G) = AB
Delete (D) = $\overline{A}\,\overline{B}$
Propagate (P) = A ⊕ B

Co = G + PCi
S = P ⊕ Ci
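The truth table can be regenerated from the G, P and D definitions. The sketch below is only an illustration; `status` and `full_adder_gp` are hypothetical helper names.

```python
from itertools import product

def status(a, b):
    """Classify a bit position as Generate, Propagate or Delete."""
    if a & b:
        return "G"
    if a ^ b:
        return "P"
    return "D"

def full_adder_gp(a, b, ci):
    """Return (S, Co) using S = P xor Ci and Co = G + P*Ci."""
    g, p = a & b, a ^ b
    return p ^ ci, g | (p & ci)

table = []
for a, b, ci in product((0, 1), repeat=3):
    s, co = full_adder_gp(a, b, ci)
    assert a + b + ci == 2 * co + s  # arithmetic meaning of (Co, S)
    table.append((a, b, ci, s, co, status(a, b)))

for row in table:
    print(*row)
```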



Complementary Static CMOS Full Adder
[Figure: transistor-level schematic of the complementary static CMOS full adder, with inputs A, B, Ci, internal node X, and outputs S, Co]

$S = A \oplus B \oplus C_i = ABC_i + \overline{C_o}\,(A + B + C_i)$
$C_o = AB + BC_i + AC_i$

28 Transistors
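The sum expression reuses the carry output, which is what lets the sum stage share logic with the carry stage. A small exhaustive check, with illustrative function names, confirms that this form agrees with the three-input XOR:

```python
from itertools import product

def carry(a, b, ci):
    """Co = AB + BCi + ACi."""
    return (a & b) | (b & ci) | (a & ci)

def sum_from_carry(a, b, ci):
    """S = A*B*Ci + not(Co)*(A + B + Ci), reusing the carry result."""
    not_co = 1 - carry(a, b, ci)
    return (a & b & ci) | (not_co & (a | b | ci))

def sum_reuse_is_valid():
    """True if the carry-reusing form matches A xor B xor Ci for all inputs."""
    return all(sum_from_carry(a, b, ci) == a ^ b ^ ci
               for a, b, ci in product((0, 1), repeat=3))
```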



The Ripple-Carry Adder

Worst-case delay is linear in the number of bits: td = O(N)

tadder = (N-1)tcarry + tsum

• Propagation delay is linearly proportional to N
• tcarry dominates the propagation delay
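A minimal behavioral sketch of the ripple-carry adder and its delay formula follows. Bit lists are LSB first; the helper names and the unit delays in the usage note are assumptions made for illustration.

```python
def to_bits(value, n):
    """Return the n low bits of value, LSB first."""
    return [(value >> i) & 1 for i in range(n)]

def from_bits(bits):
    """Inverse of to_bits."""
    return sum(bit << i for i, bit in enumerate(bits))

def ripple_carry_add(a_bits, b_bits, cin=0):
    """Add two equal-length bit lists; the carry ripples from bit 0 upward."""
    sums, carry = [], cin
    for a, b in zip(a_bits, b_bits):
        sums.append(a ^ b ^ carry)
        carry = (a & b) | (carry & (a ^ b))
    return sums, carry

def ripple_delay(n, t_carry, t_sum):
    """tadder = (N - 1) * tcarry + tsum."""
    return (n - 1) * t_carry + t_sum
```

For example, `ripple_carry_add(to_bits(11, 4), to_bits(6, 4))` gives sum bits for 1 with a carry out of 1 (11 + 6 = 17), and `ripple_delay(16, 1, 1)` evaluates to 16 unit delays.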



Inverting Property

$\overline{S}(A, B, C_i) = S(\overline{A}, \overline{B}, \overline{C_i})$
$\overline{C_o}(A, B, C_i) = C_o(\overline{A}, \overline{B}, \overline{C_i})$

Inverting all inputs to an FA results in inverted values for all outputs
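The inverting property is easy to confirm by enumeration. The sketch below uses an illustrative `full_adder` function and checks all eight input combinations.

```python
from itertools import product

def full_adder(a, b, ci):
    """Return (S, Co) for one full-adder cell."""
    return a ^ b ^ ci, (a & b) | (b & ci) | (a & ci)

def inverting_property_holds():
    """True if inverting every input inverts both outputs."""
    for a, b, ci in product((0, 1), repeat=3):
        s, co = full_adder(a, b, ci)
        s_inv, co_inv = full_adder(1 - a, 1 - b, 1 - ci)
        if (s_inv, co_inv) != (1 - s, 1 - co):
            return False
    return True
```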



Minimize Critical Path by Reducing Inverting Stages

[Figure: 4-bit ripple-carry adder built from alternating even and odd FA cells, with inputs A0 B0 to A3 B3, carries Ci,0 and Co,0 to Co,3, and sums S0 to S3]

Exploit Inversion Property


Mirror Adder (1)
[Figure: mirror adder schematic with the carry stage highlighted, shown next to the truth table for the full adder (columns A, B, Ci, S, Co, status)]


Mirror Adder (2)
[Figure: mirror adder schematic with the sum stage highlighted, shown next to the truth table for the full adder (columns A, B, Ci, S, Co, status)]


Mirror Adder Implementation
[Figure: stick diagram of the mirror adder between the VDD and GND rails, with gate signals A, B, Ci and Co]


Mirror Adder Summary
• 24 Transistors
• The NMOS and PMOS chains are completely symmetrical
• A maximum of 2 series transistors in the carry-generation circuitry
• The transistors connected to Ci should be closest to the output
• Carry-stage transistors have to be optimized for speed
• Sum-stage transistors can be optimized for area
• The most critical issue is to minimize the capacitance at Co
• The capacitance at Co is composed of 4 diffusion capacitances, 2 internal gate capacitances, and 6 gate capacitances in the connecting adder cell


Transmission Gate Full Adder
[Figure: transmission gate full adder schematic with Setup, Sum Generation and Carry Generation sections; signals A, B, P, Ci, S and Co]

This implementation has similar sum and carry output delays


Manchester Carry-Chain
Manchester Carry Gates
[Figure: two Manchester carry gate schematics, a static implementation (signals Pi, Gi, Di, Ci, Co) and a dynamic implementation (signals Pi, Gi, Ci, Co, clock φ)]


4-Bit Manchester Carry-Chain
[Figure: 4-bit Manchester carry chain clocked by φ, with carry input Ci,0, propagate signals P0 to P3, generate signals G0 to G3, and carry nodes C0 to C3]


Manchester Carry-Chain Implementation
[Figure: stick diagram of the Manchester carry chain between VDD and GND, with a Propagate/Generate Row and an Inverter/Sum Row; gate signals Pi, Gi, φ, Pi+1, Gi+1, φ and carry nodes Ci-1, Ci, Ci+1]


Design for Long Word Length

• Propagation delay of a chain-style design is quadratic in the number of bits
• A chain-style design is NOT practical for long word lengths (e.g. 32 bits)


Carry-Bypass Adder
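(also called Carry-Skip Adder)

[Figure: two 4-bit adder blocks. Top: a ripple chain of four FA cells with inputs P0 G0 to P3 G3 and carries Ci,0, Co,0 to Co,3. Bottom: the same chain followed by a Multiplexer that drives Co,3, controlled by BP = P0 P1 P2 P3]

Idea: if (P0 and P1 and P2 and P3) = 1, then Co,3 = Ci,0; otherwise the block itself "kills" or "generates" the carry.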
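The bypass changes only the timing, never the result: when every Pi is 1, the rippled carry already equals the incoming carry. A minimal sketch of one block, with hypothetical names and LSB-first bit tuples, checks this exhaustively:

```python
from itertools import product

def block_carry_out(a_bits, b_bits, cin):
    """Carry out of one block with a bypass multiplexer. Returns (Co, BP)."""
    p = [a ^ b for a, b in zip(a_bits, b_bits)]
    g = [a & b for a, b in zip(a_bits, b_bits)]
    rippled = cin
    for pi, gi in zip(p, g):
        rippled = gi | (pi & rippled)    # slow path through every stage
    bp = int(all(p))                     # BP = P0 P1 P2 P3
    return (cin if bp else rippled), bp  # multiplexer selects the fast path

def bypass_matches_ripple(m=4):
    """Exhaustively confirm the bypassed carry equals the arithmetic carry."""
    for bits in product((0, 1), repeat=2 * m + 1):
        a_bits, b_bits, cin = bits[:m], bits[m:2 * m], bits[-1]
        a = sum(bit << i for i, bit in enumerate(a_bits))
        b = sum(bit << i for i, bit in enumerate(b_bits))
        co, _ = block_carry_out(a_bits, b_bits, cin)
        if co != (a + b + cin) >> m:
            return False
    return True
```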



16-bit Carry-Bypass Adder
[Figure: 16-bit carry-bypass adder split into four blocks (Bit 0–3, Bit 4–7, Bit 8–11, Bit 12–15), each with Setup, Carry propagation and Sum sections; the critical path is annotated with tsetup, tbypass and tsum]

N = 16, M = 4 (M bits per block)

tadder = tsetup + M tcarry + (N/M - 1) tbypass + (M-1) tcarry + tsum
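To get a feel for the formula, the sketch below evaluates it numerically. The function name is hypothetical and the unit delays are an assumption for illustration, not values from the lecture.

```python
def bypass_adder_delay(n, m, t_setup, t_carry, t_bypass, t_sum):
    """tadder = tsetup + M*tcarry + (N/M - 1)*tbypass + (M - 1)*tcarry + tsum."""
    if n % m:
        raise ValueError("n must be a multiple of m")
    return (t_setup + m * t_carry + (n // m - 1) * t_bypass
            + (m - 1) * t_carry + t_sum)

# Illustrative unit delays (assumed): every component delay equals 1.
UNIT = dict(t_setup=1, t_carry=1, t_bypass=1, t_sum=1)
print(bypass_adder_delay(16, 4, **UNIT))  # 1 + 4 + 3 + 3 + 1 = 12
print(bypass_adder_delay(32, 4, **UNIT))  # 1 + 4 + 7 + 3 + 1 = 16
```

Doubling N from 16 to 32 adds only four bypass delays in this setting, because only the (N/M - 1) tbypass term grows with N.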



Carry Ripple versus Carry Bypass
[Plot: propagation delay tp versus N for the ripple adder and the bypass adder; the curves cross at about N = 4 to 8, depending on technology]

For smaller N, the bypass adder is not preferred due to the overhead of the bypass multiplexer


Linear Carry-Select Adder
[Figure: one block of a linear carry-select adder. Setup produces P,G; a "0" Carry Propagation chain and a "1" Carry Propagation chain run side by side; a Multiplexer driven by Co,k-1 selects the Carry Vector and Co,k+3; Sum Generation follows]

Both situations are evaluated

~30% hardware overhead


16-bit Linear Carry-Select Adder
[Figure: 16-bit linear carry-select adder with four blocks (Bit 0–3, Bit 4–7, Bit 8–11, Bit 12–15). Each block has Setup, a 0-Carry chain, a 1-Carry chain, a Multiplexer and Sum Generation; the carry runs from Ci,0 through Co,3, Co,7 and Co,11 to Co,15, and the sums are S0–3, S4–7, S8–11, S12–15]

N = 16, M = 4

tadder = tsetup + M tcarry + (N/M) tmux + tsum
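A behavioral sketch of the carry-select idea is given below. Each block computes its result for both possible carry-in values, and the incoming carry then selects one of them. The block arithmetic uses Python integer addition as a stand-in for the two carry chains; names and defaults are assumptions.

```python
def carry_select_add(a, b, n=16, m=4, cin=0):
    """Behavioral model of an n-bit linear carry-select adder (n a multiple of m)."""
    mask = (1 << m) - 1
    result, carry = 0, cin
    for lo in range(0, n, m):
        a_blk = (a >> lo) & mask
        b_blk = (b >> lo) & mask
        # Both situations are evaluated (in hardware, at the same time).
        with_0 = a_blk + b_blk
        with_1 = a_blk + b_blk + 1
        chosen = with_1 if carry else with_0  # multiplexer driven by incoming carry
        result |= (chosen & mask) << lo
        carry = chosen >> m
    return result, carry

def carry_select_delay(n, m, t_setup, t_carry, t_mux, t_sum):
    """tadder = tsetup + M*tcarry + (N/M)*tmux + tsum."""
    return t_setup + m * t_carry + (n // m) * t_mux + t_sum
```

With all component delays set to 1 (an illustrative assumption), `carry_select_delay(16, 4, 1, 1, 1, 1)` gives 10.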



Square-Root Carry-Select Adder
[Figure: square-root carry-select adder with blocks of increasing width (Bit 0-1, Bit 2-4, Bit 5-8, Bit 9-13, Bit 14-19), each with Setup, "0" Carry, "1" Carry, Multiplexer and Sum Generation; arrival times along the critical path are annotated (1) to (9); sums are S0-1, S2-4, S5-8, S9-13, S14-19]

N bits, P stages, first stage has M bits

For example, with M = 2:

$N = 2 + 3 + \dots + (P + 1) \cong \frac{P^2}{2}, \qquad P \cong \sqrt{2N}$

tadder = tsetup + M tcarry + P tmux + tsum
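The stage widths and the resulting delay can be tabulated with a short sketch. The function names are hypothetical and the unit delays in the usage note are an assumption.

```python
def sqrt_select_stages(n, m=2):
    """Stage widths m, m+1, m+2, ... until at least n bits are covered.

    The last stage may overshoot n when n is not an exact sum of such widths.
    """
    widths, covered, width = [], 0, m
    while covered < n:
        widths.append(width)
        covered += width
        width += 1
    return widths

def sqrt_select_delay(n, m, t_setup, t_carry, t_mux, t_sum):
    """tadder = tsetup + M*tcarry + P*tmux + tsum, with P the number of stages."""
    p = len(sqrt_select_stages(n, m))
    return t_setup + m * t_carry + p * t_mux + t_sum
```

For N = 20 and M = 2 this yields the widths 2, 3, 4, 5, 6 (P = 5), matching the bit ranges in the figure, and a delay of 9 when every component delay is 1.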



Adder Delays - Comparison
[Plot: tp (in unit delays, 0 to 50) versus N (0 to 60) for the ripple adder, the linear select adder and the square root select adder; the ripple adder curve is highest and the square root select curve lowest]


Look-Ahead - Basic Idea
[Figure: N-bit adder with inputs A0,B0 to AN-1,BN-1, carries Ci,0 to Ci,N-1, propagate signals P0 to PN-1, and sums S0 to SN-1]

Expanding the look-ahead equations:

Co,k = Gk + Pk Co,k-1
Co,k = Gk + Pk (Gk-1 + Pk-1 Co,k-2)

All the way:

Co,k = Gk + Pk (Gk-1 + Pk-1 (… + P1 (G0 + P0 Ci,0)))


Carry Determination
Co,0=G0+P0Ci,0

Co,1=G1+P1Co,0=G1+P1(G0+P0Ci,0)=G1+P1G0+P1P0Ci,0

Co,2=G2+P2G1+P2P1G0+P2P1P0Ci,0

Co,3=G3+P3G2+P3P2G1+P3P2P1G0+P3P2P1P0Ci,0
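The expanded equations can be checked against the bit-by-bit recurrence Co,k = Gk + Pk Co,k-1. The sketch below builds each carry from its product terms; function names are illustrative and index 0 is the LSB.

```python
from itertools import product

def lookahead_carries(g, p, ci0):
    """Return [Co,0, ..., Co,n-1] from the fully expanded equations."""
    carries = []
    for k in range(len(g)):
        co = 0
        # Terms Gj * P(j+1) * ... * Pk for j = k down to 0.
        for j in range(k, -1, -1):
            term = g[j]
            for i in range(j + 1, k + 1):
                term &= p[i]
            co |= term
        # Final term Pk * ... * P0 * Ci,0.
        tail = ci0
        for i in range(k + 1):
            tail &= p[i]
        carries.append(co | tail)
    return carries

def ripple_carries(g, p, ci0):
    """Reference: Co,k = Gk + Pk * Co,k-1 evaluated one bit at a time."""
    carries, c = [], ci0
    for gk, pk in zip(g, p):
        c = gk | (pk & c)
        carries.append(c)
    return carries

def expansions_agree(n=4):
    """Compare both forms for every combination of G, P and Ci,0."""
    return all(
        lookahead_carries(v[:n], v[n:2 * n], v[-1])
        == ripple_carries(v[:n], v[n:2 * n], v[-1])
        for v in product((0, 1), repeat=2 * n + 1)
    )
```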



4-bit Look-Ahead
[Figure: 4-bit look-ahead carry gate for Co,3 with inputs G0 to G3, P0 to P3 and Ci,0]

Large stack, poor performance


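Hierarchical Decomposition
Co,0 = G0 + P0 Ci,0
Co,1 = G1 + P1 Co,0
Co,2 = G2 + P2 Co,1
Co,3 = G3 + P3 Co,2

[Figure: computing F from inputs A0 to A7 as a linear chain, tp ∼ N, versus a balanced tree, tp ∼ log2(N)]

Combining two adjacent groups (g”,p”) and (g’,p’) into (g,p):
g = g” + g’p”
p = p’p”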
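The (g,p) combining rule is associative, which is what allows the carries to be computed in a tree. The sketch below applies it in a Kogge-Stone arrangement for a 16-bit addition. It is a behavioral illustration with carry-in fixed at 0; `combine` and `kogge_stone_add` are hypothetical names.

```python
def combine(hi, lo):
    """Merge the more significant group (g'', p'') with the less significant (g', p')."""
    g_hi, p_hi = hi
    g_lo, p_lo = lo
    return g_hi | (g_lo & p_hi), p_hi & p_lo

def kogge_stone_add(a, b, n=16):
    """Return (sum, carry_out, levels) for an n-bit add with carry-in 0."""
    gp = [(((a >> i) & (b >> i)) & 1, ((a >> i) ^ (b >> i)) & 1) for i in range(n)]
    p_bits = [p for _, p in gp]
    dist, levels = 1, 0
    while dist < n:
        # Every position merges with the group 'dist' bits below it.
        gp = [combine(gp[i], gp[i - dist]) if i >= dist else gp[i] for i in range(n)]
        dist *= 2
        levels += 1
    # After the tree, gp[i][0] is the carry out of bit i.
    carry_in = [0] + [g for g, _ in gp[:-1]]
    total = sum((p_bits[i] ^ carry_in[i]) << i for i in range(n))
    return total, gp[-1][0], levels
```

For n = 16 the loop runs 4 times, in line with tp ∼ log2(N).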



Logarithmic Look-Ahead Adder
[Figure: 16-bit Kogge-Stone tree with inputs (A0, B0) to (A15, B15) and outputs S0 to S15]
tp ∝ log2 N

Multipliers



The Binary Multiplication

        1 0 1 0 1 0    Multiplicand
  x         1 0 1 1    Multiplier
  -----------------
        1 0 1 0 1 0
      1 0 1 0 1 0      Partial products
    0 0 0 0 0 0
+ 1 0 1 0 1 0
  -----------------
  1 1 1 0 0 1 1 1 0    Result
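The same shift-and-add procedure can be written as a short sketch for unsigned operands. The function name is illustrative.

```python
def shift_and_add(multiplicand, multiplier):
    """Return (partial_products, result) for unsigned shift-and-add multiplication."""
    partial_products = []
    shift = 0
    while multiplier >> shift:
        bit = (multiplier >> shift) & 1
        # One partial product per multiplier bit: the shifted multiplicand or zero.
        partial_products.append((multiplicand << shift) if bit else 0)
        shift += 1
    return partial_products, sum(partial_products)

pps, result = shift_and_add(0b101010, 0b1011)
print([bin(pp) for pp in pps])
print(bin(result))  # 0b111001110
```

Here 101010 (42) times 1011 (11) gives 111001110 (462), with the third partial product equal to zero as in the worked example.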



Reduce Partial Products

• The number of partial products can be reduced by a multiplier transformation (Booth’s Recoding)
• Reducing the number of partial products is equivalent to reducing the number of additions
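The slide names Booth's recoding without fixing a variant. As an illustration only, the sketch below assumes the common radix-4 (modified Booth) form, which recodes an n-bit two's-complement multiplier into n/2 digits in {-2, ..., 2}, so half as many partial products are needed. The function names are hypothetical.

```python
def booth_radix4_digits(y, n):
    """Recode the n low bits of y (two's complement, n even) into n/2 digits, LSB first."""
    if n % 2:
        raise ValueError("n must be even")
    digits, prev = [], 0  # prev is the bit just below the current pair
    for i in range(0, n, 2):
        b0 = (y >> i) & 1
        b1 = (y >> (i + 1)) & 1
        digits.append(b0 + prev - 2 * b1)
        prev = b1
    return digits

def booth_multiply(x, y, n):
    """Sum one partial product per radix-4 digit of the multiplier y."""
    return sum(d * x * 4 ** k for k, d in enumerate(booth_radix4_digits(y, n)))
```

To treat the multiplier 1011 as the unsigned value 11, pad it to six bits: `booth_radix4_digits(0b001011, 6)` gives three digits instead of four partial products.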



4X4 Bit-Array Multiplier
[Figure: 4x4 array multiplier. Each row combines X3 X2 X1 X0 with one of Y0 to Y3; three rows of adders (HA FA FA HA, then FA FA FA HA twice) accumulate the partial products into outputs Z0 to Z7]


Critical Path
[Figure: the 4x4 array multiplier with the critical path marked; annotations tand, M and N-1 indicate the dimensions of the array]

For an MxN Bit-Array Multiplier:

tmultiplier = [(M-1) + (N-2)] tcarry + (N-1) tsum + tand


Carry-Save Multiplier
[Figure: 4x4 carry-save multiplier with adder rows (HA HA HA HA, then HA FA FA FA twice) followed by a final row (HA FA FA HA) that forms the Vector Merging Adder]

Carry is saved for the next adder stage

tmultiplier = (N-1) tcarry + tand + tmerge
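A word-level sketch of the carry-save idea: each stage reduces three operands to a sum vector and a carry vector without propagating carries, and only the final vector-merging adder performs a carry-propagating addition. Names are illustrative and operands are unsigned.

```python
def carry_save_step(x, y, z):
    """3-to-2 reduction: return (sum_vector, carry_vector) with x + y + z preserved."""
    sum_vec = x ^ y ^ z
    carry_vec = ((x & y) | (y & z) | (x & z)) << 1  # carries saved for the next stage
    return sum_vec, carry_vec

def carry_save_multiply(a, b, n):
    """Multiply a by the n low bits of b, accumulating in carry-save form."""
    s, c = 0, 0
    for i in range(n):
        partial = (a << i) if (b >> i) & 1 else 0
        s, c = carry_save_step(s, c, partial)
    return s + c  # vector-merging adder
```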



Multiplier Floorplan
[Figure: floorplan of the 4x4 carry-save multiplier with inputs X3 to X0 and Y0 to Y3. Rows of HA Multiplier Cells and FA Multiplier Cells pass their C and S outputs on to a final row of Vector Merging Cells; outputs Z0 to Z7]

X and Y signals are broadcast through the complete array.

Rectangular shape is easy for integration


Partial Products Transformation
[Figure: dot diagrams over bit positions 6 to 0 showing (a) the partial products, (b) the first stage, (c) the second stage and (d) the final adder, with FA and HA groupings marked]


Wallace-Tree Multiplier
[Figure: 4x4 Wallace-tree multiplier. The partial products x0y0 to x3y3 pass through a first stage (HA cells), a second stage (FA cells) and a final adder producing z7 to z0]


Wallace Tree Multiplier
• Pros:
  ◦ Substantial hardware saving
  ◦ Reduced propagation delay
• Cons:
  ◦ Irregular structure
  ◦ Customized layout


Shifters



The Binary Shifter
Widely used in floating-point units and for multiplications by constant numbers

[Figure: bit-slice i of a binary shifter, with control signals Right, nop and Left connecting inputs Ai, Ai-1 to outputs Bi, Bi-1]

This binary shifter is extremely slow for multi-bit shifts


The Barrel Shifter
Area Dominated by Control Wiring

[Figure: 4-bit barrel shifter. Data wires A3 to A0 and B3 to B0 cross control wires Sh0 to Sh3, which carry the control signals from a decoder]


4x4 barrel shifter
[Figure: layout of the 4x4 barrel shifter with inputs A0 to A3, control lines Sh0 to Sh3 and an output buffer]

Width_barrel ≈ 2 p_m M


Logarithmic Shifter
[Figure: logarithmic shifter with shift stages controlled by Sh1, Sh2 and Sh4, connecting inputs A3 to A0 to outputs B3 to B0]
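A logarithmic shifter composes the total shift from stages that shift by 1, 2, 4, ... bits, each enabled by one bit of the shift amount. The sketch below is behavioral only; the right-shift direction, the zero fill and the function name are assumptions for illustration.

```python
def log_shift_right(value, amount, width=8):
    """Shift right by 0..width-1 using stages that shift by 1, 2, 4, ..."""
    if not 0 <= amount < width:
        raise ValueError("amount out of range")
    value &= (1 << width) - 1
    stage_shift = 1
    while stage_shift < width:
        if amount & stage_shift:  # control bit for this stage (Sh1, Sh2, Sh4, ...)
            value >>= stage_shift
        stage_shift <<= 1
    return value
```

A shift by 5 enables the shift-by-1 and shift-by-4 stages and bypasses the shift-by-2 stage, so an 8-bit word needs only three stages.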



0-7 bit Logarithmic Shifter

[Figure: layout of the 0-7 bit logarithmic shifter with inputs A0 to A3 and outputs Out0 to Out3]
