Mirror Adder
VLSI Principles
Lecture 16
[Figure: block diagram with blocks labeled MEMORY, INPUT/OUTPUT, and CONTROL, joined by INTERCONNECT]
Today……
[Figure: Intel Itanium® integer datapath. Labels: inputs a and b, 9-1 Mux, 5-1 Mux, 2-1 Mux, CARRYGEN (g64), SUMGEN + LU (s0, s1), SUMSEL, node1, REG, clock ck1, outputs sum and sumb to cache. Scale bar: 1000 µm.]
LU: Logical Unit
1000um
DATA OUT
Bit 32
Multiplexer
DATA IN
Register
Shifter
Adder
……
Bit 0
[Figure: Half-Adder with inputs x and y and outputs s (SUM) and c (Carry)]
[Figure: Full adder with inputs A, B, Cin and outputs Sum, Cout, built from two half adders (HA)]
[Figure: transistor-level full adder schematic; VDD rail, inputs A, B, Ci, internal node X, outputs S and Co]
$S = A \oplus B \oplus C_i = A B C_i + \overline{C_o}\,(A + B + C_i)$
$C_o = AB + BC_i + AC_i$
28 transistors
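The two forms of the sum can be compared exhaustively. The sketch below assumes that the carry term inside the factored sum expression is the complemented carry-out, which is what allows the sum circuit to reuse the carry logic; the function names are illustrative and not part of the slides.

```python
from itertools import product

def carry_out(a, b, ci):
    """Co = AB + BCi + ACi (majority of the three inputs)."""
    return (a & b) | (b & ci) | (a & ci)

def sum_xor(a, b, ci):
    """S = A xor B xor Ci."""
    return a ^ b ^ ci

def sum_from_carry(a, b, ci):
    """S = ABCi + (not Co)(A + B + Ci), reusing the carry-out (assumed form)."""
    co = carry_out(a, b, ci)
    return (a & b & ci) | ((1 - co) & (a | b | ci))

# Compare both sum forms and the arithmetic meaning over all 8 input rows.
for a, b, ci in product((0, 1), repeat=3):
    assert sum_xor(a, b, ci) == sum_from_carry(a, b, ci)
    assert 2 * carry_out(a, b, ci) + sum_xor(a, b, ci) == a + b + ci
```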
$\overline{S}(A, B, C_i) = S(\overline{A}, \overline{B}, \overline{C_i})$
$\overline{C_o}(A, B, C_i) = C_o(\overline{A}, \overline{B}, \overline{C_i})$
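A short check of this symmetry, read here as the inversion property of the full adder (inverting all inputs inverts both outputs). This reading assumes the overbars lost in extraction; `full_adder` and `inversion_property_holds` are illustrative names.

```python
from itertools import product

def full_adder(a, b, ci):
    """Return (s, co) for one-bit inputs."""
    s = a ^ b ^ ci
    co = (a & b) | (b & ci) | (a & ci)
    return s, co

def inversion_property_holds():
    """True if complementing A, B, Ci complements both S and Co in every row."""
    for a, b, ci in product((0, 1), repeat=3):
        s, co = full_adder(a, b, ci)
        s_inv, co_inv = full_adder(1 - a, 1 - b, 1 - ci)
        if s_inv != 1 - s or co_inv != 1 - co:
            return False
    return True
```

This is the property that lets a ripple chain alternate inverting stages without inserting extra inverters in the carry path.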
[Figure: chain of four adder cells with inputs A0 B0, A1 B1, A2 B2, A3 B3 and outputs S0, S1, S2, S3]
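A behavioral sketch of such a four-cell chain, assuming it is a ripple-carry adder in which each cell's carry-out feeds the next cell's carry-in. Bit lists are least-significant bit first; the helper names are hypothetical.

```python
def to_bits(value, width):
    """Integer to list of bits, LSB first."""
    return [(value >> i) & 1 for i in range(width)]

def from_bits(bits):
    """List of bits (LSB first) to integer."""
    return sum(bit << i for i, bit in enumerate(bits))

def ripple_carry_add(a_bits, b_bits, carry_in=0):
    """Add two equal-length bit lists; return (sum_bits, carry_out)."""
    sums = []
    carry = carry_in
    for a, b in zip(a_bits, b_bits):
        sums.append(a ^ b ^ carry)                       # S  = A xor B xor Ci
        carry = (a & b) | (b & carry) | (a & carry)      # Co = AB + BCi + ACi
    return sums, carry

# Usage: 11 + 6 = 17, which overflows four bits (sum bits 0001, carry-out 1).
sum_bits, c_out = ripple_carry_add(to_bits(11, 4), to_bits(6, 4))
```

The carry passes through every cell in the worst case, so the delay of this structure grows linearly with the number of bits.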
Truth Table for Full Adder

A  B  Ci | S  Co | status
0  0  0  | 0  0  | D
0  0  1  | 1  0  | D
0  1  0  | 1  0  | P
0  1  1  | 0  1  | P
1  0  0  | 1  0  | P
1  0  1  | 0  1  | P
1  1  0  | 0  1  | G
1  1  1  | 1  1  | G

(D: delete, P: propagate, G: generate)

[Figure: adder schematic shown beside the table; VDD rails, inputs A, B, Ci, outputs Co and S]
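The status column depends only on A and B, and it determines the carry-out. A minimal sketch of that relationship follows; the function names are illustrative, and reading D as "delete" (the carry is killed) is an assumption about the table's abbreviation.

```python
from itertools import product

def carry_status(a, b):
    """Classify an input pair as in the status column of the truth table."""
    if a & b:
        return "G"   # generate: Co = 1 regardless of Ci
    if a ^ b:
        return "P"   # propagate: Co = Ci
    return "D"       # delete: Co = 0 regardless of Ci

def carry_from_status(a, b, ci):
    """Carry-out derived from the status alone."""
    status = carry_status(a, b)
    if status == "G":
        return 1
    if status == "D":
        return 0
    return ci

# The status-based carry agrees with Co = AB + BCi + ACi on every row.
for a, b, ci in product((0, 1), repeat=3):
    assert carry_from_status(a, b, ci) == (a & b) | (b & ci) | (a & ci)
```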
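[Figure: adder cell layout sketch with inputs A, B, Ci, output Co, and GND rail]
[Figure: adder circuit using the propagate signal P, with sections labeled Setup and Carry Generation; inputs A, B, Ci, output Co]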
Setup P
Di
Pi φ
Ci,0
G0 G1 G2 G3
C0 C1 C2 C3
VDD
Pi Gi φ Pi + 1 Gi + 1 φ
Ci - 1 Ci Ci + 1
GND
Inverter/Sum Row
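Bypass Adder
[Figure: four full adders (FA) with inputs P0 G0, P1 G1, P2 G2, P3 G3, carry-in Ci,0, and carries Co,0, Co,1, Co,2, Co,3; a multiplexer controlled by BP selects the block carry-out]
BP = P0 P1 P2 P3
[Figure: N-bit adder divided into M-bit bypass stages; N = 16, M = 4]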
For small N, the bypass adder is not preferred because of the overhead of the bypass multiplexer. The crossover point, at roughly N = 4 to 8, depends on the technology.
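A behavioral sketch of one 4-bit bypass block, assuming P = A xor B and G = A and B per bit and that the multiplexer forwards the block carry-in when BP = P0 P1 P2 P3 is true. The function names and the LSB-first bit lists are illustrative choices.

```python
def propagate_generate(a, b, width=4):
    """Per-bit (P, G) lists, LSB first, with P = A xor B and G = A and B (assumed)."""
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(width)]
    g = [((a >> i) & (b >> i)) & 1 for i in range(width)]
    return p, g

def bypass_block_carry(p, g, ci):
    """Block carry-out: the mux picks Ci when BP is set, else the rippled carry."""
    rippled = ci
    for pk, gk in zip(p, g):
        rippled = gk | (pk & rippled)     # Co,k = Gk + Pk * Co,k-1
    bp = int(all(p))                      # BP = P0 P1 P2 P3
    return ci if bp else rippled

# Usage: 0b1010 + 0b0101 propagates in every bit, so the carry-in is bypassed.
p, g = propagate_generate(0b1010, 0b0101)
block_carry = bypass_block_carry(p, g, 1)
```

The logical result is the same as rippling; the benefit is in timing, because the bypassed carry skips the four-cell chain.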
[Figure: adder stages labeled Carry Vector and Sum Generation; N = 16, M = 4]
Hardware overhead: about 30%
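Assuming the figure above belongs to the carry-select adder whose delay is plotted next, the idea can be sketched behaviorally: each M-bit block computes its result for carry-in 0 and for carry-in 1, and the incoming carry selects one of them. This is a functional model with hypothetical names, not the circuit from the slides.

```python
def carry_select_add(a, b, n=16, m=4, carry_in=0):
    """Add two n-bit integers block by block; return (sum mod 2**n, carry_out)."""
    mask = (1 << m) - 1
    result = 0
    carry = carry_in
    for k in range(0, n, m):
        a_blk = (a >> k) & mask
        b_blk = (b >> k) & mask
        total_if_0 = a_blk + b_blk        # block result assuming carry-in = 0
        total_if_1 = a_blk + b_blk + 1    # block result assuming carry-in = 1
        total = total_if_1 if carry else total_if_0   # multiplexer selection
        result |= (total & mask) << k
        carry = total >> m
    return result, carry

# Usage: 0xFFFF + 1 wraps to 0 with a carry-out of 1.
wrapped = carry_select_add(0xFFFF, 1)
```

Duplicating the block logic for both carry assumptions is the source of the extra hardware.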
[Figure: propagation delay tp (in unit delays, axis 0 to 40) versus number of bits N (axis 0 to 60) for three adders: ripple adder, linear select, and square root select]
[Figure: adder with sum outputs S0, S1, ..., SN-1]
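$C_{o,k} = G_k + P_k\,(G_{k-1} + P_{k-1}(\cdots + P_1(G_0 + P_0 C_{i,0})))$
$C_{o,1} = G_1 + P_1 C_{o,0} = G_1 + P_1(G_0 + P_0 C_{i,0}) = G_1 + P_1 G_0 + P_1 P_0 C_{i,0}$
$C_{o,2} = G_2 + P_2 G_1 + P_2 P_1 G_0 + P_2 P_1 P_0 C_{i,0}$
$C_{o,3} = G_3 + P_3 G_2 + P_3 P_2 G_1 + P_3 P_2 P_1 G_0 + P_3 P_2 P_1 P_0 C_{i,0}$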
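The expanded expressions can be checked against the recursive form Co,k = Gk + Pk Co,k-1. The sketch below builds each carry as a flat sum of products, mirroring the expansion above; the function names are illustrative.

```python
from itertools import product

def lookahead_carries(p, g, ci0):
    """Carries Co,0..Co,n-1 as flat sums of products of the P and G signals."""
    carries = []
    for k in range(len(p)):
        terms = []
        for j in range(k, -1, -1):
            term = g[j]                   # Gj ANDed with Pj+1 ... Pk
            for i in range(j + 1, k + 1):
                term &= p[i]
            terms.append(term)
        term = ci0                        # Ci,0 ANDed with P0 ... Pk
        for i in range(k + 1):
            term &= p[i]
        terms.append(term)
        carries.append(int(any(terms)))
    return carries

def ripple_carries(p, g, ci0):
    """Reference: Co,k = Gk + Pk * Co,k-1."""
    carries = []
    carry = ci0
    for pk, gk in zip(p, g):
        carry = gk | (pk & carry)
        carries.append(carry)
    return carries

# Both forms agree for every 4-bit combination of P, G and Ci,0.
for bits in product((0, 1), repeat=9):
    p, g, ci0 = list(bits[0:4]), list(bits[4:8]), bits[8]
    assert lookahead_carries(p, g, ci0) == ripple_carries(p, g, ci0)
```

The number of product terms and the widest AND both grow with k, which is why a single-gate implementation becomes impractical for wide adders.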
[Figure: single-gate implementation of Co,3 with inputs G0 ... G3, P0 ... P3, and Ci,0. Annotation: large transistor stack, poor performance.]
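$C_{o,2} = G_2 + P_2 C_{o,1}$
$C_{o,3} = G_3 + P_3 C_{o,2}$
[Figure: inputs A0 ... A7 combined in sequence into output F; $t_p \sim N$]
[Figure: combining the pairs (g'', p'') and (g', p') into (g, p)]
$g = g'' + g' p''$
$p = p' p''$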
[Figure: Kogge-Stone tree with inputs (A1, B1), (A2, B2), ..., (A9, B9), ... and sum outputs S1, S2, ..., S9, ...]
$t_p \propto \log_2 N$
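A behavioral sketch of a Kogge-Stone style prefix adder, using the pair-combining rule g = g'' + g'p'' with (g'', p'') as the more significant pair. The 16-bit width, the folding of the carry-in into bit 0, and all names are assumptions made for illustration.

```python
def dot(hi, lo):
    """Combine a more significant (g, p) pair with a less significant one."""
    g_hi, p_hi = hi
    g_lo, p_lo = lo
    return (g_hi | (p_hi & g_lo), p_hi & p_lo)

def kogge_stone_add(a, b, n=16, carry_in=0):
    """Return (sum mod 2**n, carry_out) using log2(n) prefix levels."""
    gp = [(((a >> i) & (b >> i)) & 1, ((a >> i) ^ (b >> i)) & 1) for i in range(n)]
    p_bits = [p for _, p in gp]
    g0, p0 = gp[0]
    gp[0] = (g0 | (p0 & carry_in), p0)    # fold the carry-in into bit 0
    dist = 1
    while dist < n:
        # Every position combines with the one 'dist' places below it.
        gp = [dot(gp[i], gp[i - dist]) if i >= dist else gp[i] for i in range(n)]
        dist *= 2
    carries = [g for g, _ in gp]          # carries[i] is the carry out of bit i
    total = 0
    for i in range(n):
        c_in = carry_in if i == 0 else carries[i - 1]
        total |= (p_bits[i] ^ c_in) << i
    return total, carries[n - 1]
```

The loop runs log2(n) times (four levels for 16 bits), which reflects the logarithmic delay noted above.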
Kaustav Banerjee
Multipliers

          1 0 1 0 1 0    Multiplicand
x             1 0 1 1    Multiplier
    -----------------
          1 0 1 0 1 0
        1 0 1 0 1 0
      0 0 0 0 0 0        Partial products
+   1 0 1 0 1 0
    -----------------
    1 1 1 0 0 1 1 1 0    Result
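The worked example above (101010 x 1011, that is 42 x 11 = 462) follows the shift-and-add pattern: one partial product per multiplier bit, shifted by the bit position. A minimal sketch with an illustrative function name:

```python
def shift_add_multiply(multiplicand, multiplier):
    """Return (product, partial_products) for non-negative integers."""
    partial_products = []
    product = 0
    shift = 0
    while multiplier >> shift:
        bit = (multiplier >> shift) & 1
        # Each partial product is the multiplicand ANDed with one multiplier bit.
        pp = (multiplicand if bit else 0) << shift
        partial_products.append(pp)
        product += pp
        shift += 1
    return product, partial_products

result, rows = shift_add_multiply(0b101010, 0b1011)
print(bin(result))   # 0b111001110
print(rows)          # [42, 84, 0, 336]
```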
[Figure: array multiplier. Inputs X3 ... X0 are combined with Y1, Y2, Y3 in three rows of adder cells (HA FA FA HA; FA FA FA HA; FA FA FA HA), producing outputs Z0 ... Z7.]
[Figure: array multiplier with the same cell arrangement, annotated with dimensions M and N-1; two cell paths labeled HA FA FA FA]
[Figure: multiplier array with rows for Y0 ... Y3 passing carry (C) and sum (S) signals downward to outputs Z0 ... Z7. Legend: HA Multiplier Cell, FA Multiplier Cell, Vector Merging Cell.]
X and Y signals are broadcast through the complete array.
[Figure: panels (a) to (d) showing partial-product reduction with FA and HA cells. First stage: HA HA. Second stage: FA FA FA FA. Final adder produces z7 ... z0.]
Cons:
Irregular structure
Customized layout
[Figure: shifter Bit-Slice i with signals Ai, Bi and Ai-1, Bi-1, ...]
This binary shifter is extremely slow for multi-bit applications
[Figure: shifter array with inputs A0 ... A2, outputs B0 ... B2, shift controls Sh0, Sh1, Sh2, Sh3, and output buffers]
$\text{Width}_{barrel} \sim 2\,p_m\,M$
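A behavioral sketch of a barrel shifter driven by one-hot controls Sh0 ... Sh3, where Shk selects a shift by k positions. The shift direction (toward the least significant bit) and the zero fill of vacated positions are assumptions, since the schematic itself is not recoverable here; the function name is hypothetical.

```python
def barrel_shift(a_bits, sh):
    """Shift a_bits (LSB first) right by k, where sh is one-hot and sh[k] == 1."""
    if sum(sh) != 1:
        raise ValueError("exactly one shift control must be active")
    n = len(a_bits)
    out = []
    for i in range(n):
        bit = 0
        for k, enabled in enumerate(sh):
            # Output i connects to input i + k when control k is active.
            if enabled and i + k < n:
                bit |= a_bits[i + k]
        out.append(bit)
    return out

# Usage: 13 (bits 1,0,1,1 LSB first) shifted right by one gives 6 (bits 0,1,1,0).
shifted = barrel_shift([1, 0, 1, 1], [0, 1, 0, 0])
```

Every output reaches its selected input through a single switch, so the shift amount does not change the number of stages a signal crosses.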
[Figure: shifter with inputs A3 ... A0, signals B3 ... B0, and outputs Out3 ... Out0]