CH 5
x = q · d + rem
q_{j+1} = SEL(rw[j], d)
• RADIX r
• QUOTIENT-DIGIT SET
1. CANONICAL: 0 ≤ q_j ≤ r − 1
2. REDUNDANT: q_j ∈ D_a = {−a, −a + 1, ..., −1, 0, 1, ..., a − 1, a}
• REDUNDANCY FACTOR: ρ = a/(r − 1), ρ > 1/2
• REPRESENTATION OF RESIDUAL:
1. NONREDUNDANT (e.g., 2’s complement)
2. REDUNDANT: carry-save, signed-digit
Initialization:
• w[0] = x − d q[0], with |w[0]| ≤ ρd. Options:
– Make q[0] = 0 and shift the dividend:
∗ For ρ = 1, make w[0] = x/2.
∗ For 1/2 < ρ < 1, make w[0] = x/4.
The shift is compensated in the termination step.
– Make q[0] = 1 and w[0] = x − d. Applicable only for ρ < 1, because q > 1 + ρ is not allowed.
Termination:
• QUOTIENT:
q = q[N]           if w[N] ≥ 0
q = q[N] − r^{−N}  if w[N] < 0
where N is the number of iterations.
If the dividend was shifted in the initialization, shift the quotient accordingly (extra iteration).
• UPDATE
Q[j+1]  = Q[j] + q_{j+1} r^{−(j+1)}
QM[j+1] = Q[j+1] − r^{−(j+1)}
        = Q[j] + (q_{j+1} − 1) r^{−(j+1)}            if q_{j+1} > 0
        = QM[j] + ((r − 1) − |q_{j+1}|) r^{−(j+1)}   if q_{j+1} ≤ 0
• ALL ADDITIONS ARE CONCATENATIONS
Q[j+1]  = (Q[j], q_{j+1})                  if q_{j+1} ≥ 0
        = (QM[j], (r − |q_{j+1}|))         if q_{j+1} < 0
QM[j+1] = (Q[j], q_{j+1} − 1)              if q_{j+1} > 0
        = (QM[j], ((r − 1) − |q_{j+1}|))   if q_{j+1} ≤ 0
j qj Q[j] QM [j]
0 0 0
1 1 0.1 0.0
2 1 0.11 0.10
3 0 0.110 0.101
4 1 0.1101 0.1100
5 -1 0.11001 0.11000
6 0 0.110010 0.110001
7 0 0.1100100 0.1100011
8 -1 0.11000111 0.11000110
9 1 0.110001111 0.110001110
10 0 0.1100011110 0.1100011101
11 1 0.11000111101 0.11000111100
12 0 0.110001111010 0.110001111001
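The concatenation rules can be tried out with a short Python sketch. It is only an illustration: the function names are hypothetical, Q[0] and QM[0] are modeled as empty digit lists, and the quotient is assumed to be nonnegative. Run on the digit sequence of the table above, it should reproduce the last row.

```python
def on_the_fly_convert(digits, r=2):
    """Return the digit lists of Q[n] and QM[n] = Q[n] - r**-n.

    digits: signed quotient digits q_1..q_n, most significant first.
    Only concatenations are used, so no carry ever propagates.
    """
    q, qm = [], []                      # Q[0] and QM[0] as empty fractions
    for d in digits:
        # Both updates read the OLD q and qm, so build them before assigning.
        new_q = q + [d] if d >= 0 else qm + [r - abs(d)]
        new_qm = q + [d - 1] if d > 0 else qm + [(r - 1) - abs(d)]
        q, qm = new_q, new_qm
    return q, qm


def as_fraction_string(digit_list):
    """Format a digit list as 0.d1d2... (readable for r <= 10)."""
    return "0." + "".join(str(d) for d in digit_list)


digits = [1, 1, 0, 1, -1, 0, 0, -1, 1, 0, 1, 0]
q, qm = on_the_fly_convert(digits)
print(as_fraction_string(q))    # 0.110001111010
print(as_fraction_string(qm))   # 0.110001111001
```

Keeping QM alongside Q is what removes the borrow: a negative digit never has to be subtracted, because the already decremented prefix is available.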
[Figure: implementation of on-the-fly conversion. Registers Q and QM, each fed through a 2-1 MUX with load and select controls; CONTROL derives Qin and QMin from q_{j+1}.]
Q  ← shift Q with insert (Qin)     if CshiftQ = 1
   ← shift QM with insert (Qin)    if CloadQ = 1
QM ← shift QM with insert (QMin)   if CshiftQM = 1
   ← shift Q with insert (QMin)    if CloadQM = 1
Qin  = q_{j+1}               if q_{j+1} ≥ 0
     = r − |q_{j+1}|         if q_{j+1} < 0
QMin = q_{j+1} − 1           if q_{j+1} > 0
     = (r − 1) − |q_{j+1}|   if q_{j+1} ≤ 0
REGISTER CONTROL SIGNALS: CloadQ = CshiftQ′ and CloadQM = CshiftQM′
Figure 5.3: DIVISION IMPLEMENTATION: (a) TOTALLY SEQUENTIAL. (b) TOTALLY COMBINATIONAL. (c) COMBINED IMPLEMENTATION.
Digital Arithmetic – Ercegovac/Lang 2003 5 – Division
EXAMPLES OF ALGORITHMS AND IMPLEMENTATIONS
• CELLS: delay is a function of the load; delay and area are expressed in units of a 2-input NAND gate (2-NAND).
• DEGREE OF OPTIMIZATION: the same modules are used in all designs.
• INTERCONNECTIONS NOT INCLUDED: the delay, area, and load of interconnections are not considered.
• EXECUTION TIME AND AREA ARE GIVEN FOR 53-BIT OPERANDS AND RESULT
• THE DELAY AND AREA OF REGISTERS ARE INCLUDED
Figure 5.4: Basic modules: (a) Multiplexers. (b) Buffer and register cell. (c) Full-adder. (d) [4:2] module.
The estimate ŷ consists of four bits (three integer bits and one fractional bit) of the shifted residual in carry-save form.
• CSADD is carry-save addition,
• −q_{j+1}d is in 2’s complement form, and
• CONVERT is the on-the-fly quotient conversion function.
2WS[0]  = 000.10011111
2WC[0]  = 000.00000001 *    ŷ[0] = 0.5     q_1 = 1
−q_1 d  =  11.00111010
2WS[1]  = 111.01001000
2WC[1]  = 000.01101100      ŷ[1] = −1      q_2 = −1
−q_2 d  =  00.11000101
2WS[2]  = 111.11000010
2WC[2]  = 001.00110001 *    ŷ[2] = 0.5     q_3 = 1
−q_3 d  =  11.00111010
2WS[3]  = 011.10010010
2WC[3]  = 100.11001001 *    ŷ[3] = 0       q_4 = 1
−q_4 d  =  11.00111010
2WS[4]  = 000.11000010
2WC[4]  = 110.01101000      ŷ[4] = −1.5    q_5 = −1
−q_5 d  =  00.11000101
* least-significant 1 for the 2’s complement of q_{j+1}d
q = 2 × (.1 1̄ 1 1 1̄) = .1101, where 1̄ denotes the digit −1
Figure 5.7: Example of radix-2 division with residual in carry-save form.
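The steps of this example can be replayed with a small bit-level Python sketch. It is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, words hold 8 fractional bits, the adder keeps 2 integer bits, and the digit is chosen as 1 for ŷ ≥ 0, 0 for ŷ = −1/2, and −1 for ŷ ≤ −1. Termination and on-the-fly conversion are left out.

```python
from fractions import Fraction

FRAC = 8                                # fractional bits in each residual word
ADDER_MASK = (1 << (FRAC + 2)) - 1      # adder width: 2 integer bits + FRAC


def select(y_halves):
    """Radix-2 digit from the 4-bit estimate, given in units of 1/2."""
    if y_halves >= 0:
        return 1
    return 0 if y_halves == -1 else -1


def divide_radix2_carry_save(x, d, n):
    """x and d are integers scaled by 2**FRAC; returns (digits, trace)."""
    two_ws, two_wc = x, 0               # WS[0] = x/2 and WC[0] = 0, so 2WS[0] = x
    digits, trace = [], []
    for _ in range(n):
        # Estimate: add the 4 leading bits of both words, modulo 2**4.
        y = ((two_ws >> (FRAC - 1)) + (two_wc >> (FRAC - 1))) & 0xF
        if y >= 8:
            y -= 16                     # read the 4 bits as 2's complement
        q = select(y)
        if q == 1:                      # -d as 1's complement plus a carry-in
            addend, carry_in = ~d & ADDER_MASK, 1
        elif q == -1:
            addend, carry_in = d, 0
        else:
            addend, carry_in = 0, 0
        two_wc |= carry_in              # uses the free LSB of the shifted carry word
        trace.append((two_ws, two_wc, Fraction(y, 2), q))
        a, b = two_ws & ADDER_MASK, two_wc & ADDER_MASK
        ws = a ^ b ^ addend             # sum bits of the carry-save adder
        wc = (((a & b) | (a & addend) | (b & addend)) << 1) & ADDER_MASK
        two_ws, two_wc = ws << 1, wc << 1
        digits.append(q)
    return digits, trace


x, d = 0b10011111, 0b11000101           # 0.10011111 and 0.11000101
digits, trace = divide_radix2_carry_save(x, d, 5)
for ws, wc, y, q in trace:
    print(f"2WS={ws:011b} 2WC={wc:011b} y={y} q={q}")
quotient = 2 * sum(Fraction(q, 2 ** (j + 1)) for j, q in enumerate(digits))
print(digits, quotient)                 # [1, -1, 1, 1, -1] 13/16
```

The factor 2 in the last step compensates for the initialization w[0] = x/2.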
[Figure: radix-2 stage with carry-save residual, initialized with WS[0] = x/2 and WC[0] = 0. Delays along the critical path: qSEL 6.8, buff 1.8, MUX 1.4, HA* 2.2, REG 4.]
i             8    9   10   11   12   13   14   15
m_2(i)+      12   14   15   16   18   20   20   24
m_1(i)+       4    4    4    4    6    6    8    8
m_0(i)+      -4   -6   -6   -6   -8   -8   -8   -8
m_{−1}(i)+  -13  -15  -16  -18  -20  -20  -22  -24
+: real value = shown value/16
4WS[0]+  = 000.10101111
4WC[0]+  = 000.00000001 *    ŷ[0] = 10/16     q_1 = 1
−q_1 d+  =  11.00111010
WS[1]    =   1.10010100
WC[1]    =   0.01010110
4WS[1]+  = 110.01010000
4WC[1]+  = 001.01011000      ŷ[1] = −6/16     q_2 = 0
−q_2 d+  =  00.00000000
WS[2]    =   1.00001000
WC[2]    =   0.10100000
4WS[2]+  = 100.00100000
4WC[2]+  = 010.10000000      ŷ[2] = −22/16    q_3 = −2
−q_3 d+  =  01.10001010
w[3]     =   0.00101010
* least-significant 1 for 2’s complement of q_{j+1}d
+ only one integer bit is used in the recurrence, because of the range of w[j+1].
q[3] = (.1 0 2̄)_4 = (.0 3 2)_4, where 2̄ denotes the digit −2
Figure 5.9: Example of radix-4 division with residual in carry-save form. (On-the-fly conversion and termination not shown.)
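The radix-4 example can be replayed the same way. The sketch below is an illustration under assumptions that are not stated in the slides: the function names are hypothetical, the table index i is taken to be the four leading bits of d (8 to 15), the digit k is chosen as the largest k with ŷ ≥ m_k(i), and the adder keeps one integer bit as noted in the example. The selection constants are the ones tabulated before the example, in units of 1/16.

```python
from fractions import Fraction

FRAC = 8                                # fractional bits in each residual word
ADDER_MASK = (1 << (FRAC + 1)) - 1      # adder width: 1 integer bit + FRAC
# m_k(i) in units of 1/16, for i = 8..15 (assumed: the 4 leading bits of d).
M = {
    2: [12, 14, 15, 16, 18, 20, 20, 24],
    1: [4, 4, 4, 4, 6, 6, 8, 8],
    0: [-4, -6, -6, -6, -8, -8, -8, -8],
    -1: [-13, -15, -16, -18, -20, -20, -22, -24],
}


def select(y16, i):
    """Largest digit k whose constant m_k(i) does not exceed the estimate."""
    for k in (2, 1, 0, -1):
        if y16 >= M[k][i - 8]:
            return k
    return -2


def divide_radix4_carry_save(x, d, n):
    """x and d are integers scaled by 2**FRAC; returns (digits, w[n])."""
    i = d >> (FRAC - 4)                 # divisor index, 8..15
    four_ws, four_wc = x, 0             # WS[0] = x/4 and WC[0] = 0, so 4WS[0] = x
    digits, ws, wc = [], 0, 0
    for _ in range(n):
        # Estimate: 7 leading bits (3 integer, 4 fractional) of both words.
        y = ((four_ws >> (FRAC - 4)) + (four_wc >> (FRAC - 4))) & 0x7F
        if y >= 64:
            y -= 128                    # read the 7 bits as 2's complement
        q = select(y, i)
        multiple = abs(q) * d           # 0, d, or 2d
        if q > 0:                       # subtract: 1's complement plus carry-in
            addend, carry_in = ~multiple & ADDER_MASK, 1
        else:
            addend, carry_in = multiple & ADDER_MASK, 0
        four_wc |= carry_in
        a, b = four_ws & ADDER_MASK, four_wc & ADDER_MASK
        ws = a ^ b ^ addend
        wc = (((a & b) | (a & addend) | (b & addend)) << 1) & ADDER_MASK
        four_ws, four_wc = ws << 2, wc << 2
        digits.append(q)
    return digits, (ws + wc) & ADDER_MASK   # residual assimilated, modulo 2


digits, w3 = divide_radix4_carry_save(0b10101111, 0b11000101, 3)
print(digits, f"{w3:09b}")              # [1, 0, -2] 000101010
```

The printed residual corresponds to w[3] = 0.00101010 in the example.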
DELAY AND AREA OF RADIX-4 STAGE
[Figure: radix-4 stage. Delays along the critical path: qSEL 10.8, buff 1.8, MUX 1.8, HA* 2.2, REG 4.]
qH = {−8, −4, 0, 4, 8} AND qL = {−2, −1, 0, 1, 2}
element                   delay    area
q-digit selection (qh)     12.2     610
buffers                     1.8      20
MUXes                       1.8     600
CSAh                        2.2     360
CSAl                        4.2     360
registers (3)               4.0     650
Convert & Round            (NC)    1360
Cycle time                 26.2
Total area                         3960
[Figure (a): stage producing qH_{j+1} and qL_{j+1}, with multiples {2d, d, 0, d, 2d}, a [3:2] adder, buffers, and outputs WS[j+1] and WC[j+1]. Delays along the critical path: qSEL 12.2, buff 1.8, MUX 1.8, HA* 2.2, HA 4.2, REG 4.]
[Figure (b): overlapped selection of q_{j+1} and q_{j+2}. The truncated residual {4W[j]}_7 is combined with {2d}_7, {d}_7, {d}_7, {2d}_7 in [3:2] adders to form conditional truncated residuals; qSEL blocks use {d}_4, and a 5-1 MUX controlled by q_{j+1} delivers q_{j+2}.]
[Figure: timing of the overlapped stage. First path: qSEL 12.2, buff 1.8, MUX 1.8, CSA1, HA, REG 4. Second path: [3:2] 4.2, qSEL 11.2, MUX 1.4, buff 1.8, MUX 1.8, HA* 2.2 (CSA2).]
• RECTANGULAR MULTIPLIER-ACCUMULATOR
• DELAY-AREA
element                   delay    area
M-module                   (NC)    1800
MUX                         1.4
recoder                     6.0      70
buffer                      1.8
MUX                         1.8    3000
2 levels of 4-2 CSA        12.0    3100
registers (3)               4.0     650
Convert & Round            (NC)    1360
Cycle time                 27
Total area                         9980
[Figure: datapath built around the rectangular multiplier-accumulator. Blocks: M module, scale/iterate MUXes, Round & Recode, MULTIPLIER-ACCUMULATOR (computing Md, Mx, and 512w − qz) with 67-bit carry-save outputs WC[j+1] and WS[j+1] (53 + 9 + 5 = 67), CPA, and SZ/CONVERT producing q_{j+1}.]
Quotient-digit set:
q_{j+1} ∈ D_a = {−a, −a + 1, ..., −1, 0, 1, ..., a − 1, a}
Redundancy factor:
ρ = a/(r − 1), 1/2 < ρ ≤ 1
• TWO FUNDAMENTAL CONDITIONS FOR q-SELECTION
• CONTAINMENT – must guarantee bounded residual
• CONTINUITY – there must exist a valid choice of qj+1 in the range of shifted
residual
• RESIDUAL RECURRENCE
w[j+1] = r w[j] − d q_{j+1},   |w[j]| ≤ ρd
ρ = a/(r − 1),   −a ≤ q_j ≤ a
• SELECTION INTERVALS
• If rw[j] ∈ [L_k, U_k], with L_k = (k − ρ)d and U_k = (k + ρ)d, then q_{j+1} = k keeps w[j+1] bounded
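[Figure (a): w[j+1] versus rw[j] for q_{j+1} = −a, ..., a, bounded by ±ρd, with the selection interval [L_k, U_k] for q_{j+1} = k.]
[Figure (b): selection regions [L_k, U_k] for q_{j+1} = k as functions of d, for k > 0, k = 0, and k < 0; L_{−a} = −rρd.]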
q_{j+1} = SEL(w[j], d)
• SEL is represented by the set {s_k}, −a ≤ k ≤ a:
q_{j+1} = k if s_k ≤ rw[j] ≤ s_{k+1} − ulp
• s_k is defined as the minimum value of rw[j] for which q_{j+1} = k
• The s_k’s are functions of the divisor d
• CONTAINMENT: L_k ≤ s_k ≤ U_k
• CONTINUITY: q_{j+1} = k − 1 for rw[j] = s_k − ulp, so s_k − ulp ≤ U_{k−1}
Because U_k ≥ U_{k−1} + ulp, the two conditions combine to L_k ≤ s_k ≤ U_{k−1} + ulp, or, more conservatively, L_k ≤ s_k ≤ U_{k−1}
• OVERLAP
U_{k−1} − L_k = (k − 1 + ρ)d − (k − ρ)d = (2ρ − 1)d
REQUIRING U_{k−1} − L_k ≥ 0 RESULTS IN
ρ ≥ 1/2
• REDUNDANCY IN q-DIGIT SET → OVERLAP BETWEEN SELECTION INTERVALS → SIMPLER SELECTION
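A few lines of exact rational arithmetic make the overlap concrete. This is a minimal sketch with hypothetical function names; it assumes the interval bounds L_k = (k − ρ)d and U_k = (k + ρ)d that appear in the overlap expression above.

```python
from fractions import Fraction


def selection_intervals(r, a, d):
    """Map each digit k to (L_k, U_k) = ((k - rho) d, (k + rho) d)."""
    rho = Fraction(a, r - 1)
    return {k: ((k - rho) * d, (k + rho) * d) for k in range(-a, a + 1)}


def overlaps(r, a, d):
    """U_{k-1} - L_k for every pair of adjacent digits."""
    iv = selection_intervals(r, a, d)
    return [iv[k - 1][1] - iv[k][0] for k in range(-a + 1, a + 1)]


d = Fraction(3, 4)
print(overlaps(4, 2, d))    # radix 4, a = 2: every overlap is (2*rho - 1) d = d/3
print(overlaps(4, 3, d))    # radix 4, a = 3 (rho = 1): every overlap is d
print(overlaps(2, 1, d))    # radix 2, a = 1 (rho = 1): every overlap is d
```

A larger digit set widens the overlap, which is the freedom the selection function uses to work with truncated operands.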
Figure 5.17: Selection function. (a) w[j+1] versus rw[j]: the intervals [L_{k−1}, U_{k−1}] and [L_k, U_k] overlap, and in the overlap either k or k − 1 can be selected. (b) rw[j] versus d over the normalized divisor range 1/2 to 1: s_k lies in the overlap region between L_k and U_{k−1}.
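[Figure: selection constants m_k placed between L_k and U_{k−1} over the divisor range 1/2 to 1, for k > 0 and k < 0.]
[Figure: radix-2 case with digit set {−1, 0, 1}. (a) w[j+1] versus rw[j] with intervals [L_{−1}, U_{−1}], [L_0, U_0], [L_1, U_1]. (b) Selection regions q_{j+1} = 1, 0, −1 versus d, with constants m_1 = 1/2 and m_0 = −1/2.]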
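With the constants m_1 = 1/2 and m_0 = −1/2, a complete radix-2 division with a nonredundant residual fits in a few lines. The sketch below is an illustration under assumptions: the function name is hypothetical, exact fractions stand in for the residual register, d is normalized to 1/2 ≤ d < 1, and the initialization w[0] = x/2 (for ρ = 1) is compensated by doubling the quotient at the end.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def srt_radix2(x, d, n):
    """Return (quotient, remainder) with x == quotient * d + remainder.

    x and d must be Fractions, with 1/2 <= d < 1 and 0 <= x < 2d.
    """
    w = x / 2                            # rho = 1: q[0] = 0 and w[0] = x/2
    q = Fraction(0)
    for j in range(1, n + 1):
        y = 2 * w                        # shifted residual rw[j]
        digit = 1 if y >= HALF else (0 if y >= -HALF else -1)
        w = y - digit * d                # w[j+1] = 2 w[j] - q_{j+1} d
        q += digit * Fraction(1, 2 ** j)
        assert abs(w) <= d               # containment: |w[j+1]| <= rho d
    if w < 0:                            # termination: correct a negative residual
        q -= Fraction(1, 2 ** n)
        w += d
    # Undo the initial shift: quotient 2q, remainder w * 2**-(n-1).
    return 2 * q, w * Fraction(1, 2 ** (n - 1))


x, d = Fraction(159, 256), Fraction(197, 256)
quotient, remainder = srt_radix2(x, d, 8)
print(quotient, remainder)
print(x == quotient * d + remainder)     # True
```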
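[Figure: selection constants m_k(i) and m_{k+1}(i) over the divisor interval from d_i to d_{i+1} = d_i + 2^{−δ}; the region for m_k(i) is bounded below by max(L_k(d_i), L_k(d_{i+1})) and above by min(U_{k−1}(d_i), U_{k−1}(d_{i+1})).]
m_k(i) = A_k(i) 2^{−c}
WHERE A_k(i) IS AN INTEGER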
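[Figure: staircase of selection constants m_k(0), m_k(1), m_k(2,3), m_k(4) between L_k and U_{k−1}, on a grid of 2^{−c} in rw[j] and 2^{−δ} in d (d_0, ..., d_5).]
[Figure: SELECTION FUNCTION with inputs {rw[j]}_c (1 + log_2 r + c bits) and {d}_δ (δ − 1 bits), log_2 r + c + δ input bits in total, producing q_{j+1}.]
Figure 5.23: SELECTION WITH TRUNCATED RESIDUAL AND DIVISOR.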
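[Figure: radix-4 division with a = 2. (a) w[j+1] versus 4w[j] for q_{j+1} = −2, ..., 2, with |w[j+1]| ≤ 2d/3 and |4w[j]| ≤ 8d/3; interval boundaries at d/3, 2d/3, 4d/3, and 5d/3. (b) P-D diagram, 4w[j] versus d from d = 1/2, with selection intervals [L_k, U_k] for k = −2, ..., 2.]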
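[Figure (a): fragment of the P-D diagram between L_2 and U_1, with {4w[j]}_3 on a grid of 1/8 (5/8 to 12/8) and {d}_4 from 1/2 to 1 in steps of 1/16. Constants read from the plot: m_2(0) = 6/8, m_2(1) = 7/8, m_2(2,3) = 1, m_2(4,5,6) = 10/8, m_2(7) = 12/8.]
[Figure (b): SELECTION block with inputs {4w[j]}_3 (xxx.xxx, 6 bits) and {d}_4 (0.1xxx, 3 bits), producing q_{j+1} (3 bits).]
Figure 5.25: QUOTIENT-DIGIT SELECTION: (a) FRAGMENT OF THE P-D DIAGRAM. (b) IMPLEMENTATION.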
ε_min ≤ y − ŷ ≤ ε_max
L*_k = L_k − ε_min
U*_k = U_k − ε_max
• OVERLAP
min(U*_{k−1}(d_i), U*_{k−1}(d_{i+1})) − max(L*_k(d_i), L*_k(d_{i+1})) ≥ 0
• RANGE
|rw[j]| ≤ rρd < rρ (for d < 1)
Figure 5.28: USE OF REDUNDANT ADDER: (a) Redundant adder. (b) Carry-save case. (c) Signed-digit case.
• ERRORS
Û_{k−1} = ⌊U*_{k−1} + 2^{−t}⌋_t = ⌊U_{k−1} − 2^{−t}⌋_t
L̂_k = ⌈L*_k⌉_t = ⌈L_k⌉_t
For positive k:
Û_{k−1}(d_i) − L̂_k(d_{i+1}) ≥ 0
(2ρ − 1)/2 − (a − ρ)2^{−δ} ≥ 2^{−t}
• RANGE
⌊−rρ − 2^{−t}⌋_t ≤ ŷ ≤ ⌊rρ − ulp⌋_t
where ⌊z⌋_t = 2^{−t}⌊2^t z⌋
For radix 2 with a = 1 (ρ = 1), the overlap condition becomes
1/2 − 0 × 2^{−δ} ≥ 2^{−t}
max(L̂_k(d_i), L̂_k(d_{i+1})) ≤ m_k(i) ≤ min(Û_{k−1}(d_i), Û_{k−1}(d_{i+1}))
L̂_1(1) = 0
Û_0(1/2) = 0
L̂_0(1/2) = −1/2
Û_{−1}(1) = −1/2
(L̂_1(1) = 0) ≤ m_1 ≤ (Û_0(1/2) = 0)
(L̂_0(1/2) = −1/2) ≤ m_0 ≤ (Û_{−1}(1) = −1/2)
This results in the selection constants m_1 = 0 and m_0 = −1/2
−5/2 ≤ ŷ ≤ 3/2
q_{j+1} =  1 if 0 ≤ ŷ ≤ 3/2
        =  0 if ŷ = −1/2
        = −1 if −5/2 ≤ ŷ ≤ −1
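These radix-2 values can be checked numerically. The sketch below is a hedged illustration: the helper names are hypothetical, t = 1 is assumed (one fractional bit in the estimate), a single divisor interval from 1/2 to 1 is used, and ulp = 2^−8 is an arbitrary choice that does not affect the result.

```python
from fractions import Fraction
from math import ceil, floor

R, A, T = 2, 1, 1                 # radix, largest digit, fractional bits of the estimate
RHO = Fraction(A, R - 1)
GRID = 2 ** T


def floor_t(z):
    """Truncate z down to a multiple of 2**-T."""
    return Fraction(floor(z * GRID), GRID)


def ceil_t(z):
    """Round z up to a multiple of 2**-T."""
    return Fraction(ceil(z * GRID), GRID)


def u_hat(k, d):
    """Upper bound of digit k on the estimate grid: floor_t(U_k - 2**-T)."""
    return floor_t((k + RHO) * d - Fraction(1, GRID))


def l_hat(k, d):
    """Lower bound of digit k on the estimate grid: ceil_t(L_k)."""
    return ceil_t((k - RHO) * d)


def constant_range(k, d_lo, d_hi):
    """Allowed range for m_k over the divisor interval [d_lo, d_hi]."""
    lo = max(l_hat(k, d_lo), l_hat(k, d_hi))
    hi = min(u_hat(k - 1, d_lo), u_hat(k - 1, d_hi))
    return lo, hi


d_lo, d_hi = Fraction(1, 2), Fraction(1)
print(constant_range(1, d_lo, d_hi))    # range for m_1: both ends equal 0
print(constant_range(0, d_lo, d_hi))    # range for m_0: both ends equal -1/2
ulp = Fraction(1, 2 ** 8)               # assumed residual precision
print(floor_t(-R * RHO - Fraction(1, GRID)), floor_t(R * RHO - ulp))   # -5/2 3/2
```

Both ranges collapse to a single point, so with t = 1 the constants m_1 = 0 and m_0 = −1/2 are the only valid choice under these assumptions.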
Figure 5.30: RADIX-2 DIVISION WITH CARRY-SAVE ADDER: (a) P-D PLOT. (b) SELECTION FUNCTION.
q_m = (p_{−1} p_0 p_1)′ (5.2)
q_s = p_{−2} ⊕ (g_{−1} + p_{−1} g_0 + p_{−1} p_0 g_1)
where
p_i = c_i ⊕ s_i,   g_i = c_i · s_i
and
(c_{−2}, c_{−1}, c_0, c_1)
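The two switching expressions can be compared exhaustively against the arithmetic form of the radix-2 selection. The sketch below rests on assumptions that are not spelled out in the slides: the carry and sum words each hold four bits at positions −2, −1, 0, 1 (weights −4, 2, 1, 1/2), q_m = 0 encodes the digit 0, and q_s = 1 encodes a negative digit. All function names are hypothetical.

```python
def select_logic(c, s):
    """Evaluate (q_m, q_s) from 4-bit carry and sum words (position -2 is the MSB)."""
    def bit(word, position):
        return (word >> (1 - position)) & 1     # position 1 is the LSB

    p = {i: bit(c, i) ^ bit(s, i) for i in (-2, -1, 0, 1)}
    g = {i: bit(c, i) & bit(s, i) for i in (-2, -1, 0, 1)}
    qm = 1 - (p[-1] & p[0] & p[1])
    qs = p[-2] ^ (g[-1] | (p[-1] & g[0]) | (p[-1] & p[0] & g[1]))
    return qm, qs


def select_arith(c, s):
    """Return (estimate in units of 1/2, digit) using the arithmetic rule."""
    y = (c + s) & 0xF
    if y >= 8:
        y -= 16                                 # 4-bit 2's complement
    return y, (1 if y >= 0 else 0 if y == -1 else -1)


def count_mismatches():
    """Compare both forms for every carry-save pair with -5/2 <= estimate <= 3/2."""
    mismatches = 0
    for c in range(16):
        for s in range(16):
            y, q = select_arith(c, s)
            if -5 <= y <= 3:
                qm, qs = select_logic(c, s)
                decoded = 0 if qm == 0 else (-1 if qs else 1)
                mismatches += decoded != q
    return mismatches


print(count_mismatches())                       # 0
```

Under these assumptions q_s is the sign bit of the 4-bit sum, and q_m is 0 exactly when the three low-order sum bits are all 1, which within the valid range happens only for an estimate of −1/2.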
1/6 − (4/3) 2^{−δ} ≥ 2^{−t}
With δ = 4: 2^{−t} ≤ 1/6 − 1/12 = 1/12
−44/16 ≤ ŷ ≤ 42/16
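The bounds on m_k(i) can be evaluated for radix 4 and compared with the constants tabulated earlier for the radix-4 carry-save example (values in units of 1/16, i = 8, ..., 15). The sketch is an illustration under assumptions: t = 4 and δ = 4, d_i = i/16, and the bounds L̂_k = ⌈L_k⌉_t and Û_{k−1} = ⌊U_{k−1} − 2^{−t}⌋_t. The names are hypothetical.

```python
from fractions import Fraction
from math import ceil, floor

R, A, T, DELTA = 4, 2, 4, 4
RHO = Fraction(A, R - 1)
# Tabulated constants m_k(i) in units of 2**-T, for i = 8..15.
TABLE = {
    2: [12, 14, 15, 16, 18, 20, 20, 24],
    1: [4, 4, 4, 4, 6, 6, 8, 8],
    0: [-4, -6, -6, -6, -8, -8, -8, -8],
    -1: [-13, -15, -16, -18, -20, -20, -22, -24],
}


def constant_range(k, i):
    """Allowed range for m_k(i), in units of 2**-T, over [d_i, d_{i+1}]."""
    ds = (Fraction(i, 2 ** DELTA), Fraction(i + 1, 2 ** DELTA))
    lo = max(ceil((k - RHO) * d * 2 ** T) for d in ds)
    hi = min(floor((k - 1 + RHO) * d * 2 ** T - 1) for d in ds)
    return lo, hi


def table_is_valid():
    """Check every tabulated constant against its allowed range."""
    for k, row in TABLE.items():
        for i, m in zip(range(8, 16), row):
            lo, hi = constant_range(k, i)
            if not lo <= m <= hi:
                return False
    return True


print(constant_range(2, 8))     # (12, 12): m_2(8) is forced to 12/16
print(constant_range(1, 12))    # (5, 7): m_1(12) = 6/16 is one of three options
print(table_is_valid())         # True
```

Where the range contains several values, the choice is free; where it contains one, the tabulated constant is forced.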
[Figure (a): fragment of the P-D diagram, ŷ[j] from 5/8 to 12/8, showing the region of choice for m_2(i) between L̂_2 and Û_1 (with L_2 and U_1 for reference).]
[Figure (b): SELECTION FUNCTION with inputs ŷ (xxx.xxxx, 7 bits) and d̂ (0.1xxx, 3 bits), producing q_{j+1}.]
Figure 5.31: SELECTION FUNCTION FOR RADIX-4 SCHEME WITH CARRY-SAVE ADDER: (a) Fragment of P-D diagram. (b) Implementation.