Chapter 03
Arithmetic for Computers
§3.1 Introduction
Arithmetic for Computers/Processors
Representations
2’s complement representation for fixed-point N-bit INT
Std. IEEE754 FP32/64 representation
Fixed-point INT arithmetic vs. Floating-point (FP) arithmetic
General operations: Addition/subtraction, multiplication, division
Special DSP operations: fused multiply-and-accumulate (MAC),
butterfly unit, general matrix-matrix multiplication (GEMM), …
Efficient multiplication/division algorithms
Efficient implementation of adder, multiplier, and divider
and, or, nor : logical AND, logical OR, logical NOR
ALU control codes (4-bit ALUop): 0010 add, 0110 subtract, 0111 set-on-less-than, 1100 nor
[Figure: ALU with 32-bit operands A and B, a 32-bit Result, and Overflow and CarryOut outputs]
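The operation codes listed above can be sketched as a small behavioural model. This is only an illustration, not the gate-level design: the function name `alu`, the helper `to_signed`, and the choice to model only the four codes shown (add, subtract, set-on-less-than, nor) are assumptions made for this sketch.

```python
MASK32 = 0xFFFFFFFF  # keep results to 32 bits


def to_signed(x):
    """Interpret a 32-bit pattern as a two's complement integer."""
    return x - (1 << 32) if x & 0x80000000 else x


def alu(op, a, b):
    """Behavioural 32-bit ALU for the four control codes listed in the text."""
    if op == 0b0010:  # add
        return (a + b) & MASK32
    if op == 0b0110:  # subtract
        return (a - b) & MASK32
    if op == 0b0111:  # set-on-less-than (signed comparison)
        return 1 if to_signed(a) < to_signed(b) else 0
    if op == 0b1100:  # nor
        return ~(a | b) & MASK32
    raise ValueError("control code not modelled in this sketch")
```

For instance, `alu(0b0110, 1, 2)` yields the pattern `0xFFFFFFFF`, which is -1 in two's complement.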
32-Bit ALU Group Bit-Slice ALU
Design trick 1: divide and conquer
Break the problem into simpler problems, solve them and glue together
the solution
Design trick 2: solve part of the problem and extend
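[Figure: 32-bit ALU built from 32 one-bit slices, ALU0 to ALU31. Inputs: A (32 bits), B (32 bits), ALUop (4 bits). Slice i takes a_i, b_i and a carry-in and produces s_i and carry c_i; carries chain from c0 up to c31. Outputs: Result (32 bits), Overflow, Zero.]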
A 4-bit ALU Example
Design trick 3: take pieces you know (or can imagine) and try to put
them together
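[Figure: 4-bit ALU built from four 1-bit ALUs, each containing a 1-bit full adder. Each slice takes A_i, B_i and a carry-in and produces Result_i and a carry-out; the carry-out of one slice feeds the carry-in of the next (for example, CarryOut2 drives CarryIn3), and CarryOut3 is the final CarryOut.]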
Overflow Detection Logic
Overflow = CarryIn[N-1] XOR CarryOut[N-1]
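The overflow rule can be checked with a small bit-slice model. The sketch below is an assumption-laden illustration: the names `full_adder` and `ripple_add` are hypothetical, and the operands are treated as n-bit two's complement patterns.

```python
def full_adder(a, b, cin):
    """1-bit full adder: returns (sum_bit, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (a & cin) | (b & cin)
    return s, cout


def ripple_add(a, b, n=4, carry_in0=0):
    """Add two n-bit patterns one bit slice at a time.

    Returns (result, carry_out, overflow), where
    overflow = CarryIn[n-1] XOR CarryOut[n-1].
    """
    result = 0
    carry = carry_in0
    carry_into_msb = 0
    for i in range(n):
        if i == n - 1:
            carry_into_msb = carry  # carry entering the sign-bit slice
        s, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        result |= s << i
    overflow = carry_into_msb ^ carry
    return result, carry, overflow
```

With n = 4, adding 0111 (+7) and 0001 (+1) produces 1000 with the overflow flag set, while 1111 (-1) plus 0001 (+1) produces 0000 with a carry out but no overflow. Subtraction can reuse the same adder by passing the inverted second operand with `carry_in0=1`.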
Arithmetic for Multimedia
Graphics and media processing operates on vectors of
8-bit (byte) and 16-bit INT data
       1000   (multiplicand)
     × 1001   (multiplier)
       ----
       1000
      0000
     0000
    1000
    -------
    1001000   (product)
Hi = 00011111111111111111111111111111
Lo = 11000000000000000000000000000000
mfhi $t3    # $t3 = 00011111111111111111111111111111 (copy of Hi)
Example: 0010 × 0011 (Product register initially 0)
1. If bit 0 of the Multiplier is 1, add the Multiplicand to the Product
2. Shift Multiplicand register left 1 bit
3. Shift Multiplier register right 1 bit
32nd repetition? No (< 32 repetitions): go back to step 1. Yes (32 repetitions): Done

Iteration   Product     Multiplier   Multiplicand
0           0000 0000   0011         0000 0010
1           0000 0010   0001         0000 0100
2           0000 0110   0000         0000 1000
3           0000 0110   0000         0001 0000
4           0000 0110   0000         0010 0000   (Done)
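The shift-and-add loop and the 0010 × 0011 trace can be reproduced with a short sketch. It assumes unsigned n-bit operands and a 2n-bit Multiplicand and Product register; the function name `multiply_v1` and the trace format are hypothetical.

```python
def multiply_v1(multiplicand, multiplier, n=32):
    """First-version multiplier: shift Multiplicand left, Multiplier right.

    Returns (product, trace); each trace entry is
    (product, multiplier, multiplicand) after one repetition.
    """
    mask = (1 << (2 * n)) - 1      # Product and Multiplicand are 2n bits wide
    product = 0                    # Product register is initially 0
    mcand = multiplicand           # multiplicand starts in the right half
    mplier = multiplier
    trace = [(product, mplier, mcand)]
    for _ in range(n):
        if mplier & 1:             # step 1: test bit 0 of the Multiplier
            product = (product + mcand) & mask
        mcand = (mcand << 1) & mask  # step 2: shift Multiplicand left
        mplier >>= 1                 # step 3: shift Multiplier right
        trace.append((product, mplier, mcand))
    return product, trace
```

Calling `multiply_v1(0b0010, 0b0011, n=4)` walks through the same five register states as the table, ending with Product = 0000 0110.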
Observations
One clock cycle per step => too slow
Ratio of multiply to add ranges from 5:1 to 100:1
Half of the bits in the multiplicand are always 0
=> 64-bit adder is wasted
0s are inserted at the right of the multiplicand as it is shifted
=> least significant bits of the product never change once formed
Instead of shifting multiplicand to left, shift product to
right?
Product register wastes space => combine Multiplier and
Product register
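One way to picture the refinements suggested above is a single 2n-bit register whose right half starts out holding the multiplier. This is a minimal sketch assuming unsigned operands; `multiply_refined` is a hypothetical name, and Python's unbounded integers stand in for the adder's carry-out bit.

```python
def multiply_refined(multiplicand, multiplier, n=32):
    """Refined multiplier: n-bit adder, combined Product/Multiplier register."""
    half_mask = (1 << n) - 1
    product = multiplier & half_mask   # right half = multiplier, left half = 0
    for _ in range(n):
        if product & 1:                # bit 0 is the current multiplier bit
            # add the multiplicand into the left half only (n-bit add);
            # a carry out lands in bit 2n and is recovered by the shift below
            product += (multiplicand & half_mask) << n
        product >>= 1                  # shift the product right, not the multiplicand left
    return product
```

For 0010 × 0011 with n = 4 the register ends as 0000 0110, matching the first version while using an adder half as wide.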
Restoring division
Do the subtract, and if the remainder goes < 0, add the divisor back
[Long-division figure, surviving partial steps: 10, 101, 1010, - 1000, remainder 10]
Signed division
Divide using absolute values
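[Figure: division algorithm flowchart. The Divisor register initially holds the divisor in its left half; the Remainder register initially holds the dividend; the loop is done after 33 repetitions.]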
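The restoring scheme described above can be sketched as follows, assuming unsigned n-bit operands, a 2n-bit Divisor register and n + 1 repetitions (33 for n = 32). The names `divide_restoring` and `divide_signed` are hypothetical. The sign rule in `divide_signed` (quotient negative when the signs differ, remainder taking the sign of the dividend) is an assumption added here; the text only says to divide using absolute values.

```python
def divide_restoring(dividend, divisor, n=32):
    """First-version restoring division; returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    remainder = dividend           # Remainder register initially holds the dividend
    div = divisor << n             # divisor starts in the left half
    quotient = 0
    for _ in range(n + 1):
        remainder -= div           # do the subtract
        if remainder < 0:
            remainder += div       # went negative: add the divisor back
            quotient = quotient << 1
        else:
            quotient = (quotient << 1) | 1
        div >>= 1                  # shift the Divisor register right
    return quotient, remainder


def divide_signed(dividend, divisor, n=32):
    """Signed wrapper: divide absolute values, then fix the signs (assumed rule)."""
    q, r = divide_restoring(abs(dividend), abs(divisor), n)
    if (dividend < 0) != (divisor < 0):
        q = -q
    if dividend < 0:
        r = -r
    return q, r
```

For example, `divide_restoring(7, 2, n=4)` takes five repetitions and returns quotient 3, remainder 1; under the assumed sign rule, `divide_signed(-7, 2)` returns quotient -3, remainder -1.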
Observations
Half of the bits in divisor register always 0
=> 1/2 of 64-bit adder is wasted
=> 1/2 of divisor is wasted
Instead of shifting divisor to right,
shift remainder to left?
1st step cannot produce a 1 in quotient bit
(otherwise quotient is too big for the register)
=> switch order to shift first and then subtract
=> save 1 iteration
Eliminate Quotient register by combining with Remainder
register as shifted left
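The refinements listed above (shift the remainder left, shift before subtracting, and share one register between remainder and quotient) can be sketched as below. It assumes unsigned n-bit operands; `divide_refined` is a hypothetical name, and a Python integer plays the role of the combined register.

```python
def divide_refined(dividend, divisor, n=32):
    """Refined restoring division with a combined Remainder/Quotient register."""
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    mask = (1 << n) - 1
    rem = (dividend & mask) << 1       # dividend in the right half, shifted first
    for _ in range(n):                 # n repetitions instead of n + 1
        left = (rem >> n) - divisor    # subtract the divisor from the left half
        if left >= 0:
            rem = (left << n) | (rem & mask)
            rem = (rem << 1) | 1       # shift left, quotient bit 1 enters on the right
        else:
            rem = rem << 1             # restore (keep old left half), quotient bit 0
    quotient = rem & mask              # right half holds the quotient
    remainder = rem >> (n + 1)         # left half, shifted right once to undo the extra shift
    return quotient, remainder
```

With n = 4, `divide_refined(7, 2, n=4)` ends with the register holding 0010 0011: the right half 0011 is the quotient, and the left half 0010 shifted right once gives the remainder 1.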
Instructions
Not normalized: $+0.002 \times 10^{-4}$ and $+987.02 \times 10^{9}$
In binary
$\pm 1.xxxxxxx_2 \times 2^{yyyy}$
$x = (-1)^{S} \times (1 + F) \times 2^{(E - \text{Bias})}$
The C programming language uses the name float (or
double) for single-precision (or double-precision) FP
numbers.
Standard FP Representation
Defined by IEEE Std 754-1985
Two representations
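32-bit single-precision (SP) FP
64-bit double-precision (DP) FP
Relative precision
SP: approx $2^{-23}$
Equivalent to $23 \times \log_{10} 2 \approx 23 \times 0.3 \approx 6$ decimal
digits of precision
DP: approx $2^{-52}$
Equivalent to $52 \times \log_{10} 2 \approx 52 \times 0.3 \approx 16$ decimal
digits of precision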
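The precision figures above can be checked numerically. The sketch below uses Python's `struct` module to round a value to FP32; the helper name `to_fp32` is an assumption of this example, and Python's own `float` is taken to be an IEEE 754 double, which holds on common platforms.

```python
import math
import struct
import sys


def to_fp32(x):
    """Round a Python float (FP64) to the nearest FP32 value."""
    return struct.unpack(">f", struct.pack(">f", x))[0]


sp_step = 2.0 ** -23                # spacing of FP32 values just above 1.0
dp_step = 2.0 ** -52                # spacing of FP64 values just above 1.0
sp_digits = 23 * math.log10(2)      # about 6.9 decimal digits
dp_digits = 52 * math.log10(2)      # about 15.7 decimal digits

# 1 + 2^-23 survives a round trip through FP32, but 1 + 2^-25 rounds back to 1.0
survives = to_fp32(1.0 + sp_step) == 1.0 + sp_step
lost = to_fp32(1.0 + 2.0 ** -25) == 1.0
```

The DP figure matches `sys.float_info.epsilon`, which Python reports as 2^-52.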
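SP: 1 01111110 1000…00 (sign, 8-bit exponent, 23-bit fraction)
DP: 1 01111111110 1000…00 (sign, 11-bit exponent, 52-bit fraction)
Decoding the SP pattern 1 10000001 01000…00:
S = 1
Exponent = $10000001_2 = 129$
Fraction = $01000…00_2$
$x = (-1)^{1} \times (1 + .01_2) \times 2^{(129 - 127)} = -1.25 \times 4 = -5.0$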
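The bit patterns above can be cross-checked against the formula $(-1)^{S} \times (1 + F) \times 2^{(E - \text{Bias})}$. This is a minimal sketch for normalized single-precision values only; `fp32_bits` and `decode_fp32` are hypothetical helper names, and special values (zero, denormals, infinities, NaN) are deliberately not handled.

```python
import struct


def fp32_bits(x):
    """Return the 32-bit IEEE 754 single-precision pattern of x as a bit string."""
    (word,) = struct.unpack(">I", struct.pack(">f", x))
    return format(word, "032b")


def decode_fp32(bits):
    """Decode a normalized FP32 bit string with (-1)^S * (1 + F) * 2^(E - Bias)."""
    s = int(bits[0], 2)               # 1 sign bit
    e = int(bits[1:9], 2)             # 8 exponent bits, Bias = 127
    f = int(bits[9:], 2) / 2 ** 23    # 23 fraction bits as a value in [0, 1)
    return (-1) ** s * (1 + f) * 2.0 ** (e - 127)
```

Here `fp32_bits(-5.0)` gives sign 1, exponent 10000001 and fraction 0100…0, and feeding that string back through `decode_fp32` returns -5.0. Under the same formula, the SP pattern 1 01111110 1000…00 decodes to -0.75.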
Special numbers
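Range (SP, normalized): smallest magnitude $1.0 \times 2^{-126} \approx 1.2 \times 10^{-38}$
What if the result is too small? ($> 0$ but $< 1.2 \times 10^{-38}$ => Underflow!)
What if the result is too large? ($> 3.4 \times 10^{38}$ => Overflow!)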
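The two limits can be computed directly from the single-precision format. The sketch below is a simplification: `classify_fp32` is a hypothetical helper that compares magnitudes only, ignoring denormalized numbers and rounding effects right at the boundaries.

```python
FP32_MIN_NORMAL = 2.0 ** -126                       # about 1.2e-38
FP32_MAX = (2.0 - 2.0 ** -23) * 2.0 ** 127          # about 3.4e38


def classify_fp32(x):
    """Classify a real value against the normalized FP32 range."""
    m = abs(x)
    if m > FP32_MAX:
        return "overflow"      # too large for single precision
    if 0 < m < FP32_MIN_NORMAL:
        return "underflow"     # nonzero but below the smallest normalized value
    return "ok"
```

For example, `classify_fp32(1e39)` reports overflow and `classify_fp32(1e-39)` reports underflow, while both values are still representable in double precision.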
Guard and round bits: extra bits kept to the right of the significand
to guard against loss of bits during intermediate additions
They can later be shifted left into the significand during normalization
Sticky bit
Additional bit to the right of the round bit, set when any bit to its right is nonzero
Allows finer-grained rounding
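How the three extra bits drive a rounding decision can be sketched as follows. This example assumes round-to-nearest with ties to even, which the text does not specify; `round_with_grs` and its parameters are hypothetical, and at least two bits must be dropped.

```python
def round_with_grs(value, keep_bits, total_bits):
    """Round a total_bits-wide unsigned significand to keep_bits bits.

    Returns (rounded, (guard, round_bit, sticky)). Assumes round to nearest,
    ties to even, and total_bits - keep_bits >= 2.
    """
    drop = total_bits - keep_bits
    kept = value >> drop
    guard = (value >> (drop - 1)) & 1               # first bit beyond the kept bits
    round_bit = (value >> (drop - 2)) & 1           # second bit beyond the kept bits
    sticky = int((value & ((1 << (drop - 2)) - 1)) != 0)  # OR of everything further right
    # round up when the dropped part is more than half an ulp,
    # or exactly half and the kept value is odd
    if guard and (round_bit or sticky or (kept & 1)):
        kept += 1                                   # may carry out; caller must renormalize
    return kept, (guard, round_bit, sticky)
```

Rounding 1011 100 to four bits is an exact tie, so it goes to the even neighbour 1100, whereas 1010 100 stays at 1010; 1010 101 is just above the tie because the sticky bit is set, so it rounds up to 1011.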
Optional variations
I: integer operand
P: pop operand from stack
R: reverse operand order
But not all combinations allowed