
Basic Arithmetic and the ALU

Forecast
• Representing numbers, 2’s Complement, unsigned
• Addition and subtraction
• Add/Sub ALU
  • full adder, ripple carry, subtraction, together
• Carry-Lookahead addition, etc.
• Logical operations
  • and, or, xor, nor, shifts - barrel shifter
• Overflow, MMX

Basic Arithmetic and the ALU

Later
• Integer multiplication, division
• Floating point arithmetic
• Not crucial for the project

© 2000 by Mark D. Hill, CS/ECE 552 Lecture Notes: Chapter 4

Background

Recall:
n bits give rise to 2^n combinations
let us call a string of 32 bits “b31 b30 . . . b3 b2 b1 b0”
No inherent meaning
• one interpretation f(b31 . . . b4 b3 b2 b1 b0) -> value
• another f(b31 . . . b4 b3 b2 b1 b0) -> control signals

Background

32-bit types include
• unsigned integers
• signed integers
• single-precision floating point
• MIPS instructions (A.10)
Unsigned integers

f(b31 . . . b0) = b31 x 2^31 + . . . + b1 x 2^1 + b0 x 2^0
Treat as normal binary number
e.g., 0 . . . 011010101
  = 1 x 2^7 + 1 x 2^6 + 0 x 2^5 + 1 x 2^4 + 0 x 2^3 + 1 x 2^2 + 0 x 2^1 + 1 x 2^0
  = 128 + 64 + 16 + 4 + 1 = 213
max f(111 . . . 11) = 2^32 - 1 = 4,294,967,295
min f(000 . . . 00) = 0
range [0, 2^32 - 1] => # values (2^32 - 1) - 0 + 1 = 2^32

Signed Integers

2’s Complement
f(b31 b30 . . . b1 b0) = -b31 x 2^31 + . . . + b1 x 2^1 + b0 x 2^0
max f(0111 . . . 11) = 2^31 - 1 = 2147483647
min f(100 . . . 00) = -2^31 = -2147483648 (asymmetric)
range [-2^31, 2^31 - 1] => # values (2^31 - 1) - (-2^31) + 1 = 2^32
E.g., -6
• 000 . . . 0110 --> 111 . . . 1001 + 1 --> 111 . . . 1010
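The two interpretations on these slides can be checked with a short sketch. The helper names (`unsigned_value`, `twos_complement_value`, `negate`) are hypothetical and chosen only for this illustration; bit strings are written MSB first.

```python
def unsigned_value(bits: str) -> int:
    """Interpret a bit string (MSB first) as an unsigned integer."""
    return sum(int(b) << i for i, b in enumerate(reversed(bits)))


def twos_complement_value(bits: str) -> int:
    """Interpret a bit string as 2's complement: the MSB has weight -2^(n-1)."""
    n = len(bits)
    return unsigned_value(bits[1:]) - (int(bits[0]) << (n - 1))


def negate(bits: str) -> str:
    """Negate by inverting every bit and adding 1, keeping the same width."""
    n = len(bits)
    inverted = "".join("1" if b == "0" else "0" for b in bits)
    return format((unsigned_value(inverted) + 1) % (1 << n), f"0{n}b")


print(unsigned_value("11010101"))              # 213
print(twos_complement_value("0" + "1" * 31))   # 2147483647
print(twos_complement_value("1" + "0" * 31))   # -2147483648
print(negate("00000110"))                      # 11111010, i.e. -6 in 8 bits
```

The same bit pattern yields different values only because a different f is applied to it; the bits themselves carry no inherent meaning.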

Why 2’s Complement

why not use signed magnitude?
2’s complement makes computer arithmetic simpler
just like humans don’t work with Roman Numerals
Representation affects ease of calculation, not the answer

[Figure: two 3-bit number wheels, 2’s complement (left) and signed magnitude (right)]

bits   2’s complement   signed magnitude
000          0                 0
001          1                 1
010          2                 2
011          3                 3
100         -4                -0
101         -3                -1
110         -2                -2
111         -1                -3

Addition and Subtraction

4-bit unsigned example

    0 0 1 1     3
  + 1 0 1 0    10
    -------
    1 1 0 1    13

4-bit 2’s Complement - ignoring overflow

    0 0 1 1     3
  + 1 0 1 0    -6
    -------
    1 1 0 1    -3
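The 4-bit addition example uses one bit-level sum for both readings. A minimal sketch, with hypothetical helper names `add_4bit` and `as_signed4`:

```python
def add_4bit(a: int, b: int) -> int:
    """Add two 4-bit patterns and keep only the low 4 bits (overflow ignored)."""
    return (a + b) & 0xF


def as_signed4(x: int) -> int:
    """Read a 4-bit pattern as 2's complement."""
    return x - 16 if x & 0x8 else x


s = add_4bit(0b0011, 0b1010)
print(format(s, "04b"), s, as_signed4(s))   # 1101 13 -3
```

The adder does not need to know which interpretation is intended, which is the practical argument for 2’s complement over signed magnitude.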
Subtraction

A - B = A + 2’s complement of B
E.g., 3 - 2

    0 0 1 1     3
  + 1 1 1 0    -2
    -------
    0 0 0 1     1

Full Adder

full adder (a, b, cin) --> (cout, s)
cout = two or more of (a, b, cin)
s = exactly one or three of (a, b, cin)
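The full adder definition ("two or more" for cout, "exactly one or three" for s) can be written down directly and compared with the usual gate-level form. This is a sketch; the function name `full_adder` is chosen for illustration.

```python
def full_adder(a: int, b: int, cin: int) -> tuple[int, int]:
    """Return (cout, s) for one-bit inputs a, b, cin."""
    ones = a + b + cin
    cout = 1 if ones >= 2 else 0       # two or more inputs are 1
    s = 1 if ones in (1, 3) else 0     # exactly one or three inputs are 1
    return cout, s


# Check all eight input combinations against majority (cout) and parity (s).
for a in (0, 1):
    for b in (0, 1):
        for cin in (0, 1):
            cout, s = full_adder(a, b, cin)
            assert cout == (a & b) | (a & cin) | (b & cin)
            assert s == a ^ b ^ cin
            assert 2 * cout + s == a + b + cin
```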

Ripple-carry Adder

Just concatenate the full adders

[Figure: 32 full adders in a chain with inputs a0 b0, a1 b1, a2 b2, . . . , a31 b31; cin enters the first full adder, each carry-out feeds the next stage, and the last stage produces Cout]

Ripple-carry Subtractor

A - B = A + (-B) => invert B and set Cin0 to 1

[Figure: the same chain drawn for 4 bits with inputs a0 b0 . . . a3 b3, each b inverted, the first carry-in tied to 1, and Cout from the last stage]
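A ripple-carry adder and subtractor can be modeled by chaining one-bit full adders. The sketch below assumes bit lists with index 0 as the LSB; all function names are hypothetical.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (cout, s)."""
    return (a & b) | (a & cin) | (b & cin), a ^ b ^ cin


def ripple_add(a_bits, b_bits, cin=0):
    """Add two equal-length bit lists (index 0 = LSB); returns (sum_bits, cout)."""
    carry, out = cin, []
    for a, b in zip(a_bits, b_bits):
        carry, s = full_adder(a, b, carry)   # the carry ripples to the next stage
        out.append(s)
    return out, carry


def ripple_sub(a_bits, b_bits):
    """A - B = A + (NOT B) + 1: invert every b bit and set the first carry-in to 1."""
    return ripple_add(a_bits, [1 - b for b in b_bits], cin=1)


def to_bits(x, n):
    """Split an integer into n bits, LSB first."""
    return [(x >> i) & 1 for i in range(n)]


def from_bits(bits):
    """Rebuild an integer from bits, LSB first."""
    return sum(b << i for i, b in enumerate(bits))


print(from_bits(ripple_add(to_bits(3, 4), to_bits(10, 4))[0]))   # 13
print(from_bits(ripple_sub(to_bits(3, 4), to_bits(2, 4))[0]))    # 1
```

The loop makes the delay problem visible: stage i cannot finish until stage i-1 has produced its carry.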
Combined Ripple-carry Adder/Subtractor

control = 1 => subtract
XOR B with control and set Cin0 to control

[Figure: full adders in a ripple chain with inputs a0 b0, a1 b1, a2 b2, . . . , a31 b31 and output Cout; the operation (control) signal is applied to the b inputs and to the first carry-in]

Carry Lookahead

The above ALU is too slow
• gate delays for add = 32 x FA + XOR ~= 64

Theoretically:
• In parallel
  • sum0 = f(cin, a0, b0)
  • sumi = f(cin, ai . . . a0, bi . . . b0)
  • sum31 = f(cin, a31 . . . a0, b31 . . . b0)
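The combined adder/subtractor can be modeled at word level: every bit of b is XORed with control, and control also serves as the carry-in of bit 0. This is a behavioral sketch rather than a gate-level model, and `add_sub` is a hypothetical name.

```python
def add_sub(a: int, b: int, control: int, n: int = 32) -> tuple[int, int]:
    """n-bit adder/subtractor: control = 0 adds, control = 1 subtracts.

    Returns (result, carry_out).
    """
    mask = (1 << n) - 1
    b_xor = (b ^ (mask if control else 0)) & mask   # XOR each b bit with control
    total = (a & mask) + b_xor + control            # control doubles as carry-in
    return total & mask, total >> n


print(add_sub(3, 10, 0, n=4))   # (13, 0)
print(add_sub(3, 2, 1, n=4))    # (1, 1)
```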

Carry Lookahead

Need compromise
build tree so delay is O(log2 n) for n bits
E.g., 2 x 5 gate delays for 32-bits
We will give the basic idea with (a) 4-bit then (b) 16-bit adder
A little convoluted!

Carry Lookahead

    0101      0100
    0011      0110

Need both to generate and at least one to propagate
Define: gi = ai * bi    ## carry generate
        pi = ai + bi    ## carry propagate
Recall: ci+1 = ai * bi + ai * ci + bi * ci
             = ai * bi + (ai + bi) * ci
             = gi + pi * ci
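The recurrence c(i+1) = g(i) + p(i) * c(i) can be traced on the two 4-bit operand pairs shown on the slide. A minimal sketch, with the hypothetical function name `carries`:

```python
def carries(a: int, b: int, c0: int = 0, n: int = 4) -> list[int]:
    """Return [c0, c1, ..., cn] using c(i+1) = g(i) + p(i) * c(i)."""
    c = [c0]
    for i in range(n):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        g = ai & bi              # carry generate: both inputs are 1
        p = ai | bi              # carry propagate: at least one input is 1
        c.append(g | (p & c[i]))
    return c


print(carries(0b0101, 0b0011))   # [0, 1, 1, 1, 0]
print(carries(0b0100, 0b0110))   # [0, 0, 0, 1, 0]
```

In the first pair, bit 0 generates a carry and bits 1 and 2 propagate it; in the second pair, bit 1 could propagate but no carry arrives, and bit 2 generates one.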
Carry Lookahead

Therefore
c1 = g0 + p0 * c0
c2 = g1 + p1 * c1 = g1 + p1 * (g0 + p0 * c0)
   = g1 + p1 * g0 + p1 * p0 * c0
c3 = g2 + p2 * g1 + p2 * p1 * g0 + p2 * p1 * p0 * c0
c4 = g3 + p3 * g2 + p3 * p2 * g1 + p3 * p2 * p1 * g0 + p3 * p2 * p1 * p0 * c0

Uses one level to form pi and gi, two levels for carry
But, this needs n+1 fanin at the OR and the rightmost AND

4-bit Carry Lookahead Adder

[Figure: the Carry Lookahead Block takes c0 and the g, p signals of each bit (formed from a3 b3 . . . a0 b0) and produces c4 plus the carries c3 c2 c1 c0 for the bit cells, which output s3 s2 s1 s0]
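The expanded carry equations can be evaluated directly, with no carry depending on another carry. The following sketch (function name `cla4` is hypothetical) implements them term by term.

```python
def cla4(a: int, b: int, c0: int = 0) -> tuple[int, int]:
    """4-bit carry-lookahead add: returns (sum, c4)."""
    g = [(a >> i) & (b >> i) & 1 for i in range(4)]       # generate per bit
    p = [((a >> i) | (b >> i)) & 1 for i in range(4)]     # propagate per bit
    # Each carry is a two-level AND/OR function of g, p and c0 only.
    c1 = g[0] | p[0] & c0
    c2 = g[1] | p[1] & g[0] | p[1] & p[0] & c0
    c3 = g[2] | p[2] & g[1] | p[2] & p[1] & g[0] | p[2] & p[1] & p[0] & c0
    c4 = (g[3] | p[3] & g[2] | p[3] & p[2] & g[1]
          | p[3] & p[2] & p[1] & g[0] | p[3] & p[2] & p[1] & p[0] & c0)
    c = [c0, c1, c2, c3]
    s = sum((((a >> i) ^ (b >> i) ^ c[i]) & 1) << i for i in range(4))
    return s, c4


print(cla4(0b0101, 0b0011))   # (8, 0)
print(cla4(0b1111, 0b0001))   # (0, 1)
```

The c4 expression has five product terms and its last term has five inputs, which is the n+1 fan-in the slide warns about.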

Hierarchical Carry Lookahead for 16 bits

Build 16-bit adder from four 4-bit adders

[Figure: a Carry Lookahead Block sits on top of four 4-bit adders. Each adder takes a and b for bits 12-15, 8-11, 4-7, and 0-3, sends its G and P up to the block, receives c12, c8, c4, c0, and outputs s12-15, s8-11, s4-7, s0-3. The block takes c0 and produces the carry out (labeled c15 in the figure)]

Hierarchical Carry Lookahead for 16 bits

Figure out Generate and Propagate for 4-bits together

G0,3 = g3 + p3 * g2 + p3 * p2 * g1 + p3 * p2 * p1 * g0
P0,3 = p3 * p2 * p1 * p0    (Notation a little different from the book)

G4,7 = g7 + p7 * g6 + p7 * p6 * g5 + p7 * p6 * p5 * g4
P4,7 = p7 * p6 * p5 * p4

G12,15 = g15 + p15 * g14 + p15 * p14 * g13 + p15 * p14 * p13 * g12
P12,15 = p15 * p14 * p13 * p12
Carry Lookahead Basics

Fill in the holes in G’s and P’s:
Gi,k = Gj+1,k + Pj+1,k * Gi,j    (assume i < j+1 < k)
Pi,k = Pi,j * Pj+1,k

G0,7 = G4,7 + P4,7 * G0,3          P0,7 = P0,3 * P4,7
G8,15 = G12,15 + P12,15 * G8,11    P8,15 = P8,11 * P12,15
G0,15 = G8,15 + P8,15 * G0,7       P0,15 = P0,7 * P8,15

Carry Lookahead: Compute G’s and P’s

[Figure: tree of group signals. Leaves: G12,15/P12,15, G8,11/P8,11, G4,7/P4,7, G0,3/P0,3. Middle level: G8,15/P8,15 and G0,7/P0,7. Root: G0,15/P0,15]
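The combine rule for adjacent groups can be sketched as a small function and used to build the 16-bit tree. Everything below is illustrative: `combine`, `group_gp`, and `carry16` are hypothetical names, and the carry equations at the end assume the usual relation carry-out = G + P * carry-in for a group.

```python
def combine(lo, hi):
    """Merge (G, P) of a low block i..j with (G, P) of the adjacent high block j+1..k."""
    g_lo, p_lo = lo
    g_hi, p_hi = hi
    return g_hi | (p_hi & g_lo), p_lo & p_hi


def bit_gp(a: int, b: int, i: int):
    """Per-bit (g, p) for bit i."""
    return (a >> i) & (b >> i) & 1, ((a >> i) | (b >> i)) & 1


def group_gp(a: int, b: int, lo: int, hi: int):
    """(G, P) for bits lo..hi, built by folding in one bit at a time."""
    gp = bit_gp(a, b, lo)
    for i in range(lo + 1, hi + 1):
        gp = combine(gp, bit_gp(a, b, i))
    return gp


def carry16(a: int, b: int, c0: int = 0):
    """Return (c4, c8, c12, c16) computed from the group signals."""
    blocks = [group_gp(a, b, k, k + 3) for k in (0, 4, 8, 12)]
    gp07 = combine(blocks[0], blocks[1])
    gp815 = combine(blocks[2], blocks[3])
    gp015 = combine(gp07, gp815)
    c4 = blocks[0][0] | (blocks[0][1] & c0)
    c8 = gp07[0] | (gp07[1] & c0)
    c12 = blocks[2][0] | (blocks[2][1] & c8)
    c16 = gp015[0] | (gp015[1] & c0)
    return c4, c8, c12, c16


print(carry16(0xFFFF, 0x0001))   # (1, 1, 1, 1)
```

Because `combine` is applied pairwise, the depth of the tree grows with log2 of the number of blocks instead of with the number of bits.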

Carry Lookahead: Compute c’s

[Figure: carries flow back down the tree. c0 with G0,7/P0,7 gives c8; c0 with G0,3/P0,3 gives c4; c8 with G8,11/P8,11 gives c12. Then c12, c8, c4, c0 combine with the per-bit signals g12-g15/p12-p15, g8-g11/p8-p11, g4-g7/p4-p7, g0-g3/p0-p3 to give the remaining carries]

Other Adders: Carry Select

Two adds in parallel - one with Cin 0 and the other with Cin 1
• When Cin is done, select the right result

[Figure: two adders, labeled 0 and 1, work on the same inputs with carry-in 0 and carry-in 1; a 2-1 mux controlled by select picks the result and produces the next select]
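The carry-select idea can be sketched as follows. The block size, widths, and the name `carry_select_add` are assumptions made for illustration; n is assumed to be a multiple of the block size.

```python
def carry_select_add(a: int, b: int, cin: int = 0, n: int = 8, block: int = 4):
    """n-bit add built from `block`-bit pieces; returns (sum, carry_out)."""
    mask = (1 << block) - 1
    result, carry = 0, cin
    for lo in range(0, n, block):
        x, y = (a >> lo) & mask, (b >> lo) & mask
        sum0 = x + y          # computed assuming carry-in = 0
        sum1 = x + y + 1      # computed in parallel assuming carry-in = 1
        chosen = sum1 if carry else sum0    # the 2-1 mux, once carry-in is known
        result |= (chosen & mask) << lo
        carry = chosen >> block             # becomes the select for the next block
    return result, carry


print(carry_select_add(0xFF, 0x01))      # (0, 1)
print(carry_select_add(0x3C, 0x47, 1))   # (132, 0)
```

In hardware both candidate sums are ready before the real carry arrives, so each block adds only a mux delay to the carry path; the sequential loop here models the selection, not the timing.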
Other Adders: Carry Save

              A + B -> S
Save carries  A + B -> S, Cout
Use Cin       A + B + C -> S1, S2    (3# to 2# in parallel)

Used in combinational multipliers by building a Wallace Tree

[Figure: a CSA block with inputs c, b, a and outputs c, s]

Wallace Tree

[Figure: six operands f, e, d, c, b, a enter two CSAs; their outputs pass through further CSA levels until two values remain, which a final adder (Add) sums]
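A carry-save adder reduces three operands to two without propagating carries. The sketch below works on non-negative Python integers; `carry_save_add` and `csa_reduce` are hypothetical names, and `csa_reduce` applies CSAs one after another rather than in the layered arrangement of a true Wallace tree.

```python
def carry_save_add(a: int, b: int, c: int) -> tuple[int, int]:
    """3-to-2 compression: returns (s, carry) with a + b + c == s + carry."""
    s = a ^ b ^ c                                   # per-bit sum, no carry propagation
    carry = ((a & b) | (a & c) | (b & c)) << 1      # per-bit carry, saved and shifted up
    return s, carry


def csa_reduce(values: list[int]) -> int:
    """Reduce operands with CSAs until two remain, then do one ordinary add."""
    values = list(values)
    while len(values) > 2:
        s, carry = carry_save_add(*values[:3])
        values = values[3:] + [s, carry]
    return sum(values)


print(carry_save_add(5, 6, 7))          # (4, 14)
print(csa_reduce([1, 2, 3, 4, 5, 6]))   # 21
```

Only the final two-operand add needs a carry-propagating adder, which is why multipliers use this structure to sum partial products.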

Logical Operations

Bitwise AND, OR, XOR, NOR
• Implement with 32 gates in parallel
Shifts and rotates
• rol -> rotate left (MSB --> LSB)
• ror -> rotate right (LSB --> MSB)
• sll -> shift left logical (0 --> LSB)
• srl -> shift right logical (0 --> MSB)
• sra -> shift right arithmetic (old MSB --> new MSB)

Shifter

E.g., Shift left logical for d<7:0> and shamt<2:0>
Using 2-1 Muxes called Mux(select, in0, in1)
stage0<7:0> = Mux(shamt<0>, d<7:0>, d<6:0> || 0)
stage1<7:0> = Mux(shamt<1>, stage0<7:0>, stage0<5:0> || 00)
dout<7:0> = Mux(shamt<2>, stage1<7:0>, stage1<3:0> || 0000)

For a barrel shifter, use wider muxes
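The three-stage mux shifter can be sketched in a few lines. This assumes the shift-left-logical reading, where zeros enter at the LSB and each stage shifts by 0 or by 1, 2, or 4 positions; `mux` follows the slide's Mux(select, in0, in1) convention and `sll8` is a hypothetical name.

```python
def mux(select: int, in0: int, in1: int) -> int:
    """2-1 mux: in0 when select is 0, in1 when select is 1."""
    return in1 if select else in0


def sll8(d: int, shamt: int) -> int:
    """Shift an 8-bit value left by shamt (0..7) using three mux stages."""
    stage0 = mux(shamt & 1, d & 0xFF, (d << 1) & 0xFF)             # by 0 or 1
    stage1 = mux((shamt >> 1) & 1, stage0, (stage0 << 2) & 0xFF)   # by 0 or 2
    return mux((shamt >> 2) & 1, stage1, (stage1 << 4) & 0xFF)     # by 0 or 4


print(format(sll8(0b00001011, 3), "08b"))   # 01011000
```

Each bit of shamt controls one stage, so an n-bit shifter needs log2(n) mux levels.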
Shifter

[Figure: three rows of 2-1 muxes.
 Row 1, shift based on 0th bit by 0 or 1 (select shamt0): inputs d7 d6 . . . d1 d0 and 0; output stage0 (s07 . . . s00).
 Row 2, shift based on 1st bit by 0 or 2 (select shamt1): inputs s07 . . . s00 and 0; output stage1 (s17 . . . s10).
 Row 3, shift based on 2nd bit by 0 or 4 (select shamt2): inputs s17 . . . s10 and 0; output dout (dout7 . . . dout0).]

All Together

[Figure: ALU block diagram. Labels: inputs a and b; control signals operation, invert, and carryin; blocks Mux and Add; output result]

Overflow

with n-bits only 2^n combinations
Unsigned [0, 2^n - 1], 2’s Complement [-2^(n-1), 2^(n-1) - 1]

Unsigned Add
5 + 6 > 7

     101
   + 110
    ----
    1011

f(3:0) = a(2:0) + b(2:0) => overflow = f(3) ;; carryout

Overflow

More involved for 2’s Complement
-1 + -1 = -2

     111
   + 111
    ----
    1110

110 = -2 is correct => can’t just use carry-out
Addition Overflow

When is overflow NOT possible? p1, p2 > 0 and n1, n2 < 0
  p1 + p2
  p1 + n1    not possible
  n1 + p2    not possible
  n1 + n2

overflow = X * NOT a(2) * NOT b(2) + Y * a(2) * b(2)
What are X and Y?

Addition Overflow

2 + 3 = 5 > 3

     010
   + 011
    ----
     101 = -3 < 0!      In general, X = f(2)

-1 + -4

     111
   + 100
    ----
    1011 which is 011 > 0      In general, Y = NOT f(2)

Overflow = f(2) * NOT a(2) * NOT b(2) + NOT f(2) * a(2) * b(2)
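The 3-bit examples can be replayed with a short sketch that returns both the carry-out (unsigned overflow) and the 2’s complement overflow computed from the sign bits. The function name `add3` is hypothetical.

```python
def add3(a: int, b: int):
    """3-bit add: returns (f, carry_out, signed_overflow) for 3-bit patterns a, b."""
    total = (a & 7) + (b & 7)
    f = total & 7
    carry_out = total >> 3                    # f(3): unsigned overflow
    a2, b2, f2 = (a >> 2) & 1, (b >> 2) & 1, (f >> 2) & 1
    # overflow = f(2) * NOT a(2) * NOT b(2) + NOT f(2) * a(2) * b(2)
    overflow = (f2 & (1 - a2) & (1 - b2)) | ((1 - f2) & a2 & b2)
    return f, carry_out, overflow


print(add3(0b101, 0b110))   # (3, 1, 1): unsigned 5 + 6 exceeds 7
print(add3(0b111, 0b111))   # (6, 1, 0): -1 + -1 = -2, carry-out but no signed overflow
print(add3(0b010, 0b011))   # (5, 0, 1): 2 + 3 wraps to -3
print(add3(0b111, 0b100))   # (3, 1, 1): -1 + -4 wraps to +3
```

The second and third cases show that carry-out and signed overflow are independent signals.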

Subtraction Overflow

No overflow on a-b if signs are same
neg - pos ==> neg ;; overflow otherwise
pos - neg ==> pos ;; overflow otherwise

overflow = f(2) * NOT a(2) * b(2) + NOT f(2) * a(2) * NOT b(2)

What to do on overflow

Ignore!
Flag - condition code that may be tested by software
sticky flag - e.g., for floating point
trap - possibly with mask
Zero and Negative

zero = NOT (f(2) + f(1) + f(0))
can’t also look at f(3) because

     001    +1
   + 111    -1
    ----
    1000     0

So, negative = f(2)

Zero and Negative

May also want correct answer even on overflow
negative = (a < b) = (a-b < 0) even if overflow
E.g., is -4 < 2?

     100    -4
   - 010     2
    ----
    1010    -6 => overflow

If you work it out,
negative = f(2) XOR overflow
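The subtraction overflow rule and the corrected negative flag can be checked together on 3-bit values. This is a sketch with hypothetical names (`signed3`, `sub3`); subtraction is done as a plus inverted b plus a carry-in of 1.

```python
def signed3(x: int) -> int:
    """Read a 3-bit pattern as 2's complement."""
    return x - 8 if x & 4 else x


def sub3(a: int, b: int):
    """3-bit a - b: returns (f, zero, overflow, negative)."""
    f = (a + ((~b) & 7) + 1) & 7              # a + inverted b + carry-in of 1
    a2, b2, f2 = (a >> 2) & 1, (b >> 2) & 1, (f >> 2) & 1
    # overflow = f(2) * NOT a(2) * b(2) + NOT f(2) * a(2) * NOT b(2)
    overflow = (f2 & (1 - a2) & b2) | ((1 - f2) & a2 & (1 - b2))
    zero = 1 if f == 0 else 0                 # NOR of the three result bits
    negative = f2 ^ overflow                  # sign of the true difference
    return f, zero, overflow, negative


print(sub3(0b100, 0b010))   # (2, 0, 1, 1): -4 - 2 overflows, yet -4 < 2 is reported
```

XORing the sign bit with overflow recovers the sign of the exact difference, so a set-less-than comparison stays correct even when the 3-bit result wraps.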

MMX

MMX [Peleg & Weiser, IEEE Micro, Aug. 96]
• Goal 2x performance in audio, video, etc.
• Key technique: SIMD - single instruction multiple data
• 1999 Streaming SIMD Extensions in same spirit

Data types
  1 x 64 bit quad word
  2 x 32 bit double-word
  4 x 16 bit word
  8 x 8 bit byte

MMX, cont.

E.g., ADDB (for byte)

     17    87   100   ... 6 more
   + 17    13   200   ... 6 more
   ----  ----  ----   ...
     34   100    44 == 300 mod 256
             or 255 == maximum value

E.g., 16 element dot product from matrix multiply
• [a1 ... a16] x [b1 ... b16] = a1*b1 + ... + a16*b16
• IA-32: 32 loads, 16 *, 15 +, 12 loop control = 76 instr.
• MMX: 16 instr.
• Cycles 200 for int, 76 for FP, & 12 for MMX (6x over FP!)
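The byte-wise add example shows two possible results for the lane that exceeds 255: wraparound or saturation. A plain-Python sketch of both behaviors follows; `packed_add_bytes` is a hypothetical name, not an MMX intrinsic, and the lanes beyond the first three are filler rather than values from the notes.

```python
def packed_add_bytes(x: list[int], y: list[int], saturate: bool = False) -> list[int]:
    """Add eight unsigned bytes lane by lane; no carry crosses between lanes."""
    assert len(x) == len(y) == 8
    if saturate:
        return [min(a + b, 255) for a, b in zip(x, y)]   # clamp at the maximum value
    return [(a + b) % 256 for a, b in zip(x, y)]         # wrap around mod 256


x = [17, 87, 100, 0, 0, 0, 0, 0]   # remaining lanes are filler
y = [17, 13, 200, 0, 0, 0, 0, 0]
print(packed_add_bytes(x, y)[:3])                  # [34, 100, 44]
print(packed_add_bytes(x, y, saturate=True)[:3])   # [34, 100, 255]
```

Saturation is usually the better choice for pixel or audio samples, where wrapping from bright to dark is more objectionable than clipping.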
MMX, cont.
Others: MOV, (UN)PACK, & MASK (e.g., next)
15 15 100 120 101 76 15 15
15 15 15 15 15 15 15 15
--------------------------------
FF FF 00 00 00 00 FF FF
Why? Weatherperson at 00’s & weathermap at FF’s
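The compare-and-mask example can be sketched per byte: equal lanes yield FF, others 00, and the mask then blends two images. The function names and the weathermap pixel values below are hypothetical; only the first operand, the value 15, and the resulting mask come from the slide.

```python
def compare_eq_mask(x: list[int], y: list[int]) -> list[int]:
    """Per-byte compare: 0xFF where lanes are equal, 0x00 elsewhere."""
    return [0xFF if a == b else 0x00 for a, b in zip(x, y)]


def select(mask: list[int], if_set: list[int], if_clear: list[int]) -> list[int]:
    """Per-byte blend: (mask AND if_set) OR (NOT mask AND if_clear)."""
    return [(m & s) | (~m & 0xFF & c) for m, s, c in zip(mask, if_set, if_clear)]


person = [15, 15, 100, 120, 101, 76, 15, 15]   # 15 marks the background
key = [15] * 8
weathermap = [1, 2, 3, 4, 5, 6, 7, 8]          # made-up pixel values

mask = compare_eq_mask(person, key)
print([format(m, "02X") for m in mask])   # ['FF', 'FF', '00', '00', '00', '00', 'FF', 'FF']
print(select(mask, weathermap, person))   # [1, 2, 100, 120, 101, 76, 7, 8]
```

The blend uses only AND, OR, and NOT on whole words, so no per-pixel branch is needed.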
Comments
• Backward compatible & no OS changes (overload FP regs)
• Others have similar: Sun, HP, and now Intel SSE
• ISVs (e.g., for games) have not (yet) embraced it
