Computer Architecture ECE 361 Lecture 5: The Design Process & ALU Design
361 design.1
Quick Review of Last Lecture
361 design.2
MIPS ISA Design Objectives and Implications
361 design.3
MIPS jump, branch, compare instructions
Instruction            Example           Meaning
set on less than       slt $1,$2,$3      if ($2 < $3) $1=1; else $1=0
                                         Compare less than; 2’s comp.
set less than imm.     slti $1,$2,100    if ($2 < 100) $1=1; else $1=0
                                         Compare < constant; 2’s comp.
set less than uns.     sltu $1,$2,$3     if ($2 < $3) $1=1; else $1=0
                                         Compare less than; natural numbers
set l. t. imm. uns.    sltiu $1,$2,100   if ($2 < 100) $1=1; else $1=0
                                         Compare < constant; natural numbers
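The difference between the 2's complement and natural-number compares can be sketched in Python. This is only an illustrative model of slt and sltu on 32-bit register patterns; the helper names (to_signed, slt, sltu) are assumptions made for this example, not part of the lecture.

```python
MASK32 = 0xFFFFFFFF

def to_signed(x):
    """Interpret a 32-bit pattern as a two's complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def slt(rs, rt):
    """Signed compare: 1 if rs < rt as two's complement values, else 0."""
    return 1 if to_signed(rs) < to_signed(rt) else 0

def sltu(rs, rt):
    """Unsigned compare: 1 if rs < rt as natural numbers, else 0."""
    return 1 if (rs & MASK32) < (rt & MASK32) else 0

# The same bit pattern compares differently under the two views:
# 0xFFFFFFFF is -1 as a signed value but 4294967295 as a natural number.
minus_one = 0xFFFFFFFF
signed_result = slt(minus_one, 1)
unsigned_result = sltu(minus_one, 1)
```

Here signed_result is 1 and unsigned_result is 0, which is why the ISA needs both variants.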
[Figure: MIPS addressing modes]
Register:      operand in a register
Immediate:     op | rs | rt | immed
Base+index:    op | rs | rt | immed    Memory[register + immed]
PC-relative:   op | rs | rt | immed    Memory[PC + immed]
361 design.5
MIPS Instruction Formats
361 design.6
MIPS Operation Overview
° Arithmetic logical
° SLL, SRL
° Memory Access
° SW, SB
361 design.7
Branch & Pipelines
[Figure: pipeline timing diagram. Axis: Time. Example instruction: li r3, #7 (execute).]
[Figure: course overview collage]
Arithmetic
[Figure: Booth multiplier datapath. 32-bit Multiplicand register (LoadMp), sign-extended 32=>34, x2/x1 Mux, 34-bit ALU (Sub/Add), Booth Encoder (ENC[2:0]), HI and LO registers (16x2 bits each; LoadHI, ClearHI, LoadLO, ShiftAll), Control Logic.]
[Figure: Processor-Memory Performance Gap. Performance (log scale, 1 to 1000) vs. year, starting at 1980. µProc (CPU): 60%/yr, "Moore’s Law" (2X/1.5 yr). DRAM: 9%/yr (2X/10 yrs). Gap grows 50% / year.]
Single/multicycle
Datapaths
Pipelining
I/O
361 design.9
Memory Systems
Outline of Today’s Lecture
° Refinements
361 design.10
The Design Process
361 design.11
Design Process
361 design.12
Design Refinement
Informal System Requirement
Initial Specification
Intermediate Specification
refinement
increasing level of detail
Final Architectural Description
Physical Implementation
361 design.13
Design as Search
Problem A
Strategy 1 Strategy 2
-- Given design space of components & assemblies, which part will yield
the best solution?
° Requirements?
361 design.15
MIPS ALU requirements
361 design.16
MIPS arithmetic instruction format
          31     25     20     15            5      0
R-type:   op  |  Rs  |  Rt  |  Rd  |  funct
I-Type:   op  |  Rs  |  Rt  |  Immed 16
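As a rough illustration of how these fields are pulled out of a 32-bit instruction word, the sketch below uses the standard MIPS field boundaries (op 31-26, rs 25-21, rt 20-16, rd 15-11, shamt 10-6, funct 5-0, immediate 15-0). The shamt field and the function name decode are not shown on the slide and are assumptions of this example.

```python
def decode(word):
    """Split a 32-bit MIPS instruction word into its fields."""
    return {
        "op":    (word >> 26) & 0x3F,   # bits 31..26
        "rs":    (word >> 21) & 0x1F,   # bits 25..21
        "rt":    (word >> 16) & 0x1F,   # bits 20..16
        "rd":    (word >> 11) & 0x1F,   # bits 15..11 (R-type only)
        "shamt": (word >> 6) & 0x1F,    # bits 10..6  (R-type only)
        "funct": word & 0x3F,           # bits 5..0   (R-type only)
        "immed": word & 0xFFFF,         # bits 15..0  (I-type only)
    }

# R-type example: add $1,$2,$3 (op = 0, funct = 0x20)
add_word = (0 << 26) | (2 << 21) | (3 << 16) | (1 << 11) | 0x20
r_fields = decode(add_word)

# I-type example: addi $1,$2,100 (op = 8)
addi_word = (8 << 26) | (2 << 21) | (1 << 16) | 100
i_fields = decode(addi_word)
```

Both formats share the op, rs and rt positions, so a decoder can extract every field unconditionally and let the opcode decide which ones are meaningful.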
° Break the problem into simpler problems, solve them and glue together
the solution
° Example: assume the immediates have been taken care of before the
ALU
• 10 operations (4 bits)
  code   operation
  00     add
  01     addU
  02     sub
  03     subU
  04     and
  05     or
  06     xor
  07     nor
  12     slt
  13     sltU
361 design.18
Refined Requirements
(1) Functional Specification
inputs: 2 x 32-bit operands A, B, 4-bit mode (sort of control)
outputs: 32-bit result S, 1-bit carry, 1 bit overflow
operations: add, addu, sub, subu, and, or, xor, nor, slt, sltU
[Figure: ALU block symbol. Inputs: A (32), B (32), m (4). Outputs: S (32), c, ovf.]
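The functional specification can be read as a behavioral model. The Python sketch below is one possible interpretation, assuming the mode codes from the operation list are decimal values, that only add and sub report overflow (addu and subu do not), and that slt and sltU return 0 or 1 in S. The names alu and MASK are illustrative.

```python
MASK = 0xFFFFFFFF

def alu(a, b, m):
    """Return (S, carry, ovf) for 32-bit operands a, b and 4-bit mode m."""
    a &= MASK
    b &= MASK
    carry = ovf = 0
    if m in (0, 1, 2, 3, 12, 13):            # operations that use the adder
        sub = m not in (0, 1)                # sub, subu, slt, sltU subtract
        bb = (~b & MASK) if sub else b       # invert B for subtraction
        total = a + bb + (1 if sub else 0)   # add one via the carry-in
        s = total & MASK
        carry = total >> 32
        # Signed overflow: both adder inputs differ in sign from the result.
        v = (((a ^ s) & (bb ^ s)) >> 31) & 1
        if m in (0, 2):                      # add, sub report overflow
            ovf = v
        elif m == 12:                        # slt: sign of the true difference
            s = (s >> 31) ^ v
        elif m == 13:                        # sltU: no carry out means a < b
            s = 1 - carry
    elif m == 4:
        s = a & b
    elif m == 5:
        s = a | b
    elif m == 6:
        s = a ^ b
    elif m == 7:
        s = ~(a | b) & MASK
    else:
        raise ValueError("unsupported mode")
    return s, carry, ovf
```

For example, alu(0x7FFFFFFF, 1, 0) sets ovf because the signed sum does not fit in 32 bits, while the same operands with mode 1 (addu) leave ovf at 0.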
361 design.19
Behavioral Representation: VHDL
Entity ALU is
generic (c_delay: time := 20 ns;
S_delay: time := 20 ns);
...
S <= A + B;
361 design.20
Design Decisions
ALU
bit slice
° ...
361 design.21
Refined Diagram: bit-slice ALU
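[Figure: bit-slice ALU. Inputs A (32), B (32), M (4). One 1-bit slice per bit position, from a0, b0 up to a31, b31; each slice takes the mode m and cin and produces co and a result bit (s0 ... s31). Outputs: S (32), Ovflw.]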
361 design.22
7-to-2 Combinational Logic
127
361 design.23
A One Bit ALU
[Figure: one-bit ALU. A and B feed a 1-bit Full Adder (CarryIn, CarryOut); a Mux selects Result.]
361 design.24
A One-bit Full Adder
° This is also called a (3, 2) adder
° Half Adder: no CarryIn (inputs A and B only)
[Figure: 1-bit Full Adder block with inputs A, B, CarryIn and output C.]
Inputs            Outputs
A  B  CarryIn     CarryOut  Sum
0  0  0           0         0
0  0  1           0         1
0  1  0           0         1
0  1  1           1         0
1  0  0           0         1
1  0  1           1         0
1  1  0           1         0
1  1  1           1         1
° CarryOut = (!A & B & CarryIn) | (A & !B & CarryIn) | (A & B & !CarryIn)
| (A & B & CarryIn)
° Sum = (!A & !B & CarryIn) | (!A & B & !CarryIn) | (A & !B & !CarryIn)
| (A & B & CarryIn)
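The two sum-of-products equations can be checked exhaustively. The following Python sketch transcribes them directly (full_adder is an illustrative name) and confirms that CarryOut and Sum together encode A + B + CarryIn for all eight input combinations.

```python
from itertools import product

def full_adder(a, b, cin):
    """1-bit full adder built from the sum-of-products equations."""
    na, nb, nc = 1 - a, 1 - b, 1 - cin   # complemented inputs
    carry_out = (na & b & cin) | (a & nb & cin) | (a & b & nc) | (a & b & cin)
    s = (na & nb & cin) | (na & b & nc) | (a & nb & nc) | (a & b & cin)
    return carry_out, s

# CarryOut has weight 2 and Sum has weight 1 in the arithmetic result.
for a, b, cin in product((0, 1), repeat=3):
    carry_out, s = full_adder(a, b, cin)
    assert 2 * carry_out + s == a + b + cin
```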
361 design.27
Logic Equation for Sum (continue)
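° Sum = (!A & !B & CarryIn) | (!A & B & !CarryIn) | (A & !B & !CarryIn)
| (A & B & CarryIn)
° Truth table for XOR:
X  Y  X XOR Y
0  0  0
0  1  1
1  0  1
1  1  0
° Sum is 1 exactly when an odd number of inputs are 1, so Sum = A XOR B XOR CarryIn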
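A quick way to convince yourself that the four-term Sum equation collapses to a chain of XORs is to compare the two forms on every input. This is a small Python check; the function names are made up for the example.

```python
from itertools import product

def sum_sop(a, b, cin):
    """Sum as the four-term sum-of-products from the slide."""
    na, nb, nc = 1 - a, 1 - b, 1 - cin
    return (na & nb & cin) | (na & b & nc) | (a & nb & nc) | (a & b & cin)

def sum_xor(a, b, cin):
    """Sum as a chain of two XOR gates."""
    return a ^ b ^ cin

# True when both forms agree on all eight input combinations.
all_match = all(
    sum_sop(a, b, cin) == sum_xor(a, b, cin)
    for a, b, cin in product((0, 1), repeat=3)
)
```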
361 design.28
Logic Diagrams for CarryOut and Sum
[Figure: gate-level logic diagrams for CarryOut and Sum, with inputs A, B, CarryIn.]
361 design.29
Seven plus a MUX ?
° Design trick 2: take pieces you know (or can imagine) and try to put
them together
[Figure: 1-bit ALU slice. A and B feed an and gate, an or gate, and a 1-bit Full Adder (CarryIn, CarryOut); S-select drives a Mux that picks Result from and, or, add.]
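The "known pieces plus a mux" idea can be sketched as a tiny Python model of one ALU slice. The S-select encoding (0 = and, 1 = or, 2 = add) is an assumption made for this example; the slide does not give the actual encoding.

```python
AND, OR, ADD = 0, 1, 2   # hypothetical S-select encoding

def alu_1bit(a, b, carry_in, select):
    """One ALU slice: all functions are computed, the mux picks one."""
    and_out = a & b
    or_out = a | b
    add_out = a ^ b ^ carry_in                           # full adder sum
    carry_out = (a & b) | (a & carry_in) | (b & carry_in)
    result = (and_out, or_out, add_out)[select]          # the mux
    return result, carry_out
```

Note that the adder's CarryOut is produced regardless of the selected function; it only matters when the slice is used for addition.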
361 design.30
A 4-bit ALU
[Figure: 4-bit ALU built from four 1-bit ALUs. Slice i takes Ai, Bi, CarryIni and produces Resulti and CarryOuti; CarryOut0 feeds CarryIn1, CarryOut1 feeds CarryIn2, CarryOut2 feeds CarryIn3. Inset: the 1-bit ALU (Mux plus 1-bit Full Adder).]
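Chaining the slices can be modeled in a few lines of Python. This sketch assumes a hypothetical select encoding (0 = and, 1 = or, 2 = add) and the names alu_1bit and alu_4bit; it only illustrates how each CarryOut feeds the next CarryIn.

```python
def alu_1bit(a, b, carry_in, select):
    """One slice; select: 0 = and, 1 = or, 2 = add (assumed encoding)."""
    carry_out = (a & b) | (a & carry_in) | (b & carry_in)
    return (a & b, a | b, a ^ b ^ carry_in)[select], carry_out

def alu_4bit(a, b, carry_in, select):
    """Ripple the carry through four slices, bit 0 first."""
    result = 0
    carry = carry_in
    for i in range(4):
        bit, carry = alu_1bit((a >> i) & 1, (b >> i) & 1, carry, select)
        result |= bit << i
    return result, carry      # carry is CarryOut3

five_plus_three = alu_4bit(0b0101, 0b0011, 0, 2)
```

Here five_plus_three is (8, 0): the 4-bit result 1000 with no carry out of the top slice.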
361 design.31
How About Subtraction?
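[Figure: 4-bit "ALU" with subtraction. A (4) and the output of a 2x1 Mux (4) feed the ALU; the Mux selects B (input 0) or !B (input 1) under Sel. Subtract drives Sel and CarryIn. Outputs: Result (4), Zero, CarryOut.]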
361 design.32
Additional operations
° A - B = A + (– B)
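• form the two's complement by inverting and adding one
[Figure: 1-bit ALU slice with an invert control on B. A and B (optionally inverted) feed and, or, and a 1-bit Full Adder (CarryIn, CarryOut); S-select drives the Mux that produces Result.]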
Set-less-than? – left as an exercise
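The identity A - B = A + (–B) with "invert and add one" can be sketched for a 4-bit datapath as follows. The function name add_sub_4bit and the single subtract control are illustrative assumptions: the control inverts B and also supplies the "add one" through the adder's carry-in.

```python
def add_sub_4bit(a, b, subtract):
    """4-bit add or subtract using one adder.

    When subtract is 1, B is inverted and CarryIn is 1, so the adder
    computes A + (!B) + 1, which is A - B in two's complement.
    """
    b_in = (~b & 0xF) if subtract else (b & 0xF)
    total = (a & 0xF) + b_in + (1 if subtract else 0)
    return total & 0xF, total >> 4      # (result, carry_out)

seven_minus_three = add_sub_4bit(7, 3, 1)   # result 0100 = 4
three_minus_seven = add_sub_4bit(3, 7, 1)   # result 1100 = -4
```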
361 design.33
Revised Diagram
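[Figure: revised bit-slice ALU. Inputs A (32), B (32), M (4). Combinational logic (C/L) turns M into the select, comp, and c-in signals for the 1-bit slices (a0, b0 ... a31, b31; cin, co). Outputs: S (32), Ovflw (logic still marked "?").]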
361 design.34
Overflow
Decimal Binary Decimal 2’s Complement
0 0000 0 0000
1 0001 -1 1111
2 0010 -2 1110
3 0011 -3 1101
4 0100 -4 1100
5 0101 -5 1011
6 0110 -6 1010
7 0111 -7 1001
-8 1000
° Examples: 7 + 3 = 10 but ...
° -4 - 5 = -9 but ...
  0 1 1 1                  1 0
    0 1 1 1     7            1 1 0 0    –4
  + 0 0 1 1     3          + 1 0 1 1    –5
    1 0 1 0    –6            0 1 1 1     7
361 design.35
Overflow Detection
° Overflow: the result is too large (or too small) to represent properly
• Example: -8 <= 4-bit binary number <= 7
  0 1 1 1                  1 0
    0 1 1 1     7            1 1 0 0    –4
  + 0 0 1 1     3          + 1 0 1 1    –5
    1 0 1 0    –6            0 1 1 1     7
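The two examples above can be reproduced with a short Python sketch. It flags overflow when the carry into the most significant bit differs from the carry out of it, which is the usual detection rule for two's complement addition; the function names are assumptions for this example.

```python
def add4_with_overflow(a, b):
    """Add two 4-bit two's complement patterns; return (result, overflow)."""
    a &= 0xF
    b &= 0xF
    carry_into_msb = ((a & 0x7) + (b & 0x7)) >> 3   # carry out of bit 2
    total = a + b
    carry_out = total >> 4                          # carry out of bit 3
    return total & 0xF, carry_into_msb ^ carry_out

def to_signed4(x):
    """Interpret a 4-bit pattern as a two's complement value."""
    return x - 16 if x & 0x8 else x

pos_case = add4_with_overflow(0b0111, 0b0011)   # 7 + 3
neg_case = add4_with_overflow(0b1100, 0b1011)   # -4 + -5
```

In both cases the overflow flag is 1, and to_signed4 of the results gives -6 and 7, matching the wrong answers shown in the examples.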
361 design.36
Overflow Detection Logic
[Figure: overflow detection logic for the 4-bit ALU; surviving label: CarryOut3.]
361 design.37
Zero Detection Logic
[Figure: zero detection logic for the 4-bit ALU; surviving label: CarryOut3.]
361 design.38
More Revised Diagram
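[Figure: bit-slice ALU with overflow logic. Inputs A (32), B (32), M (4). C/L produces select, comp, c-in for the 1-bit slices (a0, b0 ... a31, b31). Ovflw = signed-arith and (cin xor co) at the most significant slice. Output S (32).]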
361 design.39
But What about Performance?
[Figure: 4-bit ripple-carry ALU. Four 1-bit ALUs (A0, B0 ... A3, B3; Result0 ... Result3) chained so that CarryOut0 feeds CarryIn1, CarryOut1 feeds CarryIn2, CarryOut2 feeds CarryIn3, ending in CarryOut3. Inset: one slice with inputs A, B, CarryIn and output CarryOut.]
361 design.41
Carry Look Ahead (Design trick: peek)
A  B  C-out
0  0  0       "kill"
0  1  C-in    "propagate"
1  0  C-in    "propagate"
1  1  1       "generate"

P = A or B
G = A and B

C1 = G0 + C0 • P0
C2 = G1 + G0 • P1 + C0 • P0 • P1
C3 = G2 + G1 • P2 + G0 • P1 • P2 + C0 • P0 • P1 • P2
C4 = . . .
[Figure: column of adder cells, each with inputs A, B, Cin and outputs S, G, P; the G and P outputs feed the carry equations above.]
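The lookahead equations for C1 to C3 can be checked against an ordinary ripple chain. The Python sketch below uses P = A or B and G = A and B as defined above; the function names are illustrative, and the comparison loop covers every 4-bit operand pair.

```python
def cla_carries(a, b, c0):
    """Compute C1, C2, C3 directly from G, P and C0 (no rippling)."""
    g = [(a >> i) & (b >> i) & 1 for i in range(4)]      # G = A and B
    p = [((a >> i) | (b >> i)) & 1 for i in range(4)]    # P = A or B
    c1 = g[0] | (c0 & p[0])
    c2 = g[1] | (g[0] & p[1]) | (c0 & p[0] & p[1])
    c3 = g[2] | (g[1] & p[2]) | (g[0] & p[1] & p[2]) | (c0 & p[0] & p[1] & p[2])
    return c1, c2, c3

def ripple_carries(a, b, c0):
    """Reference: the same carries produced one bit at a time."""
    carries, c = [], c0
    for i in range(3):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        c = (ai & bi) | (ai & c) | (bi & c)
        carries.append(c)
    return tuple(carries)

agree = all(
    cla_carries(a, b, c0) == ripple_carries(a, b, c0)
    for a in range(16) for b in range(16) for c0 in (0, 1)
)
```

Each lookahead carry depends only on the operand bits and C0, so all three can be evaluated in parallel in two gate levels once G and P are known.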
361 design.42
Plumbing as Carry Lookahead Analogy
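[Figure: plumbing analogy for carry lookahead. Pipes for the carries c1, c2, and c4 are built from c0 and the generate (g0 to g3) and propagate (p0 to p3) sections of each bit.]
361 design.43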
The Idea Behind Carry Lookahead (Continue)
° Recall: CarryOut = (B & CarryIn) | (A & CarryIn) | (A & B)
• Cin1 = Cout0 = (B0 & Cin0) | (A0 & Cin0) | (A0 & B0)
• Cin2 = Cout1 = (B1 & Cin1) | (A1 & Cin1) | (A1 & B1)
° With gi = Ai & Bi and pi = Ai | Bi, we can rewrite:
• Cin1 = g0 | (p0 & Cin0)
• Cin2 = g1 | (p1 & g0) | (p1 & p0 & Cin0)
• Cin3 = g2 | (p2 & g1) | (p2 & p1 & g0) | (p2 & p1 & p0 & Cin0)
[Figure: two chained 1-bit ALUs; Cin0 enters the first, Cout0 feeds Cin1, and Cout1 feeds Cin2.]
361 design.45
Cascaded Carry Look-ahead (16-bit): Abstraction
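[Figure: 16-bit adder built from 4-bit Adders. Each 4-bit Adder produces group G and P signals; a carry lookahead (CLA) unit takes C0 and the group signals (G0, P0, ...) and produces the carry into each 4-bit Adder.]
C1 = G0 + C0 • P0
C2 = G1 + G0 • P1 + C0 • P0 • P1
C3 = G2 + G1 • P2 + G0 • P1 • P2 + C0 • P0 • P1 • P2
C4 = . . .
361 design.46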
2nd level Carry, Propagate as Plumbing
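[Figure: plumbing analogy for the second-level signals. Group propagate P0 is formed from p0, p1, p2, p3 in series; group generate G0 is formed from g3, or g2 through p3, or g1 through p2 and p3, or g0 through p1, p2, and p3.]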
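A two-level lookahead scheme can be sketched in Python as follows. The group formulas used here (P = p0 & p1 & p2 & p3, G = g3 | p3 & g2 | p3 & p2 & g1 | p3 & p2 & p1 & g0) are the standard ones and match the plumbing picture; the function names are assumptions, and Python's + stands in for each 4-bit adder block.

```python
def group_gp(a4, b4):
    """Group generate and propagate for one 4-bit block."""
    g = [(a4 >> i) & (b4 >> i) & 1 for i in range(4)]
    p = [((a4 >> i) | (b4 >> i)) & 1 for i in range(4)]
    G = g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1]) | (p[3] & p[2] & p[1] & g[0])
    P = p[0] & p[1] & p[2] & p[3]
    return G, P

def add16(a, b, c0=0):
    """16-bit add: second-level lookahead supplies each block's carry-in."""
    blocks = [((a >> (4 * k)) & 0xF, (b >> (4 * k)) & 0xF) for k in range(4)]
    G, P = zip(*(group_gp(x, y) for x, y in blocks))
    C = [
        c0,
        G[0] | (c0 & P[0]),
        G[1] | (G[0] & P[1]) | (c0 & P[0] & P[1]),
        G[2] | (G[1] & P[2]) | (G[0] & P[1] & P[2]) | (c0 & P[0] & P[1] & P[2]),
    ]
    result = 0
    for k, (x, y) in enumerate(blocks):
        result |= ((x + y + C[k]) & 0xF) << (4 * k)   # 4-bit adder block
    return result
```

The block carries C[1] to C[3] use exactly the same equations as the bit-level C1 to C3, only with group signals in place of per-bit ones.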
361 design.47
A Partial Carry Lookahead Adder
° Common practices:
• Connect several N-bit lookahead adders to form a big adder
• Example: connect four 8-bit carry lookahead adders to form
a 32-bit partial carry lookahead adder
[Figure: four 8-bit Carry Lookahead Adders in a chain; carries C0, C8, C16, C24 enter successive adders, each producing 8 result bits.]
[Figure: two n-bit adders in series. Critical path: CP(2n) = 2*CP(n).]
361 design.49
Carry Select
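[Figure: 8-bit adder from two 4-bit ALUs in series. The low ALU takes A[3:0], B[3:0], and CarryIn and produces Result[3:0]; the high ALU takes A[7:4] and B[7:4] and produces Result[7:4] and CarryOut.]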
361 design.50
Carry Select (Continue)
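[Figure: carry select. The low ALU adds A[3:0] and B[3:0] and produces C4. Two ALUs add A[7:4] and B[7:4] in parallel: one with carry-in 0, producing X[7:4] and C0, and one with carry-in 1, producing Y[7:4] and C1. Two 2 to 1 MUXes with Sel = C4 choose Result[7:4] (X or Y) and CarryOut (C0 or C1).]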
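The carry select idea (compute the upper half for both possible carries, then pick) can be sketched in Python. The signal names X, Y, C0, C1, and C4 follow the figure labels; the function names are assumptions for this example, and Python's + stands in for each 4-bit ALU.

```python
def add4(a, b, cin):
    """A 4-bit adder block: returns (4-bit sum, carry out)."""
    total = (a & 0xF) + (b & 0xF) + cin
    return total & 0xF, total >> 4

def carry_select_add8(a, b, carry_in=0):
    """8-bit add where the upper half is precomputed for both carries."""
    low, c4 = add4(a, b, carry_in)        # lower half produces C4
    x, c0 = add4(a >> 4, b >> 4, 0)       # upper half, guessing carry 0
    y, c1 = add4(a >> 4, b >> 4, 1)       # upper half, guessing carry 1
    # The two muxes: Sel = C4 picks the matching result and carry out.
    high, carry_out = (y, c1) if c4 else (x, c0)
    return (high << 4) | low, carry_out
```

In hardware the three adders run concurrently, so the delay is roughly one 4-bit add plus a mux instead of two 4-bit adds in series, at the cost of a duplicated upper adder.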
361 design.51
Carry Skip Adder: reduce worst case delay
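[Figure: carry skip adder. Two adder blocks (inputs A, B starting at A0 and A4; output S), each combining its propagate signals P0, P1, P2, P3.]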
361 design.52
Additional MIPS ALU requirements
361 design.53
Elements of the Design Process
361 design.54
Summary of the Design Process
Hierarchical Design to manage complexity
Block Diagrams
Optimization Criteria:
361 design.55