Lec06 ALU
2024 Spring
Overview
1 Overview
2 Addition Unit
3 Multiplication & Division
4 Shifter
Overview
Abstract Implementation View
Arithmetic
Where we’ve been: abstractions
• Instruction Set Architecture (ISA)
• Assembly and machine language
[Figure: ALU block. Inputs A and B (32 bits each) and a 4-bit operation select m; outputs a 32-bit result and 1-bit zero and ovf flags.]
Machine Number Representation
• Real systems have to provide for more than just integers, e.g., fractions and real numbers (floating point) and alphanumeric data (characters)
• Conventions define the relationships between bits and numbers
RISC-V Representation
Two’s Complement Operations
• Negating a two’s complement number – complement all the bits and then add a 1
• remember: “negate” and “invert” are quite different!
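The difference between "invert" and "negate" can be sketched in a few lines. This is a minimal illustration assuming 32-bit values; the helper names invert, negate, and to_signed are made up for this example.

```python
MASK32 = 0xFFFFFFFF  # keep every value within 32 bits

def invert(x: int) -> int:
    """Complement all 32 bits."""
    return ~x & MASK32

def negate(x: int) -> int:
    """Two's complement negation: complement all the bits, then add 1."""
    return (invert(x) + 1) & MASK32

def to_signed(x: int) -> int:
    """Read a 32-bit pattern as a signed two's complement value."""
    return x - (1 << 32) if x & 0x80000000 else x

print(to_signed(invert(5)))  # -6
print(to_signed(negate(5)))  # -5
```

Inverting 5 gives -6, while negating it gives -5; the two differ by exactly the "add a 1" step.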
Design the RISC-V Arithmetic Logic Unit (ALU)
RV32I with the M extension (RV32IM):
add, sub, mul, mulh, mulhu, mulhsu,
div, divu, rem, li, addi, sll, srl,
sra, or, xor, not, slt, sltu, slli,
srli, srai, andi, ori, xori, slti, ...
I-Type:  Imm[11:0] (31:20) | rs1 (19:15) | funct3 (14:12) | rd (11:7) | opcode (6:0)
R-Type:  funct7 (31:25) | rs2 (24:20) | rs1 (19:15) | funct3 (14:12) | rd (11:7) | opcode (6:0)

I-Type
Instr   opcode    funct3   Imm[11:5]
ADDI    0010011   000      xx (any)
SLLI    0010011   001      0000000
SLTI    0010011   010      xx
SLTIU   0010011   011      xx
SRLI    0010011   101      0000000
SRAI    0010011   101      0100000
ORI     0010011   110      xx
ANDI    0010011   111      xx

R-Type
Instr   opcode    funct7    funct3
ADD     0110011   0000000   000
SUB     0110011   0100000   000
SLL     0110011   0000000   001
SLT     0110011   0000000   010
SLTU    0110011   0000000   011
XOR     0110011   0000000   100
SRL     0110011   0000000   101
SRA     0110011   0100000   101
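The field layout above can be exercised with a small decoder. This is only a sketch: the function name decode and the dictionary keys are illustrative, and the immediate is returned without sign extension.

```python
def decode(instr: int) -> dict:
    """Split a 32-bit instruction word into its R-type / I-type fields."""
    return {
        "opcode": instr & 0x7F,            # bits 6:0
        "rd":     (instr >> 7) & 0x1F,     # bits 11:7
        "funct3": (instr >> 12) & 0x7,     # bits 14:12
        "rs1":    (instr >> 15) & 0x1F,    # bits 19:15
        "rs2":    (instr >> 20) & 0x1F,    # bits 24:20 (R-type)
        "funct7": (instr >> 25) & 0x7F,    # bits 31:25 (R-type)
        "imm":    (instr >> 20) & 0xFFF,   # bits 31:20 (I-type, not sign extended)
    }

# sub x3, x1, x2 assembled by hand from the R-type table row for SUB
word = (0b0100000 << 25) | (2 << 20) | (1 << 15) | (0b000 << 12) | (3 << 7) | 0b0110011
fields = decode(word)
print(f"{fields['opcode']:07b} {fields['funct7']:07b} {fields['funct3']:03b}")
```

The opcode, funct7, and funct3 values together tell the control logic which ALU operation to select.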
Addition Unit
Building a 1-bit Binary Adder
• Just connect the carry-out of the least significant bit FA to the carry-in of the next least significant bit and connect ...
• Ripple Carry Adder (RCA)
• Advantage: simple logic, so small (low cost)
[Figure: chain of 1-bit full adders (FA). Stage i takes Ai, Bi, and carry ci and produces sum Si and carry ci+1; c0 = carry_in.]
Glitch
Glitch: an invalid and unpredicted output that can be read by the next stage and result in a wrong action
[Figure: input I drives a NOT gate whose output is O1; I and O1 feed an OR gate whose output is O. A timing diagram plots I, O1, and O against time T.]
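A glitch of this kind can be reproduced with a toy discrete-time simulation. The sketch assumes the circuit computes O = I OR O1 with O1 = NOT I, that only the NOT gate has delay, and that time advances in unit steps; the function name simulate and the parameter not_delay are invented for the example.

```python
def simulate(i_wave, not_delay=1):
    """Return the O waveform of O = I OR NOT(I) when the NOT gate is delayed.

    i_wave:    samples of input I (0 or 1), one per time step
    not_delay: number of steps by which O1 lags behind I
    """
    o_wave = []
    for t, i in enumerate(i_wave):
        # O1 reflects the input as it was not_delay steps ago
        past_i = i_wave[max(t - not_delay, 0)]
        o1 = 1 - past_i
        o_wave.append(i | o1)
    return o_wave

i_wave = [1, 1, 1, 0, 0, 0]
print(simulate(i_wave, not_delay=0))  # [1, 1, 1, 1, 1, 1]
print(simulate(i_wave, not_delay=1))  # [1, 1, 1, 0, 1, 1]
```

Logically O should always be 1, but with a delayed inverter the output dips to 0 for one step when I falls. That short dip is the glitch.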
Glitch in RCA
[Figure: 32-bit ripple carry adder. Full adders FA0 to FA31 take Ai, Bi, and ci and produce Si and ci+1; c0 = carry_in and c32 = carry_out.]

Full adder truth table:
A  B  carry_in | carry_out  S
0  0  0        | 0          0
0  0  1        | 0          1
0  1  0        | 0          1
0  1  1        | 1          0
1  0  0        | 0          1
1  0  1        | 1          0
1  1  0        | 1          0
1  1  1        | 1          1
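The truth table and the ripple structure can be checked with a short model. This is a behavioural sketch only (it does not model gate delays or glitches); the function names full_adder and ripple_carry_add are illustrative.

```python
def full_adder(a: int, b: int, carry_in: int):
    """1-bit full adder: returns (carry_out, s)."""
    s = a ^ b ^ carry_in
    carry_out = (a & b) | (a & carry_in) | (b & carry_in)
    return carry_out, s

def ripple_carry_add(a: int, b: int, carry_in: int = 0, width: int = 32):
    """Add two width-bit values by chaining full adders from bit 0 upward."""
    result, carry = 0, carry_in
    for i in range(width):
        # the carry-out of stage i becomes the carry-in of stage i + 1
        carry, s = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        result |= s << i
    return result, carry  # carry is the final carry_out

# Reprint the truth table: A B carry_in carry_out S
for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            print(a, b, c, *full_adder(a, b, c))
```

Because each stage waits for the carry from the stage below it, the worst-case delay grows with the number of bits.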
But What about Performance?
[Figure: four 1-bit ALUs connected in ripple fashion. Stage i takes Ai, Bi, and CarryIni and produces Resulti and CarryOuti.]
A 32-bit Ripple Carry Adder/Subtractor
add/sub control (0 = add, 1 = sub):
• complement all the bits: each FA receives B0 if control = 0, !B0 if control = 1
• add a 1 in the least significant bit: the add/sub control drives c0 = carry_in

Example:
  A    0111        0111
  B  - 0110  ->  + 1001
                      1    (carry_in)
       ----      ------
       0001      1 0001

[Figure: 32 chained 1-bit full adders (FA) with inputs A0..A31 and B0..B31, carries c1..c31, sums S0..S31, and c32 = carry_out.]
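The adder/subtractor trick (A - B computed as A + !B + 1) can be sketched as follows. The function name add_sub is illustrative, and Python's + stands in for the chain of full adders.

```python
def add_sub(a: int, b: int, control: int, width: int = 32):
    """control = 0 computes a + b; control = 1 computes a - b as a + !b + 1.

    Returns (result, carry_out).
    """
    mask = (1 << width) - 1
    b_in = (b ^ mask) if control else b            # mux: B or !B
    total = (a & mask) + (b_in & mask) + control   # control also feeds carry_in
    return total & mask, total >> width

result, carry_out = add_sub(0b0111, 0b0110, control=1, width=4)
print(f"{carry_out} {result:04b}")  # 1 0001
```

The same hardware serves both operations; only the single control bit changes.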
Tailoring the ALU to the ISA
• Also need to support the logic operations (and, nor, or, xor)
• Bitwise operations (no carry operation involved)
• Need a logic gate for each function and a mux to choose the output
• Also need to support the set-on-less-than instruction (slt)
• Uses subtraction to determine if (a − b) < 0 (implies a < b)
• Also need to support test for equality (bne, beq)
• Again use subtraction: (a − b) = 0 implies a = b
• Also need to add overflow detection hardware
• overflow detection enabled only for add, addi, sub
• Immediates are sign extended outside the ALU with wiring (i.e., no logic needed)
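A behavioural sketch of these requirements is given below. It covers the logic operations, add/sub, and the zero output used for the equality test; the string operation names and the helper beq_taken are assumptions made for illustration, not the lecture's actual control encoding.

```python
MASK32 = 0xFFFFFFFF

def alu(a: int, b: int, op: str):
    """Return (result, zero) for a 32-bit ALU; op names are illustrative."""
    if op == "and":
        result = a & b
    elif op == "or":
        result = a | b
    elif op == "xor":
        result = a ^ b
    elif op == "add":
        result = (a + b) & MASK32
    elif op == "sub":
        result = (a - b) & MASK32
    else:
        raise ValueError(f"unsupported op: {op}")
    zero = int(result == 0)   # zero flag: 1 when every result bit is 0
    return result, zero

def beq_taken(a: int, b: int) -> bool:
    """Equality test by subtraction: (a - b) = 0 implies a = b."""
    _, zero = alu(a, b, "sub")
    return zero == 1

print(alu(0b1100, 0b1010, "xor"))  # (6, 0)
print(beq_taken(42, 42))           # True
```

The if/elif chain plays the role of the output mux: every function unit computes in parallel in hardware and op selects one result.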
A Simple ALU Cell with Logic Op Support
[Figure: 1-bit ALU cell. Inputs A and B feed the logic gates and a 1-bit FA with carry_in and carry_out; add/subt selects B or its complement; a mux controlled by op (inputs 0, 1, 2, 3, 6, 7) selects result, with the FA on input 6.]
Modifying the ALU Cell for slt
[Figure: the same cell with an extra less input connected to mux input 7.]
Modifying the ALU for slt
• First perform a subtraction
• Make the result 1 if the subtraction yields a negative result
• Tie the most significant sum bit (sign bit) to the low-order less input
[Figure: ALU cells 0 through 31, each with inputs Ai, Bi, and less and output resulti. The less inputs of the upper cells are tied to 0; the set output of cell 31 feeds the less input of cell 0.]
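The slt idea can be sketched as below. Note one assumption that goes beyond the slide: the sign bit of a - b alone gives the wrong answer when the subtraction overflows, so this sketch XORs the sign bit with an overflow term. The function name slt and the constants are illustrative.

```python
MASK32 = 0xFFFFFFFF
SIGN = 0x80000000

def slt(a: int, b: int) -> int:
    """Set-on-less-than for two 32-bit two's complement patterns."""
    diff = (a - b) & MASK32          # first perform a subtraction
    sign = (diff >> 31) & 1          # most significant sum bit
    # Assumption beyond the slide: a - b overflows when a and b have different
    # signs and the sign of the difference differs from the sign of a.
    overflow = int(bool((a ^ b) & (a ^ diff) & SIGN))
    return sign ^ overflow

print(slt(3, 5))                    # 1
print(slt(5, 3))                    # 0
print(slt(0x7FFFFFFF, 0xFFFFFFFF))  # 0 (largest positive vs. -1)
```

Without the overflow correction, the last call would report 1, because the raw difference wraps around to a negative pattern.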
Overflow Detection
Overflow occurs when the result is too large to represent in the number of bits
allocated
• adding two positives yields a negative
• or, adding two negatives gives a positive
• or, subtract a negative from a positive gives a negative
• or, subtract a positive from a negative gives a positive
  carries  0 1 1 1               1 0
           0 1 1 1    7          1 1 0 0   –4
         + 0 0 1 1    3        + 1 0 1 1   –5
           1 0 1 0   –6          0 1 1 1    7
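The two worked examples can be reproduced with a sign-based overflow check. This is a minimal sketch for addition only, using a 4-bit width to match the examples; the function name add_with_overflow is illustrative.

```python
def add_with_overflow(a: int, b: int, width: int = 4):
    """Add two width-bit two's complement patterns; return (result, overflow)."""
    mask = (1 << width) - 1
    sign = 1 << (width - 1)
    result = (a + b) & mask
    # Overflow: both operands have the same sign, and the result's sign differs.
    overflow = int(bool(~(a ^ b) & (a ^ result) & sign))
    return result, overflow

print(add_with_overflow(0b0111, 0b0011))  # (10, 1): 7 + 3 wraps to 1010
print(add_with_overflow(0b1100, 0b1011))  # (7, 1): -4 + -5 wraps to 0111
print(add_with_overflow(0b0010, 0b0011))  # (5, 0)
```

Adding operands of opposite sign can never overflow, which is why the check first requires the operand signs to match.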
Modifying the ALU for Overflow
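• Enable overflow bit setting for signed arithmetic (add, addi, sub)
[Figure: 32-bit ALU built from 1-bit cells with inputs Ai, Bi, and less, controlled by op and add/subt. The most significant cell produces set and overflow; the ALU also has a zero output.]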
Overflow Detection and Effects
New Instructions
Multiplication & Division
Multiplication
        0010    (multiplicand)
    x   1011    (multiplier)
    --------
        0010
       0010     (partial product
      0000       array)
     0010
    --------
    00010110    (product)
First Version of Multiplication Hardware
Second Version: Example
Multiplicand = 0110, multiplier = 0101. A 4-bit ALU adds the multiplicand into the upper half of the combined Product/Multiplier register, which then shifts right; Control tests Multiplier0.

Step      Product/Multiplier   Next
Initial   0 0000 0101          Multiplier0 is 1
Add       0 0110 0101
Shift     0 0011 0010          Multiplier0 is 0
Shift     0 0001 1001          Multiplier0 is 1
Add       0 0111 1001
Shift     0 0011 1100          Multiplier0 is 0
Shift     0 0001 1110
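The add-and-shift-right sequence of this example can be mirrored in software. The sketch assumes unsigned operands and a single combined register whose low half starts as the multiplier; the name shift_add_multiply is illustrative.

```python
def shift_add_multiply(multiplicand: int, multiplier: int, width: int = 4) -> int:
    """Multiply two unsigned width-bit values using one combined register."""
    mask = (1 << width) - 1
    # Low half starts as the multiplier; the upper half (product) starts at 0.
    reg = multiplier & mask
    for _ in range(width):
        if reg & 1:                                  # Multiplier0 is 1: add
            reg += (multiplicand & mask) << width    # add into the upper half
        reg >>= 1                                    # shift the register right
    return reg

print(f"{shift_add_multiply(0b0110, 0b0101):08b}")  # 00011110
```

After width iterations the multiplier bits have all been shifted out and the register holds the full 2 × width-bit product.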
• mul performs a 32-bit × 32-bit multiplication and places the lower 32 bits of the product in the destination register.
• mulh, mulhu, and mulhsu perform the same multiplication but return the upper 32
bits of the full 64-bit product, for signed×signed, unsigned×unsigned, and
signed×unsigned multiplication respectively.
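The four multiply variants differ only in how the operands are interpreted and which half of the 64-bit product is kept. The sketch below assumes rs1 and rs2 are passed as 32-bit patterns (0 to 2^32 - 1); the Python function names simply mirror the instruction mnemonics.

```python
MASK32 = 0xFFFFFFFF

def to_signed(x: int) -> int:
    """Read a 32-bit pattern as a signed two's complement value."""
    return x - (1 << 32) if x & 0x80000000 else x

def mul(rs1: int, rs2: int) -> int:
    return (rs1 * rs2) & MASK32                              # lower 32 bits

def mulh(rs1: int, rs2: int) -> int:
    return ((to_signed(rs1) * to_signed(rs2)) >> 32) & MASK32  # signed x signed

def mulhu(rs1: int, rs2: int) -> int:
    return ((rs1 * rs2) >> 32) & MASK32                      # unsigned x unsigned

def mulhsu(rs1: int, rs2: int) -> int:
    return ((to_signed(rs1) * rs2) >> 32) & MASK32           # signed x unsigned

minus_one = 0xFFFFFFFF
print(hex(mul(minus_one, 2)))    # 0xfffffffe
print(hex(mulh(minus_one, 2)))   # 0xffffffff
print(hex(mulhu(minus_one, 2)))  # 0x1
```

The lower 32 bits are the same for signed and unsigned operands, which is why a single mul suffices for the low half.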
Division
• Division is just a bunch of quotient digit guesses and left shifts and subtracts
[Figure: long-division layout showing the divisor, dividend, quotient, partial remainder array, and remainder, with n-bit operand widths marked.]
Division Hardware
Question: Division
Dividing 1001010 by 1000
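One way to work this question is bit-serial long division: shift in one dividend bit at a time, guess a quotient bit, and subtract when the guess succeeds. The sketch below assumes unsigned operands and is not a model of the hardware on the previous slide; the name long_divide is illustrative.

```python
def long_divide(dividend: int, divisor: int, width: int):
    """Unsigned division of a width-bit dividend; returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("divisor must be nonzero")
    quotient, remainder = 0, 0
    for i in range(width - 1, -1, -1):
        # Left-shift the partial remainder and bring down the next dividend bit.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        quotient <<= 1
        if remainder >= divisor:      # quotient digit guess of 1 succeeds
            remainder -= divisor
            quotient |= 1
    return quotient, remainder

q, r = long_divide(0b1001010, 0b1000, width=7)
print(f"{q:b} remainder {r:b}")  # 1001 remainder 10
```

In decimal this is 74 / 8 = 9 remainder 2.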
RISC-V Divide Instruction
• div and divu perform a 32-bit by 32-bit signed and unsigned integer division of rs1 by rs2, rounding towards zero.
• rem and remu provide the remainder of the corresponding division operation.
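Rounding towards zero differs from the floor division that some languages use for negative operands, so a small sketch may help. It assumes 32-bit patterns as inputs and a nonzero divisor, and it does not model the ISA's special cases; the Python names mirror the mnemonics for readability.

```python
MASK32 = 0xFFFFFFFF

def to_signed(x: int) -> int:
    """Read a 32-bit pattern as a signed two's complement value."""
    return x - (1 << 32) if x & 0x80000000 else x

def div(rs1: int, rs2: int) -> int:
    """Signed division rounding towards zero (assumes rs2 != 0)."""
    a, b = to_signed(rs1), to_signed(rs2)
    q = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        q = -q
    return q & MASK32

def rem(rs1: int, rs2: int) -> int:
    """Remainder of the division above; it takes the sign of the dividend."""
    a, b = to_signed(rs1), to_signed(rs2)
    r = abs(a) % abs(b)
    if a < 0:
        r = -r
    return r & MASK32

minus_seven = -7 & MASK32
print(to_signed(div(minus_seven, 2)))  # -3
print(to_signed(rem(minus_seven, 2)))  # -1
```

Python's own -7 // 2 evaluates to -4, which is why the sketch divides magnitudes and fixes the sign afterwards.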
Shifter
Shift Operations
• Shifts by a constant are encoded as a specialization of the I-type format. The operand
to be shifted is in rs1, and the shift amount is encoded in the lower 5 bits of the
I-immediate field.
• slli is a logical left shift; srli is a logical right shift; and srai is an arithmetic right shift.
• Logical shifts fill with zeros; arithmetic right shifts fill with the sign bit
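The three immediate shifts can be sketched on 32-bit patterns as follows. The Python function names mirror the mnemonics; masking shamt to 5 bits reflects the "lower 5 bits" encoding described above.

```python
MASK32 = 0xFFFFFFFF

def slli(x: int, shamt: int) -> int:
    """Logical left shift: zeros enter from the right."""
    return (x << (shamt & 0x1F)) & MASK32

def srli(x: int, shamt: int) -> int:
    """Logical right shift: zeros enter from the left."""
    return (x & MASK32) >> (shamt & 0x1F)

def srai(x: int, shamt: int) -> int:
    """Arithmetic right shift: copies of the sign bit enter from the left."""
    signed = x - (1 << 32) if x & 0x80000000 else x
    return (signed >> (shamt & 0x1F)) & MASK32

print(hex(srli(0x80000000, 4)))  # 0x8000000
print(hex(srai(0x80000000, 4)))  # 0xf8000000
```

For non-negative values srli and srai give the same result; they differ only when the sign bit is 1.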
[Figure: shifter bit-slice i, connecting inputs Ai and Ai-1 to outputs Bi and Bi-1.]
Parallel Programmable Shifters
Logarithmic Shifter Structure
[Figure: logarithmic shifter. Data In passes through cascaded stages to Data Out. The first stage shifts by 0 or 1 bits under Sh0/!Sh0 (dataouti is selected from dataini+1, dataini, or dataini-1), giving total shifts of 0 or 1. The second stage shifts by 0 or 2 bits under Sh1/!Sh1 (selecting from dataini+2, dataini, or dataini-2), giving total shifts of 0, 1, 2, or 3. The full structure uses control pairs Sh0/!Sh0 through Sh4/!Sh4; a 4-bit slice maps A3..A0 to B3..B0.]
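The staged structure can be sketched in software: stage k either passes the data through or shifts it by 2^k bits, depending on control bit Shk. The sketch assumes a logical right shift (the slides do not fix a direction here), and the function name log_shift_right is illustrative.

```python
def log_shift_right(data: int, shamt: int, width: int = 32) -> int:
    """Logical right shift built from stages that shift by 0 or 2**k bits."""
    mask = (1 << width) - 1
    data &= mask
    stages = (width - 1).bit_length()   # 5 stages (Sh0..Sh4) for 32 bits
    for k in range(stages):
        sh_k = (shamt >> k) & 1         # control bit Shk
        if sh_k:
            data >>= 1 << k             # Shk asserted: shift by 2**k
        # otherwise !Shk is asserted and the stage passes data through
    return data

print(hex(log_shift_right(0xDEADBEEF, 12)))  # 0xdeadb
```

A shift by any amount from 0 to 31 needs only five stages, instead of one stage per possible shift distance.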