Lec06 ALU

The document provides an overview of the Arithmetic and Logic Unit (ALU) in computer organization, focusing on its design and implementation in the RISC-V architecture. It covers key concepts such as binary addition, two's complement operations, and the handling of overflow in arithmetic operations. Additionally, it discusses the necessary components and logic required to support various arithmetic and logic instructions within the ALU.


CENG 3420

Computer Organization & Design


Lecture 06: Arithmetic and Logic Unit
Bei Yu
CSE Department, CUHK
[email protected]

(Textbook: Chapters 3.2 & A.5)

2024 Spring
Overview

1 Overview

2 Addition Unit

3 Multiplication & Division

4 Shifter

Overview
Abstract Implementation View

[Figure: abstract implementation view of the datapath, showing the PC, instruction memory, register file, ALU, and data memory with their address, read data, and write data connections]
Arithmetic
Where we’ve been: abstractions
• Instruction Set Architecture (ISA)
• Assembly and machine language

What’s up ahead: implementing the ALU architecture

[Figure: ALU block symbol with 32-bit inputs A and B, a 4-bit operation select m, a 32-bit result output, and 1-bit zero and ovf (overflow) outputs]
Machine Number Representation

• Bits are just bits (they have no inherent meaning); conventions define the relationships between bits and numbers

• Binary numbers (base 2) – integers
Of course, it gets more complicated:
• storage locations (e.g., register file words) are finite, so we have to worry about overflow (i.e., when the number is too big to fit into 32 bits)
• we have to be able to represent negative numbers, e.g., how do we specify -8 in

addi sp, sp, -8 # sp = sp - 8

• real systems have to provide for more than just integers, e.g., fractions and real numbers (floating point) and alphanumeric data (characters)
RISC-V Representation

32-bit signed numbers (2’s complement):

0000 0000 0000 0000 0000 0000 0000 0000_two = 0_ten
0000 0000 0000 0000 0000 0000 0000 0001_two = +1_ten
0000 0000 0000 0000 0000 0000 0000 0010_two = +2_ten
...
0111 1111 1111 1111 1111 1111 1111 1110_two = +2,147,483,646_ten
0111 1111 1111 1111 1111 1111 1111 1111_two = +2,147,483,647_ten
1000 0000 0000 0000 0000 0000 0000 0000_two = –2,147,483,648_ten
1000 0000 0000 0000 0000 0000 0000 0001_two = –2,147,483,647_ten
1000 0000 0000 0000 0000 0000 0000 0010_two = –2,147,483,646_ten
...
1111 1111 1111 1111 1111 1111 1111 1101_two = –3_ten
1111 1111 1111 1111 1111 1111 1111 1110_two = –2_ten
1111 1111 1111 1111 1111 1111 1111 1111_two = –1_ten

What if the bit string represented addresses?
• we also need operations that treat the bits as non-negative (unsigned) integers
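The same 32-bit pattern can be read as a signed or an unsigned value. The sketch below illustrates the two readings; the helper names `as_signed` and `as_unsigned` are made up for this example and are not part of the lecture.

```python
def as_unsigned(bits: str) -> int:
    """Read a string of 0/1 characters (spaces allowed) as an unsigned integer."""
    return int(bits.replace(" ", ""), 2)


def as_signed(bits: str) -> int:
    """Read the same bit string as a two's complement integer."""
    clean = bits.replace(" ", "")
    value = int(clean, 2)
    # A set sign bit means the pattern stands for value - 2^n.
    if clean[0] == "1":
        value -= 1 << len(clean)
    return value


pattern = "1000 0000 0000 0000 0000 0000 0000 0000"
print(as_signed(pattern))    # -2147483648
print(as_unsigned(pattern))  # 2147483648
```

Only the interpretation differs; the stored bits are identical in both cases.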
Two’s Complement Operations

• Negating a two’s complement number – complement all the bits and then add a 1
• remember: “negate” and “invert” are quite different!

• Converting n-bit numbers into numbers with more than n bits:
• 12-bit immediate gets converted to 32 bits for arithmetic
• sign extend: copy the most significant bit (the sign bit) into the other bits
0010 -> 0000 0010
1010 -> 1111 1010
• sign extension versus zero extension (lb vs. lbu)
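A minimal sketch of the two operations above, working on raw bit patterns held in Python integers. The function names and the `width` parameters are assumptions made for illustration.

```python
def negate(value: int, width: int) -> int:
    """Two's complement negation: invert all bits, then add 1 (kept to `width` bits)."""
    mask = (1 << width) - 1
    return ((value ^ mask) + 1) & mask


def sign_extend(value: int, from_width: int, to_width: int) -> int:
    """Copy the sign bit of a from_width-bit pattern into the new upper bits."""
    sign = (value >> (from_width - 1)) & 1
    if sign:
        upper = ((1 << (to_width - from_width)) - 1) << from_width
        return value | upper
    return value


def zero_extend(value: int, from_width: int) -> int:
    """Fill the new upper bits with zeros (the pattern itself is unchanged)."""
    return value & ((1 << from_width) - 1)


print(format(sign_extend(0b0010, 4, 8), "08b"))  # 00000010
print(format(sign_extend(0b1010, 4, 8), "08b"))  # 11111010
print(format(negate(0b0110, 4), "04b"))          # 1010
```

Note that `negate` differs from a plain inversion by the final `+ 1`, which is the distinction the slide warns about.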
Design the RISC-V Arithmetic Logic Unit (ALU)

• Must support the Arithmetic/Logic operations of the ISA

RV32I:
add, sub, mul, mulh, mulhu, mulhsu, div, divu, rem, li, addi, sll, srl, sra, or, xor, not, slt, sltu, slli, srli, srai, andi, ori, xori, slti, sltiu

RV64I:
addw, subw, remu, mulw, divw, divuw, remw, remuw, addiw, sllw, srlw, sraw, srliw, sraiw

[Figure: ALU block symbol with 32-bit inputs A and B, a 4-bit operation select m, a 32-bit result output, and 1-bit zero and ovf outputs]

• With special handling for:
• sign extend: addi, slti, sltiu
• zero extend: andi, xori
• Overflow detected: add, addi, sub
RISC-V Arithmetic and Logic Instructions

Instruction formats (bit positions):
I-Type: Imm[11:0] (31-20) | rs1 (19-15) | funct3 (14-12) | rd (11-7) | opcode (6-0)
R-Type: funct7 (31-25) | rs2 (24-20) | rs1 (19-15) | funct3 (14-12) | rd (11-7) | opcode (6-0)

I-Type
Instr   opcode    funct3   Imm[11:5]
ADDI    0010011   000      xx (any)
SLLI    0010011   001      0000000
SLTI    0010011   010      xx
SLTIU   0010011   011      xx
SRLI    0010011   101      0000000
SRAI    0010011   101      0100000
ORI     0010011   110      xx
ANDI    0010011   111      xx

R-Type
Instr   opcode    funct7    funct3
ADD     0110011   0000000   000
SUB     0110011   0100000   000
SLL     0110011   0000000   001
SLT     0110011   0000000   010
SLTU    0110011   0000000   011
XOR     0110011   0000000   100
SRL     0110011   0000000   101
SRA     0110011   0100000   101
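To make the R-Type layout concrete, here is a small sketch that packs and unpacks the six fields using the bit positions listed above. The function names are illustrative only; the field values for ADD and SUB come from the table.

```python
def encode_r_type(funct7, rs2, rs1, funct3, rd, opcode):
    """Pack R-Type fields into a 32-bit instruction word."""
    return (funct7 << 25) | (rs2 << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode


def decode_r_type(word):
    """Split a 32-bit word back into its R-Type fields."""
    return {
        "funct7": (word >> 25) & 0x7F,
        "rs2": (word >> 20) & 0x1F,
        "rs1": (word >> 15) & 0x1F,
        "funct3": (word >> 12) & 0x7,
        "rd": (word >> 7) & 0x1F,
        "opcode": word & 0x7F,
    }


# add x3, x1, x2
add_word = encode_r_type(0b0000000, 2, 1, 0b000, 3, 0b0110011)
print(hex(add_word))  # 0x2081b3
```

ADD and SUB differ only in funct7, so a decoder can tell them apart from a single bit of that field.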
Addition Unit
Building a 1-bit Binary Adder

[Figure: 1-bit full adder with inputs A, B, and carry_in and outputs S and carry_out]

A  B  carry_in | carry_out  S
0  0  0        | 0          0
0  0  1        | 0          1
0  1  0        | 0          1
0  1  1        | 1          0
1  0  0        | 0          1
1  0  1        | 1          0
1  1  0        | 1          0
1  1  1        | 1          1

S = A xor B xor carry_in
carry_out = A&B | A&carry_in | B&carry_in (majority function)

• How can we use it to build a 32-bit adder?
• How can we modify it easily to build an adder/subtractor?
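The two equations can be checked directly against the truth table. The sketch below is one way to do that; `full_adder` is a name chosen for this example.

```python
from itertools import product


def full_adder(a: int, b: int, carry_in: int):
    """1-bit full adder: returns (S, carry_out) from the slide's equations."""
    s = a ^ b ^ carry_in
    carry_out = (a & b) | (a & carry_in) | (b & carry_in)
    return s, carry_out


# Print the truth table in the same column order as above.
for a, b, cin in product((0, 1), repeat=3):
    s, cout = full_adder(a, b, cin)
    print(a, b, cin, "|", cout, s)
```

Every row satisfies `a + b + carry_in == 2 * carry_out + S`, which is what makes the cell a correct 1-bit adder.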
Building 32-bit Adder

[Figure: 32 1-bit full adders (FA) in a chain; c0 = carry_in enters bit 0, each carry-out (c1, c2, ..., c31) feeds the next FA, and c32 = carry_out leaves bit 31]

• Just connect the carry-out of the least significant bit FA to the carry-in of the next least significant bit and connect ...
• Ripple Carry Adder (RCA)
• Advantage: simple logic, so small (low cost)
• Disadvantage: slow and lots of glitching (so lots of energy consumption)
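A software sketch of the ripple-carry structure, chaining one full-adder step per bit. It models only the logic, not the delay; the function names and the `width` parameter are assumptions for illustration.

```python
def full_adder(a: int, b: int, carry_in: int):
    """1-bit full adder: returns (S, carry_out)."""
    s = a ^ b ^ carry_in
    carry_out = (a & b) | (a & carry_in) | (b & carry_in)
    return s, carry_out


def ripple_carry_add(a: int, b: int, carry_in: int = 0, width: int = 32):
    """Add two width-bit patterns bit by bit, rippling the carry upward."""
    result = 0
    carry = carry_in
    for i in range(width):
        # Bit i cannot be computed until the carry from bit i-1 is known.
        s, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        result |= s << i
    return result, carry


print(ripple_carry_add(7, 3, width=4))   # (10, 0)
print(ripple_carry_add(0xFFFFFFFF, 1))   # (0, 1)
```

The loop-carried dependency on `carry` is the software analogue of the long carry chain that makes the hardware slow.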
Glitch

Glitch
An invalid, unpredicted output value that can be read by the next stage and result in a wrong action

Example: Draw the propagation delay

[Figure: circuit with input I, a NOT gate with output O1, and an OR gate with output O; timing diagrams for I, O1, and O against time T]
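The sketch below simulates a glitch under two assumptions that are not confirmed by the extracted slide: the circuit is O = I OR (NOT I), and every gate has a delay of exactly one time step. The function name `simulate` is illustrative.

```python
def simulate(i_wave):
    """Unit-delay simulation: each gate output at step t uses its inputs from step t-1.

    Assumed circuit: O1 = NOT I, O = I OR O1.
    Returns a list of (I, O1, O) tuples, one per time step.
    """
    prev_i = i_wave[0]
    o1 = 1 - prev_i   # NOT gate already settled on the initial input
    trace = []
    for i in i_wave:
        new_o1 = 1 - prev_i    # NOT gate sees last step's I
        new_o = prev_i | o1    # OR gate sees last step's I and O1
        o1 = new_o1
        trace.append((i, o1, new_o))
        prev_i = i
    return trace


for step in simulate([1, 1, 0, 0, 0]):
    print(step)
```

Logically O should stay at 1, but after I falls the OR gate briefly sees I = 0 and the stale O1 = 0, so O drops to 0 for one step before recovering.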
Glitch in RCA

[Figure: the 32-bit ripple-carry chain (c0 = carry_in through c32 = carry_out) shown beside the full-adder truth table with columns A, B, carry_in, carry_out, S]
But What about Performance?

• The critical path of an n-bit ripple-carry adder is n × CP, where CP is the critical-path delay of a single 1-bit stage
• Design trick: throw hardware at it (Carry Lookahead)

[Figure: four 1-bit ALUs in a chain; stage i takes A_i, B_i, and CarryIn_i, produces Result_i, and passes CarryOut_i to the next stage as CarryIn_(i+1)]
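The slide names carry lookahead without giving equations. The sketch below uses the standard textbook generate/propagate formulation (g = a AND b, p = a OR b) for a 4-bit block; it is an illustration under that assumption, not the lecture's own circuit.

```python
def carry_lookahead_add4(a: int, b: int, c0: int = 0):
    """4-bit adder whose carries are computed directly from g, p, and c0."""
    g = [((a >> i) & (b >> i)) & 1 for i in range(4)]   # generate
    p = [((a >> i) | (b >> i)) & 1 for i in range(4)]   # propagate
    # Each carry is a two-level expression; none waits on a previous carry.
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c0)
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0]) | (p[3] & p[2] & p[1] & p[0] & c0))
    carries = [c0, c1, c2, c3]
    total = 0
    for i in range(4):
        s = ((a >> i) ^ (b >> i) ^ carries[i]) & 1
        total |= s << i
    return total, c4


print(carry_lookahead_add4(0b0111, 0b0011))  # (10, 0)
```

The price of the shorter critical path is visible in the widening AND/OR terms: more gates with more inputs per carry, which is the "throw hardware at it" trade.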
A 32-bit Ripple Carry Adder/Subtractor

To subtract, negate B and add:
• complement all the bits of B: each FA receives B0 if control = 0, or !B0 if control = 1 (control: 0 = add, 1 = sub)
• add a 1 in the least significant bit (the add/sub control drives c0 = carry_in)

Example:
  A     0111   ->     0111
  B   - 0110   ->   + 1001
                    +    1
        0001      1 0001

[Figure: 32-bit ripple-carry chain of 1-bit FAs with the add/sub control selecting B or !B at every bit and feeding c0; c32 = carry_out]
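A sketch of the adder/subtractor: one control bit both inverts B and supplies the carry-in of 1. The function name and `width` parameter are assumptions for illustration, and inputs are assumed to be non-negative patterns that fit in `width` bits.

```python
def add_sub(a: int, b: int, control: int, width: int = 32):
    """control = 0 computes a + b; control = 1 computes a - b as a + !b + 1."""
    mask = (1 << width) - 1
    b_in = (b ^ mask) if control else b   # complement all bits of B when subtracting
    carry = control                       # add/sub also drives c0
    result = 0
    for i in range(width):
        ai, bi = (a >> i) & 1, (b_in >> i) & 1
        s = ai ^ bi ^ carry
        carry = (ai & bi) | (ai & carry) | (bi & carry)
        result |= s << i
    return result, carry


print(add_sub(0b0111, 0b0110, 1, width=4))  # (1, 1)
```

The call reproduces the slide's example: 0111 + 1001 + 1 gives 0001 with a carry-out of 1.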
Tailoring the ALU to the ISA

• Also need to support the logic operations (and, nor, or, xor)
• Bit wise operations (no carry operation involved)
• Need a logic gate for each function and a mux to choose the output
• Also need to support the set-on-less-than instruction (slt)
• Uses subtraction to determine if (a − b) < 0 (implies a < b)
• Also need to support test for equality (bne, beq)
• Again use subtraction: (a − b) = 0 implies a = b
• Also need to add overflow detection hardware
• overflow detection enabled only for add, addi, sub
• Immediates are sign extended outside the ALU with wiring (i.e., no logic needed)
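The points above can be sketched as a tiny behavioural ALU: every function is computed and a selector picks the output, the way a mux would, and a zero flag supports the equality test. The operation names passed as strings are made up for this example.

```python
MASK32 = 0xFFFFFFFF


def alu(a: int, b: int, op: str):
    """Return (result, zero) for 32-bit patterns a and b."""
    candidates = {
        "and": a & b,
        "or": a | b,
        "xor": a ^ b,
        "nor": ~(a | b),
        "add": a + b,
        "sub": a - b,
    }
    if op not in candidates:
        raise ValueError(f"unknown op: {op}")
    result = candidates[op] & MASK32   # the "mux" selects one candidate
    zero = int(result == 0)            # zero = 1 after sub means a == b
    return result, zero


print(alu(5, 5, "sub"))  # (0, 1)
print(alu(5, 3, "sub"))  # (2, 0)
```

A branch such as beq would take the zero flag from a subtraction, while bne would take its inverse.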

A Simple ALU Cell with Logic Op Support

[Figure: 1-bit ALU cell. Inputs A and B, with an add/subt control on B, feed the logic gates and a 1-bit FA with carry_in and carry_out; a multiplexer controlled by op selects the result from inputs numbered 0 to 3, 6, and 7, where the FA output appears beside input 6 and the less input beside input 7]
Modifying the ALU for slt
• First perform a subtraction
• Make the result 1 if the subtraction yields a negative result
• Make the result 0 if the subtraction yields a positive result
• Tie the most significant sum bit (sign bit) to the low order less input

[Figure: 32 ALU cells with inputs A0/B0 to A31/B31 and outputs result0 to result31; the less inputs of the upper cells are tied to 0, and the set output of cell 31 feeds the less input of cell 0]
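A minimal sketch of slt as described: subtract, then copy the sign bit of the difference into bit 0 of the result. The function name is illustrative. As written, it gives the right answer only when the subtraction itself does not overflow.

```python
def slt_via_subtract(a: int, b: int, width: int = 32) -> int:
    """Return 1 if the sign bit of (a - b) is set, else 0.

    a and b are two's complement bit patterns held as non-negative ints.
    Assumes a - b does not overflow the signed range.
    """
    mask = (1 << width) - 1
    diff = (a - b) & mask
    return (diff >> (width - 1)) & 1   # most significant sum bit -> less input of bit 0


print(slt_via_subtract(3, 5))           # 1
print(slt_via_subtract(5, 3))           # 0
print(slt_via_subtract(0xFFFFFFFC, 1))  # 1  (-4 < 1)
```

All other result bits are 0, which matches the less inputs tied to 0 in the upper cells.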
Overflow Detection
Overflow occurs when the result is too large to represent in the number of bits allocated
• adding two positives yields a negative
• or, adding two negatives gives a positive
• or, subtracting a negative from a positive gives a negative
• or, subtracting a positive from a negative gives a positive

Question: prove you can detect overflow by:

Carry into MSB xor Carry out of MSB

Example 1 (4-bit): 0111 (7) + 0011 (3) = 1010 (–6); carry into MSB = 1, carry out of MSB = 0
Example 2 (4-bit): 1100 (–4) + 1011 (–5) = 0111 (7); carry into MSB = 0, carry out of MSB = 1
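The question asks for a proof. As a sanity check rather than a proof, the sketch below tests the rule exhaustively for every pair of 4-bit operands. All function names are made up for this example.

```python
def add_with_carries(a: int, b: int, width: int = 4):
    """Ripple add; returns (result, carry_into_msb, carry_out_of_msb)."""
    carry = 0
    carry_into_msb = 0
    result = 0
    for i in range(width):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        if i == width - 1:
            carry_into_msb = carry
        result |= (ai ^ bi ^ carry) << i
        carry = (ai & bi) | (ai & carry) | (bi & carry)
    return result, carry_into_msb, carry


def to_signed(x: int, width: int = 4) -> int:
    """Two's complement value of a width-bit pattern."""
    return x - (1 << width) if (x >> (width - 1)) & 1 else x


def rule_holds_for_all(width: int = 4) -> bool:
    """Compare the XOR rule with the true signed sum for every operand pair."""
    lo, hi = -(1 << (width - 1)), (1 << (width - 1)) - 1
    for a in range(1 << width):
        for b in range(1 << width):
            _, cin, cout = add_with_carries(a, b, width)
            true_sum = to_signed(a, width) + to_signed(b, width)
            overflow = not (lo <= true_sum <= hi)
            if overflow != bool(cin ^ cout):
                return False
    return True


print(rule_holds_for_all(4))  # True
```

An exhaustive check at one width does not replace the case analysis on operand signs, but it does confirm the two examples above.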
Modifying the ALU for Overflow
• Modify the most significant cell to determine overflow output setting
• Enable overflow bit setting for signed arithmetic (add, addi, sub)

[Figure: 32-cell ALU with op and add/subt controls; cell 31 produces the set and overflow outputs, set feeds the less input of cell 0, the other less inputs are tied to 0, and a zero output is shown alongside result0 to result31]
Overflow Detection and Effects

• On overflow, an exception (interrupt) occurs


• Control jumps to predefined address for exception
• Interrupted address (address of instruction causing the overflow) is saved for
possible resumption
• We don’t always want to detect (interrupt on) overflow
New Instructions

Category                     Instr                           Op Code    Example               Meaning
Arithmetic (R & I format)    add unsigned                    0 and 21   addu $s1, $s2, $s3    $s1 = $s2 + $s3
                             sub unsigned                    0 and 23   subu $s1, $s2, $s3    $s1 = $s2 - $s3
                             add imm. unsigned               9          addiu $s1, $s2, 6     $s1 = $s2 + 6
Data Transfer                ld byte unsigned                24         lbu $s1, 20($s2)      $s1 = Mem($s2+20)
                             ld half unsigned                25         lhu $s1, 20($s2)      $s1 = Mem($s2+20)
Cond. Branch (I & R format)  set on less than unsigned       0 and 2b   sltu $s1, $s2, $s3    if ($s2<$s3) $s1=1 else $s1=0
                             set on less than imm unsigned   b          sltiu $s1, $s2, 6     if ($s2<6) $s1=1 else $s1=0

• Sign extend: addi, addiu, slti
• Zero extend: andi, ori, xori
• Overflow detected: add, addi, sub
Multiplication & Division
Multiplication

• More complicated than addition


• Can be accomplished via shifting and adding

      0010   (multiplicand)
  x   1011   (multiplier)
  --------
      0010
     0010    (partial product
    0000      array)
   0010
  --------
  00010110   (product)

• A double-precision (2n-bit) product is produced
• More time and more area to compute
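The partial product array can be reproduced with a few lines of shifting and adding. This is a sketch for unsigned operands; the function name and `width` parameter are assumptions.

```python
def shift_add_multiply(multiplicand: int, multiplier: int, width: int = 4):
    """Return (product, partial_products) for unsigned width-bit operands."""
    product = 0
    partials = []
    for i in range(width):
        bit = (multiplier >> i) & 1
        # Each multiplier bit selects either the shifted multiplicand or zero.
        partial = (multiplicand << i) if bit else 0
        partials.append(partial)
        product += partial
    return product, partials


product, partials = shift_add_multiply(0b0010, 0b1011)
for p in partials:
    print(format(p, "08b"))
print(format(product, "08b"))  # 00010110
```

The four printed rows are the slide's partial products with their shifts made explicit, and their sum is 22.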
First Version of Multiplication Hardware

Note: n-bit × n-bit needs 2n-bit adder


Second Version of Multiplication Hardware

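[Figure: second-version multiplier. The multiplicand feeds a 32-bit ALU whose sum is added into the product register; the product register is shifted right under control logic, and its lower part initially holds the multiplier]

Note: n-bit × n-bit needs only n-bit adder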
Second Version: Example

Multiplicand = 0110, 4-bit ALU. The register holds a carry bit, the product (upper half), and the multiplier (lower half). Each step tests Multiplier0, the least significant multiplier bit.

Step      Carry  Product  Multiplier  Note
Initial   0      0000     0101        Multiplier0 is 1
Add       0      0110     0101        0000 + 0110 = 0110
Shift     0      0011     0010        Multiplier0 is 0
Shift     0      0001     1001        Multiplier0 is 1
Add       0      0111     1001        0001 + 0110 = 0111
Shift     0      0011     1100        Multiplier0 is 0
Shift     0      0001     1110

Final Result: 00011110 = 30
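The example can be replayed in software. The sketch below models the combined carry/product/multiplier register and records each Add and Shift step; the function name and the trace format are assumptions made for illustration.

```python
def multiply_v2(multiplicand: int, multiplier: int, width: int = 4):
    """Right-shift multiplier using only a width-bit adder.

    Returns (product, trace), where trace holds (step, carry, upper, lower).
    """
    mask = (1 << width) - 1
    carry, upper, lower = 0, 0, multiplier & mask
    trace = [("Initial", carry, upper, lower)]
    for _ in range(width):
        if lower & 1:                      # Multiplier0 is 1: add the multiplicand
            total = upper + multiplicand
            carry, upper = total >> width, total & mask
            trace.append(("Add", carry, upper, lower))
        # Shift the whole (carry, upper, lower) register right by one bit.
        lower = (lower >> 1) | ((upper & 1) << (width - 1))
        upper = (upper >> 1) | (carry << (width - 1))
        carry = 0
        trace.append(("Shift", carry, upper, lower))
    return (upper << width) | lower, trace


product, trace = multiply_v2(0b0110, 0b0101)
for step, c, u, l in trace:
    print(f"{step:8s}{c} {u:04b} {l:04b}")
print(product)  # 30
```

Because the multiplier bits are shifted out as product bits are shifted in, one double-width register serves both purposes.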
RISC-V Multiply Instruction

• mul performs a 32-bit × 32-bit multiplication and places the lower 32 bits in the destination register.

mul rd, rs1, rs2

• mulh, mulhu, and mulhsu perform the same multiplication but return the upper 32 bits of the full 64-bit product, for signed×signed, unsigned×unsigned, and signed×unsigned multiplication respectively.
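A behavioural sketch of the four instructions on 32-bit patterns, using Python's unbounded integers to form the full product. The Python function names mirror the mnemonics but are otherwise illustrative.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x: int) -> int:
    """Two's complement value of a 32-bit pattern."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def mul(rs1: int, rs2: int) -> int:
    """Lower 32 bits of the product (the same for signed and unsigned operands)."""
    return (rs1 * rs2) & MASK32


def mulh(rs1: int, rs2: int) -> int:
    """Upper 32 bits, signed x signed."""
    return ((to_signed32(rs1) * to_signed32(rs2)) >> 32) & MASK32


def mulhu(rs1: int, rs2: int) -> int:
    """Upper 32 bits, unsigned x unsigned."""
    return (((rs1 & MASK32) * (rs2 & MASK32)) >> 32) & MASK32


def mulhsu(rs1: int, rs2: int) -> int:
    """Upper 32 bits, signed rs1 x unsigned rs2."""
    return ((to_signed32(rs1) * (rs2 & MASK32)) >> 32) & MASK32


x = 0xFFFFFFFF  # -1 when signed, 4294967295 when unsigned
print(hex(mul(x, x)), hex(mulh(x, x)), hex(mulhu(x, x)), hex(mulhsu(x, x)))
```

With both operands set to 0xFFFFFFFF the lower half is the same in every case, while the three upper halves all differ, which is why three separate high-half instructions exist.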
Division

• Division is just a bunch of quotient digit guesses and left shifts and subtracts

[Figure: long-division layout with an n-bit divisor, an n-bit quotient, the dividend padded with leading zeros, the partial remainder array, and an n-bit remainder]
Division Hardware


Question: Division
Dividing 1001010 by 1000 (binary)
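One way to check an answer to the question is a shift-and-subtract loop that mirrors paper long division. This is a sketch for unsigned operands only, not a model of the division hardware; the function name is illustrative.

```python
def long_divide(dividend: int, divisor: int):
    """Unsigned binary long division; returns (quotient, remainder)."""
    if divisor <= 0:
        raise ValueError("divisor must be positive")
    quotient, remainder = 0, 0
    for i in range(dividend.bit_length() - 1, -1, -1):
        # Bring down the next dividend bit.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        quotient <<= 1
        # Guess a quotient digit of 1 if the divisor fits.
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1
    return quotient, remainder


q, r = long_divide(0b1001010, 0b1000)
print(format(q, "b"), format(r, "b"))  # 1001 10
```

In decimal this is 74 divided by 8, giving quotient 9 and remainder 2.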
RISC-V Divide Instruction

• div writes the quotient to rd; the remainder is obtained separately with rem

div rd, rs1, rs2

• div performs a 32-bit by 32-bit signed integer division of rs1 by rs2, rounding towards zero.
• div and divu perform signed and unsigned integer division of 32 bits by 32 bits.
• rem and remu provide the remainder of the corresponding division operation.
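Rounding towards zero differs from Python's `//`, which rounds towards negative infinity, so a sketch of the signed semantics needs a little care. The functions below assume a nonzero divisor and do not model the special cases of division by zero or overflow; the names mirror the mnemonics but are illustrative.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x: int) -> int:
    """Two's complement value of a 32-bit pattern."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def div(rs1: int, rs2: int) -> int:
    """Signed quotient, rounded towards zero (assumes rs2 != 0)."""
    a, b = to_signed32(rs1), to_signed32(rs2)
    q = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        q = -q
    return q & MASK32


def rem(rs1: int, rs2: int) -> int:
    """Signed remainder; its sign follows the dividend (assumes rs2 != 0)."""
    a, b = to_signed32(rs1), to_signed32(rs2)
    r = abs(a) % abs(b)
    if a < 0:
        r = -r
    return r & MASK32


minus7 = -7 & MASK32
print(to_signed32(div(minus7, 2)), to_signed32(rem(minus7, 2)))  # -3 -1
```

The pair satisfies dividend = quotient × divisor + remainder, here -7 = (-3)(2) + (-1), whereas Python's own `-7 // 2` would give -4.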
Shifter
Shift Operations

• Shifts by a constant are encoded as a specialization of the I-type format. The operand
to be shifted is in rs1, and the shift amount is encoded in the lower 5 bits of the
I-immediate field.

srli rd, rs1, imm[4:0]


srai rd, rs1, imm[4:0]

• slli is a logical left shift; srli is a logical right shift; and srai is an arithmetic right shift.
• Logical shifts fill with zeros; arithmetic right shifts fill with the sign bit

The shift operation is implemented by hardware separate from the ALU, using a barrel shifter, which would take lots of gates in discrete logic but is pretty easy to implement in VLSI
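A behavioural sketch of the three immediate shifts on 32-bit patterns. Only the lower 5 bits of the shift amount are used, as described above; the Python function names mirror the mnemonics but are illustrative.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x: int) -> int:
    """Two's complement value of a 32-bit pattern."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def slli(rs1: int, shamt: int) -> int:
    """Logical left shift: zeros enter from the right."""
    return (rs1 << (shamt & 31)) & MASK32


def srli(rs1: int, shamt: int) -> int:
    """Logical right shift: zeros enter from the left."""
    return (rs1 & MASK32) >> (shamt & 31)


def srai(rs1: int, shamt: int) -> int:
    """Arithmetic right shift: copies of the sign bit enter from the left."""
    return (to_signed32(rs1) >> (shamt & 31)) & MASK32


x = 0x80000000
print(hex(srli(x, 4)))  # 0x8000000
print(hex(srai(x, 4)))  # 0xf8000000
```

The two right shifts agree whenever the sign bit is 0 and differ only in what they fill with when it is 1.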
A Simple Shifter

[Figure: bit-slice i of a simple shifter; control signals Right, nop, and Left connect inputs Ai and Ai-1 to outputs Bi and Bi-1]

Parallel Programmable Shifters

Shift amount (Sh4 Sh3 Sh2 Sh1 Sh0)
Control = shift direction (left, right) and shift type (logical, arithmetic)
Logarithmic Shifter Structure

[Figure: one stage of the shifter. For bit i of the first stage, dataout_i is taken from datain_i+1 when Sh0 & right, from datain_i when !Sh0, and from datain_i-1 when Sh0 & left. The second stage has the same form with Sh1 and datain_i+2 / datain_i-2.]

Stage  Control    Shift applied  Shifts available after this stage
1      Sh0, !Sh0  0 or 1 bits    0, 1
2      Sh1, !Sh1  0 or 2 bits    0, 1, 2, 3
3      Sh2, !Sh2  0 or 4 bits    0, 1, 2, 3, 4, 5, 6, 7
4      Sh3, !Sh3  0 or 8 bits    0, 1, 2 … 15
5      Sh4, !Sh4  0 or 16 bits   0, 1, 2 … 31

[Figure: 4-bit example mapping inputs A3..A0 to outputs B3..B0 through stages labeled Sh1, Sh2, and Sh4]
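The stage table can be sketched in software: stage k shifts by 2^k bits when bit k of the shift amount is set, and passes the data through otherwise. The function name and its parameters are assumptions for illustration, and `width` is assumed to be a power of two.

```python
def log_shift(value: int, shamt: int, direction: str = "left",
              arithmetic: bool = False, width: int = 32) -> int:
    """Shift by 0..width-1 bits using log2(width) fixed-distance stages."""
    mask = (1 << width) - 1
    value &= mask
    sign = (value >> (width - 1)) & 1
    stages = width.bit_length() - 1        # 5 stages for a 32-bit shifter
    for k in range(stages):
        if not (shamt >> k) & 1:
            continue                       # !Sh_k: pass the data straight through
        step = 1 << k                      # Sh_k: shift by 1, 2, 4, 8, 16
        if direction == "left":
            value = (value << step) & mask
        else:
            value >>= step
            if arithmetic and sign:
                # Arithmetic right shift refills the vacated bits with the sign.
                value |= ((1 << step) - 1) << (width - step)
    return value


print(hex(log_shift(0x80000000, 4, "right", arithmetic=True)))  # 0xf8000000
print(hex(log_shift(1, 31)))                                    # 0x80000000
```

Five stages cover all 32 shift amounts, compared with the 31 single-bit stages a chain of simple shifters would need.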
