Lecture 02

IIT Mandi Computer Architecture lecture 2 PDF

CS214 Computer Organization

Completed with notes. No need to review.

Tuesday, September 10, 2024

Kumar Sambhav Pandey


School of Computing and Electrical Engineering.
 Copyright IIT Mandi. 2024 1
Control needs to
1. input instructions from Memory
2. issue signals to control the information flow between the Datapath components and to control what operations they perform
3. control instruction sequencing
[Figure: CPU (Control, Datapath), Memory, Devices (Input, Output); instruction cycle Fetch → Decode → Exec]


Datapath needs to have
1. Components: the functional units and storage (e.g. register file) needed to execute instructions
2. Interconnects: components connected so that the instructions can be accomplished and so that data can be loaded from and stored to Memory
[Figure: CPU (Control, Datapath), Memory, Devices (Input, Output); instruction cycle Fetch → Decode → Exec]


• Used as the example throughout the course
• Developed at UC Berkeley as open ISA
• Now managed by the RISC-V Foundation (riscv.org)
• Typical of many modern ISAs
• Similar ISAs have a large share of embedded core market
• Applications in consumer electronics, network/storage
equipment, cameras, printers, …



Instruction Categories
• Computational
• Load/Store
• Jump and Branch
• Floating Point
• coprocessor
• Memory Management
• Special
Registers: R0 - R31, PC, HI, LO


6 Instruction Formats
(all 32 bits wide)



Register File
Holds thirty-two 64-bit registers
• Two read ports and
• One write port
Registers are
• Faster than main memory
- But register files with more locations are slower (e.g., a 64 word file could be as much as 50% slower than a 32 word file)
- Read/write port increase impacts speed quadratically
• Easier for a compiler to use
- e.g., (A*B) – (C*D) – (E*F) can do multiplies in any order vs. stack
• Can hold variables so that
- code density improves (since registers are named with fewer bits than a memory location)
[Figure: register file with 32 locations of 64 bits; 5-bit src1 addr, src2 addr and dst addr inputs; 64-bit write data input; 64-bit src1 data and src2 data outputs; write control]
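As a rough illustration of the register file described above, the following Python sketch models thirty-two 64-bit registers with two read ports and one write port. The names `RegisterFile`, `read`, and `write` are illustrative assumptions, not part of the lecture.

```python
MASK64 = (1 << 64) - 1  # registers are 64 bits wide


class RegisterFile:
    """Thirty-two 64-bit registers, two read ports, one write port."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, src1_addr, src2_addr):
        # Two read ports: both source registers are read together.
        # Addresses are 5 bits wide, so they are masked to 0..31.
        return self.regs[src1_addr & 0x1F], self.regs[src2_addr & 0x1F]

    def write(self, dst_addr, data, write_control=True):
        # One write port, gated by the write control signal.
        if write_control:
            self.regs[dst_addr & 0x1F] = data & MASK64


rf = RegisterFile()
rf.write(20, 7)
rf.write(21, 5)
a, b = rf.read(20, 21)   # read both sources in one step
rf.write(9, a + b)       # write the sum to register 9
```

The 5-bit address mask reflects why a register name needs far fewer bits than a memory address.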

• Add and subtract, three operands
• Two sources and one destination

add a, b, c // a gets b + c

• All arithmetic operations have this form


• Design Principle 1: Simplicity favours regularity
• Regularity makes implementation simpler
• Simplicity enables higher performance at lower cost



• C code:

f = (g + h) - (i + j);
• Compiled RISC-V code:

add t0, g, h // temp t0 = g + h


add t1, i, j // temp t1 = i + j
sub f, t0, t1 // f = t0 - t1



• Arithmetic instructions use register operands
• RISC-V has a 32 × 64-bit register file
• Use for frequently accessed data
• 64-bit data is called a “doubleword”
• 32 x 64-bit general purpose registers x0 to x31
• 32-bit data is called a “word”
• Design Principle 2: Smaller is faster
• c.f. main memory: millions of locations


• C code:
f = (g + h) - (i + j);
• f, …, j in x19, x20, …, x23
• Compiled RISC-V code:
add x5, x20, x21
add x6, x22, x23
sub x19, x5, x6
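The three instructions above can be traced with a small Python sketch. The `run` helper and the sample values for g, h, i and j are hypothetical; only the register assignment (f to j in x19 to x23) and the instruction sequence come from the slide.

```python
MASK64 = (1 << 64) - 1  # results wrap to 64 bits


def run(program, regs):
    """Execute three-operand add/sub instructions on a dict of registers."""
    ops = {"add": lambda a, b: a + b, "sub": lambda a, b: a - b}
    for line in program:
        op, operands = line.split(maxsplit=1)
        rd, rs1, rs2 = [r.strip() for r in operands.split(",")]
        regs[rd] = ops[op](regs[rs1], regs[rs2]) & MASK64
    return regs


# Sample values (assumed): g=10, h=4, i=3, j=2
regs = {"x20": 10, "x21": 4, "x22": 3, "x23": 2}
program = [
    "add x5, x20, x21",   # x5 = g + h
    "add x6, x22, x23",   # x6 = i + j
    "sub x19, x5, x6",    # f  = x5 - x6
]
run(program, regs)
```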



• Main memory used for composite data
• Arrays, structures, dynamic data
• To apply arithmetic operations
• Load operand values from memory into registers
• Store result from register to memory
• Memory is byte addressed
• Each address identifies an 8-bit byte
• RISC-V is Little Endian
• Least-significant byte at least address of a word
• c.f. Big Endian: most-significant byte at least address
• RISC-V does not require words to be aligned in memory
• Unlike some other ISAs
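To make the byte-order point concrete, this short Python sketch lays out one 32-bit word in both orders. The value 0x0A0B0C0D is an arbitrary example, not taken from the lecture.

```python
value = 0x0A0B0C0D  # arbitrary 32-bit example word

# Little Endian (RISC-V): least-significant byte at the lowest address.
little = value.to_bytes(4, "little")

# Big Endian: most-significant byte at the lowest address.
big = value.to_bytes(4, "big")

for address in range(4):
    print(f"address {address}: little={little[address]:02x} big={big[address]:02x}")
```

Reading the bytes back with the same byte order recovers the original word; reading them with the other order does not.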
• C code:
A[12] = h + A[8];
• h in x21, base address of A in x22
• Compiled RISC-V code:
• Index 8 requires offset of 64
• 8 bytes per doubleword

ld x9, 64(x22)
add x9, x21, x9
sd x9, 96(x22)
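The offset arithmetic in this example (index times 8 bytes per doubleword) can be checked with a minimal Python sketch. The `ld` and `sd` helpers, the base address 0, and the sample values are assumptions made for illustration; memory is modelled as a little-endian byte array.

```python
MASK64 = (1 << 64) - 1
DOUBLEWORD = 8  # bytes per doubleword


def ld(mem, base, offset):
    """Load a 64-bit doubleword from byte address base + offset."""
    addr = base + offset
    return int.from_bytes(mem[addr:addr + DOUBLEWORD], "little")


def sd(mem, base, offset, value):
    """Store a 64-bit doubleword to byte address base + offset."""
    addr = base + offset
    mem[addr:addr + DOUBLEWORD] = (value & MASK64).to_bytes(DOUBLEWORD, "little")


mem = bytearray(16 * DOUBLEWORD)   # room for A[0..15]
base_A = 0                         # assumed base address of A (x22)
h = 5                              # assumed value of h (x21)
sd(mem, base_A, 8 * DOUBLEWORD, 37)  # assumed initial value A[8] = 37

x9 = ld(mem, base_A, 64)   # ld x9, 64(x22)
x9 = h + x9                # add x9, x21, x9
sd(mem, base_A, 96, x9)    # sd x9, 96(x22)
```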



• Registers are faster to access than memory
• Operating on memory data requires loads and stores
• More instructions to be executed
• Compiler must use registers for variables as much as possible
• Only spill to memory for less frequently used variables
• Register optimization is important!



• Constant data specified in an instruction
addi x22, x22, 4

• Make the common case fast


• Small constants are common
• Immediate operand avoids a load instruction



funct7 rs2 rs1 funct3 rd opcode

• Instruction Fields:
• opcode: operation code
• rd: destination register address
• funct3: 3-bit function code (additional operation code)
• rs1: first source register address
• rs2: second source register address
• funct7: 7-bit function code (additional operation code)


funct7 rs2 rs1 funct3 rd opcode
7 bits 5 bits 5 bits 3 bits 5 bits 7 bits

add x9, x20, x21


0 21 20 0 9 51

0000000 10101 10100 000 01001 0110011

(00000001010110100000010010110011)₂ = (015A04B3)₁₆
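The field packing above can be reproduced with a short Python sketch. The function name `encode_r` is an illustrative assumption; the field values are the ones listed on the slide for add x9, x20, x21.

```python
def encode_r(funct7, rs2, rs1, funct3, rd, opcode):
    """Pack the six R-format fields into one 32-bit instruction word."""
    return ((funct7 & 0x7F) << 25    # bits 31..25
            | (rs2 & 0x1F) << 20     # bits 24..20
            | (rs1 & 0x1F) << 15     # bits 19..15
            | (funct3 & 0x7) << 12   # bits 14..12
            | (rd & 0x1F) << 7       # bits 11..7
            | (opcode & 0x7F))       # bits 6..0


# add x9, x20, x21 with the field values from the slide: 0 21 20 0 9 51
word = encode_r(funct7=0, rs2=21, rs1=20, funct3=0, rd=9, opcode=51)
print(f"{word:032b}")
print(f"{word:08X}")
```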



immediate rs1 funct3 rd opcode
12 bits 5 bits 3 bits 5 bits 7 bits

• Immediate Arithmetic and load instructions:


• rs1: source or base register address
• immediate: small constant operand or offset added to base address, 2’s
complement, sign extended
• Design Principle 3: Good design demands good compromises
• Different formats complicate decoding, but allow 32-bit instructions uniformly
• Keep formats as similar as possible
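A minimal Python sketch of I-format packing and of sign extension of the 12-bit immediate follows. The helper names are hypothetical, and the opcode value 19 and funct3 value 0 for addi are taken from the standard RISC-V base encoding rather than from the slide.

```python
def encode_i(imm, rs1, funct3, rd, opcode):
    """Pack the five I-format fields into one 32-bit instruction word."""
    return ((imm & 0xFFF) << 20      # 12-bit immediate, bits 31..20
            | (rs1 & 0x1F) << 15     # bits 19..15
            | (funct3 & 0x7) << 12   # bits 14..12
            | (rd & 0x1F) << 7       # bits 11..7
            | (opcode & 0x7F))       # bits 6..0


def sign_extend12(field):
    """Interpret a 12-bit field as a 2's complement number."""
    field &= 0xFFF
    return field - 0x1000 if field & 0x800 else field


ADDI_OPCODE = 19  # assumption: standard encoding 0010011
ADDI_FUNCT3 = 0   # assumption: standard encoding 000

plus4 = encode_i(4, 22, ADDI_FUNCT3, 22, ADDI_OPCODE)    # addi x22, x22, 4
minus1 = encode_i(-1, 22, ADDI_FUNCT3, 22, ADDI_OPCODE)  # addi x22, x22, -1
```

Because rs1, funct3, rd and opcode sit in the same bit positions as in the R format, a decoder can extract them before it knows which format it is looking at.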



imm[11:5] rs2 rs1 funct3 imm[4:0] opcode
7 bits 5 bits 5 bits 3 bits 5 bits 7 bits

• Different immediate format for store instructions


• rs1: base address register number
• rs2: source operand register number
• immediate: offset added to base address
• split so that rs1 and rs2 fields always in the same place
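The split immediate can be sketched in Python as follows. The names `encode_s` and `decode_s_imm` are hypothetical, and the opcode value 35 and funct3 value 3 for sd are taken from the standard RISC-V base encoding rather than from the slide.

```python
def encode_s(imm, rs2, rs1, funct3, opcode):
    """Pack an S-format store; the 12-bit immediate is split in two pieces."""
    imm &= 0xFFF
    return ((imm >> 5) << 25         # imm[11:5], bits 31..25
            | (rs2 & 0x1F) << 20     # bits 24..20
            | (rs1 & 0x1F) << 15     # bits 19..15
            | (funct3 & 0x7) << 12   # bits 14..12
            | (imm & 0x1F) << 7      # imm[4:0], bits 11..7
            | (opcode & 0x7F))       # bits 6..0


def decode_s_imm(word):
    """Reassemble the 12-bit immediate from its two pieces."""
    return ((word >> 25) & 0x7F) << 5 | ((word >> 7) & 0x1F)


SD_OPCODE = 35  # assumption: standard encoding 0100011
SD_FUNCT3 = 3   # assumption: standard encoding 011

word = encode_s(96, rs2=9, rs1=22, funct3=SD_FUNCT3, opcode=SD_OPCODE)  # sd x9, 96(x22)
```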



Operation      C     RISC-V
Shift Left     <<    slli
Shift Right    >>    srli
Bitwise AND    &     and, andi
Bitwise OR     |     or, ori
Bitwise XOR    ^     xor, xori

• Useful for extracting and inserting groups of bits in a word


funct6 immed[5:0] rs1 funct3 rd opcode
6 bits 6 bits 5 bits 3 bits 5 bits 7 bits

• immed: how many positions to shift


• Shift left logical
• Shift left and fill with 0 bits
• slli by i bits multiplies by 2^i
• Shift right logical
• Shift right and fill with 0 bits
• srli by i bits divides by 2^i (unsigned only)
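The multiply and divide behaviour of the shifts can be checked with a small Python sketch on 64-bit values. The helper names mirror the instruction mnemonics but are otherwise illustrative assumptions.

```python
MASK64 = (1 << 64) - 1  # registers are 64 bits wide


def slli(x, i):
    """Shift left logical: fill with 0 bits, discard bits shifted out."""
    return (x << i) & MASK64


def srli(x, i):
    """Shift right logical: treat x as unsigned and fill with 0 bits."""
    return (x & MASK64) >> i


print(slli(5, 3))     # 5 * 2**3
print(srli(40, 3))    # 40 // 2**3

# For a negative 2's complement value the logical shift does not divide:
minus_eight = -8 & MASK64
print(hex(srli(minus_eight, 1)))
```

The last line shows why the slide says "unsigned only": the zero fill destroys the sign bit.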



• Useful to mask bits in a word
• Select some bits, clear others to 0
and x9,x10,x11

x10 00000000 00000000 00000000 00000000 00000000 00000000 00001101 11000000

x11 00000000 00000000 00000000 00000000 00000000 00000000 00111100 00000000

x9  00000000 00000000 00000000 00000000 00000000 00000000 00001100 00000000


• Useful to include bits in a word
• Set some bits to 1, leave others unchanged
or x9,x10,x11

x10 00000000 00000000 00000000 00000000 00000000 00000000 00001101 11000000

x11 00000000 00000000 00000000 00000000 00000000 00000000 00111100 00000000

x9  00000000 00000000 00000000 00000000 00000000 00000000 00111101 11000000


• Differencing operation
• logical exclusive-OR
xor x9,x10,x11 // Can be used for NOT

x10 00000000 00000000 00000000 00000000 00000000 00000000 00001101 11000000

x11 11111111 11111111 11111111 11111111 11111111 11111111 11111111 11111111

x9  11111111 11111111 11111111 11111111 11111111 11111111 11110010 00111111
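The three bit patterns from the AND, OR and XOR examples can be verified with Python's bitwise operators. The variable names follow the registers on the slides; `mask` and `all_ones` are names introduced here for the two different x11 values.

```python
MASK64 = (1 << 64) - 1

x10 = 0b00001101_11000000        # x10 in all three examples
mask = 0b00111100_00000000       # x11 in the AND and OR examples
all_ones = MASK64                # x11 in the XOR example

and_result = x10 & mask          # select some bits, clear others to 0
or_result = x10 | mask           # set some bits to 1, leave others unchanged
xor_result = x10 ^ all_ones      # XOR with all ones acts as NOT

print(f"{and_result:064b}")
print(f"{or_result:064b}")
print(f"{xor_result:064b}")
```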



• RISC-V has two basic data transfer instructions for accessing memory
lw $t0, 4($s3) #load word from memory
sw $t0, 8($s3) #store word to memory

• The data is loaded into (lw) or stored from (sw) a register in the register file – a 5 bit address
• The memory address – a 32 bit address – is formed by adding the contents of the base address register to the offset value
• The offset is a 12-bit field, meaning access is limited to memory locations within a region of ±2^11 or 2,048 bytes of the address in the base register
• Note that the offset can be positive or negative
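The address calculation described above (base register plus sign-extended 12-bit offset) can be sketched in Python. The function names are hypothetical, and the base value 0x12004094 is borrowed from the lw example later in the lecture.

```python
def sign_extend12(field):
    """Interpret a 12-bit offset field as a 2's complement number."""
    field &= 0xFFF
    return field - 0x1000 if field & 0x800 else field


def effective_address(base, offset_field):
    """Memory address = base register contents + sign-extended offset."""
    return base + sign_extend12(offset_field)


base = 0x12004094
print(hex(effective_address(base, 24)))      # positive offset
print(hex(effective_address(base, 0xFFC)))   # field 0xFFC encodes -4

# Reach of the 12-bit field around the base register:
lowest = effective_address(base, 0x800)   # base - 2048
highest = effective_address(base, 0x7FF)  # base + 2047
```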



immediate rs1 funct3 rd opcode
12 bits 5 bits 3 bits 5 bits 7 bits

• I Format Instruction
• rs1: source or base register address
• immediate: small constant operand or offset added to base
address, 2’s complement, sign extended



lw $t0, 24($s2)

24₁₀ + $s2 =
      . . . 0001 1000   (offset 24)
    + . . . 1001 0100   ($s2 = 0x12004094)
    = . . . 1010 1100   (0x120040ac)

[Figure: Memory drawn as data words at word addresses (hex) 0x00000000, 0x00000004, 0x00000008, 0x0000000c, … up to 0xffffffff; the word at address 0x120040ac is loaded into $t0]
imm[11:5] rs2 rs1 funct3 imm[4:0] opcode
7 bits 5 bits 5 bits 3 bits 5 bits 7 bits

• S Format instruction
• Different immediate format for store instructions
• rs1: base address register number
• rs2: source operand register number
• immediate: offset added to base address
• split so that rs1 and rs2 fields always in the same place



sw $s1, 24($s2)

24₁₀ + $s2 =
      . . . 0001 1000   (offset 24)
    + . . . 1001 0100   ($s2 = 0x12004094)
    = . . . 1010 1100   (0x120040ac)

[Figure: Memory drawn as data words at word addresses (hex) 0x00000000, 0x00000004, 0x00000008, 0x0000000c, … up to 0xffffffff; the contents of $s1 are stored to the word at address 0x120040ac]
• RISC-V conditional branch instructions:
bne $s0, $s1, Lbl #go to Lbl if $s0≠$s1
beq $s0, $s1, Lbl #go to Lbl if $s0=$s1
Ex: if (i==j) h = i + j;
bne $s0, $s1, Lbl1
add $s3, $s0, $s1
Lbl1: ...
• Instruction Format (SB format):
imm[12|10:5] rs2 rs1 funct3 imm[4:1|11] opcode
7 bits 5 bits 5 bits 3 bits 5 bits 7 bits
• How is the branch destination address specified?
• Use a register (like in lw and sw) added to the 12-bit offset
• which register? Instruction Address Register (the PC)
• its use is automatically implied by instruction
• PC gets updated (PC+4) during the fetch cycle so that it holds the address of the next instruction
• limits the branch distance to -2^12 to +2^12-2 bytes from the branch instruction (the 12-bit field holds bits [12:1] of the byte offset), but most branches are local anyway
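A minimal Python sketch of PC-relative branch addressing follows. It assumes the standard RISC-V convention in which the 12-bit field holds bits [12:1] of a signed byte offset that is added to the address of the branch instruction itself; the function names and the sample PC value are hypothetical.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a 2's complement number."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def branch_target(pc, imm_field):
    """Branch destination = PC of the branch + sign-extended byte offset.

    imm_field is the 12-bit immediate; bit 0 of the byte offset is always 0,
    so the field is shifted left by one before sign extension to 13 bits.
    """
    offset = sign_extend(imm_field << 1, 13)
    return pc + offset


pc = 0x1000                              # assumed address of the branch
forward = branch_target(pc, 4)           # 4 halfwords = 8 bytes ahead
backward = branch_target(pc, 0xFFE)      # field 0xFFE encodes -4 bytes
farthest_back = branch_target(pc, 0x800)
farthest_forward = branch_target(pc, 0x7FF)
```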



[Figure: branch address datapath. The 12-bit offset taken from the upper-order bits of the branch instruction is sign-extended and added to the PC to form the branch destination address; a second adder forms PC + 4; data paths are 32 bits wide.]


• C code:
if (i==j) f = g+h;
else f = g-h;

• f, g, … in x19, x20, …

• Compiled RISC-V code:


bne x22, x23, Else
add x19, x20, x21
beq x0,x0,Exit // unconditional
Else: sub x19, x20, x21
Exit: …
Assembler calculates addresses



• C code:
while (save[i] == k) i += 1;

• i in x22, k in x24, address of save in x25

• Compiled RISC-V code:


Loop: slli x10, x22, 3
add x10, x10, x25
ld x9, 0(x10)
bne x9, x24, Exit
addi x22, x22, 1
beq x0, x0, Loop
Exit: …
Assembler calculates addresses



• Most constants are small
• 12-bit immediate is sufficient
• For the occasional 32-bit constant
lui rd, constant
• Copies 20-bit constant to bits [31:12] of rd
• Extends bit 31 to bits [63:32]
• Clears bits [11:0] of rd to 0
lui x19, 976 //0x003D0
00000000 00000000 00000000 00000000 00000000 00111101 00000000 00000000

addi x19, x19, 1280 //0x500


00000000 00000000 00000000 00000000 00000000 00111101 00000101 00000000
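The two-instruction sequence for building a 32-bit constant can be traced with a short Python sketch. The helper functions are illustrative assumptions that follow the three lui bullets above; the addi immediate used is 1280 (0x500), the value that matches the final bit pattern shown.

```python
MASK64 = (1 << 64) - 1


def lui(imm20):
    """Place a 20-bit constant in bits [31:12], clear bits [11:0],
    and extend bit 31 into bits [63:32]."""
    value = (imm20 & 0xFFFFF) << 12
    if value & 0x80000000:
        value |= 0xFFFFFFFF00000000
    return value


def addi(rs1_value, imm12):
    """Add a sign-extended 12-bit immediate to a 64-bit register value."""
    imm12 &= 0xFFF
    if imm12 & 0x800:
        imm12 -= 0x1000
    return (rs1_value + imm12) & MASK64


x19 = lui(976)          # lui  x19, 976   (0x003D0)
upper_only = x19
x19 = addi(x19, 1280)   # addi x19, x19, 1280   (0x500)
print(f"{x19:064b}")
```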



• Jump and link (jal) target uses 20-bit immediate for larger range
• UJ format:
imm[20] imm[10:1] imm[11] imm[19:12] rd opcode
1 bit 10 bits 1 bit 8 bits 5 bits 7 bits
• For long jumps, e.g., to 32-bit absolute address
lui: load address[31:12] to temp register
jalr: add address[11:0] and jump to target
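The lui plus jalr split of a 32-bit address can be sketched in Python. One detail is an assumption beyond the slide: because jalr sign-extends its 12-bit immediate, the upper part is rounded by adding 0x800 before shifting so that the two pieces still sum to the target. All function names are hypothetical.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a 2's complement number."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def split_address(addr):
    """Split a 32-bit address into a lui part (20 bits) and a jalr part (12 bits)."""
    lo = addr & 0xFFF
    hi = ((addr + 0x800) >> 12) & 0xFFFFF  # round up when lo will be negative
    return hi, lo


def long_jump_target(hi, lo):
    """Recombine: lui loads hi into bits [31:12], jalr adds sign-extended lo."""
    temp = sign_extend(hi << 12, 32)
    return (temp + sign_extend(lo, 12)) & 0xFFFFFFFF


hi, lo = split_address(0x12345FFC)
target = long_jump_target(hi, lo)
```

Without the rounding step, any address whose bit 11 is set would land 4096 bytes too low.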





[Figure: organization so far. Processor: register file of 32 64-bit registers with 5-bit src1 addr, src2 addr and dst addr inputs, a 64-bit write data input and 64-bit src1 data and src2 data outputs; ALU; PC with adders for PC + 4 and the branch offset; Fetch, Decode, Execute cycle. Memory: 2^62 words of 32 bits, byte addressed, with read/write addr, read data and write data connections to the processor.]


Category             Instructions                OpCode    Example             Semantics
Arithmetic           add                         0 and 32  add $s1, $s2, $s3   $s1 = $s2 + $s3
(R & I format)       subtract                    0 and 34  sub $s1, $s2, $s3   $s1 = $s2 - $s3
                     add immediate               8         addi $s1, $s2, 6    $s1 = $s2 + 6
                     or immediate                13        ori $s1, $s2, 6     $s1 = $s2 | 6
Data Transfer        load word                   35        lw $s1, 24($s2)     $s1 = Memory($s2+24)
(I format)           store word                  43        sw $s1, 24($s2)     Memory($s2+24) = $s1
                     load byte                   32        lb $s1, 25($s2)     $s1 = Memory($s2+25)
                     store byte                  40        sb $s1, 25($s2)     Memory($s2+25) = $s1
                     load upper immediate        15        lui $s1, 6          $s1 = signextended(6 * 2^12) padded with zeroes
Conditional Branch   br on equal                 4         beq $s1, $s2, L     if ($s1==$s2) go to L
(I & R format)       br on not equal             5         bne $s1, $s2, L     if ($s1 !=$s2) go to L
                     set on less than            0 and 42  slt $s1, $s2, $s3   if ($s2<$s3) $s1=1 else $s1=0
                     set on less than immediate  10        slti $s1, $s2, 6    if ($s2<6) $s1=1 else $s1=0
Unconditional Jump   jump and link               3         jal 2500            go to 10000; $ra=PC+4
(J & R format)       jump and link register      0 and 8   jalr $t1            go to $t1


May I answer any of your
questions?

