Lecture 4,5 - Processor Architecture Overview - Part 2

Uploaded by

Sohila Lashien
Copyright © All Rights Reserved

Digital Design & Computer Architecture
Sarah Harris & David Harris

Chapter 6: Architecture

Chapter 6 :: Topics
• Introduction
• Assembly Language
• Programming
• Machine Language
• Addressing Modes
• Lights, Camera, Action: Compiling, Assembly, & Loading
• Odds & Ends

2 Digital Design & Computer Architecture Architecture


Chapter 6: Architecture

Logical / Shift Instructions

Programming
• High-level languages:
– e.g., C, Java, Python
– Written at higher level of abstraction
• High-level constructs: loops, conditional statements, arrays, function calls
• First, introduce instructions that support these:
– Logical operations
– Shift instructions
– Multiplication & division
– Branches & Jumps



Logical Instructions
• and, or, xor
– and: useful for masking bits
• Masking all but the least significant byte of a value:
0xF234012F AND 0x000000FF = 0x0000002F
– or: useful for combining bit fields
• Combine 0xF2340000 with 0x000012BC:
0xF2340000 OR 0x000012BC = 0xF23412BC
– xor: useful for inverting bits:
• A XOR -1 = NOT A (remember that -1 = 0xFFFFFFFF)
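As a rough illustration of these three uses, the C sketch below reproduces the slide values with C's bitwise operators. The helper names (low_byte, combine_fields, invert_bits) are made up for this example and are not part of the lecture.

```c
#include <stdint.h>

/* Keep only the least significant byte (AND with a mask). */
uint32_t low_byte(uint32_t value) {
    return value & 0x000000FFu;
}

/* Merge two values whose set bits do not overlap (OR). */
uint32_t combine_fields(uint32_t upper, uint32_t lower) {
    return upper | lower;
}

/* Invert every bit by XOR-ing with all ones (-1 = 0xFFFFFFFF). */
uint32_t invert_bits(uint32_t a) {
    return a ^ 0xFFFFFFFFu;
}
```

Calling low_byte(0xF234012F) and combine_fields(0xF2340000, 0x000012BC) gives the two results shown on the slide.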



Logical Instructions: Example 1
Source Registers
s1 0100 0110 1010 0001 1111 0001 1011 0111
s2 1111 1111 1111 1111 0000 0000 0000 0000

Assembly Code Result


and s3, s1, s2 s3 0100 0110 1010 0001 0000 0000 0000 0000
or s4, s1, s2 s4 1111 1111 1111 1111 1111 0001 1011 0111
xor s5, s1, s2 s5 1011 1001 0101 1110 1111 0001 1011 0111



Logical Instructions: Example 2
Source Values
t3 0011 1010 0111 0101 0000 1101 0110 1111

imm 1111 1111 1111 1111 1111 1010 0011 0100


sign-extended

Assembly Code Result


andi s5, t3, -1484 s5 0011 1010 0111 0101 0000 1000 0010 0100
ori s6, t3, -1484 s6 1111 1111 1111 1111 1111 1111 0111 1111
xori s7, t3, -1484 s7 1100 0101 1000 1010 1111 0111 0101 1011

-1484 = 0xA34 in 12-bit 2’s complement representation.
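The sign extension of the 12-bit immediate can be sketched in C as follows. This is only a model of the behavior shown in the example: the function names mirror the instruction mnemonics for readability, and sign_extend12 is a hypothetical helper.

```c
#include <stdint.h>

/* Sign-extend a 12-bit field to 32 bits: copy bit 11 into bits 31..12. */
uint32_t sign_extend12(uint32_t imm12) {
    imm12 &= 0xFFFu;
    return (imm12 & 0x800u) ? (imm12 | 0xFFFFF000u) : imm12;
}

/* Each immediate instruction applies its operation to the extended value. */
uint32_t andi(uint32_t rs1, uint32_t imm12) { return rs1 & sign_extend12(imm12); }
uint32_t ori(uint32_t rs1, uint32_t imm12)  { return rs1 | sign_extend12(imm12); }
uint32_t xori(uint32_t rs1, uint32_t imm12) { return rs1 ^ sign_extend12(imm12); }
```

With t3 = 0x3A750D6F and the immediate 0xA34, these functions return the s5, s6, and s7 bit patterns from the example.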



Shift Instructions
Shift amount is in (lower 5 bits of) a register
• sll: shift left logical
– Example: sll t0, t1, t2 # t0 = t1 << t2
• srl: shift right logical
– Example: srl t0, t1, t2 # t0 = t1 >> t2
• sra: shift right arithmetic
– Example: sra t0, t1, t2 # t0 = t1 >>> t2



Immediate Shift Instructions
Shift amount is an immediate from 0 to 31
• slli: shift left logical immediate
– Example: slli t0, t1, 23 # t0 = t1 << 23
• srli: shift right logical immediate
– Example: srli t0, t1, 18 # t0 = t1 >> 18
• srai: shift right arithmetic immediate
– Example: srai t0, t1, 5 # t0 = t1 >>> 5
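The difference between the logical and arithmetic right shifts is easiest to see on a value whose top bit is set. The C sketch below models the three shifts on 32-bit values; it is an illustration rather than a hardware description, and the function names simply reuse the mnemonics. The arithmetic shift is built from unsigned operations because right-shifting a negative signed value is implementation-defined in C.

```c
#include <stdint.h>

/* sll/slli: shift left, zeros enter on the right. */
uint32_t sll(uint32_t value, uint32_t amount) {
    return value << (amount & 31u);      /* only the low 5 bits count */
}

/* srl/srli: shift right, zeros enter on the left. */
uint32_t srl(uint32_t value, uint32_t amount) {
    return value >> (amount & 31u);
}

/* sra/srai: shift right, copies of the sign bit enter on the left. */
uint32_t sra(uint32_t value, uint32_t amount) {
    uint32_t n = amount & 31u;
    uint32_t shifted = value >> n;
    if ((value & 0x80000000u) && n > 0) {
        /* Fill the top n bits with ones. */
        shifted |= (uint32_t)~(UINT32_MAX >> n);
    }
    return shifted;
}
```

For example, srl(0xF0000000, 4) yields 0x0F000000 while sra(0xF0000000, 4) yields 0xFF000000.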



Chapter 6: Architecture

Multiplication and Division

Multiplication
32 × 32 multiplication → 64 bit result
mul s3, s1, s2
s3 = lower 32 bits of result
mulh s4, s1, s2
s4 = upper 32 bits of result, treats operands as signed
{s4, s3} = s1 x s2
Example: s1 = 0x40000000 = 2^30; s2 = 0x80000000 = -2^31
s1 × s2 = -2^61 = 0xE0000000 00000000
s4 = 0xE0000000; s3 = 0x00000000
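A minimal C sketch of the same split, assuming a 64-bit intermediate product is available. The names mul_lo and mulh_hi are hypothetical and chosen only for this illustration.

```c
#include <stdint.h>

/* mul: lower 32 bits of the 64-bit product. */
uint32_t mul_lo(int32_t a, int32_t b) {
    int64_t product = (int64_t)a * (int64_t)b;
    return (uint32_t)((uint64_t)product & 0xFFFFFFFFu);
}

/* mulh: upper 32 bits of the 64-bit product, operands treated as signed. */
uint32_t mulh_hi(int32_t a, int32_t b) {
    int64_t product = (int64_t)a * (int64_t)b;
    return (uint32_t)((uint64_t)product >> 32);
}
```

Passing 0x40000000 and INT32_MIN (the signed reading of 0x80000000) reproduces the upper word 0xE0000000 and the lower word 0x00000000 from the example.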



Division
32-bit division → 32-bit quotient & remainder
– div s3, s1, s2 # s3 = s1/s2
– rem s4, s1, s2 # s4 = s1%s2

Example: s1 = 0x00000011 = 17; s2 = 0x00000003 = 3


s1 / s2 = 5
s1 % s2 = 2
s3 = 0x00000005; s4 = 0x00000002



Chapter 6: Architecture

Branches & Jumps


Branching
• Execute instructions out of sequence
• Types of branches:
– Conditional
• branch if equal (beq)
• branch if not equal (bne)
• branch if less than (blt)
• branch if greater than or equal (bge)
– Unconditional
• jump (j)
• jump register (jr)
• jump and link (jal)
• jump and link register (jalr)
We’ll talk about jr, jal, and jalr when we discuss function calls.

Conditional Branching
# RISC-V assembly
addi s0, zero, 4 # s0 = 0 + 4 = 4
addi s1, zero, 1 # s1 = 0 + 1 = 1
slli s1, s1, 2 # s1 = 1 << 2 = 4
beq s0, s1, target # branch is taken
addi s1, s1, 1 # not executed
sub s1, s1, s0 # not executed

target: # label
add s1, s1, s0 # s1 = 4 + 4 = 8

Labels indicate instruction location. They can’t be reserved words and must be followed by a colon (:)



The Branch Not Taken (bne)
# RISC-V assembly
addi s0, zero, 4 # s0 = 0 + 4 = 4
addi s1, zero, 1 # s1 = 0 + 1 = 1
slli s1, s1, 2 # s1 = 1 << 2 = 4
bne s0, s1, target # branch not taken
addi s1, s1, 1 # s1 = 4 + 1 = 5
sub s1, s1, s0 # s1 = 5 – 4 = 1

target:
add s1, s1, s0 # s1 = 1 + 4 = 5



Unconditional Branching (j)
# RISC-V assembly
j target # jump to target
srai s1, s1, 2 # not executed
addi s1, s1, 1 # not executed
sub s1, s1, s0 # not executed

target:
add s1, s1, s0 # s1 = 1 + 4 = 5



Chapter 6: Architecture

Conditional Statements & Loops
• Conditional Statements
– if statements
– if/else statements
• Loops
– while loops
– for loops



If Statement
C Code
if (i == j)
  f = g + h;

f = f - i;

RISC-V assembly code
# s0 = f, s1 = g, s2 = h
# s3 = i, s4 = j
    bne s3, s4, L1    # skip the add when i != j
    add s0, s1, s2    # f = g + h
L1:
    sub s0, s0, s3    # f = f - i

Assembly tests opposite case (i != j) of high-level code (i == j)



If/Else Statement
C Code
if (i == j)
  f = g + h;
else
  f = f - i;

RISC-V assembly code
# s0 = f, s1 = g, s2 = h
# s3 = i, s4 = j
    bne s3, s4, L1    # go to the else part when i != j
    add s0, s1, s2    # f = g + h
    j done            # skip the else part
L1:
    sub s0, s0, s3    # f = f - i
done:

Assembly tests opposite case (i != j) of high-level code (i == j)
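One way to see why the assembly tests the opposite case is to write the same control flow in C with goto, so each statement corresponds to one instruction. This is a sketch for illustration only; the function name if_else and its parameter list are assumptions.

```c
/* Same control flow as the assembly: branch away when i != j. */
int if_else(int f, int g, int h, int i, int j) {
    if (i != j) goto L1;    /* bne s3, s4, L1 */
    f = g + h;              /* add s0, s1, s2 */
    goto done;              /* j   done       */
L1:
    f = f - i;              /* sub s0, s0, s3 */
done:
    return f;
}
```

When i equals j the branch falls through to the "then" part; otherwise control jumps straight to the "else" part at L1.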



While Loops
C Code
// determines the power x
// such that 2^x = 128
int pow = 1;
int x = 0;

while (pow != 128) {
  pow = pow * 2;
  x = x + 1;
}

RISC-V assembly code
# s0 = pow, s1 = x
    addi s0, zero, 1      # pow = 1
    add  s1, zero, zero   # x = 0
    addi t0, zero, 128    # t0 = 128
while:
    beq  s0, t0, done     # exit when pow == 128
    slli s0, s0, 1        # pow = pow * 2
    addi s1, s1, 1        # x = x + 1
    j    while            # repeat
done:

Assembly tests opposite case (pow == 128) of high-level code
( pow != 128)
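The loop structure of the assembly (test at the top, unconditional jump back at the bottom) can be mirrored in C with goto. The sketch below is a hypothetical restatement for illustration; log2_of_128 is a made-up name.

```c
/* Returns x such that 2^x = 128, following the assembly loop step by step. */
int log2_of_128(void) {
    int pow = 1;                     /* addi s0, zero, 1    */
    int x = 0;                       /* add  s1, zero, zero */
    const int limit = 128;           /* addi t0, zero, 128  */
loop:
    if (pow == limit) goto done;     /* beq  s0, t0, done   */
    pow = pow << 1;                  /* slli s0, s0, 1      */
    x = x + 1;                       /* addi s1, s1, 1      */
    goto loop;                       /* j    while          */
done:
    return x;
}
```

The loop body runs seven times (pow goes 1, 2, 4, ..., 128), so the function returns 7.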



For Loops
for (initialization; condition; loop operation)
statement

• initialization: executes before the loop begins


• condition: is tested at the beginning of each iteration
• loop operation: executes at the end of each iteration
• statement: executes each time the condition is met



For Loops
C Code
// add the numbers from 0 to 9
int sum = 0;
int i;

for (i=0; i!=10; i = i+1) {
  sum = sum + i;
}

RISC-V assembly code
# s0 = i, s1 = sum
    addi s1, zero, 0      # sum = 0
    add  s0, zero, zero   # i = 0
    addi t0, zero, 10     # t0 = 10
for:
    beq  s0, t0, done     # exit when i == 10
    add  s1, s1, s0       # sum = sum + i
    addi s0, s0, 1        # i = i + 1
    j    for              # repeat
done:



Less Than Comparison
C Code
// add the powers of 2 from 1 to 100
int sum = 0;
int i;

for (i=1; i < 101; i = i*2) {
  sum = sum + i;
}

RISC-V assembly code
# s0 = i, s1 = sum
    addi s1, zero, 0      # sum = 0
    addi s0, zero, 1      # i = 1
    addi t0, zero, 101    # t0 = 101
loop:
    bge  s0, t0, done     # exit when i >= 101
    add  s1, s1, s0       # sum = sum + i
    slli s0, s0, 1        # i = i * 2
    j    loop             # repeat
done:



Less Than Comparison: Version 2
C Code
// add the powers of 2 from 1 to 100
int sum = 0;
int i;

for (i=1; i < 101; i = i*2) {
  sum = sum + i;
}

RISC-V assembly code
# s0 = i, s1 = sum
    addi s1, zero, 0      # sum = 0
    addi s0, zero, 1      # i = 1
    addi t0, zero, 101    # t0 = 101
loop:
    slt  t2, s0, t0       # t2 = 1 if i < 101, else 0
    beq  t2, zero, done   # exit when i >= 101
    add  s1, s1, s0       # sum = sum + i
    slli s0, s0, 1        # i = i * 2
    j    loop             # repeat
done:

slt: set if less than instruction


slt t2, s0, t0 # if s0 < t0, t2 = 1
# otherwise t2 = 0
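To make the two-step comparison concrete, here is a small C sketch in which slt is modeled as a function returning 1 or 0 and the loop exits when that flag is zero. The function names are assumptions for this example, not part of the lecture.

```c
/* slt: produce 1 when a < b (signed comparison), otherwise 0. */
int slt(int a, int b) {
    return (a < b) ? 1 : 0;
}

/* Adds the powers of 2 below 101, following the slt/beq loop. */
int sum_powers_of_two(void) {
    int sum = 0;                     /* addi s1, zero, 0    */
    int i = 1;                       /* addi s0, zero, 1    */
    const int limit = 101;           /* addi t0, zero, 101  */
    for (;;) {
        int t2 = slt(i, limit);      /* slt  t2, s0, t0     */
        if (t2 == 0) break;          /* beq  t2, zero, done */
        sum = sum + i;               /* add  s1, s1, s0     */
        i = i << 1;                  /* slli s0, s0, 1      */
    }                                /* j    loop           */
    return sum;
}
```

The loop adds 1 + 2 + 4 + 8 + 16 + 32 + 64, so the result is 127.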



Chapter 6: Architecture

Arrays
Arrays
• Access large amounts of similar data
• Index: access each element
• Size: number of elements



Arrays
• 5-element array
• Base address = 0x123B4780 (address of first element, array[0])
• First step in accessing an array: load base address into a register
Address Data

123B4790 array[4]
123B478C array[3]
123B4788 array[2]
123B4784 array[1]
123B4780 array[0]

Main Memory
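The addresses in the table follow from the base address plus four bytes per element. A minimal C sketch of that calculation, assuming 4-byte (word) elements as in the table; element_address is a hypothetical helper name.

```c
#include <stdint.h>

/* Byte address of array[index] for 4-byte (word) elements. */
uint32_t element_address(uint32_t base, uint32_t index) {
    return base + (index << 2);   /* index * 4 bytes per word */
}
```

With the base address 0x123B4780, index 4 maps to 0x123B4790, matching the top row of the table.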



Accessing Arrays
// C Code
int array[5];
array[0] = array[0] * 2;
array[1] = array[1] * 2;

Address    Data
123B4790   array[4]
123B478C   array[3]
123B4788   array[2]
123B4784   array[1]
123B4780   array[0]
Main Memory

# RISC-V assembly code
# s0 = array base address
lui  s0, 0x123B4      # 0x123B4 in upper 20 bits of s0
addi s0, s0, 0x780    # s0 = 0x123B4780

lw   t1, 0(s0)        # t1 = array[0]
slli t1, t1, 1        # t1 = t1 * 2
sw   t1, 0(s0)        # array[0] = t1

lw   t1, 4(s0)        # t1 = array[1]
slli t1, t1, 1        # t1 = t1 * 2
sw   t1, 4(s0)        # array[1] = t1



Accessing Arrays Using For Loops
// C Code
int array[1000];
int i;

for (i=0; i < 1000; i = i + 1)
  array[i] = array[i] * 8;

# RISC-V assembly code
# s0 = array base address, s1 = i
# initialization code
lui s0, 0x23B8F # s0 = 0x23B8F000
ori s0, s0, 0x400 # s0 = 0x23B8F400
addi s1, zero, 0 # i = 0
addi t2, zero, 1000 # t2 = 1000

loop:
bge s1, t2, done # if i >= 1000 then done
slli t0, s1, 2 # t0 = i * 4 (byte offset)
add t0, t0, s0 # address of array[i]
lw t1, 0(t0) # t1 = array[i]
slli t1, t1, 3 # t1 = array[i] * 8
sw t1, 0(t0) # array[i] = array[i] * 8
addi s1, s1, 1 # i = i + 1
j loop # repeat
done:
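For comparison, a C sketch of the same loop that uses a left shift by 3 for the multiplication, as the assembly does. In C the byte offset (i * 4) is computed implicitly by the array indexing. The function name scale_by_eight and the count parameter are assumptions for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Multiply each element by 8 using a left shift by 3. */
void scale_by_eight(int32_t *array, size_t count) {
    for (size_t i = 0; i < count; i = i + 1) {
        /* Shift as unsigned to avoid undefined behavior on negative values. */
        array[i] = (int32_t)((uint32_t)array[i] << 3);
    }
}
```

Calling scale_by_eight(array, 1000) on the 1000-element array corresponds to the loop above.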



ASCII Code
• ASCII: American Standard Code for Information Interchange
• Each text character has a unique byte value
– For example, S = 0x53, a = 0x61, A = 0x41
– Lower-case and upper-case differ by 0x20 (32)
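Because the two cases differ by exactly 0x20, case conversion reduces to one subtraction. The following C sketch assumes plain ASCII input; to_upper_ascii is a hypothetical name used only here.

```c
/* Convert an ASCII lower-case letter to upper case by subtracting 0x20. */
char to_upper_ascii(char c) {
    if (c >= 'a' && c <= 'z') {
        return (char)(c - 0x20);
    }
    return c;   /* leave every other character unchanged */
}
```

For instance, 'a' (0x61) becomes 'A' (0x41), while digits and punctuation pass through unchanged.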



Cast of Characters: ASCII Encodings
Hex Char Hex Char Hex Char Hex Char Hex Char Hex Char
20 space 30 0 40 @ 50 P 60 ` 70 p
21 ! 31 1 41 A 51 Q 61 a 71 q
22 " 32 2 42 B 52 R 62 b 72 r
23 # 33 3 43 C 53 S 63 c 73 s
24 $ 34 4 44 D 54 T 64 d 74 t
25 % 35 5 45 E 55 U 65 e 75 u
26 & 36 6 46 F 56 V 66 f 76 v
27 ' 37 7 47 G 57 W 67 g 77 w
28 ( 38 8 48 H 58 X 68 h 78 x
29 ) 39 9 49 I 59 Y 69 i 79 y
2A * 3A : 4A J 5A Z 6A j 7A z
2B + 3B ; 4B K 5B [ 6B k 7B {
2C , 3C < 4C L 5C \ 6C l 7C |
2D - 3D = 4D M 5D ] 6D m 7D }
2E . 3E > 4E N 5E ^ 6E n 7E ~
2F / 3F ? 4F O 5F _ 6F o



Accessing Arrays of Characters
// C Code
char str[80] = "CAT";
int len = 0;

// compute length of string
while (str[len]) len++;

# RISC-V assembly code
# s0 = array base address, s1 = len
       addi s1, zero, 0      # len = 0
while: add  t0, s0, s1       # address of str[len]
       lb   t1, 0(t0)        # load str[len] (one byte, so lb rather than lw)
       beq  t1, zero, done   # are we at the end of the string?
       addi s1, s1, 1        # len++
       j    while            # repeat while loop
done:
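The same length computation can be sketched in C as a function that reads one byte per iteration and stops at the terminating zero byte. string_length is a hypothetical name; the standard library's strlen does the same job.

```c
/* Length of a NUL-terminated string, reading one byte per iteration. */
int string_length(const char *str) {
    int len = 0;                 /* len = 0                         */
    while (str[len] != 0) {      /* load one byte, compare with 0   */
        len = len + 1;           /* len++                           */
    }
    return len;
}
```

Because each character occupies one byte, the address of str[len] is simply the base address plus len, with no scaling by 4.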



Chapter 6: Architecture

Function Calls
Function Calls
• Caller: calling function (in this case, main)
• Callee: called function (in this case, sum)
C Code
void main()
{
int y;
y = sum(42, 7);
...
}

int sum(int a, int b)
{
return (a + b);
}



Simple Function Call
C Code
int main() {
  simple();
  a = b + c;
}

void simple() {
  return;
}

RISC-V assembly code
0x00000300 main:   jal simple      # call
0x00000304         add s0, s1, s2
...                ...
0x0000051c simple: jr ra           # return

void means that simple doesn’t return a value

jal simple:
ra = PC + 4 (0x00000304)
jumps to simple label (PC = 0x0000051c)
jr ra:
PC = ra (0x00000304)
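The effect of jal and jr on PC and ra can be traced with a tiny C model that tracks only those two values. This is a hypothetical sketch for illustration: the Cpu struct and the function names are made up, and everything else about the processor is ignored.

```c
#include <stdint.h>

/* Hypothetical two-register model: only pc and ra are tracked. */
typedef struct {
    uint32_t pc;
    uint32_t ra;
} Cpu;

/* jal target: remember the address of the next instruction, then jump. */
void jal(Cpu *cpu, uint32_t target) {
    cpu->ra = cpu->pc + 4;
    cpu->pc = target;
}

/* jr ra: continue at the saved return address. */
void jr_ra(Cpu *cpu) {
    cpu->pc = cpu->ra;
}
```

Starting with pc = 0x00000300, jal to 0x0000051c sets ra to 0x00000304, and the following jr ra brings pc back to 0x00000304.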



Function Calling Conventions
• Caller:
– passes arguments to callee
– jumps to callee
• Callee:
– performs the function
– returns result to caller
– returns to point of call
– must not overwrite registers or memory needed by
caller



RISC-V Function Calling Conventions
• Call Function: jump and link (jal func)
• Return from function: jump register (jr ra)
• Arguments: a0 – a7
• Return value: a0



Input Arguments & Return Value
C Code
int main()
{
int y;
...
y = diffofsums(2, 3, 4, 5); // 4 arguments
...
}

int diffofsums(int f, int g, int h, int i)
{
int result;
result = (f + g) - (h + i);
return result; // return value
}



Input Arguments & Return Value
RISC-V assembly code
# s7 = y
main:
. . .
addi a0, zero, 2 # argument 0 = 2
addi a1, zero, 3 # argument 1 = 3
addi a2, zero, 4 # argument 2 = 4
addi a3, zero, 5 # argument 3 = 5
jal diffofsums # call function
add s7, a0, zero # y = returned value
. . .
# s3 = result
diffofsums:
add t0, a0, a1 # t0 = f + g
add t1, a2, a3 # t1 = h + i
sub s3, t0, t1 # result = (f + g) − (h + i)
add a0, s3, zero # put return value in a0
jr ra # return to caller



Input Arguments & Return Value
RISC-V assembly code
# s3 = result
diffofsums:
add t0, a0, a1 # t0 = f + g
add t1, a2, a3 # t1 = h + i
sub s3, t0, t1 # result = (f + g) − (h + i)
add a0, s3, zero # put return value in a0
jr ra # return to caller

• diffofsums overwrote 3 registers: t0, t1, s3
• diffofsums can use the stack to temporarily store registers



Chapter 6: Architecture

The Stack
The Stack
• Memory used to temporarily save variables
• Like a stack of dishes, last-in-first-out (LIFO) queue
• Expands: uses more memory when more space is needed
• Contracts: uses less memory when the space is no longer needed



The Stack
• Grows down (from higher to lower memory addresses)
• Stack pointer: sp points to top of the stack

Before:
Address   Data
BEFFFAE8  AB000001  <- sp
BEFFFAE4
BEFFFAE0
BEFFFADC

After:
Address   Data
BEFFFAE8  AB000001
BEFFFAE4  12345678
BEFFFAE0  FFEEDDCC  <- sp
BEFFFADC

Make room on stack for 2 words.
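A downward-growing stack can be modeled in C with an array and an index that decreases on each push. This is a simplified sketch under stated assumptions: the Stack struct, the fixed capacity STACK_WORDS, and the function names are all hypothetical, and sp here is a word index rather than a byte address.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16

/* Hypothetical model: sp is a word index that decreases as the stack grows. */
typedef struct {
    uint32_t words[STACK_WORDS];
    int sp;                      /* index of the top of the stack */
} Stack;

void stack_init(Stack *s) { s->sp = STACK_WORDS; }   /* empty stack */

/* Push: move sp down first, then store (like addi sp, sp, -4 then sw). */
void push(Stack *s, uint32_t value) {
    assert(s->sp > 0);           /* room must remain */
    s->sp -= 1;
    s->words[s->sp] = value;
}

/* Pop: load the top word, then move sp back up. */
uint32_t pop(Stack *s) {
    assert(s->sp < STACK_WORDS); /* stack must not be empty */
    uint32_t value = s->words[s->sp];
    s->sp += 1;
    return value;
}
```

Pushing 0x12345678 and then 0xFFEEDDCC leaves sp pointing at 0xFFEEDDCC, and popping returns the values in the reverse (last-in-first-out) order.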


How Functions use the Stack
• Called functions must have no unintended
side effects
• But diffofsums overwrites 3 registers: t0,
t1, s3
# RISC-V assembly
# s3 = result
diffofsums:
add t0, a0, a1 # t0 = f + g
add t1, a2, a3 # t1 = h + i
sub s3, t0, t1 # result = (f + g) − (h + i)
add a0, s3, zero # put return value in a0
jr ra # return to caller



Storing Register Values on the Stack
# s3 = result
diffofsums:
addi sp, sp, -12 # make space on stack to store three registers
sw s3, 8(sp) # save s3 on stack
sw t0, 4(sp) # save t0 on stack
sw t1, 0(sp) # save t1 on stack
add t0, a0, a1 # t0 = f + g
add t1, a2, a3 # t1 = h + i
sub s3, t0, t1 # result = (f + g) − (h + i)
add a0, s3, zero # put return value in a0
lw s3, 8(sp) # restore s3 from stack
lw t0, 4(sp) # restore t0 from stack
lw t1, 0(sp) # restore t1 from stack
addi sp, sp, 12 # deallocate stack space
jr ra # return to caller



The Stack During diffofsums Call

Address    Before Call    During Call    After Call
BEF0F0FC   ?  <- sp       ?              ?  <- sp
BEF0F0F8                  s3
BEF0F0F4                  t0
BEF0F0F0                  t1  <- sp

During the call, the stack frame occupies BEF0F0F8 through BEF0F0F0.



Preserved Registers

Preserved (Callee-Saved)    Nonpreserved (Caller-Saved)
s0-s11                      t0-t6
sp                          a0-a7
ra
stack above sp              stack below sp



Storing Saved Registers on the Stack
# s3 = result
diffofsums:
addi sp, sp, -4 # make space on stack to store one register
sw s3, 0(sp) # save s3 on stack
add t0, a0, a1 # t0 = f + g
add t1, a2, a3 # t1 = h + i
sub s3, t0, t1 # result = (f + g) − (h + i)
add a0, s3, zero # put return value in a0
lw s3, 0(sp) # restore s3 from stack
addi sp, sp, 4 # deallocate stack space
jr ra # return to caller



Optimized diffofsums
# a0 = result
diffofsums:
add t0, a0, a1 # t0 = f + g
add t1, a2, a3 # t1 = h + i
sub a0, t0, t1 # result = (f + g) − (h + i)
jr ra # return to caller



Non-Leaf Function Calls
Non-leaf function:
a function that calls another function
func1:
addi sp, sp, -4 # make space on stack
sw ra, 0(sp) # save ra on stack
jal func2
...
lw ra, 0(sp) # restore ra from stack
addi sp, sp, 4 # deallocate stack space
jr ra # return to caller

Must preserve ra before function call.



Non-Leaf Function Call Example
# f1 (non-leaf function) uses s4-s5 and needs a0-a1 after call to f2
f1:
addi sp, sp, -20 # make space on stack for 5 words
sw a0, 16(sp)
sw a1, 12(sp)
sw ra, 8(sp) # save ra on stack
sw s4, 4(sp)
sw s5, 0(sp)
jal f2 # call f2
...
lw ra, 8(sp) # restore ra (and other regs) from stack
...
addi sp, sp, 20 # deallocate stack space
jr ra # return to caller

# f2 (leaf function) only uses s4 and calls no functions


f2:
addi sp, sp, -4 # make space on stack for 1 word
sw s4, 0(sp)
...
lw s4, 0(sp)
addi sp, sp, 4 # deallocate stack space
jr ra # return to caller



Stack during Function Calls
Address    Before Calls   After Call to f1   After Call to f2
BEF7FF0C   ?  <- sp       ?                  ?
BEF7FF08                  a0                 a0
BEF7FF04                  a1                 a1
BEF7FF00                  ra                 ra
BEF7FEFC                  s4                 s4
BEF7FEF8                  s5  <- sp          s5
BEF7FEF4                                     s4  <- sp

f1's stack frame: BEF7FF08 through BEF7FEF8. f2's stack frame: BEF7FEF4.



Function Call Summary
• Caller
– Save any needed registers (ra, maybe t0-t6/a0-a7)
– Put arguments in a0-a7
– Call function: jal callee
– Look for result in a0
– Restore any saved registers

• Callee
– Save registers that might be disturbed (s0-s11)
– Perform function
– Put result in a0
– Restore registers
– Return: jr ra
