L06 - RISCVII (Revised)

The document provides information about an introductory course on RISC-V computer architecture including details about upcoming assignments, topics to be covered in upcoming classes, and explanations of RISC-V instruction types and memory organization. Specifically, it discusses I-type instructions, register usage, arithmetic and logic instructions, load instructions, and differences between big-endian and little-endian byte ordering in memory.
CS 110

Computer Architecture
Intro to RISC-V I
Instructors:
Siting Liu & Chundong Wang
Course website: https://toast-lab.sist.shanghaitech.edu.cn/courses/CS110@ShanghaiTech/Spring-2023/index.html
School of Information Science and Technology (SIST)
ShanghaiTech University

2023/2/6
Course Info
• Lab 3 will be released after class (10 a.m.). Prepare before going to
lab sessions!
• Project 1.1 will be available this weekend and will be marked
in lab sessions. Deadline: March 13th.
• Next week's discussion covers RISC-V related materials and assembly
coding.

Assembly Instructions
• Different types of instructions

• I-type
• Register-Immediate type
• Has two operands (one accessed from source register, another a constant/
immediate) and one output (saved to destination register)
• Can do arithmetic/logic/load from main memory/jump (covered later)

3
RV32I I-type Arithmetic
• Syntax of instructions: assembly language
• Addition: addi rd, rs1, imm
Adds imm to rs1 and stores the result in rd; imm is a signed number.
• Example: addi x5, x4, 10
           addi x6, x4, -10
• Similarly, andi/ori/xori/slti/sltiu
• All the imm's are sign-extended
• slli/srli/srai are special: the shift amount is taken from the low 5 bits
of the immediate field in RV32I (RV64I widens it to 6 bits) (RTFM)

Registers
0           x0/zero
0x12340000  x1
0x00006789  x2
0xFFFFFFFF  x3
0x3         x4
            x5
            x6
            x7
RV32I Arithmetic/Logic Test
• addi x1, x2, -1
• or x2, x2, x1
• add x3, x1, x2
• slt x4, x3, x1
• sra x5, x3, x4
• sub x0, x5, x4

Registers (initial values)
0  x0/zero
0  x1
0  x2
0  x3
0  x4
0  x5
0  x6
0  x7

• Register zero (x0) is ‘hard-wired’ to 0;
• By convention RISC-V has a specific no-op instruction...
– addi x0, x0, 0 (the nop pseudo-instruction); add x0, x0, x0 has the same effect
– You may need to replace code later: no-ops can fill space, align data, and
serve other purposes
– Practical use in jump-and-link operations (covered later)
RV32I I-type Load
[Figure: processor (control; datapath with PC, registers, and arithmetic & logic unit (ALU)) connected to memory (program and data bytes) through the processor-memory interface (Enable?, Read/Write, Address, Write Data, Read Data), with input and output attached through the I/O-memory interfaces. Registers: fast but limited place to hold values. Memory: much larger place to hold values, but slower than registers! Read Data: load data from mem. to reg.]
Big Endian vs. Little Endian
Source: en.wikipedia.org/wiki/Big_endian

Big-endian and little-endian from Jonathan Swift's Gulliver's Travels
• The order in which BYTES are stored in memory
• Bits always stored as usual. (E.g., 0xC2=0b 1100 0010)
Consider the number 1025 as we normally write it:
BYTE3    BYTE2    BYTE1    BYTE0
00000000 00000000 00000100 00000001

Big Endian
ADDR3    ADDR2    ADDR1    ADDR0
BYTE0    BYTE1    BYTE2    BYTE3
00000001 00000100 00000000 00000000

Little Endian
ADDR3    ADDR2    ADDR1    ADDR0
BYTE3    BYTE2    BYTE1    BYTE0
00000000 00000000 00000100 00000001

Big Endian examples
• Names in China (e.g. LIU Siting)
• Java Packages: (e.g. org.mypackage.HelloWorld)
• Dates done correctly ISO 8601 YYYY-MM-DD (e.g. 2020-03-22)
• Eating Pizza crust first
• Unix file structure (e.g., /usr/local/bin/python)
• ”Network Byte Order”: most network protocols
• IBM z/Architecture; very old Macs

Little Endian examples
• Names in the West (e.g. Siting Liu)
• Internet names (e.g. sist.shanghaitech.edu.cn)
• Dates written in England DD/MM/YYYY (e.g. 22/03/2020)
• Eating Pizza skinny part first (the normal way)
• CANopen
• Intel x86; RISC-V (can also support big-endian)

Bi-endian (either byte order): MIPS, IA-64, PowerPC


Assembly Instructions—Load
• RV32I is a load-store architecture, where only load and store instructions
access memory and arithmetic instructions only operate on CPU registers.
• lw rd, imm(rs1) : Load word at addr. to register rd
addr. = (number in rs1) + imm (rs1 is the base, imm is the offset)
• Example
lw x1, 12(x4)
addr. = 4 + 12 = 16 = (10)HEX
• C code example
int A[100];   /* assume &A[0] = 4 */
G = A[3];     /* load G to x1 */

Registers
0           x0/zero
0x12340000  x1
0x00006789  x2
0xFFFFFFFF  x3
0x4         x4
            x5
            x6
            x7

Main memory (bytes; one word per row, row address in hex at the right)
56 34 23 01   3c
…             (the same two words repeat)
34 12 cd ab   10
56 34 23 01   c
34 12 cd ab   8
56 34 23 01   4
34 12 cd ab   0
Assembly Instructions—Load
• lb/lbu rd, imm(rs1) : Load signed/unsigned byte at addr. to register rd
addr. = (number in rs1) + imm
lb sign-extends the byte to 32 bits; lbu zero-extends it.
• Example
lb x1, 12(x4)
lbu x1, 12(x4)
addr. = 4 + 12 = 16 = (10)HEX
Registers and main memory: same contents as in the lw example.
Assembly Instructions—Load
• lh/lhu rd, imm(rs1) : Load signed/unsigned halfword at addr. to
register rd (similar to lb/lbu)
addr. = (number in rs1) + imm
lh sign-extends the halfword to 32 bits; lhu zero-extends it.
• Example
lh x1, 12(x4)
lhu x1, 12(x4)
addr. = 4 + 12 = 16 = (10)HEX
Registers and main memory: same contents as in the lw example.
Assembly Instructions—S-Type Store
• RV32I is a load-store architecture, where only load and store instructions
access memory and arithmetic instructions only operate on CPU registers.
• sw rs2, imm(rs1) : Store the word in rs2 to memory at addr.
addr. = (number in rs1) + imm
• Example
sw x1, 12(x4)
addr. = 4 + 12 = 16 = (10)HEX
• C code example
int A[100];   /* &A[0] => x4 */
A[3] = h;     /* h in rs2 => x1 */

Registers
0           x0/zero
0x12340000  x1
0x00006789  x2
0xFFFFFFFF  x3
0x4         x4
            x5
            x6
            x7

Main memory (bytes; one word per row, row address in hex at the right)
56 34 23 01   3c
…             (the same two words repeat)
34 12 cd ab   10
56 34 23 01   c
34 12 cd ab   8
56 34 23 01   4
34 12 cd ab   0
Assembly Instructions—S-Type Store
• sw rs2, imm(rs1) : Store the word in rs2 to memory at addr.
addr. = (number in rs1) + imm
• Example
sw x1, 12(x4)
addr. = 4 + 12 = 16 = (10)HEX
• Similarly,
sh: Store the lower 16 bits of rs2
sb: Store the lower 8 bits of rs2
Registers and main memory: same contents as on the previous store slide.
Memory Alignment
• RISC-V does not require that integers be word aligned...
– But it can be very bad if you don't make sure they are...
• Consequences of unaligned integers
– Slowdown: the processor is allowed to be a lot slower when it happens
• In fact, a RISC-V processor may natively support only aligned
accesses and handle unaligned accesses in software!
An unaligned load could take hundreds of times longer!
– Lack of atomicity: the whole thing doesn't happen at once... can
introduce lots of very subtle bugs
• So in practice, RISC-V recommends that integers be aligned on 4-byte
boundaries and halfwords on 2-byte boundaries

14
Question! What’s in x12?

addi x11,x0,0x4F6
sw x11,0(x5)
lb x12,1(x5)

A: 0x0
B: 0x4
C: 0x6
D: 0xF
E: 0xFFFFFFFF

Question! What’s in x12?

addi x11,x0,0x85F6
sw x11,0(x5)
lb x12,1(x5)

A: 0x8
B: 0x85
C: 0xC
D: 0xBC
E: 0xFFFFFF85
F: 0xFFFFFFF8
G: 0xFFFFFFFC
H: 0xFFFFFFBC
Summary

• RISC-V ISA basics: (32 registers, referred to as x0-x31, x0=0)


• Simple is better
• One instruction (simple operation) per line (RISC-V assembly)
• Fixed-length instructions (for RV32I)
• 6 types of instructions (depending on their format)
• Instructions for arithmetic, logic operations, register-memory data
exchange (load/store word/halfword/byte)

• RISC-V is little-endian
• Load-store architecture
17
CS 110
Computer Architecture
Intro to RISC-V II
Computer Decision Making
Computer Decision Making—Branch
• Normal operation: execute instructions in sequence
• In C: if/while/for-statement; function call
• RISC-V provides conditional branch (B-type) & unconditional jump (j)
• RISC-V: similar if-statement instruction
beq rs1, rs2, L(imm/label)
means: go to the statement labeled L if (value in rs1) == (value in rs2);
otherwise, go to next statement
• beq stands for branch if equal
• Similarly, bne for branch if not equal
19
Computer Decision Making—Branch
• Example:
beq rs1, rs2, L(imm/label)

• C code
int main(void) {
    int i=5;
    if (i!=6){
        i++;
    }
    else i--;
    return 0;
}

• Assembly
   addi x2, x0, 5
   addi x3, x0, 6
   bne x2, x3, L1
   beq x2, x3, L2
L1:addi x2, x2, 1
   ret (kind of jump)
L2:addi x2, x2, -1
   ret
• Label can also point to data (more in discussion)
Computer Decision Making—Branch
• Example:
beq rs1, rs2, L(imm/label)

• C code
int main(void) {
    int i=5;
    if (i!=6){
        i++;
    }
    else i--;
    return 0;
}

• Assembly (real stuff in ARM64)
    mov w8, #5
Ltmp3:
    .loc 1 10 9 is_stmt 0
    subs w8, w8, #6
    b.eq LBB0_2
    b LBB0_1
LBB0_1:
Ltmp4:
    .loc 1 11 10 is_stmt 1
    ldr w8, [sp, #8]
    add w8, w8, #1
    str w8, [sp, #8]
    .loc 1 12 5
    b LBB0_3
Ltmp5:
LBB0_2:
    .loc 1 13 11
    ldr w8, [sp, #8]
    subs w8, w8, #1
    str w8, [sp, #8]
    b LBB0_3
Ltmp6:
LBB0_3:
    .loc 1 0 11 is_stmt 0
    mov w0, #0
    .loc 1 14 5 is_stmt 1
    add sp, sp, #16
    ret
Computer Decision Making—Branch
• Normal operation: execute instructions in sequence
• In programming languages: if/while/for-statement
• RISC-V provides conditional branch & unconditional jump
• RISC-V: if-statement instructions are
blt/bltu/bge/bgeu rs1, rs2, L(imm/label)
means: go to the statement labeled L if (value in rs1) </≥ (value in rs2)
using signed/unsigned comparison; otherwise, go to next statement

22
C Loop Mapped to RISC-V Assembly
int A[20];
int sum = 0;
for (int i=0; i < 20; i++)
    sum += A[i];

# Assume x8 holds pointer to A
# Assign x10=sum
      add x9, x8, x0     # x9=&A[0]
      add x10, x0, x0    # sum=0
      add x11, x0, x0    # i=0
      addi x13, x0, 20   # x13=20
Loop:
      bge x11, x13, Done
      lw x12, 0(x9)      # x12=A[i]
      add x10, x10, x12  # sum+=
      addi x9, x9, 4     # &A[i+1]
      addi x11, x11, 1   # i++
      j Loop
Done:

Optimization
• The simple translation is sub-optimal!
• Inner loop is now 4 instructions rather than 6
• And only 1 branch/jump rather than two: because the first time through is
always true, so can move check to the end!
• The compiler will often do this automatically for optimization
• See that i is only used as an index in a loop

# Assume x8 holds pointer to A
# Assign x10=sum
      add x10, x0, x0    # sum=0
      add x11, x8, x0    # ptr = A
      addi x12, x11, 80  # end = A + 80
Loop:
      lw x13, 0(x11)     # x13 = *ptr
      add x10, x10, x13  # sum += x13
      addi x11, x11, 4   # ptr++
      blt x11, x12, Loop # ptr < end

• This optimization is not required
• Line by line translation is good
• Correctness first, performance second
Arrays and Pointers

Index version:
int i;
int array[10];
for (i = 0; i < 10; i++)
{
    array[i] = …;
}

Pointer version:
int *p;
int array[10];
for (p = array; p < &array[10]; p++)
{
    *p = …;
}

These code sequences have the same effect!

25
Translate Assembly
Assembly:
         addi x10, x0, 0x7
         add x12, x0, x0
label_a:
         andi x14, x10, 1
         beq x14, x0, label_b
         add x12, x10, x12
label_b:
         addi x10, x10, -1
         bne x10, x0, label_a

Translation:
         x10 = 7
         x12 = 0
label_a: x14 = x10 & 1
         if (x14!=0)
         {x12 = x10+x12;}
label_b: x10 = x10-1;
         if (x10!=0)
         {go to label_a;}

26
Call a Function—Unconditional Jump
0000000100003f40 <_main>:
100003f40: ff c3 00 d1 sub sp, sp, #48
… …
100003f58: 48 9a 80 52 mov w8, #1234
100003f5c: a8 83 1f b8 stur w8, [x29, #-8]
100003f60: 28 1c 82 52 mov w8, #4321
100003f64: a8 43 1f b8 stur w8, [x29, #-12]
100003f68: a8 83 5f b8 ldur w8, [x29, #-8]
100003f6c: a9 43 5f b8 ldur w9, [x29, #-12]
100003f70: 08 01 09 0b add w8, w8, w9
… …
100003f90: 05 00 00 94 bl 0x100003fa4 <_printf+0x100003fa4>
… …
Disassembly of section __TEXT,__stubs:
0000000100003fa4 <__stubs>:
100003fa4: 10 00 00 b0 adrp x16, 0x100004000 <__stubs+0x4>
100003fa8: 10 02 40 f9 ldr x16, [x16]
100003fac: 00 02 1f d6 br x16

[Figure: processor (control; datapath with PC, registers, and arithmetic & logic unit (ALU)) and memory (program and data bytes). The PC sends the instruction address to memory and the processor reads instructions back. The PC increases by 4 each time an instruction is executed, except for branch/jump/function call.]
Call a Function
#include <stdio.h>
int sum_two_number(int a, int b)
{
    int y;
    return y=a+b;
}
int main(int argc, const char * argv[]) {
    int x=4321, y=1234;
    int a=1,b=2,c=3,d=4,e=5,f=6,g=0;
    y = sum_two_number(x,y);
    c = sum_two_number(a,b);
    f = sum_two_number(e,d);
    g = sum_two_number(c,f);
    printf("Sum is %d.\n",y);
    return 0;
}

1. Put parameters in a place where function can access them
2. Transfer control to function (PC jump to sum_two_number)
3. Acquire (local) storage resources needed for function
4. Perform desired task of the function

[Figure: processor (control; datapath with PC, registers, ALU) and memory (bytes).]
Call a Function
(Same program as on the previous slide.)

5. Put result value in a place where calling code can access it and
restore any registers you used
6. Return control to point of origin, since a function can be called
from several points in a program

[Figure: processor (control; datapath with PC, registers, ALU) and memory (bytes).]
RISC-V Function Call Conventions
• Registers faster than memory, so use them as much as possible
• Give names to registers, conventions on how to use them

Older version: https://riscv.org/wp-content/uploads/2015/01/riscv-calling.pdf


Latest draft: https://github.com/riscv-non-isa/riscv-elf-psabi-doc/releases/tag/draft-20230220-87f4a72d5aeaf048b35a230e0ba5accd1bfcf072
30
RISC-V Function Call Conventions
• a0–a7 (x10-x17): eight argument registers to pass parameters and
return values (a0-a1)
• ra: one return address register to return to the point of origin (x1)
• Also s0-s1 (x8-x9) and s2-s11 (x18-x27): saved registers
(more about those later)

31
Call a Function
(Same program as on the previous slides.)

• x and y are function arguments; they can be put in registers a0-a7
• y is the value returned by the function; it can be put in registers a0-a1

[Figure: processor (control; datapath with PC, registers, ALU) and memory (bytes).]

Call a Function
(Same program as on the previous slides.)

Func_called:
0x2000 //one instruction
0x2004 //another instruction
…… //need jump back to main()

Start:
0x1000 //one instruction
0x1004 //another instruction
0x1008 //a third instruction
0x100c //PC jump to 0x2000 (call function sum_two_number)
0x1010 //next instruction… … (save this value to register ra)
……
Call a Function—Jump
• JAL: Jump & Link, jump to function
• Unconditional jump (J-type)

jal rd, label
Jump to label (target = imm+PC, explained later) and save the return address
(PC+4) to rd;
rd is x1 (ra) by convention; sometimes it can be x5.
When rd is x0, it is simply an unconditional jump (j) that does not
record PC+4.

34
Return—Jump
• JALR: Jump & Link Register
• Unconditional jump (I-type)

jalr rd, imm(rs1)
Jump to address (imm+rs1)&~1 and save the return address (PC+4) to rd
rs1 can be ra, which holds the return address we just saved
When rd is x0, it is simply an unconditional jump (jr) that does not record
PC+4.

35
Jump
—jal rd offset —jalr rd rs offset
• Jump and Link
• Add the immediate value to the current address in the program (the “Program
Counter”), go to that location

• The offset is 20 bits, sign extended and left-shifted one (not two)
• At the same time, store into rd the value of PC+4
• So we know where it came from (need to return to)
• jal offset == jal x1 offset (pseudo-instruction; x1 = ra = return
address)

• j offset == jal x0 offset (jump is a pseudo-instruction in RISC-V)

• Two uses:
• Unconditional jumps in loops and the like
• Calling other functions 36
Jump and Link Register
• The same except the destination
• Instead of PC + immediate it is rs + immediate
• Same immediate format as I-type: 12 bits, sign extended
• Again, if you don’t want to record where you jumped from…
• jr rs == jalr x0 rs

• Two main uses
• Returning from functions (which were called using Jump and Link)
• Calling functions through pointers
37
Notes on Functions
• Calling program (caller) puts parameters into registers a0-a7 and uses
jal X to invoke (callee) at address labeled X
• Must have register in computer with address of currently executing instruction
• Instead of Instruction Address Register (better name), historically called
Program Counter (PC)
• It’s a program’s counter; it doesn’t count programs!
• What value does jal X place into ra?
• jr ra puts address inside ra back into PC

38
Call a Function
1. Put parameters in a place where function can access them
2. Transfer control to function (PC jump to function)
3. Acquire (local) storage resources needed for function
4. Perform desired task of the function
5. Put result value in a place where calling code can access it
and restore any registers you used
6. Return control to point of origin, since a function can be
called from several points in a program
39
Where Are Old Register Values Saved
to Restore Them After Function Call?
• Need a place to save old values before calling a function, restore them on
return, and then delete them
• Ideal is stack: last-in-first-out queue (e.g., stack of plates)
• Push: placing data onto stack
• Pop: removing data from stack
• Stack in memory, so need register to point to it
• sp is the stack pointer in RISC-V (x2)
• Convention is grow from high to low addresses
• Push decrements sp, Pop increments sp

[Figure: processor (control; datapath with PC, registers, and arithmetic & logic unit (ALU)); the registers are annotated "Limited space".]
Stack
• Stack frame may include:
• Return “instruction” address
• Parameters (spill)
• Space for other local variables
• Stack frames contiguous; stack pointer (sp/x2) tells where bottom of
stack frame is

• When procedure ends, stack frame is tossed off the stack; frees memory
for future stack frames; sp restores

41
Example
• Leaf function: a function that calls no function

int Leaf (int g, int h, int i, int j)
{
    int f; f = (g + h) - (i + j);
    return f;
}
int main (void){
    int a=1, b=2, c=3, d=4, e;
    e = Leaf(a,b,d,c);
    return e;
} /*a function called by OS*/

Registers
0   x0/zero
ra  x1
sp  x2
……
s1  x9
a0  x10
a1  x11
a2  x12
a3  x13
a4  x14
……

• Parameter variables g, h, i, and j in argument registers a1, a2,
a3, and a4, and f in a0 when returned; assume e in s1, which
should later be copied to a0 when main returns
• Register ra consideration
Stack Before, During, After Function
• Caller needs to save old values of a1, a2, a3 and a4
• Callee needs to save old value of s1 (and any other callee saved
registers), and makes sure they are not changed after return

43
Stack Before, During, After Function
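• Need to save old values of ra, a1, a2, a3 and a4 (caller-saved)
• W.r.t. main()

[Figure: the stack before, during, and after the call. Before call: nothing saved yet, sp at its original position. During call: saved ra, saved a1, saved a2, saved a3 and saved a4 sit below the original position, and sp has moved down to the last saved item. After call: sp is back at its original position.]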

44
RISC-V Code for Main()/Leaf()
Main:
addi sp, sp, -20 # adjust stack for 5 items, 4 int & 1 ra pointer/address
sw ra, 16(sp) # save ra for use afterwards (return to OS)
sw a1, 12(sp) # save a1 for use afterwards, these are all caller-saved
sw a2, 8(sp) # save a2 for use afterwards
sw a3, 4(sp) # save a3 for use afterwards
sw a4, 0(sp) # save a4 for use afterwards
jal ra, Leaf # call Leaf; the return address is saved to ra
lw a1, 12(sp) # restore register a1
lw a2, 8(sp) # restore register a2
lw a3, 4(sp) # restore register a3
lw a4, 0(sp) # restore register a4
lw ra, 16(sp) # restore register ra
addi sp, sp, 20 # adjust stack to delete 5 items
mv a0, a0 # move result to return register
jr ra # return

[Figure: stack during call. Below the OS stack: saved ra, saved a1, saved a2, saved a3, saved a4.]
RISC-V Code for Main()/Leaf()
Leaf:
addi sp, sp, -4 # adjust stack for 1 item, callee-saved s1
sw s1, 0(sp) # save callee-saved s1 to stack
add a1, a1, a2 # g = g + h
add a3, a3, a4 # i = i + j
sub s1, a1, a3 # calculate result (g + h) - (i + j)
mv a0, s1 # return value (g + h) - (i + j)
lw s1, 0(sp) # restore register s1 for caller
addi sp, sp, 4 # adjust stack to delete 1 item
jr ra # jump back to caller
(pseudo-assembly: ret)

[Figure: stack during call. Saved ra, saved a1, saved a2, saved a3 and saved a4 from Main, followed by saved s1 pushed by Leaf.]
RISC-V Code for Main()/Leaf()
(Same Main code as on the earlier Main slide.)

[Figure: stack after Leaf has returned and before Main restores its registers. Below the OS stack: saved ra, saved a1, saved a2, saved a3, saved a4, with sp at saved a4; the slot Leaf used for s1 has been released.]
Call a Function
1. Caller puts parameters in a place where function can access
them (a0-a7, or stack when registers not avail.), and then
saves caller-saved registers to stack
2. Transfer control to function (PC jump to function): JAL, ra
is changed to where caller left
3. Acquire (local) storage resources needed for function:
change sp (size decided when compiling);
push callee-saved registers to stack (e.g., s0-s11)
4. Perform desired task of the function
5. Put result value in a place where calling code can access it
(a0, a1), and restore callee-saved registers (s0-s11, sp)
6. Return control to point of origin, since a function can be
called from several points in a program (jr); caller restores
caller-saved registers
