Lecture 6: RISC-V Assembly III
Computer Architectures
● Based on the input data and the values created during computation,
different instructions execute.
Conditional Operations
● Two decision-making instructions in RISC-V, each similar to an if statement
with a goto:
○ Branch to a labeled instruction if a condition is true; otherwise, continue
sequentially
● beq: branch if equal
beq rs1, rs2, L1
○ if (rs1 == rs2) branch to instruction labeled L1
● bne: branch if not equal
bne rs1, rs2, L1
○ if (rs1 != rs2) branch to instruction labeled L1
Compiling If Statements
● C code:
if (i==j) f = g+h;
else f = g-h;
○ f, g, … in x19, x20, …
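A minimal sketch of how this if/else can be lowered to branches, written as C with goto so each line lines up with one plausible RISC-V instruction. The helper name if_else is hypothetical, and the register assignments in the comments (h in x21, i in x22, j in x23) are an assumption that continues the "x19, x20, …" pattern above.

```c
#include <stdint.h>

/* Hypothetical helper: returns f for the given g, h, i, j.
   Testing the opposite condition (i != j) lets the "then" part
   fall through without an extra jump. */
int64_t if_else(int64_t g, int64_t h, int64_t i, int64_t j)
{
    int64_t f;
    if (i != j) goto Else;   /* like: bne x22, x23, Else */
    f = g + h;               /* like: add x19, x20, x21 */
    goto Exit;               /* like: beq x0, x0, Exit (always taken) */
Else:
    f = g - h;               /* like: sub x19, x20, x21 */
Exit:
    return f;
}
```

Branching on the opposite condition is a common choice because it skips over the "then" part with a single branch.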
Compiling Loop Statements
● C code: while (save[i] == k) i += 1;
○ i in x22, k in x24, address of save in x25
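The loop can be sketched the same way, as C with goto and one plausible RISC-V instruction per line. The helper name while_loop and the temporaries x9 and x10 in the comments are assumptions for illustration. The sketch also assumes save holds 8-byte elements and that some element differs from k, otherwise the loop runs past the array.

```c
#include <stdint.h>

/* Hypothetical helper: runs the loop and returns the final i. */
int64_t while_loop(const int64_t *save, int64_t i, int64_t k)
{
Loop:
    {
        /* Address of save[i]: scale i by 8, then add the base address. */
        const int64_t *addr = save + i;   /* like: slli x10, x22, 3 ; add x10, x10, x25 */
        int64_t value = *addr;            /* like: ld x9, 0(x10) */
        if (value != k) goto Exit;        /* like: bne x9, x24, Exit */
    }
    i += 1;                               /* like: addi x22, x22, 1 */
    goto Loop;                            /* like: beq x0, x0, Loop */
Exit:
    return i;
}
```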
Basic Blocks
● A basic block is a sequence of instructions with:
○ No embedded branches (except at end)
○ No branch targets (except at beginning)
● One of the first phases of compilation is breaking the program into
basic blocks.
○ A compiler identifies basic blocks for optimization
More Conditional Operations
● blt: branch if less than
blt rs1, rs2, L1
○ if (rs1 < rs2) branch to instruction labeled L1
More Conditional Operations
● Example: if (a > b) a += 1;
○ a in x22, b in x23
● Compiled RISC-V code:
bge x23, x22, Exit // skip the increment if b >= a
addi x22, x22, 1 // a += 1
Exit:
Signed vs. Unsigned
● Signed comparison: blt, bge
● Unsigned comparison: bltu, bgeu
● Example
○ x22 = 1111 1111 1111 1111 1111 1111 1111 1111
○ x23 = 0000 0000 0000 0000 0000 0000 0000 0001
blt x23, x22, L1
○ x22 < x23 // signed -> branch not taken
■ –1 < +1
bltu x23, x22, L1
○ x22 > x23 // unsigned -> branch taken
■ +4,294,967,295 > +1
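The same two comparisons can be reproduced in C as a quick check. This is a sketch on 32-bit patterns to match the example above. The helper names are hypothetical, and the cast to int32_t assumes the usual two's complement conversion.

```c
#include <stdint.h>

/* Signed compare of two bit patterns, like blt rs1, rs2. */
int blt_taken(uint32_t rs1, uint32_t rs2)
{
    return (int32_t)rs1 < (int32_t)rs2;
}

/* Unsigned compare of the same bit patterns, like bltu rs1, rs2. */
int bltu_taken(uint32_t rs1, uint32_t rs2)
{
    return rs1 < rs2;
}
```

With rs1 = 1 (x23) and rs2 = 0xFFFFFFFF (x22), blt_taken returns 0 and bltu_taken returns 1, matching the not-taken and taken outcomes above.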
Bounds Check Shortcut
● Treating signed numbers as if they were unsigned gives us a low-cost way
of checking if 0 ≤ x < y
○ matches the index out-of-bounds check for arrays
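A minimal sketch of the shortcut in C, assuming y is non-negative (for example an array length). A negative x reinterpreted as unsigned becomes a very large value, so a single unsigned comparison rejects both x < 0 and x >= y. The helper name in_bounds is hypothetical.

```c
#include <stdint.h>

/* Returns 1 when 0 <= x < y, assuming y >= 0. */
int in_bounds(int64_t x, int64_t y)
{
    /* One unsigned compare, in the spirit of bltu / bgeu. */
    return (uint64_t)x < (uint64_t)y;
}
```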
Switch/Case Statements
● One way to implement switch is via a sequence of conditional tests
○ turning the switch statement into a chain of if-then-else statements.
switch(a) {
case 1: <code 1>;
case 2: <code 2>;
…
}
Switch/Case Statements
● The branch table:
○ an array containing addresses that correspond to labels in the code.
● Load the appropriate entry from the branch table into a register.
○ Branch using the address in the register.
● jump-and-link register (jalr) instruction:
○ unconditional branch to the address specified in a register.
switch(a) {
case 1: <code 1>;
case 2: <code 2>;
…
}
TABLE stores the addresses of <code1>, <code2>, …
slli x9, x10, 3 // multiply a by 8 (each table entry is one doubleword)
add x9, x9, x11 // x11 holds start of TABLE
ld x12, 0(x9) // load address
jalr x0, 0(x12) // jump to address in x12 (x0 discards the link, so x1 is kept)
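The branch-table idea can be sketched in C with an array of function pointers. This is an illustration, not necessarily what a compiler emits. The case bodies code1, code2 and code_default and the dispatch function are hypothetical names, and entry 0 is filled with the default because the cases above start at 1.

```c
#include <stddef.h>

/* Hypothetical case bodies; each returns a distinct value so the
   dispatch can be observed. */
static int code1(void) { return 100; }
static int code2(void) { return 200; }
static int code_default(void) { return -1; }

/* TABLE: an array of code addresses, indexed by the switch value. */
static int (*const TABLE[])(void) = { code_default, code1, code2 };

int dispatch(long a)
{
    size_t n = sizeof TABLE / sizeof TABLE[0];
    /* One unsigned compare rejects both a < 0 and a >= n. */
    if ((unsigned long)a >= n)
        return code_default();
    return TABLE[a]();   /* load the table entry, then jump through it */
}
```

The range check before the indexed load matters: without it, an unexpected value of a would load an address from outside the table.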
Procedures
● A procedure or function:
○ a tool programmers use to structure programs
■ to make programs easier to understand
■ to allow code to be reused.
● Parameters:
○ used to pass values and return results
○ act as the interface between the procedure and the rest of the program and data.
long long int proc_example (long long int g, long long int h,
long long int i, long long int j) {
long long int f;
f = (g + h) - (i + j);
return f;
}
Procedure Calling Convention
● We need a way to:
1. Put parameters in a place where the procedure can access them.
2. Transfer control to the procedure.
3. Acquire the storage resources needed for the procedure.
4. Perform the desired task.
5. Put the result value in a place where the calling program can access it.
6. Return control to the point of origin, since a procedure can be called from
several points in a program.
Procedure Calling – RISC-V
● Steps required
○ Place parameters in registers x10 to x17
○ Transfer control to procedure
○ Acquire storage for procedure
○ Perform procedure’s operations
○ Place result in register for caller
○ Return to place of call (address in x1)
Parameters
● Registers are the fastest place to hold data in a computer:
○ use them as much as possible.
Procedure Call Instructions
● Procedure call: jump and link
jal x1,ProcedureLabel
○ Address of following instruction put in x1 (return address)
○ Jumps to target address
Program Counter
● Abbreviated by PC
○ holds the address of the current instruction being executed
● The jal instruction actually saves PC + 4 in its destination register (usually
x1)
○ this links to the byte address of the following instruction, which sets up the
procedure return.
Procedure Call Summary
● The calling program, or caller:
○ puts the parameter values in x10–x17
○ uses jal x1, X to branch to procedure X
■ x1 holds PC+4
● The callee
○ performs the calculations,
○ places the results in the same parameter registers,
○ and returns control to the caller using jalr x0, 0(x1).
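The call and return mechanics can be modelled with a toy machine in C. This is a sketch for intuition only: the Cpu struct and the C functions jal and jalr are hypothetical names, and only the PC and link-register effects are modelled.

```c
#include <stdint.h>

/* Hypothetical toy machine state: a program counter and 32 registers. */
typedef struct {
    uint64_t pc;
    uint64_t x[32];
} Cpu;

/* jal rd, offset: save PC + 4 in rd, then jump PC-relative. */
void jal(Cpu *cpu, int rd, int64_t offset)
{
    if (rd != 0)                       /* x0 is hard-wired to zero */
        cpu->x[rd] = cpu->pc + 4;
    cpu->pc += (uint64_t)offset;
}

/* jalr rd, imm(rs1): save PC + 4 in rd, then jump to x[rs1] + imm
   (the lowest bit of the target is cleared). */
void jalr(Cpu *cpu, int rd, int rs1, int64_t imm)
{
    uint64_t target = (cpu->x[rs1] + (uint64_t)imm) & ~(uint64_t)1;
    if (rd != 0)
        cpu->x[rd] = cpu->pc + 4;
    cpu->pc = target;
}
```

Starting at pc = 0x1000, jal(&cpu, 1, 0x200) leaves pc at 0x1200 with x1 = 0x1004. A following jalr(&cpu, 0, 1, 0) brings pc back to 0x1004, the instruction after the call.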
Using More Registers
● What if the procedure has more than 8 parameters?
● Moreover, if the callee uses the registers needed by the caller:
○ They must be restored to the values that they contained before the procedure
was invoked.
Stack
● Push
○ placing data onto the stack
● Pop
○ removing data from the stack
● Stack pointer
○ the most recently allocated address in the stack
○ In RISC-V, the stack pointer is register x2, also known by the name sp.
○ The stack pointer is adjusted by one doubleword for each register that is saved
or restored.
● Stack “grows” from higher addresses to lower addresses.
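A minimal C sketch of a stack that grows from higher addresses to lower ones, one doubleword per item. The names stack_mem, sp, push, pop and depth and the size STACK_WORDS are hypothetical, and the sketch deliberately omits overflow and underflow checks.

```c
#include <stdint.h>

#define STACK_WORDS 64                 /* hypothetical size of the toy stack */

static uint64_t stack_mem[STACK_WORDS];
/* sp starts one past the highest slot, so the stack grows toward index 0. */
static uint64_t *sp = stack_mem + STACK_WORDS;

/* Push: move sp down one doubleword, then store
   (like addi sp, sp, -8 followed by sd). */
void push(uint64_t value)
{
    sp -= 1;
    *sp = value;
}

/* Pop: load, then move sp back up one doubleword
   (like ld followed by addi sp, sp, 8). */
uint64_t pop(void)
{
    uint64_t value = *sp;
    sp += 1;
    return value;
}

/* Number of doublewords currently on the stack. */
long depth(void)
{
    return (long)(stack_mem + STACK_WORDS - sp);
}
```

After push(11) and push(22), pop returns 22 first and then 11: the last value pushed is the first one popped.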
Leaf Procedure Example
● C code:
long long int leaf_example (
long long int g, long long int h,
long long int i, long long int j) {
long long int f;
f = (g + h) - (i + j);
return f;
}
Leaf Procedure Example
● The compiled program starts with the label of the procedure:
leaf_example:
● The next step is to save the registers used by the procedure.
○ we need to save x5, x6, and x20.
■ need to “push” the old values onto the stack.
○ This phase is called Prologue.
Leaf Procedure Example
● Prologue:
leaf_example:
addi sp, sp, -24 // adjust stack to make room for 3 items
sd x5, 16(sp) // save register x5 for use afterwards
sd x6, 8(sp) // save register x6 for use afterwards
sd x20, 0(sp) // save register x20 for use afterwards
Leaf Procedure Example
● The next three statements correspond to the body of the procedure
leaf_example:
addi sp, sp, -24 // adjust stack to make room for 3 items
sd x5, 16(sp) // save register x5 for use afterwards
sd x6, 8(sp) // save register x6 for use afterwards
sd x20, 0(sp) // save register x20 for use afterwards
add x5, x10, x11 // register x5 contains g + h
add x6, x12, x13 // register x6 contains i + j
sub x20, x5, x6 // f = x5 − x6, which is (g + h) − (i + j)
Leaf Procedure Example
● To return the value of f, we copy it into a parameter register:
leaf_example:
addi sp, sp, -24 // adjust stack to make room for 3 items
sd x5, 16(sp) // save register x5 for use afterwards
sd x6, 8(sp) // save register x6 for use afterwards
sd x20, 0(sp) // save register x20 for use afterwards
add x5, x10, x11 // register x5 contains g + h
add x6, x12, x13 // register x6 contains i + j
sub x20, x5, x6 // f = x5 − x6, which is (g + h) − (i + j)
addi x10, x20, 0 // returns f (x10 = x20 + 0)
Leaf Procedure Example
● Epilogue: restore the three old values of the registers
leaf_example:
addi sp, sp, -24 // adjust stack to make room for 3 items
sd x5, 16(sp) // save register x5 for use afterwards
sd x6, 8(sp) // save register x6 for use afterwards
sd x20, 0(sp) // save register x20 for use afterwards
add x5, x10, x11 // register x5 contains g + h
add x6, x12, x13 // register x6 contains i + j
sub x20, x5, x6 // f = x5 − x6, which is (g + h) − (i + j)
addi x10, x20, 0 // returns f (x10 = x20 + 0)
ld x20, 0(sp) // restore register x20 for caller
ld x6, 8(sp) // restore register x6 for caller
ld x5, 16(sp) // restore register x5 for caller
addi sp, sp, 24 // adjust stack to delete 3 items
Leaf Procedure Example
● The procedure ends with a jump-and-link register (jalr) to the return address in x1:
leaf_example:
addi sp, sp, -24 // adjust stack to make room for 3 items
sd x5, 16(sp) // save register x5 for use afterwards
sd x6, 8(sp) // save register x6 for use afterwards
sd x20, 0(sp) // save register x20 for use afterwards
add x5, x10, x11 // register x5 contains g + h
add x6, x12, x13 // register x6 contains i + j
sub x20, x5, x6 // f = x5 − x6, which is (g + h) − (i + j)
addi x10, x20, 0 // returns f (x10 = x20 + 0)
ld x20, 0(sp) // restore register x20 for caller
ld x6, 8(sp) // restore register x6 for caller
ld x5, 16(sp) // restore register x5 for caller
addi sp, sp, 24 // adjust stack to delete 3 items
jalr x0, 0(x1) // branch back to calling routine
Register Usage
● RISC-V software separates 19 of the registers into two groups to avoid
saving and restoring a register whose value is never used.
○ x5 – x7, x28 – x31: temporary registers
■ Not preserved by the callee
○ x8 – x9, x18 – x27: saved registers
■ If used, the callee saves and restores them
● In the previous slide, the callee does not need to save and restore registers
x5 and x6, but it must save and restore x20.
Non-Leaf Procedures
● Procedures that call other procedures
○ also recursive procedures
● We have problems:
○ main program calls procedure A: jal x1, A.
○ procedure A calls procedure B: jal x1, B
○ What’s the problem?
○ There is a conflict over the return address in register x1
■ after jal x1, B, x1 holds the return address for B (a location inside A), so A's
return address into main is lost unless A saved it.
Non-Leaf Procedures
● The caller pushes the registers that are needed after the call:
○ any argument registers (x10–x17)
○ or temporary registers (x5-x7 and x28-x31)
● The callee pushes:
○ the return address register x1
○ and any saved registers (x8-x9 and x18-x27) used by the callee.
● Upon the return:
○ the registers are restored from memory
○ the stack pointer is readjusted
Non-Leaf Procedure Example
● C code:
long long int fact (long long int n)
{
if (n < 1) return 1;
else return n * fact(n - 1);
}
○ Argument n in x10
○ Result in x10
Non-Leaf Procedure Example
● The compiled program starts with the label of the procedure and then
saves two registers on the stack
○ the return address and x10:
fact:
addi sp, sp, -16 // adjust stack for 2 items
sd x1, 8(sp) // save the return address
sd x10, 0(sp) // save the argument n
Non-Leaf Procedure Example
● The next two instructions test whether n is less than 1, going to L1 if n ≥ 1.
fact:
addi sp, sp, -16 // adjust stack for 2 items
sd x1, 8(sp) // save the return address
sd x10, 0(sp) // save the argument n
addi x5, x10, -1 // x5 = n - 1
bge x5, x0, L1 // if (n - 1) >= 0, go to L1
Non-Leaf Procedure Example
● If n < 1, fact returns 1 by putting 1 into a value register:
○ pops the two saved values off the stack
■ Since x1 and x10 didn’t change, no need to load x1 and x10 from stack.
○ branches to the return address
fact:
addi sp, sp, -16 // adjust stack for 2 items
sd x1, 8(sp) // save the return address
sd x10, 0(sp) // save the argument n
addi x5, x10, -1 // x5 = n - 1
bge x5, x0, L1 // if (n - 1) >= 0, go to L1
addi x10, x0, 1 // return 1
addi sp, sp, 16 // pop 2 items off stack
jalr x0, 0(x1) // return to caller
Non-Leaf Procedure Example
● If n is not less than 1, the argument n is decremented and then fact is
called again with the decremented value:
fact:
addi sp, sp, -16 // adjust stack for 2 items
sd x1, 8(sp) // save the return address
sd x10, 0(sp) // save the argument n
addi x5, x10, -1 // x5 = n - 1
bge x5, x0, L1 // if (n - 1) >= 0, go to L1
addi x10, x0, 1 // return 1
addi sp, sp, 16 // pop 2 items off stack
jalr x0, 0(x1) // return to caller
L1: addi x10, x10, -1 // n >= 1: argument gets (n − 1)
jal x1, fact // call fact with (n − 1)
Non-Leaf Procedure Example
● Now the old return address and old argument are restored, along with the
stack pointer:
fact:
addi sp, sp, -16 // adjust stack for 2 items
sd x1, 8(sp) // save the return address
sd x10, 0(sp) // save the argument n
addi x5, x10, -1 // x5 = n - 1
bge x5, x0, L1 // if (n - 1) >= 0, go to L1
addi x10, x0, 1 // return 1
addi sp, sp, 16 // pop 2 items off stack
jalr x0, 0(x1) // return to caller
L1: addi x10, x10, -1 // n >= 1: argument gets (n − 1)
jal x1, fact // call fact with (n − 1)
addi x6, x10, 0 // return from jal: move result of fact (n - 1) to x6
ld x10, 0(sp) // restore argument n
ld x1, 8(sp) // restore the return address
addi sp, sp, 16 // adjust stack pointer to pop 2 items
Non-Leaf Procedure Example
● Next, argument register x10 gets the product of the old argument and the
result of fact(n - 1), now in x6.
fact:
addi sp, sp, -16 // adjust stack for 2 items
sd x1, 8(sp) // save the return address
sd x10, 0(sp) // save the argument n
addi x5, x10, -1 // x5 = n - 1
bge x5, x0, L1 // if (n - 1) >= 0, go to L1
addi x10, x0, 1 // return 1
addi sp, sp, 16 // pop 2 items off stack
jalr x0, 0(x1) // return to caller
L1: addi x10, x10, -1 // n >= 1: argument gets (n − 1)
jal x1, fact // call fact with (n − 1)
addi x6, x10, 0 // return from jal: move result of fact (n - 1) to x6
ld x10, 0(sp) // restore argument n
ld x1, 8(sp) // restore the return address
addi sp, sp, 16 // adjust stack pointer to pop 2 items
mul x10, x10, x6 // return n * fact (n − 1)
Non-Leaf Procedure Example
● Finally, fact branches again to the return address:
fact:
addi sp, sp, -16 // adjust stack for 2 items
sd x1, 8(sp) // save the return address
sd x10, 0(sp) // save the argument n
addi x5, x10, -1 // x5 = n - 1
bge x5, x0, L1 // if (n - 1) >= 0, go to L1
addi x10, x0, 1 // return 1
addi sp, sp, 16 // pop 2 items off stack
jalr x0, 0(x1) // return to caller
L1: addi x10, x10, -1 // n >= 1: argument gets (n − 1)
jal x1, fact // call fact with (n − 1)
addi x6, x10, 0 // return from jal: move result of fact (n - 1) to x6
ld x10, 0(sp) // restore argument n
ld x1, 8(sp) // restore the return address
addi sp, sp, 16 // adjust stack pointer to pop 2 items
mul x10, x10, x6 // return n * fact (n − 1)
jalr x0, 0(x1) // return to the caller
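To see what the stack is doing across the recursive calls, the same computation can be sketched in C with an explicit array of saved arguments. This is only a model: it tracks the saved n values and leaves out the return addresses. The names fact_explicit_stack and MAX_FRAMES are hypothetical, and the sketch assumes n <= 20 so that the result fits in 64 bits.

```c
#include <stdint.h>

#define MAX_FRAMES 64   /* hypothetical limit for this sketch */

/* Computes n! with an explicit stack of saved arguments, mirroring how
   each call to fact saves n before calling fact(n - 1). */
int64_t fact_explicit_stack(int64_t n)
{
    int64_t saved_n[MAX_FRAMES];
    int top = 0;

    /* "Call" phase: save n and descend until n < 1. */
    while (n >= 1 && top < MAX_FRAMES) {
        saved_n[top++] = n;    /* like: sd x10, 0(sp) */
        n -= 1;                /* like: addi x10, x10, -1 */
    }

    int64_t result = 1;        /* base case: n < 1 returns 1 */

    /* "Return" phase: pop each saved n and multiply it in. */
    while (top > 0) {
        result = saved_n[--top] * result;   /* like: ld x10, 0(sp) ; mul x10, x10, x6 */
    }
    return result;
}
```

For n = 5 the first loop saves 5, 4, 3, 2, 1 and the second loop multiplies them back in reverse order, giving 120.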
The end
● Read Sections 2.7 and 2.8 of your book.