Lec04 Control
2025 Spring
Overview
1 Introduction
2 Control Instructions
3 Accessing Procedures
4 Summary
Introduction
RISC-V Instruction Fields
Instruction Categories
• Load and Store instructions
• Bitwise instructions
• Arithmetic instructions
• Control transfer instructions
• Pseudo instructions
RISC-V Register File
• Holds thirty-two 32-bit general purpose registers
• Two read ports
• One write port
[Figure: Register File with 32 locations of 32 bits each. Inputs: 5-bit src1 addr, 5-bit src2 addr, 5-bit dst addr, 32-bit write data, write control. Outputs: 32-bit src1 data, 32-bit src2 data.]
Registers are
• Faster than main memory
• But register files with more locations are slower
• E.g., a 64 word file may be 50% slower than a 32 word file
• Read/write port increase impacts speed quadratically
• Easier for a compiler to use
• E.g., (A*B)-(C*D)-(E*F) can do the multiplies in any order, unlike on a stack
• Can hold variables so that code density improves (since registers are named with
fewer bits than a memory location)
Aside: RISC-V Register Convention
Control Instructions
RISC-V Control Flow Instructions
Example
if (i==j) h = i + j;
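The assembly for this example did not survive on the slide. A minimal sketch in C, assuming i, j, and h live in s0, s1, and s3 (a hypothetical register assignment, not given in the source): the goto mirrors a bne that skips the add when the C condition is false.

```c
/* Sketch: branch-around form of `if (i == j) h = i + j;`.
   Register names in the comments are assumed for illustration only. */
int if_eq_add(int i, int j, int h)
{
    if (i != j) goto skip;  /* bne s0, s1, skip   (inverted C condition) */
    h = i + j;              /* add s3, s0, s1 */
skip:
    return h;               /* h is unchanged when i != j */
}
```

Branching on the inverted condition keeps the "then" body on the fall-through path, so only one branch is needed.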
RISC-V Control Flow Instructions
• We have beq, bne, but what about other kinds of branches (e.g., branch-if-less-than)?
• For this, we need yet another instruction, slt
In Support of Branch Instructions
Can use slt, beq, bne, and the fixed value of 0 in register zero to create other
conditions
• less than: blt s1, s2, Label
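The construction above can be sketched in C. Each function models an slt followed by a beq or bne against register zero and returns 1 when the synthesized branch would be taken. The scratch register t0 and the function names are assumptions for illustration; note that RV32I also defines blt and bge as native branch instructions, so this only illustrates the slt-based approach.

```c
/* slt rd, rs1, rs2: rd = 1 if rs1 < rs2 (signed), else 0 */
static int slt(int rs1, int rs2) { return rs1 < rs2 ? 1 : 0; }

/* less than:             slt t0, s1, s2 ; bne t0, zero, Label */
int taken_lt(int s1, int s2) { return slt(s1, s2) != 0; }

/* greater than or equal: slt t0, s1, s2 ; beq t0, zero, Label */
int taken_ge(int s1, int s2) { return slt(s1, s2) == 0; }

/* greater than:          slt t0, s2, s1 ; bne t0, zero, Label */
int taken_gt(int s1, int s2) { return slt(s2, s1) != 0; }

/* less than or equal:    slt t0, s2, s1 ; beq t0, zero, Label */
int taken_le(int s1, int s2) { return slt(s2, s1) == 0; }
```

Swapping the slt operands and choosing between beq and bne is enough to cover all four signed orderings.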
Bounds Check Shortcut
• Treating signed numbers as if they were unsigned gives a low cost way of checking if
0 ≤ x < y (index out of bounds for arrays)
• The key is that negative integers in two’s complement look like large numbers in
unsigned notation.
• Thus, an unsigned comparison of x < y also checks if x is negative as well as if x is
less than y.
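A minimal C sketch of the shortcut, assuming 32-bit values and a non-negative bound y (for example an array length); the function name is hypothetical. In RISC-V the same test is a single unsigned branch such as bgeu.

```c
#include <stdint.h>

/* Returns 1 when 0 <= x < y, using one unsigned comparison.
   Assumes y >= 0, e.g. an array length. */
int in_bounds(int32_t x, int32_t y)
{
    /* A negative x reinterpreted as unsigned is at least 2^31,
       which is larger than any non-negative 32-bit y. */
    return (uint32_t)x < (uint32_t)y;
}
```

With y = 10, the call in_bounds(-1, 10) compares 4294967295 against 10 and so rejects the negative index without a separate x >= 0 test.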
Other Control Flow Instructions
EX: Compiling a while Loop in C
while (save[i] == k) i += 1;
Assume that i and k correspond to registers s3 and s5 and the base of the array save is in
s6.
Note: shift s3 left by 2 (multiply i by 4) to turn the index into a byte offset for the word
address; i itself is later increased by 1.
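The loop can be sketched in C with gotos that mirror the branch structure. The instruction sequence in the comments is one plausible RV32 translation, not necessarily the one on the original slide; t0 and t1 are assumed scratch registers and the labels are made up.

```c
#include <stdint.h>

/* Returns the final value of i for `while (save[i] == k) i += 1;`.
   i is s3, k is s5, and the base of save is s6, as on the slide. */
int while_loop(const int32_t *save, int i, int32_t k)
{
    const char *s6 = (const char *)save;  /* byte-addressed base of save[] */
    uint32_t t0;
    int32_t t1;
loop:
    t0 = (uint32_t)i << 2;                /* slli t0, s3, 2   (i * 4 bytes) */
    t1 = *(const int32_t *)(s6 + t0);     /* add t0, t0, s6 ; lw t1, 0(t0) */
    if (t1 != k) goto done;               /* bne t1, s5, Exit */
    i += 1;                               /* addi s3, s3, 1 */
    goto loop;                            /* beq zero, zero, Loop */
done:
    return i;
}
```

For save = {7, 7, 7, 3}, i = 0, and k = 7, the loop exits with i equal to 3.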
Six Steps in Execution of a Procedure
1 Main routine (caller) places parameters in a place where the procedure (callee) can
access them
• a0 – a7: eight argument registers
2 Caller transfers control to the callee
3 Callee acquires the storage resources needed
4 Callee performs the desired task
5 Callee places the result value in a place where the caller can access it
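• s0-s11: 12 value registers for result values (note: the standard RISC-V calling
convention returns results in a0 – a1 and treats s0 – s11 as callee-saved registers)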
6 Callee returns control to the caller
• ra: one return address register to return to the point of origin
Accessing Procedures
Instructions for Accessing Procedures
• Saves PC + 4 in register ra to have a link to the next instruction for the procedure
return
• Machine format (J format):
• Then can do procedure return with a
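The link-and-return behavior can be modeled in a few lines of C. This sketch assumes the call instruction described above is jal with ra as its destination and that the return is jalr zero, 0(ra), the form used in the leaf example later in this lecture; the Cpu struct and function names are hypothetical.

```c
#include <stdint.h>

typedef struct {
    uint32_t pc;  /* address of the instruction being executed */
    uint32_t ra;  /* return address register */
} Cpu;

/* jal ra, offset: save PC + 4 in ra, then jump PC-relative */
void jal_ra(Cpu *cpu, int32_t offset)
{
    cpu->ra = cpu->pc + 4;
    cpu->pc = cpu->pc + (uint32_t)offset;
}

/* jalr zero, 0(ra): jump to the address held in ra; the link value is
   discarded because the destination register is zero */
void jalr_zero_ra(Cpu *cpu)
{
    cpu->pc = cpu->ra & ~(uint32_t)1;  /* jalr clears bit 0 of the target */
}
```

Calling jal_ra with pc = 0x1000 and offset 0x200 leaves ra = 0x1004 and pc = 0x1200; the matching jalr_zero_ra then resumes at 0x1004.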
Example of Accessing Procedures
• For a procedure that computes the GCD of two values i (in t0) and j (in t1):
gcd(i,j);
• The caller puts i and j (the parameter values) in a0 and a1 and issues a
• The callee computes the GCD, puts the result in s0, and returns control to the caller
using
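For reference, the callee's job can be sketched in C. The slide does not say which algorithm the procedure uses; Euclid's algorithm is assumed here, with non-negative inputs, and the parameter names simply echo the argument registers.

```c
/* Sketch of the callee: the parameters arrive as a0 and a1, and the
   returned value is what the caller reads back as the result. */
int gcd_sketch(int a0, int a1)
{
    while (a1 != 0) {       /* loop until the remainder is zero */
        int t = a0 % a1;
        a0 = a1;
        a1 = t;
    }
    return a0;
}
```

For example, gcd_sketch(48, 18) reduces through (18, 12), (12, 6), (6, 0) and returns 6.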
What if the callee needs to use more registers than those allocated to arguments and
return values?
Allocating Space on the Stack
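A toy model of the stack in C, assuming a stack that grows toward lower addresses with sp pointing at the last word pushed (consistent with the addi sp, sp, -8 used in the leaf example in this lecture). The Stack type, its size, and the function names are made up for illustration.

```c
#include <stdint.h>

#define STACK_WORDS 64

typedef struct {
    uint32_t mem[STACK_WORDS];  /* toy stack memory, one word per slot */
    uint32_t sp;                /* byte offset of the current top of stack */
} Stack;

/* An empty stack has sp just past the highest word. */
void stack_init(Stack *s) { s->sp = STACK_WORDS * 4; }

/* addi sp, sp, -4 ; sw reg, 0(sp) */
void push_word(Stack *s, uint32_t reg)
{
    s->sp -= 4;
    s->mem[s->sp / 4] = reg;
}

/* lw reg, 0(sp) ; addi sp, sp, 4 */
uint32_t pop_word(Stack *s)
{
    uint32_t reg = s->mem[s->sp / 4];
    s->sp += 4;
    return reg;
}
```

Pushing two words lowers sp by 8 bytes, and popping returns them in reverse order, which is why registers are restored in the opposite order to the one in which they were saved.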
EX-3: Compiling a C Leaf Procedure
Leaf procedures are ones that do not call other procedures. Give the RISC-V assembly
code for the following function.
int leaf_ex (int g, int h, int i, int j)
{
    int f;
    f = (g+h) - (i+j);
    return f;
}
Solution:
Suppose g, h, i, and j are in a0, a1, a2, and a3, respectively.
leaf_ex:
    addi sp, sp, -8      # make stack room for two words
    sw   t1, 4(sp)       # save t1 on stack
    sw   t0, 0(sp)       # save t0 on stack
    add  t0, a0, a1      # t0 = g + h
    add  t1, a2, a3      # t1 = i + j
    sub  s0, t0, t1      # f = (g + h) - (i + j), result in s0
    lw   t0, 0(sp)       # restore t0
    lw   t1, 4(sp)       # restore t1
    addi sp, sp, 8       # adjust stack ptr
    jalr zero, 0(ra)     # return to caller
Nested Procedures
Nested procedures (cont.)
rt_2: . . .
• On the call to rt_1, the return address (the next instruction in the caller routine) gets
stored in ra.
Question:
What happens to the value in ra (when a0 != 0) when rt_1 makes a call to rt_2?
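One way to explore the question is a small C model of ra across the two calls. All addresses are made up, and the save and restore steps are shown as stack stores and loads on the assumption that rt_1 keeps ra on the stack; the Machine type and function names are hypothetical.

```c
#include <stdint.h>

typedef struct { uint32_t pc, ra; } Machine;

/* jal ra, target: link to the next instruction, then jump */
static void call(Machine *m, uint32_t target)
{
    m->ra = m->pc + 4;
    m->pc = target;
}

/* Returns the address rt_1 ends up returning to. */
uint32_t nested_return_target(int save_ra)
{
    Machine m = { 0x100, 0 };   /* caller's call site at 0x100 (made up) */
    uint32_t saved = 0;

    call(&m, 0x400);            /* caller: jal rt_1, so ra = 0x104 */
    if (save_ra) saved = m.ra;  /* rt_1:   sw ra, 0(sp) */
    m.pc = 0x420;               /* rt_1 reaches its own call site */
    call(&m, 0x800);            /* rt_1:   jal rt_2, so ra = 0x424 */
    m.pc = m.ra;                /* rt_2:   jalr zero, 0(ra), back in rt_1 */
    if (save_ra) m.ra = saved;  /* rt_1:   lw ra, 0(sp) */
    return m.ra;                /* rt_1:   jalr zero, 0(ra) jumps here */
}
```

Without the save, the second call overwrites ra, so rt_1 would "return" to 0x424 inside itself instead of to 0x104 in the caller.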
Compiling a Recursive Procedure
Compiling a Recursive Procedure (cont.)
Compiling a Recursive Procedure (cont.)
[Figure: C program → compiler → assembly code → assembler → linker → loader → memory]
Compiler Benefits
The un-optimized code has the best CPI (clock cycles per instruction), the O1 version has
the lowest instruction count, but the O3 version is the fastest.
gcc opt         Relative performance   Clock cycles (M)   Instr count (M)   CPI
None            1.00                   158,615            114,938           1.38
O1 (medium)     2.37                   66,990             37,470            1.79
O2 (full)       2.38                   66,521             39,993            1.66
O3 (proc mig)   2.41                   65,747             44,993            1.46
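The table's derived columns can be recomputed from the raw counts. A small sketch in C, assuming relative performance is the ratio of clock cycles at a fixed clock rate (the function names are hypothetical):

```c
/* CPI = clock cycles / instruction count (both given in millions) */
double cpi(double cycles_m, double instr_m)
{
    return cycles_m / instr_m;
}

/* Performance relative to a baseline build, assuming the same clock rate,
   so execution time is proportional to clock cycles */
double relative_perf(double base_cycles_m, double cycles_m)
{
    return base_cycles_m / cycles_m;
}
```

For the unoptimized row, cpi(158615, 114938) is about 1.38; for O3, relative_perf(158615, 65747) is about 2.41. O3 executes more instructions than O1 but needs fewer cycles in total, which is why it is fastest despite not having the lowest instruction count.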
Addressing Modes Illustrated