RISC-V Theory
• Hierarchy of memories
A 16-bit compressed instruction is decompressed into a 32-bit instruction.
ARM vs RISC-V: Performance
• In the performance comparison between RISC-V and ARM, ARM's consistent iteration,
comprehensive ecosystem, and wide range of options give it a notable performance advantage.
• However, RISC-V's modular nature and customization potential hold
promise for specific use cases. The ongoing efforts of RISC-V
proponents to narrow the performance gap will be a crucial factor in
determining how well RISC-V can match ARM's established
performance standards in the future.
ARM vs RISC-V: Power Efficiency
• ARM's refined power management techniques and specialized cores
give it a major advantage in power efficiency.
• While RISC-V holds promise due to its customization potential, its
open nature requires a more extensive investment of time and
resources to fully harness its energy-saving capabilities.
RISC-V Architecture
• The RISC-V architecture is based on the RISC principles (as compared to
CISC), which emphasize a small, simple, and efficient instruction set.
The key architectural features of RISC-V include
• load-store architecture
• fixed-length 32-bit instruction format in the base ISA
• 32 general-purpose integer registers (x0 to x31) in RV32I and RV64I
• RISC-V defines several base integer instruction sets, such as RV32I (32-bit),
RV64I (64-bit), and RV128I (128-bit), for different register widths and
address space sizes.
• RISC-V uses little-endian byte ordering in the memory system: the least
significant byte of multi-byte data is stored at the lowest memory address.
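The byte-ordering rule can be sketched in C. This is a minimal illustration, not part of the original slides: the helper names store_word_le and load_word_le are hypothetical, and memory is modelled as a plain byte array.

```c
#include <stdint.h>

/* Store a 32-bit word at mem[0..3] in little-endian order:
   the least significant byte goes to the lowest address. */
void store_word_le(uint8_t *mem, uint32_t value) {
    for (int i = 0; i < 4; i++) {
        mem[i] = (uint8_t)(value >> (8 * i));
    }
}

/* Reassemble the word from its little-endian bytes. */
uint32_t load_word_le(const uint8_t *mem) {
    uint32_t value = 0;
    for (int i = 0; i < 4; i++) {
        value |= (uint32_t)mem[i] << (8 * i);
    }
    return value;
}
```

Storing 0x12345678 this way places 0x78 at the lowest address and 0x12 at the highest.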
RISC-V Architecture: Modularity & Extensibility
• One of the defining characteristics of RISC-V is its modularity and
extensibility.
• The ISA is designed to be easily extended with custom instructions
and coprocessors, allowing for tailored implementations that meet
specific application requirements.
• This flexibility is achieved through a modular design, where the base
ISA can be combined with optional standard extensions, such as the
M extension for integer multiplication and division, the A extension
for atomic operations, and the F and D extensions for single- and
double-precision floating-point arithmetic.
[Figure: base ISA plus standard extensions; register file x0 to x31, each register 64-bit in RV64]
In both RV32 and RV64, each instruction is encoded into 32 bits.
RISC-V Modes: Privilege levels & Virtual Memory
• The RISC-V Privileged Architecture Specification defines three privilege
levels:
1. machine mode (M-mode),
2. supervisor mode (S-mode),
3. user mode (U-mode).
• These privilege levels provide a mechanism for isolating the operating
system kernel, hypervisors, and user applications, ensuring system
security and stability.
• RISC-V also supports a virtual memory system based on a multi-level
page table scheme, enabling efficient memory management and
protection.
RISC-V Modes
• RISC-V Privileged Specification defines 3 levels of privilege, called Modes
• Machine mode is the highest privileged mode and the only required mode
– Flexibility allows for a range of targeted implementations, from simple MCUs to
high-performance Application Processors
– Example: for a simple bare-metal application, Machine mode is enough; it is the
default and mandatory mode. For an isolation boundary between the application and
more direct hardware access, M and U mode may both be supported.
– A robust system, such as a server or desktop machine, will support M, S, and U.
Virtualization support is added through the Hypervisor-extended Supervisor (HS) mode.
• Machine, Hypervisor, Supervisor modes each have Control and Status Registers (CSRs)

Level  Name              Abbr.
0      User/Application  U
1      Supervisor        S
2      Hypervisor        HS
3      Machine           M

Supported Combinations of Modes
Supported Levels  Modes
1                 M
2                 M, U
3                 M, S, U
4                 M, HS, S, U
We will discuss 4 addressing
modes (relevant for RISC-V)
• Immediate
• Register direct
• Register indirect
• Base-offset
[Figure: immediate mode, the value is the immediate (imm) itself]
2. Register Direct Mode
• Value ← r1: the value is obtained from the register directly.
[Figure: register file r0 to r15; r1 holds 784, so the value is 784]
Examples of instructions that use
those modes
• Register direct: sub r3, r1, r2
• r1 and r2 values are fetched from registers. Result is stored
in r3
[Figure: register indirect mode; a register holds the address 148, and the value 4019 is read from memory at that address]
4. Base-offset Addressing Mode
• Value ← offset(r1)
(1) Read the value of r1 from the register file. Add the offset to it. This gives a memory address.
(2) Read the value stored in memory at that address.
Let offset = 9
[Figure: r1 holds 451, so the address is 451 + 9 = 460; memory at address 460 holds 8914, which is the value]
Examples of instructions that use
those modes
Load and store instructions use register indirect and base-offset
addressing modes
(a) lw r1, 10(r2)   lw = load word: reads the word at memory address r2 + 10 into r1
(b) sw r1, 10(r2)   sw = store word: writes r1 to the word at memory address r2 + 10
Solved Example
Consider the instructions below, executed when the state of the register file and
memory is as given here.
ld x6, 24(x10)
sd x5, 16(x10)
Show only the changed values in the register file and memory after these
instructions are executed.
Solution: x10+24 is 1040. ld instruction will load from 1040 and store that value in
register x6. So, x6 will become 67891234
x10+16 is 1032. sd instruction will store the value of x5 at address 1032. Hence,
memory address 1032 will change to 3897409
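The solved example can be replayed with a small C model of base-offset addressing. This is only a sketch: the Machine struct, MEM_BASE, MEM_SIZE and the helper peek are hypothetical names, there is no bounds checking, and the starting values (x10 = 1016, x5 = 3897409, memory at 1040 holding 67891234) are inferred from the solution text.

```c
#include <stdint.h>
#include <string.h>

#define MEM_BASE 1000u  /* assumption: first byte address modelled */
#define MEM_SIZE 64u    /* assumption: just enough bytes for the example */

typedef struct {
    uint64_t x[32];          /* register file x0..x31 */
    uint8_t  mem[MEM_SIZE];  /* bytes at addresses MEM_BASE and up */
} Machine;

/* ld rd, offset(rs1): address = x[rs1] + offset, then read 8 bytes. */
void ld(Machine *m, int rd, int64_t offset, int rs1) {
    uint64_t addr = m->x[rs1] + (uint64_t)offset;
    memcpy(&m->x[rd], &m->mem[addr - MEM_BASE], sizeof(uint64_t));
}

/* sd rs2, offset(rs1): address = x[rs1] + offset, then write 8 bytes. */
void sd(Machine *m, int rs2, int64_t offset, int rs1) {
    uint64_t addr = m->x[rs1] + (uint64_t)offset;
    memcpy(&m->mem[addr - MEM_BASE], &m->x[rs2], sizeof(uint64_t));
}

/* Read the doubleword at a byte address (helper for inspecting memory). */
uint64_t peek(const Machine *m, uint64_t addr) {
    uint64_t v;
    memcpy(&v, &m->mem[addr - MEM_BASE], sizeof v);
    return v;
}
```

Calling ld(&m, 6, 24, 10) followed by sd(&m, 5, 16, 10) on that starting state reproduces the two changes listed in the solution.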
Arithmetic Operations
• Add and subtract, three operands
• Two sources and one destination
add a, b, c // a gets b + c
• All arithmetic operations have this form
• Design Principle 1: Simplicity favors regularity
• Regularity makes implementation simpler
• Simplicity enables higher performance at lower cost
Arithmetic Example
• C code:
f = (g + h) - (i + j);
• Compiled RISC-V code:
add t0, g, h // temp t0 = g + h
add t1, i, j // temp t1 = i + j
sub f, t0, t1 // f = t0 - t1
Register Operands
• Arithmetic instructions use register
operands
Register Description
• Register x0: RISC-V dedicates register x0 to be hard-wired to the value zero.
• Return address: A link to the calling site that allows a procedure to return to the proper address; in RISC-V it
is stored in register x1.
• Program counter (PC): The register containing the address of the instruction in the program being executed.
• Stack pointer: A value denoting the most recently allocated address in a stack that shows where registers
should be spilled or where old register values can be found. In RISC-V, it is register sp, or x2.
• Global pointer: The register that is reserved to point to the static area where static variables are stored.
• Frame pointer: A value denoting the location of the saved registers and local variables for a given procedure.
• Argument registers: x10 to x17 are used to pass arguments to a function. Before calling a function,
arguments are copied to these registers. If more than 8 arguments need to be passed, we use the stack.
• Temporary registers (t0 to t6): used to hold intermediate values during instruction or function
execution.
• Thread pointer: register tp, designed for thread-local data.
Register Operand Example
• C code:
f = (g + h) - (i + j);
• f, …, j in x19, x20, …, x23
• Instruction fields
• opcode: operation code
• rd: destination register number
• funct3: 3-bit function code (additional opcode)
• rs1: the first source register number
• rs2: the second source register number
• funct7: 7-bit function code (additional opcode)
R-format Example
funct7   rs2      rs1      funct3   rd       opcode
7 bits   5 bits   5 bits   3 bits   5 bits   7 bits
add x9,x20,x21
0        21       20       0        9        51
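Packing the six R-format fields into one 32-bit word can be sketched as follows. The function name encode_r_type is hypothetical; the field widths and positions are those listed above.

```c
#include <stdint.h>

/* Pack R-format fields into a 32-bit instruction word.
   Layout (bit 31 down to bit 0): funct7 | rs2 | rs1 | funct3 | rd | opcode */
uint32_t encode_r_type(uint32_t funct7, uint32_t rs2, uint32_t rs1,
                       uint32_t funct3, uint32_t rd, uint32_t opcode) {
    return ((funct7 & 0x7Fu) << 25) |
           ((rs2    & 0x1Fu) << 20) |
           ((rs1    & 0x1Fu) << 15) |
           ((funct3 & 0x07u) << 12) |
           ((rd     & 0x1Fu) << 7)  |
           (opcode  & 0x7Fu);
}
```

For add x9,x20,x21 the call encode_r_type(0, 21, 20, 0, 9, 51) yields 0x015A04B3.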
li a0, 4          # a0 ← 4
li a1, -3         # a1 ← 0xFFFFFFFD
slt a4, a0, a1    # a4 ← 0 because 4 < -3 is false
sltu a5, a0, a1   # a5 ← 1 because 0x4 < 0xFFFFFFFD is true
Variants of SLT
SEQZ: Set If Equal to Zero . Syntax: seqz rd, rs1
Example: seqz x6, x5
If x5 was zero, then x6 is set to 1. Otherwise, x6 is set to 0.
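The difference between the signed and unsigned comparisons can be modelled in C on 32-bit register values. This is a sketch with assumptions: the functions mirror the instruction names, and the cast to int32_t relies on the two's-complement behaviour of common compilers. seqz is modelled through its standard expansion, sltiu rd, rs1, 1.

```c
#include <stdint.h>

/* slt: compare as signed two's-complement values. */
uint32_t slt(uint32_t rs1, uint32_t rs2) {
    return ((int32_t)rs1 < (int32_t)rs2) ? 1u : 0u;
}

/* sltu: compare the same bit patterns as unsigned values. */
uint32_t sltu(uint32_t rs1, uint32_t rs2) {
    return (rs1 < rs2) ? 1u : 0u;
}

/* seqz: only zero is unsigned-less-than 1, so this is 1 exactly when rs1 == 0. */
uint32_t seqz(uint32_t rs1) {
    return sltu(rs1, 1u);
}
```

With a0 = 4 and a1 = 0xFFFFFFFD, slt gives 0 and sltu gives 1, matching the comments in the example above.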
Caller
The program that instigates a procedure and provides the necessary parameter values.
Callee
A procedure that executes a series of stored instructions based on parameters provided by the
caller and then returns control to the caller.
Procedure Call Instructions
• Procedure call: Jump and link
jal x1, ProcedureLabel
• Address of following instruction put in x1
• Jumps to target address ProcedureLabel
• Procedure return (Also a Pseudo Instruction):
ret
• Jumps to address in x1
• ret is same as “jalr x0, x1, 0”
• Jump and link register
jalr x0, 0(x1)
• Like jal, but jumps to 0 + address in x1
• Use x0 as rd (x0 cannot be changed)
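How to pass arguments/return values
• Solution: use registers
• Before calling a function, arguments are copied to registers a0 and a1
(a0 is the same as x10, a1 is the same as x11).
• The return value is stored in a0 itself.
.func:
    add a0, a0, a1      # a0 ← a0 + a1 (return value)
    ret
.main:
    li a0, 3            # first argument
    li a1, 5            # second argument
    jal x1, .func       # call; return address is saved in x1
    addi a2, a0, 10     # use the return value in a0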
Limitations with use of registers for
argument passing or returning
results
Space Problem
We have a limited number of registers
We cannot pass more than certain number of arguments
Solution : Use memory also
Overwrite Problem
What if a function calls itself ? (recursive call)
The callee can overwrite the registers of the caller
Solution : Spilling
Note: spilling is a technique in which a variable is moved out of a register to main
memory (the RAM) to make space for other variables that are used by the program currently
under execution.
Register Spilling
caller saved scheme
The caller can save the set of registers it needs
Call the function
And then restore the set of registers after the function returns
Known as the caller saved scheme
callee saved scheme
The callee saves the registers, and later restores them
Caller- or callee-saved conventions
[Figure: with a caller-saved register, the caller saves it before the call and restores it after the callee returns; with a callee-saved register, the callee saves it on entry and restores it before returning]
Saver is “caller”: means that a function caller must save that register
somewhere before calling e.g. main()
Saver is “callee”: means that if a function wants to use that register, it
must first save it somewhere, and restore it before returning e.g.
Limitations with our
approach
• Using memory, and spilling solves both the space problem and
overwrite problem
• However, there needs to be :
• a strict agreement between the caller and the callee
regarding the set of memory locations that need to be used
• Secondly, after a function has finished execution, all
the space that it uses needs to be reclaimed
Activation Block
int foo(int arg1) {
    int a, b, c;
    a = 3; b = 4;
    c = a + b + arg1;
    return c;
}
Activation block contents: Arguments, Return address, Register spill area, Local variables
• Activation block → memory map of a function
Stack
[Figure: the stack over time, as activation blocks for foo and bar are pushed and popped]
Working with the
Stack
• Allocate a part of the memory to save the stack
• Traditionally stacks are downward growing.
• The first activation block starts at the highest address
• Subsequent activation blocks are allocated lower addresses
• The stack pointer register (sp) points to the beginning of an
activation block
• Allocating an activation block : sp ← sp - <constant>
• De-allocating an activation block: sp ← sp + <constant>
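The two sp updates can be sketched in C. STACK_TOP, the Cpu struct and both function names are hypothetical; the model tracks only the stack pointer, not the contents of the activation blocks.

```c
#include <stdint.h>

#define STACK_TOP 0x8000u  /* assumption: highest address of the stack region */

typedef struct {
    uint64_t sp;  /* stack pointer register */
} Cpu;

/* Allocate an activation block: the stack grows towards lower addresses. */
void alloc_block(Cpu *cpu, uint64_t bytes) {
    cpu->sp -= bytes;
}

/* De-allocate an activation block: sp moves back up by the same constant. */
void free_block(Cpu *cpu, uint64_t bytes) {
    cpu->sp += bytes;
}
```

A nested call allocates its block below the caller's, and de-allocating in reverse order brings sp back to its starting value.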
Saving variables on the stack (pushing and popping)
myFunction:
    addi sp,sp,-24      # allocate 24 bytes on the stack
    sd x5,16(sp)        # save x5, x6, x20 on the stack (called pushing)
    sd x6,8(sp)
    sd x20,0(sp)
    add x5,x10,x11      # do some processing of the function
    add x6,x12,x13
    sub x20,x5,x6
    addi x10,x20,0      # copy the result into x10
    ld x20,0(sp)        # restore x5, x6, x20 from the stack (called popping)
    ld x6,8(sp)
    ld x5,16(sp)
    addi sp,sp,24       # release the stack space
    ret                 # return to caller
This is an example of the callee saved scheme.
How Stack
Functions
[Figure: byte layout of three 32-bit words stored at addresses 1012, 1016 and 1020]
Address  (a)  (b)
1012     68   AC
1013     24   E0
1014     E0   24
1015     AC   68
1016     09   AB
1017     EF   CD
1018     CD   EF
1019     AB   09
1020     78   12
1021     56   34
1022     34   56
1023     12   78
RISC-V is little endian. Hence, (a) is correct for the RISC-V ISA.
(b) is correct for any big-endian ISA.
SB-format fields (bit 31 down to bit 0):
imm[12]  imm[10:5]  rs2  rs1  funct3  imm[4:1]  imm[11]  opcode
PC-relative addressing
Target address = PC + immediate × 2
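The target computation can be sketched in C. The function names sb_offset and branch_target are hypothetical, and the field positions assumed are the standard branch encoding: imm[12] in bit 31, imm[10:5] in bits 30:25, imm[4:1] in bits 11:8, imm[11] in bit 7. Bit 0 of the offset is never stored, which is where the "× 2" comes from.

```c
#include <stdint.h>

/* Extract the signed byte offset from an SB-format instruction word. */
int32_t sb_offset(uint32_t insn) {
    uint32_t imm = (((insn >> 31) & 0x01u) << 12) |  /* imm[12]   */
                   (((insn >> 7)  & 0x01u) << 11) |  /* imm[11]   */
                   (((insn >> 25) & 0x3Fu) << 5)  |  /* imm[10:5] */
                   (((insn >> 8)  & 0x0Fu) << 1);    /* imm[4:1]  */
    /* Sign-extend from bit 12. */
    if (imm & 0x1000u) {
        return (int32_t)imm - 0x2000;
    }
    return (int32_t)imm;
}

/* PC-relative addressing: target = PC + offset. */
uint32_t branch_target(uint32_t pc, uint32_t insn) {
    return pc + (uint32_t)sb_offset(insn);
}
```

For example, the word 0x00000463 encodes beq x0, x0, 8, so a branch at PC 0x1000 targets 0x1008.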
RISC-V UJ-format Instructions : Jump
Addressing
• Jump and link (jal) target uses 20-bit immediate for
larger range
• UJ format:
Machine Status (mstatus) - The Most Important CSR
Controls and tracks the hart’s current operating state
Timer CSRs
• mtime
– RISC-V defines a requirement for a counter exposed as a memory mapped register
– There is no frequency requirement on the timer, but
• It must run at a constant frequency
• The platform must expose the frequency
• mtimecmp
– RISC-V defines a memory mapped timer compare register
– Triggers an interrupt when mtime is greater than or equal to mtimecmp
Supervisor CSRs
• Most of the Machine mode CSRs have Supervisor mode equivalents
– Supervisor mode CSRs can be used to control the state of Supervisor and User Modes.
– Most equivalent Supervisor CSRs have the same mapping as Machine mode without
Machine mode control bits
– sstatus, stvec, sip, sie, sepc, scause, satp, and more
• satp - Supervisor Address Translation and Protection Register
– Used to control Supervisor mode address translation and protection
– Virtual Memory requires a Supervisor mode implementation

RV32 satp CSR
Bits     Field Name  Description
[21:0]   PPN         Physical Page Number of the root page table
[30:22]  ASID        Address Space Identifier
31       MODE        MODE=1 uses Sv32 Address Translation

RV64 satp CSR
Bits     Field Name  Description
[43:0]   PPN         Physical Page Number of the root page table
[59:44]  ASID        Address Space Identifier
[63:60]  MODE        Encodings for Bare, Sv39, Sv48
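Building an RV32 satp value from its three fields can be sketched as below. The function names are hypothetical; the layout assumed is the RV32 one (PPN in bits 21:0, ASID in bits 30:22, MODE in bit 31).

```c
#include <stdint.h>

/* Compose an RV32 satp value: MODE (1 bit) | ASID (9 bits) | PPN (22 bits). */
uint32_t make_satp_rv32(uint32_t mode, uint32_t asid, uint32_t root_ppn) {
    return ((mode & 0x1u) << 31) |
           ((asid & 0x1FFu) << 22) |
           (root_ppn & 0x3FFFFFu);
}

/* Extract the physical page number of the root page table. */
uint32_t satp_ppn(uint32_t satp) {
    return satp & 0x3FFFFFu;
}

/* Extract the address space identifier. */
uint32_t satp_asid(uint32_t satp) {
    return (satp >> 22) & 0x1FFu;
}
```

With 4 KiB pages, the root page table sits at physical address PPN × 4096. On real hardware the composed value would be written to satp with a CSR write instruction, which this sketch does not model.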
Virtual Memory
• RISC-V has support for Virtual Memory allowing for sophisticated memory
management and OS support (Linux)
• Requires an S-Mode implementation
• Sv32
– 32bit Virtual Address
– 4KiB and 4MiB pages (2-level page table)
• Sv39 (requires an RV64 implementation)
– 39bit Virtual Address
– 4KiB, 2MiB, 1GiB pages (3-level page table)
• Sv48 (requires an RV64 implementation)
[Figure: virtual addresses 0x0000_0000 to 0xFFFF_FFFF mapped onto physical addresses 0x0000_0000 to 0xFFFF_FFFF]
If User mode tries to access regions other than the grey zones, the access
will TRAP to machine mode.
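How an Sv32 virtual address splits into page-table indices can be sketched in C. The struct and function names are hypothetical; the split assumed is the Sv32 one: a 12-bit page offset and two 10-bit virtual page numbers, one per page-table level.

```c
#include <stdint.h>

typedef struct {
    uint32_t vpn1;    /* index into the root (level 1) page table */
    uint32_t vpn0;    /* index into the level 0 page table */
    uint32_t offset;  /* byte offset inside the 4 KiB page */
} Sv32Va;

/* Split a 32-bit virtual address into its Sv32 fields. */
Sv32Va sv32_split(uint32_t va) {
    Sv32Va parts;
    parts.vpn1   = (va >> 22) & 0x3FFu;
    parts.vpn0   = (va >> 12) & 0x3FFu;
    parts.offset = va & 0xFFFu;
    return parts;
}
```

A 4 MiB page is what results when translation stops at the first level, so vpn0 and the offset together form a 22-bit offset into it.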
RISC-V Interrupts
• RISC-V defines the following interrupts per Hart/ Core
– Software – architecturally defined software interrupt
– Timer – architecturally defined timer interrupt
– External – Peripheral Interrupts
– Local – Hart/Core specific Peripheral Interrupts i.e.
specific to a particular core
Machine Status (mstatus) – As it relates to Interrupts
Bits     Field Name  Description
0        UIE         User Interrupt Enable
1        SIE         Supervisor Interrupt Enable
2                    Reserved
3        MIE         Machine Interrupt Enable
4        UPIE        User Previous Interrupt Enable
5        SPIE        Supervisor Previous Interrupt Enable
6                    Reserved
7        MPIE        Machine Previous Interrupt Enable
8        SPP         Supervisor Previous Privilege
[10:9]               Reserved
[12:11]  MPP         Machine Previous Privilege
[14:13]  FS          Floating Point State
[16:15]  XS          User Mode Extension State
17       MPRV        Modify Privilege (access memory as MPP)
18       SUM         Permit Supervisor User Memory Access
19       MXR         Make Executable Readable
20       TVM         Trap Virtual Memory
21       TW          Timeout Wait (traps S-Mode wfi)
22       TSR         Trap SRET
[30:23]              Reserved
31       SD          State Dirty (FS and XS summary bit)
Bits 0 to 7 are the interrupt-specific bits for the different modes.
Machine Interrupt Cause CSR (mcause)
• Interrupts are identified by reading the mcause CSR
• The Interrupt field determines if a trap was caused by an interrupt or an exception

Bits    Field Name  Description
XLEN-1  Interrupt   Identifies whether the trap was synchronous (exception) or asynchronous (interrupt)

Interrupt = 1 (interrupt)
Exception Code  Description
0               User Software Interrupt
1               Supervisor Software Interrupt
2               Reserved
3               Machine Software Interrupt
4               User Timer Interrupt
5               Supervisor Timer Interrupt

Interrupt = 0 (exception)
Exception Code  Description
0               Instruction Address Misaligned
1               Instruction Access Fault
2               Illegal Instruction
3               Breakpoint
4               Load Address Misaligned
5               Load Access Fault
6               Store/AMO Address Misaligned
7               Store/AMO Access Fault
8               Environment Call from U-mode
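Decoding mcause can be sketched in C. The sketch assumes RV32 (XLEN = 32), so the Interrupt field is bit 31; the mask names MCAUSE_INT and MCAUSE_CAUSE follow the handler code used later in these slides, but their values here are assumptions.

```c
#include <stdint.h>
#include <stdbool.h>

#define MCAUSE_INT   0x80000000u  /* assumption: bit XLEN-1 with XLEN = 32 */
#define MCAUSE_CAUSE 0x7FFFFFFFu  /* remaining bits hold the exception code */

/* True when the trap was asynchronous (an interrupt). */
bool mcause_is_interrupt(uint32_t mcause) {
    return (mcause & MCAUSE_INT) != 0;
}

/* Exception code, valid for both interrupts and exceptions. */
uint32_t mcause_code(uint32_t mcause) {
    return mcause & MCAUSE_CAUSE;
}
```

For example, 0x80000003 decodes as interrupt code 3 (Machine Software Interrupt), while 0x00000002 decodes as exception code 2 (Illegal Instruction).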
[Fragment of the interrupt enable/pending CSR slide. Recoverable entries:]
Bits  Field Name  Description
2                 Reserved
3     MSIE        Machine Software Interrupt Enable
4     UTIE        User Timer Interrupt Enable
9     SEIE        Supervisor External Interrupt Enable
– Can be used for polling
Machine Trap Vector CSR (mtvec)
mtvec sets the Base interrupt vector and the interrupt Mode

mtvec CSR
Bits        Field Name  Description
[XLEN-1:6]  Base        Machine Trap Vector Base Address. 64-byte Alignment
[1:0]       Mode        MODE Sets the interrupt processing mode.

mtvec Modes
Value   Name      Description
0x0     Direct    All Exceptions set PC to mtvec.BASE. Requires 4-Byte alignment
0x1     Vectored  Asynchronous interrupts set pc to mtvec.BASE + (4×mcause.EXCCODE). Requires 4-Byte alignment
> 0x01  Reserved

• mtvec.Mode = Direct
– All Interrupts trap to the address mtvec.Base
– Software must read the mcause CSR and react accordingly
• mtvec.Mode = Vectored
– Interrupts trap to the address mtvec.Base + (4*mcause.ExCode)
– Eliminates the need to read mcause for asynchronous exceptions
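The choice of trap address in the two modes can be sketched in C. The function name trap_target is hypothetical, XLEN = 32 is assumed, and the base is taken as mtvec with the two Mode bits cleared; reserved Mode values are treated like Direct purely for the purpose of the sketch.

```c
#include <stdint.h>
#include <stdbool.h>

/* Compute the address the hart jumps to when a trap is taken. */
uint32_t trap_target(uint32_t mtvec, bool is_interrupt, uint32_t cause_code) {
    uint32_t mode = mtvec & 0x3u;   /* mtvec.Mode, bits [1:0] */
    uint32_t base = mtvec & ~0x3u;  /* mtvec.Base with the mode bits cleared */

    /* Vectored mode only redirects asynchronous interrupts. */
    if (mode == 0x1u && is_interrupt) {
        return base + 4u * cause_code;
    }
    /* Direct mode, and exceptions in Vectored mode, go to the base. */
    return base;
}
```

With mtvec = 0x20000001 (Vectored), interrupt code 7 is delivered to 0x2000001C, while an exception still lands at 0x20000000.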
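Trap Handler – Entry and Exit
mtvec.MODE = Direct
[Figure: on trap entry, MEPC ← PC, mstatus.MPP ← current privilege, mstatus.MPIE ← mstatus.MIE, and mstatus.MIE ← 0; trap exit restores these in the opposite direction]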
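The state changes on trap entry and on return (mret) can be sketched in C on a tiny hart model. The Hart struct and function names are hypothetical; the bit positions are the mstatus ones (MIE bit 3, MPIE bit 7, MPP bits 12:11). The sketch assumes Direct mode and a hart that implements User mode, so mret resets MPP to 0.

```c
#include <stdint.h>

#define MSTATUS_MIE       (1u << 3)
#define MSTATUS_MPIE      (1u << 7)
#define MSTATUS_MPP_SHIFT 11
#define MSTATUS_MPP_MASK  (3u << MSTATUS_MPP_SHIFT)
#define PRIV_M            3u

typedef struct {
    uint32_t pc, mepc, mstatus, mtvec_base;
    uint32_t priv;  /* current privilege level: 0 = U, 1 = S, 3 = M */
} Hart;

/* Trap entry: save PC, privilege and MIE, then disable interrupts. */
void trap_enter(Hart *h) {
    uint32_t s = h->mstatus & ~(MSTATUS_MIE | MSTATUS_MPIE | MSTATUS_MPP_MASK);
    if (h->mstatus & MSTATUS_MIE) {
        s |= MSTATUS_MPIE;                       /* MPIE <- MIE */
    }
    s |= (h->priv & 3u) << MSTATUS_MPP_SHIFT;    /* MPP <- privilege */
    h->mepc = h->pc;                             /* MEPC <- PC */
    h->mstatus = s;                              /* MIE is now 0 */
    h->priv = PRIV_M;
    h->pc = h->mtvec_base;
}

/* Trap exit (mret): restore what trap_enter saved. */
void trap_return(Hart *h) {
    uint32_t s = h->mstatus;
    uint32_t prev = (s & MSTATUS_MPP_MASK) >> MSTATUS_MPP_SHIFT;
    s &= ~MSTATUS_MIE;
    if (s & MSTATUS_MPIE) {
        s |= MSTATUS_MIE;                        /* MIE <- MPIE */
    }
    s |= MSTATUS_MPIE;                           /* MPIE <- 1 */
    s &= ~MSTATUS_MPP_MASK;                      /* MPP <- U (assumed supported) */
    h->mstatus = s;
    h->priv = prev;
    h->pc = h->mepc;
}
```

A round trip through trap_enter and trap_return leaves the hart at its original PC and privilege with interrupts enabled again.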
Interrupt Handler Code Flow
Step 1: C code writes the trap_entry address to mtvec
    //write trap_entry address to mtvec
    write_csr(mtvec, ((unsigned long)&trap_entry));
Steps 2, 3 and 5: RISC-V Assembly interrupt handler to Push and Pop register file
    .align 2
    .global trap_entry
    trap_entry:
        //Step 2: make room on the stack
        addi sp, sp, -16*REGBYTES
        //store ABI Caller Registers
        STORE x1, 0*REGBYTES(sp)
        STORE x5, 2*REGBYTES(sp)
        …
        STORE x30, 14*REGBYTES(sp)
        STORE x31, 15*REGBYTES(sp)
        //Step 3: call C Code Handler
        call handle_trap
        //Step 5: restore ABI Caller Registers
        LOAD x1, 0*REGBYTES(sp)
        LOAD x5, 2*REGBYTES(sp)
        …
        LOAD x30, 14*REGBYTES(sp)
        LOAD x31, 15*REGBYTES(sp)
        //release the stack space and return from the trap (not shown on the slide)
        addi sp, sp, 16*REGBYTES
        mret
Step 4: C Code Handler determines interrupt cause and branches to the appropriate function
    void handle_trap()
    {
        unsigned long mcause = read_csr(mcause);
        if (mcause & MCAUSE_INT) {
            //mask interrupt bit and branch to handler
            isr_handler[mcause & MCAUSE_CAUSE]();
        } else {
            //branch to handler
            exception_handler[mcause]();
        }
    }
RISC-V Global Interrupts
• RISC-V defines a Global Interrupt as an interrupt which can be routed to any
hart in a system
PLIC Interrupt Code
Example
• In this example an interrupt is presented to the PLIC
• The PLIC signals an interrupt to a hart using the Machine External Interrupt (interrupt 11)
• The interrupt handler (handle_trap) branches to the defined function to handle the Machine External Interrupt
– C Code placed the address of machine_external_interrupt function in location 11 of the async_handler vector table
• The machine_external_interrupt handler does the following:
– Reads the PLIC’s claim/complete register to determine highest priority pending interrupt
– Uses another vector table to branch to the interrupt’s specific handler
– Completes the interrupt by writing the interrupt number back to the PLIC’s claim/complete register
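The claim/complete sequence can be sketched in C. Everything platform-specific here is an assumption: the claim/complete register is passed in as a pointer instead of a fixed memory-mapped address, PLIC_MAX_SOURCES and the sample uart_handler are hypothetical, and an ID of 0 is taken to mean that no interrupt is pending.

```c
#include <stdint.h>

#define PLIC_MAX_SOURCES 8  /* assumption: size of the per-source vector table */

typedef void (*irq_handler_t)(void);

int uart_irq_count = 0;  /* hypothetical state touched by the sample handler */

/* Hypothetical handler for one interrupt source. */
void uart_handler(void) {
    uart_irq_count++;
}

/* Handler for the Machine External Interrupt (interrupt 11). */
void machine_external_interrupt(volatile uint32_t *claim_complete,
                                irq_handler_t const table[]) {
    /* Claim: reading returns the highest priority pending interrupt ID. */
    uint32_t id = *claim_complete;
    if (id == 0u) {
        return;  /* nothing pending */
    }
    /* Branch through the second vector table to the specific handler. */
    if (id < PLIC_MAX_SOURCES && table[id] != 0) {
        table[id]();
    }
    /* Complete: write the same ID back to the claim/complete register. */
    *claim_complete = id;
}
```

On real hardware the pointer would refer to the PLIC's claim/complete register for the hart; here an ordinary variable can stand in for it.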