RISC-V Theory

RISC-V is an open standard instruction set architecture (ISA) that promotes flexibility, control, and visibility for hardware and software customization, enabling innovation in processor design. It was developed at UC Berkeley in 2010 and released in 2015, allowing free use without royalties. RISC-V's modular architecture supports various extensions and addressing modes, making it suitable for diverse applications, including AI and embedded systems.

RISC-V


Eight Great Ideas


• Design for Moore’s Law

• Use abstraction to simplify design

• Make the common case fast

• Performance via parallelism

• Performance via pipelining

• Performance via prediction

• Hierarchy of memories

• Dependability via redundancy


Below Your Program
• Application software
• Written in high-level language
• System software
• Compiler: translates HLL code to
machine code
• Operating System: service code
• Handling input/output
• Managing memory and storage
• Scheduling tasks & sharing resources
• Hardware
• Processor, memory, I/O controllers
Levels of Program Code
• High-level language
• Level of abstraction closer to
problem domain
• Provides for productivity and
portability
• Assembly language
• Textual representation of
instructions
• Hardware representation
• Binary digits (bits)
• Encoded instructions and data
Why RISC V?
• As silicon vendors look for ways to innovate, reduce costs and stay ahead
of the competition, many are now turning to RISC-V, the open
standard instruction set architecture (ISA).
• RISC-V is enabling a new, collaborative era of processor innovation.
• Because it’s open standard, architects, designers and developers can
modify and improve upon the existing ISA codebase as needed.
• This flexibility has allowed teams to experiment with architectures
and chip designs within the same ISA and has enabled a collaborative
environment
History of RISC-V.
• The development of the RISC-V standard began in 2010 when
researchers at the University of California, Berkeley created a simple,
yet powerful ISA that could be used by anyone with minimal
restrictions.
• It was released in 2015 as a free and open-standard ISA that allows
anyone to design, manufacture and sell processors based on the RISC-
V specification — without royalties or license fees.
• Now managed by RISC-V International (riscv.org), formerly the RISC-V Foundation
Industry Opinion about RISC V &
Applications
• RISC-V has grown rapidly due to its flexibility and cost-saving benefits:
it can be an ideal way to customize computing environments
without the high costs of proprietary ISAs.
• These advantages are quickly being recognized across industries, and
businesses have been taking advantage of RISC-V cores for all kinds of
applications, including:
• Artificial intelligence (AI) image sensors,
• Security management,
• AI computing,
• Machine control systems for 5G networks, and
• More sophisticated storage, graphics and machine learning applications.
Advantage of RISC V
The 3 main advantages of RISC-V
• Flexibility: RISC-V offers a unique set of features that allow users to customize
and optimize both software and hardware for specific use cases, resulting in
faster development cycles and better design tradeoffs for performance, power
and area.
• Control: This open ISA provides designers and developers with greater control
over their computing environments, allowing them to fine-tune their systems
without relying on third parties or incurring additional license fees associated
with proprietary architectures.
• Visibility: The open-standard nature of RISC-V also means that developers have
more visibility into the codebase, making it easier to understand the roadmap
and identify potential security risks before they become an issue.
ARM vs RISC V : ISA Comparison
ISAs can be broadly categorized into two types:
1. Open ISA
2. Closed ISA
• Closed ISAs, like ARM, are proprietary and tightly controlled by
specific companies (Arm Holdings here), offering established
reliability and compatibility but limiting customization.
• Open ISAs, exemplified by RISC-V, are community-driven and
provide greater flexibility for customization, fostering
innovation and adaptation to specific needs.
ARM vs RISC V : Architectural
Overview

[Figure: architectural overview. Annotation: a 16-bit compressed instruction is decompressed into a 32-bit instruction.]
ARM vs RISC V : Performance
• In the performance comparison between RISC-V and ARM, ARM's consistent
iteration, comprehensive ecosystem, and wide range of options give it a
notable performance advantage.
• However, RISC-V's modular nature and customization potential hold
promise for specific use cases. The ongoing efforts of RISC-V
proponents to narrow the performance gap will be a crucial factor in
determining how well RISC-V can match ARM's established
performance standards in the future.
ARM vs RISC V : Power Efficiency
• ARM's refined power management techniques and specialized cores
give it a major advantage in power efficiency.
• While RISC-V holds promise due to its customization potential, its
open nature requires a more extensive investment of time and
resources to fully harness its energy-saving capabilities.
RISC V Architecture
• The RISC-V architecture is based on the RISC principles (as compared to
CISC), which emphasize a small, simple, and efficient instruction set.
The key architectural features of RISC-V include
• load-store architecture
• fixed-length 32-bit instruction format
• a uniform set of 32 general-purpose registers (16 in RV32E)
• RISC-V defines several base integer instruction sets, such as RV32I (32-bit),
RV64I (64-bit), and RV128I (128-bit), one for each address space size.
• RISC-V utilizes little-endian byte ordering within the memory system, meaning
that the least significant byte of multi-byte data is stored at the lowest
memory address.
RISC V Architecture: Modularity &
Extensibility
• One of the defining characteristics of RISC-V is its modularity and
extensibility.
• The ISA is designed to be easily extended with custom instructions
and coprocessors, allowing for tailored implementations that meet
specific application requirements.
• This flexibility is achieved through a modular design, where the base
ISA can be combined with optional standard extensions, such as the
M extension for integer multiplication and division, the A extension
for atomic operations, and the F and D extensions for single- and
double-precision floating-point arithmetic.
Standard
extensions

• Standard RISC encoding in a fixed 32-bit instruction format
• The “C” extension (compressed extension) offers shorter 16-bit versions
of common 32-bit RISC-V instructions (can be intermixed with 32-bit
instructions)

Do not confuse C extension with C programming language!


RISC V Architecture: Compressed
Instruction set
• Comparable to ARM’s Thumb instruction set, RISC-V supports a
compressed instruction set extension called RV32C (or RV64C for 64-bit),
which provides 16-bit compressed instructions that can be mixed
with the standard 32-bit instructions.
• This feature helps reduce code size and improve energy efficiency,
making RISC-V particularly suitable for embedded systems and
low-power applications.
2 Versions of RISC-V (based on maximum width of registers supported)
1. RISC-V 32 (RV32): max register width (XLEN) is 32 bits
2. RISC-V 64 (RV64): max register width (XLEN) is 64 bits
RV64 supports RV32 also.
Both versions have 32 registers (x0 to x31): 32-bit in RV32, 64-bit in RV64.
In both of them, each instruction is encoded into 32 bits.
[Figure: register file with registers x0, x1, x2, x3, x4, x5, ..., x31]
RISC V Modes: Privilege levels & Virtual
Memory
• The RISC-V Privileged Architecture Specification defines three privilege
levels:
1. machine mode (M-mode),
2. supervisor mode (S-mode),
3. user mode (U-mode).
• These privilege levels provide a mechanism for isolating the operating
system kernel, hypervisors, and user applications, ensuring system
security and stability.
• RISC-V also supports a virtual memory system based on a multi-level
page table scheme, enabling efficient memory management and
protection.
RISC-V Modes
• RISC-V Privileged Specification defines 3 levels of privilege, called Modes
• Machine mode is the highest privileged mode and the only required mode
– Flexibility allows for a range of targeted implementations from simple MCUs to
high-performance Application Processors
– Example: for a simple bare-metal application, Machine mode is enough, and it is the
default and mandatory mode. For an isolation boundary between the application and
more direct hardware access, M and U mode may both be supported.
– A robust system, such as a server or desktop machine, will support M, S, and U.
Supervisor mode also brings the benefits of virtualization through the Hypervisor
extension, where it is called Hypervisor-extended Supervisor (HS) mode.
• Machine, Hypervisor, Supervisor modes each have Control and Status Registers (CSRs)

RISC-V Modes
Level   Name               Abbr.
0       User/Application   U
1       Supervisor         S
2       Hypervisor         HS
3       Machine            M

Supported Combinations of Modes
Supported Levels   Modes
1                  M
2                  M, U
3                  M, S, U
4                  M, HS, S, U
We will discuss 4 addressing
modes (relevant for RISC-V)
• Immediate
• Register direct
• Register indirect
• Base-offset

• NOTE: Register indirect is a special case of base-offset. If we set the offset to zero, we get register indirect mode.
1. Immediate addressing mode
• Value ← imm
The value (e.g., 4, 8, 0x13, -3 etc) is available in the
instruction itself. No need to access register file.
2. Register Direct Mode
• Value ← r1
The value is obtained from the register directly.
[Figure: register file r0 to r15; r1 holds 784, which is used as the value.]
Examples of instructions that use
those modes
• Register direct: sub r3, r1, r2
• r1 and r2 values are fetched from registers. Result is stored
in r3

• Immediate: addi r3, r1, 500


• Here, r1 and r3 are accessed from registers and 500 is the
immediate value available in the instruction itself.
• RISC-V: at most one operand can be an immediate.
3. Register Indirect Mode
• Value ← (r1)
(1) Read value of r1 from register file. This gives a memory address
(2) Read the value stored in memory at that address
[Figure: r1 holds 148 in the register file; memory address 148 holds 4019, which is used as the value.]
4. Base-offset Addressing Mode
• Value ← offset(r1)
(1) Read value of r1 from register file. Add offset to it. This gives a memory address
(2) Read the value stored in memory at that address
Let offset = 9
[Figure: r1 holds 451 in the register file; 451 + 9 = 460, and memory address 460 holds 8914, which is used as the value.]
Examples of instructions that use
those modes
Load and store instructions use register indirect and base-offset
addressing modes
lw r1, 10(r2) (lw = load word)
sw r1, 10(r2) (sw = store word)
[Figure: (a) lw reads memory at address r2 + 10 into r1; (b) sw writes r1 to memory at address r2 + 10.]
Solved
Example
Consider below instructions that are
executed when the state of register file and
memory is as given here.
ld x6, 24(x10)
sd x5, 16(x10)
Show only the changed values in the register file (RF) and
memory after these instructions are
executed.

Solution: x10+24 is 1040. ld instruction will load from 1040 and store that value in
register x6. So, x6 will become 67891234
x10+16 is 1032. sd instruction will store the value of x5 at address 1032. Hence,
memory address 1032 will change to 3897409
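The solved example above can be replayed with a small C model of base-offset addressing. This is only a sketch: the memory window (`MEM_BASE`, `MEM_SIZE`) and the function names are made up for illustration, and x10 = 1016 is inferred from the solution's "x10+24 is 1040".

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MEM_BASE 1000u /* hypothetical: lowest modelled byte address */
#define MEM_SIZE 64u   /* hypothetical: number of modelled bytes */

static uint8_t mem[MEM_SIZE];

/* Base-offset addressing: effective address = base register + offset. */
uint64_t effective_address(uint64_t base, int64_t offset) {
    return base + (uint64_t)offset;
}

/* Models "ld rd, offset(base)": read a 64-bit doubleword from memory. */
uint64_t model_ld(uint64_t base, int64_t offset) {
    uint64_t addr = effective_address(base, offset);
    uint64_t value;
    assert(addr >= MEM_BASE && addr + 8 <= MEM_BASE + MEM_SIZE);
    memcpy(&value, &mem[addr - MEM_BASE], sizeof value);
    return value;
}

/* Models "sd rs2, offset(base)": write a 64-bit doubleword to memory. */
void model_sd(uint64_t value, uint64_t base, int64_t offset) {
    uint64_t addr = effective_address(base, offset);
    assert(addr >= MEM_BASE && addr + 8 <= MEM_BASE + MEM_SIZE);
    memcpy(&mem[addr - MEM_BASE], &value, sizeof value);
}
```

With x10 = 1016, `model_ld(1016, 24)` reads address 1040 and `model_sd(x5, 1016, 16)` writes address 1032, matching the two steps of the solution.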
Arithmetic Operations
• Add and subtract, three operands
• Two sources and one destination
add a, b, c // a gets b + c
• All arithmetic operations have this form
• Design Principle 1: Simplicity favors regularity
• Regularity makes implementation simpler
• Simplicity enables higher performance at lower cost
Arithmetic Example
• C code:
f = (g + h) - (i + j);
• Compiled RISC-V code:
add t0, g, h // temp t0 = g + h
add t1, i, j // temp t1 = i + j
sub f, t0, t1 // f = t0 - t1
Register Operands
• Arithmetic instructions use register operands
• RISC-V has 32 general purpose registers, x0 to x31, each 64 bits wide (RV64) or 32 bits wide (RV32)
• Use for frequently accessed data
• 64-bit data is called a “double word”
• 32-bit data is called a “word”
• Design Principle 2: Smaller is faster
• c.f. main memory: millions of locations
RISC-V Registers
• x0: the constant value 0
• x1: return address
• x2: stack pointer
• x3: global pointer
• x4: thread pointer
• x5 – x7, x28 – x31: temporaries
• x8: frame pointer
• x9, x18 – x27: saved registers
• x10 – x11: function arguments/results
• x12 – x17: function arguments
RISC-V Registers
• RV32I/64I have 32 Integer Registers
– Optional 32 FP registers with the F and D extensions
– RV32E reduces the register file to 16 integer registers for area constrained embedded devices
• Width of Registers is determined by ISA
• RISC-V Application Binary Interface (ABI) defines standard functions for registers
– Allows for software interoperability
• Development tools usually use ABI names for simplicity

Register   ABI Name   Description                        Saver
x0         zero       Hard-wired zero                    -
x1         ra         Return address                     Caller
x2         sp         Stack pointer                      Callee
x3         gp         Global pointer                     -
x4         tp         Thread pointer                     -
x5-7       t0-2       Temporaries                        Caller
x8         s0/fp      Saved register/Frame pointer       Callee
x9         s1         Saved register                     Callee
x10-11     a0-1       Function arguments/return values   Caller
x12-17     a2-7       Function arguments                 Caller
x18-27     s2-11      Saved registers                    Callee
x28-31     t3-6       Temporaries                        Caller
Register Description
• Register x0 RISC-V dedicates register x0 to be hard-wired to the value zero.
• Return address A link to the calling site that allows a procedure to return to the proper address; in RISC-V it
is stored in register x1.
• Program counter(PC) The register containing the address of the instruction in the program being executed.
• Stack pointer A value denoting the most recently allocated address in a stack that shows where registers
should be spilled or where old register values can be found. In RISC-V, it is register sp, or x2.
• Global pointer The register that is reserved to point to the static area where static variables are stored.
• Frame pointer A value denoting the location of the saved registers and local variables for a given procedure.
•Argument registers: x10 to x17 are used to pass arguments to a function. Before calling a function,
arguments are copied to these registers. If more than 8 arguments need to be passed, we use the stack.
•Temporary registers (t0 to t6): used to hold intermediate values during instruction or function
execution.
• Thread-pointer register, tp, that is designed for thread-local data.
Register Operand Example
• C code:
f = (g + h) - (i + j);
• f, …, j in x19, x20, …, x23

• Compiled RISC-V code:


add x5, x20, x21
add x6, x22, x23
sub x19, x5, x6
Memory Operands
• Main memory used for composite data
• Arrays, structures, dynamic data
• To apply arithmetic operations
• Load values from memory into registers
• Store result from register to memory
• Memory is byte addressed
• Each address identifies an 8-bit byte
• RISC-V is Little Endian
• Least-significant byte at the lowest address of a word
• c.f. Big Endian: most-significant byte at the lowest address
• RISC-V does not require words to be aligned in memory
• Unlike some other ISAs

What is XLEN in RISC-V?
RISC-V is little-endian and comes in 32 and 64 bit flavours. In keeping
with the RISC-V documents, the flavour (either 32 or 64) is called XLEN
in a few places in later slides where it matters.
Memory Operand Example
• C code:
A[12] = h + A[8];
• h in x21, base address of A in x22
• Compiled RISC-V code:
• Index 8 requires offset of 32
• 4 bytes per word
lw x9, 32(x22)
add x9, x21, x9
sw x9, 48(x22)
Registers vs. Memory
• Registers are faster to access than memory
• Operating on memory data requires loads and stores
• More instructions to be executed
• Compiler must use registers for variables as much as possible
• Only spill to memory for less frequently used variables
• Register optimization is important!
RV32 and RV64
• Although the RISC-V registers in this course are 32 bits wide, the
RISC-V architects conceived multiple variants of the ISA.
• In addition to this variant, known as RV32, a variant named RV64 has
64-bit registers, whose larger addresses make RV64 better suited to
processors for servers and smart phones.
Immediate Operands
• Constant data specified in an instruction
addi x22, x22, 4

• Make the common case fast


• Small constants are common
• Immediate operand avoids a load instruction
Example on add
• Mapping of variables to registers:
• i → x19, g → x20

C code     C code (simplified)   RISC-V code
i++        i++                   addi x19, x19, 1
g = ++i    i++                   addi x19, x19, 1
           g = i                 add x20, x19, x0
g = i++    g = i                 add x20, x19, x0
           i++                   addi x19, x19, 1
LI and MV (pseudo) instructions
• Load Immediate (LI) loads register rd with an immediate value
• Syntax: li rd, imm
• li t0, 0x4A # Load register t0 with a value
• mv t1, t0 # Copy contents of register t0 to register t1
• Pseudo instructions are similar to macros in C/C++.

Assuming v and r are stored in t0 and t1, respectively.
C code    RISC-V code
v = 10    li t0, 10
v = r     mv t0, t1
RISC-V R-format Instructions
funct7 rs2 rs1 funct3 rd opcode
7 bits 5 bits 5 bits 3 bits 5 bits 7 bits

• Instruction fields
• opcode: operation code
• rd: destination register number
• funct3: 3-bit function code (additional opcode)
• rs1: the first source register number
• rs2: the second source register number
• funct7: 7-bit function code (additional opcode)
R-format Example
funct7 rs2 rs1 funct3 rd opcode
7 bits 5 bits 5 bits 3 bits 5 bits 7 bits

add x9,x20,x21
0 21 20 0 9 51

0000000 10101 10100 000 01001 0110011

0000 0001 0101 1010 0000 0100 1011 0011 (binary) = 015A04B3 (hexadecimal)
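The field packing in the R-format example can be checked with a few shifts. The sketch below is illustrative (the name `encode_r` is made up); it places each field at the bit position implied by the field widths listed on the slide.

```c
#include <assert.h>
#include <stdint.h>

/* Pack the six R-format fields into one 32-bit instruction word.
 * Widths, left to right: funct7 (7), rs2 (5), rs1 (5), funct3 (3),
 * rd (5), opcode (7). */
uint32_t encode_r(uint32_t funct7, uint32_t rs2, uint32_t rs1,
                  uint32_t funct3, uint32_t rd, uint32_t opcode) {
    assert(funct7 < 128 && rs2 < 32 && rs1 < 32);
    assert(funct3 < 8 && rd < 32 && opcode < 128);
    return (funct7 << 25) | (rs2 << 20) | (rs1 << 15) |
           (funct3 << 12) | (rd << 7) | opcode;
}
```

Calling `encode_r(0, 21, 20, 0, 9, 51)` with the field values of `add x9,x20,x21` yields 0x015A04B3, the word shown in the example.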
RISC-V I-format Instructions
immediate rs1 funct3 rd opcode
12 bits 5 bits 3 bits 5 bits 7 bits

• Immediate arithmetic and load instructions


• rs1: source or base address register number
• immediate: constant operand, or offset added to base address
• 2s-complement, sign extended
• Design Principle 3: Good design demands good compromises
• Different formats complicate decoding, but allow 32-bit instructions
uniformly
• Keep formats as similar as possible
RISC-V S-format Instructions
imm[11:5] rs2 rs1 funct3 imm[4:0] opcode
7 bits 5 bits 5 bits 3 bits 5 bits 7 bits

• Different immediate format for store instructions


• rs1: base address register number
• rs2: source operand register number
• immediate: offset added to base address
• Split so that rs1 and rs2 fields always in the same place
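To make the split immediate concrete, here is a hedged sketch of S-format packing. The helper name `encode_s` is illustrative. The values funct3 = 3 and opcode = 0x23 for `sd`, used in the usage note, come from the RISC-V base ISA encoding tables and are not stated on the slide.

```c
#include <assert.h>
#include <stdint.h>

/* Pack an S-format store: the 12-bit immediate is split into
 * imm[11:5] (bits 31..25) and imm[4:0] (bits 11..7) so that rs1 and
 * rs2 stay in the same bit positions as in the R-format. */
uint32_t encode_s(int32_t imm, uint32_t rs2, uint32_t rs1,
                  uint32_t funct3, uint32_t opcode) {
    uint32_t uimm, hi, lo;
    assert(imm >= -2048 && imm <= 2047); /* must fit in 12 signed bits */
    assert(rs2 < 32 && rs1 < 32 && funct3 < 8 && opcode < 128);
    uimm = (uint32_t)imm & 0xFFFu; /* two's-complement 12-bit pattern */
    hi = (uimm >> 5) & 0x7Fu;      /* imm[11:5] */
    lo = uimm & 0x1Fu;             /* imm[4:0]  */
    return (hi << 25) | (rs2 << 20) | (rs1 << 15) |
           (funct3 << 12) | (lo << 7) | opcode;
}
```

For example, `sd x9, 48(x22)` would be `encode_s(48, 9, 22, 3, 0x23)`, which evaluates to 0x029B3823: the offset 48 splits into imm[11:5] = 1 and imm[4:0] = 16.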
The 6 Instruction Formats
• R-Format: instructions using 3 register inputs
– add, xor, mul — arithmetic/logical ops
• I-Format: instructions with immediates, loads
– addi, lw, jalr, slli
• S-Format: store instructions: sw, sb
• SB-Format: branch instructions: beq, bge
• U-Format: instructions with upper immediates
– lui, auipc — upper immediate
• UJ-Format: jump instructions: jal
Logical Operations
• Instructions for bitwise manipulation
Operation        C     Java   RISC-V
Shift left       <<    <<     slli
Shift right      >>    >>>    srli
Bit-by-bit AND   &     &      and, andi
Bit-by-bit OR    |     |      or, ori
Bit-by-bit XOR   ^     ^      xor, xori
Bit-by-bit NOT   ~     ~      xori (with immediate -1)
• Useful for extracting and inserting groups of bits in a word
Shift Operations
funct6 immed rs1 funct3 rd opcode
6 bits 6 bits 5 bits 3 bits 5 bits 7 bits

• immed: how many positions to shift
• Shift left logical
• Shift left and fill with 0 bits
• slli by i bits multiplies by 2^i
• Shift right logical
• Shift right and fill with 0 bits
• srli by i bits divides by 2^i (unsigned only)
Conditional Operations
• Branch to a labeled instruction if a condition is true
• Otherwise, continue sequentially

• beq rs1, rs2, L1


• if (rs1 == rs2) branch to instruction labeled L1

• bne rs1, rs2, L1


• if (rs1 != rs2) branch to instruction labeled L1
Compiling If Statements
• C code:
if (i==j) f = g+h;
else f = g-h;
• f, g, … in x19, x20, …
• Compiled RISC-V code:
bne x22, x23, Else
add x19, x20, x21
beq x0,x0,Exit // unconditional
Else: sub x19, x20, x21
Exit: …

Assembler calculates addresses


Compiling Loop Statements
• C code:
while (save[i] == k) i += 1;
• i in x22, k in x24, address of save in x25
• Compiled RISC-V code:
Loop: slli x10, x22, 3
add x10, x10, x25
ld x9, 0(x10)
bne x9, x24, Exit
addi x22, x22, 1
beq x0, x0, Loop
Exit: …

Assume “long int” variables (8B). Base address of Arr is in a0. i is in a1. val is in a2.
C code         Simplified C code      RISC-V code
Arr[i] = 5;    temp = 5               li a3, 5
               offset = i*8           slli a4, a1, 3
               addr = Arr + offset    add a5, a4, a0
               Memory[addr] = temp    sd a3, 0(a5)
val = Arr[7]                          ld a2, 56(a0)
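The address arithmetic hidden inside `save[i]` in the while loop on this slide can be spelled out in C, one statement per instruction. This is a sketch with an illustrative function name; it assumes 8-byte elements, as the compiled code does, and the caller must guarantee that some element differs from `k`.

```c
#include <stdint.h>

/* Returns the final i for: while (save[i] == k) i += 1;
 * Each statement mirrors one instruction of the compiled loop. */
int64_t scan_while_equal(const int64_t *save, int64_t i, int64_t k) {
    const char *base = (const char *)save; /* x25: byte address of save */
    for (;;) {
        int64_t offset = i * 8;            /* slli x10, x22, 3   */
        const int64_t *addr =
            (const int64_t *)(base + offset); /* add x10, x10, x25 */
        int64_t value = *addr;             /* ld x9, 0(x10)      */
        if (value != k) {                  /* bne x9, x24, Exit  */
            break;
        }
        i += 1;                            /* addi x22, x22, 1   */
    }                                      /* beq x0, x0, Loop   */
    return i;
}
```

The shift by 3 is the multiplication by 8 bytes per element; with 4-byte elements the shift amount would be 2 instead.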
More Conditional Operations
• blt rs1, rs2, L1
• if (rs1 < rs2) branch to instruction labeled L1
• bge rs1, rs2, L1
• if (rs1 >= rs2) branch to instruction labeled L1
• Example
• if (a > b) a += 1;
• a in x22, b in x23
bge x23, x22, Exit // branch if b >= a
addi x22, x22, 1
Exit:
Signed vs.
Unsigned
• A number stored in a register or memory address can be
interpreted as signed (2’s complement) or unsigned. Both may
lead to different decimal numbers.

• x22 = 1111 1111 1111 1111 = 65535 (unsigned), -1 (signed)


• x23 = 0000 0000 0000 0001 = 1 (signed), 1 (unsigned)

• On comparing them, the result depends on whether you take them as signed or unsigned.
• On signed comparison, x22<x23 because -1 < +1
• On unsigned comparison, x22 > x23 because +65535 > +1
Signed and unsigned
comparison
• To take this into account, RISC-V has two variants of
comparison/branching instructions.
• Signed comparison: blt, bge
• Unsigned comparison: bltu, bgeu
SLT: Set on Less Than (signed comparison) and SLTU (unsigned comparison)
Syntax: slt rd, rs1, rs2
SLT writes 1 to rd if rs1 < rs2, 0 otherwise.
Example code:
li x5, 39 # x5 ← 39
li x3, 57 # x3 ← 57
slt x1, x5, x3 # x1 ← x5 < x3. So, x1 will store 1

Assuming a/b/c are stored in a0/a1/a2 respectively.
C code                     RISC-V code
if (a<b) c=1; else c=0;    slt a2, a0, a1
Comparison Instructions (SLT and SLTU)
SLT and SLTU perform signed and unsigned compares respectively, writing 1 to rd if rs1 < rs2, 0 otherwise.

li a0, -4 # a0 ← 0xFFFFFFFC (in hex)
li a1, 3 # a1 ← 3
slt a2, a0, a1 # a2 ← 1 because -4 < 3 is true
sltu a3, a0, a1 # a3 ← 0 because 0xFFFFFFFC < 3 is false

li a0, 4 # a0 ← 4
li a1, -3 # a1 ← 0xFFFFFFFD
slt a4, a0, a1 # a4 ← 0 because 4 < -3 is false
sltu a5, a0, a1 # a5 ← 1 because 0x4 < 0xFFFFFFFD is true
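The same four comparisons can be reproduced in C by treating a register as a raw 32-bit pattern and choosing the interpretation at compare time. The function names below are illustrative, and the model assumes 32-bit registers as in the example.

```c
#include <stdint.h>

/* Interpret a 32-bit pattern as a two's-complement signed value. */
static int64_t as_signed32(uint32_t bits) {
    return (int64_t)(bits ^ 0x80000000u) - 0x80000000LL;
}

/* Models slt: signed comparison of the two bit patterns. */
uint32_t model_slt(uint32_t rs1, uint32_t rs2) {
    return as_signed32(rs1) < as_signed32(rs2) ? 1u : 0u;
}

/* Models sltu: unsigned comparison of the same bit patterns. */
uint32_t model_sltu(uint32_t rs1, uint32_t rs2) {
    return rs1 < rs2 ? 1u : 0u;
}
```

`model_slt(0xFFFFFFFC, 3)` gives 1 while `model_sltu(0xFFFFFFFC, 3)` gives 0: the bits are identical, only the interpretation differs.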
Variants of SLT
SEQZ: Set If Equal to Zero. Syntax: seqz rd, rs1
Example: seqz x6, x5
If x5 was zero, then x6 is set to 1. Otherwise, x6 is set to 0.

Assuming a/b/c are stored in a0/a1/a2 respectively.
C code                      RISC-V code
if (a==0) c=1; else c=0;    seqz a2, a0

Similar instructions:
SGTZ: Set If Greater Than Zero
SLTZ: Set If Less Than Zero
SNEZ: Set If Not Equal to Zero
Procedure Calling
Steps required
1. Place parameters in registers x10 to x17
2. Transfer control to procedure
3. Acquire storage for procedure
4. Perform procedure’s operations
5. Place result in register for caller
6. Return to place of call (address in x1)

Caller

The program that instigates a procedure and provides the necessary parameter values.
Callee

A procedure that executes a series of stored instructions based on parameters provided by the
caller and then returns control to the caller.
Procedure Call Instructions
• Procedure call: Jump and link
jal x1, ProcedureLabel
• Address of following instruction put in x1
• Jumps to target address ProcedureLabel
• Procedure return (Also a Pseudo Instruction):
ret
• Jumps to address in x1
• ret is same as “jalr x0, x1, 0”
• Jump and link register
jalr x0, 0(x1)
• Like jal, but jumps to 0 + address in x1
• Use x0 as rd (x0 cannot be changed)
How to pass arguments/return values
• Solution: use registers
• Before calling a function, arguments are copied to registers a0 and a1
(a0 is the same as x10, a1 is the same as x11).
• The return value is stored in a0 itself.
.func:
add a0, a0, a1
ret
.main:
li a0, 3
li a1, 5
jal x1, .func
addi a2, a0, 10
Limitations with use of registers for argument passing or returning results
• Space Problem
• We have a limited number of registers
• We cannot pass more than a certain number of arguments
• Solution: Use memory also
• Overwrite Problem
• What if a function calls itself? (recursive call)
• The callee can overwrite the registers of the caller
• Solution: Spilling
Note: spilling is a technique in which a variable is moved out of a register to main
memory (the RAM) to make space for other variables that are used by the program currently
under execution.
Register Spilling
• Caller saved scheme
• The caller saves the set of registers it needs
• Calls the function
• And then restores the set of registers after the function returns
• Callee saved scheme
• The callee saves the registers, and later restores them
caller or callee-saver conventions
(a) Caller saved: the caller saves registers, calls the callee, and restores registers after the callee returns.
(b) Callee saved: the callee saves registers on entry and restores them before returning.

Saver is “caller”: means that a function caller must save that register
somewhere before calling, e.g. main()
Saver is “callee”: means that if a function wants to use that register, it
must first save it somewhere, and restore it before returning, e.g. …
Limitations with our
approach
• Using memory, and spilling solves both the space problem and
overwrite problem
• However, there needs to be :
• a strict agreement between the caller and the callee
regarding the set of memory locations that need to be used
• Secondly, after a function has finished execution, all
the space that it uses needs to be reclaimed

Activation Block
int foo(int arg1) {
int a, b, c;
a = 3; b = 4;
c = a + b + arg1;
return c;
}
Activation block layout: Arguments, Return address, Register spill area, Local variables
• Activation block → memory map of a function:
arguments, register spill area, local variables


Organising Activation
Blocks
• All the information of an executing function is stored in its activation
block
• These blocks need to be dynamically created and destroyed – millions
of times
• What is the correct way of managing them, and ensuring their fast
creation and deletion ?
• Is there a pattern ?
Pattern of Function Calls
Call tree: main calls test and foo; test calls test2; foo calls foobar; foobar calls foobarbar.

main() { test(); foo(); }
test() { ... test2(); return; }
test2() { ... return; }
foo() { ... foobar(); return; }
foobar() { ... foobarbar(); return; }
foobarbar() { ... return; }
Pattern of Function
Calls
• Last in First Out
• Use a stack to store activation blocks

Stack contents over time:
(a) foo
(b) foo, foobar
(c) foo, foobar, foobarbar
(d) foo, foobar


Issues solved by
stack
• Space problem
• Pass as many parameters as required in the activation block
• Overwrite problem
• Solved by activation blocks
• Management of activation blocks
• Solved by the notion of the stack

Working with the
Stack
• Allocate a part of the memory to save the stack
• Traditionally stacks are downward growing.
• The first activation block starts at the highest address
• Subsequent activation blocks are allocated lower addresses
• The stack pointer register (sp) points to the beginning of an
activation block
• Allocating an activation block : sp ← sp - <constant>
• De-allocating an activation block: sp ← sp + <constant>
Saving variable in stack (pushing and popping)
myFunction:
addi sp,sp,-24 // Save x5, x6, x20 on stack (called pushing)
sd x5,16(sp)
sd x6,8(sp)
sd x20,0(sp)
add x5,x10,x11 // Do some processing of a function
add x6,x12,x13
sub x20,x5,x6
addi x10,x20,0
ld x20,0(sp) // Restore x5, x6, x20 from stack (called popping)
ld x6,8(sp)
ld x5,16(sp)
addi sp,sp,24
ret // Return to caller
This is an example of the callee saved scheme
How Stack
Functions

Stack grows downwards


A Question on Byte addresses
Consider 32b version of RISC-V and see below code.
addi sp,sp,-12
sw x5,8(sp)
sw x6,4(sp)
sw x20,0(sp)
Initially, we have value of x5 as 0x12345678, x6 as 0xABCDEF09 …
[Figure: stack with byte addresses 1011 to 1026; the bytes to be filled in are marked "?".]
Solution
Byte address   (a) Little Endian   (b) Big Endian
1011
1012           68                  AC
1013           24                  E0
1014           E0                  24
1015           AC                  68
1016           09                  AB
1017           EF                  CD
1018           CD                  EF
1019           AB                  09
1020           78                  12
1021           56                  34
1022           34                  56
1023           12                  78
1024
1025
1026
RISC-V is little endian. Hence, (a) is correct for the RISC-V ISA; (b) is correct for any big-endian ISA.
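The byte order in the two columns can be reproduced with a short sketch that stores a 32-bit word one byte at a time. The function names are illustrative, and `mem` stands for four consecutive byte addresses.

```c
#include <stdint.h>

/* Little endian: least-significant byte goes to the lowest address. */
void store_word_le(uint8_t *mem, uint32_t value) {
    for (int i = 0; i < 4; i++) {
        mem[i] = (uint8_t)(value >> (8 * i));
    }
}

/* Big endian: most-significant byte goes to the lowest address. */
void store_word_be(uint8_t *mem, uint32_t value) {
    for (int i = 0; i < 4; i++) {
        mem[i] = (uint8_t)(value >> (8 * (3 - i)));
    }
}
```

Storing x5 = 0x12345678 with `store_word_le` produces the bytes 78 56 34 12 at addresses 1020 to 1023, as in column (a); `store_word_be` produces 12 34 56 78, as in column (b).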
Leaf Procedure Example
• C code:
long long int leaf_example (
long long int g, long long int h,
long long int i, long long int j) {
long long int f;
f = (g + h) - (i + j);
return f;
}
• Arguments g, …, j in x10, …, x13
• f in x20
• temporaries x5, x6
• Need to save x5, x6, x20 on stack
Leaf Procedure Example
• RISC-V code:
leaf_example:
addi sp,sp,-24 // Save x5, x6, x20 on stack
sd x5,16(sp)
sd x6,8(sp)
sd x20,0(sp)
add x5,x10,x11 // x5 = g + h
add x6,x12,x13 // x6 = i + j
sub x20,x5,x6 // f = x5 – x6
addi x10,x20,0 // copy f to return register
ld x20,0(sp) // Restore x5, x6, x20 from stack
ld x6,8(sp)
ld x5,16(sp)
addi sp,sp,24
jalr x0,0(x1) // Return to caller
Register Usage
• x5 – x7, x28 – x31: temporary registers
• Not preserved by the callee

• x8 – x9, x18 – x27: saved registers


• If used, the callee saves and restores them

What is and what is not preserved across a procedure call


Byte/Halfword/Word Operations

• RISC-V byte/halfword/word load/store


• Load byte/halfword/word: Sign extend to 64 bits in rd
• lb rd, offset(rs1)
• lh rd, offset(rs1)
• lw rd, offset(rs1)
• Load byte/halfword/word unsigned: Zero extend to 64 bits in rd
• lbu rd, offset(rs1)
• lhu rd, offset(rs1)
• lwu rd, offset(rs1)
• Store byte/halfword/word: Store rightmost 8/16/32 bits
• sb rs2, offset(rs1)
• sh rs2, offset(rs1)
• sw rs2, offset(rs1)
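The difference between the signed and unsigned loads is only in how the upper bits of the 64-bit destination are filled. A minimal sketch with illustrative helper names, where `bits` is 8, 16 or 32 for byte, halfword or word:

```c
#include <assert.h>
#include <stdint.h>

/* Sign extension, as done by lb/lh/lw: copy the top bit of the
 * loaded value into all higher bits of the 64-bit result. */
int64_t sign_extend(uint64_t value, unsigned bits) {
    uint64_t sign, mask;
    assert(bits >= 1 && bits <= 32);
    sign = 1ULL << (bits - 1);
    mask = (sign << 1) - 1;
    return (int64_t)((value & mask) ^ sign) - (int64_t)sign;
}

/* Zero extension, as done by lbu/lhu/lwu: higher bits become 0. */
uint64_t zero_extend(uint64_t value, unsigned bits) {
    uint64_t mask;
    assert(bits >= 1 && bits <= 32);
    mask = ((1ULL << (bits - 1)) << 1) - 1;
    return value & mask;
}
```

A byte 0xFF loaded with `lb` therefore reads as -1 (`sign_extend(0xFF, 8)`), while `lbu` gives 255 (`zero_extend(0xFF, 8)`).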
RISC-V SB-format Instructions: Branch
Addressing
• Branch instructions specify
• Opcode, two registers, target address
• Most branch targets are near branch
• Forward or backward
• SB format:
imm[12] imm[10:5] rs2 rs1 funct3 imm[4:1] imm[11] opcode
• PC-relative addressing
• Target address = PC + immediate × 2
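As a sketch of how the scattered immediate bits are reassembled, the function below rebuilds the byte offset from a 32-bit branch instruction word. The names are illustrative. Because the encoded bits are imm[12:1] and bit 0 is implicitly zero, placing them at positions 12 down to 1 already performs the "× 2" of the formula above.

```c
#include <stdint.h>

/* Rebuild the signed byte offset of an SB-format branch.
 * Field positions: imm[12] = bit 31, imm[10:5] = bits 30..25,
 * imm[4:1] = bits 11..8, imm[11] = bit 7. */
int32_t decode_branch_offset(uint32_t insn) {
    uint32_t imm = (((insn >> 31) & 0x1u) << 12) |
                   (((insn >> 7) & 0x1u) << 11) |
                   (((insn >> 25) & 0x3Fu) << 5) |
                   (((insn >> 8) & 0xFu) << 1);
    /* Sign-extend the 13-bit value. */
    return (int32_t)(imm ^ 0x1000u) - 0x1000;
}

/* PC-relative target: address of the branch plus the byte offset. */
uint32_t branch_target(uint32_t pc, uint32_t insn) {
    return pc + (uint32_t)decode_branch_offset(insn);
}
```

For the instruction word 0xFE000EE3 the decoded offset is -4, so a branch at address 0x1000 would target 0xFFC.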
RISC-V UJ-format Instructions : Jump
Addressing
• Jump and link (jal) target uses 20-bit immediate for
larger range
• UJ format:
imm[20] imm[10:1] imm[11] imm[19:12] rd opcode
(rd: 5 bits, opcode: 7 bits)
• For long jumps, e.g., to a 32-bit absolute address
• lui: load address[31:12] to temp register
• jalr: add address[11:0] and jump to target
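One detail of the lui + jalr pair is easy to miss: jalr sign-extends its 12-bit immediate, so when bit 11 of the address is set, the upper part given to lui has to be one larger to compensate. The sketch below shows that split; the type and function names are illustrative.

```c
#include <stdint.h>

typedef struct {
    uint32_t upper20; /* value for lui (address bits 31..12, adjusted) */
    int32_t lower12;  /* signed 12-bit immediate for jalr */
} SplitAddress;

/* Split a 32-bit absolute address for a lui + jalr sequence. */
SplitAddress split_address(uint32_t addr) {
    SplitAddress s;
    /* Adding 0x800 rounds up when the low part will be negative. */
    s.upper20 = ((addr + 0x800u) >> 12) & 0xFFFFFu;
    s.lower12 = (int32_t)(addr & 0xFFFu);
    if (s.lower12 >= 0x800) {
        s.lower12 -= 0x1000; /* the value jalr sees after sign extension */
    }
    return s;
}

/* What the hardware computes: (upper20 << 12) + sign-extended low part. */
uint32_t rebuild_address(SplitAddress s) {
    return (s.upper20 << 12) + (uint32_t)s.lower12;
}
```

For 0x12345678 the split is simply 0x12345 and 0x678; for 0x12345FFC it is 0x12346 and -4, which still adds up to the original address.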
RISC-V Addressing Summary
RISC-V Encoding Summary
RISC-V Multiplication
• Four multiply instructions:
• mul: multiply
• Gives the lower 64 bits of the product
• mulh: multiply high
• Gives the upper 64 bits of the product, assuming the operands are signed
• mulhu: multiply high unsigned
• Gives the upper 64 bits of the product, assuming the operands are unsigned
• mulhsu: multiply high signed/unsigned
• Gives the upper 64 bits of the product, assuming one operand is signed and the other
unsigned
• Use mulh result to check for 64-bit overflow
RISC-V Division
• Four instructions:
• div, rem: signed divide, remainder
• divu, remu: unsigned divide, remainder

• Overflow and division-by-zero don’t produce errors


• Just return defined results
• Faster for the common case of no error
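The "defined results" can be modelled in C as below. The specific values (quotient of all ones, i.e. -1, and remainder equal to the dividend for division by zero; quotient equal to the dividend and remainder 0 for signed overflow) are taken from the RISC-V "M" extension specification rather than from the slide, and the function names are illustrative. The sketch models the signed 64-bit `div` and `rem` only.

```c
#include <stdint.h>

/* Models div on RV64: never traps, returns a defined result instead. */
int64_t rv_div(int64_t a, int64_t b) {
    if (b == 0) {
        return -1;        /* division by zero: all bits set */
    }
    if (a == INT64_MIN && b == -1) {
        return INT64_MIN; /* signed overflow: result is the dividend */
    }
    return a / b;         /* C also truncates toward zero */
}

/* Models rem on RV64 with the matching defined results. */
int64_t rv_rem(int64_t a, int64_t b) {
    if (b == 0) {
        return a;         /* division by zero: remainder is the dividend */
    }
    if (a == INT64_MIN && b == -1) {
        return 0;         /* signed overflow: remainder is 0 */
    }
    return a % b;
}
```

Software that needs an error on division by zero has to test the divisor itself, for example with a branch before or after the `div`.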
CSR and ECALL
Instructions
• Control and Status Registers (CSRs) have their own dedicated instructions:
– Read/Write (csrrw)
– Read and Set bit (csrrs)
– Read and Clear bit (csrrc)
• Environment Call instruction used to transfer control to the execution environment and a higher privileged mode
– Triggers a synchronous exception (discussed later)
– Example: User mode program can use an ECALL to transfer control to Machine mode
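The three CSR access flavours can be sketched as plain C functions. Each one returns the old CSR value (what the instruction writes to rd) and then updates the CSR. The CSR is modelled here as an ordinary variable, which is an assumption made for illustration; real CSRs are accessed only through the csrrw/csrrs/csrrc instructions.

```c
#include <stdint.h>

/* Read/Write: swap in a new value, return the old one. */
static uint64_t csrrw(uint64_t *csr, uint64_t value) {
    uint64_t old = *csr;
    *csr = value;
    return old;
}

/* Read and Set: bits that are 1 in mask become 1 in the CSR. */
static uint64_t csrrs(uint64_t *csr, uint64_t mask) {
    uint64_t old = *csr;
    *csr = old | mask;
    return old;
}

/* Read and Clear: bits that are 1 in mask become 0 in the CSR. */
static uint64_t csrrc(uint64_t *csr, uint64_t mask) {
    uint64_t old = *csr;
    *csr = old & ~mask;
    return old;
}
```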
What are Control and Status
Registers (CSRs)
• CSRs are Registers which contain the working
state of a RISC-V machine
• CSRs are specific to a Mode
– Machine Mode has ~17 CSRs (not including performance
monitor CSRs)
– Supervisor Mode has a similar number, though most are
subsets of their equivalent Machine Mode CSRs
• Machine Mode can also access Supervisor CSRs

• CSRs are defined in the RISC-V privileged specification
– We will cover a few key CSRs here
Identification CSRs
• misa – Machine ISA Register
– Reports the ISA supported by the hart (i.e. base width RV32, RV64 or RV128, plus the implemented extensions)
• mhartid – Machine hart/Core ID
– Integer ID of the Hardware Thread (Core)
• mvendorid – Machine Vendor ID
– JEDEC Vendor ID
• marchid – Machine Architecture ID
– Used along with mvendorid to identify an implementation. No format specified
• mimpid - Machine Implementation ID
– Implementation defined format

Note: In the RISC-V architecture a hardware thread is called a hart; on a single-threaded core, the hart is the core
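A minimal sketch of decoding misa, based on the layout in the privileged specification: the top two bits (MXL) encode the base width (1 = 32, 2 = 64, 3 = 128) and bits 25:0 carry one flag per extension letter A to Z. The function names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* xlen is the width of the misa register being decoded (32 or 64). */
static int misa_base_width(uint64_t misa, int xlen) {
    unsigned mxl = (unsigned)(misa >> (xlen - 2)) & 0x3;
    switch (mxl) {
    case 1:  return 32;
    case 2:  return 64;
    case 3:  return 128;
    default: return 0;    /* misa not implemented or unknown */
    }
}

/* ext is an upper-case letter, e.g. 'M' for multiply/divide. */
static bool misa_has_ext(uint64_t misa, char ext) {
    return ((misa >> (ext - 'A')) & 1) != 0;
}
```

For example, the value 0x40001105 on a 32-bit hart describes RV32 with the A, C, I and M bits set.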

Machine Status (mstatus) - The Most
Important CSR
Control and track the hart’s current operating state

Bits     Field  Description
0        UIE    User Interrupt Enable
1        SIE    Supervisor Interrupt Enable
2        -      Reserved
3        MIE    Machine Interrupt Enable
4        UPIE   User Previous Interrupt Enable
5        SPIE   Supervisor Previous Interrupt Enable
6        -      Reserved
7        MPIE   Machine Previous Interrupt Enable
8        SPP    Supervisor Previous Privilege
[10:9]   -      Reserved
[12:11]  MPP    Machine Previous Privilege
[14:13]  FS     Floating Point State
[16:15]  XS     User Mode Extension State
17       MPRV   Modify Privilege (access memory as MPP)
18       SUM    Permit Supervisor User Memory Access
19       MXR    Make Executable Readable
20       TVM    Trap Virtual Memory
21       TW     Timeout Wait (traps S-Mode wfi)
22       TSR    Trap SRET
[30:23]  -      Reserved
31       SD     State Dirty (FS and XS summary bit)

RV32 mstatus CSR
Timer CSRs
• mtime
– RISC-V defines a requirement for a counter exposed as a memory mapped register
– There is no frequency requirement on the timer, but
• It must run at a constant frequency
• The platform must expose the frequency
• mtimecmp
– RISC-V defines a memory mapped timer compare register
– Triggers an interrupt when mtime is greater than or equal to mtimecmp

Bits    Field     Description
[63:0]  mtime     Machine Time Register
[63:0]  mtimecmp  Machine Time Compare Register

mtime and mtimecmp registers
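The compare rule and a periodic tick can be sketched as below. The registers are passed in as plain values; where they live in memory is platform specific, so no addresses are assumed. The function names and the timebase_hz parameter (the platform-provided mtime frequency) are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* The timer interrupt is pending while mtime >= mtimecmp. */
static bool timer_interrupt_pending(uint64_t mtime, uint64_t mtimecmp) {
    return mtime >= mtimecmp;
}

/* Compare value for the next tick of a periodic timer.
 * timebase_hz: frequency of mtime as exposed by the platform.
 * tick_hz:     desired interrupt rate. */
static uint64_t next_deadline(uint64_t now, uint64_t timebase_hz, uint64_t tick_hz) {
    return now + timebase_hz / tick_hz;
}
```

A handler would typically write the result of next_deadline to mtimecmp, which also clears the pending interrupt because mtime is then below the compare value again.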
Supervisor CSRs
• Most of the Machine mode CSRs have Supervisor mode equivalents
– Supervisor mode CSRs can be used to control the state of Supervisor and User Modes.
– Most equivalent Supervisor CSRs have the same mapping as Machine mode without Machine mode control bits
– sstatus, stvec, sip, sie, sepc, scause, satp, and more
• satp - Supervisor Address Translation and Protection Register
– Used to control Supervisor mode address translation and protection
– Virtual Memory requires Supervisor mode support

Bits     Field  Description
[21:0]   PPN    Physical Page Number of the root page table
[30:22]  ASID   Address Space Identifier
31       MODE   MODE=1 uses Sv32 Address Translation

RV32 satp CSR

Bits     Field  Description
[43:0]   PPN    Physical Page Number of the root page table
[59:44]  ASID   Address Space Identifier
[63:60]  MODE   Encodings for Bare, Sv39, Sv48

RV64 satp CSR
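A sketch of packing the satp fields according to the two layouts. The MODE values for RV64 (0 = Bare, 8 = Sv39, 9 = Sv48) come from the privileged specification; the function and constant names are hypothetical.

```c
#include <stdint.h>

enum {
    SATP64_MODE_BARE = 0,   /* no translation */
    SATP64_MODE_SV39 = 8,
    SATP64_MODE_SV48 = 9
};

/* RV32: MODE in bit 31, ASID in [30:22], PPN in [21:0]. */
static uint32_t make_satp32(uint32_t sv32_enable, uint32_t asid, uint32_t ppn) {
    return ((sv32_enable & 0x1u) << 31)
         | ((asid & 0x1ffu) << 22)
         | (ppn & 0x3fffffu);
}

/* RV64: MODE in [63:60], ASID in [59:44], PPN in [43:0]. */
static uint64_t make_satp64(uint64_t mode, uint64_t asid, uint64_t ppn) {
    return ((mode & 0xf) << 60)
         | ((asid & 0xffff) << 44)
         | (ppn & 0xfffffffffffULL);
}
```

The PPN is the physical address of the root page table divided by the 4 KiB page size, so a root table at 0x80200000 has PPN 0x80200.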
Virtual Memory
• RISC-V has support for Virtual Memory allowing for sophisticated memory management and OS support (Linux)
• Requires an S-Mode implementation
• Sv32
– 32bit Virtual Address
– 4KiB, 4MiB pages (2-level page table)
• Sv39 (requires an RV64 implementation)
– 39bit Virtual Address
– 4KiB, 2MiB, 1GiB pages (3-level page table)
• Sv48 (requires an RV64 implementation)
– 48bit Virtual Address
– 4KiB, 2MiB, 1GiB, 512GiB pages (4-level page table)
• Page Tables also contain access permission attributes

[Figure: Virtual Address Map (0x0000_0000 to 0xFFFF_FFFF) translated to Physical Address Map (0x0000_0000 to 0xFFFF_FFFF)]
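The page sizes follow from how a virtual address is cut into fields. As a sketch for Sv39: a 12-bit page offset (4 KiB) and three 9-bit virtual page numbers, one per page table level, so each level up multiplies the covered size by 512 (4 KiB, 2 MiB, 1 GiB). The type and function names are hypothetical.

```c
#include <stdint.h>

typedef struct {
    unsigned vpn[3];   /* vpn[2] indexes the root table, vpn[0] the leaf table */
    unsigned offset;   /* byte offset inside the 4 KiB page */
} sv39_va;             /* hypothetical helper type */

/* Split the low 39 bits of a virtual address into Sv39 fields. */
static sv39_va sv39_split(uint64_t va) {
    sv39_va r;
    r.offset = (unsigned)(va & 0xfff);
    r.vpn[0] = (unsigned)((va >> 12) & 0x1ff);
    r.vpn[1] = (unsigned)((va >> 21) & 0x1ff);
    r.vpn[2] = (unsigned)((va >> 30) & 0x1ff);
    return r;
}
```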
Physical Memory Protection (PMP)
• Can be used to enforce access restrictions on less privileged modes
– Machine Mode can prevent Supervisor and User Mode software from accessing unwanted memory
• Up to 16 regions with a minimum region size of 4 bytes
• Ability to Lock a region
– A locked region enforces permissions on all accesses, including M-Mode, i.e. no mode will be able to gain access here
– Only way to unlock a region is a Reset

[Figure: Example PMP Memory Map, 0x0000_0000 to 0xFFFF_FFFF. Regions shown: Locked Region (4 Byte Region Locked. Only accessible after a reset); User Mode Context (User Mode has full RWX Privileges); User Mode Data (User Mode has Read only Privileges); Shared Library Code (User Mode has Execute only Privileges). Can define entire address map as not accessible to U-Mode in 1 register.]

If USER mode tries to access regions other than the grey zones, the hart will TRAP to machine mode
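The checking rules can be sketched with a simplified model. It assumes regions are given directly as base and size (the real pmpcfg/pmpaddr encodings are not modelled), that the first matching region decides, and that an access matching no region is allowed only in M-mode. All names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

enum { PMP_R = 1, PMP_W = 2, PMP_X = 4 };
enum { MODE_U = 0, MODE_S = 1, MODE_M = 3 };

typedef struct {
    uint64_t base, size;   /* region covers [base, base + size) */
    unsigned perms;        /* OR of PMP_R, PMP_W, PMP_X */
    bool     locked;       /* locked regions also bind M-mode */
} pmp_region;              /* hypothetical, simplified */

/* Returns true if the access is allowed, false if it would trap. */
static bool pmp_allows(const pmp_region *r, int n,
                       uint64_t addr, unsigned access, int mode) {
    for (int i = 0; i < n; i++) {
        if (addr >= r[i].base && addr - r[i].base < r[i].size) {
            if (mode == MODE_M && !r[i].locked)
                return true;                      /* unlocked: M-mode is not restricted */
            return (r[i].perms & access) == access;
        }
    }
    return mode == MODE_M;                        /* no match: only M-mode proceeds */
}
```

A locked region with no permissions therefore blocks every mode, which matches the locked 4-byte region in the example memory map.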
RISC-V Interrupts
• RISC-V defines the following interrupts per Hart/ Core
– Software – architecturally defined software interrupt
– Timer – architecturally defined timer interrupt
– External – Peripheral Interrupts
– Local – Hart/Core specific Peripheral Interrupts i.e.
specific to a particular core

• Optionally per privilege level
– Can have Supervisor Software/Timer/External Interrupts
– Can have User Software/Timer/External Interrupts

• Local interrupts are optional and implementation specific
– Can be used for hart-specific peripheral interrupts
– Useful for latency-sensitive embedded systems or small embedded systems with a small number of interrupts

For more details refer to: https://five-embeddev.com/quickref/interrupts.html
Machine Status (mstatus) – As it
relates to Interrupts
Bits     Field  Description
0        UIE    User Interrupt Enable
1        SIE    Supervisor Interrupt Enable
2        -      Reserved
3        MIE    Machine Interrupt Enable
4        UPIE   User Previous Interrupt Enable
5        SPIE   Supervisor Previous Interrupt Enable
6        -      Reserved
7        MPIE   Machine Previous Interrupt Enable
8        SPP    Supervisor Previous Privilege
[10:9]   -      Reserved
[12:11]  MPP    Machine Previous Privilege
[14:13]  FS     Floating Point State
[16:15]  XS     User Mode Extension State
17       MPRV   Modify Privilege (access memory as MPP)
18       SUM    Permit Supervisor User Memory Access
19       MXR    Make Executable Readable
20       TVM    Trap Virtual Memory
21       TW     Timeout Wait (traps S-Mode wfi)
22       TSR    Trap SRET
[30:23]  -      Reserved
31       SD     State Dirty (FS and XS summary bit)

RV32 mstatus CSR
Interrupt-specific bits for the different modes: UIE, SIE, MIE, UPIE, SPIE, MPIE, SPP, MPP

• M/S/U IE – Global Interrupt Enables for Modes which support interrupts
• M/S/U PIE – Encodes the state of interrupt enables prior to an interrupt.
– These bits can also be written to in order to enable interrupts when returning to lower privilege modes
• M/S PP – Encodes the privilege level prior to the previous interrupt
– These bits can also be written to in order to enter a lower privilege mode when executing MRET or SRET instructions
Machine Interrupt Cause CSR
(mcause)
• Interrupts are identified by reading the mcause CSR
• The Interrupt field determines if a trap was caused by an interrupt or an exception

Bits        Field           Description
XLEN-1      Interrupt       Identifies if a trap was synchronous (0, exception) or asynchronous (1, interrupt)
[XLEN-2:0]  Exception Code  Identifies the exception

mcause CSR

Interrupt = 1 (interrupt)
Exception Code  Description
0               User Software Interrupt
1               Supervisor Software Interrupt
2               Reserved
3               Machine Software Interrupt
4               User Timer Interrupt
5               Supervisor Timer Interrupt
6               Reserved
7               Machine Timer Interrupt
8               User External Interrupt
9               Supervisor External Interrupt
10              Reserved
11              Machine External Interrupt
12 - 15         Reserved
≥16             Local Interrupt X

Interrupt = 0 (exception)
Exception Code  Description
0               Instruction Address Misaligned
1               Instruction Access Fault
2               Illegal Instruction
3               Breakpoint
4               Load Address Misaligned
5               Load Access Fault
6               Store/AMO Address Misaligned
7               Store/AMO Access Fault
8               Environment Call from U-mode
9               Environment Call from S-mode
10              Reserved
11              Environment Call from M-mode
12              Instruction Page Fault
13              Load Page Fault
14              Reserved
15              Store/AMO Page Fault
≥16             Reserved
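Splitting mcause into its two fields can be sketched as below. The mask names MCAUSE_INT and MCAUSE_CAUSE match the ones used in the handler code of this deck, but their definitions here are an assumption, as is XLEN = 64.

```c
#include <stdint.h>
#include <stdbool.h>

#define XLEN         64                       /* assumption: RV64 hart */
#define MCAUSE_INT   (1ULL << (XLEN - 1))     /* Interrupt field, top bit */
#define MCAUSE_CAUSE (~MCAUSE_INT)            /* Exception Code field */

/* True for asynchronous interrupts, false for synchronous exceptions. */
static bool mcause_is_interrupt(uint64_t mcause) {
    return (mcause & MCAUSE_INT) != 0;
}

/* Index into the interrupt or exception table. */
static uint64_t mcause_code(uint64_t mcause) {
    return mcause & MCAUSE_CAUSE;
}
```

For example, 0x8000000000000007 is a Machine Timer Interrupt (interrupt, code 7), while 2 is an Illegal Instruction exception.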
Machine Interrupt-Enable and
Pending CSRs (mie, mip)
• mie used to enable/disable a given interrupt
• mip indicates which interrupts are currently pending
– Can be used for polling
• Lesser-privilege bits in mip are writeable
– e.g. Machine-mode software can be used to generate a supervisor interrupt by setting the STIP bit
• mip has the same mapping as mie

Bits   Field  Description
0      USIE   User Software Interrupt Enable
1      SSIE   Supervisor Software Interrupt Enable
2      -      Reserved
3      MSIE   Machine Software Interrupt Enable
4      UTIE   User Timer Interrupt Enable
5      STIE   Supervisor Timer Interrupt Enable
6      -      Reserved
7      MTIE   Machine Timer Interrupt Enable
8      UEIE   User External Interrupt Enable
9      SEIE   Supervisor External Interrupt Enable
10     -      Reserved
11     MEIE   Machine External Interrupt Enable
12-15  -      Reserved
≥16    LIE    Local Interrupt Enable

mie CSR
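How mie, mip and the global enable combine can be sketched as follows. The model is deliberately narrow: it assumes the hart is running in M-mode, ignores delegation and local interrupts, and considers only the three machine-level sources, checked in the order external, software, timer as given in the privileged specification. The function name is hypothetical.

```c
#include <stdint.h>

#define MSTATUS_MIE (1u << 3)   /* global Machine Interrupt Enable */

/* Returns the mcause code of the interrupt to take, or -1 if none. */
static int next_machine_interrupt(uint64_t mstatus, uint64_t mie, uint64_t mip) {
    static const int order[] = { 11, 3, 7 };   /* MEI, MSI, MTI */
    uint64_t ready = mie & mip;                /* pending AND individually enabled */

    if (!(mstatus & MSTATUS_MIE))
        return -1;                             /* globally masked */
    for (int i = 0; i < 3; i++) {
        if ((ready >> order[i]) & 1)
            return order[i];
    }
    return -1;
}
```

Polling, as mentioned above, amounts to reading mip directly and testing a bit while the corresponding enable is left clear.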
Machine Trap Vector CSR (mtvec)
mtvec sets the Base interrupt vector and the interrupt Mode

Bits        Field  Description
[XLEN-1:2]  Base   Machine Trap Vector Base Address. 64-byte Alignment
[1:0]       Mode   MODE sets the interrupt processing mode.

mtvec CSR

mtvec Modes
Value   Name      Description
0x0     Direct    All Exceptions set PC to mtvec.BASE. Requires 4-Byte alignment
0x1     Vectored  Asynchronous interrupts set pc to mtvec.BASE + (4×mcause.EXCCODE). Requires 4-Byte alignment
> 0x1   Reserved

• mtvec.Mode = Direct
– All Interrupts trap to the address mtvec.Base
– Software must read the mcause CSR and react accordingly
• mtvec.Mode = Vectored
– Interrupts trap to the address mtvec.Base + (4*mcause.ExCode)
– Eliminates the need to read mcause for interrupts (asynchronous traps)
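The two modes can be sketched as a single function that computes where the hart jumps on a trap. Note that in Vectored mode only interrupts are offset; synchronous exceptions still go to the base address. The function name is hypothetical.

```c
#include <stdint.h>

/* Compute the trap entry address from mtvec and mcause.
 * xlen is the register width (32 or 64). */
static uint64_t trap_target(uint64_t mtvec, uint64_t mcause, int xlen) {
    uint64_t base    = mtvec & ~(uint64_t)0x3;      /* BASE field, MODE bits cleared */
    uint64_t mode    = mtvec & 0x3;                 /* 0 = Direct, 1 = Vectored */
    uint64_t int_bit = (uint64_t)1 << (xlen - 1);   /* mcause Interrupt field */

    if (mode == 1 && (mcause & int_bit))
        return base + 4 * (mcause & ~int_bit);      /* one 4-byte slot per interrupt */
    return base;
}
```

Each vector slot is 4 bytes, room for one instruction, which is normally a jump to the real handler.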
Trap Handler – Entry and Exit
mtvec.MODE = Direct

• On entry, the RISC-V hart will
– Save the current state
– PC is copied to the Machine Exception Program Counter (PC → MEPC)
– Privilege mode is copied to Machine Status Previous Privilege Mode (Priv → mstatus.MPP)
– Interrupt enable is copied to Machine Status Previous Interrupt Enable (MIE → mstatus.MPIE)
– Then set PC = mtvec (address), mstatus.MIE = 0 (to disable interrupts)
– At this point we are in the Trap Handler
• MRET instruction restores state
– PC ← MEPC
– Priv ← mstatus.MPP
– MIE ← mstatus.MPIE
• Typical trap handler software will determine the type of interrupt

Interrupt handler pseudo code:

Stacking:
    Push Registers
    …
    interrupt = mcause.msb
    if interrupt
        branch isr_handler[mcause.code]
    else
        branch exception_handler[mcause.code]
    …
Machine Return and Unstacking:
    Pop Registers
    MRET
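The entry and exit sequence can be modelled on a small struct. This is a sketch assuming Direct mode and a hart that supports U-mode; hart_t and the function names are hypothetical. Two details come from the privileged specification rather than the slide: on MRET, MPIE is set to 1 and MPP is reset to the least privileged mode.

```c
#include <stdint.h>

#define MSTATUS_MIE        (1u << 3)
#define MSTATUS_MPIE       (1u << 7)
#define MSTATUS_MPP_SHIFT  11
#define MSTATUS_MPP        (3u << MSTATUS_MPP_SHIFT)

typedef struct {
    uint64_t pc, mepc, mcause, mtvec, mstatus;
    unsigned priv;                    /* 0 = U, 1 = S, 3 = M */
} hart_t;                             /* hypothetical minimal hart state */

/* What the hardware does when a trap is taken into M-mode. */
static void trap_enter(hart_t *h, uint64_t cause) {
    uint64_t s = h->mstatus;
    h->mepc   = h->pc;                                   /* PC -> MEPC */
    h->mcause = cause;
    s &= ~(uint64_t)(MSTATUS_MPP | MSTATUS_MPIE);
    s |= (uint64_t)h->priv << MSTATUS_MPP_SHIFT;         /* Priv -> MPP */
    if (s & MSTATUS_MIE)
        s |= MSTATUS_MPIE;                               /* MIE -> MPIE */
    s &= ~(uint64_t)MSTATUS_MIE;                         /* disable interrupts */
    h->mstatus = s;
    h->priv    = 3;
    h->pc      = h->mtvec & ~(uint64_t)3;                /* Direct mode entry */
}

/* What the MRET instruction does. */
static void mret(hart_t *h) {
    uint64_t s = h->mstatus;
    h->priv = (unsigned)((s & MSTATUS_MPP) >> MSTATUS_MPP_SHIFT);  /* MPP -> Priv */
    if (s & MSTATUS_MPIE)
        s |= MSTATUS_MIE;                                /* MPIE -> MIE */
    else
        s &= ~(uint64_t)MSTATUS_MIE;
    s |= MSTATUS_MPIE;                                   /* per spec: MPIE = 1 */
    s &= ~(uint64_t)MSTATUS_MPP;                         /* per spec: MPP = U */
    h->pc      = h->mepc;                                /* MEPC -> PC */
    h->mstatus = s;
}
```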
Interrupt Handler Code Flow
RISC-V Assembly interrupt handler to Push and Pop register file
C Code Handler determines interrupt cause and branches to the appropriate function

Step 1: install the handler

//write trap_entry address to mtvec
write_csr(mtvec, ((unsigned long)&trap_entry));

Steps 2, 3 and 5: assembly entry point

.align 2
.global trap_entry
trap_entry:
    //Step 2: make room on the stack
    addi sp, sp, -16*REGBYTES

    //store ABI Caller Registers
    STORE x1, 0*REGBYTES(sp)
    STORE x5, 2*REGBYTES(sp)
    …
    STORE x30, 14*REGBYTES(sp)
    STORE x31, 15*REGBYTES(sp)

    //Step 3: call C Code Handler
    call handle_trap

    //Step 5: restore ABI Caller Registers
    LOAD x1, 0*REGBYTES(sp)
    LOAD x5, 2*REGBYTES(sp)
    …
    LOAD x30, 14*REGBYTES(sp)
    LOAD x31, 15*REGBYTES(sp)

    addi sp, sp, 16*REGBYTES
    mret

Step 4: C handler

void handle_trap()
{
    unsigned long mcause = read_csr(mcause);
    if (mcause & MCAUSE_INT) {
        //mask interrupt bit and branch to handler
        isr_handler[mcause & MCAUSE_CAUSE]();
    } else {
        //branch to handler
        exception_handler[mcause]();
    }
}
Compiler Interrupt Attribute
• Pushing and Popping Registers in Assembly is a pain
• The interrupt attribute was added to GCC to facilitate interrupt handlers written entirely in C
– Interrupt functions only save/restore necessary registers onto the stack
– Align function on an 8-byte boundary
– Calls MRET after popping register file back off the stack

Interrupt handler with interrupt attribute. No assembly Code necessary

void handle_trap(void) __attribute__((interrupt));
void handle_trap(void)
{
    unsigned long mcause = read_csr(mcause);
    if (mcause & MCAUSE_INT) {
        //mask interrupt bit and branch to handler
        isr_handler[mcause & MCAUSE_CAUSE]();
    } else {
        //synchronous exception, branch to handler
        exception_handler[mcause & MCAUSE_CAUSE]();
    }
}

//write handle_trap address to mtvec
write_csr(mtvec, ((unsigned long)&handle_trap));
RISC-V Global Interrupts
• RISC-V defines a Global Interrupt as an interrupt which can be routed to any hart in a system
• Global Interrupts are prioritized and distributed by the Platform Level Interrupt Controller (PLIC)
• The PLIC is connected to the External Interrupt signal for 1 or more harts in an implementation
PLIC Interrupt Code
Example
• In this example an interrupt is presented to the PLIC
• The PLIC signals an interrupt to a hart using the Machine External Interrupt (interrupt 11)
• The interrupt handler (handle_trap) branches to the defined function to handle the Machine External Interrupt
– C Code placed the address of machine_external_interrupt function in location 11 of the isr_handler vector table
• The machine_external_interrupt handler does the following:
– Reads the PLIC’s claim/complete register to determine highest priority pending interrupt
– Uses another vector table to branch to the interrupt’s specific handler
– Completes the interrupt by writing the interrupt number back to the PLIC’s claim/complete

void handle_trap(void) __attribute__((interrupt));
void handle_trap(void)
{
    unsigned long mcause = read_csr(mcause);
    if (mcause & MCAUSE_INT) {
        //mask interrupt bit and branch to handler
        isr_handler[mcause & MCAUSE_CAUSE]();
    } else {
        //synchronous exception, branch to handler
        exception_handler[mcause & MCAUSE_CAUSE]();
    }
}

void machine_external_interrupt()
{
    //get the highest priority pending PLIC interrupt
    uint32_t int_num = plic.claim_complete;
    //branch to handler
    plic_handler[int_num]();
    //complete interrupt by writing interrupt number back to PLIC
    plic.claim_complete = int_num;
}

//install PLIC handler at MEIP Location
isr_handler[11] = machine_external_interrupt;
//write handle_trap address to mtvec
write_csr(mtvec, ((unsigned long)&handle_trap));
RISC-V Interrupt System Architecture
(M-mode only example)
• The Core Local Interruptor (CLINT) is a memory mapped peripheral used to generate Software and Timer Interrupts
• The Platform Level Interrupt Controller (PLIC) handles the majority of the Core Complex's Interrupts
– The PLIC has a programmable number of prioritization levels
– Only the highest priority pending interrupt is presented on the claim/complete register
• Multi-Core interrupt distribution
– The PLIC is globally addressable and is connected to the Machine External Interrupt signal of all cores in the Core Complex
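What a read of the claim/complete register returns can be sketched with a small software model. It assumes the usual PLIC rules: a priority of 0 means the source never interrupts, ties go to the lowest interrupt ID, and ID 0 means "nothing pending". Enables and the priority threshold are left out to keep the sketch short. NSRC, plic_model and plic_claim are hypothetical names.

```c
#include <stdint.h>

#define NSRC 8   /* hypothetical number of interrupt sources; ID 0 is "none" */

typedef struct {
    uint32_t priority[NSRC];   /* per-source priority, 0 = never interrupts */
    uint32_t pending;          /* bit i set = source i is pending */
} plic_model;

/* Model of a claim: return the highest priority pending source and
 * clear its pending bit. Returns 0 when nothing can be claimed. */
static uint32_t plic_claim(plic_model *p) {
    uint32_t best = 0, best_prio = 0;
    for (uint32_t id = 1; id < NSRC; id++) {
        if (((p->pending >> id) & 1) && p->priority[id] > best_prio) {
            best = id;                 /* strictly greater: lowest ID wins ties */
            best_prio = p->priority[id];
        }
    }
    if (best != 0)
        p->pending &= ~(1u << best);
    return best;
}
```

In a handler, the value returned by the claim is also the value written back to claim/complete once the source has been serviced.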
